Precise vs Accurate on Arbitrary-precision Arithmetic

When Math isn't accurate in code

Precise vs Accurate

So here's a simple example to get you started: punch the simple calculation 0.1 + 0.2 into any calculator, scientific, Google, whatever.
What result do you get?

You should see 0.3, right?

The code issues with arithmetic

Now go into your favourite programming language and do the same thing. What do you get?

You'll be pretty pleased if you're using Go, Haskell, Groovy, Lua, Swift, or even PHP and MySQL!

Surprising to me: at first I assumed R was accurate, until I double-checked;

print(.1+.2, digits=18)
# 0.30000000000000004

Even R gets numbers wrong, and it's used heavily by data scientists

A contract I am working on currently is doing some big data analytics using Python 3 and R, so let's look at some Python 3;

print(.1 + .2)
# 0.30000000000000004

Not cool snakey...

So this is a pretty common issue, right? No. "Common" assumes it's commonly identified; it is almost never recognised, but it is most definitely a widespread practice.

The business (i.e. a Ph.D. scientist) was completely unaware that their developers were using programming languages that will not produce the same results their calculators produce when they proof the output of the software...

So several years back I recall a conversation I had with a tech lead; the below is a very basic version of a Node.js solution I saw in some B2B financial transaction software;

Number.prototype.addTo = function(n) {
  return this + n
}
var number = 0.1
number.addTo(0.2)
// 0.30000000000000004

So what's a few cents, you say? Consider that we are running millions of calculations over millions of dollar transactions, and then consider using multiplication instead of addition. Yeah.

Scary.
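
To make that concrete, here's a minimal Python 3 sketch (the amounts are made up) of how the error grows once you sum many small values or bring multiplication into it;

payments = [0.10] * 10    # ten 10-cent payments stored as binary floats
total = sum(payments)
print(total)              # 0.9999999999999999, not 1.0
print(total == 1.0)       # False
print(0.1 * 3)            # 0.30000000000000004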

What is this problem called?

Caused whenever floating-point values are used in a calculation, this isn't a "bug" but a consequence of binary floating-point representation and rounding; the cure is arbitrary-precision (or decimal) arithmetic.
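
To see it for yourself, here's a quick Python sketch (assuming CPython with the usual IEEE 754 64-bit doubles) that exposes the value actually stored for 0.1;

from decimal import Decimal

print(Decimal(0.1))   # 0.1000000000000000055511151231257827021181583404541015625
print((0.1).hex())    # 0x1.999999999999ap-4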

There is one very hard way to teach yourself: read What Every Computer Scientist Should Know About Floating-Point Arithmetic. A much better way is to take one step back and become familiar with all of the relevant material from IEEE 754: Standard for Binary Floating-Point Arithmetic, so that you first have a baseline of what to expect when programming arithmetic.

Solutions?

For the Python code you can keep values as decimals and not floats by using the standard decimal module (switching to Python 2 only hides the error, since its print rounds the value it displays)
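
As a rough sketch of that approach in Python 3, the standard decimal module keeps the arithmetic exact as long as the values are constructed from strings rather than from floats;

from decimal import Decimal

print(Decimal('0.1') + Decimal('0.2'))   # 0.3
print(Decimal('0.1') * 3)                # 0.3
print(Decimal(0.1) + Decimal(0.2))       # 0.3000000000000000166533453694, built from floats the damage is already done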

For JavaScript there are libraries such as BigDecimal to help fix the issue, and Ruby also ships a BigDecimal library, but of course developers still use the language's native operators and cannot be forced to use a math library...

Some languages like Java look much easier, just ensure you're being explicit about the type (you should be disciplined enough by now), though be aware the float result below still isn't exactly 0.3, it just happens to print that way

System.out.println(.1F + .2F);
// 0.3

But be careful, because some languages like Go will get it wrong when you force the use of floats (Go's untyped constants are evaluated with arbitrary precision, but float64 variables are ordinary IEEE 754 doubles);

var a float64 = .1
var b float64 = .2
fmt.Println(a + b)
// 0.30000000000000004

The key is to understand your language of choice, and mentor team members who are not well informed.

Fixing defects in existing software

Don't fix the bugs!

Really, I'm not kidding. Don't go refactoring your entire code base to fix these bugs. The software has a particular reliance on the faulty functionality, and therefore it's likely the faultiness has forced the end users or integrated systems to work around the defect (usually by treating values as strings or truncating values).

If you change the source of the defect now and the dependent work-arounds downstream are not removed, you will just be facing a wider range of defects later.

Amazingly, you don't need to take my word for it. According to Edaena Salinas on SE-Radio #295, Microsoft has rounding issues in Excel that will never be fixed, and they even went so far as to give the defects a name so as to legitimise them: they now call the mathematical precision defects of Excel a "domain expectation".

The takeaway

If you've been reading my dribble you might be expecting an opinion around this point; sorry to disappoint, but this is an education issue now. If even the language developers (well, most of them) weren't aware of the IEEE standards, or willing to be thorough enough to educate themselves and be good developers, what hope do companies even as large as Microsoft have of being precise?

We as individuals can only try our best.