Java Arithmetic Optimizations

We all know that floating point arithmetic is not perfect: there are rounding errors we have to be careful with. In this blog post I give a few rules of thumb for keeping your calculations as correct as possible, and point out some very interesting behavior along the way.

First of all, it is well known that floating point cannot perfectly encode some very simple decimal fractions, simply because it has to store them in binary. The fractional part of a number is stored as a sum of terms of the form 2^-n, and only a limited number of them. To be more concrete, the available terms are 0.5, 0.25, 0.125, 0.0625, 0.03125, ... So for example: 0.75 = 0.5 + 0.25

With every step the value halves and gains another digit after the decimal point, which makes it very difficult to build certain simple numbers. For example, 0.3 = 0.25 + 0.03125 + 0.015625 + ... never terminates, so it is impossible to represent exactly with a finite number of bits. This is no different from the decimal system, where we also have values we cannot write down exactly, such as 1/3. Nevertheless, Java manages to show the value you actually meant: when it prints a double, it uses just enough digits to distinguish the stored value from its neighboring doubles, so if you assign 0.3 to a variable and print it later, it shows 0.3.
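To make the difference between the stored value and the printed value visible, here is a minimal sketch. It assumes that converting a double with new BigDecimal(double) is an acceptable way to inspect the exact binary value; the class name StoredValueDemo and the helper exact are made up for this illustration.

```java
import java.math.BigDecimal;

class StoredValueDemo {
    // Hypothetical helper: the exact decimal expansion of the binary value
    // that is actually stored in the double.
    static String exact(double value) {
        return new BigDecimal(value).toPlainString();
    }

    public static void main(String[] args) {
        // 0.75 is 0.5 + 0.25, so it is stored without any rounding.
        System.out.println(exact(0.75));
        // 0.3 has no finite binary expansion, so the stored value is only close to 0.3.
        System.out.println(exact(0.3));
        // Printing the double directly hides that difference.
        System.out.println(0.3);
    }
}
```

The second line of output is a long decimal that is very close to 0.3 without being equal to it, while the third line simply reads 0.3.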

So what is the problem? Well, the magic starts to break down when you use numbers with very many digits or when you perform calculations with these values. Even though Java shows you 0.1 and 0.2 perfectly when you use them on their own, when you add them together it does not show you 0.3 but 0.30000000000000004 instead. This is caused by rounding errors piling on top of each other: the sum is no longer the double closest to 0.3, so the printing trick can no longer hide the error.
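The behavior described above is easy to reproduce. The following is a small sketch; the class name SumDemo is only chosen for this example.

```java
class SumDemo {
    // Adds the two values whose rounding errors no longer cancel out when printed.
    static double sum() {
        return 0.1 + 0.2;
    }

    public static void main(String[] args) {
        System.out.println(0.1);          // prints 0.1
        System.out.println(0.2);          // prints 0.2
        System.out.println(sum());        // prints 0.30000000000000004
        System.out.println(sum() == 0.3); // prints false
    }
}
```

The last line is the practical consequence: comparing the result of a calculation to a literal with == is rarely what you want.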

That is the point of this blog post: it is the operations that are dangerous and tend to introduce errors. A very interesting, but not that easy, read is https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html which I link here for completeness. The idea of floating point is that you want your rounding error to be small compared to the value: an error of 10 on a value of 10 million is not that bad, but the same error of 10 on a value of 100 means you are off by 10%, which can be catastrophic.

This makes subtraction the most dangerous operation, which may come as a surprise, but it makes perfect sense. Not every subtraction is a problem, of course, only those between numbers that are very close to each other, since both numbers may already carry a rounding error. As an example, take 0.2001 - 0.2000. Let's say the second number gets rounded down to 0.1999, so that the result becomes 0.0002. The absolute error is then 0.0001 (since the result should have been 0.0001), which is not very big, but you are off by 100%. This problem is not limited to small numbers. It happens with large numbers as well, because the rounding is relative to the size of the number: 1.05 million - 1.049 million has a similar problem.
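The rounding in the example above is exaggerated to keep the arithmetic readable. As a rough sketch of the real effect, the code below does the same subtraction with float instead of double, an assumption made purely so the error is large enough to see. The class name CancellationDemo and its helpers are invented for this illustration.

```java
class CancellationDemo {
    // Relative error of a computed value compared to the value we intended.
    static double relativeError(double computed, double intended) {
        return Math.abs(computed - intended) / Math.abs(intended);
    }

    // Subtracts two nearby values in single precision.
    static float difference() {
        float a = 0.2001f;
        float b = 0.2000f;
        return a - b;
    }

    public static void main(String[] args) {
        // Each input on its own is off by a tiny relative amount.
        System.out.println(relativeError(0.2001f, 0.2001));
        System.out.println(relativeError(0.2000f, 0.2000));
        // After the subtraction, the same absolute error is measured against 0.0001.
        System.out.println(relativeError(difference(), 0.0001));
    }
}
```

The subtraction itself does not add any new error here. It only removes the leading digits the two numbers have in common, so the rounding errors that were already present in the inputs become large relative to what is left.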

Working with numbers at opposite ends of the spectrum also tends to cause errors, albeit not catastrophic ones. Adding a very small number to a very large number will typically have no effect, because the rounding of the result cuts off the small number you added. This means that repeatedly adding small numbers to a large number has no effect either, and over time the missing contributions can add up to an error that is too big to ignore.
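A minimal sketch of this effect, assuming a large value of 1.0e16 and one million additions of 1.0 (both picked for this illustration, as is the class name AbsorptionDemo):

```java
class AbsorptionDemo {
    static final double LARGE = 1.0e16;
    static final int COUNT = 1_000_000;

    // Adds each small value directly to the large one; every single addition is rounded away.
    static double largeFirst() {
        double sum = LARGE;
        for (int i = 0; i < COUNT; i++) {
            sum += 1.0;
        }
        return sum;
    }

    // Sums the small values first, then adds the large value once at the end.
    static double smallFirst() {
        double small = 0.0;
        for (int i = 0; i < COUNT; i++) {
            small += 1.0;
        }
        return LARGE + small;
    }

    public static void main(String[] args) {
        System.out.println(largeFirst());
        System.out.println(smallFirst());
        System.out.println(smallFirst() - largeFirst());
    }
}
```

Around 1.0e16 neighboring doubles are 2 apart, so each individual + 1.0 is lost and the first method returns the large value unchanged. Summing the small values first keeps their combined contribution of one million.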

The way you structure your operations has an impact on the precision and correctness of your final result. An example of this is the operation (x-y)/z versus x/z - y/z.

x = 100, y = 90, z = 1,000,000,000
(x-y)/z = 1.0E-8
x/z - y/z = 9.999999999999997E-9

As you can see, the first one yields the correct value, whereas the second one has a small error.
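For anyone who wants to try this themselves, here is a small sketch of the comparison using double values. The class and method names are made up for this example.

```java
class OperationOrderDemo {
    // One subtraction on exact inputs, followed by a single rounded division.
    static double subtractThenDivide(double x, double y, double z) {
        return (x - y) / z;
    }

    // Two rounded divisions, followed by a subtraction of two nearby results.
    static double divideThenSubtract(double x, double y, double z) {
        return x / z - y / z;
    }

    public static void main(String[] args) {
        double x = 100, y = 90, z = 1_000_000_000;
        System.out.println(subtractThenDivide(x, y, z));
        System.out.println(divideThenSubtract(x, y, z));
    }
}
```

In the first form only one operation has to round. In the second form each division rounds on its own, and the subtraction of the two nearby quotients then exposes those rounding errors.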

To conclude: try to avoid extreme numbers, either very big or very small. If you can predict the range of your numbers, you can make sure the most catastrophic calculation happens last.

This blog post is not meant to scare you or make you anxious about working with floating point numbers. In most common cases none of this is relevant and any rounding error will be negligible. But in case you want to increase the correctness of your calculations without having to rely on an exact format, these are some good things to keep in mind.