How are floating-point errors handled by the computer - python

I understand that floating-point numbers have their limitations, so this is expected:
>>> 0.1 + 0.2 == 0.3
False
But why does this hold? Computers can't store 0.45 or 0.55 exactly either, right?
>>> 0.45 + 0.55 == 1.00
True
I want to know why in the first case the computer couldn't correct its inaccuracy, while in the latter it could.

As you know, most decimal numbers can't be stored exactly. That's true for all of the numbers above except 1.0.
But they get stored with a high accuracy. Instead of 0.3, some very close representable number gets used. It's not only very close, it's the closest such number.
When you compute 0.1 + 0.2, then another representable number gets computed, which is also very close to 0.3. You are "unlucky" and it differs from the closest possible representable number.
There's no real luck involved: both 0.1 and 0.2 get represented by a slightly larger number. When added, the two errors add up, as they're of the same sign, and you get something like 0.30000000000000004.
With 0.45 + 0.55, the errors are of different signs and cancel out.
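One way to see the sign of each error, as a quick sketch: Decimal(float) converts a float without any rounding, exposing the exact value it stores.
>>> from decimal import Decimal
>>> Decimal(0.1)    # slightly above 1/10
Decimal('0.1000000000000000055511151231257827021181583404541015625')
>>> Decimal(0.3)    # slightly below 3/10
Decimal('0.299999999999999988897769753748434595763683319091796875')
>>> float(Decimal(0.45) + Decimal(0.55))   # the exact sum of the two stored
1.0                                        # values rounds to exactly 1.0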

floating point approximation in python

I'm new to Python, and I'm trying to understand floating-point approximation and how floats are represented in Python.
For example:
>>> .1 + .1 + .1 == .3
False
>>> .25 + .25 + .25 == 0.75
True
I understand these two situations, but what about these?
>>> .1 + .1 + .1 + .1 == .4
True
>>> .1 + .1 == .2
True
Is it just a coincidence that the values of .1+.1+.1+.1 and .1+.1 happen to equal .4 and .2 respectively, even though these numbers are not exactly represented in Python? Are there other situations like this, and is there any way to identify them?
Thank you!
Short answer: yes, it's just a coincidence.
Numbers are represented as 64-bit IEEE 754 floating-point numbers in Python, also called double precision.
https://en.wikipedia.org/wiki/IEEE_754#Basic_and_interchange_formats
When you write 0.3, Python finds the closest representable IEEE number.
When adding multiple numbers, these small errors in the last digits accumulate, and you end up with a different number. Sometimes that happens sooner, other times later. Sometimes those errors counteract each other; often they don't.
This answer is a good read:
Is floating point math broken?
To go deeper into your examples, you would need to look at the bit representation of these numbers. However, it gets complicated, as one also needs to look at how rounding and addition work ...
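One way to peek at those bits without decoding IEEE 754 by hand is float.hex(), which prints the exact significand and exponent; a minimal sketch:
>>> (0.1 + 0.1).hex()    # doubling 0.1 only bumps the exponent (p-4 to p-3) ...
'0x1.999999999999ap-3'
>>> (0.2).hex()          # ... and lands exactly on the closest double to 0.2
'0x1.999999999999ap-3'
>>> (0.1 + 0.1 + 0.1).hex()   # one unit in the last place above ...
'0x1.3333333333334p-2'
>>> (0.3).hex()               # ... the closest double to 0.3
'0x1.3333333333333p-2'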
Floating-point numbers are represented in computer hardware as base 2 (binary) fractions. For example, the decimal fraction 0.125 has value 1/10 + 2/100 + 5/1000, and in the same way the binary fraction 0.001 has value 0/2 + 0/4 + 1/8. These two fractions have identical values, the only real difference being that the first is written in base 10 fractional notation, and the second in base 2.
Unfortunately, most decimal fractions cannot be represented exactly as binary fractions. A consequence is that, in general, the decimal floating-point numbers you enter are only approximated by the binary floating-point numbers actually stored in the machine.
One illusion may beget another. For example, since 0.1 is not exactly 1/10, summing three values of 0.1 may not yield exactly 0.3, either:
>>> .1 + .1 + .1 == .3
False
Also, since the 0.1 cannot get any closer to the exact value of 1/10 and 0.3 cannot get any closer to the exact value of 3/10, then pre-rounding with round() function cannot help:
>>> round(.1, 1) + round(.1, 1) + round(.1, 1) == round(.3, 1)
False
Though the numbers cannot be made closer to their intended exact values, the round() function can be useful for post-rounding so that results with inexact values become comparable to one another:
>>> round(.1 + .1 + .1, 10) == round(.3, 10)
True
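Since Python 3.5, math.isclose offers the same kind of tolerant comparison without having to pick a rounding precision; a small sketch:
>>> import math
>>> math.isclose(.1 + .1 + .1, .3)   # default relative tolerance is 1e-09
True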

How does Python remember the number of decimal places one used to specify a float?

Today it was pointed out to me that 0.99 can't be represented by a float:
num = 0.99
print('{0:.20f}'.format(num))
prints 0.98999999999999999112. I'm fine with this concept.
So then how does Python know to do this:
num = 0.99
print(num)
prints 0.99.
How does Python remember the number of decimal places one used to specify a float?
It doesn't. Try this:
num = 0.990
print(num)
Notice that that also outputs 0.99, not 0.990.
I can't speak specifically for the print function, but it's common in environments that have IEEE-754 double-precision binary floating point numbers to use an algorithm that outputs only as many digits as are needed to differentiate the number from its closest "representable" neighbour. But it's much more complicated than it might seem on the surface. See this paper on number rounding for details (associated code here and here).
Sam Mason provided some great links related to this:
From Floating Point Arithmetic: Issues and Limitations
This bears out the "closest representable" point above. It starts by describing the issue in base 10: you can't accurately represent one-third (1/3). 0.3 comes close, 0.33 comes closer, 0.333 comes even closer, but really 1/3 is 0.3 followed by an infinitely repeating series of 3s. In the same way, binary floating point (which stores the number as a base 2 fraction rather than a base 10 fraction) can't exactly represent 0.1 (for instance); like 1/3 in base 10, it's an infinitely repeating series of digits in base 2, and anything else is an approximation. It then continues:
In the same way, no matter how many base 2 digits you’re willing to use, the decimal value 0.1 cannot be represented exactly as a base 2 fraction. In base 2, 1/10 is the infinitely repeating fraction
0.0001100110011001100110011001100110011001100110011...
Stop at any finite number of bits, and you get an approximation. On most machines today, floats are approximated using a binary fraction with the numerator using the first 53 bits starting with the most significant bit and with the denominator as a power of two. In the case of 1/10, the binary fraction is 3602879701896397 / 2 ** 55 which is close to but not exactly equal to the true value of 1/10.
Many users are not aware of the approximation because of the way values are displayed. Python only prints a decimal approximation to the true decimal value of the binary approximation stored by the machine. On most machines, if Python were to print the true decimal value of the binary approximation stored for 0.1, it would have to display
>>> 0.1
0.1000000000000000055511151231257827021181583404541015625
That is more digits than most people find useful, so Python keeps the number of digits manageable by displaying a rounded value instead
>>> 1 / 10
0.1
Just remember, even though the printed result looks like the exact value of 1/10, the actual stored value is the nearest representable binary fraction.
The code for it in CPython
An issue discussing it on the issues list
It's not remembering anything. Python looks at the value it has and decides the best way to present it, which in this case is 0.99, because the stored value is as close as possible to 0.99.
If you print(0.98999999999999999112) it will show 0.99, even though that is not the number of decimal places you used to specify it.
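A short sketch of that round-trip behavior (repr picks the fewest digits that still map back to the same double):
>>> num = 0.990
>>> repr(num)                  # the trailing zero is gone: it was never stored
'0.99'
>>> float(repr(num)) == num    # yet the short string round-trips exactly
True
>>> format(num, '.17g')        # the longer expansion underneath
'0.98999999999999999'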

While using np.arange() it was incrementing with wrong step size

import numpy as np

for i in np.arange(0.0, 1.1, 0.1):
    print(i)
Output:
0.0
0.1
0.2
0.30000000000000004
0.4
0.5
0.6000000000000001
0.7000000000000001
0.8
0.9
1.0
Expected output:
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
It's not incrementing by the wrong step size; those are just floating-point errors. From here:
This can look like a bug in Python, but it is not. It has little to do with Python, and much more to do with how the underlying platform handles floating-point numbers. It's a normal case encountered when handling floating-point numbers internally in a system: the internal representation of floating-point numbers uses a fixed number of binary digits to represent a decimal number, and some decimal numbers are difficult to represent in binary, so in many cases this leads to small round-off errors.
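The accumulated values are all within about 1e-16 of the intended ones, so one workaround is to round for display only, not for the computation itself; a sketch:
import numpy as np

# round for display only: 0.30000000000000004 becomes 0.3, etc.
for i in np.arange(0.0, 1.1, 0.1):
    print(round(i, 10))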

correcting for floating point arithmetic 'errors' when rounding in pandas

I have a number that I have to deal with that I hate (and I am sure there are others).
It is
a17 = 0.0249999999999999
a18 = 0.02499999999999999
Case 1:
round(a17,2) gives 0.02
round(a18,2) gives 0.03
Case 2:
round(a17,3)=round(a18,3)=0.025
Case 3:
round(round(a17,3),2)=round(round(a18,3),2)=0.03
but when these numbers are in a data frame...
Case 4:
df = pd.DataFrame([a17, a18])
np.round(df.round(3), 2) gives [0.02, 0.02]
Why are the answers I get the same as in Case 1?
When you work with floats, you usually cannot get the EXACT value, only an approximation, because of the in-memory representation of floats.
Keep in mind that when you print a float, you always print an approximated decimal,
and that is not the same as the stored value.
The stored value is only accurate to about 17 significant decimal digits.
That is why:
>>> round(0.0249999999999999999,2)
0.03
>>> round(0.024999999999999999,2)
0.02
This is true for most programming languages (Fortran, Python, C++, etc.).
Let us look at a fragment of the Python documentation:
(https://docs.python.org/3/tutorial/floatingpoint.html)
0.0001100110011001100110011001100110011001100110011...
Stop at any finite number of bits, and you get an approximation. On most machines today, floats are approximated using a binary fraction with the numerator using the first 53 bits starting with the most significant bit and with the denominator as a power of two. In the case of 1/10, the binary fraction is 3602879701896397 / 2 ** 55 which is close to but not exactly equal to the true value of 1/10.
Many users are not aware of the approximation because of the way values are displayed. Python only prints a decimal approximation to the true decimal value of the binary approximation stored by the machine. On most machines, if Python were to print the true decimal value of the binary approximation stored for 0.1, it would have to display
>>> 0.1
0.1000000000000000055511151231257827021181583404541015625
That is more digits than most people find useful, so Python keeps the number of digits manageable by displaying a rounded value instead
>>> 1 / 10
0.1
Just remember, even though the printed result looks like the exact value of 1/10, the actual stored value is the nearest representable binary fraction.
Interestingly, there are many different decimal numbers that share the same nearest approximate binary fraction. For example, the numbers 0.1 and 0.10000000000000001 and 0.1000000000000000055511151231257827021181583404541015625 are all approximated by 3602879701896397 / 2 ** 55. Since all of these decimal values share the same approximation, any one of them could be displayed while still preserving the invariant eval(repr(x)) == x.
Let us look at a fragment of the NumPy documentation:
(https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.around.html#numpy.around)
Note that np.round is an alias for np.around; see the NumPy documentation.
For values exactly halfway between rounded decimal values, NumPy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc. Results may also be surprising due to the inexact representation of decimal fractions in the IEEE floating point standard [R9] and errors introduced when scaling by powers of ten.
Conclusions:
In your case, np.round rounded the halfway value 0.025 down to 0.02 by the round-half-to-even rule described above (source: NumPy documentation).
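A minimal sketch of the difference: the built-in round does correctly rounded decimal rounding on the stored value, while np.round scales by a power of ten, rounds halves to even, and scales back.
>>> import numpy as np
>>> x = round(0.0249999999999999, 3)   # 0.025, stored slightly above 0.025
>>> round(x, 2)            # built-in round sees a value just above the halfway point
0.03
>>> print(np.round(x, 2))  # x * 100 rounds to exactly 2.5, and halves go to even
0.02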

why 1 // 0.05 results in 19.0 in python?

I'm new to Python, and I found a confusing result when using Python 3.5.1 on my Mac. I simply ran this command in my terminal:
1 // 0.05
However, it printed 19.0 on my screen. From my point of view, it should be 20. Can someone explain what's happening here? I already know that '//' is similar to the math.floor() function, but I still can't get my head around this.
Because the Python floating-point literal 0.05 represents a number very slightly larger than the mathematical value 0.05.
>>> '%.60f' % 0.05
'0.050000000000000002775557561562891351059079170227050781250000'
// is floor division, meaning that the result is the largest integer n such that n times the divisor is less than or equal to the dividend. Since 20 times 0.05000000000000000277555756156289135105907917022705078125 is larger than 1, this means the correct result is 19.
As for why the Python literal 0.05 doesn't represent the number 0.05, as well as many other things about floating point, see What Every Computer Scientist Should Know About Floating-Point Arithmetic
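This is easy to check with exact rational arithmetic, since fractions.Fraction converts a float losslessly; a quick sketch:
>>> from fractions import Fraction
>>> d = Fraction(0.05)     # the exact rational value of the float literal
>>> d > Fraction(1, 20)    # the stored value is slightly above 1/20
True
>>> 20 * d > 1             # hence floor division gives 19, not 20
True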
0.05 is not exactly representable in floating point. "%0.20f" % 0.05 shows that 0.05 is stored as a value very slightly greater than the exact value:
>>> print "%0.20f" % 0.05
0.05000000000000000278
On the other hand 1/0.05 does appear to be exactly 20:
>>> print "%0.20f" % (1/0.05)
20.00000000000000000000
However, all floating-point values are rounded to double precision when stored, while some calculations can be carried out more precisely. In this case it seems the floor operation performed by 1 // 0.05 is done at full internal precision, hence the result is rounded down to 19.
As the previous answerers have correctly pointed out, the fraction 0.05 = 1/20 cannot be exactly represented with a finite number of base-two digits. It works out to the repeating fraction 0.0000 1100 1100 1100... (much like 1/3 = 0.333... in familiar base-ten).
But this is not quite a complete answer to your question, because there's another bit of weirdness going on here:
>>> 1 / 0.05
20.0
>>> 1 // 0.05
19.0
Using the “true division” operator / happens to give the expected answer 20.0. You got lucky here: The rounding error in the division exactly cancels out the error in representing the value 0.05 itself.
But how come 1 // 0.05 returns 19? Isn't a // b supposed to be the same as math.floor(a / b)? Why the inconsistency between / and //?
Note that the divmod function is consistent with the // operator:
>>> divmod(1, 0.05)
(19.0, 0.04999999999999995)
This behavior can be explained by redoing the floating-point division with exact rational arithmetic. When you write the literal 0.05 in Python (on an IEEE 754-compliant platform), the actual value represented is 3602879701896397 / 72057594037927936 = 0.05000000000000000277555756156289135105907917022705078125. This value happens to be slightly more than the intended 0.05, which means that its reciprocal will be slightly less.
To be precise, 72057594037927936 / 3602879701896397 = 19.999999999999998889776975374843521206126552300723564152465244707437044687...
So, // and divmod see an integer quotient of 19. The remainder works out to 0.04999999999999994726440633030506432987749576568603515625, which is rounded for display as 0.04999999999999995. So, the divmod answer above is in fact good to 53-bit accuracy, given the original incorrect value of 0.05.
But what about /? Well, the true quotient 72057594037927936 / 3602879701896397 isn't representable as a float, so it must be rounded, either down to 20-2**-48 (an error of about 2.44e-15) or up to 20.0 (an error of about 1.11e-15). And Python correctly picks the more accurate choice, 20.0.
So, it seems that Python's floating-point division is internally done with high enough precision to know that 1 / 0.05 (that's the float literal 0.05, not the exact decimal fraction 0.05), is actually less than 20, but the float type in itself is incapable of representing the difference.
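To make the inconsistency concrete:
>>> import math
>>> 1 / 0.05               # the correctly rounded quotient
20.0
>>> math.floor(1 / 0.05)   # flooring the already-rounded quotient
20
>>> 1 // 0.05              # floor of the exact quotient, before any rounding
19.0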
At this point you may be thinking “So what? I don't care that Python is giving a correct reciprocal to an incorrect value. I want to know how to get the correct value in the first place.” And the answer to that is either:
decimal.Decimal('0.05') (and don't forget the quotes!)
fractions.Fraction('0.05') (Of course, you may also use the numerator-denominator arguments as Fraction(1, 20), which is useful if you need to deal with non-decimal fractions like 1/3.)
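Either choice makes the division exact; a quick sketch:
>>> from decimal import Decimal
>>> from fractions import Fraction
>>> Decimal('1') // Decimal('0.05')
Decimal('20')
>>> Fraction(1) // Fraction('0.05')
20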
