I'm using the Python 2 math module to calculate sums with fsum. I understand that 0.1 usually can't be stored exactly in binary. As far as I understand, math.fsum should fix this somehow.
import math
math.fsum([0.0, 0.1])
#0.1
math.fsum([0.1, 0.1])
#0.2
math.fsum([0.2, 0.1])
#0.30000000000000004
math.fsum([0.3, 0.1])
#0.4
math.fsum([0.4, 0.1])
#0.5
So math.fsum([0.2, 0.1]) == 0.3 will be False. Is this expected behavior? Am I doing something wrong?
How can I get 0.2 + 0.1 == 0.3 to be True?
You're misunderstanding what math.fsum does. It computes the most accurate possible sum of the given inputs (that is, the closest exactly representable value to the exact mathematical sum of the inputs). It does not magically replace its inputs with the numbers you originally thought of.
In your third line above, the input to math.fsum is a list containing the values 0.1000000000000000055511151231257827021181583404541015625 and 0.200000000000000011102230246251565404236316680908203125 (remember that with binary floating-point, What You See Is Not What You Get; here I'm showing the exact values that Python's using). The exact sum of those two values is 0.3000000000000000166533453693773481063544750213623046875, and the closest representable IEEE 754 binary64 float to that exact sum is 0.3000000000000000444089209850062616169452667236328125, which is what you're getting.
You're asking for math.fsum to behave as though it were given the exact values 0.1 and 0.2, but it has no way of knowing that that's what you want: it can only operate on the inputs that you give it.
Note that on most machines, addition of two floats will already be correctly rounded, so there's no advantage to using math.fsum. math.fsum is intended to remove the accumulation of rounding error involved in summing more than two floats.
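To see where it does help, here's a quick sketch (the value of the naive sum is what you'd typically see on an IEEE 754 binary64 platform):

import math
values = [0.1] * 10   # ten copies of the double closest to 0.1
sum(values)           # 0.9999999999999999 -- rounding error accumulates at each step
math.fsum(values)     # 1.0 -- the correctly rounded sum of the actual inputs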
Actually, you should avoid using the equality operator on floats, because computers represent them in binary and only store an approximate value.
If you really need to check whether two floats are equal, you should define a tolerance.
For example:
def isclose(a, b, rel_tol=1e-09, abs_tol=0.0):
    # True when a and b agree to within a relative tolerance rel_tol or an absolute tolerance abs_tol
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
Please refer to this:
What is the best way to compare floats for almost-equality in Python?
I also found this funny website:
http://0.30000000000000004.com/
This 0.1 + 0.2 != 0.3 behavior exists in most languages.
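For what it's worth, since Python 3.5 the standard library ships this same check as math.isclose, so you don't need to define it yourself:

>>> import math
>>> 0.1 + 0.2 == 0.3
False
>>> math.isclose(0.1 + 0.2, 0.3)
True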
I know that most decimals don't have an exact floating point representation (Is floating point math broken?).
But I don't see why 4*0.1 is printed nicely as 0.4, but 3*0.1 isn't, when
both values actually have ugly decimal representations:
>>> 3*0.1
0.30000000000000004
>>> 4*0.1
0.4
>>> from decimal import Decimal
>>> Decimal(3*0.1)
Decimal('0.3000000000000000444089209850062616169452667236328125')
>>> Decimal(4*0.1)
Decimal('0.40000000000000002220446049250313080847263336181640625')
The simple answer is because 3*0.1 != 0.3 due to quantization (roundoff) error (whereas 4*0.1 == 0.4 because multiplying by a power of two is usually an "exact" operation). Python tries to find the shortest string that would round to the desired value, so it can display 4*0.1 as 0.4 as these are equal, but it cannot display 3*0.1 as 0.3 because these are not equal.
You can use the .hex method in Python to view the internal representation of a number (basically, the exact binary floating point value, rather than the base-10 approximation). This can help to explain what's going on under the hood.
>>> (0.1).hex()
'0x1.999999999999ap-4'
>>> (0.3).hex()
'0x1.3333333333333p-2'
>>> (0.1*3).hex()
'0x1.3333333333334p-2'
>>> (0.4).hex()
'0x1.999999999999ap-2'
>>> (0.1*4).hex()
'0x1.999999999999ap-2'
0.1 is 0x1.999999999999a times 2^-4. The "a" at the end means the digit 10 - in other words, 0.1 in binary floating point is very slightly larger than the "exact" value of 0.1 (because the final 0x0.99 is rounded up to 0x0.a). When you multiply this by 4, a power of two, the exponent shifts up (from 2^-4 to 2^-2) but the number is otherwise unchanged, so 4*0.1 == 0.4.
However, when you multiply by 3, the tiny little difference between 0x0.99 and 0x0.a0 (0x0.07) magnifies into a 0x0.15 error, which shows up as a one-digit error in the last position. This causes 0.1*3 to be very slightly larger than the rounded value of 0.3.
Python 3's float repr is designed to be round-trippable, that is, the value shown should be exactly convertible into the original value (float(repr(f)) == f for all floats f). Therefore, it cannot display 0.3 and 0.1*3 exactly the same way, or the two different numbers would end up the same after round-tripping. Consequently, Python 3's repr engine chooses to display one with a slight apparent error.
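You can check this round-tripping behaviour yourself; a quick interpreter sketch (assuming CPython 3's shortest-repr algorithm described above):

>>> repr(4*0.1), float(repr(4*0.1)) == 4*0.1
('0.4', True)
>>> repr(3*0.1), float(repr(3*0.1)) == 3*0.1
('0.30000000000000004', True)
>>> float('0.3') == 3*0.1   # which is why '0.3' can't be used to display it
False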
repr (and str in Python 3) will put out as many digits as required to make the value unambiguous. In this case the result of the multiplication 3*0.1 isn't the closest value to 0.3 (0x1.3333333333333p-2 in hex), it's actually one LSB higher (0x1.3333333333334p-2) so it needs more digits to distinguish it from 0.3.
On the other hand, the multiplication 4*0.1 does get the closest value to 0.4 (0x1.999999999999ap-2 in hex), so it doesn't need any additional digits.
You can verify this quite easily:
>>> 3*0.1 == 0.3
False
>>> 4*0.1 == 0.4
True
I used hex notation above because it's nice and compact and shows the bit difference between the two values. You can do this yourself using e.g. (3*0.1).hex(). If you'd rather see them in all their decimal glory, here you go:
>>> Decimal(3*0.1)
Decimal('0.3000000000000000444089209850062616169452667236328125')
>>> Decimal(0.3)
Decimal('0.299999999999999988897769753748434595763683319091796875')
>>> Decimal(4*0.1)
Decimal('0.40000000000000002220446049250313080847263336181640625')
>>> Decimal(0.4)
Decimal('0.40000000000000002220446049250313080847263336181640625')
Here's a simplified conclusion from other answers.
If you inspect a float at Python's interactive prompt or print it, it goes through the repr function, which creates its string representation.
Starting with version 3.2, Python's str and repr use a complex rounding scheme, which prefers nice-looking decimals if possible, but uses more digits where necessary to guarantee a bijective (one-to-one) mapping between floats and their string representations.
This scheme guarantees that the value of repr(float(s)) looks nice for simple decimals, even if they can't be represented precisely as floats (e.g. when s = "0.1").
At the same time it guarantees that float(repr(x)) == x holds for every float x.
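A quick sanity check of both guarantees (just a sketch; the loop spot-checks the round-trip property on a bunch of arbitrary values):

>>> repr(float("0.1"))   # looks nice even though the stored value is not exactly 0.1
'0.1'
>>> import random
>>> all(float(repr(x)) == x for x in (random.uniform(-1e9, 1e9) for _ in range(100000)))
True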
This is not really specific to Python's implementation; it should apply to any float-to-decimal-string conversion function.
A floating point number is essentially a binary number, but in scientific notation with a fixed limit of significant figures.
The reciprocal of any integer that has a prime factor not shared with the base will always have a recurring representation after the radix point. For example, 1/7 has the prime factor 7, which is not shared with 10, and therefore has a recurring decimal representation; the same is true for 1/10, whose denominator has prime factors 2 and 5, the latter not being shared with the base 2. This means that 0.1 cannot be exactly represented by a finite number of bits after the binary point.
Since 0.1 has no exact representation, a function that converts the approximation to a decimal string will usually try to round certain values so that you don't get unintuitive results like 0.1000000000004121.
Since the floating point is in scientific notation, any multiplication by a power of the base only affects the exponent part of the number. For example 1.231e+2 * 100 = 1.231e+4 for decimal notation, and likewise, 1.00101010e11 * 100 = 1.00101010e101 in binary notation. If I multiply by a non-power of the base, the significant digits will also be affected. For example 1.2e1 * 3 = 3.6e1
Depending on the algorithm used, it may try to guess common decimals based on the significant figures only. Both 0.1 and 0.4 have the same significant figures in binary, because their floats are essentially truncations of (8/5)(2^-4) and (8/5)(2^-2) respectively. If the algorithm identifies the 8/5 sigfig pattern as the decimal 1.6, then it will work on 0.1, 0.2, 0.4, 0.8, etc. It may also have magic sigfig patterns for other combinations, such as the float 3 divided by float 10 and other magic patterns statistically likely to be formed by division by 10.
In the case of 3*0.1, the last few significant figures will likely be different from dividing a float 3 by float 10, causing the algorithm to fail to recognize the magic number for the 0.3 constant depending on its tolerance for precision loss.
Edit:
https://docs.python.org/3.1/tutorial/floatingpoint.html
Interestingly, there are many different decimal numbers that share the same nearest approximate binary fraction. For example, the numbers 0.1 and 0.10000000000000001 and 0.1000000000000000055511151231257827021181583404541015625 are all approximated by 3602879701896397 / 2 ** 55. Since all of these decimal values share the same approximation, any one of them could be displayed while still preserving the invariant eval(repr(x)) == x.
There is no tolerance for precision loss: if float x (0.3) is not exactly equal to float y (0.1*3), then repr(x) is not exactly equal to repr(y).
>>> .1+.1+.1+.1 ==.4
True
>>> .1+.1+.1 ==.3
False
>>>
The above is output from the Python interpreter. I understand that floating point arithmetic is done in base 2 and that values are stored in binary in the machine, and that this is why calculations like the ones above come out the way they do.
Now I found that .4 = .011(0011) [the part in parentheses repeats infinitely; this is the binary representation of that fraction]. Since this cannot be stored exactly, an approximate value is stored instead.
Similarly, 0.3 = .01(0011).
So neither 0.4 nor 0.3 can be stored exactly internally.
But then why does Python return True for the first comparison and False for the second, if neither value can be stored exactly?
_______________________________________________________________________________
I did some research and found the following:
>>> Decimal(.4)
Decimal('0.40000000000000002220446049250313080847263336181640625')
>>> Decimal(.1+.1+.1+.1)
Decimal('0.40000000000000002220446049250313080847263336181640625')
>>> Decimal(.1+.1+.1)
Decimal('0.3000000000000000444089209850062616169452667236328125')
>>> Decimal(.3)
Decimal('0.299999999999999988897769753748434595763683319091796875')
>>> Decimal(.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')
This probably explains why the additions are happening the way they are, assuming that Decimal is giving the exact output of the number stored underneath.
But then why does Python return True for the first comparison and False for the second, if neither value can be stored exactly?
Floating-point numbers absolutely can be compared for equality. Problems arise only when you expect exact equality to be preserved by an approximate computation. But the semantics of floating-point equality comparison is perfectly well defined.
When you write 0.1 in a program, this is rounded to the nearest IEEE 754 binary64 floating-point number, which is the real number 0.1000000000000000055511151231257827021181583404541015625, or 0x1.999999999999ap−4 in hexadecimal notation (the ‘p−4’ part means × 2⁻⁴). Every (normal) binary64 floating-point number is a real number of the form ±2ⁿ × (1 + 𝑓/2⁵²), where 𝑛 and 𝑓 are integers with −1022 ≤ 𝑛 ≤ 1023 and 0 ≤ 𝑓 < 2⁵²; this one is the nearest such number to 0.1.
When you add that to itself three times in floating-point arithmetic, the exact result 0.3000000000000000166533453693773481063544750213623046875 is rounded to 0.3000000000000000444089209850062616169452667236328125 or 0x1.3333333333334p−2 (since there are only 53 bits of precision available), but when you write 0.3, you get 0.299999999999999988897769753748434595763683319091796875 or 0x1.3333333333333p−2 which is slightly closer to 0.3.
However, four times 0.1000000000000000055511151231257827021181583404541015625 or 0x1.999999999999ap−4 is 0.4000000000000000222044604925031308084726333618164062500 or 0x1.999999999999ap−2, which is also the closest floating-point number to 0.4 and hence is what you get when you write 0.4 in a program. So when you write 4*0.1, the result is exactly the same floating-point number as when you write 0.4.
Now, you didn't write 4*0.1—instead you wrote .1 + .1 + .1 + .1. But it turns out there is a theorem in binary floating-point arithmetic that x + x + x + x—that is, fl(fl(fl(𝑥 + 𝑥) + 𝑥) + 𝑥)—always yields exactly 4𝑥 without rounding (except when it overflows), in spite of the fact that x + x + x or fl(fl(𝑥 + 𝑥) + 𝑥) = fl(3𝑥) may be rounded and not exactly equal to 3𝑥. (Note that fl(𝑥 + 𝑥) = fl(2𝑥) is always equal to 2𝑥, again ignoring overflow, because it's just a matter of adjusting the exponent.)
It just happens that any rounding error committed by adding the fourth term cancels out whatever rounding error may have been committed by adding the third!
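If you want to convince yourself of that theorem empirically, a quick (non-exhaustive) spot check might look like this; it should never find a counterexample for finite inputs that don't overflow:

>>> import random
>>> xs = [random.uniform(-1e300, 1e300) for _ in range(10**6)]
>>> all(x + x + x + x == 4*x for x in xs)   # 4*x is exact here, so this checks the theorem
True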
I have written a program in Python that calculates currencies and a number of other things. Now I've run into the IEEE 754 problem with floating point numbers.
Is there a way to do mathematically correct operations such as addition in Python, so that I get the exact value?
As example:
0.1 + 0.1 + 0.1 = 0.3
I already tried the decimal library, but it doesn't seem to give me correctly rounded digits in the places I want.
Greetings.
When using the decimal module, you need to initialize with strings; initializing with floats will preserve in high resolution the imprecision of the underlying float type. For example, using Decimal(0.1) actually produces a value:
Decimal('0.1000000000000000055511151231257827021181583404541015625')
because that's the exact decimal expansion of the value actually represented by the float 0.1.
So instead of testing Decimal(0.1) + Decimal(0.1) + Decimal(0.1) == Decimal(0.3) (where you compute in fixed decimal precision, but initialize with non-decimal values), use Decimal("0.1") + Decimal("0.1") + Decimal("0.1") == Decimal("0.3") and you'll get the expected results.
Aside from that, you'll want to adjust the decimal context's .prec attribute to ensure you're preserving the intended amount of precision, but usually it's okay to preserve the full precision and just round to the correct precision later with the quantize method.
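Put together, a money-style calculation might look roughly like this (a sketch; the two-decimal quantize step and ROUND_HALF_UP are just one common choice for currency, and the prices are made up):

from decimal import Decimal, ROUND_HALF_UP

total = Decimal("0.1") + Decimal("0.1") + Decimal("0.1")
print(total == Decimal("0.3"))                  # True -- exact decimal arithmetic
price = Decimal("19.99") * Decimal("1.07")      # e.g. applying a hypothetical 7% tax
print(price)                                    # 21.3893 -- full precision kept internally
print(price.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))  # 21.39 -- rounded to cents at the end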
Did you try setting the context? For instance, the code below returns 0.3 as expected:
import decimal

decimal.getcontext().prec = 1   # work with one significant digit
a = decimal.Decimal(0.1)        # still the inexact binary value of 0.1...
print(a + a + a)                # ...but each addition is rounded to 1 digit, so this prints 0.3
I have a number that I have to deal with that I hate (and I am sure there are others).
It is
a17=0.0249999999999999
a18=0.02499999999999999
Case 1:
round(a17,2) gives 0.02
round(a18,2) gives 0.03
Case 2:
round(a17,3)=round(a18,3)=0.025
Case 3:
round(round(a17,3),2)=round(round(a18,3),2)=0.03
but when these numbers are in a data frame...
Case 4:
df=pd.DataFrame([a17,a18])
np.round(df.round(3),2)=[0.02, 0.02]
Why are the answers I get the same as in Case 1?
When you work with floats, you are generally unable to get the EXACT value; you only get an approximation, because of how floats are organized in memory.
Keep in mind that when you print a float, you always print an approximate decimal representation, which is not the same thing as the stored value.
A double-precision float only carries about 17 significant decimal digits of precision.
That is why:
>>> round(0.0249999999999999999,2)
0.03
>>> round(0.024999999999999999,2)
0.02
This is true for most programming languages (Fortran, Python, C++, etc.).
Let us look at a fragment of the Python documentation
(https://docs.python.org/3/tutorial/floatingpoint.html):
0.0001100110011001100110011001100110011001100110011...
Stop at any finite number of bits, and you get an approximation. On most machines today, floats are approximated using a binary fraction with the numerator using the first 53 bits starting with the most significant bit and with the denominator as a power of two. In the case of 1/10, the binary fraction is 3602879701896397 / 2 ** 55 which is close to but not exactly equal to the true value of 1/10.
Many users are not aware of the approximation because of the way values are displayed. Python only prints a decimal approximation to the true decimal value of the binary approximation stored by the machine. On most machines, if Python were to print the true decimal value of the binary approximation stored for 0.1, it would have to display
>>> 0.1
0.1000000000000000055511151231257827021181583404541015625
That is more digits than most people find useful, so Python keeps the number of digits manageable by displaying a rounded value instead:
>>> 1 / 10
0.1
Just remember, even though the printed result looks like the exact value of 1/10, the actual stored value is the nearest representable binary fraction.
Interestingly, there are many different decimal numbers that share the same nearest approximate binary fraction. For example, the numbers 0.1 and 0.10000000000000001 and 0.1000000000000000055511151231257827021181583404541015625 are all approximated by 3602879701896397 / 2 ** 55. Since all of these decimal values share the same approximation, any one of them could be displayed while still preserving the invariant eval(repr(x)) == x.
Let us look at a fragment of the NumPy documentation
(https://docs.scipy.org/doc/numpy-1.13.0/reference/generated/numpy.around.html#numpy.around).
For context, np.round delegates to np.around; see the NumPy documentation:
For values exactly halfway between rounded decimal values, NumPy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc. Results may also be surprising due to the inexact representation of decimal fractions in the IEEE floating point standard [R9] and errors introduced when scaling by powers of ten.
Conclusion:
In your case, np.round simply rounded 0.025 to 0.02 by the rules described above (source: NumPy documentation).
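So the difference comes from the rounding rule, not from pandas. A small side-by-side sketch (assuming the round-half-to-even behaviour quoted above; decimal's ROUND_HALF_UP is shown as one way to force the "schoolbook" result):

import numpy as np
from decimal import Decimal, ROUND_HALF_UP

print(np.round(0.025, 2))   # 0.02 -- NumPy rounds the halfway case to the nearest even digit
print(round(0.025, 2))      # 0.03 -- the stored double is slightly above 0.025, so correct rounding goes up
print(Decimal("0.025").quantize(Decimal("0.01"), rounding=ROUND_HALF_UP))  # 0.03 -- decimal half-up rounding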
I'm new to Python and I found a confusing result when using Python 3.5.1 on my Mac. I simply ran this command in my terminal:
1 // 0.05
However, it printed 19.0 on my screen. From my point of view, it should be 20. Can someone explain what's happening here? I already know that '//' is similar to the math.floor() function, but I still can't get my head around this result.
Because the Python floating-point literal 0.05 represents a number very slightly larger than the mathematical value 0.05.
>>> '%.60f' % 0.05
'0.050000000000000002775557561562891351059079170227050781250000'
// is floor division, meaning that the result is the largest integer n such that n times the divisor is less than or equal to the dividend. Since 20 times 0.05000000000000000277555756156289135105907917022705078125 is larger than 1, this means the correct result is 19.
As for why the Python literal 0.05 doesn't represent the number 0.05, as well as many other things about floating point, see What Every Computer Scientist Should Know About Floating-Point Arithmetic
0.05 is not exactly representable in floating point. "%0.20f" % 0.05 shows that 0.05 is stored as a value very slightly greater than the exact value:
>>> print "%0.20f" % 0.05
0.05000000000000000278
On the other hand 1/0.05 does appear to be exactly 20:
>>> print "%0.20f" % (1/0.05)
20.00000000000000000000
However, all floating point values are rounded to double precision when stored, but calculations can be done to a higher precision. In this case it seems the floor operation performed by 1//0.05 is done at full internal precision, hence it is rounded down.
As the previous answerers have correctly pointed out, the fraction 0.05 = 1/20 cannot be exactly represented with a finite number of base-two digits. It works out to the repeating fraction 0.0000 1100 1100 1100... (much like 1/3 = 0.333... in familiar base-ten).
But this is not quite a complete answer to your question, because there's another bit of weirdness going on here:
>>> 1 / 0.05
20.0
>>> 1 // 0.05
19.0
Using the “true division” operator / happens to give the expected answer 20.0. You got lucky here: The rounding error in the division exactly cancels out the error in representing the value 0.05 itself.
But how come 1 // 0.05 returns 19? Isn't a // b supposed to be the same as math.floor(a / b)? Why the inconsistency between / and //?
Note that the divmod function is consistent with the // operator:
>>> divmod(1, 0.05)
(19.0, 0.04999999999999995)
This behavior can be explained by redoing the computation with exact rational arithmetic. When you write the literal 0.05 in Python (on an IEEE 754-compliant platform), the actual value represented is 3602879701896397 / 72057594037927936 = 0.05000000000000000277555756156289135105907917022705078125. This value happens to be slightly more than the intended 0.05, which means that its reciprocal will be slightly less.
To be precise, 72057594037927936 / 3602879701896397 = 19.999999999999998889776975374843521206126552300723564152465244707437044687...
So, // and divmod see an integer quotient of 19. The remainder works out to 0.04999999999999994726440633030506432987749576568603515625, which is rounded for display as 0.04999999999999995. So, the divmod answer above is in fact good to 53-bit accuracy, given the original incorrect value of 0.05.
But what about /? Well, the true quotient 72057594037927936 / 3602879701896397 isn't representable as a float, so it must be rounded, either down to 20-2**-48 (an error of about 2.44e-15) or up to 20.0 (an error of about 1.11e-15). And Python correctly picks the more accurate choice, 20.0.
So, it seems that Python's floating-point division is internally done with high enough precision to know that 1 / 0.05 (that's the float literal 0.05, not the exact decimal fraction 0.05), is actually less than 20, but the float type in itself is incapable of representing the difference.
At this point you may be thinking “So what? I don't care that Python is giving a correct reciprocal to an incorrect value. I want to know how to get the correct value in the first place.” And the answer to that is either:
decimal.Decimal('0.05') (and don't forget the quotes!)
fractions.Fraction('0.05') (Of course, you may also use the numerator-denominator arguments as Fraction(1, 20), which is useful if you need to deal with non-decimal fractions like 1/3.)
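Both of those behave the way you'd expect; a small interpreter sketch (Fraction(0.05), with no quotes, is also a handy way to see the exact stored value discussed above):

>>> from decimal import Decimal
>>> from fractions import Fraction
>>> 1 / Decimal('0.05'), 1 // Decimal('0.05')
(Decimal('20'), Decimal('20'))
>>> 1 / Fraction('0.05'), 1 // Fraction('0.05')
(Fraction(20, 1), 20)
>>> Fraction(0.05)   # the exact value of the float literal, no quotes
Fraction(3602879701896397, 72057594037927936)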