I am new to Python and I tried this:
import numpy as np
x = np.arange(0.7, 1.3, 0.1)
print(x)
y = np.arange(0.6, 1.3, 0.1)
print(y)
The output was [ 0.7 0.8 0.9 1. 1.1 1.2 1.3] and [ 0.6 0.7 0.8 0.9 1. 1.1 1.2]. Why does 1.3 appear in the first list but not in the second?
This is due to rounding errors. If you actually print the last element of x at its full precision, you'll see that it is smaller than 1.3:
>>> import numpy as np
>>> x = np.arange(0.7,1.3,0.1)
>>> 1.3 > x[-1]
True
>>> x[-1]
1.2999999999999998
Note, as stated in the numpy.arange documentation, arange is not suitable for non-integer floating point steps:
When using a non-integer step, such as 0.1, the results will often not be consistent. It is better to use linspace for these cases.
This is a side effect of floating point numbers not being exact: they cannot represent values such as 0.1 precisely, so whether the computed last element lands just below or just above the stop value varies from case to case.
Related
numpy.r_ can be used to build arrays quickly from slice notation. However, the following example appears to demonstrate inconsistent behavior:
>>> import numpy as np
>>> a = np.r_[0.1 : 0.3 : 0.1]
>>> a
array([0.1, 0.2])
Endpoint of the slice 0.3 not included - as expected.
>>> b = np.r_[0.1 : 0.4 : 0.1]
>>> b
array([0.1, 0.2, 0.3, 0.4])
Endpoint of the slice 0.4 included!
There does not appear to be an explanation for this behavior in the documentation.
When c is real, numpy.r_[a:b:c] is equivalent to numpy.arange(a, b, c). Using floats here is a bad idea, as documented in the numpy.arange docs: the length of the result may be off by one, because it is computed from the floating-point values of start, stop and step and is therefore subject to rounding error, and the step itself may suffer precision loss along the way.
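To see where the off-by-one comes from, the number of elements is essentially ceil((stop - start) / step), evaluated in floating point (a sketch of the arithmetic; NumPy's internal computation differs in detail):
>>> import math
>>> (0.3 - 0.1) / 0.1
1.9999999999999998
>>> math.ceil((0.3 - 0.1) / 0.1)   # 2 elements, endpoint excluded
2
>>> (0.4 - 0.1) / 0.1
3.0000000000000004
>>> math.ceil((0.4 - 0.1) / 0.1)   # 4 elements, endpoint "included"
4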
As suggested in the numpy.arange docs, you should use numpy.linspace instead. numpy.linspace takes an element count as an integer, instead of taking a step:
b = numpy.linspace(0.1, 0.4, num=3, endpoint=False)
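This produces the intended three elements (shown here at NumPy's default print precision):
>>> numpy.linspace(0.1, 0.4, num=3, endpoint=False)
array([0.1, 0.2, 0.3])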
This is because in Python real numbers are not stored exactly; in this example the accumulated step 0.2 + 0.1 evaluates to 0.30000000000000004 rather than 0.3, so the computed result comes out wrong.
I would do it this way, though it seems more complicated:
from decimal import Decimal, getcontext
import numpy as np

getcontext().prec = 6            # precision (significant digits) for Decimal arithmetic
b = Decimal(2) / Decimal(5)      # exactly 0.4
a = Decimal(1) / Decimal(10)     # exactly 0.1
print(np.r_[a : b : a])          # the exact arithmetic keeps the endpoint 0.4 out
Related
Maybe this was answered before, but I'm trying to understand the best way to handle subtraction in Pandas.
import pandas as pd
import random
import numpy as np
random.seed(42)
data = {'r': [float(random.random()) for i in range(5)]}
for i in range(5):
    data['r'].append(float(0.7))
df = pd.DataFrame(data)
If I run the following, I get the expected results:
print(np.sum(df['r'] >= 0.7))
6
However, if I modify the condition slightly, I don't get the expected results:
print(np.sum(df['r']-0.5 >= 0.2))
1
The same happens if I try to fix it by casting to float or np.float64 (and combinations thereof), like the following:
print(np.sum(df['r'].astype(np.float64)-np.float64(0.5) >= np.float64(0.2)))
1
For sure I'm not doing the casting properly, but any help on this would be more than welcome!
You're not doing anything improperly. This is a totally straightforward floating point error. It will always happen.
>>> 0.7 >= 0.7
True
>>> (0.7 - 0.5) >= 0.2
False
You have to remember that floating point numbers are represented in binary, so they can only represent sums of powers of 2 with perfect precision. Anything that can't be represented finitely as a sum of powers of two will be subject to error like this.
You can see why by forcing Python to display the full-precision value associated with the literal 0.7:
>>> format(0.7, '.60g')
'0.6999999999999999555910790149937383830547332763671875'
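If you need such a comparison to tolerate the error, one option (a sketch, using math.isclose from the standard library with its default relative tolerance of 1e-09) is:
>>> import math
>>> math.isclose(0.7 - 0.5, 0.2)
True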
To add to @senderle's answer, since this is a floating point issue, you can solve it by:
((df['r'] - 0.5) >= 0.19).sum()
On a slightly different note, I'm not sure why you use np.sum when you could just use pandas' own .sum; it seems like an unnecessary import.
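For the DataFrame in the question, a vectorized version of the same idea is numpy.isclose with its default tolerances (a sketch, not the only possible fix):
import numpy as np

# Count rows where the difference clears 0.2, absorbing the representation error
mask = (df['r'] - 0.5 >= 0.2) | np.isclose(df['r'] - 0.5, 0.2)
print(mask.sum())   # 6, matching the direct comparison df['r'] >= 0.7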
Related
I have a pandas Series:
In [1]: import pandas as pd
In [2]: s = pd.Series([1.3, 2.6, 1.24, 1.27, 1.45])
and I need to round the numbers.
In [4]: s.round(1)
Out[4]:
0 1.3
1 2.6
2 1.2
3 1.3
4 1.4
dtype: float64
It works for 1.27; however, 1.45 is rounded to 1.4. Is this caused by precision loss in the float type? If so, how can I deal with this problem?
This isn't a bug; it is because most decimal numbers cannot be represented exactly as a float.
https://www.programiz.com/python-programming/methods/built-in/round
Another way of rounding is:
int(number * 10**precision + 0.5) / 10**precision
However, you might run into similar problems, because who knows whether 1.45 is stored closer to 1.4499999... or 1.4500...1.
In general, round() often fails to meet expectations because floats are imprecise approximations.
In this case, though, it is because of round-half-to-even ("banker's rounding"), a convention under which exact halves are rounded to the nearest even digit in order to balance out cumulative rounding error.
Python's built-in round documents this behavior:
round(x[, n])
x rounded to n digits, rounding half to even. If n is omitted, it defaults to 0.
If you want traditional half-up rounding instead, you can get it fairly easily with the decimal module, as sketched below.
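A minimal sketch using the decimal module's ROUND_HALF_UP mode (going through str() so Decimal sees the shortest repr "1.45" rather than the float's full binary expansion):
from decimal import Decimal, ROUND_HALF_UP
import pandas as pd

s = pd.Series([1.3, 2.6, 1.24, 1.27, 1.45])
rounded = s.apply(lambda x: float(Decimal(str(x)).quantize(Decimal("0.1"),
                                                           rounding=ROUND_HALF_UP)))
print(rounded[4])   # 1.5 instead of 1.4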
Related
I would like to create an array of numbers equally spaced (by 0.1) between 0.1 and 100:
import numpy
step = 0.1
range_a = numpy.arange(step, 100 + step, step)
Why is my first element
range_a[0]
Out[27]: 0.10000000000000001
and not 0.1?
and how do I get an array equal to
[0.1, 0.2, 0.3, ..., 100]
As mentioned in the comments, this is due to how floats are handled: 0.1 cannot be represented exactly in binary, and 0.10000000000000001 is simply the closest double to 0.1 displayed at full precision. Casting with astype(numpy.double) does not help, because arange already returns a float64 (double) array. You cannot store an exact 0.1 in a float, but you can make the endpoints reliable and the display clean by building the array with an explicit element count:
step = 0.1
range_a = numpy.linspace(step, 100, num=1000)   # 1000 points: 0.1, 0.2, ..., 100.0
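A quick check of the endpoints and length (print is used because the full repr of the elements depends on NumPy's print settings):
>>> print(range_a[0], range_a[-1], len(range_a))
0.1 100.0 1000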
Related
When using a list comprehension expression:
[x * 0.1 for x in range(0, 5)]
I expect to get a list like this:
[0.0, 0.1, 0.2, 0.3, 0.4]
However, I instead get this:
[0.0, 0.1, 0.2, 0.30000000000000004, 0.4]
What is the reason behind this?
Floats are inherently imprecise in pretty much every language.
If you need exact precision, use the Decimal class:
from decimal import Decimal
print(Decimal("0.3"))
If you just need them to look pretty, use format strings when displaying, e.g.:
"%0.2f" % 2.030000000000034
If you want to compare them, use some threshold (note the abs(), so the test works regardless of which number is bigger):
if abs(num1 - num2) < 1e-3:
    print("Equal Enough For Me!")
See abarnert's comments on thresholding; this is a very simplified example. For a more in-depth explanation of epsilon thresholding, one article I found is here: http://www.cygnus-software.com/papers/comparingfloats/Comparing%20floating%20point%20numbers.htm
Additional Reading:
http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html (for a detailed explanation)
http://floating-point-gui.de/basic/ (basic tutorial for working with floats in general)
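Applied back to the original comprehension, the Decimal suggestion above would look like this (a short sketch; the values are converted back to float only for plain display):
>>> from decimal import Decimal
>>> [float(Decimal("0.1") * x) for x in range(5)]
[0.0, 0.1, 0.2, 0.3, 0.4]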
The list comprehension does not matter:
>>> 3 * 0.1
0.30000000000000004
>>> 2 * 0.1
0.2
>>> 0.1 + 0.2
0.30000000000000004
More information about Python floats and floating point arithmetic can be found in the Python tutorial chapter "Floating Point Arithmetic: Issues and Limitations" (https://docs.python.org/3/tutorial/floatingpoint.html).
The list comprehension is irrelevant, this is purely an issue with floating-point numbers. For an extremely detailed answer you should give this article a read: http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html