In numpy.allclose() there are two tolerance factors used to determine if two arrays are close enough to count as the same. There is the relative tolerance rtol and absolute tolerance atol. From the docs:
numpy.allclose(a, b, rtol=1e-05, atol=1e-08)
Also from the docs:
If the following equation is element-wise True, then allclose returns True.
absolute(a - b) <= (atol + rtol * absolute(b))
Mathematically I understand this, but I am confused about the point of rtol. Why not just use a single tolerance value tol and return True if |a - b| < tol? Obviously, following the above equation, I could do this manually by setting rtol to zero, thereby making everything symmetric. What is the point of the symmetry-breaking rtol factor?
Related question
How allclose() works?
The confusing part is that the equation shows both parameters being used at the same time. Look at it like this instead:
Use case 1: absolute tolerance (atol): absolute(a - b) <= atol
Use case 2: relative tolerance (rtol): absolute(a - b) <= rtol * absolute(b)
An alternative way to implement both with a single tolerance parameter would be to add a flag that selects whether the tolerance is relative or absolute. Separating the use cases like that breaks down when array values can be both large and zero. If only one array can have zeros, make that one a and use the asymmetrical equation to your benefit without atol. If either one can have zeros, simply set rtol to some acceptable value for the large elements, and set atol to the value you want to kick in for the zeros.
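To illustrate that last point (a small sketch with made-up values): rtol alone can never pass a zero entry, while atol alone would have to be enormous to cover the large entries, so you combine the two.
import numpy as np

big = 1e10
a = np.array([0.0, big, big])
b = np.array([1e-12, big * (1 + 1e-7), big])

# rtol alone: fails at the zero entry, since rtol * 0 == 0
print(np.allclose(a, b, rtol=1e-5, atol=0))      # False
# rtol for the large entries plus a small atol that kicks in at the zeros
print(np.allclose(a, b, rtol=1e-5, atol=1e-8))   # True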
You generally want to use rtol: since the precision of numbers and calculations is very much finite, larger numbers will almost always be less precise than smaller ones, and the difference scales linearly (again, in general). The only time you use atol is for numbers that are so close to zero that rounding errors are liable to be larger than the number itself.
Another way to look at it is atol compares fixed decimal places, while rtol compares significant figures.
Which tolerance(s) to use depends on your problem statement. For example, what if my array has a wide domain of values, ranging from 1e-10 to 1e10? A small atol would work well for small values, but poorly for large values, and vice-versa for a large atol. But rtol is perfect in this case because I can specify that the acceptable delta should scale with each value.
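For instance (another small sketch), a single rtol covers both ends of a wide range, where any fixed atol would be either too strict for the large values or too loose for the small ones:
import numpy as np

values = np.array([1e-10, 1.0, 1e10])
perturbed = values * (1 + 1e-7)    # a relative error of 1e-7 in every element

print(np.allclose(values, perturbed, rtol=1e-5, atol=0))   # True: the allowed delta scales with each value
print(np.allclose(values, perturbed, rtol=0, atol=1e-8))   # False: the 1e10 entry is off by about 1e3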
I'm trying to test in Python whether a vector of recovered times is close to a vector of ground truth times. Let's ignore how we recover the times, it's not relevant to the question.
My first instinct was to use numpy.allclose, but unless I'm misunderstanding something, allclose is actually a bad fit here because of how it works.
Essentially you specify an absolute tolerance atol and relative tolerance rtol, along with your ground truth vector b and a comparison vector a, and numpy.allclose returns:
all(numpy.abs(a - b) <= atol + rtol * numpy.abs(b))
There's some nuance to what the actual function does, as you can see in the source, but the "pseudo-numpython" above from the docs gives you the basic idea.
The issue is that with any monotonically-increasing vector of positive values, like a time series, your tolerance actually will increase!
Take this series of times in seconds:
>>> times_true = array([0.01147392, 0.46244898, 0.78571429, 1.22238095, 1.74857143,
2.30984127, 2.92777778, 3.57 , 4.16634921, 4.76809524])
>>> times_recovered = array([0.00944365, 0.46007857, 0.7838881 , 1.22103095, 1.74722143,
2.30849127, 2.92642778, 3.56865 , 4.16499921, 4.76674524])
I want my times to be no more than a millisecond apart, plus or minus some wiggle room. This is basically the case for my example vectors.
>>> np.abs(times_recovered - times_true)
array([0.00203027, 0.00237041, 0.00182619, 0.00135 , 0.00135 ,
0.00135 , 0.00135 , 0.00135 , 0.00135 , 0.00135 ])
Since I want the values to be "roughly 1 msec apart", I specify atol to be 0.001 and rtol to be 0.001. My understanding of these terms right now is that atol is the absolute difference between each element of a and b, i.e., np.abs(a - b), and that rtol is some additional "slop" tolerance we can add. (edit: changed how I defined the terms originally)
Now look at what this gives me for the second term above:
>>> atol, rtol = 0.001, 0.001
>>> rtol * np.abs(times_true)
array([1.14739229e-05, 4.62448980e-04, 7.85714286e-04, 1.22238095e-03,
1.74857143e-03, 2.30984127e-03, 2.92777778e-03, 3.57000000e-03,
4.16634921e-03, 4.76809524e-03])
For this vector of times, we start out with a relative tolerance of 1.e-5 and finish with 1.e-3, a two orders of magnitude difference. In other words, allclose will check whether the differences np.abs(a - b) are less than or equal to the following:
>>> atol + rtol * np.abs(times_true)
array([0.00201147, 0.00246245, 0.00278571, 0.00322238, 0.00374857,
0.00430984, 0.00492778, 0.00557 , 0.00616635, 0.0067681 ])
This seems bad? I want my tolerance to be roughly the same at every point, but it's clearly increasing, and it will only continue to increase as I get larger times in my vectors. It's also bad because for small times my tolerance will actually be smaller, giving me false alarms.
It seems like what I should really do is just take np.abs(times_recovered - times_true) and ask whether any of the values are greater than the largest difference I'm willing to tolerate
>>> MAX_DIFF = 0.003
>>> assert not np.any(np.abs(times_recovered - times_true) > MAX_DIFF)
but if so, then am I just completely misunderstanding how numpy.allclose is supposed to work?
Any feedback from sage scientific Pythonistas would be appreciated
For your problem you want all your (pointwise) errors to be close to zero. So... just use allclose on the error timeseries and a zero vector (it broadcasts under the hood):
t_err = times_recovered - times_true
MAX_DIFF = 0.003
np.allclose(t_err, 0, rtol=0, atol=MAX_DIFF)
This is effectively the same as your assert statement (but don't use assert in production code, unless it's a test!) - your choice which you want to use.
I happen to have a numpy array of floats:
a.dtype, a.shape
#(dtype('float64'), (32769,))
The values are:
a[0]
#3.699822718929953
all(a == a[0])
True
However:
a.mean()
3.6998227189299517
The mean is off in the 15th and 16th significant figures.
Can anybody show how this difference accumulates over the ~30K-element mean, and whether there is a way to avoid it?
In case it matters, my OS is 64-bit.
Here is a rough approximation of a bound on the maximum error. This will not be representative of average error, and it could be improved with more analysis.
Consider calculating a sum using floating-point arithmetic with round-to-nearest ties-to-even:
sum = 0;
for (i = 0; i < n; ++i)
sum += a[i];
where each a[i] is in [0, m).
Let ULP(x) denote the unit of least precision in the floating-point number x. (For example, in the IEEE-754 binary64 format with 53-bit significands, if the largest power of 2 not greater than |x| is 2^p, then ULP(x) = 2^(p−52).) With round-to-nearest, the maximum error in any operation with result x is ½ULP(x).
If we neglect rounding errors, the maximum value of sum after i iterations is i•m. Therefore, a bound on the error in the addition in iteration i is ½ULP(i•m). (Actually zero for i=1, since that case adds to zero, which has no error, but we neglect that for this approximation.) Then the total of the bounds on all the additions is the sum of ½ULP(i•m) for i from 1 to n. This is approximately ½•n•(n+1)/2•ULP(m) = ¼•n•(n+1)•ULP(m). (This is an approximation because it moves i outside the ULP function, but ULP is a discontinuous function. It is "approximately linear", but there are jumps. Since the jumps are by factors of two, the approximation can be off by at most a factor of two.)
So, with 32,769 elements, we can say the total rounding error will be at most about ¼•32,769•32,770•ULP(m), about 2.7•10^8 times the ULP of the maximum element value. The ULP is 2^−52 times the greatest power of two not less than m, so that is about 2.7•10^8•2^−52 = 6•10^−8 times m.
Of course, the likelihood that 32,768 sums (not 32,769 because the first necessarily has no error) all round in the same direction by chance is vanishingly small but I conjecture one might engineer a sequence of values that gets close to that.
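As a quick numeric sanity check on that bound (a sketch; np.spacing(m) gives ULP(m) measured from the power of two below m, so it lands within the stated factor of two of the estimate above):
import numpy as np

n = 32769
m = 3.699822718929953                          # the element value from the question
bound = 0.25 * n * (n + 1) * np.spacing(m)     # ~ ¼•n•(n+1)•ULP(m)
print(bound)        # about 1.2e-07
print(bound / m)    # about 3e-08, the same ballpark as the 6e-08 estimate above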
An Experiment
Here is a chart of (in blue) the mean error over 10,000 samples of summing arrays with sizes 100 to 32,800 by 100s and elements drawn randomly from a uniform distribution over [0, 1). The error was calculated by comparing the sum calculated with float (IEEE-754 binary32) to that calculated with double (IEEE-754 binary64). (The samples were all multiples of 2−24, and double has enough precision so that the sum for up to 229 such values is exact.)
The green line is c·n·√n with c set to match the last point of the blue line. We see it tracks the blue line over the long term. At points where the average sum crosses a power of two, the mean error increases faster for a time. At these points, the sum has entered a new binade, and further additions have higher average errors due to the increased ULP. Over the course of the binade, this fixed ULP decreases relative to n, bringing the blue line back to the green line.
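A rough way to reproduce the flavor of that experiment in Python (a sketch, not the original code). Note that np.cumsum accumulates sequentially, mimicking the naive loop above, whereas np.sum uses pairwise summation and would show much smaller errors:
import numpy as np

rng = np.random.default_rng(0)

def naive_float32_sum_error(n, samples=100):
    # Multiples of 2**-24, so the float64 reference sum is exact for n < 2**29.
    x = np.floor(rng.random((samples, n)) * 2**24) / 2**24
    naive = np.cumsum(x, axis=1, dtype=np.float32)[:, -1]   # sequential float32 sum
    exact = x.sum(axis=1)                                    # exact in float64 here
    return np.abs(naive - exact).mean()

for n in (1000, 4000, 16000, 32000):
    print(n, naive_float32_sum_error(n))    # mean error grows roughly like n*sqrt(n)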
This is due to the inability of the float64 type to store the sum of your float numbers with full precision. In order to get around this problem you need to use a larger data type, of course*. Numpy has a longdouble dtype that you can use in such cases:
In [23]: np.mean(a, dtype=np.longdouble)
Out[23]: 3.6998227189299530693
Also, note:
In [25]: print(np.longdouble.__doc__)
Extended-precision floating-point number type, compatible with C
``long double`` but not necessarily with IEEE 754 quadruple-precision.
Character code: ``'g'``.
Canonical name: ``np.longdouble``.
Alias: ``np.longfloat``.
Alias *on this platform*: ``np.float128``: 128-bit extended-precision floating-point number type.
* read the comments for more details.
The mean is (by definition):
a.sum()/a.size
Unfortunately, adding all those values up and dividing accumulates floating point errors. They are usually around the magnitude of:
np.finfo(np.float64).eps
Out[]: 2.220446049250313e-16
Yeah, e-16, about where you get them. You can make the error smaller by using higher-precision floats like float128 (if your system supports it), but the errors will always accumulate whenever you're summing a large number of floats together. If you truly want the identity, you'll have to hardcode it:
def mean_(arr):
    if np.all(arr == arr[0]):
        return arr[0]
    else:
        return arr.mean()
In practice, you never really want to use == between floats. Generally in numpy we use np.isclose or np.allclose to compare floats for exactly this reason. There are ways around it using other packages and leveraging arcane machine-level methods of calculating numbers to get (closer to) exact equality, but it's rarely worth the performance and clarity hit.
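For example (a small sketch of the usual pattern, using the array from the question):
import numpy as np

a = np.full(32769, 3.699822718929953)
print(a.mean() == a[0])             # False on the setup above: bitten by accumulated rounding
print(np.isclose(a.mean(), a[0]))   # True: equal within a small tolerance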
I have the following method in my python code that compares values between two objects to determine if they are equal:
def equals(self, vec, tol):
    return all(i < tol for i in [abs(a - b) for a, b in zip(self, vec)])
I want to give a default value to my tolerance variable, tol, such that it is the smallest possible value that is always greater than error that could occur from floating-point inaccuracies. What is this value?
The largest possible error is infinity, and NaN (Not a Number) is also possible. There is no general formula that is correct for tol. Determining what error could occur always requires knowledge of the values used and the operations performed.
Additionally, there are limited situations where “comparing for equality using a tolerance” is a proper technique. (Testing software is one of them.) Comparing for equality using a tolerance reduces the risk of deciding two numbers are unequal even though they would be equal if computed with exact mathematics, but it does so at the expense of falsely accepting as equal two numbers that would be unequal if computed with exact mathematics. Even deciding whether such a trade-off is acceptable is application-dependent, let alone deciding what the tolerance should be.
I usually use something like this with numpy:
tol = max(np.finfo(float).eps, np.finfo(float).eps * abs(a - b))
I checked the numpy library and found the following definition for the standard deviation in numpy:
std = sqrt(mean(abs(x - x.mean())**2))
Why is the abs() function used? After all, mathematically the square of a number is positive by definition.
So I thought:
abs(x - x.mean())**2 == (x - x.mean())**2
The square of a real number is always positive, but this is not true for complex numbers.
A very simple example: j**2=-1
A more complex (pun intended) example: (3-2j)**2=(5-12j)
From documentation:
Note that, for complex numbers, std takes the absolute value before squaring, so that the result is always real and nonnegative.
Note:
Python uses j for the imaginary unit, while mathematicians use i.
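A quick sketch of the difference on a complex array (made-up values):
import numpy as np

x = np.array([3 - 2j, 1 + 1j, -2 + 0.5j])
d = x - x.mean()

print(np.mean(abs(d)**2))   # real and non-negative: this is what np.std squares
print(np.mean(d**2))        # complex in general: what (x - x.mean())**2 would give
print(np.std(x)**2)         # matches the first line (up to rounding)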
I have been asked to test a library provided by a 3rd party. The library is known to be accurate to n significant figures. Any less-significant errors can safely be ignored. I want to write a function to help me compare the results:
def nearlyequal( a, b, sigfig=5 ):
The purpose of this function is to determine if two floating-point numbers (a and b) are approximately equal. The function will return True if a==b (exact match) or if a and b have the same value when rounded to sigfig significant-figures when written in decimal.
Can anybody suggest a good implementation? I've written a mini unit-test. Unless you can see a bug in my tests, a good implementation should pass the following:
assert nearlyequal(1, 1, 5)
assert nearlyequal(1.0, 1.0, 5)
assert nearlyequal(1.0, 1.0, 5)
assert nearlyequal(-1e-9, 1e-9, 5)
assert nearlyequal(1e9, 1e9 + 1 , 5)
assert not nearlyequal( 1e4, 1e4 + 1, 5)
assert nearlyequal( 0.0, 1e-15, 5 )
assert not nearlyequal( 0.0, 1e-4, 6 )
Additional notes:
Values a and b might be of type int, float or numpy.float64. Values a and b will always be of the same type. It's vital that conversion does not introduce additional error into the function.
Let's keep this numerical, so functions that convert to strings or use non-mathematical tricks are not ideal. This program will be audited by a mathematician who will want to be able to prove that the function does what it is supposed to do.
Speed... I've got to compare a lot of numbers so the faster the better.
I've got numpy, scipy and the standard-library. Anything else will be hard for me to get, especially for such a small part of the project.
As of Python 3.5, the standard way to do this (using the standard library) is with the math.isclose function.
It has the following signature:
isclose(a, b, rel_tol=1e-9, abs_tol=0.0)
An example of usage with absolute error tolerance:
from math import isclose
a = 1.0
b = 1.00000001
assert isclose(a, b, abs_tol=1e-8)
If you want it to agree to within n decimal places, simply replace the last line with:
assert isclose(a, b, abs_tol=10**-n)
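For agreement to roughly n significant digits rather than n decimal places, the relative tolerance is the corresponding knob (a sketch, mirroring two of the test cases above):
from math import isclose

n = 5
assert isclose(1e9, 1e9 + 1, rel_tol=10**-n)        # agree to far more than 5 significant digits
assert not isclose(1e4, 1e4 + 1, rel_tol=10**-n)    # already differ in the 5th significant digit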
There is a function assert_approx_equal in numpy.testing (source here) which may be a good starting point.
def assert_approx_equal(actual, desired, significant=7, err_msg='', verbose=True):
    """
    Raise an assertion if two items are not equal up to significant digits.

    .. note:: It is recommended to use one of `assert_allclose`,
              `assert_array_almost_equal_nulp` or `assert_array_max_ulp`
              instead of this function for more consistent floating point
              comparisons.

    Given two numbers, check that they are approximately equal.
    Approximately equal is defined as the number of significant digits
    that agree.
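For example:
import numpy as np

# Agrees to 7 significant digits, so this passes silently
np.testing.assert_approx_equal(1.2345678, 1.23456785, significant=7)

try:
    # Already differs in the 4th significant digit, so this raises AssertionError
    np.testing.assert_approx_equal(1.234, 1.235, significant=5)
except AssertionError as e:
    print(e)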
Here's a take.
def nearly_equal(a, b, sig_fig=5):
    return (a == b or
            int(a * 10**sig_fig) == int(b * 10**sig_fig))
I believe your question is not defined well enough, and the unit-tests you present prove it:
If by 'round to N sig-fig decimal places' you mean 'N decimal places to the right of the decimal point', then the test assert nearlyequal(1e9, 1e9 + 1 , 5) should fail, because even when you round 1000000000 and 1000000001 to 0.00001 accuracy, they are still different.
And if by 'round to N sig-fig decimal places' you mean 'The N most significant digits, regardless of the decimal point', then the test assert nearlyequal(-1e-9, 1e-9, 5) should fail, because 0.000000001 and -0.000000001 are totally different when viewed this way.
If you meant the first definition, then the first answer on this page (by Triptych) is good.
If you meant the second definition, please say it, I promise to think about it :-)
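To make the two readings concrete, here is a small sketch (both helper functions are hypothetical, just to illustrate the distinction):
def close_decimal_places(a, b, n=5):
    # Reading 1: agree to n digits to the right of the decimal point
    return abs(a - b) < 0.5 * 10**-n

def close_sig_figs(a, b, n=5):
    # Reading 2: agree to n significant digits (a relative comparison)
    return a == b or abs(a - b) <= 0.5 * 10**(1 - n) * max(abs(a), abs(b))

print(close_decimal_places(1e9, 1e9 + 1))   # False: fails under reading 1
print(close_sig_figs(-1e-9, 1e-9))          # False: fails under reading 2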
There are already plenty of great answers, but here's a thought:
import math

def closeness(a, b):
    """Returns a measure of equality (for two floats), in units
    of decimal significant figures."""
    if a == b:
        return float("infinity")
    difference = abs(a - b)
    avg = (a + b) / 2
    return math.log10(avg / difference)

if closeness(1000, 1000.1) > 3:
    print("Joy!")
This is a fairly common issue with floating point numbers. I solve it based on the discussion in Section 1.5 of Demmel[1]. (1) Calculate the roundoff error. (2) Check that the roundoff error is less than some epsilon. I haven't used python in some time and only have version 2.4.3, but I'll try to get this correct.
Step 1. Roundoff error
def roundoff_error(exact, approximate):
    return abs(approximate/exact - 1.0)
Step 2. Floating point equality
def float_equal(float1, float2, epsilon=2.0e-9):
    return (roundoff_error(float1, float2) < epsilon)
There are a couple of obvious deficiencies with this code.
Division-by-zero error if the exact value is zero.
Does not verify that the arguments are floating-point values.
Revision 1.
def roundoff_error(exact, approximate):
    if exact == 0.0 or approximate == 0.0:
        return abs(exact + approximate)
    else:
        return abs(approximate/exact - 1.0)

def float_equal(float1, float2, epsilon=2.0e-9):
    if not isinstance(float1, float):
        raise TypeError("First argument is not a float.")
    elif not isinstance(float2, float):
        raise TypeError("Second argument is not a float.")
    else:
        return (roundoff_error(float1, float2) < epsilon)
That's a little better. If either the exact or the approximate value is zero, then the error is taken to be the magnitude of the other. If something besides a floating-point value is provided, a TypeError is raised.
At this point, the only difficult thing is setting the correct value for epsilon. I noticed in the documentation for version 2.6.1 that there is an epsilon attribute in sys.float_info, so I would use twice that value as the default epsilon. But the correct value depends on both your application and your algorithm.
[1] James W. Demmel, Applied Numerical Linear Algebra, SIAM, 1997.
"Significant figures" in decimal is a matter of adjusting the decimal point and truncating to an integer.
>>> int(3.1415926 * 10**3)
3141
>>> int(1234567 * 10**-3)
1234
>>>
Oren Shemesh got at part of the problem with the problem as stated, but there's more:
assert nearlyequal( 0.0, 1e-15, 5 )
also fails the second definition (and that's the definition I learned in school.)
No matter how many digits you are looking at, 0 will not equal a not-zero. This could prove to be a headache for such tests if you have a case whose correct answer is zero.
There is an interesting solution to this by B. Dawson (with C++ code) at "Comparing Floating Point Numbers". His approach relies on the strict IEEE representation of the two numbers and the lexicographical ordering enforced when those numbers are represented as unsigned integers.
I have been asked to test a library provided by a 3rd party
If you are using the default Python unittest framework, you can use assertAlmostEqual
self.assertAlmostEqual(a, b, places=5)
There are lots of ways of comparing two numbers to see if they agree to N significant digits. Roughly speaking you just want to make sure that their difference is less than 10^-N times the largest of the two numbers being compared. That's easy enough.
But, what if one of the numbers is zero? The whole concept of relative-differences or significant-digits falls down when comparing against zero. To handle that case you need to have an absolute-difference as well, which should be specified differently from the relative-difference.
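A minimal sketch of that combination (the function name and default tolerances are mine, not from the post):
def roughly_equal(a, b, rel_tol=1e-5, abs_tol=1e-12):
    # Relative check: the difference is small compared to the larger magnitude.
    # Absolute floor: handles comparisons against (or very near) zero, where
    # any purely relative criterion is meaningless.
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

print(roughly_equal(1e9, 1e9 + 1))   # True: relative difference is 1e-9
print(roughly_equal(0.0, 1e-15))     # True: caught by the absolute floor
print(roughly_equal(0.0, 1e-4))      # False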
I discuss the problems of comparing floating-point numbers -- including a specific case of handling zero -- in this blog post:
http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/