How to correct numerical error in numpy sum - python

I'm trying to return a vector (1-d numpy array) that has a sum of 1.
The key is that it has to equal 1.0 as it represents a percentage.
However, there seem to be many cases where the sum does not equal 1.0 even after I divide each element by the total.
In other words, the sum of x does not equal 1.0 even when x = x'/sum(x').
One of the cases where this occurred was the vector below.
x = np.array([0.090179377557090171, 7.4787182000074775e-05, 0.52465058646452456, 1.3594135000013591e-05, 0.38508165466138505])
The summation of this vector, x.sum(), is 1.0000000000000002, whereas the summation of the vector divided by this value is 0.99999999999999978.
From that point on, repeating the division just flips the sum back and forth between those two values.
What I did was round the elements of the vector to the 10th decimal place (np.round(x, decimals = 10)) and then divide by the sum, which results in a sum of exactly 1.0. This works when I know the size of the numerical error.
Unfortunately, that would not be the case in usual circumstances.
I'm wondering whether there is a way to correct the numerical error when only the vector is known (without knowing the size of the error in advance), so that the sum equals exactly 1.0.
Edit:
Is floating point math broken?
That question doesn't answer mine: it only explains why the difference occurs, not how to resolve it.

A bit of a hacky solution:
x[-1] = 0
x[-1] = 1 - x.sum()
Essentially shoves the numerical errors into the last element of the array.
(No roundings beforehand are needed.)
Note: A mathematically simpler solution:
x[-1] = 1.0 - x[:-1].sum()
does not work, due to different behavior of numpy.sum on whole array vs a slice.
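For completeness, here is a runnable sketch of the trick (the helper name renormalize is mine; the array is the one from the question):
import numpy as np

def renormalize(x):
    """Divide by the total, then shove the leftover rounding error into x[-1]."""
    x = np.asarray(x, dtype=np.float64)
    x = x / x.sum()
    x[-1] = 0.0
    x[-1] = 1.0 - x.sum()
    return x

x = np.array([0.090179377557090171, 7.4787182000074775e-05,
              0.52465058646452456, 1.3594135000013591e-05,
              0.38508165466138505])
print(renormalize(x).sum() == 1.0)  # True for this vector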

Related

Numpy float mean calculation precision

I happen to have a numpy array of floats:
a.dtype, a.shape
#(dtype('float64'), (32769,))
The values are:
a[0]
#3.699822718929953
all(a == a[0])
#True
However:
a.mean()
#3.6998227189299517
The mean is off in the 15th and 16th significant figures.
Can anybody show how this difference accumulates over the ~30K additions in the mean, and whether there is a way to avoid it?
In case it matters my OS is 64 bit.
Here is a rough approximation of a bound on the maximum error. This will not be representative of average error, and it could be improved with more analysis.
Consider calculating a sum using floating-point arithmetic with round-to-nearest ties-to-even:
sum = 0;
for (i = 0; i < n; ++i)
    sum += a[i];
where each a[i] is in [0, m).
Let ULP(x) denote the unit of least precision in the floating-point number x. (For example, in the IEEE-754 binary64 format with 53-bit significands, if the largest power of 2 not greater than |x| is 2^p, then ULP(x) = 2^(p−52).) With round-to-nearest, the maximum error in any operation with result x is ½ULP(x).
If we neglect rounding errors, the maximum value of sum after i iterations is i•m. Therefore, a bound on the error in the addition in iteration i is ½ULP(i•m). (Actually zero for i=1, since that case adds to zero, which has no error, but we neglect that for this approximation.) Then the total of the bounds on all the additions is the sum of ½ULP(i•m) for i from 1 to n. This is approximately ½•n•(n+1)/2•ULP(m) = ¼•n•(n+1)•ULP(m). (This is an approximation because it moves i outside the ULP function, but ULP is a discontinuous function. It is "approximately linear," but there are jumps. Since the jumps are by factors of two, the approximation can be off by at most a factor of two.)
So, with 32,769 elements, we can say the total rounding error will be at most about ¼•32,769•32,770•ULP(m), about 2.7•10^8 times the ULP of the maximum element value. The ULP is 2^−52 times the greatest power of two not less than m, so that is about 2.7•10^8•2^−52 = 6•10^−8 times m.
Of course, the likelihood that 32,768 sums (not 32,769 because the first necessarily has no error) all round in the same direction by chance is vanishingly small but I conjecture one might engineer a sequence of values that gets close to that.
An Experiment
Here is a chart of (in blue) the mean error over 10,000 samples of summing arrays with sizes 100 to 32,800 by 100s and elements drawn randomly from a uniform distribution over [0, 1). The error was calculated by comparing the sum calculated with float (IEEE-754 binary32) to that calculated with double (IEEE-754 binary64). (The samples were all multiples of 2^−24, and double has enough precision so that the sum for up to 2^29 such values is exact.)
The green line is c n √n with c set to match the last point of the blue line. We see it tracks the blue line over the long term. At points where the average sum crosses a power of two, the mean error increases faster for a time. At these points, the sum has entered a new binade, and further additions have higher average errors due to the increased ULP. Over the course of the binade, this fixed ULP decreases relative to n, bringing the blue line back to the green line.
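For reference, here is a rough sketch that reproduces the shape of this experiment (my own reconstruction, with a coarser grid and far fewer samples than the original):
import numpy as np

# np.cumsum accumulates sequentially, so its last entry matches a naive
# "sum += a[i]" float32 loop; the float64 sum serves as the reference.
rng = np.random.default_rng(0)
samples = 200  # the original experiment used 10,000 samples per size

for n in range(100, 32801, 3200):
    a = rng.random((samples, n), dtype=np.float32)
    naive32 = np.cumsum(a, axis=1, dtype=np.float32)[:, -1]
    ref64 = a.sum(axis=1, dtype=np.float64)
    print(n, np.abs(naive32.astype(np.float64) - ref64).mean())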
This is due to the inability of the float64 type to store the sum of your floats with full precision. To get around this problem you need to use a larger data type, of course*. NumPy has a longdouble dtype that you can use in such cases:
In [23]: np.mean(a, dtype=np.longdouble)
Out[23]: 3.6998227189299530693
Also, note:
In [25]: print(np.longdouble.__doc__)
Extended-precision floating-point number type, compatible with C
``long double`` but not necessarily with IEEE 754 quadruple-precision.
Character code: ``'g'``.
Canonical name: ``np.longdouble``.
Alias: ``np.longfloat``.
Alias *on this platform*: ``np.float128``: 128-bit extended-precision floating-point number type.
* read the comments for more details.
The mean is (by definition):
a.sum()/a.size
Unfortunately, adding all those values up and dividing accumulates floating point errors. They are usually around the magnitude of:
np.finfo(np.float64).eps
Out[]: 2.220446049250313e-16
Yeah, e-16, about where you get them. You can make the error smaller by using higher-accuracy floats like float128 (if your system supports it), but the errors will always accumulate whenever you're summing a large number of floats together. If you truly want the identity, you'll have to hardcode it:
def mean_(arr):
    if np.all(arr == arr[0]):
        return arr[0]
    else:
        return arr.mean()
In practice, you never really want to use == between floats. Generally in numpy we use np.isclose or np.allclose to compare floats for exactly this reason. There are ways around it using other packages and leveraging arcane machine-level methods of calculating numbers to get (closer to) exact equality, but it's rarely worth the performance and clarity hit.
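For instance (a small sketch, using the array from the question above):
import numpy as np

a = np.full(32769, 3.699822718929953)
print(a.mean() == a[0])            # False here, as in the question
print(np.isclose(a.mean(), a[0]))  # True: compares within a relative/absolute tolerance
print(np.allclose(a, a[0]))        # True: the same idea over the whole array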

How to avoid math.sin(math.pi*2*VERY LARGE NUMBER) having a much larger error margin than math.sin(math.pi*2)?

I've read in other questions that, for example, sin(2π) is not exactly zero due to floating-point representation, but is very close. This very small error is no issue in my code, as I can just round to 5 decimals, for example.
However, when multiplying 2π by a very large number, the error is magnified a lot. The answer should be zero (or close), but is far from it.
Am I doing something fundamentally wrong in my thinking? If not, how can I stop the error in the floating-point value of π from getting "magnified" as the number of periods (2*PI*X) → ∞ ?
Notice that the last three results below are all the same. Can anyone explain why, even though the PI/2 case is exactly PI/2 larger than the one before it? Even with a huge offset along the sine curve, an increase of PI/2 should still produce a different number, right?
Checking a small number: SIN(2*PI)
print math.sin(math.pi*2)
RESULT: -2.44929359829e-16 AS EXPECTED → this error margin is OK for my purpose
Adding PI/2 to the code above: SIN(2*PI + PI/2)
print math.sin((math.pi*2)+(math.pi/2))
RESULT: 1.0 AS EXPECTED
Checking a very large number: SIN(2*PI*VERY LARGE NUMBER) (still expecting close to zero)
print math.sin(math.pi*2*(415926535897932384626433832795028841971693993751))
RESULT: -0.759488037749 NOT AS EXPECTED → this error margin is NOT OK for my purpose
Adding PI/2 to the code above: SIN(2*PI*VERY LARGE NUMBER + PI/2), expecting to get close to 1.0
print math.sin((math.pi*2*(415926535897932384626433832795028841971693993751))+(math.pi/2))
RESULT: -0.759488037749 NOT AS EXPECTED → why the same result as above when I added PI/2? (It should move a quarter period along the sine curve.)
Adding a random number (8) to the very large number, expecting to get neither 0 nor 1
print math.sin(math.pi*2*(415926535897932384626433832795028841971693993759))
RESULT: -0.759488037749 NOT AS EXPECTED → why the same result as above when I added 8?
This simply isn't going to work with double-precision variables.
The value of math.pi is correct only to about 16 places of decimals (53 bits in binary), so when you multiply it by a number like 415926535897932384626433832795028841971693993751 (159 bits), it would be impossible to get meaningful results.
You need to use an arbitrary precision math library instead. Try using mpmath for example. Tell it you want 1000 bits of precision, and then try your sums again:
>>> import mpmath
>>> mpmath.mp.prec=1000
>>> print(mpmath.sin((mpmath.pi*2*(415926535897932384626433832795028841971693993751))+(mpmath.pi/2)))
1.0
How to avoid math.sin(math.pi*2*VERY LARGE NUMBER) having a much larger error margin than math.sin(math.pi*2)?
You could % 1 that very large number:
>>> math.sin(math.pi*2*(415926535897932384626433832795028841971693993751))
-0.8975818793257183
>>> math.sin(math.pi*2*(415926535897932384626433832795028841971693993751 % 1))
0.0
>>> math.sin((math.pi*2*(415926535897932384626433832795028841971693993751))+(math.pi/2))
-0.8975818793257183
>>> math.sin((math.pi*2*(415926535897932384626433832795028841971693993751 % 1))+(math.pi/2))
1.0
The algorithms used are approximate, and the values (e.g. pi) are approximate. So π·SomeLargeNumber will have a largeish error (as π's value is approximate). The function used (by hardware?) will reduce the argument, perhaps using a slightly different value of π.
Note that floating point arithmetic does not satisfy the axioms for real arithmetic.
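To put a number on that magnification (a sketch of mine, not from the answers): math.pi carries a relative error of at most about 2^-53, and multiplying by the huge factor turns that into an absolute error spanning many full periods in the argument handed to sin().
import math

N = 415926535897932384626433832795028841971693993751
rel_err_pi = 2.0 ** -53                   # bound on math.pi's relative representation error
phase_err = rel_err_pi * 2 * math.pi * N  # absolute error in the argument of sin()
print(phase_err)                          # ~2.9e32 radians
print(phase_err / (2 * math.pi))          # ~4.6e31 full periods of uncertainty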

I cannot understand why sum(df['series']) != df['series'].sum()

I'm summing up the values in a series, but depending on how I do it, I get different results. The two ways I've tried are:
sum(df['series'])
df['series'].sum()
Why would they return different values?
Sample code:
import pandas as pd

s = pd.Series([
    0.428229,
    -0.948957,
    -0.110125,
    0.791305,
    0.113980,
    -0.479462,
    -0.623440,
    -0.610920,
    -0.135165,
    0.090192,
])
print(s.sum())
print(sum(s))
-1.4843630000000003
-1.4843629999999999
The difference is quite small here, but in a dataset with a few thousand values, it becomes quite large.
Floating point numbers are only accurate to a certain number of significant figures. Imagine if all of your numbers - including intermediate results - are only accurate to two significant figures, and you want the sum of the list [100, 1, 1, 1, 1, 1, 1].
The "true" sum is 106, but this cannot be represented since we're only allowed two significant figures;
The "correct" answer is 110, since that's the "true" sum rounded to 2 s.f.;
But if we naively add the numbers in sequence, we'll first do 100 + 1 = 100 (to 2 s.f.), then 100 + 1 = 100 (to 2 s.f.), and so on until the final result is 100.
The "correct" answer can be achieved by adding the numbers up from smallest to largest; 1 + 1 = 2, then 2 + 1 = 3, then 3 + 1 = 4, then 4 + 1 = 5, then 5 + 1 = 6, then 6 + 100 = 110 (to 2 s.f.). However, even this doesn't work in the general case; if there were over a hundred 1s then the intermediate sums would start being inaccurate. You can do even better by always adding the smallest two remaining numbers.
Python's built-in sum function uses the naive algorithm, while the df['series'].sum() method uses a more accurate algorithm with a lower accumulated rounding error. From the NumPy source code, which pandas uses:
For floating point numbers the numerical precision of sum (and
np.add.reduce) is in general limited by directly adding each number
individually to the result causing rounding errors in every step.
However, often numpy will use a numerically better approach (partial
pairwise summation) leading to improved precision in many use-cases.
This improved precision is always provided when no axis is given.
The math.fsum function uses an algorithm which is more accurate still:
In contrast to NumPy, Python's math.fsum function uses a slower but
more precise approach to summation.
For your list, the result of math.fsum is -1.484363, which is the correctly-rounded answer.
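Putting the three approaches side by side on the values from the question (a quick sketch):
import math
import numpy as np

values = [0.428229, -0.948957, -0.110125, 0.791305, 0.113980,
          -0.479462, -0.623440, -0.610920, -0.135165, 0.090192]

print(sum(values))        # naive left-to-right accumulation
print(np.sum(values))     # pairwise summation (when no axis is given)
print(math.fsum(values))  # -1.484363, the correctly rounded result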

What are the odds of a repeat in numpy.random.rand(n) (assuming perfect randomness)?

For the moment, put aside any issues relating to pseudorandom number generators and assume that numpy.random.rand perfectly samples from the discrete distribution of floating point numbers over [0, 1). What are the odds getting at least two exactly identical floating point numbers in the result of:
numpy.random.rand(n)
for any given value of n?
Mathematically, I think this is equivalent to first asking how many IEEE 754 singles or doubles there are in the interval [0, 1). Then I guess the next step would be to solve the equivalent birthday problem? I'm not really sure. Anyone have some insight?
The computation performed by numpy.random.rand for each element generates a number 0.<53 random bits>, for a total of 2^53 equally likely outputs. (Of course, the memory representation isn't a fixed-point 0.stuff; it's still floating point.) This computation is incapable of producing most binary64 floating-point numbers between 0 and 1; for example, it cannot produce 1/2^60. You can see the code in numpy/random/mtrand/randomkit.c:
double
rk_double(rk_state *state)
{
    /* shifts : 67108864 = 0x4000000, 9007199254740992 = 0x20000000000000 */
    long a = rk_random(state) >> 5, b = rk_random(state) >> 6;
    return (a * 67108864.0 + b) / 9007199254740992.0;
}
(Note that rk_random produces 32-bit outputs, regardless of the size of long.)
Assuming a perfect source of randomness, the probability of repeats in numpy.random.rand(n) is 1-(1-0/k)(1-1/k)(1-2/k)...(1-(n-1)/k), where k=2^53. It's probably best to use an approximation instead of calculating this directly for large values of n. (The approximation may even be more accurate, depending on how the approximation error compares to the rounding error accumulated in a direct computation.)
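A quick way to evaluate that expression, together with the standard birthday-problem approximation 1 − exp(−n(n−1)/(2k)), is sketched below (my own code, assuming the 2^53 equally likely outputs described above):
import math

def p_collision_exact(n, k=2**53):
    # The product from above, evaluated in log space to avoid a long chain of multiplications.
    log_p_no_collision = sum(math.log1p(-i / k) for i in range(n))
    return -math.expm1(log_p_no_collision)

def p_collision_approx(n, k=2**53):
    # Standard birthday-problem approximation: 1 - exp(-n*(n-1)/(2*k)).
    return -math.expm1(-n * (n - 1) / (2 * k))

print(p_collision_exact(100_000))   # ~5.55e-07
print(p_collision_approx(100_000))  # ~5.55e-07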
I think you are correct, this is like the birthday problem.
But you need to decide on the number of possible options. You do this by deciding the precision of your floating point numbers.
For example, if you decide on a precision of 2 digits after the decimal point, then there are 100 options (including zero and excluding 1).
If you have n such numbers, the probability of not having a collision is:
P = (100/100) · (99/100) · (98/100) · … · ((100 − n + 1)/100)
or, in general, given R possible numbers and N data points, the probability of no collision is:
P = (R/R) · ((R − 1)/R) · … · ((R − N + 1)/R) = R! / ((R − N)! · R^N)
The probability of a collision is then 1 − P.
This is because the probability of getting any given number is 1/R. And at any point, the probability of a data point not colliding with prior data points is (R-i)/R for i being the index of the data point. But to get the probability of no data points colliding with each other, we need to multiply all the probabilities of data points not colliding with those prior to them. Applying some algebraic operations, we get the equation above.

Float precision breakdown in python/numpy when adding numbers

I have some problems due to really small numbers used with NumPy. It took me several weeks to trace my persistent problems with numerical integration back to the fact that float64 precision gets lost when I add up floats in a function. Performing the mathematically identical calculation with a product instead of a sum leads to values that are fine.
Here is a code sample and a plot of the results:
from matplotlib.pyplot import *
from numpy import vectorize, arange
import math

def func_product(x):
    return math.exp(-x)/(1+math.exp(x))

def func_sum(x):
    return math.exp(-x)-1/(1+math.exp(x))

# mathematically, both functions are the same
vecfunc_sum = vectorize(func_sum)
vecfunc_product = vectorize(func_product)

x = arange(0., 300., 1.)
y_sum = vecfunc_sum(x)
y_product = vecfunc_product(x)

plot(x, y_sum, 'k.-', label='sum')
plot(x, y_product, 'r--', label='product')
yscale('symlog', linthreshy=1E-256)
legend(loc='lower right')
show()
As you can see, the summed values, which are quite small, are scattered around zero or are exactly zero, while the multiplied values are fine...
Please, could someone help/explain? Thanks a lot!
Floating point precision is pretty sensitive to addition/subtraction due to roundoff error. Eventually, 1+exp(x) gets so big that adding 1 to exp(x) gives the same thing as exp(x). In double precision that's somewhere around exp(x) == 1e16:
>>> (1e16 + 1) == (1e16)
True
>>> (1e15 + 1) == (1e15)
False
Note that math.log(1e16) is approximately 37, which is roughly where things go crazy in your plot.
You can have the same problem, but on different scales:
>>> (1e-16 + 1.) == (1.)
True
>>> (1e-15 + 1.) == (1.)
False
For a vast majority of the points in your regime, your func_product is actually calculating:
exp(-x)/exp(x) == exp(-2*x)
Which is why your graph has a nice slope of -2.
Taking it to the other extreme, your other version is calculating (at least approximately):
exp(-x) - 1./exp(x)
which is approximately
exp(-x) - exp(-x)
This is an example of catastrophic cancellation.
Let's look at the first point where the calculation goes awry, when x = 36.0
In [42]: np.exp(-x)
Out[42]: 2.3195228302435691e-16
In [43]: - 1/(1+np.exp(x))
Out[43]: -2.3195228302435691e-16
In [44]: np.exp(-x) - 1/(1+np.exp(x))
Out[44]: 0.0
The calculation using func_product does not subtract nearly equal numbers, so it avoids the catastrophic cancellation.
By the way, if you change math.exp to np.exp, you can get rid of np.vectorize (which is slow):
def func_product(x):
    return np.exp(-x)/(1+np.exp(x))

def func_sum(x):
    return np.exp(-x)-1/(1+np.exp(x))

y_sum = func_sum(x)
y_product = func_product(x)
The problem is that your func_sum is numerically unstable because it involves a subtraction between two very close values.
In the calculation of func_sum(200), for example, math.exp(-200) and 1/(1+math.exp(200)) have the same value, because adding 1 to math.exp(200) has no effect, since it is outside the precision of 64-bit floating point:
math.exp(200).hex()
0x1.73f60ea79f5b9p+288
(math.exp(200) + 1).hex()
0x1.73f60ea79f5b9p+288
(1/(math.exp(200) + 1)).hex()
0x1.6061812054cfap-289
math.exp(-200).hex()
0x1.6061812054cfap-289
This explains why func_sum(200) gives zero, but what about the points that lie off the x-axis? These are also caused by floating-point imprecision: it occasionally happens that math.exp(-x) is not equal to 1/math.exp(x). Ideally, math.exp(x) is the closest floating-point value to e^x, and 1/math.exp(x) is the closest floating-point value to the reciprocal of the floating-point number calculated by math.exp(x), not necessarily to e^-x. Indeed, math.exp(-100) and 1/(1+math.exp(100)) are very close and in fact differ only in the last unit:
math.exp(-100).hex()
0x1.a8c1f14e2af5dp-145
(1/math.exp(100)).hex()
0x1.a8c1f14e2af5cp-145
(1/(1+math.exp(100))).hex()
0x1.a8c1f14e2af5cp-145
func_sum(100).hex()
0x1.0000000000000p-197
So what you have actually calculated is the difference, if any, between math.exp(-x) and 1/math.exp(x). You can trace the line of the function math.pow(2, -52) * math.exp(-x) to see that it passes through the positive values of func_sum (recall that 52 is the number of explicitly stored significand bits in 64-bit floating point).
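A quick check of that claim for x = 100, using the hex values quoted above (my own sketch):
import math

x = 100.0
residual = math.exp(-x) - 1/(1 + math.exp(x))   # this is func_sum(100)
print(residual.hex())                           # 0x1.0000000000000p-197, one ULP of exp(-100)
print((math.pow(2, -52) * math.exp(-x)).hex())  # 0x1.a8c1f14e2af5dp-197, the same order of magnitude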
