I have the following code
import numpy as np

x = -10
for i in range(2, 10):
    print i, " | ", np.exp(-x**i)
with the following output:
2 | 3.72007597602e-44
3 | inf
4 | 0.0
5 | inf
6 | 0.0
7 | inf
8 | 0.0
9 | inf
Why are the results ~0 for even i and inf for odd i?
Since x = -10, x**i alternates between large positive and large negative values, and so does -(x**i), which is what is calculated when you write -x**i. np.exp(inf) = inf and np.exp(-inf) = 0, so for large enough arguments you're alternating between infinity and 0.
You probably wanted to write np.exp((-x)**i), which makes the exponent always positive.
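To make the alternation visible, here is a small sketch of my own (not the asker's code) that prints the argument passed to np.exp next to the result:

import numpy as np

x = -10
for i in range(2, 10):
    arg = -x**i  # -(x**i): the sign alternates because x is negative
    # overflows to inf for large positive arg, underflows to 0.0 for large negative arg
    print(i, arg, np.exp(arg))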
The maximum value of a double-precision floating-point number (which is what numpy uses for floating-point arithmetic) is a little under 1.8e308 (1.8 × 10^308).
The value for i = 3 in your table would be e^1000, which WolframAlpha says is a little less than 2 × 10^434. The problem is that the numbers you want to use are simply too large for numpy to handle, so for odd values you get infinity.
The inverse is true for the even numbers; you're calculating a number so small that numpy must treat it as effectively zero.
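If you want to check that limit directly, np.finfo exposes it (a quick sketch of mine):

import numpy as np

info = np.finfo(np.float64)
print(info.max)      # 1.7976931348623157e+308 -- just under 1.8e308
print(np.exp(1000))  # inf, since e^1000 ~ 2e434 exceeds that maximum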
Short answer
Because x^i with x < 0 is greater than zero if i is even and smaller than zero if i is odd. Since the exponential function produces very large results for very large positive inputs and very small results for very large negative inputs, the values are rounded off to infinity or zero, as is normal for IEEE-754 numbers.
For odd values
By default Python handles floating point numbers as IEEE-754 "double precision".
The limits of a double are roughly -1.7·10^308 and +1.7·10^308; everything beyond those limits overflows to -Inf or +Inf. Now if you calculate this for i = 3, you get:
exp(-((-10)**3)) = exp(-(-1000)) = exp(1000)
or approximately 1.9·10^434, clearly above the threshold. For every odd i, the result of (-10)**i is negative, thus the argument of exp is positive, and you will get values above the threshold.
For even values
For even values, this results in:
exp(-((-10)**4)) = exp(-(10000)) = exp(-10000)
which is approximately 1.13·10^-4343.
The smallest positive value a double can represent is about 4.9·10^-324 (the subnormal 2^-1074). The value obtained here is far smaller than that, and every positive value below that limit is rounded off to zero, simply because zero is the nearest representable value.
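For reference, the limits on the small side can be queried from numpy as well (my own sketch):

import numpy as np

print(np.finfo(np.float64).tiny)  # 2.2250738585072014e-308, smallest *normal* double
print(np.nextafter(0.0, 1.0))     # 5e-324, smallest positive (subnormal) double
print(np.exp(-10000))             # 0.0 -- underflows, with a RuntimeWarning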
Related
With numpy, I'm trying to understand what the maximum value is that can be downcast from float64 to float32 with a loss of accuracy less than or equal to 0.001.
Since I could not find a simple explanation online, I quickly came up with this piece of code to test:
import numpy as np

result = {}
for j in range(1, 1000):
    for i in range(1, 1_000_000):
        num = i + j / 1000
        x = np.array([num], dtype=np.float32)
        y = np.array([num], dtype=np.float64)
        if abs(x[0] - y[0]) > 0.001:
            result[j] = i
            break
Based on the results, it seems any positive value < 32768 can be safely downcast from float64 to float32 with an acceptable loss of accuracy (given the criterion of <= 0.001).
Is this correct?
Could someone explain the math behind it?
Thanks a lot
Assuming IEEE 754 representation, float32 has a 24-bit significand precision, while float64 has a 53-bit significand precision (except for “denormal” numbers).
In order to represent a number with an absolute error of at most 0.001, you need at least 9 bits to the right of the binary point, which means the numbers are rounded off to the nearest multiple of 1/512, thus having a maximum representation error of 1/1024 = 0.0009765625 < 0.001.
With 24 significant bits in total, and 9 to the right of the binary point, that leaves 15 bits to the left of the binary point, which can represent all integers less than 2^15 = 32768, as you have experimentally determined.
However, there are some numbers higher than this threshold that still have an error less than 0.001. As Eric Postpischil pointed out in his comment, all float64 values between 32768.0 and 32768.001 (the largest being exactly 32768 + 137438953/2^37), which the float32 conversion rounds down to exactly 32768.0, meet your accuracy requirement. And of course, any number that happens to be exactly representable in a float32 will have no representation error.
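A quick way to see the 2^15 boundary (my own check, not from the question):

import numpy as np

# Gap between adjacent float32 values just below and at 2**15:
print(np.spacing(np.float32(32767.0)))  # 0.001953125 (= 2**-9)
print(np.spacing(np.float32(32768.0)))  # 0.00390625  (= 2**-8)

# Below 2**15 the worst-case float32 rounding error is 2**-10 < 0.001:
print(abs(float(np.float32(32767.002)) - 32767.002))  # ~4.7e-05
# Above it the spacing doubles, so some values miss the 0.001 target:
print(abs(float(np.float32(32768.002)) - 32768.002))  # ~0.0019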
I happen to have a numpy array of floats:
a.dtype, a.shape
#(dtype('float64'), (32769,))
The values are:
a[0]
#3.699822718929953
all(a == a[0])
#True
However:
a.mean()
#3.6998227189299517
The mean is off in the 15th and 16th significant figures.
Can anybody show how this difference accumulates over the ~30K-element mean, and whether there is a way to avoid it?
In case it matters, my OS is 64-bit.
Here is a rough approximation of a bound on the maximum error. This will not be representative of average error, and it could be improved with more analysis.
Consider calculating a sum using floating-point arithmetic with round-to-nearest ties-to-even:
sum = 0;
for (i = 0; i < n; ++i)
sum += a[i];
where each a[i] is in [0, m).
Let ULP(x) denote the unit of least precision in the floating-point number x. (For example, in the IEEE-754 binary64 format with 53-bit significands, if the largest power of 2 not greater than |x| is 2^p, then ULP(x) = 2^(p−52).) With round-to-nearest, the maximum error in any operation with result x is ½ ULP(x).
If we neglect rounding errors, the maximum value of sum after i iterations is i·m. Therefore, a bound on the error in the addition in iteration i is ½ ULP(i·m). (Actually zero for i=1, since that case adds to zero, which has no error, but we neglect that for this approximation.) Then the total of the bounds on all the additions is the sum of ½ ULP(i·m) for i from 1 to n. This is approximately ½·n·(n+1)/2·ULP(m) = ¼·n·(n+1)·ULP(m). (This is an approximation because it moves i outside the ULP function, but ULP is a discontinuous function. It is "approximately linear", but there are jumps. Since the jumps are by factors of two, the approximation can be off by at most a factor of two.)
So, with 32,769 elements, we can say the total rounding error will be at most about ¼·32,769·32,770·ULP(m), about 2.7·10^8 times the ULP of the maximum element value. The ULP is 2^−52 times the greatest power of two not greater than m, so that is about 2.7·10^8 · 2^−52 ≈ 6·10^−8 times m.
Of course, the likelihood that 32,768 additions (not 32,769, because the first necessarily has no error) all round in the same direction by chance is vanishingly small, but I conjecture one might engineer a sequence of values that gets close to that.
An Experiment
Here is a chart of (in blue) the mean error over 10,000 samples of summing arrays with sizes 100 to 32,800 in steps of 100 and elements drawn randomly from a uniform distribution over [0, 1). The error was calculated by comparing the sum calculated with float (IEEE-754 binary32) to that calculated with double (IEEE-754 binary64). (The samples were all multiples of 2^−24, and double has enough precision so that the sum of up to 2^29 such values is exact.)
The green line is c·n·√n with c set to match the last point of the blue line. We see it tracks the blue line over the long term. At points where the average sum crosses a power of two, the mean error increases faster for a time. At these points, the sum has entered a new binade, and further additions have larger average errors due to the increased ULP. Over the course of the binade, this fixed ULP decreases relative to n, bringing the blue line back to the green line.
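A minimal version of that experiment (my own sketch; the data are random, so the exact numbers will vary):

import numpy as np

rng = np.random.default_rng(0)
a = rng.random(32769).astype(np.float32)  # ~33K values in [0, 1)

# Naive left-to-right accumulation in float32.
s32 = np.float32(0.0)
for v in a:
    s32 = np.float32(s32 + v)

# Much more accurate reference: pairwise summation in float64.
s64 = a.astype(np.float64).sum()

print(s32, s64, abs(float(s32) - s64))  # error typically on the order of 1e-2 to 1e-1 here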
This is due to the inability of the float64 type to store the sum of your float numbers with full precision. In order to get around this problem you need to use a larger data type, of course*. Numpy has a longdouble dtype that you can use in such cases:
In [23]: np.mean(a, dtype=np.longdouble)
Out[23]: 3.6998227189299530693
Also, note:
In [25]: print(np.longdouble.__doc__)
Extended-precision floating-point number type, compatible with C
``long double`` but not necessarily with IEEE 754 quadruple-precision.
Character code: ``'g'``.
Canonical name: ``np.longdouble``.
Alias: ``np.longfloat``.
Alias *on this platform*: ``np.float128``: 128-bit extended-precision floating-point number type.
* read the comments for more details.
The mean is (by definition):
a.sum()/a.size
Unfortunately, adding all those values up and dividing accumulates floating point errors. They are usually around the magnitude of:
np.finfo(np.float64).eps
Out[]: 2.220446049250313e-16
Yeah, e-16, about where you get them. You can make the error smaller by using higher-precision floats like float128 (if your system supports it), but the errors will always accumulate whenever you're summing a large number of floats together. If you truly want the identity, you'll have to hardcode it:
def mean_(arr):
    if np.all(arr == arr[0]):
        return arr[0]
    else:
        return arr.mean()
In practice, you never really want to use == between floats. Generally in numpy we use np.isclose or np.allclose to compare floats for exactly this reason. There are ways around it using other packages and leveraging arcane machine-level methods of calculating numbers to get (closer to) exact equality, but it's rarely worth the performance and clarity hit.
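For example (a small sketch of mine reproducing the situation in the question):

import numpy as np

a = np.full(32769, 3.699822718929953)
print(a.mean() == a[0])            # may well be False because of rounding
print(np.isclose(a.mean(), a[0]))  # True -- tolerant comparison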
I tried to add float('-inf') and 10 in Python; as far as I know, -inf is smaller than all other values.
So if I add -inf and 10 it should give 10 as the answer. But rather than giving 10 as the output, it gives -inf.
Is -inf bigger than 10?
-inf means negative infinity. It is "smaller" than all other values in that it is less than them. Adding negative infinity to any finite number still gives negative infinity.
-inf is the smallest number but what that means is that it's the negative number with the largest magnitude. It doesn't mean it's the closest you can get to zero without actually being zero (i.e., the smallest positive number):
<---------------------------|-------------------------->
-inf                        0        10             +inf
When you add a massive negative number to 10, you'll still end up with a massive negative number. That's the same idea as with -inf, other than the fact infinity is not a real number.
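A quick interactive check:

>>> float('-inf') + 10
-inf
>>> float('-inf') < 10
True
>>> float('-inf') + 1e308
-inf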
I am currently translating a MATLAB program into Python. I successfully ported all the previous vector operations using numpy. However, I am stuck on the following bit of code, which is a cosine similarity measure.
% W and ind are different sized matrices
dist = full(W * (W(ind2(range),:)' - W(ind1(range),:)' + W(ind3(range),:)'));
for i=1:length(range)
    dist(ind1(range(i)),i) = -Inf;
    dist(ind2(range(i)),i) = -Inf;
    dist(ind3(range(i)),i) = -Inf;
end
disp(dist)
[~, mx(range)] = max(dist);
I did not understand the following part.
dist(indx(range(i)),i) = -Inf;
What is actually happening when you use
= -Inf;
on the right side?
In Matlab (see: Inf):
Inf returns the IEEE® arithmetic representation for positive infinity.
So Inf produces a value that is greater than all other numeric values. -Inf produces a value that is guaranteed to be less than any other numeric value. It's generally used when you want to iteratively find a maximum and need a first value to compare to that's always going to be less than your first comparison.
According to Wikipedia (see: IEEE 754 Inf):
Positive and negative infinity are represented thus:
sign = 0 for positive infinity, 1 for negative infinity.
biased exponent = all 1 bits.
fraction = all 0 bits.
Python has the same concept using '-inf' (see Note 6 here):
float also accepts the strings “nan” and “inf” with an optional prefix “+” or “-” for Not a Number (NaN) and positive or negative infinity.
>>> a=float('-inf')
>>> a
-inf
>>> b=-27983.444
>>> min(a,b)
-inf
It just assigns a minus infinity value to the left-hand side.
It may appear weird to assign that value, particularly because a distance cannot be negative. But it looks like it's used to effectively remove those entries from the max computation in the last line.
If Python didn't have "infinity" (I don't know Python), and if dist is really a distance (hence nonnegative), you could use any negative value instead of -Inf to achieve the same effect, namely removing those entries from the max computation.
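Here is a small self-contained NumPy sketch of that masking idea (toy data of my own, not the asker's matrices):

import numpy as np

dist = np.array([[0.9, 0.2],
                 [0.7, 0.8],
                 [0.4, 0.6]])

# Exclude row 0 of column 0 and row 1 of column 1 from the max,
# just like dist(ind(range(i)), i) = -Inf does in the MATLAB snippet.
dist[0, 0] = -np.inf
dist[1, 1] = -np.inf

print(np.argmax(dist, axis=0))  # [1 2] -- the masked entries can never win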
The -Inf is typically used to initialize a variable so that you can later use it in a comparison inside a loop.
For instance, if I wanted to find the maximum value of a function (and had forgotten the command max), I would write something like:
function maxF = findMax(f,a,b)
    maxF = -Inf;
    x = a:0.001:b;
    for i = 1:length(x)
        if f(x(i)) > maxF
            maxF = f(x(i));
        end
    end
end
It is a method in MATLAB to make sure that any other value is larger than the starting one. The equivalent comparison value for integers in Python would be -sys.maxint - 1.
See for instance:
Maximum and Minimum values for ints
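In Python itself, float('-inf') works directly as that starting value; here is a rough equivalent of the MATLAB function above (the name find_max, the step size, and the test function are my own):

def find_max(f, a, b, step=0.001):
    # float('-inf') compares less than every finite number, so the first
    # real sample always replaces it.
    max_f = float('-inf')
    x = a
    while x <= b:
        if f(x) > max_f:
            max_f = f(x)
        x += step
    return max_f

print(find_max(lambda x: -(x - 2) ** 2, 0, 5))  # close to 0.0, attained near x = 2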
I am trying to find the cause of this result:
import numpy
result1 = numpy.rint(1.5)
result2 = numpy.rint(6.5)
print result1, result2
The output:
result1-> 2
result2-> 6
This is odd: result1 is correct but result2 is not (it should be 7, because rint rounds any float to the nearest integer).
Any idea? (THANKS!)
From numpy's documentation on numpy.around (equivalent to numpy.round), which presumably is also relevant for numpy.rint:
For values exactly halfway between rounded decimal values, NumPy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc. Results may also be surprising due to the inexact representation of decimal fractions in the IEEE floating point standard [R9] and errors introduced when scaling by powers of ten.
Also relevant: while there can be representation errors for large numbers, small half-integers are exactly representable in binary floating point; in particular 1.5 and 6.5 are exactly representable even in standard single-precision floats. Without a preference for odd, even, lower, or upper integers (or some other scheme), the behaviour here would be undefined.
As @wim points out in the comments, the behaviour of Python's built-in round is different: it rounds away from zero, preferring the upper integer for positive inputs and the lower integer for negative inputs (see http://docs.python.org/2/library/functions.html#round).
I think this is the rule of thumb: when you have a float midway between two integers, like 1.5, which lies midway between 1 and 2, both choices are equally good, so we prefer rounding to the even number (2 in this case); for 6.5, which lies midway between 6 and 7, the even choice is 6.
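You can see the round-half-to-even rule directly:

import numpy as np

print(np.rint([0.5, 1.5, 2.5, 6.5, 7.5]))
# [0. 2. 2. 6. 8.] -- every tie goes to the nearest even integer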