I am performing some vectorized calculation using numpy. I was investigating a bug I am having and I ended with this line:
(vertices[:,:,:,0]+vertices[:,:,:,1]*256)*4
The result was expected to be 100728 for the index vertices[0,0,17], however, I am getting 35192.
When I tried to change it into 4.0 instead of 4, I ended getting the correct value of 100728 and thus fixing my bug.
I would like to understand why the floating point matters here especially that I am using python 3.7 and it is multiplication, not even division.
Extra information:
vertices.shape=(203759, 12, 32, 3)
python==3.7
numpy==1.16.1
Edit 1:
vertices type is "numpy.uint8"
vertices[0, 0, 17] => [94, 98, 63]
The issue here is that you are using too small integers, and the number overflows and wraps around because numpy uses fixed width integers rather than infinite precision like python int's. Numpy will "promote" the type of a result based on the inputs, but it won't promote the result based on whether an overflow happens or not (it's done before the actual calculation.
In this case when you multiply: vertices[:,:,:,1]*256 (I shall call this A), 256 cannot be held in a uint8, so it goes to the next higher type: uint16 this allows the result of the multiplication to hold the correct value in this case, because the maximum possible value of any element in verticies is 255, so the largest value possible is 255*256, which fits just fine in a 16 bit uint.
Then you add vertices[:,:,:,0] + A (I shall call this B). if the largest value of A was 255*256, and the largest value of vertices[:,:,:,0] is 255 (again the largest value of a uint8), the largest sum of the two is equal to 216-1 (the largest value you can hold in a 16 bit unsigned int). This is still fine right up until you go for your last multiplication.
When you get to B * 4, numpy again has to decide what the return type should be. The integer 4 easily fits in a uint16, so numpy does not promote the type higher still to a uint32 or uint64 because it does not preemptively avoid overflows as previously described. This results in any multiplication products greater than 216-1 being returned as modulo 216.
If you instead use a floating point number (4. or 4.0), numpy sees this as a "higher" value type that cannot fit inside a uint16, so it promotes the result to floating point, which can accomodate much higher numbers without overflowing.
If you don't want to change the entire array: verticies to a larger dtype, you could simply take the result B and convert that before you multiply by 4 as such: B.astype(np.uint64) * 4. This will allow you to hold much larger values without overflowing (though it does not actually eliminate the problem if the value is larger than 4 ever).
Related
I happen to have a numpy array of floats:
a.dtype, a.shape
#(dtype('float64'), (32769,))
The values are:
a[0]
#3.699822718929953
all(a == a[0])
True
However:
a.mean()
3.6998227189299517
The mean is off by 15th and 16th figure.
Can anybody show how this difference is accumulated over 30K mean and if there is a way to avoid it?
In case it matters my OS is 64 bit.
Here is a rough approximation of a bound on the maximum error. This will not be representative of average error, and it could be improved with more analysis.
Consider calculating a sum using floating-point arithmetic with round-to-nearest ties-to-even:
sum = 0;
for (i = 0; i < n; ++n)
sum += a[i];
where each a[i] is in [0, m).
Let ULP(x) denote the unit of least precision in the floating-point number x. (For example, in the IEEE-754 binary64 format with 53-bit significands, if the largest power of 2 not greater than |x| is 2p, then ULP(x) = 2p−52. With round-to-nearest, the maximum error in any operation with result x is ½ULP(x).
If we neglect rounding errors, the maximum value of sum after i iterations is i•m. Therefore, a bound on the error in the addition in iteration i is ½ULP(i•m). (Actually zero for i=1, since that case adds to zero, which has no error, but we neglect that for this approximation.) Then the total of the bounds on all the additions is the sum of ½ULP(i•m) for i from 1 to n. This is approximately ½•n•(n+1)/2•ULP(m) = ¼•n•(n+1)•ULP(m). (This is an approximation because it moves i outside the ULP function, but ULP is a discontinuous function. It is “approximately linear,“ but there are jumps. Since the jumps are by factors of two, the approximation can be off by at most a factor of two.)
So, with 32,769 elements, we can say the total rounding error will be at most about ¼•32,769•32,770•ULP(m), about 2.7•108 times the ULP of the maximum element value. The ULP is 2−52 times the greatest power of two not less than m, so that is about 2.7•108•2−52 = 6•10−8 times m.
Of course, the likelihood that 32,768 sums (not 32,769 because the first necessarily has no error) all round in the same direction by chance is vanishingly small but I conjecture one might engineer a sequence of values that gets close to that.
An Experiment
Here is a chart of (in blue) the mean error over 10,000 samples of summing arrays with sizes 100 to 32,800 by 100s and elements drawn randomly from a uniform distribution over [0, 1). The error was calculated by comparing the sum calculated with float (IEEE-754 binary32) to that calculated with double (IEEE-754 binary64). (The samples were all multiples of 2−24, and double has enough precision so that the sum for up to 229 such values is exact.)
The green line is c n √n with c set to match the last point of the blue line. We see it tracks the blue line over the long term. At points where the average sum crosses a power of two, the mean error increases faster for a time. At these points, the sum has entered a new binade, and further additions have higher average errors due to the increased ULP. Over the course of the binade, this fixed ULP decreases relative to n, bringing the blue line back to the green line.
This is due to incapability of float64 type to store the sum of your float numbers with correct precision. In order to get around this problem you need to use a larger data type of course*. Numpy has a longdouble dtype that you can use in such cases:
In [23]: np.mean(a, dtype=np.longdouble)
Out[23]: 3.6998227189299530693
Also, note:
In [25]: print(np.longdouble.__doc__)
Extended-precision floating-point number type, compatible with C
``long double`` but not necessarily with IEEE 754 quadruple-precision.
Character code: ``'g'``.
Canonical name: ``np.longdouble``.
Alias: ``np.longfloat``.
Alias *on this platform*: ``np.float128``: 128-bit extended-precision floating-point number type.
* read the comments for more details.
The mean is (by definition):
a.sum()/a.size
Unfortunately, adding all those values up and dividing accumulates floating point errors. They are usually around the magnitude of:
np.finfo(np.float).eps
Out[]: 2.220446049250313e-16
Yeah, e-16, about where you get them. You can make the error smaller by using higher-accuracy floats like float128 (if your system supports it) but they'll always accumulate whenever you're summing a large number of float together. If you truly want the identity, you'll have to hardcode it:
def mean_(arr):
if np.all(arr == arr[0]):
return arr[0]
else:
return arr.mean()
In practice, you never really want to use == between floats. Generally in numpy we use np.isclose or np.allclose to compare floats for exactly this reason. There are ways around it using other packages and leveraging arcane machine-level methods of calculating numbers to get (closer to) exact equality, but it's rarely worth the performance and clarity hit.
I am getting different results for conversion of float to int in my Hill Cipher code (during decryption).
Code: https://github.com/krshrimali/Hill-Cipher/blob/master/hill_cipher.py
Issue: https://github.com/krshrimali/Hill-Cipher/issues/1
Code:
# create empty plain text string
plain_text = ""
# result is a matrix [[260. 574. 439.]]
# addition of 65 because inputs are uppercase letters
for i in range(dimensions):
plain_text += chr(int(result[0][i]) % 26 + 65)
Output: ABS
(the cipher text - encrypted text - was POH)
Result Matrix: (after multiplication of inverse with cipher key matrix)
[[ 260. 574. 539.]]
After conversion to int:
[260, 573, 538]
Can anyone explain why this happens and give a fix on this? Thanks.
The problem is that you're using int, which truncates toward zero.
Math with float values is inherently imprecise. If you don't understand why, the classic explanation is in What Every Computer Scientist Should Know About Floating-Point Numbers. But the short version is that every conversion and every intermediate calculation gets rounded to the nearest 52-bit fraction to the actual number. And that may mean that a calculation that would yield exactly 574 if performed with real numbers actually yields a number a tiny bit more or less than 574 when performed with floats. And if you end up with a number a tiny bit less than 574 and truncate it toward zero withint, you get 573.
In this case, what you want to do is use round instead, which rounds to the nearest integer. As long as you can be sure that your accumulated error is never as large as 0.5, that will do what you want. And, as long as you don't pick ridiculously huge key values (which would be pointless, because you don't get any more security that way), you can be sure of that.
However, there are two things worth considering here.
From a brief scan of the Hill cipher article at Wikipedia: It designed to be performed with quick pencil-and-paper operation. First, you don't need the inverse matrix, just a matrix that's inverse mod 26, which is easier to calculate, and means you stay in smaller numbers that are less likely to have this problem. And it means you can do all the math in integers, so the problem doesn't arise in the first place: create your matrix as an array with dtype=int, and there will be no rounding issues. And, as a bonus, if you do pick ridiculously huge key values, you'll get an error instead of incorrect results. (If you want to allow such values, you'd want to store Python unlimited-size int values in a dtype=object array. But if you don't need that, it just makes things slower and more complicated.)
I am using numpy to do the always fun "count the triangles in an adjacency matrix" task. (Given an nxn Adjacency matrix, how can one compute the number of triangles in the graph (Matlab)?)
Given my matrix A, numpy.matmul() computes the cube of A without problem, but for a large matrix numpy.trace() returns a negative number.
I extracted the diagonal using numpy.diagonal() and summed the entries using math.sum() and also using a for loop -- both returned the same negative number as numpy.trace().
An attempt with math.fsum() finally returned (the assumably correct) number 4,088,103,618 -- a seemingly small number for both python and for my 64-bit operating system, especially since python documents claim integer values are unlimited.
Surely this is an overflow or undefined behavior issue, but where does the inconsistency come from? I have performed the test on the following post to successfully validate my system architecture as 64 bit, and therefore numpy should also be a 64 bit package.
Do I have Numpy 32 bit or 64 bit?
To visualize the summation process print statements were added to the for-loop, output appears as follows with an asterisk marking the interesting line.
.
.
.
adding diag val 2013124 to the running total 2140898426 = 2142911550
adding diag val 2043358 to the running total 2142911550 = 2144954908
adding diag val 2035410 to the running total 2144954908 = 2146990318
adding diag val 2000416 to the running total 2146990318 = -2145976562 *
adding diag val 2062276 to the running total -2145976562 = -2143914286
adding diag val 2092890 to the running total -2143914286 = -2141821396
adding diag val 2092854 to the running total -2141821396 = -2139728542
.
.
.
Why would adding 2000416 to 2146990318 create an overflow? The sum is only 2148990734 -- a very small number for python!
Numpy doesn't use the "python types" but rather underlying C types which you have to specify that meets your needs. By default, an array of integers will be given the "int_" type which from the docs:
int_ Default integer type (same as C long; normally either int64 or int32)
Hence why you're seeing the overflow. You'll have to specify some other type when you construct your array so that it doesn't overflow.
When you do the addition with scalars you probably get a Warning:
>>> import numpy as np
>>> np.int32(2146990318) + np.int32(2035410)
RuntimeWarning: overflow encountered in long_scalars
-2145941568
So yes, it is overflow related. The maximum 32-bit integer is 2.147.483.647!
To make sure your arrays support a bigger range of values you could cast the array (I assume you operate on an array) to int64 (or a floating point value):
array = array.astype('int64') # makes sure the values are 64 bit integers
or when creating the array:
import numpy as np
array = np.array(something, dtype=np.int64)
NumPy uses fixed-size integers and these aren't arbitary precision integers. By default it's either a 32 bit integer or a 64 bit integer, which one depends on your system. For example Windows uses int32 even when python + numpy is compiled for 64-bit.
I'm just working on Project Euler problem 12, so I need to do some testing against numbers that are multiples of over 500 unique factors.
I figured that the array [1, 2, 3... 500] would be a good starting point, since the product of that array is the lowest possible such number. However, numpy.prod() returns zero for this array. I'm sure I'm missing something obvious, but what the hell is it?
>>> import numpy as np
>>> array = []
>>> for i in range(1,100):
... array.append(i)
...
>>> np.prod(array)
0
>>> array.append(501)
>>> np.prod(array)
0
>>> array.append(5320934)
>>> np.prod(array)
0
Note that Python uses "unlimited" integers, but in numpy everything is typed, and so it is a "C"-style (probably 64-bit) integer here. You're probably experiencing an overflow.
If you look at the documentation for numpy.prod, you can see the dtype parameter:
The type of the returned array, as well as of the accumulator in which the elements are multiplied.
There are a few things you can do:
Drop back to Python, and multiply using its "unlimited integers" (see this question for how to do so).
Consider whether you actually need to find the product of such huge numbers. Often, when you're working with the product of very small or very large numbers, you switch to sums of logarithms. As #WarrenWeckesser notes, this is obviously imprecise (it's not like taking the exponent at the end will give you the exact solution) - rather, it's used to gauge whether one product is growing faster than another.
Those numbers get very big, fast.
>>> np.prod(array[:25])
7034535277573963776
>>> np.prod(array[:26])
-1569523520172457984
>>> type(_)
numpy.int64
You're actually overflowing numpy's data type here, hence the wack results. If you stick to python ints, you won't have overflow.
>>> import operator
>>> reduce(operator.mul, array, 1)
933262154439441526816992388562667004907159682643816214685929638952175999932299156089414639761565182862536979208272237582511852109168640000000000000000000000L
You get the result 0 due to the large number of factors 2 in the product, there are more than 450 of those factors. Thus in a reduction modulo 2^64, the result is zero.
Why the data type forces this reduction is explained in the other answers.
250+125+62+31+15+7+3+1 = 494 is the multiplicity of 2 in 500!
added 12/2020: or, in closer reading the question and its code,
49+24+12+6+3+1 = 95 as the multiplicity of 2 in 99!
which is the product of the first part of your list. Still enough binary zeros at the end of the number to fill all the bit positions of a 64bit integer. Just to compare, you get
19+3 = 22 factors of 5 in 99!
which is also the number of trailing zeros in the decimal expression of this factorial.
r_capr
Out[148]: array([[-0.42300825, 0.90516059, 0.04181294]])
r_capr
np.linalg.norm(r_capr.T)
Out[149]: 0.99999999760432712
a.T
Out[150]: array([[-0.42300825, 0.90516059, 0.04181294]])
a.T
np.linalg.norm(a.T)
Out[151]: 1.0
In the above we can see for the same vector we have different norm? Why is it happening?
Machines are not 100% precise with numbers seeing as they are stored with finite precision (depending on architecture it could be 16 to 128 bits float point) so numbers that are very precise such as getting close to the limit of a float point mantissa are more prone to errors. Given the machine precision error, you can safely assume those numbers are actually the same. When computing norms it may make more sense to scale or otherwise modify your numbers to get less error prone results.
Also using dot(x,x) instead of an l2 norm can be much more accurate since it avoids the square root.
See http://en.wikipedia.org/wiki/Machine_epsilon for a better discussion since this is actually a fairly complex topic.
Your exact error is caused by machine errors but since your vectors are not actually equal (you are showing two logically equivalent vectors but their internal representation will be different) the calculation of the norm is probably being processed with different precision numbers.
See this:
a = mat('-0.42300825 ; 0.90516059 ; 0.04181294', np.float32)
r = mat('-0.42300825 ; 0.90516059 ; 0.04181294', np.float64)
print linalg.norm(a)
print linalg.norm(r)
and compare the results. It will get the exact results you are seeing. You can also verify this by checking the dtype property of your matrix.