Safety of taking `int(numpy.sqrt(N))` - python

Let's say I'm considering M=N**2 where N is an integer. It appears that numpy.sqrt(M) returns a float (actually numpy.float64).
I could imagine that there could be a case where it returns, say, N-10**(-16) due to numerical precision issues, in which case int(numpy.sqrt(M)) would be N-1.
Nevertheless, my tests have N==numpy.sqrt(M) returning True, so it looks like this approximation isn't happening.
Is it safe for me to assume that int(numpy.sqrt(M)) is indeed accurate when M is a perfect square? If so, for bonus, what's going on in the background that makes it work?

To avoid missing the integer by 1e-15, you could round to the nearest integer instead of truncating:
int(numpy.sqrt(M)+0.5)
or
int(round(numpy.sqrt(M)))
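
On the "bonus" part of the question: IEEE 754 requires the square root to be correctly rounded, so on mainstream platforms the result for a perfect square M below 2**53 should be exactly N. Beyond that range M itself may not fit in a float64, and rounding no longer helps. The following is a minimal sketch, assuming Python 3.8+ for math.isqrt; the helper name exact_root is hypothetical and not part of the original answer.

```python
import math
import numpy as np

def exact_root(m):
    """Return (root, is_perfect_square) using integer-only arithmetic."""
    root = math.isqrt(m)          # floor of the exact square root
    return root, root * root == m

# Small perfect square: M < 2**53, so the float64 path is expected to be exact.
small_n = 12345
small_root = int(np.sqrt(small_n ** 2))

# Large perfect square: N itself is not representable as a float64,
# so no amount of rounding on the float result can recover it.
big_n = 10 ** 17 + 1
big_m = big_n ** 2
float_root = int(round(np.sqrt(float(big_m))))
int_root, is_square = exact_root(big_m)

print(small_root == small_n)      # float path works for small M
print(float_root == big_n)        # float path fails for huge M
print(int_root == big_n, is_square)
```

If M can grow past roughly 2**53, the integer-only route is the safer choice.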

Related

numpy.linalg.det returns very small numbers instead of 0

I calculated the determinant of a matrix using np.linalg.det(matrix), but it returns weird values. For example, it gives 1.1012323e-16 instead of 0.
Of course, I can round the result with numpy.around, but is there any option to set some "default" rounding for results of all numpy methods, including numpy.linalg.det?
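The determinant looks "weird" because of floating point arithmetic: numpy.linalg.det works through an LU factorization, and the rounding error accumulated there can turn an exact 0 into a tiny nonzero value.
Regarding your question, I believe numpy.set_printoptions is what you are looking for (see the NumPy documentation). Note that it only changes how arrays are displayed; the computed values themselves are unchanged.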
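
A minimal sketch of both ideas, assuming a singular 3x3 example matrix that is not from the original question. np.printoptions is the context-manager form of numpy.set_printoptions, and the tolerance used by np.isclose here (its default atol of 1e-8) is an assumption that may need tuning for matrices with very large or very small entries.

```python
import numpy as np

# Singular matrix: the rows are linearly dependent, so the exact determinant is 0.
matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0],
                   [7.0, 8.0, 9.0]])

det = np.linalg.det(matrix)

# Display only: suppress scientific notation for tiny values inside this block.
with np.printoptions(precision=6, suppress=True):
    print(np.array([det]))

# Value: explicitly snap results that are within tolerance of zero.
det_clean = 0.0 if np.isclose(det, 0.0) else det
print(det_clean)
```

The first approach affects printing only; the second changes the number that later code will see.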

Can I discard the complex portion of results generated with scipy.linalg.logm?

I have a matrix that looks like the one below. It is always a square matrix (up to 1000 x 1000) with values between 0 and 1:
data = np.array([[0.0308, 0.07919, 0.05694, 0.00662, 0.00927],
[0.07919, 0.00757, 0.00720, 0.00526, 0.00709],
[0.05694, 0.00720, 0.00518, 0.00707, 0.00413],
[0.00662, 0.00526, 0.00707, 0.01612, 0.00359],
[0.00927, 0.00709, 0.00413, 0.00359, 0.00870]])
When I try to take the natural log of this matrix, using scipy.linalg.logm, it gives me the following result.
print(logm(data))
>> [[-2.3492917 +1.42962407j 0.15360003-1.26717846j 0.15382223-0.91631624j 0.15673496+0.0443927j 0.20636448-0.01113953j]
[ 0.15360003-1.26717846j -3.75764578+2.16378501j 1.92614937-0.60836013j -0.13584605+0.27652444j 0.27819383-0.25190565j]
[ 0.15382223-0.91631624j 1.92614937-0.60836013j -5.08018989+2.52657239j 0.37036433-0.45966441j -0.03892575+0.36450564j]
[ 0.15673496+0.0443927j -0.13584605+0.27652444j 0.37036433-0.45966441j -4.22733838+0.09726189j 0.26291385-0.07980921j]
[ 0.20636448-0.01113953j 0.27819383-0.25190565j -0.03892575+0.36450564j 0.26291385-0.07980921j -4.91972246+0.06594195j]]
First of all, why is this happening? Based on another post I found here, pertaining to a different scipy.linalg method, this is due to truncation and rounding issues caused by floating point errors.
If that is correct, then how am I able to fix it? The second answer on that same linked post suggested this:
(2) All imaginary parts returned by numpy's linalg.eig are close to the machine precision. Thus you should consider them zero.
Is this correct? I can use numpy.real(data) to simply discard the complex portion of the values, but I don't know if that is a mathematically (or scientifically) robust thing to do.
Additionally, I attempted to use tensorflow's linalg.logm method, but got the exact same complex results. Does that mean this isn't unexpected behavior?
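
One way to check whether the imaginary parts are rounding noise is to look at the eigenvalues of the matrix. This is a minimal sketch, not an answer from the original thread: it relies on the facts that the matrix shown above is symmetric and that the principal logarithm of a matrix with a negative eigenvalue is genuinely complex. The imaginary parts in the printed result are of order 1, far above machine precision, which already points in that direction.

```python
import numpy as np

data = np.array([[0.0308, 0.07919, 0.05694, 0.00662, 0.00927],
                 [0.07919, 0.00757, 0.00720, 0.00526, 0.00709],
                 [0.05694, 0.00720, 0.00518, 0.00707, 0.00413],
                 [0.00662, 0.00526, 0.00707, 0.01612, 0.00359],
                 [0.00927, 0.00709, 0.00413, 0.00359, 0.00870]])

# The matrix is symmetric, so eigvalsh returns real eigenvalues in ascending order.
is_symmetric = np.allclose(data, data.T)
eigenvalues = np.linalg.eigvalsh(data)

# Any negative eigenvalue has a complex logarithm (log(-x) = log(x) + i*pi),
# so the complex part of logm(data) would be real signal, not noise.
has_negative_eigenvalue = bool(eigenvalues[0] < 0)

print(is_symmetric, has_negative_eigenvalue)
print(eigenvalues)
```

If the smallest eigenvalue is negative, discarding the imaginary part changes the result in a meaningful way rather than removing rounding error.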

how to compare two infinitely large numbers in python

I am creating an algorithm where a metric may take 3 values:
Infinite
Too large but not infinite
Some number that is the result of a calculation
Now, math.inf handles the infinite.
The third value has no fixed bounds. But I want the second value to always be smaller than infinity and always larger than the third value. Therefore I cannot give it some very large number like 999999999999999, since there is always a possibility that the calculation may exceed it.
What I am looking for is another constant like Ellipsis of Python 2.
How can I make this happen?
You can try sys.float_info.max:
>>> import sys
>>> sys.float_info.max
1.7976931348623157e+308
According to the documentation, it's the "maximum representable positive finite float".
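
A minimal sketch of how the three values would order, assuming the computed metric is a float; the constant names INFINITE and TOO_LARGE are hypothetical. One caveat worth knowing: Python integers are unbounded, so an int result can exceed sys.float_info.max while still comparing below math.inf.

```python
import math
import sys

INFINITE = math.inf
TOO_LARGE = sys.float_info.max    # largest finite float

# A finite float result always sorts at or below TOO_LARGE.
computed = 1e300 * 1e7            # 1e307, still finite
print(computed < TOO_LARGE < INFINITE)

# A float calculation that overflows becomes inf, not something in between.
overflowed = 1e308 * 10
print(overflowed == INFINITE)

# Caveat: arbitrary-precision ints can exceed the largest float.
huge_int = 10 ** 400
print(huge_int > TOO_LARGE, huge_int < INFINITE)
```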

Python Numbers mysteriously being rounded on comparison

I was having some problem with Numpy arrays and I stumbled across it, and it confused me.
I'm trying to compare 2 parts of arrays using array_equal
np.array_equal(updated_image_values[j][k],np.array(initial_means[i]))
This is returning False when the numbers are
[ 0.90980393 0.8392157 0.65098041]
[ 0.90980393 0.8392157 0.65098041]
Above is my print of the two arrays.
However, when I print the individual elements, one seems to be rounded off for no reason
print updated_image_values[j][k][0] #0.909804
print initial_means[i][0] #0.90980393
Then obviously when these individual elements are compared it returns False
print updated_image_values[j][k][0]==initial_means[i][0] #False
Can anyone explain why Python is doing the comparison wrong and for no apparent reason rounding the numbers?
I assume that updated_image_values has had some operations done on it. And what classes are the numbers?
My guess is that what you're seeing isn't "rounding", it's got something to do with the __str__ or __repr__ functions of the classes. The fact that you're seeing 0.90980393 when you print the list means that the element is not really rounded to 0.909804. Try "{0:.10f}".format(updated_image_values[j][k][0]).
As for the comparison, array_equal checks for exact equality and has no tolerance, so even a tiny difference introduced by floating point operations (or by a different dtype) makes it return False. Try using numpy.isclose or numpy.allclose instead.
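
A minimal sketch of one way this can happen, assuming (this is a guess, not something the question confirms) that one array is float32 and the other float64. The value 232/255 is chosen because it prints as roughly 0.90980393, matching the numbers in the question.

```python
import numpy as np

# Same nominal value stored at two different precisions.
a = np.array([232 / 255], dtype=np.float32)
b = np.array([232 / 255], dtype=np.float64)

# Exact comparison: the float32 value is promoted to float64 and differs in the low bits.
exact_match = np.array_equal(a, b)

# Tolerance-based comparison: equal within the default rtol=1e-05, atol=1e-08.
close_match = np.allclose(a, b)

print(exact_match, close_match)
print("{0:.10f} {1:.10f}".format(float(a[0]), float(b[0])))
```

Printing with an explicit format, as suggested above, makes the hidden difference visible.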

Realistic float value for "about zero"

I'm working on a program with fairly complex numerics, mostly in numpy with complex datatypes. Some of the calculation are returning nearly empty arrays with a complex component that is almost zero. For example:
(2 + 0j, 3+0j, 4+3.9320340202e-16j)
Clearly the third component is basically 0, but for whatever reason, this is the output of my calculation, and it turns out that for some of these nearly zero values, np.iscomplex() returns True. Rather than dig through that big code, I think it's sensible to just apply a cutoff. My question is, what is a sensible cutoff below which anything should be considered zero? 0.00? 0.000000? etc...
I understand that these values are due to rounding errors in floating point math, and just want to handle them sensibly. What is the tolerance/range one allows for such precision error? I'd like to set it to a parameter:
ABOUTZERO=0.000001
As others have commented, what constitutes 'almost zero' really does depend on your particular application, and how large you expect the rounding errors to be.
If you must use a hard threshold, a sensible starting point might be the machine epsilon, which bounds the relative error due to rounding in a single floating point operation. Intuitively, it is the gap between 1.0 and the next larger number representable in a given floating point format. Because it is a relative quantity, it usually needs to be scaled by the magnitude of your values and by some safety factor, since errors accumulate over many operations.
In numpy, you can get the machine epsilon for a particular float type using np.finfo:
import numpy as np
print(np.finfo(float).eps)
# 2.22044604925e-16
print(np.finfo(np.float32).eps)
# 1.19209e-07
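
For the specific case of nearly-zero imaginary parts, NumPy also offers np.real_if_close, whose tol argument is expressed in multiples of the machine epsilon when it is greater than 1. A minimal sketch using the values from the question; the choice tol=100 is simply the function's default, not a recommendation for every application.

```python
import numpy as np

values = np.array([2 + 0j, 3 + 0j, 4 + 3.9320340202e-16j])

# All imaginary parts are below 100 * eps, so a real array is returned.
cleaned = np.real_if_close(values, tol=100)
print(cleaned.dtype, cleaned)

# A genuinely complex input is left untouched.
kept = np.real_if_close(np.array([1 + 1e-3j]), tol=100)
print(kept.dtype)
```

The conversion is all-or-nothing: if any element has a significant imaginary part, the whole array stays complex.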
