I'm getting an inverse of a matrix even though its determinant is zero. I tried the code below:
import numpy as np
from scipy import linalg
matrix=np.array([[5,10],[2,4]])
print(linalg.det(matrix))
linalg.inv(matrix)
The problem here is a disconnect between how different routines calculate the matrix determinant. For example:
import numpy as np
matrix=np.array([[5.,10.],[2.,4.]])
print(np.linalg.det(matrix))
print(np.linalg.slogdet(matrix))
try:
    invmatrix=np.linalg.inv(matrix)
except np.linalg.LinAlgError:
    print("inversion failed")
produces no exception and prints this:
-1.1102230246251625e-15
(-1.0, -34.43421547668305)
i.e. when the determinant is not computed by direct algebraic expansion (which is what scipy.linalg.det appears to do), accumulated floating-point rounding error yields a non-zero determinant. The standard linear algebra routines therefore treat the matrix as non-singular and produce an incorrect inverse for an extremely ill-conditioned problem.
(Tested with numpy version 1.15.4 and scipy version 1.1.0)
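A minimal sketch of one way to guard against this, assuming it is acceptable to judge singularity by rank and condition number rather than by the determinant. The helper name `is_numerically_singular` and the threshold `cond_limit=1e12` are illustrative choices, not NumPy defaults.

```python
import numpy as np

def is_numerically_singular(a, cond_limit=1e12):
    """Return True if `a` is rank-deficient or very ill-conditioned.

    `cond_limit` is an arbitrary illustrative threshold (an assumption).
    """
    a = np.asarray(a, dtype=float)
    rank = np.linalg.matrix_rank(a)   # SVD-based rank with a built-in tolerance
    cond = np.linalg.cond(a)          # largest / smallest singular value
    return bool(rank < min(a.shape) or cond > cond_limit)

matrix = np.array([[5., 10.], [2., 4.]])
if is_numerically_singular(matrix):
    print("matrix is singular to working precision; not inverting")
else:
    print(np.linalg.inv(matrix))
```

Both checks are based on singular values, so they are less sensitive than the determinant to the rounding pattern of a particular factorization.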
When I try to integrate a periodic array with the scipy function sp.fftpack.diff(x,order=-1), it sometimes works and sometimes doesn't.
For example, when integrating x=sin(alpha) to obtain an array of the values of the integral evaluated from 0 to discrete values up to 2*pi, I get the expected result -cos(alpha). However, when I use it to calculate the integrals of x=sin(alpha)+cos(alpha)+1 over the same ranges I do not get the right answer, even though the function is periodic.
I do not understand how this function works. Does someone have an idea?
https://docs.scipy.org/doc/scipy/reference/generated/scipy.fftpack.diff.html
For example, the code below plots the results I obtain. I am also comparing them with the results of the trapezoidal rule, which does work once the offset is fixed.
import numpy as np
from scipy import fftpack as sp
from scipy import integrate as inte
import matplotlib.pyplot as plt
N=150
h=(2*np.pi)/N
x=np.arange(-np.pi,np.pi,h)
y=np.sin(x)+np.cos(x)+1
arrExact=-np.cos(x)+np.sin(x)+x
st=inte.cumtrapz(y,x,initial=0)-2.1
di=sp.diff(y, order=-1)-1
plt.plot(x,di,label='diff')
plt.plot(x,arrExact,label='Exact')
plt.plot(x,st,label='cumpTrapz')
plt.legend()
plt.show()
Edit: Reading the documentation again, I realized scipy assumes x[0]=0. However, I need to spectrally integrate arrays that do not satisfy this condition. How can I proceed?
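One possible way to proceed, sketched under the assumption that the samples lie on a uniform grid covering exactly one period. With `order=-1`, `fftpack.diff` drops the zero-frequency (mean) component, so the constant term 1 in the example is lost, and its integral (a linear term in x) is not periodic anyway. The hypothetical helper `spectral_antiderivative` below integrates the zero-mean part spectrally, adds the mean times x back by hand, and fixes the integration constant so the result vanishes at x[0].

```python
import numpy as np
from scipy import fftpack

def spectral_antiderivative(y, x):
    """Antiderivative of y on a uniform periodic grid x, equal to 0 at x[0].

    Assumes x is uniformly spaced and spans exactly one period of y.
    """
    period = (x[1] - x[0]) * len(x)
    mean = y.mean()
    # Spectral integration of the zero-mean (truly periodic) part.
    periodic_part = fftpack.diff(y - mean, order=-1, period=period)
    # The mean integrates to a linear ramp, which is added back explicitly.
    result = periodic_part + mean * x
    return result - result[0]

N = 150
h = 2 * np.pi / N
x = np.arange(N) * h - np.pi
y = np.sin(x) + np.cos(x) + 1
exact = -np.cos(x) + np.sin(x) + x
approx = spectral_antiderivative(y, x)
print(np.max(np.abs(approx - (exact - exact[0]))))
```

The comparison subtracts `exact[0]` because an antiderivative is only defined up to a constant; any other choice of constant can be added afterwards.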
I am implementing Andrew Ng's Machine Learning course in Python, but I got stuck because scipy's optimize functions keep failing or giving me dimension errors.
The goal is to find the minimum of the cost function (a scalar function that takes theta (dimension (1,401)), X (dimension (5000,401)), and y (dimension (5000,1)) as inputs). I have defined this cost function and its gradient with respect to the parameters. When I run one of the optimize functions (I have tried fmin_tnc, minimize, Nelder-Mead and others, none of which work), they either run for ages or keep raising errors saying that the array dimension is wrong or that a division by 0 was found: errors that I am not able to track down.
The weirdest thing is that this problem first popped up when I was doing exercise 2 on logistic regression, and then magically disappeared without me changing anything. Now, while implementing multi-class logistic regression, it has appeared again, and it won't go away even though I have literally copied and pasted the code from exercise 2!
The code is the following:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.io import loadmat
import scipy.misc
import matplotlib.cm as cm
from scipy.optimize import minimize,fmin_tnc
import random
def sigmoid(z):
    return 1/(1+np.exp(-z))
def J(theta,X,y):
    theta_t=np.transpose(theta)
    prod=np.matmul(X,theta_t)
    sigm=sigmoid(prod)
    vec=y*np.log(sigm)+(1-y)*np.log(1-sigm)
    return -np.sum(vec)/len(y)
def grad(theta,X,y):
    theta_t=np.transpose(theta)
    prod=np.matmul(X,theta_t)
    sigm=sigmoid(prod)
    one=sigm-y
    return np.matmul(np.transpose(one),X)/len(y)
data=loadmat('/home/marco/Desktop/MLang/mlex3/ex3/ex3data1.mat')
X,y = data['X'],data['y']
X=np.column_stack((np.ones(len(X[:,0])),X))
initial_theta=np.zeros((1,len(X[0,:])))
res=fmin_tnc(func=J, x0=initial_theta.flatten(), args=(X,y.flatten()), fprime=grad)
theta_opt=res[0]
Instead of returning the value of theta that minimizes the function as theta_opt, it says:
/home/marco/anaconda3/lib/python3.6/site-packages/ipykernel_launcher.py:8: RuntimeWarning: divide by zero encountered in log
I have no clue where this divide by zero occurs, given that there is literally no division in the whole code except for the division by len(y), which is 5000, and the division in the sigmoid function (1/(1+exp(-z))), whose denominator can never be 0!
Any suggestions?
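A possible explanation, offered as a sketch rather than a diagnosis of the exact data: NumPy reports log(0) as "divide by zero encountered in log", and the sigmoid saturates to exactly 0.0 or 1.0 in floating point for large |z|, so np.log(sigm) or np.log(1-sigm) can receive 0. The version below avoids forming the sigmoid inside the logarithm by using np.logaddexp. It assumes y holds 0/1 values; if y holds class labels, a one-vs-all target such as `(y == k).astype(float)` would be needed first. The synthetic data and the names `cost` and `grad` are illustrative only.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # numerically stable sigmoid

def cost(theta, X, y):
    z = X @ theta
    # -log(sigmoid(z)) = logaddexp(0, -z); -log(1 - sigmoid(z)) = logaddexp(0, z)
    return np.mean(y * np.logaddexp(0, -z) + (1 - y) * np.logaddexp(0, z))

def grad(theta, X, y):
    # Gradient of the mean cross-entropy; returns a 1-D array like theta.
    return X.T @ (expit(X @ theta) - y) / len(y)

# Small synthetic, noisy binary problem (an assumption, not the course data).
rng = np.random.default_rng(0)
X = np.column_stack((np.ones(200), rng.normal(size=(200, 2))))
y = (X[:, 1] + X[:, 2] + rng.normal(size=200) > 0).astype(float)

res = minimize(cost, np.zeros(X.shape[1]), args=(X, y), jac=grad, method="TNC")
print(res.x, res.fun)
```

Keeping theta and y one-dimensional throughout also sidesteps the (1,401) versus (401,) shape mismatches that the optimizers tend to complain about.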
I am trying to use code which uses Bessel function zeros for other calculations. I noticed the following piece of code produces results that I consider unexpected.
import scipy
from scipy import special
scipy.special.jn_zeros(1,2)
I would expect the result from this call to be
array([0., 3.83170597])
instead of
array([3.83170597, 7.01558667])
Is there a reason why the root at x=0.0 is not being returned?
From what I can see, the roots are symmetric about the origin except for the one at the origin itself, but I do not think this would be enough of a reason to leave that root out completely.
The computer I am using has python version 2.7.10 installed and is using scipy version 0.19.0
P.S. the following function is what I am trying to find the zeros of
scipy.special.j1
It appears to be the convention not to count the zero at the origin; see for example here. Maybe it is considered redundant?
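If the root at the origin is needed, it can be prepended by hand. The helper below is a hypothetical wrapper (not part of scipy) and assumes an integer order n >= 0; it relies on the fact that J_n(0) = 0 for n >= 1 while J_0(0) = 1.

```python
import numpy as np
from scipy import special

def jn_zeros_with_origin(n, nt):
    """First `nt` non-negative zeros of J_n, counting x = 0 when n >= 1."""
    if n == 0:
        return special.jn_zeros(0, nt)      # J_0(0) = 1, so no zero at the origin
    if nt == 1:
        return np.array([0.0])
    # jn_zeros only returns the positive zeros, so the origin is added explicitly.
    return np.concatenate(([0.0], special.jn_zeros(n, nt - 1)))

print(special.j1(0.0))
print(jn_zeros_with_origin(1, 2))
```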
This question is about the precision of computations in NumPy vs. Octave/MATLAB (the MATLAB code below has only been tested with Octave, however). I am aware of a similar question on Stack Overflow, namely this, but that seems somewhat far from what I'm asking below.
Setup
Everything is running on Ubuntu 14.04.
Python version 3.4.0.
NumPy version 1.8.1 compiled against OpenBLAS.
Octave version 3.8.1 compiled against OpenBLAS.
Sample Code
Sample Python code.
import numpy as np
from scipy import linalg as la
def build_laplacian(n):
    lap=np.zeros([n,n])
    for j in range(n-1):
        lap[j+1][j]=1
        lap[j][j+1]=1
    lap[n-1][n-2]=1
    lap[n-2][n-1]=1
    return lap
def evolve(s, lap):
    wave=la.expm(-1j*s*lap).dot([1]+[0]*(lap.shape[0]-1))
    for i in range(len(wave)):
        wave[i]=np.linalg.norm(wave[i])**2
    return wave
We now run the following.
np.min(evolve(2, build_laplacian(500)))
which gives something on the order of e-34.
We can produce similar code in Octave/MATLAB:
function lap=build_laplacian(n)
    lap=zeros(n,n);
    for i=1:(n-1)
        lap(i+1,i)=1;
        lap(i,i+1)=1;
    end
    lap(n,n-1)=1;
    lap(n-1,n)=1;
end
function result=evolve(s, lap)
    d=zeros(length(lap(:,1)),1); d(1)=1;
    result=expm(-1i*s*lap)*d;
    for i=1:length(result)
        result(i)=norm(result(i))^2;
    end
end
We then run
min(evolve(2, build_laplacian(500)))
and get 0. In fact, evolve(2, build_laplacian(500)))(60) gives something around e-100 or less (as expected).
The Question
Does anyone know what would be responsible for such a large discrepancy between NumPy and Octave? (Again, I haven't tested the code with MATLAB, but I'd expect to see similar results.)
Of course, one can also compute the matrix exponential by first diagonalizing the matrix. I have done this and have gotten similar or worse results (with NumPy).
EDITS
My scipy version is 0.14.0. I am aware that Octave/MATLAB use the Pade approximation scheme, and am familiar with this algorithm. I am not sure what scipy does, but we can try the following.
Diagonalize the matrix with numpy's eig or eigh (in our case the latter works fine since the matrix is Hermitian). As a result we get two matrices: a diagonal matrix D with the eigenvalues of the original matrix on the diagonal, and a matrix U whose columns are the corresponding eigenvectors, so that the original matrix is given by U.dot(D).dot(U.conj().T).
Exponentiate D (this is now easy since D is diagonal: only the diagonal entries are exponentiated).
Now, if M is the original matrix and d is the original vector d=[1]+[0]*n, we get scipy.linalg.expm(-1j*s*M).dot(d)=U.dot(E).dot(U.conj().T.dot(d)), where E is the diagonal matrix with entries numpy.exp(-1j*s*D[k,k]).
Unfortunately, this produces the same result as before. Thus this probably has something to do either with the way numpy.linalg.eig and numpy.linalg.eigh work, or with the way numpy does arithmetic internally.
So the question is: how do we increase numpy's precision? Indeed, as mentioned above, Octave seems to do a much finer job in this case.
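A minimal sketch of the diagonalization route, assuming numpy.linalg.eigh's convention (eigenvalues in a 1-D array w, eigenvectors as the columns of U, so M = U diag(w) U^H). The vectorized `build_laplacian` and the name `evolve_eigh` are illustrative rewrites, not the code used above. Note that in double precision the amplitudes carry absolute rounding errors of roughly 1e-16 relative to a unit-norm state, so squared magnitudes around 1e-34 are consistent with rounding noise rather than with a defect in either library.

```python
import numpy as np
from scipy import linalg as la

def build_laplacian(n):
    # Ones on the first super- and sub-diagonal, zeros elsewhere.
    off = np.ones(n - 1)
    return np.diag(off, 1) + np.diag(off, -1)

def evolve_eigh(s, lap):
    w, U = np.linalg.eigh(lap)            # lap = U @ diag(w) @ U.conj().T
    d = np.zeros(lap.shape[0])
    d[0] = 1.0
    coeffs = U.conj().T @ d               # expand d in the eigenbasis
    wave = U @ (np.exp(-1j * s * w) * coeffs)
    return np.abs(wave) ** 2              # squared magnitudes, real-valued

lap = build_laplacian(50)
p_eigh = evolve_eigh(2, lap)
p_expm = np.abs(la.expm(-1j * 2 * lap)[:, 0]) ** 2
print(np.max(np.abs(p_eigh - p_expm)))
```

Only the eigenvalues are exponentiated; applying numpy.exp to the full diagonal matrix would turn its off-diagonal zeros into ones.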
The following code
import numpy as np
from scipy import linalg as la
import scipy
print(np.__version__)
print(scipy.__version__)
def build_laplacian(n):
    lap=np.zeros([n,n])
    for j in range(n-1):
        lap[j+1][j]=1
        lap[j][j+1]=1
    lap[n-1][n-2]=1
    lap[n-2][n-1]=1
    return lap
def evolve(s, lap):
    wave=la.expm(-1j*s*lap).dot([1]+[0]*(lap.shape[0]-1))
    for i in range(len(wave)):
        wave[i]=la.norm(wave[i])**2
    return wave
r = evolve(2, build_laplacian(500))
print(np.min(abs(r)))
print(r[59])
prints
1.8.1
0.14.0
0
(2.77560227344e-101+0j)
for me, with OpenBLAS 0.2.8-6ubuntu1.
So it appears your problem is not immediately reproducible. Your code examples above are not runnable as-is (typos).
As mentioned in the scipy.linalg.expm documentation, the algorithm is from Al-Mohy and Higham (2009), which is different from the simpler scale-and-square Pade scheme in Octave.
As a consequence, the results I get from Octave are also slightly different, although they are eps-close in matrix norms (1,2,inf). MATLAB uses the Pade approach from Higham (2005), which seems to give the same results as Scipy above.
I'm a Python newbie coming from using MATLAB extensively. I was converting some code that uses log2 in MATLAB and I used the NumPy log2 function and got a different result than I was expecting for such a small number. I was surprised since the precision of the numbers should be the same (i.e. MATLAB double vs NumPy float64).
MATLAB Code
a = log2(64);
--> a=6
Base Python Code
import math
a = math.log2(64)
--> a = 6.0
NumPy Code
import numpy as np
a = np.log2(64)
--> a = 5.9999999999999991
Modified NumPy Code
import numpy as np
a = np.log(64) / np.log(2)
--> a = 6.0
So the native NumPy log2 function gives a result that causes the code to fail a test since it is checking that a number is a power of 2. The expected result is exactly 6, which both the native Python log2 function and the modified NumPy code give using the properties of the logarithm. Am I doing something wrong with the NumPy log2 function? I changed the code to use the native Python log2 for now, but I just wanted to know the answer.
No. There is nothing wrong with the code; floating-point numbers cannot represent every real value exactly, so a computed result can be off by a rounding error. Always allow a small tolerance (epsilon) when comparing float values. Read The Floating Point Guide and this post to know more.
EDIT - As cgohlke has pointed out in the comments,
Depending on the compiler used to build numpy, np.log2(x) is either computed by the C library or as 1.442695040888963407359924681001892137*np.log(x). See this link.
This may explain the slightly inexact output.
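For the power-of-two test itself, a logarithm is not required. The sketch below shows two exact checks and one tolerance-based comparison; the function names are hypothetical, and the integer version assumes a Python int input.

```python
import math
import numpy as np

def is_power_of_two_int(n):
    # A positive integer is a power of two iff exactly one bit is set.
    return n > 0 and (n & (n - 1)) == 0

def is_power_of_two_float(x):
    # frexp returns (m, e) with x = m * 2**e and 0.5 <= |m| < 1,
    # so a positive power of two always has mantissa exactly 0.5.
    return x > 0 and math.frexp(x)[0] == 0.5

print(is_power_of_two_int(64), is_power_of_two_int(63))
print(is_power_of_two_float(64.0), is_power_of_two_float(0.25))
# If log2 must be used, compare with a tolerance instead of ==.
print(np.isclose(np.log2(64), 6.0))
```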