Numpy Cholesky decomposition LinAlgError - python

In my attempt to perform a Cholesky decomposition on a variance-covariance matrix for a 2D array with periodic boundary conditions, under certain parameter combinations I always get LinAlgError: Matrix is not positive definite - Cholesky decomposition cannot be computed. I am not sure whether it is a numpy.linalg issue or an implementation issue, as the script is straightforward:
import numpy as np

sigma = 3.
U = 4

def FromListToGrid(l_):
    i = np.floor(l_/U)
    j = l_ - i*U
    return np.array((i, j))

Ulist = range(U**2)
Cov = []
for l in Ulist:
    di = np.array([np.abs(FromListToGrid(l)[0] - FromListToGrid(i)[0]) for i, x in enumerate(Ulist)])
    di = np.minimum(di, U - di)
    dj = np.array([np.abs(FromListToGrid(l)[1] - FromListToGrid(i)[1]) for i, x in enumerate(Ulist)])
    dj = np.minimum(dj, U - dj)
    d = np.sqrt(di**2 + dj**2)
    Cov.append(np.exp(-d/sigma))
Cov = np.vstack(Cov)
W = np.linalg.cholesky(Cov)
Attempts to remove potential singularities also failed to resolve the problem. Any help is much appreciated.

Digging a bit deeper into the problem, I tried printing the eigenvalues of the Cov matrix:
print np.linalg.eigvalsh(Cov)
And the answer turns out to be this
[-0.0801339 -0.0801339 0.12653595 0.12653595 0.12653595 0.12653595 0.14847999 0.36269785 0.36269785 0.36269785 0.36269785 1.09439988 1.09439988 1.09439988 1.09439988 9.6772531 ]
Aha! Notice the first two negative eigenvalues? Now, a matrix is positive definite if and only if all its eigenvalues are positive. So the problem with the matrix is not that it's close to 'zero', but that it's 'negative'. To extend @duffymo's analogy, this is the linear-algebra equivalent of trying to take the square root of a negative number.
Now, let's try to perform the same operation, but this time with scipy:
scipy.linalg.cholesky(Cov, lower=True)
And that fails with a more specific message:
numpy.linalg.linalg.LinAlgError: 12-th leading minor not positive definite
That tells us a bit more (though I couldn't really understand why it's complaining about the 12th minor).
Bottom line: the matrix is not quite 'close to zero', it is 'negative'.
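One common workaround, which is my own suggestion rather than part of the original answer, is to project Cov onto the nearest positive-semidefinite matrix by clipping the slightly negative eigenvalues, add a tiny jitter to the diagonal, and then factorize:
w, V = np.linalg.eigh(Cov)
Cov_psd = np.dot(V * np.clip(w, 0.0, None), V.T)   # rebuild with negative eigenvalues set to 0
Cov_psd += 1e-10 * np.eye(Cov_psd.shape[0])        # jitter so the matrix is strictly positive definite
W = np.linalg.cholesky(Cov_psd)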

The problem is the data you're feeding to it. The matrix is singular, according to the solver. That means a zero or near-zero diagonal element, so inversion is impossible.
It'd be easier to diagnose if you could provide a small version of the matrix.
Zero diagonals aren't the only way to create a singularity. If two rows are proportional to each other then you don't need both in the solution; they're redundant. It's more complex than just looking for zeroes on the diagonal.
If your matrix is correct, you have a non-empty null space. You'll need to change algorithms to something like SVD.
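As a rough sketch of that SVD route (my interpretation, not the answerer's code): an SVD always exists, and for a symmetric matrix the factor below reproduces the positive-semidefinite part of Cov, so it can stand in for a Cholesky factor when drawing correlated samples:
U_, s, _ = np.linalg.svd(Cov)
W = U_ * np.sqrt(s)     # np.dot(W, W.T) equals U_ diag(s) U_.T, the PSD part of Cov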
See my comment below.

Complex eigenvalues computation using scipy.sparse.linalg.eigs

Given the following input numpy 2D array A, which may be retrieved via the linked file hill_mat.npy, it would be great if I could compute only a subset of its eigenvalues using an iterative solver like scipy.sparse.linalg.eigs.
First of all, a little bit of context. This matrix A results from a quadratic eigenvalue problem of size N which has been linearized into an equivalent eigenvalue problem of double size 2*N. A has the following structure (blue being zeros):
plt.imshow(np.where(A > 1e-15,1.,0), interpolation='None')
and the following features:
A shape = (748, 748)
A dtype = float64
A sparsity ratio = 77.64841716949297 %
The true dimensions of A are much bigger than in this small reproducible example; I expect the real sparsity ratio and shape to be close to 95% and (5508, 5508) in that case.
The resulting eigenvalues of A are complex (they come in complex-conjugate pairs) and I am most interested in the ones with the smallest imaginary part in modulus.
Problem: when using the direct solver:
w_dense = np.linalg.eigvals(A)
idx = np.argsort(abs(w_dense.imag))
w_dense = w_dense[idx]
calculation times rapidly become prohibitive. I am thus looking to use a sparse algorithm:
from scipy.sparse import csc_matrix, linalg as sla
w_sparse = sla.eigs(A, k=100, sigma=0+0j, which='SI', return_eigenvectors=False)
but it seems that ARPACK doesn't find any eigenvalues this way. From the scipy/ARPACK tutorial, when looking for small eigenvalues with which='SI', one should use the so-called shift-invert mode by specifying the sigma keyword argument, so that the algorithm knows where it can expect to find these eigenvalues. Nonetheless, none of my attempts yielded any results...
Could someone more experienced with this function give me a hand in order to make this work?
Here follows a whole code snippet:
import numpy as np
from matplotlib import pyplot as plt
from scipy.sparse import csc_matrix, linalg as sla
A = np.load('hill_mat.npy')
print('A shape =', A.shape)
print('A dtype =', A.dtype)
print('A sparsity ratio =',(np.product(A.shape) - np.count_nonzero(A)) / np.product(A.shape) *100, '%')
# quick look at the structure of A
plt.imshow(np.where(A > 1e-15,1.,0), interpolation='None')
# direct
w_dense = np.linalg.eigvals(A)
idx = np.argsort(abs(w_dense.imag))
w_dense = w_dense[idx]
# sparse
w_sparse = sla.eigs(csc_matrix(A), k=100, sigma=0+0j, which='SI', return_eigenvectors=False)
Problem finally solved. I guess I should have read the documentation more carefully, but the following is still quite counter-intuitive and could, in my opinion, be better emphasized:
... ARPACK contains a mode that allows a quick determination of non-external eigenvalues: shift-invert mode. As mentioned above, this mode involves transforming the eigenvalue problem to an equivalent problem with different eigenvalues. In this case, we hope to find eigenvalues near zero, so we'll choose sigma = 0. The transformed eigenvalues will then satisfy ν = 1/(λ - σ) = 1/λ, so our small eigenvalues λ become large eigenvalues ν.
This way, when looking for small eigenvalues, in order to help ARPACK do the work, one should activate shift-invert mode by specifying an appropriate sigma value and also flip the subset requested in the which keyword argument (asking for the largest transformed eigenvalues, 'LM', instead of the smallest).
Thus, it is simply a matter of executing:
w_sparse = sla.eigs(csc_matrix(A), k=100, sigma=0+0j, which='LM', return_eigenvectors=False, maxiter=2000)
idx = np.argsort(abs(w_sparse.imag))
w_sparse = w_sparse[idx]
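As an optional sanity check, which is my own addition and assumes the dense solve above is still feasible on the small example, every eigenvalue returned by the shift-invert run should also appear in the dense spectrum:
print(all(np.min(np.abs(w_dense - w)) < 1e-6 for w in w_sparse))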
Therefore, I can only hope this mistake helps someone else :)

How to find A in a Matrix multiplication Ax=b, with some Values of A known, and A being left stochastic

I have been trying to find an answer to this problem for a couple of hours now, but I can't find anything so far...
So I have two vectors, let's call them b and x, of which I know all values. They add up to the same amount, so sum(b) = sum(x).
I also have a matrix, let's call it A, of which I know which values are 0; all the other values are unknown (but different from 0).
Furthermore, the elements of each column of A sum to 1 (I think that means A is a left stochastic matrix).
Generally, the equation can be written in the form A*x = b.
Now I'm trying to find the missing values of A.
I have found one answer to the general problem here: https://math.stackexchange.com/questions/1170843/solving-ax-b-when-x-and-b-are-given
Furthermore, I looked at the documentation of numpy.linalg (https://docs.scipy.org/doc/numpy/reference/routines.linalg.html), but I just can't figure out how to do it there.
It looks similar to a multiple linear regression problem, but on sklearn I couldn't find anything either: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression
Not a complete answer, but a bit of a more formal statement of the problem.
I think this can be solved as just a system of linear equations. Let
NZ = {(i,j)|a(i,j) is not fixed to zero}
Then write:
sum( j | (i,j) ∈ NZ, a(i,j) * x(j) ) = b(i) ∀i
sum( i | (i,j) ∈ NZ, a(i,j)) = 1 ∀j
This is just a system of linear equations in a(i,j). It may be under- (or over-) determined and it may be sparse; how to solve it depends a bit on that. It may be possible to treat these as constraints in a linear (or quadratic) programming problem. That would allow you to add an objective (in case of an underdetermined system, or, for an overdetermined one, to minimize the sum of squared deviations or the 1-norm of the deviations). In addition, we can add bounds on a(i,j) (e.g. lower bounds of zero and upper bounds of one). So a linear programming approach may be what you are looking for; a small sketch of the plain bounded least-squares version follows.
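Here is a minimal sketch of that formulation, using scipy.optimize.lsq_linear for the bounded least-squares solve. The zero pattern mask and the vectors x and b are made up for illustration, so treat it as a template rather than a drop-in solution:
import numpy as np
from scipy.optimize import lsq_linear

n = 4
rng = np.random.default_rng(0)
mask = rng.random((n, n)) > 0.3            # True where a(i,j) is unknown (not fixed to zero)
x = rng.random(n)
b = rng.random(n)
b *= x.sum() / b.sum()                     # enforce sum(b) == sum(x) as in the question

nz = np.argwhere(mask)                     # (i, j) index pairs of the unknowns
m = len(nz)

# Stack the row equations A*x = b and the column-sum equations into one system M a = rhs.
M = np.zeros((2 * n, m))
rhs = np.concatenate([b, np.ones(n)])
for k, (i, j) in enumerate(nz):
    M[i, k] = x[j]                         # contribution of a(i,j) to row i of A*x
    M[n + j, k] = 1.0                      # contribution of a(i,j) to the sum of column j

res = lsq_linear(M, rhs, bounds=(0.0, 1.0))    # keep the entries of A in [0, 1]
A_est = np.zeros((n, n))
A_est[nz[:, 0], nz[:, 1]] = res.x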
This problem looks a bit like matrix balancing. This is used a lot for economic data sets that come from different sources and where we want to reconcile the data to get a consistent data set usable for subsequent modeling.

Incorrect eigenvalue with simply QR iteration by python

I am trying to solve for eigenvalues and eigenvectors by QR iteration; the code is super simple. But the answer from QR iteration always has some values that are opposite in sign or otherwise incorrect compared to the answer from linalg.eig.
import numpy as np
import scipy.linalg as linalg
def qr_iteration(A):
    for i in range(100):
        Q, R = linalg.qr(A)
        A = np.dot(R, Q)
    return np.diag(R), Q
a, b = linalg.eig(A)
c, d = qr_iteration(A)
print(a) # [ 1.61168440e+01+0.j -1.11684397e+00+0.j -1.30367773e-15+0.j]
print(c) # [-1.61168440e+01 1.11684397e+00 -1.33381856e-15]
Some of the values are opposite in sign to the correct answer.
Which part of my code is wrong?
Thanks for all the answers.
The final eigenvalues should be read off the diagonal of A, not R (change the return statement to np.diag(A)). Moreover, the order of the eigenvalues on the diagonal may differ from that of other algorithms.
Are you dealing with real symmetric matrices? If not, the eigenvalues may be complex and the plain QR algorithm should not be applied. If the matrix has complex eigenvalues, they come in pairs of the same magnitude and the iteration will not converge; you will never get imaginary numbers out of your procedure.
To get the eigenvectors, you need to multiply all the Q's together, i.e. Q1*Q2*Q3*...; the column vectors of that product are the corresponding eigenvectors.
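A minimal sketch combining both fixes, assuming a real symmetric input so the iteration converges (my own illustration, not the original poster's code):
import numpy as np
import scipy.linalg as linalg

def qr_iteration(A, iters=100):
    V = np.eye(A.shape[0])
    for _ in range(iters):
        Q, R = linalg.qr(A)
        A = np.dot(R, Q)
        V = np.dot(V, Q)            # accumulate Q1*Q2*Q3*... for the eigenvectors
    return np.diag(A), V            # eigenvalues now sit on the diagonal of A

A = np.array([[2., 1.], [1., 3.]])
w, V = qr_iteration(A)
print(w)                            # compare with linalg.eigh(A)[0] (possibly in a different order)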

theano gradient with respect to matrix row

As the question suggests, I would like to compute the gradient with respect to a matrix row. In code:
import numpy.random as rng
import theano.tensor as T
from theano import function
t_x = T.matrix('X')
t_w = T.matrix('W')
t_y = T.dot(t_x, t_w.T)
t_g = T.grad(t_y[0,0], t_x[0]) # my wish, but DisconnectedInputError
t_g = T.grad(t_y[0,0], t_x) # no problems, but a lot of unnecessary zeros
f = function([t_x, t_w], [t_y, t_g])
y,g = f(rng.randn(2,5), rng.randn(7,5))
As the comments indicate, the code works without any problems when I compute the gradient with respect to the entire matrix. In that case the gradient is correctly computed, but the problem is that the result has non-zero entries only in row 0 (because the other rows of x obviously do not appear in the equations for the first row of y).
I have found this question, which suggests storing all rows of the matrix in separate variables and building graphs from these variables. In my setting, though, I have no idea how many rows X might have.
Would anybody have an idea how to get the gradient with respect to a single row of a matrix, or how I could omit the extra zeros in the output? If anybody has suggestions on how an arbitrary number of vectors can be stacked, that should work as well, I guess.
I realised that it is possible to get rid of the zeros when computing derivatives with respect to the entries in row i:
t_g = T.grad(t_y[i,0], t_x)[i]
and for computing the Jacobian, I found out that
t_g = T.jacobian(t_y[i], t_x)[:,i]
does the trick. However it seems to have a rather heavy impact on computation speed.
It would also be possible to approach this problem mathematically. The Jacobian of the matrix multiplication t_y w.r.t. t_x is simply the transpose of t_w.T, which is t_w in this case (the transpose of the transpose is the original matrix). Thus, the computation would be as simple as
t_g = t_w
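A plain-NumPy check of that analytic shortcut (my own addition, outside Theano): since y = x.dot(W.T), row i of y depends on row i of x through W alone, so the "gradient of a single row" can be read off directly:
import numpy as np
import numpy.random as rng

X = rng.randn(2, 5)
W = rng.randn(7, 5)
Y = X.dot(W.T)
# the Jacobian d Y[i, :] / d X[i, :] is exactly W, because Y[i] = W.dot(X[i])
print(np.allclose(Y[0], W.dot(X[0])))   # True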

scipy / numpy linalg.eigval result interpretation

I am a newbie when it comes to using Python libraries for numerical tasks. I was reading a paper on LexRank and wanted to know how to compute the eigenvectors of a transition matrix. I used the eigvals function and got a result that I have a hard time interpreting:
import numpy
from numpy import linalg as LA

a = numpy.zeros(shape=(4,4))
a[0,0]=0.333
a[0,1]=0.333
a[0,2]=0
a[0,3]=0.333
a[1,0]=0.25
a[1,1]=0.25
a[1,2]=0.25
a[1,3]=0.25
a[2,0]=0.5
a[2,1]=0.0
a[2,2]=0.0
a[2,3]=0.5
a[3,0]=0.0
a[3,1]=0.333
a[3,2]=0.333
a[3,3]=0.333
print LA.eigvals(a)
and the eigenvalues are:
[ 0.99943032+0.j
-0.13278637+0.24189178j
-0.13278637-0.24189178j
0.18214242+0.j ]
Can anyone please explain what j is doing here? Isn't the eigenvalue supposed to be a scalar quantity? How can I interpret this result broadly?
j is the imaginary unit, the square root of minus one. In math it is often denoted by i; in engineering, and in Python, it is denoted by j.
A single eigenvalue is a scalar quantity, but an (m, m) matrix will have m eigenvalues (and m eigenvectors). The Wiki page on eigenvalues and eigenvectors has some examples that might help you to get your head around the concepts.
As @unutbu mentions, j denotes the imaginary unit in Python. In general, a matrix may have complex eigenvalues (i.e. with real and imaginary components) even if it contains only real values (see here, for example). Symmetric real-valued matrices are an exception, in that they are guaranteed to have only real eigenvalues.
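As a hedged follow-up (my own addition, not from the original answers): if the goal is the LexRank-style centrality vector, you typically want the eigenvector whose eigenvalue is closest to 1. Since the rows of a sum to roughly 1, that is a left eigenvector, i.e. an eigenvector of a.T, and its negligible imaginary residue can be discarded:
import numpy as np
from numpy import linalg as LA

a = np.array([[0.333, 0.333, 0.0,   0.333],
              [0.25,  0.25,  0.25,  0.25 ],
              [0.5,   0.0,   0.0,   0.5  ],
              [0.0,   0.333, 0.333, 0.333]])

vals, vecs = LA.eig(a.T)              # left eigenvectors of a
k = np.argmin(np.abs(vals - 1.0))     # pick the eigenvalue closest to 1
stationary = np.real(vecs[:, k])
stationary /= stationary.sum()        # normalise to a probability vector
print(stationary)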
