Using scipy.sparse.linalg.eigsh to get all eigenvectors - python

I would like to get all of the eigenvalues and eigenvectors for a particular real symmetric matrix. This is obviously possible with numpy.linalg.eigh; however, this matrix has a particular sparse structure that allows a linear-scaling dot product with a vector. For this reason, I would like to use scipy.sparse.linalg.eigsh, which accepts a LinearOperator in place of the input array and uses the implicitly restarted Lanczos method.
My problem is that scipy.sparse.linalg.eigsh does not allow calculation of all eigenvalues and eigenvectors (i.e. k=n), and the rank of my input matrix is typically equal to n. Is there any way to get around this, or does any other function allow similar functionality?
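For reference, a minimal sketch of the setup described in the question, assuming the fast matrix-vector product is wrapped in a hypothetical matvec function (the diagonal stand-in matrix is illustrative only). Note that eigsh (ARPACK) requires k < n, so at most n - 1 eigenpairs can be requested this way:

import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

n = 100
M = np.diag(np.arange(1.0, n + 1))        # stand-in for the structured matrix

def matvec(x):
    return M @ x                          # replace with the O(n) structured product

A = LinearOperator((n, n), matvec=matvec, dtype=float)
vals, vecs = eigsh(A, k=n - 1)            # eigsh cannot return all n eigenpairs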

Related

Large sparse matrix inversion on Python

I'm currently working with a least-squares algorithm in Python for some geodetic calculations.
I chose Python (which is not the fastest) and it works pretty well. However, in my code, I have to invert a large sparse symmetric matrix (non-positive definite, so I can't use Cholesky). I currently use np.linalg.inv(), which uses LU decomposition.
I'm pretty sure there is room for optimization in terms of speed.
I thought about using the Cuthill-McKee algorithm to rearrange the matrix before taking its inverse. Do you have any ideas or advice?
Thank you very much for your answers!
The good news is that if you're using any of the popular python libraries for linear algebra (namely, numpy), the speed of python really doesn't matter for the math – it's all done natively inside the library.
For example, when you write matrix_prod = matrix_a @ matrix_b, that's not triggering a bunch of Python loops to multiply the two matrices, but using numpy's internal implementation (which I think uses the FORTRAN LAPACK library).
The scipy.sparse.linalg module has you covered when it comes to solving sparse systems of equations (which is what you actually want instead of the inverse of a matrix). If you want to use sparse matrices, that's your way to go – note that there are matrices that are sparse in mathematical terms (i.e., most entries are 0), and matrices that are stored as sparse matrices, which means you avoid storing millions of zeros. Numpy itself doesn't have sparsely stored matrices, but scipy does.
If your matrix is densely stored but mathematically sparse, i.e. you're using standard numpy ndarrays to store it, then you won't gain any speed by implementing anything yourself in Python. The theoretical complexity gains would be outweighed by the practical slowness of Python compared to the highly optimized built-in inversion.
Inverting a sparse matrix usually loses the sparsity. Also, you never invert a matrix if you can avoid it at all! For a sparse matrix, solving the linear equation system Ax = b, with A your matrix and b a known vector, for x, is so much faster done forward than computing A⁻¹! So,
I'm currently working with a least-squares algorithm in Python for some geodetic calculations.
since LS says you don't need the inverse matrix, simply don't calculate it, ever. The point of LS is finding a solution that's as close as it gets, even if your matrix isn't invertible. Which can very well be the case for sparse matrices!
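As a minimal sketch of the "solve, don't invert" advice, with a hypothetical sparse symmetric system (names and sizes are illustrative, not from the original post):

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 1000
A = sp.random(n, n, density=0.01, format="csr")
A = A + A.T + sp.identity(n)           # symmetric stand-in for the normal-equations matrix
b = np.random.rand(n)

x = spla.spsolve(A.tocsc(), b)         # direct sparse solve of A x = b; no inverse is formed
# For a rank-deficient or over-determined least-squares problem, an iterative
# solver such as lsqr does the same job without ever building the inverse:
x_ls = spla.lsqr(A, b)[0]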

Python: Is there a matlab-like backslash operator?

Matlab and Julia have the backslash operator that solves linear systems. I don't really know what Matlab does, but Julia does not compute the inverse; it computes the effect the inverse has on a given vector, which is computationally cheaper.
I have a numpy sparse matrix and I want to apply its pseudo-inverse to a vector. Does Python have to compute the pseudo-inverse first or is there a backslash-like operator I can use?
Edit: In a sense I want to solve a linear system Ax=b. However, the matrix A does not have full rank and the vector b is not in A's range, so the system does not have a solution. In practice, I want to get the vector x that minimises the norm of Ax - b. This is exactly what the pseudo-inverse matrix does. My question is whether there is a function that will give me that without having to compute the pseudo-inverse first.
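A hedged sketch of one way to get backslash-style behaviour without computing the pseudo-inverse, assuming A is a scipy sparse matrix and b a vector (the small rank-deficient example is illustrative only):

import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

A = sp.csr_matrix([[1.0, 0.0],
                   [0.0, 0.0],
                   [0.0, 0.0]])        # rank-deficient
b = np.array([1.0, 2.0, 3.0])          # not in A's range

x = spla.lsqr(A, b)[0]                 # minimises ||Ax - b|| without forming pinv(A)
print(x)                               # [1. 0.]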

Suppress negligible complex numpy eigenvalues?

I am calculating the eigenvalues of a covariance matrix, which is real and symmetric positive semi-definite. Therefore, the eigenvalues and eigenvectors should all be real, however numpy.linalg.eig() is returning complex values with (almost) zero imaginary components.
The covariance matrix is too large to post here, but the eigenvalues come out as
[1.38174e+01+0.j, 9.00153e+00+0.j, ...]
with the largest imaginary component in the vector being negligible at -9.7557e-16j.
I think there is some machine precision issue here, as clearly the imaginary components are negligible (and given that my covariance matrix is real pos semi-def).
Is there a way to suppress returning the imaginary component using numpy eig (or scipy)? I'm trying to avoid an if statement that checks if the eigenvalue object is complex and then sets it to the real components only, if possible.
I think the best solution for this specific case is to use @PaulPanzer's suggestion, that is np.linalg.eigh. This works directly for Hermitian matrices and thus returns only real eigenvalues, which matches this specific use case exactly.
In general, retrieving the real part of the numbers in an array is as easy as:
>>> np.real(np.array([1+1j,2+1j]))
array([ 1., 2.])
numpy.real returns the real part of your numbers.
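As a small illustration (not from the original post): for a real symmetric covariance matrix, np.linalg.eigh returns purely real, ascending-sorted eigenvalues, so no post-hoc stripping of imaginary parts is needed, while np.real remains the fallback if you already have a complex result:

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
cov = np.cov(X, rowvar=False)          # real, symmetric positive semi-definite

w, v = np.linalg.eigh(cov)             # w has dtype float64, sorted ascending
print(w.dtype)                         # float64

w_c, _ = np.linalg.eig(cov.astype(complex))   # force a complex result for comparison
print(np.real(w_c))                    # negligible imaginary parts stripped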

Diagonalizing large sparse matrix with Python/Scipy

I am working with a large (complex) Hermitian matrix and I am trying to diagonalize it efficiently using Python/Scipy.
Using the eigh function from scipy.linalg, it takes about 3 s to generate and diagonalize a roughly 800x800 matrix and compute all the eigenvalues and eigenvectors.
The eigenvalues in my problem are symmetrically distributed around 0 and range from roughly -4 to 4. I only need the eigenvectors corresponding to the negative eigenvalues, though, which turns the range I am looking to calculate into [-4,0).
My matrix is sparse, so it's natural to use the scipy.sparse package and its functions to calculate the eigenvectors via eigsh, since it uses much less memory to store the matrix.
Also, I can tell the program to only calculate the negative eigenvalues via which='SA'. The problem with this method is that it now takes roughly 40 s to compute half the eigenvalues/eigenvectors. I know that the ARPACK algorithm is very inefficient when computing small eigenvalues, but I can't think of any other way to compute all the eigenvectors that I need.
Is there any way, to speed up the calculation? Maybe with using the shift-invert mode? I will have to do many, many diagonalizations and eventually increase the size of the matrix as well, so I am a bit lost at the moment.
I would really appreciate any help!
This question is probably better asked on http://scicomp.stackexchange.com, as it's more of a general math question than something specific to Scipy or programming.
If you need all eigenvectors, it does not make much sense to use ARPACK. Since you need N/2 eigenvectors, your memory requirement is at least N*N/2 floats, and in practice probably more. Using eigh requires N*N + 3*N floats. eigh is then within a factor of 2 of the minimum requirement, so the easiest solution is to stick with it.
If you can process the eigenvectors "on-line" so that you can throw the previous one away before processing the next, there are other approaches; look at the answers to similar questions on scicomp.
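A minimal sketch of the "stick with eigh" suggestion, keeping only the eigenpairs with negative eigenvalues (H is a random Hermitian stand-in, not the matrix from the question):

import numpy as np
from scipy.linalg import eigh

n = 800
A = np.random.randn(n, n) + 1j * np.random.randn(n, n)
H = (A + A.conj().T) / 2               # Hermitian test matrix

w, v = eigh(H)                         # all eigenvalues (ascending) and eigenvectors
neg = w < 0
w_neg, v_neg = w[neg], v[:, neg]       # only the negative part of the spectrum

# Newer SciPy versions can restrict the computation directly, e.g.
# eigh(H, subset_by_value=(-np.inf, 0)), which may avoid some of the work.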

How does numpy.linalg.inv calculate the inverse of an orthogonal matrix?

I'm implementing a LinearTransformation class, which inherits from numpy.matrix and uses numpy.matrix.I to calculate the inverse of the transformation matrix.
Does anyone know whether numpy checks for orthogonality of the matrix before trying to calculate the inverse? I ask because most of my matrices (but not all) will be orthogonal and I wondered whether to implement some quick orthogonality check before trying to invert.
It does not!
numpy.linalg.inv(A) actually calls numpy.linalg.solve(A, I), where I is the identity, and solve uses LAPACK's LU factorization.
That is, eventually, it does Gaussian elimination, where orthogonality isn't detected by default.
And a blind check of something like A @ A.T == I isn't worth it, as the matrix-matrix product is itself costly.
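As a sketch of that idea (my own illustration with hypothetical names, not numpy's behaviour): if you already know which transformations are orthogonal, you can skip the LU solve and use the transpose, optionally guarded by a cheap probabilistic check on a single vector rather than a full matrix-matrix product:

import numpy as np

def fast_inverse(A, assume_orthogonal=False, tol=1e-10):
    # Invert A, using A.T when A is assumed or detected to be orthogonal.
    if assume_orthogonal:
        return A.T
    # O(n^2) probabilistic check: apply A.T @ A to one random vector instead of
    # forming the full O(n^3) product A @ A.T.
    x = np.random.randn(A.shape[1])
    if np.allclose(A.T @ (A @ x), x, atol=tol):
        return A.T
    return np.linalg.inv(A)

Q, _ = np.linalg.qr(np.random.randn(5, 5))              # a random orthogonal matrix
print(np.allclose(fast_inverse(Q), np.linalg.inv(Q)))   # True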
