I'm trying to convert some code to Python, but I've noticed that SciPy's sparse diagonal operations seem to have trouble handling systems that are diagonal.
For example, the following code can be written as a per-pixel convolution, which in my C++ implementation is really fast; with overlap it runs at roughly memory-access speed. I would expect Python to exploit this, because the system is diagonal.
When I try to run it in Python, it hogs system resources and nearly kills my machine. The Matlab version, by comparison, runs much faster.
import numpy as np
from scipy import sparse
from matplotlib.pyplot import imread   # assuming matplotlib for image loading

print(np.version.version)

color = imread('lena.png')
gray = np.mean(color, 2)        # average the colour channels to get a grayscale image
h, w = gray.shape
img = gray.flatten()

ne = h * w
L = np.ones(ne)
M = .5 * np.ones(ne)
R = np.ones(ne)
Diags = [L, M, R]
mtx = sparse.spdiags(Diags, [-1, 0, 1], ne, ne)

#blured = np.dot(mtx, img)   # Dies
I understand I could rewrite it as a 'for' loop going through the pixels, but is there a way I can keep a sparse data structure while keeping performance?
Use mtx.dot instead of np.dot:
blured = mtx.dot(img)
or just
blured = mtx * img  # where mtx is sparse and img is dense or sparse
np.dot treats both of its arguments as dense, even when one of them is sparse.
So the call in the question raises a MemoryError.
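For completeness, here is a minimal, self-contained sketch of the sparse-aware product (the sizes and values are purely illustrative):
import numpy as np
from scipy import sparse

n = 1_000_000                                             # e.g. number of pixels
diags = [np.ones(n), 0.5 * np.ones(n), np.ones(n)]
mtx = sparse.spdiags(diags, [-1, 0, 1], n, n).tocsr()     # CSR is efficient for matvec
img = np.random.rand(n)

blurred = mtx.dot(img)    # sparse-aware product, cost proportional to the number of nonzeros
blurred = mtx @ img       # equivalent
# np.dot(mtx, img) does not use the sparse structure and should be avoided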
I have a vector of indices (let's call it peo), a sparse matrix P and a matrix W. In MATLAB I can do an operation of this kind:
P(peo, peo) = W(peo, peo)
Is there a way to do the same in Python while maintaining the same computational complexity?
Choosing a library
There is a very similar way of expressing that in Python if you use dense matrices. Using a sparse matrix is a little more complex. In general, if dense matrices don't slow your code down too much and memory is not a problem, I would stick to dense matrices with numpy, since it is very convenient. (As they say, premature optimization is the root of all evil... or something like that.) However, if you really need sparse matrices, scipy offers an option for that.
Dense matrices
If you want to use dense matrices, you can use numpy to define the matrices, and peo should be defined as a list. Note that indexing with a list (or vector) doesn't work the same way in MATLAB as it does in Python. Check this for details (thank you Cris Lunego for pointing that out). To circumvent this and obtain the same behaviour as MATLAB, we will use numpy.ix_. This allows us to reproduce the MATLAB indexing behavior with minimal alteration to our code.
Here is an example:
import numpy as np
# Dummy matrices definition
peo = [1, 3, 4]
P = np.zeros((5, 5))
W = np.ones((5, 5))
# Assignment (using np.ix_ to reproduce matlab behavior)
P[np.ix_(peo, peo)] = W[np.ix_(peo, peo)]
print(P)
Sparse matrices
For sparse matrices, scipy has a package called sparse that lets you use sparse matrices in much the same way MATLAB does. It also gives you an actual choice of how the matrix should be represented, where MATLAB doesn't. With great power comes great responsibility: taking the time to read the pros and cons of each representation will help you choose the right one for your application.
In general it's hard to guarantee the exact same complexity, because the languages are different and I don't know the intricate details of each. But the concept of sparse matrices is the same in scipy and MATLAB, so you can expect the complexity to be comparable. (You might even be faster in Python, since you can choose a representation tailored to your needs.)
Note that in this case, if you want to keep working the way you describe for MATLAB, you should choose a dok or lil representation. Those are the only two formats that allow efficient index access and changes to the sparsity structure.
Here is an example of what you want to achieve using the dok representation:
from scipy.sparse import dok_matrix
import numpy as np
# Dummy matrices definition
peo = [1, 2, 4]
P = dok_matrix((5, 5))
W = np.ones((5, 5))
# Assignment (using np.ix_ to reproduce matlab behavior)
P[np.ix_(peo, peo)] = W[np.ix_(peo, peo)]
print(P.toarray())
If you are interested in the pros and cons of sparse matrix representations and algebra in Python, here is a post that explores this a bit, as well as performance. Take it with a grain of salt since it is a little old, but the ideas behind it are still mostly correct.
I'm currently working with a least-squares algorithm in Python for some geodetic calculations.
I chose Python (which is not the fastest) and it works pretty well. However, in my code I have to invert a large sparse symmetric matrix (non-positive definite, so I can't use Cholesky); see the image below. I currently use np.linalg.inv(), which uses the LU decomposition method.
I'm pretty sure there is some optimization to do in terms of speed.
I thought about the Cuthill-McKee algorithm to rearrange the matrix and then take its inverse. Do you have any ideas or advice?
Thank you very much for your answers!
The good news is that if you're using any of the popular Python libraries for linear algebra (namely numpy), the speed of Python really doesn't matter for the math: it's all done natively inside the library.
For example, when you write matrix_prod = matrix_a @ matrix_b, that's not triggering a bunch of Python loops to multiply the two matrices, but using numpy's internal implementation (which I think uses the FORTRAN LAPACK library).
The scipy.sparse.linalg module has your back covered when it comes to solving sparsely stored systems of equations (which is what you do when you apply the inverse of a matrix). If you want to use sparse matrices, that's your way to go. Notice there is a difference between matrices that are sparse in mathematical terms (i.e., most entries are 0) and matrices that are stored as sparse matrices, meaning you avoid storing millions of zeros. Numpy itself doesn't have sparsely stored matrices, but scipy does.
If your matrix is densely stored but mathematically sparse, i.e. you're using standard numpy ndarrays to store it, then you won't get any faster by implementing anything in Python yourself. The theoretical complexity gains would be outweighed by the practical slowness of Python compared to a highly optimized inversion routine.
Inverting a sparse matrix usually destroys the sparsity. Also, you never invert a matrix if you can avoid it at all! For a sparse matrix, solving the linear system Ax = b for x, with A your matrix and b a known vector, is so much faster than computing A⁻¹! So,
I'm currently working with a least-squares algorithm in Python for some geodetic calculations.
since LS says you don't need the inverse matrix, simply don't calculate it, ever. The point of least squares is finding a solution that's as close as it gets even if your matrix isn't invertible, which can very well be the case for sparse matrices!
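To make that concrete, here is a minimal sketch of solving the system instead of inverting it (the matrix and right-hand side are made up for illustration; spsolve and lsqr are the relevant scipy.sparse.linalg routines):
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve, lsqr

# Illustrative sparse symmetric system
n = 100_000
A = sparse.diags([np.ones(n - 1), 4 * np.ones(n), np.ones(n - 1)], [-1, 0, 1], format="csc")
b = np.random.rand(n)

x = spsolve(A, b)       # solves A x = b directly, A^-1 is never formed
x_ls = lsqr(A, b)[0]    # least-squares solution, also works if A is singular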
I am using scipy.signal.correlate to perform large 2D convolutions. I have a large number of arrays that I want to operate on, and so naturally thought multiprocessing.Pool could help. However, using the following simple setup (on a 4-core CPU) provides no benefit.
import multiprocessing as mp
import numpy as np
from scipy import signal
arrays = [np.ones([500, 500])] * 100
kernel = np.ones([30, 30])
def conv(array, kernel):
    return (array, kernel, signal.correlate(array, kernel, mode="valid", method="fft"))
pool = mp.Pool(processes=4)
results = [pool.apply(conv, args=(arr, kernel)) for arr in arrays]
Changing the process count to {1, 2, 3, 4} gives approximately the same time (2.6 s ± 0.2).
What could be going on?
I think the problem is in this line:
results = [pool.apply(conv, args=(arr, kernel)) for arr in arrays]
Pool.apply is a blocking operation: it runs conv on one element, waits for it to finish, and only then moves on to the next element, so even though it looks like you are multiprocessing, nothing is actually being distributed. You need to use pool.map instead to get the behavior you are looking for:
from functools import partial

conv_partial = partial(conv, kernel=kernel)
results = pool.map(conv_partial, arrays)
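Alternatively, if you want to keep the per-call form, pool.apply_async submits every job without blocking and you only wait when collecting the results (a small sketch reusing the conv, pool, arrays and kernel from the question):
# submit all jobs at once; each call returns an AsyncResult immediately
async_results = [pool.apply_async(conv, args=(arr, kernel)) for arr in arrays]
# block only here, while gathering the finished results
results = [r.get() for r in async_results]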
What could be going on?
At first I thought the reason was that the underlying FFT implementation is already parallelized. While the Python interpreter is single threaded, the C code that may be called from Python may be able to fully utilize your CPU.
However, the underlying FFT implementation of scipy.signal.correlate seems to be NumPy's fftpack (translated from Fortran written in 1985), which appears to be single threaded, judging from the fftpack page.
Indeed, I ran your script and got considerably higher CPU usage. With four cores on my computer and four processes I got a speedup of ~2 (for 2000x2000 arrays, and using the code changes from Raw Dawg's answer).
The overhead from creating the processes and communicating with them eats up part of the benefit of more efficient CPU usage.
You can still try to optimize the array sizes, or compute only a small part of the correlation if you don't need all of it. If the kernel is always the same, compute the FFT of the kernel only once and reuse it (this would require implementing part of scipy.signal.correlate yourself). You could also try single precision instead of double, or do the correlation on the graphics card with CUDA.
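As an illustration of reusing the kernel FFT, here is a minimal sketch (correlate_valid_fft is a hypothetical helper; it assumes real-valued arrays that all share the same shape and only reproduces mode="valid"):
import numpy as np
from scipy import fft

def correlate_valid_fft(arrays, kernel):
    # cross-correlation = convolution with the flipped kernel (for real-valued input)
    ah, aw = arrays[0].shape
    kh, kw = kernel.shape
    fh, fw = ah + kh - 1, aw + kw - 1                    # size of the "full" output
    k_fft = fft.rfft2(kernel[::-1, ::-1], s=(fh, fw))    # computed once, reused for every array
    results = []
    for a in arrays:
        full = fft.irfft2(fft.rfft2(a, s=(fh, fw)) * k_fft, s=(fh, fw))
        results.append(full[kh - 1:ah, kw - 1:aw])       # crop to the "valid" region
    return results
The saving is the repeated kernel FFT, so the gain is largest when the kernel transform is a significant fraction of the total work.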
I am performing an SVD with numpy:
U, S, V = np.linalg.svd(A)
The shape of A is:
(10000, 10000)
Due to the large size, it gives me a memory error:
U, S, V = np.linalg.svd(A, full_matrices=False) # nargout=3
File "/usr/lib/python2.7/dist-packages/numpy/linalg/linalg.py", line 1319, in svd
work = zeros((lwork,), t)
MemoryError
How can I find the SVD of my matrix, then?
Some small tips:
- Close everything else that is open on your computer.
- Remove all unnecessary memory-hogging things in your program by setting variables you don't need anymore to None. Say you used a big dict D for some computations earlier but don't need it anymore: set D = None.
- Try initializing your numpy arrays with dtype=np.int32 or dtype=np.float32 to lower the memory requirements.
Depending on what you need the SVD for you can also have a look at the scikit-learn package for python, they have support for many decomposition methods such as PCA and SVD together with sparse matrix support.
There is a lightweight variant of the SVD called thin SVD. It is used when your base matrix is approximately low rank. Considering the dimensions of your matrix, it is highly likely to be low rank, since almost all big matrices are, according to the paper "Why Are Big Data Matrices Approximately Low Rank?". Hence, a thin SVD might solve this problem by not calculating all singular values and their singular vectors; rather, it aims at finding only the largest singular values.
To find the corresponding implementation you can look at sklearn.decomposition.TruncatedSVD.
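For example, a minimal sketch with TruncatedSVD (the matrix here is random and smaller than yours, just to keep the example self-contained; n_components=50 is an arbitrary choice):
import numpy as np
from sklearn.decomposition import TruncatedSVD

A = np.random.rand(2000, 2000).astype(np.float32)   # float32 halves the memory footprint

svd = TruncatedSVD(n_components=50)    # keep only the 50 largest singular values
US = svd.fit_transform(A)              # equals U * S, shape (2000, 50)
S = svd.singular_values_               # the 50 largest singular values
Vt = svd.components_                   # shape (50, 2000)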
I am using scipy.sparse.linalg.eigsh to solve the generalized eigen value problem for a very sparse matrix and running into memory problems. The matrix is a square matrix with 1 million rows/columns, but each row has only about 25 non-zero entries. Is there a way to solve the problem without reading the entire matrix into memory, i.e. working with only blocks of the matrix in memory at a time?
It's OK if the solution involves using a different library, in Python or in Java.
For ARPACK, you only need to code up a routine that computes certain matrix-vector products. This can be implemented any way you like, for instance by reading the matrix from disk.
from scipy.sparse.linalg import LinearOperator, eigsh

def my_matvec(x):
    # compute the matrix-vector product A @ x here, e.g. by reading
    # blocks of the matrix from disk and accumulating the partial products
    y = ...   # placeholder for your implementation
    return y

A = LinearOperator(matvec=my_matvec, shape=(1000000, 1000000))
vals, vecs = eigsh(A)
Check the scipy.sparse.linalg.eigsh documentation for what is needed in the generalized eigenvalue problem case.
The Scipy ARPACK interface exposes more or less the complete ARPACK interface, so I doubt you will gain much by switching to Fortran or some other way of accessing ARPACK.
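As a concrete illustration, here is a sketch assuming a hypothetical on-disk layout in which the matrix is stored as 100 row blocks saved with scipy.sparse.save_npz (the file names and block size are made up; only one block is in memory at a time):
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import LinearOperator, eigsh

N = 1_000_000
N_BLOCKS = 100
ROWS_PER_BLOCK = N // N_BLOCKS

def matvec_from_disk(x):
    # compute A @ x one row block at a time
    out = np.empty(N)
    for i in range(N_BLOCKS):
        block = sparse.load_npz(f"block_{i:03d}.npz")   # shape (ROWS_PER_BLOCK, N), ~25 nonzeros per row
        out[i * ROWS_PER_BLOCK:(i + 1) * ROWS_PER_BLOCK] = block @ x
    return out

A = LinearOperator((N, N), matvec=matvec_from_disk)
vals, vecs = eigsh(A, k=6)   # the six eigenpairs of largest magnitude (the default)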