I need to form a 2D matrix with total size 2,886 x 2,003,817. I tried to use numpy.zeros to make a 2D zero matrix and then calculate and assign each element (most of them are zero, so I only need to replace a few of them).
But when I try numpy.zeros to initialize my matrix, I get the following memory error:
C = numpy.zeros((2886, 2003817))
MemoryError
I also tried to form the matrix without initialization. Basically, I calculate the elements of each row in each iteration of my algorithm and then do
C = numpy.concatenate((C, [A]), axis=0)
where C is my final matrix and A is the row calculated at the current iteration. But I found that this method takes a lot of time; I am guessing it is because of numpy.concatenate(?)
Could you please let me know if there is a way to avoid the memory error when initializing my matrix, or whether there is a better method or suggestion for forming a matrix of this size?
Thanks,
Amir
If your data has a lot of zeros in it, you should use a scipy.sparse matrix.
These are special data structures designed to save memory for matrices that have a lot of zeros. However, if your matrix is not that sparse, sparse formats start to take up more memory than dense ones. There are many kinds of sparse matrices, and each of them is efficient at some operations and inefficient at others, so be careful with what you choose.
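For example, here is a minimal sketch for the matrix from the question; lil_matrix is one reasonable choice for incremental assignment, and the assigned positions and values below are just illustrative:

from scipy.sparse import lil_matrix

C = lil_matrix((2886, 2003817))   # stores only the nonzeros, so no MemoryError
C[0, 5] = 3.7                     # assign the few nonzero elements
C[100, 2000000] = -1.2
C = C.tocsr()                     # convert once for fast arithmetic afterwards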
I'm comparing, in Python, the time to read a row of a matrix, first in dense and then in sparse format.
"Extracting" a row from a dense matrix costs around 3.6e-05 seconds.
For the sparse format I tried both csr_matrix and lil_matrix, but both spent around 1e-04 seconds on the row read.
I would expect the sparse format to give the best performance; can anyone help me understand this?
arr[i,:] for a dense array produces a view, so its execution time is independent of arr.shape. If you don't understand the distinction between a view and a copy, you need to do more reading about numpy basics.
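A quick illustration of why the dense row read is so cheap (a small sketch):

import numpy as np

arr = np.arange(12).reshape(3, 4)
row = arr[1, :]        # a view: no data is copied, so the cost is O(1)
row[0] = 99
print(arr[1, 0])       # 99 -- the view shares memory with arr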
csr and lil formats allow indexing that looks a lot like ndarray's, but there are key differences. For the most part the concept of a view does not apply. There is one exception: M.getrowview(i) takes advantage of the unique data structure of lil to produce a view. (Read its doc and code.)
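A small check of that exception, relying on the documented behavior that getrowview shares the row's underlying data rather than copying it:

from scipy.sparse import lil_matrix

M = lil_matrix((3, 4))
M[1, 2] = 5.0
rv = M.getrowview(1)   # a 1x4 lil matrix sharing row 1's data lists
M[1, 2] = 9.0          # modify the parent matrix...
print(rv[0, 2])        # 9.0 -- ...and the view sees the change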
Some indexing of a csr matrix actually uses matrix multiplication, with a specially constructed 'extractor' matrix.
In all cases where sparse indexing produces a sparse matrix, actually constructing the new matrix from the data takes time. Sparse does not use compiled code nearly as much as numpy. Its strong point, relative to numpy, is matrix multiplication of matrices with 10% or fewer nonzero entries.
In the simplest format (to understand), coo, each nonzero element is represented by 3 values (data, row, col), stored in 3 1d arrays. So the matrix has to be less than about 30% nonzero for coo to even break even on memory use. coo does not implement indexing.
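The layout is easy to see in a small sketch:

import numpy as np
from scipy.sparse import coo_matrix

data = np.array([4.0, 5.0, 7.0])   # one entry per nonzero value
row  = np.array([0, 1, 2])         # its row index
col  = np.array([3, 0, 1])         # its column index
M = coo_matrix((data, (row, col)), shape=(3, 4))
M_csr = M.tocsr()                  # coo has no indexing; convert to csr first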
Let's say I have a matrix M of size 10x5 and a set of indices ix of size 4x3. I want to do tf.reduce_sum(tf.gather(M, ix), axis=1), which gives me a result of size 4x5. However, to do this it creates an intermediate gathered tensor of size 4x3x5. While at these small sizes this isn't a problem, if the sizes grow large enough I get an OOM error. Since I'm simply summing over axis 1, I never need the full intermediate tensor. So my question is: is there a way to calculate the final 4x5 matrix without going through the intermediate 4x3x5 tensor?
I think you can just multiply by a sparse matrix. I was searching for whether the two are internally equivalent when I landed on your post.
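A sketch of that idea: the gather-then-sum equals multiplying M by a sparse 4x10 selector holding a 1 at each (row, ix[row, j]). The index values below are made up, and each row of ix is assumed to have no repeats, since duplicate entries in a SparseTensor are not summed automatically:

import tensorflow as tf

M = tf.random.normal((10, 5))
ix = tf.constant([[0, 2, 4], [1, 3, 5], [6, 7, 8], [2, 5, 9]], dtype=tf.int64)

rows = tf.repeat(tf.range(4, dtype=tf.int64), 3)       # 0,0,0,1,1,1,...
cols = tf.reshape(ix, [-1])
selector = tf.sparse.SparseTensor(
    indices=tf.stack([rows, cols], axis=1),
    values=tf.ones(12),
    dense_shape=(4, 10))
selector = tf.sparse.reorder(selector)                 # ensure canonical ordering
result = tf.sparse.sparse_dense_matmul(selector, M)    # 4x5, never forms 4x3x5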
I have a matrix
x=np.mat('0.1019623; 0.1019623; 0.1019623')
and I want to find the exponential of every element and have it in a matrix of the same size. One way I found was converting to an array and proceeding. However, this won't be a solution if we have, let's say, a 2x3 matrix. Is there a general solution?
The problem was that I was using math.exp instead of np.exp.
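For reference, a minimal version of the fix; np.exp is a ufunc, so it applies elementwise to any shape and returns a matrix of the same size:

import numpy as np

x = np.mat('0.1019623; 0.1019623; 0.1019623')
y = np.exp(x)    # elementwise exp, same 3x1 shape
# math.exp(x) raises a TypeError, since it only accepts scalars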
I have a matrix M with dimensions (m, n) and I need to append new columns to it from a matrix L with dimensions (m, l). So basically I will end up with a matrix (m, n + l).
No problem in doing this, I can use:
numpy.concatenate
numpy.vstack
numpy.append
in the fashion np.command(M, L), and it will return a new matrix. The problem arises from the fact that I need to append many, many matrices to the original matrix, and the sizes of these L matrices are not known beforehand.
So I ended up with
# M is my original matrix
while True:
    # find out my L matrix
    M = np.append(M, L, axis=1)   # axis=1 appends L's columns to M
    # check if I do not need to append the matrix any more, and break if so
Knowing that my matrix M has approximately 100k rows and I add on average 5k columns, the process is super slow and takes more than a couple of hours (I don't know exactly how long, because I gave up after 2 hours).
The problem here is clearly in this append call (I tried it with vstack and nothing changed). Also, if I just calculate the matrices L without appending them, the task takes less than 10 minutes. I assume the reassignment of the matrix is what makes it slow. Intuitively that makes sense, because I am constantly recreating the matrix M and discarding the old one. But I do not know how to get rid of the reassignment.
One idea is that creating an empty matrix beforehand and then populating it with the correct columns should be faster, but the problem is that I do not know what dimensions to create it with (there is no way to predict the number of columns in my matrix).
So how can I improve performance here?
There's no way to append to an existing numpy array without creating a copy.
The reason is that a numpy array must be backed by a contiguous block of memory. If I create a (1000, 10) array, then decide that I want to append another row, I'd need to be able to extend the chunk of RAM corresponding to the array so that it's big enough to accommodate (1001, 10) elements. In the general case this is impossible, since the adjacent memory addresses may already be allocated to other objects.
The only way to 'concatenate' arrays is to get the OS to allocate another chunk of memory big enough for the new array, then copy the contents of the original array and the new row into this space. This is obviously very inefficient if you're doing it repeatedly in a loop, especially since the copying step becomes more and more expensive as your array gets larger and larger.
Here are two possible work-arounds:
Use a standard Python list to accumulate your rows inside your while loop, then convert the list to an array in a single step, outside the loop. Appending to a Python list is very cheap compared with concatenating numpy arrays, since a list is just an array of pointers which don't necessarily have to reference adjacent memory addresses, and therefore no copying is required.
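A sketch of the list approach; the loop body and row computation below are made-up stand-ins for the real work:

import numpy as np

rows = []
for i in range(5000):                    # stand-in for the real while loop
    rows.append(np.random.rand(100))     # stand-in for the computed row
M = np.array(rows)                       # single allocation and copy, at the end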
Take an educated guess at the number of rows in your final array, then allocate a numpy array that's slightly bigger and fill in the rows as you go. If you run out of space, concatenate on another chunk of rows. Obviously the concatenation step is expensive, since you'll need to make a copy, but you're much better off doing this once or twice than on every iteration of your loop. When choosing the initial number of rows there is a trade-off between over-allocating memory and performing unnecessary concatenation steps. Once you're done, you can 'trim off' any unused rows using slice indexing.
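A sketch of the pre-allocation approach, again with made-up stand-ins for the loop and the row computation:

import numpy as np

n_cols = 100                     # hypothetical row width
capacity = 4096                  # educated guess at the final row count
out = np.empty((capacity, n_cols))
n = 0
for i in range(5000):            # stand-in for the real while loop
    if n == capacity:            # out of space: grow by another chunk (one copy)
        out = np.concatenate([out, np.empty((capacity, n_cols))])
        capacity *= 2
    out[n] = np.random.rand(n_cols)   # stand-in for the computed row
    n += 1
out = out[:n]                    # trim off the unused rows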
I am working on an FEM project using Scipy. My problem is that the assembly of the sparse matrices is too slow. I compute the contribution of every element in small dense matrices (one for each element). For the assembly of the global matrices I loop over all the small dense matrices and set the matrix entries the following way:
i, j = someList[k][l]
Mglobal[i, j] = Mglobal[i, j] + Mlocal[k, l]
Mglobal is a lil_matrix of appropriate size; someList maps the indexing variables.
Of course this is rather slow and consumes most of the matrix assembly time. Is there a better way to assemble a large sparse matrix from many small dense matrices? I tried scipy.weave, but it doesn't seem to work with sparse matrices.
I posted my response to the scipy mailing list; Stack Overflow is a bit easier to access, so I will post it here as well, albeit in a slightly improved version.
The trick is to use the IJV storage format. This is a trio of three arrays where the first contains the row indices, the second the column indices, and the third the values of the matrix at those locations. This is the best way to build finite element matrices (or any sparse matrix, in my opinion), as access to this format is really fast (just filling in an array).
In scipy this is called coo_matrix; the class takes the three arrays as an argument. It is really only useful for converting to another format (CSR or CSC) for fast linear algebra.
For finite elements, you can estimate the size of the three arrays with
size = number_of_elements * number_of_basis_functions**2
so if you have 2D quadratics (6 basis functions per element, hence 36 entries per local matrix) you would use number_of_elements * 36, for example.
This approach is convenient because if you have the local matrices you definitely have the global numbers and entry values: exactly what you need for building the three IJV arrays. Scipy is smart enough to throw out zero entries, so overestimating is fine.
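A sketch of this assembly; it assumes a hypothetical list called elements whose entries are (dofs, Mlocal) pairs, i.e. an element's global DOF numbers and its dense local matrix, and a hypothetical ndof for the global matrix size:

import numpy as np
from scipy.sparse import coo_matrix

nb = 6                                  # basis functions per element (2D quadratic)
size = len(elements) * nb**2
I = np.empty(size, dtype=np.int32)
J = np.empty(size, dtype=np.int32)
V = np.empty(size)
k = 0
for dofs, Mlocal in elements:
    for a in range(nb):
        for b in range(nb):
            I[k] = dofs[a]
            J[k] = dofs[b]
            V[k] = Mlocal[a, b]
            k += 1
# coo_matrix sums duplicate (i, j) pairs on conversion, which is exactly
# the accumulation the element loop in the question performs by hand.
Mglobal = coo_matrix((V, (I, J)), shape=(ndof, ndof)).tocsr()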