Speed up multiple matrix products with numpy - python

In python I have 2 three dimensional arrays:
T with size (n,n,n)
U with size (k,n,n)
T and U can be seen as many 2-D arrays stacked next to each other. I need to multiply all those matrices, i.e. I have to perform the following operation:
for i in range(n):
    H[:,:,i] = U[:,:,i].dot(T[:,:,i]).dot(U[:,:,i].T)
As n might be very big, I am wondering if this operation could somehow be sped up with numpy.

Carefully looking into the iterators and how they are involved in those dot product reductions, we could translate all of those into one np.einsum implementation like so -
H = np.einsum('ijk,jlk,mlk->imk',U,T,U)
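A quick way to convince yourself that the einsum string matches the loop is to run both on small random arrays. This is only a sanity-check sketch: the sizes k and n and the random inputs are made up for illustration, and H is assumed to have shape (k, k, n).

```python
import numpy as np

rng = np.random.default_rng(0)
k, n = 3, 4                          # small, arbitrary sizes for the check
U = rng.standard_normal((k, n, n))
T = rng.standard_normal((n, n, n))

# Reference result: the explicit loop from the question.
H_loop = np.empty((k, k, n))
for i in range(n):
    H_loop[:, :, i] = U[:, :, i].dot(T[:, :, i]).dot(U[:, :, i].T)

# Single einsum call: j and l are summed over, k is the shared "stack" axis.
H_einsum = np.einsum('ijk,jlk,mlk->imk', U, T, U)

print(np.allclose(H_loop, H_einsum))
```

Whether einsum is actually faster than the loop depends on n and k, so it is worth timing both on realistic sizes.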

Related

Combining sparse and einsum to perform large sparse sum

I have a matrix A with shape=(N, N) and a matrix B with the same shape=(N, N).
I am constructing a matrix M using the following einsum (using the opt_einsum library):
M = oe.contract('nm,in,jm,pn,qm->ijpq', A, B, B, B, B)
This is evaluating the following sum:
M_{ijpq} = \sum_{n,m} A_{nm} B_{in} B_{jm} B_{pn} B_{qm}
This yields matrix M with shape (N, N, N, N). I then reshape this to a 2D array of shape (N**2, N**2)
M = M.reshape((N**2, N**2))
This must be 2D as it is treated as a linear operator.
I want to use the sparse library, as M is sparse, and becomes too large to store for large N. I can make A and B sparse, and insert them into the oe.contract.
The problem is, sparse only supports 2D arrays and so fails to produce the 4D output of shape (N, N, N, N). Is there a way to combine the einsum and reshape steps to allow sparse to be used in this way, as the final shape of M is 2D?
This may not help with your use of opt_einsum, but with a bit of reorganizing I can speed up np.einsum quite a bit, at least for small arrays.
Do a partial product of two B:
c1 = np.einsum('in,jm->ijnm',B,B).reshape(N*N,N,N)
The pq pair is the same, so we don't need to recalculate it:
c2 = np.einsum('nm,onm,pnm->op',A,c1,c1)
I verified that this works for two (3,3) arrays, and the speed up is about 10x.
We can even reshape the nm to 1d, though this doesn't improve speed:
c1 = np.einsum('in,jm->ijnm',B,B).reshape(N*N,N*N)
c3 = np.einsum('n,on,pn->op',A.reshape(N*N),c1,c1)
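For what it's worth, here is a small check that the reorganized contractions give the same (N**2, N**2) matrix as the original five-operand einsum. It uses plain np.einsum on dense random arrays, with N = 3 chosen arbitrarily. The last line (c4) is not from the answer above; it is just an equivalent way of writing c3 as a matrix product.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 3                                # arbitrary small size for the check
A = rng.standard_normal((N, N))
B = rng.standard_normal((N, N))

# Direct contraction from the question, reshaped to 2-D.
M_direct = np.einsum('nm,in,jm,pn,qm->ijpq', A, B, B, B, B).reshape(N**2, N**2)

# Partial product of two B's, with the (i, j) pair flattened into one axis.
c1 = np.einsum('in,jm->ijnm', B, B).reshape(N*N, N, N)
c2 = np.einsum('nm,onm,pnm->op', A, c1, c1)

# Same thing with the (n, m) pair flattened as well.
c1_flat = c1.reshape(N*N, N*N)
c3 = np.einsum('n,on,pn->op', A.reshape(N*N), c1_flat, c1_flat)

# Equivalent matrix-product form (an addition, not part of the answer):
# scale the columns of c1_flat by A, then multiply by c1_flat transposed.
c4 = (c1_flat * A.reshape(N*N)) @ c1_flat.T

print(np.allclose(M_direct, c2), np.allclose(M_direct, c3), np.allclose(M_direct, c4))
```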
I did not correctly interpret the error given by opt_einsum.
The problem is not that sparse does not support ND sparse arrays (it does!), but that I was not using a true einsum, as the indices summed over (n and m) each appear more than twice. As stated in the opt_einsum documentation, this results in the use of a sparse.einsum function, which does not exist. Using each index only once or twice works. A different contraction path, for example the one suggested by hpaulj, can be used to solve the problem.

whether to use numpy's dot or matmul function

I need to do the following two operations:
solve Ax=b by inverting the n-by-n matrix A, and
solve r=Ar using power iteration (i.e. by repeatedly multiplying the current vector r by A) such as one would do for the PageRank algorithm.
My question is: When computing the matrix-vector product A^{-1}b or the matrix-vector product Ar, is it better to use numpy.dot or numpy.matmul? (I understand there might be differences in higher dimensions, but my question is only for the case where A is a 2D array and b, r are vectors.)
From the numpy doc for np.dot:
Dot product of two arrays. Specifically, if both a and b are 2-D arrays, it is matrix multiplication, but using matmul or a @ b is preferred.
So basically for your case, it does not matter, although matmul is preferred according to the doc.
Also since one of your arrays is 1-D, from docs for np.matmul:
If the second argument is 1-D, it is promoted to a matrix by appending
a 1 to its dimensions. After matrix multiplication the appended 1 is
removed.
And:
matmul differs from dot in two important ways:
Multiplication by scalars is not allowed, use * instead.
Stacks of matrices are broadcast together as if the matrices were elements, respecting the signature
Therefore, they would work the same in your case, but I would go with numpy doc's recommendation on using matmul.
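To illustrate the 2-D times 1-D case, here is a small sketch with a random matrix and vector (the size n = 5 and the random data are arbitrary). All three spellings return a 1-D array of length n with the same values.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5                                # arbitrary size for the illustration
A = rng.standard_normal((n, n))
r = rng.standard_normal(n)

y_dot = np.dot(A, r)                 # matrix-vector product via dot
y_matmul = np.matmul(A, r)           # same product via matmul
y_at = A @ r                         # the @ operator calls matmul

print(y_dot.shape, np.allclose(y_dot, y_matmul), np.allclose(y_dot, y_at))
```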

best way to store numbers in a multidimensional (sparse) array in python

What is the best container object for a calculation in N dimensions, when the problem is symmetric so that only some numbers need to be calculated?
Concretely, for N=4 I have:
import numpy as np
M = 50
results = np.zeros((M, M, M, M))
for ii in range(M):
    for jj in range(ii, M):
        for kk in range(jj, M):
            for ll in range(kk, M):
                res = 1  # really some calculation
                results[ii, jj, kk, ll] = res
Many elements in this array are completely redundant and aren't even accessed. This is even more true for higher N (I'd like to go up to N=10 or ideally N=15).
Is it better to use lists and append in each step for such a problem, or a dictionary, or sparse matrices? I tried a sparse matrix, but it keeps warning me that I shouldn't frequently change elements in a sparse matrix, so presumably this is not a good idea.
The only functionality that I'd need to retain is finding maxima (ideally along each dimension).
Any insights would be appreciated!
The "density" of the array will be roughly 1 / D! (D factorial) when M is much larger than D, where D is the number of dimensions, because only index tuples with ii <= jj <= kk <= ... are ever filled. So the payoff in space grows at least exponentially with D, while the performance penalty compared to lists or dense arrays is constant.
So, when the number of dimensions is high, sparse matrices will provide a HUGE advantage in space used, and they're still faster than just lists. If the number of dimensions is small, dense matrices will be slightly bigger but also only slightly faster (slightly here: a few times faster, but since the total execution time is small, the absolute difference is still small).
Overall, unless the number of dimensions is fixed, it makes more sense to stick with sparse matrices. However, if D is fixed, it's better to just benchmark for this specific case.
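As a rough sketch of the dictionary option mentioned in the question, one could store only the non-decreasing index tuples and look up the maximum directly. This is an illustration under assumptions, not a benchmark: the sizes M = 6 and D = 4 are deliberately tiny, and compute() is a hypothetical stand-in for the real calculation.

```python
import itertools
import math

M, D = 6, 4                          # tiny illustrative sizes, not the real problem

def compute(idx):
    # Hypothetical stand-in for the real calculation.
    return sum(idx)

# combinations_with_replacement yields exactly the tuples with
# idx[0] <= idx[1] <= ... <= idx[D-1], i.e. the entries the nested loops visit.
results = {idx: compute(idx)
           for idx in itertools.combinations_with_replacement(range(M), D)}

n_stored = len(results)              # equals C(M + D - 1, D)
density = n_stored / M**D            # fraction of the dense array actually used

# Global maximum and where it occurs.
best_idx = max(results, key=results.get)
best_val = results[best_idx]

print(n_stored, math.comb(M + D - 1, D), density, best_idx, best_val)
```

A dictionary keyed by index tuples avoids the repeated-assignment warnings from sparse matrix formats, at the cost of per-entry Python overhead.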

Fastest way to generate and sum arrays

I am generating a series of Gaussian arrays given an x vector of length 1400, and arrays for the sigma, center, and amplitude (amp), each of length 100. I thought the best way to speed this up would be to use numpy and a list comprehension:
g = np.sum([(amp[i]*np.exp(-0.5*(x - (center[i]))**2/(sigma[i])**2)) for i in range(len(center))],axis=0)
Each row is a gaussian along a vector x, and then I sum the columns into a single array of length x.
But this doesn't seem to speed things up at all. I think there is a faster way to do this while avoiding the for loop but I can't quite figure out how.
You should use vectorized computation instead of a comprehension so that the loops are all performed at C speed.
In order to do so you have to reshape x to be a column vector. For example you could do x = x.reshape((1400,1)).
Then you can operate directly on the arrays, like this:
v = amp*np.exp(-0.5*(x - center)**2/sigma**2)
Then you obtain an array of shape (1400,100), which you can sum into a vector of length 1400 with np.sum(v, axis=1).
You should try to vectorize all the operations. IMHO the most efficient approach is to first convert your input data to numpy arrays (if they were plain Python lists) and then let numpy do the computation, broadcasting x as a column against the 100 parameters:
np_amp = np.array(amp)
np_center = np.array(center)
np_sigma = np.array(sigma)
g = np.sum(np_amp*np.exp(-0.5*(np.asarray(x)[:, None] - np_center)**2/np_sigma**2), axis=1)
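Here is a self-contained sketch comparing the list comprehension from the question with a broadcast version. The input data is random and made up purely for the comparison; only the lengths (1400 and 100) come from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 1400)             # made-up x grid
center = rng.uniform(0.0, 10.0, 100)         # made-up Gaussian parameters
sigma = rng.uniform(0.1, 1.0, 100)
amp = rng.uniform(0.5, 2.0, 100)

# Original approach: one Gaussian per list element, summed over the list.
g_loop = np.sum([amp[i]*np.exp(-0.5*(x - center[i])**2/sigma[i]**2)
                 for i in range(len(center))], axis=0)

# Broadcast approach: x as a column (1400, 1) against parameters of shape (100,)
# gives a (1400, 100) array; summing over axis 1 adds up the 100 Gaussians.
v = amp*np.exp(-0.5*(x[:, None] - center)**2/sigma**2)
g_vec = v.sum(axis=1)

print(g_vec.shape, np.allclose(g_loop, g_vec))
```

The broadcast version allocates the full (1400, 100) intermediate array, which is fine at this size but worth keeping in mind if the arrays grow much larger.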

python vector * vector------> matrix

In the python computer graphics kit, there is a vec3 type for the representation of three-component vectors, but how can I do the following multiplication:
A three-component vector multiplied by its transpose results in a 3*3 matrix, as in the following example:
a = vec3(1,1,1)
matrix_m = a * a.transpose()
Does anyone know of a library that can multiply a matrix of dimension 3*1 by another one of dimension 1*3 and produce a 3*3 matrix?
Sorry, I have to clarify a bit more about this. I am talking about matrix math.
It is like:
[a0, a1, a2]*[a0, a1, a2]T = [a0*a0, a0*a1, a0*a2; a1*a0, a1*a1, a1*a2;a2*a0, a2*a1, a2*a2]
Maybe I can try write a function myself, it is so straightforward.....
Some vector math software, such as MATLAB, happily keeps track of column vectors and row vectors as separate types of things. Python's NumPy doesn't, but does offer numpy.outer(A,B). Unfortunately, the Graphics Kit (I assume you refer to http://cgkit.sourceforge.net/) doesn't track rows vs columns, use numpy (which would be huge overkill), or provide a vector x vector --> matrix outer product. It looks like you'll have to write your own function to do that.
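A hand-written version could look roughly like the sketch below. The helper name outer3 is made up, and it works on any sequence of numbers; it assumes the components of a vec3 can be read out as such a sequence, which is not checked against the Graphics Kit here. The numpy.outer call is only there to cross-check the result.

```python
import numpy as np

def outer3(a, b):
    """Outer product of two number sequences, returned as nested lists.

    Hypothetical helper: entry [i][j] of the result is a[i] * b[j].
    """
    return [[ai * bj for bj in b] for ai in a]

a = (1.0, 2.0, 3.0)                  # stand-in for the components of a vec3
m = outer3(a, a)

for row in m:
    print(row)

# Cross-check against NumPy's built-in outer product.
print(np.allclose(m, np.outer(a, a)))
```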
