Does anybody know a fast way to compute matrices such as:
$$ Z_{i,j} = \sum_{p,k,l,q} \frac{A_{ip} B_{pk} C_{kl} D_{lq} E_{qj}}{a_p - b_q - c} $$
For ordinary matrix multiplication I would use numpy.dot(a, b), but here each term also has to be divided by $a_p - b_q - c$.
Any suggestions?
Any suggestions on how to compute
$$ C_{i,j} = \sum_p \frac{E_{i,p} B_{p,j}}{m_p} $$
will be of great help as well.
Note that (E[i, p] * B[p, j]) / m[p] is equal to E[i, p] * (B[p, j] / m[p]), so you can simply divide m into B before calling np.dot.
import numpy as np

def f(E, B, m):
    B = np.asarray(B)                           # 2-D matrix
    m = np.asarray(m).reshape((B.shape[0], 1))  # column vector, one entry per row of B
    return np.dot(E, B / m)                     # m is broadcast across the columns of B
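The same trick extends to the original five-matrix expression if you build the denominator as a 2-D array first. A minimal sketch, assuming A, B, C, D, E are conformable 2-D arrays, a and b are 1-D arrays, and c is a scalar (names taken from the formula above; the helper name is illustrative):

import numpy as np

def compute_Z(A, B, C, D, E, a, b, c):
    F = B @ C @ D                        # F[p, q] = sum_{k,l} B[p,k] C[k,l] D[l,q]
    denom = a[:, None] - b[None, :] - c  # denom[p, q] = a_p - b_q - c, via broadcasting
    return A @ (F / denom) @ E           # Z[i, j] = sum_{p,q} A[i,p] F[p,q] E[q,j] / denom[p,q]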
A related question:
I have an n x n matrix W, an n-dimensional vector U, and a scalar p > 1. I want to compute W * (np.abs(U[:,np.newaxis] - U) ** p), where *, ** and abs are understood element-wise (as in numpy).
Now the problem is that U[:,np.newaxis] - U does not fit into memory. However, W is a sparse matrix (scipy.sparse), so I don't actually have to compute all entries of U[:,np.newaxis] - U, but only those where W is not zero.
How can I compute W * (np.abs(U[:,np.newaxis] - U) ** p) most efficiently in terms of computation time and memory, ideally doing only sparse operations, without a step through numpy?
To make use of the sparseness you could apply the distributive law so
C = W * (np.abs(U[:,np.newaxis] - U) ** p)
would result in
rtW = W ** (1 / p)
C = np.abs(rtW * U[:, np.newaxis] - rtW * U) ** p
Note that this only makes sense as long as the entries of W aren't negative, but we can remedy that by using:
rtW = np.abs(W)**(1/p)
signW = np.sign(W)
C = signW * np.abs(rtW * U[:, np.newaxis] - rtW * U) ** p
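If W is a scipy.sparse matrix (as in the question), bear in mind that * and ** on sparse matrices mean matrix product and matrix power; the element-wise methods multiply() and power() express the same idea while staying sparse. A sketch, assuming W is non-negative (otherwise combine it with the sign trick above; the helper name is illustrative):

import numpy as np

def weighted_abs_power(W, U, p):
    # W: non-negative scipy.sparse matrix, U: 1-D numpy array, p > 1
    rtW = W.power(1.0 / p)                                   # entrywise p-th root, stays sparse
    D = rtW.multiply(U[:, None]) - rtW.multiply(U[None, :])  # rtW_ij * (U_i - U_j), sparse
    return abs(D).power(p)                                   # W_ij * |U_i - U_j| ** p on W's nonzeros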
I figured it out: we can do the desired computation directly on the nonzero entries, using W.nonzero().
Assuming W is a csc_matrix:
import numpy as np
from scipy import sparse

nonzero = W.nonzero()                                  # (row, col) indices where W != 0
values = np.abs(U[nonzero[0]] - U[nonzero[1]]) ** p    # |U_i - U_j| ** p only at those positions
A = sparse.csc_matrix((values, nonzero), shape=(U.size, U.size))
C = W.multiply(A)                                      # element-wise product, stays sparse
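A quick sanity check of this against the dense formula, on a problem small enough that the dense intermediate fits in memory (the sizes here are just illustrative):

import numpy as np
from scipy import sparse

N, p = 200, 2.5
U = np.random.rand(N)
W = sparse.random(N, N, density=0.01, format='csc')

nonzero = W.nonzero()
values = np.abs(U[nonzero[0]] - U[nonzero[1]]) ** p
C_sparse = W.multiply(sparse.csc_matrix((values, nonzero), shape=(N, N)))

C_dense = W.toarray() * np.abs(U[:, None] - U) ** p
print(np.allclose(C_sparse.toarray(), C_dense))   # True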
I have two dense matrices A and B, each of size 3e5 x 100, and a sparse binary matrix C of size 3e5 x 3e5. I want to compute C ∘ (AB'), where ∘ is the Hadamard (element-wise) product and B' is the transpose of B. Explicitly forming AB' would require an enormous amount of memory (~500 GB). Since the end result doesn't need all of AB', it is enough to compute A_i B_j' only where C_ij != 0, where A_i is row i of A and C_ij is the element at location (i, j) of C. A suggested approach would be the algorithm below:
result = sparse.lil_matrix(C.shape)       # sparse output with the same shape as C
for i, j in zip(*C.nonzero()):            # visit only the nonzero positions of C
    result[i, j] = np.dot(A[i], B[j])     # A_i . B_j'
This algorithm, however, takes too much time. Is there any way to improve it using LAPACK/BLAS routines? I am coding in Python, so I think numpy can serve as a more human-friendly wrapper for LAPACK/BLAS.
You can do this computation using the following, assuming C is stored as a scipy.sparse matrix:
C = C.tocoo()
result_data = C.data * (A[C.row] * B[C.col]).sum(1)
result = sparse.coo_matrix((result_data, (C.row, C.col)), shape=C.shape)
Here we show that the result matches the naive algorithm for some smaller inputs:
import numpy as np
from scipy import sparse
N = 300
M = 10
def make_C(N, nnz=1000):
    data = np.random.rand(nnz)
    row = np.random.randint(0, N, nnz)
    col = np.random.randint(0, N, nnz)
    return sparse.coo_matrix((data, (row, col)), shape=(N, N))
A = np.random.rand(N, M)
B = np.random.rand(N, M)
C = make_C(N)
def f_naive(C, A, B):
    return C.multiply(np.dot(A, B.T))
def f_efficient(C, A, B):
    C = C.tocoo()
    result_data = C.data * (A[C.row] * B[C.col]).sum(1)
    return sparse.coo_matrix((result_data, (C.row, C.col)), shape=C.shape)
np.allclose(
    f_naive(C, A, B).toarray(),
    f_efficient(C, A, B).toarray()
)
# True
And here we see that it works for the full input size:
N = 300000
M = 100
A = np.random.rand(N, M)
B = np.random.rand(N, M)
C = make_C(N)
out = f_efficient(C, A, B)
print(out.shape)
# (300000, 300000)
print(out.nnz)
# 1000
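As an aside, the per-row dot products can equivalently be written with np.einsum, which skips the intermediate element-wise product array (the gathered copies A[C.row] and B[C.col] are still formed, and whether this is faster depends on the sizes involved; the helper name is illustrative):

def f_einsum(C, A, B):
    C = C.tocoo()
    # einsum computes the row-wise dot products A[i] . B[j] for each stored (i, j)
    result_data = C.data * np.einsum('ij,ij->i', A[C.row], B[C.col])
    return sparse.coo_matrix((result_data, (C.row, C.col)), shape=C.shape)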
I have a complex matrix C with dimensions (r, r) as well as a complex vector v of size r. I need to compute a new matrix from C and v following this equation:
K_{m,n} = \sum_{i=1}^{r} Im( C_{i,m} * conj(C_{i,n}) * sgn(Im(v_i)) )
where K is also a square matrix of dimensions (r, r). Here is the code to compute K with three loops:
import numpy as np
import matplotlib.pyplot as plt
r = 9
# Create random matrix
C = np.random.rand(r,r) + np.random.rand(r,r) * 1j
v = np.random.rand(r) + np.random.rand(r) * 1j
# Original loops
K = np.zeros((r, r))
for m in range(r):
    for n in range(r):
        for i in range(r):
            K[m, n] += np.imag(C[i, m] * np.conj(C[i, n]) * np.sign(np.imag(v[i])))
plt.figure()
plt.imshow(K)
plt.show()
Removing the loop over i is relatively easy:
# First optimization
K = np.zeros((r, r))
for m in range(r):
    for n in range(r):
        K[m, n] = np.imag(np.sum(C[:, m] * np.conj(C[:, n]) * np.sign(np.imag(v))))
but I am not sure how to proceed to vectorize the two remaining loops. Is it actually possible in this case?
I have had a lot of problems like this, and here is how I usually proceed to find a vectorized formulation.
Here is what I noticed about your summation. The nice conclusion is that you probably do not need explicit vectorization at all, as you can express the whole calculation as a single product of 2D matrices. Here goes...
Let's first define the following matrices (sorry for the lack of LaTeX notation, Stack Overflow does not support MathJax):
A_{i,j} = c_{i,j}.
B_{i,j} = c_{i,j} * sgn(Im(v_i))
Then you can write your summation as:
k_{m,n} = Im( \sum_{i=1}^{r} c_{i,m} * sgn(Im(v_i)) * c_{i,n}^* ) = Im ( \sum_{i=1}^{r} B_{i,m} * A_{i,n}^* ) = Im( \sum_{i=1}^{r} B_{m,i}^T * A_{i,n}^* )
The expression inside Im(.) above is, by the definition of matrix multiplication, equivalent to the following:
k_{m,n} = Im( (B^T * A^*)_{m,n} )
This means that your matrix k can be expressed as the product of the transpose of B and the conjugate of A. In your code, the matrix A is already assigned to the variable C, so the vectorization can be done as follows:
C = np.random.rand(r,r) + np.random.rand(r,r) * 1j
v = np.random.rand(r) + np.random.rand(r) * 1j
k = np.imag((C * np.sign(np.imag(v))[:, None]).T @ np.conj(C))
And you have avoided both the nasty loops and the convoluted expressions.
This looks like matrix multiplication:
out = np.imag((C * np.sign(np.imag(v))[:, None]).T @ np.conj(C))
Or you can use np.einsum:
out = np.imag(np.einsum('im,in,i', C, np.conj(C), np.sign(np.imag(v))))
Verification with your approach:
np.all(np.abs(out-K) < 1e-6)
# True
I found something that can work for now. However, one loop remains, and since the resulting matrix is antisymmetric, there is still some optimization to be made.
Instead of removing the i loop, I removed the two other ones:
K = np.zeros((r, r), dtype=np.complex128)
for i in range(r):
    Ci = C[i:i+1, :]                                                 # row i of C as a (1, r) matrix
    K += np.sign(np.imag(v[i])) * (adjointMatrix(Ci) @ Ci).conj()    # C[i,m] * conj(C[i,n]) * sgn(Im(v[i]))
K = np.imag(K)
with:
def adjointMatrix(X):
    return np.conjugate(np.transpose(X))
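For completeness, the remaining loop over i can also be dropped: the whole accumulation is the matrix product already identified in the other answers. A sketch, equivalent to the triple-loop result:

s = np.sign(np.imag(v))
K = np.imag(C.T @ (s[:, None] * np.conj(C)))   # K[m, n] = sum_i Im(C[i, m] * conj(C[i, n]) * s[i])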
Given two sparse column vectors x, y : scipy.sparse.csc_matrix, where len(x) == len(y) == N and max(x.nnz, y.nnz) == M, and a symmetric N × N matrix A : scipy.sparse.csc_matrix where every column j satisfies A[:, j].nnz == C, I need to compute x.T * A * y = ∑ᵢ ∑ⱼ x[i] * A[i, j] * y[j] efficiently, in at most M * max(M, C) steps. This can be achieved as follows (a direct rendering of the loops is sketched after the list):
in the outer loop, we iterate over the y.nnz columns j of A,
in the inner loop, we iterate over either:
the x.nnz rows i of A, if x.nnz < C, or
the C rows i of A, otherwise.
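A direct rendering of that loop structure in plain Python over the csc index arrays might look like the sketch below (the function name and details are illustrative; this version always walks the C entries of column j rather than picking the smaller of x.nnz and C):

def quadratic_form_loops(x, A, y):
    # x, y: (N, 1) csc column vectors; A: (N, N) csc matrix
    x_lookup = dict(zip(x.indices, x.data))        # nonzeros of x, for O(1) lookups
    total = 0.0
    for j, yj in zip(y.indices, y.data):           # outer loop: nonzeros of y
        start, end = A.indptr[j], A.indptr[j + 1]  # column j of A in csc storage
        for i, aij in zip(A.indices[start:end], A.data[start:end]):
            if i in x_lookup:                      # inner loop: nonzeros of column j
                total += x_lookup[i] * aij * yj
    return total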
My question is whether this can be achieved using high-level Python and existing libraries (and if so, then how), or whether this requires custom C / C++ code.
The following naive Python code using the scipy library:
(x.T).dot(A).dot(y)[0, 0]
computes separately:
x.T * A using up to x.nnz * N steps, and
∑ⱼ (x.T * A)[j] * y[j] using up to M steps.
This takes O(M * N) steps in total, which is a major slow-down for large N.
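One way to avoid the O(M * N) behaviour with existing scipy primitives is to slice out only the columns of A selected by the nonzeros of y before multiplying; slicing columns of a csc matrix copies just those columns. A sketch under that assumption (the function name is illustrative; aside from some O(N) bookkeeping inside scipy, the work is proportional to the copied entries):

import numpy as np
from scipy import sparse

def quadratic_form_sliced(x, A, y):
    # x, y: (N, 1) csc column vectors; A: (N, N) csc matrix
    cols = A[:, y.indices]                      # (N, y.nnz) csc: ~ y.nnz * C copied entries
    xt_cols = (x.T @ cols).toarray().ravel()    # (x.T A)[j] for the selected columns j
    return float(xt_cols @ y.data)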
I can easily calculate something like:
R = np.column_stack([A, np.ones(len(A))])
M = np.dot(R, [k, m0])
where A is a simple array and k,m0 are known values.
I want something different. Having fixed R, M and k, I need to obtain m0.
Is there a way to calculate this by an inverse of the function numpy.dot()?
Or is it only possible by rearranging the matrices?
M = np.dot(R, [k, m0])
is performing matrix multiplication: M = R @ x, with x = [k, m0].
So to compute the inverse, you could use np.linalg.lstsq(R, M):
import numpy as np
A = np.random.random(5)
R = np.column_stack([A,np.ones(len(A))])
k = np.random.random()
m0 = np.random.random()
M = R.dot([k,m0])
(k_inferred, m0_inferred), residuals, rank, s = np.linalg.lstsq(R, M)
assert np.allclose(m0, m0_inferred)
assert np.allclose(k, k_inferred)
Note that both k and m0 are determined, given M and R (assuming len(M) >= 2).
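And since k is stated to be known, m0 can also be read off directly without a least-squares solve: M - k*A is the constant vector m0 * ones. Continuing from the variables above:

m0_direct = np.mean(M - k * A)    # every entry of M - k*A equals m0
assert np.allclose(m0, m0_direct)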