Given a large sparse banded matrix A (there are 6 diagonals, not clearly shown here) and a vector f, I would like to solve for Z, where AZ = f.
A has M rows and N columns, with one more row than columns (M = N + 1), hence the system is over-determined. Here is the source MATLAB code, which I would like to convert to its SciPy equivalent.
Matlab
A = A(:,2:end); % drop the first column
f = f(:);
Z = A\f;
Z = [0;-Z];
Z = reshape(Z,H,W);
Z = Z - min(Z(:));
My SciPy attempt is below, but solving for Z with scipy.sparse.linalg's lsqr and lsmr is a lot slower than MATLAB's \ and does not give a good enough solution. A is created as a csr_matrix.
Python
import numpy as np
import scipy.sparse.linalg as la

A = A[:, 1:]                      # drop the first column
f = f.flatten(order='F')          # column-major flatten, like MATLAB's f(:)
Z = la.lsqr(A, f, atol=1e-6, btol=1e-6)
# Z = la.lsmr(A, f)               # the other method I used
Z = Z[0]
Z = np.append([0], np.negative(Z))
Z = np.reshape(Z, (height, width), order='F').copy()
Z = Z - Z.min()
Could anyone recommend a better alternative for solving Z that is as effective and fast as MATLAB's \ ?
This looks like a good candidate for solve_banded.
Unfortunately, the interface for providing the banded matrix is a little complex. You could start by converting your sparse matrix to DIA format, and work from there.
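A minimal sketch of that conversion, with the caveat that solve_banded only handles square systems, so for the one-extra-row least-squares problem here you would first make the system square (e.g. by forming the still-banded normal equations A.T @ A Z = A.T @ f). The band layout expected by solve_banded can then be filled row by row from the DIA diagonals:
import numpy as np
from scipy.linalg import solve_banded

A_dia = A.todia()                    # A assumed square and sparse here
u = int(A_dia.offsets.max())         # number of superdiagonals
l = int(-A_dia.offsets.min())        # number of subdiagonals
ab = np.zeros((l + u + 1, A.shape[1]))
for diag, k in zip(A_dia.data, A_dia.offsets):
    ab[u - k, :] = diag              # diagonal with offset k lands in band row u - k
Z = solve_banded((l, u), ab, f)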
Let a be a big scipy.sparse matrix and IJ={(i0,j0),(i1,j1),...} a set of positions. How can I efficiently set all the entries in a in positions IJ to 0? Something like a[IJ]=0.
In Mathematica, I would create a new sparse matrix b with background value 1 (instead of 0) and all entries in IJ. Then, I would use a=a*b (entry-wise multiplication). That does not seem to be an option here.
A toy example:
import scipy.sparse as sp
import numpy as np

np.set_printoptions(linewidth=200, edgeitems=5, precision=4)
m = n = 10
a = sp.random(m, n, 4/m, format='csr')
print(a.toarray())
IJ = np.array([range(0, n, 2), range(0, n, 2)])  # every second diagonal entry
print(IJ)
You are almost there. To go by your definitions, all you'd need to do is:
a[IJ[0],IJ[1]] = 0
Note that scipy will warn you:
SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
You can read more about that here.
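If that warning matters (e.g. you do many such updates), one pattern suggested by the warning itself is to switch formats around the assignment; a small sketch:
a = a.tolil()            # LIL allows cheap changes to the sparsity structure
a[IJ[0], IJ[1]] = 0
a = a.tocsr()            # back to CSR for fast arithmetic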
The scipy sparse matrices can't have a non-zero background value. While it is possible to build a "sparse" matrix with lots of non-zero values, the performance (speed and memory) would be far worse than for a dense matrix.
A possible work-around is to rewrite every such matrix so that its default value is zero. For example, if a matrix Y' consists mostly of ones, I can replace it by J - Y, where Y = J - Y' is sparse and J is the all-ones matrix (for entry-wise products, the all-ones matrix plays the role the identity plays for matrix products).
import scipy.sparse as sp
import numpy as np
size = (100, 100)
x = np.random.uniform(-1, 1, size=size)
y = sp.random(*size, 0.001, format='csr')
# Z = Y' * X (entry-wise) = (J - Y) * X = X - Y * X;
# multiply() is entry-wise and only touches the stored entries of Y
z = x - y.multiply(x)
# A = X * Y' (entry-wise) is the same computation, since entry-wise products commute
a = x - y.multiply(x)
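A quick dense sanity check of the trick (only sensible at small sizes; yp is the hypothetical dense Y'):
# Reconstruct Y' densely and compare with the sparse-friendly computation
yp = np.ones(size) - y.toarray()
assert np.allclose(z, yp * x)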
I want to write a function that uses SVD decomposition to solve a system of equations ax = b, where a is a square matrix and b is a vector of values. The scipy function scipy.linalg.svd() decomposes a into the matrices U, W, V. For U and V I can simply take the transpose to find their inverses. But for W the function gives me a 1-D array of values, which I need to place along the diagonal of a matrix and then replace each value with one over that value.
def solveSVD(a,b):
    U,s,V=sp.svd(a,compute_uv=True)
    Ui=np.transpose(a)
    Vi=np.transpose(V)
    W=np.diag(s)
    Wi=np.empty(np.shape(W)[0],np.shape(W)[1])
    for i in range(np.shape(Wi)[0]):
        if W[i,i]!=0:
            Wi[i,i]=1/W[i,i]
    ai=np.matmul(Ui,np.matmul(Wi,Vi))
    x=np.matmul(ai,b)
    return(x)
However, I get a "TypeError: data type not understood" error. I think part of the issue is that
W=np.diag(s)
is not producing a square diagonal matrix.
This is my first time working with this library so apologies if I've done something very stupid, but I cannot work out why this line hasn't worked. Thanks all!
To be short, singular value decomposition lets you replace your initial problem A x = b with U diag(s) Vh x = b. A bit of algebra on the latter gives the following three-step function, which is really easy to read:
import numpy as np
from scipy.linalg import svd

def solve_svd(A,b):
    # compute svd of A
    U,s,Vh = svd(A)
    # U diag(s) Vh x = b <=> diag(s) Vh x = U.T b = c
    c = np.dot(U.T,b)
    # diag(s) Vh x = c <=> Vh x = diag(1/s) c = w (trivial inversion of a diagonal matrix)
    w = np.dot(np.diag(1/s),c)
    # Vh x = w <=> x = Vh.H w (where .H stands for hermitian = conjugate transpose)
    x = np.dot(Vh.conj().T,w)
    return x
Now, let's test it with
A = np.random.random((100,100))
b = np.random.random((100,1))
and compare it with the LU-decomposition-based np.linalg.solve:
x_svd = solve_svd(A,b)
x_lu = np.linalg.solve(A,b)
which gives
np.allclose(x_lu,x_svd)
>>> True
Please feel free to ask for more explanation in the comments if needed. Hope this helps.
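As a side note on the TypeError in the original attempt: np.diag(s) does return a square matrix; the error most likely comes from np.empty(np.shape(W)[0], np.shape(W)[1]), which passes the second dimension where np.empty expects a dtype. The shape must be a single tuple, and np.zeros is safer here since the loop skips zero singular values:
Wi = np.zeros((W.shape[0], W.shape[1]))  # shape as one tuple; np.empty(n, m) reads m as a dtype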
I'm currently trying to find an easy way to do the following operation to an N dimensional array in Python. For simplicity let's start with a 1 dimensional array of size 4.
X = np.array([1,2,3,4])
What I want to do is create a new array, call it Y, such that:
Y = np.array([[1,2,3,4],[2,3,4,1],[3,4,1,2],[4,1,2,3]])
So what I'm trying to do is create an array Y such that:
Y[:,i] = np.roll(X[:],-i, axis = 0)
I know how to do this using for loops, but I'm looking for a faster method of doing so. The actual array I'm trying to do this to is a 3 dimensional array, call it X. What I'm looking for is a way to find an array Y, such that:
Y[:,:,:,i,j,k] = np.roll(X[:,:,:],(-i,-j,-k),axis = (0,1,2))
I can do this using itertools.product with for loops, but this is quite slow. If anyone has a better way of doing this, please let me know. I also have CuPy installed with a GTX 970, so if there's a way of using CUDA to do this faster, please let me know. If anyone wants some more context, please let me know.
Here is my original code for computing the position-space two-point correlation function. The array x0 is an n by n by n real-valued array representing a real scalar field. The function iterate(j,s) runs j iterations. Each iteration generates a random float between -s and s for each lattice site, computes the change in the action dS, and accepts the change with probability min(1, exp(-dS)).
def momentum(k,j,s):
    global Gxa
    Gx = numpy.zeros((n,n,t))
    for i1 in range(0,k):
        iterate(j,s)
        for i2,i3,i4 in itertools.product(range(0,n),range(0,n),range(0,n)):
            x1 = numpy.roll(numpy.roll(numpy.roll(x0, -i2, axis = 0),-i3, axis = 1),-i4,axis = 2)
            x2 = numpy.mean(numpy.multiply(x0,x1))
            Gx[i2,i3,i4] = x2
        Gxa = Gxa + Gx
    Gxa = Gxa/k
Approach #1
We can extend this idea to the 3D case: simply concatenate X with sliced versions of itself along the three dims, then use scikit-image's view_as_windows (based on np.lib.stride_tricks.as_strided) to efficiently get the final output as a strided view of the concatenated version, like so -
import numpy as np
from skimage.util.shape import view_as_windows

X1 = np.concatenate((X, X[:, :, :-1]), axis=2)
X2 = np.concatenate((X1, X1[:, :-1, :]), axis=1)
X3 = np.concatenate((X2, X2[:-1, :, :]), axis=0)
out = view_as_windows(X3, X.shape)
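If you want exactly the Y[:,:,:,i,j,k] layout from the question, note that out has the window indices (i, j, k) first, so a transpose gives that layout; a quick sanity check, reusing out from above (with hypothetical test indices):
# out[i, j, k] == np.roll(X, (-i, -j, -k), axis=(0, 1, 2))
Y = out.transpose(3, 4, 5, 0, 1, 2)
i, j, k = 1, 2, 3   # arbitrary test indices
assert np.allclose(Y[:, :, :, i, j, k],
                   np.roll(X, (-i, -j, -k), axis=(0, 1, 2)))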
Approach #2
For really large arrays, we might want to initialize the output array and then re-use X3 from the earlier approach, assigning into the output by slicing it. This slicing would be faster than the original rolling. The implementation would be -
m,n,r = X.shape
Yout = np.empty((m,n,r,m,n,r),dtype=X.dtype)
for i in range(m):
    for j in range(n):
        for k in range(r):
            Yout[:,:,:,i,j,k] = X3[i:i+m,j:j+n,k:k+r]
In the following function, if I use np.linalg.inv, the function seems to take forever when Nx and Nt get large. I know I should instead use sparse matrices, which are in scipy (which I've never used before), but I'm getting really stuck on how to convert M to a sparse matrix, find its inverse, and then convert it back to a numpy array for the for loop.
If anyone could help I'd be really grateful! Thanks!
def BTCS(phiOld, c, Nx, Nt):
    #Initiate phi for the for loop
    phi = phiOld.copy()

    #Create the matrix M for the BTCS scheme
    M = np.zeros((Nx, Nx))
    for i in range(Nx):
        M[i,(i-1)%Nx] = -c/2
        M[i,i] = 1
        M[i,(i+1)%Nx] = c/2

    #Take the inverse of M so as to have phi(n+1) = M^(-1) * phi(n)
    M_inv = np.linalg.inv(M)

    #Loop over all time steps
    for it in range(Nt):
        #Loop over space (excluding end points)
        for ix in range(1,Nx-1):
            phi[ix] = M_inv.dot(phiOld)[ix]
        #Compute boundary values using periodic boundary conditions
        phi[0] = M_inv.dot(phiOld)[0]
        phi[Nx-1] = phi[0]
        #Update old time value
        phiOld = phi.copy()

    return phi
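For reference, a minimal sketch of the sparse route (my sketch, under the same periodic scheme as above): build M with scipy.sparse.diags and factorize it once with scipy.sparse.linalg.splu instead of inverting; applying the factorization at each time step replaces M^(-1) * phi(n):
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import splu

def BTCS_sparse(phiOld, c, Nx, Nt):
    # Same periodic tridiagonal M as above, but stored sparsely
    M = sp.diags([-c/2 * np.ones(Nx-1), np.ones(Nx), c/2 * np.ones(Nx-1)],
                 [-1, 0, 1], format='lil')
    M[0, Nx-1] = -c/2      # periodic wrap-around entries
    M[Nx-1, 0] = c/2
    lu = splu(M.tocsc())   # factorize once, reuse every time step
    phi = phiOld.copy()
    for it in range(Nt):
        phi = lu.solve(phi)
    return phi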
Is there any obvious way in numpy to replace something like:
for x in X:
xi, xj = meshgrid(x, x, indexing='ij')
with a single (and possibly more efficient) operation like:
Xi, Xj = multi_meshgrid(X, X, indexing='ij')
The example of X is the following:
X = np.array([[0,1,2,3,4,5], [5,6,7,8,9,10], [11,12,13,14,15], ...])
The main problem is that I can have tens or hundreds of thousands of entries in X, and the operation is possibly repeated many times.
The problem arises from assembling the global stiffness matrix K in the finite element method. For each entry of X of length n, I have an n x n matrix that I have to inscribe into this global matrix, which is in scipy.sparse coordinate format.
Regards, Marek
I think this answers the question, though I'm not sure if it is the best approach for constructing the sparse matrix in the end.
Anyway the following code creates a "view" into X, so it's very efficient both computationally and memory-wise.
Try it :)
import numpy as np
from numpy.lib.stride_tricks import as_strided

m = 3
n = 4
X = np.arange(m*n).reshape((m,n))
sz = X.itemsize
Xi = as_strided(X, shape=(m,n,n), strides=(n*sz, sz, 0))
Xj = as_strided(X, shape=(m,n,n), strides=(n*sz, 0, sz))
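A quick check of the strided views against plain np.meshgrid for one row (small sizes assumed):
xi, xj = np.meshgrid(X[0], X[0], indexing='ij')
print(np.array_equal(Xi[0], xi))  # True
print(np.array_equal(Xj[0], xj))  # True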
This does, however, not work when X is not a regular matrix. E.g. in your example the third row has 5 elements whereas the others have 6.