Is there any obvious way in numpy to replace something like:
for x in X:
    xi, xj = meshgrid(x, x, indexing='ij')
with a single (and possibly more efficient) operation like:
Xi, Xj = multi_meshgrid(X, X, indexing='ij')
An example of X is the following:
X = np.array([[0,1,2,3,4,5], [5,6,7,8,9,10], [11,12,13,14,15], ...])
The main problem is that I can have tens or hundreds of thousands of entries in X, and the operation is repeated often.
The problem arises from assembling the global stiffness matrix K in the finite element method. For each entry of length n in X I have an n x n matrix which I have to add into this global matrix. The global matrix is in scipy.sparse coordinate (COO) format.
Regards, Marek
I think this answers the question, though I'm not sure it's the best way to construct the sparse matrix in the end.
Anyway, the following code creates "views" into X, so it's very efficient both computationally and memory-wise.
Try it :)
import numpy as np
from numpy.lib.stride_tricks import as_strided

m = 3
n = 4
X = np.arange(m*n).reshape((m,n))
sz = X.itemsize
# A zero stride repeats the same values along an axis without copying
Xi = as_strided(X, shape=(m,n,n), strides=(n*sz, sz, 0))
Xj = as_strided(X, shape=(m,n,n), strides=(n*sz, 0, sz))
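As a quick check, each slice of these views matches the per-row meshgrid from the loop, and they can feed the COO assembly from the question directly. The element matrices Ke below are a hypothetical stand-in for your actual n x n blocks:

import scipy.sparse as sp

# Each slice reproduces the per-row meshgrid from the loop version
for k in range(m):
    xi, xj = np.meshgrid(X[k], X[k], indexing='ij')
    assert np.array_equal(Xi[k], xi)
    assert np.array_equal(Xj[k], xj)

# Hypothetical assembly sketch: one n x n element matrix per row of X;
# coo_matrix sums duplicate (row, col) pairs on conversion, which is
# exactly what stiffness assembly needs
Ke = np.random.rand(m, n, n)
K = sp.coo_matrix((Ke.ravel(), (Xi.ravel(), Xj.ravel())))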
This does not work, however, when X is not a regular matrix, e.g. in your example, where the third row has 5 elements whereas the others have 6.
Let a be a big scipy.sparse matrix and IJ={(i0,j0),(i1,j1),...} a set of positions. How can I efficiently set all the entries in a in positions IJ to 0? Something like a[IJ]=0.
In Mathematica, I would create a new sparse matrix b with background value 1 (instead of 0) and all entries in IJ. Then, I would use a=a*b (entry-wise multiplication). That does not seem to be an option here.
A toy example:
import scipy.sparse as sp
import numpy as np
np.set_printoptions(linewidth=200,edgeitems=5,precision=4)
m = n = 10**1
a = sp.random(m, n, 4/m, format='csr')
print(a.toarray())
IJ = np.array([range(0, n, 2), range(0, n, 2)])
print(IJ)  # every second entry of the main diagonal
You are almost there. To go by your definitions, all you'd need to do is:
a[IJ[0],IJ[1]] = 0
Note that scipy will warn you:
SparseEfficiencyWarning: Changing the sparsity structure of a csr_matrix is expensive. lil_matrix is more efficient.
You can read more about that here.
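If you do this repeatedly, here is a minimal sketch of the two routes, using a and IJ from the toy example above:

# Route 1: stay in CSR; zeroing keeps the explicit zeros in the structure,
# so prune them afterwards
a[IJ[0], IJ[1]] = 0
a.eliminate_zeros()

# Route 2: follow the warning's advice and edit in LIL format
a = a.tolil()
a[IJ[0], IJ[1]] = 0
a = a.tocsr()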
The scipy sparse matrices can't have a non-zero background value. While it is possible to make a "sparse" matrix holding lots of non-zero values, the performance (speed & memory) would be far worse than dense matrix multiplication.
A possible work-around is to rewrite every such matrix to have a background value of zero. For example, if matrix Y' contains mostly ones, replace Y' by J - Y, where Y = J - Y' is sparse and J is the all-ones matrix (the neutral element for entry-wise multiplication).
import scipy.sparse as sp
import numpy as np
size = (100, 100)
x = np.random.uniform(-1, 1, size=size)
y = sp.random(*size, 0.001, format='csr')
# Entry-wise: Z = X * (J - Y) = X - Y * X, with J the all-ones matrix
z = x - y.multiply(x)
# Entry-wise with a transpose: A = X - transpose(Y * X)
a = x - y.multiply(x).T
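A quick sanity check of z against the dense computation (the all-ones matrix J shows up implicitly as the literal 1):

z_dense = x * (1 - y.toarray())
print(np.allclose(z, z_dense))  # True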
I'm currently trying to find an easy way to do the following operation to an N dimensional array in Python. For simplicity let's start with a 1 dimensional array of size 4.
X = np.array([1,2,3,4])
What I want to do is create a new array, call it Y, such that:
Y = np.array([[1,2,3,4],[2,3,4,1],[3,4,1,2],[4,1,2,3]])
So what I'm trying to do is create an array Y such that:
Y[:,i] = np.roll(X[:],-i, axis = 0)
I know how to do this using for loops, but I'm looking for a faster method. The actual array I'm trying to do this to is a 3-dimensional array, call it X. What I'm looking for is a way to find an array Y such that:
Y[:,:,:,i,j,k] = np.roll(X[:,:,:],(-i,-j,-k),axis = (0,1,2))
I can do this with itertools.product and for loops, but it is quite slow. If anyone has a better way of doing this, please let me know. I also have CuPy installed with a GTX 970, so if there's a way of using CUDA to do this faster, please let me know. If anyone wants more context, just ask.
Here is my original code for computing the position-space two-point correlation function. The array x0 is an n by n by n real-valued array representing a real scalar field. The function iterate(j,s) runs j iterations. Each iteration generates a random float between -s and s for each lattice site, computes the change in the action dS, and accepts the change with probability min(1, exp(-dS)).
def momentum(k,j,s):
    global Gxa
    Gx = numpy.zeros((n,n,n))
    for i1 in range(0,k):
        iterate(j,s)
        for i2,i3,i4 in itertools.product(range(0,n),range(0,n),range(0,n)):
            x1 = numpy.roll(numpy.roll(numpy.roll(x0, -i2, axis=0), -i3, axis=1), -i4, axis=2)
            x2 = numpy.mean(numpy.multiply(x0,x1))
            Gx[i2,i3,i4] = x2
        Gxa = Gxa + Gx
    Gxa = Gxa/k
Approach #1
We can extend this idea to our 3D array case. Simply concatenate X with sliced versions of itself along the three dims, and then use scikit-image's view_as_windows (which is built on np.lib.stride_tricks.as_strided) to efficiently get the final output as a strided view of the concatenated version; a 1D warm-up and the 3D code follow below.
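The 1D version of the idea, on the question's small example (every cyclic shift of X becomes a contiguous window of the extended array):

import numpy as np
from skimage.util.shape import view_as_windows

X = np.array([1,2,3,4])
Xe = np.concatenate((X, X[:-1]))    # [1 2 3 4 1 2 3]
Y = view_as_windows(Xe, X.shape)    # row i equals np.roll(X, -i)
print(Y)

The 3D version follows the same pattern -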
import numpy as np
from skimage.util.shape import view_as_windows

# Append the wrap-around parts so every cyclic shift is a contiguous block
X1 = np.concatenate((X, X[:,:,:-1]), axis=2)
X2 = np.concatenate((X1, X1[:,:-1,:]), axis=1)
X3 = np.concatenate((X2, X2[:-1,:,:]), axis=0)
out = view_as_windows(X3, X.shape)
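A quick self-contained check against the roll definition (note the window index comes first, i.e. out[i,j,k] corresponds to Y[:,:,:,i,j,k] from the question):

import numpy as np
from skimage.util.shape import view_as_windows

X = np.arange(2*3*4).reshape(2,3,4)
X1 = np.concatenate((X, X[:,:,:-1]), axis=2)
X2 = np.concatenate((X1, X1[:,:-1,:]), axis=1)
X3 = np.concatenate((X2, X2[:-1,:,:]), axis=0)
out = view_as_windows(X3, X.shape)

i, j, k = 1, 2, 3
print(np.array_equal(out[i,j,k], np.roll(X, (-i,-j,-k), axis=(0,1,2))))  # True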
Approach #2
For really large arrays, we might want to initialize the output array and then re-use X3 from the earlier approach, assigning into the output by slicing. This slicing is faster than the original rolling. The implementation would be -
m,n,r = X.shape
Yout = np.empty((m,n,r,m,n,r), dtype=X.dtype)
for i in range(m):
    for j in range(n):
        for k in range(r):
            Yout[:,:,:,i,j,k] = X3[i:i+m, j:j+n, k:k+r]
Given a large sparse matrix A which is banded (tridiagonal-like) and a vector f, I would like to solve for Z, where AZ = f.
There are 6 diagonals, not clearly shown here.
A has one more row than it has columns (M = N + 1, so M ~= N), hence the system is over-determined. Here is the source Matlab code, which I would like to convert to its Scipy equivalent.
Matlab
A = A(:,2:end); % less one column
f = f(:);
Z = A\f;
Z = [0;-Z];
Z = reshape(Z,H,W);
Z = Z - min(Z(:));
My attempt in Scipy is below, but solving for Z with scipy.sparse.linalg's lsqr and lsmr is a lot slower than Matlab's \ and does not give a good enough solution. A is created as a csr_matrix.
Python
import numpy as np
import scipy.sparse.linalg as la

A = A[:,1:]  # less one column
f = f.flatten(order='F')  # Matlab's f(:) flattens column-major
Z = la.lsqr(A, f, atol=1e-6, btol=1e-6)
#Z = la.lsmr(A, f)  # the other method I used
Z = Z[0]
Z = np.append([0], np.negative(Z))
Z = np.reshape(Z, (height, width), order='F').copy()
Z = Z - Z.min()  # min(Z(:))
Could anyone recommend a better alternative for solving for Z that is as effective and fast as Matlab's \ ?
This looks like a good candidate for solve_banded.
Unfortunately, the interface for providing the banded matrix is a little complex. You could start by converting your sparse matrix to DIA format, and work from there.
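A minimal sketch of that route, with the caveat that solve_banded only handles square systems, so the over-determined, column-trimmed A from the question would need squaring up first; the tridiagonal A below is a stand-in for the six-diagonal matrix:

import numpy as np
import scipy.sparse as sp
from scipy.linalg import solve_banded

n = 6
A = sp.diags([np.full(n-1, -1.0), np.full(n, 2.0), np.full(n-1, -1.0)],
             offsets=[-1, 0, 1], format='dia')
f = np.ones(n)

# solve_banded wants the diagonals stacked top-to-bottom in an
# (upper + lower + 1, n) array; row u - offset holds diagonal `offset`
u = A.offsets.max()
l = -A.offsets.min()
ab = np.zeros((u + l + 1, n))
for off, diag in zip(A.offsets, A.data):
    ab[u - off] = diag

Z = solve_banded((l, u), ab, f)
print(np.allclose(A @ Z, f))  # True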
I'm writing some python + numpy + cython code, and am trying to find the most elegant and efficient way of doing the following kind of iteration over an array:
Let's say I have a function f(x, y) that takes a vector x of shape (3,) and a vector y of shape (10,) and returns a vector of shape (10,). Now I have two arrays X and Y of shape sx + (3,) and sy + (10,), where the sx and sy are two shapes that can be broadcast together (i.e. either sx == sy, or when an axis differs, one of the two has length 1, in which case it will be repeated). I want to produce an array Z that has the shape zs + (10,), where zs is the shape of the broadcasting of sx with sy. Each 10 dimensional vector in Z is equal to f(x, y) of the vectors x and y at the corresponding locations in X and Y.
I looked into np.nditer and while it plays nice with cython (see bottom of linked page), it doesn't seem to allow iterating over vectors from a multidimensional array, instead of elements. I also looked at index grids, but the problem there is that cython indexing is only fast when the number of indexes is equal to the dimensionality of the array, and are stored as cython integers instead of python tuples.
Any help is greatly appreciated!
You are describing what Numpy calls a Generalized Universal FUNCtion, or gufunc. As its name suggests, it is an extension of ufuncs. You probably want to start by reading these two pages:
Writing your own ufunc
Building a ufunc from scratch
The second example uses Cython and has some material on gufuncs. To fully go down the gufunc road, you will need to read the corresponding section in the numpy C API documentation:
Generalized Universal Function API
I do not know of any example of gufuncs being coded in Cython, although it shouldn't be too hard to do following the examples above. If you want to look at gufuncs coded in C, you can take a look at the source code for np.linalg here, although that can be a daunting experience. A while back I bored my local Python User Group to death with a talk on extending numpy with C, mostly about writing gufuncs in C; the slides of that talk and a sample Python module providing a new gufunc can be found here.
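As a stop-gap while going down that road, np.vectorize accepts a gufunc-style signature (numpy >= 1.12), which gives you the broadcasting semantics of a gufunc in pure Python (none of the speed, but handy for prototyping). The core function f below is a hypothetical stand-in, not the questioner's:

import numpy as np

def f(x, y):
    # hypothetical core: maps a (3,) and a (10,) vector to a (10,) vector
    return y * x.mean()

F = np.vectorize(f, signature='(m),(n)->(n)')

X = np.ones((2, 1, 3))    # sx = (2, 1)
Y = np.ones((1, 5, 10))   # sy = (1, 5)
Z = F(X, Y)
print(Z.shape)            # (2, 5, 10): broadcast of sx and sy, plus (10,)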
If you want to stick with nditer, here's a way using your example dimensions. It's pure Python here, but shouldn't be hard to implement with cython (though it still has the tuple iterator). I'm borrowing ideas from ndindex as described in shallow iteration with nditer
The idea is to find the common broadcasting shape, sz, and construct a multi_index iterator over it.
I'm using as_strided to expand X and Y to usable views, and passing the appropriate vectors (actually (1,n) arrays) to the f(x,y) function.
import numpy as np
from numpy.lib.stride_tricks import as_strided

def f(x, y):
    # sample function: receives (1,10) and (1,3) views, returns a (1,10) array
    assert x.shape == (1,10), x.shape
    assert y.shape == (1,3), y.shape
    z = x*10 + y.mean()
    return z

def brdcast(X, X1):
    # broadcast X to the shape of X1, keeping the last dim of X
    # modeled on np.broadcast_arrays
    shape = X1.shape + (X.shape[-1],)
    strides = X1.strides + (X.strides[-1],)
    X1 = as_strided(X, shape=shape, strides=strides)
    return X1

def F(X, Y):
    # find the common broadcast shape sz from one "slice" of each array
    X1, Y1 = np.broadcast_arrays(X[...,0], Y[...,0])
    Z = np.zeros(X1.shape + (10,))
    it = np.nditer(X1, flags=['multi_index'])
    X1 = brdcast(X, X1)
    Y1 = brdcast(Y, Y1)
    while not it.finished:
        I = it.multi_index + (None,)
        Z[I] = f(X1[I], Y1[I])
        it.iternext()
    return Z

sx = (2,3)  # works with (2,1) as well
sy = (1,3)
# X, Y = np.ones(sx+(10,)), np.ones(sy+(3,))
X = np.repeat(np.arange(np.prod(sx)).reshape(sx)[...,None], 10, axis=-1)
Y = np.repeat(np.arange(np.prod(sy)).reshape(sy)[...,None], 3, axis=-1)
Z = F(X,Y)
print(Z.shape)
print(Z[...,0])
I have a question about accessing a matrix position that, in fact, does not exist.
First, I have a matrix with rows rows and cols columns. From this matrix, I have to get sets of n x n sub-matrices. For example, to get 3 x 3 sub-matrices, I do the following:
for x, y in product(range(1, matrix.rows-1), range(1, matrix.cols-1)):
    bootstrap_3x3 = npr.choice(matrix.data[x-1:x+2, y-1:y+2].flatten(), size=(3, 3), replace=True)
But, as can be seen, I'm not considering the extremes, and I have to. For x = 0 and y = 0, for example, I should consider matrix.data[x:x+2, y:y+2] (the center should be the current x and y), returning a 3 x 3 window whose first row/column are 0.
I know that I can achieve this with some if statements. But I guess Python should have a clever way to do this properly.
Thank you in advance.
I would make a new matrix, padded with (n-1)/2 zeros around it:
import numpy as np

rows, cols = 4, 6
n = 3
d = (n-1)//2  # integer division; np.pad needs an int pad width
data = np.arange(rows*cols).reshape(rows, cols)
padded = np.pad(data, d, mode='constant')
for x, y in np.indices(data.shape).reshape(2, -1).T:
    sub = padded[x:x+n, y:y+n]
    print(sub)
    bootstrap_nxn = np.random.choice(sub.ravel(), (n, n))
This assumes n is odd, and that the submatrix center is always within the original data matrix. If n is even, the center of the submatrix isn't well defined.
If you actually want the submatrix to overlap the data matrix by as little as one row, you'd need to pad with n-1 zeros (in which case even vs. odd n doesn't matter).
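If all the padded neighborhoods are needed anyway, a hedged alternative sketch: scikit-image's view_as_windows produces every n x n window of the padded matrix in one strided view, so windows[x, y] is the same sub-matrix as padded[x:x+n, y:y+n] from the loop above:

import numpy as np
from skimage.util.shape import view_as_windows

rows, cols, n = 4, 6, 3
d = (n-1)//2
data = np.arange(rows*cols).reshape(rows, cols)
padded = np.pad(data, d, mode='constant')

windows = view_as_windows(padded, (n, n))  # shape (rows, cols, n, n)
print(np.array_equal(windows[1, 2], padded[1:1+n, 2:2+n]))  # True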