how can I simplify and extend the following code for arbitrary shapes of A?
import numpy as np

A = np.random.random([10, 12, 13, 5, 5])
B = np.zeros([10, 12, 13, 10, 10])
s2 = np.array([[0, 1], [-1, 0]])
for i in range(10):
    for j in range(12):
        for k in range(13):
            B[i, j, k, :, :] = np.kron(A[i, j, k, :, :], s2)
I know it would be possible with np.einsum, but there too I would have to give the shape explicitly.
That output shape has to be computed for the last two axes -
out_shp = A.shape[:-2] + tuple(A.shape[-2:]*np.array(s2.shape))
Then einsum or explicit extension of dims could be used -
B_out = (A[...,:,None,:,None]*s2[:,None]).reshape(out_shp)
B_out = np.einsum('ijklm,no->ijklnmo',A,s2).reshape(out_shp)
That einsum approach can be generalized further for generic dims with the ellipsis ... -
np.einsum('...lm,no->...lnmo',A,s2).reshape(out_shp)
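As a quick sanity check (reusing the A and s2 from the question), the einsum route reproduces the loop result:

import numpy as np

A = np.random.random([10, 12, 13, 5, 5])
s2 = np.array([[0, 1], [-1, 0]])

out_shp = A.shape[:-2] + tuple(A.shape[-2:] * np.array(s2.shape))
B_out = np.einsum('...lm,no->...lnmo', A, s2).reshape(out_shp)

# reference: the original triple loop
B = np.zeros(out_shp)
for i in range(10):
    for j in range(12):
        for k in range(13):
            B[i, j, k] = np.kron(A[i, j, k], s2)

print(np.allclose(B, B_out))  # True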
Extend to generic dims
We can generalize to generic dims with a function that accepts the axes along which the Kronecker multiplications are to be performed, with some reshaping work -
def kron_along_axes(a, b, axis):
    # Extend a to the extent of the broadcasted o/p shape
    ae = a.reshape(np.insert(a.shape, np.array(axis) + 1, 1))
    # Extend b to the extent of the broadcasted o/p shape
    d = np.ones(a.ndim, dtype=int)
    np.put(d, axis, b.shape)
    be = b.reshape(np.insert(d, np.array(axis), 1))
    # Get o/p and reshape back to a's dims
    out = ae * be
    out_shp = np.array(a.shape)
    out_shp[list(axis)] *= b.shape
    return out.reshape(out_shp)
Thus, to solve our case, it would be -
B = kron_along_axes(A, s2, axis=(3,4))
With numpy.kron
If you are looking for elegance and okay with something slower, we can use the built-in np.kron too with some axes-permutations -
def kron_along_axes(a, b, axis):
    new_order = list(np.setdiff1d(range(a.ndim), axis)) + list(axis)
    # np.kron operates on the trailing axes, so move the requested axes to
    # the end, then undo the permutation with its inverse (argsort)
    return np.kron(a.transpose(new_order), b).transpose(np.argsort(new_order))
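A quick check on a small hypothetical case where the Kronecker axes are not the trailing ones, confirming that the inverse permutation restores the original axis order:

a = np.random.rand(2, 3, 4)
b = np.random.rand(5, 6)
out = kron_along_axes(a, b, axis=(0, 1))
print(out.shape)  # (10, 18, 4): axes 0 and 1 scaled by b's shape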
flattened_A = A.reshape([-1, A.shape[-2], A.shape[-1]])
flattened_kron_product = np.kron(flattened_A, s2)
dims = list(A.shape[:-2]) + [flattened_kron_product.shape[-2], flattened_kron_product.shape[-1]]
result = flattened_kron_product.reshape(dims)
Subtracting result from B results in a zero-filled matrix.
I was reading about attention and came across this equation:
import einops
from fancy_einsum import einsum
import torch
x = torch.rand((200, 10, 768))
y = torch.rand((20, 768, 64))
res = einsum("batch query_pos d_model, n_heads d_model d_head -> batch query_pos n_heads d_head", x, y)
And I am not able to understand the underlying operations that give the result res
I thought it might be matmul and tried this:
import torch

x_ = x.unsqueeze(dim=2).unsqueeze(dim=2)
y_ = torch.broadcast_to(y, (1, 1, 20, 768, 64))
res2 = x_ @ y_
res2 = res2.squeeze(dim=-2)
(res == res2).all() # Prints False
But that does not seem to be right.
Any help regarding this is greatly appreciated
Whenever you use einsum, it is best to think about the meaning of the dimensions. Basically, we perform a multiplication between the two inputs in this case. The signature passed to einsum shows which dimensions will be preserved and which ones will be "summed away". I simplified the signature to single letters here:
res = einsum("b q m, n m h -> b q n h", x, y)
We can read from this that both x and y have three dimensions. Furthermore, both have a dimension called m, which doesn't appear in the output, so we can conclude that it gets "summed away". So for each entry of the output we have the following formula. For simplicity I reused the dimension names as indices, so for every b,q,n,h we get

res[b,q,n,h] = Σ_m x[b,q,m] * y[n,m,h]
Doing this with any function other than einsum is usually more cumbersome. First we need to reorder and unsqueeze the dimensions so that they are compatible for multiplication, which we can do as follows (shapes annotated in the comments):
# x[:, :, :, None, None]: (b, q, m, 1, 1)
# y.permute([1, 0, 2]):   (m, n, h)
# product:                (b, q, m, n, h)
product = x[:, :, :, None, None] * y.permute([1, 0, 2])
Due to the broadcasting rules, the second (y-) term will implicitly get the required leading dummy dimensions.
Then we can "sum away" the dimension m:
res = product.sum(dim=2) # (b,q,n,h)
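Putting it together, a minimal end-to-end check. Note the use of torch.allclose rather than exact equality: the two computation orders differ in floating-point round-off, which is also likely why the exact == comparison in the question prints False:

import torch

x = torch.rand(200, 10, 768)  # (b, q, m)
y = torch.rand(20, 768, 64)   # (n, m, h)

product = x[:, :, :, None, None] * y.permute([1, 0, 2])  # (b, q, m, n, h)
res = product.sum(dim=2)                                  # (b, q, n, h)

ref = torch.einsum("bqm,nmh->bqnh", x, y)
print(torch.allclose(res, ref, atol=1e-4))  # True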
So you can interpret it as a matrix multiplication if you want, or simply as a scalar product, carried out over many "batch" dimensions.
I have three 1D vectors. Let's say T is an array with 100k elements, and f and df are arrays with 200 elements each:
T = [T0, T1, ..., T100k]
f = [f0, f1, ..., f200]
df = [df0, df1, ..., df200]
For each element array, I have to calculate a function such as the following:
P = T*f + T**2 *df
My first instinct was to use NumPy's outer to evaluate the function for each combination of f and df:
P1 = np.outer(f,T)
P2 = np.outer(df,T**2)
P = np.add.outer(P1, P2)
However, in this case I am facing a RAM issue, receiving the following error:
Unable to allocate 2.23 PiB for an array with shape (200, 100000, 200,
100000) and data type float64
Is there a good way that I can calculate this?
My attempt using for loops
n=100
f_range = 5e-7
df_range = 1.5e-15
fsrc = np.arange(f - n * f_range, f + n * f_range, f_range) #array of 200
dfsrc = np.arange(df - n * df_range, df + n * df_range, df_range) #array of 200
dfnus=pd.DataFrame(fsrc)
numf=dfnus.shape[0]
dfnudots=pd.DataFrame(dfsrc)
numfdot=dfnudots.shape[0]
test2D = np.zeros([numf,(numfdot)])
for indexf, f in enumerate(fsrc):
    for indexfd, fd in enumerate(dfsrc):
        a = make_phase(T, f, fd)  # --> this is just a function that performs T*f + T**2 *df
        zlauf2d = z_n(a, n=1, norm=1)  # ---> And this is just another function that takes this 1D "a" and gives another 1D element array
        test2D[indexf, indexfd] = np.copy(zlauf2d)  # ---> I do this so I could make a contour plot at the end. It just copies the same thing to 2D
Now my test2D has the shape (200, 200). This is what I want; however, the for loops are taking ages, and I want to somehow reduce the two for loops to at least one.
Using broadcasting:
P1 = (f[:, np.newaxis] * T).sum(axis=-1)
P2 = (df[:, np.newaxis] * T**2).sum(axis=-1)
P = P1[:, np.newaxis] + P2
Alternatively, using outer:
P1 = (np.outer(f, T)).sum(axis=-1)
P2 = (np.outer(df, T**2)).sum(axis=-1)
P = P1[..., np.newaxis] + P2
This produces an array of shape (f.size, df.size) == (200, 200).
Generally speaking, if the final output array size is very large, one can either:
Reduce the size of the datatypes. One way is to change the datatypes of the arrays used to calculate the final output via P1.astype(np.float32). Alternatively, some operations allow one to pass in a dtype=np.float32 as a parameter.
Chunk the computation and work with smaller subsections of the result, as in the sketch below.
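A minimal chunking sketch (the block size and random data are hypothetical placeholders); it accumulates the same per-chunk sums as the broadcasting solution above, so only a (200, block)-sized intermediate is materialized at a time:

import numpy as np

T = np.random.rand(100_000)
f = np.random.rand(200)
df = np.random.rand(200)

P = np.zeros((f.size, df.size))
block = 10_000  # tune to available memory
for start in range(0, T.size, block):
    t = T[start:start + block]
    P1 = (f[:, np.newaxis] * t).sum(axis=-1)      # (200,)
    P2 = (df[:, np.newaxis] * t**2).sum(axis=-1)  # (200,)
    P += P1[:, np.newaxis] + P2                   # (200, 200)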
Based on the most recent edit, compute an array a with shape (200, 200, 100000). Then, take its element-wise norm along the last axis to produce an array z with shape (200, 200).
a = (
    f[:, np.newaxis, np.newaxis] * T
    + df[np.newaxis, :, np.newaxis] * T**2
)
# L1 norm along last axis.
z = np.abs(a).sum(axis=-1)
This produces an array of shape (f.size, df.size) == (200, 200).
I am attempting to write a program which constructs a matrix and performs a singular value decomposition on it. I am evaluating the function ax^2 + bx + 1 on a grid. I then make a uniform meshgrid of a and b. The rows of the matrix correspond to different quadratic coefficients, while each column corresponds to a grid point at which the function is evaluated.
The matlab code is here:
% Collect data
x = linspace(-1,1,100);
[a,b] = meshgrid(0:0.1:1,0:0.1:1);
D=zeros(numel(x),numel(a));
sz = size(D)
% Build “Dose” matrix
for i=1:numel(a)
    D(:,i) = a(i)*x.^2 + b(i)*x + 1;
end
% Do the SVD:
[U,S,V]=svd(D,'econ');
D_reconstructed = U*S*V';
plot(diag(S))
scatter3(a(:),b(:),V(:,1))
This is my attempt at a solution:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(-1, 1, 100)
def f(x, a, b):
    return a*x*x + b*x + 1
a, b = np.mgrid[0:1:0.1,0:1:0.1]
#a = b = np.arange(0,1,0.01)
D = np.zeros((x.size, a.size))
for i in range(a.size):
    D[i] = a[i]*x*x + b[i]*x + 1
U, S, V = np.linalg.svd(D)
plt.plot(np.diag(S))
fig = plt.figure()
ax = plt.axes(projection="3d")
ax.scatter(a, b, V[0])
but I always get broadcasting errors which I am not sure how to fix.
Firstly, in MATLAB you're assigning to D(:,i), but in python you're assigning to D[i]. The latter is equivalent to D[i, ...] which is in your case D[i, :]. Instead you seem to need D[:, i].
Secondly, in MATLAB using a linear index into a 2d array (namely a and b) will give you flattened views. If you do that with numpy you get slices of an array instead, just as I mentioned with D[i].
You can do away with the loop using broadcasting, getting your desired 2d array by .ravelling (or reshaping) your a and b arrays:
x = np.linspace(-1, 1, 100)[:, None] # inject trailing singleton for broadcasting
a, b = np.mgrid[0:1:0.1, 0:1:0.1]
D = a.ravel() * x**2 + b.ravel() * x + 1
The way this works is that x has shape (100, 1) after we inject a trailing singleton (in MATLAB trailing singletons are implied; in numpy, leading ones are), and both a.ravel() and b.ravel() have shape (10*10,), which is compatible with (1, 10*10), making broadcasting possible into shape (100, 10*10). You could also replace the calls to ravel with
a, b = np.mgrid[...].reshape(2, -1)
which is a trick I sometimes use, but this is harder to read if you're unfamiliar with the pattern.
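For completeness, a sketch of the rest of the translation: MATLAB's svd(D,'econ') corresponds to np.linalg.svd(D, full_matrices=False); NumPy's S is already a 1d vector of singular values (no diag needed); and NumPy returns Vh = V.T, so MATLAB's V(:,1) is Vh[0]:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-1, 1, 100)[:, None]
a, b = np.mgrid[0:1:0.1, 0:1:0.1]
D = a.ravel() * x**2 + b.ravel() * x + 1

# 'econ'-style SVD; rows of Vh are the right singular vectors
U, S, Vh = np.linalg.svd(D, full_matrices=False)

plt.plot(S)  # S is 1d, unlike MATLAB's diagonal S

fig = plt.figure()
ax = plt.axes(projection="3d")
ax.scatter(a.ravel(), b.ravel(), Vh[0])  # first right singular vector
plt.show()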
Side note: it's better to use example data where dimensions end up being of different size so that you notice if something ends up being transposed.
I am searching a sorted array for the proper insertion indices of new data so that it remains sorted. Although searchsorted2d by @Divakar works great for insertions along columns, it cannot work along rows. Is there a way to perform the same, yet along the rows?
The first idea that comes to mind is to adapt searchsorted2d for the desired behavior. However, that turns out not to be as easy as it appears. Here is my attempt at adapting it, but it still does not work when axis is set to 0.
import numpy as np
# By Divakar
# See https://stackoverflow.com/a/40588862
def searchsorted2d(a, b, axis=0):
    shape = list(a.shape)
    shape[axis] = 1
    max_num = np.maximum(a.max() - a.min(), b.max() - b.min()) + 1
    r = np.ceil(max_num) * np.arange(a.shape[1-axis]).reshape(shape)
    p = np.searchsorted((a + r).ravel(), (b + r).ravel()).reshape(b.shape)
    return p  # - a.shape[axis] * np.arange(a.shape[1-axis]).reshape(shape)
axis = 0 # Operate along which axis?
n = 16 # vector size
# Initial array
a = np.random.rand(n).reshape((n, 1) if axis else (1, n))
insert_into_a = np.random.rand(n).reshape((n, 1) if axis else (1, n))
indices = searchsorted2d(a, insert_into_a, axis=axis)
a = np.insert(a, indices.ravel(), insert_into_a.ravel()).reshape(
    (n, -1) if axis else (-1, n))
assert(np.all(a == np.sort(a, axis=axis))), 'Failed :('
print('Success :)')
I expect that the assertion passes in both cases (axis = 0 and axis = 1).
Is there an easy/built-in way to get the element-wise maximum of two (or ideally more) sparse matrices? I.e., a sparse equivalent of np.maximum.
This did the trick:
def maximum(A, B):
    BisBigger = A - B
    BisBigger.data = np.where(BisBigger.data < 0, 1, 0)
    return A - A.multiply(BisBigger) + B.multiply(BisBigger)
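A quick sanity check against the dense result (a sketch; the sparse.random sizes and density are arbitrary):

import numpy as np
from scipy import sparse

A = sparse.random(50, 50, density=0.1, format='csr')
B = sparse.random(50, 50, density=0.1, format='csr')
print(np.allclose(maximum(A, B).toarray(),
                  np.maximum(A.toarray(), B.toarray())))  # True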
No, there's no built-in way to do this in scipy.sparse. The easy solution is
np.maximum(X.A, Y.A)
but this is obviously going to be very memory-intensive when the matrices have large dimensions and it might crash your machine. A memory-efficient (but by no means fast) solution is
from scipy.sparse import coo_matrix

# convert to COO, if necessary
X = X.tocoo()
Y = Y.tocoo()

Xdict = dict(((i, j), v) for i, j, v in zip(X.row, X.col, X.data))
Ydict = dict(((i, j), v) for i, j, v in zip(Y.row, Y.col, Y.data))

keys = list(set(Xdict.keys()).union(Ydict.keys()))
XmaxY = [max(Xdict.get((i, j), 0), Ydict.get((i, j), 0)) for i, j in keys]
XmaxY = coo_matrix((XmaxY, list(zip(*keys))))
Note that this uses pure Python instead of vectorized idioms. You can try shaving some of the running time off by vectorizing parts of it.
Here's another memory-efficient solution that should be a bit quicker than larsmans'. It's based on finding the set of unique indices for the nonzero elements in the two arrays using code from Jaime's excellent answer here.
import numpy as np
from scipy import sparse
def sparsemax(X, Y):
    # the indices of all non-zero elements in both arrays
    idx = np.hstack((X.nonzero(), Y.nonzero()))
    # find the set of unique non-zero indices
    idx = tuple(unique_rows(idx.T).T)
    # take the element-wise max over only these indices
    X[idx] = np.maximum(X[idx].A, Y[idx].A)
    return X

def unique_rows(a):
    void_type = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    b = np.ascontiguousarray(a).view(void_type)
    idx = np.unique(b, return_index=True)[1]
    return a[idx]
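As a side note, on NumPy 1.13+ the void-view helper can be replaced by np.unique's axis argument; it returns the same rows (possibly in a different order, which doesn't matter for the indexing above):

def unique_rows(a):
    # equivalent modern form: unique rows via np.unique's axis keyword
    return np.unique(a, axis=0)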
Testing:
def setup(n=1000, fmt='csr'):
    return sparse.rand(n, n, format=fmt), sparse.rand(n, n, format=fmt)

X, Y = setup()
Z = sparsemax(X, Y)
print(np.all(Z.A == np.maximum(X.A, Y.A)))
# True
%%timeit X, Y = setup()
sparsemax(X, Y)
# 100 loops, best of 3: 4.92 ms per loop
The latest scipy (0.13.0) defines element-wise booleans for sparse matrices. So:
BisBigger = B>A
A - A.multiply(BisBigger) + B.multiply(BisBigger)
np.maximum does not (yet) work because it uses np.where, which is still trying to get the truth value of an array.
Curiously B>A returns a boolean dtype, while B>=A is float64.
Here is a function that returns a sparse matrix that is element-wise maximum of two sparse matrices. It implements the answer by hpaulj:
def sparse_max(A, B):
    """
    Return the element-wise maximum of sparse matrices `A` and `B`.
    """
    AgtB = (A > B).astype(int)
    M = AgtB.multiply(A - B) + B
    return M
Testing:
A = sparse.csr_matrix(np.random.randint(-9,10, 25).reshape((5,5)))
B = sparse.csr_matrix(np.random.randint(-9,10, 25).reshape((5,5)))
M = sparse_max(A, B)
M2 = sparse_max(B, A)
# Test symmetry:
print((M.A == M2.A).all())
# Test that M is larger or equal to A and B, element-wise:
print((M.A >= A.A).all())
print((M.A >= B.A).all())
from scipy import sparse
from numpy import array
I = array([0,3,1,0])
J = array([0,3,1,2])
V = array([4,5,7,9])
A = sparse.coo_matrix((V,(I,J)),shape=(4,4))
A.data.max()
9
If you haven't already, you should try out IPython. You could have saved yourself time: after making your sparse matrix A, typing A. and then tab will print a list of methods that you can call on A. From this you would see that A.data gives you the non-zero entries as an array, and hence you just want the maximum of this.