Is there a way, in numpy, to perform what amounts to an outer addition of subarrays?
That is to say, I have 2 arrays of the form 2x2xNxM, which may each be considered a stack of 2x2 matrices N high and M wide. I would like to add each of these matrices to each matrix from the other array, to form a 2x2xNxMxNxM array in which the last four indices correspond to the indices in my initial two arrays so that I can index output[:,:,x1,y1,x2,y2] == a1[:,:,x1,y1] + a2[:,:,x2,y2].
If these were arrays of scalars, it would be trivial, all I'd have to do is:
A, B = a.ravel(), b.ravel()
four_D = (A[:, np.newaxis] + B).reshape(*a.shape, *b.shape)
for (x1, y1, x2, y2), added in np.ndenumerate(four_D):
    assert added == a[x1, y1] + b[x2, y2]
However, this doesn't work when a and b are arrays of matrices rather than scalars. I could, of course, use nested for loops, but my dataset is going to be fairly large, and I'm expecting to run this over multiple datasets.
Is there an efficient way to do this?
Extend arrays to have more dimensions and then leverage broadcasting -
output = a1[...,None,None] + a2[...,None,None,:,:]
Sample run -
In [38]: # Setup input arrays
...: N = 3
...: M = 4
...: a1 = np.random.rand(2,2,N,M)
...: a2 = np.random.rand(2,2,N,M)
...:
...: output = np.zeros((2,2,N,M,N,M))
...: for x1 in range(N):
...:     for x2 in range(N):
...:         for y1 in range(M):
...:             for y2 in range(M):
...:                 output[:,:,x1,y1,x2,y2] = a1[:,:,x1,y1] + a2[:,:,x2,y2]
...:
...: output1 = a1[...,None,None] + a2[...,None,None,:,:]
...:
...: print np.allclose(output, output1)
True
Just as for scalars, inserting additional axes makes this work for higher-dimensional arrays too (this is called broadcasting):
import numpy as np
a1 = np.random.randn(2, 2, 3, 4)
a2 = np.random.randn(2, 2, 3, 4)
added = a1[..., np.newaxis, np.newaxis] + a2[..., np.newaxis, np.newaxis, :, :]
print(added.shape) # (2, 2, 3, 4, 3, 4)
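As an aside, for the scalar case from the question, np.add.outer produces the same outer addition directly (a small sketch):
import numpy as np

a = np.random.rand(3, 4)
b = np.random.rand(3, 4)

# ufunc.outer applies the ufunc to all pairs, giving shape a.shape + b.shape
four_D = np.add.outer(a, b)  # shape (3, 4, 3, 4)
assert np.allclose(four_D, a[..., None, None] + b)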
Related
In R, when I execute the code below:
> X=matrix(1,2,3)
> c=c(1,2,3)
> X*c
R gives out the following output:
     [,1] [,2] [,3]
[1,]    1    3    2
[2,]    2    1    3
But when I do the following in Python:
>>> import numpy as np
>>> X=np.array([[1,1,1],[1,1,1]])
>>> c=np.array([1,2,3])
>>> X*c
the Python code above gives the following output:
array([[1, 2, 3],
       [1, 2, 3]])
Is there any way to make Python produce the same output as R? I think I somehow have to tell numpy to multiply each element of the matrix X by each element of the vector c along the columns instead of along the rows, but I am not sure how to go about this.
In [18]: np.reshape([1,2,3]*2,(2,3),order='F')
Out[18]:
array([[1, 3, 2],
       [2, 1, 3]])
This starts with a list multiply, which is replication:
In [19]: [1,2,3]*2
Out[19]: [1, 2, 3, 1, 2, 3]
The rest uses numpy to reshape it into a (2,3) array, but with consecutive values going down, 'F' order.
Not knowing R, and in particular the c(1,2,3) expression, I can't say whether that's what's going on in R.
===
You talk about rows and columns, but I don't see how that works in your example. That said, we can easily perform outer-like products, as sketched below.
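For instance, a minimal sketch of an outer product via broadcasting:
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20])

# Every element of x times every element of y
outer = x[:, None] * y[None, :]  # shape (3, 2)
# np.outer(x, y) gives the same result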
===
This reproduces your R_Product (at least in a few test cases):
In [138]: def foo(X,c):
     ...:     X1 = X.ravel()
     ...:     Y = np.resize(c,X1.shape)*X1
     ...:     return Y.reshape(X.shape, order='F')
     ...:
In [139]: foo(np.ones((2,3)),np.arange(1,4))
Out[139]:
array([[1., 3., 2.],
       [2., 1., 3.]])
In [140]: foo(np.arange(6).reshape(2,3),np.arange(1,4))
Out[140]:
array([[ 0,  6,  8],
       [ 2,  3, 15]])
I'm using the resize function to replicate c to match the total number of elements of X, and order 'F' to stack the results in the desired column order. The default for numpy is order 'C'.
In numpy, replicating an array to match another is not common, at least not in this sense. Replicating by row or column, as in broadcasting, is common, as is reshaping.
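A small sketch of those common broadcasting patterns:
import numpy as np

X = np.ones((2, 3))
row = np.array([1, 2, 3])
col = np.array([10, 20])

X * row            # broadcasts along rows: every row is multiplied by [1, 2, 3]
X * col[:, None]   # broadcasts along columns: row i is multiplied by col[i]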
I am the OP.
I was looking for a quick and easy solution, but I guess there is no straightforward functionality in Python that allows us to do this. So, I had to make a function that multiplies a matrix with a vector in the same manner that R does:
def R_product(X, c):
    """
    Computes the regular R product
    (not the same as the matrix product,
    which is X %*% c in R) between
    a 2D Numpy Array X and a numpy vector c.

    Args:
        X: 2D Numpy Array
        c: A Numpy vector

    Returns: the output of X*c in R.
    """
    X_nrow = X.shape[0]
    X_ncol = X.shape[1]
    X_dummy = np.zeros(shape=((X_nrow * X_ncol), 1))
    nrow = X_dummy.shape[0]
    nc = nrow // len(c)
    Y = np.zeros(shape=(nrow, 1))
    # Flatten X into X_dummy in column-major order, the way R stores matrices
    for j in range(X_ncol):
        for u in range(X_nrow):
            X_dummy[X_nrow * j + u, 0] = X[u, j]
    # Multiply by c, recycling it along the flattened array, as R does
    for i in range(nc):
        for j in range(len(c)):
            Y[i * len(c) + j, 0] = X_dummy[i * len(c) + j, 0] * c[j]
    # Handle the leftover elements when len(c) does not divide nrow
    for z in range(nrow - nc * len(c)):
        Y[nc * len(c) + z, 0] = X_dummy[nc * len(c) + z, 0] * c[z]
    return Y.reshape(X_ncol, X_nrow).transpose()  # the answer I am looking for
Should work.
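For reference, a vectorized sketch that should give the same result (R_product_vec is a name introduced here for illustration; it relies on column-major flattening plus np.resize recycling):
def R_product_vec(X, c):
    flat = X.ravel(order='F')           # column-major, the way R stores matrices
    Y = flat * np.resize(c, flat.size)  # recycle c along the flattened array
    return Y.reshape(X.shape, order='F')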
I have two very large numpy arrays, which are both 3D. I need to find an efficient way to check if they are overlapping, because turning them both into sets first takes too long. I tried to use another solution I found here for this same problem but for 2D arrays, but I didn't manage to make it work for 3D.
Here is the solution for 2D:
nrows, ncols = A.shape
dtype={'names':['f{}'.format(i) for i in range(ncols)],
       'formats':ncols * [A.dtype]}
C = np.intersect1d(A.view(dtype), B.view(dtype))
# This last bit is optional if you're okay with "C" being a structured array...
C = C.view(A.dtype).reshape(-1, ncols)
(where A and B are the 2D arrays)
I need to find the number of overlapping numpy arrays, but not the specific ones.
We could leverage views using a helper function that I have used across a few Q&As. To check for the presence of subarrays, we could use np.isin on the views, or a more laborious approach with np.searchsorted.
Approach #1 : Using np.isin -
# https://stackoverflow.com/a/45313353/ @Divakar
def view1D(a, b):  # a, b are arrays
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(), b.view(void_dt).ravel()

def isin_nd(a, b):
    # a, b are the 3D input arrays to give us "isin-like" functionality across them
    A, B = view1D(a.reshape(a.shape[0], -1), b.reshape(b.shape[0], -1))
    return np.isin(A, B)
Approach #2 : We could also leverage np.searchsorted upon the views -
def isin_nd_searchsorted(a, b):
    # a, b are the 3D input arrays
    A, B = view1D(a.reshape(a.shape[0], -1), b.reshape(b.shape[0], -1))
    sidx = A.argsort()
    sorted_index = np.searchsorted(A, B, sorter=sidx)
    sorted_index[sorted_index == len(A)] = len(A) - 1
    idx = sidx[sorted_index]
    return A[idx] == B
So, these two solutions give us a mask of the common subarrays (the first over the subarrays of a, the second over those of b). Hence, to get our desired count, it would be isin_nd(a,b).sum() or isin_nd_searchsorted(a,b).sum().
Sample run -
In [71]: # Setup with 3 common "subarrays"
...: np.random.seed(0)
...: a = np.random.randint(0,9,(10,4,5))
...: b = np.random.randint(0,9,(7,4,5))
...:
...: b[1] = a[4]
...: b[3] = a[2]
...: b[6] = a[0]
In [72]: isin_nd(a,b).sum()
Out[72]: 3
In [73]: isin_nd_searchsorted(a,b).sum()
Out[73]: 3
Timings on large arrays -
In [74]: # Setup
...: np.random.seed(0)
...: a = np.random.randint(0,9,(100,100,100))
...: b = np.random.randint(0,9,(100,100,100))
...: idxa = np.random.choice(range(len(a)), len(a)//2, replace=False)
...: idxb = np.random.choice(range(len(b)), len(b)//2, replace=False)
...: a[idxa] = b[idxb]
# Verify output
In [82]: np.allclose(isin_nd(a,b),isin_nd_searchsorted(a,b))
Out[82]: True
In [75]: %timeit isin_nd(a,b).sum()
10 loops, best of 3: 31.2 ms per loop
In [76]: %timeit isin_nd_searchsorted(a,b).sum()
100 loops, best of 3: 1.98 ms per loop
I have two 3D matrices:
a = np.random.normal(size=[3,2,5])
b = np.random.normal(size=[5,2,3])
I want the dot product of each slice along 2 and 0 axes respectively:
c = np.zeros([3,3,5]) # c.size is 45
c[:,:,0] = a[:,:,0].dot(b[0,:,:])
c[:,:,1] = a[:,:,1].dot(b[1,:,:])
...
I would like to do that using np.tensordot (for efficiency and speed)
I have tried:
c = np.tensordot(a, b, axes=[2,0])
but I get a 4D array with 36 elements (instead of 45). c.shape, c.size = ((3L, 2L, 2L, 3L), 36). I have found a similar question here (Numpy tensor: Tensordot over frontal slices of tensor) but it's not exactly what I want, and I was unable to extrapolate that solution to my problem.
To summarise, can I use np.tensordot to compute the c array shown above?
Update #1
The answer by @hpaulj is what I wanted; however, on my system (python 2.7 and np 1.13.3) those approaches are pretty slow:
n = 3000
a = np.random.normal(size=[n, 20, 5])
b = np.random.normal(size=[5, 20, n])
t = time.clock()
c_slice = a[:,:,0].dot(b[0,:,:])
print('one slice_x_5: {:.3f} seconds'.format( (time.clock()-t)*5 ))
t = time.clock()
c = np.zeros([n, n, 5])
for i in range(5):
    c[:,:,i] = a[:,:,i].dot(b[i,:,:])
print('for loop: {:.3f} seconds'.format(time.clock()-t))
t = time.clock()
d = np.einsum('abi,ibd->adi', a, b)
print('einsum: {:.3f} seconds'.format(time.clock()-t))
t = time.clock()
e = np.tensordot(a,b,[1,1])
e1 = e.transpose(0,3,1,2)[:,:,np.arange(5),np.arange(5)]
print('tensordot: {:.3f} seconds'.format(time.clock()-t))
a = a.transpose(2,0,1)
t = time.clock()
f = np.matmul(a,b)
print('matmul: {:.3f} seconds'.format(time.clock()-t))
It's easier to work with einsum than tensordot. So let's start there:
In [469]: a = np.random.normal(size=[3,2,5])
...: b = np.random.normal(size=[5,2,3])
...:
In [470]: c = np.zeros([3,3,5]) # c.size is 45
In [471]: for i in range(5):
     ...:     c[:,:,i] = a[:,:,i].dot(b[i,:,:])
     ...:
In [472]: d = np.einsum('abi,ibd->iad', a, b)
In [473]: d.shape
Out[473]: (5, 3, 3)
In [474]: d = np.einsum('abi,ibd->adi', a, b)
In [475]: d.shape
Out[475]: (3, 3, 5)
In [476]: np.allclose(c,d)
Out[476]: True
I had to think a bit about how to match up the dimensions. It helped to focus on a[:,:,i] as 2d, and similarly for b[i,:,:]. So the dot sum is over the middle dimension of both arrays (size 2).
In testing ideas it might help if the first 2 dimensions of c were different. There'd be less chance of mixing them up.
It's easy to specify the dot summation axis (axes) in tensordot, but harder to constrain the handling of the other dimensions. That's why you get a 4d array.
I can get it to work with a transpose, followed by taking the diagonal:
In [477]: e = np.tensordot(a,b,[1,1])
In [478]: e.shape
Out[478]: (3, 5, 5, 3)
In [479]: e1 = e.transpose(0,3,1,2)[:,:,np.arange(5),np.arange(5)]
In [480]: e1.shape
Out[480]: (3, 3, 5)
In [481]: np.allclose(c,e1)
Out[481]: True
I've calculated a lot more values than needed, and thrown most of them away.
matmul with some transposing might work better.
In [482]: f = a.transpose(2,0,1) @ b
In [483]: f.shape
Out[483]: (5, 3, 3)
In [484]: np.allclose(c, f.transpose(1,2,0))
Out[484]: True
I think of the size-5 dimension as 'going along for the ride'. That's what your loop does. In einsum, the i is the same in all parts.
Let
import numpy as np
A = np.ones([n,m])
B = np.ones([o,n,m])
Is there any way to compute the correlation coefficient without looping, such that
C = corr(A, B)  # one coefficient per slice of B, i.e. shape (o,)
where m, n and o denote dimensions?
Loopy Example:
from scipy.stats.stats import pearsonr
A = np.random.random([5,5])
B = np.random.random([3,5,5])
C = []
for i in B:
    C.append(pearsonr(A.flatten(), i.flatten())[0])
C = np.array(C)
We could use corr2_coeff from this post after reshaping the inputs to 2D versions, such that the first input becomes a one-row array and the second has as many columns as the combined length of its last two axes, like so -
corr2_coeff(A.reshape(1,-1),B.reshape(B.shape[0],-1)).ravel()
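For reference, a sketch of corr2_coeff along the lines of the linked post (row-wise Pearson correlation between the rows of two 2D arrays):
def corr2_coeff(A, B):
    # Row-wise means, subtracted from the inputs
    A_mA = A - A.mean(1)[:, None]
    B_mB = B - B.mean(1)[:, None]
    # Row-wise sums of squares
    ssA = (A_mA**2).sum(1)
    ssB = (B_mB**2).sum(1)
    # Correlation of every row of A with every row of B
    return np.dot(A_mA, B_mB.T) / np.sqrt(np.dot(ssA[:, None], ssB[None]))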
Sample run -
In [143]: from scipy.stats.stats import pearsonr
...:
...: A = np.random.random([5,5])
...: B = np.random.random([3,5,5])
...: C = []
...: for i in B:
...:     C.append(pearsonr(A.flatten(), i.flatten())[0])
...:
...: C = np.array(C)
...:
In [144]: C
Out[144]: array([ 0.05637413, -0.26749579, -0.08957621])
In [145]: corr2_coeff(A.reshape(1,-1),B.reshape(B.shape[0],-1)).ravel()
Out[145]: array([ 0.05637413, -0.26749579, -0.08957621])
For really huge arrays, we might need to resort to a one-loop solution, like so -
[corr2_coeff(A.reshape(1,-1), i.reshape(1,-1)) for i in B]
To get the lowest 10 values of an array X I do something like:
lowest10 = np.argsort(X)[:10]
what is the most efficient way, avoiding loops, to filter the results so that I get the lowest 10 values whose index is not an element of another array Y?
So for example if the array Y is:
[2,20,51]
X[2], X[20] and X[51] shouldn't be taken into consideration to compute the lowest 10.
After some benchmarking here is my humble recommendation:
Swapping out appears to be more or less always faster than masking (even if 99% of X are forbidden.) So use something along the lines of
swap = X[Y]
X[Y] = np.inf
Sorting is expensive, therefore use argpartition and only sort what's necessary. Like
lowest10 = np.argpartition(X, 10)[:10]  # X with the forbidden entries set to inf
lowest10 = lowest10[np.argsort(X[lowest10])]
Here are some benchmarks:
import numpy as np
from timeit import timeit
def swap_out():
    global sol
    swap = X[Y]
    X[Y] = np.inf
    sol = np.argpartition(X, K)[:K]
    sol = sol[np.argsort(X[sol])]
    X[Y] = swap

def app1():
    sidx = X.argsort()
    return sidx[~np.in1d(sidx, Y)][:K]

def app2():
    sidx = np.argpartition(X, range(K + Y.size))
    return sidx[~np.in1d(sidx, Y)][:K]

def app3():
    sidx = np.argpartition(X, K + Y.size)
    return sidx[~np.in1d(sidx, Y)][:K]
K = 10 # number of small elements wanted
N = 10000 # size of X
M = 10 # size of Y
S = 10 # number of repeats in benchmark
X = np.random.random((N,))
Y = np.random.choice(N, (M,))
so = timeit(swap_out, number=S)
print(sol)
print(X[sol])
d1 = timeit(app1, number=S)
d2 = timeit(app2, number=S)
d3 = timeit(app3, number=S)
print('pp', f'{so:8.5f}', ' d1(um)', f'{d1:8.5f}', ' d2', f'{d2:8.5f}', ' d3', f'{d3:8.5f}')
# pp 0.00053 d1(um) 0.00731 d2 0.00313 d3 0.00149
Here's one approach -
sidx = X.argsort()
idx_out = sidx[~np.in1d(sidx, Y)][:10]
Sample run -
# Setup inputs
In [141]: X = np.random.choice(range(60), 60)
In [142]: Y = np.array([2,20,51])
# For testing, let's set the Y positions as 0s and
# we want to see them skipped in o/p
In [143]: X[Y] = 0
# Use proposed approach
In [144]: sidx = X.argsort()
In [145]: X[sidx[~np.in1d(sidx, Y)][:10]]
Out[145]: array([ 0, 2, 4, 5, 5, 9, 9, 10, 12, 14])
# Print the first 13 numbers and skip three 0s and
# that should match up with the output from proposed approach
In [146]: np.sort(X)[:13]
Out[146]: array([ 0, 0, 0, 0, 2, 4, 5, 5, 9, 9, 10, 12, 14])
Alternatively, for performance, we might want to use np.argpartition, like so -
sidx = np.argpartition(X,range(10+Y.size))
idx_out = sidx[~np.in1d(sidx, Y)][:10]
This would be beneficial if the length of X is a much larger number than 10.
If you don't care about the order of the elements in that list of 10 indices, for a further boost, we can simply pass the scalar length instead of a range array to np.argpartition: np.argpartition(X, 10+Y.size).
We can optimize np.in1d with searchsorted to have one more approach (listing next).
Listing below all the discussed approaches in this post -
def app1(X, Y, n=10):
    sidx = X.argsort()
    return sidx[~np.in1d(sidx, Y)][:n]

def app2(X, Y, n=10):
    sidx = np.argpartition(X, range(n+Y.size))
    return sidx[~np.in1d(sidx, Y)][:n]

def app3(X, Y, n=10):
    sidx = np.argpartition(X, n+Y.size)
    return sidx[~np.in1d(sidx, Y)][:n]

def app4(X, Y, n=10):
    n_ext = n + Y.size
    sidx = np.argpartition(X, np.arange(n_ext))[:n_ext]
    ssidx = sidx.argsort()
    mask = np.ones(ssidx.size, dtype=bool)
    search_idx = np.searchsorted(sidx, Y, sorter=ssidx)
    search_idx[search_idx == sidx.size] = 0
    idx = ssidx[search_idx]
    mask[idx[sidx[idx] == Y]] = 0
    return sidx[mask][:n]
You can work on a subset of the original array using numpy.delete():
lowest10 = np.argsort(np.delete(X, Y))[:10]
Since delete builds a new array containing only the kept elements, this only adds a linear-time copy.
Warning: This solution uses a subset of the original X array (X without the elements indexed in Y), so the result is the lowest 10 of that subset, and the returned indices refer to positions within it, not within the original X.
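If indices into the original X are needed, one can map back through the kept positions (a small sketch):
keep = np.setdiff1d(np.arange(len(X)), Y)  # indices of X that survive the delete
sub = X[keep]                              # same elements as np.delete(X, Y)
lowest10 = keep[np.argsort(sub)[:10]]      # indices into the original X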