Suppose A and B are two 4-dimensional NumPy arrays with the same shape.
import numpy as np

A = np.random.rand(5,5,2,10)
B = np.random.rand(5,5,2,10)
a, b, c, d = A.shape
dat = []
for k in range(d):
    total = 0  # renamed from sum to avoid shadowing the builtin
    for l in range(c):
        total = total + np.einsum('ij,ji->', A[:,:,l,k], B[:,:,l,k])
    dat.append(total)
I was wondering whether I can use einsum to replace the inner for loop, maybe even the outer for loop, or some matrix manipulation to replace all of it, because the data set is large.
Is there any faster way to achieve this?
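A minimal sketch of one way to fold both loops into a single einsum call, assuming the arrays fit in memory. The subscript string 'ijlk,jilk->k' is a reading of the loop above: it multiplies A[i,j,l,k] by B[j,i,l,k] and sums over i, j and l, leaving one value per k. The name dat_vec and the seeded generator are only illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((5, 5, 2, 10))
B = rng.random((5, 5, 2, 10))

# Reference: the double loop from the question.
dat = []
for k in range(A.shape[3]):
    total = 0.0
    for l in range(A.shape[2]):
        total += np.einsum('ij,ji->', A[:, :, l, k], B[:, :, l, k])
    dat.append(total)

# Single call: sum over i, j, l of A[i,j,l,k] * B[j,i,l,k], one value per k.
dat_vec = np.einsum('ijlk,jilk->k', A, B)

print(np.allclose(dat, dat_vec))
```

Whether this is actually faster depends on the array sizes, so it is worth timing on the real data.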
Related
Suppose I have two NumPy ndarrays whose elements are matrices. I need element-wise multiplication of these two arrays, but with matrix multiplication between each pair of matrix elements. Of course I could implement this with for loops, but I am looking for a way to solve this without an explicit for loop. How do I implement this?
EDIT: This for-loop does what I want to do. I'm on Python 2.7.
n = np.arange(8).reshape(2,2,1,2)
l = np.arange(1,9).reshape(2,2,2,1)
k = np.zeros((2,2))
for i in range(len(n)):
    for j in range(len(n[i])):
        # .item() replaces np.asscalar, which newer NumPy releases no longer provide
        k[i][j] = n[i][j].dot(l[i][j]).item()
print(k)
Assuming your arrays of matrices are given as (n+2)-dimensional arrays A and B, what you want to achieve is as simple as C = A@B
Example
import numpy as np

outer_dims = 2,3,4
inner_dims = 4,5,6
A = np.random.randint(0,10,(*outer_dims, *inner_dims[:2]))
B = np.random.randint(0,10,(*outer_dims, *inner_dims[1:]))
C = A@B
# check
for I in np.ndindex(outer_dims):
    assert (C[I] == A[I]@B[I]).all()
UPDATE: Py2 version; thanks @hpaulj, Divakar
A = np.random.randint(0,10, outer_dims + inner_dims[:2])
B = np.random.randint(0,10, outer_dims + inner_dims[1:])
C = np.matmul(A,B)
# check
for I in np.ndindex(outer_dims):
    assert (C[I] == np.matmul(A[I],B[I])).all()
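As a quick check against the loop in the question, here is a sketch that applies np.matmul to the question's own n and l arrays. The squeeze call and the name k_vec are additions for illustration; they only drop the trailing 1x1 axes so the result has the same (2, 2) shape as k.

```python
import numpy as np

n = np.arange(8).reshape(2, 2, 1, 2)
l = np.arange(1, 9).reshape(2, 2, 2, 1)

# Loop version from the question, kept as a reference.
k = np.zeros((2, 2))
for i in range(n.shape[0]):
    for j in range(n.shape[1]):
        k[i, j] = n[i, j].dot(l[i, j]).item()

# matmul broadcasts over the two leading axes: (2,2,1,2) @ (2,2,2,1) -> (2,2,1,1)
k_vec = np.matmul(n, l).squeeze(axis=(-2, -1))

print(k_vec)
```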
If I understand correctly, this might work:
import numpy as np
a = np.array([[1,1],[1,0]])
b = np.array([[3,4],[5,4]])
x = np.array([[a,b],[b,a]])
y = np.array([[a,a],[b,b]])
result = np.array([_x @ _y for _x, _y in zip(x,y)])
I am trying to do something similar to searchsorted, but in the case where the array is not completely monotonic. Say I have a scalar c and a 1D array x; I want to find the indices i of all elements such that x[i] < c <= x[i + 1]. Importantly, x is not completely monotonic.
The following code works, but I would like to know whether this is the most efficient way to do this, or whether there is a simpler way:
x = np.array([1,2,3,1,2,3,1,2,3])
c = 2.5
t = c > x[:-1]
u = c <= x[1:]
v = t*u
i = v.nonzero()[0]
Or in one line of code:
i = ((c > x[:-1]) * (c <= x[1:])).nonzero()[0]
Is this the most efficient way to recover these indices?
Two additional questions.
1. Is there an easy way to extend this to the case where c is a 1D array and x is a 2D array, where c has as many elements as "rows" in x, and I perform this search for each element of c in the corresponding "row" of x?
2. My ultimate goal is to do this with a three dimensional case. That is, suppose c is still a 1D vector with n elements. Now, let x be a 3D array, with dimensions j by n by k. Is there a way to do #1 above for each "submatrix" in x? Basically, performing #1 above j times.
For example:
x1 = np.array([[1,2,3,1,2,3],[1,2,3,1,2,3],[1,2,3,1,2,3]])
x2 = x1 + 1
x = np.array([x1,x2])
c = np.array([1.5,2.5,3.5])
Under #1 above, when we compare c and x1, we would get: [[0,3],[1,4],[]]
When we compare c and x2, we would get: [[],[0,3],[1,4]]
Finally, under #2, I would like to get:
[[[0,3],[1,4],[]],
[[],[0,3],[1,4]]]
We could compare once to give us the boolean mask and re-use it with negation to get the other comparison array and also use slicing -
m = c > x
i = np.flatnonzero( m[:-1] & ~m[1:] )
We can extend this to the case of a 2D x and a 1D c with a loop, while keeping the work inside the loop minimal by computing the masks up front in a vectorized manner, like so -
m = c[:,None] > x
m2 = m[:,:-1] & ~m[:,1:]
i = [np.flatnonzero( mi ) for mi in m2]
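The same masking idea appears to carry over to the 3D case from the question (x of shape j by n by k, c with n elements). The sketch below is an assumption about how that extension could look, not part of the answer above; the helper name crossings_3d is made up for illustration.

```python
import numpy as np

def crossings_3d(x, c):
    """For each x[p, r, :], list the indices i with x[p, r, i] < c[r] <= x[p, r, i + 1]."""
    # Broadcast c along the middle axis so row r of every submatrix is compared with c[r].
    m = c[None, :, None] > x
    m2 = m[..., :-1] & ~m[..., 1:]
    # The results are ragged, so they are returned as nested lists of index arrays.
    return [[np.flatnonzero(row) for row in sub] for sub in m2]

x1 = np.array([[1, 2, 3, 1, 2, 3]] * 3)
x = np.array([x1, x1 + 1])
c = np.array([1.5, 2.5, 3.5])

out = crossings_3d(x, c)
for sub in out:
    print([idx.tolist() for idx in sub])
```

Only the final flatnonzero step loops in Python; the comparisons themselves are done once on the whole array.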
For such a task, numpy makes too many comparisons. You can gain about a factor of 5 with Numba. It is not difficult to adapt this to 3 dimensions.
import numba
import numpy as np

@numba.njit
def ind(x,c):
    # integer index buffer, independent of x's dtype
    res = np.empty(x.size, dtype=np.intp)
    i=j=0
    while i < x.size-1:
        if x[i]<c and c<=x[i+1]:
            res[j]=i
            j+=1
        i+=1
    return res[:j]
I have to evaluate the following expression, given two quite large matrices A,B and a very complicated function F:
The mathematical expression
I was wondering whether there is an efficient way to first find the indices i,j that give a non-zero element after the multiplication of the matrices, so that I can avoid the quite slow 'for loops'.
Current working code
import numpy as np

# Starting with 4 random matrices
A = np.random.randint(0,2,size=(50,50))
B = np.random.randint(0,2,size=(50,50))
C = np.random.randint(0,2,size=(50,50))
D = np.random.randint(0,2,size=(50,50))
indices = []
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        if A[i,j] != 0:
            for k in range(B.shape[1]):
                if B[j,k] != 0:
                    for l in range(C.shape[1]):
                        if A[i,j]*B[j,k]*C[k,l]*D[l,i]!=0:
                            indices.append((i,j,k,l))
print(indices)
As you can see, in order to get the indices I need I have to use nested loops (= huge computational time).
My guess would be NO: you cannot avoid the for-loops. In order to find all the indices i, j you need to loop through all the elements, which defeats the purpose of this check. Therefore, you should go ahead and use simple elementwise array multiplication and the dot product in numpy; it should be quite fast, with the for loops taken care of by numpy.
However, if you plan on using a Python loop then the answer is YES, you can avoid it by using numpy, with the following pseudo-code (=hand-waving):
i, j = np.indices((N, M)) # CAREFUL: you may need to swap i<->j or N<->M
fs = F(i, j, z) # array of values of function F
# for a given z over the index grid
R = np.dot(A*fs, B) # summation over j
# return R # if necessary do a summation over i: np.sum(R, axis=...)
If the issue is that computing fs = F(i, j, z) is a very slow operation, then you will have to identify the elements of A that are non-zero using the two loops built into numpy (so they are quite fast):
good = np.nonzero(A) # hidden double loop (for 2D data)
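fs = np.zeros(A.shape, dtype=float) # float buffer (assumes F returns real values); zeros_like(A) would inherit A's integer dtype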
fs[good] = F(i[good], j[good], z) # compute F only where A != 0
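For the index search itself (the nested loops in the question), one possible vectorized sketch uses broadcasting to build a 4D boolean mask and np.argwhere to read off the index tuples. This is an alternative under stated assumptions, not what the answer above proposes: it materializes an N^4 boolean array (about 6 million entries for 50x50 inputs), so it only suits moderate sizes. The function name nonzero_index_tuples is hypothetical, and small 6x6 inputs are used to keep the example quick.

```python
import numpy as np

def nonzero_index_tuples(A, B, C, D):
    """Return all (i, j, k, l) with A[i,j]*B[j,k]*C[k,l]*D[l,i] != 0."""
    # The product is non-zero only where every factor is non-zero.
    mask = (
        (A[:, :, None, None] != 0)      # axes (i, j, 1, 1)
        & (B[None, :, :, None] != 0)    # axes (1, j, k, 1)
        & (C[None, None, :, :] != 0)    # axes (1, 1, k, l)
        & (D.T[:, None, None, :] != 0)  # D[l, i] viewed as axes (i, 1, 1, l)
    )
    # Rows come out in the same (i, j, k, l) order as the nested loops.
    return np.argwhere(mask)

rng = np.random.default_rng(0)
A, B, C, D = (rng.integers(0, 2, size=(6, 6)) for _ in range(4))
indices = nonzero_index_tuples(A, B, C, D)
print(indices.shape)
```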
I am trying to do some linear combination of numpy arrays.
I have three lists of numpy arrays:
a = [np.random.normal(0,1, [1,2]), np.random.normal(0,1, [3,4]), np.random.normal(0,1, [10,11])]
b = [np.random.normal(0,1, [1,2]), np.random.normal(0,1, [3,4]), np.random.normal(0,1, [10,11])]
c = [np.random.normal(0,1, [1,2]), np.random.normal(0,1, [3,4]), np.random.normal(0,1, [10,11])]
I want to combine the arrays in lists a and b element-wise, weighted by the corresponding elements of c, to get a new list d: say d_i = a_i * c_i + (1 - c_i) * b_i (a linear combination).
What I thought was to pick each element in each array in a and find corresponding elements in b and c and then combine. However, I found this is troublesome, inefficient and a bit stupid. Could anyone suggest a better way?
Well assuming all of your lists are the same length then I don't think that there is going to be anything much more efficient than
d = [a[i] * c[i] + (1-c[i]) * b[i] for i in range(len(a))]
Now if all you need to do is iterate over d one time, then maybe you could save some work with a generator expression, which computes each element lazily?
d = (a[i] * c[i] + (1-c[i]) * b[i] for i in range(len(a)))
But at the end of the day there is no way to create a linear combination of elements in less than linear time.
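For reference, the same comprehension can be written with zip, which avoids indexing. Each a_i * c_i + (1 - c_i) * b_i is already a vectorized NumPy expression, so the only Python-level loop is the one over the list. A small sketch, where the seeded generator and the shapes list are illustrative additions:

```python
import numpy as np

rng = np.random.default_rng(0)
shapes = [(1, 2), (3, 4), (10, 11)]
a = [rng.normal(0, 1, s) for s in shapes]
b = [rng.normal(0, 1, s) for s in shapes]
c = [rng.normal(0, 1, s) for s in shapes]

# One vectorized expression per triple of same-shaped arrays.
d = [ai * ci + (1 - ci) * bi for ai, bi, ci in zip(a, b, c)]

print([di.shape for di in d])
```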
How should I compare more than 2 numpy arrays?
import numpy
a = numpy.zeros((512,512,3),dtype=numpy.uint8)
b = numpy.zeros((512,512,3),dtype=numpy.uint8)
c = numpy.zeros((512,512,3),dtype=numpy.uint8)
if (a==b==c).all():
    pass
This gives a ValueError, and I am not interested in comparing arrays two at a time.
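The ValueError comes from how Python evaluates chained comparisons: a == b == c means (a == b) and (b == c), and the "and" asks for the truth value of a whole boolean array. A short sketch showing the failure and an elementwise alternative using &; the smaller shapes and the names chained_ok and all_same are only illustrative:

```python
import numpy as np

a = np.zeros((4, 4, 3), dtype=np.uint8)
b = np.zeros((4, 4, 3), dtype=np.uint8)
c = np.zeros((4, 4, 3), dtype=np.uint8)

# The chained form needs bool(a == b), which is ambiguous for a multi-element array.
try:
    (a == b == c).all()
    chained_ok = True
except ValueError:
    chained_ok = False

# Elementwise alternative: combine the two boolean arrays with & before reducing.
all_same = bool(((a == b) & (b == c)).all())

print(chained_ok, all_same)
```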
For three arrays, you can check for equality among the corresponding elements between the first and second arrays and then second and third arrays to give us two boolean scalars and finally see if both of these scalars are True for final scalar output, like so -
np.logical_and( (a==b).all(), (b==c).all() )
For a larger number of arrays, you could stack them, take the differences along the stacking axis and check whether all of those differences are zero. If they are, all input arrays are equal; otherwise they are not. The implementation would look like so -
L = [a,b,c] # List of input arrays
out = (np.diff(np.vstack(L).reshape(len(L),-1),axis=0)==0).all()
For three arrays, you should really just compare them two at a time:
if np.array_equal(a, b) and np.array_equal(b, c):
    do_whatever()
For a variable number of arrays, let's suppose they're all combined into one big array called arrays. Then you could do
if np.all(arrays[:-1] == arrays[1:]):
    do_whatever()
To expand on previous answers, I would use combinations from itertools to construct all pairs, then run your comparison on each pair. For example, if I have three arrays and want to confirm that they're all equal, I'd use:
from itertools import combinations
for pair in combinations([a, b, c], 2):
    assert np.array_equal(pair[0], pair[1])
solution supporting different shapes and nans
compare against first element of array-list:
import numpy as np
a = np.arange(3)
b = np.arange(3)
c = np.arange(3)
d = np.arange(4)
lst_eq = [a, b, c]
lst_neq = [a, b, d]
def all_equal(lst):
    for arr in lst[1:]:
        if not np.array_equal(lst[0], arr, equal_nan=True):
            return False
    return True
print('all_equal(lst_eq)=', all_equal(lst_eq))
print('all_equal(lst_neq)=', all_equal(lst_neq))
output
all_equal(lst_eq)= True
all_equal(lst_neq)= False
for equal shape and without nan-support
Combine everything into one array, calculate the absolute difference along the new axis and check whether the maximum element is equal to 0 or lower than some threshold. This should be quite fast.
import numpy as np
a = np.arange(3)
b = np.arange(3)
c = np.arange(3)
d = np.array([0, 1, 3])
lst_eq = [a, b, c]
lst_neq = [a, b, d]
def all_equal(lst, threshold = 0):
    arr = np.stack(lst, axis=0)
    return np.max(np.abs(np.diff(arr, axis=0))) <= threshold
print('all_equal(lst_eq)=', all_equal(lst_eq))
print('all_equal(lst_neq)=', all_equal(lst_neq))
output
all_equal(lst_eq)= True
all_equal(lst_neq)= False
This might work.
import numpy as np
x = np.random.rand(10)
arrays = [x for _ in range(10)]
print(np.allclose(arrays[:-1], arrays[1:])) # True
arrays.append(np.random.rand(10))
print(np.allclose(arrays[:-1], arrays[1:])) # False
one-liner solution:
arrays = [a, b, c]
all([np.array_equal(a, b) for a, b in zip(arrays, arrays[1:])])
This tests the equality of each consecutive pair of arrays.