Is there a way to efficiently compare multiple arrays which are broadcast together? For example:
a = np.arange( 0, 9).reshape(3,3)
b = np.arange( 9, 18).reshape(3,3)
c = np.arange(18, 27).reshape(3,3)
If I were to broadcast these as follows:
abc = a[:,:,None,None,None,None] + b[None,None,:,:,None,None] + c[None,None,None,None,:,:]
Then each element of abc is equal to a_ij + b_kl + c_mn where ij, kl, and mn index the respective arrays. What I would like instead is to get min(a_ij, b_kl, c_mn), or ideally, max(a_ij, b_kl, c_mn) - min(a_ij, b_kl, c_mn). Is there an efficient way in which I can do this?
I could, of course, broadcast temporary arrays as:
Abc = a[:,:,None,None,None,None] + 0 * b[None,None,:,:,None,None] + 0 * c[None,None,None,None,:,:]
aBc = 0 * a[:,:,None,None,None,None] + b[None,None,:,:,None,None] + 0 * c[None,None,None,None,:,:]
abC = 0 * a[:,:,None,None,None,None] + 0 * b[None,None,:,:,None,None] + c[None,None,None,None,:,:]
and then find the min/max from these arrays. However, these arrays can get quite large, so it would be better if there were some way to do it in one step.
As an additional note, these arrays are guaranteed to be broadcastable, but they do not necessarily have the same shape (for example, (1, 3) and (3, 3)).
You can store an intermediate array (smaller than your final result anyway) by doing the operation on a and b:
temp = np.minimum.outer(a.ravel(), b.ravel())
res = np.minimum.outer(temp.ravel(), c.ravel())
The first line combines a and b; the second repeats the same operation with c. np.minimum computes the element-wise minimum of two arrays. Since it is a ufunc, you can use its outer method to apply that operation to all pairs of values from the two arrays.
You can reshape res as you prefer.
Edit #1
Thanks to P. Panzer's comment: you do not need to pass 1D arrays to ufunc.outer, which results in even simpler code:
temp = np.minimum.outer(a, b)
res = np.minimum.outer(temp, c)
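The question also asks for max(a_ij, b_kl, c_mn) - min(a_ij, b_kl, c_mn). A minimal sketch of that, assuming the same a, b and c as in the question and that np.maximum.outer is used in the same way as np.minimum.outer (the names lo, hi and spread are illustrative, not from the original answer):

```python
import numpy as np

a = np.arange(0, 9).reshape(3, 3)
b = np.arange(9, 18).reshape(3, 3)
c = np.arange(18, 27).reshape(3, 3)

# Element-wise min and max over every (ij, kl, mn) combination.
lo = np.minimum.outer(np.minimum.outer(a, b), c)
hi = np.maximum.outer(np.maximum.outer(a, b), c)

# spread[i, j, k, l, m, n] == max(a_ij, b_kl, c_mn) - min(a_ij, b_kl, c_mn)
spread = hi - lo
```

Because outer pairs every element of one array with every element of the other, the result has shape a.shape + b.shape + c.shape, here (3, 3, 3, 3, 3, 3). Two full-size arrays (lo and hi) are still held in memory before the subtraction.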
Suppose A and B are two 4-dimensional NumPy arrays with the same shape.
A = np.random.rand(5,5,2,10)
B = np.random.rand(5,5,2,10)
a, b, c, d = A.shape
dat = []
for k in range(d):
    sum = 0
    for l in range(c):
        sum = sum + np.einsum('ij,ji->', A[:,:,l,k], B[:,:,l,k])
    dat.append(sum)
I was wondering whether I can use einsum to replace the inner for loop, maybe even the outer for loop, or use some matrix manipulation to replace all of it, because the data set is large.
Is there any faster way to achieve this?
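One possible approach, sketched here under the assumption that A and B have the shapes shown in the question: both loops can be folded into a single einsum call by giving the two trailing axes their own subscripts and keeping only the last one in the output. The names dat_loop and total are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((5, 5, 2, 10))
B = rng.random((5, 5, 2, 10))

# Loop version from the question, kept as a reference.
n_l, n_k = A.shape[2], A.shape[3]
dat_loop = []
for k in range(n_k):
    total = 0.0
    for l in range(n_l):
        total += np.einsum('ij,ji->', A[:, :, l, k], B[:, :, l, k])
    dat_loop.append(total)

# Single call: multiply A[i, j, l, k] by B[j, i, l, k],
# sum over i, j and l, and keep k.
dat = np.einsum('ijlk,jilk->k', A, B)
```

The swapped 'ij' / 'ji' subscripts reproduce the transpose in the original inner einsum, and leaving l out of the output performs the inner sum.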
Suppose I have two NumPy ndarrays whose elements are matrices. I need element-wise multiplication of these two arrays, but with matrix multiplication between each pair of matrix elements. Of course I could implement this with for loops, but I am looking for a solution without an explicit for loop. How do I implement this?
EDIT: This for loop does what I want to do. I'm on Python 2.7.
n = np.arange(8).reshape(2,2,1,2)
l = np.arange(1,9).reshape(2,2,2,1)
k = np.zeros((2,2))
for i in range(len(n)):
    for j in range(len(n[i])):
        k[i][j] = np.asscalar(n[i][j].dot(l[i][j]))
print k
Assuming your arrays of matrices are given as (n+2)-dimensional arrays A and B, what you want to achieve is as simple as C = A@B
Example
outer_dims = 2,3,4
inner_dims = 4,5,6
A = np.random.randint(0,10,(*outer_dims, *inner_dims[:2]))
B = np.random.randint(0,10,(*outer_dims, *inner_dims[1:]))
C = A@B
# check
for I in np.ndindex(outer_dims):
    assert (C[I] == A[I]@B[I]).all()
UPDATE: Py2 version; thanks @hpaulj, Divakar
A = np.random.randint(0,10, outer_dims + inner_dims[:2])
B = np.random.randint(0,10, outer_dims + inner_dims[1:])
C = np.matmul(A,B)
# check
for I in np.ndindex(outer_dims):
    assert (C[I] == np.matmul(A[I],B[I])).all()
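Applied to the n and l arrays from the question, the same idea might look like the sketch below. It assumes a NumPy version that provides np.matmul; indexing away the two trailing axes is an illustrative choice for turning the 1x1 products into scalars.

```python
import numpy as np

n = np.arange(8).reshape(2, 2, 1, 2)      # 2x2 grid of 1x2 matrices
l = np.arange(1, 9).reshape(2, 2, 2, 1)   # 2x2 grid of 2x1 matrices

# matmul multiplies the trailing 2-D matrices and broadcasts over the rest.
prod = np.matmul(n, l)                    # shape (2, 2, 1, 1)

# Each product is 1x1, so drop the two trailing axes to get a 2x2 result.
k = prod[..., 0, 0]
print(k)
```

This yields the same 2x2 array as the question's double loop, without calling np.asscalar.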
If I understand correctly, this might work:
import numpy as np
a = np.array([[1,1],[1,0]])
b = np.array([[3,4],[5,4]])
x = np.array([[a,b],[b,a]])
y = np.array([[a,a],[b,b]])
result = np.array([_x @ _y for _x, _y in zip(x,y)])
This is my code, working with dim=3, but I would like it to work for any dimensionality without having to manually edit code.
Eventually I would like to be able to vary the dimensionality between 3 and 20 without having to add for loops manually.
I was looking at itertools, but I don't know how to select the correct values from the tuples created by itertools.product() to square and add up for my if statement.
arrayshape = (width * 2 + 1,) * dim
funcspace = np.zeros(shape=arrayshape, dtype='b')
x1 = list(range(-int(width), int(width + 1)))
x2 = x1
x3 = x1
for i in range(len(x1)):
    for j in range(len(x2)):
        for k in range(len(x3)):
            if round(np.sqrt(x1[i] ** 2 + x2[j] ** 2 + x3[k] ** 2)) in ranges:
                funcspace[i][j][k] = 1
You can use itertools.product on the enumerate of each of your vectors, which yields both the index and the value:
for ((i,v1),(j,v2),(k,v3)) in itertools.product(enumerate(x1),enumerate(x2),enumerate(x3)):
    if round(np.sqrt(v1**2+v2**2+v3**2)) in ranges:
        funcspace[i][j][k]=1
As a bonus, you get rid of the unpythonic range(len()) construct.
I've cooked up a more general case for when you have a list of vectors (x_list). It's a little harder to read because the unpacking isn't done in the for statement.
The sum of squares is computed with sum over the element at index 1 of each pair (the values). If the condition matches, we walk down through the nested dimensions until we reach the innermost list, then set the value to 1.
for t in itertools.product(*(enumerate(x) for x in x_list)):
    # compute the sum of squared values
    sqsum = sum(v[1]**2 for v in t)
    if round(np.sqrt(sqsum)) in ranges:
        # traverse the dimensions except the last one
        deeper_list = funcspace
        for i in range(len(t)-1):
            deeper_list = deeper_list[t[i][0]]
        # set the flag using the last dimension list
        deeper_list[t[-1][0]] = 1
As noted in the comments, since x1 seems to be repeated, you can replace the first statement with:
for t in itertools.product(enumerate(x1), repeat=dim):
Another comment points out that, since funcspace is a NumPy ndarray, we can simplify the "set to 1" loop by indexing with the tuple of indexes:
funcspace[tuple(x[0] for x in t)] = 1
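Since funcspace is a NumPy array, the Python-level loop over every grid point could also be replaced by broadcasting. The following is only a sketch: shell_mask is a hypothetical helper name, width is assumed to be an integer, and ranges is assumed to be a collection of integer radii.

```python
import numpy as np

def shell_mask(width, dim, ranges):
    """Hypothetical helper: flag grid points whose rounded radius is in `ranges`."""
    axis = np.arange(-width, width + 1)
    sq = np.zeros((axis.size,) * dim)
    for d in range(dim):
        shape = [1] * dim
        shape[d] = axis.size
        # Broadcast the squared coordinates of axis d over the whole grid.
        sq += (axis ** 2).reshape(shape)
    radius = np.rint(np.sqrt(sq)).astype(int)
    return np.isin(radius, list(ranges)).astype('b')

funcspace = shell_mask(width=3, dim=3, ranges={2, 3})
```

The only remaining loop runs over the dim axes, not over the grid points. The full grid still holds (2 * width + 1) ** dim entries, so memory limits the dimensionality well before 20 for any non-trivial width.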
I am trying to do something similar to searchsorted, but for an array that is not completely monotonic. Say I have a scalar c and a 1D array x. I want to find the indices i of all elements such that x[i] < c <= x[i + 1]. Importantly, x is not completely monotonic.
The following code works, but I would like to know whether this is the most efficient way to do it, or whether there is a simpler way:
x = np.array([1,2,3,1,2,3,1,2,3])
c = 2.5
t = c > x[:-1]
u = c <= x[1:]
v = t*u
i = v.nonzero()[0]
Or in one line of code:
i = ((c > x[:-1]) * (c <= x[1:])).nonzero()[0]
Is this the most efficient way to recover these indices?
Two additional questions.
1. Is there an easy way to extend this to the case where c is a 1D array and x is a 2D array, where c has as many elements as "rows" in x, and I perform this search for each element of c in the corresponding "row" of x?
2. My ultimate goal is to do this with a three-dimensional case. That is, suppose c is still a 1D vector with n elements. Now, let x be a 3D array, with dimensions j by n by k. Is there a way to do #1 above for each "submatrix" in x? Basically, performing #1 above j times.
For example:
x1 = np.array([[1,2,3,1,2,3],[1,2,3,1,2,3],[1,2,3,1,2,3]])
x2 = x1 + 1
x = np.array([x1,x2])
c = np.array([1.5,2.5,3.5])
Under #1 above, when we compare c and x1, we would get: [[0,3],[1,4],[]]
When we compare c and x2, we would get: [[],[0,3],[1,4]]
Finally, under #2, I would like to get:
[[[0,3],[1,4],[]],
[[],[0,3],[1,4]]]
We could compare once to get a boolean mask, then reuse its negation as the other comparison and slice both -
m = c > x
i = np.flatnonzero( m[:-1] & ~m[1:] )
We can extend it to the case of a 2D x and a 1D c with a loop, while keeping the work inside the loop minimal by computing the masks up front in a vectorized manner, like so -
m = c[:,None] > x
m2 = m[:,:-1] & ~m[:,1:]
i = [np.flatnonzero( mi ) for mi in m2]
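The same mask trick appears to carry over to the 3D case from the question. A sketch, assuming x has shape (j, n, k) and c has length n, so that c is aligned with the middle axis:

```python
import numpy as np

x1 = np.array([[1, 2, 3, 1, 2, 3]] * 3)
x = np.array([x1, x1 + 1])            # shape (j, n, k) = (2, 3, 6)
c = np.array([1.5, 2.5, 3.5])         # one value per row, length n

m = c[None, :, None] > x              # m[s, r, i] is c[r] > x[s, r, i]
m2 = m[..., :-1] & ~m[..., 1:]        # x[s, r, i] < c[r] <= x[s, r, i+1]

# The result is ragged, so the indices are collected in nested lists.
i = [[np.flatnonzero(row) for row in sub] for sub in m2]
```

Only the final index extraction loops in Python; both comparisons are computed once for the whole array.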
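NumPy makes more comparisons than necessary for this kind of task. You can gain roughly a 5x speedup with Numba, and it is not difficult to adapt this to 3 dimensions.
import numba
import numpy as np

@numba.njit
def ind(x, c):
    # indices are integers, whatever the dtype of x
    res = np.empty(x.size, dtype=np.int64)
    i = j = 0
    while i < x.size - 1:
        if x[i] < c and c <= x[i+1]:
            res[j] = i
            j += 1
        i += 1
    return res[:j]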
What is the best way to exclude exactly one NumPy array entry from an operation?
I have an array x containing n values and want to exclude the i-th entry when I call numpy.prod(x). I know about MaskedArray, but is there another/better way?
I think the simplest would be
np.prod(x[:i]) * np.prod(x[i+1:])
This should be fast and also works when you don't want to or can't modify x.
And in case x is multidimensional and i is a tuple:
x_f = x.ravel()
i_f = np.ravel_multi_index(i, x.shape)
np.prod(x_f[:i_f]) * np.prod(x_f[i_f+1:])
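As a small worked check of the multidimensional variant (the array and index below are made up for illustration):

```python
import numpy as np

x = np.arange(1, 7).reshape(2, 3)   # [[1, 2, 3], [4, 5, 6]]
i = (1, 0)                          # skip the entry holding 4

x_f = x.ravel()
i_f = np.ravel_multi_index(i, x.shape)   # flat position of x[i]
result = np.prod(x_f[:i_f]) * np.prod(x_f[i_f + 1:])
```

Here i_f is 3, so the two slices hold [1, 2, 3] and [5, 6], and result is 6 * 30 = 180, which is 720 / 4. If the excluded entry is the first or last one, one slice is empty and np.prod returns 1 for it, so the formula still holds.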
You could use np.delete, which returns a copy of a one-dimensional array with the given element removed:
import numpy as np
x = np.arange(1, 5)
i = 2
y = np.prod(np.delete(x, i)) # gives 8
I don't think there is any better way, honestly. Even without knowing the NumPy functions, I would do it like:
# I assume x is an array of length n
temp = x[i]  # where i is the index of the value you don't want to change
x = x * 5
#...do whatever with the array...
x[i] = temp
If I understand correctly, your problem is one dimensional? Even if not, you can do this the same way.
EDIT:
I checked the prod function, and in this case I think you can just replace the value you don't want to use with 1 (using the temp approach above) and put the original value back afterwards. It is just an in-place change, so it's fairly efficient. The second option is to divide the result by x[i] (assuming it's not 0, as commenters said).
As np.prod is taking the product of all the elements in an array, if we want to exclude one element from the solution, we can set that element to 1 first in order to ignore it (as p * 1 = p).
So:
>>> n = 10
>>> x = np.arange(10)
>>> i = 0
>>> x[i] = 1
>>> np.prod(x)
362880
which, we can see, works:
>>> 1 * 2 * 3 * 4 * 5 * 6 * 7 * 8 * 9
362880
You could use a list comprehension to index all the points but 1:
i = 2
np.prod(x[[val for val in range(len(x)) if val != i]])
or use a set difference:
np.prod(x[list(set(range(len(x))) - {i})])