I have a source multidimensional array of shape (a, b, c, c, d) which stores vectors of size d, and another array of shape (a, b, e, 2) that stores e pairs of indices. Each pair indexes into axes 2 and 3 of the data array (both of size c). Note that both arrays share the same a, b dimension sizes.
What I want to do is use these indices to retrieve rows of size d from the first array, so that the output array has shape (a, b, e, d), i.e. e vectors of size d along the a, b dimensions.
import numpy as np

a, b, c, d = 3, 5, 7, 9
e = 11
data = np.random.rand(a, b, c, c, d)
inds = np.random.randint(0, c, size=(a, b, e, 2))
res = data[:, :, inds[:, :, :, 0], inds[:, :, :, 1], :]
print(' - Obtained shape:', res.shape)
print(' - Desired shape:', (a,b,e,d))
# - Obtained shape: (3, 5, 3, 5, 11, 9)
# - Desired shape: (3, 5, 11, 9)
The only way I can think of right now is to enforce full fancy indexing by generating range-like indices for all three leading dimensions:
import numpy as np
rng = np.random.default_rng()
a, b, c, d = 3, 5, 7, 9
e = 11
data = rng.uniform(size=(a, b, c, c, d))
inds = rng.integers(0, c, size=(a, b, e, 2))
# open (ogrid) index meshes broadcast instead of materializing full index arrays
aind, bind, _ = np.ogrid[:a, :b, :e]
res = data[aind, bind, inds[..., 0], inds[..., 1], :]
print(' - Obtained shape:', res.shape)
print(' - Desired shape:', (a, b, e, d))
Random check to see that the values are correct too:
sample_index_pos = (1, 1, 8) # <-> (a, b, e)
c_inds = inds[sample_index_pos]  # a pair of indices into the two c-sized axes
expected = data[sample_index_pos[:2] + tuple(c_inds)]
have = res[sample_index_pos]
print(np.array_equal(expected, have))
# True
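An alternative (under the same shapes as above) is to collapse the two c-sized axes into one and use np.take_along_axis with a single flattened index; a minimal sketch, cross-checked against the ogrid approach:

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, c, d = 3, 5, 7, 9
e = 11
data = rng.uniform(size=(a, b, c, c, d))
inds = rng.integers(0, c, size=(a, b, e, 2))

# Collapse the two c-sized axes into one and build a single flat index.
flat = inds[..., 0] * c + inds[..., 1]        # shape (a, b, e)
data2 = data.reshape(a, b, c * c, d)          # shape (a, b, c*c, d)

# The trailing singleton of the index broadcasts against the d axis.
res = np.take_along_axis(data2, flat[..., None], axis=2)  # shape (a, b, e, d)

# Cross-check against the ogrid-based fancy indexing.
aind, bind, _ = np.ogrid[:a, :b, :e]
ref = data[aind, bind, inds[..., 0], inds[..., 1], :]
print(np.array_equal(res, ref))  # True
```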
Related
I have a tensor A with shape (NB, N, 2, 2).
If I have a list B, consisting of indices with length NB that I want to keep in tensor A, how should I do that?
That is to say, I want to keep 1 (out of N) element per batch, based on the indices in B.
I can get it done with a for loop specifying the batch i in A and the i-th element in B. But is there a vectorized way to do it?
I tried A[B] or A[B.unsqueeze(1)], both had index errors. And A[:, B] would return NB elements for every batch.
Example:
A = Tensor([[[a 2x2 mat AAA1], [a 2x2 mat BBB1], [a 2x2 mat CCC1], [a 2x2 mat DDD1]],
[[a 2x2 mat AAA2], [a 2x2 mat BBB2], [a 2x2 mat CCC2], [a 2x2 mat DDD2]],
[[a 2x2 mat AAA3], [a 2x2 mat BBB3], [a 2x2 mat CCC3], [a 2x2 mat DDD3]]
])
B = [1, 3, 0]
Expected output:
Tensor([[[a 2x2 mat BBB1]],
[[a 2x2 mat DDD2]],
[[a 2x2 mat AAA3]]
])
torch.gather comes to the rescue.
Prepare your index list like
# A.shape = (NB, N, 2, 2)
B = torch.tensor([1, 3, 0]) # should be of length NB
B = B[:, None, None, None].repeat(1,  # your actual indices in the batch dim
                                  1,  # indexing dim, kept at 1
                                  2,  # these two must be repeated
                                  2)  # to match A's trailing (2, 2) dims
And finally, use gather like this
torch.gather(A, 1, B) # indexing along '1'-th dim
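torch.gather mirrors NumPy's np.take_along_axis, so the same batched selection can be sketched in NumPy (shapes as in the example above; the small A here is just illustrative data):

```python
import numpy as np

NB, N = 3, 4
# A: (NB, N, 2, 2); each A[i, j] is one 2x2 matrix.
A = np.arange(NB * N * 4).reshape(NB, N, 2, 2)
B = np.array([1, 3, 0])  # one index per batch, length NB

# Expand B to (NB, 1, 2, 2) so it matches A's ndim, then gather along axis 1.
idx = np.broadcast_to(B[:, None, None, None], (NB, 1, 2, 2))
out = np.take_along_axis(A, idx, axis=1)  # shape (NB, 1, 2, 2)

# Equivalent one-liner with integer-array indexing (drops the kept dim):
out2 = A[np.arange(NB), B]  # shape (NB, 2, 2)
print(np.array_equal(out[:, 0], out2))  # True
```

In PyTorch the same one-liner is `A[torch.arange(NB), B]`, if you don't need the singleton indexing dimension that gather preserves.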
I'm currently struggling with a probably rather simple question, but I can't get my head around it.
Assuming I have the following two 2D arrays with different shapes, I can combine them into a new array using:
a = np.zeros((2, 3))
b = np.zeros((4, 5))
c = np.array([a, b], dtype=object)  # dtype=object is needed for ragged arrays in recent NumPy
print(c.shape)
# Output
# (2,)
for elements in c:
    print(elements.shape)
# Output:
# (2, 3)
# (4, 5)
So far so good!
But how would I do this if I have a large list where I'd have to iterate over? Here is a simple example with just 4 different 2d arrays:
This works as expected:
a = np.zeros((2,3))
b = np.zeros((4,5))
c = np.zeros((6,7))
d = np.zeros((8,9))
e = np.array([a, b, c, d], dtype=object)
print(e.shape)
# Output
# (4,)
for elements in e:
    print(elements.shape)
# Output
# (2, 3)
# (4, 5)
# (6, 7)
# (8, 9)
This doesn't work as expected and my question would be how to do this in an iterative way:
a = np.zeros((2,3))
b = np.zeros((4,5))
c = np.zeros((6,7))
d = np.zeros((8,9))
e = None
for elements in [a, b, c, d]:
    e = np.array([e, elements], dtype=object)
print(e.shape)
# Output
# (2,) <--- This should be (4,) as in the upper example, but I don't know how to achieve that :-/
for elements in e:
    print(elements.shape)
# (2,)
# (8, 9)
I understand that in each iteration I'm just combining two arrays, which is why it always stays at a shape of (2,), but I wonder how this can be done in an elegant way.
So basically I want a leading dimension which states the count of arrays that are stored. E.g. if I iterate over 1000 different 2D arrays, I'd expect a shape of (1000,).
Hope my question is understandable - if not let me know!
Thanks a lot!
If I understood your issue correctly, you can achieve what you want by collecting the arrays in a list and converting once at the end. This yields the exact same result as the code above that you described as working.
a = np.zeros((2,3))
b = np.zeros((4,5))
c = np.zeros((6,7))
d = np.zeros((8,9))
e = np.array([element for element in [a, b, c, d]], dtype=object)
print(e.shape)
for elements in e:
    print(elements.shape)
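If the arrays really do arrive one at a time, appending to a plain Python list and converting once at the end scales to any count; a minimal sketch with a hypothetical stream of 1000 differently-shaped arrays:

```python
import numpy as np

# Hypothetical stream of 1000 differently-shaped 2D arrays.
arrays = [np.zeros((i + 2, i + 3)) for i in range(1000)]

collected = []
for arr in arrays:  # arrays can arrive one at a time
    collected.append(arr)

# One conversion at the end; dtype=object keeps the ragged shapes intact.
e = np.empty(len(collected), dtype=object)
e[:] = collected
print(e.shape)  # (1000,)
```

The empty-then-assign idiom is deliberate: np.array(collected, dtype=object) would try to stack the elements into a multidimensional array if they happened to share a shape, while the explicit (1000,) object container always stays one-dimensional.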
I have 4 arrays, A, B, C, D. A and B have shape (n, n), and C and D have shape (n, n, m). I am trying to set it up so that when an element of A is greater than the corresponding element of B, the corresponding length-m vector comes from C. In essence:
C_new = np.where(A > B, C, D), D_new = np.where(A < B, D, C). However, this gives me a ValueError (operands could not be broadcast together with shapes).
I am curious if I can use where here instead of just looping through each element?
Edit: example:
A = np.ones((2,2))
B = 2*np.eye(2)
C = np.ones((2,2,3))
D = np.zeros((2,2,3))
# Cnew = np.where(A > B, C,D)-> ValueError: operands could not be broadcast together with shapes (2,2) (2,2,3) (2,2,3)
The Cnew would be zeros in the (0,0) and (1,1) index.
You need to add a new axis at the end of the condition in order for it to broadcast correctly:
C_new = np.where((A > B)[..., np.newaxis], C, D)
D_new = np.where((A < B)[..., np.newaxis], D, C)
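Applied to the example from the question, this can be verified directly; the diagonal positions (where A > B is False) pick up the zeros from D:

```python
import numpy as np

A = np.ones((2, 2))
B = 2 * np.eye(2)        # A > B is False on the diagonal, True off it
C = np.ones((2, 2, 3))
D = np.zeros((2, 2, 3))

# (A > B) has shape (2, 2); the trailing new axis makes it (2, 2, 1),
# which broadcasts against C and D of shape (2, 2, 3).
C_new = np.where((A > B)[..., np.newaxis], C, D)
print(C_new[0, 0], C_new[1, 1])  # zeros at the diagonal positions
print(C_new[0, 1], C_new[1, 0])  # ones elsewhere
```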
I have several Numpy 3D arrays of the shape (a, b, c). The values of a, b, and c are unknown. However, I want to reshape each of the arrays to (a, c, b) in an efficient way.
Here is what I am doing:
for array in list_of_arrays:
    a, b, c = array.shape
    array = array.reshape(a, c, b)
Is there a more efficient way to do this, possibly in one line of code? Can I use the -1 indexing method to reshape/reorder the arrays?
Thank you.
Maybe np.transpose? It reorders all dimensions into the specified order.
x = np.random.randint(0, 256, (100, 80, 3))
np.transpose(x, (1, 0, 2)).shape
# (80, 100, 3)
import numpy as np
# Example array with shape (2, 4, 6)
array = np.arange(48).reshape((2, 4, 6))
# Swap the 1st and 2nd axes and print the resulting shape
np.swapaxes(array, 1, 2).shape
Output:
(2, 6, 4)
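Both transpose and swapaxes return views rather than copies, so they are cheap regardless of array size; a quick sketch, which also shows why reshape(a, c, b) from the question is not the same operation:

```python
import numpy as np

array = np.arange(48).reshape(2, 4, 6)
swapped = np.swapaxes(array, 1, 2)

# No data is moved: the result shares memory with the original.
print(swapped.shape)                    # (2, 6, 4)
print(np.shares_memory(array, swapped)) # True

# reshape(a, c, b) keeps the flat element order, so it scrambles the
# rows instead of swapping the axes.
print(np.array_equal(array.reshape(2, 6, 4), swapped))  # False
```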
Suppose I have a 5x10x3 array, which I interpret as 5 'sub-arrays', each consisting of 10 rows and 3 columns. I also have a separate 1D array of length 5, which I call b.
I am trying to insert a new column into each sub-array, where the column inserted into the i-th (i = 0, 1, 2, 3, 4) sub-array is a 10x1 vector whose every element equals b[i].
For example:
import numpy as np
np.random.seed(777)
A = np.random.rand(5,10,3)
b = np.array([2,4,6,8,10])
A[0] should look like:
A[1] should look like:
And similarly for the other 'sub-arrays'.
(Notice b[0]=2 and b[1]=4)
What about this?
# Make an array B with the same leading dimensions as A
B = np.tile(b, (1, 10, 1)).transpose(2, 1, 0)  # shape: (5, 10, 1)
# Concatenate both
np.concatenate([A, B], axis=-1) # shape: (5, 10, 4)
One method would be np.pad:
np.pad(A, ((0,0),(0,0),(0,1)), 'constant', constant_values=[[[],[]],[[],[]],[[],b[:, None,None]]])
# array([[[9.36513084e-01, 5.33199169e-01, 1.66763960e-02, 2.00000000e+00],
# [9.79060284e-02, 2.17614285e-02, 4.72452812e-01, 2.00000000e+00],
# etc.
Or (more typing but probably faster):
i, j, k = A.shape
res = np.empty((i, j, k + 1), np.result_type(A, b))
res[..., :-1] = A
res[..., -1] = b[:, None]
Or dstack after broadcast_to:
np.dstack([A, np.broadcast_to(b[:, None], A.shape[:2])])
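A minimal sketch checking that the preallocate-and-fill and dstack variants agree on the (5, 10, 4) result, using the arrays from the question:

```python
import numpy as np

np.random.seed(777)
A = np.random.rand(5, 10, 3)
b = np.array([2, 4, 6, 8, 10])

# Preallocate, then fill the old columns and the broadcast new column.
i, j, k = A.shape
res = np.empty((i, j, k + 1), np.result_type(A, b))
res[..., :-1] = A
res[..., -1] = b[:, None]   # (5, 1) broadcasts across the 10 rows

# dstack promotes the (5, 10) broadcast of b to (5, 10, 1) and appends it.
res2 = np.dstack([A, np.broadcast_to(b[:, None], A.shape[:2])])

print(res.shape)                  # (5, 10, 4)
print(np.array_equal(res, res2))  # True
```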