It is similar to some questions around SO, but I don't quite understand the trick to get what I want.
I have two arrays,
arr of shape (x, y, z)
indexes of shape (x, y) which hold indexes of interest for z.
For each value of indexes I want to get the actual value in arr where:
arr.x == indexes.x
arr.y == indexes.y
arr.z == indexes[x,y]
This would give an array of shape(x,y) similar to indexes' shape.
For example:
arr = np.arange(99)
arr = arr.reshape(3,3,11)
indexes = np.asarray([
[0,2,2],
[1,2,3],
[3,2,10]])
# indexes.shape == (3,3)
# Example for the first element to be computed
first_element = arr[0,0,indexes[0,0]]
With the above indexes, the expected array would look like:
expected_result = np.asarray([
[0,13,24],
[34,46,58],
[69,79,98]])
I tried elements = np.take(arr, indexes, axis=2)
but it gives an array of shape (x, y, x, y)
I also tried things like elements = arr[indexes, indexes,:] but I don't get what I wish.
I saw a few answers involving transposing indexes and transforming it into tuples but I don't understand how it would help.
Note: I'm a bit new to numpy so I don't fully understand indexing yet.
How would you solve this numpy style ?
This can be done using np.take_along_axis:
import numpy as np
#sample data
np.random.seed(0)
arr = np.arange(3*4*2).reshape(3, 4, 2) # 3d array
idx = np.random.randint(0, 2, (3, 4)) # array of indices
out = np.squeeze(np.take_along_axis(arr, idx[..., np.newaxis], axis=-1))
In this code, one more axis is appended to the array of indices so that it has the same number of dimensions as arr, the array we are selecting from, and can be broadcast against it. Then, since the return value of np.take_along_axis has the same shape as the array of indices, we need to remove this extra dimension using np.squeeze.
Another option is to use np.choose, but in this case the axis along which you are making selections must be moved to be the first axis of the array:
out = np.choose(idx, np.moveaxis(arr, -1, 0))
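As a quick check, here is a minimal sketch that applies both approaches to the arr and indexes from the question and compares them with expected_result. The names out_take and out_choose are only illustrative.

```python
import numpy as np

arr = np.arange(99).reshape(3, 3, 11)
indexes = np.asarray([[0, 2, 2],
                      [1, 2, 3],
                      [3, 2, 10]])
expected_result = np.asarray([[0, 13, 24],
                              [34, 46, 58],
                              [69, 79, 98]])

# take_along_axis needs indices with the same number of dimensions as arr,
# so add a trailing axis of length 1, then drop exactly that axis again.
out_take = np.take_along_axis(arr, indexes[..., np.newaxis], axis=-1)[..., 0]

# np.choose selects among "choice" arrays stacked along the first axis,
# so the z axis is moved to the front first.
out_choose = np.choose(indexes, np.moveaxis(arr, -1, 0))

print(np.array_equal(out_take, expected_result))
print(np.array_equal(out_choose, expected_result))
```

Indexing with [..., 0] instead of calling np.squeeze without arguments only removes the axis that was added, which matters if x or y happens to be 1. Also note that np.choose has an upper limit on the number of choice arrays, so it only suits a short z axis.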
The solution here should work for you: Indexing 3d numpy array with 2d array
Adapted to your code:
ax_0 = np.arange(arr.shape[0])[:,None]
ax_1 = np.arange(arr.shape[1])[None,:]
new_array = arr[ax_0, ax_1, indexes]
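To see why this works, it may help to look at the shapes: ax_0 is (3, 1), ax_1 is (1, 3) and indexes is (3, 3), so the three index arrays broadcast to (3, 3) and each output element is arr[i, j, indexes[i, j]]. The sketch below builds the same index grids with np.ogrid and checks the result against an explicit loop; rows, cols and looped are illustrative names.

```python
import numpy as np

arr = np.arange(99).reshape(3, 3, 11)
indexes = np.asarray([[0, 2, 2], [1, 2, 3], [3, 2, 10]])

# np.ogrid builds the same "open" index grids as the manual arange calls:
# rows has shape (3, 1) and cols has shape (1, 3).
rows, cols = np.ogrid[:arr.shape[0], :arr.shape[1]]

# The three index arrays broadcast to (3, 3), so
# new_array[i, j] == arr[i, j, indexes[i, j]].
new_array = arr[rows, cols, indexes]

# Reference result from an explicit double loop.
looped = np.array([[arr[i, j, indexes[i, j]] for j in range(arr.shape[1])]
                   for i in range(arr.shape[0])])

print(np.array_equal(new_array, looped))
```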
You can perform such an operation with np.take_along_axis. It selects along a single axis, so one way to use it here is to reshape your input and indices so that the first two axes are flattened into one.
The operation you are looking to perform is:
out[i, j] = arr[i, j, indices[i, j]]
To express this as a selection along one axis of a 2D array, we reshape both arr and indices, i.e. map (i, j) to k, so that we can apply np.take_along_axis. The following operation will take place:
out[k] = arr[k, indices[k]] # indexing along axis=1
The actual usage here comes down to:
>>> put = np.take_along_axis(arr.reshape(9, 11), indices.reshape(9, 1), axis=1)
>>> put
array([[ 0],
       [13],
       [24],
       [34],
       [46],
       [58],
       [69],
       [79],
       [98]])
Then reshape back to the shape of indices:
>>> put.reshape(indices.shape)
array([[ 0, 13, 24],
       [34, 46, 58],
       [69, 79, 98]])
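The flattening steps can be written without hard-coding 9 and 11. This is a minimal sketch using the question's data, with indices standing for the question's indexes array; flat_arr, flat_idx and out_direct are illustrative names. It also compares the result with the variant that keeps arr three-dimensional and gives indices a trailing axis instead.

```python
import numpy as np

arr = np.arange(99).reshape(3, 3, 11)
indices = np.asarray([[0, 2, 2], [1, 2, 3], [3, 2, 10]])

# Flatten (i, j) into a single axis k; -1 lets NumPy infer its length (9 here).
flat_arr = arr.reshape(-1, arr.shape[-1])   # shape (9, 11)
flat_idx = indices.reshape(-1, 1)           # shape (9, 1)

# out_flat[k, 0] == flat_arr[k, flat_idx[k, 0]]
out_flat = np.take_along_axis(flat_arr, flat_idx, axis=1)
out = out_flat.reshape(indices.shape)

# Same selection without flattening: add a trailing axis to indices instead.
out_direct = np.take_along_axis(arr, indices[..., None], axis=-1)[..., 0]

print(out)
```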
I just asked a basic question earlier on the dot product of two 2d arrays, hoping that I could then figure out how to expand the logic to the following question, where one matrix is actually 3d and the result is then divided by a 1d vector, with no success... sorry for the back-to-back questions, numpy looks much more complicated than I thought..
I have the following 3 arrays
a = np.array([[1, 2, 3], [3, 4, 5]])
b = np.array([[[1, 0, 1], [1, 1, 0], [0, 1, 1]], [[1, 1, 1], [0, 1, 0], [0, 0, 1]]])
c = np.array([1, 2, 3])
I'd like to do the following computation. I am just going to type out the formula directly as I am not sure how to properly describe it in words (if someone can enlighten me on this as well I would be interested to know!)...
[[(1*1 + 0*2 + 1*3)/1, (1*1 + 1*2 + 0*3)/2, (0*1 + 1*2 + 1*3)/3],
 [(1*3 + 1*4 + 1*5)/1, (0*3 + 1*4 + 0*5)/2, (0*3 + 0*4 + 1*5)/3]]
>>> [[4, 1.5, 1.67], [12, 2, 1.67]]
It probably has to do with how to use the axis arg but I can't figure it out quite yet... thanks much again!!
Correct answer: np.sum((a[:,None,:] * b), axis=2) / c
Process step by step visually:
Additional comments
a[:,None,:] inserts an extra dimension of size 1 in the middle: a.shape is (2, 3) and a[:,None,:].shape is (2, 1, 3). a[None,:,:] and a[:,:,None] do the same at the front and at the end, giving shapes (1, 2, 3) and (2, 3, 1); you can check this with the shape attribute.
One approach to understanding this problem is visual. The pictures show how the three dimensions correspond to axis=0, axis=1 and axis=2. So if you had an array of shape (2, 1, 3), its visualisation would be a cuboid of height 2, width 1 and length 3. For example, you can see visually how a[:,None,:] and b are broadcast into a new array a[:,None,:] * b.
There's also a symbolic (formal) way to see it. Only arrays of compatible shapes can be broadcast. Arrays of shapes (2, 3) and (2, 3, 3) can't be broadcast together, unlike arrays of shapes (2, 1, 3) and (2, 3, 3). Arrays of shapes (4, 3) and (3,) can also be broadcast. In other words, there are precise rules that define whether arrays are broadcastable: shapes are compared from the trailing dimension backwards, and each pair of sizes must either be equal or contain a 1.
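The broadcasting rules and the steps of the one-liner can also be checked in code. The sketch below is only illustrative: broadcast_shape_or_none is a hypothetical helper (not part of NumPy) that relies on np.broadcast raising ValueError for incompatible shapes, and the intermediate names product and summed are made up for this walk-through.

```python
import numpy as np

def broadcast_shape_or_none(shape_a, shape_b):
    """Hypothetical helper: broadcast shape of two shapes, or None if incompatible."""
    try:
        return np.broadcast(np.empty(shape_a), np.empty(shape_b)).shape
    except ValueError:
        return None

a = np.array([[1, 2, 3], [3, 4, 5]])
b = np.array([[[1, 0, 1], [1, 1, 0], [0, 1, 1]],
              [[1, 1, 1], [0, 1, 0], [0, 0, 1]]])
c = np.array([1, 2, 3])

# (2, 3) against (2, 3, 3) does not broadcast, (2, 1, 3) against (2, 3, 3) does.
direct = broadcast_shape_or_none(a.shape, b.shape)
expanded = broadcast_shape_or_none(a[:, None, :].shape, b.shape)

# Step 1: product[i, j, k] == a[i, k] * b[i, j, k], shape (2, 3, 3).
product = a[:, None, :] * b
# Step 2: sum over the last axis k, shape (2, 3).
summed = product.sum(axis=2)
# Step 3: divide column j by c[j]; (2, 3) broadcasts against (3,).
result = summed / c

print(direct, expanded)
print(result)
```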
That may not be what you were looking for exactly (as it reshapes the numpy array using a list, which is not very efficient), but it seems to work :
d = np.array([[row]*3 for row in a]) #reshape np.array to 3d
print((d*b).sum(axis=2)/c)
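If the list-based reshape is a concern, two alternatives give the same numbers without a Python-level loop. This is a sketch, not a benchmark: np.broadcast_to produces a read-only view with the same contents as d, and np.einsum spells out the index pattern directly. The names d_view, via_view and via_einsum are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [3, 4, 5]])
b = np.array([[[1, 0, 1], [1, 1, 0], [0, 1, 1]],
              [[1, 1, 1], [0, 1, 0], [0, 0, 1]]])
c = np.array([1, 2, 3])

# List-based version from the answer above: d[i, j, k] == a[i, k].
d = np.array([[row] * 3 for row in a])
via_list = (d * b).sum(axis=2) / c

# Same intermediate array as a read-only broadcast view, built without a loop.
d_view = np.broadcast_to(a[:, None, :], b.shape)
via_view = (d_view * b).sum(axis=2) / c

# einsum: sum over k of a[i, k] * b[i, j, k], then divide column j by c[j].
via_einsum = np.einsum('ik,ijk->ij', a, b) / c

print(via_einsum)
```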
I have a numpy array "data" with dimensions [t, z, x, y]. The
dimensions represent time (t) and three spatial dimensions (x, y, z).
I have a separate array "idx" of indices with dimensions [t, x, y]
describing vertical coordinates in data: each value in idx describes a
single vertical level in data.
I want to pull out the values from data indexed by idx. I've done it
successfully using loops (below). I've read several SO threads and numpy's indexing docs but I haven't been able to make it more pythonic/vectorized.
Is there an easy way I'm just not getting quite right? Or maybe loops
are a clearer way to do this anyway...
import numpy as np
dim = (4, 4, 4, 4) # dimensions time, Z, X, Y
data = np.random.randint(0, 10, dim)
idx = np.random.randint(0, 3, dim[0:3])
# extract vertical indices in idx from data using loops
foo = np.zeros(dim[0:3])
for this_t in range(dim[0]):
    for this_x in range(dim[2]):
        for this_y in range(dim[3]):
            foo[this_t, this_x, this_y] = data[this_t,
                                               idx[this_t, this_x, this_y],
                                               this_x,
                                               this_y]
# surely there's a better way to do this with fancy indexing
# data[idx] gives me an array with dimensions (4, 4, 4, 4, 4, 4)
# data[idx[:, np.newaxis, ...]] is a little closer
# data[tuple(idx[:, np.newaxis, ...])] doesn't quite get it either
# I tried lots of variations on those ideas but no luck yet
In [7]: I,J,K = np.ogrid[:4,:4,:4]
In [8]: data[I,idx,J,K].shape
Out[8]: (4, 4, 4)
In [9]: np.allclose(foo, data[I,idx,J,K])
Out[9]: True
I,J,K broadcast together to the same shape as idx (4,4,4).
More detail on this kind of indexing at
How to take elements along a given axis, given by their indices?
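The same selection can also be sketched with np.take_along_axis by giving idx a length-1 axis in the z position. The example below deliberately uses four different dimension sizes (an assumption made for this sketch, the question uses 4 everywhere) so that any mix-up between axes would show up as a shape error, and it checks both vectorized versions against the loop.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = (2, 5, 3, 4)  # time, Z, X, Y; distinct sizes make axis mix-ups visible
data = rng.integers(0, 10, size=dim)
# idx has shape (t, x, y) and holds valid z levels.
idx = rng.integers(0, dim[1], size=(dim[0], dim[2], dim[3]))

# Loop reference, as in the question.
foo = np.zeros(idx.shape, dtype=data.dtype)
for t in range(dim[0]):
    for x in range(dim[2]):
        for y in range(dim[3]):
            foo[t, x, y] = data[t, idx[t, x, y], x, y]

# Open grids for the t, x and y axes broadcast against idx.
T, X, Y = np.ogrid[:dim[0], :dim[2], :dim[3]]
via_ogrid = data[T, idx, X, Y]

# take_along_axis: insert a length-1 z axis into idx, select, then drop it.
via_take = np.take_along_axis(data, idx[:, np.newaxis, :, :], axis=1)[:, 0]

print(np.array_equal(foo, via_ogrid), np.array_equal(foo, via_take))
```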
I have a 4d array (python) with a batch of 10000 images with 5 channels in each image. Each image is 25*25 i.e. the 4d array shape is 10000*5*25*25.
I need to transpose the images. The naive way is with nested loops:
for i in range(np.shape(img)[0]):
    for j in range(np.shape(img)[1]):
        img[i, j, :, :] = np.transpose(img[i, j, :, :])
but I'm sure that there is a more efficient way to do this. Do you have any idea?
Thanks!
The function numpy.transpose is general enough to handle multi-dimensional arrays. By default it reverses the order of dimensions.
However, it takes an optional axes argument, which explicitly specifies the order in which to rearrange the dimensions. To swap the last two dimensions in a 4D array (i.e. transposing a stack of images):
np.transpose(x, [0, 1, 3, 2])
No loops are required: it works on the entire 4D array at once and, whenever possible, returns a view of the original data rather than a copy, so it is very cheap.
Some more examples:
np.transpose(x, [0, 1, 2, 3]) # leaves the array unchanged
np.transpose(x, [3, 2, 1, 0]) # same as np.transpose(x)
np.transpose(x, [0, 2, 1, 3]) # transpose a stack of images with channel in the last dim
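A small sketch to confirm that the axis permutation matches the per-image loop. It uses a tiny non-square stand-in array (an assumption for illustration, the question's array is 10000*5*25*25) and writes the loop result into a separate array, since non-square images cannot be transposed in place. np.swapaxes is shown as an equivalent spelling.

```python
import numpy as np

# Small stand-in for the (10000, 5, 25, 25) batch; 4x5 images instead of 25x25.
img = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# Swap the last two axes for every image and channel at once.
swapped = np.transpose(img, [0, 1, 3, 2])

# Equivalent spelling that works for any number of leading axes.
same = np.swapaxes(img, -1, -2)

# Loop reference, written into a separate output array.
looped = np.empty((2, 3, 5, 4), dtype=img.dtype)
for i in range(img.shape[0]):
    for j in range(img.shape[1]):
        looped[i, j] = img[i, j].T

print(swapped.shape)
print(np.array_equal(swapped, looped))
```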
Suppose I have 3 numpy arrays a, b, c, of the same shape, say
a.shape == b.shape == c.shape == (7,9)
Now I'd like to create a 3-dimensional array of size (7,9,3), say x, such that
x[:,:,0] == a
x[:,:,1] == b
x[:,:,2] == c
What is the "pythonic" way of doing it (perhaps in one line)?
Thanks in advance!
There's a function that does exactly that: numpy.dstack ("d" for "depth"). For example:
In [10]: import numpy as np
In [11]: a = np.ones((7, 9))
In [12]: b = a * 2
In [13]: c = a * 3
In [15]: x = np.dstack((a, b, c))
In [16]: x.shape
Out[16]: (7, 9, 3)
In [17]: (x[:, :, 0] == a).all()
Out[17]: True
In [18]: (x[:, :, 1] == b).all()
Out[18]: True
In [19]: (x[:, :, 2] == c).all()
Out[19]: True
TL;DR:
Use numpy.stack (docs), which joins a sequence of arrays along a new axis of your choice.
Although @NPE's answer is very good and covers many cases, there are some scenarios in which numpy.dstack isn't the right choice (I've just found that out while trying to use it). That's because numpy.dstack, according to the docs:
Stacks arrays in sequence depth wise (along third axis).
This is equivalent to concatenation along the third axis after 2-D
arrays of shape (M,N) have been reshaped to (M,N,1) and 1-D arrays of
shape (N,) have been reshaped to (1,N,1).
Let's walk through an example in which this function isn't desirable. Suppose you have a list with 512 numpy arrays of shape (3, 3, 3) and want to stack them in order to get a new array of shape (3, 3, 3, 512). In my case, those 512 arrays were filters of a 2D-convolutional layer. If you use numpy.dstack:
>>> len(arrays_list)
512
>>> arrays_list[0].shape
(3, 3, 3)
>>> numpy.dstack(arrays_list).shape
(3, 3, 1536)
That's because numpy.dstack always stacks the arrays along the third axis! Instead, you should use numpy.stack (docs), which joins a sequence of arrays along a new axis of your choice:
>>> numpy.stack(arrays_list, axis=-1).shape
(3, 3, 3, 512)
In my case, I passed -1 to the axis parameter because I wanted the arrays stacked along the last axis.
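The difference is easy to reproduce on a smaller scale. In this sketch, 5 arrays stand in for the 512 filters (an assumption to keep the example small), and the last check shows that for 2D inputs, as in the original question, numpy.dstack and numpy.stack with axis=-1 agree.

```python
import numpy as np

# 5 stand-ins for the 512 filters, each of shape (3, 3, 3).
arrays_list = [np.full((3, 3, 3), i) for i in range(5)]

# dstack concatenates along the existing third axis: 3 * 5 = 15.
dstacked = np.dstack(arrays_list)

# stack creates a new axis at the requested position.
stacked_last = np.stack(arrays_list, axis=-1)
stacked_first = np.stack(arrays_list, axis=0)

print(dstacked.shape, stacked_last.shape, stacked_first.shape)

# For 2D inputs the two functions give the same (7, 9, 3) result.
a = np.ones((7, 9))
b = a * 2
c = a * 3
same_for_2d = np.array_equal(np.dstack((a, b, c)), np.stack((a, b, c), axis=-1))
print(same_for_2d)
```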