I have a 3d array, a, of shape say a.shape = (10, 10, 10)
When slicing, the dimensions are squeezed automatically i.e.
a[:,:,5].shape = (10, 10)
I'd like to preserve the number of dimensions but also ensure that the dimension that was squeezed is the one that shows 1 i.e.
a[:,:,5].shape = (10, 10, 1)
I have thought of re-casting the array and passing ndmin but that just adds the extra dimensions to the start of the shape tuple regardless of where the slice came from in the array a.
a[:,:,[5]].shape
# (10,10,1)
a[:,:,5] is an example of basic slicing.
a[:,:,[5]] is an example of integer array indexing combined with basic slicing. With integer array indexing, the indexed part of the result has a shape "identical to the (broadcast) indexing array shapes", while the sliced axes keep their lengths. Since [5] (as an array) has shape (1,),
a[:,:,[5]] ends up having shape (10,10,1).
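As a minimal sketch of the options (the variable names here are only illustrative), the following compares an integer index with three ways of keeping the axis in place. The list index is the approach from the answer above; the length-one slice and np.expand_dims are alternatives that give the same shape.

```python
import numpy as np

a = np.zeros((10, 10, 10))

# An integer index drops the indexed axis.
dropped = a[:, :, 5]

# A one-element list triggers integer array indexing and keeps the axis (returns a copy).
kept_list = a[:, :, [5]]

# A length-one slice is basic slicing and also keeps the axis (returns a view).
kept_slice = a[:, :, 5:6]

# The axis can also be re-inserted explicitly at the position it was taken from.
kept_expand = np.expand_dims(a[:, :, 5], axis=2)

print(dropped.shape, kept_list.shape, kept_slice.shape, kept_expand.shape)
```

The slice form avoids a copy, which may matter for large arrays.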
I have a 4-D NumPy array with axes, say, x, y, z, t. I want to take the slice corresponding to t=0 and permute the order along the y axis.
I have the following
import numpy as np
a = np.arange(120).reshape(4,5,3,2)
b = a[:,[1,2,3,4,0],:,0]
b.shape
I get (5, 4, 3) instead of (4,5,3).
When, instead, I enter
aa = a[:,:,:,0]
bb = aa[:,[1,2,3,4,0],:]
bb.shape
I get the expected (4,5,3). Can someone explain why the first version swaps the first two dimensions?
As @hpaulj mentioned in the comments, this behaviour is because of mixing basic slicing and advanced indexing:
a = np.arange(120).reshape(4,5,3,2)
b = a[:,[1,2,3,4,0],:,0]
In the above code snippet, what happens is the following:
The integer 0 in the last position removes that dimension (i.e. no singleton dimension is left). When an integer is combined with an index array, NumPy treats the integer as an advanced index as well.
[1,2,3,4,0] selects 5 entries along the second dimension. Because this index array and the integer 0 are separated by a slice (:), there is no unambiguous place to put the resulting dimension of length 5, so NumPy puts it first. This is why you get 5 in the first position of the returned shape tuple. Jaime explained this in one of the PyCon talks, if I recall correctly.
Along first and third dimension, since you slice everything using :, the original length along those dimensions is retained.
Putting all these together, NumPy returns the shape tuple as: (5, 4, 3)
You can read more about it at numpy-indexing-ambiguity-in-3d-arrays and arrays.indexing#combining-advanced-and-basic-indexing
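A short sketch of the behaviour described above, together with two ways of getting the (4, 5, 3) layout. The names perm, two_step and one_step are only illustrative. The one_step variant replaces the integer 0 with the slice 0:1, so the list is the only advanced index and its axis stays in place.

```python
import numpy as np

a = np.arange(120).reshape(4, 5, 3, 2)
perm = [1, 2, 3, 4, 0]

# List and integer are both advanced indices, separated by a slice:
# the broadcast dimension (length 5) is moved to the front.
b = a[:, perm, :, 0]

# Indexing in two steps keeps the axes in their original order.
two_step = a[:, :, :, 0][:, perm, :]

# Keeping t as a length-one slice leaves a single advanced index,
# so the permuted axis stays where it is; the last axis is dropped afterwards.
one_step = a[:, perm, :, 0:1][..., 0]

print(b.shape, two_step.shape, one_step.shape)
```

The data in b is the same as in two_step, only with the first two axes swapped.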
I am looking for an elegant way to flatten an array of arbitrary shape to a matrix based on a single parameter that specifies the dimension to retain. For illustration, I would like
def my_func(input, dim):
    # code to compute output
    return output
Given, for example, an input array of shape 2x3x4, output should be for dim=0 an array of shape 12x2; for dim=1 an array of shape 8x3; for dim=2 an array of shape 6x4. If I want to flatten everything except the last dimension, this is easily accomplished by
input.reshape(-1, input.shape[-1])
But I would like to add the ability to choose dim (elegantly, without going through all possible cases + checking with if conditions, etc.). It might be possible by first swapping dimensions, so that the dimension of interest is trailing, and then applying the operation above.
Any help?
We can permute axes and reshape -
# a is input array; axis is input axis/dim
np.moveaxis(a,axis,-1).reshape(-1,a.shape[axis])
Functionally, it's basically pushing the specified axis to the back and then reshaping keeping that axis length to form the second axis and merging rest of the axes to form the first axis.
Sample runs -
In [32]: a = np.random.rand(2,3,4)
In [33]: axis = 0
In [34]: np.moveaxis(a,axis,-1).reshape(-1,a.shape[axis]).shape
Out[34]: (12, 2)
In [35]: axis = 1
In [36]: np.moveaxis(a,axis,-1).reshape(-1,a.shape[axis]).shape
Out[36]: (8, 3)
In [37]: axis = 2
In [38]: np.moveaxis(a,axis,-1).reshape(-1,a.shape[axis]).shape
Out[38]: (6, 4)
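For convenience, the one-liner can be wrapped in a small function. This is only a sketch; the name flatten_keep_axis is hypothetical and not part of NumPy.

```python
import numpy as np

def flatten_keep_axis(arr, axis):
    """Return a 2-D array whose columns run along `axis` of `arr`.

    Hypothetical helper: moves `axis` to the end, then merges all
    remaining axes into the first dimension.
    """
    arr = np.asarray(arr)
    return np.moveaxis(arr, axis, -1).reshape(-1, arr.shape[axis])

a = np.arange(24).reshape(2, 3, 4)
out = flatten_keep_axis(a, 1)
print(out.shape)
```

Column j of the result holds the entries of a taken at index j along the chosen axis, in C order.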
I have some code that constructs a 3D numpy array (x_3d) on the fly using values from a 2D numpy array (x) in a for loop:
x_3d = np.empty((0, 20, 10))
for i in range(num_samples):
    x_3d = np.append(x_3d, [x[i*20:(i+1)*20, :]], axis=0)
The resulting shape of the 3D array is (num_samples, 20, 10).
If I want to take slices of different lengths from the 2D array, so that the number of rows varies, how can I do that? I have looked at this post. Storing the 2D arrays in a list first and then converting the list back to an array gave me an array of shape (num_samples,) in which each element is a 2D numpy array. It is not a 3D numpy array with the shape (num_samples, length_varies, 10).
I have a numpy array with size (N,1). When I insert a value somewhere into the array using numpy.insert, it results in an (N,) array. This later causes problems when subtracting an (N,1) array from an (N,) array.
Example:
#Random (4 x 1) array
a = np.random.rand(4,1)
#Insert a number. This results in a (4,) array
b = np.insert(a,0,10)
#Some other (5 x 1) array
c = np.random.rand(5,1)
#Because c is (5,1) and b is (5,), this subtraction is not element by
#element and results in a (5,5) array.
d = b - c
Two questions:
Why does "insert" decrease the dimensions of the array?
Why does subtracting a (5,1) array from a (5,) array result in a (5,5) array rather than an element-wise subtraction?
From the numpy.insert docs:
axis : int, optional
Axis along which to insert values. If axis is None then arr is flattened first.
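You didn't specify an axis, so insert flattened the array as the first step. As for how the subtraction works, that's broadcasting: the (5,) array is treated as (1,5) and stretched against the (5,1) array, which gives a (5,5) result.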
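A brief sketch of both points, reusing the shapes from the question (the names b_flat, b_col, d_outer and d_elem are only illustrative). Passing axis=0 to np.insert keeps the array two-dimensional, so the later subtraction is element-wise.

```python
import numpy as np

a = np.random.rand(4, 1)
c = np.random.rand(5, 1)

# Without axis, the array is flattened before the value is inserted.
b_flat = np.insert(a, 0, 10)

# With axis=0, a new row is inserted and the column shape is kept.
b_col = np.insert(a, 0, 10, axis=0)

# (5,) against (5, 1) broadcasts to (5, 5).
d_outer = b_flat - c

# (5, 1) against (5, 1) is element-wise.
d_elem = b_col - c

print(b_flat.shape, b_col.shape, d_outer.shape, d_elem.shape)
```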
Suppose we have a 3d numpy array in Python of shape (1, 22, 22) (arbitrary dimensions, for illustration). If I want to extract the first 2 entries along each of the Y and Z axes, then I can do:
new_array = array[:, 0:2, 0:2]
new_array.shape
(1, 2, 2)
But when I try to do the same by explicitly listing the first two indices, as:
new_array = array[:, [0,1], [0,1]]
new_array.shape
(1, 2)
I'm getting a different result. Why's that? How can I select specific indices and not a range of indices?
Passing a list to a numpy array's __getitem__ uses advanced indexing instead of slicing. See the documentation here.
Advanced indexing is triggered when the selection object, obj, is a non-tuple sequence object, an ndarray (of data type integer or bool), or a tuple with at least one sequence object or ndarray (of data type integer or bool). There are two types of advanced indexing: integer and Boolean.
In your case, you are using integer array indexing. The integer index arrays are broadcast against each other and iterated as a single unit. So using
array[:, [0,1], [0,1]]
selects elements (0,0) and (1,1), not the zeroth and first subarray from dimension 1 and the zeroth and first subarray from dimension 2.
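As a hedged sketch of the difference, the paired selection can be compared with a block selection. The block variants below are one possible way to get the same result as the slice 0:2, 0:2 from explicit index lists: either give the two index arrays shapes that broadcast to (2, 2), or let np.ix_ build such arrays. The variable names are only illustrative.

```python
import numpy as np

array = np.arange(1 * 22 * 22).reshape(1, 22, 22)

# Index lists are paired up: picks elements (0, 0) and (1, 1).
paired = array[:, [0, 1], [0, 1]]

# A column of row indices broadcast against a row of column indices
# selects the full 2x2 block instead.
rows = np.array([0, 1])[:, None]   # shape (2, 1)
cols = np.array([0, 1])            # shape (2,)
block = array[:, rows, cols]

# np.ix_ builds the broadcastable index arrays for every axis.
block_ix = array[np.ix_([0], [0, 1], [0, 1])]

print(paired.shape, block.shape, block_ix.shape)
```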
I read the documentation and played around with my code. The only thing that seemed to work (but actually doesn't) with respect to my question is:
columns = np.array([[0, 1], [0, 1]], dtype=np.intp)
new_array = my_array[:, columns, 0]
I'm still not quite sure why it works though.
EDIT: doesn't work as expected