Slicing n-dimensional numpy array using list of indices - python

Say I have a 3 dimensional numpy array:
np.random.seed(1145)
A = np.random.random((5,5,5))
and I have two lists of indices corresponding to the 2nd and 3rd dimensions:
second = [1,2]
third = [3,4]
and I want to select the elements in the numpy array corresponding to
A[:][second][third]
so the shape of the sliced array would be (5,2,2) and
A[:][second][third].flatten()
would be equivalent to:
In [226]:
for i in range(5):
    for j in second:
        for k in third:
            print A[i][j][k]
0.556091074129
0.622016249651
0.622530505868
0.914954716368
0.729005532319
0.253214472335
0.892869371179
0.98279375528
0.814240066639
0.986060321906
0.829987410941
0.776715489939
0.404772469431
0.204696635072
0.190891168574
0.869554447412
0.364076117846
0.04760811817
0.440210532601
0.981601369658
Is there a way to slice a numpy array in this way? So far when I try A[:][second][third] I get IndexError: index 3 is out of bounds for axis 0 with size 2 because the [:] for the first dimension seems to be ignored.

NumPy supports multidimensional indexing, so instead of A[1][2][3], you can, and should, use A[1,2,3].
You might then think you could do A[:, second, third], but the numpy indices are broadcast, and broadcasting second and third (two one-dimensional sequences) ends up being the numpy equivalent of zip, so the result has shape (5, 2).
What you really want is to index with, in effect, the outer product of second and third. You can do this with broadcasting by making one of them, say second, a two-dimensional array with shape (2,1). Then the shape that results from broadcasting second and third together is (2,2).
For example:
In [8]: import numpy as np
In [9]: a = np.arange(125).reshape(5,5,5)
In [10]: second = [1,2]
In [11]: third = [3,4]
In [12]: s = a[:, np.array(second).reshape(-1,1), third]
In [13]: s.shape
Out[13]: (5, 2, 2)
Note that, in this specific example, the values in second and third are sequential. If that is typical, you can simply use slices:
In [14]: s2 = a[:, 1:3, 3:5]
In [15]: s2.shape
Out[15]: (5, 2, 2)
In [16]: np.all(s == s2)
Out[16]: True
There are a couple of very important differences between those two methods.
The first method would also work with indices that are not equivalent to slices. For example, it would work if second = [0, 2, 3]. (Sometimes you'll see this style of indexing referred to as "fancy indexing".)
In the first method (using broadcasting and "fancy indexing"), the data is a copy of the original array. In the second method (using only slices), the array s2 is a view into the same block of memory used by a. An in-place change in one will change them both.
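A minimal check of that difference, reusing a, s and s2 from above (the assignments here are only for illustration):
In [17]: s2[0, 0, 0] = -1     # s2 is a view, so this writes into a
In [18]: a[0, 1, 3]
Out[18]: -1
In [19]: s[0, 0, 0] = -99     # s is a copy, so a is untouched
In [20]: a[0, 1, 3]
Out[20]: -1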

One way would be to use np.ix_:
>>> out = A[np.ix_(range(A.shape[0]),second, third)]
>>> out.shape
(5, 2, 2)
>>> manual = [A[i,j,k] for i in range(5) for j in second for k in third]
>>> (out.ravel() == manual).all()
True
Downside is that you have to specify the missing coordinate ranges explicitly, but you could wrap that into a function.
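If you do wrap it, a sketch might look like this (outer_index is a hypothetical helper, not part of numpy; axes you don't list get all of their indices, playing the role of ':'):
import numpy as np

def outer_index(arr, indices_by_axis):
    # indices_by_axis maps axis number -> list of indices to keep along that axis
    per_axis = [indices_by_axis.get(ax, np.arange(n))
                for ax, n in enumerate(arr.shape)]
    return arr[np.ix_(*per_axis)]

# e.g. outer_index(A, {1: second, 2: third}).shape == (5, 2, 2)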

I think there are three problems with your approach:
Both second and third should be slices
Since the 'to' index is exclusive, they should go from 1 to 3 and from 3 to 5
Instead of A[:][second][third], you should use A[:,second,third]
Try this:
>>> np.random.seed(1145)
>>> A = np.random.random((5,5,5))
>>> second = slice(1,3)
>>> third = slice(3,5)
>>> A[:,second,third].shape
(5, 2, 2)
>>> A[:,second,third].flatten()
array([ 0.43285482,  0.80820122,  0.64878266,  0.62689481,  0.01298507,
        0.42112921,  0.23104051,  0.34601169,  0.24838564,  0.66162209,
        0.96115751,  0.07338851,  0.33109539,  0.55168356,  0.33925748,
        0.2353348 ,  0.91254398,  0.44692211,  0.60975602,  0.64610556])

Related

How to check numpy array is empty? [duplicate]

How can I check whether a numpy array is empty or not?
I used the following code, but this fails if the array contains a zero.
if not self.Definition.all():
Is this the solution?
if self.Definition == array([]):
You can always take a look at the .size attribute. It is defined as an integer, and is zero (0) when there are no elements in the array:
import numpy as np
a = np.array([])
if a.size == 0:
    # Do something when `a` is empty
https://numpy.org/devdocs/user/quickstart.html (2020.04.08)
NumPy’s main object is the homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. In NumPy dimensions are called axes.
(...) NumPy’s array class is called ndarray. (...) The more important attributes of an ndarray object are:
ndarray.ndim
the number of axes (dimensions) of the array.
ndarray.shape
the dimensions of the array. This is a tuple of integers indicating the size of the array in each dimension. For a matrix with n rows and m columns, shape will be (n,m). The length of the shape tuple is therefore the number of axes, ndim.
ndarray.size
the total number of elements of the array. This is equal to the product of the elements of shape.
One caveat, though.
Note that np.array(None).size returns 1!
This is because a.size is equivalent to np.prod(a.shape),
np.array(None).shape is (), and an empty product is 1.
>>> import numpy as np
>>> np.array(None).size
1
>>> np.array(None).shape
()
>>> np.prod(())
1.0
Therefore, I use the following to test if a numpy array has elements:
>>> def elements(array):
...     return array.ndim and array.size
>>> elements(np.array(None))
0
>>> elements(np.array([]))
0
>>> elements(np.zeros((2,3,4)))
24
Why would we want to check if an array is empty? Arrays don't grow or shrink in the same way that lists do. Starting with an 'empty' array and growing it with np.append is a frequent novice error.
Using a list in if alist: hinges on its boolean value:
In [102]: bool([])
Out[102]: False
In [103]: bool([1])
Out[103]: True
But trying to do the same with an array produces (in version 1.18):
In [104]: bool(np.array([]))
/usr/local/bin/ipython3:1: DeprecationWarning: The truth value
of an empty array is ambiguous. Returning False, but in
future this will result in an error. Use `array.size > 0` to
check that an array is not empty.
#!/usr/bin/python3
Out[104]: False
In [105]: bool(np.array([1]))
Out[105]: True
and bool(np.array([1,2])) produces the infamous ambiguity error.
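For reference, the multi-element case raises rather than just warning (message abridged):
In [106]: bool(np.array([1, 2]))
ValueError: The truth value of an array with more than one element
is ambiguous. Use a.any() or a.all()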
edit
The accepted answer suggests size:
In [11]: x = np.array([])
In [12]: x.size
Out[12]: 0
But I (and most others) check the shape more than the size:
In [13]: x.shape
Out[13]: (0,)
Another thing in its favor is that it 'maps' on to an empty list:
In [14]: x.tolist()
Out[14]: []
But there are other arrays with 0 size that aren't 'empty' in that last sense:
In [15]: x = np.array([[]])
In [16]: x.size
Out[16]: 0
In [17]: x.shape
Out[17]: (1, 0)
In [18]: x.tolist()
Out[18]: [[]]
In [19]: bool(x.tolist())
Out[19]: True
np.array([[],[]]) is also size 0, but shape (2,0) and len 2.
While the concept of an empty list is well defined, an empty array is not well defined. One empty list is equal to another. The same can't be said for a size 0 array.
The answer really depends on two things:
what do you mean by 'empty'?
what are you really testing for?
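For instance, the shape (1, 0) array from above answers those two questions differently depending on which test you pick:
In [20]: x = np.array([[]])   # shape (1, 0), size 0
In [21]: x.size == 0          # 'no elements at all'
Out[21]: True
In [22]: x.shape[0] == 0      # 'nothing along the first axis'
Out[22]: False
In [23]: bool(x.tolist())     # 'truthy when converted to nested lists'
Out[23]: True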

non adjacent slicing of numpy multidimensional array in python

I have a multidimensional array a:
a = np.random.uniform(1,10,(2,4,2,3,10,10))
For dimensions 4-6, I have 3 lists containing the indices for slicing those dimensions of array 'a':
dim4 = [0,2]
dim5 = [3,5,9]
dim6 = [1,2,7,8]
How do I slice out array 'a' such that i get:
b = a[0,:,0,dim4,dim5,dim6]
So b should be an array with shape (4,2,3,4), containing elements from the corresponding dimensions of a. When I try the code above, I get an error saying that different shapes can't be broadcast together for axes 4-6, but if I were to do:
b = a[0,:,0:2,0:3,0:4]
then it does work, even though the slices all have different lengths. So how do you slice multidimensional arrays with non-adjacent indices?
You can use the numpy.ix_ function to construct complex indexing like this. It takes a sequence of array_like, and makes an "open mesh" from them. The example from the docstring is pretty clear:
Using ix_ one can quickly construct index arrays that will index
the cross product. a[np.ix_([1,3],[2,5])] returns the array
[[a[1,2] a[1,5]], [a[3,2] a[3,5]]].
So, for your data, you'd do:
>>> indices = np.ix_((0,), np.arange(a.shape[1]), (0,), dim4, dim5, dim6)
>>> a[indices].shape
(1, 4, 1, 2, 3, 4)
Get rid of the size-1 dimensions with np.squeeze:
>>> np.squeeze(a[indices]).shape
(4, 2, 3, 4)
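Alternatively (my own variation, not part of the original answer), you can pick out the scalar axes first with plain integer indexing and apply np.ix_ only to the remaining axes, which avoids the squeeze:
>>> sub = a[0, :, 0]                    # shape (4, 3, 10, 10)
>>> b = sub[np.ix_(np.arange(sub.shape[0]), dim4, dim5, dim6)]
>>> b.shape
(4, 2, 3, 4)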

How to quickly grab specific indices from a numpy array?

I don't have the index values themselves; instead, I have 1s at those same positions in a different array. For example, I have
a = array([3,4,5,6])
b = array([0,1,0,1])
Is there some NumPy method that can quickly look at both of these and extract all values from a whose indices match the indices of all 1's in b? I want it to result in:
array([4,6])
It is probably worth mentioning that my a array is multidimensional, while my b array will always have values of either 0 or 1. I tried using NumPy's logical_and function, though this raises a ValueError because a and b have different dimensions:
a = numpy.array([[3,2], [4,5], [6,1]])
b = numpy.array([0, 1, 0])
print numpy.logical_and(a,b)
ValueError: operands could not be broadcast together with shapes (3,2) (3,)
Though this method does seem to work if a is flat. Either way, the return type of numpy.logical_and() is a boolean, which I do not want. Is there another way? Again, in the second example above, the desired return would be
array([[4,5]])
Obviously I could write a simple loop to accomplish this, I'm just looking for something a bit more concise.
Edit:
This introduces more constraints: I should also mention that each element of the multidimensional array a may be of arbitrary length, not necessarily matching its neighbours.
You can simply use fancy indexing.
b == 1
will give you a boolean array:
>>> from numpy import array
>>> a = array([3,4,5,6])
>>> b = array([0,1,0,1])
>>> b==1
array([False, True, False, True], dtype=bool)
which you can pass as an index to a.
>>> a[b==1]
array([4, 6])
Demo for your second example:
>>> a = array([[3,2], [4,5], [6,1]])
>>> b = array([0, 1, 0])
>>> a[b==1]
array([[4, 5]])
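Since b is guaranteed to contain only 0s and 1s, casting it to bool gives the same mask (a minor variation on the same idea):
>>> a[b.astype(bool)]
array([[4, 5]])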
You could use compress:
>>> a = np.array([3,4,5,6])
>>> b = np.array([0,1,0,1])
>>> a.compress(b)
array([4, 6])
You can provide an axis argument for multi-dimensional cases:
>>> a2 = np.array([[3,2], [4,5], [6,1]])
>>> b2 = np.array([0, 1, 0])
>>> a2.compress(b2, axis=0)
array([[4, 5]])
This method will work even if the axis of a you're indexing along has a different length from b.
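For example, with hypothetical data where b has fewer entries than a has rows:
>>> a3 = np.array([[3,2], [4,5], [6,1], [7,9]])   # four rows
>>> b3 = np.array([0, 1, 1])                      # only three entries
>>> a3.compress(b3, axis=0)
array([[4, 5],
       [6, 1]])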

Python: How can I force 1-element NumPy arrays to be two-dimensional?

I have a piece of code that slices a 2D NumPy array and returns the resulting (sub-)array. In some cases, the slicing only indexes one element, in which case the result is a one-element array:
>>> sub_array = orig_array[indices_h, indices_w]
>>> sub_array.shape
(1,)
How can I force this array to be two-dimensional in a general way? I.e.:
>>> sub_array.shape
(1,1)
I know that sub_array.reshape(1,1) works, but I would like to be able to apply it to sub_array generally, without worrying about the number of elements in it. To put it another way, I would like a (light-weight) operation that converts a shape-(1,) array to a shape-(1,1) array, leaves a shape-(2,2) array as a shape-(2,2) array, and so on. I can make a function:
def twodimensionalise(input_array):
    if input_array.shape == (1,):
        return input_array.reshape(1,1)
    else:
        return input_array
Is this the best I am going to get or does NumPy have something more 'native'?
Addition:
As pointed out in https://stackoverflow.com/a/31698471/865169, I was doing the indexing wrong. I really wanted to do:
sub_array = orig_array[indices_h][:, indices_w]
This does not work when there is only one entry in indices_h, but combining it with np.atleast_2d suggested in another answer, I arrive at:
sub_array = np.atleast_2d(orig_array[indices_h])[:, indices_w]
It sounds like you might be looking for atleast_2d. This function returns a view of a 1D array as a 2D array:
>>> arr1 = np.array([1.7]) # shape (1,)
>>> np.atleast_2d(arr1)
array([[ 1.7]])
>>> _.shape
(1, 1)
Arrays that are already 2D (or have more dimensions) are unchanged:
>>> arr2 = np.arange(4).reshape(2,2) # shape (2, 2)
>>> np.atleast_2d(arr2)
array([[0, 1],
       [2, 3]])
>>> _.shape
(2, 2)
When defining a numpy array you can use the keyword argument ndmin to specify that you want at least two dimensions.
e.g.
arr = np.array(item_list, ndmin=2)
arr.shape   # (1, 100) if item_list is a flat list of 100 elements, etc.
In the example in the question, just do
sub_array = np.array(orig_array[indices_h, indices_w], ndmin=2)
sub_array.shape   # (1, 1)
This can be extended to higher dimensions too, unlike np.atleast_2d().
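For instance, asking for at least three dimensions pads the shape with leading 1s:
>>> np.array([1.7], ndmin=3).shape
(1, 1, 1)
>>> np.array([[1, 2], [3, 4]], ndmin=3).shape
(1, 2, 2)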
Are you sure you are indexing in the way you want to? In the case where indices_h and indices_w are broadcastable integer indexing arrays, the result will have the broadcasted shape of indices_h and indices_w. So if you want to make sure that the result is 2D, make the indices arrays 2D.
Otherwise, if you want all combinations of indices_h[i] and indices_w[j] (for all i, j), do e.g. a sequential indexing:
sub_array = orig_array[indices_h][:, indices_w]
Have a look at the documentation for details about advanced indexing.
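A small made-up illustration of the difference between the two indexing styles:
>>> orig_array = np.arange(16).reshape(4, 4)
>>> indices_h, indices_w = [1, 3], [0, 2]
>>> orig_array[indices_h, indices_w]        # broadcast: picks (1,0) and (3,2)
array([ 4, 14])
>>> orig_array[indices_h][:, indices_w]     # all combinations, stays 2D
array([[ 4,  6],
       [12, 14]])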

Issue with numpy.concatenate

I have defined two 2x3 numpy arrays and concatenated them along axis 0:
a=numpy.array([[1,2,3],[4,5,6]])
b=numpy.array([[7,8,9],[10,11,12]])
C=numpy.concatenate((a,b),axis=0)
C becomes a 4x3 matrix.
Now I tried the same thing with 1x3 arrays:
a=numpy.array([1,2,3])
b=numpy.array([4,5,6])
c=numpy.concatenate((a,b),axis=0)
Now I was expecting a 2x3 matrix, but instead I got 1x6. I understand that vstack etc. will work, but I am curious as to why this is happening. And what am I doing wrong with numpy.concatenate?
Thanks for the reply. I can get the result as suggested by using 1x3 arrays and then concatenating, but the logic is that I have to add rows to an empty matrix at each iteration. I tried append as suggested:
testing = []
for i in range(3):
    testing = testing.append([1,2,3])
It gave an error that 'NoneType' object has no attribute 'append'. Further, if I use the 1x3-array logic with np.array([[1,2,3]]), how can I do this inside a for loop?
You didn't do anything wrong. numpy.concatenate joins a sequence of arrays along an existing axis: for your 2D arrays that axis holds rows, so the rows are stacked, while for 1D arrays the only axis holds individual values, so you simply get a longer 1D array.
So this is not concatenate's job; as you said, you can use np.vstack:
>>> c=numpy.vstack((a,b))
>>> c
array([[1, 2, 3],
       [4, 5, 6]])
Also, in your code, list.append adds an element to the list in place and returns None, so you cannot assign its result back to a variable; instead, just append to testing in each iteration:
testing = []
for i in range(3):
    testing.append([1,2,3])
Also, as a more concise way, you can create that list with a list comprehension:
testing=[[1,2,3] for _ in xrange(3)]
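Either way, once the rows are collected in a Python list, a single conversion at the end gives the 2D array you were trying to grow row by row:
>>> testing = [[1,2,3] for _ in range(3)]
>>> numpy.array(testing)
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])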
This is happening because you are concatenating along axis=0.
In your first example:
a=numpy.array([[1,2,3],[4,5,6]]) # 2 elements in 0th dimension
b=numpy.array([[7,8,9],[10,11,12]]) # 2 elements in 0th dimension
C=numpy.concatenate((a,b),axis=0) # 4 elements in 0th dimension
In your second example:
a=numpy.array([1,2,3]) # 3 elements in 0th dimension
b=numpy.array([4,5,6]) # 3 elements in 0th dimension
c=numpy.concatenate((a,b),axis=0) # 6 elements in 0th dimension
Edit:
Note that in your second example, you only have one-dimensional arrays.
In [35]: a=numpy.array([1,2,3])
In [36]: a.shape
Out[36]: (3,)
If the shape of the arrays was (1,3) you would get your expected result:
In [43]: a2=numpy.array([[1,2,3]])
In [44]: b2=numpy.array([[4,5,6]])
In [45]: numpy.concatenate((a2,b2), axis=0)
Out[45]:
array([[1, 2, 3],
       [4, 5, 6]])
