Assuming that I have a numpy array such as:
import numpy as np
arr = np.array([10,1,2,5,6,2,3,8])
How could I extract an array containing the indices of the elements smaller than 6 so I get the following result:
np.array([1,2,3,5,6])
I would like something that behaves like np.nonzero() but, instead of testing for nonzero values, tests for values smaller than x.
You can use numpy.flatnonzero on the boolean mask. It returns the indices that are non-zero in the flattened version of its argument:
np.flatnonzero(arr < 6)
# array([1, 2, 3, 5, 6])
Another option on 1d array is numpy.where:
np.where(arr < 6)[0]
# array([1, 2, 3, 5, 6])
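As a minimal sketch, the two approaches above can be generalised to an arbitrary threshold x. The helper name indices_below is hypothetical (it is not a NumPy function) and the input is assumed to be 1-D:

```python
import numpy as np

def indices_below(arr, x):
    """Return the indices of the elements of a 1-D array that are smaller than x.

    Hypothetical helper for illustration; it only wraps np.flatnonzero.
    """
    arr = np.asarray(arr)
    return np.flatnonzero(arr < x)

arr = np.array([10, 1, 2, 5, 6, 2, 3, 8])
idx = indices_below(arr, 6)

# np.where with a single argument returns a tuple holding one index array
# per dimension, so [0] picks the only entry for a 1-D input.
idx_where = np.where(arr < 6)[0]
```

Both idx and idx_where hold the same positions. flatnonzero returns the index array directly, while where needs the extra [0].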
If you want the values themselves rather than their indices, the simplest way is boolean indexing:
arr[arr<6]
I'd suggest a cleaner, self-explanatory way to do this:
First, build a boolean mask that marks where the condition holds:
>> indices = arr < 6
>> indices
>> [False, True, True, True, False, True, True, False]
Then, use the indices for indexing:
>> arr[indices]
>> [1, 2, 5, 2, 3]
or for finding the right position in the original array:
>> np.where(indices)[0]
>> [1, 2, 3, 5, 6]
Related
I have a simple numpy array. I want to select all elements except the 1st and 6th.
I tried:
temp = np.array([1,2,3,4,5,6,7,8,9])
t = temp[~[0,5]]
I get the following error:
TypeError: bad operand type for unary ~: 'list'
What is the correct way to do this?
You can use numpy.delete to delete elements at a specific index position:
t = np.delete(temp, [0, 5])
Or you can create a boolean array, which you can then negate:
bool_idx = np.zeros(len(temp), dtype=bool)
bool_idx[[0, 5]] = True
t = temp[~bool_idx]
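A minimal sketch of the mask approach wrapped in a function, checked against np.delete. The name drop_indices is hypothetical; it builds the "keep" mask directly instead of negating a "drop" mask:

```python
import numpy as np

def drop_indices(values, unwanted):
    """Return values without the positions listed in unwanted (hypothetical helper)."""
    values = np.asarray(values)
    keep = np.ones(values.shape[0], dtype=bool)  # start by keeping everything
    keep[unwanted] = False                       # switch off the unwanted positions
    return values[keep]

temp = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
t_mask = drop_indices(temp, [0, 5])
t_delete = np.delete(temp, [0, 5])
```

Both t_mask and t_delete contain the elements 2, 3, 4, 5, 7, 8, 9.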
You can't create the indices that way. Instead, you could create a range of numbers from 0 to temp.size and delete the unwanted indices:
In [19]: ind = np.delete(np.arange(temp.size), [0, 5])
In [21]: temp[ind]
Out[21]: array([2, 3, 4, 5, 7, 8, 9])
Or just build the index array directly:
In [16]: ind = np.concatenate((np.arange(1, 5), np.arange(6, temp.size)))
In [17]: temp[ind]
Out[17]: array([2, 3, 4, 5, 7, 8, 9])
You can use the np.r_ object, which concatenates the pieces you pass to it. Slice the array on either side of each unwanted index and join the slices:
np.r_[temp[1:5], temp[6:]]
The code above concatenates the two slices of the original array, so the result is the array without the elements at the specified indices.
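np.r_ also accepts slice notation directly, so it can build the index array instead of joining slices of the data. A minimal sketch, assuming the same temp array as in the question:

```python
import numpy as np

temp = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

# np.r_ expands each slice into a range and concatenates them,
# giving the positions to keep (everything except 0 and 5).
ind = np.r_[1:5, 6:temp.size]
t = temp[ind]
```

Here ind holds the positions 1, 2, 3, 4, 6, 7, 8, and t is the array without its 1st and 6th elements.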
I need to extract one element from each column of a matrix according to an index vector. Say:
index = [0,1,1]
matrix = [[1,4,7],[2,5,8],[3,6,9]]
The index vector tells me I need the first element from column 1, the second element from column 2, and the second element from column 3.
The output should be [1,5,8]. How can I write it out without explicit loop?
Thanks
You can use advanced indexing:
index = np.array([0,1,2])
matrix = np.array([[1,4,7],[2,5,8],[3,6,9]])
res = matrix[np.arange(matrix.shape[0]), index]
# array([1, 5, 9])
For your second example, reverse your indices:
index = np.array([0,1,1])
matrix = np.array([[1,4,7],[2,5,8],[3,6,9]])
res = matrix[index, np.arange(matrix.shape[1])]
# array([1, 5, 8])
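A minimal sketch of the same per-column selection using np.take_along_axis (available in NumPy 1.15 and later), shown next to the advanced-indexing version for comparison:

```python
import numpy as np

matrix = np.array([[1, 4, 7], [2, 5, 8], [3, 6, 9]])
index = np.array([0, 1, 1])  # row to pick in each column

# Advanced indexing: pair each row index with its column number.
res = matrix[index, np.arange(matrix.shape[1])]

# np.take_along_axis needs an index array with the same number of
# dimensions as matrix, so add a leading axis and drop it afterwards.
res_take = np.take_along_axis(matrix, index[np.newaxis, :], axis=0)[0]
```

Both res and res_take pick matrix[0, 0], matrix[1, 1] and matrix[1, 2].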
Since you're working with 2-dimensional matrices, I'd suggest using numpy. Then, in your case, you can just use np.diag:
>>> import numpy as np
>>> matrix = np.array([[1,4,7],[2,5,8],[3,6,9]])
>>> matrix
array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])
>>> np.diag(matrix)
array([1, 5, 9])
However, @jpp's solution is more general. My solution is only useful if what you really want is the diagonal of your matrix.
val = [matrix[row][col] for col, row in enumerate(index)]  # [1, 5, 8] for index = [0, 1, 1]
Suppose I have an N-dimensional np.array (or just a list) and a list of N indices. What is the preferred/efficient way to index the array without using loops?
# 4D array with shape of (2, 3, 4, 5)
arr = np.random.random((2, 3, 4, 5))
index = [0, 2, 1, 3]
result = ??? # Equivalent to arr[0, 2, 1, 3]
Additionally, when only three indices are supplied, the result should be an array over the last dimension.
index = [0, 2, 1]
result2 = ??? # Equivalent to arr[0, 2, 1]
Please note that I am not able to just index with the usual syntax because the implementation has to handle arrays of different shapes.
I am aware that NumPy supports indexing by an array, but that behaves differently: it cherry-picks values from the array rather than indexing by dimension (https://docs.scipy.org/doc/numpy/user/basics.indexing.html).
Per the docs:
If one supplies to the index a tuple, the tuple will be interpreted as a list of indices.
Therefore, change index to a tuple:
In [46]: np.allclose(arr[tuple([0,2,1])], arr[0,2,1])
Out[46]: True
In [47]: np.allclose(arr[tuple([0,2,1,3])], arr[0,2,1,3])
Out[47]: True
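A minimal sketch of the tuple conversion wrapped in a function, so it works for arrays of any shape. The name index_by_list is hypothetical:

```python
import numpy as np

def index_by_list(arr, index):
    """Index arr with one integer per leading dimension (hypothetical helper)."""
    # A tuple is interpreted as one index per dimension, whereas a list
    # would trigger advanced indexing along the first axis only.
    return arr[tuple(index)]

arr = np.random.random((2, 3, 4, 5))

result = index_by_list(arr, [0, 2, 1, 3])  # scalar, same as arr[0, 2, 1, 3]
result2 = index_by_list(arr, [0, 2, 1])    # 1-D array, same as arr[0, 2, 1]
```

With fewer indices than dimensions, the remaining trailing dimensions are returned whole, which is why result2 has shape (5,).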
I have a multidimensional array, say of shape (4, 3) that looks like
a = np.array([(1,2,3),(4,5,6),(7,8,9),(10,11,12)])
If I have a list of fixed conditions
conditions = [True, False, False, True]
How can I return the list
array([(1,2,3),(10,11,12)])
Using np.extract returns
>>> np.extract(conditions, a)
array([1, 4])
which only returns the first element along each nested array, as opposed to the array itself. I wasn't sure if or how I could do this with np.where. Any help is much appreciated, thanks!
Let's define your variables:
>>> import numpy as np
>>> a = np.array([(1,2,3),(4,5,6),(7,8,9),(10,11,12)])
>>> conditions = [True, False, False, True]
Now, let's select the elements that you want:
>>> a[np.array(conditions)]
array([[ 1,  2,  3],
       [10, 11, 12]])
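A minimal sketch of the row selection with an explicit boolean mask, alongside np.compress, which selects slices along a chosen axis. This assumes conditions has one entry per row of a:

```python
import numpy as np

a = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12)])
conditions = [True, False, False, True]

# Explicit boolean array with one entry per row of a.
mask = np.asarray(conditions, dtype=bool)
rows = a[mask]

# np.compress keeps the slices along axis 0 where the condition is True.
rows_compress = np.compress(conditions, a, axis=0)
```

Both rows and rows_compress contain the first and last rows of a.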
Aside
Note that in older NumPy versions the simpler a[conditions] is ambiguous:
>>> a[conditions]
-c:1: FutureWarning: in the future, boolean array-likes will be handled as a boolean array index
array([[4, 5, 6],
       [1, 2, 3],
       [1, 2, 3],
       [4, 5, 6]])
As you can see, conditions is treated here as a list of integer-like index values (True as 1, False as 0), which is not what we wanted.
You can use simple indexing with np.where. It's more or less made for this situation:
>>> a[np.where(conditions)]
array([[ 1,  2,  3],
       [10, 11, 12]])
I have seen the post Difference between nonzero(a), where(a) and argwhere(a). When to use which? and I don't really understand the use of the where function from numpy module.
For example I have this code
import numpy as np
Z =np.array(
[[1,0,1,1,0,0],
[0,0,0,1,0,0],
[0,1,0,1,0,0],
[0,0,1,1,0,0],
[0,1,0,0,0,0],
[0,0,0,0,0,0]])
print(Z)
print(np.where(Z))
Which gives:
(array([0, 0, 0, 1, 2, 2, 3, 3, 4], dtype=int64),
array([0, 2, 3, 3, 1, 3, 2, 3, 1], dtype=int64))
The definition of the where function is:
Return elements, either from x or y, depending on condition. But that doesn't make sense to me either.
So what does the output exactly mean?
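When called with a single argument, np.where returns the indices where the given condition is met. In your case, you're asking for the indices where the value in Z is not 0 (i.e. any non-zero value is treated as True). The two arrays hold the row and column indices respectively, so for Z the matching positions are: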
(0, 0) # top left
(0, 2) # third element in the first row
(0, 3) # fourth element in the first row
(1, 3) # fourth element in the second row
... # and so on
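A minimal sketch of how to turn the two index arrays into explicit (row, column) pairs, using the Z array from the question. np.argwhere gives the same coordinates as one 2-D array:

```python
import numpy as np

Z = np.array([[1, 0, 1, 1, 0, 0],
              [0, 0, 0, 1, 0, 0],
              [0, 1, 0, 1, 0, 0],
              [0, 0, 1, 1, 0, 0],
              [0, 1, 0, 0, 0, 0],
              [0, 0, 0, 0, 0, 0]])

# np.where(Z) returns one index array per dimension: rows first, then columns.
rows, cols = np.where(Z)

# Pair them up to get one (row, column) tuple per non-zero element.
pairs = list(zip(rows.tolist(), cols.tolist()))

# np.argwhere returns the same coordinates, one row per non-zero element.
coords = np.argwhere(Z)
```

The first entries of pairs are (0, 0), (0, 2), (0, 3) and (1, 3), matching the list above.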
np.where starts to make sense in the following scenarios:
a = np.arange(10)
np.where(a > 5) # give me all indices where the value of a is bigger than 5
# a > 5 is a boolean mask like [False, False, ..., True, True, True]
# (array([6, 7, 8, 9], dtype=int64),)
Hope that helps.
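The definition quoted in the question ("Return elements, either from x or y, depending on condition") describes the three-argument form of np.where. A minimal sketch contrasting it with the one-argument form:

```python
import numpy as np

a = np.arange(10)

# One argument: indices where the condition holds.
idx = np.where(a > 5)[0]

# Three arguments: take the value from the second argument where the
# condition is True, otherwise from the third.
clipped = np.where(a > 5, a, 0)
```

Here idx contains the positions 6 to 9, while clipped has the same length as a, with every value not greater than 5 replaced by 0.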