Find indices of the elements smaller than x in a numpy array - python

Assuming that I have a numpy array such as:
import numpy as np
arr = np.array([10,1,2,5,6,2,3,8])
How could I extract an array containing the indices of the elements smaller than 6 so I get the following result:
np.array([1,2,3,5,6])
I would like something that behaves like np.nonzero(), but instead of testing for nonzero values, it tests for values smaller than x.

You can use numpy.flatnonzero on the boolean mask; it returns the indices that are non-zero in the flattened version of the array:
np.flatnonzero(arr < 6)
# array([1, 2, 3, 5, 6])
Another option on 1d array is numpy.where:
np.where(arr < 6)[0]
# array([1, 2, 3, 5, 6])
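For a one-dimensional array, the calls above should all agree. Here is a minimal sketch that puts them side by side; the threshold name x is only illustrative and not part of any NumPy API:

```python
import numpy as np

arr = np.array([10, 1, 2, 5, 6, 2, 3, 8])
x = 6  # illustrative threshold

mask = arr < x                     # boolean mask, same length as arr
idx_flat = np.flatnonzero(mask)    # 1-D array of indices where mask is True
idx_where = np.where(mask)[0]      # np.where returns a tuple, one array per axis
idx_nonzero = np.nonzero(mask)[0]  # same tuple-of-arrays convention as np.where

print(idx_flat)       # [1 2 3 5 6]
print(arr[idx_flat])  # [1 2 5 2 3]
```

On arrays with more than one dimension the results differ: np.flatnonzero indexes into the flattened array, while np.where and np.nonzero return one index array per axis.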

The simplest way to get the matching elements (note that this returns the values themselves, not their indices) is boolean indexing:
arr[arr<6]

I'd suggest a cleaner and more self-explanatory way to do so:
First, build a boolean mask marking where the condition holds:
>> indices = arr < 6
>> indices
>> [False, True, True, True, False, True, True, False]
Then, use the mask for indexing:
>> arr[indices]
>> [1, 2, 5, 2, 3]
or for finding the right position in the original array:
>> np.where(indices)[0]
>> [1, 2, 3, 5, 6]

Related

Numpy array : NOT select specific rows or columns

I have a simple numpy array. I want to select all rows but 1st and 6th
I tried:
temp = np.array([1,2,3,4,5,6,7,8,9])
t = temp[~[0,5]]
I get the following error:
TypeError: bad operand type for unary ~: 'list'
What is the correct way to do this?
You can use numpy.delete to delete elements at a specific index position:
t = np.delete(temp, [0, 5])
Or you can create a boolean array, which makes it possible to negate the selection:
bool_idx = np.zeros(len(temp), dtype=bool)
bool_idx[[0, 5]] = True
t = temp[~bool_idx]
You can't create the indices that way. Instead, you could create a range of numbers from 0 to temp.size - 1 and delete the unwanted indices:
In [19]: ind = np.delete(np.arange(temp.size), [0, 5])
In [21]: temp[ind]
Out[21]: array([2, 3, 4, 5, 7, 8, 9])
Or build the index array directly, as follows:
In [16]: ind = np.concatenate((np.arange(1, 5), np.arange(6, temp.size)))
In [17]: temp[ind]
Out[17]: array([2, 3, 4, 5, 7, 8, 9])
You can use the np.r_ object, which concatenates the slices given to it into a single array:
np.r_[temp[1:5], temp[6:]]
The code above concatenates two slices of the original array, so the result is the original array without the elements at the specified indices.
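The approaches above should all give the same result. Here is a small sketch comparing them and extending np.delete to the rows of a 2-D array, since the question mentions rows and columns. The names drop, keep and grid are illustrative only:

```python
import numpy as np

temp = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
drop = [0, 5]  # positions to exclude (illustrative name)

# 1) np.delete returns a copy without the given positions
by_delete = np.delete(temp, drop)

# 2) boolean mask: start with everything kept, then switch off the unwanted positions
keep = np.ones(temp.size, dtype=bool)
keep[drop] = False
by_mask = temp[keep]

# 3) np.r_ concatenates the slices on either side of the dropped positions
by_r = np.r_[temp[1:5], temp[6:]]

print(by_delete)  # [2 3 4 5 7 8 9]

# For a 2-D array, pass axis=0 to drop rows (axis=1 would drop columns)
grid = np.arange(18).reshape(6, 3)
rows_kept = np.delete(grid, drop, axis=0)
print(rows_kept.shape)  # (4, 3)
```

The mask version has the advantage that keep can be reused or combined with other conditions, whereas np.r_ needs the slice boundaries written out by hand.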

python, how to select element from each column of matrix

I need to extract one element from each column of a matrix according to an index vector. Say:
index = [0,1,1]
matrix = [[1,4,7],[2,5,8],[3,6,9]]
The index vector tells me I need the first element from column 1, the second element from column 2, and the second element from column 3.
The output should be [1,5,8]. How can I write it out without explicit loop?
Thanks
You can use advanced indexing:
index = np.array([0,1,2])
matrix = np.array([[1,4,7],[2,5,8],[3,6,9]])
res = matrix[np.arange(matrix.shape[0]), index]
# array([1, 5, 9])
For your second example, reverse your indices:
index = np.array([0,1,1])
matrix = np.array([[1,4,7],[2,5,8],[3,6,9]])
res = matrix[index, np.arange(matrix.shape[1])]
# array([1, 5, 8])
Since you're working with 2-dimensional matrices, I'd suggest using numpy. Then, in your case, you can just use np.diag:
>>> import numpy as np
>>> matrix = np.array([[1,4,7],[2,5,8],[3,6,9]])
>>> matrix
array([[1, 4, 7],
       [2, 5, 8],
       [3, 6, 9]])
>>> np.diag(matrix)
array([1, 5, 9])
However, @jpp's solution is more generalizable. My solution is useful in your case because you really just want the diagonal of your matrix.
# index[i] is the row to pick in column i
val = [matrix[index[i]][i] for i in range(len(index))]
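Another option, assuming NumPy 1.15 or later, is np.take_along_axis. This is only a sketch of the same column-wise selection, not a replacement for the advanced-indexing answer:

```python
import numpy as np

matrix = np.array([[1, 4, 7], [2, 5, 8], [3, 6, 9]])
index = np.array([0, 1, 1])  # row to pick in each column

# The index array must have the same number of dimensions as matrix,
# so add a leading axis of length 1, then drop it again from the result.
picked = np.take_along_axis(matrix, index[np.newaxis, :], axis=0)[0]

print(picked)  # [1 5 8]
```

This gives the same values as matrix[index, np.arange(matrix.shape[1])].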

How to index an np.array with a list of indices in Python

Suppose I have an N-dimensional np.array (or just a list) and a list of N indices. What is the preferred/efficient way to index the array without using loops?
# 4D array with shape of (2, 3, 4, 5)
arr = np.random.random((2, 3, 4, 5))
index = [0, 2, 1, 3]
result = ??? # Equivalent to arr[0, 2, 1, 3]
Additionally, supplying only a 3D index the result should be an array of the last dimension.
index = [0, 2, 1]
result2 = ??? # Equivalent to arr[0, 2, 1]
Please note that I am not able to just index with the usual syntax because the implementation has to handle arrays of different shapes.
I am aware that NumPy supports indexing by an array, but that behaves differently, as it cherry-picks values from the array rather than indexing by dimension (https://docs.scipy.org/doc/numpy/user/basics.indexing.html).
Per the docs:
If one supplies to the index a tuple, the tuple will be interpreted as a list of indices.
Therefore, change index to a tuple:
In [46]: np.allclose(arr[tuple([0,2,1])], arr[0,2,1])
Out[46]: True
In [47]: np.allclose(arr[tuple([0,2,1,3])], arr[0,2,1,3])
Out[47]: True
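Wrapped in a function, the tuple conversion might look like the sketch below. The helper name index_by_list is hypothetical and not part of NumPy:

```python
import numpy as np

def index_by_list(arr, index):
    """Hypothetical helper: index arr along its leading axes with a list of ints."""
    # A tuple is interpreted as one index per axis; a list would trigger
    # advanced (cherry-picking) indexing along the first axis instead.
    return arr[tuple(index)]

arr = np.random.random((2, 3, 4, 5))

full = index_by_list(arr, [0, 2, 1, 3])  # all four axes given: a single value
partial = index_by_list(arr, [0, 2, 1])  # three axes given: the last axis remains

print(partial.shape)  # (5,)
```

Because the function only relies on tuple(index), it should work for arrays of any number of dimensions, as long as the index list is no longer than arr.ndim.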

How do I return a nonflat numpy array selecting elements given a set of conditions?

I have a multidimensional array, say of shape (4, 3) that looks like
a = np.array([(1,2,3),(4,5,6),(7,8,9),(10,11,12)])
If I have a list of fixed conditions
conditions = [True, False, False, True]
How can I return the list
array([(1,2,3),(10,11,12)])
Using np.extract returns
>>> np.extract(conditions, a)
array([1, 4])
which only returns the first element along each nested array, as opposed to the array itself. I wasn't sure if or how I could do this with np.where. Any help is much appreciated, thanks!
Let's define your variables:
>>> import numpy as np
>>> a = np.array([(1,2,3),(4,5,6),(7,8,9),(10,11,12)])
>>> conditions = [True, False, False, True]
Now, let's select the elements that you want:
>>> a[np.array(conditions)]
array([[ 1,  2,  3],
       [10, 11, 12]])
Aside
Note that the simpler a[conditions] has some ambiguity:
>>> a[conditions]
-c:1: FutureWarning: in the future, boolean array-likes will be handled as a boolean array index
array([[4, 5, 6],
[1, 2, 3],
[1, 2, 3],
[4, 5, 6]])
As you can see, conditions are treated here as (integer-like) index values, which is not what we wanted. As the warning indicates, later NumPy versions treat a boolean list as a boolean mask instead.
You can also combine plain indexing with np.where, which turns the conditions into row indices:
>>> a[np.where(conditions)]
array([[ 1,  2,  3],
       [10, 11, 12]])
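To sidestep the ambiguity entirely, the conditions can be converted to a boolean array explicitly. A short sketch follows, which also shows np.compress with axis=0 as an alternative that keeps whole rows:

```python
import numpy as np

a = np.array([(1, 2, 3), (4, 5, 6), (7, 8, 9), (10, 11, 12)])
conditions = [True, False, False, True]

# Explicit boolean array: one entry per row of a
mask = np.asarray(conditions, dtype=bool)

rows = a[mask]                                # boolean indexing along the first axis
rows_compress = np.compress(mask, a, axis=0)  # same selection via np.compress

print(rows)
# [[ 1  2  3]
#  [10 11 12]]
```

Unlike np.extract, which works on the flattened array, np.compress with axis=0 selects along rows, so the result stays two-dimensional.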

Basics of numpy where function, what does it do to the array?

I have seen the post Difference between nonzero(a), where(a) and argwhere(a). When to use which? and I don't really understand the use of the where function from numpy module.
For example I have this code
import numpy as np
Z =np.array(
[[1,0,1,1,0,0],
[0,0,0,1,0,0],
[0,1,0,1,0,0],
[0,0,1,1,0,0],
[0,1,0,0,0,0],
[0,0,0,0,0,0]])
print(Z)
print(np.where(Z))
Which gives:
(array([0, 0, 0, 1, 2, 2, 3, 3, 4], dtype=int64),
array([0, 2, 3, 3, 1, 3, 2, 3, 1], dtype=int64))
The definition of the where function is:
Return elements, either from x or y, depending on condition.
But that doesn't make sense to me either.
So what does the output exactly mean?
np.where returns the indices where a given condition is met. In your case, you're asking for the indices where the value in Z is not 0 (i.e., any non-zero value is treated as True). The first returned array holds the row indices and the second the column indices, so for Z this results in:
(0, 0) # top left
(0, 2) # third element in the first row
(0, 3) # fourth element in the first row
(1, 3) # fourth element in the second row
... # and so on
np.where starts to make sense in the following scenarios:
a = np.arange(10)
np.where(a > 5) # give me all indices where the value of a is bigger than 5
# a > 5 is a boolean mask like [False, False, ..., True, True, True]
# (array([6, 7, 8, 9], dtype=int64),)
Hope that helps.
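The definition quoted in the question ("Return elements, either from x or y") describes the three-argument form, while the code in the question uses the one-argument form. Here is a small sketch of both, using a smaller array than the original Z for brevity:

```python
import numpy as np

a = np.arange(10)

# Three-argument form: take a where the condition holds, -1 elsewhere
replaced = np.where(a > 5, a, -1)
print(replaced)  # [-1 -1 -1 -1 -1 -1  6  7  8  9]

# One-argument form on a 2-D array: one index array per axis
Z = np.array([[1, 0, 1],
              [0, 0, 1]])
rows, cols = np.where(Z)
print(rows)  # [0 0 1]
print(cols)  # [0 2 2]

# np.argwhere gives the same positions as (row, col) pairs
pairs = np.argwhere(Z)
print(pairs.tolist())  # [[0, 0], [0, 2], [1, 2]]
```

Reading rows and cols together position by position gives the coordinates of each non-zero entry, which is what the tuple in the question's output represents.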
