Finding index of maximum value in array with NumPy - python

I would like to find the maximum of a float64 array, excluding NaN values.
I saw the np.nanmax function, but it doesn't return the index of the value it finds.
It seems odd to have to scan the array again for that value, especially since the function must have used the index internally. Searching a second time also seems error-prone.
Isn't there a way to get the index directly?

NumPy has an argmax function that returns just that, although you will have to deal with the NaNs manually. NaNs always get sorted to the end of an array, so with that in mind you can do:
a = np.random.rand(10000)
a[np.random.randint(10000, size=(10,))] = np.nan
a = a.reshape(100, 100)
def nanargmax(a):
    idx = np.argmax(a, axis=None)
    multi_idx = np.unravel_index(idx, a.shape)
    if np.isnan(a[multi_idx]):
        nan_count = np.sum(np.isnan(a))
        # In numpy < 1.8 use idx = np.argsort(a, axis=None)[-nan_count-1]
        idx = np.argpartition(a, -nan_count-1, axis=None)[-nan_count-1]
        multi_idx = np.unravel_index(idx, a.shape)
    return multi_idx
>>> nanargmax(a)
(20, 93)
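A different way to "deal with the NaNs manually" is to mask them before calling argmax. The sketch below is only an illustration: the name nanargmax_masked is hypothetical, and it assumes the array holds at least one non-NaN value (an all-NaN input would silently return the first position).

```python
import numpy as np

def nanargmax_masked(a):
    """Return the multi-index of the largest non-NaN entry of `a`."""
    a = np.asarray(a, dtype=float)
    # Replace NaNs with -inf so they can never win the comparison.
    masked = np.where(np.isnan(a), -np.inf, a)
    flat_idx = np.argmax(masked)
    return np.unravel_index(flat_idx, a.shape)

demo = np.array([[np.nan, 4.0], [2.0, 7.5]])
result = nanargmax_masked(demo)
print(result)
```

This makes a temporary copy of the array, so it trades some memory for simplicity.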

You should use np.where
In [17]: a=np.random.uniform(0, 10, size=10)
In [18]: a
Out[18]:
array([ 1.43249468,  4.93950873,  7.22094395,  1.20248629,  4.66783985,
        6.17578054,  4.6542771 ,  7.09244492,  7.58580515,  5.72501954])
In [20]: np.where(a==a.max())
Out[20]: (array([8]),)
This also works for 2D arrays; the returned value is the index.
Here we create an array holding the values 0 to 8:
x = np.arange(9.).reshape(3, 3)
This returns the indices of the items that equal 5:
In [34]: np.where(x == 5)
Out[34]: (array([1]), array([2])) # the first one is the row index, the second is the column
You can use this value directly to index your array:
In [35]: x[np.where(x == 5)]
Out[35]: array([ 5.])
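One caveat for the original question: when the array contains NaN, a.max() is itself NaN and the comparison matches nothing. A minimal sketch of the difference, using a small made-up array and np.nanmax in place of a.max():

```python
import numpy as np

a = np.array([1.5, np.nan, 7.25, 3.0, 7.25])

# a.max() is NaN here, and NaN never compares equal, so nothing matches.
plain = np.where(a == a.max())[0]

# np.nanmax skips the NaN, so every position holding the maximum is reported.
nan_safe = np.where(a == np.nanmax(a))[0]

print(plain)
print(nan_safe)
```

Unlike argmax, this approach reports all positions that tie for the maximum.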

You want to use numpy.nanargmax
The documentation provides some clear examples.
a = np.array([[np.nan, 4], [2, 3]])
print(np.argmax(a))
0
print(np.nanargmax(a))
1
np.nanargmax(a, axis=0)
array([1, 0])
np.nanargmax(a, axis=1)
array([1, 1])
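Without an axis argument, np.nanargmax returns an index into the flattened array. If a (row, column) pair is wanted instead, it can be combined with np.unravel_index, as in this short sketch. Note that np.nanargmax raises ValueError when a slice contains only NaNs.

```python
import numpy as np

a = np.array([[np.nan, 4.0], [2.0, 3.0]])

flat_idx = np.nanargmax(a)                       # index into the flattened array
row, col = np.unravel_index(flat_idx, a.shape)   # convert to a 2D position

print(flat_idx, (row, col), a[row, col])
```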

Related

How to extract numpy array stored in tuple?

Let's consider a very simple example:
import numpy as np
a = np.array([0, 1, 2])
print(np.where(a < -1))
(array([], dtype=int64),)
print(np.where(a < 2))
(array([0, 1]),)
I'm wondering if it's possible to get the length of those arrays, i.e. I want to know that the first array is empty and the second is not. Usually this is easily done with the len function, but here the NumPy array is stored in a tuple. Do you know how it can be done?
Just use this:
import numpy as np
a = np.array([0, 1, 2])
x = np.where(a < 2)[0]
print(len(x))
Outputs 2
To find the number of values in the array satisfying the predicate, you can skip np.where and use np.count_nonzero instead:
a = np.array([0, 1, 2])
print(np.count_nonzero(a < -1))
>>> 0
print(np.count_nonzero(a < 2))
>>> 2
If you need to know whether there are any values in a that satisfy the predicate, but not how many there are, a cleaner way of doing so is with np.any:
a = np.array([0, 1, 2])
print(np.any(a < -1))
>>> False
print(np.any(a < 2))
>>> True
np.where takes three arguments: condition, x, and y, where the last two are arrays and are optional. When they are provided, the function returns elements from x at positions where condition is True and elements from y elsewhere. When only condition is provided, it acts like np.asarray(condition).nonzero() and returns a tuple, as in your case. For more details, see the Notes section of the np.where documentation.
Alternatively, because you only need the number of elements for which the condition is True, you can simply use np.sum(condition):
a = np.array([0, 1, 2])
print(np.sum(a < -1))
>>> 0
print(np.sum(a < 2))
>>> 2
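The two calling conventions of np.where described above can be compared side by side. This is a small sketch; the fill value -1 is an arbitrary choice for illustration.

```python
import numpy as np

a = np.array([0, 1, 2])

# Three-argument form: take from a where the condition holds, else use -1.
picked = np.where(a < 2, a, -1)

# One-argument form: a tuple with one index array per dimension,
# so a 1-D input can be unpacked into a single array.
(indices,) = np.where(a < 2)

print(picked)
print(indices, indices.size)
```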

Numpy 2d array union when guaranteed nonzero equality

I have two NxN NumPy arrays of equal size.
If a given row and column in the first array is nonzero, then it is guaranteed that we either have the same value in the same row and column of the other array, or that we have a zero there.
If a given row and column in the first array is zero, then we can have either a zero or a nonzero value in that row and column in the other array.
I would like to combine both arrays such that for every [row,col], if one array has a value of zero and the other has a nonzero value, then my second array will be modified (if necessary) to have the nonzero value.
And, if they both have a nonzero value, (which is guaranteed to be the same value), then there will be no modification for that row,column - it stays the same.
Example:
array 1:
[[0,9],[2,0]]
array 2:
[[0,0],[2,2]]
After doing my "union", I want array 2 to be:
[[0,9],[2,2]]
What is a fast way to do this for large matrices? Thank you.
All you want to do is replace the zeros in the second array with the items at the same index in the first array. You can do the following:
mask = arr2 == 0
arr2[mask] = arr1[mask]
Demo:
In [7]: arr1 = np.array([[0,9],[2,0]])
In [8]: arr2 = np.array([[0,0],[2,2]])
In [9]: mask = arr2 == 0
In [10]: arr2[mask] = arr1[mask]
In [11]: arr2
Out[11]:
array([[0, 9],
       [2, 2]])
Since you are asking for "fast" you may be interested in np.copyto:
>>> from timeit import timeit
>>> a = np.random.randint(0, 2, (100, 100))
>>> b = np.random.randint(-1, 1, (100, 100))
>>>
>>> timeit("bk = b.copy(); mask=bk==0; bk[mask] = a[mask]", globals=globals(), number=10000)
1.3142543959984323
>>> timeit("bp = b.copy(); np.copyto(bp, a, where=bp==0)", globals=globals(), number=10000)
0.7330851459992118
>>>
# check results are the same
>>> bk = b.copy(); mask=bk==0; bk[mask] = a[mask]
>>> bp = b.copy(); np.copyto(bp, a, where=bp==0)
>>> np.all(bk==bp)
True
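If the second array should not be modified in place, the same union can be built as a new array with the three-argument form of np.where. A minimal sketch using the arrays from the question:

```python
import numpy as np

arr1 = np.array([[0, 9], [2, 0]])
arr2 = np.array([[0, 0], [2, 2]])

# Take arr1 wherever arr2 is zero, otherwise keep arr2; neither input changes.
merged = np.where(arr2 == 0, arr1, arr2)
print(merged)
```

This allocates a new array, so the in-place variants above are likely the better fit when memory matters.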

Locate NumPy indices based on closest value in 2D array with unmatched dimensions

Edit: I made a mistake in the original version. The dimensions of the arrays are unequal now.
This is a stupid question but I can't find the right answer.
How do you index the closest number in a 2D NumPy array? Let's say we have
e = np.array([[1, 2], [4, 5, 6]])
I want to locate the indices of the values closest to 2, so that it returns
array([1, 0])
Many thanks!
Usually you would use np.argwhere(e == 2):
In [4]: e = np.array([[1,2,3],[4,5,6]])
In [6]: np.argwhere(e == 2)
Out[6]: array([[0, 1]])
In case you really need the output you specified, you have to add an extra [0]:
In [7]: np.argwhere(e == 2)[0]
Out[7]: array([0, 1])
However, the input you provided is not a standard numeric array but an object array because len(e[0]) != len(e[1]):
In [1]: e = np.array([[1,2],[4,5,6]])
In [3]: e
Out[3]: array([list([1, 2]), list([4, 5, 6])], dtype=object)
This makes numpy much less useful and efficient. You would have to resort to something like:
In [26]: res = []
    ...: for i, f in enumerate(e):
    ...:     g = np.array(f)
    ...:     w = np.argwhere(g==2)
    ...:     if len(w):
    ...:         # v has shape (1,), so take its scalar to keep rows homogeneous
    ...:         res += [(i, v[0]) for v in w]
    ...: res = np.array(res)
Assuming the ragged input was a typo, and you are interested in the value closest to 2 even if 2 itself is not present, you would do something like:
In [35]: np.unravel_index((np.abs(e - 2.2)).argmin(), e.shape)
Out[35]: (0, 1)
Here I chose 2.2 as an example value.
This can be done by defining a function that works on a 1D array and applying it over the rows of the 2D array:
e = np.array([[1,2,3], [4,5,6]])
# function to find position of nearest value in 1D array
def find_nearest(a, val):
    return np.abs(a - val).argmin()
# apply it
np.apply_along_axis(find_nearest, axis = 1, arr = e, val = 2)
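For the rows of unequal length in the edited question, np.apply_along_axis cannot be used, because the input is not a rectangular array. One possible workaround, sketched here with a hypothetical helper named nearest_per_row, is to apply the same per-row search in a plain Python loop:

```python
import numpy as np

def nearest_per_row(rows, val):
    """For each row, return the position of the entry closest to `val`."""
    # Each row is converted separately, so rows may differ in length.
    return np.array([np.abs(np.asarray(r) - val).argmin() for r in rows])

ragged = [[1, 2], [4, 5, 6]]
result = nearest_per_row(ragged, 2)
print(result)
```

This assumes every row is non-empty; argmin raises ValueError on an empty row.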

delete all columns of a dimension except for a specific column

I want to make a function which takes an n-dimensional array, the dimension, and the column index, and returns the (n-1)-dimensional array obtained by removing all the other columns of that specific dimension.
Here is the code I am using now
a = np.arange(6).reshape((2, 3)) # the n-dimensional array
axisApplied = 1
colToKeep = 0
colsToDelete = np.delete(np.arange(a.shape[axisApplied]), colToKeep)
a = np.squeeze(np.delete(a, colsToDelete, axisApplied), axis=axisApplied)
print(a)
# [0 3]
Note that I have to manually calculate the n-1 indices (the complement of the specific column index) to use np.delete(), and I am wondering whether there is a more convenient way to achieve my goal, e.g. specify which column to keep directly.
Thank you for reading; I welcome any suggestions.
In [1]: arr = np.arange(6).reshape(2,3)
In [2]: arr
Out[2]:
array([[0, 1, 2],
       [3, 4, 5]])
Simple indexing:
In [3]: arr[:,0]
Out[3]: array([0, 3])
Or, if you need to use the general axis parameter, try take:
In [4]: np.take(arr,0,axis=1)
Out[4]: array([0, 3])
Picking one element, or a list of elements, along an axis is a lot easier than deleting some. Look at the code for np.delete.
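Wrapped in a function of the kind the question asks for, the take approach could look like the sketch below. The name keep_only is hypothetical; the only NumPy call involved is np.take with a scalar index, which drops the selected axis from the result.

```python
import numpy as np

def keep_only(arr, index, axis):
    """Keep a single position along `axis` and drop that axis from the result."""
    # A scalar index makes np.take remove the axis, so no squeeze is needed.
    return np.take(arr, index, axis=axis)

a = np.arange(24).reshape(2, 3, 4)
sub = keep_only(a, 1, axis=1)
print(sub.shape)
```

For this 3D input the result matches a[:, 1, :], but the function works without knowing the number of dimensions in advance.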

numpy-equivalent of list.pop?

Is there a numpy method which is equivalent to the builtin pop for python lists?
Popping obviously doesn't work on numpy arrays, and I want to avoid a list conversion.
There is no pop method for NumPy arrays, but you could just use basic slicing (which would be efficient since it returns a view, not a copy):
In [104]: y = np.arange(5); y
Out[105]: array([0, 1, 2, 3, 4])
In [106]: last, y = y[-1], y[:-1]
In [107]: last, y
Out[107]: (4, array([0, 1, 2, 3]))
If there were a pop method it would return the last value in y and modify y.
Above,
last, y = y[-1], y[:-1]
assigns the last value to the variable last and rebinds y to a view of the remaining elements.
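The view behaviour mentioned above can be checked with np.shares_memory. In this sketch the extra name original is introduced only to keep a reference to the full array:

```python
import numpy as np

original = np.arange(5)
y = original

last, y = y[-1], y[:-1]

print(last, y)
# The shortened y is a view: it shares its buffer with the full array,
# which itself still holds all five elements.
print(np.shares_memory(y, original))
print(original)
```

One consequence is that the underlying buffer is not freed while any such view is alive, and writing through y also changes original.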
Here is one example using numpy.delete():
import numpy as np
arr = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print(arr)
# array([[ 1, 2, 3, 4],
# [ 5, 6, 7, 8],
# [ 9, 10, 11, 12]])
arr = np.delete(arr, 1, 0)
print(arr)
# array([[ 1, 2, 3, 4],
# [ 9, 10, 11, 12]])
Pop doesn't exist for NumPy arrays, but you can use NumPy indexing in combination with array restructuring, for example hstack/vstack or numpy.delete(), to emulate popping.
Here are some example functions I can think of (which apparently don't work when the index is -1, but you can fix this with a simple conditional):
def poprow(my_array,pr):
    """ row popping in numpy arrays
    Input: my_array - NumPy array, pr: row index to pop out
    Output: [new_array,popped_row] """
    i = pr
    pop = my_array[i]
    new_array = np.vstack((my_array[:i],my_array[i+1:]))
    return [new_array,pop]
def popcol(my_array,pc):
    """ column popping in numpy arrays
    Input: my_array: NumPy array, pc: column index to pop out
    Output: [new_array,popped_col] """
    i = pc
    pop = my_array[:,i]
    new_array = np.hstack((my_array[:,:i],my_array[:,i+1:]))
    return [new_array,pop]
This returns the array without the popped row/column, as well as the popped row/column separately:
>>> A = np.array([[1,2,3],[4,5,6]])
>>> [A,poparow] = poprow(A,0)
>>> poparow
array([1, 2, 3])
>>> A = np.array([[1,2,3],[4,5,6]])
>>> [A,popacol] = popcol(A,2)
>>> popacol
array([3, 6])
There isn't any pop() method for NumPy arrays, unlike lists. Here are some alternatives you can try.
Using basic slicing:
>>> x = np.array([1,2,3,4,5])
>>> x = x[:-1]; x
array([1, 2, 3, 4])
Or by using delete():
Syntax - np.delete(arr, obj, axis=None)
arr: Input array
obj: Index (or indices) of the row or column to delete
axis: Axis along which to delete
>>> x = np.array([1,2,3,4,5])
>>> x = np.delete(x, len(x)-1, 0); x
array([1, 2, 3, 4])
The important thing is that pop takes one element out of the original array and removes it.
If you don't mind a simple implementation that does both in a single statement, the following code will do what you want.
import numpy as np
a = np.arange(0, 3)
i = 0
selected, others = a[i], np.delete(a, i)
print(selected)
print(others)
# result:
# 0
# [1 2]
The most 'elegant' solution for retrieving and removing a random item in Numpy is this:
import numpy as np
import random
arr = np.array([1, 3, 5, 2, 8, 7])
element = random.choice(arr)
elementIndex = np.where(arr == element)[0][0]
arr = np.delete(arr, elementIndex)
For curious coders:
The np.where() method returns a tuple with one index array per dimension. For a 2D array, the first array holds the row indices of the matching elements and the second holds the column indices. Our array is 1D, so the tuple holds a single array, and we take its first element.
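A variation that avoids the value lookup is to draw a random position instead of a random element. This is only a sketch: it uses np.random.default_rng rather than the random module from the answer, and the seed is fixed purely for repeatability.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seed chosen only for repeatability
arr = np.array([1, 3, 5, 2, 8, 7])

idx = rng.integers(arr.size)      # random position in [0, arr.size)
element = arr[idx]
remaining = np.delete(arr, idx)   # new array; arr itself is unchanged

print(element, remaining)
```

Because the position is known up front, no np.where search is needed, and duplicated values need no special handling.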
To add: if you want to implement pop for a row or column of a 2D NumPy array, you could do:
col = arr[:, -1] # gets the last column
arr = np.delete(arr, -1, 1) # np.delete returns a new array without the last column
and for a row:
row = arr[-1, :] # gets the last row
arr = np.delete(arr, -1, 0) # np.delete returns a new array without the last row
unutbu had a simple answer for this, but pop() can also take an index as a parameter. This is how you replicate it with numpy:
pop_index = 4
pop = y[pop_index]
y = np.concatenate([y[:pop_index],y[pop_index+1:]])
OK, since I didn't see a good answer that RETURNS the 1st element and REMOVES it from the original array, I wrote a simple (if kludgy) function utilizing global for a 1d array (modification required for multidims):
tmp_array_for_popfunc = one_d_array  # placeholder name for your existing 1D array
def array_pop():
    global tmp_array_for_popfunc
    r = tmp_array_for_popfunc[0]
    tmp_array_for_popfunc = np.delete(tmp_array_for_popfunc, 0)
    return r
Check it with:
print(len(tmp_array_for_popfunc)) # confirm initial size of tmp_array_for_popfunc
print(array_pop()) #prints return value at tmp_array_for_popfunc[0]
print(len(tmp_array_for_popfunc)) # now size is 1 smaller
I made a function that does almost the same. It takes two arguments, np_array and index, and returns the value at the given index together with a new array that no longer contains it. np.delete returns a copy rather than modifying its argument, so the caller has to keep the returned array.
def np_pop(np_array, index=-1):
    '''
    Pop the "index" from np_array and return (value, new_array).
    Default value for index is the last element.
    '''
    # add this to make sure 'numpy' is imported
    import numpy as np
    # read the value of the given array at the given index
    value = np_array[index]
    # remove value; np.delete returns a new array and leaves the input untouched
    new_array = np.delete(np_array, index, 0)
    # return the value and the shortened array
    return value, new_array
Remember that you can add a condition to make sure the given index exists in the array and return -1 if anything goes wrong.
Now you can use it like this:
import numpy as np
i = 2 # let's assume we want to pop index number 2
y = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9]) # assume 'y' is our numpy array
popped_val, y = np_pop(y, i) # the popped value and the shortened array
