Numpy max over only the first element of an array of pairs - python

I have a multidimensional numpy array consisting of tuples like below:
[[(0.56, 1),(0.25, 4), ...],[(0.11, 9), ...], ...]
The second element of each tuple is an index reference. I want to extract the tuple with the highest first value per row. Is there a way to achieve this with numpy max?
One thing I tried is playing around with the axis parameter like below:
np.max(my_array, axis=0)
But this shuffles around the pairs with the index reference not preserved. E.g. the first row in the above example would show something like [(0.56,4), ...] whereas I want it to show [(0.56,1), ...]

In plain Python, you could use:
[max(row, key=lambda pair: pair[0]) for row in array]

Don't use tuples in numpy arrays. Convert it all to a numpy array with the last dimension being 2:
>>> a = np.array([[(0.56, 1), (0.25, 4)],[(0.11, 9), (0.19, 5)]])
>>> a.shape
(2, 2, 2)
Then:
>>> highest_val_per_row = np.argmax(a[:,:,0], axis=1)
>>> a[np.arange(a.shape[0]), highest_val_per_row]
array([[0.56, 1.  ],
       [0.19, 5.  ]])
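Note that after the conversion above the index references come back as floats (1. instead of 1). Here is a minimal sketch of one way around that, assuming you are free to keep the values and the references in two separate arrays. The names values and refs are made up for illustration and are not from the question:

```python
import numpy as np

# Hypothetical layout: values and index references live in separate arrays.
values = np.array([[0.56, 0.25], [0.11, 0.19]])
refs = np.array([[1, 4], [9, 5]])

# Column position of the largest value in each row.
best_col = np.argmax(values, axis=1)
rows = np.arange(values.shape[0])

# Pick the winning value and its reference from the same position.
best_values = values[rows, best_col]
best_refs = refs[rows, best_col]

# Rebuild plain Python pairs if tuples are needed downstream.
pairs = list(zip(best_values.tolist(), best_refs.tolist()))
print(pairs)  # [(0.56, 1), (0.19, 5)]
```

The references stay integers because they never share an array with the float values.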

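You can try something like this:
lst = [[(0.56, 1),(0.25, 4)],[(0.11, 9), (0.25, 4)]]
for e in lst:
    print(max(e))
This works because tuples compare element by element, so max orders the pairs by their first value (ties fall back to the second). However, I think there are more efficient ways of doing it.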

Related

How to get the index of np.maximum?

I know np.maximum computes the element-wise maximum, e.g.
>>> b = np.array([3, 6, 1])
>>> c = np.array([4, 2, 9])
>>> np.maximum(b, c)
array([4, 6, 9])
But is there any way to get the index as well? For the above example, I also want something like the following, where each tuple denotes (which array, index); it could be a tuple, a dictionary, or something else. It would also be great if it worked on 3D arrays, i.e. when the two input arrays are 3D.
array([(1, 0), (0, 1), (1, 2)])
You could stack the two 1d-arrays to get a 2d-array and use argmax:
arr = np.vstack((b, c))
indices = np.argmax(arr, axis=0)
This will give you an array of integers, not tuples, but since you are comparing per column, the last element of each tuple is unnecessary anyway: they are just ascending integers starting at 0. If you really need them, though, you could just add
indices = list(zip(indices, range(len(b))))
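The question also asks about 3D inputs. A rough sketch of the same idea for arrays of any dimension, assuming both inputs have the same shape: use np.stack (which adds a new leading axis) instead of np.vstack. The sample data and the names stacked, which and maxima are only illustrative:

```python
import numpy as np

# Two made-up 3D arrays of the same shape.
b = np.array([[[3, 6], [1, 8]], [[0, 2], [7, 7]]])
c = np.array([[[4, 2], [9, 5]], [[0, 3], [6, 8]]])

# New leading axis: stacked[0] is b, stacked[1] is c.
stacked = np.stack((b, c))

# For every position: 0 if b holds the maximum (or on a tie), 1 if c does.
which = np.argmax(stacked, axis=0)

# The element-wise maximum itself, for comparison.
maxima = np.maximum(b, c)

print(which.tolist())   # [[[1, 0], [1, 0]], [[0, 1], [0, 1]]]
print(maxima.tolist())  # [[[4, 6], [9, 8]], [[0, 3], [7, 8]]]
```

The array which has the same shape as the inputs, so the position inside the array is implicit, just as in the 1D case.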

find indices of maximum value in matrix (python)

I want to find the indices[i,j] of the maximum value in a 2d numpy array:
a = numpy.array([[1,2,3],[4,3,1]])
I tried to do it using numpy.argsort() but it returns an array because it can be done along an axis only.
One solution can be by comparing elements at all indices (along both axes) returned by argsort using for loops, but it seems kind of complicated for this. Maybe there is a simple solution?
You want np.unravel_index. With axis=None, np.argmax returns the index into the flattened array; np.unravel_index converts that flat index into N-D indices.
a = np.random.randint(0, 10, (4,4))
ind = np.unravel_index(np.argmax(a, axis=None), a.shape) # returns a tuple
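For a reproducible check, here is the same approach applied to the array from the question instead of random data (a small sketch; the variable names are arbitrary):

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 3, 1]])

# Position of the maximum in the flattened array [1, 2, 3, 4, 3, 1].
flat_index = np.argmax(a)

# Convert the flat position back to (row, column).
i, j = np.unravel_index(flat_index, a.shape)

print(int(flat_index), int(i), int(j), int(a[i, j]))  # 3 1 0 4
```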
Maybe this returns what you are looking for? It gives the indices of the maximum:
max_xy = np.where(a == a.max() )
Zip the result to get the index as list of tuples:
list(zip(max_xy[0], max_xy[1])) #=> [(1, 0)]
In case of more than one max: a = np.array([[4,2,3],[4,3,4]]), it returns #=> [(0, 0), (1, 0), (1, 2)]
To return the first maximum found as a tuple, just fetch the first element of the list:
list(zip(max_xy[0], max_xy[1]))[0] #=> (0, 0)
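As an alternative to zipping the output of np.where, np.argwhere returns the matching positions already grouped one row per match. A short sketch using the array with repeated maxima from above (the names positions and as_tuples are illustrative):

```python
import numpy as np

a = np.array([[4, 2, 3], [4, 3, 4]])

# One row per position where the maximum occurs, in row-major order.
positions = np.argwhere(a == a.max())

# Convert to plain Python tuples if that is the format you need.
as_tuples = [tuple(p) for p in positions.tolist()]

print(as_tuples)     # [(0, 0), (1, 0), (1, 2)]
print(as_tuples[0])  # (0, 0)
```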

How to select a row from a 2D tuple

This is probably a very straightforward thing but I can't get it.
How do you go about selecting "rows" (I use the word row for lack of a better one) from a 2D (or nD) tuple?
A = [0,1,2,3]
B = [4,5,6,7]
C = (A,B)
I.E., how do I get the result ([1,2],[5,6]) from C?
I've tried C[:][1:2] but I get the result ([4, 5, 6, 7],)
You could use a comprehension:
tuple(x[1:3] for x in C)
You could also map itemgetter passing whatever indexes you want to get:
from operator import itemgetter
print(list(map(itemgetter(1,2),C)))
[(1, 2), (5, 6)]
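As for why the original attempt fails: C[:] is just a copy of the outer tuple, so C[:][1:2] slices the outer tuple a second time and never reaches the inner lists. If the rows all have the same length, converting to a NumPy array gives you real 2D slicing. A minimal sketch, assuming NumPy is acceptable here (the question itself only uses plain tuples and lists):

```python
import numpy as np

A = [0, 1, 2, 3]
B = [4, 5, 6, 7]
C = (A, B)

# Slicing the outer tuple twice: this is what the question observed.
print(C[:][1:2])  # ([4, 5, 6, 7],)

# With a 2D array, the second slice applies to the columns.
arr = np.array(C)
sub = arr[:, 1:3]

# Back to a tuple of lists, matching the requested output.
result = tuple(sub.tolist())
print(result)  # ([1, 2], [5, 6])
```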

Filter values in a list using an array with boolean expressions

I have a list of tuples like this:
listOfTuples = [(0, 1), (0, 2), (3, 1)]
and an array that could look like this:
myArray = np.array([-2, 9, 5])
Furthermore, I have an array with Boolean expressions which I created like this:
dummyArray = np.array([0, 1, 0.6])
myBooleanArray = dummyArray < 1
myBooleanArray therefore looks like this:
array([True, False, True], dtype=bool)
Now I would like to extract values from listOfTuples and myArray based on myBooleanArray. For myArray it is straightforward and I can just use:
myArray[myBooleanArray]
which gives me the desired output
[-2 5]
However, when I use
listOfTuples[myBooleanArray]
I receive
TypeError: only integer arrays with one element can be converted to an index
A workaround would be to convert this list to an array first by doing:
np.array(listOfTuples)[myBooleanArray]
which yields
[[0 1]
 [3 1]]
Is there any smarter way of doing this? My desired output would be
[(0, 1), (3, 1)]
A Python list, unlike a NumPy array, doesn't support boolean indexing directly. For that you could use the itertools.compress function:
>>> from itertools import compress
>>> list(compress(listOfTuples, myBooleanArray))
[(0, 1), (3, 1)]
Note that one advantage of compress, besides its functional style, which is useful in many cases, is that it returns a lazy iterator, so it is very memory efficient when you have a very large list to filter.
You can also loop over the result if you wish to process the items one by one instead of converting the whole thing to a list:
for item in compress(listOfTuples, myBooleanArray):
    pass  # do stuff
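To make the lazy behaviour concrete, here is a small self-contained sketch using the data from the question. The names selected, first and rest are only for illustration:

```python
from itertools import compress

import numpy as np

listOfTuples = [(0, 1), (0, 2), (3, 1)]
myBooleanArray = np.array([0, 1, 0.6]) < 1  # [True, False, True]

# Nothing is filtered yet: compress only builds an iterator.
selected = compress(listOfTuples, myBooleanArray)

# Items are produced one at a time, on demand.
first = next(selected)
rest = list(selected)

print(first)  # (0, 1)
print(rest)   # [(3, 1)]
```

Because the iterator is consumed as you go, call compress again if you need to traverse the filtered items a second time.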
The answer by Kasra is the best; this is just an alternative:
In [30]: [t for t, keep in zip(listOfTuples, myBooleanArray) if keep]
Out[30]: [(0, 1), (3, 1)]

Sort a list then give the indexes of the elements in their original order

I have an array of n numbers, say [1,4,6,2,3]. The sorted array is [1,2,3,4,6], and the indexes of these numbers in the old array are 0, 3, 4, 1, and 2. What is the best way, given an array of n numbers, to find this array of indexes?
My idea is to run order statistics for each element. However, since I have to rewrite this function many times (in contest), I'm wondering if there's a short way to do this.
>>> a = [1,4,6,2,3]
>>> [b[0] for b in sorted(enumerate(a),key=lambda i:i[1])]
[0, 3, 4, 1, 2]
Explanation:
enumerate(a) returns an enumeration over tuples consisting of the indexes and values in the original list: [(0, 1), (1, 4), (2, 6), (3, 2), (4, 3)]
Then sorted with a key of lambda i:i[1] sorts based on the original values (item 1 of each tuple).
Finally, the list comprehension [b[0] for b in ...] returns the original indexes (item 0 of each tuple).
Using numpy arrays instead of lists may be beneficial if you are doing a lot of statistics on the data. If you choose to do so, this would work:
import numpy as np
a = np.array( [1,4,6,2,3] )
b = np.argsort( a )
argsort() can operate on lists as well, but I believe that in this case it simply copies the data into an array first.
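A quick sketch showing np.argsort applied directly to the list from the question. The kind="stable" argument is my addition here, so that equal values keep their original relative order; it is not part of the answer above:

```python
import numpy as np

a = [1, 4, 6, 2, 3]

# Indices that would sort the list; stable sort keeps ties in original order.
order = np.argsort(a, kind="stable")

print(order.tolist())                 # [0, 3, 4, 1, 2]
print(np.asarray(a)[order].tolist())  # [1, 2, 3, 4, 6]
```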
Here is another way:
>>> sorted(range(len(a)), key=lambda ix: a[ix])
[0, 3, 4, 1, 2]
This approach sorts not the original list, but its indices (created with range, or xrange in Python 2), using the original list as the sort keys.
This should do the trick:
from operator import itemgetter
indices = list(zip(*sorted(enumerate(my_list), key=itemgetter(1))))[0]
The long way, without a list comprehension, for beginners like me:
a = [1,4,6,2,3]
b = enumerate(a)
c = sorted(b, key = lambda i:i[1])
d = []
for e in c:
    d.append(e[0])
print(d)
