This question already has answers here:
Rank items in an array using Python/NumPy, without sorting array twice
(11 answers)
Closed 3 years ago.
I have an array and I want to get the order of each element.
a=[1.83976,1.57624,1.00528,1.55184]
np.argsort(a)
The above code returns
array([2, 3, 1, 0], dtype=int64)
but I want an array giving the rank of each element,
e.g.
[0, 1, 3, 2]
which means a[0] is the largest number (rank 0),
a[1] is 1st,
a[2] is 3rd,
a[3] is 2nd.
I'll explain. np.argsort(np.argsort(a)) gives the rank of each element in ascending order: 1.83976 is the highest value in the array, so it is assigned the highest rank, 3. I just subtracted each rank from len(a)-1 to get your output.
>>> import numpy as np
>>> a=[1.83976,1.57624,1.00528,1.55184]
>>> np.argsort(a)
array([2, 3, 1, 0])
>>> np.argsort(np.argsort(a))
array([3, 2, 0, 1])
>>> [len(a)-i for i in np.argsort(np.argsort(a))]
[1, 2, 4, 3]
>>> [len(a)-1-i for i in np.argsort(np.argsort(a))]
[0, 1, 3, 2]
>>> np.array([len(a)-1]*len(a))-np.argsort(np.argsort(a))
array([0, 1, 3, 2])
By default, argsort returns the indices of the sorted elements in ascending order. Since you need descending order, argsort(-a) gives the correctly sorted indices (a must be a NumPy array for the negation to work). To get the rank of each element, apply argsort again.
a = np.array([1.83976,1.57624,1.00528,1.55184])
indx_sorted = np.argsort(-a)
np.argsort(indx_sorted)
# array([0, 1, 3, 2])
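The two steps above can be wrapped in a small helper. This is only a sketch: the name descending_rank is made up for illustration, and it assumes numeric input with no ties (tied values would still get distinct ranks).

```python
import numpy as np

def descending_rank(values):
    """Return, for each element, its 0-based position in descending order."""
    values = np.asarray(values, dtype=float)  # negation needs an array, not a list
    order = np.argsort(-values)               # indices from largest to smallest
    return np.argsort(order)                  # invert that permutation to get ranks

a = [1.83976, 1.57624, 1.00528, 1.55184]
ranks = descending_rank(a)
print(ranks)  # [0 1 3 2]
```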
I know that this question has been asked a hundred times, but the answer always seems to be "use numpy's argsort". Either I am misinterpreting what most people are asking, or the answers are not correct for the question. Whatever the case, I wish to get the position each item of a list would have in ascending order. The phrasing is confusing, so as an example: given a list [4, 2, 1, 3], I expect to get back the list [3, 1, 0, 2]. The smallest item is 1, so it gets index 0; the largest one is 4, so it gets index 3. It seems to me that argsort is often suggested, but it just doesn't seem to do that.
from numpy import argsort
l = [4, 2, 1, 3]
print(argsort(l))
# [2, 1, 3, 0]
# Expected [3, 1, 0, 2]
Clearly argsort is doing something else, so what is it actually doing and how is it similar to the expected behaviour so that it is so often (wrongly) suggested? And, more importantly, how can I get the desired output?
argsort() basically converts your list into a list of indices, ordered so that they sort the list.
l = [4, 2, 1, 3]
First it takes the index of each element in the list, so the new list becomes:
indexed=[0, 1, 2, 3]
Then it sorts the indexed list according to the items in the original list, with 4:0, 2:1, 1:2 and 3:3, where : means "corresponds to".
Sorting the original list we get
l=[1, 2, 3, 4]
and replacing each value with its index in the old list gives
new=[2,1,3,0]
So basically it sorts the indices of a list according to the values in the original list.
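The description above can be reproduced in plain Python by sorting the positions with the stored value as the key. This is a sketch for illustration, not how NumPy implements argsort internally.

```python
import numpy as np

l = [4, 2, 1, 3]

# Sort the positions 0..n-1, using the value stored at each position as the key.
manual = sorted(range(len(l)), key=lambda i: l[i])
print(manual)                   # [2, 1, 3, 0]
print(np.argsort(l).tolist())   # [2, 1, 3, 0]

# Indexing the original list with those positions yields the sorted values.
sorted_values = [l[i] for i in manual]
print(sorted_values)            # [1, 2, 3, 4]
```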
The reason you are not getting the 'right', or expected, answer is that you are asking the wrong question!
What you are after is the rank of each element after sorting, while NumPy's argsort() returns the list of indices that sorts the array, as documented. These are not the same thing (as you found out ;) )!
@hpaulj answered me correctly, but in a comment, which you may not be able to see.
His answer helped me a lot; it lets me get what I want.
import numpy as np
l = [4, 2, 1, 3]
print(np.argsort(np.argsort(l)))
Return:
[3, 1, 0, 2]
This is what you expect. This method returns, for each element, the index it would have if the array were sorted.
⚠️ But note that if the input array contains repeated values, the tied values get different ranks:
import numpy as np
l = [4, 2, 1, 3, 4]
print(np.argsort(np.argsort(l)))
Return:
[3 1 0 2 4]
This may not hurt you, but it does hurt me. I solve the problem like this:
import numpy as np
l = [4, 2, 1, 3, 4]
ret2 = np.vectorize(lambda val: np.searchsorted(np.unique(l), val))(l)
print('Returned', ret2)
print('Expected', [3, 1, 0, 2, 3])
Return:
Returned [3 1 0 2 3]
Expected [3, 1, 0, 2, 3]
True, my solution will be slow due to the vectorize function.
But nothing prevents you from using numba. I haven't tested it though 😉.
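A vectorized alternative, offered as a sketch rather than a benchmarked replacement: np.unique with return_inverse=True returns, for each element, its index into the sorted unique values, which is the same tie-aware rank as above.

```python
import numpy as np

l = [4, 2, 1, 3, 4]

# unique_values is the sorted array of distinct values;
# dense_rank[i] is the position of l[i] within unique_values.
unique_values, dense_rank = np.unique(l, return_inverse=True)
print(unique_values)  # [1 2 3 4]
print(dense_rank)     # [3 1 0 2 3]
```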
This question already has answers here:
Transform a set of numbers in numpy so that each number gets converted into a number of other numbers which are less than it
(4 answers)
Closed 4 years ago.
I would like to sort a numpy array and find out where each element went.
numpy.argsort will tell me for each index in the sorted array, which index in the unsorted array goes there. I'm looking for something like the inverse: For each index in the unsorted array, where does it go in the sorted array.
a = np.array([1, 4, 2, 3])
# a sorted is [1,2,3,4]
# the 1 goes to index 0
# the 4 goes to index 3
# the 2 goes to index 1
# the 3 goes to index 2
# desired output
[0, 3, 1, 2]
# for comparison, argsort output
[0, 2, 3, 1]
A simple solution uses numpy.searchsorted
np.searchsorted(np.sort(a), a)
# produces [0, 3, 1, 2]
I'm unhappy with this solution, because it seems very inefficient. It sorts and searches in two separate steps.
This fancy indexing fails for arrays with duplicates. Compare:
a = np.array([1, 4, 2, 3, 5])
print(np.argsort(a)[np.argsort(a)])
print(np.searchsorted(np.sort(a),a))
a = np.array([1, 4, 2, 3, 5, 2])
print(np.argsort(a)[np.argsort(a)])
print(np.searchsorted(np.sort(a),a))
You can just use argsort twice on the list.
At first the fact that this works seems a bit confusing, but if you think about it for a while it starts to make sense.
a = np.array([1, 4, 2, 3])
argSorted = np.argsort(a) # [0, 2, 3, 1]
invArgSorted = np.argsort(argSorted) # [0, 3, 1, 2]
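One way to convince yourself that the second argsort inverts the first is a round trip: sort with the first permutation, then undo it with the second. A minimal sketch (the names order and rank are chosen here for readability):

```python
import numpy as np

a = np.array([1, 4, 2, 3])
order = np.argsort(a)      # order[i] = original index of the i-th smallest value
rank = np.argsort(order)   # rank[j]  = sorted position of a[j]

print(a[order])            # [1 2 3 4]
print(a[order][rank])      # [1 4 2 3], back in the original order
print(order[rank])         # [0 1 2 3], the identity permutation
```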
You just need to invert the permutation that sorts the array. As shown in the linked question, you can do that like this:
import numpy as np
def sorted_position(array):
    a = np.argsort(array)
    a[a.copy()] = np.arange(len(a))
    return a
print(sorted_position([0.1, 0.2, 0.0, 0.5, 0.8, 0.4, 0.7, 0.3, 0.9, 0.6]))
# [1 2 0 5 8 4 7 3 9 6]
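The same inversion can be written with a separate output array, which avoids the in-place copy trick. The sketch below (the helper name inverse_permutation is hypothetical) also checks that it agrees with calling argsort twice on data without duplicates.

```python
import numpy as np

def inverse_permutation(order):
    """Invert a permutation with one scatter assignment instead of a second sort."""
    inverse = np.empty_like(order)
    inverse[order] = np.arange(len(order))
    return inverse

rng = np.random.default_rng(0)   # fixed seed, chosen only for reproducibility
data = rng.permutation(1000)     # distinct values, so ranks are unambiguous
order = np.argsort(data)
same = np.array_equal(inverse_permutation(order), np.argsort(order))
print(same)  # True
```

The scatter assignment is linear in the array length, whereas a second argsort sorts again.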
I am trying to understand numpy's argpartition function. I have made the documentation's example as basic as possible.
import numpy as np
x = np.array([3, 4, 2, 1])
print("x: ", x)
a=np.argpartition(x, 3)
print("a: ", a)
print("x[a]:", x[a])
This is the output...
('x: ', array([3, 4, 2, 1]))
('a: ', array([2, 3, 0, 1]))
('x[a]:', array([2, 1, 3, 4]))
In the line a=np.argpartition(x, 3) isn't the kth element the last element (the number 1)? If it is number 1, when x is sorted shouldn't 1 become the first element (element 0)?
In x[a], why is 2 the first element "in front" of 1?
What fundamental thing am I missing?
The more complete answer to what argpartition does is in the documentation of partition, and that one says:
Creates a copy of the array with its elements rearranged in such a way
that the value of the element in k-th position is in the position it
would be in a sorted array. All elements smaller than the k-th element
are moved before this element and all equal or greater are moved
behind it. The ordering of the elements in the two partitions is
undefined.
So, for the input array 3, 4, 2, 1, the sorted array would be 1, 2, 3, 4.
The result of np.partition([3, 4, 2, 1], 3) will have the correct value (i.e. same as sorted array) in the 3rd (i.e. last) element. The correct value for the 3rd element is 4.
Let me show this for all values of k to make it clear:
np.partition([3, 4, 2, 1], 0) - [1, 4, 2, 3]
np.partition([3, 4, 2, 1], 1) - [1, 2, 4, 3]
np.partition([3, 4, 2, 1], 2) - [1, 2, 3, 4]
np.partition([3, 4, 2, 1], 3) - [2, 1, 3, 4]
In other words: the k-th element of the result is the same as the k-th element of the sorted array. All elements before k are smaller than or equal to that element. All elements after it are greater than or equal to it.
The same happens with argpartition, except that argpartition returns indices, which can then be used to form the same result.
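The property described above can be checked for every k without relying on the exact order inside each partition, which is not guaranteed. A small sketch:

```python
import numpy as np

x = np.array([3, 4, 2, 1])
expected = np.sort(x)

checks = []
for k in range(len(x)):
    p = np.partition(x, k)
    idx = np.argpartition(x, k)
    checks.append(p[k] == expected[k])          # k-th element is in its sorted place
    checks.append(np.all(p[:k] <= p[k]))        # everything before is not larger
    checks.append(np.all(p[k + 1:] >= p[k]))    # everything after is not smaller
    checks.append(x[idx][k] == expected[k])     # argpartition gives the same k-th value

all_ok = all(bool(c) for c in checks)
print(all_ok)  # True
```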
Similar to @Imtinan, I struggled with this. I found it useful to break up the function into the arg and the partition.
Take the following array:
array = np.array([9, 2, 7, 4, 6, 3, 8, 1, 5])
The corresponding indices are [0, 1, 2, 3, 4, 5, 6, 7, 8], where index 8 holds 5 and index 0 holds 9.
If we call np.partition(array, 5), NumPy finds the value that would sit at index 5 if the array were sorted and places it at index 5 of a new array. It then puts the elements smaller than that value before it and the larger ones after it, like this:
pseudo output: [lower value elements, element at sorted index 5, higher value elements]
If we compute this we get:
array([3, 5, 1, 4, 2, 6, 8, 7, 9])
This makes sense: the value at index 5 of the sorted array is 6, [1, 2, 3, 4, 5] are all lower than 6, and [7, 8, 9] are higher than 6. Note that the elements within each group are not ordered.
The arg part of the np.argpartition() then goes one step further and swaps the elements out for their respective indices in the original array. So if we did:
np.argpartition(array, 5) we will get:
array([5, 8, 7, 3, 1, 4, 6, 2, 0])
From above, the original array has this structure [index=value]:
[0=9, 1=2, 2=7, 3=4, 4=6, 5=3, 6=8, 7=1, 8=5]
If you map each index in the output to its value, you will satisfy the condition
array[np.argpartition(array, 5)] = np.partition(array, 5), like this:
[index form] array([5, 8, 7, 3, 1, 4, 6, 2, 0]) becomes
[3, 5, 1, 4, 2, 6, 8, 7, 9]
which is the same as the output of np.partition(array, 5),
array([3, 5, 1, 4, 2, 6, 8, 7, 9])
Hopefully, this makes sense, it was the only way I could get my head around the arg part of the function.
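The walkthrough above can be verified in code. Because the order inside each group is undefined, this sketch compares the groups after sorting them instead of relying on the exact arrays printed above.

```python
import numpy as np

array = np.array([9, 2, 7, 4, 6, 3, 8, 1, 5])
k = 5

idx = np.argpartition(array, k)
pivot = int(array[idx[k]])                    # value at sorted index k
smaller = sorted(array[idx[:k]].tolist())     # values placed before the pivot
larger = sorted(array[idx[k + 1:]].tolist())  # values placed after the pivot

print(pivot)    # 6
print(smaller)  # [1, 2, 3, 4, 5]
print(larger)   # [7, 8, 9]
```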
I remember having a hard time figuring it out too. Maybe the documentation is badly written, but this is what it means.
When you do a=np.argpartition(x, 3), x is rearranged so that only the element at index k (in our case k=3) is guaranteed to be in its sorted position.
So when you run this code, you are basically asking what the value at index 3 would be in a sorted array. Hence the output is ('x[a]:', array([2, 1, 3, 4])), where only the element at index 3 is in its sorted place.
As the documentation says, all numbers smaller than the kth element come before it in no particular order, which is why you get 2 before 1.
I hope this clarifies it. If you are still confused, feel free to comment :)
I have a numpy array and I want a function that takes the array and a list of indices as input and returns another array with the following property: a zero has been inserted into the initial array just before each of the given positions of the original array.
Let me give a couple of examples:
If indices = [1] and the initial array is array([1, 1, 2]), then the output of the function should be array([1, 0, 1, 2]).
If indices = [0, 1, 3] and the initial array is array([1, 2, 3, 4]), then the output of the function should be array([0, 1, 0, 2, 3, 0, 4]).
I would like to do it in a vectorized manner without any for loops.
Had the same issue before. Found a solution using np.insert:
import numpy as np
np.insert([1, 1, 2], [1], 0)
# array([1, 0, 1, 2])
I see @jdehesa has commented this already, but I am adding it as a permanent answer for future visitors.
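np.insert also accepts several indices at once, all interpreted relative to the original array, so both examples from the question can be handled without a loop. A short sketch (the wrapper name insert_zeros is made up for illustration):

```python
import numpy as np

def insert_zeros(arr, indices):
    """Insert a 0 before each given position of the original array."""
    return np.insert(arr, indices, 0)

first = insert_zeros(np.array([1, 1, 2]), [1])
second = insert_zeros(np.array([1, 2, 3, 4]), [0, 1, 3])
print(first)   # [1 0 1 2]
print(second)  # [0 1 0 2 3 0 4]
```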
import numpy as np
from scipy import signal
y = np.array([[2, 1, 2, 3, 2, 0, 1, 0],
[2, 1, 2, 3, 2, 0, 1, 0]])
maximas = signal.argrelmax(y, axis=1)
print(maximas)
(array([0, 0, 1, 1], dtype=int64), array([3, 6, 3, 6], dtype=int64))
maximas holds the index pairs as a tuple of arrays: (0,3) and (0,6) are for row one [2, 1, 2, 3, 2, 0, 1, 0], and (1,3) and (1,6) are for the other row [2, 1, 2, 3, 2, 0, 1, 0].
The following prints all the results, but I want to extract only the first maxima of both rows, i.e., [3,3] using the tuples. So, the tuples I need are (0,3) and (1,3).
How can I extract them from the array of tuples, i.e., 'maximas'?
>>> print y[kk]
[3 1 3 1]
Given the tuple maximas, here's one possible NumPy way:
>>> a = np.column_stack(maximas)
>>> a[np.unique(a[:,0], return_index=True)[1]]
array([[0, 3],
[1, 3]], dtype=int64)
This stacks the coordinate lists returned by signal.argrelmax into an array a. The return_index parameter of np.unique is used to find the first index of each row number. We can then retrieve the relevant rows from a using these first indexes.
This returns an array, but you could turn it into a list of lists with tolist().
To return the first column index of the maximum in each row, you just need to take the indices returned by np.unique from maximas[0] and use them to index maximas[1]. In one line, it's this:
>>> maximas[1][np.unique(maximas[0], return_index=True)[1]]
array([3, 3], dtype=int64)
To retrieve the corresponding values from each row of y, you can use np.choose:
>>> cols = maximas[1][np.unique(maximas[0], return_index=True)[1]]
>>> np.choose(cols, y.T)
array([3, 3])
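The steps above can be run without SciPy by hard-coding the tuple that the question printed for signal.argrelmax(y, axis=1). This sketch also uses plain integer-array indexing y[rows, cols] as an alternative to np.choose; it assumes the row indices in maximas[0] are grouped in ascending order, as they are in the question's output.

```python
import numpy as np

y = np.array([[2, 1, 2, 3, 2, 0, 1, 0],
              [2, 1, 2, 3, 2, 0, 1, 0]])

# Copied from the question's printed output, not recomputed here.
maximas = (np.array([0, 0, 1, 1]), np.array([3, 6, 3, 6]))

# return_index gives the position of the first occurrence of each row number.
rows, first = np.unique(maximas[0], return_index=True)
cols = maximas[1][first]

print(cols)           # [3 3]
print(y[rows, cols])  # [3 3]
```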
Well, a pure Python approach would be to use itertools.groupby (grouping on the row index) and a list comprehension:
>>> from itertools import groupby
>>> from operator import itemgetter
>>> [max(g, key=lambda x: y[x])
...  for k, g in groupby(zip(*maximas), itemgetter(0))]
[(0, 3), (1, 3)]
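Note that max with key=lambda x: y[x] picks the highest-valued maximum in each row, which happens to be the first one for this data. If the first maximum per row is what is wanted regardless of height, taking the first item of each group is one option. A sketch, with maximas hard-coded from the question's output:

```python
from itertools import groupby
from operator import itemgetter

# Row and column indices copied from the question's printed output.
maximas = ([0, 0, 1, 1], [3, 6, 3, 6])

# groupby yields consecutive (row, col) pairs sharing the same row;
# next(group) takes the first pair of each run.
first_per_row = [next(group)
                 for _, group in groupby(zip(*maximas), key=itemgetter(0))]
print(first_per_row)  # [(0, 3), (1, 3)]
```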