Sort Key Lambda Parameters
I do not understand how the lambda parameters work; the [-e[0],e[1]] portion is especially confusing. I have removed the excess printing code and everything else unnecessary from my question. What does the parameter -e[0] achieve, and what does e[1] achieve?
data.sort(key = lambda e: [-e[0],e[1]]) # --> anonymous function
print("This is the data sort after the lambda filter but NOT -e %s" % data)
[in] 'aeeccccbbbbwwzzzwww'
[out] This is the data before the sort [[2, 'e'], [4, 'c'], [1, 'a'], [4, 'b'], [5, 'w'], [3, 'z']]
[out] This is the data sort before the lambda filter [[1, 'a'], [2, 'e'], [3, 'z'], [4, 'b'], [4, 'c'], [5, 'w']]
[out] This is the data sort after the lambda filter but NOT -e [[1, 'a'], [2, 'e'], [3, 'z'], [4, 'b'], [4, 'c'], [5, 'w']]
[out] This is the data sort after the lambda filter [[5, 'w'], [4, 'b'], [4, 'c'], [3, 'z'], [2, 'e'], [1, 'a']]
[out] w 5
[out] b 4
[out] c 4
l = [[2, 'e'], [4, 'c'], [1, 'a'], [4, 'b'], [5, 'w'], [3, 'z']]
>>> l.sort()
Normal sort: first the first element of the nested list is considered and then the second element.
>>> l.sort(key=lambda e: [e[0], e[1]])
Equivalent to l.sort().
>>> l.sort(key=lambda e: [-e[0], e[1]])
This sorts the list in descending order of the first element of each nested list and breaks ties in ascending order of the second element.
The counts 2, 3, 4, 5, ... are negated (-e[0] == -2, -3, -4, ...), so larger counts sort first; entries with the same count are then ordered by their second element (e[1] == 'w', 'a', 'b', ...).
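To make the effect of the key concrete, here is a small sketch using the data from the question. It compares the single-pass key with a two-pass alternative that relies on Python's sort being stable; the names one_pass and two_pass are only illustrative.

```python
data = [[2, 'e'], [4, 'c'], [1, 'a'], [4, 'b'], [5, 'w'], [3, 'z']]

# One pass: the key for [4, 'c'] is [-4, 'c'], so larger counts get
# smaller keys and come first; equal counts fall back to the letter.
one_pass = sorted(data, key=lambda e: [-e[0], e[1]])

# Two passes relying on sort stability: sort by the secondary key first,
# then by the primary key in reverse order.
two_pass = sorted(data, key=lambda e: e[1])
two_pass.sort(key=lambda e: e[0], reverse=True)

print(one_pass)              # [[5, 'w'], [4, 'b'], [4, 'c'], [3, 'z'], [2, 'e'], [1, 'a']]
print(one_pass == two_pass)  # True
```

The negation trick only works for numeric fields; for a string field that should sort in descending order, the two-pass form is the usual fallback.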
Related
Let's say I have:
list_a = [1, 2, 3, 4, 5]
list_b = ['a', 'b', 'c']
And I expect the outcome to be something like this, so I can easily access it later:
list_c = [['a', 1], ['a', 2], ['a', 3], ...]
What's the easiest way to do that?
The two lists have different lengths
I need every letter in list_b to be paired with each of the five numbers, basically all possible combinations, because I need to easily access e.g. ['c', 4] later on.
I tried just to append list_a and list_b to list_c but it obviously didn't go as planned.
I can't use builtin functions such as zip, itertools, etc.
Use a list comprehension with two for clauses:
list_c = [[b, a] for b in list_b for a in list_a]
Output: [['a', 1], ['a', 2], ['a', 3], ['a', 4], ['a', 5], ['b', 1], ['b', 2], ['b', 3], ['b', 4], ['b', 5], ['c', 1], ['c', 2], ['c', 3], ['c', 4], ['c', 5]]
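For readers less used to comprehensions, the same result can be sketched with two explicit nested loops. This is just the comprehension unrolled and uses no builtin helpers such as zip or itertools:

```python
list_a = [1, 2, 3, 4, 5]
list_b = ['a', 'b', 'c']

list_c = []
for b in list_b:        # first for clause of the comprehension (outer loop)
    for a in list_a:    # second for clause, runs fully for each b
        list_c.append([b, a])

print(len(list_c))          # 15
print(['c', 4] in list_c)   # True
```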
Below is the DataFrame I want to action upon:
df = pd.DataFrame({'A': [1,1,1],
'B': [2,2,3],
'C': [4,5,4]})
Each row of df creates a unique key. Objective is to create the following list of multi-dimensional arrays:
parameter = [[['A', 1],['B', 2], ['C', 4]],
[['A', 1],['B', 2], ['C', 5]],
[['A', 1],['B', 3], ['C', 4]]]
Problem is related to this question where I have to iterate over the parameter but instead of manually providing them to my function, I have to put all parameter from df (rows) in a list.
You could use the following list comprehension, which zips the values on each row with the columns of the dataframe:
from itertools import repeat
[list(map(list, zip(cols, row))) for row, cols in zip(df.values.tolist(), repeat(df.columns))]
[[['A', 1], ['B', 2], ['C', 4]],
[['A', 1], ['B', 2], ['C', 5]],
[['A', 1], ['B', 3], ['C', 4]]]
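The pairing itself does not depend on pandas. As a rough sketch, assume columns and rows are plain-Python stand-ins for df.columns and df.values.tolist() (both names are made up for this example); putting the column name first in each pair gives the layout requested in the question:

```python
# Plain-Python stand-ins for df.columns and df.values.tolist()
columns = ['A', 'B', 'C']
rows = [[1, 2, 4], [1, 2, 5], [1, 3, 4]]

# For every row, pair each column name with the value in that column.
parameter = [[[col, value] for col, value in zip(columns, row)] for row in rows]

for row in parameter:
    print(row)
# [['A', 1], ['B', 2], ['C', 4]]
# [['A', 1], ['B', 2], ['C', 5]]
# [['A', 1], ['B', 3], ['C', 4]]
```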
I have the following list
a = [['a', 'b', 1], ['c', 'b', 3], ['c','a', 4], ['a', 'd', 2]]
and I'm trying to remove all the elements from the list where the last element is less than 3. So the output should look like
a = [['c', 'b', 3], ['c','a', 4]]
I tried to use filter in the following way
list(filter(lambda x: x == [_, _, 2], a))
Here _ is meant to denote that the element in that position can be anything. I'm used to this kind of syntax from Mathematica, but I have been unable to find something like it in Python (is there even such a symbol in Python?).
I would prefer solution using map and filter as those are most intuitive for me.
You should be using x[-1] >= 3 in lambda to retain all sub lists with last value greater than or equal to 3:
>>> a = [['a', 'b', 1], ['c', 'b', 3], ['c','a', 4], ['a', 'd', 2]]
>>> list(filter(lambda x: x[-1] >= 3, a))
[['c', 'b', 3], ['c', 'a', 4]]
List comprehension approach:
a_new = [sublist for sublist in a if sublist[-1] >= 3]
a = [['a', 'b', 1], ['c', 'b', 3], ['c','a', 4], ['a', 'd', 2]]
Filter above list with list comprehension like:
b = [x for x in a if x[-1] >= 3]
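On the question of a wildcard symbol: in ordinary expressions _ is just a regular variable name, so x == [_, _, 2] does not act as a pattern. Something close to the intended "ignore these positions" idea can be sketched with star-unpacking, where *_ soaks up everything except the last element. The helper name last_at_least is made up for this example.

```python
a = [['a', 'b', 1], ['c', 'b', 3], ['c', 'a', 4], ['a', 'd', 2]]

def last_at_least(sublist, threshold=3):
    # *_ collects every element except the last one and is then ignored
    *_, last = sublist
    return last >= threshold

kept = list(filter(last_at_least, a))
print(kept)  # [['c', 'b', 3], ['c', 'a', 4]]
```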
Assume I want to sort a list of lists like explained here:
>>> from operator import itemgetter
>>> L = [[0, 1, 'f'], [4, 2, 't'], [9, 4, 'afsd']]
>>> sorted(L, key=itemgetter(2))
[[9, 4, 'afsd'], [0, 1, 'f'], [4, 2, 't']]
(Or with lambda.) Now I have a second list which I want to sort in the same order, so I need the new order of the indices. sorted() or .sort() do not return indices. How can I do that?
Actually in my case both lists contain numpy arrays. But the numpy sort/argsort aren't intuitive for that case either.
If I understood you correctly, you want to order B in the example below, based on a sorting rule you apply on L. Take a look at this:
L = [[0, 1, 'f'], [4, 2, 't'], [9, 4, 'afsd']]
B = ['a', 'b', 'c']
result = [i for _, i in sorted(zip(L, B), key=lambda x: x[0][2])]
print(result) # ['c', 'a', 'b']
# that corresponds to [[9, 4, 'afsd'], [0, 1, 'f'], [4, 2, 't']]
If I understand correctly, you want to know how the list has been rearranged. i.e. where is the 0th element after sorting, etc.
If so, you are one step away:
L2 = [L.index(x) for x in sorted(L, key=itemgetter(2))]
which gives:
[2, 0, 1]
As tobias points out, this is needlessly complex compared to
list(map(itemgetter(0), sorted(enumerate(L), key=lambda x: x[1][2])))
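Another way to get the new order of the indices, sketched here under the assumption that both lists have the same length, is to sort the index range itself. Unlike the L.index approach, this does not rescan the list for every element and is not confused by duplicate sublists. The names order, sorted_L and sorted_B are only illustrative.

```python
L = [[0, 1, 'f'], [4, 2, 't'], [9, 4, 'afsd']]
B = ['a', 'b', 'c']

# Sort the positions 0..len(L)-1 by the third field of the sublist they point at.
order = sorted(range(len(L)), key=lambda i: L[i][2])

# Apply the same permutation to both lists.
sorted_L = [L[i] for i in order]
sorted_B = [B[i] for i in order]

print(order)     # [2, 0, 1]
print(sorted_B)  # ['c', 'a', 'b']
```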
NumPy
Setup:
import numpy as np
L = np.array([[0, 1, 'f'], [4, 2, 't'], [9, 4, 'afsd']])
S = np.array(['a', 'b', 'c'])
Solution:
print(S[L[:, 2].argsort()])
Output:
['c' 'a' 'b']
Just Python
You could combine both lists, sort them together, and separate them again.
>>> L = [[0, 1, 'f'], [4, 2, 't'], [9, 4, 'afsd']]
>>> S = ['a', 'b', 'c']
>>> L, S = zip(*sorted(zip(L, S), key=lambda x: x[0][2]))
>>> L
([9, 4, 'afsd'], [0, 1, 'f'], [4, 2, 't'])
>>> S
('c', 'a', 'b')
I guess you could do something similar in NumPy as well...
I want to shuffle this list:
[[1, 'A'], [2, 'A'], [6, 'B'], [3, 'B'], [4, 'C'], [5, 'C'], [7, 'F']]
But I need the groups identified by the sublists' second elements to stay together, so that the shuffled list could look like this:
[[6, 'B'], [3, 'B'], [7, 'F'], [1, 'A'], [2, 'A'], [4, 'C'], [5, 'C']]
Where all 'B', 'F', 'A', and 'C' sublists stay together.
I'm guessing using a combination of shuffle and groupby would do the trick, but I don't know where to start with this. Any idea would be appreciated!
items = [[1, 'A'], [2, 'A'], [6, 'B'], [3, 'B'], [4, 'C'], [5, 'C'], [7, 'F']]
import itertools, operator, random
groups = [list(g) for _, g in itertools.groupby(items, operator.itemgetter(1))]
random.shuffle(groups)
shuffled = [item for group in groups for item in group]
print(shuffled)
Prints for example:
[[4, 'C'], [5, 'C'], [1, 'A'], [2, 'A'], [7, 'F'], [6, 'B'], [3, 'B']]
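One caveat with itertools.groupby: it only groups consecutive items with equal keys, so the approach above relies on the input already being grouped. A minimal sketch for input where the labels are interleaved (the reordered items list here is invented for illustration) is to sort by the label first; sorted is stable, so the order inside each group is kept:

```python
import itertools
import operator
import random

# Same items as above, but with the labels interleaved.
items = [[1, 'A'], [6, 'B'], [2, 'A'], [4, 'C'], [3, 'B'], [5, 'C'], [7, 'F']]

label = operator.itemgetter(1)
# Sorting by label makes equal labels adjacent, which groupby requires.
groups = [list(g) for _, g in itertools.groupby(sorted(items, key=label), label)]
random.shuffle(groups)
shuffled = [item for group in groups for item in group]
print(shuffled)
```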
Give each group a random number and sort by that. Sublists stay together, and keep their order within the group, because Python's sorting is stable.
Update years later: Using a defaultdict looks nicer and only generates one random number for each group, not one for every element:
from random import random
from collections import defaultdict
r = defaultdict(random)
items.sort(key=lambda item: r[item[1]])
As a squeezed one-liner:
items.sort(key=lambda i, r=defaultdict(random): r[i[1]])
Back to original answer:
items = [[1, 'A'], [2, 'A'], [6, 'B'], [3, 'B'], [4, 'C'], [5, 'C'], [7, 'F']]
import random
r = {b: random.random() for a, b in items}
items.sort(key=lambda item: r[item[1]])
print(items)
Prints for example:
[[6, 'B'], [3, 'B'], [4, 'C'], [5, 'C'], [7, 'F'], [1, 'A'], [2, 'A']]
The two lines could be combined, then you don't have that extra variable flying around afterwards.
items.sort(key=lambda item, r={b: random.random() for a, b in items}: r[item[1]])
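If repeatable results are wanted, for example in a test, the same idea can be sketched with a dedicated random.Random instance. The seed 42 and the names rng, weights and items_sorted are arbitrary choices for this example, not part of the original answer.

```python
import random

items = [[1, 'A'], [2, 'A'], [6, 'B'], [3, 'B'], [4, 'C'], [5, 'C'], [7, 'F']]

rng = random.Random(42)  # arbitrary fixed seed, only for repeatability

# One random weight per group label, drawn the first time the label is seen.
weights = {}
for _, label in items:
    if label not in weights:
        weights[label] = rng.random()

# sorted() is stable, so items sharing a weight keep their original order.
items_sorted = sorted(items, key=lambda item: weights[item[1]])
print(items_sorted)
```

Using sorted instead of items.sort leaves the original list untouched, which can be convenient when several shuffles of the same data are needed.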
You can use a dict to group the items without needing to sort, then shuffle the groups and flatten them into a single list:
from collections import defaultdict
from random import shuffle
from itertools import chain
def shuffle_groups(l):
    d = defaultdict(list)
    for v, k in l:
        d[k].append([k, v])
    vals = list(d.values())
    shuffle(vals)
    return chain(*vals)
Output:
In [9]: list(shuffle_groups(l))
Out[9]: [['A', 1], ['A', 2], ['F', 7], ['B', 6], ['B', 3], ['C', 4], ['C', 5]]
In [10]: list(shuffle_groups(l))
Out[10]: [['C', 4], ['C', 5], ['B', 6], ['B', 3], ['A', 1], ['A', 2], ['F', 7]]
In [11]: list(shuffle_groups(l))
Out[11]: [['F', 7], ['B', 6], ['B', 3], ['A', 1], ['A', 2], ['C', 4], ['C', 5]]
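Note that the function above emits each pair as [key, value], so the elements come out swapped relative to the input. If the original sublists should be kept as they are, a variant could look like the following sketch; the name shuffle_groups_keep_order is made up here.

```python
from collections import defaultdict
from itertools import chain
from random import shuffle

def shuffle_groups_keep_order(items):
    groups = defaultdict(list)
    for item in items:
        groups[item[1]].append(item)  # store the sublist unchanged
    vals = list(groups.values())
    shuffle(vals)                     # shuffle whole groups, not single items
    return list(chain.from_iterable(vals))

items = [[1, 'A'], [2, 'A'], [6, 'B'], [3, 'B'], [4, 'C'], [5, 'C'], [7, 'F']]
print(shuffle_groups_keep_order(items))
```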
Some timings:
In [5]: l =[choice(l) for _ in range(100000)]
In [6]: timeit _groupy(l)
10 loops, best of 3: 139 ms per loop
In [7]: timeit shuffle_groups(l)
10 loops, best of 3: 27.1 ms per loop