Breaking ties in Python sort - python

I have a list of tuples, each containing two integers. I need to sort the list (in reverse order) according to the difference of the integers in each tuple, but break ties with the larger first integer.
Example
For [(5, 6), (4, 1), (6, 7)], we should get [(4, 1), (6, 7), (5, 6)].
My way
I have already solved it by making a dictionary that contains the difference as the key and the tuple as the value. But the whole thing is a bit clumsy.
What is a better way?

Use a key function to sorted() and return a tuple; values will be sorted lexicographically:
sorted(yourlst, key=lambda t: (abs(t[0] - t[1]), t[0]), reverse=True)
I'm using abs() here to calculate a difference, regardless of which of the two integers is larger.
For your sample input, the key produces (1, 5), (3, 4) and (1, 6); in reverse order that puts (1, 6) (for the (6, 7) tuple) before (1, 5) (corresponding with (5, 6)).
Demo:
>>> yourlst = [(5, 6), (4, 1), (6, 7)]
>>> sorted(yourlst, key=lambda t: (abs(t[0] - t[1]), t[0]), reverse=True)
[(4, 1), (6, 7), (5, 6)]
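To see why the tuple key breaks ties as intended, it can help to look at the key computed for each element. This is a minimal sketch that reuses the sample list from the question; the helper name sort_key is only illustrative and not part of the original answer.

```python
def sort_key(t):
    # Primary criterion: absolute difference; secondary: the first integer.
    return (abs(t[0] - t[1]), t[0])

yourlst = [(5, 6), (4, 1), (6, 7)]

keys = [sort_key(t) for t in yourlst]
print(keys)    # [(1, 5), (3, 4), (1, 6)]

# reverse=True flips the whole tuple comparison, so both the difference
# and the tie-breaker are taken largest-first.
result = sorted(yourlst, key=sort_key, reverse=True)
print(result)  # [(4, 1), (6, 7), (5, 6)]
```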

Given this list=[('a','b',3),('d','e',3),('e','f',5)], if you'd like to sort by the number in descending order, but break ties (when the counts are equal, as with the 3s in this example) using ascending alphabetical order of the first element and then the second element respectively, the following code works:
sorted(list,key=lambda x: (-x[2],x[0],x[1]))
Here the '-' sign on x[2] makes the numbers sort in descending order, while the other two fields stay in ascending order.
The output will be: [('e', 'f', 5), ('a', 'b', 3), ('d', 'e', 3)]

Related

How do I solve it? [duplicate]

I am trying to solve a problem involving lists and sorting: when a list of tuples (not a plain list) is entered, the program should print the list sorted in increasing order of the second element of each tuple.
ex:
Sample List : [(2, 5), (1, 2), (4, 4), (2, 3), (2, 1)]
Expected Result : [(2, 1), (1, 2), (2, 3), (4, 4), (2, 5)]
Here is what I have tried until now:
def sorting(L):
    le=len(L)
    G=[]
    Lnew=[list(l) for l in L]
    for i in range(le):
        G.append(Lnew[i][1])
        G.sort()
        Lnew.remove(Lnew[i][1]) #where the problem is
    for k in range(len(G)):
        Lnew[k][1]=G[k]
    for elt in Lnew:
        tuple(elt)
    return L
This raises the error "list.remove(x): x not in list".
How do I proceed in that case? Or is there a simpler way to tackle the problem?
from operator import itemgetter
sorted_tuples = sorted(list_of_tuples, key=itemgetter(1))
The argument to itemgetter is the index to sort by (0, 1, 2, ...); use 1 to sort by the second element.
def sorting(my_list):
    return sorted(my_list, key=lambda x: x[1])
sorting([(2, 5), (1, 2), (4, 4), (2, 3), (2, 1)])
Output: [(2, 1), (1, 2), (2, 3), (4, 4), (2, 5)]
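As for the error in the question: list.remove(x) searches the list for an element equal to x, and the attempted code passes it an integer while the list holds whole pairs, so nothing matches. The sketch below reproduces that situation on the sample data and then sorts in place instead; the variable names pairs and error are illustrative only.

```python
from operator import itemgetter

pairs = [(2, 5), (1, 2), (4, 4), (2, 3), (2, 1)]

# list.remove(x) looks for an element equal to x. The list holds tuples,
# so asking it to remove a bare integer raises ValueError.
try:
    pairs.remove(pairs[0][1])   # tries to remove the integer 5
    error = None
except ValueError as exc:
    error = str(exc)

# Sorting in place by the second element needs no manual bookkeeping.
pairs.sort(key=itemgetter(1))
print(pairs)  # [(2, 1), (1, 2), (2, 3), (4, 4), (2, 5)]
```

list.sort() modifies the list in place and returns None, whereas sorted() leaves the input untouched and returns a new list.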

Insert every tuple in a list to another tuple so I will have a list of tuples of tuples

I am trying to convert the following list of tuples to a list that contains tuples of the original tuples.
For example we have the following list:
arr1 = [(1, 2), (2, 3), (4, 5)]
I tried to do the following conversion but it didn't work:
arr1_tuples = [tuple(item) for item in arr1]
The desired output is:
[((1, 2)), ((2, 3)), ((4, 5))]
Your desired output is impossible (single element tuples will have a repr with a trailing comma), but your desired data structure is achievable with:
arr1_tuples = [(item,) for item in arr1]
That wraps each item in a one-tuple. The parentheses aren't necessary, but the lone trailing comma is easy to miss without them.
The result, if printed, would be:
[((1, 2),), ((2, 3),), ((4, 5),)]
# The trailing commas are unavoidable in one-tuples; otherwise this matches the request
Use map:
arr1 = [(1, 2), (2, 3), (4, 5)]
arr1_tuples = map(lambda x:(x,), arr1)
print(list(arr1_tuples))
Output:
[((1, 2),), ((2, 3),), ((4, 5),)]
lambda x:(x,) will take in an element, and return it inside a tuple.
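The reason the requested output cannot exist as written is that parentheses alone only group an expression; it is the comma that creates a tuple. A small sketch to illustrate, using the list from the question (the names same and wrapped are illustrative):

```python
arr1 = [(1, 2), (2, 3), (4, 5)]

# Extra parentheses only group an expression; they do not create a tuple.
same = ((1, 2))
print(same == (1, 2))            # True

# A trailing comma is what makes a one-element tuple.
wrapped = ((1, 2),)
print(len(wrapped), wrapped[0])  # 1 (1, 2)

arr1_tuples = [(item,) for item in arr1]
print(arr1_tuples)               # [((1, 2),), ((2, 3),), ((4, 5),)]
```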

Python 3: Reverse consecutive runs in sorted list?

This question is an extension of What's the most Pythonic way to identify consecutive duplicates in a list?.
Suppose you have a list of tuples:
my_list = [(1,4), (2,3), (3,2), (4,4), (5,2)]
and you sort it by each tuple's last value:
my_list = sorted(my_list, key=lambda tuple: tuple[1])
# [(3,2), (5,2), (2,3), (1,4), (4,4)]
then we have two consecutive runs (looking at the last value in each tuple), namely [(3,2), (5,2)] and [(1,4), (4,4)].
What is the pythonic way to reverse each run (not the tuples within), e.g.
reverse_runs(my_list)
# [(5,2), (3,2), (2,3), (4,4), (1,4)]
Is this possible to do within a generator?
UPDATE
It has come to my attention that perhaps the example list was not clear. So instead consider:
my_list = [(1,"A"), (2,"B"), (5,"C"), (4,"C"), (3,"C"), (6,"A"),(7,"A"), (8,"D")]
Where the ideal output from reverse_runs would be
[(7,"A"), (6,"A"), (1,"A"), (2,"B"), (3,"C"), (4,"C"), (5,"C"), (8,"D")]
To be clear on terminology, I am adopting "run" as it is used in descriptions of TimSort, the algorithm that Python's sort function is based upon and that gives it (the sort function) its stability.
Thus if you sort a collection of multi-field items, only the specified field is compared, and if two elements are equal on that field, their relative order will not be altered.
Thus the following function:
sorted(my_list,key=lambda t: t[1])
yields:
[(1, 'A'), (6, 'A'), (7, 'A'), (2, 'B'), (5, 'C'), (4, 'C'), (3, 'C'), (8, 'D')]
and the run on "C" (i.e. (5, 'C'), (4, 'C'), (3, 'C') ) is not disturbed.
So in conclusion the desired output from the yet to be defined function reverse_runs:
1.) sorts the tuples by their last element
2.) maintaining the order of the first element, reverses runs on the last element
Ideally I would like this as a generator function, but that does not (to me at the moment) seem possible.
Thus one could adopt the following strategy:
1.) Sort the tuples by the last element via sorted(my_list, key=lambda tuple: tuple[1])
2.) Identify the indexes (i) where the last element of the succeeding tuple (i+1) differs from the last element of tuple (i), i.e. identify the runs
3.) Make an empty list
4.) Using slicing, obtain, reverse, and then append each sublist to the empty list
I think this will work.
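The four steps above can be sketched as follows. This is only one possible reading of the strategy, using plain slicing and no itertools helpers; it is not taken from any of the answers.

```python
def reverse_runs(pairs):
    # Step 1: stable sort by the last element of each tuple.
    ordered = sorted(pairs, key=lambda t: t[1])

    result = []   # Step 3: the list that collects the reversed runs
    start = 0     # index where the current run begins
    for i in range(1, len(ordered) + 1):
        # Step 2: a run ends at the end of the list or where the key changes.
        if i == len(ordered) or ordered[i][1] != ordered[start][1]:
            # Step 4: slice out the run, reverse it, and append it.
            result.extend(ordered[start:i][::-1])
            start = i
    return result


my_list = [(1, "A"), (2, "B"), (5, "C"), (4, "C"), (3, "C"),
           (6, "A"), (7, "A"), (8, "D")]
print(reverse_runs(my_list))
```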
my_list = [(1,4), (2,3), (3,2), (4,4), (5,2)]
my_list = sorted(my_list, key=lambda tuple: (tuple[1], -tuple[0]))
print(my_list)
Output
[(5, 2), (3, 2), (2, 3), (4, 4), (1, 4)]
Misunderstood question. Less pretty but this should work for what you really want:
from itertools import groupby
from operator import itemgetter
def reverse_runs(l):
    sorted_list = sorted(l, key=itemgetter(1))
    reversed_groups = (reversed(list(g)) for _, g in groupby(sorted_list, key=itemgetter(1)))
    reversed_runs = [e for sublist in reversed_groups for e in sublist]
    return reversed_runs
if __name__ == '__main__':
    print(reverse_runs([(1, 4), (2, 3), (3, 2), (4, 4), (5, 2)]))
    print(reverse_runs([(1, "A"), (2, "B"), (5, "C"), (4, "C"), (3, "C"), (6, "A"), (7, "A"), (8, "D")]))
Output
[(5, 2), (3, 2), (2, 3), (4, 4), (1, 4)]
[(7, 'A'), (6, 'A'), (1, 'A'), (2, 'B'), (3, 'C'), (4, 'C'), (5, 'C'), (8, 'D')]
Generator version:
from itertools import groupby
from operator import itemgetter
def reverse_runs(l):
    sorted_list = sorted(l, key=itemgetter(1))
    reversed_groups = (reversed(list(g)) for _, g in groupby(sorted_list, key=itemgetter(1)))
    for group in reversed_groups:
        yield from group
if __name__ == '__main__':
    print(list(reverse_runs([(1, 4), (2, 3), (3, 2), (4, 4), (5, 2)])))
    print(list(reverse_runs([(1, "A"), (2, "B"), (5, "C"), (4, "C"), (3, "C"), (6, "A"), (7, "A"), (8, "D")])))
The most general case requires 2 sorts. The first pass is a reversed sort on the secondary criterion (here, the first element). The second pass is a forward sort on the primary criterion (here, the last element):
from operator import itemgetter
pass1 = sorted(my_list, key=itemgetter(0), reverse=True)
result = sorted(pass1, key=itemgetter(1))
We can sort in multiple passes like this because Python's sort algorithm is guaranteed to be stable.
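A minimal sketch of the two passes on the first sample list from the question, printing the intermediate result so the effect of stability is visible. The multi-pass form is also the one to reach for when a key cannot be negated, for example a string.

```python
from operator import itemgetter

my_list = [(1, 4), (2, 3), (3, 2), (4, 4), (5, 2)]

# Pass 1: secondary criterion (first element), descending.
pass1 = sorted(my_list, key=itemgetter(0), reverse=True)
print(pass1)   # [(5, 2), (4, 4), (3, 2), (2, 3), (1, 4)]

# Pass 2: primary criterion (last element), ascending. Ties keep the
# order that pass 1 established because the sort is stable.
result = sorted(pass1, key=itemgetter(1))
print(result)  # [(5, 2), (3, 2), (2, 3), (4, 4), (1, 4)]
```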
However, in real life it's often possible to simply construct a more clever key function which allows the sorting to happen in one pass. This usually involves "negating" one of the values and relying on the fact that tuples order themselves lexicographically:
result = sorted(my_list, key=lambda t: (t[1], -t[0]))
In response to your update, it looks like the following might be a suitable solution:
from operator import itemgetter
from itertools import chain, groupby
my_list = [(1,"A"), (2,"B"), (5,"C"), (4,"C"), (3,"C"), (6,"A"),(7,"A"), (8,"D")]
pass1 = sorted(my_list, key=itemgetter(1))
result = list(chain.from_iterable(reversed(list(g)) for k, g in groupby(pass1, key=itemgetter(1))))
print(result)
We can take apart the expression:
chain.from_iterable(reversed(list(g)) for k, g in groupby(pass1, key=itemgetter(1)))
to try to figure out what it's doing...
First, let's look at groupby(pass1, key=itemgetter(1)). groupby will yield 2-tuples. The first item (k) in the tuple is the "key" -- e.g. whatever was returned from itemgetter(1). The key isn't really important here after the grouping has taken place, so we don't use it. The second item (g -- for "group") is an iterable that yields consecutive values that have the same "key". This is exactly the items that you requested, however, they're in the order that they were in after sorting. You requested them in reverse order. In order to reverse an arbitrary iterable, we can construct a list from it and then reverse the list. e.g. reversed(list(g)). Finally, we need to paste those chunks back together again which is where chain.from_iterable comes in.
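To make the walkthrough above concrete, the sketch below materialises each group before reversing it, so the intermediate pieces can be printed. The variable names groups and members are illustrative; the input is the sample list already sorted by its second element.

```python
from itertools import chain, groupby
from operator import itemgetter

# The sample list after sorting by the second element.
pass1 = [(1, "A"), (6, "A"), (7, "A"), (2, "B"),
         (5, "C"), (4, "C"), (3, "C"), (8, "D")]

# Materialise each group so it can be inspected and reversed.
groups = [(k, list(g)) for k, g in groupby(pass1, key=itemgetter(1))]
for k, members in groups:
    print(k, members, "->", members[::-1])

# Reverse every group and paste the pieces back together.
result = list(chain.from_iterable(reversed(members) for _, members in groups))
print(result)
```

Note that groupby only groups adjacent items, which is why the list has to be sorted by the same key first.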
If we want to get more clever, we might do better from an algorithmic standpoint (assuming that the "key" for the bins is hashable). The trick is to bin the objects in a dictionary and then sort the bins. This means that we're potentially sorting a much shorter list than the original:
from collections import defaultdict, deque
from itertools import chain
my_list = [(1,"A"), (2,"B"), (5,"C"), (4,"C"), (3,"C"), (6,"A"),(7,"A"), (8,"D")]
bins = defaultdict(deque)
for t in my_list:
    bins[t[1]].appendleft(t)
print(list(chain.from_iterable(bins[key] for key in sorted(bins))))
Note that whether this does better than the first approach is very dependent on the initial data. Since TimSort is such a beautiful algorithm, if the data starts already grouped into bins, then this algorithm will likely not beat it (though, I'll leave it as an exercise for you to try...). However, if the data is well scattered (causing TimSort to behave more like MergeSort), then binning first will possibly make for a slight win.
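Before comparing speed, it is worth checking that the two approaches agree. The sketch below wraps each one in a function (the names reverse_runs_sorted and reverse_runs_binned are made up for this example) and compares their output on the sample list. It makes no claim about which is faster; that would have to be measured on your own data, for instance with timeit.

```python
from collections import defaultdict, deque
from itertools import chain, groupby
from operator import itemgetter


def reverse_runs_sorted(pairs):
    # Sort everything, then reverse each run of equal keys.
    ordered = sorted(pairs, key=itemgetter(1))
    return list(chain.from_iterable(
        reversed(list(g)) for _, g in groupby(ordered, key=itemgetter(1))))


def reverse_runs_binned(pairs):
    # Bin by key (appendleft reverses arrival order), then sort only the keys.
    bins = defaultdict(deque)
    for t in pairs:
        bins[t[1]].appendleft(t)
    return list(chain.from_iterable(bins[key] for key in sorted(bins)))


my_list = [(1, "A"), (2, "B"), (5, "C"), (4, "C"), (3, "C"),
           (6, "A"), (7, "A"), (8, "D")]
same = reverse_runs_sorted(my_list) == reverse_runs_binned(my_list)
print(same)  # True
```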

Difficulty Sorting Tuples with itemgetter

I'm new to Python as the screen name attests. I was attempting to sort a list of tuples, think (x,y) pairs in a list and ran into a problem. My goal is to sort the list of tuples by the x variables in ascending order primarily but then sort
I investigated the wiki on HowToSort at http://wiki.python.org/moin/HowTo/Sorting/ and thought I would try the operator module and the itemgetter function as a key.
The simple sorted() function can sort the list of tuples fine, but when you want one index descending and one ascending, I'm lost. Here is the code:
from operator import itemgetter, attrgetter
ItemList = [(1,7),(2,1),(1,5),(1,1)]
# Want list sorted with X values descending, then y values ascending
# expected [(2, 1), (1, 1), (1,5), (1, 7)]
print
print ' Input:', ItemList
print 'Output1:',sorted(ItemList, reverse = True)
print
print ' Input:', ItemList
print 'Output2:', sorted(ItemList, key = itemgetter(-0,1))
print
print ' WANTED:', '[(2, 1), (1, 1), (1,5), (1, 7)]'
with the following output:
Input: [(1, 7), (2, 1), (1, 5), (1, 1)]
Output1: [(2, 1), (1, 7), (1, 5), (1, 1)]
Input: [(1, 7), (2, 1), (1, 5), (1, 1)]
Output2: [(1, 1), (1, 5), (1, 7), (2, 1)]
WANTED: [(2, 1), (1, 1), (1, 5), (1, 7)]
I obviously do not understand the itemgetter function, so any help on that would be appreciated.
Also, any ideas on how to do the two-way sort on (x,y) pairs? I am hoping to avoid a lambda solution but I'm sure that's where this is going. Thanks.
-0 is the same thing as 0. Moreover, negative indices have a different meaning to itemgetter(); it does not mean that the values are negated.
Use a lambda instead:
sorted(ItemList, key=lambda item: (-item[0], item[1]))
Demo:
>>> ItemList = [(1,7),(2,1),(1,5),(1,1)]
>>> sorted(ItemList, key=lambda item: (-item[0], item[1]))
[(2, 1), (1, 1), (1, 5), (1, 7)]
Negative indices take items from the end of a sequence:
>>> end = itemgetter(-1)
>>> end([1, 2, 3])
3
The itemgetter() will never modify the retrieved item, certainly not negate it.
Note that itemgetter() is only a convenience function; you do not have to use it, and for more complex sorting orders a custom function or lambda is the better choice.
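If avoiding a lambda really matters, one alternative (not part of the answer above, offered only as a sketch) is to sort in two passes with itemgetter and rely on the stability of Python's sort: first by the secondary criterion, then by the primary one.

```python
from operator import itemgetter

ItemList = [(1, 7), (2, 1), (1, 5), (1, 1)]

# Pass 1: secondary criterion, y ascending.
by_y = sorted(ItemList, key=itemgetter(1))

# Pass 2: primary criterion, x descending. reverse=True still keeps the
# pass-1 order among items with equal x, because the sort is stable.
result = sorted(by_y, key=itemgetter(0), reverse=True)
print(result)  # [(2, 1), (1, 1), (1, 5), (1, 7)]
```

This costs a second sort, so the single-pass key with a negated value is usually the simpler choice when the values are numeric.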

Python Easiest Way to Sum List Intersection of List of Tuples

Let's say I have the following two lists of tuples
myList = [(1, 7), (3, 3), (5, 9)]
otherList = [(2, 4), (3, 5), (5, 2), (7, 8)]
returns => [(1, 7), (2, 4), (3, 8), (5, 11), (7, 8)]
I would like to design a merge operation that merges these two lists by checking for any intersections on the first element of the tuple, if there are intersections, add the second elements of each tuple in question (merge the two). After the operation I would like to sort based upon the first element.
I am also posting this because I think it's a pretty common problem that has an obvious solution, but I feel that there could be very pythonic solutions to this question ;)
Use a dictionary for the result:
result = {}
for k, v in my_list + other_list:
    result[k] = result.get(k, 0) + v
If you want a list of tuples, you can get it via list(result.items()). On Python 3.7+ the items come out in insertion order (older versions give an arbitrary order), so sort the list if you need it ordered by key.
(Note that I renamed your lists to conform with Python's style conventions.)
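Put together with the sorting step the question asks for, the dictionary approach might look like the sketch below. The function name merge_sum is hypothetical and chosen only for this example.

```python
def merge_sum(first, second):
    # Accumulate the second elements under their shared first element.
    totals = {}
    for k, v in first + second:
        totals[k] = totals.get(k, 0) + v
    # sorted() on (key, value) pairs orders by key first.
    return sorted(totals.items())


my_list = [(1, 7), (3, 3), (5, 9)]
other_list = [(2, 4), (3, 5), (5, 2), (7, 8)]
print(merge_sum(my_list, other_list))
# [(1, 7), (2, 4), (3, 8), (5, 11), (7, 8)]
```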
Use defaultdict:
from collections import defaultdict
results_dict = defaultdict(int)
results_dict.update(my_list)
for a, b in other_list:
    results_dict[a] += b
results = sorted(results_dict.items())
Note: When sorting sequences, sorted sorts by the first item in the sequence. If the first elements are the same, then it compares the second element. You can give sorted a function to sort by, using the key keyword argument:
results = sorted(results_dict.items(), key=lambda x: x[1]) #sort by the 2nd item
or
results = sorted(results_dict.items(), key=lambda x: abs(x[0])) #sort by absolute value
A method using itertools:
>>> myList = [(1, 7), (3, 3), (5, 9)]
>>> otherList = [(2, 4), (3, 5), (5, 2), (7, 8)]
>>> import itertools
>>> merged = []
>>> for k, g in itertools.groupby(sorted(myList + otherList), lambda e: e[0]):
...     merged.append((k, sum(e[1] for e in g)))
...
>>> merged
[(1, 7), (2, 4), (3, 8), (5, 11), (7, 8)]
This first concatenates the two lists together and sorts it. itertools.groupby returns the elements of the merged list, grouped by the first element of the tuple, so it just sums them up and places it into the merged list.
>>> [(k, sum(v for x,v in myList + otherList if k == x)) for k in dict(myList + otherList).keys()]
[(1, 7), (2, 4), (3, 8), (5, 11), (7, 8)]
>>>
tested for both Python2.7 and 3.2
dict(myList + otherList).keys() returns an iterable containing a set of the keys for the joined lists
sum(...) takes 'k' to loop again through the joined list and add up tuple items 'v' where k == x
... but the extra looping adds processing overhead. Using an explicit dictionary as proposed by Sven Marnach avoids it.
