sorting tuples in python with a custom key

sorting tuples in python with a custom key - python

Hi:
I'm trying to sort a list of tuples in a custom way:
For example:
lt = [(2,4), (4,5), (5,2)]
must be sorted:
lt = [(5,2), (2,4), (4,5)]
Rules:
* b tuple is greater than a tuple if a[1] == b[0]
* a tuple is greater than b tuple if a[0] == b[1]
I've implemented a cmp function like this:
def tcmp(a, b):
if a[1] == b[0]:
return -1
elif a[0] == b[1]:
return 1
else:
return 0
but sorting the list:
lt.sort(tcmp)
lt show me:
lt = [(2, 4), (4, 5), (5, 2)]
What am I doing wrong?

Sounds a lot to me you are trying to solve one of the Google's Python class problems, which is to sort a list of tuples in increasing order based on their last element.
This how I did it:
def sort_last(tuples):
def last_value_tuple(t):
return t[-1]
return sorted(tuples, key=last_value_tuple)
EDIT: I didn't read the whole thing, and I assumed it was based on the last element of the tuple. Well, still I'm going to leave it here because it can be useful to anyone.

You could also write your code using lambda
def sort(tuples):
return sorted (tuples,key=lambda last : last[-1])
so sort([(1, 3), (3, 2), (2, 1)]) would yield [(2, 1), (3, 2), (1, 3)]

You can write your own custom key function to specify the key value for sorting.
Ex.
def sort_last(tuples):
return sorted(tuples, key=last)
def last(a):
return a[-1]
tuples => sorted tuple by last element
[(1, 3), (3, 2), (2, 1)] => [(2, 1), (3, 2), (1, 3)]
[(1, 7), (1, 3), (3, 4, 5), (2, 2)] => [(2, 2), (1, 3), (3, 4, 5), (1, 7)]

I'm not sure your comparison function is a valid one in a mathematical sense, i.e. transitive. Given a, b, c a comparison function saying that a > b and b > c implies that a > c. Sorting procedures rely on this property.
Not to mention that by your rules, for a = [1, 2] and b = [2, 1] you have both a[1] == b[0] and a[0] == b[1] which means that a is both greater and smaller than b.

Your ordering specification is wrong because it is not transitive.
Transitivity means that if a < b and b < c, then a < c. However, in your case:
(1,2) < (2,3)
(2,3) < (3,1)
(3,1) < (1,2)

Try lt.sort(tcmp, reverse=True).
(While this may produce the right answer, there may be other problems with your comparison method)

Related

Finding mismatches in tuples and merging them in Python

I have two tuples a = ((1, 'AB'), (2, 'BC'), (3, 'CD')) and b = ((1, 'AB'), (2, 'XY'), (3, 'ZA')). By analysing these two tuples, it can be found that there are mismatches in the tuples, i.e, (2, 'BC') is present in a but (2, 'XY') is present in b.
I need to figure out such mismatches and come with a tuple that has the values as
result = ((2, 'BC', 'XY'), (3, 'CD', 'ZA'))
(order shall be preserved)
The closest reference I could catch hold is Comparing sublists and merging them, but this is for lists and I couldn't find a way to work with tuples.
Is there a way by which I can perform this operation?

Since there cannot be missing "keys" from a or b (or those values should be ignored), I would turn b into a dictionary, then loop on a and compare values.
a = ((1, 'AB'), (2, 'BC'), (3, 'CD'))
b = ((1, 'AB'), (2, 'XY'), (3, 'ZA'))
b = dict(b)
mismatches = [(k,v,b[k]) for k,v in a if b.get(k,v) != v]
print(mismatches)
result:
[(2, 'BC', 'XY'), (3, 'CD', 'ZA')]
the solution has the advantage of being almost 1 line, fast (because of dict lookup) and preserves order.
the if b.get(k,v) != v condition safeguards against a having one tuple with a number not in b dictionary. In that case, default value of get returns v and the condition is False

If the lists are guaranteed to have the same order of the numbers in the tuples, you can do something like:
[ai + (bi[1],) for ai, bi in zip(a, b) if ai != bi]
and if there is no guarantee on the order you can do:
[ai + (bi[1],) for ai, bi in zip(sorted(a), sorted(b)) if ai != bi]

Algorithm to generate subsets satisfying binary relation

I am looking for a reasonable algorithm in python (well, because I have rather complicated mathematical objects implemented in python, so I cannot change the language) to achieve the following:
I am given a reflexive, symmetric binary relation bin_rel on a set X. The requested function maximal_compatible_subsets(X, bin_rel) should return all containmentwise maximal subsets of X such that the binary relation holds for all pairs a,b of elements in X.
In some more detail: Suppose I am given a binary relation on a set of objects, say
def bin_rel(elt1,elt2):
# return True if elt1 and elt2 satisfy the relation and False otherwise
# Example: set intersection, in this case, elt1 and elt2 are sets
# and the relation picks those pairs that have a nonempty intersection
return elt1.intersection(elt2)
I can also assume that the relation bin_rel is reflexive (this is, binary_rel(a,a) is True holds) and symmetric (this is, binary_rel(a,b) is binary_rel(b,a) holds).
I am now given a set X and a function bin_rel as above and seek an efficient algorithm to obtain the desired subsets of X
For example, in the case of the set intersection above (with sets replaced by lists for easier reading):
> X = [ [1,2,3], [1,3], [1,6], [3,4], [3,5], [4,5] ]
> maximal_compatible_subsets(X,bin_rel)
[[[1,2,3],[1,3],[1,6]], [[1,2,3],[1,3],[3,4],[3,5]], [[3,4],[3,5],[4,5]]]
This problem doesn't seem to be very exotic, so most welcome would be a pointer to an efficient existing snippet of code.

As Matt Timmermans noted this is finding maximal cliques problem that can be solved by Bron–Kerbosch algorithm. NetworkX has implementation that can be used for Python.

If you want to use python straight out of the box, you could use the following as a starting point:
from itertools import combinations
def maximal_compatible_subsets(X, bin_rel):
retval = []
for i in range(len(X) + 1, 1, -1):
for j in combinations(X, i):
if all(bin_rel(a, b) for a, b in combinations(j, 2)) and not any(set(j).issubset(a) for a in retval):
retval.append(tuple(j))
return tuple(retval)
if __name__ == '__main__':
x = ( (1,2,3), (1,3), (1,6), (3,4), (3,5), (4,5) )
def nonempty_intersection(a, b):
return set(a).intersection(b)
print x
print maximal_compatible_subsets(x, nonempty_intersection)
Outputs:
((1, 2, 3), (1, 3), (1, 6), (3, 4), (3, 5), (4, 5))
(((1, 2, 3), (1, 3), (3, 4), (3, 5)), ((1, 2, 3), (1, 3), (1, 6)), ((3, 4), (3, 5), (4, 5)))

Pairwise circular Python 'for' loop

Is there a nice Pythonic way to loop over a list, retuning a pair of elements? The last element should be paired with the first.
So for instance, if I have the list [1, 2, 3], I would like to get the following pairs:
1 - 2
2 - 3
3 - 1

A Pythonic way to access a list pairwise is: zip(L, L[1:]). To connect the last item to the first one:
>>> L = [1, 2, 3]
>>> zip(L, L[1:] + L[:1])
[(1, 2), (2, 3), (3, 1)]

I would use a deque with zip to achieve this.
>>> from collections import deque
>>>
>>> l = [1,2,3]
>>> d = deque(l)
>>> d.rotate(-1)
>>> zip(l, d)
[(1, 2), (2, 3), (3, 1)]

I'd use a slight modification to the pairwise recipe from the itertools documentation:
def pairwise_circle(iterable):
"s -> (s0,s1), (s1,s2), (s2, s3), ... (s<last>,s0)"
a, b = itertools.tee(iterable)
first_value = next(b, None)
return itertools.zip_longest(a, b,fillvalue=first_value)
This will simply keep a reference to the first value and when the second iterator is exhausted, zip_longest will fill the last place with the first value.
(Also note that it works with iterators like generators as well as iterables like lists/tuples.)
Note that #Barry's solution is very similar to this but a bit easier to understand in my opinion and easier to extend beyond one element.

I would pair itertools.cycle with zip:
import itertools
def circular_pairwise(l):
second = itertools.cycle(l)
next(second)
return zip(l, second)
cycle returns an iterable that yields the values of its argument in order, looping from the last value to the first.
We skip the first value, so it starts at position 1 (rather than 0).
Next, we zip it with the original, unmutated list. zip is good, because it stops when any of its argument iterables are exhausted.
Doing it this way avoids the creation of any intermediate lists: cycle holds a reference to the original, but doesn't copy it. zip operates in the same way.
It's important to note that this will break if the input is an iterator, such as a file, (or a map or zip in python-3), as advancing in one place (through next(second)) will automatically advance the iterator in all the others. This is easily solved using itertools.tee, which produces two independently operating iterators over the original iterable:
def circular_pairwise(it):
first, snd = itertools.tee(it)
second = itertools.cycle(snd)
next(second)
return zip(first, second)
tee can use large amounts of additional storage, for example, if one of the returned iterators is used up before the other is touched, but as we only ever have one step difference, the additional storage is minimal.

There are more efficient ways (that don't built temporary lists), but I think this is the most concise:
> l = [1,2,3]
> zip(l, (l+l)[1:])
[(1, 2), (2, 3), (3, 1)]

Pairwise circular Python 'for' loop
If you like the accepted answer,
zip(L, L[1:] + L[:1])
you can go much more memory light with semantically the same code using itertools:
from itertools import islice, chain #, izip as zip # uncomment if Python 2
And this barely materializes anything in memory beyond the original list (assuming the list is relatively large):
zip(l, chain(islice(l, 1, None), islice(l, None, 1)))
To use, just consume (for example, with a list):
>>> list(zip(l, chain(islice(l, 1, None), islice(l, None, 1))))
[(1, 2), (2, 3), (3, 1)]
This can be made extensible to any width:
def cyclical_window(l, width=2):
return zip(*[chain(islice(l, i, None), islice(l, None, i)) for i in range(width)])
and usage:
>>> l = [1, 2, 3, 4, 5]
>>> cyclical_window(l)
<itertools.izip object at 0x112E7D28>
>>> list(cyclical_window(l))
[(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
>>> list(cyclical_window(l, 4))
[(1, 2, 3, 4), (2, 3, 4, 5), (3, 4, 5, 1), (4, 5, 1, 2), (5, 1, 2, 3)]
Unlimited generation with itertools.tee with cycle
You can also use tee to avoid making a redundant cycle object:
from itertools import cycle, tee
ic1, ic2 = tee(cycle(l))
next(ic2) # must still queue up the next item
and now:
>>> [(next(ic1), next(ic2)) for _ in range(10)]
[(1, 2), (2, 3), (3, 1), (1, 2), (2, 3), (3, 1), (1, 2), (2, 3), (3, 1), (1, 2)]
This is incredibly efficient, an expected usage of iter with next, and elegant usage of cycle, tee, and zip.
Don't pass cycle directly to list unless you have saved your work and have time for your computer to creep to a halt as you max out its memory - if you're lucky, after a while your OS will kill the process before it crashes your computer.
Pure Python Builtin Functions
Finally, no standard lib imports, but this only works for up to the length of original list (IndexError otherwise.)
>>> [(l[i], l[i - len(l) + 1]) for i in range(len(l))]
[(1, 2), (2, 3), (3, 1)]
You can continue this with modulo:
>>> len_l = len(l)
>>> [(l[i % len_l], l[(i + 1) % len_l]) for i in range(10)]
[(1, 2), (2, 3), (3, 1), (1, 2), (2, 3), (3, 1), (1, 2), (2, 3), (3, 1), (1, 2)]

I would use a list comprehension, and take advantage of the fact that l[-1] is the last element.
>>> l = [1,2,3]
>>> [(l[i-1],l[i]) for i in range(len(l))]
[(3, 1), (1, 2), (2, 3)]
You don't need a temporary list that way.

Amazing how many different ways there are to solve this problem.
Here's one more. You can use the pairwise recipe but instead of zipping with b, chain it with the first element that you already popped off. Don't need to cycle when we just need a single extra value:
from itertools import chain, izip, tee
def pairwise_circle(iterable):
a, b = tee(iterable)
first = next(b, None)
return izip(a, chain(b, (first,)))

I like a solution that does not modify the original list and does not copy the list to temporary storage:
def circular(a_list):
for index in range(len(a_list) - 1):
yield a_list[index], a_list[index + 1]
yield a_list[-1], a_list[0]
for x in circular([1, 2, 3]):
print x
Output:
(1, 2)
(2, 3)
(3, 1)
I can imagine this being used on some very large in-memory data.

This one will work even if the list l has consumed most of the system's memory. (If something guarantees this case to be impossible, then zip as posted by chepner is fine)
l.append( l[0] )
for i in range( len(l)-1):
pair = l[i],l[i+1]
# stuff involving pair
del l[-1]
or more generalizably (works for any offset n i.e. l[ (i+n)%len(l) ] )
for i in range( len(l)):
pair = l[i], l[ (i+1)%len(l) ]
# stuff
provided you are on a system with decently fast modulo division (i.e. not some pea-brained embedded system).
There seems to be a often-held belief that indexing a list with an integer subscript is un-pythonic and best avoided. Why?

This is my solution, and it looks Pythonic enough to me:
l = [1,2,3]
for n,v in enumerate(l):
try:
print(v,l[n+1])
except IndexError:
print(v,l[0])
prints:
1 2
2 3
3 1
The generator function version:
def f(iterable):
for n,v in enumerate(iterable):
try:
yield(v,iterable[n+1])
except IndexError:
yield(v,iterable[0])
>>> list(f([1,2,3]))
[(1, 2), (2, 3), (3, 1)]

How about this?
li = li+[li[0]]
pairwise = [(li[i],li[i+1]) for i in range(len(li)-1)]

from itertools import izip, chain, islice
itr = izip(l, chain(islice(l, 1, None), islice(l, 1)))
(As above with #j-f-sebastian's "zip" answer, but using itertools.)
NB: EDITED given helpful nudge from #200_success. previously was:
itr = izip(l, chain(l[1:], l[:1]))

If you don't want to consume too much memory, you can try my solution:
[(l[i], l[(i+1) % len(l)]) for i, v in enumerate(l)]
It's a little slower, but consume less memory.

Starting in Python 3.10, the new pairwise function provides a way to create sliding pairs of consecutive elements:
from itertools import pairwise
# l = [1, 2, 3]
list(pairwise(l + l[:1]))
# [(1, 2), (2, 3), (3, 1)]
or simply pairwise(l + l[:1]) if you don't need the result as a list.
Note that we pairwise on the list appended with its head (l + l[:1]) so that rolling pairs are circular (i.e. so that we also include the (3, 1) pair):
list(pairwise(l)) # [(1, 2), (2, 3)]
l + l[:1] # [1, 2, 3, 1]

Just another try
>>> L = [1,2,3]
>>> zip(L,L[1:]) + [(L[-1],L[0])]
[(1, 2), (2, 3), (3, 1)]

L = [1, 2, 3]
a = zip(L, L[1:]+L[:1])
for i in a:
b = list(i)
print b

this seems like combinations would do the job.
from itertools import combinations
x=combinations([1,2,3],2)
this would yield a generator. this can then be iterated over as such
for i in x:
print i
the results would look something like
(1, 2)
(1, 3)
(2, 3)

Generating all possible combinations of a list, "itertools.combinations" misses some results

Given a list of items in Python, how can I get all the possible combinations of the items?
There are several similar questions on this site, that suggest using itertools.combinations, but that returns only a subset of what I need:
stuff = [1, 2, 3]
for L in range(0, len(stuff)+1):
for subset in itertools.combinations(stuff, L):
print(subset)
()
(1,)
(2,)
(3,)
(1, 2)
(1, 3)
(2, 3)
(1, 2, 3)
As you see, it returns only items in a strict order, not returning (2, 1), (3, 2), (3, 1), (2, 1, 3), (3, 1, 2), (2, 3, 1), and (3, 2, 1). Is there some workaround for that? I can't seem to come up with anything.

Use itertools.permutations:
>>> import itertools
>>> stuff = [1, 2, 3]
>>> for L in range(0, len(stuff)+1):
for subset in itertools.permutations(stuff, L):
print(subset)
...
()
(1,)
(2,)
(3,)
(1, 2)
(1, 3)
(2, 1)
(2, 3)
(3, 1)
....
Help on itertools.permutations:
permutations(iterable[, r]) --> permutations object
Return successive r-length permutations of elements in the iterable.
permutations(range(3), 2) --> (0,1), (0,2), (1,0), (1,2), (2,0), (2,1)

You can generate all the combinations of a list in python using this simple code
import itertools
a = [1,2,3,4]
for i in xrange(1,len(a)+1):
print list(itertools.combinations(a,i))
Result:
[(1,), (2,), (3,), (4,)]
[(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
[(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]
[(1, 2, 3, 4)]

Are you looking for itertools.permutations instead?
From help(itertools.permutations),
Help on class permutations in module itertools:
class permutations(__builtin__.object)
| permutations(iterable[, r]) --> permutations object
|
| Return successive r-length permutations of elements in the iterable.
|
| permutations(range(3), 2) --> (0,1), (0,2), (1,0), (1,2), (2,0), (2,1)
Sample Code :
>>> from itertools import permutations
>>> stuff = [1, 2, 3]
>>> for i in range(0, len(stuff)+1):
for subset in permutations(stuff, i):
print(subset)
()
(1,)
(2,)
(3,)
(1, 2)
(1, 3)
(2, 1)
(2, 3)
(3, 1)
(3, 2)
(1, 2, 3)
(1, 3, 2)
(2, 1, 3)
(2, 3, 1)
(3, 1, 2)
(3, 2, 1)
From Wikipedia, the difference between permutations and combinations :
Permutation :
Informally, a permutation of a set of objects is an arrangement of those objects into a particular order. For example, there are six permutations of the set {1,2,3}, namely (1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), and (3,2,1).
Combination :
In mathematics a combination is a way of selecting several things out of a larger group, where (unlike permutations) order does not matter.

itertools.permutations is going to be what you want. By mathematical definition, order does not matter for combinations, meaning (1,2) is considered identical to (2,1). Whereas with permutations, each distinct ordering counts as a unique permutation, so (1,2) and (2,1) are completely different.

Here is a solution without itertools
First lets define a translation between an indicator vector of 0 and 1s and a sub-list (1 if the item is in the sublist)
def indicators2sublist(indicators,arr):
return [item for item,indicator in zip(arr,indicators) if int(indicator)==1]
Next, Well define a mapping from a number between 0 and 2^n-1 to the its binary vector representation (using string's format function) :
def bin(n,sz):
return ('{d:0'+str(sz)+'b}').format(d=n)
All we have left to do, is to iterate all the possible numbers, and call indicators2sublist
def all_sublists(arr):
sz=len(arr)
for n in xrange(0,2**sz):
b=bin(n,sz)
yield indicators2sublist(b,arr)

I assume you want all possible combinations as 'sets' of values. Here is a piece of code that I wrote that might help give you an idea:
def getAllCombinations(object_list):
uniq_objs = set(object_list)
combinations = []
for obj in uniq_objs:
for i in range(0,len(combinations)):
combinations.append(combinations[i].union([obj]))
combinations.append(set([obj]))
return combinations
Here is a sample:
combinations = getAllCombinations([20,10,30])
combinations.sort(key = lambda s: len(s))
print combinations
... [set([10]), set([20]), set([30]), set([10, 20]), set([10, 30]), set([20, 30]), set([10, 20, 30])]
I think this has n! time complexity, so be careful. This works but may not be most efficient

just thought i'd put this out there since i couldn't fine EVERY possible outcome and keeping in mind i only have the rawest most basic of knowledge when it comes to python and there's probably a much more elegant solution...(also excuse the poor variable names
testing = [1, 2, 3]
testing2= [0]
n = -1
def testingSomethingElse(number):
try:
testing2[0:len(testing2)] == testing[0]
n = -1
testing2[number] += 1
except IndexError:
testing2.append(testing[0])
while True:
n += 1
testing2[0] = testing[n]
print(testing2)
if testing2[0] == testing[-1]:
try:
n = -1
testing2[1] += 1
except IndexError:
testing2.append(testing[0])
for i in range(len(testing2)):
if testing2[i] == 4:
testingSomethingElse(i+1)
testing2[i] = testing[0]
i got away with == 4 because i'm working with integers but you may have to modify that accordingly...

Delete (a,b) from a python list of tuples if (b,a) exists

From a python list of tuples (which is essentially a cartesian product of a list with itself) I want to delete (a,b) if (b,a) is in the list.Only one of (a,b) or (b,a) must be retained. So a list
[(1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)]
must reduce to
[(1,2),(1,3),(2,3)]
(Although deleting (1,2) and retaining (2,1) is fine)
I tried doing this but I am not sure about deleting from a list while iterating over it. This doesn't work. (Gives me [(1, 2), (2, 1), (2, 3), (3, 1), (3, 3)])
[pairs.remove((a,b)) for (a,b) in pairs if ((b,a) in pairs)]

Why delete the incorrect ones from the list?
Use itertools.combinations to generate the correct ones instead.
>>> import itertools
>>> list(itertools.combinations((1, 2, 3), 2))
[(1, 2), (1, 3), (2, 3)]

>>> [el for el in pairs if el[0] < el[1]]
[(1,2),(1,3),(2,3)]

pairs = [(1,1),(1,2),(1,3),(2,1),(2,2),(2,3),(3,1),(3,2),(3,3)]
new_pairs = []
for a, b in pairs:
if (a, b) in new_pairs or (b, a) in new_pairs:
pass
else:
new_pairs += [(a,b)]
new_pairs = [(1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 3)]

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

sorting tuples in python with a custom key - python

You could also write your code using lambda def sort(tuples): return sorted (tuples,key=lambda last : last[-1]) so sort([(1, 3), (3, 2), (2, 1)]) would yield [(2, 1), (3, 2), (1, 3)]

Your ordering specification is wrong because it is not transitive. Transitivity means that if a < b and b < c, then a < c. However, in your case: (1,2) < (2,3) (2,3) < (3,1) (3,1) < (1,2)

Try lt.sort(tcmp, reverse=True). (While this may produce the right answer, there may be other problems with your comparison method)

Related

Finding mismatches in tuples and merging them in Python

Algorithm to generate subsets satisfying binary relation

Pairwise circular Python 'for' loop

Generating all possible combinations of a list, "itertools.combinations" misses some results

Delete (a,b) from a python list of tuples if (b,a) exists

Categories

Resources