a = (('we', 23), ('b', 2))
b = (('we', 3), ('e', 3), ('b', 4))
#wanted_result = (('we', 3), ('b', 4), ('we', 23), ('b', 2))
How can I get a tuple containing only the pairs whose string appears in both a and b,
like the wanted_result I have written in the comment under the code?
I would prefer a list comprehension with a filter, by the way. Is that possible?
You can use set intersection:
keys = dict(a).keys() & dict(b)
tuple(t for t in a + b if t[0] in keys)
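The snippet above returns the pairs from a first. As a minimal sketch, assuming the order in wanted_result matters (pairs from b first, then pairs from a), the same idea can be written by iterating over b + a instead:

```python
a = (('we', 23), ('b', 2))
b = (('we', 3), ('e', 3), ('b', 4))

# Strings that appear as the first element of a pair in both a and b.
keys = dict(a).keys() & dict(b).keys()

# Walk b first, then a, so the output order matches wanted_result.
result = tuple(t for t in b + a if t[0] in keys)
print(result)  # (('we', 3), ('b', 4), ('we', 23), ('b', 2))
```

Building a dict from the pairs is only used here to collect the keys; the filter itself still runs over the original tuples, so duplicate strings within a or b are kept.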
You can make a set of the intersection between the first part of the tuples in both lists. Then use a list comprehension to extract the tuples that match this common set:
a = (('we', 23), ('b', 2))
b = (('we', 3), ('e', 3), ('b', 4))
common = set(next(zip(*a))) & set(next(zip(*b)))
result = [t for t in a+b if t[0] in common]
[('we', 23), ('b', 2), ('we', 3), ('b', 4)]
You can also do something similar using the Counter class from collections, by filtering tuples on string counts greater than 1 (this assumes no string is repeated within a or within b):
from collections import Counter
common = Counter(next(zip(*a,*b)))
result = [(s,n) for (s,n) in a+b if common[s]>1]
If you want a single list comprehension, given that your tuples have exactly two values, you can pair each one with a dictionary formed from the other and use the dictionary as a filter mechanism:
result = [t for d,tl in [(dict(b),a),(dict(a),b)] for t in tl if t[0] in d]
Adding two list comprehensions (i.e. concatenating lists):
print([bi for bi in b if any(bi[0]==i[0] for i in a)] +
[ai for ai in a if any(ai[0]==i[0] for i in b)])
# Output: [('we', 3), ('b', 4), ('we', 23), ('b', 2)]
Explanation
[bi for bi in b if any(bi[0]==i[0] for i in a)]
# Take tuples from b whose first element equals one of the
# first elements of a
[ai for ai in a if any(ai[0]==i[0] for i in b)]
# Similarly, take tuples from a whose first element equals one of the
# first elements of b
Another variation with sets:
filtered_keys = set(k for k, v in a) & set(k for k, v in b)
res = tuple((k, v) for k, v in [*a, *b] if k in filtered_keys)
# res == (('we', 23), ('b', 2), ('we', 3), ('b', 4))
I'm very new to Python and to coding in general, so this question will probably sound dumb.
I want to append two-element tuples to listay: if the first element of a tuple in l2 matches the first element of any tuple in listax, it should be appended to listay as a tuple together with its second element.
If it worked, my output (print(listay)) would be: [('a',4), ('b',2), ('c',1)]. Instead, the output is an empty list. What am I doing wrong?
Also, I am sorry if I am not offering all the information necessary. This is my first question ever about coding in a forum.
import operator
listax= []
listay= []
l1= [('a',3), ('b',3), ('c',3), ('d',2)]
l2= [('a',4),('b',2), ('c',1), ('d',2)]
sl1= sorted(l1, key= lambda t: t[1])
sl2= sorted(l2, key= lambda t: t[1])
tup1l1= sl1[len(sl1)-1]
k1l1= tup1l1[0]
v1l1= tup1l1[1]
tup2l1= sl1[len(sl1)-2]
k2l1= tup2l1[0]
v2l1= tup2l1[1]
tup3l1= sl1[len(sl1)-3]
k3l1= tup3l1[0]
v3l1= tup3l1[1]
tup1l2= sl2[len(sl2)-1]
k1l2= tup1l2[0]
v1l2= tup1l2[1]
tup2l2= sl2[len(sl2)-2]
k2l2= tup2l2[0]
v2l2= tup2l2[1]
tup3l2= sl2[len(sl2)-3]
k3l2= tup3l2[0]
v3l2= tup3l2[1]
listax.append((k2l1, v2l1))
if v2l1== v1l1:
    listax.append((k1l1, v1l1))
if v2l1== v3l1:
    listax.append((k3l1, v3l1))
for i,n in l2:
    if i in listax:
        listay.append((i,n))
print(listay)
I'll play the debugger role here, because I'm not sure what you are trying to achieve, but you could do it yourself: try out the breakpoint() built-in function and the Python debugger commands. It helps immensely ;)
Side note - I'm not sure why you import operator, but I assume it's not related to question.
You sort both lists by the second element, ascending. Python's sort is stable, so you get:
sl1 = [('d', 2), ('a', 3), ('b', 3), ('c', 3)]
sl2 = [('c', 1), ('b', 2), ('d', 2), ('a', 4)]
k1l1 = 'c'
v1l1 = 3
k2l1 = 'b'
v2l1 = 3
k3l1 = 'a'
v3l1 = 3
k1l2 = 'a'
v1l2 = 4
k2l2 = 'd'
v2l2 = 2
k3l2 = 'b'
v3l2 = 2
after append
listax = [('b', 3)]
v2l1 == v1l1 is True 3 == 3, so
listax = [('b', 3), ('c', 3)]
v2l1 == v3l1 is True 3 == 3, so
listax = [('b', 3), ('c', 3), ('a', 3)]
I think it gets tricky here:
for i,n in l2:
with
l2 = [('a', 4), ('b', 2), ('c', 1), ('d', 2)]
we get
i = 'a'
n = 4
maybe you wanted enumerate(l2)?
'a' in listax ([('b', 3), ('c', 3), ('a', 3)]) is False
listax doesn't contain an element equal to 'a' - it contains an element, which contains the element 'a'. Maybe that's the mistake?
i = 'b'
n = 2
just like before
nothing interesting happens later ;)
Hope this helps :D
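If the intent was to match on the first element of each tuple, a minimal sketch of one possible fix is to collect those first elements into a set and test membership against it. The name wanted_keys is made up for this illustration, and listax is hard-coded to the value traced above:

```python
# listax as it stands after the three appends traced above.
listax = [('b', 3), ('c', 3), ('a', 3)]
l2 = [('a', 4), ('b', 2), ('c', 1), ('d', 2)]

# Hypothetical helper name: the first elements of the tuples in listax.
wanted_keys = {key for key, _ in listax}

# Keep the tuples of l2 whose first element is one of those keys.
listay = [(i, n) for i, n in l2 if i in wanted_keys]
print(listay)  # [('a', 4), ('b', 2), ('c', 1)]
```

The original test, i in listax, compares the string 'a' with whole tuples such as ('a', 3), which is never equal, so nothing was appended.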
Given this list:
[(1, 's'), (2, 'e'), (2, 's'), (3, 'e')]
This is a representation of potentially overlapping intervals, e.g. 1 --> 2 and 2 --> 3, I've brought it into this representation for easier processing (see this answer for context)
I'd like to remove the pair (2, 'e') -- (2, 's') because the end (e) of the one interval is at the same number (2) as start (s) of the next interval. So the result should be
[(1, 's'), (3, 'e')]
And would represent 1 --> 3.
Edit: It's also possible that the intervals are overlapping, e.g. 1-->4 and 2-->3. That would be represented in this list of tuples (Note that the list is already sorted): [(1, 's'), (2, 's'), (3, 'e'), (4, 'e')]. In this case nothing needs to be done as no two tuples share the same number.
I've come up with this reduce:
import functools
functools.reduce(lambda l,i: l[:-1] if i[0] == l[-1][0] and i[1] != l[-1][1] else l + [i], a[1:], [a[0]])
Are there nicer ways to achieve that?
You can use itertools.groupby for a slightly longer (two-line) but more readable solution:
import itertools
def get_combinations(s):
    new_data = [(a, list(b)) for a, b in itertools.groupby(s, key=lambda x:x[0])]
    return [b[-1] for i, [a, b] in enumerate(new_data) if len(b) == 1 or len(b) > 1 and i == len(new_data) - 1]
print(get_combinations([(1, 's'), (2, 'e'), (2, 's'), (2, 'e')]))
print(get_combinations([(1, 's'), (2, 'e'), (2, 's'), (3, 'e')]))
Output:
[(1, 's'), (2, 'e')]
[(1, 's'), (3, 'e')]
I've been toying with functional languages a lot lately, so this may read less Pythonic than some, but I would use a (modified) itertools's pairwise recipe to iterate through by pairs
import itertools

def pairwise(iterable):
    a, b = itertools.tee(iterable)
    next(b, None) # advance the second iterator
    return itertools.zip_longest(a, b, fillvalue=(None, None))
then filter by which pairs don't match each other:
def my_filter(a, b):
    a_idx, a_type = a
    b_idx, b_type = b
    if a_idx == b_idx and a_type == "e" and b_type == "s":
        return False
    return True
Then do the filtering by hand, because a naive filter() would let the "start" value survive: it is paired with the element after it, and that pair passes the test.
def filter_them(some_list):
    pairs = pairwise(some_list)
    acc = []
    while True:
        try:
            a, b = next(pairs)
            if my_filter(a, b):
                acc.append(a)
            else:
                next(pairs) # skip the next pair
        except StopIteration:
            break
    return acc
I was tinkering with a "double continue" approach and came up with this generator solution:
def remove_adjacent(l):
    iterator = enumerate(l[:-1])
    for i, el in iterator:
        if el[0] == l[i+1][0] and el[1] != l[i+1][1]:
            next(iterator)
            continue
        yield el
    yield l[-1]
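For comparison, the rule used by the reduce in the question (same number, different tag) can also be written as a plain loop over a stack. This is only a sketch, not one of the posted answers, and the function name merge_touching is made up for the illustration:

```python
def merge_touching(points):
    """Drop adjacent tuples that share a number but differ in tag."""
    stack = []
    for point in points:
        # Same test as the reduce in the question: equal number, different tag.
        if stack and stack[-1][0] == point[0] and stack[-1][1] != point[1]:
            stack.pop()          # cancel the previous tuple against this one
        else:
            stack.append(point)  # keep the tuple
    return stack


print(merge_touching([(1, 's'), (2, 'e'), (2, 's'), (3, 'e')]))
# [(1, 's'), (3, 'e')]
print(merge_touching([(1, 's'), (2, 's'), (3, 'e'), (4, 'e')]))
# [(1, 's'), (2, 's'), (3, 'e'), (4, 'e')]
```

Checking that the stack is non-empty before looking at its last item means an empty input, or a cancelled pair at the very start of the list, does not raise an IndexError.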
This question is an extension of What's the most Pythonic way to identify consecutive duplicates in a list?.
Suppose you have a list of tuples:
my_list = [(1,4), (2,3), (3,2), (4,4), (5,2)]
and you sort it by each tuple's last value:
my_list = sorted(my_list, key=lambda tuple: tuple[1])
# [(3,2), (5,2), (2,3), (1,4), (4,4)]
then we have two consecutive runs (looking at the last value in each tuple), namely [(3,2), (5,2)] and [(1,4), (4,4)].
What is the pythonic way to reverse each run (not the tuples within), e.g.
reverse_runs(my_list)
# [(5,2), (3,2), (2,3), (4,4), (1,4)]
Is this possible to do within a generator?
UPDATE
It has come to my attention that perhaps the example list was not clear. So instead consider:
my_list = [(1,"A"), (2,"B"), (5,"C"), (4,"C"), (3,"C"), (6,"A"),(7,"A"), (8,"D")]
Where the ideal output from reverse_runs would be
[(7,"A"), (6,"A"), (1,"A"), (2,"B"), (3,"C"), (4,"C"), (5,"C"), (8,"D")]
To be clear on terminology, I am using "run" as it is used in descriptions of TimSort, the algorithm behind Python's sort function, which is what gives the sort its stability.
Thus, if you sort a collection of multi-element items, only the specified element is used as the sort key, and two items that are equal on that key keep their original relative order.
Thus the following function:
sorted(my_list,key=lambda t: t[1])
yields:
[(1, 'A'), (6, 'A'), (7, 'A'), (2, 'B'), (5, 'C'), (4, 'C'), (3, 'C'), (8, 'D')]
and the run on "C" (i.e. (5, 'C'), (4, 'C'), (3, 'C') ) is not disturbed.
So in conclusion the desired output from the yet to be defined function reverse_runs:
1.) sorts the tuples by their last element
2.) maintaining the order of the first element, reverses runs on the last element
Ideally I would like this in a generator function, but that does not (to me at the moment) seem possible.
Thus one could adopt the following strategy:
1.) Sort the tuples by the last element via sorted(my_list, key=lambda tuple: tuple[1])
2.) Identify the indexes i at which the last element of the succeeding tuple (i+1) differs from the last element of tuple i, i.e. identify the runs
3.) Make an empty list
4.) Using the slice operator, obtain, reverse, and then append each sublist to the empty list
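The four steps above can be sketched roughly as follows. This is one possible reading of the strategy, not a posted solution; the variable names ordered, ends and start are made up for the illustration:

```python
def reverse_runs(pairs):
    # Step 1: stable sort by the last element of each tuple.
    ordered = sorted(pairs, key=lambda t: t[1])

    # Step 2: a run ends wherever the next tuple has a different last element.
    ends = [i + 1 for i in range(len(ordered) - 1)
            if ordered[i][1] != ordered[i + 1][1]]
    ends.append(len(ordered))  # the final run ends at the end of the list

    # Steps 3 and 4: slice out each run, reverse it, and collect it.
    result = []
    start = 0
    for end in ends:
        result.extend(ordered[start:end][::-1])
        start = end
    return result


print(reverse_runs([(1, 4), (2, 3), (3, 2), (4, 4), (5, 2)]))
# [(5, 2), (3, 2), (2, 3), (4, 4), (1, 4)]
```

Because the whole input has to be sorted before the first run is known, a generator version can only be lazy about emitting the runs, not about reading the input.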
I think this will work.
my_list = [(1,4), (2,3), (3,2), (4,4), (5,2)]
my_list = sorted(my_list, key=lambda tuple: (tuple[1], -tuple[0]))
print(my_list)
Output
[(5, 2), (3, 2), (2, 3), (4, 4), (1, 4)]
Misunderstood question. Less pretty but this should work for what you really want:
from itertools import groupby
from operator import itemgetter
def reverse_runs(l):
    sorted_list = sorted(l, key=itemgetter(1))
    reversed_groups = (reversed(list(g)) for _, g in groupby(sorted_list, key=itemgetter(1)))
    reversed_runs = [e for sublist in reversed_groups for e in sublist]
    return reversed_runs

if __name__ == '__main__':
    print(reverse_runs([(1, 4), (2, 3), (3, 2), (4, 4), (5, 2)]))
    print(reverse_runs([(1, "A"), (2, "B"), (5, "C"), (4, "C"), (3, "C"), (6, "A"), (7, "A"), (8, "D")]))
Output
[(5, 2), (3, 2), (2, 3), (4, 4), (1, 4)]
[(7, 'A'), (6, 'A'), (1, 'A'), (2, 'B'), (3, 'C'), (4, 'C'), (5, 'C'), (8, 'D')]
Generator version:
from itertools import groupby
from operator import itemgetter
def reverse_runs(l):
    sorted_list = sorted(l, key=itemgetter(1))
    reversed_groups = (reversed(list(g)) for _, g in groupby(sorted_list, key=itemgetter(1)))
    for group in reversed_groups:
        yield from group

if __name__ == '__main__':
    print(list(reverse_runs([(1, 4), (2, 3), (3, 2), (4, 4), (5, 2)])))
    print(list(reverse_runs([(1, "A"), (2, "B"), (5, "C"), (4, "C"), (3, "C"), (6, "A"), (7, "A"), (8, "D")])))
The most general case requires 2 sorts. The first is a reversed sort on the secondary criterion (here the first element of each tuple). The second is a forward sort on the primary criterion (the last element):
from operator import itemgetter
pass1 = sorted(my_list, key=itemgetter(0), reverse=True)
result = sorted(pass1, key=itemgetter(1))
We can sort in multiple passes like this because python's sort algorithm is guaranteed to be stable.
However, in real life it's often possible to simply construct a more clever key function which allows the sorting to happen in one pass. This usually involves "negating" one of the values and relying on the fact that tuples order themselves lexicographically:
result = sorted(my_list, key=lambda t: (t[1], -t[0]))
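As a quick check, here is a small sketch that runs both variants on the first example list from the question and compares them. The names two_pass and one_pass are made up for the illustration:

```python
from operator import itemgetter

my_list = [(1, 4), (2, 3), (3, 2), (4, 4), (5, 2)]

# Two stable passes: secondary criterion (reversed) first, primary second.
pass1 = sorted(my_list, key=itemgetter(0), reverse=True)
two_pass = sorted(pass1, key=itemgetter(1))

# One pass with a composite key; negation only works for numeric values.
one_pass = sorted(my_list, key=lambda t: (t[1], -t[0]))

print(two_pass)              # [(5, 2), (3, 2), (2, 3), (4, 4), (1, 4)]
print(two_pass == one_pass)  # True
```

The two-pass form is the more general of the two, since it does not need the first element to support negation.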
In response to your update, it looks like the following might be a suitable solution:
from operator import itemgetter
from itertools import chain, groupby
my_list = [(1,"A"), (2,"B"), (5,"C"), (4,"C"), (3,"C"), (6,"A"),(7,"A"), (8,"D")]
pass1 = sorted(my_list, key=itemgetter(1))
result = list(chain.from_iterable(reversed(list(g)) for k, g in groupby(pass1, key=itemgetter(1))))
print(result)
We can take apart the expression:
chain.from_iterable(reversed(list(g)) for k, g in groupby(pass1, key=itemgetter(1)))
to try to figure out what it's doing...
First, let's look at groupby(pass1, key=itemgetter(1)). groupby will yield 2-tuples. The first item (k) in the tuple is the "key", i.e. whatever was returned from itemgetter(1). The key isn't really important here after the grouping has taken place, so we don't use it. The second item (g, for "group") is an iterable that yields consecutive values that have the same "key". These are exactly the items that you requested; however, they're in the order that they were in after sorting. You requested them in reverse order. In order to reverse an arbitrary iterable, we can construct a list from it and then reverse the list, i.e. reversed(list(g)). Finally, we need to paste those chunks back together again, which is where chain.from_iterable comes in.
If we want to get more clever, we might do better from an algorithmic standpoint (assuming that the "key" for the bins is hashable). The trick is to bin the objects in a dictionary and then sort the bins. This means that we're potentially sorting a much shorter list than the original:
from collections import defaultdict, deque
from itertools import chain
my_list = [(1,"A"), (2,"B"), (5,"C"), (4,"C"), (3,"C"), (6,"A"),(7,"A"), (8,"D")]
bins = defaultdict(deque)
for t in my_list:
    bins[t[1]].appendleft(t)
print(list(chain.from_iterable(bins[key] for key in sorted(bins))))
Note that whether this does better than the first approach is very dependent on the initial data. Since TimSort is such a beautiful algorithm, if the data starts already grouped into bins, then this algorithm will likely not beat it (though, I'll leave it as an exercise for you to try...). However, if the data is well scattered (causing TimSort to behave more like MergeSort), then binning first will possibly make for a slight win.
Let's say I have the following two lists of tuples
myList = [(1, 7), (3, 3), (5, 9)]
otherList = [(2, 4), (3, 5), (5, 2), (7, 8)]
returns => [(1, 7), (2, 4), (3, 8), (5, 11), (7, 8)]
I would like to design a merge operation that merges these two lists by checking for any intersections on the first element of the tuple, if there are intersections, add the second elements of each tuple in question (merge the two). After the operation I would like to sort based upon the first element.
I am also posting this because I think it's a pretty common problem that has an obvious solution, but I feel that there could be very pythonic solutions to this question ;)
Use a dictionary for the result:
result = {}
for k, v in my_list + other_list:
    result[k] = result.get(k, 0) + v
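If you want a list of tuples, you can get it via list(result.items()). On Python 3.7 and later the items come out in insertion order rather than sorted by key (on older versions the order is arbitrary), but of course you can sort them with sorted(result.items()) if desired.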
(Note that I renamed your lists to conform with Python's style conventions.)
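Put together with the data from the question, the dictionary approach and the final sort might look like this. It is a sketch only; the name merged is made up for the illustration:

```python
my_list = [(1, 7), (3, 3), (5, 9)]
other_list = [(2, 4), (3, 5), (5, 2), (7, 8)]

# Accumulate the second elements per first element.
result = {}
for k, v in my_list + other_list:
    result[k] = result.get(k, 0) + v

# Tuples sort by their first element, so no key function is needed.
merged = sorted(result.items())
print(merged)  # [(1, 7), (2, 4), (3, 8), (5, 11), (7, 8)]
```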
Use defaultdict:
from collections import defaultdict
results_dict = defaultdict(int)
results_dict.update(my_list)
for a, b in other_list:
    results_dict[a] += b
results = sorted(results_dict.items())
Note: When sorting sequences, sorted sorts by the first item in the sequence. If the first elements are the same, then it compares the second element. You can give sorted a function to sort by, using the key keyword argument:
results = sorted(results_dict.items(), key=lambda x: x[1]) #sort by the 2nd item
or
results = sorted(results_dict.items(), key=lambda x: abs(x[0])) #sort by absolute value
A method using itertools:
>>> myList = [(1, 7), (3, 3), (5, 9)]
>>> otherList = [(2, 4), (3, 5), (5, 2), (7, 8)]
>>> import itertools
>>> merged = []
>>> for k, g in itertools.groupby(sorted(myList + otherList), lambda e: e[0]):
...     merged.append((k, sum(e[1] for e in g)))
...
>>> merged
[(1, 7), (2, 4), (3, 8), (5, 11), (7, 8)]
This first concatenates the two lists together and sorts it. itertools.groupby returns the elements of the merged list, grouped by the first element of the tuple, so it just sums them up and places it into the merged list.
>>> [(k, sum(v for x,v in myList + otherList if k == x)) for k in dict(myList + otherList).keys()]
[(1, 7), (2, 4), (3, 8), (5, 11), (7, 8)]
>>>
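Tested on both Python 2.7 and 3.2. On Python 3.7 and later, dicts preserve insertion order, so iterate over sorted(dict(myList + otherList)) to get the sorted output shown above.
dict(myList + otherList).keys() returns an iterable of the unique keys of the joined lists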
sum(...) takes 'k' to loop again through the joined list and add up tuple items 'v' where k == x
... but the extra looping adds processing overhead. Using an explicit dictionary as proposed by Sven Marnach avoids it.
I have a list of tuples like this:
[(1, 0), (2, 1), (3, 1), (6, 2), (3, 2), (2, 3)]
I want to keep the tuples which have the max first value of every tuple with the same second value. For example (2, 1) and (3, 1) share the same second (key) value, so I just want to keep the one with the max first value -> (3, 1). In the end I would get this:
[(1, 0), (3, 1), (6, 2), (2, 3)]
I don't mind at all if it is not a one-liner but I was wondering about an efficient way to go about this...
from operator import itemgetter
from itertools import groupby
[max(items) for key, items in groupby(L,key = itemgetter(1))]
It's assuming that your initial list of tuples is sorted by key values.
groupby creates an iterator that yields objects like (0, <itertools._grouper object at 0x01321330>), where the first value is the key value, the second one is another iterator which gives all the tuples with that key.
max(items) just selects the tuple with the maximum value, and since all the second values of the group are the same (and is also the key), it gives the tuple with the maximum first value.
A list comprehension is used to form an output list of tuples based on the output of these functions.
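When the input is not already grouped by key, a minimal sketch is to sort on the key first. The list unsorted below is a made-up reordering of the question's data, used only to show the difference:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input: same tuples as in the question, keys not adjacent.
unsorted = [(3, 2), (2, 1), (1, 0), (6, 2), (3, 1), (2, 3)]

# groupby only merges adjacent items, so every tuple is its own group here.
naive = [max(items) for key, items in groupby(unsorted, key=itemgetter(1))]

# Sorting by the key first brings equal keys next to each other.
by_key = sorted(unsorted, key=itemgetter(1))
result = [max(items) for key, items in groupby(by_key, key=itemgetter(1))]

print(len(naive))  # 6
print(result)      # [(1, 0), (3, 1), (6, 2), (2, 3)]
```

The sort makes this O(n log n); the dictionary-based answers avoid it when only the per-key maximum is needed.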
Probably using a dict:
rd = {}
for V,K in my_tuples:
    if V > rd.setdefault(K,V):
        rd[K] = V
result = [ (V,K) for K,V in rd.items() ]
import itertools
import operator
l = [(1, 0), (2, 1), (3, 1), (6, 2), (3, 2), (2, 3)]
result = list(max(v, key=operator.itemgetter(0)) for k, v in itertools.groupby(l, operator.itemgetter(1)))
You could use a dictionary keyed on the second element of the tuple:
l = [(1, 0), (2, 1), (3, 1), (6, 2), (3, 2), (2, 3)]
d = {}
for v, k in l:
    # keep the largest first value seen so far for each key
    # (comparing None with an int raises TypeError on Python 3)
    if k not in d or d[k] < v:
        d[k] = v
l2 = [(v, k) for (k, v) in d.items()]
print(l2)