Given this list:
[(1, 's'), (2, 'e'), (2, 's'), (3, 'e')]
This is a representation of potentially overlapping intervals, e.g. 1 --> 2 and 2 --> 3. I've brought them into this representation for easier processing (see this answer for context).
I'd like to remove the pair (2, 'e') -- (2, 's') because the end (e) of one interval is at the same number (2) as the start (s) of the next interval. So the result should be
[(1, 's'), (3, 'e')]
And would represent 1 --> 3.
Edit: It's also possible that the intervals are overlapping, e.g. 1-->4 and 2-->3. That would be represented in this list of tuples (Note that the list is already sorted): [(1, 's'), (2, 's'), (3, 'e'), (4, 'e')]. In this case nothing needs to be done as no two tuples share the same number.
I've come up with this reduce:
import functools
functools.reduce(lambda l,i: l[:-1] if i[0] == l[-1][0] and i[1] != l[-1][1] else l + [i], a[1:], [a[0]])
Are there nicer ways to achieve that?
You can use itertools.groupby for a slightly longer (two lines), although more readable solution:
import itertools
def get_combinations(s):
    new_data = [(a, list(b)) for a, b in itertools.groupby(s, key=lambda x: x[0])]
    return [b[-1] for i, [a, b] in enumerate(new_data)
            if len(b) == 1 or (len(b) > 1 and i == len(new_data) - 1)]
print(get_combinations([(1, 's'), (2, 'e'), (2, 's'), (2, 'e')]))
print(get_combinations([(1, 's'), (2, 'e'), (2, 's'), (3, 'e')]))
Output:
[(1, 's'), (2, 'e')]
[(1, 's'), (3, 'e')]
I've been toying with functional languages a lot lately, so this may read less Pythonic than some, but I would use a (modified) version of the itertools pairwise recipe to iterate through the list by pairs:
import itertools

def pairwise(iterable):
    a, b = itertools.tee(iterable)
    next(b, None)  # advance the second iterator
    return itertools.zip_longest(a, b, fillvalue=(None, None))
then filter by which pairs don't match each other:
def my_filter(a, b):
    a_idx, a_type = a
    b_idx, b_type = b
    if a_idx == b_idx and a_type == "e" and b_type == "s":
        return False
    return True
Then filter them yourself (a naive filter would let the "start" value survive, since it also pairs with the element ahead of it):
def filter_them(some_list):
    pairs = pairwise(some_list)
    acc = []
    while True:
        try:
            a, b = next(pairs)
            if my_filter(a, b):
                acc.append(a)
            else:
                next(pairs)  # skip the next pair
        except StopIteration:
            break
    return acc
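A quick check of the whole pipeline on the example from the question (assuming the three definitions above):
print(filter_them([(1, 's'), (2, 'e'), (2, 's'), (3, 'e')]))
# [(1, 's'), (3, 'e')]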
I was tinkering with a "double continue" approach and came up with this generator solution:
def remove_adjacent(l):
    iterator = enumerate(l[:-1])
    for i, el in iterator:
        if el[0] == l[i+1][0] and el[1] != l[i+1][1]:
            next(iterator)  # skip the matching partner as well
            continue
        yield el
    yield l[-1]
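Used on the example from the question, this gives:
print(list(remove_adjacent([(1, 's'), (2, 'e'), (2, 's'), (3, 'e')])))
# [(1, 's'), (3, 'e')]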
I'm very new to Python and coding in general, so this question will probably sound dumb.
I want to append two-element tuples to listay: if the first element of a tuple in l2 matches the first element of any tuple in listax, it should be appended to listay as a tuple together with its second element.
If it worked, my output (print(listay)) would be: [('a', 4), ('b', 2), ('c', 1)]. Instead, the output is an empty list. What am I doing wrong?
Also, I am sorry if I am not offering all the information necessary. This is my first question ever about coding in a forum.
import operator
listax= []
listay= []
l1= [('a',3), ('b',3), ('c',3), ('d',2)]
l2= [('a',4),('b',2), ('c',1), ('d',2)]
sl1= sorted(l1, key= lambda t: t[1])
sl2= sorted(l2, key= lambda t: t[1])
tup1l1= sl1[len(sl1)-1]
k1l1= tup1l1[0]
v1l1= tup1l1[1]
tup2l1= sl1[len(sl1)-2]
k2l1= tup2l1[0]
v2l1= tup2l1[1]
tup3l1= sl1[len(sl1)-3]
k3l1= tup3l1[0]
v3l1= tup3l1[1]
tup1l2= sl2[len(sl2)-1]
k1l2= tup1l2[0]
v1l2= tup1l2[1]
tup2l2= sl2[len(sl2)-2]
k2l2= tup2l2[0]
v2l2= tup2l2[1]
tup3l2= sl2[len(sl2)-3]
k3l2= tup3l2[0]
v3l2= tup3l2[1]
listax.append((k2l1, v2l1))
if v2l1 == v1l1:
    listax.append((k1l1, v1l1))
if v2l1 == v3l1:
    listax.append((k3l1, v3l1))
for i, n in l2:
    if i in listax:
        listay.append((i, n))
print(listay)
I'll play the debugger role here, because I'm not sure what you are trying to achieve, but you could do this yourself - try the breakpoint() built-in function and the Python debugger commands - it helps immensely ;)
Side note - I'm not sure why you import operator, but I assume it's not related to the question.
You sort the lists by the second element, ascending; Python's sort is stable, so you get:
sl1 = [('d', 2), ('a', 3), ('b', 3), ('c', 3)]
sl2 = [('c', 1), ('b', 2), ('d', 2), ('a', 4)]
k1l1 = 'c'
v1l1 = 3
k2l1 = 'b'
v2l1 = 3
k3l1 = 'a'
v3l1 = 3
k1l2 = 'a'
v1l2 = 4
k2l2 = 'd'
v2l2 = 2
k3l2 = 'b'
v3l2 = 2
after append
listax = [('b', 3)]
v2l1 == v1l1 is True (3 == 3), so
listax = [('b', 3), ('c', 3)]
v2l1 == v3l1 is True (3 == 3), so
listax = [('b', 3), ('c', 3), ('a', 3)]
I think it gets tricky here:
for i,n in l2:
with
l2 = [('a', 4), ('b', 2), ('c', 1), ('d', 2)]
we get
i = 'a'
n = 4
maybe you wanted enumerate(l2)?
'a' in listax ([('b', 3), ('c', 3), ('a', 3)]) is False
listax doesn't contain an element equal to 'a' - it contains an element, which contains the element 'a'. Maybe that's the mistake?
i = 'b'
n = 2
just like before
nothing interesting happens later ;)
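One possible fix, as a minimal sketch (keeping your variable names): build a set of the first elements of listax and test membership against that, rather than against the tuples themselves:
first_elements = {k for k, v in listax}   # {'b', 'c', 'a'}
listay = [(i, n) for i, n in l2 if i in first_elements]
print(listay)
# [('a', 4), ('b', 2), ('c', 1)]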
Hope this helps :D
a = (('we', 23), ('b', 2))
b = (('we', 3), ('e', 3), ('b', 4))
#wanted_result = (('we', 3), ('b', 4), ('we', 23), ('b', 2))
How can I get the tuples whose first string appears in both a and b,
like the wanted_result I have written in the code above?
I would prefer a list comprehension with a filter, by the way... would that be possible?
You can use set intersection:
keys = dict(a).keys() & dict(b)
tuple(t for t in a + b if t[0] in keys)
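For reference, a quick check with the a and b from the question:
a = (('we', 23), ('b', 2))
b = (('we', 3), ('e', 3), ('b', 4))
keys = dict(a).keys() & dict(b)            # {'we', 'b'}
print(tuple(t for t in a + b if t[0] in keys))
# (('we', 23), ('b', 2), ('we', 3), ('b', 4))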
You can make a set of the intersection between the first part of the tuples in both lists. Then use a list comprehension to extract the tuples that match this common set:
a = (('we', 23), ('b', 2))
b = (('we', 3), ('e', 3), ('b', 4))
common = set(next(zip(*a))) & set(next(zip(*b)))
result = [t for t in a+b if t[0] in common]
[('we', 23), ('b', 2), ('we', 3), ('b', 4)]
You can also do something similar using the Counter class from collections (by filtering tuples whose string count is greater than 1):
from collections import Counter
common = Counter(next(zip(*a,*b)))
result = [(s,n) for (s,n) in a+b if common[s]>1]
If you want a single list comprehension, given that your tuples have exactly two values, you can pair each one with a dictionary formed from the other and use the dictionary as a filter mechanism:
result = [t for d,tl in [(dict(b),a),(dict(a),b)] for t in tl if t[0] in d]
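For the same a and b, this also gives [('we', 23), ('b', 2), ('we', 3), ('b', 4)].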
Adding two list comprehensions (i.e. concatenating lists):
print([bi for bi in b if any(bi[0]==i[0] for i in a)] +
[ai for ai in a if any(ai[0]==i[0] for i in b)])
# Output: [('we', 3), ('b', 4), ('we', 23), ('b', 2)]
Explanation
[bi for bi in b if any(bi[0]==i[0] for i in a)] # ->>
# Take tuples from b whose first element equals one of the
# first elements of a
[ai for ai in a if ai[0] in [i[0] for i in b]]
# Similarly take tuples from a whose first element equals one of the
# first elements of b
Another variation with sets:
filtered_keys = set(k for k, v in a) & set(k for k, v in b)
res = tuple((k, v) for k, v in [*a, *b] if k in filtered_keys)
>>> res
(('we', 23), ('b', 2), ('we', 3), ('b', 4))
I was trying to use functools.partial with itertools.count, by currying zip with itertools.count():
g = functools.partial(zip, itertools.count())
When calling g with inputs like "abc", "ABC", I noticed that itertools.count() mysteriously "jumps".
I thought I would get the same result as directly using zip with itertools.count(), like:
>>> x=itertools.count();
>>> list(zip("abc",x))
[('a', 0), ('b', 1), ('c', 2)]
>>> list(zip("ABC",x))
[('A', 3), ('B', 4), ('C', 5)]
But instead, I get the following -- notice the starting index at the second call of g is 4 instead of 3:
>>> g = functools.partial(zip, itertools.count())
>>> list(g("abc"))
[(0, 'a'), (1, 'b'), (2, 'c')]
>>> list(g("ABC"))
[(4, 'A'), (5, 'B'), (6, 'C')]
Note that you'd get the same result if your original code used arguments in the same order as your altered code:
>>> x = itertools.count()
>>> list(zip(x, "abc"))
[(0, 'a'), (1, 'b'), (2, 'c')]
>>> list(zip(x, "ABC"))
[(4, 'A'), (5, 'B'), (6, 'C')]
zip() tries its first argument first, then its second, then its third ... and stops when one of them is exhausted.
In the spelling just above, after "abc" is exhausted, it goes back to the first argument and gets 3 from x. But its second argument is already exhausted, so zip() stops, and the 3 is silently lost.
Then it moves on to the second zip(), and starts by getting 4 from x.
partial() really has nothing to do with it.
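If the goal is a reusable helper that never loses a value, one option is to keep the finite iterable as the first argument. Since partial() can only fill leading positional arguments, a small lambda is one way to sketch it (an illustration, not the only approach):
import itertools

x = itertools.count()
g = lambda s: zip(s, x)   # finite string first, so zip stops before pulling an extra value from x

print(list(g("abc")))   # [('a', 0), ('b', 1), ('c', 2)]
print(list(g("ABC")))   # [('A', 3), ('B', 4), ('C', 5)]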
It'll be easy to see why if you encapsulate itertools.count() inside a function:
import functools
import itertools

def count():
    c = itertools.count()
    while True:
        v = next(c)
        print('yielding', v)
        yield v

g = functools.partial(zip, count())
list(g("abc"))
The output is
yielding 0
yielding 1
yielding 2
yielding 3
[(0, 'a'), (1, 'b'), (2, 'c')]
You'll see that zip evaluates the next value from count() (so an extra value, 3, is yielded) before it realises there is nothing left in the second iterable.
As an exercise, reverse the arguments and you'll see the evaluation is a little different.
I have a list of some elements, e.g. [1, 2, 3, 4] and a single object, e.g. 'a'. I want to produce a list of tuples with the elements of the list in the first position and the single object in the second position: [(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')].
I could do it with zip like this:
def zip_with_scalar(l, o):  # l - the list; o - the object
    return list(zip(l, [o] * len(l)))
However, this gives me a feeling of creating an unnecessary list of repeated elements.
Another possibility is
def zip_with_scalar(l, o):
    return [(i, o) for i in l]
which is very clean and pythonic indeed, but here I do the whole thing "manually". In Haskell I would do something like
zipWithScalar l o = zip l $ repeat o
Is there any built-in function or trick, either for the zipping with scalar or for something that would enable me to use ordinary zip, i.e. sort-of infinite list?
This is the closest to your Haskell solution:
import itertools

def zip_with_scalar(l, o):
    return zip(l, itertools.repeat(o))
You could also use generators, which avoid creating a list like comprehensions do:
def zip_with_scalar(l, o):
    return ((i, o) for i in l)
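Either way, a quick check (using the itertools.repeat version above):
>>> list(zip_with_scalar([1, 2, 3, 4], 'a'))
[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]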
You can use the built-in map function:
>>> elements = [1, 2, 3, 4]
>>> key = 'a'
>>> list(map(lambda e: (e, key), elements))
[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]
This is a perfect job for the itertools.cycle class.
from itertools import cycle
def zip_with_scalar(l, o):
    return zip(l, cycle(o))  # note: o must itself be iterable here, e.g. the string 'a'
Demo:
>>> from itertools import cycle
>>> l = [1, 2, 3, 4]
>>> list(zip(l, cycle('a')))
[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]
lst = [1,2,3,4]
tups = [(itm, 'a') for itm in lst]
tups
> [(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]
>>> l = [1, 2, 3, 4]
>>> list(zip(l, "a"*len(l)))
[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]
You could also use zip_longest with a fillvalue of o:
from itertools import zip_longest
def zip_with_scalar(l, o):  # l - the list; o - the object
    return zip_longest(l, [o], fillvalue=o)

print(list(zip_with_scalar([1, 2, 3, 4], "a")))
Just be aware that any mutable values used for o won't be copied whether using zip_longest or repeat.
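To illustrate that caveat, a small sketch with repeat (the same applies to the zip_longest version):
from itertools import repeat

o = []
pairs = list(zip([1, 2, 3], repeat(o)))
pairs[0][1].append('x')
print(pairs)
# [(1, ['x']), (2, ['x']), (3, ['x'])] -- every tuple holds the same list object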
The more-itertools library recently added a zip_broadcast() function that solves this problem well:
>>> from more_itertools import zip_broadcast
>>> list(zip_broadcast([1,2,3,4], 'a'))
[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]
This is a much more general solution than the other answers posted here:
Empty iterables are correctly handled.
There can be multiple iterable and/or scalar arguments.
The order of the scalar/iterable arguments doesn't need to be known.
If there are multiple iterable arguments, you can check that they are the same length with strict=True.
You can easily control whether or not strings should be treated as iterables (by default they are not).
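For instance, mixing several scalars with an iterable (a quick sketch; see the more-itertools documentation for the exact semantics):
>>> list(zip_broadcast('x', [1, 2, 3], 'y'))
[('x', 1, 'y'), ('x', 2, 'y'), ('x', 3, 'y')]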
Just define a class with an infinite iterator which is initialized with the single element you want to inject into the lists:
class zipIterator:
    def __init__(self, val):
        self.__val = val

    def __iter__(self):
        return self

    def __next__(self):
        return self.__val
and then create your new list from this class and the lists you have:
elements = [1, 2, 3, 4]
key = 'a'
res = [it for it in zip(elements, zipIterator(key))]
the result would be:
>>> res
[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]
I need to merge two iterators. I wrote this function:
def merge_no_repeat(iter1, iter2, key=None):
    """
    a = iter([(2, 'a'), (4, 'a'), (6, 'a')])
    b = iter([(1, 'b'), (2, 'b'), (3, 'b'), (4, 'b'), (5, 'b'), (6, 'b'), (7, 'b'), (8, 'b')])
    key = lambda item: item[0]
    merge_no_repeat(a, b, key) ->
        iter([(1, 'b'), (2, 'a'), (3, 'b'), (4, 'a'), (5, 'b'), (6, 'a'), (7, 'b'), (8, 'b')])

    :param iter1: sorted iterator
    :param iter2: sorted iterator
    :param key: lambda to get the sort key, default: lambda x: x
    :return: merged iterator
    """
    if key is None:
        key = lambda x: x
    element1 = next(iter1, None)
    element2 = next(iter2, None)
    while element1 is not None or element2 is not None:
        if element1 is None:
            yield element2
            element2 = next(iter2, None)
        elif element2 is None:
            yield element1
            element1 = next(iter1, None)
        elif key(element1) > key(element2):
            yield element2
            element2 = next(iter2, None)
        elif key(element1) == key(element2):
            yield element1
            element1 = next(iter1, None)
            element2 = next(iter2, None)
        elif key(element1) < key(element2):
            yield element1
            element1 = next(iter1, None)
This function works, but I think it's too complicated. Is it possible to make it simpler using the Python standard library?
The pytoolz library has an implementation of this. It doesn't look like it uses any non-standard-library functions, so if you really don't want to include an external library you could probably just copy the code.
If you're interested in speed, there's also a Cython implementation of pytoolz.
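For what it's worth, a sketch of how that might look with toolz, assuming merge_sorted plus unique are the relevant pieces (that's my assumption, not necessarily the exact function the answer refers to):
from operator import itemgetter
from toolz import merge_sorted, unique   # assumes toolz is installed

a = iter([(2, 'a'), (4, 'a'), (6, 'a')])
b = iter([(1, 'b'), (2, 'b'), (3, 'b'), (4, 'b'), (5, 'b'), (6, 'b'), (7, 'b'), (8, 'b')])
# merge_sorted interleaves the two sorted iterators; on equal keys the item
# from the earlier argument (here: a) comes first, and unique() then keeps
# only the first item seen for each key.
merged = unique(merge_sorted(a, b, key=itemgetter(0)), key=itemgetter(0))
print(list(merged))
# [(1, 'b'), (2, 'a'), (3, 'b'), (4, 'a'), (5, 'b'), (6, 'a'), (7, 'b'), (8, 'b')]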
One, this fails if either of the iterators yields None; you should probably catch StopIteration exceptions instead. Two, once one of the iterators has no more values, you can just yield all the remaining values of the other one.
I think this is easier to make if you use a small wrapper class around an iterator that makes the next value visible:
class NextValueWrapper(object):
    def __init__(self, iterator):
        self.iterator = iterator
        self.next_value = None
        self.finished = False
        self.get()

    def get(self):
        if self.finished:
            return  # Shouldn't happen, maybe raise an exception
        value = self.next_value
        try:
            self.next_value = next(self.iterator)
        except StopIteration:
            self.finished = True
        return value
Then the code becomes:
def merge(iter1, iter2, key=None):
    if key is None:
        key = lambda x: x
    wrap1 = NextValueWrapper(iter1)
    wrap2 = NextValueWrapper(iter2)
    while not (wrap1.finished and wrap2.finished):
        if (wrap2.finished or
                (not wrap1.finished and
                 key(wrap1.next_value) <= key(wrap2.next_value))):
            yield wrap1.get()
        else:
            yield wrap2.get()
This is untested. And it repeats. And it's Python 2, out of habit. Making it non-repeating is left as an exercise for the reader; I hadn't noticed that was a requirement too...
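For what it's worth, a quick check (nothing above is actually Python-2-only, so it runs under Python 3 as well) shows both that it works and that it repeats on equal keys:
from operator import itemgetter

a = iter([(2, 'a'), (4, 'a'), (6, 'a')])
b = iter([(1, 'b'), (2, 'b'), (3, 'b')])
print(list(merge(a, b, key=itemgetter(0))))
# [(1, 'b'), (2, 'a'), (2, 'b'), (3, 'b'), (4, 'a'), (6, 'a')]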
As I'm just working on a problem that needs this, here is the solution I came up with. I worked out my problem in Haskell first. I believe looking at an idiomatic Haskell solution is instructive.
merge k [] ys = ys
merge k xs [] = xs
merge k (x:xs) (y:ys) =
    if k y < k x then y : merge k (x:xs) ys
    else x : merge k xs (if k y == k x then ys else (y:ys))
The arguments are just lists, but because Haskell is lazy, it works even with infinite lists. : is the Haskell list constructor, so x:xs is equivalent to [x]+xs in Python.
A recursive solution like this is a bit of a non-starter in Python, but we can preserve its essence while transforming the recursion into iteration by trampolining, like this:
from functools import partial

Nil = object()

def merge(xs, ys, k=lambda x: x):
    def nxt(it): return next(it, Nil)
    def tail(it): return lambda: (next(it), tail(it))
    def both(x, y):
        return ((y, tail(ys)) if x is Nil else
                (x, tail(xs)) if y is Nil else
                ((y, partial(both, x, nxt(ys))) if k(y) < k(x) else
                 (x, partial(both, nxt(xs), nxt(ys) if k(x) == k(y) else y))))
    to_yield, extract = both(nxt(xs), nxt(ys))
    while to_yield is not Nil:
        yield to_yield
        to_yield, extract = extract()
A few observations about this:
We create a new unique out-of-band value Nil instead of None so that the sequences can contain None values without interfering with the logic.
We use the two argument form of next with a value to return if the iterator is empty to avoid raising a StopIteration exception too early.
We trampoline by having the extract function return the next value to yield and the next extract function to use. While both iterables are active, the extract function is both.
This is Python 2 because that's what I use, but it does mean there's no yield from to play out one iterator when the other is empty. Instead, we switch into tail mode, which continues trampolining but relies on the single argument form of next raising StopIteration to terminate the process.
Here's an example of it in action:
>>> from itertools import islice, imap, count
>>> mul = lambda x: lambda y: x*y
>>> list(islice(merge(imap(mul(5), count()), imap(mul(3), count())), 20))
[0, 3, 5, 6, 9, 10, 12, 15, 18, 20, 21, 24, 25, 27, 30, 33, 35, 36, 39, 40]
You can use:
def merge_no_repeat(iter1, iter2, key=None):
    if key is None:
        key = lambda x: x
    ref = next(iter1, None)
    for elem in iter2:
        key_elem = key(elem)  # caching value so we won't compute it for each value in iter1 that is before this one
        while ref is not None and key_elem > key(ref):
            # Catch up with low values from iter1
            yield ref
            ref = next(iter1, None)
        if ref is None or key_elem < key(ref):
            # Catch up with low values from iter2, eliminate duplicates
            yield elem
    # Update: I forgot to consume iter1 in the first version of this code
    for elem in iter1:
        # Use remaining items of iter1 if needed
        yield elem
I assumed that the iterators wouldn't yield None values except when completely consumed, since you have if element1 is None: and elif element2 is None: tests in your original code.
Examples:
>>> from operator import itemgetter
>>> list(merge_no_repeat(
... iter([(2, 'a'), (4, 'a'), (6, 'a')]),
... iter([(1, 'b')]),
... itemgetter(0)))
[(1, 'b'), (2, 'a'), (4, 'a'), (6, 'a')]
>>> list(merge_no_repeat(
... iter([(2, 'a'), (4, 'a'), (6, 'a')]),
... iter([(1, 'b'),(7, 'b'), (8, 'b')]),
... itemgetter(0)))
[(1, 'b'), (2, 'a'), (4, 'a'), (6, 'a'), (7, 'b'), (8, 'b')]
>>> list(merge_no_repeat(
... iter([(2, 'a'), (4, 'a'), (6, 'a')]),
... iter([(1, 'b'),(3, 'b'), (4,'b'),(5,'b'),(7, 'b'), (8, 'b')]),
... itemgetter(0)))
[(1, 'b'), (2, 'a'), (3, 'b'), (4, 'a'), (5, 'b'), (6, 'a'), (7, 'b'), (8, 'b')]