Why does itertools.count() consume an extra element when used with zip?

I was trying to use functools.partial with itertools.count by partially applying zip to itertools.count():
g = functools.partial(zip, itertools.count())
When calling g with inputs like "abc", "ABC", I noticed that itertools.count() mysteriously "jumps".
I thought I would get the same result as directly using zip with itertools.count(), like:
>>> x = itertools.count()
>>> list(zip("abc",x))
[('a', 0), ('b', 1), ('c', 2)]
>>> list(zip("ABC",x))
[('A', 3), ('B', 4), ('C', 5)]
But instead, I get the following -- notice the starting index at the second call of g is 4 instead of 3:
>>> g = functools.partial(zip, itertools.count())
>>> list(g("abc"))
[(0, 'a'), (1, 'b'), (2, 'c')]
>>> list(g("ABC"))
[(4, 'A'), (5, 'B'), (6, 'C')]

Note that you'd get the same result if your original code used arguments in the same order as your altered code:
>>> x = itertools.count()
>>> list(zip(x, "abc"))
[(0, 'a'), (1, 'b'), (2, 'c')]
>>> list(zip(x, "ABC"))
[(4, 'A'), (5, 'B'), (6, 'C')]
zip() tries its first argument first, then its second, then its third ... and stops when one of them is exhausted.
In the spelling just above, zip() starts a fourth round by getting 3 from x, but its second argument "abc" is already exhausted, so zip() stops, and the 3 is silently lost.
Then it moves on to the second zip(), and starts by getting 4 from x.
partial() really has nothing to do with it.
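If what you actually want is a single shared counter that never skips a value, one small sketch (keeping the (letter, number) ordering from your first snippet) is to pass the finite sequence as zip's first argument; functools.partial can only fill in leading positional arguments, so a lambda (or a tiny def) is used instead:
import itertools

x = itertools.count()
g = lambda s: zip(s, x)  # finite sequence first, so the counter is never probed past the last pairing

list(g("abc"))  # [('a', 0), ('b', 1), ('c', 2)]
list(g("ABC"))  # [('A', 3), ('B', 4), ('C', 5)]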

It'll be easy to see why if you encapsulate itertools.count() inside a function:
import functools
import itertools

def count():
    c = itertools.count()
    while True:
        v = next(c)
        print('yielding', v)
        yield v

g = functools.partial(zip, count())
list(g("abc"))
The output is
yielding 0
yielding 1
yielding 2
yielding 3
[(0, 'a'), (1, 'b'), (2, 'c')]
You'll see that zip pulls the next value from count() (so an extra value, 3, is yielded) before it realises there isn't anything left in the second iterable.
As an exercise, reverse the arguments and you'll see the evaluation is a little different.
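For reference, here is roughly what you should see if you do reverse them, reusing the instrumented count() from above -- the finite string is probed first, so the counter is never advanced past the last pairing and no value is lost:
>>> c = count()
>>> list(zip("abc", c))
yielding 0
yielding 1
yielding 2
[('a', 0), ('b', 1), ('c', 2)]
>>> list(zip("ABC", c))
yielding 3
yielding 4
yielding 5
[('A', 3), ('B', 4), ('C', 5)]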

Related

Why is listay not appending the elements?

I'm very new to Python and coding in general, so this question will probably sound dumb.
I want to append two-element tuples to listay: if the first element of a tuple in l2 matches the first element of any tuple in listax, it should be appended to listay together with its second element.
If it worked, my output (print(listay)) would be: [('a', 4), ('b', 2), ('c', 1)]. Instead, the output is an empty list. What am I doing wrong?
Also, I am sorry if I am not offering all the information necessary. This is my first question ever about coding in a forum.
import operator

listax = []
listay = []
l1 = [('a', 3), ('b', 3), ('c', 3), ('d', 2)]
l2 = [('a', 4), ('b', 2), ('c', 1), ('d', 2)]
sl1 = sorted(l1, key=lambda t: t[1])
sl2 = sorted(l2, key=lambda t: t[1])
tup1l1 = sl1[len(sl1)-1]
k1l1 = tup1l1[0]
v1l1 = tup1l1[1]
tup2l1 = sl1[len(sl1)-2]
k2l1 = tup2l1[0]
v2l1 = tup2l1[1]
tup3l1 = sl1[len(sl1)-3]
k3l1 = tup3l1[0]
v3l1 = tup3l1[1]
tup1l2 = sl2[len(sl2)-1]
k1l2 = tup1l2[0]
v1l2 = tup1l2[1]
tup2l2 = sl2[len(sl2)-2]
k2l2 = tup2l2[0]
v2l2 = tup2l2[1]
tup3l2 = sl2[len(sl2)-3]
k3l2 = tup3l2[0]
v3l2 = tup3l2[1]
listax.append((k2l1, v2l1))
if v2l1 == v1l1:
    listax.append((k1l1, v1l1))
if v2l1 == v3l1:
    listax.append((k3l1, v3l1))
for i, n in l2:
    if i in listax:
        listay.append((i, n))
print(listay)
I'll play the debugger role here, because I'm not sure what you are trying to achieve, but you could do it yourself - try the built-in breakpoint() function and the Python debugger commands - it helps immensely ;)
Side note - I'm not sure why you import operator, but I assume it's not related to the question.
You sort both lists by the second element, ascending; Python's sort is stable, so you get:
sl1 = [('d', 2), ('a', 3), ('b', 3), ('c', 3)]
sl2 = [('c', 1), ('b', 2), ('d', 2), ('a', 4)]
k1l1 = 'c'
v1l1 = 3
k2l1 = 'b'
v2l1 = 3
k3l1 = 'a'
v3l1 = 3
k1l2 = 'a'
v1l2 = 4
k2l2 = 'd'
v2l2 = 2
k3l2 = 'b'
v3l2 = 2
After the first append:
listax = [('b', 3)]
v2l1 == v1l1 is True (3 == 3), so
listax = [('b', 3), ('c', 3)]
v2l1 == v3l1 is True (3 == 3), so
listax = [('b', 3), ('c', 3), ('a', 3)]
I think it gets tricky here:
for i,n in l2:
with
l2 = [('a', 4), ('b', 2), ('c', 1), ('d', 2)]
we get
i = 'a'
n = 4
maybe you wanted enumerate(l2)?
'a' in listax ([('b', 3), ('c', 3), ('a', 3)]) is False
listax doesn't contain an element equal to 'a' - it contains an element which itself contains 'a'. Maybe that's the mistake? (A sketch of one possible fix is at the end of this answer.)
i = 'b'
n = 2
just like before
nothing interesting happens later ;)
Hope this helps :D
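To make that last point concrete, here is a minimal sketch of one possible fix, assuming the goal is to match on the first element of each tuple in listax:
listax_keys = [k for k, v in listax]  # ['b', 'c', 'a'] -- just the letters
listay = [(i, n) for i, n in l2 if i in listax_keys]
print(listay)  # [('a', 4), ('b', 2), ('c', 1)]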

How do I print out the elements that are in both tuples?

a = (('we', 23), ('b', 2))
b = (('we', 3), ('e', 3), ('b', 4))
#wanted_result = (('we', 3), ('b', 4), ('we', 23), ('b', 2))
How can I get the tuples that contain the same string in both a and b,
like the result I have written below the code?
I would prefer using list comprehensions with filters, btw... would that be possible?
You can use set intersection:
keys = dict(a).keys() & dict(b)
tuple(t for t in a + b if t[0] in keys)
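For example, with the tuples from the question this gives roughly:
a = (('we', 23), ('b', 2))
b = (('we', 3), ('e', 3), ('b', 4))

keys = dict(a).keys() & dict(b)  # {'we', 'b'} -- a set of the common first elements
result = tuple(t for t in a + b if t[0] in keys)
print(result)  # (('we', 23), ('b', 2), ('we', 3), ('b', 4))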
You can make a set of the intersection between the first part of the tuples in both lists. Then use a list comprehension to extract the tuples that match this common set:
a = (('we', 23), ('b', 2))
b = (('we', 3), ('e', 3), ('b', 4))
common = set(next(zip(*a))) & set(next(zip(*b)))
result = [t for t in a+b if t[0] in common]
[('we', 23), ('b', 2), ('we', 3), ('b', 4)]
You can also do something similar using the Counter class from collections (by filtering tuples on string counts greater than 1):
from collections import Counter
common = Counter(next(zip(*a,*b)))
result = [(s,n) for (s,n) in a+b if common[s]>1]
If you want a single list comprehension, given that your tuples have exactly two values, you can pair each one with a dictionary formed from the other and use the dictionary as a filter mechanism:
result = [t for d,tl in [(dict(b),a),(dict(a),b)] for t in tl if t[0] in d]
Adding two list comprehensions (i.e. concatenating lists):
print([bi for bi in b if any(bi[0]==i[0] for i in a)] +
      [ai for ai in a if any(ai[0]==i[0] for i in b)])
# Output: [('we', 3), ('b', 4), ('we', 23), ('b', 2)]
Explanation
[bi for bi in b if any(bi[0]==i[0] for i in a)]
# Take tuples from b whose first element equals one of the
# first elements of a
[ai for ai in a if any(ai[0]==i[0] for i in b)]
# Similarly, take tuples from a whose first element equals one of the
# first elements of b
Another variation with sets:
filtered_keys = set(k for k, v in a) & set(k for k, v in b)
res = tuple((k, v) for k, v in [*a, *b] if k in filtered_keys)
# (('we', 23), ('b', 2), ('we', 3), ('b', 4))

Python 3: Reverse consecutive runs in sorted list?

This question is an extension of What's the most Pythonic way to identify consecutive duplicates in a list?.
Suppose you have a list of tuples:
my_list = [(1,4), (2,3), (3,2), (4,4), (5,2)]
and you sort it by each tuple's last value:
my_list = sorted(my_list, key=lambda tuple: tuple[1])
# [(3,2), (5,2), (2,3), (1,4), (4,4)]
then we have two consecutive runs (looking at the last value in each tuple), namely [(3,2), (5,2)] and [(1,4), (4,4)].
What is the pythonic way to reverse each run (not the tuples within), e.g.
reverse_runs(my_list)
# [(5,2), (3,2), (2,3), (4,4), (1,4)]
Is this possible to do within a generator?
UPDATE
It has come to my attention that perhaps the example list was not clear. So instead consider:
my_list = [(1,"A"), (2,"B"), (5,"C"), (4,"C"), (3,"C"), (6,"A"),(7,"A"), (8,"D")]
Where the ideal output from reverse_runs would be
[(7,"A"), (6,"A"), (1,"A"), (2,"B"), (3,"C"), (4,"C"), (5,"C"), (8,"D")]
To be clear on terminology, I am adopting the use of "run" as used in describing TimSort, which is what Python's sort function is based upon - giving it (the sort function) its stability.
Thus, if you sort a collection of multi-part elements, only the specified dimension is sorted on, and if two elements are equal in that dimension, their relative order is not altered.
Thus the following call:
sorted(my_list, key=lambda t: t[1])
yields:
[(1, 'A'), (6, 'A'), (7, 'A'), (2, 'B'), (5, 'C'), (4, 'C'), (3, 'C'), (8, 'D')]
and the run on "C" (i.e. (5, 'C'), (4, 'C'), (3, 'C') ) is not disturbed.
So, in conclusion, the desired behavior of the yet-to-be-defined function reverse_runs is to:
1.) sort the tuples by their last element
2.) reverse each run on the last element, while otherwise maintaining the order of the first elements
Ideally I would like this as a generator function, but that does not (to me, at the moment) seem possible.
Thus one could adopt the following strategy:
1.) Sort the tuples by the last element via sorted(my_list, key=lambda tuple: tuple[1])
2.) Identify the indexes where tuple (i+1) has a different last element than tuple (i), i.e. identify the runs
3.) Make an empty list
4.) Using the slice operator, obtain, reverse, and then append each sublist to the empty list
I think this will work.
my_list = [(1,4), (2,3), (3,2), (4,4), (5,2)]
my_list = sorted(my_list, key=lambda tuple: (tuple[1], -tuple[0]))
print(my_list)
Output
[(5, 2), (3, 2), (2, 3), (4, 4), (1, 4)]
Misunderstood question. Less pretty but this should work for what you really want:
from itertools import groupby
from operator import itemgetter

def reverse_runs(l):
    sorted_list = sorted(l, key=itemgetter(1))
    reversed_groups = (reversed(list(g)) for _, g in groupby(sorted_list, key=itemgetter(1)))
    reversed_runs = [e for sublist in reversed_groups for e in sublist]
    return reversed_runs

if __name__ == '__main__':
    print(reverse_runs([(1, 4), (2, 3), (3, 2), (4, 4), (5, 2)]))
    print(reverse_runs([(1, "A"), (2, "B"), (5, "C"), (4, "C"), (3, "C"), (6, "A"), (7, "A"), (8, "D")]))
Output
[(5, 2), (3, 2), (2, 3), (4, 4), (1, 4)]
[(7, 'A'), (6, 'A'), (1, 'A'), (2, 'B'), (3, 'C'), (4, 'C'), (5, 'C'), (8, 'D')]
Generator version:
from itertools import groupby
from operator import itemgetter

def reverse_runs(l):
    sorted_list = sorted(l, key=itemgetter(1))
    reversed_groups = (reversed(list(g)) for _, g in groupby(sorted_list, key=itemgetter(1)))
    for group in reversed_groups:
        yield from group

if __name__ == '__main__':
    print(list(reverse_runs([(1, 4), (2, 3), (3, 2), (4, 4), (5, 2)])))
    print(list(reverse_runs([(1, "A"), (2, "B"), (5, "C"), (4, "C"), (3, "C"), (6, "A"), (7, "A"), (8, "D")])))
The most general case requires 2 sorts. The first sort is a reversed sort on the second criteria. The second sort is a forward sort on the first criteria:
from operator import itemgetter

pass1 = sorted(my_list, key=itemgetter(0), reverse=True)
result = sorted(pass1, key=itemgetter(1))
We can sort in multiple passes like this because Python's sort algorithm is guaranteed to be stable.
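For example, applied to the numeric list from the question, the two passes should look something like this:
from operator import itemgetter

my_list = [(1, 4), (2, 3), (3, 2), (4, 4), (5, 2)]

pass1 = sorted(my_list, key=itemgetter(0), reverse=True)  # [(5, 2), (4, 4), (3, 2), (2, 3), (1, 4)]
result = sorted(pass1, key=itemgetter(1))                 # stable, so ties keep pass1's order
print(result)  # [(5, 2), (3, 2), (2, 3), (4, 4), (1, 4)]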
However, in real life it's often possible to simply construct a more clever key function which allows the sorting to happen in one pass. This usually involves "negating" one of the values and relying on the fact that tuples order themselves lexicographically:
result = sorted(my_list, key=lambda t: (t[1], -t[0]))
In response to your update, it looks like the following might be a suitable solution:
from operator import itemgetter
from itertools import chain, groupby
my_list = [(1,"A"), (2,"B"), (5,"C"), (4,"C"), (3,"C"), (6,"A"),(7,"A"), (8,"D")]
pass1 = sorted(my_list, key=itemgetter(1))
result = list(chain.from_iterable(reversed(list(g)) for k, g in groupby(pass1, key=itemgetter(1))))
print(result)
We can take apart the expression:
chain.from_iterable(reversed(list(g)) for k, g in groupby(pass1, key=itemgetter(1)))
to try to figure out what it's doing...
First, let's look at groupby(pass1, key=itemgetter(1)). groupby will yield 2-tuples. The first item (k) in the tuple is the "key" -- e.g. whatever was returned from itemgetter(1). The key isn't really important here after the grouping has taken place, so we don't use it. The second item (g -- for "group") is an iterable that yields consecutive values that have the same "key". These are exactly the items that you requested; however, they're in the order that they were in after sorting, and you requested them in reverse order. In order to reverse an arbitrary iterable, we can construct a list from it and then reverse the list, e.g. reversed(list(g)). Finally, we need to paste those chunks back together again, which is where chain.from_iterable comes in.
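To see the pieces in isolation, here is a small sketch that just prints what groupby produces before any reversing or chaining happens (each group is itself a lazy iterator, so it has to be consumed before moving on to the next one):
from itertools import groupby
from operator import itemgetter

my_list = [(1, "A"), (2, "B"), (5, "C"), (4, "C"), (3, "C"), (6, "A"), (7, "A"), (8, "D")]
pass1 = sorted(my_list, key=itemgetter(1))

for k, g in groupby(pass1, key=itemgetter(1)):
    print(k, list(g))
# A [(1, 'A'), (6, 'A'), (7, 'A')]
# B [(2, 'B')]
# C [(5, 'C'), (4, 'C'), (3, 'C')]
# D [(8, 'D')]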
If we want to get more clever, we might do better from an algorithmic standpoint (assuming that the "key" for the bins is hashable). The trick is to bin the objects in a dictionary and then sort the bins. This means that we're potentially sorting a much shorter list than the original:
from collections import defaultdict, deque
from itertools import chain

my_list = [(1,"A"), (2,"B"), (5,"C"), (4,"C"), (3,"C"), (6,"A"),(7,"A"), (8,"D")]

bins = defaultdict(deque)
for t in my_list:
    bins[t[1]].appendleft(t)

print(list(chain.from_iterable(bins[key] for key in sorted(bins))))
Note that whether this does better than the first approach is very dependent on the initial data. Since TimSort is such a beautiful algorithm, if the data starts already grouped into bins, then this algorithm will likely not beat it (though, I'll leave it as an exercise for you to try...). However, if the data is well scattered (causing TimSort to behave more like MergeSort), then binning first will possibly make for a slight win.

zip list with a single element

I have a list of some elements, e.g. [1, 2, 3, 4] and a single object, e.g. 'a'. I want to produce a list of tuples with the elements of the list in the first position and the single object in the second position: [(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')].
I could do it with zip like this:
def zip_with_scalar(l, o):  # l - the list; o - the object
    return list(zip(l, [o] * len(l)))
However, this gives me the feeling of creating an unnecessary list of repeated elements.
Another possibility is
def zip_with_scalar(l, o):
    return [(i, o) for i in l]
which is very clean and pythonic indeed, but here I do the whole thing "manually". In Haskell I would do something like
zipWithScalar l o = zip l $ repeat o
Is there any built-in function or trick, either for the zipping with scalar or for something that would enable me to use ordinary zip, i.e. sort-of infinite list?
This is the closest to your Haskell solution:
import itertools

def zip_with_scalar(l, o):
    return zip(l, itertools.repeat(o))
You could also use a generator expression, which avoids building the intermediate list that a comprehension would create:
def zip_with_scalar(l, o):
    return ((i, o) for i in l)
You can use the built-in map function:
>>> elements = [1, 2, 3, 4]
>>> key = 'a'
>>> list(map(lambda e: (e, key), elements))
[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]
(In Python 3, map returns a lazy iterator, hence the list() call to show the result.)
This is a perfect job for itertools.cycle.
from itertools import cycle

def zip_with_scalar(l, o):
    # wrap o in a list so that cycle() repeats the object itself, not its contents
    return zip(l, cycle([o]))
Demo:
>>> from itertools import cycle
>>> l = [1, 2, 3, 4]
>>> list(zip(l, cycle('a')))
[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]
lst = [1,2,3,4]
tups = [(itm, 'a') for itm in lst]
tups
> [(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]
>>> l = [1, 2, 3, 4]
>>> list(zip(l, "a"*len(l)))
[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]
You could also use zip_longest with a fillvalue of o:
from itertools import zip_longest

def zip_with_scalar(l, o):  # l - the list; o - the object
    return zip_longest(l, [o], fillvalue=o)

print(list(zip_with_scalar([1, 2, 3, 4], "a")))
Just be aware that any mutable values used for o won't be copied whether using zip_longest or repeat.
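A tiny sketch of that caveat, using one shared mutable list as the scalar:
from itertools import repeat

marker = []  # a single shared mutable object
pairs = list(zip([1, 2, 3], repeat(marker)))
pairs[0][1].append('x')  # mutate it through the first tuple
print(pairs)  # [(1, ['x']), (2, ['x']), (3, ['x'])] -- every tuple sees the change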
The more-itertools library recently added a zip_broadcast() function that solves this problem well:
>>> from more_itertools import zip_broadcast
>>> list(zip_broadcast([1,2,3,4], 'a'))
[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]
This is a much more general solution than the other answers posted here:
Empty iterables are correctly handled.
There can be multiple iterable and/or scalar arguments.
The order of the scalar/iterable arguments doesn't need to be known.
If there are multiple iterable arguments, you can check that they are the same length with strict=True.
You can easily control whether or not strings should be treated as iterables (by default they are not).
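For instance, a sketch of the multi-argument case described above (here strict=True is expected to raise if the iterable arguments differ in length; the exact exception is an implementation detail of more-itertools):
from more_itertools import zip_broadcast

# 'x' is a string, so by default it is treated as a scalar and broadcast
print(list(zip_broadcast([1, 2, 3], 'x', [4, 5, 6], strict=True)))
# [(1, 'x', 4), (2, 'x', 5), (3, 'x', 6)]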
Just define a class implementing an infinite iterator, initialized with the single element you want injected into the tuples:
class zipIterator:
    def __init__(self, val):
        self.__val = val

    def __iter__(self):
        return self

    def __next__(self):
        return self.__val
and then create your new list from this class and the lists you have:
elements = [1, 2, 3, 4]
key = 'a'
res = [it for it in zip(elements, zipIterator(key))]
the result would be:
>>> res
[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]

Python enumerate downwards or with a custom step

How can I make Python's enumerate function count downwards (descending order, decrement, count down)? Or, more generally, how can I use a different step increment/decrement in enumerate?
For example, such a function, applied to the list ['a', 'b', 'c'] with start value 10 and step -2, would produce the iterator [(10, 'a'), (8, 'b'), (6, 'c')].
I haven't found a more elegant, idiomatic, and concise way than writing a simple generator:
def enumerate2(xs, start=0, step=1):
    for x in xs:
        yield (start, x)
        start += step
Examples:
>>> list(enumerate2([1,2,3], 5, -1))
[(5, 1), (4, 2), (3, 3)]
>>> list(enumerate2([1,2,3], 5, -2))
[(5, 1), (3, 2), (1, 3)]
If you don't understand the above code, read What does the "yield" keyword do in Python? and Difference between Python's Generators and Iterators.
One option is to zip your iterable to a range:
for index, item in zip(range(10, 0, -2), ['a', 'b', 'c']):
    ...
This does have the limitation that you need to know how far the range should go (the minimum it should cover - as in my example, excess will be truncated by zip).
If you don't know, you could roll your own "infinite range" (or just use itertools.count) and use that:
>>> def inf_range(start, step):
...     """Generator function to provide a never-ending range."""
...     while True:
...         yield start
...         start += step
...
>>> list(zip(inf_range(10, -2), ['a', 'b', 'c']))
[(10, 'a'), (8, 'b'), (6, 'c')]
Here is an idiomatic way to do that:
list(zip(itertools.count(10,-2), 'abc'))
returns:
[(10, 'a'), (8, 'b'), (6, 'c')]
Another option is to use itertools.count, which is helpful for "enumerating" by a step, in reverse.
import itertools
counter = itertools.count(10, -2)
[(next(counter), letter) for letter in ["a", "b", "c"]]
# [(10, 'a'), (8, 'b'), (6, 'c')]
Characteristics
- concise
- the step and direction logic is compactly stored in count()
- the enumerated indices are iterated with next()
- count() is inherently infinite; useful if the terminal boundary is unknown (see #jonrsharpe)
- the sequence length intrinsically terminates the infinite iterator
If you don't need the iterator kept in a variable and are just iterating over some container, multiply the index by your step:
container = ['a', 'b', 'c']
step = -2
for index, value in enumerate(container):
    print(f'{index * step}, {value}')
# Output:
# 0, a
# -2, b
# -4, c
Maybe not very elegant, but using f-strings the following quick solution can be handy:
my_list = ['apple', 'banana', 'grapes', 'pear']
p = 10
for counter, value in enumerate(my_list):
    print(f" {counter+p}, {value}")
    p += 9
> 10, apple
> 20, banana
> 30, grapes
> 40, pear
