Outerzip / zip longest function (with multiple fill values)

Is there a Python function for an "outer-zip", an extension of zip that takes a different default value for each iterable?
a = [1, 2, 3] # associate a default value of 0
b = [4, 5, 6, 7] # associate a default value of 1
zip(a, b) # [(1, 4), (2, 5), (3, 6)]
outerzip((a, 0), (b, 1)) # [(1, 4), (2, 5), (3, 6), (0, 7)]
outerzip((b, 0), (a, 1)) # [(4, 1), (5, 2), (6, 3), (7, 1)]
I can almost replicate this outerzip function using map, but with None as the only default:
map(None, a, b) # [(1, 4), (2, 5), (3, 6), (None, 7)]
Note1: The built-in zip function takes an arbitrary number of iterables, and so should an outerzip function. (e.g. one should be able to calculate outerzip((a,0),(a,0),(b,1)) similarly to zip(a,a,b) and map(None, a, a, b).)
Note2: I say "outer-zip", in the style of this Haskell question, but perhaps this is not the correct terminology.

It's called izip_longest in Python 2 (zip_longest in Python 3) and lives in itertools. Note that it takes a single fillvalue shared by all iterables:
>>> from itertools import zip_longest
>>> a = [1,2,3]
>>> b = [4,5,6,7]
>>> list(zip_longest(a, b, fillvalue=0))
[(1, 4), (2, 5), (3, 6), (0, 7)]
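Since zip_longest accepts only one fillvalue, a per-iterable default needs a small wrapper. The following is a minimal sketch, not a standard-library function: the name outerzip_sentinel and the _MISSING marker are made up for illustration, and it assumes the inputs never contain that marker object themselves.

```python
from itertools import zip_longest

_MISSING = object()  # hypothetical private marker for "this iterable ran out"

def outerzip_sentinel(*pairs):
    """Yield tuples like zip_longest, but pad each iterable with its own default.

    Each argument is an (iterable, default) pair, as in the question.
    """
    iterables = [iterable for iterable, _ in pairs]
    defaults = [default for _, default in pairs]
    for row in zip_longest(*iterables, fillvalue=_MISSING):
        # Swap the marker for the default that belongs to that position
        yield tuple(d if value is _MISSING else value
                    for value, d in zip(row, defaults))

a = [1, 2, 3]
b = [4, 5, 6, 7]
print(list(outerzip_sentinel((a, 0), (b, 1))))  # [(1, 4), (2, 5), (3, 6), (0, 7)]
print(list(outerzip_sentinel((b, 0), (a, 1))))  # [(4, 1), (5, 2), (6, 3), (7, 1)]
```

Because it only relies on zip_longest, this sketch should work with any number of arguments and with lazy iterables as well as lists.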

You could modify zip_longest to support your use case for general iterables.
from itertools import chain, repeat
class OuterZipStopIteration(Exception):
    pass
def outer_zip(*args):
    count = len(args) - 1
    def sentinel(default):
        nonlocal count
        if not count:
            raise OuterZipStopIteration
        count -= 1
        yield default
    iters = [chain(p, sentinel(default), repeat(default)) for p, default in args]
    try:
        while iters:
            yield tuple(map(next, iters))
    except OuterZipStopIteration:
        pass
print(list(outer_zip( ("abcd", '!'),
                      ("ef", '#'),
                      (map(int, '345'), '$') )))

This function can be defined by extending each input list to the maximum length and then zipping:
def outerzip(*args):
    # args = (a, default_a), (b, default_b), ...
    max_length = max(len(seq) for seq, _ in args)
    # Pad each list with its own default; this requires lists, not arbitrary iterables
    extended_args = [seq + [default] * (max_length - len(seq)) for seq, default in args]
    return list(zip(*extended_args))  # list() so the result is also a list on Python 3
outerzip((a, 0), (b, 1)) # [(1, 4), (2, 5), (3, 6), (0, 7)]

Related

Given a List get all the combinations of tuples without duplicated results

I have a list lst=[1,2,3,4]
and I want to get every pair of its elements as a tuple, like all the positions in a matrix, so the result would be
(1,1),(1,2),(1,3),(1,4),(2,1),(2,2),(2,3),(2,4),(3,1),(3,2),(3,3),(3,4),(4,1),(4,2),(4,3),(4,4)
I've seen several snippets that return all the combinations, but I don't know how to restrict them to pairs or how to include (1,1),(2,2),(3,3),(4,4).
Thank you in advance.

You just need a double loop. A generator makes it easy to use:
lst = [1,2,3,4]
def matrix(lst):
    for i in range(len(lst)):
        for j in range(len(lst)):
            yield lst[i], lst[j]
output = [t for t in matrix(lst)]
print(output)
Output:
[(1, 1), (1, 2), (1, 3), (1, 4), (2, 1), (2, 2), (2, 3), (2, 4), (3, 1), (3, 2), (3, 3), (3, 4), (4, 1), (4, 2), (4, 3), (4, 4)]

If you just want to make pairs of all the elements in the list:
tuple_pairs = [(r,c) for r in lst for c in lst]
If you instead have maximum row and column numbers max_row and max_col, you can skip building lst=[1,2,3,4] and write:
tuple_pairs = [(r,c) for r in range(1,max_row+1) for c in range(1,max_col+1)]
But that assumes lst was meant to be range(1, some_num).

Use itertools.product to get all ordered pairs (the Cartesian product) drawn from an iterable. product is roughly equivalent to nested for-loops, with the depth given by the keyword parameter repeat. It returns an iterator.
from itertools import product
lst = [1, 2, 3, 4]
combos = product(lst, repeat=2)
combos = list(combos) # cast to list
print(*combos, sep=' ')
Diagonal terms can be found in a single loop (without any extra imports):
repeat = 2
diagonal = [(i,)*repeat for i in lst]
print(*diagonal, sep=' ')
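To make the "nested for-loops" equivalence concrete, here is a small check that product with repeat=2 yields the same tuples, in the same order, as a two-level comprehension. The variable names nested and via_product are only illustrative.

```python
from itertools import product

lst = [1, 2, 3, 4]

# Two nested loops written as a comprehension
nested = [(r, c) for r in lst for c in lst]

# The same pairs from itertools.product; repeat=2 means "lst crossed with itself"
via_product = list(product(lst, repeat=2))

print(nested == via_product)  # True
print(len(via_product))       # 16
```

The diagonal pairs such as (1, 1) are already part of this result, so no separate step is needed to add them.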

You can do that using list comprehension.
lst=[1,2,3,4]
out=[(i,i) for i in lst]
print(out)
Output:
[(1, 1), (2, 2), (3, 3), (4, 4)]

print most k frequent numbers of list with rank ties

I am trying to print the k most frequent numbers in a text file. I was able to sort the numbers into a list of tuples, each holding a number and how many times it appears in the file.
l =[(0, 7), (3, 4), (-101, 3), (2, 3), (-3, 1), (-2, 1), (-1, 1), (101, 1)] # (number, count): 0 appeared in the file 7 times, and so on
Now I want to print the k most frequent numbers in the file (this should be done RECURSIVELY), but I am struggling with rank ties. For example, if k=3 I want to print:
[(0, 7), (3, 4), (-101, 3), (2, 3)] # top 3 frequencies
I tried doing:
def head(l): return l[0]
def tail(l): return l[1:]
def topk(l,k,e):
    if(len(l)<=1 or k==0):
        return [head(l)[1]]
    elif(head(l)[1]!=e):
        return [head(l)[1]] + topk(tail(l),k-1,head(l)[1])
    else:
        return [head(l)[1]] + topk(tail(l),k,head(l)[1])
l1 = [(0, 7), (3, 4), (-101, 3), (2, 3), (-3, 1), (-2, 1), (-1, 1), (101, 1)]
l2 = [(3.3, 4), (-3.3, 3), (-2.2, 2), (1.1, 1)]
print(topk(l1,3,''))
print(topk(l2,3,''))
l1 prints correctly, but l2 has an extra frequency for some reason.

You can use the built-in sorted function with the key parameter to find the frequency of the k-th most frequent entry (here k=3), and then use a list comprehension to keep every element whose frequency is at least that value:
v = sorted(l, key=lambda x: x[1])[-3][1]
[e for e in l if e[1] >= v]
output:
[(0, 7), (3, 4), (-101, 3), (2, 3)]
If you want a recursive version, you can use:
def my_f(l, v, top=None, i=0):
    # Walk the list by index, collecting entries whose frequency is at least v
    if top is None:
        top = []
    if l[i][1] >= v:
        top.append(l[i])
    if i == len(l) - 1:
        return top
    return my_f(l, v, top, i+1)
def topk(l, k):
    k = min(len(l), k)
    # Frequency of the k-th most frequent entry
    v = sorted(l, key=lambda x: x[1])[-k][1]
    return my_f(l, v)
topk(l, 3)
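The approach above keeps everything at or above the frequency of the k-th entry. If "top k" is instead read as the k highest distinct frequencies, a recursion that only spends one unit of k when the frequency changes is one way to handle ties. This is a sketch under two assumptions: the list is already sorted by descending frequency, as in the question, and the name topk_by_rank is hypothetical.

```python
def topk_by_rank(pairs, k, prev=None):
    """Return the (number, count) pairs holding the k highest distinct counts.

    Assumes `pairs` is sorted by count in descending order.
    """
    if not pairs:
        return []
    head, tail = pairs[0], pairs[1:]
    if head[1] != prev:
        # A new frequency rank starts here; stop if no ranks are left
        if k == 0:
            return []
        k -= 1
    return [head] + topk_by_rank(tail, k, head[1])

l1 = [(0, 7), (3, 4), (-101, 3), (2, 3), (-3, 1), (-2, 1), (-1, 1), (101, 1)]
l2 = [(3.3, 4), (-3.3, 3), (-2.2, 2), (1.1, 1)]
print(topk_by_rank(l1, 3))  # [(0, 7), (3, 4), (-101, 3), (2, 3)]
print(topk_by_rank(l2, 3))  # [(3.3, 4), (-3.3, 3), (-2.2, 2)]
```

Checking k before consuming the next rank, rather than after, is what avoids the extra entry reported for l2 in the question.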

Sort out pairs with same members but different order from list of pairs

From the list
l =[(3,4),(2,3),(4,3),(3,2)]
I want to remove the second appearance of any pair that has the same members as an earlier pair but in reverse order. I.e., the result should be
[(3,4),(2,3)]
What's the most concise way to do that in Python?

Alternatively, one might do it in a more verbose way:
l = [(3,4),(2,3),(4,3),(3,2)]
L = []
omega = set()
for a,b in l:
    key = (min(a,b), max(a,b))
    if key in omega:
        continue
    omega.add(key)
    L.append((a,b))
print(L)

If we want to keep only the first tuple of each pair:
l =[(3,4),(2,3),(4,3),(3,2), (3, 3), (5, 6)]
def first_tuples(l):
    # We could use a list to keep track of the already seen
    # tuples, but checking if they are in a set is faster
    already_seen = set()
    out = []
    for tup in l:
        # As sets can only contain hashables, we use a
        # frozenset as the order-insensitive key
        key = frozenset(tup)
        if key not in already_seen:
            out.append(tup)
            already_seen.add(key)
    return out
print(first_tuples(l))
# [(3, 4), (2, 3), (3, 3), (5, 6)]

This ought to do the trick:
[x for i, x in enumerate(l) if any(y[::-1] == x for y in l[i:])]
Out[23]: [(3, 4), (2, 3)]
Expanding the initial list a little bit with different orderings:
l =[(3,4),(2,3),(4,3),(3,2), (1,3), (3,1)]
[x for i, x in enumerate(l) if any(y[::-1] == x for y in l[i:])]
Out[25]: [(3, 4), (2, 3), (1, 3)]
And, depending on whether each tuple is guaranteed to have an accompanying "sister" reversed tuple, the logic may change in order to keep "singleton" tuples:
l = [(3, 4), (2, 3), (4, 3), (3, 2), (1, 3), (3, 1), (10, 11), (10, 12)]
[x for i, x in enumerate(l) if any(y[::-1] == x for y in l[i:]) or not any(y[::-1] == x for y in l)]
Out[35]: [(3, 4), (2, 3), (1, 3), (10, 11), (10, 12)]

IMHO, this should be both shorter and clearer than anything posted so far:
my_tuple_list = [(3,4),(2,3),(4,3),(3,2)]
set((left, right) if left < right else (right, left) for left, right in my_tuple_list)
>>> {(2, 3), (3, 4)}
It simply makes a set of all tuples, whose members are exchanged beforehand if first member is > second member.
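One caveat with the set-based one-liner is that a set does not preserve the original order, and a pair such as (4, 3) that appears first would be stored as (3, 4). If order and orientation matter, a dict keyed by frozenset is one alternative. This is only a sketch, assuming Python 3.7+ (where dicts keep insertion order); the name first_seen is illustrative.

```python
pairs = [(3, 4), (2, 3), (4, 3), (3, 2)]

first_seen = {}
for pair in pairs:
    # frozenset ignores member order; setdefault keeps only the first pair per key
    first_seen.setdefault(frozenset(pair), pair)

result = list(first_seen.values())
print(result)  # [(3, 4), (2, 3)]
```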

returning a list of tuples like zip, generate it incrementally a tuple at a time, using comprehensions

I need a function that returns one tuple at a time from a list of sequences, without using zip. I tried to do it this way:
gen1=[(x,y)for x in range(3) for y in range(4)]
which gives the following list:
[(0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (1, 1), (1, 2), (1, 3), (2, 0), (2, 1), (2, 2), (2, 3)]
Next I tried to return one tuple at a time with:
next(gen1)
But an error occurred saying that the list is not 'iterable'. How can I do this using generators?

There are still some ambiguities in the question, but if what you want is simply a generator version of zip that works with an arbitrary number of sequences, the following should work:
def generator_zip(*args):
    iterators = [iter(arg) for arg in args]
    while iterators:
        try:
            # A list comprehension lets StopIteration reach the except clause
            yield tuple([next(it) for it in iterators])
        except StopIteration:
            return  # the shortest input is exhausted
First it turns each argument into an iterator, then it keeps yielding tuples that contain the next entry from each iterator until the shortest input is exhausted.

As of Python 2.4, you can do:
gen1 = ((x, y) for x in range(3) for y in range(4))
Note that you can always make a generator (well, iterator) from a list with iter:
gen1 = iter([(x, y) for x in range(3) for y in range(4)])
The difference in usage will be none. The second way will require the whole list to be in memory, though, while the first will not.
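The difference between the list and the generator expression shows up as soon as next is called. The snippet below is a small illustration; the variable names first, second, and message are only there to hold the results.

```python
# Parentheses create a generator expression, which supports next() directly
gen1 = ((x, y) for x in range(3) for y in range(4))
first = next(gen1)
second = next(gen1)
print(first, second)  # (0, 0) (0, 1)

# Square brackets create a list, which is iterable but is not itself an iterator
pairs = [(x, y) for x in range(3) for y in range(4)]
try:
    next(pairs)
except TypeError as exc:
    message = str(exc)
    print(message)  # 'list' object is not an iterator

# Wrapping the list in iter() gives an iterator over the already-built list
it = iter(pairs)
print(next(it))  # (0, 0)
```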
Note that you can also use the built-in zip, which returns a lazy iterator in Python 3. In Python 2, use itertools.izip.
Python 3:
>>> zip(range(0, 5), range(3, 8))
<zip object at 0x7f07519b3b90>
>>> list(zip(range(0, 5), range(3, 8)))
[(0, 3), (1, 4), (2, 5), (3, 6), (4, 7)]
Python < 3:
# Python < 3
>>> from itertools import izip
>>> izip(range(0, 5), range(3, 8))
<itertools.izip object at 0x7f5247807440>
>>> list(izip(range(0, 5), range(3, 8)))
[(0, 3), (1, 4), (2, 5), (3, 6), (4, 7)]
>>> zip(range(0, 5), range(3, 8))
[(0, 3), (1, 4), (2, 5), (3, 6), (4, 7)]

concurrently iterating through even and odd items of list

I have a list of items (which are HTML table rows, extracted with Beautiful Soup) and I need to iterate over the list and get even and odd elements (I mean index) for each loop run.
My code looks like this:
for top, bottom in izip(table[::2], table[1::2]):
    # do something with top
    # do something else with bottom
How can I make this code less ugly? Or is this already a good way to do it?
EDIT:
table[1::2], table[::2] => table[::2], table[1::2]

izip is a pretty good option, but here's a few alternatives since you're unhappy with it:
>>> def chunker(seq, size):
...     return (tuple(seq[pos:pos+size]) for pos in xrange(0, len(seq), size))
...
>>> x = range(11)
>>> x
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> chunker(x, 2)
<generator object <genexpr> at 0x00B44328>
>>> list(chunker(x, 2))
[(0, 1), (2, 3), (4, 5), (6, 7), (8, 9), (10,)]
>>> list(izip(x[1::2], x[::2]))
[(1, 0), (3, 2), (5, 4), (7, 6), (9, 8)]
As you can see, chunker has the advantage of properly handling an odd number of elements, which may or may not be important to you. There's also this recipe from the itertools documentation itself:
>>> def grouper(n, iterable, fillvalue=None):
...     "grouper(3, 'ABCDEFG', 'x') --> ABC DEF Gxx"
...     args = [iter(iterable)] * n
...     return izip_longest(fillvalue=fillvalue, *args)
...
>>>
>>> from itertools import izip_longest
>>> list(grouper(2, x))
[(0, 1), (2, 3), (4, 5), (6, 7), (8, 9), (10, None)]
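The session above is Python 2 (izip, izip_longest, xrange). On Python 3 the same recipe can be written with zip_longest. This is a sketch of that port; the name grouper_py3 and its argument order are choices made here, not part of the original answer.

```python
from itertools import zip_longest

def grouper_py3(iterable, n, fillvalue=None):
    """Collect items into fixed-length tuples, padding the last one."""
    # n references to the same iterator, so each tuple pulls n consecutive items
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

print(list(grouper_py3(range(11), 2)))
# [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9), (10, None)]
```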

Try:
def alternate(i):
    i = iter(i)
    while True:
        yield(i.next(), i.next())
>>> list(alternate(range(10)))
[(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]
This solution works on any iterable, not just lists, and doesn't copy the sequence (it will be far more efficient if you only want the first few elements of a long sequence).
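The generator above relies on Python 2 behaviour: iterators have a .next() method, and a StopIteration raised inside the generator simply ends it. On Python 3.7+ that exception would surface as a RuntimeError, so a port needs to handle the end of input explicitly. A possible sketch, with the hypothetical name alternate_py3:

```python
def alternate_py3(iterable):
    """Yield consecutive (even-index, odd-index) pairs from any iterable."""
    it = iter(iterable)
    for top in it:
        try:
            bottom = next(it)
        except StopIteration:
            return  # odd number of items: the unpaired last item is dropped
        yield top, bottom

print(list(alternate_py3(range(10))))
# [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]
```

Like the original, this stays lazy and never copies the input.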

Looks good. My only suggestion would be to wrap this in a function or method. That way, you can give it a name (evenOddIter()) which makes it much more readable.
