Merge python lists of different lengths - python

I am attempting to merge two python lists, where their values at a given index will form a list (element) in a new list. For example:
merge_lists([1,2,3,4], [1,5]) = [[1,1], [2,5], [3], [4]]
I could iterate on this function to combine ever more lists. What is the most efficient way to accomplish this?
Edit (part 2)
Upon testing the answer I had previously selected, I realized I had additional criteria and a more general problem. I would also like to combine lists containing lists or values. For example:
merge_lists([[1,2],[1]] , [3,4]) = [[1,2,3], [1,4]]
The answers currently provided generate lists of higher dimensions in cases like this.

One option is to use itertools.zip_longest (in python 3):
from itertools import zip_longest
[[x for x in t if x is not None] for t in zip_longest([1,2,3,4], [1,5])]
# [[1, 1], [2, 5], [3], [4]]
If you prefer sets:
[{x for x in t if x is not None} for t in zip_longest([1,2,3,4], [1,5])]
# [{1}, {2, 5}, {3}, {4}]
In python 2, use itertools.izip_longest:
from itertools import izip_longest
[[x for x in t if x is not None] for t in izip_longest([1,2,3,4], [1,5])]
#[[1, 1], [2, 5], [3], [4]]
Update to handle the slightly more complicated case:
def flatten(lst):
    result = []
    for s in lst:
        if isinstance(s, list):
            result.extend(s)
        else:
            result.append(s)
    return result
This handles the above two cases pretty well:
[flatten(x for x in t if x is not None) for t in izip_longest([1,2,3,4], [1,5])]
# [[1, 1], [2, 5], [3], [4]]
[flatten(x for x in t if x is not None) for t in izip_longest([[1,2],[1]] , [3,4])]
# [[1, 2, 3], [1, 4]]
Note that although this works for the two cases above, it can still break on more deeply nested structures, since such cases get complicated very quickly. For a more general solution, you can see here.
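Putting the pieces above together, here is a minimal Python 3 sketch of a merge_lists function that accepts any number of lists. The name _MISSING and the choice of a private sentinel as the fill value are assumptions of this sketch, not part of the answer above; the sentinel is there so that genuine None values in the input are kept rather than filtered out. Like flatten, it only unwraps one level of nesting.

```python
from itertools import zip_longest

# Hypothetical sentinel: lets us tell padding apart from real None values.
_MISSING = object()

def merge_lists(*lists):
    """Merge lists index by index, flattening one level of nested lists."""
    merged = []
    for group in zip_longest(*lists, fillvalue=_MISSING):
        row = []
        for item in group:
            if item is _MISSING:
                continue  # this list was shorter; nothing to add
            if isinstance(item, list):
                row.extend(item)  # unwrap a single level of nesting
            else:
                row.append(item)
        merged.append(row)
    return merged

print(merge_lists([1, 2, 3, 4], [1, 5]))   # [[1, 1], [2, 5], [3], [4]]
print(merge_lists([[1, 2], [1]], [3, 4]))  # [[1, 2, 3], [1, 4]]
print(merge_lists([None, 2], [1]))         # [[None, 1], [2]]
```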

Another way to have your desired output using zip():
def merge(a, b):
    m = min(len(a), len(b))
    sub = []
    for k, v in zip(a, b):
        sub.append([k, v])
    return sub + list([k] for k in a[m:]) if len(a) > len(b) else sub + list([k] for k in b[m:])
a = [1, 2, 3, 4]
b = [1, 5]
print(merge(a, b))
# [[1, 1], [2, 5], [3], [4]]

You could use itertools.izip_longest and filter():
>>> lst1, lst2 = [1, 2, 3, 4], [1, 5]
>>> from itertools import izip_longest
>>> [list(filter(None, x)) for x in izip_longest(lst1, lst2)]
[[1, 1], [2, 5], [3], [4]]
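How it works: izip_longest() aggregates the elements from the two lists, filling missing values with None, which you then filter out with filter(). Be aware that filter(None, ...) removes every falsy value, so elements such as 0, '' or an empty list are dropped as well.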
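A small Python 3 sketch of how filter(None, ...) behaves when the input contains a falsy value such as 0, compared with an explicit "is not None" test. The sample lists and the variable names naive and safe are made up for illustration.

```python
from itertools import zip_longest

lst1, lst2 = [0, 2, 3, 4], [1, 0]

# filter(None, ...) keeps only truthy items, so the zeros disappear.
naive = [list(filter(None, x)) for x in zip_longest(lst1, lst2)]

# Testing against None explicitly keeps the zeros and drops only the padding.
safe = [[v for v in x if v is not None] for x in zip_longest(lst1, lst2)]

print(naive)  # [[1], [2], [3], [4]]
print(safe)   # [[0, 1], [2, 0], [3], [4]]
```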

Another way using zip_longest and chain from itertools:
import itertools
[i for i in list(itertools.chain(*itertools.zip_longest(list1, list2, list3))) if i is not None]
or in 2 lines (more readable):
merged_list = list(itertools.chain(*itertools.zip_longest(a, b, c)))
merged_list = [i for i in merged_list if i is not None]
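For reference, a runnable Python 3 sketch of the same idea with made-up inputs a, b and c. It uses chain.from_iterable instead of unpacking with *, which is a substitution on my part. Note that the result is a single flat, interleaved list rather than a list of per-index lists.

```python
from itertools import chain, zip_longest

# Hypothetical sample inputs of different lengths.
a, b, c = [1, 2, 3, 4], [1, 5], [7]

# zip_longest pads the shorter lists with None; chain.from_iterable
# then flattens the tuples into one interleaved sequence.
interleaved = [x for x in chain.from_iterable(zip_longest(a, b, c))
               if x is not None]

print(interleaved)  # [1, 1, 7, 2, 5, 3, 4]
```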

Related

Compare 1st element of list from nest list in python

I have a list of lists like following :
[[a1,a2], [b1,b2],...., [n1]]
and I want to find whether the first elements of all these lists are equal?
I'd prefer to do this with a list comprehension unless there's a reason to avoid it, for readability's sake.
list_of_lists = [[1, 2], [1, 3], [1, 4]]
len(set([sublist[0] for sublist in list_of_lists])) == 1
# True
The solution is quite straightforward:
Transpose your list, using zip to perform this task
Index the first row of your transposed list
Use set to remove duplicates
Check whether the number of elements is equal to 1
>>> test = [[1, 2], [1, 3], [1, 4]]
>>> len(set(zip(*test)[0])) == 1
True
Note
If you are using Python 3.x, zip returns an iterator that cannot be indexed, so instead of indexing, wrap the call to zip in next
>>> len(set(next(zip(*test)))) == 1
How about?
>>> from operator import itemgetter
>>> test = [[1, 2], [1, 3], [1, 4]]
>>> len(set(map(itemgetter(0), test))) == 1
True
>>> test.append([2, 5])
>>> test
[[1, 2], [1, 3], [1, 4], [2, 5]]
>>> len(set(map(itemgetter(0), test))) == 1
False
And another way would be (Thanks, Peter DeGlopper!)
all(sublist[0] == test[0][0] for sublist in test)
This version would short-circuit too, so it wouldn't need to check every element in every case.
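A minimal sketch of the short-circuiting version wrapped in a function. The name first_elements_equal and the decision to return True for an empty outer list are assumptions of this sketch; it also assumes every sublist is non-empty.

```python
def first_elements_equal(rows):
    """Return True if every sublist starts with the same value."""
    if not rows:
        return True  # assumption: an empty list counts as "all equal"
    first = rows[0][0]
    # all() stops at the first mismatch, so later sublists are not inspected.
    return all(row[0] == first for row in rows)

print(first_elements_equal([[1, 2], [1, 3], [1, 4]]))          # True
print(first_elements_equal([[1, 2], [1, 3], [1, 4], [2, 5]]))  # False
```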
You can create a list of first elements compared to the first sublist's first element:
False not in [len(yourList[0])>0 and len(x)>0 and x[0] == yourList[0][0] for x in yourList]
With a one liner:
>>> sample = [[1, 2], [1, 3], [1, 4]]
>>> reduce(lambda x, y: x if x == y[0] else None, sample, sample[0][0])
1
>>> sample = [[0, 2], [1, 3], [1, 4]]
>>> reduce(lambda x, y: x if x == y[0] else None, sample, sample[0][0])
None
Try this...
>>> test = [[1, 2], [1, 3], [1, 4]]
>>> eval("==".join(map(lambda x: str(x[0]), test)))
True

Partition N items into K bins in Python lazily

Give an algorithm (or straight Python code) that yields all partitions of a collection of N items into K bins such that each bin has at least one item. I need this in both the case where order matters and where order does not matter.
Example where order matters
>>> list(partition_n_in_k_bins_ordered((1,2,3,4), 2))
[([1], [2,3,4]), ([1,2], [3,4]), ([1,2,3], [4])]
>>> list(partition_n_in_k_bins_ordered((1,2,3,4), 3))
[([1], [2], [3,4]), ([1], [2,3], [4]), ([1,2], [3], [4])]
>>> list(partition_n_in_k_bins_ordered((1,2,3,4), 4))
[([1], [2], [3], [4])]
Example where order does not matter
>>> list(partition_n_in_k_bins_unordered({1,2,3,4}, 2))
[{{1}, {2,3,4}}, {{2}, {1,3,4}}, {{3}, {1,2,4}}, {{4}, {1,2,3}},
{{1,2}, {3,4}}, {{1,3}, {2,4}}, {{1,4}, {2,3}}]
These functions should produce lazy iterators/generators, not lists. Ideally they would use primitives found in itertools. I suspect that there is a clever solution that is eluding me.
While I've asked for this in Python I'm also willing to translate a clear algorithm.
You need a recursive function to solve this kind of problem: take the list, split off a leading portion of increasing length, and apply the same procedure to the remaining tail of the list with n-1 bins.
Here is my take on the ordered case:
def partition(lista,bins):
    if len(lista)==1 or bins==1:
        yield [lista]
    elif len(lista)>1 and bins>1:
        for i in range(1,len(lista)):
            for part in partition(lista[i:],bins-1):
                if len([lista[:i]]+part)==bins:
                    yield [lista[:i]]+part
for i in partition(range(1,5),1):
    print i
#[[1, 2, 3, 4]]
for i in partition(range(1,5),2):
    print i
#[[1], [2, 3, 4]]
#[[1, 2], [3, 4]]
#[[1, 2, 3], [4]]
for i in partition(range(1,5),3):
    print i
#[[1], [2], [3, 4]]
#[[1], [2, 3], [4]]
#[[1, 2], [3], [4]]
for i in partition(range(1,5),4):
    print i
#[[1], [2], [3], [4]]
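Since the question asks for itertools primitives, here is an alternative Python 3 sketch for the ordered case. It is not the recursive approach above: it relies on the observation that splitting N items into K contiguous, non-empty bins is the same as choosing K-1 cut positions out of the N-1 gaps between items, which itertools.combinations enumerates lazily. The function name ordered_partitions is hypothetical.

```python
from itertools import combinations

def ordered_partitions(items, k):
    """Lazily yield every split of items into k contiguous, non-empty bins."""
    items = list(items)
    n = len(items)
    if k < 1:
        return  # nothing to yield for a non-positive number of bins
    # Each choice of k-1 cut points in 1..n-1 defines exactly one partition.
    for cuts in combinations(range(1, n), k - 1):
        bounds = (0,) + cuts + (n,)
        yield [items[lo:hi] for lo, hi in zip(bounds, bounds[1:])]

for p in ordered_partitions((1, 2, 3, 4), 3):
    print(p)
# [[1], [2], [3, 4]]
# [[1], [2, 3], [4]]
# [[1, 2], [3], [4]]
```

If k is larger than the number of items, combinations produces nothing, so the generator is simply empty.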
Enrico's algorithm and Knuth's, plus a little glue of my own, are all that is needed to put together something that returns a list of lists or a set of sets (returned as lists of lists in case the elements are not hashable).
def kbin(l, k, ordered=True):
    """
    Return sequence ``l`` partitioned into ``k`` bins.

    Examples
    ========

    The default is to give the items in the same order, but grouped
    into k partitions:
    >>> for p in kbin(range(5), 2):
    ...     print p
    ...
    [[0], [1, 2, 3, 4]]
    [[0, 1], [2, 3, 4]]
    [[0, 1, 2], [3, 4]]
    [[0, 1, 2, 3], [4]]

    Setting ``ordered`` to None means that the order of the elements in
    the bins is irrelevant and the order of the bins is irrelevant. Though
    they are returned in a canonical order as lists of lists, all lists
    can be thought of as sets.
    >>> for p in kbin(range(3), 2, ordered=None):
    ...     print p
    ...
    [[0, 1], [2]]
    [[0], [1, 2]]
    [[0, 2], [1]]
    """
    from sympy.utilities.iterables import (
        permutations, multiset_partitions, partitions)
    def partition(lista, bins):
        # EnricoGiampieri's partition generator from
        # http://stackoverflow.com/questions/13131491/
        # partition-n-items-into-k-bins-in-python-lazily
        if len(lista) == 1 or bins == 1:
            yield [lista]
        elif len(lista) > 1 and bins > 1:
            for i in range(1, len(lista)):
                for part in partition(lista[i:], bins - 1):
                    if len([lista[:i]] + part) == bins:
                        yield [lista[:i]] + part
    if ordered:
        for p in partition(l, k):
            yield p
    else:
        for p in multiset_partitions(l, k):
            yield p
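For readers who would rather not depend on sympy, the unordered case can be sketched in plain Python 3 with a recursive generator. This is not Knuth's algorithm and not what multiset_partitions does internally; it is a simple backtracking scheme in which each item either joins a bin that is already open or opens a new one. The names unordered_partitions and place are hypothetical, and the sketch assumes k >= 1 and distinct items.

```python
def unordered_partitions(items, k):
    """Lazily yield every way to split items into k non-empty, unordered bins."""
    items = list(items)

    def place(index, bins):
        # Prune: too few items left to open the bins that are still missing.
        if len(items) - index < k - len(bins):
            return
        if index == len(items):
            yield [list(b) for b in bins]  # copy, because bins is mutated in place
            return
        item = items[index]
        # Option 1: put the item into one of the bins already open.
        for b in bins:
            b.append(item)
            yield from place(index + 1, bins)
            b.pop()
        # Option 2: open a new bin, as long as fewer than k are open.
        if len(bins) < k:
            bins.append([item])
            yield from place(index + 1, bins)
            bins.pop()

    yield from place(0, [])

for p in unordered_partitions([1, 2, 3], 2):
    print(p)
# [[1, 2], [3]]
# [[1, 3], [2]]
# [[1], [2, 3]]
```

Because a new bin is only ever opened by the earliest item not yet placed, each partition is produced exactly once, with bins listed in order of their first element.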

Merge List of lists where sublists have common elements

I have a list of lists like this
list = [[1, 2], [1, 3], [4, 5]]
and as you see the first element of the first two sublists is repeated
So I want my output to be:
list = [[1, 2, 3], [4, 5]]
Thank you
The following code should solve your problem:
def merge_subs(lst_of_lsts):
    res = []
    for row in lst_of_lsts:
        for i, resrow in enumerate(res):
            if row[0]==resrow[0]:
                res[i] += row[1:]
                break
        else:
            # copy the row so the += above never mutates the input lists
            res.append(list(row))
    return res
Note that the else belongs to the inner for and is executed only if the loop finishes without hitting the break.
I have a solution that builds a dict first with the 1st values, then creates a list from that, but the order may not be the same (i.e. [4, 5] may be before [1, 2, 3]):
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> map(lambda x: d[x[0]].append(x[1]), l)
[None, None, None]
>>> d
defaultdict(<type 'list'>, {1: [2, 3], 4: [5]})
>>> [[key] + list(val) for key, val in d.iteritems()]
[[1, 2, 3], [4, 5]]
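In Python 3.7 and later, a plain dict preserves insertion order, so the ordering concern mentioned above goes away. A minimal sketch under that assumption follows; the function name merge_by_first is hypothetical, and it assumes every sublist has at least one element.

```python
def merge_by_first(rows):
    """Group sublists by their first element, keeping first-seen order."""
    groups = {}
    for first, *rest in rows:
        # setdefault creates the list the first time a key is seen.
        groups.setdefault(first, []).extend(rest)
    return [[key] + values for key, values in groups.items()]

print(merge_by_first([[1, 2], [1, 3], [4, 5]]))  # [[1, 2, 3], [4, 5]]
```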
You can use Python sets, because intersections and unions are easy to compute with them. The code would be clearer, but the complexity would probably be comparable to the other solutions.
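One possible reading of the set-based idea, sketched in Python 3: merge any sublists that share at least one element, not only those with the same first element. That broader interpretation, the function name merge_overlapping, and the sorted output (which assumes the elements are orderable) are all assumptions of this sketch.

```python
def merge_overlapping(rows):
    """Merge sublists that have at least one element in common."""
    merged = []  # list of disjoint sets built so far
    for row in rows:
        current = set(row)
        disjoint = []
        for group in merged:
            if group & current:      # non-empty intersection: absorb the group
                current |= group
            else:
                disjoint.append(group)
        disjoint.append(current)
        merged = disjoint
    return [sorted(group) for group in merged]

print(merge_overlapping([[1, 2], [1, 3], [4, 5]]))  # [[1, 2, 3], [4, 5]]
```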
Although arguably unreadable:
# Note the _ after the list, otherwise you are redefining the list type in your scope
list_ = [[1, 2], [1, 3], [4, 5]]
from itertools import groupby
grouper = lambda l: [[k] + sum((v[1::] for v in vs), []) for k, vs in groupby(l, lambda x: x[0])]
print grouper(list_)
A more readable variant:
from collections import defaultdict
groups = defaultdict(list)
for vs in list_:
    groups[vs[0]] += vs[1:]
print groups.items()
Note that these solve a more generic form of your problem, instead of [[1, 2], [1, 3], [4, 5]] you could also have something like this: [[1, 2, 3], [1, 4, 5], [2, 4, 5, 6], [3]]
Explanation about the _. This is why you don't want to overwrite list:
spam = list()
print spam
# returns []
list = spam
print list
# returns []
spam = list()
# TypeError: 'list' object is not callable
As you can see above, by setting list = spam we broke the default behaviour of list().

Processing a list of lists

I have a list of lists, say:
arr = [[1, 2], [1, 3], [1, 4]]
I would like to append 100 to each of the inner lists. Output for the above example would be:
arr = [[1, 2, 100], [1, 3, 100], [1, 4, 100]]
I can of course do:
for elem in arr:
    elem.append(100)
But is there something more pythonic that can be done? Why does the following not work:
arr = [elem.append(100) for elem in arr]
The second version should be written as arr = [elem + [100] for elem in arr]. But the most Pythonic way, if you ask me, is the first one. The for construct has its own use, and it suits this case very well.
Note that your code also works - the only thing you have to know is that there is no need to assign result to variable in such a case:
arr = [[1, 2], [1, 3], [1, 4]]
[x.append(100) for x in arr]
After execution arr will contain the updated list [[1, 2, 100], [1, 3, 100], [1, 4, 100]], i.e. it works like an in-place update. This is not common practice, though, because the comprehension also builds a throwaway list of None values.
The same result you will get in the next case:
map(lambda x: x.append(100), arr)
As discussed, you can also use a list comprehension or map to build new lists and assign the result to a variable:
res = map(lambda x: x + [100], arr)
You can do
[a + [100] for a in arr]
The reason why your append doesn't work is that append doesn't return the list, but rather None.
Of course, this is more resource intensive than just doing append - you end up making copies of everything.
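A short Python 3 sketch contrasting the two behaviours described above. The variable names copied and returned are made up for illustration.

```python
arr = [[1, 2], [1, 3], [1, 4]]

# elem + [100] builds a new list each time; arr itself is left untouched.
copied = [elem + [100] for elem in arr]
print(copied)  # [[1, 2, 100], [1, 3, 100], [1, 4, 100]]
print(arr)     # [[1, 2], [1, 3], [1, 4]]

# append mutates each inner list in place and returns None,
# so the comprehension collects only None values.
returned = [elem.append(100) for elem in arr]
print(returned)  # [None, None, None]
print(arr)       # [[1, 2, 100], [1, 3, 100], [1, 4, 100]]
```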
This is the more pythonic way
for elem in arr:
    elem.append(100)
but as an option you can also try this:
[arr[i].append(100) for i in range(len(arr))]
print arr # It will return [[1, 2, 100], [1, 3, 100], [1, 4, 100]]

Python, working with list comprehensions

I have such code:
a = [[1, 1], [2, 1], [3, 0]]
I want to get two lists, the first contains elements of 'a', where a[][1] = 1, and the second - elements where a[][1] = 0. So
first_list = [[1, 1], [2, 1]]
second_list = [[3, 0]].
I can do such thing with two list comprehension:
first_list = [i for i in a if i[1] == 1]
second_list = [i for i in a if i[1] == 0]
But maybe there is another (more Pythonic, or shorter) way to do this? Thanks for your answers.
List comprehensions are very Pythonic and the recommended way of doing this. Your code is fine.
If you want to have it in a single line you could do something like
first_list, second_list = [i for i in a if i[1] == 1], [i for i in a if i[1] == 0]
Remember that, "Explicit is better than implicit."
Your code is fine
You can use sorted() and itertools.groupby() to do this, but I don't know that it would qualify as Pythonic per se:
>>> dict((k, list(v)) for (k, v) in itertools.groupby(sorted(a, key=operator.itemgetter(1)), operator.itemgetter(1)))
{0: [[3, 0]], 1: [[1, 1], [2, 1]]}
What about this:
In [1]: a = [[1, 1], [2, 1], [3, 0]]
In [2]: first_list = []
In [3]: second_list = []
In [4]: [first_list.append(i) if i[1] == 1 else second_list.append(i) for i in a]
Out[4]: [None, None, None]
In [5]: first_list, second_list
Out[5]: ([[1, 1], [2, 1]], [[3, 0]])
Instead of two sublists, I prefer a dict (or defaultdict, OrderedDict, Counter, etc.):
In [6]: from collections import defaultdict
In [7]: d = defaultdict(list)
In [8]: [d[i[1]].append(i) for i in a]
Out[8]: [None, None, None]
In [9]: d
Out[9]: {0: [[3, 0]], 1: [[1, 1], [2, 1]]}
If the lists are reasonably short then two list comprehensions will do fine: you shouldn't be worried about performance until your code is all working and you know it is too slow.
If your lists are long or the code runs often and you have demonstrated that it is a bottleneck then all you have to do is switch from list comprehensions to a for loop:
first_list, second_list = [], []
for element in a:
    if element[1] == 1:
        first_list.append(element)
    else:
        second_list.append(element)
which is both clear and easily extended to more cases.
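The single-pass loop above can be wrapped in a small reusable helper. This is only a sketch: the name split_by and its predicate parameter are hypothetical and do not come from the answer.

```python
def split_by(predicate, items):
    """Split items into (matching, non_matching) in a single pass."""
    matching, non_matching = [], []
    for item in items:
        # Pick the target list first, then append to it.
        (matching if predicate(item) else non_matching).append(item)
    return matching, non_matching

a = [[1, 1], [2, 1], [3, 0]]
first_list, second_list = split_by(lambda pair: pair[1] == 1, a)
print(first_list)   # [[1, 1], [2, 1]]
print(second_list)  # [[3, 0]]
```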
List comprehensions are great. If you want slightly simpler (but slightly longer) code, just use a for loop.
Yet another option would be filters and maps:
a = [[1, 1], [2, 1], [3, 0]]
g1=filter(lambda i: i[1]==1,a)
g1=map(lambda i: i[0],g1)
g2=filter(lambda i: i[1]==0,a)
g2=map(lambda i: i[0],g2)
print g1
print g2
