Python all combinations of a list of lists - python

So I have a list of lists of strings
[['a','b'],['c','d'],['e','f']]
and I want to get all possible combinations, such that the result is
[['a','b'],['c','d'],['e','f'],
['a','b','c','d'],['a','b','e','f'],['c','d','e','f'],
['a','b','c','d','e','f']]
So far I have come up with this code snippet
input = [['a','b'],['c','d'],['e','f']]
combs = []
for i in xrange(1, len(input)+1):
    els = [x for x in itertools.combinations(input, i)]
    combs.extend(els)
print combs
largely following an answer in this post.
But that results in
[(['a','b'],),(['c','d'],),(['e','f'],),
(['a','b'],['c','d']),(['a','b'],['e','f']),(['c','d'],['e','f']),
(['a','b'],['c', 'd'],['e', 'f'])]
and I am currently stumped, trying to find an elegant, pythonic way to unpack those tuples.

You can use itertools.chain.from_iterable to flatten the tuple of lists into a list. Example -
import itertools
input = [['a','b'],['c','d'],['e','f']]
combs = []
for i in range(1, len(input)+1):
    els = [list(itertools.chain.from_iterable(x)) for x in itertools.combinations(input, i)]
    combs.extend(els)
Demo -
>>> import itertools
>>> input = [['a','b'],['c','d'],['e','f']]
>>> combs = []
>>> for i in range(1, len(input)+1):
...     els = [list(itertools.chain.from_iterable(x)) for x in itertools.combinations(input, i)]
...     combs.extend(els)
...
>>> import pprint
>>> pprint.pprint(combs)
[['a', 'b'],
 ['c', 'd'],
 ['e', 'f'],
 ['a', 'b', 'c', 'd'],
 ['a', 'b', 'e', 'f'],
 ['c', 'd', 'e', 'f'],
 ['a', 'b', 'c', 'd', 'e', 'f']]
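The same idea can be wrapped in a small Python 3 function. This is only a sketch; the name flattened_combinations is made up for illustration and is not part of itertools.

```python
from itertools import chain, combinations

def flattened_combinations(lists):
    """Return every non-empty combination of the sublists, each flattened into one list."""
    result = []
    for size in range(1, len(lists) + 1):
        for combo in combinations(lists, size):
            # combo is a tuple of sublists; chain.from_iterable joins them end to end
            result.append(list(chain.from_iterable(combo)))
    return result

data = [['a', 'b'], ['c', 'd'], ['e', 'f']]
combs = flattened_combinations(data)
```

The result is grouped by combination size, in the same order as the demo output above.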

One idea is to map each integer i in [0, 2**n - 1], where n is the number of sublists, to one combination according to a very simple rule:
Take the element at index k if (2**k) & i != 0. In other words, read i bitwise and keep the element of l that corresponds to each set bit. Mathematically this is one of the cleanest ways to achieve what you want, since it closely follows the definition of the power set (a set with n elements has exactly 2**n subsets).
Not tested, but something like this should work:
l = [['a','b'],['c','d'],['e','f']]
n = len(l)
output = []
for i in range(2**n):
    s = []
    for k in range(n):
        if (2**k) & i:
            s = s + l[k]
    output.append(s)
If you don't want the empty list, just replace the relevant line with:
for i in range(1,2**n):
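Since the snippet above is described as untested, here is a small check of the bit-mask idea. It is a sketch only: the function name subsets_by_bitmask and the include_empty flag are hypothetical, and it writes 1 << k where the answer writes 2**k (the two are equivalent).

```python
def subsets_by_bitmask(lists, include_empty=False):
    """Build one flattened combination per integer mask; bit k selects lists[k]."""
    n = len(lists)
    start = 0 if include_empty else 1
    output = []
    for mask in range(start, 2 ** n):
        combined = []
        for k in range(n):
            if mask & (1 << k):
                combined = combined + lists[k]
        output.append(combined)
    return output

l = [['a', 'b'], ['c', 'd'], ['e', 'f']]
result = subsets_by_bitmask(l)
```

The combinations come out in mask order (['a', 'b'], then ['c', 'd'], then ['a', 'b', 'c', 'd'], and so on) rather than grouped by size as in the itertools version.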

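If you want all combinations of exactly three sublists, you may consider this simple way (itertools.combinations yields each pair once, whereas a nested loop with i != j would also yield every pair in reverse order):
import itertools
a = [['a','b'],['c','d'],['e','f']]
a = a + [i + j for i, j in itertools.combinations(a, 2)] + [list(itertools.chain.from_iterable(a))]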

With a list comprehension (l is the input list and itertools is imported):
combs=[sum(x,[]) for i in range(len(l)) for x in itertools.combinations(l,i+1)]

Related

How to generate combination of characters in a string at a particular position?

I have a string list :
li = ['a', 'b', 'c', 'd']
Using the following code in Python, I generated all possible combinations of the characters in list li and got a result of 256 strings.
from itertools import product
li = ['a', 'b', 'c', 'd']
for comb in product(li, repeat=4):
    print(''.join(comb))
Say, for example, I know that the characters at the second and fourth positions of the string are 'b' and 'c'.
Then the result will be a set of only 16 strings:
abac
abbc
abcc
abdc
bbac
bbbc
bbcc
bbdc
cbac
cbbc
cbcc
cbdc
dbac
dbbc
dbcc
dbdc
How to get this result? Is there a Pythonic way to achieve this?
Thanks.
Edit: The list li I actually need runs from a to z, and the value for repeat is 13. When I tried the above code, Python threw a memory error!
Use list comprehension:
from itertools import product
li = ['a', 'b', 'c', 'd']
combs = [list(x) for x in product(li, repeat=4)]
selected_combs = [comb for comb in combs if (comb[1] == 'b' and comb[3] == 'c')]
print(["".join(comb) for comb in selected_combs])
# ['abac', 'abbc', 'abcc', 'abdc', 'bbac', 'bbbc', 'bbcc', 'bbdc', 'cbac', 'cbbc', 'cbcc', 'cbdc', 'dbac', 'dbbc', 'dbcc', 'dbdc']
To save memory in case you do not need all the combinations combs, you can simply do:
li = ['a', 'b', 'c', 'd']
selected_combs = [comb for comb in product(li, repeat=4) if (comb[1] == 'b' and comb[3] == 'c')]
print(["".join(comb) for comb in selected_combs])
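For the larger case mentioned in the edit (a to z with repeat=13), filtering the full product still has to visit every candidate. One alternative, sketched here under the assumption that the known positions are given as a dict of index to character, is to take the product over the unknown positions only and yield the strings lazily. The names strings_with_fixed and fixed are hypothetical.

```python
from itertools import product

def strings_with_fixed(alphabet, length, fixed):
    """Lazily yield strings of the given length whose characters at the
    positions in `fixed` (index -> character) are already known."""
    free = [i for i in range(length) if i not in fixed]
    template = [fixed.get(i) for i in range(length)]
    for choice in product(alphabet, repeat=len(free)):
        chars = list(template)
        # fill only the unknown positions with the current choice
        for pos, ch in zip(free, choice):
            chars[pos] = ch
        yield ''.join(chars)

li = ['a', 'b', 'c', 'd']
matches = list(strings_with_fixed(li, 4, {1: 'b', 3: 'c'}))
```

Because it is a generator, the results can be consumed one at a time instead of being stored in a list. The number of results still grows as len(alphabet) ** (number of free positions), so very long strings remain expensive to enumerate.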
def permute(s):
    out = []
    if len(s) == 1:
        return s
    else:
        for i, let in enumerate(s):
            for perm in permute(s[:i] + s[i+1:]):
                out += [let + perm]
    return out
per = permute(['a', 'b', 'c', 'd'])
print(per)
Do you want this?

How to find duplicate values in a list and merge them

For example, if you have a list like:
l = ['a','b','a','b','c','c']
The output should be:
[['a','a'],['b','b'],['c','c']]
So basically, put the duplicated values together into sublists.
I tried:
l = ['a','b','a','b','c','c']
it=iter(sorted(l))
next(it)
new_l=[]
for i in sorted(l):
    new_l.append([])
    if next(it,None)==i:
        new_l[-1].append(i)
    else:
        new_l.append([])
But it doesn't work, and even if it did, it would not be efficient.
Sort the list then use itertools.groupby:
>>> from itertools import groupby
>>> l = ['a','b','a','b','c','c']
>>> [list(g) for _, g in groupby(sorted(l))]
[['a', 'a'], ['b', 'b'], ['c', 'c']]
EDIT: this is probably not the fastest approach, sorting is O(n log n) time complexity for the average case and not required for all solutions (see the comments)
Use collections.Counter:
from collections import Counter
l = ['a','b','a','b','c','c']
c = Counter(l)
print([[x] * y for x, y in c.items()])
# [['a', 'a'], ['b', 'b'], ['c', 'c']]
You can use collections.Counter:
from collections import Counter
[[k] * c for k, c in Counter(l).items()]
This returns:
[['a', 'a'], ['b', 'b'], ['c', 'c']]
%%timeit comparison
Given a sample dataset of 100000 values, this answer is the fastest approach.
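A single-pass alternative that avoids sorting can be sketched with a plain dict (the name group_equal is hypothetical). Unlike [k] * c, it keeps the original objects, and on Python 3.7+ the groups appear in order of first occurrence. It assumes the items are hashable.

```python
def group_equal(items):
    """Group equal (hashable) items together in one pass, without sorting."""
    groups = {}
    for item in items:
        # setdefault creates the group on first sight, then we append to it
        groups.setdefault(item, []).append(item)
    return list(groups.values())

l = ['a', 'b', 'a', 'b', 'c', 'c']
grouped = group_equal(l)
```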
Another approach is to use zip. Note that this only works when every value occurs the same number of times:
l = ['a','b','a','b','c','c','b','c', 'a']
l = sorted(l)
grouped = [list(item) for item in list(zip(*[iter(l)] * l.count(l[0])))]
Output
[['a', 'a', 'a'], ['b', 'b', 'b'], ['c', 'c', 'c']]
Here's a functional solution via itertools.groupby. As it requires sorting, this will have time complexity O(n log n).
from itertools import groupby
from operator import itemgetter
L = ['a','b','a','b','c','c']
res = list(map(list, map(itemgetter(1), groupby(sorted(L)))))
[['a', 'a'], ['b', 'b'], ['c', 'c']]
The syntax is cumbersome since Python does not offer native function composition. This is supported by 3rd party library toolz:
from toolz import compose
foo = compose(list, itemgetter(1))
res = list(map(foo, groupby(sorted(L))))
My solution using list comprehension would be (l is a list):
[l.count(x) * [x] for x in set(l)]
set(l) retrieves every element that appears in l, without duplicates.
l.count(x) returns the number of times a specific element x appears in the list l.
The * operator creates a new list with the elements of a list (in this case, [x]) repeated the specified number of times (in this case, l.count(x)).
l = ['a','b','a','b','c','c']
want = []
for i in set(l):
    want.append(list(filter(lambda x: x == i, l)))
print(want)
Probably not the most efficient, but this is understandable:
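l = ['a','b','a','b','c','c']
counts = {}  # renamed from dict to avoid shadowing the built-in
for i in l:
    if i in counts:  # dict[i] would raise KeyError for a key not seen yet
        counts[i] += 1
    else:
        counts[i] = 1
new = []
for key in counts:
    new.append([key] * counts[key])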

All possible combinations for 3 digits of list never the same

I have a list that looks like:
A
B
C
D
E
F
G
How do I find all combinations of 3 letters? The same letter cannot be used twice in the same row.
ABC
ABD
ABE
ABF
ABG
AGB
E.g something like...:
x = ['a','b','c','d','e']
n = 3
import itertools
aa = [list(comb) for i in range(1, n+2) for comb in itertools.combinations(x, i)]
print(aa)
This does not give the desired output:
[['a'], ['b'], ['c'], ['d'], ['e'], ['a', 'b'], ['a', 'c'], ['a', 'd'], ['a', 'e'], ['b', 'c'], ['b', 'd'], ['b', 'e'], ['c'
The standard library module itertools already provides the functionality you are trying to implement. In fact, your code already uses it.
itertools.combinations(x, 3) returns all 3-combinations of x. To collect them in a list, you can use .extend() as follows:
x = ['a','b','c','d','e']
n = 3
import itertools
permutations = []
combinations = []
combinations.extend(itertools.combinations(x,n))
permutations.extend(itertools.permutations(x,n))
print("Permutations;", permutations)
print("\n")
print("Combinations;", combinations)
Additionally, I suggest you read up on the difference between combinations and permutations. As I understand your question, permutations are what you want. (If you run the code I shared, you will see the difference easily.)
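To make the difference concrete, here is a short sketch that counts both for the example list. Combinations ignore order, while permutations treat every ordering as a separate result.

```python
from itertools import combinations, permutations

x = ['a', 'b', 'c', 'd', 'e']
n = 3

combos = [''.join(c) for c in combinations(x, n)]   # order does not matter
perms = [''.join(p) for p in permutations(x, n)]    # every ordering counts separately

print(len(combos), combos[:3])
print(len(perms), perms[:4])
```

For five letters taken three at a time there are 10 combinations but 60 permutations; 'acb' appears only among the permutations.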
To understand how the solution process works, try the following:
# get all combinations of n items from given list
def getCombinations(items, n):
    if len(items) < n: return []  # need more items than are remaining
    if n == 0: return ['']  # need no more items, return the combination of no items
    [fst, *rst] = items
    # all combinations including the first item in the list
    including = [fst + comb for comb in getCombinations(rst, n-1)]
    # all combinations excluding the first item in the list
    excluding = getCombinations(rst, n)
    both = including + excluding
    return both
x = ['a','b','c','d','e']
n = 3
print(getCombinations(x, n))
# ['abc', 'abd', 'abe', 'acd', 'ace', 'ade', 'bcd', 'bce', 'bde', 'cde']
combinations accepts any iterable, so it works on lists as well as strings; joining x into a string first, as done here with ''.join(x), is optional:
from itertools import combinations
x = ['a', 'b', 'c', 'd', 'e']
n = 3
aa = combinations(''.join(x), n)
for comb in aa:
    print(''.join(comb))
OUTPUT
abc
abd
abe
acd
ace
ade
bcd
bce
bde
cde
Or as a one-liner:
[''.join(comb) for comb in combinations(''.join(x), n)]

Separating a String

Given a string, I want to generate all possible combinations. In other words, all possible ways of putting a comma somewhere in the string.
For example:
input: ["abcd"]
output: ["abcd"]
["abc","d"]
["ab","cd"]
["ab","c","d"]
["a","bc","d"]
["a","b","cd"]
["a","bcd"]
["a","b","c","d"]
I am a bit stuck on how to generate all the possible lists. Combinations will just give me lists with length of subset of the set of strings, permutations will give all possible ways to order.
I can make all the cases with only one comma in the list because of iterating through the slices, but I can't make cases with two commas like "ab","c","d" and "a","b","cd"
My attempt w/slice:
test="abcd"
for x in range(len(test)):
    print test[:x],test[x:]
How about something like:
from itertools import combinations
def all_splits(s):
    for numsplits in range(len(s)):
        for c in combinations(range(1,len(s)), numsplits):
            split = [s[i:j] for i,j in zip((0,)+c, c+(None,))]
            yield split
after which:
>>> for x in all_splits("abcd"):
...     print(x)
...
['abcd']
['a', 'bcd']
['ab', 'cd']
['abc', 'd']
['a', 'b', 'cd']
['a', 'bc', 'd']
['ab', 'c', 'd']
['a', 'b', 'c', 'd']
You can certainly use itertools for this, but I think it's easier to write a recursive generator directly:
def gen_commas(s):
    yield s
    for prefix_len in range(1, len(s)):
        prefix = s[:prefix_len]
        for tail in gen_commas(s[prefix_len:]):
            yield prefix + "," + tail
Then
print list(gen_commas("abcd"))
prints
['abcd', 'a,bcd', 'a,b,cd', 'a,b,c,d', 'a,bc,d', 'ab,cd', 'ab,c,d', 'abc,d']
I'm not sure why I find this easier. Maybe just because it's dead easy to do it directly ;-)
You could generate the power set of the n - 1 places that you could put commas:
what's a good way to combinate through a set?
and then insert commas in each position.
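A sketch of that idea, assuming bit k of an integer mask decides whether a comma goes between characters k and k + 1 (the function name comma_placements is hypothetical):

```python
def comma_placements(s):
    """Yield s with commas inserted at every subset of the len(s) - 1 gaps."""
    gaps = len(s) - 1
    for mask in range(2 ** max(gaps, 0)):
        parts = [s[0]] if s else []
        for k in range(gaps):
            if mask & (1 << k):
                parts.append(',')      # bit k set: comma between s[k] and s[k+1]
            parts.append(s[k + 1])
        yield ''.join(parts)

results = list(comma_placements("abcd"))
```

For a string of length n this produces 2 ** (n - 1) results, including the original string with no commas (mask 0).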
Using itertools:
import itertools
input_str = "abcd"
for k in range(1,len(input_str)):
    for subset in itertools.combinations(range(1,len(input_str)), k):
        s = list(input_str)
        for i,x in enumerate(subset): s.insert(x+i, ",")
        print "".join(s)
Gives:
a,bcd
ab,cd
abc,d
a,b,cd
a,bc,d
ab,c,d
a,b,c,d
Also a recursive version:
def commatoze(s,p=1):
    if p == len(s):
        print s
        return
    commatoze(s[:p] + ',' + s[p:], p + 2)
    commatoze(s, p + 1)
input_str = "abcd"
commatoze(input_str)
You can solve the integer composition problem and use the compositions to guide where to split the list. Integer composition can be solved fairly easily with a little bit of dynamic programming.
def composition(n):
    if n == 1:
        return [[1]]
    comp = composition(n - 1)
    return [x + [1] for x in comp] + [y[:-1] + [y[-1]+1] for y in comp]
def split(lst, guide):
    ret = []
    total = 0
    for g in guide:
        ret.append(lst[total:total+g])
        total += g
    return ret
lst = list('abcd')
for guide in composition(len(lst)):
    print split(lst, guide)
Another way to generate integer composition:
from itertools import groupby
def composition(n):
    for i in xrange(2**(n-1)):
        yield [len(list(group)) for _, group in groupby('{0:0{1}b}'.format(i, n))]
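A Python 3 sketch that combines this generator with a splitting helper, assuming n >= 1. The names compositions and split_by are hypothetical. Each run of equal bits in the n-digit binary form of i becomes one piece length.

```python
from itertools import groupby

def compositions(n):
    """Yield every composition of n (n >= 1) as a list of positive integers."""
    for i in range(2 ** (n - 1)):
        bits = format(i, "0{}b".format(n))   # n binary digits, leading zeros kept
        yield [len(list(run)) for _, run in groupby(bits)]

def split_by(seq, sizes):
    """Cut seq into consecutive pieces whose lengths are given by sizes."""
    pieces, start = [], 0
    for size in sizes:
        pieces.append(seq[start:start + size])
        start += size
    return pieces

splits = [split_by("abcd", sizes) for sizes in compositions(4)]
```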
Given
import more_itertools as mit
Code
list(mit.partitions("abcd"))
Output
[[['a', 'b', 'c', 'd']],
[['a'], ['b', 'c', 'd']],
[['a', 'b'], ['c', 'd']],
[['a', 'b', 'c'], ['d']],
[['a'], ['b'], ['c', 'd']],
[['a'], ['b', 'c'], ['d']],
[['a', 'b'], ['c'], ['d']],
[['a'], ['b'], ['c'], ['d']]]
Install more_itertools via > pip install more-itertools.

Pythonic way to combine (interleave, interlace, intertwine) two lists in an alternating fashion?

I have two lists, the first of which is guaranteed to contain exactly one more item than the second. I would like to know the most Pythonic way to create a new list whose even-index values come from the first list and whose odd-index values come from the second list.
# example inputs
list1 = ['f', 'o', 'o']
list2 = ['hello', 'world']
# desired output
['f', 'hello', 'o', 'world', 'o']
This works, but isn't pretty:
list3 = []
while True:
try:
list3.append(list1.pop(0))
list3.append(list2.pop(0))
except IndexError:
break
How else can this be achieved? What's the most Pythonic approach?
If you need to handle lists of mismatched length (e.g. the second list is longer, or the first has more than one element more than the second), some solutions here will work while others will require adjustment. For more specific answers, see How to interleave two lists of different length? to leave the excess elements at the end, or How to elegantly interleave two lists of uneven length? to try to intersperse elements evenly, or Insert element in Python list after every nth element for the case where a specific number of elements should come before each "added" element.
Here's one way to do it by slicing:
>>> list1 = ['f', 'o', 'o']
>>> list2 = ['hello', 'world']
>>> result = [None]*(len(list1)+len(list2))
>>> result[::2] = list1
>>> result[1::2] = list2
>>> result
['f', 'hello', 'o', 'world', 'o']
There's a recipe for this in the itertools documentation (note: for Python 3):
from itertools import cycle, islice
def roundrobin(*iterables):
    "roundrobin('ABC', 'D', 'EF') --> A D E B F C"
    # Recipe credited to George Sakkis
    num_active = len(iterables)
    nexts = cycle(iter(it).__next__ for it in iterables)
    while num_active:
        try:
            for next in nexts:
                yield next()
        except StopIteration:
            # Remove the iterator we just exhausted from the cycle.
            num_active -= 1
            nexts = cycle(islice(nexts, num_active))
import itertools
print([x for x in itertools.chain.from_iterable(itertools.zip_longest(list1,list2)) if x])
I think this is the most pythonic way of doing it. Note that the if x filter also drops falsy items such as 0 or '' from the input lists.
In Python 2, this should do what you want:
>>> iters = [iter(list1), iter(list2)]
>>> print list(it.next() for it in itertools.cycle(iters))
['f', 'hello', 'o', 'world', 'o']
Without itertools and assuming l1 is 1 item longer than l2:
>>> sum(zip(l1, l2+[0]), ())[:-1]
('f', 'hello', 'o', 'world', 'o')
In python 2, using itertools and assuming that lists don't contain None:
>>> filter(None, sum(itertools.izip_longest(l1, l2), ()))
('f', 'hello', 'o', 'world', 'o')
If both lists have equal length, you can do:
[x for y in zip(list1, list2) for x in y]
As the first list has one more element, you can add it post hoc:
[x for y in zip(list1, list2) for x in y] + [list1[-1]]
Edit: To illustrate what is happening in that first list comprehension, this is how you would spell it out as a nested for loop:
result = []
for y in zip(list1, list2):  # y is a 2-tuple, containing one element from each list
    for x in y:  # iterate over the 2-tuple
        result.append(x)  # append each element individually
I know the question asks about two lists with one having one item more than the other, but I figured I would put this here for others who may find this question.
Here is Duncan's solution adapted to work with two lists of different sizes.
list1 = ['f', 'o', 'o', 'b', 'a', 'r']
list2 = ['hello', 'world']
num = min(len(list1), len(list2))
result = [None]*(num*2)
result[::2] = list1[:num]
result[1::2] = list2[:num]
result.extend(list1[num:])
result.extend(list2[num:])
result
This outputs:
['f', 'hello', 'o', 'world', 'o', 'b', 'a', 'r']
Here's a one liner that does it:
list3 = [ item for pair in zip(list1, list2 + [0]) for item in pair][:-1]
Here's a one liner using list comprehensions, w/o other libraries:
list3 = [sub[i] for i in range(len(list2)) for sub in [list1, list2]] + [list1[-1]]
Here is another approach, if you allow alteration of your initial list1 by side effect:
[list1.insert((i+1)*2-1, list2[i]) for i in range(len(list2))]
This one is based on Carlos Valiente's contribution above
with an option to alternate groups of multiple items and make sure that all items are present in the output :
A=["a","b","c","d"]
B=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]
def cyclemix(xs, ys, n=1):
    for p in range(0,int((len(ys)+len(xs))/n)):
        for g in range(0,min(len(ys),n)):
            yield ys[0]
            ys.append(ys.pop(0))
        for g in range(0,min(len(xs),n)):
            yield xs[0]
            xs.append(xs.pop(0))
print [x for x in cyclemix(A, B, 3)]
This will interlace lists A and B by groups of 3 values each:
['a', 'b', 'c', 1, 2, 3, 'd', 'a', 'b', 4, 5, 6, 'c', 'd', 'a', 7, 8, 9, 'b', 'c', 'd', 10, 11, 12, 'a', 'b', 'c', 13, 14, 15]
Might be a bit late, but here is yet another Python one-liner. This works when the two lists have equal or unequal size. One thing worth noting is that it will modify a and b. If that's an issue, you need to use other solutions.
a = ['f', 'o', 'o']
b = ['hello', 'world']
sum([[a.pop(0), b.pop(0)] for i in range(min(len(a), len(b)))],[])+a+b
['f', 'hello', 'o', 'world', 'o']
from itertools import chain
list(chain(*zip('abc', 'def'))) # Note: this only works for lists of equal length
['a', 'd', 'b', 'e', 'c', 'f']
itertools.zip_longest returns an iterator of tuple pairs with any missing elements in one list replaced with fillvalue=None (passing fillvalue=object lets you use None as a value). If you flatten these pairs, then filter fillvalue in a list comprehension, this gives:
>>> from itertools import zip_longest
>>> def merge(a, b):
...     return [
...         x for y in zip_longest(a, b, fillvalue=object)
...         for x in y if x is not object
...     ]
...
>>> merge("abc", "defgh")
['a', 'd', 'b', 'e', 'c', 'f', 'g', 'h']
>>> merge([0, 1, 2], [4])
[0, 4, 1, 2]
>>> merge([0, 1, 2], [4, 5, 6, 7, 8])
[0, 4, 1, 5, 2, 6, 7, 8]
Generalized to arbitrary iterables:
>>> def merge(*its):
...     return [
...         x for y in zip_longest(*its, fillvalue=object)
...         for x in y if x is not object
...     ]
...
>>> merge("abc", "lmn1234", "xyz9", [None])
['a', 'l', 'x', None, 'b', 'm', 'y', 'c', 'n', 'z', '1', '9', '2', '3', '4']
>>> merge(*["abc", "x"]) # unpack an iterable
['a', 'x', 'b', 'c']
Finally, you may want to return a generator rather than a list comprehension:
>>> def merge(*its):
...     return (
...         x for y in zip_longest(*its, fillvalue=object)
...         for x in y if x is not object
...     )
...
>>> merge([1], [], [2, 3, 4])
<generator object merge.<locals>.<genexpr> at 0x000001996B466740>
>>> next(merge([1], [], [2, 3, 4]))
1
>>> list(merge([1], [], [2, 3, 4]))
[1, 2, 3, 4]
If you're OK with other packages, you can try more_itertools.roundrobin:
>>> from more_itertools import roundrobin
>>> list(roundrobin('ABC', 'D', 'EF'))
['A', 'D', 'E', 'B', 'F', 'C']
My take:
a = "hlowrd"
b = "el ol"
def func(xs, ys):
    ys = iter(ys)
    for x in xs:
        yield x
        yield ys.next()
print [x for x in func(a, b)]
def combine(list1, list2):
    lst = []
    len1 = len(list1)
    len2 = len(list2)
    for index in range( max(len1, len2) ):
        if index+1 <= len1:
            lst += [list1[index]]
        if index+1 <= len2:
            lst += [list2[index]]
    return lst
How about numpy? It works with strings as well:
import numpy as np
np.array([[a,b] for a,b in zip([1,2,3],[2,3,4,5,6])]).ravel()
Result:
array([1, 2, 2, 3, 3, 4])
Stops on the shortest:
import collections.abc
from itertools import cycle
def interlace(*iters, next = next) -> collections.abc.Iterable:
    """
    interlace(i1, i2, ..., in) -> (
        i1-0, i2-0, ..., in-0,
        i1-1, i2-1, ..., in-1,
        .
        .
        .
        i1-n, i2-n, ..., in-n,
    )
    """
    return map(next, cycle([iter(x) for x in iters]))
Sure, resolving the next/__next__ method may be faster.
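A Python 3 usage sketch of the same one-liner with the question's lists (the simplified name interlace_py3 is hypothetical). map ends as soon as next raises StopIteration, so the result stops at the first exhausted iterator, which is exactly right when the first list has one extra item.

```python
from itertools import cycle

def interlace_py3(*iterables):
    """Alternate items from the iterables until one of them runs out."""
    iterators = [iter(it) for it in iterables]
    # cycle visits the iterators in turn; map stops when next() raises StopIteration
    return map(next, cycle(iterators))

list1 = ['f', 'o', 'o']
list2 = ['hello', 'world']
result = list(interlace_py3(list1, list2))
```

If the first iterable is the shorter one, the remaining items of the longer one are dropped, so this is not a general replacement for the zip_longest-based answers.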
Multiple one-liners inspired by answers to another question:
import itertools
list(itertools.chain.from_iterable(itertools.izip_longest(list1, list2, fillvalue=object)))[:-1]
[i for l in itertools.izip_longest(list1, list2, fillvalue=object) for i in l if i is not object]
[item for sublist in map(None, list1, list2) for item in sublist][:-1]
An alternative in a functional & immutable way (Python 3):
from itertools import zip_longest
from functools import reduce
reduce(lambda lst, zipped: [*lst, *zipped] if zipped[1] is not None else [*lst, zipped[0]], zip_longest(list1, list2),[])
Using a for loop, we can also achieve this easily:
list1 = ['f', 'o', 'o']
list2 = ['hello', 'world']
list3 = []
for i in range(len(list1)):
    #print(list3)
    list3.append(list1[i])
    if i < len(list2):
        list3.append(list2[i])
print(list3)
output :
['f', 'hello', 'o', 'world', 'o']
This can be shortened further with a list comprehension, but the loop is easier to understand.
My approach looks as follows:
from itertools import chain, zip_longest
def intersperse(*iterators):
    # A random object not occurring in the iterators
    filler = object()
    r = (x for x in chain.from_iterable(zip_longest(*iterators, fillvalue=filler)) if x is not filler)
    return r
list1 = ['f', 'o', 'o']
list2 = ['hello', 'world']
print(list(intersperse(list1, list2)))
It works for an arbitrary number of iterators and yields an iterator, so I applied list() in the print line.
def alternate_elements(small_list, big_list):
    mew = []
    count = 0
    for i in range(len(small_list)):
        mew.append(small_list[i])
        mew.append(big_list[i])
        count +=1
    return mew+big_list[count:]
if len(l2)>len(l1):
    res = alternate_elements(l1,l2)
else:
    res = alternate_elements(l2,l1)
print(res)
Here we swap the lists based on size before interleaving. Can someone provide a better solution with time complexity O(len(l1)+len(l2))?
I'd do the simple:
chain.from_iterable( izip( list1, list2 ) )
It'll come up with an iterator without creating any additional storage needs.
This is nasty but works no matter the size of the lists:
list3 = [
    element for element in
    list(itertools.chain.from_iterable([
        val for val in itertools.izip_longest(list1, list2)
    ]))
    if element is not None
]
Obviously late to the party, but here's a concise one for equal-length lists:
output = [e for sub in zip(list1,list2) for e in sub]
It generalizes for an arbitrary number of equal-length lists, too:
output = [e for sub in zip(list1,list2,list3) for e in sub]
etc.
I'm too old to be down with list comprehensions, so:
import operator
list3 = reduce(operator.add, zip(list1, list2))
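In Python 3, reduce lives in functools. A minimal sketch, assuming the question's list1 and list2: zip stops at the shorter list, so the extra last item of list1 is dropped unless it is appended separately, and reduce over tuples returns a tuple rather than a list.

```python
import operator
from functools import reduce

list1 = ['f', 'o', 'o']
list2 = ['hello', 'world']

# zip pairs items up to the shorter list; operator.add concatenates the pair tuples.
paired = reduce(operator.add, zip(list1, list2), ())

# Append whatever zip left over from the longer list (here the final 'o').
n = min(len(list1), len(list2))
list3 = list(paired) + list1[n:] + list2[n:]
```

The empty tuple passed as the initial value keeps reduce from raising a TypeError when either list is empty.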
