Combination of entries from N lists - python

I have N lists [the size of each list might vary] in python. The value of N itself might vary too.
Let us consider an example where the number of lists are 3 [for simplicity, each list has size 2]:
[A, B], [C, D], [E, F]
Now what I am expecting is something like this:
ACE, ADE, ACF, ADF, BCE, BDE, BCF, BDF
A combination of each entry from all the lists [we will avoid combining entries from the same list].
What would be the most efficient way of solving this problem? [I am unsure if a similar problem is available online, as I wasn't able to find one].

itertools.product does exactly what you need:
##separate lists
ls1 = ['A','B']
ls2 = ['C','D']
ls3 = ['E','F','G']
out = list(product(ls1,ls2,ls3))
##or, list of lists, using the * operator
ls = [['A','B'],['C','D'],['E','F','G']]
out = list(product(*ls))
##in both cases
print(out)
[('A', 'C', 'E'), ('A', 'C', 'F'), ('A', 'C', 'G'), ('A', 'D', 'E'), ('A', 'D', 'F'), ('A', 'D', 'G'), ('B', 'C', 'E'), ('B', 'C', 'F'), ('B', 'C', 'G'), ('B', 'D', 'E'), ('B', 'D', 'F'), ('B', 'D', 'G')]

Running this code:
from itertools import product
from functools import reduce
def combine_words(list1, list2):
word_pairs = product(list1, list2)
combined_words = []
for pair in word_pairs:
combined_words.append("".join(pair))
return combined_words
reduce(combine_words, [["A", "B"], ["C", "D"], ["E", "F"]])
Will return the following result:
['ACE', 'ACF', 'ADE', 'ADF', 'BCE', 'BCF', 'BDE', 'BDF']
It works regardless of the number of words in each list, or regardless of the number of lists inside the total list.

It turns out to be a product rather than a combination problem.
Because Python strings can be iterated over as if they were a list of characters this leads to several ways in which the input could be stated. (I like the third way as it can be less typing if spaces are not in any of the "words").
Code
from itertools import product
from typing import List, Union
def word_product(list_of_lists: Union[List[List[str]], List[str]]) -> str:
return ', '.join(''.join(values) for values in product(*list_of_lists))
example1 = [["A", "B"], ["C", "D"], ["E", "F"]]
print(word_product(example1))
example2 = ["AB", "CD", "EF"]
print(word_product(example2))
example3 = "AB CD EF".split()
print(word_product(example3))
Output
It is the same line for each of the equivalent example inputs:
ACE, ACF, ADE, ADF, BCE, BCF, BDE, BDF
ACE, ACF, ADE, ADF, BCE, BCF, BDE, BDF
ACE, ACF, ADE, ADF, BCE, BCF, BDE, BDF

Related

How can I generate all unique nested 2-tuples (nested pairings) of a set of n objects in Python?

By nested 2-tuples, I mean something like this: ((a,b),(c,(d,e))) where all tuples have two elements. I don't need different orderings of the elements, just the different ways of putting parentheses around them. For items = [a, b, c, d], there are 5 unique pairings, which are:
(((a,b),c),d)
((a,(b,c)),d)
(a,((b,c),d))
(a,(b,(c,d)))
((a,b),(c,d))
In a perfect world I'd also like to have control over the maximum depth of the returned tuples, so that if I generated all pairings of items = [a, b, c, d] with max_depth=2, it would only return ((a,b),(c,d)).
This problem turned up because I wanted to find a way to generate the results of addition on non-commutative, non-associative numbers. If a+b doesn't equal b+a, and a+(b+c) doesn't equal (a+b)+c, what are all the possible sums of a, b, and c?
I have made a function that generates all pairings, but it also returns duplicates.
import itertools
def all_pairings(items):
if len(items) == 2:
yield (*items,)
else:
for i, pair in enumerate(itertools.pairwise(items)):
for pairing in all_pairings(items[:i] + [pair] + items[i+2:]):
yield pairing
For example, it returns ((a,b),(c,d)) twice for items=[a, b, c, d], since it pairs up (a,b) first in one case and (c,d) first in the second case.
Returning duplicates becomes a bigger and bigger problem for larger numbers of items. With duplicates, the number of pairings grows factorially, and without duplicates it grows exponentially, according to the Catalan Numbers (https://oeis.org/A000108).
n
With duplicates: (n-1)!
Without duplicates: (2(n-1))!/(n!(n-1)!)
1
1
1
2
1
1
3
2
2
4
6
5
5
24
14
6
120
42
7
720
132
8
5040
429
9
40320
1430
10
362880
4862
Because of this, I have been trying to come up with an algorithm that doesn't need to search through all the possibilities, only the unique ones. Again, it would also be nice to have control over the maximum depth, but that could probably be added to an existing algorithm. So far I've been unsuccessful in coming up with an approach, and I also haven't found any resources that cover this specific problem. I'd appreciate any help or links to helpful resources.
Using a recursive generator:
items = ['a', 'b', 'c', 'd']
def split(l):
if len(l) == 1:
yield l[0]
for i in range(1, len(l)):
for a in split(l[:i]):
for b in split(l[i:]):
yield (a, b)
list(split(items))
Output:
[('a', ('b', ('c', 'd'))),
('a', (('b', 'c'), 'd')),
(('a', 'b'), ('c', 'd')),
(('a', ('b', 'c')), 'd'),
((('a', 'b'), 'c'), 'd')]
Check of uniqueness:
assert len(list(split(list(range(10))))) == 4862
Reversed order of the items:
items = ['a', 'b', 'c', 'd']
def split(l):
if len(l) == 1:
yield l[0]
for i in range(len(l)-1, 0, -1):
for a in split(l[:i]):
for b in split(l[i:]):
yield (a, b)
list(split(items))
[((('a', 'b'), 'c'), 'd'),
(('a', ('b', 'c')), 'd'),
(('a', 'b'), ('c', 'd')),
('a', (('b', 'c'), 'd')),
('a', ('b', ('c', 'd')))]
With maxdepth:
items = ['a', 'b', 'c', 'd']
def split(l, maxdepth=None):
if len(l) == 1:
yield l[0]
elif maxdepth is not None and maxdepth <= 0:
yield tuple(l)
else:
for i in range(1, len(l)):
for a in split(l[:i], maxdepth=maxdepth and maxdepth-1):
for b in split(l[i:], maxdepth=maxdepth and maxdepth-1):
yield (a, b)
list(split(items))
# or
list(split(items, maxdepth=3))
# or
list(split(items, maxdepth=2))
[('a', ('b', ('c', 'd'))),
('a', (('b', 'c'), 'd')),
(('a', 'b'), ('c', 'd')),
(('a', ('b', 'c')), 'd'),
((('a', 'b'), 'c'), 'd')]
list(split(items, maxdepth=1))
[('a', ('b', 'c', 'd')),
(('a', 'b'), ('c', 'd')),
(('a', 'b', 'c'), 'd')]
list(split(items, maxdepth=0))
[('a', 'b', 'c', 'd')]
Full-credit to mozway for the algorithm - my original idea was to represent the pairing in reverse-polish notation, which would not have lent itself to the following optimizations:
First, we replace the two nested loops:
for a in split(l[:i]):
for b in split(l[i:]):
yield (a, b)
-with itertools.product, which will itself cache the results of the inner split(...) call, as well as produce the pairing in internal C code, which will run much faster.
yield from product(split(l[:i]), split(l[i:]))
Next, we cache the results of the previous split(...) calls. To do this we must sacrifice the laziness of generators, as well as ensure that our function parameters are hashable. Explicitly, this means creating a wrapper that casts the input list to a tuple, and to modify the function body to return lists instead of yielding.
def split(l):
return _split(tuple(l))
def _split(l):
if len(l) == 1:
return l[:1]
res = []
for i in range(1, len(l)):
res.extend(product(_split(l[:i]), _split(l[i:])))
return res
We then decorate the function with functools.cache, to perform the caching. So putting it all together:
from itertools import product
from functools import cache
def split(l):
return _split(tuple(l))
#cache
def _split(l):
if len(l) == 1:
return l[:1]
res = []
for i in range(1, len(l)):
res.extend(product(_split(l[:i]), _split(l[i:])))
return res
Testing for following input-
test = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n']`
-produces the following timings:
Original: 5.922573089599609
Revised: 0.08888077735900879
I did also verify that the results matched the original exactly- order and all.
Again, full credit to mozway for the algorithm. I've just applied a few optimizations to speed it up a bit.

How to combine list elements in a tuple?

I am struggling to combine specific list elements from a tuple. Would love any feedback or help! I am new to Python, so apologies if this is not a good question.
If I have a tuple list like this:
tuple_1 = [('A', 'B', 'C', 'D'), ('A', 'H'), ('B', 'C', 'D', 'A')]
I want to combine elements 'B', 'C', and 'D' from every tuple in the list:
tuple_1_new = [('A', 'BCD'), ('A', 'H'), ('BCD', 'A')]
My code looks like this:
next_insert = [(iter(x)) for x in tuple_1]
tuple_1_new = [i + next(next_insert) + next(next_insert) if i == "B" else i for i in next_insert]
but when I print(tuple_1_new), it is giving me an output of:
[<tuple_iterator object at ...>, <tuple_iterator object at ...>, <tuple_iterator object at ...>]
I feel like my code is correct, but I'm confused with this output. Again, sorry if this is a dumb question. Would appreciate any help - thanks!
def foo(arr):
w = "".join(arr)
ind = w.find("BCD")
if ind >= 0:
ans = list(arr)
return tuple(ans[:ind] + ["BCD"] + ans[ind + 3:])
return arr
[foo(x) for x in tuple_1]
# [('A', 'BCD'), ('A', 'H'), ('BCD', 'A')]
Another solution, using generator:
tuple_1 = [("A", "B", "C", "D"), ("A", "H"), ("B", "C", "D", "A")]
def fn(x):
while x:
if x[:3] == ("B", "C", "D"):
yield "".join(x[:3])
x = x[3:]
else:
yield x[0]
x = x[1:]
out = [tuple(fn(t)) for t in tuple_1]
print(out)
Prints:
[('A', 'BCD'), ('A', 'H'), ('BCD', 'A')]
A list comprehension answer:
[tuple([t for t in tup if t not in ['B', 'C', 'D']] + [''.join([t for t in tup if t in ['B', 'C', 'D']])]) for tup in tuple_1]
Although not quite getting the desired output, prints:
[('A', 'BCD'), ('A', 'H', ''), ('A', 'BCD')]
Note: In a 'simple' list comprehension, the 'for x in iterable_name' creates a processing loop using 'x' (or a collection of names if expanding a tuple or performing a zip extraction) as variable(s) in each loop.
When performing list comprehension within a list comprehension (for loop inside for loop), each loop will contribute one or more variables, which obviously must not incur a name collision.
Assuming the strings are single letters like in your example:
tuple_1_new = [tuple(' '.join(t).replace('B C D', 'BCD').split())
for t in tuple_1]
Or with Regen:
tuple_1_new = [tuple(re.findall('BCD|.', ''.join(t)))
for t in tuple_1]

All permutations of an array in python

I have an array. I want to generate all permutations from that array, including single element, repeated element, change the order, etc. For example, say I have this array:
arr = ['A', 'B', 'C']
And if I use the itertools module by doing this:
from itertools import permutations
perms = [''.join(p) for p in permutations(['A','B','C'])]
print(perms)
Or using loop like this:
def permutations(head, tail=''):
if len(head) == 0:
print(tail)
else:
for i in range(len(head)):
permutations(head[:i] + head[i+1:], tail + head[i])
arr= ['A', 'B', 'C']
permutations(arr)
I only get:
['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']
But what I want is:
['A', 'B', 'C',
'AA', 'AB', 'AC', 'BB', 'BA', 'BC', 'CA', 'CB', 'CC',
'AAA', 'AAB', 'AAC', 'ABA', 'ABB', 'ACA', 'ACC', 'BBB', 'BAA', 'BAB', 'BAC', 'CCC', 'CAA', 'CCA'.]
The result is all permutations from the array given. Since the array is 3 element and all the element can be repetitive, so it generates 3^3 (27) ways. I know there must be a way to do this but I can't quite get the logic right.
A generator that would generate all sequences as you describe (which has infinite length if you would try to exhaust it):
from itertools import product
def sequence(xs):
n = 1
while True:
yield from (product(xs, repeat=n))
n += 1
# example use: print first 100 elements from the sequence
s = sequence('ABC')
for _ in range(100):
print(next(s))
Output:
('A',)
('B',)
('C',)
('A', 'A')
('A', 'B')
('A', 'C')
('B', 'A')
('B', 'B')
('B', 'C')
('C', 'A')
('C', 'B')
('C', 'C')
('A', 'A', 'A')
('A', 'A', 'B')
('A', 'A', 'C')
('A', 'B', 'A')
...
Of course, if you don't want tuples, but strings, just replace the next(s) with ''.join(next(s)), i.e.:
print(''.join(next(s)))
If you don't want the sequences to exceed the length of the original collection:
from itertools import product
def sequence(xs):
n = 1
while n <= len(xs):
yield from (product(xs, repeat=n))
n += 1
for element in sequence('ABC'):
print(''.join(element))
Of course, in that limited case, this will do as well:
from itertools import product
xs = 'ABC'
for s in (''.join(x) for n in range(len(xs)) for x in product(xs, repeat=n+1)):
print(s)
Edit: In the comments, OP asked for an explanation of the yield from (product(xs, repeat=n)) part.
product() is a function in itertools that generates the cartesian product of iterables, which is a fancy way to say that you get all possible combinations of elements from the first iterable, with elements from the second etc.
Play around with it a bit to get a better feel for it, but for example:
list(product([1, 2], [3, 4])) == [(1, 3), (1, 4), (2, 3), (2, 4)]
If you take the product of an iterable with itself, the same happens, for example:
list(product('AB', 'AB')) == [('A', 'A'), ('A', 'B'), ('B', 'A'), ('B', 'B')]
Note that I keep calling product() with list() around it here, that's because product() returns a generator and passing the generator to list() exhausts the generator into a list, for printing.
The final step with product() is that you can also give it an optional repeat argument, which tells product() to do the same thing, but just repeat the iterable a certain number of times. For example:
list(product('AB', repeat=2)) == [('A', 'A'), ('A', 'B'), ('B', 'A'), ('B', 'B')]
So, you can see how calling product(xs, repeat=n) will generate all the sequences you're after, if you start at n=1 and keep exhausting it for ever greater n.
Finally, yield from is a way to yield results from another generator one at a time in your own generator. For example, yield from some_gen is the same as:
for x in some_gen:
yield x
So, yield from (product(xs, repeat=n)) is the same as:
for p in (product(xs, repeat=n)):
yield p

Generate list combinations in Python - with rules

I have two list:
list1 = ["A", "B", "C", "D", "E"]
list2 = ["AA", "BB", "CC", "DD", "EE"]
I want to create every possible combination, as the following:
A-B-C-D-E
A-BB-C-D-E
A-B-C-DD-E...etc.
The rule is that two similar letter (like A-AA, B-BB) can't be in the combination at the same time, and the order is reversible (A-B-C-D-E and B-A-E-C-D is the same in content so i don't need them both)
How can I manage it with itertools?
TL;DR: map('-'.join, product(*zip(list1, list2))).
It's relatively simple to do using itertools, but you have to think through each step carefully.
First, you can zip your two lists together to get a sequence of tuples. Every element in your final result will include exactly one choice from each tuple.
>>> list1 = ["A", "B", "C", "D", "E"]
>>> list2 = ["AA", "BB", "CC", "DD", "EE"]
>>> list(zip(list1, list2))
[('A', 'AA'), ('B', 'BB'), ('C', 'CC'), ('D', 'DD'), ('E', 'EE')]
Next, we'll want the Cartesian product of each of the five tuples. This gives us the 32 different ways of choosing one of A or AA, then one of B or BB, etc. To do that, we use * to unpack the result of zip into five separate arguments for product.
>>> from itertools import product
>>> for x in product(*zip(list1, list2)):
... print(x)
...
('A', 'B', 'C', 'D', 'E')
('A', 'B', 'C', 'D', 'EE')
('A', 'B', 'C', 'DD', 'E')
('A', 'B', 'C', 'DD', 'EE')
('A', 'B', 'CC', 'D', 'E')
('A', 'B', 'CC', 'D', 'EE')
('A', 'B', 'CC', 'DD', 'E')
# etc
Once you have the product, each element of the product is a valid argument to '-'.join to create one of the strings in your desired set:
>>> for x in map('-'.join, product(*zip(list1, list2))):
... print(x)
...
A-B-C-D-E
A-B-C-D-EE
A-B-C-DD-E
A-B-C-DD-EE
A-B-CC-D-E
A-B-CC-D-EE
A-B-CC-DD-E
A-B-CC-DD-EE
A-BB-C-D-E
A-BB-C-D-EE
A-BB-C-DD-E
A-BB-C-DD-EE
A-BB-CC-D-E
A-BB-CC-D-EE
A-BB-CC-DD-E
A-BB-CC-DD-EE
AA-B-C-D-E
AA-B-C-D-EE
AA-B-C-DD-E
AA-B-C-DD-EE
AA-B-CC-D-E
AA-B-CC-D-EE
AA-B-CC-DD-E
AA-B-CC-DD-EE
AA-BB-C-D-E
AA-BB-C-D-EE
AA-BB-C-DD-E
AA-BB-C-DD-EE
AA-BB-CC-D-E
AA-BB-CC-D-EE
AA-BB-CC-DD-E
AA-BB-CC-DD-EE

how to find words out of given alphabets in ascending order

I have a sequence of
words=[a,b,c,d]
And I want to find words that can be made out of them in ascending order.
the result list has
[a,ab,abc,abcd,b,bc,bcd,c,cd,d]
how to do it.
I have the code but it has C and python mixed, can someone help me with its python equivalent.
here it goes:
word_list=input("Enter the word")
n=len(word_list)
newlist=[]
for(i=0;i<n;i++)
{
c=''
for(j=i;j<n;j++)
{
c.join(j)
newlist=append(c)
}
}
letters = input("Enter the word")
n = len(letters)
words = [letters[start:end+1] for start in range(n) for end in range(start, n)]
You can do it easily with itertools.combinations
Itertools has some great functions for this kind of thing. itertools.combinations does exactly what you want.
The syntax is:
itertools.combinations(iterable [, length] )
so you can enter your list of words directly as it is an iterable. As you want all the different lengths, you will have to do it in a for-loop to get a list of combinations for all lengths.
So if your words are:
words = ['a', 'b', 'c', 'd']
and you do:
import itertools
itertools.combinations(words, 2)
you will get back an itertools object which you can easily convert to a list with list():
list(itertools.combinations(words, 2))
which will return:
[('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
However, if you want a list of all lengths (i.e. including just 'a' and 'abc') then you can just extend the results of each individual list of each list onto another list of all lengths. So something like:
import itertools
words = ['a', 'b', 'c', 'd']
combinations = []
for l in range(1, len(words) + 1):
combinations.extend(list(itertools.combinations(words, l )))
and this will give you the result of:
[('a'), ('b'), ('c'), ('d'), ('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b, 'd'), ('c', 'd'), ('a', 'b', 'c'), ('a', 'b', 'd'), ('a', 'c', 'd'), ('b', 'c', 'd), ('a', 'b', 'c', 'd')]
and if you want these to be more readable (as strings rather than tuples), you can use a list comprehension...
combinations = [''.join(c) for c in combinations]
so now combinations is simply an array of the strings:
['a', 'b', 'c', 'd', 'ab', 'ac', 'ad', 'bc', 'bd', 'cd', 'abc', 'abd', 'acd', 'bcd', 'abcd']
you can use itertools :
>>> import itertools
>>> w=['a','b','c','d']
>>> result=[]
>>> for L in range(1, len(w)+1):
... for subset in itertools.combinations(w, L):
... result.append(''.join(subset))
...
>>> result
['a', 'b', 'c', 'd', 'ab', 'ac', 'ad', 'bc', 'bd', 'cd', 'abc', 'abd', 'acd', 'bcd', 'abcd']

Categories

Resources