I have an array and want to generate every sequence that can be built from its elements: single elements, repeated elements, different orderings, and so on. For example, say I have this array:
arr = ['A', 'B', 'C']
And if I use the itertools module by doing this:
from itertools import permutations
perms = [''.join(p) for p in permutations(['A','B','C'])]
print(perms)
Or a recursive function like this:
def permutations(head, tail=''):
    if len(head) == 0:
        print(tail)
    else:
        for i in range(len(head)):
            permutations(head[:i] + head[i+1:], tail + head[i])
arr = ['A', 'B', 'C']
permutations(arr)
I only get:
['ABC', 'ACB', 'BAC', 'BCA', 'CAB', 'CBA']
But what I want is:
['A', 'B', 'C',
'AA', 'AB', 'AC', 'BB', 'BA', 'BC', 'CA', 'CB', 'CC',
'AAA', 'AAB', 'AAC', 'ABA', 'ABB', 'ACA', 'ACC', 'BBB', 'BAA', 'BAB', 'BAC', 'CCC', 'CAA', 'CCA', ...]
The result is every sequence that can be formed from the given array. Since the array has 3 elements and each element can be repeated, there are 3^3 (27) sequences of length 3 alone, plus the 3 + 9 shorter ones. I know there must be a way to do this but I can't quite get the logic right.
Here is a generator that produces all the sequences you describe (it is infinite, so it never finishes if you try to exhaust it):
from itertools import product
def sequence(xs):
    n = 1
    while True:
        yield from product(xs, repeat=n)
        n += 1
# example use: print first 100 elements from the sequence
s = sequence('ABC')
for _ in range(100):
    print(next(s))
Output:
('A',)
('B',)
('C',)
('A', 'A')
('A', 'B')
('A', 'C')
('B', 'A')
('B', 'B')
('B', 'C')
('C', 'A')
('C', 'B')
('C', 'C')
('A', 'A', 'A')
('A', 'A', 'B')
('A', 'A', 'C')
('A', 'B', 'A')
...
Of course, if you don't want tuples, but strings, just replace the next(s) with ''.join(next(s)), i.e.:
print(''.join(next(s)))
If you don't want the sequences to exceed the length of the original collection:
from itertools import product
def sequence(xs):
    n = 1
    while n <= len(xs):
        yield from product(xs, repeat=n)
        n += 1
for element in sequence('ABC'):
    print(''.join(element))
Of course, in that limited case, this will do as well:
from itertools import product
xs = 'ABC'
for s in (''.join(x) for n in range(len(xs)) for x in product(xs, repeat=n+1)):
    print(s)
Edit: In the comments, OP asked for an explanation of the yield from (product(xs, repeat=n)) part.
product() is a function in itertools that generates the cartesian product of iterables, which is a fancy way to say that you get all possible combinations of elements from the first iterable, with elements from the second etc.
Play around with it a bit to get a better feel for it, but for example:
list(product([1, 2], [3, 4])) == [(1, 3), (1, 4), (2, 3), (2, 4)]
If you take the product of an iterable with itself, the same happens, for example:
list(product('AB', 'AB')) == [('A', 'A'), ('A', 'B'), ('B', 'A'), ('B', 'B')]
Note that I keep wrapping product() in list() here. That's because product() returns a lazy iterator, and passing it to list() exhausts it into a list for printing.
The final step with product() is that you can also give it an optional repeat argument, which tells product() to do the same thing, but just repeat the iterable a certain number of times. For example:
list(product('AB', repeat=2)) == [('A', 'A'), ('A', 'B'), ('B', 'A'), ('B', 'B')]
So, you can see how calling product(xs, repeat=n) will generate all the sequences you're after, if you start at n=1 and keep exhausting it for ever greater n.
Finally, yield from is a way to yield results from another generator one at a time in your own generator. For example, yield from some_gen is the same as:
for x in some_gen:
    yield x
So, yield from (product(xs, repeat=n)) is the same as:
for p in (product(xs, repeat=n)):
    yield p
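Because the first generator never terminates, a convenient way to sample it is itertools.islice. The following is a minimal sketch rather than part of the original answer: it restates sequence() so that it runs on its own, and the helper name first_n is made up for illustration.

```python
from itertools import islice, product

def sequence(xs):
    # Same generator as above: all length-1 tuples, then length-2, and so on.
    n = 1
    while True:
        yield from product(xs, repeat=n)
        n += 1

def first_n(xs, count):
    # Hypothetical helper: take the first `count` sequences, joined into strings.
    return [''.join(t) for t in islice(sequence(xs), count)]

print(first_n('ABC', 5))  # ['A', 'B', 'C', 'AA', 'AB']
```

For 'ABC', the first 3 + 9 + 27 = 39 results cover every sequence of length 1 to 3, ending with 'CCC'.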
By nested 2-tuples, I mean something like this: ((a,b),(c,(d,e))) where all tuples have two elements. I don't need different orderings of the elements, just the different ways of putting parentheses around them. For items = [a, b, c, d], there are 5 unique pairings, which are:
(((a,b),c),d)
((a,(b,c)),d)
(a,((b,c),d))
(a,(b,(c,d)))
((a,b),(c,d))
In a perfect world I'd also like to have control over the maximum depth of the returned tuples, so that if I generated all pairings of items = [a, b, c, d] with max_depth=2, it would only return ((a,b),(c,d)).
This problem turned up because I wanted to find a way to generate the results of addition on non-commutative, non-associative numbers. If a+b doesn't equal b+a, and a+(b+c) doesn't equal (a+b)+c, what are all the possible sums of a, b, and c?
I have made a function that generates all pairings, but it also returns duplicates.
import itertools
def all_pairings(items):
    if len(items) == 2:
        yield (*items,)
    else:
        for i, pair in enumerate(itertools.pairwise(items)):
            for pairing in all_pairings(items[:i] + [pair] + items[i+2:]):
                yield pairing
For example, it returns ((a,b),(c,d)) twice for items=[a, b, c, d], since it pairs up (a,b) first in one case and (c,d) first in the second case.
Returning duplicates becomes a bigger and bigger problem for larger numbers of items. With duplicates, the number of pairings grows factorially, and without duplicates it grows exponentially, according to the Catalan Numbers (https://oeis.org/A000108).
n     With duplicates: (n-1)!     Without duplicates: (2(n-1))!/(n!(n-1)!)
1     1                           1
2     1                           1
3     2                           2
4     6                           5
5     24                          14
6     120                         42
7     720                         132
8     5040                        429
9     40320                       1430
10    362880                      4862
Because of this, I have been trying to come up with an algorithm that doesn't need to search through all the possibilities, only the unique ones. Again, it would also be nice to have control over the maximum depth, but that could probably be added to an existing algorithm. So far I've been unsuccessful in coming up with an approach, and I also haven't found any resources that cover this specific problem. I'd appreciate any help or links to helpful resources.
Using a recursive generator:
items = ['a', 'b', 'c', 'd']
def split(l):
    if len(l) == 1:
        yield l[0]
    for i in range(1, len(l)):
        for a in split(l[:i]):
            for b in split(l[i:]):
                yield (a, b)
list(split(items))
Output:
[('a', ('b', ('c', 'd'))),
('a', (('b', 'c'), 'd')),
(('a', 'b'), ('c', 'd')),
(('a', ('b', 'c')), 'd'),
((('a', 'b'), 'c'), 'd')]
Sanity check on the count for 10 items (4862, the value from the table in the question):
assert len(list(split(list(range(10))))) == 4862
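The expected value 4862 comes from the closed form quoted in the question. As a rough sketch (the function names with_duplicates and without_duplicates are invented here for illustration), both columns of that table can be reproduced with the standard math module:

```python
from math import comb, factorial

def with_duplicates(n):
    # Count from the question's table: (n-1)!
    return factorial(n - 1)

def without_duplicates(n):
    # Catalan number C(n-1) = (2(n-1))! / (n! (n-1)!),
    # computed as comb(2(n-1), n-1) / n, which is always an integer.
    return comb(2 * (n - 1), n - 1) // n

for n in range(1, 11):
    print(n, with_duplicates(n), without_duplicates(n))
```

The last line printed is 10 362880 4862, matching the final row of the table.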
Reversed order of the items:
items = ['a', 'b', 'c', 'd']
def split(l):
    if len(l) == 1:
        yield l[0]
    for i in range(len(l)-1, 0, -1):
        for a in split(l[:i]):
            for b in split(l[i:]):
                yield (a, b)
list(split(items))
Output:
[((('a', 'b'), 'c'), 'd'),
(('a', ('b', 'c')), 'd'),
(('a', 'b'), ('c', 'd')),
('a', (('b', 'c'), 'd')),
('a', ('b', ('c', 'd')))]
With maxdepth:
items = ['a', 'b', 'c', 'd']
def split(l, maxdepth=None):
    if len(l) == 1:
        yield l[0]
    elif maxdepth is not None and maxdepth <= 0:
        yield tuple(l)
    else:
        for i in range(1, len(l)):
            for a in split(l[:i], maxdepth=maxdepth and maxdepth-1):
                for b in split(l[i:], maxdepth=maxdepth and maxdepth-1):
                    yield (a, b)
list(split(items))
# or
list(split(items, maxdepth=3))
# or
list(split(items, maxdepth=2))
[('a', ('b', ('c', 'd'))),
('a', (('b', 'c'), 'd')),
(('a', 'b'), ('c', 'd')),
(('a', ('b', 'c')), 'd'),
((('a', 'b'), 'c'), 'd')]
list(split(items, maxdepth=1))
[('a', ('b', 'c', 'd')),
(('a', 'b'), ('c', 'd')),
(('a', 'b', 'c'), 'd')]
list(split(items, maxdepth=0))
[('a', 'b', 'c', 'd')]
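The maxdepth parameter above flattens whatever is left once the depth budget runs out, whereas the question asked for max_depth=2 to return only ((a,b),(c,d)). One possible reading of that request, sketched here under the assumption that the items themselves are not tuples, is to filter the full enumeration by nesting depth. The helpers depth and split_max_depth are hypothetical names, and this still visits every pairing, so it is a simple illustration rather than an efficient solution.

```python
def split(l):
    # Unbounded recursive generator, as in the first version above.
    if len(l) == 1:
        yield l[0]
    for i in range(1, len(l)):
        for a in split(l[:i]):
            for b in split(l[i:]):
                yield (a, b)

def depth(t):
    # Hypothetical helper: nesting depth, counting a bare item as 0.
    if not isinstance(t, tuple):
        return 0
    return 1 + max(depth(x) for x in t)

def split_max_depth(l, max_depth):
    # Hypothetical helper: keep only pairings nested at most max_depth deep.
    return [p for p in split(l) if depth(p) <= max_depth]

print(split_max_depth(['a', 'b', 'c', 'd'], 2))  # [(('a', 'b'), ('c', 'd'))]
```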
Full credit to mozway for the algorithm. My original idea was to represent the pairing in reverse Polish notation, which would not have lent itself to the following optimizations.
First, we replace the two nested loops:
for a in split(l[:i]):
    for b in split(l[i:]):
        yield (a, b)
with itertools.product, which itself stores the results of the inner split(...) call and produces the pairs in C code, which runs much faster:
yield from product(split(l[:i]), split(l[i:]))
Next, we cache the results of the previous split(...) calls. To do this we must sacrifice the laziness of generators and ensure that our function parameters are hashable. Explicitly, this means creating a wrapper that casts the input list to a tuple, and modifying the function body to return lists instead of yielding.
def split(l):
    return _split(tuple(l))
def _split(l):
    if len(l) == 1:
        return l[:1]
    res = []
    for i in range(1, len(l)):
        res.extend(product(_split(l[:i]), _split(l[i:])))
    return res
We then decorate the function with functools.cache, to perform the caching. So putting it all together:
from itertools import product
from functools import cache
def split(l):
    return _split(tuple(l))
@cache
def _split(l):
    if len(l) == 1:
        return l[:1]
    res = []
    for i in range(1, len(l)):
        res.extend(product(_split(l[:i]), _split(l[i:])))
    return res
Testing with the following input:
test = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n']
produces the following timings:
Original: 5.922573089599609
Revised: 0.08888077735900879
I also verified that the results matched the original exactly, order and all.
Again, full credit to mozway for the algorithm. I've just applied a few optimizations to speed it up a bit.
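A check along those lines might look like the sketch below. It is not the script used for the timings above: the two versions are renamed split_gen and split_cached (names chosen here) so they can coexist, and functools.cache assumes Python 3.9 or newer.

```python
from functools import cache
from itertools import product

def split_gen(l):
    # Recursive generator version, renamed to avoid a name clash.
    if len(l) == 1:
        yield l[0]
    for i in range(1, len(l)):
        for a in split_gen(l[:i]):
            for b in split_gen(l[i:]):
                yield (a, b)

def split_cached(l):
    # Wrapper: tuples are hashable, so they can serve as cache keys.
    return _split(tuple(l))

@cache
def _split(l):
    if len(l) == 1:
        return l[:1]
    res = []
    for i in range(1, len(l)):
        res.extend(product(_split(l[:i]), _split(l[i:])))
    return res

items = list('abcdefgh')
# Same pairings in the same order.
assert list(split_gen(items)) == split_cached(items)
print(len(split_cached(items)))  # 429
```

One caveat with this design: the list returned for a given input is the cached object itself, so callers should treat it as read-only.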
I currently want all permutations of a set of elements with replacement.
Example:
elements = ['a', 'b']
permutations with replacement =
[('a', 'a', 'a'),
('a', 'a', 'b'),
('a', 'b', 'a'),
('a', 'b', 'b'),
('b', 'a', 'a'),
('b', 'a', 'b'),
('b', 'b', 'a'),
('b', 'b', 'b')]
The only way I have been able to do this so far is with itertools.product, as follows:
import itertools as it
sample_space = ['a', 'b']
outcomes = it.product(sample_space, sample_space, sample_space)
list(outcomes)
I am just wondering if there is a better way to do this, as it is obvious that this can get unwieldy and error-prone as the sample space and required length get larger.
I was expecting to find something along the lines of itertools.permutations(['a', 'b'], length=3, replace=True), maybe?
I tried itertools.permutations, but its only arguments are iterable and r, the required length.
The output for the above example using it.permutations(sample_space, 3) would be an empty list [], since r exceeds the number of elements.
If you're sampling with replacement, what you get is by definition not a permutation (which just means "rearrangement") of the set. So I wouldn't look to the permutations function for this. product is the right thing; e.g. itertools.product(['a','b'], repeat=3).
Note that if you sample from a two-element set with replacement N times, then you have effectively created an N-digit binary number, with all the possibilities that go with it. You've just swapped in 'a' and 'b' for 0 and 1.
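To make the binary-number analogy concrete, here is a small sketch. The helper names outcomes and from_binary are invented for illustration; the only library call is itertools.product with repeat.

```python
from itertools import product

def outcomes(symbols, n):
    # All length-n sequences over `symbols`, in the order product() emits them.
    return list(product(symbols, repeat=n))

def from_binary(i, n, zero='a', one='b'):
    # Hypothetical helper: write i as an n-digit binary number,
    # then swap each 0 for `zero` and each 1 for `one`.
    return tuple(one if bit == '1' else zero for bit in format(i, f'0{n}b'))

n = 3
# Counting from 0 to 2**n - 1 in binary reproduces product()'s output order.
assert outcomes('ab', n) == [from_binary(i, n) for i in range(2 ** n)]
print(outcomes('ab', n)[5])  # ('b', 'a', 'b'), i.e. binary 101
```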
I am trying to create a branch and bound algorithm. To do this, I would like to create an iterator object which yields all possible combinations of a list of items, of size 0 to n.
Take the following example to demonstrate:
import itertools as it
list_tmp = ['a', 'b', 'c', 'd']
tmp_it = sum([list(map(list, it.combinations(list_tmp, i))) for i in range(2 + 1)], [])
tmp_it is a list of all possible combinations of size 0 to 2. This code works perfectly for small list sizes, but I need to act on a larger list and so would like to preserve the iterator characteristics of the it.combinations object (generate the combinations on the fly), e.g.
for iteration in it.combinations(list_tmp, 2):
    print(iteration)
Is there any way of doing this for combinations of multiple sizes, rather than converting to a list and losing the characteristics of the iterator object?
You can do this using itertools.chain.from_iterable, which lazily evaluates its argument. Something like this:
tmp_it = it.chain.from_iterable(it.combinations(list_tmp, i) for i in range(2+1))
You can chain iterators:
>>> sizes = it.chain.from_iterable(it.combinations(list_tmp, i) for i in range(len(list_tmp)))
>>> for i in sizes:
...     print(i)
...
()
('a',)
('b',)
('c',)
('d',)
('a', 'b')
('a', 'c')
('a', 'd')
('b', 'c')
('b', 'd')
('c', 'd')
('a', 'b', 'c')
('a', 'b', 'd')
('a', 'c', 'd')
('b', 'c', 'd')
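To see that the chained iterator really stays lazy on a larger input, one can pull just a few items with itertools.islice. This is an illustrative sketch; the wrapper name combinations_up_to is an assumption, not something from the answers above.

```python
import itertools as it

def combinations_up_to(items, max_size):
    # Hypothetical helper: lazily yield combinations of size 0..max_size.
    return it.chain.from_iterable(
        it.combinations(items, size) for size in range(max_size + 1)
    )

# range(1000) has over 166 million combinations of size 0 to 3,
# but only the four requested here are ever generated.
lazy = combinations_up_to(range(1000), 3)
print(list(it.islice(lazy, 4)))  # [(), (0,), (1,), (2,)]
```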
I have a sequence:
words=[a,b,c,d]
I want to find the words that can be made out of them in ascending order.
The result list should be:
[a,ab,abc,abcd,b,bc,bcd,c,cd,d]
How can I do this?
I have some code, but it mixes C and Python. Can someone help me with the Python equivalent?
Here it is:
word_list=input("Enter the word")
n=len(word_list)
newlist=[]
for(i=0;i<n;i++)
{
c=''
for(j=i;j<n;j++)
{
c.join(j)
newlist=append(c)
}
}
letters = input("Enter the word")
n = len(letters)
words = [letters[start:end+1] for start in range(n) for end in range(start, n)]
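The same comprehension can be wrapped in a function so it is easy to try without typing input. This is a sketch only; the name contiguous_words is made up here, and ''.join is used so that both a string and a list of one-character strings work.

```python
def contiguous_words(letters):
    # Every run of consecutive items, ordered by start position and then by length.
    n = len(letters)
    return [''.join(letters[start:end + 1])
            for start in range(n)
            for end in range(start, n)]

print(contiguous_words('abcd'))
# ['a', 'ab', 'abc', 'abcd', 'b', 'bc', 'bcd', 'c', 'cd', 'd']
```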
You can also use itertools.combinations, with one caveat: it returns every order-preserving selection, including non-adjacent ones such as 'ac', so the result is a superset of the list in the question.
Itertools has some great functions for this kind of thing.
The syntax is:
itertools.combinations(iterable, r)
where r is the length of each combination. You can pass your list of words directly, as it is an iterable. Since you want all the different lengths, you will have to call it in a for-loop to collect the combinations for every length.
So if your words are:
words = ['a', 'b', 'c', 'd']
and you do:
import itertools
itertools.combinations(words, 2)
you will get back an itertools object which you can easily convert to a list with list():
list(itertools.combinations(words, 2))
which will return:
[('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
However, if you want the combinations of every length (i.e. including just 'a' and 'abc'), you can extend one overall list with the results for each length in turn. So something like:
import itertools
words = ['a', 'b', 'c', 'd']
combinations = []
for l in range(1, len(words) + 1):
    combinations.extend(itertools.combinations(words, l))
and this will give you the result of:
[('a',), ('b',), ('c',), ('d',), ('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd'), ('a', 'b', 'c'), ('a', 'b', 'd'), ('a', 'c', 'd'), ('b', 'c', 'd'), ('a', 'b', 'c', 'd')]
and if you want these to be more readable (as strings rather than tuples), you can use a list comprehension...
combinations = [''.join(c) for c in combinations]
so now combinations is simply an array of the strings:
['a', 'b', 'c', 'd', 'ab', 'ac', 'ad', 'bc', 'bd', 'cd', 'abc', 'abd', 'acd', 'bcd', 'abcd']
you can use itertools :
>>> import itertools
>>> w=['a','b','c','d']
>>> result=[]
>>> for L in range(1, len(w)+1):
...     for subset in itertools.combinations(w, L):
...         result.append(''.join(subset))
...
>>> result
['a', 'b', 'c', 'd', 'ab', 'ac', 'ad', 'bc', 'bd', 'cd', 'abc', 'abd', 'acd', 'bcd', 'abcd']
I am trying to write the code for an exercise given by my prof. I am to turn an arbitrary string into a list of pairs, using lists (I'm guessing .append()).
The example that I got was this:
string_to_list_in_pairs('abcd')
Should return:
['ab', 'bc', 'cd']
I was using the code that I learned in class:
def string_to_list_in_pairs(st):
    newL2 = []
    for i in range(len(st)):
        newL2.append(st[i])
    return newL2
but I'm stuck on the part where it turns it into pairs. I know that with this code I get
['a', 'b', 'c', 'd']
for string 'abcd'
What do I have to add in order to make it into pairs? (We didn't learn iterators; prof wants us to use what we learned.)
def string_to_list_in_pairs(s):
    return [''.join(pair) for pair in zip(s[:-1], s[1:])]
Example
>>> string_to_list_in_pairs('abcd')
['ab', 'bc', 'cd']
How it works
Note how the output looks. The first characters in the three strings are 'abc', which is just the input string without its last character: s[:-1].
Now, look at the last characters in the three strings. They are 'bcd', which is the input string without its first character: s[1:].
We can combine those two with zip and it looks like:
>>> s = 'abcd'
>>> s[:-1], s[1:]
('abc', 'bcd')
>>> list(zip(s[:-1], s[1:]))
[('a', 'b'), ('b', 'c'), ('c', 'd')]
This is almost the right answer. Each tuple in the list has the right characters. The only remaining issue is that it is a tuple and we want a string. To convert a tuple to a string, we apply join. This can be done for each tuple in that list via:
>>> [ ''.join(pair) for pair in [('a', 'b'), ('b', 'c'), ('c', 'd')] ]
['ab', 'bc', 'cd']
Or, putting it all together:
>>> [''.join(pair) for pair in zip(s[:-1], s[1:])]
['ab', 'bc', 'cd']
This is just what the function defined above does.
You can use itertools.combinations, although note that it returns every pair of characters, not only the adjacent ones:
>>> import itertools
>>> list(itertools.combinations('abcd', 2))
[('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
Two changes are required:
Adding -1 to range(len(st))
Adding +st[i+1] to newL2.append(st[i])
Then:
def string_to_list_in_pairs(st):
    newL2 = []
    for i in range(len(st)-1):
        newL2.append(st[i]+st[i+1])
    return newL2
Resulting in:
>>> string_to_list_in_pairs('abcd')
['ab', 'bc', 'cd']
Use itertools
from itertools import combinations
for i in combinations('abcd', 2):
    print(''.join(i))
from itertools import combinations
def string_to_list_in_pairs(st):
    newL2 = []
    for i in combinations(st, 2):
        newL2.append(''.join(i))
    return newL2
print(string_to_list_in_pairs('abcd'))
Result:
['ab', 'ac', 'ad', 'bc', 'bd', 'cd']
Using the code you provided you could try the following:
def string_to_list_in_pairs(st):
    newL2 = []
    for i in range(len(st)-1):
        newL2.append(st[i]+st[i+1])
    return newL2
The modification to your code reduces the number of iterations of the for loop by one. That way, each iteration can append the current character together with the one after it, making a pair, without indexing past the end of the string.
Here is an example without itertools. The rand_str line creates a random string and the for loop cuts it into two letter pieces:
from random import choice
rand_str = "".join(choice('ATCG') for _ in range(100))
duplex_list = []
for y in range(0, len(rand_str), 2):
    duplex_list.append(rand_str[y:y+2])
print(rand_str)
print(duplex_list)
My answer,
def string_to_list_in_pairs(st):
    """Compute a list with char pairs from a string."""
    # For strings with length 1 or 0
    if len(st) <= 1:
        return list(st)
    else:
        return [st[i:i+2] for i in range(len(st)-1)]
Results,
>>> string_to_list_in_pairs('')
[]
>>> string_to_list_in_pairs('a')
['a']
>>> string_to_list_in_pairs('ab')
['ab']
>>> string_to_list_in_pairs('abc')
['ab', 'bc']
>>> string_to_list_in_pairs('abcd')
['ab', 'bc', 'cd']
>>> string_to_list_in_pairs('abcde')
['ab', 'bc', 'cd', 'de']