My goal is to create 3 lists.
The first is the input: choose 3 letters from ABCD (repetition allowed) to create AAA, ABC, etc.
The second is the output: change the middle letter of each input to each of the other letters and collect the results in a new list, e.g. for AAA -> ABA, ACA, ADA. So it has 3 times the length of the input.
The third is the change: I want to name each change C_i, for example AAA->ABA is C1.
For the input:
>>> lis = ["A","B","C","D"]
>>> import itertools as it
>>> inp = list(it.product(lis, repeat = 3))
>>> print(inp)
[('A', 'A', 'A'), ('A', 'A', 'B'), ... ('D', 'D', 'C'), ('D', 'D', 'D')]
>>> len(inp)
64
But I am stuck on how to create the output list. Any idea is appreciated!
Thanks
You can use a list comprehension:
import itertools
lst = ['A', 'B', 'C', 'D']
lst_input = list(itertools.product(lst, repeat=3))
lst_output = [(tup[0], x, tup[2]) for tup in lst_input for x in lst if tup[1] != x]
lst_change = [f'C{i}' for i in range(1, len(lst_output) + 1)]
print(len(lst_input), len(lst_output), len(lst_change))
print(lst_input[:5])
print(lst_output[:5])
print(lst_change[:5])
# 64 192 192
# [('A', 'A', 'A'), ('A', 'A', 'B'), ('A', 'A', 'C'), ('A', 'A', 'D'), ('A', 'B', 'A')]
# [('A', 'B', 'A'), ('A', 'C', 'A'), ('A', 'D', 'A'), ('A', 'B', 'B'), ('A', 'C', 'B')]
# ['C1', 'C2', 'C3', 'C4', 'C5']
For each tuple in lst_input, the middle item is replaced by each candidate character in turn, but the replacement is thrown out if the replacement character is the same as the original one (if tup[1] != x).
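If it helps to see which input each label belongs to, the three lists can also be tied together in one dictionary. This is only a sketch: the name changes and the (before, after) value layout are assumptions for illustration, not something taken from the question.

```python
import itertools

letters = ['A', 'B', 'C', 'D']

# Hypothetical layout: map each label C1, C2, ... to a (before, after) pair.
changes = {}
for tup in itertools.product(letters, repeat=3):
    for x in letters:
        if x != tup[1]:  # skip the "change" that leaves the middle letter as-is
            before = ''.join(tup)
            after = tup[0] + x + tup[2]
            changes[f'C{len(changes) + 1}'] = (before, after)

print(len(changes))   # 192
print(changes['C1'])  # ('AAA', 'ABA')
```

The labels follow the same order as lst_output above, so C1 to C3 are the three changes of AAA.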
I'm trying to solve the general problem of getting the unique combinations from a list in Python.
From https://www.mathsisfun.com/combinatorics/combinations-permutations-calculator.html I can see that the formula for the number of combinations is n! / (r! (n-r)!), where n is the length of the sequence and r is the number to choose.
This is shown by the following Python, where n is 4 and r is 2:
import itertools
lst = 'ABCD'
result = list(itertools.combinations(lst, len(lst) // 2))
print(len(result))
6
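As a quick sanity check of the formula, math.comb (available from Python 3.8) computes the same binomial coefficient without building the combinations. This is a small sketch written for Python 3; the variable names expected and actual are just for illustration.

```python
import itertools
import math

lst = 'ABCD'
r = len(lst) // 2

# n! / (r! * (n - r)!) computed directly ...
expected = math.comb(len(lst), r)
# ... and by actually enumerating the combinations.
actual = len(list(itertools.combinations(lst, r)))

print(expected, actual)  # 6 6
```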
The following is a helper function to show the issue I have:
def C(lst):
    l = list(itertools.combinations(sorted(lst), len(lst) // 2))
    s = set(l)
    print('actual', len(l), l)
    print('unique', len(s), list(s))
If I run this from iPython I can call it thus:
In [41]: C('ABCD')
actual 6 [('A', 'B'), ('A', 'C'), ('A', 'D'), ('B', 'C'), ('B', 'D'), ('C', 'D')]
unique 6 [('B', 'C'), ('C', 'D'), ('A', 'D'), ('A', 'B'), ('A', 'C'), ('B', 'D')]
In [42]: C('ABAB')
actual 6 [('A', 'A'), ('A', 'B'), ('A', 'B'), ('A', 'B'), ('A', 'B'), ('B', 'B')]
unique 3 [('A', 'B'), ('A', 'A'), ('B', 'B')]
In [43]: C('ABBB')
actual 6 [('A', 'B'), ('A', 'B'), ('A', 'B'), ('B', 'B'), ('B', 'B'), ('B', 'B')]
unique 2 [('A', 'B'), ('B', 'B')]
In [44]: C('AAAA')
actual 6 [('A', 'A'), ('A', 'A'), ('A', 'A'), ('A', 'A'), ('A', 'A'), ('A', 'A')]
unique 1 [('A', 'A')]
What I want is the unique count shown above, but generating all the combinations and then building a set doesn't scale.
As n, the length of lst, grows, it slows down because the number of combinations grows rapidly.
Is there a way of using math or Python tricks to count the unique combinations?
Here's some Python code based on the generating function approach outlined in this Math Forum article. For each letter appearing in the input we create a polynomial 1 + x + x^2 + ... + x^k, where k is the number of times that the letter appears. We then multiply those polynomials together: the nth coefficient of the resulting polynomial then tells you how many combinations of length n there are.
We'll represent a polynomial simply as a list of its (integer) coefficients, with the first coefficient representing the constant term, the next coefficient representing the coefficient of x, and so on. We'll need to be able to multiply such polynomials, so here's a function for doing so:
def polymul(p, q):
    """
    Multiply two polynomials, represented as lists of coefficients.
    """
    r = [0]*(len(p) + len(q) - 1)
    for i, c in enumerate(p):
        for j, d in enumerate(q):
            r[i+j] += c*d
    return r
With the above in hand, the following function computes the number of combinations:
from collections import Counter
from functools import reduce

def ncombinations(it, k):
    """
    Number of combinations of length *k* of the elements of *it*.
    """
    counts = Counter(it).values()
    prod = reduce(polymul, [[1]*(count+1) for count in counts], [1])
    return prod[k] if k < len(prod) else 0
Testing this on your examples:
>>> ncombinations("abcd", 2)
6
>>> ncombinations("abab", 2)
3
>>> ncombinations("abbb", 2)
2
>>> ncombinations("aaaa", 2)
1
And on some longer examples, demonstrating that this approach is feasible even for long-ish inputs:
>>> ncombinations("abbccc", 3) # the math forum example
6
>>> ncombinations("supercalifragilisticexpialidocious", 10)
334640
>>> from itertools import combinations # double check ...
>>> len(set(combinations(sorted("supercalifragilisticexpialidocious"), 10)))
334640
>>> ncombinations("supercalifragilisticexpialidocious", 20)
1223225
>>> ncombinations("supercalifragilisticexpialidocious", 34)
1
>>> ncombinations("supercalifragilisticexpialidocious", 35)
0
>>> from string import printable
>>> ncombinations(printable, 50) # len(printable)==100
100891344545564193334812497256
>>> from math import factorial
>>> factorial(100)//factorial(50)**2 # double check the result
100891344545564193334812497256
>>> ncombinations("abc"*100, 100)
5151
>>> factorial(102)//factorial(2)//factorial(100) # double check (bars and stars)
5151
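To make the generating-function idea concrete, here is the "abbccc" case worked through step by step. It is a small sketch that repeats a compact polymul so it runs on its own; the names factors and prod are illustrative.

```python
from functools import reduce

def polymul(p, q):
    """Multiply two polynomials given as coefficient lists (constant term first)."""
    r = [0] * (len(p) + len(q) - 1)
    for i, c in enumerate(p):
        for j, d in enumerate(q):
            r[i + j] += c * d
    return r

# "abbccc": a appears once, b twice, c three times.
factors = [
    [1, 1],        # 1 + x              (zero or one 'a')
    [1, 1, 1],     # 1 + x + x^2        (zero to two 'b')
    [1, 1, 1, 1],  # 1 + x + x^2 + x^3  (zero to three 'c')
]
prod = reduce(polymul, factors, [1])

print(prod)     # [1, 3, 5, 6, 5, 3, 1]
print(prod[3])  # 6 distinct combinations of length 3
```

The coefficients sum to 24 = 2 * 3 * 4, the number of ways to pick a count for each letter independently.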
Start with a regular recursive definition of combinations() but add a test to only recurse when the lead value at that level hasn't been used before:
def uniq_comb(pool, r):
    """ Return an iterator over all distinct r-length
    combinations taken from a pool of values that
    may contain duplicates.

    Unlike itertools.combinations(), element uniqueness
    is determined by value rather than by position.
    """
    if r:
        seen = set()
        for i, item in enumerate(pool):
            if item not in seen:
                seen.add(item)
                for tail in uniq_comb(pool[i+1:], r-1):
                    yield (item,) + tail
    else:
        yield ()

if __name__ == '__main__':
    from itertools import combinations
    pool = 'ABRACADABRA'
    for r in range(len(pool) + 1):
        assert set(uniq_comb(pool, r)) == set(combinations(pool, r))
        assert dict.fromkeys(uniq_comb(pool, r)) == dict.fromkeys(combinations(pool, r))
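A caveat worth checking with this approach: on an unsorted pool the results are distinct as tuples, so ('A', 'B') and ('B', 'A') can both appear. If order-insensitive combinations are wanted, as in the counting question, sorting the pool first appears to be enough. The sketch below repeats the function so that it runs on its own; unsorted_result and sorted_result are illustrative names.

```python
def uniq_comb(pool, r):
    """Distinct r-length combinations from a pool that may contain duplicates."""
    if r:
        seen = set()
        for i, item in enumerate(pool):
            if item not in seen:
                seen.add(item)
                for tail in uniq_comb(pool[i+1:], r-1):
                    yield (item,) + tail
    else:
        yield ()

unsorted_result = list(uniq_comb('ABAB', 2))
sorted_result = list(uniq_comb(sorted('ABAB'), 2))

print(unsorted_result)  # [('A', 'B'), ('A', 'A'), ('B', 'A'), ('B', 'B')]
print(sorted_result)    # [('A', 'A'), ('A', 'B'), ('B', 'B')]
```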
It seems that this is called a multiset combination. I faced the same problem and ended up rewriting a function from sympy.
Instead of passing your iterable to something like itertools.combinations(p, r), pass collections.Counter(p).most_common() to the following function to retrieve the distinct combinations directly. It's a lot faster than filtering all combinations, and because it is a generator it is also easy on memory!
def counter_combinations(g, n):
    if sum(v for k, v in g) < n or not n:
        yield []
    else:
        for i, (k, v) in enumerate(g):
            if v >= n:
                yield [k]*n
                v = n - 1
            for v in range(min(n, v), 0, -1):
                for j in counter_combinations(g[i + 1:], n - v):
                    rv = [k]*v + j
                    if len(rv) == n:
                        yield rv
Here is an example:
from collections import Counter
p = Counter('abracadabra').most_common()
print(p)
c = list(counter_combinations(p, 4))
print(c)
print(len(c))
Output:
[('a', 5), ('b', 2), ('r', 2), ('c', 1), ('d', 1)]
[['a', 'a', 'a', 'a'], ['a', 'a', 'a', 'b'], ['a', 'a', 'a', 'r'], ['a', 'a', 'a', 'c'], ['a', 'a', 'a', 'd'], ['a', 'a', 'b', 'b'], ['a', 'a', 'b', 'r'], ['a', 'a', 'b', 'c'], ['a', 'a', 'b', 'd'], ['a', 'a', 'r', 'r'], ['a', 'a', 'r', 'c'], ['a', 'a', 'r', 'd'], ['a', 'a', 'c', 'd'], ['a', 'b', 'b', 'r'], ['a', 'b', 'b', 'c'], ['a', 'b', 'b', 'd'], ['a', 'b', 'r', 'r'], ['a', 'b', 'r', 'c'], ['a', 'b', 'r', 'd'], ['a', 'b', 'c', 'd'], ['a', 'r', 'r', 'c'], ['a', 'r', 'r', 'd'], ['a', 'r', 'c', 'd'], ['b', 'b', 'r', 'r'], ['b', 'b', 'r', 'c'], ['b', 'b', 'r', 'd'], ['b', 'b', 'c', 'd'], ['b', 'r', 'r', 'c'], ['b', 'r', 'r', 'd'], ['b', 'r', 'c', 'd'], ['r', 'r', 'c', 'd']]
31
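The count of 31 can be cross-checked with the brute-force approach from the question (sort, take all positional combinations, deduplicate with a set), which is still cheap at this size. The name brute is just for illustration.

```python
from itertools import combinations

# Brute force: positional combinations of the sorted letters, deduplicated by value.
brute = set(combinations(sorted('abracadabra'), 4))

print(len(brute))  # 31
```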
I want to remove tuples from the list when their first element matches that of a neighbouring tuple, since I am treating the letter pairs as having the same value regardless of their order. Here is the list I am trying to iterate through, called tuples2:
[(3, 'A', 'C'), (3, 'C', 'A'), (2, 'B', 'C'), (2, 'C', 'B'), (1, 'A', 'B'), (1, 'B', 'A')]
My current code:
for i in list(tuples2):
    if i[0] == i+1[0]:
        tuples2.remove(i)
print tuples2
...is throwing this error:
line 6: if i[0] == (i+1)[0]: TypeError: can only concatenate tuple (not "int") to tuple
How should I modify my code to account for this if I wanted to end up with
[(3, 'A', 'C'), (2, 'B', 'C'), (1, 'A', 'B')]?
There are a lot of red flags in this code. You should not modify a list while you are iterating over it; that will cause you to skip over items. EDIT: I now see that you copied the list in your for-loop, but the following approach is still a bit safer. You can iterate backwards, but it's probably easier to build a new list. A straightforward way is to keep track of already-seen first elements, and only add a tuple if you haven't seen its first element before:
In [1]: data = [(3, 'A', 'C'), (3, 'C', 'A'), (2, 'B', 'C'), (2, 'C', 'B'), (1, 'A', 'B'), (1, 'B', 'A')]
   ...:
In [2]: seen = set()
In [3]: new_data = []
   ...: for triple in data:
   ...:     first = triple[0]
   ...:     if first in seen:
   ...:         continue
   ...:     seen.add(first)
   ...:     new_data.append(triple)
   ...:
In [4]: new_data
Out[4]: [(3, 'A', 'C'), (2, 'B', 'C'), (1, 'A', 'B')]
Using .remove is very inefficient. It changes your algorithm into O(n^2) rather than O(n).
You could read into a dictionary keyed by the first components and then read out the values:
tuples = [(3, 'A', 'C'), (3, 'C', 'A'), (2, 'B', 'C'), (2, 'C', 'B'), (1, 'A', 'B'), (1, 'B', 'A')]
d = {x:(x,y,z) for x,y,z in tuples}
tuples = list(d.values())
Resulting tuples:
[(1, 'B', 'A'), (2, 'C', 'B'), (3, 'C', 'A')]
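Note that the dict comprehension keeps the last tuple seen for each key. If the first one should win, as in the expected output from the question, dict.setdefault is one way to do it. A minimal sketch, assuming Python 3.7+ where dicts keep insertion order; first_seen and result are illustrative names.

```python
tuples = [(3, 'A', 'C'), (3, 'C', 'A'), (2, 'B', 'C'), (2, 'C', 'B'), (1, 'A', 'B'), (1, 'B', 'A')]

first_seen = {}
for t in tuples:
    # setdefault only stores t if the key is not already present
    first_seen.setdefault(t[0], t)

result = list(first_seen.values())
print(result)  # [(3, 'A', 'C'), (2, 'B', 'C'), (1, 'A', 'B')]
```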
You have a misconception about what "for i in list(tuples2)" iterates over: with this syntax, i is not an index but the tuple itself. So you can't write i+1[0].
First, I recommend making a copy:
tuples_list = list(tuples2)
Then you can iterate by index using range (xrange in Python 2):
for i in range(len(tuples_list) - 1):
    if tuples_list[i][0] == tuples_list[i+1][0]:
        pass  # do what you want here
Just group by first element and take the first of each group.
>>> [next(g) for _, g in itertools.groupby(tuples2, lambda x: x[0])]
[(3, 'A', 'C'), (2, 'B', 'C'), (1, 'A', 'B')]
Or even simpler:
>>> tuples2[::2]
[(3, 'A', 'C'), (2, 'B', 'C'), (1, 'A', 'B')]
You can solve this using groupby from the itertools module, like this:
from itertools import groupby
a = [(3, 'A', 'C'), (3, 'C', 'A'), (2, 'B', 'C'), (2, 'C', 'B'), (1, 'A', 'B'), (1, 'B', 'A')]
final = [list(v)[0] for _,v in groupby(sorted(a), lambda x: x[0])]
print(final)
Output:
>>> [(1, 'A', 'B'), (2, 'B', 'C'), (3, 'A', 'C')]
Otherwise, if you need the final list in the same order as you gave in your question, you can reverse it:
final = list(reversed(final))
# OR
#final = sorted(final, reverse = True)
print(final)
Output:
>>> [(3, 'A', 'C'), (2, 'B', 'C'), (1, 'A', 'B')]
IMMEDIATE PROBLEM
What is this supposed to mean?
i+1[0]
i is a tuple; you are trying to use i as both the index and the element. The iteration you need is more like:
for i in range(len(tuples2) - 1):
    if tuples2[i] == tuples2[i+1]:
... which still doesn't do the job. This checks for equality of the entire tuple. However, you say that all you care about is equality of the first element. You would then want:
    if tuples2[i][0] == tuples2[i+1][0]:
This is in terms of your present code; others have shown you more "Pythonic" ways of doing this.
GENERAL SOLUTION
This code assumes that the other elements of the tuple are equal, that tuples with identical first elements are adjacent in the list, and that matching tuples come only in pairs. Is it possible that your list will include something such as:
tuples2 = [(3, 'A', 'C'), (3, 'C', 'A'),
(2, 'B', 'C'), (2, 'C', 'B'),
(1, 'A', 'B'), (1, 'A', 'Z'), (1, 'B', 'A')]
With the additional 'Z' element perhaps being buried between the "3" elements? Regardless, even if you sort the list, you get the "AZ" element between the other "1" elements.
If this is a problem for you, then I suggest that you first convert each tuple to a list, sorting the elements into order. For instance, this would convert (1, 'B', 'A') to [1, 'A', 'B']. Then use any of the given methods to eliminate duplicates, including the one you've already programmed. I generally do this by turning things back into tuples and then forming a set, which automatically eliminates duplicates.
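The normalization described above could look something like this. It is a sketch under the assumption that only the letters (not the leading number) should be order-insensitive; normalize is a made-up helper name, and the first tuple of each group is kept in its original form.

```python
tuples2 = [(3, 'A', 'C'), (3, 'C', 'A'),
           (2, 'B', 'C'), (2, 'C', 'B'),
           (1, 'A', 'B'), (1, 'A', 'Z'), (1, 'B', 'A')]

def normalize(t):
    """Hypothetical helper: keep the number, sort the letters."""
    return (t[0],) + tuple(sorted(t[1:]))

seen = set()
result = []
for t in tuples2:
    key = normalize(t)
    if key not in seen:   # first time this number/letter pair shows up
        seen.add(key)
        result.append(t)

print(result)
# [(3, 'A', 'C'), (2, 'B', 'C'), (1, 'A', 'B'), (1, 'A', 'Z')]
```

Because the key covers the whole normalized tuple, this does not rely on duplicates being adjacent or coming only in pairs.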
Given the iterable [A, B, C] and the function f(x) I want to get the following:
[ A, B, C]
[ A, B, f(C)]
[ A, f(B), C]
[ A, f(B), f(C)]
[f(A), B, C]
[f(A), B, f(C)]
[f(A), f(B), C]
[f(A), f(B), f(C)]
Unfortunately I didn't find anything suitable in the itertools module.
>>> from itertools import product
>>> L = ["A", "B", "C"]
>>> def f(c): return c.lower()
...
>>> fL = [f(x) for x in L]
>>> for i in product(*zip(L, fL)):
...     print(i)
...
('A', 'B', 'C')
('A', 'B', 'c')
('A', 'b', 'C')
('A', 'b', 'c')
('a', 'B', 'C')
('a', 'B', 'c')
('a', 'b', 'C')
('a', 'b', 'c')
Explanation:
Call f for each item in L to generate fL
>>> fL
['a', 'b', 'c']
Use zip to zip the two lists into pairs
>>> list(zip(L, fL))
[('A', 'a'), ('B', 'b'), ('C', 'c')]
Take the cartesian product of those tuples using itertools.product
product(*zip(L, fL))
is equivalent to
product(*[('A', 'a'), ('B', 'b'), ('C', 'c')])
and that is equivalent to
product(('A', 'a'), ('B', 'b'), ('C', 'c'))
Looping over that product gives exactly the result we need.
You can use itertools.combinations, like this
from itertools import combinations
def f(char):
    return char.lower()
iterable = ["A", "B", "C"]
indices = range(len(iterable))
for i in range(len(iterable) + 1):
    # items holds the positions to which f is applied in this round
    for items in combinations(indices, i):
        print([f(iterable[j]) if j in items else iterable[j] for j in indices])
Output
['A', 'B', 'C']
['a', 'B', 'C']
['A', 'b', 'C']
['A', 'B', 'c']
['a', 'b', 'C']
['a', 'B', 'c']
['A', 'b', 'c']
['a', 'b', 'c']
import itertools
def func_combinations(f, l):
    return itertools.product(*zip(l, map(f, l)))
Demo:
>>> for combo in func_combinations(str, range(3)):
...     print(combo)
...
(0, 1, 2)
(0, 1, '2')
(0, '1', 2)
(0, '1', '2')
('0', 1, 2)
('0', 1, '2')
('0', '1', 2)
('0', '1', '2')
This function first computes f once for every element of the input. Then, it uses zip to turn the input and the list of f values into a list of input-output pairs. Finally, it uses itertools.product to produce each possible way to select either input or output.
I have 2 variables - a and b. I need to fill up k places using these variables. So if k = 3 output should be
[a,a,a], [a,a,b] , [a,b,a], [b,a,a], [a,b,b], [b,a,b], [b,b,a] and [b,b,b]
Input - k
Output - All the combinations
How do I code this in Python? Can itertools be of any help here?
>>> import itertools
>>> list(itertools.product('ab', repeat=3))
[('a', 'a', 'a'), ('a', 'a', 'b'), ('a', 'b', 'a'), ('a', 'b', 'b'), ('b', 'a', 'a'), ('b', 'a', 'b'), ('b', 'b', 'a'), ('b', 'b', 'b')]
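Wrapped up as a function of k, and returning lists rather than tuples to match the output format in the question, this might look as follows. The name fill_places is just for illustration.

```python
from itertools import product

def fill_places(symbols, k):
    """All ways to fill k places with the given symbols, in product() order."""
    return [list(combo) for combo in product(symbols, repeat=k)]

result = fill_places('ab', 3)
print(len(result))  # 8
print(result[:3])   # [['a', 'a', 'a'], ['a', 'a', 'b'], ['a', 'b', 'a']]
```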
def genPerm(varslist, pos, resultLen, result, resultsList):
    # pos is the index of the place being filled (0-based)
    if pos >= resultLen:
        return
    for e in varslist:
        if pos == resultLen - 1:
            # last place filled: record the finished list
            resultsList.append(result + [e])
        else:
            genPerm(varslist, pos + 1, resultLen, result + [e], resultsList)
Call with:
resultsList = []
genPerm(['a', 'b'], 0, resLength, [], resultsList)