I have the following lists:
list1 = [ 'A','B','C']
list2 = [ '1', '2' ]
Trying to generate a new list of tuples with the following desired result:
[(A1),(A2),(A1,B1),(A1,B2),(A2,B1),(A2,B2),(A1,B1,C1),(A2,B1,C1)...]
Each tuple will eventually be used to write a single line in an output file.
Note that:
In each tuple, each letter from list1, if present, must come after all the preceding letters. For example, if 'B' is in a tuple, then 'A' must be in the tuple as well, before 'B'. The tuple (A1,C1) is not desired since 'B' is missing.
Tuples must be unique.
list1 & list2 are just an example and may vary in length.
I tried playing around with itertools, specifically product, permutations, and combinations, for quite some time. I can't seem to pull it off and I don't even have any code worth sharing.
Take successively larger slices of list1, and use products of products:
from itertools import product
elements = []
for letter in list1:
    # pair the current letter with every number, e.g. ['A1', 'A2']
    elements.append([''.join(c) for c in product(letter, list2)])
    # combine one choice from each letter seen so far
    for combo in product(*elements):
        print(combo)
The elements list grows on each iteration of the outer loop, gaining another list of letter + number strings, so each inner product covers one more letter than the previous one.
This produces:
>>> elements = []
>>> for letter in list1:
...     elements.append([''.join(c) for c in product(letter, list2)])
...     for combo in product(*elements):
...         print(combo)
...
('A1',)
('A2',)
('A1', 'B1')
('A1', 'B2')
('A2', 'B1')
('A2', 'B2')
('A1', 'B1', 'C1')
('A1', 'B1', 'C2')
('A1', 'B2', 'C1')
('A1', 'B2', 'C2')
('A2', 'B1', 'C1')
('A2', 'B1', 'C2')
('A2', 'B2', 'C1')
('A2', 'B2', 'C2')
What about this:
from itertools import product
output = []
for z in [list1[:n+1] for n in range(len(list1))]:
    for y in product(list2, repeat=len(z)):
        output.append(tuple(''.join(u) for u in zip(z, y)))
print(output)
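Both answers above follow the same idea: take each prefix of list1 and pick one number per letter in that prefix. A minimal sketch of that idea as a reusable generator is shown here; the function name prefix_products and its parameter names are made up for illustration and are not from either answer.

```python
from itertools import product

def prefix_products(letters, digits):
    """Yield one tuple per (prefix of letters, choice of one digit per letter)."""
    for n in range(1, len(letters) + 1):
        prefix = letters[:n]
        # product(..., repeat=n) picks one digit for each letter in the prefix
        for choice in product(digits, repeat=n):
            yield tuple(letter + digit for letter, digit in zip(prefix, choice))

result = list(prefix_products(['A', 'B', 'C'], ['1', '2']))
print(len(result))  # 2 + 4 + 8 = 14 tuples
```

Because it is a generator, each tuple can be written to the output file as it is produced rather than building the whole list first.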
I've got a list with (for example) 100 entries of the sort ['A0', 'B0', 'A1', 'B1', 'A2', 'B2', ... 'A99', 'B99'].
I'd now like to make this into a list of 50 entries with each entry a tuple (Ai, Bi) such that they are grouped together. So the result should be
[('A0','B0'),('A1','B1'),('A2','B2'),...,('A99','B99')]. Is there a shortcut to achieve this or do I have to use a loop like
for i in range(0, 100, 2):
    newlist.append((oldlist[i], oldlist[i+1]))
I'm trying to do quick and advanced programming in Python, so I'd prefer shortcuts, list comprehensions, etc. over simple for loops where possible.
This is the most pythonic way I can think of:
list(zip(oldlist[::2], oldlist[1::2]))
The code below should simplify it:
list_1 = ['A0', 'B0', 'A1', 'B1', 'A2', 'B2']
list_of_groups = [x for x in zip(*(iter(list_1),) * 2)]
If you really want, you can do this:
a = ['A0', 'B0', 'A1', 'B1', 'A2', 'B2']
list(zip(*[iter(a)]*2))
# [('A0', 'B0'), ('A1', 'B1'), ('A2', 'B2')]
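The zip(*[iter(a)]*2) idiom works because the list holds the same iterator object twice, so zip draws consecutive items from it for each tuple. A small sketch spelling that out follows; the helper name chunks is hypothetical, and it assumes that dropping leftover items is acceptable when the length is not a multiple of n.

```python
a = ['A0', 'B0', 'A1', 'B1', 'A2', 'B2']

it = iter(a)
# Both arguments are the SAME iterator, so each output tuple
# consumes two consecutive items from the list.
pairs = list(zip(it, it))

def chunks(seq, n):
    """Hypothetical helper: group seq into n-tuples, dropping any leftover items."""
    return list(zip(*[iter(seq)] * n))

triples = chunks(a, 3)
```

If leftover items must be kept, itertools.zip_longest can be used in place of zip, at the cost of padding the last tuple.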
Another way with numpy (here l is the flat input list):
import numpy as np
list(map(tuple, np.array(l).reshape(-1, 2).tolist()))
#[('A0', 'B0'), ('A1', 'B1'), ('A2', 'B2')]
I have two lists:
listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
and I want to pair items in the format any string same number, like so:
listC = [('a1', None),('a2', 'b2'),('a3', None),('a4', 'b4')]
I've tried itertools.zip_longest but I couldn't get what I need:
>>> list(itertools.zip_longest(listA, listB))
[('a1', 'b2'), ('a2', 'b4'), ('a3', None), ('a4', None)]
Any suggestions how to get listC?
You can use iter with next:
listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
l = iter(listB)
listC = [(a, next(l) if i%2 != 0 else None) for i, a in enumerate(listA)]
Output:
[('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]
Edit: pairing by trailing number:
import re
listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
d = {re.findall(r'\d+$', b)[0]: b for b in listB}  # trailing number -> item of listB
listC = [(i, d.get(re.findall(r'\d+$', i)[0])) for i in listA]
Output:
[('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]
You can use a list comprehension with a conditional expression for this:
listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
listB_set = set(listB)
listC = [(i, 'b'+i[1:] if 'b'+i[1:] in listB_set else None) for i in listA]
# [('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]
However, for clarity and performance, I would consider separating numeric and string data.
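As one possible reading of that suggestion, the sketch below stores each item as a (prefix, number) tuple instead of a string, so matching becomes a plain dictionary lookup on the number. The tuple layout and the names list_a, list_b, and b_by_number are assumptions made for illustration.

```python
# Assumed representation: (prefix, number) instead of 'a1', 'b2', ...
list_a = [('a', 1), ('a', 2), ('a', 3), ('a', 4)]
list_b = [('b', 2), ('b', 4)]

# Index list_b by its numeric part for constant-time lookups.
b_by_number = {number: (prefix, number) for prefix, number in list_b}

# Pair each item of list_a with the list_b item sharing its number, else None.
list_c = [(item, b_by_number.get(item[1])) for item in list_a]
```

The strings can be rebuilt when needed with something like prefix + str(number).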
You can try a dict approach:
import itertools
listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
final_list = {}
for i in itertools.product(listA, listB):
    data, data1 = list(i[0]), list(i[1])
    if data[1] == data1[1]:
        # second characters match: record the pair
        final_list[i[0]] = i
    else:
        # no match yet: record a placeholder unless a pair is already stored
        if i[0] not in final_list:
            final_list[i[0]] = (i[0], None)
print(final_list.values())
output:
[('a2', 'b2'), ('a3', None), ('a4', 'b4'), ('a1', None)]
Given
import itertools as it
list_a = ["a1", "a2", "a3", "a4"]
list_b = ["b2", "b4"]
Code
pred = lambda x: x[1:]
res = [tuple(g) for k, g in it.groupby(sorted(list_a + list_b, key=pred), pred)]
res
# [('a1',), ('a2', 'b2'), ('a3',), ('a4', 'b4')]
list(zip(*it.zip_longest(*res)))
# [('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]
Details
A flat, sorted list is grouped by the number in each string, yielding groups according to the predicate. Note that as long as each string starts with a single letter, the predicate works for numbers of any length: "a1", "b23", "c132", etc. If you prefer, you might also consider a trailing-number regex as seen in @Ajax1234's answer.
As you discovered, itertools.zip_longest pads None to shorter sub-groups by default.
See Also
this post for more ideas on padding iterables
this post on how to use itertools.groupby
this post on natural sorting for a more robust predicate
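A sketch of what a more robust predicate could look like: it combines the groupby approach above with a trailing-number regex and compares the numbers as integers, so "a10" sorts after "a2". The function name trailing_number is hypothetical, and the code assumes every string ends in at least one digit.

```python
import itertools as it
import re

def trailing_number(s):
    """Return the trailing digits of s as an int (assumes s ends in digits)."""
    return int(re.search(r'\d+$', s).group())

list_a = ["a1", "a2", "a10"]
list_b = ["b2", "b10"]

# sorted() is stable, so within one number the list_a item stays ahead of the list_b item.
merged = sorted(list_a + list_b, key=trailing_number)
groups = [tuple(g) for _, g in it.groupby(merged, trailing_number)]

# Pad the shorter groups with None, as in the answer above.
padded = list(zip(*it.zip_longest(*groups)))
```

Note that a number appearing only in list_b would end up in the first position of its tuple, so this sketch only fits the case where list_b's numbers are a subset of list_a's.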
I want to create all possible character combinations from lists. The first char needs to be from the first array, the second char from the second array, etc.
If I have the following lists:
char1 = ['a','b','c']
char2 = ['1','2']
The possible strings, would be: a1, a2, b1, b2, c1 and c2.
How do I write code which makes all the combinations from an unknown number of lists of unknown size?
The problem is that I do not know, how many lists there will be. The amount of lists will be decided by the user, while the code is running.
As mentioned above, you can use itertools.product()
And since you don't know number of lists you can pass list of lists as an argument:
import itertools
lists = [
    ['a','b','c'],
    ['1','2']
]
["".join(x) for x in itertools.product(*lists)]
Result:
['a1', 'a2', 'b1', 'b2', 'c1', 'c2']
That's a task for itertools.product()! Check out the docs: https://docs.python.org/2/library/itertools.html#itertools.product
>>> ["%s%s" % (c1,c2) for (c1,c2) in itertools.product(char1, char2)]
['a1', 'a2', 'b1', 'b2', 'c1', 'c2']
And yeah, it extends to a variable number of lists of unknown size.
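To illustrate the variable-number case, here is a minimal sketch in which the lists are collected in a loop before being passed to itertools.product. The function name combine and the hard-coded specs (standing in for user input) are assumptions for illustration.

```python
import itertools

def combine(lists):
    """Lazily yield one string per combination, one character from each list in order."""
    return (''.join(chars) for chars in itertools.product(*lists))

lists = []
for spec in ["abc", "12", "xy"]:  # stand-in for input gathered at runtime
    lists.append(list(spec))

combos = list(combine(lists))
print(len(combos))  # 3 * 2 * 2 = 12
```

The number of results is the product of the list lengths, so iterating over the generator instead of building a list may be preferable when there are many lists.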
I am working on a script that uses itertools to generate desired combinations of given parameters. I am having trouble generating the appropriate 'sets' however when some variables are linked, i.e. excluding certain combinations. Consider the following:
import itertools
A = ['a1','a2','a3']
B = ['b1','b2','b3']
C = ['c1','c2']
If I want to generate all possible combinations of these elements, I can simply use itertools.product()
all_combinations = list(itertools.product(A,B,C))
Which gives the expected
[('a1', 'b1', 'c1'), ('a1', 'b1', 'c2'), ('a1', 'b2', 'c1'), ...
('a3', 'b2', 'c2'), ('a3', 'b3', 'c1'), ('a3', 'b3', 'c2')]
For 18 combinations (3*3*2)
However, how can I 'link' parameters A and B such that each returned set contains only 'an','bn' elements? That is, I have tried:
ABprecombine = zip(A,B)
limited_combinations = list(itertools.product(ABprecombine,C))
Which returns
[(('a1', 'b1'), 'c1'), (('a1', 'b1'), 'c2'), (('a2', 'b2'), 'c1'),
(('a2', 'b2'), 'c2'), (('a3', 'b3'), 'c1'), (('a3', 'b3'), 'c2')]
These are the six (3*1*2) desired products, but because of the way I created them each result now has an extra level of tuple nesting.
Of course I could generate all combinations and then filter out given ones, but is there a smart way to 'link' parameters as above?
Here, zipping A and B is the right way to go. You can flatten the tuples pretty easily if you want:
limited_combinations = [(a, b, c) for ((a, b), c) in itertools.product(zip(A, B), C)]
If you want more detailed control of what combinations get produced, things can rapidly get more complicated, up to the difficulty of needing to solve NP-hard problems like boolean satisfiability. If that happens, look into existing libraries for that kind of thing.
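For the simple linked case, the zip-then-flatten approach can be sketched in a form that handles any number of linked groups. The helper name linked_product is hypothetical; it assumes each group is given as a list of tuples, with independent parameters wrapped as 1-tuples.

```python
import itertools

A = ['a1', 'a2', 'a3']
B = ['b1', 'b2', 'b3']
C = ['c1', 'c2']

def linked_product(*groups):
    """Yield flat tuples, picking one entry (itself a tuple) from each group."""
    for combo in itertools.product(*groups):
        # combo is a tuple of tuples; chain flattens it by one level
        yield tuple(itertools.chain.from_iterable(combo))

linked = list(zip(A, B))   # A and B vary together
free = [(c,) for c in C]   # C varies independently
limited_combinations = list(linked_product(linked, free))
```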
I have a list of iterable objects, and I'm interested in obtaining all lists that consist of 0 or 1 items from each iterable (order is unimportant, so it's combinations not permutations I seek).
I have a really inelegant implementation that I've posted below.
I'm convinced there's a far more elegant way to do this, possibly with the itertools module, but I can't come up with anything. Any advice?
import itertools
def all_subsets(ss):
    subset_lens = range(0, len(ss) + 1)
    list_of_subsets = map(lambda n: itertools.combinations(ss, n), subset_lens)
    return itertools.chain.from_iterable(list_of_subsets)
list_of_iterables = [["A1"], ["B1", "B2", "B3"], ["C1", "C2"]]
all_possibilities = itertools.chain.from_iterable(itertools.product(*subset)
                                                  for subset in all_subsets(list_of_iterables))
# Visual representation of the desired result
for eg in all_possibilities:
    print(eg)
Result:
()
('A1',)
('B1',)
('B2',)
('B3',)
('C1',)
('C2',)
('A1', 'B1')
('A1', 'B2')
('A1', 'B3')
('A1', 'C1')
...
[tuple(filter(None, comb)) for comb in itertools.product(*[[None] + it for it in list_of_iterables])]
This makes a couple simplifying assumptions. If your iterables contain values that aren't true in a boolean context, you'd have to use a more complicated filter. If your iterables aren't lists, you'd have to use itertools.chain instead of [None] + it.
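A sketch of how both assumptions could be lifted: a unique sentinel object replaces None so that falsy values such as 0 survive, and itertools.chain prepends it so the inputs need not be lists. The names _SKIP and zero_or_one_from_each are made up for illustration.

```python
import itertools

_SKIP = object()  # unique marker meaning "take nothing from this iterable"

def zero_or_one_from_each(iterables):
    """Yield tuples containing zero or one item from each iterable, in order."""
    # chain works for any iterable, not only lists
    options = [itertools.chain([_SKIP], it) for it in iterables]
    for comb in itertools.product(*options):
        # identity check keeps falsy values such as 0 or ''
        yield tuple(x for x in comb if x is not _SKIP)

results = list(zero_or_one_from_each([[0], ("B1", "B2"), iter(["C1"])]))
print(len(results))  # (1 + 1) * (2 + 1) * (1 + 1) = 12
```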
Here's what I came up with...
data = [["A1"], ["B1", "B2", "B3"], ["C1", "C2"]]
data = [[None] + x for x in data]  # None means "skip this iterable"
data = sorted(tuple(filter(None, x)) for x in itertools.product(*data))
for result in data:
    print(result)
Output:
()
('A1',)
('A1', 'B1')
('A1', 'B1', 'C1')
('A1', 'B1', 'C2')
('A1', 'B2')
('A1', 'B2', 'C1')
('A1', 'B2', 'C2')
('A1', 'B3')
('A1', 'B3', 'C1')
('A1', 'B3', 'C2')
('A1', 'C1')
('A1', 'C2')
('B1',)
('B1', 'C1')
('B1', 'C2')
('B2',)
('B2', 'C1')
('B2', 'C2')
('B3',)
('B3', 'C1')
('B3', 'C2')
('C1',)
('C2',)