I have two lists:
listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
and I want to pair items that share the same trailing number, whatever the letter prefix, like so:
listC = [('a1', None),('a2', 'b2'),('a3', None),('a4', 'b4')]
I've tried itertools.zip_longest but I couldn't get what I need:
>>> list(itertools.zip_longest(listA, listB))
[('a1', 'b2'), ('a2', 'b4'), ('a3', None), ('a4', None)]
Any suggestions how to get listC?
You can use iter with next:
listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
l = iter(listB)
listC = [(a, next(l) if i%2 != 0 else None) for i, a in enumerate(listA)]
Output:
[('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]
Edit: pairing by trailing number:
import re
listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
d = {re.findall(r'\d+$', b)[0]: b for b in listB}
listC = [(i, d.get(re.findall(r'\d+$', i)[0])) for i in listA]
Output:
[('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]
You can use a list comprehension with a ternary statement for this:
listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
listB_set = set(listB)
listC = [(i, 'b'+i[1:] if 'b'+i[1:] in listB_set else None) for i in listA]
# [('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]
However, for clarity and performance, I would consider separating numeric and string data.
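One way to read that suggestion is to keep the numeric part as integers and attach the letter prefix only when building the output. The sketch below assumes every item is a fixed prefix followed by a number; the names `nums_a` and `nums_b` are hypothetical and not part of the original question.

```python
# Assumption: the data is stored as bare numbers, and the 'a'/'b'
# prefixes are only added when the output tuples are built.
nums_a = [1, 2, 3, 4]
nums_b = {2, 4}  # a set gives constant-time membership tests

listC = [
    (f"a{n}", f"b{n}" if n in nums_b else None)
    for n in nums_a
]
print(listC)
# [('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]
```

This avoids slicing and re-concatenating strings on every lookup.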
You can try dict approach:
listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
import itertools

final_list = {}
for i in itertools.product(listA, listB):
    data, data1 = list(i[0]), list(i[1])
    if data[1] == data1[1]:
        final_list[i[0]] = i
    else:
        if i[0] not in final_list:
            final_list[i[0]] = (i[0], None)
print(final_list.values())
output:
[('a2', 'b2'), ('a3', None), ('a4', 'b4'), ('a1', None)]
Given
import itertools as it
list_a = ["a1", "a2", "a3", "a4"]
list_b = ["b2", "b4"]
Code
pred = lambda x: x[1:]
res = [tuple(g) for k, g in it.groupby(sorted(list_a + list_b, key=pred), pred)]
res
# [('a1',), ('a2', 'b2'), ('a3',), ('a4', 'b4')]
list(zip(*it.zip_longest(*res)))
# [('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]
Details
A flat, sorted list is grouped by the number in each string, and each group is collected into a tuple. Note that as long as every string starts with a single letter, the predicate works for numbers of any length: "a1", "b23", "c132", etc. You might also consider a trailing-number regex, as seen in @Ajax1234's answer.
As you discovered, itertools.zip_longest pads shorter sub-groups with None by default.
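A trailing-number predicate along those lines might look like the sketch below. `trailing_number` is a hypothetical helper, and the sample lists are changed from the ones above to include a multi-letter prefix and a two-digit number; it assumes every string ends in at least one digit.

```python
import itertools as it
import re

def trailing_number(s):
    """Hypothetical key: the integer formed by the digits that end the string."""
    return int(re.search(r"\d+$", s).group())

list_a = ["a1", "a2", "a10", "ab3"]
list_b = ["b2", "b10"]

# Sort by the trailing integer (numeric, not lexicographic), then group.
# The sort is stable, so list_a items stay ahead of list_b items in a group.
ordered = sorted(list_a + list_b, key=trailing_number)
groups = [tuple(g) for _, g in it.groupby(ordered, trailing_number)]

# Pad the one-item groups with None, as in the answer above.
res = list(zip(*it.zip_longest(*groups)))
print(res)
# [('a1', None), ('a2', 'b2'), ('ab3', None), ('a10', 'b10')]
```

Note that an item of `list_b` with no partner in `list_a` would end up in the first slot of its tuple, so this sketch only fits data where `list_b` is a subset by number.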
See Also
this post for more ideas on padding iterables
this post on how to use itertools.groupby
this post on natural sorting for a more robust predicate
Related
I've got a list with (for example) 100 entries of the sort ['A0', 'B0', 'A1', 'B1', 'A2', 'B2', ... 'A99', 'B99'].
I'd now like to make this into a list of 50 entries with each entry a tuple (Ai, Bi) such that they are grouped together. So the result should be
[('A0','B0'),('A1','B1'),('A2','B2'),...,('A99','B99')]. Is there a shortcut to achieve this or do I have to use a loop like
for i in numpy.arange(0,100,2):
    newlist.append((oldlist[i], oldlist[i+1]))
I'm trying to do quick and advanced programming in python so I'd prefer using shortcuts, list comprehension, ... and not simple for loops where possible
This is the most pythonic way I can think of:
list(zip(oldlist[::2], oldlist[1::2]))
The code below should simplify it:
list_1 = ['A0', 'B0', 'A1', 'B1', 'A2', 'B2']
list_of_groups = [x for x in zip(*(iter(list_1),) * 2)]
If you really want, you can do this:
a = ['A0', 'B0', 'A1', 'B1', 'A2', 'B2']
list(zip(*[iter(a)]*2))
# [('A0', 'B0'), ('A1', 'B1'), ('A2', 'B2')]
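The `zip(*[iter(a)] * 2)` idiom works because the list holds the same iterator object twice, so `zip` draws consecutive items from a single stream. The sketch below spells that out; `pairs` is a hypothetical helper name.

```python
def pairs(items):
    """Return an iterator over consecutive, non-overlapping pairs of items."""
    it = iter(items)    # one shared iterator
    return zip(it, it)  # zip advances that same iterator twice per tuple

a = ['A0', 'B0', 'A1', 'B1', 'A2', 'B2']
print(list(pairs(a)))
# [('A0', 'B0'), ('A1', 'B1'), ('A2', 'B2')]
```

If the input has an odd number of items, the last one is silently dropped, because `zip` stops at the shortest input.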
Another way, with numpy:
list(map(tuple, np.array(l).reshape(-1, 2)))
#[('A0', 'B0'), ('A1', 'B1'), ('A2', 'B2')]
I am working on a script that uses itertools to generate desired combinations of given parameters. I am having trouble generating the appropriate 'sets' however when some variables are linked, i.e. excluding certain combinations. Consider the following:
import itertools
A = ['a1','a2','a3']
B = ['b1','b2','b3']
C = ['c1','c2']
If I want to generate all possible combinations of these elements, I can simply use itertools.product()
all_combinations = list(itertools.product(A,B,C))
Which gives the expected
[('a1', 'b1', 'c1'), ('a1', 'b1', 'c2'), ('a1', 'b2', 'c1'), ...
('a3', 'b2', 'c2'), ('a3', 'b3', 'c1'), ('a3', 'b3', 'c2')]
For 18 combinations (3*3*2)
However, how can I 'link' parameters A and B such that each returned set contains only 'an','bn' elements? That is, I have tried:
ABprecombine = zip(A,B)
limited_combinations = list(itertools.product(ABprecombine,C))
Which returns
[(('a1', 'b1'), 'c1'), (('a1', 'b1'), 'c2'), (('a2', 'b2'), 'c1'),
(('a2', 'b2'), 'c2'), (('a3', 'b3'), 'c1'), (('a3', 'b3'), 'c2')]
This is the six (3*1*2) desired products, but obviously due to the way I created it I now have an extra tuple.
Of course I could generate all combinations and then filter out given ones, but is there a smart way to 'link' parameters as above?
Here, zipping A and B is the right way to go. You can flatten the tuples pretty easily if you want:
limited_combinations = [(a, b, c) for ((a, b), c) in itertools.product(zip(A, B), C)]
If you want more detailed control of what combinations get produced, things can rapidly get more complicated, up to the difficulty of needing to solve NP-hard problems like boolean satisfiability. If that happens, look into existing libraries for that kind of thing.
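For the simple case, the zip-then-flatten approach generalizes to any number of linked groups. The sketch below is one possible way to do that; `linked_product` is a hypothetical helper, and it assumes the lists inside each group have equal length.

```python
import itertools

def linked_product(*groups):
    """Yield flat tuples; the lists inside each group vary together.

    Each group is a list of parameter lists (hypothetical convention).
    """
    zipped = [zip(*group) for group in groups]  # link lists within a group
    for combo in itertools.product(*zipped):
        # combo is a tuple of tuples, e.g. (('a1', 'b1'), ('c1',)); flatten it
        yield tuple(itertools.chain.from_iterable(combo))

A = ['a1', 'a2', 'a3']
B = ['b1', 'b2', 'b3']
C = ['c1', 'c2']

limited = list(linked_product([A, B], [C]))
print(limited)
# [('a1', 'b1', 'c1'), ('a1', 'b1', 'c2'), ('a2', 'b2', 'c1'),
#  ('a2', 'b2', 'c2'), ('a3', 'b3', 'c1'), ('a3', 'b3', 'c2')]
```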
I have the following lists:
list1 = [ 'A','B','C']
list2 = [ '1', '2' ]
Trying to generate a new list of tuples with the following desired result:
[(A1),(A2),(A1,B1),(A1,B2),(A2,B1),(A2,B2),(A1,B1,C1),(A2,B1,C1)...]
Each tuple will eventually be used to write a single line in an output file.
Note that:
In each tuple, a letter from list1 may appear only if all preceding letters appear before it. For example, if 'B' is in a tuple, then 'A' must be in the tuple as well, ahead of 'B'. The tuple (A1,C1) is not desired, since 'B' is missing.
Tuples must be unique.
list1 & list2 are just an example and may vary in length.
I tried playing around with itertools, specifically with,
product,
permutations,
combinations
for quite some time. I can't seem to pull it off and I don't even have some code worth sharing.
Take successively larger slices of list1, and use products of products:
from itertools import product

elements = []
for letter in list1:
    elements.append([''.join(c) for c in product(letter, list2)])
    for combo in product(*elements):
        print(combo)
The elements list is grown each loop, adding another set of letter + numbers list to produce products from.
This produces:
>>> elements = []
>>> for letter in list1:
...     elements.append([''.join(c) for c in product(letter, list2)])
...     for combo in product(*elements):
...         print(combo)
...
('A1',)
('A2',)
('A1', 'B1')
('A1', 'B2')
('A2', 'B1')
('A2', 'B2')
('A1', 'B1', 'C1')
('A1', 'B1', 'C2')
('A1', 'B2', 'C1')
('A1', 'B2', 'C2')
('A2', 'B1', 'C1')
('A2', 'B1', 'C2')
('A2', 'B2', 'C1')
('A2', 'B2', 'C2')
What about this:
from itertools import product
output = []
for z in [list1[:n+1] for n in range(len(list1))]:
    for y in product(list2, repeat=len(z)):
        output.append(tuple(''.join(u) for u in zip(z, y)))
print(output)
I have a list of iterable objects, and I'm interested in obtaining all lists that consist of 0 or 1 items from each iterable (order is unimportant, so it's combinations not permutations I seek).
I have a really inelegant implementation that I've posted below.
I'm convinced there's a far more elegant way to do this, possibly with the itertools module, but I can't come up with anything. Any advice?
import itertools

def all_subsets(ss):
    subset_lens = range(0, len(ss) + 1)
    list_of_subsets = map(lambda n: itertools.combinations(ss, n), subset_lens)
    return itertools.chain.from_iterable(list_of_subsets)

list_of_iterables = [["A1"], ["B1", "B2", "B3"], ["C1", "C2"]]
all_possibilities = itertools.chain.from_iterable(
    itertools.product(*subset) for subset in all_subsets(list_of_iterables))

# Visual representation of the desired result
for eg in all_possibilities:
    print(eg)
Result:
()
('A1',)
('B1',)
('B2',)
('B3',)
('C1',)
('C2',)
('A1', 'B1')
('A1', 'B2')
('A1', 'B3')
('A1', 'C1')
...
[tuple(filter(None, comb)) for comb in itertools.product(*[[None] + it for it in list_of_iterables])]
This makes a couple simplifying assumptions. If your iterables contain values that aren't true in a boolean context, you'd have to use a more complicated filter. If your iterables aren't lists, you'd have to use itertools.chain instead of [None] + it.
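Both caveats can be handled together with a unique sentinel object and `itertools.chain`. The following is only a sketch: `_SKIP` and `zero_or_one_from_each` are hypothetical names, and the sample input deliberately contains a falsy value (`0`) and non-list iterables.

```python
import itertools

_SKIP = object()  # hypothetical sentinel meaning "take nothing from this iterable"

def zero_or_one_from_each(iterables):
    """Yield every tuple made of 0 or 1 items from each iterable, in order."""
    # chain() prepends the sentinel without requiring the iterable to be a list.
    padded = [itertools.chain([_SKIP], it) for it in iterables]
    for comb in itertools.product(*padded):
        # An identity test keeps falsy items such as 0, '' or None.
        yield tuple(x for x in comb if x is not _SKIP)

results = list(zero_or_one_from_each([(0,), "xy", range(1, 3)]))
print(len(results))  # 18, i.e. (1+1) * (2+1) * (2+1)
print(results[:3])   # [(), (1,), (2,)]
```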
Here's what I came up with...
import itertools

data = [["A1"], ["B1", "B2", "B3"], ["C1", "C2"]]
data = [[None] + x for x in data]
data = sorted(tuple(filter(None, x)) for x in itertools.product(*data))
for result in data:
    print(result)
Output:
()
('A1',)
('A1', 'B1')
('A1', 'B1', 'C1')
('A1', 'B1', 'C2')
('A1', 'B2')
('A1', 'B2', 'C1')
('A1', 'B2', 'C2')
('A1', 'B3')
('A1', 'B3', 'C1')
('A1', 'B3', 'C2')
('A1', 'C1')
('A1', 'C2')
('B1',)
('B1', 'C1')
('B1', 'C2')
('B2',)
('B2', 'C1')
('B2', 'C2')
('B3',)
('B3', 'C1')
('B3', 'C2')
('C1',)
('C2',)
I have a python dictionary setup like so
mydict = { 'a1': ['g',6],
'a2': ['e',2],
'a3': ['h',3],
'a4': ['s',2],
'a5': ['j',9],
'a6': ['y',7] }
I need to write a function that returns the keys as an ordered list, depending on which column you're sorting on. For example, if we're sorting on mydict[key][1] (ascending),
I should receive a list back like so
['a2', 'a4', 'a3', 'a1', 'a6', 'a5']
It mostly works, except when several keys have the same value in the sort column, e.g. 'a2': ['e',2] and 'a4': ['s',2]. In that case it returns the list like so
['a4', 'a4', 'a3', 'a1', 'a6', 'a5']
Here's the function I've defined
def itlist(table_dict,column_nb,order="A"):
    try:
        keys = table_dict.keys()
        values = [i[column_nb-1] for i in table_dict.values()]
        combo = zip(values,keys)
        valkeys = dict(combo)
        sortedCols = sorted(values) if order=="A" else sorted(values,reverse=True)
        sortedKeys = [valkeys[i] for i in sortedCols]
    except (KeyError, IndexError) as e:
        pass
    return sortedKeys
And if I want to sort on the numbers column for example it is called like so
sortedkeysasc = itmethods.itlist(table,2)
So any suggestions?
Paul
Wouldn't it be much easier to use
sorted(d, key=lambda k: d[k][1])
(with d being the dictionary)?
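The repeated key in the question appears to come from `dict(zip(values, keys))`, which can hold only one key per distinct value. The sketch below reproduces that under Python 3.7+ (where dicts keep insertion order) and contrasts it with sorting the keys directly; the variable names are illustrative.

```python
mydict = {'a1': ['g', 6], 'a2': ['e', 2], 'a3': ['h', 3],
          'a4': ['s', 2], 'a5': ['j', 9], 'a6': ['y', 7]}

# The question's approach: map each value back to a key.
values = [v[1] for v in mydict.values()]
valkeys = dict(zip(values, mydict))  # 2 appears twice, so 'a2' is overwritten
print(len(valkeys), valkeys[2])      # 5 a4
print([valkeys[v] for v in sorted(values)])
# ['a4', 'a4', 'a3', 'a1', 'a6', 'a5']

# Sorting the keys themselves never loses a key. sorted() is stable,
# so ties keep the dict's own order.
by_number = sorted(mydict, key=lambda k: mydict[k][1])
print(by_number)
# ['a2', 'a4', 'a3', 'a1', 'a6', 'a5']
```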
>>> L = sorted(d.items(), key=lambda kv: kv[1][1])
>>> L
[('a2', ['e', 2]), ('a4', ['s', 2]), ('a3', ['h', 3]), ('a1', ['g', 6]), ('a6', ['y', 7]), ('a5', ['j', 9])]
>>> list(map(lambda kv: kv[0], L))
['a2', 'a4', 'a3', 'a1', 'a6', 'a5']
Here you sort the dictionary items (key-value pairs) using a key callable, which determines the order of the items.
Then you extract the needed values using a map with a lambda that selects just the key. So you get the needed list of keys.
EDIT: see this answer for a much better solution.
Although there are multiple working answers above, a slight variation / combination of them is the most pythonic to me:
[k for (k, v) in sorted(mydict.items(), key=lambda kv: kv[1][1])]
>>> mydict = { 'a1': ['g',6],
... 'a2': ['e',2],
... 'a3': ['h',3],
... 'a4': ['s',2],
... 'a5': ['j',9],
... 'a6': ['y',7] }
>>> sorted(mydict, key=lambda k:mydict[k][1])
['a2', 'a4', 'a3', 'a1', 'a6', 'a5']
>>> sorted(mydict, key=lambda k:mydict[k][0])
['a2', 'a1', 'a3', 'a5', 'a4', 'a6']
def itlist(table_dict, col, desc=False):
    return [key for (key, val) in
            sorted(
                table_dict.items(),
                key=lambda x: x[1][col-1],
                reverse=desc,
            )
    ]