Python: list grouping [duplicate]

This question already has answers here:
Pairs from single list [duplicate]
(10 answers)
Closed 2 years ago.
I've got a list with (for example) 100 entries of the sort ['A0', 'B0', 'A1', 'B1', 'A2', 'B2', ..., 'A49', 'B49'].
I'd now like to turn this into a list of 50 entries, each a tuple (Ai, Bi), so that matching items are grouped together. The result should be
[('A0','B0'),('A1','B1'),('A2','B2'),...,('A49','B49')]. Is there a shortcut to achieve this, or do I have to use a loop like
newlist = []
for i in range(0, 100, 2):
    newlist.append((oldlist[i], oldlist[i+1]))
I'm trying to write quick, advanced Python, so I'd prefer shortcuts, list comprehensions, etc. over simple for loops where possible.

This is the most pythonic way I can think of:
list(zip(oldlist[::2], oldlist[1::2]))

The code below should simplify it:
list_1 = ['A0', 'B0', 'A1', 'B1', 'A2', 'B2']
list_of_groups = [x for x in zip(*(iter(list_1),) * 2)]

If you really want, you can do this:
a = ['A0', 'B0', 'A1', 'B1', 'A2', 'B2']
list(zip(*[iter(a)]*2))
# [('A0', 'B0'), ('A1', 'B1'), ('A2', 'B2')]
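The zip(*[iter(a)]*2) idiom above works because the same iterator object is handed to zip twice, so each output tuple is built from two consecutive items. A minimal sketch of that idea, with a helper name (pairwise_chunks) that is made up for illustration:

```python
def pairwise_chunks(items, size=2):
    """Group items into consecutive tuples of `size` (hypothetical helper)."""
    it = iter(items)
    # [it] * size is a list holding the SAME iterator `size` times, so zip
    # pulls `size` consecutive items from it to build each tuple.
    return list(zip(*[it] * size))


flat = ['A0', 'B0', 'A1', 'B1', 'A2', 'B2']
pairs = pairwise_chunks(flat)
print(pairs)  # [('A0', 'B0'), ('A1', 'B1'), ('A2', 'B2')]
```

Note that zip stops at the shortest input, so a trailing item without a partner is silently dropped.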

Another way, with numpy:
import numpy as np
l = ['A0', 'B0', 'A1', 'B1', 'A2', 'B2']
list(map(tuple, np.array(l).reshape(-1, 2).tolist()))
# [('A0', 'B0'), ('A1', 'B1'), ('A2', 'B2')]

Related

Pair elements from two different lists

I have two lists:
listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
and I want to pair items that share the same trailing number, whatever the leading string is, like so:
listC = [('a1', None),('a2', 'b2'),('a3', None),('a4', 'b4')]
I've tried itertools.zip_longest, but I couldn't get what I need:
>>> list(itertools.zip_longest(listA, listB))
[('a1', 'b2'), ('a2', 'b4'), ('a3', None), ('a4', None)]
Any suggestions how to get listC?

You can use iter with next:
listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
l = iter(listB)
listC = [(a, next(l) if i%2 != 0 else None) for i, a in enumerate(listA)]
Output:
[('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]
Edit: pairing by trailing number:
import re
listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
d = {re.findall(r'\d+$', b)[0]: b for b in listB}
listC = [(i, d.get(re.findall(r'\d+$', i)[0])) for i in listA]
Output:
[('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]

You can use a list comprehension with a ternary statement for this:
listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
listB_set = set(listB)
listC = [(i, 'b'+i[1:] if 'b'+i[1:] in listB_set else None) for i in listA]
# [('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]
However, for clarity and performance, I would consider separating numeric and string data.
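One possible reading of "separating numeric and string data" is to store each item as a (letter, number) pair instead of a combined string. The sketch below assumes that layout; the names items_a, items_b and b_by_number are made up for illustration:

```python
# Hypothetical restructuring: keep the letter and the number as separate fields.
items_a = [("a", 1), ("a", 2), ("a", 3), ("a", 4)]
items_b = [("b", 2), ("b", 4)]

# Index the second list by its number so each lookup is a plain dict access.
b_by_number = {number: (letter, number) for letter, number in items_b}

# Pair every item of the first list with its match, or None when there is none.
pairs = [(item, b_by_number.get(item[1])) for item in items_a]
print(pairs)
```

With the number stored as an int, no string slicing or regex is needed to find the match.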

You can try a dict approach:
import itertools
listA = ['a1', 'a2', 'a3', 'a4']
listB = ['b2', 'b4']
final_list = {}
for i in itertools.product(listA, listB):
    # compare the second character of each string (assumes single-digit suffixes)
    data, data1 = list(i[0]), list(i[1])
    if data[1] == data1[1]:
        final_list[i[0]] = i
    else:
        if i[0] not in final_list:
            final_list[i[0]] = (i[0], None)
print(list(final_list.values()))
Output (Python 3.7+, where dicts keep insertion order):
[('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]

Given
import itertools as it
list_a = ["a1", "a2", "a3", "a4"]
list_b = ["b2", "b4"]
Code
pred = lambda x: x[1:]
res = [tuple(g) for k, g in it.groupby(sorted(list_a + list_b, key=pred), pred)]
res
# [('a1',), ('a2', 'b2'), ('a3',), ('a4', 'b4')]
list(zip(*it.zip_longest(*res)))
# [('a1', None), ('a2', 'b2'), ('a3', None), ('a4', 'b4')]
Details
A flat, sorted list is grouped by the number in each string, yielding one group per number according to the predicate. Note that as long as each string starts with a single letter, the predicate works for any number of digits: "a1", "b23", "c132", etc. You might also consider a trailing-number regex, as seen in @Ajax1234's answer.
As you discovered, itertools.zip_longest pads None to shorter sub-groups by default.
See Also
this post for more ideas on padding iterables
this post on how to use itertools.groupby
this post on natural sorting for a more robust predicate
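As a rough sketch of the more robust predicate mentioned above, the groupby approach can be combined with a trailing-number regex and an integer key, so that "a10" sorts after "a2". The function name trailing_number and the sample lists are assumptions made for this illustration, and every string is assumed to end in digits:

```python
import itertools as it
import re


def trailing_number(s):
    """Return the integer at the end of s (assumes s ends in digits)."""
    return int(re.search(r"\d+$", s).group())


list_a = ["a1", "a2", "a10"]
list_b = ["b2", "b10"]

# Sort by the numeric suffix, then group items that share the same number.
merged = sorted(list_a + list_b, key=trailing_number)
groups = [tuple(g) for _, g in it.groupby(merged, trailing_number)]

# Pad the short groups with None, as in the answer above.
padded = list(zip(*it.zip_longest(*groups)))
print(padded)  # [('a1', None), ('a2', 'b2'), ('a10', 'b10')]
```

Because sorted is stable, items from list_a stay ahead of items from list_b within each group. An item of list_b with no partner in list_a would end up in the first slot of its tuple, so this sketch only fits inputs like the ones in the question.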

How can I improve this heavily nested for-loop?

I have a function which I'd like to optimize, if possible. But I cannot easily tell if there's a better way to refactor (and optimize) this...
Suppose,
keys_in_order = ['A', 'B', 'C', 'D', 'E']
key_table = {'A': {'A1': 'one', 'A2': 'two', 'A3': 'three', 'A4': 'four'},
             'B': {'B1': 'one-one', 'B2': 'two-two', 'B3': 'three-three'},
             ... # mapping for 'C', 'D' here
             'E': {'E1': 'one-one', 'E2': 'two-two', 'E3': 'three-three', 'E6': 'six-six'}
            }
The purpose is to feed the above two parameters to the function as below:
def generate_all_possible_key_combinations(keys_in_order, key_table):
    first_key = keys_in_order[0]
    second_key = keys_in_order[1]
    third_key = keys_in_order[2]
    fourth_key = keys_in_order[3]
    fifth_key = keys_in_order[4]
    table_out = [['Demo Group', first_key, second_key, third_key, fourth_key, fifth_key]]  # just the header row so that we can write to a CSV file later
    for k1, v1 in key_table[first_key].items():
        for k2, v2 in key_table[second_key].items():
            for k3, v3 in key_table[third_key].items():
                for k4, v4 in key_table[fourth_key].items():
                    for k5, v5 in key_table[fifth_key].items():
                        demo_gp = k1 + k2 + k3 + k4 + k5
                        table_out.append([demo_gp, v1, v2, v3, v4, v5])
    return table_out
The goal is to have a table with all possible combinations of sub-keys (that is, 'A1B1C1D1E1', 'A1B1C1D1E2', 'A1B1C1D1E3', etc.) along with their corresponding values in key_table.
To me, the current code, with five heavily nested loops over the dict key_table, is ugly, not to mention computationally inefficient. Is there a way to improve this? I hope folks from code_review might be able to shed some light on how I might go about it. Thank you!

I have implemented this with an alternative method. Take key_table as your main dictionary.
My logic is as follows.
First, I get all the possible sub-keys from the main dict.
In [1]: [i.keys() for i in key_table.values()]
Out[1]:
[['A1', 'A3', 'A2', 'A4'],
['C3', 'C2', 'C1'],
['B1', 'B2', 'B3'],
['E6', 'E1', 'E3', 'E2'],
['D2', 'D3', 'D1']]
Then I flattened this list of lists into a single list.
In [2]: print([item for sublist in [i.keys() for i in key_table.values()] for item in sublist])
['A1', 'A3', 'A2', 'A4', 'C3', 'C2', 'C1', 'B1', 'B2', 'B3', 'E6', 'E1', 'E3', 'E2', 'D2', 'D3', 'D1']
Using itertools.combinations, I generated the combinations of these values. There are 5 keys, so I hard-coded 5; you can replace that with len(key_table) if you have more. Note that combinations draws from the flattened list, so it also yields results that take several sub-keys from the same key (for example 'A1A3A2A4C3'); to get exactly one sub-key per key, use itertools.product over the per-key lists instead. Here is an example of how itertools.combinations behaves:
In [83]: for i in itertools.combinations(['A1','B1','C1'],2):
   ....:     print(i)
....:
('A1', 'B1')
('A1', 'C1')
('B1', 'C1')
Here is the full code as a one-line implementation.
import itertools
for item in itertools.combinations([item for sublist in [i.keys() for i in key_table.values()] for item in sublist], 5):
    print(''.join(item))

Some optimizations:
The various key_table[?].items() could be computed before the nested loop
You could compute partials of demo_gp when they are available: demo_gp12 = k1 + k2, demo_gp123 = demo_gp12 + k3, etc. Similar thing could be done with the array of vs.
As @JohnColeman suggested, itertools would be a good place to look to simplify it.
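As a rough illustration of the itertools suggestion, the five nested loops can be collapsed into a single itertools.product call over the per-key item lists. This is only a sketch: the function name and the two-key sample table are made up for the example, and it assumes keys_in_order is non-empty and every key is present in key_table:

```python
import itertools


def generate_key_combinations_product(keys_in_order, key_table):
    """Sketch of the nested loops rewritten with itertools.product."""
    table_out = [["Demo Group", *keys_in_order]]  # header row
    # One iterable of (sub_key, value) pairs per key, in the requested order.
    item_lists = [key_table[key].items() for key in keys_in_order]
    for combo in itertools.product(*item_lists):
        sub_keys, values = zip(*combo)  # split the pairs into two tuples
        table_out.append(["".join(sub_keys), *values])
    return table_out


# Small hypothetical table, just to show the shape of the output.
sample_table = {
    "A": {"A1": "one", "A2": "two"},
    "B": {"B1": "one-one"},
}
rows = generate_key_combinations_product(["A", "B"], sample_table)
for row in rows:
    print(row)
```

The amount of work is the same as with the nested loops, since the output still contains every combination, but the function no longer depends on there being exactly five keys.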

Python itertools - Create only a subset of all possible products

I am working on a script that uses itertools to generate desired combinations of given parameters. I am having trouble generating the appropriate 'sets' however when some variables are linked, i.e. excluding certain combinations. Consider the following:
import itertools
A = ['a1','a2','a3']
B = ['b1','b2','b3']
C = ['c1','c2']
If I want to generate all possible combinations of these elements, I can simply use itertools.product()
all_combinations = list(itertools.product(A,B,C))
Which gives the expected
[('a1', 'b1', 'c1'), ('a1', 'b1', 'c2'), ('a1', 'b2', 'c1'), ...
('a3', 'b2', 'c2'), ('a3', 'b3', 'c1'), ('a3', 'b3', 'c2')]
For 18 combinations (3*3*2)
However, how can I 'link' parameters A and B such that each returned set contains only 'an','bn' elements? That is, I have tried:
ABprecombine = zip(A,B)
limited_combinations = list(itertools.product(ABprecombine,C))
Which returns
[(('a1', 'b1'), 'c1'), (('a1', 'b1'), 'c2'), (('a2', 'b2'), 'c1'),
(('a2', 'b2'), 'c2'), (('a3', 'b3'), 'c1'), (('a3', 'b3'), 'c2')]
This is the six (3*1*2) desired products, but obviously due to the way I created it I now have an extra tuple.
Of course I could generate all combinations and then filter out given ones, but is there a smart way to 'link' parameters as above?

Here, zipping A and B is the right way to go. You can flatten the tuples pretty easily if you want:
limited_combinations = [(a, b, c) for ((a, b), c) in itertools.product(zip(A, B), C)]
If you want more detailed control of what combinations get produced, things can rapidly get more complicated, up to the difficulty of needing to solve NP-hard problems like boolean satisfiability. If that happens, look into existing libraries for that kind of thing.
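For comparison, here is a small sketch that puts the zip-based linking next to the generate-then-filter alternative mentioned in the question. The filter condition (matching everything after the first character) is an assumption made for this example; both approaches give the same six tuples here:

```python
import itertools

A = ['a1', 'a2', 'a3']
B = ['b1', 'b2', 'b3']
C = ['c1', 'c2']

# Link A and B by position, then flatten each ((a, b), c) into (a, b, c).
linked = [(a, b, c) for (a, b), c in itertools.product(zip(A, B), C)]

# Alternative: build all 18 products, then keep those whose A and B suffixes
# agree (assumes the link is "same text after the first character").
filtered = [
    (a, b, c)
    for a, b, c in itertools.product(A, B, C)
    if a[1:] == b[1:]
]

print(linked == filtered)  # True
print(len(linked))         # 6
```

The zip version never builds the unwanted combinations, which matters more as the lists grow.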

lists permutation in python

I have the following lists:
list1 = [ 'A','B','C']
list2 = [ '1', '2' ]
Trying to generate a new list of tuples with the following desired result:
[(A1),(A2),(A1,B1),(A1,B2),(A2,B1),(A2,B2),(A1,B1,C1),(A2,B1,C1)...]
Each tuple will eventually be used to write a single line in an output file.
Note that:
In each tuple, each letter from list1, if defined, must be defined after the preceding letters. for example, if 'B' is defined in a tuple then 'A' must be in the tuple as well and prior to 'B'. tuple (A1,C1) is not desired since 'B' is not defined as well.
Tuples must be unique.
list1 & list2 are just an example and may vary in length.
I tried playing around with itertools, specifically with,
product,
permutations,
combinations
for quite some time. I can't seem to pull it off and I don't even have some code worth sharing.

Take successively larger slices of list1, and use products of products:
from itertools import product
elements = []
for letter in list1:
    elements.append([''.join(c) for c in product(letter, list2)])
    for combo in product(*elements):
        print(combo)
The elements list grows on each iteration, adding another list of letter + number strings to build products from.
This produces:
>>> elements = []
>>> for letter in list1:
...     elements.append([''.join(c) for c in product(letter, list2)])
...     for combo in product(*elements):
...         print(combo)
...
('A1',)
('A2',)
('A1', 'B1')
('A1', 'B2')
('A2', 'B1')
('A2', 'B2')
('A1', 'B1', 'C1')
('A1', 'B1', 'C2')
('A1', 'B2', 'C1')
('A1', 'B2', 'C2')
('A2', 'B1', 'C1')
('A2', 'B1', 'C2')
('A2', 'B2', 'C1')
('A2', 'B2', 'C2')

What about this:
from itertools import product
output = []
for z in [list1[:n+1] for n in range(len(list1))]:
    for y in product(list2, repeat=len(z)):
        output.append(tuple(''.join(u) for u in zip(z, y)))
print(output)

sort multi-dimensional list by another single list in python [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions asking for code must demonstrate a minimal understanding of the problem being solved. Include attempted solutions, why they didn't work, and the expected results. See also: Stack Overflow question checklist
Closed 9 years ago.
I have lists like this.
first : (apple, durian, cherry, egg, banana)
second : ((banana,b1,b2,b3,b4),
(durian,d1,d2,d3,d4),
(apple,a1,a2,a3,a4),
(egg,e1,e2,e3,e4),
(cherry,c1,c2,c3,c4))
I want to arrange second list using first list.
So I expect this.
((apple,a1,a2,a3,a4),
(durian,d1,d2,d3,d4),
(cherry,c1,c2,c3,c4),
(egg,e1,e2,e3,e4),
(banana,b1,b2,b3,b4))
please let me know how to do this.
thanks.

First of all, those are tuples; secondly, the samples you gave are not actually strings, so I quoted them for you.
Now let's convert it to a dictionary first:
data = [('banana','b1','b2','b3','b4'),
        ('durian','d1','d2','d3','d4'),
        ('apple','a1','a2','a3','a4'),
        ('egg','e1','e2','e3','e4'),
        ('cherry','c1','c2','c3','c4')]
data = {t[0]: t for t in data}  # build a dict keyed by the first element of each tuple
Now we have our selector:
selector = ['apple', 'durian', 'cherry', 'egg', 'banana']
Then we order and create the list:
results = [data[key] for key in selector]  # order result by selector
Answer:
[('apple', 'a1', 'a2', 'a3', 'a4'),
('durian', 'd1', 'd2', 'd3', 'd4'),
('cherry', 'c1', 'c2', 'c3', 'c4'),
('egg', 'e1', 'e2', 'e3', 'e4'),
('banana', 'b1', 'b2', 'b3', 'b4')]

What about using a dictionary? You could try this:
# first : (apple, durian, cherry, egg, banana)
# second : ((banana,b1,b2,b3,b4), (durian,d1,d2,d3,d4), (apple,a1,a2,a3,a4), (egg,e1,e2,e3,e4), (cherry,c1,c2,c3,c4))
d = {}
for lst in second:
    d[lst[0]] = lst
result = []
for item in first:
    # make sure that the key `item` exists in `d`
    result.append(d[item])
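The comment in the loop above leaves the missing-key case to the caller. One hedged way to handle it is to skip names that have no row and collect them separately. In this sketch the extra name "fig" and the variable names rows_by_name and missing are added purely for illustration:

```python
# "fig" has no matching row; it is included only to show the missing-key case.
first = ("apple", "durian", "cherry", "egg", "banana", "fig")
second = (
    ("banana", "b1", "b2", "b3", "b4"),
    ("durian", "d1", "d2", "d3", "d4"),
    ("apple", "a1", "a2", "a3", "a4"),
    ("egg", "e1", "e2", "e3", "e4"),
    ("cherry", "c1", "c2", "c3", "c4"),
)

# Index the rows by their first element for constant-time lookup.
rows_by_name = {row[0]: row for row in second}

# Keep the order of `first`, skipping names without a row.
result = [rows_by_name[name] for name in first if name in rows_by_name]
missing = [name for name in first if name not in rows_by_name]

print(result)
print(missing)  # ['fig']
```

Whether to skip, raise, or substitute a placeholder for missing names depends on the data; skipping is just one option.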

In [25]: d = {L[0]:list(L[1:]) for L in second}
In [26]: answer = [[k]+d[k] for k in first]
In [27]: answer
Out[27]:
[['apple', 'a1', 'a2', 'a3', 'a4'],
['durian', 'd1', 'd2', 'd3', 'd4'],
['cherry', 'c1', 'c2', 'c3', 'c4'],
['egg', 'e1', 'e2', 'e3', 'e4'],
['banana', 'b1', 'b2', 'b3', 'b4']]
