Using a Python list comprehension a bit like a zip - python

I'm really bad at writing Python list comprehensions with more than one "for", but I want to get better at it. I want to know for sure whether the line
>>> [S[j]+str(i) for i in range(1,11) for j in range(3) for S in "ABCD"]
can be amended to return something like ["A1","B1","C1","D1","A2","B2","C2","D2"...(etc.)]
and, if not, whether there is a list comprehension that can return that list, namely a list of strings of all the combinations of "ABCD" and the numbers from 1 to 10.

You have too many loops there. You don't need j at all.
This does the trick:
[S+str(i) for i in range(1,11) for S in "ABCD"]

I like to read a list comprehension with more than one for as a nested loop. Treat each following for as a loop nested inside the previous one and it becomes a whole lot easier. To add to Daniel's answer:
[S+str(i) for i in range(1,11) for S in "ABCD"]
is nothing more than:
new_loop = []
for i in range(1, 11):
    for S in "ABCD":
        new_loop.append(S + str(i))
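A small sketch to check the equivalence described above, and to show what happens when the two for clauses are swapped. The variable names are only illustrative.

```python
# Comprehension: the leftmost "for" is the outermost loop.
numbers_outer = [s + str(i) for i in range(1, 11) for s in "ABCD"]

# The same thing written as explicit nested loops.
looped = []
for i in range(1, 11):
    for s in "ABCD":
        looped.append(s + str(i))

# Swapping the two "for" clauses makes the letters the outer loop.
letters_outer = [s + str(i) for s in "ABCD" for i in range(1, 11)]

print(numbers_outer == looped)  # True
print(numbers_outer[:5])        # ['A1', 'B1', 'C1', 'D1', 'A2']
print(letters_outer[:5])        # ['A1', 'A2', 'A3', 'A4', 'A5']
```

Both orders contain the same 40 strings; only the sequence differs.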

You may use itertools.product like this:
import itertools
print([item[1] + str(item[0]) for item in itertools.product(range(1, 11), "ABCD")])
Output:
['A1', 'B1', 'C1', 'D1', 'A2', 'B2', 'C2', 'D2', 'A3', 'B3', 'C3', 'D3', 'A4',
'B4', 'C4', 'D4', 'A5', 'B5', 'C5', 'D5', 'A6', 'B6', 'C6', 'D6', 'A7', 'B7',
'C7', 'D7', 'A8', 'B8', 'C8', 'D8', 'A9', 'B9', 'C9', 'D9', 'A10', 'B10', 'C10',
'D10']

Every time you think of combining all the elements of an iterable with all the elements of another iterable, think itertools.product. It is the Cartesian product of two sets (or lists).
I've found a solution that is slightly faster than the ones presented here so far, and more than 2x faster than @daniel's solution (although his solution looks far more elegant):
import itertools
[x + y for (x,y) in (itertools.product('ABCD', map(str,range(1,5))))]
The difference here is that I cast the ints to strings using map. Applying a function over a whole sequence is usually faster than applying it to individual items.
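If you want to check the speed claim on your own machine, a rough timeit sketch could look like this. The function names are made up for the comparison, and the numbers will vary with the Python version and hardware, so no timings are quoted here.

```python
import itertools
import timeit


def with_comprehension():
    # Plain nested comprehension, letters as the outer loop.
    return [s + str(i) for s in "ABCD" for i in range(1, 5)]


def with_product_and_map():
    # product() plus map(str, ...) as in the answer above.
    return [x + y for x, y in itertools.product("ABCD", map(str, range(1, 5)))]


# Both variants build the same list, so the comparison is fair.
same = with_comprehension() == with_product_and_map()

t_comp = timeit.timeit(with_comprehension, number=10_000)
t_prod = timeit.timeit(with_product_and_map, number=10_000)
print(same)
print(f"comprehension: {t_comp:.4f}s, product+map: {t_prod:.4f}s")
```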
And a general tip when dealing with complex comprehensions:
When you have lots of fors and lots of conditionals inside your comprehension, break it into several lines, like this:
[S[j]+str(i) for i in range(1,11)
             for j in range(3)
             for S in "ABCD"]
In this case the gain in readability isn't big, but when you have lots of conditionals and lots of fors, it makes a big difference. It's exactly like writing nested for loops and if statements, but without the ":" and the indentation.
See the code using regular fors:
ans = []
for i in range(1,11):
    for j in range(3):
        for S in "ABCD":
            # same as the question's line: S is a single character,
            # so S[j] raises IndexError as soon as j > 0
            ans.append(S[j] + str(i))
Almost the same thing :)
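To make the tip about conditionals concrete, here is a sketch with two filters. The filters themselves (even numbers only, skip "C") are hypothetical and only there to show the layout.

```python
# One clause per line; each "if" filters the "for" directly above it.
pairs = [
    s + str(i)
    for i in range(1, 11)
    if i % 2 == 0      # keep even numbers only
    for s in "ABCD"
    if s != "C"        # skip one letter
]

# The same logic as ordinary nested statements.
looped = []
for i in range(1, 11):
    if i % 2 == 0:
        for s in "ABCD":
            if s != "C":
                looped.append(s + str(i))

print(pairs[:4])        # ['A2', 'B2', 'D2', 'A4']
print(pairs == looped)  # True
```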

Why not use itertools.product?
>>> import itertools
>>> [ i[0] + str(i[1]) for i in itertools.product('ABCD', range(1,5))]
['A1', 'A2', 'A3', 'A4', 'B1', 'B2', 'B3', 'B4', 'C1', 'C2', 'C3', 'C4', 'D1', 'D2', 'D3', 'D4']

Related

Concatenating one-dimensional numpy arrays with a variable-size numpy array in a loop

import numpy as np
nNumbers = [1,2,3]
baseVariables = ['a','b','c','d','e']
arr = np.empty(0)
for i in nNumbers:
    x = np.empty(0)
    for v in baseVariables:
        x = np.append(x, y['result'][i][v])
    print(x)
    arr = np.concatenate((arr, x))
I have JSON input stored in y and need to filter some variables out of it. The above code works in that it gives me the output in an array, but only a one-dimensional one. I want the output in a two-dimensional array like:
[['q','qr','qe','qw','etc'], ['','','','',''], ['','','','','']]
I have tried various ways but have not been able to figure it out. Any feedback on how to get the desired output format would be greatly appreciated.
A correct basic Python way of making a nested list of strings:
In [57]: nNumbers = [1,2,3]
    ...: baseVariables = ['a','b','c','d','e']
In [58]: alist = []
    ...: for i in nNumbers:
    ...:     blist = []
    ...:     for v in baseVariables:
    ...:         blist.append(v+str(i))
    ...:     alist.append(blist)
    ...:
In [59]: alist
Out[59]:
[['a1', 'b1', 'c1', 'd1', 'e1'],
 ['a2', 'b2', 'c2', 'd2', 'e2'],
 ['a3', 'b3', 'c3', 'd3', 'e3']]
That can be turned into an array if necessary - though numpy doesn't provide much added utility for strings:
In [60]: np.array(alist)
Out[60]:
array([['a1', 'b1', 'c1', 'd1', 'e1'],
       ['a2', 'b2', 'c2', 'd2', 'e2'],
       ['a3', 'b3', 'c3', 'd3', 'e3']], dtype='<U2')
Or in a compact list comprehension form:
In [61]: [[v+str(i) for v in baseVariables] for i in nNumbers]
Out[61]:
[['a1', 'b1', 'c1', 'd1', 'e1'],
 ['a2', 'b2', 'c2', 'd2', 'e2'],
 ['a3', 'b3', 'c3', 'd3', 'e3']]
You are starting with lists! And making strings! And selecting items from a JSON, with y['result'][i][v]. None of that benefits from using numpy, especially not the repeated use of np.append and np.concatenate.
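Applied to the lookup from the question, the nested comprehension might look like the sketch below. The question never shows the JSON, so the shape of y here (a "result" list of dicts indexed by position) is purely an assumption for illustration.

```python
nNumbers = [1, 2, 3]
baseVariables = ['a', 'b', 'c', 'd', 'e']

# Hypothetical stand-in for the parsed JSON; the real structure is unknown.
# Each entry carries an extra key 'f' that should be filtered out.
y = {'result': [{key: f'{key}{n}' for key in 'abcdef'} for n in range(4)]}

# One inner list per number, one item per wanted variable.
rows = [[y['result'][i][v] for v in baseVariables] for i in nNumbers]

for row in rows:
    print(row)
# ['a1', 'b1', 'c1', 'd1', 'e1']
# ['a2', 'b2', 'c2', 'd2', 'e2']
# ['a3', 'b3', 'c3', 'd3', 'e3']
```

Since every row has the same length, np.array(rows) would give a two-dimensional array if one is really needed.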
Could you provide an example of the JSON? It sounds like you basically want to:
Filter the JSON
Flatten the JSON
Depending on what your output example means, you might not want to filter, but rather replace certain values with empty values. Is that correct?
Please note that pandas has very powerful out-of-the-box options to handle and, in particular, flatten JSON: https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html#io-json-reader. One approach could be to first load the data into pandas and filter it from there. Flattening JSON can also be done by iterating over it like so:
def flatten_json(y):
    out = {}
    def flatten(x, name=''):
        if type(x) is dict:
            for a in x:
                flatten(x[a], name + a + '_')
        elif type(x) is list:
            i = 0
            for a in x:
                flatten(a, name + str(i) + '_')
                i += 1
        else:
            out[name[:-1]] = x
    flatten(y)
    return out
I got this code from: https://towardsdatascience.com/flattening-json-objects-in-python-f5343c794b10. The author explains some challenges of flattening JSON. Of course, you can put an if statement into the function for your filtering needs. I hope this can get you started at least!
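Here is a sketch of how the flattening function behaves on a small input, followed by a simple filter on the flattened keys. The function is restated with isinstance and enumerate (a light variation, same behaviour for dicts with string keys), and the sample data and the "_a" filter are invented for the example.

```python
def flatten_json(y):
    """Flatten nested dicts/lists into one dict with '_'-joined keys."""
    out = {}

    def flatten(x, name=""):
        if isinstance(x, dict):
            for key in x:
                flatten(x[key], name + key + "_")
        elif isinstance(x, list):
            for index, item in enumerate(x):
                flatten(item, name + str(index) + "_")
        else:
            out[name[:-1]] = x  # drop the trailing "_"

    flatten(y)
    return out


# Hypothetical sample, not the asker's real JSON.
sample = {"result": [{"a": "q", "b": "qr"}, {"a": "x", "b": "y"}]}

flat = flatten_json(sample)
print(flat)
# {'result_0_a': 'q', 'result_0_b': 'qr', 'result_1_a': 'x', 'result_1_b': 'y'}

# Filtering after flattening: keep only the "a" fields.
only_a = {k: v for k, v in flat.items() if k.endswith("_a")}
print(only_a)
# {'result_0_a': 'q', 'result_1_a': 'x'}
```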

Creating a list of 6 every possible combination

How would I create lists of 6 columns from a dataframe of 21 columns? I need to create every single possible combination and store these combinations in a dataframe.
Suppose
import pandas as pd
lst = ['c1', 'c2', 'c3', 'c4','c5', 'c6', 'c7','c8', 'c9', 'c10', 'c11','c12', 'c13', 'c14','c15', 'c16', 'c17', 'c18','c19', 'c20', 'c21']
# Calling DataFrame constructor on list
df = pd.DataFrame(lst)
# Some list generator
# adds each new list to the cdf dataframe
The final df should be something like this (not sure if I wrote this in the right syntax): a dataframe with 1 column holding lists of 6 elements
cdf = [['c1', 'c2', 'c3', 'c4','c5', 'c6'],['c7','c8', 'c9', 'c10', 'c11','c12'], ['c13', 'c14','c15', 'c16', 'c17', 'c18']...]
Thank you!
I would do the combinations work in pure python, using the itertools.combinations function, documentation here. You can then import that list of tuples into pandas if desired.
Example code, which generates the combinations (combos are in sorted order, no repeated elements), prints how many combinations there are, and shows the first 2 and last 2 as examples:
import itertools
lst = ['c1', 'c2', 'c3', 'c4','c5', 'c6', 'c7','c8', 'c9', 'c10', 'c11','c12', 'c13', 'c14','c15', 'c16', 'c17', 'c18','c19', 'c20', 'c21']
combos = itertools.combinations(lst, 6)
combos_list = list(combos)
print(f'{len(combos_list)} Combinations')
print(combos_list[0])
print(combos_list[1])
print(combos_list[-2])
print(combos_list[-1])
This generates output:
54264 Combinations
('c1', 'c2', 'c3', 'c4', 'c5', 'c6')
('c1', 'c2', 'c3', 'c4', 'c5', 'c7')
('c15', 'c17', 'c18', 'c19', 'c20', 'c21')
('c16', 'c17', 'c18', 'c19', 'c20', 'c21')
Happy Coding!
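As a side note, the count can be checked without building the whole list, and the first few combinations can be peeked at lazily. This is only a sketch with standard-library calls (math.comb needs Python 3.8 or later); the variable names are illustrative.

```python
import itertools
import math

lst = [f"c{n}" for n in range(1, 22)]  # 'c1' .. 'c21'

# Number of ways to choose 6 of 21 items, computed directly.
total = math.comb(len(lst), 6)
print(total)  # 54264

# combinations() is lazy, so islice can take a few without materializing all.
first_three = list(itertools.islice(itertools.combinations(lst, 6), 3))
for combo in first_three:
    print(combo)
# ('c1', 'c2', 'c3', 'c4', 'c5', 'c6')
# ('c1', 'c2', 'c3', 'c4', 'c5', 'c7')
# ('c1', 'c2', 'c3', 'c4', 'c5', 'c8')
```

The full list of tuples can still be passed to the pandas DataFrame constructor when all combinations are needed at once.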

How can I detect common elements in lists and group lists with at least 1 common element?

I have a Dataframe with 1 column (+the index) containing lists of sublists or elements.
I would like to detect common elements in the lists/sublists and group the lists with at least 1 common element in order to have only lists of elements without any common elements.
The lists/sublists currently look like this (example for 4 rows):
Num_ID
Row1 [['A1','A2','A3'],['A1','B1','B2','C3','D1']]
Row2 ['A1','E2','E3']
Row3 [['B4','B5','G4'],['B6','B4']]
Row4 ['B4','C9']
The expected result is n lists with no common elements (example for the first 2):
['A1','A2','A3','B1','B2','C3','D1','E2','E3']
['B4','B5','B6','C9','G4']
You can use NetworkX's connected_components method for this. Here's how I'd approach this adapting this solution:
import networkx as nx
import pandas as pd
from itertools import combinations, chain
df = pd.DataFrame({'Num_ID': [[['A1','A2','A3'],['A1','B1','B2','C3','D1']],
                              ['A1','E2','E3'],
                              [['B4','B5','G4'],['B6','B4']],
                              ['B4','C9']]})
Start by flattening the sublists in each list:
L = [[*chain.from_iterable(i)] if isinstance(i[0], list) else i
     for i in df.Num_ID.values.tolist()]
[['A1', 'A2', 'A3', 'A1', 'B1', 'B2', 'C3', 'D1'],
 ['A1', 'E2', 'E3'],
 ['B4', 'B5', 'G4', 'B6', 'B4'],
 ['B4', 'C9']]
Given that the lists/sublists have more than 2 elements, you can get all the length 2 combinations from each sublist and use these as the network edges (note that edges can only connect two nodes):
L2_nested = [list(combinations(l,2)) for l in L]
L2 = list(chain.from_iterable(L2_nested))
Generate a graph, and add your list as the graph edges using add_edges_from. Then use connected_components, which will precisely give you a list of sets of the connected components in the graph:
G=nx.Graph()
G.add_edges_from(L2)
list(nx.connected_components(G))
[{'A1', 'A2', 'A3', 'B1', 'B2', 'C3', 'D1', 'E2', 'E3'},
{'B4', 'B5', 'B6', 'C9', 'G4'}]
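If installing NetworkX is not an option, the same grouping can be sketched with a small union-find (disjoint-set) structure from plain Python. This is a different technique from the answer above, not its method, and the function name is made up for the example. It starts from the already flattened list L.

```python
def group_by_common_elements(lists):
    """Merge lists that share at least one element into disjoint sets."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for items in lists:
        if not items:
            continue
        first = items[0]
        find(first)  # registers single-element lists too
        for other in items[1:]:
            union(first, other)

    groups = {}
    for item in list(parent):
        groups.setdefault(find(item), set()).add(item)
    return list(groups.values())


L = [
    ['A1', 'A2', 'A3', 'A1', 'B1', 'B2', 'C3', 'D1'],
    ['A1', 'E2', 'E3'],
    ['B4', 'B5', 'G4', 'B6', 'B4'],
    ['B4', 'C9'],
]
groups = group_by_common_elements(L)
print([sorted(g) for g in groups])
# [['A1', 'A2', 'A3', 'B1', 'B2', 'C3', 'D1', 'E2', 'E3'],
#  ['B4', 'B5', 'B6', 'C9', 'G4']]
```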

How can I improve this heavily nested for-loop?

I have a function which I'd like to optimize, if possible. But I cannot easily tell if there's a better way to refactor (and optimize) this...
Suppose,
keys_in_order = ['A', 'B', 'C', 'D', 'E']
key_table = { 'A': {'A1': 'one', 'A2': 'two', 'A3': 'three', 'A4': 'four'},
              'B': {'B1': 'one-one', 'B2': 'two-two', 'B3': 'three-three'},
              ... # mapping for 'C', 'D' here
              'E': {'E1': 'one-one', 'E2': 'two-two', 'E3': 'three-three', 'E6': 'six-six'}
            }
The purpose is to feed the above two parameters to the function as below:
def generate_all_possible_key_combinations(keys_in_order, key_table):
    first_key = keys_in_order[0]
    second_key = keys_in_order[1]
    third_key = keys_in_order[2]
    fourth_key = keys_in_order[3]
    fifth_key = keys_in_order[4]
    table_out = [['Demo Group', first_key, second_key, third_key, fourth_key, fifth_key]] # just the header row so that we can write to a CSV file later
    for k1, v1 in key_table[first_key].items():
        for k2, v2 in key_table[second_key].items():
            for k3, v3 in key_table[third_key].items():
                for k4, v4 in key_table[fourth_key].items():
                    for k5, v5 in key_table[fifth_key].items():
                        demo_gp = k1 + k2 + k3 + k4 + k5
                        table_out.append([demo_gp, v1, v2, v3, v4, v5])
    return table_out
The goal is to have a table with all possible combinations of sub-keys (that is, 'A1B1C1D1E1', 'A1B1C1D1E2', 'A1B1C1D1E3', etc.) along with their corresponding values in key_table.
To me, the current code with five heavily nested loops over the dict key_table is ugly, not to mention computationally inefficient. Is there a way to improve this? I hope folks from code_review might be able to shed some light on how I might go about it. Thank you!
I have implemented this with an alternative method. Consider key_table as your main dictionary.
My logic is as follows.
First, get all the possible sub-keys from the main dict:
In [1]: [i.keys() for i in key_table.values()]
Out[1]:
[['A1', 'A3', 'A2', 'A4'],
 ['C3', 'C2', 'C1'],
 ['B1', 'B2', 'B3'],
 ['E6', 'E1', 'E3', 'E2'],
 ['D2', 'D3', 'D1']]
Then I flattened this list of lists into a single list:
In [2]: print([item for sublist in [i.keys() for i in key_table.values()] for item in sublist])
['A1', 'A3', 'A2', 'A4', 'C3', 'C2', 'C1', 'B1', 'B2', 'B3', 'E6', 'E1', 'E3', 'E2', 'D2', 'D3', 'D1']
Using itertools.combinations, I built the combinations of all possible values. There are 5 keys, so I hard-coded that number; you can replace it with len([i.keys() for i in key_table.values()]) if you have more keys. Here is an example of itertools.combinations so you can see how it works:
In [83]: for i in itertools.combinations(['A1','B1','C1'],2):
   ....:     print(i)
   ....:
('A1', 'B1')
('A1', 'C1')
('B1', 'C1')
Here is the full code as a one-line implementation:
for item in itertools.combinations([item for sublist in [i.keys() for i in key_table.values()] for item in sublist],5):
    print(''.join(item))
Note that this draws any 5 sub-keys from the flattened list, so it also yields combinations that take several sub-keys from the same key (for example 'A1A3A2A4C3'), which the nested loops in the question never produce.
Some optimizations:
The various key_table[?].items() could be computed before the nested loop.
You could compute partials of demo_gp when they are available: demo_gp12 = k1 + k2, demo_gp123 = demo_gp12 + k3, etc. A similar thing could be done with the array of vs.
As @JohnColeman suggested, itertools would be a good place to look to simplify it.
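Following the itertools suggestion, one possible sketch replaces the five nested loops with itertools.product, which takes one item from each key's dict in order. The function keeps the name from the question, but the small two-key table below is hypothetical test data. Note that this mainly removes the nesting and the hard-coded count of five; the number of output rows is still the product of the dict sizes.

```python
import itertools


def generate_all_possible_key_combinations(keys_in_order, key_table):
    # Header row, so the table can be written to a CSV file later.
    table_out = [["Demo Group", *keys_in_order]]
    # One (sub_key, value) view per key, in the requested order.
    per_key_items = [key_table[key].items() for key in keys_in_order]
    for combo in itertools.product(*per_key_items):
        sub_keys, values = zip(*combo)
        table_out.append(["".join(sub_keys), *values])
    return table_out


# Hypothetical small input, just to show the shape of the result.
keys_in_order = ["A", "B"]
key_table = {
    "A": {"A1": "one", "A2": "two"},
    "B": {"B1": "one-one", "B2": "two-two"},
}

table = generate_all_possible_key_combinations(keys_in_order, key_table)
for row in table:
    print(row)
# ['Demo Group', 'A', 'B']
# ['A1B1', 'one', 'one-one']
# ['A1B2', 'one', 'two-two']
# ['A2B1', 'two', 'one-one']
# ['A2B2', 'two', 'two-two']
```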

Sort a multi-dimensional list by another single list in Python [closed]

I have lists like this.
first : (apple, durian, cherry, egg, banana)
second : ((banana,b1,b2,b3,b4),
(durian,d1,d2,d3,d4),
(apple,a1,a2,a3,a4),
(egg,e1,e2,e3,e4),
(cherry,c1,c2,c3,c4))
I want to arrange the second list using the order of the first list.
So I expect this:
((apple,a1,a2,a3,a4),
 (durian,d1,d2,d3,d4),
 (cherry,c1,c2,c3,c4),
 (egg,e1,e2,e3,e4),
 (banana,b1,b2,b3,b4))
Please let me know how to do this.
Thanks.
First of all, those are tuples. Secondly, the samples you gave are not actually strings, so I quoted them for you.
Now let's convert it to a dictionary first:
data = [('banana','b1','b2','b3','b4'),
        ('durian','d1','d2','d3','d4'),
        ('apple','a1','a2','a3','a4'),
        ('egg','e1','e2','e3','e4'),
        ('cherry','c1','c2','c3','c4')]
data = {t[0]:t for t in data} # make dictionary with dictionary comprehension.
Now we have our selector:
selector = ['apple', 'durian', 'cherry', 'egg', 'banana']
Then we order and create the list:
results = [data[key] for key in selector] # order result by selector
Answer:
[('apple', 'a1', 'a2', 'a3', 'a4'),
('durian', 'd1', 'd2', 'd3', 'd4'),
('cherry', 'c1', 'c2', 'c3', 'c4'),
('egg', 'e1', 'e2', 'e3', 'e4'),
('banana', 'b1', 'b2', 'b3', 'b4')]
What about using a dictionary? You could try this:
# first : (apple, durian, cherry, egg, banana)
# second : ((banana,b1,b2,b3,b4), (durian,d1,d2,d3,d4), (apple,a1,a2,a3,a4), (egg,e1,e2,e3,e4), (cherry,c1,c2,c3,c4))
d = {}
for lst in second:
    d[lst[0]] = lst
result = []
for item in first:
    # make sure that key `item` exists in `d`
    result.append(d[item])
In [25]: d = {L[0]:list(L[1:]) for L in second}
In [26]: answer = [[k]+d[k] for k in first]
In [27]: answer
Out[27]:
[['apple', 'a1', 'a2', 'a3', 'a4'],
['durian', 'd1', 'd2', 'd3', 'd4'],
['cherry', 'c1', 'c2', 'c3', 'c4'],
['egg', 'e1', 'e2', 'e3', 'e4'],
['banana', 'b1', 'b2', 'b3', 'b4']]
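The dictionary approaches above assume every name in first has a row in second. A sketch that tolerates missing names, and an equivalent sorted() variant, might look like this. The extra "fig" entry is hypothetical and only there to exercise the missing-key case.

```python
first = ("apple", "durian", "cherry", "egg", "banana", "fig")  # "fig" has no row
second = (
    ("banana", "b1", "b2", "b3", "b4"),
    ("durian", "d1", "d2", "d3", "d4"),
    ("apple", "a1", "a2", "a3", "a4"),
    ("egg", "e1", "e2", "e3", "e4"),
    ("cherry", "c1", "c2", "c3", "c4"),
)

# Index the rows by their first element.
by_name = {row[0]: row for row in second}

# Keep the order of `first`, skipping names without a row.
ordered = [by_name[name] for name in first if name in by_name]
missing = [name for name in first if name not in by_name]

# Alternative: sort the rows by each name's position in `first`.
position = {name: index for index, name in enumerate(first)}
ordered_by_sort = sorted(second, key=lambda row: position[row[0]])

print([row[0] for row in ordered])  # ['apple', 'durian', 'cherry', 'egg', 'banana']
print(missing)                      # ['fig']
print(ordered == ordered_by_sort)   # True
```

The sorted() variant raises KeyError if a row's name is absent from first, so which version fits depends on which side may be incomplete.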
