Multiple Lists into Single List based on Index - python

I'm new to Python and I'm trying to merge three different lists into one list based on the index, as shown in the example below:
All three lists are of same size.
A=['ABC', 'PQR', 'MNO']
B=['X', 'Y', 'Z']
C=['1','2','3']
The output that I wanted is
P=[['ABC', 'X', '1'],['PQR', 'Y', '2'],['MNO', 'Z', '3']]
Thanks in advance.

I usually do it with numpy, as it is a simple transpose, and works with as many lists as you throw at it:
import numpy as np
A = ['ABC', 'PQR', 'MNO']
B = ['X', 'Y', 'Z']
C = ['1', '2', '3']
lists = [A, B, C]
numpy_array = np.array(lists)
transpose = numpy_array.T
transpose_list = transpose.tolist()
print(transpose_list)

Here is the solution for you using the for loop with the range() function:
A=['ABC', 'PQR', 'MNO']
B=['X', 'Y', 'Z']
C=['1','2','3']
list1=[]
for i in range(len(A)):
    list1.append([A[i],B[i],C[i]])
display(list1)
OUTPUT:
[['ABC', 'X', '1'], ['PQR', 'Y', '2'], ['MNO', 'Z', '3']]
Using for loop with the zip() function:
l=[]
for a,b,c in zip(A,B,C):
    l.append([a,b,c])
display(l)
OUTPUT:
[['ABC', 'X', '1'], ['PQR', 'Y', '2'], ['MNO', 'Z', '3']]
You don't want to use a for loop?
Then here is the map() function for you:
result = list(map(lambda a, b, c: [a,b,c] , A, B,C))
display(result)
OUTPUT:
[['ABC', 'X', '1'], ['PQR', 'Y', '2'], ['MNO', 'Z', '3']]

You can use a list comprehension to get the desired output:
a=[[x,y,z] for x,y,z in zip(A,B,C)]
print(a)
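If the number of lists is not fixed at three, the same idea can be written once for any number of lists by unpacking them into zip. The sketch below is only an illustration: the helper name merge_by_index is made up here, and it assumes all lists have the same length (zip silently stops at the shortest input otherwise).

```python
def merge_by_index(*lists):
    """Group the i-th elements of every input list into one sublist."""
    # zip(*lists) yields tuples; convert each to a list to match the desired output
    return [list(group) for group in zip(*lists)]

A = ['ABC', 'PQR', 'MNO']
B = ['X', 'Y', 'Z']
C = ['1', '2', '3']
P = merge_by_index(A, B, C)
print(P)
# [['ABC', 'X', '1'], ['PQR', 'Y', '2'], ['MNO', 'Z', '3']]
```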

Related

Pandas, adding multiple columns of list

I have a dataframe like this one
df = pd.DataFrame({'A' : [['a', 'b', 'c'], ['e', 'f', 'g','g']], 'B' : [['1', '4', 'a'], ['5', 'a']]})
I would like to create another column C that will be a column of lists like the others, but this one will be the "union" of the others.
Something like this :
df = pd.DataFrame({'A' : [['a', 'b', 'c'], ['e', 'f', 'g', 'g']], 'B' : [['1', '4', 'a'], ['5', 'a']], 'C' : [['a', 'b', 'c', '1', '4', 'a'], ['e', 'f', 'g', 'g', '5', 'a']]})
But I have hundreds of columns, and C should be the "union" of all of them, so I don't want to add them by name like this:
df['C'] = df['A'] + df['B']
And I don't want to write a for loop, because the dataframes I am manipulating are too big and I want something fast.
Thank you for helping
As you have lists, you cannot vectorize the operation.
A list comprehension might be the fastest:
from itertools import chain
df['out'] = [list(chain.from_iterable(x[1:])) for x in df.itertuples()]
Example:
              A          B       C                       out
0     [a, b, c]  [1, 4, a]  [x, y]  [a, b, c, 1, 4, a, x, y]
1  [e, f, g, g]     [5, a]     [z]     [e, f, g, g, 5, a, z]
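To see what the comprehension does to each row without needing pandas, here is a small stand-alone sketch. The rows list is a hypothetical stand-in for what df.itertuples() yields (the index first, then one list per column); it mirrors the example above but is not produced by pandas here.

```python
from itertools import chain

# Hypothetical stand-in for df.itertuples(): (index, A, B, C) per row
rows = [
    (0, ['a', 'b', 'c'], ['1', '4', 'a'], ['x', 'y']),
    (1, ['e', 'f', 'g', 'g'], ['5', 'a'], ['z']),
]

# row[1:] drops the index; chain.from_iterable concatenates the remaining lists
out = [list(chain.from_iterable(row[1:])) for row in rows]
print(out)
# [['a', 'b', 'c', '1', '4', 'a', 'x', 'y'], ['e', 'f', 'g', 'g', '5', 'a', 'z']]
```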
As an alternative to @mozway's answer, you could try something like this:
df = pd.DataFrame({'A': [['a', 'b', 'c'], ['e', 'f', 'g','g']], 'B' : [['1', '4', 'a'], ['5', 'a']]})
df['C'] = df.sum(axis=1).astype(str)
Use astype as required for the list contents.
You can also use the apply method:
df['C']=df.apply(lambda x: [' '.join(i) for i in list(x[df.columns.to_list()])], axis=1)

How can I create an algorithm for display all possible character combinations as a list in python?

The inputs for the function are
a list of characters, eg: ['a','1']
length of combinations
The function should output a list of all possible character combinations as a list.
For example, for input ['a','1'] and length of 2, the function should output:
[['a','a'],
['a','1'],
['1','a'],
['1','1']]
and if the length is 3, the output should be:
[['a','a','a'],
['a','a','1'],
['a','1','a'],
['a','1','1'],
['1','a','a'],
['1','a','1'],
['1','1','a'],
['1','1','1']]
You can use itertools.product with the repeat parameter:
from itertools import product
data = ['a', '1']
n = 3
print(list(list(p) for p in product(data, repeat=n)))
This gives an output of:
[['a', 'a', 'a'], ['a', 'a', '1'], ['a', '1', 'a'], ['a', '1', '1'],
['1', 'a', 'a'], ['1', 'a', '1'], ['1', '1', 'a'], ['1', '1', '1']]
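For readers who want to see the algorithm itself rather than rely on the library, the following is one possible hand-written equivalent. It is a sketch, not how itertools implements product internally, and the function name all_sequences is made up for this example. Note that the result has len(chars) ** length entries, so it grows quickly.

```python
def all_sequences(chars, length):
    """Return every sequence of `length` items drawn from `chars`, with repetition."""
    if length == 0:
        return [[]]  # exactly one sequence of length zero: the empty one
    shorter = all_sequences(chars, length - 1)
    # Prepend each character to every shorter sequence, keeping input order
    return [[c] + rest for c in chars for rest in shorter]

result = all_sequences(['a', '1'], 2)
print(result)
# [['a', 'a'], ['a', '1'], ['1', 'a'], ['1', '1']]
```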

How to join innermost elements of a deep nested list using zip

Suppose that I have the following list of lists containing lists:
samples = [
    # First sample
    [
        # Think 'x' as in input variable in ML
        [
            ['A','E'], # Data
            ['B','F'] # Metadata
        ],
        # Think 'y' as in target variable in ML
        [
            ['C','G'], # Data
            ['D','H'], # Metadata
        ]
    ],
    # Second sample
    [
        [
            ['1'],
            ['2']
        ],
        [
            ['3'],
            ['4']
        ]
    ]
]
The output that I'm after looks like the following:
>>> samples
[
    ['A','E','1'], # x.data
    ['B','F','2'], # x.metadata
    ['C','G','3'], # y.data
    ['D','H','4'] # y.metadata
]
My question is: is there a way to use Python's zip function, and maybe some list comprehensions, to achieve this?
I have searched for some solutions, but for example this and this deal with using zip to address different lists, not inner lists.
A way to achieve this could very well be just a simple iteration over the samples like this:
x,x_len,y,y_len=[],[],[],[]
for sample in samples:
    x.append(sample[0][0])
    x_len.append(sample[0][1])
    y.append(sample[1][0])
    y_len.append(sample[1][1])
samples = [
x,
x_len,
y,
y_len
]
I'm still curious if there exists a way to utilize zip over for looping the samples and their nested lists.
Note that the data and metadata can vary in length across samples.
IIUC, one way is to use itertools.chain to flatten the results of zip(samples):
from itertools import chain
new_samples = [
    list(chain.from_iterable(y)) for y in zip(
        *((chain.from_iterable(*x)) for x in zip(samples))
    )
]
print(new_samples)
#[['A', 'E', '1'], ['B', 'F', '2'], ['C', 'G', '3'], ['D', 'H', '4']]
Step by step explanation
1) First call zip on samples:
print(list(zip(samples)))
#[([[['A', 'E'], ['B', 'F']], [['C', 'G'], ['D', 'H']]],),
# ([[['1'], ['2']], [['3'], ['4']]],)]
Notice that in the two lines in the output above, if the elements were flattened, you'd have the structure needed to zip in order to get your final results.
2) Use itertools.chain to flatten (which will be much more efficient than using sum).
print([list(chain.from_iterable(*x)) for x in zip(samples)])
#[[['A', 'E'], ['B', 'F'], ['C', 'G'], ['D', 'H']],
# [['1'], ['2'], ['3'], ['4']]]
3) Now call zip again:
print(list(zip(*((chain.from_iterable(*x)) for x in zip(samples)))))
#[(['A', 'E'], ['1']),
# (['B', 'F'], ['2']),
# (['C', 'G'], ['3']),
# (['D', 'H'], ['4'])]
4) Now you basically have what you want, except the lists are nested. So use itertools.chain again to flatten the final list.
print(
    [
        list(chain.from_iterable(y)) for y in zip(
            *((chain.from_iterable(*x)) for x in zip(samples))
        )
    ]
)
#[['A', 'E', '1'], ['B', 'F', '2'], ['C', 'G', '3'], ['D', 'H', '4']]
Here's another solution. Quite ugly, but it does use zip, even twice!
>>> sum(map(lambda y: list(map(lambda x: sum(x, []), zip(*y))), zip(*samples)), [])
[['A', 'E', '1'], ['B', 'F', '2'], ['C', 'G', '3'], ['D', 'H', '4']]
It is interesting to see how it works, but please don't actually use it; it is both hard to read and algorithmically bad.
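The "algorithmically bad" part comes from sum(..., []): each addition builds a brand-new list, so the total amount of copying grows quadratically with the number of pieces, whereas itertools.chain.from_iterable walks every element once. A small sketch showing that both give the same result (the variable names are illustrative only):

```python
from itertools import chain

pieces = [['A', 'E'], ['1'], ['5']]

# sum() starts from [] and creates a new, longer list at every step
flat_with_sum = sum(pieces, [])

# chain.from_iterable iterates over each piece once without intermediate lists
flat_with_chain = list(chain.from_iterable(pieces))

print(flat_with_sum)
# ['A', 'E', '1', '5']
print(flat_with_sum == flat_with_chain)
# True
```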
You could do:
res = [[y for l in x for y in l] for x in zip(*([x for var in sample for x in var] for sample in samples))]
print([list(i) for i in res])
Gives on your example:
[['A', 'E', '1'], ['B', 'F', '2'], ['C', 'G', '3'], ['D', 'H', '4']]
This basically flattens each "sample" to a list and collects those lists, then unpacks them into zip, and finally flattens each zipped tuple into a list.
That is not the most comfortable data structure to work with. I would advise refactoring the code and choosing something other than triply nested lists to hold the data, but if that is currently not possible, I suggest the following approach:
import itertools
def flatten(iterable):
    yield from itertools.chain.from_iterable(iterable)
result = []
for elements in zip(*map(flatten, samples)):
    result.append(list(flatten(elements)))
For your example it gives:
[['A', 'E', '1'],
['B', 'F', '2'],
['C', 'G', '3'],
['D', 'H', '4']]
Test for more than 2 samples:
samples = [[[['A', 'E'], ['B', 'F']],
[['C', 'G'], ['D', 'H']]],
[[['1'], ['2']],
[['3'], ['4']]],
[[['5'], ['6']],
[['7'], ['8']]]]
gives:
[['A', 'E', '1', '5'],
['B', 'F', '2', '6'],
['C', 'G', '3', '7'],
['D', 'H', '4', '8']]
Explanation:
The flatten generator function simply flattens 1 level of a nested iterable. It is based on itertools.chain.from_iterable function. In map(flatten, samples) we apply this function to each element of samples:
>>> map(flatten, samples)
<map at 0x3c6685fef0> # <-- map object returned, to see result wrap it in `list`:
>>> list(map(flatten, samples))
[<generator object flatten at 0x0000003C67A2F9A8>, # <-- will flatten the 1st sample
<generator object flatten at 0x0000003C67A2FA98>, # <-- ... the 2nd
<generator object flatten at 0x0000003C67A2FB10>] # <-- ... the 3rd and so on if there are more
# We can see what each generator will give by applying `list` on each one of them
>>> list(map(list, map(flatten, samples)))
[[['A', 'E'], ['B', 'F'], ['C', 'G'], ['D', 'H']],
[['1'], ['2'], ['3'], ['4']],
[['5'], ['6'], ['7'], ['8']]]
Next, we can use zip to iterate over the flattened samples. Note that we cannot apply it to the map object directly:
>>> list(zip(map(flatten, samples)))
[(<generator object flatten at 0x0000003C66944138>,),
(<generator object flatten at 0x0000003C669441B0>,),
(<generator object flatten at 0x0000003C66944228>,)]
We should unpack it first:
>>> list(zip(*map(flatten, samples)))
[(['A', 'E'], ['1'], ['5']),
(['B', 'F'], ['2'], ['6']),
(['C', 'G'], ['3'], ['7']),
(['D', 'H'], ['4'], ['8'])]
# or in a for loop:
>>> for elements in zip(*map(flatten, samples)):
...     print(elements)
(['A', 'E'], ['1'], ['5'])
(['B', 'F'], ['2'], ['6'])
(['C', 'G'], ['3'], ['7'])
(['D', 'H'], ['4'], ['8'])
Finally, we just have to join all the lists in each elements tuple together. We can use the same flatten function for that:
>>> for elements in zip(*map(flatten, samples)):
...     print(list(flatten(elements)))
['A', 'E', '1', '5']
['B', 'F', '2', '6']
['C', 'G', '3', '7']
['D', 'H', '4', '8']
And you just have to put it all back in a list as shown in the first code sample.

Select sublist if element present in second list

I have two lists:
A = [['67', '75', 'X'], ['85','72', 'V'], ['1','2', 'Y'], ['3','5', 'X', 'Y']]
B = ['X', 'Y']
I want to create a third list, C, that contains the sublists of A that include any of the elements defined in B (and/or).
C = [['67', '75', 'X'],['1','2', 'Y'], ['3','5', 'X', 'Y']]
I have tried:
C = [i for i in B if i in A]
But it didn't work, I get an empty C list. Please let me know what would be the best approach to obtain C.
Use a list comprehension that checks whether any of the elements of B is in each sublist of A:
A = [['67', '75', 'X'], ['85','72', 'V'], ['1','2', 'Y'], ['3','5', 'X', 'Y']]
B = ['X', 'Y']
C = [x for x in A if any(y in x for y in B)]
# [['67', '75', 'X'], ['1', '2', 'Y'], ['3', '5', 'X', 'Y']]
C = [y for y in A for x in B if x in y]
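This should do the trick, but note that a sublist containing more than one element of B (such as ['3','5','X','Y']) is added once per match, so it appears twice in C.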
You can also use this:
C = list()
for i in A:
    if B[0] in i or B[1] in i:
        C.append(i)
You can also use set intersection to check if there is any element in common between the element e (sublist) of A and b defined as set(B).
So,
b = set(B)
C = [ e for e in A if b.intersection(set(e)) ]
#=> [['67', '75', 'X'], ['1', '2', 'Y'], ['3', '5', 'X', 'Y']]
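A closely related variant, offered here as a sketch rather than as part of the answers above, is set.isdisjoint. It accepts any iterable, so there is no need to build a set from each sublist:

```python
A = [['67', '75', 'X'], ['85', '72', 'V'], ['1', '2', 'Y'], ['3', '5', 'X', 'Y']]
B = ['X', 'Y']

b = set(B)
# isdisjoint returns True when there is no common element, so negate it
C = [e for e in A if not b.isdisjoint(e)]
print(C)
# [['67', '75', 'X'], ['1', '2', 'Y'], ['3', '5', 'X', 'Y']]
```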

zip unknown number of lists with Python for more than one list

I need to do something very similar to what was asked in "How would you zip an unknown number of lists in Python?", but in a more general case.
I have the following set up:
a = [['a', '0', 'b', 'f', '78']]
b = [['3', 'w', 'hh', '12', '8']]
c = [['g', '7', '1', 'a0', '9'], ['45', '4', 'fe', 'h', 'k']]
I need to zip these lists together to obtain:
abc = [['a', '3', 'g', '45'], ['0', 'w', '7', '4'], ['b', 'hh', '1', 'fe'], ['f', '12', 'a0', 'h'], ['78', '8', '9', 'k']]
which I can generate with:
zip(a[0], b[0], c[0], c[1])
But the lists a,b,c contain a number of sublists that will vary for successive runs, so this "manual" way of expanding them won't work.
The closest I can get is:
zip(a[0], b[0], *c)
Since unpacking a list with * in any other position than the last is not allowed, the "ideal" expression:
zip(*a, *b, *c)
does not work.
How could I zip together a number of lists with an unknown number of sublists?
itertools.chain to the rescue:
import itertools
z = list(zip(*itertools.chain(a, b, c)))
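Here is a quick check of this against the lists from the question. Note also that since Python 3.5 (PEP 448) several * unpackings are allowed in a single call, so the "ideal" expression from the question works on current interpreters. The sketch below assumes Python 3.5 or newer and converts the tuples to lists only to match the desired output.

```python
import itertools

a = [['a', '0', 'b', 'f', '78']]
b = [['3', 'w', 'hh', '12', '8']]
c = [['g', '7', '1', 'a0', '9'], ['45', '4', 'fe', 'h', 'k']]

# chain(a, b, c) yields every sublist in turn; * passes them to zip as separate arguments
z = [list(t) for t in zip(*itertools.chain(a, b, c))]

# Equivalent on Python 3.5+: multiple unpackings in one call
z_unpacked = [list(t) for t in zip(*a, *b, *c)]

print(z)
# [['a', '3', 'g', '45'], ['0', 'w', '7', '4'], ['b', 'hh', '1', 'fe'], ['f', '12', 'a0', 'h'], ['78', '8', '9', 'k']]
```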
