Splitting a Python list into sublists

I have a list that I want to split into multiple sublists
acq=['A1', 'A2', 'D', 'A3', 'A4', 'A5', 'D', 'A6']
ll=[]
for k,v in enumerate(acq):
    if v == 'D':
        continue  # continue here
    ll.append(v)
print(ll)
The code above just gives one expanded list with everything appended, which is not what I am looking for. My desired output is:
['A1', 'A2']
['A3', 'A4', 'A5']
['A6']

Try itertools.groupby:
from itertools import groupby
acq = ["A1", "A2", "D", "A3", "A4", "A5", "D", "A6"]
for v, g in groupby(acq, lambda v: v == "D"):
    if not v:
        print(list(g))
Prints:
['A1', 'A2']
['A3', 'A4', 'A5']
['A6']
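
If a list of lists is wanted rather than printed output, the same groupby idea can be written as a comprehension. This is only a sketch; the names is_sep and sublists are made up for illustration.

```python
from itertools import groupby

acq = ["A1", "A2", "D", "A3", "A4", "A5", "D", "A6"]

# groupby yields (key, group) pairs; the key is True for runs of "D"
# and False for runs of everything else. Keep only the non-"D" runs.
sublists = [list(group)
            for is_sep, group in groupby(acq, key=lambda v: v == "D")
            if not is_sep]
print(sublists)
# [['A1', 'A2'], ['A3', 'A4', 'A5'], ['A6']]
```

Because consecutive "D" items fall into one group, adjacent separators do not produce empty sublists with this approach.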

No additional library needed, and it returns a list of lists:
acq = ['A1', 'A2', 'D', 'A3', 'A4', 'A5', 'D', 'A6']
all_list = []
ll = []
for i in acq:
    if i == 'D':
        all_list.append(ll)
        ll = []
        continue
    ll.append(i)
all_list.append(ll)
print(*all_list, sep='\n')
Prints:
['A1', 'A2']
['A3', 'A4', 'A5']
['A6']

acq = ['A1', 'A2', 'D', 'A3', 'A4', 'A5', 'D', 'A6']
ll = []
temp = []
for k, v in enumerate(acq):
    if v == 'D':
        ll.append(temp)  # close the current sublist at each 'D'
        temp = []
        continue  # continue here
    temp.append(v)
ll.append(temp)  # append the last sublist
print(ll)
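
The loop-based answers above can also be wrapped in a small reusable generator. This is a sketch, not a definitive implementation: the name split_on and its sep parameter are hypothetical, and unlike the loops above it skips empty groups (for example when two 'D' markers are adjacent).

```python
def split_on(items, sep):
    """Yield sublists of items, cut at every element equal to sep (hypothetical helper)."""
    chunk = []
    for item in items:
        if item == sep:
            if chunk:      # skip empty groups from leading or adjacent separators
                yield chunk
            chunk = []
        else:
            chunk.append(item)
    if chunk:              # emit the trailing group, if there is one
        yield chunk


acq = ['A1', 'A2', 'D', 'A3', 'A4', 'A5', 'D', 'A6']
result = list(split_on(acq, 'D'))
print(result)
# [['A1', 'A2'], ['A3', 'A4', 'A5'], ['A6']]
```

Dropping the two `if chunk:` checks would keep the empty sublists instead, which matches the behaviour of the plain loops.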

Related

Split data in list based on condition

I have following list :
data = ['A1', 'C3', 'B2', 'A2', 'D3', 'C2', 'A3', 'D2', 'C1', 'B1', 'D1', 'B3']
I want to split the list such that
split1 = ['A1', 'C3', 'B2', 'A2', 'C2', 'A3', 'C1', 'B1', 'B3']
split2 = ['D3', 'D2', 'D1']
The constraint is that items with the same prefix (A, B, etc.) must not end up in separate lists. The data can be split in any ratio, such as 50-50 or 80-20.

Here you go:
import numpy as np
data = np.array(['A1', 'C3', 'B2', 'A2', 'D3', 'C2', 'A3', 'D2', 'C1', 'B1', 'D1', 'B3'])
# define some condition
condition = ['B', 'D']
boolean_selection = [np.any([ c in d for c in condition]) for d in data]
split1 = data[boolean_selection]
split2 = data[np.logical_not(boolean_selection)]
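
For comparison, here is a sketch without NumPy that splits strictly by prefix. The tuple held_out is an assumed choice of prefixes for the second list, picked to reproduce the split shown in the question. str.startswith accepts a tuple, so several prefixes can be listed. Unlike the `c in d` test above, which matches the letter anywhere in the string, this checks only the start of each item.

```python
data = ['A1', 'C3', 'B2', 'A2', 'D3', 'C2', 'A3', 'D2', 'C1', 'B1', 'D1', 'B3']

# Assumption: prefixes that should go to the second list.
held_out = ('D',)

# Every item with the same prefix lands in the same list by construction.
split1 = [item for item in data if not item.startswith(held_out)]
split2 = [item for item in data if item.startswith(held_out)]

print(split1)  # ['A1', 'C3', 'B2', 'A2', 'C2', 'A3', 'C1', 'B1', 'B3']
print(split2)  # ['D3', 'D2', 'D1']
```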

Remove element from every list in a column in pandas dataframe based on another column

I'd like to remove, from each list in column B, the value that appears in column A of the same row. How can I do this?
Given:
import pandas as pd

df = pd.DataFrame({
    'A': ['a1', 'a2', 'a3', 'a4'],
    'B': [['a1', 'a2'], ['a1', 'a2', 'a3'], ['a1', 'a3'], []]
})
I want:
result = pd.DataFrame({
    'A': ['a1', 'a2', 'a3', 'a4'],
    'B': [['a1', 'a2'], ['a1', 'a2', 'a3'], ['a1', 'a3'], []],
    'Output': [['a2'], ['a1', 'a3'], ['a1'], []]
})

One way of doing that is applying a filtering function to each row via DataFrame.apply:
df['Output'] = df.apply(lambda x: [i for i in x.B if i != x.A], axis=1)

Another solution using iterrows():
for i, value in df.iterrows():
    try:
        # note: this modifies the lists in column B in place
        value['B'].remove(value['A'])
    except ValueError:
        pass
print(df)
Output:
    A         B
0  a1      [a2]
1  a2  [a1, a3]
2  a3      [a1]
3  a4        []

python fixed array of dynamic strings list

I would like to iteratively fill a fixed-size array in which each item is a list of strings. For example, consider the following list of strings:
arr = ['A1', 'C3', 'B2', 'A2', 'C1', 'A3', 'B1', 'C2', 'A4']
I want to obtain the following array of 3 items (no ordering is required):
res = [['A1', 'A2', 'A3', 'A4'],
       ['B2', 'B1'],
       ['C3', 'C1', 'C2']]
I have the following piece of code:
arr = ['A1', 'C3', 'B2', 'A2', 'C1', 'A3', 'B1', 'C2', 'A4']
res = [[]] * 3
for i in range(len(arr)):
    # Calculate index corresponding to A, B or C
    j = ord(arr[i][0]) - 65
    # Extend corresponding string list
    res[j].extend([arr[i]])
for i in range(len(res)):
    print(res[i])
But I get this result:
['A1', 'C3', 'B2', 'A2', 'C1', 'A3', 'B1', 'C2', 'A4']
['A1', 'C3', 'B2', 'A2', 'C1', 'A3', 'B1', 'C2', 'A4']
['A1', 'C3', 'B2', 'A2', 'C1', 'A3', 'B1', 'C2', 'A4']
Where am I going wrong?
Thank you for your help!

You can use itertools.groupby to group the elements of the sorted list by their first character. operator.itemgetter(0) fetches that first character from each string:
from itertools import groupby
from operator import itemgetter
[list(v) for k,v in groupby(sorted(arr), key=itemgetter(0))]
# [['A1', 'A2', 'A3', 'A4'], ['B1', 'B2'], ['C1', 'C2', 'C3']]

The problem is due to the following:
res = [[]] * 3 creates an outer list holding three references to the same inner list, so all three items are the same object. Whenever you append to or extend one of them, the change shows up in "all" of them (they are all the same object, after all).
You can easily check this by replacing it with:
res = [[],[],[]]
which will then give you the expected answer.
Consider these snippets:
res = [[]]*2
res[0].append(1)
print(res)
Out:
[[1], [1]]
While
res = [[],[]]
res[0].append(1)
print(res)
Out:
[[1], []]
Alternatively you can create the nested list like this: res = [[] for i in range(3)]
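
The aliasing can also be checked directly with the `is` operator. A minimal sketch; the names shared and separate are made up for illustration.

```python
shared = [[]] * 3                  # three references to ONE inner list
separate = [[] for _ in range(3)]  # three distinct inner lists

print(shared[0] is shared[1])      # True
print(separate[0] is separate[1])  # False

shared[0].append('A1')
separate[0].append('A1')
print(shared)    # [['A1'], ['A1'], ['A1']]
print(separate)  # [['A1'], [], []]
```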

You can use a list comprehension:
[[k for k in arr if k[0]==m] for m in sorted(set([i[0] for i in arr]))]
Output:
[['A1', 'A2', 'A3', 'A4'], ['B2', 'B1'], ['C3', 'C1', 'C2']]
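
The comprehension above rescans arr once per prefix. As an alternative sketch, a collections.defaultdict can do the grouping in a single pass while keeping the original order inside each group. The name groups is an assumption made for this example.

```python
from collections import defaultdict

arr = ['A1', 'C3', 'B2', 'A2', 'C1', 'A3', 'B1', 'C2', 'A4']

# Map each first character to the list of items that start with it.
groups = defaultdict(list)
for item in arr:
    groups[item[0]].append(item)

# Emit the groups in sorted prefix order: A, B, C.
res = [groups[prefix] for prefix in sorted(groups)]
print(res)
# [['A1', 'A2', 'A3', 'A4'], ['B2', 'B1'], ['C3', 'C1', 'C2']]
```

This also avoids having to know in advance how many prefixes there are, which the fixed `[[]] * 3` setup in the question requires.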

How can I find the smallest list among the lists generated by my program?

I wrote a program that generates some lists, something like
['a0', 'a1', 'a2', 'a3', 'a3', 'a4', 'C', 'b4', 'b3', 'b2', 'b2', 'b3', 'b4', 'b5', 'b5', 'b4', 'D', 'c4']
['a0', 'a1', 'a2', 'a3', 'a3', 'a4', 'C', 'b4', 'b3', 'b2', 'b2', 'b3', 'b4', 'D', 'c4', 'c4', 'D', 'b4', 'b5']
['a0', 'a1', 'a2', 'a3', 'a3', 'a4', 'C', 'b4', 'b5', 'b5', 'b4', 'b3', 'b2', 'b2', 'b3', 'b4', 'D', 'c4']
['a0', 'a1', 'a2', 'a3', 'a3', 'a4', 'C', 'b4', 'b5', 'b5', 'b4', 'D', 'c4', 'c4', 'D', 'b4', 'b3', 'b2']
['a0', 'a1', 'a2', 'a3', 'a3', 'a4', 'C', 'b4', 'D', 'c4', 'c4', 'D', 'b4', 'b3', 'b2', 'b2', 'b3', 'b4', 'b5']
['a0', 'a1', 'a2', 'a3', 'a3', 'a4', 'C', 'b4', 'D', 'c4', 'c4', 'D', 'b4', 'b5', 'b5', 'b4', 'b3', 'b2']
I want to find the shortest list, i.e. the one with the fewest elements.
Thanks.

You can use the min function:
min(data, key = len)
If you want to handle cases where there are multiple elements having the shortest length, you can sort the list in ascending order by length:
sorted(data, key = len)

You can sort by list length and take the first element, but this returns only one list even when several lists share the minimum length.
smallest_list = sorted(list_of_list, key=len)[0]
Another option is to get the length of the shortest list and then use it as a filter:
len_smallest_list = min(len(x) for x in list_of_list)
smallest_lists = [lst for lst in list_of_list if len(lst) == len_smallest_list]
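
Putting the two ideas together, here is a small sketch on made-up sample data (list_of_lists below is not the data from the question). min returns the first list of minimal length, and the filter collects every list that ties with it.

```python
# Made-up sample data for illustration.
list_of_lists = [
    ['a0', 'a1', 'C'],
    ['a0', 'D'],
    ['b4', 'b5'],
    ['a0', 'a1', 'a2', 'a3'],
]

shortest = min(list_of_lists, key=len)  # first list with the minimum length
all_shortest = [lst for lst in list_of_lists if len(lst) == len(shortest)]

print(shortest)      # ['a0', 'D']
print(all_shortest)  # [['a0', 'D'], ['b4', 'b5']]

# min raises ValueError on an empty input unless a default is given.
print(min([], key=len, default=None))  # None
```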

Convert dictionary to list with some data omitted

I'm trying to convert a dictionary of the format:
d = {'A1': ['a', 'a', 'A2 (A3-)', 'a'],
'B1': ['b', 'b', 'B2 (B3-)', 'b'],
'C1': ['c', 'c', 'C2 (C3)-', 'c']}
To a list of the form:
e = [['A1', 'A2', 'A3'], ['B1', 'B2', 'B3'], ['C1', 'C2', 'C3']]
I know I should use regex to get the A2 and A3 data, but I'm having trouble putting this all together...

import re
regex = re.compile(r'(\w+) \((\w+)-.*')
# I suppose that you meant (C3-) and not (C3)-
d = {'A1': ['a', 'a', 'A2 (A3-)', 'a'], 'B1': ['b', 'b', 'B2 (B3-)', 'b'], 'C1': ['c', 'c', 'C2 (C3-)', 'c']}
out = []
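for key, values_list in d.items():
    v2, v3 = regex.match(values_list[2]).groups()
    out.append([key, v2, v3])
print(out)
# [['A1', 'A2', 'A3'], ['B1', 'B2', 'B3'], ['C1', 'C2', 'C3']]
Note that from Python 3.7 on, dicts preserve insertion order, so the result follows the order of the keys in d. On older versions the order is not guaranteed and may differ, e.g. with 'C1' first.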
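
If the data really does contain both spellings, "(A3-)" and "(C3)-", a looser pattern can cope with either. This is a sketch under that assumption; the pattern and the decision to scan every value instead of relying on index 2 are choices made for this example, not part of the answer above.

```python
import re

# Capture the word before the parenthesis and the first word inside it;
# whatever follows ("-)" or ")-") is ignored.
pattern = re.compile(r'(\w+) \((\w+)')

d = {'A1': ['a', 'a', 'A2 (A3-)', 'a'],
     'B1': ['b', 'b', 'B2 (B3-)', 'b'],
     'C1': ['c', 'c', 'C2 (C3)-', 'c']}

e = []
for key, values in d.items():
    for value in values:
        match = pattern.match(value)
        if match:                        # first value that looks like "X (Y..."
            e.append([key, *match.groups()])
            break

print(e)
# [['A1', 'A2', 'A3'], ['B1', 'B2', 'B3'], ['C1', 'C2', 'C3']]
```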
