Split data in list based on condition - python

I have the following list:
data = ['A1', 'C3', 'B2', 'A2', 'D3', 'C2', 'A3', 'D2', 'C1', 'B1', 'D1', 'B3']
I want to split the list such that
split1 = ['A1', 'C3', 'B2', 'A2', 'C2', 'A3', 'C1', 'B1', 'B3']
split2 = ['D3', 'D2', 'D1']
The constraint is that items with the same prefix (A, B, etc.) must not end up in different lists. The data can be split in any ratio, such as 50-50 or 80-20.

Here you go:
import numpy as np
data = np.array(['A1', 'C3', 'B2', 'A2', 'D3', 'C2', 'A3', 'D2', 'C1', 'B1', 'D1', 'B3'])
# prefixes whose items should go into split1
condition = ['B', 'D']
# True where the item starts with one of the chosen prefixes
boolean_selection = [d.startswith(tuple(condition)) for d in data]
split1 = data[boolean_selection]
split2 = data[np.logical_not(boolean_selection)]
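The question also asks for a split by ratio (50-50, 80-20) rather than by hand-picked prefixes. A minimal sketch of one way to do that, assuming the prefix is always the first character: group the items by prefix, shuffle the prefixes, and fill the first split until it reaches the target size. The function name split_by_prefix and the parameters ratio and seed are made up for this illustration, and the resulting ratio is only approximate because whole groups are moved together.

```python
import random
from collections import defaultdict

def split_by_prefix(items, ratio=0.75, seed=0):
    """Split items in two without separating items that share a prefix."""
    groups = defaultdict(list)
    for item in items:
        groups[item[0]].append(item)  # assumption: prefix is the first character

    prefixes = sorted(groups)
    random.Random(seed).shuffle(prefixes)  # seeded so the split is reproducible

    target = ratio * len(items)
    first, second = [], []
    for prefix in prefixes:
        # whole groups go to the first split until it reaches the target size
        if len(first) < target:
            first.extend(groups[prefix])
        else:
            second.extend(groups[prefix])
    return first, second

data = ['A1', 'C3', 'B2', 'A2', 'D3', 'C2', 'A3', 'D2', 'C1', 'B1', 'D1', 'B3']
split1, split2 = split_by_prefix(data, ratio=0.75)
print(split1)
print(split2)
```

With four groups of three items and ratio=0.75, three whole groups land in split1 and one in split2. Which prefix ends up alone depends on the seed.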

Related

Python saving results to variables

I am looping through an array and checking where I have a connection (another function returns 0 or 1).
The first two arrays below should be my results. I want to save them to a global variable, myResult, but when I print it at the end I get back a much longer array.
I think the global variable is being overwritten. Why does this happen?
Thank you :)
[[['A2', 'A1', 'B1', 'B2', 'B3'], 'A3'], 26]
[[['A2', 'A1', 'B1', 'B2', 'B3', 'B4', 'A4'], 'A3'], 32]
[[[['A2', 'A1', 'B1', 'B2', 'B3', 'B4', 'A4', 'B4', 'C4', 'C3', 'C2', 'C1', 'D1', 'D2', 'D3', 'D4', 'D3', 'D4', 'D3', 'D4', 'D3', 'D4', 'D3', 'D4', 'D3', 'D4', 'D3', 'D4', 'C4', 'D4', 'C4', 'D4', 'C4', 'D4', 'C4', 'D4'], 'A3'], 26], [[['A2', 'A1', 'B1', 'B2', 'B3', 'B4', 'A4', 'B4', 'C4', 'C3', 'C2', 'C1', 'D1', 'D2', 'D3', 'D4', 'D3', 'D4', 'D3', 'D4', 'D3', 'D4', 'D3', 'D4', 'D3', 'D4', 'D3', 'D4', 'C4', 'D4', 'C4', 'D4', 'C4', 'D4', 'C4', 'D4'], 'A3'], 32]]
def callbackFunction(start, end, time):
    myResult = []

    def logToVariable(path, goal, timeFor):
        finalPath = path
        finalPath.append(goal)
        print(finalPath)
        add_result = [finalPath, timeFor]
        print(add_result)
        myResult.append(add_result)

    def innerLoop(start, end, time):
        correctList = [item for item in myList if item not in start]
        for i in correctList:
            first = start[-1]
            time_diff = directPoints(first, i, time)
            if time_diff[1] == 1:
                if i==end:
                    #new_result = []
                    new_result = [[start, i], time_diff[0]]
                    #logToVariable(start, i, time_diff[0])
                    # print(start, i)
                    myResult.append(new_result)
                    print(new_result)
                    #break
                    #return result
                else:
                    start.append(i)
                    innerLoop(start, end, time_diff[0])
            else:
                continue

    innerLoop(start, end, time)
    print(myResult)
    return myResult
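The code above stores start inside each result and later calls start.append(i) on that same list. One possible explanation for the growing results, offered as a hedged sketch rather than a confirmed diagnosis (directPoints and myList are not shown in the post, so the path values below are invented): the stored result holds a reference to the list, not a copy, so later appends show up in results that were saved earlier.

```python
# Storing the list itself: later mutations leak into the saved result.
results = []
path = ['A2', 'A1']
results.append([path, 'A3'])   # keeps a reference to path, not a copy
path.append('B1')              # mutates the list that results already holds
aliased = results[0][0]

# Storing a snapshot: list(path) makes a shallow copy at the time of saving.
results_copy = []
path = ['A2', 'A1']
results_copy.append([list(path), 'A3'])
path.append('B1')
snapshot = results_copy[0][0]

print(aliased)   # ['A2', 'A1', 'B1']
print(snapshot)  # ['A2', 'A1']
```

If this is the cause, saving list(start) instead of start when building new_result would be one way to keep each result fixed.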

python fixed array of dynamic strings list

I would like to iteratively fill a fixed-size array in which each item is a list of strings. For example, consider the following list of strings:
arr = ['A1', 'C3', 'B2', 'A2', 'C1', 'A3', 'B1', 'C2', 'A4']
I want to obtain the following array of 3 items (no ordering is required):
res = [['A1', 'A2', 'A3', 'A4'],
       ['B2', 'B1'],
       ['C3', 'C1', 'C2']]
I have the following piece of code:
arr = ['A1', 'C3', 'B2', 'A2', 'C1', 'A3', 'B1', 'C2', 'A4']
res = [[]] * 3
for i in range(len(arr)):
    # Calculate index corresponding to A, B or C
    j = ord(arr[i][0])-65
    # Extend corresponding string list
    res[j].extend([arr[i]])
for i in range(len(res)):
    print(res[i])
But I get this result:
['A1', 'C3', 'B2', 'A2', 'C1', 'A3', 'B1', 'C2', 'A4']
['A1', 'C3', 'B2', 'A2', 'C1', 'A3', 'B1', 'C2', 'A4']
['A1', 'C3', 'B2', 'A2', 'C1', 'A3', 'B1', 'C2', 'A4']
Where am I going wrong?
Thank you for your help!
You can sort the list and then use itertools.groupby to group the elements by their first character (groupby only groups adjacent items, hence the sort). operator.itemgetter(0) fetches that first character from each string:
from itertools import groupby
from operator import itemgetter
[list(v) for k,v in groupby(sorted(arr), key=itemgetter(0))]
# [['A1', 'A2', 'A3', 'A4'], ['B1', 'B2'], ['C1', 'C2', 'C3']]
The problem is due to the following:
res = [[]] * 3 creates an outer list that holds three references to the same inner list, not three separate lists. So whenever you append to or extend one of them, the change shows up in all three positions (they are the same object, after all).
You can easily check this by replacing it with:
res = [[],[],[]]
which will then give you the expected answer.
Consider these snippets:
res = [[]]*2
res[0].append(1)
print(res)
Out:
[[1], [1]]
While
res = [[],[]]
res[0].append(1)
print(res)
Out:
[[1], []]
Alternatively you can create the nested list like this: res = [[] for i in range(3)]
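The shared-object explanation can be checked directly with the is operator. The sketch below compares the two ways of building the outer list and then reruns the asker's grouping loop on independent inner lists. The variable name shared is introduced here only for illustration.

```python
arr = ['A1', 'C3', 'B2', 'A2', 'C1', 'A3', 'B1', 'C2', 'A4']

shared = [[]] * 3
print(shared[0] is shared[1])   # True: one inner list referenced three times

res = [[] for _ in range(3)]
print(res[0] is res[1])         # False: three independent inner lists

for item in arr:
    # index 0, 1 or 2 for the prefixes A, B or C
    j = ord(item[0]) - ord('A')
    res[j].append(item)

print(res)
# [['A1', 'A2', 'A3', 'A4'], ['B2', 'B1'], ['C3', 'C1', 'C2']]
```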
You can use a list comprehension:
[[k for k in arr if k[0]==m] for m in sorted(set([i[0] for i in arr]))]
Output:
[['A1', 'A2', 'A3', 'A4'], ['B2', 'B1'], ['C3', 'C1', 'C2']]
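The list comprehension scans arr once per prefix. As a possible alternative, here is a minimal sketch of a single-pass grouping with collections.defaultdict, which also does not need the number of groups to be known in advance. It assumes, like the other answers, that the prefix is the first character of each string.

```python
from collections import defaultdict

arr = ['A1', 'C3', 'B2', 'A2', 'C1', 'A3', 'B1', 'C2', 'A4']

groups = defaultdict(list)
for item in arr:
    groups[item[0]].append(item)   # key on the first character

# order the groups by prefix; items keep their original order within a group
res = [groups[prefix] for prefix in sorted(groups)]
print(res)
# [['A1', 'A2', 'A3', 'A4'], ['B2', 'B1'], ['C3', 'C1', 'C2']]
```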

How can I find the smallest list among some lists generated by my program?

I wrote a program that generates some lists, something like
['a0', 'a1', 'a2', 'a3', 'a3', 'a4', 'C', 'b4', 'b3', 'b2', 'b2', 'b3', 'b4', 'b5', 'b5', 'b4', 'D', 'c4']
['a0', 'a1', 'a2', 'a3', 'a3', 'a4', 'C', 'b4', 'b3', 'b2', 'b2', 'b3', 'b4', 'D', 'c4', 'c4', 'D', 'b4', 'b5']
['a0', 'a1', 'a2', 'a3', 'a3', 'a4', 'C', 'b4', 'b5', 'b5', 'b4', 'b3', 'b2', 'b2', 'b3', 'b4', 'D', 'c4']
['a0', 'a1', 'a2', 'a3', 'a3', 'a4', 'C', 'b4', 'b5', 'b5', 'b4', 'D', 'c4', 'c4', 'D', 'b4', 'b3', 'b2']
['a0', 'a1', 'a2', 'a3', 'a3', 'a4', 'C', 'b4', 'D', 'c4', 'c4', 'D', 'b4', 'b3', 'b2', 'b2', 'b3', 'b4', 'b5']
['a0', 'a1', 'a2', 'a3', 'a3', 'a4', 'C', 'b4', 'D', 'c4', 'c4', 'D', 'b4', 'b5', 'b5', 'b4', 'b3', 'b2']
I want to find the shortest list, that is, the list with the fewest elements.
Thanks,
You can use the min function:
min(data, key = len)
If several lists share the shortest length and you want to see all of them, you can sort the lists in ascending order by length so that the shortest ones come first:
sorted(data, key = len)
You can sort by list length and take the first element, but this returns only one list even when several lists share the minimum length.
smallest_list = sorted(list_of_list, key=len)[0]
Another option is to get the length of the smallest list and then use it as a filter:
len_smallest_list = min(len(x) for x in list_of_list)
smallest_lists = [lst for lst in list_of_list if len(lst) == len_smallest_list]
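To make the difference between the two approaches concrete, here is a small self-contained sketch. The sample lists are invented for the example and are much shorter than the ones in the question.

```python
data = [
    ['a0', 'a1', 'C'],
    ['a0', 'C'],
    ['b4', 'D'],
    ['a0', 'a1', 'a2', 'C'],
]

# min returns the first list that has the minimum length
shortest = min(data, key=len)

# filtering on that length returns every list tied for shortest
min_len = len(shortest)
all_shortest = [lst for lst in data if len(lst) == min_len]

print(shortest)      # ['a0', 'C']
print(all_shortest)  # [['a0', 'C'], ['b4', 'D']]
```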

How to find different combinations between two lists?

There are two lists.
list_1=[a1,b1,c1,d1]
list_2=[a2,b2,c2,d2]
The conditions are: (i) each combination must have four elements, and (ii) each combination must contain one a element (either a1 or a2), one b element (either b1 or b2), one c element (either c1 or c2) and one d element (either d1 or d2).
Please help me find the different combinations using Python 3.x.
You can use itertools.product:
from itertools import product
list_1 = ['a1','b1','c1','d1']
list_2 = ['a2','b2','c2','d2']
result = list(product(*zip(list_1, list_2)))
print(result)
[('a1', 'b1', 'c1', 'd1'), ('a1', 'b1', 'c1', 'd2'), ('a1', 'b1', 'c2', 'd1'), ('a1', 'b1', 'c2', 'd2'), ('a1', 'b2', 'c1', 'd1'), ('a1', 'b2', 'c1', 'd2'), ('a1', 'b2', 'c2', 'd1'), ('a1', 'b2', 'c2', 'd2'), ('a2', 'b1', 'c1', 'd1'), ('a2', 'b1', 'c1', 'd2'), ('a2', 'b1', 'c2', 'd1'), ('a2', 'b1', 'c2', 'd2'), ('a2', 'b2', 'c1', 'd1'), ('a2', 'b2', 'c1', 'd2'), ('a2', 'b2', 'c2', 'd1'), ('a2', 'b2', 'c2', 'd2')]
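To see why this works, it can help to look at the intermediate step. The sketch below shows that zip pairs up the two candidates for each position, and product then picks one candidate from every pair. The names pairs and combos are used here only for illustration.

```python
from itertools import product

list_1 = ['a1', 'b1', 'c1', 'd1']
list_2 = ['a2', 'b2', 'c2', 'd2']

# one tuple per position, holding the two candidates for that position
pairs = list(zip(list_1, list_2))
print(pairs)
# [('a1', 'a2'), ('b1', 'b2'), ('c1', 'c2'), ('d1', 'd2')]

# product picks one candidate from each pair: 2 * 2 * 2 * 2 combinations
combos = list(product(*pairs))
print(len(combos))   # 16
print(combos[0])     # ('a1', 'b1', 'c1', 'd1')
print(combos[-1])    # ('a2', 'b2', 'c2', 'd2')
```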

Concatenation of a variant number of keys of a dictionary Python (recursion?)

Hello Stack Overflow members,
I'm trying to concatenate the keys (strings) on one hand, and the values (lists) on the other hand, of a dictionary.
To make this clearer, here is what I have at the beginning:
dict = {'bk1':
            {'k11': ['a1', 'b1', 'c1'],
             'k12': ['a2', 'b2', 'c2']},
        'bk2':
            {'k21': ['d1', 'e1'],
             'k22': ['d2', 'e2'],
             'k23': ['d3', 'e3']},
        'bk3':
            {'k31': ['f1', 'g1', 'h1'],
             'k32': ['f2', 'g2', 'h2']}
        }
And here is what I would like at the end:
newdict = {'k11_k21_k31': ['a1', 'b1', 'c1', 'd1', 'e1', 'f1', 'g1', 'h1'],
           'k11_k21_k32': ['a1', 'b1', 'c1', 'd1', 'e1', 'f2', 'g2', 'h2'],
           'k11_k22_k31': ['a1', 'b1', 'c1', 'd2', 'e2', 'f1', 'g1', 'h1'],
           'k11_k22_k32': ['a1', 'b1', 'c1', 'd2', 'e2', 'f2', 'g2', 'h2'],
           'k11_k23_k31': ['a1', 'b1', 'c1', 'd3', 'e3', 'f1', 'g1', 'h1'],
           'k11_k23_k32': ['a1', 'b1', 'c1', 'd3', 'e3', 'f2', 'g2', 'h2'],
           'k12_k21_k31': ['a2', 'b2', 'c2', 'd1', 'e1', 'f1', 'g1', 'h1'],
           'k12_k21_k32': ['a2', 'b2', 'c2', 'd1', 'e1', 'f2', 'g2', 'h2'],
           'k12_k22_k31': ['a2', 'b2', 'c2', 'd2', 'e2', 'f1', 'g1', 'h1'],
           'k12_k22_k32': ['a2', 'b2', 'c2', 'd2', 'e2', 'f2', 'g2', 'h2'],
           'k12_k23_k31': ['a2', 'b2', 'c2', 'd3', 'e3', 'f1', 'g1', 'h1'],
           'k12_k23_k32': ['a2', 'b2', 'c2', 'd3', 'e3', 'f2', 'g2', 'h2']}
I would like to do this with:
a variable number of "big keys" (bki) and, for each bki, a variable number of keys (kij).
A "full combination" across the "big keys". For example, I don't expect results like:
{'k11_k23': ['a1', 'b1', 'c1', 'd3', 'e3']}
where "bk3" is missing.
I tried nested "for" loops, but the number of loops depends on the number of "big keys"...
Then I felt that the problem could be solved with recursion (maybe?), but despite my research and my attempts to implement it, I failed.
Any help with a solution, recursive or not, would be greatly appreciated.
Thank you,
Mat
Whoa, what a quick response!
Thanks a lot for all your quick answers, it works perfectly!
As suggested by @jksnw in the comments, you can use itertools.product to do this:
import itertools
dct = {
    'bk1': {
        'k11': ['a1', 'b1', 'c1'],
        'k12': ['a2', 'b2', 'c2']
    },
    'bk2': {
        'k21': ['d1', 'e1'],
        'k22': ['d2', 'e2'],
        'k23': ['d3', 'e3']
    },
    'bk3': {
        'k31': ['f1', 'g1', 'h1'],
        'k32': ['f2', 'g2', 'h2']
    }
}
big_keys = dct.keys()
small_keys = (dct[big_key].keys() for big_key in big_keys)
res = {}
for keys_from_each in itertools.product(*small_keys):
    key = "_".join(keys_from_each)
    value = []
    for big_key, small_key in zip(big_keys, keys_from_each):
        value.extend(dct[big_key][small_key])
    res[key] = value
So that:
>>> res
{'k11_k21_k31': ['a1', 'b1', 'c1', 'd1', 'e1', 'f1', 'g1', 'h1'],
'k11_k21_k32': ['a1', 'b1', 'c1', 'd1', 'e1', 'f2', 'g2', 'h2'],
'k11_k22_k31': ['a1', 'b1', 'c1', 'd2', 'e2', 'f1', 'g1', 'h1'],
'k11_k22_k32': ['a1', 'b1', 'c1', 'd2', 'e2', 'f2', 'g2', 'h2'],
'k11_k23_k31': ['a1', 'b1', 'c1', 'd3', 'e3', 'f1', 'g1', 'h1'],
'k11_k23_k32': ['a1', 'b1', 'c1', 'd3', 'e3', 'f2', 'g2', 'h2'],
'k12_k21_k31': ['a2', 'b2', 'c2', 'd1', 'e1', 'f1', 'g1', 'h1'],
'k12_k21_k32': ['a2', 'b2', 'c2', 'd1', 'e1', 'f2', 'g2', 'h2'],
'k12_k22_k31': ['a2', 'b2', 'c2', 'd2', 'e2', 'f1', 'g1', 'h1'],
'k12_k22_k32': ['a2', 'b2', 'c2', 'd2', 'e2', 'f2', 'g2', 'h2'],
'k12_k23_k31': ['a2', 'b2', 'c2', 'd3', 'e3', 'f1', 'g1', 'h1'],
'k12_k23_k32': ['a2', 'b2', 'c2', 'd3', 'e3', 'f2', 'g2', 'h2']}
Here, itertools.product is used to get a list of the "small keys" that we take from each block:
>>> big_keys = dct.keys()
>>> small_keys = (dct[big_key].keys() for big_key in big_keys)
>>> list(itertools.product(*small_keys))
[('k12', 'k22', 'k31'),
('k12', 'k22', 'k32'),
('k12', 'k23', 'k31'),
('k12', 'k23', 'k32'),
('k12', 'k21', 'k31'),
('k12', 'k21', 'k32'),
('k11', 'k22', 'k31'),
('k11', 'k22', 'k32'),
('k11', 'k23', 'k31'),
('k11', 'k23', 'k32'),
('k11', 'k21', 'k31'),
('k11', 'k21', 'k32')]
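You can use itertools.product, with reduce(lambda x,y:x+y,i) to flatten the nested lists. Also, do not use dict or the names of other Python built-in types or keywords as variable names (I used d). The key order in the output below reflects the dict ordering of the Python version used at the time; on Python 3.7+, dicts keep insertion order, so the keys would start with k11 instead:
>>> from functools import reduce  # needed on Python 3, where reduce is not a built-in
>>> from itertools import product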
>>> v=[i.values() for i in d.values()]
>>> v=[reduce(lambda x,y:x+y,i) for i in product(*v)]
>>> k=[i.keys() for i in d.values()]
>>> k=['_'.join(i) for i in product(*k)]
>>> {k:v for k,v in zip(k,v)}
{'k31_k12_k22': ['f1', 'g1', 'h1', 'a2', 'b2', 'c2', 'd2', 'e2'],
'k32_k12_k21': ['f2', 'g2', 'h2', 'a2', 'b2', 'c2', 'd1', 'e1'],
'k31_k11_k22': ['f1', 'g1', 'h1', 'a1', 'b1', 'c1', 'd2', 'e2'],
'k31_k12_k23': ['f1', 'g1', 'h1', 'a2', 'b2', 'c2', 'd3', 'e3'],
'k32_k12_k22': ['f2', 'g2', 'h2', 'a2', 'b2', 'c2', 'd2', 'e2'],
'k31_k12_k21': ['f1', 'g1', 'h1', 'a2', 'b2', 'c2', 'd1', 'e1'],
'k32_k11_k23': ['f2', 'g2', 'h2', 'a1', 'b1', 'c1', 'd3', 'e3'],
'k32_k12_k23': ['f2', 'g2', 'h2', 'a2', 'b2', 'c2', 'd3', 'e3'],
'k31_k11_k21': ['f1', 'g1', 'h1', 'a1', 'b1', 'c1', 'd1', 'e1'],
'k31_k11_k23': ['f1', 'g1', 'h1', 'a1', 'b1', 'c1', 'd3', 'e3'],
'k32_k11_k21': ['f2', 'g2', 'h2', 'a1', 'b1', 'c1', 'd1', 'e1'],
'k32_k11_k22': ['f2', 'g2', 'h2', 'a1', 'b1', 'c1', 'd2', 'e2']}
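Since the question asked whether recursion could work, here is a minimal recursive sketch for comparison with the itertools.product answers. The helper name combine is made up for this example, and the code assumes Python 3.7+ so that the dicts iterate in insertion order (bk1, bk2, bk3). Each call handles one "big key" block and prepends its keys and values to every combination built from the remaining blocks.

```python
def combine(blocks):
    """Return (keys_tuple, values_list) for every full combination.

    blocks is a list of inner dicts, one per "big key".
    """
    if not blocks:
        return [((), [])]            # base case: one empty combination
    first, rest = blocks[0], blocks[1:]
    tails = combine(rest)            # combinations of the remaining blocks
    combos = []
    for key, values in first.items():
        for tail_keys, tail_values in tails:
            # values + tail_values builds a new list, so inputs are not mutated
            combos.append(((key,) + tail_keys, values + tail_values))
    return combos

dct = {
    'bk1': {'k11': ['a1', 'b1', 'c1'], 'k12': ['a2', 'b2', 'c2']},
    'bk2': {'k21': ['d1', 'e1'], 'k22': ['d2', 'e2'], 'k23': ['d3', 'e3']},
    'bk3': {'k31': ['f1', 'g1', 'h1'], 'k32': ['f2', 'g2', 'h2']},
}

newdict = {"_".join(keys): values
           for keys, values in combine(list(dct.values()))}

print(len(newdict))            # 12
print(newdict['k11_k23_k32'])
# ['a1', 'b1', 'c1', 'd3', 'e3', 'f2', 'g2', 'h2']
```

The recursion depth equals the number of "big keys", so this handles any number of blocks without writing one loop per block. itertools.product does the same job iteratively.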
