Here is my problem: I have a list of eleven elements and a dictionary that tells us how the list should be split. Each key maps to the length of its sublist, so key 0 should get the sublist [14, 10, 2, 4], key 1 should get [12, 8, 8, 5], and so on.
l = [14, 10, 2, 4, 12, 8, 8, 5, 9, 2, 7]
dico = dict()
dico[0] = 4
dico[1] = 4
dico[2] = 3
print(dico)
# {0: 4, 1: 4, 2: 3}
The expected result is the following:
{0:[14, 10, 2, 4], 1:[12, 8, 8, 5], 2:[9, 2, 7]}
Note that retaining the order of the initial list is important.
Given your sample data and a reasonably recent Python version (3.7 or later, where dicts are guaranteed to iterate in insertion order), you can simply do:
i = iter(l)
{k: [next(i) for _ in range(v)] for k, v in dico.items()}
# {0: [14, 10, 2, 4], 1: [12, 8, 8, 5], 2: [9, 2, 7]}
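A variant of the same idea, as a minimal sketch: itertools.islice takes the next n items from a shared iterator, so the original list is left untouched. The function name split_by_sizes is made up for illustration, and the sketch assumes the sizes add up to at most len(values); if they add up to more, the last chunks simply come back shorter.

```python
from itertools import islice


def split_by_sizes(values, sizes):
    """Split `values` into consecutive chunks whose lengths come from `sizes`.

    `split_by_sizes` is a hypothetical helper name, not from the original post.
    """
    it = iter(values)  # one shared iterator, so each chunk continues where the last ended
    return {key: list(islice(it, n)) for key, n in sizes.items()}


l = [14, 10, 2, 4, 12, 8, 8, 5, 9, 2, 7]
dico = {0: 4, 1: 4, 2: 3}
result = split_by_sizes(l, dico)
print(result)
```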
new_d = {}
for key in dico:
    new_d[key] = l[:dico[key]]  # take the first dico[key] items
    l[:dico[key]] = []          # then delete them from l (this empties the original list)
print(new_d)
# {0: [14, 10, 2, 4], 1: [12, 8, 8, 5], 2: [9, 2, 7]}
I will offer this option:
l = [14, 10, 2, 4, 12, 8, 8, 5, 9, 2, 7]
l2 = l.copy()
dico = {0: 4, 1: 4, 2: 3}
for key, val in dico.items():
    for j in range(dico[key]):
        if j == 0:
            dico[key] = []
        dico[key].append(l2.pop(0))
print(dico)
print(l)
Output:
{0: [14, 10, 2, 4], 1: [12, 8, 8, 5], 2: [9, 2, 7]}
[14, 10, 2, 4, 12, 8, 8, 5, 9, 2, 7]
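Popping from the front of a list shifts all remaining elements each time, so for long lists it may be worth computing slice boundaries instead. The following is only a sketch of that idea using itertools.accumulate; the names starts, ends and result are illustrative and not from the answers above.

```python
from itertools import accumulate

l = [14, 10, 2, 4, 12, 8, 8, 5, 9, 2, 7]
dico = {0: 4, 1: 4, 2: 3}

# Running totals of the chunk sizes give the end offset of every chunk.
ends = list(accumulate(dico.values()))
# Each chunk starts where the previous one ended.
starts = [0] + ends[:-1]

# Slice the list once per key; neither l nor dico is modified.
result = {key: l[start:end] for key, start, end in zip(dico, starts, ends)}
print(result)
```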
Given two lists of lists of arbitrary length, say list1 and list2, I want to divide the lists in list1 into two groups, depending on whether each one contains exactly one of the lists in list2.
Here is a specific example:
list1 = [[1, 2, 3, 4], [1, 2, 3, 5, 6, 8], [1, 2, 3, 6, 7], [1, 2, 3, 6, 8, 9, 10],
[1, 2, 3, 6, 8, 11, 12], [1, 2, 4, 5, 9, 10], [1, 2, 4, 5, 11, 12],
[1, 2, 5, 6, 7, 9, 10], [1, 2, 5, 6, 7, 11, 12], [1, 2, 5, 6, 8, 9, 10],
[1, 2, 5, 6, 8, 11, 12], [3, 4, 5, 6, 8], [3, 5, 9, 10], [3, 5, 11, 12],
[4, 6, 7], [4, 6, 8, 9, 10], [4, 6, 8, 11, 12], [9, 10, 11, 12]]
list2 = [[2], [6, 7], [6, 8], [9,9]]
The desired outcome of the function for the "inner" matches would then be:
[[1, 2, 3, 4],
[1, 2, 4, 5, 11, 12],
[4, 6, 7],
[3, 4, 5, 6, 8],
[4, 6, 8, 11, 12],
[3, 5, 9, 10],
[9, 10, 11, 12]]
and for the "outer" matches (which are consequently the remaining items in list1):
[(1, 2, 5, 6, 8, 11, 12),
(1, 2, 5, 6, 7, 11, 12),
(4, 6, 8, 9, 10),
(1, 2, 5, 6, 7, 9, 10),
(1, 2, 3, 5, 6, 8),
(1, 2, 3, 6, 8, 11, 12),
(1, 2, 3, 6, 7),
(3, 5, 11, 12),
(1, 2, 4, 5, 9, 10),
(1, 2, 5, 6, 8, 9, 10),
(1, 2, 3, 6, 8, 9, 10)]
I coded a quick and dirty solution that produces the desired outcome, but does not scale well for very long lists (for example 100000 & 2500).
My solution:
from itertools import chain
def find_all_sets(list1,list2):
    d = {}
    d2 = {}
    count = 0
    for i in list2:
        count = count + 1
        set2 = set(i)
        d['set'+str(count)] = set2
        d['lists'+str(count)] = []
        first = []
        d2['match'+str(count)] = []
        for a in list1:
            set1 = set(a)
            if d['set'+str(count)].issubset(set1) == True:
                first.append(a)
        d['lists'+str(count)].append(first)
        d2['match'+str(count)].append(d['lists'+str(count)])
    count = 0
    count2 = -1
    d3 = {}
    all_sub_lists = []
    for i in d2.values():
        count = count + 1
        count2 = count2 + 1
        d3['final'+str(count)] = []
        real = []
        for item in i:
            for each_item in item:
                for each_each_item in each_item:
                    seta= set(each_each_item)
                    save = []
                    for i in list2:
                        setb = set(i)
                        a=setb.issubset(seta)
                        save.append(a)
                    index_to_remove = count2
                    new_save = save[:index_to_remove] + save[index_to_remove + 1:]
                    if True not in new_save:
                        real.append(each_each_item)
        d3['final'+str(count)].append(real)
        all_sub_lists.append(real)
    inner_matches = list(chain(*all_sub_lists))
    setA = set(map(tuple, inner_matches))
    setB = set(map(tuple, list1))
    outer_matches = [i for i in setB if i not in setA]
    return inner_matches, outer_matches
inner_matches, outer_matches = find_all_sets(list1,list2)
I am looking for a faster way to process large lists. Please excuse me if the terminology of "inner" and "outer" matches is unclear. I did not know what else to call them.
Here is my suggestion (let me know if you need it as a function):
inner_matches=[]
outer_matches=[]
for i in list1:
    if sum(1 for k in list2 if set(k).intersection(set(i))==set(k))==1:
        inner_matches.append(i)
    else:
        outer_matches.append(i)
print(inner_matches)
#[[1, 2, 3, 4], [1, 2, 4, 5, 11, 12], [3, 4, 5, 6, 8], [3, 5, 9, 10], [4, 6, 7], [4, 6, 8, 11, 12], [9, 10, 11, 12]]
print(outer_matches)
#[[1, 2, 3, 5, 6, 8], [1, 2, 3, 6, 7], [1, 2, 3, 6, 8, 9, 10], [1, 2, 3, 6, 8, 11, 12], [1, 2, 4, 5, 9, 10], [1, 2, 5, 6, 7, 9, 10], [1, 2, 5, 6, 7, 11, 12], [1, 2, 5, 6, 8, 9, 10], [1, 2, 5, 6, 8, 11, 12], [3, 5, 11, 12], [4, 6, 8, 9, 10]]
Here's a solution that uses issubset() to detect the inner lists. Using your sample data it's faster than your algorithm by a factor of nearly 4.
inner = []
outer = []
search_sets = [set(l) for l in list2]
for l in list1:
    if sum(s.issubset(l) for s in search_sets) == 1:
        inner.append(l)
    else:
        outer.append(l)
print(f'{inner = }')
print()
print(f'{outer = }')
Output
inner = [[1, 2, 3, 4], [1, 2, 4, 5, 11, 12], [3, 4, 5, 6, 8], [3, 5, 9, 10], [4, 6, 7], [4, 6, 8, 11, 12], [9, 10, 11, 12]]
outer = [[1, 2, 3, 5, 6, 8], [1, 2, 3, 6, 7], [1, 2, 3, 6, 8, 9, 10], [1, 2, 3, 6, 8, 11, 12], [1, 2, 4, 5, 9, 10], [1, 2, 5, 6, 7, 9, 10], [1, 2, 5, 6, 7, 11, 12], [1, 2, 5, 6, 8, 9, 10], [1, 2, 5, 6, 8, 11, 12], [3, 5, 11, 12], [4, 6, 8, 9, 10]]
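For the large case mentioned in the question (about 100000 lists against 2500 search lists), testing every search set against every list may still be slow. One possible refinement, sketched below and not benchmarked, is to file each search set under one of its elements so that only sets sharing an element with the current list are tested, and to stop as soon as a second match is found. The function name split_matches and the by_element index are assumptions made for this sketch.

```python
from collections import defaultdict


def split_matches(list1, list2):
    """Return (inner, outer): lists containing exactly one search set, and the rest."""
    search_sets = [frozenset(s) for s in list2]

    # An empty search set is a subset of every list, so count those up front.
    always = sum(1 for s in search_sets if not s)

    # File each non-empty search set under one of its elements. A list can only
    # contain the set if it contains that element, so all other sets are skipped.
    by_element = defaultdict(list)
    for s in search_sets:
        if s:
            by_element[next(iter(s))].append(s)

    inner, outer = [], []
    for lst in list1:
        items = set(lst)
        count = always
        for element in items:
            for s in by_element.get(element, ()):
                if s <= items:
                    count += 1
            if count > 1:
                break  # more than one match already, no need to look further
        (inner if count == 1 else outer).append(lst)
    return inner, outer


list1 = [[1, 2, 3, 4], [1, 2, 3, 5, 6, 8], [1, 2, 3, 6, 7], [1, 2, 3, 6, 8, 9, 10],
         [1, 2, 3, 6, 8, 11, 12], [1, 2, 4, 5, 9, 10], [1, 2, 4, 5, 11, 12],
         [1, 2, 5, 6, 7, 9, 10], [1, 2, 5, 6, 7, 11, 12], [1, 2, 5, 6, 8, 9, 10],
         [1, 2, 5, 6, 8, 11, 12], [3, 4, 5, 6, 8], [3, 5, 9, 10], [3, 5, 11, 12],
         [4, 6, 7], [4, 6, 8, 9, 10], [4, 6, 8, 11, 12], [9, 10, 11, 12]]
list2 = [[2], [6, 7], [6, 8], [9, 9]]

inner, outer = split_matches(list1, list2)
print(inner)
print(outer)
```

Whether this actually helps depends on the data: if most search sets share the same few elements, the index filters out very little.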
Say, if we have a dictionary like below:
{'time' : [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
'x_coordinates': [3, 1, 2, 4, 6, 8, 2, 4, 8, 9],
'y_coordinates': [3, 5, 8, 1, 7, 3, 7, 2, 5, 2]
}
And one like:
{'time' : [2, 6, 8, 10]}
I want to filter the values of every key in the first dict, keeping only the positions whose time value appears in the second dict. That is, my desired output would be:
{'time': [2, 6, 8, 10],
'x_coordinates': [1, 8, 4, 9],
'y_coordinates': [5, 3, 2, 2]
}
How can I do this in the most efficient way possible?
You can try this.
a={'time' : [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
'x_coordinates': [3, 1, 2, 4, 6, 8, 2, 4, 8, 9],
'y_coordinates': [3, 5, 8, 1, 7, 3, 7, 2, 5, 2]
}
search={'time' : [2, 6, 8, 10]}
idx=[a['time'].index(i) for i in search['time']]
#[1, 5, 7, 9]
final_dict={key:[a[key][i] for i in idx] for key in a.keys()}
{'time': [2, 6, 8, 10],
'x_coordinates': [1, 8, 4, 9],
'y_coordinates': [5, 3, 2, 2]}
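Note that list.index scans the list from the start on every call. If the time list is long, a one-pass lookup table of positions may be cheaper. This is a sketch under the assumption that the time values are unique (with duplicates, list.index returns the first position while the table below keeps the last); the name position is illustrative.

```python
a = {'time': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     'x_coordinates': [3, 1, 2, 4, 6, 8, 2, 4, 8, 9],
     'y_coordinates': [3, 5, 8, 1, 7, 3, 7, 2, 5, 2]}
search = {'time': [2, 6, 8, 10]}

# Map each time value to its position in a single pass over the list.
position = {t: i for i, t in enumerate(a['time'])}

# Look up the positions of the wanted times (raises KeyError for an unknown time).
idx = [position[t] for t in search['time']]

# Pick the same positions out of every column.
final_dict = {key: [values[i] for i in idx] for key, values in a.items()}
print(final_dict)
```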
It appears that you are looking for corresponding list elements based on the list provided for time. This can be accomplished with zip, and to construct your dictionary you could leverage a defaultdict:
from collections import defaultdict
d = {'time' : [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
'x_coordinates': [3, 1, 2, 4, 6, 8, 2, 4, 8, 9],
'y_coordinates': [3, 5, 8, 1, 7, 3, 7, 2, 5, 2]
}
# sets provide faster lookup times than lists
vals = set([2, 6, 8, 10])
new_values = defaultdict(list)
for time, x, y in zip(d['time'], d['x_coordinates'], d['y_coordinates']):
    if time in vals:
        # you only have to do a single membership test, then you
        # simply append the desired values
        new_values['time'].append(time)
        new_values['x_coordinates'].append(x)
        new_values['y_coordinates'].append(y)
defaultdict(<class 'list'>, {'time': [2, 6, 8, 10], 'x_coordinates': [1, 8, 4, 9], 'y_coordinates': [5, 3, 2, 2]})
The benefit here is that you iterate over all of the values only once.
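The same single-membership-test idea can be written without hard-coding the column names. The sketch below is one way to do that; filter_rows and its parameter names are hypothetical, and it assumes every list in the dict has the same length. Rows come back in the order of the original data, not in the order of the wanted values.

```python
def filter_rows(table, key, wanted):
    """Keep only the positions where table[key] holds one of the wanted values."""
    wanted = set(wanted)  # sets give fast membership tests
    keep = [i for i, value in enumerate(table[key]) if value in wanted]
    return {name: [column[i] for i in keep] for name, column in table.items()}


d = {'time': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     'x_coordinates': [3, 1, 2, 4, 6, 8, 2, 4, 8, 9],
     'y_coordinates': [3, 5, 8, 1, 7, 3, 7, 2, 5, 2]}

filtered = filter_rows(d, 'time', [2, 6, 8, 10])
print(filtered)
```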
Let's say I have the following simple dictionary in Python3.x:
example = {1:[4, 5, 6], 2:[7, 8, 9]}
I would like a way to expand the dictionary as follows:
expanded_example = {1:[4, 5, 6], 2:[7, 8, 9], 4:[5, 6], 5:[4, 6], 6:[4, 5], 7:[8, 9], 8:[7, 9], 9:[7, 8]}
This becomes quite complicated by values shared by multiple keys. As an example,
example2 = {1:[4, 5, 6], 2:[4, 7, 8, 9]}
Here 4 is a value in the lists associated with 1 and 2.
There are two approaches if there are "repeat" value elements:
(1) Only keep values immediately associated with a certain key:
{1:[4, 5, 6], 2:[4, 7, 8, 9], 4:[5, 6], 5:[4, 6], 6:[4, 5], 7:[8, 9], 8:[7, 9], 9:[7, 8]}
(2) Keep all associated values (as '4' is shared between keys '1' and '2'):
{1:[4, 5, 6], 2:[4, 7, 8, 9], 4:[5, 6, 7, 8, 9], 5:[4, 6], 6:[4, 5], 7:[4, 8, 9], 8:[4, 7, 9], 9:[4, 7, 8]}
EDITED:
My thought for this task was to use collections.defaultdict:
from collections import defaultdict
dict1 = {1:[4, 5, 6], 2:[4, 7, 8, 9]}
d_dict = defaultdict(list)
for k,l in dict1.items():
    for v in l:
        d_dict[v].append(l)
print(d_dict)
## defaultdict(<class 'list'>, {4: [[4, 5, 6], [4, 7, 8, 9]], 5: [[4, 5, 6]], 6: [[4, 5, 6]], 7: [[4, 7, 8, 9]], 8: [[4, 7, 8, 9]], 9: [[4, 7, 8, 9]]})
This gets me some of the way, but there are repeat elements in lists of lists...
strategy 2
example2 = {1:[4, 5, 6], 2:[4, 7, 8, 9]}
output = {**example2}
for val in example2.values():
    for idx,v in enumerate(val):
        if v not in output:
            output[v] = val[0:idx]+val[idx+1:]
        else:
            output[v].extend(val[0:idx]+val[idx+1:])
print(output)
#{1: [4, 5, 6], 2: [4, 7, 8, 9], 4: [5, 6, 7, 8, 9], 5: [4, 6], 6: [4, 5], 7: [4, 8, 9], 8: [4, 7, 9], 9: [4, 7, 8]}
strategy 1
import copy
example2 = {1:[4, 5, 6], 2:[4, 7, 8, 9]}
output = copy.deepcopy(example2)
for val in example2.values():
    for num in val:
        if num in output:
            val.remove(num)
    for idx,v in enumerate(val):
        output[v] = val[0:idx]+val[idx+1:]
print(output)
#{1: [4, 5, 6], 2: [4, 7, 8, 9], 4: [5, 6], 5: [4, 6], 6: [4, 5], 7: [8, 9], 8: [7, 9], 9: [7, 8]}
Note: This answer only deals with Approach #1.
You can work with copies of your data, as you should not add/remove dictionary items while iterating a view:
d = {1:[4, 5, 6], 2:[7, 8, 9]}
for k, v in list(d.items()):
    for w in v:
        L = v.copy()
        d[L.pop(L.index(w))] = L
print(d)
{1: [4, 5, 6], 2: [7, 8, 9], 4: [5, 6], 5: [4, 6],
6: [4, 5], 7: [8, 9], 8: [7, 9], 9: [7, 8]}
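For Approach #2 (keep all associated values), a value shared by several keys can pick up the same neighbour more than once if the lists overlap in more than one element. The following sketch avoids such repeats while keeping first-seen order. The name expand_all is made up for illustration, and it assumes that values which are also original keys should keep their original list.

```python
def expand_all(mapping):
    """Approach #2: give every value a list of all values it shares a key with."""
    expanded = {key: list(values) for key, values in mapping.items()}
    for values in mapping.values():
        for v in values:
            if v in mapping:
                continue  # assumption: original keys keep their original list
            neighbours = expanded.setdefault(v, [])
            for other in values:
                # Skip the value itself and anything already recorded.
                if other != v and other not in neighbours:
                    neighbours.append(other)
    return expanded


example2 = {1: [4, 5, 6], 2: [4, 7, 8, 9]}
print(expand_all(example2))
```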
I have two dictionaries:
concave = {6: [2, 3, 4, 5], 2: [6], 3: [6], 4: [6], 5: [6]}
convex = {1: [2, 3, 4, 5], 2: [1, 3, 5], 3: [1, 2, 4], 4: [1, 3, 5], 5: [1, 2, 4], 6: [7, 8, 9, 10], 7: [6, 8, 10, 11], 8: [6, 7, 9, 11], 9: [6, 8, 10, 11], 10: [6, 7, 9, 11], 11: [7, 8, 9, 10]}
And I have returned the keys which have max length values in the convex dict:
max_lens = [1, 6, 7, 8, 9, 10, 11]
For each number in max_lens, I want to check that it does not exist as a key in concave and its values in convex exist as keys in concave.
So in this example, '1' would satisfy this condition as it is not included in concave as a key, but its values in convex are (i.e. 2, 3, 4 and 5).
I have tried to figure out how to go about this using for loops/if statements:
for i in enumerate(max_lens):
    if i not in concave:
        for k,v in convex.items():
            for j in v:
That is about as far as I got before getting totally confused. There must be an easier way to do this other than using multiple for loops and if statements?
I'm a bit of a python noob so sorry if this comes across as confusing!
I think I understood (for the record I prefer the explicit concave.keys())
result_dict = {}
for convex_key in max_lens:
    result_dict[convex_key] = convex_key not in concave.keys() \
        and all(convex_val in concave.keys()
                for convex_val in convex[convex_key])
Edit (see comments)
for convex_key in max_lens:
    if convex_key not in concave.keys() and \
            all(convex_val in concave.keys() for convex_val in convex[convex_key]):
        top_face = convex_key
        break
Spelling the problem out in steps always helps:
1. Loop over each key l in max_lens.
2. Check that l does not exist in concave but does exist in convex. Both conditions must hold; if either fails, skip this key.
3. If both conditions hold, check that all the values in convex[l] exist as keys in concave.
4. If the code gets this far with no issues, all the conditions are met.
Demo:
concave = {6: [2, 3, 4, 5], 2: [6], 3: [6], 4: [6], 5: [6]}
convex = {1: [2, 3, 4, 5], 2: [1, 3, 5], 3: [1, 2, 4], 4: [1, 3, 5], 5: [1, 2, 4], 6: [7, 8, 9, 10], 7: [6, 8, 10, 11], 8: [6, 7, 9, 11], 9: [6, 8, 10, 11], 10: [6, 7, 9, 11], 11: [7, 8, 9, 10]}
max_lens = [1, 6, 7, 8, 9, 10, 11]
for l in max_lens:
    if l not in concave and l in convex and all(v in concave for v in convex[l]):
        print(l)
Output:
1
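The "all values are keys" test can also be phrased as a set comparison, because dictionary key views behave like sets. The following is a sketch of that variant; the function name keys_satisfying is an assumption for illustration, and the logic is otherwise the same as in the demo above.

```python
def keys_satisfying(max_lens, convex, concave):
    """Keys that are absent from concave but whose convex values are all keys of concave."""
    return [
        key for key in max_lens
        if key not in concave
        and key in convex
        # dict.keys() is set-like, so >= tests "is a superset of"
        and concave.keys() >= set(convex[key])
    ]


concave = {6: [2, 3, 4, 5], 2: [6], 3: [6], 4: [6], 5: [6]}
convex = {1: [2, 3, 4, 5], 2: [1, 3, 5], 3: [1, 2, 4], 4: [1, 3, 5], 5: [1, 2, 4],
          6: [7, 8, 9, 10], 7: [6, 8, 10, 11], 8: [6, 7, 9, 11], 9: [6, 8, 10, 11],
          10: [6, 7, 9, 11], 11: [7, 8, 9, 10]}
max_lens = [1, 6, 7, 8, 9, 10, 11]

matches = keys_satisfying(max_lens, convex, concave)
print(matches)
```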
You can do it with a list comprehension:
[i for i in max_lens if i not in concave and all(v in concave for v in convex[i])]
Using a simple for loop.
concave = {6: [2, 3, 4, 5], 2: [6], 3: [6], 4: [6], 5: [6]}
convex = {1: [2, 3, 4, 5], 2: [1, 3, 5], 3: [1, 2, 4], 4: [1, 3, 5], 5: [1, 2, 4], 6: [7, 8, 9, 10], 7: [6, 8, 10, 11], 8: [6, 7, 9, 11], 9: [6, 8, 10, 11], 10: [6, 7, 9, 11], 11: [7, 8, 9, 10]}
max_lens = [1, 6, 7, 8, 9, 10, 11]
for i in max_lens:
    if i not in concave:  # Check that i is not a key of concave.
        if all(v in concave for v in convex[i]):  # Check that every value is a key of concave.
            print(i)
Output:
1
If you don't understand a problem easily, it often helps to divide it into several smaller problems.
Write a function that checks that a value is not a key of a dict:
def is_no_key_in(v, _dict):
    return v not in _dict
Since that is too simple on its own, write a function that returns the list of values that are not keys in the dict:
def no_key_values(_list, _dict):
    return [ v for v in _list if is_no_key_in(v, _dict) ]
Now that you only have values that fit your first condition, you can concentrate on your second condition. Since you want every value of a list to be in a list of keys, you can start by writing an intersection-like function:
def intersection(a_lst, b_lst):
    return [ a for a in a_lst if a in b_lst]
To turn it into something closer to what you need, you could change it to a function that checks whether any element is missing:
def is_subset(a_lst, b_lst):
    return len([a for a in a_lst if a not in b_lst]) == 0
Now you piece the functions together:
def satisfies_conditions(max_lens):
    for lens in no_key_values(max_lens, concave):
        if is_subset(convex[lens], concave.keys()):
            yield lens
result = [ lens for lens in satisfies_conditions(max_lens) ]
result now contains all lenses that satisfy your conditions and if you want to change your conditions, you can easily do so. If your code works, you can go on and refactor it. For example you might not need is_no_key_in as it is a very simple function. Then go on and inline it into no_key_values:
def no_key_values(_list, _dict):
    return [ v for v in _list if v not in _dict ]
If you write some tests before refactoring (or even before writing the code), you can make sure that your refactoring does not introduce bugs. Then simplify the code step by step. You may end up with a solution as simple as those proposed in the other answers here.
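As a rough illustration of such tests, here is a minimal sketch using plain assert statements. The helper functions are restated so the block runs on its own, satisfies_conditions takes the two dicts as parameters here (an assumption made to keep the test self-contained), and the test function names are invented.

```python
def no_key_values(_list, _dict):
    return [v for v in _list if v not in _dict]


def is_subset(a_lst, b_lst):
    return len([a for a in a_lst if a not in b_lst]) == 0


def satisfies_conditions(max_lens, convex, concave):
    for lens in no_key_values(max_lens, concave):
        if is_subset(convex[lens], concave.keys()):
            yield lens


def test_no_key_values():
    # 1 is a key of the dict, 2 and 3 are not
    assert no_key_values([1, 2, 3], {1: 'a'}) == [2, 3]


def test_is_subset():
    assert is_subset([1, 2], [1, 2, 3])
    assert not is_subset([1, 4], [1, 2, 3])
    assert is_subset([], [1])  # an empty list is a subset of anything


def test_satisfies_conditions():
    concave = {6: [2, 3, 4, 5], 2: [6], 3: [6], 4: [6], 5: [6]}
    convex = {1: [2, 3, 4, 5], 6: [7, 8, 9, 10], 7: [6, 8, 10, 11]}
    assert list(satisfies_conditions([1, 6, 7], convex, concave)) == [1]


# Run the tests directly; a test runner such as pytest would also collect them.
test_no_key_values()
test_is_subset()
test_satisfies_conditions()
print("all tests passed")
```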
(I hope this will also help you with future problems like that :-))