Merging dictionaries without duplicate values in Python

I would like to merge two dictionaries, but when they share a key, I only want to keep the non-duplicate values.
The following code works, but is it possible to rewrite it as a union using | or {**dict1, **dict2}? When I tried |, dict_merge({ 'A': [1, 2, 3] }, { 'A': [2, 3, 4] }) returned {'A': [2, 3, 4]} rather than the combined list.
def dict_merge(dict1, dict2):
    for key in dict2.keys():
        if key in dict1.keys():
            d3 = dict1[key] + dict2[key]
            d3 = set(d3)
            dict1[key] = list(d3)
        else:
            dict1[key] = dict2[key]
    return dict1
dict_merge({ 'A': [1, 2, 3] }, { 'B': [2, 4, 5, 6]})
Output
{ 'A': [1, 2, 3], 'B': [2, 4, 5, 6] }
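For context on why | gave {'A': [2, 3, 4]}: both dict1 | dict2 (Python 3.9+) and {**dict1, **dict2} merge at the key level only, so when a key appears in both, the right-hand value replaces the left-hand one wholesale. A minimal illustration (the names left and right are made up for this example):

```python
left = {'A': [1, 2, 3]}
right = {'A': [2, 3, 4], 'B': [5]}

# Key-level merge: for the shared key 'A' the right-hand list wins outright.
unpacked = {**left, **right}
print(unpacked)  # {'A': [2, 3, 4], 'B': [5]}

# The | operator (Python 3.9+) behaves the same way.
piped = left | right
print(piped == unpacked)  # True
```

So neither form can union the values by itself; some per-key logic is still needed.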

Giving your two dictionaries names, let's get the union of their keys.
>>> d1 = { 'A': [1, 2, 3] }
>>> d2 = { 'A': [2, 3, 4] }
>>> d1.keys() | d2.keys()
{'A'}
Since your code treats the lists as sets, we can iterate over the union of the keys in a dictionary comprehension, union the two value sets for each key, and turn the result back into a list.
>>> {k: list(set(d1.get(k, [])) | set(d2.get(k, []))) for k in d1.keys() | d2.keys()}
{'A': [1, 2, 3, 4]}
If we incorporate some more interesting dictionaries and repeat the same dictionary comprehension:
>>> d1 = {'A': [1,2,3], 'B': [4,5,6]}
>>> d2 = {'B': [5,6,7,8], 'C': [9,10]}
>>> {k: list(set(d1.get(k, [])) | set(d2.get(k, []))) for k in d1.keys() | d2.keys()}
{'C': [9, 10], 'A': [1, 2, 3], 'B': [4, 5, 6, 7, 8]}
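One caveat with the comprehension above is that going through set does not promise to keep the original order of the list elements or of the keys. If order matters, a possible sketch is below. The function name merge_unique is hypothetical, and it assumes the list elements are hashable:

```python
def merge_unique(d1, d2):
    """Merge two dicts of lists, keeping each value once in first-seen order."""
    merged = {}
    # Walking d1's keys first and then d2's keeps the key order predictable.
    for key in [*d1, *d2]:
        if key not in merged:
            combined = d1.get(key, []) + d2.get(key, [])
            # dict.fromkeys drops duplicates but keeps insertion order.
            merged[key] = list(dict.fromkeys(combined))
    return merged

d1 = {'A': [3, 1, 2], 'B': [4, 5, 6]}
d2 = {'B': [6, 5, 7, 8], 'C': [10, 9]}
print(merge_unique(d1, d2))
# {'A': [3, 1, 2], 'B': [4, 5, 6, 7, 8], 'C': [10, 9]}
```

Unlike the original dict_merge, this version builds a new dictionary and leaves both inputs untouched.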

Is that the solution you're wanting?
In [61]: my_dict = {}
    ...: d1 = {'A': [1, 2, 3], 'C': 123}
    ...: d2 = {'B': [2, 3, 4], 'A': [1, 2, 3]}
    ...: for i in set(d1).symmetric_difference(set(d2)):
    ...:     my_dict[i] = d1[i] if i in d1 else d2[i]
    ...: print(my_dict)
Output :
{'B': [2, 3, 4], 'C': 123}
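Note that the loop above keeps only the keys found in exactly one of the dictionaries, so the shared key 'A' is dropped from the result. Dictionary key views support the usual set operators, which makes it easy to check which keys fall into which group. A small sketch with the same d1 and d2 (the variable names only_one, shared and every are just for illustration):

```python
d1 = {'A': [1, 2, 3], 'C': 123}
d2 = {'B': [2, 3, 4], 'A': [1, 2, 3]}

only_one = d1.keys() ^ d2.keys()   # symmetric difference: keys in exactly one dict
shared = d1.keys() & d2.keys()     # intersection: keys in both dicts
every = d1.keys() | d2.keys()      # union: all keys

print(sorted(only_one))  # ['B', 'C']
print(sorted(shared))    # ['A']
print(sorted(every))     # ['A', 'B', 'C']
```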

Related

How to merge two dictionaries in Python with a union of values when a key exists in both

How can I merge two dictionaries in Python so that the values are unioned when a key exists in both? Each dictionary has lists as values.
I have three dictionaries:
d1 = {"KEY1": [1, 2, 3]}
d2 = {"KEY1": [2, 3, 4]}
d3 = {"KEY2": [1, 2, 3]}
How could I merge them so that:
merge(d1,d2) --> {"KEY1": [1, 2, 3, 4]}
merge(d1,d3) --> {"KEY1": [1, 2, 3],"KEY2": [1, 2, 3]}
For such a specific kind of dictionary merging, you have to create a custom merge function. Something like this is probably a good start:
def merge(d1, d2):
    merged = {}
    for k in d1:
        if k in d2:
            merged[k] = sorted(list(set(d1[k] + d2[k])))
        else:
            merged[k] = d1[k]
    for k in d2:
        if k not in merged:
            merged[k] = d2[k]
    return merged
d1 = {'KEY1': [1, 2, 3]}
d2 = {'KEY1': [2, 3, 4]}
d3 = {'KEY2': [1, 2, 3]}
print(merge(d1, d2))
print(merge(d1, d3))
Output:
{'KEY1': [1, 2, 3, 4]}
{'KEY1': [1, 2, 3], 'KEY2': [1, 2, 3]}
Define a merge function (here as a lambda):
merge = lambda x,y: {v:list(set(x.get(v, []) + y.get(v, []))) for v in x.keys() | y.keys()}
output:
merge(d1,d3)
{'KEY2': [1, 2, 3], 'KEY1': [1, 2, 3]}
merge(d1,d2)
{'KEY1': [1, 2, 3, 4]}
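The same idea extends to any number of dictionaries. A minimal sketch, assuming every value is a list of hashable, mutually comparable items (the name merge_all is hypothetical):

```python
from collections import defaultdict

def merge_all(*dicts):
    """Union the list values of any number of dicts, key by key."""
    collected = defaultdict(set)
    for d in dicts:
        for key, values in d.items():
            collected[key].update(values)
    # Sorting gives a stable, readable result; it assumes the values are comparable.
    return {key: sorted(values) for key, values in collected.items()}

d1 = {"KEY1": [1, 2, 3]}
d2 = {"KEY1": [2, 3, 4]}
d3 = {"KEY2": [1, 2, 3]}
print(merge_all(d1, d2, d3))
# {'KEY1': [1, 2, 3, 4], 'KEY2': [1, 2, 3]}
```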
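I got the expected output by using ** unpacking, which expands a dictionary's key-value pairs into a new dict literal (in a function call it would pass them as keyword arguments).
def merge(a, b):
    # Start from a plain key-level merge, then overwrite shared keys with the union of both lists.
    shared = {k: sorted(set(a[k]) | set(b[k])) for k in a.keys() & b.keys()}
    return {**a, **b, **shared}
print(merge(d1,d2))
print(merge(d1,d3))
Output:
{'KEY1': [1, 2, 3, 4]}
{'KEY1': [1, 2, 3], 'KEY2': [1, 2, 3]}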

Convert each list element into a nested dictionary key

I have a list of strings that I need to use to create a nested dictionary with some values: ['C/A', 'C/B/A', 'C/B/B']
The output should be in the format {'C': {'A': [1, 2, 3], 'B': {'A': [1, 2, 3], 'B': [1, 2, 3]}}}
I've tried the code below to create the nested dictionary and update the values, but instead I get {'C': {'A': [1, 2, 3], 'C': {'B': {'A': [1, 2, 3], 'C': {'B': {'B': [1, 2, 3]}}}}}} as the output, which is not the correct format. I'm still trying to figure out a way. Any ideas?
s = ['C/A', 'C/B/A', 'C/B/B']
new = current = dict()
for each in s:
    lst = each.split('/')
    for i in range(len(lst)):
        current[lst[i]] = dict()
        if i != len(lst)-1:
            current = current[lst[i]]
        else:
            current[lst[i]] = [1,2,3]
print(new)
You can create a custom Tree class:
class Tree(dict):
    '''
    Create arbitrarily nested dicts.
    >>> t = Tree()
    >>> t[1][2][3] = 4
    >>> t
    {1: {2: {3: 4}}}
    >>> t.set_nested_item('a', 'b', 'c', value=5)
    >>> t
    {1: {2: {3: 4}}, 'a': {'b': {'c': 5}}}
    '''
    def __missing__(self, key):
        self[key] = type(self)()
        return self[key]
    def set_nested_item(self, *keys, value):
        head, *rest = keys
        if not rest:
            self[head] = value
        else:
            self[head].set_nested_item(*rest, value=value)
>>> s = ['C/A', 'C/B/A', 'C/B/B']
>>> output = Tree()
>>> default = [1, 2, 3]
>>> for item in s:
...     output.set_nested_item(*item.split('/'), value=list(default))
...
>>> output
{'C': {'A': [1, 2, 3], 'B': {'A': [1, 2, 3], 'B': [1, 2, 3]}}}
You do not need numpy for this problem, but you may want to use recursion. Here is a recursive function add that adds a list of string keys lst and eventually a list of numbers to the dictionary d:
def add(lst, d):
    key = lst[0]
    if len(lst) == 1: # if the list has only 1 element
        d[key] = [1, 2, 3] # That element is the last key
        return
    if key not in d: # Haven't seen that key before
        d[key] = dict()
    add(lst[1:], d[key]) # The recursive part
To use the function, create a new dictionary and apply the function to each split string:
d = dict()
for each in s:
    add(each.split("/"), d)
# d
# {'C': {'A': [1, 2, 3], 'B': {'A': [1, 2, 3], 'B': [1, 2, 3]}}}
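For comparison, the same structure can also be built without a custom class or recursion by walking each path with dict.setdefault. This is only a sketch: build_tree and leaf_factory are hypothetical names, and it assumes no path is a prefix of another (for example 'C/B' alongside 'C/B/A'):

```python
def build_tree(paths, leaf_factory=lambda: [1, 2, 3]):
    """Turn 'A/B/C'-style paths into nested dicts with a fresh leaf value per path."""
    tree = {}
    for path in paths:
        *parents, last = path.split('/')
        node = tree
        for key in parents:
            # Reuse the existing sub-dict if this key was seen before, else create it.
            node = node.setdefault(key, {})
        node[last] = leaf_factory()
    return tree

s = ['C/A', 'C/B/A', 'C/B/B']
print(build_tree(s))
# {'C': {'A': [1, 2, 3], 'B': {'A': [1, 2, 3], 'B': [1, 2, 3]}}}
```

Calling leaf_factory for each path gives every leaf its own list, so changing one leaf later does not affect the others.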

Python Dictionary Filtration with items as a list

I have several lists as items in my dictionary. I want to create a dictionary with the same keys, but only with items that correspond to the unique values of the list in the first key. What's the best way to do this?
Original:
d = {'s': ['a','a','a','b','b','b','b'],
'd': ['c1','d2','c3','d4','c5','d6','c7'],
'g': ['e1','f2','e3','f4','e5','f6','e7']}
Output:
e = {'s': ['a','a','a'],
'd': ['c1','d2','c3'],
'g': ['e1','f2','e3']}
f = {'s': ['b','b','b','b'],
'd': ['d4','c5','d6','c7'],
'g': ['f4','e5','f6','e7']}
I don't think there is an easy way to do this. I created a (not so) little function for you:
def func(entry):
    PARSING_KEY = "s"
    # check if entry dict is valid (optional)
    assert isinstance(entry, dict)
    for key in entry.keys():
        assert isinstance(entry[key], list)
    first_list = entry[PARSING_KEY]
    first_list_len = len(first_list)
    for key in entry.keys():
        assert len(entry[key]) == first_list_len
    # parsing: collect, for each distinct value, the indexes where it occurs
    output_list_index = []
    already_check = set()
    for index1, item1 in enumerate(first_list):
        if item1 not in already_check:
            output_list_index.append([])
            # start=index1 keeps index2 an absolute position in the full list
            for index2, item2 in enumerate(first_list[index1:], start=index1):
                if item2 == item1:
                    output_list_index[-1].append(index2)
            already_check.add(item1)
    # creating lists
    output_list = []
    for indexes in output_list_index:
        new_dict = {}
        for key, value in entry.items():
            new_dict[key] = [value[i] for i in indexes]
        output_list.append(new_dict)
    return output_list
Note that dicts only guarantee insertion order from Python 3.7 onward, so rather than rely on a "first key" the function hardcodes the key used for splitting (the PARSING_KEY constant at the top of the function).
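A shorter variant of the same idea is sketched below: group the positions of each distinct value in a single pass, then pick those positions from every list. The name split_by_key and its parsing_key parameter are hypothetical, and the sketch assumes all lists have the same length:

```python
from collections import defaultdict

def split_by_key(entry, parsing_key):
    """Split a dict of equal-length lists into one dict per distinct value of parsing_key."""
    # Map each distinct value to the positions where it occurs.
    positions = defaultdict(list)
    for index, item in enumerate(entry[parsing_key]):
        positions[item].append(index)
    # Build one dict per distinct value, picking the same positions from every list.
    return [
        {key: [values[i] for i in indexes] for key, values in entry.items()}
        for indexes in positions.values()
    ]

d = {'s': ['a', 'a', 'a', 'b', 'b', 'b', 'b'],
     'd': ['c1', 'd2', 'c3', 'd4', 'c5', 'd6', 'c7'],
     'g': ['e1', 'f2', 'e3', 'f4', 'e5', 'f6', 'e7']}
e, f = split_by_key(d, 's')
print(e)
print(f)
```

The groups come back in the order in which each value first appears in the parsing list.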
original_dict = {
'a': [1, 3, 5, 8, 4, 2, 1, 2, 7],
'b': [4, 4, 4, 4, 4, 3],
'c': [822, 1, 'hello', 'world']
}
distinct_dict = {k: list(set(v)) for k, v in original_dict.items()}
distinct_dict
yields
{'a': [1, 2, 3, 4, 5, 7, 8], 'b': [3, 4], 'c': [1, 'hello', 'world', 822]}

Restructuring the hierarchy of dictionaries in Python?

If I have a nested dictionary in Python, is there any way to restructure it based on keys?
I'm bad at explaining, so I'll give a little example.
d = {'A':{'a':[1,2,3],'b':[3,4,5],'c':[6,7,8]},
'B':{'a':[7,8,9],'b':[4,3,2],'d':[0,0,0]}}
Re-organize like this
newd = {'a':{'A':[1,2,3],'B':[7,8,9]},
'b':{'A':[3,4,5],'B':[4,3,2]},
'c':{'A':[6,7,8]},
'd':{'B':[0,0,0]}}
Given some function with inputs like
def mysteryfunc(olddict,newkeyorder):
    ????
mysteryfunc(d,[1,0])
Here the list [1,0] means: put the dictionary's second level of keys in the first level and the first level in the second. Obviously the values need to stay associated with their own keys.
Edit:
Looking for an answer that covers the general case, with arbitrary unknown nested dictionary depth.
Input:
d = {'A':{'a':[1,2,3],'b':[3,4,5],'c':[6,7,8]},
'B':{'a':[7,8,9],'b':[4,3,2],'d':[0,0,0]}}
inner_dict = {}
for k, v in d.items():
    for ka, va in v.items():
        if ka not in inner_dict:
            # first time this inner key is seen: start its dict
            inner_dict[ka] = {k: va}
        else:
            inner_dict[ka][k] = va
print(inner_dict)
Output:
{'a': {'A': [1, 2, 3], 'B': [7, 8, 9]},
'b': {'A': [3, 4, 5], 'B': [4, 3, 2]},
'c': {'A': [6, 7, 8]},
'd': {'B': [0, 0, 0]}}
You can use two for loops: the outer one iterates over each key-value pair and the inner one iterates over the nested dict. At each step of the inner loop you can build up the desired output:
from collections import defaultdict
new_dict = defaultdict(dict)
for k0, v0 in d.items():
    for k1, v1 in v0.items():
        new_dict[k1][k0] = v1
print(dict(new_dict))
output:
{'a': {'A': [1, 2, 3], 'B': [7, 8, 9]},
'b': {'A': [3, 4, 5], 'B': [4, 3, 2]},
'c': {'A': [6, 7, 8]},
'd': {'B': [0, 0, 0]}}
You can use recursion with a generator to handle input of arbitrary depth:
def paths(d, c = []):
    for a, b in d.items():
        yield from ([((c+[a])[::-1], b)] if not isinstance(b, dict) else paths(b, c+[a]))
from collections import defaultdict
def group(d):
    _d = defaultdict(list)
    for [a, *b], c in d:
        _d[a].append([b, c])
    return {a:b[-1][-1] if not b[0][0] else group(b) for a, b in _d.items()}
print(group(list(paths(d))))
Output:
{'a': {'A': [1, 2, 3], 'B': [7, 8, 9]}, 'b': {'A': [3, 4, 5], 'B': [4, 3, 2]}, 'c': {'A': [6, 7, 8]}, 'd': {'B': [0, 0, 0]}}
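The answers above always reverse the key levels. To honor an explicit key order such as [1,0], as the question's mysteryfunc asks, one possible sketch is to flatten the nested dict into (path, value) pairs, permute each path, and rebuild. The names flatten and reorder are hypothetical, and the sketch assumes every leaf sits at the same depth, equal to len(order):

```python
def flatten(d, prefix=()):
    """Yield (key_path, leaf_value) pairs for a nested dict."""
    for key, value in d.items():
        if isinstance(value, dict):
            yield from flatten(value, prefix + (key,))
        else:
            yield prefix + (key,), value

def reorder(d, order):
    """Rebuild d so that new level i uses the keys from old level order[i]."""
    out = {}
    for path, value in flatten(d):
        new_path = [path[i] for i in order]
        node = out
        for key in new_path[:-1]:
            node = node.setdefault(key, {})
        node[new_path[-1]] = value
    return out

d = {'A': {'a': [1, 2, 3], 'b': [3, 4, 5], 'c': [6, 7, 8]},
     'B': {'a': [7, 8, 9], 'b': [4, 3, 2], 'd': [0, 0, 0]}}
print(reorder(d, [1, 0]))
# {'a': {'A': [1, 2, 3], 'B': [7, 8, 9]}, 'b': {'A': [3, 4, 5], 'B': [4, 3, 2]}, 'c': {'A': [6, 7, 8]}, 'd': {'B': [0, 0, 0]}}
```

With three levels, an order like [2, 0, 1] would move the innermost keys to the top and shift the other two levels down.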

Python list to dictionary with indexes

I am trying to convert a list:
[A, B, A, A, B, C]
to a dictionary with each item and the indexes where it was found:
{ A : [0,2,3], B : [1,4], C : [5] }
Any idea of an efficient way to do that?
Use a defaultdict and enumerate:
>>> lst = ['a','b','a','a','b','c']
>>> from collections import defaultdict
>>> d = defaultdict(list)
>>> for i, value in enumerate(lst):
...     d[value].append(i)
...
>>> d
defaultdict(<class 'list'>, {'a': [0, 2, 3], 'c': [5], 'b': [1, 4]})
Alternatively, this can be done with a plain dict, although it is usually slower:
>>> lst = ['a','b','a','a','b','c']
>>> d = {}
>>> for i, value in enumerate(lst):
...     d.setdefault(value, []).append(i)
...
>>> d
{'a': [0, 2, 3], 'c': [5], 'b': [1, 4]}
You could have, of course, converted the defaultdict to a dict:
>>> d
defaultdict(<class 'list'>, {'a': [0, 2, 3], 'c': [5], 'b': [1, 4]})
>>> dict(d)
{'a': [0, 2, 3], 'c': [5], 'b': [1, 4]}
>>> help(dict)
Try this:
lst = ['A', 'B', 'A', 'A', 'B', 'C']
print({i: [j[0] for j in enumerate(lst) if j[1] == i] for i in set(lst)})
Result
{'A': [0, 2, 3], 'B': [1, 4], 'C': [5]}
Use list comprehension and a dict comprehension. Create a set out of the list first. Then you can easily use enumerate and do this.
>>> l = ["A", "B", "A", "A", "B", "C"]
>>> {i:[j for j,k in enumerate(l) if k==i] for i in set(l)}
{'C': [5], 'B': [1, 4], 'A': [0, 2, 3]}
