Combining multiple nested dictionaries in Python

I have multiple nested dictionaries with different levels and I would like to combine them on the same key. Here are three example dictionaries:
dict_1={'D': {'D': '1','B': '2','A': '3'},'A': {'A': '5','J': '6'}}
dict_2={'D': {'D': '7', 'B': '8', 'C': '9'},'A': {'A': '12', 'C':'13'}}
dict_3={'D': {'test1': '14','test2': '3'},'B': {'test1': '21','test2': '16'},'A': {'test1': '3','test2': '2'},'J': {'test1': '15','test2': '3'}, 'C':{'test1': '44','test2': '33'}}
I want to combine these three: merge 'dict_1' and 'dict_2' on matching keys, labelling each value with its source, and then add the keys and values from 'dict_3' to each inner key of that combination:
main_dict={
'D':
{'D':{'dict_1_value':'1', 'dict_2_value':'7', 'test1': '14', 'test2': '3'},
'B':{'dict_1_value':'2', 'dict_2_value':'8', 'test1': '21', 'test2': '16'},
'A':{'dict_1_value':'3', 'test1': '3', 'test2': '2'},
'C':{'dict_2_value':'9', 'test1': '44', 'test2': '33'}},
'A':
{'A':{'dict_1_value':'5', 'dict_2_value':'12', 'test1': '3', 'test2': '2'},
'J':{'dict_1_value':'6', 'test1': '15', 'test2': '3'},
'C':{'dict_2_value':'13', 'test1': '44', 'test2': '33'}}
}
At first I tried to combine dict_1 and dict_2, but the values from dict_1 get overwritten when I use {k: v | dict_2[k] for k, v in dict_1.items()} or dict(**dict_1, **dict_2). Moreover, I don't know how to add dict_3 while labelling the values with key names such as 'dict_1_value' or 'dict_2_value'.
Is there any way to accomplish main_dict?

all_keys = set(dict_1.keys()).union(dict_2.keys())
temp_1 = {key: {k: {'dict_1_value': v} for k, v in sub.items()} for key, sub in dict_1.items()}
temp_2 = {key: {k: {'dict_2_value': v} for k, v in sub.items()} for key, sub in dict_2.items()}
combined = {}
for key in all_keys:
    sub_1 = temp_1.get(key, {})
    sub_2 = temp_2.get(key, {})
    sub_keys = set(sub_1.keys()).union(sub_2.keys())
    combined[key] = {k: sub_1.get(k, {}) | sub_2.get(k, {}) for k in sub_keys}
Now there are two options:
1. Dictionary comprehension - the new dictionary is constructed from scratch:
main_dict = {key: {k: v | dict_3.get(k, {})
                   for k, v in sub.items()}
             for key, sub in combined.items()}
2. Loop - items of the existing dictionary are just updated:
for key, sub in combined.items():
    for k, v in sub.items():
        v.update(dict_3.get(k, {}))
main_dict = combined
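The steps above can also be folded into one helper. This is only a sketch under the question's assumptions (every input is nested exactly two levels deep, and dict_3 is keyed by the inner key); the name merge_labeled and its parameters are made up for illustration.

```python
def merge_labeled(labeled, extra):
    """Merge two-level dicts, tagging each value with its source label.

    `labeled` maps a label (e.g. 'dict_1_value') to a nested dict;
    `extra` maps an inner key to fields copied into every matching entry.
    Both parameter names are illustrative, not from the original answer.
    """
    merged = {}
    for label, nested in labeled.items():
        for outer, sub in nested.items():
            for inner, value in sub.items():
                merged.setdefault(outer, {}).setdefault(inner, {})[label] = value
    # Attach the per-inner-key fields (dict_3 in the question).
    for sub in merged.values():
        for inner, fields in sub.items():
            fields.update(extra.get(inner, {}))
    return merged


dict_1 = {'D': {'D': '1', 'B': '2', 'A': '3'}, 'A': {'A': '5', 'J': '6'}}
dict_2 = {'D': {'D': '7', 'B': '8', 'C': '9'}, 'A': {'A': '12', 'C': '13'}}
dict_3 = {'D': {'test1': '14', 'test2': '3'}, 'B': {'test1': '21', 'test2': '16'},
          'A': {'test1': '3', 'test2': '2'}, 'J': {'test1': '15', 'test2': '3'},
          'C': {'test1': '44', 'test2': '33'}}

main_dict = merge_labeled({'dict_1_value': dict_1, 'dict_2_value': dict_2}, dict_3)
print(main_dict['D']['A'])
# {'dict_1_value': '3', 'test1': '3', 'test2': '2'}
```

Because the labels are passed in as a mapping, a third or fourth source dictionary only needs another entry in that mapping.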

Related

list duplicate values in a nested dictionary

I need to check for duplicate values that might occur in a dictionary. I have a dictionary in the following layout. Any advice is welcome, thanks so much!
The original dictionary:
dic = {'ab1': [{'ans': 'Male', 'val': '1'},
{'ans': 'Female', 'val': '2'},
{'ans': 'Other', 'val': '3'},
{'ans': 'Prefer not to answer', 'val': '3'}],
'bc1': [{'ans': 'Employed', 'val': '1'},
{'ans': 'Unemployed', 'val': '2'},
{'ans': 'Student', 'val': '3'},
{'ans': 'Retired', 'val': '4'},
{'ans': 'Part-time', 'val': '5'},
{'ans': 'Prefer not to answer', 'val': '7'}],
'bc2': [{'ans': 'Mother', 'val': '1'},
{'ans': 'Father ', 'val': '2'},
{'ans': 'Brother', 'val': '3'},
{'ans': 'Sister', 'val': '4'},
{'ans': 'Grandmother', 'val': '4'},
{'ans': 'Grandfather', 'val': '6'},
{'ans': 'Son', 'val': '7'},
{'ans': 'Daughter', 'val': '8'}]}
The expected output is a list that contains ONLY the items that share a value with another item under the same key, so only this:
ab1: Other 3, Prefer not to answer 3
bc2: Sister 4, Grandmother 4
The code I have tried aims to reverse the dictionary first, but it throws an "unhashable type: list" error. I think that is because each value is treated as a list when it might need to be a tuple, but I don't know how to change it:
rev_dict = {}
for k, v in dic.items():
    rev_dict.setdefault(v, set()).add(k)
res = set(chain.from_iterable(v for k, v in rev_dict.items()
                              if len(v) > 1))
You've not specified an exact output format, but since you tagged pandas, here's a pandas solution.
import pandas as pd
{k: pd.DataFrame(v)[lambda df: df['val'].duplicated(keep=False)].to_dict(orient='records') for k, v in dic.items()}
Output:
{
'ab1': [{'ans': 'Other', 'val': '3'},
{'ans': 'Prefer not to answer', 'val': '3'}],
'bc1': [],
'bc2': [{'ans': 'Sister', 'val': '4'}, {'ans': 'Grandmother', 'val': '4'}]
}
The pandas answer is certainly nicer, but here is a plain-Python version using collections.Counter:
from collections import Counter

lst = []
for key in dic:
    counts = Counter(j['val'] for j in dic[key])
    new = {j['ans']: j['val'] for j in dic[key] if counts[j['val']] > 1}
    if new:
        lst.append(key + ': ' + ', '.join('{} {}'.format(ans, val) for ans, val in new.items()))
Import itertools and try this:
list(itertools.chain(*[[(k, i['ans'],i['val']) for i in v] for k, v in dic.items()]))
Long version (a plain loop, so itertools is not needed here; chaining lst again would flatten the tuples themselves):
lst = []
for k, v in dic.items():
    for i in v:
        tup = (k, i['ans'], i['val'])
        lst.append(tup)
lst
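The error in the question comes from using each list as a dictionary key, and lists are unhashable. A minimal sketch of the "reverse it first" idea that avoids this is to group the answers by their 'val' string inside each key. The helper name duplicates_per_key and the shortened sample data are assumptions for illustration.

```python
from collections import defaultdict


def duplicates_per_key(dic):
    """For each key, keep only the (ans, val) pairs whose val occurs more than once."""
    result = {}
    for key, entries in dic.items():
        by_val = defaultdict(list)          # val -> answers sharing that val
        for entry in entries:
            by_val[entry['val']].append(entry['ans'])
        dupes = [(ans, val)
                 for val, answers in by_val.items() if len(answers) > 1
                 for ans in answers]
        if dupes:                           # skip keys without duplicates
            result[key] = dupes
    return result


sample = {
    'ab1': [{'ans': 'Male', 'val': '1'},
            {'ans': 'Other', 'val': '3'},
            {'ans': 'Prefer not to answer', 'val': '3'}],
    'bc1': [{'ans': 'Employed', 'val': '1'},
            {'ans': 'Unemployed', 'val': '2'}],
}
for key, pairs in duplicates_per_key(sample).items():
    print(key + ': ' + ', '.join(f'{ans} {val}' for ans, val in pairs))
# ab1: Other 3, Prefer not to answer 3
```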

dict from a dict of list

I have a Python dictionary with following format:
d1 = {'Name':['ABC'], 'Number':['123'], 'Element 1':['1', '2', '3'],
'Element2':['1','2','3']}
Expected output:
{'Name': 'ABC', 'Number': '123',
'Elements': [{'Element 1': '1', 'Element2': '1'},
{'Element 1': '2', 'Element2': '2'},
{'Element 1': '3', 'Element2': '3'}]}
I have tried the following:
[{k: v[i] for k, v in d1.items() if i < len(v)}
for i in range(max([len(l) for l in d1.values()]))]
but I am getting this result:
[{'Name': 'ABC', 'Number': '123', 'Element 1': '1', 'Element2': '1'},
{'Element 1': '2', 'Element2': '2'},
{'Element 1': '3', 'Element2': '3'}]
How can I go from here?
I strongly recommend not trying to do everything in one line. It's not always more efficient, and almost always less readable if you have any branching logic or nested loops.
Given your dict, we can pop() the Name and Number keys into our new dict:
output = dict()
d1 = {'Name':['ABC'], 'Number':['123'], 'Element 1':['1', '2', '3'], 'Element2':['1','2','3']}
output["Name"] = d1.pop("Name")
output["Number"] = d1.pop("Number")
print(output)
# prints:
# {'Name': ['ABC'], 'Number': ['123']}
print(d1)
# prints:
# {'Element 1': ['1', '2', '3'], 'Element2': ['1', '2', '3']}
Then, we zip all remaining values in the dictionary, and add them to a new list:
mylist = []
keys = d1.keys()
for vals in zip(*d1.values()):
    temp_obj = dict(zip(keys, vals))
    mylist.append(temp_obj)
print(mylist)
# prints:
# [{'Element 1': '1', 'Element2': '1'},
# {'Element 1': '2', 'Element2': '2'},
# {'Element 1': '3', 'Element2': '3'}]
And finally, assign that to output["Elements"]
output["Elements"] = mylist
print(output)
# prints:
# {'Name': ['ABC'], 'Number': ['123'], 'Elements': [{'Element 1': '1', 'Element2': '1'}, {'Element 1': '2', 'Element2': '2'}, {'Element 1': '3', 'Element2': '3'}]}
Since you don't want to hardcode the first two keys, you can instead copy every key whose name does not contain "element":
for k, v in d1.items():
    if "element" not in k.lower():
        output[k] = v
Or as a dict-comprehension:
output = {k: v for k, v in d1.items() if "element" not in k.lower()}
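Putting the pieces together, here is a sketch of a single function that also unwraps the one-item lists, as the expected output in the question does. It assumes that every key with more than one value belongs in the "Elements" rows and that those lists have equal length; the name split_record and the list_key parameter are hypothetical.

```python
def split_record(d, list_key="Elements"):
    """Unwrap single-item lists and zip the longer lists into row dicts."""
    scalars = {k: v[0] for k, v in d.items() if len(v) == 1}
    columns = {k: v for k, v in d.items() if len(v) > 1}
    # zip(*values) walks the columns in parallel, one row per position.
    rows = [dict(zip(columns, vals)) for vals in zip(*columns.values())]
    return {**scalars, list_key: rows}


d1 = {'Name': ['ABC'], 'Number': ['123'],
      'Element 1': ['1', '2', '3'], 'Element2': ['1', '2', '3']}
output = split_record(d1)
print(output['Name'], output['Elements'][0])
# ABC {'Element 1': '1', 'Element2': '1'}
```

Unlike the pop() approach, this leaves d1 unmodified.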
Use a list of tuples to create the Elements list of dictionaries, and use Convert to build each dictionary item from a tuple.
#https://www.geeksforgeeks.org/python-convert-list-tuples-dictionary/
d1 = {'Name':['ABC'], 'Number':['123'], 'Element 1':['1', '2', '3'],
      'Element2':['1','2','3']}
def Convert(tup, di):
    for a, b in tup:
        di[a] = b
    return di
result = {}  # named result so the built-in dict is not shadowed
listElements = []
for key, value in d1.items():
    if isinstance(value, list) and len(value) > 1:
        for item in value:
            listElements.append((key, item))
    elif isinstance(value, list) and len(value) == 1:
        result[key] = value[0]
    else:
        result[key] = value
result['Elements'] = [Convert([(x, y)], {}) for x, y in listElements]
print(result)
output:
{'Name': 'ABC', 'Number': '123', 'Elements': [{'Element 1': '1'}, {'Element 1': '2'}, {'Element 1': '3'}, {'Element2': '1'}, {'Element2': '2'}, {'Element2': '3'}]}
I'm going to explain step by step:
We build the new_d1 variable, which is the dictionary you expect as output, initialized as {'Name': 'ABC', 'Number': '123'}. To do that, we use a comprehension over the keys that do not contain 'Element':
new_d1 = {key: d1.get(key)[0] for key in filter(lambda x: 'Element' not in x, d1)}
We build the elements variable, a list holding the dictionaries we still have to manipulate to reach the expected result. elements is then [{'Element 1': ['1', '2', '3']}, {'Element2': ['1', '2', '3']}].
elements = [{key: d1.get(key)} for key in filter(lambda x: 'Element' in x, d1)]
We take a Cartesian product with itertools.product (imported as it), pairing each key with each item of its values in elements.
product = [list(it.product(d.keys(), *d.values())) for d in elements]
Using zip, we arrange the data and convert them into dictionaries. Finally, we create the "Elements" key in new_d1:
elements_list = [dict(t) for index, t in enumerate(list(zip(*product)))]
new_d1["Elements"] = elements_list
print(new_d1)
Full code:
import itertools as it
new_d1 = {key: d1.get(key)[0] for key in filter(lambda x: 'Element' not in x, d1)}
elements = [{key: d1.get(key)} for key in filter(lambda x: 'Element' in x, d1)]
product = [list(it.product(d.keys(), *d.values())) for d in elements]
elements_list = [dict(t) for index, t in enumerate(list(zip(*product)))]
new_d1["Elements"] = elements_list
Output:
{'Elements': [{'Element 1': '1', 'Element2': '1'},
{'Element 1': '2', 'Element2': '2'},
{'Element 1': '3', 'Element2': '3'}],
'Name': 'ABC',
'Number': '123'}

Switch key and value in a dictionary of sets

I have a dictionary like this:
d1 = {'0': {'a'}, '1': {'b'}, '2': {'c', 'd'}, '3': {'E','F','G'}}
and I want a result like this:
d2 = {'a': '0', 'b': '1', 'c': '2', 'd': '2', 'E': '3', 'F': '3', 'G': '3'}
so I tried
d2 = dict((v, k) for k, v in d1.items())
but each value is a set, so it didn't work (a set is unhashable and cannot be used as a key).
Is there any way I can fix it?
You could use a dictionary comprehension:
{v:k for k,vals in d1.items() for v in vals}
# {'a': '0', 'b': '1', 'c': '2', 'd': '2', 'E': '3', 'F': '3', 'G': '3'}
Note that you need an extra level of iteration over the values in each key here to get a flat dictionary.
Another dict comprehension:
>>> {v: k for k in d1 for v in d1[k]}
{'a': '0', 'b': '1', 'c': '2', 'd': '2', 'E': '3', 'F': '3', 'G': '3'}
Benchmark comparison with yatu's:
from timeit import repeat
setup = "d1 = {'0': {'a'}, '1': {'b'}, '2': {'c', 'd'}, '3': {'E','F','G'}}"
yatu = "{v:k for k,vals in d1.items() for v in vals}"
heap = "{v:k for k in d1 for v in d1[k]}"
for _ in range(3):
    print('yatu', min(repeat(yatu, setup)))
    print('heap', min(repeat(heap, setup)))
    print()
Results:
yatu 1.4274586000000227
heap 1.4059823000000051
yatu 1.4562267999999676
heap 1.3701727999999775
yatu 1.4313863999999512
heap 1.3878657000000203
Another benchmark, with a million keys/values:
setup = "d1 = {k: {k+1, k+2} for k in range(0, 10**6, 3)}"
for _ in range(3):
    print('yatu', min(repeat(yatu, setup, number=10)))
    print('heap', min(repeat(heap, setup, number=10)))
    print()
yatu 1.071519999999964
heap 1.1391495000000305
yatu 1.0880677000000105
heap 1.1534022000000732
yatu 1.0944767999999385
heap 1.1526202000000012
Here's another possible solution to the given problem:
def flatten_dictionary(dct):
    d = {}
    for k, st_values in dct.items():
        for v in st_values:
            d[v] = k
    return d

if __name__ == '__main__':
    d1 = {'0': {'a'}, '1': {'b'}, '2': {'c', 'd'}, '3': {'E', 'F', 'G'}}
    d2 = flatten_dictionary(d1)
    print(d2)
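All of the approaches above silently keep the last key when the same element appears in more than one set. If that would be a data error, a variant can refuse to overwrite. This is a sketch only; the name invert_sets and the decision to raise ValueError are assumptions, not part of the answers.

```python
def invert_sets(d):
    """Map each set member back to its key, rejecting members seen twice."""
    inverted = {}
    for key, members in d.items():
        for member in members:
            if member in inverted:
                raise ValueError(
                    f"{member!r} appears under both {inverted[member]!r} and {key!r}")
            inverted[member] = key
    return inverted


d1 = {'0': {'a'}, '1': {'b'}, '2': {'c', 'd'}, '3': {'E', 'F', 'G'}}
d2 = invert_sets(d1)
print(d2['c'], d2['G'])
# 2 3
```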

How to write two else condition in dict comprehension

Given two dictionaries d1 and d2, I want to create a new dictionary that collects the values found under the same keys.
d1 = {'product': '8', 'order': '8', 'tracking': '3'}
d2 = {'order': 1, 'product': 1,'customer':'5'}
dict3 = { k: [ d1[k], d2[k] ] if k in d2 else [d1[k]] for k in d1 }
dict3
{'product': ['8', 1], 'order': ['8', 1], 'tracking': ['3']}
How can I add an else [d2[k]] for k in d2 branch to get the expected output?
My expected output:
{'product': ['8', 1], 'order': ['8', 1], 'tracking': ['3'],'customer':['5']}
Disclaimer: I have already done this with defaultdict. Please answer with a dict comprehension only.
You could use a nested ternary ... if ... else (... if ... else ...), but what if there are three dictionaries, or four?
Better use a nested list comprehension and iterate over the different dictionaries.
>>> d1 = {'product': '8', 'order': '8', 'tracking': '3'}
>>> d2 = {'order': 1, 'product': 1,'customer':'5'}
>>> {k: [d[k] for d in (d1, d2) if k in d] for k in set(d1) | set(d2)}
{'customer': ['5'], 'order': ['8', 1], 'product': ['8', 1], 'tracking': ['3']}
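The set union above gives no ordering guarantee for the keys. If the result should follow the order in which keys first appear (d1 first, then any new keys from d2), one option is to iterate over a merged dict instead, since dicts preserve insertion order in Python 3.7+. This is a sketch, with the variable names chosen for illustration.

```python
d1 = {'product': '8', 'order': '8', 'tracking': '3'}
d2 = {'order': 1, 'product': 1, 'customer': '5'}

# {**d1, **d2} is used only for its keys, in first-seen order;
# the merged values themselves are ignored.
dict3 = {k: [d[k] for d in (d1, d2) if k in d] for k in {**d1, **d2}}
print(dict3)
# {'product': ['8', 1], 'order': ['8', 1], 'tracking': ['3'], 'customer': ['5']}
```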
You have to iterate over both dictionaries to include all the keys in the newly constructed dict.
You can achieve this by using defaultdict:
from collections import defaultdict
res = defaultdict(list)
for key, value in d1.items():
    res[key].append(value)
for key, value in d2.items():
    res[key].append(value)
Output:
>>> dict(res)
{'product': ['8', 1], 'order': ['8', 1], 'tracking': ['3'], 'customer': ['5']}
Using a defaultdict without a comprehension is a much, much better way to go, but as requested:
d1 = {'product': '8', 'order': '8', 'tracking': '3'}
d2 = {'order': 1, 'product': 1,'customer':'5'}
d3 = {
    k: [d1[k], d2[k]]
    if (k in d1 and k in d2)
    else [d1[k]]
    if k in d1
    else [d2[k]]
    for k in list(d1.keys()) + list(d2.keys())
}
d3 is now:
{'product': ['8', 1], 'order': ['8', 1], 'tracking': ['3'], 'customer': ['5']}
>>> d1 = {'product': '8', 'order': '8', 'tracking': '3'}
>>> d2 = {'order': 1, 'product': 1, 'customer': '5'}
>>> dict3 = {k: [d1[k], d2[k]] if k in d1 and k in d2 else [d1[k]] if k in d1 else [d2[k]] for d in [d1, d2] for k in d}
>>> dict3
{'product': ['8', 1], 'order': ['8', 1], 'tracking': ['3'],'customer':['5']}
d1 = {'product': '8', 'order': '8', 'tracking': '3'}
d2 = {'order': 1, 'product': 1, 'customer': '5'}
list_ = []
for i in [d1, d2]:
    list_.append(i)
list_
[{'product': '8', 'order': '8', 'tracking': '3'},
{'order': 1, 'product': 1, 'customer': '5'}]
dict_={}
for d in list_:
    for k, v in d.items():
        dict_.setdefault(k, []).append(v)
dict_
{'product': ['8', 1], 'order': ['8', 1], 'tracking': ['3'], 'customer': ['5']}
Comprehension
combined_key = {key for d in list_ for key in d}
combined_key
{'customer', 'order', 'product', 'tracking'}
super_dict = {key:[d[key] for d in list_ if key in d] for key in combined_key}
super_dict
{'customer': ['5'], 'tracking': ['3'], 'order': ['8', 1], 'product': ['8', 1]}

Find biggest value in a python dict

I have a python dict like below:
{ '1': {'a': '0.6', 'b': '0.8', 'c': '2','d': '0.5'},
'2': {'a': '0.7', 'b': '0.9', 'c': '0.1','d': '0.2'},
'3': {'a': '0.5', 'b': '0.8', 'c': '3'},
}
How could I get the following result?
('2','a','0.7') ('2','b','0.9') ('3','c', '3') ('1','d', '0.5')
Well, here is the code for it (just 5 lines), where a is the dictionary from the question:
total = []
for i in ['a', 'b', 'c', 'd']:
    kv = max(a, key=lambda key: float(a[key][i]) if i in a[key] else -9.0)
    hv = a[kv][i]
    total.append((kv, i, hv))
print(total)
Output:
[('2', 'a', '0.7'), ('2', 'b', '0.9'), ('3', 'c', '3'), ('1', 'd', '0.5')]
-9.0 is just an arbitrary low number, used as a fallback for inner dictionaries that lack the key.
x={ '1': {'a': '0.6', 'b': '0.8', 'c': '2','d': '0.5'},
'2': {'a': '0.7', 'b': '0.9', 'c': '0.1','d': '0.2'},
'3': {'a': '0.5', 'b': '0.8', 'c': '3'},
}
d = {}
for i, j in x.items():
    for k, m in j.items():
        # the values are strings, so compare them as floats
        if k not in d or float(m) > float(d[k][0]):
            d[k] = (m, i)
print([(j[1], i, j[0]) for i, j in d.items()])
You can use an additional dict to do the job.
Output: [('2', 'a', '0.7'), ('2', 'b', '0.9'), ('3', 'c', '3'), ('1', 'd', '0.5')]
I agree the question is a bit vague. I recommend not using strings as values; use int or float in the dictionaries if you can. The question also does not specify whether this is Python 2.x or 3.x,
but I think you are after something like this:
import collections

def filter_dict(values):
    result = collections.Counter()
    for value in values.keys():
        for k, v in values[value].items():
            v = float(v)
            result[k] = v if v > result[k] else result[k]
    return result
This is how it behaves:
import unittest

class FilterDictTest(unittest.TestCase):
    def test_filter_dict(self):
        # Arrange
        actual = {
            '1': {'a': '0.6', 'b': '0.8', 'c': '2', 'd': '0.5'},
            '2': {'a': '0.7', 'b': '0.9', 'c': '0.1', 'd': '0.2'},
            '3': {'a': '0.5', 'b': '0.8', 'c': '3'}
        }
        expected = {
            'a': 0.7,
            'b': 0.9,
            'c': 3,
            'd': 0.5
        }
        # Act & Assert
        self.assertEqual(filter_dict(actual), expected)
A little late here.
#!/usr/bin/env python3.5
# entry
entry = {'1': {'a': '0.6', 'b': '0.8', 'c': '2','d': '0.5'}, '2': {'a': '0.7', 'b': '0.9', 'c': '0.1','d': '0.2'}, '3': {'a': '0.5', 'b': '0.8', 'c': '3'}}
# identify keys
all_categories = []
for number, dct in entry.items():
    all_categories = all_categories + list(dct.keys())
all_categories = set(all_categories)
# Get max values
max_values = {category: None for category in all_categories}
for category in all_categories:
    for number, dct in entry.items():
        if category in dct.keys():
            if max_values[category] is None:
                max_values[category] = (number, dct[category])
            elif float(max_values[category][1]) < float(dct[category]):
                max_values[category] = (number, dct[category])
output = [(number, category, value) for (category, (number, value)) in max_values.items()]
print(output)
Output:
[('2', 'a', '0.7'), ('1', 'd', '0.5'), ('2', 'b', '0.9'), ('3', 'c', '3')]
Not exactly in the order you expected them, but the values are correct. It's not the most elegant solution, though.
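I iterate over the dict a second time to compare values (d is the dictionary from the question):
values = []
for key in d:
    for skey in d[key]:
        best = None       # largest value seen so far for skey
        best_key = ''
        for outer in d:
            if skey in d[outer]:
                if best is None or float(d[outer][skey]) > float(best):
                    best = d[outer][skey]
                    best_key = outer
        if (best_key, skey, best) not in values:
            values.append((best_key, skey, best))
print(values)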
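One detail worth spelling out: the values in the question are strings, and strings compare character by character, so '10' sorts before '9'. Any solution therefore needs a float conversion before comparing. The following single-pass sketch does that while returning the original strings; the function name max_per_category is an assumption made for this example.

```python
def max_per_category(data):
    """Return (outer_key, category, raw_value) for the numeric maximum of each category."""
    best = {}  # category -> (outer_key, raw_value)
    for outer, inner in data.items():
        for category, raw in inner.items():
            # Compare as floats; keep the raw string for the output.
            if category not in best or float(raw) > float(best[category][1]):
                best[category] = (outer, raw)
    return [(outer, category, raw) for category, (outer, raw) in best.items()]


print('10' > '9')                 # False: lexicographic comparison
print(float('10') > float('9'))   # True: numeric comparison

data = {'1': {'a': '0.6', 'b': '0.8', 'c': '2', 'd': '0.5'},
        '2': {'a': '0.7', 'b': '0.9', 'c': '0.1', 'd': '0.2'},
        '3': {'a': '0.5', 'b': '0.8', 'c': '3'}}
result = max_per_category(data)
print(result)
# [('2', 'a', '0.7'), ('2', 'b', '0.9'), ('3', 'c', '3'), ('1', 'd', '0.5')]
```

On ties the first outer key encountered wins, because the comparison is strict.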
