python update nested dictionary with other nested dictionary [duplicate] - python

I want to use dd1 and dd2 to update dd3
dd1 = {'a': {'b': [{'x':1}]}}
dd2 = {'a': {'c': [{'x':2}]}}
dd3 = {'a': {'b': {}, 'c': {}}}
so I get dd3:
dd3 = {'a': {'b': [{'x':1}], 'c': [{'x':2}]}}
I know how to update a flat dictionary:
from collections import defaultdict
from itertools import chain
d1 = {'a': 1, 'b': 2, 'c': 3}
d2 = {'a': 2, 'b': 3, 'd': 4}
d3 = defaultdict(list)
for k, v in chain(d1.items(), d2.items()):
    d3[k].append(v)
but I am struggling to find a clear way to update a nested dictionary.

With recursive traversal (assuming that all source dicts have the same depth and that the final values to merge are lists):
dd1 = {'a': {'b': [{'x':1}]}}
dd2 = {'a': {'c': [{'x':2}]}}
dd3 = {'a': {'b': {}, 'c': {}}}
def update_nested_dict(target, d1, d2):
    for k, v in target.items():
        d1_v, d2_v = d1.get(k, []), d2.get(k, [])
        if isinstance(v, dict):
            if not v:
                # empty placeholder: fill it with the concatenated source lists
                target[k] = d1_v + d2_v
            else:
                update_nested_dict(v, d1_v, d2_v)
update_nested_dict(dd3, dd1, dd2)
print(dd3)
The output:
{'a': {'b': [{'x': 1}], 'c': [{'x': 2}]}}

You can use recursion:
dd1 = {'a': {'b': [{'x':1}]}}
dd2 = {'a': {'c': [{'x':2}]}}
dd3 = {'a': {'b': {}, 'c': {}}}
def update(d, *args):
    return {a: (lambda x: x[0] if not b else update(b, *x))([i[a] for i in args if a in i])
            if isinstance(b, dict) else b for a, b in d.items()}
print(update(dd3, dd1, dd2))
Output:
{'a': {'b': [{'x': 1}], 'c': [{'x': 2}]}}
This solution can handle multiple dictionaries of varying depths:
dd1 = {'a': {'b':{'d':{'e':[{'x':1}]}}}}
dd2 = {'a': {'c': [{'x':2}]}}
dd4 = {'a':{'j':{'k':[{'x':3}]}}}
dd3 = {'a': {'b': {'d':{'e':{}}}, 'c': {}, 'j':{'k':{}}}}
print(update(dd3, dd1, dd2, dd4))
Output:
{'a': {'b': {'d': {'e': [{'x': 1}]}}, 'c': [{'x': 2}], 'j': {'k': [{'x': 3}]}}}
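If the template dd3 is not strictly required, a plain recursive merge of the source dictionaries gives the same result. The following is only a minimal sketch: deep_merge is a hypothetical helper name, and it assumes that a later source should win when two sources disagree on a non-dict value.

```python
def deep_merge(*sources):
    """Recursively merge nested dicts; later sources win on conflicts."""
    merged = {}
    for source in sources:
        for key, value in source.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                # both sides are dicts: merge them one level down
                merged[key] = deep_merge(merged[key], value)
            else:
                # leaf values (and first-seen subtrees) are shared, not copied
                merged[key] = value
    return merged

dd1 = {'a': {'b': {'d': {'e': [{'x': 1}]}}}}
dd2 = {'a': {'c': [{'x': 2}]}}
dd4 = {'a': {'j': {'k': [{'x': 3}]}}}

result = deep_merge(dd1, dd2, dd4)
print(result)
```

Because leaves are shared by reference, mutating a list in the result also mutates it in the source; use copy.deepcopy on the result if that matters.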

Related

Find matching pairs of dictionaries in a list using Python

In a given list:
unmatched_items_array = [{'c': 45}, {'c': 35}, {'d': 5}, {'a': 3.2}, {'a': 3}]
Find all pairs of dictionaries that share the same key and print each pair; if a dictionary has no partner, print that dictionary on its own.
What I managed to write so far sort of works, but it keeps testing some items of the list even though they were already tested. Not sure how to fix it.
for i in range(len(unmatched_items_array)):
    for j in range(i + 1, len(unmatched_items_array)):
        # when keys are the same print matching dictionary pairs
        if unmatched_items_array[i].keys() == unmatched_items_array[j].keys():
            print(unmatched_items_array[i], unmatched_items_array[j])
            break
    # when no matching pairs print currently processed dictionary
    print(unmatched_items_array[i])
Output:
{'c': 45} {'c': 35}
{'c': 45}
{'c': 35}
{'d': 5}
{'a': 3.2} {'a': 3}
{'a': 3.2}
{'a': 3}
What the output should be:
{'c': 45} {'c': 35}
{'d': 5}
{'a': 3.2} {'a': 3}
What am I doing wrong here?
Using collections.defaultdict:
from collections import defaultdict
unmatched_items_array = [{'c': 45}, {'c': 35}, {'d': 5}, {'a': 3.2}, {'a': 3}]
result = defaultdict(list)
for i in unmatched_items_array:
    key = next(iter(i))  # first (here, only) key; i.items()[0] works only in Python 2
    result[key].append(i)  # group by key
for v in result.values():  # print the groups
    print(v)
Output (Python 3.7+, where dicts keep insertion order):
[{'c': 45}, {'c': 35}]
[{'d': 5}]
[{'a': 3.2}, {'a': 3}]
With itertools.groupby:
from itertools import groupby
unmatched_items_array = [{'d': 5}, {'c': 35}, {'a': 3}, {'a': 3.2}, {'c': 45}]
for v, g in groupby(sorted(unmatched_items_array, key=lambda k: tuple(k.keys())), lambda k: tuple(k.keys())):
    print([*g])
Prints:
[{'a': 3}, {'a': 3.2}]
[{'c': 35}, {'c': 45}]
[{'d': 5}]
EDIT: If your items in the list are sorted by keys already, then you can skip the sorted() call:
for v, g in groupby(unmatched_items_array, lambda k: tuple(k.keys())):
    print([*g])
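To get exactly the layout the question asks for (each group on one line, unmatched dictionaries alone), the grouped items can be unpacked into print. This is a sketch only: group_by_keys is a hypothetical helper, and it assumes Python 3.7+ so that groups come out in first-seen order.

```python
from collections import defaultdict

def group_by_keys(items):
    """Group dicts that have the same set of keys, in first-seen order."""
    groups = defaultdict(list)
    for item in items:
        # frozenset(item) is the set of the dict's keys, usable as a dict key
        groups[frozenset(item)].append(item)
    return list(groups.values())

unmatched_items_array = [{'c': 45}, {'c': 35}, {'d': 5}, {'a': 3.2}, {'a': 3}]

for group in group_by_keys(unmatched_items_array):
    print(*group)  # a pair prints on one line, a lone dict on its own
```

Each dictionary is visited once, so nothing is tested twice, which was the problem with the nested loops.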

Sorting list of dicts by value of a key (or default-value, if key is missing)

Imagine that you have to sort a list of dicts by the value of a particular key. Note that the key might be missing from some of the dicts, in which case its value defaults to 0.
sample input
input = [{'a': 1, 'b': 2}, {'a': 10, 'b': 3}, {'b': 5}]
sample output (sorted by value of key 'a')
[{'b': 5}, {'a': 1, 'b': 2}, {'a': 10, 'b': 3}]
note that {'b': 5} is first in the sort-order because it has the lowest value for 'a' (0)
I would've used input.sort(key=operator.itemgetter('a')), if all the dicts were guaranteed to have the key 'a'. Or I could convert the input dicts to collections.defaultdict and then sort.
Is there a way to do this in place without having to create new dicts or update the existing dicts? Can operator.itemgetter handle missing keys?
>>> items = [{'a': 1, 'b': 2}, {'a': 10, 'b': 3}, {'b': 5}]
>>> sorted(items, key=lambda d: d.get('a', 0))
[{'b': 5}, {'a': 1, 'b': 2}, {'a': 10, 'b': 3}]
Or, to sort the existing list in place:
items.sort(key=lambda d: d.get('a', 0))
Or, with an explicit membership test in sorted:
>>> items = [{'a': 1, 'b': 2}, {'a': 10, 'b': 3}, {'b': 5}]
>>> sorted(items,key=lambda x: x['a'] if 'a' in x else 0)
[{'b': 5}, {'a': 1, 'b': 2}, {'a': 10, 'b': 3}]
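As for the second part of the question: operator.itemgetter has no default-value option, so it raises KeyError on a dict that lacks the key. A short sketch of that behaviour next to the dict.get alternative (the variable names here are illustrative only):

```python
from operator import itemgetter

items = [{'a': 1, 'b': 2}, {'a': 10, 'b': 3}, {'b': 5}]

# itemgetter('a') fails on the dict that has no 'a'
try:
    sorted(items, key=itemgetter('a'))
    missing = None
except KeyError as exc:
    missing = exc.args[0]
print(missing)

# dict.get supplies the default and list.sort works in place
items.sort(key=lambda d: d.get('a', 0))
print(items)
```

Neither approach creates or modifies any of the dicts; list.sort only reorders the references held by the list.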

dict() in for loop - different behavior

I am trying to update values for a dict() key dynamically with a for loop.
def update_dict():
    f = []
    for i, j in enumerate(list_range):
        test_dict.update({'a': i})
        j['c'] = test_dict
        print(j)
        f.append(j)
    print(f)
test_dict = dict({'a': 1})
list_range = [{'b': i} for i in range(0, 5)]
update_dict()
Even though print(j) shows the iterating values (0, 1, 2, 3, 4), the last dict somehow overwrites every entry in the list, giving the wrong output (4, 4, 4, 4, 4).
Expected output:
[{'b': 0, 'c': {'a': 0}}, {'b': 1, 'c': {'a': 1}}, {'b': 2, 'c': {'a': 2}}, {'b': 3, 'c': {'a': 3}}, {'b': 4, 'c': {'a': 4}}]
Output obtained:
[{'b': 0, 'c': {'a': 4}}, {'b': 1, 'c': {'a': 4}}, {'b': 2, 'c': {'a': 4}}, {'b': 3, 'c': {'a': 4}}, {'b': 4, 'c': {'a': 4}}]
I need to understand how the dictionaries are getting overwritten and what the best way to avoid this is.
Thanks in advance!
P.S.: please avoid suggesting a list or dict comprehension as a bare answer; I am aware of them, and the only purpose of this question is to understand the unexpected behavior of dict().
The reason for this behaviour is that every entry in the list refers to the same dict. The line j['c'] = test_dict doesn't create a copy of the dictionary; it just makes j['c'] refer to test_dict. To get the expected result, change this line to:
j['c'] = test_dict.copy()
This makes a shallow copy of test_dict and assigns it to j['c']. A shallow copy is enough here because the values are plain integers; a dict with nested mutable values would need copy.deepcopy.
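The difference between sharing a reference and copying can be checked directly. The snippet below is a small illustration with made-up variable names, not part of the original code:

```python
import copy

shared = {'a': 0}

# three references to one dict: a later change is visible through all of them
aliased = [shared, shared, shared]
shared['a'] = 99
print([d['a'] for d in aliased])
print(aliased[0] is aliased[2])

# dict.copy() gives independent top-level dicts
snapshot = shared.copy()
shared['a'] = 1
print(snapshot['a'])

# ...but the copy is shallow: nested objects are still shared
nested = {'a': {'n': 1}}
shallow = nested.copy()
deep = copy.deepcopy(nested)
nested['a']['n'] = 2
print(shallow['a']['n'], deep['a']['n'])
```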
You update the same dictionary on every iteration, so each pass replaces the value that all the list entries see.
You need to create a new dictionary in every iteration so that each entry refers to its own dict:
def update_dict():
    f = []
    for i, j in enumerate(list_range):
        test_dict = {'a': i}
        j['c'] = test_dict
        f.append(j)
    print(f)
list_range = [{'b': i} for i in range(0, 5)]
update_dict()
# [{'b': 0, 'c': {'a': 0}},
# {'b': 1, 'c': {'a': 1}},
# {'b': 2, 'c': {'a': 2}},
# {'b': 3, 'c': {'a': 3}},
# {'b': 4, 'c': {'a': 4}}]
A simpler solution could be to iterate through list_range and create c using the values from b:
lista = [{'b': i } for i in range(0, 5)]
for i in lista:
    i['c'] = {'a': i['b']}
# [{'b': 0, 'c': {'a': 0}}, {'b': 1, 'c': {'a': 1}}, {'b': 2, 'c': {'a': 2}}, {'b': 3, 'c': {'a': 3}}, {'b': 4, 'c': {'a': 4}}]
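def update_dict():
    f = []
    for i, j in enumerate(list_range):
        j['c'] = {'a': i}  # a fresh dict on every iteration
        print(j)
        f.append(j)
    return f
list_range = [{'b': i} for i in range(0, 5)]
print(update_dict())
#output
{'b': 0, 'c': {'a': 0}}
{'b': 1, 'c': {'a': 1}}
{'b': 2, 'c': {'a': 2}}
{'b': 3, 'c': {'a': 3}}
{'b': 4, 'c': {'a': 4}}
[{'b': 0, 'c': {'a': 0}}, {'b': 1, 'c': {'a': 1}}, {'b': 2, 'c': {'a': 2}}, {'b': 3, 'c': {'a': 3}}, {'b': 4, 'c': {'a': 4}}]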

Update list value in list of dictionaries

I have a list of dictionaries (much like in JSON). I want to apply a function to a key in every dictionary of the list.
>> d = [{'a': 2, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}]
# Desired value
[{'a': 200, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}]
# If I do this, I can only get the changed key
>> map(lambda x: {k: v * 100 for k, v in x.iteritems() if k == 'a'}, d)
[{'a': 200}, {'a': 100}, {'a': 100}, {'a': 100}]
# I try to add the non-modified key-values but get an error
>> map(lambda x: {k: v * 100 for k, v in x.iteritems() if k == 'a' else k:v}, d)
SyntaxError: invalid syntax
File "<stdin>", line 1
map(lambda x: {k: v * 100 for k, v in x.iteritems() if k == 'a' else k:v}, d)
How can I achieve this?
EDIT: 'a' and 'b' are not the only keys. These were selected for demo purposes only.
Iterate through the list and update the desired dict item:
lst = [{'a': 2, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}]
for d in lst:
    d['a'] *= 100
A list comprehension avoids mutating the original: it creates a new list and n new dicts, which is useful if you don't want to change lst:
new_lst = [{**d, 'a': d['a']*100} for d in lst]
In Python 2.x we can't use {**d}, so I built custom_update on top of the update method:
def custom_update(d):
    new_dict = dict(d)
    new_dict.update({'a': d['a']*100})
    return new_dict
[custom_update(d) for d in lst]
If you want to update a different key for each item in the list:
keys = ['a', 'b', 'a', 'b']  # keys[0] corresponds to lst[0], keys[1] to lst[1], ...
for index, d in enumerate(lst):
    key = keys[index]
    d[key] *= 100
Using a list comprehension:
[{**d, keys[index]: d[keys[index]] * 100} for index, d in enumerate(lst)]
In Python 2.x the equivalent is:
def custom_update(d, key):
    new_dict = dict(d)
    new_dict.update({key: d[key]*100})
    return new_dict
[custom_update(d, keys[index]) for index, d in enumerate(lst)]
You can use your inline conditionals (ternaries) in a better location within a comprehension:
>>> d = [{'a': 2, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}]
>>> d2 = [{k: v * 100 if k == 'a' else v for k, v in i.items()} for i in d]
>>> d2
[{'a': 200, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}]
Your map() call is close to working, you just need to change the order of your dict comprehension, and turn else k:v into else v:
>>> d = [{'a': 2, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}]
>>> list(map(lambda x: {k: v * 100 if k == 'a' else v for k, v in x.items()}, d))
[{'a': 200, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}]
If you are using a function, you may want to provide a target key and corresponding value:
d = [{'a': 2, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}, {'a': 1, 'b': 2}]
f = lambda new_val, d1, key='a': {a:b*new_val if a == key else b for a, b in d1.items()}
new_d = list(map(lambda x:f(100, x), d))
Output:
[{'a': 200, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}]
After your edit of "'a' and 'b' are not the only keys. These were selected for demo purposes only", here is a very simple map call that alters only the value of 'a' and leaves the rest as is:
# Updates each dict in d in place. In Python 3 map is lazy, so list() is needed to make it run; the resulting list of None values is discarded.
list(map(lambda x: x.update({'a': x['a']*100}), d))
My original answer was:
I think the simplest and most appropriate way to do this is to iterate over d, using the fact that each item in d is a dictionary with the keys 'a' and 'b':
res = [{'a':e['a']*100, 'b':e['b']} for e in d]
Result:
[{'a': 200, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}, {'a': 100, 'b': 2}]
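The approaches above can be wrapped in one small function that applies any callable to one key and copies every other key unchanged. This is a sketch under assumptions: apply_to_key is a hypothetical name, it needs Python 3.5+ for the {**d} syntax, and it chooses to leave dicts that lack the key untouched rather than raise.

```python
def apply_to_key(dicts, key, func):
    """Return new dicts with func applied to dicts[i][key] where present."""
    return [
        {**d, key: func(d[key])} if key in d else dict(d)
        for d in dicts
    ]

d = [{'a': 2, 'b': 2}, {'a': 1, 'b': 2, 'z': 9}, {'b': 5}]
new_d = apply_to_key(d, 'a', lambda v: v * 100)
print(new_d)
print(d)  # the input list and its dicts are not modified
```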

Dictionary multiple in Python

I have a multidictionary:
{'a': {'b': {'c': {'d': '2'}}},
'b': {'b': {'c': {'d': '7'}}},
'c': {'b': {'c': {'d': '3'}}},
'f': {'d': {'c': {'d': '1'}}}}
How can I sort it based on the values '2' '3' '7' '1'
so my output will be:
f.d.c.d.1
a.b.c.d.2
c.b.c.d.3
b.b.c.d.7
You've got a fixed-shape structure, which is pretty simple to sort:
>>> d = {'a': {'b': {'c': {'d': '2'}}}, 'c': {'b': {'c': {'d': '3'}}}, 'b': {'b': {'c': {'d': '7'}}}, 'f': {'d': {'c': {'d': '1'}}}}
>>> sorted(d, key=lambda x: d[x].values()[0].values()[0].values()[0])
['f', 'a', 'c', 'b']
>>> sorted(d.items(), key=lambda x: x[1].values()[0].values()[0].values()[0])
[('f', {'d': {'c': {'d': '1'}}}),
('a', {'b': {'c': {'d': '2'}}}),
('c', {'b': {'c': {'d': '3'}}}),
('b', {'b': {'c': {'d': '7'}}})]
Yes, this is a bit ugly and clumsy, but only because your structure is inherently ugly and clumsy.
In fact, if d['f'] had the key 'b' instead of 'd', like the other entries, it would be even more straightforward. I suspect that may be a typo, in which case things are even easier:
>>> d = {'a': {'b': {'c': {'d': '2'}}}, 'c': {'b': {'c': {'d': '3'}}}, 'b': {'b': {'c': {'d': '7'}}}, 'f': {'b': {'c': {'d': '1'}}}}
>>> sorted(d.items(), key=lambda x:x[1]['b']['c']['d'])
[('f', {'b': {'c': {'d': '1'}}}),
('a', {'b': {'c': {'d': '2'}}}),
('c', {'b': {'c': {'d': '3'}}}),
('b', {'b': {'c': {'d': '7'}}})]
As others have pointed out, this is almost certainly not the right data structure for whatever it is you're trying to do. But, if it is, this is how to deal with it.
PS, it's confusing to call this a "multidictionary". That term usually means "dictionary with potentially multiple values per key" (a concept which in Python you'd probably implement as a defaultdict with list or set as its default). A single, single-valued dictionary that happens to contain dictionaries is better named a "nested dictionary".
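Note that the .values()[0] chains in this answer rely on Python 2, where dict.values() returns a list; in Python 3 it returns a view that cannot be indexed. A possible Python 3 equivalent is sketched below, where leaf is a hypothetical helper that follows the first value at each level until it reaches a non-dict:

```python
def leaf(node):
    """Follow the first value of each nested dict down to the innermost value."""
    while isinstance(node, dict):
        node = next(iter(node.values()))
    return node

d = {'a': {'b': {'c': {'d': '2'}}}, 'c': {'b': {'c': {'d': '3'}}},
     'b': {'b': {'c': {'d': '7'}}}, 'f': {'d': {'c': {'d': '1'}}}}

ordered_keys = sorted(d, key=lambda k: leaf(d[k]))
print(ordered_keys)

ordered_items = sorted(d.items(), key=lambda kv: leaf(kv[1]))
print(ordered_items)
```

The leaves here are strings, so they are compared as text; wrap the key in int(...) if the values can have more than one digit.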
In my opinion this kind of design is very hard to read and maintain. Can you consider replacing the internal dictionaries with string-names?
E.g.:
mydict = {
    'a.b.c.d': 2,
    'b.b.c.d': 7,
    'c.b.c.d': 3,
    'f.d.c.d': 1,
}
This one is much easier to sort and far more readable.
A dictionary is not something you sort in place, so you have to sort, for example, a list representation of it:
my_sorted_dict_as_list = sorted(mydict.items(),
                                key=lambda kv_pair: kv_pair[1])
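If the nested form already exists, it can be converted to this flat form mechanically. The following is a sketch with a hypothetical flatten helper; it assumes all keys are strings and joins them with dots:

```python
def flatten(nested, prefix=''):
    """Turn {'a': {'b': 1}} into {'a.b': 1}."""
    flat = {}
    for key, value in nested.items():
        path = f'{prefix}.{key}' if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

multidict = {'a': {'b': {'c': {'d': '2'}}},
             'b': {'b': {'c': {'d': '7'}}},
             'c': {'b': {'c': {'d': '3'}}},
             'f': {'d': {'c': {'d': '1'}}}}

mydict = flatten(multidict)
# sort by the numeric value of each leaf, then format as "path.value"
lines = [f'{path}.{value}'
         for path, value in sorted(mydict.items(), key=lambda kv: int(kv[1]))]
print('\n'.join(lines))
```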
You can do it recursively:
d = {'a': {'b': {'c': {'d': '2'}}}, 'c': {'b': {'c': {'d': '3'}}}, 'b': {'b': {'c': {'d': '7'}}}, 'f': {'d': {'c': {'d': '1'}}}}
def nested_to_string(item):
    if hasattr(item, 'items'):
        out = ''
        for key in item.keys():
            out += '%s.' % key + nested_to_string(item[key])
        return out
    else:
        return item + '\n'
print(nested_to_string(d))
or
def nested_to_string(item):
    def rec_fun(item, temp, res):
        if hasattr(item, 'items'):
            for key in item.keys():
                # pass the extended prefix down instead of mutating temp
                rec_fun(item[key], temp + '%s.' % key, res)
        else:
            res.append(temp + item)
    res = []
    rec_fun(item, '', res)  # use the argument, not the global d
    return res
Why do you want to do this?
Your data structure is basically a multi-level tree, so a good way to do what you want is a depth-first traversal of it, which can be done recursively, followed by massaging the intermediate results a bit to sort them and put them into the desired format.
multidict = {'a': {'b': {'c': {'d': '2'}}},
             'b': {'b': {'c': {'d': '7'}}},
             'c': {'b': {'c': {'d': '3'}}},
             'f': {'d': {'c': {'d': '1'}}}}
def nested_dict_to_string(nested_dict):
    chains = []
    for key, value in nested_dict.items():
        chains.append([key] + visit(value))
    # sort the chains by their last element (the leaf value), then join with dots
    chains = ['.'.join(chain) for chain in sorted(chains, key=lambda chain: chain[-1])]
    return '\n'.join(chains)
def visit(node):
    result = []
    try:
        for key, value in node.items():
            result += [key] + visit(value)
    except AttributeError:
        # not a dict: this is a leaf value
        result = [node]
    return result
print(nested_dict_to_string(multidict))
Output:
f.d.c.d.1
a.b.c.d.2
c.b.c.d.3
b.b.c.d.7
