Find matching pairs of dictionaries in a list using Python

Find matching pairs of dictionaries in a list using Python - python

In a given list:
unmatched_items_array = [{'c': 45}, {'c': 35}, {'d': 5}, {'a': 3.2}, {'a': 3}]
Find all 'key' pairs and print out and if no pairs found for given dictionary print out that dictionary.
What I managed to write so far sort of works but it keeps testing some items of the list even though they were already tested. Not sure how to fix it.
for i in range(len(unmatched_items_array)):
for j in range(i + 1, len(unmatched_items_array)):
# when keys are the same print matching dictionary pairs
if unmatched_items_array[i].keys() == unmatched_items_array[j].keys():
print(unmatched_items_array[i], unmatched_items_array[j])
break
# when no matching pairs print currently processed dictionary
print(unmatched_items_array[i])
Output:
{'c': 45} {'c': 35}
{'c': 45}
{'c': 35}
{'d': 5}
{'a': 3.2} {'a': 3}
{'a': 3.2}
{'a': 3}
What the output should be:
{'c': 45} {'c': 35}
{'d': 5}
{'a': 3.2} {'a': 3}
What am I doing wrong here?

Using collections.defaultdict
Ex:
from collections import defaultdict
unmatched_items_array = [{'c': 45}, {'c': 35}, {'d': 5}, {'a': 3.2}, {'a': 3}]
result = defaultdict(list)
for i in unmatched_items_array:
key, _ = i.items()[0]
result[key].append(i) #Group by key.
for _, v in result.items(): #print Result.
print(v)
Output:
[{'a': 3.2}, {'a': 3}]
[{'c': 45}, {'c': 35}]
[{'d': 5}]

With itertools.groupby:
from itertools import groupby
unmatched_items_array = [{'d': 5}, {'c': 35}, {'a': 3}, {'a': 3.2}, {'c': 45}]
for v, g in groupby(sorted(unmatched_items_array, key=lambda k: tuple(k.keys())), lambda k: tuple(k.keys())):
print([*g])
Prints:
[{'a': 3}, {'a': 3.2}]
[{'c': 35}, {'c': 45}]
[{'d': 5}]
EDIT: If your items in the list are sorted by keys already, then you can skip the sorted() call:
for v, g in groupby(unmatched_items_array, lambda k: tuple(k.keys()) ):
print([*g])

Related

python update nested dictionary with other nested dictionary [duplicate]

This question already has answers here:
How to update values in nested dictionary if keys are in a list? [duplicate]
(5 answers)
Closed 3 years ago.
I want to use dd1 and dd2 to update dd3
dd1 = {'a': {'b': [{'x':1}]}}
dd2 = {'a': {'c': [{'x':2}]}}
dd3 = {'a': {'b': {}, 'c': {}}}
so I get dd3:
dd3 = {'a': {'b': [{'x':1}], 'c': [{'x':2}]}}
I know how to update flat dictionary
d1 = {'a': 1, 'b': 2, 'c': 3}
d2 = {'a': 2, 'b': 3, 'd': 4}
d3 = defaultdict(list)
for k, v in chain(d1.items(), d2.items()):
d3[k].append(v)
but struggle to find clear way to update nested dictionary:

With recursive traversal (assuming that all source dicts have same depth levels and lists as final expected update values):
dd1 = {'a': {'b': [{'x':1}]}}
dd2 = {'a': {'c': [{'x':2}]}}
dd3 = {'a': {'b': {}, 'c': {}}}
def update_nested_dict(target, d1, d2):
for k,v in target.items():
d1_v, d2_v = d1.get(k, []), d2.get(k, [])
if isinstance(v, dict):
if not v:
target[k] = d1_v + d2_v
else:
update_nested_dict(v, d1_v, d2_v)
update_nested_dict(dd3, dd1, dd2)
print(dd3)
The output:
{'a': {'b': [{'x': 1}], 'c': [{'x': 2}]}}

You can use recursion:
dd1 = {'a': {'b': [{'x':1}]}}
dd2 = {'a': {'c': [{'x':2}]}}
dd3 = {'a': {'b': {}, 'c': {}}}
def update(d, *args):
return {a:(lambda x:x[0] if not b else update(b, *x))([i[a] for i in args if a in i])\
if isinstance(b, dict) else b for a, b in d.items()}
print(update(dd3, dd1, dd2))
Output:
{'a': {'b': [{'x': 1}], 'c': [{'x': 2}]}}
This solution can handle multiple dictionaries of varying depths:
dd1 = {'a': {'b':{'d':{'e':[{'x':1}]}}}}
dd2 = {'a': {'c': [{'x':2}]}}
dd4 = {'a':{'j':{'k':[{'x':3}]}}}
dd3 = {'a': {'b': {'d':{'e':{}}}, 'c': {}, 'j':{'k':{}}}}
print(update(dd3, dd1, dd2, dd4))
Output:
{'a': {'b': {'d': {'e': [{'x': 1}]}}, 'c': [{'x': 2}], 'j': {'k': [{'x': 3}]}}}

Sorting list of dicts by value of a key (or default-value, if key is missing)

Imagine that you have to sort a list of dicts, by the value of a particular key. Note that the key might be missing from some of the dicts, in which case you default to the value of that key to being 0.
sample input
input = [{'a': 1, 'b': 2}, {'a': 10, 'b': 3}, {'b': 5}]
sample output (sorted by value of key 'a')
[{'b': 5}, {'a': 1, 'b': 2}, {'a': 10, 'b': 3}]
note that {'b': 5} is first in the sort-order because it has the lowest value for 'a' (0)
I would've used input.sort(key=operator.itemgetter('a')), if all the dicts were guaranteed to have the key 'a'. Or I could convert the input dicts to collections.defaultdict and then sort.
Is there a way to do this in-place without having to creating new dicts or updating the existing dicts? Can operator.itemgetter handle missing keys?

>>> items = [{'a': 1, 'b': 2}, {'a': 10, 'b': 3}, {'b': 5}]
>>> sorted(items, key=lambda d: d.get('a', 0))
[{'b': 5}, {'a': 1, 'b': 2}, {'a': 10, 'b': 3}]
Or to update the existing dictionary in-place
items.sort(key=lambda d: d.get('a', 0))

Or if in sorted:
>>> items = [{'a': 1, 'b': 2}, {'a': 10, 'b': 3}, {'b': 5}]
>>> sorted(items,key=lambda x: x['a'] if 'a' in x else 0)
[{'b': 5}, {'a': 1, 'b': 2}, {'a': 10, 'b': 3}]
>>>

dict() in for loop - different behavior

I am trying to update values for a dict() key dynamically with a for loop.
def update_dict():
f = []
for i, j in enumerate(list_range):
test_dict.update({'a': i})
j['c'] = test_dict
print(j)
f.append(j)
print(f)
test_dict = dict({'a': 1})
list_range = [{'b': i} for i in range(0, 5)]
update_dict()
Even print(j) gives iterating value (0,1,2,3,4), somehow the last dict is getting overwritten all over the list and giving wrong output (4,4,4,4,4).
Expected Output,
[{'b': 0, 'c': {'a': 0}}, {'b': 1, 'c': {'a': 1}}, {'b': 2, 'c': {'a': 2}}, {'b': 3, 'c': {'a': 3}}, {'b': 4, 'c': {'a': 4}}]
Output obtained,
[{'b': 0, 'c': {'a': 4}}, {'b': 1, 'c': {'a': 4}}, {'b': 2, 'c': {'a': 4}}, {'b': 3, 'c': {'a': 4}}, {'b': 4, 'c': {'a': 4}}]
I need to understand how the dictionaries are getting overwritten and what could be the best solution to avoid this?
Thanks in advance!
P.S. : please avoid suggesting list or dict comprehension method as bare answer as i am aware of them and the only purpose of this question is to understand the wrong behavior of dict().

The reason of such behaviour is that all references in list points to the same dict. Line j['c'] = test_dict doesn't create copy of dictionary, but just make j['c'] refer to test_dict. To get expected result you need change this line to:
j['c'] = test_dict.copy(). It will make deep copy of test_dict and assign it to j['c'].

You try to add values to same dictionary every time in the loop and as loop progresses, you keep replacing the values.
You need to define dictionary in every iteration to create separate references of the dictionary:
def update_dict():
f = []
for i, j in enumerate(list_range):
test_dict = {'a': i}
j['c'] = test_dict
f.append(j)
print(f)
list_range = [{'b': i} for i in range(0, 5)]
update_dict()
# [{'b': 0, 'c': {'a': 0}},
# {'b': 1, 'c': {'a': 1}},
# {'b': 2, 'c': {'a': 2}},
# {'b': 3, 'c': {'a': 3}},
# {'b': 4, 'c': {'a': 4}}]

A simpler solution could be to iterate through list_range and create c using the values from b
lista = [{'b': i } for i in range(0, 5)]
for i in lista:
i['c'] = {'a': i['b']}
# [{'b': 0, 'c': {'a': 0}}, {'b': 1, 'c': {'a': 1}}, {'b': 2, 'c': {'a': 2}}, {'b': 3, 'c': {'a': 3}}, {'b': 4, 'c': {'a': 4}}]

def update_dict():
f = []
for i, j in enumerate(list_range):
j['c'] = {'a': i}
print(j)
f.append(j)
return f
list_range = [{'b': i} for i in range(0, 5)]
print(update_dict())
#output
{'b': 0, 'c': {'a': 0}}
{'b': 1, 'c': {'a': 1}}
{'b': 2, 'c': {'a': 2}}
{'b': 3, 'c': {'a': 3}}
{'b': 4, 'c': {'a': 4}}

Division of nested dictionary

I have two dictionaries. one is a nested dictionary and another one is general dictionary. I want to do some divisions:
dict1 = {'document1': {'a': 3, 'b': 1, 'c': 5}, 'document2': {'d': 2, 'e': 4}}
dict2 = {'document1': 28, 'document2': 36}
I want to use the inner dictionary values form dict1 to divided by the value of matching document in dict2. The expect output would be:
enter code here
dict3 = {'document1': {'a': 3/28, 'b': 1/28, 'c': 5/28}, 'document2': {'d': 2/36, 'e': 4/36}}
I tried using two for loop to run each dictionary, but values will be duplicate multiple times and I have no idea how to fix this? Does anyone has idea of how to achieve this goal? I would be appreciate it!``

You can achieve this using dictionary comprehension.
dict3 = {} # create a new dictionary
# iterate dict1 keys, to get value from dict2, which will be used to divide dict 1 values
for d in dict1:
y = dict2[d]
dict3[d] = {k:(v/y) for k, v in dict1[d].items() }

You can try the following code
dict1 = {'document1': {'a': 3, 'b': 1, 'c': 5},
'document2': {'d': 2, 'e': 4}}
dict2 = {'document1': 28, 'document2': 36}
for k,v in dict1.items():
for ki,vi in v.items():
dict1[k][ki] /= dict2[k]
print(dict1)
# output
#{'document1': {'a': 0.10714285714285714, 'b': 0.03571428571428571, 'c': 0.17857142857142858},
#'document2': {'d': 0.05555555555555555, 'e': 0.1111111111111111}}

In one line, using nested dictionary comprehensions:
dict3 = {doc_key: {k: (v/doc_value) for k, v in dict1[doc_key].items()} for doc_key, doc_value in dict2.items()}

How to combine a list of dictionaries to one dictionary

I have a list of dicts:
d =[{'a': 4}, {'b': 20}, {'c': 5}, {'d': 3}]
I want to remove the curly braces and convert d to a single dict which looks like:
d ={'a': 4, 'b': 20, 'c': 5, 'd': 3}

If you don't mind duplicate keys replacing earlier keys you can use:
from functools import reduce # Python 3 compatibility
d = reduce(lambda a, b: dict(a, **b), d)
This merges the first two dictionaries then merges each following dictionary into the result built so far.
Demo:
>>> d =[{'a': 4}, {'b': 20}, {'c': 5}, {'d': 3}]
>>> reduce(lambda a, b: dict(a, **b), d)
{'a': 4, 'c': 5, 'b': 20, 'd': 3}
Or if you need this to work for arbitrary (non string) keys (and you are using Python 3.5 or greater):
>>> d =[{4: 4}, {20: 20}, {5: 5}, {3: 3}]
>>> reduce(lambda a, b: dict(a, **b), d) # This wont work
TypeError: keywords must be strings
>>> reduce(lambda a, b: {**a, **b}, d) # Use this instead
{4: 4, 20: 20, 5: 5, 3: 3}
The first solution hacks the behaviour of keyword arguments to the dict function. The second solution is using the more general ** operator introduced in Python 3.5.

You just need to iterate over d and append (update()) the element to a new dict e.g. newD.
d =[{'a': 4}, {'b': 20}, {'c': 5}, {'d': 3}]
newD = {}
for entry in d:
newD.update(entry)
>>> newD
{'c': 5, 'b': 20, 'a': 4, 'd': 3}
Note: If there are duplicate values in d the last one will be appear in newD.

Overwriting the values of existing keys, a brutal and inexperienced solution is
nd = {}
for el in d:
for k,v in el.items():
nd[k] = v
or, written as a dictionary comprehension:
d = {k:v for el in d for k,v in el.items()}

a = [{'a': 4}, {'b': 20}, {'c': 5}, {'d': 3}]
b = {}
[b.update(c) for c in a]
b = {'a': 4, 'b': 20, 'c': 5, 'd': 3}
if order is important:
from collections import OrderedDict
a = [{'a': 4}, {'b': 20}, {'c': 5}, {'d': 3}]
newD = OrderedDict()
[newD.update(c) for c in a]
out = dict(newD)

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Find matching pairs of dictionaries in a list using Python - python

Related

python update nested dictionary with other nested dictionary [duplicate]

Sorting list of dicts by value of a key (or default-value, if key is missing)

dict() in for loop - different behavior

Division of nested dictionary

How to combine a list of dictionaries to one dictionary

Categories

Resources