How to merge an arbitrary amount of dictionaries - python

Given any amount of dictionaries, how would one go about merging them all together, such that the merged dictionary contains all the dictionaries' elements as well as summing similar key values.
eg.
d1 = {a: 2, b: 3, c: 1}
d2 = {a: 3, b: 2, c: 3}
d3 = {b: 8, d: 2}
our merged dictionary would look like such:
{a: 5, b: 13, c: 4, d: 2}
Can this be done via kwargs? I am aware that one can do:
{**d1, **d2, **d3}
But can this be done for n-defined dictionaries?

you can use a Counter
from collections import Counter
d1 = {'a': 2, 'b': 3, 'c': 1}
d2 = {'a': 3, 'b': 2, 'c': 3}
d3 = {'b': 8, 'd': 2}
list_of_dicts = [d1, d2, d3]
cnt = Counter()
for d in list_of_dicts:
cnt.update(d)
print(cnt)
Counter({'b': 13, 'a': 5, 'c': 4, 'd': 2})

Per your comment regarding defaultdict, here is an approach along those lines. That said, I prefer the Counter approach in the answer from #Raphael.
from collections import defaultdict
def mergesum(*dicts):
merged = defaultdict(int)
for k, v in (item for d in dicts for item in d.items()):
merged[k] += v
return merged
d1 = {'a': 2, 'b': 3, 'c': 1}
d2 = {'a': 3, 'b': 2, 'c': 3}
d3 = {'b': 8, 'd': 2}
result = mergesum(d1, d2, d3)
print(result)
# defaultdict(<class 'int'>, {'a': 5, 'b': 13, 'c': 4, 'd': 2})

Related

Sum Values in List of Dictionaries with some elements as list

I have this list of dictionaries:
list_of_dicts = [{'A':1,'B':2,'C':3,'D':4,'E':5}, {'A':1,'B':1,'C':1,'D':1,'E':1}, {'A':2,'B':2,'C':2,'D':2,'E':2}]
To sum up values, I can use counter like this:
from collections import Counter
import functools, operator
# sum the values with same keys
counter = Counter()
for d in list_of_dicts:
counter.update(d)
result = dict(counter)
result
{'A': 4, 'B': 5, 'C': 6, 'D': 7, 'E': 8}
But how to achieve summation if some key in the dictionary has value as list:
list_of_dicts = [{'A':1,'B':2,'C':3,'D':4,'E':[1,2,3]}, {'A':1,'B':1,'C':1,'D':1,'E':[1,2,3]}, {'A':2,'B':2,'C':2,'D':2,'E':[1,2,3]}]
I want to get this result:
{'A': 4, 'B': 5, 'C': 6, 'D': 7, 'E':[3,6,9]}
If you can not use numpy you can try this:
(using collections.defaultdict)
from collections import defaultdict
list_of_dicts = [{'A':1,'B':2,'C':3,'D':4,'E':[1,2,3]},
{'A':1,'B':1,'C':1,'D':1,'E':[1,2,3]},
{'A':2,'B':2,'C':2,'D':2,'E':[1,2,3]}]
dct = defaultdict(list)
for l in list_of_dicts:
for k,v in l.items():
dct[k].append(v)
for k,v in dct.items():
if isinstance(v[0],list):
dct[k] = [sum(x) for x in zip(*v)]
else:
dct[k] = sum(v)
Output:
>>> dct
defaultdict(list, {'A': 4, 'B': 5, 'C': 6, 'D': 7, 'E': [3, 6, 9]})
If you can use numpy you can try this:
import numpy as np
dct = defaultdict(list)
for l in list_of_dicts:
for k,v in l.items():
dct[k].append(v)
for k,v in dct.items():
dct[k] = (np.array(v).sum(axis=0))
Output:
>>> dct
defaultdict(list, {'A': 4, 'B': 5, 'C': 6, 'D': 7, 'E': array([3, 6, 9])})

Python combine values of identical dictionaries without using looping

I have list of identical dictionaries:
my_list = [{'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'a': 7, 'b': 8, 'c': 9}]
I need to get something like this:
a = [1, 4, 7]
b = [2, 5, 8]
c = [3, 6, 9]
I know how to do in using for .. in .., but is there way to do it without looping?
If i do
a, b, c = zip(*my_list)
i`m getting
a = ('a', 'a', 'a')
b = ('b', 'b', 'b')
c = ('c', 'c', 'c')
Any solution?
You need to extract all the values in my_list.You could try:
my_list = [{'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'a': 7, 'b': 8, 'c': 9}]
a, b, c = zip(*map(lambda d: d.values(), my_list))
print(a, b, c)
# (1, 4, 7) (2, 5, 8) (3, 6, 9)
Pointed out by #Alexandre,This work only when the dict is ordered.If you couldn't make sure the order, consider the answer of yatu.
You will have to loop to obtain the values from the inner dictionaries. Probably the most appropriate structure would be to have a dictionary, mapping the actual letter and a list of values. Assigning to different variables is usually not the best idea, as it will only work with the fixed amount of variables.
You can iterate over the inner dictionaries, and append to a defaultdict as:
from collections import defaultdict
out = defaultdict(list)
for d in my_list:
for k,v in d.items():
out[k].append(v)
print(out)
#defaultdict(list, {'a': [1, 4, 7], 'b': [2, 5, 8], 'c': [3, 6, 9]})
Pandas DataFrame has just a factory method for this, so if you already have it as a dependency or if the input data is large enough:
import pandas as pd
my_list = ...
df = pd.DataFrame.from_rows(my_list)
a = list(df['a']) # df['a'] is a pandas Series, essentially a wrapped C array
b = list(df['b'])
c = list(df['c'])
Please find the code below. I believe that the version with a loop is much easier to read.
my_list = [{'a': 1, 'b': 2, 'c': 3}, {'a': 4, 'b': 5, 'c': 6}, {'a': 7, 'b': 8, 'c': 9}]
# we assume that all dictionaries have the sames keys
a, b, c = map(list, map(lambda k: map(lambda d: d[k], my_list), my_list[0]))
print(a,b,c)

Merge several dictionaries creating array on different values

So I have a list with several dictionaries, they all have the same keys. Some dictionaries are the same but one value is different. How could I merge them into 1 dictionary having that different values as array?
Let me give you an example:
let's say I have this dictionaries
[{'a':1, 'b':2,'c':3},{'a':1, 'b':2,'c':4},{'a':1, 'b':3,'c':3},{'a':1, 'b':3,'c':4}]
My desired output would be this:
[{'a':1, 'b':2,'c':[3,4]},{'a':1, 'b':3,'c':[3,4]}]
I've tried using for and if nested, but it's too expensive and nasty, and I'm sure there must be a better way. Could you give me a hand?
How could I do that for any kind of dictionary assuming that the amount of keys is the same on the dictionaries and knowing the name of the key to be merged as array (c in this case)
thanks!
Use a collections.defaultdict to group the c values by a and b tuple keys:
from collections import defaultdict
lst = [
{"a": 1, "b": 2, "c": 3},
{"a": 1, "b": 2, "c": 4},
{"a": 1, "b": 3, "c": 3},
{"a": 1, "b": 3, "c": 4},
]
d = defaultdict(list)
for x in lst:
d[x["a"], x["b"]].append(x["c"])
result = [{"a": a, "b": b, "c": c} for (a, b), c in d.items()]
print(result)
Could also use itertools.groupby if lst is already ordered by a and b:
from itertools import groupby
from operator import itemgetter
lst = [
{"a": 1, "b": 2, "c": 3},
{"a": 1, "b": 2, "c": 4},
{"a": 1, "b": 3, "c": 3},
{"a": 1, "b": 3, "c": 4},
]
result = [
{"a": a, "b": b, "c": [x["c"] for x in g]}
for (a, b), g in groupby(lst, key=itemgetter("a", "b"))
]
print(result)
Or if lst is not ordered by a and b, we can sort by those two keys as well:
result = [
{"a": a, "b": b, "c": [x["c"] for x in g]}
for (a, b), g in groupby(
sorted(lst, key=itemgetter("a", "b")), key=itemgetter("a", "b")
)
]
print(result)
Output:
[{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
Update
For a more generic solution for any amount of keys:
def merge_lst_dicts(lst, keys, merge_key):
groups = defaultdict(list)
for item in lst:
key = tuple(item.get(k) for k in keys)
groups[key].append(item.get(merge_key))
return [
{**dict(zip(keys, group_key)), **{merge_key: merged_values}}
for group_key, merged_values in groups.items()
]
print(merge_lst_dicts(lst, ["a", "b"], "c"))
# [{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
You could use a temp dict to solve this problem -
>>>python3
Python 3.6.9 (default, Nov 7 2019, 10:44:02)
>>> di=[{'a':1, 'b':2,'c':3},{'a':1, 'b':2,'c':4},{'a':1, 'b':3,'c':3},{'a':1, 'b':3,'c':4}]
>>> from collections import defaultdict as dd
>>> dt=dd(list) #default dict of list
>>> for d in di: #create temp dict with 'a','b' as tuple and append 'c'
... dt[d['a'],d['b']].append(d['c'])
>>> for k,v in dt.items(): #Create final output from temp
... ol.append({'a':k[0],'b':k[1], 'c':v})
...
>>> ol #output
[{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
If the number of keys in input dict is large, the process to extract
tuple for temp_dict can be automated -
if the keys the define condition for merging are known than it can be simply a constant tuple eg.
keys=('a','b') #in this case, merging happens over these keys
If this is not known at until runtime, then we can get these keys using zip function and set difference, eg.
>>> di
[{'a': 1, 'b': 2, 'c': 3}, {'a': 1, 'b': 2, 'c': 4}, {'a': 1, 'b': 3, 'c': 3}, {'a': 1, 'b': 3, 'c': 4}]
>>> key_to_ignore_for_merge='c'
>>> keys=tuple(set(list(zip(*zip(*di)))[0])-set(key_to_ignore_for_merge))
>>> keys
('a', 'b')
At this point, we can use map to extract tuple for keys only-
>>> dt=dd(list)
>>> for d in di:
... dt[tuple(map(d.get,keys))].append(d[key_to_ignore_for_merge])
>>> dt
defaultdict(<class 'list'>, {(1, 2): [3, 4], (1, 3): [3, 4]})
Now, to recreate the dictionary from default_dict and keys will require some zip magic again!
>>> for k,v in dt.items():
... dtt=dict(tuple(zip(keys, k)))
... dtt[key_to_ignore_for_merge]=v
... ol.append(dtt)
...
>>> ol
[{'a': 1, 'b': 2, 'c': [3, 4]}, {'a': 1, 'b': 3, 'c': [3, 4]}]
This solution assumes that you only know the keys that can be different (eg. 'c') and rest is all runtime.

python multiply two collection counters

Python collection counter
Curious if there is a better way to do this. Overriding a Counter class method?
The built-in multiply produces the dot product of two counters
from collections import Counter
a = Counter({'b': 4, 'c': 2, 'a': 1})
b = Counter({'b': 8, 'c': 4, 'a': 2})
newcounter = Counter()
for x in a.elements():
for y in b.elements():
if x == y:
newcounter[x] = a[x]*b[y]
$ newcounter
Counter({'b': 32, 'c': 8, 'a': 2})
Assuming a and b always have the same keys, you can achieve this with a dictionary comprehension as follows:
a = Counter({'b': 4, 'c': 2, 'a': 1})
b = Counter({'b': 8, 'c': 4, 'a': 2})
c = Counter({k:a[k]*b[k] for k in a})
print(c)
Output
Counter({'b': 32, 'c': 8, 'a': 2})
You can get the intersection of the keys if you don't have identical dicts:
from collections import Counter
a = Counter({'b': 4, 'c': 2, 'a': 1, "d":4})
b = Counter({'b': 8, 'c': 4, 'a': 2})
# just .keys() for python3
print Counter(({k: a[k] * b[k] for k in a.viewkeys() & b}))
Counter({'b': 32, 'c': 8, 'a': 2})
Or if you want to join both you can or the dicts and use dict.get:
from collections import Counter
a = Counter({'b': 4, 'c': 2, 'a': 1, "d":4})
b = Counter({'b': 8, 'c': 4, 'a': 2})
print Counter({k: a.get(k,1) * b.get(k, 1) for k in a.viewkeys() | b})
Counter({'b': 32, 'c': 8, 'd': 4, 'a': 2})
If you wanted to be able to use the * operator on the Counter dicts you would have to roll your own:
class _Counter(Counter):
def __mul__(self, other):
return _Counter({k: self[k] * other[k] for k in self.viewkeys() & other})
a = _Counter({'b': 4, 'c': 2, 'a': 1, "d": 4})
b = _Counter({'b': 8, 'c': 4, 'a': 2})
print(a * b)
Which would give you:
_Counter({'b': 32, 'c': 8, 'a': 2})
If you wanted inplace:
from collections import Counter
class _Counter(Counter):
def __imul__(self, other):
return _Counter({k: self[k] * other[k] for k in self.viewkeys() & other})
Output:
In [28]: a = _Counter({'b': 4, 'c': 2, 'a': 1, "d": 4})
In [29]: b = _Counter({'b': 8, 'c': 4, 'a': 2})
In [30]: a *= b
In [31]: a
Out[31]: _Counter({'a': 2, 'b': 32, 'c': 8})
This seems a bit better:
a = Counter({'b': 4, 'c': 2, 'a': 1})
b = Counter({'b': 8, 'c': 4, 'a': 2})
newcounter = Counter({k:a[k]*v for k,v in b.items()})
>>> newcounter
Counter({'b': 32, 'c': 8, 'a': 2})

How to subtract values from dictionaries

I have two dictionaries in Python:
d1 = {'a': 10, 'b': 9, 'c': 8, 'd': 7}
d2 = {'a': 1, 'b': 2, 'c': 3, 'e': 2}
I want to substract values between dictionaries d1-d2 and get the result:
d3 = {'a': 9, 'b': 7, 'c': 5, 'd': 7 }
Now I'm using two loops but this solution is not too fast
for x,i in enumerate(d2.keys()):
for y,j in enumerate(d1.keys()):
I think a very Pythonic way would be using dict comprehension:
d3 = {key: d1[key] - d2.get(key, 0) for key in d1}
Note that this only works in Python 2.7+ or 3.
Use collections.Counter, iif all resulting values are known to be strictly positive. The syntax is very easy:
>>> from collections import Counter
>>> d1 = Counter({'a': 10, 'b': 9, 'c': 8, 'd': 7})
>>> d2 = Counter({'a': 1, 'b': 2, 'c': 3, 'e': 2})
>>> d3 = d1 - d2
>>> print d3
Counter({'a': 9, 'b': 7, 'd': 7, 'c': 5})
Mind, if not all values are known to remain strictly positive:
elements with values that become zero will be omitted in the result
elements with values that become negative will be missing, or replaced with wrong values. E.g., print(d2-d1) can yield Counter({'e': 2}).
Just an update to Haidro answer.
Recommended to use subtract method instead of "-".
d1.subtract(d2)
When - is used, only positive counters are updated into dictionary.
See examples below
c = Counter(a=4, b=2, c=0, d=-2)
d = Counter(a=1, b=2, c=3, d=4)
a = c-d
print(a) # --> Counter({'a': 3})
c.subtract(d)
print(c) # --> Counter({'a': 3, 'b': 0, 'c': -3, 'd': -6})
Please note the dictionary is updated when subtract method is used.
And finally use dict(c) to get Dictionary from Counter object
Haidro posted an easy solution, but even without collections you only need one loop:
d1 = {'a': 10, 'b': 9, 'c': 8, 'd': 7}
d2 = {'a': 1, 'b': 2, 'c': 3, 'e': 2}
d3 = {}
for k, v in d1.items():
d3[k] = v - d2.get(k, 0) # returns value if k exists in d2, otherwise 0
print(d3) # {'c': 5, 'b': 7, 'a': 9, 'd': 7}

Categories

Resources