I want to merge a list of dictionaries in Python. The number of dictionaries in the list is not fixed, and the dictionaries may share some keys and differ on others. The dictionaries in the list do not contain nested dictionaries. The values for the same key can be stored in a list.
My code is:
list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5}, {'a': 3, 'k': 5, 'd': 4}, {'a': 3} ...... ]
output = {}
for i in list_of_dict:
    for k, v in i.items():
        if k in output:
            output[k].append(v)
        else:
            output[k] = [v]
Is there a shorter and faster way of implementing this?
I am looking for the fastest way of doing this, because the list of dictionaries is very large and there are many rows with such data.
One way using collections.defaultdict:
from collections import defaultdict
res = defaultdict(list)
for d in list_of_dict:
    for k, v in d.items():
        res[k].append(v)
Output:
defaultdict(list,
{'a': [1, 3, 3, 3],
'b': [2, 5],
'c': [3],
'k': [5, 5],
'j': [5],
'd': [4]})
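If a plain dict is wanted at the end (for example, so that looking up a missing key raises KeyError again instead of silently creating an empty list), the loop above can be wrapped in a small helper. This is only a sketch; the name merge_to_lists is made up for this example.

```python
from collections import defaultdict

def merge_to_lists(dicts):
    """Collect the values of every key across `dicts` into lists."""
    res = defaultdict(list)
    for d in dicts:
        for k, v in d.items():
            res[k].append(v)
    # Convert back to a plain dict so missing keys raise KeyError again.
    return dict(res)

list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5},
                {'a': 3, 'k': 5, 'd': 4}, {'a': 3}]
merged = merge_to_lists(list_of_dict)
print(merged)
# {'a': [1, 3, 3, 3], 'b': [2, 5], 'c': [3], 'k': [5, 5], 'j': [5], 'd': [4]}
```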
items() is a dictionary method, but list_of_dict is a list. You need a nested loop so you can loop over the dictionaries and then loop over the items of each dictionary.
output = {}
for d in list_of_dict:
    for key, value in d.items():
        # setdefault returns the existing list, or stores and returns a new one
        output.setdefault(key, []).append(value)
Another, shorter version:
list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5}, {'a': 3, 'k': 5, 'd': 4}, {'a': 3}]
output = {
    k: [d[k] for d in list_of_dict if k in d]
    for k in set().union(*list_of_dict)
}
print(output)
{'d': [4], 'k': [5, 5], 'a': [1, 3, 3, 3], 'j': [5], 'c': [3], 'b': [2, 5]}
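As the output shows, the set yields the keys in arbitrary order. If first-seen key order matters, one option is to build the key collection with dict.fromkeys instead of a set (a sketch, relying on dicts keeping insertion order, which is guaranteed from Python 3.7). Like the version above, it scans the whole list once per distinct key, so it does more work than the explicit nested loop when there are many keys.

```python
list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5},
                {'a': 3, 'k': 5, 'd': 4}, {'a': 3}]

# dict.fromkeys drops duplicate keys but keeps the order of first appearance.
all_keys = dict.fromkeys(k for d in list_of_dict for k in d)

output = {k: [d[k] for d in list_of_dict if k in d] for k in all_keys}
print(output)
# {'a': [1, 3, 3, 3], 'b': [2, 5], 'c': [3], 'k': [5, 5], 'j': [5], 'd': [4]}
```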
On Python 3.9+ you can use the merge operator for this.
def merge_dicts(dicts):
    result = dict()
    for _dict in dicts:
        result |= _dict
    return result
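Note that |= behaves like update: a later value for a key replaces the earlier one rather than being appended to a list, so this does not produce the list-valued result asked for in the question. A quick check, with merge_dicts repeated so that the snippet runs on its own (Python 3.9+):

```python
def merge_dicts(dicts):
    result = dict()
    for _dict in dicts:
        result |= _dict  # same effect as result.update(_dict)
    return result

list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5}]
merged = merge_dicts(list_of_dict)
print(merged)
# {'a': 3, 'b': 5, 'c': 3, 'k': 5, 'j': 5}
```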
One of the shortest ways would be to prepare a list/set of all the keys from all the dictionaries and then look up each key in every dictionary in the list.
list_of_dict = [{'a': 1, 'b': 2, 'c': 3}, {'a': 3, 'b': 5}, {'k': 5, 'j': 5}, {'a': 3, 'k': 5, 'd': 4}, {'a': 3}]
# prepare a list/set of all the keys from all the dictionaries
# method 1: use sum
all_keys = sum([[a for a in x.keys()] for x in list_of_dict], [])
# method 2: use itertools
import itertools
all_keys = list(itertools.chain.from_iterable(list_of_dict))
# method 3: use union of the set
all_keys = set().union(*list_of_dict)
print(all_keys)
# with method 1 or 2: ['a', 'b', 'c', 'a', 'b', 'k', 'j', 'a', 'k', 'd', 'a'] (method 3 already gives a set)
# convert the list to set to remove duplicates
all_keys = set(all_keys)
print(all_keys)
# {'a', 'k', 'c', 'd', 'b', 'j'}
# now merge the dictionary
merged = {k: [d.get(k) for d in list_of_dict if k in d] for k in all_keys}
print(merged)
# {'a': [1, 3, 3, 3], 'k': [5, 5], 'c': [3], 'd': [4], 'b': [2, 5], 'j': [5]}
In short:
all_keys = set().union(*list_of_dict)
merged = {k: [d.get(k) for d in list_of_dict if k in d] for k in all_keys}
print(merged)
# {'a': [1, 3, 3, 3], 'k': [5, 5], 'c': [3], 'd': [4], 'b': [2, 5], 'j': [5]}
I have 4 dictionaries, let's call them:
dict1 , dict2 , dict3 , dict4
Example:
dict1 = {'A': 1, 'B':2}
dict2 = {'A': 3, 'C':4}
dict3 = {'B': 5, 'D':6}
dict4 = {'A': 7, 'B':8, 'C': 9, 'D':10, 'E':11}
Each dictionary is "stronger" than those that come after it. That is, an A found in dict1 takes precedence over an A found in dict2.
Is there a short, elegant script to create a new dictionary, assembled from all four, where each key is taken from the "strongest" dictionary that contains that key?
The result should be: dict = {'A': 1, 'B': 2, 'C': 4, 'D': 6, 'E': 11}
I think the easiest/clearest approach here would be to create a new dictionary then use its update method, which overwrites existing keys. Something like this makes the precedence pretty obvious:
>>> x = {}
>>> x.update(dict4)
>>> x.update(dict3)
>>> x.update(dict2)
>>> x.update(dict1)
>>> x
{'A': 1, 'B': 2, 'C': 4, 'D': 6, 'E': 11}
You could of course make a utility of some sort for this, something like:
>>> def collapse(*dicts):
...     x = {}
...     for d in dicts:  # avoid shadowing the built-in name dict
...         x.update(d)
...     return x
...
>>>
>>> collapse(dict4, dict3, dict2, dict1)
{'A': 1, 'B': 2, 'C': 4, 'D': 6, 'E': 11}
(Though you'd need to remember to pass the dictionaries in the correct order.)
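One way to sidestep the ordering pitfall is to let callers pass the strongest dictionary first and do the reversal inside the helper. A sketch of that idea; collapse_strongest_first is a name made up for this example:

```python
def collapse_strongest_first(*dicts):
    """Merge dicts; earlier arguments take precedence over later ones."""
    result = {}
    # Apply the weakest dict first so that stronger ones overwrite it.
    for d in reversed(dicts):
        result.update(d)
    return result

dict1 = {'A': 1, 'B': 2}
dict2 = {'A': 3, 'C': 4}
dict3 = {'B': 5, 'D': 6}
dict4 = {'A': 7, 'B': 8, 'C': 9, 'D': 10, 'E': 11}

merged = collapse_strongest_first(dict1, dict2, dict3, dict4)
print(merged)
# {'A': 1, 'B': 2, 'C': 4, 'D': 6, 'E': 11}
```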
You could do the following (works on Python 3.5 and newer; later entries win, so the strongest dictionary goes last):
result = {**dict4, **dict3, **dict2, **dict1}
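On Python 3.9 and newer, the same precedence can also be written with the dict union operator, where the right-hand operand wins on duplicate keys. A short sketch:

```python
dict1 = {'A': 1, 'B': 2}
dict2 = {'A': 3, 'C': 4}
dict3 = {'B': 5, 'D': 6}
dict4 = {'A': 7, 'B': 8, 'C': 9, 'D': 10, 'E': 11}

# Weakest on the left, strongest on the right.
result = dict4 | dict3 | dict2 | dict1
print(result)
# {'A': 1, 'B': 2, 'C': 4, 'D': 6, 'E': 11}
```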
Here's a fairly simple way for an arbitrary number of dictionaries:
dict1 = {'A': 1, 'B':2}
dict2 = {'A': 3, 'C':4}
dict3 = {'B': 5, 'D':6}
dict4 = {'A': 7, 'B':8, 'C': 9, 'D':10, 'E':11}
# strongest dictionary last
dictionaries = [dict4, dict3, dict2, dict1]
dict(i for d in dictionaries for i in d.items())
Output:
{'A': 1, 'B': 2, 'C': 4, 'D': 6, 'E': 11}
You probably want a ChainMap, which is perfect for simulating scope.
>>> import collections
>>> cm = collections.ChainMap(dict1, dict2, dict3, dict4)
>>> dict(cm)
{'A': 1, 'B': 2, 'C': 4, 'D': 6, 'E': 11}
>>> cm['A'] = 'foo'
>>> cm
ChainMap({'A': 'foo', 'B': 2}, {'A': 3, 'C': 4}, {'B': 5, 'D': 6}, {'A': 7, 'B': 8, 'C': 9, 'D': 10, 'E': 11})
>>> dict1
{'A': 'foo', 'B': 2}
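The session above shows that writes through a ChainMap go to the first mapping. A short sketch of the other half of that behaviour: lookups search the mappings in order, and nothing is copied, so later changes to the underlying dicts are visible through the ChainMap until you take a snapshot with dict().

```python
from collections import ChainMap

dict1 = {'A': 1, 'B': 2}
dict2 = {'A': 3, 'C': 4}
cm = ChainMap(dict1, dict2)

print(cm['A'])  # 1, found in dict1 first
print(cm['C'])  # 4, falls through to dict2

# The ChainMap is a view: a key added to dict2 is visible through cm.
dict2['Z'] = 26
print(cm['Z'])  # 26

# dict(cm) takes a snapshot that no longer tracks the originals.
snapshot = dict(cm)
dict2['Z'] = 0
print(snapshot['Z'])  # 26
```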
Question
According to this answer, in Python 3.5 or greater, it is possible to merge two dictionaries x and y by unpacking them:
z = {**x, **y}
Is it possible to unpack a variadic list of dictionaries? Something like
def merge(*dicts):
    return {***dicts} # this fails, of course. What should I use here?
For instance, I would expect that
list_of_dicts = [{'a': 1, 'b': 2}, {'c': 3}, {'d': 4}]
{***list_of_dicts} == {'a': 1, 'b': 2, 'c': 3, 'd': 4}
Note that this question is not about how to merge lists of dictionaries since the link above provides an answer to this. The question here is: is it possible, and how, to unpack lists of dictionaries?
Edit
As stated in the comments, this question is very similar to this one. However, unpacking a list of dictionaries is different from simply merging them. Supposing that there was an operator *** designed to unpack lists of dictionaries, and given
def print_values(a, b, c, d):
    print('a =', a)
    print('b =', b)
    print('c =', c)
    print('d =', d)
list_of_dicts = [{'a': 1, 'b': 2}, {'c': 3}, {'d': 4}]
it would be possible to write
print_values(***list_of_dicts)
instead of
print_values(**merge(list_of_dicts))
Another solution is using collections.ChainMap
from collections import ChainMap
dict(ChainMap(*list_of_dicts[::-1]))
Out[88]: {'a': 1, 'b': 2, 'c': 3, 'd': 4}
You could just iterate over the list and use update:
lst = [{'a': 1, 'b': 2}, {'c': 3}, {'d': 4}]
dct = {}
for item in lst:
    dct.update(item)
print(dct)
# {'a': 1, 'b': 2, 'c': 3, 'd': 4}
There's no syntax for that, but you can use itertools.chain to concatenate the key/value tuples from each dict into a single stream that dict can consume.
from itertools import chain
def merge(*dicts):
    return dict(chain.from_iterable(d.items() for d in dicts))
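You can also unpack a list created by a list comprehension into chain. (Unpacking it straight into dict() does not work, because dict accepts at most one positional argument.)
from itertools import chain
def merge(*dicts):
    return dict(chain(*[d.items() for d in dicts]))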
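Since there is no *** syntax, the closest equivalent to the print_values call in the question is to merge first and then unpack once with **. A minimal sketch using the chain-based merge; describe is a hypothetical stand-in for print_values that returns a string so the result is easy to check:

```python
from itertools import chain

def merge(*dicts):
    # Later dicts win when a key appears more than once.
    return dict(chain.from_iterable(d.items() for d in dicts))

def describe(a, b, c, d):
    return f"a={a} b={b} c={c} d={d}"

list_of_dicts = [{'a': 1, 'b': 2}, {'c': 3}, {'d': 4}]

# One * to spread the list into merge, one ** to spread the merged dict.
text = describe(**merge(*list_of_dicts))
print(text)
# a=1 b=2 c=3 d=4
```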
To merge multiple dictionaries you can use the function reduce:
from functools import reduce
lst = [{'a': 1, 'b': 2}, {'c': 3}, {'d': 4}]
reduce(lambda x, y: dict(**x, **y), lst)
# {'a': 1, 'b': 2, 'c': 3, 'd': 4}
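One caveat with dict(**x, **y): it raises TypeError if a key appears in both dicts, or if any key is not a string, because the keys are passed as keyword arguments. A sketch of the same reduce using the {**x, **y} display form, which has neither restriction and lets later dicts win:

```python
from functools import reduce

lst = [{'a': 1, 'b': 2}, {'a': 10, 'c': 3}, {1: 'one'}]

# dict(**x, **y) rejects the repeated key 'a'.
try:
    reduce(lambda x, y: dict(**x, **y), lst)
    failed = False
except TypeError:
    failed = True
print(failed)
# True

# The initial {} also makes the call safe for an empty list.
merged = reduce(lambda x, y: {**x, **y}, lst, {})
print(merged)
# {'a': 10, 'b': 2, 'c': 3, 1: 'one'}
```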
You could use a list comprehension to collect the key/value pairs and pass that list as the argument to dict:
def merge(*dicts):
    lst = [kv for d in dicts for kv in d.items()]
    return dict(lst)
You can use a generator expression to iterate over all the dicts in the list, then over each of those dicts' items, and finally convert the resulting pairs to a dict:
>>> lst = [{'a':1}, {'b':2}, {'c':1}, {'d':2}]
>>> dict(kv for d in lst for kv in d.items())
{'a': 1, 'b': 2, 'c': 1, 'd': 2}
You can use reduce to merge two dicts at a time using dict.update
>>> from functools import reduce
>>> lst = [{'a':1}, {'b':2}, {'c':1}, {'d':2}]
>>> reduce(lambda d1, d2: d1.update(d2) or d1, lst, {})
{'a': 1, 'b': 2, 'c': 1, 'd': 2}
With *dicts, the arguments are packed into a tuple. If the whole list is passed as a single argument, you can pull it out with dicts[0], then use this comprehension for non-uniform keys:
list_of_dicts = [{'a': 1, 'b': 2}, {'c': 3}, {'d': 4}]
def merge(*dicts):
    return dict(j for i in dicts[0] for j in i.items())
print(merge(list_of_dicts))
{'a': 1, 'b': 2, 'c': 3, 'd': 4}
Let's say I have a list of dictionaries:
>>> d = [{'a': 2, 'b': 3, 'c': 4}, {'a': 5, 'b': 6, 'c': 7}]
And I want to perform a map operation where I change just one value in each dictionary. One possible way to do that is to create a new dictionary which simply contains the original values along with the changed ones:
>>> map(lambda x: {'a': x['a'], 'b': x['b'] + 1, 'c': x['c']}, d)
[{'a': 2, 'c': 4, 'b': 4}, {'a': 5, 'c': 7, 'b': 7}]
This can get unruly if the dictionaries have many items.
Another way might be to define a function which copies the original dictionary and only changes the desired values:
>>> def change_b(x):
...     new_x = x.copy()
...     new_x['b'] = x['b'] + 1
...     return new_x
...
>>> map(change_b, d)
[{'a': 2, 'c': 4, 'b': 4}, {'a': 5, 'c': 7, 'b': 7}]
This, however, requires writing a separate function and loses the elegance of a lambda expression.
Is there a better way?
This works (and is compatible with Python 2 and Python 3 [1]):
>>> map(lambda x: dict(x, b=x['b']+1), d)
[{'a': 2, 'c': 4, 'b': 4}, {'a': 5, 'c': 7, 'b': 7}]
With that said, I think that more often than not, lambda-based solutions are less elegant than their non-lambda counterparts. The rationale behind this statement is that I can immediately look at the non-lambda solution that you proposed and know exactly what it does. The lambda-based solution that I just wrote would take a bit of thinking to parse and then more thinking to actually understand.
[1] Though map will give you an iterator on Python 3.x rather than a list.
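On Python 3, the same expression needs a list() call (or a list comprehension) to produce the output shown, because map returns a lazy iterator. A short sketch:

```python
d = [{'a': 2, 'b': 3, 'c': 4}, {'a': 5, 'b': 6, 'c': 7}]

# dict(x, b=...) copies x and overrides 'b'; the original dicts are untouched.
result = list(map(lambda x: dict(x, b=x['b'] + 1), d))
print(result)
# [{'a': 2, 'b': 4, 'c': 4}, {'a': 5, 'b': 7, 'c': 7}]
print(d[0]['b'])
# 3
```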
First, writing a function doesn't seem that inelegant to me in the first place. That said, welcome to the brave new world of Python 3.5 and PEP 448:
>>> d = [{'a': 2, 'b': 3, 'c': 4}, {'a': 5, 'b': 6, 'c': 7}]
>>> d
[{'b': 3, 'a': 2, 'c': 4}, {'b': 6, 'a': 5, 'c': 7}]
>>> [{**x, 'b': x['b']+1} for x in d]
[{'b': 4, 'a': 2, 'c': 4}, {'b': 7, 'a': 5, 'c': 7}]
From how your map is behaving, it's clear you're using Python 2, but that's easy enough to fix. :-)
You can use a for loop with an update call. Here is a hacky one-liner:
dcts = [{'a': 2, 'b': 3, 'c': 4}, {'a': 5, 'b': 6, 'c': 7}]
dcts = [d.update({'b': d['b']+1}) or d for d in dcts]
Edit: To preserve original dicts:
from copy import copy
dcts = [d.update({'b': d['b']+1}) or d for d in map(copy, dcts)]
I have the following dict:
my_dict = {'A': [1, 2], 'B': [1, 4]}
And I want to end up with a list of dicts like this:
[
{'A': 1, 'B': 1},
{'A': 1, 'B': 4},
{'A': 2, 'B': 1},
{'A': 2, 'B': 4}
]
So, I'm after the product of dict's lists, expressed as a list of dicts using the same keys as the incoming dict.
The closest I got was:
my_dict = {'A': [1, 2], 'B': [1, 4]}
it = []
for k in my_dict.keys():
    current = my_dict.pop(k)
    for i in current:
        it.append({k2: i2 for k2, i2 in my_dict.iteritems()})
        it[-1].update({k: i})
Which, apart from looking hideous, doesn't give me what I want:
[
{'A': 1, 'B': [1, 4]},
{'A': 2, 'B': [1, 4]},
{'B': 1},
{'B': 4}
]
If anyone feels like solving a riddle, I'd love to see how you'd approach it.
You can use itertools.product for this, i.e. calculate the Cartesian product of the values and then zip each resulting tuple with the keys of the dictionary. Note that the ordering of a dict's keys() and the corresponding values() stays the same as long as the dict is not modified in between, so ordering won't be an issue here:
>>> from itertools import product
>>> my_dict = {'A': [1, 2], 'B': [1, 4]}
>>> keys = list(my_dict)
>>> [dict(zip(keys, p)) for p in product(*my_dict.values())]
[{'A': 1, 'B': 1}, {'A': 1, 'B': 4}, {'A': 2, 'B': 1}, {'A': 2, 'B': 4}]
You can use the itertools.product function within a list comprehension:
>>> from itertools import product
>>> [dict(i) for i in product(*[[(i,k) for k in j] for i,j in my_dict.items()])]
[{'A': 1, 'B': 1}, {'A': 1, 'B': 4}, {'A': 2, 'B': 1}, {'A': 2, 'B': 4}]
You can get the pairs containing your keys and values with the following list comprehension:
[[(i,k) for k in j] for i,j in my_dict.items()]
[[('A', 1), ('A', 2)], [('B', 1), ('B', 4)]]
Then you can use product to calculate the product of the preceding lists and convert each result to a dictionary with the dict function.
With itertools:
>>> from itertools import product
>>> my_dict = {'A': [1, 2], 'B': [1, 4]}
>>> keys, items = zip(*my_dict.items())
>>> [dict(zip(keys, x)) for x in product(*items)]
[{'A': 1, 'B': 1}, {'A': 1, 'B': 4}, {'A': 2, 'B': 1}, {'A': 2, 'B': 4}]
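The number of combinations grows multiplicatively with the length of each value list, so for large inputs it may be preferable not to build the whole list at once. A sketch of a lazy variant of the same idea; iter_dict_product is a name chosen for this example:

```python
from itertools import product

def iter_dict_product(mapping):
    """Yield one dict per combination of the values in `mapping`."""
    keys = list(mapping)
    for combo in product(*mapping.values()):
        yield dict(zip(keys, combo))

my_dict = {'A': [1, 2], 'B': [1, 4]}

combos = iter_dict_product(my_dict)
print(next(combos))
# {'A': 1, 'B': 1}
print(list(combos))
# [{'A': 1, 'B': 4}, {'A': 2, 'B': 1}, {'A': 2, 'B': 4}]
```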
Try this:
from itertools import product
def dict_product(values, first, second):
    return [
        {first: first_value, second: second_value}
        for first_value, second_value in product(values[first], values[second])
    ]
This is the result:
>>> dict_product({'A': [1, 2], 'B': [1, 4]}, 'A', 'B')
[{'A': 1, 'B': 1}, {'A': 1, 'B': 4}, {'A': 2, 'B': 1}, {'A': 2, 'B': 4}]
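The function above is limited to exactly two keys. A possible generalisation, under the hypothetical name dict_product_for, takes any number of keys and falls back to all keys of the dictionary when none are given:

```python
from itertools import product

def dict_product_for(values, *keys):
    """Product of values[k] for the chosen keys (all keys if none given)."""
    keys = keys or tuple(values)
    value_lists = [values[k] for k in keys]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]

data = {'A': [1, 2], 'B': [1, 4], 'C': [0]}

print(dict_product_for(data, 'A', 'C'))
# [{'A': 1, 'C': 0}, {'A': 2, 'C': 0}]
print(dict_product_for({'A': [1, 2], 'B': [1, 4]}))
# [{'A': 1, 'B': 1}, {'A': 1, 'B': 4}, {'A': 2, 'B': 1}, {'A': 2, 'B': 4}]
```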