concatenate dictionaries over a key - python

I have a list of dictionaries (with some data fetched from an API). Assume:
alist = [{'a':1, 'b':2, 'c':3}, {'a':1, 'b':2, 'c':35}, {'a':1, 'b':2, 'c':87}..]
Several dictionaries in alist are repeated, except that one key has a different value in each repeat. So the question is:
What's the easiest way to combine those dictionaries by keeping separate values in a list?
like:
alist = [{'a':1, 'b':2, 'c':[3, 35, 87]}...]
Update: I have a list that specifies the repeated keys, like:
repeated_keys = ['c',...]

Use defaultdict (it is faster) and generate a dictionary from it; you can also easily convert that dictionary into a list. To filter keys, change what the inner loop over i.keys() iterates over.
from collections import defaultdict as df
d = df(list)
alist = [{'a':1, 'b':2, 'c':3}, {'a':1, 'b':2, 'c':35}, {'a':1, 'b':2, 'c':87}]
for i in alist:
    for j in i.keys():
        d[j].append(i[j])
print(dict(d))
Output:
{'a': [1, 1, 1], 'c': [3, 35, 87], 'b': [2, 2, 2]}
If you want to get rid of repeated elements, use a dict comprehension and set (the order of elements within each list is then arbitrary):
>>> {k: list(set(v)) for k, v in d.items()}
{'a': [1], 'c': [35, 3, 87], 'b': [2]}

You could use a list comprehension:
result = [alist[0].copy()]
result[0]['c'] = [d['c'] for d in alist]
Note that there is little point in making this a list again; you combined everything into one dictionary, after all:
result = dict(alist[0], c=[d['c'] for d in alist])
If you have multiple repeated keys, you have two options:
Loop and get each key out:
result = alist[0].copy()
for key in repeated_keys:
    result[key] = [d[key] for d in alist]
Make all keys lists; that way you don't have to keep consulting your list of repeated keys:
result = {}
for key in alist[0]:
    result[key] = [d[key] for d in alist]
The latter option can alternatively be implemented by iterating over alist just once:
result = {}
for d in alist:
    for key, value in d.items():
        result.setdefault(key, []).append(value)
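If alist holds several distinct groups of repeated dictionaries, as the list-shaped result in the question suggests, the records can be grouped by the items whose keys are not in repeated_keys. The following is only a sketch under assumptions: the function name merge_repeated and the sample data are made up for illustration, the non-repeated values are assumed to be hashable, and the order of the result relies on Python 3.7+ insertion-ordered dicts.

```python
def merge_repeated(alist, repeated_keys):
    """Merge dicts that agree on every key outside repeated_keys (hypothetical helper)."""
    repeated = set(repeated_keys)
    groups = {}
    for d in alist:
        # The non-repeated items identify which group this dict belongs to.
        fixed = {k: v for k, v in d.items() if k not in repeated}
        merged = groups.setdefault(frozenset(fixed.items()), fixed)
        # Collect the values of the repeated keys in lists.
        for k, v in d.items():
            if k in repeated:
                merged.setdefault(k, []).append(v)
    return list(groups.values())


alist = [{'a': 1, 'b': 2, 'c': 3}, {'a': 1, 'b': 2, 'c': 35},
         {'a': 5, 'b': 2, 'c': 9}]
result = merge_repeated(alist, ['c'])
print(result)  # [{'a': 1, 'b': 2, 'c': [3, 35]}, {'a': 5, 'b': 2, 'c': [9]}]
```

The input dictionaries are left untouched because each group starts from a freshly built dict.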

from collections import defaultdict
con_dict = defaultdict(list)
alist = [{'a':1, 'b':2, 'c':3}, {'a':1, 'b':2, 'c':35}, {'a':1, 'b':2, 'c':87}]
for curr_dict in alist:
    for k, v in curr_dict.items():  # iteritems() in Python 2
        con_dict[k].append(v)
con_dict = dict(con_dict)
We create a defaultdict of lists, then iterate over the items of every dictionary and append each value under its key.

It is possible to get your result. You have to decide whether a key's values are collected in a list (when they differ) or kept as a single value (when they are all the same).
repeated_keys stores the keys whose values differ and counts the additional distinct values seen for each of them.
alist = [{'a':1, 'b':2, 'c':3}, {'a':1, 'b':2, 'c':35}, {'a':1, 'b':2, 'c':87}]
z = {}
repeated_keys = {}
for d in alist:  # named d so the builtin dict is not shadowed
    for key in d:
        if key in z:  # dict.has_key() no longer exists in Python 3
            if isinstance(z[key], list):
                if d[key] not in z[key]:
                    repeated_keys[key] += 1
                    z[key].append(d[key])
            elif z[key] != d[key]:
                repeated_keys[key] = 1
                z[key] = [z[key], d[key]]
        else:
            z[key] = d[key]
print('dict:', z)
print('Repeated keys:', repeated_keys)
Output:
dict: {'a': 1, 'b': 2, 'c': [3, 35, 87]}
Repeated keys: {'c': 2}
If instead:
alist = [{'a':1, 'b':2, 'c':3}, {'a':1, 'b':2, 'c':35}, {'a':1, 'b':2, 'c':87}, {'a':3,'b':2}]
the output is:
dict: {'a': [1, 3], 'b': 2, 'c': [3, 35, 87]}
Repeated keys: {'c': 2, 'a': 1}

Related

Dictionary difference similar to set difference

I have a dictionary and a list:
dictionary = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6}
remove = ['b', 'c', 'e']
I need to split "dictionary" into two dictionaries using "remove". The idea is to remove keys in "remove" from "dictionary" but instead of discarding them, I want to keep them in a new dictionary. The outcome I want is
old_dictionary = {'a':1, 'd':4, 'f':6}
new_dictionary = {'b':2, 'c':3, 'e':5}
Getting "new_dictionary" is fairly easy.
new_dictionary = {}
for key, value in dictionary.items():
    if key in remove:
        new_dictionary[key] = value
How do I find the difference between "dictionary" and "new_dictionary" to get "old_dictionary"? I guess I could loop again only with not in remove... but is there a nice trick for dictionaries similar to set difference?
One way could be to use dict.pop in loop. dict.pop method removes the key and returns its value. So in each iteration, we remove a key in remove from dictionary and add this key along with its value to new_dict. At the end of the iteration, dictionary will have all keys in remove removed from it.
new_dict = {k: dictionary.pop(k) for k in remove}
old_dict = dictionary.copy()
Output:
>>> new_dict
{'b': 2, 'c': 3, 'e': 5}
>>> old_dict
{'a': 1, 'd': 4, 'f': 6}
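Because dict.pop removes the keys from dictionary itself, the original mapping is gone afterwards. If it has to survive, a non-mutating variant is possible; this is a minimal sketch that reuses the names from the question, with remove_set added here as an assumption to keep membership tests cheap.

```python
dictionary = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6}
remove = ['b', 'c', 'e']

remove_set = set(remove)  # set membership tests are O(1) on average

# Two passes over the items; dictionary itself is never modified.
new_dict = {k: v for k, v in dictionary.items() if k in remove_set}
old_dict = {k: v for k, v in dictionary.items() if k not in remove_set}

print(new_dict)  # {'b': 2, 'c': 3, 'e': 5}
print(old_dict)  # {'a': 1, 'd': 4, 'f': 6}
```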
Just add else:
new_dictionary = {}
old_dictionary = {}
for key, value in dictionary.items():
    if key in remove:
        new_dictionary[key] = value
    else:
        old_dictionary[key] = value
Use else: to put it in the other dictionary.
new_dictionary = {}
old_dictionary = {}
for key, value in dictionary.items():
    if key in remove:
        new_dictionary[key] = value
    else:
        old_dictionary[key] = value
In Python 3, the views returned by dict.keys() and dict.items() support set operations with other iterables:
>>> dictionary = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6}
>>> remove = list('bce')
>>> new_dict = {key: dictionary[key] for key in remove}
>>> new_dict
{'b': 2, 'c': 3, 'e': 5}
>>> dict(dictionary.items() - new_dict.items())
{'d': 4, 'f': 6, 'a': 1}
However, in terms of performance, this method is not as good as the answer with the highest score.
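One caveat with the items() difference is that an items view only behaves like a set when every value is hashable. Working on the keys view avoids that restriction. The sketch below is an illustration only; the sample data (including the list value) is made up to show the unhashable case.

```python
dictionary = {'a': [1, 2], 'b': 2, 'c': 3, 'd': 4}  # 'a' holds an unhashable list
remove = ['b', 'c']

# Subtracting from the keys view yields a plain set of the remaining keys.
kept_keys = dictionary.keys() - set(remove)
old_dict = {k: dictionary[k] for k in kept_keys}
new_dict = {k: dictionary[k] for k in remove if k in dictionary}

print(old_dict == {'a': [1, 2], 'd': 4})  # True
print(new_dict == {'b': 2, 'c': 3})       # True
```

Since kept_keys is a set, the key order of old_dict is not guaranteed to match the original dictionary.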

How to get dictionary keys and values if both keys are in two separated dictionaries?

I would like to get a new dictionary with keys only if both dictionaries have those keys in them, and then get the values of the second one.
# example:
Dict1 = {'A':3, 'B':5, 'C':2, 'D':5}
Dict2 = {'B':3, 'C':1, 'K':5}
# result--> {'B':3, 'C':1}
As a dictionary comprehension:
>>> {k:v for k, v in Dict2.items() if k in Dict1}
{'B': 3, 'C': 1}
Or use filter:
>>> dict(filter(lambda x: x[0] in Dict1, Dict2.items()))
{'B': 3, 'C': 1}
Just another solution, doesn't use comprehension. This function loops through the keys k in Dict1 and tries to add Dict2[k] to a new dict which is returned at the end. I think the try-except approach is "pythonic".
def shared_keys(a, b):
    """
    returns dict of all KVs in b which are also in a
    """
    shared = {}
    for k in a.keys():
        try:
            shared[k] = b[k]
        except KeyError:  # k is not in b; a bare except would hide other errors
            pass
    return shared
Dict1 = {'A':3, 'B':5, 'C':2, 'D':5}
Dict2 = {'B':3, 'C':1, 'K':5}
print(shared_keys(Dict1, Dict2))
# >>> {'B': 3, 'C': 1}
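The same result can also be written with key views, which support the set intersection operator in Python 3 (in Python 2 the equivalent method is viewkeys()). This is a small sketch using the data from the question; the name shared is chosen here for illustration.

```python
Dict1 = {'A': 3, 'B': 5, 'C': 2, 'D': 5}
Dict2 = {'B': 3, 'C': 1, 'K': 5}

# Keys present in both dictionaries, with the values taken from Dict2.
shared = {k: Dict2[k] for k in Dict1.keys() & Dict2.keys()}

print(shared == {'B': 3, 'C': 1})  # True
```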

Python. Get {key:value} from dictionary in list and set it to another dict in list

I searched for an answer to my question but have not found a solution.
I have two lists whose elements are dictionaries. I want to copy a key:value pair from a dictionary in the first list only if that dictionary shares another key:value pair with a dictionary in the second list.
Example:
list_1 = [{'A':1, 'B':2, 'C':3}, {'A':10, 'B':20, 'C':30}]
list_2 = [{'A':1, 'B':22,}, {'A':111, 'B':20}]
# I need get key and value of 'C' from list_1 IF value of 'A' in both dict are equal
# code block for my task...
# result
list_2 = [{'A':1, 'B':22, 'C':3}, {'A':111, 'B':20}]
# 'C':3 append in list_2[0], because 'A' has same value
UPD:
It should be working even if dict with the same value of 'A' has different indices:
list_1 = [{'A':1, 'B':2, 'C':3}, {'A':10, 'B':20, 'C':30}]
list_2 = [{'A':111, 'B':20}, {'A':1, 'B':22,}]
# code...
# result
list_2 = [{'A':111, 'B':20}, {'A':1, 'B':22, 'C':3}]
If I understand correctly, this is what you want:
list_1 = [{'A':1, 'B':2, 'C':3}, {'A':10, 'B':20, 'C':30}]
list_2 = [{'A':1, 'B':22,}, {'A':111, 'B':20}]
for dic in range(len(list_1)):
    if list_1[dic]['A']==list_2[dic]['A']:
        list_2[dic]['C']=list_1[dic]['C']
print(list_2)
out:
[{'A': 1, 'B': 22, 'C': 3}, {'A': 111, 'B': 20}]
UPDATE: I implemented it as a function and added the functionality you want; check whether it is OK:
def add_to_other_list(list_1,list_2):
    for dic_1 in list_1:
        for dic_2 in list_2:
            if dic_1['A']==dic_2['A']:
                dic_2['C']=dic_1['C']
    return list_2
list_2 = add_to_other_list(list_1,list_2)
Here is a one liner, which assumes that the key A is in all dicts in list_1 and list_2 and C in all dicts of list_1:
list_2 = [dict(l2, C=l1['C']) if l1['A'] == l2['A'] else l2 for l1, l2 in zip(list_1, list_2)]
def copymatch(matchkey, copykey, source, target):
    if source.get(matchkey) == target.get(matchkey):
        target[copykey] = source.get(copykey)
for source, target in zip(list_1,list_2):
    copymatch('A','C',source,target)
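The zip-based snippets only compare dictionaries at the same index, and the nested-loop function compares every pair. For the updated requirement (matching 'A' values at different positions), a lookup table is one alternative. This is a sketch rather than any answerer's method: the name c_by_a is invented here, it assumes the 'A' values are hashable, and if list_1 contains the same 'A' value twice the later entry wins.

```python
list_1 = [{'A': 1, 'B': 2, 'C': 3}, {'A': 10, 'B': 20, 'C': 30}]
list_2 = [{'A': 111, 'B': 20}, {'A': 1, 'B': 22}]

# Map each 'A' value in list_1 to its 'C' value.
c_by_a = {d['A']: d['C'] for d in list_1 if 'A' in d and 'C' in d}

# One pass over list_2; dictionaries without a matching 'A' stay unchanged.
for d in list_2:
    if 'A' in d and d['A'] in c_by_a:
        d['C'] = c_by_a[d['A']]

print(list_2)  # [{'A': 111, 'B': 20}, {'A': 1, 'B': 22, 'C': 3}]
```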

Making a list of dictionaries pass-by-value

I have a bit of a headache with a list of dicts.
def funk(x):
    for i in x:
        i['a'] += 1
        print(i)
list1 = [{'a':1, 'b':2}, {'a':3, 'b':4}]
funk(list1)
print(list1)
this will output:
{'a': 2, 'b': 2}
{'a': 4, 'b': 4}
[{'a': 2, 'b': 2}, {'a': 4, 'b': 4}]
but I want to have this:
{'a': 2, 'b': 2}
{'a': 4, 'b': 4}
[{'a':1, 'b':2}, {'a':3, 'b':4}]
How do I make list1 stay untouched?
eg: [{'a':1, 'b':2}, {'a':3, 'b':4}]
funk() could make a copy of x and modify that copy instead of modifying the original x.
import copy
def funk(x):
    x = copy.deepcopy(x)
    for i in x:
        i['a'] += 1
        print(i)
list1 = [{'a':1, 'b':2}, {'a':3, 'b':4}]
funk(list1)
print(list1)
It seems to me that the copy method of a dictionary might help you here:
def funk(x):
    for i in x:
        new_dict = i.copy()
        new_dict['a'] += 1
        print(new_dict)
Obviously, whether or not it's viable depends on the complexity/goal of your operation, but instead of modifying your list in-place, use a list comprehension to create a new one:
def funk(x):
    return [{key: value + 1 if key == "a" else value for key, value in i.items()}
            for i in x]
In your example, you are printing the values, so this wouldn't be useful, but if your intent can be filled by having a new, altered list of dicts, it might be the best route.
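The reason copying the outer list alone is not enough is that a shallow copy still shares the inner dictionaries. The sketch below illustrates the difference with the data from the question; the variable names shallow and deep are chosen for illustration.

```python
import copy

list1 = [{'a': 1, 'b': 2}, {'a': 3, 'b': 4}]

shallow = list(list1)         # new list, but the same dict objects inside
deep = copy.deepcopy(list1)   # new list holding independent copies of the dicts

shallow[0]['a'] += 1          # also changes list1[0]
deep[1]['a'] += 1             # leaves list1[1] alone

print(list1)  # [{'a': 2, 'b': 2}, {'a': 3, 'b': 4}]
print(deep)   # [{'a': 1, 'b': 2}, {'a': 4, 'b': 4}]
```

For dictionaries whose values are themselves immutable, as here, copying each dict with i.copy() gives the same protection as deepcopy at lower cost.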

Merge several Python dictionaries [duplicate]

I have to merge a list of Python dictionaries. For example:
dicts[0] = {'a':1, 'b':2, 'c':3}
dicts[1] = {'a':1, 'd':2, 'c':'foo'}
dicts[2] = {'e':57,'c':3}
The desired result:
super_dict = {'a':[1], 'b':[2], 'c':[3,'foo'], 'd':[2], 'e':[57]}
I wrote the following code:
super_dict = {}
for d in dicts:
    for k, v in d.items():
        if super_dict.get(k) is None:
            super_dict[k] = []
        if v not in super_dict.get(k):
            super_dict[k].append(v)
Can it be presented more elegantly / optimized?
Note
I found another question on SO but its about merging exactly 2 dictionaries.
You can iterate over the dictionaries directly -- no need to use range. The setdefault method of dict looks up a key, and returns the value if found. If not found, it returns a default, and also assigns that default to the key.
super_dict = {}
for d in dicts:
    for k, v in d.iteritems():  # d.items() in Python 3+
        super_dict.setdefault(k, []).append(v)
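To make the setdefault behaviour described above concrete, here is a tiny sketch. The variable names first and second are made up for illustration.

```python
super_dict = {}

first = super_dict.setdefault('a', [])   # key missing: stores [] and returns it
first.append(1)

second = super_dict.setdefault('a', [])  # key present: returns the stored list
second.append(2)

print(super_dict)       # {'a': [1, 2]}
print(first is second)  # True: both names refer to the list stored in the dict
```

Because the returned list is the stored object itself, appending to it updates the dictionary without a second lookup.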
Also, you might consider using a defaultdict. This just automates setdefault by calling a function to return a default value when a key isn't found.
import collections
super_dict = collections.defaultdict(list)
for d in dicts:
    for k, v in d.iteritems():  # d.items() in Python 3+
        super_dict[k].append(v)
Also, as Sven Marnach astutely observed, you seem to want no duplication of values in your lists. In that case, set gets you what you want:
import collections
super_dict = collections.defaultdict(set)
for d in dicts:
    for k, v in d.iteritems():  # d.items() in Python 3+
        super_dict[k].add(v)
from collections import defaultdict
dicts = [{'a':1, 'b':2, 'c':3},
         {'a':1, 'd':2, 'c':'foo'},
         {'e':57, 'c':3} ]
super_dict = defaultdict(set)  # uses set to avoid duplicates
for d in dicts:
    for k, v in d.items():  # use d.iteritems() in python 2
        super_dict[k].add(v)
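Sets drop duplicates but lose the order in which values were first seen, and the question's desired result uses lists. One way to get both is to use an inner dict as an ordered set. This is only a sketch, assuming Python 3.7+ (insertion-ordered dicts) and hashable values; the name seen is invented here.

```python
dicts = [{'a': 1, 'b': 2, 'c': 3},
         {'a': 1, 'd': 2, 'c': 'foo'},
         {'e': 57, 'c': 3}]

seen = {}
for d in dicts:
    for k, v in d.items():
        # The inner dict's keys act as an insertion-ordered set of values.
        seen.setdefault(k, {})[v] = None

super_dict = {k: list(values) for k, values in seen.items()}
print(super_dict)
# {'a': [1], 'b': [2], 'c': [3, 'foo'], 'd': [2], 'e': [57]}
```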
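You can use dictionary unpacking (Python 3.5+), which is a bit more elegant. Note that when the same key appears in several dictionaries, the last value wins instead of being collected into a list.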
a = {'a':1, 'b':2, 'c':3}
b = {'d':1, 'e':2, 'f':3}
c = {1:1, 2:2, 3:3}
merge = {**a, **b, **c}
print(merge) # {'a': 1, 'b': 2, 'c': 3, 'd': 1, 'e': 2, 'f': 3, 1: 1, 2: 2, 3: 3}
and you are good to go :)
Merge the keys of all dicts, and for each key assemble the list of values:
super_dict = {}
for k in set(k for d in dicts for k in d):
    super_dict[k] = [d[k] for d in dicts if k in d]
The expression set(k for d in dicts for k in d) builds a set of all unique keys of all dictionaries. For each of these unique keys, we use the list comprehension [d[k] for d in dicts if k in d] to build the list of values from all dicts for this key.
Since you only seem to want the unique values of each key, you might want to use sets instead:
super_dict = {}
for k in set(k for d in dicts for k in d):
    super_dict[k] = set(d[k] for d in dicts if k in d)
It seems like most of the answers using comprehensions are not all that readable. In case anyone gets lost in the mess of answers above, this might be helpful (although extremely late...). Just loop over the items of each dict and place them in a separate one. Note that values for duplicate keys are overwritten, not collected.
super_dict = {key:val for d in dicts for key,val in d.items()}
When the values are themselves lists:
from collections import defaultdict
dicts = [{'a':[1], 'b':[2], 'c':[3]},
         {'a':[11], 'd':[2], 'c':['foo']},
         {'e':[57], 'c':[3], "a": [1]} ]
super_dict = defaultdict(list)
for d in dicts:
    for k, v in d.items():  # use d.iteritems() in python 2
        super_dict[k] = list(set(super_dict[k] + v))  # set() drops duplicates
combined_dict = dict(super_dict)  # convert back to a plain dict
combined_dict
## output: {'a': [1, 11], 'b': [2], 'c': [3, 'foo'], 'd': [2], 'e': [57]}
I have a very simple solution without any imports.
I use the dict.update() method.
Sadly, it overwrites: if the same key appears in more than one dictionary, the value from the most recently merged dictionary appears in the output.
dict1 = {'Name': 'Zara', 'Age': 7}
dict2 = {'Sex': 'female' }
dict3 = {'Status': 'single', 'Age': 27}
dict4 = {'Occupation':'nurse', 'Wage': 3000}
def mergedict(*args):
    output = {}
    for arg in args:
        output.update(arg)
    return output
print(mergedict(dict1, dict2, dict3, dict4))
The output is this:
{'Name': 'Zara', 'Age': 27, 'Sex': 'female', 'Status': 'single', 'Occupation': 'nurse', 'Wage': 3000}
Perhaps a more modern and concise approach for those who use python 3.3 or later versions is the use of ChainMap from the collections module.
from collections import ChainMap
d1 = {'a': 1, 'b': 3}
d2 = {'c': 2}
d3 = {'d': 7, 'a': 9}
d4 = {}
combo = dict(ChainMap(d1, d2, d3, d4))
# {'d': 7, 'a': 1, 'c': 2, 'b': 3}
For a larger collection of dict objects, the star operator works:
dict(ChainMap(*dict_collection))
Note that the resulting dictionary seems to only keep the value of the first key it encounters in the ordered collection and ignores any further duplicates.
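That precedence is the opposite of dict.update() and dictionary unpacking, where the last value wins. The following sketch contrasts the two using part of the data from this answer; the names first_wins and last_wins are chosen for illustration.

```python
from collections import ChainMap

d1 = {'a': 1, 'b': 3}
d3 = {'d': 7, 'a': 9}

# ChainMap searches its maps left to right, so the first 'a' is kept.
first_wins = dict(ChainMap(d1, d3))

# Unpacking (like update) applies the dicts in order, so the last 'a' is kept.
last_wins = {**d1, **d3}

print(first_wins['a'])  # 1
print(last_wins['a'])   # 9
```

Neither variant collects conflicting values into a list, so they only fit when overwriting is acceptable.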
This may be a bit more elegant:
super_dict = {}
for d in dicts:
    for k, v in d.iteritems():  # d.items() in Python 3+
        l = super_dict.setdefault(k, [])
        if v not in l:
            l.append(v)
UPDATE: made change suggested by Sven
UPDATE: changed to avoid duplicates (thanks Marcin and Steven)
Never forget that the standard libraries have a wealth of tools for dealing with dicts and iteration:
from itertools import chain
from collections import defaultdict
super_dict = defaultdict(list)
for k, v in chain.from_iterable(d.iteritems() for d in dicts):  # d.items() in Python 3+
    if v not in super_dict[k]:
        super_dict[k].append(v)
Note that the if v not in super_dict[k] can be avoided by using defaultdict(set) as per Steven Rumbalski's answer.
If you assume that the keys in which you are interested are at the same nested level, you can recursively traverse each dictionary and create a new dictionary using that key, effectively merging them.
merged = {}
def walk(d, merge):
    # Recurse into nested dicts; collect every non-dict leaf in a list.
    for key, item in d.items():
        if isinstance(item, dict):
            walk(item, merge.setdefault(key, {}))
        else:
            merge.setdefault(key, []).append(item)
for d in dicts:
    walk(d, merged)
For example, say you have the following dictionaries you want to merge.
dicts = [{'A': {'A1': {'FOO': [1,2,3]}}},
{'A': {'A1': {'A2': {'BOO': [4,5,6]}}}},
{'A': {'A1': {'FOO': [7,8]}}},
{'B': {'B1': {'COO': [9]}}},
{'B': {'B2': {'DOO': [10,11,12]}}},
{'C': {'C1': {'C2': {'POO':[13,14,15]}}}},
{'C': {'C1': {'ROO': [16,17]}}}]
Using the key at each level, you should get something like this:
{'A': {'A1': {'FOO': [[1, 2, 3], [7, 8]],
'A2': {'BOO': [[4, 5, 6]]}}},
'B': {'B1': {'COO': [[9]]},
'B2': {'DOO': [[10, 11, 12]]}},
'C': {'C1': {'C2': {'POO': [[13, 14, 15]]},
'ROO': [[16, 17]]}}}
Note: I assume the leaf at each branch is a list of some kind, but you can obviously change the logic to do whatever is necessary for your situation.
This is a more recent enhancement over the prior answer by ElbowPipe, using newer syntax introduced in Python 3.9 for merging dictionaries. Note that this answer does not merge conflicting values into a list!
>>> import functools
>>> import operator
>>> functools.reduce(operator.or_, [{0:1}, {2:3, 4:5}, {2:6}])
{0: 1, 2: 6, 4: 5}
For a oneliner, the following could be used:
{key: {d[key] for d in dicts if key in d} for key in {key for d in dicts for key in d}}
although readability would benefit from naming the combined key set:
combined_key_set = {key for d in dicts for key in d}
super_dict = {key: {d[key] for d in dicts if key in d} for key in combined_key_set}
Elegance can be debated but personally I prefer comprehensions over for loops. :)
(The dictionary and set comprehensions are available in Python 2.7/3.1 and newer.)
Python 3.x (reduce is a builtin in Python 2.x, so no import is needed there):
import operator
from functools import reduce
a = [{'a': 1}, {'b': 2}, {'c': 3, 'd': 4}]
dict(reduce(operator.add, map(list, map(dict.items, a))))
# map(dict.items, a)        -> one view of (key, value) pairs per dict
# map(list, ...)            -> iterator equivalent of [[('a', 1)], [('b', 2)], [('c', 3), ('d', 4)]]
# reduce(operator.add, ...) -> concatenates the lists into a single list of pairs
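My solution is similar to the one @senderle proposed, but instead of a for loop I used map (Python 2 only; in Python 3 map is lazy, so the calls never run unless the map objects are consumed):
from collections import defaultdict
super_dict = defaultdict(set)
map(lambda y: map(lambda x: super_dict[x].add(y[x]), y), dicts)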
The use of defaultdict is good, this also can be done with the use of itertools.groupby.
import itertools
# output all dict items, and sort them by key
dicts_ele = sorted( ( item for d in dicts for item in d.items() ), key = lambda x: x[0] )
# groups items by key
ele_groups = itertools.groupby( dicts_ele, key = lambda x: x[0] )
# iterates over groups and get item value
merged = { k: set( v[1] for v in grouped ) for k, grouped in ele_groups }
and obviously, you can merge this block of code into a single expression:
merged = {
    k: set( v[1] for v in grouped )
    for k, grouped in (
        itertools.groupby(
            sorted(
                ( item for d in dicts for item in d.items() ),
                key = lambda x: x[0]
            ),
            key = lambda x: x[0]
        )
    )
}
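The sort step in this approach is required, not cosmetic: itertools.groupby only groups consecutive items with equal keys. A short sketch with made-up sample pairs shows what happens without it.

```python
from itertools import groupby

pairs = [('a', 1), ('b', 2), ('a', 3)]  # illustrative (key, value) pairs

def first(pair):
    return pair[0]

# Without sorting, 'a' shows up as two separate groups.
unsorted_keys = [k for k, _ in groupby(pairs, key=first)]

# After sorting by key, each key forms exactly one group.
sorted_keys = [k for k, _ in groupby(sorted(pairs, key=first), key=first)]

print(unsorted_keys)  # ['a', 'b', 'a']
print(sorted_keys)    # ['a', 'b']
```

In a dict comprehension the second 'a' group would silently replace the first, so values would be lost rather than merged.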
I'm a bit late to the game but I did it in 2 lines with no dependencies beyond python itself:
flatten = lambda *c: (b for a in c for b in (flatten(*a) if isinstance(a, (tuple, list)) else (a,)))
o = reduce(lambda d1,d2: dict((k, list(flatten([d1.get(k), d2.get(k)]))) for k in set(d1.keys() + d2.keys())), dicts)
# output:
# {'a': [1, 1, None], 'c': [3, 'foo', 3], 'b': [2, None, None], 'e': [None, 57], 'd': [None, 2, None]}
Though if you don't care about nested lists, then:
o2 = reduce(lambda d1,d2: dict((k, [d1.get(k), d2.get(k)]) for k in set(d1.keys() + d2.keys())), dicts)
# output:
# {'a': [[1, 1], None], 'c': [[3, 'foo'], 3], 'b': [[2, None], None], 'e': [None, 57], 'd': [[None, 2], None]}
