How can I perform set operations on Python dictionaries? - python

While it is incredibly useful to be able to do set operations between the keys of a dictionary, I often wish that I could perform the set operations on the dictionaries themselves.
I found some recipes for taking the difference of two dictionaries but I found those to be quite verbose and felt there must be more pythonic answers.

tl;dr Recipe: {k:d1.get(k, k in d1 or d2[k]) for k in set(d1) | set(d2)} and | can be replaced with any other set operator.
Based #torek's comment, another recipe that might be easier to remember (while being fully general) is: {k:d1.get(k,d2.get(k)) for k in set(d1) | set(d2)}.
Full answer below:
My first answer didn't deal correctly with values that evaluated to False. Here's an improved version which deals with Falsey values:
>>> d1 = {'one':1, 'both':3, 'falsey_one':False, 'falsey_both':None}
>>> d2 = {'two':2, 'both':30, 'falsey_two':None, 'falsey_both':False}
>>>
>>> print "d1 - d2:", {k:d1[k] for k in d1 if k not in d2} # 0
d1 - d2: {'falsey_one': False, 'one': 1}
>>> print "d2 - d1:", {k:d2[k] for k in d2 if k not in d1} # 1
d2 - d1: {'falsey_two': None, 'two': 2}
>>> print "intersection:", {k:d1[k] for k in d1 if k in d2} # 2
intersection: {'both': 3, 'falsey_both': None}
>>> print "union:", {k:d1.get(k, k in d1 or d2[k]) for k in set(d1) | set(d2)} # 3
union: {'falsey_one': False, 'falsey_both': None, 'both': 3, 'two': 2, 'one': 1, 'falsey_two': None}
The version for union is the most general and can be turned into a function:
>>> def dict_ops(d1, d2, setop):
... """Apply set operation `setop` to dictionaries d1 and d2
...
... Note: In cases where values are present in both d1 and d2, the value from
... d1 will be used.
... """
... return {k:d1.get(k,k in d1 or d2[k]) for k in setop(set(d1), set(d2))}
...
>>> print "d1 - d2:", dict_ops(d1, d2, lambda x,y: x-y)
d1 - d2: {'falsey_one': False, 'one': 1}
>>> print "d2 - d1:", dict_ops(d1, d2, lambda x,y: y-x)
d2 - d1: {'falsey_two': None, 'two': 2}
>>> import operator as op
>>> print "intersection:", dict_ops(d1, d2, op.and_)
intersection: {'both': 3, 'falsey_both': None}
>>> print "union:", dict_ops(d1, d2, op.or_)
union: {'falsey_one': False, 'falsey_both': None, 'both': 3, 'two': 2, 'one': 1, 'falsey_two': None}
Where items are in both dictionaries, the value from d1 will be used. Of course we can return the value from d2 instead by changing the order of the function arguments.
>>> print "union:", dict_ops(d2, d1, op.or_)
union: {'both': 30, 'falsey_two': None, 'falsey_one': False, 'two': 2, 'one': 1, 'falsey_both': False}

EDIT: The recipes here don't deal correctly with False values. I've submitted another improved answer.
Here are some recipes I've come up with:
>>> d1 = {'one':1, 'both':3}
>>> d2 = {'two':2, 'both':30}
>>>
>>> print "d1 only:", {k:d1.get(k) or d2[k] for k in set(d1) - set(d2)} # 0
d1 only: {'one': 1}
>>> print "d2 only:", {k:d1.get(k) or d2[k] for k in set(d2) - set(d1)} # 1
d2 only: {'two': 2}
>>> print "in both:", {k:d1.get(k) or d2[k] for k in set(d1) & set(d2)} # 2
in both: {'both': 3}
>>> print "in either:", {k:d1.get(k) or d2[k] for k in set(d1) | set(d2)} # 3
in either: {'both': 3, 'two': 2, 'one': 1}
While the expressions in #0 and #2 could be made simpler, I like the generality of this expression which allows me to copy and paste this recipe everywhere and simply change the set operation at the end to what I require.
Of course we can turn this into a function:
>>> def dict_ops(d1, d2, setop):
... return {k:d1.get(k) or d2[k] for k in setop(set(d1), set(d2))}
...
>>> print "d1 only:", dict_ops(d1, d2, lambda x,y: x-y)
d1 only: {'one': 1}
>>> print "d2 only:", dict_ops(d1, d2, lambda x,y: y-x)
d2 only: {'two': 2}
>>> import operator as op
>>> print "in both:", dict_ops(d1, d2, op.and_)
in both: {'both': 3}
>>> print "in either:", dict_ops(d1, d2, op.or_)
in either: {'both': 3, 'two': 2, 'one': 1}
>>> print "in either:", dict_ops(d2, d1, lambda x,y: x|y)
in either: {'both': 30, 'two': 2, 'one': 1}

This is an old question, but I'd like to highlight my ubelt package (https://github.com/Erotemic/ubelt), which contains a solutions to this problem.
I'm not sure why PEP584 only added union and no other set operations to dictionaries. I'll need to look into it more to see if there is any existing rational for why Python dictionaries do not contain these methods by default (I imagine there must be, I don't see how the devs could not be aware of set operations on dictionaries).
But to the point. Ubelt implements these functions for core dictionary set operations:
ubelt.dict_diff - analog of set.difference
ubelt.dict_union - analog of set.union
ubelt.dict_isect - analog of set.intersection
I hope someday these or something similar are added to Python dictionaries themselves.

Here are some more:
Set addition d1 + d2
{key: value for key, value in d1.items() + d2.items()}
# here values that are present in `d1` are replaced by values in `d2`
Alternatively,
d3 = d1.copy()
d3.update(d2)
Set difference d1 - d2
{key: value for key, value in d1.items() if key not in d2}

Related

Python: Merging multiple dictionaries with the same keys and different values [duplicate]

I have multiple dicts (or sequences of key-value pairs) like this:
d1 = {key1: x1, key2: y1}
d2 = {key1: x2, key2: y2}
How can I efficiently get a result like this, as a new dict?
d = {key1: (x1, x2), key2: (y1, y2)}
See also: How can one make a dictionary with duplicate keys in Python?.
Here's a general solution that will handle an arbitrary amount of dictionaries, with cases when keys are in only some of the dictionaries:
from collections import defaultdict
d1 = {1: 2, 3: 4}
d2 = {1: 6, 3: 7}
dd = defaultdict(list)
for d in (d1, d2): # you can list as many input dicts as you want here
for key, value in d.items():
dd[key].append(value)
print(dd) # result: defaultdict(<type 'list'>, {1: [2, 6], 3: [4, 7]})
assuming all keys are always present in all dicts:
ds = [d1, d2]
d = {}
for k in d1.iterkeys():
d[k] = tuple(d[k] for d in ds)
Note: In Python 3.x use below code:
ds = [d1, d2]
d = {}
for k in d1.keys():
d[k] = tuple(d[k] for d in ds)
and if the dic contain numpy arrays:
ds = [d1, d2]
d = {}
for k in d1.keys():
d[k] = np.concatenate(list(d[k] for d in ds))
This function merges two dicts even if the keys in the two dictionaries are different:
def combine_dict(d1, d2):
return {
k: tuple(d[k] for d in (d1, d2) if k in d)
for k in set(d1.keys()) | set(d2.keys())
}
Example:
d1 = {
'a': 1,
'b': 2,
}
d2` = {
'b': 'boat',
'c': 'car',
}
combine_dict(d1, d2)
# Returns: {
# 'a': (1,),
# 'b': (2, 'boat'),
# 'c': ('car',)
# }
dict1 = {'m': 2, 'n': 4}
dict2 = {'n': 3, 'm': 1}
Making sure that the keys are in the same order:
dict2_sorted = {i:dict2[i] for i in dict1.keys()}
keys = dict1.keys()
values = zip(dict1.values(), dict2_sorted.values())
dictionary = dict(zip(keys, values))
gives:
{'m': (2, 1), 'n': (4, 3)}
If you only have d1 and d2,
from collections import defaultdict
d = defaultdict(list)
for a, b in d1.items() + d2.items():
d[a].append(b)
Here is one approach you can use which would work even if both dictonaries don't have same keys:
d1 = {'a':'test','b':'btest','d':'dreg'}
d2 = {'a':'cool','b':'main','c':'clear'}
d = {}
for key in set(d1.keys() + d2.keys()):
try:
d.setdefault(key,[]).append(d1[key])
except KeyError:
pass
try:
d.setdefault(key,[]).append(d2[key])
except KeyError:
pass
print d
This would generate below input:
{'a': ['test', 'cool'], 'c': ['clear'], 'b': ['btest', 'main'], 'd': ['dreg']}
Using precomputed keys
def merge(dicts):
# First, figure out which keys are present.
keys = set().union(*dicts)
# Build a dict with those keys, using a list comprehension to
# pull the values from the source dicts.
return {
k: [d[k] for d in dicts if k in d]
for k in keys
}
This is essentially Flux's answer, generalized for a list of input dicts.
The set().union trick works by making a set union of the keys in all the source dictionaries. The union method on a set (we start with an empty one) can accept an arbitrary number of arguments, and make a union of each input with the original set; and it can accept other iterables (it does not require other sets for the arguments) - it will iterate over them and look for all unique elements. Since iterating over a dict yields its keys, they can be passed directly to the union method.
In the case where the keys of all inputs are known to be the same, this can be simplified: the keys can be hard-coded (or inferred from one of the inputs), and the if check in the list comprehension becomes unnecessary:
def merge(dicts):
return {
k: [d[k] for d in dicts]
for k in dicts[0].keys()
}
This is analogous to blubb's answer, but using a dict comprehension rather than an explicit loop to build the final result.
We could also try something like Mahdi Ghelichi's answer:
def merge(dicts):
values = zip(*(d.values() for d in ds))
return dict(zip(dicts[0].keys(), values))
This should work in Python 3.5 and below: dicts with identical keys will store them in the same order, during the same run of the program (if you run the program again, you may get a different ordering, but still a consistent one).
In 3.6 and above, dictionaries preserve their insertion order (though they are only guaranteed to do so by the specification in 3.7 and above). Thus, input dicts could have the same keys in a different order, which would cause the first zip to combine the wrong values.
We can work around this by "sorting" the input dicts (re-creating them with keys in a consistent order, like [{k:d[k] for k in dicts[0].keys()} for d in dicts]. (In older versions, this would be extra work with no net effect.) However, this adds complexity, and this double-zip approach really doesn't offer any advantages over the previous one using a dict comprehension.
Building the result explicitly, discovering keys on the fly
As in Eli Bendersky's answer, but as a function:
from collections import defaultdict
def merge(dicts):
result = defaultdict(list)
for d in dicts:
for key, value in d.items():
result[key].append(value)
return result
This will produce a defaultdict, a subclass of dict defined by the standard library. The equivalent code using only built-in dicts might look like:
def merge(dicts):
result = {}
for d in dicts:
for key, value in d.items():
result.setdefault(key, []).append(value)
return result
Using other container types besides lists
The precomputed-key approach will work fine to make tuples; replace the list comprehension [d[k] for d in dicts if k in d] with tuple(d[k] for d in dicts if k in d). This passes a generator expression to the tuple constructor. (There is no "tuple comprehension".)
Since tuples are immutable and don't have an append method, the explicit loop approach should be modified by replacing .append(value) with += (value,). However, this may perform poorly if there is a lot of key duplication, since it must create a new tuple each time. It might be better to produce lists first and then convert the final result with something like {k: tuple(v) for (k, v) in merged.items()}.
Similar modifications can be made to get sets (although there is a set comprehension, using {}), Numpy arrays etc. For example, we can generalize both approaches with a container type like so:
def merge(dicts, value_type=list):
# First, figure out which keys are present.
keys = set().union(*dicts)
# Build a dict with those keys, using a list comprehension to
# pull the values from the source dicts.
return {
k: value_type(d[k] for d in dicts if k in d)
for k in keys
}
and
from collections import defaultdict
def merge(dicts, value_type=list):
# We stick with hard-coded `list` for the first part,
# because even other mutable types will offer different interfaces.
result = defaultdict(list)
for d in dicts:
for key, value in d.items():
result[key].append(value)
# This is redundant for the default case, of course.
return {k:value_type(v) for (k, v) in result}
If the input values are already sequences
Rather than wrapping the values from the source in a new list, often people want to take inputs where the values are all already lists, and concatenate those lists in the output (or concatenate tuples or 1-dimensional Numpy arrays, combine sets, etc.).
This is still a trivial modification. For precomputed keys, use a nested list comprehension, ordered to get a flat result:
def merge(dicts):
keys = set().union(*dicts)
return {
k: [v for d in dicts if k in d for v in d[k]]
# Alternately:
# k: [v for d in dicts for v in d.get(k, [])]
for k in keys
}
One might instead think of using sum to concatenate results from the original list comprehension. Don't do this - it will perform poorly when there are a lot of duplicate keys. The built-in sum isn't optimized for sequences (and will explicitly disallow "summing" strings) and will try to create a new list with each addition internally.
With the explicit loop approach, use .extend instead of .append:
from collections import defaultdict
def merge(dicts):
result = defaultdict(list)
for d in dicts:
for key, value in d.items():
result[key].extend(value)
return result
The extend method of lists accepts any iterable, so this will work with inputs that have tuples for the values - of course, it still uses lists in the output; and of course, those can be converted back as shown previously.
If the inputs have one item each
A common version of this problem involves input dicts that each have a single key-value pair. Alternately, the input might be (key, value) tuples (or lists).
The above approaches will still work, of course. For tuple inputs, converting them to dicts first, like [{k:v} for (k, v) in tuples], allows for using the directly. Alternately, the explicit iteration approach can be modified to accept the tuples directly, like in Victoria Stuart's answer:
from collections import defaultdict
def merge(pairs):
result = defaultdict(list)
for key, value in pairs:
result[key].extend(value)
return result
(The code was simplified because there is no need to iterate over key-value pairs when there is only one of them and it has been provided directly.)
However, for these single-item cases it may work better to sort the values by key and then use itertools.groupby. In this case, it will be easier to work with the tuples. That looks like:
from itertools import groupby
def merge(tuples):
grouped = groupby(tuples, key=lambda t: t[0])
return {k: [kv[1] for kv in ts] for k, ts in grouped}
Here, t is used as a name for one of the tuples from the input. The grouped iterator will provide pairs of a "key" value k (the first element that was common to the tuples being grouped) and an iterator ts over the tuples in that group. Then we extract the values from the key-value pairs kv in the ts, make a list from those, and use that as the value for the k key in the resulting dict.
To merge one-item dicts this way, of course, convert them to tuples first. One simple way to do this, for a list of one-item dicts, is [next(iter(d.items())) for d in dicts].
Assuming there are two dictionaries with exact same keys, below is the most succinct way of doing it (python3 should be used for both the solution).
d1 = {'a': 1, 'b': 2, 'c':3}
d2 = {'a': 5, 'b': 6, 'c':7}
# get keys from one of the dictionary
ks = [k for k in d1.keys()]
print(ks)
['a', 'b', 'c']
# call values from each dictionary on available keys
d_merged = {k: (d1[k], d2[k]) for k in ks}
print(d_merged)
{'a': (1, 5), 'b': (2, 6), 'c': (3, 7)}
# to merge values as list
d_merged = {k: [d1[k], d2[k]] for k in ks}
print(d_merged)
{'a': [1, 5], 'b': [2, 6], 'c': [3, 7]}
If there are two dictionaries with some common keys, but a few different keys, a list of all the keys should be prepared.
d1 = {'a': 1, 'b': 2, 'c':3, 'd': 9}
d2 = {'a': 5, 'b': 6, 'c':7, 'e': 4}
# get keys from one of the dictionary
d1_ks = [k for k in d1.keys()]
d2_ks = [k for k in d2.keys()]
all_ks = set(d1_ks + d2_ks)
print(all_ks)
['a', 'b', 'c', 'd', 'e']
# call values from each dictionary on available keys
d_merged = {k: [d1.get(k), d2.get(k)] for k in all_ks}
print(d_merged)
{'d': [9, None], 'a': [1, 5], 'b': [2, 6], 'c': [3, 7], 'e': [None, 4]}
There is a great library funcy doing what you need in a just one, short line.
from funcy import join_with
from pprint import pprint
d1 = {"key1": "x1", "key2": "y1"}
d2 = {"key1": "x2", "key2": "y2"}
list_of_dicts = [d1, d2]
merged_dict = join_with(tuple, list_of_dicts)
pprint(merged_dict)
Output:
{'key1': ('x1', 'x2'), 'key2': ('y1', 'y2')}
More info here: funcy -> join_with.
def merge(d1, d2, merge):
result = dict(d1)
for k,v in d2.iteritems():
if k in result:
result[k] = merge(result[k], v)
else:
result[k] = v
return result
d1 = {'a': 1, 'b': 2}
d2 = {'a': 1, 'b': 3, 'c': 2}
print merge(d1, d2, lambda x, y:(x,y))
{'a': (1, 1), 'c': 2, 'b': (2, 3)}
If keys are nested:
d1 = { 'key1': { 'nkey1': 'x1' }, 'key2': { 'nkey2': 'y1' } }
d2 = { 'key1': { 'nkey1': 'x2' }, 'key2': { 'nkey2': 'y2' } }
ds = [d1, d2]
d = {}
for k in d1.keys():
for k2 in d1[k].keys():
d.setdefault(k, {})
d[k].setdefault(k2, [])
d[k][k2] = tuple(d[k][k2] for d in ds)
yields:
{'key1': {'nkey1': ('x1', 'x2')}, 'key2': {'nkey2': ('y1', 'y2')}}
Modifying this answer to create a dictionary of tuples (what the OP asked for), instead of a dictionary of lists:
from collections import defaultdict
d1 = {1: 2, 3: 4}
d2 = {1: 6, 3: 7}
dd = defaultdict(tuple)
for d in (d1, d2): # you can list as many input dicts as you want here
for key, value in d.items():
dd[key] += (value,)
print(dd)
The above prints the following:
defaultdict(<class 'tuple'>, {1: (2, 6), 3: (4, 7)})
d1 ={'B': 10, 'C ': 7, 'A': 20}
d2 ={'B': 101, 'Y ': 7, 'X': 8}
d3 ={'A': 201, 'Y ': 77, 'Z': 8}
def CreateNewDictionaryAssemblingAllValues1(d1,d2,d3):
aa = {
k :[d[k] for d in (d1,d2,d3) if k in d ] for k in set(d1.keys() | d2.keys() | d3.keys() )
}
aap = print(aa)
return aap
CreateNewDictionaryAssemblingAllValues1(d1, d2, d3)
"""
Output :
{'X': [8], 'C ': [7], 'Y ': [7, 77], 'Z': [8], 'B': [10, 101], 'A': [20, 201]}
"""
From blubb answer:
You can also directly form the tuple using values from each list
ds = [d1, d2]
d = {}
for k in d1.keys():
d[k] = (d1[k], d2[k])
This might be useful if you had a specific ordering for your tuples
ds = [d1, d2, d3, d4]
d = {}
for k in d1.keys():
d[k] = (d3[k], d1[k], d4[k], d2[k]) #if you wanted tuple in order of d3, d1, d4, d2
Using below method we can merge two dictionaries having same keys.
def update_dict(dict1: dict, dict2: dict) -> dict:
output_dict = {}
for key in dict1.keys():
output_dict.update({key: []})
if type(dict1[key]) != str:
for value in dict1[key]:
output_dict[key].append(value)
else:
output_dict[key].append(dict1[key])
if type(dict2[key]) != str:
for value in dict2[key]:
output_dict[key].append(value)
else:
output_dict[key].append(dict2[key])
return output_dict
Input: d1 = {key1: x1, key2: y1} d2 = {key1: x2, key2: y2}
Output: {'key1': ['x1', 'x2'], 'key2': ['y1', 'y2']}
dicts = [dict1,dict2,dict3]
out = dict(zip(dicts[0].keys(),[[dic[list(dic.keys())[key]] for dic in dicts] for key in range(0,len(dicts[0]))]))
A compact possibility
d1={'a':1,'b':2}
d2={'c':3,'d':4}
context={**d1, **d2}
context
{'b': 2, 'c': 3, 'd': 4, 'a': 1}

addition of more than one dict in python

i have 6 dict like this
dict1
dict2
dict3
dict4
dict5
dict6
now I want all of this in one dict. so I used this
dict1.update({'dict2':dict2})
dict3.update({'dict1':dict1})
dict4.update({'dict4':dict3})
dict5.update({'dict5':dict4})
dict6.update({'dict6':dict5})
at last dict6 contains all value but it's not formatted correctly and it's not the pythonic way to do it
I want to improve this any suggestions
right now I'm getting like this but I don't want like this
{"main_responses": {"dict1": {"dict2": {"dict3": {"dict4": {"dict5": {}}}}}}}
i want
{"main_responses":{ "dict1": {dict1_values}, "dict2": {dict2_values}..... and so on
Try this:
from itertools import chain
d = chain.from_iterable(d.items() for d in (ada_dict,
wordpress_version_dict,
drupal_version_dict,
ssl_dict,
link_dict,
tag_dict))
api_response = {'api_response':d}
Or this, using reduce:
d = reduce(lambda x,y: dict(x, **y), (ada_dict,
wordpress_version_dict,
drupal_version_dict,
ssl_dict,
link_dict,
tag_dict))
api_response = {'api_response':d}
Giving a very similar example of my own based on your requirements:
>>> d1 = {'a':1}
>>> d2 = {'b':2}
>>> d3 = {'c':3}
>>> d4 = {'d':4}
#magic happens here
>>> d = {'d1':d1 , 'd2':d2, 'd3':d3, 'd4':d4}
>>> d
=> {'d1': {'a': 1}, 'd2': {'b': 2}, 'd3': {'c': 3}, 'd4': {'d': 4}}
Since you do not have all the dictionaries that you want added in one place, this is about as easy as it gets.
In case you want to add another key to your collection of all dictionaries (d here), do:
>>> out = {'api_responses': d}
#or in one step if you do not want to use `d`
>>> out = {'api_responses': {'d1':d1 , 'd2':d2, 'd3':d3, 'd4':d4}}
>>> out
=> {'api_responses': {'d1': {'a': 1}, 'd2': {'b': 2}, 'd3': {'c': 3}, 'd4': {'d': 4}}}
If you want add all dict in a single one "newDict", Be carrefull if several keys exist in multiple Dict :
ada_dict={'k1':'v1'}
wordpress_version_dict={'k2':'v2'}
drupal_version_dict={'k3':'v3'}
ssl_dict={'k4':'v4'}
link_dict={'k5':'v5'}
tag_dict={'k5':'v5'}
newDict={}
newDict.update( (k,v) for k,v in ada_dict.iteritems() if v is not None)
newDict.update( (k,v) for k,v in wordpress_version_dict.iteritems() if v is not None)
newDict.update( (k,v) for k,v in drupal_version_dict.iteritems() if v is not None)
newDict.update( (k,v) for k,v in ssl_dict.iteritems() if v is not None)
newDict.update( (k,v) for k,v in link_dict.iteritems() if v is not None)
newDict.update( (k,v) for k,v in tag_dict.iteritems() if v is not None)
print {'api_response':newDict}
https://repl.it/ND3p/1
please add more comment to your post, but as I see you need something like , is not it ?
>>> foo=lambda dest, src, tag: [dest.update({tag:i}) for i in src]
>>> x={1:1}
>>> y={2:2}
>>> foo(x, y, "tag")
[None]
>>> x
{1: 1, 'tag': 2}
>>> y
{2: 2}
Does this help in getting the output you want?
new_dict=dict(ada_dict.items()+wordpress_version_dict.items() +drupal_version_dict.items()+ssl_dict.items()+link_dict.items()+tag_dict.items())
a =[ada_dict,
wordpress_version_dict,
drupal_version_dict,
ssl_dict,
link_dict,
tag_dict]
c= {}
for i in a:
c.update({i:i})
print c
{'ada_dict': 'ada_dict',
'drupal_version_dict': 'drupal_version_dict',
'link_dict': 'link_dict',
'ssl_dict': 'ssl_dict',
'tag_dict': 'tag_dict',
'wordpress_version_dic': 'wordpress_version_dic'}
d={}
d.update({"api_responses":c})
print d
{'api_responses': {'ada_dict': 'ada_dict',
'drupal_version_dict': 'drupal_version_dict',
'link_dict': 'link_dict',
'ssl_dict': 'ssl_dict',
'tag_dict': 'tag_dict',
'wordpress_version_dic': 'wordpress_version_dic'}}

member-wise operations on two dicts

How would you implement operations like element-wise max, min, avg, +, etc. for dictionaries?
As an example:
def max_dict(d1, d2):
''' #return the maximum member value for each key
>>> a = {3: 4, 4: 7}; b = {3: 5, 4: 6}; max_dict(a, b)
{3: 5, 4: 7}
'''
out = {}
for (k, v) in d1.iteritems():
out[k] = max(v, d2[k])
return out
Is this a sensible way to do it? Is there some built-in function to simplify this? Should +, avg, etc. be implemented similarly?
(The tutorial and the library reference did not contain obvious answers)
Is this a sensible way to do it?
Yeah. I'd probably use a dict comprehension, but the loop works fine.
Is there some built-in function to simplify this?
No.
Should +, avg, etc. be implemented similarly?
You can take the function to apply as an argument:
def dict_elementwise(func, d1, d2):
return {k: func(d1[k], d2[k]) for k in d1}
Like this:
d1 = {1:2, 3:4}
d2 = {1:-8, 3:10}
print {k:min(v, d2[k]) for k,v in d1.items()}
This uses Python's dictionary comprehension syntax.
If the dictionaries have distinct keys:
d3 = {1:-8, 3:10}
print {k:min(v, d3[k]) for k,v in d1.items() if k in d3}
will find produce values that are the intersection of d1 and d3's keys.
So, generically:
def map_values_for_keys(func, d1, d2):
"""
Return a dictionary of k:func(d1[k], d2[k]) for each key k
defined in both dictionaries d1 and d2.
"""
return {func(v, d2[k]) for k,v in d1.items() if k in d2}
... although, to be honest, I actually prefer not defining a function for this case since dictionary comprehensions are both simple to read (with practice) and unambiguous to the reader.
This one is generalizable to an arbitrary number of dicts, any function you can apply to a list, and dicts with different keys:
def agg_dicts(func, *args):
keys = []
for d in args:
keys += list(d.keys())
keys = set(keys)
out = {}
for key in keys:
vals = [d[key] for d in args if key in d.keys()]
out[key] = func(vals)
return out
from numpy import mean
a = {'one':1, 'two':4, 'four':2}
b = {'one':2, 'two':2}
c = {'two':4, 'four':2, 'five':5}
dict_min = agg_dicts(min, a, b, c)
dict_max = agg_dicts(max, a, b)
dict_avg = agg_dicts(mean, a, b, c)
dict_sum = agg_dicts(sum, a, b, c)
dict_max
{'four': 2, 'one': 2, 'two': 4}
dict_min
{'five': 5, 'four': 2, 'one': 1, 'two': 2}
dict_max
{'four': 2, 'one': 2, 'two': 4}
dict_avg
{'five': 5.0, 'four': 2.0, 'one': 1.5, 'two': 3.3333333333333335}
dict_sum
{'five': 5, 'four': 4, 'one': 3, 'two': 10}

Python - Linking three dictionaries into one if keys are same [duplicate]

I have multiple dicts (or sequences of key-value pairs) like this:
d1 = {key1: x1, key2: y1}
d2 = {key1: x2, key2: y2}
How can I efficiently get a result like this, as a new dict?
d = {key1: (x1, x2), key2: (y1, y2)}
See also: How can one make a dictionary with duplicate keys in Python?.
Here's a general solution that will handle an arbitrary amount of dictionaries, with cases when keys are in only some of the dictionaries:
from collections import defaultdict
d1 = {1: 2, 3: 4}
d2 = {1: 6, 3: 7}
dd = defaultdict(list)
for d in (d1, d2): # you can list as many input dicts as you want here
for key, value in d.items():
dd[key].append(value)
print(dd) # result: defaultdict(<type 'list'>, {1: [2, 6], 3: [4, 7]})
assuming all keys are always present in all dicts:
ds = [d1, d2]
d = {}
for k in d1.iterkeys():
d[k] = tuple(d[k] for d in ds)
Note: In Python 3.x use below code:
ds = [d1, d2]
d = {}
for k in d1.keys():
d[k] = tuple(d[k] for d in ds)
and if the dic contain numpy arrays:
ds = [d1, d2]
d = {}
for k in d1.keys():
d[k] = np.concatenate(list(d[k] for d in ds))
This function merges two dicts even if the keys in the two dictionaries are different:
def combine_dict(d1, d2):
return {
k: tuple(d[k] for d in (d1, d2) if k in d)
for k in set(d1.keys()) | set(d2.keys())
}
Example:
d1 = {
'a': 1,
'b': 2,
}
d2` = {
'b': 'boat',
'c': 'car',
}
combine_dict(d1, d2)
# Returns: {
# 'a': (1,),
# 'b': (2, 'boat'),
# 'c': ('car',)
# }
dict1 = {'m': 2, 'n': 4}
dict2 = {'n': 3, 'm': 1}
Making sure that the keys are in the same order:
dict2_sorted = {i:dict2[i] for i in dict1.keys()}
keys = dict1.keys()
values = zip(dict1.values(), dict2_sorted.values())
dictionary = dict(zip(keys, values))
gives:
{'m': (2, 1), 'n': (4, 3)}
If you only have d1 and d2,
from collections import defaultdict
d = defaultdict(list)
for a, b in d1.items() + d2.items():
d[a].append(b)
Here is one approach you can use which would work even if both dictonaries don't have same keys:
d1 = {'a':'test','b':'btest','d':'dreg'}
d2 = {'a':'cool','b':'main','c':'clear'}
d = {}
for key in set(d1.keys() + d2.keys()):
try:
d.setdefault(key,[]).append(d1[key])
except KeyError:
pass
try:
d.setdefault(key,[]).append(d2[key])
except KeyError:
pass
print d
This would generate below input:
{'a': ['test', 'cool'], 'c': ['clear'], 'b': ['btest', 'main'], 'd': ['dreg']}
Using precomputed keys
def merge(dicts):
# First, figure out which keys are present.
keys = set().union(*dicts)
# Build a dict with those keys, using a list comprehension to
# pull the values from the source dicts.
return {
k: [d[k] for d in dicts if k in d]
for k in keys
}
This is essentially Flux's answer, generalized for a list of input dicts.
The set().union trick works by making a set union of the keys in all the source dictionaries. The union method on a set (we start with an empty one) can accept an arbitrary number of arguments, and make a union of each input with the original set; and it can accept other iterables (it does not require other sets for the arguments) - it will iterate over them and look for all unique elements. Since iterating over a dict yields its keys, they can be passed directly to the union method.
In the case where the keys of all inputs are known to be the same, this can be simplified: the keys can be hard-coded (or inferred from one of the inputs), and the if check in the list comprehension becomes unnecessary:
def merge(dicts):
return {
k: [d[k] for d in dicts]
for k in dicts[0].keys()
}
This is analogous to blubb's answer, but using a dict comprehension rather than an explicit loop to build the final result.
We could also try something like Mahdi Ghelichi's answer:
def merge(dicts):
values = zip(*(d.values() for d in ds))
return dict(zip(dicts[0].keys(), values))
This should work in Python 3.5 and below: dicts with identical keys will store them in the same order, during the same run of the program (if you run the program again, you may get a different ordering, but still a consistent one).
In 3.6 and above, dictionaries preserve their insertion order (though they are only guaranteed to do so by the specification in 3.7 and above). Thus, input dicts could have the same keys in a different order, which would cause the first zip to combine the wrong values.
We can work around this by "sorting" the input dicts (re-creating them with keys in a consistent order, like [{k:d[k] for k in dicts[0].keys()} for d in dicts]. (In older versions, this would be extra work with no net effect.) However, this adds complexity, and this double-zip approach really doesn't offer any advantages over the previous one using a dict comprehension.
Building the result explicitly, discovering keys on the fly
As in Eli Bendersky's answer, but as a function:
from collections import defaultdict
def merge(dicts):
result = defaultdict(list)
for d in dicts:
for key, value in d.items():
result[key].append(value)
return result
This will produce a defaultdict, a subclass of dict defined by the standard library. The equivalent code using only built-in dicts might look like:
def merge(dicts):
result = {}
for d in dicts:
for key, value in d.items():
result.setdefault(key, []).append(value)
return result
Using other container types besides lists
The precomputed-key approach will work fine to make tuples; replace the list comprehension [d[k] for d in dicts if k in d] with tuple(d[k] for d in dicts if k in d). This passes a generator expression to the tuple constructor. (There is no "tuple comprehension".)
Since tuples are immutable and don't have an append method, the explicit loop approach should be modified by replacing .append(value) with += (value,). However, this may perform poorly if there is a lot of key duplication, since it must create a new tuple each time. It might be better to produce lists first and then convert the final result with something like {k: tuple(v) for (k, v) in merged.items()}.
Similar modifications can be made to get sets (although there is a set comprehension, using {}), Numpy arrays etc. For example, we can generalize both approaches with a container type like so:
def merge(dicts, value_type=list):
# First, figure out which keys are present.
keys = set().union(*dicts)
# Build a dict with those keys, using a list comprehension to
# pull the values from the source dicts.
return {
k: value_type(d[k] for d in dicts if k in d)
for k in keys
}
and
from collections import defaultdict
def merge(dicts, value_type=list):
# We stick with hard-coded `list` for the first part,
# because even other mutable types will offer different interfaces.
result = defaultdict(list)
for d in dicts:
for key, value in d.items():
result[key].append(value)
# This is redundant for the default case, of course.
return {k:value_type(v) for (k, v) in result}
If the input values are already sequences
Rather than wrapping the values from the source in a new list, often people want to take inputs where the values are all already lists, and concatenate those lists in the output (or concatenate tuples or 1-dimensional Numpy arrays, combine sets, etc.).
This is still a trivial modification. For precomputed keys, use a nested list comprehension, ordered to get a flat result:
def merge(dicts):
keys = set().union(*dicts)
return {
k: [v for d in dicts if k in d for v in d[k]]
# Alternately:
# k: [v for d in dicts for v in d.get(k, [])]
for k in keys
}
One might instead think of using sum to concatenate results from the original list comprehension. Don't do this - it will perform poorly when there are a lot of duplicate keys. The built-in sum isn't optimized for sequences (and will explicitly disallow "summing" strings) and will try to create a new list with each addition internally.
With the explicit loop approach, use .extend instead of .append:
from collections import defaultdict
def merge(dicts):
result = defaultdict(list)
for d in dicts:
for key, value in d.items():
result[key].extend(value)
return result
The extend method of lists accepts any iterable, so this will work with inputs that have tuples for the values - of course, it still uses lists in the output; and of course, those can be converted back as shown previously.
If the inputs have one item each
A common version of this problem involves input dicts that each have a single key-value pair. Alternately, the input might be (key, value) tuples (or lists).
The above approaches will still work, of course. For tuple inputs, converting them to dicts first, like [{k:v} for (k, v) in tuples], allows for using the directly. Alternately, the explicit iteration approach can be modified to accept the tuples directly, like in Victoria Stuart's answer:
from collections import defaultdict
def merge(pairs):
result = defaultdict(list)
for key, value in pairs:
result[key].extend(value)
return result
(The code was simplified because there is no need to iterate over key-value pairs when there is only one of them and it has been provided directly.)
However, for these single-item cases it may work better to sort the values by key and then use itertools.groupby. In this case, it will be easier to work with the tuples. That looks like:
from itertools import groupby
def merge(tuples):
grouped = groupby(tuples, key=lambda t: t[0])
return {k: [kv[1] for kv in ts] for k, ts in grouped}
Here, t is used as a name for one of the tuples from the input. The grouped iterator will provide pairs of a "key" value k (the first element that was common to the tuples being grouped) and an iterator ts over the tuples in that group. Then we extract the values from the key-value pairs kv in the ts, make a list from those, and use that as the value for the k key in the resulting dict.
To merge one-item dicts this way, of course, convert them to tuples first. One simple way to do this, for a list of one-item dicts, is [next(iter(d.items())) for d in dicts].
Assuming there are two dictionaries with exact same keys, below is the most succinct way of doing it (python3 should be used for both the solution).
d1 = {'a': 1, 'b': 2, 'c':3}
d2 = {'a': 5, 'b': 6, 'c':7}
# get keys from one of the dictionary
ks = [k for k in d1.keys()]
print(ks)
['a', 'b', 'c']
# call values from each dictionary on available keys
d_merged = {k: (d1[k], d2[k]) for k in ks}
print(d_merged)
{'a': (1, 5), 'b': (2, 6), 'c': (3, 7)}
# to merge values as list
d_merged = {k: [d1[k], d2[k]] for k in ks}
print(d_merged)
{'a': [1, 5], 'b': [2, 6], 'c': [3, 7]}
If there are two dictionaries with some common keys, but a few different keys, a list of all the keys should be prepared.
d1 = {'a': 1, 'b': 2, 'c':3, 'd': 9}
d2 = {'a': 5, 'b': 6, 'c':7, 'e': 4}
# get keys from one of the dictionary
d1_ks = [k for k in d1.keys()]
d2_ks = [k for k in d2.keys()]
all_ks = set(d1_ks + d2_ks)
print(all_ks)
['a', 'b', 'c', 'd', 'e']
# call values from each dictionary on available keys
d_merged = {k: [d1.get(k), d2.get(k)] for k in all_ks}
print(d_merged)
{'d': [9, None], 'a': [1, 5], 'b': [2, 6], 'c': [3, 7], 'e': [None, 4]}
There is a great library funcy doing what you need in a just one, short line.
from funcy import join_with
from pprint import pprint
d1 = {"key1": "x1", "key2": "y1"}
d2 = {"key1": "x2", "key2": "y2"}
list_of_dicts = [d1, d2]
merged_dict = join_with(tuple, list_of_dicts)
pprint(merged_dict)
Output:
{'key1': ('x1', 'x2'), 'key2': ('y1', 'y2')}
More info here: funcy -> join_with.
def merge(d1, d2, merge):
result = dict(d1)
for k,v in d2.iteritems():
if k in result:
result[k] = merge(result[k], v)
else:
result[k] = v
return result
d1 = {'a': 1, 'b': 2}
d2 = {'a': 1, 'b': 3, 'c': 2}
print merge(d1, d2, lambda x, y:(x,y))
{'a': (1, 1), 'c': 2, 'b': (2, 3)}
If keys are nested:
d1 = { 'key1': { 'nkey1': 'x1' }, 'key2': { 'nkey2': 'y1' } }
d2 = { 'key1': { 'nkey1': 'x2' }, 'key2': { 'nkey2': 'y2' } }
ds = [d1, d2]
d = {}
for k in d1.keys():
for k2 in d1[k].keys():
d.setdefault(k, {})
d[k].setdefault(k2, [])
d[k][k2] = tuple(d[k][k2] for d in ds)
yields:
{'key1': {'nkey1': ('x1', 'x2')}, 'key2': {'nkey2': ('y1', 'y2')}}
Modifying this answer to create a dictionary of tuples (what the OP asked for), instead of a dictionary of lists:
from collections import defaultdict
d1 = {1: 2, 3: 4}
d2 = {1: 6, 3: 7}
dd = defaultdict(tuple)
for d in (d1, d2): # you can list as many input dicts as you want here
for key, value in d.items():
dd[key] += (value,)
print(dd)
The above prints the following:
defaultdict(<class 'tuple'>, {1: (2, 6), 3: (4, 7)})
d1 ={'B': 10, 'C ': 7, 'A': 20}
d2 ={'B': 101, 'Y ': 7, 'X': 8}
d3 ={'A': 201, 'Y ': 77, 'Z': 8}
def CreateNewDictionaryAssemblingAllValues1(d1,d2,d3):
aa = {
k :[d[k] for d in (d1,d2,d3) if k in d ] for k in set(d1.keys() | d2.keys() | d3.keys() )
}
aap = print(aa)
return aap
CreateNewDictionaryAssemblingAllValues1(d1, d2, d3)
"""
Output :
{'X': [8], 'C ': [7], 'Y ': [7, 77], 'Z': [8], 'B': [10, 101], 'A': [20, 201]}
"""
From blubb answer:
You can also directly form the tuple using values from each list
ds = [d1, d2]
d = {}
for k in d1.keys():
d[k] = (d1[k], d2[k])
This might be useful if you had a specific ordering for your tuples
ds = [d1, d2, d3, d4]
d = {}
for k in d1.keys():
d[k] = (d3[k], d1[k], d4[k], d2[k]) #if you wanted tuple in order of d3, d1, d4, d2
Using below method we can merge two dictionaries having same keys.
def update_dict(dict1: dict, dict2: dict) -> dict:
output_dict = {}
for key in dict1.keys():
output_dict.update({key: []})
if type(dict1[key]) != str:
for value in dict1[key]:
output_dict[key].append(value)
else:
output_dict[key].append(dict1[key])
if type(dict2[key]) != str:
for value in dict2[key]:
output_dict[key].append(value)
else:
output_dict[key].append(dict2[key])
return output_dict
Input: d1 = {key1: x1, key2: y1} d2 = {key1: x2, key2: y2}
Output: {'key1': ['x1', 'x2'], 'key2': ['y1', 'y2']}
dicts = [dict1,dict2,dict3]
out = dict(zip(dicts[0].keys(),[[dic[list(dic.keys())[key]] for dic in dicts] for key in range(0,len(dicts[0]))]))
A compact possibility
d1={'a':1,'b':2}
d2={'c':3,'d':4}
context={**d1, **d2}
context
{'b': 2, 'c': 3, 'd': 4, 'a': 1}

two dictionaries to make a new one, difficult referencing

This is what my code looks like thus far:
Given dictionaries, d1 and d2, create a new dictionary with the following property: for each entry (a, b) in d1, if there is an entry (b, c) in d2, then the entry (a, c) should be added to the new dictionary.
For example, if d1 is {2:3, 8:19, 6:4, 5:12} and d2 is {2:5, 4:3, 3:9}, then the new dictionary should be {2:9, 6:3}
Associate the new dictionary with the variable  d3
d3 ={}
for i in d1:
for i in d2:
if d1.get(i,default=none) in d2:
d3[d1] = d2.get(i,default = None)
Python can express this beautifully. Too bad it's homework and you probably can't use dict comprehensions or whatever other limitations your "teacher" gives you
>>> d1 = {2:3, 8:19, 6:4, 5:12}
>>> d2 = {2:5, 4:3, 3:9}
>>> d3 = {a:d2[b] for a, b in d1.items() if b in d2}
>>> d3
{2: 9, 6: 3}
For Python2.5 or Python2.6 use a generator expression with dict()
d3 = dict((a, d2[b]) for a, b in d1.items() if b in d2)
For 2.4 see #KP's answer
Or, if you like one-liners:
d3 = dict([(k, d2[v]) for k, v in d1.items() if v in d2])

Categories

Resources