Merging two dictionaries? [duplicate] - python
I want to merge two dictionaries into a new dictionary.
x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}
z = merge(x, y)
>>> z
{'a': 1, 'b': 3, 'c': 4}
Whenever a key k is present in both dictionaries, only the value y[k] should be kept.
How can I merge two Python dictionaries in a single expression?
For dictionaries x and y, their shallowly-merged dictionary z takes values from y, replacing those from x.
In Python 3.9.0 or greater (released 17 October 2020; see PEP 584):
z = x | y
In Python 3.5 or greater:
z = {**x, **y}
In Python 2 (or 3.4 or lower), write a function:
def merge_two_dicts(x, y):
    z = x.copy()    # start with keys and values of x
    z.update(y)     # modifies z with keys and values of y
    return z
and now:
z = merge_two_dicts(x, y)
Explanation
Say you have two dictionaries and you want to merge them into a new dictionary without altering the original dictionaries:
x = {'a': 1, 'b': 2}
y = {'b': 3, 'c': 4}
The desired result is to get a new dictionary (z) with the values merged, and the second dictionary's values overwriting those from the first.
>>> z
{'a': 1, 'b': 3, 'c': 4}
A new syntax for this, proposed in PEP 448 and available as of Python 3.5, is
z = {**x, **y}
And it is indeed a single expression.
Note that we can merge in with literal notation as well:
z = {**x, 'foo': 1, 'bar': 2, **y}
and now:
>>> z
{'a': 1, 'b': 3, 'foo': 1, 'bar': 2, 'c': 4}
It is now shown as implemented in the release schedule for 3.5 (PEP 478), and it has made its way into the What's New in Python 3.5 document.
However, since many organizations are still on Python 2, you may wish to do this in a backward-compatible way. The classically Pythonic way, available in Python 2 and Python 3.0-3.4, is to do this as a two-step process:
z = x.copy()
z.update(y) # which returns None since it mutates z
In both approaches, y will come second and its values will replace x's values, thus b will point to 3 in our final result.
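As a quick sanity check (a minimal sketch using the question's dicts), note that the originals are left untouched:
>>> x = {'a': 1, 'b': 2}
>>> y = {'b': 3, 'c': 4}
>>> z = x.copy()
>>> z.update(y)
>>> z
{'a': 1, 'b': 3, 'c': 4}
>>> x    # x is unmodified; z is a new dict
{'a': 1, 'b': 2}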
Not yet on Python 3.5, but want a single expression
If you are not yet on Python 3.5 or need to write backward-compatible code, and you want this in a single expression, the most performant approach that is still correct is to put it in a function:
def merge_two_dicts(x, y):
    """Given two dictionaries, merge them into a new dict as a shallow copy."""
    z = x.copy()
    z.update(y)
    return z
and then you have a single expression:
z = merge_two_dicts(x, y)
You can also make a function to merge an arbitrary number of dictionaries, from zero to a very large number:
def merge_dicts(*dict_args):
    """
    Given any number of dictionaries, shallow copy and merge into a new dict,
    precedence goes to key-value pairs in latter dictionaries.
    """
    result = {}
    for dictionary in dict_args:
        result.update(dictionary)
    return result
This function will work in Python 2 and 3 for all dictionaries. For example, given dictionaries a to g:
z = merge_dicts(a, b, c, d, e, f, g)
and key-value pairs in g will take precedence over dictionaries a to f, and so on.
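A minimal usage sketch (the dict names here are invented for the demo):
>>> a = {'k': 1, 'only_a': True}
>>> b = {'k': 2}
>>> c = {'k': 3}
>>> merge_dicts(a, b, c)    # latter dicts win on key collisions
{'k': 3, 'only_a': True}
>>> merge_dicts()           # zero arguments gives an empty dict
{}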
Critiques of Other Answers
Don't use what you see in the formerly accepted answer:
z = dict(x.items() + y.items())
In Python 2, you create two lists in memory for each dict, create a third list in memory with length equal to the length of the first two put together, and then discard all three lists to create the dict. In Python 3, this will fail because you're adding two dict_items objects together, not two lists -
>>> c = dict(a.items() + b.items())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'dict_items' and 'dict_items'
and you would have to explicitly create them as lists, e.g. z = dict(list(x.items()) + list(y.items())). This is a waste of resources and computation power.
Similarly, taking the union of items() in Python 3 (viewitems() in Python 2.7) will also fail when values are unhashable objects (like lists, for example). Even if your values are hashable, since sets are semantically unordered, the behavior is undefined in regards to precedence. So don't do this:
>>> c = dict(a.items() | b.items())
This example demonstrates what happens when values are unhashable:
>>> x = {'a': []}
>>> y = {'b': []}
>>> dict(x.items() | y.items())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'
Here's an example where y should have precedence, but instead the value from x is retained due to the arbitrary order of sets:
>>> x = {'a': 2}
>>> y = {'a': 1}
>>> dict(x.items() | y.items())
{'a': 2}
Another hack you should not use:
z = dict(x, **y)
This uses the dict constructor and is very fast and memory-efficient (even slightly more so than our two-step process) but unless you know precisely what is happening here (that is, the second dict is being passed as keyword arguments to the dict constructor), it's difficult to read, it's not the intended usage, and so it is not Pythonic.
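To make the mechanism concrete, here is a small sketch showing that dict(x, **y) simply hands y's pairs to the dict constructor as keyword arguments:
>>> x = {'a': 1, 'b': 2}
>>> y = {'b': 3, 'c': 4}
>>> dict(x, **y) == dict(x, b=3, c=4)    # **y unpacks into keywords
True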
Here's an example of the usage being remediated in django.
Dictionaries are intended to take hashable keys (e.g. frozensets or tuples), but this method fails in Python 3 when keys are not strings.
>>> c = dict(a, **b)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: keyword arguments must be strings
From the mailing list, Guido van Rossum, the creator of the language, wrote:
I am fine with
declaring dict({}, **{1:3}) illegal, since after all it is abuse of
the ** mechanism.
and
Apparently dict(x, **y) is going around as "cool hack" for "call
x.update(y) and return x". Personally, I find it more despicable than
cool.
It is my understanding (as well as the understanding of the creator of the language) that the intended usage for dict(**y) is for creating dictionaries for readability purposes, e.g.:
dict(a=1, b=10, c=11)
instead of
{'a': 1, 'b': 10, 'c': 11}
Response to comments
Despite what Guido says, dict(x, **y) is in line with the dict specification, which, by the way, works for both Python 2 and 3. The fact that this only works for string keys is a direct consequence of how keyword parameters work and not a shortcoming of dict. Nor is using the ** operator in this place an abuse of the mechanism; in fact, ** was designed precisely to pass dictionaries as keywords.
Again, it doesn't work in Python 3 when keys are not strings. The implicit calling contract is that namespaces take ordinary dictionaries, while users must pass only keyword arguments that are strings. All other callables enforced this. dict broke this consistency in Python 2:
>>> foo(**{('a', 'b'): None})
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: foo() keywords must be strings
>>> dict(**{('a', 'b'): None})
{('a', 'b'): None}
This inconsistency was bad given other implementations of Python (PyPy, Jython, IronPython). Thus it was fixed in Python 3, even though this fix was a breaking change.
I submit to you that it is malicious incompetence to intentionally write code that only works in one version of a language or that only works given certain arbitrary constraints.
More comments:
dict(x.items() + y.items()) is still the most readable solution for Python 2. Readability counts.
My response: merge_two_dicts(x, y) actually seems much clearer to me, if we're actually concerned about readability. And it is not forward compatible, as Python 2 is increasingly deprecated.
{**x, **y} does not seem to handle nested dictionaries. the contents of nested keys are simply overwritten, not merged [...] I ended up being burnt by these answers that do not merge recursively and I was surprised no one mentioned it. In my interpretation of the word "merging" these answers describe "updating one dict with another", and not merging.
Yes. I must refer you back to the question, which is asking for a shallow merge of two dictionaries, with the first's values being overwritten by the second's - in a single expression.
Assuming two dictionaries of dictionaries, one might recursively merge them in a single function, but you should be careful not to modify the dictionaries from either source, and the surest way to avoid that is to make a copy when assigning values. As keys must be hashable and are usually therefore immutable, it is pointless to copy them:
from copy import deepcopy
def dict_of_dicts_merge(x, y):
    z = {}
    overlapping_keys = x.keys() & y.keys()
    for key in overlapping_keys:
        z[key] = dict_of_dicts_merge(x[key], y[key])
    for key in x.keys() - overlapping_keys:
        z[key] = deepcopy(x[key])
    for key in y.keys() - overlapping_keys:
        z[key] = deepcopy(y[key])
    return z
Usage:
>>> x = {'a':{1:{}}, 'b': {2:{}}}
>>> y = {'b':{10:{}}, 'c': {11:{}}}
>>> dict_of_dicts_merge(x, y)
{'b': {2: {}, 10: {}}, 'a': {1: {}}, 'c': {11: {}}}
Coming up with contingencies for other value types is far beyond the scope of this question, so I will point you at my answer to the canonical question on a "Dictionaries of dictionaries merge".
Less Performant But Correct Ad-hocs
These approaches are less performant, but they will provide correct behavior.
They will be much less performant than copy and update or the new unpacking because they iterate through each key-value pair at a higher level of abstraction, but they do respect the order of precedence (latter dictionaries have precedence).
You can also chain the dictionaries manually inside a dict comprehension:
{k: v for d in dicts for k, v in d.items()} # iteritems in Python 2.7
or in Python 2.6 (and perhaps as early as 2.4 when generator expressions were introduced):
dict((k, v) for d in dicts for k, v in d.items()) # iteritems in Python 2
itertools.chain will chain the iterators over the key-value pairs in the correct order:
from itertools import chain
z = dict(chain(x.items(), y.items())) # iteritems in Python 2
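For example, assuming Python 3 and the question's dicts:
>>> from itertools import chain
>>> x = {'a': 1, 'b': 2}
>>> y = {'b': 3, 'c': 4}
>>> dict(chain(x.items(), y.items()))
{'a': 1, 'b': 3, 'c': 4}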
Performance Analysis
I'm only going to do the performance analysis of the usages known to behave correctly. (Self-contained so you can copy and paste yourself.)
from timeit import repeat
from itertools import chain
x = dict.fromkeys('abcdefg')
y = dict.fromkeys('efghijk')
def merge_two_dicts(x, y):
    z = x.copy()
    z.update(y)
    return z
min(repeat(lambda: {**x, **y}))
min(repeat(lambda: merge_two_dicts(x, y)))
min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
min(repeat(lambda: dict(chain(x.items(), y.items()))))
min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))
In Python 3.8.1, NixOS:
>>> min(repeat(lambda: {**x, **y}))
1.0804965235292912
>>> min(repeat(lambda: merge_two_dicts(x, y)))
1.636518670246005
>>> min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()}))
3.1779992282390594
>>> min(repeat(lambda: dict(chain(x.items(), y.items()))))
2.740647904574871
>>> min(repeat(lambda: dict(item for d in (x, y) for item in d.items())))
4.266070580109954
$ uname -a
Linux nixos 4.19.113 #1-NixOS SMP Wed Mar 25 07:06:15 UTC 2020 x86_64 GNU/Linux
Resources on Dictionaries
My explanation of Python's dictionary implementation, updated for 3.6.
Answer on how to add new keys to a dictionary
Mapping two lists into a dictionary
The official Python docs on dictionaries
The Dictionary Even Mightier - talk by Brandon Rhodes at Pycon 2017
Modern Python Dictionaries, A Confluence of Great Ideas - talk by Raymond Hettinger at Pycon 2017
In your case, you can do:
z = dict(list(x.items()) + list(y.items()))
This will, as you want it, put the final dict in z, and make the value for key b be properly overridden by the second (y) dict's value:
>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> z = dict(list(x.items()) + list(y.items()))
>>> z
{'a': 1, 'c': 11, 'b': 10}
If you use Python 2, you can even remove the list() calls. To create z:
>>> z = dict(x.items() + y.items())
>>> z
{'a': 1, 'c': 11, 'b': 10}
If you use Python version 3.9.0a4 or greater, then you can directly use:
x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}
z = x | y
print(z)
{'a': 1, 'b': 10, 'c': 11}
An alternative:
z = x.copy()
z.update(y)
Another, more concise, option:
z = dict(x, **y)
Note: this has become a popular answer, but it is important to point out that if y has any non-string keys, the fact that this works at all is an abuse of a CPython implementation detail, and it does not work in Python 3, or in PyPy, IronPython, or Jython. Also, Guido is not a fan. So I can't recommend this technique for forward-compatible or cross-implementation portable code, which really means it should be avoided entirely.
This probably won't be a popular answer, but you almost certainly do not want to do this. If you want a copy that's a merge, then use copy (or deepcopy, depending on what you want) and then update. The two lines of code are much more readable - more Pythonic - than the single line creation with .items() + .items(). Explicit is better than implicit.
In addition, when you use .items() (pre Python 3.0), you're creating a new list that contains the items from the dict. If your dictionaries are large, then that is quite a lot of overhead (two large lists that will be thrown away as soon as the merged dict is created). update() can work more efficiently, because it can run through the second dict item-by-item.
In terms of time:
>>> timeit.Timer("dict(x, **y)", "x = dict(zip(range(1000), range(1000)))\ny=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
15.52571702003479
>>> timeit.Timer("temp = x.copy()\ntemp.update(y)", "x = dict(zip(range(1000), range(1000)))\ny=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
15.694622993469238
>>> timeit.Timer("dict(x.items() + y.items())", "x = dict(zip(range(1000), range(1000)))\ny=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000)
41.484580039978027
IMO the tiny slowdown between the first two is worth it for the readability. In addition, keyword arguments for dictionary creation were only added in Python 2.3, whereas copy() and update() will work in older versions.
In a follow-up answer, you asked about the relative performance of these two alternatives:
z1 = dict(x.items() + y.items())
z2 = dict(x, **y)
On my machine, at least (a fairly ordinary x86_64 running Python 2.5.2), alternative z2 is not only shorter and simpler but also significantly faster. You can verify this for yourself using the timeit module that comes with Python.
Example 1: identical dictionaries mapping 20 consecutive integers to themselves:
% python -m timeit -s 'x=y=dict((i,i) for i in range(20))' 'z1=dict(x.items() + y.items())'
100000 loops, best of 3: 5.67 usec per loop
% python -m timeit -s 'x=y=dict((i,i) for i in range(20))' 'z2=dict(x, **y)'
100000 loops, best of 3: 1.53 usec per loop
z2 wins by a factor of 3.5 or so. Different dictionaries seem to yield quite different results, but z2 always seems to come out ahead. (If you get inconsistent results for the same test, try passing in -r with a number larger than the default 3.)
Example 2: non-overlapping dictionaries mapping 252 short strings to integers and vice versa:
% python -m timeit -s 'from htmlentitydefs import codepoint2name as x, name2codepoint as y' 'z1=dict(x.items() + y.items())'
1000 loops, best of 3: 260 usec per loop
% python -m timeit -s 'from htmlentitydefs import codepoint2name as x, name2codepoint as y' 'z2=dict(x, **y)'
10000 loops, best of 3: 26.9 usec per loop
z2 wins by about a factor of 10. That's a pretty big win in my book!
After comparing those two, I wondered if z1's poor performance could be attributed to the overhead of constructing the two item lists, which in turn led me to wonder if this variation might work better:
from itertools import chain
z3 = dict(chain(x.iteritems(), y.iteritems()))
A few quick tests, e.g.
% python -m timeit -s 'from itertools import chain; from htmlentitydefs import codepoint2name as x, name2codepoint as y' 'z3=dict(chain(x.iteritems(), y.iteritems()))'
10000 loops, best of 3: 66 usec per loop
lead me to conclude that z3 is somewhat faster than z1, but not nearly as fast as z2. Definitely not worth all the extra typing.
This discussion is still missing something important, which is a performance comparison of these alternatives with the "obvious" way of merging two dictionaries: using the update method. To try to keep things on an equal footing with the expressions, none of which modify x or y, I'm going to make a copy of x instead of modifying it in-place, as follows:
z0 = dict(x)
z0.update(y)
A typical result:
% python -m timeit -s 'from htmlentitydefs import codepoint2name as x, name2codepoint as y' 'z0=dict(x); z0.update(y)'
10000 loops, best of 3: 26.9 usec per loop
In other words, z0 and z2 seem to have essentially identical performance. Do you think this might be a coincidence? I don't....
In fact, I'd go so far as to claim that it's impossible for pure Python code to do any better than this. And if you can do significantly better in a C extension module, I imagine the Python folks might well be interested in incorporating your code (or a variation on your approach) into the Python core. Python uses dict in lots of places; optimizing its operations is a big deal.
You could also write this as
z0 = x.copy()
z0.update(y)
as Tony does, but (not surprisingly) the difference in notation turns out not to have any measurable effect on performance. Use whichever looks right to you. Of course, he's absolutely correct to point out that the two-statement version is much easier to understand.
In Python 3.0 and later, you can use collections.ChainMap which groups multiple dicts or other mappings together to create a single, updateable view:
>>> from collections import ChainMap
>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> z = dict(ChainMap({}, y, x))
>>> for k, v in z.items():
...     print(k, '-->', v)
a --> 1
b --> 10
c --> 11
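Because ChainMap is a live view, it reflects later changes to the underlying dicts; calling dict() on it is what freezes the merge. A small sketch of the distinction:
>>> view = ChainMap({}, y, x)
>>> y['c'] = 99     # mutate an underlying dict
>>> view['c']       # the live view sees the change
99
>>> z['c']          # the dict() snapshot made earlier does not
11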
Update for Python 3.5 and later: You can use PEP 448 extended dictionary packing and unpacking. This is fast and easy:
>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> {**x, **y}
{'a': 1, 'b': 10, 'c': 11}
Update for Python 3.9 and later: You can use the PEP 584 union operator:
>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> x | y
{'a': 1, 'b': 10, 'c': 11}
I wanted something similar, but with the ability to specify how the values on duplicate keys were merged, so I hacked this out (but did not heavily test it). Obviously this is not a single expression, but it is a single function call.
def merge(d1, d2, merge_fn=lambda x, y: y):
    """
    Merges two dictionaries, non-destructively, combining
    values on duplicate keys as defined by the optional merge
    function. The default behavior replaces the values in d1
    with corresponding values in d2. (There is no other generally
    applicable merge strategy, but often you'll have homogeneous
    types in your dicts, so specifying a merge technique can be
    valuable.)

    Examples:

    >>> d1
    {'a': 1, 'c': 3, 'b': 2}
    >>> merge(d1, d1)
    {'a': 1, 'c': 3, 'b': 2}
    >>> merge(d1, d1, lambda x, y: x + y)
    {'a': 2, 'c': 6, 'b': 4}
    """
    result = dict(d1)
    for k, v in d2.iteritems():  # use d2.items() in Python 3
        if k in result:
            result[k] = merge_fn(result[k], v)
        else:
            result[k] = v
    return result
Recursively/deep update a dict
def deepupdate(original, update):
    """
    Recursively update a dict.
    Sub-dicts won't be overwritten, but updated in turn.
    """
    for key, value in original.iteritems():  # original.items() in Python 3
        if key not in update:
            update[key] = value
        elif isinstance(value, dict):
            deepupdate(value, update[key])
    return update
Demonstration:
pluto_original = {
'name': 'Pluto',
'details': {
'tail': True,
'color': 'orange'
}
}
pluto_update = {
'name': 'Pluutoo',
'details': {
'color': 'blue'
}
}
print deepupdate(pluto_original, pluto_update)
Outputs:
{
'name': 'Pluutoo',
'details': {
'color': 'blue',
'tail': True
}
}
Thanks rednaw for edits.
Python 3.5 (PEP 448) allows a nicer syntax option:
x = {'a': 1, 'b': 1}
y = {'a': 2, 'c': 2}
final = {**x, **y}
final
# {'a': 2, 'b': 1, 'c': 2}
Or even
final = {'a': 1, 'b': 1, **x, **y}
In Python 3.9 you can also use | and |=, as in the below example from PEP 584:
d = {'spam': 1, 'eggs': 2, 'cheese': 3}
e = {'cheese': 'cheddar', 'aardvark': 'Ethel'}
d | e
# {'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}
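The augmented form |= updates in place, like update; a quick sketch continuing the same example:
d |= e
d
# {'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}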
x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}
z = dict(x.items() + y.items())
print z
For items with keys in both dictionaries ('b'), you can control which one ends up in the output by putting that one last.
The best version I could think of while not using copy would be:
from itertools import chain
x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}
dict(chain(x.iteritems(), y.iteritems()))
It's faster than dict(x.items() + y.items()) but not as fast as n = copy(a); n.update(b), at least on CPython. This version also works in Python 3 if you change iteritems() to items(), which is automatically done by the 2to3 tool.
Personally I like this version best because it describes fairly well what I want in a single functional syntax. The only minor problem is that it doesn't make it completely obvious that values from y take precedence over values from x, but I don't believe it's difficult to figure that out.
I benchmarked the suggested solutions with perfplot and found that
x | y # Python 3.9+
is the fastest solution together with the good old
{**x, **y}
and
temp = x.copy()
temp.update(y)
Code to reproduce the plot:
from collections import ChainMap
from itertools import chain
import perfplot
def setup(n):
    x = dict(zip(range(n), range(n)))
    y = dict(zip(range(n, 2 * n), range(n, 2 * n)))
    return x, y

def copy_update(x, y):
    temp = x.copy()
    temp.update(y)
    return temp

def add_items(x, y):
    return dict(list(x.items()) + list(y.items()))

def curly_star(x, y):
    return {**x, **y}

def chain_map(x, y):
    return dict(ChainMap({}, y, x))

def itertools_chain(x, y):
    return dict(chain(x.items(), y.items()))

def python39_concat(x, y):
    return x | y

b = perfplot.bench(
    setup=setup,
    kernels=[
        copy_update,
        add_items,
        curly_star,
        chain_map,
        itertools_chain,
        python39_concat,
    ],
    labels=[
        "copy_update",
        "dict(list(x.items()) + list(y.items()))",
        "{**x, **y}",
        "chain_map",
        "itertools.chain",
        "x | y",
    ],
    n_range=[2 ** k for k in range(18)],
    xlabel="len(x), len(y)",
    equality_check=None,
)
b.save("out.png")
b.show()
While the question has already been answered several times,
this simple solution to the problem has not been listed yet.
x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}
z4 = {}
z4.update(x)
z4.update(y)
It is as fast as z0 and the evil z2 mentioned above, but easy to understand and change.
def dict_merge(a, b):
    c = a.copy()
    c.update(b)
    return c
new = dict_merge(old, extras)
Among such shady and dubious answers, this shining example is the one and only good way to merge dicts in Python, endorsed by dictator for life Guido van Rossum himself! Someone else suggested half of this, but did not put it in a function.
print dict_merge(
    {'color':'red', 'model':'Mini'},
    {'model':'Ferrari', 'owner':'Carl'})
gives:
{'color': 'red', 'owner': 'Carl', 'model': 'Ferrari'}
Be Pythonic. Use a comprehension:
z = {k: v for d in [x, y] for k, v in d.items()}
>>> print z
{'a': 1, 'c': 11, 'b': 10}
If you think lambdas are evil then read no further.
As requested, you can write the fast and memory-efficient solution with one expression:
x = {'a':1, 'b':2}
y = {'b':10, 'c':11}
z = (lambda a, b: (lambda a_copy: a_copy.update(b) or a_copy)(a.copy()))(x, y)
print z
{'a': 1, 'c': 11, 'b': 10}
print x
{'a': 1, 'b': 2}
As suggested above, using two lines or writing a function is probably a better way to go.
In Python 3, the items method no longer returns a list, but rather a view, which acts like a set. In this case you'll need to take the set union, since concatenating with + won't work:
dict(x.items() | y.items())
For Python 3-like behavior in version 2.7, the viewitems method should work in place of items:
dict(x.viewitems() | y.viewitems())
I prefer this notation anyway, since it seems more natural to think of it as a set union operation rather than concatenation (as the title shows).
Edit:
A couple more points for Python 3. First, note that the dict(x, **y) trick won't work in Python 3 unless the keys in y are strings.
Also, Raymond Hettinger's ChainMap answer is pretty elegant, since it can take an arbitrary number of dicts as arguments, but from the docs it looks like it sequentially looks through a list of all the dicts for each lookup:
Lookups search the underlying mappings successively until a key is found.
This can slow you down if you have a lot of lookups in your application:
In [1]: from collections import ChainMap
In [2]: from string import ascii_uppercase as up, ascii_lowercase as lo; x = dict(zip(lo, up)); y = dict(zip(up, lo))
In [3]: chainmap_dict = ChainMap(y, x)
In [4]: union_dict = dict(x.items() | y.items())
In [5]: timeit for k in union_dict: union_dict[k]
100000 loops, best of 3: 2.15 µs per loop
In [6]: timeit for k in chainmap_dict: chainmap_dict[k]
10000 loops, best of 3: 27.1 µs per loop
So about an order of magnitude slower for lookups. I'm a fan of ChainMap, but it looks less practical where there may be many lookups.
Two dictionaries
def union2(dict1, dict2):
    return dict(list(dict1.items()) + list(dict2.items()))
n dictionaries
import itertools

def union(*dicts):
    return dict(itertools.chain.from_iterable(dct.items() for dct in dicts))
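A usage sketch for the n-ary version, with later dicts taking precedence:
>>> union({'a': 1}, {'a': 2, 'b': 3}, {'c': 4})
{'a': 2, 'b': 3, 'c': 4}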
sum has bad performance. See https://mathieularose.com/how-not-to-flatten-a-list-of-lists-in-python/
Simple solution using itertools that preserves order (latter dicts have precedence)
# py2
from itertools import chain, imap
merge = lambda *args: dict(chain.from_iterable(imap(dict.iteritems, args)))
# py3
from itertools import chain
merge = lambda *args: dict(chain.from_iterable(map(dict.items, args)))
And its usage:
>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> merge(x, y)
{'a': 1, 'b': 10, 'c': 11}
>>> z = {'c': 3, 'd': 4}
>>> merge(x, y, z)
{'a': 1, 'b': 10, 'c': 3, 'd': 4}
Abuse leading to a one-expression solution for Matthew's answer:
>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> z = (lambda f=x.copy(): (f.update(y), f)[1])()
>>> z
{'a': 1, 'c': 11, 'b': 10}
You said you wanted one expression, so I abused lambda to bind a name, and tuples to override lambda's one-expression limit. Feel free to cringe.
You could also do this of course if you don't care about copying it:
>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> z = (x.update(y), x)[1]
>>> z
{'a': 1, 'b': 10, 'c': 11}
If you don't mind mutating x,
x.update(y) or x
Simple, readable, performant. You know update() always returns None, which is a false value. So the above expression will always evaluate to x, after updating it.
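A minimal sketch with the question's dicts (accepting that x is mutated):
>>> x = {'a': 1, 'b': 2}
>>> y = {'b': 3, 'c': 4}
>>> x.update(y) or x
{'a': 1, 'b': 3, 'c': 4}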
Most mutating methods in the standard library (like .update()) return None by convention, so this kind of pattern will work on those too. However, if you're using a dict subclass or some other method that doesn't follow this convention, then or may return its left operand, which may not be what you want. Instead, you can use a tuple display and index, which works regardless of what the first element evaluates to (although it's not quite as pretty):
(x.update(y), x)[-1]
If you don't have x in a variable yet, you can use lambda to make a local without using an assignment statement. This amounts to using lambda as a let expression, which is a common technique in functional languages, but is maybe unpythonic.
(lambda x: x.update(y) or x)({'a': 1, 'b': 2})
Although it's not that different from the following use of the new walrus operator (Python 3.8+ only),
(x := {'a': 1, 'b': 2}).update(y) or x
especially if you use a default argument:
(lambda x={'a': 1, 'b': 2}: x.update(y) or x)()
If you do want a copy, PEP 584 style x | y is the most Pythonic on 3.9+. If you must support older versions, PEP 448 style {**x, **y} is easiest for 3.5+. But if that's not available in your (even older) Python version, the let expression pattern works here too.
(lambda z=x.copy(): z.update(y) or z)()
(That is, of course, nearly equivalent to (z := x.copy()).update(y) or z, but if your Python version is new enough for that, then the PEP 448 style will be available.)
Drawing on ideas here and elsewhere I've comprehended a function:
def merge(*dicts, **kv):
    return {k: v for d in list(dicts) + [kv] for k, v in d.items()}
Usage (tested in python 3):
assert (merge({1:11,'a':'aaa'},{1:99, 'b':'bbb'},foo='bar')==\
{1: 99, 'foo': 'bar', 'b': 'bbb', 'a': 'aaa'})
assert (merge(foo='bar')=={'foo': 'bar'})
assert (merge({1:11},{1:99},foo='bar',baz='quux')==\
{1: 99, 'foo': 'bar', 'baz':'quux'})
assert (merge({1:11},{1:99})=={1: 99})
You could use a lambda instead.
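For instance, a one-line sketch of that lambda form, with the same semantics as the def above:
merge = lambda *dicts, **kv: {k: v for d in list(dicts) + [kv] for k, v in d.items()}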
New in Python 3.9: Use the union operator (|) to merge dicts similar to sets:
>>> d = {'a': 1, 'b': 2}
>>> e = {'a': 9, 'c': 3}
>>> d | e
{'a': 9, 'b': 2, 'c': 3}
For matching keys, the right dict takes precedence.
This also works for |= to modify a dict in-place:
>>> e |= d # e = e | d
>>> e
{'a': 1, 'c': 3, 'b': 2}
It's so silly that .update returns nothing.
I just use a simple helper function to solve the problem:
def merge(dict1, *dicts):
    for dict2 in dicts:
        dict1.update(dict2)
    return dict1
Examples:
merge(dict1,dict2)
merge(dict1,dict2,dict3)
merge(dict1,dict2,dict3,dict4)
merge({},dict1,dict2) # this one returns a new copy
(For Python 2.7* only; there are simpler solutions for Python 3*.)
If you're not averse to importing a standard library module, you can do
from functools import reduce
def merge_dicts(*dicts):
    return reduce(lambda a, d: a.update(d) or a, dicts, {})
(The or a bit in the lambda is necessary because dict.update always returns None on success.)
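A usage sketch, with later dicts winning as before:
>>> merge_dicts({'a': 1}, {'a': 2, 'b': 3})
{'a': 2, 'b': 3}
>>> merge_dicts()    # zero arguments works too
{}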
The problem I have with solutions listed to date is that, in the merged dictionary, the value for key "b" is 10 but, to my way of thinking, it should be 12.
In that light, I present the following:
import timeit
n=100000
su = """
x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}
"""
def timeMerge(f, su, niter):
    print "{:4f} sec for: {:30s}".format(timeit.Timer(f, setup=su).timeit(niter), f)
timeMerge("dict(x, **y)",su,n)
timeMerge("x.update(y)",su,n)
timeMerge("dict(x.items() + y.items())",su,n)
timeMerge("for k in y.keys(): x[k] = k in x and x[k]+y[k] or y[k] ",su,n)
#confirm for loop adds b entries together
x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}
for k in y.keys(): x[k] = k in x and x[k]+y[k] or y[k]
print "confirm b elements are added:",x
Results:
0.049465 sec for: dict(x, **y)
0.033729 sec for: x.update(y)
0.150380 sec for: dict(x.items() + y.items())
0.083120 sec for: for k in y.keys(): x[k] = k in x and x[k]+y[k] or y[k]
confirm b elements are added: {'a': 1, 'c': 11, 'b': 12}
from collections import Counter
dict1 = {'a':1, 'b': 2}
dict2 = {'b':10, 'c': 11}
result = dict(Counter(dict1) + Counter(dict2))
This should solve your problem.
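Worth spelling out what this produces: Counter addition sums the values for shared keys (and keeps only positive results), rather than letting dict2 replace dict1's values, so it answers a slightly different question (output order shown for CPython 3.7+):
>>> dict(Counter({'a': 1, 'b': 2}) + Counter({'b': 10, 'c': 11}))
{'a': 1, 'b': 12, 'c': 11}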
There will be a new option when Python 3.8 releases (scheduled for 20 October, 2019), thanks to PEP 572: Assignment Expressions. The new assignment expression operator := allows you to assign the result of the copy and still use it to call update, leaving the combined code a single expression, rather than two statements, changing:
newdict = dict1.copy()
newdict.update(dict2)
to:
(newdict := dict1.copy()).update(dict2)
while behaving identically in every way. If you must also return the resulting dict (you asked for an expression returning the dict; the above creates and assigns to newdict, but doesn't return it, so you couldn't use it to pass an argument to a function as is, a la myfunc((newdict := dict1.copy()).update(dict2))), then just add or newdict to the end (since update returns None, which is falsy, it will then evaluate and return newdict as the result of the expression):
(newdict := dict1.copy()).update(dict2) or newdict
Important caveat: In general, I'd discourage this approach in favor of:
newdict = {**dict1, **dict2}
The unpacking approach is clearer (to anyone who knows about generalized unpacking in the first place, which you should), doesn't require a name for the result at all (so it's much more concise when constructing a temporary that is immediately passed to a function or included in a list/tuple literal or the like), and is almost certainly faster as well, being (on CPython) roughly equivalent to:
newdict = {}
newdict.update(dict1)
newdict.update(dict2)
but done at the C layer, using the concrete dict API, so no dynamic method lookup/binding or function call dispatch overhead is involved (whereas (newdict := dict1.copy()).update(dict2) is unavoidably identical to the original two-liner in behavior, performing the work in discrete steps, with dynamic lookup/binding/invocation of methods).
It's also more extensible, as merging three dicts is obvious:
newdict = {**dict1, **dict2, **dict3}
where using assignment expressions won't scale like that; the closest you could get would be:
(newdict := dict1.copy()).update(dict2), newdict.update(dict3)
or without the temporary tuple of Nones, but with truthiness testing of each None result:
(newdict := dict1.copy()).update(dict2) or newdict.update(dict3)
either of which is obviously much uglier, and includes further inefficiencies (either a wasted temporary tuple of Nones for comma separation, or pointless truthiness testing of each update's None return for or separation).
The only real advantage to the assignment expression approach occurs if:
You have generic code that needs to handle both sets and dicts (both of them support copy and update, so the code works roughly as you'd expect it to)
You expect to receive arbitrary dict-like objects, not just dict itself, and must preserve the type and semantics of the left hand side (rather than ending up with a plain dict). While myspecialdict({**speciala, **specialb}) might work, it would involve an extra temporary dict, and if myspecialdict has features plain dict can't preserve (e.g. regular dicts now preserve order based on the first appearance of a key, and value based on the last appearance of a key; you might want one that preserves order based on the last appearance of a key so updating a value also moves it to the end), then the semantics would be wrong. Since the assignment expression version uses the named methods (which are presumably overloaded to behave appropriately), it never creates a dict at all (unless dict1 was already a dict), preserving the original type (and original type's semantics), all while avoiding any temporaries.
This can be done with a single dict comprehension:
>>> x = {'a':1, 'b': 2}
>>> y = {'b':10, 'c': 11}
>>> {key: y[key] if key in y else x[key]
...  for key in set(x) | set(y)}
In my view the best answer for the 'single expression' part as no extra functions are needed, and it is short.
Related
Fastest way of using has_key() inside filter()?
I know this is a very efficient way in python 2, to intersect 2 dictionaries filter(dict_1.has_key, dict_2.keys()) However has_key() was removed from Python3, so I can't really use the fast filter() and has_key() functions. What I'm doing right now is: [key for key in dict_2 if key in dict_1] But it seems a bit janky, on top of not being so much readable. Is this really the new fastest way with python3, or is there a faster, cleaner way by using filter()?
Instead of has_key in Python 2, you can use the in operator in Python 3.x. With filter, which gives a lazy iterator in 3.x, you can use dict.__contains__. There's also no need to call dict.keys: res = filter(dict_1.__contains__, dict_2) # lazy print(list(res)) # [2, 3] An equivalent, but less aesthetic, lambda-based solution: res = filter(lambda x: x in dict_1, dict_2) # lazy A generator expression is a third alternative: res = (x for x in dict_2 if ix in dict_1) # lazy For a non-lazy method, you can use set.intersection (or its syntactic sugar &): res = set(dict_1) & set(dict_2) # {2, 3}
As you want the intersection of the keys, you could do: d1 = {1 : 1, 2 : 2} d2 = {1 : 3, 2 : 4, 3 : 5} common = list(d1.keys() & d2.keys()) print(common) Output [1, 2]
Modify parents dict() argument in child definition [duplicate]
I want to merge two dictionaries into a new dictionary. x = {'a': 1, 'b': 2} y = {'b': 3, 'c': 4} z = merge(x, y) >>> z {'a': 1, 'b': 3, 'c': 4} Whenever a key k is present in both dictionaries, only the value y[k] should be kept.
How can I merge two Python dictionaries in a single expression? For dictionaries x and y, their shallowly-merged dictionary z takes values from y, replacing those from x. In Python 3.9.0 or greater (released 17 October 2020, PEP-584, discussed here): z = x | y In Python 3.5 or greater: z = {**x, **y} In Python 2, (or 3.4 or lower) write a function: def merge_two_dicts(x, y): z = x.copy() # start with keys and values of x z.update(y) # modifies z with keys and values of y return z and now: z = merge_two_dicts(x, y) Explanation Say you have two dictionaries and you want to merge them into a new dictionary without altering the original dictionaries: x = {'a': 1, 'b': 2} y = {'b': 3, 'c': 4} The desired result is to get a new dictionary (z) with the values merged, and the second dictionary's values overwriting those from the first. >>> z {'a': 1, 'b': 3, 'c': 4} A new syntax for this, proposed in PEP 448 and available as of Python 3.5, is z = {**x, **y} And it is indeed a single expression. Note that we can merge in with literal notation as well: z = {**x, 'foo': 1, 'bar': 2, **y} and now: >>> z {'a': 1, 'b': 3, 'foo': 1, 'bar': 2, 'c': 4} It is now showing as implemented in the release schedule for 3.5, PEP 478, and it has now made its way into the What's New in Python 3.5 document. However, since many organizations are still on Python 2, you may wish to do this in a backward-compatible way. The classically Pythonic way, available in Python 2 and Python 3.0-3.4, is to do this as a two-step process: z = x.copy() z.update(y) # which returns None since it mutates z In both approaches, y will come second and its values will replace x's values, thus b will point to 3 in our final result. Not yet on Python 3.5, but want a single expression If you are not yet on Python 3.5 or need to write backward-compatible code, and you want this in a single expression, the most performant while the correct approach is to put it in a function: def merge_two_dicts(x, y): """Given two dictionaries, merge them into a new dict as a shallow copy.""" z = x.copy() z.update(y) return z and then you have a single expression: z = merge_two_dicts(x, y) You can also make a function to merge an arbitrary number of dictionaries, from zero to a very large number: def merge_dicts(*dict_args): """ Given any number of dictionaries, shallow copy and merge into a new dict, precedence goes to key-value pairs in latter dictionaries. """ result = {} for dictionary in dict_args: result.update(dictionary) return result This function will work in Python 2 and 3 for all dictionaries. e.g. given dictionaries a to g: z = merge_dicts(a, b, c, d, e, f, g) and key-value pairs in g will take precedence over dictionaries a to f, and so on. Critiques of Other Answers Don't use what you see in the formerly accepted answer: z = dict(x.items() + y.items()) In Python 2, you create two lists in memory for each dict, create a third list in memory with length equal to the length of the first two put together, and then discard all three lists to create the dict. In Python 3, this will fail because you're adding two dict_items objects together, not two lists - >>> c = dict(a.items() + b.items()) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unsupported operand type(s) for +: 'dict_items' and 'dict_items' and you would have to explicitly create them as lists, e.g. z = dict(list(x.items()) + list(y.items())). This is a waste of resources and computation power. 
Similarly, taking the union of items() in Python 3 (viewitems() in Python 2.7) will also fail when values are unhashable objects (like lists, for example). Even if your values are hashable, since sets are semantically unordered, the behavior is undefined in regards to precedence. So don't do this: >>> c = dict(a.items() | b.items()) This example demonstrates what happens when values are unhashable: >>> x = {'a': []} >>> y = {'b': []} >>> dict(x.items() | y.items()) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: unhashable type: 'list' Here's an example where y should have precedence, but instead the value from x is retained due to the arbitrary order of sets: >>> x = {'a': 2} >>> y = {'a': 1} >>> dict(x.items() | y.items()) {'a': 2} Another hack you should not use: z = dict(x, **y) This uses the dict constructor and is very fast and memory-efficient (even slightly more so than our two-step process) but unless you know precisely what is happening here (that is, the second dict is being passed as keyword arguments to the dict constructor), it's difficult to read, it's not the intended usage, and so it is not Pythonic. Here's an example of the usage being remediated in django. Dictionaries are intended to take hashable keys (e.g. frozensets or tuples), but this method fails in Python 3 when keys are not strings. >>> c = dict(a, **b) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: keyword arguments must be strings From the mailing list, Guido van Rossum, the creator of the language, wrote: I am fine with declaring dict({}, **{1:3}) illegal, since after all it is abuse of the ** mechanism. and Apparently dict(x, **y) is going around as "cool hack" for "call x.update(y) and return x". Personally, I find it more despicable than cool. It is my understanding (as well as the understanding of the creator of the language) that the intended usage for dict(**y) is for creating dictionaries for readability purposes, e.g.: dict(a=1, b=10, c=11) instead of {'a': 1, 'b': 10, 'c': 11} Response to comments Despite what Guido says, dict(x, **y) is in line with the dict specification, which btw. works for both Python 2 and 3. The fact that this only works for string keys is a direct consequence of how keyword parameters work and not a short-coming of dict. Nor is using the ** operator in this place an abuse of the mechanism, in fact, ** was designed precisely to pass dictionaries as keywords. Again, it doesn't work for 3 when keys are not strings. The implicit calling contract is that namespaces take ordinary dictionaries, while users must only pass keyword arguments that are strings. All other callables enforced it. dict broke this consistency in Python 2: >>> foo(**{('a', 'b'): None}) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: foo() keywords must be strings >>> dict(**{('a', 'b'): None}) {('a', 'b'): None} This inconsistency was bad given other implementations of Python (PyPy, Jython, IronPython). Thus it was fixed in Python 3, as this usage could be a breaking change. I submit to you that it is malicious incompetence to intentionally write code that only works in one version of a language or that only works given certain arbitrary constraints. More comments: dict(x.items() + y.items()) is still the most readable solution for Python 2. Readability counts. My response: merge_two_dicts(x, y) actually seems much clearer to me, if we're actually concerned about readability. 
And it is not forward compatible, as Python 2 is increasingly deprecated. {**x, **y} does not seem to handle nested dictionaries. the contents of nested keys are simply overwritten, not merged [...] I ended up being burnt by these answers that do not merge recursively and I was surprised no one mentioned it. In my interpretation of the word "merging" these answers describe "updating one dict with another", and not merging. Yes. I must refer you back to the question, which is asking for a shallow merge of two dictionaries, with the first's values being overwritten by the second's - in a single expression. Assuming two dictionaries of dictionaries, one might recursively merge them in a single function, but you should be careful not to modify the dictionaries from either source, and the surest way to avoid that is to make a copy when assigning values. As keys must be hashable and are usually therefore immutable, it is pointless to copy them: from copy import deepcopy def dict_of_dicts_merge(x, y): z = {} overlapping_keys = x.keys() & y.keys() for key in overlapping_keys: z[key] = dict_of_dicts_merge(x[key], y[key]) for key in x.keys() - overlapping_keys: z[key] = deepcopy(x[key]) for key in y.keys() - overlapping_keys: z[key] = deepcopy(y[key]) return z Usage: >>> x = {'a':{1:{}}, 'b': {2:{}}} >>> y = {'b':{10:{}}, 'c': {11:{}}} >>> dict_of_dicts_merge(x, y) {'b': {2: {}, 10: {}}, 'a': {1: {}}, 'c': {11: {}}} Coming up with contingencies for other value types is far beyond the scope of this question, so I will point you at my answer to the canonical question on a "Dictionaries of dictionaries merge". Less Performant But Correct Ad-hocs These approaches are less performant, but they will provide correct behavior. They will be much less performant than copy and update or the new unpacking because they iterate through each key-value pair at a higher level of abstraction, but they do respect the order of precedence (latter dictionaries have precedence) You can also chain the dictionaries manually inside a dict comprehension: {k: v for d in dicts for k, v in d.items()} # iteritems in Python 2.7 or in Python 2.6 (and perhaps as early as 2.4 when generator expressions were introduced): dict((k, v) for d in dicts for k, v in d.items()) # iteritems in Python 2 itertools.chain will chain the iterators over the key-value pairs in the correct order: from itertools import chain z = dict(chain(x.items(), y.items())) # iteritems in Python 2 Performance Analysis I'm only going to do the performance analysis of the usages known to behave correctly. (Self-contained so you can copy and paste yourself.) 
from timeit import repeat from itertools import chain x = dict.fromkeys('abcdefg') y = dict.fromkeys('efghijk') def merge_two_dicts(x, y): z = x.copy() z.update(y) return z min(repeat(lambda: {**x, **y})) min(repeat(lambda: merge_two_dicts(x, y))) min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()})) min(repeat(lambda: dict(chain(x.items(), y.items())))) min(repeat(lambda: dict(item for d in (x, y) for item in d.items()))) In Python 3.8.1, NixOS: >>> min(repeat(lambda: {**x, **y})) 1.0804965235292912 >>> min(repeat(lambda: merge_two_dicts(x, y))) 1.636518670246005 >>> min(repeat(lambda: {k: v for d in (x, y) for k, v in d.items()})) 3.1779992282390594 >>> min(repeat(lambda: dict(chain(x.items(), y.items())))) 2.740647904574871 >>> min(repeat(lambda: dict(item for d in (x, y) for item in d.items()))) 4.266070580109954 $ uname -a Linux nixos 4.19.113 #1-NixOS SMP Wed Mar 25 07:06:15 UTC 2020 x86_64 GNU/Linux Resources on Dictionaries My explanation of Python's dictionary implementation, updated for 3.6. Answer on how to add new keys to a dictionary Mapping two lists into a dictionary The official Python docs on dictionaries The Dictionary Even Mightier - talk by Brandon Rhodes at Pycon 2017 Modern Python Dictionaries, A Confluence of Great Ideas - talk by Raymond Hettinger at Pycon 2017
In your case, you can do: z = dict(list(x.items()) + list(y.items())) This will, as you want it, put the final dict in z, and make the value for key b be properly overridden by the second (y) dict's value: >>> x = {'a':1, 'b': 2} >>> y = {'b':10, 'c': 11} >>> z = dict(list(x.items()) + list(y.items())) >>> z {'a': 1, 'c': 11, 'b': 10} If you use Python 2, you can even remove the list() calls. To create z: >>> z = dict(x.items() + y.items()) >>> z {'a': 1, 'c': 11, 'b': 10} If you use Python version 3.9.0a4 or greater, then you can directly use: x = {'a':1, 'b': 2} y = {'b':10, 'c': 11} z = x | y print(z) {'a': 1, 'c': 11, 'b': 10}
An alternative: z = x.copy() z.update(y)
Another, more concise, option: z = dict(x, **y) Note: this has become a popular answer, but it is important to point out that if y has any non-string keys, the fact that this works at all is an abuse of a CPython implementation detail, and it does not work in Python 3, or in PyPy, IronPython, or Jython. Also, Guido is not a fan. So I can't recommend this technique for forward-compatible or cross-implementation portable code, which really means it should be avoided entirely.
This probably won't be a popular answer, but you almost certainly do not want to do this. If you want a copy that's a merge, then use copy (or deepcopy, depending on what you want) and then update. The two lines of code are much more readable - more Pythonic - than the single line creation with .items() + .items(). Explicit is better than implicit. In addition, when you use .items() (pre Python 3.0), you're creating a new list that contains the items from the dict. If your dictionaries are large, then that is quite a lot of overhead (two large lists that will be thrown away as soon as the merged dict is created). update() can work more efficiently, because it can run through the second dict item-by-item. In terms of time: >>> timeit.Timer("dict(x, **y)", "x = dict(zip(range(1000), range(1000)))\ny=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000) 15.52571702003479 >>> timeit.Timer("temp = x.copy()\ntemp.update(y)", "x = dict(zip(range(1000), range(1000)))\ny=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000) 15.694622993469238 >>> timeit.Timer("dict(x.items() + y.items())", "x = dict(zip(range(1000), range(1000)))\ny=dict(zip(range(1000,2000), range(1000,2000)))").timeit(100000) 41.484580039978027 IMO the tiny slowdown between the first two is worth it for the readability. In addition, keyword arguments for dictionary creation was only added in Python 2.3, whereas copy() and update() will work in older versions.
In a follow-up answer, you asked about the relative performance of these two alternatives: z1 = dict(x.items() + y.items()) z2 = dict(x, **y) On my machine, at least (a fairly ordinary x86_64 running Python 2.5.2), alternative z2 is not only shorter and simpler but also significantly faster. You can verify this for yourself using the timeit module that comes with Python. Example 1: identical dictionaries mapping 20 consecutive integers to themselves: % python -m timeit -s 'x=y=dict((i,i) for i in range(20))' 'z1=dict(x.items() + y.items())' 100000 loops, best of 3: 5.67 usec per loop % python -m timeit -s 'x=y=dict((i,i) for i in range(20))' 'z2=dict(x, **y)' 100000 loops, best of 3: 1.53 usec per loop z2 wins by a factor of 3.5 or so. Different dictionaries seem to yield quite different results, but z2 always seems to come out ahead. (If you get inconsistent results for the same test, try passing in -r with a number larger than the default 3.) Example 2: non-overlapping dictionaries mapping 252 short strings to integers and vice versa: % python -m timeit -s 'from htmlentitydefs import codepoint2name as x, name2codepoint as y' 'z1=dict(x.items() + y.items())' 1000 loops, best of 3: 260 usec per loop % python -m timeit -s 'from htmlentitydefs import codepoint2name as x, name2codepoint as y' 'z2=dict(x, **y)' 10000 loops, best of 3: 26.9 usec per loop z2 wins by about a factor of 10. That's a pretty big win in my book! After comparing those two, I wondered if z1's poor performance could be attributed to the overhead of constructing the two item lists, which in turn led me to wonder if this variation might work better: from itertools import chain z3 = dict(chain(x.iteritems(), y.iteritems())) A few quick tests, e.g. % python -m timeit -s 'from itertools import chain; from htmlentitydefs import codepoint2name as x, name2codepoint as y' 'z3=dict(chain(x.iteritems(), y.iteritems()))' 10000 loops, best of 3: 66 usec per loop lead me to conclude that z3 is somewhat faster than z1, but not nearly as fast as z2. Definitely not worth all the extra typing. This discussion is still missing something important, which is a performance comparison of these alternatives with the "obvious" way of merging two lists: using the update method. To try to keep things on an equal footing with the expressions, none of which modify x or y, I'm going to make a copy of x instead of modifying it in-place, as follows: z0 = dict(x) z0.update(y) A typical result: % python -m timeit -s 'from htmlentitydefs import codepoint2name as x, name2codepoint as y' 'z0=dict(x); z0.update(y)' 10000 loops, best of 3: 26.9 usec per loop In other words, z0 and z2 seem to have essentially identical performance. Do you think this might be a coincidence? I don't.... In fact, I'd go so far as to claim that it's impossible for pure Python code to do any better than this. And if you can do significantly better in a C extension module, I imagine the Python folks might well be interested in incorporating your code (or a variation on your approach) into the Python core. Python uses dict in lots of places; optimizing its operations is a big deal. You could also write this as z0 = x.copy() z0.update(y) as Tony does, but (not surprisingly) the difference in notation turns out not to have any measurable effect on performance. Use whichever looks right to you. Of course, he's absolutely correct to point out that the two-statement version is much easier to understand.
In Python 3.0 and later, you can use collections.ChainMap which groups multiple dicts or other mappings together to create a single, updateable view: >>> from collections import ChainMap >>> x = {'a':1, 'b': 2} >>> y = {'b':10, 'c': 11} >>> z = dict(ChainMap({}, y, x)) >>> for k, v in z.items(): print(k, '-->', v) a --> 1 b --> 10 c --> 11 Update for Python 3.5 and later: You can use PEP 448 extended dictionary packing and unpacking. This is fast and easy: >>> x = {'a':1, 'b': 2} >>> y = {'b':10, 'c': 11} >>> {**x, **y} {'a': 1, 'b': 10, 'c': 11} Update for Python 3.9 and later: You can use the PEP 584 union operator: >>> x = {'a':1, 'b': 2} >>> y = {'b':10, 'c': 11} >>> x | y {'a': 1, 'b': 10, 'c': 11}
I wanted something similar, but with the ability to specify how the values on duplicate keys were merged, so I hacked this out (but did not heavily test it). Obviously this is not a single expression, but it is a single function call. def merge(d1, d2, merge_fn=lambda x,y:y): """ Merges two dictionaries, non-destructively, combining values on duplicate keys as defined by the optional merge function. The default behavior replaces the values in d1 with corresponding values in d2. (There is no other generally applicable merge strategy, but often you'll have homogeneous types in your dicts, so specifying a merge technique can be valuable.) Examples: >>> d1 {'a': 1, 'c': 3, 'b': 2} >>> merge(d1, d1) {'a': 1, 'c': 3, 'b': 2} >>> merge(d1, d1, lambda x,y: x+y) {'a': 2, 'c': 6, 'b': 4} """ result = dict(d1) for k,v in d2.iteritems(): if k in result: result[k] = merge_fn(result[k], v) else: result[k] = v return result
Recursively/deep update a dict def deepupdate(original, update): """ Recursively update a dict. Subdict's won't be overwritten but also updated. """ for key, value in original.iteritems(): if key not in update: update[key] = value elif isinstance(value, dict): deepupdate(value, update[key]) return update Demonstration: pluto_original = { 'name': 'Pluto', 'details': { 'tail': True, 'color': 'orange' } } pluto_update = { 'name': 'Pluutoo', 'details': { 'color': 'blue' } } print deepupdate(pluto_original, pluto_update) Outputs: { 'name': 'Pluutoo', 'details': { 'color': 'blue', 'tail': True } } Thanks rednaw for edits.
Python 3.5 (PEP 448) allows a nicer syntax option: x = {'a': 1, 'b': 1} y = {'a': 2, 'c': 2} final = {**x, **y} final # {'a': 2, 'b': 1, 'c': 2} Or even final = {'a': 1, 'b': 1, **x, **y} In Python 3.9 you also use | and |= with the below example from PEP 584 d = {'spam': 1, 'eggs': 2, 'cheese': 3} e = {'cheese': 'cheddar', 'aardvark': 'Ethel'} d | e # {'spam': 1, 'eggs': 2, 'cheese': 'cheddar', 'aardvark': 'Ethel'}
x = {'a':1, 'b': 2} y = {'b':10, 'c': 11} z = dict(x.items() + y.items()) print z For items with keys in both dictionaries ('b'), you can control which one ends up in the output by putting that one last.
The best version I could think while not using copy would be: from itertools import chain x = {'a':1, 'b': 2} y = {'b':10, 'c': 11} dict(chain(x.iteritems(), y.iteritems())) It's faster than dict(x.items() + y.items()) but not as fast as n = copy(a); n.update(b), at least on CPython. This version also works in Python 3 if you change iteritems() to items(), which is automatically done by the 2to3 tool. Personally I like this version best because it describes fairly good what I want in a single functional syntax. The only minor problem is that it doesn't make completely obvious that values from y takes precedence over values from x, but I don't believe it's difficult to figure that out.
I benchmarked the suggested with perfplot and found that x | y # Python 3.9+ is the fastest solution together with the good old {**x, **y} and temp = x.copy() temp.update(y) Code to reproduce the plot: from collections import ChainMap from itertools import chain import perfplot def setup(n): x = dict(zip(range(n), range(n))) y = dict(zip(range(n, 2 * n), range(n, 2 * n))) return x, y def copy_update(x, y): temp = x.copy() temp.update(y) return temp def add_items(x, y): return dict(list(x.items()) + list(y.items())) def curly_star(x, y): return {**x, **y} def chain_map(x, y): return dict(ChainMap({}, y, x)) def itertools_chain(x, y): return dict(chain(x.items(), y.items())) def python39_concat(x, y): return x | y b = perfplot.bench( setup=setup, kernels=[ copy_update, add_items, curly_star, chain_map, itertools_chain, python39_concat, ], labels=[ "copy_update", "dict(list(x.items()) + list(y.items()))", "{**x, **y}", "chain_map", "itertools.chain", "x | y", ], n_range=[2 ** k for k in range(18)], xlabel="len(x), len(y)", equality_check=None, ) b.save("out.png") b.show()
While the question has already been answered several times, this simple solution to the problem has not been listed yet.

x = {'a': 1, 'b': 2}
y = {'b': 10, 'c': 11}
z4 = {}
z4.update(x)
z4.update(y)

It is as fast as z0 and the evil z2 mentioned above, but easy to understand and change.
def dict_merge(a, b):
    c = a.copy()
    c.update(b)
    return c

new = dict_merge(old, extras)

Among such shady and dubious answers, this shining example is the one and only good way to merge dicts in Python, endorsed by dictator-for-life Guido van Rossum himself! Someone else suggested half of this, but did not put it in a function.

print dict_merge(
    {'color': 'red', 'model': 'Mini'},
    {'model': 'Ferrari', 'owner': 'Carl'})

gives:

{'color': 'red', 'owner': 'Carl', 'model': 'Ferrari'}
Be Pythonic. Use a comprehension:

z = {k: v for d in [x, y] for k, v in d.items()}

>>> print z
{'a': 1, 'c': 11, 'b': 10}
If you think lambdas are evil then read no further. As requested, you can write the fast and memory-efficient solution with one expression:

x = {'a': 1, 'b': 2}
y = {'b': 10, 'c': 11}
z = (lambda a, b: (lambda a_copy: a_copy.update(b) or a_copy)(a.copy()))(x, y)

print z
{'a': 1, 'c': 11, 'b': 10}

print x
{'a': 1, 'b': 2}

As suggested above, using two lines or writing a function is probably a better way to go.
In Python 3, the items method no longer returns a list, but rather a view, which acts like a set. In this case you'll need to take the set union, since concatenating with + won't work:

dict(x.items() | y.items())

For Python 3-like behavior in version 2.7, the viewitems method should work in place of items:

dict(x.viewitems() | y.viewitems())

I prefer this notation anyways, since it seems more natural to think of it as a set union operation rather than concatenation (as the title shows).

Edit:

A couple more points for Python 3. First, note that the dict(x, **y) trick won't work in Python 3 unless the keys in y are strings.

Also, Raymond Hettinger's ChainMap answer is pretty elegant, since it can take an arbitrary number of dicts as arguments, but from the docs it looks like it sequentially looks through a list of all the dicts for each lookup:

Lookups search the underlying mappings successively until a key is found.

This can slow you down if you have a lot of lookups in your application:

In [1]: from collections import ChainMap
In [2]: from string import ascii_uppercase as up, ascii_lowercase as lo; x = dict(zip(lo, up)); y = dict(zip(up, lo))
In [3]: chainmap_dict = ChainMap(y, x)
In [4]: union_dict = dict(x.items() | y.items())
In [5]: timeit for k in union_dict: union_dict[k]
100000 loops, best of 3: 2.15 µs per loop
In [6]: timeit for k in chainmap_dict: chainmap_dict[k]
10000 loops, best of 3: 27.1 µs per loop

So about an order of magnitude slower for lookups. I'm a fan of ChainMap, but it looks less practical where there may be many lookups.
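One caveat worth a quick demonstration: when a key maps to different (hashable) values in the two dicts, both pairs land in the union, and which one the resulting dict keeps is unspecified, so this approach should only be trusted when duplicate keys carry equal values:

x = {'b': 2}
y = {'b': 10}
merged = dict(x.items() | y.items())
print(merged)  # may be {'b': 2} or {'b': 10} -- set iteration order decides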
Two dictionaries

def union2(dict1, dict2):
    return dict(list(dict1.items()) + list(dict2.items()))

n dictionaries

import itertools

def union(*dicts):
    return dict(itertools.chain.from_iterable(dct.items() for dct in dicts))

sum has bad performance. See https://mathieularose.com/how-not-to-flatten-a-list-of-lists-in-python/
Simple solution using itertools that preserves order (latter dicts have precedence):

# py2
from itertools import chain, imap
merge = lambda *args: dict(chain.from_iterable(imap(dict.iteritems, args)))

# py3
from itertools import chain
merge = lambda *args: dict(chain.from_iterable(map(dict.items, args)))

And its usage:

>>> x = {'a': 1, 'b': 2}
>>> y = {'b': 10, 'c': 11}
>>> merge(x, y)
{'a': 1, 'b': 10, 'c': 11}
>>> z = {'c': 3, 'd': 4}
>>> merge(x, y, z)
{'a': 1, 'b': 10, 'c': 3, 'd': 4}
Abuse leading to a one-expression solution for Matthew's answer:

>>> x = {'a': 1, 'b': 2}
>>> y = {'b': 10, 'c': 11}
>>> z = (lambda f=x.copy(): (f.update(y), f)[1])()
>>> z
{'a': 1, 'c': 11, 'b': 10}

You said you wanted one expression, so I abused lambda to bind a name, and tuples to override lambda's one-expression limit. Feel free to cringe.

You could also do this of course if you don't care about copying it:

>>> x = {'a': 1, 'b': 2}
>>> y = {'b': 10, 'c': 11}
>>> z = (x.update(y), x)[1]
>>> z
{'a': 1, 'b': 10, 'c': 11}
If you don't mind mutating x,

x.update(y) or x

Simple, readable, performant. You know update() always returns None, which is a false value. So the above expression will always evaluate to x, after updating it.

Most mutating methods in the standard library (like .update()) return None by convention, so this kind of pattern will work on those too. However, if you're using a dict subclass or some other method that doesn't follow this convention, then or may return its left operand, which may not be what you want. Instead, you can use a tuple display and index, which works regardless of what the first element evaluates to (although it's not quite as pretty):

(x.update(y), x)[-1]

If you don't have x in a variable yet, you can use lambda to make a local without using an assignment statement. This amounts to using lambda as a let expression, which is a common technique in functional languages, but is maybe unpythonic.

(lambda x: x.update(y) or x)({'a': 1, 'b': 2})

Although it's not that different from the following use of the new walrus operator (Python 3.8+ only):

(x := {'a': 1, 'b': 2}).update(y) or x

especially if you use a default argument:

(lambda x={'a': 1, 'b': 2}: x.update(y) or x)()

If you do want a copy, PEP 584 style x | y is the most Pythonic on 3.9+. If you must support older versions, PEP 448 style {**x, **y} is easiest for 3.5+. But if that's not available in your (even older) Python version, the let expression pattern works here too.

(lambda z=x.copy(): z.update(y) or z)()

(That is, of course, nearly equivalent to (z := x.copy()).update(y) or z, but if your Python version is new enough for that, then the PEP 448 style will be available.)
Drawing on ideas here and elsewhere I've comprehended a function:

def merge(*dicts, **kv):
    return {k: v for d in list(dicts) + [kv] for k, v in d.items()}

Usage (tested in Python 3):

assert (merge({1: 11, 'a': 'aaa'}, {1: 99, 'b': 'bbb'}, foo='bar') ==
        {1: 99, 'foo': 'bar', 'b': 'bbb', 'a': 'aaa'})
assert (merge(foo='bar') == {'foo': 'bar'})
assert (merge({1: 11}, {1: 99}, foo='bar', baz='quux') ==
        {1: 99, 'foo': 'bar', 'baz': 'quux'})
assert (merge({1: 11}, {1: 99}) == {1: 99})

You could use a lambda instead.
New in Python 3.9: Use the union operator (|) to merge dicts similar to sets:

>>> d = {'a': 1, 'b': 2}
>>> e = {'a': 9, 'c': 3}
>>> d | e
{'a': 9, 'b': 2, 'c': 3}

For matching keys, the right dict takes precedence.

This also works with |= to modify a dict in-place:

>>> e |= d  # e = e | d
>>> e
{'a': 1, 'c': 3, 'b': 2}
It's so silly that .update returns nothing.

I just use a simple helper function to solve the problem:

def merge(dict1, *dicts):
    for dict2 in dicts:
        dict1.update(dict2)
    return dict1

Examples:

merge(dict1, dict2)
merge(dict1, dict2, dict3)
merge(dict1, dict2, dict3, dict4)
merge({}, dict1, dict2)  # this one returns a new copy
(For Python 2.7.x; there are simpler solutions in Python 3.x.)

If you're not averse to importing a standard library module, you can do

from functools import reduce

def merge_dicts(*dicts):
    return reduce(lambda a, d: a.update(d) or a, dicts, {})

(The or a bit in the lambda is necessary because dict.update always returns None on success.)
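For example (later dicts win, and the inputs stay untouched because the accumulator starts as a fresh {}):

x = {'a': 1, 'b': 2}
y = {'b': 10, 'c': 11}
print(merge_dicts(x, y))  # {'a': 1, 'b': 10, 'c': 11}
print(merge_dicts())      # {} -- the initializer handles zero arguments
print(x)                  # {'a': 1, 'b': 2} -- unchanged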
The problem I have with the solutions listed to date is that, in the merged dictionary, the value for key 'b' is 10 but, to my way of thinking, it should be 12. In that light, I present the following (Python 2):

import timeit

n = 100000

su = """
x = {'a':1, 'b': 2}
y = {'b':10, 'c': 11}
"""

def timeMerge(f, su, niter):
    print "{:4f} sec for: {:30s}".format(timeit.Timer(f, setup=su).timeit(niter), f)

timeMerge("dict(x, **y)", su, n)
timeMerge("x.update(y)", su, n)
timeMerge("dict(x.items() + y.items())", su, n)
timeMerge("for k in y.keys(): x[k] = k in x and x[k]+y[k] or y[k]", su, n)

# confirm the for loop adds the 'b' entries together
x = {'a': 1, 'b': 2}
y = {'b': 10, 'c': 11}
for k in y.keys():
    x[k] = k in x and x[k] + y[k] or y[k]
print "confirm b elements are added:", x

Results:

0.049465 sec for: dict(x, **y)
0.033729 sec for: x.update(y)
0.150380 sec for: dict(x.items() + y.items())
0.083120 sec for: for k in y.keys(): x[k] = k in x and x[k]+y[k] or y[k]
confirm b elements are added: {'a': 1, 'c': 11, 'b': 12}
from collections import Counter

dict1 = {'a': 1, 'b': 2}
dict2 = {'b': 10, 'c': 11}

result = dict(Counter(dict1) + Counter(dict2))

This should solve your problem if, like the previous answer, you want shared keys' values added together rather than overwritten.
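Worth spelling out what this does with shared keys: Counter addition sums their values and silently drops non-positive results, so it matches the "add the values" reading, not the question's "y wins" semantics:

from collections import Counter

dict1 = {'a': 1, 'b': 2}
dict2 = {'b': 10, 'c': 11}
print(dict(Counter(dict1) + Counter(dict2)))  # {'a': 1, 'b': 12, 'c': 11}

print(dict(Counter({'a': 0}) + Counter({'b': -1})))  # {} -- non-positive counts vanish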
There will be a new option when Python 3.8 releases (scheduled for 20 October 2019), thanks to PEP 572: Assignment Expressions. The new assignment expression operator := allows you to assign the result of the copy and still use it to call update, leaving the combined code a single expression, rather than two statements, changing:

newdict = dict1.copy()
newdict.update(dict2)

to:

(newdict := dict1.copy()).update(dict2)

while behaving identically in every way. If you must also return the resulting dict (you asked for an expression returning the dict; the above creates and assigns to newdict, but doesn't return it, so you couldn't use it to pass an argument to a function as-is, a la myfunc((newdict := dict1.copy()).update(dict2))), then just add or newdict to the end (since update returns None, which is falsy, it will then evaluate and return newdict as the result of the expression):

(newdict := dict1.copy()).update(dict2) or newdict

Important caveat: In general, I'd discourage this approach in favor of:

newdict = {**dict1, **dict2}

The unpacking approach is clearer (to anyone who knows about generalized unpacking in the first place, which you should), doesn't require a name for the result at all (so it's much more concise when constructing a temporary that is immediately passed to a function or included in a list/tuple literal or the like), and is almost certainly faster as well, being (on CPython) roughly equivalent to:

newdict = {}
newdict.update(dict1)
newdict.update(dict2)

but done at the C layer, using the concrete dict API, so no dynamic method lookup/binding or function call dispatch overhead is involved (where (newdict := dict1.copy()).update(dict2) is unavoidably identical to the original two-liner in behavior, performing the work in discrete steps, with dynamic lookup/binding/invocation of methods).

It's also more extensible, as merging three dicts is obvious:

newdict = {**dict1, **dict2, **dict3}

where using assignment expressions won't scale like that; the closest you could get would be:

(newdict := dict1.copy()).update(dict2), newdict.update(dict3)

or without the temporary tuple of Nones, but with truthiness testing of each None result:

(newdict := dict1.copy()).update(dict2) or newdict.update(dict3)

either of which is obviously much uglier, and includes further inefficiencies (either a wasted temporary tuple of Nones for comma separation, or pointless truthiness testing of each update's None return for or separation).

The only real advantage to the assignment expression approach occurs if:

- You have generic code that needs to handle both sets and dicts (both of them support copy and update, so the code works roughly as you'd expect it to)
- You expect to receive arbitrary dict-like objects, not just dict itself, and must preserve the type and semantics of the left-hand side (rather than ending up with a plain dict). While myspecialdict({**speciala, **specialb}) might work, it would involve an extra temporary dict, and if myspecialdict has features plain dict can't preserve (e.g. regular dicts now preserve order based on the first appearance of a key, and value based on the last appearance of a key; you might want one that preserves order based on the last appearance of a key, so updating a value also moves it to the end), then the semantics would be wrong.
Since the assignment expression version uses the named methods (which are presumably overloaded to behave appropriately), it never creates a plain dict at all (unless dict1 was already a dict), preserving the original type (and the original type's semantics), all while avoiding any temporaries.
This can be done with a single dict comprehension:

>>> x = {'a': 1, 'b': 2}
>>> y = {'b': 10, 'c': 11}
>>> {key: y[key] if key in y else x[key] for key in set(x) | set(y)}

In my view the best answer for the 'single expression' part, as no extra functions are needed, and it is short.
Is it possible to "unpack" a dict in one call?
I was looking for a way to "unpack" a dictionary in a generic way and found a relevant question (and answers) which explained various techniques (TL;DR: it is not too elegant). That question, however, addresses the case where the keys of the dict are not known; the OP wanted to have them added to the local namespace automatically.

My problem is possibly simpler: I get a dict from a function and would like to dissect it on the fly, knowing the keys I will need (I may not need all of them every time). Right now I can only do

def myfunc():
    return {'a': 1, 'b': 2, 'c': 3}

x = myfunc()
a = x['a']
my_b_so_that_the_name_differs_from_the_key = x['b']
# I do not need c this time

while I was looking for the equivalent of

def myotherfunc():
    return 1, 2

a, b = myotherfunc()

but for a dict (which is what is returned by my function). I do not want to use the latter solution for several reasons, one of them being that it is not obvious which variable corresponds to which returned element (the first solution has at least the merit of being readable).

Is such an operation available?
If you really must, you can use an operator.itemgetter() object to extract values for multiple keys as a tuple:

from operator import itemgetter

a, b = itemgetter('a', 'b')(myfunc())

This is still not pretty; I'd prefer the explicit and readable separate lines where you first assign the return value, then extract those values.

Demo:

>>> from operator import itemgetter
>>> def myfunc():
...     return {'a': 1, 'b': 2, 'c': 3}
...
>>> itemgetter('a', 'b')(myfunc())
(1, 2)
>>> a, b = itemgetter('a', 'b')(myfunc())
>>> a
1
>>> b
2
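One gotcha with itemgetter: asked for a single key, it returns the bare value rather than a 1-tuple, so generic unpacking code needs a special case:

from operator import itemgetter

d = {'a': 1, 'b': 2, 'c': 3}
print(itemgetter('a', 'b')(d))  # (1, 2) -- a tuple
print(itemgetter('a')(d))       # 1     -- the bare value, not (1,)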
You could also use map:

def myfunc():
    return {'a': 1, 'b': 2, 'c': 3}

a, b = map(myfunc().get, ["a", "b"])
print(a, b)
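A difference from the itemgetter approach worth knowing: dict.get silently returns None for missing keys, where itemgetter raises KeyError:

d = {'a': 1, 'b': 2, 'c': 3}
a, z = map(d.get, ["a", "z"])
print(a, z)  # 1 None -- itemgetter('a', 'z')(d) would raise KeyError instead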
In addition to the operator.itemgetter() method, you can also write your own myotherfunc(). It takes a list of the required keys as an argument and returns a tuple of their corresponding values.

def myotherfunc(keys_list):
    reference_dict = myfunc()
    return tuple(reference_dict[key] for key in keys_list)

>>> a, b = myotherfunc(['a', 'b'])
>>> a
1
>>> b
2
>>> a, c = myotherfunc(['a', 'c'])
>>> a
1
>>> c
3
Destructuring-bind dictionary contents
I am trying to 'destructure' a dictionary and associate values with variables named after its keys. Something like

params = {'a': 1, 'b': 2}
a, b = params.values()

But since dictionaries are not ordered, there is no guarantee that params.values() will return values in the order of (a, b). Is there a nice way to do this?
from operator import itemgetter

params = {'a': 1, 'b': 2}
a, b = itemgetter('a', 'b')(params)

Instead of elaborate lambda functions or dictionary comprehensions, you may as well use a built-in library.
One way to do this with less repetition than Jochen's suggestion is with a helper function. This gives the flexibility to list your variable names in any order and only destructure a subset of what is in the dict:

pluck = lambda dict, *args: (dict[arg] for arg in args)

things = {'blah': 'bleh', 'foo': 'bar'}
foo, blah = pluck(things, 'foo', 'blah')

Also, instead of joaquin's OrderedDict you could sort the keys and get the values. The only catches are that you need to specify your variable names in alphabetical order and destructure everything in the dict:

sorted_vals = lambda dict: (t[1] for t in sorted(dict.items()))

things = {'foo': 'bar', 'blah': 'bleh'}
blah, foo = sorted_vals(things)
How come nobody posted the simplest approach?

params = {'a': 1, 'b': 2}
a, b = params['a'], params['b']
Python is only able to "destructure" sequences, not dictionaries. So, to write what you want, you will have to map the needed entries to a proper sequence. As for myself, the closest match I could find is the (not very sexy):

a, b = [d[k] for k in ('a', 'b')]

This works with generators too:

a, b = (d[k] for k in ('a', 'b'))

Here is a full example:

>>> d = dict(a=1, b=2, c=3)
>>> d
{'a': 1, 'c': 3, 'b': 2}
>>> a, b = [d[k] for k in ('a', 'b')]
>>> a
1
>>> b
2
>>> a, b = (d[k] for k in ('a', 'b'))
>>> a
1
>>> b
2
Here's another way to do it, similar to how a destructuring assignment works in JS:

params = {'b': 2, 'a': 1}
a, b, rest = (lambda a, b, **rest: (a, b, rest))(**params)

What we did was to unpack the params dictionary into key values (using **) (like in Jochen's answer), then we've taken those values in the lambda signature and assigned them according to the key name. And, as a bonus, we also get a dictionary of whatever is not in the lambda's signature, so if you had:

params = {'b': 2, 'a': 1, 'c': 3}
a, b, rest = (lambda a, b, **rest: (a, b, rest))(**params)

after the lambda has been applied, the rest variable will now contain:

{'c': 3}

Useful for omitting unneeded keys from a dictionary.

Hope this helps.
Maybe you really want to do something like this?

def some_func(a, b):
    print a, b

params = {'a': 1, 'b': 2}

some_func(**params)  # equiv to some_func(a=1, b=2)
If you are afraid of the issues involved in the use of the locals dictionary and you prefer to follow your original strategy, Ordered Dictionaries from Python 2.7 and 3.1 (collections.OrderedDict) allow you to recover your dictionary items in the order in which they were first inserted.
(Ab)using the import system

The from ... import statement lets us destructure and bind attribute names of an object. Of course, it only works for objects in the sys.modules dictionary, so one could use a hack like this:

import sys, types

mydict = {'a': 1, 'b': 2}
sys.modules["mydict"] = types.SimpleNamespace(**mydict)

from mydict import a, b

A somewhat more serious hack would be to write a context manager to load and unload the module:

with obj_as_module(mydict, "mydict_module"):
    from mydict_module import a, b

By pointing the __getattr__ method of the module directly to the __getitem__ method of the dict, the context manager can also avoid using SimpleNamespace(**mydict).

See this answer for an implementation and some extensions of the idea.

One can also temporarily replace the entire sys.modules dict with the dict of interest, and do import a, b without from.
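The answer links elsewhere for the context manager; as a rough illustration only, here is one minimal way obj_as_module could be written (the SimpleNamespace variant, assuming CPython's import machinery consults sys.modules first):

import sys
import types
from contextlib import contextmanager

@contextmanager
def obj_as_module(obj, name):
    """Temporarily expose a dict's keys as attributes of a fake module."""
    sys.modules[name] = types.SimpleNamespace(**obj)
    try:
        yield
    finally:
        del sys.modules[name]

mydict = {'a': 1, 'b': 2}
with obj_as_module(mydict, "mydict_module"):
    from mydict_module import a, b

print(a, b)  # 1 2 -- the names stay bound after the module is unloaded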
Warning 1: as stated in the docs, this is not guaranteed to work on all Python implementations:

CPython implementation detail: This function relies on Python stack frame support in the interpreter, which isn't guaranteed to exist in all implementations of Python. If running in an implementation without Python stack frame support this function returns None.

Warning 2: this function does make the code shorter, but it probably contradicts the Python philosophy of being as explicit as you can. Moreover, it doesn't address the issues pointed out by John Christopher Jones in the comments, although you could make a similar function that works with attributes instead of keys. This is just a demonstration that you can do that if you really want to!

import inspect

def destructure(dict_):
    if not isinstance(dict_, dict):
        raise TypeError(f"{dict_} is not a dict")
    # the parent frame will contain the information about
    # the current line
    parent_frame = inspect.currentframe().f_back
    # so we extract that line (by default the code context
    # only contains the current line)
    (line,) = inspect.getframeinfo(parent_frame).code_context
    # "hello, key = destructure(my_dict)"
    # -> ("hello, key ", "=", " destructure(my_dict)")
    lvalues, _equals, _rvalue = line.strip().partition("=")
    # -> ["hello", "key"]
    keys = [s.strip() for s in lvalues.split(",") if s.strip()]
    if missing := [key for key in keys if key not in dict_]:
        raise KeyError(*missing)
    for key in keys:
        yield dict_[key]

In [5]: my_dict = {"hello": "world", "123": "456", "key": "value"}
In [6]: hello, key = destructure(my_dict)
In [7]: hello
Out[7]: 'world'
In [8]: key
Out[8]: 'value'

This solution allows you to pick some of the keys, not all, like in JavaScript. It's also safe for user-provided dictionaries.
With Python 3.10, you can do:

d = {"a": 1, "b": 2}

match d:
    case {"a": a, "b": b}:
        print(f"A is {a} and b is {b}")

but it adds two extra levels of indentation, and you still have to repeat the key names.
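Mapping patterns can also capture the leftover keys with **rest (still 3.10+), which gets a bit closer to the JavaScript destructuring idiom:

d = {"a": 1, "b": 2, "c": 3}

match d:
    case {"a": a, **rest}:
        print(a, rest)  # 1 {'b': 2, 'c': 3}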
Look to the other answers, as this won't cater for an unexpected ordering of the dictionary; I will update this with a correct version sometime soon. Try this:

data = {'a': 'Apple', 'b': 'Banana', 'c': 'Carrot'}
keys = data.keys()
a, b, c = [data[k] for k in keys]

result:

a == 'Apple'
b == 'Banana'
c == 'Carrot'
Well, if you want these in a class you can always do this:

class AttributeDict(dict):
    def __init__(self, *args, **kwargs):
        super(AttributeDict, self).__init__(*args, **kwargs)
        self.__dict__.update(self)

d = AttributeDict(a=1, b=2)
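Usage then looks like this (note one caveat of this minimal class: keys added after construction don't become attributes):

d = AttributeDict(a=1, b=2)
print(d.a, d.b)   # 1 2 -- attribute access works
print(d['a'])     # 1   -- it is still a normal dict
d['c'] = 3
print(hasattr(d, 'c'))  # False -- __dict__ was only synced in __init__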
Based on @ShawnFumo's answer I came up with this:

def destruct(dict):
    return (t[1] for t in sorted(dict.items()))

d = {'b': 'Banana', 'c': 'Carrot', 'a': 'Apple'}
a, b, c = destruct(d)

(Notice the order of items in the dict.)
An old topic, but I found this to be a useful method:

data = {'a': 'Apple', 'b': 'Banana', 'c': 'Carrot'}

for key in data.keys():
    locals()[key] = data[key]

This method loops over every key in your dictionary and sets a variable to that name and then assigns the value from the associated key to this new variable.

Testing:

print(a)
print(b)
print(c)

Output:

Apple
Banana
Carrot
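A caveat worth adding, at least for CPython: this only works reliably at module scope. Inside a function, locals() returns a snapshot, and writes to it don't create real local variables. A sketch of the failure mode:

def broken():
    data = {'a': 'Apple'}
    for key in data.keys():
        locals()[key] = data[key]  # writes to a throwaway snapshot in CPython
    return a  # not a local: falls back to a global 'a', or NameError if none exists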
An easy and simple way to destructure a dict in Python:

params = {"a": 1, "b": 2}
a, b = [params[key] for key in ("a", "b")]
print(a, b)
# Output:
# 1 2
I don't know whether it's good style, but

locals().update(params)

will do the trick. You then have a, b and whatever was in your params dict available as corresponding local variables.
Since dictionaries are guaranteed to keep their insertion order in Python >= 3.7, that means that it's completely safe and idiomatic to just do this nowadays:

params = {'a': 1, 'b': 2}
a, b = params.values()
print(a)
print(b)

Output:

1
2
Python "extend" for a dictionary
What is the best way to extend a dictionary with another one while avoiding the use of a for loop? For instance:

>>> a = {"a": 1, "b": 2}
>>> b = {"c": 3, "d": 4}
>>> a
{'a': 1, 'b': 2}
>>> b
{'c': 3, 'd': 4}

Result:

{"a": 1, "b": 2, "c": 3, "d": 4}

Something like:

a.extend(b)  # This does not work
a.update(b)

Latest Python Standard Library Documentation
A beautiful gem in this closed question:

The "oneliner way", altering neither of the input dicts, is

basket = dict(basket_one, **basket_two)

Learn what **basket_two (the **) means here.

In case of conflict, the items from basket_two will override the ones from basket_one. As one-liners go, this is pretty readable and transparent, and I have no compunction against using it any time a dict that's a mix of two others comes in handy (any reader who has trouble understanding it will in fact be very well served by the way this prompts him or her towards learning about dict and the ** form ;-). So, for example, uses like:

x = mungesomedict(dict(adict, **anotherdict))

are reasonably frequent occurrences in my code.

Originally submitted by Alex Martelli

Note: In Python 3, this will only work if every key in basket_two is a string.
Have you tried using dictionary unpacking:

a = {'a': 1, 'b': 2}
b = {'c': 3, 'd': 4}

c = {**a, **b}
# c = {"a": 1, "b": 2, "c": 3, "d": 4}

Another way of doing it is by using dict(iterable, **kwarg):

c = dict(a, **b)
# c = {'a': 1, 'b': 2, 'c': 3, 'd': 4}

In Python 3.9 you can add two dicts using the union | operator:

# use the merging operator |
c = a | b
# c = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
a.update(b)

will add keys and values from b to a, overwriting if there's already a value for a key.
As others have mentioned, a.update(b) for some dicts a and b will achieve the result you've asked for in your question. However, I want to point out that many times I have seen the extend method of mapping/set objects desire that, in the syntax a.extend(b), a's values should NOT be overwritten by b's values. a.update(b) overwrites a's values, and so isn't a good choice for extend.

Note that some languages call this method defaults or inject, as it can be thought of as a way of injecting b's values (which might be a set of default values) into a dictionary without overwriting values that might already exist.

Of course, you could simply note that a.extend(b) is nearly the same as b.update(a); a = b. To remove the assignment, you could do it thus:

def extend(a, b):
    """Create a new dictionary with a's properties extended by b,
    without overwriting.

    >>> extend({'a':1,'b':2},{'b':3,'c':4})
    {'a': 1, 'c': 4, 'b': 2}
    """
    return dict(b, **a)

Thanks to Tom Leys for that smart idea of using a side-effect-less dict constructor for extend.
Notice that since Python 3.9 a much easier syntax was introduced (Union Operators):

d1 = {'a': 1}
d2 = {'b': 2}

extended_dict = d1 | d2
# {'a': 1, 'b': 2}

Pay attention: in case the first dict shares keys with the second dict, position matters!

d1 = {'b': 1}
d2 = {'b': 2}

d1 | d2
# {'b': 2}

Relevant PEP
You can also use Python's collections.ChainMap, which was introduced in Python 3.3.

from collections import ChainMap

c = ChainMap(a, b)
c['a']  # returns 1

This has a few possible advantages, depending on your use-case. They are explained in more detail here, but I'll give a brief overview:

- A chainmap only uses views of the dictionaries, so no data is actually copied. This results in faster chaining (but slower lookup).
- No keys are actually overwritten so, if necessary, you know whether the data comes from a or b.

This mainly makes it useful for things like configuration dictionaries.
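Because ChainMap holds views rather than copies, later changes to the underlying dicts show through; a small demonstration:

from collections import ChainMap

a = {'a': 1, 'b': 2}
b = {'c': 3, 'd': 4}
c = ChainMap(a, b)

a['a'] = 99
print(c['a'])   # 99 -- the ChainMap sees the change to the underlying dict
print(dict(c))  # materialize a plain dict snapshot if you need one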
In terms of efficiency, it seems faster to use the unpack operation, compared with the update method. Here is an image of a test I did:
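For anyone who wants to rerun a comparison like this, here is a rough timeit sketch (not the original benchmark; absolute numbers will vary by machine and Python version):

import timeit

setup = "x = dict.fromkeys(range(1000)); y = dict.fromkeys(range(1000, 2000))"
print(timeit.timeit("{**x, **y}", setup=setup, number=10000))
print(timeit.timeit("z = x.copy(); z.update(y)", setup=setup, number=10000))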