Append a dictionary to a dictionary [duplicate] - python

This question already has answers here:
Python "extend" for a dictionary
(8 answers)
Closed 5 years ago.
I have two existing dictionaries, and I wish to 'append' one of them to the other. By that I mean that the key/value pairs of the other dictionary should be added to the first dictionary. For example:
orig = {
'A': 1,
'B': 2,
'C': 3,
}
extra = {
'D': 4,
'E': 5,
}
dest = # Something here involving orig and extra
print dest
{
'A': 1,
'B': 2,
'C': 3,
'D': 4,
'E': 5
}
I think this can all be achieved with a for loop (maybe?), but is there a dictionary method or a module that does this job for me? The actual dictionaries I'm using are really big...

You can do
orig.update(extra)
or, if you don't want orig to be modified, make a copy first:
dest = dict(orig) # or orig.copy()
dest.update(extra)
Note that if extra and orig have overlapping keys, the final value will be taken from extra. For example,
>>> d1 = {1: 1, 2: 2}
>>> d2 = {2: 'ha!', 3: 3}
>>> d1.update(d2)
>>> d1
{1: 1, 2: 'ha!', 3: 3}

There are two ways to add one dictionary to another:
Update (modifies orig in place)
orig.update(extra) # Python 2.7+
orig |= extra # Python 3.9+
Merge (creates a new dictionary)
# Python 2.7+
dest = {k: v for d in (orig, extra) for (k, v) in d.items()}
# Python 3.3+ (a view over both dicts, not a copy; the first mapping wins, so extra goes first)
dest = collections.ChainMap(extra, orig)
# Python 3.5+
dest = {**orig, **extra}
dest = {**orig, 'D': 4, 'E': 5}
# Python 3.9+
dest = orig | extra
Caveats
Note that these operations are not commutative. When a key appears in both dictionaries, the value from the latter one wins. E.g.
orig = {'A': 1, 'B': 2}
extra = {'A': 3, 'C': 3}
dest = orig | extra
# dest = {'A': 3, 'B': 2, 'C': 3}
dest = extra | orig
# dest = {'A': 1, 'B': 2, 'C': 3}
It is also important to note that dicts preserve insertion order only from Python 3.7 (and, as an implementation detail, CPython 3.6). In earlier versions, the order of the items in the dictionary may vary.

dict.update() looks like it will do what you want...
>>> orig.update(extra)
>>> orig
{'A': 1, 'C': 3, 'B': 2, 'E': 5, 'D': 4}
>>>
Perhaps, though, you don't want to update your original dictionary, but work on a copy:
>>> dest = orig.copy()
>>> dest.update(extra)
>>> orig
{'A': 1, 'C': 3, 'B': 2}
>>> dest
{'A': 1, 'C': 3, 'B': 2, 'E': 5, 'D': 4}

Assuming that you do not want to change orig, you can either do a copy and update like the other answers, or you can create a new dictionary in one step by passing all items from both dictionaries into the dict constructor:
from itertools import chain
dest = dict(chain(orig.items(), extra.items()))
Or without itertools:
dest = dict(list(orig.items()) + list(extra.items()))
Note that you only need to pass the result of items() into list() on Python 3; on 2.x dict.items() already returns a list, so you can just do dict(orig.items() + extra.items()).
As a more general use case, say you have a larger list of dicts that you want to combine into a single dict, you could do something like this:
from itertools import chain
dest = dict(chain.from_iterable(map(dict.items, list_of_dicts)))
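As a rough sketch of that general case (the sample `list_of_dicts` below is made up for illustration), the chained version and a plain loop over update() should give the same result, with later dictionaries winning on duplicate keys:

```python
from itertools import chain

# Hypothetical sample data, not taken from the question
list_of_dicts = [{'A': 1, 'B': 2}, {'B': 20, 'C': 3}, {'D': 4}]

# One-step version: feed every (key, value) pair to the dict constructor
merged_chain = dict(chain.from_iterable(d.items() for d in list_of_dicts))

# Equivalent explicit loop; each update() overwrites earlier values
merged_loop = {}
for d in list_of_dicts:
    merged_loop.update(d)

print(merged_chain)
print(merged_loop == merged_chain)
```

The generator expression is used instead of map(dict.items, ...) so that dict subclasses with their own items() also behave as expected.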

A three-liner to combine or merge two dictionaries:
dest = {}
dest.update(orig)
dest.update(extra)
This creates a new dictionary dest without modifying orig and extra.
Note: If a key has different values in orig and extra, then extra overrides orig.

There is the .update() method :)
update([other])
Update the dictionary with the key/value pairs from other, overwriting existing keys. Return None.
update() accepts either another dictionary object or an iterable of key/value pairs (as tuples or other iterables of length two). If
keyword arguments are specified, the dictionary is then updated with
those key/value pairs: d.update(red=1, blue=2).
Changed in version 2.4: Allowed the argument to be an iterable of key/value pairs and allowed keyword arguments.
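A short sketch of the three call forms described in that documentation excerpt (the colour keys are just placeholders):

```python
d = {'red': 0}

d.update({'green': 1})        # another dictionary
d.update([('blue', 2)])       # an iterable of key/value pairs
d.update(red=1, yellow=3)     # keyword arguments; overwrites the existing 'red'

result = d.update({})         # update() works in place and returns None

print(d)
print(result)
```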

The answer I want to give is "use collections.ChainMap", but I just discovered that it was only added in Python 3.3: https://docs.python.org/3.3/library/collections.html#chainmap-objects
You can try to crib the class from the 3.3 source though: http://hg.python.org/cpython/file/3.3/Lib/collections/init.py#l763
Here is a less feature-full Python 2.x compatible version (same author): http://code.activestate.com/recipes/305268-chained-map-lookups/
Instead of expanding/overwriting one dictionary with another using dict.update, or creating an additional copy merging both, you create a lookup chain that searches both in order. Because it doesn't duplicate the mappings it wraps, ChainMap uses very little memory and sees later modifications to any sub-mapping. Because order matters, you can also use the chain to layer defaults (e.g. user prefs > config > env).
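To make the layering idea concrete, here is a minimal sketch on Python 3.3+. The three dictionaries and their keys are hypothetical; only collections.ChainMap itself comes from the answer.

```python
from collections import ChainMap  # Python 3.3+

# Hypothetical layers, highest priority first
user_prefs = {'color': 'blue'}
config = {'color': 'red', 'font': 'mono'}
env = {'font': 'serif', 'lang': 'en'}

settings = ChainMap(user_prefs, config, env)

color = settings['color']   # found in user_prefs, which shadows config
font = settings['font']     # not in user_prefs, so config answers
lang = settings['lang']     # only env has it

# Nothing is copied, so a later change to a layer is visible through the chain
config['size'] = 12
size = settings['size']

# Writes through the chain go to the first mapping only
settings['lang'] = 'fr'
```

After the last line, env still holds 'en'; the new value lives in user_prefs and shadows it.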

Related

How to merge two or more dict into one dict with retaining multiple values of same key as list?

I have two or more dictionaries that I would like to merge into one, keeping multiple values of the same key as a list. I am not able to share the original code, so please help me with the following example.
Input:
a= {'a':1, 'b': 2}
b= {'aa':4, 'b': 6}
c= {'aa':3, 'c': 8}
Output:
c= {'a':1,'aa':[3,4],'b': [2,6], 'c': 8}
I suggest you read up on defaultdict: it lets you provide a factory function that initializes missing keys, i.e. if a key is looked up but not found, it creates a value by calling the factory with no arguments. See this example, it might make things clearer:
from collections import defaultdict
a = {'a': 1, 'b': 2}
b = {'aa': 4, 'b': 6}
c = {'aa': 3, 'c': 8}
stuff = [a, b, c]
# our factory method is the list-constructor `list`,
# so whenever we look up a value that doesn't exist, a list is created;
# we can always be sure that we have list-values
store = defaultdict(list)
for s in stuff:
    for k, v in s.items():
        # since we know that our value is always a list, we can safely append
        store[k].append(v)
print(store)
This has the "downside" of creating one-element lists for single occurrences of values, but maybe you are able to work around that.
The code below should resolve your issue.
from collections import defaultdict
a = {'a':1, 'b': 2}
b = {'aa':4, 'b': 6}
c={'aa':3, 'c': 8}
dd = defaultdict(list)
for d in (a,b,c):
    for key, value in d.items():
        dd[key].append(value)
print(dd)
Use defaultdict to automatically create a dictionary entry with an empty list.
To process all source dictionaries in a single loop, use itertools.chain.
The main loop just appends the value of the current item to the list under the current key.
For keys that end up with only one item, as you asked, build a temporary dictionary (using a dictionary comprehension) limited to the entries whose list contains exactly one element. Each value in it is the first (and only) element of that list.
Then use this temporary dictionary to update d.
So the whole script can be surprisingly short, as below:
from collections import defaultdict
from itertools import chain
a = {'a':1, 'b': 2}
b = {'aa':4, 'b': 6}
c = {'aa':3, 'c': 8}
d = defaultdict(list)
for k, v in chain(a.items(), b.items(), c.items()):
    d[k].append(v)
d.update({ k: v[0] for k, v in d.items() if len(v) == 1 })
As you can see, the actual processing code is contained in only 4 (last) lines.
If you print d, the result is:
defaultdict(list, {'a': 1, 'b': [2, 6], 'aa': [4, 3], 'c': 8})
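The same approach can be wrapped in a small function. This is only a sketch: the name `merge_keep_duplicates` is made up for illustration, and unlike the script above it returns a plain dict rather than a defaultdict.

```python
from collections import defaultdict
from itertools import chain


def merge_keep_duplicates(*dicts):
    """Merge dicts; a key seen more than once gets a list of all its values."""
    grouped = defaultdict(list)
    for key, value in chain.from_iterable(d.items() for d in dicts):
        grouped[key].append(value)
    # Unwrap one-element lists so keys that occur once keep their bare value
    return {k: (v[0] if len(v) == 1 else v) for k, v in grouped.items()}


a = {'a': 1, 'b': 2}
b = {'aa': 4, 'b': 6}
c = {'aa': 3, 'c': 8}
merged = merge_keep_duplicates(a, b, c)
print(merged)
```

One caveat of this design: if an input value is itself a list, the result no longer tells you whether a list came from a single key or from several.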

Python : Get list by index not key in defaultdict

I'm new to python and I have become stuck on a data type issue.
I have a script which looks a bit like this
dd = defaultdict(list)
for i in arr:
    dd[color].append(i)
which creates a default dict which resembles something along the lines of
dd = [('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
However I need to now access the first list([2,4]). I have tried
print(dd[0])
but this gave me the following output
[][][]
I know the defaultdict has data in it, as I have printed it in its entirety. However, other than accessing a list by its dictionary key, I don't know how to get at it, and I don't know the names of the keys until I populate the dict.
I have thought about creating a list of lists rather than a defaultdict, but being able to search by key is going to be really useful for another part of the code, so I would like to keep this data structure if possible.
Is there a way to grab the list by an index number, or can you only do it using a key?
You can get a list of keys, pick the key by index, then access that key.
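print(dd[list(dd.keys())[0]])  # list() is needed on Python 3, where keys() returns a view
Note that before Python 3.7 a dictionary is an unordered collection, so the order of keys is undefined (from 3.7 on, dicts keep insertion order). Consider the following example: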
from collections import defaultdict
d = defaultdict (int)
d['a'] = 1
d['b'] = 2
d['c'] = 3
d['d'] = 4
d['e'] = 5
print (d)
My Python2 gives:
defaultdict(<type 'int'>, {'a': 1, 'c': 3, 'b': 2, 'e': 5, 'd': 4})
Python3 output is different by the way:
defaultdict(<class 'int'>, {'c': 3, 'b': 2, 'a': 1, 'e': 5, 'd': 4})
So, you will have to use some other means to remember the order in which you populate the dictionary. Either maintain a separate list of keys (colors) in the order you need, or use OrderedDict.
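A minimal sketch of the OrderedDict route, assuming the data arrives as (color, item) pairs; the `pairs` list is a hypothetical stand-in for whatever populates the dict in the real script:

```python
from collections import OrderedDict
from itertools import islice

# Hypothetical (color, item) pairs standing in for the asker's data
pairs = [('blue', 2), ('red', 1), ('yellow', 1), ('blue', 4), ('yellow', 3)]

dd = OrderedDict()
for color, item in pairs:
    # setdefault plays the role of defaultdict(list) here
    dd.setdefault(color, []).append(item)

first_key = next(iter(dd))      # the first key that was inserted
first_list = dd[first_key]

# Value at position 1 without building a full list of keys
second_list = next(islice(dd.values(), 1, None))

print(first_key, first_list, second_list)
```

Lookup by key still works as before, so the rest of the code that searches by color would not need to change.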

How Can I Get a Subset of an Object's Properties as a Python Dictionary?

Short Version:
In Python is there a way to (cleanly/elegantly) say "Give me these 5 (or however many) properties of an object, and nothing else, as a dictionary"?
Longer Version:
Using the Javascript Underscore library, I can reduce a bunch of objects/dictionaries (in JS they're the same thing) to a bunch of subsets of their properties like so:
var subsets = _(someObjects).map(function(someObject) {
    return _(someObject).pick(['a', 'd']);
});
If I want to do the same thing with a Python object (not a dictionary) however it seems like the best I can do is use a list comprehension and manually set each property:
subsets = [{"a": x.a, "d": x.d} for x in someObjects]
That doesn't look so bad when there's only two properties, and they're both one letter, but it gets uglier fast if I start having more/longer properties (plus I feel wrong whenever I write a multi-line list comprehension). I could turn the whole thing in to a function that uses a for loop, but before I do that, is there any cool built-in Python utility thing that I can use to do this as cleanly (or even more cleanly) than the JS version?
This can be done simply by combining a list comprehension with a dictionary comprehension.
subsets = [{attr: getattr(x, attr) for attr in ["a", "d"]}
for x in someObjects]
Naturally, you could distill out that comprehension if you wanted to:
def pick(x, *attrs):
    return {attr: getattr(x, attr) for attr in attrs}
subsets = [pick(x, "a", "d") for x in someObjects]
>>> A = ['a', 'c']
>>> O = [{'a': 1, 'b': 2, 'c': 3}, {'a': 11, 'b': 22, 'c': 33, 'd': 44}]
>>> [{a: o[a] for a in A} for o in O]
[{'a': 1, 'c': 3}, {'a': 11, 'c': 33}]
>>> list(map(lambda o: {a: o[a] for a in A}, O))
[{'a': 1, 'c': 3}, {'a': 11, 'c': 33}]
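For objects (rather than dictionaries), operator.attrgetter from the standard library is another option. The sketch below uses a made-up `Point` class purely as a stand-in for whatever objects are being processed:

```python
from operator import attrgetter


class Point:
    """Stand-in class for illustration only."""
    def __init__(self, a, b, d):
        self.a, self.b, self.d = a, b, d


some_objects = [Point(1, 2, 3), Point(10, 20, 30)]

# With several names, attrgetter returns a tuple of values in the same order,
# so zipping it with the names rebuilds the {name: value} mapping
names = ('a', 'd')
getter = attrgetter(*names)
subsets = [dict(zip(names, getter(obj))) for obj in some_objects]

print(subsets)
```

Note that with a single name attrgetter returns the bare value instead of a tuple, so this pattern assumes at least two attribute names.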

Not getting the same result when inverting a dictionary twice

I'm trying to invert a simple dictionary like:
{'a' : 1, 'b' : 2, 'c' : 3, 'd' : 4}
I'm using this function:
def invert(d):
    return dict([(x,y) for y,x in d.iteritems()])
Now when I invert my dictionary, everything works out fine. When I invert it twice however, I get:
{'a': 1, 'c': 3, 'b': 2, 'd': 4}
which is not in the same order as the dictionary I started with. Is there a problem with my invert function? Sorry I'm kinda new to python, but thanks for any help!
That is correct: dictionaries are unordered in Python (before 3.7).
From another SO answer:
CPython implementation detail: Keys and values are listed in an
arbitrary order which is non-random, varies across Python
implementations, and depends on the dictionary’s history of insertions
and deletions.
from the docs:
It is best to think of a dictionary as an unordered set of key: value
pairs, with the requirement that the keys are unique (within one
dictionary). A pair of braces creates an empty dictionary: {}. Placing
a comma-separated list of key:value pairs within the braces adds
initial key:value pairs to the dictionary; this is also the way
dictionaries are written on output.
Python dictionaries are unsorted by design.
You can use collections.OrderedDict instead if you really need this behaviour.
Try running this code:
d = {
'a' : 1, 'b' : 2,
'c' : 3, 'd' : 4
}
def invert(d):
    return dict([(x,y) for y,x in d.iteritems()])
print d
d = invert(d)
print d
d = invert(d)
print d
This is the output:
{'a': 1, 'c': 3, 'b': 2, 'd': 4}
{1: 'a', 2: 'b', 3: 'c', 4: 'd'}
{'a': 1, 'c': 3, 'b': 2, 'd': 4}
As you can see, it is the same dictionary; the order in which you declared the items is simply not preserved.
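For comparison, a Python 3 sketch of the same experiment. It assumes Python 3.7+, where plain dicts keep insertion order, and unique, hashable values; `invert_ordered` is a hypothetical helper showing the OrderedDict variant.

```python
from collections import OrderedDict


def invert(d):
    """Swap keys and values; assumes the values are unique and hashable."""
    return {value: key for key, value in d.items()}


def invert_ordered(d):
    """Same idea, but the result records its order explicitly."""
    return OrderedDict((value, key) for key, value in d.items())


d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
twice = invert(invert(d))

# Plain dict equality never depends on order
same_content = (twice == d)

# On Python 3.7+ the key order survives the round trip as well
same_order = (list(twice) == list(d))

# OrderedDict comparison is order-sensitive, so this checks both at once
od = OrderedDict(d)
ordered_round_trip = (invert_ordered(invert_ordered(od)) == od)
```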

Python "extend" for a dictionary

What is the best way to extend a dictionary with another one while avoiding the use of a for loop? For instance:
>>> a = { "a" : 1, "b" : 2 }
>>> b = { "c" : 3, "d" : 4 }
>>> a
{'a': 1, 'b': 2}
>>> b
{'c': 3, 'd': 4}
Result:
{ "a" : 1, "b" : 2, "c" : 3, "d" : 4 }
Something like:
a.extend(b) # This does not work
a.update(b)
Latest Python Standard Library Documentation
A beautiful gem in this closed question:
The "oneliner way", altering neither of the input dicts, is
basket = dict(basket_one, **basket_two)
Learn what **basket_two (the **) means here.
In case of conflict, the items from basket_two will override the ones from basket_one. As one-liners go, this is pretty readable and transparent, and I have no compunction against using it any time a dict that's a mix of two others comes in handy (any reader who has trouble understanding it will in fact be very well served by the way this prompts him or her towards learning about dict and the ** form;-). So, for example, uses like:
x = mungesomedict(dict(adict, **anotherdict))
are reasonably frequent occurrences in my code.
Originally submitted by Alex Martelli
Note: In Python 3, this will only work if every key in basket_two is a string.
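A small sketch of that limitation on Python 3, with made-up basket contents, followed by two alternatives that accept any hashable key:

```python
basket_one = {'apple': 1}
basket_two = {2: 'two'}           # a non-string key

try:
    basket = dict(basket_one, **basket_two)
except TypeError:
    # Python 3 rejects this: keyword argument names must be strings
    basket = None

# Alternatives that do not go through keyword arguments
merged_unpack = {**basket_one, **basket_two}   # Python 3.5+

merged_copy = dict(basket_one)
merged_copy.update(basket_two)
```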
Have you tried dictionary unpacking (Python 3.5+)?
a = {'a': 1, 'b': 2}
b = {'c': 3, 'd': 4}
c = {**a, **b}
# c = {"a": 1, "b": 2, "c": 3, "d": 4}
Another way of doing it is by using dict(mapping, **kwargs):
c = dict(a, **b)
# c = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
In Python 3.9+ you can merge two dicts using the union operator |
# use the merging operator |
c = a | b
# c = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
a.update(b)
Will add keys and values from b to a, overwriting if there's already a value for a key.
As others have mentioned, a.update(b) for some dicts a and b will achieve the result you've asked for in your question. However, I want to point out that I have often seen extend methods on mapping/set objects where, with the syntax a.extend(b), a's values should NOT be overwritten by b's values. a.update(b) overwrites a's values, and so isn't a good choice for extend.
Note that some languages call this method defaults or inject, as it can be thought of as a way of injecting b's values (which might be a set of default values) in to a dictionary without overwriting values that might already exist.
Of course, you could simply note that a.extend(b) is nearly the same as b.update(a); a=b. To remove the assignment, you could do it thus:
def extend(a,b):
    """Create a new dictionary with a's properties extended by b,
    without overwriting.
    >>> extend({'a':1,'b':2},{'b':3,'c':4})
    {'a': 1, 'c': 4, 'b': 2}
    """
    return dict(b,**a)
Thanks to Tom Leys for that smart idea using a side-effect-less dict constructor for extend.
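Because dict(b, **a) needs string keys on Python 3, here is a sketch of the same "extend without overwriting" idea that works for any hashable key. The name `extend_in_place` is hypothetical, and this is one possible variant rather than the method from the answer above.

```python
def extend(a, b):
    """Return a new dict: a's items plus any items of b whose keys a lacks."""
    result = dict(b)      # start from a copy of b ...
    result.update(a)      # ... then let a overwrite, so a's values survive
    return result


def extend_in_place(a, b):
    """Add b's items to a, leaving keys that a already has untouched."""
    for key, value in b.items():
        a.setdefault(key, value)


merged = extend({'a': 1, 'b': 2}, {'b': 3, 'c': 4})

target = {1: 'one'}
extend_in_place(target, {1: 'uno', 2: 'dos'})
```

The first function leaves both inputs unchanged; the second mutates its first argument, which mirrors how update() behaves.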
Notice that since Python 3.9 a much easier syntax was introduced (Union Operators):
d1 = {'a': 1}
d2 = {'b': 2}
extended_dict = d1 | d2
>> {'a':1, 'b': 2}
Pay attention: if the two dicts share keys, the order of the operands matters! The value from the right-hand dict wins.
d1 = {'b': 1}
d2 = {'b': 2}
d1 | d2
>> {'b': 2}
Relevant PEP
You can also use python's collections.ChainMap which was introduced in python 3.3.
from collections import ChainMap
c = ChainMap(a, b)
c['a'] # returns 1
This has a few possible advantages, depending on your use-case. They are explained in more detail here, but I'll give a brief overview:
A chainmap only uses views of the dictionaries, so no data is actually copied. This results in faster chaining (but slower lookup)
No keys are actually overwritten so, if necessary, you know whether the data comes from a or b.
This mainly makes it useful for things like configuration dictionaries.
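A brief sketch of the "nothing is overwritten" point, using two small dictionaries with an overlapping key (chosen here for illustration; they differ from the a and b in the question):

```python
from collections import ChainMap

a = {'a': 1, 'b': 2}
b = {'b': 20, 'c': 3}
c = ChainMap(a, b)

value = c['b']    # the first mapping that contains the key answers

# .maps exposes the underlying dicts, so you can check where a key lives
source_index = next(i for i, m in enumerate(c.maps) if 'c' in m)

# Both values for 'b' are still available; neither dict was modified
all_b = [m['b'] for m in c.maps if 'b' in m]

# When a real, independent dict is needed, take a snapshot
flat = dict(c)
```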
In terms of efficiency, it seems faster to use the unpack operation, compared with the update method.
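A rough way to check this kind of claim on your own machine is the standard timeit module. The sketch below uses arbitrary dictionary sizes and repeat counts; results vary by machine and Python version, so no timings are claimed here.

```python
import timeit

# Arbitrary sample data with some overlapping keys
a = {i: i for i in range(1000)}
b = {i: -i for i in range(500, 1500)}


def merge_unpack():
    return {**a, **b}          # Python 3.5+


def merge_update():
    dest = dict(a)
    dest.update(b)
    return dest


t_unpack = timeit.timeit(merge_unpack, number=2000)
t_update = timeit.timeit(merge_update, number=2000)
print(f"unpack: {t_unpack:.4f}s  update: {t_update:.4f}s")
```

Both functions build a new dictionary and leave a and b untouched, so the comparison is like for like.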
