defaultdict of defaultdict? - python

Is there a way to have a defaultdict(defaultdict(int)) in order to make the following code work?
for x in stuff:
    d[x.a][x.b] += x.c_int
d needs to be built ad-hoc, depending on x.a and x.b elements.
I could use:
for x in stuff:
    d[x.a,x.b] += x.c_int
but then I wouldn't be able to use:
d.keys()
d[x.a].keys()

Yes like this:
defaultdict(lambda: defaultdict(int))
The argument of a defaultdict (in this case lambda: defaultdict(int)) is called when you try to access a key that doesn't exist. Its return value is stored as the new value of that key, which means that in our case the value of d[Key_doesnt_exist] will be a defaultdict(int).
If you then access a missing key on this inner defaultdict, i.e. d[Key_doesnt_exist][Key_doesnt_exist], it returns 0, which is the return value of the inner defaultdict's argument, i.e. int().
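As a minimal sketch of the question's loop, assuming `stuff` is a list of records with attributes `a`, `b` and `c_int` (the `Item` namedtuple and the sample data are made up for illustration):

```python
from collections import defaultdict, namedtuple

# Hypothetical record type standing in for the question's `x` objects.
Item = namedtuple("Item", ["a", "b", "c_int"])

stuff = [
    Item("fruit", "apple", 3),
    Item("fruit", "pear", 1),
    Item("fruit", "apple", 2),
    Item("veg", "leek", 5),
]

d = defaultdict(lambda: defaultdict(int))
for x in stuff:
    # A missing outer key gets a fresh defaultdict(int); a missing inner key gets 0.
    d[x.a][x.b] += x.c_int

print(sorted(d.keys()))           # ['fruit', 'veg']
print(sorted(d["fruit"].keys()))  # ['apple', 'pear']
print(d["fruit"]["apple"])        # 5
```

Both `d.keys()` and `d[x.a].keys()` from the question work on this structure.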

The parameter to the defaultdict constructor is the function which will be called for building new elements. So let's use a lambda!
>>> from collections import defaultdict
>>> d = defaultdict(lambda : defaultdict(int))
>>> print d[0]
defaultdict(<type 'int'>, {})
>>> print d[0]["x"]
0
Since Python 2.7, there's an even better solution using Counter:
>>> from collections import Counter
>>> c = Counter()
>>> c["goodbye"]+=1
>>> c["and thank you"]=42
>>> c["for the fish"]-=5
>>> c
Counter({'and thank you': 42, 'goodbye': 1, 'for the fish': -5})
Some bonus features
>>> c.most_common()[:2]
[('and thank you', 42), ('goodbye', 1)]
For more information see PyMOTW - Collections - Container data types and Python Documentation - collections

Previous answers have addressed how to make a two-level or n-level defaultdict. In some cases you want an infinite one:
def ddict():
    return defaultdict(ddict)
Usage:
>>> d = ddict()
>>> d[1]['a'][True] = 0.5
>>> d[1]['b'] = 3
>>> import pprint; pprint.pprint(d)
defaultdict(<function ddict at 0x7fcac68bf048>,
            {1: defaultdict(<function ddict at 0x7fcac68bf048>,
                            {'a': defaultdict(<function ddict at 0x7fcac68bf048>,
                                              {True: 0.5}),
                             'b': 3})})

I find it slightly more elegant to use partial:
import functools
dd_int = functools.partial(defaultdict, int)
defaultdict(dd_int)
Of course, this is the same as a lambda.
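For lookups and updates the two spellings behave identically. One practical difference, as far as I know, is pickling: pickle stores functions by name, so a lambda factory cannot be pickled, while a partial built from importable callables can. A small sketch (the variable names are only for illustration):

```python
import pickle
from collections import defaultdict
from functools import partial

d_partial = defaultdict(partial(defaultdict, int))
d_lambda = defaultdict(lambda: defaultdict(int))

for d in (d_partial, d_lambda):
    d["a"]["b"] += 1  # both behave the same for lookups and updates

# The partial-based dictionary survives a pickle round trip.
restored = pickle.loads(pickle.dumps(d_partial))
restored["x"]["y"] += 2  # the restored default factory still works

# The lambda-based dictionary is expected to fail, since a lambda has no importable name.
try:
    pickle.dumps(d_lambda)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(restored["a"]["b"], restored["x"]["y"], lambda_pickled)
```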

For reference, it's possible to implement a generic nested defaultdict factory method through:
from collections import defaultdict
from functools import partial
from itertools import repeat
def nested_defaultdict(default_factory, depth=1):
    result = partial(defaultdict, default_factory)
    for _ in repeat(None, depth - 1):
        result = partial(defaultdict, result)
    return result()
The depth defines the number of nested dictionaries before the type defined in default_factory is used.
For example:
my_dict = nested_defaultdict(list, 3)
my_dict['a']['b']['c'].append('e')

Others have correctly answered your question of how to get the following to work:
for x in stuff:
    d[x.a][x.b] += x.c_int
An alternative would be to use tuples for keys:
d = defaultdict(int)
for x in stuff:
    d[x.a,x.b] += x.c_int
    # ^^^^^^^ tuple key
The nice thing about this approach is that it is simple and can be easily expanded. If you need a mapping three levels deep, just use a three item tuple for the key.
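A short sketch of the tuple-key approach, assuming the records can be unpacked as (a, b, amount) triples (the `rows` data is invented for the example). It also shows one way to recover what `d.keys()` and `d[x.a].keys()` would have given with nested dictionaries, which is the drawback the question mentions:

```python
from collections import defaultdict

# Hypothetical (a, b, amount) rows standing in for the question's `stuff`.
rows = [
    ("fruit", "apple", 3),
    ("fruit", "pear", 1),
    ("veg", "leek", 5),
    ("fruit", "apple", 2),
]

totals = defaultdict(int)
for a, b, amount in rows:
    totals[a, b] += amount  # the key is the tuple (a, b)

# First-level and second-level keys have to be derived from the tuple keys.
first_level = sorted({a for a, _ in totals})
second_level_for_fruit = sorted(b for a, b in totals if a == "fruit")

print(totals["fruit", "apple"])   # 5
print(first_level)                # ['fruit', 'veg']
print(second_level_for_fruit)     # ['apple', 'pear']
```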

Related

python defaultdict how to use it instead of function [duplicate]

I have a large list like:
[A][B1][C1]=1
[A][B1][C2]=2
[A][B2]=3
[D][E][F][G]=4
I want to build a multi-level dict like:
A
--B1
-----C1=1
-----C2=2
--B2=3
D
--E
----F
------G=4
I know that if I use a recursive defaultdict I can write table[A][B1][C1]=1, table[A][B2]=3, but this works only if I hardcode those insert statements.
While parsing the list, I don't know beforehand how many []'s I need in order to call table[key1][key2][...].
You can do it without even defining a class:
from collections import defaultdict
nested_dict = lambda: defaultdict(nested_dict)
nest = nested_dict()
nest[0][1][2][3][4][5] = 6
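When the number of keys is only known while parsing, the path can be walked in a loop instead of being hardcoded. This is a sketch under the assumption that each parsed line has already been turned into a list of keys plus a value; `set_path` and `entries` are hypothetical names, not part of any library:

```python
from collections import defaultdict

def nested_dict():
    return defaultdict(nested_dict)

def set_path(tree, keys, value):
    """Walk down `keys`, creating levels as needed, and store `value` at the last key."""
    node = tree
    for key in keys[:-1]:
        node = node[key]      # a missing level is created by the default factory
    node[keys[-1]] = value

# Hypothetical parsed form of the question's list: (key path, value) pairs.
entries = [
    (["A", "B1", "C1"], 1),
    (["A", "B1", "C2"], 2),
    (["A", "B2"], 3),
    (["D", "E", "F", "G"], 4),
]

table = nested_dict()
for keys, value in entries:
    set_path(table, keys, value)

print(table["A"]["B1"]["C2"])      # 2
print(table["A"]["B2"])            # 3
print(table["D"]["E"]["F"]["G"])   # 4
```

Note that this simple version fails with a TypeError if one path tries to descend through a key that already holds a plain value.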
Your example says that at any level there can be a value, and also a dictionary of sub-elements. That is called a tree, and there are many implementations available for them. This is one:
from collections import defaultdict
class Tree(defaultdict):
    def __init__(self, value=None):
        super(Tree, self).__init__(Tree)
        self.value = value

root = Tree()
root.value = 1
root['a']['b'].value = 3
print(root.value)
print(root['a']['b'].value)
print(root['c']['d']['f'].value)
Outputs:
1
3
None
You could do something similar by writing the input in JSON and using json.load to read it as a structure of nested dictionaries.
I think the simplest implementation of a recursive dictionary is this. Only leaf nodes can contain values.
# Define recursive dictionary
from collections import defaultdict
tree = lambda: defaultdict(tree)
Usage:
# Create instance
mydict = tree()
mydict['a'] = 1
mydict['b']['a'] = 2
mydict['c']
mydict['d']['a']['b'] = 0
# Print
import prettyprint
prettyprint.pp(mydict)
Output:
{
    "a": 1,
    "b": {
        "a": 2
    },
    "c": {},
    "d": {
        "a": {
            "b": 0
        }
    }
}
I'd do it with a subclass of dict that defines __missing__:
>>> class NestedDict(dict):
...     def __missing__(self, key):
...         self[key] = NestedDict()
...         return self[key]
...
>>> table = NestedDict()
>>> table['A']['B1']['C1'] = 1
>>> table
{'A': {'B1': {'C1': 1}}}
You can't do it directly with defaultdict because defaultdict expects the factory function at initialization time, but at initialization time, there's no way to describe the same defaultdict. The above construct does the same thing that defaultdict does, but since it's a named class (NestedDict), it can reference itself as missing keys are encountered. It is also possible to subclass defaultdict and override __init__.
This is equivalent to the above, but avoids lambda notation. Perhaps easier to read?
def dict_factory():
    return defaultdict(dict_factory)
your_dict = dict_factory()
Also, from the comments: if you'd like to add values from an existing dict, you can simply call update:
your_dict[0][1][2].update({"some_key":"some_value"})
Dan O'Huiginn posted a very nice solution on his journal in 2010:
http://ohuiginn.net/mt/2010/07/nested_dictionaries_in_python.html
>>> class NestedDict(dict):
...     def __getitem__(self, key):
...         if key in self: return self.get(key)
...         return self.setdefault(key, NestedDict())
...
>>> eggs = NestedDict()
>>> eggs[1][2][3][4][5]
{}
>>> eggs
{1: {2: {3: {4: {5: {}}}}}}
You may achieve this with a recursive defaultdict.
from collections import defaultdict
def tree():
    def the_tree():
        return defaultdict(the_tree)
    return the_tree()
It is important to protect the default factory name, the_tree here, in a closure ("private" local function scope). Avoid the one-liner lambda version, which is fragile because Python looks up the factory name late (at call time), and implement this with a def instead.
The accepted answer, using a lambda, has a flaw where instances must rely on the nested_dict name existing in an outer scope. If for whatever reason the factory name cannot be resolved (e.g. it was rebound or deleted) then pre-existing instances will also become subtly broken:
>>> nested_dict = lambda: defaultdict(nested_dict)
>>> nest = nested_dict()
>>> nest[0][1][2][3][4][6] = 7
>>> del nested_dict
>>> nest[8][9] = 10
# NameError: name 'nested_dict' is not defined
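For comparison, here is a sketch of the same experiment with the closure-based factory. Deleting the public name should not affect instances that already exist, because the inner function is found through the closure rather than through the outer scope:

```python
from collections import defaultdict

def tree():
    def the_tree():
        return defaultdict(the_tree)
    return the_tree()

nest = tree()
nest[0][1] = 2

del tree           # removing the public name does not affect existing instances
nest[8][9] = 10    # the_tree is still reachable through its own closure

print(nest[8][9])  # 10
```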
To add to @Hugo's answer, here is a version with a maximum depth:
l=lambda x:defaultdict(lambda:l(x-1)) if x>0 else defaultdict(dict)
arr = l(2)
A slightly different possibility that allows regular dictionary initialization:
from collections import defaultdict
def superdict(arg=()):
    update = lambda obj, arg: obj.update(arg) or obj
    return update(defaultdict(superdict), arg)
Example:
>>> d = {"a":1}
>>> sd = superdict(d)
>>> sd["b"]["c"] = 2
You could use a NestedDict.
from ndicts.ndicts import NestedDict
nd = NestedDict()
nd[0, 1, 2, 3, 4, 5] = 6
The result as a dictionary:
>>> nd.to_dict()
{0: {1: {2: {3: {4: {5: 6}}}}}}
To install ndicts
pip install ndicts

Dictionary out of a list in python

I want to create a dictionary out of a list that has several similar elements. But, in the dictionary, all these similar elements must have the same key.
d_dict={}
lst=['A1','A2','A3','2e','2o','2m']
for element in lst:
    if element.startswith('A'):
        d_dict[1].append(element)
    elif element.startswith('2'):
        d_dict[2].append(element)
print(d_dict)
My output should look like:
d_dict={1:['A1','A2','A3'],2:['2e','2o','2m']}
thanks.
You're looking for collections.defaultdict:
>>> from collections import defaultdict
>>> d_dict = defaultdict(list)
>>> for element in lst:
...     if element.startswith('A'):
...         d_dict[1].append(element)
...     elif element.startswith('2'):
...         d_dict[2].append(element)
...
>>> print d_dict
defaultdict(<type 'list'>, {1: ['A1', 'A2', 'A3'], 2: ['2e', '2o', '2m']})
So, with this module, your code stays almost exactly the same. You only need to make your dictionary a defaultdict so that you can have lists as values without having to create them yourself.
You need to create the lists before appending something to them:
for element in lst:
    if element.startswith('A'):
        if 1 not in d_dict: # if it is not already created
            d_dict[1] = [element] # add list with the current element
        else:
            d_dict[1].append(element)
    elif element.startswith('2'):
        if 2 not in d_dict:
            d_dict[2] = [element]
        else:
            d_dict[2].append(element)
The only problem I see in your code is that you don't initialize the sublists, so d_dict[1] doesn't exist. But you can avoid having to do this altogether if you use a defaultdict.
from collections import defaultdict
d_dict=defaultdict(list)
d_dict[1].extend(e for e in lst if e.startswith('A'))
d_dict[2].extend(e for e in lst if e.startswith('2'))
If you want your code to be more flexible, the following works with any strings, not just those starting with A and 2. You can use the defaultdict class from collections.
from collections import defaultdict
values = ['A1','A2','A3','2e','2o','2m']
grouped = defaultdict(list)
for i in values:
    grouped[i[0]].append(i)
print(dict(grouped))

pythonic way to create 3d dict

I want to create a dict which can be accessed as:
d[id_1][id_2][id_3] = amount
As of now I have a huge ugly function:
def parse_dict(id1, id2, id3, principal, data_dict):
    if data_dict.has_key(id1):
        values = data_dict[id1]
        if values.has_key(id2):
            ..
    else:
        inner_inner_dict = {}
        # and so on
What is the pythonic way to do this?
Note that I pass in the principal, but what I want to store is the amount.
So if all three keys are already there, add the principal to the previous amount!
Thanks
You may want to consider using defaultdict:
For example:
json_dict = defaultdict(lambda: defaultdict(dict))
will create a defaultdict of defaultdicts of dicts (I know, but it is right). To access it, you can simply do:
json_dict['context']['name']['id'] = '42'
without having to resort to using if...else to initialize.
from collections import defaultdict
d = defaultdict(lambda : defaultdict(dict))
d[id_1][id_2][id_3] = amount
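Since the question wants to add the principal to an existing amount, the innermost level needs a numeric default: with a plain dict at the last level, `+=` on a missing id_3 raises KeyError. A sketch of one way to do that, where `add_principal` and the sample ids are hypothetical:

```python
from collections import defaultdict

# Three levels: id_1 -> id_2 -> id_3 -> running amount (starts at 0).
amounts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))

def add_principal(data, id1, id2, id3, principal):
    """Add `principal` to the amount stored under the three ids."""
    data[id1][id2][id3] += principal

add_principal(amounts, "u1", "acct", "2020", 100)
add_principal(amounts, "u1", "acct", "2020", 50)
add_principal(amounts, "u2", "acct", "2021", 7)

print(amounts["u1"]["acct"]["2020"])  # 150
print(amounts["u2"]["acct"]["2021"])  # 7
```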
You can make a simple dictionary that creates new ones (using Autovivification):
>>> class AutoDict(dict):
...     def __missing__(self, key):
...         x = AutoDict()
...         self[key] = x
...         return x
...
>>> d = AutoDict()
>>> d[1][2][3] = 4
>>> d
{1: {2: {3: 4}}}
Unlike the defaultdict with dict, this has no limit on the number of dimensions.
Or a simpler version using defaultdict (from the above wiki link):
def auto_dict():
    return defaultdict(auto_dict)
>>> from collections import defaultdict
>>> import json
>>> def tree(): return defaultdict(tree)
>>> t = tree()
>>> t['a']['b']['c'] = 'foo'
>>> t['a']['b']['d'] = 'bar'
>>> json.dumps(t)
'{"a": {"b": {"c": "foo", "d": "bar"}}}'
Maybe you need to have a look at multi-dimensional arrays - for example in numpy:
http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html

How does collections.defaultdict work?

I've read the examples in the Python docs, but still can't figure out what this class does. Can somebody help? Here are two examples from the Python docs:
>>> from collections import defaultdict
>>> s = 'mississippi'
>>> d = defaultdict(int)
>>> for k in s:
...     d[k] += 1
...
>>> d.items()
dict_items([('m', 1), ('i', 4), ('s', 4), ('p', 2)])
and
>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> d = defaultdict(list)
>>> for k, v in s:
...     d[k].append(v)
...
>>> d.items()
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
What are the parameters int and list for?
Usually, a Python dictionary throws a KeyError if you try to get an item with a key that is not currently in the dictionary. The defaultdict in contrast will simply create any items that you try to access (provided of course they do not exist yet). To create such a "default" item, it calls the function object that you pass to the constructor (more precisely, it's an arbitrary "callable" object, which includes function and type objects). For the first example, default items are created using int(), which will return the integer object 0. For the second example, default items are created using list(), which returns a new empty list object.
defaultdict means that if a key is not found in the dictionary, then instead of a KeyError being thrown, a new entry is created. The type of this new entry is given by the argument of defaultdict.
For example:
somedict = {}
print(somedict[3]) # KeyError
someddict = defaultdict(int)
print(someddict[3]) # print int(), thus 0
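To make the mechanism visible, here is a small sketch with a factory that records each call (the `factory` function and `calls` list are only for illustration). It suggests two things: the factory runs once per missing key, and merely reading a missing key stores it:

```python
from collections import defaultdict

calls = []

def factory():
    calls.append("called")   # record each time a default value is built
    return 0

d = defaultdict(factory)
d["a"] += 1    # missing key: factory() runs once, then 0 + 1 is stored
d["a"] += 1    # the key exists now, so the factory is not called again
_ = d["b"]     # merely reading a missing key also creates it

print(len(calls))   # 2
print(dict(d))      # {'a': 2, 'b': 0}
```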
defaultdict
"The standard dictionary includes the method setdefault() for retrieving a value and establishing a default if the value does not exist. By contrast, defaultdict lets the caller specify the default(value to be returned) up front when the container is initialized."
as defined by Doug Hellmann in The Python Standard Library by Example
How to use defaultdict
Import defaultdict
>>> from collections import defaultdict
Initialize defaultdict
Initialize it by passing
a callable as its first argument (optional; it defaults to None, in which case missing keys still raise KeyError)
>>> d_int = defaultdict(int)
>>> d_list = defaultdict(list)
>>> def foo():
...     return 'default value'
...
>>> d_foo = defaultdict(foo)
>>> d_int
defaultdict(<type 'int'>, {})
>>> d_list
defaultdict(<type 'list'>, {})
>>> d_foo
defaultdict(<function foo at 0x7f34a0a69578>, {})
keyword arguments (**kwargs) after the callable (optional)
>>> d_int = defaultdict(int, a=10, b=12, c=13)
>>> d_int
defaultdict(<type 'int'>, {'a': 10, 'c': 13, 'b': 12})
or
>>> kwargs = {'a':10,'b':12,'c':13}
>>> d_int = defaultdict(int, **kwargs)
>>> d_int
defaultdict(<type 'int'>, {'a': 10, 'c': 13, 'b': 12})
How does it work
As defaultdict is a subclass of the standard dictionary, it can perform all the same functions.
But when an unknown key is looked up, it returns the default value instead of raising an error. For example:
>>> d_int['a']
10
>>> d_int['d']
0
>>> d_int
defaultdict(<type 'int'>, {'a': 10, 'c': 13, 'b': 12, 'd': 0})
If you want to change the default value, overwrite default_factory:
>>> d_int.default_factory = lambda: 1
>>> d_int['e']
1
>>> d_int
defaultdict(<function <lambda> at 0x7f34a0a91578>, {'a': 10, 'c': 13, 'b': 12, 'e': 1, 'd': 0})
or
>>> def foo():
...     return 2
...
>>> d_int.default_factory = foo
>>> d_int['f']
2
>>> d_int
defaultdict(<function foo at 0x7f34a0a0a140>, {'a': 10, 'c': 13, 'b': 12, 'e': 1, 'd': 0, 'f': 2})
Examples in the Question
Example 1
As int has been passed as default_factory, any unknown key will return 0 by default.
As the loop iterates over the string, it increments the count of each letter in d.
>>> s = 'mississippi'
>>> d = defaultdict(int)
>>> d.default_factory
<type 'int'>
>>> for k in s:
...     d[k] += 1
>>> d.items()
[('i', 4), ('p', 2), ('s', 4), ('m', 1)]
>>> d
defaultdict(<type 'int'>, {'i': 4, 'p': 2, 's': 4, 'm': 1})
Example 2
As list has been passed as default_factory, any unknown (non-existent) key will return [] (i.e. an empty list) by default.
As the loop iterates over the list of tuples, it appends each value to d[color].
>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> d = defaultdict(list)
>>> d.default_factory
<type 'list'>
>>> for k, v in s:
...     d[k].append(v)
>>> d.items()
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
>>> d
defaultdict(<type 'list'>, {'blue': [2, 4], 'red': [1], 'yellow': [1, 3]})
Dictionaries are a convenient way to store data for later retrieval by name (key). Keys must be unique, immutable objects, and are typically strings. The values in a dictionary can be anything. For many applications, the values are simple types such as integers and strings.
It gets more interesting when the values in a dictionary are collections (lists, dicts, etc.). In this case, the value (an empty list or dict) must be initialized the first time a given key is used. While this is relatively easy to do manually, the defaultdict type automates and simplifies these kinds of operations.
A defaultdict works like a normal dict, but it is initialized with a function (“default factory”) that takes no arguments and provides the default value for a nonexistent key.
As long as a default factory is set, indexing a defaultdict with a missing key never raises a KeyError. Any key that does not exist gets the value returned by the default factory.
from collections import defaultdict
ice_cream = defaultdict(lambda: 'Vanilla')
ice_cream['Sarah'] = 'Chunky Monkey'
ice_cream['Abdul'] = 'Butter Pecan'
print(ice_cream['Sarah'])  # Chunky Monkey
print(ice_cream['Joe'])    # Vanilla
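The default factory is only used when a missing key is indexed with square brackets. Membership tests and get() behave as on a normal dict, which the following sketch (reusing the ice cream example) illustrates:

```python
from collections import defaultdict

ice_cream = defaultdict(lambda: "Vanilla")
ice_cream["Sarah"] = "Chunky Monkey"

print("Joe" in ice_cream)     # False: membership tests do not create keys
print(ice_cream.get("Joe"))   # None: get() bypasses the default factory
print(ice_cream["Joe"])       # Vanilla: indexing calls the factory...
print("Joe" in ice_cream)     # True: ...and stores the result under the key
```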
Here is another example of how using defaultdict can reduce time complexity:
from collections import defaultdict

# Time complexity O(n^2)
def delete_nth_naive(array, n):
    ans = []
    for num in array:
        if ans.count(num) < n:
            ans.append(num)
    return ans

# Time complexity O(n), using hash tables.
def delete_nth(array, n):
    result = []
    counts = defaultdict(int)
    for i in array:
        if counts[i] < n:
            result.append(i)
            counts[i] += 1
    return result

x = [1, 2, 3, 1, 2, 1, 2, 3]
print(delete_nth(x, n=2))
print(delete_nth_naive(x, n=2))
In conclusion, whenever you need a dictionary, and each element’s value should start with a default value, use a defaultdict.
There is a great explanation of defaultdicts here: http://ludovf.net/blog/python-collections-defaultdict/
Basically, the parameters int and list are functions that you pass. Remember that Python accepts function names as arguments. When called with no arguments, int returns 0 and list returns an empty list.
In a normal dictionary, if in your example I try calling d['a'], I will get an error (KeyError), since only the keys m, s, i and p exist and the key a has not been initialized. A defaultdict, however, takes a function as an argument: when you try to use a key that has not been initialized, it simply calls the function you passed in and assigns its return value as the value of the new key.
The behavior of defaultdict can be easily mimicked using dict.setdefault instead of d[key] in every call.
In other words, the code:
from collections import defaultdict
d = defaultdict(list)
print(d['key']) # empty list []
d['key'].append(1) # adding constant 1 to the list
print(d['key']) # list containing the constant [1]
is equivalent to:
d = dict()
print(d.setdefault('key', list())) # empty list []
d.setdefault('key', list()).append(1) # adding constant 1 to the list
print(d.setdefault('key', list())) # list containing the constant [1]
The only difference is that, using defaultdict, the list constructor is called only once, and using dict.setdefault the list constructor is called more often (but the code may be rewritten to avoid this, if really needed).
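The difference in constructor calls can be checked with a list subclass that counts its own instances. This is only a sketch; `CountingList` is a made-up helper, not something from the standard library:

```python
from collections import defaultdict

class CountingList(list):
    """A list subclass that counts how many instances are created."""
    created = 0

    def __init__(self, *args):
        super().__init__(*args)
        CountingList.created += 1

# defaultdict: the factory runs only for the first (missing) lookup.
d1 = defaultdict(CountingList)
for _ in range(3):
    d1["key"].append(1)
with_defaultdict = CountingList.created

# setdefault: the default argument is built on every call, even when unused.
CountingList.created = 0
d2 = {}
for _ in range(3):
    d2.setdefault("key", CountingList()).append(1)
with_setdefault = CountingList.created

print(with_defaultdict, with_setdefault)  # 1 3
```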
Some may argue there is a performance consideration, but this topic is a minefield. This post shows there isn't a big performance gain in using defaultdict, for example.
IMO, defaultdict is a collection that adds more confusion than benefit to the code. Useless for me, but others may think differently.
Since the question is about "how it works", some readers may want to see more nuts and bolts. Specifically, the method in question is the __missing__(key) method. See: https://docs.python.org/2/library/collections.html#defaultdict-objects .
More concretely, this answer shows how to make use of __missing__(key) in a practical way:
https://stackoverflow.com/a/17956989/1593924
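As an illustration of the hook itself, here is a sketch of a plain dict subclass that defines `__missing__`. Unlike a default factory, `__missing__` receives the key, so the default can depend on it (`SquareCache` is an invented example class):

```python
class SquareCache(dict):
    """dict subclass whose default for a missing key depends on the key itself."""

    def __missing__(self, key):
        value = key * key
        self[key] = value   # store the result so the next lookup finds it directly
        return value

squares = SquareCache()
print(squares[4])       # 16
print(squares)          # {4: 16}
print(squares.get(5))   # None: get() does not go through __missing__
```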
To clarify what 'callable' means, here's an interactive session (from 2.7.6 but should work in v3 too):
>>> x = int
>>> x
<type 'int'>
>>> y = int(5)
>>> y
5
>>> z = x(5)
>>> z
5
>>> from collections import defaultdict
>>> dd = defaultdict(int)
>>> dd
defaultdict(<type 'int'>, {})
>>> dd = defaultdict(x)
>>> dd
defaultdict(<type 'int'>, {})
>>> dd['a']
0
>>> dd
defaultdict(<type 'int'>, {'a': 0})
That was the most typical use of defaultdict (except for the pointless use of the x variable). You can do the same thing with 0 as the explicit default value, but not with a simple value:
>>> dd2 = defaultdict(0)
Traceback (most recent call last):
File "<pyshell#7>", line 1, in <module>
dd2 = defaultdict(0)
TypeError: first argument must be callable
Instead, the following works because it passes in a simple function (it creates on the fly a nameless function which takes no arguments and always returns 0):
>>> dd2 = defaultdict(lambda: 0)
>>> dd2
defaultdict(<function <lambda> at 0x02C4C130>, {})
>>> dd2['a']
0
>>> dd2
defaultdict(<function <lambda> at 0x02C4C130>, {'a': 0})
>>>
And with a different default value:
>>> dd3 = defaultdict(lambda: 1)
>>> dd3
defaultdict(<function <lambda> at 0x02C4C170>, {})
>>> dd3['a']
1
>>> dd3
defaultdict(<function <lambda> at 0x02C4C170>, {'a': 1})
>>>
My own 2¢: you can also subclass defaultdict:
class MyDict(defaultdict):
    def __missing__(self, key):
        value = [None, None]
        self[key] = value
        return value
This could come in handy for very complex cases.
Well, defaultdict can also raise a KeyError in the following case:
from collections import defaultdict
d = defaultdict()
print(d[3]) # raises KeyError
Always remember to pass a default factory to defaultdict, like
d = defaultdict(int)
The defaultdict tool is a container in Python's collections module. It's similar to the usual dictionary (dict) container, but it has one difference: a factory for the default value of missing keys is specified upon initialization.
For example:
from collections import defaultdict
d = defaultdict(list)
d['python'].append("awesome")
d['something-else'].append("not relevant")
d['python'].append("language")
for i in d.items():
    print(i)
This prints:
('python', ['awesome', 'language'])
('something-else', ['not relevant'])
In short:
defaultdict(int) - the argument int means that a missing key starts with the value int(), i.e. 0.
defaultdict(list) - the argument list means that a missing key starts with the value list(), i.e. an empty list.
I think it's best used in place of a switch case statement. Imagine we have a switch case statement like the one below:
option = 1
switch(option) {
    case 1: print '1st option'
    case 2: print '2nd option'
    case 3: print '3rd option'
    default: return 'No such option'
}
Python has no switch case statement (the match statement only arrived in Python 3.10). We can achieve something similar by using defaultdict.
from collections import defaultdict
def default_value(): return "Default Value"
dd = defaultdict(default_value)
dd[1] = '1st option'
dd[2] = '2nd option'
dd[3] = '3rd option'
print(dd[4])
print(dd[5])
print(dd[3])
It prints:
Default Value
Default Value
3rd option
In the above snippet, dd has no keys 4 or 5, so it prints the default value that we configured in a helper function. This is nicer than a raw dictionary, where a KeyError is raised if the key is not present. From this it is evident that defaultdict can act like a switch case statement and let us avoid complicated if-elif-elif-else blocks.
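One thing to keep in mind with the defaultdict version is that looking up dd[4] also stores the key 4 in dd. If that side effect is unwanted, a plain dict with get() and a fallback gives a similar result without growing the dictionary. A sketch of that alternative (a different technique from the one above; `describe` is a hypothetical helper):

```python
options = {1: "1st option", 2: "2nd option", 3: "3rd option"}

def describe(option):
    # get() returns the fallback without adding the missing key to the dict.
    return options.get(option, "No such option")

print(describe(3))    # 3rd option
print(describe(4))    # No such option
print(4 in options)   # False
```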
One more good example that impressed me a lot from this site is:
>>> from collections import defaultdict
>>> food_list = 'spam spam spam spam spam spam eggs spam'.split()
>>> food_count = defaultdict(int) # default value of int is 0
>>> for food in food_list:
...     food_count[food] += 1 # increment element's value by 1
...
>>> food_count
defaultdict(<type 'int'>, {'eggs': 1, 'spam': 7})
>>>
If we try to access any items other than eggs and spam we will get a count of 0.
Without defaultdict, you can assign a value to an unseen key, but you cannot modify it in place (for example with +=), because reading the missing key raises a KeyError. For example:
import collections
d = collections.defaultdict(int)
for i in range(10):
    d[i] += i
print(d)
# Output: defaultdict(<class 'int'>, {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9})
import collections
d = {}
for i in range(10):
    d[i] += i
print(d)
# Output: Traceback (most recent call last): File "python", line 4, in <module> KeyError: 0
The standard dictionary includes the method setdefault() for retrieving a value and establishing a default if the value does not exist. By contrast, defaultdict lets the caller specify the default up front when the container is initialized.
import collections
def default_factory():
    return 'default value'
d = collections.defaultdict(default_factory, foo='bar')
print 'd:', d
print 'foo =>', d['foo']
print 'bar =>', d['bar']
This works well as long as it is appropriate for all keys to have the same default. It can be especially useful if the default is a type used for aggregating or accumulating values, such as a list, set, or even int. The standard library documentation includes several examples of using defaultdict this way.
$ python collections_defaultdict.py
d: defaultdict(<function default_factory at 0x100468c80>, {'foo': 'bar'})
foo => bar
bar => default value
# dictionary and defaultdict
normaldictionary = dict()
print(type(normaldictionary))
# print(normaldictionary["keynotexist"])
# The normal dictionary above raises a KeyError because the key is not present.
from collections import defaultdict
defaultdict1 = defaultdict()
print(type(defaultdict1))
# print(defaultdict1['keynotexist'])  # also a KeyError: no default factory was given
######################################
from collections import defaultdict
default2 = defaultdict(int)
print(default2['keynotexist'])  # prints 0
https://msatutorpy.medium.com/different-between-dictionary-and-defaultdictionary-cb215f682971
The documentation is pretty much self-explanatory:
http://docs.python.org/library/collections.html#collections.defaultdict
The type function(int/str etc.) passed as an argument is used to initialize a default value for any given key where the key is not present in the dict.
