Is printing defaultdict supposed to be ugly (non human-readable) by default? - python

Print dict and defaultdict:
>>> d = {'key': 'value'}
>>> print(d)
{'key': 'value'}
>>> dd = defaultdict(lambda: 'value')
>>> dd['key']
'value'
>>> print(dd)
defaultdict(<function <lambda> at 0x7fbd44cb6b70>, {'key': 'value'})
With nested structure it becomes ugly:
>>> nested_d = {'key1': {'key2': {'key3': 'value'}}}
>>> print(nested_d)
{'key1': {'key2': {'key3': 'value'}}}
>>> def factory():
...     return defaultdict(factory)
...
>>> nested_dd = defaultdict(factory)
>>> nested_dd['key1']['key2']['key3'] = 'value'
>>> print(nested_dd)
defaultdict(<function factory at 0x7fbd44cd4ea0>, {'key1': defaultdict(<function factory at 0x7fbd44cd4ea0>, {'key2': defaultdict(<function factory at 0x7fbd44cd4ea0>, {'key3': 'value'})})})
Were there any reasons for not making it human-readable by default? (UPD: I mean what are the reasons behind not having custom __str__ defined for defaultdict by default?)

repr() output (defaultdict has no __str__, only __repr__) is debugging output. It is not meant to be pretty, it is meant to be functional. It tells you the type, the repr() of the callable that produces the default, and the contents.
From the __repr__ documentation:
This is typically used for debugging, so it is important that the representation is information-rich and unambiguous.
Like most built-in datatypes in Python (strings being the obvious exception), defaultdict defines no informal representation (__str__), because it is up to the programmer to decide what output suits their use case. No sensible default exists, because use cases vary so widely: output for a file has different needs than output for a GUI or a web page, for example.
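The claim that defaultdict only has a __repr__ can be checked directly. The sketch below also adds a readable __str__ in a subclass; the name ReadableDefaultDict is made up for this illustration, and showing the contents as a plain dict is only one possible idea of "readable".

```python
from collections import defaultdict

dd = defaultdict(lambda: 'value')
dd['key']  # the lookup inserts the default

# defaultdict defines __repr__ only, so str() falls back to repr().
same_output = str(dd) == repr(dd)
defines_str = '__str__' in vars(defaultdict)
print(same_output, defines_str)  # True False


class ReadableDefaultDict(defaultdict):
    """Hypothetical subclass: print() shows the contents like a plain dict."""

    def __str__(self):
        return str(dict(self))


rdd = ReadableDefaultDict(lambda: 'value')
rdd['key']
print(rdd)  # {'key': 'value'}
```

Nested values are still rendered with their own repr(), so this only tidies the top level; repr(rdd) keeps the full debugging output.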
In Python 2, convert the object to a plain dictionary first, then use pprint() if you want 'pretty' output:
from pprint import pprint

def todict(d):
    # Recursively convert nested (default)dicts into plain dicts.
    if not isinstance(d, dict):
        return d
    return {k: todict(v) for k, v in d.items()}

pprint(todict(nested_dd))
In Python 3, pprint supports defaultdict directly:
>>> pprint(nested_dd)
defaultdict(<function factory at 0x105ed2f28>,
            {'key1': defaultdict(<function factory at 0x105ed2f28>,
                                 {'key2': defaultdict(<function factory at 0x105ed2f28>,
                                                      {'key3': 'value'})})})

There's no way to know what, if anything, the author(s) were thinking or even whether they gave it much consideration at all.
For the specific case of nested defaultdicts, as shown in your example code:
def factory():
    return defaultdict(factory)

nested_dd = defaultdict(factory)
nested_dd['key1']['key2']['key3'] = 'value'
You can avoid the issue by subclassing dict like this instead:
class Tree(dict):
    def __missing__(self, key):
        value = self[key] = type(self)()
        return value

nested_dd = Tree()
nested_dd['key1']['key2']['key3'] = 'value'
print(nested_dd) # -> {'key1': {'key2': {'key3': 'value'}}}
Since the subclass doesn't define its own __repr__() or __str__() methods, instances of it will print (and pprint) just like regular dict instances do.

Related

dict get item from key

I'm trying to get a key-value item from a dictionary using a key instead of getting only the value.
I understand that I could do something like
foo = {"bar":"baz", "hello":"world"}
some_item = {"bar": foo.get("bar")}
But here I need to type out the key twice, which seems a bit redundant. Is there some direct way to get the key-value pair for the key bar? Something like
foo.get_item("bar")
>>> {"bar": "baz"}
One way or another, you'll need to bind "bar" to a variable.
>>> foo = {"bar":"baz", "hello":"world"}
>>> (lambda k="bar": {k: foo[k]})()
{'bar': 'baz'}
or:
>>> k = "bar"
>>> {k: foo[k]}
{'bar': 'baz'}
or:
>>> def item(d, k):
...     return {k: d[k]}
...
>>> item(foo, "bar")
{'bar': 'baz'}
It's possible to extend dict to create your own methods. It could be useful if you're the one creating the initial dictionary in the first place.
class MyDict(dict):
    def fetch(self, key):
        return {key: self.get(key)}
The downside is that you would need to convert regular dictionaries that you did not create yourself:
new_foo = MyDict(foo)
some_item = new_foo.fetch("bar")
But in this case it would probably be easier just to use a lambda (see Samwise's answer)
You can get the key-value pair for the key 'bar' with a dict comprehension:
foo = {"bar": "baz", "hello": "world"}
some_item = {k: v for k, v in foo.items() if k == 'bar'}
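Note that filtering foo.items() walks every entry, while a direct lookup does not. A minimal sketch of a lookup-based helper follows; the name pick is hypothetical, and letting missing keys raise KeyError is an assumption about the desired behaviour.

```python
def pick(d, *keys):
    """Return a new dict holding only the requested keys (hypothetical helper)."""
    # d[k] raises KeyError for a missing key, unlike d.get(k).
    return {k: d[k] for k in keys}


foo = {"bar": "baz", "hello": "world"}
print(pick(foo, "bar"))           # {'bar': 'baz'}
print(pick(foo, "hello", "bar"))  # {'hello': 'world', 'bar': 'baz'}
```

Each key is typed once, and the same helper covers the case of several keys.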

don't understand this lambda expression with defaultdict

I saw this example at pythontips. I do not understand the second line, where defaultdict takes "tree" as an argument and "tree" in turn returns a defaultdict.
import collections
tree = lambda: collections.defaultdict(tree)
some_dict = tree()
some_dict['color']['favor'] = "yellow"
# Works fine
After I ran this code, I inspected some_dict:
defaultdict(<function <lambda> at 0x7f19ae634048>,
            {'color': defaultdict(<function <lambda> at 0x7f19ae634048>,
                                  {'favor': 'yellow'})})
This is a pretty clever way to create a recursive defaultdict. It's a little tricky to understand at first but once you dig into what's happening, it's actually a pretty simple use of recursion.
In this example, we define a recursive lambda function, tree, that returns a defaultdict whose default factory is tree itself. Let's rewrite this using regular functions for clarity.
from collections import defaultdict
from pprint import pprint
def get_recursive_dict():
    return defaultdict(get_recursive_dict)
Note that we're returning defaultdict(get_recursive_dict) and not defaultdict(get_recursive_dict()). We want to pass defaultdict a callable object (i.e. the function get_recursive_dict). Actually calling get_recursive_dict() would result in infinite recursion.
If we call get_recursive_dict, we get an empty defaultdict whose default factory is the function get_recursive_dict.
recursive_dict = get_recursive_dict()
print(recursive_dict)
# defaultdict(<function get_recursive_dict at 0x0000000004FFC4A8>, {})
Let's see this in action. Access the key 'alice' and its corresponding value defaults to an empty defaultdict whose default factory is the function get_recursive_dict. Notice that this is the same default factory as that of our recursive_dict!
print(recursive_dict['alice'])
# defaultdict(<function get_recursive_dict at 0x0000000004AF46D8>, {})
print(recursive_dict)
# defaultdict(<function get_recursive_dict at 0x0000000004AF46D8>, {'alice': defaultdict(<function get_recursive_dict at 0x0000000004AF46D8>, {})})
So we can create as many nested dictionaries as we want.
recursive_dict['bob']['age'] = 2
recursive_dict['charlie']['food']['dessert'] = 'cake'
print(recursive_dict)
# defaultdict(<function get_recursive_dict at 0x00000000049BD4A8>, {'charlie': defaultdict(<function get_recursive_dict at 0x00000000049BD4A8>, {'food': defaultdict(<function get_recursive_dict at 0x00000000049BD4A8>, {'dessert': 'cake'})}), 'bob': defaultdict(<function get_recursive_dict at 0x00000000049BD4A8>, {'age': 2}), 'alice': defaultdict(<function get_recursive_dict at 0x00000000049BD4A8>, {})})
Once you assign a non-dict value to a key, you can no longer nest further dictionaries beneath that key.
recursive_dict['bob']['age']['year'] = 2016
# TypeError: 'int' object does not support item assignment
I hope this clears things up!
Two points to note:
lambda represents an anonymous function.
Functions are first-class objects in Python. They may be assigned to a variable like any other object.
So here are 2 different ways to define functionally identical objects. They are recursive functions because they reference themselves.
from collections import defaultdict
# anonymous
tree = lambda: defaultdict(tree)
# explicit
def tree(): return defaultdict(tree)
Running the final 2 lines of the question with each of these definitions in turn, you see only a subtle difference in how the default factory is named in the output:
# anonymous
defaultdict(<function __main__.<lambda>()>,
            {'color': defaultdict(<function __main__.<lambda>()>,
                                  {'favor': 'yellow'})})
# explicit
defaultdict(<function __main__.tree()>,
            {'color': defaultdict(<function __main__.tree()>,
                                  {'favor': 'yellow'})})
It's easier to see if you try this: a = lambda: a, you'll see that a() returns a. So...
>>> a = lambda: a
>>> a()()()()
<function <lambda> at 0x102bffd08>
They're doing this with the defaultdict too. tree is a function returning a defaultdict whose default value is yet another defaultdict, and so on.
I wasn't actually aware of this either. I thought tree would have to be defined first. Maybe it's a special Python rule? (EDIT:) No, I forgot that Python does the name lookup at runtime, and tree already points to the lambda then. In C++ there's compile-time reference checking, but you can define functions that reference themselves.
It seems like a way to create behavior that some users wouldn't expect. For example, if you accidentally rebind tree later, your defaultdict is broken:
>>> import collections
>>> tree = lambda: collections.defaultdict(tree)
>>> some_dict = tree()
>>> tree = 4
>>> some_dict[4][3] = 2 # TypeError: first argument must be callable or None
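One way to sidestep the rebinding problem is to keep the factory in a closure instead of a module-level name. This is a sketch, not something the answers above propose; make_tree and factory are illustrative names.

```python
from collections import defaultdict


def make_tree():
    """Return a recursive defaultdict whose factory does not rely on a global name."""
    def factory():
        # 'factory' is resolved in the enclosing function scope, not in globals.
        return defaultdict(factory)
    return factory()


some_dict = make_tree()
make_tree = 4  # rebinding the global name no longer affects existing trees
some_dict['color']['favor'] = 'yellow'
print(some_dict['color']['favor'])  # yellow
```

The name lookup still happens at call time, but it now targets a closure variable that nothing outside make_tree can rebind.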

python dictionary replace dict[] operator with dict.get() behavior

my_dict = {'a': 1}
I wish for my_dict['a'] to behave the same as my_dict.get('a')
That way, if I do my_dict['b'], I will not raise an error but get the default None value, the same way you would get it from my_dict.get('b')
In the case of my_dict = {'a': {'b': 2}} I could do my_dict['a']['b'] and it would act as my_dict.get('a').get('b')
When doing my_dict['b'] = 2 it will act same as my_dict.update({'b': 2})
Is it possible to do so that I will not have to inherit from dict?
You can use a collections.defaultdict() object to add a new value to the dictionary each time you try to access a non-existing key:
>>> from collections import defaultdict
>>> d = defaultdict(lambda: None)
>>> d['a'] is None
True
>>> d
defaultdict(<function <lambda> at 0x10f463e18>, {'a': None})
If you don't want the key added, create a subclass of dict that implements the __missing__ method:
class DefaultNoneDict(dict):
    def __missing__(self, key):
        return None
This explicitly won't add new keys:
>>> d = DefaultNoneDict()
>>> d['a'] is None
True
>>> d
{}
If you want to chain .get() calls, you'll have to return an empty dictionary instead; otherwise dict.get(keyA).get(keyB) will fail with an AttributeError (the None returned by the first call has no .get() method).
Generally speaking, it is better to stick to the default type and be explicit. There is nothing wrong with:
value = some_d.get(outer, {}).get(inner)
Using a defaultdict or a dict subclass with a custom __missing__ hook has a downside: it will always produce a default when the key is missing, even when you accidentally produced incorrect keys somewhere else in your code. I often opt for an explicit dict.get() or dict.setdefault() codepath over defaultdict precisely because I want a non-existing key to produce an error in other parts of my project.
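If the chained .get(outer, {}).get(inner) pattern gets repetitive, it can be wrapped in a small function that leaves the dictionary type alone. The helper name deep_get and its default parameter are assumptions made for this sketch.

```python
def deep_get(mapping, *keys, default=None):
    """Follow keys through nested dicts; return default as soon as one is missing."""
    current = mapping
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


my_dict = {'a': {'b': 2}}
print(deep_get(my_dict, 'a', 'b'))                  # 2
print(deep_get(my_dict, 'x', 'b'))                  # None
print(deep_get(my_dict, 'a', 'b', 'c', default=0))  # 0
```

Plain indexing such as my_dict['x'] still raises KeyError elsewhere in the program, which keeps typos in keys visible.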

Override the {...} notation so i get an OrderedDict() instead of a dict()?

Update: dicts retaining insertion order is guaranteed for Python 3.7+
I want to use a .py file like a config file.
So using the {...} notation I can create a dictionary using strings as keys but the definition order is lost in a standard python dictionary.
My question: is it possible to override the {...} notation so that I get an OrderedDict() instead of a dict()?
I was hoping that simply overriding dict constructor with OrderedDict (dict = OrderedDict) would work, but it doesn't.
Eg:
dict = OrderedDict
dictname = {
'B key': 'value1',
'A key': 'value2',
'C key': 'value3'
}
print dictname.items()
Output:
[('B key', 'value1'), ('A key', 'value2'), ('C key', 'value3')]
Here's a hack that almost gives you the syntax you want:
from collections import OrderedDict

class _OrderedDictMaker(object):
    def __getitem__(self, keys):
        if not isinstance(keys, tuple):
            keys = (keys,)
        assert all(isinstance(key, slice) for key in keys)
        return OrderedDict([(k.start, k.stop) for k in keys])

ordereddict = _OrderedDictMaker()

# Usage, assuming the code above is saved as nastyhacks.py:
from nastyhacks import ordereddict
menu = ordereddict[
    "about" : "about",
    "login" : "login",
    'signup': "signup"
]
Edit: Someone else discovered this independently, and has published the odictliteral package on PyPI that provides a slightly more thorough implementation - use that package instead
To literally get what you are asking for, you have to fiddle with the syntax tree of your file. I don't think it is advisable to do so, but I couldn't resist the temptation to try. So here we go.
First, we create a module with a function my_execfile() that works like the built-in execfile(), except that all occurrences of dictionary displays, e.g. {3: 4, "a": 2} are replaced by explicit calls to the dict() constructor, e.g. dict([(3, 4), ('a', 2)]). (Of course we could directly replace them by calls to collections.OrderedDict(), but we don't want to be too intrusive.) Here's the code:
import ast

class DictDisplayTransformer(ast.NodeTransformer):
    def visit_Dict(self, node):
        self.generic_visit(node)
        list_node = ast.List(
            [ast.copy_location(ast.Tuple(list(x), ast.Load()), x[0])
             for x in zip(node.keys, node.values)],
            ast.Load())
        name_node = ast.Name("dict", ast.Load())
        new_node = ast.Call(ast.copy_location(name_node, node),
                            [ast.copy_location(list_node, node)],
                            [], None, None)
        return ast.copy_location(new_node, node)

def my_execfile(filename, globals=None, locals=None):
    if globals is None:
        globals = {}
    if locals is None:
        locals = globals
    node = ast.parse(open(filename).read())
    transformed = DictDisplayTransformer().visit(node)
    exec compile(transformed, filename, "exec") in globals, locals
With this modification in place, we can modify the behaviour of dictionary displays by overwriting dict. Here is an example:
# test.py
from collections import OrderedDict
print {3: 4, "a": 2}
dict = OrderedDict
print {3: 4, "a": 2}
Now we can run this file using my_execfile("test.py"), yielding the output
{'a': 2, 3: 4}
OrderedDict([(3, 4), ('a', 2)])
Note that for simplicity, the above code doesn't touch dictionary comprehensions, which should be transformed to generator expressions passed to the dict() constructor. You'd need to add a visit_DictComp() method to the DictDisplayTransformer class. Given the above example code, this should be straightforward.
Again, I don't recommend this kind of messing around with the language semantics. Did you have a look into the ConfigParser module?
OrderedDict is not "standard python syntax". However, an ordered set of key-value pairs (in standard python syntax) is simply:
[('key1 name', 'value1'), ('key2 name', 'value2'), ('key3 name', 'value3')]
To explicitly get an OrderedDict:
OrderedDict([('key1 name', 'value1'), ('key2 name', 'value2'), ('key3 name', 'value3')])
Another alternative is to sort dictname.items(), if sorted key order is all you need:
sorted(dictname.items())
In CPython 3.6, dictionaries preserve insertion order as an implementation detail of the new dict implementation. It should not be relied upon in 3.6 itself, but it became a language guarantee in Python 3.7.
Insertion order is always preserved in the new dict implementation:
>>> x = {'a': 1, 'b': 2, 'c': 3}
>>> list(x.keys())
['a', 'b', 'c']
As of python 3.6 **kwargs order [PEP468] and class attribute order [PEP520] are preserved. The new compact, ordered dictionary implementation is used to implement the ordering for both of these.
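Assuming Python 3.7 or later, the question's example keeps its definition order with a plain dict literal. The sketch below also shows two ways in which OrderedDict still differs from dict, which may matter when deciding whether a plain dict is enough for a config file.

```python
from collections import OrderedDict

dictname = {
    'B key': 'value1',
    'A key': 'value2',
    'C key': 'value3',
}
print(list(dictname))  # ['B key', 'A key', 'C key'] on Python 3.7+

# Plain dicts ignore order when compared; OrderedDicts do not.
a = {'x': 1, 'y': 2}
b = {'y': 2, 'x': 1}
print(a == b)                            # True
print(OrderedDict(a) == OrderedDict(b))  # False

# move_to_end() exists only on OrderedDict.
od = OrderedDict(dictname)
od.move_to_end('B key')
print(list(od))  # ['A key', 'C key', 'B key']
```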
What you are asking for is impossible, but if a config file in JSON syntax is sufficient you can do something similar with the json module:
>>> import json, collections
>>> d = json.JSONDecoder(object_pairs_hook = collections.OrderedDict)
>>> d.decode('{"a":5,"b":6}')
OrderedDict([(u'a', 5), (u'b', 6)])
The one solution I found is to patch python itself, making the dict object remember the order of insertion.
This then works for all kind of syntaxes:
x = {'a': 1, 'b':2, 'c':3 }
y = dict(a=1, b=2, c=3)
etc.
I have taken the ordereddict C implementation from https://pypi.python.org/pypi/ruamel.ordereddict/ and merged back into the main python code.
If you do not mind re-building the python interpreter, here is a patch for Python 2.7.8:
https://github.com/fwyzard/cpython/compare/2.7.8...ordereddict-2.7.8.diff
If what you are looking for is a way to get easy-to-use initialization syntax - consider creating a subclass of OrderedDict and adding operators to it that update the dict, for example:
from collections import OrderedDict
class OrderedMap(OrderedDict):
    def __add__(self, other):
        self.update(other)
        return self

d = OrderedMap() + {1: 2} + {4: 3} + {"key": "value"}
d will then be OrderedMap([(1, 2), (4, 3), ('key', 'value')])
Another possible syntactic-sugar example using the slicing syntax:
class OrderedMap(OrderedDict):
    def __getitem__(self, index):
        if isinstance(index, slice):
            self[index.start] = index.stop
            return self
        else:
            return OrderedDict.__getitem__(self, index)

d = OrderedMap()[1:2][6:4][4:7]["a":"H"]

How does collections.defaultdict work?

I've read the examples in python docs, but still can't figure out what this method means. Can somebody help? Here are two examples from the python docs
>>> from collections import defaultdict
>>> s = 'mississippi'
>>> d = defaultdict(int)
>>> for k in s:
...     d[k] += 1
...
>>> d.items()
dict_items([('m', 1), ('i', 4), ('s', 4), ('p', 2)])
and
>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> d = defaultdict(list)
>>> for k, v in s:
...     d[k].append(v)
...
>>> sorted(d.items())
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
What are the parameters int and list for?
Usually, a Python dictionary throws a KeyError if you try to get an item with a key that is not currently in the dictionary. The defaultdict in contrast will simply create any items that you try to access (provided of course they do not exist yet). To create such a "default" item, it calls the function object that you pass to the constructor (more precisely, it's an arbitrary "callable" object, which includes function and type objects). For the first example, default items are created using int(), which will return the integer object 0. For the second example, default items are created using list(), which returns a new empty list object.
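To make the difference concrete, here is a small side-by-side sketch of the first example from the question: once with a plain dict that handles the missing key by hand, and once with defaultdict(int). The variable names plain and dd are illustrative.

```python
from collections import defaultdict

s = 'mississippi'

# Plain dict: the first occurrence of each letter must be handled explicitly.
plain = {}
for ch in s:
    if ch not in plain:
        plain[ch] = 0
    plain[ch] += 1

# defaultdict(int): int() is called for a missing key and supplies the 0.
dd = defaultdict(int)
for ch in s:
    dd[ch] += 1

print(plain == dd)         # True
print(sorted(dd.items()))  # [('i', 4), ('m', 1), ('p', 2), ('s', 4)]
```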
defaultdict means that if a key is not found in the dictionary, then instead of a KeyError being thrown, a new entry is created. The type of this new entry is given by the argument of defaultdict.
For example:
somedict = {}
print(somedict[3]) # KeyError
someddict = defaultdict(int)
print(someddict[3]) # print int(), thus 0
defaultdict
"The standard dictionary includes the method setdefault() for retrieving a value and establishing a default if the value does not exist. By contrast, defaultdict lets the caller specify the default(value to be returned) up front when the container is initialized."
as defined by Doug Hellmann in The Python Standard Library by Example
How to use defaultdict
Import defaultdict
>>> from collections import defaultdict
Initialize defaultdict
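Initialize it by passing
a callable as its first argument (needed for default values; if it is omitted, default_factory is None and missing keys raise KeyError as usual)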
>>> d_int = defaultdict(int)
>>> d_list = defaultdict(list)
>>> def foo():
...     return 'default value'
...
>>> d_foo = defaultdict(foo)
>>> d_int
defaultdict(<type 'int'>, {})
>>> d_list
defaultdict(<type 'list'>, {})
>>> d_foo
defaultdict(<function foo at 0x7f34a0a69578>, {})
**kwargs after the callable (optional), which populate the dictionary
>>> d_int = defaultdict(int, a=10, b=12, c=13)
>>> d_int
defaultdict(<type 'int'>, {'a': 10, 'c': 13, 'b': 12})
or
>>> kwargs = {'a':10,'b':12,'c':13}
>>> d_int = defaultdict(int, **kwargs)
>>> d_int
defaultdict(<type 'int'>, {'a': 10, 'c': 13, 'b': 12})
How it works
As defaultdict is a subclass of the standard dictionary, it can perform all the same functions.
But when an unknown key is looked up with d[key], it returns (and stores) the default value instead of raising an error. For example:
>>> d_int['a']
10
>>> d_int['d']
0
>>> d_int
defaultdict(<type 'int'>, {'a': 10, 'c': 13, 'b': 12, 'd': 0})
To change the default value, assign a new default_factory:
>>> d_int.default_factory = lambda: 1
>>> d_int['e']
1
>>> d_int
defaultdict(<function <lambda> at 0x7f34a0a91578>, {'a': 10, 'c': 13, 'b': 12, 'e': 1, 'd': 0})
or
>>> def foo():
...     return 2
...
>>> d_int.default_factory = foo
>>> d_int['f']
2
>>> d_int
defaultdict(<function foo at 0x7f34a0a0a140>, {'a': 10, 'c': 13, 'b': 12, 'e': 1, 'd': 0, 'f': 2})
Examples in the Question
Example 1
As int has been passed as default_factory, any unknown key will return 0 by default.
As the loop walks over the string, it increases the count of each letter in d.
>>> s = 'mississippi'
>>> d = defaultdict(int)
>>> d.default_factory
<type 'int'>
>>> for k in s:
...     d[k] += 1
...
>>> d.items()
[('i', 4), ('p', 2), ('s', 4), ('m', 1)]
>>> d
defaultdict(<type 'int'>, {'i': 4, 'p': 2, 's': 4, 'm': 1})
Example 2
As list has been passed as default_factory, any unknown (non-existent) key will return [] (i.e. an empty list) by default.
As the loop walks over the list of tuples, it appends each value to d[color].
>>> s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
>>> d = defaultdict(list)
>>> d.default_factory
<type 'list'>
>>> for k, v in s:
...     d[k].append(v)
...
>>> d.items()
[('blue', [2, 4]), ('red', [1]), ('yellow', [1, 3])]
>>> d
defaultdict(<type 'list'>, {'blue': [2, 4], 'red': [1], 'yellow': [1, 3]})
Dictionaries are a convenient way to store data for later retrieval by name (key). Keys must be unique, hashable (in practice usually immutable) objects, and are typically strings. The values in a dictionary can be anything. For many applications, the values are simple types such as integers and strings.
It gets more interesting when the values in a dictionary are collections (lists, dicts, etc.) In this case, the value (an empty list or dict) must be initialized the first time a given key is used. While this is relatively easy to do manually, the defaultdict type automates and simplifies these kinds of operations.
A defaultdict works like a normal dict, but it is initialized with a function (“default factory”) that takes no arguments and provides the default value for a nonexistent key.
As long as a default factory is set, indexing a defaultdict with d[key] never raises a KeyError. Any key that does not exist gets the value returned by the default factory.
from collections import defaultdict
ice_cream = defaultdict(lambda: 'Vanilla')
ice_cream['Sarah'] = 'Chunky Monkey'
ice_cream['Abdul'] = 'Butter Pecan'
print(ice_cream['Sarah'])  # Chunky Monkey
print(ice_cream['Joe'])    # Vanilla
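One caveat worth checking: the default factory is only used by indexing (d[key]). The sketch below reuses the ice_cream example to show that .get() and the in operator leave the dictionary untouched, and that a defaultdict created without a factory behaves like a plain dict; the name no_factory is illustrative.

```python
from collections import defaultdict

ice_cream = defaultdict(lambda: 'Vanilla')

# .get() and the in operator bypass the default factory.
print(ice_cream.get('Joe'))  # None
print('Joe' in ice_cream)    # False
print(len(ice_cream))        # 0

# Indexing calls the factory and stores the result under the key.
print(ice_cream['Joe'])      # Vanilla
print(len(ice_cream))        # 1

# With no factory, indexing a missing key raises KeyError as usual.
no_factory = defaultdict()
try:
    no_factory['Joe']
except KeyError as exc:
    print('KeyError:', exc)  # KeyError: 'Joe'
```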
Here is another example of how defaultdict can help reduce time complexity:
from collections import defaultdict
# Time complexity O(n^2)
def delete_nth_naive(array, n):
    ans = []
    for num in array:
        if ans.count(num) < n:
            ans.append(num)
    return ans

# Time Complexity O(n), using hash tables.
def delete_nth(array, n):
    result = []
    counts = defaultdict(int)
    for i in array:
        if counts[i] < n:
            result.append(i)
            counts[i] += 1
    return result
x = [1,2,3,1,2,1,2,3]
print(delete_nth(x, n=2))
print(delete_nth_naive(x, n=2))
In conclusion, whenever you need a dictionary, and each element’s value should start with a default value, use a defaultdict.
There is a great explanation of defaultdicts here: http://ludovf.net/blog/python-collections-defaultdict/
Basically, the parameters int and list are functions that you pass. Remember that Python accepts function names as arguments. Called with no arguments, int() returns 0 and list() returns an empty list.
In a normal dictionary, if in your example I try d['a'], I will get an error (KeyError), since only the keys m, s, i and p exist and the key a has not been initialized. A defaultdict, however, takes a function as an argument: when you try to use a key that has not been initialized, it simply calls the function you passed in and assigns its return value as the value of the new key.
The behavior of defaultdict can be easily mimicked using dict.setdefault instead of d[key] in every call.
In other words, the code:
from collections import defaultdict
d = defaultdict(list)
print(d['key']) # empty list []
d['key'].append(1) # adding constant 1 to the list
print(d['key']) # list containing the constant [1]
is equivalent to:
d = dict()
print(d.setdefault('key', list())) # empty list []
d.setdefault('key', list()).append(1) # adding constant 1 to the list
print(d.setdefault('key', list())) # list containing the constant [1]
The only difference is that, using defaultdict, the list constructor is called only once, and using dict.setdefault the list constructor is called more often (but the code may be rewritten to avoid this, if really needed).
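The number of constructor calls can be counted with a wrapper around list. This is a sketch; counting_list and the calls counter are names invented for the measurement.

```python
from collections import defaultdict

calls = {"n": 0}


def counting_list():
    """Create an empty list and record that the constructor ran."""
    calls["n"] += 1
    return []


# defaultdict: the factory runs only when the key is actually missing.
d = defaultdict(counting_list)
d['key']
d['key'].append(1)
d['key']
defaultdict_calls = calls["n"]

# setdefault: the default argument is evaluated on every call.
calls["n"] = 0
p = {}
p.setdefault('key', counting_list())
p.setdefault('key', counting_list()).append(1)
p.setdefault('key', counting_list())
setdefault_calls = calls["n"]

print(defaultdict_calls, setdefault_calls)  # 1 3
```

Both dictionaries end up holding {'key': [1]}; only the number of throwaway lists differs.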
Some may argue there is a performance consideration, but this topic is a minefield. This post shows there isn't a big performance gain in using defaultdict, for example.
IMO, defaultdict is a collection that adds more confusion than benefit to the code. It is useless for me, but others may think differently.
Since the question is about "how it works", some readers may want to see more nuts and bolts. Specifically, the method in question is the __missing__(key) method. See: https://docs.python.org/2/library/collections.html#defaultdict-objects .
More concretely, this answer shows how to make use of __missing__(key) in a practical way:
https://stackoverflow.com/a/17956989/1593924
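As a rough illustration of the mechanism, here is a simplified stand-in for defaultdict built on a plain dict subclass and __missing__. The class name MyDefaultDict is hypothetical, and the real type is implemented in C with more features (copying, pickling, its own repr), so treat this as a sketch only.

```python
class MyDefaultDict(dict):
    """Simplified stand-in for collections.defaultdict (illustration only)."""

    def __init__(self, default_factory=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # dict.__getitem__ calls this hook when the key is absent.
        if self.default_factory is None:
            raise KeyError(key)
        value = self[key] = self.default_factory()
        return value


d = MyDefaultDict(list)
d['yellow'].append(1)
d['yellow'].append(3)
print(d)              # {'yellow': [1, 3]}
print(d.get('blue'))  # None, because get() does not go through __missing__
```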
To clarify what 'callable' means, here's an interactive session (from 2.7.6 but should work in v3 too):
>>> x = int
>>> x
<type 'int'>
>>> y = int(5)
>>> y
5
>>> z = x(5)
>>> z
5
>>> from collections import defaultdict
>>> dd = defaultdict(int)
>>> dd
defaultdict(<type 'int'>, {})
>>> dd = defaultdict(x)
>>> dd
defaultdict(<type 'int'>, {})
>>> dd['a']
0
>>> dd
defaultdict(<type 'int'>, {'a': 0})
That was the most typical use of defaultdict (except for the pointless use of the x variable). You can do the same thing with 0 as the explicit default value, but not with a simple value:
>>> dd2 = defaultdict(0)
Traceback (most recent call last):
File "<pyshell#7>", line 1, in <module>
dd2 = defaultdict(0)
TypeError: first argument must be callable
Instead, the following works because it passes in a simple function (it creates on the fly a nameless function which takes no arguments and always returns 0):
>>> dd2 = defaultdict(lambda: 0)
>>> dd2
defaultdict(<function <lambda> at 0x02C4C130>, {})
>>> dd2['a']
0
>>> dd2
defaultdict(<function <lambda> at 0x02C4C130>, {'a': 0})
>>>
And with a different default value:
>>> dd3 = defaultdict(lambda: 1)
>>> dd3
defaultdict(<function <lambda> at 0x02C4C170>, {})
>>> dd3['a']
1
>>> dd3
defaultdict(<function <lambda> at 0x02C4C170>, {'a': 1})
>>>
My own 2¢: you can also subclass defaultdict:
class MyDict(defaultdict):
    def __missing__(self, key):
        value = [None, None]
        self[key] = value
        return value
This could come in handy for very complex cases.
Well, defaultdict can also raise a KeyError in the following case:
from collections import defaultdict
d = defaultdict()
print(d[3])  # raises KeyError
Always remember to pass a default factory to defaultdict, like
d = defaultdict(int)
The defaultdict tool is a container in the collections module of Python. It is similar to the usual dictionary (dict) container, with one difference: a factory for default values is specified upon initialization, and it determines the value a missing key receives.
For example:
from collections import defaultdict
d = defaultdict(list)
d['python'].append("awesome")
d['something-else'].append("not relevant")
d['python'].append("language")
for i in d.items():
    print(i)
This prints:
('python', ['awesome', 'language'])
('something-else', ['not relevant'])
In short:
defaultdict(int) - the argument int means that missing keys start out as int(), i.e. 0.
defaultdict(list) - the argument list means that missing keys start out as list(), i.e. an empty list.
I think it's best used in place of a switch-case statement. Imagine we have a switch-case statement like the one below:
option = 1
switch(option) {
    case 1: print '1st option'
    case 2: print '2nd option'
    case 3: print '3rd option'
    default: return 'No such option'
}
Python has no switch-case statement (the match statement was only added in Python 3.10). We can achieve similar behaviour by using defaultdict.
from collections import defaultdict
def default_value(): return "Default Value"
dd = defaultdict(default_value)
dd[1] = '1st option'
dd[2] = '2nd option'
dd[3] = '3rd option'
print(dd[4])
print(dd[5])
print(dd[3])
It prints:
Default Value
Default Value
3rd option
In the above snippet dd has no keys 4 or 5, and hence it prints the default value that we configured in a helper function. This is nicer than a raw dictionary, where a KeyError is raised if the key is not present. From this it is evident that defaultdict can act like a switch-case statement and lets us avoid complicated if-elif-elif-else blocks.
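One thing to keep in mind with the defaultdict version is that every unknown option that is looked up (4 and 5 above) gets stored as a new entry. If that is unwanted, a plain dict with .get() gives the same fallback without growing the table. This is an alternative sketch, not part of the answer above; the names options and describe are illustrative.

```python
options = {
    1: '1st option',
    2: '2nd option',
    3: '3rd option',
}


def describe(option):
    """Look up option; fall back to a default without modifying the table."""
    return options.get(option, 'No such option')


print(describe(3))   # 3rd option
print(describe(4))   # No such option
print(len(options))  # 3, the failed lookup added nothing
```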
One more good example that impressed me a lot from this site is:
>>> from collections import defaultdict
>>> food_list = 'spam spam spam spam spam spam eggs spam'.split()
>>> food_count = defaultdict(int) # default value of int is 0
>>> for food in food_list:
...     food_count[food] += 1 # increment element's value by 1
...
>>> food_count
defaultdict(<type 'int'>, {'eggs': 1, 'spam': 7})
>>>
If we try to access any items other than eggs and spam we will get a count of 0.
With a plain dict you can assign a value to an unseen key, but you cannot modify one in place (for example with +=), because the current value has to be read first. With defaultdict this works:
import collections
d = collections.defaultdict(int)
for i in range(10):
    d[i] += i
print(d)
# Output: defaultdict(<class 'int'>, {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7, 8: 8, 9: 9})
The same loop with a plain dict fails:
import collections
d = {}
for i in range(10):
    d[i] += i
print(d)
# Output: Traceback (most recent call last): File "python", line 4, in <module> KeyError: 0
The standard dictionary includes the method setdefault() for retrieving a value and establishing a default if the value does not exist. By contrast, defaultdict lets the caller specify the default up front when the container is initialized.
import collections
def default_factory():
    return 'default value'

d = collections.defaultdict(default_factory, foo='bar')
print 'd:', d
print 'foo =>', d['foo']
print 'bar =>', d['bar']
This works well as long as it is appropriate for all keys to have the same default. It can be especially useful if the default is a type used for aggregating or accumulating values, such as a list, set, or even int. The standard library documentation includes several examples of using defaultdict this way.
$ python collections_defaultdict.py
d: defaultdict(<function default_factory at 0x100468c80>, {'foo': 'bar'})
foo => bar
bar => default value
# dictionary and defaultdict
normaldictionary = dict()
print(type(normaldictionary))
# print(normaldictionary["keynotexist"])
# The line above raises a KeyError because the key is not present.
from collections import defaultdict
defaultdict1 = defaultdict()
print(type(defaultdict1))
# print(defaultdict1['keynotexist'])
# This also raises a KeyError: without a default factory, defaultdict behaves like a plain dict.
######################################
from collections import defaultdict
default2 = defaultdict(int)
print(default2['keynotexist'])  # prints 0
https://msatutorpy.medium.com/different-between-dictionary-and-defaultdictionary-cb215f682971
The documentation and the explanation are pretty much self-explanatory:
http://docs.python.org/library/collections.html#collections.defaultdict
The callable (int, str, etc.) passed as an argument is used to create a default value for any key that is not present in the dict.
