I am trying to 'destructure' a dictionary and bind its values to variables named after its keys. Something like
params = {'a':1,'b':2}
a,b = params.values()
But since dictionaries are not ordered, there is no guarantee that params.values() will return values in the order of (a, b). Is there a nice way to do this?
from operator import itemgetter
params = {'a': 1, 'b': 2}
a, b = itemgetter('a', 'b')(params)
Instead of elaborate lambda functions or dictionary comprehensions, you may as well use the standard library.
One way to do this with less repetition than Jochen's suggestion is with a helper function. This gives the flexibility to list your variable names in any order and only destructure a subset of what is in the dict:
pluck = lambda dict, *args: (dict[arg] for arg in args)
things = {'blah': 'bleh', 'foo': 'bar'}
foo, blah = pluck(things, 'foo', 'blah')
Also, instead of joaquin's OrderedDict you could sort the keys and get the values. The only catches are you need to specify your variable names in alphabetical order and destructure everything in the dict:
sorted_vals = lambda dict: (t[1] for t in sorted(dict.items()))
things = {'foo': 'bar', 'blah': 'bleh'}
blah, foo = sorted_vals(things)
How come nobody posted the simplest approach?
params = {'a':1,'b':2}
a, b = params['a'], params['b']
Python is only able to "destructure" sequences, not dictionaries. So, to write what you want, you will have to map the needed entries to a proper sequence. Personally, the closest match I could find is the (not very sexy):
a,b = [d[k] for k in ('a','b')]
This works with generators too:
a,b = (d[k] for k in ('a','b'))
Here is a full example:
>>> d = dict(a=1,b=2,c=3)
>>> d
{'a': 1, 'c': 3, 'b': 2}
>>> a, b = [d[k] for k in ('a','b')]
>>> a
1
>>> b
2
>>> a, b = (d[k] for k in ('a','b'))
>>> a
1
>>> b
2
Here's another way to do it similarly to how a destructuring assignment works in JS:
params = {'b': 2, 'a': 1}
a, b, rest = (lambda a, b, **rest: (a, b, rest))(**params)
What we did was to unpack the params dictionary into key values (using **) (like in Jochen's answer), then we've taken those values in the lambda signature and assigned them according to the key name - and here's a bonus - we also get a dictionary of whatever is not in the lambda's signature so if you had:
params = {'b': 2, 'a': 1, 'c': 3}
a, b, rest = (lambda a, b, **rest: (a, b, rest))(**params)
After the lambda has been applied, the rest variable will now contain:
{'c': 3}
Useful for omitting unneeded keys from a dictionary.
Hope this helps.
Maybe you really want to do something like this?
def some_func(a, b):
    print(a, b)
params = {'a':1,'b':2}
some_func(**params) # equiv to some_func(a=1, b=2)
If you are afraid of the issues involved in using the locals dictionary and you prefer to follow your original strategy, collections.OrderedDict (available since Python 2.7 and 3.1) lets you recover your dictionary items in the order in which they were first inserted.
(Ab)using the import system
The from ... import statement lets us destructure and bind attribute names of an object. Of course, it only works for objects in the sys.modules dictionary, so one could use a hack like this:
import sys, types
mydict = {'a':1,'b':2}
sys.modules["mydict"] = types.SimpleNamespace(**mydict)
from mydict import a, b
A somewhat more serious hack would be to write a context manager to load and unload the module:
with obj_as_module(mydict, "mydict_module"):
    from mydict_module import a, b
By pointing the __getattr__ method of the module directly to the __getitem__ method of the dict, the context manager can also avoid using SimpleNamespace(**mydict).
See this answer for an implementation and some extensions of the idea.
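For illustration, here is a minimal sketch of what such a context manager could look like. It assumes the plain SimpleNamespace approach rather than the __getattr__ redirection, and the body of obj_as_module is my own guess at a reasonable implementation, not the one from the linked answer:

```python
import sys
import types
from contextlib import contextmanager

@contextmanager
def obj_as_module(mapping, name):
    """Temporarily register `mapping` in sys.modules under `name` (illustrative helper)."""
    previous = sys.modules.get(name)
    sys.modules[name] = types.SimpleNamespace(**mapping)
    try:
        yield
    finally:
        # Restore whatever was registered before, or remove our fake module.
        if previous is None:
            sys.modules.pop(name, None)
        else:
            sys.modules[name] = previous

mydict = {'a': 1, 'b': 2}
with obj_as_module(mydict, "mydict_module"):
    from mydict_module import a, b

print(a, b)
```

The names stay bound after the with block ends; only the fake module entry is removed.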
One can also temporarily replace the entire sys.modules dict with the dict of interest, and do import a, b without from.
Warning 1: as stated in the docs, this is not guaranteed to work on all Python implementations:
CPython implementation detail: This function relies on Python stack frame support
in the interpreter, which isn’t guaranteed to exist in all implementations
of Python. If running in an implementation without Python stack frame support
this function returns None.
Warning 2: this function does make the code shorter, but it probably contradicts the Python philosophy of being as explicit as you can. Moreover, it doesn't address the issues pointed out by John Christopher Jones in the comments, although you could make a similar function that works with attributes instead of keys. This is just a demonstration that you can do that if you really want to!
import inspect
def destructure(dict_):
    if not isinstance(dict_, dict):
        raise TypeError(f"{dict_} is not a dict")
    # the parent frame will contain the information about
    # the current line
    parent_frame = inspect.currentframe().f_back
    # so we extract that line (by default the code context
    # only contains the current line)
    (line,) = inspect.getframeinfo(parent_frame).code_context
    # "hello, key = destructure(my_dict)"
    # -> ("hello, key ", "=", " destructure(my_dict)")
    lvalues, _equals, _rvalue = line.strip().partition("=")
    # -> ["hello", "key"]
    keys = [s.strip() for s in lvalues.split(",") if s.strip()]
    if missing := [key for key in keys if key not in dict_]:
        raise KeyError(*missing)
    for key in keys:
        yield dict_[key]
In [5]: my_dict = {"hello": "world", "123": "456", "key": "value"}
In [6]: hello, key = destructure(my_dict)
In [7]: hello
Out[7]: 'world'
In [8]: key
Out[8]: 'value'
This solution allows you to pick some of the keys, not all, like in JavaScript. It's also safe for user-provided dictionaries.
With Python 3.10, you can do:
d = {"a": 1, "b": 2}
match d:
    case {"a": a, "b": b}:
        print(f"A is {a} and b is {b}")
but it adds two extra levels of indentation, and you still have to repeat the key names.
Look at the other answers, as this one won't cope with an unexpected key order in the dictionary. I will update it with a correct version soon.
try this
data = {'a':'Apple', 'b':'Banana','c':'Carrot'}
keys = data.keys()
a,b,c = [data[k] for k in keys]
result:
a == 'Apple'
b == 'Banana'
c == 'Carrot'
Well, if you want these in a class you can always do this:
class AttributeDict(dict):
    def __init__(self, *args, **kwargs):
        super(AttributeDict, self).__init__(*args, **kwargs)
        self.__dict__.update(self)
d = AttributeDict(a=1, b=2)
Based on @ShawnFumo's answer, I came up with this:
def destruct(dict): return (t[1] for t in sorted(dict.items()))
d = {'b': 'Banana', 'c': 'Carrot', 'a': 'Apple' }
a, b, c = destruct(d)
(Notice the order of items in dict)
An old topic, but I found this to be a useful method:
data = {'a':'Apple', 'b':'Banana','c':'Carrot'}
for key in data.keys():
    locals()[key] = data[key]
This method loops over every key in your dictionary and sets a variable to that name and then assigns the value from the associated key to this new variable.
Testing:
print(a)
print(b)
print(c)
Output
Apple
Banana
Carrot
An easy and simple way to destructure a dict in Python:
params = {"a": 1, "b": 2}
a, b = [params[key] for key in ("a", "b")]
print(a, b)
# Output:
# 1 2
I don't know whether it's good style, but
locals().update(params)
will do the trick. You then have a, b and whatever was in your params dict available as corresponding local variables.
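One thing worth checking before relying on this: in CPython, writes to the dictionary returned by locals() are not reflected in a function's real local variables, so the trick only behaves as described at module level, where locals() is the same dict as globals(). A small sketch of the function case (demo and zz_value are made-up names for illustration):

```python
def demo(params):
    # Inside a function, locals() is only a snapshot of the local namespace.
    locals().update(params)
    try:
        return zz_value  # compiled as a global lookup, not a local one
    except NameError:
        return "zz_value was never bound"

result = demo({"zz_value": 1})
print(result)
```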
Since dictionaries are guaranteed to keep their insertion order in Python >= 3.7, that means that it's completely safe and idiomatic to just do this nowadays:
params = {'a': 1, 'b': 2}
a, b = params.values()
print(a)
print(b)
Output:
1
2
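Keep in mind that the order that matters is insertion order, not the names on the left-hand side. A short sketch of that pitfall, assuming Python 3.7+ (the variables swapped and by_name exist only for the demonstration):

```python
params = {'b': 2, 'a': 1}   # 'b' was inserted first

# Positional unpacking follows insertion order: a gets 2 and b gets 1.
a, b = params.values()
swapped = (a, b)

# Looking the keys up by name does not depend on insertion order.
a, b = params['a'], params['b']
by_name = (a, b)

print(swapped, by_name)
```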
Related
I was looking for a way to "unpack" a dictionary in a generic way and found a relevant question (and answers) which explained various techniques (TL;DR: it is not too elegant).
That question, however, addresses the case where the keys of the dict are not known; the OP wanted to have them added to the local namespace automatically.
My problem is possibly simpler: I get a dict from a function and would like to dissect it on the fly, knowing the keys I will need (I may not need all of them every time). Right now I can only do
def myfunc():
    return {'a': 1, 'b': 2, 'c': 3}
x = myfunc()
a = x['a']
my_b_so_that_the_name_differs_from_the_key = x['b']
# I do not need c this time
while I was looking for the equivalent of
def myotherfunc():
    return 1, 2
a, b = myotherfunc()
but for a dict (which is what is returned by my function). I do not want to use the latter solution for several reasons, one of them being that it is not obvious which variable corresponds to which returned element (the first solution has at least the merit of being readable).
Is such operation available?
If you really must, you can use an operator.itemgetter() object to extract values for multiple keys as a tuple:
from operator import itemgetter
a, b = itemgetter('a', 'b')(myfunc())
This is still not pretty; I'd prefer the explicit and readable separate lines where you first assign the return value, then extract those values.
Demo:
>>> from operator import itemgetter
>>> def myfunc():
... return {'a': 1, 'b': 2, 'c': 3}
...
>>> itemgetter('a', 'b')(myfunc())
(1, 2)
>>> a, b = itemgetter('a', 'b')(myfunc())
>>> a
1
>>> b
2
You could also use map:
def myfunc():
    return {'a': 1, 'b': 2, 'c': 3}
a,b = map(myfunc().get,["a","b"])
print(a,b)
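One difference from itemgetter that may matter: dict.get returns None for a key that is not present, while itemgetter raises KeyError. A quick sketch of both behaviours (the key 'zzz' is just a made-up missing key):

```python
from operator import itemgetter

def myfunc():
    return {'a': 1, 'b': 2, 'c': 3}

# dict.get silently yields None for a key that is not present...
a, missing = map(myfunc().get, ['a', 'zzz'])

# ...whereas itemgetter raises KeyError for the same key.
try:
    itemgetter('a', 'zzz')(myfunc())
    raised = False
except KeyError:
    raised = True

print(a, missing, raised)
```

Which behaviour you want depends on whether a missing key should be treated as an error.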
In addition to the operator.itemgetter() method, you can also write your own myotherfunc(). It takes a list of the required keys as an argument and returns a tuple of their corresponding values.
def myotherfunc(keys_list):
    reference_dict = myfunc()
    return tuple(reference_dict[key] for key in keys_list)
>>> a,b = myotherfunc(['a','b'])
>>> a
1
>>> b
2
>>> a,c = myotherfunc(['a','c'])
>>> a
1
>>> c
3
I have multiple variables that I need to pack as one and hold sequentially, like in an array or list. This needs to be done in Python, and I am still in my Python infancy.
E.g. in Python:
a = Tom
b = 100
c = 3.14
d = {'x':1, 'y':2, 'z':3}
All of the above should go in one sequential data structure. For the sake of clarity, here is a similar implementation as I would have written it in C++:
struct node
{
    string a;
    int b;
    float c;
    map<char,int> d; // just as an example for a dictionary in Python
};
vector<node> v; // looking for something like this which can be iterable
If someone could give me a similar implementation for storing, iterating over and modifying the contents, that would be really helpful. Any pointers in the right direction are also welcome.
Thanks
You can either use a dictionary like Michael proposes (but then you need to access the contents of v with v['a'], which is a little cumbersome), or you can use the equivalent of C++'s struct: a named tuple:
import collections
node = collections.namedtuple('node', 'a b c d')
# Tom = ...
v = node(Tom, 100, 3.14, {'x':1, 'y':2, 'z':3})
print(v)  # node(a=…, b=100, c=3.14, d={'x':1, 'y':2, 'z':3})
print(v.c)  # 3.14
print(v[2])  # 3.14 (works too, but is less meaningful and robust than something like v.last_name)
This is similar to, but simpler than defining your own class: type(v) == node, etc. Note however, as volcano pointed out, that the values stored in a namedtuple cannot be changed (a namedtuple is immutable).
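To make the immutability concrete, here is a small sketch. It uses the string 'Tom' as a stand-in for the Tom value above, and shows that assigning to a field fails while _replace returns a modified copy:

```python
import collections

node = collections.namedtuple('node', 'a b c d')
v = node('Tom', 100, 3.14, {'x': 1, 'y': 2, 'z': 3})

# Assigning to a field of a namedtuple raises AttributeError.
try:
    v.b = 200
    mutated = True
except AttributeError:
    mutated = False

# _replace builds a new tuple with the given fields changed; v is untouched.
v2 = v._replace(b=200)
print(mutated, v.b, v2.b)
```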
If you indeed need to modify the values inside your records, the best option is a class:
class node(object):
    def __init__(self, *arg_list):
        for (name, arg) in zip('a b c d'.split(), arg_list):
            setattr(self, name, arg)
v = node(1, 20, 300, "Eric")
print(v.d)  # "Eric"
v.d = "Ajay"  # Works
The last option, which I do not recommend, is indeed to use a list or a tuple, like ATOzTOA mentions: elements must be accessed in a not-so-legible way: node[3] is less meaningful than node.last_name; also, you cannot easily change the order of the fields, when using a list or a tuple (whereas the order is immaterial if you access a named tuple or custom class attributes).
Multiple node objects are customarily put in a list, the standard Python structure for such a purpose:
all_nodes = [node(…), node(…),…]
or
all_nodes = []
for … in …:
all_nodes.append(node(…))
or
all_nodes = [node(…) for … in …]
etc. The best method depends on how the various node objects are created, but in many cases a list is likely to be the best structure.
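Putting the pieces together, here is one possible sketch of storing, iterating over and modifying such records. It swaps the hand-written class for a dataclass (standard library, Python 3.7+), which is my substitution rather than something from the answers above; the class name Node and the second record ("Ann") are made up for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    a: str
    b: int
    c: float
    d: dict = field(default_factory=dict)  # a fresh dict per instance

# Storing: a plain list of records.
all_nodes = [
    Node("Tom", 100, 3.14, {'x': 1, 'y': 2, 'z': 3}),
    Node("Ann", 200, 2.72),
]

# Iterating and modifying in place: dataclass instances are mutable.
for n in all_nodes:
    n.b += 1
    n.d['seen'] = True

names = [n.a for n in all_nodes]
print(names, all_nodes[0].b)
```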
Note, however, that if you need to store something akin to a spreadsheet table and need speed and facilities for accessing its columns, you might be better off with NumPy's record arrays, or a package like Pandas.
You could put all the values in a dictionary, and have a list of these dictionaries.
{'a': a, 'b': b, 'c': c, 'd': d}
Otherwise, if this data is something that could be represented by a class, for example a 'Person'; create a class of type Person and create an object of that class with your data:
http://docs.python.org/2/tutorial/classes.html
Just use lists:
a = "Tom"
b = 100
c = 3.14
d = {'x':1, 'y':2, 'z':3}
data = [a, b, c, d]
print(data)
for item in data:
    print(item)
Output:
['Tom', 100, 3.14, {'y': 2, 'x': 1, 'z': 3}]
Tom
100
3.14
{'y': 2, 'x': 1, 'z': 3}
Imagine you have a dictionary in python: myDic = {'a':1, 'b':{'c':2, 'd':3}}. You can certainly set a variable to a key value and use it later, such as:
myKey = 'b'
myDic[myKey]
>>> {'c':2, 'd':3}
However, is there a way to somehow set a variable to a value that, when used as a key, will dig into sub dictionaries as well? Is there a way to accomplish the following pseudo-code in python?
myKey = "['b']['c']"
myDic[myKey]
>>> 2
So first it uses 'b' as a key, and whatever is returned it then uses 'c' as a key on that. Obviously, it would return an error if the value returned from the first lookup is not a dictionary.
No, there is nothing you can put into a variable so that myDict[myKey] will dig into the nested dictionaries.
Here is a function that may work for you as an alternative:
def recursive_get(d, keys):
    if len(keys) == 1:
        return d[keys[0]]
    return recursive_get(d[keys[0]], keys[1:])
Example:
>>> myDic = {'a':1, 'b':{'c':2, 'd':3}}
>>> recursive_get(myDic, ['b', 'c'])
2
No, not with a regular dict. With myDict[key] you can only access values that are actually values of myDict. But if myDict contains other dicts, the values of those nested dicts are not values of myDict.
Depending on what you're doing with the data structure, it may be possible to get what you want by using tuple keys instead of nested dicts. Instead of having myDic = {'b':{'c':2, 'd':3}}, you could have myDic = {('b', 'c'):2, ('b', 'd'): 3}. Then you can access the values with something like myDic['b', 'c']. And you can indeed do:
val = 'b', 'c'
myDic[val]
AFAIK, you cannot. If you think about the way python works, it evaluates inside out, left to right. [] is a shorthand for __getitem__ in this case. Thus you would need to parse the arguments you are passing into __getitem__ (whatever you pass in) and handle that intelligently. If you wanted to have such behavior, you would need to subclass/write your own dict class.
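As a rough sketch of that subclassing idea: the class below treats a list key as a path into nested containers. Lists are unhashable, so they can never clash with an ordinary key. The name PathDict is hypothetical, and this is only one of many ways to do it:

```python
class PathDict(dict):
    """dict subclass (illustrative) where a list key is walked as a path."""

    def __getitem__(self, key):
        if isinstance(key, list):
            value = self
            for part in key:
                value = value[part]  # descend one level per path element
            return value
        return super().__getitem__(key)

myDic = PathDict({'a': 1, 'b': {'c': 2, 'd': 3}})
myKey = ['b', 'c']
print(myDic[myKey])
```

Ordinary lookups such as myDic['a'] keep working because non-list keys fall through to the normal dict behaviour.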
myDict = {'a':1, 'b':{'c':2, 'd':3}}
k = 'b'
myDict.get(k) should give
{'c':2, 'd':3}
and either
myDict.get(k)['c']
OR
k1 = 'c'
myDict.get(k).get(k1) should give 2
Pretty old question. There is no builtin function for that.
Compact solution using functools.reduce and operator.getitem:
from functools import reduce
from operator import getitem
d = {'a': {'b': ['banana', 'lemon']}}
p = ['a', 'b', 1]
v = reduce(getitem, p, d)
# 'lemon'
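If a missing key or index should yield a fallback instead of an exception, the same idea can be wrapped in a small helper. This is only a sketch; get_path and its default parameter are names I made up:

```python
from functools import reduce
from operator import getitem

def get_path(data, path, default=None):
    """Follow `path` through nested dicts/lists, returning `default` on failure."""
    try:
        return reduce(getitem, path, data)
    except (KeyError, IndexError, TypeError):
        return default

d = {'a': {'b': ['banana', 'lemon']}}
found = get_path(d, ['a', 'b', 1])
absent = get_path(d, ['a', 'x', 0], default='n/a')
print(found, absent)
```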
In someone else's code I read the following two lines:
x = defaultdict(lambda: 0)
y = defaultdict(lambda: defaultdict(lambda: 0))
As the argument of defaultdict is a default factory, I think the first line means that when I call x[k] for a nonexistent key k (such as a statement like v=x[k]), the key-value pair (k,0) will be automatically added to the dictionary, as if the statement x[k]=0 is first executed. Am I correct?
And what about y? It seems that the default factory will create a defaultdict with default 0. But what does that mean concretely? I tried to play around with it in Python shell, but couldn't figure out what it is exactly.
I think the first line means that when I call x[k] for a nonexistent key k (such as a statement like v=x[k]), the key-value pair (k,0) will be automatically added to the dictionary, as if the statement x[k]=0 is first executed.
That's right. This is more idiomatically written
x = defaultdict(int)
In the case of y, when you do y["ham"]["spam"], the key "ham" is inserted in y if it does not exist. The value associated with it becomes a defaultdict in which "spam" is automatically inserted with a value of 0.
I.e., y is a kind of "two-tiered" defaultdict. If "ham" not in y, then evaluating y["ham"]["spam"] is like doing
y["ham"] = {}
y["ham"]["spam"] = 0
in terms of ordinary dict.
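A short sketch that makes the insertion visible, using defaultdict(int) as suggested above. Note that get() does not trigger the default factory, so it leaves the dictionary untouched (the snapshot variable is only there for readable output):

```python
from collections import defaultdict

x = defaultdict(int)
v = x['k']            # missing key: int() is called and ('k', 0) is stored
w = x.get('other')    # get() bypasses the default factory and stores nothing

y = defaultdict(lambda: defaultdict(int))
y['ham']['spam']      # creates y['ham'], then y['ham']['spam'] == 0
snapshot = {k: dict(inner) for k, inner in y.items()}

print(dict(x), w, snapshot)
```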
You are correct for what the first one does. As for y, it will create a defaultdict with default 0 when a key doesn't exist in y, so you can think of this as a nested dictionary. Consider the following example:
y = defaultdict(lambda: defaultdict(lambda: 0))
print(y['k1']['k2'])   # 0
print(dict(y['k1']))   # {'k2': 0}
To create an equivalent nested dictionary structure without defaultdict you would need to create an inner dict for y['k1'] and then set y['k1']['k2'] to 0, but defaultdict does all of this behind the scenes when it encounters keys it hasn't seen:
y = {}
y['k1'] = {}
y['k1']['k2'] = 0
The following function may help for playing around with this on an interpreter to better your understanding:
def to_dict(d):
    if isinstance(d, defaultdict):
        return dict((k, to_dict(v)) for k, v in d.items())
    return d
This will return the dict equivalent of a nested defaultdict, which is a lot easier to read, for example:
>>> y = defaultdict(lambda: defaultdict(lambda: 0))
>>> y['a']['b'] = 5
>>> y
defaultdict(<function <lambda> at 0xb7ea93e4>, {'a': defaultdict(<function <lambda> at 0xb7ea9374>, {'b': 5})})
>>> to_dict(y)
{'a': {'b': 5}}
defaultdict takes a zero-argument callable to its constructor, which is called when the key is not found, as you correctly explained.
lambda: 0 will of course always return zero, but the preferred method to do that is defaultdict(int), which will do the same thing.
As for the second part, the author would like to create a new defaultdict(int), or a nested dictionary, whenever a key is not found in the top-level dictionary.
The existing answers are good, but I am adding this one to give more information:
"defaultdict requires an argument that is callable. The return value of that callable object is the default value that the dictionary returns when you try to access the dictionary with a key that does not exist."
Here's an example
SAMPLE= {'Age':28, 'Salary':2000}
SAMPLE = defaultdict(lambda:0,SAMPLE)
>>> SAMPLE
defaultdict(<function <lambda> at 0x0000000002BF7C88>, {'Salary': 2000, 'Age': 28})
>>> SAMPLE['Age']    # returns 28
>>> SAMPLE['Phone']  # returns 0, the default for a key that does not exist in SAMPLE
y = defaultdict(lambda:defaultdict(lambda:0))
will be helpful if you try this y['a']['b'] += 1
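For instance, counting how often each pair occurs becomes a one-liner per item, with no need to check whether either level exists yet. A small sketch with made-up pairs:

```python
from collections import defaultdict

pairs = [('a', 'b'), ('a', 'b'), ('a', 'c'), ('x', 'y')]  # illustrative data

y = defaultdict(lambda: defaultdict(lambda: 0))
for first, second in pairs:
    y[first][second] += 1   # both levels are created on demand

# Convert to plain dicts for a readable result.
counts = {k: dict(v) for k, v in y.items()}
print(counts)
```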
The following two expressions seem equivalent to me. Which one is preferable?
data = [('a', 1), ('b', 1), ('b', 2)]
d1 = {}
d2 = {}
for key, val in data:
    # variant 1)
    d1[key] = d1.get(key, []) + [val]
    # variant 2)
    d2.setdefault(key, []).append(val)
The results are the same but which version is better or rather more pythonic?
Personally I find version 2 harder to understand, as to me setdefault is very tricky to grasp. If I understand correctly, it looks for the value of "key" in the dictionary, if not available, enters "[]" into the dict, returns a reference to either the value or "[]" and appends "val" to that reference. While certainly smooth it is not intuitive in the least (at least to me).
To my mind, version 1 is easier to understand (if available, get the value for "key", if not, get "[]", then join with a list made up from [val] and place the result in "key"). But while more intuitive to understand, I fear this version is less performant, with all this list creating. Another disadvantage is that "d1" occurs twice in the expression which is rather error-prone. Probably there is a better implementation using get, but presently it eludes me.
My guess is that version 2, although more difficult to grasp for the inexperienced, is faster and therefore preferable. Opinions?
Your two examples do the same thing, but that doesn't mean get and setdefault do.
The difference between the two is basically manually setting d[key] to point to the list every time, versus setdefault automatically setting d[key] to the list only when it's unset.
Making the two methods as similar as possible, I ran
from timeit import timeit
print(timeit("c = d.get(0, []); c.extend([1]); d[0] = c", "d = {1: []}", number = 1000000))
print(timeit("c = d.get(1, []); c.extend([1]); d[0] = c", "d = {1: []}", number = 1000000))
print(timeit("d.setdefault(0, []).extend([1])", "d = {1: []}", number = 1000000))
print(timeit("d.setdefault(1, []).extend([1])", "d = {1: []}", number = 1000000))
and got
0.794723378711
0.811882272256
0.724429205999
0.722129751973
So setdefault is around 10% faster than get for this purpose.
The get method allows you to do less than you can with setdefault. You can use it to avoid getting a KeyError when the key doesn't exist (if that's something that's going to happen frequently) even if you don't want to set the key.
See Use cases for the 'setdefault' dict method and dict.get() method returns a pointer for some more info about the two methods.
The thread about setdefault concludes that most of the time, you want to use a defaultdict. The thread about get concludes that it is slow, and often you're better off (speed wise) doing a double lookup, using a defaultdict, or handling the error (depending on the size of the dictionary and your use case).
The accepted answer from agf isn't comparing like with like. After:
print(timeit("d[0] = d.get(0, []) + [1]", "d = {1: []}", number = 10000))
d[0] contains a list with 10,000 items whereas after:
print(timeit("d.setdefault(0, []) + [1]", "d = {1: []}", number = 10000))
d[0] is simply [], i.e. the d.setdefault version never modifies the list stored in d. The code should actually be:
print(timeit("d.setdefault(0, []).append(1)", "d = {1: []}", number = 10000))
and in fact is faster than the faulty setdefault example.
The difference here is really because, when you append using concatenation, the whole list is copied every time (and once you have 10,000 elements that starts to become measurable). Using append, the list updates are amortised O(1), i.e. effectively constant time.
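If you want to see the effect yourself, something along these lines should do. It is only a sketch: with_concat, with_append and the sizes are arbitrary choices of mine, and the timings will vary by machine, so none are quoted here:

```python
from timeit import timeit

def with_concat(n):
    d = {}
    for i in range(n):
        d[0] = d.get(0, []) + [i]      # builds a brand-new list each time
    return d

def with_append(n):
    d = {}
    for i in range(n):
        d.setdefault(0, []).append(i)  # mutates the stored list in place
    return d

n = 2000
t_concat = timeit(lambda: with_concat(n), number=5)
t_append = timeit(lambda: with_append(n), number=5)
print(f"concat: {t_concat:.4f}s  append: {t_append:.4f}s")
```

Both functions build the same dictionary; only the cost of getting there differs.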
Finally, there are two other options not considered in the original question: defaultdict or simply testing the dictionary to see whether it already contains the key.
So, assuming d3, d4 = defaultdict(list), {}
# variant 1 (0.39)
d1[key] = d1.get(key, []) + [val]
# variant 2 (0.003)
d2.setdefault(key, []).append(val)
# variant 3 (0.0017)
d3[key].append(val)
# variant 4 (0.002)
if key in d4:
    d4[key].append(val)
else:
    d4[key] = [val]
variant 1 is by far the slowest because it copies the list every time, variant 2 is the second slowest, variant 3 is the fastest but won't work if you need Python older than 2.5, and variant 4 is just slightly slower than variant 3.
I would say use variant 3 if you can, with variant 4 as an option for those occasional places where defaultdict isn't an exact fit. Avoid both of your original variants.
For those who are still struggling to understand these two methods, here is the basic difference between get() and setdefault():
Scenario-1
root = {}
root.setdefault('A', [])
print(root)
Scenario-2
root = {}
root.get('A', [])
print(root)
In Scenario-1 output will be {'A': []} while in Scenario-2 {}
So setdefault() sets absent keys in the dict while get() only provides you default value but it does not modify the dictionary.
Now let's see where this is useful.
Suppose you are looking up a key in a dict whose values are lists: you want to modify the list if the key is found, and otherwise create a new key with that list.
using setdefault()
def fn1(dic, key, lst):
    dic.setdefault(key, []).extend(lst)
using get()
def fn2(dic, key, lst):
    dic[key] = dic.get(key, []) + (lst)  # explicit assignment happens here
Now let's examine the timings:
dic = {}
%%timeit -n 10000 -r 4
fn1(dic, 'A', [1,2,3])
Took 288 ns
dic = {}
%%timeit -n 10000 -r 4
fn2(dic, 'A', [1,2,3])
Took 128 s
So there is a very large timing difference between these two approaches.
You might want to look at defaultdict in the collections module. The following is equivalent to your examples.
from collections import defaultdict
data = [('a', 1), ('b', 1), ('b', 2)]
d = defaultdict(list)
for k, v in data:
    d[k].append(v)
There's more here.
1. Explained with a good example here:
http://code.activestate.com/recipes/66516-add-an-entry-to-a-dictionary-unless-the-entry-is-a/
dict.setdefault typical usage
somedict.setdefault(somekey,[]).append(somevalue)
dict.get typical usage
theIndex[word] = 1 + theIndex.get(word,0)
2. More explanation : http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html
dict.setdefault() is equivalent to get or set & get. Or set if necessary then get. It's especially efficient if your dictionary key is expensive to compute or long to type.
The only problem with dict.setdefault() is that the default value is always evaluated, whether needed or not. That only matters if the default value is expensive to compute. In that case, use defaultdict.
3. Finally the official docs with difference highlighted http://docs.python.org/2/library/stdtypes.html
get(key[, default])
Return the value for key if key is in the dictionary, else default. If
default is not given, it defaults to None, so that this method never
raises a KeyError.
setdefault(key[, default])
If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None.
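The point in item 2 about the default always being evaluated can be checked with a small counter. This is a sketch; expensive_default and the calls counter are hypothetical stand-ins for a costly computation:

```python
from collections import defaultdict

calls = {"count": 0}

def expensive_default():
    calls["count"] += 1   # record every time the default is built
    return []

# setdefault: the argument is evaluated on every call, even when 'k' exists.
plain = {}
for value in (1, 2, 3):
    plain.setdefault('k', expensive_default()).append(value)
setdefault_calls = calls["count"]

# defaultdict: the factory runs only when the key is actually missing.
calls["count"] = 0
lazy = defaultdict(expensive_default)
for value in (1, 2, 3):
    lazy['k'].append(value)
defaultdict_calls = calls["count"]

print(setdefault_calls, defaultdict_calls)
```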
The logic of dict.get is:
if key in a_dict:
    value = a_dict[key]
else:
    value = default_value
Take an example:
In [72]: a_dict = {'mapping':['dict', 'OrderedDict'], 'array':['list', 'tuple']}
In [73]: a_dict.get('string', ['str', 'bytes'])
Out[73]: ['str', 'bytes']
In [74]: a_dict.get('array', ['str', 'bytes'])
Out[74]: ['list', 'tuple']
The mechanism of setdefault is:
levels = ['master', 'manager', 'salesman', 'accountant', 'assistant']
#group them by the leading letter
group_by_leading_letter = {}
# the logic expressed by obvious if condition
for level in levels:
    leading_letter = level[0]
    if leading_letter not in group_by_leading_letter:
        group_by_leading_letter[leading_letter] = [level]
    else:
        group_by_leading_letter[leading_letter].append(level)
In [80]: group_by_leading_letter
Out[80]: {'a': ['accountant', 'assistant'], 'm': ['master', 'manager'], 's': ['salesman']}
The setdefault dict method is for precisely this purpose. The preceding for loop can be rewritten as:
In [87]: for level in levels:
    ...:     leading = level[0]
    ...:     group_by_leading_letter.setdefault(leading,[]).append(level)
Out[80]: {'a': ['accountant', 'assistant'], 'm': ['master', 'manager'], 's': ['salesman']}
Put simply, the element is appended either to the list that already exists for that key or to a newly inserted empty list.
A defaultdict makes this even easier. To create one, you pass a type or function for generating the default value for each slot in the dict:
from collections import defaultdict
group_by_leading_letter = defaultdict(list)
for level in levels:
    group_by_leading_letter[level[0]].append(level)
There is no strict answer to this question. They both accomplish the same purpose. They can both be used to deal with missing values on keys. The only difference that I have found is that with setdefault(), the key that you invoke (if not previously in the dictionary) gets automatically inserted while it does not happen with get(). Here is an example:
Setdefault()
>>> myDict = {'A': 'GOD', 'B':'Is', 'C':'GOOD'} #(1)
>>> myDict.setdefault('C') #(2)
'GOOD'
>>> myDict.setdefault('C','GREAT') #(3)
'GOOD'
>>> myDict.setdefault('D','AWESOME') #(4)
'AWESOME'
>>> myDict #(5)
{'A': 'GOD', 'B': 'Is', 'C': 'GOOD', 'D': 'AWESOME'}
>>> myDict.setdefault('E')
>>>
Get()
>>> myDict = {'a': 1, 'b': 2, 'c': 3} #(1)
>>> myDict.get('a',0) #(2)
1
>>> myDict.get('d',0) #(3)
0
>>> myDict #(4)
{'a': 1, 'b': 2, 'c': 3}
Here is my conclusion: neither one is strictly better when it comes to supplying default values. The only difference is that setdefault() automatically adds any new key with a default value to the dictionary while get() does not. For more information, please go here!
In [1]: person_dict = {}
In [2]: person_dict['liqi'] = 'LiQi'
In [3]: person_dict.setdefault('liqi', 'Liqi')
Out[3]: 'LiQi'
In [4]: person_dict.setdefault('Kim', 'kim')
Out[4]: 'kim'
In [5]: person_dict
Out[5]: {'Kim': 'kim', 'liqi': 'LiQi'}
In [8]: person_dict.get('Dim', '')
Out[8]: ''
In [5]: person_dict
Out[5]: {'Kim': 'kim', 'liqi': 'LiQi'}