Python defaultdict and lambda - python

In someone else's code I read the following two lines:
x = defaultdict(lambda: 0)
y = defaultdict(lambda: defaultdict(lambda: 0))
As the argument of defaultdict is a default factory, I think the first line means that when I call x[k] for a nonexistent key k (such as a statement like v=x[k]), the key-value pair (k,0) will be automatically added to the dictionary, as if the statement x[k]=0 is first executed. Am I correct?
And what about y? It seems that the default factory will create a defaultdict with default 0. But what does that mean concretely? I tried to play around with it in Python shell, but couldn't figure out what it is exactly.

I think the first line means that when I call x[k] for a nonexistent key k (such as a statement like v=x[k]), the key-value pair (k,0) will be automatically added to the dictionary, as if the statement x[k]=0 is first executed.
That's right. This is more idiomatically written
x = defaultdict(int)
In the case of y, when you do y["ham"]["spam"], the key "ham" is inserted in y if it does not exist. The value associated with it becomes a defaultdict in which "spam" is automatically inserted with a value of 0.
I.e., y is a kind of "two-tiered" defaultdict. If "ham" not in y, then evaluating y["ham"]["spam"] is like doing
y["ham"] = {}
y["ham"]["spam"] = 0
in terms of ordinary dict.

You are correct for what the first one does. As for y, it will create a defaultdict with default 0 when a key doesn't exist in y, so you can think of this as a nested dictionary. Consider the following example:
y = defaultdict(lambda: defaultdict(lambda: 0))
print y['k1']['k2'] # 0
print dict(y['k1']) # {'k2': 0}
To create an equivalent nested dictionary structure without defaultdict you would need to create an inner dict for y['k1'] and then set y['k1']['k2'] to 0, but defaultdict does all of this behind the scenes when it encounters keys it hasn't seen:
y = {}
y['k1'] = {}
y['k1']['k2'] = 0
The following function may help for playing around with this on an interpreter to better your understanding:
def to_dict(d):
if isinstance(d, defaultdict):
return dict((k, to_dict(v)) for k, v in d.items())
return d
This will return the dict equivalent of a nested defaultdict, which is a lot easier to read, for example:
>>> y = defaultdict(lambda: defaultdict(lambda: 0))
>>> y['a']['b'] = 5
>>> y
defaultdict(<function <lambda> at 0xb7ea93e4>, {'a': defaultdict(<function <lambda> at 0xb7ea9374>, {'b': 5})})
>>> to_dict(y)
{'a': {'b': 5}}

defaultdict takes a zero-argument callable to its constructor, which is called when the key is not found, as you correctly explained.
lambda: 0 will of course always return zero, but the preferred method to do that is defaultdict(int), which will do the same thing.
As for the second part, the author would like to create a new defaultdict(int), or a nested dictionary, whenever a key is not found in the top-level dictionary.

All answers are good enough still I am giving the answer to add more info:
"defaultdict requires an argument that is callable. That return result of that callable object is the default value that the dictionary returns when you try to access the dictionary with a key that does not exist."
Here's an example
SAMPLE= {'Age':28, 'Salary':2000}
SAMPLE = defaultdict(lambda:0,SAMPLE)
>>> SAMPLE
defaultdict(<function <lambda> at 0x0000000002BF7C88>, {'Salary': 2000, 'Age': 28})
>>> SAMPLE['Age']----> This will return 28
>>> SAMPLE['Phone']----> This will return 0 # you got 0 as output for a non existing key inside SAMPLE

y = defaultdict(lambda:defaultdict(lambda:0))
will be helpful if you try this y['a']['b'] += 1

Related

Merging values from 2 dictionaries (Python)

(I'm new to Python!)
Trying to figure out this homework question:
The function will takes a​s input​ two dictionaries, each mapping strings to integers. The function will r​eturn​ a dictionary that maps strings from the two input dictionaries to the sum of the integers in the two input dictionaries.
my idea was this:
def ​add(​dicA,dicB):
dicA = {}
dicB = {}
newdictionary = dicA.update(dicB)
however, that brings back None.
In the professor's example:
print(add({'alice':10, 'Bob':3, 'Carlie':1}, {'alice':5, 'Bob':100, 'Carlie':1}))
the output is:
{'alice':15, 'Bob':103, 'Carlie':2}
My issue really is that I don't understand how to add up the values from each dictionaries. I know that the '+' is not supported with dictionaries. I'm not looking for anyone to do my homework for me, but any suggestions would be very much appreciated!
From the documentation:
update([other])
Update the dictionary with the key/value pairs from other, overwriting existing keys. Return None.
You don't want to replace key/value pairs, you want to add the values for similar keys. Go through each dictionary and add each value to the relevant key:
def ​add(​dicA,dicB):
result = {}
for d in dicA, dicB:
for key in d:
result[key] = result.get(key, 0) + d[key]
return result
result.get(key, 0) will retrieve the value of an existing key or produce 0 if key is not yet present.
First of all, a.update(b) updates a in place, and returns None.
Secondly, a.update(b) wouldn't help you to sum the keys; it would just produce a dictionary with the resulting dictionary having all the key, value pairs from b:
>>> a = {'alice':10, 'Bob':3, 'Carlie':1}
>>> b = {'alice':5, 'Bob':100, 'Carlie':1}
>>> a.update(b)
>>> a
{'alice': 5, 'Carlie': 1, 'Bob': 100}
It'd be easiest to use collections.Counter to achieve the desired result. As a plus, it does support addition with +:
from collections import Counter
def add(dicA, dicB):
return dict(Counter(dicA) + Counter(dicB))
This produces the intended result:
>>> print(add({'alice':10, 'Bob':3, 'Carlie':1}, {'alice':5, 'Bob':100, 'Carlie':1}))
{'alice': 15, 'Carlie': 2, 'Bob': 103}
The following is not meant to be the most elegant solution, but to get a feeling on how to deal with dicts.
dictA = {'Alice':10, 'Bob':3, 'Carlie':1}
dictB = {'Alice':5, 'Bob':100, 'Carlie':1}
# how to iterate through a dictionary
for k,v in dictA.iteritems():
print k,v
# make a new dict to keep tally
newdict={}
for d in [dictA,dictB]: # go through a list that has your dictionaries
print d
for k,v in d.iteritems(): # go through each dictionary item
if not k in newdict.keys():
newdict[k]=v
else:
newdict[k]+=v
print newdict
Output:
Bob 3
Alice 10
Carlie 1
{'Bob': 3, 'Alice': 10, 'Carlie': 1}
{'Bob': 100, 'Alice': 5, 'Carlie': 1}
{'Bob': 103, 'Alice': 15, 'Carlie': 2}
def ​add(​dicA,dicB):
You define a function that takes two arguments, dicA and dicB.
dicA = {}
dicB = {}
Then you assign an empty dictionary to both those variables, overwriting the dictionaries you passed to the function.
newdictionary = dicA.update(dicB)
Then you update dicA with the values from dicB, and assign the result to newdictionary. dict.update always returns None though.
And finally, you don’t return anything from the function, so it does not give you any results.
In order to combine those dictionaries, you actually need to use the values that were passed to it. Since dict.update mutates the dictionary it is called on, this would change one of those passed dictionaries, which we generally do not want to do. So instead, we use an empty dictionary, and then copy the values from both dictionaries into it:
def add (dicA, dicB):
newDictionary = {}
newDictionary.update(dicA)
newDictionary.update(dicB)
return newDictionary
If you want the values to sum up automatically, then use a Counter instead of a normal dictionary:
from collections import Counter
def add (dicA, dicB):
newDictionary = Counter()
newDictionary.update(dicA)
newDictionary.update(dicB)
return newDictionary
I suspect your professor wants to achieve this using more simple methods. But you can achieve this very easily using collections.Counter.
from collections import Counter
def add(a, b):
return dict(Counter(a) + Counter(b))
Your professor probably wants something like this:
def add(a, b):
new_dict = copy of a
for each key/value pair in b
if key in new_dict
add value to value already present in new_dict
else
insert key/value pair into new_dict
return new_dict
You can try this:
def add(dict1, dict2):
return dict([(key,dict1[key]+dict2[key]) for key in dict1.keys()])
I personally like using a dictionary's get method for this kind of merge:
def add(a, b):
result = {}
for dictionary in (a, b):
for key, value in dictionary.items():
result[key] = result.get(key, 0) + value
return result

Python: Why does the dict.fromkeys method not produce a working dicitonary

I had the following dictionary:
ref_range = range(0,100)
aas = list("ACDEFGHIKLMNPQRSTVWXY*")
new_dict = {}
new_dict = new_dict.fromkeys(ref_range,{k:0 for k in aas})
Then I added a 1 to a specific key
new_dict[30]['G'] += 1
>>>new_dict[30]['G']
1
but
>>>new_dict[31]['G']
1
What is going on here? I only incremented the nested key 30, 'G' by one.
Note: If I generate the dictionary this way:
new_dict = {}
for i in ref_range:
new_dict[i] = {a:0 for a in aas}
Everything behaves fine. I think this is a similar question here, but I wanted to know a bit about why this happening rather than how to solve it.
fromkeys(S, v) sets all of the keys in S to the same value v. Meaning that all of the keys in your dictionary new_dict refer to the same dictionary object, not to their own copies of that dictionary.
To set each to a different dict object you cannot use fromkeys. You need to just set each key to a new dict in a loop.
Besides what you have you could also do
{i: {a: 0 for a in aas} for i in ref_range}

Dividing dictionary into nested dictionaries, based on the key's name on Python 3.4

I have the following dictionary (short version, real data is much larger):
dict = {'C-STD-B&M-SUM:-1': 0, 'C-STD-B&M-SUM:-10': 4.520475, 'H-NSW-BAC-ART:-9': 0.33784000000000003, 'H-NSW-BAC-ART:0': 0, 'H-NSW-BAC-ENG:-59': 0.020309999999999998, 'H-NSW-BAC-ENG:-6': 0,}
I want to divide it into smaller nested dictionaries, depending on a part of the key name.
Expected output would be:
# fixed closing brackets
dict1 = {'C-STD-B&M-SUM: {'-1': 0, '-10': 4.520475}}
dict2 = {'H-NSW-BAC-ART: {'-9': 0.33784000000000003, '0': 0}}
dict3 = {'H-NSW-BAC-ENG: {'-59': 0.020309999999999998, '-6': 0}}
Logic behind is:
dict1: if the part of the key name is 'C-STD-B&M-SUM', add to dict1.
dict2: if the part of the key name is 'H-NSW-BAC-ART', add to dict2.
dict3: if the part of the key name is 'H-NSW-BAC-ENG', add to dict3.
Partial code so far:
def divide_dictionaries(dict):
c_std_bem_sum = {}
for k, v in dict.items():
if k[0:13] == 'C-STD-B&M-SUM':
c_std_bem_sum = k[14:17], v
What I'm trying to do is to create the nested dictionaries that I need and then I'll create the dictionary and add the nested one to it, but I'm not sure if it's a good way to do it.
When I run the code above, the variable c_std_bem_sum becomes a tuple, with only two values that are changed at each iteration. How can I make it be a dictionary, so I can later create another dictionary, and use this one as the value for one of the keys?
One way to approach it would be to do something like
d = {'C-STD-B&M-SUM:-1': 0, 'C-STD-B&M-SUM:-10': 4.520475, 'H-NSW-BAC-ART:-9': 0.33784000000000003, 'H-NSW-BAC-ART:0': 0, 'H-NSW-BAC-ENG:-59': 0.020309999999999998, 'H-NSW-BAC-ENG:-6': 0,}
def divide_dictionaries(somedict):
out = {}
for k,v in somedict.items():
head, tail = k.split(":")
subdict = out.setdefault(head, {})
subdict[tail] = v
return out
which gives
>>> dnew = divide_dictionaries(d)
>>> import pprint
>>> pprint.pprint(dnew)
{'C-STD-B&M-SUM': {'-1': 0, '-10': 4.520475},
'H-NSW-BAC-ART': {'-9': 0.33784000000000003, '0': 0},
'H-NSW-BAC-ENG': {'-59': 0.020309999999999998, '-6': 0}}
A few notes:
(1) We're using nested dictionaries instead of creating separate named dictionaries, which aren't convenient.
(2) We used setdefault, which is a handy way to say "give me the value in the dictionary, but if there isn't one, add this to the dictionary and return it instead.". Saves an if.
(3) We can use .split(":") instead of hardcoding the width, which isn't very robust -- at least assuming that's the delimiter, anyway!
(4) It's a bad idea to use dict, the name of a builtin type, as a variable name.
That's because you're setting your dictionary and overriding it with a tuple:
>>> a = 1, 2
>>> print a
>>> (1,2)
Now for your example:
>>> def divide_dictionaries(dict):
>>> c_std_bem_sum = {}
>>> for k, v in dict.items():
>>> if k[0:13] == 'C-STD-B&M-SUM':
>>> new_key = k[14:17] # sure you don't want [14:], open ended?
>>> c_std_bem_sum[new_key] = v
Basically, this grabs the rest of the key (or 3 characters, as you have it, the [14:None] or [14:] would get the rest of the string) and then uses that as the new key for the dict.

Python get remaining runoff voting

I am a little stuck on writing a function for a project. This function takes a dictionary of candidates who's values are the number of votes they received. I then have to return a set containing the remaining_candidates. In other words the candidate with the least amount of votes should not be in the set being returned and if for example all of the candidates have the same votes, the set should be empty. I am having trouble getting started here.
For example I know I can sort the dictionary like so:
x = min(canadites, key=canadites.__getitem__)
but that will not work if the candidates have the same value, as it just pops up the last one in the dict.
Any ideas?
Update: To make things clear.
Lets say I have the following dictionary:
canadites = {'X':22,'Y':1, 'Z':0}
Ideally the function should return a set containing only X and Y. But if Y and Z where both 1
x = min(canadites, key=canadites.__getitem__)
seems to only return Z
It's cleaner to create a new dict instead of popping items from the old one:
>>> d = {'a':1, 'b':2, 'c':1, 'd':3}
>>> min_val = min(d.values())
>>> {k:v for k,v in d.items() if v > min_val}
{'b': 2, 'd': 3}
In python2, itervalues and iteritems would be more efficient, although this is a micro-optimization in most cases.

Destructuring-bind dictionary contents

I am trying to 'destructure' a dictionary and associate values with variables names after its keys. Something like
params = {'a':1,'b':2}
a,b = params.values()
But since dictionaries are not ordered, there is no guarantee that params.values() will return values in the order of (a, b). Is there a nice way to do this?
from operator import itemgetter
params = {'a': 1, 'b': 2}
a, b = itemgetter('a', 'b')(params)
Instead of elaborate lambda functions or dictionary comprehension, may as well use a built in library.
One way to do this with less repetition than Jochen's suggestion is with a helper function. This gives the flexibility to list your variable names in any order and only destructure a subset of what is in the dict:
pluck = lambda dict, *args: (dict[arg] for arg in args)
things = {'blah': 'bleh', 'foo': 'bar'}
foo, blah = pluck(things, 'foo', 'blah')
Also, instead of joaquin's OrderedDict you could sort the keys and get the values. The only catches are you need to specify your variable names in alphabetical order and destructure everything in the dict:
sorted_vals = lambda dict: (t[1] for t in sorted(dict.items()))
things = {'foo': 'bar', 'blah': 'bleh'}
blah, foo = sorted_vals(things)
How come nobody posted the simplest approach?
params = {'a':1,'b':2}
a, b = params['a'], params['b']
Python is only able to "destructure" sequences, not dictionaries. So, to write what you want, you will have to map the needed entries to a proper sequence. As of myself, the closest match I could find is the (not very sexy):
a,b = [d[k] for k in ('a','b')]
This works with generators too:
a,b = (d[k] for k in ('a','b'))
Here is a full example:
>>> d = dict(a=1,b=2,c=3)
>>> d
{'a': 1, 'c': 3, 'b': 2}
>>> a, b = [d[k] for k in ('a','b')]
>>> a
1
>>> b
2
>>> a, b = (d[k] for k in ('a','b'))
>>> a
1
>>> b
2
Here's another way to do it similarly to how a destructuring assignment works in JS:
params = {'b': 2, 'a': 1}
a, b, rest = (lambda a, b, **rest: (a, b, rest))(**params)
What we did was to unpack the params dictionary into key values (using **) (like in Jochen's answer), then we've taken those values in the lambda signature and assigned them according to the key name - and here's a bonus - we also get a dictionary of whatever is not in the lambda's signature so if you had:
params = {'b': 2, 'a': 1, 'c': 3}
a, b, rest = (lambda a, b, **rest: (a, b, rest))(**params)
After the lambda has been applied, the rest variable will now contain:
{'c': 3}
Useful for omitting unneeded keys from a dictionary.
Hope this helps.
Maybe you really want to do something like this?
def some_func(a, b):
print a,b
params = {'a':1,'b':2}
some_func(**params) # equiv to some_func(a=1, b=2)
If you are afraid of the issues involved in the use of the locals dictionary and you prefer to follow your original strategy, Ordered Dictionaries from python 2.7 and 3.1 collections.OrderedDicts allows you to recover you dictionary items in the order in which they were first inserted
(Ab)using the import system
The from ... import statement lets us desctructure and bind attribute names of an object. Of course, it only works for objects in the sys.modules dictionary, so one could use a hack like this:
import sys, types
mydict = {'a':1,'b':2}
sys.modules["mydict"] = types.SimpleNamespace(**mydict)
from mydict import a, b
A somewhat more serious hack would be to write a context manager to load and unload the module:
with obj_as_module(mydict, "mydict_module"):
from mydict_module import a, b
By pointing the __getattr__ method of the module directly to the __getitem__ method of the dict, the context manager can also avoid using SimpleNamespace(**mydict).
See this answer for an implementation and some extensions of the idea.
One can also temporarily replace the entire sys.modules dict with the dict of interest, and do import a, b without from.
Warning 1: as stated in the docs, this is not guaranteed to work on all Python implementations:
CPython implementation detail: This function relies on Python stack frame support
in the interpreter, which isn’t guaranteed to exist in all implementations
of Python. If running in an implementation without Python stack frame support
this function returns None.
Warning 2: this function does make the code shorter, but it probably contradicts the Python philosophy of being as explicit as you can. Moreover, it doesn't address the issues pointed out by John Christopher Jones in the comments, although you could make a similar function that works with attributes instead of keys. This is just a demonstration that you can do that if you really want to!
def destructure(dict_):
if not isinstance(dict_, dict):
raise TypeError(f"{dict_} is not a dict")
# the parent frame will contain the information about
# the current line
parent_frame = inspect.currentframe().f_back
# so we extract that line (by default the code context
# only contains the current line)
(line,) = inspect.getframeinfo(parent_frame).code_context
# "hello, key = destructure(my_dict)"
# -> ("hello, key ", "=", " destructure(my_dict)")
lvalues, _equals, _rvalue = line.strip().partition("=")
# -> ["hello", "key"]
keys = [s.strip() for s in lvalues.split(",") if s.strip()]
if missing := [key for key in keys if key not in dict_]:
raise KeyError(*missing)
for key in keys:
yield dict_[key]
In [5]: my_dict = {"hello": "world", "123": "456", "key": "value"}
In [6]: hello, key = destructure(my_dict)
In [7]: hello
Out[7]: 'world'
In [8]: key
Out[8]: 'value'
This solution allows you to pick some of the keys, not all, like in JavaScript. It's also safe for user-provided dictionaries
With Python 3.10, you can do:
d = {"a": 1, "b": 2}
match d:
case {"a": a, "b": b}:
print(f"A is {a} and b is {b}")
but it adds two extra levels of indentation, and you still have to repeat the key names.
Look for other answers as this won't cater to the unexpected order in the dictionary. will update this with a correct version sometime soon.
try this
data = {'a':'Apple', 'b':'Banana','c':'Carrot'}
keys = data.keys()
a,b,c = [data[k] for k in keys]
result:
a == 'Apple'
b == 'Banana'
c == 'Carrot'
Well, if you want these in a class you can always do this:
class AttributeDict(dict):
def __init__(self, *args, **kwargs):
super(AttributeDict, self).__init__(*args, **kwargs)
self.__dict__.update(self)
d = AttributeDict(a=1, b=2)
Based on #ShawnFumo answer I came up with this:
def destruct(dict): return (t[1] for t in sorted(dict.items()))
d = {'b': 'Banana', 'c': 'Carrot', 'a': 'Apple' }
a, b, c = destruct(d)
(Notice the order of items in dict)
An old topic, but I found this to be a useful method:
data = {'a':'Apple', 'b':'Banana','c':'Carrot'}
for key in data.keys():
locals()[key] = data[key]
This method loops over every key in your dictionary and sets a variable to that name and then assigns the value from the associated key to this new variable.
Testing:
print(a)
print(b)
print(c)
Output
Apple
Banana
Carrot
An easy and simple way to destruct dict in python:
params = {"a": 1, "b": 2}
a, b = [params[key] for key in ("a", "b")]
print(a, b)
# Output:
# 1 2
I don't know whether it's good style, but
locals().update(params)
will do the trick. You then have a, b and whatever was in your params dict available as corresponding local variables.
Since dictionaries are guaranteed to keep their insertion order in Python >= 3.7, that means that it's complete safe and idiomatic to just do this nowadays:
params = {'a': 1, 'b': 2}
a, b = params.values()
print(a)
print(b)
Output:
1
2

Categories

Resources