Today, I came across the dict method get which, given a key in the dictionary, returns the associated value.
For what purpose is this function useful? If I wanted to find a value associated with a key in a dictionary, I can just do dict[key], and it returns the same thing:
dictionary = {"Name": "Harry", "Age": 17}
dictionary["Name"]
dictionary.get("Name")
It allows you to provide a default value if the key is missing:
dictionary.get("bogus", default_value)
returns default_value (whatever you choose it to be), whereas
dictionary["bogus"]
would raise a KeyError.
If omitted, default_value is None, such that
dictionary.get("bogus") # <-- No default specified -- defaults to None
returns None just like
dictionary.get("bogus", None)
would.
What is the dict.get() method?
As already mentioned, the get method takes an additional parameter that specifies the value to return when the key is missing. From the documentation:
get(key[, default])
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
An example can be
>>> d = {1:2,2:3}
>>> d[1]
2
>>> d.get(1)
2
>>> d.get(3)
>>> repr(d.get(3))
'None'
>>> d.get(3,1)
1
Are there speed improvements anywhere?
As mentioned here,
It seems that all three approaches now exhibit similar performance (within about 10% of each other), more or less independent of the properties of the list of words.
Earlier, get was considerably slower. Now the speed is almost comparable, and get has the additional advantage of returning a default value. To settle the question, we can run a small test (note that the test only looks up keys that are present):
def getway(d):
    for i in range(100):
        s = d.get(i)

def lookup(d):
    for i in range(100):
        s = d[i]
Now timing these two functions using timeit
>>> import timeit
>>> print(timeit.timeit("getway({i:i for i in range(100)})","from __main__ import getway"))
20.2124660015
>>> print(timeit.timeit("lookup({i:i for i in range(100)})","from __main__ import lookup"))
16.16223979
As we can see, the bracket lookup is faster than get because it involves no attribute lookup and no function call. This can be seen through dis:
>>> import dis
>>> def lookup(d,val):
...     return d[val]
...
>>> def getway(d,val):
...     return d.get(val)
...
>>> dis.dis(getway)
2 0 LOAD_FAST 0 (d)
3 LOAD_ATTR 0 (get)
6 LOAD_FAST 1 (val)
9 CALL_FUNCTION 1
12 RETURN_VALUE
>>> dis.dis(lookup)
2 0 LOAD_FAST 0 (d)
3 LOAD_FAST 1 (val)
6 BINARY_SUBSCR
7 RETURN_VALUE
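The timings above also rebuild the dictionary on every run, so part of what they measure is dict construction. A minimal sketch that moves the construction into timeit's setup argument, so that only the lookups are timed, might look like this. The loop count and the variable names are assumptions for illustration, and the numbers will vary by machine and Python version:

```python
import timeit

# Build the dictionary once in setup so only the lookups are timed.
setup = "d = {i: i for i in range(100)}"

# Each statement performs 100 lookups of keys that are all present.
get_time = timeit.timeit("for i in range(100): d.get(i)", setup=setup, number=10_000)
index_time = timeit.timeit("for i in range(100): d[i]", setup=setup, number=10_000)

print(f"d.get(i): {get_time:.4f} s")
print(f"d[i]:     {index_time:.4f} s")
```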
Where will it be useful?
It is useful whenever you want to provide a default value when looking up a key in a dictionary. This reduces
if key in dic:
    val = dic[key]
else:
    val = def_val
to a single line, val = dic.get(key,def_val)
Where will it be NOT useful?
Whenever you want a KeyError to be raised to signal that the particular key is not available. Returning a default value also carries the risk that the chosen default may be a legitimate stored value too, so the caller cannot tell the two cases apart!
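One common way around that ambiguity is to pass a unique sentinel object as the default. The sketch below is only an illustration; the names _MISSING, settings and describe are hypothetical and not part of any library:

```python
# Hypothetical sentinel: a unique object that can never be a stored value.
_MISSING = object()

settings = {"timeout": 0, "retries": None}

def describe(key):
    # get() returns the sentinel only when the key is truly absent.
    value = settings.get(key, _MISSING)
    if value is _MISSING:
        return f"{key!r} is not set"
    return f"{key!r} is set to {value!r}"

print(describe("timeout"))  # 'timeout' is set to 0
print(describe("retries"))  # 'retries' is set to None
print(describe("proxy"))    # 'proxy' is not set
```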
Is it possible to have a get-like feature in dict['key']?
Yes! We need to implement the __missing__ method in a dict subclass.
A sample program can be
class MyDict(dict):
    def __missing__(self, key):
        return None
A small demonstration can be
>>> my_d = MyDict({1:2,2:3})
>>> my_d[1]
2
>>> my_d[3]
>>> repr(my_d[3])
'None'
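One detail worth knowing: __missing__ is only consulted by bracket lookups, not by get. The sketch below uses a variant of the class above that returns the string "fallback" (an assumption, chosen so that the difference is visible in the output):

```python
class MyDict(dict):
    def __missing__(self, key):
        # Called by d[key] when the key is absent; nothing is stored.
        return "fallback"

my_d = MyDict({1: 2, 2: 3})

print(my_d[3])      # fallback  (d[key] consults __missing__)
print(my_d.get(3))  # None      (get() does not)
print(3 in my_d)    # False     (the key was never inserted)
```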
get takes a second optional value. If the specified key does not exist in your dictionary, then this value will be returned.
dictionary = {"Name": "Harry", "Age": 17}
dictionary.get('Year', 'No available data')
>> 'No available data'
If you do not give the second parameter, None will be returned.
If you use indexing as in dictionary['Year'], nonexistent keys will raise KeyError.
A gotcha to be aware of when using .get():
If the dictionary contains the key used in the call to .get() and its value is None, the .get() method will return None even if a default value is supplied.
For example, the following returns None, not 'alt_value' as may be expected:
d = {'key': None}
assert None is d.get('key', 'alt_value')
.get()'s second value is only returned if the key supplied is NOT in the dictionary, not if the return value of that call is None.
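If a stored None should be treated the same way as a missing key, the replacement has to be done explicitly. A minimal sketch, reusing the names from the example above:

```python
d = {"key": None}

# Explicit check: replaces a stored None as well as a missing key.
value = d.get("key")
if value is None:
    value = "alt_value"
print(value)  # alt_value

# Shorter, but `or` also replaces other falsy values such as 0 or "".
print(d.get("key") or "alt_value")           # alt_value
print({"key": 0}.get("key") or "alt_value")  # alt_value, although 0 was stored
```

The explicit `is None` check is usually the safer of the two when 0, an empty string, or an empty list are meaningful values.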
Here is a practical example from scraping web data with Python. You will often encounter records in which a key is missing; in those cases dictionary['key'] raises an error, whereas dictionary.get('key', 'return_otherwise') does not.
Similarly, I would use ''.join(list) rather than list[0] to capture a single value from a list, since it does not fail when the list is empty.
hope it helps.
[Edit] Here is a practical example:
Say you are calling an API that returns a JSON file you need to parse. The first JSON looks like the following:
{"bids":{"id":16210506,"submitdate":"2011-10-16 15:53:25","submitdate_f":"10\/16\/2011 at 21:53 CEST","submitdate_f2":"p\u0159ed 2 lety","submitdate_ts":1318794805,"users_id":"2674360","project_id":"1250499"}}
The second JSON is like this:
{"bids":{"id":16210506,"submitdate":"2011-10-16 15:53:25","submitdate_f":"10\/16\/2011 at 21:53 CEST","submitdate_f2":"p\u0159ed 2 lety","users_id":"2674360","project_id":"1250499"}}
Note that the second JSON is missing the "submitdate_ts" key, which is pretty normal in any data structure.
So when you try to access the value of that key in a loop, can you call it with the following:
for item in API_call:
    submitdate_ts = item["bids"]["submitdate_ts"]
You could, but it will give you a traceback error for the second JSON line, because the key simply doesn't exist.
The appropriate way of coding this could be the following:
for item in API_call:
    submitdate_ts = item.get("bids", {'x': None}).get("submitdate_ts")
{'x': None} is there so that the second-level get does not fail when "bids" is missing (an empty dict {} would work equally well). Of course, you can build more fault tolerance into the code if you are scraping, for example by first checking with an if condition.
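When lookups go more than two levels deep, chaining get calls becomes hard to read. A small helper can walk the keys instead. This is only a sketch: get_nested and the shortened sample records are hypothetical, not part of the API described above:

```python
def get_nested(mapping, *keys, default=None):
    """Follow keys through nested dicts; return default if any step is missing."""
    current = mapping
    for key in keys:
        # Stop as soon as a level is not a dict or lacks the key.
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

# Shortened stand-ins for the JSON records shown above.
api_call = [
    {"bids": {"id": 16210506, "submitdate_ts": 1318794805}},
    {"bids": {"id": 16210506}},
    {},
]

timestamps = [get_nested(item, "bids", "submitdate_ts") for item in api_call]
print(timestamps)  # [1318794805, None, None]
```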
The purpose is that you can give a default value if the key is not found, which is very useful
dictionary.get("Name",'harry')
For what purpose is this function useful?
One particular usage is counting with a dictionary. Let's assume you want to count the number of occurrences of each element in a given list. The common way to do so is to make a dictionary where keys are elements and values are the number of occurrences.
fruits = ['apple', 'banana', 'peach', 'apple', 'pear']
d = {}
for fruit in fruits:
    if fruit not in d:
        d[fruit] = 0
    d[fruit] += 1
Using the .get() method, you can make this code more compact and clear:
for fruit in fruits:
    d[fruit] = d.get(fruit, 0) + 1
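For counting specifically, the standard library's collections module offers two alternatives to the get idiom. A brief sketch using the same fruits list:

```python
from collections import Counter, defaultdict

fruits = ['apple', 'banana', 'peach', 'apple', 'pear']

# Counter counts hashable items directly.
counts = Counter(fruits)
print(counts['apple'])  # 2
print(counts['kiwi'])   # 0, a Counter returns 0 for missing keys

# defaultdict(int) creates a 0 entry the first time a key is touched.
d = defaultdict(int)
for fruit in fruits:
    d[fruit] += 1
print(dict(d))  # {'apple': 2, 'banana': 1, 'peach': 1, 'pear': 1}
```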
Other answers have clearly explained the difference between dict bracket keying and .get and mentioned a fairly innocuous pitfall when None or the default value is also a valid stored value.
Given this information, it may be tempting to conclude that .get is somehow safer and better than bracket indexing and should always be used instead of bracket lookups, as argued in Stop Using Square Bracket Notation to Get a Dictionary's Value in Python, even in the common case when the lookup is expected to succeed (i.e. never raise a KeyError).
The author of the blog post argues that .get "safeguards your code":
Notice how trying to reference a term that doesn't exist causes a KeyError. This can cause major headaches, especially when dealing with unpredictable business data.
While we could wrap our statement in a try/except or if statement, this much care for a dictionary term will quickly pile up.
It's true that in the uncommon case for null (None)-coalescing or otherwise filling in a missing value to handle unpredictable dynamic data, a judiciously-deployed .get is a useful and Pythonic shorthand tool for ungainly if key in dct: and try/except blocks that only exist to set default values when the key might be missing as part of the behavioral specification for the program.
However, replacing all bracket dict lookups, including those that you assert must succeed, with .get is a different matter. This practice effectively downgrades a class of runtime errors that help reveal bugs into silent illegal state scenarios that tend to be harder to identify and debug.
A common mistake among programmers is to think exceptions cause headaches and attempt to suppress them, using techniques like wrapping code in try ... except: pass blocks. They later realize the real headache is never seeing the breach of application logic at the point of failure and deploying a broken application. Better programming practice is to embrace assertions for all program invariants such as keys that must be in a dictionary.
The hierarchy of error safety is, broadly:
| Error category | Relative ease of debugging |
| --- | --- |
| Compile-time error | Easy; go to the line and fix the problem |
| Runtime exception | Medium; control needs to flow to the error and it may be due to unanticipated edge cases or hard-to-reproduce state like a race condition between threads, but at least we get a clear error message and stack trace when it does happen. |
| Silent logical error | Difficult; we may not even know it exists, and when we do, tracking down state that caused it can be very challenging due to lack of locality and potential for multiple assertion breaches. |
When programming language designers talk about program safety, a major goal is to surface, not suppress, genuine errors by promoting runtime errors to compile-time errors and promoting silent logical errors to either runtime exceptions or (ideally) compile-time errors.
Python, by design as an interpreted language, relies heavily on runtime exceptions instead of compiler errors. Missing methods or properties, illegal type operations like 1 + "a" and out of bounds or missing indices or keys raise by default.
Some languages like JS, Java, Rust and Go use the fallback behavior for their maps by default (and in many cases, don't provide a throw/raise alternative), but Python throws by default, along with other languages like C#. Perl/PHP issue an uninitialized value warning.
Indiscriminate application of .get to all dict accesses, even those that aren't expected to fail and have no fallback for dealing with None (or whatever default is used) running amok through the code, pretty much tosses away Python's runtime exception safety net for this class of errors, silencing or adding indirection to potential bugs.
Other supporting reasons to prefer bracket lookups (with the occasional, well-placed .get where a default is expected):
Prefer writing standard, idiomatic code using the tools provided by the language. Python programmers usually (correctly) prefer brackets for the exception safety reasons given above and because it's the default behavior for Python dicts.
Always using .get forfeits intent by making cases when you expect to provide a default None value indistinguishable from a lookup you assert must succeed.
Testing increases in complexity in proportion to the new "legal" program paths permitted by .get. Effectively, each lookup is now a branch that can succeed or fail -- both cases must be tested to establish coverage, even if the default path is effectively unreachable by specification (ironically leading to additional if val is not None: or try for all future uses of the retrieved value; unnecessary and confusing for something that should never be None in the first place).
.get is a bit slower.
.get is harder to type and uglier to read (compare Java's tacked-on-feel ArrayList syntax to native-feel C# Lists or C++ vector code). Minor.
Some languages like C++ and Ruby offer alternate methods (at and fetch, respectively) to opt-in to throwing an error on a bad access, while C# offers opt-in fallback value TryGetValue similar to Python's get.
Since JS, Java, Ruby, Go and Rust bake the fallback approach of .get into all hash lookups by default, it can't be that bad, one might think. It's true that this isn't the largest issue facing language designers and there are plenty of use cases for the no-throw access version, so it's unsurprising that there's no consensus across languages.
But as I've argued, Python (along with C#) has done better than these languages by making the assert option the default. It's a loss of safety and expressivity to opt-out of using it to report contract violations at the point of failure by indiscriminately using .get across the board.
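To make the argument concrete, here is a small sketch of how a defaulted lookup can hide a bug that a bracket lookup would report. The price table, the function names and the misspelled key are all made up for illustration:

```python
PRICES = {"apple": 0.5, "banana": 0.25}

def total_with_get(order):
    # A misspelled product silently contributes 0 to the total.
    return sum(PRICES.get(item, 0) * qty for item, qty in order.items())

def total_with_brackets(order):
    # A misspelled product raises KeyError at the point of failure.
    return sum(PRICES[item] * qty for item, qty in order.items())

order = {"apple": 2, "bananna": 4}  # typo in the second key

print(total_with_get(order))  # 1.0, which is wrong but looks plausible

try:
    total_with_brackets(order)
except KeyError as exc:
    print(f"unknown product: {exc}")  # unknown product: 'bananna'
```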
Why dict.get(key) instead of dict[key]?
0. Summary
Compared to dict[key], dict.get provides a fallback value when looking up a key.
1. Definition
get(key[, default]) (from "4. Built-in Types", Python 3.6.4rc1 documentation)
Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
d = {"Name": "Harry", "Age": 17}
In [4]: d['gender']
KeyError: 'gender'
In [5]: d.get('gender', 'Not specified, please add it')
Out[5]: 'Not specified, please add it'
2. Problem it solves.
Without a default value, you have to write cumbersome code to handle such an exception.
def get_harry_info(key):
    try:
        return "{}".format(d[key])
    except KeyError:
        return 'Not specified, please add it'
In [9]: get_harry_info('Name')
Out[9]: 'Harry'
In [10]: get_harry_info('Gender')
Out[10]: 'Not specified, please add it'
As a convenient solution, dict.get introduces an optional default value, avoiding the unwieldy code above.
3. Conclusion
dict.get has an additional default value option to deal with the case where the key is absent from the dictionary.
One difference, which can be an advantage, is that looking up a key that doesn't exist with get returns None, whereas bracket notation raises an error:
print(dictionary.get("address")) # None
print(dictionary["address"]) # throws KeyError: 'address'
The last cool thing about the get method is that it accepts an additional optional argument for a default value. For example, if we try to get the score of a student who doesn't have a score key, we can get 0 instead.
So instead of doing this (or something similar):
score = None
try:
    score = dictionary["score"]
except KeyError:
    score = 0
We can do this:
score = dictionary.get("score", 0)
# score = 0
One other use-case that I do not see mentioned is as the key argument for functions like sorted, max and min. The get method allows for keys to be returned based on their values.
>>> ages = {"Harry": 17, "Lucy": 16, "Charlie": 18}
>>> print(sorted(ages, key=ages.get))
['Lucy', 'Harry', 'Charlie']
>>> print(max(ages, key=ages.get))
Charlie
>>> print(min(ages, key=ages.get))
Lucy
Thanks to this answer to a different question for providing this use-case!
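Passing ages.get as the key function is equivalent to writing the lookup out by hand. A short sketch comparing three spellings of the same descending sort (the variable names are only for illustration):

```python
ages = {"Harry": 17, "Lucy": 16, "Charlie": 18}

# Bound method as the key function.
by_get = sorted(ages, key=ages.get, reverse=True)

# The same lookup written as a lambda with brackets.
by_lambda = sorted(ages, key=lambda name: ages[name], reverse=True)

# Sorting the (name, age) pairs and keeping only the names.
by_items = [name for name, age in
            sorted(ages.items(), key=lambda pair: pair[1], reverse=True)]

print(by_get)  # ['Charlie', 'Harry', 'Lucy']
```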
Short answer
The square brackets are used for conditional lookups which can fail with a KeyError when the key is missing.
The get() method is used for unconditional lookups that never fail because a default value has been supplied.
Base method and helper method
The square brackets call the __getitem__ method which is fundamental for mappings like dicts.
The get() method is a helper layered on top of that functionality. It is a short-cut for the common coding pattern:
try:
    v = d[k]
except KeyError:
    v = default_value
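The layering is easiest to see with collections.abc.Mapping, which supplies get as a mixin method built on __getitem__. (The built-in dict implements get directly rather than going through __getitem__, so this sketch illustrates the idea, not dict's internals.) The Squares class is a hypothetical example:

```python
from collections.abc import Mapping

class Squares(Mapping):
    """Read-only mapping of n -> n*n for 0 <= n < size (hypothetical example)."""

    def __init__(self, size):
        self._size = size

    def __getitem__(self, key):
        if isinstance(key, int) and 0 <= key < self._size:
            return key * key
        raise KeyError(key)

    def __iter__(self):
        return iter(range(self._size))

    def __len__(self):
        return self._size

squares = Squares(5)
print(squares[3])           # 9
print(squares.get(3))       # 9
print(squares.get(10, -1))  # -1, via the inherited get()
```

Only __getitem__, __iter__ and __len__ are written by hand; get, along with __contains__, keys, items and values, comes from the Mapping base class.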
It allows you to provide a default value instead of getting an error when the key is not found. Roughly equivalent Python:
class dictionary(dict):
    def get(self, key, default=None):
        # Return the stored value if the key exists, otherwise the default.
        if key not in self:
            return default
        else:
            return self[key]
With Python 3.8 and later, the dictionary get() method can be used with the walrus operator := in an assignment expression to further reduce code:
if (name := dictionary.get("Name")) is not None:
    return name
Using [] instead of get() would require wrapping the code in a try/except block and catching KeyError (not shown). And without the walrus operator, you would need another line of code:
name = dictionary.get("Name")
if name is not None:
    return name
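Since return only works inside a function, a complete version of the walrus pattern might look like the sketch below. The greeting function and its messages are hypothetical, and the code requires Python 3.8 or later:

```python
def greeting(person):
    # Look the key up once and bind the result in the same expression.
    if (name := person.get("Name")) is not None:
        return f"Hello, {name}!"
    return "Hello, stranger!"

print(greeting({"Name": "Harry", "Age": 17}))  # Hello, Harry!
print(greeting({"Age": 17}))                   # Hello, stranger!
```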
Background: I need to read the same key/value from a dictionary (exactly) twice.
Question: There are two ways, as shown below,
Method 1. Read it with the same key twice, e.g.,
sample_map = {'A':1,}
...
if sample_map.get('A', None) is not None:
    print("A's value in map is {}".format(sample_map.get('A')))
Method 2. Read it once and store it in a local variable, e.g.,
sample_map = {'A':1,}
...
ret_val = sample_map.get('A', None)
if ret_val is not None:
    print("A's value in map is {}".format(ret_val))
Which way is better? What are their Pros and Cons?
Note that I am aware that print() can naturally handle ret_val of None. This is a hypothetical example and I just use it for illustration purposes.
Under these conditions, I wouldn't use either. What you're really interested in is whether A is a valid key, and the KeyError (or lack thereof) raised by __getitem__ will tell you if it is or not.
try:
    print("A's value in map is {}".format(sample_map['A']))
except KeyError:
    pass
Of course, some would say there is too much code in the try block, in which case method 2 would be preferable.
try:
    ret_val = sample_map['A']
except KeyError:
    pass
else:
    print("A's value in map is {}".format(ret_val))
or the code you already have:
ret_val = sample_map.get('A')  # None is the default value for the second argument
if ret_val is not None:
    print("A's value in map is {}".format(ret_val))
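One difference between these variants is worth keeping in mind: they disagree when the key is present but maps to None. The sketch below wraps both in functions so the results can be compared; the function names and the extra 'B' entry are assumptions added for the example:

```python
def report_eafp(mapping, key):
    # EAFP: only a missing key is treated as "nothing to report".
    try:
        value = mapping[key]
    except KeyError:
        return None
    else:
        return "{}'s value in map is {}".format(key, value)


def report_get(mapping, key):
    # get() + None check: a stored None looks the same as a missing key.
    value = mapping.get(key)
    if value is not None:
        return "{}'s value in map is {}".format(key, value)
    return None


sample_map = {'A': 1, 'B': None}

print(report_eafp(sample_map, 'A'), '|', report_get(sample_map, 'A'))
print(report_eafp(sample_map, 'B'), '|', report_get(sample_map, 'B'))
print(report_eafp(sample_map, 'C'), '|', report_get(sample_map, 'C'))
```

If None is never a legitimate value in the map, the two approaches should be interchangeable.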
There isn't any effective difference between the options you posted.
Python: List vs Dict for look up table
Lookups in a dict are about O(1). The same goes for a variable you have stored.
Efficiency is about the same. In this case, I would skip defining the extra variable, since not much else is going on.
But in a case like below, where there's a lot of dict lookups going on, I have plans to refactor the code to make things more intelligible, as all of the lookups clutter or obfuscate the logic:
# At this point, assuming that these are floats is OK, since no thresholds had text values
if vname in paramRanges:
    """
    Making sure the variable is one that we have a threshold for
    """
    # We might care about it
    # Don't check the equal case, because that won't matter
    if float(tblChanges[vname][0]) < float(tblChanges[vname][1]):
        # Check lower tolerance
        # Distinction is important because tolerances are not always the same +/-
        if abs(float(tblChanges[vname][0]) - float(tblChanges[vname][1])) >= float(
                paramRanges[vname][2]):
            # Difference from default is greater than tolerance
            # vname : current value, default value, negative tolerance, tolerance units, change date
            alerts[vname] = (
                float(tblChanges[vname][0]), float(tblChanges[vname][1]), float(paramRanges[vname][2]),
                paramRanges[vname][0], tblChanges[vname][2]
            )
    if abs(float(tblChanges[vname][0]) - float(tblChanges[vname][1])) >= float(
            paramRanges[vname][1]):
        alerts[vname] = (
            float(tblChanges[vname][0]), float(tblChanges[vname][1]), float(paramRanges[vname][1]),
            paramRanges[vname][0], tblChanges[vname][2]
        )
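A rough sketch of what such a refactor might look like, binding each lookup to a local name once. The function name, the sample data, and the tuple layouts are assumptions inferred from the comments in the snippet above; the explicit current > default branch for the upper tolerance is also an assumption, since the original nesting is not clear:

```python
def check_threshold(vname, param_ranges, tbl_changes, alerts):
    """Record an alert when a value drifts past its tolerance (sketch)."""
    if vname not in param_ranges:
        return  # no threshold for this variable

    # Assumed layouts:
    #   param_ranges[vname] = (units, upper tolerance, lower tolerance)
    #   tbl_changes[vname]  = (current value, default value, change date)
    units, upper_tol, lower_tol = param_ranges[vname][:3]
    current = float(tbl_changes[vname][0])
    default = float(tbl_changes[vname][1])
    change_date = tbl_changes[vname][2]
    difference = abs(current - default)

    if current < default:
        tolerance = float(lower_tol)
    elif current > default:
        tolerance = float(upper_tol)
    else:
        return  # the equal case does not matter

    if difference >= tolerance:
        alerts[vname] = (current, default, tolerance, units, change_date)


# Hypothetical sample data for the sketch.
param_ranges = {"temp": ("degC", "2.0", "1.0")}
tbl_changes = {"temp": ("17.5", "20.0", "2020-01-01")}
alerts = {}
check_threshold("temp", param_ranges, tbl_changes, alerts)
print(alerts)
```

With the lookups named, the comparison logic fits in a few lines and each dict is indexed only once per value.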
In most cases—if you can't just rewrite your code to use EAFP as chepner suggests, which you probably can for this example—you want to avoid repeated method calls.
The only real benefit of repeating the get is saving an assignment statement.
If your code isn't crammed in the middle of a complex expression, that just means saving one line of vertical space—which isn't nothing, but isn't a huge deal.
If your code is crammed in the middle of a complex expression, pulling the get out may force you to rewrite things a bit. You may have to, e.g., turn a lambda into a def, or turn a while loop with a simple condition into a while True: with an if …: break. Usually that's a sign that you, e.g., really wanted a def in the first place, but "usually" isn't "always". So, this is where you might want to violate the rule of thumb—but see the section at the bottom first.
On the other side…
For dict.get, the performance cost of repeating the method is pretty tiny, and unlikely to impact your code. But what if you change the code to take an arbitrary mapping object from the caller, and someone passes you, say, a proxy that does a get by making a database query or an RPC to a remote server?
For single-threaded code, calling dict.get with the same arguments twice in a row without doing anything in between is correct. But what if you're taking a dict passed by the caller, and the caller has a background thread also modifying the same dict? Then your code is only correct if you put a Lock or other synchronization around the two accesses.
Or, what if your expression was something that might mutate some state, or do something dangerous?
Even if nothing like this is ever going to be an issue in your code, unless that fact is blindingly obvious to anyone reading your code, they're still going to have to think about the possibility of performance costs and ToCToU races and so on.
And, of course, it makes at least two of your lines longer. Assuming you're trying to write readable code that sticks to 72 or 79 or 99 columns, horizontal space is a scarce resource, while vertical space is much less of a big deal. I think your second version is easier to scan than your first, even without all of these other considerations, but imagine making the expression, say, 20 characters longer.
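The point about an arbitrary mapping passed in by the caller can be sketched with a small proxy that counts lookups. CountingProxy, method_1 and method_2 are hypothetical names for this example; the counter stands in for whatever expensive work (a database query, an RPC) a real proxy might do:

```python
from collections.abc import Mapping


class CountingProxy(Mapping):
    """Hypothetical stand-in for a mapping whose lookups are expensive."""

    def __init__(self, data):
        self._data = dict(data)
        self.lookups = 0

    def __getitem__(self, key):
        self.lookups += 1  # imagine a database query or RPC here
        return self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


def method_1(mapping):
    # Reads the same key twice.
    if mapping.get('A') is not None:
        return "A's value in map is {}".format(mapping.get('A'))


def method_2(mapping):
    # Reads the key once and reuses the local variable.
    ret_val = mapping.get('A')
    if ret_val is not None:
        return "A's value in map is {}".format(ret_val)


first = CountingProxy({'A': 1})
second = CountingProxy({'A': 1})
method_1(first)
method_2(second)
print(first.lookups, second.lookups)  # 2 1
```

The get method inherited from Mapping goes through __getitem__, so every repeated get repeats the underlying work.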
In the rare cases where pulling the repeated value out of an expression would be a problem, you still often want to assign it to a temporary.
Unfortunately, up to Python 3.7, you usually can't. It's either clumsy (e.g., requiring an extra nested comprehension or lambda just to give you an opportunity to bind a variable) or impossible.
But in Python 3.8, PEP 572 assignment expressions handle this case.
if (sample := sample_map.get('A', None)) is not None:
    print("A's value in map is {}".format(sample))
I don't think this is a great use of an assignment expression (see the PEP for some better examples), especially since I'd probably write this the way chepner suggested… but it does show how to get the best of both worlds (assigning a temporary, and being embeddable in an expression) when you really need to.
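A case where a separate assignment statement is not available is the condition of a comprehension. As a small sketch (the sample data is made up for the example, and Python 3.8 or later is assumed), an assignment expression lets each key be looked up once and the result reused:

```python
sample_map = {'A': 1, 'B': None, 'C': 3}
keys = ['A', 'B', 'C', 'D']

# Each key is looked up once; the bound value is reused as the output item.
present = [value for key in keys if (value := sample_map.get(key)) is not None]
print(present)  # [1, 3]
```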
Here's what I have so far:
def is_ordered(collection):
    if isinstance(collection, set):
        return False
    if isinstance(collection, list):
        return True
    if isinstance(collection, dict):
        return False
    raise Exception("unknown collection")
Is there a much better way to do this?
NB: I do mean ordered and not sorted.
Motivation:
I want to iterate over an ordered collection. e.g.
def most_important(priorities):
    for p in priorities:
        print(p)
In this case the fact that priorities is ordered is important. What kind of collection it is is not. I'm trying to live duck-typing here. I have frequently been dissuaded from type checking by Pythonistas.
If the collection is truly arbitrary (meaning it can be of any class whatsoever), then the answer has to be no.
Basically, there are two possible approaches:
know about every possible class that can be presented to your method, and whether it's ordered;
test the collection yourself by inserting into it every possible combination of keys, and seeing whether the ordering is preserved.
The latter is clearly infeasible. The former is along the lines of what you already have, except that you have to know about every derived class such as collections.OrderedDict; checking for dict is not enough.
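To illustrate why checking for dict is not enough, here is a short sketch that runs the function from the question against a collections.OrderedDict. The sample contents are made up for the example:

```python
import collections


def is_ordered(collection):
    # Same checks as in the question.
    if isinstance(collection, set):
        return False
    if isinstance(collection, list):
        return True
    if isinstance(collection, dict):
        return False
    raise Exception("unknown collection")


ordered = collections.OrderedDict([("first", 1), ("second", 2)])

# OrderedDict is a dict subclass, so the dict branch catches it.
print(isinstance(ordered, dict))  # True
print(is_ordered(ordered))        # False
```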
Frankly, I think the whole is_ordered check is a can of worms. Why do you want to do this anyway?
Update: In essence, you are trying to unittest the argument passed to you. Stop doing that, and unittest your own code. Test your consumer (make sure it works with ordered collections), and unittest the code that calls it, to ensure it is getting the right results.
In a statically-typed language you would simply restrict yourself to specific types. If you really want to replicate that, simply specify the only types you accept, and test for those. Raise an exception if anything else is passed. It's not pythonic, but it reliably achieves what you want to do.
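A minimal sketch of that allow-list approach might look like the following. The names ACCEPTED_TYPES and require_ordered, and the choice of list and tuple as the accepted types, are assumptions for illustration:

```python
ACCEPTED_TYPES = (list, tuple)  # hypothetical allow-list for this sketch


def require_ordered(collection):
    """Return the collection unchanged, or raise if its type is not accepted."""
    if not isinstance(collection, ACCEPTED_TYPES):
        raise TypeError(
            "expected one of {}, got {}".format(
                [t.__name__ for t in ACCEPTED_TYPES],
                type(collection).__name__))
    return collection


def most_important(priorities):
    for p in require_ordered(priorities):
        print(p)


most_important(["ship", "test", "document"])
```

Passing a set to most_important would raise TypeError before anything is printed.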
Well, you have two possible approaches:
Anything with an append method is almost certainly ordered; and
If it only has an add method, you can try adding a nonce-value, then iterating over the collection to see if the nonce appears at the end (or, perhaps at one end); you could try adding a second nonce and doing it again just to be more confident.
Of course, this won't work where e.g. the collection is empty, or there is an ordering function that doesn't result in addition at the ends.
Probably a better solution is simply to specify that your code requires ordered collections, and only pass it ordered collections.
I think that enumerating the 90% case is about as good as you're going to get (if using Python 3, replace basestring with str). You probably also want to consider how you would handle generator expressions and their ilk, too (again, if using Py3, skip the xrangor):
import collections

generator = type((i for i in xrange(0)))
enumerator = type(enumerate(range(0)))
xrangor = type(xrange(0))
is_ordered = lambda seq: isinstance(seq, (tuple, list, collections.OrderedDict,
                                          basestring, generator, enumerator, xrangor))
If your callers start using itertools, then you'll also need to add itertools types as returned by islice, imap, groupby. But the sheer number of these special cases really starts to point to a code smell.
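For reference, a Python 3 adaptation of the snippet above might look like this sketch, with basestring replaced by str and the xrange type dropped as suggested. Including range in the tuple and the name ORDERED_TYPES are assumptions of this example, not part of the original answer:

```python
import collections

# Types of generator expressions and enumerate objects, as in the snippet above.
generator = type(i for i in range(0))
enumerator = type(enumerate(range(0)))

# range is added here as an assumption; the original list had xrange's type.
ORDERED_TYPES = (tuple, list, collections.OrderedDict, str, range,
                 generator, enumerator)


def is_ordered(seq):
    return isinstance(seq, ORDERED_TYPES)


print(is_ordered([1, 3, 2]))             # True
print(is_ordered({1, 2, 3}))             # False
print(is_ordered(x for x in "abc"))      # True
```

Note that a plain dict is still reported as unordered by this check, even though dicts preserve insertion order from Python 3.7 on.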
What if the list is not ordered, e.g. [1,3,2]?