Is there a more readable way to check if a key buried in a dict exists without checking each level independently?
Let's say I need to get this value buried in an object (example taken from Wikidata):
x = s['mainsnak']['datavalue']['value']['numeric-id']
To make sure that this does not end with a runtime error it is necessary to either check every level like so:
if 'mainsnak' in s and 'datavalue' in s['mainsnak'] and 'value' in s['mainsnak']['datavalue'] and 'numeric-id' in s['mainsnak']['datavalue']['value']:
    x = s['mainsnak']['datavalue']['value']['numeric-id']
The other way I can think of to solve this is to wrap it in a try/except construct, which I feel is also rather awkward for such a simple task.
I am looking for something like:
x = exists(s['mainsnak']['datavalue']['value']['numeric-id'])
which returns True if all levels exist.
To be brief, with Python you must trust that it is easier to ask for forgiveness than permission:
try:
    x = s['mainsnak']['datavalue']['value']['numeric-id']
except KeyError:
    pass
The answer
Here is how I deal with nested dict keys:
def keys_exists(element, *keys):
    '''
    Check if *keys (nested) exists in `element` (dict).
    '''
    if not isinstance(element, dict):
        raise AttributeError('keys_exists() expects dict as first argument.')
    if len(keys) == 0:
        raise AttributeError('keys_exists() expects at least two arguments, one given.')

    _element = element
    for key in keys:
        try:
            _element = _element[key]
        except KeyError:
            return False
    return True
Example:
data = {
    "spam": {
        "egg": {
            "bacon": "Well..",
            "sausages": "Spam egg sausages and spam",
            "spam": "does not have much spam in it"
        }
    }
}
print('spam (exists): {}'.format(keys_exists(data, "spam")))
print('spam > bacon (do not exists): {}'.format(keys_exists(data, "spam", "bacon")))
print('spam > egg (exists): {}'.format(keys_exists(data, "spam", "egg")))
print('spam > egg > bacon (exists): {}'.format(keys_exists(data, "spam", "egg", "bacon")))
Output:
spam (exists): True
spam > bacon (do not exists): False
spam > egg (exists): True
spam > egg > bacon (exists): True
It loops over the given element, testing each key in the given order.
I prefer this to all the variable.get('key', {}) methods I found because it follows EAFP.
The function expects to be called like: keys_exists(dict_element_to_test, 'key_level_0', 'key_level_1', 'key_level_n', ..). At least two arguments are required, the element and one key, but you can add as many keys as you want.
If you need to use kind of map, you can do something like:
expected_keys = ['spam', 'egg', 'bacon']
keys_exists(data, *expected_keys)
You could use .get with defaults:
s.get('mainsnak', {}).get('datavalue', {}).get('value', {}).get('numeric-id')
but this is almost certainly less clear than using try/except.
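One pitfall of the .get chain is worth spelling out: the {} default is only used when a key is missing, not when the key is present with the value None. The following is a minimal sketch of that failure and of the common `or {}` workaround; the sample dict and the variable names `chain_failed` and `numeric_id` are made up for illustration.

```python
# Hypothetical sample: the key exists, but its value is None
s = {"mainsnak": {"datavalue": None}}

# The plain .get chain fails, because .get("datavalue", {}) returns the stored None
try:
    s.get("mainsnak", {}).get("datavalue", {}).get("value", {}).get("numeric-id")
    chain_failed = False
except AttributeError:
    chain_failed = True

# `or {}` replaces any falsy intermediate value (None included) with an empty dict
numeric_id = (
    ((s.get("mainsnak") or {}).get("datavalue") or {}).get("value") or {}
).get("numeric-id")

print(chain_failed)
print(numeric_id)
```

The workaround treats every falsy intermediate (None, 0, "", empty containers) as "missing", which may or may not be what you want.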
Python 3.8 +
dictionary = {
    "main_key": {
        "sub_key": "value",
    },
}
if sub_key_value := dictionary.get("main_key", {}).get("sub_key"):
    print(f"The key 'sub_key' exists in dictionary[main_key] and its value is {sub_key_value}")
else:
    print("Key 'sub_key' doesn't exist or its value is falsy")
Extra
A little but important clarification.
In the previous code block, we verify that a key exists in a dictionary but that its value is also Truthy.
Most of the time, this is what people are really looking for, and I think this is what the OP really wants. However, it is not really the most "correct" answer, since if the key exists but its value is False, the above code block will tell us that the key does not exist, which is not true.
So, I leave here a more correct answer:
dictionary = {
    "main_key": {
        "sub_key": False,
    },
}
if "sub_key" in dictionary.get("main_key", {}):
    print(f"The key 'sub_key' exists in dictionary[main_key] and its value is {dictionary['main_key']['sub_key']}")
else:
    print("Key 'sub_key' doesn't exist")
Try/except seems to be most pythonic way to do that.
The following recursive function should work (returns None if one of the keys was not found in the dict):
def exists(obj, chain):
    _key = chain.pop(0)  # note: this consumes the caller's list
    if _key in obj:
        return exists(obj[_key], chain) if chain else obj[_key]

myDict = {
    'mainsnak': {
        'datavalue': {
            'value': {
                'numeric-id': 1
            }
        }
    }
}
result = exists(myDict, ['mainsnak', 'datavalue', 'value', 'numeric-id'])
print(result)
>>> 1
I suggest using python-benedict, a solid Python dict subclass with full keypath support and many utility methods.
You just need to cast your existing dict:
s = benedict(s)
Now your dict has full keypath support and you can check if the key exists in the pythonic way, using the in operator:
if 'mainsnak.datavalue.value.numeric-id' in s:
    # do stuff
Here are the library repository and the documentation:
https://github.com/fabiocaccamo/python-benedict
Note: I am the author of this project
You can use pydash to check if exists: http://pydash.readthedocs.io/en/latest/api.html#pydash.objects.has
Or get the value (you can even set default - to return if doesn't exist): http://pydash.readthedocs.io/en/latest/api.html#pydash.objects.has
Here is an example:
>>> get({'a': {'b': {'c': [1, 2, 3, 4]}}}, 'a.b.c[1]')
2
The try/except way is the most clean, no contest. However, it also counts as an exception in my IDE, which halts execution while debugging.
Furthermore, I do not like using exceptions as in-method control statements, which is essentially what is happening with the try/except.
Here is a short solution which does not use recursion, and supports a default value:
def chained_dict_lookup(lookup_dict, keys, default=None):
    _current_level = lookup_dict
    for key in keys:
        if key in _current_level:
            _current_level = _current_level[key]
        else:
            return default
    return _current_level
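One limitation of default=None is that a stored None cannot be told apart from a missing key. A minimal sketch of one way around that, assuming a module-level sentinel object; `_MISSING` and `nested_has` are hypothetical names, and the extra isinstance check is an addition that is not in the answer above.

```python
_MISSING = object()  # hypothetical sentinel: never equal to any stored value


def chained_dict_lookup(lookup_dict, keys, default=None):
    """Same loop as above, plus a dict check on each level (an added assumption)."""
    current = lookup_dict
    for key in keys:
        if isinstance(current, dict) and key in current:
            current = current[key]
        else:
            return default
    return current


def nested_has(lookup_dict, keys):
    """True if the whole key path exists, even when the stored value is None."""
    return chained_dict_lookup(lookup_dict, keys, default=_MISSING) is not _MISSING


s = {"mainsnak": {"datavalue": {"value": {"numeric-id": None}}}}
path = ["mainsnak", "datavalue", "value", "numeric-id"]

print(chained_dict_lookup(s, path))            # the stored None
print(nested_has(s, path))                     # path exists
print(nested_has(s, ["mainsnak", "missing"]))  # path does not exist
```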
The accepted answer is a good one, but here is another approach. It's a little less typing and a little easier on the eyes (in my opinion) if you end up having to do this a lot. It also doesn't require any additional package dependencies like some of the other answers. Have not compared performance.
import functools
def haskey(d, path):
    try:
        functools.reduce(lambda x, y: x[y], path.split("."), d)
        return True
    except KeyError:
        return False

# Throwing in this approach for nested get for the heck of it...
def getkey(d, path, *default):
    try:
        return functools.reduce(lambda x, y: x[y], path.split("."), d)
    except KeyError:
        if default:
            return default[0]
        raise
Usage:
data = {
    "spam": {
        "egg": {
            "bacon": "Well..",
            "sausages": "Spam egg sausages and spam",
            "spam": "does not have much spam in it",
        }
    }
}
(Pdb) haskey(data, "spam")
True
(Pdb) haskey(data, "spamw")
False
(Pdb) haskey(data, "spam.egg")
True
(Pdb) haskey(data, "spam.egg3")
False
(Pdb) haskey(data, "spam.egg.bacon")
True
Original inspiration from the answers to this question.
EDIT: a comment pointed out that this only works with string keys. A more generic approach would be to accept an iterable path param:
def haskey(d, path):
    try:
        functools.reduce(lambda x, y: x[y], path, d)
        return True
    except KeyError:
        return False
(Pdb) haskey(data, ["spam", "egg"])
True
I had the same problem and a recent Python lib popped up:
https://pypi.org/project/dictor/
https://github.com/perfecto25/dictor
So in your case:
from dictor import dictor
x = dictor(s, 'mainsnak.datavalue.value.numeric-id')
Personal note:
I don't like 'dictor' name, since it doesn't hint what it actually does. So I'm using it like:
from dictor import dictor as extract
x = extract(s, 'mainsnak.datavalue.value.numeric-id')
I couldn't come up with a better name than extract. Feel free to comment if you come up with a more viable name. safe_get and robust_get didn't feel right for my case.
Another way:
def does_nested_key_exists(dictionary, nested_key):
    exists = nested_key in dictionary
    if not exists:
        for key, value in dictionary.items():
            if isinstance(value, dict):
                exists = exists or does_nested_key_exists(value, nested_key)
    return exists
The selected answer works well on the happy path, but there are a couple of obvious issues. If you were to search for ["spam", "egg", "bacon", "pizza"], it would raise a TypeError from trying to index the string "Well.." with the string "pizza". Likewise, if you replaced "pizza" with 2, it would index into the string "Well.." (getting the character "l") and report the key as existing.
Selected Answer Output Issues:
data = {
    "spam": {
        "egg": {
            "bacon": "Well..",
            "sausages": "Spam egg sausages and spam",
            "spam": "does not have much spam in it"
        }
    }
}
print(keys_exists(data, "spam", "egg", "bacon", "pizza"))
>> TypeError: string indices must be integers
print(keys_exists(data, "spam", "egg", "bacon", 2))
>> True
I also feel that using try except can be a crutch that we might too quickly rely on. Since I believe we already need to check for the type, might as well remove the try except.
Solution:
Undefined = "INVALID"  # placeholder default; assumed from the output shown below

def dict_value_or_default(element, keys=(), default=Undefined):
    '''
    Check if keys (nested) exists in `element` (dict).
    Returns value if last key exists, else returns default value
    '''
    if not isinstance(element, dict):
        return default

    _element = element
    for key in keys:
        # Necessary to ensure _element is not a different indexable type (list, string, etc).
        # get() would have the same issue if that method name was implemented by a different object
        if not isinstance(_element, dict) or key not in _element:
            return default
        _element = _element[key]
    return _element
Output:
print(dict_value_or_default(data, ["spam", "egg", "bacon", "pizza"]))
>> INVALID
print(dict_value_or_default(data, ["spam", "egg", "bacon", 2]))
>> INVALID
print(dict_value_or_default(data, ["spam", "egg", "bacon"]))
>> Well..
Here's my small snippet based on #Aroust's answer:
def exist(obj, *keys: str) -> bool:
    _obj = obj
    try:
        for key in keys:
            _obj = _obj[key]
    except (KeyError, TypeError):
        return False
    return True

if __name__ == '__main__':
    obj = {"mainsnak": {"datavalue": {"value": "A"}}}
    answer = exist(obj, "mainsnak", "datavalue", "value", "B")
    print(answer)
I added TypeError because when _obj is str, int, None, or etc, it would raise that error.
I wrote a data parsing library called dataknead for cases like this, basically because I got frustrated by the JSON the Wikidata API returns as well.
With that library you could do something like this:
from dataknead import Knead
numid = Knead(s).query("mainsnak/datavalue/value/numeric-id").data()
if numid:
    # Do something with `numeric-id`
Using dict with defaults is concise and appears to execute faster than using consecutive if statements.
Try it yourself:
import timeit
timeit.timeit("'x' in {'a': {'x': {'y'}}}.get('a', {})")
# 0.2874350370002503
timeit.timeit("'a' in {'a': {'x': {'y'}}} and 'x' in {'a': {'x': {'y'}}}['a']")
# 0.3466246419993695
I have written a handy library for this purpose.
I am iterating over ast of the dict and trying to check if a particular key is present or not.
Do check this out.
https://github.com/Agent-Hellboy/trace-dkey
If you can suffer testing a string representation of the object path then this approach might work for you:
def exists(str):
    try:
        eval(str)
        return True
    except:
        return False
exists("lst['sublist']['item']")
One can try this to check whether a key, nested key, or value appears anywhere in a nested dict:
import yaml
# d - nested dictionary
if something in yaml.dump(d, default_flow_style=False):
    print(something, "is in", d)
else:
    print(something, "is not in", d)
There are many great answers. Here is my humble take on it, with an added check for arrays of dictionaries. Please note that I am not checking the arguments' validity. I used part of Arnot's code above. I added this answer because I have a use case that requires checking arrays of dictionaries in my data.
Here is the code:
def keys_exists(element, *keys):
    '''
    Check if any of *keys exists (with a non-None value) anywhere in `element`,
    descending into nested dicts and lists.
    '''
    if isinstance(element, dict):
        for akey in keys:
            if element.get(akey) is not None:
                return True
        for value in element.values():
            # return on the first hit so a later sibling cannot overwrite the result
            if isinstance(value, (dict, list)) and keys_exists(value, *keys):
                return True
    elif isinstance(element, list):
        for val in element:
            if isinstance(val, (dict, list)) and keys_exists(val, *keys):
                return True
    return False
Related
I am building some APIs in Python 3 that let me retrieve information about a school based on a class code. I would like to know how to look up that information from the class code.
Example:
I enter the code GF528S and I want the program to tell me the class (3C INF), the address (Address 1, Milan), and if possible also the name of the school (Test School 1) and the previous keys. Thanks in advance! Of course I use a JSON structure:
{
"schools": {
"Lombardia": {
"Milano": {
"Milano": {
"Test School 1": {
"sedi": {
"0": {
"indirizzo": "Address 1, Milan",
"classi": {
"INFORMATICA E TELECOMUNICAZIONI": {
"3C INF": "GF528S"
}
}
},
"1": {
"indirizzo": "Address 2, Milan",
"classi": {
"INFORMATICA E TELECOMUNICAZIONI": {
"1A IT": "HKPV5P",
"2A IT": "QL3J3K",
"3A INF": "X4E35C",
"3A TEL": "ZAA7LC"
}
}
}
}
}
}
}
}
}
}
When I get the values from my database they are converted to a python dictionary if it helps!
After a series of tests thanks to your answers, I found that the for in .items() is blocked when it shows the indirizzo field:
In particular, it cannot search in these dictionaries:
{'classi': {'INFORMATICA E TELECOMUNICAZIONI': {'3C INF': 'GF528S'}}, 'indirizzo': 'Address 1, Milan'}
{'classi': {'INFORMATICA E TELECOMUNICAZIONI': {'1A IT': 'HKPV5P', '2A IT': 'QL3J3K', '3A INF': 'X4E35C', '3A TEL': 'ZAA7LC'}}, 'indirizzo': 'Address 2, Milan'}
I think the problem is precisely the indirizzo field. If you want to do the for first, it can be saved in a variable and deleted from the json:
del val ["address"]
The problem is that then I can't associate the address with the class.
Code:
def dictionary_check(input):
    indirizzo = ""
    for key,value in input.items():
        if isinstance(value, dict):
            dictionary_check(value)
        else:
            for i in value:
                indirizzo += i["indirizzo"]
                del i['indirizzo']
                for x, y in i.items():
                    for z, j in y.items():
                        for a in j.items():
                            if a[1] == "HKPV5P":
                                print(indirizzo)
                                print("Classe: " + a[0])
While I can't write the exact code for you, I think it's reasonable to be able to give you a rough idea of what the code would look like, and some guidance.
I don't exactly know where this JSON data is being obtained. So it may have more / less keys when your applications runs. However, assuming the json is exactly as is, and the json data is loaded onto the variable (let's say json_map), then accessing a specific value looks something like:
json_map[key_value]
So you would want to do something similar to
json_map['schools']['Lombardia']['Milano']
and more keys until you reach the dictionary you want to play around with.
I think the point you might be confused is - if you have multiple values (that you may not be aware of what they might look like) how you handle it. For example, I think the key "sedi" (which I assume means locations) might return multiple locations (i.e. schools) and you won't know what their keys / values are. In that case, you may wish to iterate through that dictionary via something like:
for key, value in dict_.items():
    # do your action
It is likely that key will be an integer (in string format) and value will be another dictionary. You will want to check a specific attribute of the dictionary to see if it's the one you're looking for.
Also, finally, when you get to the 'INFORMATICA E TELECOMUNICAZIONI' dictionary of the location(s), you may wish to return the key of the item that has the corresponding value. Something like:
for key, value in dict_.items():
    if value == 'GF528S':
        return key
Of course, you'll be able to replace this value of 'GF528S' to a variable so you can change it each time.
I think this is as far as I can help you without actually implementing this. I gave the benefit of the doubt that you are like me when I just started programming and I just needed someone to give me a rough outline of what to do. Any more help, I think you may need grab someone who has knowledge of what to do IRL or hire a tutor/teacher to teach you basic concepts of Programming.
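The outline above can be sketched as a small recursive generator. This is only an illustration under stated assumptions: `find_paths` is a hypothetical helper name, the `schools` dict is a trimmed copy of the JSON from the question, and the negative indices rely on that exact nesting (sede > "classi" > department > class).

```python
def find_paths(node, target, path=()):
    """Yield the tuple of keys leading to every value equal to `target`."""
    if isinstance(node, dict):
        for key, value in node.items():
            if value == target:
                yield path + (key,)
            else:
                # Recurse into nested dicts; non-dict leaves yield nothing
                yield from find_paths(value, target, path + (key,))


# Trimmed copy of the structure from the question
schools = {
    "schools": {
        "Lombardia": {
            "Milano": {
                "Milano": {
                    "Test School 1": {
                        "sedi": {
                            "0": {
                                "indirizzo": "Address 1, Milan",
                                "classi": {
                                    "INFORMATICA E TELECOMUNICAZIONI": {"3C INF": "GF528S"}
                                },
                            }
                        }
                    }
                }
            }
        }
    }
}

for found in find_paths(schools, "GF528S"):
    # Dropping the last three keys (classi, department, class) gives the path
    # to the location dict that also holds "indirizzo"
    sede = schools
    for key in found[:-3]:
        sede = sede[key]
    print("class:", found[-1], "| school:", found[-6], "| address:", sede["indirizzo"])
```

Because the full key path is returned, the address stays associated with the class without deleting anything from the data.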
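search_key = "GF528S"

def recursive_search(dct, keys):
    for key, value in dct.items():
        # match on either the key or the value, since the class code is stored as a value
        if key == search_key or value == search_key:
            print(keys, key, value)
        if isinstance(value, dict):
            recursive_search(value, [*keys, key])

recursive_search(input_dict, [])  # input_dict: the dictionary loaded from the JSON above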
You should use a recursive function to find the key or value.
def search_key(data, key, path=""):
    if type(data) is dict:
        for k, v in data.items():
            # build a new path per child so sibling keys do not accumulate in it
            new_path = "{0} -> {1}".format(path, k)
            if k == key or v == key:
                return (k, v, new_path)
            res = search_key(v, key, new_path)
            if res is not None:
                return res

result = search_key(your_dictionary, key="GF528S")
if result is not None:
    print("key:", result[0])
    print("value:", result[1])
    print("path:", result[2])
else:
    print("key or value not found!")
If you want to search the entire dictionary and collect every key or value that matches the given key, the function below is useful.
def entire_search_key(data, key, founds=None, path=""):
    if founds is None:
        founds = []  # fresh list per top-level call; a mutable default would be shared between calls
    if type(data) is dict:
        for k, v in data.items():
            new_path = "{0} -> {1}".format(path, k)
            if k == key or v == key:
                founds.append((k, v, new_path))
            entire_search_key(v, key, founds, new_path)
    return founds

result = entire_search_key(your_dictionary, key="ddd")
if result == []:
    print("key or value not found!")
else:
    for i in result:
        print(i)
I am parsing unknown nested json object, I do not know the structure nor depth ahead of time. I am trying to search through it to find a value. This is what I came up with, but I find it fugly. Could anybody let me know how to make this look more pythonic and cleaner?
def find(d, key):
    if isinstance(d, dict):
        for k, v in d.iteritems():
            try:
                if key in str(v):
                    return 'found'
            except:
                continue
            if isinstance(v, dict):
                for key,value in v.iteritems():
                    try:
                        if key in str(value):
                            return "found"
                    except:
                        continue
                    if isinstance(v, dict):
                        find(v)
                    elif isinstance(v, list):
                        for x in v:
                            find(x)
    if isinstance(d, list):
        for x in d:
            try:
                if key in x:
                    return "found"
            except:
                continue
            if isinstance(v, dict):
                find(v)
            elif isinstance(v, list):
                for x in v:
                    find(x)
    else:
        if key in str(d):
            return "found"
        else:
            return "Not Found"
It is generally more "Pythonic" to use duck typing; i.e., just try to search for your target rather than using isinstance. See What are the differences between type() and isinstance()?
However, your need for recursion makes it necessary to recurse the values of the dictionaries and the elements of the list. (Do you also want to search the keys of the dictionaries?)
The in operator can be used for both strings, lists, and dictionaries, so no need to separate the dictionaries from the lists when testing for membership. Assuming you don't want to test for the target as a substring, do use isinstance(basestring) per the previous link. To test whether your target is among the values of a dictionary, test for membership in your_dictionary.values(). See Get key by value in dictionary
Because the dictionary values might be lists or dictionaries, I still might test for dictionary and list types the way you did, but I mention that you can cover both list elements and dictionary keys with a single statement because you ask about being Pythonic, and using an overloaded operator like in across two types is typical of Python.
Your idea to use recursion is necessary, but I wouldn't define the function with the name find because that is a Python built-in which you will (sort of) shadow and make the recursive call less readable because another programmer might mistakenly think you're calling the built-in (and as good practice, you might want to leave the usual access to the built in in case you want to call it.)
To test for numeric types, use numbers.Number as described at How can I check if my python object is a number?
Also, there is a solution to a variation of your problem at https://gist.github.com/douglasmiranda/5127251 . I found that before posting because ColdSpeed's regex suggestion in the comment made me wonder if I were leading you down the wrong path.
So something like
import numbers

def recursively_search(object_from_json, target):
    # basestring is Python 2 only; use str on Python 3
    if isinstance(object_from_json, (basestring, numbers.Number)):
        return object_from_json==target # the recursion base cases
    elif isinstance(object_from_json, list):
        for element in object_from_json:
            if recursively_search(element, target):
                return True # quit at first match
    elif isinstance(object_from_json, dict):
        if target in object_from_json:
            return True # among the keys
        else:
            for value in object_from_json.values():
                if recursively_search(value, target):
                    return True # quit upon finding first match
    else:
        print ("recursively_search() did not anticipate type ",type(object_from_json))
        return False
    return False # no match found among the list elements, dict keys, nor dict values
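For Python 3, the same idea can be written more compactly with any(). This is a sketch rather than a drop-in replacement: `contains` is a hypothetical name, it treats every non-dict, non-list value as a leaf compared with ==, and `doc` is made-up sample data.

```python
def contains(node, target):
    """True if `target` is a dict key, a dict value, or a list element at any depth."""
    if isinstance(node, dict):
        # `target in node` checks the keys; the generator checks the values recursively
        return target in node or any(contains(v, target) for v in node.values())
    if isinstance(node, list):
        return any(contains(item, target) for item in node)
    return node == target  # scalar leaf: exact match, not a substring test


doc = {"a": [{"b": "needle"}, 3], "c": {"d": None}}

print(contains(doc, "needle"))  # matches a nested value
print(contains(doc, "b"))       # matches a nested key
print(contains(doc, "need"))    # substrings do not match
```

any() short-circuits, so the search still stops at the first match.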
I have code that works but I'm wondering if there is a more pythonic way to do this. I have a dictionary and I want to see if:
a key exists
that value isn't None (NULL from SQL in this case)
that value isn't simply an empty string ("")
that value doesn't solely consist of spaces
So in my code the keys of "a", "b", and "c" would succeed, which is correct.
import re
mydict = {
    "a":"alpha",
    "b":0,
    "c":False,
    "d":None,
    "e":"",
    "g":" ",
}
# a, b, c should succeed
for k in mydict.keys():
    if k in mydict and mydict[k] is not None and not re.search(r"^\s*$", str(mydict[k])):
        print(k)
    else:
        print("I am incomplete and sad")
What I have above works, but that seems like an awfully long set of conditions. Maybe this simply is the right solution but I'm wondering if there is a more pythonic "exists and has stuff" or better way to do this?
UPDATE
Thank you all for the wonderful answers and thoughtful comments. Based on some of the points and tips, I've updated the question a little, as there were some conditions I didn't have which should also succeed. I have also changed the example to a loop (just easier to test, right?).
Try to fetch the value and store it in a variable, then use object "truthiness" to go further with the value:
v = mydict.get("a")
if v and v.strip():
If "a" is not in the dict, get returns None and the first condition fails.
If "a" is in the dict but yields None or an empty string, the test fails; if "a" yields a blank string, strip() returns a falsy string and it fails too.
Let's test this:
for k in "abcde":
    v = mydict.get(k)
    if v and v.strip():
        print(k,"I am here and have stuff")
    else:
        print(k,"I am incomplete and sad")
results:
a I am here and have stuff
b I am incomplete and sad # key isn't in dict
c I am incomplete and sad # c is None
d I am incomplete and sad # d is empty string
e I am incomplete and sad # e is only blanks
if your values can contain False, 0 or other "falsy" non-strings, you'll have to test for string, in that case replace:
if v and v.strip():
by
if v is not None and (not isinstance(v,str) or v.strip()):
so condition matches if not None and either not a string (everything matches) or if a string, the string isn't blank.
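Wrapped in a helper, that condition can be checked against the dictionary from the updated question. A minimal sketch; `has_content` is a hypothetical name and the dict is copied from the question.

```python
def has_content(d, key):
    """True if key exists and its value is neither None nor a blank string."""
    v = d.get(key)  # None when the key is missing
    return v is not None and (not isinstance(v, str) or bool(v.strip()))


mydict = {"a": "alpha", "b": 0, "c": False, "d": None, "e": "", "g": " "}

passing = [k for k in mydict if has_content(mydict, k)]
print(passing)  # ['a', 'b', 'c']
```

Falsy non-string values such as 0 and False pass, which is what the updated question asks for.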
The get method for checking if a key exists is more efficient than iterating through the keys. It checks whether the key exists without iteration, with O(1) complexity as opposed to O(n). My preferred method would look something like this:
if mydict.get("a") is not None and str(mydict.get("a")).replace(" ", "") != '':
    # Do some work
You can use a list comprehension with str.strip to account for whitespace in strings.
Using if v is natural in Python to cover False-like objects, e.g. None, False, 0, etc. So note this only works if 0 is not an acceptable value.
res = [k for k, v in mydict.items() if (v.strip() if isinstance(v, str) else v)]
['a']
Here's a simple one-liner to check:
The key exists
The value is not None
The value is not ""
bool(myDict.get("some_key"))
As for checking if the value contains only spaces, you would need to be more careful as None doesn't have a strip() method.
Something like this as an example:
try:
    exists = bool(myDict.get('some_key').strip())
except AttributeError:
    exists = False
Well I have 2 suggestions to offer you, especially if your main issue is the length of the conditions.
The first one is for the check if the key is in the dict. You don't need to use "a" in mydict.keys() you can just use "a" in mydict.
The second suggestion to make the condition smaller is to break down into smaller conditions stored as booleans, and check these in your final condition:
import re
mydict = {
    "a":"alpha",
    "c":None,
    "d":"",
    "e":" ",
}
inKeys = True if "a" in mydict else False
isNotNone = True if mydict["a"] is not None else False
isValidKey = True if not re.search(r"^\s*$", mydict["a"]) else False
if inKeys and isNotNone and isValidKey:
    print("I am here and have stuff")
else:
    print("I am incomplete and sad")
This checks exactly for NoneType, not only None:
NoneType = type(None)  # works on every Python version; `from types import NoneType` needs Python 3.10+
mydict = {
    "a":"alpha",
    "b":0,
    "c":False,
    "d":None,
    "e":"",
    "g":" ",
}
# a, b, c should succeed
for k in mydict:
    if type(mydict[k]) != NoneType:
        if type(mydict[k]) != str or type(mydict[k]) == str and mydict[k].strip():
            print(k)
        else:
            print("I am incomplete and sad")
    else:
        print("I am incomplete and sad")
cond is a generator function responsible for generating conditions to apply in a short-circuiting manner using the all function. Given d = cond(), next(d) will check if a exists in the dict, and so on until there is no condition to apply, in that case all(d) will evaluate to True.
mydict = {
    "a":"alpha",
    "c":None,
    "d":"",
    "e":" ",
}
def cond():
    yield 'a' in mydict
    yield mydict['a']
    yield mydict['a'].strip()

if all(cond()):
    print("I am here and have stuff")
else:
    print("I am incomplete and sad")
I have several nested dictionaries within lists, and I need to verify if a specific path exist e.g.
dict1['layer1']['layer2'][0]['layer3']
How can I check with an IF statement if the path is valid?
I was thinking to
if dict1['layer1']['layer2'][0]['layer3'] :
but it doesn't work
Here's the explicit short code with try/except:
try:
    dict1['layer1']['layer2'][0]['layer3']
except (KeyError, IndexError, TypeError):  # missing key, list too short, or wrong container type
    present = False
else:
    present = True
if present:
    ...
To get the element:
try:
    obj = dict1['layer1']['layer2'][0]['layer3']
except (KeyError, IndexError, TypeError):
    obj = None # or whatever
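Since the path in the question mixes dict keys and a list index, the same try/except idea can be wrapped in a reusable helper. A minimal sketch; `path_exists` is a hypothetical name and `dict1` is made-up sample data shaped like the path in the question.

```python
def path_exists(obj, *path):
    """Walk dict keys and list indices; report whether the whole path resolves."""
    try:
        for step in path:
            obj = obj[step]
    except (KeyError, IndexError, TypeError):
        # KeyError: missing dict key; IndexError: list too short;
        # TypeError: indexing a non-container, or a str index on a list
        return False
    return True


dict1 = {"layer1": {"layer2": [{"layer3": "x"}]}}

print(path_exists(dict1, "layer1", "layer2", 0, "layer3"))  # full path resolves
print(path_exists(dict1, "layer1", "layer2", 1, "layer3"))  # list has no index 1
print(path_exists(dict1, "layer1", "missing"))              # key not present
```

One caveat: strings are indexable too, so an integer step applied to a string value will succeed. Add an isinstance check if that matters for your data.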
I wanted to propose another solution, because I've been thinking about it too.
if not dict1.get("layer1", {}).get("layer2", {})[0].get("layer3", {}):
...
dict.get() attempts to get the key at each stage.
If the key is not present, an empty dict will be returned, instead of the nested dict (this is needed, because trying to call .get() on the default return of None will yield an AttributeError).
If the return is empty, it will evaluate to false.
So, this wouldn't work if the final result was an empty dict anyway, but if you can guarantee the results will be filled, this is an alternative that is fairly simple.
As far as I know, you have to go step by step, i.e.:
if 'layer1' in dict1:
    if 'layer2' in dict1['layer1']:
and so on...
If you don't want to go the try/except route, you could whip up a quick method to do this:
def check_dict_path(d, *indices):
    sentinel = object()
    for index in indices:
        d = d.get(index, sentinel)
        if d is sentinel:
            return False
    return True

test_dict = {1: {'blah': {'blob': 4}}}
print(check_dict_path(test_dict, 1, 'blah', 'blob')) # True
print(check_dict_path(test_dict, 1, 'blah', 'rob')) # False
This might be redundant if you're also trying to retrieve the object at that location (rather than just verify whether the location exists). If that's the case, the above method can easily be updated accordingly.
Here is a similar question with the answer I would recommend:
Elegant way to check if a nested key exists in a python dict
Using recursive function:
def path_exists(path, dict_obj, index = 0):
    if (type(dict_obj) is dict and path[index] in dict_obj.keys()):
        if (len(path) > (index+1)):
            return path_exists(path, dict_obj[path[index]], index + 1)
        else:
            return True
    else:
        return False
Where path is a list of strings representing the nested keys.
I have a function that takes a key and traverses nested dicts to return the value regardless of its depth. However, I can only get the value to print, not return. I've read the other questions on this issue and have tried 1. implementing yield 2. appending the value to a list and then returning the list.
def get_item(data,item_key):
    # data=dict, item_key=str
    if isinstance(data,dict):
        if item_key in data.keys():
            print data[item_key]
            return data[item_key]
        else:
            for key in data.keys():
                # recursion
                get_item(data[key],item_key)

item = get_item(data,'aws:RequestId')
print item
Sample data:
data = OrderedDict([(u'aws:UrlInfoResponse', OrderedDict([(u'#xmlns:aws', u'http://alexa.amazonaws.com/doc/2005-10-05/'), (u'aws:Response', OrderedDict([(u'#xmlns:aws', u'http://awis.amazonaws.com/doc/2005-07-11'), (u'aws:OperationRequest', OrderedDict([(u'aws:RequestId', u'4dbbf7ef-ae87-483b-5ff1-852c777be012')])), (u'aws:UrlInfoResult', OrderedDict([(u'aws:Alexa', OrderedDict([(u'aws:TrafficData', OrderedDict([(u'aws:DataUrl', OrderedDict([(u'#type', u'canonical'), ('#text', u'infowars.com/')])), (u'aws:Rank', u'1252')]))]))])), (u'aws:ResponseStatus', OrderedDict([(u'#xmlns:aws', u'http://alexa.amazonaws.com/doc/2005-10-05/'), (u'aws:StatusCode', u'Success')]))]))]))])
When I execute, the desired value prints, but does not return:
>>>52c7e94b-dc76-2dd6-1216-f147d991d6c7
>>>None
What is happening? Why isn't the function breaking and returning the value when it finds it?
A simple fix, you have to find a nested dict that returns a value. You don't need to explicitly use an else clause because the if returns. You also don't need to call .keys():
def get_item(data, item_key):
    if isinstance(data, dict):
        if item_key in data:
            return data[item_key]
        for key in data:
            found = get_item(data[key], item_key)
            if found:
                return found
    return None # Explicit vs Implicit

>>> get_item(data, 'aws:RequestId')
'4dbbf7ef-ae87-483b-5ff1-852c777be012'
One of the design principles of python is EAFP (Easier to Ask for Forgiveness than Permission), which means that exceptions are more commonly used than in other languages. The above rewritten with EAFP design:
def get_item(data, item_key):
    try:
        return data[item_key]
    except KeyError:
        for key in data:
            found = get_item(data[key], item_key)
            if found:
                return found
    except (TypeError, IndexError):
        pass
    return None
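Both versions test the recursive result with `if found:`, so a stored value that is falsy (0, "", False, an empty list) is treated as not found. A minimal sketch of one way to avoid that, assuming a private sentinel object; `_MISSING`, `_find`, and the `default` parameter are hypothetical additions, and `data` here is small made-up sample data.

```python
_MISSING = object()  # hypothetical sentinel meaning "key not found anywhere"


def _find(data, item_key):
    """Depth-first search that returns _MISSING instead of None on failure."""
    if isinstance(data, dict):
        if item_key in data:
            return data[item_key]
        for value in data.values():
            found = _find(value, item_key)
            if found is not _MISSING:
                return found
    return _MISSING


def get_item(data, item_key, default=None):
    """Return the first value stored under item_key, or `default` if absent."""
    found = _find(data, item_key)
    return default if found is _MISSING else found


data = {"outer": {"first": {"other": 1}, "second": {"count": 0}}}

print(get_item(data, "count"))             # 0 is returned, not skipped
print(get_item(data, "nope"))              # None
print(get_item(data, "nope", default=-1))  # -1
```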
As other people commented, you need a return statement in the else blocks too. You have two if blocks, so you would need two more return statements. Here is code that does what you may want:
from collections import OrderedDict

def get_item(data,item_key):
    result = []
    if isinstance(data, dict):
        for key in data:
            if key == item_key:
                print(data[item_key])
                result.append(data[item_key])
            # recursion
            result += get_item(data[key],item_key)
        return result
    return result
Your else block needs to return the value if it finds it.
I've made a few other minor changes to your code. You don't need to do
if item_key in data.keys():
Instead, you can simply do
if item_key in data:
Similarly, you don't need
for key in data.keys():
You can iterate directly over a dict (or any class derived from a dict) to iterate over its keys:
for key in data:
Here's my version of your code, which should run on Python 2.7 as well as Python 3.
from __future__ import print_function
from collections import OrderedDict
def get_item(data, item_key):
    if isinstance(data, dict):
        if item_key in data:
            return data[item_key]
        for val in data.values():
            v = get_item(val, item_key)
            if v is not None:
                return v
data = OrderedDict([(u'aws:UrlInfoResponse',
OrderedDict([(u'#xmlns:aws', u'http://alexa.amazonaws.com/doc/2005-10-05/'), (u'aws:Response',
OrderedDict([(u'#xmlns:aws', u'http://awis.amazonaws.com/doc/2005-07-11'), (u'aws:OperationRequest',
OrderedDict([(u'aws:RequestId', u'4dbbf7ef-ae87-483b-5ff1-852c777be012')])), (u'aws:UrlInfoResult',
OrderedDict([(u'aws:Alexa',
OrderedDict([(u'aws:TrafficData',
OrderedDict([(u'aws:DataUrl',
OrderedDict([(u'#type', u'canonical'), ('#text', u'infowars.com/')])),
(u'aws:Rank', u'1252')]))]))])), (u'aws:ResponseStatus',
OrderedDict([(u'#xmlns:aws', u'http://alexa.amazonaws.com/doc/2005-10-05/'),
(u'aws:StatusCode', u'Success')]))]))]))])
item = get_item(data, 'aws:RequestId')
print(item)
output
4dbbf7ef-ae87-483b-5ff1-852c777be012
Note that this function returns None if the isinstance(data, dict) test fails, or if the for loop fails to return. It's generally a good idea to ensure that every possible return path in a recursive function has an explicit return statement, as that makes it clearer what's happening, but IMHO it's ok to leave those returns implicit in this fairly simple function.