I have a JSON / Python-dictionary object with some values in a numeric format, e.g. {'id': 'xxx-xxx-xxx', 'property_type': 930, ...}.
When I display the object on my website, I want to translate 930 into what it actually means, e.g. 'Public institutional building'.
As these objects come from an API call, there are quite a lot of values I need to translate, and not every key-value pair needs translating. I guess this is common when working with APIs; however, I cannot seem to find any best practice for it. I have lots of ideas on how to solve it, but none that I believe would be considered best practice.
Obviously, I would likely have to build up an additional dictionary that would look something like:
{'property_type': {120: 'Private property', 240: 'Vacation property' ...}, 'roof_type': {10: 'xxx', 20: 'xxx'}}
But what would then be the most convenient way to take any given dictionary, loop through the 'translation dictionary' and, if there is a match, translate the value?
In most cases, the object I'll be working with will be nested, so I also need to check the next level sometimes.
This function can be used to recursively search through a nested dictionary and return the first match it finds for a key.
my_dict = {'property_type': {120: 'Private property', 240: 'Vacation property'}, 'roof_type': {10: 'xxx', 20: 'xxx'}}
def recursive_search(dictionary, item):
    # Recursively check for a key in a dictionary
    if item in dictionary:  # Check if item is in current dict
        return dictionary[item]  # our property value
    for key, value in dictionary.items():
        if isinstance(value, dict):  # Check if we have a nested dict
            result = recursive_search(value, item)  # Check the next level
            if result is not None:  # Make sure we don't return early
                return result
print(recursive_search(my_dict, 120))
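The search above looks a code up in the translation dictionary. To translate a whole (possibly nested) API object in one pass, something like the following sketch may be closer to what the question asks for. The names TRANSLATIONS and translate are made up for illustration, and the sketch assumes the codes in the API object have the same type (here int) as the keys of the translation dictionary.

```python
# Hypothetical translation table: field name -> {code: human-readable text}.
TRANSLATIONS = {
    'property_type': {120: 'Private property',
                      240: 'Vacation property',
                      930: 'Public institutional building'},
    'roof_type': {10: 'xxx', 20: 'xxx'},
}


def translate(obj, translations):
    """Return a translated copy of obj; the input is left untouched."""
    if isinstance(obj, dict):
        result = {}
        for key, value in obj.items():
            if isinstance(value, (dict, list)):
                # Nested structure: translate the next level down.
                result[key] = translate(value, translations)
            else:
                # Look the value up under this key; keep it if there is no match.
                result[key] = translations.get(key, {}).get(value, value)
        return result
    if isinstance(obj, list):
        return [translate(item, translations) for item in obj]
    return obj


api_object = {'id': 'xxx-xxx-xxx', 'property_type': 930,
              'building': {'roof_type': 10, 'floors': 2}}
print(translate(api_object, TRANSLATIONS))
```

Unknown keys and unknown codes pass through unchanged, so the same function can be applied to any object coming back from the API.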
I have a variable, jdata, that holds data read from a JSON data file. It consists of many levels of dictionaries and lists of dictionaries. I have a search routine that returns a tuple containing path-like information to the element I want to access. I'm struggling to turn the tuple into a variable index. For example, the search routine may return ('name', 5, 'pic', 3). So I want to access jdata['name'][5]['pic'][3]. The number of levels down into the data can change for each search, so the tuple length is variable.
Addendum:
for everyone asking for code and what I've done:
I don't have code to share because I don't know how to do it and that's why I'm asking here. My first thought was to try and create the text for accessing the variable, for the example above,
"x = jdata['name'][5]['pic'][3]"
and then looking for a python way of executing that line of code. I figured there has to be a more elegant solution.
I thought the description of tuple to variable access was pretty straight forward, but here is an expanded version of my problem.
jdata = {'thing1': 1,
         'name': [
             {},
             {},
             {},
             {},
             {},
             {'thing11': 1,
              'pic': ['LocationOfPicA',
                      'LocationOfPicB',
                      'LocationOfPicC',
                      'LocationOfPicD',
                      'LocationOfPicE'],
              'thing12': 2},
             {},
             {}],
         'thing2': 2}
I searched for 'PicD' and my search code returns: ('name', 5, 'pic', 3)
Now I want to do some stuff, for example, accessing the value 'LocationOfPicD', copy the file located there to some other place, and update the value of 'LocationOfPicD' to the new value. All of this I can code. I just need to be able to turn the tuple into an accessible variable.
Edit: I was just reading about mutability in python. Instead of generating a path to an element in the dictionary, I think I can just assign that element value to a variable (x, for example) that gets passed back up the recursion chain of the initial search. From my understanding, I can change x and that also changes the element within the jdata variable. If that doesn't work, I can resort to using the eval() command on my generated text statement using the tuple as originally planned.
If I understand the problem correctly, you just need to avoid getting the lowest level item by value. So, you could do something like
indexes = ('name', 5, 'pic', 3)
x = jdata
for index in indexes[:-1]:
    x = x[index]
x[indexes[-1]] = <new_value_here>
Easy and quick recursive implementation.
def get_d(d, tup, ind=0):
    if ind == len(tup) - 1:  # last item: just return the value
        return d[tup[ind]]
    return get_d(d[tup[ind]], tup, ind + 1)  # else keep getting the sub-item
# input: a data structure (dict, list, or any indexable item) and a tuple of keys/indexes to follow in order
value = get_d(jdata, ('name', 5, 'pic', 3))
Note: this implementation is super basic and has no error handling. It's just here to give you an idea on how it could be done.
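For comparison, the same walk can be written without explicit recursion using functools.reduce and operator.getitem from the standard library. This is only a sketch: get_by_path and set_by_path are names chosen here for illustration, and like the versions above it has no error handling.

```python
from functools import reduce
from operator import getitem


def get_by_path(data, path):
    """Follow each key/index in path in turn and return the final value."""
    return reduce(getitem, path, data)


def set_by_path(data, path, value):
    """Replace the value at path by assigning into its parent container."""
    parent = get_by_path(data, path[:-1])
    parent[path[-1]] = value


# Small stand-in for the jdata structure from the question.
jdata = {'name': [{}, {'pic': ['LocationOfPicA', 'LocationOfPicB']}]}
path = ('name', 1, 'pic', 1)

print(get_by_path(jdata, path))           # LocationOfPicB
set_by_path(jdata, path, 'NewLocation')
print(get_by_path(jdata, path))           # NewLocation
```

Assigning through the parent container is what makes the update visible in jdata; rebinding a variable that merely holds the string would not change the original structure.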
I have data that is either being returned as a single dictionary, example:
{'key': 'RIDE', '3': 27.3531}
or as a list containing an unknown number of dictionaries (i.e. there could be up to 20 dictionaries, or 2 as shown), for example:
[{'key': 'GH', '3': 154.24}, {'key': 'RIDE', '3': 27.34}]
I'd like to write a piece of code that will iterate through the list of dictionaries and return all the key value pairs within each dictionary.
Any help would be appreciated, thank you!
To experiment with this we first have to write some code with a dummy data provider that either returns a dictionary or a list of dictionaries:
import random
def doSomething():
    if random.random() <= 0.5:
        return {'key': 'RIDE', '3': 27.3531}
    else:
        return [{'key': 'GH', '3': 154.24}, {'key': 'RIDE', '3': 27.34}]
#
Now we have exactly the situation you described: sometimes we receive a dictionary, sometimes a list of dictionaries.
Now: How to deal with this situation? It's not too difficult:
x = doSomething()
if isinstance(x, dict):
    x = [x]
for item in x:
    print(item)
Since we know we receive either a single dictionary or a list of dictionaries, we can test the type of the returned value. As it is much more convenient to always process a list, the example above first wraps a returned dictionary in a list before any further processing takes place.
So this is done by "correcting an error": the error here is that your data provider does not always return data of the same type. So whenever we receive a dictionary, we first pack it into a list. Only then do we begin the data processing we wanted in the first place. Here this processing just prints each dictionary, but of course you can iterate through the keys and values as you mentioned, or do any kind of data processing you feel appropriate.
But: it is not a good idea to have a function or subsystem of any kind that returns different types of data. As you see, this forces the caller to implement extra logic. Additionally, it complicates things if we want to write a specification (and we WANT to write a specification for more complex programs).
Example:
import typing
import random
def doSomething() -> typing.Union[dict, typing.List[dict]]:
    if random.random() <= 0.5:
        return {'key': 'RIDE', '3': 27.3531}
    else:
        return [{'key': 'GH', '3': 154.24}, {'key': 'RIDE', '3': 27.34}]
#
Here the specification is a formal one using the capabilities of typing. So this information about the returned type is specified in a formal way. Though this information is typically not evaluated by Python directly under normal circumstances this specification provides any programmer with the knowledge about returned types.
Of course we could have written such a specification in a non-formal way by writing some kind of text document as well, but that does not make any difference: In general having different types of return values complicates things unnecessarily. You can do that - there are situations where this makes sense - but in general it's better to avoid that situation as best as you can to simplify things.
For example using this approach:
import typing
import random
def doSomething() -> typing.List[dict]:
    # the old code we might choose not to change for some reason ...
    if random.random() <= 0.5:
        x = {'key': 'RIDE', '3': 27.3531}
    else:
        x = [{'key': 'GH', '3': 154.24}, {'key': 'RIDE', '3': 27.34}]
    # but we can compensate ...
    if isinstance(x, dict):
        x = [x]
    return x
#
Now the function always returns data of the same type, which is much more convenient for us: a) it simplifies processing for the caller and b) it makes it easier to learn what data is returned in the first place.
So having converted everything to return only data of a single type, our main routine simplifies to this:
for item in x:
    print(item)
Or if you want to display keys and values:
for item in x:
    for k, v in item.items():
        print(k, "->", v)
Or whatever kind of data processing you have in mind with the data returned.
Remember as a rule of thumb, in any kind of scripting or programming language:
Always provide data in a way that is easy for the caller to use, and so that the whole logic is easy for the programmer to understand. Make providing data in a good way a problem for the subroutine, not the caller. Simplify the caller's life as much as possible.
(Yes, you can decide to violate this principle and not follow it but if you do that then you really must have a very good reason. Then you really need to know what you're doing as then you have a very very special situation. Let me tell you from my 25 years of experience as a professional software developer: In 99.999% of all cases you will not have such a special situation. And I have the feeling that your situation does not fall into this category of such a special situation ;-) )
I have dicts that I need to clean, e.g.
dict = {
'sensor1': [list of numbers from sensor 1 pertaining to measurements on different days],
'sensor2': [list of numbers from from sensor 2 pertaining to measurements from different days],
etc. }
Some days have bad values, and I would like to generate a new dict with all the sensor values from those bad days erased, using an upper limit on the values of one of the keys:
def clean_high(dict_name, key_string, limit):
    '''clean all the keys to eliminate the bad values from the arrays'''
    new_dict = dict_name
    for key in new_dict:
        new_dict[key] = new_dict[key][new_dict[key_string] < limit]
    return new_dict
If I run all the lines separately in IPython, it works. The bad days are eliminated, and the good ones are kept. These are both type numpy.ndarray: new_dict[key] and new_dict[key][new_dict[key_string]<limit]
But, when I run clean_high(), I get the error:
TypeError: only integer arrays with one element can be converted to an index
What?
Inside of clean_high(), the type for new_dict[key] is a string, not an array.
Why would the type change? Is there a better way to modify my dictionary?
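Do not modify a dictionary while iterating over it. According to the Python documentation: "Iterating views while adding or deleting entries in the dictionary may raise a RuntimeError or fail to iterate over all entries". Note also that new_dict = dict_name does not copy anything: both names refer to the same dictionary, so as soon as the loop overwrites new_dict[key_string], the mask for the remaining keys is computed from the already filtered values. Instead, create a new dictionary and fill it while iterating over the old one.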
def clean_high(dict_name, key_string, limit):
    '''clean all the keys to eliminate the bad values from the arrays'''
    new_dict = {}
    for key in dict_name:
        new_dict[key] = dict_name[key][dict_name[key_string] < limit]
    return new_dict
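The version above relies on NumPy boolean indexing. If the values are plain Python lists rather than arrays (which is an assumption here, though the reported TypeError is consistent with it), the same idea can be sketched with itertools.compress. clean_high_lists is a hypothetical name used only for this example.

```python
from itertools import compress


def clean_high_lists(data, key_string, limit):
    """Return a new dict keeping only the days where data[key_string] < limit."""
    # Build the keep/drop mask once, from the untouched reference sensor.
    keep = [value < limit for value in data[key_string]]
    # Apply the same mask to every sensor; the input dict is not modified.
    return {key: list(compress(values, keep)) for key, values in data.items()}


readings = {'sensor1': [1, 50, 3], 'sensor2': [10, 20, 30]}
print(clean_high_lists(readings, 'sensor1', 10))
```

Computing the mask once before building the result avoids the situation where the reference values are filtered halfway through the loop.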
Let's say I have a dictionary that specifies some properties for a package:
d = {'from': 'Bob', 'to': 'Joe', 'item': 'book', 'weight': '3.5lbs'}
To check the validity of a package dictionary, it needs to have a 'from' and 'to' key, and any number of properties, but there must be at least one property. So a dictionary can have either 'item' or 'weight', both, but can't have neither. The property keys could be anything, not limited to 'item' or 'weight'.
How would I check dictionaries to make sure they're valid, as in having the 'to', 'from', and at least one other key?
The only method I can think of is obtaining d.keys(), removing the 'from' and 'to' keys, and checking if it's empty.
Is there a better way to go about doing this?
must = {"from", "to"}
print(len(d) > len(must) and all(key in d for key in must))
# True
This solution makes sure that your dictionary has more elements than the must set, and also that every element of must is present in the dictionary.
The advantage of this solution is that it is easily extensible. If you want to make sure that one more key exists in the dictionary, just add it to the must set and it will work fine. You don't have to alter the logic.
Edit
Apart from that, if you are using Python 2.7, you can do this more succinctly like this
print d.viewkeys() > {"from", "to"}
If you are using Python 3.x, you can simply write that as
print(d.keys() > {"from", "to"})
This works because d.viewkeys() (Python 2.7) and d.keys() (Python 3) return set-like objects, so we can use set comparison operators. > checks whether the left-hand side is a strict superset of the right-hand side. So, to satisfy the condition, the left-hand side must contain both from and to, plus at least one other key.
Quoting from the set.issuperset docs,
set > other
Test whether the set is a proper superset of other, that is, set >= other and set != other.
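As a small sketch of how this comparison could be wrapped up for Python 3 (the names REQUIRED_KEYS and is_valid_package are invented for this example):

```python
REQUIRED_KEYS = {"from", "to"}


def is_valid_package(package):
    """True if package has every required key plus at least one other key."""
    # dict.keys() is set-like in Python 3, so > is a proper-superset test.
    return package.keys() > REQUIRED_KEYS


print(is_valid_package({'from': 'Bob', 'to': 'Joe', 'item': 'book'}))  # True
print(is_valid_package({'from': 'Bob', 'to': 'Joe'}))                  # False
print(is_valid_package({'from': 'Bob', 'item': 'book'}))               # False
```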
If d has at least 3 keys, and it has both a 'from' and a 'to' key, you're golden.
My knowledge of Python isn't the greatest, but I imagine it goes something like: if len(d) > 2 and 'from' in d and 'to' in d
Use the following code:
def exists(var, d):
    # True if the key is present in the dictionary
    try:
        d[var]
        return True
    except KeyError:
        return False

def check(d):
    if not exists('from', d):
        return False
    if not exists('to', d):
        return False
    if not exists('item', d) and not exists('weight', d):
        return False
    return True

def main():
    d = {'from': 'Bob', 'to': 'Joe', 'item': 'book', 'weight': '3.5lbs'}
    mybool = check(d)
    print(mybool)

if __name__ == '__main__':
    main()
This doesn't address the problem OP has, but provides what I think to be a better practice solution. I realize there's already been an answer but I just spent a few minutes reading on best practices and thought I would share
Problems with using a dictionary:
Dictionaries are meant to work on a key-value basis. You inherently have 2 different kinds of keys, given that to and from are mandatory while item and weight are optional.
Dictionaries are meant to be logic-less. By setting certain requirements, you violate the principle of a dictionary, which is just meant to hold data. To make an instance you need to build some sort of logic constructor for the dictionary.
So why not just use a class? Proposed alternative:
class D(dict):  # inherits from dict
    def __init__(self, t, f, **attributes):  # 'from' is a keyword, hence t and f
        self['to'] = t
        self['from'] = f
        if len(attributes) > 0:
            self.update(attributes)
        else:
            raise Exception("Require attribute")

d = D('jim', 'bob', item='book')
print(d)  # {'to': 'jim', 'from': 'bob', 'item': 'book'}
print(d['to'])  # jim
print(d['item'])  # book
print(d['from'])  # bob
d = D('jim', 'bob')  # raises Exception: Require attribute
Obviously this falls apart if to and from are set asynchronously but I think the base idea still holds. Creating a class also gives you the verbosity to prevent to and from from being overwritten/deleted as well as limiting the minimum/maximum of attributes set.
I have a dictionary and I would like to get some values from it based on some keys. For example, I have a dictionary for users with their first name, last name, username, address, age and so on. Let's say, I only want to get one value (name) - either last name or first name or username but in descending priority like shown below:
(1) last name: if key exists, get value and stop checking. If not, move to next key.
(2) first name: if key exists, get value and stop checking. If not, move to next key.
(3) username: if key exists, get value or return null/empty
#my dict looks something like this
myDict = {'age': ['value'], 'address': ['value1, value2'],
'firstName': ['value'], 'lastName': ['']}
#List of keys I want to check in descending priority: lastName > firstName > userName
keySet = ['lastName', 'firstName', 'userName']
What I tried doing is to get all the possible values and put them into a list so I can retrieve the first element in the list. Obviously it didn't work out.
tempList = []
for key in keys:
    get_value = myDict.get(key)
    tempList.append(get_value)
Is there a better way to do this without using if else block?
One option if the number of keys is small is to use chained gets:
value = myDict.get('lastName', myDict.get('firstName', myDict.get('userName')))
But if you have keySet defined, this might be clearer:
value = None
for key in keySet:
    if key in myDict:
        value = myDict[key]
        break
The chained gets do not short-circuit, so all keys will be checked but only one used. If you have enough possible keys that the extra lookups matter, use the for loop.
Use .get(), which if the key is not found, returns None.
for i in keySet:
    temp = myDict.get(i)
    if temp is not None:
        print(temp)
        break
You can use myDict.has_key(keyname) as well to validate if the key exists.
Edit based on the comments -
This works only in Python 2. has_key() was removed in Python 3, so use the in operator there instead.
If we encapsulate that in a function we could use recursion and state clearly the purpose by naming the function properly (not sure if getAny is actually a good name):
def getAny(dic, keys, default=None):
    if not keys:  # no keys left to try
        return default
    return dic.get(keys[0], getAny(dic, keys[1:], default=default))
or even better, without recursion and more clear:
def getAny(dic, keys, default=None):
    for k in keys:
        if k in dic:
            return dic[k]
    return default
Then that could be used in a way similar to the dict.get method, like:
getAny(myDict, keySet)
and even have a default result in case of no keys found at all:
getAny(myDict, keySet, "not found")
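If a helper function feels like too much, the same lookup can also be sketched as a single expression with next() and a generator. Unlike the chained gets, the generator is lazy, so it stops at the first key that is present. The data below mirrors the question's example.

```python
myDict = {'age': ['value'], 'address': ['value1, value2'],
          'firstName': ['value'], 'lastName': ['']}
keySet = ['lastName', 'firstName', 'userName']

# Take the value of the first key in keySet that exists in myDict;
# the second argument to next() is the fallback when none of them exist.
value = next((myDict[key] for key in keySet if key in myDict), None)
print(value)
```

Note that this only tests whether a key exists: with the data above, lastName is present, so its value [''] is returned even though the string inside is empty.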