Dynamic dict value access with dot separated string - python

I'm using Python 3.5.1
What I am trying to do is pass in a dict, a dot-separated string representing the path to a key, and a default value. I want to check for the key's existence and, if it's not there, provide the default value. The problem is that the key I want to access could be nested in other dicts, and I will not know until run time. So what I want to do is something like this:
def replace_key(the_dict, dict_key, default_value):
    if dict_key not in the_dict:
        the_dict[dict_key] = default_value
    return the_dict

some_dict = {'top_property': {'first_nested': {'second_nested': 'the value'}}}
key_to_replace = 'top_property.first_nested.second_nested'
default_value = 'replaced'
#this would return as {'top_property': {'first_nested': {'second_nested': 'replaced'}}}
replace_key(some_dict, key_to_replace, default_value)
What I'm looking for is a way to do this without having to do a split on '.' in the string and iterating over the possible keys as this could get messy. I would rather not have to use a third party library. I feel like there is clean built in Pythonic way to do this but I just can't find it. I've dug through the docs but to no avail. If anyone has any suggestion as to how I could do this it would be very much appreciated. Thanks!

You could use recursion:
def replace_key(the_dict, dict_keys, default_value):
    if dict_keys[0] in the_dict:
        if len(dict_keys)==1:
            # last key of the path: overwrite the value
            the_dict[dict_keys[0]]=default_value
        else:
            # descend one level and continue with the remaining keys
            replace_key(the_dict[dict_keys[0]], dict_keys[1:],default_value)
    else:
        raise Exception("wrong key")

some_dict = {'top_property': {'first_nested': {'second_nested': 'the value'}}}
key_to_replace = 'top_property.first_nested.second_nested'
default_value = 'replaced'
#this modifies some_dict in place to {'top_property': {'first_nested': {'second_nested': 'replaced'}}}
replace_key(some_dict, key_to_replace.split("."), default_value)
This still uses split(), but maybe you consider it less messy?
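For the behaviour described in the question (keep an existing value, otherwise insert the default), an iterative walk is another option. This is only a minimal sketch and not a built-in: set_default_by_path is a hypothetical name, it still splits on '.', and it assumes every intermediate value along the path is a dict.

```python
def set_default_by_path(the_dict, dotted_path, default_value):
    """Insert default_value at dotted_path only if that key is missing.

    Hypothetical helper for illustration; assumes intermediate values are dicts.
    """
    *parents, last = dotted_path.split(".")
    node = the_dict
    for key in parents:
        # Create missing intermediate dicts on the way down.
        node = node.setdefault(key, {})
    # setdefault leaves an existing value untouched and returns it.
    return node.setdefault(last, default_value)


some_dict = {'top_property': {'first_nested': {'second_nested': 'the value'}}}
existing = set_default_by_path(some_dict, 'top_property.first_nested.second_nested', 'replaced')
added = set_default_by_path(some_dict, 'top_property.first_nested.missing', 'replaced')
```

Here the existing key keeps 'the value', while the missing key is created with the default.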

The easiest way that I've found to do this, namely to get a value using a dotted-string "key path", is to use replace and eval:
for key in pfields:
    if key.find('.') > 0:
        key = key.replace(".", "']['")
    try:
        data = str(eval(f"row['{key}']"))
    except KeyError:
        data = ''
And this is an example of the keys:
pfields = ['cpeid','metadata.LinkAccount','metadata.DeviceType','metadata.SoftwareVersion','mode_props.vfo.CR07.VIKPresence','mode_props.vfo.CR13.VIBHardVersion']
With this raw solution you don't need to install another library.
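The same lookup can also be done without eval, which avoids executing strings built from key names. The following is a sketch using only the standard library; get_by_path is a hypothetical name and the row contents are made-up sample data.

```python
from functools import reduce
import operator


def get_by_path(row, dotted_path, default=''):
    """Look up a dotted key path in nested dicts without eval.

    Hypothetical helper; returns `default` when any key along the path is missing.
    """
    try:
        return reduce(operator.getitem, dotted_path.split('.'), row)
    except (KeyError, TypeError):
        # KeyError: a key is missing; TypeError: an intermediate value
        # cannot be indexed with a string key.
        return default


row = {'cpeid': 'abc', 'metadata': {'DeviceType': 'router'}}  # made-up sample data
device = get_by_path(row, 'metadata.DeviceType')
missing = get_by_path(row, 'metadata.SoftwareVersion')
```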

Related

Is there a better way to parse a python dictionary?

I have a JSON dictionary and I need to check the values of the data and see if there is a match. I am using multiple if statements and the in operator like so:
"SomeData":
{
    "IsTrue": true,
    "ExtraData":
    {
        "MyID": "1223"
    }
}
json_data = MYAPI.get_json()
if 'SomeData' in json_data:
    some_data = json_data['SomeData']
    if 'IsTrue' in some_data:
        if some_data['IsTrue'] is True:
            if 'ExtraData' in some_data:
                if 'MyID' in some_data['ExtraData']:
                    if some_data['ExtraData']['MyID'] == "1234":
                        is_a_match = True
                        break
I know that in python3 the in operator should be used, but I am thinking there must be a better way than using multiple if statements like I am using.
Is there a better way to parse json data like this?
Yes, you can assume that the keys are present, but catch a KeyError if they aren't.
try:
    some_data = json_data['SomeData']
    is_a_match = (
        some_data['IsTrue'] is True and
        some_data['ExtraData']['MyID'] == "1234"
    )
except KeyError:
    is_a_match = False
This style is called easier to ask for forgiveness than permission (EAFP) and it's used a lot in Python. The alternative is look before you leap (LBYL), which you use in your solution.
I would suggest writing a path function to access values in your nested dictionary. Something along the lines of this (pseudocode):
def get_path_value(json_dict, path):
    """
    json_dict - dictionary with json keys and values
    path - list of key sequences, e.g. ['SomeData', 'IsTrue']
    you can make this a helper function and use an entry point that
    splits paths, e.g. "SomeData/IsTrue"
    """
    if len(path) == 1:
        # last tag, base case
        return json_dict[path[0]]
    else:
        return get_path_value(json_dict[path[0]], path[1:])
Add try/except if you want something other than a KeyError on a bad key, but this will let you navigate the dictionary a little more elegantly. Then you have things like:
if get_path_value(json_dict, ["SomeData", "IsTrue"]) == True and ...
You could even write a nice little class to wrap this all up, e.g. json["SomeData/IsTrue"] == True
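One way such a wrapper class might look is sketched below. PathDict and its get method are hypothetical names chosen for illustration, and the "/" separator follows the example above.

```python
class PathDict:
    """Hypothetical wrapper that lets nested dicts be indexed with 'a/b/c' paths."""

    def __init__(self, data):
        self._data = data

    def __getitem__(self, path):
        node = self._data
        for key in path.split("/"):
            node = node[key]  # raises KeyError if a key is missing
        return node

    def get(self, path, default=None):
        # Same lookup, but returns a default instead of raising.
        try:
            return self[path]
        except (KeyError, TypeError):
            return default


json_data = PathDict({"SomeData": {"IsTrue": True, "ExtraData": {"MyID": "1223"}}})
is_a_match = (
    json_data.get("SomeData/IsTrue") is True
    and json_data.get("SomeData/ExtraData/MyID") == "1234"
)
```

With the sample data from the question, MyID is "1223", so is_a_match ends up False.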
Best of luck,
Marie

Iterating a conversion of a string to a float in a scripting file when parsing an old file

I am using a new script (a) to extract information from an old script (b) to create a new file (c). I am looking for an equal sign in the old script (b) and want to modify the modification script (a) to make it automated.
The string is
lev1tolev2 'from=e119-b3331l1 mappars="simp:180" targ=enceladus.bi.def.3 km=0.6 lat=(-71.5,90) lon=(220,360)'
It is written in python 3.
The current output is fixed at
cam2map from=e119-b3331l1 to=rsmap-x map=enc.Ink.map pixres=mpp defaultrange=MAP res=300 minlat=-71.5 maxlat=90 minlon=220 maxlon=360
Currently, I have the code able to export a string of 0.6 for all of the iterations of lev1tolev2, but each one of these is going to be different.
cam2map = Call("cam2map")
cam2map.kwargs["from"] = old_lev1tolev2.kwargs["from"]
cam2map.kwargs["to"] = "rsmap-x"
cam2map.kwargs["map"] = "enc.Ink.map"
cam2map.kwargs["pixres"] = "mpp"
cam2map.kwargs["defaultrange"] = "MAP"
**cam2map.kwargs["res"] = float((old_lev1tolev2.kwargs["km"]))**
cam2map.kwargs["minlat"] = lat[0]
cam2map.kwargs["maxlat"] = lat[1]
cam2map.kwargs["minlon"] = lon[0]
cam2map.kwargs["maxlon"] = lon[1]
I have two questions: why is this not converting the string to a float, and why is this not iterating over all of the lev1tolev2 commands as everything else in the code does?
The full code is available here.
https://codeshare.io/G6drmk
The problem occurred at a different location in the code.
def escape_kw_value(value):
    if not isinstance(value, str):
        return value
    elif (value.startswith(('"', "'")) and value.endswith(('"', "'"))):
        return value
    # TODO escape the quote with \" or \'
    #if value.startswith(('"', "'")) or value.endswith(('"', "'")):
    #    return value
    if " " in value:
        value = '"{}"'.format(value)
    return value
It doesn't seem too clear to me, but from your syntax here:
**cam2map.kwargs["res"] = float((old_lev1tolev2.kwargs["km"]))**
I'd bet that cam2map.kwargs["res"] is a dict, and you thought that the ** syntax would convert every value in the dict. The float built-in should instead be called in a loop over the elements of the dict, or possibly in a comprehension, as here:
cam2map.kwargs["res"] = dict()
for key, value in old_lev1tolev2.kwargs["res"].items():
    cam2map.kwargs["res"][key] = float(value)
Edit:
OK, so it seems you took the string 'from=e119-b3331l1 mappars="simp:180" targ=enceladus.bi.def.3 km=0.6 lat=(-71.5,90) lon=(220,360)'
and then thought that calling yourstring.kwargs would give you a dict, but it won't. You can parse it into a dict first, either with some library or by splitting the string on spaces and then each piece on '=', like this:
output = dict()
for one_bit in lev_1_lev2.split(' '):
    key, value = one_bit.split('=')
    output[key] = value
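A slightly more defensive variant of that parsing is sketched below, using only the standard library. It assumes the argument string has the shape shown in the question; the variable names are illustrative and are not taken from the asker's script.

```python
import ast
import shlex

command_args = ('from=e119-b3331l1 mappars="simp:180" targ=enceladus.bi.def.3 '
                'km=0.6 lat=(-71.5,90) lon=(220,360)')

# shlex.split honours the quotes, so a quoted value containing spaces stays in one token.
# split("=", 1) splits only on the first '=', in case a value itself contains '='.
kwargs = dict(token.split("=", 1) for token in shlex.split(command_args))

res = float(kwargs["km"])              # the string '0.6' becomes a float
lat = ast.literal_eval(kwargs["lat"])  # the string '(-71.5,90)' becomes a tuple of numbers
```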

A "pythonic" strategy to check whether a key already exists in a dictionary

I often deal with heterogeneous datasets and I acquire them as dictionaries in my python routines. I usually face the problem that the key of the next entry I am going to add to the dictionary already exists.
I was wondering if there exists a more "pythonic" way to do the following task: check whether the key exists and create/update the corresponding pair key-item of my dictionary
myDict = dict()
for line in myDatasetFile:
    if int(line[-1]) in myDict.keys():
        myDict[int(line[-1])].append([line[2],float(line[3])])
    else:
        myDict[int(line[-1])] = [[line[2],float(line[3])]]
Use a defaultdict.
from collections import defaultdict
d = defaultdict(list)
# Every time you try to access the value of a key that isn't in the dict yet,
# d will call list with no arguments (producing an empty list),
# store the result as the new value, and give you that.
for line in myDatasetFile:
    d[int(line[-1])].append([line[2],float(line[3])])
Also, never use thing in d.keys(). In Python 2, that will create a list of keys and iterate through it one item at a time to find the key instead of using a hash-based lookup. In Python 3, it's not quite as horrible, but it's still redundant and still slower than the right way, which is thing in d.
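As a small self-contained illustration of both points, here is a sketch with made-up rows standing in for myDatasetFile (the real file format is not shown in the question, so the row layout is an assumption):

```python
from collections import defaultdict

# Made-up rows; the last field is the grouping key, as in the question.
rows = [
    ["a", "b", "x", "1.5", "7"],
    ["a", "b", "y", "2.0", "7"],
    ["a", "b", "z", "3.25", "9"],
]

grouped = defaultdict(list)
for line in rows:
    grouped[int(line[-1])].append([line[2], float(line[3])])

# Membership is tested directly on the dict, not on .keys().
# Unlike indexing, an `in` test does not create a missing key.
has_seven = 7 in grouped
has_eight = 8 in grouped
```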
That's what dict.setdefault is for.
setdefault(key[, default])
If key is in the dictionary, return its value. If not, insert key with a value of default and return default. default defaults to None.
example :
>>> d={}
>>> d.setdefault('a',[]).append([1,2])
>>> d
{'a': [[1, 2]]}
Python follows the idea that it's easier to ask for forgiveness than permission.
so the true Pythonic way would be:
try:
    myDict[int(line[-1])].append([line[2],float(line[3])])
except KeyError:
    myDict[int(line[-1])] = [[line[2],float(line[3])]]
for reference:
https://docs.python.org/2/glossary.html#term-eafp
https://stackoverflow.com/questions/6092992/why-is-it-easier-to-ask-forgiveness-than-permission-in-python-but-not-in-java
Try to catch the Exception when you get a KeyError
myDict = dict()
for line in myDatasetFile:
    try:
        myDict[int(line[-1])].append([line[2],float(line[3])])
    except KeyError:
        myDict[int(line[-1])] = [[line[2],float(line[3])]]
Or use:
myDict = dict()
for line in myDatasetFile:
    myDict.setdefault(int(line[-1]),[]).append([line[2],float(line[3])])

Check if Dictionary Values exist in a another Dictionary in Python

I am trying to compare values from 2 dictionaries in Python. I want to know if a value from one dictionary exists anywhere in another dictionary. If it exists I want to return True, else False. Here is what I have so far.
The code I have is close, but not working right.
I'm using VS2012 with Python Plugin
I'm passing both Dictionary items into the functions.
def NameExists(best_guess, line):
    return all (line in best_guess.values() #Getting Generator Exit Error here on values
                for value in line['full name'])
Also, I want to see if there are duplicates within best_guess itself.
def CheckDuplicates(best_guess, line):
    if len(set(best_guess.values())) != len(best_guess):
        return True
    else:
        return False
As the error is about a generator exit, I guess you are using Python 3.x. There, best_guess.values() returns a view object rather than a list (it is not a generator, so it is not consumed by a membership test).
Also, I guess the use of all is incorrect if you want to know whether any value exists (I'm not sure from which dictionary, though).
You can use something like the following, provided line is the second dictionary:
def NameExists(best_guess, line):
    vals = set(best_guess.values())
    return bool(set(line.values()).intersection(vals))
The syntax in NameExists seems wrong: you aren't using value, and in Python 3 best_guess.values() returns a view, so every in test scans all of the values. Converting it to a set once makes the lookups fast (you are using Python 3.x, aren't you?). I believe this is what you meant:
def NameExists(best_guess, line):
    vals = set(best_guess.values())
    return all(value in vals for value in line['full name'])
And the CheckDuplicates function can be written in a shorter way like this:
def CheckDuplicates(best_guess, line):
    return len(set(best_guess.values())) != len(best_guess)
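To make the "any value in common" reading concrete, here is a small sketch with made-up dictionaries. The snake_case function names are hypothetical, and the duplicate check assumes all values are hashable.

```python
def name_exists(best_guess, line):
    """True if any value of `line` also appears among the values of `best_guess`."""
    return not set(best_guess.values()).isdisjoint(line.values())


def has_duplicates(best_guess):
    """True if two keys of `best_guess` share the same value (values must be hashable)."""
    return len(set(best_guess.values())) != len(best_guess)


best_guess = {"id1": "Ann Lee", "id2": "Bob Ray", "id3": "Ann Lee"}  # made-up data
line = {"full name": "Bob Ray", "city": "Oslo"}

found = name_exists(best_guess, line)
duplicated = has_duplicates(best_guess)
```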

Define a dictionary name within a function

I am writing a function that will take a parameter and, among other things, make a dictionary. I would like the dictionary's name to be based on the name of the input file. Say the input file is input.xml; I would like the name of the dictionary to be input. Ideally I would use something like this:
def function(input):
    for x in y: list(get value)
    input[:4][key] = [value]
I am wondering if you know a better way to do this but what i am using now is an extra name in the function:
def function(input, dictname):
    for x in y: list(get value)
    dictname[key] = [value]
Right now I am simply adding a second name to my function, but I am wondering if there is a way to do this that requires fewer inputs.
Edit
I am including a longer version of the function I am using so you guys can get the context. This uses a BioPython module to iterate through an XML file of hits. I am using [temp] to hold the hits for each query and then making a dictionary for each set of query/hits. I would like this dictionary to be named the same as my input file.
from Bio.Blast import NCBIXML
def make_blast_dictionary(blastxml, maxhits, blastdict):
    temp=[]
    for record in NCBIXML.parse(open(blastxml)):
        for number, align in enumerate(record.alignments):
            if number == int(maxhits): break
            temp.append(str(align.title).split("|")[1])
        blastdict[str(record.query_id)] = [temp]
The thing about named variables is that you can call them whatever you like. It's best to name them specific to the context you're using them with.
It would be a better move to simply return a dictionary from your method, instead.
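A sketch of that suggestion: return the dictionary and, if a name derived from the file is still wanted, use it as a key in an ordinary dict. make_dictionary below is a hypothetical stand-in for the real parsing function, and the file names are examples.

```python
import os.path


def make_dictionary(filename):
    """Hypothetical stand-in for the real parsing; returns a new dict."""
    return {"source": filename}


# Keep the results in one dict keyed by the file name without its extension,
# rather than creating a variable whose name depends on the input.
results = {}
for filename in ["input.xml", "other.xml"]:
    name, _ext = os.path.splitext(os.path.basename(filename))
    results[name] = make_dictionary(filename)
```

The dictionary built from input.xml is then available as results["input"].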
The other respondents are legitimately concerned about why you would want to do this or whether you should do this. That being said, here is how you could do it:
import os.path
def function(filename):
    d = {'red': 10, 'blue': 20}
    name, ext = os.path.splitext(filename)
    # bind the dict to a module-level name taken from the file name
    globals()[name] = d

function('input.xml')
print(input)
def make_name(input):
    return input.split('.')[0]
def function(input):
    """Note: this function is incomplete and assumes additional parameters are in your original script
    """
    for x in y: list(get value)
    dict_name[key] = [value]
    return dict_name
def make_dict(input):
    dict_name = make_name(input)
    dict_name = {}
    dict_name = function(input)
    return dict_name
Is this what you need?
