Modifying json string going into list or dictionary? - python

I am new to Python and working on revising some existing code.
There is a JSON string coming into a Python function that looks like this:
{"criteria": {"modelName":"='ALL'", "modelName": "='NEW'","fields":"*"}}
Right now it appears a dictionary is being used to create a string:
crit = data['criteria']
for crit_key in crit:
    crit_val = crit[crit_key]
    sql += ' and ' + crit_key + crit_val
When the sql string is printed, only the last 'modelName' appears. It seems like a dictionary is being used as modelName is a key so the second modelName overwrites the first? I want the "sql" string in the end to contain both modelNames.

edited because of OP comments
Well, if you can't change the JSON and have to deal with it as it is, you can do something like this:
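import json

data = '{"criteria": {"modelName":"=\'ALL\'", "modelName": "=\'NEW\'","fields":"*"}}'

def dict_raise_on_duplicates(ordered_pairs):
    # Despite its name, this hook does not raise: it collects the values
    # of a repeated key into a list.
    d = {}
    duplicated = []
    for k, v in ordered_pairs:
        if k in d:
            if k not in duplicated:
                # first repeat of this key: turn the stored value into a list
                duplicated.append(k)
                d[k] = [d[k]] + [v]
            else:
                # later repeats: append to the existing list
                d[k].append(v)
        else:
            d[k] = v
    return d

print(json.loads(data, object_pairs_hook=dict_raise_on_duplicates))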
In this example, data is the JSON string with duplicated keys.
As described in the question "json.loads allows duplicate keys in a dictionary, overwriting the first value", json.loads keeps only the last value of a repeated key by default, so I use object_pairs_hook to force it to handle duplicate keys.
When a duplicated key is first spotted, the hook creates a new list containing the key's current value and the new value.
After that, it only appends new values to that list.
Output:
{u'criteria': {u'fields': u'*', u'modelName': [u"='ALL'", u"='NEW'"]}}
You will have to update your code anyway, but you now can handle it.
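Once duplicated keys arrive as lists, the loop that builds the sql string has to accept both a single value and a list of values. A minimal sketch, assuming the parsed criteria look like the output above; build_conditions is a hypothetical helper name, not part of the original code:

```python
def build_conditions(criteria):
    """Join every key/value pair into ' and key<value>' fragments."""
    parts = []
    for key, value in criteria.items():
        # A duplicated key holds a list; wrap single values so both cases
        # go through the same loop.
        values = value if isinstance(value, list) else [value]
        for item in values:
            parts.append(' and ' + key + item)
    return ''.join(parts)

# Shape assumed from the output shown above.
criteria = {'fields': '*', 'modelName': ["='ALL'", "='NEW'"]}
sql = build_conditions(criteria)
print(sql)
```

Both modelName conditions now end up in the string, in the order they appeared in the JSON.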

Related

Dynamically building a dictionary based on variables

I am trying to build a dictionary based on a larger input of text. From this input, I will create nested dictionaries which will need to be updated as the program runs. The structure ideally looks like this:
nodes = {}
node_name: {
    inc_name: inc_capacity,
    inc_name: inc_capacity,
    inc_name: inc_capacity,
}
Because of the nature of this input, I would like to use variables to dynamically create dictionary keys (or access them if they already exist). But I get KeyError if the key doesn't already exist. I assume I could do a try/except, but was wondering if there was a 'cleaner' way to do this in python. The next best solution I found is illustrated below:
test_dict = {}
inc_color = 'light blue'
inc_cap = 2
test_dict[f'{inc_color}'] = inc_cap
# test_dict returns >>> {'light blue': 2}
Try this code for large-scale input, for example input read from a file.
Let me give you an example of what I am aiming for; I think this is what you want.
File.txt
Person1: 115.5
Person2: 128.87
Person3: 827.43
Person4:'18.9
Numerical Validation Function
def is_number(a):
    try:
        float(a)
    except ValueError:
        return False
    else:
        return True
Code to build the dictionary from File.txt
with open("File.txt") as data:
    # split each line once at the first ':' into key and value
    pairs = (line.split(':', 1) for line in data if ':' in line)
    adict = {key: value.strip() for key, value in pairs if is_number(value)}
print(adict)
Output
{'Person1': '115.5', 'Person2': '128.87', 'Person3': '827.43'}
For more explanation, see the answer to "How to fix the errors in my code for making a dictionary from a file".
As already mentioned in the comments sections, you can use setdefault.
Here's how I would implement it.
Assume I want to add values to the dict node_name, with the keys and values held in two lists: keys in inc_names and values in inc_ccity. The code below loads them. Note that the key inc_name2 appears twice in the key list, so its second occurrence is not entered into the dictionary.
node_name = {}
inc_names = ['inc_name1', 'inc_name2', 'inc_name3', 'inc_name2']
inc_ccity = ['inc_capacity1', 'inc_capacity2', 'inc_capacity3', 'inc_capacity4']
for i, names in enumerate(inc_names):
    # setdefault stores the value only if the key is not present yet
    node = node_name.setdefault(names, inc_ccity[i])
    if node != inc_ccity[i]:
        print('Key=', names, 'already exists with value', node, '. New value=', inc_ccity[i], 'skipped')
print('\nThe final list of values in the dict node_name are :')
print(node_name)
The output of this will be:
Key= inc_name2 already exists with value inc_capacity2 . New value= inc_capacity4 skipped
The final list of values in the dict node_name are :
{'inc_name1': 'inc_capacity1', 'inc_name2': 'inc_capacity2', 'inc_name3': 'inc_capacity3'}
This way you can add values into a dictionary using variables.
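For the nested structure described in the question, collections.defaultdict is another way to avoid the KeyError without try/except. A minimal sketch; the records list and its values are made up purely for illustration:

```python
from collections import defaultdict

# Hypothetical parsed input: (node_name, inc_name, inc_capacity) triples.
records = [
    ('light blue', 'dark red', 2),
    ('light blue', 'pale green', 5),
    ('dark red', 'pale green', 1),
]

# Accessing a missing node_name creates an empty inner dict on the fly.
nodes = defaultdict(dict)
for node_name, inc_name, inc_capacity in records:
    nodes[node_name][inc_name] = inc_capacity

print(dict(nodes))
```

One thing to keep in mind with this design: merely reading nodes[some_key] also creates the key, so use `in` or .get() when checking for existence.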

Change List to Dict Python

Imagine that you have:
value = ["Name:","Mike", "Hobby:", "bakset", "voly"]
What is the simplest way to produce the following dictionary?
output : {"Name:" : ["Mike"], "Hobby:" : ["bakset", "voly"]}
This is in Python. Every value ending with ":" should become a key of the dictionary.
Dict comprehension:
>>> {v: (a := []) for v in value if v[-1] == ':' or a.append(v)}
{'Name:': ['Mike'], 'Hobby:': ['bakset', 'voly']}
Though I suspect you're not sharing the original data but already processed data, and that there's a better way to directly build from the original data. Possibly with an existing parser for the format.
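The comprehension above relies on the assignment expression (:=), which needs Python 3.8 or later, and on a side effect inside the condition. The same grouping can be sketched as a plain loop, assuming (as in the question) that keys are exactly the items ending with ":":

```python
value = ["Name:", "Mike", "Hobby:", "bakset", "voly"]

result = {}
current = None  # list belonging to the most recently seen key
for item in value:
    if item.endswith(':'):
        # start (or reuse) the list for this key
        current = result.setdefault(item, [])
    elif current is not None:
        current.append(item)

print(result)
```

Items that appear before the first key are silently dropped in this sketch.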

Parsing string through dictionary

ab='TS_Automation=Manual;TS_Method=Test;TS_Priority=1;TS_Tested_By=rjrjjn;TS_Written_By=SUN;TS_Review_done=No;TS_Regression=No;'
a={'TS_Automation':'Automated','TS_Tested_By':'qz9ghv','TS_Review_done':'yes'}
I have a string and a dictionary. I need to change the values in the string based on the keys of the dictionary. If a key from the string is not in the dictionary, its key-value pair needs to be removed from the string. For example, TS_Method is not in the dictionary, so it needs to be removed from the string ab.
Am I correct in understanding that you don't want to keep key-value pairs in the string if they don't occur in the dictionary? If that's the case, you can simply parse the dictionary to that particular string format. In your case it's simply in the form key=value; for each entry in the dictionary:
ab = ''
for key, value in a.items():
    ab += "{}={};".format(key, value)
You would have to create a new string.
I would do it with the find method, searching for each key/value pair of the dictionary.
If the pair being searched for exists in the string, I would append it to a new string:
s = ''
for val in a:
    word = val + '=' + a[val]
    wordLen = len(word)
    x = ab.find(word)
    if x != -1:
        # slice from the start of the match to its end
        s += ab[x:x + wordLen]
myvalue = ''
for k, v in a.items():
    myvalue = myvalue + "{}={};".format(k, v)
ab = myvalue
Just convert the dict to the desired string format and use it. There is no need to remove keys, since your requirement is to use the dict as it is, in string format.
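If the order of the entries in the original string matters, one option is to walk the string's own pairs and keep only those whose key is in the dictionary. A minimal sketch under that assumption, using the ab and a values from the question:

```python
ab = ('TS_Automation=Manual;TS_Method=Test;TS_Priority=1;TS_Tested_By=rjrjjn;'
      'TS_Written_By=SUN;TS_Review_done=No;TS_Regression=No;')
a = {'TS_Automation': 'Automated', 'TS_Tested_By': 'qz9ghv', 'TS_Review_done': 'yes'}

# Split 'key=value;' entries, skipping the empty piece after the last ';'.
pairs = [item.split('=', 1) for item in ab.split(';') if item]

# Keep the string's order, take the values from the dictionary.
result = ''.join('{}={};'.format(key, a[key]) for key, _ in pairs if key in a)
print(result)
```

Keys that exist only in the dictionary are not added here; the simpler loops above would include them.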

Pandas Dataframe to Dictionary with Multiple Keys

I am currently working with a dataframe consisting of a column of 13 letter strings ('13mer') paired with ID codes ('Accession') as such:
However, I would like to create a dictionary in which the Accession codes are the keys with values being the 13mers associated with the accession so that it looks as follows:
{'JO2176': ['IGY....', 'QLG...', 'ESS...', ...],
'CYO21709': ['IGY...', 'TVL...',.............],
...}
Which I've accomplished using this code:
Accession_13mers = {}
for group in grouped:
    Accession_13mers[group[0]] = []
    # items() replaces iteritems(), which newer pandas versions no longer have
    for item in group[1].items():
        Accession_13mers[group[0]].append(item[1])
However, now I would like to go back and iterate through the values for each Accession code, running a function I've defined as find_match_position(reference_sequence, 13mer), which finds the 13mer in a reference sequence and returns its position. I would then like to store the position as the value, with the 13mer as the key.
If anyone has any ideas for how I can expedite this process that would be extremely helpful.
Thanks,
Justin
I would suggest creating a new dictionary, whose values are another dictionary. Essentially a nested dictionary.
position_nmers = {}
for key in H1_Access_13mers:
    position_nmers[key] = {}  # one inner dictionary per accession
    for value in H1_Access_13mers[key]:
        # map each 13mer to its position in the reference sequence
        position_nmers[key][value] = find_match_position(reference_sequence, value)
To introspect the dictionary and make sure it's okay:
print(position_nmers)
You can iterate over the groupby more cleanly by unpacking:
d = {}
for key, s in df.groupby('Accession')['13mer']:
    d[key] = list(s)
This also makes it much clearer where you should put your function!
... However, I think that it might be better suited to an enumerate:
d2 = {}
for pos, val in enumerate(df['13mer']):
    d2[val] = pos
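The nested accession -> 13mer -> position mapping can also be written as a nested dict comprehension. In this sketch, find_match_position is a hypothetical stand-in based on str.find (the question does not show the real function), and the sequences are short placeholders rather than real data:

```python
def find_match_position(reference_sequence, mer):
    """Hypothetical stand-in: index of mer in the reference, or -1 if absent."""
    return reference_sequence.find(mer)

# Placeholder data with the same shape as the dictionary in the question.
reference_sequence = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
accession_13mers = {
    'ACC1': ['ABCDEFGHIJKLM', 'DEFGHIJKLMNOP'],
    'ACC2': ['NOPQRSTUVWXYZ'],
}

positions = {
    accession: {mer: find_match_position(reference_sequence, mer) for mer in mers}
    for accession, mers in accession_13mers.items()
}
print(positions)
```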

How can I edit/rename keys during json.load in python?

I have a JSON file (~3 GB) that I need to load into MongoDB. Quite a few of the JSON keys contain a . (dot), which causes the load into MongoDB to fail. I want to load the JSON file and edit the key names in the process, say by removing the dot. I am using the following Python code:
import json
from copy import deepcopy

def RemoveDotKey(dataPart):
    for key in dataPart.iterkeys():
        new_key = key.replace(".","")
        if new_key != key:
            newDataPart = deepcopy(dataPart)
            newDataPart[new_key] = newDataPart[key]
            del newDataPart[key]
            return newDataPart
    return dataPart

new_json = json.loads(data, object_hook=RemoveDotKey)
The object_hook called RemoveDotKey should iterate over all the keys and, if a key contains a dot, create a copy, remove the dot from the key, and return the copy. I created a copy of dataPart because I am not sure whether I can iterate over dataPart's keys and insert/delete key-value pairs at the same time.
There seems to be an error here: not all of the JSON keys with a dot in them are getting edited. I am not very sure how json.load works. I am also new to Python (I have been using it for less than a week).
You almost had it:
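import json

def remove_dot_key(obj):
    # list() takes a snapshot of the keys, so the dict can be modified
    # safely inside the loop (required on Python 3)
    for key in list(obj.keys()):
        new_key = key.replace(".","")
        if new_key != key:
            obj[new_key] = obj[key]
            del obj[key]
    return obj

new_json = json.loads(data, object_hook=remove_dot_key)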
You were returning a dictionary inside your loop, so you'd only modify one key. And you don't need to make a copy of the values, just rename the keys.
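A variant that sidesteps modifying the dictionary while iterating is to return a new dictionary from the hook. json.loads calls object_hook for every decoded JSON object, nested ones included, so a one-line comprehension is enough. A minimal sketch; the sample data string is made up for illustration:

```python
import json

def remove_dots(obj):
    """Return a copy of obj with every '.' removed from its keys."""
    return {key.replace('.', ''): value for key, value in obj.items()}

# Made-up sample with dotted keys at several nesting levels.
data = '{"a.b": 1, "nested": {"c.d": {"e.f": 2}}}'
new_json = json.loads(data, object_hook=remove_dots)
print(new_json)
```

If two keys differ only by dots (say "a.b" and "ab"), they collapse into one key and the later value wins, so it may be worth checking for that in real data.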
