Proper way to iterate Python dictionary and build new one - python

I'm trying to find a cogent way to check to see whether certain keys exist in a dictionary and use those to build a new one.
Here is my example json:
"dmarc": {
"record": "v=DMARC1; p=none; rua=mailto:dmarc.spc#test.domain; adkim=s; aspf=s",
"valid": true,
"location": "test.domain",
"warnings": [
"DMARC record at root of test.domain has no effect"
],
"tags": {
"v": {
"value": "DMARC1",
"explicit": true
},
"p": {
"value": "none",
"explicit": true
},
"rua": {
"value": [
{
"scheme": "mailto",
"address": "ssc.dmarc.spc#canada.ca",
"size_limit": null
}
],
"explicit": true
},
"adkim": {
"value": "s",
"explicit": true
},
"aspf": {
"value": "s",
"explicit": true
},
"fo": {
"value": [
"0"
],
"explicit": false
},
"pct": {
"value": 100,
"explicit": false
},
"rf": {
"value": [
"afrf"
],
"explicit": false
},
"ri": {
"value": 86400,
"explicit": false
},
"sp": {
"value": "none",
"explicit": false
}
}
}
}
What I'm specifically looking to do is pull record, valid, location, tags-p, tags-sp, and tags-pct programmatically, instead of writing a bunch of try/excepts. For example, to get valid, I do:
try:
    res_dict['valid'] = jsonData['valid']
except KeyError:
    res_dict['valid'] = None
Now, this is easy enough to loop/repeat for top level key/values, but how would I accomplish this for the nested key/values?

No, you don't need a try-except block for this. You can check whether the key is present using:
if jsonData.get("valid") is not None:  # a plain truthiness test would skip a stored False
    res_dict["valid"] = jsonData.get("valid")
The .get("key") method returns the value for the given key, if present in the dictionary. If not, then it will return None (if get() is used with only one argument).
If you want it to return something else if it doesn't find the key then suppose:
jsonData.get("valid", "invalid_something_else")

One way of handling this is by taking advantage of the fact that the result of dict.keys can be treated as a set. See the following code.
my_keys = {'record', 'valid', 'location'} # you can add more here
new_dict = {}
available_keys = my_keys & jsonData.keys()
for key in available_keys:
    new_dict[key] = jsonData[key]
Above, we define the keys we are interested in within the my_keys set. We then get the available keys by taking the intersection of the keys in the dictionary and the keys we are interested in. This, in effect, only gets the keys that we are interested in that are also defined in the dictionary. Finally, we just iterate through the available_keys and build the new dictionary.
However, this does not set keys to None if they do not exist in the input dictionary. For that, it may be best to use the get method as mentioned in other answers, like so:
my_keys = ['record', 'valid', 'location'] # you can add more here
new_dict = {}
for key in my_keys:
    new_dict[key] = jsonData.get(key)
The get method allows us to attempt to get the value for a key in the dictionary. If that key is not defined, it returns None. You can also change the returned default by passing an extra argument to get, like so: new_dict[key] = jsonData.get(key, "some other default value")
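The loop with get can also be written as a dict comprehension. This is only a minimal sketch; it assumes jsonData is the already-parsed "dmarc" object from the question, and the sample values here are abbreviated stand-ins:

```python
# Abbreviated stand-in for the parsed "dmarc" object from the question.
jsonData = {"record": "v=DMARC1; p=none", "valid": True}

my_keys = ["record", "valid", "location"]

# Missing keys come back as None because dict.get defaults to None.
new_dict = {key: jsonData.get(key) for key in my_keys}
print(new_dict)
```

Every key in my_keys ends up in new_dict, whether or not it was present in the input.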

Simple: instead of dict['key'] use
dict.get('key', {}) for all nodes that are not leaves, and
dict.get('key', DEFAULT) for leaves, where DEFAULT is whatever you need.
If you omit DEFAULT and 'key' is absent, you get None. See the docs.
E.g.:
jsonData.get('record', "") # empty string if no 'record' key
jsonData.get('valid', False) # False if no 'valid' key
jsonData.get('location') # None if no 'location'
jsonData.get('tags', {}).get('p') # None if no 'tags' and/or no 'p'
jsonData.get('tags', {}).get('p', {}) # {} if no 'tags' and/or no 'p'
jsonData.get('tags', {}).get('p', {}).get('explicit', False) # and so on
The above presumes that you don't traverse lists (JSON arrays). If you do, you can still use
dict.get('key', [])
but if you have to dive deeper from there, you will probably have to loop over list items.
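The chained get calls can be wrapped in a small helper so that each wanted value is described by a path of keys. This is a sketch under assumptions, not a library function: the names dig and wanted are made up for illustration, the sample data is an abbreviated version of the question's JSON, and the choice of the "value" leaf for the tags is a guess at what the asker wants.

```python
def dig(data, path, default=None):
    """Follow a sequence of keys through nested dicts; return default on any miss."""
    node = data
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node

# Abbreviated stand-in for the parsed "dmarc" object from the question.
jsonData = {
    "record": "v=DMARC1; p=none",
    "valid": True,
    "tags": {
        "p": {"value": "none", "explicit": True},
        "pct": {"value": 100, "explicit": False},
    },
}

# Hypothetical mapping: output name -> path of keys to follow.
wanted = {
    "record": ("record",),
    "valid": ("valid",),
    "location": ("location",),
    "tags-p": ("tags", "p", "value"),
    "tags-sp": ("tags", "sp", "value"),
    "tags-pct": ("tags", "pct", "value"),
}

res_dict = {name: dig(jsonData, path) for name, path in wanted.items()}
print(res_dict)
```

Unlike a plain chain of get calls, the isinstance check also copes with an intermediate value that is present but is not a dict (for example a null in the JSON).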

Related

Save values from POST request of a list of dicts

I am trying to expose an API (if that's the correct way to say it). I am using Quart, a Python library modeled on Flask, and this is what my code looks like:
async def capture_post_request(request_json):
    for item in request_json:
        callbackidd = item['callbackid']
        print(callbackidd)

@app.route('/start_work/', methods=['POST'])
async def start_work():
    content_type = request.headers.get('content-type')
    if content_type == 'application/json':
        request_json = await request.get_json()
        loop = asyncio.get_event_loop()
        loop.create_task(capture_post_request(request_json))
        body = "Async Job Started"
        return body
    else:
        return 'Content-Type not supported!'
My schema looks like this:
[
{
"callbackid": "dd",
"itemid": "234r",
"input": [
{
"type": "thistype",
"uri": "www.uri.com"
}
],
"destination": {
"type": "thattype",
"uri": "www.urino2.com"
}
},
{
"statusCode": "202"
}
]
So far what I am getting is this error:
line 11, in capture_post_request
callbackidd = item['callbackid']
KeyError: 'callbackid'
I've tried so many stackoverflow posts to see how to iterate through my list of dicts but nothing worked. At one point in my start_work function I was using the get_data(as_text=True) method but still no results. In fact with the last method (or attr) I got:
TypeError: string indices must be integers
Any help on how to access those values is greatly appreciated. Cheers.
Your schema indicates there are two items in the request_json. The first indeed has the callbackid, the 2nd only has statusCode.
Debugging this should be easy:
async def capture_post_request(request_json):
    for item in request_json:
        print(item)
        callbackidd = item.get('callbackid')
        print(callbackidd)  # will be None in case of the 2nd 'item'
This will print two dicts:
{
"callbackid": "dd",
"itemid": "234r",
"input": [
{
"type": "thistype",
"uri": "www.uri.com"
}
],
"destination": {
"type": "thattype",
"uri": "www.urino2.com"
}
}
And the 2nd, the cause of your KeyError:
{
"statusCode": "202"
}
I included the 'fix' of sorts already:
callbackidd = item.get('callbackid')
This will default to None if the key isn't in the dict.
Hopefully this will get you further!
Edit
How to work with only the dict containing your key? There are two options.
First, using filter. Something like this:
def has_callbackid(dict_to_test):
    return 'callbackid' in dict_to_test
list_with_only_list_callbackid_items = list(filter(has_callbackid, request_json))
# Still a list at this point! With dicts which have the `callbackid` key
filter takes two arguments:
A function that is called on each value and returns True if the value should be kept.
The iterable you want to filter.
You could also use a lambda function; it's a bit evil, but it serves the purpose just as well:
list_with_only_list_callbackid_items = list(filter(lambda x: 'callbackid' in x, request_json))
# Still a list at this point! With dict(s) which have the `callbackid` key
Option 2, simply loop over the result and only grab the one you want to use.
found_item = None  # default
for item in request_json:
    if 'callbackid' in item:
        found_item = item
        break  # found what we're looking for, stop now
# Do stuff with the found_item from this point.
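The TypeError mentioned in the question most likely comes from iterating over the raw body text instead of the parsed JSON. The sketch below reproduces both situations with the standard json module only; raw_body is an assumed, shortened stand-in for the posted payload, and no Quart code is involved.

```python
import json

# Raw request body as text, shaped like the schema in the question (shortened).
raw_body = '[{"callbackid": "dd", "itemid": "234r"}, {"statusCode": "202"}]'

# Iterating over the raw string yields single characters, and
# indexing a character with a string key raises TypeError.
error_message = None
try:
    for item in raw_body:
        item["callbackid"]
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Parsing first gives a list of dicts, which can then be filtered.
request_json = json.loads(raw_body)
callback_ids = [item["callbackid"] for item in request_json if "callbackid" in item]
print(callback_ids)
```

The list comprehension is equivalent to the filter call above, but collects the callbackid values directly.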

Python list of dictionaries - access keys

I need to change name of the keys in dictionary using item.replace(" ", "_").lower()
How could I access these keys?
{
"environment": [
{
"Branch Branching": "97/97(100%)",
"Test Status": "TC39",
},
{
"Branch Branching": "36/36(100%)",
"Test Status": "TC29",
}
],
}
One way is to use:
dictionary[new_key] = dictionary.pop(old_key)
In your example:
env = {
"environment": [
{
"Branch Coverage": "97/97(100%)",
"Test Environment": "REGISTERHANDLING",
"Test Configuration": "TC39",
},
{
"Branch Coverage": "36/36(100%)",
"Test Environment": "PRA",
"Test Configuration": "TC29",
}
],
}
# Looping over each index in the env['environment'] list,
# this way we can edit the original dictionary.
# Note that enumerate returns a tuple of values (idx, val)
# And _ is commonly used to demonstrate that we will not be using val, only the index.
for index, _ in enumerate(env['environment']):
    # For each key, we want to create a new key and delete the old one.
    # Iterate over a copy of the keys: changing a dict's keys while iterating over it is unsafe and can raise RuntimeError.
    for key in list(env['environment'][index].keys()):
        # Calculate the new key
        new_key = key.replace(" ", "_").lower()
        # .pop deletes the old key and returns its value; the assignment on the left-hand side stores that value under the new key.
        env['environment'][index][new_key] = env['environment'][index].pop(key)
This question was already solved before, if you'd like to explore other answers, click here.
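An alternative to renaming keys in place is to build new dictionaries with a comprehension, which avoids modifying a dict while looping over it. This is a sketch only; normalize_key is a hypothetical helper name and the data is a shortened version of the example.

```python
env = {
    "environment": [
        {"Branch Coverage": "97/97(100%)", "Test Configuration": "TC39"},
        {"Branch Coverage": "36/36(100%)", "Test Configuration": "TC29"},
    ],
}

def normalize_key(key):
    """Lower-case a key and replace spaces with underscores."""
    return key.replace(" ", "_").lower()

# Replace each entry with a new dict that carries the renamed keys.
env["environment"] = [
    {normalize_key(key): value for key, value in entry.items()}
    for entry in env["environment"]
]
print(env)
```

The trade-off is that the inner dicts are new objects, so any other reference to the old dicts will not see the renamed keys.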

Automatically entering next JSON level using Python in a similar way to JQ in bash

I am trying to use Python to extract pricePerUnit from JSON. There are many entries, and this is just 2 of them -
{
"terms": {
"OnDemand": {
"7Y9ZZ3FXWPC86CZY": {
"7Y9ZZ3FXWPC86CZY.JRTCKXETXF": {
"offerTermCode": "JRTCKXETXF",
"sku": "7Y9ZZ3FXWPC86CZY",
"effectiveDate": "2020-11-01T00:00:00Z",
"priceDimensions": {
"7Y9ZZ3FXWPC86CZY.JRTCKXETXF.6YS6EN2CT7": {
"rateCode": "7Y9ZZ3FXWPC86CZY.JRTCKXETXF.6YS6EN2CT7",
"description": "Processed translation request in AWS GovCloud (US)",
"beginRange": "0",
"endRange": "Inf",
"unit": "Character",
"pricePerUnit": {
"USD": "0.0000150000"
},
"appliesTo": []
}
},
"termAttributes": {}
}
},
"CQNY8UFVUNQQYYV4": {
"CQNY8UFVUNQQYYV4.JRTCKXETXF": {
"offerTermCode": "JRTCKXETXF",
"sku": "CQNY8UFVUNQQYYV4",
"effectiveDate": "2020-11-01T00:00:00Z",
"priceDimensions": {
"CQNY8UFVUNQQYYV4.JRTCKXETXF.6YS6EN2CT7": {
"rateCode": "CQNY8UFVUNQQYYV4.JRTCKXETXF.6YS6EN2CT7",
"description": "$0.000015 per Character for TextTranslationJob:TextTranslationJob in EU (London)",
"beginRange": "0",
"endRange": "Inf",
"unit": "Character",
"pricePerUnit": {
"USD": "0.0000150000"
},
"appliesTo": []
}
},
"termAttributes": {}
}
}
}
}
}
The issue I run into is that the keys (in this sample, 7Y9ZZ3FXWPC86CZY, CQNY8UFVUNQQYYV4.JRTCKXETXF, and CQNY8UFVUNQQYYV4.JRTCKXETXF.6YS6EN2CT7) are strings that change from entry to entry, so I cannot just type them out as I am parsing the dictionary.
I have python code that works for the first level of these random keys -
import json

with open('index.json') as json_file:
    data = json.load(json_file)
json_keys = list(data['terms']['OnDemand'].keys())
# Get the region
for i in json_keys:
    print(data['terms']['OnDemand'][i])
However, this is tedious, as I would need to run the same code three times to get the other keys like 7Y9ZZ3FXWPC86CZY.JRTCKXETXF and 7Y9ZZ3FXWPC86CZY.JRTCKXETXF.6YS6EN2CT7, since the string changes with each JSON entry.
Is there a way that I can just tell python to automatically enter the next level of the JSON object, without having to parse all keys, save them, and then iterate through them? Using JQ in bash I can do this quite easily with jq -r '.terms[][][]'.
If you are really sure that there is exactly one key-value pair on each level, you can try the following:
def descend(x, depth):
    for i in range(depth):
        x = next(iter(x.values()))
    return x
You can use dict.values() to iterate over the values of a dict. You can also use next(iter(dict.values())) to get a first (only) element of a dict.
for demand in data['terms']['OnDemand'].values():
    next_level = next(iter(demand.values()))
    print(next_level)
If you expect the second level to have more or fewer than one child, you can just nest the for loops:
for demand in data['terms']['OnDemand'].values():
    for sub_demand in demand.values():
        print(sub_demand)
If you are interested in the keys too, you can use the dict.items() method to iterate over dict keys and values at the same time:
for demand_key, demand in data['terms']['OnDemand'].items():
    for sub_demand_key, sub_demand in demand.items():
        print(demand_key, sub_demand_key, sub_demand)
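If the depth of the changing keys is not fixed, a recursive search for the wanted key is another option. The sketch below is an assumption-laden illustration: find_key is a hypothetical helper name, and data is a trimmed copy of the sample from the question.

```python
def find_key(node, target):
    """Yield every value stored under `target` at any depth of nested dicts and lists."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == target:
                yield value
            else:
                yield from find_key(value, target)
    elif isinstance(node, list):
        for element in node:
            yield from find_key(element, target)

# Trimmed copy of the sample JSON from the question.
data = {
    "terms": {
        "OnDemand": {
            "7Y9ZZ3FXWPC86CZY": {
                "7Y9ZZ3FXWPC86CZY.JRTCKXETXF": {
                    "priceDimensions": {
                        "7Y9ZZ3FXWPC86CZY.JRTCKXETXF.6YS6EN2CT7": {
                            "unit": "Character",
                            "pricePerUnit": {"USD": "0.0000150000"},
                            "appliesTo": [],
                        }
                    }
                }
            },
            "CQNY8UFVUNQQYYV4": {
                "CQNY8UFVUNQQYYV4.JRTCKXETXF": {
                    "priceDimensions": {
                        "CQNY8UFVUNQQYYV4.JRTCKXETXF.6YS6EN2CT7": {
                            "unit": "Character",
                            "pricePerUnit": {"USD": "0.0000150000"},
                            "appliesTo": [],
                        }
                    }
                }
            },
        }
    }
}

prices = list(find_key(data, "pricePerUnit"))
print(prices)
```

This visits the whole structure, so it is slower than descending a known number of levels, but it does not depend on how many levels of generated keys there are.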

modifying json - deleting certain elements within a json structure using python

My json structure is as follows :
"AGENT": {
"pending": [],
"active": null,
"completed": [
**{
"result": {
"job1.AGENT": "SUCCESS",
"job2.AGENT": "SUCCESS"
},
"return_value": {
"job1.AGENT": "",
"job2.AGENT": ""
},
"visible": true,
"global": true,
"locale": [
"en_US"
],
"complete_time": "2018-01-24T17:44:33.484Z",
"persist": true,
"type": "script",
"script": "<script_name>.py",
"preset_status": "CONFIGURING",
"parameters": {},
"submit_time": "2018-01-24T17:44:26.747Z"
}**,
{
"result": {
..
},
"return_value": {
..
},
"visible": true,
"global": true,
"locale": [
"en_US"
],
"complete_time": "2018-04-2T17:44:40.049Z",
"submit_time": "2018-04-2T17:44:26.817Z"
}
I need to delete the entire result block based on complete_time, for example delete every block completed before 2018-04-03.
How can I achieve this in Python?
I have tried the following so far :
json_data = json.dumps(data)
item_dict = json.loads(json_data)
print(item_dict["AGENT"]["completed"][0]["complete_time"])
This prints the complete time. However my problem is "AGENT" is not a constant string. The string can vary. Also I will need to figure out the logic to remove the entire json block based on complete_time
OK, I assume that you were able to correctly load the JSON into a Python dictionary, let's call it item_dict, but that the top-level key may vary.
What you need now is to walk through that Python object and decode the complete_time field. Unfortunately, strptime with the format used below does not consume the trailing Z, so we will have to skip that last character.
Additionally, you should never modify a collection while iterating over it, so the bulletproof way is to store the indices to remove and delete them afterwards. The code could be:
import datetime

datelimit = datetime.datetime(2018, 4, 1)  # limit date for complete_time
to_remove = []
dateformat = '%Y-%m-%dT%H:%M:%S.%f'
for k, v in item_dict.items():  # enumerate top-level objects
    for i, block in enumerate(v['completed']):  # enumerate inner blocks
        complete_time = datetime.datetime.strptime(  # skip last char from complete_time
            block["complete_time"][:-1], dateformat)
        # print(k, i, complete_time)  # uncomment for tests
        if complete_time < datelimit:  # too old
            to_remove.append((k, i))  # store the index for later processing
for k, i in reversed(to_remove):  # start from the end to keep consistent indices
    del item_dict[k]["completed"][i]  # actual deletion
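Another way to avoid deleting while iterating is to rebuild each completed list with only the blocks to keep. The sketch below assumes Python 3.7 or later, where the %z directive of strptime also accepts a trailing Z; the cutoff of 2018-04-03 follows the question, and the second sample block is invented for illustration.

```python
from datetime import datetime, timezone

# Shortened stand-in for the loaded JSON; the second block is made up.
item_dict = {
    "AGENT": {
        "pending": [],
        "active": None,
        "completed": [
            {"complete_time": "2018-01-24T17:44:33.484Z", "type": "script"},
            {"complete_time": "2018-04-05T09:00:00.000Z", "type": "script"},
        ],
    }
}

cutoff = datetime(2018, 4, 3, tzinfo=timezone.utc)
date_format = "%Y-%m-%dT%H:%M:%S.%f%z"  # %z parses the trailing "Z" on Python 3.7+

# The top-level key varies, so iterate over the values instead of naming it.
for agent in item_dict.values():
    agent["completed"] = [
        block for block in agent["completed"]
        if datetime.strptime(block["complete_time"], date_format) >= cutoff
    ]

print(item_dict)
```

Because the parsed timestamps are timezone-aware, the cutoff has to be timezone-aware as well; comparing aware and naive datetimes raises a TypeError.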

Adding key to values in json using Python

This is the structure of my JSON:
"docs": [
{
"key": [
null,
null,
"some_name",
"12345567",
"test_name"
],
"value": {
"lat": "29.538208354844658",
"long": "71.98762580927113"
}
},
I want to add the keys to the key list. This is what I want the output to look like:
"docs": [
{
"key": [
"key1":null,
"key2":null,
"key3":"some_name",
"key4":"12345567",
"key5":"test_name"
],
"value": {
"lat": "29.538208354844658",
"long": "71.98762580927113"
}
},
What's a good way to do it? I tried this, but it doesn't work:
for item in data['docs']:
    item['test'] = data['docs'][3]['key'][0]
UPDATE 1
Based on the answer below, I have tweaked the code to this:
for number, item in enumerate(data['docs']):
    # pprint (item)
    # print item['key'][4]
    newdict["key1"] = item['key'][0]
    newdict["yek1"] = item['key'][1]
    newdict["key2"] = item['key'][2]
    newdict["yek2"] = item['key'][3]
    newdict["key3"] = item['key'][4]
    newdict["latitude"] = item['value']['lat']
    newdict["longitude"] = item['value']['long']
This creates the JSON I am looking for (and I can eliminate the list I had previously). How does one make this JSON persist outside the for loop? Outside the loop, only the last value from the dictionary is added otherwise.
In your first block, key is a list, but in your second block it's a dict. You need to completely replace the key item.
for doc in data['docs']:  # data['docs'] is a list, so handle each document in turn
    newdict = {}
    for number, item in enumerate(doc['key']):
        newdict['key%d' % (number + 1)] = item
    doc['key'] = newdict
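For the follow-up in UPDATE 1 (only the last document surviving the loop), one common approach is to create a fresh dictionary per document and append it to a list. This is a sketch under assumptions: records is a hypothetical name, the keys are numbered key1 to key5 rather than using the asker's key/yek naming, and the second document is invented sample data.

```python
data = {
    "docs": [
        {"key": [None, None, "some_name", "12345567", "test_name"],
         "value": {"lat": "29.538208354844658", "long": "71.98762580927113"}},
        # Second document is made up for illustration.
        {"key": [None, None, "other_name", "7654321", "other_test"],
         "value": {"lat": "30.0", "long": "72.0"}},
    ]
}

records = []
for item in data["docs"]:
    # A new dict per document, so earlier results are not overwritten.
    record = {"key%d" % (n + 1): value for n, value in enumerate(item["key"])}
    record["latitude"] = item["value"]["lat"]
    record["longitude"] = item["value"]["long"]
    records.append(record)

print(records)
```

Reusing a single newdict across iterations is what makes only the last document visible after the loop; creating the dict inside the loop keeps one result per document.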
