Getting KeyError when parsing JSON in Python for following response

TL;DR:
I'm confused about how to parse the following JSON response and get the value of the "status" key under payload -> dynamicValue_GGG -> 12345.
Full question:
I get the following (sanitized) response when calling a REST API with the Python code below:
    response = requests.request("POST", url, data=payload, headers=headers).json()
    {
      "payload": {
        "name": "asdasdasdasd",
        "dynamicValue_GGG": {
          "12345": {
            "model": "asad",
            "status": "active",
            "subModel1": {
              "dynamicValue_67890": {
                "model": "qwerty",
                "status": "active"
              }
            },
            "subModel2": {
              "dynamicValue_33445": {
                "model": "gghjjj",
                "status": "active"
              }
            },
            "subModel3": {
              "dynamicValue_66778": {
                "model": "tyutyu",
                "status": "active"
              }
            }
          }
        },
        "date": "2016-02-04"
      },
      "design": "asdasdWWWsaasdasQ"
    }
If I do type(response['payload']), it gives me 'dict'.
Now I'm trying to parse the response above and fetch certain keys and values out of it. The problem is that I can't iterate through it by index and have to specify the key instead, but some of the keys in the response are dynamically generated. For instance, the keys "dynamicValue_GGG", "dynamicValue_66778" etc. are not static, unlike the "status" key.
I can parse it successfully by spelling the keys out:
    print(response['payload']['dynamicValue_GGG']['12345']['status'])
in which case I get the expected output, 'active'.
However, since I have no control over 'dynamicValue_GGG', it would only work if I could write something like this instead:
    print(response['payload'][0][0]['status'])
But the line above raises "KeyError: 0" when the Python code is executed.
Is there some way to combine keys and indexes in this case?

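Dictionaries in Python are accessed by key, not by position, so you cannot index into them with 0, 1, and so on (before Python 3.7 their iteration order was not even guaranteed). You'll have to iterate over all elements, potentially recursively, and test whether each one is the thing you're looking for. For example:
    def find_submodels(your_dict):
        for item_key, item_values in your_dict.items():
            # Only dicts can hold a 'status' entry or further nested models.
            if isinstance(item_values, dict):
                if 'status' in item_values:
                    print(item_key, item_values['status'])
                find_submodels(item_values)

    find_submodels(your_dict)
Which would output something like this (the order of the lines follows the dict's iteration order):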
12345 active
dynamicValue_67890 active
dynamicValue_66778 active
dynamicValue_33445 active
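
If you would rather collect the results than print them, the same recursive walk can be written as a generator. This is only a sketch: collect_statuses and the trimmed-down sample dict are made up for illustration, and it assumes every model of interest is a dict that has a 'status' key.

```python
def collect_statuses(node):
    """Yield (key, status) for every nested dict that has a 'status' entry."""
    for key, value in node.items():
        if isinstance(value, dict):
            if 'status' in value:
                yield key, value['status']
            # Keep descending: sub-models may be nested inside this dict.
            yield from collect_statuses(value)


# Hypothetical, trimmed-down version of the response in the question.
sample = {
    "payload": {
        "name": "asdasdasdasd",
        "dynamicValue_GGG": {
            "12345": {
                "model": "asad",
                "status": "active",
                "subModel1": {
                    "dynamicValue_67890": {"model": "qwerty", "status": "active"}
                },
            }
        },
    }
}

statuses = dict(collect_statuses(sample))
print(statuses)
```

Because the result is an ordinary dict keyed by the dynamic names, you can look up a status afterwards without knowing where in the tree it was found.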

Related

Automatically entering next JSON level using Python in a similar way to JQ in bash

I am trying to use Python to extract pricePerUnit from JSON. There are many entries, and this is just 2 of them -
    {
      "terms": {
        "OnDemand": {
          "7Y9ZZ3FXWPC86CZY": {
            "7Y9ZZ3FXWPC86CZY.JRTCKXETXF": {
              "offerTermCode": "JRTCKXETXF",
              "sku": "7Y9ZZ3FXWPC86CZY",
              "effectiveDate": "2020-11-01T00:00:00Z",
              "priceDimensions": {
                "7Y9ZZ3FXWPC86CZY.JRTCKXETXF.6YS6EN2CT7": {
                  "rateCode": "7Y9ZZ3FXWPC86CZY.JRTCKXETXF.6YS6EN2CT7",
                  "description": "Processed translation request in AWS GovCloud (US)",
                  "beginRange": "0",
                  "endRange": "Inf",
                  "unit": "Character",
                  "pricePerUnit": {
                    "USD": "0.0000150000"
                  },
                  "appliesTo": []
                }
              },
              "termAttributes": {}
            }
          },
          "CQNY8UFVUNQQYYV4": {
            "CQNY8UFVUNQQYYV4.JRTCKXETXF": {
              "offerTermCode": "JRTCKXETXF",
              "sku": "CQNY8UFVUNQQYYV4",
              "effectiveDate": "2020-11-01T00:00:00Z",
              "priceDimensions": {
                "CQNY8UFVUNQQYYV4.JRTCKXETXF.6YS6EN2CT7": {
                  "rateCode": "CQNY8UFVUNQQYYV4.JRTCKXETXF.6YS6EN2CT7",
                  "description": "$0.000015 per Character for TextTranslationJob:TextTranslationJob in EU (London)",
                  "beginRange": "0",
                  "endRange": "Inf",
                  "unit": "Character",
                  "pricePerUnit": {
                    "USD": "0.0000150000"
                  },
                  "appliesTo": []
                }
              },
              "termAttributes": {}
            }
          }
        }
      }
    }
The issue I run into is that the keys (in this sample 7Y9ZZ3FXWPC86CZY, CQNY8UFVUNQQYYV4.JRTCKXETXF, and CQNY8UFVUNQQYYV4.JRTCKXETXF.6YS6EN2CT7) are changing strings that I cannot just type out as I parse the dictionary.
I have Python code that works for the first level of these random keys:
    import json

    with open('index.json') as json_file:
        data = json.load(json_file)

    json_keys = list(data['terms']['OnDemand'].keys())
    # Get the region
    for i in json_keys:
        print(data['terms']['OnDemand'][i])
However, this is tedious, as I would need to run the same code three times to get the other keys like 7Y9ZZ3FXWPC86CZY.JRTCKXETXF and 7Y9ZZ3FXWPC86CZY.JRTCKXETXF.6YS6EN2CT7, since the strings change with each JSON entry.
Is there a way that I can just tell python to automatically enter the next level of the JSON object, without having to parse all keys, save them, and then iterate through them? Using JQ in bash I can do this quite easily with jq -r '.terms[][][]'.

If you are sure that there is exactly one key-value pair on each level, you can try the following:
    def descend(x, depth):
        for i in range(depth):
            x = next(iter(x.values()))
        return x

You can use dict.values() to iterate over the values of a dict. You can also use next(iter(dict.values())) to get the first (or only) value of a dict.
    for demand in data['terms']['OnDemand'].values():
        next_level = next(iter(demand.values()))
        print(next_level)
If you expect more than one child on the second level, you can simply nest the for loops:
    for demand in data['terms']['OnDemand'].values():
        for sub_demand in demand.values():
            print(sub_demand)
If you are interested in the keys too, you can use the dict.items() method to iterate over keys and values at the same time:
    for demand_key, demand in data['terms']['OnDemand'].items():
        for sub_demand_key, sub_demand in demand.items():
            print(demand_key, sub_demand_key, sub_demand)
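
To get closer to what jq -r '.terms[][][]' does for an arbitrary number of levels, the nested loops can be folded into a small recursive generator. The following is a sketch, not a drop-in replacement for jq: iter_level is a hypothetical helper name, the data dict is a made-up miniature of the pricing file, and it assumes every node above the requested depth is a dict.

```python
def iter_level(node, depth):
    """Yield every value found exactly `depth` levels below `node`."""
    if depth == 0:
        yield node
        return
    for child in node.values():
        yield from iter_level(child, depth - 1)


# Hypothetical miniature of the pricing file from the question.
data = {
    "terms": {
        "OnDemand": {
            "SKU1": {"SKU1.TERM": {"sku": "SKU1"}},
            "SKU2": {"SKU2.TERM": {"sku": "SKU2"}},
        }
    }
}

# Three levels below data['terms'], mirroring the three [] in the jq filter.
skus = [term["sku"] for term in iter_level(data["terms"], 3)]
print(skus)
```

Unlike descend, this visits every branch, so it still works when a level holds more than one key.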

KeyError in Python 3 when looping items

The KeyError occurs when one of the fields in my JSON has no value or doesn't exist at all. To work around it, I wrapped my lookup in a try/except so it can skip the error and continue. The problem with this approach is that when one field is missing, I can't print the rest of the JSON data because of that one missing item.
For example:
Scenario 1:
JSON format:
    {
      "data": {
        "name": "Bloomberg",
        "city": "NYC",
        "country": "USA"
      }
    }
Scenario 1 works fine, since all items and their values are available.
Output:
('name:Bloomberg', 'city:NYC', 'country:USA')
Scenario 2:
JSON format:
    {
      "data": {
        "name": "Bloomberg",
        "country": "USA"
      }
    }
In this scenario, the exception will capture that KeyError and skip it. However, I still need to print that data out regardless of that one missing field. I am looking for an output like this:
('name:Bloomberg', 'city field not available', 'country:USA')
The exception I used in the loop:
    try:
        myData = (myJSON['data']['name'], myJSON['data']['city'], myJSON['data']['country'])
        print(myData)
    except KeyError as e:
        pass

You can use the get method. It returns None when the requested key doesn't exist in the dictionary.
    my_json = {
        "data": {
            "name": "Bloomberg",
            "country": "USA"
        }
    }
    my_json_data = my_json.get('data')
    if my_json_data is not None:
        my_data = (my_json_data.get('name'), my_json_data.get('city'), my_json_data.get('country'))
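
To produce the exact output asked for in the question, each field can be checked individually and replaced with a placeholder when it is absent. This is a minimal sketch: the fields tuple and the message format are assumptions taken from the desired output shown in the question.

```python
my_json = {"data": {"name": "Bloomberg", "country": "USA"}}

# Fall back to an empty dict so a missing 'data' section does not raise.
record = my_json.get("data", {})
fields = ("name", "city", "country")

my_data = tuple(
    f"{field}:{record[field]}" if field in record else f"{field} field not available"
    for field in fields
)
print(my_data)
```

Checking with `in` rather than comparing against None keeps a field whose value is legitimately null distinct from one that is missing altogether.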

How to properly parse embedded value from JSON

I'm having trouble parsing this json for a particular key:
sample.json:
    {
      "AccessToken": {
        "ABCD": {
          "credential_type": "AccessToken",
          "secret": "abcdefghijklmnopqrstuxwxyz",
          "home_account_id": "4dafe035-ff2",
          "environment": "login.microsoftonline.com",
          "client_id": "f16f9f797",
          "target": "Directory.Read.All User.Read profile openid email",
          "realm": "56c621fa50f2",
          "token_type": "Bearer",
          "cached_at": "1599671717",
          "expires_on": "1599675316",
          "extended_expires_on": "1599675316"
        }
      },
      "Account": {
        "EFGH": {
          "home_account_id": "f977-41eb-8241613.56c62bbe-8598-4b85-9e51-1ca753fa50f2",
          "environment": "login.microsoftonline.com",
          "realm": "56c62bbe8598",
          "local_account_id": "4dafe0353-304e48a51613",
          "username": "foo@mail.com",
          "authority_type": "MS"
        }
      },
      "IdToken": {
        "WXYZ": {
          "credential_type": "IdToken",
          "secret": "abcdefghijklmnopqrstuxwxyz",
          "home_account_id": "4dafe035-ff2",
          "environment": "login.microsoftonline.com",
          "realm": "56c6a753fa50f2",
          "client_id": "f169aaf9f797"
        }
      }
    }
The goal is to parse and print the "secret" from the "IdToken" section.
abcdefghijklmnopqrstuxwxyz
So far, I can print the entire "IdToken" section, but I just want the secret.
    import json

    with open('sample.json') as json_file:
        data = json.load(json_file)
    print(data['IdToken'])
    print(data['IdToken'][0]['secret'])  # Tried this. Does not work

You need to do
    print(data['IdToken']['WXYZ']['secret'])
data['IdToken'][0] would take the first element if data['IdToken'] were a list. Here, however, data['IdToken'] is a dict, and to get an element from a dict you need to put its key inside the square brackets.
EDIT: (If you don't know the exact key, but only know the position)
The JSON specification doesn't guarantee the order of the members of an object. So, unless you are sure that the items will appear in a particular order, don't use this solution. That said, here is how to do it: print(data['IdToken'][list(data['IdToken'].keys())[0]]['secret']). On Python versions older than 3.7, also make sure to parse the JSON into an OrderedDict, because plain dicts did not guarantee order there. Check out this answer for that - https://stackoverflow.com/a/47111106/1421222.

If you don't know the key of the nested dict, you can unpack the keys of data['IdToken'] into a list and take the first one with [0]. That key then gives you the inner dict, from which you can read the secret.
Example
    print(data['IdToken'][[*(data['IdToken'].keys())][0]]['secret'])
This looks up the first key under IdToken for you, so it works even if you don't know it.
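
A shorter way to express the same idea is to skip the key entirely and take the first value. The sketch below uses a cut-down copy of the sample data and assumes the section contains at least one entry (next raises StopIteration on an empty dict).

```python
# Cut-down copy of the IdToken section from sample.json.
data = {
    "IdToken": {
        "WXYZ": {"credential_type": "IdToken", "secret": "abcdefghijklmnopqrstuxwxyz"}
    }
}

# Take the first (here: only) value without knowing its key.
first_entry = next(iter(data["IdToken"].values()))
secret = first_entry["secret"]
print(secret)
```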

python TypeError: string indices must be integers json

Can someone tell me what I am doing wrong? I am getting this error.
I went through the earlier posts about similar errors but couldn't understand them.
    import json
    import re
    import requests
    import subprocess

    res = requests.get('https://api.tempura1.com/api/1.0/recipes', auth=('12345','123'), headers={'App-Key': 'some key'})
    data = res.text
    extracted_recipes = []
    for recipe in data['recipes']:
        extracted_recipes.append({
            'name': recipe['name'],
            'status': recipe['status']
        })
    print(extracted_recipes)
TypeError: string indices must be integers
data contains the below
    {
      "recipes": {
        "47635": {
          "name": "Desitnation Search",
          "status": "SUCCESSFUL",
          "kitchen": "eu",
          "active": "YES",
          "created_at": 1501672231,
          "interval": 5,
          "use_legacy_notifications": false
        },
        "65568": {
          "name": "Validation",
          "status": "SUCCESSFUL",
          "kitchen": "us-west",
          "active": "YES",
          "created_at": 1522583593,
          "interval": 5,
          "use_legacy_notifications": false
        },
        "47437": {
          "name": "Gateday",
          "status": "SUCCESSFUL",
          "kitchen": "us-west",
          "active": "YES",
          "created_at": 1501411588,
          "interval": 10,
          "use_legacy_notifications": false
        }
      },
      "counts": {
        "total": 3,
        "limited": 3,
        "filtered": 3
      }
    }

You are not converting the text to JSON. Try
    data = json.loads(res.text)
or
    data = res.json()
Apart from that, you probably need to change the for loop to iterate over the values instead of the keys. Change it to something like the following:
    for recipe in data['recipes'].values():

There are two problems with your code, which you could have found out by yourself by doing a very minimal amount of debugging.
The first problem is that you don't parse the response contents from JSON to a native Python object. Here:
    data = res.text
data is a string (JSON formatted, but still a string). You need to parse it to turn it into its Python representation (in this case a dict). You can do it using the stdlib's json.loads() (general solution) or, since you're using python-requests, just by calling the Response.json() method:
    data = res.json()
Then you have this:
    for recipe in data['recipes']:
        # ...
Now that we have turned data into a proper dict, we can access the data['recipes'] subdict, but iterating directly over a dict actually iterates over the keys, not the values, so in your above for loop recipe will be a string ("47635", "65568" etc.). If you want to iterate over the values, you have to ask for it explicitly:
    for recipe in data['recipes'].values():
        # now `recipe` is the dict you expected
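
Both fixes can be tried without calling the API. In this sketch the raw string is a shortened stand-in for res.text (an assumption for illustration), so it runs without the requests library or network access.

```python
import json

# Stand-in for res.text: the body arrives as a JSON-formatted string.
raw = (
    '{"recipes": {'
    '"47635": {"name": "Desitnation Search", "status": "SUCCESSFUL"}, '
    '"65568": {"name": "Validation", "status": "SUCCESSFUL"}'
    '}}'
)

# Fix 1: parse the string into a dict before indexing into it.
data = json.loads(raw)

# Fix 2: iterate over the values, since the keys are just the recipe IDs.
extracted_recipes = [
    {"name": recipe["name"], "status": recipe["status"]}
    for recipe in data["recipes"].values()
]
print(extracted_recipes)
```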

Count a particular value from list in Python mongodb

I am experimenting with Python with MongoDB. I am a newbie with python. Here I get records from a collection and based on a particular value from that collection, I find the count of that record(from the 1st collection). But my problem is I cannot append this count into my list.
Here is the code:
    @gen.coroutine
    def post(self):
        Sid = self.body['Sid']
        alpha = []
        test = db.student.find({"Sid": Sid})
        count = yield test.count()
        print(count)
        for document in (yield test.to_list(length=1000)):
            cursor = db.attendance.find({"StudentId": document.get('_id')})
            check = yield cursor.count()
            print(check)
            alpha.append(document)
        self.write(bson.json_util.dumps({"data": alpha}))
The displayed output alpha comes from the first collection (student); the count value comes from the attendance collection.
When I try to extend the document with check, I end up with an error:
    alpha.append(document.extend(check))
I get the correct count value in the Python terminal, but I am unable to write it along with the output.
My output is like
{"data": [{"Sid": "1", "Student Name": "Alex","_id": {"$oid": "..."}}, {"Sid": "1", "Student Name": "Alex","_id": {"$oid": "..."}}]}
My output should be like
{"data": [{"Sid": "1", "Student Name": "Alex","_id": {"$oid": "..."},"count": "5"}, {"Sid": "1", "Student Name": "Alex","_id": {"$oid": "..."},"count": "3"}]}
Please guide me on how I can get my desired output.
Thank you.

A better approach to this is to use the MongoDB .aggregate() method from the Python driver you are using rather than repeated .find() and .count() operations:
    db.attendance.aggregate([
        {"$group": {
            "_id": "$StudentId",
            "name": {"$first": "$Student Name"},
            "count": {"$sum": 1}
        }}
    ])
Then it is already done for you.
Your current code looks up each student and returns a "count" of how many occurrences there are, and judging by your output you do that for every student.
With aggregation, the data is instead grouped per student, and each result returns the values from the document along with a "count".
This means you don't need to run a query for each student just to get the count. Instead you call the database once and have it count all the students you need in one result.
If you need more than one student but not all students, then you filter with query conditions:
    db.attendance.aggregate([
        {"$match": {"StudentId": {"$in": list_of_student_ids}}},
        {"$group": {
            "_id": "$StudentId",
            "name": {"$first": "$Student Name"},
            "count": {"$sum": 1}
        }}
    ])
And the selection along with the aggregation is done for you.
There is no need for looping code and lots of database requests. The .aggregate() method and pipeline will do it for you.
Read the core documentation on the Aggregation Pipeline.
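
To see what the $group stage with {"$sum": 1} computes, here is a plain-Python illustration that needs no database. The attendance list and its IDs are hypothetical sample records, and Counter only mimics the counting part of the pipeline; it is not how MongoDB runs it.

```python
from collections import Counter

# Hypothetical attendance records; only StudentId matters for the count.
attendance = [
    {"StudentId": "a1"},
    {"StudentId": "a1"},
    {"StudentId": "b2"},
]

# Rough equivalent of {"$group": {"_id": "$StudentId", "count": {"$sum": 1}}}
counts = Counter(record["StudentId"] for record in attendance)
grouped = [{"_id": student_id, "count": n} for student_id, n in counts.items()]
print(grouped)
```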

Add a count entry to the document dictionary, then append the dictionary:
    for document in (yield test.to_list(length=1000)):
        cursor = db.attendance.find({"StudentId": document.get('_id')})
        check = yield cursor.count()
        document['count'] = check
        alpha.append(document)
