How to properly parse embedded value from JSON - python

I'm having trouble parsing this JSON for a particular key:
sample.json:
{
  "AccessToken": {
    "ABCD": {
      "credential_type": "AccessToken",
      "secret": "abcdefghijklmnopqrstuxwxyz",
      "home_account_id": "4dafe035-ff2",
      "environment": "login.microsoftonline.com",
      "client_id": "f16f9f797",
      "target": "Directory.Read.All User.Read profile openid email",
      "realm": "56c621fa50f2",
      "token_type": "Bearer",
      "cached_at": "1599671717",
      "expires_on": "1599675316",
      "extended_expires_on": "1599675316"
    }
  },
  "Account": {
    "EFGH": {
      "home_account_id": "f977-41eb-8241613.56c62bbe-8598-4b85-9e51-1ca753fa50f2",
      "environment": "login.microsoftonline.com",
      "realm": "56c62bbe8598",
      "local_account_id": "4dafe0353-304e48a51613",
      "username": "foo#mail.com",
      "authority_type": "MS"
    }
  },
  "IdToken": {
    "WXYZ": {
      "credential_type": "IdToken",
      "secret": "abcdefghijklmnopqrstuxwxyz",
      "home_account_id": "4dafe035-ff2",
      "environment": "login.microsoftonline.com",
      "realm": "56c6a753fa50f2",
      "client_id": "f169aaf9f797"
    }
  }
}
The goal is to parse and print the "secret" from the "IdToken" section.
abcdefghijklmnopqrstuxwxyz
So far, I can print the entire "IdToken" section, but I just want the secret.
import json

with open('sample.json') as json_file:
    data = json.load(json_file)
    print(data['IdToken'])
    print(data['IdToken'][0]['secret'])  # Tried this. Does not work

You need to do
print(data['IdToken']['WXYZ']['secret'])
data['IdToken'][0] would take the first element of data['IdToken'] if it were a list. Here, however, data['IdToken'] is a dict, and to get an element from a dict you need to use the dict key inside the square brackets.
EDIT (if you don't know the exact key, but only its position):
JSON doesn't guarantee the order of elements in a map/dict, so unless you are sure that the items will appear in a particular order, don't use this solution. That said, here is how you do it: print(data['IdToken'][list(data['IdToken'].keys())[0]]['secret']). Also make sure to parse the JSON into an OrderedDict; see this answer for that: https://stackoverflow.com/a/47111106/1421222.
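A minimal sketch of the same idea, assuming the sample.json shown above; on Python 3.7+ plain dicts keep the file's key order when loaded with json.load, while on older versions you would need the OrderedDict approach from the linked answer:
import json

with open('sample.json') as json_file:
    data = json.load(json_file)

# grab the first (here, the only) entry under "IdToken" without hard-coding "WXYZ"
id_token = next(iter(data['IdToken'].values()))
print(id_token['secret'])  # abcdefghijklmnopqrstuxwxyz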

If you don't know the exact key under the nested IdToken dict, you can unpack its keys into a list and index that list with [0] to get the first key, then use that key to reach the secret.
Example
print(data['IdToken'][[*(data['IdToken'].keys())][0]]['secret'])
The above gets the first key under IdToken even when you don't know the key itself.

Related

How to search inside json file with redis?

Here I store a JSON object under a key in Redis. Later I want to search within the JSON stored in Redis. My search key will always be a JSON fragment like in the example below, and I want to match it inside the stored JSON.
Currently I am doing this by iterating and comparing in Python, but I want Redis to do it instead. How can I do that?
rd = redis.StrictRedis(host="localhost", port=6379, db=0)

if not rd.get("mykey"):
    with open(os.path.join(BASE_DIR, "my_file.josn")) as fl:
        data = json.load(fl)
    rd.set("mykey", json.dumps(data))
else:
    key_values = json.loads(rd.get("mykey"))

search_json_key = {
    "key": "value",
    "key2": {
        "key": "val"
    }
}

# here I am searching by iterating and comparing; instead I want to do it with Redis
for i in key_values['all_data']:
    if json.dumps(i) == json.dumps(search_json_key):
        # return
# mykey format looks like this:
{
  "all_data": [
    {
      "key": "value",
      "key2": {
        "key": "val"
      }
    },
    {
      "key": "value",
      "key2": {
        "key": "val"
      }
    },
    {
      "key": "value",
      "key2": {
        "key": "val"
      }
    }
  ]
}
To search with Redis and JSON you have two options: you can use the FT.CREATE command to create an index that you can then query with FT.SEARCH (that is the CLI syntax; from a Python script you can issue the same commands via rd.ft().create_index() / rd.ft().search()), or you can check out the Redis OM client for Python, which takes care of some of this for you.
Either way you'll have to do a bit of rework to take full advantage of Redis' search capabilities.
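A minimal sketch of the first option, assuming a Redis server with the RediSearch and RedisJSON modules (e.g. Redis Stack) and redis-py 4.x; the key prefix item:, the index name idx:items, and the field aliases are illustrative choices, not part of the original question:
import redis
from redis.commands.search.field import TextField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import Query

rd = redis.Redis(host="localhost", port=6379, db=0)

# store each element of all_data as its own JSON document instead of one big blob
rd.json().set("item:1", "$", {"key": "value", "key2": {"key": "val"}})

# index the fields we want to match on
rd.ft("idx:items").create_index(
    (
        TextField("$.key", as_name="key"),
        TextField("$.key2.key", as_name="key2_key"),
    ),
    definition=IndexDefinition(prefix=["item:"], index_type=IndexType.JSON),
)

# let Redis do the matching instead of iterating in Python
results = rd.ft("idx:items").search(Query("@key:value @key2_key:val"))
print(results.docs)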

Automatically entering next JSON level using Python in a similar way to JQ in bash

I am trying to use Python to extract pricePerUnit from JSON. There are many entries, and these are just two of them:
{
  "terms": {
    "OnDemand": {
      "7Y9ZZ3FXWPC86CZY": {
        "7Y9ZZ3FXWPC86CZY.JRTCKXETXF": {
          "offerTermCode": "JRTCKXETXF",
          "sku": "7Y9ZZ3FXWPC86CZY",
          "effectiveDate": "2020-11-01T00:00:00Z",
          "priceDimensions": {
            "7Y9ZZ3FXWPC86CZY.JRTCKXETXF.6YS6EN2CT7": {
              "rateCode": "7Y9ZZ3FXWPC86CZY.JRTCKXETXF.6YS6EN2CT7",
              "description": "Processed translation request in AWS GovCloud (US)",
              "beginRange": "0",
              "endRange": "Inf",
              "unit": "Character",
              "pricePerUnit": {
                "USD": "0.0000150000"
              },
              "appliesTo": []
            }
          },
          "termAttributes": {}
        }
      },
      "CQNY8UFVUNQQYYV4": {
        "CQNY8UFVUNQQYYV4.JRTCKXETXF": {
          "offerTermCode": "JRTCKXETXF",
          "sku": "CQNY8UFVUNQQYYV4",
          "effectiveDate": "2020-11-01T00:00:00Z",
          "priceDimensions": {
            "CQNY8UFVUNQQYYV4.JRTCKXETXF.6YS6EN2CT7": {
              "rateCode": "CQNY8UFVUNQQYYV4.JRTCKXETXF.6YS6EN2CT7",
              "description": "$0.000015 per Character for TextTranslationJob:TextTranslationJob in EU (London)",
              "beginRange": "0",
              "endRange": "Inf",
              "unit": "Character",
              "pricePerUnit": {
                "USD": "0.0000150000"
              },
              "appliesTo": []
            }
          },
          "termAttributes": {}
        }
      }
    }
  }
}
The issue I run into is that the keys, which in this sample are 7Y9ZZ3FXWPC86CZY, CQNY8UFVUNQQYYV4.JRTCKXETXF, and CQNY8UFVUNQQYYV4.JRTCKXETXF.6YS6EN2CT7, are changing strings that I cannot just type out while parsing the dictionary.
I have Python code that works for the first level of these random keys:
import json

with open('index.json') as json_file:
    data = json.load(json_file)
    json_keys = list(data['terms']['OnDemand'].keys())

# Get the region
for i in json_keys:
    print(data['terms']['OnDemand'][i])
However, this is tedious, as I would need to run the same code three times to get the other keys like 7Y9ZZ3FXWPC86CZY.JRTCKXETXF and 7Y9ZZ3FXWPC86CZY.JRTCKXETXF.6YS6EN2CT7, since the string changes with each JSON entry.
Is there a way that I can just tell python to automatically enter the next level of the JSON object, without having to parse all keys, save them, and then iterate through them? Using JQ in bash I can do this quite easily with jq -r '.terms[][][]'.
If you are really sure that there is exactly one key-value pair on each level, you can try the following:
def descend(x, depth):
    for i in range(depth):
        x = next(iter(x.values()))
    return x
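A hedged usage sketch, assuming the pricing JSON shown above has been loaded into data; the variable names and depth arguments are illustrative:
for sku_entry in data['terms']['OnDemand'].values():
    # skip the unknown offer-term key (e.g. "7Y9ZZ3FXWPC86CZY.JRTCKXETXF")
    offer_term = descend(sku_entry, 1)
    # skip the unknown rate-code key inside priceDimensions
    price_dimension = descend(offer_term['priceDimensions'], 1)
    print(price_dimension['pricePerUnit']['USD'])  # 0.0000150000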
You can use dict.values() to iterate over the values of a dict. You can also use next(iter(dict.values())) to get the first (and only) value of a dict.
for demand in data['terms']['OnDemand'].values():
    next_level = next(iter(demand.values()))
    print(next_level)
If you expect a number of children other than one at the second level, you can simply nest the for loops:
for demand in data['terms']['OnDemand'].values():
    for sub_demand in demand.values():
        print(sub_demand)
If you are interested in the keys too, you can use the dict.items() method to iterate over keys and values at the same time:
for demand_key, demand in data['terms']['OnDemand'].items():
    for sub_demand_key, sub_demand in demand.items():
        print(demand_key, sub_demand_key, sub_demand)
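Putting this together for the original goal, a sketch that drills through all three unknown levels down to pricePerUnit (again assuming the pricing JSON above is loaded into data):
for sku, offers in data['terms']['OnDemand'].items():
    for offer_code, offer in offers.items():
        for rate_code, dimension in offer['priceDimensions'].items():
            print(sku, offer_code, rate_code, dimension['pricePerUnit']['USD'])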

How do I iterate over a json object that has a list of dictionaries inside in Python?

How can I iterate over this to get MerchantRequestID, CheckoutRequestID, ResultCode, ResultDesc, and all the values in the 'Item' list, for instance the value for PhoneNumber?
I am getting this data from a callback URL after a user makes a payment.
"Body":{
"stkCallback":{
"MerchantRequestID":"19465-780693-1",
"CheckoutRequestID":"ws_CO_27072017154747416",
"ResultCode":0,
"ResultDesc":"The service request is processed successfully.",
"CallbackMetadata":{
"Item":[
{
"Name":"Amount",
"Value":1
},
{
"Name":"MpesaReceiptNumber",
"Value":"LGR7OWQX0R"
},
{
"Name":"Balance"
},
{
"Name":"TransactionDate",
"Value":20170727154800
},
{
"Name":"PhoneNumber",
"Value":254721566839
}
]
}
}
}
}```
I want to get MerchantRequestID, CheckoutRequestID, ResultCode, ResultDesc, and all the values in the 'Item' list, then store them in the db.
new_user = MpesaResponses(
    MerchantRequestID=data[0]['Body']['stkCallback']['MerchantRequestID'],
    CheckoutRequestID=data[0]['Body']['stkCallback']['CheckoutRequestID'],
    ResultCode=data[0]['Body']['stkCallback']['ResultCode'],
    ResultDesc=data[0]['Body']['stkCallback']['ResultDesc'],
    Amount=data[0]['Body']['stkCallback']['CallbackMetadata']['Item'][0]['value'],
    MpesaReceiptNumber=data[0]['Body']['stkCallback']['CallbackMetadata']['Item'][1]['value'],
    TransactionDate=data[0]['Body']['stkCallback']['CallbackMetadata']['Item'][3]['value'],
    PhoneNumber=data[0]['Body']['stkCallback']['CallbackMetadata']['Item'][4]['value'])
db.session.add(new_user)
db.session.commit()
This is what I had tried.
If the data is just a dictionary starting at "Body", and not a list containing a dictionary, then remove the first [0] from each variable and check your spelling of Value, like so:
MerchantRequestID=data['Body']['stkCallback']['MerchantRequestID']
CheckoutRequestID=data['Body']['stkCallback']['CheckoutRequestID']
ResultCode=data['Body']['stkCallback']['ResultCode']
ResultDesc=data['Body']['stkCallback']['ResultDesc']
Amount=data['Body']['stkCallback']['CallbackMetadata']['Item'][0]['Value']
MpesaReceiptNumber=data['Body']['stkCallback']['CallbackMetadata']['Item'][1]['Value']
TransactionDate=data['Body']['stkCallback']['CallbackMetadata']['Item'][3]['Value']
PhoneNumber=data['Body']['stkCallback']['CallbackMetadata']['Item'][4]['Value']
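To actually iterate over the 'Item' list rather than index it by position, a hedged sketch (the variable names are illustrative; note that the "Balance" entry has no "Value" key, which is why .get() is used):
stk = data['Body']['stkCallback']
# build a {Name: Value} lookup from the Item list so the code
# does not depend on fixed positions
values = {item['Name']: item.get('Value') for item in stk['CallbackMetadata']['Item']}

Amount = values.get('Amount')
MpesaReceiptNumber = values.get('MpesaReceiptNumber')
TransactionDate = values.get('TransactionDate')
PhoneNumber = values.get('PhoneNumber')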

How to index list of object in Elasticsearch?

A document format I ingest into ElasticSearch looks like this:
{
  'id': '514d4e9f-09e7-4f13-b6c9-a0aa9b4f37a0',
  'created': '2019-09-06 06:09:33.044433',
  'meta': {
    'userTags': [
      {
        'intensity': '1',
        'sentiment': '0.84',
        'keyword': 'train'
      },
      {
        'intensity': '1',
        'sentiment': '-0.76',
        'keyword': 'amtrak'
      }
    ]
  }
}
...ingested with python:
r = requests.put(itemUrl, auth = authObj, json = document, headers = headers)
The idea here is that Elasticsearch will treat keyword, intensity and sentiment as fields that can be queried later. However, on the Elasticsearch side I can see that this is not happening (I use Kibana as the search UI); instead, I see a field "meta.userTags" whose value is the whole list of objects.
How can I make ElasticSearch index elements within a list?
I used the document body you provided to create a new index 'testind' and type 'testTyp' using the Postman REST client:
POST http://localhost:9200/testind/testTyp
{
  "id": "514d4e9f-09e7-4f13-b6c9-a0aa9b4f37a0",
  "created": "2019-09-06 06:09:33.044433",
  "meta": {
    "userTags": [
      {
        "intensity": "1",
        "sentiment": "0.84",
        "keyword": "train"
      },
      {
        "intensity": "1",
        "sentiment": "-0.76",
        "keyword": "amtrak"
      }
    ]
  }
}
When I queried the index's mapping, this is what I got:
GET http://localhost:9200/testind/testTyp/_mapping
{
  "testind": {
    "mappings": {
      "testTyp": {
        "properties": {
          "created": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "id": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "meta": {
            "properties": {
              "userTags": {
                "properties": {
                  "intensity": {
                    "type": "text",
                    "fields": {
                      "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                      }
                    }
                  },
                  "keyword": {
                    "type": "text",
                    "fields": {
                      "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                      }
                    }
                  },
                  "sentiment": {
                    "type": "text",
                    "fields": {
                      "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
As you can see, the fields are part of the mapping and can be queried as needed later, so I don't see a problem here as long as the field names are not among the reserved words listed at https://www.elastic.co/guide/en/elasticsearch/reference/6.4/sql-syntax-reserved.html (you might want to avoid the term 'keyword', as it can be confusing later when writing search queries, since the field name and the type are both 'keyword'). Also note that this mapping was created via dynamic mapping (https://www.elastic.co/guide/en/elasticsearch/reference/6.3/dynamic-field-mapping.html#dynamic-field-mapping), so the data types were determined by Elasticsearch from the values you provided. This may not always be accurate, so to prevent it you can use the PUT _mapping API to define your own mapping for the index and then prevent new fields within a type from being added to the mapping.
You don't need a special mapping to index a list - every field can contain one or more values of the same type. See array datatype.
In the case of a list of objects, it can be indexed as either the object or the nested datatype. By default Elasticsearch uses the object datatype. In that case you can query meta.userTags.keyword and/or meta.userTags.sentiment. The result will always contain whole documents, with the values matched independently of each other, i.e. when searching for keyword=train and sentiment=-0.76 you WILL find the document with id=514d4e9f-09e7-4f13-b6c9-a0aa9b4f37a0.
If this is not what you want, you need to define a nested datatype mapping for the field userTags and use a nested query.
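A hedged sketch of that nested approach, using the same requests style as the ingest call above; it assumes Elasticsearch 7+ (no mapping types), and the index name testind_nested plus the chosen field types are illustrative, not taken from the question:
import requests

base = 'http://localhost:9200'

# create the index with userTags mapped as a nested datatype
mapping = {
    'mappings': {
        'properties': {
            'meta': {
                'properties': {
                    'userTags': {
                        'type': 'nested',
                        'properties': {
                            'keyword': {'type': 'keyword'},
                            'sentiment': {'type': 'float'},
                            'intensity': {'type': 'integer'}
                        }
                    }
                }
            }
        }
    }
}
requests.put(base + '/testind_nested', json=mapping)

# nested query: keyword AND sentiment must match inside the SAME tag object
query = {
    'query': {
        'nested': {
            'path': 'meta.userTags',
            'query': {
                'bool': {
                    'must': [
                        {'term': {'meta.userTags.keyword': 'amtrak'}},
                        {'term': {'meta.userTags.sentiment': -0.76}}
                    ]
                }
            }
        }
    }
}
print(requests.post(base + '/testind_nested/_search', json=query).json())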

Getting KeyError when parsing JSON in Python for following response

TL;DR:
Confused about how to parse the following JSON response and get the value of status (under 12345, under dynamicValue_GGG, under payload) in this case.
Full question:
I get the following as a (sanitized) response upon hitting a REST API via the Python code below:
response = requests.request("POST", url, data=payload,
                            headers=headers).json()
{
  "payload": {
    "name": "asdasdasdasd",
    "dynamicValue_GGG": {
      "12345": {
        "model": "asad",
        "status": "active",
        "subModel1": {
          "dynamicValue_67890": {
            "model": "qwerty",
            "status": "active"
          },
          "subModel2": {
            "dynamicValue_33445": {
              "model": "gghjjj",
              "status": "active"
            },
            "subModel3": {
              "dynamicValue_66778": {
                "model": "tyutyu",
                "status": "active"
              }
            }
          }
        },
        "date": "2016-02-04"
      },
      "design": "asdasdWWWsaasdasQ"
    }
  }
}
If I do a type(response['payload']), it gives me 'dict'.
Now, I'm trying to parse the response above and fetch certain keys and values out of it. The problem is that I'm not able to iterate through using an "index" and instead have to specify the "key", but the response contains certain keys that are dynamically generated. For instance, the keys "dynamicValue_GGG", "dynamicValue_66778", etc. are not static, unlike the "status" key.
I can successfully parse it by spelling out the path:
print response['payload']['dynamicValue_GGG']['12345']['status']
in which case I get the expected output: 'active'.
However, since I have no control over 'dynamicValue_GGG', it would only work if I could specify something like this instead:
print response['payload'][0][0]['status']
But the above line gives me "KeyError: 0" when the Python code is executed.
Is there some way in which I can use the power of both keys and indexes together in this case?
The order of values in a dictionary in Python is arbitrary, so you cannot rely on numeric indexing. You'll have to iterate over all elements, potentially recursively, and test whether each one is the thing you're looking for. For example:
def find_submodels(your_dict):
    for item_key, item_values in your_dict.items():
        if 'status' in item_values:
            print item_key, item_values['status']
        if type(item_values) == dict:
            find_submodels(item_values)

find_submodels(your_dict)
Which would output:
12345 active
dynamicValue_67890 active
dynamicValue_66778 active
dynamicValue_33445 active
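If you only need that one specific value, a hedged alternative sketch that combines key lookups with a "first matching key" step instead of recursing over everything (the variable names are illustrative, and it assumes the dynamic keys keep the 'dynamicValue_' prefix):
payload = response['payload']
# find the dynamic key by its prefix rather than by its position
ggg_key = next(k for k in payload if k.startswith('dynamicValue_'))
# take the first entry whose value is a dict (e.g. "12345"), skipping plain fields like "date"
first_id = next(k for k, v in payload[ggg_key].items() if isinstance(v, dict))
print(payload[ggg_key][first_id]['status'])  # active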
