Access JSON nested arrays in Python - python

I'm new to Python and haven't been able to find an answer to this, or tell whether it has already been asked. I'm calling an API and receiving data back as JSON. I want to strip out the parts I don't need and keep only certain values, but I can't reach them because the keys I want are nested inside arrays.
I can get as far as json.dumps(payload['output']['generic']), but I can't find any information online on how to access just the values inside it.
Apologies in advance if this question already exists.
{
"output": {
"generic": [
{
"response_type": "text",
"text": "hi"
}
],
"intents": [
{
"intent": "CollectionDate",
"confidence": 0.8478035449981689
}
],
"entities": [
{
"entity": "Payslip",
"location": [
19,
26
],
"value": "When is my collection date",
"confidence": 1
}
]
},
"context": {
"global": {
"system": {
"turn_count": 10
}
},
"skills": {
"main skill": {
"user_defined": {
"DemoContext": "Hi!"
},
"system": {}
}
}
}
}
To clarify:
I want to access the "text", "intent" and "confidence" values.
At the moment I'm printing the value posted and then the responses for the sections I want, like this:
print(x)
print(json.dumps(payload['output']['generic']))
print(json.dumps(payload['output']['intents']))

If the data is still a JSON string, use the following code to convert it to a dict first:
json_data = json.loads(yourData)
After that, the outermost key is "output", and its value is another dict, so just use json_data['output'] to access the content inside.
Some keys inside "output", like "generic", hold an array, as you can see from the [] brackets; in Python this becomes a list. Use json_data['output']['generic'][index] first to get an element of the list, then access its keys the same way you would with any dict.

The key here is that the traceback indicates a problem with indexing a list.
This is because a list (array) is a valid JSON type, and "generic" contains a list of length 1 with a dict inside!
>>> payload['output']['generic']['text']
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: list indices must be integers or slices, not str
>>> type(payload['output']['generic'])
<class 'list'>
>>> len(payload['output']['generic'])
1
>>> payload['output']['generic'][0]
{'response_type': 'text', 'text': 'hi'}
>>> type(payload['output']['generic'][0])
<class 'dict'>
>>> payload['output']['generic'][0]['text']
'hi'
>>>
So, given your expected input JSON format, you will need to know where to index to pull out each required data point.
There are a few packages (glom is one) that will help you deal with missing values in API-generated JSON.
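As a minimal sketch of the indexing described above, the snippet below pulls "text", "intent" and "confidence" out of a trimmed copy of the payload from the question. The names raw, payload and the first() helper are hypothetical, and it assumes you only want the first element of each list. The helper returns an empty dict when a list is missing or empty, so .get() yields None instead of raising.

```python
import json

# Trimmed copy of the payload from the question, as a JSON string.
raw = """
{
  "output": {
    "generic": [{"response_type": "text", "text": "hi"}],
    "intents": [{"intent": "CollectionDate", "confidence": 0.8478035449981689}]
  }
}
"""

payload = json.loads(raw)  # JSON text -> Python dict


def first(items):
    """Return the first element of a list, or {} if it is missing or empty."""
    return items[0] if items else {}


output = payload.get("output", {})
generic = first(output.get("generic"))  # list -> first dict
intent = first(output.get("intents"))

text = generic.get("text")
intent_name = intent.get("intent")
confidence = intent.get("confidence")

print(text, intent_name, confidence)
```

If a response can contain several intents, loop over output.get("intents", []) instead of taking only the first element.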

Related

SQLalchemy +Postgresql accessing Array of jsonb elements

I have a query that looks like this.
query = db.query(status_cte.c.finding_status_history)
The finding_status_history column is of type array when I check its .type. It's an array of jsonb objects; I can easily change it to json instead if that's easier, and I've also tested it that way.
[
{
"data": [
{
"status": "closed",
"created_at": "2023-01-27T18:05:27.579817",
"previous_status": "open"
},
{
"status": "open",
"created_at": "2023-01-27T18:05:28.694352",
"previous_status": "closed"
}
]
},
...
]
I'm trying to access the first dictionary nested inside data and read its status key.
I've tried to grab it using query = db.query(status_cte.c.finding_status_history[0]), but this returns a list of empty dictionaries like so:
[
{},
{},
{},
{},
{},
{},
{}
]
I'm not sure why that doesn't work, as my impression is that it should grab the first entry. I assume I need to access "data" somehow first, but I've also tried:
query = db.query(status_cte.c.finding_status_history.op('->>')('data'))
which gives me an error saying the operator jsonb[] ->> unknown doesn't exist. I've tried casting 'data' to String and I get the same error, but with jsonb[] ->> String, and so on.
Also, when looping through the items with for item in query.all(), I'm seeing that [0] results in (None,) and [1] results in
({
"status": "closed",
"created_at": "2023-01-27T18:05:27.579817",
"previous_status": "open"
},)
as a tuple...
The secret was that [0] is not the first element; [1] is, because PostgreSQL arrays are 1-indexed by default. Note also that [-1] doesn't give me the last element, so I also had to order my aggregated json objects.

Filter Nested array Python

I have output like this after using json.dumps(response.json(), indent=4):
{
"totalCount": 8,
"hasMore": false,
"firstIndex": 0,
"list": [
{
"id": "7d5bb8asdfasdfasfdasdfasdfasdf",
"name": "Corporate",
"domainType": "AAAAAAAA",
"description": "",
"createdBy": "admin",
"createDatetime": "2020/06/04 17:40:22",
"parentDomainId": "8b208asdfasdfasdfasdfasdfas",
"zoneCount": 2,
"subDomainCount": 1,
"administratorCount": 0,
"apCount": 0,
"zeroTouchStatus": true
},
Now when I try to filter it as follows
print(results['name']) or print(results['list'][0]['name'])
I keep getting this error message:
TypeError: string indices must be integers
This starts with a dict {}, then there are lists [] of dicts {} in here. Based on that it should work. I'd appreciate any guidance. Thank you.
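It is clear from the data you posted that results is still JSON text, not a Python dict. response.json() already returns the parsed data, so use its return value directly; json.loads() is only needed when what you have is a JSON string. Note also that "name" lives inside the items of "list", not at the top level:
results = response.json()
print(results['list'][0]['name'])
The problem is json.dumps(response.json()). That call returns a str, so you cannot get items from it with [] notation and string keys. Keep the dict for indexing and use json.dumps() only for display, or convert the string back to a dict with json.loads().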
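To make the difference concrete, here is a small sketch that needs no network call. The parsed dict is a hypothetical stand-in for what response.json() returns, trimmed to one item and two fields from the output in the question.

```python
import json

# Stand-in for the return value of response.json(): already-parsed Python data.
parsed = {
    "totalCount": 8,
    "list": [{"id": "7d5bb8asdfasdfasfdasdfasdfasdf", "name": "Corporate"}],
}

pretty = json.dumps(parsed, indent=4)  # a str, useful only for display
print(type(pretty).__name__)

try:
    pretty["list"]  # indexing a str with a string key
except TypeError as exc:
    print("TypeError:", exc)

# Index the dict itself ...
print(parsed["list"][0]["name"])

# ... or, if all you have is the string, parse it back first.
restored = json.loads(pretty)
names = [item["name"] for item in restored["list"]]
print(names)
```

The list comprehension at the end collects every "name" in "list", which is usually what "filtering" a response like this comes down to.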

Get json objects value with python

I have a problem with my python code. I have successfully merged multiple json files with python. The json code of each file is in a python dict. When I want to use that data it only shows me the object name not the value.
This is the code that adds the code of all json files inside a python dict:
result = []
for f in glob.glob("jsons/*.json"):
    with open(f, "rb") as infile:
        result.append(json.load(infile))
After that I would like to show all IDs of all the reports from all json files:
for files in result:
    for reports in files["reports"]:
        print reports["id"]
But I get the error message:
Traceback (most recent call last):
File "app.py", line 71, in <module>
print reports["id"]
TypeError: string indices must be integers
When I delete ["id"] then it shows me a list of all report names (1,2,3,...) but not the full report with objects and values. Only the report names.
Here is the json code:
{
"reports": {
"1": {
"id": "123"
},
"2": {
"id": "122"
},
"3": {
"id": "121"
}
}
},
{
"reports": {
"4": {
"id": "120"
},
"5": {
"id": "119"
},
"6": {
"id": "118"
}
}
},
...
for reports in files["reports"]:
files["reports"] is a dictionary, and iterating over a dictionary will only bind the keys of that dictionary, and not its values. So reports will be "1" and then "2" etc.
If you want reports to be bound to the {"id": "123"} dictionary value instead of the string key, specify this with the values method:
for reports in files["reports"].values():
Those are lists, not arrays.
Anyway, we have a list of nested dicts, and we want to get all the values corresponding to the id keys in the innermost dicts.
You can do that with a list comprehension over result, the list of loaded files, as follows:
[v['id'] for file in result for v in file['reports'].values()]
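The following sketch runs the comprehension against the sample data from the question. The result list is written out inline as a stand-in for the list built by loading each JSON file, and by_name is a hypothetical extra showing what .items() gives you.

```python
# Stand-in for the `result` list built by loading each JSON file.
result = [
    {"reports": {"1": {"id": "123"}, "2": {"id": "122"}, "3": {"id": "121"}}},
    {"reports": {"4": {"id": "120"}, "5": {"id": "119"}, "6": {"id": "118"}}},
]

# Iterating a dict yields its keys; .values() yields the nested dicts instead.
ids = [report["id"] for file in result for report in file["reports"].values()]
print(ids)

# .items() yields both the report name and its dict.
by_name = {
    name: report["id"]
    for file in result
    for name, report in file["reports"].items()
}
print(by_name)
```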
Building on the previous answers: your JSON object has been converted to a dictionary, and the other JSON types are converted as listed below. Either add the values method, or change the iteration to use items (Python 3):
for files in result:
    for k, v in files["reports"].items():
        print(k, v)
JSON ----- Python
object ----- dict
array ----- list
string ----- str (unicode in Python 2)
number (int) ----- int (int, long in Python 2)

parse a key:value style file and get a value

I have a text file which looks like:
{
"content":
[
{
"id": "myid1",
"path": "/x/y"
},
{
"id": "myid2",
"path": "/a/b"
}
]
}
Is there a way to get the value corresponding to "id" when I pass the
"path" value to my method? For example when I pass /a/b I should get "myid2" in
return. Should I create a dictionary?
Maybe explain briefly what you actually need to do, as I have a hunch there might be an easier way to do it.
If I understand the question correctly and you want to find the id by passing a value such as "/x/y", then why not structure the dictionary as
{
"content":
{
"/x/y": "myid1"
},
...(more of the same)
}
This would give you direct access to the value you want as otherwise you need to iterate through arrays.
This looks very much like JSON, so you can use the json module to parse the file. Then just iterate over the dictionaries in the "content" list and get the one with the matching "path".
import json
with open("data.json") as f:
    data = json.load(f)
print(data)
path = "/a/b"
for d in data["content"]:
    if d["path"] == path:
        print(d["id"])
Output:
{'content': [{'path': '/x/y', 'id': 'myid1'}, {'path': '/a/b', 'id': 'myid2'}]}
myid2
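If you need to look up many paths, a dictionary keyed by "path" avoids rescanning the list each time. The sketch below builds one with a dict comprehension. The names id_by_path and find_id are hypothetical, the file content is inlined as a string so the example is self-contained, and it assumes each path appears only once (a later duplicate would overwrite an earlier one).

```python
import json

# Same content as the file in the question, inlined for a self-contained example.
raw = '{"content": [{"id": "myid1", "path": "/x/y"}, {"id": "myid2", "path": "/a/b"}]}'
data = json.loads(raw)

# Map each path to its id once, up front.
id_by_path = {entry["path"]: entry["id"] for entry in data["content"]}


def find_id(path):
    """Return the id for a path, or None if the path is unknown."""
    return id_by_path.get(path)


print(find_id("/a/b"))
print(find_id("/missing"))
```

This keeps the file format unchanged while still giving direct access by path.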

Forming a JSON Arrays with Objects using a Python object

I have some variables that I need to dump as JSON in the format:
{
"elements":[
{
"key":"foo",
"value":"7837"
},
{
"key":"bar",
"value":"3423"
}
]
}
I am trying to figure out the right object to give the above structure when using json.dumps(). I see that in Python, lists give a JSON array whereas dictionaries give a JSON object when using json.dumps().
I am trying something like:
x={}
x["elements"]={}
x["elements"]["key"]="foo"
x["elements"]["value"]="7837"
x["elements"]["key"]="bar"
x["elements"]["value"]="3423"
json_x=json.dumps(x)
But this still gives me:
{"elements": {"key": "bar", "value": "3423"}}
which is obviously incorrect.
How do I combine dictionaries and lists correctly to get the above JSON?
Why don't you just use a literal?
x = {
"elements": [
{"key":"foo", "value":"7837"},
{"key":"bar", "value":"3423"}
]
}
To fix your code, you need to use a list literal ([]) when assigning to the "elements" dictionary entry:
>>> x = {}
>>> x["elements"] = [] # <---
>>> x["elements"].append({})
>>> x["elements"].append({})
>>> x["elements"][0]["key"]="foo"
>>> x["elements"][0]["value"]="7837"
>>> x["elements"][1]["key"]="bar"
>>> x["elements"][1]["value"]="3423"
>>> json.dumps(x)
'{"elements": [{"value": "7837", "key": "foo"}, {"value": "3423", "key": "bar"}]}'
But this is hard to read and maintain.
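When the keys and values come from variables rather than being written out by hand, a list comprehension can build the same structure. This is only a sketch: the pairs list is a hypothetical stand-in for wherever your variables actually live.

```python
import json

# Hypothetical source data: (key, value) pairs held in ordinary variables.
pairs = [("foo", "7837"), ("bar", "3423")]

# A list of dicts serializes to a JSON array of objects.
x = {"elements": [{"key": k, "value": v} for k, v in pairs]}

json_x = json.dumps(x)
print(json_x)
```

Each tuple becomes one object in the "elements" array, so adding another pair adds another element without touching any index.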
