Parse complex JSON in Python

EDITED WITH LARGER JSON:
I have the following JSON and I need to get the device id element, 624ff9f71d847202039ec220:
results": [
{
"id": "62503d2800c0d0004ee4636e",
"name": "2214524",
"settings": {
"dataFetch": "static",
"dataEntities": {
"variables": [
{
"id": "624ffa191d84720202e2ed4a",
"name": "temp1",
"device": {
"id": "624ff9f71d847202039ec220",
"name": "282c0240ea4c",
"label": "282c0240ea4c",
"createdAt": "2022-04-08T09:01:43.547702Z"
},
"chartType": "line",
"aggregationMethod": "last_value"
},
{
"id": "62540816330443111016e38b",
"device": {
"id": "624ff9f71d847202039ec220",
"name": "282c0240ea4c",
},
"chartType": "line",
}
]
}
...
Here is my code (EDITED):
url = "API_URL"
response = urllib.urlopen(url)
data = json.loads(response.read().decode("utf-8"))
print url
all_ids = []
for i in data['results']:  # i is a dictionary
    for variable in i['settings']['dataEntities']['variables']:
        print(variable['id'])
        all_ids.append(variable['id'])
But I have the following error:
for variable in i['settings']['dataEntities']['variables']:
KeyError: 'dataEntities'
Could you please help?
Thanks!!

What is it printing when you print(fetc)? If you format the json, it will be easier to read, the current nesting is very hard to comprehend.
fetc is a string, not a dict. If you want the dict, you have to use the key.
Try:
url = "API_URL"
response = urllib.urlopen(url)
data = json.loads(response.read().decode("utf-8"))
print url
for i in data['results']:
    print(json.dumps(i['settings']))
    print(i['settings']['dataEntities'])
EDIT: To get to the id field, you'll need to dive further.
i['settings']['dataEntities']['variables'][0]['id']
So if you want all the ids, you'll have to loop over the variables (assuming the list has more than one entry), and if you want them for all the settings, you'll need to loop over those too.
Full solution for you to try (EDITED after you uploaded the full JSON):
url = "API_URL"
response = urllib.urlopen(url)
data = json.loads(response.read().decode("utf-8"))
print url
all_ids = []
for i in data['results']:  # i is a dictionary
    for variable in i['settings']['dataEntities']['variables']:
        print(variable['id'])
        all_ids.append(variable['id'])
        all_ids.append(variable['device']['id'])
Let me know if that works.
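The KeyError: 'dataEntities' in the question suggests that at least one entry in results has a settings dict without that key. That is an assumption, since the full API response is not shown. A minimal sketch of a more tolerant loop, using dict.get with defaults and a hypothetical helper named collect_device_ids, could look like this:

```python
import json

# Hypothetical sample: the second result has no "dataEntities" key,
# which is one plausible cause of the KeyError in the question.
raw = """
{
  "results": [
    {"settings": {"dataFetch": "static",
                  "dataEntities": {"variables": [
                      {"id": "624ffa191d84720202e2ed4a",
                       "device": {"id": "624ff9f71d847202039ec220"}}]}}},
    {"settings": {"dataFetch": "static"}}
  ]
}
"""


def collect_device_ids(data):
    """Return device ids, skipping results that lack the nested keys."""
    device_ids = []
    for result in data.get("results", []):
        entities = result.get("settings", {}).get("dataEntities", {})
        for variable in entities.get("variables", []):
            device = variable.get("device") or {}
            if "id" in device:
                device_ids.append(device["id"])
    return device_ids


device_ids = collect_device_ids(json.loads(raw))
print(device_ids)
```

Results without dataEntities are skipped silently here; if a missing key should be treated as an error in your case, log or raise instead of falling back to an empty default.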

The shared JSON is not valid. A valid JSON similar to yours is:
{
"results": [
{
"settings": {
"dataFetch": "static",
"dataEntities": {
"variables": [
{
"id": "624ffa191d84720202e2ed4a",
"name": "temp1",
"span": "inherit",
"color": "#2ccce4",
"device": {
"id": "624ff9f71d847202039ec220"
}
}
]
}
}
}
]
}
To get a list of ids from your JSON you need a nested loop over the results and over each result's variables. A Pythonic way to write it is:
all_ids = [y["device"]["id"] for x in my_json["results"] for y in x["settings"]["dataEntities"]["variables"]]
where my_json is your parsed JSON (a dict).
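In the JSON from the question, the same device id appears under more than one variable, so the comprehension can return duplicates. If only distinct ids are wanted (an assumption about the goal), one possible sketch is to deduplicate while keeping the original order. The my_json value below is a trimmed, hypothetical stand-in for the parsed response:

```python
# Hypothetical parsed JSON, trimmed to the keys used here.
my_json = {"results": [{"settings": {"dataEntities": {"variables": [
    {"id": "624ffa191d84720202e2ed4a",
     "device": {"id": "624ff9f71d847202039ec220"}},
    {"id": "62540816330443111016e38b",
     "device": {"id": "624ff9f71d847202039ec220"}},
]}}}]}

# Nested comprehension: outer loop over results, inner loop over variables.
all_ids = [y["device"]["id"]
           for x in my_json["results"]
           for y in x["settings"]["dataEntities"]["variables"]]

# dict.fromkeys keeps the first occurrence of each id, in order.
unique_ids = list(dict.fromkeys(all_ids))
print(unique_ids)
```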

Related

How to get a value from JSON

This is the first time I'm working with JSON, and I'm trying to pull url out of the JSON below.
{
"name": "The_New11d112a_Company_Name",
"sections": [
{
"name": "Products",
"payload": [
{
"id": 1,
"name": "TERi Geriatric Patient Skills Trainer,
"type": "string"
}
]
},
{
"name": "Contact Info",
"payload": [
{
"id": 1,
"name": "contacts",
"url": "https://www.a3bs.com/catheterization-kits-8000892-3011958-3b-scientific,p_1057_31043.html",
"contacts": [
{
"name": "User",
"email": "Company Email",
"phone": "Company PhoneNumber"
}
],
"type": "contact"
}
]
}
],
"tags": [
"Male",
"Airway"
],
"_id": "0e4cd5c6-4d2f-48b9-acf2-5aa75ade36e1"
}
I have been able to access description and _id via
data = json.loads(line)
if 'xpath' in data:
    xpath = data["_id"]
    description = data["sections"][0]["payload"][0]["description"]
However, I can't seem to figure out a way to access url. Another issue is that there could be other items in sections, which makes indexing into Contact Info by position a non-starter.
Hope this helps:
import json
with open("test.json", "r") as f:
    json_out = json.load(f)
for i in json_out["sections"]:
    for j in i["payload"]:
        for key in j:
            if "url" in key:
                print(key, '->', j[key])
I think your JSON is damaged; it should look like this:
{
"name": "The_New11d112a_Company_Name",
"sections": [
{
"name": "Products",
"payload": [
{
"id": 1,
"name": "TERi Geriatric Patient Skills Trainer",
"type": "string"
}
]
},
{
"name": "Contact Info",
"payload": [
{
"id": 1,
"name": "contacts",
"url": "https://www.a3bs.com/catheterization-kits-8000892-3011958-3b-scientific,p_1057_31043.html",
"contacts": [
{
"name": "User",
"email": "Company Email",
"phone": "Company PhoneNumber"
}
],
"type": "contact"
}
]
}
],
"tags": [
"Male",
"Airway"
],
"_id": "0e4cd5c6-4d2f-48b9-acf2-5aa75ade36e1"
}
You can check it on http://json.parser.online.fr/.
And if you want to get the value of the url:
import json
j = json.load(open('yourJSONfile.json'))
print(j['sections'][1]['payload'][0]['url'])
I think it's worth writing a short function that returns the url(s), so you can decide whether to use the first url in the returned list or to skip processing when your data contains no url.
The function could look like this:
def extract_urls(data):
    payloads = []
    for section in data['sections']:
        payloads += section.get('payload') or []
    urls = [x['url'] for x in payloads if 'url' in x]
    return urls
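Since the question notes that the position of Contact Info inside sections is not guaranteed, another option is to select the section by its name instead of by index. The following is only a sketch: urls_for_section is a hypothetical helper, and the sample data and its example.com URL are placeholders that mirror the structure in the question.

```python
def urls_for_section(data, section_name):
    """Return every 'url' found in the payload of sections named section_name."""
    urls = []
    for section in data.get("sections", []):
        if section.get("name") != section_name:
            continue
        for item in section.get("payload") or []:
            if "url" in item:
                urls.append(item["url"])
    return urls


# Trimmed, hypothetical sample mirroring the structure in the question.
sample = {
    "sections": [
        {"name": "Products", "payload": [{"id": 1, "type": "string"}]},
        {"name": "Contact Info",
         "payload": [{"id": 1, "url": "https://example.com/contact"}]},
    ]
}

contact_urls = urls_for_section(sample, "Contact Info")
print(contact_urls[0] if contact_urls else "no url found")
```

Matching on the name keeps the lookup working when sections are added or reordered, at the cost of depending on the exact name string.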
This should print out the URL:
import json
# open json file to read
with open('test.json','r') as f:
    # load json, parameter as json text (file contents)
    data = json.loads(f.read())
# after observing format of JSON data, the location of the URL key
# is determined and the data variable is manipulated to extract the value
print(data['sections'][1]['payload'][0]['url'])
The exact location of the 'url' key:
The value of the key 'sections' is an array; take the element at position 1.
That element is a dict, and its key 'payload' contains an array.
At position 0 of that array is a dict with the key 'url'.
While testing my solution, I noticed that the JSON provided is flawed. After fixing its three flaws, I ended up with this:
{
"name": "The_New11d112a_Company_Name",
"sections": [
{
"name": "Products",
"payload": [
{
"id": 1,
"name": "TERi Geriatric Patient Skills Trainer",
"type": "string"
}
]
},
{
"name": "Contact Info",
"payload": [
{
"id": 1,
"name": "contacts",
"url": "https://www.a3bs.com/catheterization-kits-8000892-3011958-3b-scientific,p_1057_31043.html",
"contacts": [
{
"name": "User",
"email": "Company Email",
"phone": "Company PhoneNumber"
}
],
"type": "contact"
}
]
}
],
"tags": [
"Male",
"Airway"
],
"_id": "0e4cd5c6-4d2f-48b9-acf2-5aa75ade36e1"}
Using the JSON provided by Vincent55, I wrote working code with exception handling and certain assumptions.
Working code:
## Assuming that the target data is always under sections[i].payload
from json import loads
line = open("data.json").read()
data = loads(line)["sections"]
for x in data:
    try:
        # With assumption that there is only one payload
        if x["payload"][0]["url"]:
            print(x["payload"][0]["url"])
    except KeyError:
        pass

Filtering out desired data from a JSON file (Python)

This is a sample of my JSON file:
{
"pops": [{
"name": "pop_a",
"subnets": {
"Public": ["1.1.1.0/24,2.2.2.0/24"],
"Private": ["192.168.0.0/24,192.168.1.0/24"],
"more DATA":""
}
},
{
"name": "pop_b",
"subnets": {
"Public": ["3.3.3.0/24,4.4.4.0/24"],
"Private": ["192.168.2.0/24,192.168.3.0/24"],
"more DATA":""
}
}
]
}
After I read it, I want to build a dict object that stores only the parts of this file that I need.
I want my object to look like this:
[{
"name": "pop_a",
"subnets": {"Public": ["1.1.1.0/24,2.2.2.0/24"],"Private": ["192.168.0.0/24,192.168.1.0/24"]}
},
{
"name": "pop_b",
"subnets": {"Public": ["3.3.3.0/24,4.4.4.0/24"],"Private": ["192.168.2.0/24,192.168.3.0/24"]}
}]
Then I want to be able to access some of the public/private values.
Here is what I tried. I know there are also update() and setdefault(), but they gave the same unwanted results:
def my_funckion():
    nt_json = [{'name':"",'subnets':[]}]
    Pname = []
    Psubnet= []
    for pop in pop_json['pops']: # it print only the last key/value
        nt_json[0]['name']= pop['name']
        nt_json[0]['subnet'] = pop['subnet']
    pprint (nt_json)
    for pop in pop_json['pops']:
        """
        it print the names in a row then all of the ipss
        """
        Pname.append(pop['name'])
        Pgre.append(pop['subnet'])
    nt_json['pop_name'] = Pname
    nt_json['subnet']= Psubnet
    pprint (nt_json)
Here's a quick solution using list comprehension. Note that this approach can be taken only with enough knowledge of the json structure.
>>> import json
>>>
>>> data = ... # your data
>>> new_data = [{ "name" : x["name"], "subnets" : {"Public" : x["subnets"]["Public"], "Private" : x["subnets"]["Private"]}} for x in data["pops"]]
>>>
>>> print(json.dumps(new_data, indent=2))
[
{
"name": "pop_a",
"subnets": {
"Private": [
"192.168.0.0/24,192.168.1.0/24"
],
"Public": [
"1.1.1.0/24,2.2.2.0/24"
]
}
},
{
"name": "pop_b",
"subnets": {
"Private": [
"192.168.2.0/24,192.168.3.0/24"
],
"Public": [
"3.3.3.0/24,4.4.4.0/24"
]
}
}
]
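The question also asks how to access individual public/private values afterwards. In the sample, each Public or Private list holds a single comma-joined string, so a lookup usually needs a split as well. A minimal sketch, assuming the filtered structure shown above; by_name and split_subnets are hypothetical names introduced for illustration:

```python
# Hypothetical filtered data, in the shape produced by the comprehension above.
new_data = [
    {"name": "pop_a",
     "subnets": {"Public": ["1.1.1.0/24,2.2.2.0/24"],
                 "Private": ["192.168.0.0/24,192.168.1.0/24"]}},
    {"name": "pop_b",
     "subnets": {"Public": ["3.3.3.0/24,4.4.4.0/24"],
                 "Private": ["192.168.2.0/24,192.168.3.0/24"]}},
]

# Index the list by pop name so a pop can be looked up directly.
by_name = {pop["name"]: pop["subnets"] for pop in new_data}


def split_subnets(entries):
    """Flatten a list of comma-joined strings into individual subnets."""
    return [net.strip() for entry in entries for net in entry.split(",")]


public_a = split_subnets(by_name["pop_a"]["Public"])
print(public_a)
```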

Need read some data from JSON

I need to get (id, name, faction id) for each deputy in this JSON:
{
"id": "75785",
"title": "(за основу)",
"asozdUrl": null,
"datetime": "2011-12-21T12:20:26+0400",
"votes": [
{
"deputy": {
"id": "99111772",
"name": "Абалаков Александр Николаевич",
"faction": {
"id": "72100004",
"title": "КПРФ"
}
},
"result": "accept"
},
{
"deputy": {
"id": "99100491",
"name": "Абдулатипов Рамазан Гаджимурадович",
"faction": {
"id": "72100024",
"title": "ЕР"
}
},
"result": "none"
}
......., etc
My code looks like this:
urlData = "https://raw.githubusercontent.com/data-dumaGovRu/vote/master/poll/2011-12-21/75785.json"
response = urllib.request.urlopen(urlData)
content = response.read()
data = json.loads(content.decode("utf8"))
for i in data:
    #print(data["name"])
And I don't know what to do with that #print line. How should I write it?
You can access the list containing the deputies with data['votes']. Iterating through the list, you can access the keys you're interested in as you would with dict key lookups. Nested dicts imply you have to walk through the keys starting from the root to your point of interest:
for d in data['votes']:
    print(d['deputy']['id'], d['deputy']['name'], d['deputy']['faction']['id'])
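If the values are needed later rather than just printed, the same walk can collect them into a list of tuples. This is a sketch on a trimmed copy of the sample from the question; the deputies variable name is an assumption for illustration:

```python
# Trimmed sample with the same shape as the poll JSON in the question.
data = {
    "votes": [
        {"deputy": {"id": "99111772", "name": "Абалаков Александр Николаевич",
                    "faction": {"id": "72100004", "title": "КПРФ"}},
         "result": "accept"},
        {"deputy": {"id": "99100491", "name": "Абдулатипов Рамазан Гаджимурадович",
                    "faction": {"id": "72100024", "title": "ЕР"}},
         "result": "none"},
    ]
}

# One (deputy id, name, faction id) tuple per vote.
deputies = [
    (v["deputy"]["id"], v["deputy"]["name"], v["deputy"]["faction"]["id"])
    for v in data["votes"]
]

for deputy_id, name, faction_id in deputies:
    print(deputy_id, name, faction_id)
```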

How to make a 'outer' JSON key for JSON object with python

I would like to make the following JSON syntax output with python:
data={
"timestamp": "1462868427",
"sites": [
{
"name": "SiteA",
"zone": 1
},
{
"name": "SiteB",
"zone": 7
}
]
}
But I cannot manage to get the 'outer' data key there.
So far I got this output without the data key:
{
"timestamp": "1462868427",
"sites": [
{
"name": "SiteA",
"zone": 1
},
{
"name": "SiteB",
"zone": 7
}
]
}
I have tried with this python code:
sites = [
{
"name":"nameA",
"zone":123
},
{
"name":"nameB",
"zone":324
}
]
data = {
"timestamp": 123456567,
"sites": sites
}
print(json.dumps(data, indent = 4))
But how do I manage to get the outer 'data' key there?
Once you have your data ready, you can simply do this:
data = {'data': data}
JSON doesn't have =, it's all key:value.
What you're looking for is
data = {
"data": {
"timestamp": 123456567,
"sites": sites
}
}
json.dumps(data)
json.dumps() doesn't know or care about the name you give to the data object in Python. You have to add the key yourself inside the object, as a string.
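Putting the two answers together, a small runnable sketch using the values from the desired output in the question might look like this. The payload, wrapped and text variable names are assumptions chosen for illustration:

```python
import json

sites = [
    {"name": "SiteA", "zone": 1},
    {"name": "SiteB", "zone": 7},
]
payload = {"timestamp": "1462868427", "sites": sites}

# Wrap the existing dict under an explicit "data" key before serializing.
wrapped = {"data": payload}
text = json.dumps(wrapped, indent=4)
print(text)
```

The result is a JSON object whose only top-level key is "data", which is the closest valid JSON equivalent of the data={...} form shown in the question.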

Issues decoding Collections+JSON in Python

I've been trying to decode a JSON response in Collections+JSON format using Python for a while now but I can't seem to overcome a small issue.
First of all, here is the JSON response:
{
"collection": {
"href": "http://localhost:8000/social/messages-api/",
"items": [
{
"data": [
{
"name": "messageID",
"value": 19
},
{
"name": "author",
"value": "mike"
},
{
"name": "recipient",
"value": "dan"
},
{
"name": "pm",
"value": "0"
},
{
"name": "time",
"value": "2015-03-31T15:04:01.165060Z"
},
{
"name": "text",
"value": "first message"
}
]
}
],
"version": "1.0",
"links": []
}
}
And here is how I am attempting to extract data:
response = urllib2.urlopen('myurl')
responseData = response.read()
jsonData = json.loads(responseData)
test = jsonData['collection']['items']['data']
When I run this code I get the error:
list indices must be integers, not str
If I use an integer, e.g. 0, instead of a string it merely shows 'data' instead of any useful information, unlike if I were to simply output 'items'. Similarly, I can't seem to access the data within a data child, for example:
test = jsonData['collection']['items'][0]['name']
This will argue that there is no element called 'name'.
What is the proper method of accessing JSON data in this situation? I would also like to iterate over the collection, if that helps.
I'm aware of a package that can be used to simplify working with Collections+JSON in Python, collection-json, but I'd rather be able to do this without using such a package.
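Going by the response shown above, both items and data are lists, and each entry in data is a small dict with a name and a value. A minimal sketch of one way to read it without an extra package follows, using a trimmed copy of that response. The response_data string and the messages list are assumptions introduced for illustration:

```python
import json

# Trimmed copy of the response from the question, as a string.
response_data = """
{"collection": {"items": [{"data": [
    {"name": "messageID", "value": 19},
    {"name": "author", "value": "mike"},
    {"name": "text", "value": "first message"}
]}]}}
"""

json_data = json.loads(response_data)

messages = []
for item in json_data["collection"]["items"]:  # "items" is a list
    # "data" is a list of {"name": ..., "value": ...} pairs; fold it into a dict.
    messages.append({field["name"]: field["value"] for field in item["data"]})

print(messages[0]["author"])
```

Indexing items with a string fails because it is a list, and items[0]['name'] fails because the name keys live one level deeper, inside each entry of data.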
