Creating a Dataframe from nested JSON API output - python

I am using the Qualtrics API to pull some data for work. The results come back in JSON format, and I would like to transform the data into a dataframe. I'm working inside a Jupyter notebook within Alteryx. I plan to export the dataframe in Alteryx to do work elsewhere; all I need to do is get it into shape. I receive the same response as the example below, which is taken from the Qualtrics website. Does anyone know how I can take the fields under the nested "elements" section and create a dataframe? I would like to make a dataframe of the contact fields I receive back.
I have tried the following:
jdata = json.loads(response.text)
df = pd.DataFrame(jdata)
print(df)
But I am getting a dataframe of the entire json response.
Example Response:
{
"meta": {
"httpStatus": "200 - OK",
"requestId": "7de14d38-f5ed-49d0-9ff0-773e12b896b8"
},
"result": {
"elements": [
{
"contactId": "CID_123456",
"email": "js@example.com",
"extRef": "1234567",
"firstName": "James",
"language": "en",
"lastName": "Smith",
"phone": "8005552000",
"unsubscribed": false
},
{
"contactId": "CID_3456789",
"email": "person@example.com",
"extRef": "12345678",
"firstName": "John",
"language": "en",
"lastName": "Smith",
"phone": "8005551212",
"unsubscribed": true
}
],
"nextPage": null
}
}

jdata = json.loads(response.text)
df = pd.json_normalize(jdata, record_path=['result', 'elements'])
pd.json_normalize also works when jdata is a list of dicts.

Try:
jdata = json.loads(response.text)
elements = jdata['result']['elements']
df = pd.DataFrame(elements)
print(df)
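As a minimal sketch of the two answers above, the following runs both approaches against a trimmed inline copy of the example response (standing in for response.text) and checks that they produce the same columns. It assumes pandas 1.0 or later, where json_normalize is available as pd.json_normalize.

```python
import json

import pandas as pd

# Trimmed inline copy of the example response; in the notebook this
# string would be response.text from the Qualtrics API call.
raw = """
{
  "meta": {"httpStatus": "200 - OK"},
  "result": {
    "elements": [
      {"contactId": "CID_123456", "firstName": "James",
       "lastName": "Smith", "unsubscribed": false},
      {"contactId": "CID_3456789", "firstName": "John",
       "lastName": "Smith", "unsubscribed": true}
    ],
    "nextPage": null
  }
}
"""

jdata = json.loads(raw)

# Approach 1: let pandas walk down to the list of records.
df_normalized = pd.json_normalize(jdata, record_path=["result", "elements"])

# Approach 2: index into the nested dicts, then build the frame directly.
df_direct = pd.DataFrame(jdata["result"]["elements"])

print(df_normalized)
print(list(df_normalized.columns) == list(df_direct.columns))
```

Either form works for flat records like these. json_normalize becomes more useful when the records themselves contain nested dicts, because it flattens them into dotted column names.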

Related

Parse complex JSON in Python

EDITED WITH LARGER JSON:
I have the following JSON and I need to get id element: 624ff9f71d847202039ec220
results": [
{
"id": "62503d2800c0d0004ee4636e",
"name": "2214524",
"settings": {
"dataFetch": "static",
"dataEntities": {
"variables": [
{
"id": "624ffa191d84720202e2ed4a",
"name": "temp1",
"device": {
"id": "624ff9f71d847202039ec220",
"name": "282c0240ea4c",
"label": "282c0240ea4c",
"createdAt": "2022-04-08T09:01:43.547702Z"
},
"chartType": "line",
"aggregationMethod": "last_value"
},
{
"id": "62540816330443111016e38b",
"device": {
"id": "624ff9f71d847202039ec220",
"name": "282c0240ea4c",
},
"chartType": "line",
}
]
}
...
Here is my code (EDITED)
url = "API_URL"
response = urllib.urlopen(url)
data = json.loads(response.read().decode("utf-8"))
print(url)
all_ids = []
for i in data['results']:  # i is a dictionary
    for variable in i['settings']['dataEntities']['variables']:
        print(variable['id'])
        all_ids.append(variable['id'])
But I have the following error:
for variable in i['settings']['dataEntities']['variables']:
KeyError: 'dataEntities'
Could you please help?
Thanks!!
What is it printing when you print(fetc)? If you format the json, it will be easier to read, the current nesting is very hard to comprehend.
fetc is a string, not a dict. If you want the dict, you have to use the key.
Try:
url = "API_URL"
response = urllib.urlopen(url)
data = json.loads(response.read().decode("utf-8"))
print(url)
for i in data['results']:
    print(json.dumps(i['settings']))
    print(i['settings']['dataEntities'])
EDIT: To get to the id field, you'll need to dive further.
i['settings']['dataEntities']['variables'][0]['id']
So if you want all the ids, you'll have to loop over the variables (assuming the list has more than one entry), and if you want them for all the settings, you'll need to loop over those too.
Full solution for you to try (EDITED after you uploaded the full JSON):
url = "API_URL"
response = urllib.urlopen(url)
data = json.loads(response.read().decode("utf-8"))
print(url)
all_ids = []
for i in data['results']:  # i is a dictionary
    for variable in i['settings']['dataEntities']['variables']:
        print(variable['id'])
        all_ids.append(variable['id'])
        all_ids.append(variable['device']['id'])
Let me know if that works.
The shared JSON is not valid. A valid JSON similar to yours is:
{
"results": [
{
"settings": {
"dataFetch": "static",
"dataEntities": {
"variables": [
{
"id": "624ffa191d84720202e2ed4a",
"name": "temp1",
"span": "inherit",
"color": "#2ccce4",
"device": {
"id": "624ff9f71d847202039ec220"
}
}
]
}
}
}
]
}
To get a list of ids from your JSON you need a nested for loop. A Pythonic way to write it is:
all_ids = [y["device"]["id"] for x in my_json["results"] for y in x["settings"]["dataEntities"]["variables"]]
Where my_json is your initial JSON.
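The KeyError in the question suggests that at least one entry in results has a settings dict without a dataEntities key. Below is a minimal sketch of a more tolerant traversal, assuming that entries with missing keys should simply be skipped. The function name collect_device_ids and the sample payload are hypothetical, not part of the original API.

```python
def collect_device_ids(payload):
    """Return every variable's device id, skipping entries with missing keys."""
    ids = []
    for result in payload.get("results") or []:
        settings = result.get("settings") or {}
        entities = settings.get("dataEntities") or {}
        for variable in entities.get("variables") or []:
            device = variable.get("device") or {}
            if "id" in device:
                ids.append(device["id"])
    return ids


# Hypothetical payload: the second result has no "dataEntities" key,
# which is the situation that raises KeyError with plain indexing.
sample = {
    "results": [
        {
            "settings": {
                "dataFetch": "static",
                "dataEntities": {
                    "variables": [
                        {"id": "624ffa191d84720202e2ed4a",
                         "device": {"id": "624ff9f71d847202039ec220"}},
                        {"id": "62540816330443111016e38b",
                         "device": {"id": "624ff9f71d847202039ec220"}},
                    ]
                },
            }
        },
        {"settings": {"dataFetch": "static"}},
    ]
}

device_ids = collect_device_ids(sample)
print(device_ids)
```

Whether skipping silently is the right behaviour depends on the data; if a missing key indicates a real problem, logging the offending entry would be a reasonable alternative.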

How do I output specific data from a json response?

I am fairly new to using APIs in python and I am trying to create a system that outputs data from previous motorsport races. I have sent requests to an API, but I am struggling to get it to just output one specific piece of data (eg. time, location). I get this when I just print the raw JSON data sent.
{
"MRData": {
"RaceTable": {
"Races": [
{
"Circuit": {
"Location": {
"country": "Spain",
"lat": "41.57",
"locality": "Montmeló",
"long": "2.26111"
},
"circuitId": "catalunya",
"circuitName": "Circuit de Barcelona-Catalunya",
"url": "http://en.wikipedia.org/wiki/Circuit_de_Barcelona-Catalunya"
},
"date": "2020-08-16",
"raceName": "Spanish Grand Prix",
"round": "6",
"season": "2020",
"time": "13:10:00Z",
"url": "https://en.wikipedia.org/wiki/2020_Spanish_Grand_Prix"
}
],
"round": "6",
"season": "2020"
},
"limit": "30",
"offset": "0",
"series": "f1",
"total": "1",
"url": "http://ergast.com/api/f1/2020/6.json",
"xmlns": "http://ergast.com/mrd/1.4"
}
}
Just to get to grips with APIs I am simply trying to output a simple piece of data of a specific race, and once I can do that, I'll be able to scale it up and output all sorts of data. I'd assumed it would just be as simple as typing print(data['time']) (as seen below) but I get an error message saying this:
KeyError: 'time'
My source code:
import requests
response = requests.get("http://ergast.com/api/f1/2020/6.json")
data = response.json()
print (data["time"])
Any help is appreciated!
Like this...
import json
data = """{
"MRData":{
"xmlns":"http://ergast.com/mrd/1.4",
"series":"f1",
"url":"http://ergast.com/api/f1/2020/6.json",
"limit":"30",
"offset":"0",
"total":"1",
"RaceTable":{
"season":"2020",
"round":"6",
"Races":[
{
"season":"2020",
"round":"6",
"url":"https://en.wikipedia.org/wiki/2020_Spanish_Grand_Prix",
"raceName":"Spanish Grand Prix",
"Circuit":{
"circuitId":"catalunya",
"url":"http://en.wikipedia.org/wiki/Circuit_de_Barcelona-Catalunya",
"circuitName":"Circuit de Barcelona-Catalunya",
"Location":{
"lat":"41.57",
"long":"2.26111",
"locality":"Montmeló",
"country":"Spain"
}
},
"date":"2020-08-16",
"time":"13:10:00Z"
}
]
}
}
}"""
jsonData = json.loads(data)
Races is an array; in this case there is only one race, so you would designate it as ["Races"][0]:
print(jsonData["MRData"]["RaceTable"]["Races"][0]["time"])
data['time'] would work if you had a flat dictionary, but you have a nested dict/list structure, so:
data["MRData"]["RaceTable"]["Races"][0]["time"]
data["MRData"] returns another dict, which has a key "RaceTable". The value of this key is again a dictionary which has a key "Races". The value of this is a list of races, of which you only have one. The races are again dicts which have the key time.
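The chain of lookups described above can be unrolled one level at a time. This is a small sketch using a trimmed inline copy of the response instead of a live request; the intermediate variable names are illustrative only.

```python
# Trimmed copy of the API response from the question.
data = {
    "MRData": {
        "RaceTable": {
            "Races": [
                {
                    "raceName": "Spanish Grand Prix",
                    "date": "2020-08-16",
                    "time": "13:10:00Z",
                }
            ],
            "round": "6",
            "season": "2020",
        }
    }
}

mr_data = data["MRData"]            # dict
race_table = mr_data["RaceTable"]   # dict
races = race_table["Races"]         # list of race dicts
first_race = races[0]               # dict for the only race
race_time = first_race["time"]      # the value being looked for

print(race_time)

# With more than one race, loop over the list instead of indexing [0].
for race in races:
    print(race["raceName"], race["date"], race["time"])
```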

How do you parse nested JSON data for specific information?

I'm using the national weather service API and when you use a specific URL you get JSON data back. My program so far grabs everything including 155 hours of weather data.
Simply put, I'm trying to parse the data and grab the weather for the latest hour, but everything is in a nested data structure.
My code, JSON data, and more information are below. Any help is appreciated.
import requests
import json
def get_current_weather():  # This method returns JSON data from the API
    url = 'https://api.weather.gov/gridpoints/*office*/*any number,*any number*/forecast/hourly'
    response = requests.get(url)
    full_data = response.json()
    return full_data
def main():  # Prints the information grabbed from the API
    print(get_current_weather())
if __name__ == "__main__":
    main()
In the JSON response I get, there are 3 layers before you reach the 'shortForecast' data that I'm trying to get. The first nest is 'properties'; everything before it is irrelevant to my program. The second nest is 'periods', and each period is a new hour, 0 being the latest. Lastly, I just need to grab the 'shortForecast' in the first period, or periods[0].
{
"@context": [
"https://geojson.org/geojson-ld/geojson-context.jsonld",
{
"@version": "1.1",
"wx": "https://api.weather.gov/ontology#",
"geo": "http://www.opengis.net/ont/geosparql#",
"unit": "http://codes.wmo.int/common/unit/",
"@vocab": "https://api.weather.gov/ontology#"
}
],
"type": "Feature",
"geometry": {
"type": "Polygon",
"coordinates": [
[
*data I'm not gonna add*
]
]
},
"properties": {
"updated": "2021-02-11T05:57:24+00:00",
"units": "us",
"forecastGenerator": "HourlyForecastGenerator",
"generatedAt": "2021-02-11T07:12:58+00:00",
"updateTime": "2021-02-11T05:57:24+00:00",
"validTimes": "2021-02-10T23:00:00+00:00/P7DT14H",
"elevation": {
"value": ,
"unitCode": "unit:m"
},
"periods": [
{
"number": 1,
"name": "",
"startTime": "2021-02-11T02:00:00-05:00",
"endTime": "2021-02-11T03:00:00-05:00",
"isDaytime": false,
"temperature": 18,
"temperatureUnit": "F",
"temperatureTrend": null,
"windSpeed": "10 mph",
"windDirection": "N",
"icon": "https://api.weather.gov/icons/land/night/snow,40?size=small",
"shortForecast": "Chance Light Snow",
"detailedForecast": ""
},
{
"number": 2,
"name": "",
"startTime": "2021-02-11T03:00:00-05:00",
"endTime": "2021-02-11T04:00:00-05:00",
"isDaytime": false,
"temperature": 17,
"temperatureUnit": "F",
"temperatureTrend": null,
"windSpeed": "12 mph",
"windDirection": "N",
"icon": "https://api.weather.gov/icons/land/night/snow,40?size=small",
"shortForecast": "Chance Light Snow",
"detailedForecast": ""
},
OK, so I didn't want to edit everything again, so this is the new get_current_weather method. I was able to get to 'periods', but after that I'm still stumped.
def get_current_weather():
    url = 'https://api.weather.gov/gridpoints/ILN/82,83/forecast/hourly'
    response = requests.get(url)
    full_data = response.json()
    return full_data['properties'].get('periods')
For the dictionary object, you can access the nested elements by using indexing multiple times.
So, for your dictionary object, you can use the following to get the value for the key shortForecast for the first element in the list of dictionaries under key periods under the key properties in the main dictionary:
full_data['properties']['periods'][0]['shortForecast']
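As a rough sketch, the same lookup can be wrapped in a small helper that also copes with an empty or missing periods list. The helper name latest_short_forecast is hypothetical, and the sample dict is a trimmed stand-in for the real response.json() result.

```python
def latest_short_forecast(full_data):
    """Return the shortForecast of the first period, or None if there is none."""
    periods = full_data.get("properties", {}).get("periods", [])
    if not periods:
        return None
    return periods[0]["shortForecast"]


# Trimmed stand-in for the hourly forecast response shown in the question.
sample = {
    "type": "Feature",
    "properties": {
        "units": "us",
        "periods": [
            {"number": 1, "temperature": 18,
             "shortForecast": "Chance Light Snow"},
            {"number": 2, "temperature": 17,
             "shortForecast": "Chance Light Snow"},
        ],
    },
}

forecast = latest_short_forecast(sample)
print(forecast)
```

In the original program, get_current_weather() could return full_data unchanged and main() could call latest_short_forecast on it.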

Access a specific value from json response

I need to access a single value from the JSON array response of the Google endpoints model API with Python. How can I access it?
I want to read values from the response and manipulate them before passing them to another class. How can I do that?
I tried the following code, but it returns the whole JSON array:
@Gadget.query_method(path='gadgets', name='gadget.list', query_fields=('ownerID', 'category', 'locationCity', 'pageToken'))
def GadgetList(self, query):
    return query
How do I extract the single value I need?
This is the response I get:
{
"items": [
{
"category": "Audio",
"locationBuilding": "100",
"subCategory": "Microphone",
"title": "qwe",
"description": "hdhsjsjsbs ks\nshshs josh\nvsdbjddkdkdldjdj \nhsjdkdkdldlddhdhdkslsksmsnsbsnksslslsl",
"accountNumber": "1246800754379",
"image_name": "test_image.png",
"created": "2017-01-25T07:51:30.468280",
"locationPostCode": "10040",
"reviews": "0",
"locationCity": "Stockholm",
"ownerName": "Canopus Rise",
"locationStreet": "wash",
"image_first": "https://folkloric-drive-119608.appspot.com/view_photo/AMIfv95kf_NH4IjVPwyzUH2owYG7RA604yij9zeQO_WrgKe2YtKS4UXyfQP4qNzB92OqEsNZSPKg2KVX_ZxbixkX24cpeIimYFzarpjScAWbLzGFQsVZ2FZ68KMLCEqgNZslmdlWB11WlGDapeStzqNklg_AMrgxymugGPdGEGYsrfN4tV2ZjMEwqPSBysByMzWVCIABlaFkdhzYNrAOdUKBq6KoFMO1AVTW8VCB48OKEba59iIsJ2vLDbFD0xwna9Ef8ey2njve2ytXd7yLngaiI4AqCWGRqPKwPlVhdJig1oaTcf7L89Y2dBJsKqmczAHQpQY7OYhtduvwmY2RJERjn_DJR946oyiHJyYvqTptJS6hmTHySTLGF64tYp1OhzXKPVkXcwCJo2st-WrxrXITw4VAEBsAH5ALlAI9WaXNrvZjE3uIHUYbDZLWxtHmXUD",
"ownerID": "5723151296102400",
"image_second": "https://folkloric-drive-119608.appspot.com/view_photo/AMIfv967Q-nOvyX9I5qCyhdFQ-Y_ltXzcPX0BfSsDrWOL4BcepNCP9kYg3qyy3af2Q6uAnQuzDA-pcYj_abrp9CkEkVbt8c3nJuJGa5X0TpKZr4rWyQiat7WbhN3vJ7Z1mqSpalAT3VGZNMiZUopQ37lJA2BhArKtfe_S5Pyqi61pxROoFawKNsMGRupzBy3Z3xHGTuqaIA3hK1rpu7PXwlGxKf3t_lEdwklgz2myUsCK9A6iynJDY1_E8are6GKtGjgVDKKgDwzQp6A6lwcDEVwbYKnbcxaxaVtxIt3Cl54OM32z2dT0Ai9SDTKaxCU82SUwbPM_9KdmnwpDWfvmkdlFEuXgDXtyXH2N8R9ll6PxSWkNRNC34pzsLsviGypahaKTfmn86Rivr8r9IhAo4QIQGelIcjsT7L0-jJ1I8zkUYxUiZ8_8H6fBuMtNwEhpppzO5hjt",
"price": 25,
"id": "5690964005879808",
"image_third": "https://folkloric-drive-119608.appspot.com/view_photo/AMIfv94CvfaeiODVYShFYtr39sV6tILqcOH25GMwyRqum_I6hby2wu08izp7f0pJ27VFZ4YWnSScH_tS1fu-I8L_fOGQjFXXdiO444plR8F4h33CWzwBXltazPPGIC1GS5BKDSz31jtr8wtViG-0wFd-q-QIsrL3kyfr7qdebfKkwkjeX82hXpdu6kIhIvLqaxXoCPGORZWRtxPTrzsBxuDPy4ISApB3_MgmCSqLhtUVjwAZIrCwv47f1q9JCusuqBn04Qv3GGE017rlocFFgz0BCEFc_Sjeo058OS8GOTBNVRPQA8_tViCEWghWe78USvOFloRLVk_k9T1B1rSVcJsq0DATZp_Jxp_KBA2DLrTkvVk0DXJANABjhTGtZwJpUwbly_LSfl294Ux65i_GSENpyEi7_tRpgHOQ6Jj9BtvC0pTxpF2M8c2jgbnKROoWxBqnvcC_Gk9z",
"ownerPic": "https://fbcdn-profile-a.akamaihd.net/hprofile-ak-ash2/v/t1.0-1/p200x200/10262183_278529225690836_3880829709913861761_n.jpg?oh=d252a7fe35be82049a8af84035167d41&oe=58400F76&__gda__=1481700277_70deb0cd0b615a544844d9444a599a6b",
"kind": "shareclub#gadgetItem"
}
],
"kind": "shareclub#gadget",
"etag": "\"I7-yHEADuMxTlz3eOecr1LwL2AQ/z2piY1JoP7KGAyKRBlO72PmSj74\""
}

How do I access JSON data passed via AJAX in my django application?

{
"data": [
{
"name": "Kill Bill",
"category": "Movie",
"id": "179403312117362",
"created_time": "2011-06-21T17:40:15+0000"
},
{
"name": "In Search of a Midnight Kiss",
"category": "Movie",
"id": "105514816149992",
"created_time": "2011-03-21T03:59:21+0000"
}
]
}
We could use this as sample data.
So if you were to extract "In Search of a Midnight Kiss" from the request.POST variable, how would you do it?
It'll come as a string that needs to be deserialized.
import json
def some_http_call(request):
    json_string = request.GET.get('http_parameter_key', '')
    json_object = json.loads(json_string)
    data = json_object["data"]
    for x in data:
        print(x["name"])
Assuming some_http_call is your dispatcher and http_parameter_key is the name of the parameter carrying the JSON string, the code above will print all the names in the array of elements stored under the "data" key.
First you deserialize it using simplejson or json, then you access it as you would any other Python object.
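Here is a framework-free sketch of the same idea, assuming the JSON has already been pulled out of the request as a plain string. The helper name find_by_name is hypothetical; in a Django view the string would come from the request object rather than a literal.

```python
import json

# Stand-in for the string received from the AJAX call.
json_string = """
{
  "data": [
    {"name": "Kill Bill", "category": "Movie",
     "id": "179403312117362"},
    {"name": "In Search of a Midnight Kiss", "category": "Movie",
     "id": "105514816149992"}
  ]
}
"""


def find_by_name(raw, wanted_name):
    """Deserialize raw and return the first item whose name matches, else None."""
    payload = json.loads(raw)
    for item in payload["data"]:
        if item["name"] == wanted_name:
            return item
    return None


match = find_by_name(json_string, "In Search of a Midnight Kiss")
print(match["id"])
```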
