Extracting multiple values from a JSON Object - python

I have a JSON object that looks like this, stored in a variable called results:
{
"users":[
{
"pk":54297756964,
"username":"zach_nga_test",
"full_name":"",
"is_private":false,
"profile_pic_url":"https://instagram.fcxl1-1.fna.fbcdn.net/v/t51.2885-19/44884218_345707102882519_2446069589734326272_n.jpg?efg=eyJybWQiOiJpZ19hbmRyb2lkX21vYmlsZV9uZXR3b3JrX3N0YWNrX2JhY2t0ZXN0X3YyNDI6Y29udHJvbCJ9&_nc_ht=instagram.fcxl1-1.fna.fbcdn.net&_nc_cat=1&_nc_ohc=bdMav5vmkj0AX9lHIND&edm=ALdCaaIBAAAA&ccb=7-5&ig_cache_key=YW5vbnltb3VzX3Byb2ZpbGVfcGlj.2-ccb7-5&oh=00_AT9BXADjYsi8BkryuLhynAbrTCGmgjucZ6CJUxW4VS49QA&oe=62D6BC4F&_nc_sid=bfcd0c",
"is_verified":false,
"has_anonymous_profile_picture":true,
"has_highlight_reels":false,
"account_badges":[
],
"latest_reel_media":0
},
{
"pk":182625349,
"username":"joesmith123",
"full_name":"Joe Smith",
"is_private":false,
"profile_pic_url":"https://scontent-lga3-1.cdninstagram.com/v/t51.2885-19/264887407_23285995fef5595973_2438487768932865121_n.jpg?stp=dst-jpg_s150x150&_nc_ht=scontent-lga3-1.cdninstagram.com&_nc_cat=105&_nc_ohc=8YpjD2OKeoEAX_gXrh3&edm=APQMUHMBAAAA&ccb=7-5&oh=00_AT9UuhNl9LL_ANffkCcyNNFPv5_yK7J2FKQpRPmqEIri3w&oe=62D60038&_nc_sid=e5d0a6",
"profile_pic_id":"2725348120205619717_182625349",
"is_verified":false,
"has_anonymous_profile_picture":false,
"has_highlight_reels":false,
"account_badges":[
],
"latest_reel_media":0
},
{
"pk":7324707263,
"username":"Mike Jones",
"full_name":"Mike",
"is_private":false,
"profile_pic_url":"https://scontent-lga3-1.cdninstagram.com/v/t51.2885-19/293689497_4676376015169654_5558066294974198168_n.jpg?stp=dst-jpg_s150x150&_nc_ht=scontent-lga3-1.cdninstagram.com&_nc_cat=110&_nc_ohc=ErZ3qsP0LsAAX96OT9a&edm=APQMUHMBAAAA&ccb=7-5&oh=00_AT8pFSsMfJz4Wpq5ulTCpou-4jPs3_GBqIT_SQA6YMaQ0Q&oe=62D61C4D&_nc_sid=e5d0a6",
"profile_pic_id":"28817013445468058864_7324707263",
"is_verified":false,
"has_anonymous_profile_picture":false,
"has_highlight_reels":false,
"account_badges":[
],
"latest_reel_media":1657745524
}
]
}
What I'm trying to do is extract the pk variable from all of them.
I know how to do it on a one by one basis as such:
ids = results['users'][0]['pk']
ids = results['users'][1]['pk']
ids = results['users'][2]['pk']
And so on. But let's say I wanted to extract all of those values in one swoop. I also won't necessarily know how many there are in each JSON object (while the example I used had three, it could be hundreds).
I'm so used to R, where you can just do something like ids = results$pk, but I don't know how to do this with Python.
EDIT BASED ON rv.kvetch answer
data = results
data = json.loads(data)
ids = [u['pk'] for u in data['users']]
print(ids)
But when I run it I get TypeError: the JSON object must be str, bytes or bytearray, not dict

You can load the json string to a dict object, then use a list comprehension to retrieve a list of ids:
import json
# data is a string
data: str = """
...
"""
data: dict = json.loads(data)
ids = [u['pk'] for u in data['users']]
print(ids)
Output:
[54297756964, 182625349, 7324707263]
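The TypeError in the question's edit suggests that results was already a dict by the time json.loads was called, in which case the decoding step can simply be skipped. Below is a minimal sketch that accepts either form. The helper name extract_pks and the shortened sample data are made up for illustration.

```python
import json


def extract_pks(results):
    """Return every 'pk' under 'users'; `results` may be JSON text or a dict."""
    # Only decode when we were handed raw JSON text or bytes.
    if isinstance(results, (str, bytes, bytearray)):
        results = json.loads(results)
    return [user["pk"] for user in results["users"]]


# Shortened sample data: only the field needed here is kept.
as_dict = {"users": [{"pk": 54297756964}, {"pk": 182625349}, {"pk": 7324707263}]}
as_text = json.dumps(as_dict)

print(extract_pks(as_dict))
print(extract_pks(as_text))
```

Both calls should give the same list, regardless of how many users the object contains.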

Related

data json in python [duplicate]

I have some JSON data like:
{
"status": "200",
"msg": "",
"data": {
"time": "1515580011",
"video_info": [
{
"announcement": "{\"announcement_id\":\"6\",\"name\":\"INS\\u8d26\\u53f7\",\"icon\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-08-18_19:44:54\\\/ins.png\",\"icon_new\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-10-20_22:24:38\\\/4.png\",\"videoid\":\"15154610218328614178\",\"content\":\"FOLLOW ME PLEASE\",\"x_coordinate\":\"0.22\",\"y_coordinate\":\"0.23\"}",
"announcement_shop": "",
etc.
How do I grab the content "FOLLOW ME PLEASE"? I tried using
replay_data = raw_replay_data['data']['video_info'][0]
announcement = replay_data['announcement']
But now announcement is a string representing more JSON data. I can't continue indexing: announcement['content'] results in TypeError: string indices must be integers.
How can I get the desired string in the "right" way, i.e. respecting the actual structure of the data?
In a single line -
>>> json.loads(data['data']['video_info'][0]['announcement'])['content']
'FOLLOW ME PLEASE'
To help you understand how to access data (so you don't have to ask again), you'll need to stare at your data.
First, let's lay out your data nicely. You can either use json.dumps(data, indent=4), or you can use an online tool like JSONLint.com.
{
    'data': {
        'time': '1515580011',
        'video_info': [{
            'announcement': (  # ***
                """{
                    "announcement_id": "6",
                    "name": "INS\\u8d26\\u53f7",
                    "icon": "http:\\\\/\\\\/liveme.cms.ksmobile.net\\\\/live\\\\/announcement\\\\/2017-08-18_19:44:54\\\\/ins.png",
                    "icon_new": "http:\\\\/\\\\/liveme.cms.ksmobile.net\\\\/live\\\\/announcement\\\\/2017-10-20_22:24:38\\\\/4.png",
                    "videoid": "15154610218328614178",
                    "content": "FOLLOW ME PLEASE",
                    "x_coordinate": "0.22",
                    "y_coordinate": "0.23"
                }"""),
            'announcement_shop': ''
        }]
    },
    'msg': '',
    'status': '200'
}
*** Note that the data in the announcement key is actually more json data, which I've laid out on separate lines.
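As a small illustration of the json.dumps route, here is a sketch that pretty-prints a cut-down stand-in for the response. The video_info list is left empty for brevity, which is an assumption of this example and not the real data.

```python
import json

# Cut-down stand-in for the decoded response (assumption: real data has more keys).
data = {"status": "200", "msg": "", "data": {"time": "1515580011", "video_info": []}}

# indent=4 puts each key on its own line; sort_keys makes the layout predictable.
pretty = json.dumps(data, indent=4, sort_keys=True)
print(pretty)
```

The printed text is still valid JSON, so it can be pasted into any JSON viewer if the nesting is hard to follow in a terminal.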
First, find out where your data resides. You're looking for the data in the content key, which is accessed by the announcement key, which is part of a dictionary inside a list of dicts, which can be accessed by the video_info key, which is in turn accessed by data.
So, in summary, "descend" the ladder that is "data" using the following "rungs" -
data, a dictionary
video_info, a list of dicts
announcement, a JSON string in the first dict of the list of dicts
content, a key inside that embedded JSON data.
First,
i = data['data']
Next,
j = i['video_info']
Next,
k = j[0] # since this is a list
If you only want the first element, this suffices. Otherwise, you'd need to iterate:
for k in j:
    ...
Next,
l = k['announcement']
Now, l is JSON data. Load it -
import json
m = json.loads(l)
Lastly,
content = m['content']
print(content)
FOLLOW ME PLEASE
This should hopefully serve as a guide should you have future queries of this nature.
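Putting the rungs together for every entry of video_info, rather than only the first, might look like the sketch below. The sample dict is a trimmed-down stand-in for the real response, and the function name collect_contents is hypothetical.

```python
import json


def collect_contents(response):
    """Collect 'content' from the embedded announcement JSON of each video_info entry."""
    contents = []
    for entry in response["data"]["video_info"]:
        raw = entry.get("announcement")
        if not raw:
            # Skip entries whose announcement string is missing or empty.
            continue
        announcement = json.loads(raw)  # the value is itself a JSON document
        contents.append(announcement.get("content"))
    return contents


# Trimmed-down sample: the announcement value is a JSON string, as in the question.
sample = {
    "status": "200",
    "data": {
        "video_info": [
            {"announcement": json.dumps({"announcement_id": "6", "content": "FOLLOW ME PLEASE"})},
            {"announcement": ""},
        ]
    },
}

print(collect_contents(sample))
```

Skipping empty announcement strings is a design choice of this sketch; calling json.loads on an empty string would raise json.JSONDecodeError.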
You have nested JSON data; the string associated with the 'announcement' key is itself another, separate, embedded JSON document.
You'll have to decode that string first:
import json
replay_data = raw_replay_data['data']['video_info'][0]
announcement = json.loads(replay_data['announcement'])
print(announcement['content'])
then handle the resulting dictionary from there.
The content of "announcement" is another JSON string. Decode it and then access its contents as you were doing with the outer objects.

Getting specific field values from Json Python

I have a JSON file, and I am trying to get the specific field '_id' from it. The problem is that when I use json.load(input_file), it says that my variable data is a list, not a dictionary, so I can't do something like:
for value in data['_id']:
print(data['_id'][i])
because I keep getting this error: TypeError: list indices must be integers or slices, not str
What I also tried to do is:
data = json.load(input_file)[0]
It kinda works. Now, my type is a dictionary, and I can access like this: data['_id']
But I only get the first '_id' from the archive...
So, what I would like to do is add all the '_id' values to a list, to use later.
input_file = open('input_file.txt')
data = json.load(input_file)[0]
print(data['_id'])# only shows me the first '_id' value
Thanks for the help!
[{
"_id": "5436e3abbae478396759f0cf",
"name": "ISIC_0000000",
"updated": "2015-02-23T02:48:17.495000+00:00"
},
{
"_id": "5436e3acbae478396759f0d1",
"name": "ISIC_0000001",
"updated": "2015-02-23T02:48:27.455000+00:00"
},
{
"_id": "5436e3acbae478396759f0d3",
"name": "ISIC_0000002",
"updated": "2015-02-23T02:48:37.249000+00:00"
},
{
"_id": "5436e3acbae478396759f0d5",
"name": "ISIC_0000003",
"updated": "2015-02-23T02:48:46.021000+00:00"
}]
You want to print the _id of each element of your json list, so let's do it by simply iterating over the elements:
import json

with open('input_file.txt') as input_file:
    data = json.load(input_file)  # get the data list
for element in data:  # iterate over each element of the list
    # element is a dict
    _id = element['_id']  # get the id (named _id to avoid shadowing the built-in id)
    print(_id)  # print it
If you want to transform the list of elements into a list of ids for later use, you can use a list comprehension:
ids = [e['_id'] for e in data]  # get the id from each element and build a list of them
As you can see, data is a list of dictionaries. To loop over it, use the following code:
for each in data:
    print(each['_id'])
    print(each['name'])
    print(each['updated'])
it says that my variable data is a list, not a dictionary, so I can't do something like:
for value in data['_id']:
print(data['_id'][i])
Yes, but you can loop over all the dictionaries in your list and get the values for their '_id' keys. This can be done in a single line using list comprehension:
data = json.load(input_file)
ids = [value['_id'] for value in data]
print(ids)
['5436e3abbae478396759f0cf', '5436e3acbae478396759f0d1', '5436e3acbae478396759f0d3', '5436e3acbae478396759f0d5']
Another way to achieve this is with Python's built-in map function:
ids = list(map(lambda value: value['_id'], data))
The lambda expression defines a function that returns the value of the '_id' key from a dictionary, and map applies it to every item in data. In Python 3, map returns a lazy iterator, so wrap it in list() to get an actual list.
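For what it's worth, operator.itemgetter from the standard library can stand in for the lambda. The sketch below uses a shortened copy of the question's records, plus one made-up record without an '_id' to show how such entries could be skipped.

```python
from operator import itemgetter

# Shortened copy of the question's data (the "updated" field is omitted).
data = [
    {"_id": "5436e3abbae478396759f0cf", "name": "ISIC_0000000"},
    {"_id": "5436e3acbae478396759f0d1", "name": "ISIC_0000001"},
]

# itemgetter("_id") builds a callable equivalent to lambda d: d["_id"].
ids = list(map(itemgetter("_id"), data))
print(ids)

# Hypothetical extra record lacking "_id": a filtered comprehension skips it
# instead of raising KeyError.
mixed = data + [{"name": "no id here"}]
ids_safe = [d["_id"] for d in mixed if "_id" in d]
print(ids_safe)
```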

Extracting JSON element

I have a JSON response from the website shown below. I want to print the 'value' and 'datetime' keys of data. I am not able to access these two elements in JSON response.
data= {"parameter_name":"Inst",
"parameter_code":"WL1","data":[
{"value":3.1289999485,"datetime":"2018-07-01T00:00:00+00:00"},
{"datetime":"2018-07-01T00:30:00+00:00","value":3.1859998703},
{"value":3.33099985123,"datetime":"2018-07-01T00:45:00+00:00"},
{"datetime":"2018-07-01T01:15:00+00:00","value":3.22300004959},
{"datetime":"2018-07-01T01:45:00+00:00","value":3.32299995422}]}
my code till now
for element in len(data['data']):
    date = element['datetime']
    value = element['value']
    print value, date
I am getting error
for element in len(data['data']):
TypeError: string indices must be integers, not str
What you've shown as your JSON data is likely not the actual value of data. If you attempt to access the data like a Python dict, it raises TypeError: string indices must be integers, not str. Your JSON data probably looks like this (notice the quotes):
# This is JSON, essentially a string in the format of a Python dict.
data = """{
"parameter_name": "Inst",
"parameter_code": "WL1",
"data":[
{
"value":3.1289999485,
"datetime":"2018-07-01T00:00:00+00:00"
},
{
"datetime":"2018-07-01T00:30:00+00:00",
"value":3.1859998703
},
{
"value":3.33099985123,
"datetime":"2018-07-01T00:45:00+00:00"
},
{
"datetime":"2018-07-01T01:15:00+00:00",
"value":3.22300004959
},
{
"datetime":"2018-07-01T01:45:00+00:00",
"value":3.32299995422
}
]
}"""
Convert it into a Python dict by using the Python Standard Library json package:
import json
# This converts the JSON string into a Python dict
data = json.loads(data)
You can then access the data and its 'data' key and iterate over it (like you were doing):
for element in data['data']:
    print(element['value'], element['datetime'])
You can try like this:
for element in data['data']:
    date = element['datetime']
    value = element['value']
    print(value)
    print(date)
Output:
3.1289999485
2018-07-01T00:00:00+00:00
3.1859998703
2018-07-01T00:30:00+00:00
3.33099985123
2018-07-01T00:45:00+00:00
3.22300004959
2018-07-01T01:15:00+00:00
3.32299995422
2018-07-01T01:45:00+00:00
Explanation:
If you want to iterate over the elements in the list directly:
for element in data['data']:
If you want to iterate over the list by index:
for index in range(len(data['data'])):
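When both the index and the element are needed, enumerate is usually tidier than range(len(...)). A minimal sketch, using only the first two readings from the question as sample data:

```python
# First two readings from the question, already decoded into a dict.
data = {"data": [
    {"value": 3.1289999485, "datetime": "2018-07-01T00:00:00+00:00"},
    {"datetime": "2018-07-01T00:30:00+00:00", "value": 3.1859998703},
]}

rows = []
for index, element in enumerate(data["data"]):
    # Key order inside each dict does not matter; look values up by name.
    rows.append((index, element["datetime"], element["value"]))
    print(index, element["datetime"], element["value"])
```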
If you have a web response in text format, you also have to decode it first. See
https://docs.python.org/2/library/json.html (for Python 2) or https://docs.python.org/3.7/library/json.html (for Python 3) for the documentation of the json library.
You have to:
import json
decodedData = json.loads(data)
and then loop over decodedData as you've done.
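As a rough end-to-end sketch, assume the response body arrives as UTF-8 bytes. The raw value below is a hand-written stand-in with a single reading, not an actual server response.

```python
import json

# Stand-in for a response body (assumption: UTF-8 encoded JSON bytes).
raw = b'{"parameter_code": "WL1", "data": [{"value": 3.1289999485, "datetime": "2018-07-01T00:00:00+00:00"}]}'

text = raw.decode("utf-8")   # bytes -> str
decoded = json.loads(text)   # str -> dict

# Collect (datetime, value) pairs from the decoded structure.
pairs = [(e["datetime"], e["value"]) for e in decoded["data"]]
print(pairs)
```

On Python 3.6 and later, json.loads also accepts bytes directly, so the explicit decode step is optional there.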
