I would like to print every value that belongs to my id key in my json file. I'm using the below code to print the whole file:
import json
from pprint import pprint
with open('f:\Docs\testList.json') as data_file:
data = json.load(data_file)
pprint( data )
And here is the json
{ "clubs": [
{
"location": "Dallas",
"id": "013325K52",
"type": "bar"
},
{
"location": "Dallas",
"id": "763825X56",
"type": "restaurant"
}
] }
It works correctly, however I can't figure out the type of the data_file and data variables, therefore I have no idea how could I write a for loop to print the content. I'm really new to Python, but in pseudo code I would do something like this (if I assume data is an array (or Python list) of dictionary objects):
for dictionaryVar in jsonArray
print dictionaryVar["id"]
or
for dictionaryVar in jsonArray
if dictionaryVar containsKey: "id"
print dictionaryVar["id"]
I would really appreciate if somebody could show me the right way or give guidance, because I don't really have an idea. I checked the docs of the json module, but couldn't figure out what it really does and how.
data_file is a TextIOWrapper, that is: a file object used for reading text. You should not care about it.
data is a dict. Dictionaries map key-value pairs, but you probably already know that. data has one key, "clubs", which maps to a list. This list contains other dictionaries.
Your pseudo-code:
for dictionaryVar in jsonArray
print dictionaryVar["id"]
corresponds to the following Python code:
for item in data['clubs']:
print item['id']
Your pseudo-code:
for dictionaryVar in jsonArray
if dictionaryVar containsKey: "id"
print dictionaryVar["id"]
corresponds to the following Python code:
for item in data['clubs']:
if 'id' in item:
print item['id']
This should be fairly simple, here is a quite explicit way of doing it (i made it longer so it's clearer for you, you could do this more efficiently)
import json
from pprint import pprint
with open('f:\Docs\testList.json') as data_file:
data = json.load(data_file)
clubs = data['clubs']
for club in clubs:
# Use dict.get() here to default value to None if
# it doesn't exist
club_id = club.get('id', None)
if club_id is not None:
print club_id
Related
I have some JSON data like:
{
"status": "200",
"msg": "",
"data": {
"time": "1515580011",
"video_info": [
{
"announcement": "{\"announcement_id\":\"6\",\"name\":\"INS\\u8d26\\u53f7\",\"icon\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-08-18_19:44:54\\\/ins.png\",\"icon_new\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-10-20_22:24:38\\\/4.png\",\"videoid\":\"15154610218328614178\",\"content\":\"FOLLOW ME PLEASE\",\"x_coordinate\":\"0.22\",\"y_coordinate\":\"0.23\"}",
"announcement_shop": "",
etc.
How do I grab the content "FOLLOW ME PLEASE"? I tried using
replay_data = raw_replay_data['data']['video_info'][0]
announcement = replay_data['announcement']
But now announcement is a string representing more JSON data. I can't continue indexing announcement['content'] results in TypeError: string indices must be integers.
How can I get the desired string in the "right" way, i.e. respecting the actual structure of the data?
In a single line -
>>> json.loads(data['data']['video_info'][0]['announcement'])['content']
'FOLLOW ME PLEASE'
To help you understand how to access data (so you don't have to ask again), you'll need to stare at your data.
First, let's lay out your data nicely. You can either use json.dumps(data, indent=4), or you can use an online tool like JSONLint.com.
{
'data': {
'time': '1515580011',
'video_info': [{
'announcement': ( # ***
"""{
"announcement_id": "6",
"name": "INS\\u8d26\\u53f7",
"icon": "http:\\\\/\\\\/liveme.cms.ksmobile.net\\\\/live\\\\/announcement\\\\/2017-08-18_19:44:54\\\\/ins.png",
"icon_new": "http:\\\\/\\\\/liveme.cms.ksmobile.net\\\\/live\\\\/announcement\\\\/2017-10-20_22:24:38\\\\/4.png",
"videoid": "15154610218328614178",
"content": "FOLLOW ME PLEASE",
"x_coordinate": "0.22",
"y_coordinate": "0.23"
}"""),
'announcement_shop': ''
}]
},
'msg': '',
'status': '200'
}
*** Note that the data in the announcement key is actually more json data, which I've laid out on separate lines.
First, find out where your data resides. You're looking for the data in the content key, which is accessed by the announcement key, which is part of a dictionary inside a list of dicts, which can be accessed by the video_info key, which is in turn accessed by data.
So, in summary, "descend" the ladder that is "data" using the following "rungs" -
data, a dictionary
video_info, a list of dicts
announcement, a dict in the first dict of the list of dicts
content residing as part of json data.
First,
i = data['data']
Next,
j = i['video_info']
Next,
k = j[0] # since this is a list
If you only want the first element, this suffices. Otherwise, you'd need to iterate:
for k in j:
...
Next,
l = k['announcement']
Now, l is JSON data. Load it -
import json
m = json.loads(l)
Lastly,
content = m['content']
print(content)
'FOLLOW ME PLEASE'
This should hopefully serve as a guide should you have future queries of this nature.
You have nested JSON data; the string associated with the 'annoucement' key is itself another, separate, embedded JSON document.
You'll have to decode that string first:
import json
replay_data = raw_replay_data['data']['video_info'][0]
announcement = json.loads(replay_data['announcement'])
print(announcement['content'])
then handle the resulting dictionary from there.
The content of "announcement" is another JSON string. Decode it and then access its contents as you were doing with the outer objects.
I have some JSON data like:
{
"status": "200",
"msg": "",
"data": {
"time": "1515580011",
"video_info": [
{
"announcement": "{\"announcement_id\":\"6\",\"name\":\"INS\\u8d26\\u53f7\",\"icon\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-08-18_19:44:54\\\/ins.png\",\"icon_new\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-10-20_22:24:38\\\/4.png\",\"videoid\":\"15154610218328614178\",\"content\":\"FOLLOW ME PLEASE\",\"x_coordinate\":\"0.22\",\"y_coordinate\":\"0.23\"}",
"announcement_shop": "",
etc.
How do I grab the content "FOLLOW ME PLEASE"? I tried using
replay_data = raw_replay_data['data']['video_info'][0]
announcement = replay_data['announcement']
But now announcement is a string representing more JSON data. I can't continue indexing announcement['content'] results in TypeError: string indices must be integers.
How can I get the desired string in the "right" way, i.e. respecting the actual structure of the data?
In a single line -
>>> json.loads(data['data']['video_info'][0]['announcement'])['content']
'FOLLOW ME PLEASE'
To help you understand how to access data (so you don't have to ask again), you'll need to stare at your data.
First, let's lay out your data nicely. You can either use json.dumps(data, indent=4), or you can use an online tool like JSONLint.com.
{
'data': {
'time': '1515580011',
'video_info': [{
'announcement': ( # ***
"""{
"announcement_id": "6",
"name": "INS\\u8d26\\u53f7",
"icon": "http:\\\\/\\\\/liveme.cms.ksmobile.net\\\\/live\\\\/announcement\\\\/2017-08-18_19:44:54\\\\/ins.png",
"icon_new": "http:\\\\/\\\\/liveme.cms.ksmobile.net\\\\/live\\\\/announcement\\\\/2017-10-20_22:24:38\\\\/4.png",
"videoid": "15154610218328614178",
"content": "FOLLOW ME PLEASE",
"x_coordinate": "0.22",
"y_coordinate": "0.23"
}"""),
'announcement_shop': ''
}]
},
'msg': '',
'status': '200'
}
*** Note that the data in the announcement key is actually more json data, which I've laid out on separate lines.
First, find out where your data resides. You're looking for the data in the content key, which is accessed by the announcement key, which is part of a dictionary inside a list of dicts, which can be accessed by the video_info key, which is in turn accessed by data.
So, in summary, "descend" the ladder that is "data" using the following "rungs" -
data, a dictionary
video_info, a list of dicts
announcement, a dict in the first dict of the list of dicts
content residing as part of json data.
First,
i = data['data']
Next,
j = i['video_info']
Next,
k = j[0] # since this is a list
If you only want the first element, this suffices. Otherwise, you'd need to iterate:
for k in j:
...
Next,
l = k['announcement']
Now, l is JSON data. Load it -
import json
m = json.loads(l)
Lastly,
content = m['content']
print(content)
'FOLLOW ME PLEASE'
This should hopefully serve as a guide should you have future queries of this nature.
You have nested JSON data; the string associated with the 'annoucement' key is itself another, separate, embedded JSON document.
You'll have to decode that string first:
import json
replay_data = raw_replay_data['data']['video_info'][0]
announcement = json.loads(replay_data['announcement'])
print(announcement['content'])
then handle the resulting dictionary from there.
The content of "announcement" is another JSON string. Decode it and then access its contents as you were doing with the outer objects.
I have some JSON data like:
{
"status": "200",
"msg": "",
"data": {
"time": "1515580011",
"video_info": [
{
"announcement": "{\"announcement_id\":\"6\",\"name\":\"INS\\u8d26\\u53f7\",\"icon\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-08-18_19:44:54\\\/ins.png\",\"icon_new\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-10-20_22:24:38\\\/4.png\",\"videoid\":\"15154610218328614178\",\"content\":\"FOLLOW ME PLEASE\",\"x_coordinate\":\"0.22\",\"y_coordinate\":\"0.23\"}",
"announcement_shop": "",
etc.
How do I grab the content "FOLLOW ME PLEASE"? I tried using
replay_data = raw_replay_data['data']['video_info'][0]
announcement = replay_data['announcement']
But now announcement is a string representing more JSON data. I can't continue indexing announcement['content'] results in TypeError: string indices must be integers.
How can I get the desired string in the "right" way, i.e. respecting the actual structure of the data?
In a single line -
>>> json.loads(data['data']['video_info'][0]['announcement'])['content']
'FOLLOW ME PLEASE'
To help you understand how to access data (so you don't have to ask again), you'll need to stare at your data.
First, let's lay out your data nicely. You can either use json.dumps(data, indent=4), or you can use an online tool like JSONLint.com.
{
'data': {
'time': '1515580011',
'video_info': [{
'announcement': ( # ***
"""{
"announcement_id": "6",
"name": "INS\\u8d26\\u53f7",
"icon": "http:\\\\/\\\\/liveme.cms.ksmobile.net\\\\/live\\\\/announcement\\\\/2017-08-18_19:44:54\\\\/ins.png",
"icon_new": "http:\\\\/\\\\/liveme.cms.ksmobile.net\\\\/live\\\\/announcement\\\\/2017-10-20_22:24:38\\\\/4.png",
"videoid": "15154610218328614178",
"content": "FOLLOW ME PLEASE",
"x_coordinate": "0.22",
"y_coordinate": "0.23"
}"""),
'announcement_shop': ''
}]
},
'msg': '',
'status': '200'
}
*** Note that the data in the announcement key is actually more json data, which I've laid out on separate lines.
First, find out where your data resides. You're looking for the data in the content key, which is accessed by the announcement key, which is part of a dictionary inside a list of dicts, which can be accessed by the video_info key, which is in turn accessed by data.
So, in summary, "descend" the ladder that is "data" using the following "rungs" -
data, a dictionary
video_info, a list of dicts
announcement, a dict in the first dict of the list of dicts
content residing as part of json data.
First,
i = data['data']
Next,
j = i['video_info']
Next,
k = j[0] # since this is a list
If you only want the first element, this suffices. Otherwise, you'd need to iterate:
for k in j:
...
Next,
l = k['announcement']
Now, l is JSON data. Load it -
import json
m = json.loads(l)
Lastly,
content = m['content']
print(content)
'FOLLOW ME PLEASE'
This should hopefully serve as a guide should you have future queries of this nature.
You have nested JSON data; the string associated with the 'annoucement' key is itself another, separate, embedded JSON document.
You'll have to decode that string first:
import json
replay_data = raw_replay_data['data']['video_info'][0]
announcement = json.loads(replay_data['announcement'])
print(announcement['content'])
then handle the resulting dictionary from there.
The content of "announcement" is another JSON string. Decode it and then access its contents as you were doing with the outer objects.
import json
data ='''
{
"names": {"first_boy" : "khaled"},
"names": {"second_boy" : "waseem"}
}
'''
info = json.loads(data)
for line in info:
print(info["names"])
I expected it to print the first_boy and the second_boy dictionary ,but it's printing
{'second_boy': 'waseem'}
Dicts in python can only support one of the same key. Similarly, most implementations of JSON do not allow duplicate keys. The way python handles this, when using json.loads() (or anything else that constructs a dict) is to simply use the most recent definition of any given key.
In this case, {"second_boy":"waseem"} overwrites {"first_boy":"khaled"}.
The problem here is that the key "names" exists 2 times.
Maybe you can do this:
import json
data ='''
{
"names": {"first_boy" : "khaled",
"second_boy" : "waseem"}
}
'''
info = json.loads(data)
for key, value in info['names'].items():
print(key, value)
Trying to create a 4 deep nested JSON from a csv based upon this example:
Region,Company,Department,Expense,Cost
Gondwanaland,Bobs Bits,Operations,nuts,332
Gondwanaland,Bobs Bits,Operations,bolts,254
Gondwanaland,Maureens Melons,Operations,nuts,123
At each level I would like to sum the costs and include it in the outputted JSON at the relevant level.
The structure of the outputted JSON should look something like this:
{
"id": "aUniqueIdentifier",
"name": "usually a nodes name",
"data": [
{
"key": "some key",
"value": "some value"
},
{
"key": "some other key",
"value": "some other value"
}
],
"children": [/* other nodes or empty */ ]
}
(REF: http://blog.thejit.org/2008/04/27/feeding-json-tree-structures-to-the-jit/)
Thinking along the lines of a recursive function in python but have not had much success with this approach so far... any suggestions for a quick and easy solution greatly appreciated?
UPDATE:
Gradually giving up on the idea of the summarised costs because I just can't figure it out :(. I'not much of a python coder yet)! Simply being able to generate the formatted JSON would be good enough and I can plug in the numbers later if I have to.
Have been reading, googling and reading for a solution and on the way have learnt a lot but still no success in creating my nested JSON files from the above CSV strucutre. Must be a simple solution somewhere on the web? Maybe somebody else has had more luck with their search terms????
Here are some hints.
Parse the input to a list of lists with csv.reader:
>>> rows = list(csv.reader(source.splitlines()))
Loop over the list to buildi up your dictionary and summarize the costs. Depending on the structure you're looking to create the build-up might look something like this:
>>> summary = []
>>> for region, company, department, expense, cost in rows[1:]:
summary.setdefault(*region, company, department), []).append((expense, cost))
Write the result out with json.dump:
>>> json.dump(summary, open('dest.json', 'wb'))
Hopefully, the recursive function below will help get you started. It builds a tree from the input. Please be aware of what type you want your leaves to be in, which we label as the "cost". You'll need to elaborate on the function to build-up the exact structure you intend:
import csv, itertools, json
def cluster(rows):
result = []
for key, group in itertools.groupby(rows, key=lambda r: r[0]):
group_rows = [row[1:] for row in group]
if len(group_rows[0]) == 2:
result.append({key: dict(group_rows)})
else:
result.append({key: cluster(group_rows)})
return result
if __name__ == '__main__':
s = '''\
Gondwanaland,Bobs Bits,Operations,nuts,332
Gondwanaland,Bobs Bits,Operations,bolts,254
Gondwanaland,Maureens Melons,Operations,nuts,123
'''
rows = list(csv.reader(s.splitlines()))
r = cluster(rows)
print json.dumps(r, indent=4)