I am trying to parse the following student grade report JSON using Python
{
"report":[
{
"enrollment": "rit2011001",
"name": "Julia",
"subject":[
{
"code": "DSA",
"grade": "A"
}
]
},
{
"enrollment": "rit2011020",
"name": "Samantha",
"subject":[
{
"code": "COM",
"grade": "B"
},
{
"code": "DSA",
"grade": "A"
}
]
}
]
}
Such that the report should be ordered ascending first by code, then by grade and then by enrollment. Here is how the output should be
COM B rit2011020 Samantha
DSA A rit2011001 Julia
DSA A rit2011020 Samantha
Here is my incomplete piece of code that I need help with:
import json
data='''{
"report":[
{
"enrollment": "rit2011001",
"name": "Julia",
"subject":[
{
"code": "DSA",
"grade": "A"
}
]
},
{
"enrollment": "rit2011020",
"name": "Samantha",
"subject":[
{
"code": "COM",
"grade": "B"
},
{
"code": "DSA",
"grade": "A"
}
]
}
]
}'''
print data #for debug
parsed_json = json.loads(data)
print parsed_json #for debug
for key,value in sorted(parsed_json.items()):
print key,value
I don't know how to apply successive filtering to achieve the result.
Try using a nested loop, with a print:
import json
data='''{
"report":[
{
"enrollment": "rit2011001",
"name": "Julia",
"subject":[
{
"code": "DSA",
"grade": "A"
}
]
},
{
"enrollment": "rit2011020",
"name": "Samantha",
"subject":[
{
"code": "COM",
"grade": "B"
},
{
"code": "DSA",
"grade": "A"
}
]
}
]
}'''
print data #for debug
parsed_json = json.loads(data)
print parsed_json #for debug
for i in parsed_json['report']:
for x in i['subject']:
print x['code'],x['grade'],i['enrollment'],i['name']
Output:
DSA A rit2011001 Julia
COM B rit2011020 Samantha
DSA A rit2011020 Samantha
If care about the order of the frame:
import json
data='''{
"report":[
{
"enrollment": "rit2011001",
"name": "Julia",
"subject":[
{
"code": "DSA",
"grade": "A"
}
]
},
{
"enrollment": "rit2011020",
"name": "Samantha",
"subject":[
{
"code": "COM",
"grade": "B"
},
{
"code": "DSA",
"grade": "A"
}
]
}
]
}'''
print data #for debug
parsed_json = json.loads(data)
print parsed_json #for debug
l=[]
for i in parsed_json['report']:
for x in i['subject']:
l.append(' '.join([x['code'],x['grade'],i['enrollment'],i['name']]))
print('\n'.join(sorted(l)))
If you are willing to use a very popular external library for data analysis then you can use pandas with json_normalize(), e.g.:
In []:
from pandas.io.json import json_normalize
df = json_normalize(parsed_json['report'], 'subject', ['enrollment', 'name'])
df.sort_values(['code', 'grade', 'enrollment']).reset_index(drop=True)
Out[]:
code grade enrollment name
0 COM B rit2011020 Samantha
1 DSA A rit2011001 Julia
2 DSA A rit2011020 Samantha
Related
I am creating sample JSON as shown below. I want to retrieve value by using on certain condition.
Observe below line of statement:
'$.Attributes..Attribute[?(#.setcode=="x")]'
But I am unable to retrieve the value by using code.
"{
"Attributes": [
{
"Attribute": {
"setcode": "x",
"codeType": "movieType",
"AttributeGroup": [
{
"code": "Force",
"codeValueValue": "'I'"
},
{
"code": "Remain",
"codeValueValue": "'P'"
},
{
"code": "Died",
"codeValueValue": "'E'"
},
{
"code": "Renew",
"codeValueValue": "'R'"
}
]
}
},
{
"Attribute":{
"setcode": "y",
"codeType": "movietype",
"AttributeGroup": [
{
"code": "Force",
"codeValueValue": "'I'"
},
{
"code": "Remain",
"codeValueValue": "'P'"
},
{
"code": "Died",
"codeValueValue": "'E'"
},
{
"code": "Renew",
"codeValueValue": "'R'"
}
]
}
}
]
}"
While your path returns something using a common implementation in Java (Jayway), I can confirm no result is returned using it with jsonpath-ng. The issue is the "filter op can only be used on lists" and, it looks like, requires to start with the iterable beforehand in the path:
We have to filter on the Attributes-array and pull the Attribute afterward instead:
$.Attributes[?(#.Attribute.setcode=="x")].Attribute
Full code example:
import json
json_string = """{
"Attributes": [
{
"Attribute": {
"setcode": "x",
"codeType": "movieType",
"AttributeGroup": [
{
"code": "Force",
"codeValueValue": "'I'"
},
{
"code": "Remain",
"codeValueValue": "'P'"
},
{
"code": "Died",
"codeValueValue": "'E'"
},
{
"code": "Renew",
"codeValueValue": "'R'"
}
]
}
},
{
"Attribute":{
"setcode": "y",
"codeType": "movietype",
"AttributeGroup": [
{
"code": "Force",
"codeValueValue": "'I'"
},
{
"code": "Remain",
"codeValueValue": "'P'"
},
{
"code": "Died",
"codeValueValue": "'E'"
},
{
"code": "Renew",
"codeValueValue": "'R'"
}
]
}
}
]
}"""
json_data = json.loads(json_string)
from jsonpath_ng.ext import parser
query = [x.value for x in parser.parse('$.Attributes[?(#.Attribute.setcode=="x")].Attribute').find(json_data)]
print(json.dumps(query))
I have passed a CSV file to JSON. I have something like this:
[{
"Intent": "About",
"Texto": "quien"
},
{
"Intent": "NP_Contacto",
"Texto": "atencion"
},
{
"Intent": "NP_Contacto",
"Texto": "informacion?"
},
{
"Intent": "NP_Contacto",
"Texto": "hablar"
},
{
"Intent": "NP_Contacto",
"Texto": "numero"
}]
but I want this:
[{ "intent": "NP_Contacto",
"texto": [{
"text": "telefono"
},
{
"text": "hablar"
},
{
"text": "número"
},
{
"text": "atención"
},
{
"text": "informacion"
}
]
},
{
"Intent": "About",
"Texto": "quien"
},
]
The code I have used to create this is:
reader = csv.DictReader(csvfile, fieldnames=("Texto", "Intent"))
fieldnames = ("Texto", "Intent")
output = []
for each in reader:
row = {} for field in fieldnames:
row[field] = each[field]
output.append(row)
json.dump(output, jsonfile, indent=3, sort_keys=True)
I think you are just trying to regroup it.
So try something like:
for i in json:
for j,k in i.items():
...
All,
I am trying to change the way some json looks by going through and formatting it in the following way:
1. flatten all of the fields lists
2. Then remove the fields lists and replace them with the name : flatten list
Example:
{
"name": "",
"fields": [{
"name": "keys",
"fields": [{
"node-name": "0/0/CPU0"
},
{
"interface-name": "TenGigE0/0/0/47"
},
{
"device-id": "ASR9K-H1902.corp.cisco.com"
}
]
},
{
"name": "content",
"fields": [{
"name": "lldp-neighbor",
"fields": [{
"receiving-interface-name": "TenGigE0/0/0/47"
},
{
"receiving-parent-interface-name": "Bundle-Ether403"
},
{
"device-id": "ASR9K-H1902.corp.cisco.com"
},
{
"chassis-id": "78ba.f975.a64f"
},
{
"port-id-detail": "Te0/1/0/4/0"
},
{
"header-version": 0
},
{
"hold-time": 120
},
{
"enabled-capabilities": "R"
},
{
"platform": ""
}
]
}]
}
]
}
Would turn into:
{
"": [{
"keys": [{
"node-name": "0/0/CPU0",
"interface-name": "TenGigE0/0/0/47",
"device-id": "ASR9K-H1902.corp.cisco.com"
}]
},
{
"content": [{
"lldp-neighbor": [{
"receiving-interface-name": "TenGigE0/0/0/47",
"receiving-parent-interface-name": "Bundle-Ether403",
"device-id": "ASR9K-H1902.corp.cisco.com",
"chassis-id": "78ba.f975.a64f",
"port-id-detail": "Te0/1/0/4/0",
"header-version": 0,
"hold-time": 120,
"enabled-capabilities": "R",
"platform": ""
}]
}]
}
]
}
I have tried the following to get the list flattened:
def _flatten_fields(self, fields_list):
c = {}
for b in [d for d in fields_list if bool(d)]:
c.update(b)
return c
This seems to work but I can't figure out a way to get into the sub levels using recursion, I am saving all flatten lists and names into a new dictionary, is there a way to do it by just manipulating the original dictionary?
This worked on the example you provided:
import json
def flatten(data):
result = dict()
if isinstance(data, dict):
if 'name' in data:
name = data['name']
result[name] = flatten(data['fields'])
else:
key = data.keys()[0]
value = data.values()[0]
result[key] = value
else:
for entry in data:
result.update(flatten(entry))
return result
print json.dumps(flatten(data), indent=4)
Output
{
"": {
"keys": {
"node-name": "0/0/CPU0",
"interface-name": "TenGigE0/0/0/47",
"device-id": "ASR9K-H1902.corp.cisco.com"
},
"content": {
"lldp-neighbor": {
"receiving-interface-name": "TenGigE0/0/0/47",
"receiving-parent-interface-name": "Bundle-Ether403",
"header-version": 0,
"port-id-detail": "Te0/1/0/4/0",
"chassis-id": "78ba.f975.a64f",
"platform": "",
"device-id": "ASR9K-H1902.corp.cisco.com",
"hold-time": 120,
"enabled-capabilities": "R"
}
}
}
}
It doesn't have the extra list layers shown in your expected output, but I don't think you want those.
This worked on the example you provided:
def flatten_fields(fields_list):
c = {}
for item in fields_list:
for key in item:
if key == "fields":
c[item["name"]] = flatten_fields(item["fields"])
elif key != "name":
c[key] = item[key]
break
return [c]
But it works on a list of dictionaries, so you should call it like flatten_fields([data])[0].
The output is:
{
"": [{
"keys": [{
"node-name": "0/0/CP0",
"interface-name": "TenGigE0/0/0/47",
"device-id": "ASR9K-H1902.corp.cisco.com"
}],
"content": [{
"lldp-neighbor": [{
"chassis-id": "78ba.f975.a64f",
"receiving-parent-interface-name": "Bndle-Ether403",
"enabled-capabilities": "R",
"device-id": "ASR9K-H1902.corp.cisco.com",
"hold-time": 120,
"receiving-interface-name": "TenGigE0/0/0/47",
"platform": "",
"header-version": 0,
"port-id-detail": "Te0/1/0/4/0"
}]
}]
}]
}
I have json file i what to read all the values
data=""" {"employees":[
{"firstName":"John", "lastName":"Doe"},
{"firstName":"Anna", "lastName":"Smith"},
{"firstName":"Peter", "lastName":"Jones"}
]}
{
"maps":[
{"id":"apple","iscategorical":"0"},
{"id":"ball","iscategorical":"0"}
],
"mask":{"id1":"aaaaa"},
"mask":{"id1":"bbb"},
"mask":{"id1":"cccccc"},
"om_points":"value",
"parameters":
{"id":"valore"}
}"""
out = json.loads(data)
how to get all values
firstname
lastname
mask.id1
map.id
output:
[(firstname_vaues,lastname_values,mask.id1,map.id)
(firstname_vaues,lastname_values,mask.id1,map.id) ......]
please help me
First thing, there are two json objects in your data string. So you cannot use json.loads(data). You can seperate them by a charcter like ";" . Then split the string and use json.loads on each of them.Use following code.
import json
data=""" {
"employees": [{
"firstName": "John",
"lastName": "Doe"
}, {
"firstName": "Anna",
"lastName": "Smith"
}, {
"firstName": "Peter",
"lastName": "Jones"
}]
};{
"maps": [{
"id": "apple",
"iscategorical": "0"
}, {
"id": "ball",
"iscategorical": "0"
}],
"mask": {
"id1": "aaaaa"
},
"mask": {
"id1": "bbb"
},
"mask": {
"id1": "cccccc"
},
"om_points": "value",
"parameters": {
"id": "valore"
}
}"""
splitdata = data.split(';')
datatop = json.loads(splitdata[0])
databottom = json.loads(splitdata[1])
Then you can access required fields as follows
print(datatop['employees'][0]['firstName'])
print(datatop['employees'][0]['lastName'])
print(databottom['mask']['id1'])
print(databottom['maps'][0]['id'])
In Python I'm currently working with a very large JSON file with some deep dictionaries and arrays. I'm having an issue where it's not constant. For example that's below, it's essentially countries, with regions/states, cities, and suburbs. The issue is that if there is only one suburb, it'll return a dictionary, though if there's more than one, it's a array with a dictionary making me have to add another line of code to go deeper. Sure, can ifelse/for it, but this is only a very small portion of the inconstancy and it's just not proper going ifelse all the time.
What I'd like to do is simply search anything within Belgium for the dictionary entry "code": "8400" and return it's location within the JSON file. What would be my best approach in order to do something like this? Thanks!
***SNIP***
{
"code": "BE",
"name": "Belgium",
"regions": {
"region": [
{
"code": "45",
"name": "Flanders",
"places": {
"place": [
{
"code": "1790",
"name": "Affligem"
},
{
"code": "8570",
"name": "Anzegem"
},
{
"code": "8630",
"name": "Diksmuide"
},
{
"code": "9600",
"name": "Ronse"
}
]
},
"subregions": {
"subregion": [
{
"code": "46",
"name": "Coast",
"places": {
"place": [
{
"code": "8300",
"name": "Knokke-Heist"
},
{
"code": "8400",
"name": "Oostende",
"subplaces": {
"subplace": {
"code": "8450",
"name": "Bredene"
}
}
},
{
"code": "8420",
"name": "De Haan"
},
{
"code": "8430",
"name": "Middelkerke"
},
{
"code": "8434",
"name": "Westende-Bad"
},
{
"code": "8490",
"name": "Jabbeke"
},
{
"code": "8660",
"name": "De Panne"
},
{
"code": "8670",
"name": "Oostduinkerke"
}
]
}
},
{
"code": "47",
"name": "Cities",
"places": {
"place": [
{
"code": "1000",
"name": "Brussels"
},
{
"code": "2000",
"name": "Antwerp"
},
{
"code": "8000",
"name": "Bruges"
},
{
"code": "8340",
"name": "Damme"
},
{
"code": "9000",
"name": "Gent"
}
]
}
},
{
"code": "48",
"name": "Interior",
"places": {
"place": [
{
"code": "2260",
"name": "Westerlo"
},
{
"code": "2400",
"name": "Mol"
},
{
"code": "2590",
"name": "Berlaar"
},
{
"code": "8500",
"name": "Kortrijk",
"subplaces": {
"subplace": {
"code": "8940",
"name": "Wervik"
}
}
},
{
"code": "8610",
"name": "Handzame"
},
{
"code": "8755",
"name": "Ruiselede"
},
{
"code": "8900",
"name": "Ieper"
},
{
"code": "8970",
"name": "Poperinge"
}
]
}
},
EDIT:
I was asked to show how I'm currently getting through this JSON file. Root is a dictionary containing numbers that equal the city/suburb I'm trying to search for. It doesn't define whether it is a city or suburb before hand. Below is my lazyly coded search while I was trying to learn how to dig through this JSON file, until I realized how complicated it was getting and got a bit stuck.
SNIP
for k in dataDict['countries']['country']:
if k['code'] == root['country']:
for y in k['regions']['region']['places']['place']:
if y['code'] == root['place']:
city = y['name']
else:
try:
for p in y['subplaces']['subplace']:
if p['code'] == root['place']:
city = p['name']
except:
pass
If I understand well, each dictionary has the following structure:
{"code": # some int
"name": # some str
none / "country" / "place" / whatever # some dict or list
You can write a recursive function that handle one and only one dict:
def foo(my_dict):
if my_dict['code'] == root['place']:
city = my_dict['name']
elif "country" in my_dict:
city = foo(my_dict['country'])
elif "place" in my_dict:
#
# and so on...
else:
city = None
return city
Hope this example will help you.