Iterate through JSON data with Python - python

I have been working on this code for a few hours, trying a bunch of things to iterate through the supplied JSON data. I can't figure out how to properly iterate through these nested lists and objects.
import json
data = """
{
"tracks": "1",
"timeline": {
"0.733251541": [
{
"id": 1,
"bounds": {
"Width": 0.5099463905313426,
"Height": 0.2867199993133546,
"Y": 0.4436400003433228,
"X": 0.4876505160745349
}
}
],
"0.965": [
{
"id": 1,
"bounds": {
"Width": 0.4205311330135182,
"Height": 0.2363199994340539,
"Y": 0.2393400002829731,
"X": 0.1593787633901481
}
}
],
"1.098224": [
{
"id": 1,
"bounds": {
"Width": 0.4568560813801344,
"Height": 0.2564799993857742,
"Y": 0.1992600003071129,
"X": 0.1000513407532317
}
}
]
},
"taggedTracks": {
"1": "dirk"
}
}
"""
json = json.loads(data)
for a in json["timeline"]:
    for b in a:
        for c in b["bounds"]:
            print a, c["Width"], c["Height"], c["Y"], c["X"]
Can someone please steer me in the right direction on how to deal with the json data supplied?
I get the following error.
Traceback (most recent call last):
File "<stdin>", line 3, in <module>
TypeError: string indices must be integers

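You are getting the TypeError because iterating over json["timeline"] yields only its keys, which are strings, so b["bounds"] ends up indexing a string. In addition, each value of "timeline" is a list, so you have to take the first element of that list, using index 0. Then you can parse the rest.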
Hopefully the following code helps:
import json
data = """
{
"tracks": "1",
"timeline": {
"0.733251541": [
{
"id": 1,
"bounds": {
"Width": 0.5099463905313426,
"Height": 0.2867199993133546,
"Y": 0.4436400003433228,
"X": 0.4876505160745349
}
}
],
"0.965": [
{
"id": 1,
"bounds": {
"Width": 0.4205311330135182,
"Height": 0.2363199994340539,
"Y": 0.2393400002829731,
"X": 0.1593787633901481
}
}
],
"1.098224": [
{
"id": 1,
"bounds": {
"Width": 0.4568560813801344,
"Height": 0.2564799993857742,
"Y": 0.1992600003071129,
"X": 0.1000513407532317
}
}
]
},
"taggedTracks": {
"1": "dirk"
}
}
"""
test_json = json.loads(data)
for num, entries in test_json["timeline"].items():  # use .iteritems() on Python 2
    print(num + ":")
    bounds = entries[0]["bounds"]  # each timeline value is a list; take its first element
    for bound, value in bounds.items():
        print('\t' + bound + ": " + str(value))

First of all, it's not a great idea to use the name json for a variable, since that is the name of the module. Let's use j instead.
Anyway, when you call json.loads(), you get back a dict. When you iterate with for a in <dict>, you get only the keys. You can instead iterate over the keys and values with items() (iteritems() on Python 2), like:
for k, a in j['timeline'].items():
    for b in a:
        c = b['bounds']
        print(k, c["Width"], c["Height"], c["Y"], c["X"])
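For reference, here is a minimal Python 3 sketch that combines both answers: it walks every key of "timeline" and every entry in each list, rather than only the first one. The sample data is a shortened stand-in for the question's JSON, and the names raw, parsed and rows are made up for illustration.

```python
import json

# Shortened stand-in for the JSON in the question (values rounded for brevity).
raw = """
{
  "timeline": {
    "0.733251541": [{"id": 1, "bounds": {"Width": 0.51, "Height": 0.29, "Y": 0.44, "X": 0.49}}],
    "0.965": [{"id": 1, "bounds": {"Width": 0.42, "Height": 0.24, "Y": 0.24, "X": 0.16}}]
  }
}
"""

parsed = json.loads(raw)

rows = []
# The keys are strings such as "0.965"; sort them numerically for a stable order.
for timestamp, entries in sorted(parsed["timeline"].items(), key=lambda kv: float(kv[0])):
    for entry in entries:  # each value is a list of track dicts
        bounds = entry["bounds"]
        rows.append((timestamp, entry["id"], bounds["Width"], bounds["Height"], bounds["Y"], bounds["X"]))

for row in rows:
    print(*row)
```

Collecting tuples first keeps the traversal separate from the printing, which may make it easier to reuse the rows elsewhere.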

Related

Appending all for loop dicts into single list

I just learnt Django. I am getting data from an API, looping through the JSON and appending the data to a list. But when I use the .map() function in React, I see that the data was appended to the list (by the for loop) like this:
[
{
"results": {
"id": 544,
"name": "User_1",
}
},
{
"results": {
"id": 218,
"name": "User_2",
}
},
{
"results": {
"id": 8948,
"name": "User_3",
}
},
{
"results": {
"id": 9,
"name": "User_4",
}
},
]
It is not appended the way I want, which is:
[
results : [
{
"id": 544,
"name": "User_1"
},
{
"id": 218,
"name": "User_2"
},
{
"id": 8948,
"name": "User_3"
},
{
"id": 9,
"name": "User_4"
}
],
"length_of_results": 25,
]
views.py
import requests

def extract_view(request):
    results_list = []
    # api url for explanation only
    get_response = requests.get("https://api.punkapi.com/v2/beers")
    if get_response.status_code == 200:
        for result in get_response.json():
            results_list.append({"results": result})
    results_list.append({"length_of_results": len(results_list)})
    return Response({"data": results_list})
I know that the for loop wraps every result in its own dictionary on each iteration, but I want all the responses to go into a single results list, because I will append another field after the for loop.
I have tried many times but it is still not working.
You can solve it by mapping over the list:
dict(results=list(map(lambda x: x["results"], response)))
Full working example:
response = [
{
"results": {
"id": 544,
"name": "User_1",
}
},
{
"results": {
"id": 218,
"name": "User_2",
}
},
{
"results": {
"id": 8948,
"name": "User_3",
}
},
{
"results": {
"id": 9,
"name": "User_4",
}
},
]
results = dict(results=list(map(lambda x: x["results"], response)))
results["length_of_results"] = len(results["results"])
>> {'results': [{'id': 544, 'name': 'User_1'},
>> {'id': 218, 'name': 'User_2'},
>> {'id': 8948, 'name': 'User_3'},
>> {'id': 9, 'name': 'User_4'}],
>> 'length_of_results': 4}
By doing
results_list.append({"results": result})
you are creating a new dictionary with the value being result which I believe is a dictionary itself. So you should be able to just do this:
if get_response.status_code == 200:
    for result in get_response.json():
        results_list.append(result)
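Putting the two answers together, a minimal sketch of the shape the question asks for could look like the following. The helper name wrap_results and the two sample items are hypothetical; in the view, the list would come from get_response.json().

```python
def wrap_results(items):
    """Return {"results": [...], "length_of_results": n} for an iterable of items."""
    results = list(items)  # one flat list instead of one dict per item
    return {"results": results, "length_of_results": len(results)}


# Hypothetical stand-in for the decoded API response.
api_items = [
    {"id": 544, "name": "User_1"},
    {"id": 218, "name": "User_2"},
]

payload = wrap_results(api_items)
print(payload)
```

Under these assumptions the view could end with something like return Response(wrap_results(get_response.json())), so the extra field is added once, after all results are collected.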

Read JSON index in Python

I want to read a json file with python like this:
{
"id": "27147e64-9ef5-42d8-b32e-b46b19071ee3b84e0e07-669e-4a10-8124-8e0d71a08e7e",
"image": "img0171.png",
"width": 640,
"height": 480,
"tags": [
{
"name": "becks_long_neck",
"parent": null,
"id": "b2d59c98-0bdc-4d13-ad1b-9d4ab5bc1fb3",
"color": "#e62921",
"type": "bounding_box",
"pos": {
"x": 387,
"y": 310.06667073567706,
"w": 62.666666666666686,
"h": 38.219034830729186
}
},
{
"name": "becks_long_neck",
"parent": null,
"id": "75635f60-e6b9-4408-89fb-ed435355dac6",
"color": "#e62921",
"type": "bounding_box",
"pos": {
"x": 358.5,
"y": 354.06667073567706,
"w": 40.833333333333314,
"h": 31.666666666666686
}
}
]
}
To access the second name I tried something like this:
for dictionary in datastore:
    filename = dictionary['image']
    tag = dictionary['tags'][0]['name']
    if(dictionary['tags'][1]['name']):
        tag2 = dictionary['tags'][1]['name']
    print(tag)
    x = dictionary['tags'][0]['pos']['x']
    print(x)
    y = dictionary['tags'][0]['pos']['y']
    print(y)
    w = dictionary['tags'][0]['pos']['w']
    print(w)
    h = dictionary['tags'][0]['pos']['h']
    print(h)
but it shows me this error:
Traceback (most recent call last):
File "json_to_txt.py", line 65, in <module>
if(dictionary['tags'][1]['name']):
IndexError: list index out of range
How can I access the second 'name' object?
You don't need to define each individual variable (tag, tag2, etc.) explicitly. Leave that to a loop instead, so it works for any number of tags. Note that iterating over datastore yields its keys, so index with datastore[dictionary] rather than dictionary[...], as in the example below:
import json
s = '{"id": "27147e64-9ef5-42d8-b32e-b46b19071ee3b84e0e07-669e-4a10-8124-8e0d71a08e7e","image": "img0171.png","width": 640,"height": 480,"tags": [{"name": "becks_long_neck","parent": null,"id": "b2d59c98-0bdc-4d13-ad1b-9d4ab5bc1fb3","color": "#e62921","type": "bounding_box","pos": {"x": 387,"y": 310.06667073567706,"w": 62.666666666666686,"h": 38.219034830729186}},{"name": "becks_long_neck","parent": null,"id": "75635f60-e6b9-4408-89fb-ed435355dac6","color": "#e62921","type": "bounding_box","pos": {"x": 358.5,"y": 354.06667073567706,"w": 40.833333333333314,"h": 31.666666666666686}}]}'
datastore = json.loads(s)
for dictionary in datastore:  # iterating a dict yields its keys
    if dictionary == 'image':
        filename = datastore[dictionary]
    if dictionary == 'tags':
        tag = datastore[dictionary]  # list of tag dicts
        for i, entry in enumerate(tag):
            print("tag_name", i, entry['name'])
>>>
tag_name 0 becks_long_neck
tag_name 1 becks_long_neck
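If datastore is a list of such records, as the loop in the question suggests, the IndexError most likely comes from a record that has fewer than two tags. A minimal sketch under that assumption is shown here; the second record (img0172.png) is hypothetical and added only to show the one-tag case.

```python
# Stand-in data: the first record follows the question, the second is hypothetical.
datastore = [
    {
        "image": "img0171.png",
        "tags": [
            {"name": "becks_long_neck",
             "pos": {"x": 387, "y": 310.06667073567706, "w": 62.666666666666686, "h": 38.219034830729186}},
            {"name": "becks_long_neck",
             "pos": {"x": 358.5, "y": 354.06667073567706, "w": 40.833333333333314, "h": 31.666666666666686}},
        ],
    },
    {
        "image": "img0172.png",  # hypothetical record with a single tag
        "tags": [
            {"name": "becks_long_neck", "pos": {"x": 10, "y": 20, "w": 30, "h": 40}},
        ],
    },
]

lines = []
for record in datastore:
    # Loop over however many tags exist instead of indexing [0] and [1].
    for tag in record.get("tags", []):
        pos = tag["pos"]
        lines.append("{} {} {} {} {} {}".format(
            record["image"], tag["name"], pos["x"], pos["y"], pos["w"], pos["h"]))

for line in lines:
    print(line)
```

Iterating over the list directly avoids any assumption about how many tags a record carries.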

How to parse specific parts of nested JSON format into csv in python (pandas)

I have a nested JSON file that I can't manage to flatten into a CSV.
I want to have the following columns in the CSV:
id, name, path, tags (a column for each of them), points (I need the x/y values of the 4 points)
example of the JSON input:
{
"name": "test",
"securityToken": "test Token",
"videoSettings": {
"frameExtractionRate": 15
},
"tags": [
{
"name": "Blur Reject",
"color": "#FF0000"
},
{
"name": "Blur Poor",
"color": "#800000"
}
],
"id": "Du1qtrZQ1",
"activeLearningSettings": {
"autoDetect": false,
"predictTag": true,
"modelPathType": "coco"
},
"version": "2.1.0",
"lastVisitedAssetId": "ddee3e694ec299432fed9e42de8741ad",
"assets": {
"0b8f6f214dc7066b00b50ae16cf25cf6": {
"asset": {
"format": "jpg",
"id": "0b8f6f214dc7066b00b50ae16cf25cf6",
"name": "1.jpg",
"path": "c:\temp\1.jpg",
"size": {
"width": 1500,
"height": 1125
},
"state": 2,
"type": 1
},
"regions": [
{
"id": "VtDyR9Ovl",
"type": "POLYGON",
"tags": [
"3",
"9",
"Dark Poor"
],
"boundingBox": {
"height": 695.2110389610389,
"width": 1111.607142857143,
"left": 167.41071428571428,
"top": 241.07142857142856
},
"points": [
{
"x": 167.41071428571428,
"y": 252.02922077922076
},
{
"x": 208.80681818181816,
"y": 891.2337662337662
},
{
"x": 1252.232142857143,
"y": 936.2824675324675
},
{
"x": 1279.017857142857,
"y": 241.07142857142856
}
]
}
],
"version": "2.1.0"
},
"0155d8143c8cad85b5b9d392fd2895a4": {
"asset": {
"format": "jpg",
"id": "0155d8143c8cad85b5b9d392fd2895a4",
"name": "2.jpg",
"path": "c:\temp\2.jpg",
"size": {
"width": 1080,
"height": 1920
},
"state": 2,
"type": 1
},
"regions": [
{
"id": "7FFl_diM2",
"type": "POLYGON",
"tags": [
"Dark Poor"
],
"boundingBox": {
"height": 502.85714285714283,
"width": 820.3846153846155,
"left": 144.08653846153848,
"top": 299.2207792207792
},
"points": [
{
"x": 152.39423076923077,
"y": 311.68831168831167
},
{
"x": 144.08653846153848,
"y": 802.077922077922
},
{
"x": 964.4711538461539,
"y": 781.2987012987012
},
{
"x": 935.3942307692308,
"y": 299.2207792207792
}
]
}
],
"version": "2.1.0"
}
}
}
I tried using pandas's json_normalize and realized I don't fully understand how to specify the columns I wish to parse:
import json
import csv
import pandas as pd
from pandas import Series, DataFrame
from pandas.io.json import json_normalize
f = open(r'c:\temp\test-export.json')
data = json.load(f) # load as json
f.close()
df = json_normalize(data) #load json into dataframe
df.to_csv(r'c:\temp\json-to-csv.csv', sep=',', encoding='utf-8')
The results are hard to work with because I didn't specify what I want (iterate through a specific array and append it to the CSV).
This is where I would like your help.
I assume I don't fully understand how json_normalize works, and I suspect it is not the best way to deal with this problem.
Thank you!
You can do something like this. Since you didn't provide an example output I did something on my own.
import json
import csv
with open(r'file.txt') as f:
    data = json.load(f)
with open("output.csv", mode="w", newline='') as out:
    w = csv.writer(out)
    header = ["id","name","path","tags","points"]
    w.writerow(header)
    for asset in data["assets"]:
        data_point = data["assets"][asset]
        output = [data_point["asset"]["id"]]
        output.append(data_point["asset"]["name"])
        output.append(data_point["asset"]["path"])
        output.append(data_point["regions"][0]["tags"])
        output.append(data_point["regions"][0]["points"])
        w.writerow(output)
Output
id,name,path,tags,points
0b8f6f214dc7066b00b50ae16cf25cf6,1.jpg,c:\temp\1.jpg,"['3', '9', 'Dark Poor']","[{'x': 167.41071428571428, 'y': 252.02922077922076}, {'x': 208.80681818181816, 'y': 891.2337662337662}, {'x': 1252.232142857143, 'y': 936.2824675324675}, {'x': 1279.017857142857, 'y': 241.07142857142856}]"
0155d8143c8cad85b5b9d392fd2895a4,2.jpg,c:\temp\2.jpg,['Dark Poor'],"[{'x': 152.39423076923077, 'y': 311.68831168831167}, {'x': 144.08653846153848, 'y': 802.077922077922}, {'x': 964.4711538461539, 'y': 781.2987012987012}, {'x': 935.3942307692308, 'y': 299.2207792207792}]"
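The output above still stores the tags and points as stringified Python objects. A possible refinement is sketched here under a few assumptions: one row per region, exactly four points per region, and tags joined with ";" in a single column (a simplification of the "one column per tag" request). The function name flatten_regions and the shortened sample data are made up for illustration.

```python
import csv
import io

# Trimmed-down stand-in for the export in the question (ids and values shortened).
data = {
    "assets": {
        "a1": {
            "asset": {"id": "a1", "name": "1.jpg", "path": "c:\\temp\\1.jpg"},
            "regions": [
                {
                    "tags": ["3", "9", "Dark Poor"],
                    "points": [
                        {"x": 1.0, "y": 2.0},
                        {"x": 3.0, "y": 4.0},
                        {"x": 5.0, "y": 6.0},
                        {"x": 7.0, "y": 8.0},
                    ],
                }
            ],
        }
    }
}


def flatten_regions(export):
    """Yield one flat row (a list) per region of every asset."""
    for entry in export["assets"].values():
        asset = entry["asset"]
        for region in entry.get("regions", []):
            row = [asset["id"], asset["name"], asset["path"], ";".join(region["tags"])]
            for point in region["points"]:
                row.extend([point["x"], point["y"]])
            yield row


header = ["id", "name", "path", "tags", "x1", "y1", "x2", "y2", "x3", "y3", "x4", "y4"]
rows = list(flatten_regions(data))

buffer = io.StringIO()  # swap for open("output.csv", "w", newline="") to write a file
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(rows)
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the sketch self-contained; the same writer calls work on a real file handle.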

Parsing rekognition get_face_search results

I am trying to parse out Face Matches from the results of the get_face_search() AWS Rekognition API. It outputs an array of Persons, within that array is another array of FaceMatches for a given person and timestamp. I want to take information from the FaceMatches array and be able to loop through the array of Face Matches.
I have done something similar before for single arrays and looped successfully, but I am missing something trivial here perhaps.
Here is output from API:
Response:
{
"JobStatus": "SUCCEEDED",
"NextToken": "U5EdbZ+86xseDBfDlQ2u8QhSVzbdodDOmX/gSbwIgeO90l2BKWvJEscjUDmA6GFDCSSfpKA4",
"VideoMetadata": {
"Codec": "h264",
"DurationMillis": 6761,
"Format": "QuickTime / MOV",
"FrameRate": 30.022184371948242,
"FrameHeight": 568,
"FrameWidth": 320
},
"Persons": [
{
"Timestamp": 0,
"Person": {
"Index": 0,
"BoundingBox": {
"Width": 0.987500011920929,
"Height": 0.7764084339141846,
"Left": 0.0031250000465661287,
"Top": 0.2042253464460373
},
"Face": {
"BoundingBox": {
"Width": 0.6778846383094788,
"Height": 0.3819068372249603,
"Left": 0.10096154361963272,
"Top": 0.2654387652873993
},
"Landmarks": [
{
"Type": "eyeLeft",
"X": 0.33232420682907104,
"Y": 0.4194057583808899
},
{
"Type": "eyeRight",
"X": 0.5422032475471497,
"Y": 0.41616082191467285
},
{
"Type": "nose",
"X": 0.45633792877197266,
"Y": 0.4843473732471466
},
{
"Type": "mouthLeft",
"X": 0.37002310156822205,
"Y": 0.567118763923645
},
{
"Type": "mouthRight",
"X": 0.5330674052238464,
"Y": 0.5631639361381531
}
],
"Pose": {
"Roll": -2.2475271224975586,
"Yaw": 4.371307373046875,
"Pitch": 6.83940315246582
},
"Quality": {
"Brightness": 40.40004348754883,
"Sharpness": 99.95819854736328
},
"Confidence": 99.87971496582031
}
},
"FaceMatches": [
{
"Similarity": 99.81229400634766,
"Face": {
"FaceId": "4699a1eb-9f6e-415d-8716-eef141d23433a",
"BoundingBox": {
"Width": 0.6262923432480737,
"Height": 0.46972032423490747,
"Left": 0.130435005324523403604,
"Top": 0.13354002343240603
},
"ImageId": "1ac790eb-615a-111f-44aa-4017c3c315ad",
"Confidence": 99.19400024414062
}
}
]
},
{
"Timestamp": 66,
"Person": {
"Index": 0,
"BoundingBox": {
"Width": 0.981249988079071,
"Height": 0.7764084339141846,
"Left": 0.0062500000931322575,
"Top": 0.2042253464460373
}
}
},
{
"Timestamp": 133,
"Person": {
"Index": 0,
"BoundingBox": {
"Width": 0.9781249761581421,
"Height": 0.783450722694397,
"Left": 0.0062500000931322575,
"Top": 0.19894365966320038
}
}
},
{
"Timestamp": 199,
"Person": {
"Index": 0,
"BoundingBox": {
"Width": 0.981249988079071,
"Height": 0.783450722694397,
"Left": 0.0031250000465661287,
"Top": 0.19894365966320038
},
"Face": {
"BoundingBox": {
"Width": 0.6706730723381042,
"Height": 0.3778440058231354,
"Left": 0.10817307233810425,
"Top": 0.26679307222366333
},
"Landmarks": [
{
"Type": "eyeLeft",
"X": 0.33244985342025757,
"Y": 0.41591548919677734
},
{
"Type": "eyeRight",
"X": 0.5446155667304993,
"Y": 0.41204410791397095
},
{
"Type": "nose",
"X": 0.4586191177368164,
"Y": 0.479543000459671
},
{
"Type": "mouthLeft",
"X": 0.37614554166793823,
"Y": 0.5639738440513611
},
{
"Type": "mouthRight",
"X": 0.5334802865982056,
"Y": 0.5592300891876221
}
],
"Pose": {
"Roll": -2.4899401664733887,
"Yaw": 3.7596628665924072,
"Pitch": 6.3544135093688965
},
"Quality": {
"Brightness": 40.46360778808594,
"Sharpness": 99.95819854736328
},
"Confidence": 99.89802551269531
}
},
"FaceMatches": [
{
"Similarity": 99.80543518066406,
"Face": {
"FaceId": "4699a1eb-9f6e-415d-8716-eef141d9223a",
"BoundingBox": {
"Width": 0.626294234234737,
"Height": 0.469234234890747,
"Left": 0.130435002334234604,
"Top": 0.13354023423449180603
},
"ImageId": "1ac790eb-615a-111f-44aa-4017c3c315ad",
"Confidence": 99.19400024414062
}
}
]
},
{
"Timestamp": 266,
"Person": {
"Index": 0,
"BoundingBox": {
"Width": 0.984375,
"Height": 0.7852112650871277,
"Left": 0,
"Top": 0.19718310236930847
}
}
}
],
I have isolated the timestamps (just testing my approach) using the following:
timestamps = [m['Timestamp'] for m in response['Persons']]
Output is this, as expected - [0, 66, 133, 199, 266]
However, when I try the same thing with FaceMatches, I get an error.
[0, 66, 133, 199, 266]
list indices must be integers or slices, not str: TypeError
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 40, in lambda_handler
matches = [m['FaceMatches']['Face']['FaceId'] for m in response['Persons']]
File "/var/task/lambda_function.py", line 40, in <listcomp>
matches = [m['FaceMatches']['Face']['FaceId'] for m in response['Persons']]
TypeError: list indices must be integers or slices, not str
What I need to end up with is for each face that is matched:
Timestamp
FaceID
Similarity
Can anybody shed some light on this for me?
Your response contains two FaceMatches entries, and you can extract the required info this way:
import json
with open('newtest.json') as f:
    data = json.load(f)
length = len(data['Persons'])
for i in range(0, length):
    try:
        print(data['Persons'][i]['FaceMatches'][0]['Similarity'])
        print(data['Persons'][i]['FaceMatches'][0]['Face']['FaceId'])
        print(data['Persons'][i]['Timestamp'])
    except (KeyError, IndexError):  # this entry has no face match
        continue
I loaded your JSON object into the data variable and skipped the timestamps that have no corresponding face match. If you need those too, you can extract them the same way.
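The TypeError in the question arises because FaceMatches is a list, so it has to be iterated or indexed with an integer before reading 'Face'. A minimal sketch that collects every match, not just the first one per person, might look like this; the response dict is a shortened stand-in with made-up FaceId values.

```python
# Shortened stand-in for the API response; FaceId values are placeholders.
response = {
    "Persons": [
        {"Timestamp": 0,
         "FaceMatches": [{"Similarity": 99.8, "Face": {"FaceId": "face-a"}}]},
        {"Timestamp": 66},  # no FaceMatches key at all
        {"Timestamp": 199,
         "FaceMatches": [{"Similarity": 99.7, "Face": {"FaceId": "face-b"}}]},
    ]
}

# One (Timestamp, FaceId, Similarity) tuple per match; persons without
# a FaceMatches key contribute nothing thanks to the empty-list default.
matches = [
    (person["Timestamp"], match["Face"]["FaceId"], match["Similarity"])
    for person in response["Persons"]
    for match in person.get("FaceMatches", [])
]

for timestamp, face_id, similarity in matches:
    print(timestamp, face_id, similarity)
```

Using .get with an empty list as the default avoids the try/except around missing keys.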

get values from json

I have a problem accessing values in JSON:
{
"data": [
{
"id": "",
"from": {
"name": "",
"id": ""
},
"picture": "",
"source": "",
"height": 480,
"width": 720,
"images": [
{
"height": 1365,
"width": 2048,
"source": ""
},
I want to get the source of every image with albums['data']['images'], but something goes wrong. I have trouble with nested dictionaries. Please help.
"data": [
{
The "data" field's value is not a dictionary, but rather a list of dictionaries.
Try albums['data'][0]['images'] instead, and so on for the other items in the list, or similarly, a loop:
for item in albums['data']:
    # do something with item['images']
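To collect every image source in one pass, a nested comprehension is one option. This is a minimal sketch; the albums dict below is a stand-in with placeholder source values, since the question's sample leaves them empty.

```python
# Stand-in for the decoded JSON; "source" values are placeholders.
albums = {
    "data": [
        {"id": "1", "images": [
            {"height": 1365, "width": 2048, "source": "a.jpg"},
            {"height": 480, "width": 720, "source": "b.jpg"},
        ]},
        {"id": "2", "images": []},
        {"id": "3"},  # item without an "images" key
    ]
}

# "data" is a list of dicts and "images" is a list of dicts, so two loops are needed.
sources = [
    image["source"]
    for item in albums["data"]
    for image in item.get("images", [])
]

print(sources)
```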
