Nested JSON files - Python

Good afternoon all,
I've been reading through the various posts about reading .json files with pandas, but so far I haven't been successful in extracting what I need.
I need to read a specific 'score' in the JSON file; I'll then iterate through all the JSON files I have, as the label is the same in each.
In the example below, how would I read the 'score'? I've tried using the normalize function, but regardless of the argument I put in I cannot get any closer.
Part of the json file:
"template_id": "template_fe61177cb0eb4642901b1eae9488fbb4",
"audit_id": "audit_1a0e9ef4a7914286808accb3dcb0700b",
"archived": false,
"created_at": "2022-10-07T08:00:14.021Z",
"modified_at": "2022-10-07T08:05:56.594Z",
"audit_data": {
"score": 10,
"total_score": 11,
"score_percentage": 90.909,
"name": "7 Oct 2022 / Test",
"duration": 240,
"authorship": {
"device_id": "user_65c3799b0f1a48549cacbceca244e1db",
"owner": "test",
"owner_id": "user_65c3799b0f1a48549cacbceca244e1db",
"author": "test",
"author_id": "user_65c3799b0f1a48549cacbceca244e1db"
},
"date_completed": "2022-10-07T08:05:55.860Z",
"date_modified": "2022-10-07T08:05:56.594Z",
"date_started": "2022-10-07T08:00:13.000Z",
"site": {
"name": "Blue Warehouse"
}
},
"template_data": {
"authorship": {
"device_id": "user_4bb896b5308341f7a7543a32f6c1f3ec",
"owner": "test",
"owner_id": "user_4bb896b5308341f7a7543a32f6c1f3ec",
"author": "test",
"author_id": "user_4bb896b5308341f7a7543a32f6c1f3ec"
},
"metadata": {
"description": "",
"name": "RCS",
"image": {
"date_created": "2022-04-12T13:27:18.852Z",
"file_ext": "png",
"label": "Go \u0026 See icon.PNG",
"media_id": "cf944a4b-7589-47e6-b42a-8d17f06b7031",
"href": "https://1"
}
},
"response_sets": {
"5b69aee5-0532-46a4-b2f5-d020d4d5381d": {
"id": "5b69aee5-0532-46a4-b2f5-d020d4d5381d",
"type": "question",
"responses": [
{
"id": "ef4abf51-3361-46f5-ba04-70c23c85ca20",
"label": "Good",
"colour": "19,133,95",
***"score": 1,***
"enable_score": true
},
Thanks for your help.
Rob.

This can be done without pandas:
import json
with open("my_file.json", 'r') as f:
    my_dict = json.load(f)
score = my_dict["template_data"]["response_sets"]["5b69aee5-0532-46a4-b2f5-d020d4d5381d"]["responses"][0]["score"]
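If you'd prefer to stay with pandas and cover many files at once, here is a rough sketch (the audits/ folder name is hypothetical; it assumes each file shares the structure posted above, with response_sets nested under template_data):
import glob
import json
import pandas as pd

rows = []
for path in glob.glob("audits/*.json"):   # hypothetical folder holding the JSON files
    with open(path) as f:
        doc = json.load(f)
    # walk every response set and collect any response that carries a 'score'
    for response_set in doc["template_data"]["response_sets"].values():
        for response in response_set.get("responses", []):
            if "score" in response:
                rows.append({"file": path,
                             "label": response.get("label"),
                             "score": response["score"]})

df = pd.DataFrame(rows)
print(df)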

Related

Using Pandas to convert csv to Json

I want to convert a CSV to JSON format using pandas. I am a tester and want to send some events to Event Hub; for that I want to maintain a CSV file and update my records/data through it. I created the CSV file by reading a JSON with pandas, for reference. Now when I convert the CSV back into JSON using pandas, the data is not displayed in the correct format. Can you please help?
Step 1: Converted JSON to CSV using pandas:
df = pd.read_json('C://Users//DAMALI//Desktop/test.json')
df.to_csv('C://Users//DAMALI//Desktop/test.csv')
Step 2: Now if I try to convert the CSV back to JSON, it's not converted in the same format as earlier:
df = pd.read_csv('C://Users//DAMALI//Desktop/test.csv')
df.to_json('C://Users//DAMALI//Desktop/test1.json')
Providing JSON below:
{
"body": {
"deviceId": "UDM",
"registrationDate": "12/11/2019",
"testRegistration": false,
"serialNumber": "25",
"articleNumber": "R91",
"deviceName": "UDM-test",
"locationId": "lc0",
"sapSoldToId": "1138474",
"crmDomainAccountId": "1234566",
"crmAccountDetails": {
"accountName": "ProjectX",
"accountId": "Instal",
"region": "AP"
},
"productLine": "UD",
"state": "registered",
"installerName": "ABC Rooms",
"installationAddress": {
"street": "Benelu",
"zipCode": "850",
"city": "Kortr",
"state": "OVL",
"country": "Belgi"
},
"customerDetails": {
"name": "John D",
"contactName": "John Doe",
"phone": "+32 999999999",
"email": "john.doe#test.com"
},
"wallConnect": {
"wallSize": "Width 5 x Height 4",
"wallOrientation": "LANDSCAPE",
"displayType": "BVD-D55M21H321A1C300",
"softwareVersion": "1.13.1.1.3"
},
"projector": {
"name": "UDX 40K-123456789",
"subType": "UDX 40K"
},
"featureLicense": ["UDX-aa00213a-5719-440e-a3b5", "UDX-aa00a-571"],
"cloudServiceLicense": ["EN04d5-4d2a-9131-875ad37c5883", "E15-4d2a-9131-875ad37c5154"],
"metadata": {
"cusQuesAns": [{
"ques": "End ucal industry",
"ans": "Hosity",
"key": "CUST_ANSWER"
},
{
"ques": "End user video wall application",
"ans": "Simulation & Virtual Reality",
"key": "CUSSECOND_ANSWER"
}
]
},
"frequency": "realtime",
"subDevices": [{
"deviceType": "DISPLAY",
"serialNumber": "68960",
"articleNumber": "R792",
"wallConnect": {
"displayFMWVersion": "3.0.0",
"displayVariant": "KVD21H331A1C300"
}
}]
},
"properties": {
"drs": {
"type": "salesforce-lm"
}
},
"systemProperties": {
"user-id": "data-cvice",
"message-id": "1b1012cc-9b18c192"
}
}
Try this for converting CSV to JSON:
import pandas as pd
df = pd.read_csv(r'Fayzan-Bhatti\test.csv')
df.to_json(r'Fayzan-Bhatti\new_test.json')
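A likely reason the round trip changes shape (an assumption on my part, since the intermediate CSV isn't shown) is that to_csv also writes the DataFrame index as an extra unnamed column, and that nested objects become plain strings once they pass through CSV. Something along these lines keeps the output closer to the original:
import pandas as pd

# Hypothetical paths; point these at your own files.
df = pd.read_json('test.json')
df.to_csv('test.csv', index=False)   # skip the index so no extra column is added

df2 = pd.read_csv('test.csv')
df2.to_json('new_test.json')         # note: nested dicts read back from CSV remain strings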

Getting specific lines from a print in Python with Spotipy

I am writing some code with Python and Spotipy, and I'm relatively new to coding. I have some code that gets all the info about a Spotify playlist and prints it out for me:
from spotipy.oauth2 import SpotifyClientCredentials
import spotipy
import json
client_credentials_manager = SpotifyClientCredentials()
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)
playlist_id = 'spotify:playlist:76CVeJDw2b90up5PgkZXyU'
results = sp.playlist(playlist_id)
print(json.dumps(results, indent=4))
It works well and gives me all the info. My problem is that I only need specifics from the print:
"collaborative": false,
"description": "",
"external_urls": {
"spotify": "https://open.spotify.com/playlist/76CVeJDw2b90up5PgkZXyU"
},
"followers": {
"href": null,
"total": 0
},
"href": "https://api.spotify.com/v1/playlists/76CVeJDw2b90up5PgkZXyU?additional_types=track",
"id": "76CVeJDw2b90up5PgkZXyU",
"images": [
{
"height": 640,
"url": "https://i.scdn.co/image/ab67616d0000b2734a052b99c042dc15f933145b",
"width": 640
}
],
"name": "TEST",
"owner": {
"display_name": "Name",
"external_urls": {
"spotify": "https://open.spotify.com/user/myname"
},
"href": "https://api.spotify.com/v1/users/myname",
"id": "Myname",
"type": "user",
"uri": "spotify:user:kovizsombor"
},
"primary_color": null,
"public": true,
"snapshot_id": "MixmMGE0MDgxNDQ1ZGVlNmE4MThiMmQwODMwYWU0OTI3YzkyOGJhOWIz",
"tracks": {
"href": "https://api.spotify.com/v1/playlists/76CVeJDw2b90up5PgkZXyU/tracks?offset=0&limit=100&additional_types=track",
"items": [
{
"added_at": "2020-05-17T10:00:11Z",
"added_by": {
"external_urls": {
"spotify": "https://open.spotify.com/user/kovizsombor"
},
"href": "https://api.spotify.com/v1/users/kovizsombor",
"id": "kovizsombor",
"type": "user",
"uri": "spotify:user:kovizsombor"
},
"is_local": false,
"primary_color": null,
"track": {
"album": {
"album_type": "album",
"artists": [
{
"external_urls": {
"spotify": "https://open.spotify.com/artist/0PFtn5NtBbbUNbU9EAmIWF"
},
"href": "https://api.spotify.com/v1/artists/0PFtn5NtBbbUNbU9EAmIWF",
"id": "0PFtn5NtBbbUNbU9EAmIWF",
"name": "TOTO",
"type": "artist",
"uri": "spotify:artist:0PFtn5NtBbbUNbU9EAmIWF"
}
],
"external_urls": {
"spotify": "https://open.spotify.com/album/62U7xIHcID94o20Of5ea4D"
},
"href": "https://api.spotify.com/v1/albums/62U7xIHcID94o20Of5ea4D",
"id": "62U7xIHcID94o20Of5ea4D",
"images": [
{
"height": 640,
"url": "https://i.scdn.co/image/ab67616d0000b2734a052b99c042dc15f933145b",
"width": 640
},
{
"height": 300,
"url": "https://i.scdn.co/image/ab67616d00001e024a052b99c042dc15f933145b",
"width": 300
},
{
"height": 64,
"url": "https://i.scdn.co/image/ab67616d000048514a052b99c042dc15f933145b",
"width": 64
}
],
"name": "Toto IV",
"release_date": "1982-04-08",
"release_date_precision": "day",
"total_tracks": 10,
"type": "album",
"uri": "spotify:album:62U7xIHcID94o20Of5ea4D"
},
"artists": [
{
"external_urls": {
"spotify": "https://open.spotify.com/artist/0PFtn5NtBbbUNbU9EAmIWF"
},
"href": "https://api.spotify.com/v1/artists/0PFtn5NtBbbUNbU9EAmIWF",
"id": "0PFtn5NtBbbUNbU9EAmIWF",
"name": "TOTO",
"type": "artist",
"uri": "spotify:artist:0PFtn5NtBbbUNbU9EAmIWF"
}
],
"available_markets": [
],
"disc_number": 1,
"duration_ms": 295893,
"episode": false,
"explicit": false,
"external_ids": {
"isrc": "USSM19801941"
},
"external_urls": {
"spotify": "https://open.spotify.com/track/2374M0fQpWi3dLnB54qaLX"
},
"href": "https://api.spotify.com/v1/tracks/2374M0fQpWi3dLnB54qaLX",
"id": "2374M0fQpWi3dLnB54qaLX",
"is_local": false,
"name": "Africa",
"popularity": 83,
"preview_url": "https://p.scdn.co/mp3-preview/dd78dafe31bb98f230372c038a126b8808f9349b?cid=d568e7073a38465bba48268ea9f10153",
"track": true,
"track_number": 10,
"type": "track",
"uri": "spotify:track:2374M0fQpWi3dLnB54qaLX"
},
"video_thumbnail": {
"url": null
}
}
],
"limit": 100,
"next": null,
"offset": 0,
"previous": null,
"total": 1
},
"type": "playlist",
"uri": "spotify:playlist:76CVeJDw2b90up5PgkZXyU"
}
From this long printout I somehow need to extract the artist and the song title, and preferably store them in variables. I'm also not sure how this would work if there are multiple songs in a playlist.
It would also work for me if I could print out only the artist and the title of the song, without printing all the other information.
Based on your posted example it appears that you have only one song, "Africa", by the artist "TOTO". For testing, I copied the track to make two songs, and gave the added track two artists to exercise the arrays.
If that JSON is loaded into a variable named results then (as #xcmkz said) you have a Python dictionary and can process it accordingly.
Try working with the following to traverse your dict, appending artists and songs to lists:
song_dict = {}
for track in results['tracks']['items']:
    song_name = track["track"]["name"]
    a2 = []
    for t1 in track['track']['artists']:
        a2.append(t1['name'])
    song_dict.update({song_name: a2})
print('Dictionary of Songs and Artists:')
for k, v in song_dict.items():
    print(f'Song --> {k}, by --> {", ".join(v)}')
Results:
Dictionary of Songs and Artists:
Song --> Africa, by --> TOTO
Song --> Just Another Silly Song, by --> Artist 2, Artist 3
sp.playlist returns a dictionary, so you can simply access its values by their keys. For example:
>>> results['name']
'TEST'
JSON is a data serialization format, i.e., a standardized way of representing objects as plain text and parsing them back from that text. json.dumps therefore converts the dictionary object to a string of text. This is useful if, for example, you want to save the results to a file and load them back later. You don't need it to access the contents of results.
(This is a playlist, though; you will need to get the information on each song/track to get its artist and name.)
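For instance, here is a small sketch against the playlist structure posted above; the first track's title and first artist can be reached by indexing directly into the dictionary:
first_track = results['tracks']['items'][0]['track']
title = first_track['name']                   # 'Africa'
artist = first_track['artists'][0]['name']    # 'TOTO'
print(artist, '-', title)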

fastavro - Convert json file into avro file

A bit new to Avro and Python.
I am trying to do a simple conversion to Avro using the fastavro library, as the native Apache Avro library is just a bit too slow.
I want to:
1. Take a JSON file
2. Convert the data to Avro
My problem is that my JSON doesn't seem to be in the correct 'record' format to be converted to Avro. I even tried putting my JSON into a string variable and making it look similar to the syntax they have on the site at https://fastavro.readthedocs.io/en/latest/writer.html:
{u'station': u'011990-99999', u'temp': 22, u'time': 1433270389},
{u'station': u'011990-99999', u'temp': -11, u'time': 1433273379},
{u'station': u'012650-99999', u'temp': 111, u'time': 1433275478},
Here is my code:
from fastavro import json_writer, parse_schema, writer
import json
key = "test.json"
schemaFileName = "test_schema.avsc"
with open(r'C:/Path/to/file/' + schemaFileName) as sc:
    w = json.load(sc)
schema = parse_schema(w)
with open(r'C:/Path/to/file/' + key) as js:
    x = json.load(js)
with open('C:/Path/to/file/output.avro', 'wb') as out:
    writer(out, schema, x, codec='deflate')
Here is what I get as output:
File "avropython.py", line 26, in <module>
writer(out, schema,x, codec='deflate')
File "fastavro\_write.pyx", line 608, in fastavro._write.writer
ValueError: "records" argument should be an iterable, not dict
My JSON file and schema, respectively:
"joined": false,
"toward": {
"selection": "dress",
"near": true,
"shoulder": false,
"fine": -109780201.3804388,
"pet": {
"stood": "saddle",
"live": false,
"leather": false,
"tube": false,
"over": false,
"impossible": true
},
"higher": false
},
"wear": true,
"asleep": "door",
"connected": true,
"stairs": -1195512399.5000324
}
{
"name": "MyClass",
"type": "record",
"namespace": "com.acme.avro",
"fields": [
{
"name": "joined",
"type": "boolean"
},
{
"name": "toward",
"type": {
"name": "toward",
"type": "record",
"fields": [
{
"name": "selection",
"type": "string"
},
{
"name": "near",
"type": "boolean"
},
{
"name": "shoulder",
"type": "boolean"
},
{
"name": "fine",
"type": "float"
},
{
"name": "pet",
"type": {
"name": "pet",
"type": "record",
"fields": [
{
"name": "stood",
"type": "string"
},
{
"name": "live",
"type": "boolean"
},
{
"name": "leather",
"type": "boolean"
},
{
"name": "tube",
"type": "boolean"
},
{
"name": "over",
"type": "boolean"
},
{
"name": "impossible",
"type": "boolean"
}
]
}
},
{
"name": "higher",
"type": "boolean"
}
]
}
},
{
"name": "wear",
"type": "boolean"
},
{
"name": "asleep",
"type": "string"
},
{
"name": "connected",
"type": "boolean"
},
{
"name": "stairs",
"type": "float"
}
]
}
If anyone could help me out, it would be greatly appreciated!!
As the error says (ValueError: "records" argument should be an iterable, not dict), the problem is that when you call writer, the records argument needs to be an iterable. One way to solve this is to change your last line to writer(out, schema, [x], codec='deflate').
Alternatively, there is a schemaless_writer that can be used to write just a single record: https://fastavro.readthedocs.io/en/latest/writer.html#fastavro._write_py.schemaless_writer
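For completeness, a minimal sketch of the schemaless_writer route, assuming the same file names as in the question (adjust the paths to yours); note that a file written this way has to be read back with schemaless_reader and the same schema:
from fastavro import parse_schema, schemaless_writer
import json

with open(r'C:/Path/to/file/test_schema.avsc') as sc:
    schema = parse_schema(json.load(sc))
with open(r'C:/Path/to/file/test.json') as js:
    record = json.load(js)   # a single dict, not a list of records
with open(r'C:/Path/to/file/output.avro', 'wb') as out:
    # writes exactly one record, without the Avro object-container header
    schemaless_writer(out, schema, record)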

Python script to convert multiple JSON files into a single CSV

{
"type": "Data",
"version": "1.0",
"box": {
"identifier": "abcdef",
"serial": "12345678"
},
"payload": {
"Type": "EL",
"Version": "1",
"Result": "Successful",
"Reference": null,
"Box": {
"Identifier": "abcdef",
"Serial": "12345678"
},
"Configuration": {
"EL": "1"
},
"vent": [
{
"ventType": "Arm",
"Timestamp": "2020-03-18T12:17:04+10:00",
"Parameters": [
{
"Name": "Arm",
"Value": "LT"
},
{
"Name": "Status",
"Value": "LD"
}
]
},
{
"ventType": "Arm",
"Timestamp": "2020-03-18T12:17:24+10:00",
"Parameters": [
{
"Name": "Arm",
"Value": "LT"
},
{
"Name": "Status",
"Value": "LD"
}
]
},
{
"EventType": "TimeUpdateCompleted",
"Timestamp": "2020-03-18T02:23:21.2979668Z",
"Parameters": [
{
"Name": "ActualAdjustment",
"Value": "PT0S"
},
{
"Name": "CorrectionOffset",
"Value": "PT0S"
},
{
"Name": "Latency",
"Value": "PT0.2423996S"
}
]
}
]
}
}
If you're looking to transfer information from a JSON file to a CSV, you can use the following code to read a JSON file into a dictionary in Python:
import json
with open('data.txt') as json_file:
    data_dict = json.load(json_file)
You could then convert this dictionary into a list with either data_dict.items() or data_dict.values().
Then you just need to write this list to a CSV file, which you can do by looping through it.
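As a rough sketch of that loop (assuming, on my part, that each file follows the structure posted above, that the files sit in the current directory, and that you want one CSV row per entry in payload/vent):
import csv
import glob
import json

rows = []
for path in glob.glob("*.json"):   # hypothetical location of the JSON files
    with open(path) as f:
        data = json.load(f)
    for event in data["payload"]["vent"]:
        row = {
            "file": path,
            "type": event.get("ventType") or event.get("EventType"),
            "timestamp": event.get("Timestamp"),
        }
        for param in event.get("Parameters", []):   # flatten the Name/Value pairs into columns
            row[param["Name"]] = param["Value"]
        rows.append(row)

# union of all column names, so rows with different parameters still line up
fieldnames = sorted({key for row in rows for key in row})
with open("combined.csv", "w", newline="") as out:
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)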

JSON decoder error when loading a JSON file in Python, even though the file validates as JSON

So I am opening a JSON file, and when I try to load it into a variable I get an error because it can't read the file, even though I have validated (online) that the JSON file is valid. I am using this code:
with open("messagesTest2.json") as json_file:
data = json.load(json_file) <----- ERROR
for p in data['commits']:
print(p['message'])
And I get this error, while I have another validated JSON file for which this code works; but this file doesn't work. My guess is that somewhere in the file there is something that it cannot parse as JSON? Is it the decoder's fault?
in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
Do you have any idea how to fix it? Keep in mind that the JSON file is valid; otherwise I'd have to show the file, and I'd have to hide some data :D
The JSON file (the URLs/passwords/logins/etc. have been replaced, but the format remains the same):
{
"commits": [{
"sha": "asjdaskldjkalsk",
"node_id": "sakldjaskldjaskldjklas",
"commit": {
"author": {
"name": "korki",
"email": "korki#kth.se",
"date": "2015-09-07T22:06:51Z"
},
"committer": {
"name": "korki",
"email": "korki#kth.se",
"date": "2015-09-07T22:06:51Z"
},
"message": "Added LaTex template and instructions",
"tree": {
"sha": "askdljaskdlajsklda",
"url": "https://gits-15.ds.sd.se/04dd5b226dda1915"
},
"url": "https://gits-15.ds.sd.se/04dd5b226dda1915",
"comment_count": 0,
"verification": {
"verified": "False",
"reason": "unsigned",
"signature": "None",
"payload": "None"
}
},
"url": "https://gits-15.ds.sd.se/04dd5b226dda1915",
"html_url": "https://gits-15.ds.sd.se/04dd5b226dda1915",
"comments_url": "https://gits-15.ds.sd.se/04dd5b226dda1915",
"author": {
"login": "korki",
"id": 999,
"node_id": "askljdklas==",
"type": "User",
"site_admin": "None"
},
"committer": {
"login": "korki",
"id": 999,
"node_id": "askljdklas==",
"type": "User",
"site_admin": "None"
},
"parents": [{
"sha": "asdaskldjasdklsjl",
"url": "https://gits-15.ds.sd.se/04dd5b226dda1915",
"html_url": "https://gits-15.ds.sd.se/04dd5b226dda1915"
}]
}, {
"sha": "kasdjklasdjklas",
"node_id": "sdklasjdklasjkl",
"commit": {
"author": {
"name": "korki",
"email": "korki#kth.se",
"date": "2015-08-31T10:45:24Z"
},
"committer": {
"name": "korki",
"email": "korki#kth.se",
"date": "2015-08-31T10:45:24Z"
},
"message": "Update README.md",
"tree": {
"sha": "askldjkasldjklas",
"url": "https://gits-15.ds.sd.se/04dd5b226dda1915"
},
"url": "https://gits-15.ds.sd.se/04dd5b226dda1915",
"comment_count": 0,
"verification": {
"verified": "None",
"reason": "unsigned",
"signature": "None",
"payload": "None"
}
},
"url": "https://gits-15.ds.sd.se/04dd5b226dda1915",
"html_url": "https://gits-15.ds.sd.se/04dd5b226dda1915",
"comments_url": "https://gits-15.ds.sd.se/04dd5b226dda1915",
"author": {
"login": "korki",
"id": 999,
"node_id": "dkasdasdnas==",
"type": "User",
"site_admin": "None"
},
"committer": {
"login": "korki",
"id": 999,
"node_id": "askldaskldja==",
"type": "User",
"site_admin": "None"
},
"parents": [{
"sha": "dlkasdjklas;dlkjas;",
"url": "https://gits-15.ds.sd.se/04dd5b226dda1915",
"html_url": "https://gits-15.ds.sd.se/04dd5b226dda1915"
}]
}, {
"sha": "dsagadsgsgdsa",
"node_id": "sdagfsdgsd",
"commit": {
"author": {
"name": "korki",
"email": "korki#kth.se",
"date": "2015-08-31T10:44:42Z"
},
"committer": {
"name": "korki",
"email": "korki#kth.se",
"date": "2015-08-31T10:44:42Z"
},
"message": "Initial commit",
"tree": {
"sha": "asdasddasdas",
"url": "https://gits-15.ds.sd.se/04dd5b226dda1915"
},
"url": "https://gits-15.ds.sd.se/04dd5b226dda1915",
"comment_count": 0,
"verification": {
"verified": "None",
"reason": "unsigned",
"signature": "None",
"payload": "None"
}
},
"url": "https://gits-15.ds.sd.se/04dd5b226dda1915",
"html_url": "https://gits-15.ds.sd.se/04dd5b226dda1915",
"comments_url": "https://gits-15.ds.sd.se/04dd5b226dda1915",
"author": {
"login": "korki",
"id": 999,
"node_id": "kjklklj==",
"type": "User",
"site_admin": "None"
},
"committer": {
"login": "korki",
"id": 999,
"node_id": "jhkjkj==",
"gravatar_id": "",
"type": "User",
"site_admin": "None"
},
"parents": []
}]
}
That error means it is reading a blank file. Make sure you are reading the file you think you are reading.
EDIT: Another possibility is that you have already read through all the lines of the file. If you have read to the end of the file and then try to read it again, it will appear to be a blank file.
I had the exact same issue. I used a PowerShell script to create a JSON file and tried to read that file from a Python script, but I kept getting the same error as you, even though the JSON file was properly formatted. The issue was that I was using the PowerShell command Out-File; when I used Set-Content instead, it fixed the problem. I believe it was an encoding difference between the two commands, so look at how you created the JSON file and what encoding was used. I know this is late, but I'll share anyway in case anyone else is having the same issue.
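If it is an encoding problem like that, here is a quick diagnostic sketch (the byte patterns in the comments are the usual BOM markers, not output from your particular file):
import json

with open("messagesTest2.json", "rb") as f:
    print(f.read(4))   # b'\xff\xfe...' suggests UTF-16, b'\xef\xbb\xbf...' a UTF-8 BOM

# Windows PowerShell's Out-File writes UTF-16 by default, so telling Python the
# encoding (or using "utf-8-sig" for a BOM-prefixed UTF-8 file) usually clears
# the "Expecting value: line 1 column 1 (char 0)" error.
with open("messagesTest2.json", encoding="utf-16") as f:
    data = json.load(f)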
