Related
hi I'm pretty new at coding and I was trying to create a program in python that reads and save in another file the data inside a json file (not everything, just what I want). I googled how to parse data but there's something I don't understand.
that's a part of the json file:
`
{
"profileRevision": 548789,
"profileId": "campaign",
"profileChangesBaseRevision": 548789,
"profileChanges": [
{
"changeType": "fullProfileUpdate",
"profile": {
"_id": "2da4f079f8984cc48e84fc99dace495d",
"created": "2018-03-29T11:02:15.190Z",
"updated": "2022-10-31T17:34:43.284Z",
"rvn": 548789,
"wipeNumber": 9,
"accountId": "63881e614ef543b2932c70fed1196f34",
"profileId": "campaign",
"version": "refund_teddy_perks_september_2022",
"items": {
"8ec8f13f-6bf6-4933-a7db-43767a055e66": {
"templateId": "Quest:heroquest_loadout_constructor_2",
"attributes": {
"quest_state": "Claimed",
"creation_time": "min",
"last_state_change_time": "2019-05-18T16:09:12.750Z",
"completion_complete_pve03_diff26_loadout_constructor": 300,
"level": -1,
"item_seen": true,
"sent_new_notification": true,
"quest_rarity": "uncommon",
"xp_reward_scalar": 1
},
"quantity": 1
},
"6940c71b-c74b-4581-9f1e-c0a87e246884": {
"templateId": "Worker:workerbasic_sr_t01",
"attributes": {
"gender": "2",
"personality": "Homebase.Worker.Personality.IsDreamer",
"level": 1,
"item_seen": true,
"squad_slot_idx": -1,
"portrait": "WorkerPortrait:IconDef-WorkerPortrait-Dreamer-F02",
"building_slot_used": -1,
"set_bonus": "Homebase.Worker.SetBonus.IsMeleeDamageLow"
}
}
}
]
}
`
I can access profileChanges. I wrote this to create another json file with only the profileChanges things:
`
myjsonfile= open("file.json",'r')
jsondata=myjsonfile.read()
obj=json.loads(jsondata)
ciso=obj['profileChanges']
for i in ciso:
print(i)
with open("file2", "w") as outfile:
json.dump( ciso, outfile, indent=1)
the issue I have is that I can't access "profile" (inside profileChanges) in the same way by parsing the new file and I have no idea on how to do it
Access to JSON or dict element is realized by list indexes, please look at below example:
a = [
{
"friends": [
{
"id": 0,
"name": "Reba May"
}
],
"greeting": "Hello, Doris Gallagher! You have 2 unread messages.",
"favoriteFruit": "strawberry"
},
]
b = a['friends']['id] # b = 0
I've added a couple of closing braces to make your snippet valid json:
s = '''{
"profileRevision": 548789,
"profileId": "campaign",
"profileChangesBaseRevision": 548789,
"profileChanges": [
{
"changeType": "fullProfileUpdate",
"profile": {
"_id": "2da4f079f8984cc48e84fc99dace495d",
"created": "2018-03-29T11:02:15.190Z",
"updated": "2022-10-31T17:34:43.284Z",
"rvn": 548789,
"wipeNumber": 9,
"accountId": "63881e614ef543b2932c70fed1196f34",
"profileId": "campaign",
"version": "refund_teddy_perks_september_2022",
"items": {
"8ec8f13f-6bf6-4933-a7db-43767a055e66": {
"templateId": "Quest:heroquest_loadout_constructor_2",
"attributes": {
"quest_state": "Claimed",
"creation_time": "min",
"last_state_change_time": "2019-05-18T16:09:12.750Z",
"completion_complete_pve03_diff26_loadout_constructor": 300,
"level": -1,
"item_seen": true,
"sent_new_notification": true,
"quest_rarity": "uncommon",
"xp_reward_scalar": 1
},
"quantity": 1
},
"6940c71b-c74b-4581-9f1e-c0a87e246884": {
"templateId": "Worker:workerbasic_sr_t01",
"attributes": {
"gender": "2",
"personality": "Homebase.Worker.Personality.IsDreamer",
"level": 1,
"item_seen": true,
"squad_slot_idx": -1,
"portrait": "WorkerPortrait:IconDef-WorkerPortrait-Dreamer-F02",
"building_slot_used": -1,
"set_bonus": "Homebase.Worker.SetBonus.IsMeleeDamageLow"
}
}
}
}
}
]
}
'''
d = json.loads(s)
print(d['profileChanges'][0]['profile']['version'])
This prints refund_teddy_perks_september_2022
Explanation:
d is a dict
d['profileChanges'] is a list of dicts
d['profileChanges'][0] is the first dict in the list
d['profileChanges'][0]['profile'] is a dict
d['profileChanges'][0]['profile']['version'] is the value of version key in the profile dict in the first entry of the profileChanges list.
Tried solution shared in link :: Nested json to csv - generic approach
This worked for Sample 1 , but giving only a single row for Sample 2.
is there a way to have generic python code to handle both Sample 1 and Sample 2.
Sample 1 ::
{
"Response": "Success",
"Message": "",
"HasWarning": false,
"Type": 100,
"RateLimit": {},
"Data": {
"Aggregated": false,
"TimeFrom": 1234567800,
"TimeTo": 1234567900,
"Data": [
{
"id": 11,
"symbol": "AAA",
"time": 1234567800,
"block_time": 123.282828282828,
"block_size": 1212121,
"current_supply": 10101010
},
{
"id": 12,
"symbol": "BBB",
"time": 1234567900,
"block_time": 234.696969696969,
"block_size": 1313131,
"current_supply": 20202020
},
]
}
}
Sample 2::
{
"Response": "Success",
"Message": "Summary succesfully returned!",
"Data": {
"11": {
"Id": "3333",
"Url": "test/11.png",
"value": "11",
"Name": "11 entries (11)"
},
"122": {
"Id": "5555555",
"Url": "test/122.png",
"Symbol": "122",
"Name": "122 cases (122)"
}
},
"Limit": {},
"HasWarning": False,
"Type": 50
}
Try this, you need to install flatten_json from here
import sys
import csv
import json
from flatten_json import flatten
data = json.load(open(sys.argv[1]))
data = flatten(data)
with open('foo.csv', 'w') as f:
out = csv.DictWriter(f, data.keys())
out.writeheader()
out.writerow(data)
Output
> cat foo.csv
Response,Message,Data_11_Id,Data_11_Url,Data_11_value,Data_11_Name,Data_122_Id,Data_122_Url,Data_122_Symbol,Data_122_Name,Limit,HasWarning,Type
Success,Summary succesfully returned!,3333,test/11.png,11,11 entries (11),5555555,test/122.png,122,122 cases (122),{},False,50
Note: False is incorrect in Json, you need to change it to false
I am writing some code with Python and Spotipy and I'm relatively new to coding. I have some code that get all the info about a Spotify playlist and prints it out for me:
from spotipy.oauth2 import SpotifyClientCredentials
import spotipy
import json
client_credentials_manager = SpotifyClientCredentials()
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)
playlist_id = 'spotify:playlist:76CVeJDw2b90up5PgkZXyU'
results = sp.playlist(playlist_id)
#print(json.dumps(results, indent=4))
print((json.dumps(results, indent=4)))
It works well and gives me all the info. My problem is that I only need specifics from the print:
"collaborative": false,
"description": "",
"external_urls": {
"spotify": "https://open.spotify.com/playlist/76CVeJDw2b90up5PgkZXyU"
},
"followers": {
"href": null,
"total": 0
},
"href": "https://api.spotify.com/v1/playlists/76CVeJDw2b90up5PgkZXyU?additional_types=track",
"id": "76CVeJDw2b90up5PgkZXyU",
"images": [
{
"height": 640,
"url": "https://i.scdn.co/image/ab67616d0000b2734a052b99c042dc15f933145b",
"width": 640
}
],
"name": "TEST",
"owner": {
"display_name": "Name",
"external_urls": {
"spotify": "https://open.spotify.com/user/myname"
},
"href": "https://api.spotify.com/v1/users/myname",
"id": "Myname",
"type": "user",
"uri": "spotify:user:kovizsombor"
},
"primary_color": null,
"public": true,
"snapshot_id": "MixmMGE0MDgxNDQ1ZGVlNmE4MThiMmQwODMwYWU0OTI3YzkyOGJhOWIz",
"tracks": {
"href": "https://api.spotify.com/v1/playlists/76CVeJDw2b90up5PgkZXyU/tracks?offset=0&limit=100&additional_types=track",
"items": [
{
"added_at": "2020-05-17T10:00:11Z",
"added_by": {
"external_urls": {
"spotify": "https://open.spotify.com/user/kovizsombor"
},
"href": "https://api.spotify.com/v1/users/kovizsombor",
"id": "kovizsombor",
"type": "user",
"uri": "spotify:user:kovizsombor"
},
"is_local": false,
"primary_color": null,
"track": {
"album": {
"album_type": "album",
"artists": [
{
"external_urls": {
"spotify": "https://open.spotify.com/artist/0PFtn5NtBbbUNbU9EAmIWF"
},
"href": "https://api.spotify.com/v1/artists/0PFtn5NtBbbUNbU9EAmIWF",
"id": "0PFtn5NtBbbUNbU9EAmIWF",
"name": "TOTO",
"type": "artist",
"uri": "spotify:artist:0PFtn5NtBbbUNbU9EAmIWF"
}
],
],
"external_urls": {
"spotify": "https://open.spotify.com/album/62U7xIHcID94o20Of5ea4D"
},
"href": "https://api.spotify.com/v1/albums/62U7xIHcID94o20Of5ea4D",
"id": "62U7xIHcID94o20Of5ea4D",
"images": [
{
"height": 640,
"url": "https://i.scdn.co/image/ab67616d0000b2734a052b99c042dc15f933145b",
"width": 640
},
{
"height": 300,
"url": "https://i.scdn.co/image/ab67616d00001e024a052b99c042dc15f933145b",
"width": 300
},
{
"height": 64,
"url": "https://i.scdn.co/image/ab67616d000048514a052b99c042dc15f933145b",
"width": 64
}
],
"name": "Toto IV",
"release_date": "1982-04-08",
"release_date_precision": "day",
"total_tracks": 10,
"type": "album",
"uri": "spotify:album:62U7xIHcID94o20Of5ea4D"
},
"artists": [
{
"external_urls": {
"spotify": "https://open.spotify.com/artist/0PFtn5NtBbbUNbU9EAmIWF"
},
"href": "https://api.spotify.com/v1/artists/0PFtn5NtBbbUNbU9EAmIWF",
"id": "0PFtn5NtBbbUNbU9EAmIWF",
"name": "TOTO",
"type": "artist",
"uri": "spotify:artist:0PFtn5NtBbbUNbU9EAmIWF"
}
],
"available_markets": [
],
"disc_number": 1,
"duration_ms": 295893,
"episode": false,
"explicit": false,
"external_ids": {
"isrc": "USSM19801941"
},
"external_urls": {
"spotify": "https://open.spotify.com/track/2374M0fQpWi3dLnB54qaLX"
},
"href": "https://api.spotify.com/v1/tracks/2374M0fQpWi3dLnB54qaLX",
"id": "2374M0fQpWi3dLnB54qaLX",
"is_local": false,
"name": "Africa",
"popularity": 83,
"preview_url": "https://p.scdn.co/mp3-preview/dd78dafe31bb98f230372c038a126b8808f9349b?cid=d568e7073a38465bba48268ea9f10153",
"track": true,
"track_number": 10,
"type": "track",
"uri": "spotify:track:2374M0fQpWi3dLnB54qaLX"
},
"video_thumbnail": {
"url": null
}
}
],
"limit": 100,
"next": null,
"offset": 0,
"previous": null,
"total": 1
},
"type": "playlist",
"uri": "spotify:playlist:76CVeJDw2b90up5PgkZXyU"
}
From this long print I somehow need to extract the Artist and the song title and preferably make it into a variable. Also not sure how this would work if there are multiple songs in a playlist.
It's also a solution if I can print out only the Artist and the title of the song without printing out all the information.
Based on your posted example it appears that you have only 1 song named "Africa" by artist "TOTO". Copying the track to have two songs, I added another track with two artists for testing the arrays.
If that json is loaded into a variable named results then (as #xcmkz said) you have a python dictionary and can process accordingly.
Try working with the following to transverse through your dict appending artists and songs to lists:
song_dict = {}
for track in results['tracks']['items']:
song_name = track["track"]["name"]
a2 = []
for t1 in track['track']['artists']:
a2.append(t1['name'])
song_dict.update({song_name: a2})
print(f'Dictionary of Songs and Artists:')
for k, v in song_dict.items():
print(f'Song --> {k}, by --> {", ".join(v)}')
Results:
Dictionary of Songs and Artists:
Song --> Africa, by --> TOTO
Song --> Just Another Silly Song, by --> Artist 2, Artist 3
sp.playlist returns a dictionary, so you can simply access its values by their keys. For example:
>>> results['name']
'TEST'
JSON is a data serialization format, ie a standardized way of representing objects as pure text and parsing them back from the text. json.dumps therefore converts the dictionary object to a string of text. This is useful if you want to for example save the results to a file and load it back later. You don't need it to access contents from results.
(This is a playlist though—you will need to get information on each song/track to get its artist and name.)
I am trying to retain the whole contents of a nested dictionary but only with its contents reordered..
This is an example of my nested dictionaries (pardon the long example..) -
{
"pages": {
"rotatingTest": {
"elements": {
"apvfafwkbnjn2bjt": {
"name": "animRot_tilt40_v001",
"data": {
"description": "tilt testing",
"project": "TEST",
"created": "26/11/18 16:32",
},
"type": "AnimWidget",
"uid": "apvfafwkbnjn2bjt"
},
"p0pkje1hjcc9jukq": {
"name": "poseRot_positionD_v003",
"data": {
"description": "posing test for positionD",
"created": "10/01/18 14:16",
"project": "TEST",
},
"type": "PosedWidget",
"uid": "p0pkje1hjcc9jukq"
},
"k1gzzc5uy1ynqtnj": {
"name": "animRot_positionH_v001",
"data": {
"description": "rotational posing test for positionH",
"created": "13/06/18 14:19",
"project": "TEST",
},
"type": "AnimWidget",
"uid": "k1gzzc5uy1ynqtnj"
}
}
},
"panningTest": {
"elements": {
"7lyuri8g8u5ctwsa": {
"name": "posePan_positionZ_v001",
"data": {
"description": "panning test for posZ",
"created": "04/10/18 12:43",
"project": "TEST",
},
"type": "PosedWidget",
"uid": "7lyuri8g8u5ctwsa"
}
}
},
"zoomingTest": {
"elements": {
"prtn0i6ehudhz475": {
"name": "posZoom_positionH_v010",
"data": {
"description": "zoom test",
"created": "11/10/18 12:42",
"project": "TEST",
},
"type": "PosedWidget",
"uid": "prtn0i6ehudhz475"
}
}
}
},
"page_order": [
"rotatingTest",
"zoomingTest",
"panningTest"
]
}
and this is my code:
for k1, v1 in test_dict.get('pages', {}).items():
return (sorted(v1.get('elements').items(), key=lambda (k2,v2): v2['data']['created']))
In the code, keys such as the page_order, pages etc are missing...
Or if there is/ are any commands where it will enables me to retain the 'whole' of the dictionary?
Appreciate in advance for any advice.
If you're using Python 3.7, a dict will preserve insert order. Otherwise, you need to use an OrderedDict.Additionally, you need to convert the date string to a date to get the correct sort order:
from datetime import datetime
def sortedPage(d):
return {k: {'elements': dict(sorted(list(v['elements'].items()), key=lambda tuple: datetime.strptime(tuple[1]['data']['created'], '%d/%m/%y %H:%M')))} for k,v in d.items()}
output = {k: sortedPage(v) if k == 'pages' else v for k,v in input.items()}
In Python I'm currently working with a very large JSON file with some deep dictionaries and arrays. I'm having an issue where it's not constant. For example that's below, it's essentially countries, with regions/states, cities, and suburbs. The issue is that if there is only one suburb, it'll return a dictionary, though if there's more than one, it's a array with a dictionary making me have to add another line of code to go deeper. Sure, can ifelse/for it, but this is only a very small portion of the inconstancy and it's just not proper going ifelse all the time.
What I'd like to do is simply search anything within Belgium for the dictionary entry "code": "8400" and return it's location within the JSON file. What would be my best approach in order to do something like this? Thanks!
***SNIP***
{
"code": "BE",
"name": "Belgium",
"regions": {
"region": [
{
"code": "45",
"name": "Flanders",
"places": {
"place": [
{
"code": "1790",
"name": "Affligem"
},
{
"code": "8570",
"name": "Anzegem"
},
{
"code": "8630",
"name": "Diksmuide"
},
{
"code": "9600",
"name": "Ronse"
}
]
},
"subregions": {
"subregion": [
{
"code": "46",
"name": "Coast",
"places": {
"place": [
{
"code": "8300",
"name": "Knokke-Heist"
},
{
"code": "8400",
"name": "Oostende",
"subplaces": {
"subplace": {
"code": "8450",
"name": "Bredene"
}
}
},
{
"code": "8420",
"name": "De Haan"
},
{
"code": "8430",
"name": "Middelkerke"
},
{
"code": "8434",
"name": "Westende-Bad"
},
{
"code": "8490",
"name": "Jabbeke"
},
{
"code": "8660",
"name": "De Panne"
},
{
"code": "8670",
"name": "Oostduinkerke"
}
]
}
},
{
"code": "47",
"name": "Cities",
"places": {
"place": [
{
"code": "1000",
"name": "Brussels"
},
{
"code": "2000",
"name": "Antwerp"
},
{
"code": "8000",
"name": "Bruges"
},
{
"code": "8340",
"name": "Damme"
},
{
"code": "9000",
"name": "Gent"
}
]
}
},
{
"code": "48",
"name": "Interior",
"places": {
"place": [
{
"code": "2260",
"name": "Westerlo"
},
{
"code": "2400",
"name": "Mol"
},
{
"code": "2590",
"name": "Berlaar"
},
{
"code": "8500",
"name": "Kortrijk",
"subplaces": {
"subplace": {
"code": "8940",
"name": "Wervik"
}
}
},
{
"code": "8610",
"name": "Handzame"
},
{
"code": "8755",
"name": "Ruiselede"
},
{
"code": "8900",
"name": "Ieper"
},
{
"code": "8970",
"name": "Poperinge"
}
]
}
},
EDIT:
I was asked to show how I'm currently getting through this JSON file. Root is a dictionary containing numbers that equal the city/suburb I'm trying to search for. It doesn't define whether it is a city or suburb before hand. Below is my lazyly coded search while I was trying to learn how to dig through this JSON file, until I realized how complicated it was getting and got a bit stuck.
SNIP
for k in dataDict['countries']['country']:
if k['code'] == root['country']:
for y in k['regions']['region']['places']['place']:
if y['code'] == root['place']:
city = y['name']
else:
try:
for p in y['subplaces']['subplace']:
if p['code'] == root['place']:
city = p['name']
except:
pass
If I understand well, each dictionary has the following structure:
{"code": # some int
"name": # some str
none / "country" / "place" / whatever # some dict or list
You can write a recursive function that handle one and only one dict:
def foo(my_dict):
if my_dict['code'] == root['place']:
city = my_dict['name']
elif "country" in my_dict:
city = foo(my_dict['country'])
elif "place" in my_dict:
#
# and so on...
else:
city = None
return city
Hope this example will help you.