Looping through JSON in Python

I am a total beginner and I am trying to create a program for a course that will go through a dataset about some results. I face a difficulty getting to all the teams in the following JSON:
{
"name": "English Premier League 2014/15",
"rounds": [
{
"name": "Matchday 1",
"matches": [
{
"date": "2014-08-16",
"team1": {
"key": "manutd",
"name": "Manchester United",
"code": "MUN"
},
"team2": {
"key": "swansea",
"name": "Swansea",
"code": "SWA"
},
"score1": 1,
"score2": 2
},
{
"date": "2014-08-16",
"team1": {
"key": "leicester",
"name": "Leicester City",
"code": "LEI"
},
"team2": {
"key": "everton",
"name": "Everton",
"code": "EVE"
},
"score1": 2,
"score2": 2
}]
}]
}
I used these lines of code:
for matchday in js['rounds']:
    print(matchday['matches'][0]['team1']['name'])
This prints the name of the first team from the first game of every round, but I would like to print the name of the first team of every game in every round. Could someone give me a hint?

You should add a second loop and iterate over the matches:
for rounds in js['rounds']:
    for matches in rounds['matches']:
        print(matches['team1']['name'])

You are taking only the first item in the list of matches. Just loop through that list as well, like this:
for matchday in js['rounds']:
    match_day = matchday['matches']
    for each_match in match_day:
        print(each_match['team1']['name'])

# Python 3: dict.items() replaces iteritems(), and print is a function
for json_data in data['rounds']:
    for attribute, value in json_data.items():
        print(attribute, value)
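The nested-loop idea from the answers can be tried end to end with a trimmed copy of the data. This is only a minimal sketch: the JSON string below is a shortened version of the question's dataset, and the names `raw`, `home_teams` and `all_teams` are made up for illustration.

```python
import json

# Shortened copy of the dataset from the question (only the team names are kept).
raw = """
{
  "name": "English Premier League 2014/15",
  "rounds": [
    {
      "name": "Matchday 1",
      "matches": [
        {"team1": {"name": "Manchester United"}, "team2": {"name": "Swansea"}},
        {"team1": {"name": "Leicester City"}, "team2": {"name": "Everton"}}
      ]
    }
  ]
}
"""

js = json.loads(raw)

# First team of every match in every round (outer loop: rounds, inner loop: matches).
home_teams = [
    match["team1"]["name"]
    for matchday in js["rounds"]
    for match in matchday["matches"]
]

# Every distinct team, whichever side it played on.
all_teams = sorted({
    match[side]["name"]
    for matchday in js["rounds"]
    for match in matchday["matches"]
    for side in ("team1", "team2")
})

print(home_teams)
print(all_teams)
```

The comprehension reads in the same order as the two nested for loops, so it is a matter of taste which form to use.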

Related

Create/ re-create a list of dictionaries from a dictionary via Python Recursion function

I'm trying to parse this JSON object into multiple events, as that is the expected input for an ETL tool. I know this is quite straightforward with loops, if statements and explicitly defined search fields for the given events. That method is not feasible here because I have multiple heavily nested JSON objects, and I would prefer to let Python recursion handle the heavy lifting. The following is a sample object, which consists of strings, a list and dicts (it covers most use cases in the data I have).
{
"event_name": "restaurants",
"properties": {
"_id": "5a9909384309cf90b5739342",
"name": "Mangal Kebab Turkish Restaurant",
"restaurant_id": "41009112",
"borough": "Queens",
"cuisine": "Turkish",
"address": {
"building": "4620",
"coord": {
"0": -73.9180155,
"1": 40.7427742
},
"street": "Queens Boulevard",
"zipcode": "11104"
},
"grades": [
{
"date": 1414540800000,
"grade": "A",
"score": 12
},
{
"date": 1397692800000,
"grade": "A",
"score": 10
},
{
"date": 1381276800000,
"grade": "A",
"score": 12
}
]
}
}
And I want to convert it to this following list of dictionaries
[
{
"event_name": "restaurants",
"properties": {
"restaurant_id": "41009112",
"name": "Mangal Kebab Turkish Restaurant",
"cuisine": "Turkish",
"_id": "5a9909384309cf90b5739342",
"borough": "Queens"
}
},
{
"event_name": "restaurant_address",
"properties": {
"zipcode": "11104",
"ref_id": "41009112",
"street": "Queens Boulevard",
"building": "4620"
}
},
{
"event_name": "restaurant_address_coord",
"ref_id": "41009112",
"0": -73.9180155,
"1": 40.7427742
},
{
"event_name": "restaurant_grades",
"properties": {
"date": 1414540800000,
"ref_id": "41009112",
"score": 12,
"grade": "A",
"index": "0"
}
},
{
"event_name": "restaurant_grades",
"properties": {
"date": 1397692800000,
"ref_id": "41009112",
"score": 10,
"grade": "A",
"index": "1"
}
},
{
"event_name": "restaurant_grades",
"properties": {
"date": 1381276800000,
"ref_id": "41009112",
"score": 12,
"grade": "A",
"index": "2"
}
}
]
Most importantly, because these events will be broken up into independent structured tables on which to perform joins, we need to create primary keys/unique identifiers. So each deeply nested dictionary should carry its parent's ID as a ref_id field. In this case, ref_id = restaurant_id from its parent dictionary.
Most of the examples on the internet flatten the whole object so that it can be normalized into a dataframe, but to utilise this ETL tool to its full potential it would be ideal to solve this problem via recursion and output a list of dictionaries.
This is what one might call a brute force method. Create a translator function to move each item into the correct part of the new structure (like a schema).
# input dict
d = {
"event_name": "demo",
"properties": {
"_id": "5a9909384309cf90b5739342",
"name": "Mangal Kebab Turkish Restaurant",
"restaurant_id": "41009112",
"borough": "Queens",
"cuisine": "Turkish",
"address": {
"building": "4620",
"coord": {
"0": -73.9180155,
"1": 40.7427742
},
"street": "Queens Boulevard",
"zipcode": "11104"
},
"grades": [
{
"date": 1414540800000,
"grade": "A",
"score": 12
},
{
"date": 1397692800000,
"grade": "A",
"score": 10
},
{
"date": 1381276800000,
"grade": "A",
"score": 12
}
]
}
}
def convert_structure(d: dict):
    '''Function to convert to the new structure.'''
    # the new dict
    e = {}
    e['event_name'] = d['event_name']
    e['properties'] = {}
    e['properties']['restaurant_id'] = d['properties']['restaurant_id']
    # and so forth...
    # keep building the new structure / template
    # return a list
    return [e]

# run & print
x = convert_structure(d)
print(x)
The result (for the part done) looks like this:
[{'event_name': 'demo', 'properties': {'restaurant_id': '41009112'}}]
If a pattern is identified, then the above could be improved...
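One pattern that can be read off the desired output is: scalar values stay in the current event, each nested dict becomes its own event, and each dict inside a list becomes an event with an index. Below is a rough recursive sketch of that idea, not a definitive implementation. The function name `split_events`, the `id_key` parameter and the naming scheme (parent name plus key, so "restaurants_address" rather than the "restaurant_address" in the question) are assumptions, and every child event is wrapped in "properties" for consistency.

```python
def split_events(name, props, ref_id=None, id_key="restaurant_id"):
    """Recursively split a nested dict into a flat list of event dicts.

    Hypothetical helper: scalars stay in this event, nested dicts and
    lists of dicts become child events that carry the parent's id as ref_id.
    """
    # The id handed down to children: this level's own id if it has one,
    # otherwise the id inherited from further up.
    own_id = props.get(id_key, ref_id)

    flat = {}
    children = []  # (key, child dict, list index or None)
    for key, value in props.items():
        if isinstance(value, dict):
            children.append((key, value, None))
        elif isinstance(value, list) and value and all(isinstance(v, dict) for v in value):
            children.extend((key, item, i) for i, item in enumerate(value))
        else:
            flat[key] = value

    if ref_id is not None:
        flat["ref_id"] = ref_id

    events = [{"event_name": name, "properties": flat}]
    for key, child, index in children:
        child_events = split_events(f"{name}_{key}", child, own_id, id_key)
        if index is not None:
            # Record the position of this item in its parent list, as a string.
            child_events[0]["properties"]["index"] = str(index)
        events.extend(child_events)
    return events


# Shortened version of the sample object from the question.
sample = {
    "event_name": "restaurants",
    "properties": {
        "restaurant_id": "41009112",
        "name": "Mangal Kebab Turkish Restaurant",
        "address": {
            "street": "Queens Boulevard",
            "coord": {"0": -73.9180155, "1": 40.7427742},
        },
        "grades": [
            {"grade": "A", "score": 12},
            {"grade": "A", "score": 10},
        ],
    },
}

events = split_events(sample["event_name"], sample["properties"])
for event in events:
    print(event)
```

If the singular event names from the question are required, the name mapping would need to be supplied explicitly, for example through a lookup dict, since it cannot be derived reliably from the keys.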

How can I find a specific key from a python dict and then get a value from that key in Python

I have a Python list of dictionaries that looks something like this:
[
{
"timestamp": 1621559698154,
"user": {
"uri": "spotify:user:xxxxxxxxxxxxxxxxxxxx",
"name": "Panda",
"imageUrl": "https://i.scdn.co/image/ab67757000003b82b54c68ed19f1047912529ef4"
},
"track": {
"uri": "spotify:track:6SJSOont5dooK2IXQoolNQ",
"name": "Dirty",
"imageUrl": "http://i.scdn.co/image/ab67616d0000b273a36e3d46e406deebdd5eafb0",
"album": {
"uri": "spotify:album:0NMpswZbEcswI3OIe6ml3Y",
"name": "Dirty (Live)"
},
"artist": {
"uri": "spotify:artist:4ZgQDCtRqZlhLswVS6MHN4",
"name": "grandson"
},
"context": {
"uri": "spotify:artist:4ZgQDCtRqZlhLswVS6MHN4",
"name": "grandson",
"index": 0
}
}
},
{
"timestamp": 1621816159299,
"user": {
"uri": "spotify:user:xxxxxxxxxxxxxxxxxxxxxxxx",
"name": "maja",
"imageUrl": "https://i.scdn.co/image/ab67757000003b8286459151d5426f5a9e77cfee"
},
"track": {
"uri": "spotify:track:172rW45GEnGoJUuWfm1drt",
"name": "Your Best American Girl",
"imageUrl": "http://i.scdn.co/image/ab67616d0000b27351630f0f26aff5bbf9e10835",
"album": {
"uri": "spotify:album:16i5KnBjWgUtwOO7sVMnJB",
"name": "Puberty 2"
},
"artist": {
"uri": "spotify:artist:2uYWxilOVlUdk4oV9DvwqK",
"name": "Mitski"
},
"context": {
"uri": "spotify:playlist:0tol7yRYYfiPJ17BuJQKu2",
"name": "I Bet on Losing Dogs",
"index": 0
}
}
}
]
How can I get, for example, the group of values for user.name "Panda" and then get that specific "track" list? I can't parse through the list by index because the list order changes randomly.
If you are only looking for "Panda", then you can just loop over the list, check whether the name is "Panda", and then retrieve the track list accordingly.
Otherwise, that would be inefficient if you want to do that for many different users. I would first make a dict that maps user to its index in the list, and then use that for each user (I am assuming that the list does not get modified while you execute the code, although it can be modified between executions.)
user_to_id = {data[i]['user']['name']: i for i in range(len(data))}  # {'Panda': 0, 'maja': 1}

def get_track(user):
    return data[user_to_id[user]]['track']

print(get_track('maja'))
print(get_track('Panda'))
where data is the list you provided.
Or, perhaps just make a dictionary of tracks directly:
tracks = {item['user']['name']: item['track'] for item in data}
print(tracks['Panda'])
If you want to get list of tracks for user Panda:
tracks = [entry['track'] for entry in data if entry['user']['name'] == 'Panda']
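Note that a dict comprehension keyed by user name keeps only the last entry for each user. If the same user can appear more than once, grouping the tracks into lists avoids losing entries. The following is a small sketch of that idea; the data is a trimmed copy of the question's list, and the third entry is invented purely to show a repeated user.

```python
from collections import defaultdict

# Trimmed copy of the list from the question.
# The third entry is hypothetical, added only to show a user appearing twice.
data = [
    {"timestamp": 1621559698154, "user": {"name": "Panda"}, "track": {"name": "Dirty"}},
    {"timestamp": 1621816159299, "user": {"name": "maja"}, "track": {"name": "Your Best American Girl"}},
    {"timestamp": 1621900000000, "user": {"name": "Panda"}, "track": {"name": "Hypothetical Second Track"}},
]

# Map each user name to the list of all their tracks, in list order.
tracks_by_user = defaultdict(list)
for entry in data:
    tracks_by_user[entry["user"]["name"]].append(entry["track"])

print(tracks_by_user["Panda"])
```

Because the lookup is by name rather than by position, it does not matter that the order of the list changes between runs.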

How do I make this JSON structure work as intended?

I have some data from a project where the key holding a vehicle's details can be either "Motorcycle" or "Car". I need to get the name of each vehicle, and that value is nested inside each vehicle entry.
This is not the data I will be using, but it has the same structure. The "official" data contains personal information, so I changed it to some random values. I cannot change the structure of the JSON data, since this is the way the server admins decided to structure it.
This is my Python code:
import json

with open('exampleData.json') as j:
    data = json.load(j)

name = 0
Vehicle = 0
for x in data:
    print(data['persons'][x]['name'])
    for i in data['persons'][x]['things']["Vehicles"]:
        print(data['persons'][x]['things']['Vehicles'][i]['type']['name'])
    print("\n")
This is the JSON data I extracted from the file "ExampleData.json" (sorry for the length, but it is fairly complex and necessary to understand the problem):
{
"total": 2,
"persons": [
{
"name": "Sven Svensson",
"things": {
"House": "apartment",
"Vehicles": [
{
"id": "46",
"type": {
"name": "Kawasaki ER6N",
"type": "motorcyle"
},
"Motorcycle": {
"plate": "aaa111",
"fields": {
"brand": "Kawasaki",
"status": "in shop"
}
}
},
{
"id": "44",
"type": {
"name": "BMW m3",
"type": "Car"
},
"Car": {
"plate": "bbb222",
"fields": {
"brand": "BMW",
"status": "in garage"
}
}
}
]
}
},
{
"name": "Eric Vivian Matthews",
"things": {
"House": "House",
"Vehicles": [
{
"id": "44",
"type": {
"name": "Volvo XC90",
"type": "Car"
},
"Car": {
"plate": "bbb222",
"fields": {
"brand": "Volvo",
"status": "in garage"
}
}
}
]
}
}
]
}
I want it to print out something like this :
Sven Svensson
Bmw M3
Kawasaki ER6n
Eric Vivian Matthews
Volvo XC90
But I get this error:
print(data['persons'][x]['name'])
TypeError: list indices must be integers or slices, not str
Process finished with exit code 1
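What you need is:
for person in data["persons"]:
    for vehicle in person["things"]["Vehicles"]:  # the key is "Vehicles", with a capital V
        print(vehicle["type"]["name"])
        # The details key ("Motorcycle", "Car") does not always match
        # vehicle["type"]["type"] exactly (e.g. "motorcyle"), so take
        # whichever key is neither "id" nor "type".
        details_key = next(key for key in vehicle if key not in ("id", "type"))
        print(vehicle[details_key]["plate"])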
Iterating over a dict yields its keys, not its values, so in
for x in data:
x takes the string values "total" and "persons". Using such a string to index the list data['persons']
print(data['persons'][x]['name'])
is what causes the error.
What you need is to iterate over the list of persons itself, and then over each person's vehicles, like so:
for x in data['persons']:
    print(x['name'])
    for vehicle in x['things']['Vehicles']:
        print(vehicle['type']['name'])
    print('\n')
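To see where the TypeError comes from, it can help to look at what the outer loop actually iterates over. This is a small self-contained sketch using a cut-down copy of the question's data; the names `keys` and `lines` are made up for the example.

```python
# Cut-down copy of the data from the question.
data = {
    "total": 2,
    "persons": [
        {
            "name": "Sven Svensson",
            "things": {"Vehicles": [
                {"type": {"name": "Kawasaki ER6N"}},
                {"type": {"name": "BMW m3"}},
            ]},
        },
        {
            "name": "Eric Vivian Matthews",
            "things": {"Vehicles": [
                {"type": {"name": "Volvo XC90"}},
            ]},
        },
    ],
}

# Iterating over a dict yields its keys, which are strings here.
keys = list(data)
print(keys)

# Iterating over the list under "persons" yields the person dicts themselves.
lines = []
for person in data["persons"]:
    lines.append(person["name"])
    for vehicle in person["things"]["Vehicles"]:
        lines.append(vehicle["type"]["name"])

print("\n".join(lines))
```

The original loop tried `data['persons']['total']`, which is a string index into a list, and that is exactly what the error message reports.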

Getting Deeper Level JSON Values in Python

I have a Python script that make an API call to retrieve data from Zendesk. (Using Python 3.x) The JSON object has a structure like this:
{
"id": 35436,
"url": "https://company.zendesk.com/api/v2/tickets/35436.json",
"external_id": "ahg35h3jh",
"created_at": "2009-07-20T22:55:29Z",
"updated_at": "2011-05-05T10:38:52Z",
"type": "incident",
"subject": "Help, my printer is on fire!",
"raw_subject": "{{dc.printer_on_fire}}",
"description": "The fire is very colorful.",
"priority": "high",
"status": "open",
"recipient": "support#company.com",
"requester_id": 20978392,
"submitter_id": 76872,
"assignee_id": 235323,
"organization_id": 509974,
"group_id": 98738,
"collaborator_ids": [35334, 234],
"forum_topic_id": 72648221,
"problem_id": 9873764,
"has_incidents": false,
"due_at": null,
"tags": ["enterprise", "other_tag"],
"via": {
"channel": "web"
},
"custom_fields": [
{
"id": 27642,
"value": "745"
},
{
"id": 27648,
"value": "yes"
}
],
"satisfaction_rating": {
"id": 1234,
"score": "good",
"comment": "Great support!"
},
"sharing_agreement_ids": [84432]
}
Where I am running into issues is in the "custom_fields" section specifically. I have a particular custom field inside of each ticket I need the value for, and I only want that particular value.
To spare you too many specifics of the Python code, I am reading through each value below for each ticket and adding it to an output variable before writing that output variable to a .csv. Here is the particular place where the breakage occurs:
output += str(ticket['custom_fields'][id:23825198]).replace(',', '')+','
All the replace nonsense is to make sure that since it is going into a comma delimited file, any commas inside of the values are removed. Anyway, here is the error I am getting:
output += str(ticket['custom_fields'][id:int(23825198)]).replace(',', '')+','
TypeError: slice indices must be integers or None or have an __index__ method
As you can see I have tried a couple different variations of this to try and resolve the issue, and have yet to find a fix. I could use some help!
Thanks...
Are you using json.loads()? If so you can then get the keys, and do an if statement against the keys. An example on how to get the keys and their respective values is shown below.
import json
some_json = """{
"id": 35436,
"url": "https://company.zendesk.com/api/v2/tickets/35436.json",
"external_id": "ahg35h3jh",
"created_at": "2009-07-20T22:55:29Z",
"updated_at": "2011-05-05T10:38:52Z",
"type": "incident",
"subject": "Help, my printer is on fire!",
"raw_subject": "{{dc.printer_on_fire}}",
"description": "The fire is very colorful.",
"priority": "high",
"status": "open",
"recipient": "support#company.com",
"requester_id": 20978392,
"submitter_id": 76872,
"assignee_id": 235323,
"organization_id": 509974,
"group_id": 98738,
"collaborator_ids": [35334, 234],
"forum_topic_id": 72648221,
"problem_id": 9873764,
"has_incidents": false,
"due_at": null,
"tags": ["enterprise", "other_tag"],
"via": {
"channel": "web"
},
"custom_fields": [
{
"id": 27642,
"value": "745"
},
{
"id": 27648,
"value": "yes"
}
],
"satisfaction_rating": {
"id": 1234,
"score": "good",
"comment": "Great support!"
},
"sharing_agreement_ids": [84432]
}"""
# load the json object
zenJSONObj = json.loads(some_json)
# Shows a list of all custom fields
print("All the custom field data")
print(zenJSONObj['custom_fields'])
print("----")
# Tells you all the keys in the custom_fields
print("Show keys and the values")
for custom_field in zenJSONObj['custom_fields']:
    print("----")
    for key in custom_field.keys():
        print("key:", key, " value: ", custom_field[key])
You can then modify the JSON object by doing something like
print(zenJSONObj['custom_fields'][0])
zenJSONObj['custom_fields'][0]['value'] = 'something new'
print(zenJSONObj['custom_fields'][0])
Then re-encode it using the following:
newJSONObject = json.dumps(zenJSONObj, sort_keys=True, indent=4)
I hope this is of some help.
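The error in the question comes from `[id:23825198]`, which Python reads as a slice whose start is the built-in function `id`. Since "custom_fields" is a list of dicts, one way to get a single value is to loop over the list and compare each entry's "id". The sketch below assumes that approach; `get_custom_field` is a hypothetical helper, and the `csv` module is used so that commas inside values are quoted instead of having to be stripped.

```python
import csv
import io


def get_custom_field(ticket, field_id, default=None):
    """Return the value of the custom field with the given id, or default."""
    for field in ticket.get("custom_fields", []):
        if field.get("id") == field_id:
            return field.get("value")
    return default


# Cut-down ticket based on the structure shown in the question.
ticket = {
    "id": 35436,
    "subject": "Help, my printer is on fire!",
    "custom_fields": [
        {"id": 27642, "value": "745"},
        {"id": 27648, "value": "yes"},
    ],
}

print(get_custom_field(ticket, 27648))     # field present in this ticket
print(get_custom_field(ticket, 23825198))  # field absent in this sample, so None

# csv.writer quotes fields that contain commas, so no replace() is needed.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow([ticket["id"], get_custom_field(ticket, 27642), ticket["subject"]])
print(buffer.getvalue())
```

In a real script the `io.StringIO` buffer would be replaced by a file opened with `newline=""`, as the `csv` module documentation recommends.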

Python - Find value anywhere within JSON and return location

In Python I'm currently working with a very large JSON file with some deep dictionaries and arrays. I'm having an issue where its structure is not consistent. The example below is essentially countries, with regions/states, cities, and suburbs. The issue is that if there is only one suburb, it is a dictionary, but if there is more than one, it is an array of dictionaries, which forces me to add another line of code to go deeper. Sure, I can if/else/for it, but this is only a very small portion of the inconsistency, and it doesn't feel right to use if/else all the time.
What I'd like to do is simply search anything within Belgium for the dictionary entry "code": "8400" and return its location within the JSON file. What would be my best approach to do something like this? Thanks!
***SNIP***
{
"code": "BE",
"name": "Belgium",
"regions": {
"region": [
{
"code": "45",
"name": "Flanders",
"places": {
"place": [
{
"code": "1790",
"name": "Affligem"
},
{
"code": "8570",
"name": "Anzegem"
},
{
"code": "8630",
"name": "Diksmuide"
},
{
"code": "9600",
"name": "Ronse"
}
]
},
"subregions": {
"subregion": [
{
"code": "46",
"name": "Coast",
"places": {
"place": [
{
"code": "8300",
"name": "Knokke-Heist"
},
{
"code": "8400",
"name": "Oostende",
"subplaces": {
"subplace": {
"code": "8450",
"name": "Bredene"
}
}
},
{
"code": "8420",
"name": "De Haan"
},
{
"code": "8430",
"name": "Middelkerke"
},
{
"code": "8434",
"name": "Westende-Bad"
},
{
"code": "8490",
"name": "Jabbeke"
},
{
"code": "8660",
"name": "De Panne"
},
{
"code": "8670",
"name": "Oostduinkerke"
}
]
}
},
{
"code": "47",
"name": "Cities",
"places": {
"place": [
{
"code": "1000",
"name": "Brussels"
},
{
"code": "2000",
"name": "Antwerp"
},
{
"code": "8000",
"name": "Bruges"
},
{
"code": "8340",
"name": "Damme"
},
{
"code": "9000",
"name": "Gent"
}
]
}
},
{
"code": "48",
"name": "Interior",
"places": {
"place": [
{
"code": "2260",
"name": "Westerlo"
},
{
"code": "2400",
"name": "Mol"
},
{
"code": "2590",
"name": "Berlaar"
},
{
"code": "8500",
"name": "Kortrijk",
"subplaces": {
"subplace": {
"code": "8940",
"name": "Wervik"
}
}
},
{
"code": "8610",
"name": "Handzame"
},
{
"code": "8755",
"name": "Ruiselede"
},
{
"code": "8900",
"name": "Ieper"
},
{
"code": "8970",
"name": "Poperinge"
}
]
}
},
EDIT:
I was asked to show how I'm currently getting through this JSON file. Root is a dictionary containing the codes of the city/suburb I'm trying to search for. It doesn't say beforehand whether it is a city or a suburb. Below is my lazily coded search, written while I was trying to learn how to dig through this JSON file, until I realized how complicated it was getting and got a bit stuck.
SNIP
for k in dataDict['countries']['country']:
    if k['code'] == root['country']:
        for y in k['regions']['region']['places']['place']:
            if y['code'] == root['place']:
                city = y['name']
            else:
                try:
                    for p in y['subplaces']['subplace']:
                        if p['code'] == root['place']:
                            city = p['name']
                except:
                    pass
If I understand correctly, each dictionary has the following structure:
{
    "code": ...,  # some str
    "name": ...,  # some str
    # optionally "country" / "place" / whatever: some dict or list
}
You can write a recursive function that handles one and only one dict:
def foo(my_dict):
    if my_dict['code'] == root['place']:
        city = my_dict['name']
    elif "country" in my_dict:
        city = foo(my_dict['country'])
    elif "place" in my_dict:
        city = foo(my_dict['place'])
    # and so on for the other nested keys...
    else:
        city = None
    return city
I hope this example helps.
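A more generic variant of the recursive idea is to walk every dict and list without knowing the key names in advance, and to record the path taken. That also sidesteps the "dict when there is one, list when there are several" inconsistency, because both cases are simply descended into. The sketch below is one possible way to do it; `find_paths` and `get_at` are hypothetical names, and the data is a heavily shortened copy of the Belgium fragment.

```python
def find_paths(node, key, value, path=()):
    """Yield the path (tuple of dict keys and list indexes) to every dict
    in which node[key] == value, searching dicts and lists recursively."""
    if isinstance(node, dict):
        if node.get(key) == value:
            yield path
        for child_key, child in node.items():
            yield from find_paths(child, key, value, path + (child_key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from find_paths(child, key, value, path + (index,))


def get_at(node, path):
    """Follow a path produced by find_paths back down to the matching dict."""
    for step in path:
        node = node[step]
    return node


# Heavily shortened copy of the Belgium fragment from the question.
belgium = {
    "code": "BE",
    "name": "Belgium",
    "regions": {"region": [{
        "code": "45",
        "name": "Flanders",
        "subregions": {"subregion": [{
            "code": "46",
            "name": "Coast",
            "places": {"place": [
                {"code": "8300", "name": "Knokke-Heist"},
                {"code": "8400", "name": "Oostende",
                 "subplaces": {"subplace": {"code": "8450", "name": "Bredene"}}},
            ]},
        }]},
    }]},
}

paths = list(find_paths(belgium, "code", "8400"))
print(paths)
print(get_at(belgium, paths[0])["name"])

# A single subplace stored as a dict (not a list) is found the same way.
suburb_paths = list(find_paths(belgium, "code", "8450"))
print(get_at(belgium, suburb_paths[0])["name"])
```

The function is a generator, so `next(find_paths(...), None)` can be used to stop at the first match in a very large file instead of collecting all of them.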
