How to iterate through "nameless" JSON entries in Python? - python

I have some JSON data from the HubSpot CRM API which, after doing some pagination using Python Code essentially looks like this:
[
{
"dealId": 18039629,
"portalId": 62515,
"isDeleted": false
},
{
"dealId": 18040854,
"portalId": 62515,
"isDeleted": false
}
]
... and now what I'd like to do is:
1) Read one "set" of JSON at a time (meaning dealId, portalId, isDeleted)
2) If isDeleted==false then grab the dealId and portalId and store in variables
3) Use the variables from #2 above to build a URL string that can be used to go back to the HubSpot API and get information on each individual deal (this API endpoint is https://api.hubapi.com/deals/v1/deal/23487870?hapikey=demo (where 23487870 is the dealId from the above JSON)
4) Combine that individual deal-level info into another set of JSON. Specifically I want to grab /properties/dealname/value and properties/dealstage/value from JSON that looks like this:
{
"portalId": 62515,
"dealId": 23487870,
"isDeleted": false,
"properties": {
"dealname": {
"value": "AutomationTestUser-national",
"timestamp": 1457692022120
},
"dealstage": {
"value": "appointmentscheduled",
"timestamp": 1457692022120
}
},
"imports": []
}
5) And then output the final result in JSON something like this:
{
"deals":
[
{
"portalId": 62515,
"dealId": 23487870,
"isDeleted": false,
"properties": {
"dealname": {
"value": "AutomationTestUser-national",
"timestamp": 1457692022120
},
"dealstage": {
"value": "appointmentscheduled",
"timestamp": 1457692022120
}
},
"imports": []
},
{
"portalId": 62515,
"dealId": 23487871,
"isDeleted": false,
"properties": {
"dealname": {
"value": "AutomationTestUser-regional",
"timestamp": 1457692022120
},
"dealstage": {
"value": "appointmentscheduled",
"timestamp": 1457692022120
}
},
"imports": []
}
]
}
... all in Python.
Any help?

import json
output = {"deals": []}
data = """
[
{
"dealId": 18039629,
"portalId": 62515,
"isDeleted": false
},
{
"dealId": 18040854,
"portalId": 62515,
"isDeleted": false
}
]
"""
j = json.loads(data)
for deal in j:
if deal['isDeleted']:
continue # Ignore if deleted
dealId = deal['dealId']
url = 'https://api.hubapi.com/deals/v1/deal/' + str(dealId) + '?hapikey=demo' # Construct new url here
data = getSource(url) # implement getSource to perform web request and get data
j2 = json.loads(data)
output['deals'].append(j2)
print(output)

Related

Spotify API: Error parsing through JSON with Python

I have had trouble appending id's to a separate list as I parse through the JSON I receive from Spotify's "Users Saved Tracks" endpoint.
The JSON received looks like this:
{
"href": "https://api.spotify.com/v1/me/tracks?offset=0&limit=20",
"items": [
{
"added_at": "2021-11-16T13:56:51Z",
"track": {
"album": {
"album_type": "single",
"artists": [
{
"external_urls": {
"spotify": "https://open.spotify.com/artist/3iKDeO8yaOiWz7vkeljunk"
},
"href": "https://api.spotify.com/v1/artists/3iKDeO8yaOiWz7vkeljunk",
"id": "3iKDeO8yaOiWz7vkeljunk",
"name": "Heavenward",
"type": "artist",
"uri": "spotify:artist:3iKDeO8yaOiWz7vkeljunk"
}
],
"available_markets": [
],
"disc_number": 1,
"duration_ms": 224838,
"explicit": false,
"external_ids": {
"isrc": "QZK6P2040977"
},
"external_urls": {
"spotify": "https://open.spotify.com/track/6mJ1nbmQOm6iNClo71K5O6"
},
"href": "https://api.spotify.com/v1/tracks/6mJ1nbmQOm6iNClo71K5O6",
"id": "6mJ1nbmQOm6iNClo71K5O6",
"is_local": false,
"name": "Hole",
"popularity": 33,
"preview_url": "https://p.scdn.co/mp3-preview/c425dc91bdb19f1cddf2b35df08e30a03290c3c0?cid=8c9ee97b95854163a250399fda32d350",
"track_number": 1,
"type": "track",
"uri": "spotify:track:6mJ1nbmQOm6iNClo71K5O6"
}
}
Right now my code that I am using to parse looks like this:
def getLikedTrackIds(session):
url = 'https://api.spotify.com/v1/me/tracks'
payload = makeGetRequest(session, url)
if payload == None:
return None
liked_tracks_ids = []
for track in payload['items']:
for attribute in track['track']:
if (attribute == 'id'):
app.logger.info(f"\n\nTrack ID: {attribute}")
liked_tracks_ids.append(attribute)
return liked_tracks_ids
My liked_track_ids is filled with the string "id", for each song:
[ "id", "id", "id", "id"....]
Can anyone provide insight as to what I am doing wrong?
Already commented under the question but your code can be simplified by getting rid of the loop:
def getLikedTrackIds(session):
url = 'https://api.spotify.com/v1/me/tracks'
payload = makeGetRequest(session, url)
if payload == None:
return None
liked_tracks_ids = []
for track in payload['items']:
liked_id = track['track'].get('id', None)
if liked_id:
app.logger.info(f"\n\nTrack ID: {liked_id}")
liked_tracks_ids.append(liked_id)
return liked_tracks_ids

JSON dump appending random characters to end of file

I am writing a parser that goes through a list of data that is roughly formatted:
{
"teachers": [
{
"fullName": "Testing",
"class": [
{
"className": "Counselor",
"school": {
"id": "2b6671cb-617d-48d6-b0b5-3d44ce4da21c"
}
}
]
},
...
}
The parser is supposed to check for duplicate names within this json object, and when it stumbles upon said duplicate name, append the class to the class array.
So for example:
{
"teachers": [
{
"fullName": "Testing",
"class": [
{
"className": "Counselor",
"school": {
"id": "2b6671cb-617d-48d6-b0b5-3d44ce4da21c"
}
}
]
},
{
"fullName": "Testing",
"class": [
{
"className": "Math 8",
"school": {
"id": "2b6671cb-617d-48d6-b0b5-3d44ce4da21c"
}
}
]
},
...
}
Would return
{
"teachers": [
{
"fullName": "Testing",
"class": [
{
"className": "Counselor",
"school": {
"id": "2b6671cb-617d-48d6-b0b5-3d44ce4da21c"
}
},
{
"className": "Math 8",
"school": {
"id": "2b6671cb-617d-48d6-b0b5-3d44ce4da21c"
}
},
]
},
...
}
My current parser works just fine for most objects, however for some reason it doesn't catch some of the duplicates despite the names being the exact same, and also is appending the string
}7d-48d6-b0b5-3d44ce4da21c"
}
}
]
}
]
to the end of the json document. I am not sure why it would do this considering I am just dumping the modified json (which only is modified within the array).
My parser code is:
i_duplicates = []
name_duplicates = []
def converter():
global i_duplicates
file = open("final2.json", "r+")
infinite = json.load(file)
for i, teacher in enumerate(infinite["teachers"]):
class_name = teacher["class"][0]["className"]
class_data = {
"className": class_name,
"school": {
"id": "2b6671cb-617d-48d6-b0b5-3d44ce4da21c"
}
}
d = {
"fullName": teacher["fullName"],
"index": i
}
c = {
"fullName": teacher["fullName"]
}
if c in name_duplicates:
infinite["teachers"][search(i_duplicates, c["fullName"])]["class"].append(class_data)
infinite["teachers"].pop(i)
file.seek(0)
json.dump(infinite, file, indent=4)
else:
i_duplicates.append(d)
name_duplicates.append(c)
def search(a, t):
for i in a:
if i["fullName"] == t:
return i["index"]
print(Fore.RED + "not found" + Fore.RESET)
I know I am going about this inefficiently, but I am not sure how to fix the issues the current algorithm is having. Any feedback appreciated.

Python JSON schema validation for array of objects

I am trying to validate a JSON file using the schema listed below, I can enter any additional fields, I don't understand, what I am doing wrong and why please?
Sample JSON Data
{
"npcs":
[
{
"id": 0,
"name": "Pilot Alpha",
"isNPC": true,
"race": "1e",
"testNotValid": false
},
{
"id": 1,
"name": "Pilot Beta",
"isNPC": true,
"race": 1
}
]
}
JSON Schema
I have set "required" and "additionalProperties" so I thought the validation would fail....
FileSchema = {
"definitions":
{
"NpcEntry":
{
"properties":
{
"id": { "type": "integer" },
"name": { "type" : "string" },
"isNPC": { "type": "boolean" },
"race": { "type" : "integer" }
},
"required": [ "id", "name", "isNPC", "race" ],
"additionalProperties": False
}
},
"type": "object",
"required": [ "npcs" ],
"additionalProperties": False,
"properties":
{
"npcs":
{
"type": "array",
"npcs": { "$ref": "#/definitions/NpcEntry" }
}
}
}
The JSON file and schema are processed using the jsonschema package for Python, (I am using python 3.7 on a Mac).
The method I use to read and validate is below, I have removed a lot of the general validation to make the code as short and usable as possible:
import json
import jsonschema
def _ReadJsonfile(self, filename, schemaSystem, fileType):
with open(filename) as fileHandle:
fileContents = fileHandle.read()
jsonData = json.loads(fileContents)
try:
jsonschema.validate(instance=jsonData, schema=schemaSystem)
except jsonschema.exceptions.ValidationError as ex:
print(f"JSON schema validation failed for file '{filename}'")
return None
return jsonData
at: "npcs": { "$ref": "#/definitions/NpcEntry" }
change "npcs" to "items". npcs is not a valid keyword so it is ignored. The only validation that is happening is at the top level, verifying that the data is an object and that the one property is an array.

Navigating through a JSON with multiples arrays in Python

I'm trying to go through a JSON by using python but I can't access the "mbid" node. I want to print only the first "mbid" node.
Here is my function :
def get_data():
newJsonx = dict()
for item in data["resultsPage"]["results"]["calendarEntry"]:
mbid = item["event"]["performance"][0]["artist"]["identifier"][0]["mbid"]
With this function i get this error : IndexError: list index out of range
but when I'm doing
def get_data():
newJsonx = dict()
for item in data["resultsPage"]["results"]["calendarEntry"]:
mbid = item["event"]["performance"][0]["artist"]["identifier"]
And print(mbid), I'm getting a correct answer :
"identifier": [
{
"mbid": "6655955b-1c1e-4bcb-84e4-81bcd9efab30"
},
{
"mbid": "1b1b1b1b-1c1d"
}
]
So means I don't have a problem with the data. Maybe I'm doing something wrong with the second array?
Here is an example of the JSON structure :
{
"resultsPage": {
"status": "ok",
"results": {
"calendarEntry": [
{
"reason": {
},
"event": {
"performance": [
{
"id": 72641494,
"displayName": "Arnalds",
"artist": {
"id": 590465,
"identifier": [
{
"mbid": "6655955b-1c1e-4bcb-84e4-81bcd9efab30"
},
{
"mbid": "1b1b1b1b-1c1d"
}
]
}
}
]
}
}
]
}
}
}
Thanks for your time
def get_data():
newJsonx = dict()
for item in data["resultsPage"]["results"]["calendarEntry"]:
performance=item["event"]["performance"]
if performace:
identifier=performace[0]["artist"]["identifier"]
if identifier:
mbid=identifier[0]["mbid"]

How to make a 'outer' JSON key for JSON object with python

I would like to make the following JSON syntax output with python:
data={
"timestamp": "1462868427",
"sites": [
{
"name": "SiteA",
"zone": 1
},
{
"name": "SiteB",
"zone": 7
}
]
}
But I cannot manage to get the 'outer' data key there.
So far I got this output without the data key:
{
"timestamp": "1462868427",
"sites": [
{
"name": "SiteA",
"zone": 1
},
{
"name": "SiteB",
"zone": 7
}
]
}
I have tried with this python code:
sites = [
{
"name":"nameA",
"zone":123
},
{
"name":"nameB",
"zone":324
}
]
data = {
"timestamp": 123456567,
"sites": sites
}
print(json.dumps(data, indent = 4))
But how do I manage to get the outer 'data' key there?
Once you have your data ready, you can simply do this :
data = {'data': data}
JSON doesn't have =, it's all key:value.
What you're looking for is
data = {
"data": {
"timestamp": 123456567,
"sites": sites
}
}
json.dumps(data)
json.dumps() doesn't care for the name you give to the data object in python. You have to specify it manually inside the object, as a string.

Categories

Resources