I have two data structures: a dictionary that acts as a "filter", and a list of tutor dictionaries.
Currently I check each tutor like this:
if all(item in tutor.items() for item in filters.items()):
The problem is that each tutor has a list of programs they can teach in, such as Maple or Geogebra. There are many options, and a tutor may teach in several programs. If I specify the program Maple in the filters, I don't want only the tutors that teach ONLY in Maple; I want every tutor whose programs list contains Maple.
So I somehow need to rewrite if all(item in tutor.items() for item in filters.items()):
into something like if contains(item in tutor.items() for item in filters.items()):
But that of course doesn't work.
A tutor dictionary looks something like this:
{ 'age': 34,
'age_interval': '27+',
'car': 'No',
'course': '',
'educational_institution': 'Not set by tutor',
'email': 'ronaldreagon.rr#gmail.com',
'first_name': 'Ronald',
'fluent_danish': 'Not set by tutor',
'fluent_other': [''],
'gender': 'Not set by tutor',
'grade': '',
'gym_type': 'STX',
'has_second': False,
'hour_interval': None,
'hours': 0,
'id': 112306,
'inactive_reason': 'Jeg tør sgu ikke give ham forløb, så gør ham inaktiv '
'-Elmar',
'last_name': 'Reagon Ravi Kumar',
'lat': 55.78319639999999,
'lat_alternative': 0,
'lng': 12.5151532,
'lng_alternative': 0,
'mobile_phone': '+45 50213154',
'more_courses': 'Yes',
'programs': 'Not set by tutor',
'status': 'Inactive',
'still_gym': 'Not set by tutor',
'subjects': 'None, ',
'tutor_address': 'Elektrovej 330 K5 2800 kongens lyngby',
'tutor_amount_of_students': 0,
'tutor_gym': 'Not set by tutor',
'tutor_qualification': 'Not set by tutor',
'tutor_uni': 'Går ikke på en videregående uddannelse'}
{ 'age': 19,
'age_interval': '18 til 20',
'car': 'No',
'course': '',
'educational_institution': 'Not set by tutor',
'email': 'Katrinenm02#gmail.com',
'first_name': 'Katrine',
'fluent_danish': 'Not set by tutor',
'fluent_other': [''],
'gender': 'Not set by tutor',
'grade': '',
'gym_type': 'STX',
'has_second': False,
'hour_interval': None,
'hours': 0,
'id': 112356,
'inactive_reason': 'Inaktiv fordi hun er Kathrine',
'last_name': 'Mikha',
'lat': 55.653212,
'lat_alternative': 0,
'lng': 12.296957,
'lng_alternative': 0,
'mobile_phone': '53200337',
'more_courses': 'Yes',
'programs': 'Not set by tutor',
'status': 'Inactive',
'still_gym': 'Not set by tutor',
'subjects': 'None, ',
'tutor_address': 'Taastrup Have 8 st. TH',
'tutor_amount_of_students': 0,
'tutor_gym': 'Not set by tutor',
'tutor_qualification': 'Not set by tutor',
'tutor_uni': 'Går ikke på en videregående uddannelse'}
{ 'age': 19,
The filters just specify the same keys, each with a value. For instance:
{
"gym_type": "STX"
}
This is done through a GET request to our API:
@api.route("/test", methods=["GET"])
def validate_api_request():
    try:
        filters = request.json
        return get_matching_tutors(filters)
    except Exception:
        return error_response(400, "Bad request: error in body")

def get_matching_tutors(filters):
    matching_tutors = []
    for tutor in tutor_list:
        if all(item in tutor.items() for item in filters.items()):
            matching_tutors.append(tutor)
    return jsonify(matching_tutors)
Let's say I specify this in the API call.
{
"programs": [
"Excel"
]
}
What I want back is a list of all the tutors that can teach in Excel. Many tutors can teach in Excel and in other programs too, but currently I only get the tutors that teach ONLY in Excel. So the expected result should include a tutor like this:
{
"age": 24,
"age_interval": "24 til 26",
"car": "Yes",
"course": "4. prioritet (Foretrækker fysisk)",
"educational_institution": "Københavns Universitet",
"email": "hdl543#alumni.ku.dk",
"first_name": "Ahmed",
"fluent_danish": "Yes",
"fluent_other": [
"Engelsk"
],
"gender": "Mand",
"grade": "7 til 8",
"gym_type": "STX",
"has_second": true,
"hour_interval": null,
"hours": 0,
"id": 134781,
"inactive_reason": "Blank",
"last_name": "Osman Mohammed",
"lat": 55.70321,
"lat_alternative": 55.68784609999999,
"lng": 12.530245,
"lng_alternative": 12.5696519,
"mobile_phone": "42313324",
"more_courses": "Yes",
"programs": [
"TI-Nspire",
"Geogebra",
"Wordmat",
"Excel",
"STATA"
],
"status": "Active",
"still_gym": "Jeg er færdig med gymnasiet",
"subjects": "Matematik B, Matematik C, Matematik Folkeskole, ",
"tutor_address": "Frederikssundsvej 54B, 2. th.",
"tutor_amount_of_students": 0,
"tutor_gym": "Frederiksberg Gymnasium",
"tutor_qualification": "Not set by tutor",
"tutor_uni": "Økonomi"
},
As you can see, I only specified Excel, but the expected result includes a tutor who can teach in Excel and other programs. So I think I need to check whether the tutor's programs list "contains" the program specified in the API call.
For the long term, you'll probably want to use a SQL database with a table for the tutors and a table with the different programs with a many-to-many relationship between them.
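As a rough illustration of that long-term layout, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (tutors, programs, tutor_programs) and the sample rows are assumptions made up for this example; they are not part of the original data model.

```python
import sqlite3

# In-memory database; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tutors (id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE programs (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    -- Link table: one row per (tutor, program) pair
    CREATE TABLE tutor_programs (
        tutor_id INTEGER REFERENCES tutors(id),
        program_id INTEGER REFERENCES programs(id),
        PRIMARY KEY (tutor_id, program_id)
    );
""")
conn.executemany("INSERT INTO tutors VALUES (?, ?)",
                 [(1, "Ahmed"), (2, "Katrine")])
conn.executemany("INSERT INTO programs VALUES (?, ?)",
                 [(1, "Excel"), (2, "Geogebra")])
conn.executemany("INSERT INTO tutor_programs VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 2)])


def tutors_teaching(program_name):
    """Return first names of tutors whose programs include program_name."""
    rows = conn.execute(
        """
        SELECT t.first_name
        FROM tutors t
        JOIN tutor_programs tp ON tp.tutor_id = t.id
        JOIN programs p ON p.id = tp.program_id
        WHERE p.name = ?
        ORDER BY t.first_name
        """,
        (program_name,),
    ).fetchall()
    return [row[0] for row in rows]


print(tutors_teaching("Excel"))
print(tutors_teaching("Geogebra"))
```

With this layout, "teaches Excel, possibly among other programs" becomes a plain join rather than a list comparison.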
For now, we can create some helper functions. It's not strictly necessary, but it will make the code easier to read and maintain.
def matches_filter(filter_value, tutor_value):
    if isinstance(tutor_value, list):
        # Treat list fields as a set of values to match against
        # instead of checking for equality.
        # Note: if you have to do this often, consider using sets instead of lists to store these values.
        if not isinstance(filter_value, (list, tuple, set)):
            # A single value such as "Excel" is treated as a one-item list
            filter_value = [filter_value]
        return set(filter_value).issubset(set(tutor_value))
    return filter_value == tutor_value

def matches_all_filters(filter_dict, tutor):
    return all(filter_key in tutor and matches_filter(filter_value, tutor[filter_key])
               for filter_key, filter_value in filter_dict.items())

def get_matching_tutors(filters):
    matching_tutors = [tutor for tutor in tutor_list
                       if matches_all_filters(filters, tutor)]
    return jsonify(matching_tutors)
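To see the subset-matching idea in isolation, here is a small standalone sketch that leaves Flask out. The three sample tutors are made up and trimmed to the keys that matter, and the helpers are re-declared so the snippet runs on its own.

```python
def matches_filter(filter_value, tutor_value):
    """List-valued tutor fields match when they contain every requested value."""
    if isinstance(tutor_value, list):
        wanted = filter_value if isinstance(filter_value, list) else [filter_value]
        return set(wanted).issubset(tutor_value)
    return filter_value == tutor_value


def matches_all_filters(filters, tutor):
    return all(key in tutor and matches_filter(value, tutor[key])
               for key, value in filters.items())


# Made-up sample data, trimmed to the keys used by the filter.
sample_tutors = [
    {"id": 1, "gym_type": "STX", "programs": ["TI-Nspire", "Excel", "STATA"]},
    {"id": 2, "gym_type": "STX", "programs": ["Excel"]},
    {"id": 3, "gym_type": "HTX", "programs": "Not set by tutor"},
]

filters = {"gym_type": "STX", "programs": ["Excel"]}
matching_ids = [t["id"] for t in sample_tutors if matches_all_filters(filters, t)]
print(matching_ids)
```

Tutor 1 matches even though Excel is only one of several programs, which is the behaviour the question asks for. Tutor 3 is rejected on gym_type, and its string-valued programs field would also fail the equality check against a list.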
This is my first time dealing with JSON data, so I'm not that familiar with its structure.
I got some data from the "We the People" e-petition site with the following code:
url = "https://api.whitehouse.gov/v1/petitions.json?limit=3&offset=0&createdBefore=1573862400"
jdata_2 = requests.get(url).json()
However, I got an error when I tried to load it with pandas so that I could convert it into an Excel file:
df = pandas.read_json(jdata_2)
Obviously, I am missing a step that has to happen before calling pandas.read_json().
I have searched for an answer, but most questions are "How can I convert JSON data into Excel data", which assume you already have JSON data. In my case I fetched it from the URL, so I thought I could turn the response into JSON data and then convert that into Excel data. I also tried json.dump(), but that didn't work either.
I know this is a naive question, but I'm not sure where to start. If anyone can show me how to deal with it, or link me to some references I can study, I would really appreciate it.
Thank you for your help in advance.
This is the JSON data from the request, pretty-printed with pprint and indent=4.
Input:
url = "https://api.whitehouse.gov/v1/petitions.json?limit=3&offset=0&createdBefore=1573862400"
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(jdata_2)
Output :
{ 'metadata': { 'requestInfo': { 'apiVersion': 1,
'query': { 'body': None,
'createdAfter': None,
'createdAt': None,
'createdBefore': '1573862400',
'isPublic': 1,
'isSignable': None,
'limit': '3',
'mock': 0,
'offset': '0',
'petitionsDefaultLimit': '1000',
'publicThreshold': 149,
'responseId': None,
'signatureCount': None,
'signatureCountCeiling': None,
'signatureCountFloor': 0,
'signatureThreshold': None,
'signatureThresholdCeiling': None,
'signatureThresholdFloor': None,
'sortBy': 'DATE_REACHED_PUBLIC',
'sortOrder': 'ASC',
'status': None,
'title': None,
'url': None,
'websiteUrl': 'https://petitions.whitehouse.gov'},
'resource': 'petitions'},
'responseInfo': { 'developerMessage': 'OK',
'errorCode': '',
'moreInfo': '',
'status': 200,
'userMessage': ''},
'resultset': {'count': 1852, 'limit': 3, 'offset': 0}},
'results': [ { 'body': 'Please save kurdish people in syria \r\n'
'pleaee save north syria',
'created': 1570630389,
'deadline': 1573225989,
'id': '2798897',
'isPublic': True,
'isSignable': False,
'issues': [ { 'id': 326,
'name': 'Homeland Security & '
'Defense'}],
'petition_type': [ { 'id': 291,
'name': 'Call on Congress to '
'act on an issue'}],
'reachedPublic': 0,
'response': [],
'signatureCount': 149,
'signatureThreshold': 100000,
'signaturesNeeded': 99851,
'status': 'closed',
'title': 'Please save rojava north syria\r\n'
'please save kurdish people\r\n'
'please stop erdogan\r\n'
'plaease please',
'type': 'petition',
'url': 'https://petitions.whitehouse.gov/petition/please-save-rojava-north-syria-please-save-kurdish-people-please-stop-erdogan-plaease-please'},
{ 'body': 'Kane Friess was a 2 year old boy who was '
"murdered by his mom's boyfriend, Gyasi "
'Campbell. Even with expert statements from '
'forensic anthropologists, stating his injuries '
'wete the result of homicide. Mr. Campbell was '
'found guilty of involuntary manslaughter. This '
"is an outrage to Kane's Family and our "
'community.',
'created': 1566053365,
'deadline': 1568645365,
'id': '2782248',
'isPublic': True,
'isSignable': False,
'issues': [ { 'id': 321,
'name': 'Criminal Justice Reform'}],
'petition_type': [ { 'id': 281,
'name': 'Change an existing '
'Administration '
'policy'}],
'reachedPublic': 0,
'response': [],
'signatureCount': 149,
'signatureThreshold': 100000,
'signaturesNeeded': 99851,
'status': 'closed',
'title': "Kane's Law. Upon which the murder of a child, "
'regardless of circumstances, be seen as 1st '
'degree murder. A Federal Law.',
'type': 'petition',
'url': 'https://petitions.whitehouse.gov/petition/kanes-law-upon-which-murder-child-regardless-circumstances-be-seen-1st-degree-murder-federal-law'},
{ 'body': "Schumer and Pelosi's hatred and refusing to "
'work with President Donald J. Trump is holding '
'America hostage. We the people know securing '
'our southern border is a priority which will '
'not happen with these two in office. Lets '
'build the wall NOW!',
'created': 1547050064,
'deadline': 1549642064,
'id': '2722358',
'isPublic': True,
'isSignable': False,
'issues': [ {'id': 306, 'name': 'Budget & Taxes'},
{ 'id': 326,
'name': 'Homeland Security & '
'Defense'},
{'id': 29, 'name': 'Immigration'}],
'petition_type': [ { 'id': 291,
'name': 'Call on Congress to '
'act on an issue'}],
'reachedPublic': 0,
'response': [],
'signatureCount': 149,
'signatureThreshold': 100000,
'signaturesNeeded': 99851,
'status': 'closed',
'title': 'Remove Chuck Schumer and Nancy Pelosi from '
'office',
'type': 'petition',
'url': 'https://petitions.whitehouse.gov/petition/remove-chuck-schumer-and-nancy-pelosi-office'}]}
And this is the Error message I got
Input :
df = pandas.read_json(jdata_2)
Output :
ValueError: Invalid file path or buffer object type: <class 'dict'>
You can try the code below as well; it works fine. pandas.read_json() expects a JSON string or a file path, not an already-parsed dict, so build the DataFrame from the list under the "results" key instead:
import json
import pandas as pd
import requests

URL = "https://api.whitehouse.gov/v1/petitions.json?limit=3&offset=0&createdBefore=1573862400"
# Fetch the JSON response from the URL
req = requests.get(URL)
text_data = req.text
json_dict = json.loads(text_data)
# Convert the list of records under "results" into a DataFrame
df = pd.DataFrame.from_dict(json_dict["results"])
# Finally, save the DataFrame in Excel format, i.e. xlsx
df.to_excel("output.xlsx")
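If it helps to see the shape of the data without pandas, the sketch below does the same "take the list under results and write it as rows" step with only the standard library. Note the swap: it writes CSV rather than xlsx, the response dict is a trimmed, made-up stand-in for the real one, and flatten is a hypothetical helper introduced just for this example.

```python
import csv
import io

# A trimmed, made-up stand-in for the parsed API response (the dict from .json()).
response = {
    "metadata": {"resultset": {"count": 2, "limit": 2, "offset": 0}},
    "results": [
        {"id": "1", "title": "First petition", "signatureCount": 149,
         "issues": [{"id": 326, "name": "Homeland Security & Defense"}]},
        {"id": "2", "title": "Second petition", "signatureCount": 150,
         "issues": [{"id": 306, "name": "Budget & Taxes"},
                    {"id": 29, "name": "Immigration"}]},
    ],
}


def flatten(record):
    """Copy a record, replacing the nested 'issues' list with a plain string."""
    row = dict(record)
    row["issues"] = "; ".join(issue["name"] for issue in record.get("issues", []))
    return row


rows = [flatten(r) for r in response["results"]]

# Write to an in-memory buffer; pass an open file instead to save to disk.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

The nested lists (issues, petition_type) are the part that does not map cleanly onto a flat sheet, so some flattening step like this is usually needed whichever output format is used.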
This is my first question :)
I loop over a nested dictionary to print specific values, using the following code:
for i in lizzo_top_tracks['tracks']:
    print('Track Name: ' + i['name'])
It works for the track name, but not for other values. For example, when I use the following code for the release date:
for i in lizzo_top_tracks['tracks']:
    print('Album Release Date: ' + i['release_date'])
I get this error: KeyError: 'release_date'
What should I do?
Here is a sample of my nested dictionary:
{'tracks': [{'album': {'album_type': 'album',
'artists': [{'external_urls': {'spotify': 'https://open.spotify.com/artist/56oDRnqbIiwx4mymNEv7dS'},
'href': 'https://api.spotify.com/v1/artists/56oDRnqbIiwx4mymNEv7dS',
'id': '56oDRnqbIiwx4mymNEv7dS',
'name': 'Lizzo',
'type': 'artist',
'uri': 'spotify:artist:56oDRnqbIiwx4mymNEv7dS'}],
'external_urls': {'spotify': 'https://open.spotify.com/album/74gSdSHe71q7urGWMMn3qB'},
'href': 'https://api.spotify.com/v1/albums/74gSdSHe71q7urGWMMn3qB',
'id': '74gSdSHe71q7urGWMMn3qB',
'images': [{'height': 640,
'width': 640}],
'name': 'Cuz I Love You (Deluxe)',
'release_date': '2019-05-03',
'release_date_precision': 'day',
'total_tracks': 14,
'type': 'album',
'uri': 'spotify:album:74gSdSHe71q7urGWMMn3qB'}]}
The code you posted isn't syntactically correct; running it through a Python interpreter gives a syntax error on the last line. It looks like you lost a curly brace somewhere toward the end. :)
I went through it and fixed up the white space to make the structure easier to see; the way you had it formatted made it hard to see which keys were at which level of nesting, but with consistent indentation it becomes much clearer:
lizzo_top_tracks = {
'tracks': [{
'album': {
'album_type': 'album',
'artists': [{
'external_urls': {
'spotify': 'https://open.spotify.com/artist/56oDRnqbIiwx4mymNEv7dS'
},
'href': 'https://api.spotify.com/v1/artists/56oDRnqbIiwx4mymNEv7dS',
'id': '56oDRnqbIiwx4mymNEv7dS',
'name': 'Lizzo',
'type': 'artist',
'uri': 'spotify:artist:56oDRnqbIiwx4mymNEv7dS'
}],
'external_urls': {
'spotify': 'https://open.spotify.com/album/74gSdSHe71q7urGWMMn3qB'
},
'href': 'https://api.spotify.com/v1/albums/74gSdSHe71q7urGWMMn3qB',
'id': '74gSdSHe71q7urGWMMn3qB',
'images': [{'height': 640, 'width': 640}],
'name': 'Cuz I Love You (Deluxe)',
'release_date': '2019-05-03',
'release_date_precision': 'day',
'total_tracks': 14,
'type': 'album',
'uri': 'spotify:album:74gSdSHe71q7urGWMMn3qB'
}
}]
}
So the first (and only) value you get for i in lizzo_top_tracks['tracks'] is going to be this dictionary:
i = {
'album': {
'album_type': 'album',
'artists': [{
'external_urls': {
'spotify': 'https://open.spotify.com/artist/56oDRnqbIiwx4mymNEv7dS'
},
'href': 'https://api.spotify.com/v1/artists/56oDRnqbIiwx4mymNEv7dS',
'id': '56oDRnqbIiwx4mymNEv7dS',
'name': 'Lizzo',
'type': 'artist',
'uri': 'spotify:artist:56oDRnqbIiwx4mymNEv7dS'
}],
'external_urls': {
'spotify': 'https://open.spotify.com/album/74gSdSHe71q7urGWMMn3qB'
},
'href': 'https://api.spotify.com/v1/albums/74gSdSHe71q7urGWMMn3qB',
'id': '74gSdSHe71q7urGWMMn3qB',
'images': [{'height': 640, 'width': 640}],
'name': 'Cuz I Love You (Deluxe)',
'release_date': '2019-05-03',
'release_date_precision': 'day',
'total_tracks': 14,
'type': 'album',
'uri': 'spotify:album:74gSdSHe71q7urGWMMn3qB'
}
}
The only key in this dictionary is 'album', the value of which is another dictionary that contains all the other information. If you want to print, say, the album release date and a list of the artists' names, you'd do:
for track in lizzo_top_tracks['tracks']:
    print('Album Release Date: ' + track['album']['release_date'])
    print('Artists: ' + str([artist['name'] for artist in track['album']['artists']]))
If these are dictionaries that you're building yourself, you might want to remove some of the nesting layers where there's only a single key, since they just make it harder to navigate the structure without giving you any additional information. For example:
lizzo_top_albums = [{
'album_type': 'album',
'artists': [{
'external_urls': {
'spotify': 'https://open.spotify.com/artist/56oDRnqbIiwx4mymNEv7dS'
},
'href': 'https://api.spotify.com/v1/artists/56oDRnqbIiwx4mymNEv7dS',
'id': '56oDRnqbIiwx4mymNEv7dS',
'name': 'Lizzo',
'type': 'artist',
'uri': 'spotify:artist:56oDRnqbIiwx4mymNEv7dS'
}],
'external_urls': {
'spotify': 'https://open.spotify.com/album/74gSdSHe71q7urGWMMn3qB'
},
'href': 'https://api.spotify.com/v1/albums/74gSdSHe71q7urGWMMn3qB',
'id': '74gSdSHe71q7urGWMMn3qB',
'images': [{'height': 640, 'width': 640}],
'name': 'Cuz I Love You (Deluxe)',
'release_date': '2019-05-03',
'release_date_precision': 'day',
'total_tracks': 14,
'type': 'album',
'uri': 'spotify:album:74gSdSHe71q7urGWMMn3qB'
}]
This structure allows you to write the query the way you were originally trying to do it:
for album in lizzo_top_albums:
    print('Album Release Date: ' + album['release_date'])
    print('Artists: ' + str([artist['name'] for artist in album['artists']]))
Much simpler, right? :)
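When the structure comes from an API and you can't reshape it, a small lookup helper can make the nesting less error-prone. The sketch below is only one possible approach: dig is a hypothetical helper name, and the dictionary is a trimmed stand-in for the data in the question.

```python
def dig(data, *path, default=None):
    """Follow keys/indexes in path through nested dicts and lists."""
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            # Missing key, index out of range, or a value that can't be indexed
            return default
    return current


# Trimmed stand-in for the structure discussed above.
lizzo_top_tracks = {
    "tracks": [
        {"album": {"name": "Cuz I Love You (Deluxe)",
                   "release_date": "2019-05-03",
                   "artists": [{"name": "Lizzo"}]}}
    ]
}

for track in lizzo_top_tracks["tracks"]:
    print(dig(track, "album", "release_date"))
    print(dig(track, "release_date", default="n/a"))  # wrong level: no KeyError
    print(dig(track, "album", "artists", 0, "name"))
```

Returning a default instead of raising hides typos in key names, so this suits optional fields better than fields that must always be present.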
I've been struggling for about 3 weeks on this simple issue. I can't understand why and I would give anything to solve it lol.
I am trying to read values from the data structure below. The docs say it's a dictionary with keys containing lists of results of that type.
Example: I get the master query reply using an eval block. I look up the key "song_hits" to get that structure, then look up the key 'track' and parse it. The problem is getting to the 'track' part.
When I do it the way the Perl docs describe, I get Can't locate object method "FIRSTKEY" via package "Inline::Python::Object::Data".
So I'm wondering whether there's a way to read the value using eval to bypass Object::Data's hash key limitation, another way to read it given that I know the exact keys, or whether I'm just doing this entirely wrong.
{
'album_hits': [
{
'album':
{
'albumArtRef': 'http://lh5.ggpht.com/DVIg4GiD6msHfgPs_Vu_2eRxCyAoz0fF...',
'albumArtist': 'J.Cole',
'albumId': 'Bfp2tuhynyqppnp6zennhmf6w3y',
'artist': 'J.Cole',
'artistId': ['Ajgnxme45wcqqv44vykrleifpji'],
'description_attribution':
{
'kind': 'sj#attribution',
'license_title': 'Creative Commons Attribution CC-BY',
'license_url': 'http://creativecommons.org/licenses/by/4.0/legalcode',
'source_title': 'Freebase',
'source_url': ''
},
'explicitType': '1',
'kind': 'sj#album',
'name': 'Work Out',
'year': 2011
},
'type': '3'
}],
'artist_hits': [
{
'artist':
{
'artistArtRef': 'http://lh3.googleusercontent.com/MJe-cDw9uQ-pUagoLlm...',
'artistArtRefs': [
{
'aspectRatio': '2',
'autogen': False,
'kind': 'sj#imageRef',
'url': 'http://lh3.googleusercontent.com/MJe-cDw9uQ-pUagoLlmKX3x_K...'
}],
'artistId': 'Ajgnxme45wcqqv44vykrleifpji',
'artist_bio_attribution':
{
'kind': 'sj#attribution',
'source_title': 'David Jeffries, Rovi'
},
'kind': 'sj#artist',
'name': 'J. Cole'
},
'type': '2'
}],
'playlist_hits': [
{
'playlist':
{
'albumArtRef': [
{
'url': 'http://lh3.googleusercontent.com/KJsAhrg8Jk_5A4xYLA68LFC...'
}],
'description': 'Workout Plan ',
'kind': 'sj#playlist',
'name': 'Workout',
'ownerName': 'Ida Sarver',
'shareToken': 'AMaBXyktyF6Yy_G-8wQy8Rru0tkueIbIFblt2h0BpkvTzHDz-fFj6P...',
'type': 'SHARED'
},
'type': '4'
}],
'situation_hits': [
{
'situation':
{
'description': 'Level up and enter beast mode with some loud, aggressive music.',
'id': 'Nrklpcyfewwrmodvtds5qlfp5ve',
'imageUrl': 'http://lh3.googleusercontent.com/Cd8WRMaG_pDwjTC_dSPIIuf...',
'title': 'Entering Beast Mode',
'wideImageUrl': 'http://lh3.googleusercontent.com/8A9S-nTb5pfJLcpS8P...'
},
'type': '7'
}],
'song_hits': [
{
'track':
{
'album': 'Work Out',
'albumArtRef': [
{
'aspectRatio': '1',
'autogen': False,
'kind': 'sj#imageRef',
'url': 'http://lh5.ggpht.com/DVIg4GiD6msHfgPs_Vu_2eRxCyAoz0fFdxj5w...'
}],
'albumArtist': 'J.Cole',
'albumAvailableForPurchase': True,
'albumId': 'Bfp2tuhynyqppnp6zennhmf6w3y',
'artist': 'J Cole',
'artistId': ['Ajgnxme45wcqqv44vykrleifpji', 'Ampniqsqcwxk7btbgh5ycujij5i'],
'composer': '',
'discNumber': 1,
'durationMillis': '234000',
'estimatedSize': '9368582',
'explicitType': '1',
'genre': 'Pop',
'kind': 'sj#track',
'nid': 'Tq3nsmzeumhilpegkimjcnbr6aq',
'primaryVideo':
{
'id': '6PN78PS_QsM',
'kind': 'sj#video',
'thumbnails': [
{
'height': 180,
'url': 'https://i.ytimg.com/vi/6PN78PS_QsM/mqdefault.jpg',
'width': 320
}]
},
'storeId': 'Tq3nsmzeumhilpegkimjcnbr6aq',
'title': 'Work Out',
'trackAvailableForPurchase': True,
'trackAvailableForSubscription': True,
'trackNumber': 1,
'trackType': '7',
'year': 2011
},
'type': '1'
}],
'station_hits': [
{
'station':
{
'compositeArtRefs': [
{
'aspectRatio': '1',
'kind': 'sj#imageRef',
'url': 'http://lh3.googleusercontent.com/3aD9mFppy6PwjADnjwv_w...'
}],
'contentTypes': ['1'],
'description': 'These riff-tastic metal tracks are perfect for getting the blood pumping.',
'imageUrls': [
{
'aspectRatio': '1',
'autogen': False,
'kind': 'sj#imageRef',
'url': 'http://lh5.ggpht.com/YNGkFdrtk43e8H941fuAHjflrNZ1CJUeqdoys...'
}],
'kind': 'sj#radioStation',
'name': 'Heavy Metal Workout',
'seed':
{
'curatedStationId': 'Lcwg73w3bd64hsrgarnorif52r',
'kind': 'sj#radioSeed',
'seedType': '9'
},
'skipEventHistory': [],
'stationSeeds': [
{
'curatedStationId': 'Lcwg73w3bd64hsrgarnorif52r',
'kind': 'sj#radioSeed',
'seedType': '9'
}]
},
'type': '6'
}],
'video_hits': [
{
'score': 629.6226806640625,
'type': '8',
'youtube_video':
{
'id': '6PN78PS_QsM',
'kind': 'sj#video',
'thumbnails': [
{
'height': 180,
'url': 'https://i.ytimg.com/vi/6PN78PS_QsM/mqdefault.jpg',
'width': 320
}],
'title': 'J. Cole - Work Out'
}
}]
}
Cleaned, but broken code with 3 weeks of different attempts: (I have tried for, foreach, while, but the furthest it would read would be either the entire unicode array, error, or an empty string)
sub search {
    my $query = shift;
    my $uri = 'googlemusic:search:' . $query;
    if (my $result = $cache->get($uri)) {
        return $result;
    }
    my $googleResult;
    my $result = {
        tracks => [],
        albums => [],
        artists => [],
    };
    eval {
        $googleResult = $googleapi->search($query, $prefs->get('max_search_items'));
    };
    if ($@) {
        $log->error("Not able to search All Access for \"$query\": $@");
        return;
    }
    # gives a "Not an ARRAY reference" error
    for my $hit (@{$googleResult->{song_hits}}) {
        push @{$result->{tracks}}, to_slim_track($hit->{track});
    }
    # works, but gives an error on the next line: 'newlist' object has no attribute 'album'
    for my $hit ({$googleResult->{album_hits}}) {
        push @{$result->{albums}}, album_to_slim_album($hit->{album});
    }
    # The way recommended by the Perl docs and others, but gives: Can't locate object method "FIRSTKEY" via package "Inline::Python::Object::Data"
    for my $hit (%{$googleResult->{artist_hits}}) {
        push @{$result->{artists}}, artist_to_slim_artist($hit->{artist});
    }
    # Add to the cache
    $cache->set($uri, $result, $CACHE_TIME);
    return $result;
}
I have tried reading up, but have gotten so many errors including:
'key' does not exist
Can't use string ("track") as a HASH ref while strict refs in use
Type of argument to keys on reference must be unblessed hashref or arrayref
My Full Test File: http://pastebin.com/DMnDc56i
GoogleApi PM (Python GAPI Hook): https://raw.githubusercontent.com/hechtus/squeezebox-googlemusic/master/GoogleMusic/GoogleAPI.pm
Edit: Info, There were a couple of people who wanted unmaintained old code fixed, so I offered to help and got everything working besides this part.
Old Code Git: https://github.com/hechtus/squeezebox-googlemusic
Google Api Python I use: https://github.com/simon-weber/gmusicapi
I take it that the data structure shown is in $googleResult. This is 'almost' JSON and you can process it as such using modules, after a simple cleanup. I will use JSON::XS. The code below takes off after $googleResult has been acquired. (In tests I actually copied data shown in the question into a file and read it in.) I first replace ' by " and lower-case True and False, to get a valid JSON format which the module can decode.
# Other code from the question ...
use JSON::XS;
# For tests I loaded shown data into $googleResult (did not run this eval)
eval {
    $googleResult = $googleapi->search($query, $prefs->get('max_search_items'));
};
if ($@) {
    $log->error("Not able to search All Access for \"$query\": $@");
    return;
}
# The structure shown in the question needs a cleanup
# But this may be a road to madness, if there is more
$googleResult =~ s/'/"/g; # ' turn off wrong editor coloring
$googleResult =~ s/False/false/g;
$googleResult =~ s/True/true/g;
my $coder = JSON::XS->new;
# There are many options for how to set it up. Example:
# JSON::XS->new->ascii->pretty->allow_nonref;
my $data = $coder->decode($googleResult);
# Now this is a normal Perl data structure that we can work with.
# Look at what's under 'album_hits' for example
my $ralbhits = $data->{'album_hits'};
print Dumper($ralbhits);
# We get: VAR1 = [ { 'album' => { albumID => ... } } ]
# Array reference, with nested hash references as the sole element
# Extract the 'artist'
my $artist = $ralbhits->[0]->{'album'}->{'artist'};
print "$artist\n";
This prints J.Cole (after the dump, which I omit here). For convenience, you can first extract a part of the structure and then query it far more simply. For example:
# Get the hashref for album
my $ralbum = $ralbhits->[0]->{'album'};
my $artist = $ralbum->{'artist'};
Now once the data is unpacked you can retrieve what you need, based on what artist_to_slim_artist() needs and does. This is a normal data structure to work with.
Modules for JSON parsing return Perl data structures, see Mapping in JSON::XS. Generally they will be nested, except in very simple cases. For how to work with them see perldsc, a cookbook on complex data structures.
The JSON object given in this example, while invalid, needed very little correction. However, it may get far more complicated. For example, there is a far larger document (~100kB) linked to in a comment, with these problems.
Name-value pairs are enclosed in ' instead of " and the values themselves contain ' (like isn't and other contractions), complicating the matching of ' pairs.
Invalid u' sequences at the beginning of names and values (the u needs to be removed). This can be rolled together with the above, as they come together. There was also u".
Text may contain all kinds of escapes, for example some encoding of accents, which are not valid JSON. (One in that document.) This can be found and fixed (escaped for example).
It took a few minutes to come up with a few regex that corrected the document at the link, at close to 100kB in size, so that it parses cleanly with the above code. But the problem is that it is hard to tell what other trouble may be in the next document. Still, since this may be of interest here is the regex.
Instead of being enclosed in a pair of ", the names and values are in between ', and the leading one also has an extra char, u'. What makes it easier is that the closing ' must be followed by either of , : ] } and I use positive lookahead to assert that. Finally, there are some u" opening quotes and u is removed, first.
$googleResult =~ s/False/false/g;
$googleResult =~ s/True/true/g;
$googleResult =~ s/u"/"/g;
# There are also escaped characters in text, escape that backslash
$googleResult =~ s|(\\)|$1$1|g;
# Correct delimiters from u'...' to "...", see text below
$googleResult =~ s/u'(.*?)' (?= []:},] )/"$1"/gx;
# We are good now, decode it
my $data = $coder->decode($googleResult);
my $alb = $data->[0]{track}{album};
print "$alb\n";
This prints These Things Happen (correctly). Above we capture between u' and the first ' that is followed by either of ]:,} (for which a character class [...] is used). Then u'' is replaced by "". After this decode($googleResult); works and we get the Perl data structure to query.
There are various modules that allow a 'relaxed' approach and will accept many such irregularities. However, by using them we agree to work with invalid JSON, a format that is meant to be simple and clear, and I wouldn't advise going down that road. Note that the nearly full specification of the format fits nicely in one clear and generously illustrated page at the above link. Also see JSON Example, for a handful of examples.
I think that the best bet is to try to clean it up. Run a decoder like in the code above and see the error message. It will pinpoint the problem exactly. Then add a regex to correct that particular violation of format. Then go again. If the various documents you may work with carry more or less the same set of problems (like the ones above for example) it may well work. Or it may turn out that it is too much trouble, if new violations keep coming up, in which case you may need a different approach.
Finally, I don't know how you arrived at this format from the original Python-object problem. Could it be that the format got broken somewhere in translation? I don't see how that would be the case. Is it actually not meant to be JSON? However, it is too close to it for that.
Is it possible to ask for valid JSON to be provided?
OK, this isn't really an answer, but out of the goodness of my heart I cleaned up your data for you. Here is a real Python dict. I don't know if some of the numerical string values should be ints or not, so I didn't mess with them. It'll be up to you to figure out what to do with the truncated URLs.
Another way to go about this would be to change True to true, False to false, and parse the dict as JSON.
{
'album_hits': [
{
'album':
{
'albumArtRef': 'http://lh5.ggpht.com/DVIg4GiD6msHfgPs_Vu_2eRxCyAoz0fF...',
'albumArtist': 'J.Cole',
'albumId': 'Bfp2tuhynyqppnp6zennhmf6w3y',
'artist': 'J.Cole',
'artistId': ['Ajgnxme45wcqqv44vykrleifpji'],
'description_attribution':
{
'kind': 'sj#attribution',
'license_title': 'Creative Commons Attribution CC-BY',
'license_url': 'http://creativecommons.org/licenses/by/4.0/legalcode',
'source_title': 'Freebase',
'source_url': ''
},
'explicitType': '1',
'kind': 'sj#album',
'name': 'Work Out',
'year': 2011
},
'type': '3'
}],
'artist_hits': [
{
'artist':
{
'artistArtRef': 'http://lh3.googleusercontent.com/MJe-cDw9uQ-pUagoLlm...',
'artistArtRefs': [
{
'aspectRatio': '2',
'autogen': False,
'kind': 'sj#imageRef',
'url': 'http://lh3.googleusercontent.com/MJe-cDw9uQ-pUagoLlmKX3x_K...'
}],
'artistId': 'Ajgnxme45wcqqv44vykrleifpji',
'artist_bio_attribution':
{
'kind': 'sj#attribution',
'source_title': 'David Jeffries, Rovi'
},
'kind': 'sj#artist',
'name': 'J. Cole'
},
'type': '2'
}],
'playlist_hits': [
{
'playlist':
{
'albumArtRef': [
{
'url': 'http://lh3.googleusercontent.com/KJsAhrg8Jk_5A4xYLA68LFC...'
}],
'description': 'Workout Plan ',
'kind': 'sj#playlist',
'name': 'Workout',
'ownerName': 'Ida Sarver',
'shareToken': 'AMaBXyktyF6Yy_G-8wQy8Rru0tkueIbIFblt2h0BpkvTzHDz-fFj6P...',
'type': 'SHARED'
},
'type': '4'
}],
'situation_hits': [
{
'situation':
{
'description': 'Level up and enter beast mode with some loud, aggressive music.',
'id': 'Nrklpcyfewwrmodvtds5qlfp5ve',
'imageUrl': 'http://lh3.googleusercontent.com/Cd8WRMaG_pDwjTC_dSPIIuf...',
'title': 'Entering Beast Mode',
'wideImageUrl': 'http://lh3.googleusercontent.com/8A9S-nTb5pfJLcpS8P...'
},
'type': '7'
}],
'song_hits': [
{
'track':
{
'album': 'Work Out',
'albumArtRef': [
{
'aspectRatio': '1',
'autogen': False,
'kind': 'sj#imageRef',
'url': 'http://lh5.ggpht.com/DVIg4GiD6msHfgPs_Vu_2eRxCyAoz0fFdxj5w...'
}],
'albumArtist': 'J.Cole',
'albumAvailableForPurchase': True,
'albumId': 'Bfp2tuhynyqppnp6zennhmf6w3y',
'artist': 'J Cole',
'artistId': ['Ajgnxme45wcqqv44vykrleifpji', 'Ampniqsqcwxk7btbgh5ycujij5i'],
'composer': '',
'discNumber': 1,
'durationMillis': '234000',
'estimatedSize': '9368582',
'explicitType': '1',
'genre': 'Pop',
'kind': 'sj#track',
'nid': 'Tq3nsmzeumhilpegkimjcnbr6aq',
'primaryVideo':
{
'id': '6PN78PS_QsM',
'kind': 'sj#video',
'thumbnails': [
{
'height': 180,
'url': 'https://i.ytimg.com/vi/6PN78PS_QsM/mqdefault.jpg',
'width': 320
}]
},
'storeId': 'Tq3nsmzeumhilpegkimjcnbr6aq',
'title': 'Work Out',
'trackAvailableForPurchase': True,
'trackAvailableForSubscription': True,
'trackNumber': 1,
'trackType': '7',
'year': 2011
},
'type': '1'
}],
'station_hits': [
{
'station':
{
'compositeArtRefs': [
{
'aspectRatio': '1',
'kind': 'sj#imageRef',
'url': 'http://lh3.googleusercontent.com/3aD9mFppy6PwjADnjwv_w...'
}],
'contentTypes': ['1'],
'description': 'These riff-tastic metal tracks are perfect for getting the blood pumping.',
'imageUrls': [
{
'aspectRatio': '1',
'autogen': False,
'kind': 'sj#imageRef',
'url': 'http://lh5.ggpht.com/YNGkFdrtk43e8H941fuAHjflrNZ1CJUeqdoys...'
}],
'kind': 'sj#radioStation',
'name': 'Heavy Metal Workout',
'seed':
{
'curatedStationId': 'Lcwg73w3bd64hsrgarnorif52r',
'kind': 'sj#radioSeed',
'seedType': '9'
},
'skipEventHistory': [],
'stationSeeds': [
{
'curatedStationId': 'Lcwg73w3bd64hsrgarnorif52r',
'kind': 'sj#radioSeed',
'seedType': '9'
}]
},
'type': '6'
}],
'video_hits': [
{
'score': 629.6226806640625,
'type': '8',
'youtube_video':
{
'id': '6PN78PS_QsM',
'kind': 'sj#video',
'thumbnails': [
{
'height': 180,
'url': 'https://i.ytimg.com/vi/6PN78PS_QsM/mqdefault.jpg',
'width': 320
}],
'title': 'J. Cole - Work Out'
}
}]
}
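If the text really is the printed form of a Python dict, as the True/False values and u'...' prefixes suggest, one option is to let Python parse it and emit real JSON, rather than patching quotes with regexes. This is a sketch under that assumption; python_text is a short made-up stand-in for the dump, and the step would have to run on the Python side (or in a small helper script) before the data reaches Perl.

```python
import ast
import json

# A short stand-in for the text dump; note the Python-only u'' prefix,
# single quotes, True/False, and an apostrophe inside a value.
python_text = """{
    u'song_hits': [{
        u'track': {u'title': u"Isn't It", u'year': 2011,
                   u'trackAvailableForPurchase': True},
        u'type': u'1'
    }]
}"""

data = ast.literal_eval(python_text)   # parses Python literals only, runs no code
json_text = json.dumps(data)           # valid JSON that any JSON parser can read

print(json_text)
```

The resulting string can then be handed to JSON::XS as in the answer above, with no quote or True/False substitutions needed.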
I've worked out a solution using a list comprehension like so:
use Inline::Python qw(py_eval);
my $song_hits = py_eval("[x for x in $googleResult->{song_hits}]", 0);
for my $hit (@$song_hits) {
    push @{$result->{tracks}}, to_slim_track($hit->{track});
}
The commit is at:
https://github.com/squeezebox-googlemusic/squeezebox-googlemusic/commit/e6fa62d9da3bc7295023283ef5d25698737e5772
I am writing a small function to catch a 404 when I request information from an API.
Code
def film_api():
    number = random.randint(1, 10000)
    film = requests.get('https://api.themoviedb.org/3/movie/{}?api_key=################'.format(number))
    while film.status_code == '404':
        film = requests.get('https://api.themoviedb.org/3/movie/{}?api_key=################'.format(number))
    else:
        return film.json()
404 Output
{
'status_code': 34,
'status_message': 'The resource you requested could not be found.'
}
correct output
{
'spoken_languages': [{
'name': 'English',
'iso_639_1': 'en'
}],
'genres': [{
'name': 'Comedy',
'id': 35
}, {
'name': 'Drama',
'id': 18
}],
'popularity': 0.493744,
'original_title': 'American Splendor',
'overview': 'An original mix of fiction and reality illuminates the life of comic book hero everyman Harvey Pekar.',
'runtime': 101,
'status': 'Released',
'homepage': 'http://www.newline.com/properties/americansplendor.html',
'video': False,
'revenue': 6003587,
'release_date': '2003-08-15',
'adult': False,
'vote_average': 6.4,
'imdb_id': 'tt0305206',
'poster_path': '/pcZ08ts1HaxWpUMMMQL2z3pomf1.jpg',
'production_companies': [],
'belongs_to_collection': None,
'title': 'American Splendor',
'backdrop_path': '/AswDSBB3rbh2auan9tKjETg09H8.jpg',
'original_language': 'en',
'budget': 0,
'vote_count': 43,
'production_countries': [{
'iso_3166_1': 'US',
'name': 'United States of America'
}],
'tagline': 'Ordinary life is pretty complicated',
'id': 2771
}
I have been bouncing back and forth between docs to find my answer and moved from an except to a while loop. I am using Python, Flask and Requests to build a simple web function so it does not need to be overly complex.
Is there something I am missing specifically?
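requests returns status_code as an integer, not a string, so the comparison with '404' is never true.
Compare against the integer 404 instead. It also helps to pick a new random number on each retry (otherwise the same missing ID is requested again) and to return the JSON once the loop ends:
import random
import requests

def film_api():
    url = 'https://api.themoviedb.org/3/movie/{}?api_key=################'
    film = requests.get(url.format(random.randint(1, 10000)))
    # Retry with a new random ID for as long as the movie is not found
    while film.status_code == 404:
        film = requests.get(url.format(random.randint(1, 10000)))
    return film.json()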
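A retry loop like this can spin for a long time if many IDs are missing, so it may be worth capping the number of attempts. The sketch below shows one way to do that; fetch_random_film, fake_fetch, max_attempts and max_id are hypothetical names, and the HTTP call is replaced by an injected function so the loop can be tried without network access or an API key.

```python
import random


def fetch_random_film(fetch, max_attempts=5, max_id=10000):
    """Try random IDs until fetch() stops reporting 404 or attempts run out.

    fetch is any callable taking a movie ID and returning
    (status_code, payload); it is injected so the loop can be tried offline.
    """
    for _ in range(max_attempts):
        movie_id = random.randint(1, max_id)
        status_code, payload = fetch(movie_id)
        if status_code != 404:          # integer comparison, not '404'
            return payload
    return None                         # every attempt was a 404


# Hypothetical stand-in for the real HTTP call: only even IDs "exist".
def fake_fetch(movie_id):
    if movie_id % 2 == 0:
        return 200, {"id": movie_id, "title": "Film %d" % movie_id}
    return 404, {"status_code": 34}


random.seed(0)
print(fetch_random_film(fake_fetch))
```

In real use, fetch would wrap requests.get and return (response.status_code, response.json()). Note that this sketch treats any non-404 status as success, so other errors (such as an invalid API key) would still need their own handling.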