On the SoundCloud API guide (https://developers.soundcloud.com/docs/api/guide#pagination), the example given for reading more than 100 pieces of data is as follows:
# get first 100 tracks
tracks = client.get('/tracks', order='created_at', limit=page_size)
for track in tracks:
    print track.title

# start paging through results, 100 at a time
tracks = client.get('/tracks', order='created_at', limit=page_size,
                    linked_partitioning=1)
for track in tracks:
    print track.title
I'm pretty certain this is wrong, as I found that 'tracks.collection' needs referencing rather than just 'tracks'. Based on the soundcloud-python wiki on GitHub, it should look more like this:
tracks = client.get('/tracks', order='created_at', limit=10, linked_partitioning=1)
while tracks.collection != None:
    for track in tracks.collection:
        print(track.playback_count)
    tracks = tracks.GetNextPartition()
Here I have removed the indent from the last line (I think there is an error on the wiki: the line sits inside the for loop, which makes no sense to me). This works for the first loop, but it doesn't work for successive pages because the GetNextPartition() function is not found. I've tried the last line as:
tracks = tracks.collection.GetNextPartition()
...but no success.
Maybe I'm getting versions mixed up? I'm trying to run this with Python 3.4 after downloading the version from here: https://github.com/soundcloud/soundcloud-python
Any help much appreciated!
For anyone that cares, I found this solution on the SoundCloud developer forum. It is slightly modified from the original case (searching for tracks) to list my own followers. The trick is to call the client.get function repeatedly, passing the previously returned "users.next_href" as the request that points to the next page of results. Hooray!
pgsize = 200
c = 1
me = client.get('/me')
# first call to get a page of followers
users = client.get('/users/%d/followers' % me.id, limit=pgsize, order='id',
                    linked_partitioning=1)
for user in users.collection:
    print(c, user.username)
    c = c + 1
# linked_partitioning means .next_href exists
while users.next_href != None:
    # pass the contents of users.next_href that contains 'cursor=' to
    # locate the next page of results
    users = client.get(users.next_href, limit=pgsize, order='id',
                        linked_partitioning=1)
    for user in users.collection:
        print(c, user.username)
        c = c + 1
I'm trying to create a Python program that creates Spotify playlists according to the user's "mood": the user chooses a specific mood and the program returns a playlist whose songs (taken from the user's saved tracks) have audio features indicating that they match said mood. However, I'm running into the issue that the playlist is made up of the same song repeated 30 times instead of 30 different songs. I'm fairly new to Python and programming in general, so maybe this isn't a very difficult problem, but I can't see the issue.
Here is the route in my main code that refers to this issue (my full program has more routes, which all work fine so far). All other moods will be defined by other routes following the same logic. I'll provide any extra code that may be necessary to understand the issue better.
@app.route("/playlist_mood_angry")
def mood_playlist_angry():
    # define mood and create empty track list
    selected_mood = "Angry"
    tracks_uris = []  # we need all the track URIs to add to the future playlist
    if 'auth_header' in session:
        auth_header = session['auth_header']
        # get user profile and saved tracks
        user_profile_data = spotify.get_users_profile(auth_header)
        user_id = user_profile_data["id"]
        saved_tracks_data = spotify.get_user_saved_tracks(auth_header)
        playlist_name = 'CSMoodlet: Angry'
        playlist_description = "A playlist for when you're just pissed off and want a soundtrack to go with it. Automatically curated by CSMoodlet."
        # go through the saved tracks dictionary, get the tracks and for each one check if the feature averages match the selected mood
        for item in saved_tracks_data["items"]:
            track = item["track"]
            features = sp.audio_features(track['id'])
            acousticness = features[0]['acousticness']
            danceability = features[0]['danceability']
            energy = features[0]['energy']
            speechiness = features[0]['speechiness']
            valence = features[0]['valence']
            track_mood = spotify.define_mood(acousticness, danceability, energy, speechiness, valence)
            # if the track's mood is "angry" and the list is not 30 tracks long yet, append it to the list
            if track_mood == "Angry":  # if the track's mood is not Angry, it should go back to the for loop to check the next one (THIS DOESN'T WORK)
                while len(tracks_uris) < 30:
                    track_uri = "spotify:track:{}".format(track['id'])
                    tracks_uris.append(track_uri)
        # once it has gone through all saved tracks, create the playlist and add the tracks
        new_playlist = spotify.create_playlist(auth_header, user_id, playlist_name, playlist_description)
        new_playlist_id = new_playlist['id']
        added_playlist = spotify.add_tracks_to_playlist(auth_header, new_playlist_id, tracks_uris)
        playlist = spotify.get_playlist(auth_header, new_playlist_id)
        tracks_data = []
        for item in playlist["items"]:
            track_data = item["track"]
            tracks_data.append(track_data)
        return render_template("created_playlist.html", selected_mood=selected_mood, playlist_tracks=tracks_data)
Any help would be deeply appreciated. Apologies if anything is badly explained; I am new to Stack Overflow and English is not my first language.
Found the error!
For anyone having the same problem, it was actually very simple: just replace the while loop with an if statement when checking that the playlist is less than 30 tracks long.
I am using the SoundCloud API through the Python SDK. When I get track data through 'Search', the track attribute 'playback_count' seems to be smaller than the actual count seen on the web. How can I avoid this problem and get the actual playback_count?
(For example, this track's playback_count gives me 2700, but it's actually 15k when displayed on the web: https://soundcloud.com/drumandbassarena/ltj-bukem-soundcrash-mix-march-2016)
Note: this problem does not occur for comments or likes.
The following is my code:
## Search ##
tracks = client.get('/tracks', q=querytext, created_at={'from': startdate},
                    duration={'from': startdur}, limit=200)
outputlist = []
trackinfo = {}
resultnum = 0
for t in tracks:
    trackinfo = {}
    resultnum += 1
    trackinfo["id"] = resultnum
    trackinfo["title"] = t.title
    trackinfo["username"] = t.user["username"]
    trackinfo["created_at"] = t.created_at[:-5]
    trackinfo["genre"] = t.genre
    trackinfo["plays"] = t.playback_count
    trackinfo["comments"] = t.comment_count
    trackinfo["likes"] = t.likes_count
    trackinfo["url"] = t.permalink_url
    outputlist.append(trackinfo)
There is an issue with the playback count being incorrect when reported via the API. I have encountered this when getting data via the /me endpoint for activity and likes, to mention a couple.
[Image: the information returned for the currently playing track in the SoundCloud widget]
[Image: the information returned via the API for the /me/activities endpoint]
Looking at the SoundCloud website, they actually call a second version of the API to populate the track list on the user page. It's similar to the documented version, but not quite the same.
If you issue a request to https://api-v2.soundcloud.com/stream/users/[userid]?limit=20&client_id=[clientid] then you'll get back a JSON object showing the same numbers you see on the web.
Since this is an undocumented version, I'm sure it'll change the next time they update their website.
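As a rough illustration, here is a minimal sketch of that call using the requests library; the user ID and client ID are placeholders, and because the endpoint is undocumented the response shape is not guaranteed:
import requests

# placeholders: substitute a real numeric user ID and your own client_id
user_id = 123456
client_id = "YOUR_CLIENT_ID"

url = "https://api-v2.soundcloud.com/stream/users/{}".format(user_id)
resp = requests.get(url, params={"limit": 20, "client_id": client_id})
resp.raise_for_status()

# inspect the raw JSON; the exact structure is undocumented and may change,
# but the play counts in it should match what the website shows
print(resp.json())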
Using an access token from the Facebook Graph API Explorer (https://developers.facebook.com/tools/explorer), with an access scope that includes user likes, I am using the following code to try to get all the likes of a user profile:
myfbgraph = facebook.GraphAPI(token)
mylikes = myfbgraph.get_connections(id="me", connection_name="likes")['data']
for like in mylikes:
    print like['name'], like['category']
...
However, this always gives me only 25 likes, whereas I know that the profile I'm using has 42 likes. Is there some innate limit operating here, or what's the problem in getting ALL the page likes of a user profile?
Per the Graph API documentation:
When you make an API request to a node or edge, you will usually not receive all of the results of that request in a single response. This is because some responses could contain thousands and thousands of objects, and so most responses are paginated by default.
https://developers.facebook.com/docs/graph-api/using-graph-api/v2.2#paging
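One workaround consistent with that documentation is to raise the per-page limit; in facebook-sdk, extra keyword arguments to get_connections are passed through as query parameters. This is only a sketch, and the Graph API still caps page sizes, so for completeness you need to follow the paging links as in the next answer:
# ask for a larger page up front; 'limit' is a standard Graph API
# pagination parameter, but for profiles with many likes you still
# have to follow paging['next'] until it runs out
mylikes = myfbgraph.get_connections(id="me", connection_name="likes", limit=100)['data']
for like in mylikes:
    print like['name'], like['category']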
Well, this appears to work (a function which accepts a user's Facebook graph object):
import requests

def get_myfacebook_likes(myfacebook_graph):
    myfacebook_likes = []
    myfacebook_likes_info = myfacebook_graph.get_connections("me", "likes")
    while myfacebook_likes_info['data']:
        for like in myfacebook_likes_info['data']:
            myfacebook_likes.append(like)
        if 'next' in myfacebook_likes_info['paging'].keys():
            myfacebook_likes_info = requests.get(myfacebook_likes_info['paging']['next']).json()
        else:
            break
    return myfacebook_likes
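Hypothetical usage with the graph object from the question:
all_likes = get_myfacebook_likes(myfbgraph)
print len(all_likes)  # should now report every like, not just the first 25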
The above answers will work, but pretty slowly for anything with many likes. If you just want the number of likes, you can get it much more efficiently via the likes summary:
myfacebook_likes_info = graph.get_connections(post['id'], 'likes?summary=1')
print myfacebook_likes_info["summary"]["total_count"]
I have a list of a few thousand twitter ids and I would like to check who follows who in this network.
I used Tweepy to get the accounts using something like:
ids = {}
for i in list_of_accounts:
    for page in tweepy.Cursor(api.followers_ids, screen_name=i).pages():
        ids[i] = page
        time.sleep(60)
The values in the dictionary ids form the network I would like to analyze. If I try to get the complete list of followers for each id (to compare to the list of users in the network) I run into two problems.
The first is that I may not have permission to see the user's followers - that's okay and I can skip those - but they stop my program. This is the case with the following code:
connections = {}
for x in user_ids:
    l = []
    for page in tweepy.Cursor(api.followers_ids, user_id=x).pages():
        l.append(page)
    connections[x] = l
The second is that I have no way of telling when my program will need to sleep to avoid the rate limit. If I put a 60-second wait after every page in this query, my program would take too long to run.
I tried to find a simple 'exists_friendship' command that might get around these issues in a simpler way - but I only find things that became obsolete with the change to API 1.1. I am open to using other packages for Python. Thanks.
if api.exists_friendship(userid_a, userid_b):
    print "a follows b"
else:
    print "a doesn't follow b, check separately if b follows a"
I am trying to loop through some 50-odd files in a directory. Each file has some text for which I am trying to find the keywords using the Yahoo Term Extractor. I am able to extract text from each file, but I am not able to iteratively call the API using the text as input. Only the keywords for the first file are displayed.
Here is my code snippet:
In the 'comments' list, I have extracted and stored the text from each file.
for c in comments:
    print "building query"
    dataDict = [('appid', appid), ('context', c)]
    queryData = urllib.urlencode(dataDict)
    request.add_data(queryData)
    print "fetching result"
    result = OPENER.open(request).read()
    print result
    time.sleep(1)
Well I don't know anything about the Yahoo Term Extractor, but I'd presume that your call request.add_data(queryData) simply tacks on another data set with each iteration of your loop. And then the call to OPENER.open(request).read() would probably only process the results of the first data set. So either your request object can only hold one query, or your OPENER object's inner workings can only process one query, it's as simple as that.
Actually a third reason comes to mind now that I read the documentation provided at your link, and this is probably the true one:
RATE LIMITS
The Term Extraction service is limited to 5,000 queries per IP address per day and to noncommercial use. See information on rate limiting.
So it would make sense that the API would limit your usage to one query at a time, and not allow you to flood a bunch of queries in a single request.
In any event, I'd assume you could fix your problem in a "naive" way by having many request variables instead of just one, or maybe just creating a new request with every iteration of your loop. If you're not worried about storing your results, and just trying to debug, you could try:
for c in comments:
    print "building query"
    dataDict = [('appid', appid), ('context', c)]
    queryData = urllib.urlencode(dataDict)
    request = urllib2.Request()  # I don't know how to initialize this variable, do it yourself
    request.add_data(queryData)
    print "fetching result"
    result = OPENER.open(request).read()
    print result
    time.sleep(1)
Again, I don't know about the Yahoo Term Extractor (nor do I really have time to research it) so there may very well be a better, more native way to do this. If you post more details of your code (i.e. what classes are the request and OPENER objects coming from) then I might be able to elaborate on this.