Who shared my Facebook post? - python

Using any Facebook API for Python, I am trying to get the # of people who shared my post and who those people are. I currently have the first part..
>>> from facepy import *
>>> graph = GraphAPI("CAAEr")
>>> g = graph.get('apple/posts?limit=20')
>>> g['data'][10]['shares']
That gets the count, but I want to know who those people are.

The sharedposts connection will give you more information about the shares of a post. You have to GET USER_ID?fields=sharedposts. This information doesn't appear in the doc.
This follows your code:
# Going through each of your posts one by one
for post in g['data']:
# Getting post ID
id = post['id'] # Gives USERID_POSTID
id[-17:] # We know that a post ID is represented by 17 numerals
# Another Graph request to get the shared posts
shares = graph.get(id + '?fields=sharedposts')
print('Post' id, 'was shared by:')
# Displays the name of each sharer
print share['from']['name'] for share in shares['data']
It is my first time with Python, there might be some syntax errors in the code. But you got the idea.

Related

Scopus search for a DOI and retrieve authors

I'm trying to get the authors of a publication by using scopus. For that I got an API key and startet. I searched for the DOI and got a response. Everything is fine, there is also an entry "authors", but for each request this field is simply empty. My code in python is below:
import pyscopus
key = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXX'
doi = '10.1016/0270-0255(87)90003-0'
scopus = pyscopus.Scopus(key)
response_json = json.loads(scopus.search(f'doi({doi})', view='STANDARD').to_json(orient="records"))
So a s I sayed, you can call response_json['authors'] but it is always empty. There are authors given on the website, but webscraping is forbidden. Am I doing something wrong or do they simply not provide these information (which is confusing, since there is a field)? So far I couldn't find an answer.
I know there are other ways like crossref to get these information, but for reasons I want to do it with scopus.
Thanks!

SoundCloud API - Playback Count is smaller than actual count

I am using soundcloud api through python SDK.
When I get the tracks data through 'Search',
the track attribute 'playback_count' seems to be
smaller than the actual count seen on the web.
How can I avoid this problem and get the actual playback_count??
(ex.
this track's playback_count gives me 2700,
but its actually 15k when displayed on the web
https://soundcloud.com/drumandbassarena/ltj-bukem-soundcrash-mix-march-2016
)
note: this problem does not occur for comments or likes.
following is my code
##Search##
tracks = client.get('/tracks', q=querytext, created_at={'from':startdate},duration={'from':startdur},limit=200)
outputlist = []
trackinfo = {}
resultnum = 0
for t in tracks:
trackinfo = {}
resultnum += 1
trackinfo["id"] = resultnum
trackinfo["title"] =t.title
trackinfo["username"]= t.user["username"]
trackinfo["created_at"]= t.created_at[:-5]
trackinfo["genre"] = t.genre
trackinfo["plays"] = t.playback_count
trackinfo["comments"] = t.comment_count
trackinfo["likes"] =t.likes_count
trackinfo["url"] = t.permalink_url
outputlist.append(trackinfo)
There is an issue with the playback count being incorrect when reported via the API.
I have encountered this when getting data via the /me endpoint for activity and likes to mention a couple.
The first image shows the information returned when accessing the sound returned for the currently playing track in the soundcloud widget
Information returned via the api for the me/activities endpoint
Looking at the SoundCloud website, they actually call a second version of the API to populate the track list on the user page. It's similar to the documented version, but not quite the same.
If you issue a request to https://api-v2.soundcloud.com/stream/users/[userid]?limit=20&client_id=[clientid] then you'll get back a JSON object showing the same numbers you see on the web.
Since this is an undocumented version, I'm sure it'll change the next time they update their website.

Twitter timeline in Python, but only getting 20ish results?

I'm a nub when it comes to python. I literally just started today and have little understanding of programming. I have managed to make the following code work:
from twitter import *
config = {}
execfile("config.py", config)
twitter = Twitter(
auth = OAuth(config["access_key"], config["access_secret"],
config["consumer_key"], config["consumer_secret"]))
user = "skiftetse"
results = twitter.statuses.user_timeline(screen_name = user)
for status in results:
print "(%s) %s" % (status["created_at"], status["text"].encode("ascii",
"ignore"))
The problem is that it's only printing 20 results. The twitter page i'd like to get data from has 22k posts, so something is wrong with the last line of code.
screenshot
I would really appreciate help with this! I'm doing this for a research sentiment analysis, so I need several 100's to analyze. Beyond that it'd be great if retweets and information about how many people re tweeted their posts were included. I need to get better at all this, but right now I just need to meet that deadline at the end of the month.
You need to understand how the Twitter API works. Specifically, the user_timeline documentation.
By default, a request will only return 20 Tweets. If you want more, you will need to set the count parameter to, say, 50.
e.g.
results = twitter.statuses.user_timeline(screen_name = user, count = 50)
Note, count:
Specifies the number of tweets to try and retrieve, up to a maximum of 200 per distinct request.
In addition, the API will only let you retrieve the most recent 3,200 Tweets.

Facebook SDK for Python - getting all page likes of a Facebook profile

Using an access token from the Facebook Graph API Explorer (https://developers.facebook.com/tools/explorer), with access scope which includes user likes, I am using the following code to try to get all the likes of a user profile:
myfbgraph = facebook.GraphAPI(token)
mylikes = myfbgraph.get_connections(id="me", connection_name="likes")['data']
for like in mylikes:
print like['name'], like['category']
...
However this is always giving me only 25 likes, whereas I know that the profile I'm using has 42 likes. Is there some innate limit operating here, or what's the problem in getting ALL the page likes of a user profile?
Per the Graph documention:
When you make an API request to a node or edge, you will usually not
receive all of the results of that request in a single response. This
is because some responses could contain thousands and thousands of
objects, and so most responses are paginated by default.
https://developers.facebook.com/docs/graph-api/using-graph-api/v2.2#paging
Well, this appears to work (a method, which accepts a user's facebook graph):
def get_myfacebook_likes(myfacebook_graph):
myfacebook_likes = []
myfacebook_likes_info = myfacebook_graph.get_connections("me", "likes")
while myfacebook_likes_info['data']:
for like in myfacebook_likes_info['data']:
myfacebook_likes.append(like)
if 'next' in myfacebook_likes_info['paging'].keys():
myfacebook_likes_info = requests.get(myfacebook_likes_info['paging']['next']).json()
else:
break
return myfacebook_likes
The above answers will work, but pretty slowly for anything with many likes. If you just want the count for number of likes, you can get it much more efficiently with total_likes:
myfacebook_likes_info = graph.get_connections(post['id'], 'likes?summary=1')
print myfacebook_likes_info["summary"]["total_count"]

Discogs API => How to retrieve genre?

I've crawled a tracklist of 36.000 songs, which have been played on the Danish national radio station P3. I want to do some statistics on how frequently each of the genres have been played within this period, so I figured the discogs API might help labeling each track with genre. However, the documentation for the API doesent seem to include an example for querying the genre of a particular song.
I have a CSV-file with with 3 columns: Artist, Title & Test(Test where i want the API to label each song with the genre).
Here's a sample of the script i've built so far:
import json
import pandas as pd
import requests
import discogs_client
d = discogs_client.Client('ExampleApplication/0.1')
d.set_consumer_key('key-here', 'secret-here')
input = pd.read_csv('Desktop/TEST.csv', encoding='utf-8',error_bad_lines=False)
df = input[['Artist', 'Title', 'Test']]
df.columns = ['Artist', 'Title','Test']
for i in range(0, len(list(df.Artist))):
x = df.Artist[i]
g = d.artist(x)
df.Test[i] = str(g)
df.to_csv('Desktop/TEST2.csv', encoding='utf-8', index=False)
This script has been working with a dummy file with 3 records in it so far, for mapping the artist of a given ID#. But as soon as the file gets larger(ex. 2000), it returns a HTTPerror when it cannot find the artist.
I have some questions regarding this approach:
1) Would you recommend using the search query function in the API for retrieving a variable as 'Genre'. Or do you think it is possible to retrieve Genre with a 'd.' function from the API?
2) Will I need to aquire an API-key? I have succesfully mapped the 3 records without an API-key so far. Looks like the key is free though.
Here's the guide I have been following:
https://github.com/discogs/discogs_client
And here's the documentation for the API:
https://www.discogs.com/developers/#page:home,header:home-quickstart
Maybe you need to re-read the discogs_client examples, i am not an expert myself, but a newbie trying to use this API.
AFAIK, g = d.artist(x) fails because x must be a integer not a string.
So you must first do a search, then get the artist id, then d.artist(artist_id)
Sorry for no providing an example, i am python newbie right now ;)
Also have you checked acoustid for
It's a probably a rate limit.
Read the status code of your response, you should find an 429 Too Many Requests
Unfortunately, if that's the case, the only solution is to add a sleep in your code to make one request per second.
Checkout the api doc:
http://www.discogs.com/developers/#page:home,header:home-rate-limiting
I found this guide:
https://github.com/neutralino1/discogs_client.
Access the api with your key and try something like:
d = discogs_client.Client('something.py', user_token=auth_token)
release = d.release(774004)
genre = release.genres
If you found a better solution please share.

Categories

Resources