I am working on a Twitter app using Python 2.7 and the latest version of the tweepy module. One thing I cannot figure out is how to use the function rate_limit_status().
Here is my code:
import tweepy, time, sys, random, pickle
import pprint
# argfile = str(sys.argv[1])
#enter the corresponding information from your Twitter application:
CONSUMER_KEY = ''
CONSUMER_SECRET = ''
ACCESS_KEY = ''
ACCESS_SECRET = ''
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)
public_tweets = api.home_timeline()
user = api.get_user('@MyUserName')
print api.rate_limit_status()
When I print the results of the function it gives me a large dictionary that I cannot decipher. I have looked at the tweepy documentation but can't find any good examples on using rate_limit_status().
What is the next step I should be doing to troubleshoot something like this?
Is there a tool to format these large dictionaries so I can read them and try to decipher how to access the values in the dictionary?
Edit:
It turns out I didn't have a good understanding of what a REST API is and how simply it works! I was expecting something MUCH more complicated in my head.
I actually switched from Tweepy to the twitter Python module and then did a lot of research on how to use the Twitter API.
Two YouTube videos that REALLY helped me are:
https://www.youtube.com/watch?v=7YcW25PHnAA
and
https://www.youtube.com/watch?v=fhPb6ocUz_k
The Postman Chrome app was awesome. It let me easily test and visualize how my calls to the Twitter API worked, and it formatted the resulting JSON so I could read it.
To do quick calculations I also took the JSON from Postman and threw it into this website http://konklone.io/json/ to get a CSV that I could then open in Excel to make sure everything was behaving as expected and that I was getting the right number of results.
After all that, writing the Python code to interact with the Twitter API was easy!
I'm adding all this in the hope that it will help someone else in the future! If it does, please let me know! :)
As per the Tweepy documentation:
Returns the remaining number of API requests available to the
requesting user before the API limit is reached for the current hour.
Calls to rate_limit_status do not count against the rate limit. If
authentication credentials are provided, the rate limit status for the
authenticating user is returned. Otherwise, the rate limit status for
the requester’s IP address is returned.
In simpler words, it returns a JSON object that tells you how many requests you have made and how many remain. It is difficult to read at first sight because it contains the counts for every type of API call, not only the call you just executed.
For example, if you run the above script, you make a call to api.home_timeline(). According to Twitter's rules you can only make 15 calls to this method in a given rate-limit window. If you unpack the returned JSON object there is a lot of data, but if you analyse it you will find that api.home_timeline() only affects the limits of the relevant method. So after calling the methods above, you can check their rate limits using:
data = api.rate_limit_status()
print data['resources']['statuses']['/statuses/home_timeline']
print data['resources']['users']['/users/lookup']
However, you have to do a little research on the returned JSON before you can extract the relevant data from it. As the returned JSON objects are hard to read, you can always use these types of links to make them more readable and then analyse them.
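To read the dictionary without an external tool, a minimal sketch using only the standard library is shown here. The sample dictionary is a trimmed-down stand-in shaped like the data['resources'][...] paths used above; its numbers are made up for illustration, and the helper names pretty and used_endpoints are hypothetical, not part of Tweepy.

```python
import json

# Trimmed-down sample shaped like the dictionary rate_limit_status() returns.
# The numbers are made up for illustration only.
sample = {
    "resources": {
        "statuses": {
            "/statuses/home_timeline": {"limit": 15, "remaining": 14, "reset": 1403602426},
        },
        "users": {
            "/users/lookup": {"limit": 900, "remaining": 900, "reset": 1403602426},
        },
    }
}


def pretty(data):
    """Return the dictionary as indented, key-sorted JSON text."""
    return json.dumps(data, indent=2, sort_keys=True)


def used_endpoints(data):
    """Return {endpoint: calls used} for endpoints with at least one call used."""
    used = {}
    for family in data["resources"].values():
        for endpoint, info in family.items():
            spent = info["limit"] - info["remaining"]
            if spent > 0:
                used[endpoint] = spent
    return used


print(pretty(sample))
print(used_endpoints(sample))
```

With a real api object you would pass api.rate_limit_status() in place of sample. Filtering down to the endpoints you have actually touched is usually enough to make the output manageable.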
Yesterday I wrote a twitter bot with Python that takes the most recent tweet from Donald Trump, translates it over in Google Translate 45 times, and tweets back at him the final translation in English. Everything works, except the fact that I now need to add some sort of "listener" at the beginning of the code to automatically detect when he tweets so that the rest of the code can do its magic. I've been looking over the internet for some time now and I can't seem to find any sort of event handler that would allow the script to detect when he tweets. So that's why I've come to you guys. Is there any way to use Tweepy or other Python libraries to actively detect when someone tweets? I'll include my code so you guys can see what I want to happen exactly when he does Tweet. It's annotated so hopefully it's not too complicated to understand. Thanks!
import tweepy
from googletrans import Translator
#Keys for accessing the Twitter API
consumer_key = 'PLACEHOLDER'
consumer_secret = 'PLACEHOLDER'
access_token = 'PLACEHOLDER'
access_token_secret = 'PLACEHOLDER'
#Setting up authentication
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
#Scrapes my timeline for DT's latest tweet. I want this to be done AUTOMATICALLY when he tweets.
tweet = api.user_timeline(screen_name = 'realDonaldTrump', count = 1, include_rts = False, tweet_mode = 'extended')
#Translates the text from the .json file that is pulled from the previous line using the Google Translate library.
for status in tweet:
    translator = Translator()
    #Each translation takes the result of the previous one as its input.
    translation = translator.translate(status._json["full_text"], 'ja')
    translation = translator.translate(translation.text, 'mn')
    #There are more translations in the actual code, but to reduce the length and complexity, I've taken those out. They don't matter to the specific question.
    #Include his handle in a message so that the tweet is tweeted back at him once the translation is complete.
    message = '@realDonaldTrump ' + translation.text
    #Tweets the message back at him under that specific tweet using its ID.
    send = api.update_status(message, in_reply_to_status_id = status._json["id"])
I just want the code to be able to scrape my timelines for one of DT's tweets in real time. Thanks!
To automate your script, you may need to deploy it to a production server and create a cron job that runs it at regular intervals. To run it locally, I use two tools: 1. NGROK, which exposes your localhost addresses and ports to the outside world, i.e. it gives you a way to reach your localhost from the internet, and 2. Invictify, which lets you run your scripts on a schedule. As it stands, the script also needs a web service to trigger it: use Flask or FastAPI to create endpoints that call the script when triggered.
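The "run at regular intervals" idea can also be sketched in plain Python as a polling loop. This is a stand-in for the cron job, not the NGROK/Invictify setup described above. The names poll_for_new_tweets, fetch_latest and handle_tweet are hypothetical: fetch_latest would wrap the api.user_timeline(...) call from the question and return a (tweet_id, text) pair, and handle_tweet would do the translating and replying. The sketch assumes newer tweets have larger IDs.

```python
import time


def poll_for_new_tweets(fetch_latest, handle_tweet, interval_seconds=60,
                        max_polls=None, last_seen_id=None, sleep=time.sleep):
    """Call fetch_latest() repeatedly and hand each unseen tweet to handle_tweet.

    fetch_latest returns (tweet_id, text) or None (hypothetical callable).
    Pass last_seen_id to skip a tweet that was already handled.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        latest = fetch_latest()
        if latest is not None:
            tweet_id, text = latest
            # Only react when the newest tweet differs from the last one handled.
            if last_seen_id is None or tweet_id > last_seen_id:
                handle_tweet(tweet_id, text)
                last_seen_id = tweet_id
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval_seconds)
    return last_seen_id


# Demo with a fake feed and a no-op sleep, so it runs instantly and offline.
feed = iter([(10, "first"), (10, "first"), (11, "second")])
handled = []
poll_for_new_tweets(lambda: next(feed),
                    lambda tweet_id, text: handled.append((tweet_id, text)),
                    max_polls=3, sleep=lambda seconds: None)
print(handled)
```

Keep the interval long enough to stay inside the rate limit of whichever endpoint fetch_latest calls.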
I use tweepy to do some Twitter analysis. I want to look at the list of users who retweeted a given tweet. First of all, I want to extract the number of retweeters of this tweet using tweepy.
I use the following code:
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
count=0
for tweet in api.retweets(1090392302130888704):
    countj+=1
print(countj)
As you can see from the link, the number of retweets is 54. However, this code returns 50. Why is there this discrepancy?
I have tried to apply this code to several tweets, and there is always a discrepancy between what I see in the web client and the result of the code.
Protected Retweets are shown as part of the count you see, but you're unable to obtain them or their Retweeters through the API (unless that protected account follows you).
To outline this, you can see that https://twitter.com/AmericaTalks/status/1090408203882360832 has 7 Retweets right now. If you click to see who Retweeted, it'll show 6 accounts, and at the bottom, it'll say "1 user has asked not to be shown in this view. Learn More". The API will also return only the 6 Retweet(er)s.
Note, in your code, you define count, but use countj. This will result in a NameError.
Also, API.retweets returns a list of Status objects, so you can just do len(api.retweets(1090392302130888704)), instead of looping through them to count them.
I just started working with the Twitter API. I don't normally have a Twitter account, so I created one for this. I tweeted 4 times, including some mentions. But when I use mentions_timeline like this:
my_mentions = api.mentions_timeline()
#print(my_mentions)
#output: []
After that I loop over my_mentions, reading text and screen_name, but nothing is returned.
What am I doing wrong here? Why is it an empty list, given that I mentioned some people in my tweets? And how can I search the mentions of another user? Is there a parameter in mentions_timeline() like screen_name or id?
Try using the new Cursor Object as follows:
api = tweepy.API(auth)
for mentions in tweepy.Cursor(api.mentions_timeline).items():
    # process mentions here
    print(mentions.text)
As per Twitter's documentation:
Returns the 20 most recent mentions (tweets containing a users’s
@screen_name) for the authenticating user.
So you cannot check other users' mentions using this method. To achieve this, you will have to use Twitter's Search API; see Tweepy's documentation for details.
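A minimal sketch of the Search API route follows, assuming Tweepy 3.x, where the search method is api.search (newer Tweepy releases renamed it search_tweets). The helper search_mentions and the FakeAPI/FakeStatus classes are hypothetical; the fakes exist only so the sketch runs without credentials.

```python
def search_mentions(api, screen_name, count=20):
    """Return the text of recent tweets that mention @screen_name.

    `api` is expected to behave like a Tweepy 3.x tweepy.API object.
    """
    query = "@" + screen_name.lstrip("@")
    return [status.text for status in api.search(q=query, count=count)]


class FakeStatus(object):
    """Hypothetical stand-in for a Tweepy Status object."""
    def __init__(self, text):
        self.text = text


class FakeAPI(object):
    """Hypothetical stand-in for tweepy.API, used only for this demo."""
    def search(self, q, count=20):
        self.last_query = q
        return [FakeStatus("hello " + q)][:count]


fake = FakeAPI()
print(search_mentions(fake, "someone"))
```

With real credentials you would pass the api object built from tweepy.API(auth) instead of FakeAPI(). Note that the Search API only covers recent tweets, so older mentions will not appear.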
import tweepy
api = tweepy.API(auth)
api.mentions_timeline()
Try going to your profile, using the same account you're using with the API, and see whether the mentions exist there.
Also try mentioning that Twitter account from a different account.
It might be that Twitter has limited your activity, so that account's replies/tweets are not visible.
I'd like to send a query to the Twitter Search API using a tweet ID, but it seems you cannot search for a tweet by its ID (maybe because you don't need to search for it if you already have the ID). For example, imagine we have the tweet https://twitter.com/great_watches/status/643389532105256961 and we want to send 643389532105256961 to the Search API to see whether the tweet is available there.
I need this because I want to compare the Twitter Search API with the Twitter Streaming API.
I have a Python script that listens to the stream for some keywords, and whenever a tweet comes in I'd like to search for it on the Twitter Search API to see whether it is also available there. Meaningless, huh?
You can't compare the Search API to the Streaming API the way you're doing it, because they retrieve different types of information.
From the Search API docs:
The Twitter Search API is part of Twitter’s v1.1 REST API. It allows
queries against the indices of recent or popular Tweets and behaves
similarily to, but not exactly like the Search feature available in
Twitter mobile or web clients, such as Twitter.com search.
Before getting involved, it’s important to know that the Search API is
focused on relevance and not completeness. This means that some Tweets
and users may be missing from search results. If you want to match for
completeness you should consider using a Streaming API instead.
To explain the scenario based on the information you've given:
You're streaming the word python and you find a match.
You instantly take that match and look for it on the Search API.
The issue is that in the time between receiving the tweet from the Streaming API (which is real time) and looking for the same tweet on the Search API, more relevant and more popular tweets can supersede it in the search results.
You'll need to redefine the query sent to the Search API so that it matches that exact tweet (i.e. include more than python, which is all you used with the Streaming API).
You can get it using the tweepy API. Get the consumer key and secret and the access key and secret from https://apps.twitter.com/. Then run the following:
import tweepy
consumer_key = 'XXXX'
consumer_secret = 'XXXX'
access_key = 'XXXX-XXXX'
access_secret = 'XXXX'
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
tweets = api.statuses_lookup(['643389532105256961'])
print(tweets[0].text) # Prints the message
More info here http://docs.tweepy.org/en/v3.5.0/api.html#API.statuses_lookup
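If you collect several IDs from the stream, a rough sketch for checking which of them the lookup endpoint still returns could look like this. It assumes Tweepy 3.x, where the method is statuses_lookup (at most 100 IDs per request), and that tweets which are not available are simply left out of the result. The helper split_by_availability and the fake classes are hypothetical. This checks the lookup endpoint, which is not the same as the tweet being present in the Search API index.

```python
def split_by_availability(api, tweet_ids):
    """Split tweet_ids into (found, missing) based on what statuses_lookup returns."""
    wanted = [int(tweet_id) for tweet_id in tweet_ids]
    returned = set(status.id for status in api.statuses_lookup(wanted))
    found = [tweet_id for tweet_id in wanted if tweet_id in returned]
    missing = [tweet_id for tweet_id in wanted if tweet_id not in returned]
    return found, missing


class FakeStatus(object):
    """Hypothetical stand-in for a Tweepy Status object."""
    def __init__(self, id):
        self.id = id


class FakeAPI(object):
    """Hypothetical stand-in for tweepy.API; pretends only one tweet exists."""
    def statuses_lookup(self, id_):
        return [FakeStatus(i) for i in id_ if i == 643389532105256961]


found, missing = split_by_availability(FakeAPI(), ["643389532105256961", "1"])
print(found, missing)
```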
I have been trying to figure this out, but it is really frustrating. I'm trying to get tweets with a certain hashtag (a large number of tweets) using Tweepy, but the results don't go back more than one week. I need to go back at least two years, for a period of a couple of months. Is this even possible, and if so, how?
For reference, here is my code:
import tweepy
import csv
consumer_key = '####'
consumer_secret = '####'
access_token = '####'
access_token_secret = '####'
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
# Open/Create a file to append data
csvFile = open('tweets.csv', 'a')
#Use csv Writer
csvWriter = csv.writer(csvFile)
for tweet in tweepy.Cursor(api.search,q="#ps4",count=100,\
                           lang="en",\
                           since_id=2014-06-12).items():
    print tweet.created_at, tweet.text
    csvWriter.writerow([tweet.created_at, tweet.text.encode('utf-8')])
As you have noticed, the Twitter API has some limitations. I have implemented code that gets around this by using the same strategy Twitter uses when running in a browser. Take a look; with it you can get the oldest tweets: https://github.com/Jefferson-Henrique/GetOldTweets-python
You cannot use the twitter search API to collect tweets from two years ago. Per the docs:
Also note that the search results at twitter.com may return historical results while the Search API usually only serves tweets from the past week. - Twitter documentation.
If you need a way to get old tweets, you can get them from individual users because collecting tweets from them is limited by number rather than time (so in many cases you can go back months or years). A third-party service that collects tweets like Topsy may be useful in your case as well (shut down as of July 2016, but other services exist).
I found some code that helps retrieve older tweets:
https://github.com/Jefferson-Henrique/GetOldTweets-python
To get old tweets, run the following command in the directory where the code repository got extracted.
python Exporter.py --querysearch 'keyword' --since 2016-01-10 --until 2016-01-15 --maxtweets 1000
It returned a file, 'output_got.csv', with 1000 tweets from the above dates matching your keyword.
You need to install the 'pyquery' module for this to work.
PS: You can modify 'Exporter.py' python code file to get more tweet attributes as per your requirement.
2018 update:
Twitter has Premium search APIs that can return results from the beginning of time (2006):
https://developer.twitter.com/en/docs/tweets/search/overview/premium#ProductPackages
Search Tweets: 30-day endpoint → provides Tweets from the previous 30
days.
Search Tweets: Full-archive endpoint → provides complete and instant
access to Tweets dating all the way back to the first Tweet in March
2006.
With an example Python client:
https://github.com/twitterdev/search-tweets-python
I know this is a very old question, but some folks might still be facing the same issue.
After some digging, I found out that Tweepy's search only returns data for the past 7 days, which sometimes leads people to buy a third-party service.
I used the Python library GetOldTweets3 and it worked fine for me. The library is really easy to use. Its only limitation is that you can't search for more than one hashtag in one execution, but it works fine for searching multiple accounts at the same time.
Use the args "since" and "until" to adjust your timeframe. You are presently using since_id, which is meant to correspond to Twitter ID values (not dates):
for tweet in tweepy.Cursor(api.search,
                           q="test",
                           since="2014-01-01",
                           until="2014-02-01",
                           lang="en").items():
    # process each tweet here, e.g. write it to the CSV as in the question
    print(tweet.text)
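Because since and until take YYYY-MM-DD strings, the window can be computed instead of hard-coded. The following is a minimal sketch using only the standard library; the helper name search_window is hypothetical, and the 7-day default is an assumption based on the statement elsewhere in this thread that the Search API usually only serves tweets from the past week.

```python
from datetime import date, timedelta


def search_window(days_back=7, today=None):
    """Return since/until date strings covering the last `days_back` days."""
    if today is None:
        today = date.today()
    start = today - timedelta(days=days_back)
    # isoformat() on a date gives the YYYY-MM-DD form used in the snippet above.
    return {"since": start.isoformat(), "until": today.isoformat()}


window = search_window(today=date(2014, 2, 1))
print(window)
```

The result can be unpacked into the call from the snippet above, for example tweepy.Cursor(api.search, q="test", lang="en", **search_window()). Dates outside the range the API serves will simply yield no results.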
As others have noted, the Twitter API has the date limitation, but the actual advanced search as implemented on twitter.com does not. So the solution is to use Python's wrapper for Selenium or PhantomJS to iterate through the twitter.com endpoint. Here's an implementation using Selenium that someone has posted on GitHub: https://github.com/bpb27/twitter_scraping/
I can't believe nobody said this but this git repository completely solved my problem. I haven't been able to utilize other solutions such as GOT or Twitter API Premium.
Try this, definitely useful:
https://betterprogramming.pub/how-to-scrape-tweets-with-snscrape-90124ed006af
https://github.com/MartinBeckUT/TwitterScraper/tree/master/snscrape/cli-with-python