So, first off, I realize there are a number of questions about handling the Twitter rate limits. I have no idea why, but none of the ones I've found so far work for me.
I'm using tweepy. I'm trying to get a list of all the followers of the followers of a user. As expected, I can't pull everything down at once because of Twitter's rate limits. I have tweepy v3.5 installed and am therefore referring to http://docs.tweepy.org/en/v3.5.0/api.html. To get the list of followers of the originating user I use:
auth = tweepy.OAuthHandler(consumer_key,consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth)
followerIDs = []
for page in tweepy.Cursor(api.followers_ids, screen_name=originatingUser, wait_on_rate_limit = True, wait_on_rate_limit_notify = True).pages():
    followerIDs.extend(page)
followers = api.lookup_users(follower)
This works for a bit but quickly turns into:
tweepy.error.TweepError: [{u'message': u'Rate limit exceeded', u'code': 88}]
My plan would then be to retrieve the followers of each user for each followerID using something like this:
for followerID in followerIDs:
    for page in tweepy.Cursor(api.followers_ids, id=followerID, wait_on_rate_limit = True, wait_on_rate_limit_notify = True).pages():
        followerIDs.extend(page)
The other problem I have is when I try to look up the user names. For this, I use the grouper recipe from the itertools documentation to break the followers up into groups of 100 (api.lookup_users can only accept 100 IDs at a time) and use
followerIDs = grouper(followerIDs,100)
for followerGroup in followerIDs:
    followerGroup=filter(None, followerGroup)
    followers = api.lookup_users(followerGroup,wait_on_rate_limit = True)
    for follower in followers:
        print (originatingUser + ", " + str(follower.screen_name))
That gets a different error, namely:
TypeError: lookup_users() got an unexpected keyword argument 'wait_on_rate_limit'
which I find confusing, because the tweepy API docs suggest that it should be an accepted argument.
Any ideas as to what I'm doing wrong?
Cheers
Ben.
I know this might be a little late, but here goes.
You pass the wait_on_rate_limit argument to the Cursor constructor, while the tweepy documentation states that it should be passed to the API() constructor.
In your code it would look like:
api = tweepy.API(auth,wait_on_rate_limit=True)
There's also another argument wait_on_rate_limit_notify, which informs you when tweepy is waiting for your rate limit to refresh. Adding both would finally make the line:
api = tweepy.API(auth,wait_on_rate_limit=True,wait_on_rate_limit_notify=True)
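With the waiting configured on the API object, the lookup_users call in the question no longer needs a wait_on_rate_limit keyword. For the batching itself, here is a minimal sketch in plain Python (no tweepy needed; the helper name chunks and the sample IDs are made up for illustration). It slices the ID list directly, which avoids the None padding that grouper adds and the filter(None, ...) step:

```python
def chunks(items, size=100):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Made-up IDs standing in for the collected follower IDs.
follower_ids = list(range(250))
batches = list(chunks(follower_ids, 100))
print([len(batch) for batch in batches])  # [100, 100, 50]
```

Each batch could then be handed to api.lookup_users in place of followerGroup.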
There is a rate limit for the Twitter API, as mentioned here: https://dev.twitter.com/rest/public/rate-limiting
A quick way around it is to catch the rate limit error, sleep for a while, and then continue where you left off.
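import time
import tweepy

pages = tweepy.Cursor(api.followers_ids, id=followerID).pages()
while True:
    try:
        page = next(pages)
        followerIDs.extend(page)
    except tweepy.TweepError:
        # Treated as a rate limit error: wait out the 15-minute window, then retry.
        time.sleep(60 * 15)
        continue
    except StopIteration:
        break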
This should do the trick. I'm not sure it will work exactly as you expect, but that is the basic idea.
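To show the control flow of that loop without touching the network, here is a small self-contained sketch. RateLimitError, FlakyPages and drain are hypothetical names invented for the illustration; in real code the iterator would be the Cursor's pages() and the exception would be the one tweepy raises. It assumes the iterator can be resumed after raising, which is why the stand-in is a class and not a generator:

```python
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the library's rate limit exception."""


class FlakyPages:
    """Hypothetical page iterator that fails once, then resumes."""

    def __init__(self, pages, fail_at):
        self._pages = list(pages)
        self._index = 0
        self._fail_at = fail_at
        self._failed = False

    def __iter__(self):
        return self

    def __next__(self):
        if self._index == self._fail_at and not self._failed:
            self._failed = True
            raise RateLimitError("rate limit exceeded")
        if self._index >= len(self._pages):
            raise StopIteration
        page = self._pages[self._index]
        self._index += 1
        return page


def drain(pages, wait_seconds=15 * 60, sleep=time.sleep):
    """Collect every item from every page, sleeping whenever the limit is hit."""
    collected = []
    while True:
        try:
            page = next(pages)
        except RateLimitError:
            sleep(wait_seconds)  # wait for the window to reset, then retry
            continue
        except StopIteration:
            break
        collected.extend(page)
    return collected


waits = []  # record the requested sleeps instead of really waiting
ids = drain(FlakyPages([[1, 2], [3], [4, 5]], fail_at=1), sleep=waits.append)
print(ids)    # [1, 2, 3, 4, 5]
print(waits)  # [900]
```

Passing the sleep function in as a parameter is only there so the example finishes instantly; with the default it would really wait 15 minutes.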
A little help please. This is what I am working with now (I got it from here on stackoverflow) and it works very well, but it seems to only work with the most recent accounts in the list of accounts that don't follow me back. I want to start unfollowing accounts from the oldest to the newest because I keep reaching the limit of the API. I thought to make a list of followers and reverse it then plug that in somewhere but not quite sure how to do that or what the syntax would be. Thanks in advance.
import tweepy
from cred import *
from config import QUERY, UNFOLLOW, FOLLOW, LIKE, RETWEET
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)
def main():
    try:
        if UNFOLLOW:
            my_screen_name = api.get_user(screen_name='YOUR_SCREEN_NAME')
            for follower in my_screen_name.friends():
                Status = api.get_friendship(source_id = my_screen_name.id , source_screen_name = my_screen_name.screen_name, target_id = follower.id, target_screen_name = follower.screen_name)
                if Status[0].followed_by:
                    print('{} he is following You'.format(follower.screen_name))
                else:
                    print('{} he is not following You'.format(follower.screen_name))
                    api.destroy_friendship(screen_name = follower.screen_name)
    except tweepy.errors.TweepyException as e:
        print(e)
while True:
    main()
here is the config.py file
#config.py
UNFOLLOW = True
I recently assembled some pieces of code together to reach this, so I'll just copy paste what I already have here instead of updating your code, but I can point out the main points (and give some tips).
The full code:
import tweepy
from cred import *
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)
def unfollower():
    followers = api.get_follower_ids(screen_name=api.verify_credentials().screen_name)
    friends = api.get_friend_ids(screen_name=api.verify_credentials().screen_name)
    print("You follow:", len(friends))
    for friend in friends[::-1]:
        if friend not in followers:
            api.destroy_friendship(user_id = friend)
        else:
            pass
    friends = api.get_friend_ids(screen_name=api.verify_credentials().screen_name)
    print("Now you're following:", len(friends))
unfollower()
Now, what happens here and how does it differ from your code?
These two variables:
followers = api.get_follower_ids(screen_name=api.verify_credentials().screen_name)
friends = api.get_friend_ids(screen_name=api.verify_credentials().screen_name)
hold lists of IDs: the followers (accounts that follow you) and the friends (accounts you follow). Now all we need to do is compare the two.
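That comparison can be sketched in isolation as below. The ID values are made up and non_followers is a hypothetical helper name; converting the followers to a set is an optional tweak that keeps each membership test cheap on long lists:

```python
def non_followers(friends, followers):
    """Return the friend IDs that are not among the follower IDs, keeping order."""
    follower_set = set(followers)
    return [user_id for user_id in friends if user_id not in follower_set]


# Made-up IDs: accounts you follow, and accounts that follow you.
friends = [11, 12, 13, 14]
followers = [12, 14, 99]
print(non_followers(friends, followers))  # [11, 13]
```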
There is a discussion about the Twitter rate limits and how requests made through cursors have a lower limit than those made without, but I'm not qualified to explain the whys. So let's just assume that, if we want to avoid low rate limits, the best way is to avoid requests that are intrinsically tightly limited, such as calling api.get_friendship and then getting the screen_name for every account. Instead I'm using the get_friend_ids method.
The next part involves what you called "make a list of followers and reverse". The list is already there in the variable "friends", so all we need to do now is read it in reverse with the following line:
for friend in friends[::-1]:
This says: "read each element of the list with a step of -1", or roughly "read the list backwards".
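A tiny illustration with made-up IDs; reversed() is shown as an equivalent alternative that walks the list backwards without building a second list first:

```python
friends = [101, 102, 103, 104]

backwards = friends[::-1]        # new list, last element first
print(backwards)                 # [104, 103, 102, 101]
print(list(reversed(friends)))   # [104, 103, 102, 101]
print(friends)                   # the original is unchanged: [101, 102, 103, 104]
```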
Well, I think those are the major points. I wrapped the code in a function, but you don't really need to; it just makes it easier to turn this into a class later. This way you also don't need the while True: main() loop: just call the function unfollower() and the script ends on its own once the unfollows are done.
Now some minor points that might improve your code:
Instead of using
screen_name='YOUR_SCREEN_NAME'
which requires a config file or a hardcoded screen_name, you can use
screen_name=api.verify_credentials().screen_name
This way it automatically knows that you want the authenticating user's information (note that I didn't use this part in my code, since the get_friend_ids method does not need the screen_name)
Now this part
from cred import *
from config import QUERY, UNFOLLOW, FOLLOW, LIKE, RETWEET
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)
First, I've eliminated the need for the config file.
You can also drop everything extra that gets imported from the cred file, so you don't need from cred import *. Update cred.py with:
import tweepy
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)
and now you can import just the api object with from cred import api. This way the code becomes cleaner:
import tweepy
from cred import api
def unfollower():
    followers = api.get_follower_ids(screen_name=api.verify_credentials().screen_name)
    friends = api.get_friend_ids(screen_name=api.verify_credentials().screen_name)
    print("You follow:", len(friends))
    for friend in friends[::-1]:
        if friend not in followers:
            api.destroy_friendship(user_id = friend)
        else:
            pass
    friends = api.get_friend_ids(screen_name=api.verify_credentials().screen_name)
    print("Now you're following:", len(friends))
unfollower()
Lastly, if anyone is having problems with api.get_friend_ids or get_follower_ids, remember that the tweepy 4.x.x update renamed some methods. The ones I remember are:
followers_ids is now get_follower_ids
friends_ids is now get_friend_ids
me() is now verify_credentials()
Well, I guess that's it; you can check the rest in the docs.
Happy pythoning!
How can I get a list of followers from a Twitter account with a lot of followers? My code is meant to get followers from a user and then follow them, but the most I can get and follow is 200; after that it still gets new users, but much more slowly. Is there a way to get thousands more?
import tweepy
from time import sleep
consumer_key = ''
consumer_secret = ''
access_key = ''
access_secret = ''
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
users = tweepy.Cursor(api.followers, screen_name='twitter', count=200).items()
count = 0
while True:
    try:
        user = next(users)
        api.create_friendship(user.screen_name)
        sleep(1)
        print(user.screen_name)
    except tweepy.TweepError as e:
        if e.api_code == 161:
            while(count < 901):
                print(count, end='\r')
                sleep(1)
                count +=1
        if e.api_code == 160:
            pass
    except StopIteration:
        pass
The Twitter documentation says the count parameter of this endpoint is limited to 200. I would guess that is what is causing your trouble.
The number of users to return per page, up to a maximum of 200. Defaults to 20.
Check your Twitter rate limits.
Servers normally limit the number of calls a client can make before they begin to restrict access. If they let every client hit them 1000+ times a minute, their server costs would be higher and it would be harder to spot a DoS attack, because an aggressive client could make a very high number of calls while still looking like a normal one.
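To get a feel for what such a limit means in practice, here is a rough back-of-the-envelope sketch. Apart from the 200-per-page cap quoted above, the numbers are assumptions to check against the current Twitter documentation: 15 requests per 15-minute window is assumed for the followers listing, and minutes_waiting is a hypothetical helper name:

```python
import math


def minutes_waiting(total_users, per_page=200, requests_per_window=15,
                    window_minutes=15):
    """Estimate the minutes spent waiting on rate limit windows.

    The first window is available immediately; every further window
    costs one full wait. The default limits are assumptions.
    """
    pages = math.ceil(total_users / per_page)
    windows = math.ceil(pages / requests_per_window)
    return max(windows - 1, 0) * window_minutes


print(minutes_waiting(3000))   # 0  (15 pages fit in the first window)
print(minutes_waiting(10000))  # 45 (50 pages need 4 windows)
```

This only covers listing the followers; following each of them is subject to its own, separate limits.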
I'm trying to create a listener for a very specific Twitter account (mine) so I can do some automation: if I tweet something with a "special" code at the end (it could be a character like "…"), it will trigger an action, like adding the preceding characters to a database.
So, I used Tweepy and I'm able to create the listener, filter keywords and so on, but it filters keywords from the whole Tweetverse. This is my code:
import tweepy
cfg = {
"consumer_key" : "...",
"consumer_secret" : "...",
"access_token" : "...",
"access_token_secret" : "..."
}
auth = tweepy.OAuthHandler(cfg['consumer_key'], cfg['consumer_secret'])
auth.set_access_token(cfg['access_token'], cfg['access_token_secret'])
api = tweepy.API(auth)
class MyStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        print(status.text)
        return True
    def on_error(self, status):
        print('error ',status)
        return False
myStreamListener = MyStreamListener()
myStream = tweepy.Stream(auth=auth, listener=myStreamListener)
myStream.filter(track=['…'])
It will filter all the messages containing a "…" no matter who wrote it, so I added to the last line the parameter follow='' like:
myStream.filter(follow='myTwitterName', track=['…'])
It always gives me a 406 error. If I use myStream.userstream('myTwitterName') it gives me not just the tweets I write, but my whole timeline.
So, what am I doing wrong?
EDIT
I just found my first error. I was using the user's screen name, not the Twitter ID. Now I got rid of the 406 error, but it still doesn't work. I placed the Twitter ID in the follow parameter, but it does absolutely nothing. I tried both with my account and with an account that is very "live", like CNN (ID = 759251): I see new tweets coming in my browser, but nothing on the listener.
If you're interested in knowing your own Twitter ID, I used this service: http://gettwitterid.com/
OK, solved. It was working from the very beginning; I had made two mistakes:
To solve the 406 error, all that has to be done is to use the Twitter ID instead of the Twitter name.
The listener was apparently doing nothing because I was sending "big" tweets, that is, tweets longer than 140 chars. In this case, you shouldn't use status.text, but status.extended_tweet['full_text'].
You must check for the existence of extended_tweet; if it is not in the status received, then you should use text.
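A minimal sketch of that check, written so it runs without a live stream: SimpleNamespace objects stand in for the status objects the listener receives, and full_text is a hypothetical helper name:

```python
from types import SimpleNamespace


def full_text(status):
    """Return the untruncated text when extended_tweet is present, else text."""
    extended = getattr(status, "extended_tweet", None)
    if extended:
        return extended["full_text"]
    return status.text


# Stand-ins for a short tweet and for a long, truncated one.
short_status = SimpleNamespace(text="hello…")
long_status = SimpleNamespace(
    text="truncated",
    extended_tweet={"full_text": "the whole long tweet…"},
)
print(full_text(short_status))  # hello…
print(full_text(long_status))   # the whole long tweet…
```

Inside on_status you would then call full_text(status) instead of reading status.text directly.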
I have a wrapper class for the twitter authentication where there is a line:
self.__api = tweepy.API(self.auth,
                        wait_on_rate_limit=False,
                        wait_on_rate_limit_notify=False)
When I instantiate the wrapper class to get the Twitter api object:
api_call = myWrapper(self.CONSUMER_KEY, self.CONSUMER_SECRET,
                     self.ACCESS_KEY, self.ACCESS_SECRET, True, True)
Based on my understanding of the tweepy documentation, setting wait_on_rate_limit and wait_on_rate_limit_notify to True should take care of the rate limit issue by default.
But I get the following error when I iterate over a list of users and try to get their timelines (~3400):
tweepy.error.TweepError: Twitter error response: status code = 429
I tried the following:
remaining = int(api_call.api.last_response.getheader('X-Rate-Limit-Remaining'))
but it says the last_response attribute is not available.
No, you have to create a handler for this exception.
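A handler of that kind could look roughly like the sketch below. It is written against a stand-in exception so it runs on its own: RateLimited, call_with_retry and fetch_timeline are hypothetical names, and in real code you would catch the tweepy error and check that it really is the 429 before waiting:

```python
import time


class RateLimited(Exception):
    """Hypothetical stand-in for the error raised on a 429 response."""


def call_with_retry(func, *args, max_retries=3, wait_seconds=15 * 60,
                    sleep=time.sleep, **kwargs):
    """Call func, waiting and retrying when it reports a rate limit."""
    for attempt in range(max_retries + 1):
        try:
            return func(*args, **kwargs)
        except RateLimited:
            if attempt == max_retries:
                raise  # give up after the last allowed retry
            sleep(wait_seconds)


calls = {"count": 0}


def fetch_timeline(user):
    """Fake request that is rejected twice before it succeeds."""
    calls["count"] += 1
    if calls["count"] <= 2:
        raise RateLimited("status code = 429")
    return [user + " tweet 1", user + " tweet 2"]


waits = []  # record the requested sleeps instead of really waiting
timeline = call_with_retry(fetch_timeline, "ben", sleep=waits.append)
print(timeline)  # ['ben tweet 1', 'ben tweet 2']
print(waits)     # [900, 900]
```

Capping the retries keeps a persistent failure from looping forever, which an unconditional while True would do.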
I have this code (not all of the code) and it basically gets the 20 most recent followers. The problem is that it will make a follow request to people who I am already following. This wouldn't be a problem but twitter limits how many requests you can make.
followers = api.followers()
following = api.friends()
tofollow = [x for x in followers if x not in following]
for u in tofollow:
    try:
        u.follow()
        number_followed+=1
        print number_followed,". ", u.screen_name
    except tweepy.TweepError as err:
        print "Error: when following ", u.screen_name
I think it has something to do with how I build tofollow.
Any thoughts?
I think that if you want to make twitter queries for the whole set and not for 20 most recent, you should use a cursor.
For example:
tweepy.Cursor(api.followers).items()
Also if you don't want to violate the twitter rate limiting you could use the following line when initializing the api object:
api = tweepy.API(auth, wait_on_rate_limit=True)
Hope it helps. Here is an example:
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth, wait_on_rate_limit=True)
friends = api.friends_ids(api.me().id)
print("You follow", len(friends), "users")
for follower in tweepy.Cursor(api.followers).items():
    if follower.id != api.me().id:
        if follower.id in friends:
            print("You already follow", follower.screen_name)
        else:
            follower.follow()
            print("Started following", follower.screen_name)