Followers network with tweepy - python

I'm trying to build a network of my Twitter followers with Python and tweepy. My problem is that I'm not obtaining all the followers for each user, only a few. This is the code:
import tweepy

# Copy the API key, the API secret, the access token and the access token
# secret from the relevant page on your Twitter app
api_key = 'xxxx'
api_secret = 'xxxx'
access_token = 'x-x'
access_token_secret = 'xxxx'

# You don't need to make any changes below here
# This bit authorises you to ask for information from Twitter
auth = tweepy.OAuthHandler(api_key, api_secret)
auth.set_access_token(access_token, access_token_secret)
# The api object gives you access to all of the HTTP calls that Twitter accepts
api = tweepy.API(auth)

# User we want to use as the initial node
user = 'xxxx'

import csv
import time

# This creates a CSV file and defines that each new entry will be on a new line
csvfile = open(user + 'network.csv', 'w')
spamwriter = csv.writer(csvfile, delimiter=' ', quotechar='|', quoting=csv.QUOTE_MINIMAL)

# This is the function that takes a node (user), looks for all its followers,
# prints them into a CSV file... and looks for the followers of each follower...
def fib(n, user, spamwriter):
    # n defines the level of recursion
    if n > 0:
        # There is a limit to the traffic you can have with the API, so you need
        # to wait a few seconds per call or after a few calls it will restrict
        # your traffic for 15 minutes. This parameter can be tweaked.
        time.sleep(40)
        try:
            users = api.followers(user)
            for follower in users:
                print(follower.screen_name)
                spamwriter.writerow([user + ';' + follower.screen_name])
                fib(n - 1, follower.screen_name, spamwriter)
        except tweepy.TweepError:
            print("Failed to run the command on that user, skipping...")

n = 2
fib(n, user, spamwriter)

API.followers([id/screen_name]) only returns followers 100 at a time.
Try:
API.followers_ids(id/screen_name/user_id)
It will return a list of IDs for all the people following the specified user. Just put your ID in the parameters.
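A hedged sketch of how that could look, assuming tweepy 3.x, where followers_ids is cursorable and yields up to 5000 IDs per page; the screen name and CSV layout are placeholders:

```python
import csv

def edge_rows(user, follower_ids):
    """Build one [user, follower_id] row per edge for the CSV."""
    return [[user, follower_id] for follower_id in follower_ids]

def write_follower_edges(api, screen_name, path):
    """Page through followers/ids (up to 5000 IDs per page) and write each edge."""
    import tweepy  # imported here; assumes tweepy 3.x
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f, delimiter=';')
        for page in tweepy.Cursor(api.followers_ids, screen_name=screen_name).pages():
            writer.writerows(edge_rows(screen_name, page))
```

Compared with api.followers, this only yields numeric IDs; if the network needs screen names, the IDs can still be resolved later in batches with api.lookup_users.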

Related

Tweepy API Failing For Every ID

I'm running the code below, which was given to me by an instructor, to grab the status based on the tweet_id in another dataframe I've already imported. When running the code, everything comes back "Fail". I don't receive any errors, so I'm not sure what I'm missing. When I requested my Twitter developer access, I didn't have to answer a ton of questions like I've seen other people say they had to, so I'm curious whether it's just not enough access?
import tweepy
from tweepy import OAuthHandler
import json
from timeit import default_timer as timer

# Query Twitter API for each tweet in the Twitter archive and save JSON in a text file
# These are hidden to comply with Twitter's API terms and conditions
consumer_key = 'HIDDEN'
consumer_secret = 'HIDDEN'
access_token = 'HIDDEN'
access_secret = 'HIDDEN'

auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)

# NOTE TO STUDENT WITH MOBILE VERIFICATION ISSUES:
# df_1 is a DataFrame with the twitter_archive_enhanced.csv file. You may have to
# change line 17 to match the name of your DataFrame with twitter_archive_enhanced.csv
# NOTE TO REVIEWER: this student had mobile verification issues so the following
# Twitter API code was sent to this student from a Udacity instructor

# Tweet IDs for which to gather additional data via Twitter's API
tweet_ids = twitter_archive.tweet_id.values
len(tweet_ids)

# Query Twitter's API for JSON data for each tweet ID in the Twitter archive
count = 0
fails_dict = {}
start = timer()
# Save each tweet's returned JSON as a new line in a .txt file
with open('tweet_json.txt', 'w') as outfile:
    # This loop will likely take 20-30 minutes to run because of Twitter's rate limit
    for tweet_id in tweet_ids:
        count += 1
        print(str(count) + ": " + str(tweet_id))
        try:
            tweet = api.get_status(tweet_id, tweet_mode='extended')
            print("Success")
            json.dump(tweet._json, outfile)
            outfile.write('\n')
        except tweepy.TweepError as e:
            print("Fail")
            fails_dict[tweet_id] = e
end = timer()
print(end - start)
print(fails_dict)

Timespan for Elevated Access to Historical Twitter Data

I have a developer account as an academic, and my profile page on Twitter shows Elevated at the top, but when I use Tweepy to access tweets, it only retrieves tweets from the last 7 days. How can I extend my access back to 2006?
This is my code:
import tweepy
from tweepy import OAuthHandler
import pandas as pd

access_token = '#'
access_token_secret = '#'
API_key = '#'
API_key_secret = '#'

auth = tweepy.OAuthHandler(API_key, API_key_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)

tweets = []
count = 1
for tweet in tweepy.Cursor(api.search_tweets, q="#SEARCHQUERY", count=5000).items(50000):
    print(count)
    count += 1
    try:
        data = [tweet.created_at, tweet.id, tweet.text,
                tweet.user._json['screen_name'], tweet.user._json['name'],
                tweet.user._json['created_at'], tweet.entities['urls']]
        data = tuple(data)
        tweets.append(data)
    except tweepy.TweepError as e:
        print(e.reason)
        continue
    except StopIteration:
        break

df = pd.DataFrame(tweets, columns=['created_at', 'tweet_id', 'tweet_text', 'screen_name',
                                   'name', 'account_creation_date', 'urls'])
df.to_csv(path_or_buf='local address/file.csv', index=False)
The Search All endpoint is available in Twitter API v2, which is represented by the tweepy.Client object (you are using tweepy.API, the v1.1 client).
The most important thing is that you require Academic Research access from Twitter. Elevated access grants additional request volume and access to the v1.1 APIs on top of v2 (Essential) access, but you will need an account and Project with Academic access to call this endpoint. There's a process to apply for that in the Twitter Developer Portal.

Python Twitter Streaming Timeline

I am trying to obtain information from the Twitter timeline of a specific user and print the output in JSON format, but I am getting an AttributeError: 'str' object has no attribute '_json'. I am new to Python, so I'm having trouble resolving this; any help would be greatly appreciated.
Below shows the code that I have at the moment:
from __future__ import absolute_import, print_function
import tweepy
import twitter

def oauth_login():
    # credentials for OAuth
    CONSUMER_KEY = 'woIIbsmhE0LJhGjn7GyeSkeDiU'
    CONSUMER_SECRET = 'H2xSc6E3sGqiHhbNJjZCig5KFYj0UaLy22M6WjhM5gwth7HsWmi'
    OAUTH_TOKEN = '306848945-Kmh3xZDbfhMc7wMHgnBmuRLtmMzs6RN7d62o3x6i8'
    OAUTH_TOKEN_SECRET = 'qpaqkvXQtfrqPkJKnBf09b48TkuTufLwTV02vyTW1kFGunu'
    # Creating the authentication
    auth = twitter.oauth.OAuth(OAUTH_TOKEN, OAUTH_TOKEN_SECRET,
                               CONSUMER_KEY, CONSUMER_SECRET)
    # Twitter instance
    twitter_api = twitter.Twitter(auth=auth)
    return twitter_api

# Log in
twitter_api = oauth_login()
# Get statuses
statuses = twitter_api.statuses.user_timeline(screen_name='#ladygaga')
# Print text
for status in statuses:
    print(status['text']._json)
You seem to be mixing up tweepy with the twitter package, and are possibly getting their methods confused as a result. The auth process for tweepy, based on your code, should go as follows:
import tweepy

def oauth_login():
    # credentials for OAuth
    consumer_key = 'YOUR_KEY'
    consumer_secret = 'YOUR_KEY'
    access_token = 'YOUR_KEY'
    access_token_secret = 'YOUR_KEY'
    # Creating the authentication
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    # Twitter instance
    auth.set_access_token(access_token, access_token_secret)
    return tweepy.API(auth)

# Log in
twitter_api = oauth_login()
# Get statuses
statuses = twitter_api.user_timeline(screen_name='#ladygaga')
# Print text
for status in statuses:
    print(status._json['text'])
If, as previously mentioned, you want to create a list of tweets, you could do the following rather than everything after # Print text
# Create a list
statuses_list = [status._json['text'] for status in statuses]
And, as mentioned in the comments, you shouldn't ever give out your keys publicly. Twitter lets you reset them, which I'd recommend doing as soon as possible; editing your post isn't enough, as people can still read your edit history.

Downloading all Tweets about certain subject in Python

I'm doing Twitter sentiment research at the moment. For this, I'm using the Twitter API to download all tweets on certain keywords. But my current code is taking a lot of time to create a large data file, so I was wondering if there's a faster method.
This is what I'm using right now:
__author__ = 'gerbuiker'

import time
# Import the necessary methods from the tweepy library
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream

# Variables that contain the user credentials to access the Twitter API
access_token = "XXXXXXXXXXXXX"
access_token_secret = "XXXXXXXX"
consumer_key = "XXXXX"
consumer_secret = "XXXXXXXXXXXXXX"

# This is a basic listener that just prints received tweets to stdout.
class StdOutListener(StreamListener):
    def on_data(self, data):
        try:
            #print data
            tweet = data.split(',"text":"')[1].split('","source')[0]
            print tweet
            saveThis = str(time.time()) + '::' + tweet  # saves time + actual tweet
            saveFile = open('twitiamsterdam.txt', 'a')
            saveFile.write(saveThis)
            saveFile.write('\n')
            saveFile.close()
            return True
        except BaseException, e:
            print 'failed ondata,', str(e)
            time.sleep(5)

    def on_error(self, status):
        print status

if __name__ == '__main__':
    # This handles Twitter authentication and the connection to the Twitter Streaming API
    l = StdOutListener()
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    stream = Stream(auth, l)
    # This line filters Twitter Streams to capture data by the keyword: 'Amsterdam'
    stream.filter(track=['KEYWORD which i want to check'])
This gets me about 1500 tweets in one hour for a pretty popular keyword (Amsterdam). Does anyone know a faster method in Python?
To be clear: I want to download all tweets on a certain subject for, say, the last month or year. The newest tweets don't have to keep coming in; the most recent ones for a period would be sufficient. Thanks!
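For collecting past tweets (rather than waiting on the stream), the REST search endpoint paged with tweepy's Cursor is the usual route. A hedged sketch, assuming tweepy 3.x where the method is api.search; note that standard search only reaches back roughly 7 days, so a full month or year would need premium or academic access:

```python
def format_line(timestamp, text):
    """Match the question's 'time::tweet' line format."""
    return str(timestamp) + '::' + text

def collect_recent_tweets(api, query, limit=5000):
    """Page through REST search results instead of waiting on the stream."""
    import tweepy  # imported here; assumes tweepy 3.x
    lines = []
    for tweet in tweepy.Cursor(api.search, q=query, count=100).items(limit):
        lines.append(format_line(tweet.created_at, tweet.text))
    return lines
```

Because search returns existing tweets in bulk, this typically collects far more than 1500 tweets per hour for a popular keyword, subject to rate limits.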

Unexpected results while fetching the total number of followers of a user on twitter

When I run it, the terminal keeps printing "23851" in new rows, which is the number of followers of the first Twitter name in my file f. Two problems: 1) I believe this means the pointer is not moving through file f, but I'm not sure how this should be done properly in Python; 2) when I check my file f1, there's nothing there, i.e. the program is not writing to f1 as expected.
import tweepy
from tweepy import Stream
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler

CONSUMER_KEY = 'xxx'
CONSUMER_SECRET = 'xxx'
ACCESS_KEY = 'xxx'
ACCESS_SECRET = 'xxx'

auth = OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
api = tweepy.API(auth)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)

# Create class first
class TweetListener(StreamListener):
    # A listener handles tweets as they are received from the stream.
    # This is a basic listener that just prints received tweets to standard output
    def on_data(self, data):  # indented inside the class
        print(data)
        return True

    def on_error(self, status):
        print(status)

# open both files outside the loop
with open('Twitternames.txt') as f, open('followers_number.txt', 'a') as f1:
    for x in f:
        # search
        api = tweepy.API(auth)
        twitterStream = Stream(auth, TweetListener())
        test = api.lookup_users(screen_names=['x'])
        for user in test:
            print(user.followers_count)
            # print it out and also write it into a file
            s = user.followers_count
            f1.write(str(s) + "\n")  # add a newline with +

f.close()
Actually, there are some things to consider here, and some unneeded lines as well, so I will go line by line and explain the relevant parts. Since we don't need any streaming data just to count followers, we only need to import tweepy and OAuthHandler:
import tweepy
from tweepy import OAuthHandler
Now we need to set the four keys required for login, same as before:
CONSUMER_KEY = 'xxxxxxxx' #Replace with the original values.
CONSUMER_SECRET = 'xxx' #Replace with the original values.
ACCESS_KEY = 'xxx' #Replace with the original values.
ACCESS_SECRET = 'xxx' #Replace with the original values.
auth = OAuthHandler(CONSUMER_KEY,CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)
You don't need a StreamListener just to log the followers_count of various users, so I am skipping that part; you can add that code snippet back afterwards if needed.
usernames_file = open('Twitternames.txt').readlines()
I am assuming the contents of Twitternames.txt to be in the following format (every username without the @ symbol, one per line):
user_name_1
user_name_2
user_name_3
...
Now usernames_file will be a list of strings: usernames_file = ['user_name_1\n', 'user_name_2\n', 'user_name_3\n']. We have extracted the usernames from the text file, but we need to get rid of the \n character at the end of each name, so we can use the .strip() method.
usernames = []
for i in usernames_file:
    usernames.append(i.strip())
# usernames is now ['user_name_1', 'user_name_2', 'user_name_3']
Now we are ready to use the lookup_users method, since it takes a list of usernames as input.
So it may look something like this:
test = api.lookup_users(screen_names=usernames)
for user in test:
    print(user.followers_count)
If you want to log the results to a .txt file, then you can use:
log_file = open("log.txt", 'a')
test = api.lookup_users(screen_names=usernames)
for user in test:
    print(user.followers_count)
    log_file.write(user.name + " has " + str(user.followers_count) + " followers.\n")
log_file.close()
So the short and final code would look something like this:
import tweepy
from tweepy import OAuthHandler

CONSUMER_KEY = 'xxx'
CONSUMER_SECRET = 'xxx'
ACCESS_KEY = 'xxx'
ACCESS_SECRET = 'xxx'

auth = OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)

usernames_file = open('Twitternames.txt').readlines()
usernames = []
for i in usernames_file:
    usernames.append(i.strip())

log_file = open("log.txt", 'a')
test = api.lookup_users(screen_names=usernames)
for user in test:
    print(user.followers_count)
    log_file.write(user.name + " has " + str(user.followers_count) + " followers.\n")
log_file.close()
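One caveat worth noting about the code above: the v1.1 users/lookup endpoint accepts at most 100 screen names per request, so a longer Twitternames.txt needs to be sent in batches. A minimal, hedged sketch (the batching helper is my addition, not part of the answer above):

```python
def chunks(items, size=100):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def follower_counts(api, usernames):
    """Look up follower counts 100 names at a time via users/lookup."""
    counts = {}
    for batch in chunks(usernames):
        for user in api.lookup_users(screen_names=batch):
            counts[user.screen_name] = user.followers_count
    return counts
```

With fewer than 100 names this behaves exactly like the single lookup_users call in the answer; with more, it simply issues one call per batch.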
