I am using Tweepy and Python 2.7.6 to return the tweets of a specified user.
My code looks like this:
import tweepy
ckey = 'myckey'
csecret = 'mycsecret'
atoken = 'myatoken'
asecret = 'myasecret'
auth = tweepy.OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
api = tweepy.API(auth)
stuff = api.user_timeline(screen_name = 'danieltosh', count = 100, include_rts = True)
print stuff
However, this yields a list of objects that print like <tweepy.models.Status object at 0x7ff2ca3c1050>.
Is it possible to print out useful information from these objects? Where can I find all of their attributes?
Unfortunately, the Status model is not really well documented in the Tweepy docs.
The user_timeline() method returns a list of Status object instances. You can explore the available properties and methods using dir(), or look at the actual implementation.
For example, from the source code you can see that there are author, user and other attributes:
for status in stuff:
    print status.author, status.user
Or, you can print out the _json attribute value which contains the actual response of an API call:
for status in stuff:
    print status._json
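As a minimal sketch of the dir() approach, the helper below lists an object's public attribute names. FakeStatus is a hypothetical stand-in so the snippet runs without Twitter credentials; with Tweepy you would pass one of the Status objects from stuff instead.

```python
def public_attributes(obj):
    """Return the names from dir(obj) that do not start with an underscore."""
    return [name for name in dir(obj) if not name.startswith("_")]


class FakeStatus(object):
    """Hypothetical stand-in for tweepy.models.Status (not the real class)."""

    def __init__(self):
        self.id = 1
        self.text = "hello"
        self._json = {"id": 1, "text": "hello"}


if __name__ == "__main__":
    # Prints the public attribute names of the stand-in object
    print(public_attributes(FakeStatus()))
```

On a real Status object the list is much longer, and it also includes methods such as destroy and retweet.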
import tweepy
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
# set parser=tweepy.parsers.JSONParser() if you want a nice printed json response.
userID = "userid"
user = api.get_user(userID)
tweets = api.user_timeline(screen_name=userID,
                           # 200 is the maximum allowed count
                           count=200,
                           include_rts=False,
                           # Necessary to keep full_text;
                           # otherwise only the first 140 characters are returned
                           tweet_mode='extended'
                           )
for info in tweets[:3]:
    print("ID: {}".format(info.id))
    print(info.created_at)
    print(info.full_text)
    print("\n")
Credit to https://fairyonice.github.io/extract-someones-tweet-using-tweepy.html
In Twitter API v2, getting the tweets of a specified user is fairly easy, provided that you don't exceed the limit of 3200 tweets. See the documentation for more info.
import tweepy
# create client object
client = tweepy.Client(
    bearer_token=TWITTER_BEARER_TOKEN,
    consumer_key=TWITTER_API_KEY,
    consumer_secret=TWITTER_API_KEY_SECRET,
    access_token=TWITTER_ACCESS_TOKEN,
    access_token_secret=TWITTER_TOKEN_SECRET,
)
# retrieve first n=`max_results` tweets
tweets = client.get_users_tweets(id=user_id, **kwargs)
# retrieve using pagination until no tweets left
tweets_list = []
while True:
    if not tweets.data:
        break
    tweets_list.extend(tweets.data)
    if not tweets.meta.get('next_token'):
        break
    tweets = client.get_users_tweets(
        id=user_id,
        pagination_token=tweets.meta['next_token'],
        **kwargs,
    )
The tweets_list is going to be a list of tweepy.tweet.Tweet objects.
Related
Using tweepy I am able to return all of my friends using a cursor. Is it possible to specify another user and get all of their friends?
user = api.get_user('myTwitter')
print "Retrieving friends for", user.screen_name
for friend in tweepy.Cursor(api.friends).items():
    print "\n", friend.screen_name
This prints a list of all my friends. However, if I change the first line to another Twitter user, it still returns my friends. How can I get the friends of any given user using Tweepy?
#first line is changed to
user = api.get_user('otherUsername') #still returns my friends
Additionally, user.screen_name, when printed, WILL return otherUsername.
The question "Get All Follower IDs in Twitter by Tweepy" does essentially what I am looking for; however, it returns only a count of IDs. If I remove the len() function, I can iterate through a list of user IDs, but is it possible to get screen names (#twitter, #stackoverflow, #etc.)?
You can use the ids variable from the answer you referenced to get the IDs of the followers of a given person, and extend it to get the screen names of all of the followers using Tweepy's api.lookup_users method:
import time
import tweepy
auth = tweepy.OAuthHandler(..., ...)
auth.set_access_token(..., ...)
api = tweepy.API(auth)
ids = []
for page in tweepy.Cursor(api.followers_ids, screen_name="McDonalds").pages():
    ids.extend(page)
    time.sleep(60)
screen_names = [user.screen_name for user in api.lookup_users(user_ids=ids)]
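One caveat: Twitter's users/lookup endpoint accepts at most 100 user IDs per request, so a long ids list has to be sent in batches. A minimal sketch of the batching step follows. chunked is a hypothetical helper, not part of Tweepy, and the commented usage assumes the api and ids names from the code above.

```python
def chunked(items, size=100):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


if __name__ == "__main__":
    fake_ids = list(range(250))  # stand-in for real follower IDs
    # Prints the size of each batch
    print([len(batch) for batch in chunked(fake_ids)])
    # With Tweepy this might look like (assumption, not run here):
    # screen_names = []
    # for batch in chunked(ids):
    #     screen_names.extend(u.screen_name for u in api.lookup_users(user_ids=batch))
```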
You can use this:
# import the module
import tweepy
# assign the values accordingly
consumer_key = ""
consumer_secret = ""
access_token = ""
access_token_secret = ""
# authorization of consumer key and consumer secret
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
# set access to user's access key and access secret
auth.set_access_token(access_token, access_token_secret)
# calling the api
api = tweepy.API(auth)
# the screen_name of the targeted user
screen_name = "TwitterIndia"
# printing the latest 20 friends of the user
for friend in api.friends(screen_name):
    print(friend.screen_name)
For more details, see https://www.geeksforgeeks.org/python-api-friends-in-tweepy/
I have a problem streaming a specific public Twitter list using Tweepy. I can stream a specific user, but the follow filter doesn't work in this case. I have quite a long list of accounts I would like to stream for further analysis, so I prepared a list with all of them on Twitter. Does anyone know how to handle that?
My code is as follows:
import tweepy
import sys
class MyStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        print(status.id_str)
        # if "retweeted_status" attribute exists, flag this tweet as a retweet.
        is_retweet = hasattr(status, "retweeted_status")
        # check if text has been truncated
        if hasattr(status, "extended_tweet"):
            text = status.extended_tweet["full_text"]
        else:
            text = status.text
        # check if this is a quote tweet.
        is_quote = hasattr(status, "quoted_status")
        quoted_text = ""
        if is_quote:
            # check if quoted tweet's text has been truncated before recording it
            if hasattr(status.quoted_status, "extended_tweet"):
                quoted_text = status.quoted_status.extended_tweet["full_text"]
            else:
                quoted_text = status.quoted_status.text
        # remove characters that might cause problems with csv encoding
        # (str.replace returns a new string, so the result must be reassigned)
        remove_characters = [",", "\n"]
        for c in remove_characters:
            text = text.replace(c, " ")
            quoted_text = quoted_text.replace(c, " ")
        with open("out.csv", "a", encoding='utf-8') as f:
            f.write("%s,%s,%s,%s,%s,%s\n" % (status.created_at, status.user.screen_name, is_retweet, is_quote, text, quoted_text))
    def on_error(self, status_code):
        print("Encountered streaming error (", status_code, ")")
        sys.exit()
consumer_key = "..."
consumer_secret = "..."
access_token = "..."
access_token_secret = "..."
auth = tweepy.OAuthHandler(consumer_key,consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
if (not api):
    print("Authentication failed!")
    sys.exit(-1)
myStreamListener = MyStreamListener()
myStream = tweepy.Stream(auth = api.auth, listener=myStreamListener,tweet_mode='extended')
with open("out.csv", "w", encoding='utf-8') as f:
    f.write("date,user,is_retweet,is_quote,text,quoted_text\n")
myStream.filter(follow=['52286608'])
You should be able to use the follow parameter with a comma-separated list of user IDs. From the Twitter API documentation:
follow
A comma-separated list of user IDs, indicating the users whose Tweets should be delivered on the stream. Following protected users is not supported. For each user specified, the stream will contain:
- Tweets created by the user.
- Tweets which are retweeted by the user.
- Replies to any Tweet created by the user.
- Retweets of any Tweet created by the user.
- Manual replies, created without pressing a reply button (e.g. “@twitterapi I agree”).
The stream will not contain:
- Tweets mentioning the user (e.g. “Hello @twitterapi!”).
- Manual Retweets created without pressing a Retweet button (e.g. “RT @twitterapi The API is great”).
- Tweets by protected users.
You can follow up to 5000 IDs this way.
Note that the API you are connecting to has been superseded by the v2 filtered stream API, but Tweepy does not currently support that.
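For the list use case, one approach is to fetch the list's members once and pass their IDs to follow. The sketch below covers only the conversion step. build_follow_ids is a hypothetical helper, and it assumes each member object exposes an id_str attribute, as Tweepy User objects do. How you fetch the members (for example with a Cursor over the list-members method of your Tweepy version) is left as an assumption.

```python
from types import SimpleNamespace

MAX_FOLLOW_IDS = 5000  # limit quoted from the Twitter documentation above


def build_follow_ids(members, limit=MAX_FOLLOW_IDS):
    """Return unique user ID strings, in input order, capped at `limit`."""
    seen = set()
    follow = []
    for member in members:
        if len(follow) >= limit:
            break
        if member.id_str not in seen:
            seen.add(member.id_str)
            follow.append(member.id_str)
    return follow


if __name__ == "__main__":
    # SimpleNamespace objects stand in for Tweepy User objects
    members = [
        SimpleNamespace(id_str="52286608"),
        SimpleNamespace(id_str="12"),
        SimpleNamespace(id_str="52286608"),
    ]
    print(build_follow_ids(members))
    # With the stream from the question (assumption, not run here):
    # myStream.filter(follow=build_follow_ids(members))
```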
I am currently doing some research using sentiment analysis on Twitter data regarding a certain topic (which isn't necessarily important to this question) using Python, in which I am a beginner. I understand that the Twitter streaming API limits users to the previous 7 days unless you apply for full enterprise search, which opens up the whole archive. I was recently given access to the full archive for this research project by Twitter, but I am unable to specify a start and end date for the tweets I would like to stream into a CSV file. This is my code:
import pandas as pd
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
ckey = 'xxxxxxxxxxxxxxxxxxxxxxx'
csecret = 'xxxxxxxxxxxxxxxxxxxxxxx'
atoken = 'xxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxx'
asecret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxx'
# =============================================================================
# def sentimentAnalysis(text):
# output = '0'
# return output
# =============================================================================
class listener(StreamListener):
    def on_data(self, data):
        tweet = data.split(',"text":"')[1].split('","source')[0]
        saveMe = tweet+'::'+'\n'
        output = open('output.csv','a')
        output.write(saveMe)
        output.close()
        return True
    def on_error(self, status):
        print(status)
auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
twitterStream = Stream(auth, listener())
twitterStream.filter(track=["#weather"], languages = ["en"])
Now this code streams Twitter data from the past 7 days perfectly. I tried changing the bottom line to
twitterStream.filter(track=["#weather"], languages = ["en"], since = ["2016-06-01"])
but this returns the error: filter() got an unexpected keyword argument 'since'.
What would be the correct way to filter by a given date range?
Tweepy's filter() does not provide a "since" argument, as you can check yourself here.
To achieve the desired output, you will have to use api.user_timeline, iterating through pages until the desired date is reached, e.g.:
import datetime
import time
import tweepy
# The consumer keys can be found on your application's Details
# page located at https://dev.twitter.com/apps (under "OAuth settings")
consumer_key=""
consumer_secret=""
# The access tokens can be found on your application's Details
# page located at https://dev.twitter.com/apps (located
# under "Your access token")
access_token=""
access_token_secret=""
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
page = 1
stop_loop = False
# username, YEAR, MONTH and DAY are placeholders to fill in
start_date = datetime.datetime(YEAR, MONTH, DAY)
while not stop_loop:
    tweets = api.user_timeline(username, page=page)
    if not tweets:
        break
    for tweet in tweets:
        # The timeline is returned newest first, so stop at the first tweet
        # older than the start date. created_at is a datetime; if your Tweepy
        # version makes it timezone-aware, give start_date a tzinfo as well.
        if tweet.created_at < start_date:
            stop_loop = True
            break
        # Do the tweet process here
    page += 1
    time.sleep(500)
Note that you will need to update the code to fit your needs, this is just a general solution.
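If you need both a start and an end date, the date check can be isolated in a small function. This is a sketch, not part of Tweepy: in_window and to_utc are hypothetical names, and naive datetimes are assumed to be in UTC, since whether created_at is naive or timezone-aware depends on the Tweepy version.

```python
from datetime import datetime, timezone


def to_utc(dt):
    """Treat naive datetimes as UTC; convert aware ones to UTC."""
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def in_window(created_at, start, end):
    """Return True if start <= created_at < end, compared in UTC."""
    return to_utc(start) <= to_utc(created_at) < to_utc(end)


if __name__ == "__main__":
    start = datetime(2016, 6, 1)
    end = datetime(2016, 7, 1)
    # A tweet from mid-June falls inside the window
    print(in_window(datetime(2016, 6, 15, 12, 0), start, end))
    # The end bound is exclusive, so midnight on July 1 falls outside
    print(in_window(datetime(2016, 7, 1, tzinfo=timezone.utc), start, end))
```

Inside the paging loop you would process only tweets for which in_window(tweet.created_at, start, end) is true, and stop paging once a tweet is older than start.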
I'm working on a project where I stream tweets from the Twitter API, then apply sentiment analysis and visualize the results on an interactive colorful map.
I've tried the 'tweepy' library in Python, but the problem is that it only retrieves a few tweets (10 or fewer).
Also, I'm going to specify the language and the location, which means I might get even fewer tweets! I need real-time streaming of hundreds or thousands of tweets.
This is the code I tried (just in case):
import os
import tweepy
from textblob import TextBlob
port = os.getenv('PORT', '8080')
host = os.getenv('IP', '0.0.0.0')
# Step 1 - Authenticate
consumer_key= 'xx'
consumer_secret= 'xx'
access_token='xx'
access_token_secret='xx'
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
#Step 3 - Retrieve Tweets
public_tweets = api.search('school')
for tweet in public_tweets:
    print(tweet.text)
    analysis = TextBlob(tweet.text)
    print(analysis)
Are there any better alternatives? I found "PubNub", which is a JavaScript API, but for now I want something in Python since it is easier for me.
Thank you
If you want a large number of tweets, I would recommend using Twitter's streaming API via Tweepy:
#Create a stream listener:
import tweepy
tweets = []
class MyStreamListener(tweepy.StreamListener):
    #The next function defines what to do when a tweet is parsed by the streaming API
    def on_status(self, status):
        tweets.append(status.text)
#Create a stream:
myStreamListener = MyStreamListener()
myStream = tweepy.Stream(auth = api.auth, listener=myStreamListener)
#Filter streamed tweets by the keyword 'school':
myStream.filter(track=['school'], languages=['en'])
Note that the track filter used here is the standard free filtering API. There is another API called PowerTrack, built for enterprises that have more requirements and rules to filter on.
Ref: https://developer.twitter.com/en/docs/tweets/filter-realtime/overview/statuses-filter
Otherwise, if you want to stick to the search method, you can query a maximum of 100 tweets per call by adding count, and use since_id with the maximum id parsed to get new tweets. You can add those attributes to the search method as follows:
public_tweets = []
max_id = 0
for i in range(10): #This loop will run 10 times you can play around with that
    public_tweets.extend(api.search(q='school', count=100, since_id=max_id))
    max_id = max([tweet.id for tweet in public_tweets])
#To make sure you only got unique tweets, you can do:
unique_tweets = list({tweet._json['id']:tweet._json for tweet in public_tweets}.values())
With this approach you will have to be careful with the API's limits; you can handle that by enabling the wait_on_rate_limit attribute when you initialize the API: api = tweepy.API(auth, wait_on_rate_limit=True)
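The deduplication step can be tried without calling the API. The sketch below uses plain dicts as stand-ins for the _json payloads. dedupe_by_id is a hypothetical helper that keeps the first occurrence of each id, whereas the dict comprehension above keeps the last one.

```python
def dedupe_by_id(tweets):
    """Return tweets with duplicate 'id' values removed, keeping first occurrences."""
    unique = {}
    for tweet in tweets:
        unique.setdefault(tweet["id"], tweet)
    return list(unique.values())


if __name__ == "__main__":
    # Made-up payloads; real ones would come from tweet._json
    sample = [
        {"id": 3, "text": "a"},
        {"id": 5, "text": "b"},
        {"id": 3, "text": "a again"},
    ]
    unique = dedupe_by_id(sample)
    print([t["id"] for t in unique])
    # The largest id seen so far is what you would pass as since_id next
    print(max(t["id"] for t in unique))
```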
I'm trying to use the tweepy library in one of my python projects. When I try the following code that creates a tweepy cursor to fetch a user's timeline status messages, the count parameter is always ignored.
def search(self, username, keyword, consumer_key, consumer_secret, access_token, access_token_secret):
    #start twitter auth
    try:
        auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
        auth.set_access_token(access_token, access_token_secret)
        api = tweepy.API(auth)
        user = api.get_user(username)
    except Exception as e:
        print(str(e))
        self.error = str(e)
        return
    self.followercount = user.followers_count
    self.screenname = user.screen_name
    results = []
    for status in tweepy.Cursor(api.user_timeline, id=username, count=2).items():
        try:
            tweet = status._json
In this instance, the count is set to 2 in the Cursor object, yet it receives all of them. What am I doing wrong?
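tweepy.Cursor() does not treat count as a limit on the total number of results. In fact, count is not mentioned anywhere in tweepy/cursor.py, the module where tweepy.Cursor is defined; keyword arguments are simply passed through to the API method, where count only sets how many statuses are requested per page. Instead, it looks like you might want to use: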
for status in tweepy.Cursor(api.user_timeline, id=username).items(2):
This passes the limit to items() instead of as the count keyword argument. See this section in the Tweepy Cursor tutorial.
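To see why a per-page count does not cap the total, here is a toy model of cursor-style pagination. It is not Tweepy's implementation: fetch_page, iter_items and the TIMELINE data are made up for illustration.

```python
from itertools import islice

TIMELINE = list(range(1, 11))  # ten fake status ids


def fetch_page(page, count):
    """Return one page of at most `count` items (pages are 1-indexed)."""
    start = (page - 1) * count
    return TIMELINE[start:start + count]


def iter_items(count):
    """Yield every item, requesting `count` items per page until a page is empty."""
    page = 1
    while True:
        batch = fetch_page(page, count)
        if not batch:
            return
        for item in batch:
            yield item
        page += 1


if __name__ == "__main__":
    # A page size of 2 still walks the whole timeline
    print(len(list(iter_items(count=2))))
    # An explicit limit on the iterator stops early, like items(2)
    print(list(islice(iter_items(count=2), 2)))
```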