I am trying to stream a specific user's live tweets to my console using the tweepy library (Twitter API v2). I have found documentation for filtering by keywords or hashtags, but not by the user who tweeted. Does anyone know how I can add a rule to filter all tweets by, say, @elonmusk?
client = tweepy.Client(keys.BEARER_TOKEN, keys.API_KEY, keys.API_KEY_SECRET, keys.ACCESS_TOKEN, keys.ACCESS_TOKEN_SECRET)
auth = tweepy.OAuth1UserHandler(keys.API_KEY,keys.API_KEY_SECRET, keys.ACCESS_TOKEN, keys.ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)
class MyStream(tweepy.StreamingClient):
    def on_tweet(self, tweet):
        print(tweet.text)
        time.sleep(0.2)
stream = MyStream(bearer_token=keys.BEARER_TOKEN)
stream.add_rules({'from: elonmusk'})
print(stream.get_rules())
stream.filter()
class TweetListener(tweepy.StreamingClient):
    def on_tweet(self, tweet):
        print(tweet.text)
stream = TweetListener(bearer_token=tw.bearer_token)
stream.add_rules(tweepy.StreamRule("from:username01 OR from:username02"))
stream.filter()
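The rule string in the snippet above can also be built from a list of usernames. This is only a minimal sketch: build_from_rule is a hypothetical helper, not part of tweepy, and it assumes the from: operator is written without a space after the colon, as in from:username01.

```python
def build_from_rule(usernames):
    """Join usernames into a single 'from:a OR from:b' rule value."""
    # Drop surrounding whitespace and a leading '@' so both spellings work.
    cleaned = [name.strip().lstrip("@") for name in usernames if name.strip()]
    if not cleaned:
        raise ValueError("at least one username is required")
    return " OR ".join(f"from:{name}" for name in cleaned)


rule_value = build_from_rule(["username01", "@username02"])
print(rule_value)  # from:username01 OR from:username02
```

The resulting string could then be wrapped in tweepy.StreamRule(rule_value) and passed to stream.add_rules, as in the answer's code.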
ANSWER: I used the twarc CLI to solve this issue.
Steps taken to add a rule that matches tweets from a specific user:
pip3 install twarc
twarc configure
twarc2 stream-rules add "from:____" <--- put username there
Then run the program with the code I have above, with these two lines deleted:
stream.add_rules({'from: elonmusk'})
print(stream.get_rules())
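Filtered-stream rules are stored on Twitter's side and persist between runs, which is why a rule added once from the command line keeps applying. The same clean-up could probably be done from Python. The sketch below is an assumption-laden illustration: replace_rules is a hypothetical helper, and FakeStream is an in-memory stand-in so the example runs without credentials. It relies only on the get_rules, delete_rules and add_rules methods of the stream object.

```python
from collections import namedtuple

Rule = namedtuple("Rule", ["value", "id"])
Response = namedtuple("Response", ["data"])


def replace_rules(stream, new_rules):
    """Delete every rule currently stored for the stream, then add new_rules."""
    existing = stream.get_rules()
    if existing.data:  # data is None when no rules are stored
        stream.delete_rules([rule.id for rule in existing.data])
    return stream.add_rules(new_rules)


class FakeStream:
    """Stand-in for a streaming client; it keeps rules in a local list."""

    def __init__(self, values):
        self.rules = [Rule(value, str(i)) for i, value in enumerate(values)]

    def get_rules(self):
        return Response(data=self.rules or None)

    def delete_rules(self, ids):
        self.rules = [rule for rule in self.rules if rule.id not in ids]

    def add_rules(self, values):
        self.rules.extend(Rule(value, f"new-{i}") for i, value in enumerate(values))
        return Response(data=self.rules)


stream = FakeStream(["school", "cats"])
replace_rules(stream, ["from:elonmusk"])
print([rule.value for rule in stream.rules])  # ['from:elonmusk']
```

With a real tweepy.StreamingClient, new_rules would be tweepy.StreamRule objects rather than plain strings.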
I can't add "list:XXXXXXXXXX" to my rules (I can add other rules, and they work).
What am I missing?
import tweepy
class TweetPrinter(tweepy.StreamingClient):
    def on_tweet(self, tweet):
        print(tweet)
printer = TweetPrinter(bearer_token)
printer.add_rules(tweepy.StreamRule("list:XXXXXXXXXX"))
printer.filter()
After executing this code, instead of a StreamRule with an id being created, I got this when checking the list of rules with printer.get_rules():
Response(data=None, includes={}, errors=[], meta={'sent': '2022-12-27T00:23:44.073Z', 'result_count': 0})
I'm using Tweepy 4.12 and Twitter API v2.
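One way to narrow this down is to look at what add_rules itself returns instead of only calling get_rules afterwards, since a rejected rule may be reported in the response's errors field. The sketch below is hedged: describe_rule_response is a hypothetical helper, the "title" and "detail" keys are assumptions about the shape of the error dictionaries, and the two sample responses are made-up stand-ins, not real API output.

```python
from collections import namedtuple

# Stand-ins shaped like the Response printed in the question.
Response = namedtuple("Response", ["data", "errors"])
Rule = namedtuple("Rule", ["value", "id"])


def describe_rule_response(response):
    """Summarize which rules were created and which were rejected."""
    lines = []
    for rule in response.data or []:
        lines.append(f"created: {rule.value} (id {rule.id})")
    for error in response.errors or []:
        # Key names are assumed; .get() keeps this safe if they differ.
        title = error.get("title", "error")
        detail = error.get("detail") or error.get("details") or ""
        lines.append(f"rejected: {title} {detail}".strip())
    return lines or ["no rules created and no errors reported"]


ok = Response(data=[Rule("from:username01", "1")], errors=[])
bad = Response(data=None, errors=[{"title": "Invalid rule", "detail": "placeholder detail"}])
for line in describe_rule_response(ok) + describe_rule_response(bad):
    print(line)
```

In the question's code this would be called as describe_rule_response(printer.add_rules(tweepy.StreamRule("list:XXXXXXXXXX"))).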
Adapted below from this code on GitHub, which takes a tweet and makes a new post of it in a subreddit.
I'm trying to adapt this to post the text from a recent tweet as a new comment rather than as a new post. I then want to limit the commented tweet to only the most recent one, and run on a timer to limit the total comments so that not every single tweet gets posted.
This does work if I use the code from GitHub to make a post. If I reference it to make a comment, however, it will just say "tweet", as in the line REPLY_TEMPLATE = 'tweet', instead of the tweet that was pulled in.
I'm new at this and Python is a weak point for me, so I included the full code. I'm sure I'm missing something simple and obvious and just not seeing it.
from time import sleep
from urllib.parse import quote_plus  # needed for quote_plus() below
import praw
import tweepy
import json

############################################################################
# Reddit & Twitter API connection & credentials
############################################################################
credentials = 'secrets.json'
with open(credentials) as f:
    creds = json.load(f)

def main():
    # Reddit Api Connection
    reddit = praw.Reddit(client_id=creds['client_id'],
                         client_secret=creds['client_secret'],
                         user_agent=creds['user_agent'],
                         redirect_uri=creds['redirect_uri'],
                         refresh_token=creds['refresh_token'])
    subreddit = reddit.subreddit("XXXXXXX")  # Which subreddit?
    for submission in subreddit.stream.submissions():
        process_submission(submission)
    # Twitter Api Connection
    auth = tweepy.OAuthHandler(creds['twitter_consumer_key'], creds['twitter_consumer_secret'])
    auth.set_access_token(creds['twitter_access_token'], creds['twitter_access_token_secret'])
    api = tweepy.API(auth)

############################################################################
# Get the latest tweet from twitter to post as a comment to reddit
############################################################################
def get_last_tweet(self):
    tweet = self.user_timeline(user_id="XXXXXXXX", count=1, tweet_mode="extended")[0]
    print('Tweet')
    return tweet

############################################################################
# Take what we got from twitter and post to reddit
############################################################################
REPLY_TEMPLATE = 'tweet'
POST = ["xxxxxx"]  # Keyword for which thread to search for in the subreddit since we are limiting it to only work in a thread with a specific title

def process_submission(submission):
    # Ignore titles for any thread with the same title that is not pinned
    if not submission.stickied:
        return
    normalized_title = submission.title.lower()
    for active_daily in POST:
        if active_daily in normalized_title:
            url_title = quote_plus(submission.title)
            reply_text = REPLY_TEMPLATE.format(url_title)
            print(f"Replying to: {submission.title}")
            submission.reply(body=reply_text)
            # A reply has been made so do not attempt to match other phrases.
            break

if __name__ == "__main__":
    main()
    sleep(11 * 60)
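One likely reason the comment reads "tweet" is that REPLY_TEMPLATE contains no {} placeholder, so str.format() returns the template unchanged, and get_last_tweet() is never called anywhere in the script. A minimal sketch of the difference, assuming the tweet text has already been fetched into a plain string (build_reply, the second template and the tweet_text value are hypothetical):

```python
# Hypothetical helper: build_reply is not part of the original script.
TEMPLATE_WITHOUT_SLOT = 'tweet'             # what the script uses now
TEMPLATE_WITH_SLOT = 'Latest tweet:\n\n{}'  # assumed wording, with a {} slot


def build_reply(template, tweet_text):
    """Fill the template; a template without '{}' is returned unchanged."""
    return template.format(tweet_text)


tweet_text = "Example tweet text"
print(build_reply(TEMPLATE_WITHOUT_SLOT, tweet_text))  # prints: tweet
print(build_reply(TEMPLATE_WITH_SLOT, tweet_text))
```

In the script itself, the text would come from the object returned by get_last_tweet(api); with tweet_mode="extended" the text is in its full_text attribute.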
I have a problem streaming a specific public Twitter list using Tweepy. I can stream a specific user, but the follow filter doesn't work for a list. I have quite a long list of accounts I would like to stream for further analysis, so I put them all in a list on Twitter. Does anyone know how to handle that?
My code is as follows:
import tweepy
import sys
class MyStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        print(status.id_str)
        # if "retweeted_status" attribute exists, flag this tweet as a retweet.
        is_retweet = hasattr(status, "retweeted_status")
        # check if text has been truncated
        if hasattr(status, "extended_tweet"):
            text = status.extended_tweet["full_text"]
        else:
            text = status.text
        # check if this is a quote tweet.
        is_quote = hasattr(status, "quoted_status")
        quoted_text = ""
        if is_quote:
            # check if quoted tweet's text has been truncated before recording it
            if hasattr(status.quoted_status, "extended_tweet"):
                quoted_text = status.quoted_status.extended_tweet["full_text"]
            else:
                quoted_text = status.quoted_status.text
        # remove characters that might cause problems with csv encoding
        remove_characters = [",", "\n"]
        for c in remove_characters:
            # str.replace returns a new string, so the result must be assigned back
            text = text.replace(c, " ")
            quoted_text = quoted_text.replace(c, " ")
        with open("out.csv", "a", encoding='utf-8') as f:
            f.write("%s,%s,%s,%s,%s,%s\n" % (status.created_at, status.user.screen_name, is_retweet, is_quote, text, quoted_text))

    def on_error(self, status_code):
        print("Encountered streaming error (", status_code, ")")
        sys.exit()
consumer_key = "..."
consumer_secret = "..."
access_token = "..."
access_token_secret = "..."
auth = tweepy.OAuthHandler(consumer_key,consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
if not api:
    print("Authentication failed!")
    sys.exit(-1)
myStreamListener = MyStreamListener()
myStream = tweepy.Stream(auth = api.auth, listener=myStreamListener,tweet_mode='extended')
with open("out.csv", "w", encoding='utf-8') as f:
    f.write("date,user,is_retweet,is_quote,text,quoted_text\n")
myStream.filter(follow=['52286608'])
You should be able to use the follow parameter with a comma-separated list of user IDs. From the Twitter API documentation:
follow
A comma-separated list of user IDs, indicating the users whose Tweets should be delivered on the stream. Following protected users is not supported. For each user specified, the stream will contain:
- Tweets created by the user.
- Tweets which are retweeted by the user.
- Replies to any Tweet created by the user.
- Retweets of any Tweet created by the user.
- Manual replies, created without pressing a reply button (e.g. “@twitterapi I agree”).
The stream will not contain:
- Tweets mentioning the user (e.g. “Hello @twitterapi!”).
- Manual Retweets created without pressing a Retweet button (e.g. “RT @twitterapi The API is great”).
- Tweets by protected users.
You can follow up to 5000 IDs this way.
Note that the API you are connecting to has been superseded by the v2 filtered stream API, but at the time of writing Tweepy did not support that.
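Applied to a Twitter list, that means first collecting the members' user IDs and then passing them to follow. The sketch below covers only the second half and is an illustration, not the library's own method: to_follow_batches is a hypothetical helper, and the member IDs are placeholders (only 52286608 comes from the question). In Tweepy 3.x the IDs could probably be fetched with the list-members call, but that depends on the installed version.

```python
def to_follow_batches(user_ids, batch_size=5000):
    """Convert user IDs to strings and split them into follow-sized batches."""
    ids = [str(uid) for uid in user_ids]
    return [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]


# Placeholder IDs standing in for the members of the Twitter list.
member_ids = [52286608, 111, 222]
batches = to_follow_batches(member_ids)
print(batches[0])  # ['52286608', '111', '222']

# With the stream object from the question, one would then call:
# myStream.filter(follow=batches[0])
```

Splitting into batches matters only when the list has more members than the 5000-ID limit quoted above; each batch would then need its own stream connection.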
I have a basic program working using the Tweepy API. It essentially grabs tweets from a user and outputs them to a terminal. Ideally, I'd like this to be automated, so that when the user tweets, the program sees it and displays the tweet. But that's a question for another time.
What I'd like to do now, however, is grab only the tweets that contain a hashtag.
How do I go about this? I'm hoping it's a parameter I can add inside the timeline function.
Here is a snippet of the code I have at the moment:
import tweepy
import twitter_credentials
auth = tweepy.OAuthHandler(twitter_credentials.CONSUMER_KEY, twitter_credentials.CONSUMER_SECRET)
auth.set_access_token(twitter_credentials.ACCESS_TOKEN, twitter_credentials.ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)
stuff = api.user_timeline(screen_name = 'XXXXXX', count = 10, include_rts = False)
for status in stuff:
    print(status.text)
For a simple use case you can use # in the search string, for example:
api = tweepy.API(auth, wait_on_rate_limit=True)
for tweet in tweepy.Cursor(api.search, q="#", count=100).items():
    print(tweet)
This will give you tweets which contain any hashtags.
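If the tweets should still come from one user's timeline, as in the question, another option is to filter the statuses after fetching them. A minimal sketch, assuming each status exposes an entities dictionary with a "hashtags" list (has_hashtag is a hypothetical helper, and the sample objects are stand-ins for what api.user_timeline returns):

```python
from types import SimpleNamespace


def has_hashtag(status, tag=None):
    """True if the status has any hashtag, or the given one when tag is set."""
    hashtags = status.entities.get("hashtags", [])
    if tag is None:
        return bool(hashtags)
    wanted = tag.lstrip("#").lower()
    return any(h["text"].lower() == wanted for h in hashtags)


# Stand-in objects shaped like the statuses from api.user_timeline.
sample = [
    SimpleNamespace(text="Hello #python", entities={"hashtags": [{"text": "python"}]}),
    SimpleNamespace(text="No tags here", entities={"hashtags": []}),
]
for status in sample:
    if has_hashtag(status):
        print(status.text)  # prints only: Hello #python
```

In the question's loop this would become: if has_hashtag(status): print(status.text).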
I'm working on a project where I stream tweets from the Twitter API, then apply sentiment analysis and visualize the results on an interactive, colorful map.
I've tried the 'tweepy' library in Python, but the problem is that it only retrieves a few tweets (10 or fewer).
Also, I'm going to specify the language and the location, which means I might get even fewer tweets! I need real-time streaming of hundreds or thousands of tweets.
This is the code I tried (just in case):
import os
import tweepy
from textblob import TextBlob
port = os.getenv('PORT', '8080')
host = os.getenv('IP', '0.0.0.0')
# Step 1 - Authenticate
consumer_key= 'xx'
consumer_secret= 'xx'
access_token='xx'
access_token_secret='xx'
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
#Step 3 - Retrieve Tweets
public_tweets = api.search('school')
for tweet in public_tweets:
    print(tweet.text)
    analysis = TextBlob(tweet.text)
    print(analysis)
Are there any better alternatives? I found "PubNub", which is a JavaScript API, but for now I want something in Python since it is easier for me.
Thank you
If you want a large number of tweets, I would recommend using Twitter's streaming API through tweepy:
#Create a stream listener:
import tweepy
tweets = []
class MyStreamListener(tweepy.StreamListener):
    #The next function defines what to do when a tweet is parsed by the streaming API
    def on_status(self, status):
        tweets.append(status.text)
#Create a stream:
myStreamListener = MyStreamListener()
myStream = tweepy.Stream(auth = api.auth, listener=myStreamListener)
#Filter streamed tweets by the keyword 'school':
myStream.filter(track=['school'], languages=['en'])
Note that the track filter used here is part of the standard, free filtering API; there is another API called PowerTrack, built for enterprises that have more requirements and rules to filter on.
Ref: https://developer.twitter.com/en/docs/tweets/filter-realtime/overview/statuses-filter
Otherwise, if you want to stick with the search method, you can query a maximum of 100 tweets per request by adding count, and pass the highest id seen so far as since_id to get only newer tweets. You can add those parameters to the search method as follows:
public_tweets = []
max_id = 0
for i in range(10): #This loop will run 10 times you can play around with that
    public_tweets.extend(api.search(q='school', count=100, since_id=max_id))
    max_id = max([tweet.id for tweet in public_tweets])
#To make sure you only got unique tweets, you can do:
unique_tweets = list({tweet._json['id']:tweet._json for tweet in public_tweets}.values())
With this approach you have to be careful with the API's rate limits; you can handle that by enabling the wait_on_rate_limit attribute when you initialize the API: api = tweepy.API(auth,wait_on_rate_limit=True)
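The polling idea can also be written so that duplicates never accumulate and an empty result does not break the loop. This is only a sketch under stated assumptions: collect_new_tweets is a hypothetical helper, its search argument stands in for api.search, and fake_search exists only so the example runs without credentials.

```python
from types import SimpleNamespace


def collect_new_tweets(search, query, rounds=3, page_size=100):
    """Poll search() several times, keeping each tweet once, keyed by id."""
    seen = {}
    since_id = None
    for _ in range(rounds):
        kwargs = {"q": query, "count": page_size}
        if since_id is not None:
            kwargs["since_id"] = since_id  # only ask for newer tweets
        batch = search(**kwargs)
        if not batch:
            continue  # nothing new in this round
        for tweet in batch:
            seen[tweet.id] = tweet
        since_id = max(seen)
    return list(seen.values())


def fake_search(q, count, since_id=None):
    """Stand-in for api.search: returns three fixed tweets newer than since_id."""
    data = [SimpleNamespace(id=i, text=f"{q} tweet {i}") for i in (1, 2, 3)]
    return [t for t in data if since_id is None or t.id > since_id]


tweets = collect_new_tweets(fake_search, "school")
print([t.id for t in tweets])  # [1, 2, 3]
```

With tweepy this would be called as collect_new_tweets(api.search, 'school'), ideally with wait_on_rate_limit enabled as described above.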