Filtering in tweepy - python

I am new to tweepy and have run into a problem. I want to download tweets that contain specific hashtags, but it seems that
stream.filter(track = ['word1', 'word2', 'word3'])
looks for these words in the tweet text and not in the tweet's hashtags. How can I filter on hashtags?

You can actually filter tweets based on your special hashtag.
stream.filter(track=['#MySpecialHashtag', '#AlsoThisHashtag'])
This picks up only tweets whose text contains the hashtags you provide, which saves you from collecting tweets arbitrarily and then checking whether the hashtag field contains your hashtag.

You can find the hashtags in the status object. That is where you have to compare them with the ones you are looking for.
Example:
for hashtag in status.entities['hashtags']:
    print(hashtag['text'])
Further example here: http://www.pythoncentral.io/introduction-to-tweepy-twitter-for-python/
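The comparison described above could be sketched roughly as follows. The helper name matching_hashtags and the sample entities dictionary are made up for illustration; the only thing taken from the answer is that each entry under entities['hashtags'] carries a 'text' key. The case-insensitive match is an assumption about what you want, not something tweepy does for you.

```python
def matching_hashtags(entities: dict, wanted: set) -> set:
    """Return the wanted hashtags (lowercased, without '#') present in entities."""
    # Hashtag text in the entities has no leading '#', so strip it from the wanted set.
    found = {tag["text"].lower() for tag in entities.get("hashtags", [])}
    normalized = {w.lstrip("#").lower() for w in wanted}
    return found & normalized


# Hypothetical data shaped like status.entities
entities = {
    "hashtags": [
        {"text": "Python", "indices": [0, 7]},
        {"text": "tweepy", "indices": [8, 15]},
    ]
}
matches = matching_hashtags(entities, {"#python", "#rust"})
print(matches)
```

In a stream listener you would probably call this with status.entities and skip the status when the returned set is empty.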

Related

Excluding link at the end while pulling tweets in tweepy Streaming

I am pulling text or extended_text using tweepy streaming, but when I pull these tweets, there is always a t.co/randomletters link at the end that leads to nowhere. What is it and how do I get rid of it?
Here is an example:
"text": "To make room for more expression, we will now count all emojis as equal—including those with gender‍‍‍ ‍‍and skin tone modifiers https://t.co(forward slash)MkGjXf9aXm"
Please help
In my experience with Twitter and tweepy, these URLs are included in a tweet's text whenever there is a URL of some sort in the actual tweet, so we can't really avoid getting them.
You can remove them after you get them. This simple regex replaces the pattern of these URLs with an empty string (note the escaped dot, and that re.sub returns a new string rather than changing tweet_text in place):
import re
tweet_text = re.sub(r' https://t\.co/\w{10}', '', tweet_text)
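A slightly more forgiving variant of the same idea, as a sketch: it assumes the short link may be http or https, may have any number of word characters after the slash, and may appear anywhere in the text. The name strip_tco_links and the sample string are made up for illustration.

```python
import re

# Assumed pattern: optional leading whitespace, then a t.co short link.
TCO_LINK = re.compile(r"\s*https?://t\.co/\w+")


def strip_tco_links(text: str) -> str:
    """Return text with every t.co short link removed."""
    return TCO_LINK.sub("", text).strip()


sample = "Count all emojis as equal https://t.co/MkGjXf9aXm"
cleaned = strip_tco_links(sample)
print(cleaned)
```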

twitter tweet_mode = 'extended' not just giving me the text in the tweet

I'm trying to download tweets using tweepy. But the tweets keep getting cut off.
results = api.search(q=hashtag, lang="en", count=num, tweet_mode="extended")
for tweet in results:
    tweet_list.append(tweet.full_text)
I end up getting outputs looking like this:
RT #Acosta: Trump also said at the meeting “why do we need more Haitians? Take them out,” a person familiar with today’s meeting confirms t…
I just want the actual full text part of the tweet.
Already answered here
Instead of full_text=True you need tweet_mode="extended"
Then, instead of text you should use full_text to get the full tweet text.
Your code should look like:
new_tweets = api.user_timeline(screen_name = screen_name,count=200, tweet_mode="extended")
Then in order to get the full tweets text:
tweets = [[tweet.full_text] for tweet in new_tweets]
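One thing worth checking if the output still starts with "RT" and ends in an ellipsis: for a retweet, full_text on the status itself can still be cut off, and the complete text sits on the retweeted_status attribute instead. A minimal sketch of handling both cases follows. The helper name full_text_of is made up, and the SimpleNamespace objects are stand-ins that only mimic the two attributes used here; they are not real tweepy Status objects.

```python
from types import SimpleNamespace


def full_text_of(status):
    """Prefer the original tweet's full_text when status is a retweet."""
    original = getattr(status, "retweeted_status", None)
    if original is not None:
        return original.full_text
    return status.full_text


# Stand-ins for statuses fetched with tweet_mode="extended"
original = SimpleNamespace(full_text="The complete original text of the tweet")
retweet = SimpleNamespace(
    full_text="RT @someone: The complete original te…",
    retweeted_status=original,
)
plain = SimpleNamespace(full_text="A tweet that is not a retweet")

texts = [full_text_of(s) for s in (retweet, plain)]
print(texts)
```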

Using tweepy to get unique tweets

I am trying to build a corpus of tweets using a number of search terms. One issue I am having is that I cannot get only unique tweets: the results also include retweets.
Is there a way to remove these beforehand without doing any text processing?
What I've got now:
api = tweepy.API(auth)
for search in hashtags:
    for tweet in tweepy.Cursor(api.search, q=search, count=1000, lang="en").items():
        text = repr(tweet.text.encode("utf-8"))
        out.write(text + "\n")
You can add " -filter:retweets" to your query to only get original tweets. Maybe not the prettiest solution, but it works.
api = tweepy.API(auth)
for search in hashtags:
    for tweet in tweepy.Cursor(api.search, q=search + " -filter:retweets", count=1000, lang="en").items():
        text = repr(tweet.text.encode("utf-8"))
        out.write(text + "\n")
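Even without retweets, the same tweet can come back more than once when it matches several of your search terms. A minimal sketch of dropping those repeats by tweet ID, with no text processing, might look like this. The helper name unique_by_id is made up, and the SimpleNamespace objects are stand-ins that only carry the id and text attributes.

```python
from types import SimpleNamespace


def unique_by_id(tweets):
    """Keep the first occurrence of each tweet ID, preserving order."""
    seen = set()
    unique = []
    for tweet in tweets:
        if tweet.id not in seen:
            seen.add(tweet.id)
            unique.append(tweet)
    return unique


# Stand-ins for results collected across several search terms
batch = [
    SimpleNamespace(id=1, text="a"),
    SimpleNamespace(id=2, text="b"),
    SimpleNamespace(id=1, text="a"),
]
unique_ids = [t.id for t in unique_by_id(batch)]
print(unique_ids)
```

In the loop above you would collect the tweets from every search term first and write them out only after passing them through this filter.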

How to use Tweepy to retweet with a comment

I am stuck trying to figure out how to retweet a tweet with a comment, a feature that was added to Twitter recently.
This is when you click retweet, add a comment to the retweet, and then post it.
I was looking at the API and couldn't find a method dedicated to this. Even the retweet method does not have a parameter where I can pass text.
So I was wondering, is there a way to do this?
Tweepy doesn't have functionality to retweet with your own text, but you can build a URL like https://twitter.com/<user_displayname>/status/<tweet_id> and include it with the text of your comment. It's not a retweet, but it embeds the tweet in your new tweet.
user_displayname - screen name (the @handle) of the person whose tweet you are quoting
tweet_id - ID of the tweet you are quoting
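Building that URL could be sketched as below. The helper name quote_url and the comment text are made up for illustration; posting the resulting string is left out because it needs an authenticated API object.

```python
def quote_url(screen_name: str, tweet_id: int) -> str:
    """Build the permalink of a tweet from its author's handle and its ID."""
    return f"https://twitter.com/{screen_name}/status/{tweet_id}"


url = quote_url("andypiper", 903615884664725505)
# The comment goes first, followed by the link that embeds the quoted tweet.
status_text = f"Worth reading {url}"
print(status_text)
```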
September 2021 Update
Tweepy does have the functionality to quote retweet. Just provide the url of the tweet you want to quote into attachment_url of the API.update_status method.
Python example:
# Get the tweet you want to quote
tweet_to_quote_url="https://twitter.com/andypiper/status/903615884664725505"
# Quote it in a new status
api.update_status("text", attachment_url=tweet_to_quote_url)
# Done!
According to the documentation, the create_tweet method has a quote_tweet_id parameter.
You can create a new tweet and pass it the ID of the tweet you want to quote.
comment = "Yep!"
quote_tweet = 1592447141720780803
client = tweepy.Client(bearer_token=access_token)
client.create_tweet(text=comment, quote_tweet_id=quote_tweet, user_auth=False)

Original tweet or retweeted?

I am using Tweepy with python and trying to get the original tweets authored by set of users (i.e., I want to exclude any tweet in their timeline that is actually a retweet). How can I do this with Tweepy?
I tried something like this and I do not know if it works:
tweets = api.user_timeline(id=user['id'], count=30)
for tweet in tweets:
    if not tweet.retweeted:
        analyze_tweet(tweet)
Does the api.user_timeline() return only original tweets? Or retweets of this user as well?
By default, Tweepy doesn't include retweets in user_timeline, so tweet.retweeted will always be False. To include retweets, set include_rts to True:
tweets = api.user_timeline(id=user['id'], count=30, include_rts=True)
for tweet in tweets:
    if not tweet.retweeted:
        analyze_tweet(tweet)
    else:
        pass  # do something with the retweet
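A different check that may be worth trying, offered as a sketch rather than a definitive answer: as far as I know, the retweeted flag describes whether the authenticating user has retweeted the status, while a status that is itself a retweet carries a retweeted_status attribute holding the original tweet. Please verify that against the Twitter API documentation for your setup. The helper names is_retweet and original_tweets are made up, and the SimpleNamespace objects are stand-ins, not real tweepy Status objects.

```python
from types import SimpleNamespace


def is_retweet(status) -> bool:
    """True when the status carries an embedded original tweet."""
    return hasattr(status, "retweeted_status")


def original_tweets(statuses):
    """Keep only statuses that are not retweets."""
    return [s for s in statuses if not is_retweet(s)]


# Stand-ins for a user's timeline
timeline = [
    SimpleNamespace(id=1, text="my own tweet"),
    SimpleNamespace(
        id=2,
        text="RT @other: hello",
        retweeted_status=SimpleNamespace(id=99, text="hello"),
    ),
]
kept_ids = [s.id for s in original_tweets(timeline)]
print(kept_ids)
```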
