Analyzing streamed data from Twitter API using Tweepy - python

I'm quite new to python and coding in general, and I'm having difficulty understanding how to interact with streamed data from the Twitter API using Tweepy.
Here's my example code which prints out any new tweet that the specified user makes.
import tweepy
auth = tweepy.OAuthHandler("****", "****")
auth.set_access_token("****", "****")
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
class MyStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        print(status.text)
myStreamListener = MyStreamListener()
myStream = tweepy.Stream(auth=api.auth, listener=myStreamListener)
myStream.filter(follow=['user_id_goes_here'])
If I want to do something such as check if a certain word exists inside each tweet as they are made, I do not know how, given it is a constant stream of data.
How does one analyze each tweet as it is delivered and parse it?

The tweepy documentation on streaming is very limited, but it does say
This page aims to help you get started using Twitter streams with Tweepy by offering a first walk through. Some features of Tweepy streaming are not covered here. See streaming.py in the Tweepy source code.
so searching for that file in the tweepy github repository we find
https://github.com/tweepy/tweepy/blob/master/tweepy/streaming.py
There you can find the method on_status and see that status should be an instance of the class Status
Looking at the twitter API documentation reveals that
Tweets are also known as “status updates.”
Unfortunately, looking at the source code for Status or the tweepy documentation does not yield much information.
Looking at the twitter API documentation for tweet
https://developer.twitter.com/en/docs/twitter-api/data-dictionary/object-model/tweet
We should expect a field called text that holds the tweet's actual text.
Another thing we can try is setting a breakpoint and inspecting the variable status in the debugger to see what fields it has (this is done a lot in Python because of its dynamic nature).
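If you'd rather not start a debugger, a rough alternative is to list an object's attributes with vars(). The sketch below uses a made-up stand-in object (fake_status is hypothetical, and a real Status has many more fields); the assumption is that a tweepy Status stores its fields as ordinary instance attributes, so the same helper should work on it.

```python
from types import SimpleNamespace


def public_fields(obj):
    """Return the sorted names of an object's public instance attributes."""
    return sorted(name for name in vars(obj) if not name.startswith("_"))


# Hypothetical stand-in for the status passed to on_status.
fake_status = SimpleNamespace(
    id=1,
    text="hello world",
    _json={"id": 1, "text": "hello world"},
)

print(public_fields(fake_status))
```

Inside on_status you could call print(public_fields(status)) once to see which fields are available before deciding what to parse.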

I believe I have found what I am looking for: the analysis of the status needs to happen within the on_status method, for example:
class MyStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        # keyword is a string defined elsewhere
        if keyword in status.text:
            print("keyword found")
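A slightly fuller sketch of the same idea, kept free of network calls so it can be run as-is. KeywordHandler and contains_keyword are hypothetical names, not part of tweepy; only the on_status(status) callback shape and status.text come from the code above. It matches whole words and ignores case, which a plain "in" check does not, but it assumes the keywords are plain words (a keyword starting with a symbol such as # would need a different pattern).

```python
import re
from types import SimpleNamespace


def contains_keyword(text, keywords):
    """Return True if any keyword appears in text as a whole word, ignoring case."""
    for keyword in keywords:
        pattern = r"\b" + re.escape(keyword) + r"\b"
        if re.search(pattern, text, flags=re.IGNORECASE):
            return True
    return False


class KeywordHandler:
    """Hypothetical stand-in for a listener; only on_status mirrors the callback above."""

    def __init__(self, keywords):
        self.keywords = keywords
        self.matches = []

    def on_status(self, status):
        # Each tweet arrives as one call, so the analysis happens per tweet here.
        if contains_keyword(status.text, self.keywords):
            self.matches.append(status.text)


handler = KeywordHandler(["python"])
handler.on_status(SimpleNamespace(text="Learning Python today"))
handler.on_status(SimpleNamespace(text="Nothing relevant here"))
print(handler.matches)
```

In the real listener, the body of on_status would be the same; the stream simply calls it once per delivered tweet.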

Related

How to create a twitter bot that replies back as soon as another person tweets?

Yesterday I wrote a twitter bot with Python that takes the most recent tweet from Donald Trump, translates it over in Google Translate 45 times, and tweets back at him the final translation in English. Everything works, except the fact that I now need to add some sort of "listener" at the beginning of the code to automatically detect when he tweets so that the rest of the code can do its magic. I've been looking over the internet for some time now and I can't seem to find any sort of event handler that would allow the script to detect when he tweets. So that's why I've come to you guys. Is there any way to use Tweepy or other Python libraries to actively detect when someone tweets? I'll include my code so you guys can see what I want to happen exactly when he does Tweet. It's annotated so hopefully it's not too complicated to understand. Thanks!
import tweepy
from googletrans import Translator
# Keys for accessing the Twitter API
consumer_key = 'PLACEHOLDER'
consumer_secret = 'PLACEHOLDER'
access_token = 'PLACEHOLDER'
access_token_secret = 'PLACEHOLDER'
# Setting up authentication
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
# Scrapes my timeline for DT's latest tweet. I want this to be done AUTOMATICALLY when he tweets.
tweet = api.user_timeline(screen_name='realDonaldTrump', count=1, include_rts=False, tweet_mode='extended')
# Translates the text from the .json file that is pulled from the previous line using the Google Translate library.
for status in tweet:
    translator = Translator()
    translation = translator.translate(status._json["full_text"], 'ja')
    translation = translator.translate(translation.text, 'mn')
    # There are more translations in the actual code, but to reduce the length and complexity, I've taken those out. They don't matter to the specific question.
    # Include his handle in a message so that the tweet is tweeted back at him once the translation is complete.
    message = '@realDonaldTrump ' + translation.text
    # Tweets the message back at him under that specific tweet using its ID.
    send = api.update_status(message, status._json["id"])
I just want the code to be able to scrape my timelines for one of DT's tweets in real time. Thanks!
To automate your script, you may need to deploy it to a production server and create a cron job that runs it at regular intervals. To run it locally, I use two tools: 1. NGROK, which exposes your localhost addresses and ports to the outside world, i.e. it gives you a way to reach your localhost from the internet, and 2. Invictify, which lets you run your scripts on a schedule. As written, the script also needs a web service to trigger it: use Flask or FastAPI to create endpoints that call the script when triggered.
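Whichever scheduler you pick, the script has to remember which tweet it handled last, otherwise every run replies to the same tweet again. Here is a minimal sketch of that bookkeeping, with the API call replaced by a fake. poll_new_tweets, fetch_latest and handle_tweet are hypothetical names; with Tweepy, fetch_latest could wrap api.user_timeline(..., since_id=...), and the tweets would be Status objects rather than the dicts used here.

```python
import time


def poll_new_tweets(fetch_latest, handle_tweet, last_seen_id=None, rounds=3, delay=0.0):
    """Call fetch_latest(since_id) repeatedly and pass each unseen tweet to handle_tweet.

    fetch_latest is a hypothetical callable returning dicts with an "id" key, newest first.
    Returns the highest tweet ID seen, so the caller can persist it between runs.
    """
    for _ in range(rounds):
        tweets = fetch_latest(last_seen_id)
        # Process oldest first so replies go out in chronological order.
        for tweet in reversed(tweets):
            if last_seen_id is None or tweet["id"] > last_seen_id:
                handle_tweet(tweet)
                last_seen_id = tweet["id"]
        time.sleep(delay)
    return last_seen_id


# Fake timeline standing in for the real API call.
timeline = [
    {"id": 3, "text": "third"},
    {"id": 2, "text": "second"},
    {"id": 1, "text": "first"},
]


def fake_fetch(since_id):
    return [t for t in timeline if since_id is None or t["id"] > since_id]


handled = []
last_id = poll_new_tweets(fake_fetch, handled.append, last_seen_id=1)
print([t["text"] for t in handled], last_id)
```

Under cron you would run a single round per invocation and store the returned ID in a file; in a long-running process you would raise delay to stay within the rate limits.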

Tweepy mentions_timeline returns an empty list

I just started working with the Twitter API. I didn't have a Twitter account before, so I created one for this. I tweeted 4 times, including some mentions. But when I use mentions_timeline like this:
my_mentions = api.mentions_timeline()
#print(my_mentions)
#output: []
After that I run a for loop over my_mentions, reading text and screen_name, but nothing is returned.
What am I doing wrong here? Why is the list empty when I mentioned some people in the tweets? And how can I search mentions for another user? Is there a parameter in mentions_timeline() like screen_name or id?
Try using the Cursor object as follows:
api = tweepy.API(auth)
for mentions in tweepy.Cursor(api.mentions_timeline).items():
    # process mentions here
    print(mentions.text)
As per Twitter's documentation:
Returns the 20 most recent mentions (tweets containing a users’s @screen_name) for the authenticating user.
So you cannot check other users' mentions using this method. To achieve this, you will have to use Twitter's search API. See Tweepy's documentation for details.
import tweepy
api = tweepy.API(auth)
api.mentions_timeline()
Try going to the profile of the account you're using with the API and check whether the mentions appear there.
Also try mentioning that account from a different account.
It may be that Twitter has limited your activity, so that replies/tweets from that account are not visible.

Twitter search API: search by tweet id

I'd like to send a query to the Twitter search API using a tweet id, but it seems you cannot search for a tweet by its id (maybe because you don't need to search for it if you already have the id). For example, imagine we have a tweet https://twitter.com/great_watches/status/643389532105256961 and we want to send 643389532105256961 to the search API to see whether the tweet is available there or not.
I need it because I just want to compare the Twitter search API with the Twitter streaming API.
I have a Python script which is listening to the stream for some keywords, and whenever a tweet comes in I'd like to search for it on the Twitter search API to see whether it is available there as well. Meaningless, huh?
You can't compare the Search API to the Streaming API the way you're doing it, because they retrieve different types of information.
From the Search API docs:
The Twitter Search API is part of Twitter’s v1.1 REST API. It allows
queries against the indices of recent or popular Tweets and behaves
similarily to, but not exactly like the Search feature available in
Twitter mobile or web clients, such as Twitter.com search.
Before getting involved, it’s important to know that the Search API is
focused on relevance and not completeness. This means that some Tweets
and users may be missing from search results. If you want to match for
completeness you should consider using a Streaming API instead.
Here's the scenario based on the information you've given:
You're streaming the word python and you find a match.
You instantly take that match and look for it on the Search API.
The issue is that between receiving the tweet from the Streaming API (which is real time) and looking for the same tweet on the Search API, more relevant and popular tweets may supersede it in the search results.
You'll need to refine the query sent to the Search API so that it matches that exact tweet (i.e. include more than python, which is all you used with the Streaming API).
You can get it using the tweepy API. Get the consumer key and secret and the access key and secret from https://apps.twitter.com/. Then run the following:
import tweepy
consumer_key = 'XXXX'
consumer_secret = 'XXXX'
access_key = 'XXXX-XXXX'
access_secret = 'XXXX'
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
tweet = api.statuses_lookup(['643389532105256961'])
print(tweet[0].text)  # Prints the message
More info here http://docs.tweepy.org/en/v3.5.0/api.html#API.statuses_lookup
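If you collect many ids from the stream, you will probably want to look them up in batches, since as far as I know the lookup endpoint takes at most 100 ids per request and leaves out tweets it cannot return. A rough sketch of that batching, using a fake lookup so it runs offline: chunked, lookup_all and fake_lookup are hypothetical helpers, and the tweets here are dicts, whereas with Tweepy 3.x you would pass api.statuses_lookup and read tweet.id instead of tweet["id"].

```python
def chunked(ids, size=100):
    """Yield successive lists of at most `size` IDs."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def lookup_all(lookup, ids, size=100):
    """Call lookup(batch) once per batch and return the found tweets keyed by ID.

    lookup is a hypothetical callable that returns a list of tweet dicts.
    """
    found = {}
    for batch in chunked(ids, size):
        for tweet in lookup(batch):
            found[tweet["id"]] = tweet
    return found


# Fake lookup: only even IDs "exist", mimicking deleted or unavailable tweets.
def fake_lookup(batch):
    return [{"id": i, "text": "tweet %d" % i} for i in batch if i % 2 == 0]


ids = list(range(1, 251))
found = lookup_all(fake_lookup, ids)
missing = [i for i in ids if i not in found]
print(len(found), len(missing))
```

The missing list is what you would inspect to see which streamed tweets can no longer be retrieved by id.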

Python 2.7 - Tweepy - How to get rate_limit_status()?

I am working on a twitter App using Python 2.7 and the latest version of the tweepy module. One thing I cannot figure out is how to use the function rate_limit_status()
Here is my code:
import tweepy, time, sys, random, pickle
import pprint
# argfile = str(sys.argv[1])
#enter the corresponding information from your Twitter application:
CONSUMER_KEY = ''
CONSUMER_SECRET = ''
ACCESS_KEY = ''
ACCESS_SECRET = ''
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)
public_tweets = api.home_timeline()
user = api.get_user('#MyUserName')
print api.rate_limit_status()
When I print the results of the function it gives me a large dictionary that I cannot decipher. I have looked at the tweepy documentation but can't find any good examples on using rate_limit_status().
What is the next step I should be doing to troubleshoot something like this?
Is there a tool to format these large dictionaries so I can read them and try to decipher how to access the values in the dictionary?
Edit:
It turns out I didn't have a good understanding of what a Rest API is and how simply it works! I was expecting something MUCH more complicated in my head.
I actually switched to the twitter Python library instead of Tweepy and then did a lot of research on how to use the Twitter API.
Two youtube videos that REALLY helped me are:
https://www.youtube.com/watch?v=7YcW25PHnAA
and
https://www.youtube.com/watch?v=fhPb6ocUz_k
The Postman Chrome app was awesome and allowed me to easily test and visualize how my calls to the Twitter API worked, and it formatted the resulting JSON so I could read it.
To do quick calculations I also took the JSON from Postman and threw it into this website http://konklone.io/json/ to get a csv that I could then open in Excel to make sure everything was behaving as expected and that I was getting the right number of results.
After all that, writing the Python code to interact with the Twitter API was easy!
Adding all this in the hope it will help someone else in the future! If it does please let me know! :)
As per the Tweepy documentation
Returns the remaining number of API requests available to the
requesting user before the API limit is reached for the current hour.
Calls to rate_limit_status do not count against the rate limit. If
authentication credentials are provided, the rate limit status for the
authenticating user is returned. Otherwise, the rate limit status for
the requester’s IP address is returned.
In simpler words, it returns a JSON object that tells you the number of requests you have made and the number of requests remaining. It is difficult to read at first sight because it contains the counts for every type of API call, not only the call you just executed.
For example, if you run the above script, you make a call to api.home_timeline(). According to Twitter's rate limits, you can only make 15 calls to this method in a given rate-limit window. If you unpack the returned JSON object, you will see a lot of data, but on analysis you will find that api.home_timeline() only affects the limits of the relevant endpoints. For the methods called above, you can check the rate limit using:
data = api.rate_limit_status()
print data['resources']['statuses']['/statuses/home_timeline']
print data['resources']['users']['/users/lookup']
However, you have to do a little research on the returned JSON before you can extract the relevant data from it. As the returned JSON objects are hard to read, you can always use these types of links to make them more readable and then analyse them.
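You can also do the formatting in Python itself. The sketch below pretty-prints one entry with json.dumps and lists only the endpoints where some calls have been used. used_endpoints is a hypothetical helper, and sample is a trimmed, made-up dictionary in the same shape as the lookups above (resources, then a family such as statuses, then the endpoint path, with limit, remaining and reset); in practice you would pass the result of api.rate_limit_status() instead.

```python
import json


def used_endpoints(data):
    """Return {endpoint: (remaining, limit)} for endpoints with fewer calls left than the limit."""
    used = {}
    for family in data["resources"].values():
        for endpoint, info in family.items():
            if info["remaining"] < info["limit"]:
                used[endpoint] = (info["remaining"], info["limit"])
    return used


# Trimmed, made-up sample in the shape used by the answer above.
sample = {
    "resources": {
        "statuses": {
            "/statuses/home_timeline": {"limit": 15, "remaining": 14, "reset": 1403602426},
            "/statuses/user_timeline": {"limit": 180, "remaining": 180, "reset": 1403602426},
        },
        "users": {
            "/users/lookup": {"limit": 180, "remaining": 179, "reset": 1403602426},
        },
    }
}

# json.dumps with indent gives a readable view of the nested dictionary.
entry = sample["resources"]["statuses"]["/statuses/home_timeline"]
print(json.dumps(entry, indent=2, sort_keys=True))
print(used_endpoints(sample))
```

Filtering like this keeps the output down to the handful of endpoints your script actually touched.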

How to get tweets from a user using any python library?

I'm using twitter python tools for extracting twitter information, and I'm lost when it comes to the very simple task of getting tweets from a user.
search_results = twitter_api.search.tweets(q="", user="daguilaraguilar", count=10)
tweets = search_results['statuses']
for tweet in tweets:
    print(tweet['text'])
But I'm getting this error message at the end of an HTTP 400 response:
details: {"errors":[{"code":25,"message":"Query parameters are missing."}]}
You did not define your query. According to the twitter api, it is required. Looking at the docs for the python api wrapper, I see these steps for getting the tweets of a specific user. You can probably define a since variable also according to the twitter api:
# Get your "home" timeline
t.statuses.home_timeline()
# Get a particular friend's timeline
t.statuses.friends_timeline(id="billybob")
