The code below works: it streams tweets and stores them in a JSON file. I am running it on a virtual machine because I want to collect two months' worth of data. However, it stopped running after about 48 hours without raising an error. Is this a tweepy limitation (a streaming rate limit or something similar), or should I check my connection?
import os
import sys
from tweepy import API
from tweepy import OAuthHandler
consumer_key = ''
consumer_secret = ''
access_token = ''
access_secret = ''
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
def get_twitter_client():
    """Setup Twitter API client Return: tweepy.API object """
    auth = get_twitter_auth()
    client = API(auth)
    return client
from tweepy import Stream
from tweepy.streaming import StreamListener
class MyListener(StreamListener):
    def on_data(self, data):
        try:
            with open('strategyand_user.json', 'a') as f:
                f.write(data)
                return True
        except BaseException as e:
            print("Error on_data: %s" % str(e))
        return True
    def on_error(self, status):
        print(status)
        return True
twitter_stream = Stream(auth, MyListener())
twitter_stream.filter(follow=['xxxxxxx'])
Related
I'm trying to stream a specific user's tweets and return them as a variable, to be used afterwards for an event-based trigger. I am using the following code, but it does not seem to return any tweets.
I am interested only in the user's own tweets, not retweets, and want to skip tweets that were not posted by the specified user.
Any idea where I am going wrong?
import tweepy
#API login for Twitter
consumer_key= CONSUMER_KEY
consumer_secret= CONSUMER_SECRET
access_key= ACCESS_TOKEN
access_secret= ACCESS_TOKEN_SECRET
class StreamListener(tweepy.StreamListener):
    def on_status(self, status):
        if status.user.id_str != '111111111':
            return
        tweet = status.text
        return tweet
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
# initialize stream
streamListener = StreamListener()
stream = tweepy.Stream(auth=api.auth, listener=streamListener)
stream.filter(follow=['111111111'])
print(tweet)
Here is code that streams a specific user's tweets:
import tweepy
import sys
#API login for Twitter
consumer_key= CONSUMER_KEY
consumer_secret= CONSUMER_SECRET
access_key= ACCESS_TOKEN
access_secret= ACCESS_TOKEN_SECRET
# StreamListener class inherits from tweepy.StreamListener and overrides on_status/on_error methods.
class StreamListener(tweepy.StreamListener):
    def on_status(self, status):
        if status.user.id_str != '11111111':
            return
        text = status.text
        print(text)
    def on_error(self, status_code):
        print("Encountered streaming error (", status_code, ")")
        sys.exit()
if __name__ == "__main__":
    # complete authorization and initialize API endpoint
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)
    # initialize stream
    streamListener = StreamListener()
    stream = tweepy.Stream(auth=api.auth, listener=streamListener,tweet_mode='extended')
    stream.filter(follow=['11111111'])
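On the "return them as a variable" part of the question: as far as I can tell, in tweepy 3.x the value returned from on_status never reaches the code that called stream.filter, and filter blocks while the stream is open, so a print placed after it does not run until the stream ends. One common way to get values out of a callback is a thread-safe queue. The sketch below uses only the standard library; FakeListener and fake_stream are hypothetical stand-ins for the real listener and stream, not tweepy classes.

```python
import queue
import threading

tweets = queue.Queue()


class FakeListener:
    """Hypothetical stand-in for the StreamListener subclass above."""

    def __init__(self, out_queue, user_id):
        self.out_queue = out_queue
        self.user_id = user_id

    def on_status(self, user_id, text):
        # Keep only tweets authored by the followed user.
        if user_id != self.user_id:
            return
        # Hand the tweet to the consumer instead of returning it.
        self.out_queue.put(text)


def fake_stream(listener):
    """Simulate the streaming thread delivering three statuses."""
    listener.on_status('11111111', 'first tweet')
    listener.on_status('22222222', 'reply from someone else')
    listener.on_status('11111111', 'second tweet')


worker = threading.Thread(
    target=fake_stream, args=(FakeListener(tweets, '11111111'),))
worker.start()
worker.join()

# Drain the queue; this is where an event-based trigger would consume tweets.
received = []
while not tweets.empty():
    received.append(tweets.get())
print(received)
```

With a real stream the consumer would typically call tweets.get() in a loop, blocking until the next tweet arrives, while the stream runs in its own thread.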
I am trying to fetch streaming data from the twitter-streaming-api using the tweepy library in python. However, even after a lot of trials I am not able to get any data or print it as done in the on_data method. It's giving the 420 error message. How can I avoid it?
import io
import json
import time
import tweepy
access_token = 'XXXXXX'
access_token_secret = 'XXXXXX'
consumer_key = 'XXXXXX'
consumer_secret = 'XXXXXX'
class MyListener(tweepy.StreamListener):
    def on_status(self, status):
        print(status.text)
    def on_data(self, tweetdata):
        data = json.loads(tweetdata)
        print(data)
    def on_error(self, status):
        print(status)
auth = tweepy.OAuthHandler(consumer_secret=consumer_secret,consumer_key=consumer_key)
auth.set_access_token(access_token,access_token_secret)
api = tweepy.API(auth)
myListener = MyListener()
myStream = tweepy.Stream(auth = api.auth, listener=myListener)
myStream.filter(languages='en',track=['#NBA'],async=True)
myStream.disconnect()
I am learning Python and started a few weeks ago. I have written code that checks the streaming API for tweets with a particular hashtag and replies to each tweet, provided I have not already replied to that handle. I have tried to stay within the rate limits so as not to get an error, but Twitter raises a duplicate-status error once in a while. I would like the code to keep running rather than stop when it hits a problem. Please help. The code is as follows:
import tweepy
from tweepy import Stream
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
import json
import time
consumer_key =
consumer_secret =
access_token =
access_secret =
auth = OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_secret)
def check(status):
    datafile = file('C:\Users\User\Desktop\Growth Handles.txt', 'r')
    found = False
    for line in datafile:
        if status.user.screen_name in line:
            found = True
            break
    return found
class MyListener(StreamListener):
    def on_status(self, status):
        f=status.user.screen_name
        if check(status) :
            pass
        else:
            Append=open('Growth Handles.txt' , 'a' )
            Append.write(f + "\n")
            Append.close()
            Reply='#' + f + ' Check out Tomorrowland 2014 Setlist . http://.... '
            api = tweepy.API(auth)
            api.update_status(status=Reply)
            time.sleep(45)
        return True
    def on_error(self, status):
        print(status)
        return True
twitter_stream = Stream(auth, MyListener())
twitter_stream.filter(track=['#musiclovers'])
If the update_status method throws an error:
try:
    api.update_status(status=Reply)
except:
    pass
If twitter_stream gets disconnected:
twitter_stream = Stream(auth, MyListener())
while True:
    twitter_stream.filter(track=['#musiclovers'])
Warning: your app may get banned if it reaches certain limits or if Twitter's systems catch you spamming. Check the Twitter Rules.
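A bare while True loop reconnects immediately after every failure, which can itself trip rate limits. A minimal sketch of a retry loop with exponential backoff follows, using only the standard library. The name run_with_backoff, its parameters, and the connect callable are all hypothetical; connect is assumed to block while the stream is healthy and to raise when the connection drops. The delay values are illustrative, not taken from Twitter's documentation.

```python
import time


def run_with_backoff(connect, max_attempts=5, base_delay=1.0,
                     max_delay=320.0, sleep=time.sleep):
    """Call `connect` until it returns normally or attempts run out.

    `connect` is a hypothetical zero-argument callable that blocks while
    the stream is healthy and raises an exception when it drops.
    Returns the list of delays (in seconds) that were slept.
    """
    delays = []
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            connect()
            return delays  # connect() returned normally, so stop retrying
        except Exception as exc:  # narrow this to the errors you expect
            if attempt == max_attempts:
                raise  # give up and surface the last error
            print("Stream dropped (%s); retrying in %.0f s" % (exc, delay))
            sleep(delay)
            delays.append(delay)
            # Double the wait each time, up to the cap.
            delay = min(delay * 2, max_delay)
    return delays
```

In the code above, connect could be something like lambda: twitter_stream.filter(track=['#musiclovers']). Passing sleep as a parameter keeps the function easy to test without real waiting.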
Using the code below, I'm trying to stream tweets that contain a hashtag. It works fine for popular hashtags like #StarWars, but when I ask for less popular ones it doesn't seem to return anything.
Any ideas?
'code' is used in place of the actual authentication strings.
from tweepy.streaming import StreamListener
from tweepy import OAuthHandler
from tweepy import Stream
from textwrap import TextWrapper
import json
access_token = "code"
access_token_secret = "code"
consumer_key = "code"
consumer_secret = "code"
class StdOutListener(StreamListener):
    ''' Handles data received from the stream. '''
    status_wrapper = TextWrapper(width=60, initial_indent=' ', subsequent_indent=' ')
    def on_status(self, status):
        try:
            print self.status_wrapper.fill(status.text)
            print '\n %s %s via %s\n' % (status.author.screen_name, status.created_at, status.source)
        except:
            # Catch any unicode errors while printing to console
            # and just ignore them to avoid breaking application.
            pass
    def on_error(self, status_code):
        print('Got an error with status code: ' + str(status_code))
        return True # To continue listening
    def on_timeout(self):
        print('Timeout...')
        return True # To continue listening
if __name__ == '__main__':
    listener = StdOutListener()
    auth = OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    stream = Stream(auth, listener)
    stream.filter(track=['#TestingPythonTweet'])
OK, so I found the answer: I was expecting it to work retroactively, which was a fundamental error on my part. What actually happens is that the stream delivers what is being tweeted right now, not what has been tweeted previously.
I need to develop an app that lets me track tweets and save them in a mongodb for a research project (as you might gather, I am a noob, so please bear with me). I have found this piece of code that sends tweets streaming through my terminal window:
import sys
import tweepy
consumer_key=""
consumer_secret=""
access_key = ""
access_secret = ""
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
class CustomStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        print status.text
    def on_error(self, status_code):
        print >> sys.stderr, 'Encountered error with status code:', status_code
        return True # Don't kill the stream
    def on_timeout(self):
        print >> sys.stderr, 'Timeout...'
        return True # Don't kill the stream
sapi = tweepy.streaming.Stream(auth, CustomStreamListener())
sapi.filter(track=['Gandolfini'])
Is there a way I can modify this piece of code so that instead of having tweets streaming over my screen, they are sent to my mongodb database?
Thanks
Here's an example:
import json
import pymongo
import tweepy
consumer_key = ""
consumer_secret = ""
access_key = ""
access_secret = ""
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)
class CustomStreamListener(tweepy.StreamListener):
    def __init__(self, api):
        # Name this class in super() so that StreamListener.__init__ runs
        # and stores the api object on self.api.
        super(CustomStreamListener, self).__init__(api)
        self.db = pymongo.MongoClient().test
    def on_data(self, tweet):
        # insert_one() replaces insert(), which was removed in PyMongo 4.
        self.db.tweets.insert_one(json.loads(tweet))
    def on_error(self, status_code):
        return True # Don't kill the stream
    def on_timeout(self):
        return True # Don't kill the stream
sapi = tweepy.streaming.Stream(auth, CustomStreamListener(api))
sapi.filter(track=['Gandolfini'])
This writes each tweet to the tweets collection of the MongoDB test database.
Hope that helps.
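One thing to watch for: the stream can deliver messages that are not tweets, such as limit and delete notices, and on_data receives those too. A small sketch of a filter you could run before inserting, using only the standard library. The name parse_tweet is hypothetical, and treating "has both id_str and text at the top level" as the definition of a tweet is an assumption you may want to adjust.

```python
import json


def parse_tweet(raw):
    """Return the decoded tweet dict, or None if `raw` is not a tweet.

    Assumption: a payload counts as a tweet only if it is a JSON object
    with top-level "id_str" and "text" keys.
    """
    try:
        message = json.loads(raw)
    except ValueError:  # also covers json.JSONDecodeError
        return None
    if not isinstance(message, dict):
        return None
    if "id_str" in message and "text" in message:
        return message
    return None
```

In on_data you would then insert only when parse_tweet(tweet) returns something other than None.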
I have developed a simple command line tool that does exactly this.
https://github.com/janezkranjc/twitter-tap
It allows using the streaming API or the search API.