i'm trying to tweepy to reply to certain tweets, and my reply includes an image.
the twt variable holds the tweet i'm trying to reply to.
here's what i'm doing at the moment:
# -*- coding: utf-8 -*-
import tweepy, time, random
CONSUMER_KEY = 'XXXX'
CONSUMER_SECRET = 'XXXX'
ACCESS_KEY = 'XXXX'
ACCESS_SECRET = 'XXXX'
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)
query = ['aaa', 'bbb', 'ccc']
t0 = time.time()
count = 0
last_count = 0
f = open('last_replied.txt')
last_replied = int(f.readline().strip())
f.close()
print('starting time:', time.strftime('%X'))
while True:
if count > last_count:
print(time.strftime('%X'), ':', count, 'replies')
last_count = count
for i in range(3):
twts = api.search(query[i], since_id=last_replied)
if len(twts)>0:
for twt in twts:
sid = twt.id
sn = twt.user.screen_name
stat = "lalala" + "#" + sn
api.update_with_media('oscar1.gif',status=stat,in_reply_to_status_id=sid)
count += 1
last_replied = twt.id
f = open('last_replied.txt','w')
f.write(str(last_replied))
f.close()
pause = random.randint(50,90)
time.sleep(pause)
my tweet gets posted, but not as a reply to the original tweet (twt). instead, it just gets posted as a new, independent, tweet.
however, when instead of update_with_media as above, i use update_status, such as:
api.update_status(status=stat,in_reply_to_status_id=sid)
my new tweet does get posted as a reply to the original tweet (twt).
what am i missing?
thanks
the solution i ended up with was switching to the twython module, that has the functionality well documented and working perfectly.
thank you very much for your help.
The update_with_media endpoint is being deprecated by Twitter (https://dev.twitter.com/rest/reference/post/statuses/update_with_media) you shouldn't use it. Instead upload the media first with the media_upload method and add the media_ids you get back to the update_status method.
I glanced at the code for update_with_media and don't see any obvious bugs, but the new method of handling tweets with media was added to the Tweepy API in January - note the last two bullets here:
https://github.com/tweepy/tweepy/releases/tag/v3.2.0
If you download Tweepy 3.2.0 you should be able to switch over to the new regimen, which will hopefully fix your reply_to problem. (I can't say for sure if the new stuff works, I'm using an older version of Tweepy myself.)
Related
i'm collecting tweets withe thier replies from Twitter's API to build data set and i'm using tweepy library in python for that,but the problem is that I get this error so much (Rate limit reached. Sleeping for:(any number for sec)) that delays me and I have to collect as many data as possible in the shortest time
I read that twitter has a rate limit of i think 15 requests per 15 minutes or something like that, but on my situation I can only gather a tweet or two tweet until it stops again and sometimes it stops for 15 minutes and then stop again for 15 minutes without giving me give me time between them, I don't know what caused the problem whether it is my code or not?
# Import the necessary package to process data in JSON format
try:
import json
except ImportError:
import simplejson as json
# Import the tweepy library
import tweepy
import sys
# Variables that contains the user credentials to access Twitter API
ACCESS_TOKEN = '-'
ACCESS_SECRET = '-'
CONSUMER_KEY = '-'
CONSUMER_SECRET = '-'
# Setup tweepy to authenticate with Twitter credentials:
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)
# Create the api to connect to twitter with your creadentials
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True, compression=True)
file2 = open('replies.csv','w', encoding='utf-8-sig')
replies=[]
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)
for full_tweets in tweepy.Cursor(api.search,q='#عربي',timeout=999999,tweet_mode='extended').items():
if (not full_tweets.retweeted) and ('RT #' not in full_tweets.full_text):
for tweet in tweepy.Cursor(api.search,q='to:'+full_tweets.user.screen_name,result_type='recent',timeout=999999,tweet_mode='extended').items(1000):
if hasattr(tweet, 'in_reply_to_status_id_str'):
if (tweet.in_reply_to_status_id_str==full_tweets.id_str):
replies.append(tweet.full_text)
print(full_tweets._json)
file2.write("{ 'id' : "+ full_tweets.id_str + "," +"'Replies' : ")
for elements in replies:
file2.write(elements.strip('\n')+" , ")
file2.write("}\n")
replies.clear()
file2.close()
$ python code.py > file.csv
Rate limit reached. Sleeping for: 262
Rate limit reached. Sleeping for: 853
Hoping this might help
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=False, compression=True)
Just add this line to the Python script to avoid the sleep:
sleep_on_rate_limit=False
i'm collecting tweets withe thier replies from Twitter's API to build data set and i'm using tweepy library in python for that,but the problem is that I get this error so much (Rate limit reached. Sleeping for:(any number for sec)) that delays me and I have to collect as many data as possible in the shortest time
I read that twitter has a rate limit of i think 15 requests per 15 minutes or something like that, but on my situation I can only gather a tweet or two tweet until it stops again and sometimes it stops for 15 minutes and then stop again for 15 minutes without giving me give me time between them, I don't know what caused the problem whether it is my code or not?
# Import the necessary package to process data in JSON format
try:
import json
except ImportError:
import simplejson as json
# Import the tweepy library
import tweepy
import sys
# Variables that contains the user credentials to access Twitter API
ACCESS_TOKEN = '-'
ACCESS_SECRET = '-'
CONSUMER_KEY = '-'
CONSUMER_SECRET = '-'
# Setup tweepy to authenticate with Twitter credentials:
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_SECRET)
# Create the api to connect to twitter with your creadentials
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True, compression=True)
file2 = open('replies.csv','w', encoding='utf-8-sig')
replies=[]
non_bmp_map = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)
for full_tweets in tweepy.Cursor(api.search,q='#عربي',timeout=999999,tweet_mode='extended').items():
if (not full_tweets.retweeted) and ('RT #' not in full_tweets.full_text):
for tweet in tweepy.Cursor(api.search,q='to:'+full_tweets.user.screen_name,result_type='recent',timeout=999999,tweet_mode='extended').items(1000):
if hasattr(tweet, 'in_reply_to_status_id_str'):
if (tweet.in_reply_to_status_id_str==full_tweets.id_str):
replies.append(tweet.full_text)
print(full_tweets._json)
file2.write("{ 'id' : "+ full_tweets.id_str + "," +"'Replies' : ")
for elements in replies:
file2.write(elements.strip('\n')+" , ")
file2.write("}\n")
replies.clear()
file2.close()
$ python code.py > file.csv
Rate limit reached. Sleeping for: 262
Rate limit reached. Sleeping for: 853
Hoping this might help
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=False, compression=True)
Just add this line to the Python script to avoid the sleep:
sleep_on_rate_limit=False
I am currently in the process of doing some research using sentiment analysis on twitter data regarding a certain topic (isn't necessarily important to this question) using python, of which I am a beginner at. I understand the twitter streaming API limits users to access only to the previous 7 days unless you apply for a full enterprise search which opens up the whole archive. I had recently been given access to the full archive for this research project from twitter but I am unable to specify a start and end date to the tweets I would like to stream into a csv file. This is my code:
import pandas as pd
from tweepy import Stream
from tweepy import OAuthHandler
from tweepy.streaming import StreamListener
ckey = 'xxxxxxxxxxxxxxxxxxxxxxx'
csecret = 'xxxxxxxxxxxxxxxxxxxxxxx'
atoken = 'xxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxx'
asecret = 'xxxxxxxxxxxxxxxxxxxxxxxxxxxx'
# =============================================================================
# def sentimentAnalysis(text):
# output = '0'
# return output
# =============================================================================
class listener(StreamListener):
def on_data(self, data):
tweet = data.split(',"text":"')[1].split('","source')[0]
saveMe = tweet+'::'+'\n'
output = open('output.csv','a')
output.write(saveMe)
output.close()
return True
def on_error(self, status):
print(status)
auth = OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
twitterStream = Stream(auth, listener())
twitterStream.filter(track=["#weather"], languages = ["en"])
Now this code streams twitter date from the past 7 days perfectly. I tried changing the bottom line to
twitterStream.filter(track=["#weather"], languages = ["en"], since = ["2016-06-01"])
but this returns this error :: filter() got an unexpected keyword argument 'since'.
What would be the correct way to filter by a given date frame?
The tweepy does not provide the "since" argument, as you can check yourself here.
To achieve the desired output, you will have to use the api.user_timeline, iterating through pages until the desired date is reached, Eg:
import tweepy
import datetime
# The consumer keys can be found on your application's Details
# page located at https://dev.twitter.com/apps (under "OAuth settings")
consumer_key=""
consumer_secret=""
# The access tokens can be found on your applications's Details
# page located at https://dev.twitter.com/apps (located
# under "Your access token")
access_token=""
access_token_secret=""
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)
page = 1
stop_loop = False
while not stop_loop:
tweets = api.user_timeline(username, page=page)
if not tweets:
break
for tweet in tweets:
if datetime.date(YEAR, MONTH, DAY) < tweet.created_at:
stop_loop = True
break
# Do the tweet process here
page+=1
time.sleep(500)
Note that you will need to update the code to fit your needs, this is just a general solution.
I am using tweepy and python 2.7.6 to return the tweets of a specified user
My code looks like:
import tweepy
ckey = 'myckey'
csecret = 'mycsecret'
atoken = 'myatoken'
asecret = 'myasecret'
auth = tweepy.OAuthHandler(ckey, csecret)
auth.set_access_token(atoken, asecret)
api = tweepy.API(auth)
stuff = api.user_timeline(screen_name = 'danieltosh', count = 100, include_rts = True)
print stuff
However this yields a set of messages which look like<tweepy.models.Status object at 0x7ff2ca3c1050>
Is it possible to print out useful information from these objects? where can I find all of their attributes?
Unfortunately, Status model is not really well documented in the tweepy docs.
user_timeline() method returns a list of Status object instances. You can explore the available properties and methods using dir(), or look at the actual implementation.
For example, from the source code you can see that there are author, user and other attributes:
for status in stuff:
print status.author, status.user
Or, you can print out the _json attribute value which contains the actual response of an API call:
for status in stuff:
print status._json
import tweepy
import tkinter
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
# set parser=tweepy.parsers.JSONParser() if you want a nice printed json response.
userID = "userid"
user = api.get_user(userID)
tweets = api.user_timeline(screen_name=userID,
# 200 is the maximum allowed count
count=200,
include_rts = False,
# Necessary to keep full_text
# otherwise only the first 140 words are extracted
tweet_mode = 'extended'
)
for info in tweets[:3]:
print("ID: {}".format(info.id))
print(info.created_at)
print(info.full_text)
print("\n")
Credit to https://fairyonice.github.io/extract-someones-tweet-using-tweepy.html
In Tweeter API v2 getting tweets of a specified user is fairly easy, provided that you won’t exceed the limit of 3200 tweets. See documentation for more info.
import tweepy
# create client object
tweepy.Client(
bearer_token=TWITTER_BEARER_TOKEN,
consumer_key=TWITTER_API_KEY,
consumer_secret=TWITTER_API_KEY_SECRET,
access_token=TWITTER_ACCESS_TOKEN,
access_token_secret=TWITTER_TOKEN_SECRET,
)
# retrieve first n=`max_results` tweets
tweets = client.get_users_tweets(id=user_id, **kwargs)
# retrieve using pagination until no tweets left
while True:
if not tweets.data:
break
tweets_list.extend(tweets.data)
if not tweets.meta.get('next_token'):
break
tweets = client.get_users_tweets(
id=user_id,
pagination_token=tweets.meta['next_token'],
**kwargs,
)
The tweets_list is going to be a list of tweepy.tweet.Tweet objects.
I'm writing a twitter bot using tweepy that will search for mentions to it and then implement actions based on the text in the tweet. Eventually I want to run it every few minutes via cron. I'm a python beginner, so forgive my ignorance.
My problem is preventing duplicates. I have a loop that goes through and tests whether a tweet is new by checking whether its id is greater than the previous tweet. However, I can't work out a way of initializing this variable, and then saving changes to it at the end of the loop.
Here is my current (broken) code:
import sys
import tweepy
## OAuth keys go here.
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)
def ask_bot():
old_id = 0
for tweet in api.mentions():
if tweet.id > old_id:
print "#%s: %s" % (tweet.author.screen_name, tweet.text)
old_id = tweet.id + 1
else:
pass
The desired behaviour at the end is for the loop to only print tweets that haven't been printed before.
I don't know much about Tweepy, but this may help:
import sys
import tweepy
## OAuth keys go here.
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)
seen_ids = []
def ask_bot():
global seen_ids
for tweet in api.mentions():
if tweet.id not in seen_ids:## Heading ##:
print "#%s: %s" % (tweet.author.screen_name, tweet.text)
seen_ids.append(tweet)
else:
pass
So, it will search through Twitter for all tweets aimed at it, then it will check if it has seen that ID before. The reason I used global is so changes would affect the main variable seen_ids, not a copy made inside of the function.
Good Luck!
I would just make a list of IDs that have been printed. Then you would simply check if the ID you're trying to print is already in the printed list. If it is, do nothing. If it isn't, print it and add it to the list.
In other words:
import sys
import tweepy
## OAuth keys go here.
auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)
printed_ids = []
def ask_bot():
old_id = 0
for tweet in api.mentions():
if tweet.id not in printed_ids:
print "#%s: %s" % (tweet.author.screen_name, tweet.text)
printed_ids.append(tweet.id)
else:
pass