Tweepy search_full_archive() missing 2 required positional arguments: 'label' and 'query' - python

I'm using Tweepy 3.10.0 to collect tweets containing specific keywords and hashtags for a single calendar day at a time. I recently upgraded from the standard Developer Account to the Premium Account to access the full archive. I know this changes the "search" function to "search_full_archive" and changes a couple other small syntax things. I thought I made the correct changes but I'm still getting this error. I've checked the Developer API reference.
import tweepy
import json

consumer_key = '****'
consumer_secret = '****'
access_token = '****'
access_token_secret = '****'

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)

def get_tweets_withHashTags(query, startdate, enddate, count=300):
    tweets_hlist = []
    tweets_list = []
    qt = str(query)
    for page in tweepy.Cursor(api.search_full_archive, environment_name='FullArchive', q=qt, fromDate=startdate, toDate=enddate, count=300, tweet_mode='extended').pages(100):
        count = len(page)
        print("Count of tweets in each page for " + str(qt) + " : " + str(count))
        for value in page:
            hashList = value._json["entities"]["hashtags"]
            flag = 0
            for tag in hashList:
                if qt.lower() in tag["text"].lower():
                    flag = 1
            if flag == 1:
                tweets_hlist.append(value._json)
            tweets_list.append(value._json)
    print("tweets_hash_" + query + ": " + str(len(tweets_hlist)))
    print("tweets_" + query + ": " + str(len(tweets_list)))
    with open("/Users/Victor/Documents/tweetCollection/data/" + startdate + "/" + "query1_hash_" + str(startdate) + "_" + str(enddate) + "_" + query + '.json', 'w') as outfile:
        json.dump(tweets_hlist, outfile, indent=2)
    with open("/Users/Victor/Documents/tweetCollection/data/" + startdate + "/" + "query1_Contains_" + str(startdate) + "_" + str(enddate) + "_" + query + '.json', 'w') as outfile:
        json.dump(tweets_list, outfile, indent=2)
    return len(tweets_list)

query = ["KeyWord1", "KeyWord2", "KeyWord3"]  # etc.
for value in query:
    get_tweets_withHashTags(value, "2020-04-21", "2020-04-22")

According to the API's source code (https://github.com/tweepy/tweepy/blob/5b2dd086c2c5a08c3bf7be54400adfd823d19ea1/tweepy/api.py#L1144), api.search_full_archive takes label (the environment name) and query as its arguments. So change

api.search_full_archive, environment_name='FullArchive', q=qt, fromDate=startdate, toDate=enddate, count=300, tweet_mode='extended'

to

api.search_full_archive, label='FullArchive', query=qt, fromDate=startdate, toDate=enddate

and the error should go away. As for tweet_mode='extended', it is not available for search_full_archive or search_30_day. You can see how to access the full text in https://github.com/tweepy/tweepy/issues/1461
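As a sketch of the approach discussed in that issue: the premium endpoints return an extended_tweet object on tweets longer than 140 characters, and its full_text field holds the untruncated text. The helper below works on the raw ._json dictionary and is an illustration, not part of Tweepy; the sample payloads are fabricated.

```python
def get_full_text(tweet_json):
    """Return the untruncated text of a tweet from premium search results.

    The premium endpoints don't accept tweet_mode='extended'; instead,
    tweets longer than 140 characters carry an 'extended_tweet' object
    whose 'full_text' field holds the whole text.
    """
    if "extended_tweet" in tweet_json:
        return tweet_json["extended_tweet"]["full_text"]
    return tweet_json.get("text", "")

# Minimal fabricated payloads for illustration:
short = {"text": "hello"}
long_ = {"text": "hello, truncat…",
         "extended_tweet": {"full_text": "hello, truncated no more"}}
```

Inside the loop above you would call get_full_text(value._json) instead of reading value._json["text"] directly.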

Related

Tweepy Error: Request exceeds account’s current package request limits

I'm using the Tweepy API to collect tweets containing specific keywords or hashtags through an Academic Research Developer Account. This allows me to collect 10,000,000 tweets per month. I'm trying to collect the tweets from one full calendar date at a time using the full archive search. I've gotten a rate limit error (despite the wait_on_rate_limit flag being set to true) and now this request limit error. I'm not sure why this is happening or what I should change at this point.
import tweepy
import json

consumer_key = '***'
consumer_secret = '***'
access_token = '***'
access_token_secret = '***'

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, wait_on_rate_limit=True)

def get_tweets_withHashTags(query, startdate, enddate, count=300):
    tweets_hlist = []
    tweets_list = []
    qt = str(query)
    for page in tweepy.Cursor(api.search_full_archive, label='myLabel', query=qt, fromDate=startdate + '0000', toDate=enddate + '0000', maxResults=100).pages(100):
        count = len(page)
        print("Count of tweets in each page for " + str(qt) + " : " + str(count))
        for value in page:
            hashList = value._json["entities"]["hashtags"]
            flag = 0
            for tag in hashList:
                if qt.lower() in tag["text"].lower():
                    flag = 1
            if flag == 1:
                tweets_hlist.append(value._json)
            tweets_list.append(value._json)
    print("tweets_hash_" + query + ": " + str(len(tweets_hlist)))
    print("tweets_" + query + ": " + str(len(tweets_list)))
    with open("/Users/Victor/Documents/tweetCollection/data/" + startdate + "/" + "query1_hash_" + str(startdate) + "_" + str(enddate) + "_" + query + '.json', 'w') as outfile:
        json.dump(tweets_hlist, outfile, indent=2)
    with open("/Users/Victor/Documents/tweetCollection/data/" + startdate + "/" + "query1_Contains_" + str(startdate) + "_" + str(enddate) + "_" + query + '.json', 'w') as outfile:
        json.dump(tweets_list, outfile, indent=2)
    return len(tweets_list)

query = ["keyword1", "keyword2", "keyword3"]  # etc.
for value in query:
    get_tweets_withHashTags(value, "20200422", "20200423")
At the time of writing this answer, Tweepy does not support version 2 of the Twitter API (although there is a Pull Request to add initial support). api.search_full_archive is actually hitting the v1.1 Premium Full Archive search, which has much lower request and Tweet cap volume limits, which is why you are seeing an error about exceeding your package's limits.
For the Full Archive / All Tweets endpoint in v2 of the API, you need a Project on the Academic Track, and an App in that Project. You'll need to point your code at the /2/tweets/search/all endpoint. If you are using Python, there is sample code in the TwitterDev repo. This uses the requests library, rather than a full API library like Tweepy.
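A minimal sketch of what pointing your code at the v2 endpoint involves (the endpoint path and parameter names are from the v2 API reference; the bearer token and query here are placeholders, and the function only assembles the request rather than sending it):

```python
SEARCH_ALL_URL = "https://api.twitter.com/2/tweets/search/all"

def build_search_all_request(bearer_token, query, start_time, end_time,
                             max_results=100):
    """Assemble the URL, headers, and parameters for a v2 full-archive
    search call; pass them to e.g. requests.get(url, headers=..., params=...)."""
    headers = {"Authorization": "Bearer " + bearer_token}
    params = {
        "query": query,
        "start_time": start_time,  # ISO 8601, e.g. "2020-04-21T00:00:00Z"
        "end_time": end_time,
        "max_results": max_results,
    }
    return SEARCH_ALL_URL, headers, params

url, headers, params = build_search_all_request(
    "<BEARER_TOKEN>", "keyword1 OR keyword2",
    "2020-04-21T00:00:00Z", "2020-04-22T00:00:00Z")
```

From there, requests.get(url, headers=headers, params=params) and paging with the next_token from the response's meta object follow the TwitterDev sample code.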

Python Bot with tweepy

I'm trying to code a Twitter bot using the tweepy library, but I'm not getting results. I need help with code that replies to tweets that mention me.
search = '#MoviesRandom'
numberOfTweets = 10
phrase = movies()  # A function I declared earlier; no errors here

for tweet in tweepy.Cursor(api.search, search).items(numberOfTweets):
    try:
        tweetId = tweet.user.idusername
        username = tweet.user.screen_name
        api.update_status("@" + username + " " + phrase, in_reply_to_status_id=tweetId)
        print("Replied with " + phrase)
    except tweepy.TweepError as e:
        print(e.reason)
It's likely caused by this line here:

tweetId = tweet.user.idusername

There is no such attribute as idusername, and as @Andy mentioned, it should just be the id attribute:

tweetId = tweet.user.id

(Strictly speaking, in_reply_to_status_id expects the ID of the tweet being replied to, so tweet.id rather than tweet.user.id is what makes the reply thread correctly.)
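The contract of the reply call can be sketched as a small pure helper (the dict shape here is an assumption for illustration, not a Tweepy type):

```python
def build_reply(tweet_id, screen_name, phrase):
    """Build the arguments for api.update_status(): the reply text must
    @-mention the original author, and in_reply_to_status_id must be
    the tweet's id (tweet.id), not the user's id."""
    return {
        "status": "@" + screen_name + " " + phrase,
        "in_reply_to_status_id": tweet_id,
    }

reply = build_reply(123456, "moviefan", "Watch this one!")
```

In the loop above, you would call api.update_status(**build_reply(tweet.id, tweet.user.screen_name, phrase)).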

Tweepy still not returning full text despite using extended text feature

I am using tweepy to download tweets about a particular topic, but no matter which tutorial I follow I cannot get the tweet to output as a full tweet. There is always an ellipsis that cuts it off after a certain number of characters.
Here is the code I am using
import json
import tweepy
from tweepy import OAuthHandler
import csv
import sys
from twython import Twython

nonBmpMap = dict.fromkeys(range(0x10000, sys.maxunicode + 1), 0xfffd)

with open('Twitter_Credentials.json') as cred_data:
    info = json.load(cred_data)
consumer_Key = info['Consumer_Key']
consumer_Secret = info['Consumer_Secret']
access_Key = info['Access_Key']
access_Secret = info['Access_Secret']

maxTweets = int(input('Enter the Number of tweets that you want to extract '))
userTopic = input('What topic do you want to search for ')
topic = ('"' + userTopic + '"')
tweetCount = 0

auth = OAuthHandler(consumer_Key, consumer_Secret)
auth.set_access_token(access_Key, access_Secret)
api = tweepy.API(auth, wait_on_rate_limit=True)

tweets = api.search(q=topic, count=maxTweets, tweet_mode='extended')
for tweet in tweets:
    tweetCount = tweetCount + 1
    with open('TweetsAbout' + userTopic, 'a', encoding='utf-8') as the_File:
        print(tweet.full_text.translate(nonBmpMap))
        tweet = (str(tweet.full_text).translate(nonBmpMap).replace(',', '').replace('|', '').replace('\n', '').replace('’', '\'').replace('…', "end"))
        the_File.write(tweet + "\n")
print('Extracted ' + str(tweetCount) + ' tweets about ' + topic)
Try this, see if it works!

try:
    specific_tweets = tweepy.Cursor(api.search, tweet_mode='extended', q=<your_query_string> + " -filter:retweets", lang='en').items(500)
except tweepy.error.TweepError:
    pass

for tweet in specific_tweets:
    extracted_text = tweet.full_text

All the text you're trying to extract should be in extracted_text. Good luck!
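One gotcha that the -filter:retweets above sidesteps: for retweets, the top-level full_text is still truncated ("RT @user: …"), and the whole text lives on retweeted_status. A small helper operating on a status's raw JSON dict, as an illustration with fabricated payloads:

```python
def full_text_of(status_json):
    """Return untruncated text for standard-search results fetched with
    tweet_mode='extended'. For retweets the top-level full_text is
    still truncated; the whole text lives under retweeted_status."""
    rt = status_json.get("retweeted_status")
    if rt is not None:
        return rt["full_text"]
    return status_json["full_text"]

original = {"full_text": "a short tweet"}
retweet = {"full_text": "RT @someone: a very long twe…",
           "retweeted_status": {"full_text": "a very long tweet indeed"}}
```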

How can I python twitter crawling (scraping) several keyword

I wrote the code below, but I don't think it's working. I want to extract tweets matching keyword1 or keyword2, not keyword1 and keyword2, yet it seems like only 'keyword1' is being extracted. How do I fix this?
import tweepy
import time
import os

search_term = 'keyword1'
search_term2 = 'keyword2'
lat = "37.6"
lon = "127.0"
radius = "200km"
location = "%s,%s,%s" % (lat, lon, radius)

API_key = "11111"
API_secret = "22222"
Access_token = "33333"
Access_token_secret = "444"

auth = tweepy.OAuthHandler(API_key, API_secret)
auth.set_access_token(Access_token, Access_token_secret)
api = tweepy.API(auth)

c = tweepy.Cursor(api.search,
                  q=(search_term or search_term2),
                  rpp=1000,
                  geocode=location,
                  include_entities=True)

data = {}
i = 1
for tweet in c.items():
    data['text'] = tweet.text
    print(i, ":", data)
    i += 1
    time.sleep(1)

wfile = open(os.getcwd() + "/twtw2.txt", mode='w')
data = {}
i = 0
for tweet in c.items():
    data['text'] = tweet.text
    wfile.write(data['text'] + '\n')
    i += 1
    time.sleep(1)
wfile.close()
Maybe change this line

q=(search_term or search_term2),

to

q="{} OR {}".format(search_term, search_term2),

Two things to note: case matters for the OR operator (it must be uppercase), and q has to be entered as a string, not as a Python expression. (search_term or search_term2) is short-circuit evaluated to just search_term, which is why only keyword1 was being matched.
By the way, your credentials (from your post) also work for me, so you should revoke and regenerate them.
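For more than two keywords the same idea generalizes; a plain-Python sketch (the function name is mine, and the returned string is what you would pass as q):

```python
def or_query(terms):
    """Join search terms with Twitter's OR operator (must be uppercase);
    multi-word terms are quoted so they match as phrases."""
    quoted = ['"%s"' % t if " " in t else t for t in terms]
    return " OR ".join(quoted)

q = or_query(["keyword1", "keyword2", "new movie"])
```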

request empty result issue

I have this simple Python code, which returns the content of a URL and stores the result as a JSON text file named "file", but it keeps returning an empty result. What am I doing wrong here? It is just a simple piece of code; I am so disappointed. I have included all the imports needed: import facebook, import requests, and import json.
url = "https://graph.facebook.com/search?limit=5000&type=page&q=%26&access_token=xx&__after_id=139433456868"
content = requests.get(url).json()
file = open("file.txt", 'w')
file.write(json.dumps(content, indent=1))
file.close()

It keeps returning an empty result; here is what I get:

"data": []

Any help, please?
It's working fine for me:

import urllib2
accesstoken = "****"
urllib2.urlopen("https://graph.facebook.com/search?limit=5000&type=page&q=%26&access_token=" + accesstoken + "&__after_id=139433456868").read()
I think you have not requested an access token before making the request.
How do you find the access token?

def getSecretToken(verification_code):
    token_url = ("https://graph.facebook.com/oauth/access_token?" +
                 "client_id=" + app_id +
                 "&redirect_uri=" + my_url +
                 "&client_secret=" + app_secret +
                 "&code=" + verification_code)
    response = requests.get(token_url).content
    params = {}
    result = response.split("&")
    print result
    for p in result:
        (k, v) = p.split("=")
        params[k] = v
    return params['access_token']
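The manual split-and-loop above can also be done with the standard library's query-string parser. A Python 3 sketch, assuming the classic access_token=…&expires=… response body (the sample value is fabricated):

```python
from urllib.parse import parse_qs

def extract_access_token(response_body):
    """Parse a classic OAuth token response like
    'access_token=abc123&expires=5184000' and return the token."""
    params = parse_qs(response_body)
    return params["access_token"][0]

token = extract_access_token("access_token=abc123&expires=5184000")
```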
How do you get that verification code?

verification_code = ""
if "code" in request.query:
    verification_code = request.query["code"]
if not verification_code:
    dialog_url = ("http://www.facebook.com/dialog/oauth?" +
                  "client_id=" + app_id +
                  "&redirect_uri=" + my_url +
                  "&scope=publish_stream")
    return "<script>top.location.href='" + dialog_url + "'</script>"
else:
    access_token = getSecretToken(verification_code)
