I'm trying to fetch tweets from multiple Twitter accounts and then build a database with the tweets and the source of each tweet (the user name), using the following code:
posts = api.user_timeline(screen_name='AlArabiya_Brk', count=100, lang="ar", tweet_mode="extended")
df = pd.DataFrame([tweet.full_text for tweet in posts], columns=['Tweets'])
but I have a question: how can I add more than one account? I tried doing:
posts = api.user_timeline(screen_name=['AlArabiya_Brk', 'AJABreaking'], count=100, lang="ar", tweet_mode="extended")
but I didn't get the desired output.
That API endpoint only accepts a single screen name, so you'll need to make a separate call for each account.
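Since the endpoint takes one screen name per call, a simple loop can merge the results and record which account each tweet came from. This is a sketch assuming `api` is an authenticated `tweepy.API` instance; the `Source` column name is my choice, not from the original code:

```python
import pandas as pd

def fetch_timelines(api, accounts, count=100):
    # One user_timeline call per account, tagging each tweet with its source.
    rows = []
    for name in accounts:
        posts = api.user_timeline(screen_name=name, count=count,
                                  lang="ar", tweet_mode="extended")
        for tweet in posts:
            rows.append({'Tweets': tweet.full_text, 'Source': name})
    return pd.DataFrame(rows, columns=['Tweets', 'Source'])

# df = fetch_timelines(api, ['AlArabiya_Brk', 'AJABreaking'])
```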
I'm trying to figure out how to filter for a specific country's tweets using search_recent_tweets. I take a country name as input, use pycountry to get the 2-character country code, and then try to apply a location filter either in my query string or in the search_recent_tweets parameters. Nothing I have tried so far has worked.
import tweepy
from tweepy import OAuthHandler
from tweepy import API
import pycountry as pyc
import json

# bearer token
BEARER_TOKEN = 'XXXXXXXXX'

# get client
client = tweepy.Client(bearer_token=BEARER_TOKEN)

# take user input
countryQuery = input("Find recent tweets about travel in a certain country (input country name): ")
keyword = 'women safe'  # gets tweets containing women and safe for that country (safe will catch safety)

# get country code to plug in as a param in search_recent_tweets
country_code = str(pyc.countries.search_fuzzy(countryQuery)[0].alpha_2)

# get 100 recent tweets containing the keywords and from location = countryQuery
query = str(keyword + ' place_country=' + str(countryQuery) + ' -is:retweet')  # search for keyword and no retweets
posts = client.search_recent_tweets(query=query, max_results=100, tweet_fields=['id', 'text', 'entities', 'author_id'])
# expansions=geo.place_id, place.fields=[country_code],
# filter posts to remove retweets

# export tweets to json
with open('twitter.json', 'w') as fp:
    for tweet in posts.data:
        json.dump(tweet.data, fp)
        fp.write('\n')
        print("* " + str(tweet.text))
I have tried variations of:
query = str(keyword+' -is:retweet') # search for keyword and no retweets
posts = client.search_recent_tweets(query=query, place_fields=[str(countryQuery), country_code], max_results=100, tweet_fields=['id', 'text', 'entities', 'author_id'])
and:
query = str(keyword+' place.fields='+str(countryQuery)+','+country_code+' -is:retweet') # search for keyword and no retweets
posts = client.search_recent_tweets(query=query, max_results=100, tweet_fields=['id', 'text', 'entities', 'author_id'])
These either returned no tweets at all (NoneType data) or raised an error like:
"The place.fields query parameter value [Germany] is not one of [contained_within,country,country_code,full_name,geo,id,name,place_type]"
The documentation for search_recent_tweets suggests that place.fields / place_fields / place_country should be supported.
Any advice would help!!!
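For what it's worth, in the v2 search syntax `place_country` is a query *operator*, written with a colon inside the query string itself, not a request parameter; `place_fields` only controls what the response contains. A minimal sketch of building such a query (the helper name is mine; note that, as far as I know, geo operators like `place_country:` are restricted to higher access tiers on recent search, which could also explain the empty results):

```python
def build_country_query(keyword, country_code):
    # place_country: is a query operator, so it belongs inside the query
    # string itself; -is:retweet drops retweets as in the original code.
    return f'{keyword} place_country:{country_code} -is:retweet'

# Hypothetical usage with the tweepy.Client from above (not run here):
# posts = client.search_recent_tweets(
#     query=build_country_query(keyword, country_code),
#     max_results=100,
#     tweet_fields=['id', 'text', 'entities', 'author_id'],
#     expansions=['geo.place_id'],    # needed to hydrate place data
#     place_fields=['country_code'],  # shapes the response, not the filter
# )
```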
I want to search the whole timeline of a specific user for a particular query. I tried setting .items() to the total number of tweets in the user's profile, but I'm not getting any results back. Any idea how to solve this issue?
username = "No_Rumors"
t = " from:No_Rumors"
z = "أمين رابطة العالم الإسلامي يقرع جرس الكنيسة النصرانية"
no_of_tweets = api.get_user(screen_name=username).statuses_count
query = z + t
t = tweepy.Cursor(api.search, q=query).items(no_of_tweets)
for tweet in t:
    print(tweet.text)
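One thing worth checking first: the standard search endpoint only indexes roughly the last 7 days of tweets, so `.items(no_of_tweets)` can never reach a user's full history no matter how large the count is. The query itself can be assembled with the `from:` operator; a small sketch (the helper name is mine):

```python
def build_user_query(text, username):
    # Combine free text with the from: operator so only the given
    # user's tweets are matched.
    return f'{text} from:{username}'

# Hypothetical usage with the Cursor from the question (not run here):
# query = build_user_query(z, username)
# for tweet in tweepy.Cursor(api.search, q=query).items(no_of_tweets):
#     print(tweet.text)
```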
I'm trying to extract a dataset using Tweepy. I have a set of tweet IDs that I use to fetch the full text of each tweet. I loop over the IDs with Tweepy calls to get the tweet texts, but my program keeps crashing because a few of the tweet IDs in my list belong to suspended accounts.
This is the related code snippet I'm using:
# creating a DataFrame using pandas
db = pd.DataFrame(columns=['username', 'description', 'location', 'following',
                           'followers', 'totaltweets', 'retweetcount', 'text', 'hashtags'])

# reading tweet IDs from file
df = pd.read_excel('dataid.xlsx')
mylist = df['tweet_id'].tolist()

# tweet counter
n = 1

# loop to extract tweets
for i in mylist:
    tweets = api.get_status(i, tweet_mode="extended")
    username = tweets.user.screen_name
    description = tweets.user.description
    location = tweets.user.location
    following = tweets.user.friends_count
    followers = tweets.user.followers_count
    totaltweets = tweets.user.statuses_count
    retweetcount = tweets.retweet_count
    text = tweets.full_text
    hashtext = list()
    ith_tweet = [username, description, location, following, followers, totaltweets, retweetcount, text, hashtext]
    db.loc[len(db)] = ith_tweet
    n = n + 1

filename = 'scraped_tweets.csv'
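One way to keep the loop alive when an ID belongs to a suspended or deleted account is to wrap the lookup in try/except and skip failures. A sketch (in practice you would catch `tweepy.TweepyException`, or `tweepy.TweepError` on older Tweepy versions, rather than a bare `Exception`):

```python
def safe_get_status(api, tweet_id):
    # Return the tweet, or None when the lookup fails (suspended account,
    # deleted tweet, protected user, ...), so one bad ID doesn't crash
    # the whole run.
    try:
        return api.get_status(tweet_id, tweet_mode="extended")
    except Exception as exc:
        print(f"skipping tweet {tweet_id}: {exc}")
        return None

# Hypothetical use inside the loop from the question:
# for i in mylist:
#     tweets = safe_get_status(api, i)
#     if tweets is None:
#         continue
#     ...  # build ith_tweet as before
```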
Novice programmer here seeking help. I have a list of hashtags for which I want to get all the historical tweets from 01-01-2015 to 31-12-2018.
I tried the Tweepy library, but it only allows access to the last 7 days of tweets. I also tried GetOldTweets, which does give access to historical tweets, but it kept crashing. So I have now acquired premium API access for Twitter, which includes full-archive search. To run my query with the premium API I cannot use the Tweepy library (it does not support the premium APIs, right?), and my choices are TwitterAPI and Search-Tweets.
1- Do TwitterAPI and Search-Tweets supply the user name, user location, whether the user is verified, the language of the tweet, the source of the tweet, the retweet and favourite counts, and the date of each tweet (as Tweepy does)? I could not find any information about this.
2- Can I supply a time span in my query?
3- How do I do all of this?
This was my code for the Tweepy library:
hashtags = ["#AAPL", "#FB", "#KO", "#ABT", "#PEPCO", ...]

df = pd.DataFrame(columns=["Hashtag", "Tweets", "User", "User_Followers",
                           "User_Location", "User_Verified", "User_Lang", "User_Status",
                           "User_Method", "Fav_Count", "RT_Count", "Tweet_date"])

def tweepy_df(df, tags):
    for cash in tags:
        i = len(df) + 1
        for tweet in tweepy.Cursor(api.search, q=cash, since="2015-01-01", until="2018-12-31").items():
            print(i, end='\r')
            df.loc[i, "Hashtag"] = cash
            df.loc[i, "Tweets"] = tweet.text
            df.loc[i, "User"] = tweet.user.name
            df.loc[i, "User_Followers"] = tweet.user.followers_count  # followers_count lives on the user object
            df.loc[i, "User_Location"] = tweet.user.location
            df.loc[i, "User_Verified"] = tweet.user.verified
            df.loc[i, "User_Lang"] = tweet.lang
            df.loc[i, "User_Status"] = tweet.user.statuses_count
            df.loc[i, "User_Method"] = tweet.source
            df.loc[i, "Fav_Count"] = tweet.favorite_count
            df.loc[i, "RT_Count"] = tweet.retweet_count
            df.loc[i, "Tweet_date"] = tweet.created_at
            i += 1
    return df
How do I adapt this for, say, the TwitterAPI library? I know it should be adapted to something like this:
for tweet in api.request('search/tweets', {'q': cash}):
But it is still missing the desired timespan, and I'm not sure the attribute names match the ones this library uses.
Using TwitterAPI, you can make Premium Search requests this way:
from TwitterAPI import TwitterAPI

SEARCH_TERM = '#AAPL OR #FB OR #KO OR #ABT OR #PEPCO'
PRODUCT = 'fullarchive'
LABEL = 'your label'

api = TwitterAPI('consumer key', 'consumer secret', 'access token key', 'access token secret')
r = api.request('tweets/search/%s/:%s' % (PRODUCT, LABEL), {'query': SEARCH_TERM})
for item in r:
    if 'text' in item:
        print(item['text'])
        print(item['user']['name'])
        print(item['user']['followers_count'])  # followers_count is nested under 'user'
        print(item['user']['location'])
        print(item['user']['verified'])
        print(item['lang'])
        print(item['user']['statuses_count'])
        print(item['source'])
        print(item['favorite_count'])
        print(item['retweet_count'])
        print(item['created_at'])
The Premium search doc explains the supported request arguments. To do a date range use this:
r = api.request('tweets/search/%s/:%s' % (PRODUCT, LABEL),
                {'query': SEARCH_TERM, 'fromDate': '201501010000', 'toDate': '201812310000'})
I am trying to go through a list of tweets related to a specific search term and extract all the hashtags, building a Python list that contains them. I started by using Twython as follows:
from twython import Twython
api_key = 'xxxx'
api_secret = 'xxxx'
acces_token = 'xxxx'
ak_secret = 'xxxx'
t = Twython(app_key=api_key, app_secret=api_secret, oauth_token=acces_token, oauth_token_secret=ak_secret)

search = t.search(q='Python', count=10)
tweets = search['statuses']

hashtags = []
for tweet in tweets:
    b = (tweet['text'], "\n")
    if b.startswith('#'):
        hastags.append(b)
It doesn't seem to be working. I get the error:
'tuple' object has no attribute 'startswith'
I am not sure whether I am meant to build a list of all the statuses first and then extract from it, or whether it is okay to proceed without building the list of statuses first.
Thank you
That is correct: strings have a startswith method, but tuples do not, and (tweet['text'], "\n") builds a two-element tuple rather than a string. Change the last three lines to this:
b = tweet['text']
if b.startswith("#"):
    hashtags.append(b)
If you really want that line break, then it would be:
b = tweet['text'] + "\n"
if b.startswith("#"):
    hashtags.append(b)
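As an aside, if the goal is every hashtag in each tweet rather than only tweets that *begin* with one, the search response already carries parsed hashtags under each status's `entities` key, which avoids string matching entirely. A sketch assuming the same response shape as `t.search` above:

```python
def extract_hashtags(statuses):
    # Each status dict carries entities['hashtags'], a list of dicts
    # whose 'text' key holds the tag without the leading '#'.
    tags = []
    for tweet in statuses:
        for h in tweet.get('entities', {}).get('hashtags', []):
            tags.append('#' + h['text'])
    return tags

# hashtags = extract_hashtags(search['statuses'])
```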