Context
I am working on a topic-modeling project for Twitter.
The idea is to retrieve all tweets related to a specific country and analyze them in order to discover what people from a specific country are talking about on Twitter.
What I have tried
1. First solution
I know that the Twitter streaming API or a cursor can retrieve tweets from a specific country, so I wrote the following code to get all tweets within the bounding-box coordinates of a country:
def get_tweets(query_fname, auth, max_time, location=None):
    stop = datetime.now() + max_time
    twitter_stream = Stream(auth, CustomListener(query_fname))
    while datetime.now() < stop:
        if location:
            twitter_stream.filter(locations=[11.94,-13.64,30.54,5.19], is_async=True)
        else:
            twitter_stream.filter(track=query, is_async=True)
Problems with this approach
Not everyone allows Twitter to access their location details, so with this approach I only get a few tweets, something like 300 for my location.
Some people are not in the country but tweet about it, and people within the country reply to them. Their tweets are not captured by this approach.
2. Second solution
Another approach was to collect tweets with hashtags related to a country, using a cursor.
I tried this code:
def query_tweet(client, query=[], max_tweets=2000, country=None):
    """
    query tweets using the query list pass in parameter
    """
    query = ' OR '.join(query)
    name = 'by_hashtags_'
    now = datetime.now()
    today = now.strftime("%d-%m-%Y-%H-%M")
    with open('data/query_drc_{}_{}.jsonl'.format(name, today), 'w') as f:
        for status in Cursor(
                client.search,
                q=query,
                include_rts=True).items(max_tweets):
            f.write(json.dumps(status._json) + "\n")
Problem
This approach gives more results than the first one but, as you may notice, not everyone uses those hashtags to tweet about the country.
3. Third approach
I have tried to retrieve the tweets using the place ID specific to a country, but it has the same problem as the first approach.
My questions
How can I retrieve all tweets about a specific country? I mean everything people are tweeting about a specific country, with or without country-specific hashtags.
Hint: For people who are not located in the country, it may be a good idea to get their tweets if people within the country replied to or retweeted them.
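A rough sketch of that hint, only as an illustration: it assumes the tweets from in-country users are already loaded as v1.1-style dicts (the same shape as status._json written above), and referenced_tweet_ids is a made-up helper name. It gathers the IDs of tweets that those users replied to or retweeted, which could then be looked up in a second pass.

```python
def referenced_tweet_ids(tweets):
    """Collect IDs of tweets that the given tweets reply to or retweet.

    `tweets` is assumed to be an iterable of v1.1-style tweet dicts,
    e.g. the lines of the JSONL file parsed with json.loads.
    """
    ids = set()
    for tweet in tweets:
        # Replies carry the ID of the tweet they answer.
        reply_to = tweet.get("in_reply_to_status_id")
        if reply_to is not None:
            ids.add(reply_to)
        # Retweets embed the original tweet under "retweeted_status".
        original = tweet.get("retweeted_status")
        if original:
            ids.add(original["id"])
    return ids


# Made-up sample data, for illustration only.
sample = [
    {"id": 1, "in_reply_to_status_id": 100},
    {"id": 2, "in_reply_to_status_id": None, "retweeted_status": {"id": 200}},
    {"id": 3, "in_reply_to_status_id": None},
]
print(sorted(referenced_tweet_ids(sample)))
```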
Regards.
Related
I am having trouble obtaining friends_count and favorites_count using the search_all_tweets Tweepy V2 API call.
GeeksForGeeks lists friends_count and favorites_count as attributes (https://www.geeksforgeeks.org/python-user-object-in-tweepy/). Unfortunately, the last 2 lines of code raise an AttributeError (raise AttributeError from None).
user.public_metrics only consists of followers_count, following_count, tweet_count, and listed_count.
user.entities only consists of extraneous URL data.
Code is shown below:
client = tweepy.Client(bearer_token=config.BEARER_TOKEN,
                       consumer_key=config.CONSUMER_KEY,
                       consumer_secret=config.CONSUMER_SECRET,
                       access_token=config.ACCESS_TOKEN,
                       access_token_secret=config.ACCESS_TOKEN_SECRET)
for response in tweepy.Paginator(client.search_all_tweets, query=s,
        tweet_fields=['context_annotations', 'created_at', 'public_metrics', 'author_id', 'lang', 'geo', 'entities'],
        user_fields=['username', 'entities', 'public_metrics', 'location', 'verified', 'description'],
        max_results=100, expansions='author_id'):
    for user in response.includes["users"]:
        print(user.public_metrics)
        print(user.entities)
        print(user.friends_count)
        print(user.favorites_count)
The fields listed by GeeksForGeeks are the User's fields in the Twitter V1 API.
There is unfortunately no way to get the number of likes of a user with the Twitter V2 API. You can try to get all of their likes and count the returned tweets, but that only works if the user has just a few likes (and it will consume your monthly Tweet cap).
Also, friends was the previous name for the accounts a user follows, so the equivalent of friends_count in the Twitter V2 API is following_count. If you are looking for mutuals, you have to get the full list of followers and the full list of accounts the user follows, then count the common elements.
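As a rough illustration of the mutuals count, assuming the follower and following IDs have already been fetched into two lists (the function name and the sample IDs are made up for the example):

```python
def count_mutuals(follower_ids, following_ids):
    """Return how many accounts appear in both ID collections."""
    return len(set(follower_ids) & set(following_ids))


# Made-up IDs, for illustration only.
follower_ids = [11, 12, 13, 14]
following_ids = [13, 14, 15]
print(count_mutuals(follower_ids, following_ids))  # 2
```

In the loop from the question, the V2 counterpart of friends_count would be read as user.public_metrics["following_count"].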
Finally, I would advise you to use the Twitter API documentation (here for User objects).
I am trying to get the number of tweets containing a hashtag (let's say "#kitten") in Python.
I am using tweepy.
However, all the code I have found is in this form:
query = "kitten"
for i, status in enumerate(tweepy.Cursor(api.search, q=query).items(50)):
    print(i, status)
I get this error: 'API' object has no attribute 'search'
Tweepy no longer seems to contain this attribute. Is there any way to solve my problem?
Sorry for my bad English.
After browsing the web and the Twitter documentation, I found the answer.
If you want the full history of tweet counts since 2006, you need Academic access. That is not my case, so I can only get counts for the last 7 days, which is enough for me. Here is the code:
import tweepy
query = "kitten -is:retweet"
client = tweepy.Client(bearer_token)
counts = client.get_recent_tweets_count(query=query, granularity='day')
for i in counts.data:
    print(i["tweet_count"])
The "-is:retweet" operator excludes retweets from the count. Remove it if you want to count them.
Since we are only pulling the volume of tweets, not the tweets themselves, this does not increase the monthly Tweet cap usage.
Be careful when using symbols such as "$" in your query, as they might give you an error. For a list of valid operators, see: list of valid operators for query
As stated in the Twitter counts introduction, you only need "read only" authorization to perform a recent count request (see Recent Tweet counts).
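To get a single number for the whole 7-day window, the daily buckets can be added up. A minimal sketch, assuming each entry of counts.data is a dict with a "tweet_count" key as in the loop above; total_tweet_count is a made-up helper name and the sample buckets are invented:

```python
def total_tweet_count(buckets):
    """Sum the per-bucket "tweet_count" values of a counts response."""
    return sum(bucket["tweet_count"] for bucket in buckets)


# Made-up buckets shaped like the entries of counts.data.
sample_buckets = [
    {"tweet_count": 120},
    {"tweet_count": 95},
    {"tweet_count": 143},
]
print(total_tweet_count(sample_buckets))  # 358
```

With the real response this would be called as total_tweet_count(counts.data).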
I'm trying to get the last tweet from a Twitter account using tweepy in Python.
I know there are a few similar answers, for example this one: Get the last tweet with tweepy
However, my issue is that I don't get the last tweet from a person's timeline but the second-to-last tweet.
I'm using this code in a while loop:
tweetL = api.user_timeline(screen_name='elonmusk', tweet_mode="extended", exclude_replies=True, count=1)
print(tweetL[0].full_text)
If I run this (at the time of writing), I get this tweet from Elon:
What a beautiful day in LA
However, looking at his timeline, his last tweet was this:
Warm, sunny day & snowy mountains
So why am I not getting the last tweet?
Strangely enough, when I ran this script last night it did print out his last tweet.
Running it now, I get the same tweet that it printed out yesterday as the last tweet.
If I run the above code like this (without exclude_replies):
tweetL = api.user_timeline(screen_name='elonmusk', tweet_mode="extended")
print(tweetL[0].full_text)
I get this as his last tweet:
#ErcXspace #smvllstvrs T/W will be ~1.5, so it will accelerate unusually fast. High T/W is important for reusable vehicles to make more efficient use of propellant, the primary cost. For expendable rockets, throwing away stages is the primary cost, so optimization is low T/W.
which was his last reply, so this works.
I just can't fetch the last actual tweet from his timeline.
As Iain Shelvington mentioned in the comments, exclude_replies will also ignore replies to oneself.
I don't think there's a direct way of getting the last tweet in a user's timeline. You could create a function that, from the retrieved tweets, gets the first one that:
a) is not a reply, i.e., in_reply_to_screen_name is None,
b) or is a reply to the user themselves, i.e., in_reply_to_screen_name == screen_name.
This could look something like:
def get_last_timeline_tweet(screen_name: str, tweets: list):
    for tw in tweets:
        if (tw.in_reply_to_screen_name is None or
                tw.in_reply_to_screen_name == screen_name):
            return tw
    return None
Then, running:
last_timeline_tweet = get_last_timeline_tweet('elonmusk', tweetL).full_text
print(last_timeline_tweet)
You get:
Warm, sunny day & snowy mountains [url to the photo]
This can also be done in a one-liner with:
screen_name = 'elonmusk'
last_tweet = next((tw for tw in tweetL if tw.in_reply_to_screen_name is None
                   or tw.in_reply_to_screen_name == screen_name), None)
print(last_tweet.full_text)
Note: it should be checked that last_tweet is not None before getting its full_text.
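A minimal sketch of that check, using stand-in objects instead of real Tweepy statuses (the sample tweets and the last_timeline_text name are made up for the example):

```python
from types import SimpleNamespace


def last_timeline_text(screen_name, tweets):
    """Return the text of the newest non-reply (or self-reply) tweet, or None."""
    last_tweet = next(
        (tw for tw in tweets
         if tw.in_reply_to_screen_name is None
         or tw.in_reply_to_screen_name == screen_name),
        None,
    )
    # Guard against the case where every retrieved tweet replies to someone else.
    if last_tweet is None:
        return None
    return last_tweet.full_text


# Stand-ins for Tweepy Status objects, newest first.
replies_only = [SimpleNamespace(in_reply_to_screen_name="someone", full_text="a reply")]
mixed = replies_only + [SimpleNamespace(in_reply_to_screen_name=None, full_text="a tweet")]

print(last_timeline_text("elonmusk", replies_only))  # None
print(last_timeline_text("elonmusk", mixed))         # a tweet
```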
I am trying to find the retweets of my tweets using tweepy:
# To get first tweet
firstTweet = api.user_timeline("zzaibis")[0]
# then getting the retweet data using tweet id and it gives me the results
resultsOfFirstTweet = api.retweets(firstTweet.id)
# but when I try to find any other tweet except first tweet, this returns nothing.
secondTweet = api.user_timeline("zzaibis")[1]
Any idea why this does not work beyond the first tweet? Also, what do I need to do to get all the retweet data for my tweets, considering that the number of tweets and retweets in my account is not small?
To find the number of retweets a user gets on each tweet from their timeline, try the following code:
for tweet in api.user_timeline(screen_name = 'StackOverflow', count = 10):
    print(tweet.retweet_count)
To change the user being searched, set "screen_name" to the username of the person you want. To change the number of statuses it searches, change "count".
You can also print information such as the ID of each tweet using:
print(tweet.id)
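To go from the counts to the retweets themselves, one option is to keep only the tweets whose retweet_count is non-zero and request the retweets for those IDs. A rough sketch with stand-in objects; retweeted_ids is a made-up helper name and the sample timeline is invented:

```python
from types import SimpleNamespace


def retweeted_ids(tweets):
    """Return the IDs of tweets that have been retweeted at least once."""
    return [tweet.id for tweet in tweets if tweet.retweet_count > 0]


# Stand-ins for the Status objects returned by api.user_timeline.
timeline = [
    SimpleNamespace(id=101, retweet_count=3),
    SimpleNamespace(id=102, retweet_count=0),
    SimpleNamespace(id=103, retweet_count=1),
]
print(retweeted_ids(timeline))  # [101, 103]
```

Each returned ID could then be passed to api.retweets, as in the question.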
I'm a nub when it comes to python. I literally just started today and have little understanding of programming. I have managed to make the following code work:
from twitter import *
config = {}
execfile("config.py", config)
twitter = Twitter(
    auth = OAuth(config["access_key"], config["access_secret"],
                 config["consumer_key"], config["consumer_secret"]))
user = "skiftetse"
results = twitter.statuses.user_timeline(screen_name = user)
for status in results:
    print "(%s) %s" % (status["created_at"], status["text"].encode("ascii",
                                                                    "ignore"))
The problem is that it's only printing 20 results. The Twitter page I'd like to get data from has 22k posts, so something is wrong with the last line of code.
I would really appreciate help with this! I'm doing this for a research sentiment analysis, so I need several hundred tweets to analyze. Beyond that, it would be great if retweets and the number of times each post was retweeted were included. I need to get better at all this, but right now I just need to meet the deadline at the end of the month.
You need to understand how the Twitter API works, specifically the user_timeline documentation.
By default, a request will only return 20 Tweets. If you want more, you will need to set the count parameter to, say, 50.
e.g.
results = twitter.statuses.user_timeline(screen_name = user, count = 50)
Note, count:
Specifies the number of tweets to try and retrieve, up to a maximum of 200 per distinct request.
In addition, the API will only let you retrieve the most recent 3,200 Tweets.
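Putting these limits together, more than 200 tweets can be collected by paging backwards with the max_id parameter. The following is a sketch rather than a tested client: fetch_page is a hypothetical callable standing in for a call such as twitter.statuses.user_timeline(screen_name=user, count=..., max_id=...), and the fake timeline exists only to make the example runnable.

```python
def fetch_timeline(fetch_page, limit=3200, page_size=200):
    """Collect up to `limit` tweets by paging backwards with max_id.

    `fetch_page(count, max_id)` must return tweets (dicts with an "id"),
    newest first, with IDs <= max_id (or the newest ones if max_id is None).
    """
    tweets = []
    max_id = None
    while len(tweets) < limit:
        page = fetch_page(count=min(page_size, limit - len(tweets)), max_id=max_id)
        if not page:
            break  # no older tweets left
        tweets.extend(page)
        # Next, ask for tweets strictly older than the oldest one seen so far.
        max_id = page[-1]["id"] - 1
    return tweets


# Fake timeline of 450 tweets (IDs 450 down to 1), for illustration only.
FAKE = [{"id": i} for i in range(450, 0, -1)]


def fake_fetch_page(count, max_id=None):
    older = [t for t in FAKE if max_id is None or t["id"] <= max_id]
    return older[:count]


all_tweets = fetch_timeline(fake_fetch_page)
print(len(all_tweets))  # 450
```

A real fetch_page would need to leave out max_id on the first call, when it is still None.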