Want to get twitter data using tweepy but in trouble

I am trying to retrieve Twitter data using Tweepy with the code below, but I'm having difficulty collecting the media_fields data. In particular, I want to get the media type, but I have not managed to.
As you can see below, a value is copied into cells that should be empty.
[enter image description here][1]
import tweepy
from twitter_authentication import bearer_token
import time
import pandas as pd
client = tweepy.Client(bearer_token, wait_on_rate_limit=True)
hoax_tweets = []
for response in tweepy.Paginator(client.search_all_tweets,
                                 query = 'Covid hoax -is:retweet lang:en',
                                 user_fields = ['username', 'public_metrics', 'description', 'location','verified','entities'],
                                 tweet_fields=['id', 'in_reply_to_user_id', 'referenced_tweets', 'context_annotations',
                                               'source', 'created_at', 'entities', 'geo', 'withheld', 'public_metrics',
                                               'text'],
                                 media_fields=['media_key', 'type', 'url', 'alt_text',
                                               'public_metrics','preview_image_url'],
                                 expansions=['author_id', 'in_reply_to_user_id', 'geo.place_id',
                                             'attachments.media_keys','referenced_tweets.id','referenced_tweets.id.author_id'],
                                 place_fields=['id', 'name', 'country_code', 'place_type', 'full_name', 'country',
                                               'geo', 'contained_within'],
                                 start_time = '2021-01-20T00:00:00Z',
                                 end_time = '2021-01-21T00:00:00Z',
                                 max_results=100):
    time.sleep(1)
    hoax_tweets.append(response)
result = []
user_dict = {}
media_dict = {}
# Loop through each response object
for response in hoax_tweets:
    # Take all of the users, and put them into a dictionary of dictionaries with the info we want to keep
    for user in response.includes['users']:
        user_dict[user.id] = {'username': user.username,
                              'followers': user.public_metrics['followers_count'],
                              'tweets': user.public_metrics['tweet_count'],
                              'description': user.description,
                              'location': user.location,
                              'verified': user.verified
                              }
    for media in response.includes['media']:
        media_dict[tweet.id] = {'media_key':media.media_key,
                                'type':media.type
                                }
    for tweet in response.data:
        # For each tweet, find the author's information
        author_info = user_dict[tweet.author_id]
        # Put all of the information we want to keep in a single dictionary for each tweet
        result.append({'author_id': tweet.author_id,
                       'username': author_info['username'],
                       'author_followers': author_info['followers'],
                       'author_tweets': author_info['tweets'],
                       'author_description': author_info['description'],
                       'author_location': author_info['location'],
                       'author_verified':author_info['verified'],
                       'tweet_id': tweet.id,
                       'text': tweet.text,
                       'created_at': tweet.created_at,
                       'retweets': tweet.public_metrics['retweet_count'],
                       'replies': tweet.public_metrics['reply_count'],
                       'likes': tweet.public_metrics['like_count'],
                       'quote_count': tweet.public_metrics['quote_count'],
                       'in_reply_to_user_id':tweet.in_reply_to_user_id,
                       'media':tweet.attachments,
                       'media_type': media,
                       'conversation':tweet.referenced_tweets
                       })
# Change this list of dictionaries into a dataframe
df = pd.DataFrame(result)
Also, when I change 'media': tweet.attachments to 'media': tweet.attachments[0] to get the 'media_key' data, I get the following error message: "TypeError: 'NoneType' object is not subscriptable".
What am I doing wrong? Any suggestions would be appreciated.
[1]: https://i.stack.imgur.com/AxCcl.png

The error occurs because tweet.attachments is None for some tweets, which is where the 'NoneType' part of the message comes from. To make it work, add a check for None:
'media': tweet.attachments[0] if tweet.attachments else None
I have never used the Twitter API, so it is worth checking whether a tweet's attachments are always present or may be absent.
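To get the media type per tweet, one option is to index the expanded media objects by media_key and look each tweet's keys up in that index. The sketch below is not tested against the live API. It assumes that tweet.attachments, when present, is a dict shaped like {'media_keys': [...]} (the Twitter API v2 payload shape), in which case indexing it with [0] would not work either. The helper names build_media_lookup and media_types_for are made up for illustration, and SimpleNamespace objects stand in for Tweepy's Tweet and Media objects.

```python
from types import SimpleNamespace


def build_media_lookup(media_items):
    """Map each media_key to its media type."""
    return {m.media_key: m.type for m in media_items}


def media_types_for(tweet, media_lookup):
    """Return the media types attached to one tweet (empty list if none)."""
    attachments = tweet.attachments or {}  # None when the tweet has no attachments
    keys = attachments.get("media_keys", [])
    return [media_lookup.get(k) for k in keys]


# Stand-in objects that mimic the assumed shape of the API objects.
media = [SimpleNamespace(media_key="3_1", type="photo"),
         SimpleNamespace(media_key="7_2", type="video")]
tweets = [SimpleNamespace(id=1, attachments={"media_keys": ["3_1", "7_2"]}),
          SimpleNamespace(id=2, attachments=None)]

lookup = build_media_lookup(media)
rows = [{"tweet_id": t.id, "media_type": media_types_for(t, lookup)} for t in tweets]
print(rows)
```

In the question's loop this would mean building the lookup from response.includes.get('media', []) for each response, since a page without any media may not contain a 'media' entry at all, and then calling media_types_for(tweet, lookup) inside the tweet loop instead of reusing the leftover media loop variable.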

Related

How can I get the status ID using tweepy?

I am using an academic account to retrieve tweet information, but I don't know how to get the status_id. I thought the conversation_id would be the same as the status_id, but when I trace it back, it apparently is not. What should I add to tweet_fields?
for response in tweepy.Paginator(client.search_all_tweets,
                                 query = 'query -is:retweet lang:en',
                                 user_fields = ['username', 'public_metrics', 'description', 'location'],
                                 tweet_fields = ['created_at', 'geo', 'public_metrics', 'text','id','conversation_id'],
                                 expansions = ['author_id', 'geo.place_id'],
                                 start_time = ['2020-01-01T00:00:00Z'],
                                 end_time = ['2020-12-12T00:00:00Z']):
    time.sleep(1)
    tweets.append(response)
result

You've already got it: "id" is the status ID.
Tweets are the basic atomic building block of all things Twitter.
Tweets are also known as “status updates.” The Tweet object has a long
list of ‘root-level’ attributes, including fundamental attributes such
as id, created_at, and text
https://developer.twitter.com/en/docs/twitter-api/v1/data-dictionary/object-model/tweet
It may be a bit confusing because references to that id are labeled things like "in_reply_to_status_id", but there is no field called "status_id". It's just id.
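As a quick way to check that the id really is the status ID, you can build the tweet's web address from it and open it in a browser. A minimal sketch; status_url is a made-up helper name, and the username would come from the expanded author data:

```python
def status_url(username: str, tweet_id: int) -> str:
    """Build the public web URL of a tweet from its author's handle and its id."""
    return f"https://twitter.com/{username}/status/{tweet_id}"


print(status_url("jack", 20))
```

If the page that opens is the tweet you expected, then id is the value you were looking for.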

How to get replies or quotes on a specific tweet

I am getting the tweets and the corresponding user id in an object obj. I want to know why I don't get the other information, such as conversation_id. I want to use it to get the replies and the quotes. That is the solution I found on the internet, but I don't know how to make it work.
Does anyone know how to extract the conversation_id or other parameters such as geo.place_id? I am using tweepy, but a solution that gets the same result with another library would also be helpful. Thanks for your help!
You can try the code if you create another file, config, and define your tokens in it. I can't share mine for security reasons.
import tweepy
import config
users_name = ['derspiegel', 'zeitonline']
tweet_tab = []
def getClient():
    client = tweepy.Client(bearer_token=config.BEARER_TOKEN,
                           consumer_key=config.API_KEY,
                           consumer_secret=config.API_KEY_SECRET,
                           access_token=config.ACCESS_TOKEN,
                           access_token_secret=config.ACCESS_TOKEN_SECRET)
    # Return the client so callers can use it
    return client
def searchTweets(client):
    for i in users_name:
        client = getClient()
        user = client.get_user(username=i)
        userId = user.data.id
        tweets = client.get_users_tweets(userId,
                                         expansions=[
                                             'author_id', 'referenced_tweets.id', 'referenced_tweets.id.author_id',
                                             'in_reply_to_user_id', 'attachments.media_keys', 'entities.mentions.username', 'geo.place_id'],
                                         tweet_fields=[
                                             'id', 'text', 'author_id', 'created_at', 'conversation_id', 'entities',
                                             'public_metrics', 'referenced_tweets'
                                         ],
                                         user_fields=[
                                             'id', 'name', 'username', 'created_at', 'description', 'public_metrics',
                                             'verified'
                                         ],
                                         place_fields=['full_name', 'id'],
                                         media_fields=['type', 'url', 'alt_text', 'public_metrics'])
        if not tweets is None and len(tweets) > 0:
            obj = {}
            obj['id'] = userId
            obj['text'] = tweets
            tweet_tab.append(obj)
    return tweet_tab
searchTweets(getClient())
print("tableau final", tweet_tab)

My guess is that you need to put the user ids into a list that the function can iterate over. Create the id list and try:
def get_tweets_from_timelines():
    tweets_timelines_list = []
    # ids is the list of user id strings created beforehand
    for one_id in ids:
        # Paginator yields one response per page of results
        for response in tweepy.Paginator(client.get_users_tweets, id=one_id, max_results=100,
                                         tweet_fields=['attachments', 'author_id', 'context_annotations', 'created_at', 'entities',
                                                       'conversation_id', 'possibly_sensitive', 'public_metrics', 'referenced_tweets',
                                                       'reply_settings', 'source', 'withheld'],
                                         user_fields=['created_at', 'description', 'entities', 'profile_image_url', 'protected',
                                                      'public_metrics', 'url', 'verified', 'withheld'],
                                         expansions=['referenced_tweets.id', 'in_reply_to_user_id', 'attachments.media_keys'],
                                         media_fields=['preview_image_url'],
                                         ):
            tweets_timelines_list.append(response)
    return tweets_timelines_list
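On the original question of why conversation_id does not show up: as far as I can tell, printing a Tweepy response only shows each tweet's id and text, while the other requested fields are available as attributes on the individual tweet objects in response.data (for example tweet.conversation_id). The sketch below assumes exactly that, and uses SimpleNamespace stand-ins instead of real Tweet objects. It collects the conversation ids and turns each one into a search query with the conversation_id: operator of the Twitter API v2, which matches the replies in that thread. The helper names are made up for illustration, and quotes are not covered.

```python
from types import SimpleNamespace


def conversation_ids(tweets):
    """Return the distinct conversation ids of the given tweets, in first-seen order."""
    seen = {}
    for tweet in tweets:
        cid = getattr(tweet, "conversation_id", None)
        if cid is not None:
            seen.setdefault(cid, None)
    return list(seen)


def reply_query(conversation_id):
    """Build a search query that matches tweets belonging to one conversation."""
    return f"conversation_id:{conversation_id}"


# Stand-ins for the tweet objects found in response.data.
sample = [SimpleNamespace(id=11, conversation_id=10),
          SimpleNamespace(id=12, conversation_id=10),
          SimpleNamespace(id=20, conversation_id=20)]

queries = [reply_query(cid) for cid in conversation_ids(sample)]
print(queries)
```

Each query string could then be passed as the query argument of a search call such as client.search_recent_tweets, subject to the access level of your account.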

Cannot access custom column values in To-do tasks via MS Graph API using Python

I have created custom columns "VESSEL NAME", "VOYAGE NUMBER", "ETD" and "CUT-OFF" in my Outlook To-do tasks, as shown in the picture below.
Outlook tasks snapshot
I need to access values in those columns via MS Graph API, but have had no luck so far.
Not sure if I am moving in the right direction, but I have added an openTypeExtension named "ZZZ" to my task as a test. I can retrieve it via the 'GET' method, but cannot locate it anywhere in Outlook hoping to find it amongst custom columns or other task fields.
Here is the Python code:
# In[1]:
import json
import requests
# In[2]:
token = json.load(open('ms_graph_state.jsonc'))["access_token"]
header = {'Authorization':'Bearer '+token}
header1 = {'Authorization':'Bearer '+token,'Content-Type':'application/json'}
base_url = 'https://graph.microsoft.com/v1.0/me/'
# In[3]:
task_list_id = requests.get(base_url+'todo/lists/',headers=header).json()['value'][1]['id']
task_list = base_url+'todo/lists/'+task_list_id
task_id = requests.get(task_list+'/tasks/',headers=header).json()['value'][0]['id']
# In[4]:
payload = {"@odata.type" : "microsoft.graph.openTypeExtension","extensionName" : "ZZZ","xxx" : "yyy"}
# In[5]:
create_oe = requests.post(task_list+'/tasks/'+task_id+'/extensions',headers=header1,json=payload).json()
# In[6]:
oe = requests.get(task_list+'/tasks/'+task_id+'/extensions/ZZZ',headers=header1).json()
oe
'''
Output:
{'@odata.context': "https://graph.microsoft.com/v1.0/$metadata#users('to-do-app%40outlook.co.nz')/todo/lists('AQMkADAwATZiZmYAZC0xNDM3LTZlYmMtMDACLTAwCgAuAAADtVcV-o2b90KtdxZu_nQLmgEA2HIj8QQFbES8Q4ESBpmcmgAAAgESAAAA')/tasks('AQMkADAwATZiZmYAZC0xNDM3LTZlYmMtMDACLTAwCgBGAAADtVcV-o2b90KtdxZu_nQLmgcA2HIj8QQFbES8Q4ESBpmcmgAAAgESAAAA2HIj8QQFbES8Q4ESBpmcmgAAAUeYHQAAAA%3D%3D')/extensions/$entity",
'extensionName': 'ZZZ',
'id': 'microsoft.graph.openTypeExtension.ZZZ',
'xxx': 'yyy'}
'''
# In[7]:
task = requests.get(task_list+'/tasks/'+task_id,headers=header).json()
task
'''
Output:
{'@odata.context': "https://graph.microsoft.com/v1.0/$metadata#users('to-do-app%40outlook.co.nz')/todo/lists('AQMkADAwATZiZmYAZC0xNDM3LTZlYmMtMDACLTAwCgAuAAADtVcV-o2b90KtdxZu_nQLmgEA2HIj8QQFbES8Q4ESBpmcmgAAAgESAAAA')/tasks/$entity",
'@odata.etag': 'W/"2HIj8QQFbES8Q4ESBpmcmgAAAa4dUQ=="',
'importance': 'normal',
'isReminderOn': False,
'status': 'notStarted',
'title': 'test-to-do-task',
'createdDateTime': '2021-08-14T20:14:22.5557165Z',
'lastModifiedDateTime': '2021-08-17T06:46:46.260686Z',
'id': 'AQMkADAwATZiZmYAZC0xNDM3LTZlYmMtMDACLTAwCgBGAAADtVcV-o2b90KtdxZu_nQLmgcA2HIj8QQFbES8Q4ESBpmcmgAAAgESAAAA2HIj8QQFbES8Q4ESBpmcmgAAAUeYHQAAAA==',
'body': {'content': '\r\n\r\n', 'contentType': 'text'}}
'''
I would appreciate your help on this.
Thank you

AFAIK, this is currently not supported. That said, consider filing a UserVoice request for your specific scenario so that it can be considered for future implementation.
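For anyone repeating the question's test, the extension request can be wrapped in two small helpers. This is only a sketch of the request shape used in the question (an open extension keyed by "@odata.type" and "extensionName"). It does not make the values appear as custom columns in Outlook, and the helper names are made up for illustration.

```python
def open_extension_payload(extension_name, **properties):
    """Build the JSON body for creating an open extension with custom properties."""
    payload = {
        "@odata.type": "microsoft.graph.openTypeExtension",
        "extensionName": extension_name,
    }
    payload.update(properties)
    return payload


def task_extensions_url(base_url, list_id, task_id):
    """Build the extensions URL of one To Do task, following the question's URL layout."""
    return f"{base_url}todo/lists/{list_id}/tasks/{task_id}/extensions"


payload = open_extension_payload("ZZZ", xxx="yyy")
url = task_extensions_url("https://graph.microsoft.com/v1.0/me/", "LIST_ID", "TASK_ID")
print(payload)
print(url)
```

LIST_ID and TASK_ID are placeholders. In the question they come from the earlier GET requests, and the payload is sent with requests.post(url, headers=header1, json=payload).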

Putting several tweets in dataframe

I am trying to download the last 10 tweets from BarackObama. However, when I try to put them into a dataframe, it only contains the 10th tweet (so only 1). Does someone know how to solve this problem? I first tried the top part of the code with print instead of data, and then I got all 10 tweets, so I don't know where it goes wrong. I also don't get an error message.
user = 'BarackObama'
posts = tweepy.Cursor(api.user_timeline, screen_name=user,).items(10)
for status in posts:
    if status.lang == 'en':
        data = {'User': [status.user.name],
                'Account name': ['@'+status.user.screen_name],
                'Tweet': [status.text],
                'Time': [status.created_at],
                'Nr of retweets': [status.retweet_count],
                'Nr of favorited': [status.favorite_count]}
df = pd.DataFrame(data)
df.head()

It seems you have to collect the tweets in a list first, and then put that list into the DataFrame. In your loop, data is overwritten on every iteration, so only the last tweet is left when the DataFrame is created:
user = 'BarackObama'
posts = tweepy.Cursor(api.user_timeline, screen_name=user,).items(10)
tweets = []
for status in posts:
    if status.lang == 'en':
        # One dict per tweet; plain values (not lists) so each tweet becomes one row
        data = {'User': status.user.name,
                'Account name': '@'+status.user.screen_name,
                'Tweet': status.text,
                'Time': status.created_at,
                'Nr of retweets': status.retweet_count,
                'Nr of favorited': status.favorite_count}
        tweets.append(data)
df = pd.DataFrame(tweets)
df.head()
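The accumulate-then-convert pattern can be tried without Twitter credentials. The sketch below uses SimpleNamespace stand-ins for the status objects and a made-up helper collect_rows. It keeps only two of the columns to stay short.

```python
from types import SimpleNamespace


def collect_rows(statuses):
    """Return one dict per English-language status, in the order received."""
    rows = []
    for status in statuses:
        if status.lang != "en":
            continue
        # Append inside the loop so earlier rows are not overwritten.
        rows.append({"Tweet": status.text,
                     "Nr of retweets": status.retweet_count})
    return rows


# Stand-ins for the objects yielded by the cursor.
statuses = [SimpleNamespace(lang="en", text="first", retweet_count=3),
            SimpleNamespace(lang="de", text="zweite", retweet_count=1),
            SimpleNamespace(lang="en", text="third", retweet_count=0)]

rows = collect_rows(statuses)
print(len(rows), rows[0]["Tweet"], rows[-1]["Tweet"])
```

Passing rows to pd.DataFrame(rows) then gives one row per collected tweet, with the dict keys as column names.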

How to get biddingStrategyConfiguration in Adwords API?

I'm trying to retrieve the field 'biddingStrategyConfiguration' via the AdWords API for Python (3) using CampaignService(), but I always get a weird error. It's weird because the field does exist, as mentioned in the documentation.
account_id = 'any_id'
adwords = Adwords(account_id) # classes and objects already created, etc.
def get_bidding_strategy():
    service = adwords.client.GetService('CampaignService', version = 'v201806')
    selector = {
        'fields': ['Id', 'Name', 'Status', 'biddingStrategyConfiguration']
    }
    results = service.get(selector)
    data = []
    if 'entries' in results:
        for item in results['entries']:
            if item['status'] == 'ENABLED':
                data.append({
                    'id': item['id'],
                    'name': item['name'],
                    'status': item['status'] # i have to retrieve biddingStrategyConfiguration.biddingStrategyName (next line)
                })
    return results
This is the error:
Error summary:
{'faultMessage': "[SelectorError.INVALID_FIELD_NAME @ serviceSelector; trigger:'biddingStrategyConfiguration']",
'requestId': '000581286e61247e0a376ac776062df4',
'serviceName': 'CampaignService',
'methodName': 'get',
'operations': '1',
'responseTime': '315'}
Notice that fields like "id" or "name" are easily retrievable, but the bidding configuration is not. In fact, I'm looking for the id/name of the biddingStrategies using .biddingStrategyID or .biddingStrategyName.
Can anyone help me? Thanks in advance.

How I solved it: biddingStrategyConfiguration is not a retrievable selector field, but biddingStrategyName is (it is part of the returned JSON).
account_id = 'any_id'
adwords = Adwords(account_id) # classes and objects already created, etc.
def get_bidding_strategy():
    service = adwords.client.GetService('CampaignService', version = 'v201806')
    selector = {
        'fields': ['Id', 'Name', 'Status', 'biddingStrategyName']
    }
    results = service.get(selector)
