Get conversation ID with Tweepy

Basically, I want to get the conversation_id when a Tweet is a reply to another Tweet, so that I can collect the replies in each conversation and analyze them.
My code:
from tweepy import StreamingClient
class Listener(StreamingClient):
    def on_response(self, response):
        print(response)
listener = Listener(auth['bearer_token'])
listener.sample(expansions=['in_reply_to_user_id'], tweet_fields=['conversation_id'])
When using this, I only get the user_id to which it is replying, but I cannot get any type of conversation_id.
I have a slight feeling I am missing something essential.

From the relevant FAQ section about this in Tweepy's documentation:
If you are simply printing the objects and looking at that output, the string representations of API v2 models/objects only include the default fields that are guaranteed to exist.
The objects themselves still include the relevant data, which you can access as attributes or by subscription.
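To illustrate the point from the FAQ, here is a minimal sketch using a hypothetical stand-in class (FakeTweet is not part of Tweepy; it only imitates the described behavior). The printed form shows just the default fields, while a requested field such as conversation_id is still reachable as an attribute or by subscription. In the listener above, the Tweet is presumably available as response.data, so response.data.conversation_id would be the analogous access.

```python
class FakeTweet:
    """Hypothetical stand-in that imitates the behavior described above."""

    def __init__(self, data):
        self.data = data  # the full payload, including requested fields

    def __getattr__(self, name):
        # Invoked only when normal lookup fails, so fall back to the payload.
        try:
            return self.data[name]
        except KeyError:
            raise AttributeError(name) from None

    def __getitem__(self, key):
        return self.data[key]

    def __repr__(self):
        # Only the default fields appear in the printed form.
        return f"<Tweet id={self.data['id']} text={self.data['text']!r}>"


tweet = FakeTweet({"id": 2, "text": "a reply", "conversation_id": 1})
print(tweet)                     # conversation_id is not shown here
print(tweet.conversation_id)     # attribute access
print(tweet["conversation_id"])  # subscription
```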

Related

Scopus search for a DOI and retrieve authors

I'm trying to get the authors of a publication using Scopus. I got an API key and started. I searched for the DOI and got a response. Everything is fine, and there is also an entry "authors", but for each request this field is simply empty. My Python code is below:
import json
import pyscopus
key = 'XXXXXXXXXXXXXXXXXXXXXXXXXXXX'
doi = '10.1016/0270-0255(87)90003-0'
scopus = pyscopus.Scopus(key)
response_json = json.loads(scopus.search(f'doi({doi})', view='STANDARD').to_json(orient="records"))
So, as I said, you can call response_json['authors'], but it is always empty. The authors are listed on the website, but web scraping is forbidden. Am I doing something wrong, or do they simply not provide this information (which is confusing, since there is a field)? So far I couldn't find an answer.
I know there are other ways, like Crossref, to get this information, but for reasons I want to do it with Scopus.
Thanks!

Tweepy: about accessing the "id" of a user after a Pagination response

I'm really stuck on this one.
I'm using Tweepy to get the IDs of all users that liked a specific tweet. I seem to get a list of "User" structures that contain "id", "name" and "username", but I'm not able to get only the "id".
The code is simple:
client = tweepy.Client(
    bearer_token=bearer_token,
    consumer_key=api_key, consumer_secret=api_secret,
    access_token=user_token, access_token_secret=user_token_secret,
    wait_on_rate_limit=True
)
for response in tweepy.Paginator(client.get_liking_users, id=tweetid, max_results=100, limit=10):
    for item in response:
        print("ITEM:\n", item)
        if item is not None:
            for user in item:
                if user is not None:
                    print(user)
The print of "item" gets me this (simplified, of course; the number of structures is high, that's why I have to use Paginator):
[<User id=0000001 name=user1 username=UserName1>, <User id=0002 name=user2 username=UserName2>, <User id=000003 name=user3 username=UserName3>]
and the print of "user" just gets me the individual usernames: "UserName1", etc.
But I found no way to get user.id, user.User.id, or anything similar. I'm frustrated, because the information is right there, but I can't access it.
Thank you!

Tweepy documentation provides an example of something very similar to what you want to do: https://docs.tweepy.org/en/stable/examples.html -> API v2 -> Get Tweet’s Liking Users
import tweepy
bearer_token = ""
client = tweepy.Client(bearer_token)
# Get Tweet's Liking Users
# This endpoint/method allows you to get information about a Tweet’s liking
# users
tweet_id = 1460323737035677698
# By default, only the ID, name, and username fields of each user will be
# returned
# Additional fields can be retrieved using the user_fields parameter
response = client.get_liking_users(tweet_id, user_fields=["profile_image_url"])
for user in response.data:
    print(user.username, user.profile_image_url)
This example prints the user's username and profile image URL, but note the comment says the id is also returned, so something like user.id should work. Otherwise, you can also add id to user_fields to make sure it's returned, although that shouldn't be necessary.
Unfortunately, I am not able to test it myself because I don't have a Twitter developer account with the required elevated access.
Edit: I got access to an API account with elevated access and I was able to test your code, see the update below
Iterating paginated results
You need a double for loop, and the code eventually crashes after showing some results with an error about accessing a non-existent id attribute on a str object, because you are not iterating the Paginator results correctly.
For the sake of simplicity, I'm going to label your three nested for loops:
loop 0: for response in tweepy.Paginator(...
loop 1: for item in response
loop 2: for user in item
Paginator yields a Response object for each page, with the results in the data attribute. The object has other fields too: includes, errors, and meta.
When you do loop 1, you are iterating over all of these fields of Response, not just data.
If the field being iterated happens to be data, loop 2 iterates the results and produces the output you expect.
But loop 1 also visits the other Response fields outside of data.
Let's see, for example, what happens when loop 1 enters the meta attribute.
meta is a dictionary that looks like this:
meta={'result_count': 80, 'next_token': '676f9b7bumw8i3jbm4nnifamw2ejjaktp8kjym6akdak9'}
When you enter loop 2 with the meta attribute, it iterates over the keys (not the values, because that is how dicts work in Python), so the value of user in loop 2 will be either result_count or next_token. That is when you get the error saying you are trying to access id on a str.
What you should do instead is iterate response.data in loop 1, which also removes the need for a second loop:
for response in tweepy.Paginator(client.get_liking_users, id=tweetid, max_results=100, limit=10):
    for user in response.data:
        print(user.id)
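The behavior described in this answer can be reproduced offline with a small sketch. It assumes a stand-in named tuple with the field names data, includes, errors, and meta (defined here for illustration rather than imported from Tweepy), and it uses plain strings in place of User objects.

```python
from collections import namedtuple

# Stand-in for the response object; the field names are an assumption
# made for this illustration.
Response = namedtuple("Response", ("data", "includes", "errors", "meta"))

response = Response(
    data=["user1", "user2"],
    includes={},
    errors=[],
    meta={"result_count": 2, "next_token": "abc"},
)

# Iterating the tuple itself yields each field in turn, not the users.
fields = list(response)

# Iterating the meta dict yields its keys, which are plain strings.
meta_items = list(response.meta)

# Iterating response.data yields the results themselves.
users = list(response.data)

print(fields)
print(meta_items)
print(users)
```

This is why the inner loop in the question ends up holding strings such as result_count, which have no id attribute.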

Tweepy api.get_status().in_reply_to_status_id_str returns None

No matter what tweet I use, myTweet.text returns the text of the tweet I want, but myTweet.in_reply_to_status_id_str returns None even if there are replies to the tweet.
Am I using it wrong? This is the only way I could find online to get the replies to a tweet. Thank you in advance for your help. The code:
api = tweepy.API(auth)
myTweet = api.get_status('id')
print(myTweet)
print(myTweet.text)
print('---', myTweet.in_reply_to_status_id_str)

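The in_reply_to_status_id_str attribute will contain the string representation of the original Tweet’s ID if the represented Tweet is a reply. It is therefore None for any Tweet that is not itself a reply, no matter how many replies that Tweet has received. See https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object.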
Twitter's API does not have a direct method/endpoint to retrieve replies to a specific Tweet.
Instead, you'll have to use the search APIs and filter the results manually.
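A minimal sketch of the manual filtering step, assuming you already have a list of candidate statuses (for example, from a search for Tweets addressed to the original author). The helper name replies_to and the sample objects are hypothetical; the only attribute relied on is in_reply_to_status_id_str.

```python
from types import SimpleNamespace


def replies_to(statuses, tweet_id):
    """Keep only the statuses that are direct replies to tweet_id."""
    target = str(tweet_id)
    return [s for s in statuses if s.in_reply_to_status_id_str == target]


# Stand-ins for status objects returned by a search (illustrative data only).
candidates = [
    SimpleNamespace(id_str="11", in_reply_to_status_id_str="10"),
    SimpleNamespace(id_str="12", in_reply_to_status_id_str=None),
    SimpleNamespace(id_str="13", in_reply_to_status_id_str="99"),
]

matching = replies_to(candidates, 10)
print([s.id_str for s in matching])
```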

How can I get a line from an object?

So I'm kind of new to python and I want to make a twitter bot.
I did this:
print(api.get_user(screen_name="My account's handle"))
(while having "tweepy" imported and given my script the correct authentication keys / tokens etc)
That line printed a lot of text. What I want to do is get the number after "in_reply_to_status_id="
which is 1048042979359936513
The text that was printed is pasted inside here:
https://pastebin.com/ZVWzYEJw
(had to use pastebin because it was too long and has links)
I hope this makes sense...

I'm not entirely familiar with Tweepy's response object, but if it's as you described above, i.e. the User object, then you can probably try this:
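user = api.get_user(screen_name="My account's handle")
data = user._json  # _json is already a dict, so json.loads is not needed
# Assumption: the value sits in the user's embedded latest status, as the printed output suggests
data['status']['in_reply_to_status_id']
>>>1048042979359936513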
Edit: If in_reply_to_status_id is an attribute of User then you should be able to call it by just User.in_reply_to_status_id

Tweepy's API.get_user() method returns a User object. The long text you see in the response is the string representation of that User object. As @kerwei says, you can check which properties exist in this object by checking the keys in user._json (this is a dictionary object).
But in_reply_to_status_id is in the Status object (representing a tweet), not in the User object. So first you should get a Status object, for example by using API.get_status(). After that, you should be able to get in_reply_to_status_id from this object.
You can get in_reply_to_status_id from Status object like this:
>>> status = api.get_status(1234567890)
>>> reply_id = status.in_reply_to_status_id
>>> print(reply_id)

Gmail API: Python Email Dict appears to be Missing Keys

I'm experiencing a strange issue that seems to be inconsistent with google's gmail API:
If you look here, you can see that gmail's representation of an email has keys "snippet" and "id", among others. Here's some code that I use to generate the complete list of all my emails:
response = service.users().messages().list(userId='me').execute()
messageList = []
messageList.extend(response['messages'])
while 'nextPageToken' in response:
    pagetoken = response['nextPageToken']
    response = service.users().messages().list(userId='me', pageToken=pagetoken).execute()
    messageList.extend(response['messages'])
for message in messageList:
    if 'snippet' in message:
        print(message['snippet'])
    else:
        print("FALSE")
The code works!... Except for the fact that I get output "FALSE" for every single one of the emails. 'snippet' doesn't exist! However, if I run the same code with "id" instead of snippet, I get a whole bunch of ids!
I decided to just print out the 'message' objects/dicts themselves, and each one only had an "id" and a "threadId", even though the API claims there should be more in the object... What gives?
Thanks for your help!

As @jedwards said in his comment, just because a message 'can' contain all of the fields specified in the documentation doesn't mean it will. 'list' provides the bare minimum amount of information for each message, because it returns a lot of messages and wants to be as lazy as possible. For individual messages that I want to know more about, I'd then use 'messages.get' with the id that I got from 'list'.
Running get for each email in your inbox seems very expensive, but to my knowledge there's no way to run a batch 'get' command.
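The list-then-get approach might look like the sketch below. It assumes service is the same authorized Gmail client used in the question; the helper name fetch_snippets is hypothetical, and no error handling or rate limiting is included.

```python
def fetch_snippets(service, message_refs, user_id="me"):
    """Return {message id: snippet} by calling messages.get for each reference."""
    snippets = {}
    for ref in message_refs:
        # messages.list returns only 'id' and 'threadId' for each message;
        # messages.get returns the full resource, which includes 'snippet'.
        message = (
            service.users()
            .messages()
            .get(userId=user_id, id=ref["id"])
            .execute()
        )
        snippets[ref["id"]] = message.get("snippet", "")
    return snippets
```

With the question's code, this could be called as fetch_snippets(service, messageList) once the list has been built. It issues one request per message, so it can be slow for a large mailbox.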
