Tweepy: about accessing the "id" of a user after a Pagination response - python

I'm really stuck on this one.
I'm using Tweepy to get the IDs of all users that liked a specific tweet. I seem to get a list of "User" structures that contain "id", "name" and "username", but I'm not able to get only the "id".
The code is simple:
client = tweepy.Client(
bearer_token=bearer_token,
consumer_key=api_key, consumer_secret=api_secret,
access_token=user_token, access_token_secret=user_token_secret,
wait_on_rate_limit=True
)
for response in tweepy.Paginator(client.get_liking_users, id=tweetid, max_results=100, limit=10):
for item in response:
print("ITEM:\n", item)
if item is not None:
for user in item:
if user is not None:
print(user)
The print of "item" gets me this (simplified, of course; the number of structures is high, that's why I have to use Paginator):
[<User id=0000001 name=user1 username=UserName1>, <User id=0002 name=user2 username=UserName2>, <User id=000003 name=user3 username=UserName3>]
and the print of "user" just gets me the individual usernames: "UserName1", etc.
But no way to get user.id, user.User.id, nor anything similar. And I'm frustrated, because the information is right there, just I can't access it easily.
Thank you!

Tweepy documentation provides an example of something very similar to what you want to do: https://docs.tweepy.org/en/stable/examples.html -> API v2 -> Get Tweet’s Liking Users
import tweepy
bearer_token = ""
client = tweepy.Client(bearer_token)
# Get Tweet's Liking Users
# This endpoint/method allows you to get information about a Tweet’s liking
# users
tweet_id = 1460323737035677698
# By default, only the ID, name, and username fields of each user will be
# returned
# Additional fields can be retrieved using the user_fields parameter
response = client.get_liking_users(tweet_id, user_fields=["profile_image_url"])
for user in response.data:
print(user.username, user.profile_image_url)
This example prints the user's username and profile image URL, but note the comment says the id is also returned, so something like user.id should work. Otherwise, you can also add id to user_fields to make sure it's returned, although that shouldn't be necessary.
Unfortunately, I am not able to test it myself because I don't have a Twitter developer account with the required elevated access.
Edit: I got access to an API account with elevated access and I was able to test your code, see the update below
Iterating paginated results
The reason why you need a double for loop to iterate the paginated results and it eventually crashes after showing some results with an error saying you are trying to access a non-existent id attribute on an str object is because you are not iterating the Paginator results correctly.
For the sake of simplicity, I'm going to label your three nested for loops:
loop 0: for response in tweepy.Paginator(...
loop 1: for item in response
loop 2: for user in item
Paginator returns a Response object with all the results in the data attribute. The object has other attributes like meta, count, etc.
When you do loop 1, you are iterating all these data, count, etc., attributes of Response.
If the attribute you are iterating happens to be the data attribute, it will start loop 2 and it will iterate the results getting the output you expect.
But loop 1 will also iterate other Reponse items outside of the data attribute.
Let's see, for example, what happens when loop 1 enters the meta attribute.
meta is a dictionary that looks like this:
meta={'result_count': 80, 'next_token': '676f9b7bumw8i3jbm4nnifamw2ejjaktp8kjym6akdak9'}
When you enter loop 2 with the meta attribute, it will start iterating the keys (not the values, because that's how dicts work in Python) so the value of user in loop 2 will be either result_count or next_token. And it's then when you are getting your error saying you are trying to access id on a str.
What you should be doing is iterating the response.data in loop 1 instead and that will also allow removing the need of a second loop:
for response in tweepy.Paginator(client.get_liking_users, id=tweetid, max_results=100, limit=10):
for user in response.data:
print(user.id)
Edit: grammar and style

Related

Get conversation ID with Tweepy

Basically I want to get the conversation_id if the Tweet is a reply to another Tweet. So I can get the list of replies to each other to analyze.
My code:
class Listener(StreamingClient):
def on_response(self, response):
print(response)
listener = Listener(auth['bearer_token'])
listener.sample(expansions=['in_reply_to_user_id'], tweet_fields=['conversation_id'])
When using this, I only get the user_id to which it is replying, but I cannot get any type of conversation_id.
I have a slight feeling I am missing something essential.
From the relevant FAQ section about this in Tweepy's documentation:
If you are simply printing the objects and looking at that output, the string representations of API v2 models/objects only include the default fields that are guaranteed to exist.
The objects themselves still include the relevant data, which you can access as attributes or by subscription.

Sharepoint Office365 REST Python Client get users

I am trying to integrate SharePoint in one of my Python Scripts and I need to grab the users from a SP list. Until now, I managed to log in, search for a specific list, and enumerate all of its items.
def enum_items(list):
items = list.items # .top(1220)
ctx.load(items)
ctx.execute_query()
for index, item in enumerate(items):
print("{0}: {1}".format(index, item.properties['Email']))
The 'Email' column's value is set to User.email like so:
{
"$schema": "https://developer.microsoft.com/json-schemas/sp/v2/column-formatting.schema.json",
"elmType": "div",
"txtContent": "[$User.email]"
}
Basically the Email is automatically set when a new user is added in the User column.
Some entries where manually added, and others use the above code. For those that were manually added it prints the email, but for the other, it prints: "None".
If I change the 'Email' to 'User', I get the following error, like it cannot find that column or so:
print("{0}: {1}".format(index, item.properties['User']))
KeyError: 'User'
Is there anyway I can grab the USER object, and then use its properties (which ever those are) to grab some information about it?
Thanks!

Imgur API - How do I retrieve all favorites without pagination?

According to the Imgur Docs, the "GET Account Favorites" API call takes optional arguments for pagination, implying that all objects are returned without it.
However, when I use the following code snippet (the application has been registered and OAuth has already performed against my account for testing), I get only the first 30 JSON objects. In the snippet below, I already have an access_token for an authorized user and can retrieve data for that username. But the length of the returned list is always the first 30 items.
username = token['username']
bearer_headers = {
'Authorization': 'Bearer ' + token['access_token']
}
fav_url = 'https://api.imgur.com/3/account/' + username + '/' + 'favorites'
r = requests.get(fav_url, headers=bearer_headers)
r_json = r.json()
favorites=r_json['data']
len(favorites)
print(favorites)
The requests response returns a dictionary with three keys: status (the HTTP status code), success (true or false), and data, of which the value is a list of dictionaries (one per favorited item).
I'm trying to retrieve this without pagination so I can extract specific metadata values into a Pandas dataframe (id, post date, etc).
I originally thought this was a Pandas display problem in Jupyter notebook, but tracked it back to the API only returning the newest 30 list items, despite the docs indicating otherwise. If I place an arbitrary page number at the end (eg, "/favorites/1"), it returns the 30 items appropriate to that page, but there doesn't seem to be an option to get all items or retrieve a count of the total items or number of pages in advance.
What am I missing?
Postscript: It appears that none of the URIs work without pagination, eg, get account images, get gallery submissions, etc. Anything where there is an optional "/{{page}}" parameter, it will default to first page if none is specified. So I guess the larger question is, "does Imgur API even support non-paginated data, and how is that accessed?".
Paginated data is usually used when the possible size of the response can be arbitrarily large. I would be surprised if a major service like Imgur had an API that didn't work this way.
As you have found, the page attribute may be optional, and if you don't provide it, you get the first page as your response.
If you want to get more than the first page, you will need to loop over the page number:
data = []
page = 0
while block := connection.get(page=page):
data.append(block)
page += 1
This assumes Python3.8+ due to the := assignment expression. If you are on an older version you'll need to set block in the loop body, but the same idea applies.

How can I get a line from an object?

So I'm kind of new to python and I want to make a twitter bot.
I did this:
print(api.get_user(screen_name="My account's handle"))
(while having "tweepy" imported and given my script the correct authentication keys / tokens etc)
That line printed a lot of text, what i want to do is get the number afte "in_reply_to_status_id="
which is 1048042979359936513
The text that was printed is pasted inside here:
https://pastebin.com/ZVWzYEJw
(had to use pastebit because it was too long and has links)
I hope this makes sense...
I'm not entirely familiar with the tweepy's response object but if it's as you described above i.e.the User object, then you can probably try this:
import json
response = "{'response':" + User._json + "}"
data = json.loads(response)
data['in_reply_to_status_id']
>>>1048042979359936513
Edit: If in_reply_to_status_id is an attribute of User then you should be able to call it by just User.in_reply_to_status_id
Tweepy's API.get_user() method returns User object. The long text you see in the response is the string representation of User object.As #kerwei says, you can check which properties exist in this object by checking keys in user._json (this is a dictionary object).
But in_reply_to_status_id is in the Status object (representing a tweet) not in User object. So at first, you should get a Status object by using API.get_status() etc.. After that, you should be able to get in_reply_to_status_id in this object.
You can get in_reply_to_status_id from Status object like this:
>>> status = api.get_status(1234567890)
>>> reply_id = status.in_reply_to_status_id
>>> print(reply_id)

Gmail API: Python Email Dict appears to be Missing Keys

I'm experiencing a strange issue that seems to be inconsistent with google's gmail API:
If you look here, you can see that gmail's representation of an email has keys "snippet" and "id", among others. Here's some code that I use to generate the complete list of all my emails:
response = service.users().messages().list(userId='me').execute()
messageList = []
messageList.extend(response['messages'])
while 'nextPageToken' in response:
pagetoken = response['nextPageToken']
response = service.users().messages().list(userId='me', pageToken=pagetoken).execute()
messageList.extend(response['messages'])
for message in messageList:
if 'snippet' in message:
print(message['snippet'])
else:
print("FALSE")
The code works!... Except for the fact that I get output "FALSE" for every single one of the emails. 'snippet' doesn't exist! However, if I run the same code with "id" instead of snippet, I get a whole bunch of ids!
I decided to just print out the 'message' objects/dicts themselves, and each one only had an "id" and a "threadId", even though the API claims there should be more in the object... What gives?
Thanks for your help!
As #jedwards said in his comment, just because a message 'can' contain all of the fields specified in documentation, doesn't mean it will. 'list' provides the bare minimum amount of information for each message, because it provides a lot of messages and wants to be as lazy as possible. For individual messages that I want to know more about, I'd then use 'messages.get' with the id that I got from 'list'.
Running get for each email in your inbox seems very expensive, but to my knowledge there's no way to run a batch 'get' command.

Categories

Resources