I'm experiencing a strange issue that seems to be inconsistent with google's gmail API:
If you look here, you can see that gmail's representation of an email has keys "snippet" and "id", among others. Here's some code that I use to generate the complete list of all my emails:
response = service.users().messages().list(userId='me').execute()
messageList = []
messageList.extend(response['messages'])
while 'nextPageToken' in response:
pagetoken = response['nextPageToken']
response = service.users().messages().list(userId='me', pageToken=pagetoken).execute()
messageList.extend(response['messages'])
for message in messageList:
if 'snippet' in message:
print(message['snippet'])
else:
print("FALSE")
The code works!... Except for the fact that I get output "FALSE" for every single one of the emails. 'snippet' doesn't exist! However, if I run the same code with "id" instead of snippet, I get a whole bunch of ids!
I decided to just print out the 'message' objects/dicts themselves, and each one only had an "id" and a "threadId", even though the API claims there should be more in the object... What gives?
Thanks for your help!
As #jedwards said in his comment, just because a message 'can' contain all of the fields specified in documentation, doesn't mean it will. 'list' provides the bare minimum amount of information for each message, because it provides a lot of messages and wants to be as lazy as possible. For individual messages that I want to know more about, I'd then use 'messages.get' with the id that I got from 'list'.
Running get for each email in your inbox seems very expensive, but to my knowledge there's no way to run a batch 'get' command.
Related
I'm really stuck on this one.
I'm using Tweepy to get the IDs of all users that liked a specific tweet. I seem to get a list of "User" structures that contain "id", "name" and "username", but I'm not able to get only the "id".
The code is simple:
client = tweepy.Client(
bearer_token=bearer_token,
consumer_key=api_key, consumer_secret=api_secret,
access_token=user_token, access_token_secret=user_token_secret,
wait_on_rate_limit=True
)
for response in tweepy.Paginator(client.get_liking_users, id=tweetid, max_results=100, limit=10):
for item in response:
print("ITEM:\n", item)
if item is not None:
for user in item:
if user is not None:
print(user)
The print of "item" gets me this (simplified, of course; the number of structures is high, that's why I have to use Paginator):
[<User id=0000001 name=user1 username=UserName1>, <User id=0002 name=user2 username=UserName2>, <User id=000003 name=user3 username=UserName3>]
and the print of "user" just gets me the individual usernames: "UserName1", etc.
But no way to get user.id, user.User.id, nor anything similar. And I'm frustrated, because the information is right there, just I can't access it easily.
Thank you!
Tweepy documentation provides an example of something very similar to what you want to do: https://docs.tweepy.org/en/stable/examples.html -> API v2 -> Get Tweet’s Liking Users
import tweepy
bearer_token = ""
client = tweepy.Client(bearer_token)
# Get Tweet's Liking Users
# This endpoint/method allows you to get information about a Tweet’s liking
# users
tweet_id = 1460323737035677698
# By default, only the ID, name, and username fields of each user will be
# returned
# Additional fields can be retrieved using the user_fields parameter
response = client.get_liking_users(tweet_id, user_fields=["profile_image_url"])
for user in response.data:
print(user.username, user.profile_image_url)
This example prints the user's username and profile image URL, but note the comment says the id is also returned, so something like user.id should work. Otherwise, you can also add id to user_fields to make sure it's returned, although that shouldn't be necessary.
Unfortunately, I am not able to test it myself because I don't have a Twitter developer account with the required elevated access.
Edit: I got access to an API account with elevated access and I was able to test your code, see the update below
Iterating paginated results
The reason why you need a double for loop to iterate the paginated results and it eventually crashes after showing some results with an error saying you are trying to access a non-existent id attribute on an str object is because you are not iterating the Paginator results correctly.
For the sake of simplicity, I'm going to label your three nested for loops:
loop 0: for response in tweepy.Paginator(...
loop 1: for item in response
loop 2: for user in item
Paginator returns a Response object with all the results in the data attribute. The object has other attributes like meta, count, etc.
When you do loop 1, you are iterating all these data, count, etc., attributes of Response.
If the attribute you are iterating happens to be the data attribute, it will start loop 2 and it will iterate the results getting the output you expect.
But loop 1 will also iterate other Reponse items outside of the data attribute.
Let's see, for example, what happens when loop 1 enters the meta attribute.
meta is a dictionary that looks like this:
meta={'result_count': 80, 'next_token': '676f9b7bumw8i3jbm4nnifamw2ejjaktp8kjym6akdak9'}
When you enter loop 2 with the meta attribute, it will start iterating the keys (not the values, because that's how dicts work in Python) so the value of user in loop 2 will be either result_count or next_token. And it's then when you are getting your error saying you are trying to access id on a str.
What you should be doing is iterating the response.data in loop 1 instead and that will also allow removing the need of a second loop:
for response in tweepy.Paginator(client.get_liking_users, id=tweetid, max_results=100, limit=10):
for user in response.data:
print(user.id)
Edit: grammar and style
Basically I want to get the conversation_id if the Tweet is a reply to another Tweet. So I can get the list of replies to each other to analyze.
My code:
class Listener(StreamingClient):
def on_response(self, response):
print(response)
listener = Listener(auth['bearer_token'])
listener.sample(expansions=['in_reply_to_user_id'], tweet_fields=['conversation_id'])
When using this, I only get the user_id to which it is replying, but I cannot get any type of conversation_id.
I have a slight feeling I am missing something essential.
From the relevant FAQ section about this in Tweepy's documentation:
If you are simply printing the objects and looking at that output, the string representations of API v2 models/objects only include the default fields that are guaranteed to exist.
The objects themselves still include the relevant data, which you can access as attributes or by subscription.
After the deprecation of my discovery url, I had to make some change on my code and now I get this error.
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://mybusinessbusinessinformation.googleapis.com/v1/accounts/{*accountid*}/locations?filter=locationKey.placeId%3{*placeid*}&readMask=paths%3A+%22locations%28name%29%22%0A&alt=json returned "Request contains an invalid argument.". Details: "[{'#type': 'type.googleapis.com/google.rpc.BadRequest', 'fieldViolations': [{'field': 'read_mask', 'description': 'Invalid field mask provided'}]}]">
I am trying to use this end point accounts.locations.list
I'm using :
python 3.8
google-api-python-client 2.29.0
My current code look likes :
from google.protobuf.field_mask_pb2 import FieldMask
googleAPI = GoogleAPI.auth_with_credentials(client_id=config.GMB_CLIENT_ID,
client_secret=config.GMB_CLIENT_SECRET,
client_refresh_token=config.GMB_REFRESH_TOKEN,
api_name='mybusinessbusinessinformation',
api_version='v1',
discovery_service_url="https://mybusinessbusinessinformation.googleapis.com/$discovery/rest")
field_mask = FieldMask(paths=["locations(name)"])
outputLocation = googleAPI.service.accounts().locations().list(parent="accounts/{*id*}",
filter="locationKey.placeId=" + google_place_id,
readMask=field_mask
).execute()
From the error, i tried a lot of fieldmask path and still don't know what they want.
I've tried things like location.name, name, locations.name, locations.location.name and it did'nt work.
I also try to pass the readMask params without use the FieldMask class with passing a string and same problem.
So if someone know what is the format of the readMask they want it will be great for me !
.
Can help:
https://www.youtube.com/watch?v=T1FUDXRB7Ns
https://developers.google.com/google-ads/api/docs/client-libs/python/field-masks
You have not set the readMask correctly. I have done a similar task in Java and Google returns the results. readMask is a String type, and what I am going to provide you in the following line includes all fields. You can omit anyone which does not serve you. I am also writing the request code in Java, maybe it can help you better to convert into Python.
String readMask="storeCode,regularHours,name,languageCode,title,phoneNumbers,categories,storefrontAddress,websiteUri,regularHours,specialHours,serviceArea,labels,adWordsLocationExtensions,latlng,openInfo,metadata,profile,relationshipData,moreHours";
MyBusinessBusinessInformation.Accounts.Locations.List request= mybusinessaccountLocations.accounts().locations().list(accountName).setReadMask(readMask);
ListLocationsResponse Response = request.execute();
List<Location>= Response.getLocations();
while (Response.getNextPageToken() != null) {
locationsList.setPageToken(Response.getNextPageToken());
Response=request.execute();
locations.addAll(Response.getLocations());
}
-- about the question in the comment you have asked this is what I have for placeId:
As other people said you before, the readmask is not set correctly.
Here Google My Business Information api v1 you can see:
"This is a comma-separated list of fully qualified names of fields"
and here Google Json representation you can see the field.
With this, I tried this
read_mask = "name,title"
and it worked for me, hope this work for you too.
Hello I am trying to scrape the tweets of a certain user using tweepy.
Here is my code :
tweets = []
username = 'example'
count = 140 #nb of tweets
try:
# Pulling individual tweets from query
for tweet in api.user_timeline(id=username, count=count, include_rts = False):
# Adding to list that contains all tweets
tweets.append((tweet.text))
except BaseException as e:
print('failed on_status,',str(e))
time.sleep(3)
The problem I am having is the tweets are coming back unfinished with "..." at the end.
I think I've looked at all the other similar problems on stack overflow and elsewhere but nothing works. Most do not concern me because I am NOT dealing with retweets .
I have tried putting tweet_mode = 'extended' and/or tweet.full_text or tweet._json['extended_tweet']['full_text'] in different combinations .
I don't get an error message but nothing works, just an empty list in return.
And It looks like the documentation is out of date because it says nothing about the 'tweet_mode' nor the 'include_rts' parameter :
Has anyone managed to get the full text of each tweet?? I'm really stuck on this seemingly simple problem and am losing my hair so I would appreciate any advice :D
Thanks in advance!!!
TL;DR: You're most likely running into a Rate Limiting issue. And use the full_text attribute.
Long version:
First,
The problem I am having is the tweets are coming back unfinished with "..." at the end.
From the Tweepy documentation on Extended Tweets, this is expected:
Compatibility mode
... It will also be discernible that the text attribute of the Status object is truncated as it will be suffixed with an ellipsis character, a space, and a shortened self-permalink URL to the Tweet.
Wrt
And It looks like the documentation is out of date because it says nothing about the 'tweet_mode' nor the 'include_rts' parameter :
They haven't explicitly added it to the documentation of each method, however, they specify that tweet_mode is added as a param:
Standard API methods
Any tweepy.API method that returns a Status object accepts a new tweet_mode parameter. Valid values for this parameter are compat and extended , which give compatibility mode and extended mode, respectively. The default mode (if no parameter is provided) is compatibility mode.
So without tweet_mode added to the call, you do get the tweets with partial text? And with it, all you get is an empty list? If you remove it and immediately retry, verify that you still get an empty list. ie, once you get an empty list result, check if you keep getting an empty list even when you change the params back to the one which worked.
Based on bug #1329 - API.user_timeline sometimes returns an empty list - it appears to be a Rate Limiting issue:
Harmon758 commented on Feb 13
This API limitation would manifest itself as exactly the issue you're describing.
Even if it was working, it's in the full_text attribute, not the usual text. So the line
tweets.append((tweet.text))
should be
tweets.append(tweet.full_text)
(and you can skip the extra enclosing ())
Btw, if you're not interested in retweets, see this example for the correct way to handle them:
Given an existing tweepy.API object and id for a Tweet, the following can be used to print the full text of the Tweet, or if it’s a Retweet, the full text of the Retweeted Tweet:
status = api.get_status(id, tweet_mode="extended")
try:
print(status.retweeted_status.full_text)
except AttributeError: # Not a Retweet
print(status.full_text)
If status is a Retweet, status.full_text could be truncated.
As per the twitter API v2:
tweet_mode does not work at all. You need to add expansions=referenced_tweets.id. Then in the response, search for includes. You can find all the truncated tweets as full tweets in the includes. You will still see the truncated tweets in response but do not worry about it.
I am so confused by all these levels of dicts I have to wade through it would be easier imo just to do it by scraping, however I guess it's a good excercise to learn dicts and will be quicker perhaps once I figure it out.
My code is as follows, where the assignment statement for cposts returns a 404:
import pytumblr
# Authenticate via OAuth
client = pytumblr.TumblrRestClient(
'xxxxxxxxxxxxxxxxxxxxxxxxxxxx'
)
f = client.followers('blog.tumblr.com')
users = f['users']
names = [b['name'] for b in f['users']]
print names
cposts = client.posts(names[0], 'notes_info=True')
print (cposts)
But the python api info says: client.posts('codingjester', **params) # get posts for a blog
and this SO post (Getting more than 50 notes with Tumblr API) says you should use notes_info to get the notes. But I don't know how to construct it in python rather than making a url.
I could use a request constructing a url but I figure there is a simpler way using the python/tumblr api I just haven't figured it out, if someone could illuminate please.
Remove the quotes around notes_info=True. You should be passing the value True to the notes_info argument of the client's posts() method. Instead, you're actually passing the string 'notes_info=True' as a positional argument, which is invalid and causes pytumblr to create an invalid URL, which is why you're getting a 404.