Download google drive attachments of an email using Gmail API in python

I currently use this solution to download attachments from Gmail using the Gmail API via Python.
However, whenever an attachment exceeds 25 MB, it is automatically uploaded to Google Drive and the file is linked in the mail. In such cases, there is no attachmentId in the message.
I can only see the file names in the 'snippet' section of the message.
Is there any way I can download the Google Drive attachments from the mail?
There is a similar question posted here, but no solution has been provided for it yet.

How to download a Drive "attachment"
The "attachment" referred to is actually just a link to a Drive file, so confusingly it is not an attachment at all, but just text or HTML.
The issue here is that since it's not an attachment as such, you won't be able to fetch this with the Gmail API by itself. You'll need to use the Drive API.
To use the Drive API you'll need the file ID, which can be found in the HTML content part of the message (among other places).
You can use the re module to perform a findall on the HTML content. I used the following regex pattern to recognize Drive links (the ID is matched with [^\/]+ rather than a greedy .+ so that several links in one message are not merged into a single match):
(?<=https:\/\/drive\.google\.com\/file\/d\/)[^\/]+(?=\/view\?usp=drive_web)
Here is a sample Python function to get the file IDs. It returns a list.
import base64
import re
def get_file_ids(service, user_id, msg_id):
    message = service.users().messages().get(userId=user_id, id=msg_id).execute()
    results = []
    for part in message['payload'].get('parts', []):
        if part["mimeType"] == "text/html":
            # The body arrives as URL-safe base64; decode it to text, not to a bytes repr
            html = base64.urlsafe_b64decode(part["body"]["data"]).decode('utf-8')
            results.extend(re.findall(
                r'(?<=https:\/\/drive\.google\.com\/file\/d\/)[^\/]+(?=\/view\?usp=drive_web)',
                html
            ))
    # Return after checking every part, so the result is always a list
    return results
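In practice the text/html part is often nested inside a multipart/alternative container, so looking only at the top-level parts can miss it. Here is a minimal sketch of a recursive variant that works on the message dict returned by messages().get(). It is an illustration, not part of the original answer: `iter_parts` and `extract_drive_ids` are hypothetical helper names, and it assumes the same link shape as the pattern above.

```python
import base64
import re

# Same link shape as the pattern above; [^/]+ stops the match at the end of the ID.
DRIVE_ID_RE = re.compile(
    r"(?<=https://drive\.google\.com/file/d/)[^/]+(?=/view\?usp=drive_web)"
)


def iter_parts(payload):
    """Yield the payload itself and every nested part, depth first."""
    yield payload
    for part in payload.get("parts", []):
        yield from iter_parts(part)


def extract_drive_ids(message):
    """Return Drive file IDs found in any text/html part of a Gmail message dict."""
    ids = []
    for part in iter_parts(message["payload"]):
        data = part.get("body", {}).get("data")
        if part.get("mimeType") == "text/html" and data:
            html = base64.urlsafe_b64decode(data).decode("utf-8", errors="replace")
            for file_id in DRIVE_ID_RE.findall(html):
                if file_id not in ids:  # keep first-seen order, drop repeats
                    ids.append(file_id)
    return ids
```

You would pass it the result of service.users().messages().get(...).execute() and feed the returned IDs to the Drive API.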
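Once you have the IDs, you will need to make a call to the Drive API.
You could follow the example in the docs:
import io
from googleapiclient.http import MediaIoBaseDownload
# gmail_service and drive_service are placeholder names for clients built with
# build('gmail', 'v1', ...) and build('drive', 'v3', ...); the download needs the Drive client.
file_ids = get_file_ids(gmail_service, "me", "[YOUR_MSG_ID]")
for file_id in file_ids:
    request = drive_service.files().get_media(fileId=file_id)
    fh = io.BytesIO()
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while not done:
        status, done = downloader.next_chunk()
        print("Download %d%%." % int(status.progress() * 100))
    # fh now holds the file's bytes (fh.getvalue())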
Remember, seeing as you will now be using the Drive API as well as the Gmail API, you'll need to change the scopes in your project. Also remember to activate the Drive API in the developers console, update your OAuth consent screen, credentials and delete the local token.pickle file.
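As a rough sketch of the scope change (not from the original answer): the snippet below lists one read-only scope per API and deletes a cached token that does not cover them, so the next run triggers a fresh consent. It assumes read-only access is enough for your use case; `missing_scopes` and `drop_stale_token` are hypothetical helper names, and the token file name is the one mentioned above.

```python
import os

# Read-only scopes are assumed to be enough for reading mail and downloading files.
SCOPES = [
    "https://www.googleapis.com/auth/gmail.readonly",
    "https://www.googleapis.com/auth/drive.readonly",
]


def missing_scopes(granted, required=SCOPES):
    """Return the required scopes that the cached credentials do not cover."""
    granted = set(granted or [])
    return [scope for scope in required if scope not in granted]


def drop_stale_token(granted, token_path="token.pickle"):
    """Delete the cached token if it lacks a required scope; return True if deleted."""
    if missing_scopes(granted) and os.path.exists(token_path):
        os.remove(token_path)
        return True
    return False
```

Here `granted` would be whatever scope list your stored credentials report; after the file is removed, the usual OAuth flow asks for consent again with the new SCOPES.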
References
Drive API Docs
Managing Downloads Guide
Gmail API Docs

The Drive API also has a 10 MB limit, which applies to content exported with files.export; files.get with alt=media is not restricted to 10 MB.

Related

Python O365 Outlook Connection Issues

I am trying to write a script in Python to grab new emails from a specific folder and save the attachments to a shared drive to upload to a database. Power Automate would work, but the file size limit to save the attachment is a meager 20 MB. I am able to authenticate the token, but am getting the following error when trying to grab the emails:
Unauthorized for url.
The token contains no permissions, or permissions can not be understood.
I have included the code I am using to connect to Microsoft Graph.
(credentials and tenant_id are correct in my code; I took them out for obvious reasons.)
from O365 import Account, MSOffice365Protocol, MSGraphProtocol
credentials = ('xxxxxx', 'xxxxxx')
protocol = MSGraphProtocol(default_resource='reporting.triometric@xxxx.com')
scopes_graph = protocol.get_scopes_for('message_all_shared')
scopes = ['https://graph.microsoft.com/.default']
account = Account(credentials, auth_flow_type='credentials', tenant_id="**", scopes=scopes)
if account.authenticate():
    print('Authenticated')
mailbox = account.mailbox(resource='reporting.triometric@xxxx.com')
inbox = mailbox.inbox_folder()
for message in inbox.get_messages():
    print(message)
I have already configured the permissions through Azure to include all the necessary 'mail' delegations.
The rest of my script works perfectly fine for uploading files to the database. Currently, the attachments must be manually saved on a shared drive multiple times per day, then the script is run to upload. Are there any steps I am missing? Any insights would be greatly appreciated!
Here are the permissions:

auth_flow_type='credentials' means you are using client credentials flow.
In this case you should add Application permissions rather than Delegated permissions.
Don't forget to click on "Grant admin consent for {your tenant}".
UPDATE:
If you set auth_flow_type to 'authorization', it will use the auth code flow, which requires delegated permissions.
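One way to check which kind of permission actually reached the token is to look at its claims: in Microsoft identity platform access tokens, application permissions appear in the `roles` claim and delegated permissions in `scp`. The sketch below decodes the payload without verifying the signature, so it is for debugging only. `decode_jwt_payload` and `token_permissions` are hypothetical helper names, and how you read the raw access token string out of your O365 token storage is left out.

```python
import base64
import json


def decode_jwt_payload(token):
    """Decode the middle (payload) segment of a JWT without verifying it."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def token_permissions(token):
    """Return (application_roles, delegated_scopes) found in the token."""
    claims = decode_jwt_payload(token)
    roles = claims.get("roles", [])      # application permissions
    scopes = claims.get("scp", "").split()  # delegated permissions, space separated
    return roles, scopes
```

With the client credentials flow, an empty `roles` list would be consistent with the "token contains no permissions" error above.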

Google Drive API v3 files.export method throws a 403 error: "Export only supports Docs Editors files."

Summary
Trying to download a JPEG file using files.export, but I get an Error 403 message: "Export only supports Docs Editors files."
Description
I am trying to make use of the Google Drive API for a project of mine, which is really all about collecting some pictures, storing them on Drive, and later loading them back for some processing. Pretty straightforward.
To that extent, I am simply following the API Documentation and the example code snippets provided therein. Now, I was able to upload a picture to the storage. Trying to download it, I ran into this error message:
googleapiclient.errors.HttpError: <HttpError 403 when requesting https://www.googleapis.com/drive/v3/files/1D1XpQwrCvOSEy_vlaRQMEGLQJK-AeeZ7/export?mimeType=image%2Fjpeg&alt=media returned "Export only supports Docs Editors files.". Details: "Export only supports Docs Editors files.">
The code used to upload the picture looks something like this:
import io
from googleapiclient.http import MediaFileUpload, MediaIoBaseDownload
# GoogleDrive is a service created with googleapiclient.discovery.build
# upload a picture
file_metadata = {'name': 'example_on_drive.jpg'}
media = MediaFileUpload('example_upload.jpg', mimetype='image/jpeg')
file = GoogleDrive.files().create(
    body=file_metadata,
    media_body=media,
    fields='id').execute()
The upload is successful, I get back the file's ID:
print(file)
{'id': '1aBoMvaHauCRyZOerNFfwM8yQ78RkJkDQ'}
and I can see it on my Google Drive.
Then I try to access that very same file:
request = GoogleDrive.files().export_media(fileId=file.get('id'), mimeType="image/jpeg")
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%" % int(status.progress() * 100))
At which point the error occurs. I see the API Explorer on the website return the same error, which is why I am tempted to conclude that my code is not to blame. This file.export only has two inputs and I don't see how exactly I am failing to use it correctly. I looked through similar questions, but most of them deal with downloading text documents and my error message tells me it's got something to do with the doctype. Am I using a wrong mimeType?
Error handling suggestions on the official page do not feature this particular error message.
Have you got any suggestions?

The error message "Export only supports Docs Editors files." means that the "Files: export" method cannot be used to download anything other than Google Workspace documents (Docs, Sheets, Slides and so on). To download an image file, please use the "Files: get" method instead. Your script would be modified as follows.
From:
request = GoogleDrive.files().export_media(fileId=file.get('id'), mimeType="image/jpeg")
To:
request = GoogleDrive.files().get_media(fileId=file.get('id'))
Reference:
Download files
The official document above describes downloading a file with both "export" and "get".
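If the same code has to handle both kinds of files, one option is to branch on the file's mimeType, since Docs Editors types start with application/vnd.google-apps. This is only a sketch: `build_download_request` is a hypothetical helper, `files_resource` is assumed to be what `GoogleDrive.files()` returns, and the PDF export target is an arbitrary default.

```python
GOOGLE_APPS_PREFIX = "application/vnd.google-apps."


def build_download_request(files_resource, file_id, mime_type, export_as="application/pdf"):
    """Use export_media for Docs Editors files and get_media for everything else."""
    if mime_type.startswith(GOOGLE_APPS_PREFIX):
        # Docs, Sheets, Slides, ... have no binary content and must be exported
        return files_resource.export_media(fileId=file_id, mimeType=export_as)
    # Regular files (JPEG, PDF, JSON, ...) are fetched as-is
    return files_resource.get_media(fileId=file_id)
```

The returned request can be passed to MediaIoBaseDownload exactly as in the question. Note that the prefix also covers types such as folders, which cannot be exported, so a real implementation would probably need to filter those out first.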

Get file from GoogleDrive without downloading it to storage - Python

I have a python-script running on a server and I need to get a json-file from my GoogleDrive.
I want to use the GoogleDrive API to get the file, which I know the name, location and ID of but I only could find code-samples which downloads the file to storage. The json-content is supposed to be a dict in my script and the file must not be downloaded to storage. I'm new to Python and the GoogleDrive API, so I don't know how to manage it by myself.
This is the website I followed: https://www.thepythoncode.com/article/using-google-drive--api-in-python
I hope you can help me because I really need it.
Thanks in advance.

I believe your goal is as follows.
You want to download the file directly into memory using Python, without writing the data to a file.
From "I need to get a json-file from my GoogleDrive.", the file you want to download is not a Google Docs file (Spreadsheet, Document, Slides and so on). In this case, it's a text file.
You are already able to use the Drive API with googleapis for Python.
You are using the authorization script from https://www.thepythoncode.com/article/using-google-drive--api-in-python.
In this case, to retrieve the file content into memory, I would like to propose retrieving it with requests. For this, the access token is taken from creds in get_gdrive_service().
To retrieve the file content, the "Files: get" method is used with the query parameter alt=media.
Sample script:
file_id = "###" # Please set the file ID you want to download.
access_token = creds.token
url = "https://www.googleapis.com/drive/v3/files/" + file_id + "?alt=media"
res = requests.get(url, headers={"Authorization": "Bearer " + access_token})
obj = json.loads(res.text)
print(obj)
In the above script, creds in creds.token comes from get_gdrive_service().
From your question, I understood that the file you want to download contains JSON data, so the above script parses the downloaded data as a JSON object.
In this case, please import json and requests.
Note:
When the returned content is JSON data, you can also use res.json() instead of json.loads(res.text). If a JSONDecodeError occurs, please check the value of res.text.
Reference:
Download files
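If you would rather not add the requests dependency, the same request can be sketched with the standard library alone. This is an illustration under the same assumptions as the sample script (a valid access token and a file that contains JSON); `build_media_request`, `parse_json_bytes` and `fetch_json` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request


def build_media_request(file_id, access_token):
    """Build (but do not send) a "Files: get" request with alt=media."""
    url = (
        "https://www.googleapis.com/drive/v3/files/"
        + urllib.parse.quote(file_id, safe="")
        + "?alt=media"
    )
    return urllib.request.Request(url, headers={"Authorization": "Bearer " + access_token})


def parse_json_bytes(raw):
    """Turn the downloaded bytes into a Python object, entirely in memory."""
    return json.loads(raw.decode("utf-8"))


def fetch_json(file_id, access_token):
    """Download the file content into memory and parse it as JSON."""
    with urllib.request.urlopen(build_media_request(file_id, access_token)) as resp:
        return parse_json_bytes(resp.read())
```

Calling fetch_json(file_id, creds.token) would then return the dict without anything being written to storage.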

How do I properly open an html document in Google Drive?

I'm using the Google Drive API to create and open an HTML file. The problem is that the document opens showing its technical content (links to CSS and JS files, HTML tags, ...).
How can I make it open correctly, in a user-friendly form?
Part of my google-api code:
def file_to_drive(import_file=None):
    service = build('drive', 'v3', credentials=creds)
    file_name = import_file
    media_body = MediaFileUpload(file_name, resumable=True, mimetype='text/html')
    body = {
        'name': file_name,  # Drive API v3 uses 'name'; 'title' is the v2 field
        'description': 'Uploaded By You'}
    # .execute() is needed to actually send the request
    file = service.files().create(body=body, media_body=media_body, fields='id').execute()

The Google Drive API is a file storage API. It allows you to upload and download files; it does not have the ability to open them. You could share a link to the file with someone who has access; when they click the link, the file will open for them in the Google Drive web application.
The only API able to open files for editing would be the Google Docs API, which gives you limited ability to open Google Docs files. That, however, would require that you convert your HTML file to Google Docs format. Even if this were an option, you would need to create your own "user-friendly form": Google APIs return data as JSON, not as user-friendly views; that's not what APIs are for.
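For the conversion step mentioned above, Drive can convert an upload to a Google Doc when the file metadata names a Google Docs target type. A minimal sketch of such metadata follows; `html_import_metadata` is a hypothetical helper, and whether the converted document looks acceptable for your HTML is something you would have to check.

```python
import os

GOOGLE_DOC_MIME = "application/vnd.google-apps.document"


def html_import_metadata(path, description="Uploaded By You"):
    """Build files.create metadata that asks Drive to convert the upload to a Google Doc."""
    name = os.path.splitext(os.path.basename(path))[0]
    return {
        "name": name,  # v3 uses 'name'; 'title' was the v2 field
        "description": description,
        "mimeType": GOOGLE_DOC_MIME,  # target type; the uploaded media stays text/html
    }
```

You would pass this dict as body= to service.files().create(...) together with the text/html MediaFileUpload, and request fields='id, webViewLink' to get a link that opens the converted document in the Docs editor.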

Failed in getting the downloadUrl property of files in Google Drive with python API

I want to obtain the direct download link of a certain file on Google Drive. I used the Google API Client for Python; here is part of my code, which is basically a copy of the quickstart example:
SCOPES = "https://www.googleapis.com/auth/drive"
FILE_ID = "xxxxxx"
def get_credentials():
    ...
    return credentials
if __name__ == '__main__':
    credentials = get_credentials()
    http = credentials.authorize(httplib2.Http())
    service = discovery.build("drive", "v3", http=http)
    web_link = service.files().get(fileId=FILE_ID, fields="webContentLink").execute() # success
    download_link = service.files().get(fileId=FILE_ID, fields="downloadUrl").execute() # throw an error
Then I got the 400 error:
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://www.googleapis.com/drive/v3/files/xxxxxx?alt=json&fields=downloadUrl returned "Invalid field selection downloadUrl">
I searched and read all related questions about this problem. As one said in this question:
The difference between this downloadURL and the webContentLink is that
the webContentLink uses authorization from the user's browser cookie,
the downloadURL requires an authorized API request (using OAuth 2.0).
So I thought maybe I hadn't authorized the request successfully. However, the first three statements in the main part did that for me, according to the guide:
Use the authorize() function of the Credentials class to apply
necessary credential headers to all requests made by an httplib2.Http
instance ... Once an httplib2.Http object has been authorized, it is
typically passed to the build function
So what's wrong with my program? And if I want to write my own request with urllib or requests to reproduce the error, how?

downloadUrl is a field that is available for File resources in Google Drive API version 2 but not in version 3, which you seem to be using.
In Google Drive API v3 the field has been replaced by doing a files.get with ?alt=media request to directly download a file without having to find out the specific URL for it. You can read more about the differences and changes between API v2 and v3 here.
Here is an example of how you can download files using the python Google API client library with Google Drive API v3:
file_id = '0BwwA4oUTeiV1UVNwOHItT0xfa2M'
request = drive_service.files().get_media(fileId=file_id)
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))
Check out the official documentation regarding downloading files using the API v3 here for more examples.
