I have a Python script running on a server and I need to get a JSON file from my Google Drive.
I want to use the Google Drive API to get the file, whose name, location and ID I know, but I could only find code samples that download the file to storage. The JSON content is supposed to become a dict in my script, and the file must not be written to storage. I'm new to Python and the Google Drive API, so I don't know how to manage it by myself.
This is the website I followed: https://www.thepythoncode.com/article/using-google-drive--api-in-python
I hope you can help me because I really need it.
Thanks in advance.
I believe your goal is as follows.
You want to download the file directly into memory, without creating the data as a file, using Python.
From I need to get a json-file from my GoogleDrive., the file you want to download is not a Google Docs file (Spreadsheet, Document, Slides and so on); in this case, it's a text file.
You are already able to use the Drive API with googleapis for Python.
You are using the script for authorizing from https://www.thepythoncode.com/article/using-google-drive--api-in-python.
In this case, in order to retrieve the file content into memory, I propose retrieving it using requests. For this, the access token is taken from creds of get_gdrive_service().
In order to retrieve the file content, the "Files: get" method is used with the query parameter alt=media.
Sample script:
import json
import requests

# creds is the Credentials object created inside get_gdrive_service().
file_id = "###"  # Please set the file ID you want to download.
access_token = creds.token
url = "https://www.googleapis.com/drive/v3/files/" + file_id + "?alt=media"
res = requests.get(url, headers={"Authorization": "Bearer " + access_token})
obj = json.loads(res.text)
print(obj)
In the above script, creds of creds.token comes from get_gdrive_service().
From your question, I understood that the file you want to download is JSON data, so in the above script the downloaded data is parsed as a JSON object.
In this case, please import json and requests.
Note:
When the returned content is JSON data, you can also use res.json() instead of res.text. But if a JSONDecodeError occurs, please check the value of res.text.
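As a standalone illustration of that fallback (parse_drive_response is a made-up helper name here, and text stands in for res.text; no Drive call is involved):

```python
import json

def parse_drive_response(text):
    """Parse a downloaded body as JSON, falling back to the raw text
    so an error page can be inspected, as you would with res.text."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text  # e.g. an HTML error page from the API

print(parse_drive_response('{"name": "config", "retries": 3}'))
print(parse_drive_response('<html>Something went wrong</html>'))
```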
Reference:
Download files
Related
I currently use this solution to download attachments from Gmail using the Gmail API via Python.
However, every time an attachment exceeds 25 MB, the attachment automatically gets uploaded to Google Drive and the file is linked in the mail. In such cases, there is no attachmentId in the message.
I can only see the file names in the 'snippet' section of the message.
Is there any way I can download the Google Drive attachments from mail?
There is a similar question posted here, but there's no solution provided for it yet.
How to download a Drive "attachment"
The "attachment" referred to is actually just a link to a Drive file, so confusingly it is not an attachment at all, just text or HTML.
The issue here is that since it's not an attachment as such, you won't be able to fetch this with the Gmail API by itself. You'll need to use the Drive API.
To use the Drive API you'll need to get the file ID, which will be within the HTML content part, among others.
You can use the re module to perform a findall on the HTML content. I used the following regex pattern to recognize Drive links:
(?<=https:\/\/drive\.google\.com\/file\/d\/).+(?=\/view\?usp=drive_web)
Here is a sample Python function to get the file IDs. It returns a list.
import base64
import re

def get_file_ids(service, user_id, msg_id):
    message = service.users().messages().get(userId=user_id, id=msg_id).execute()
    for part in message['payload']['parts']:
        if part["mimeType"] == "text/html":
            b64 = part["body"]["data"].encode('UTF-8')
            unencoded_data = str(base64.urlsafe_b64decode(b64))
            results = re.findall(
                '(?<=https:\/\/drive\.google\.com\/file\/d\/).+(?=\/view\?usp=drive_web)',
                unencoded_data
            )
            return results
Once you have the IDs then you will need to make a call to the Drive API.
You could follow the example in the docs:
import io
from googleapiclient.http import MediaIoBaseDownload

file_ids = get_file_ids(service, "me", "[YOUR_MSG_ID]")
for id in file_ids:
    request = service.files().get_media(fileId=id)
    fh = io.BytesIO()
    downloader = MediaIoBaseDownload(fh, request)
    done = False
    while done is False:
        status, done = downloader.next_chunk()
        print("Download %d%%." % int(status.progress() * 100))
Remember, seeing as you will now be using the Drive API as well as the Gmail API, you'll need to change the scopes in your project. Also remember to activate the Drive API in the developer console, update your OAuth consent screen and credentials, and delete the local token.pickle file.
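As a sketch of what the combined scopes list might look like when both APIs are used read-only (adjust these to the access your script actually needs):

```python
# Hypothetical combined scopes for a script that reads Gmail messages
# and downloads the linked Drive files, both read-only.
SCOPES = [
    "https://www.googleapis.com/auth/gmail.readonly",
    "https://www.googleapis.com/auth/drive.readonly",
]
print(SCOPES)
```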
References
Drive API Docs
Managing Downloads Guide
Gmail API Docs
The Drive API also has a limitation of downloading only 10 MB.
I was a bit curious about how I can send JSON data and files via Postman and receive the JSON data and the same file in my Flask application.
Is there a convenient way to send files, or shall I save the file in another route, generate a URL and pass it in the request JSON? Or shall I directly send the file and save it in my server's file system?
If I do so, can I fetch the file from the server?
I would appreciate any help.
Code :
import os
from flask import request
from flask_restful import Resource
from werkzeug.utils import secure_filename

class Test(Resource):
    def post(self):
        # keys = request.json.keys()
        dat = request.form['request']
        file_path = request.files['file_path']
        file_path.save(os.path.join(app.config['UPLOAD_FOLDER'], secure_filename(file_path.filename)))
        # create the folders when setting up your app
        os.makedirs(os.path.join(app.instance_path, 'htmlfi'), exist_ok=True)
        # when saving the file
        file_path.save(os.path.join(app.instance_path, 'htmlfi', secure_filename(file_path.filename)))
        print(dat)
        # company_id = flask_praetorian.current_user().company_id
        # data = dict(request.json)
        # print(data)
        return "done"

api.add_resource(Test, '/Test_data')
I am able to get the data; it is not JSON, but it is manageable. Is it efficient to send the file directly and save it in the file system, or is it better to use Google Cloud Storage, since I am using GCP? I was thinking about server load.
It is also hectic to check for valid keys. For example, with JSON I can do
if "keys" not in request.json.keys():
which makes my work easier, but with the form-data approach I have to check like request.form['request'][0] for the id key, and so on.
You can build the data in your Python code; you don't have to send a .json file to your server. If you are using a dictionary, convert it to JSON and send it in your request's body. You will see the data in Postman. If you want to save it as a JSON file, you can take the data and write it out on the server side.
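For example, the serialization step is just json.dumps (the payload below is made up; with the requests library, requests.post(url, json=payload) performs this step for you and sets the Content-Type header):

```python
import json

payload = {"id": 42, "name": "report", "tags": ["a", "b"]}

# This JSON string is what you would paste into Postman's raw body.
body = json.dumps(payload)
print(body)

# Server side (e.g. Flask's request.get_json()) turns it back into a dict.
received = json.loads(body)
print(received == payload)
```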
I want to read a file, which is in Google Drive, in read mode. Can we use the Drive API to open it in read mode instead of downloading it?
My folder structure
user
>1
>1.json
In this example, I have reached my JSON file, and now instead of downloading it I need to read the contents of the .json file.
Normally we use with open('1.json') as f: in Python, but how can I read and store the contents using the Drive API?
What you are asking is impossible: you cannot read the information of the file without downloading the actual file. If you think about it, you will see that it makes no sense to be able to read the bytes without them actually getting to your computer.
If what you mean is not to create an actual file on disk, you can use the io module and keep the data in memory.
There are actually examples of this in the Google documentation:
import io
from googleapiclient.http import MediaIoBaseDownload

file_id = '0BwwA4oUTeiV1UVNwOHItT0xfa2M'
request = drive_service.files().get_media(fileId=file_id)
fh = io.BytesIO()
downloader = MediaIoBaseDownload(fh, request)
done = False
while done is False:
    status, done = downloader.next_chunk()
    print("Download %d%%." % int(status.progress() * 100))
In this case we create the request with the get_media() method without executing it; after that we hand the request to MediaIoBaseDownload. This class is responsible for downloading the data in chunks (in case the file is big) into an io.BytesIO object, fh in this code snippet.
Now that I think about it, when the file is small enough you could probably just execute the get_media request without the need to chunk the download.
So in your case you can probably just execute the following code:
service = build('drive', 'v3', credentials=creds)
file_id = "<your file ID> "
response = service.files().get_media(fileId=file_id).execute()
print(response)
{
    "jola": "sadsa",
    "saas": 123
}
I was trying to export a Google Spreadsheet in csv format using the Google client library for Python:
# OAuth and setups...
req = g['service'].files().export_media(fileId=fileid, mimeType=MIMEType)
fh = io.BytesIO()
downloader = http.MediaIoBaseDownload(fh, req)
# Other file IO handling...
This works for MIME types application/pdf, MS Excel, etc.
According to Google's documentation, text/csv is supported, but when I try to make the request, the server gives a 500 Internal Error.
Even using Google's Drive API playground, it gives the same error.
Tried:
Like in v2, I added the field gid=0 to the request to specify the worksheet, but then it's a bad request.
This is a known bug in Google's code. https://code.google.com/a/google.com/p/apps-api-issues/issues/detail?id=4289
However, if you manually build your own request, you can download the whole file in bytes (the media-management helpers won't work).
With file as the file ID, and http as the http object that you've authorized against, you can download the file with:
from apiclient.http import HttpRequest

def postproc(*args):
    return args[1]

data = HttpRequest(http=http,
                   postproc=postproc,
                   uri='https://docs.google.com/feeds/download/spreadsheets/Export?key=%s&exportFormat=csv' % file,
                   headers={}).execute()
data here is a bytes object that contains your CSV. You can open it with something like:
import io
lines = io.TextIOWrapper(io.BytesIO(data), encoding='utf-8', errors='replace')
for line in lines:
    print(line)  # do whatever you need with each line
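If you need parsed rows rather than raw lines, the csv module accepts that same iterable. A small sketch, with in-memory bytes standing in for data:

```python
import csv
import io

# Stand-in for the bytes the export request would return.
data = b"name,qty\r\nwidget,3\r\ngadget,7\r\n"

lines = io.TextIOWrapper(io.BytesIO(data), encoding="utf-8", errors="replace")
rows = list(csv.reader(lines))
print(rows)
```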
You just need to implement exponential backoff.
Looking at this documentation of ExponentialBackOffPolicy.
The idea is that the servers are only temporarily unavailable, and they should not be overwhelmed when they are trying to get back up.
The default implementation requires back off for 500 and 503 status codes. Subclasses may override if different status codes are required.
Here is a snippet of an implementation of exponential backoff from the first link:
ExponentialBackOff backoff = ExponentialBackOff.builder()
    .setInitialIntervalMillis(500)
    .setMaxElapsedTimeMillis(900000)
    .setMaxIntervalMillis(6000)
    .setMultiplier(1.5)
    .setRandomizationFactor(0.5)
    .build();
request.setUnsuccessfulResponseHandler(new HttpBackOffUnsuccessfulResponseHandler(backoff));
You may want to look at this documentation for the summary of the ExponentialBackoff implementation.
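The snippet above is Java, from the Google API client for Java. The same policy can be hand-rolled in Python; here is a sketch mirroring those defaults (TransientServerError and call_with_backoff are made-up names, and the randomization factor is omitted to keep the example deterministic):

```python
import time

class TransientServerError(Exception):
    """Stand-in for an HttpError with status 500 or 503."""

def call_with_backoff(request_fn, max_retries=5, initial_interval=0.5,
                      multiplier=1.5, max_interval=6.0):
    """Retry request_fn on transient failures, sleeping an exponentially
    growing interval (capped at max_interval) between attempts."""
    interval = initial_interval
    for attempt in range(max_retries):
        try:
            return request_fn()
        except TransientServerError:
            if attempt == max_retries - 1:
                raise
            time.sleep(interval)
            interval = min(interval * multiplier, max_interval)

# Usage: a flaky call that fails twice, then succeeds.
attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientServerError("503 Service Unavailable")
    return "ok"

print(call_with_backoff(flaky, initial_interval=0.01))
```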
I use Python on App Engine. I'm trying to create a link on a webpage which a user can click to download a CSV file. How can I do this?
I've looked at the csv module, but it seems to want to open a file on the server, and App Engine doesn't allow that.
I've looked at remote_api, but it seems that it's only for uploading or downloading using the app config, and from the account owner's terminal.
Any help, thanks.
Pass a StringIO object as the first parameter to csv.writer; then set the content-type and content-disposition on the response appropriately (probably "text/csv" and "attachment", respectively) and send the StringIO as the content.
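A minimal sketch of the in-memory part of that approach (the column names and the suggested header values are illustrative, not from the question):

```python
import csv
import io

# Build the CSV entirely in memory; nothing touches the server's disk.
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["name", "qty"])
writer.writerow(["widget", "3"])

body = buf.getvalue()
print(body)

# The handler would then send `body` with headers along the lines of:
#   Content-Type: text/csv
#   Content-Disposition: attachment; filename="export.csv"
```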
I used this code:
self.response.headers['Content-Type'] = 'application/csv'
writer = csv.writer(self.response.out)
writer.writerow(['foo','foo,bar', 'bar'])
Put it in your handler's get method. When the user requests it, their browser will download the list content automatically.
Got from: generating a CSV file online on Google App Engine