PyDrive: Preventing Google Drive authorization from expiring

In a Google Colab notebook, I am running a block of code that takes several hours to complete, and at the end a file is uploaded to my Google Drive.
The issue is that my credentials sometimes expire before the code can upload the file. I have looked around and found some code that may refresh my credentials, but I am not fully familiar with how PyDrive works or what exactly this code is doing.
Here is the code I am using so far to set up my notebook to access my Google Drive:
!pip install -U -q PyDrive
from google.colab import files
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
And this is the code I use to upload the file:
uploadModel = drive.CreateFile()
uploadModel.SetContentFile('filename.file')
uploadModel.Upload()
This is the code I found which may solve my issue (found here PyDrive guath.Refresh() and Refresh Token Issues )
if gauth.credentials is None:
    # Authenticate if they're not there
    gauth.LocalWebserverAuth()
elif gauth.access_token_expired:
    # Refresh them if expired
    print("Google Drive Token Expired, Refreshing")
    gauth.Refresh()
else:
    # Initialize the saved creds
    gauth.Authorize()
# Save the current credentials to a file
gauth.SaveCredentialsFile("GoogleDriveCredentials.txt")
So I am guessing the gauth.Refresh() line prevents my credentials from expiring?

When a user authenticates your application, you are given an access token and a refresh token.
Access tokens are used to access the Google APIs. If you need to access private data owned by a user, for example their Google Drive account, you need their permission to access it. The catch with access tokens is that they are short-lived: they work for an hour. Once the access token has expired it no longer works, and this is where the refresh token comes into play.
Refresh tokens, for the most part, do not expire. As long as the user does not revoke your application's consent to access their data through their Google account, you can use the refresh token to request a new access token.
The line elif gauth.access_token_expired: checks whether the access token has expired or is probably about to expire. If it has, gauth.Refresh() will refresh it. Just make sure you have get_refresh_token: True so that you have a refresh token.
I am a little surprised that the library isn't doing this for you automatically, but I am not familiar with PyDrive. The Google APIs Python client library automatically refreshes the access token for you.
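For a job that runs for hours, one way to apply this is to check the token immediately before the upload rather than once at the start. The following is a minimal sketch, not PyDrive's own recipe: upload_with_fresh_token is a hypothetical helper name, it assumes gauth already holds credentials that include a refresh token, and it uses only the access_token_expired, Refresh() and Authorize() members that appear in the question's snippet.

```python
def upload_with_fresh_token(gauth, drive, path):
    """Refresh the access token if needed, then upload `path` with PyDrive.

    Hypothetical helper: `gauth` is expected to be a pydrive.auth.GoogleAuth
    instance and `drive` the GoogleDrive object built from it.
    """
    if gauth.credentials is None:
        # Nothing to refresh; an interactive login is needed first.
        raise RuntimeError("no credentials loaded, authenticate first")
    if gauth.access_token_expired:
        # Exchange the refresh token for a new short-lived access token.
        gauth.Refresh()
    else:
        # Token still valid; make sure the HTTP client is authorized.
        gauth.Authorize()
    upload = drive.CreateFile()
    upload.SetContentFile(path)
    upload.Upload()
    return upload
```

Calling upload_with_fresh_token(gauth, drive, 'filename.file') at the end of the long computation would replace the three upload lines from the question.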

Related

Why is the export to the Drive empty, when using Google Earth Engine in Python with authentication through service account?

I'm using the Google Earth Engine in Python to get a Sentinel-2 composite and export it to my Google Drive. With manual authentication everything works fine:
ee.Authenticate()
ee.Initialize()
However, since I want to use my code in a workflow without authenticating manually every time, I am using a service account, as described here. It works fine and I can use GEE without doing anything manually:
# get service account
service_account = 'test@test.iam.gserviceaccount.com'
# get credentials
credentials = ee.ServiceAccountCredentials(service_account, 'gee_secret.json')
ee.Initialize(credentials)
To export my file to Google Drive I'm using the following code:
# export options
export_config = {
    'scale': 10,
    'region': aoi,  # aoi is a polygon
    'crs': 'EPSG:3031',
}
file_name = "test"
# export to drive
task = ee.batch.Export.image.toDrive(image, file_name, **export_config)
task.start()
With both authentication methods this task finishes successfully (the status of the task is 'Completed'). However, I can only see the exported image in my Drive when using manual authentication. When using automatic authentication, my Drive is empty.
Someone else already asked a similar question here. One idea raised there was that the image file is exported to the Google Drive of the service account and not to my personal Google Drive. However, I am not sure how to access this other Drive.
Does anyone have an idea how to solve this (that is, how to access the exported file)? Or is there another way to authenticate automatically so that the file ends up in my personal Google Drive?
Many thanks to DaImTo and Yancy Godoy for the hints! With these I could find a solution. I will post it here in case it is useful for others as well.
Indeed, the export to Google Drive worked perfectly; however, the file was exported not to my personal Google Drive but to the Google Drive of the service account. It was therefore important to give my service account access to Google Drive (see here).
Below is a complete workflow. For the download I am using PyDrive:
import ee
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from oauth2client.service_account import ServiceAccountCredentials
# key_path is the directory that holds the key files (defined elsewhere)
# this file contains the e-mail address of the service account
with open(key_path + '/service_worker_mail.txt', 'r') as file:
    service_account_file = file.read().replace('\n', '')
# get the credentials
credentials = ee.ServiceAccountCredentials(service_account_file, key_path + "/" + "gee_secret.json")
# authenticate and initialize Google Earth Engine
ee.Initialize(credentials)
# The following operations of Google Earth Engine, including the Export to the Drive
# ...
# ...
# ...
# authenticate to Google Drive (of the service account)
gauth = GoogleAuth()
scopes = ['https://www.googleapis.com/auth/drive']
gauth.credentials = ServiceAccountCredentials.from_json_keyfile_name(key_path + "/" + "gee_secret.json", scopes=scopes)
drive = GoogleDrive(gauth)
# get list of files
file_list = drive.ListFile({'q': "'root' in parents and trashed=false"}).GetList()
for file in file_list:
    filename = file['title']
    # download file into working directory (in this case a tiff-file)
    file.GetContentFile(filename, mimetype="image/tiff")
    # delete file afterwards to keep the Drive empty
    file.Delete()
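The Drive listing only finds the image once the export task has finished, so a script that runs unattended has to wait between the export and the download. A minimal polling sketch follows. It assumes the task object exposes status() returning a dict with a 'state' key (values such as 'COMPLETED' or 'FAILED'), as Earth Engine batch tasks do; wait_for_task, the poll interval and the timeout are hypothetical choices and not part of the workflow above.

```python
import time


def wait_for_task(task, poll_seconds=30, timeout_seconds=3600, sleep=time.sleep):
    """Block until `task` reaches a terminal state and return that state.

    `task` only needs a status() method returning a dict with a 'state' key.
    `sleep` is injectable so the loop can be exercised without real waiting.
    """
    terminal = {"COMPLETED", "FAILED", "CANCELLED"}
    waited = 0
    while True:
        state = task.status().get("state")
        if state in terminal:
            return state
        if waited >= timeout_seconds:
            raise TimeoutError(f"task still {state} after {waited} s")
        sleep(poll_seconds)
        waited += poll_seconds
```

One would call wait_for_task(task) after task.start() and continue to the download only if it returns 'COMPLETED'.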

Headless Google Auth

I have written this code to upload an xlsx file to Google Drive. Everything works fine, except that it needs to run automatically.
Can I pass Google auth without opening the browser, as GoogleAuth does (headless auth)?
I will put this code on a server and run it from a cron job every hour, so I need it to log in by itself.
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
gauth = GoogleAuth()
drive = GoogleDrive(gauth)
upload_file_list = ['traffic.xlsx']
for upload_file in upload_file_list:
    gfile = drive.CreateFile({'parents': [{'id': '1TpwAxTEZ3m8OiEyNOJypxXSq4W72qUAp'}]})
    gfile.SetContentFile(upload_file)
    gfile.Upload()
By design, Google auth is intended to prevent automatic login and insists on a real human being proving their identity. You can circumvent that in two ways:
1. use a normal browser to get a valid refresh token with a very distant expiration date, and use that token in your application
2. use Selenium to fake a browser interaction with Google
Both ways actually lower the security of your account, because an attacker who gets hold of your server gains a way to impersonate you. The former should limit the impersonation to API interaction from a registered app, while the latter allows unlimited impersonation, because the Selenium script will contain your Google password.
More details on the former approach here
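With PyDrive, the refresh-token option can be sketched using its credential-file helpers: log in once in a browser on any machine, save the credentials, copy the file to the server, and let each cron run reuse it. This is an illustration under assumptions: authorize_headless and the default file name are hypothetical, and the saved file is only expected to contain a refresh token if PyDrive was configured with get_refresh_token: True at the time of the first login.

```python
def authorize_headless(gauth, credentials_file="credentials.json"):
    """Reuse saved PyDrive credentials without opening a browser.

    Hypothetical helper: `gauth` is expected to be a pydrive.auth.GoogleAuth
    instance; `credentials_file` was written earlier by SaveCredentialsFile().
    """
    gauth.LoadCredentialsFile(credentials_file)
    if gauth.credentials is None:
        # No usable file yet: the one-time interactive login has to be done
        # elsewhere first; a cron job cannot complete it.
        raise RuntimeError("no saved credentials in " + credentials_file)
    if gauth.access_token_expired:
        # Trade the stored refresh token for a fresh access token.
        gauth.Refresh()
    else:
        gauth.Authorize()
    # Persist the (possibly renewed) token for the next run.
    gauth.SaveCredentialsFile(credentials_file)
    return gauth
```

In the upload script above, this would be called as authorize_headless(gauth) right after gauth = GoogleAuth() and before building GoogleDrive(gauth).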

GDrive export using Service Account creds fails with 404

I have a script to export text from a GDrive file using an OAuth client, which works perfectly well -
import googleapiclient.discovery as google
from apiclient.http import MediaIoBaseDownload
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
import datetime, io, os, pickle
Scopes=" ".join(['https://www.googleapis.com/auth/drive.file',
                 'https://www.googleapis.com/auth/drive.metadata',
                 'https://www.googleapis.com/auth/drive.readonly'])
TokenFile="token.pickle"
def init_creds(clientfile,
               scopes,
               tokenfile=TokenFile):
    token=None
    # reuse the cached token if there is one
    if os.path.exists(tokenfile):
        with open(tokenfile, 'rb') as f:
            token=pickle.load(f)
    if (not token or
        not token.valid or
        token.expiry < datetime.datetime.utcnow()):
        if (token and
            token.expired and
            token.refresh_token):
            # refresh silently when a refresh token is available
            token.refresh(Request())
        else:
            # otherwise run the interactive OAuth flow
            flow=InstalledAppFlow.from_client_secrets_file(clientfile, scopes)
            token=flow.run_local_server(port=0)
        with open(tokenfile, 'wb') as f:
            pickle.dump(token, f)
    return token
def export_text(id,
                clientfile,
                scopes=Scopes):
    creds=init_creds(clientfile=clientfile,
                     scopes=scopes)
    service=google.build('drive', 'v3', credentials=creds)
    request=service.files().export_media(fileId=id,
                                         mimeType='text/plain')
    buf=io.BytesIO()
    downloader, done = MediaIoBaseDownload(buf, request), False
    while done is False:
        status, done = downloader.next_chunk()
    destfilename="tmp/%s.txt" % id
    return buf.getvalue().decode("utf-8")
if __name__=='__main__':
    print (export_text(id="#{redacted}",
                       clientfile="/path/to/oath/client.json"))
But it's a pain to have to go through the OAuth flow every time, and since it's only me using the script I want to simplify things and use a Service Account instead, following on from this post -
Google Drive API Python Service Account Example
My new Service Account script, doing exactly the same thing, is as follows -
import googleapiclient.discovery as google
from oauth2client.service_account import ServiceAccountCredentials
from apiclient.http import MediaIoBaseDownload
import io
Scopes=" ".join(['https://www.googleapis.com/auth/drive.file',
                 'https://www.googleapis.com/auth/drive.metadata',
                 'https://www.googleapis.com/auth/drive.readonly'])
def export_text(id,
                clientfile,
                scopes=Scopes):
    creds=ServiceAccountCredentials.from_json_keyfile_name(clientfile,
                                                           scopes)
    service=google.build('drive', 'v3', credentials=creds)
    request=service.files().export_media(fileId=id,
                                         mimeType='text/plain')
    buf=io.BytesIO()
    downloader, done = MediaIoBaseDownload(buf, request), False
    while done is False:
        status, done = downloader.next_chunk()
    destfilename="tmp/%s.txt" % id
    return buf.getvalue().decode("utf-8")
if __name__=='__main__':
    print (export_text(id="#{redacted}",
                       clientfile="path/to/service/account.json"))
but when I run it for the same id, I get the following -
googleapiclient.errors.HttpError: <HttpError 404 when requesting https://www.googleapis.com/drive/v3/files/#{redacted}/export?mimeType=text%2Fplain&alt=media returned "File not found: #{redacted}.">
It feels like the Service Account script is passing the authentication step (ie Service Account creds are okay) but then failing when trying to fetch the file - weird as I can fetch it fine using the OAuth version :/
Any thoughts on what might be causing this 404 error in the Service Account version, given the OAuth client version clearly works for the same id?
Answer:
You need to share your file with the service account.
More Information:
As with any file, you need to give a user explicit permission to be able to see it. As a service account is a separate entity from you, this applies to it as well.
Using the file sharing settings (you can do this in the Drive UI by right-clicking the file and choosing Share), give the service account's email address the appropriate permission (read/write). The email address of the service account has the form:
service-account-name@project-id.iam.gserviceaccount.com
Before making your call, do a files.list to see which files the service account has access to. Doing a files.get on a file that the service account doesn't have access to will result in a file not found error. Remember that the service account is not you; it has its own Google Drive account. Any files you want to access need to be uploaded to its account or shared with the service account.
If the files.list call fails, that would suggest to me that something is wrong with the authorization, and you should ensure that the service account has access to the client file; maybe it is that file it can't find.
Granting service account access
Create a directory on your personal Google Drive account. Take the service account email address (it can be found in the key file you downloaded; it has an @ in it). Then share that directory on your Drive account with the service account, as you would share with any other user.
Adding files to that directory may or may not give the service account access to them automatically. Permissions are a pain; you may also need to share the file with the service account.
Remember to have the service account grant your personal account permissions on any file it uploads, since the service account is going to be the owner.
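Both checks can be scripted with the Drive v3 client that the question already builds. The sketch below is hedged: list_visible_files and share_with_user are hypothetical helper names, service is assumed to be the object returned by google.build('drive', 'v3', credentials=creds), and the role value is only an example.

```python
def list_visible_files(service, page_size=100):
    """Return (id, name) pairs for every file the credentials can see."""
    found, token = [], None
    while True:
        resp = service.files().list(
            pageSize=page_size,
            fields="nextPageToken, files(id, name)",
            pageToken=token,
        ).execute()
        found.extend((f["id"], f["name"]) for f in resp.get("files", []))
        token = resp.get("nextPageToken")
        if not token:
            return found


def share_with_user(service, file_id, email, role="reader"):
    """Grant `email` access to `file_id`, e.g. from the service account to you."""
    body = {"type": "user", "role": role, "emailAddress": email}
    return service.permissions().create(
        fileId=file_id, body=body, fields="id"
    ).execute()
```

If the id being exported does not appear in list_visible_files(service), the 404 is a sharing problem rather than an authentication problem.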

How to log in as different user to Google API v3?

I'm trying to create a new calendar, but I want to be able to specify which Google account to create it in, assuming I have the credentials for that account, which I do. The code below creates it for the currently signed-in user, or requires user interaction to allow access. Is there a way to specify a user and run the command in the background? I essentially just want to add a calendar to my account when the program runs, but I can't guarantee that my account will be logged in at the time.
I believe this was possible with version 2 of the Google API through ClientLogin, but I'm trying to use version 3.
import gflags
import httplib2
from apiclient.discovery import build
from oauth2client.file import Storage
from oauth2client.client import OAuth2WebServerFlow
from oauth2client.tools import run
FLAGS = gflags.FLAGS
FLOW = OAuth2WebServerFlow(
    client_id='MY_CLIENT_KEY',
    client_secret='MY_CLIENT_SECRET',
    scope='https://www.googleapis.com/auth/calendar',
    user_agent='MyApp/v1')
storage = Storage('calendar.dat')
credentials = storage.get()
if credentials is None or credentials.invalid == True:
    credentials = run(FLOW, storage)
http = httplib2.Http()
http = credentials.authorize(http)
service = build(serviceName='calendar', version='v3', http=http)
calendar = {
    'summary': 'Test3',
    'timeZone': 'America/New_York'
}
created_calendar = service.calendars().insert(body=calendar).execute()
With V3, you'll need to use a Service Account in order to act as the user. The process is described in the Google Drive documentation; you just need to use the Calendar API v3 scopes and references instead of the Google Drive API ones.
Another option would be to store the OAuth2 refresh token and use it to obtain valid access tokens even when the user is not logged in. See my reply to "google Calendar api v3 Auth only for first time".
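The refresh-token option boils down to one HTTP POST to Google's OAuth 2.0 token endpoint. The sketch below only builds that request, so it can be inspected without network access or real secrets. build_refresh_request is a hypothetical helper name and the argument values are placeholders; sending the request (for example with urllib.request.urlopen) is expected to return JSON that contains a new access_token.

```python
import urllib.parse
import urllib.request

# Google's OAuth 2.0 token endpoint.
TOKEN_URL = "https://oauth2.googleapis.com/token"


def build_refresh_request(client_id, client_secret, refresh_token):
    """Build (but do not send) the POST that trades a refresh token
    for a new access token."""
    body = urllib.parse.urlencode({
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Client libraries such as oauth2client perform this exchange internally when stored credentials are refreshed, so a hand-built request is mainly useful for debugging.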

PyDrive and Google Drive - automate verification process?

I'm trying to use PyDrive to upload files to Google Drive using a local Python script which I want to automate so it can run every day via a cron job. I've stored the client OAuth ID and secret for the Google Drive app in a settings.yaml file locally, which PyDrive picks up to use for authentication.
The problem I'm getting is that although this works some of the time, every so often it decides it needs me to provide a verification code (if I use CommandLineAuth), or it takes me to a browser to enter the Google account password (LocalWebserverAuth), so I can't automate the process properly.
Anybody know which settings I need to tweak - either in PyDrive or on the Google OAuth side - in order to set this up once and then trust it to run automatically without further user input in future?
Here's what the settings.yaml file looks like:
client_config_backend: settings
client_config:
  client_id: MY_CLIENT_ID
  client_secret: MY_CLIENT_SECRET
save_credentials: True
save_credentials_backend: file
save_credentials_file: credentials.json
get_refresh_token: False
oauth_scope:
  - https://www.googleapis.com/auth/drive.file
You can (and should) create a service account, with an id and private key from the Google API console. This won't require re-verification, but you'll need to keep the private key private.
Create a credential object based on the Google Python example and assign it to the PyDrive GoogleAuth() object:
import base64
import httplib2
from apiclient.discovery import build
from oauth2client.client import SignedJwtAssertionCredentials
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
# from google API console - convert private key to base64 or load from file
id = "...@developer.gserviceaccount.com"
key = base64.b64decode(...)
credentials = SignedJwtAssertionCredentials(id, key, scope='https://www.googleapis.com/auth/drive')
credentials.authorize(httplib2.Http())
gauth = GoogleAuth()
gauth.credentials = credentials
drive = GoogleDrive(gauth)
EDIT (Sep 2016): For the latest integrated google-api-python-client (1.5.3) you would use the following code, with id and key the same as before:
import io
import httplib2
from apiclient import discovery
from oauth2client.service_account import ServiceAccountCredentials
# key holds the decoded bytes of the p12 key, so wrap it in a bytes buffer
credentials = ServiceAccountCredentials.from_p12_keyfile_buffer(id, io.BytesIO(key), scopes='https://www.googleapis.com/auth/drive')
http = credentials.authorize(httplib2.Http())
drive = discovery.build("drive", "v2", http=http)
