Drive SDK not listing all my files - python

I am trying to list all the files in my drive (about 10) but the following will only list 1 (and that isn't even a real file of mine)....
the code:
from httplib2 import Http
from apiclient.discovery import build
from oauth2client.client import SignedJwtAssertionCredentials
client='my_client_id'
client_email = 'my_client_email'
# The .p12 key is binary, so read it in binary mode.
with open("/path/to/file.p12", "rb") as f:
    private_key = f.read()
credentials = SignedJwtAssertionCredentials(client_email, private_key, 'https://www.googleapis.com/auth/drive')
http_auth = credentials.authorize(Http())
drive_service = build('drive', 'v2', http=http_auth)
r = drive_service.files().list().execute()
files = r['items']
for f in files:
    print f['id'], f['title']
result:
"<file_id> How to get started with Drive"
EDIT:
This question is similar but the answer is to have the correct oauth scope, which I have above.
EDIT #2:
I thought it might be a timing issue so I gave it a few hours and still no goose.
EDIT #3:
If I try to copy a file from another user then list my files then I'll get 2 files:
" How to get started with Drive"
" My New File"
So, this is just listing files created by that app? How do I get the rest of my files???

You are authenticating with a service account. By default, a service account has no right to access your Drive data; it can only access files that it owns itself.
You have three options to work around this:
(1) Create a folder in your Drive account, and share it (read/write) with the service account. Any file you place in that folder will be readable and writable both by you and by your service account.
(2) If you use Google Apps for Business, set up domain-wide delegation to allow your service account to impersonate all users in your domain. That way you will be able to get your service account to behave as if it were your actual Google Apps account.
(3) Whether or not you use Google Apps for Business: do not use a service account, but rather 3-legged OAuth. With 3-legged OAuth you will be able to generate an access token and a refresh token that allow your application to act in Drive on behalf of your actual Google account. Note that this last option does not use service accounts at all.
The simplest is obviously option (1). If it is not acceptable then I would go with option (3), unless you actually want to be able to impersonate all the users in your domain.
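Option (1) can also be done through the API instead of the Drive UI. The following is a minimal sketch, assuming the Drive API v3 `permissions.create` method (the question itself uses v2) and a `service` object built with the folder owner's own credentials, not the service account's. The function name and its parameters are hypothetical, chosen for illustration.

```python
def share_folder_with_service_account(service, folder_id, service_account_email,
                                      role="writer"):
    """Grant a service account access to a Drive folder.

    `service` is assumed to be a Drive v3 client authorized as the folder's
    owner. `role` can be "reader" or "writer".
    """
    permission = {
        "type": "user",                         # a service account is shared with like a user
        "role": role,
        "emailAddress": service_account_email,  # the client_email from the key file
    }
    return service.permissions().create(
        fileId=folder_id,
        body=permission,
        sendNotificationEmail=False,  # a service account has no inbox to notify
        fields="id",
    ).execute()
```

Sharing the folder by hand in the Drive UI has the same effect; the sketch is only useful when the sharing step needs to be scripted.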

Related

GDrive export using Service Account creds fails with 404

I have a script to export text from a GDrive file using an OAuth client, which works perfectly well -
import googleapiclient.discovery as google
from apiclient.http import MediaIoBaseDownload
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
import datetime, io, os, pickle
Scopes=" ".join(['https://www.googleapis.com/auth/drive.file',
                 'https://www.googleapis.com/auth/drive.metadata',
                 'https://www.googleapis.com/auth/drive.readonly'])
TokenFile="token.pickle"
def init_creds(clientfile,
               scopes,
               tokenfile=TokenFile):
    token=None
    if os.path.exists(tokenfile):
        with open(tokenfile, 'rb') as f:
            token=pickle.load(f)
    if (not token or
        not token.valid or
        token.expiry < datetime.datetime.utcnow()):
        if (token and
            token.expired and
            token.refresh_token):
            token.refresh(Request())
        else:
            flow=InstalledAppFlow.from_client_secrets_file(clientfile, scopes)
            token=flow.run_local_server(port=0)
        with open(tokenfile, 'wb') as f:
            pickle.dump(token, f)
    return token
def export_text(id,
                clientfile,
                scopes=Scopes):
    creds=init_creds(clientfile=clientfile,
                     scopes=scopes)
    service=google.build('drive', 'v3', credentials=creds)
    request=service.files().export_media(fileId=id,
                                         mimeType='text/plain')
    buf=io.BytesIO()
    downloader, done = MediaIoBaseDownload(buf, request), False
    while done is False:
        status, done = downloader.next_chunk()
    destfilename="tmp/%s.txt" % id
    return buf.getvalue().decode("utf-8")
if __name__=='__main__':
    print (export_text(id="#{redacted}",
                       clientfile="/path/to/oath/client.json"))
But it's a pain to have to go through the OAuth flow every time, and since it's only me using the script I want to simplify things and use a Service Account instead, following on from this post -
Google Drive API Python Service Account Example
My new Service Account script, doing exactly the same thing, is as follows -
import googleapiclient.discovery as google
from oauth2client.service_account import ServiceAccountCredentials
from apiclient.http import MediaIoBaseDownload
import io
Scopes=" ".join(['https://www.googleapis.com/auth/drive.file',
                 'https://www.googleapis.com/auth/drive.metadata',
                 'https://www.googleapis.com/auth/drive.readonly'])
def export_text(id,
                clientfile,
                scopes=Scopes):
    creds=ServiceAccountCredentials.from_json_keyfile_name(clientfile,
                                                           scopes)
    service=google.build('drive', 'v3', credentials=creds)
    request=service.files().export_media(fileId=id,
                                         mimeType='text/plain')
    buf=io.BytesIO()
    downloader, done = MediaIoBaseDownload(buf, request), False
    while done is False:
        status, done = downloader.next_chunk()
    destfilename="tmp/%s.txt" % id
    return buf.getvalue().decode("utf-8")
if __name__=='__main__':
    print (export_text(id="#{redacted}",
                       clientfile="path/to/service/account.json"))
but when I run it for the same id, I get the following -
googleapiclient.errors.HttpError: <HttpError 404 when requesting https://www.googleapis.com/drive/v3/files/#{redacted}/export?mimeType=text%2Fplain&alt=media returned "File not found: #{redacted}.">
It feels like the Service Account script is passing the authentication step (ie Service Account creds are okay) but then failing when trying to fetch the file - weird as I can fetch it fine using the OAuth version :/
Any thoughts on what might be causing this 404 error in the Service Account version, given the OAuth client version clearly works for the same id?
Answer:
You need to share your file with the service account.
More Information:
As with any file, a user needs explicit permission to be able to see it. A service account is a separate entity from you, so this applies to it as well.
Using the file sharing settings (you can do this in the Drive UI by right-clicking the file and choosing Share), give the service account's email address the appropriate permission (read/write). The email address of the service account has the form:
service-account-name@project-id.iam.gserviceaccount.com
Before making your call, do a files.list to see which files the service account has access to. Doing a files.get on a file that the service account doesn't have access to will result in a "File not found" error. Remember that the service account is not you; it has its own Google Drive account. Any files you want to access need to be uploaded to its account or shared with the service account.
If the files.list call fails, that suggests something is wrong with the authorization, and you should make sure the client file is accessible; maybe that is the file it can't find.
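The check described above can be sketched as follows. This assumes a Drive API v3 `service` object already built with the service account's credentials; the helper names `list_visible_files` and `can_see` are hypothetical. The loop follows `nextPageToken` so that it does not stop at the first page of results.

```python
def list_visible_files(service, page_size=100):
    """Return every file the authenticated account can see (Drive API v3)."""
    files = []
    page_token = None
    while True:
        response = service.files().list(
            pageSize=page_size,
            fields="nextPageToken, files(id, name)",
            pageToken=page_token,  # None on the first request
        ).execute()
        files.extend(response.get("files", []))
        page_token = response.get("nextPageToken")
        if not page_token:
            break
    return files


def can_see(service, file_id):
    """True if file_id appears in the account's file listing."""
    return any(f["id"] == file_id for f in list_visible_files(service))
```

If `can_see(service, file_id)` is false for the ID that returns 404, the file has most likely not been shared with the service account yet.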
Granting service account access
Create a directory in your personal Google Drive account. Take the service account email address (it can be found in the key file you downloaded; it is the value with an @ in it). Then share that directory with the service account as you would share it with any other user.
Adding files to that directory may or may not give the service account access to them automatically. Permissions are a pain, so you may also need to share each file with the service account.
When the service account uploads a file, it becomes the owner, so remember to have it grant your personal account permissions to that file.

Using Google Service Account to Resumable Upload Videos

I'm using a Google Service Account to upload videos to Google Drive using the resumable upload method. The Python code works well, but I'm running into a storage issue with the Service Account.
It seems that a Google Service Account can only have 15 GB of storage. Even though I upload the video to a regular Google Drive folder, the video is still owned by the Service Account. I therefore tried to transfer ownership of the videos to a different account, but it didn't work; the request failed with a "bad request" error. User message: "You can't yet change the owner of this item. (We're working on it.)"
Below is my Python code, which generates an access token from the service account and performs the resumable upload:
import json
import os
import requests
from oauth2client.service_account import ServiceAccountCredentials
# file_location, file_name and folder_id are defined elsewhere in the script.
credentials = ServiceAccountCredentials.from_json_keyfile_name(
    'creds.json',
    scopes='https://www.googleapis.com/auth/drive'
)
delegated_credentials = credentials.create_delegated('service_account_email')
access_token = delegated_credentials.get_access_token().access_token
filesize = os.path.getsize(file_location)
# Retrieve session for resumable upload.
headers1 = {"Authorization": "Bearer " + access_token, "Content-Type": "application/json"}
params = {
    "name": file_name,
    "mimeType": "video/mp4",
    "parents": [folder_id]
}
r = requests.post(
    "https://www.googleapis.com/upload/drive/v3/files?uploadType=resumable",
    headers=headers1,
    data=json.dumps(params)
)
location = r.headers['Location']
# Upload the file.
headers2 = {"Content-Range": "bytes 0-" + str(filesize - 1) + "/" + str(filesize)}
r = requests.put(
    location,
    headers=headers2,
    data=open(file_location, 'rb')
)
Is there a workaround, or a way to increase the storage limit of Google Service Accounts?
Any advice would be very appreciated. Thank you!
You want to use the Service Account to make a Resumable Upload to Drive.
You want the owner of the video not to be the Service Account, but a regular account which has enough Drive storage capacity.
If that's correct, then you just have to delegate domain-wide authority to the Service Account, so that it can act on behalf of any user in the domain and, when uploading the file, impersonate the account you want to be the owner of the file.
Delegating domain-wide authority:
The process of granting domain-wide authority is explained here:
On the Service accounts page, select your Service Account, and while editing the SA, click SHOW DOMAIN-WIDE DELEGATION and, on the content that was just displayed, check the option Enable G Suite Domain-wide Delegation.
Once you've done this, go to the Admin console and then go to Main menu > Security > API Controls.
Select Manage Domain Wide Delegation in the Domain wide delegation pane, and click Add new.
Fill in the corresponding fields: (1) in Client ID, enter the SA's Client ID, which you can find both in the credentials JSON file and on the Service Account page, and (2) in OAuth scopes, add the scopes corresponding to the resources you want the SA to access on behalf of users in the domain. In this case, I guess that's just https://www.googleapis.com/auth/drive.
After clicking Authorize, you have given the Service Account the ability to access resources on behalf of any user in the domain.
Impersonating another user:
Now the Service Account can impersonate any user in the domain, but you have to specify which user you want it to impersonate. To do that, you just have to make a small change in your code. Right now, you're passing the service_account_email when delegating credentials via create_delegated:
delegated_credentials = credentials.create_delegated('service_account_email')
That is, the Service Account is acting on behalf of the Service Account. If you didn't want to impersonate another account, there would be no real need for this line of code: it doesn't have any effect, since credentials and delegated_credentials both refer to the same account (the Service Account).
But since you want to use the Service Account to act on behalf of another account, you have to specify this other account's email address on this line:
delegated_credentials = credentials.create_delegated('user_account_email')
That's the only change you need to make in your code. If you have granted domain-wide delegation, the Service Account will act as if it were this other user. It will be as if this other user had uploaded the file, so this user will be the owner of the file.
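To make the effect of that one-line change more concrete: in Google's service-account OAuth flow, the library signs a JWT whose claim set names the service account as issuer, and impersonation adds a `sub` claim naming the user to act as. The sketch below only builds that claim set as a plain dictionary for illustration; the signing step is omitted. The function `build_claim_set` and its parameters are hypothetical and are not part of oauth2client.

```python
import time

# Token endpoint used as the JWT audience in Google's service-account flow.
TOKEN_URI = "https://oauth2.googleapis.com/token"


def build_claim_set(service_account_email, scopes, subject=None,
                    now=None, lifetime=3600):
    """Build the (unsigned) JWT claim set for a service-account token request.

    subject=None  -> the service account acts as itself.
    subject=email -> the service account asks to act as that domain user,
                     which only works once domain-wide delegation is granted.
    """
    issued_at = int(time.time() if now is None else now)
    claims = {
        "iss": service_account_email,
        "scope": " ".join(scopes),
        "aud": TOKEN_URI,
        "iat": issued_at,
        "exp": issued_at + lifetime,  # Google allows at most one hour
    }
    if subject is not None:
        claims["sub"] = subject
    return claims
```

Without a `sub` claim, files are created in the Service Account's own Drive, which is why they count against its storage quota rather than the user's.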
Note:
You are using a deprecated library (oauth2client). Since this is still working, there is no real need to do it now, but please consider changing your code to google-auth.
Reference:
Delegating domain-wide authority to the service account

Django server RW access to self owned google calendar?

In a django application, I try to have RW access to a google calendar which I own myself.
Tried several ways with a service account & client secrets, but all resulting in authentication errors.
The API explorer works, but it requests consent in a popup window, which is obviously not acceptable.
Documentation on google OAuth2 describes several scenarios. Probably "web server application" applies here? It says:
"The authorization sequence begins when your application redirects a
browser to a Google URL; the URL includes query parameters that
indicate the type of access being requested. Google handles the user
authentication, session selection, and user consent. The result is an
authorization code, which the application can exchange for an access
token and a refresh token."
Again, we do not want a browser redirection, we want direct access to the google calendar.
So question is: how can a django server access a google calendar, on which I have full rights, view events and add events using a simple server stored key or similar mechanism?
With help of DalmTo and this great article, I got RW access to a google calendar working from python code. I will summarize the solution here.
Here are the steps:
First of all, register for a Google service account. Service accounts are pre-authorized accounts that spare you from having to get consent or refresh keys every time:
https://developers.google.com/identity/protocols/OAuth2ServiceAccount
(The part on G-suite can be ignored)
Download the service account credentials and store them safely. Your python code will need access to this file.
Go to your google calendar you want to get access to.
e.g. https://calendar.google.com/calendar/r/month
On the right side you see your calendars. Create an additional one for testing (since we'll write to it soon). Then point to this new calendar: click the 3 dots next to it and edit the sharing settings. Add the service account email address to the share under "share with specific people". (you can find the service account email address in the file downloaded previously under "client_email")
In the same screen, note the "calendar ID", you'll need it in below code.
Now your service account has RW rights to the calendar.
Add at least one event to the calendar using the web UI (https://calendar.google.com/calendar/r/month) so we can read and change it from the code below.
Then use the following Python code to read the calendar and change an event.
from google.oauth2 import service_account
import googleapiclient.discovery
SCOPES = ['https://www.googleapis.com/auth/calendar']
SERVICE_ACCOUNT_FILE = '<path to your service account file>'
CAL_ID = '<your calendar ID>'
credentials = service_account.Credentials.from_service_account_file(SERVICE_ACCOUNT_FILE, scopes=SCOPES)
service = googleapiclient.discovery.build('calendar', 'v3', credentials=credentials)
events_result = service.events().list(calendarId=CAL_ID).execute()
events = events_result.get('items', [])
event_id = events[0]['id']
event = events[0]
service.events().update(calendarId=CAL_ID, eventId=event_id, body={"end":{"date":"2018-03-25"},"start":{"date":"2018-03-25"},"summary":"Kilroy was here"}).execute()
And there you go... read an event and updated the event.
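Two details in the snippet above may be worth guarding against. First, `events[0]` raises an IndexError if the calendar is empty. Second, the Calendar API treats the `end.date` of an all-day event as exclusive, so a one-day event would normally end on the following day. The helpers below are a small sketch under those assumptions; the names `all_day_event_body` and `first_event_id` are hypothetical.

```python
import datetime


def all_day_event_body(summary, day, days=1):
    """Build an all-day event body; the end date is exclusive."""
    end = day + datetime.timedelta(days=days)
    return {
        "summary": summary,
        "start": {"date": day.isoformat()},
        "end": {"date": end.isoformat()},
    }


def first_event_id(events_result):
    """Return the id of the first listed event, or None if there are none."""
    events = events_result.get("items", [])
    return events[0]["id"] if events else None
```

The resulting dictionary can be passed as `body` to `service.events().update(...)` in place of the literal used above.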

Using Google Admin to view Drive files Domain-wide

I'm trying to list all Google Drive files Domain-wide, both users that still work here, and those that have moved on. With that, we can grep the output for certain terms (former customers) to delete customer-related files.
I believe I have a successful way to list all users using the Admin SDK Quickstart, since we have only about 200 total users (max is 500). I also have a way to list all files for a user using the Drive REST API's files.list() method. What I need to know is how to impersonate each user iteratively, in order to run the file listing script.
I have found the blurb .setServiceAccountUser(someone@domain.com) but I'm not really sure where to implement this, either in the service account authorization step, or in a separate middle-man script.
Have a look at https://github.com/pinoyyid/googleDriveTransferOwnership/blob/master/src/couk/cleverthinking/tof/Main.java
Specifically lines 285-299 which deal with generating a credential for an impersonated user.
GoogleCredential.Builder builder = new GoogleCredential.Builder()
    .setTransport(HTTP_TRANSPORT)
    .setJsonFactory(JSON_FACTORY)
    .setServiceAccountId(serviceAccountEmailAddress)
    .setServiceAccountPrivateKeyFromP12File(f)
    .setServiceAccountScopes(Collections.singleton(SCOPE));
// if requested, impersonate a domain user
if (!"ServiceAccount".equals(impersonatedAccountEmailAddress)) {
    builder.setServiceAccountUser(impersonatedAccountEmailAddress);
}
// build the Drive service
Drive service = new Drive.Builder(HTTP_TRANSPORT, JSON_FACTORY, null)
    .setApplicationName("TOF")
    .setHttpRequestInitializer(builder.build()).build();
This is Java, but should at least tell you what the steps are.
You need to implement the authorization flow for Service Accounts.
Once you create a service account in a GCP project (console.developers.google.com), enable DWD (domain-wide delegation), then authorize that service account in your G Suite admin console, that key can then be used to "impersonate" any account in the G Suite instance:
Create the credentials object from the JSON key file:
from oauth2client.service_account import ServiceAccountCredentials
scopes = ['https://www.googleapis.com/auth/gmail.readonly']
credentials = ServiceAccountCredentials.from_json_keyfile_name(
    '/path/to/keyfile.json', scopes=scopes)
Create a credential that can impersonate user@example.org (it could be any user in the domain):
delegated_credentials = credentials.create_delegated('user@example.org')
Authorize the delegated credential object (i.e. get an access_token):
from httplib2 import Http
http_auth = delegated_credentials.authorize(Http())
Call the Gmail API:
from apiclient import discovery
service = discovery.build('gmail', 'v1', http=http_auth)
response = service.users().messages().list(userId='user@example.org').execute()
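For the original question (listing Drive files for every user in turn), the same delegation step can be repeated in a loop. The sketch below assumes a hypothetical callback `build_drive_for(email)` that returns a Drive v3 service authorized as that user, for example by wrapping `credentials.create_delegated(email)` and `discovery.build('drive', 'v3', http=...)` with a Drive scope. The query `'me' in owners` is an assumption added here to skip files that are merely shared with the user, so they are not reported twice.

```python
def list_files_per_user(user_emails, build_drive_for):
    """Map each user's email to the (id, name) pairs of the files they own.

    build_drive_for is a hypothetical factory: email -> Drive v3 service
    object already authorized to act as that user.
    """
    results = {}
    for email in user_emails:
        service = build_drive_for(email)
        owned = []
        page_token = None
        while True:
            response = service.files().list(
                q="'me' in owners",
                fields="nextPageToken, files(id, name)",
                pageToken=page_token,
            ).execute()
            owned.extend((f["id"], f["name"]) for f in response.get("files", []))
            page_token = response.get("nextPageToken")
            if not page_token:
                break
        results[email] = owned
    return results
```

The list of emails would come from the Admin SDK user listing mentioned in the question; the resulting dictionary can then be searched for the customer names.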

Service Accounts, web OAuth and the Directory API

I'm having issues with the Directory API + Service Accounts (Google APIs). This is my current setup:
A web page has an OAuth2 login link like this: https://accounts.google.com/o/oauth2/auth?access_type=offline&state=%2Fprofile&redirect_uri=##REDIR##&response_type=code&client_id=##CLIENTID##&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fadmin.directory.user.readonly
Users log in there, authorizing the app to access the Directory API in read-only mode on their behalf.
I then try to retrieve the users of the domain of a given user (by knowing its email address), using the Directory API.
Python code:
from apiclient.discovery import build
from oauth2client.client import SignedJwtAssertionCredentials
import httplib2
CLIENT_ID = "xzxzxzxzxzxz.apps.googleusercontent.com"
APP_EMAIL = "xzxzxzxzxzxz@developer.gserviceaccount.com"
SCOPES = ('https://www.googleapis.com/auth/admin.directory.user.readonly')
f = file('key.p12', 'rb')
key = f.read()
f.close()
credentials = SignedJwtAssertionCredentials(APP_EMAIL, key, SCOPES, sub="user@example.com")
http = httplib2.Http()
http = credentials.authorize(http)
directory_service = build('admin', 'directory_v1', http=http)
users = directory_service.users().list(domain="example.com").execute()
print users
I have also tried changing sub="user@example.com" to the app owner, like this: sub="appowner@company.com", to no avail.
Another thing I have tried is not using impersonation at all (ie. removing the sub=xx part), which leads me to this error:
apiclient.errors.HttpError: https://www.googleapis.com/admin/directory/v1/users?domain=example.com&alt=json returned "Not Authorized to access this resource/api">
Using impersonation always yields me this. I have verified it has to do with the scopes and the api which I try to call:
oauth2client.client.AccessTokenRefreshError: access_denied
Now, the actual questions:
Should I be using service accounts? For me, it is the most convenient way as I don't have to be storing tokens which can be outdated altogether.
If service accounts are the way to go, what am I doing wrong in the way I use them? Impersonation with either the Google Apps administrator account (which logs in via OAuth web) or the app owner account does not seem to work.
