Google Drive API: Avoiding Manual Authentication - python

I want to write a Python script that can connect to the Google Drive API without having to manually authenticate on every device the script is run on.
I am writing some Python code for a research study that is going to be run at various study locations. For data privacy reasons, we cannot store data locally and need to write it to the cloud (ideally Google Drive). A member of our team will not be present at every location where the software is run, so any sort of initial manual authentication (entering a username and password at the different sites for OAuth) is off the table for us.
I've looked into the Google Drive API (Python), and am wondering if there is a way for a device running my script to get a Refresh token (and subsequent Access tokens) to modify a Google Sheet without needing to manually authenticate on each device.
Is there any way to make this possible with the Google Drive API (by having some sort of 'secret' that the code could store)? If not, are there any other cloud services that may be able to accommodate this?
Additionally, the Python script is being run as part of an executable (produced from Vizard, probably irrelevant but mentioning it just in case).

Yes, it can be done; see How do I authorise an app (web or installed) without user intervention?
However, it's probably a bad idea for two reasons. First, if you distribute code with embedded secrets (technically the secret is a Refresh Token), they tend not to stay secret for long. Second, there is a chance that the Refresh Token will expire, leaving your users dead in the water.
I would suggest that you consider one of the following:
A Service Account
Writing an OAuth proxy, which you can host for free on Google AppEngine, which puts all of the secret stuff on a server and from which your app can fetch Access Tokens as they are needed.
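As a rough illustration of the Service Account option, here is a minimal sketch. It assumes the google-auth and google-api-python-client packages are installed, that a service account key file ships with the script, and that the target Drive folder has been shared with the service account's email address. The function names, the key path, and the folder ID are hypothetical, and the key file is itself a secret, so the warning above about embedded secrets still applies.

```python
import os

# Full Drive scope is an assumption; a narrower scope may be enough for your case.
SCOPES = ["https://www.googleapis.com/auth/drive"]


def file_metadata(name, folder_id):
    """Drive v3 metadata that places a new file inside an existing folder."""
    return {"name": name, "parents": [folder_id]}


def upload_csv(key_path, local_path, folder_id):
    """Upload one CSV file with service account credentials (no browser step)."""
    # Imported here so the third-party packages are only needed at upload time.
    from google.oauth2 import service_account
    from googleapiclient.discovery import build
    from googleapiclient.http import MediaFileUpload

    creds = service_account.Credentials.from_service_account_file(
        key_path, scopes=SCOPES
    )
    drive = build("drive", "v3", credentials=creds)
    media = MediaFileUpload(local_path, mimetype="text/csv")
    created = drive.files().create(
        body=file_metadata(os.path.basename(local_path), folder_id),
        media_body=media,
        fields="id",
    ).execute()
    return created["id"]  # ID of the newly created Drive file
```

A call might look like upload_csv("service-account.json", "session01.csv", "<folder id>"), where all three arguments are placeholders for your own values.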

Related

Daemon application authentication for OneDrive files

I have a OneDrive for Business user account within a large organization. I'd like to have a daemon service running (Python) that automatically uploads files to this user's OneDrive.
This service will be running in a headless VM, so browser-based authentication (especially if it needs to be done more than once) is very difficult.
What are my options for authenticating this app to allow it to write to the user's OneDrive? I've registered an app and created a client secret for it. I was experimenting with the authorization flow described here, but that SDK is deprecated and no longer supported, so I'd prefer to use Graph if possible.
What are my options for authentication with Python in this scenario, and is any sample code / example available?
Both delegated and application permissions are supported on MS Graph API: https://learn.microsoft.com/en-us/graph/api/drive-list?view=graph-rest-1.0&tabs=http. Application permissions might not be acceptable for your use case since they would allow access to all users' OneDrives?
Application permissions would definitely be the easiest choice.
But you can also implement this scenario using delegated permissions.
You would need the user to initialize the process by authenticating interactively once.
When they do that, store the refresh token in a secret store accessible by the server application.
Then it can use the refresh token to get a new refresh token + access token when needed.
This approach has some more complexity but does allow you to only give access to this one user's OneDrive for the app.
Also, keep in mind that refresh tokens can expire.
The user would need to re-authenticate if that happens.
If this process is critical, application permissions can be a really good idea despite the downsides.
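For the application permissions route, a daemon would typically use the OAuth 2.0 client credentials grant against the Microsoft identity platform. The following is only a sketch using the standard library; the function names are made up for illustration, and the tenant ID, client ID, and client secret are values from your own app registration.

```python
import json
import urllib.parse
import urllib.request

GRAPH_DEFAULT_SCOPE = "https://graph.microsoft.com/.default"


def build_token_request(tenant_id, client_id, client_secret):
    """Build the client credentials token request (nothing is sent yet)."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": GRAPH_DEFAULT_SCOPE,
    }
    data = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(url, data=data, method="POST")


def fetch_app_token(tenant_id, client_id, client_secret):
    """Send the request and return the access token string."""
    request = build_token_request(tenant_id, client_id, client_secret)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["access_token"]
```

The returned token is then sent to Graph in an "Authorization: Bearer ..." header. No user is involved at any point, which is why this flow suits a headless VM.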

Accessing Google Adsense API from a server with Python?

Is there any way to use Python to access the Google Adsense API from a server without any user interaction?
This is typically done by setting up a "service account", but Google's docs say that "AdSense doesn't support Service Accounts".
They say to use the web or installed application flows, but these require the user to manually confirm access for every access. My application needs to run on a headless server, without user interaction, so it can pull data every hour, so this won't work. This similar question suggests going through the user consent screen once and then caching the token on the server, but this isn't feasible in my case since my process needs to be 100% automated, and the token will eventually expire and require user interaction.
Unfortunately, Google's docs are quite unhelpful, and even worse, their Python coding examples haven't been updated in 7 years and don't even seem to have worked back then, as many of them don't even run on Python 2.7, much less 3.
It's true that the AdSense Management API doesn't support service accounts. While there is setup required at first with the Web Flow, the same is true for service accounts which also have to be granted permissions on the account being accessed.
Regarding the tokens expiring, the Web Flow will yield a refresh token, which you can use to generate new access tokens (known as offline access, which doesn't require user involvement after the initial setup).
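To make the offline access step concrete, here is a minimal sketch of exchanging a stored refresh token for a new access token using only the standard library. The function names are illustrative, and the client ID, client secret, and refresh token are assumed to come from the one-time Web Flow setup.

```python
import json
import urllib.parse
import urllib.request

TOKEN_URL = "https://oauth2.googleapis.com/token"


def build_refresh_request(client_id, client_secret, refresh_token):
    """Build the refresh_token grant request (nothing is sent yet)."""
    form = {
        "grant_type": "refresh_token",
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
    }
    data = urllib.parse.urlencode(form).encode("ascii")
    return urllib.request.Request(TOKEN_URL, data=data, method="POST")


def refresh_access_token(client_id, client_secret, refresh_token):
    """Return (access_token, lifetime_in_seconds) from Google's token endpoint."""
    request = build_refresh_request(client_id, client_secret, refresh_token)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return payload["access_token"], payload["expires_in"]
```

An hourly job could call refresh_access_token at the start of each run, so no user involvement is needed as long as the refresh token stays valid.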

How to allow a user to download a Google Cloud Storage file from Compute Engine without public access

I'm going to try and keep this as short as possible.
I have a compute engine instance, and it is running Python/Flask.
What I am trying to do, is allow a user to download a file from google cloud storage, however I do not want the file to be publicly accessible. Is there a way I can have my Compute instance stream the file from cloud storage for the user to download, and then have the file deleted from the compute instance after the user has finished downloading the file? I'd like the download to start immediately after they click the download button.
I am using the default app credentials.
subprocess is not an option.
SideNote:
Another way I was thinking about doing this was to allow each user, who is logged into the website, access to a specific folder on a bucket. However I am unsure if this would even be possible without having them login with a google account. This also seems like it would be a pain to implement.
@jterrace's answer is what you want.
Signed URLs can have a time limit associated with them. In your application, you would create a signed URL for the file and do an HTTP redirect to it.
https://cloud.google.com/storage/docs/access-control/create-signed-urls-program
If you are using the default compute engine service account (the default associated with your GCE instance) you should be able to sign just fine. Just follow the instructions on how to create the keys in the url above.
You can do all kinds of awesome stuff this way, including allowing users to upload DIRECTLY to google cloud storage! :)
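A minimal sketch of the signed URL approach, assuming the google-cloud-storage package and a client created from a service account key (for example with storage.Client.from_service_account_json). The helper name and the 15 minute default are illustrative choices, not anything prescribed by the library.

```python
from datetime import timedelta


def signed_download_url(client, bucket_name, blob_name, minutes=15):
    """Return a time-limited V4 signed URL for downloading one object.

    `client` is expected to be a google.cloud.storage.Client whose
    credentials include a private key that can sign the URL.
    """
    blob = client.bucket(bucket_name).blob(blob_name)
    return blob.generate_signed_url(
        version="v4",
        expiration=timedelta(minutes=minutes),
        method="GET",
    )

# In a Flask view (names are placeholders), the download button could do:
#     return redirect(signed_download_url(client, "my-bucket", "report.pdf"))
```

With this design the file never touches the Compute Engine disk, so there is nothing to delete afterwards, and the bucket itself stays private.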
It sounds like you're looking for Signed URLs.
The service account associated with your Compute Engine instance will solve the problem.
Service accounts authenticate applications running on your
virtual machine instances to other Google Cloud Platform services. For
example, if you write an application that reads and writes files on
Google Cloud Storage, it must first authenticate to the Google Cloud
Storage API. You can create a service account and grant the service
account access to the Cloud Storage API.
For historical reasons, all projects come with the Compute Engine default service account, identifiable using this email:
[PROJECT_NUMBER]-compute@developer.gserviceaccount.com
By default, the Compute Engine service account has read-only access to Google Cloud Storage, so your instance can access your storage using the GCP client libraries.
gsutil is the command-line tool for Cloud Storage, which is very handy for trying out the various options that storage offers.
Start by typing gsutil ls on your Compute Engine instance, which lists all the buckets in your Cloud Storage.

Automate report generation using Python, what kind of Credentials do I need

I have a Google Analytics account that I want to automate some custom reports from, but I have some problems understanding what kind of credentials I need. Most of the tutorials I have seen say I need to use an OAuth client ID, but the Google Developers Console site says I need a Service Account key.
What is the difference between the two? Using another Analytics Account I tried to setup a OAuth connection, and it worked, but I now got unsure about what kind of key I should use.
What I want to do is to just have a Python script set up to run at some times, and then to get the data I want to query for. The data is just the same as the one I can get from logging into the Google Analytics UI, so there is no need for any users to consent to giving me access to any personal data or what ever else the Consent form should be used for.
Can someone explain what the difference is between the two Credentials and what one would be the correct one to use for my project?
Both service accounts and OAuth2 are used to access private user data. Private data is data that is accessible only by logging in. My posts on Google+ are public; anyone can see them. The information in my Google Analytics is private and owned by me; only I and those I grant access to can see it.
With OAuth2, access is granted at run time. The first time an application is run, the user will be asked if your application can access their data. If the user accepts and grants your application access, you will be given a refresh token. This refresh token can then be used to get an access token, which is used to access the private user data. Access tokens are only good for about an hour. After the hour is up, you use the refresh token to get access again. That's why I say access is granted at runtime. You only have to ask the user for access once to get the refresh token, though.
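The "good for about an hour, then refresh" cycle can be sketched as a small cache. This is only an illustration: the class name is made up, and the refresh callable stands in for whatever code exchanges your refresh token for an access token and reports its lifetime in seconds.

```python
import time


class AccessTokenCache:
    """Hand out a cached access token and renew it shortly before it expires."""

    def __init__(self, refresh, clock=time.monotonic, margin=60):
        self._refresh = refresh    # callable returning (token, lifetime_seconds)
        self._clock = clock        # injectable so the logic can be tested
        self._margin = margin      # renew this many seconds before expiry
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            # No token yet, or the current one is about to expire: renew it.
            self._token, lifetime = self._refresh()
            self._expires_at = now + lifetime
        return self._token
```

A scheduled script would call get() before each API request, and the user is never asked again unless the refresh token itself stops working.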
Service accounts, on the other hand, are pre-authenticated. Service accounts are like dummy users: they have their own Google Drive account and Google Calendar account. Because of this, it is possible to share data with them like you would with any other user. You take the service account email address and add it as a user under the admin section of Google Analytics at the ACCOUNT level; it must be the ACCOUNT level. Then, using the service account in your code, you will be able to access the data for that Google Analytics account without requesting authentication from a user the first time.
Service accounts are most often used by developers to grant others access to data owned by the developer. OAuth2, on the other hand, would be used to access the data of your customers, whose accounts you the developer do not personally have access to.
Technically speaking, you can use either for your project; as long as you store the refresh token, you could use OAuth2. However, I would not recommend it: refresh tokens can expire under certain circumstances, which I will not go into.
I would recommend using a service account in your case. It will be much easier for you to administer, as you will only need to set it up once.
My tutorials on the subject:
Google Developer console service account
Google Developer Console Oauth2 credentials

GAE: Can't Use Google Server Side API's (Whitelisting Issue)

To use Google API's, after activating them from the Google Developers Console, one needs to generate credentials. In my case, I have a backend that is supposed to consume the API server side. For this purpose, there is an option to generate what the Google page calls "Key for server applications". So far so good.
The problem is that in order to generate the key, one has to mention IP addresses of servers that would be whitelisted. But GAE has no static IP address that I could use there.
There is an option to manually get the IP's by executing:
dig -t TXT _netblocks.google.com @ns1.google.com
However, there is no guarantee that the list is static (furthermore, it is known to change from time to time), and there is no programmatic way to automate adding the IPs that I get from dig to the Google Developers Console.
This leaves me with two choices:
Forget about GAE for this project, ironically, GAE cannot be used as a backend for Google API's (better use Amazon or some other solution for that). or
Program something like a watchdog over the output of the dig command that would notify me if there's a change, and then I would manually update the whitelist (no way I am going to do this - too dangerous), or allow all IP's to use the Google API granted it has my API key. Not the most secure solution but it works.
Is there any other workaround? Can it be that GAE does not support consuming Google API's server side?
You can use App Identity to access Google's APIs from App Engine. See: https://developers.google.com/appengine/docs/python/appidentity/. If you set up your app using the cloud console, it should have already added your app's identity with permission to your project, but you can always check. From the "Permissions" tab in the cloud console for your project, make sure your service account is listed under "Service Accounts" (in the form your_app_id@appspot.gserviceaccount.com).
Furthermore, if you use something like the JSON API Libs available for python, you can use the bundled oauth2 library to do all of this for you using AppAssertionCredentials to authorize the API you wish to use. See: https://developers.google.com/api-client-library/python/guide/google_app_engine#ServiceAccounts
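For the Python runtime, a bare-bones sketch of the App Identity approach might look like the following. It assumes the code runs inside App Engine, where google.appengine.api.app_identity is available; the helper names are made up, and the scope passed in depends on the API you are calling.

```python
def bearer_headers(access_token):
    """HTTP headers carrying an OAuth 2.0 access token."""
    return {"Authorization": "Bearer " + access_token}


def app_identity_headers(scope):
    """Ask App Engine for a token for the app's own service account."""
    # Only importable inside the App Engine runtime, hence the local import.
    from google.appengine.api import app_identity

    token, _expires_at = app_identity.get_access_token(scope)
    return bearer_headers(token)
```

Because the token is issued to the app's own identity, no API key whitelist and no IP address list are involved.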
Yes, you should use App Identity. Forget about getting an IP or giving up on GAE :-) Here is an example of how to use Big Query, for example, inside a GAE application:
static {
    // initializes Big Query
    JsonFactory jsonFactory = new JacksonFactory();
    HttpTransport httpTransport = new UrlFetchTransport();
    AppIdentityCredential credential = new AppIdentityCredential(Arrays.asList(Constants.BIGQUERY_SCOPE));
    bigquery = new Bigquery.Builder(httpTransport, jsonFactory, credential)
            .setApplicationName(Constants.APPLICATION_NAME).setHttpRequestInitializer(credential)
            .setBigqueryRequestInitializer(new BigqueryRequestInitializer(Constants.API_KEY)).build();
}
