I am building an async frontend for my TF2 model. It currently runs as two services: the first is a Twisted service and the second is TensorFlow Serving.
An async web client queries the model asynchronously. For practical reasons, I've deployed the model to GCP AI Platform, and I can get predictions from it using the Python code from the examples, which works fine.
The problem is that the Google API client is synchronous, and I would like to use an asynchronous client. Since, AFAIK, there are no actively supported async clients for GCP, I decided to take the straightforward route and use REST. The model input is the same as for TensorFlow Serving (GCP AI Platform uses TensorFlow Serving internally, I believe).
To perform the async call, I need to have:
Model URL. (I have it)
Input data. (I also have it)
Access token.
I saw some examples that are:
import googleapiclient.discovery
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    '/path/to/key.json',
    scopes=['https://www.googleapis.com/auth/cloud-platform'])
But the issue is that credentials.token is None, so I can't use it.
So my question is: how can I get the access token to use in the REST request?
Or is there a better way of doing this?
I already saw the following question: How to get access token from instance of google.oauth2.service_account.Credentials object?
but I think it is not quite relevant here.
The following code sets up the data structures for managing credentials (OAuth tokens) from a service account. No tokens are requested at this point.
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    '/path/to/key.json',
    scopes=['https://www.googleapis.com/auth/cloud-platform'])
Tokens are not requested from the Google auth server until required. There are several reasons: a) network calls take time - a significant amount of time for network failures; b) tokens expire; c) tokens are cached until they (almost) expire.
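The lazy-fetch-and-cache behaviour described above can be sketched roughly as follows. This is only an illustration of the idea, not how the Google auth library is implemented: LazyTokenCache, the fetch callable, and the 60-second skew are hypothetical names and values chosen for the example.

```python
import time
from typing import Callable, Optional, Tuple

# Hypothetical: fetch() stands in for whatever actually calls the auth server.
# It returns (token, expiry as a Unix timestamp).
FetchToken = Callable[[], Tuple[str, float]]


class LazyTokenCache:
    """Fetch a token on first use and reuse it until it is close to expiry."""

    def __init__(self, fetch: FetchToken, skew_seconds: float = 60.0,
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch
        self._skew = skew_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._expiry = 0.0

    def get(self) -> str:
        # No network call happens until the first get().
        if self._token is None or self._clock() >= self._expiry - self._skew:
            self._token, self._expiry = self._fetch()
        return self._token
```

Nothing is requested when the cache is constructed; the first get() triggers the fetch, and later calls reuse the token until it is within the skew window of expiring.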
To generate a token, call the refresh() method:
import google.auth.transport.requests
request = google.auth.transport.requests.Request()
credentials.refresh(request)
credentials.token will now contain an OAuth access token; otherwise an exception is thrown (network error, etc.).
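Once the token is populated, using it in a REST call comes down to sending it as a Bearer token. Here is a minimal sketch of the request shape, assuming the usual {"instances": [...]} prediction body; build_predict_request, the project and model names, and the token value are hypothetical. It uses urllib only to build the request without sending it, so the same headers and body can be handed to whichever async HTTP client you prefer.

```python
import json
import urllib.request


def build_predict_request(url: str, token: str, instances: list) -> urllib.request.Request:
    """Build (but do not send) an authenticated prediction request."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values; substitute your own model URL and credentials.token.
request = build_predict_request(
    "https://ml.googleapis.com/v1/projects/my-project/models/my-model:predict",
    "ya29.example-token",
    [[1.0, 2.0, 3.0]],
)
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

Since tokens expire, a long-running service would need to call refresh() again before the expiry time rather than reusing one token forever.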
Just focusing on "How do I get an access token?", you will need to:
Create a Service Account
Use the Google Auth library
Run code
NOTE If you're running the code off-GCP (i.e. not on App Engine, Compute Engine, GKE etc.), then you will need to create a Key for the Service Account and you will need to export GOOGLE_APPLICATION_CREDENTIALS=path/to/your/key.json. Application Default Credentials (see below) simplify auth.
See: Authenticating as a Service Account
And:
import google.auth
from googleapiclient.discovery import build
SCOPES = ["https://www.googleapis.com/auth/cloud-platform"]
creds, project_id = google.auth.default(scopes=SCOPES)
service = build(GOOGLE_API, GOOGLE_API_VERSION, credentials=creds)
Related
Goal:
I have set up a Google API Gateway. The backend for the API is a cloud function (written in python). The cloud function should query data from Google BigQuery (BQ). To do that, I want to create a BQ Client (google.cloud.bigquery.Client()). The API should be accessed by different applications using different service accounts. The service accounts have permission to access only specific datasets within my project. Therefore, the service accounts/applications should only be able to query the datasets they have the permission for. Therefore, the BQ Client within the cloud function should be initialized with the service account that sends the request to the API.
What I tried:
The API is secured with the following OpenAPI definition so that a JWT signed by the service account SA-EMAIL is required to send a request there:
securityDefinitions:
sec-def1:
authorizationUrl: ""
flow: "implicit"
type: "oauth2"
x-google-issuer: "SA-EMAIL"
x-google-jwks_uri: "https://www.googleapis.com/robot/v1/metadata/x509/SA-EMAIL"
x-google-audiences: "SERVICE"
For the path that uses my cloud function, I use the following backend configuration:
x-google-backend:
address: https://PROJECT-ID.cloudfunctions.net/CLOUD-FUNCTION
path_translation: CONSTANT_ADDRESS
So in the cloud function itself I get the forwarded JWT as X-Forwarded-Authorization and also the already verified base64url encoded JWT payload as X-Apigateway-Api-Userinfo from the API Gateway.
I tried to use the JWT from X-Forwarded-Authorization to obtain credentials:
bearer_token = request.headers.get('X-Forwarded-Authorization')
token = bearer_token.split(" ")[1]
cred = google.auth.credentials.Credentials(token)
At first, this seems to work since cred.valid returns True, but when trying to create the client with google.cloud.bigquery.Client(credentials=cred) it returns the following error in the logs:
google.auth.exceptions.RefreshError: The credentials do not contain
the necessary fields need to refresh the access token. You must
specify refresh_token, token_uri, client_id, and client_secret.
I do not have much experience with auth/oauth at all, but I think I do not have the necessary tokens/attributes the error is saying are missing available in my cloud function.
Also, I am not exactly sure why there is a RefreshError, since I don't want to refresh the token (and don't do so explicitly) and just use it again (might be bad practice?).
Question:
Is it possible to achieve my goal in the way I have tried or in any other way?
Your goal is to catch the credential that called the API Gateway, and to reuse it in your Cloud Functions to call BigQuery.
Sadly, you can't. Why? Because API Gateway prevents you from doing that (which is good news for security). The JWT is correctly forwarded to your Cloud Function, but the signature part has been removed (you receive only the header and the body of the JWT).
The security verification has been done by API Gateway and you have to rely on that authentication.
What's the solution?
My solution is the following: In the truncated JWT that you receive, you can get the body and get the Service Account email. From there, you can use the Cloud Functions service account, to impersonate the Service Account email that you receive.
That way, the Cloud Functions service account only needs permission to impersonate these service accounts, and you keep the permissions granted to the original service account.
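A rough sketch of that approach, under a few assumptions: the payload is read from the X-Apigateway-Api-Userinfo header mentioned in the question, the service account email is found in the "email" or "iss" claim (check what your tokens actually contain), and the function names are hypothetical. The Cloud Functions service account would also need the Service Account Token Creator role on each account it impersonates.

```python
import base64
import json


def decode_userinfo(header_value: str) -> dict:
    """Decode the base64url JWT payload from X-Apigateway-Api-Userinfo."""
    # base64url values usually arrive without padding, so restore it first.
    padded = header_value + "=" * (-len(header_value) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def caller_email(claims: dict) -> str:
    # Assumption: the service account email is in "email", else in "iss".
    email = claims.get("email") or claims.get("iss")
    if not email:
        raise ValueError("no service account email in JWT payload")
    return email


def bigquery_client_for(email: str):
    """Build a BigQuery client that impersonates the calling service account."""
    # Imported lazily so the decoding helpers work without these packages.
    import google.auth
    from google.auth import impersonated_credentials
    from google.cloud import bigquery

    source, project_id = google.auth.default()
    target = impersonated_credentials.Credentials(
        source_credentials=source,
        target_principal=email,
        target_scopes=["https://www.googleapis.com/auth/cloud-platform"],
    )
    return bigquery.Client(project=project_id, credentials=target)
```

Inside the function you would call something like bigquery_client_for(caller_email(decode_userinfo(request.headers["X-Apigateway-Api-Userinfo"]))). It is probably wise to check the email against a list of expected service accounts before impersonating it.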
I don't see other solutions to solve your issue.
The JWT that you are receiving from API Gateway is not an OAuth Access Token. Therefore the JWT Payload portion is not a credential that you can use for the BigQuery Client authorization.
As @guillaume blaquiere pointed out, the Payload contains the email address of an identity. If the identity is a service account, you could implement impersonation of that identity. This could be a good solution if you are using multiple identities with API Gateway. If the identity is a user account, then you would need to implement Domain-Wide Delegation.
I recommend simply using the service account assigned to the Cloud Function with proper roles assigned to initialize the BigQuery client. Provided that API Gateway is providing authorization to reach your Cloud Function, there is no need for the extra layer of impersonation.
Another option is to store the matching service account JSON key file in Secret Manager and pull it when required to create the BigQuery Client.
What is the python programmatic alternative to the gcloud command line gcloud auth print-identity-token?
I am trying to invoke a Google Cloud Function through an HTTP trigger (authenticated users only), and I need to pass the identity token in the Authorization header. I have a method that works great when I run the code on GCP App Engine. However, I am struggling to find a way to get this identity token when I run the program on my own machine (where I can create the token with the gcloud command line: gcloud auth print-identity-token).
I found how to create an access token according to this answer, but I couldn't work out how to create an identity token.
Thank you in advance!
Great topic! And it's a long long way, and months of tests and discussion with Google.
TL;DR: you can't generate an identity token with your user credentials; you need a service account (or to impersonate a service account) to generate an identity token.
If you have a service account key file, I can share a piece of code to generate an identity token, but generating and keeping a service account key file is generally bad practice.
I published an article on this and two merge requests that implement this evolution in the Java Google auth library (I'm more of a Java developer than a Python developer, even if I also contribute to Python OSS projects) here and here. You can read them if you want to understand what is missing and how the gcloud command works today.
On the latest merge request, I understood that something is coming from Google internally, but so far I haven't seen anything...
If you have a service account you can impersonate this is one way to get an ID token in Python from a local/dev machine.
import google.auth
from google.auth.transport.requests import AuthorizedSession


def impersonated_id_token():
    # Application Default Credentials of whoever runs this code
    credentials, project = google.auth.default(
        scopes=['https://www.googleapis.com/auth/cloud-platform'])
    authed_session = AuthorizedSession(credentials)
    sa_to_impersonate = "<SA_NAME>@<GCP_PROJECT>.iam.gserviceaccount.com"
    request_body = {"audience": "<SOME_URL>"}
    # Send the body as JSON to the IAM Credentials API
    response = authed_session.post(
        f'https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/{sa_to_impersonate}:generateIdToken',
        json=request_body)
    return response


if __name__ == "__main__":
    print(impersonated_id_token().json())
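To actually call the function, the token has to be pulled out of that response and sent as a Bearer token. Here is a small sketch of that last step, assuming the response body carries the ID token under the "token" key; the helper names are hypothetical. unverified_claims is only meant for debugging (for example, checking that "aud" matches the function URL) and does not check the signature.

```python
import base64
import json
import urllib.request


def extract_id_token(response_json: dict) -> str:
    """Pull the ID token out of a generateIdToken response body."""
    try:
        return response_json["token"]
    except KeyError:
        raise ValueError(f"no token in response: {response_json}") from None


def unverified_claims(id_token: str) -> dict:
    """Decode the JWT payload for inspection only; the signature is NOT checked."""
    payload = id_token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def build_function_request(url: str, id_token: str) -> urllib.request.Request:
    """Build (but do not send) a request carrying the ID token."""
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {id_token}"})
```

For example, extract_id_token(impersonated_id_token().json()) gives the string to pass to build_function_request, or to any other HTTP client.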
I'm developing a Cloud Run Service that accesses different Google APIs using a service account's secrets file with the following python 3 code:
from google.oauth2 import service_account
credentials = service_account.Credentials.from_service_account_file(SECRETS_FILE_PATH, scopes=SCOPES)
In order to deploy it, I upload the secrets file during the build/deploy process (via gcloud builds submit and gcloud run deploy commands).
How can I avoid uploading the secrets file like this?
Edit 1:
I think it is important to note that I need to impersonate user accounts from GSuite/Workspace (with domain wide delegation). The way I deal with this is by using the above credentials followed by:
delegated_credentials = credentials.with_subject(USER_EMAIL)
Using the Secret Manager might help you, as you can manage the multiple secrets you have and not have them stored as files, as you are doing right now. I would recommend you to take a look at this article here, so you can get more information on how to use it with Cloud Run, to improve the way you manage your secrets.
In addition to that, as clarified in this similar case here, you have two options: use default service account that comes with it or deploy another one with the Service Admin role. This way, you won't need to specify keys with variables - as clarified by a Google developer in this specific answer.
To improve security, the best approach is to never use a service account key file, locally or on GCP (I wrote an article on this). To achieve this, Google Cloud services have an automatically loaded service account, either the default one or, when possible, a custom one.
On Cloud Run, the default service account is the Compute Engine default service account (I recommend never using it: it has the Editor role on the project, which is far too broad!), or you can specify the service account to use (--service-account= parameter).
Then, in your code, simply use the ADC (Application Default Credentials) mechanism to get your credentials, like this in Python:
import google.auth
credentials, project_id = google.auth.default(scopes=SCOPES)
I've found one way to solve the problem.
First, as suggested by guillaume blaquiere's answer, I used the google.auth ADC mechanism:
import google.auth
credentials, project_id = google.auth.default(scopes=SCOPES)
However, as I need to impersonate GSuite (now Workspace) accounts, this method is not enough, because the credentials object it returns does not have the with_subject method. This led me to this similar post and a specific answer that works out a way to convert the google.auth credentials into the Credentials object returned by service_account.Credentials.from_service_account_file. There was one problem with that solution: an authentication scope seemed to be missing.
All I had to do is add the https://www.googleapis.com/auth/cloud-platform scope to the following places:
The SCOPES variable in the code
Google Admin > Security > API Controls > Set client ID and scope for the service account I am deploying with
At the OAuth Consent Screen of my project
After that, my Cloud Run had access to credentials that were able to impersonate user's accounts without using key files.
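For readers who can't follow the link, the conversion goes roughly like this. This is a sketch in the spirit of that answer, not a copy of it: delegated_credentials and TOKEN_URI are names picked for the example, and it assumes the runtime service account is allowed to sign on its own behalf through the IAM API (typically the Service Account Token Creator role on itself).

```python
TOKEN_URI = "https://oauth2.googleapis.com/token"


def delegated_credentials(credentials, subject, scopes):
    """Return credentials that act as `subject` via domain-wide delegation."""
    if hasattr(credentials, "with_subject"):
        # Key-file service account credentials already support delegation.
        return credentials.with_scopes(scopes).with_subject(subject)

    # Imported lazily: only the ADC path needs these modules.
    from google.auth import iam
    from google.auth.transport import requests
    from google.oauth2 import service_account

    request = requests.Request()
    # Refreshing populates the real service account email on ADC credentials.
    credentials.refresh(request)
    # Sign JWTs through the IAM API instead of a local private key.
    signer = iam.Signer(request, credentials, credentials.service_account_email)
    return service_account.Credentials(
        signer,
        credentials.service_account_email,
        TOKEN_URI,
        scopes=scopes,
        subject=subject,
    )
```

Usage would be along the lines of delegated_credentials(credentials, USER_EMAIL, SCOPES), with credentials coming from google.auth.default(scopes=SCOPES).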
I am evaluating different options for authentication in a python App Engine flex environment, for apps that run within a G Suite domain.
I am trying to put together the OpenID Connect "Server flow" instructions here with how google-auth-library-python implements the general OAuth2 instructions here.
I kind of follow things up until 4. Exchange code for access token and ID token, which looks like flow.fetch_token, except it says "response to this request contains the following fields in a JSON array," and it includes not just the access token but the id token and other things. I did see this patch to the library. Does that mean I could use some flow.fetch_token to create an IDTokenCredentials (how?) and then use this to build an OpenID Connect API client (and where is that API documented)? And what about validating the id token, is there a separate python library to help with that or is that part of the API library?
It is all very confusing. A great deal would be cleared up with some actual "soup to nuts" example code but I haven't found anything anywhere on the internet, which makes me think (a) perhaps this is not a viable way to do authentication, or (b) it is so recent the python libraries have not caught up? I would however much rather do authentication on the server than in the client with Google Sign-In.
Any suggestions or links to code are much appreciated.
It seems Google's python library contains a module for id token validation. This can be found at google.oauth2.id_token module. Once validated, it will return the decoded token which you can use to obtain user information.
from google.oauth2 import id_token
from google.auth.transport import requests

request = requests.Request()
id_info = id_token.verify_oauth2_token(
    token, request, 'my-client-id.example.com')
if id_info['iss'] != 'https://accounts.google.com':
    raise ValueError('Wrong issuer.')
userid = id_info['sub']
Once you obtain user information, you should follow authentication process as described in Authenticate the user section.
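Since the question is about apps inside a G Suite domain, it may also be worth checking a few more claims on the decoded token. Here is a small sketch that works on the dict returned by verify_oauth2_token; check_id_claims is a hypothetical helper, and it accepts both issuer spellings that Google documents for ID tokens. The "hd" claim is only present for accounts that belong to a hosted domain.

```python
GOOGLE_ISSUERS = {"accounts.google.com", "https://accounts.google.com"}


def check_id_claims(id_info: dict, client_id: str, hosted_domain: str) -> str:
    """Return the user's stable ID if the already-verified claims are acceptable."""
    if id_info.get("iss") not in GOOGLE_ISSUERS:
        raise ValueError("Wrong issuer.")
    if id_info.get("aud") != client_id:
        raise ValueError("Token was issued for a different client.")
    # Accounts outside a hosted domain carry no "hd" claim at all.
    if id_info.get("hd") != hosted_domain:
        raise ValueError("User is not in the expected domain.")
    return id_info["sub"]
```

This does not replace signature verification; it only adds checks on top of a token that has already been verified.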
OK, I think I found my answer in the source code now.
google.oauth2.credentials.Credentials exposes id_token:
Depending on the authorization server and the scopes requested, this may be populated when credentials are obtained and updated when refresh is called. This token is a JWT. It can be verified and decoded [as @kavindu-dodanduwa pointed out] using google.oauth2.id_token.verify_oauth2_token.
And several layers down the call stack we can see fetch_token does some minimal validation of the response JSON (checking that an access token was returned, etc.) but basically passes through whatever it gets from the token endpoint, including (i.e. if an OpenID Connect scope is included) the id token as a JWT.
EDIT:
And the final piece of the puzzle is the translation of tokens from the (generic) OAuthSession to (Google-specific) credentials in google_auth_oauthlib.helpers, where the id_token is grabbed, if it exists.
Note that the generic oauthlib library does seem to implement OpenID Connect now, but looks to be very recent and in process (July 2018). Google doesn't seem to use any of this at the moment (this threw me off a bit).
Can someone please give me a clear explanation of how to get the Google Calendar API v3 working with the Python Client? Specifically, the initial OAuth stage is greatly confusing me. All I need to do is access my own calendar, read it, and make changes to it. Google provides this code for configuring my app:
import gflags
import httplib2
from apiclient.discovery import build
from oauth2client.file import Storage
from oauth2client.client import OAuth2WebServerFlow
from oauth2client.tools import run
FLAGS = gflags.FLAGS
# Set up a Flow object to be used if we need to authenticate. This
# sample uses OAuth 2.0, and we set up the OAuth2WebServerFlow with
# the information it needs to authenticate. Note that it is called
# the Web Server Flow, but it can also handle the flow for native
# applications
# The client_id and client_secret are copied from the API Access tab on
# the Google APIs Console
FLOW = OAuth2WebServerFlow(
    client_id='YOUR_CLIENT_ID',
    client_secret='YOUR_CLIENT_SECRET',
    scope='https://www.googleapis.com/auth/calendar',
    user_agent='YOUR_APPLICATION_NAME/YOUR_APPLICATION_VERSION')
# To disable the local server feature, uncomment the following line:
# FLAGS.auth_local_webserver = False
# If the Credentials don't exist or are invalid, run through the native client
# flow. The Storage object will ensure that if successful the good
# Credentials will get written back to a file.
storage = Storage('calendar.dat')
credentials = storage.get()
if credentials is None or credentials.invalid == True:
    credentials = run(FLOW, storage)
# Create an httplib2.Http object to handle our HTTP requests and authorize it
# with our good Credentials.
http = httplib2.Http()
http = credentials.authorize(http)
# Build a service object for interacting with the API. Visit
# the Google APIs Console
# to get a developerKey for your own application.
service = build(serviceName='calendar', version='v3', http=http,
                developerKey='YOUR_DEVELOPER_KEY')
But (a) it makes absolutely no sense to me; the comment explanations are terrible, and (b) I don't know what to put in the variables. I've registered my program with Google and signed up for a Service Account key. But all that gave me was an encrypted key file to download, and a client ID. I have no idea what a "developerKey" is, or what a "client_secret" is? Is that the key? If it is, how do I get it, since it is actually contained in an encrypted file? Finally, given the relatively simple goals of my API use (i.e., it's not a multi-user, multi-access operation), is there a simpler way to be doing this? Thanks.
A simple (read: way I've done it) way to do this is to create a web application instead of a service account. This may sound weird since you don't need any sort of web application, but I use this in the same way you do - make some queries to my own calendar/add events/etc. - all from the command line and without any sort of web-app interaction. There are ways to do it with a service account (I'll tinker around if you do in fact want to go on that route), but this has worked for me thus far.
After you create a web application, you will then have all of the information indicated above (side note: the sample code above is based on a web application - to use a service account your FLOW needs to call flow_from_clientsecrets and further adjustments need to be made - see here). Therefore you will be able to fill out this section:
FLOW = OAuth2WebServerFlow(
    client_id='YOUR_CLIENT_ID',
    client_secret='YOUR_CLIENT_SECRET',
    scope='https://www.googleapis.com/auth/calendar',
    user_agent='YOUR_APPLICATION_NAME/YOUR_APPLICATION_VERSION')
You can now fill out with the values you see in the API console (client_id = the entire Client ID string, client_secret = the client secret, scope is the same and the user_agent can be whatever you want). As for the service line, developerKey is the API key you can find under the Simple API Access section in the API console (label is API key):
service = build(serviceName='calendar', version='v3', http=http,
                developerKey='<your_API_key>')
You can then add in a simple check like the following to see if it worked:
events = service.events().list(calendarId='<your_email_here>').execute()
print(events)
Now when you run this, a browser window will pop up that will allow you to complete the authentication flow. This means that all authentication is handled by Google, and the authentication response info is stored in calendar.dat. That file (which will be stored in the same directory as your script) contains the authentication info that the service will now use. That is what is going on here:
storage = Storage('calendar.dat')
credentials = storage.get()
if credentials is None or credentials.invalid == True:
    credentials = run(FLOW, storage)
It checks for the existence of valid credentials by looking for that file and verifying the contents (this is all abstracted away from you to make it easier to implement). After you authenticate, the if statement will evaluate False and you will be able to access your data without needing to authenticate again.
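If it helps to see the idea without the library in the way, here is a toy version of that "load from file, authenticate only when needed" logic. It is not how oauth2client implements Storage: load_or_authorize, the JSON file format, and the "invalid" flag are stand-ins invented for this example.

```python
import json
import os
from typing import Callable


def load_or_authorize(path: str, authorize: Callable[[], dict]) -> dict:
    """Return cached credentials from path, calling authorize() only when needed."""
    credentials = None
    if os.path.exists(path):
        with open(path) as handle:
            credentials = json.load(handle)
    if credentials is None or credentials.get("invalid", False):
        # Stands in for run(FLOW, storage): do the browser flow once...
        credentials = authorize()
        # ...and write the result back so the next run can skip it.
        with open(path, "w") as handle:
            json.dump(credentials, handle)
    return credentials
```

On the first run the file does not exist, so authorize() is called and its result is saved; on later runs the saved credentials are returned directly.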
Hopefully that shines a bit more light on the process - long story short, make a web application and use the parameters from that, authenticate once and then forget about it. I'm sure there are various points I'm overlooking, but hopefully it will work for your situation.
Google now has a good sample application that gets you up and running without too much fuss. It is available as the "5 minute experience - Quickstart" on their Getting Started page.
It will give you a URL to visit directly if you are working on a remote server without a browser.