Custom service account for App Engine service - python

I have multiple micro-services running on our project's GCP (Google Cloud Platform) App Engine. In our case, it would be preferable to minimize permissions on a per-service basis. Currently, we keep all the credential files in Keybase, keeping secrets sectioned off by team membership. So if I am working on one App Engine service, I can't see the secrets for another team's app engine service.
What we have been doing for other secrets, e.g. config files with passwords and secret tokens, is to give the App Engine service account KMS decryption privileges and simply pull an encrypted copy of the config file from Firestore and decrypt it. But we don't want to use the default App Engine service account everywhere, because different teams using the same service account would have access to everyone's secrets. So we want to move to a separate service account for each App Engine service in development.
From what I can tell in the Google documentation, they want you to upload the credential file(s) when you deploy the app. That works; however, from the cloud console it appears difficult to lock down who can look at the files deployed to the service, and anyone with access could simply copy/paste all the credentials.
If you have the config in a dictionary, you do something like this:
from google.oauth2 import service_account
from google.cloud import kms_v1
d = {'type': 'service_account',
     'project_id': 'my-awesome-project',
     'private_key_id': '074139282fe9834ac23401',
     'private_key': '-----BEGIN PRIVATE KEY-----\n supersecretkeythatnobodyknows==\n-----END PRIVATE KEY-----\n',
     'client_email': 'my-cool-address@my-awesome-project.iam.gserviceaccount.com',
     'client_id': '1234567890',
     'auth_uri': 'https://accounts.google.com/o/oauth2/auth',
     'token_uri': 'https://oauth2.googleapis.com/token',
     'auth_provider_x509_cert_url': 'https://www.googleapis.com/oauth2/v1/certs',
     'client_x509_cert_url': 'https://www.googleapis.com/robot/v1/metadata/x509/my-cool-address%40my-awesome-project.iam.gserviceaccount.com'}
credentials = service_account.Credentials.from_service_account_info(d)
kms_client = kms_v1.KeyManagementServiceClient(credentials=credentials)
This works, but how do we get the dictionary "d" into the program without it showing up in the code and being accessible to a broad range of people?

If each service in your App Engine environment has to have its own identity, that's not possible with App Engine. Newer services like Cloud Functions or Cloud Run can do this, but App Engine, a very old service in cloud terms (more than 10 years old), can't.
You have one service account for all App Engine services. You can use it to decrypt the other service account key files, one per service, and use each in the appropriate service, but the root authorization stays the default App Engine service account, and thereby all services/all teams have access to it (and to all the encrypted service account key files).
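A minimal sketch of that pattern, assuming the encrypted key blob was already fetched from Firestore and using hypothetical key ring and key names (the older kms_v1 client used in the question accepts the key name and ciphertext positionally):
import json
from google.cloud import kms_v1
from google.oauth2 import service_account

# Runs as the App Engine default service account, which holds the KMS decrypt role.
kms_client = kms_v1.KeyManagementServiceClient()
key_name = kms_client.crypto_key_path(
    'my-awesome-project', 'global', 'my-keyring', 'my-key')  # hypothetical names
response = kms_client.decrypt(key_name, encrypted_key_blob)  # blob read from Firestore
d = json.loads(response.plaintext)
per_service_credentials = service_account.Credentials.from_service_account_info(d)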
Maybe the solution is to redesign the application and to have a project per team?

What guillaume blaquiere said in his answer is correct, and I also agree with his idea of having a project per team. You cannot assign a different service account to each of your App Engine services.
What you want to achieve may, however, be somewhat possible in the App Engine flexible environment. Granted, you cannot find a way to allow one team for one service and another team for the other. What you can do (though I do not think you want that, and I would not recommend it either) is block certain IPs by controlling access with firewalls. You can create your own firewall rules and allow requests from one place while denying them from another.
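For illustration, rules along these lines (the priority and source range are made up) would deny everything except one office's range:
gcloud app firewall-rules update default --action=DENY
gcloud app firewall-rules create 100 \
  --action=ALLOW \
  --source-range='203.0.113.0/24' \
  --description='Allow team A office range'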
This is just a trick that may or may not work for you. The real advice is that if you really want to implement the system you have described cleanly and correctly, with separate service accounts, you should consider migrating to Cloud Run.

Related

GCP-Cloud Composer: Secret Manager access variable.json

I'm trying to configure Secret Manager for my Composer environment (ver 1.16, Airflow 1.10), but I have a weird situation, described below. In my Composer environment, I've used a variable.json file to manage Variables in Airflow:
# variable.json
{
    "sleep": "False",
    "ssh_host": "my_host"
}
Then, I used this article to configure Secret Manager. Following the instructions, I overrode the config with a secrets section:
backend: airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend
backend_kwargs: {"variables_prefix":"airflow-variables", "sep":"-", "project_id": "my-project"}
And in Secret Manager, I also created my secret: airflow-variables-sercret_name.
In fact, everything is fine. I can get my secret via the following (and of course, I don't have any problem with the service account):
from airflow.models.variable import Variable
my_secret = Variable.get('sercret_name')
But when I check the logs, I find that Airflow also tries to look up the other variables from my variable.json file in Secret Manager:
2022-04-13 15:49:46.590 CEST airflow-worker Google Cloud API Call Error (NotFound): Secret ID airflow-secrets-sleep not found.
2022-04-13 15:49:46.590 CEST airflow-worker Google Cloud API Call Error (NotFound): Secret ID airflow-secrets-ssh_host not found.
So how can I avoid this situation, please? Or did I misunderstand something?
Thanks !!!
These errors are known to occur when you use Secret Manager, but here are some workarounds:
Add a way to skip Secret backend.
File a Feature Request to lower the log priority; you can use this as a template for your issue.
Create logs exclusion in Cloud Logging.
About Logs Exclusion:
Sinks control how Cloud Logging routes logs. Sinks belong to a given Google Cloud resource: Cloud projects, billing accounts, folders, and organizations. When the resource receives a log entry, it routes the log entry according to the sinks contained by that resource. The routing behavior for each sink is controlled by configuring the inclusion filter and exclusion filters for that sink.
When you create a sink, you can set multiple exclusion filters, letting you exclude matching log entries from being routed to the sink's destination or from being ingested by Cloud Logging.
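For the logs-exclusion route, something like this should work, assuming the messages arrive via textPayload and you are routing through the _Default sink:
gcloud logging sinks update _Default \
  --add-exclusion=name=exclude-secret-notfound,filter='textPayload:"Google Cloud API Call Error (NotFound): Secret ID"'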

Can I check if a script is running inside a Compute Engine or in a local environment?

I just wanted to know if there is a way to check whether a Python script is running inside a Compute Engine instance or in a local environment.
I want to check that in order to know how to authenticate. For example, when a script runs on a Compute Engine instance and I want to initialize a BigQuery client, I do not need to authenticate explicitly, but when running a script locally I need to authenticate using a service account JSON file.
If I knew whether a script was running locally or on a Compute Engine instance, I would be able to initialize Google services accordingly.
I could put the initialization into a try-except statement, but maybe there is another way?
Any help is appreciated.
If I understand your question correctly, I think a better solution is provided by Google, called Application Default Credentials. See Best practices to securely auth apps in Google Cloud (thanks @sethvargo) and Application Default Credentials.
Using this mechanism, authentication becomes consistent regardless of where you run your app (on- or off-GCP). See finding credentials automatically.
When you run off-GCP, you set GOOGLE_APPLICATION_CREDENTIALS to point to the service account key file. When you run on-GCP (and, to be clear, you are still authenticating; it's just transparent), you don't set the environment variable, because the library obtains the credentials for you, e.g. from the Compute Engine instance's service account.
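A minimal sketch of the payoff: the same client code runs unchanged on- and off-GCP, shown here with BigQuery as an example:
from google.cloud import bigquery

# On GCE the credentials come from the metadata server; locally they come from
# GOOGLE_APPLICATION_CREDENTIALS or `gcloud auth application-default login`.
client = bigquery.Client()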
So I read a bit about Google Cloud authentication and came up with this solution:
import google.auth
from google.auth.exceptions import DefaultCredentialsError
from google.cloud import storage
from google.oauth2 import service_account

try:
    # Succeeds in environments such as Compute Engine.
    credentials, project = google.auth.default()
except DefaultCredentialsError:
    # Fall back to a service account key file when running locally.
    credentials = service_account.Credentials.from_service_account_file(
        '/path/to/service_account_json_file.json')
client = storage.Client(credentials=credentials)
This tries to retrieve the default Google Cloud credentials (available in environments such as Compute Engine), and if that fails, it authenticates using a service account JSON file.
It might not be the best solution but it works and I hope it will help someone else too.

Securely accessing storage via GCP cloud function?

I need to write a cloud function in GCP, which responds to HTTP requests and has service account access to GCP cloud storage. My function will receive a string and threshold parameters. It will retrieve a csv file from cloud storage, compute similarity between the text string supplied and the entities in the csv file and return the entities that satisfy the threshold requirements.
From Google's cloud function tutorials, I have yet to see anything that gives it cloud storage access, a service account for access therein, etc.
Could anyone link a resource or otherwise explain how to get from A to B?
Let the magic happen!
In fact, nothing is magic. With most Google Cloud products, you have a service account that you can grant the permissions that you want. On Cloud Functions, the default service account is the App Engine default service account, with this pattern: <projectID>@appspot.gserviceaccount.com.
When you deploy a Cloud Function you can use a custom service account by using the --service-account= parameter. It's safer because your Cloud Function can have its own service account, with limited permissions (the App Engine default service account is Project Editor by default, which is too wide!!).
So, this service account is loaded automatically with your Cloud Function, and the Google Cloud auth libraries can access it via the metadata server. The credentials are taken from the runtime context; they are the default credentials of the environment.
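For example, a deployment like this (the function name, runtime, and service account are hypothetical):
gcloud functions deploy my-function \
  --runtime=python310 \
  --trigger-http \
  --service-account=my-function-sa@my-project.iam.gserviceaccount.com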
As for your code, keep it as simple as this:
from google.cloud import storage

client = storage.Client()  # Uses the default credentials of the environment
bucket = client.get_bucket('myBucket')
blobs = bucket.list_blobs()
for blob in blobs:
    print(blob.size)
On your workstation, if you want to execute the same code, you can use your own credentials by running this command: gcloud auth application-default login. If you prefer using a service account key file (which I strongly recommend against, but that's not the topic), you can set the environment variable GOOGLE_APPLICATION_CREDENTIALS with the file path as its value.
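That is, with a hypothetical key path:
export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service_account.json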

How can I grant a Cloud Run service access to service account's credentials without the key file?

I'm developing a Cloud Run Service that accesses different Google APIs using a service account's secrets file with the following python 3 code:
from google.oauth2 import service_account
credentials = service_account.Credentials.from_service_account_file(SECRETS_FILE_PATH, scopes=SCOPES)
In order to deploy it, I upload the secrets file during the build/deploy process (via gcloud builds submit and gcloud run deploy commands).
How can I avoid uploading the secrets file like this?
Edit 1:
I think it is important to note that I need to impersonate user accounts from GSuite/Workspace (with domain wide delegation). The way I deal with this is by using the above credentials followed by:
delegated_credentials = credentials.with_subject(USER_EMAIL)
Using Secret Manager might help you, as you can manage your multiple secrets without storing them as files, as you are doing right now. I would recommend taking a look at this article, so you can get more information on how to use it with Cloud Run and improve the way you manage your secrets.
In addition to that, as clarified in this similar case, you have two options: use the default service account that comes with it, or deploy another one with the Service Admin role. This way, you won't need to specify keys with variables, as clarified by a Google developer in this specific answer.
To improve security, the best way is to never use a service account key file, locally or on GCP (I wrote an article on this). To achieve this, Google Cloud services have an automatically loaded service account, either the default one or, when possible, a custom one.
On Cloud Run, the default service account is the Compute Engine default service account (I recommend you never use it; it has the Editor role on the project, which is too wide!), or you can specify the service account to use (--service-account= parameter).
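For example (the service name, image, and service account are hypothetical):
gcloud run deploy my-service \
  --image=gcr.io/my-project/my-image \
  --service-account=my-run-sa@my-project.iam.gserviceaccount.com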
Then, in your code, simply use the ADC (Application Default Credentials) mechanism to get your credentials, like this in Python:
import google.auth
credentials, project_id = google.auth.default(scopes=SCOPES)
I've found one way to solve the problem.
First, as suggested by guillaume blaquiere's answer, I used the google.auth ADC mechanism:
import google.auth
credentials, project_id = google.auth.default(scopes=SCOPES)
However, as I need to impersonate GSuite (now Workspace) accounts, this method is not enough, as the credentials object it generates does not have the with_subject method. This led me to this similar post and specific answer, which shows a way to convert google.auth credentials into the Credentials object returned by service_account.Credentials.from_service_account_file. There was one problem with that solution: it seemed that an authentication scope was missing.
All I had to do was add the https://www.googleapis.com/auth/cloud-platform scope in the following places:
The SCOPES variable in the code
Google Admin > Security > API Controls > Set client ID and scope for the service account I am deploying with
At the OAuth Consent Screen of my project
After that, my Cloud Run service had access to credentials that were able to impersonate users' accounts without using key files.
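For reference, a sketch of the conversion described above, assuming SCOPES and USER_EMAIL are defined and that the runtime service account can sign blobs via the IAM credentials API (Service Account Token Creator on itself):
import google.auth
from google.auth import iam
from google.auth.transport import requests
from google.oauth2 import service_account

TOKEN_URI = 'https://oauth2.googleapis.com/token'

# Keyless default credentials of the Cloud Run service.
source_credentials, project_id = google.auth.default(
    scopes=['https://www.googleapis.com/auth/cloud-platform'])
request = requests.Request()
source_credentials.refresh(request)  # fills in the service account email

# Signer that delegates signing to the IAM signBlob API instead of a local key.
signer = iam.Signer(request, source_credentials,
                    source_credentials.service_account_email)

# Credentials built on that signer; unlike plain ADC they support with_subject.
delegated_credentials = service_account.Credentials(
    signer,
    source_credentials.service_account_email,
    TOKEN_URI,
    scopes=SCOPES,
    subject=USER_EMAIL)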

How to authenticate Google APIs (Google Drive API) from Google Compute Engine and locally without downloading Service Account credentials?

Our company is working on processing data from Google Sheets (within Google Drive) from Google Cloud Platform and we are having some problems with the authentication.
There are two different places where we need to run code that makes API calls to Google Drive: within production in Google Compute Engine, and within development environments i.e. locally on our developers' laptops.
Our company is quite strict about credentials and does not allow downloading Service Account credential JSON keys (this is better practice and provides higher security). Seemingly all of the docs from GCP say to simply download the JSON key for a Service Account and use that, or the Google APIs/Developers docs say to create an OAuth2 Client ID and download its key, like here.
They often use code like this:
from google.oauth2 import service_account
SCOPES = ['https://www.googleapis.com/auth/sqlservice.admin']
SERVICE_ACCOUNT_FILE = '/path/to/service.json'
credentials = service_account.Credentials.from_service_account_file(
    SERVICE_ACCOUNT_FILE, scopes=SCOPES)
But we can't (or just don't want to) download our Service Account JSON keys, so we're stuck if we just follow the docs.
For the Google Compute Engine environment, we have been able to authenticate by using GCP Application Default Credentials (ADC), i.e. not explicitly specifying credentials in code and letting the client libraries "just work". This works great as long as one ensures that the VM is created with the correct scope https://www.googleapis.com/auth/drive and that the default compute Service Account email is given permission to the Sheet that needs to be accessed; this is explained in the docs here. You can do it like so:
from googleapiclient.discovery import build

service = build('sheets', 'v4')
SPREADSHEET_ID = "<sheet_id>"
RANGE_NAME = "A1:A2"
s = service.spreadsheets().values().get(
    spreadsheetId=SPREADSHEET_ID,
    range=RANGE_NAME, majorDimension="COLUMNS"
).execute()
However, how do we do this for development, i.e. locally on our developers' laptops? Again, without downloading any JSON keys, and preferably with the most “just works” approach possible?
Usually we use gcloud auth application-default login to create application default credentials that the Google client libraries use, which "just work", such as for Google Storage. However, this doesn't work for Google APIs outside of GCP, like the Google Drive API: service = build('sheets', 'v4') fails with the error "Request had insufficient authentication scopes.". Then we tried all kinds of solutions, like:
credentials, project_id = google.auth.default(scopes=["https://www.googleapis.com/auth/drive"])
and
credentials, project_id = google.auth.default()
credentials = google_auth_oauthlib.get_user_credentials(
    ["https://www.googleapis.com/auth/drive"],
    credentials._client_id, credentials._client_secret)
and more...
All of which gave a myriad of errors/issues we couldn't get past when trying to authenticate to the Google Drive API :(
Any thoughts?
One method for making authentication from development environments easy is to use Service Account impersonation.
Here is a blog post about using service account impersonation, including the benefits of doing this. @johnhanley (who wrote the blog post) is a great guy and has lots of very informative answers on SO as well!
To have your local machine authenticate for the Google Drive API, you will need to create application default credentials on your local machine that impersonate a Service Account and apply the scopes needed for the APIs you want to access.
To be able to impersonate a Service Account your user must have the role roles/iam.serviceAccountTokenCreator. This role can be applied to an entire project or to an individual Service Account.
You can use gcloud to do this:
gcloud iam service-accounts add-iam-policy-binding [COMPUTE_SERVICE_ACCOUNT_FULL_EMAIL] \
--member user:[USER_EMAIL] \
--role roles/iam.serviceAccountTokenCreator
Once this is done, create the local credentials:
gcloud auth application-default login \
--scopes=openid,https://www.googleapis.com/auth/drive,https://www.googleapis.com/auth/userinfo.email,https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/accounts.reauth \
--impersonate-service-account=[COMPUTE_SERVICE_ACCOUNT_FULL_EMAIL]
This will solve the scopes error you got. The four extra scopes added beyond the Drive API scope are the default scopes that gcloud auth application-default login applies, and they are needed.
If you apply scopes without impersonation you will get an error like this when trying to authenticate:
HttpError: <HttpError 403 when requesting https://sheets.googleapis.com/v4/spreadsheets?fields=spreadsheetId&alt=json returned "Your application has authenticated using end user credentials from the Google Cloud SDK or Google Cloud Shell which are not supported by the sheets.googleapis.com. We recommend configuring the billing/quota_project setting in gcloud or using a service account through the auth/impersonate_service_account setting. For more information about service accounts and how to use them in your application, see https://cloud.google.com/docs/authentication/.">
Once you have set up the credentials you can use the same code that is run on Google Compute Engine on your local machine :)
Note: it is also possible to set the impersonation for all gcloud commands:
gcloud config set auth/impersonate_service_account [COMPUTE_SERVICE_ACCOUNT_FULL_EMAIL]
Creating application default credentials on your local machine by impersonating a service account is a slick way of authenticating development code. It means that the code will have exactly the same permissions as the Service Account it is impersonating. If this is the same Service Account that will run the code in production, you know that code in development runs the same as it does in production. It also means that you never have to create or download any Service Account keys.
