I'm trying to allow an App Service (Python) to get secrets from Azure Key Vault without using hardcoded client IDs/secrets, so I'm trying to use Managed Identity.
I have enabled both system-assigned and user-assigned managed identities on my App Service
I have created an access policy in Key Vault that grants the App Service access to the secrets
code:
from azure.identity import ManagedIdentityCredential
from azure.keyvault.secrets import SecretClient

credentials_object = ManagedIdentityCredential()
client = SecretClient(vault_url=VAULT_URL, credential=credentials_object)
value = client.get_secret('MYKEY').value
error (both when the app is deployed and when running locally):
azure.identity._exceptions.CredentialUnavailableError: ManagedIdentityCredential authentication unavailable, no managed identity endpoint found.
What am I missing?
Thank you!
It's important to understand that the Managed Identity feature in Azure is ONLY relevant when, in this case, the App Service is deployed. This means you probably want to use DefaultAzureCredential() from the azure-identity library, which works both when running locally and in the deployed web app.
This class runs down a hierarchy of possible authentication methods, and when running locally I prefer to use a service principal, which can be created by running the following in the Azure CLI: az ad sp create-for-rbac --name localtest-sp-rbac --skip-assignment. You then add the service principal localtest-sp-rbac in IAM for the required Azure services.
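A minimal sketch of that swap, reusing VAULT_URL and the secret name from the question (the environment variable names are the ones azure-identity's EnvironmentCredential, one of the credentials in the chain, looks for):

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

# Locally, DefaultAzureCredential can pick up the service principal from the
# AZURE_CLIENT_ID, AZURE_TENANT_ID and AZURE_CLIENT_SECRET environment variables;
# in the deployed App Service it falls back to the managed identity.
credential = DefaultAzureCredential()
client = SecretClient(vault_url=VAULT_URL, credential=credential)
value = client.get_secret('MYKEY').value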
I recommend reading this article for more information and how to configure your local environment: https://learn.microsoft.com/en-us/azure/developer/python/configure-local-development-environment
You can see the list of credential types that DefaultAzureCredential() goes through in the Azure docs.
In my case, the issue was having multiple Managed Identities attached to my VMs. I was trying to access an Azure Storage Account from AKS using ManagedIdentityCredential. When I specified the client_id of the MI as:
credentials_object = ManagedIdentityCredential(client_id='XXXXXXXXXXXX')
it started to work! It's also mentioned here that we need to specify the client_id of the MI if the VM or VMSS has multiple identities attached to it.
The problem:
I'm trying to read a .gz JSON file stored in one of my project's Cloud Storage buckets using a Google Colab Python notebook, and I keep getting this error:
HttpError: Anonymous caller does not have storage.objects.get access to the Google Cloud Storage object., 401
My code:
import gzip
import json

import gcsfs

fs = gcsfs.GCSFileSystem(project='my-project')
with fs.open('bucket/path.json.gz') as f:
    gz = gzip.GzipFile(fileobj=f)
    file_as_string = gz.read()
json_a = json.loads(file_as_string)
I've tried all of these authentication methods and still get the same 401 error:
!gcloud auth login
!gcloud auth list
!gcloud projects list
!gcloud config set project 'myproject-id'
from google.colab import auth
auth.authenticate_user()
!gcloud config set account 'my GCP email'
!gcloud auth activate-service-account
!gcloud auth application-default login
!gsutil config
!gcloud config set pass_credentials_to_gsutil false
!gsutil config -a
I've also granted my account these GCP IAM roles:
Editor
Owner
Storage Admin
Storage Object Admin
Storage Object Creator
Storage Object Viewer
Storage Transfer Admin
It's not entirely clear from your question but:
gcloud and Google SDKs both use Google's identity|auth platform but they don't share state. You usually (!) can't log in using gcloud and expect code using an SDK to be authenticated too
@john-hanley correctly points out that one (often confusing) way to share state between gcloud and code using Google SDKs is to use gcloud auth application-default login. However, this only works because gcloud writes its state locally, and code using Google SDKs, when running as the same user on the same host, will be able to access this state. I think (!?) this won't work with browser-based Colab
I'm unfamiliar with gcsfs.GCSFileSystem but it is not a Google SDK. Unless its developers have been particularly thoughtful, it won't be able to leverage authentication done by the Google SDK using auth.authenticate_user().
So...
I think you should:
Ensure that your user account (you@gmail.com or whatever) has roles/storage.objectAdmin (or any predefined role that permits storage.objects.get).
Use google.colab.auth and auth.authenticate_user() to obtain credentials for the browser's logged-in user (i.e. you@gmail.com).
Use a Google Cloud Storage library, e.g. google-cloud-storage to access the GCS object. The Google library can leverage the credentials obtained in the previous step.
Update
Here's an example.
NOTE: it uses the API Client Library rather than the Cloud Client Library, but these are functionally equivalent.
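A minimal sketch of that flow, reusing the bucket and object path from the question (download the object into memory with the API Client Library, then decompress and parse it):

import gzip
import io
import json

from google.colab import auth
from googleapiclient.discovery import build
from googleapiclient.http import MediaIoBaseDownload

auth.authenticate_user()  # credentials for the browser's logged-in user

service = build('storage', 'v1')
request = service.objects().get_media(bucket='bucket', object='path.json.gz')
buffer = io.BytesIO()
downloader = MediaIoBaseDownload(buffer, request)
done = False
while not done:
    _, done = downloader.next_chunk()

buffer.seek(0)
json_a = json.loads(gzip.GzipFile(fileobj=buffer).read())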
I am trying to access BigQuery from Python code in a Jupyter notebook running on a local machine, so I installed the Google Cloud API packages on my laptop.
I need to pass OAuth2 authentication. But unfortunately, I only have a user account for our BigQuery. I do not have a service account or application credentials, nor do I have the permissions to create them. I am only allowed to work with a user account.
When running the bigquery.Client() function, it appears to look for application credentials in the GOOGLE_APPLICATION_CREDENTIALS environment variable. But this points to my non-existent application credentials.
I cannot find any other way to connect using user account authentication. But I find it extremely weird because:
The Google API for the R language works simply with user authentication. Parallel code in R (it has a different API) just works!
I run the code from the DataSpell IDE. In the IDE I have created a database resource connection to BigQuery (with my user authentication). There I am able to open a console for the database and run SQL queries with no problem. I have attached the BigQuery session to my Python notebook, and I can see the notebook attached to the BigQuery session in the services pane. But I am still missing something in order to access a valid running connection in the Python code. (I do not know how to get a Python object representing a validly connected client.)
I have been reading manuals from Google and looking for code examples for hours... Alas, I cannot find any description of connecting a client using a user account from my notebook.
Please, can someone help?
You can use the pydata-google-auth library to authenticate with a user account. Its get_user_credentials() function loads credentials from a cache on disk or initiates an OAuth 2.0 flow if the credentials are not found. Note that this is not the recommended authentication method.
import pandas_gbq
import pydata_google_auth

SCOPES = [
    'https://www.googleapis.com/auth/cloud-platform',
    'https://www.googleapis.com/auth/drive',
]

credentials = pydata_google_auth.get_user_credentials(
    SCOPES,
    # Set auth_local_webserver to True to have a slightly more convenient
    # authorization flow. Note, this doesn't work if you're running from a
    # notebook on a remote server, such as over SSH or with Google Colab.
    auth_local_webserver=True,
)

df = pandas_gbq.read_gbq(
    "SELECT my_col FROM `my_dataset.my_table`",
    project_id='YOUR-PROJECT-ID',
    credentials=credentials,
)
The recommended way to do the authentication is to contact your GCP administrator and ask them to create a key for your account following these instructions.
Then you can use this code to set up the authentication with the key that you have:
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    '/path/to/key.json')
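You can then pass those credentials to a BigQuery client (the project id below is a placeholder):

from google.cloud import bigquery

client = bigquery.Client(credentials=credentials, project='YOUR-PROJECT-ID')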
You can see more of the documentation here.
I just wanted to know if there is a way to check whether a Python script is running inside a Compute Engine instance or in a local environment.
I want to check that in order to know how to authenticate. For example, when a script runs on a Compute Engine instance and I want to initiate a BigQuery client, I do not need to authenticate, but when running a script locally I need to authenticate using a service account JSON file.
If I knew whether a script is running locally or in a Compute Engine instance, I would be able to initiate Google services accordingly.
I could put initialization into a try-except statement but maybe there is another way?
Any help is appreciated.
If I understand your question correctly, I think a better solution is provided by Google, called Application Default Credentials. See Best practices to securely auth apps in Google Cloud (thanks @sethvargo) and Application Default Credentials
Using this mechanism, authentication becomes consistent regardless of where you run your app (on- or off-GCP). See finding credentials automatically
When you run off-GCP, you set GOOGLE_APPLICATION_CREDENTIALS to point to the Service Account key file. When you run on-GCP (and, to be clear, you are still authenticating, it's just transparent), you don't set the environment variable because the library obtains, for example, the Compute Engine instance's service account for you.
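A minimal sketch of what that buys you: the same code runs unchanged on and off GCP, only the source of the credentials differs.

from google.cloud import bigquery

# On Compute Engine this picks up the instance's service account from the
# metadata server; off-GCP it reads GOOGLE_APPLICATION_CREDENTIALS.
client = bigquery.Client()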
So I read a bit about Google Cloud authentication and came up with this solution:
import google.auth
from google.auth.exceptions import DefaultCredentialsError
from google.cloud import storage
from google.oauth2 import service_account

try:
    # Succeeds in environments such as Compute Engine.
    credentials, project = google.auth.default()
except DefaultCredentialsError:
    credentials = service_account.Credentials.from_service_account_file(
        '/path/to/service_account_json_file.json')

client = storage.Client(credentials=credentials)
This tries to retrieve the default Google Cloud credentials (in environments such as Compute Engine), and if that fails it falls back to authenticating with a service account JSON file.
It might not be the best solution but it works and I hope it will help someone else too.
Our company is working on processing data from Google Sheets (within Google Drive) on Google Cloud Platform, and we are having some problems with authentication.
There are two different places where we need to run code that makes API calls to Google Drive: within production in Google Compute Engine, and within development environments i.e. locally on our developers' laptops.
Our company is quite strict about credentials and does not allow downloading Service Account credential JSON keys (this is better practice and provides higher security). Seemingly all of the GCP docs say to simply download the JSON key for a Service Account and use that, or the Google APIs/Developers docs say to create an OAuth2 Client ID and download its key, like here.
They often use code like this:
from google.oauth2 import service_account

SCOPES = ['https://www.googleapis.com/auth/sqlservice.admin']
SERVICE_ACCOUNT_FILE = '/path/to/service.json'

credentials = service_account.Credentials.from_service_account_file(
    SERVICE_ACCOUNT_FILE, scopes=SCOPES)
But we can't (or just don't want to) download our Service Account JSON keys, so we're stuck if we just follow the docs.
For the Google Compute Engine environment we have been able to authenticate by using GCP Application Default Credentials (ADC), i.e. not explicitly specifying credentials in code and letting the client libraries “just work”. This works great as long as the VM is created with the correct scope (https://www.googleapis.com/auth/drive) and the default compute Service Account email is given permission to the Sheet that needs to be accessed, as explained in the docs here. You can do this like so:
from googleapiclient.discovery import build

service = build('sheets', 'v4')

SPREADSHEET_ID = "<sheet_id>"
RANGE_NAME = "A1:A2"

s = service.spreadsheets().values().get(
    spreadsheetId=SPREADSHEET_ID,
    range=RANGE_NAME, majorDimension="COLUMNS"
).execute()
However, how do we do this for development, i.e. locally on our developers' laptops? Again, without downloading any JSON keys, and preferably with the most “just works” approach possible?
Usually we use gcloud auth application-default login to create application default credentials that the Google client libraries use, which “just work”, e.g. for Google Storage. However, this doesn't work for Google APIs outside of GCP, like the Google Drive API (service = build('sheets', 'v4')), which fails with this error: “Request had insufficient authentication scopes.”. Then we tried all kinds of solutions like:
credentials, project_id = google.auth.default(scopes=["https://www.googleapis.com/auth/drive"])
and
credentials, project_id = google.auth.default()
credentials = google_auth_oauthlib.get_user_credentials(
    ["https://www.googleapis.com/auth/drive"],
    credentials._client_id, credentials._client_secret)
and more...
All of which give a myriad of errors/issues we can't get past when trying to authenticate to the Google Drive API :(
Any thoughts?
One method for making the authentication from development environments easy is to use Service Account impersonation.
Here is a blog post about using service account impersonation, including the benefits of doing this. @johnhanley (who wrote the blog post) is a great guy and has lots of very informative answers on SO too!
To have your local machine authenticate for the Google Drive API, you need to create application default credentials on your local machine that impersonate a Service Account, with the scopes needed for the APIs you want to access.
To be able to impersonate a Service Account your user must have the role roles/iam.serviceAccountTokenCreator. This role can be applied to an entire project or to an individual Service Account.
You can use gcloud to do this:
gcloud iam service-accounts add-iam-policy-binding [COMPUTE_SERVICE_ACCOUNT_FULL_EMAIL] \
--member user:[USER_EMAIL] \
--role roles/iam.serviceAccountTokenCreator
Once this is done, create the local credentials:
gcloud auth application-default login \
--scopes=openid,https://www.googleapis.com/auth/drive,https://www.googleapis.com/auth/userinfo.email,https://www.googleapis.com/auth/cloud-platform,https://www.googleapis.com/auth/accounts.reauth \
--impersonate-service-account=[COMPUTE_SERVICE_ACCOUNT_FULL_EMAIL]
This will solve the scopes error you got. The extra scopes added beyond the Drive API scope are the defaults that gcloud auth application-default login applies, and they are needed.
If you apply scopes without impersonation you will get an error like this when trying to authenticate:
HttpError: <HttpError 403 when requesting https://sheets.googleapis.com/v4/spreadsheets?fields=spreadsheetId&alt=json returned "Your application has authenticated using end user credentials from the Google Cloud SDK or Google Cloud Shell which are not supported by the sheets.googleapis.com. We recommend configuring the billing/quota_project setting in gcloud or using a service account through the auth/impersonate_service_account setting. For more information about service accounts and how to use them in your application, see https://cloud.google.com/docs/authentication/.">
Once you have set up the credentials you can use the same code that is run on Google Compute Engine on your local machine :)
Note: it is also possible to set the impersonation for all gcloud commands:
gcloud config set auth/impersonate_service_account [COMPUTE_SERVICE_ACCOUNT_FULL_EMAIL]
Creating application default credentials on your local machine by impersonating a service account is a slick way of authenticating development code. It means the code will have exactly the same permissions as the Service Account it impersonates. If this is the same Service Account that will run the code in production, you know that code in development runs the same as in production. It also means that you never have to create or download any Service Account keys.
I'm stuck at creating a user for my service_account which I can use in my kubeconfig.
Background:
I have a cluster-A, which I created using the google-cloud-python library. I can see the cluster in the console. Now I want to deploy some manifests to this cluster, so I'm trying to use the kubernetes-python client. To create a Client() object, I need to have a KUBECONFIG so I can:
client = kubernetes.config.load_kube_config(<MY_KUBE_CONFIG>)
I'm stuck at generating a user for this service_account in my kubeconfig. I don't know what kind of authentication certificate/key I should use for my user.
Searched everywhere but still can't figure out how to use my service_account to access my GKE cluster through the kubernetes-python library.
Additional Info:
I already have a Credentials() object (source) created using the service_account.Credentials() class (source).
A Kubernetes ServiceAccount is generally used by a service within the cluster itself, and most clients offer a version of rest.InClusterConfig(). If you mean a GCP-level service account, it is activated as below:
gcloud auth activate-service-account --key-file gcp-key.json
and then probably you would set a project and use gcloud container clusters get-credentials, as per normal with GKE, to get the Kubernetes config and credentials.
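A sketch of the full flow (cluster-A is from the question; the zone and project are hypothetical placeholders):

# Shell, once per environment:
#   gcloud auth activate-service-account --key-file gcp-key.json
#   gcloud container clusters get-credentials cluster-A --zone us-central1-a --project my-project

from kubernetes import client, config

config.load_kube_config()  # reads the ~/.kube/config written by get-credentials
v1 = client.CoreV1Api()
print([pod.metadata.name for pod in v1.list_pod_for_all_namespaces().items])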