I'm using a Jupyter Notebook within VS Code and the Azure Python SDK to develop locally.
Relevant VS Code Extensions installed:
Python
Azure Account
Azure Storage (maybe relevant?)
Goal:
To retrieve a secret from Azure Key Vault using DefaultAzureCredential to authenticate.
Since there are no environment variables set and no managed identity available, DefaultAzureCredential should fall back to pulling my credentials from VS Code.
Issue:
import logging

from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

keyvault_name = "kv-test"
keyvault_url = "https://" + keyvault_name + ".vault.azure.net"
keyvault_credential = DefaultAzureCredential()

kv_secret1_name = "secret-test"
keyvault_client = SecretClient(vault_url=keyvault_url, credential=keyvault_credential)
retrieved_key = keyvault_client.get_secret(kv_secret1_name)
logging.info("Account key retrieved from Keyvault")
Error:
EnvironmentCredential.get_token failed: EnvironmentCredential authentication unavailable. Environment variables are not fully configured.
ManagedIdentityCredential.get_token failed: ManagedIdentityCredential authentication unavailable, no managed identity endpoint found.
SharedTokenCacheCredential.get_token failed: SharedTokenCacheCredential authentication unavailable. No accounts were found in the cache.
VisualStudioCodeCredential.get_token failed: Failed to get Azure user details from Visual Studio Code.
Tried so far:
F1, Azure: Sign in
Authenticate via browser
No change
It looks like the DefaultAzureCredential chain is running, but it's unable to "get Azure user details from Visual Studio Code".
Is this because I'm developing inside a Jupyter Notebook in VS Code, or is there another issue going on? It looks like something similar happened with the .NET SDK.
Your code looks correct, so I'm not sure why it does not work. If you just want to log in during local development in Visual Studio Code, you can also use AzureCliCredential; it works on my side.
Run az login to sign in to your account. Then you can get the secret using this code:
from azure.keyvault.secrets import SecretClient
from azure.identity import AzureCliCredential

keyvault_credential = AzureCliCredential()
secret_client = SecretClient("https://{vault-name}.vault.azure.net", keyvault_credential)

secret = secret_client.get_secret("secret-name")
print(secret.name)
print(secret.value)
For more details, see the Azure Identity client library for Python.
There is another easy way to fix this issue. As quoted in this article, we can make use of the configuration options as shown below.
if self.local_dev:
    print(f"Local Dev is {self.local_dev}")
    self.az_cred = DefaultAzureCredential(
        exclude_environment_credential=True,
        exclude_managed_identity_credential=True,
        exclude_shared_token_cache_credential=True,
        exclude_interactive_browser_credential=True,
        exclude_powershell_credential=True,
        exclude_visual_studio_code_credential=False,
        exclude_cli_credential=False,
        logging_enable=True,
    )
else:
    self.az_cred = DefaultAzureCredential(
        exclude_environment_credential=True, logging_enable=True
    )
Please note that exclude_visual_studio_code_credential and exclude_cli_credential are set to False, while the others are set to True to exclude them during local development; exclude_environment_credential is set to True for other environments such as production.
You can see these configuration options in the default.py file of the azure-identity package.
As stated in the docs:
It's a known issue that VisualStudioCodeCredential doesn't work with Azure Account extension versions newer than 0.9.11. A long-term fix to this problem is in progress. In the meantime, consider authenticating via the Azure CLI.
For reference, this is the issue. With that, it is probably smart to disable the vscode credentials: DefaultAzureCredential(exclude_visual_studio_code_credential=True)
Anyway, depending on the version of the VS Code extension, we might need to use another means of authentication, such as SharedTokenCacheCredential, AzureCliCredential or even InteractiveBrowserCredential.
In my case, authentication was failing at the SharedTokenCacheCredential step, which, from what I read, is a token cache shared between Microsoft products. I assume the same is likely happening to others who have Microsoft products installed.
It was failing because my target tenant was not included in this cache. To fix this I had two options: either disable the shared token credential or include the target tenant in the shared cache.
For the first option, we can do something similar as for disabling vscode: DefaultAzureCredential(exclude_shared_token_cache_credential=True)
For the second option, I did it as suggested in this blog post from Microsoft: DefaultAzureCredential(additionally_allowed_tenants=[TENANT_ID]). But by looking at the source code, it seems we can achieve the same by:
setting the target tenant ID in an environment variable named AZURE_TENANT_ID, or
directly passing the shared cache tenant id: DefaultAzureCredential(shared_cache_tenant_id="TENANT_ID")
Note that the env variable has the benefit that it is used by other authentication methods, namely InteractiveBrowserCredential and VisualStudioCodeCredential.
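For illustration, here is a minimal sketch of both options (TENANT_ID is a placeholder for your own tenant):

import os
from azure.identity import DefaultAzureCredential

TENANT_ID = "<your-tenant-id>"  # placeholder

# Option 1: the environment variable, which InteractiveBrowserCredential
# and VisualStudioCodeCredential also honor.
os.environ["AZURE_TENANT_ID"] = TENANT_ID
credential = DefaultAzureCredential()

# Option 2: pass the tenant directly to the shared cache credential.
credential = DefaultAzureCredential(shared_cache_tenant_id=TENANT_ID)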
I am trying to access BigQuery from Python code in a Jupyter notebook running on a local machine, so I installed the Google Cloud API packages on my laptop.
I need to pass OAuth2 authentication. Unfortunately, I only have a user account for our BigQuery. I do not have a service account or application credentials, nor do I have the permissions to create them; I am only allowed to work with a user account.
When running the bigquery.Client() function, it appears to look for application credentials via the GOOGLE_APPLICATION_CREDENTIALS environment variable, but that points at application credentials, which I do not have.
I cannot find any other way to connect using user account authentication, which I find extremely weird because:
The Google API for the R language works with simple user authentication; the parallel code in R (it has a different API) just works!
I run the code from the DataSpell IDE, where I have created a database resource connection to BigQuery (with my user authentication). There I can open a console for the database and run SQL queries without problems. I have attached the BigQuery session to my Python notebook, and I can see the notebook attached to the BigQuery session in the services pane. But I am still missing something to get a valid, connected client in the Python code (I do not know how to get a Python object representing a validly connected client).
I have been reading Google's manuals and looking for code examples for hours... Alas, I cannot find any description of connecting a client using a user account from my notebook.
Please, can someone help?
You can use the pydata-google-auth library to authenticate with a user account. The function below loads credentials from a cache on disk or initiates an OAuth 2.0 flow if no cached credentials are found. Note that this is not the recommended way to authenticate.
import pandas_gbq
import pydata_google_auth

SCOPES = [
    'https://www.googleapis.com/auth/cloud-platform',
    'https://www.googleapis.com/auth/drive',
]

credentials = pydata_google_auth.get_user_credentials(
    SCOPES,
    # Set auth_local_webserver to True to have a slightly more convenient
    # authorization flow. Note, this doesn't work if you're running from a
    # notebook on a remote server, such as over SSH or with Google Colab.
    auth_local_webserver=True,
)

df = pandas_gbq.read_gbq(
    "SELECT my_col FROM `my_dataset.my_table`",
    project_id='YOUR-PROJECT-ID',
    credentials=credentials,
)
The recommended way to authenticate is to contact your GCP administrator and ask them to create a key for your account following these instructions.
Then you can use this code to set up the authentication with the key that you have:
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    '/path/to/key.json')
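As a usage sketch (the project ID is a placeholder), you can then pass these credentials to a client, for example BigQuery:

from google.cloud import bigquery

# Build a client with the service-account credentials created above.
client = bigquery.Client(credentials=credentials, project='your-project-id')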
You can see more of the documentation here.
I just wanted to know if there is a way to check whether a Python script is running inside a Compute Engine instance or in a local environment.
I want to check this in order to know how to authenticate: when a script runs on Compute Engine and I initiate a BigQuery client, I do not need to authenticate explicitly, but when running a script locally I need to authenticate using a service account JSON file.
If I knew whether a script is running locally or in a Compute Engine I would be able to initiate Google services accordingly.
I could put initialization into a try-except statement but maybe there is another way?
Any help is appreciated.
If I understand your question correctly, I think a better solution is provided by Google, called Application Default Credentials. See Best practices to securely auth apps in Google Cloud (thanks @sethvargo) and Application Default Credentials.
Using this mechanism, authentication becomes consistent regardless of where you run your app (on- or off-GCP). See finding credentials automatically
When you run off-GCP, you set GOOGLE_APPLICATION_CREDENTIALS to point to the service account key file. When you run on-GCP (and, to be clear, you are still authenticating; it's just transparent), you don't set the environment variable, because the library obtains e.g. the Compute Engine instance's service account for you.
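As a minimal sketch of this pattern (assuming the google-auth and google-cloud-bigquery packages), the same code then runs unchanged on- and off-GCP:

import google.auth
from google.cloud import bigquery

# Resolves credentials from GOOGLE_APPLICATION_CREDENTIALS when set,
# or from the instance's service account when running on GCP.
credentials, project = google.auth.default()
client = bigquery.Client(credentials=credentials, project=project)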
So I read a bit on the Google Cloud authentication and came up with this solution:
import google.auth
from google.auth.exceptions import DefaultCredentialsError
from google.cloud import storage
from google.oauth2 import service_account

try:
    # Works on Compute Engine and wherever default credentials exist.
    credentials, project = google.auth.default()
except DefaultCredentialsError:
    # Fall back to an explicit service account key file locally.
    credentials = service_account.Credentials.from_service_account_file(
        '/path/to/service_account_json_file.json')

client = storage.Client(credentials=credentials)
What this does is try to retrieve the default Google Cloud credentials (available in environments such as Compute Engine) and, if that fails, fall back to authenticating with a service account JSON file.
It might not be the best solution but it works and I hope it will help someone else too.
I get an error when trying to deallocate a virtual machine with the Python SDK for Azure.
Basically I try something like:
from pprint import pprint
from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.compute import ComputeManagementClient

credentials = ServicePrincipalCredentials(client_id, secret, tenant)
compute_client = ComputeManagementClient(credentials, subscription_id, '2015-05-01-preview')
result = compute_client.virtual_machines.deallocate(resource_group_name, vm_name)
pprint(result.result())
-> exception:
msrestazure.azure_exceptions.CloudError: Azure Error: AuthorizationFailed
Message: The client '<some client UUID>' with object id '<same client UUID>' does not have authorization to perform action 'Microsoft.Compute/virtualMachines/deallocate/action' over scope '/subscriptions/<our subscription UUID>/resourceGroups/<resource-group>/providers/Microsoft.Compute/virtualMachines/<our-machine>'.
What I don't understand is that the error message contains an unknown client UUID that I have not used in the credentials.
Python is version 2.7.13 and the SDK version was from yesterday.
What I guess I need is an application registration, which I did in order to get the information for the credentials. I am not quite sure which exact permission(s) I need to assign to the application in IAM. When adding an access entry I can only pick existing users, not an application.
So is there any programmatic way to find out which permissions are required for an action and which permissions our client application has?
Thanks!
As @GauravMantri and @LaurentMazuel said, the issue was caused by not assigning a role/permission to the service principal. I had answered another SO thread, Cannot list image publishers from Azure java SDK, which is similar to yours.
There are two ways to resolve the issue: using the Azure CLI, or doing these operations in the Azure portal. Please see the details of my answer there for the first; I describe the second (older) way below.
And if you want to find out these permissions programmatically, you can refer to the REST API Role Definition List to get all role definitions that are applicable at a scope and above, or refer to Azure Python SDK Authentication Management to do it in code via authorization_client.role_definitions.list(scope).
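As a hedged sketch of the Python route (using the newer azure-identity and azure-mgmt-authorization packages; the subscription ID and scope below are placeholders):

from azure.identity import DefaultAzureCredential
from azure.mgmt.authorization import AuthorizationManagementClient

subscription_id = "<subscription-id>"  # placeholder
scope = "/subscriptions/" + subscription_id

credential = DefaultAzureCredential()
authorization_client = AuthorizationManagementClient(credential, subscription_id)

# List all role definitions applicable at this scope and above.
for role in authorization_client.role_definitions.list(scope):
    print(role.role_name)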
Hope it helps.
Thank you all for your answers! The best recipe for creating an application and registering it with the right role (Virtual Machine Contributor) is indeed presented at https://learn.microsoft.com/en-us/azure/azure-resource-manager/resource-group-create-service-principal-portal
The main issue I had was a bug when adding a role within IAM: I chose Add and selected "Virtual Machine Contributor", but with Select I was presented a list of users that did not include the application I had created for this purpose. Entering the first few letters of my application's name, however, filtered the output so that it did include my application. Registration then finished and things could proceed.
I have problems with the authentication in the Python Library of Google Cloud API.
At first it worked for some days without problems, but suddenly the API calls stopped showing up in the API overview of the Google Cloud Platform.
I created a service account and stored the json file locally. Then I set the environment variable GCLOUD_PROJECT to the project ID and GOOGLE_APPLICATION_CREDENTIALS to the path of the json file.
from google.cloud import speech
client = speech.Client()
print(client._credentials.service_account_email)
prints the correct service account email.
The following code transcribes the audio_file successfully, but the Dashboard for my Google Cloud project doesn't show anything for the activated Speech API Graph.
import io

with io.open(audio_file, 'rb') as f:
    audio = client.sample(f.read(), source_uri=None, sample_rate=48000,
                          encoding=speech.encoding.Encoding.FLAC)

alternatives = audio.sync_recognize(language_code='de-DE')
At some point the code also ran into some errors regarding the usage limit. I guess that, due to the unsuccessful authentication, the free/limited option is somehow being used.
I also tried the alternative authentication option of installing the Google Cloud SDK and running gcloud auth application-default login, but without success.
I have no idea where to start troubleshooting the problem.
Any help is appreciated!
(My system is running Windows 7 with Anaconda)
EDIT:
The error count ("Fehler", German for "errors") shown in the dashboard is increasing with calls to the API. How can I get detailed information about the errors?
Make sure you are using an absolute path when setting the GOOGLE_APPLICATION_CREDENTIALS environment variable. Also, you might want to try inspecting the access token using OAuth2 tokeninfo and make sure it has "scope": "https://www.googleapis.com/auth/cloud-platform" in its response.
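As a quick sketch of that tokeninfo check (assuming the google-auth and requests packages and configured application default credentials):

import google.auth
import google.auth.transport.requests
import requests

# Obtain an access token from the application default credentials.
credentials, _ = google.auth.default(
    scopes=['https://www.googleapis.com/auth/cloud-platform'])
credentials.refresh(google.auth.transport.requests.Request())

# Ask the tokeninfo endpoint what the token is actually scoped to.
resp = requests.get('https://www.googleapis.com/oauth2/v3/tokeninfo',
                    params={'access_token': credentials.token})
print(resp.json().get('scope'))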
Sometimes you will get different error information if you initialize the client with GRPC enabled:
0.24.0:
speech_client = speech.Client(_use_grpc=True)
0.23.0:
speech_client = speech.Client(use_gax=True)
Usually it's an encoding issue. Can you try with the sample audio, or try generating LINEAR16 samples using something like the Unix rec tool:
rec --channels=1 --bits=16 --rate=44100 audio.wav trim 0 5
...
with io.open(speech_file, 'rb') as audio_file:
    content = audio_file.read()

audio_sample = speech_client.sample(
    content,
    source_uri=None,
    encoding='LINEAR16',
    sample_rate=44100)
Other notes:
Sync Recognize is limited to 60 seconds of audio; you must use async recognition for longer audio (see the sketch after this list)
If you haven't already, set up billing for your account
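Following the same old client API as above, a rough async sketch might look like this; these early releases changed quickly, so the method and attribute names are an assumption to verify against your installed version:

# Rough sketch only: names assumed from the 0.23/0.24-era API.
sample = speech_client.sample(
    content,
    source_uri=None,
    encoding='LINEAR16',
    sample_rate=44100)

operation = sample.async_recognize(language_code='en-US')
operation.poll()  # repeat (or sleep and retry) until operation.complete
if operation.complete:
    for result in operation.results:
        print(result.transcript)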
With regards to the usage problem, the issue is in fact that when you use the new google-cloud library to access ML APIs, everyone seems to authenticate against a project shared by everyone (hence it says you've used up your limit even though you haven't used anything). To check and confirm this, you can call an ML API that you have not enabled using the Python client library; it will give you a result even though it shouldn't. This problem occurs with other language client libraries and operating systems as well, so I suspect it's an issue with their gRPC layer.
Because of this, to ensure consistency, I always use the older googleapiclient, which uses my API key. Here is an example using the Translate API:
from googleapiclient import discovery
service = discovery.build('translate', 'v2', developerKey='')
service_request = service.translations().list(q='hello world', target='zh')
result = service_request.execute()
print(result)
For the speech API, it's something along the lines of:
from googleapiclient import discovery

service = discovery.build('speech', 'v1beta1', developerKey='')
# syncrecognize needs a request body with the audio config and content
# (the values below are illustrative).
service_request = service.speech().syncrecognize(body={
    'config': {'encoding': 'LINEAR16', 'sampleRate': 44100},
    'audio': {'content': ''},  # base64-encoded audio bytes
})
result = service_request.execute()
print(result)
You can get the list of the discovery APIs at https://developers.google.com/api-client-library/python/apis/ with the speech one located in https://developers.google.com/resources/api-libraries/documentation/speech/v1beta1/python/latest/.
One of the other benefits of using the discovery library is that you get a lot more options compared to the current library, although often times it's a bit more of a pain to implement.
I'm trying to connect to Datastore from an existing Compute Engine instance and I'm getting:
[ python 2.7 - googledatastore-v1beta2_rev1_2.1.0-py2.7 ]
googledatastore.connection.RPCError: commit RPC client failure with HTTP(403) Forbidden: Unauthorized.
The Datastore API is enabled and permissions are set, but the GCE instance is in a different zone (same project).
What else could it be?
GCE env:
DATASTORE_DATASET = project_id
DATASTORE_PRIVATE_KEY_FILE = absolute path to pem file
DATASTORE_SERVICE_ACCOUNT = service_account_email
Any tips on what I should do or check? I'm confused because I have exactly the same configuration in my local environment: when I click "play" in PyCharm, everything works well ;)
Maybe I missed something...
Thanks for your help ;)
This is currently a bug in the Cloud Datastore client library. If you are running on GCE, it will try to use the scope rules and then fail before trying other authentication methods.