Connect google datastore from existing google compute engine in python - python

I'm trying to connect to Datastore from existing compute engine instance and I'm getting:
[ python 2.7 - googledatastore-v1beta2_rev1_2.1.0-py2.7 ]
googledatastore.connection.RPCError: commit RPC client failure with HTTP(403) Forbidden: Unauthorized.
the Datastore API is enabled, Permissions is set but GCE is in different zone, one project
what else ?
GCE env:
DATASTORE_DATASET = project_id
DATASTORE_PRIVATE_KEY_FILE = absolute path to pem file
DATASTORE_SERVICE_ACCOUNT = service_account_email
Any tips what should I do/check ? I'm confused because I have exactly the same configuration in my local environment - when I click "play" in pyCharm everything works well ;)
Maybe I missed something...
Thanks for your help ;)

This is currently a bug in the Cloud Datastore client library. If you are running on GCE, it will try to use the scope rules and then fail before trying other authentication methods.

Related

using OAuth2 user account authentication in the python google cloud API from jupyter notebook

I am trying to access BigQuery from python code in Jupyter notebook run on a local machine. So I installed the google cloud API packages on my laptop.
I need to pass the OAuth2 authentication. But unfortunately, I only have user account to our bigquery. I do not have service account and not application credentials, nor do I have the permissions to create such. I am only allowed to work with user account.
When running the bigquery.Client() function, it appears to look for application credentials by looking at an environment variable GOOGLE_APPLICATION_CREDENTIALS. But this, it seems, for my non existing application credentials.
I cannot find any other way to connect using user account authentication. But I find it extremely weird because:
The google API for R language works simply with user authentication. Parallel code in R (it has different API) just works!
I run the code from the dataspell IDE. I have created in the IDE a database resource connection to bigquery (with my user authentication). There I am capable of opening a console for the database and I can run SQL queries in the console with no problem. I have attached the bigquery session to my python notebook, and I can see my notebook attached to the big query session in the services pane. But I am still missing something in order to access some valid running connection in the python code. (I do not know how to get a python object representing a valid connected client).
I have been reading manuals from google and looked for code examples for hours... Alas, I cannot find any description of connecting a client using user account from my notebook.
Please, can someone help?
You can use the pydata-google-auth library to authenticate with a user account. This function loads credentials from a cache on disk or initiates an OAuth2.0 flow if the credentials are not found. This is not the recommended method to do an authentication.
import pandas_gbq
import pydata_google_auth
SCOPES = [
'https://www.googleapis.com/auth/cloud-platform',
'https://www.googleapis.com/auth/drive',
]
credentials = pydata_google_auth.get_user_credentials(
SCOPES,
# Set auth_local_webserver to True to have a slightly more convienient
# authorization flow. Note, this doesn't work if you're running from a
# notebook on a remote sever, such as over SSH or with Google Colab.
auth_local_webserver=True,
)
df = pandas_gbq.read_gbq(
"SELECT my_col FROM `my_dataset.my_table`",
project_id='YOUR-PROJECT-ID',
credentials=credentials,
)
The recommended way to do the authentication is to contact your GCP administrator and tell them to create a key for your account following the next instructions.
Then you can use this code to set up the authentication with the key that you have:
from google.oauth2 import service_account
credentials = service_account.Credentials.from_service_account_file(
'/path/to/key.json')
You can see more of the documentation here.

Can I check if a script is running inside a Compute Engine or in a local environment?

I just wanted to know if there is a way to check whether a Python script is running inside a Compute Engine or in a local environment?
I want to check that in order to know how to authenticate, for example when a script runs on a Compute Engine and I want to initiate a BigQuery client I do not need to authenticate but when it comes to running a script locally I need to authenticate using a service account JSON file.
If I knew whether a script is running locally or in a Compute Engine I would be able to initiate Google services accordingly.
I could put initialization into a try-except statement but maybe there is another way?
Any help is appreciated.
If I understand your question correctly, I think a better solution is provided by Google called Application Default Credentials. See Best practices to securely auth apps in Google Cloud (thanks #sethvargo) and Application Default Credentials
Using this mechanism, authentication becomes consistent regardless of where you run your app (on- or off-GCP). See finding credentials automatically
When you run off-GCP, you set GOOGLE_APPLICATION_CREDENTIALS to point to the Service Account. When you run on-GCP (and, to be clear, you are still authenticating, it's just transparent), you don't set the environment variable because the library obtains the e.g. Compute Engine instance's service account for you.
So I read a bit on the Google Cloud authentication and came up with this solution:
import google.auth
from google.oauth2 import service_account
try:
credentials, project = google.auth.default()
except:
credentials = service_account.Credentials.from_service_account_file('/path/to/service_account_json_file.json')
client = storage.Client(credentials=credentials)
What this does is it tries to retrieve the default Google Cloud credentials (in environments such as Compute Engine) and if it fails it tries to authenticate using a service account JSON file.
It might not be the best solution but it works and I hope it will help someone else too.

VisualStudioCodeCredential.get_token failed

I'm using a Jupyter Notebook within VS Code and the Azure Python SDK to develop locally.
Relevant VS Code Extensions installed:
Python
Azure Account
Azure Storage (maybe relevant?)
Goal:
To retrieve a secret from Azure Keyvault using DefaultCredential to authenticate
Since there are no environment variables nor ManagedIdentity credentials, DefaultCredential should default to pulling my creds from VS Code
Issue:
import logging
from azure.keyvault.secrets import SecretClient
from azure.identity import DefaultAzureCredential
keyvault_name = "kv-test"
keyvualt_url = "https://" + keyvault_name + ".vault.azure.net"
keyvault_credential = DefaultAzureCredential()
kv_secret1_name = "secret-test"
keyvault_client = SecretClient(vault_url=keyvualt_url, credential=keyvault_credential)
retrieved_key = keyvault_client.get_secret(kv_secret1_name)
logging.info("Account key retrieved from Keyvault")
Error:
EnvironmentCredential.get_token failed: EnvironmentCredential authentication unavailable. Environment variables are not fully configured.
ManagedIdentityCredential.get_token failed: ManagedIdentityCredential authentication unavailable, no managed identity endpoint found.
SharedTokenCacheCredential.get_token failed: SharedTokenCacheCredential authentication unavailable. No accounts were found in the cache.
VisualStudioCodeCredential.get_token failed: **Failed to get Azure user details from Visual Studio Code**.
Tried so far:
F1, Azure: Sign in
Authenticate via browser
No change
It looks like the DefaultCredential() cred chain is running, but its unable to ...get Azure user details from Visual Studio Code..
Is this because I'm developing inside a Jupyter Notebook in VS Code or is there another issue going on? It looks like something similar happened to the Python .NET SDK.
Not sure why it does not work, it looks correct. If you just want to login with visual studio code, you can also use AzureCliCredential. It works on my side.
You could use az login to sign in your account. Then you will get secret using the code.
from azure.keyvault.secrets import SecretClient
from azure.identity import DefaultAzureCredential,AzureCliCredential
keyvault_credential= AzureCliCredential()
secret_client = SecretClient("https://{vault-name}.vault.azure.net", keyvault_credential)
secret = secret_client.get_secret("secret-name")
print(secret.name)
print(secret.value)
For more details, see the Azure Identity client library for Python.
There is another easy way to fix this issue. As I quoted in this article, we can make use of the configurations options as preceding.
if self.local_dev:
print(f"Local Dev is {self.local_dev}")
self.az_cred = DefaultAzureCredential(
exclude_environment_credential=True,
exclude_managed_identity_credential=True,
exclude_shared_token_cache_credential=True,
exclude_interactive_browser_credential=True,
exclude_powershell_credential=True,
exclude_visual_studio_code_credential=False,
exclude_cli_credential=False,
logging_enable=True,
)
else:
self.az_cred = DefaultAzureCredential(
exclude_environment_credential=True, logging_enable=True
)
Please be noted that the exclude_visual_studio_code_credential and exclude_cli_credential as set to False and others set to True to exclude for the local development and exclude_environment_credential is set to True for the other environment such as production.
You can see these configuations in the default.py file of azure identity package.
As stated in the docs:
It's a known issue that VisualStudioCodeCredential doesn't work with Azure Account extension versions newer than 0.9.11. A long-term fix to this problem is in progress. In the meantime, consider authenticating via the Azure CLI.
For reference, this is the issue. With that, it is probably smart to disable the vscode credentials: DefaultAzureCredential(exclude_visual_studio_code_credential=True)
Anyway, depending on the version of the vscode extension, we might need to use another mean of authentication, such as SharedTokenCacheCredential, AzureCliCredential or even InteractiveBrowserCredential.
In my case, my authentication was failing on the SharedTokenCacheCredential step, that from what I read this is a shared cache used between Microsoft products. Thus, I assume it is likely the same is happening to others that have Microsoft products installed.
It was failing because my target tenant was not included in this cache. To fix this I had two options: either I disable the shared token credential or I include the target tenant in the shared cache.
For the first option, we can do something similar as for disabling vscode: DefaultAzureCredential(exclude_shared_token_cache_credential=True)
For the second option, I did it as suggested in this blog post from Microsoft: DefaultAzureCredential(additionally_allowed_tenants=[TENANT_ID]). But by looking at the source code, it seems we can achieve the same by:
setting the target tenant ID to an environment variable named AZURE_TENANT_ID, or;
directly passing the shared cache tenant id: DefaultAzureCredential(shared_cache_tenant_id="TENANT_ID")
Note that the env variable has the benefit that it is used by other authentication methods, namely InteractiveBrowserCredential and VisualStudioCodeCredential.

Azure python SDK ComputerManagementClient error

I get an error when trying to deallocate a virtual machine with the Python SDK for Azure.
Basically I try something like:
credentials = ServicePrincipalCredentials(client_id, secret, tenant)
compute_client = ComputeManagementClient(credentials, subscription_id, '2015-05-01-preview')
compute_client.virtual_machines.deallocate(resource_group_name, vm_name)
pprint (result.result())
-> exception:
msrestazure.azure_exceptions.CloudError: Azure Error: AuthorizationFailed
Message: The client '<some client UUID>' with object id '<same client UUID>' does not have authorization to perform action 'Microsoft.Compute/virtualMachines/deallocate/action' over scope '/subscriptions/<our subscription UUID>/resourceGroups/<resource-group>/providers/Microsoft.Compute/virtualMachines/<our-machine>'.
What I don't understand is that the error message contains an unknown client UUID that I have not used in the credentials.
Python is version 2.7.13 and the SDK version was from yesterday.
What I guess I need is a registration for an Application, which I did to get the information for the credentials. I am not quite sure which exact permission(s) I need to register for the application with IAM. For adding an access entry I can only pick existing users, but not an application.
So is there any programmatic way to find out which permissions are required for an action and which permissions our client application has?
Thanks!
As #GauravMantri & #LaurentMazuel said, the issue was caused by not assign role/permission to a service principal. I had answered another SO thread Cannot list image publishers from Azure java SDK, which is similar with yours.
There are two ways to resolve the issue, which include using Azure CLI & doing these operations on Azure portal, please see the details of my answer for the first, and I update below for the second way which is old.
And for you want to find out these permissions programmatically, you can refer to the REST API Role Definition List to get all role definitions that are applicable at scope and above, or refer to Azure Python SDK Authentication Management to do it via the code authorization_client.role_definitions.list(scope).
Hope it helps.
Thank you all for your answers! The best recipe for creating an application and to register it with the right role - Virtual Machine Contributor - is presented indeed on https://learn.microsoft.com/en-us/azure/azure-resource-manager/resource-group-create-service-principal-portal
The main issue I had was that there is a bug in the adding a role within IAM. I use add. I select "Virtual Machine Contributor". With "Select" I get presented a list of users, but not the application that I have created for this purpose. Entering the first few letters of the name of my application will give a filtered output that includes my application this time though. Registration is then finished and things can proceed.

Is it possible to connect and query a BigQuery table from Google App-Engine (python) without OAuth2 authentication dialog?

I’m working on a Google App-Engine project which stores around 100K entities in the Datastore. Since I have to search in the string properties of those entities I have to find an effective way to do it.
After some research I found Google’s BigQuery service which looks perfect for me. I already imported the entities into BigQuery via the web interface but I can not connect and run a query on the BigQuery from the App-Engine code.
My App-Engine project has no web interface. It generates only JSON outputs which are consumed by mobile applications.
So, my question is this: is it possible to connect and run a query from the App-Engine python code without the OAuth2 authentication dialog?
Yes. Simply use what's known as a "service account" as described here. Then, some simple Python code once you've exported GOOGLE_APPLICATION_CREDENTIALS to point to the credential file:
from google.cloud import bigquery
client = bigquery.Client(project='PROJECT_ID')
for dataset in client.list_datasets():
do_something_with(dataset)
More info here too.
Just a quick caution this is case sensitive
- to check the captalisation for 'Client';
my_bigquery_client = bigquery.client(project = 'my_project')
- this fails with an error "TypeError: 'module' object is not callable"
my_bigquery_client = bigquery.Client(project = 'my_project')
- this works.

Categories

Resources