For security purposes, I have disabled public access under the Networking tab in Key Vault and have a private endpoint in place. Both the Key Vault and the private endpoint reside in the same resource group. I have an app registration for my application, which I have granted access under Access policies in the Key Vault.
Using the Python SDK:
from azure.keyvault.secrets import SecretClient
from azure.identity import ClientSecretCredential as cs

keyVaultName = "<NAME>"
kvURI = "https://<NAME>.vault.azure.net"
AZ_TENANT_ID = '<AZ_TENANT_ID>'
AZ_CLIENT_ID = '<AZ_CLIENT_ID>'
AZ_CLIENT_SECRET = '<AZ_CLIENT_SECRET>'

credential = cs(
    tenant_id=AZ_TENANT_ID,
    client_id=AZ_CLIENT_ID,
    client_secret=AZ_CLIENT_SECRET)

def set_secret(secretname, secretvalue):
    print(credential)
    secret_client = SecretClient(vault_url=kvURI, credential=credential)
    secret = secret_client.set_secret(secretname, secretvalue, enabled=True)
    sec_dic = {}
    sec_dic['name'] = secret.name
    sec_dic['value'] = secret.value
    sec_dic['properties'] = secret.properties.version
    return sec_dic

xx = set_secret('g', 'ff')
print(xx)
When running this code, I get the following error:
azure.core.exceptions.HttpResponseError: (Forbidden) Public network access is disabled and request is not from a trusted service nor via an approved private link.
Code: Forbidden
Message: Public network access is disabled and request is not from a trusted service nor via an approved private link.
Inner error: {
"code": "ForbiddenByConnection"
}
What am I doing wrong? How do I connect to a Key Vault that has public access disabled and is reachable only via a private endpoint?
I have reproduced this in my environment and got the expected results as below:
Firstly, I followed the same process you have explained, and I got the same error you did:
This error occurs because you created a private endpoint: once public network access is disabled, the vault only accepts requests that arrive over that private link.
When you create a private endpoint on a particular Virtual Network and subnet, you need to create a Virtual Machine integrated with that same Virtual Network and subnet.
Then we need to open the VM integrated with the above Virtual Network and subnet; when I tried to access the Key Vault from inside that VM, I got the required result as below:
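As an additional check before retrying, you can confirm from inside that VM that traffic will actually go over the private endpoint: with the privatelink DNS zone linked to the Virtual Network, the vault hostname should resolve to the private endpoint's private IP rather than a public address. A minimal sketch, reusing the vault name placeholder from your code:

import socket

# Run this from the VM inside the Virtual Network that hosts the private endpoint.
# With private DNS configured, this should print a private IP (e.g. 10.x.x.x).
vault_host = "<NAME>.vault.azure.net"  # same placeholder as in the question
print(socket.gethostbyname(vault_host))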
References:
azure - Unable to get storage account container details - Stack Overflow
Azure Key Vault not allow access via private endpoint connection
Azure functions and Azure KeyVault communicating through service endpoint
My SQL Server DB password is saved in an Azure key vault, which has a DATAREF ID as an identifier. I need that password to create a Spark dataframe from a table that is present in SQL Server. I am running this .py file on a Google Dataproc cluster. How can I get that password using Python?
Since you are accessing an Azure service from a non-Azure service, you will need a service principal. You can use a certificate or a secret. See THIS link for the different methods. You will need to give the service principal proper access, and this will depend on whether you are using RBAC or an access policy for your key vault.
So the steps you need to follow are:
Create a key vault and create a secret.
Create a service principal (application registration). Store the client ID, client secret, and tenant ID.
Give the service principal proper access to the key vault (if you are using access policies) or to the specific secret (if you are using the RBAC model).
The python link for the code is HERE.
The code that will work for you is below:
from azure.identity import ClientSecretCredential
from azure.keyvault.secrets import SecretClient
tenantid = "<your_tenant_id>"
clientsecret = "<your_client_secret>"
clientid = "<your_client_id>"
my_credentials = ClientSecretCredential(tenant_id=tenantid, client_id=clientid, client_secret=clientsecret)
secret_client = SecretClient(vault_url="https://<your_keyvault_name>.vault.azure.net/", credential=my_credentials)
secret = secret_client.get_secret("<your_secret_name>")
print(secret.name)
print(secret.value)
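Once you have the secret, on the Dataproc side you can pass its value straight into the Spark JDBC options. A rough sketch, assuming the SQL Server JDBC driver is available on the cluster; the server, database, table, and user names below are placeholders:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("read-sqlserver").getOrCreate()

# Placeholder connection details; replace with your own.
jdbc_url = "jdbc:sqlserver://<your_server>:1433;databaseName=<your_db>"

df = (
    spark.read.format("jdbc")
    .option("url", jdbc_url)
    .option("dbtable", "<your_table>")
    .option("user", "<your_sql_user>")
    .option("password", secret.value)  # the secret fetched from Key Vault above
    .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
    .load()
)
df.show()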
I have created and run DAGs on a google-cloud-composer environment (dlkpipelinesv1: composer-1.13.0-airflow-1.10.12). I am able to trigger these DAGs manually and with the scheduler, but I am stuck when it comes to triggering them via Cloud Functions that detect changes in a Google Cloud Storage bucket.
Note that I had another GC-Composer environment (pipelines: composer-1.7.5-airflow-1.10.2) that used those same Google Cloud Functions to trigger the relevant DAGs, and it was working.
I followed this guide to create the functions that trigger the DAGs. So I retrieved the following variables:
PROJECT_ID = <project_id>
CLIENT_ID = <client_id_retrieved_by_running_the_code_in_the_guide_within_my_gcp_console>
WEBSERVER_ID = <airflow_webserver_id>
DAG_NAME = <dag_to_trigger>
WEBSERVER_URL = f"https://{WEBSERVER_ID}.appspot.com/api/experimental/dags/{DAG_NAME}/dag_runs"
import logging

import requests
from google.auth.transport.requests import Request
from google.oauth2 import id_token


def file_listener(event, context):
    """Entry point of the cloud function: triggered by a change to a Cloud Storage bucket.
    Args:
        event (dict): Event payload.
        context (google.cloud.functions.Context): Metadata for the event.
    """
    logging.info("Running the file listener process")
    logging.info(f"event : {event}")
    logging.info(f"context : {context}")
    file = event
    if file["size"] == "0" or "DTM_DATALAKE_AUDIT_COMPTAGE" not in file["name"] or ".filepart" in file["name"].lower():
        logging.info("no matching file")
        exit(0)
    logging.info(f"File listener detected the presence of : {file['name']}.")
    # id_token = authorize_iap()
    # make_iap_request({"file_name": file["name"]}, id_token)
    make_iap_request(url=WEBSERVER_URL, client_id=CLIENT_ID, method="POST")


def make_iap_request(url, client_id, method="GET", **kwargs):
    """Makes a request to an application protected by Identity-Aware Proxy.
    Args:
        url: The Identity-Aware Proxy-protected URL to fetch.
        client_id: The client ID used by Identity-Aware Proxy.
        method: The request method to use
            ('GET', 'OPTIONS', 'HEAD', 'POST', 'PUT', 'PATCH', 'DELETE').
        **kwargs: Any of the parameters defined for the request function:
            https://github.com/requests/requests/blob/master/requests/api.py
            If no timeout is provided, it is set to 90 by default.
    Returns:
        The page body, or raises an exception if the page couldn't be retrieved.
    """
    # Set the default timeout, if missing
    if "timeout" not in kwargs:
        kwargs["timeout"] = 90
    # Obtain an OpenID Connect (OIDC) token from the metadata server or using a service account.
    open_id_connect_token = id_token.fetch_id_token(Request(), client_id)
    logging.info(f"Retrieved open id connect (bearer) token {open_id_connect_token}")
    # Fetch the Identity-Aware Proxy-protected URL, including an authorization header
    # containing "Bearer " followed by a Google-issued OpenID Connect token for the service account.
    resp = requests.request(method, url, headers={"Authorization": f"Bearer {open_id_connect_token}"}, **kwargs)
    if resp.status_code == 403:
        raise Exception("Service account does not have permission to access the IAP-protected application.")
    elif resp.status_code != 200:
        raise Exception(f"Bad response from application: {resp.status_code} / {resp.headers} / {resp.text}")
    else:
        logging.info(f"Response status - {resp.status_code}")
        return resp.json()
This is the code that runs in the Cloud Function.
I have checked the environment details in dlkpipelinesv1 and pipelines respectively, using this code:
import google.auth
import google.auth.transport.requests

credentials, _ = google.auth.default(
    scopes=['https://www.googleapis.com/auth/cloud-platform'])
authed_session = google.auth.transport.requests.AuthorizedSession(
    credentials)
# project_id = 'YOUR_PROJECT_ID'
# location = 'us-central1'
# composer_environment = 'YOUR_COMPOSER_ENVIRONMENT_NAME'
environment_url = (
    'https://composer.googleapis.com/v1beta1/projects/{}/locations/{}'
    '/environments/{}').format(project_id, location, composer_environment)
composer_response = authed_session.request('GET', environment_url)
environment_data = composer_response.json()
and the two use the same service account to run, i.e. the same IAM roles. However, I have noticed the following differing details:
In the old environment :
"airflowUri": "https://p5<hidden_value>-tp.appspot.com",
"privateEnvironmentConfig": { "privateClusterConfig": {} },
In the new environment:
"airflowUri": "https://da<hidden_value>-tp.appspot.com",
"privateEnvironmentConfig": {
"privateClusterConfig": {},
"webServerIpv4CidrBlock": "<hidden_value>",
"cloudSqlIpv4CidrBlock": "<hidden_value>"
}
The service account that I use to make the post request has the following roles :
Cloud Functions Service Agent
Composer Administrator
Composer User
Service Account Token Creator
Service Account User
The service account that runs my composer environment has the following roles :
BigQuery Admin
Composer Worker
Service Account Token Creator
Storage Object Admin
But I am still receiving a 403 - Forbidden in the Log Explorer when the POST request is made to the Airflow API.
EDIT 2020-11-16:
I've updated to the latest make_iap_request code.
I tinkered with IAP in the Security service, but I cannot find the webserver that will accept HTTP POST requests from my Cloud Functions... See the image below; anyway, I added the service account to the default and CRM IAP resources just in case, but I still get this error:
Exception: Service account does not have permission to access the IAP-protected application.
The main question is: what IAP is at stake here, and how do I add my service account as a user of this IAP?
What am I missing?
There is a configuration parameter that causes ALL requests to the API to be denied...
In the documentation, it is mentioned that we need to override the following Airflow configuration:
[api]
auth_backend = airflow.api.auth.backend.deny_all
into
[api]
auth_backend = airflow.api.auth.backend.default
This detail is really important to know, and it is not mentioned in Google's documentation...
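If you want to apply this override from code rather than through the Console, one option is to PATCH the environment through the same Composer REST API used above to read the environment details. This is only a sketch, assuming your service account is allowed to update the environment and reusing the project_id / location / composer_environment placeholders from earlier; note that this update mask replaces the whole set of Airflow config overrides:

import google.auth
import google.auth.transport.requests

credentials, _ = google.auth.default(
    scopes=['https://www.googleapis.com/auth/cloud-platform'])
authed_session = google.auth.transport.requests.AuthorizedSession(credentials)

environment_url = (
    'https://composer.googleapis.com/v1beta1/projects/{}/locations/{}'
    '/environments/{}').format(project_id, location, composer_environment)

# Airflow config override keys use the "<section>-<option>" format.
patch_body = {
    "config": {
        "softwareConfig": {
            "airflowConfigOverrides": {
                "api-auth_backend": "airflow.api.auth.backend.default"
            }
        }
    }
}
response = authed_session.request(
    'PATCH',
    environment_url + '?updateMask=config.softwareConfig.airflowConfigOverrides',
    json=patch_body)
print(response.status_code, response.json())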
Useful links :
Triggering DAGS (workflows) with GCS
make_iap_request.py repository
The code throwing the 403 is the way it used to work. There was a breaking change in the middle of 2020. Instead of using requests to make an HTTP call for the token, you should use Google's OAuth2 library:
from google.oauth2 import id_token
from google.auth.transport.requests import Request
open_id_connect_token = id_token.fetch_id_token(Request(), client_id)
see this example
I followed the steps in Triggering DAGs and it has worked in my environment; please see my recommendations below.
It is a good start that the Composer environment is up and running. Through the process you will only need to upload the new DAG (trigger_response_dag.py) and get the client ID (it ends with .apps.googleusercontent.com), either with a Python script or from the login page the first time you open the Airflow UI.
On the Cloud Functions side, I noticed you have a combination of instructions for Node.js and for Python; for example, USER_AGENT is only for Node.js, and the make_iap_request routine is only for Python. I hope the following points help resolve your problem:
Service Account (SA). The Node.js code uses the default service account ${projectId}@appspot.gserviceaccount.com, whose default role is Editor, meaning that it has wide access to GCP services, including Cloud Composer. In Python, I think authentication is managed somehow by client_id, since a token is retrieved with it. Please ensure that the SA has this Editor role, and don't forget to assign serviceAccountTokenCreator as specified in the guide.
I used the Node.js 8 runtime, and I noticed the user agent you are concerned about should be 'gcf-event-trigger', as it is hard-coded: USER_AGENT = 'gcf-event-trigger'. In Python, it seems not necessary.
By default, in the GCS trigger, the GCS Event Type is set to Archive; you need to change it to Finalize/Create. If it is set to Archive, the trigger won't fire when you upload objects, and the DAG won't be started.
If you think your Cloud Function is correctly configured and an error persists, you can find its cause in your Cloud Function's LOGS tab in the Console. It can give you more details.
Basically, from the guide, I only had to change the following values in Node.js:
// The project that holds your function. Replace <YOUR-PROJECT-ID>
const PROJECT_ID = '<YOUR-PROJECT-ID>';
// Navigate to your webserver's login page and get this from the URL
const CLIENT_ID = '<ALPHANUMERIC>.apps.googleusercontent.com';
// This should be part of your webserver's URL in the Env's detail page: {tenant-project-id}.appspot.com.
const WEBSERVER_ID = 'v90eaaaa11113fp-tp';
// The name of the DAG you wish to trigger. It's DAG's name in the script trigger_response_dag.py you uploaded to your Env.
const DAG_NAME = 'composer_sample_trigger_response_dag';
For Python I only changed these parameters:
client_id = '<ALPHANUMERIC>.apps.googleusercontent.com'
# This should be part of your webserver's URL:
# {tenant-project-id}.appspot.com
webserver_id = 'v90eaaaa11113fp-tp'
# Change dag_name only if you are not using the example
dag_name = 'composer_sample_trigger_response_dag'
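To tie these values back to the code in your question: they simply feed the experimental dag_runs endpoint that make_iap_request calls. A rough sketch of the wiring, assuming the make_iap_request defined above and that an empty conf payload is acceptable for your DAG:

# Hypothetical wiring of these parameters into the question's helper function.
webserver_url = (
    f"https://{webserver_id}.appspot.com/api/experimental/dags/{dag_name}/dag_runs"
)

# make_iap_request forwards extra keyword arguments to requests.request,
# so the DAG run configuration can be passed as JSON.
make_iap_request(webserver_url, client_id, method="POST", json={"conf": {}})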
I've set up a VM in Azure that has a managed identity.
I followed the guide here: https://learn.microsoft.com/en-us/azure/active-directory/managed-identities-azure-resources/tutorial-linux-vm-access-arm
So now I have an access token. But what I fail to understand is how I use this token to access my key vault. I'm using the Python SDK. Looking at the docs for the SDK here: https://learn.microsoft.com/en-us/python/api/azure-keyvault/azure.keyvault?view=azure-python
there is an access token class, AccessToken(scheme, token, key).
I assume I can use the token I generated earlier here. But what are scheme and key? The docs do not explain them. Or am I looking at the wrong class to use with the token?
If you're using a VM with a managed identity, then you can create a credential for a Key Vault client using azure-identity's ManagedIdentityCredential class. The credential will fetch and use access tokens for you as you use the Key Vault client:
from azure.identity import ManagedIdentityCredential
from azure.keyvault.secrets import SecretClient
credential = ManagedIdentityCredential()
client = SecretClient("https://{vault-name}.vault.azure.net", credential)
secret = client.get_secret("secret-name")
Note that I'm using a SecretClient to fetch secrets from Key Vault; there are new packages for working with Key Vault in Python that replace azure-keyvault:
azure-keyvault-certificates (Migration guide)
azure-keyvault-keys (Migration guide)
azure-keyvault-secrets (Migration guide)
Clients in each of these packages can use any credential from azure-identity for authentication.
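For example, the same credential works with the keys package too; a minimal sketch, assuming a key named "key-name" already exists in your vault:

from azure.identity import ManagedIdentityCredential
from azure.keyvault.keys import KeyClient

credential = ManagedIdentityCredential()
key_client = KeyClient("https://{vault-name}.vault.azure.net", credential)

# Fetch an existing key; "key-name" is a placeholder.
key = key_client.get_key("key-name")
print(key.name, key.key_type)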
(I work on the Azure SDK in Python)
I wouldn't recommend using the managed identity of a VM to access Key Vault. You should create a service principal if you intend to run scripts/code.
The best way of doing this is with the Azure CLI. See here for instructions on installing the CLI, and refer to this, or this for creating your service principal.
The best way to manage resources in Python is by using ADAL, which is documented:
https://github.com/AzureAD/azure-activedirectory-library-for-python
In your case, however, managing KeyVault is made a little easier since the KeyVault library for Python also provides the means for you to authenticate without directly using ADAL to obtain your access token. See here:
https://learn.microsoft.com/en-us/python/api/overview/azure/key-vault?view=azure-python
from azure.keyvault import KeyVaultClient
from azure.common.credentials import ServicePrincipalCredentials
credentials = ServicePrincipalCredentials(
    client_id='...',
    secret='...',
    tenant='...'
)
client = KeyVaultClient(credentials)
# VAULT_URL must be in the format 'https://<vaultname>.vault.azure.net'
# KEY_VERSION is required, and can be obtained with the KeyVaultClient.get_key_versions(self, vault_url, key_name) API
key_bundle = client.get_key(VAULT_URL, KEY_NAME, KEY_VERSION)
key = key_bundle.key
In the above, client_id, secret, and tenant (id) are all outputs of the az ad sp create-for-rbac --name {APP-NAME} CLI command.
Remember to review and adjust the role assignments for the service principal you created. And your Key Vault is only as secure as the devices that have access to your service principal's credentials.
I am trying to automate creating users in Azure DevOps by writing a Python script. I used the Python client API of azure-devops for this purpose.
As of now, the authentication is done using a personal access token (PAT):
from azure.devops.connection import Connection
from msrest.authentication import BasicAuthentication
personal_access_token = '<myPAT>'
organization_url = 'https://dev.azure.com/<myOrganization>'
# Create a connection to the org
credentials = BasicAuthentication('', personal_access_token)
connection = Connection(base_url=organization_url, creds=credentials)
Actually, I have a script that uses the Graph API and authenticates to Azure AD via ADAL. That means I already have an application registered in our Azure AD that was created to use the Graph API.
Can I use this application and its client ID to authenticate to Azure DevOps services? How can I use this authentication method with the Python client API?
Does this article about the OAuth 2.0 authentication method point to the same thing?
https://learn.microsoft.com/en-us/azure/devops/integrate/get-started/authentication/oauth?toc=%2Fazure%2Fdevops%2Forganizations%2Ftoc.json&bc=%2Fazure%2Fdevops%2Forganizations%2Fbreadcrumb%2Ftoc.json&view=azure-devops#register-your-app
I got confused because it says to create an app in Azure DevOps, not in Azure AD.
Could anyone help me by clarifying this and explaining the steps to do this authentication, if possible?
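To make the question concrete, this is the kind of wiring I have in mind, sketched under the assumption that I can obtain an Azure AD access token that Azure DevOps accepts (whether my Graph API app registration qualifies is exactly what I'm unsure about):

from azure.devops.connection import Connection
from msrest.authentication import BasicTokenAuthentication

# aad_access_token would be acquired elsewhere (e.g. via ADAL/MSAL) with
# Azure DevOps as the target resource; that part is not shown here.
credentials = BasicTokenAuthentication({'access_token': aad_access_token})
connection = Connection(base_url='https://dev.azure.com/<myOrganization>', creds=credentials)

# Sanity check that the token is accepted before automating user creation.
core_client = connection.clients.get_core_client()
projects = core_client.get_projects()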
I am trying to build a Flask app on AWS Lambda, which needs to access https://www.googleapis.com/auth/admin.directory.device.chromeos.readonly to get the MAC address of Chrome devices by domain name and Chrome device ID.
Basically, the workflow is below:
A Chrome extension deployed via G Suite sends a request with the domain name and device ID to AWS (as we are using AWS), then AWS sends a request with the domain name and device ID to Google Cloud to get the MAC address.
I started by accessing the Directory API as a service account, like service_account.Credentials.from_service_account_file(SERVICE_ACCOUNT_FILE, scopes=SCOPES, subject="username@domainName.com"), and it works. However, I realised it was the wrong implementation, as the domain name can change and the subject will be different for each domain, since the Chrome extension can be deployed via different domains.
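For reference, this is roughly what that first service-account approach looked like (a sketch using google-api-python-client; the admin subject, key file path, and device ID are placeholders):

from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/admin.directory.device.chromeos.readonly"]
SERVICE_ACCOUNT_FILE = "service-account.json"  # placeholder path

# Domain-wide delegation: the service account impersonates an admin of the domain.
credentials = service_account.Credentials.from_service_account_file(
    SERVICE_ACCOUNT_FILE, scopes=SCOPES, subject="admin@domainName.com")

directory = build("admin", "directory_v1", credentials=credentials)

# "my_customer" means the customer of the impersonated admin;
# the device ID comes from the Chrome extension's request.
device = directory.chromeosdevices().get(
    customerId="my_customer", deviceId="<device_id>").execute()
print(device.get("macAddress"))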
Then I started to use the sample code from Google (https://developers.google.com/api-client-library/python/auth/web-app), and I was able to access the domain and get the MAC address of each device.
However, the problem is that it requires choosing the email when you call the Google API for the first time. But as I mentioned above, this app is going to run on AWS, so clearly users cannot choose the email.
So is it possible to just use the domain name instead of choosing an email to do the authentication and access the different directories? If so, are there any examples or documentation I should read?
I suspect I need to modify this part of the sample, but I am still confused about how it works.
import flask
import google_auth_oauthlib.flow

@app.route('/adminlogin')
def authorize():
    # Create a flow instance to manage the OAuth 2.0 Authorization Grant Flow steps.
    flow = google_auth_oauthlib.flow.Flow.from_client_secrets_file(
        CLIENT_SECRETS_FILE, scopes=SCOPES)
    flow.redirect_uri = flask.url_for('oauth2callback', _external=True)
    authorization_url, state = flow.authorization_url(
        # Enable offline access so that you can refresh an access token without
        # re-prompting the user for permission. Recommended for web server apps.
        access_type='offline',
        approval_prompt="force",
        # Enable incremental authorization. Recommended as a best practice.
        # include_granted_scopes='true'
    )
    # Store the state so the callback can verify the auth server response.
    flask.session['state'] = state
    return flask.redirect(authorization_url)
Any hint will be helpful.