Airflow: distinguish API- and UI-triggered dag runs - python

I'm using Apache Airflow 2.2.4. When I trigger a DAG run via UI click or via API call, I get context['dag_run'].external_trigger = True and context['dag_run'].run_type = 'scheduled' in both cases. I would like to distinguish between those two cases though. How can I do so?

Create a new role that doesn't have the permission action = website (so it cannot use the UI).
Create a new user with this role and use it only for your API calls.
From context["dag_run"] you can then get the "owner" and tell the two cases apart.

Related

Azure AuthorizationFailed

When I run a Python script to manage VMs in my Azure environment, it fails with an AuthorizationFailed message. The following exception is printed along with the error message:
CloudError("The client 'bdafca09-d426-4924-b63c-dff61c034187' with object id 'bdafca09-d426-4924-b63c-dff61c034187' does not have authorization to perform action 'Microsoft.Resources/subscriptions/resourcegroups/write' over scope '/subscriptions/49ec57ce-8a6f-4cdf-95bf-8163b231edf6/resourcegroups/azure-sample-group-virtual-machines' or the scope is invalid. If access was recently granted, please refresh your credentials.")
You need to add the Contributor role for the application you created in Azure AD in the portal.
Select your subscription in Subscriptions (search "Subscriptions" in the top bar).
Add a role assignment under Access control (IAM).
Select and add the Contributor role for your application.
If you added it successfully, you will see your application in the role assignment list.
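If you prefer the CLI, the same assignment can be sketched as a single command, using the client and subscription IDs from the error message above (the CLI must be logged in with an account that has sufficient rights on the subscription):

```shell
# Grant the AD application Contributor on the whole subscription.
az role assignment create \
  --assignee bdafca09-d426-4924-b63c-dff61c034187 \
  --role Contributor \
  --scope /subscriptions/49ec57ce-8a6f-4cdf-95bf-8163b231edf6
```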

BigQuery cross project access via cloud functions

Let's say I have two GCP projects, A and B, and I am the owner of both. When I use the UI, I can query BigQuery tables in project B from both projects. But I run into problems when I run a Cloud Function in project A from which I try to access a BigQuery table in project B. Specifically, I get a 403 Access Denied: Table <>: User does not have permission to query table <>. I am a bit confused as to why I can't access the data in B and what I need to do. In my Cloud Function all I do is:
from google.cloud import bigquery
client = bigquery.Client()
query = client.query(<my-query>)
res = query.result()
The service account used to run the function exists in project A - how do I give it editor access to BigQuery in project B? (Or what else should I do?).
Basically you have an issue with IAM permissions and roles on the service account used to run the function.
Granting the role bigquery.admin to that service account on project B would do the trick.
However, it may not be the adequate solution with regard to best practices. The link below provides a few scenarios with examples of the roles best suited to your case.
https://cloud.google.com/bigquery/docs/access-control-examples
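As a sketch of the grant itself (the project IDs and service-account name are illustrative placeholders; bigquery.admin mirrors the answer above, though a narrower role on project B, such as roles/bigquery.dataViewer, is usually preferable):

```shell
# Give project A's function service account BigQuery access on project B.
gcloud projects add-iam-policy-binding PROJECT_B_ID \
  --member="serviceAccount:FUNCTION_SA@PROJECT_A_ID.iam.gserviceaccount.com" \
  --role="roles/bigquery.admin"
```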

BigQuery GCP Python Integration

I am trying to write all my scripts in Python instead of BigQuery. I set my active project using gcloud config set project, but I still get this error: 403 POST https://bigquery.googleapis.com/bigquery/v2/projects/analytics-supplychain-thd/jobs: Caller does not have required permission to use project analytics-supplychain-thd. Grant the caller the Owner or Editor role, or a custom role with the serviceusage.services.use permission, by visiting https://console.developers.google.com/iam-admin/iam/project?project=analytics-supplychain-thd and then retry (propagation of new permission may take a few minutes).
How do I fix this?
I suspect you are picking up the wrong "key".json, at least in terms of permissions for one of the operations you are trying to perform. The key currently defined [1] in GOOGLE_APPLICATION_CREDENTIALS seems not to have the right permissions. A list of roles you could grant to the service account can be found here [2]; in any case, from your error you would need at least a primitive role such as Owner or Editor. The choice depends on your needs and targets (the operations you perform through the script).
You should pick the right role for your operation and associate it with the service account you want to use, thereby defining an identity for it through the IAM portal UI; this is also doable through the CLI or API calls.
After that, make sure the client you are using is logged in with the correct service account (correct JSON key path).
In particular, I used the code you provided to test, and I was able to load the data:
import pandas_gbq
import google.oauth2.service_account as service_account
# TODO: Set project_id to your Google Cloud Platform project ID
project_id = "xxx-xxxx-xxxxx"
sql = """SELECT * FROM `xxx-xxxx-xxxxx.fourth_dataset.2test` LIMIT 100"""
credentials = service_account.Credentials.from_service_account_file('/home/myself/key.json')
df = pandas_gbq.read_gbq(sql, project_id=project_id, dialect="standard", credentials=credentials)
Hope this helps!
[1] https://cloud.google.com/docs/authentication/getting-started#setting_the_environment_variable
[2] https://cloud.google.com/iam/docs/understanding-roles#primitive_roles

New Azure Subscriptions are not listed with azure python sdk

I created 2 new subscriptions from Azure Portal, but I've not been able to list those newly created subscriptions using the python SDK. It lists the old subscriptions fine.
from azure.mgmt.resource import SubscriptionClient
...
subscriptionClient = SubscriptionClient(credentials)
for subscription in subscriptionClient.subscriptions.list():
    print(subscription)
...
I was having the same problem with CLI as well, but logging out and logging back in resolved the issue.
I don't see any other subscription operations to scan and refresh the subscriptions. Is there something I need to do under Azure Active Directory to manage new subscriptions?
I reproduced your issue successfully; it was caused by the client registered in Azure AD not having permission to retrieve the information for these subscriptions. So the solution is to add a permission for each subscription by assigning a role such as Owner to your client, as in the figure below.
Then your code works fine. I know the solution is not perfect for you, though; I'm looking for a better one.
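The per-subscription role assignment can also be sketched with the Azure CLI (the angle-bracket placeholders are illustrative; Reader is already enough for listing subscriptions, though the answer above used Owner):

```shell
# Repeat for each new subscription the AD application should see.
az role assignment create \
  --assignee <app-client-id> \
  --role Reader \
  --scope /subscriptions/<new-subscription-id>
```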

Azure python SDK ComputeManagementClient error

I get an error when trying to deallocate a virtual machine with the Python SDK for Azure.
Basically I try something like:
from pprint import pprint
from azure.common.credentials import ServicePrincipalCredentials
from azure.mgmt.compute import ComputeManagementClient

credentials = ServicePrincipalCredentials(client_id, secret, tenant)
compute_client = ComputeManagementClient(credentials, subscription_id, '2015-05-01-preview')
result = compute_client.virtual_machines.deallocate(resource_group_name, vm_name)
pprint(result.result())
-> exception:
msrestazure.azure_exceptions.CloudError: Azure Error: AuthorizationFailed
Message: The client '<some client UUID>' with object id '<same client UUID>' does not have authorization to perform action 'Microsoft.Compute/virtualMachines/deallocate/action' over scope '/subscriptions/<our subscription UUID>/resourceGroups/<resource-group>/providers/Microsoft.Compute/virtualMachines/<our-machine>'.
What I don't understand is that the error message contains an unknown client UUID that I have not used in the credentials.
Python is version 2.7.13 and the SDK version was from yesterday.
What I guess I need is a registration for an Application, which I did to get the information for the credentials. I am not quite sure which exact permission(s) I need to grant the application in IAM. When adding an access entry I can only pick existing users, not an application.
So is there any programmatic way to find out which permissions are required for an action and which permissions our client application has?
Thanks!
As @GauravMantri & @LaurentMazuel said, the issue was caused by not assigning a role/permission to the service principal. I answered another SO thread, Cannot list image publishers from Azure java SDK, which is similar to yours.
There are two ways to resolve the issue: using the Azure CLI, or doing these operations in the Azure portal. Please see my answer there for details on the first; I describe the second (older) way below.
And since you want to find out these permissions programmatically, you can refer to the REST API Role Definition List to get all role definitions that are applicable at the scope and above, or use the Azure Python SDK Authorization Management client to do it in code via authorization_client.role_definitions.list(scope).
Hope it helps.
Thank you all for your answers! The best recipe for creating an application and registering it with the right role (Virtual Machine Contributor) is indeed presented at https://learn.microsoft.com/en-us/azure/azure-resource-manager/resource-group-create-service-principal-portal
The main issue I had was a quirk when adding a role within IAM. I click Add and select "Virtual Machine Contributor". Under "Select" I am presented with a list of users, but not the application I created for this purpose. Typing the first few letters of my application's name, however, filters the list so that it does include my application. Registration then finishes and things can proceed.
