I'm using Cloud Composer, and I have a DAG that has one task that calls an HTTPS-triggered cloud function that sends out an email (due to restrictions on the project I'm working on, I had to do it this way).
The simplest form of this works: I can trigger the Cloud Function, and the emails are sent successfully. However, I want to pass some variables I'm defining in the DAG to the Cloud Function, and this is where something is failing. I was using the usual way of passing parameters in the request URL.
This was the way I was defining the DAG:
# --------------------------------------------------------------------------------
# Import Libraries
# --------------------------------------------------------------------------------
import datetime
from airflow.models import DAG
from airflow.operators.dummy_operator import DummyOperator
from airflow.operators.python_operator import PythonOperator
from airflow.contrib.operators.bigquery_operator import BigQueryOperator
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator,BigQueryExecuteQueryOperator
from airflow.providers.google.common.utils import id_token_credentials as id_token_credential_utils
import google.auth.transport.requests
from google.auth.transport.requests import AuthorizedSession
# --------------------------------------------------------------------------------
# Set variables
# --------------------------------------------------------------------------------
(...)
report_name_url = "report_name_url"
end_user = "end_user@email.com"
# --------------------------------------------------------------------------------
# Functions
# --------------------------------------------------------------------------------
def invoke_cloud_function():
    url = "https://<trigger_url>?report_name_url={}&end_user={}".format(report_name_url, end_user)  # I'm adding the strings to the URL after the "?" to pass what I want as arguments to the Cloud Function
    request = google.auth.transport.requests.Request()  # this is a request for obtaining the credentials
    id_token_credentials = id_token_credential_utils.get_default_id_token_credentials(url, request=request)  # if your Cloud Function URL has query parameters, remove them before passing it as the audience
    resp = AuthorizedSession(id_token_credentials).request("GET", url=url)  # the authorized session object is used to access the Cloud Function
    print(resp.status_code)  # should return 200
    print(resp.content)  # the body of the HTTP response
# --------------------------------------------------------------------------------
# Define DAG
# --------------------------------------------------------------------------------
with DAG(
    dag_id,
    schedule_interval='0 13 05 * *',  # DAG cron schedule
    default_args=default_args) as dag:

    (...)

    send_email = PythonOperator(
        task_id="send_email",
        python_callable=invoke_cloud_function
    )

    start >> run_stored_procedure >> composer_logging >> send_email >> end
This is what I have as far as the DAG goes. From the perspective of the cloud function, I have the following:
def send_email(request):
    import ssl
    from email.message import EmailMessage
    import smtplib
    import os

    report_name_url = request.args.get('report_name_url')
    report_name = report_name_url.replace("_", " ")
    end_user = request.args.get('end_user')
    (...)
    context = ssl.create_default_context()
    with smtplib.SMTP_SSL('smtp.gmail.com', 465, context=context) as smtp:
        smtp.login(sender_email, password)
        smtp.sendmail(sender_email, receiver_email, em.as_string())
Can someone point me toward a solution for my use-case?
Thank you very much.
Edit for added context:
I'm getting the following information from the logs:
"(...)Unauthorized</h1>\n<h2>Your client does not have permission to the requested URL <code>...</code>.</h2>\n<h2></h2>\n</body></html>\n'"
This is odd because I think this was the error I was getting BEFORE giving permissions to my service account to invoke the cloud function.
Now, that permission is in place.
The only problem I foresee is that I'm not exactly calling the original URL that triggers the cloud function, since I'm adding parameters. Could this be the problem?
EDIT:
After a lot of digging around, I found a way to do this. First and foremost, I had to switch from GET to POST; this way, the URL I call (and use as the audience) is exactly the one that triggers the Cloud Function, with no query parameters attached.
The final solution came down to this:
This was the function in the DAG:
def invoke_cloud_function_success():
    url = "<trigger url>"  # the URL is also the target audience
    request = google.auth.transport.requests.Request()  # this is a request for obtaining the credentials
    id_token_credentials = id_token_credential_utils.get_default_id_token_credentials(url, request=request)  # if your Cloud Function URL has query parameters, remove them before passing it as the audience
    headers = {"Content-Type": "application/json"}
    body = {"report_name": report_name, "end_user": end_user, "datastudio_link": datastudio_link}
    resp = AuthorizedSession(id_token_credentials).post(url=url, json=body, headers=headers)  # the authorized session object is used to access the Cloud Function
    print(resp.status_code)  # should return 200
    print(resp.content)  # the body of the HTTP response
In the final Cloud Function I had to put:
request_json = request.get_json()
report_name = list(request_json['report_name'])
datastudio_link = list(request_json['datastudio_link'])
end_user = list(request_json['end_user'])
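For completeness, the original GET approach should also work, as long as the token audience stays the bare trigger URL and the query parameters are passed separately. Below is a minimal, untested sketch that reuses the imports and variables (report_name_url, end_user) from the DAG above:

def invoke_cloud_function_get():
    base_url = "https://<trigger_url>"  # the audience must be the URL without query parameters
    request = google.auth.transport.requests.Request()
    id_token_credentials = id_token_credential_utils.get_default_id_token_credentials(base_url, request=request)
    # Let the session append the query string instead of baking it into the audience URL
    resp = AuthorizedSession(id_token_credentials).request(
        "GET", url=base_url, params={"report_name_url": report_name_url, "end_user": end_user})
    print(resp.status_code)
    print(resp.content)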
Related
I want to trigger the DAG externally. I was unable to find the solution; I'm new to programming.
You can trigger a DAG externally in several ways:
Solution 1:
Trigger a DAG with the gcloud CLI and the gcloud composer command:
gcloud composer environments run ENVIRONMENT_NAME \
--location LOCATION \
dags trigger -- DAG_ID
Replace:
ENVIRONMENT_NAME with the name of the environment.
LOCATION with the region where the environment is located.
DAG_ID with the name of the DAG.
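For example, with made-up values:
gcloud composer environments run my-composer-env \
    --location europe-west1 \
    dags trigger -- my_dag_id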
Solution 2:
Trigger a DAG with a Cloud Function:
from google.auth.transport.requests import Request
from google.oauth2 import id_token
import requests

IAM_SCOPE = 'https://www.googleapis.com/auth/iam'
OAUTH_TOKEN_URI = 'https://www.googleapis.com/oauth2/v4/token'
# If you are using the stable API, set this value to False
# For more info about Airflow APIs see https://cloud.google.com/composer/docs/access-airflow-api
USE_EXPERIMENTAL_API = True


def trigger_dag(data, context=None):
    """Makes a POST request to the Composer DAG Trigger API

    When called via Google Cloud Functions (GCF),
    data and context are Background function parameters.
    For more info, refer to
    https://cloud.google.com/functions/docs/writing/background#functions_background_parameters-python

    To call this function from a Python script, omit the ``context`` argument
    and pass in a non-null value for the ``data`` argument.

    This function is currently only compatible with Composer v1 environments.
    """
    # Fill in with your Composer info here
    # Navigate to your webserver's login page and get this from the URL
    # Or use the script found at
    # https://github.com/GoogleCloudPlatform/python-docs-samples/blob/main/composer/rest/get_client_id.py
    client_id = 'YOUR-CLIENT-ID'
    # This should be part of your webserver's URL:
    # {tenant-project-id}.appspot.com
    webserver_id = 'YOUR-TENANT-PROJECT'
    # The name of the DAG you wish to trigger
    dag_name = 'composer_sample_trigger_response_dag'

    if USE_EXPERIMENTAL_API:
        endpoint = f'api/experimental/dags/{dag_name}/dag_runs'
        json_data = {'conf': data, 'replace_microseconds': 'false'}
    else:
        endpoint = f'api/v1/dags/{dag_name}/dagRuns'
        json_data = {'conf': data}
    webserver_url = (
        'https://'
        + webserver_id
        + '.appspot.com/'
        + endpoint
    )
    # Make a POST request to IAP which then triggers the DAG
    make_iap_request(
        webserver_url, client_id, method='POST', json=json_data)


# This code is copied from
# https://github.com/GoogleCloudPlatform/python-docs-samples/blob/main/iap/make_iap_request.py
# START COPIED IAP CODE
def make_iap_request(url, client_id, method='GET', **kwargs):
    """Makes a request to an application protected by Identity-Aware Proxy.

    Args:
      url: The Identity-Aware Proxy-protected URL to fetch.
      client_id: The client ID used by Identity-Aware Proxy.
      method: The request method to use
              ('GET', 'OPTIONS', 'HEAD', 'POST', 'PUT', 'PATCH', 'DELETE')
      **kwargs: Any of the parameters defined for the request function:
                https://github.com/requests/requests/blob/master/requests/api.py
                If no timeout is provided, it is set to 90 by default.

    Returns:
      The page body, or raises an exception if the page couldn't be retrieved.
    """
    # Set the default timeout, if missing
    if 'timeout' not in kwargs:
        kwargs['timeout'] = 90

    # Obtain an OpenID Connect (OIDC) token from metadata server or using service
    # account.
    google_open_id_connect_token = id_token.fetch_id_token(Request(), client_id)

    # Fetch the Identity-Aware Proxy-protected URL, including an
    # Authorization header containing "Bearer " followed by a
    # Google-issued OpenID Connect token for the service account.
    resp = requests.request(
        method, url,
        headers={'Authorization': 'Bearer {}'.format(
            google_open_id_connect_token)}, **kwargs)
    if resp.status_code == 403:
        raise Exception('Service account does not have permission to '
                        'access the IAP-protected application.')
    elif resp.status_code != 200:
        raise Exception(
            'Bad response from application: {!r} / {!r} / {!r}'.format(
                resp.status_code, resp.headers, resp.text))
    else:
        return resp.text
# END COPIED IAP CODE
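As the docstring notes, trigger_dag can also be called from a plain Python script by passing only the data argument (with client_id and webserver_id filled in above); a minimal sketch with a made-up payload:

if __name__ == '__main__':
    # Hypothetical payload; this dict ends up as the DAG run's conf.
    trigger_dag({'message': 'triggered from a local script'})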
I am trying to use OAuth2 to access the Azure DevOps API, to query work items.
But I am unable to get the access token.
I am using Python and Flask. My approach is based on these resources:
Microsoft documentation (currently Step 3 is the relevant part)
OAuth Tutorial, which worked fine for GitHub, but is not working for Azure.
Relevant libraries:
from requests_oauthlib import OAuth2Session
from flask import Flask, request, redirect, session, url_for
Parameters:
client_id = "..."
client_secret = "..."
authorization_base_url = "https://app.vssps.visualstudio.com/oauth2/authorize"
token_url = "https://app.vssps.visualstudio.com/oauth2/token"
callback_url = "..."
Step 1: User Authorization. (works fine)
#app.route("/")
def demo():
azure = OAuth2Session(client_id)
authorization_url, state = azure.authorization_url(authorization_base_url)
session['oauth_state'] = state
authorization_url += "&scope=" + authorized_scopes + "&redirect_uri=" + callback_url
print(authorization_url)
return redirect(authorization_url)
Step 2: Retrieving an access token (generates an error)
#app.route("/callback", methods=["GET"])
def callback():
fetch_body = "client_assertion_type=urn:ietf:params:oauth:client-assertion-type:jwt-bearer" \
"&client_assertion=" + client_secret + \
"&grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer" \
"&assertion=" + request.args["code"] + \
"&redirect_uri=" + callback_url
azure = OAuth2Session(client_id, state=session['oauth_state'])
token = azure.fetch_token(token_url=token_url, client_secret=client_secret,
body=fetch_body,
authorization_response=request.url)
azure.request()
session['oauth_token'] = token
return redirect(url_for('.profile'))
The application registration and ad-hoc SSL certification are working fine (I'm only using them temporarily).
When I use the client_assertion in Postman, I get a correct response from Azure:
But when I execute the code, this error is thrown:
oauthlib.oauth2.rfc6749.errors.MissingTokenError: (missing_token) Missing access token parameter.
Which only lets me know, that no token was received.
There is one issue in the generated request body, where the grant_type is added twice:
grant_type=urn%3Aietf%3Aparams%3Aoauth%3Agrant-type%3Ajwt-bearer
grant_type=authorization_code
The first value is expected by Azure, but the second one is generated automatically by the library.
Now when I specify the grant_type in the fetch_token call, like this:
token = azure.fetch_token(token_url=token_url, client_secret=client_secret,
                          body=fetch_body, grant_type="urn:ietf:params:oauth:grant-type:jwt-bearer",
                          authorization_response=request.url)
I get this error:
TypeError: prepare_token_request() got multiple values for argument 'grant_type'
And the actual request to Azure is not even sent.
I see in web_application.py, which is used by oauth2_session.py, that grant_type='authorization_code' is hard-coded, so I guess this library is generally incompatible with Azure.
Is that the case?
If so, what would be the simplest way to connect to Azure-OAuth with Python (Flask)?
I would be very grateful for any advice or help that points me in the right direction.
I just found the azure.devops library that solves my problem.
Resources
https://github.com/Microsoft/azure-devops-python-api
https://github.com/microsoft/azure-devops-python-samples/blob/main/src/samples/work_item_tracking.py
azure-devops-python-api query for work item where field == string
from azure.devops.connection import Connection
from azure.devops.v5_1.work_item_tracking import Wiql
from msrest.authentication import BasicAuthentication
import pprint
# Fill in with your personal access token and org URL
personal_access_token = '... PAT'
organization_url = 'https://dev.azure.com/....'
# Create a connection to the org
credentials = BasicAuthentication('', personal_access_token)
connection = Connection(base_url=organization_url, creds=credentials)
# Get a client (the "core" client provides access to projects, teams, etc)
core_client = connection.clients.get_core_client()
wit_client = connection.clients.get_work_item_tracking_client()
query = "SELECT [System.Id], [System.WorkItemType], [System.Title], [System.AssignedTo], [System.State]," \
"[System.Tags] FROM workitems WHERE [System.TeamProject] = 'Test'"
wiql = Wiql(query=query)
query_results = wit_client.query_by_wiql(wiql).work_items
for item in query_results:
    work_item = wit_client.get_work_item(item.id)
    pprint.pprint(work_item.fields['System.Title'])
I need to disable Airflow DAGs from AWS Lambda or in some other way. Can I use Python code to do this? Thank you in advance.
You can pause/unpause a DAG with the Airflow REST API.
The relevant endpoint is Update a DAG:
https://airflow.apache.org/api/v1/dags/{dag_id}
With:
{
"is_paused": true
}
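As a quick sketch (assuming the stable REST API is enabled and basic auth is configured on your webserver; the host, DAG id, and credentials below are placeholders), the same call with plain requests:

import requests

resp = requests.patch(
    "http://localhost:8080/api/v1/dags/my_dag_id",  # hypothetical webserver host and DAG id
    json={"is_paused": True},
    auth=("YOUR_USERNAME", "YOUR_PASSWORD"),
)
print(resp.status_code, resp.json())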
You also have the official Airflow Python client that you can use to interact with the API. Example:
import time
import airflow_client.client as client
from airflow_client.client.api import dag_api
from airflow_client.client.model.dag import DAG
from airflow_client.client.model.error import Error
from pprint import pprint

# Configure the API host and HTTP basic authorization: Basic
configuration = client.Configuration(
    host="http://localhost/api/v1",
    username='YOUR_USERNAME',
    password='YOUR_PASSWORD'
)

with client.ApiClient(configuration) as api_client:
    # Create an instance of the API class
    api_instance = dag_api.DAGApi(api_client)
    dag_id = "dag_id_example"  # str | The DAG ID.
    dag = DAG(
        is_paused=True,
    )
    try:
        # Update a DAG
        api_response = api_instance.patch_dag(dag_id, dag)
        pprint(api_response)
    except client.ApiException as e:
        print("Exception when calling DAGApi->patch_dag: %s\n" % e)
You can see the full example in the client doc.
I am trying to get started with the eBay API in Python, and I can't find a single answer as to how to get an API key given eBay's new requirement of "Account Deletion/Closure Notifications." Here's the link: https://developer.ebay.com/marketplace-account-deletion
Specifically, I am told that "Your Keyset is currently disabled" because I have not completed whatever process is needed for this marketplace account deletion/closure notification.
The problems:
1. I have no idea if I need this.
2. I have no idea how to actually do this.
Re: 1. It looks like this is for anyone who stores user data. I don’t think that’s me intentionally because I really just want to get sold data and current listings, but is it actually me?
Re: 2. I don’t understand how to validate it and send back the proper responses. I’ve gotten quite good at python but I’m lost here.
eBay forums are completely useless and I see no one with an answer to this. Any help is greatly appreciated.
Re: 1. Same. Here's my interpretation: in order to use their APIs, you need to provide (and configure) your own API endpoint, so they can communicate with you programmatically and tell you which users have asked to have their accounts/data deleted.
Re: 2. To handle their GET and POST requests, I guess you'll need to configure a website's URL as an API endpoint. In Django, I might use something like this (untested) code:
import hashlib
import json

from django.http import (
    HttpResponse,
    JsonResponse,
    HttpResponseBadRequest
)


def your_api_endpoint(request):
    """
    API endpoint to handle the verification's challenge code and
    receive eBay's Marketplace Account Deletion/Closure Notifications.
    """
    # STEP 1: Handle verification's challenge code
    challengeCode = request.GET.get('challenge_code')
    if challengeCode is not None:
        # Token needs to be 32-80 characters long
        verificationToken = "your-token-012345678901234567890123456789"
        # URL needs to use HTTPS protocol
        endpoint_url = "https://your-domain.com/your-endpoint"
        # Hash elements need to be ordered as follows
        m = hashlib.sha256((challengeCode + verificationToken + endpoint_url).encode('utf-8'))
        # JSON field needs to be called challengeResponse
        return JsonResponse({"challengeResponse": m.hexdigest()}, status=200)

    # STEP 2: Handle account deletion/closure notification
    elif request.method == 'POST':
        notification_details = json.loads(request.body)
        # Verify notification is actually from eBay
        # ...
        # Delete/close account
        # ...
        # Acknowledge notification reception
        return HttpResponse(status=200)
    else:
        return HttpResponseBadRequest()
If you find the answer to question number one, please do let me know.
Re: 1. You need to comply with eBay's Marketplace Account Deletion/Closure Notification workflow if you are storing user data into your own database. For example, using eBay's Buy APIs, you may get access to what users are selling on eBay (for ex. an eBay feed of products). If those eBay sellers decide they want to remove all of their personal data from eBay's database, eBay is requesting you remove their data from your database as well. If you are NOT storing any eBay user data into your database, you do not need to comply. Here is where you can find more info: https://partnerhelp.ebay.com/helpcenter/s/article/Complying-with-the-eBay-Marketplace-Account-Deletion-Closure-Notification-workflow?language=en_US
Re: 2. To be honest I've spent days trying to figure this out in Python (Django), but I have a solution now and am happy to share it with whoever else comes across this issue. Here's my solution:
import os
import json
import base64
import hashlib
import requests
import logging
from OpenSSL import crypto
from rest_framework import status
from rest_framework.views import APIView
from django.http import JsonResponse

logger = logging.getLogger(__name__)


class EbayMarketplaceAccountDeletion(APIView):
    """
    This is required as per eBay Marketplace Account Deletion Requirements.
    See documentation here: https://developer.ebay.com/marketplace-account-deletion
    """
    # eBay config values
    CHALLENGE_CODE = 'challenge_code'
    VERIFICATION_TOKEN = os.environ.get('VERIFICATION_TOKEN')
    # ^ NOTE: You can make this value up so long as it is between 32-80 characters.
    ENDPOINT = 'https://example.com/ebay_marketplace_account_deletion'
    # ^ NOTE: Replace this with your own endpoint
    X_EBAY_SIGNATURE = 'X-Ebay-Signature'
    EBAY_BASE64_AUTHORIZATION_TOKEN = os.environ.get('EBAY_BASE64_AUTHORIZATION_TOKEN')
    # ^ NOTE: Here's how you can get your EBAY_BASE64_AUTHORIZATION_TOKEN:
    # import base64
    # base64.b64encode(b'{CLIENT_ID}:{CLIENT_SECRET}')

    def __init__(self):
        super(EbayMarketplaceAccountDeletion, self).__init__()

    def get(self, request):
        """
        Get challenge code and return challengeResponse: challengeCode + verificationToken + endpoint
        :return: Response
        """
        challenge_code = request.GET.get(self.CHALLENGE_CODE)
        challenge_response = hashlib.sha256(challenge_code.encode('utf-8') +
                                            self.VERIFICATION_TOKEN.encode('utf-8') +
                                            self.ENDPOINT.encode('utf-8'))
        response_parameters = {
            "challengeResponse": challenge_response.hexdigest()
        }
        return JsonResponse(response_parameters, status=status.HTTP_200_OK)

    def post(self, request):
        """
        Return 200 status code and remove from db.
        See how to validate the notification here:
        https://developer.ebay.com/api-docs/commerce/notification/overview.html#use
        """
        # Verify notification is actually from eBay #
        # 1. Use a Base64 function to decode the X-EBAY-SIGNATURE header and retrieve the public key ID and signature
        x_ebay_signature = request.headers[self.X_EBAY_SIGNATURE]
        x_ebay_signature_decoded = json.loads(base64.b64decode(x_ebay_signature).decode('utf-8'))
        kid = x_ebay_signature_decoded['kid']
        signature = x_ebay_signature_decoded['signature']

        # 2. Call the getPublicKey Notification API method, passing in the public key ID ("kid") retrieved from the
        # decoded signature header. Documentation on getPublicKey:
        # https://developer.ebay.com/api-docs/commerce/notification/resources/public_key/methods/getPublicKey
        public_key = None
        try:
            ebay_verification_url = f'https://api.ebay.com/commerce/notification/v1/public_key/{kid}'
            oauth_access_token = self.get_oauth_token()
            headers = {
                'Authorization': f'Bearer {oauth_access_token}'
            }
            public_key_request = requests.get(url=ebay_verification_url, headers=headers, data={})
            if public_key_request.status_code == 200:
                public_key_response = public_key_request.json()
                public_key = public_key_response['key']
        except Exception as e:
            message_title = "Ebay Marketplace Account Deletion: Error calling getPublicKey Notification API."
            logger.error(f"{message_title} Error: {e}")
            return JsonResponse({}, status=status.HTTP_500_INTERNAL_SERVER_ERROR)

        # 3. Initialize the cryptographic library to perform the verification with the public key that is returned from
        # the getPublicKey method. If the signature verification fails, an HTTP status of 412 Precondition Failed is returned.
        pkey = crypto.load_publickey(crypto.FILETYPE_PEM, self.get_public_key_into_proper_format(public_key))
        certification = crypto.X509()
        certification.set_pubkey(pkey)
        notification_payload = request.body
        signature_decoded = base64.b64decode(signature)
        try:
            crypto.verify(certification, signature_decoded, notification_payload, 'sha1')
        except crypto.Error as e:
            message_title = f"Ebay Marketplace Account Deletion: Signature Invalid. " \
                            f"The signature is invalid or there is a problem verifying the signature. "
            logger.warning(f"{message_title} Error: {e}")
            return JsonResponse({}, status=status.HTTP_412_PRECONDITION_FAILED)
        except Exception as e:
            message_title = f"Ebay Marketplace Account Deletion: Error performing cryptographic validation."
            logger.error(f"{message_title} Error: {e}")
            return JsonResponse({}, status=status.HTTP_412_PRECONDITION_FAILED)

        # Take appropriate action to delete the user data. Deletion should be done in a manner such that even the
        # highest system privilege cannot reverse the deletion #
        # TODO: Replace with your own data removal here

        # Acknowledge notification reception
        return JsonResponse({}, status=status.HTTP_200_OK)

    def get_oauth_token(self):
        """
        Returns the OAuth token from eBay which can be used for making other API requests such as getPublicKey
        """
        url = 'https://api.ebay.com/identity/v1/oauth2/token'
        headers = {
            'Content-Type': 'application/x-www-form-urlencoded',
            'Authorization': f"Basic {self.EBAY_BASE64_AUTHORIZATION_TOKEN}"
        }
        payload = 'grant_type=client_credentials&scope=https%3A%2F%2Fapi.ebay.com%2Foauth%2Fapi_scope'
        request = requests.post(url=url, headers=headers, data=payload)
        data = request.json()
        return data['access_token']

    @staticmethod
    def get_public_key_into_proper_format(public_key):
        """
        Public key needs to have \n in places to be properly assessed by the crypto library.
        """
        return public_key[:26] + '\n' + public_key[26:-24] + '\n' + public_key[-24:]
This is how I am dealing with the eBay notification requirement using Python 3 CGI. Because bytes are sent, you cannot use cgi.FieldStorage().
import os
import sys
import hashlib
import json
from datetime import datetime
from html import escape
import cgi
import cgitb
import io

include_path = '/var/domain_name/www'
sys.path.insert(0, include_path)
cgitb.enable(display=0, logdir=f"""{include_path}/tmp_errors""")  # include_path is OUTDIR

dt_now = datetime.now()
current_dt_now = dt_now.strftime("%Y-%m-%d_%H-%M-%S")


def enc_print(string='', encoding='utf8'):
    sys.stdout.buffer.write(string.encode(encoding) + b'\n')


html = ''
challengeCode = ''

# GET
myQuery = os.environ.get('QUERY_STRING')
if myQuery.find('=') != -1:
    pos = myQuery.find('=')
    var_name = myQuery[:pos]
    var_val = myQuery[pos+1:]
    challengeCode = var_val

# POST
if os.environ.get('CONTENT_LENGTH') != None:
    totalBytes = int(os.environ.get('CONTENT_LENGTH'))
    reqbytes = io.open(sys.stdin.fileno(), "rb").read(totalBytes)

if challengeCode != '':
    """
    API endpoint to handle the verification's challenge code and
    receive eBay's Marketplace Account Deletion/Closure Notifications.
    """
    # STEP 1: Handle verification's challenge code
    # Token needs to be 32-80 characters long
    verificationToken = "0123456789012345678901234567890123456789"  # sample token
    # URL needs to use HTTPS protocol
    endpoint = "https://domain_name.com/ebay/notification.py"  # sample endpoint
    # Hash elements need to be ordered as follows
    m = hashlib.sha256((challengeCode + verificationToken + endpoint).encode('utf-8'))
    # JSON field needs to be called challengeResponse
    enc_print("Content-Type: application/json")
    enc_print("Status: 200 OK")
    enc_print()
    enc_print('{"challengeResponse":"' + m.hexdigest() + '"}')
    exit()
else:
    # html += 'var length:' + str(totalBytes) + '\n'
    html += reqbytes.decode('utf-8') + '\n'
    # STEP 2: Handle account deletion/closure notification
    # Verify notification is actually from eBay
    # ...
    # Delete/close account
    # ...
    # Acknowledge notification reception
    with open(f"""./notifications/{current_dt_now}_user_notification.txt""", 'w') as f:
        f.write(html)
    enc_print("Content-Type: application/json")
    enc_print("Status: 200 OK")
    enc_print()
    exit()
I've been trying @José Matías Arévalo's code. It works except for the "STEP 2" branch: Django returns a 403 error. This is because, by default, Django uses CSRF middleware (Cross Site Request Forgery protection). To avoid the 403 error we need to mark the view as exempt from that protection, as described here: https://docs.djangoproject.com/en/dev/ref/csrf/#utilities. So add a couple of lines to the code:
from django.views.decorators.csrf import csrf_exempt

@csrf_exempt
def your_api_endpoint(request):
And in my case I use the URL "https://your-domain.com/your-endpoint/" with a slash "/" at the end. Without this slash, eBay doesn't confirm the subscription.
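For reference, here is a minimal urls.py sketch (module and path names are hypothetical) that registers the endpoint with the trailing slash:

from django.urls import path
from .views import your_api_endpoint  # the csrf_exempt view above

urlpatterns = [
    path('your-endpoint/', your_api_endpoint),  # note the trailing slash
]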
I am using Flask and this is the code I have used:
from flask import Flask, request
import hashlib

# Create a random verification token, it needs to be 32-80 characters long
verification_token = 'a94cbd68e463cb9780e2008b1f61986110a5fd0ff8b99c9cba15f1f802ad65f9'
endpoint_url = 'https://dev.example.com'

app = Flask(__name__)

# There will be errors if you just use '/' as the route, as Flask will redirect eBay's request
# eBay will send a request to https://dev.example.com?challenge_code=123
# The request will get redirected by Flask to https://dev.example.com/?challenge_code=123, which eBay will not accept
endpoint = endpoint_url + '/test'


# The Content-Type header will be added automatically by Flask as 'application/json'
@app.route('/test')
def test():
    code = request.args.get('challenge_code')
    print('Requests argument:', code)
    code = code + verification_token + endpoint
    code = code.encode('utf-8')
    code = hashlib.sha256(code)
    code = code.hexdigest()
    print('Hexdigest:', code)
    final = {"challengeResponse": code}
    return final


## To run locally first use this:
# app.run(port=29)
I am trying to get the event and context variable data from background functions run on Google Cloud Functions and pass the values through to a container running the KubernetesPodOperator on Cloud Composer / Airflow.
The first section of code is my Cloud Function, which triggers a DAG called gcs_to_pubsub_topic_dag. What I would like to pass over and access is the JSON data, specifically the "conf": event data.
#!/usr/bin/env python
# coding: utf-8
from google.auth.transport.requests import Request
from google.oauth2 import id_token
import requests

IAM_SCOPE = 'https://www.googleapis.com/auth/iam'
OAUTH_TOKEN_URI = 'https://www.googleapis.com/oauth2/v4/token'


def trigger_dag(event, context=None):
    client_id = '###############.apps.googleusercontent.com'
    webserver_id = '###############'
    # The name of the DAG you wish to trigger
    dag_name = 'gcs_to_pubsub_topic_dag'
    webserver_url = (
        'https://'
        + webserver_id
        + '.appspot.com/api/experimental/dags/'
        + dag_name
        + '/dag_runs'
    )
    print(f' This is my webserver url: {webserver_url}')
    # Make a POST request to IAP which then triggers the DAG
    make_iap_request(
        webserver_url, client_id, method='POST', json={"conf": event, "replace_microseconds": 'false'})


def make_iap_request(url, client_id, method='GET', **kwargs):
    if 'timeout' not in kwargs:
        kwargs['timeout'] = 90
    google_open_id_connect_token = id_token.fetch_id_token(Request(), client_id)
    resp = requests.request(
        method, url,
        headers={'Authorization': 'Bearer {}'.format(
            google_open_id_connect_token)}, **kwargs)
    if resp.status_code == 403:
        raise Exception('Service account does not have permission to '
                        'access the IAP-protected application.')
    elif resp.status_code != 200:
        raise Exception(
            'Bad response from application: {!r} / {!r} / {!r}'.format(
                resp.status_code, resp.headers, resp.text))
    else:
        return resp.text


def main(event, context=None):
    """
    Call the main function, sets the order in which to run functions.
    """
    trigger_dag(event, context=None)
    return 'Script has run without errors !!'


if __name__ == "__main__":
    main()
The dag that is triggered runs this KubernetesPodOperator code:
kubernetes_pod_operator.KubernetesPodOperator(
    # The ID specified for the task.
    task_id=TASK_ID,
    # Name of task you want to run, used to generate Pod ID.
    name=TASK_ID,
    # Entrypoint of the container, if not specified the Docker container's
    # entrypoint is used. The cmds parameter is templated.
    cmds=[f'python3', 'execution_file.py'],
    # The namespace to run within Kubernetes, default namespace is `default`.
    namespace=KUBERNETES_NAMESPACE,
    # Location of the Docker image on Google Container Registry
    image=f'eu.gcr.io/{GCP_PROJECT_ID}/{CONTAINER_ID}:{IMAGE_VERSION}',
    # Always pull the image before running it.
    image_pull_policy='Always',
    # The env_vars template variable allows you to access variables defined in the Airflow UI.
    env_vars={'GCP_PROJECT_ID': GCP_PROJECT_ID, 'DAG_CONF': {{ dag_run.conf }}},
    dag=dag)
And then, finally, I want DAG_CONF to print within the container image's execution_file.py script:
#!/usr/bin/env python
# coding: utf-8
from gcs_unzip_function import main as gcs_unzip_function
from gcs_to_pubsub_topic import main as gcs_to_pubsub_topic
from os import listdir, getenv
GCP_PROJECT_ID = getenv('GCP_PROJECT_ID')
DAG_CONF = getenv('DAG_CONF')
print('Test run')
print(GCP_PROJECT_ID)
print (f'This is my dag conf {DAG_CONF}')
print(type(DAG_CONF))
At the moment the code triggers the dag and returns:
Test run
GCP_PROJECT_ID (this is set in the airflow environment variables)
This is my dag conf None
<class 'NoneType'>
whereas I would like DAG_CONF to come through.
I have a workaround for accessing data about the object that triggered the DAG from within the container run by the KubernetesPodOperator.
The POST request code stays the same, but I want to highlight that you can pass anything, as long as it goes into the conf element of the dictionary.
make_iap_request(
    webserver_url, client_id, method='POST', json={"conf": event,
                                                   "replace_microseconds": 'false'})
The DAG code requires you to create a custom class which accesses the dag_run and its .conf element; the argument then carries the JSON we sent in the POST request.
(Article read while doing this part.)
from airflow.contrib.operators.kubernetes_pod_operator import KubernetesPodOperator


class CustomKubernetesPodOperator(KubernetesPodOperator):

    def execute(self, context):
        json = str(context['dag_run'].conf)
        arguments = [f'--json={json}']
        self.arguments.extend(arguments)
        super().execute(context)


CustomKubernetesPodOperator(
    # The ID specified for the task.
    task_id=TASK_ID,
    # Name of task you want to run, used to generate Pod ID.
    name=TASK_ID,
    # Entrypoint of the container, if not specified the Docker container's
    # entrypoint is used. The cmds parameter is templated.
    cmds=[f'python3', 'execution_file.py'],
    # The namespace to run within Kubernetes, default namespace is `default`.
    namespace=KUBERNETES_NAMESPACE,
    # Location of the Docker image on Google Container Registry
    image=f'eu.gcr.io/{GCP_PROJECT_ID}/{CONTAINER_ID}:{IMAGE_VERSION}',
    # Always pull the image before running it.
    image_pull_policy='Always',
    # The env_vars template variable allows you to access variables defined in the Airflow UI.
    env_vars={'GCP_PROJECT_ID': GCP_PROJECT_ID},
    dag=dag)
The code being run in the container uses argparse to get the argument as a string and then uses ast.literal_eval to change it back into a dictionary that can be accessed in the code:
import ast
import argparse
from os import listdir, getenv


def main(object_metadata_dict):
    """
    Call the main function, sets the order in which to run functions.
    """
    print(f'This is my metadata as a dictionary {object_metadata_dict}')
    print(f'This is my bucket {object_metadata_dict["bucket"]}')
    print(f'This is my file name {object_metadata_dict["name"]}')
    return 'Script has run without errors !!'


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description='Staging to live load process.')
    parser.add_argument("--json", type=str, dest="json", required=False, default='all',
                        help="List of metadata for the triggered object derived "
                             "from Cloud Function background functions.")
    args = parser.parse_args()
    json = args.json
    object_metadata_dict = ast.literal_eval(json)
    main(object_metadata_dict)