Google Cloud Function - Python script to get data from a webhook

I hope someone can help me out with my problem.
I have an HTTP-triggered Google Cloud Function and a webhook set up in customer.io.
I need to capture the data sent by the customer.io app: the webhook should trigger the Cloud Function and run the Python script set up within it. I am new to writing Python scripts and their libraries. The end goal is to write the webhook data into a BigQuery table.
For now, I can see that the trigger is working, since the data sent by the app shows up via print in the function logs, and I can check the schema of the data from the textPayload in those logs.
This is the sample data from the textPayload that I want to load into a BigQuery table:
{
  "data": {
    "action_id": 42,
    "campaign_id": 23,
    "customer_id": "user-123",
    "delivery_id": "RAECAAFwnUSneIa0ZXkmq8EdkAM==-",
    "identifiers": {
      "id": "user-123"
    },
    "recipient": "test#example.com",
    "subject": "Thanks for signing up"
  },
  "event_id": "01E2EMRMM6TZ12TF9WGZN0WJaa",
  "metric": "sent",
  "object_type": "email",
  "timestamp": 1669337039
}
And this is the sample Python code I have created in the Cloud Function:
import os

def webhook(request):
    request_json = request.get_json()
    if request.method == 'POST':
        print(request_json)
        return 'success'
    else:
        return 'failed'
So far I have only tried printing the data from the webhook; what I am expecting is Python code that writes this textPayload data (the sample shown above) into a BigQuery table.

So you have set up a Cloud Function that executes some code whenever the webhook posts data to it.
What this Cloud Function needs now is the BigQuery Python client library. Here's an example of how it's used:
from google.cloud import bigquery

client = bigquery.Client()

dataset_id = ...   # your dataset ID
table_name = ...   # your table name
data = ...         # list of rows (dicts or tuples) to insert

dataset_ref = client.dataset(dataset_id)
table_ref = dataset_ref.table(table_name)
table = client.get_table(table_ref)
result = client.insert_rows(table, data)  # returns a list of insert errors; empty on success
So you could put something like this into your Cloud Function to send your data to a target BigQuery table.
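For instance, here is a minimal sketch (not from the original answer) of the webhook function writing the incoming payload straight into BigQuery with insert_rows_json; the table ID is a placeholder, and the target table is assumed to already exist with columns matching the payload fields:

from google.cloud import bigquery

client = bigquery.Client()
# Placeholder fully-qualified table ID; replace with your own project/dataset/table.
TABLE_ID = "your-project.your_dataset.customerio_events"

def webhook(request):
    if request.method != 'POST':
        return ('failed', 405)

    payload = request.get_json()
    data = payload.get('data', {})

    # Flatten the webhook payload into one row matching the assumed table schema.
    row = {
        "event_id": payload.get("event_id"),
        "metric": payload.get("metric"),
        "object_type": payload.get("object_type"),
        "timestamp": payload.get("timestamp"),
        "customer_id": data.get("customer_id"),
        "delivery_id": data.get("delivery_id"),
        "recipient": data.get("recipient"),
        "subject": data.get("subject"),
    }

    errors = client.insert_rows_json(TABLE_ID, [row])  # streaming insert
    if errors:
        print(errors)
        return ('failed', 500)
    return 'success'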

Related

Update query string in scheduled query using Python Client for BigQuery Data Transfer Service

I'm struggling to find documentation and examples for the Python Client for the BigQuery Data Transfer Service. A new query string is generated by my application from time to time, and I'd like to update the existing scheduled query accordingly. The snippet below is the most helpful thing I have found so far; however, I am still unsure where to pass my query string. Is this the correct method?
from google.cloud import bigquery_datatransfer_v1

def sample_update_transfer_config():
    # Create a client
    client = bigquery_datatransfer_v1.DataTransferServiceClient()

    # Initialize request argument(s)
    transfer_config = bigquery_datatransfer_v1.TransferConfig()
    transfer_config.destination_dataset_id = "destination_dataset_id_value"

    request = bigquery_datatransfer_v1.UpdateTransferConfigRequest(
        transfer_config=transfer_config,
    )

    # Make the request
    response = client.update_transfer_config(request=request)

    # Handle the response
    print(response)
You may refer to BigQuery's "Update scheduled queries" Python documentation for the official reference on using the Python client library to update scheduled queries.
However, I have updated the code for you so that it updates your query string: I added the new query string to params and defined which attributes of the TransferConfig() are updated in the update_mask.
See the updated code below:
from google.cloud import bigquery_datatransfer
from google.protobuf import field_mask_pb2

transfer_client = bigquery_datatransfer.DataTransferServiceClient()

transfer_config_name = "projects/{your-project-id}/locations/us/transferConfigs/{unique-ID-of-transferconfig}"
new_display_name = "Your Desired Updated Name if Necessary"  # remove if you do not need to update the scheduled query name

query_string_new = """
SELECT
  CURRENT_TIMESTAMP() as current_time
"""

new_params = {
    "query": query_string_new,
    "destination_table_name_template": "your_table_{run_date}",
    "write_disposition": "WRITE_TRUNCATE",
    "partitioning_field": "",
}

transfer_config = bigquery_datatransfer.TransferConfig(name=transfer_config_name)
transfer_config.display_name = new_display_name  # remove if you do not need to update the scheduled query name
transfer_config.params = new_params

transfer_config = transfer_client.update_transfer_config(
    {
        "transfer_config": transfer_config,
        # remove "display_name" from the list if you do not need to update the scheduled query name
        "update_mask": field_mask_pb2.FieldMask(paths=["display_name", "params"]),
    }
)

print("Updates are executed successfully")
To get the value of your transfer_config_name, you can list all your scheduled queries, for example as in the sketch below.
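A minimal sketch of listing the transfer configs to find that name, using the same bigquery_datatransfer client; the project ID and location are placeholders:

from google.cloud import bigquery_datatransfer

transfer_client = bigquery_datatransfer.DataTransferServiceClient()

# Placeholder project ID and location; scheduled queries live under a location such as "us".
parent = "projects/your-project-id/locations/us"

for config in transfer_client.list_transfer_configs(parent=parent):
    # config.name is the value to use as transfer_config_name above.
    print(config.name, "-", config.display_name)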

Access CosmosDB Data from Azure App Service by using managed identity (Failed)

A FastAPI-based API written in Python has been deployed as an Azure App Service. The API needs to read and write data from CosmosDB, and I attempted to use a managed identity for this purpose, but encountered an error stating Unrecognized credential type.
These are the key steps I took towards that goal.
Step One: I used Terraform to configure the managed identity for the Azure App Service and assigned the 'Contributor' role to the identity so that it can access and write data to CosmosDB. The role assignment is carried out in the file where the Azure App Service is provisioned.
resource "azurerm_linux_web_app" "this" {
name = var.appname
location = var.location
resource_group_name = var.rg_name
service_plan_id = azurerm_service_plan.this.id
app_settings = {
"PROD" = false
"DOCKER_ENABLE_CI" = true
"DOCKER_REGISTRY_SERVER_URL" = data.azurerm_container_registry.this.login_server
"WEBSITE_HTTPLOGGING_RETENTION_DAYS" = "30"
"WEBSITE_ENABLE_APP_SERVICE_STORAGE" = false
}
lifecycle {
ignore_changes = [
app_settings["WEBSITE_HTTPLOGGING_RETENTION_DAYS"]
]
}
https_only = true
identity {
type = "SystemAssigned"
}
data "azurerm_cosmosdb_account" "this" {
name = var.cosmosdb_account_name
resource_group_name = var.cosmosdb_resource_group_name
}
// built-in role that allow the app-service to read and write to an Azure Cosmos DB
resource "azurerm_role_assignment" "cosmosdbContributor" {
scope = data.azurerm_cosmosdb_account.this.id
principal_id = azurerm_linux_web_app.this.identity.0.principal_id
role_definition_name = "Contributor"
}
Step Two: I used the managed identity library to fetch the necessary credentials in the Python code.
from azure.identity import ManagedIdentityCredential
from azure.cosmos.cosmos_client import CosmosClient

client = CosmosClient(get_endpoint(), credential=ManagedIdentityCredential())
client = self._get_or_create_client()
database = client.get_database_client(DB_NAME)
container = database.get_container_client(CONTAINER_NAME)
container.query_items(query)
I received the following error when running the code locally and from Azure (the error can be viewed from the Log stream of the Azure App Service):
raise TypeError(
TypeError: Unrecognized credential type. Please supply the master key as str, or a dictionary or resource tokens, or a list of permissions.
Any help or discussion is welcome.
If you are using the Python SDK, you can pass an Azure AD credential to the client directly; check the sample below:
from azure.identity import ClientSecretCredential
from azure.cosmos import CosmosClient

aad_credentials = ClientSecretCredential(
    tenant_id="<azure-ad-tenant-id>",
    client_id="<client-application-id>",
    client_secret="<client-application-secret>")

client = CosmosClient("<account-endpoint>", aad_credentials)
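If you want to keep using the managed identity rather than a client secret, here is a minimal sketch under two assumptions: your azure-cosmos package is recent enough to accept Azure AD token credentials (the "Unrecognized credential type" error is typical of versions that predate that support), and the identity has been granted a Cosmos DB data-plane role (Cosmos DB's SQL RBAC is separate from the ARM "Contributor" role assigned in the Terraform above). The endpoint and names are placeholders:

from azure.identity import ManagedIdentityCredential
from azure.cosmos import CosmosClient

# Placeholder account endpoint; ManagedIdentityCredential only works when the
# code runs inside Azure (App Service, VM, ...), not on a local machine.
credential = ManagedIdentityCredential()
client = CosmosClient("https://<your-account>.documents.azure.com:443/", credential=credential)

database = client.get_database_client("<db-name>")
container = database.get_container_client("<container-name>")
items = list(container.query_items(
    query="SELECT * FROM c",
    enable_cross_partition_query=True,
))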

How do I destructure an API with python Django and django-rest-framework?

I have successfully built and run a Django REST project consuming the cocktaildb API. On the local server, when I visit http://127.0.0.1:8000/api/ I get
{
  "ingredients": "http://127.0.0.1:8000/api/ingredients/",
  "drinks": "http://127.0.0.1:8000/api/drinks/",
  "feeling-lucky": "http://127.0.0.1:8000/api/feeling-lucky/"
}
But when I go to one of the links in the JSON result above, for example:
http://127.0.0.1:8000/api/ingredients/
I get an empty [] with a 200 OK status!
I need an endpoint to GET drinks and ingredients before I can destructure them into specific details using Angular.
I implemented a helper folder in the app with the API function as below:
class TheCoctailDBAPI:
    THECOCTAILDB_URL = 'https://www.thecocktaildb.com/api/json/v1/1/'

    async def __load_coctails_for_drink(self, drink, session):
        for i in range(1, 16):
            ingredientKey = 'strIngredient' + str(i)
            ingredientName = drink[ingredientKey]
            if not ingredientName:
                break
            if ingredientName not in self.ingredients:
                async with session.get(f'{TheCoctailDBAPI.THECOCTAILDB_URL}search.php?i={ingredientName}') \
                        as response:
                    result = json.loads(await response.text())
                    self.ingredients[ingredientName] = result['ingredients'][0]
What was your expected response?
Add the function that is called by this API, as well as the DB settings, to the question so that we can properly help you.
Are you sure that you are connecting to and pulling data from the remote location? It looks to me like your local DB is empty, so the API has no data to return (see the quick check sketched below).
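For example, a quick way to check is to count rows in the Django shell; the app and model names here are hypothetical and need to be replaced with whatever backs the /api/ingredients/ endpoint:

# Run inside: python manage.py shell
# "drinks" and "Ingredient"/"Drink" are hypothetical app/model names.
from drinks.models import Ingredient, Drink

print(Ingredient.objects.count())  # 0 here would explain the empty [] response
print(Drink.objects.count())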

How to use Azure DataBricks Api to submit job?

I am a beginner in Azure Databricks and I want to use the APIs to create a cluster and submit a job in Python. I am stuck, as I am unable to do so. Also, if I have an existing cluster, what would the code look like? I got a job ID after running this code, but I am unable to see any output.
import requests

DOMAIN = ''
TOKEN = ''

response = requests.post(
    'https://%s/api/2.0/jobs/create' % (DOMAIN),
    headers={'Authorization': 'Bearer %s' % TOKEN},
    json={
        "name": "SparkPi spark-submit job",
        "new_cluster": {
            "spark_version": "7.3.x-scala2.12",
            "node_type_id": "Standard_DS3_v2",
            "num_workers": 2
        },
        "spark_submit_task": {
            "parameters": [
                "--class",
                "org.apache.spark.examples.SparkPi",
                "dbfs:/FileStore/sparkpi_assembly_0_1.jar",
                "10"
            ]
        }
    }
)

if response.status_code == 200:
    print(response.json())
else:
    print("Error launching cluster: %s: %s" % (response.json()["error_code"], response.json()["message"]))
Jobs in Databricks can be executed in two ways (see the docs):
on a new cluster - that's how you do it right now
on an existing cluster - remove the new_cluster block and add the existing_cluster_id field with the ID of the existing cluster. If you don't have a cluster yet, you can create it via the Cluster API.
When you create a job, you get back the job ID, which can be used to edit the job or delete it. You can also launch the job using the Run Now API. But if you just want to execute the job without creating it in the UI, then you need to look at the Run Submit API. Either of these APIs returns the ID of a specific job run, and you can then use the Run Get API to get the status of the run, or the Run Get Output API to get the results of the execution. A sketch of the run-now / runs-get flow follows.
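As a rough, untested sketch of that second half of the flow (the job_id is a placeholder for the ID returned by /api/2.0/jobs/create): launch the job with Run Now and poll Run Get until it reaches a terminal state.

import requests
import time

DOMAIN = ''   # placeholder workspace URL
TOKEN = ''    # placeholder personal access token
HEADERS = {'Authorization': 'Bearer %s' % TOKEN}

job_id = 123  # placeholder: the job_id returned by /api/2.0/jobs/create

# Launch the job created earlier.
run = requests.post(
    'https://%s/api/2.0/jobs/run-now' % DOMAIN,
    headers=HEADERS,
    json={"job_id": job_id}
).json()
run_id = run["run_id"]

# Poll the run until it finishes, then print its final state.
while True:
    state = requests.get(
        'https://%s/api/2.0/jobs/runs/get' % DOMAIN,
        headers=HEADERS,
        params={"run_id": run_id}
    ).json()["state"]
    if state["life_cycle_state"] in ("TERMINATED", "SKIPPED", "INTERNAL_ERROR"):
        print(state)  # includes result_state, e.g. SUCCESS or FAILED
        break
    time.sleep(30)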

Snowflake External Functions using Azure Functions on Python not working

I want to create an external function that can be used to upsert rows into MongoDB. I've created the function and tested it locally using Postman and after publishing. I followed the documentation at https://docs.snowflake.com/en/sql-reference/external-functions-creating-azure-ui.html and, at first, I used the JavaScript function they propose as a test, which worked. However, when I run it in Python, I get an error. This is the code:
import logging
import azure.functions as func
import pymongo
import json
import os
from datetime import datetime

cluster = pymongo.MongoClient(os.environ['MongoDBConnString'])
db = cluster[f"{os.environ['MongoDB']}"]
collection = db[f"{os.environ['MongoDBCollection']}"]

def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')

    name = req.params.get('name')
    if not name:
        try:
            req_body = req.get_json()
        except ValueError:
            pass
        else:
            name = req_body.get('name')

    if name:
        return func.HttpResponse(f"Hello, {name}. This HTTP triggered function executed successfully.")
    else:
        collection.update_one(
            filter={
                '_id': req_body['_id']
            },
            update={
                '$set': {'segment_ids': req_body['segment_ids']}
            },
            upsert=True)
        return func.HttpResponse(
            json.dumps({"status_code": 200,
                        "status_message": "Upsert Success",
                        "Timestamp": datetime.utcnow().strftime("%Y-%m-%dT%H:%M:%S"),
                        "_id": req_body['_id']}),
            status_code=200,
            mimetype="text/plain"
        )
The error states that req_body is referenced before being defined, failing at the line '_id': req_body['_id']. In Snowflake I've created an external function called mongoUpsert(body variant) and I am passing a simple query to test:
select mongoUpsert(object_construct('_id', 'someuuid', 'segment_ids', array_construct(1,2,3,4)))
From what I can tell, the function is not receiving the body I'm passing from Snowflake, for some reason. I don't know what I am doing wrong. Can anyone help me? Can anyone also explain how Snowflake sends the parameters (as body, params, or headers), and whether there is a way to specify if I want to pass a body or params?
External functions send and receive data in a particular format; all the parameters are sent in the request body:
https://docs.snowflake.com/en/sql-reference/external-functions-data-format.html
You can check out the Snowflake-Labs samples for external function examples. There is one specifically for Azure Python Functions that calls the Translator API.
I started from scratch and stripped the layers one by one in Snowflake. The Snowflake parameter is passed in the body of the function, but wrapped in an array which is then wrapped in another object called 'data'. Furthermore, Snowflake expects the same schema back in the response. Below is a template to use for Azure Functions when using Python.
import logging
import azure.functions as func
import json

def main(req: func.HttpRequest) -> func.HttpResponse:
    # Get body sent by Snowflake
    req_body = req.get_json()['data'][0][1]

    ###### Do Something

    # Return Response
    message = {"Task": "Completed"}
    return func.HttpResponse(
        json.dumps({'data': [[0, message]]}),
        status_code=200)
As an example, I've used a simple JSON object:
{
  "_id": "someuuid"
}
And created an external function in Snowflake called testfunc(body variant) and called it using select testfunc(object_construct('_id', 'someuuid')).
If you log the incoming request body (using logging.info(req.get_json())), it prints the following:
{
  "data": [
    [
      0,
      {
        "_id": "someuuid"
      }
    ]
  ]
}
So, to get the clean input I fed in from Snowflake, I use the line
req_body = req.get_json()['data'][0][1]
However, I kept getting errors on the response until I tried just echoing the input back and noticed that Snowflake rejected it without the wrapping. The returned body needs to be a string (hence the json.dumps()), but it also needs the same wrapping. So, to send the output back: first define the message you want (it may be a calculation on the input or just an acknowledgement), then wrap the message in {'data': [[0, message]]}, and finally serialize it to a string with json.dumps().
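Note that Snowflake may batch several rows into one request, each as a [row_index, value] pair, and the response must echo back one result per row index. A minimal sketch extending the template above to handle that case (an assumption-level illustration, not part of the original answer):

import azure.functions as func
import json

def main(req: func.HttpRequest) -> func.HttpResponse:
    rows = req.get_json()['data']  # e.g. [[0, {...}], [1, {...}], ...]

    out = []
    for row_index, value in rows:
        # Replace this echo with the real per-row work (e.g. the Mongo upsert).
        out.append([row_index, {"Task": "Completed", "echo": value}])

    return func.HttpResponse(
        json.dumps({'data': out}),
        status_code=200,
        mimetype="application/json")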
