I have written a program that converts a file containing one JSON object per line into a JSON array.
Refer to the below link to understand what I want to achieve:
How to get JSON Array in a blob storage using dataflow
I have created the following files for the trigger:
function.json:
{
"scriptFile": "__init__.py",
"bindings": [
{
"name": "jsonfiletrigger",
"type": "blobTrigger",
"direction": "in",
"path": "<containername>/in.json",
"connection": "<Storage account>"
},
{
"name": "blobin",
"type": "blob",
"direction": "in",
"path": "<containername>/in.json",
"connection": "<Storage account>"
},
{
"name": "blobout",
"type": "blob",
"direction": "out",
"path": "<containername>/out.json",
"connection": "<Storage account>"
}
],
"disabled": false
}
host.json:
{
"version": "2.0"
}
local.settings.json
{
"IsEncrypted": false,
"Values": {
"AzureWebJobsStorage": "DefaultEndpointsProtocol=https;AccountName=<Storage account>;AccountKey=<Storage account access key>;EndpointSuffix=core.windows.net",
"FUNCTIONS_EXTENSION_VERSION": "~3",
"FUNCTIONS_WORKER_RUNTIME": "python",
"APPINSIGHTS_INSTRUMENTATIONKEY": "<appinsight instrumentation key>",
"APPLICATIONINSIGHTS_CONNECTION_STRING": "InstrumentationKey=<Instrumentation key>;IngestionEndpoint=https://westeurope-3.in.applicationinsights.azure.com/"
},
"ConnectionStrings": {}
}
__init__.py:
import logging
import azure.functions as func
import sys
import json
import os

def main(blobin: func.InputStream, blobout: func.Out[bytes], context: func.Context):
    logging.info('env variables :: %s' % dict(os.environ))
    jsonarr = []
    try:
        with open(blobin, 'rt') as fin:
            for line in fin.readlines():
                jsonobj = json.loads(line.strip())
                jsonarr.append(jsonobj)
    except OSError as e:
        print(f'EXCEPTION: Unable to read input as file. {e}')
        sys.exit(254)
    except Exception as e:
        print(f'EXCEPTION: {e}')
        sys.exit(255)
    try:
        with open(blobout, 'wt') as fout:
            json.dump(jsonarr, indent=4, fp=fout)
    except OSError as e:
        print(f'EXCEPTION: Unable to write output. {e}')
        sys.exit(254)
    except Exception as e:
        print(f'EXCEPTION: {e}')
        sys.exit(255)
I ran the below command to publish:
func azure functionapp publish jsonlisttoarray --publish-local-settings
I can see that the files are in the function app, but I am not sure why the function doesn't get triggered.
Please help resolve the issue.
The problem may be that the storage account connection string was not uploaded to the Azure portal when you deployed.
The documentation states that values under ConnectionStrings are not published to Azure when you run the command func azure functionapp publish jsonlisttoarray --publish-local-settings.
I tested this on my side: the values under the ConnectionStrings field in local.settings.json were not published to the function app's application settings in the portal when I deployed, and that prevents the function from being triggered.
To solve this problem, go to your function app in the Azure portal. Then click "Configuration" --> "Application settings" tab --> "New application setting" and add a variable with the same name and value as the entry under the ConnectionStrings field in your local.settings.json.
================================Update================================
It seems there is a mistake in the connection field of your function.json. First, add a variable to local.settings.json whose value is the storage connection string, like below:
Then set the connection field in function.json to the name of that variable (as defined in local.settings.json):
Then deploy the function to Azure again with the command func azure functionapp publish jsonlisttoarray --publish-local-settings.
Note: if you do not add --publish-local-settings to your publish command, the values in local.settings.json will not be uploaded to your function app when you deploy.
I found that the problem was with the directory structure. I had all the files in the same directory. I needed to put function.json and __init__.py (and any other sources) in a subdirectory for the function.
A function app can have multiple functions; they all share the same settings, requirements.txt, and host.json.
The directory structure looks like this:
$ ls -Ra
.:
. .. BlobTrigger extensions.csproj host.json local.settings.json proxies.json .python_packages requirements.txt
./BlobTrigger:
. .. function.json __init__.py
./.python_packages:
. ..
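Separately from the trigger problem, note that blobin and blobout in the question are binding objects rather than file paths, so calling open() on them is unlikely to work. The following is a minimal sketch of the conversion step only, assuming the input is UTF-8 text with one JSON object per line. The helper name json_lines_to_array is hypothetical, and the wiring shown in the trailing comment (blobin.read() and blobout.set()) has not been tested against the asker's app.

```python
import json


def json_lines_to_array(raw: bytes) -> bytes:
    """Convert newline-delimited JSON (one object per line) into a JSON array."""
    records = []
    for line in raw.decode("utf-8").splitlines():
        line = line.strip()
        if line:  # skip blank lines instead of failing on them
            records.append(json.loads(line))
    return json.dumps(records, indent=4).encode("utf-8")


# Assumed wiring inside the function body (hypothetical, untested here):
#     blobout.set(json_lines_to_array(blobin.read()))
```

Keeping the conversion in a plain function like this also makes it easy to check locally before deploying.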
We are using Azure Functions with a Python runtime to take latitude/longitude tuples from a Snowflake database and return the respective countries. We also want to convert any non-English country names into English.
We initially found that although the script showed output in the terminal while testing on Azure, it soon returned a 503 error (although the script continued to run at that point). If we cancelled the script, it showed as a success in the Monitor screen of the Azure portal; however, leaving the script to run to completion resulted in the script failing. We decided (partially based on this post) that this was due to the runtime exceeding the maximum HTTP response time allowed. To combat this we tried a number of solutions.
First we extended the function timeout value in the host.json file:
{
"version": "2.0",
"logging": {
"applicationInsights": {
"samplingSettings": {
"isEnabled": true,
"excludedTypes": "Request"
}
}
},
"extensionBundle": {
"id": "Microsoft.Azure.Functions.ExtensionBundle",
"version": "[2.*, 3.0.0)"
},
"functionTimeout": "00:10:00"
}
We then modified our script to write to a queue by adding the msg output parameter
def main(req: func.HttpRequest, msg: func.Out[func.QueueMessage]) ->func.HttpResponse:
to the main .py script. We also then modified the function.json file to
{
"scriptFile": "__init__.py",
"bindings": [
{
"authLevel": "function",
"type": "httpTrigger",
"direction": "in",
"name": "req",
"methods": [
"get",
"post"
]
},
{
"type": "http",
"direction": "out",
"name": "$return"
},
{
"type": "queue",
"direction": "out",
"name": "msg",
"queueName": "processing",
"connection": "QueueConnectionString"
}
]
}
and the local.settings.json file to
{
"IsEncrypted": false,
"Values": {
"FUNCTIONS_WORKER_RUNTIME": "python",
"AzureWebJobsStorage": "{AzureWebJobsStorage}",
"QueueConnectionString": "<Connection String for Storage Account>",
"STORAGE_CONTAINER_NAME": "testdocs",
"STORAGE_TABLE_NAME": "status"
}
}
We also then added a check to see if the country name was already in English. The intention here was to cut down on calls to the translate function.
After each of these changes we redeployed to the functions app and tested again. Same result. The function will run, and print output to terminal, however after a few seconds it will show a 503 error and eventually fail.
I can show a code sample but cannot provide the tables unfortunately.
from snowflake import connector
import pandas as pd
import pyarrow
from geopy.geocoders import Nominatim
from deep_translator import GoogleTranslator
from pprint import pprint
import langdetect
import logging
import azure.functions as func

def main(req: func.HttpRequest, msg: func.Out[func.QueueMessage]) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    # Connection to Snowflake
    conn = connector.connect(
        user='<USER>',
        password='<PASSWORD>',
        account='<ACCOUNT>',
        warehouse='<WH>',
        database='<DB>',
        schema='<SCH>'
    )
    # Creating objects for Snowflake, Geolocation, Translate
    cur = conn.cursor()
    geolocator = Nominatim(user_agent="geoapiExercises")
    translator = GoogleTranslator(target='en')
    # Fetching weblog data to get the current latlong list
    fetchsql = "SELECT PAGELATLONG FROM <TABLE_NAME> WHERE PAGELATLONG IS NOT NULL GROUP BY PAGELATLONG;"
    logging.info(fetchsql)
    cur.execute(fetchsql)
    df = pd.DataFrame(cur.fetchall(), columns = ['PageLatLong'])
    logging.info('created data frame')
    # Creating and Inserting the mapping into final table
    for index, row in df.iterrows():
        latlong = row['PageLatLong']
        location = geolocator.reverse(row['PageLatLong']).raw['address']
        logging.info('got addresses')
        city = str(location.get('state_district'))
        country = str(location.get('country'))
        countrycd = str(location.get('country_code'))
        logging.info('got countries')
        # detect language of country; str(res[0]) has the form "<lang>:<probability>"
        res = langdetect.detect_langs(country)
        lang = str(res[0]).split(':')[0]
        # take the probability part and convert it so it can be compared numerically
        conf = float(str(res[0]).split(':')[1])
        if lang != 'en' and conf > 0.99:
            country = translator.translate(country)
            logging.info('translated non-english country names')
        insertstmt = "INSERT INTO <RESULTS_TABLE> VALUES('"+latlong+"','"+city+"','"+country+"','"+countrycd+"')"
        logging.info(insertstmt)
        try:
            cur.execute(insertstmt)
        except Exception:
            pass
    return func.HttpResponse("success")
If anyone had an idea what may be causing this issue I'd appreciate any suggestions.
Thanks.
To resolve timeout errors, you can try the following:
As suggested by MayankBargali-MSFT, you can define retry policies. Note that retries for triggers like HTTP and timer don't resume on a new instance. This means that the max retry count is best effort: in some rare cases an execution could be retried more than the maximum, and for triggers like HTTP and timer it could be retried fewer times than the maximum. You can also navigate to "Diagnose and solve problems" to help find the root cause of the 503 error, as there can be multiple reasons for it.
As suggested by ryanchill, the 503 issue is the result of high memory consumption that exceeded the limits of the Consumption plan. The best resolution is to switch to a dedicated hosting plan, which provides more resources. If that isn't an option, explore reducing the amount of data being retrieved.
References: https://learn.microsoft.com/en-us/answers/questions/539967/azure-function-app-503-service-unavailable-in-code.html , https://learn.microsoft.com/en-us/answers/questions/522216/503-service-unavailable-while-executing-an-azure-f.html , https://learn.microsoft.com/en-us/answers/questions/328952/azure-durable-functions-timeout-error-in-activity.html and https://learn.microsoft.com/en-us/answers/questions/250623/azure-function-not-running-successfully.html
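One way to act on the advice about reducing the amount of data retrieved is to avoid cur.fetchall() and read the result set in fixed-size batches. The sketch below assumes a DB-API style cursor that exposes fetchmany(size); the helper name iter_batches and the batch size of 500 are hypothetical, and this has not been measured against the asker's workload.

```python
from typing import Callable, Iterator, List, Sequence


def iter_batches(fetchmany: Callable[[int], Sequence], batch_size: int = 500) -> Iterator[List]:
    """Yield rows in fixed-size batches until the cursor is exhausted."""
    while True:
        rows = fetchmany(batch_size)
        if not rows:  # an empty result means no rows are left
            break
        yield list(rows)


# Assumed usage with the cursor from the question (hypothetical):
#     cur.execute(fetchsql)
#     for batch in iter_batches(cur.fetchmany):
#         for (latlong,) in batch:
#             ...  # geocode and insert one row at a time
```

Only one batch is held in memory at a time, which may help on plans with tight memory limits.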
I have created an Azure Function based on a custom image (Docker) using VS Code.
I used the deployment feature of VS Code to deploy it to Azure and everything was fine.
My function.json file specifies anonymous auth level:
{
"scriptFile": "__init__.py",
"bindings": [
{
"authLevel": "anonymous",
"type": "httpTrigger",
"direction": "in",
"name": "req",
"methods": [
"get",
"post"
]
},
{
"type": "http",
"direction": "out",
"name": "$return"
}
]
}
Why am I still getting the 401 Unauthorized error?
Thanks
Amit
I changed my authLevel from function to anonymous and it finally worked!
The following steps can help fix 4XX errors in a function app:
Make sure you add all the values from the local.settings.json file to the application settings (Function App -> Configuration -> Application Settings).
Check the CORS settings of your function app. Try adding "*" and saving it, then reload the function app and try to run it.
(Any request made against a storage resource when CORS is enabled must either have a valid authorization header or must be made against a public resource.)
When you make a request to your function, you may need to pass an authorization header with the key 'x-functions-key' and a value equal to either the default key for all functions (Function App > App Keys > default in the Azure portal) or a key specific to that function (Function App > Functions > [specific function] > Function Keys > default in the Azure portal).
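As an illustration of the last point, here is a minimal sketch of a request that carries the key in the x-functions-key header, using only the Python standard library. The helper name build_function_request is hypothetical, and the URL and key in the comment are placeholders rather than real values.

```python
import urllib.request


def build_function_request(url: str, function_key: str) -> urllib.request.Request:
    """Build a GET request that carries the function key in a header."""
    return urllib.request.Request(
        url,
        headers={"x-functions-key": function_key},
        method="GET",
    )


# Sending it would look like this (placeholder URL and key):
#     req = build_function_request("https://<app>.azurewebsites.net/api/<function>", "<key>")
#     with urllib.request.urlopen(req) as resp:
#         print(resp.status)
```

With authLevel set to anonymous no key is required, so the header only matters for function or admin level endpoints.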
I have encountered a problem when setting blob metadata in Azure Storage. I have developed a script for this in Spyder, so local Python, which works great. Now, I want to be able to execute this same script as an Azure Function. However, when setting the metadata I get the following error: HttpResponseError: The specifed resource name contains invalid characters.
The only change from Spyder to Functions that I made is:
Spyder:
def main(container_name,blob_name,metadata):
    from azure.storage.blob import BlobServiceClient
    # Connection string to storage account
    storageconnectionstring=secretstoragestringnotforstackoverflow
    # initialize clients
    blobclient_from_connectionstring=BlobServiceClient.from_connection_string(storageconnectionstring)
    containerclient=blobclient_from_connectionstring.get_container_client(container_name)
    blob_client = containerclient.get_blob_client(blob_name)
    # set metadata of the blob
    blob_client.set_blob_metadata(metadata=metadata)
    return
Functions:
import json
import azure.functions as func
from azure.storage.blob import BlobServiceClient

def main(req: func.HttpRequest):
    container_name = req.params.get('container_name')
    blob_name = req.params.get('blob_name')
    metadata_raw = req.params.get('metadata')
    metadata_json = json.loads(metadata_raw)
    # Connection string to storage account
    storageconnectionstring=secretstoragestringnotforstackoverflow
    # initialize clients
    blobclient_from_connectionstring=BlobServiceClient.from_connection_string(storageconnectionstring)
    containerclient=blobclient_from_connectionstring.get_container_client(container_name)
    blob_client = containerclient.get_blob_client(blob_name)
    # set metadata of the blob
    blob_client.set_blob_metadata(metadata=metadata_json)
    return func.HttpResponse()
Arguments to the Function are passed in the header. Problem lies with metadata and not container_name or blob_name as I get no error when I comment out metadata. Also, I tried formatting metadata in many variations with single or double quotes and as JSON or as string but no luck so far. Anyone who could help me solve this problem?
I was able to fix the problem. The script was fine; the problem was in the input parameters, which needed to be in a specific format: metadata as a dict with double quotes, and the blob and container names as strings without any quotes.
As requested, the function.json:
{
"scriptFile": "__init__.py",
"bindings": [
{
"authLevel": "anonymous",
"type": "httpTrigger",
"direction": "in",
"name": "req",
"methods": [
"get",
"post"
]
},
{
"type": "http",
"direction": "out",
"name": "$return"
}
]
}
With parameter formatting:
Picture from Azure Functions
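The formatting requirement described in the answer can be checked locally before calling the function. The sketch below assumes the metadata parameter arrives as a JSON string; the helper name parse_metadata is hypothetical, and converting values to strings is an assumption about what the storage call expects rather than something stated in the thread.

```python
import json
from typing import Dict


def parse_metadata(raw: str) -> Dict[str, str]:
    """Parse a metadata parameter that must be a JSON object."""
    parsed = json.loads(raw)  # raises ValueError for single-quoted input
    if not isinstance(parsed, dict):
        raise ValueError("metadata must be a JSON object")
    # Assumption: metadata keys and values are sent as strings.
    return {str(key): str(value) for key, value in parsed.items()}
```

A value such as {"project": "demo"} parses, while {'project': 'demo'} is rejected because JSON requires double quotes.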
I am using Ubuntu 16.04.5 LTS local machine to create and publish Python Function App to Azure using CLI and Azure Functions Core Tools (Ref). I have configured Blob Trigger and my function.json file looks like this:
{
"disabled": false,
"scriptFile": "__init__.py",
"bindings": [
{
"name": "<Blob Trigger Name>",
"type": "blobTrigger",
"direction": "in",
"path": "<Blob Container Name>/{name}",
"connection": "<Connection String having storage account and key>"
},
{
"name": "outputblob",
"type": "blob",
"path": "<Blob Container Name>",
"connection": "<Connection String having storage account and key>",
"direction": "out"
}
]
}
My __init__.py function looks like this.
def main(<Blob Trigger Name>: func.InputStream, doc: func.Out[func.Document]):
    logging.info(f"Python blob trigger function processed blob \n"
                 f"Name: {<Blob Trigger Name>.name}\n"
                 f"Blob Size: {<Blob Trigger Name>.length} bytes")
    logging.basicConfig(filename='example.log',level=logging.DEBUG)
    logging.debug('This message should go to the log file')
    logging.info('So should this')
    logging.warning('And this, too')
    # Write text to the file.
    file = open("QuickStart.txt", 'w')
    file.write("Hello, World!")
    file.close()
    # Create the BlockBlobService that is used to call the Blob service for the storage account
    block_blob_service = BlockBlobService(account_name='<Storage Account Name>', account_key='<Storage Account Key>')
    container_name='<Blob Container Name>'
    # Set the permission so the blobs are public.
    block_blob_service.set_container_acl(container_name, public_access=PublicAccess.Container)
    # Upload the created file, use local_file_name for the blob name
    block_blob_service.create_blob_from_path(container_name, 'QuickStart.txt', '')
The Function App is "Always On" but when I upload a blob in the storage the function is not getting triggered. Another Reference Link is this (Ref).
What's going wrong?
Thanks and regards,
Shashank
Have you checked that the local.settings.json (connection strings for storage accounts) are also in the function app in Azure? They are not published from local machine by default.
You can configure them manually in the portal or use the publish-local-settings flag:
func azure functionapp publish "functionname" --publish-local-settings
I tried to reproduce this issue by creating a sample function app in Python using Visual Studio Code with the default template, and then deployed it on Linux. It worked for me.
Here is the code I have written in the Python file.
import logging
import azure.functions as func

def main(myblob: func.InputStream):
    logging.info(f"Python blob trigger function processed blob \n"
                 f"Name: {myblob.name}\n"
                 f"Blob Size: {myblob.length} bytes")
And here is the function.json file from my function app.
{
"scriptFile": "__init__.py",
"bindings": [
{
"name": "myblob",
"type": "blobTrigger",
"direction": "in",
"path": "samples-workitems/{name}",
"connection": ""
}
]
}
I am using Azure Functions 2.0, Python 3.6, and Azure Functions Core Tools version 2.2.70.
This is the reference link I used:
https://learn.microsoft.com/en-us/azure/azure-functions/functions-create-first-function-python
Please try to use this and see if it helps.
In the main function of your Python script, the second argument is doc: func.Out[func.Document], which is for Cosmos DB. Since the output binding is of type blob, this should be an output stream instead.
I have an azure function which is triggered by a file being put into blob storage and I was wondering how (if possible) to get the name of the blob (file) which triggered the function, I have tried doing:
fileObject=os.environ['inputBlob']
message = "Python script processed input blob'{0}'".format(fileObject.fileName)
and
fileObject=os.environ['inputBlob']
message = "Python script processed input blob'{0}'".format(fileObject.name)
but neither of these worked; they both resulted in errors. Can I get some help with this or some suggestions?
Thanks
The blob name can be captured via function.json and provided as binding data. See the {filename} token below.
function.json is language agnostic and works in all languages.
See documentation at https://learn.microsoft.com/en-us/azure/azure-functions/functions-triggers-bindings for details.
{
"bindings": [
{
"name": "image",
"type": "blobTrigger",
"path": "sample-images/{filename}",
"direction": "in",
"connection": "MyStorageConnection"
},
{
"name": "imageSmall",
"type": "blob",
"path": "sample-images-sm/{filename}",
"direction": "out",
"connection": "MyStorageConnection"
}
]
}
If you want to get the name of the file that triggered your function, you can do the following:
Use {name} in function.json:
{
"bindings": [
{
"name": "myblob",
"type": "blobTrigger",
"path": "MyBlobPath/{name}",
"direction": "in",
"connection": "MyStorageConnection"
}
]
}
The function will be triggered by changes in your blob storage.
Get the name of the file that triggered the function in Python (__init__.py):
import azure.functions as func

def main(myblob: func.InputStream):
    filename = myblob.name
This gives you the name of the file that triggered your function.
Your description does not say which trigger you used. Fortunately, there is a sample project on GitHub, yokawasa/azure-functions-python-samples, for Azure Functions using Python. It includes many samples using different triggers, such as queue triggers and blob triggers. I think it would be very helpful for you, and you can refer to these samples to write your own function to meet your needs.
Hope it helps.
Getting the name of the inputBlob is not currently possible with Python Azure-Functions. There are open issues about it in azure-webjobs-sdk and azure-webjobs-sdk-script GitHub:
https://github.com/Azure/azure-webjobs-sdk/issues/1090
https://github.com/Azure/azure-webjobs-sdk-script/issues/1339
Unfortunately, it's still not possible.
In Python, you can do:
import azure.functions as func
import os

def main(blobin: func.InputStream):
    filename=os.path.basename(blobin.name)
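The basename step above can be tried without deploying anything. This is a minimal sketch, assuming the name reported by the trigger is a forward-slash path that starts with the container (for example samples-workitems/reports/in.json); the helper name blob_file_name is hypothetical. It uses posixpath so the result does not depend on the operating system running the code.

```python
import posixpath


def blob_file_name(blob_path: str) -> str:
    """Return the last path segment of a blob path such as 'container/dir/file.json'."""
    return posixpath.basename(blob_path)


# Assumed usage inside a blob-triggered function (hypothetical):
#     filename = blob_file_name(blobin.name)
```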