How to set the operation ID for custom Application Insights events? - python

We have a Python web app (Flask) running on Azure. It processes requests and, at the end, also logs some results via a custom event to Application Insights.
However, inside Application Insights the end-to-end transaction operation ID for our custom events is 0.
Other event types, such as requests, do have the operation ID. How can we get the same operation ID here as on the request event?
import logging

from opencensus.ext.azure.log_exporter import AzureEventHandler, AzureLogHandler

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)
logger.addHandler(AzureEventHandler(
    connection_string='InstrumentationKey=00000000-0000-0000-0000-000000000000'))

def track_result(result):
    properties = {'result': result}
    logger.info('Result log', extra={'custom_dimensions': properties})
    return None

Using OpenCensus and AzureEventHandler, the operation ID was populated in Application Insights with the following code:
import logging
from opencensus.trace import config_integration
from opencensus.ext.azure.log_exporter import AzureEventHandler
config_integration.trace_integrations(['logging'])
logger = logging.getLogger(__name__)
handler = AzureEventHandler(connection_string='Connection String from portal')
logger.addHandler(handler)
# Now we can log like this
logger.warning('Before the span')
With this, the operation ID is filled in every time I make a GET request to my Flask API.
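Mechanically, what the logging integration does is attach the active span's trace ID to every LogRecord, which the Azure handler then maps to the operation ID. A stdlib-only sketch of that mechanism (the OPERATION_ID value, filter, and capture handler are illustrative, not the real OpenCensus internals):

```python
import logging

OPERATION_ID = "4bf92f3577b34da6a3ce929d0e0e4736"  # hypothetical active trace id

class TraceContextFilter(logging.Filter):
    # Mimics what config_integration.trace_integrations(['logging']) sets up:
    # stamp the active trace id onto every record so an exporter can read it.
    def filter(self, record):
        record.traceId = OPERATION_ID
        return True

captured = []

class CaptureHandler(logging.Handler):
    def emit(self, record):
        captured.append(record)

logger = logging.getLogger('opid-demo')
logger.addFilter(TraceContextFilter())
logger.addHandler(CaptureHandler())
logger.warning('Before the span')
print(captured[0].traceId)  # the id the exporter would report as the operation ID
```

Outside of a request, there is no active span, which is why the question's code (plain AzureEventHandler, no trace integration) ends up with operation ID 0.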

Related

How to send logs to GCP using StructuredLogHandler with jsonPayload?

I am trying to send logs (with jsonPayload) to GCP using StructuredLogHandler with the Python code below.
rootlogger = logging.getLogger()
client = google.cloud.logging.Client(credentials=xxx, project=xxx)
h = StructuredLogHandler()
rootlogger.addHandler(h)
logger = logging.getLogger('test')
logger.warning('warning')
I see that logs are being printed on console (in json format) but the logs are not sent to GCP Log Explorer. Can someone help?
Since v3 of the Python Cloud Logging library, it's easier than ever, as it integrates with Python's standard logging library via client.setup_logging:
import logging
import google.cloud.logging
# Instantiate a client
client = google.cloud.logging.Client()
# Retrieves a Cloud Logging handler based on the environment
# you're running in and integrates the handler with the
# Python logging module. By default this captures all logs
# at INFO level and higher
client.setup_logging()
Name your logger as usual, e.g.:
logger = logging.getLogger('test')
Then, if you want to send structured log messages to Cloud Logging, you can use one of two methods:
Use the json_fields extra argument:
data_dict = {"hello": "world"}
logging.info("message field", extra={"json_fields": data_dict})
Use a JSON-parseable string (requires importing the json module):
import json
data_dict = {"hello": "world"}
logging.info(json.dumps(data_dict))
Your log messages will then be sent to Google Cloud, with the JSON payload available under the jsonPayload field of the expanded log entry.
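Under the hood, everything passed via extra is attached as attributes on the LogRecord, which is how the Cloud Logging handler finds json_fields. A stdlib-only sketch of that plumbing (the capture handler is just for illustration):

```python
import logging

captured = []

class CaptureHandler(logging.Handler):
    def emit(self, record):
        captured.append(record)

logger = logging.getLogger('json-fields-demo')
logger.addHandler(CaptureHandler())

data_dict = {"hello": "world"}
logger.warning("message field", extra={"json_fields": data_dict})

# The handler sees the dict as a plain attribute on the record:
print(captured[0].json_fields)
```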

Log to Cloud Logging with correct severity from a Cloud Run job and a package used in the job

What we are trying:
We are trying to run a Cloud Run job that does some computation, using one of our custom packages. The Cloud Run job uses google-cloud-logging and Python's default logging package as described here. The custom Python package also logs its data (only a logger is defined, as suggested here).
Simple illustration:
from google.cloud import logging as gcp_logging
import logging
import os
import google.auth

from our_package import do_something

def log_test_function():
    SCOPES = ["https://www.googleapis.com/auth/cloud-platform"]
    credentials, project_id = google.auth.default(scopes=SCOPES)
    try:
        function_logger_name = os.getenv("FUNCTION_LOGGER_NAME")
        logging_client = gcp_logging.Client(credentials=credentials, project=project_id)
        logging.basicConfig()
        logger = logging.getLogger(function_logger_name)
        logger.setLevel(logging.INFO)
        logging_client.setup_logging(log_level=logging.INFO)
        logger.critical("Critical Log TEST")
        logger.error("Error Log TEST")
        logger.info("Info Log TEST")
        logger.debug("Debug Log TEST")
        result = do_something()
        logger.info(result)
    except Exception as e:
        print(e)  # just to test how print works
    return "Returned"

if __name__ == "__main__":
    result = log_test_function()
    print(result)
Cloud Run job logs:
The blue box indicates logs from the custom package.
The black box indicates logs from the Cloud Run job.
Cloud Logging is not able to identify the severity of the logs; it parses every log entry at the default level.
But if I run the same code in a Cloud Function, it seems to work as expected (i.e. the severity level of logs from the Cloud Function and the custom package is respected), as shown in the image below.
Cloud Function logs:
Both are serverless architectures, so why does it work in Cloud Functions but not in Cloud Run?
What we want to do:
We want to log every message from the Cloud Run job and the custom package to Cloud Logging with the correct severity.
We would appreciate your help!
Edit 1
Following the Google Cloud Python library committers' solution almost solved the problem. Below is the modified code.
from google.cloud import logging as gcp_logging
import logging
import os
import google.auth

from our_package import do_something

from google.cloud.logging.handlers import CloudLoggingHandler
from google.cloud.logging_v2.handlers import setup_logging
from google.cloud.logging_v2.resource import Resource
from google.cloud.logging_v2.handlers._monitored_resources import retrieve_metadata_server, _REGION_ID, _PROJECT_NAME

def log_test_function():
    SCOPES = ["https://www.googleapis.com/auth/cloud-platform"]
    region = retrieve_metadata_server(_REGION_ID)
    project = retrieve_metadata_server(_PROJECT_NAME)
    try:
        function_logger_name = os.getenv("FUNCTION_LOGGER_NAME")
        # build a manual resource object
        cr_job_resource = Resource(
            type="cloud_run_job",
            labels={
                "job_name": os.environ.get('CLOUD_RUN_JOB', 'unknownJobId'),
                "location": region.split("/")[-1] if region else "",
                "project_id": project
            }
        )
        logging_client = gcp_logging.Client()
        gcloud_logging_handler = CloudLoggingHandler(logging_client, resource=cr_job_resource)
        setup_logging(gcloud_logging_handler, log_level=logging.INFO)
        logging.basicConfig()
        logger = logging.getLogger(function_logger_name)
        logger.setLevel(logging.INFO)
        logger.critical("Critical Log TEST")
        logger.error("Error Log TEST")
        logger.warning("Warning Log TEST")
        logger.info("Info Log TEST")
        logger.debug("Debug Log TEST")
        result = do_something()
        logger.info(result)
    except Exception as e:
        print(e)  # just to test how print works
    return "Returned"

if __name__ == "__main__":
    result = log_test_function()
    print(result)
Now every log is logged twice: one severity-sensitive log, and one severity-insensitive log at the "default" level, as shown below.
Cloud Functions takes your code, wraps it in a webserver, builds a container, and deploys it.
With Cloud Run, you build and deploy the container yourself.
That means the Cloud Functions webserver wrapper does something more than you do: it correctly initializes the logger in Python.
Have a look at that doc page; it should solve your issue.
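For context, the initialization that the managed environments do for you boils down to emitting output the logging agent can parse: one JSON object per line on stdout, with the severity promoted from a "severity" key. A minimal stdlib sketch of that idea (field names follow Google's structured-logging convention; this is an illustration, not the library's actual handler):

```python
import json
import logging
import sys

class StructuredFormatter(logging.Formatter):
    # Emit one JSON object per line; Cloud Logging promotes the
    # "severity" and "message" keys when it ingests stdout.
    def format(self, record):
        return json.dumps({
            'severity': record.levelname,
            'message': record.getMessage(),
        })

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(StructuredFormatter())
logger = logging.getLogger('job')
logger.addHandler(handler)
logger.error('Error Log TEST')  # prints {"severity": "ERROR", "message": "Error Log TEST"}
```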
EDIT 1
I took the exact example and added it to my Flask server like this:
import os
from flask import Flask

app = Flask(__name__)

import google.cloud.logging

# Instantiates a client
client = google.cloud.logging.Client()

# Retrieves a Cloud Logging handler based on the environment
# you're running in and integrates the handler with the
# Python logging module. By default this captures all logs
# at INFO level and higher
client.setup_logging()
# [END logging_handler_setup]

# [START logging_handler_usage]
# Imports Python standard library logging
import logging

@app.route('/')
def call_function():
    # The data to log
    text = "Hello, world!"
    # Emits the data using the standard logging module
    logging.warning(text)
    # [END logging_handler_usage]
    print("Logged: {}".format(text))
    return text

# For local execution
if __name__ == "__main__":
    app.run(host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))
A rough copy of that sample.
And the result is correct: a Logged: entry at the default level (the print), and my Hello, world! at warning status (the logging.warning).
EDIT 2
Thanks to the help of the Google Cloud Python library committers, I got a solution to my issue while waiting for native integration in the library.
Here is my new code, for Cloud Run jobs this time:
import google.cloud.logging
from google.cloud.logging.handlers import CloudLoggingHandler
from google.cloud.logging_v2.handlers import setup_logging
from google.cloud.logging_v2.resource import Resource
from google.cloud.logging_v2.handlers._monitored_resources import retrieve_metadata_server, _REGION_ID, _PROJECT_NAME

import os

# find metadata about the execution environment
region = retrieve_metadata_server(_REGION_ID)
project = retrieve_metadata_server(_PROJECT_NAME)

# build a manual resource object
cr_job_resource = Resource(
    type="cloud_run_job",
    labels={
        "job_name": os.environ.get('CLOUD_RUN_JOB', 'unknownJobId'),
        "location": region.split("/")[-1] if region else "",
        "project_id": project
    }
)

# configure handling using CloudLoggingHandler with custom resource
client = google.cloud.logging.Client()
handler = CloudLoggingHandler(client, resource=cr_job_resource)
setup_logging(handler)

import logging

def call_function():
    # The data to log
    text = "Hello, world!"
    # Emits the data using the standard logging module
    logging.warning(text)
    print("Logged: {}".format(text))
    return text

# For local execution
if __name__ == "__main__":
    call_function()
And the result works great:
The Logged: ... entry is the simple print to stdout, at the standard level.
The Warning entry appears as expected.
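One detail worth noting in the resource construction above: the metadata server returns the region as a full resource path, which is why the code keeps only the last path segment. A quick illustration with a made-up value:

```python
# Hypothetical value returned by the metadata server's region endpoint:
region = "projects/123456789/regions/us-central1"

# Keep only the last path segment, guarding against a missing value,
# exactly as the cr_job_resource labels do above.
location = region.split("/")[-1] if region else ""
print(location)  # us-central1
```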
You might want to get rid of loggers you don't need. Take a look at https://stackoverflow.com/a/61602361/13161301.

Recursive logging issue when using Opencensus with FastAPI

I have a problem with my implementation of OpenCensus, logging in Python, and FastAPI. I want to log incoming requests to Application Insights in Azure, so I added a FastAPI middleware to my code following the Microsoft docs and this GitHub post:
propagator = TraceContextPropagator()

@app.middleware('http')
async def middleware_opencensus(request: Request, call_next):
    tracer = Tracer(
        span_context=propagator.from_headers(request.headers),
        exporter=AzureExporter(connection_string=os.environ['APPLICATION_INSIGHTS_CONNECTION_STRING']),
        sampler=AlwaysOnSampler(),
        propagator=propagator)
    with tracer.span('main') as span:
        span.span_kind = SpanKind.SERVER
        tracer.add_attribute_to_current_span(HTTP_HOST, request.url.hostname)
        tracer.add_attribute_to_current_span(HTTP_METHOD, request.method)
        tracer.add_attribute_to_current_span(HTTP_PATH, request.url.path)
        tracer.add_attribute_to_current_span(HTTP_ROUTE, request.url.path)
        tracer.add_attribute_to_current_span(HTTP_URL, str(request.url))
        response = await call_next(request)
        tracer.add_attribute_to_current_span(HTTP_STATUS_CODE, response.status_code)
    return response
This works great when running locally, and all incoming requests to the API are logged to Application Insights. However, since implementing OpenCensus, when deployed in a Container Instance on Azure, after a couple of days (approximately 3) an issue arises where some recursive logging issue seems to happen (+30,000 logs per second!), stating among other things Queue is full. Dropping telemetry, before finally crashing after a few hours of mad logging:
Our logger.py file, where we define our logging handlers, is as follows:
import logging.config
import os
import tqdm
from pathlib import Path

from opencensus.ext.azure.log_exporter import AzureLogHandler

class TqdmLoggingHandler(logging.Handler):
    """
    Class for enabling logging during a process with a tqdm progress bar.
    Using this handler, logs will be put above the progress bar, pushing
    the progress bar down instead of replacing it.
    """
    def __init__(self, level=logging.NOTSET):
        super().__init__(level)
        self.formatter = logging.Formatter(fmt='%(asctime)s <%(name)s> %(levelname)s: %(message)s',
                                           datefmt='%d-%m-%Y %H:%M:%S')

    def emit(self, record):
        try:
            msg = self.format(record)
            tqdm.tqdm.write(msg)
            self.flush()
        except (KeyboardInterrupt, SystemExit):
            raise
        except Exception:
            self.handleError(record)

logging_conf_path = Path(__file__).parent
logging.config.fileConfig(logging_conf_path / 'logging.conf')

logger = logging.getLogger(__name__)
logger.addHandler(TqdmLoggingHandler(logging.DEBUG))  # Add tqdm handler to root logger to replace the stream handler

if os.getenv('APPLICATION_INSIGHTS_CONNECTION_STRING'):
    logger.addHandler(AzureLogHandler(connection_string=os.environ['APPLICATION_INSIGHTS_CONNECTION_STRING']))

warning_level_loggers = ['urllib3', 'requests']
for lgr in warning_level_loggers:
    logging.getLogger(lgr).setLevel(logging.WARNING)
Does anyone have an idea of what could be the cause of this issue, or has anyone encountered similar issues? I don't know what the 'first' error log was, due to the vast amount of logging.
Please let me know if additional information is required.
Thanks in advance!
We decided to revisit the problem and found two helpful threads describing similar, if not exactly the same, behaviour to what we were seeing:
https://github.com/census-instrumentation/opencensus-python/issues/862
https://github.com/census-instrumentation/opencensus-python/issues/1007
As described in the second thread, it seems that OpenCensus attempts to send a trace to Application Insights, and on failure the failed logs are batched and sent again after 15 s (the default). This goes on indefinitely until success, possibly causing the huge and seemingly recursive spam of failure logs.
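The feedback loop can be reproduced generically with the stdlib alone: any handler whose emit path logs through the same logging tree triggers itself again. A sketch with a depth guard so the demo terminates (the real exporter loop has no such guard):

```python
import logging

emit_count = 0

class ChattyHandler(logging.Handler):
    # Simulates an exporter that reports its own failures via logging:
    # each record handled produces another record for the same tree.
    def emit(self, record):
        global emit_count
        emit_count += 1
        if emit_count < 5:  # guard for the demo; the real loop is unbounded
            logging.getLogger('exporter').warning('Queue is full. Dropping telemetry.')

logging.getLogger().addHandler(ChattyHandler())
logging.getLogger('app').warning('one application log')
print(emit_count)  # a single user log triggered 5 emits
```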
One solution, introduced and proposed by Izchen in this comment, is to set enable_local_storage=False for this issue.
Another solution is to migrate to OpenTelemetry, which should not have this potential problem and is what we are currently running. Do keep in mind that OpenCensus is still the officially supported application-monitoring solution from Microsoft, and OpenTelemetry is still very young. OpenTelemetry does seem to have a lot of support, however, and is getting more and more traction.
As for the implementation of OpenTelemetry we did the following to trace our requests:
if os.getenv('APPLICATION_INSIGHTS_CONNECTION_STRING'):
    from azure.monitor.opentelemetry.exporter import AzureMonitorTraceExporter
    from opentelemetry import trace
    from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor
    from opentelemetry.propagate import extract
    from opentelemetry.sdk.resources import SERVICE_NAME, SERVICE_NAMESPACE, SERVICE_INSTANCE_ID, Resource
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor

    provider = TracerProvider()
    processor = BatchSpanProcessor(AzureMonitorTraceExporter.from_connection_string(
        os.environ['APPLICATION_INSIGHTS_CONNECTION_STRING']))
    provider.add_span_processor(processor)
    trace.set_tracer_provider(provider)

    FastAPIInstrumentor.instrument_app(app)
OpenTelemetry supports a lot of custom instrumentors that can be used to create spans for, for example, Requests, PyMongo, Elastic, Redis, etc. => https://opentelemetry.io/registry/.
If you want to write your own custom tracers/spans like in the OpenCensus example above, you can attempt something like this:
# These still come from OpenCensus, for convenience
from opencensus.trace.attributes_helper import COMMON_ATTRIBUTES

HTTP_HOST = COMMON_ATTRIBUTES['HTTP_HOST']
HTTP_METHOD = COMMON_ATTRIBUTES['HTTP_METHOD']
HTTP_PATH = COMMON_ATTRIBUTES['HTTP_PATH']
HTTP_ROUTE = COMMON_ATTRIBUTES['HTTP_ROUTE']
HTTP_URL = COMMON_ATTRIBUTES['HTTP_URL']
HTTP_STATUS_CODE = COMMON_ATTRIBUTES['HTTP_STATUS_CODE']

provider = TracerProvider()
processor = BatchSpanProcessor(AzureMonitorTraceExporter.from_connection_string(
    os.environ['APPLICATION_INSIGHTS_CONNECTION_STRING']))
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)

@app.middleware('http')
async def middleware_opentelemetry(request: Request, call_next):
    tracer = trace.get_tracer(__name__)
    with tracer.start_as_current_span('main',
                                      context=extract(request.headers),
                                      kind=trace.SpanKind.SERVER) as span:
        span.set_attributes({
            HTTP_HOST: request.url.hostname,
            HTTP_METHOD: request.method,
            HTTP_PATH: request.url.path,
            HTTP_ROUTE: request.url.path,
            HTTP_URL: str(request.url)
        })
        response = await call_next(request)
        span.set_attribute(HTTP_STATUS_CODE, response.status_code)
    return response
The AzureLogHandler from our logger.py configuration wasn't needed any more with this solution, so it was removed.
Some other sources that might be useful:
https://learn.microsoft.com/en-us/azure/communication-services/quickstarts/telemetry-application-insights?pivots=programming-language-python#setting-up-the-telemetry-tracer-with-communication-identity-sdk-calls
https://learn.microsoft.com/en-us/python/api/overview/azure/monitor-opentelemetry-exporter-readme?view=azure-python-preview
https://learn.microsoft.com/en-us/azure/azure-monitor/app/opentelemetry-enable?tabs=python

Python logging to Azure

I am using Python, and I was wondering if there is any package or simple way to log directly to Azure?
I found a package (azure-storage-logging) that would be really nice; however, it is no longer maintained and is not compatible with the new Azure API.
Any help is welcome.
You should use Application Insights, which will send the logs to Azure Monitor (previously Log Analytics).
https://learn.microsoft.com/en-us/azure/azure-monitor/app/opencensus-python
I had the same requirement to log error and debug messages for a small application and store the logs in Azure Data Lake. We did not want to use Application Insights, as ours was not a web application and we just needed logs to debug the code.
To solve this, I created a temp.log file:
logging.basicConfig(filename='temp.log',
                    format='%(asctime)s %(levelname)-8s [%(filename)s:%(lineno)d] %(message)s',
                    datefmt='%Y-%m-%d:%H:%M:%S')
At the end of the program, I uploaded temp.log to Azure using DataLakeFileClient.append_data:

local_file = open("temp.log", 'r')
file_contents = local_file.read()
file_client.append_data(data=file_contents, offset=0, length=len(file_contents))
file_client.flush_data(len(file_contents))
https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-directory-file-acl-python
You can just create your own handler. I'll show you how to log to an Azure table; storing in a blob would be similar. The biggest benefit is that you emit each log as you write it, instead of sending all the logs at the end of a process.
Create a table in Azure Table Storage and then run the following code.
import logging
from datetime import datetime
from logging import Logger, getLogger

from azure.core.credentials import AzureSasCredential
from azure.data.tables import TableClient, TableEntity

class _AzureTableHandler(logging.Handler):

    def __init__(self, *args, **kwargs):
        super(_AzureTableHandler, self).__init__(*args, **kwargs)
        credentials: AzureSasCredential = AzureSasCredential(signature=<sas-token>)
        self._table_client: TableClient = TableClient(endpoint=<storage-account-url>, table_name=<table-name>, credential=credentials)

    def emit(self, record):
        level = record.levelname
        message = record.getMessage()
        self._table_client.create_entity(TableEntity({'Severity': level,
                                                      'Message': message,
                                                      'PartitionKey': f'{datetime.now().date()}',
                                                      'RowKey': f'{datetime.now().microsecond}'}))

if __name__ == "__main__":
    logger: Logger = getLogger(__name__)
    logger.addHandler(_AzureTableHandler())
    logger.warning('testing azure logging')
With this approach, you also have the benefit of creating custom columns for your table. For example, you can have separate columns for the name of the project that is logging, or the username of the dev who is running the script:

logger.addHandler(_AzureTableHandler(Username="Hesam", Project="Deployment-support-tools-client-portal"))

Make sure to add your custom column names to the table entity dictionary. Alternatively, you can use the project name as the partition key.
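A stdlib-only sketch of the same pattern, with the Azure calls replaced by an in-memory list so the record-to-entity mapping and the custom-column idea are visible (rows and the column names here are illustrative):

```python
import logging
from datetime import datetime

rows = []  # stands in for the Azure table

class TableHandler(logging.Handler):
    def __init__(self, **extra_columns):
        super().__init__()
        self.extra_columns = extra_columns  # e.g. Username, Project

    def emit(self, record):
        entity = {
            'Severity': record.levelname,
            'Message': record.getMessage(),
            'PartitionKey': f'{datetime.now().date()}',
            'RowKey': f'{datetime.now().microsecond}',
        }
        entity.update(self.extra_columns)   # custom columns ride along
        rows.append(entity)                 # create_entity(...) in the real handler

logger = logging.getLogger('table-demo')
logger.addHandler(TableHandler(Project='demo-project'))
logger.warning('testing azure logging')
print(rows[0]['Severity'], rows[0]['Project'])  # WARNING demo-project
```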

Redirect console logging output to flask-socketio

I log events to the console via the Python logging module.
I also want to send those log messages via Socket.IO (Flask) to a client.
The following approach was only partly successful:
import logging

from flask.ext.socketio import send

fmt_str = '%(asctime)s - %(message)s'
formatter = logging.Formatter(fmt_str)
logging.basicConfig(level=logging.INFO, format=fmt_str)
logger = logging.getLogger("")

class SocketIOHandler(logging.Handler):
    def emit(self, record):
        send(record.getMessage())

sio = SocketIOHandler()
logger.addHandler(sio)
I get the result in the browser, but I still get
RuntimeError: working outside of request context
for each send call on the console. I think the context for the send call is not available... What is a useful way to deal with this problem? Thanks.
The send and emit functions are context-aware functions that only work from inside an event handler. The equivalent context-free functions are available from the socketio instance. Example (making some assumptions about your app that may or may not be true):
from app import socketio  # flask.ext.socketio.SocketIO instance

import logging

fmt_str = '%(asctime)s - %(message)s'
formatter = logging.Formatter(fmt_str)
logging.basicConfig(level=logging.INFO, format=fmt_str)
logger = logging.getLogger("")

class SocketIOHandler(logging.Handler):
    def emit(self, record):
        socketio.send(record.getMessage())

sio = SocketIOHandler()
logger.addHandler(sio)
The first line may need to be adapted to the structure of your application, but with this your application will broadcast logs to all the clients.
Hope this helps!
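To see why routing through the instance works, here is the same handler shape with the transport stubbed out: the handler only needs a context-free callable, and socketio.send plays that role in the real app (broadcast below is a stand-in for it):

```python
import logging

sent = []

def broadcast(message):
    # Stand-in for socketio.send, which needs no request context.
    sent.append(message)

class SocketIOHandler(logging.Handler):
    def emit(self, record):
        broadcast(self.format(record))

logger = logging.getLogger('sio-demo')
handler = SocketIOHandler()
handler.setFormatter(logging.Formatter('%(levelname)s - %(message)s'))
logger.addHandler(handler)
logger.warning('hello clients')
print(sent[0])  # WARNING - hello clients
```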
