I am using Python and I was wondering if there is any package/simple way for logging directly to Azure?
I found a package (azure-storage-logging) that would be really nice, however it is not being maintained and not compatible with the new Azure API.
Any help is welcome.
You should use Application Insights which will send the logs to Azure Monitor (previously Log Analytics).
https://learn.microsoft.com/en-us/azure/azure-monitor/app/opencensus-python
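The linked page boils down to attaching OpenCensus's AzureLogHandler (from the opencensus-ext-azure package) to a standard Python logger. A minimal sketch, assuming you have an Application Insights resource and its connection string:

import logging
from opencensus.ext.azure.log_exporter import AzureLogHandler

logger = logging.getLogger(__name__)
# <your-instrumentation-key> is a placeholder for your Application Insights key
logger.addHandler(AzureLogHandler(connection_string='InstrumentationKey=<your-instrumentation-key>'))
logger.warning('Hello from Python, sent to Azure Monitor')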
I had the same requirement to log error and debug messages for a small application and store the logs in Azure Data Lake. We did not want to use Application Insights because ours was not a web application and we just needed logs to debug the code.
To solve this I first wrote the logs to a local temp.log file:

import logging

logging.basicConfig(filename='temp.log',
                    format='%(asctime)s %(levelname)-8s [%(filename)s:%(lineno)d] %(message)s',
                    datefmt='%Y-%m-%d:%H:%M:%S')
At the end of the program I uploaded temp.log to Azure with DataLakeFileClient.append_data:

with open("temp.log", 'r') as local_file:
    file_contents = local_file.read()

file_client.append_data(data=file_contents, offset=0, length=len(file_contents))
file_client.flush_data(len(file_contents))
https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-directory-file-acl-python
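For completeness, here is a minimal sketch of how the file_client used above could be created with azure-storage-file-datalake; the account URL, credential, file system and path are placeholders to replace with your own:

from azure.storage.filedatalake import DataLakeServiceClient

service_client = DataLakeServiceClient(
    account_url='https://<storage-account>.dfs.core.windows.net',
    credential='<account-key-or-sas-token>')
file_system_client = service_client.get_file_system_client(file_system='<container>')
file_client = file_system_client.get_file_client('logs/temp.log')
file_client.create_file()  # create (or overwrite) the remote file before appending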
You can just create your own handler. I will show you how to log to an Azure table; storing in a blob is similar. The biggest benefit is that each record is emitted as you log it, instead of sending all the logs at the end of the process.
Create a table in Azure Table Storage and then run the following code.
import logging
from datetime import datetime
from logging import Logger, getLogger

from azure.core.credentials import AzureSasCredential
from azure.data.tables import TableClient, TableEntity


class _AzureTableHandler(logging.Handler):
    def __init__(self, *args, **kwargs):
        super(_AzureTableHandler, self).__init__(*args, **kwargs)
        credentials: AzureSasCredential = AzureSasCredential(signature=<sas-token>)
        self._table_client: TableClient = TableClient(endpoint=<storage-account-url>, table_name=<table-name>, credential=credentials)

    def emit(self, record):
        level = record.levelname
        message = record.getMessage()
        self._table_client.create_entity(TableEntity({'Severity': level,
                                                      'Message': message,
                                                      'PartitionKey': f'{datetime.now().date()}',
                                                      'RowKey': f'{datetime.now().microsecond}'}))


if __name__ == "__main__":
    logger: Logger = getLogger(__name__)
    logger.addHandler(_AzureTableHandler())
    logger.warning('testing azure logging')
In this approach, you also get the benefit of creating custom columns for your table. For example, you can have separate columns for the name of the project that is logging, or the username of the developer running the script.
logger.addHandler(_AzureTableHandler(Username="Hesam", Project="Deployment-support-tools-client-portal"))
Make sure to add your custom column names to the table entity dictionary, or you can use the project name as the partition key. A rough sketch of wiring this through the handler is shown below.
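This is only an illustrative sketch (the Username and Project keyword arguments are not part of any library API, and the table client setup is the same as in the handler above):

class _AzureTableHandler(logging.Handler):
    def __init__(self, *args, Username: str = '', Project: str = '', **kwargs):
        super().__init__(*args, **kwargs)
        self._extra_columns = {'Username': Username, 'Project': Project}
        # table client setup omitted; identical to the handler shown earlier

    def emit(self, record):
        entity = {'Severity': record.levelname,
                  'Message': record.getMessage(),
                  'PartitionKey': f'{datetime.now().date()}',
                  'RowKey': f'{datetime.now().microsecond}',
                  **self._extra_columns}
        self._table_client.create_entity(entity)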
Related
When I receive input such as a URL, I want to log it in Cloud Run so I can see my input, using Python.
You can log the input URL in Python using the built-in logging module. For example:
import logging

def log_url(url):
    logging.basicConfig(filename='url.log', level=logging.INFO)
    logging.info('Received URL: %s', url)

if __name__ == '__main__':
    url = 'https://www.example.com'
    log_url(url)
This code will log the input URL to a file named url.log in the same directory as the script. The logging level is set to logging.INFO, so messages at INFO level and above will be logged. You can adjust the logging level as needed to control the verbosity of the log output.
To log the input URL in the cloud, you will need to configure a logging handler that writes to a cloud-based log service. The specific service depends on the cloud provider you are using. For example, on Google Cloud (including Cloud Run) you could use Cloud Logging (formerly Stackdriver Logging) to store and manage your logs; a minimal sketch follows.
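As a rough sketch, assuming the google-cloud-logging library is installed and the Cloud Run service account has permission to write logs, the same standard-library calls can be routed to Cloud Logging:

import logging
import google.cloud.logging

client = google.cloud.logging.Client()
client.setup_logging()  # attaches a Cloud Logging handler to the root logger

logging.info('Received URL: %s', 'https://www.example.com')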
I am trying to send logs (with jsonPayload) to GCP using StructuredLogHandler with the Python code below.
import logging

import google.cloud.logging
from google.cloud.logging.handlers import StructuredLogHandler

rootlogger = logging.getLogger()
client = google.cloud.logging.Client(credentials=xxx, project=xxx)
h = StructuredLogHandler()
rootlogger.addHandler(h)
logger = logging.getLogger('test')
logger.warning('warning')
I can see that the logs are printed to the console (in JSON format), but they are not sent to GCP Log Explorer. Can someone help?
Since v3 of the Python Cloud Logging library it's easier than ever, as it integrates with the Python standard logging module via client.setup_logging:
import logging
import google.cloud.logging
# Instantiate a client
client = google.cloud.logging.Client()
# Retrieves a Cloud Logging handler based on the environment
# you're running in and integrates the handler with the
# Python logging module. By default this captures all logs
# at INFO level and higher
client.setup_logging()
Name your logger as usual. E.g.
logger = logging.getLogger('test')
Then, if you want to send structured log messages to Cloud Logging, you can use one of two methods:
Use the json_fields extra argument:
data_dict = {"hello": "world"}
logging.info("message field", extra={"json_fields": data_dict})
Use a JSON-parseable string (requires importing the json module):
import json
data_dict = {"hello": "world"}
logging.info(json.dumps(data_dict))
This will send your log messages to Google Cloud, with the JSON payload available under the jsonPayload field of the expanded log entry.
What we are trying:
We are trying to run a Cloud Run job that does some computation and also uses one of our custom packages for that computation. The Cloud Run job uses google-cloud-logging and Python's default logging package as described here. The custom Python package also logs its data (only a logger is defined, as suggested here).
Simple illustration:
from google.cloud import logging as gcp_logging
import logging
import os
import google.auth
from our_package import do_something

def log_test_function():
    SCOPES = ["https://www.googleapis.com/auth/cloud-platform"]
    credentials, project_id = google.auth.default(scopes=SCOPES)
    try:
        function_logger_name = os.getenv("FUNCTION_LOGGER_NAME")
        logging_client = gcp_logging.Client(credentials=credentials, project=project_id)
        logging.basicConfig()
        logger = logging.getLogger(function_logger_name)
        logger.setLevel(logging.INFO)
        logging_client.setup_logging(log_level=logging.INFO)
        logger.critical("Critical Log TEST")
        logger.error("Error Log TEST")
        logger.info("Info Log TEST")
        logger.debug("Debug Log TEST")
        result = do_something()
        logger.info(result)
    except Exception as e:
        print(e)  # just to test how print works
    return "Returned"

if __name__ == "__main__":
    result = log_test_function()
    print(result)
Cloud Run Job Logs
The blue box indicates logs from the custom package.
The black box indicates logs from the Cloud Run job.
Cloud Logging is not able to identify the severity of the logs; it parses every log entry at the default level. But if I run the same code in a Cloud Function, it works as expected (i.e. the severity level of logs from both the Cloud Function and the custom package is respected), as shown in the image below.
Cloud Function Logs
Both are serverless architectures, so why does it work in Cloud Functions but not in Cloud Run?
What we want to do:
We want to log every message from the Cloud Run job and the custom package to Cloud Logging with the correct severity.
We would appreciate your help!
Edit 1
Following the Google Cloud Python library committers' solution almost solved the problem. The modified code follows.
from google.cloud import logging as gcp_logging
import logging
import os
import google.auth
from our_package import do_something
from google.cloud.logging.handlers import CloudLoggingHandler
from google.cloud.logging_v2.handlers import setup_logging
from google.cloud.logging_v2.resource import Resource
from google.cloud.logging_v2.handlers._monitored_resources import retrieve_metadata_server, _REGION_ID, _PROJECT_NAME

def log_test_function():
    SCOPES = ["https://www.googleapis.com/auth/cloud-platform"]
    region = retrieve_metadata_server(_REGION_ID)
    project = retrieve_metadata_server(_PROJECT_NAME)
    try:
        function_logger_name = os.getenv("FUNCTION_LOGGER_NAME")
        # build a manual resource object
        cr_job_resource = Resource(
            type="cloud_run_job",
            labels={
                "job_name": os.environ.get('CLOUD_RUN_JOB', 'unknownJobId'),
                "location": region.split("/")[-1] if region else "",
                "project_id": project
            }
        )
        logging_client = gcp_logging.Client()
        gcloud_logging_handler = CloudLoggingHandler(logging_client, resource=cr_job_resource)
        setup_logging(gcloud_logging_handler, log_level=logging.INFO)
        logging.basicConfig()
        logger = logging.getLogger(function_logger_name)
        logger.setLevel(logging.INFO)
        logger.critical("Critical Log TEST")
        logger.error("Error Log TEST")
        logger.warning("Warning Log TEST")
        logger.info("Info Log TEST")
        logger.debug("Debug Log TEST")
        result = do_something()
        logger.info(result)
    except Exception as e:
        print(e)  # just to test how print works
    return "Returned"

if __name__ == "__main__":
    result = log_test_function()
    print(result)
Now every log is written twice: one entry with the correct severity and a second, severity-insensitive entry at the "Default" level, as shown below.
Cloud Functions takes your code, wraps it in a webserver, builds a container and deploys it.
With Cloud Run, you build and deploy the container yourself.
That means the Cloud Functions webserver wrapper does one thing more than you do: it initializes the Python logger correctly.
Have a look at that doc page; it should solve your issue.
EDIT 1
I took the exact example and added it to my Flask server like this:
import os

from flask import Flask

app = Flask(__name__)

import google.cloud.logging

# Instantiates a client
client = google.cloud.logging.Client()

# Retrieves a Cloud Logging handler based on the environment
# you're running in and integrates the handler with the
# Python logging module. By default this captures all logs
# at INFO level and higher
client.setup_logging()
# [END logging_handler_setup]

# [START logging_handler_usage]
# Imports Python standard library logging
import logging

@app.route('/')
def call_function():
    # The data to log
    text = "Hello, world!"

    # Emits the data using the standard logging module
    logging.warning(text)
    # [END logging_handler_usage]

    print("Logged: {}".format(text))
    return text

# For local execution
if __name__ == "__main__":
    app.run(host='0.0.0.0', port=int(os.environ.get('PORT', 8080)))
This is a rough copy of that sample.
And the result is correct: a "Logged: ..." entry at the default level (the print), and my "Hello, world!" at warning severity (the logging.warning).
EDIT 2
Thanks to the help of the Google Cloud Python library committers, I got a solution to my issue while waiting for native integration in the library.
Here is my new code, this time for Cloud Run jobs:
import google.cloud.logging
from google.cloud.logging.handlers import CloudLoggingHandler
from google.cloud.logging_v2.handlers import setup_logging
from google.cloud.logging_v2.resource import Resource
from google.cloud.logging_v2.handlers._monitored_resources import retrieve_metadata_server, _REGION_ID, _PROJECT_NAME
import os

# find metadata about the execution environment
region = retrieve_metadata_server(_REGION_ID)
project = retrieve_metadata_server(_PROJECT_NAME)

# build a manual resource object
cr_job_resource = Resource(
    type="cloud_run_job",
    labels={
        "job_name": os.environ.get('CLOUD_RUN_JOB', 'unknownJobId'),
        "location": region.split("/")[-1] if region else "",
        "project_id": project
    }
)

# configure handling using CloudLoggingHandler with custom resource
client = google.cloud.logging.Client()
handler = CloudLoggingHandler(client, resource=cr_job_resource)
setup_logging(handler)

import logging

def call_function():
    # The data to log
    text = "Hello, world!"

    # Emits the data using the standard logging module
    logging.warning(text)
    # [END logging_handler_usage]

    print("Logged: {}".format(text))
    return text

# For local execution
if __name__ == "__main__":
    call_function()
And the result works great:
The "Logged: ..." entry is the simple print to stdout, at the Default level.
The warning entry appears with the expected severity.
You might want to get rid of the loggers you don't need; take a look at https://stackoverflow.com/a/61602361/13161301. A rough sketch of the idea is shown below.
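This is only an illustrative sketch, not the exact content of the linked answer: the duplicated "Default"-level entries usually come from an extra stream handler on the root logger writing to stdout, which the Cloud Run agent re-ingests, so dropping every handler except the Cloud Logging one avoids the double entries:

import logging
from google.cloud.logging.handlers import CloudLoggingHandler

root = logging.getLogger()
for existing_handler in list(root.handlers):
    # keep only the Cloud Logging handler; drop stdout/stderr stream handlers
    if not isinstance(existing_handler, CloudLoggingHandler):
        root.removeHandler(existing_handler)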
I was wondering whether Cloud Functions has restrictions on exporting logs from Python out of Google Cloud Platform for long messages (over 1500 characters).
Papertrail documents several ways to send log messages from Python; the one I tried is below. It worked for sending messages to Papertrail from Google Compute Engine, but did not work on Google Cloud Functions, where the messages instead end up in the Google Cloud Operations Logging service.
import logging
import socket
from logging.handlers import SysLogHandler

class ContextFilter(logging.Filter):
    hostname = socket.gethostname()

    def filter(self, record):
        record.hostname = ContextFilter.hostname
        return True

syslog = SysLogHandler(address=('logsN.papertrailapp.com', XXXXX))
syslog.addFilter(ContextFilter())

format = '%(asctime)s %(hostname)s YOUR_APP: %(message)s'
formatter = logging.Formatter(format, datefmt='%b %d %H:%M:%S')
syslog.setFormatter(formatter)

logger = logging.getLogger()
logger.addHandler(syslog)
logger.setLevel(logging.INFO)
logger.info("This is a message")
In particular, it seems that for the above code, the address I set on the handler does not work in Google Cloud Functions if the message I pass is too long.
syslog = SysLogHandler(address=('logsN.papertrailapp.com', XXXXX))
Is this normal, or does my Cloud Function have some issue?
Is there a way to pass an address to my Python logging within Google Cloud Functions that is compatible with long messages?
I am creating an application which can be run locally or on Google Cloud. To set up Google Cloud logging I've used the Google Cloud Logging library, made a cloud logger, and basically log using the class below.
class CloudLogger():
    def __init__(self, instance_id: str = LOGGER_INSTANCE_ID, instance_zone: str = LOGGER_INSTANCE_ZONE) -> None:
        self.instance_id = instance_id
        self.instance_zone = instance_zone
        self.cred = service_account.Credentials.from_service_account_file(CREDENTIAL_FILE)
        self.client = gcp_logging.Client(project=PROJECT, credentials=self.cred)
        self.res = Resource(type="gce_instance",
                            labels={
                                "instance_id": self.instance_id,
                                "zone": self.instance_zone
                            })
        self.hdlr = CloudLoggingHandler(self.client, resource=self.res)
        self.logger = logging.getLogger('my_gcp_logger')
        self.hdlr.setFormatter(logging.Formatter('%(message)s'))
        if not self.logger.handlers:
            self.logger.addHandler(self.hdlr)
            self.logger.setLevel(logging.INFO)

    def info(self, log_this):
        self.logger.setLevel(logging.INFO)
        self.logger.info(log_this)
I want this so that if it is running on the cloud it uses the GCP logger, and if it is run locally it uses plain Python logging. I can either pass that in as an argument ("Cloud", "Local") or make it intelligent enough to figure it out on its own, but I want the underlying logic to be the same so that I can log to cloud or local seamlessly. How would I go about doing this?
I am wondering if there is (maybe) some way to create a local logger and have those local logs forwarded to GCP when running on the cloud.
I ran into this issue myself, and Google is aware of it (for Go as well). My helper does the following:
I make a request to the metadata server: a GET to http://metadata.google.internal/computeMetadata/v1/project/ with the header Metadata-Flavor: Google.
If I get a 404, I'm running locally and I set up my logger as a local one.
If I get a 2XX, I'm on GCP and I set up the logger to use the FluentD format (GCP Cloud Logging).
Not perfect, but enough for me!
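A minimal sketch of that check, assuming the requests library is available; the function name is illustrative, and a connection failure is treated the same as a 404 (i.e. running locally):

import requests

def running_on_gcp() -> bool:
    # The metadata server only exists inside GCP environments.
    try:
        response = requests.get(
            'http://metadata.google.internal/computeMetadata/v1/project/',
            headers={'Metadata-Flavor': 'Google'},
            timeout=1,
        )
        return response.status_code == 200
    except requests.exceptions.RequestException:
        return False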