Passing extra context for Tornado log entries - Python

Our log format includes a user-defined field. It is defined as:
'%(asctime)s|%(levelname)s|%(name)s|REQID:%(req_id)s|%(module)s:%(lineno)s|%(message)s'
where req_id is a request id generated by our application code for every request. While processing a request, our application code has access to this req_id and uses it for logging like this:
logger = logging.LoggerAdapter(logging.getLogger(service_name), {'req_id': req_id})
logger.debug('A debug message')
I am trying to make the Tornado loggers conform to our log format, but since Tornado has no access to our application-level req_id, it fails with:
KeyError: 'req_id'
How can I tell Tornado to use a LoggerAdapter for tornado.access with user-provided context?
EDIT
As a workaround, I tried the following:
Since I could not tell Tornado which loggers to use, I tried to hack around this limitation by reconfiguring the Tornado logger on each request, injecting the contextual information with a logging filter.
Unfortunately, reconfiguring the logger for each request does not work: Tornado serves requests in parallel, so per-request reconfiguration leaves the shared logger in an inconsistent state.
How can we pass user context for the tornado loggers then?

The tornado.access log can be controlled by overriding Application.log_request in a subclass or by passing the log_function application setting. log_request defaults to writing to the tornado.access logger, but you can override it to log however you want.
Note, however, that the tornado.general and tornado.application loggers cannot be redirected in this way, so your log formatters/filters must still be able to handle records that do not have the req_id field.
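As a minimal sketch, assuming your application stores its generated request id on the handler (handler.req_id here is a hypothetical attribute; attach the id wherever your code creates it), the override could wrap tornado.access in the same LoggerAdapter you already use:

import logging

import tornado.web

access_log = logging.getLogger("tornado.access")

class MyApplication(tornado.web.Application):
    def log_request(self, handler):
        # handler.req_id is hypothetical: attach your generated request id
        # to the handler when your application creates it.
        req_id = getattr(handler, "req_id", "-")
        logger = logging.LoggerAdapter(access_log, {"req_id": req_id})

        # Mirror the default behaviour: the status code picks the log level.
        status = handler.get_status()
        if status < 400:
            log_method = logger.info
        elif status < 500:
            log_method = logger.warning
        else:
            log_method = logger.error

        request_time = 1000.0 * handler.request.request_time()
        log_method("%d %s %s %.2fms", status, handler.request.method,
                   handler.request.uri, request_time)

If you prefer not to subclass, an equivalent function taking the handler can be passed as the log_function application setting instead.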

Related

How to use Syslog in Django REST framework?

I'm developing a Python Django REST API. I need to add a logger that sends output to syslog.
Do I just need to define the logger and the syslog handler in settings.py?
I'm relatively new to Django and to the syslog protocol, so I'd appreciate any help, including which Python modules to use.
Yes, you can, for example, use the standard library's syslog module to log messages.
Using it is pretty easy:
import syslog
# create syslog handle
syslog.openlog(ident="settings.py", facility=syslog.LOG_LOCAL0)
# log some message
syslog.syslog(syslog.LOG_INFO, "Some log message.")
The log message will not only get written to /var/log/local0.log; depending on the standard configuration of your /etc/rsyslog.conf file, it will probably also get written to /var/log/syslog or /var/log/messages.
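Since you are in Django, though, the more idiomatic route is to wire the standard library's logging.handlers.SysLogHandler into the LOGGING setting in settings.py. A minimal sketch, assuming a Linux host with the usual /dev/log socket (the logger name "myapp" is a placeholder for your own app):

# settings.py
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "syslog": {
            "class": "logging.handlers.SysLogHandler",
            "address": "/dev/log",  # the local syslog socket on most Linux systems
            "facility": "local0",
        },
    },
    "loggers": {
        "myapp": {  # placeholder: use your application's logger name
            "handlers": ["syslog"],
            "level": "INFO",
        },
    },
}

Then logging.getLogger("myapp").info("Some log message.") in your views ends up in syslog.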

Python - is it possible to get logging.handlers.HTTPHandler to set the content type header to JSON without resorting to custom code?

I'd like to use Python's logging.handlers.HTTPHandler to send log events with POST, with the events formatted as JSON. However, I don't want to resort to writing extensions, so the configuration should be achievable either through a config file or in code. I wrote the following code to configure the logger and send a test message. The event is received, but the content type doesn't appear to be set to JSON:
import logging, logging.handlers
testHandler = logging.handlers.HTTPHandler('localhost:18080', '/test', method='POST')
log = logging.getLogger("me")
log.addHandler(testHandler)
log.warning('{"beep":"beep"}')
Have I missed a configuration?
This is not possible with the stock handler, especially when you use the POST method: HTTPHandler is hardcoded to set the content type header to application/x-www-form-urlencoded for all POST requests.
The best you can do is subclass HTTPHandler and implement a custom version of its emit() method.
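A minimal sketch of such a subclass, assuming the receiver accepts the record's attribute dict serialized as a JSON object (it ignores the secure/credentials options for brevity; adapt the payload to whatever schema your endpoint expects):

import json
import http.client
import logging.handlers

class JSONHTTPHandler(logging.handlers.HTTPHandler):
    """HTTPHandler variant that POSTs each record as a JSON body."""

    def emit(self, record):
        try:
            # mapLogRecord() returns the record's attribute dict by default;
            # default=str keeps non-serializable values from raising.
            payload = json.dumps(self.mapLogRecord(record), default=str)
            conn = http.client.HTTPConnection(self.host)
            conn.request("POST", self.url, body=payload,
                         headers={"Content-Type": "application/json"})
            conn.getresponse()
            conn.close()
        except Exception:
            self.handleError(record)

It is a drop-in replacement for the handler in the question: testHandler = JSONHTTPHandler('localhost:18080', '/test', method='POST').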

Logging on either GCP or local

Suppose there is a system that runs on GCP but, as a backup, can also run locally.
When running on the cloud, Stackdriver logging is pretty straightforward.
However, I need my system to push logs to Stackdriver when it is on the cloud and to use the local Python logger otherwise.
I also don't want to include any logic to choose between the two; it should happen automatically:
When logging, log straight to the Python/local logger.
If on GCP, also push the entries to Stackdriver.
I can write logic that implements this, but that is bad practice. There surely is a direct way of getting this to work.
Example
import logging
import google.cloud.logging

client = google.cloud.logging.Client()
client.setup_logging()  # attaches the Stackdriver handler to the root logger

cl = logging.getLogger()
file_handler = logging.FileHandler('file.log')
cl.addHandler(file_handler)

logging.info("INFO!")
This will log to the Python logger and then 'always' upload to the cloud logger. How can I avoid explicitly adding import google.cloud.logging, so that if Stackdriver is installed the logs go there directly? Is that even possible? If not, can someone explain how this would be handled from a best-practices perspective?
Attempt 1 [works]
Created /etc/google-fluentd/config.d/workflow_log.conf
<source>
    @type tail
    format none
    path /var/log/this_log.log
    pos_file /var/lib/google-fluentd/pos/this_log.pos
    read_from_head true
    tag workflow-log
</source>
Created /var/log/this_log.log
The pos_file /var/lib/google-fluentd/pos/this_log.pos exists
import logging

cl = logging.getLogger()
cl.setLevel(logging.INFO)  # the root logger defaults to WARNING, which would drop info_log
file_handler = logging.FileHandler('/var/log/this_log.log')
file_handler.setFormatter(logging.Formatter("%(asctime)s;%(levelname)s;%(message)s"))
cl.addHandler(file_handler)

logging.info("info_log")
logging.error("error_log")
This works! Look for your logs under the specific VM instance rather than under global > python.
Fortunately, this is a story that is already handled. Stackdriver Logging is a very versatile logging framework, and there are a variety of logging APIs; Google's intent was not that you rewrite all your existing applications against Stackdriver's native APIs. Instead, you can use a logging API of your choice (including standard and de facto APIs), and these logging APIs then map to Stackdriver. If you execute outside a GCP environment, or simply wish to switch to an alternate log collector, your applications do not have to be re-coded or recompiled.
A list of the logging APIs available for different languages can be found at Setting Up Language Runtimes and this includes Setting Up Stackdriver Logging for Python.
For Python, at runtime you would have a configuration property (e.g. an environment variable) that declares whether or not you wish to use Stackdriver. If, and only if, it is set to true would you execute the logic that sets up the native Python logging for Stackdriver; otherwise that logic is never called, and you have no dependency on Stackdriver.
A possible piece of code might be:
import os

if os.environ.get('USE_STACKDRIVER') == 'true':
    import google.cloud.logging

    client = google.cloud.logging.Client()
    client.setup_logging()
You do not need to specifically enable or use Stackdriver in your program. You can use the Python logger and write to any file you want. However, Stackdriver only ingests the log files it is told about, which means you need to manually configure the agent to pick up "your" log files.
In your example, you are writing to file.log. Modify /etc/google-fluentd/config.d/mylogfile.conf to include the following. You need to specify the full path for file.log, not just the file name; in this example I named it /var/log/mylogfile.log. This example also assumes that your logs start each line with a date.
<source>
    @type tail
    # Parse the timestamp, but still collect the entire line as 'message'
    format /^(?<message>(?<time>[^ ]*\s*[^ ]* [^ ]*) .*)$/
    path /var/log/mylogfile.log
    pos_file /var/lib/google-fluentd/pos/mylogfile.log.pos
    read_from_head true
    tag auth
</source>
For more information read the following document:
Stackdriver - Configuring the Agent
Now your program can run outside GCP, and when it runs on a configured instance, its logs will reach Stackdriver.
Note: I would do the opposite of what you asked. I would always use Stackdriver. When not running in GCP, I would manually set up the Stackdriver agent on my desktop, local server, etc., and continue to log to Stackdriver.

Using Flask and native Python logging?

The trouble with Flask logging (i.e., app.logger.info(...), etc.) is that submodules don't use it, so it seems to me that the only way to globally configure the app's logging is via the underlying Python logging mechanism, e.g.:
logging.config.fileConfig('config/logging.conf')
But this doesn't configure Flask's default handlers, so, for example, the HTTP logging is not configured, and I get mixed logs:
2015-07-28 14:57:47,320 [main.py][INFO] Starting... # Python log
[2015-07-28 14:58:49] "GET /demo HTTP/1.1" 200 956 0.318825 # Flask log
Can anyone suggest the standard practice for globally configuring logging via a configuration file (i.e., so that I can easily configure individual packages)?
ALSO, I can't seem to remove the HTTP logging (GET, etc.) to stderr (which I presume comes from werkzeug); setting logging.getLogger('werkzeug').setLevel() only affects the werkzeug logs that are not related to HTTP logging.
Thanks.
I'm pretty sure there is a section about this in the Flask docs: https://flask.palletsprojects.com/en/1.1.x/logging/#other-libraries
It basically suggests adding Flask's default handler (and any other handlers you use) to the root logger, so that records from other libraries are handled the same way. This is the example they give:
import logging

from flask.logging import default_handler

root = logging.getLogger()
root.addHandler(default_handler)
root.addHandler(mail_handler)  # mail_handler is defined earlier in the docs' example
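Putting that together with the file-based configuration from the question, a minimal sketch (config/logging.conf is the path from the question; the docs' mail_handler is omitted):

import logging
import logging.config

from flask import Flask
from flask.logging import default_handler

# Configure all packages' loggers from one file, keeping loggers that
# were created before this call (Flask's among them) enabled.
logging.config.fileConfig('config/logging.conf',
                          disable_existing_loggers=False)

app = Flask(__name__)

# Route Flask's own records through the same root configuration.
logging.getLogger().addHandler(default_handler)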

Django - Sending error report email in try-except code

Is it possible to hook into Django's built-in error-reporting emails from a try-except block? In other words, to email the default error report and stack trace to the ADMINS/MANAGERS while still having situation-specific error handling?
Specific example:
In a project performing complex calculations and generating large reports, the view displaying the report page does all the calculations and generates a long HTML page with lots of pretty tables and graphs, and it also generates downloadable PDFs from sections of that same HTML.
Recently we had errors in the PDF generation caused by issues with storage on S3. This is obviously an error we need to track down and attend to, but most users are happy as long as they can see the report on screen. If the PDF download links simply weren't displayed, the issue could go entirely unnoticed for hours or even days, yet the dev team should still be notified.
Ideally, but not necessarily, I would love a solution that is logger-agnostic: it would use whatever error logger is configured, trigger the default 500 error handler's report, and then return to the finally block or to the code after the except block.
All you need to do is use Python's logging framework to emit a message at the appropriate level. In your settings.py there is a LOGGING variable that defines how things are logged. By default, I believe Django routes any ERROR in django.request to the mail_admins handler.
So in your code, all you need to do is:

import logging

logger = logging.getLogger(__name__)  # a logger whose name is the module's dotted path

try:
    ...  # do the stuff you want to catch
except Exception:
    # catch the exception and just log it;
    # exc_info=True includes the stack trace in the report
    logger.error('Some error title', exc_info=True)
finally:
    ...  # whatever you want to do in your finally block
Note that this swallows the exception and won't bubble it up, so the response will come back as a 200. If you want to bubble up the exception, just call raise in your except block. However, if all you care about is logging the error while the view remains functional, just log it and swallow it.
In your LOGGING variable you can add entries under loggers for different logger names. You can have an app log at a different level, say INFO, if you want to debug a certain code path. As long as you create a logger with the module name, you have a lot of flexibility in routing your logging to different handlers, such as mail_admins.
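For instance, a minimal sketch of a LOGGING entry that routes your own app's errors to the same AdminEmailHandler Django uses for django.request (the logger name "myapp" is a placeholder for your package):

# settings.py
LOGGING = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "mail_admins": {
            "level": "ERROR",
            "class": "django.utils.log.AdminEmailHandler",
        },
    },
    "loggers": {
        "myapp": {  # placeholder: matches logging.getLogger(__name__) in your package
            "handlers": ["mail_admins"],
            "level": "ERROR",
        },
    },
}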
Lastly, I'd recommend looking into Sentry, as it's a really great error-logging tool.
