Python: flush logging only at end of script run - python

Currently I use for logging a custom logging system that works as follow:
I have a Log class that ressemble the following:
class Log:
def __init__(self):
self.script = ""
self.datetime = datetime.datetime.now().replace(second=0, microsecond=0)
self.mssg = ""
self.mssg_detail = ""
self.err = ""
self.err_detail = ""
I created a function decorator that perform a try/except on the function call, and add a message either to .mssg or .err on the Log object accordingly.
def logging(fun):
#functools.wraps(fun)
def inner(self, *args):
try:
f = fun(self, *args)
self.logger.mssg += fun.__name__ +" :ok, "
return f
except Exception as e:
self.logger.err += fun.__name__ +": error: "+str(e.args)
return inner
So usually a script is a class that is composed of multiple methods that are run sequentially.
I hence run those methods (decorated such as mentionned above) , and lastly I upload the Log object into a mysql db.
This works quite fine and alright. But now I want to modify those items so that they integrate with the "official" logging module of python.
What I dont like about that module is that it is not possible to "save" the messages onto 1 log object in order to upload/save to log only at the end of the run. Rather each logging call will write/send the message to a file etc. - which create lots of performances issues sometimes. I could usehandlers.MemoryHandler , but it still doesn't seems to perform as my original system: it is said to collect messages and flush them to another handler periodically - which is not what i want: I want to collect the messages in memory and to flush them on request with an explicit function.
Anyone has any suggestions?

Here is my idea. Use a handler to capture the log in a StringIO. Then you can grab the StringIO whenever you want. Since there was perhaps some confusion in the discussion thread - StringIO is a "file-like" interface for strings, there isn't ever an actual file involved.
import logging
import io
def initialize_logging(log_level, log_name='default_logname'):
logger = logging.getLogger(log_name)
logger.setLevel(log_level)
log_stream = io.StringIO()
if not logger.handlers:
ch = logging.StreamHandler(log_stream)
ch.setLevel(log_level)
ch.setFormatter(logging.Formatter(
'%(asctime)s - %(name)s - %(levelname)s - %(message)s'
))
logger.addHandler(ch)
logger.propagate = 0
return logger, log_stream
And then something like:
>>> logger, log_stream = initialize_logging(logging.INFO, "logname")
>>> logger.warning("Hello World!")
And when you want the log information:
>>> log_stream.getvalue()
'2017-05-16 16:35:03,501 - logname - WARNING - Hello World!\n'

At program start (in the main), you can:
instanciate your custom logger => global variable/singleton.
register a function at program end which will flush your logger.
Run your decorated functions.
To register a function you can use atexit.register function. See the page Exit handlers in the doc.
EDIT
The idea above can be simplified.
To delay the logging, you can use the standard MemoryHandler handler, described in the page logging.handlers — Logging handlers
Take a look at this GitHub project: https://github.com/tantale/python-ini-cfg-demo
And replace the INI file by this:
[formatters]
keys=default
[formatter_default]
format=%(asctime)s:%(levelname)s:%(message)s
class=logging.Formatter
[handlers]
keys=console, alternate
[handler_console]
class=logging.handlers.MemoryHandler
formatter=default
args=(1024, INFO)
target=alternate
[handler_alternate]
class=logging.StreamHandler
formatter=default
args=()
[loggers]
keys=root
[logger_root]
level=DEBUG
formatter=default
handlers=console
To log to a database table, just replace the alternate handler by your own database handler.
There is some blog/SO questions about that:
You can look at Logging Exceptions To Your SQLAlchemy Database to create a SQLAlchemyHandler
See Store Django log to database if you are using DJango.
EDIT2
Note: ORM generally support "Eager loading", for instance with SqlAlchemy

Related

Prevent one of the logging handlers for specific messages

I monitor my script with the logging module of the Python Standard Library and I send the loggings to both the console with StreamHandler, and to a file with FileHandler.
I would like to have the option to disable a handler for a LogRecord independantly of its severity. For example, for a specific LogRecord I would like to have the option not to send it to the file destination or to the console (with passing a parameter).
I have found that the library has the Filter class for that reason (which is described as a finer grained way to filter blocks), but haven't figured out how to do it.
Any ideas how to do this in a cosistent way?
Finally, it is quite easy. I used a function as a Handler.filer as suggested in the comments.
This is a working example:
from pathlib import Path
import logging
from logging import LogRecord
def build_handler_filters(handler: str):
def handler_filter(record: LogRecord):
if hasattr(record, 'block'):
if record.block == handler:
return False
return True
return handler_filter
ch = logging.StreamHandler()
ch.addFilter(build_handler_filters('console'))
fh = logging.FileHandler(Path('/tmp/test.log'))
fh.addFilter(build_handler_filters('file'))
mylogger = logging.getLogger(__name__)
mylogger.setLevel(logging.DEBUG)
mylogger.addHandler(ch)
mylogger.addHandler(fh)
When the logger is called, the message is sent to both console and output, i.e.
mylogger.info('msg').
To block for example the file the logger should be called with the extra argument like this
mylogger.info('msg only to console', extra={'block': 'file'})
Disabling console is analogous.

How can I decorate python logging output?

I use import logging module for logging inside the AWS lambda with python 3.7 runtime.
I would like to perform certain manipulations on log statements before they are flushed to stdout, e.g. wrap the message as json and add tracing data, so that they would be parseable by Kibana parser.
I don't want to write my own decorator for that because that won't work for underlying dependencies.
Ideally, it should be something like a configured callback for the logger
so that it would do following work for me:
log_statement = {}
log_statement['message'] = 'this is the message'
log_statement['X-B3-TraceId'] = "76b85f5e32ce7b46"
log_statement['level'] = 'INFO'
sys.stdout.write(json.dumps(log_statement) + '\n')
while having still logger.info('this is the message').
How can I do that?
Answering my own question:
I had to use LoggerAdapter that is quite a good fit for the purpose of pre-processing log statements:
import logging
class CustomAdapter(logging.LoggerAdapter):
def process(self, msg, kwargs):
log_statement = '{"X-B3-TraceId":"%s", "message":"%s"}' % (self.extra['X-B3-TraceId'], msg) + '\n'
return log_statement, kwargs
See: https://docs.python.org/3/howto/logging-cookbook.html#using-loggeradapters-to-impart-contextual-information
In general, the next step would be just plugging in the adapter like:
import logging
...
logging.basicConfig(format='%(message)s')
logger = logging.getLogger()
logger.setLevel(LOG_LEVEL)
custom_logger = CustomAdapter(logger, {'X-B3-TraceId': "test"})
...
custom_logger.info("test")
Note: I had to put format as a message only because I need to get the whole statement as a JSON string. Unfortunately, thus I lost some predefined log statement parts, e.g. aws_request_id. This is the limitation of LoggerAdapter#process as it handles only the message part. If anyone has a better approach here, pls suggest.
It appears that AWS lambda python runtime somehow interferes with logging facility and changing the format like above did not work. So I had to do additionally this:
FORMAT = "%(message)s"
logger = logging.getLogger()
for h in logger.handlers:
h.setFormatter(logging.Formatter(FORMAT))
See: https://gist.github.com/niranjv/fb95e716151642e8ca553b0e38dd152e

Create log file named after filename of caller script

I have a logger.py file which initialises logging.
import logging
logger = logging.getLogger(__name__)
def logger_init():
import os
import inspect
global logger
logger.setLevel(logging.DEBUG)
ch = logging.StreamHandler()
ch.setLevel(logging.DEBUG)
logger.addHandler(ch)
fh = logging.FileHandler(os.getcwd() + os.path.basename(__file__) + ".log")
fh.setLevel(level=logging.DEBUG)
logger.addHandler(fh)
return None
logger_init()
I have another script caller.py that calls the logger.
from logger import *
logger.info("test log")
What happens is a log file called logger.log will be created containing the logged messages.
What I want is the name of this log file to be named after the caller script filename. So, in this case, the created log file should have the name caller.log instead.
I am using python 3.7
It is immensely helpful to consolidate logging to one location. I learned this the hard way. It is easier to debug when events are sorted by time and it is thread-safe to log to the same file. There are solutions for multiprocessing logging.
The log format can, then, contain the module name, function name and even line number from where the log call was made. This is invaluable. You can find a list of attributes you can include automatically in a log message here.
Example format:
format='[%(asctime)s] [%(module)s.%(funcName)s] [%(levelname)s] %(message)s
Example log message
[2019-04-03 12:29:48,351] [caller.work_func] [INFO] Completed task 1.
You can get the filename of the main script from the first item in sys.argv, but if you want to get the caller module not the main script, check the answers on this question.

Custom extra data in python log with multiple modules

I'm trying to implement a logger in python that involves many modules. The core functionality involves a gRPC microservice which takes a payload, generates an id and does some optimization.
When multiple requests come in at once, the log can get jumbled up. What I would like to do is tag the logger lines with a the run_id so that the run_id prints in the log for each function call made in the parent and all imported modules.
I tried this:
logger = logging.getLogger("myservice")
log_file = "logs/solver.log"
formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(log_id)s - %(name)s - %(message)s", datefmt="%y/%m/%d %H:%M:%S") #: log_id is custom data
log_data = {'log_id': None}
concise_log = logging.FileHandler(log_file) #: file handler for concise log
concise_log.setLevel(logging.INFO)
concise_log.setFormatter(formatter)
#: Add the logging hanlders to the logger
logger.addHandler(concise_log)
logger.addHandler(detailed_log)
logger.info(f"log_file: {log_file}", extra=log_data)
logger.info("LOG START", extra=log_data)
logger.debug("Debug level messages are being recorded.", extra=log_data)
I used a dummy value for log_data initially because the logger will error out otherwise.
Later on I set the log_id and log various things like so:
log_id = str(uuid.uuid4())
log_data = {'log_id': log_id} #: extra data for logger
#: load the server config:
logger.info("loading server config...", extra=log_data)
This works fine, but if I make calls to my logger in any imported function, it will error out because those functions don't have log_data.
What I would like to be able to do is set the log_id in a more global way so that the logger always has the right log_id in the imported modules, which will change from request to request.
Is there another pattern that I should be using to accomplish this?
This is best done with a LoggerAdapter. The official documentation has an excellent example that is very close to what you are trying to do. The code will approximately look like this:
class LogIdAdapter(logging.LoggerAdapter):
def process(self, msg, kwargs):
return '[%s] %s' % (self.extra['log_id'], msg), kwargs
tmp_logger = logging.getLogger("myservice")
# ... set up handlers etc here. do not include log_id in formatter, it will be added later
logger = LogIdAdapter(tmp_logger, {'log_id': str(uuid.uuid4())})
logger.info("some text") # this log will have log_id prepended

Multiple streamhandlers

I am trying to beef up the logging in my Python scripts and I am grateful if could share best practices with me. For now I have created this little script (I should say that I run Python 3.4)
import logging
import io
import sys
def Streamhandler(stream, level, format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"):
ch = logging.StreamHandler(stream)
ch.setLevel(level)
formatter = logging.Formatter(format)
ch.setFormatter(formatter)
return ch
# get the root logger
logger = logging.getLogger()
stream = io.StringIO()
logger.addHandler(Streamhandler(stream, logging.WARN))
stream_error = io.StringIO()
logger.addHandler(Streamhandler(stream_error, logging.ERROR))
logger.addHandler(Streamhandler(stream=sys.stdout, level=logging.DEBUG))
print(logger)
for h in logger.handlers:
print(h)
print(h.level)
# 'application' code # goes to the root logger!
logging.debug('debug message')
logging.info('info message')
logging.warning('warn message')
logging.error('error message')
logging.critical('critical message')
print(stream.getvalue())
print(stream_error.getvalue())
I have three handlers, 2 of them write into a io.StringIO (this seems to work). I need this to simplify testing but also to send logs via a HTTP email service. And then there is a Streamhandler for the console. However, logging.debug and logging.info messages are ignored on the console here despite setting the level explicitly low enough?!
First, you didn't set the level on the logger itself:
logger.setLevel(logging.DEBUG)
Also, you define a logger but do your calls on logging - which will call on the root logger. Not that it will make any difference in your case since you didn't specify a name for your logger, so logging.getLogger() returns the root logger.
wrt/ "best practices", it really depends on how "complex" your scripts are and of course on your logging needs.
For self-contained simple scripts with simple use cases (single known environment, no concurrent execution, simple logging to a file or stderr etc), a simple call to logging.basicConfig() and direct calls to logging.whatever() are usually good enough.
For anything more complex, it's better to use a distinct config file - either in ini format or as Python dict (using logging.dictConfig), split your script into distinct module(s) or package(s) each defining it's own named logger (with logger = logging.getLogger(__name__)) and only keep your script itself as the "runner" for your code, ie: configure logging, import modules, parse command line args and call the main function - preferably in a try/except block as to properly log any unhandled exception before crashing.
A logger has a threshold level too, you need to set it to DEBUG first:
logger.setLevel(logging.DEBUG)

Categories

Resources