Custom extra data in python log with multiple modules - python

I'm trying to implement a logger in python that involves many modules. The core functionality involves a gRPC microservice which takes a payload, generates an id and does some optimization.
When multiple requests come in at once, the log can get jumbled up. What I would like to do is tag the logger lines with a the run_id so that the run_id prints in the log for each function call made in the parent and all imported modules.
I tried this:
logger = logging.getLogger("myservice")
log_file = "logs/solver.log"
formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(log_id)s - %(name)s - %(message)s", datefmt="%y/%m/%d %H:%M:%S") #: log_id is custom data
log_data = {'log_id': None}
concise_log = logging.FileHandler(log_file) #: file handler for concise log
concise_log.setLevel(logging.INFO)
concise_log.setFormatter(formatter)
#: Add the logging hanlders to the logger
logger.addHandler(concise_log)
logger.addHandler(detailed_log)
logger.info(f"log_file: {log_file}", extra=log_data)
logger.info("LOG START", extra=log_data)
logger.debug("Debug level messages are being recorded.", extra=log_data)
I used a dummy value for log_data initially because the logger will error out otherwise.
Later on I set the log_id and log various things like so:
log_id = str(uuid.uuid4())
log_data = {'log_id': log_id} #: extra data for logger
#: load the server config:
logger.info("loading server config...", extra=log_data)
This works fine, but if I make calls to my logger in any imported function, it will error out because those functions don't have log_data.
What I would like to be able to do is set the log_id in a more global way so that the logger always has the right log_id in the imported modules, which will change from request to request.
Is there another pattern that I should be using to accomplish this?

This is best done with a LoggerAdapter. The official documentation has an excellent example that is very close to what you are trying to do. The code will approximately look like this:
class LogIdAdapter(logging.LoggerAdapter):
def process(self, msg, kwargs):
return '[%s] %s' % (self.extra['log_id'], msg), kwargs
tmp_logger = logging.getLogger("myservice")
# ... set up handlers etc here. do not include log_id in formatter, it will be added later
logger = LogIdAdapter(tmp_logger, {'log_id': str(uuid.uuid4())})
logger.info("some text") # this log will have log_id prepended

Related

Python logging - new log file each loop iteration

I would like to generate a new log file on each iteration of a loop in Python using the logging module. I am analysing data in a for loop, where each iteration of the loop contains information on a new object. I would like to generate a log file per object.
I looked at the docs for the logging module and there is capability to change log file on time intervals or when the log file fills up, but I cannot see how to iteratively generate a new log file with a new name. I know ahead of time how many objects are in the loop.
My imagined pseudo code would be:
import logging
for target in targets:
logfile_name = f"{target}.log"
logging.basicConfig(format='%(asctime)s - %(levelname)s : %(message)s',
datefmt='%Y-%m/%dT%H:%M:%S',
filename=logfile_name,
level=logging.DEBUG)
# analyse target infomation
logging.info('log target info...')
However, the logging information is always appended to the fist log file for target 1.
Is there a way to force a new log file at the beginning of each loop?
Rather than using logging directly, you need to use logger objects. Go thorough the docs here.
Create a new logger object as a first statement in the loop. The below is a working solution.
import logging
import sys
def my_custom_logger(logger_name, level=logging.DEBUG):
"""
Method to return a custom logger with the given name and level
"""
logger = logging.getLogger(logger_name)
logger.setLevel(level)
format_string = ("%(asctime)s — %(name)s — %(levelname)s — %(funcName)s:"
"%(lineno)d — %(message)s")
log_format = logging.Formatter(format_string)
# Creating and adding the console handler
console_handler = logging.StreamHandler(sys.stdout)
console_handler.setFormatter(log_format)
logger.addHandler(console_handler)
# Creating and adding the file handler
file_handler = logging.FileHandler(logger_name, mode='a')
file_handler.setFormatter(log_format)
logger.addHandler(file_handler)
return logger
if __name__ == "__main__":
for item in range(10):
logger = my_custom_logger(f"Logger{item}")
logger.debug(item)
This writes to a different log file for each iteration.
This might not be the best solution, but it will create new log file for each iteration. What this is doing is, adding a new file handler in each iteration.
import logging
targets = ["a", "b", "c"]
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
for target in targets:
log_file = "{}.log".format(target)
log_format = "|%(levelname)s| : [%(filename)s]--[%(funcName)s] : %(message)s"
formatter = logging.Formatter(log_format)
# create file handler and set the formatter
file_handler = logging.FileHandler(log_file)
file_handler.setFormatter(formatter)
# add handler to the logger
logger.addHandler(file_handler)
# sample message
logger.info("Log file: {}".format(target))
This is not necessarily the best answer but worked for my case, and just wanted to put it here for future references. I created a function that looks as follows:
def logger(filename, level=None, format=None):
"""A wrapper to the logging python module
This module is useful for cases where we need to log in a for loop
different files. It also will allow more flexibility later on how the
logging format could evolve.
Parameters
----------
filename : str
Name of logfile.
level : str, optional
Level of logging messages, by default 'info'. Supported are: 'info'
and 'debug'.
format : str, optional
Format of logging messages, by default '%(message)s'.
Returns
-------
logger
A logger object.
"""
levels = {"info": logging.INFO, "debug": logging.DEBUG}
if level is None:
level = levels["info"]
else:
level = levels[level.lower()]
if format is None:
format = "%(message)s"
# https://stackoverflow.com/a/12158233/1995261
for handler in logging.root.handlers[:]:
logging.root.removeHandler(handler)
logger = logging.basicConfig(filename=filename, level=level, format=format)
return logger
As you can see (you might need to scroll down the code above to see the return logger line), I am using logging.basicConfig(). All modules I have in my package that log stuff, have the following at the beginning of the files:
import logging
import other stuff
logger = logging.getLogger()
class SomeClass(object):
def some_method(self):
logger.info("Whatever")
.... stuff
When doing a loop, I have call things this way:
if __name__ == "__main__":
for i in range(1, 11, 1):
directory = "_{}".format(i)
if not os.path.exists(directory):
os.makedirs(directory)
filename = directory + "/training.log"
logger(filename=filename)
I hope this is helpful.
I'd like to slightly modify #0Nicholas's method. The direction is right, but the first FileHandler will continue log information into the first log file as long as the function is running. Therefore, we would want to pop the handler out of the logger's handlers list:
import logging
targets = ["a", "b", "c"]
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
log_format = "|%(levelname)s| : [%(filename)s]--[%(funcName)s] : %(message)s"
formatter = logging.Formatter(log_format)
for target in targets:
log_file = f"{target}.log"
# create file handler and set the formatter
file_handler = logging.FileHandler(log_file)
file_handler.setFormatter(formatter)
# add handler to the logger
logger.addHandler(file_handler)
# sample message
logger.info(f"Log file: {target}")
# close the log file
file_handler.close()
# remove the handler from the logger. The default behavior is to pop out
# the last added one, which is the file_handler we just added in the
# beginning of this iteration.
logger.handlers.pop()
Here is a working version for this problem. I was only able to get it to work if the targets already have .log before going into the loop so you may want to add one more for before going into targets and override all targets with .log extension
import logging
targets = ["a.log","b.log","c.log"]
for target in targets:
log = logging.getLogger(target)
formatter = logging.Formatter('%(asctime)s - %(levelname)s : %(message)s', datefmt='%Y-%m/%dT%H:%M:%S')
fileHandler = logging.FileHandler(target, mode='a')
fileHandler.setFormatter(formatter)
streamHandler = logging.StreamHandler()
streamHandler.setFormatter(formatter)
log.addHandler(fileHandler)
log.addHandler(streamHandler)
log.info('log target info...')

How setup Python logging in unit tests

How do I set up logging in a Python package and the supporting unit tests so that I get a logging file out that I can look at when things go wrong?
Currently package logging seems to be getting captured by nose/unittest and is thrown to the console if there is a failed test; only unit test logging makes it into file.
In both the package and unit test source files, I'm currently getting a logger using:
import logging
import package_under_test
log = logging.getLogger(__name__)
In the unit test script I have been trying to set up log handlers using the basic FileHandler, either directly in-line or via the setUp()/setUpClass() TestCase methods
And the logging config, currently set in the Unit test script setUp() method.
root, ext = os.path.splitext(__file__)
log_filename = root + '.log'
log_format = (
'%(asctime)8.8s %(filename)-12.12s %(lineno)5.5s:'
' %(funcName)-32.32s %(message)s')
datefmt = "%H:%M:%S"
log_fmt = logging.Formatter(log_format, datefmt)
log_handler = logging.FileHandler(log_filename, mode='w')
log_handler.setFormatter(log_fmt)
log.addHandler(log_handler)
log_format = '%(message)s'
log.setLevel(logging.DEBUG)
log.debug('test logging enabled: %s' % log_filename)
The log in the last line does end up in the file but this configuration clearly doesn't filter back into the imported package being tested.
Logging objects operate in a hierarchy, and log messages 'bubble up' the hierarchy chain and are passed to any handlers along the way (provided the log level of the message is at or exceeds the minimal threshold of the logger object you are logging on). Ignoring filtering and global log-level configurations, in pseudo code this is what happens:
if record.level < current_logger.level:
return
for logger_object in (current_logger + current_logger.parents_reversed):
for handler in logger_object.handlers:
if record.level >= handler.level:
handler.handle(record)
if not logger_object.propagate:
# propagation disabled, the buck stops here.
break
Where handlers are actually responsible for putting a log message into a file or write it to the console, etc.
The problem you have is that you added your log handlers to the __name__ logger, where __name__ is the current package identifier. The . separator in the package name are hierarchy separators, so if you run this in, say, acme.tests then only the loggers in acme.tests and contained modules are being sent to this handler. Any code outside of acme.tests will never reach these handlers.
Your log object hierarchy is something akin to this:
- acme
- frobnars
- tests
# logger object
- test1
- test2
- widgets
then only log objects in test1 and test2 will see the same handlers.
You can move your logger to the root logger instead, with logging.root or logger.getLogger() (no name argument or the name set to None). All loggers are child nodes of the root logger, and as long as they don't set the propagate attribute to False, log messages will reach the root handlers.
The other options are to get the acme logger object explicitly, with logging.getLogger('acme'), or always use a single, explicit logger name throughout your code that is the same in your tests and in your library.
Do take into account that Nose also configures handlers on the root logger.

Python logging create handlers in wrapper script for logger in imported module

I am writing an API (python 2.7.x), and I have a worker script for it which does nothing on its own but can be wrapped by a variety of higher level scripts (ie one that feeds the worker data from csv, one from dB etc). The current task requires me to:
log INFO+ to console
log a certain set of INFO+ events to a .csv file
log ALL events to a distinct .log file
I've distilled my code to the following examples:
# SuperExample.py
import logging
import SubExample
def main():
logging.basicConfig(level=logging.INFO)
verbose_log = 'debug.log'
data_log = 'data.csv'
format_string = '%(asctime)s::%(name)s::%(levelname)s::%(message)s'
formatter = logging.Formatter(format_string)
# verbose log is a typical event log used for debugging
verbose = logging.FileHandler(verbose_log, mode='w')
verbose.setLevel(logging.DEBUG)
verbose.setFormatter(formatter)
SubExample.logger.addHandler(verbose)
# data log will eventually have a different formatter and a filter in
# order to get a narrow set of events, formatted for post-processing ease
data = logging.FileHandler(data_log, mode='w')
data.setLevel(logging.INFO)
data.setFormatter(formatter)
SubExample.logger.addHandler(data)
logging.info('Started')
SubExample.do_something()
logging.info('Finished')
if __name__ == '__main__':
main()
and
# SubExample.py
import logging
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
def do_something():
logger.debug('hey look I am doing something!')
logger.debug('now I am doing something else!!')
logger.info('this is my result!!!')
which gives me what I want in my files, but gives me this in my console:
INFO:root:Started
DEBUG:SubExample:hey look I am doing something!
DEBUG:SubExample:now I am doing something else!!
INFO:SubExample:this is my result!!!
INFO:root:Finished
I've read about the logging module and it's best practices, but very little of the example code works exactly the way its described when libraries get involved. So, my first question is: is this a basically sane approach? I haven't actually seen anyone else attach handlers to the subscript logger from the wrapper script, but it seems to do what I want.
And my second question is why do the DEBUG statements get into the console? I would think that logging.basicConfig(level=logging.INFO) should prevent this?
In the SuperExample.py file, I removed the basicConfig step and instead did this:
# SuperExample.py
import logging
import SubExample
def main():
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
console = logging.StreamHandler()
console.setLevel(logging.INFO)
logger.addHandler(console)
...
...
logger.info('Started')
...
logger.info('Finished')
In the SubExample.py file:
# SubExample.py
import logging
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
console = logging.StreamHandler()
console.setLevel(logging.INFO)
logger.addHandler(console)
def do_something():
....
Rest of the code is same as yours. When I run SuperExample.py, this is the output:
test_project ~$ python SuperExample.py
Started
this is my result!!!
Finished
The debug.log file has this:
2017-10-25 16:18:08,292::SubExample::DEBUG::hey look I am doing something!
2017-10-25 16:18:08,292::SubExample::DEBUG::now I am doing something else!!
2017-10-25 16:18:08,292::SubExample::INFO::this is my result!!!
The data.csv file has this:
2017-10-25 16:18:08,292::SubExample::INFO::this is my result!!!
So, it seems like the right way to do this is to add a StreamHandler to the logger in each of your modules and set it's level to what you want logged from there to the console. Also, whenever you do logging.getLogger(), you HAVE to set that logger's level to get the expected behavior from the handlers.

Python: flush logging only at end of script run

Currently I use for logging a custom logging system that works as follow:
I have a Log class that ressemble the following:
class Log:
def __init__(self):
self.script = ""
self.datetime = datetime.datetime.now().replace(second=0, microsecond=0)
self.mssg = ""
self.mssg_detail = ""
self.err = ""
self.err_detail = ""
I created a function decorator that perform a try/except on the function call, and add a message either to .mssg or .err on the Log object accordingly.
def logging(fun):
#functools.wraps(fun)
def inner(self, *args):
try:
f = fun(self, *args)
self.logger.mssg += fun.__name__ +" :ok, "
return f
except Exception as e:
self.logger.err += fun.__name__ +": error: "+str(e.args)
return inner
So usually a script is a class that is composed of multiple methods that are run sequentially.
I hence run those methods (decorated such as mentionned above) , and lastly I upload the Log object into a mysql db.
This works quite fine and alright. But now I want to modify those items so that they integrate with the "official" logging module of python.
What I dont like about that module is that it is not possible to "save" the messages onto 1 log object in order to upload/save to log only at the end of the run. Rather each logging call will write/send the message to a file etc. - which create lots of performances issues sometimes. I could usehandlers.MemoryHandler , but it still doesn't seems to perform as my original system: it is said to collect messages and flush them to another handler periodically - which is not what i want: I want to collect the messages in memory and to flush them on request with an explicit function.
Anyone has any suggestions?
Here is my idea. Use a handler to capture the log in a StringIO. Then you can grab the StringIO whenever you want. Since there was perhaps some confusion in the discussion thread - StringIO is a "file-like" interface for strings, there isn't ever an actual file involved.
import logging
import io
def initialize_logging(log_level, log_name='default_logname'):
logger = logging.getLogger(log_name)
logger.setLevel(log_level)
log_stream = io.StringIO()
if not logger.handlers:
ch = logging.StreamHandler(log_stream)
ch.setLevel(log_level)
ch.setFormatter(logging.Formatter(
'%(asctime)s - %(name)s - %(levelname)s - %(message)s'
))
logger.addHandler(ch)
logger.propagate = 0
return logger, log_stream
And then something like:
>>> logger, log_stream = initialize_logging(logging.INFO, "logname")
>>> logger.warning("Hello World!")
And when you want the log information:
>>> log_stream.getvalue()
'2017-05-16 16:35:03,501 - logname - WARNING - Hello World!\n'
At program start (in the main), you can:
instanciate your custom logger => global variable/singleton.
register a function at program end which will flush your logger.
Run your decorated functions.
To register a function you can use atexit.register function. See the page Exit handlers in the doc.
EDIT
The idea above can be simplified.
To delay the logging, you can use the standard MemoryHandler handler, described in the page logging.handlers — Logging handlers
Take a look at this GitHub project: https://github.com/tantale/python-ini-cfg-demo
And replace the INI file by this:
[formatters]
keys=default
[formatter_default]
format=%(asctime)s:%(levelname)s:%(message)s
class=logging.Formatter
[handlers]
keys=console, alternate
[handler_console]
class=logging.handlers.MemoryHandler
formatter=default
args=(1024, INFO)
target=alternate
[handler_alternate]
class=logging.StreamHandler
formatter=default
args=()
[loggers]
keys=root
[logger_root]
level=DEBUG
formatter=default
handlers=console
To log to a database table, just replace the alternate handler by your own database handler.
There is some blog/SO questions about that:
You can look at Logging Exceptions To Your SQLAlchemy Database to create a SQLAlchemyHandler
See Store Django log to database if you are using DJango.
EDIT2
Note: ORM generally support "Eager loading", for instance with SqlAlchemy

How to add a prefix to an existing python logging formatter

In my code I get a logger from my client, then I do stuff and log my analysis to the logger.
I want to add my own prefix to the logger but I don't want to create my own formatter, just to add my prefix to the existing one.
In addition I want to remove my prefix once my code is done.
From looking at the documentation I could only find ways to create new formatter but not to modify an existing one. Is there a way to do so?
You are correct. As per Python 3 and Python 2 documentation there is no way to reset your format on the existing formatter object and you do need to create a new logging.Formatter object. However, looking at the object at runtime there is _fmt method to get the existing format and it seems tweaking it will work. I tried in 2.7 and it works. Below is the example.
Example code for python 2.7:
import logging
logger = logging.getLogger('something')
myFormatter = logging.Formatter('%(asctime)s - %(message)s')
handler = logging.StreamHandler()
handler.setFormatter(myFormatter)
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.info("log statement here")
#Tweak the formatter
myFormatter._fmt = "My PREFIX -- " + myFormatter._fmt
logger.info("another log statement here")
Output:
2015-03-11 12:51:36,605 - log statement here
My PREFIX -- 2015-03-11 12:51:36,605 - another log statement here
This can be achieved with logging.LoggerAdapter
import logging
class CustomAdapter(logging.LoggerAdapter):
def process(self, msg, kwargs):
return f"[my prefix] {msg}", kwargs
logger = CustomAdapter(logging.getLogger(__name__))
Please note that only the message will be affected. But this technique can be used for more complicated cases
You can actually set the format through the 'basicConfig', it is mentioned in the Python document: https://docs.python.org/2/howto/logging-cookbook.html#context-info
logging.basicConfig(level=logging.DEBUG,
format='%(asctime)-15s %(name)-5s %(levelname)-8s IP: %(ip)-15s User: %(user)-8s %(message)s')

Categories

Resources