I am trying to beef up the logging in my Python scripts and would be grateful if you could share best practices with me. For now I have created this little script (I should mention that I run Python 3.4):
import logging
import io
import sys
def Streamhandler(stream, level, format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"):
    ch = logging.StreamHandler(stream)
    ch.setLevel(level)
    formatter = logging.Formatter(format)
    ch.setFormatter(formatter)
    return ch
# get the root logger
logger = logging.getLogger()
stream = io.StringIO()
logger.addHandler(Streamhandler(stream, logging.WARN))
stream_error = io.StringIO()
logger.addHandler(Streamhandler(stream_error, logging.ERROR))
logger.addHandler(Streamhandler(stream=sys.stdout, level=logging.DEBUG))
print(logger)
for h in logger.handlers:
    print(h)
    print(h.level)
# 'application' code # goes to the root logger!
logging.debug('debug message')
logging.info('info message')
logging.warning('warn message')
logging.error('error message')
logging.critical('critical message')
print(stream.getvalue())
print(stream_error.getvalue())
I have three handlers; two of them write into an io.StringIO (this seems to work). I need this to simplify testing, but also to send logs via an HTTP email service. And then there is a StreamHandler for the console. However, logging.debug and logging.info messages are ignored on the console here, despite my setting the level explicitly low enough?!
First, you didn't set the level on the logger itself:
logger.setLevel(logging.DEBUG)
Also, you define a logger but make your calls on logging, which goes through the root logger. Not that it makes any difference in your case: since you didn't pass a name, logging.getLogger() returns the root logger anyway.
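For illustration, a minimal corrected sketch of the console part of your script (assuming you keep the root logger) could look like this:
import logging
import sys

# Lower the root logger's own threshold; the default WARNING is what was
# filtering out your debug/info messages before they ever reached the handlers.
logger = logging.getLogger()
logger.setLevel(logging.DEBUG)

console = logging.StreamHandler(sys.stdout)
console.setLevel(logging.DEBUG)
console.setFormatter(logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
logger.addHandler(console)

# Call the logger object directly (equivalent to logging.debug(...) here,
# since this is the root logger).
logger.debug('debug message')
logger.info('info message')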
wrt/ "best practices", it really depends on how "complex" your scripts are and of course on your logging needs.
For self-contained simple scripts with simple use cases (single known environment, no concurrent execution, simple logging to a file or stderr, etc.), a simple call to logging.basicConfig() and direct calls to logging.whatever() are usually good enough.
For anything more complex, it's better to use a distinct config file, either in ini format or as a Python dict (using logging.config.dictConfig), split your script into distinct module(s) or package(s), each defining its own named logger (with logger = logging.getLogger(__name__)), and keep the script itself only as the "runner" for your code, i.e. configure logging, import modules, parse command-line args and call the main function, preferably in a try/except block so as to properly log any unhandled exception before crashing.
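For reference, a minimal dict-based configuration along those lines might look like this (handler names, file name and levels are only illustrative):
import logging
import logging.config

# Illustrative dictConfig: console at DEBUG, a file at WARNING, both on the root logger.
LOGGING_CONFIG = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'default': {'format': '%(asctime)s - %(name)s - %(levelname)s - %(message)s'},
    },
    'handlers': {
        'console': {'class': 'logging.StreamHandler', 'level': 'DEBUG', 'formatter': 'default'},
        'file': {'class': 'logging.FileHandler', 'level': 'WARNING',
                 'formatter': 'default', 'filename': 'myapp.log'},
    },
    'root': {'level': 'DEBUG', 'handlers': ['console', 'file']},
}

logging.config.dictConfig(LOGGING_CONFIG)
logging.getLogger(__name__).debug('logging configured')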
A logger has a threshold level too; you need to set it to DEBUG first:
logger.setLevel(logging.DEBUG)
Related
I have a logger.py file which initialises logging.
import logging
logger = logging.getLogger(__name__)
def logger_init():
    import os
    import inspect
    global logger
    logger.setLevel(logging.DEBUG)
    ch = logging.StreamHandler()
    ch.setLevel(logging.DEBUG)
    logger.addHandler(ch)
    fh = logging.FileHandler(os.getcwd() + os.path.basename(__file__) + ".log")
    fh.setLevel(level=logging.DEBUG)
    logger.addHandler(fh)
    return None
logger_init()
I have another script caller.py that calls the logger.
from logger import *
logger.info("test log")
What happens is that a log file called logger.log is created, containing the logged messages.
What I want is for the log file to be named after the calling script's filename. So, in this case, the created log file should be named caller.log instead.
I am using Python 3.7.
It is immensely helpful to consolidate logging to one location. I learned this the hard way. It is easier to debug when events are sorted by time and it is thread-safe to log to the same file. There are solutions for multiprocessing logging.
The log format can then contain the module name, function name, and even the line number from where the log call was made. This is invaluable. You can find a list of attributes you can automatically include in a log message here.
Example format:
format='[%(asctime)s] [%(module)s.%(funcName)s] [%(levelname)s] %(message)s'
Example log message:
[2019-04-03 12:29:48,351] [caller.work_func] [INFO] Completed task 1.
You can get the filename of the main script from the first item in sys.argv, but if you want the calling module rather than the main script, check the answers to this question.
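For example, a minimal sketch of that idea (file and logger names are just illustrative) could be:
import logging
import os
import sys

def logger_init():
    # sys.argv[0] is the path of the script that was started, so running
    # "python caller.py" yields a log file named caller.log.
    script_name = os.path.splitext(os.path.basename(sys.argv[0]))[0]
    logger = logging.getLogger(script_name)
    logger.setLevel(logging.DEBUG)
    fh = logging.FileHandler(script_name + '.log')
    fh.setFormatter(logging.Formatter(
        '[%(asctime)s] [%(module)s.%(funcName)s] [%(levelname)s] %(message)s'))
    logger.addHandler(fh)
    return logger

logger = logger_init()
logger.info('test log')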
I have a main script and multiple modules. Right now I have logging set up so that all the logging from all modules goes into the same log file, and it gets hard to debug when it's all in one file. So I would like to separate each module into its own log file. I would also like the logging from the requests module to go into the log of the module that used it. I don't know if this is even possible. I have searched everywhere and tried everything I could think of, but it always comes back to either logging everything into one file, or setting up logging in each module and running the script from my main module instead of importing it as a module.
main.py
import logging, logging.handlers
import other_script
console_debug = True
log = logging.getLogger()
def setup_logging():
    filelog = logging.handlers.TimedRotatingFileHandler(path+'logs/api/api.log',
                                                        when='midnight', interval=1, backupCount=3)
    filelog.setLevel(logging.DEBUG)
    fileformatter = logging.Formatter('%(asctime)s %(name)-15s %(levelname)-8s %(message)s')
    filelog.setFormatter(fileformatter)
    log.addHandler(filelog)
    if console_debug:
        console = logging.StreamHandler()
        console.setLevel(logging.DEBUG)
        formatter = logging.Formatter('%(name)-15s: %(levelname)-8s %(message)s')
        console.setFormatter(formatter)
        log.addHandler(console)

if __name__ == '__main__':
    setup_logging()
other_script.py
import requests
import logging
log = logging.getLogger(__name__)
One very basic concept of Python logging is that every file, stream, or other destination that logs go to corresponds to one handler. So if you want every module to log to a different file, you will have to give every module its own handler. This can also be done from a central place. In your main.py you could add this to make the other_script module log to a separate file:
other_logger = logging.getLogger('other_script')
other_logger.addHandler(logging.FileHandler('other_file'))
other_logger.propagate = False
The last line is only required if you also add a handler to the root logger. If you keep propagate at its default of True, all logs will be sent to the root logger's handlers too. In your scenario it might be better not to use the root logger at all, and instead use a specific named logger such as getLogger('__main__') in main, as sketched below.
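A minimal sketch of that idea (file names are illustrative; this is one way to wire it, not the only one):
# main.py - give each named logger its own file and keep records out of the root logger.
import logging

def setup_logging():
    formatter = logging.Formatter('%(asctime)s %(name)-15s %(levelname)-8s %(message)s')
    for name in ('__main__', 'other_script', 'requests'):
        handler = logging.FileHandler(name.strip('_') + '.log')  # '__main__' -> main.log
        handler.setFormatter(formatter)
        module_logger = logging.getLogger(name)
        module_logger.setLevel(logging.DEBUG)
        module_logger.addHandler(handler)
        module_logger.propagate = False  # don't duplicate records into any root handlers

if __name__ == '__main__':
    setup_logging()
    logging.getLogger('__main__').info('per-module log files configured')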
Referring to this question here: LINK
How can I set up a config that only logs my root script and my own sub-scripts? The linked question asked about disabling all imported modules, but that is not my intention.
My root setup:
import logging
from exchangehandler import send_mail
log_wp = logging.getLogger(__name__)
logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s - %(levelname)s [%(filename)s]: %(name)s %(funcName)20s - Message: %(message)s',
                    datefmt='%d.%m.%Y %H:%M:%S',
                    filename='C:/log/myapp.log',
                    filemode='a')
handler = logging.StreamHandler()
log_wp.addHandler(handler)
log_wp.debug('This is from root')
send_mail('address#eg.com', 'Request', 'Hi there')
My sub-module exchangehandler.py:
import logging
log_wp = logging.getLogger(__name__)
def send_mail(mail_to,mail_subject,mail_body, mail_attachment=None):
    log_wp.debug('Hey this is from exchangehandler.py!')
    m.send_and_save()
myapp.log:
16.07.2018 10:27:40 - DEBUG [test_script.py]: __main__ <module> - Message: This is from root
16.07.2018 10:28:02 - DEBUG [exchangehandler.py]: exchangehandler send_mail - Message: Hey this is from exchangehandler.py!
16.07.2018 10:28:02 - DEBUG [folders.py]: exchangelib.folders get_default_folder - Message: Testing default <class 'exchangelib.folders.SentItems'> folder with GetFolder
16.07.2018 10:28:02 - DEBUG [services.py]: exchangelib.services get_payload - Message: Getting folder ArchiveDeletedItems (archivedeleteditems)
16.07.2018 10:28:02 - DEBUG [services.py]: exchangelib.services get_payload - Message: Getting folder ArchiveInbox (archiveinbox)
My problem is that the log file also contains a lot of information from the exchangelib module that is imported in exchangehandler.py. Either the imported exchangelib module is configured incorrectly, or I have made a mistake. So how can I reduce the log output to only my own logging messages?
EDIT:
An extract from folders.py of the exchangelib module. This is not anything that I have written:
import logging
log = logging.getLogger(__name__)
def get_default_folder(self, folder_cls):
    try:
        # Get the default folder
        log.debug('Testing default %s folder with GetFolder', folder_cls)
        # Use cached instance if available
        for f in self._folders_map.values():
            if isinstance(f, folder_cls) and f.has_distinguished_name:
                return f
        return folder_cls.get_distinguished(account=self.account)
The imported exchangelib module is not configured at all when it comes to logging. You are configuring it implicitly by calling logging.basicConfig() in your main module.
exchangelib does create loggers and logs to them, but by default these loggers have no handlers and formatters attached, so they don't do anything visible. What they do is propagate up to the root logger, which by default also has no handlers or formatters attached.
By calling logging.basicConfig in your main module, you actually attach handlers to the root logger. Your own, desired loggers propagate to the root logger, hence the messages are written to the handlers, but the same is true for the exchangelib loggers from that point onwards.
You have at least two options here. You can explicitly configure "your" named logger(s):
main module
import logging
log_wp = logging.getLogger(__name__) # or pass an explicit name here, e.g. "mylogger"
hdlr = logging.StreamHandler()
fhdlr = logging.FileHandler("myapp.log")
log_wp.addHandler(hdlr)
log_wp.addHandler(fhdlr)
log_wp.setLevel(logging.DEBUG)
The above is very simplified. To explicitly configure multiple named loggers, refer to the logging.config HOWTO.
If you rather want to stick to just using the root logger (configured via basicConfig()), you can also explicitly disable the undesired loggers after exchangelib has been imported and these loggers have been created:
logging.getLogger("exchangelib.folders").disabled = True
logging.getLogger("exchangelib.services").disabled = True
If you don't know the names of the loggers to disable, logging has a dictionary holding all the known loggers. So you could temporarily do this to see all the loggers your program creates:
# e.g. after the line 'log_wp.addHandler(handler)'
print([k for k in logging.Logger.manager.loggerDict])
Using the dict would also allow you to do something like this:
for v in logging.Logger.manager.loggerDict.values():
    if v.name.startswith('exchangelib'):
        v.disabled = True
How do I set up logging in a Python package and the supporting unit tests so that I get a logging file out that I can look at when things go wrong?
Currently, package logging seems to be getting captured by nose/unittest and thrown to the console if there is a failed test; only the unit test logging makes it into the file.
In both the package and unit test source files, I'm currently getting a logger using:
import logging
import package_under_test
log = logging.getLogger(__name__)
In the unit test script I have been trying to set up log handlers using the basic FileHandler, either directly in-line or via the setUp()/setUpClass() TestCase methods.
Here is the logging config, currently set in the unit test script's setUp() method:
root, ext = os.path.splitext(__file__)
log_filename = root + '.log'
log_format = (
    '%(asctime)8.8s %(filename)-12.12s %(lineno)5.5s:'
    ' %(funcName)-32.32s %(message)s')
datefmt = "%H:%M:%S"
log_fmt = logging.Formatter(log_format, datefmt)
log_handler = logging.FileHandler(log_filename, mode='w')
log_handler.setFormatter(log_fmt)
log.addHandler(log_handler)
log_format = '%(message)s'
log.setLevel(logging.DEBUG)
log.debug('test logging enabled: %s' % log_filename)
The log call in the last line does end up in the file, but this configuration clearly doesn't reach back into the imported package being tested.
Logging objects operate in a hierarchy, and log messages 'bubble up' the hierarchy chain, being passed to any handlers along the way (provided the log level of the message is at or above the minimum threshold of the logger object you are logging on). Ignoring filtering and global log-level configuration, in pseudo-code this is what happens:
if record.level < current_logger.level:
    return
for logger_object in (current_logger + current_logger.parents_reversed):
    for handler in logger_object.handlers:
        if record.level >= handler.level:
            handler.handle(record)
    if not logger_object.propagate:
        # propagation disabled, the buck stops here.
        break
Handlers are what actually put a log message into a file, write it to the console, etc.
The problem you have is that you added your log handlers to the __name__ logger, where __name__ is the current module's package path. The . separators in that name are hierarchy separators, so if you run this in, say, acme.tests, then only the loggers in acme.tests and contained modules send their records to these handlers. Any code outside of acme.tests will never reach them.
Your log object hierarchy is something akin to this:
- acme
  - frobnars
  - tests
    # logger object
    - test1
    - test2
  - widgets

With the handlers attached to the acme.tests logger, only the loggers in tests, test1 and test2 will ever see those handlers.
You can attach your handlers to the root logger instead, with logging.root or logging.getLogger() (no name argument, or the name set to None). All loggers are child nodes of the root logger, and as long as they don't set their propagate attribute to False, log messages will reach the root logger's handlers.
The other options are to get the acme logger object explicitly, with logging.getLogger('acme'), or always use a single, explicit logger name throughout your code that is the same in your tests and in your library.
Do take into account that Nose also configures handlers on the root logger.
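As a rough sketch of the root-logger variant in a test module (the class name is made up, and the format string is just carried over from the question, not prescriptive):
import logging
import os
import unittest

class PackageLoggingTestCase(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        root, _ = os.path.splitext(__file__)
        handler = logging.FileHandler(root + '.log', mode='w')
        handler.setFormatter(logging.Formatter(
            '%(asctime)8.8s %(filename)-12.12s %(lineno)5.5s:'
            ' %(funcName)-32.32s %(message)s', "%H:%M:%S"))
        cls._handler = handler
        # Attach to the root logger so records from the package under test
        # propagate into the file as well.
        logging.getLogger().setLevel(logging.DEBUG)
        logging.getLogger().addHandler(handler)

    @classmethod
    def tearDownClass(cls):
        logging.getLogger().removeHandler(cls._handler)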
I am writing an API (Python 2.7.x), and I have a worker script for it which does nothing on its own but can be wrapped by a variety of higher-level scripts (e.g. one that feeds the worker data from CSV, one from a DB, etc.). The current task requires me to:
log INFO+ to console
log a certain set of INFO+ events to a .csv file
log ALL events to a distinct .log file
I've distilled my code to the following examples:
# SuperExample.py
import logging
import SubExample
def main():
    logging.basicConfig(level=logging.INFO)
    verbose_log = 'debug.log'
    data_log = 'data.csv'
    format_string = '%(asctime)s::%(name)s::%(levelname)s::%(message)s'
    formatter = logging.Formatter(format_string)

    # verbose log is a typical event log used for debugging
    verbose = logging.FileHandler(verbose_log, mode='w')
    verbose.setLevel(logging.DEBUG)
    verbose.setFormatter(formatter)
    SubExample.logger.addHandler(verbose)

    # data log will eventually have a different formatter and a filter in
    # order to get a narrow set of events, formatted for post-processing ease
    data = logging.FileHandler(data_log, mode='w')
    data.setLevel(logging.INFO)
    data.setFormatter(formatter)
    SubExample.logger.addHandler(data)

    logging.info('Started')
    SubExample.do_something()
    logging.info('Finished')

if __name__ == '__main__':
    main()
and
# SubExample.py
import logging
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
def do_something():
    logger.debug('hey look I am doing something!')
    logger.debug('now I am doing something else!!')
    logger.info('this is my result!!!')
which gives me what I want in my files, but gives me this in my console:
INFO:root:Started
DEBUG:SubExample:hey look I am doing something!
DEBUG:SubExample:now I am doing something else!!
INFO:SubExample:this is my result!!!
INFO:root:Finished
I've read about the logging module and its best practices, but very little of the example code works exactly the way it's described once libraries get involved. So, my first question is: is this a basically sane approach? I haven't actually seen anyone else attach handlers to the sub-script's logger from the wrapper script, but it seems to do what I want.
And my second question is: why do the DEBUG statements get onto the console? I would have thought that logging.basicConfig(level=logging.INFO) would prevent this?
In the SuperExample.py file, I removed the basicConfig step and instead did this:
# SuperExample.py
import logging
import SubExample
def main():
    logger = logging.getLogger(__name__)
    logger.setLevel(logging.DEBUG)
    console = logging.StreamHandler()
    console.setLevel(logging.INFO)
    logger.addHandler(console)
    ...
    ...
    logger.info('Started')
    ...
    logger.info('Finished')
In the SubExample.py file:
# SubExample.py
import logging
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
console = logging.StreamHandler()
console.setLevel(logging.INFO)
logger.addHandler(console)
def do_something():
    ....
The rest of the code is the same as yours. When I run SuperExample.py, this is the output:
test_project ~$ python SuperExample.py
Started
this is my result!!!
Finished
The debug.log file has this:
2017-10-25 16:18:08,292::SubExample::DEBUG::hey look I am doing something!
2017-10-25 16:18:08,292::SubExample::DEBUG::now I am doing something else!!
2017-10-25 16:18:08,292::SubExample::INFO::this is my result!!!
The data.csv file has this:
2017-10-25 16:18:08,292::SubExample::INFO::this is my result!!!
So, it seems like the right way to do this is to add a StreamHandler to the logger in each of your modules and set its level to what you want logged from there to the console. Also, whenever you do logging.getLogger(), you HAVE to set that logger's level to get the expected behavior from the handlers.
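A small sketch (with a hypothetical logger name) of the behaviour behind that last point: a logger left at its default NOTSET level inherits its parent's effective level, which is WARNING on an unconfigured root logger, so INFO records are dropped before any handler sees them.
import logging

lib_logger = logging.getLogger('somelib')        # level NOTSET -> inherits WARNING from the root
lib_logger.addHandler(logging.StreamHandler())   # the handler itself would accept anything

lib_logger.info('dropped')                       # filtered by the effective WARNING level
lib_logger.setLevel(logging.DEBUG)
lib_logger.info('now visible')                   # passes the logger's check and reaches the handler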