How to set the same logging format for all handlers? - python

I am using .setFormatter() to set the same logging.Formatter() on each of my handlers.
Is there a way to set a global default format?
Alternatively - is it possible to iterate though the handlers already added via .addHandler() to a logger ?
Another question mentions what the format is, but not how to set it.

The intended way is to attach the formatter to each handler as you're creating them.
Since you're supposed to set up logging destinations in one, central place at the start of the main program, this isn't so taxing to do.
E.g. this is the stock logging set-up code that I use in my scripts that are to be run autonomously:
# set up logging #####################################
import sys,logging,logging.handlers,os.path
log_file=os.path.splitext(__file__)[0]+".log"
l = logging.getLogger()
l.setLevel(logging.DEBUG)
f = logging.Formatter('%(asctime)s %(process)d:%(thread)d %(name)s %(levelname)-8s %(message)s')
h=logging.StreamHandler(sys.stdout)
h.setLevel(logging.NOTSET)
h.setFormatter(f)
l.addHandler(h)
h=logging.handlers.RotatingFileHandler(log_file,maxBytes=1024**2,backupCount=1)
h.setLevel(logging.NOTSET)
h.setFormatter(f)
l.addHandler(h)
del h,f
#hook to log unhandled exceptions
def excepthook(type,value,traceback):
logging.error("Unhandled exception occured",exc_info=(type,value,traceback))
#Don't need another copy of traceback on stderr
if old_excepthook!=sys.__excepthook__:
old_excepthook(type,value,traceback)
old_excepthook = sys.excepthook
sys.excepthook = excepthook
del excepthook,log_file
# ####################################################
There are other methods but each has a downside:
Each logger has an undocumented <logger>.handlers list, but it only lists handlers connected to that logger directly. So, you need to iterate over this list for all loggers you have if you have multiple.
There is a global undocumented logging._handlerList (references to all handlers are kept there to shut them down at exit). Likewise, that is an implementation detail.
Finally, you can override a handler's init logic by
replacing __init__ methods of Handler and/or subclass (that will affect everything else that uses logging), or by
subclassing/addin a mixin to the required classes.
That is probably an overkill.

If you have a number of handlers, you're advised to configure logging using the logging.config.dictConfig() API. This other answer shows how to configure formatters for handlers; you may be able to adapt it to your needs.

Related

How to avoid root handler being called from the custom logger in Python?

I have a basic config for the logging module with debug level - now I want to create another logger with error level only. How can I do that?
The problem is that the root handler is called in addition to the error-handler - this is something I want to avoid.
import logging
fmt = '%(asctime)s:%(funcName)s:%(lineno)d:%(levelname)s:%(name)s:%(message)s'
logging.basicConfig(level=logging.DEBUG, format=fmt)
logger = logging.getLogger('Temp')
logger.setLevel(logging.ERROR)
handler = logging.StreamHandler()
handler.setLevel(logging.ERROR)
logger.addHandler(handler)
logger.error('boo')
The above code prints boo twice while I expect once only, and I have no idea what to do with this annoying issue...
In [4]: logger.error('boo')
boo
2021-04-26 18:54:24,329:<module>:1:ERROR:Temp:boo
In [5]: logger.handlers
Out[5]: [<StreamHandler stderr (ERROR)>]
Some basics about the logging module
logger: a person who receives the log string, sorts it by a predefined level, then uses his own handler if any to process the log and, by default passes the log to his superior.
root logger: the superior of superiors, does all the things that a normal logger does but doesn't pass the received log to anyone else.
handler: a private contractor of a logger, who actually does anything with the log, eg. formats the log, writes it to a file or stdout, or sends it through tcp/udp.
formatter: a theme, a design that the handler applies to the log.
basicConfig: a shortcut way to config the root logger. This is useful when you want him to do all the job and all his lower rank loggers would just pass the log to him.
With no argument, basicConfig sets root logger's level to WARNING and add a StreamHandler that output the log to stderr.
What you did
You created a format and used a shortcut basicConfig to config the root logger. You want the root logger to do all the actual logging things
You created a new low-rank logger Temp
You want it to accept logs with only ERROR level and above.
You created another StreamHandler. Which output to stdout by default.
You want it to handle only ERROR level and above
Oh, you assigned it to Temp logger, which made 5. redundant since the level is set in 3.
Oh wait, thought you just want the root logger to do the job since 1.!
You logged an ERROR with your logger.
What happened
Your Temp logger accepted a string boo at ERROR level. Then told its handler to process the string. Since this handler didn't have any formatter assigned to it, it outputted the string as-is to stdout: boo
After that, Temp logger passed the string boo to his superior, the root logger.
The root logger accepted the log since the log level is ERROR > WARNING.
The root logger then told its handler to process the string boo.
This handler applies the format string to boo. Added timestamp, added location, added the name of logger that passed the log, etc.
Finally it outputted the result to stderr: 2021-04-26 18:54:24,329:<module>:1:ERROR:Temp:boo
Make it right
Since your code does exactly what you tell it to do, you have to tell it as much detail as possible.
Only use basicConfig when you are lazy. By removing basicConfig line, your problem solved.
Use logger = logging.getLogger('__name__') so that the logger has the name of the module. Looking at the log and know exactly which import path that it came from.
Decide if a logger should keep the log on its own or pass it up the chain with the propagate property. In your case, logger.propagate = False also solves the problem.
Use a dictConfig file so you don't get messed with the config code.
In practice, you should not add handlers to your logger, and just let the logger pass the log all the way to the root and let the root do the actual logging. Why?
Someone else uses your code as a module, can control the logging. For example, not output to stdout but to tcp/udp, output with a different format, etc.
You can turn off the logging from a specific logger entirely, by propagating=False.
You know exactly all the handlers and formatters in the code if you only added them to the root logger. You have centralized control over the logging.

Log level of file handler vs. that of logger

To set up logging in Python without basicConfig we would go through the steps:
Set up a file handler.
Set the logging level of the file handler.
Set up a formatter.
Point the file handler to the formatter.
Get the logger object.
Set the logging level of the logger object.
Add the file handler as a handler to the logger object.
Use the .info(), .warning(), etc method on the logger.
These steps are executed by the following code:
import logging
file_handler = logging.FileHandler('./out.log', 'a')
file_handler.setLevel(logging.DEBUG)
format_string = '%(asctime)s\t%(levelname)s: %(message)s'
formatter = logging.Formatter(format_string)
file_handler.setFormatter(formatter)
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
logger.addHandler(file_handler)
logger.info('visible info')
logger.debug('invisible debug')
What is the difference between setting the logging level for the file handler and setting the logging level for the logger?
Okay, so here is a small bit of code to work out:
import logging
# Declare a function to log all 5 levels with different information
def log_all_levels(logger):
logger.debug("Debug from logger {}".format(logger.name))
logger.info("Info from logger {}".format(logger.name))
logger.warning("Warning from logger {}".format(logger.name))
logger.error("Error from logger {}".format(logger.name))
logger.critical("Fatal from logger {}".format(logger.name))
# This file handler will track errors from all loggers
all_errors_handler = logging.FileHandler('errors.log')
all_errors_handler.setLevel(logging.ERROR)
# This file handler will only be used in a specific region of code
foo_info_handler = logging.FileHandler('foo_info.log')
foo_info_handler.setLevel(logging.INFO)
foo_info_handler.addFilter(lambda r: r.levelno == logging.INFO)
# The following loggers will be used in the main execution
foo_logger = logging.getLogger("Foo")
nameless_logger = logging.getLogger("nameless")
foo_logger.setLevel(logging.INFO)
nameless_logger.setLevel(logging.DEBUG)
loggers = (foo_logger, nameless_logger)
# Set each logger up to use the file handlers
# Each logger can have many handlers, each handler can be used by many loggers
for logger in loggers:
logger.addHandler(all_errors_handler)
debug_file_handler = logging.FileHandler('{}.log'.format(logger.name))
debug_file_handler.setLevel(logging.DEBUG)
logger.addHandler(debug_file_handler)
if logger.name == "Foo":
logger.addHandler(foo_info_handler)
# Let's run some logging operations
for logger in loggers:
log_all_levels(logger)
There are 2 loggers - foo_logger set to the info level and nameless_logger set to the debug level. Both of them use the errors and debug handlers, however only the foo_logger uses the foo_file_handler. There are now loggers and file handlers with different levels, connected together in a many-to-many relationship.
As you can find out:
errors.log will contain errors from both loggers. Quite self-explanatory for a real life scenario - reading through logs containing just the errors helps debugging the code.
Foo.log and nameless.log will contain everything possible about those loggers, respecting their levels. So the former will contain info and greater, whereas the latter will track debug and greater levels. Logging per object will potentially create a lot of files, but it might be crucial when trying to detect some object-specific errors.
foo_info is a very special file handler and it only allows info level from the associated logger. Such files can be a life saver when you enter a potentially unsafe or untested area of code and would like to see what exactly is happening within that code block, without having to browse through all your program log.
There are many other things you can do with logging - set up your own logging rules, make a logging hierarchy, create a logger factory - possibilities are endless. Logging should allow flexibility - for example by allowing logger objects and file handlers to have different and separate logging levels, and letting the programmer combine them together as needed.
I hope the small code exercise alongside with my explanations cleared any further doubts - but I do recommend to have a look at Logging Cookbook or the docs if you still need more examples.

How to check if a logger exists

I've had to extend the existing logging library to add some features. One of these is a feature which lets a handler listen to any existing log without knowing if it exists beforehand. The following allows the handler to listen, but does not check if the log exists:
def listen_to_log(target, handler):
logging.getLogger(target).addHandler(handler)
The issue is that I don't want any handler to listen to a log which isn't being logged to, and want to raise a ValueError if the log doesn't already exist. Ideally I'd do something like the following:
def listen_to_log(target, handler):
if not logging.logExists(target):
raise ValueError('Log not found')
logging.getLogger(target).addHandler(handler)
So far I have been unable to find a function like logging.logExists, is there such a function, or a convenient workaround?
Workaround that works for me and possibly you:
When you create a logger for your own code, you will almost certainly create a logger with handlers (file handler and/or console handler).
When you have not yet created a logger and you get the 'root' logger by
logger = logging.getLogger()
then this logger will have no handlers yet.
Therefore, I always check whether the logger above results in a logger that has handlers by
logging.getLogger().hasHandlers()
So I create a logger only if the root logger does not have handlers:
logger = <create_my_logger> if not logging.getLogger().hasHandlers() else logging.getLogger()
The "create_my_logger" snippet represents my code/function that returns a logger (with handlers).
WARNING. This is not documented. It may change without notice.
The logging module internally uses a single Manager object to hold the hierarchy of loggers in a dict accessible as:
import logging
logging.Logger.manager.loggerDict
All loggers except the root logger are stored in that dict under their names.
There are few examples on the net:
http://code.activestate.com/lists/python-list/621740/
and
https://michaelgoerz.net/notes/use-of-the-logging-module-in-python.html (uses Python 2 print)
Of course, if you attach a handler to the root logger, you will (generally) get all messages (unless a library author specifically chooses not to propagate to root handlers).
Why should you care if a handler listens to a log which hasn't been created yet? You can apply filters to a handler that ignore certain events.
This doesn't specifically answer your question, because I see your question as an instance of an XY Problem.

Ensuring Python logging in multiple threads is thread-safe

I have a log.py module, that is used in at least two other modules (server.py and device.py).
It has these globals:
fileLogger = logging.getLogger()
fileLogger.setLevel(logging.DEBUG)
consoleLogger = logging.getLogger()
consoleLogger.setLevel(logging.DEBUG)
file_logging_level_switch = {
'debug': fileLogger.debug,
'info': fileLogger.info,
'warning': fileLogger.warning,
'error': fileLogger.error,
'critical': fileLogger.critical
}
console_logging_level_switch = {
'debug': consoleLogger.debug,
'info': consoleLogger.info,
'warning': consoleLogger.warning,
'error': consoleLogger.error,
'critical': consoleLogger.critical
}
It has two functions:
def LoggingInit( logPath, logFile, html=True ):
global fileLogger
global consoleLogger
logFormatStr = "[%(asctime)s %(threadName)s, %(levelname)s] %(message)s"
consoleFormatStr = "[%(threadName)s, %(levelname)s] %(message)s"
if html:
logFormatStr = "<p>" + logFormatStr + "</p>"
# File Handler for log file
logFormatter = logging.Formatter(logFormatStr)
fileHandler = logging.FileHandler(
"{0}{1}.html".format( logPath, logFile ))
fileHandler.setFormatter( logFormatter )
fileLogger.addHandler( fileHandler )
# Stream Handler for stdout, stderr
consoleFormatter = logging.Formatter(consoleFormatStr)
consoleHandler = logging.StreamHandler()
consoleHandler.setFormatter( consoleFormatter )
consoleLogger.addHandler( consoleHandler )
And:
def WriteLog( string, print_screen=True, remove_newlines=True,
level='debug' ):
if remove_newlines:
string = string.replace('\r', '').replace('\n', ' ')
if print_screen:
console_logging_level_switch[level](string)
file_logging_level_switch[level](string)
I call LoggingInit from server.py, which initializes the file and console loggers. I then call WriteLog from all over the place, so multiple threads are accessing fileLogger and consoleLogger.
Do I need any further protection for my log file? The documentation states that thread locks are handled by the handler.
The good news is that you don't need to do anything extra for thread safety, and you either need nothing extra or something almost trivial for clean shutdown. I'll get to the details later.
The bad news is that your code has a serious problem even before you get to that point: fileLogger and consoleLogger are the same object. From the documentation for getLogger():
Return a logger with the specified name or, if no name is specified, return a logger which is the root logger of the hierarchy.
So, you're getting the root logger and storing it as fileLogger, and then you're getting the root logger and storing it as consoleLogger. So, in LoggingInit, you initialize fileLogger, then re-initialize the same object under a different name with different values.
You can add multiple handlers to the same logger—and, since the only initialization you actually do for each is addHandler, your code will sort of work as intended, but only by accident. And only sort of. You will get two copies of each message in both logs if you pass print_screen=True, and you will get copies in the console even if you pass print_screen=False.
There's actually no reason for global variables at all; the whole point of getLogger() is that you can call it every time you need it and get the global root logger, so you don't need to store it anywhere.
A more minor problem is that you're not escaping the text you insert into HTML. At some point you're going to try to log the string "a < b" and end up in trouble.
Less seriously, a sequence of <p> tags that isn't inside a <body> inside an <html> is not a valid HTML document. But plenty of viewers will take care of that automatically, or you can post-process your logs trivially before displaying them. But if you really want this to be correct, you need to subclass FileHandler and have your __init__ add a header if given an empty file and remove a footer if present, then have your close add a footer.
Getting back to your actual question:
You do not need any additional locking. If a handler correctly implements createLock, acquire, and release (and it's called on a platform with threads), the logging machinery will automatically make sure to acquire the lock when needed to make sure each message is logged atomically.
As far as I know, the documentation doesn't directly say that StreamHandler and FileHandler implement these methods, it does strongly imply it (the text you mentioned in the question says "The logging module is intended to be thread-safe without any special work needing to be done by its clients", etc.). And you can look at the source for your implementation (e.g., CPython 3.3) and see that they both inherit correctly-implemented methods from logging.Handler.
Likewise, if a handler correctly implements flush and close, the logging machinery will make sure it's finalized correctly during normal shutdown.
Here, the documentation does explain what StreamHandler.flush(), FileHandler.flush(), and FileHandler.close(). They're mostly what you'd expect, except that StreamHandler.close() is a no-op, meaning it's possible that final log messages to the console may get lost. From the docs:
Note that the close() method is inherited from Handler and so does no output, so an explicit flush() call may be needed at times.
If this matters to you, and you want to fix it, you need to do something like this:
class ClosingStreamHandler(logging.StreamHandler):
def close(self):
self.flush()
super().close()
And then use ClosingStreamHandler() instead of StreamHandler().
FileHandler has no such problem.
The normal way to send logs to two places is to just use the root logger with two handlers, each with their own formatter.
Also, even if you do want two loggers, you don't need the separate console_logging_level_switch and file_logging_level_switch maps; calling Logger.debug(msg) is exactly the same thing as calling Logger.log(DEBUG, msg). You'll still need some way to map your custom level names debug, etc. to the standard names DEBUG, etc., but you can just do one lookup, instead of doing it once per logger (plus, if your names are just the standard names with different cast, you can cheat).
This is all described pretty well in the `Multiple handlers and formatters section, and the rest of the logging cookbook.
The only problem with the standard way of doing this is that you can't easily turn off console logging on a message-by-message basis. That's because it's not a normal thing to do. Usually, you just log by levels, and set the log level higher on the file log.
But, if you want more control, you can use filters. For example, give your FileHandler a filter that accepts everything, and your ConsoleHandler a filter that requires something starting with console, then use the filter 'console' if print_screen else ''. That reduces WriteLog to almost a one-liner.
You still need the extra two lines to remove newlines—but you can even do that in the filter, or via an adapter, if you want. (Again, see the cookbook.) And then WriteLog really is a one-liner.
Python logging is thread safe:
Is Python's logging module thread safe?
http://docs.python.org/2/library/logging.html#thread-safety
So you have no problem in the Python (library) code.
The routine that you call from multiple threads (WriteLog) does not write to any shared state. So you have no problem in your code.
So you are OK.

python logging specific level only

I'm logging events in my python code uing the python logging module. I have 2 logging files I wish to log too, one to contain user information and the other a more detailed log file for devs. I've set the the two logging files to the levels I want (usr.log = INFO and dev.log = ERROR) but cant work out how to restrict the logging to the usr.log file so only the INFO level logs are written to the log file as opposed to INFO plus everthing else above it e.g. INFO, WARNING, ERROR and CRITICAL.
This is basically my code:-
import logging
logger1 = logging.getLogger('')
logger1.addHandler(logging.FileHandler('/home/tmp/usr.log')
logger1.setLevel(logging.INFO)
logger2 = logging.getLogger('')
logger2.addHandler(logging.FileHandler('/home/tmp/dev.log')
logger2.setLevel(logging.ERROR)
logging.critical('this to be logged in dev.log only')
logging.info('this to be logged to usr.log and dev.log')
logging.warning('this to be logged to dev.log only')
Any help would be great thank you.
I am in general agreement with David, but I think more needs to be said. To paraphrase The Princess Bride - I do not think this code means what you think it means. Your code has:
logger1 = logging.getLogger('')
...
logger2 = logging.getLogger('')
which means that logger1 and logger2 are the same logger, so when you set the level of logger2 to ERROR you actually end up setting the level of logger1 at the same time. In order to get two different loggers, you would need to supply two different logger names. For example:
logger1 = logging.getLogger('user')
...
logger2 = logging.getLogger('dev')
Worse still, you are calling the logging module's critical(), info() and warning() methods and expecting that both loggers will get the messages. This only works because you used the empty string as the name for both logger1 and logger2 and thus they are not only the same logger, they are also the root logger. If you use different names for the two loggers as I have suggested, then you'll need to call the critical(), info() and warning() methods on each logger individually (i.e. you'll need two calls rather than just one).
What I think you really want is to have two different handlers on a single logger. For example:
import logging
mylogger = logging.getLogger('mylogger')
handler1 = logging.FileHandler('usr.log')
handler1.setLevel(logging.INFO)
mylogger.addHandler(handler1)
handler2 = logging.FileHandler('dev.log')
handler2.setLevel(logging.ERROR)
mylogger.addHandler(handler2)
mylogger.setLevel(logging.INFO)
mylogger.critical('A critical message')
mylogger.info('An info message')
Once you've made this change, then you can use filters as David has already mentioned. Here's a quick sample filter:
class MyFilter(object):
def __init__(self, level):
self.__level = level
def filter(self, logRecord):
return logRecord.levelno <= self.__level
You can apply the filter to each of the two handlers like this:
handler1.addFilter(MyFilter(logging.INFO))
...
handler2.addFilter(MyFilter(logging.ERROR))
This will restrict each handler to only write out log messages at the level specified.
First: this is a rather odd thing to want to do, and strikes me as a slight misuse of the logging system. I can't imagine any situation in which it makes sense to notify the user about the normal operation of the program but not about things that are more important. The logging levels should be used to indicate importance; if you have messages that are only of interest to developers, you should be using some other mechanism to distinguish them (such as which logger you send them to).
That being said, you can implement arbitrary filtering of log records by creating a Filter subclass whose filter method implements your desired criteria and install it on the appropriate handler.

Categories

Resources