Why does a logger persist even when the instance that spawned it has been deleted? - python

There was a weird issue happening yesterday. I got it fixed but still don't understand why things were happening the way they did.
So I have a class with an instance of a logger getting initialised in __init__:
class Foo:
    def __init__(self, some_value):
        self.logger = self._initialise_logger()
        self.logger.info(f'Initialising an instance of {self.__class__.__name__}')
        self.value = some_value

    def _initialise_logger(self):
        # Setting up logger
        return logger
The class is meant to perform some calculations and at one stage I had to run it in a loop:
my_list = []
for m in my_list:
    f = Foo(m)
    f.calculate()
While this loop was running I started getting strange messages in the output. On the first iteration the messages looked normal, but on the second iteration each message was duplicated, on the next every message appeared three times, and so on.
So I figured that the instance of the class that spawned the logger might somehow be persisting, with the logger still printing messages, so I decided to manually delete the instance once the calculations were completed, expecting the issue to go away:
my_list = []
for m in my_list:
    f = Foo(m)
    f.calculate()
    del f
That didn't work. In the end I fixed it by initialising an instance only once and then changing the value of the instance variable inside the loop:
my_list = []
f = Foo()
for m in my_list:
    f.value = m
    f.calculate()
This fixed the problem, but I still don't understand: how can a logger persist even when the instance that spawned it has been deleted?
EDIT:
def _initialise_logger(self):
    log_file = self._get_logging_filename()
    logger = logging.getLogger(__name__ + "." + self.__class__.__name__)
    logger.propagate = False
    logger.setLevel(logging.DEBUG)
    file_handler = logging.FileHandler(log_file)
    file_handler.setLevel(logging.DEBUG)
    file_formatter = logging.Formatter(fmt='%(asctime)s.%(msecs)04d,%(name)s,%(levelname)s,%(message)s',
                                       datefmt='%Y-%m-%d %H:%M:%S')
    screen_handler = logging.StreamHandler()
    screen_handler.setLevel(logging.DEBUG)
    screen_formatter = logging.Formatter(fmt='%(asctime)s.%(msecs)02d,%(levelname)s: -> %(message)s',
                                         datefmt='%Y-%m-%d %H:%M:%S')
    screen_handler.setFormatter(screen_formatter)
    file_handler.setFormatter(file_formatter)
    logger.addHandler(file_handler)
    logger.addHandler(screen_handler)
    logger.info(f'\n\n\n\n\nInitiating a {self.__class__.__name__} instance.')
    return logger

There are a few things wrong with this code, but for starters, let's look at the example below to get a better understanding of the logging module:
import logging

def make_logger(name):
    logger = logging.getLogger(name)
    sh = logging.StreamHandler()
    sh.setFormatter(logging.Formatter(fmt="%(message)s"))
    logger.addHandler(sh)
    return logger

logger_A = make_logger("A")
logger_A.warning("hi")
# hi

logger_B = make_logger("B")
logger_B.warning("bye")
# bye

logger_C1 = make_logger("C")
logger_C2 = make_logger("C")
logger_C1.warning("bonjour")
# bonjour
# bonjour
Notice we only get repeats when we use the same name for multiple loggers. This is because there can only be one instance of a logger with a given name, so if you call getLogger(name) with a name that already exists, it will just return the already existing logger object with that name. So, when we call our function make_logger twice with the same name, we're basically adding two different Stream Handlers to the same logger, which is why we see the double logging.
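To make that concrete, here's a tiny sketch showing that getLogger is a lookup rather than a constructor (the name "demo" is just illustrative):

import logging

# Same name -> the very same Logger object
assert logging.getLogger("demo") is logging.getLogger("demo")

# Handlers therefore accumulate on that one shared object
logging.getLogger("demo").addHandler(logging.StreamHandler())
logging.getLogger("demo").addHandler(logging.StreamHandler())
print(len(logging.getLogger("demo").handlers))  # 2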
In your code, you construct the logger name using __name__ + "." + self.__class__.__name__. This produces exactly the same string for every instance of your class, so each Foo you create attaches another pair of handlers to the same logger object. You could change that line to produce a unique string per instance, but that isn't really how you should be using the logging module.
I highly recommend reviewing this article to learn more about logging in Python.
Why not just use one logger, declared globally in your module/file? If you need to include identifying information for a specific instance, you can always include it in the log message itself:
import logging

logger = ...  # set up your logger here, after your imports, before your code

class Foo:
    def __init__(self, some_value):
        self.value = some_value
        logger.info(f'Initiating a {self.__class__.__name__} instance.')
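For completeness, here is a minimal sketch of what that module-level setup might look like (the handler and format choices are just illustrative):

import logging

logger = logging.getLogger(__name__)
_handler = logging.StreamHandler()
_handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
logger.addHandler(_handler)
logger.setLevel(logging.DEBUG)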

Related

How to use logging filter without passing resulting logger as an extra argument to every function?

I am using logging Filters to provide contextual information for logging statements.
import logging

class ContextFilter(logging.Filter):
    def filter(self, record):
        record.client_id = 12345
        return True

FORMAT = "%(levelname)s:%(name)s:%(message)s:{\"client_id\": %(client_id)s}"
logging.basicConfig(level=logging.DEBUG, format=FORMAT)

f = ContextFilter()
logger = logging.getLogger(__name__)
logger.addFilter(f)
I want the extra parameter "client_id" to be logged in every function without passing the logger explicitly as an argument.
I tried
def h():
    logging.info("test")

h()
but I get "ValueError: Formatting field not found in record: 'client_id'"
The following does work:
def h(logger):
    logger.info("test")

h(logger)
but I am too lazy to add an extra argument to every function. Is there a way to circumvent this?
When you use logging.info() you are using the root logger, which doesn't have your filter. If you added
logging.getLogger().addFilter(f) # logging.getLogger() gets root logger
then your logging.info() call won't raise the error.
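Putting it together, a minimal sketch of that fix:

import logging

class ContextFilter(logging.Filter):
    def filter(self, record):
        record.client_id = 12345
        return True

FORMAT = "%(levelname)s:%(name)s:%(message)s:{\"client_id\": %(client_id)s}"
logging.basicConfig(level=logging.DEBUG, format=FORMAT)
logging.getLogger().addFilter(ContextFilter())  # attach the filter to the root logger

def h():
    logging.info("test")

h()  # INFO:root:test:{"client_id": 12345}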

Change log-level via mocking

I want to change the log-level temporarily.
My current strategy is to use mocking.
with mock.patch(...):
    my_method_which_does_log()
All logging.info() calls inside the method should get ignored and not logged to the console.
How to implement the ... to make logs of level INFO get ignored?
The code is single-process and single-thread and executed during testing only.
I want to change the log-level temporarily.
A way to do this without mocking is logging.disable
import logging
import unittest

class TestSomething(unittest.TestCase):
    def setUp(self):
        logging.disable(logging.WARNING)

    def tearDown(self):
        logging.disable(logging.NOTSET)
This example suppresses messages of level WARNING and below for each test in the TestSomething class, so only ERROR and CRITICAL messages get through. (setUp and tearDown run around every test, so the throttling is applied and removed automatically, which seems a bit cleaner than calling disable inside each test.)
To unset this temporary throttling, call logging.disable(logging.NOTSET):
If logging.disable(logging.NOTSET) is called, it effectively removes this overriding level, so that logging output again depends on the effective levels of individual loggers.
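A minimal sketch of the effect:

import logging

logging.basicConfig(level=logging.DEBUG, format='%(levelname)s:%(message)s')

logging.disable(logging.WARNING)  # suppress WARNING and everything below
logging.warning('hidden')         # suppressed
logging.error('shown')            # prints ERROR:shown

logging.disable(logging.NOTSET)   # remove the override
logging.warning('visible again')  # prints WARNING:visible again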
I don't think mocking is going to do what you want. The loggers are presumably already instantiated in this scenario, and level is an instance variable for each of the loggers (and also any of the handlers that each logger has).
You can create a custom context manager. That would look something like this:
Context Manager
import logging

class override_logging_level():
    """A context manager for temporarily setting the logging level."""

    def __init__(self, level, process_handlers=True):
        self.saved_level = {}
        self.level = level
        self.process_handlers = process_handlers

    def __enter__(self):
        # Save the root logger
        self.save_logger('', logging.getLogger())
        # Iterate over the other loggers
        for name, logger in logging.Logger.manager.loggerDict.items():
            self.save_logger(name, logger)

    def __exit__(self, exception_type, exception_value, traceback):
        # Restore the root logger
        self.restore_logger('', logging.getLogger())
        # Iterate over the loggers
        for name, logger in logging.Logger.manager.loggerDict.items():
            self.restore_logger(name, logger)

    def save_logger(self, name, logger):
        # loggerDict can contain PlaceHolder objects, which have no level
        if not isinstance(logger, logging.Logger):
            return
        # Save off the level
        self.saved_level[name] = logger.level
        # Override the level
        logger.setLevel(self.level)
        if not self.process_handlers:
            return
        # Iterate over the handlers for this logger
        for handler in logger.handlers:
            # No reliable name. Just use the id of the object
            self.saved_level[id(handler)] = handler.level

    def restore_logger(self, name, logger):
        # Skip PlaceHolder objects here as well
        if not isinstance(logger, logging.Logger):
            return
        # It's possible that some intervening code added one or more loggers...
        if name not in self.saved_level:
            return
        # Restore the level for the logger
        logger.setLevel(self.saved_level[name])
        if not self.process_handlers:
            return
        # Iterate over the handlers for this logger
        for handler in logger.handlers:
            # Reconstruct the key for this handler
            key = id(handler)
            # Again, we could have possibly added more handlers
            if key not in self.saved_level:
                continue
            # Restore the level for the handler
            handler.setLevel(self.saved_level[key])
Test Code
# Setup for basic logging
logging.basicConfig(level=logging.ERROR)

# Create some loggers - the root logger and a couple of others
lr = logging.getLogger()
l1 = logging.getLogger('L1')
l2 = logging.getLogger('L2')

# Won't see these messages due to the level
lr.info("lr - msg 1")
l1.info("l1 - msg 1")
l2.info("l2 - msg 1")

# Temporarily override the level
with override_logging_level(logging.INFO):
    # Will see these
    lr.info("lr - msg 2")
    l1.info("l1 - msg 2")
    l2.info("l2 - msg 2")

# Won't see these, again...
lr.info("lr - msg 3")
l1.info("l1 - msg 3")
l2.info("l2 - msg 3")
Results
$ python ./main.py
INFO:root:lr - msg 2
INFO:L1:l1 - msg 2
INFO:L2:l2 - msg 2
Notes
The code would need to be enhanced to support multithreading; for example, logging.Logger.manager.loggerDict is a shared variable that's guarded by locks in the logging code.
Using @cryptoplex's approach of context managers, here's the official version from the logging cookbook:
import logging
import sys

class LoggingContext(object):
    def __init__(self, logger, level=None, handler=None, close=True):
        self.logger = logger
        self.level = level
        self.handler = handler
        self.close = close

    def __enter__(self):
        if self.level is not None:
            self.old_level = self.logger.level
            self.logger.setLevel(self.level)
        if self.handler:
            self.logger.addHandler(self.handler)

    def __exit__(self, et, ev, tb):
        if self.level is not None:
            self.logger.setLevel(self.old_level)
        if self.handler:
            self.logger.removeHandler(self.handler)
        if self.handler and self.close:
            self.handler.close()
        # implicit return of None => don't swallow exceptions
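A short usage sketch (the logger name and messages are illustrative):

logger = logging.getLogger('foo')
logger.addHandler(logging.StreamHandler())
logger.setLevel(logging.INFO)

with LoggingContext(logger, level=logging.ERROR):
    logger.info('suppressed while inside the context')

logger.info('visible again')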
You could use dependency injection to pass the logger instance to the method you are testing. It is a bit more invasive, since you are changing the method's signature a little, but it gives you more flexibility.
Add the logger parameter to your method signature, something along the lines of:
def my_method(your_other_params, logger):
    pass
In your unit test file:
if __name__ == "__main__":
    # define the logger you want to use:
    logging.basicConfig(stream=sys.stderr)
    logging.getLogger("MyTests.test_my_method").setLevel(logging.DEBUG)
    ...

def test_my_method(self):
    test_logger = logging.getLogger("MyTests.test_my_method")
    # pass your logger to your method
    my_method(your_normal_parameters, test_logger)
Python logging docs: https://docs.python.org/3/library/logging.html
I use this pattern to write all logs to a list. It ignores logs of level INFO and below.

import logging
from unittest import mock

logs = []

def my_log(logger_self, level, *args, **kwargs):
    if level > logging.INFO:
        logs.append((args, kwargs))

with mock.patch('logging.Logger._log', my_log):
    my_method_which_does_log()

Python logging creating extra log files

I am trying to use logging to create log files for a program. I'm doing something like this:
import logging
import os
import time

if not os.path.exists(r'.\logs'):
    os.mkdir(r'.\logs')

logging.basicConfig(filename=rf'.\logs\log_{time.ctime().replace(":", "-").replace(" ", "_")}.log',
                    format='%(asctime)s %(name)s %(levelname)s %(message)s',
                    level=logging.DEBUG)

def foo():
    # do stuff ...
    logging.debug('Done some stuff')
    # do extra stuff ...
    logging.debug('Did extra stuff')
    # some parallel map that does NOT use logging in the mapping function
    logging.debug('Done mapping')

if __name__ == '__main__':
    foo()
All goes well and the log is created with the correct information in it:
logs
log_Wed_Feb_14_09-23-32_2018.log
Except that, for some reason, it also creates two additional log files and leaves them empty:
logs
log_Wed_Feb_14_09-23-32_2018.log
log_Wed_Feb_14_09-23-35_2018.log
log_Wed_Feb_14_09-23-39_2018.log
The timestamps are only a few seconds apart, but all of the logging still only goes in the first log file as it should.
Why is it doing this? Also, is there a way to stop it from giving me extra empty files, aside from just deleting any empty logs at the end of the program?
Solved. Kind of.
The behaviour using basicConfig kept happening, so I tried to make a custom logger class:
import logging
import os
import time

class Logger:
    """Class used to encapsulate logging logic."""

    __slots__ = ['dir',
                 'level',
                 'formatter',
                 'handler',
                 'logger']

    def __init__(self,
                 name: str = '',
                 logdir: str = r'.\logs',
                 lvl: int = logging.INFO,
                 fmt: str = '%(asctime)s %(name)s %(levelname)s %(message)s',
                 hdl: str = rf'.\logs\log_{time.ctime().replace(":", "-").replace(" ", "_")}.log'):
        print('construct')
        if not os.path.exists(logdir):
            os.mkdir(logdir)
        self.dir = logdir
        self.level = lvl
        self.formatter = logging.Formatter(fmt=fmt)
        self.handler = logging.FileHandler(filename=hdl)
        self.handler.setFormatter(self.formatter)
        self.logger = logging.getLogger(name)
        self.logger.setLevel(self.level)
        self.logger.addHandler(self.handler)

    def log(self, msg: str):
        """Logs the given message at the set level of the logger."""
        self.logger.log(self.level, msg)

    def cleanup(self):
        """Iterates through the root level of the log folder, removing all log files that have a size of 0."""
        for log_file in (rf'{self.dir}\{log}' for log in next(os.walk(self.dir))[2]
                         if log.endswith('.log') and os.path.getsize(rf'{self.dir}\{log}') == 0):
            os.remove(log_file)

    def shutdown(self):
        """Prepares and executes the shutdown and cleanup actions."""
        logging.shutdown()
        self.handler.close()
        self.cleanup()
And tried to pass it as a parameter to functions like this:
def foo(logger=Logger('foo_logger')):
But this approach constructed a whole new Logger each time, which again led to multiple files. By using a single Logger instance and defaulting the argument to None, I solved the multiple-files problem for this case.
However, the initial basicConfig situation remains a mystery. (A hedged guess: if the parallel map spawns worker processes, each worker re-imports the module and re-runs the module-level basicConfig call with a fresh time.ctime() timestamp, which would create a new, empty log file in each worker process.)

How to find out whether getLogger created a new object?

How do I find out whether getLogger() returned a new or an existing logger object?
The motivation is that I don't want to addHandler repeatedly to the same logger.
There doesn't seem to be a particularly clean way to do this... However, if you must, the source code is a pretty good place to start looking in order to figure this out. Note that logging.getLogger is mostly a wrapper around logging.Logger.manager.getLogger.
The Manager keeps a mapping of names to Logger (or PlaceHolder) objects. If it has a Logger in the slot designated by a given name, it will return it. Otherwise, it will create and return a new Logger.
import logging

def has_logger(name):
    manager = logging.Logger.manager
    if name in manager.loggerDict:
        return isinstance(manager.loggerDict[name], logging.Logger)
    else:
        return False
Note that this only handles the case where you have named loggers. If you do logging.getLogger() (without passing a name), then it simply will return the root logger which is created at import time (and therefore, it is never new).
Another approach could be to get a logger and check its handlers list (i.e. if it isn't an empty list, then handlers have already been added).
def has_handlers(logger):
    """Return True if logger has handlers, False otherwise."""
    return bool(logger.handlers)
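Either check can then be used to guard against attaching handlers repeatedly; a small usage sketch (the name "app" is illustrative):

import logging

logger = logging.getLogger("app")
if not has_handlers(logger):
    logger.addHandler(logging.StreamHandler())  # only added the first time through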
getLogger returns a singleton instance for the named logger; to check that:
import logging

id_ = id(logging.getLogger())
for i in range(10):
    assert id_ == id(logging.getLogger())
For logging purposes I use a logger module looking like this:
mylogger.py
import logging
import logging.config
from pathlib import Path

logging.config.fileConfig(str(Path(__file__).parent / "logging.conf"),
                          disable_existing_loggers=False)

def get(name="MYLOG", **kw):
    logger = logging.getLogger(name)
    logger.propagate = True
    if kw.get('level'):
        logger.setLevel(kw.get('level'))
    else:
        logger.setLevel(logging.ERROR)
    return logger
All handlers are defined in the logging.conf file.
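For reference, a minimal sketch of what such a logging.conf might contain (the MYLOG name matches the default above; the handler and format choices are illustrative):

[loggers]
keys=root,MYLOG

[handlers]
keys=console

[formatters]
keys=simple

[logger_root]
level=WARNING
handlers=console

[logger_MYLOG]
level=ERROR
handlers=console
qualname=MYLOG
propagate=1

[handler_console]
class=StreamHandler
level=DEBUG
formatter=simple
args=(sys.stderr,)

[formatter_simple]
format=%(asctime)s %(levelname)s %(name)s %(message)s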

Python Logging module: custom loggers

I was trying to create a custom attribute for logging (caller's class name, module name, etc.) and got stuck with a strange exception telling me that the LogRecord instance created in the process did not have the necessary attributes. After a bit of testing I ended up with this:
import logging

class MyLogger(logging.getLoggerClass()):
    value = None

logging.setLoggerClass(MyLogger)

loggers = [
    logging.getLogger(),
    logging.getLogger(""),
    logging.getLogger("Name")
]

for logger in loggers:
    print(isinstance(logger, MyLogger), hasattr(logger, "value"))
This seemingly correct piece of code yields:
False False
False False
True True
Bug or feature?
Looking at the source code we can see the following:
root = RootLogger(WARNING)

def getLogger(name=None):
    if name:
        return Logger.manager.getLogger(name)
    else:
        return root
That is, a root logger is created by default when the module is imported. Hence, every time you ask for the root logger (by passing a falsy value such as the empty string), you get a logging.RootLogger object, regardless of any call to logging.setLoggerClass.
Regarding the logger class being used we can see:
_loggerClass = None

def setLoggerClass(klass):
    ...
    _loggerClass = klass
This means that a global variable holds the logger class to be used in the future.
In addition to this, looking at logging.Manager (used by logging.getLogger), we can see this:
def getLogger(self, name):
    ...
    rv = (self.loggerClass or _loggerClass)(name)
That is, if self.loggerClass isn't set (which it won't be unless you've explicitly set it), the class stored in the global variable is used.
Hence, it's a feature. The root logger is always a logging.RootLogger object and the other logger objects are created based on the configuration at that time.
logging.getLogger() and logging.getLogger("") don't return a MyLogger because they return the root logger of the logging hierarchy, as described in the logging documentation:
logging.getLogger([name])
Return a logger with the specified name or, if no name is specified, return a logger which is the root logger of the hierarchy.
Thus, as you have the logger set up:
>>> logging.getLogger()
<logging.RootLogger object at 0x7d9450>
>>> logging.getLogger("foo")
<test3.MyLogger object at 0x76d9f0>
I don't think this is related to the KeyError you started your post with. You should post the code that caused that exception to be thrown (test.py).
