logging.setLogRecordFactory() simplifies the manipulation of logging records by letting you install a record factory. I use a module of mine, imported in each piece of code, to homogenize the logging:
# this is mylogging.py
import logging
import logging.handlers

def mylogging(name):
    old_factory = logging.getLogRecordFactory()

    def record_factory(*args, **kwargs):
        record = old_factory(*args, **kwargs)
        # send an SMS for critical events (level = 50)
        if args[1] == 50:
            pass  # here is the code which sends an SMS
        return record

    logging.setLogRecordFactory(record_factory)

    # common logging info
    log = logging.getLogger(name)
    log.setLevel(logging.DEBUG)
    (...)
All my scripts bootstrap logging via a
log = mylogging.mylogging("name_of_the_project")
This works fine.
I now would like to keep track of the number of SMS sent. For this I would like to set a counter within mylogging.py, common to all scripts which import mylogging. The problem is that such a variable will be local to each script.
On the other hand, logging is peculiar in the sense that when different scripts call logging.getLogger(name) with the same name, the same logger object (and its handlers) is reused - which means that there is some persistence between scripts (even though each of them does an independent import logging).
With this in mind, is there a way to use a variable common to all logging, placed right after the "here is the code which sends an SMS" line, which would be incremented no matter which script the logging request comes from?
An import such as
from mylogging import mycounter
mycounter += 1
adds a new reference to mycounter in the importing module's namespace. Because an integer is immutable, the += rebinds the name in that local namespace only - other modules keep seeing the value from the point at which they imported it.
One solution is to keep the reference through the original namespace, so that the rebinding happens in mylogging itself:
import mylogging
mylogging.mycounter += 1
This is fragile. It's not very obvious that it only works because of the way the import was done.
A better solution is to use a mutable type. itertools.count is interesting but doesn't let you view the current value of the counter. Here's a simple class that will do it. I've added locking so that it also works in a multithreaded environment.
Add to mylogging.py:
import threading

class MyCounter(object):
    def __init__(self):
        self.val = 0
        self.lock = threading.Lock()

    def inc(self):
        with self.lock:
            self.val += 1
            return self.val

sms_counter = MyCounter()
In some other module:
from mylogging import sms_counter
print('sms count is {}'.format(sms_counter.inc()))
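To tie this back to the record factory in the question, here is a minimal self-contained sketch (my own wiring, so treat it as an illustration; it repeats MyCounter so it runs standalone) of the shared counter being incremented when a critical record is created:
import logging
import threading

class MyCounter(object):
    def __init__(self):
        self.val = 0
        self.lock = threading.Lock()

    def inc(self):
        with self.lock:
            self.val += 1
            return self.val

sms_counter = MyCounter()
old_factory = logging.getLogRecordFactory()

def record_factory(*args, **kwargs):
    record = old_factory(*args, **kwargs)
    if args[1] == logging.CRITICAL:  # level 50
        # the SMS-sending code would go here; sms_counter lives at
        # module level, so every importing script shares the same one
        sms_counter.inc()
    return record

logging.setLogRecordFactory(record_factory)

logging.critical("boom")
print(sms_counter.val)  # -> 1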
Related
Let's say I have different classes defined like this
class PS1:
    def __init__(self):
        pass

    def do_sth(self):
        logging.info('PS1 do sth')

class PS2:
    def __init__(self):
        pass

    def do_sth(self):
        logging.info('PS2 do sth')
How can I make PS1 log into PS1.log and PS2 log into PS2.log? I know I can set up loggers and file handlers and call something like ps1_logger.info instead of logging.info. However, my current code base already has hundreds of logging.info calls; do I have to change them all?
When you call logging methods at the module level you are creating logs directly on the root logger, so you can no longer separate them by logger. The only chance left is to separate your logs by Handler with a filter. Since you don't want to update your logging calls to add any information about where the call happens, you have to inspect the stack to find the calling class. The code below does this, but I highly recommend not using this solution in production; refactor your code to not send everything to the root logger. Look into logging adapters and/or the extra keyword argument for an alternative solution that modifies the logging calls.
import logging
import inspect

root = logging.getLogger()
ps1_handler = logging.FileHandler('ps1.log')
ps2_handler = logging.FileHandler('ps2.log')

def make_class_filter(class_name):
    def filter(record):
        for fi in inspect.stack():
            if 'self' in fi[0].f_locals and class_name in str(fi[0].f_locals['self']):
                return True
        return False
    return filter

ps1_handler.addFilter(make_class_filter('PS1'))
ps2_handler.addFilter(make_class_filter('PS2'))
root.addHandler(ps1_handler)
root.addHandler(ps2_handler)

class PS1:
    def do_sth(self):
        logging.warning('PS1 do sth')

class PS2:
    def do_sth(self):
        logging.warning('PS2 do sth')

PS1().do_sth()
PS2().do_sth()
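As a point of comparison, here's a minimal sketch of the recommended alternative - tagging calls with the extra keyword argument and filtering on the resulting record attribute. The cls attribute name is just an illustration, and the call sites do have to change:
import logging

ps1_handler = logging.FileHandler('ps1.log')
# only pass records explicitly tagged as coming from PS1
ps1_handler.addFilter(lambda record: getattr(record, 'cls', '') == 'PS1')
logging.getLogger().addHandler(ps1_handler)

# the call site is updated to tag the record:
logging.warning('PS1 do sth', extra={'cls': 'PS1'})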
How do I find out whether getLogger() returned a new or an existing logger object?
The motivation is that I don't want to addHandler repeatedly to the same logger.
There doesn't seem to be a particularly clean way to do this... However, if you must, the source code is a pretty good place to start looking in order to figure it out. Note that logging.getLogger is mostly a wrapper around logging.Logger.manager.getLogger.
The Manager keeps a mapping of names to Logger (or Placeholder) objects. If it has a Logger in the slot designated by a given name, it will return it. Otherwise, it'll return a new Logger.
import logging

def has_logger(name):
    manager = logging.Logger.manager
    if name in manager.loggerDict:
        return isinstance(manager.loggerDict[name], logging.Logger)
    else:
        return False
Note that this only handles the case where you have named loggers. If you do logging.getLogger() (without passing a name), then it simply will return the root logger which is created at import time (and therefore, it is never new).
Another approach could be to get the logger and check its handlers list (i.e. if it isn't an empty list, then handlers have already been added).
def has_handlers(logger):
    """Return True if logger has handlers, False otherwise."""
    return bool(logger.handlers)
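With a helper like this, the controlling script can guard against adding a handler twice ("myapp" is just a placeholder name). Note that Python 3.2+ also has Logger.hasHandlers(), though that method considers handlers on ancestor loggers too:
import logging

logger = logging.getLogger("myapp")
if not has_handlers(logger):
    logger.addHandler(logging.StreamHandler())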
getLogger will return the same singleton instance for a given logger name; you can check that:
import logging

id_ = id(logging.getLogger())
for i in range(10):
    assert id_ == id(logging.getLogger())
For logging purposes I use a logger module that looks like this:
mylogger.py
import logging
import logging.config
from pathlib import Path

logging.config.fileConfig(str(Path(__file__).parent / "logging.conf"),
                          disable_existing_loggers=False)

def get(name="MYLOG", **kw):
    logger = logging.getLogger(name)
    logger.propagate = True
    if kw.get('level'):
        logger.setLevel(kw.get('level'))
    else:
        logger.setLevel(logging.ERROR)
    return logger
All handlers are defined in logging.conf.
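For illustration, a minimal sketch of what such a logging.conf might look like in fileConfig format (the handler and formatter names are arbitrary; the real file isn't shown above):
[loggers]
keys=root

[handlers]
keys=console

[formatters]
keys=simple

[logger_root]
level=DEBUG
handlers=console

[handler_console]
class=StreamHandler
level=DEBUG
formatter=simple
args=(sys.stderr,)

[formatter_simple]
format=%(asctime)s %(levelname)s %(name)s: %(message)s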
Are there any built-in ways to have different threads have different destinations for print() and similar?
I'm exploring the creation of an interactive Python environment, so I can't just use print() from module spamegg. It has to be the globally available one with no arguments.
You can replace sys.stdout with an object that checks the current thread and writes to the appropriate file:
import sys, threading

class CustomOutput(object):
    def __init__(self):
        # the "softspace" slot is used internally by Python 2's print
        # statement to keep track of whether to prepend a space to the
        # printed expression
        self.softspace = 0
        self._old_stdout = None

    def activate(self):
        self._old_stdout = sys.stdout
        sys.stdout = self

    def deactivate(self):
        sys.stdout = self._old_stdout
        self._old_stdout = None

    def write(self, s):
        # actually write to an open file obtained from an attribute
        # on the current thread
        threading.current_thread().open_file.write(s)

    def writelines(self, seq):
        for s in seq:
            self.write(s)

    def close(self):
        pass

    def flush(self):
        pass

    def isatty(self):
        return False
It is possible to do what you're asking, although it's complicated and clunky and possibly not portable, and I don't think it's what you want to do.
Your objection to just using spamegg.print is:
I'm exploring the creation of an interactive Python environment, so I can't just use print() from module spamegg. It has to be the globally available one with no arguments.
But the solution to that is easy: Just use print from module spamegg in your code, and from spamegg import print in the interactive interpreter. That's all there is to it.
For that matter, there's no good reason this even needs to be called print in the first place. If all of your code used some other output function with a different name, you could do the same thing in the interactive interpreter.
But how does that let each thread have a different destination?
The easy way to do that is to just look up the destination in a threading.local().
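A minimal sketch of that easy way (print_to_thread_dest and _dest are hypothetical names, not a standard API):
import sys
import threading

_dest = threading.local()

def print_to_thread_dest(*args, **kwargs):
    # route output to whatever file this thread registered,
    # falling back to the original stdout
    kwargs.setdefault('file', getattr(_dest, 'file', sys.__stdout__))
    print(*args, **kwargs)

# each thread sets its own destination, e.g.:
# _dest.file = open('this_thread.log', 'w')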
But if you really want to do both parts of this the hard way, you can.
To do the global print the hard way, you can either have spamegg replace the builtin print instead of just giving you a way to shadow it, or have it replace sys.stdout, so the builtin print with default arguments will print somewhere else.
import builtins

_real_print = builtins.print

def _print(*args, **kwargs):
    # my_output_destination is whatever per-thread target you maintain
    kwargs.setdefault('file', my_output_destination)
    _real_print(*args, **kwargs)

builtins.print = _print
import io
import sys

class MyStdOut(io.TextIOBase):
    # ... __init__, write, etc.
    ...

sys.stdout = MyStdOut()
That still requires having MyStdOut use a thread-local target.
Alternatively, you can compile or wrap each thread function in its own custom globals environment that replaces __builtins__ and/or sys from the default, allowing you to give a different one to each thread from the start. For example:
from functools import partial
from threading import Thread
from types import FunctionType

class MyThread(Thread):
    def __init__(self, group=None, target=None, *args, **kwargs):
        if target:
            g = target.__globals__.copy()
            # __builtins__ may be a dict or the builtins module
            b = g['__builtins__']
            g['__builtins__'] = dict(b) if isinstance(b, dict) else dict(vars(b))
            output = make_output_for_new_thread()
            g['__builtins__']['print'] = partial(print, file=output)
            target = FunctionType(target.__code__, g, target.__name__,
                                  target.__defaults__, target.__closure__)
        super().__init__(group, target, *args, **kwargs)
I might have a solution for you, but it's quite a bit more complicated than just print.
class ClusteredLogging(object):
    '''
    Gathers all logs performed inside the with statement and flushes
    them to mainHandler at once on exit.

    Good for multithreaded applications that have to log several
    lines at once.
    '''
    def __init__(self, mainHandler, formatter):
        self.mainHandler = mainHandler
        self.formatter = formatter
        self.buffer = StringIO()
        self.handler = logging.StreamHandler(self.buffer)
        self.handler.setFormatter(formatter)

    def __enter__(self):
        rootLogger = logging.getLogger()
        rootLogger.addHandler(self.handler)

    def __exit__(self, t, value, tb):
        rootLogger = logging.getLogger()
        rootLogger.removeHandler(self.handler)
        self.handler.flush()
        self.buffer.flush()
        rootLogger.addHandler(self.mainHandler)
        logging.info(self.buffer.getvalue().strip())
        rootLogger.removeHandler(self.mainHandler)
Using this, you can create a log handler for each thread and configure them to store logs in different locations.
Keep in mind that this was developed with a slightly different goal in mind (see comments), but you can adapt it by taking the handler-juggling feature of ClusteredLogging as a start; a sketch follows the test code below.
And some test code:
import concurrent.futures
try:
from StringIO import StringIO
except ImportError:
from io import StringIO
import logging
import sys
# put ClusteredLogging here
if __name__ == "__main__":
formatter = logging.Formatter('%(asctime)s %(levelname)8s\t%(message)s')
onlyMessageFormatter = logging.Formatter("%(message)s")
mainHandler = logging.StreamHandler(sys.stdout)
mainHandler.setFormatter(onlyMessageFormatter)
mainHandler.setLevel(logging.DEBUG)
rootLogger = logging.getLogger()
rootLogger.setLevel(logging.DEBUG)
def logSomethingLong(label):
with ClusteredLogging(mainHandler, formatter):
for i in range(15):
logging.info(label + " " + str(i))
labels = ("TEST", "EXPERIMENT", "TRIAL")
executor = concurrent.futures.ProcessPoolExecutor()
futures = [executor.submit(logSomethingLong, label) for label in labels]
concurrent.futures.wait(futures)
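As a starting point for that adaptation, here's a hedged sketch (my own, not from the original answer) that gives each thread its own file by filtering on the standard threadName attribute of LogRecord:
import logging

def add_thread_log(thread_name):
    # route records emitted by the named thread to that thread's file;
    # logging fills in record.threadName automatically
    handler = logging.FileHandler('%s.log' % thread_name)
    handler.addFilter(lambda record: record.threadName == thread_name)
    logging.getLogger().addHandler(handler)

add_thread_log('Thread-1')
add_thread_log('Thread-2')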
I am changing some code to spin up VMs in EC2 instead of OpenStack. Main starts a thread per VM, and then various modules perform tasks on these VMs; each thread controls its own VM. Instead of either adding parameters to all of the downstream modules so they can look up information, or changing all of the code to unpickle the class instance that created the VM, I am hoping that I can have the class itself decide whether to start a new VM or return the existing pickle. That way the majority of the code won't need to be altered.
This is the general idea, and closest I have gotten to getting it to work:
import os
import sys
import pickle

if sys.version_info >= (2, 7):
    from threading import current_thread
else:
    from threading import currentThread as current_thread

class testA:
    def __init__(self, var="Foo"):
        self.class_pickle_file = "%s.p" % current_thread().ident
        if os.path.isfile(self.class_pickle_file):
            self.load_self()
        else:
            self.var = var
            pickle.dump(self, open(self.class_pickle_file, "wb"))

    def test_method(self):
        print self.var

    def load_self(self):
        return pickle.load(open(self.class_pickle_file, "rb"))

x = testA("Bar")
y = testA()
y.test_method()
But that results in: NameError: global name 'var' is not defined
But, If I do y = pickle.load(open("140355004004096.p", "rb")) it works just fine. So the data IS getting in there by storing self inside the class, it's a problem of getting the class to return the pickle instead of itself...
Any ideas? Thanks.
It looks to me like you create a file named by the current thread's ident, then you instantiate another testA object on the same thread (!!same ident!!), so it checks for a pickle file (and finds it, which is the bad part), and then self.var never gets set.
In test_method, you check for a variable that was never set.
Run each item in its own thread to get different idents, or ensure you set self.var no matter what.
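One way to act on that second suggestion - a sketch, assuming you keep the pickle approach - is to copy the unpickled state onto the instance instead of discarding load_self's return value:
def load_self(self):
    # update this instance in place with the state saved on disk
    with open(self.class_pickle_file, "rb") as f:
        loaded = pickle.load(f)
    self.__dict__.update(loaded.__dict__)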
I can create a named child logger, so that all the logs output by that logger are marked with its name. I can use that logger exclusively in my function/class/whatever.
However, if that code calls out to functions in another module that makes use of logging using just the logging module functions (that proxy to the root logger), how can I ensure that those log messages go through the same logger (or are at least logged in the same way)?
For example:
main.py
import logging
import other

def do_stuff(logger):
    logger.info("doing stuff")
    other.do_more_stuff()

if __name__ == '__main__':
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger("stuff")
    do_stuff(logger)
other.py
import logging

def do_more_stuff():
    logging.info("doing other stuff")
Outputs:
$ python main.py
INFO:stuff:doing stuff
INFO:root:doing other stuff
I want to be able to cause both log lines to be marked with the name 'stuff', and I want to be able to do this only changing main.py.
How can I cause the logging calls in other.py to use a different logger without changing that module?
This is the solution I've come up with:
using thread-local data to store the contextual information, and a Filter on the root logger's handlers to add this information to LogRecords before they are emitted.
import logging
import threading

context = threading.local()
context.name = None

class ContextFilter(logging.Filter):
    def filter(self, record):
        if context.name is not None:
            record.name = "%s.%s" % (context.name, record.name)
        return True
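The filter still has to be attached to the root logger's handler(s); a minimal sketch of the wiring, assuming basicConfig installs the handler as in the question:
import logging

logging.basicConfig(level=logging.INFO)
for handler in logging.getLogger().handlers:
    handler.addFilter(ContextFilter())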
This is fine for me, because I'm using the logger name to indicate what task was being carried out when this message was logged.
I can then use context managers or decorators to make logging from a particular passage of code all appear as though it was logged from a particular child logger.
import contextlib

@contextlib.contextmanager
def logname(name):
    old_name = context.name
    if old_name is None:
        context.name = name
    else:
        context.name = "%s.%s" % (old_name, name)
    try:
        yield
    finally:
        context.name = old_name
import functools

def as_logname(name):
    def decorator(f):
        @functools.wraps(f)
        def wrapper(*args, **kwargs):
            with logname(name):
                return f(*args, **kwargs)
        return wrapper
    return decorator
So then, I can do:
with logname("stuff"):
logging.info("I'm doing stuff!")
do_more_stuff()
or:
@as_logname("things")
def do_things():
    logging.info("Starting to do things")
    do_more_stuff()
The key thing being that any logging that do_more_stuff() does will be logged as if it were logged with either a "stuff" or "things" child logger, without having to change do_more_stuff() at all.
This solution would have problems if you were going to have different handlers on different child loggers.
Use logging.setLoggerClass so that all loggers used by other modules use your logger subclass:
Tells the logging system to use the class klass when instantiating a logger. The class should define __init__() such that only a name argument is required, and the __init__() should call Logger.__init__(). This function is typically called before any loggers are instantiated by applications which need to use custom logger behavior.
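A minimal sketch of that API (the subclass here is hypothetical):
import logging

class MyLogger(logging.Logger):
    def __init__(self, name):
        # per the quoted docs: accept only a name and call Logger.__init__()
        super().__init__(name)

logging.setLoggerClass(MyLogger)
logger = logging.getLogger("stuff")  # an instance of MyLogger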
This is what handlers (see logging.handlers) are for. In addition to creating your logger, you create one or more handlers to send the logging information to various places and add them to the root logger. Most modules that do logging create a logger that they use for their own purposes but depend on the controlling script to create the handlers. Some frameworks decide to be super helpful and add handlers for you.
Read the logging docs; it's all there.
(edit)
logging.basicConfig() is a helper function that adds a single handler to the root logger. You can control the format string it uses with the 'format=' parameter. If all you want to do is have all modules display "stuff", then use logging.basicConfig(level=logging.INFO, format="%(levelname)s:stuff:%(message)s").
The logging.{info,warning,...} functions just call the respective methods on a Logger object called root (cf. the logging module source), so if you know the other module is only calling the functions exported by the logging module, you can overwrite the logging name in other's namespace with your logger object:
import logging
import other

def do_stuff(logger):
    logger.info("doing stuff")
    other.do_more_stuff()

if __name__ == '__main__':
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger("stuff")
    # Overwrite other.logging with your just-generated logger object:
    other.logging = logger
    do_stuff(logger)