I would like to collect info with the help of the logging module.
The idea is simple: I have the hash_value of some data, which I want to write to the log. So I set up my logging this way:
import logging

logging.basicConfig(format='%(asctime)s :%(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)
As you can see, the timestamp and the message are now written to the log automatically; for example, I can use it like this:
logger.info('Initial data: {}'.format(data))
But what if I want to write the hash_value of my data automatically, like the time is written now?
I looked through the documentation and found nothing useful: there is no attribute for a custom variable in the logging module.
So I am forced to do it awkwardly, like this:
hash_value = hash(data)
logger.info('Initial data: {} {}'.format(hash_value, data))
I would expect from this code:
logging.basicConfig(format='%(asctime)s: %(variable)s :%(message)s', level=logging.INFO)
and
logger.info('Initial data: {}'.format(hash_value, data))
to do the job. But it does not work (and basically it should not), and I did not find a solution in the documentation.
So, how can I avoid this awkward code:
logger.info('Initial data: {} {}'.format(hash_value, data))
which I am having now?
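For what it's worth, one way to get something like this written automatically is a LoggerAdapter (this is a hedged sketch, not something from the question; the HashAdapter name is made up):

```python
import logging

class HashAdapter(logging.LoggerAdapter):
    """Illustrative only: prepend the hash of the logged message to every record."""
    def process(self, msg, kwargs):
        # Every call through this adapter gets the hash added automatically,
        # so call sites don't have to spell it out.
        return "hash={} {}".format(hash(msg), msg), kwargs

logging.basicConfig(format="%(asctime)s :%(message)s", level=logging.INFO)
logger = HashAdapter(logging.getLogger(__name__), {})
logger.info("Initial data")  # logged as "hash=<...> Initial data"
```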
import logging
import sys

MY_PARAMS = ("variable1", "var2")

class ExtraFilter(logging.Filter):
    def filter(self, record):
        # This filter is attached to the second, simpler handler to avoid
        # duplicate logging entries when the "extra" keyword was passed.
        # Check all of your custom params: if all of them are present,
        # the record should be filtered out. ("all" because if any of them
        # is missing, there would be a silent exception and the record
        # won't be logged at all.) Below is just an example of how to
        # check; you could use something like:
        #     if all(hasattr(record, param) for param in MY_PARAMS): return False
        if hasattr(record, "variable1"):
            return False
        return True

# init logging
log = logging.getLogger()

# init handlers and formatters
h1 = logging.StreamHandler(sys.stdout)
f1 = logging.Formatter('%(asctime)s: %(variable1)s: %(var2)s: %(message)s')
h2 = logging.StreamHandler(sys.stdout)
f2 = logging.Formatter('%(asctime)s: %(message)s')
h1.setFormatter(f1)
h2.setFormatter(f2)
h2.addFilter(ExtraFilter())
log.addHandler(h1)
log.addHandler(h2)

# example of data:
extra = {"variable1": "test1", "var2": "test2"}
log.setLevel(logging.DEBUG)
log.debug("debug message", extra=extra)
log.info("info message")
The above code will produce following output:
2017-11-04 09:16:36,787: test1: test2: debug message
2017-11-04 09:16:36,787: info message
It is not awkward code: you want to add two pieces of information, therefore you must either pass two parameters to format or concatenate the string more "manually".
You could go with:
logging.info("initial data " + str(hash_value) + " " + str(data))
Or you could change the data object so that its __str__ or __repr__ method adds the hash by itself (preferably __repr__ in this case):
class Data():
    ...
    def __repr__(self):
        return str(self.hash()) + " " + str(self.data)
which in this case will print the hash and the string version of the data attribute (or simply whatever you want to show as a string), passing only one parameter to the string format.
Anyway, you could make the formatting string prettier with:
logging.info("Initial data {hash} {data}".format(hash=hash_value, data=data))
By the way, in C++ and Java you would also need to declare two "entries" for those two attributes. In Java it would be something like this:
LOGGER.info("Initial data {} {}", hash, data);
I am using the logging package in Python.
When creating a handler, I use:
handler.terminator = ""
... so that by default the line does not end when calling the info or debug function. I use it to log things like this:
Writing applications in... 1.29s
Writing assets in... 2.34s
In the above the computational time is written at a second log call. The formatter is empty. I now want to add a formatter, and naturally I get this:
20220206 22:20:02 [INFO] Writing applications in... 20220206 22:20:03 [INFO] 1.29s
20220206 22:20:03 [INFO] Writing assets in... 20220206 22:20:05 [INFO] 2.34s
Is it possible to ensure that the formatter is only applied when a new line begins? Ideally like this:
20220206 22:20:02 [INFO] Writing applications in... 1.29s
20220206 22:20:03 [INFO] Writing assets in... 2.34s
Thank you very much.
Minimal reproducible example with no formatter:
import logging
import random
import sys
random.seed(71012594)
logger = logging.getLogger("so71012594")
handler = logging.StreamHandler()
handler.terminator = ""
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)
logger.debug("Doing something ... ")
duration = random.random() * 10 # a float between 0 and 10
logger.debug(f"in {duration} seconds\n")
logger.debug("Doing something else ... ")
duration = random.random() * 10
logger.debug(f"in {duration} seconds\n")
Doing something ... in 8.44033947716514 seconds
Doing something else ... in 9.921684947596866 seconds
Adding the formatter:
# I chose to keep a short format string for readability
formatter = logging.Formatter("[%(levelname)s] %(message)s")
handler.setFormatter(formatter)
logger.debug("Doing something ... ")
# ...
[DEBUG] Doing something ... [DEBUG] in 8.44033947716514 seconds
[DEBUG] Doing something else ... [DEBUG] in 9.921684947596866 seconds
Looking at the CPython implementation of logging.StreamHandler in 3.10, there are only 2 occurrences of terminator:
class StreamHandler(Handler):
    # ...
    terminator = '\n'
and
def emit(self, record):
    """
    ...
    """
    try:
        msg = self.format(record)
        stream = self.stream
        # issue 35046: merged two stream.writes into one.
        stream.write(msg + self.terminator)
        self.flush()
    except RecursionError:  # See issue 36272
        raise
    except Exception:
        self.handleError(record)
So the terminator is always appended to the log line, i.e. to the result of formatting the message.
I see two different options:
don't call format at all (conditional call to format by emit)
change what format does (conditional formatting)
Solution 1
There are different ways to implement it; let's do something quick and dirty: monkey-patching emit!
original_stream_handler_emit = handler.emit  # getting a reference to the bound method

def my_emit(record):
    print(repr(record))
    return original_stream_handler_emit(record)  # call the bound method like a function

handler.emit = my_emit  # monkey-patching
logger.debug("Doing something ... ")
# ...
<LogRecord: so71012594, 10, /home/stack_overflow/so71012594.py, 21, "Doing something ... ">
<LogRecord: so71012594, 10, /home/stack_overflow/so71012594.py, 23, "in 8.44033947716514 seconds
">
<LogRecord: so71012594, 10, /home/stack_overflow/so71012594.py, 25, "Doing something else ... ">
<LogRecord: so71012594, 10, /home/stack_overflow/so71012594.py, 27, "in 9.921684947596866 seconds
">
[DEBUG] Doing something ... [DEBUG] in 8.44033947716514 seconds
[DEBUG] Doing something else ... [DEBUG] in 9.921684947596866 seconds
Now, we just have to detect whether there is a newline in the message:
original_stream_handler_emit = handler.emit  # getting a reference to the bound method

def my_emit(record):
    self = handler
    if "\n" in record.msg:
        # it is the second part of a message, we don't want any pesky formatting
        try:
            self.stream.write(record.msg)  # no formatting applied on the record!
        except Exception:
            self.handleError(record)
    else:
        # simply call the original
        original_stream_handler_emit(record)  # call the bound method like a function

handler.emit = my_emit  # monkey-patching
logger.debug("Doing something ... ")
# ...
[DEBUG] Doing something ... in 8.44033947716514 seconds
[DEBUG] Doing something else ... in 9.921684947596866 seconds
Monkey patching is simple to write but not very maintainable.
Another way is to create a StreamHandler subclass:
class MyStreamHandler(logging.StreamHandler):
    def emit(self, record):
        if "\n" in record.msg:
            # it is the second part of a message, we don't want any pesky formatting
            try:
                self.stream.write(record.msg)  # no formatting applied on the record!
            except Exception:
                self.handleError(record)
        else:
            # simply call the original
            super().emit(record)  # logging.StreamHandler.emit

[...]

handler = MyStreamHandler()  # instead of logging.StreamHandler
Which is much better in my opinion.
Solution 2
A similar question already exists on this site, for example "Can Python's logging format be modified depending on the message log level?".
Here is how to do it:
class MyFormatter(logging.Formatter):
    def format(self, record):
        if "\n" in record.msg:
            # it is the second part of a message, we don't want any pesky formatting
            return record.msg
        else:
            # simply call the original
            return super().format(record)  # logging.Formatter.format

formatter = MyFormatter("[%(levelname)s] %(message)s")
handler.setFormatter(formatter)
Limitations of solutions 1 and 2
Relying on record.msg is wonky:
logger.debug("Hello\nWorld!")
[DEBUG] Doing something ... in 8.44033947716514 seconds
[DEBUG] Doing something else ... in 9.921684947596866 seconds
Hello
World!
There is no formatting at all for the record, because it contains a newline. While it is true that most log messages should be on one line, that's not always the case.
Also:
logger.propagate = False  # for demonstration purposes
try:
    1/0
except ZeroDivisionError:
    logging.exception("math error\nThese pesky zeroes !")  # this one using the default logger/formatter
    print("\n---\n", file=sys.stderr)
    logger.exception("math error\nThese pesky zeroes !")  # this one using the custom formatter
ERROR:root:math error
These pesky zeroes !
Traceback (most recent call last):
File "/home/S0121595/workspace/stack_overflow/so71012594.py", line 65, in <module>
1/0
ZeroDivisionError: division by zero
---
math error
These pesky zeroes !
No stack trace is displayed, nor any indication that there was an error: everything that would normally be added by the regular format function is missing, because it is not included in the record.msg field.
It also breaks when *args are supplied:
logger.propagate = False ; logging.getLogger().setLevel(logging.DEBUG) # for demonstration purposes
logging.debug("part one,\n %s", "part two") # this one using the default logger/formatter
print("\n---\n", file=sys.stderr)
logger.debug("part one,\n %s", "part two") # this one using the custom formatter
DEBUG:root:part one,
part two
---
part one,
%s
So either you can ignore all these cases, or you should implement them in your solution. Here is an untested example based on CPython's logging.Formatter.format:
def format(self, record):
    """
    ...
    """
    record.message = record.getMessage()
    if self.usesTime():
        record.asctime = self.formatTime(record, self.datefmt)
    # s = self.formatMessage(record)  # removed: this is the line that applied the formatting
    s = record.message                # added: use the bare message instead
    if record.exc_info:
        # Cache the traceback text to avoid converting it multiple times
        # (it's constant anyway)
        if not record.exc_text:
            record.exc_text = self.formatException(record.exc_info)
    if record.exc_text:
        if s[-1:] != "\n":
            s = s + "\n"
        s = s + record.exc_text
    if record.stack_info:
        if s[-1:] != "\n":
            s = s + "\n"
        s = s + self.formatStack(record.stack_info)
    return s
(the diff syntax is not supported on StackOverflow)
For all these reasons, I think it is best to not rely on detecting \n.
Solution 3
Basically what you want is:
# first message: no newline here
                           v
[DEBUG] Doing something ... in 8.44033947716514 seconds
                            ^
# second message: no formatting here
So I propose a clearer way to indicate that:
#                                    vvvvvvvvvvvvvvvvv
logger.debug("Doing something ... ", extra={"end": ""})
logger.debug(f"in {duration} seconds", extra={"no_format": True})
#                                      ^^^^^^^^^^^^^^^^^^^^^^^^^
I'm reusing the end= convention of print with the extra parameter of all logging functions.
The Logger.makeRecord method takes care of converting the extra pairs into record fields:
original_stream_handler_emit = handler.emit  # getting a reference to the bound method

def my_emit(record):
    print("end:", repr(getattr(record, "end", None)))
    print("no_format: ", getattr(record, "no_format", None))
    return original_stream_handler_emit(record)  # call the bound method like a function

handler.emit = my_emit  # monkey-patching
end: ''
no_format: None
end: None
no_format: True
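If you want to see that conversion without any monkey-patching, Logger.makeRecord can also be called directly (a quick sanity check, not part of the original answer):

```python
import logging

logger = logging.getLogger("extra-demo")
record = logger.makeRecord(
    "extra-demo", logging.DEBUG, __file__, 1, "msg", (), None,
    extra={"end": "", "no_format": True},
)
print(record.end, record.no_format)  # the extra pairs became plain record attributes
```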
So here is the full code:
import contextlib  # stdlib
import logging
import random

random.seed(71012594)

@contextlib.contextmanager
def patch_object(obj, field_name, temp_value):  # basically `mock.patch.object`
    original_value = getattr(obj, field_name)  # make a backup
    setattr(obj, field_name, temp_value)  # set the new value
    try:
        yield temp_value
    finally:
        setattr(obj, field_name, original_value)  # restore backup in any case

class MyStreamHandler2(logging.StreamHandler):
    def emit(self, record):
        end_value = getattr(record, "end", None)
        use_end_value = end_value is not None and end_value != self.terminator
        no_format = getattr(record, "no_format", False)
        fake_format_function = (lambda rec: rec.message)
        with patch_object(self, "terminator", end_value) if use_end_value else contextlib.nullcontext(), \
             patch_object(self.formatter, "formatMessage", fake_format_function) if no_format else contextlib.nullcontext():
            super().emit(record)

logger = logging.getLogger("so71012594")
handler = MyStreamHandler2()  # instead of logging.StreamHandler
# handler.terminator = ""  # no need to replace the terminator!
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

formatter = logging.Formatter("[%(levelname)s] %(message)s")
handler.setFormatter(formatter)

logger.debug("Doing something ... ", extra={"end": ""})
duration = random.random() * 10  # a float between 0 and 10
logger.debug(f"in {duration} seconds", extra={"no_format": True})
logger.debug("Doing something else ... ", extra={"end": ""})
duration = random.random() * 10
logger.debug(f"in {duration} seconds", extra={"no_format": True})
[DEBUG] Doing something ... in 8.44033947716514 seconds
[DEBUG] Doing something else ... in 9.921684947596866 seconds
It can seem strange to include unittest (especially its mock subpackage) in non-test code, but I find it better for monkey-patching than reimplementing it each time, or taking risks by simply setting/resetting values.
It is as simple as:
from unittest.mock import patch # stdlib
# remove the `patch_object` function entirely
[...]
with patch.object(...
patch.object(...
And it works like a charm.
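For illustration, here is patch.object doing exactly what the hand-rolled patch_object above does (a minimal sketch; the Box class is made up):

```python
from unittest.mock import patch

class Box:
    value = 1

with patch.object(Box, "value", 99):
    assert Box.value == 99  # temporarily replaced inside the with-block
assert Box.value == 1       # automatically restored afterwards, even on error
```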
Note: the end value can be anything!
logger.debug("Doing something ... ", extra={"end": " ~~> "})
[DEBUG] Doing something ...  ~~> in 8.44033947716514 seconds
Bonus: you can now have as many parts as you want to form a single line!
import time

for _ in range(4):
    duration = random.random() * 10
    logger.debug("Starting to do something long ", extra={"end": ""})
    for elapsed in range(int(duration) + 1):
        logger.debug(".", extra={"end": "", "no_format": True})
        time.sleep(1)
    logger.debug(f" finished ({duration:.1f}s) !", extra={"no_format": True})
[DEBUG] Starting to do something long ......... finished (8.4s) !
[DEBUG] Starting to do something long .......... finished (9.9s) !
[DEBUG] Starting to do something long .. finished (1.4s) !
[DEBUG] Starting to do something long ......... finished (8.6s) !
(the dots appear progressively)
I'm using a logging filter to print out my log messages, including some custom fields that are not present in the usual logging framework.
For instance:
class NISARLogger(object):
    def __init__(self, filename):
        self.filename = filename
        fid = logging.FileHandler(filename)
        formatter_str = '%(asctime)s, %(levelname)s, %(pge)s, %(module)s, %(error_code)i, \
            %(source)s:%(line_number)i, "%(error_name)s: %(message)s"'
        formatter = logging.Formatter(formatter_str)
        fid.setFormatter(formatter)
        self.logger = logging.getLogger(name="NISAR")
        self.logger.setLevel(logging.DEBUG)
        self.logger.addHandler(fid)

    def log_message(self, class_filter, message):
        xfilter = class_filter()
        log_funct = getattr(self.logger, xfilter.level)
        self.logger.addFilter(xfilter)
        log_funct(message)

    def close(self):
        logging.shutdown()
Everything seems to be working fine except my log looks like this:
2020-08-18 14:41:07,431, INFO, QA, misc, 100000, '../verify_rslc.py':70, "N/A: Opening file L_JOINT_00213_LINE12_RUN1_FP_12122019134617.h5 with xml spec /Users/cmoroney/Desktop/working/NISAR/src/GitHub/QualityAssurance/xml/nisar_L1_SLC.xml"
2020-08-18 14:41:07,432, INFO, QA, misc, 100000, '/Users/cmoroney/Desktop/working/NISAR/src/GitHub/QualityAssurance/quality/SLCFile.py':28, "N/A: Opening file L_JOINT_00213_LINE12_RUN1_FP_12122019134617.h5"
where there is a lot of padding between the '100000' (error code parameter) and the filename (source parameter), both of which are extra parameters passed into the logger via the 'addFilter' call. I've tried experimenting with the length of the 'source' and 'error_code' fields in the formatter_str variable, but no luck. Any idea where that padding is coming from?
The extra space is coming from the whitespace in the source code itself at the start of the second line.
formatter_str = '%(asctime)s, %(levelname)s, %(pge)s, %(module)s, %(error_code)i, \
    %(source)s:%(line_number)i, "%(error_name)s: %(message)s"'
Try this instead:
formatter_str = ('%(asctime)s, %(levelname)s, %(pge)s, %(module)s, %(error_code)i, '
                 '%(source)s:%(line_number)i, "%(error_name)s: %(message)s"')
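To make the difference concrete, here is a minimal sketch (the strings are made up) showing why the backslash continuation picks up the indentation while adjacent string literals do not:

```python
broken = 'first, \
    second'            # the continuation line's leading spaces are inside the literal
fixed = ('first, '
         'second')     # adjacent literals are concatenated with no extra spaces

print(repr(broken))
print(repr(fixed))
```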
I have a Python script that calls some long functions. I would like to change the logging format inside those functions.
Something like:
def funz1():
    logging.basicConfig(format='%(asctime)s FUNZ1:%(levelname)s: %(message)s')
    x = 3
    logging.info('x is defined')
    return

def runner():
    logging.basicConfig(format='%(asctime)s Runner:%(levelname)s: %(message)s')
    z = 10
    logging.info("z is defined")  # RUNNER format
    funz1()  # the generated log must be in FUNZ1 format
    v = 20
    logging.info("v is defined")  # RUNNER format
EDIT:
In addition to the format, I would also like to change the log level.
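No answer is shown here, but one possible sketch (the log_format helper is my invention, not from the question): instead of calling basicConfig repeatedly (it only configures the root logger once), temporarily swap the handler's formatter and level with a context manager:

```python
import contextlib
import logging

@contextlib.contextmanager
def log_format(handler, fmt, level=None):
    """Temporarily switch a handler's format (and optionally its level)."""
    old_formatter, old_level = handler.formatter, handler.level
    handler.setFormatter(logging.Formatter(fmt))
    if level is not None:
        handler.setLevel(level)
    try:
        yield
    finally:
        handler.setFormatter(old_formatter)
        handler.setLevel(old_level)

logging.basicConfig(format='%(asctime)s Runner:%(levelname)s: %(message)s')
handler = logging.getLogger().handlers[0]
logging.getLogger().setLevel(logging.INFO)

logging.info("z is defined")        # Runner format
with log_format(handler, '%(asctime)s FUNZ1:%(levelname)s: %(message)s'):
    logging.info("x is defined")    # FUNZ1 format
logging.info("v is defined")        # Runner format
```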
I have a logging setup as follows.
logging.basicConfig(
    filename = fileName,
    format = "%(levelname) -10s %(asctime)s %(message)s",
    level = logging.DEBUG
)

def printinfo(string):
    if DEBUG:
        logging.info(string)

def printerror(string):
    if DEBUG:
        logging.error(string)
    print(string)
I need to log the line number and stack information. For example:
1: def hello():
2: goodbye()
3:
4: def goodbye():
5: printinfo()
---> Line 5: goodbye()/hello()
How can I do this with Python?
SOLVED
def printinfo(string):
    if DEBUG:
        frame = inspect.currentframe()
        stack_trace = traceback.format_stack(frame)
        logging.debug(stack_trace[:-1])
    if LOG:
        logging.info(string)
gives me this info which is exactly what I need.
DEBUG 2011-02-23 10:09:13,500 [
' File "/abc.py", line 553, in <module>\n runUnitTest(COVERAGE, PROFILE)\n',
' File "/abc.py", line 411, in runUnitTest\n printinfo(string)\n']
You can get the current function name, module, and line number simply by changing your format string to include them.
logging.basicConfig(
    filename = fileName,
    format = "%(levelname) -10s %(asctime)s %(module)s:%(lineno)s %(funcName)s %(message)s",
    level = logging.DEBUG
)
Most people only want the stack when logging an exception, and the logging module does that automatically if you call logging.exception(). If you really want stack information at other times, you will need to use the traceback module to extract the additional information you need.
import inspect
import traceback

def method():
    frame = inspect.currentframe()
    stack_trace = traceback.format_stack(frame)
    print(''.join(stack_trace))
Use stack_trace[:-1] to avoid including method/printinfo in the stack trace.
As of Python 3.2, this can be simplified to passing the stack_info=True flag to the logging calls. However, you'll need to use one of the above answers for any earlier version.
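A short sketch of the stack_info=True route (the logger name here is arbitrary):

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
logger = logging.getLogger("stackinfo-demo")
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)

logger.warning("where was I called from?", stack_info=True)
# The formatted record now ends with the call stack, introduced by
# "Stack (most recent call last):"
print(stream.getvalue())
```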
Late answer, but oh well.
Another solution is to create your own formatter with a filter, as specified in the docs here. This is a really great feature: you no longer have to use a helper function (and put the helper function everywhere you want the stack trace). Instead, a custom formatter implements it directly in the logs themselves.
import logging
import traceback

class ContextFilter(logging.Filter):
    def __init__(self, trim_amount):
        super().__init__()
        self.trim_amount = trim_amount

    def filter(self, record):
        record.stack = ''.join(
            str(row) for row in traceback.format_stack()[:-self.trim_amount]
        )
        return True

# Now you can create the logger and apply the filter.
logger = logging.getLogger(__name__)
logger.addFilter(ContextFilter(5))

# And then you can directly implement a stack trace in the formatter.
formatter = logging.Formatter('%(asctime)s %(levelname)s %(message)s \n %(stack)s')
Note: in the above code I trim the last 5 stack frames. This is just for convenience, so that we don't show stack frames from the Python logging package itself. (It might also have to be adjusted for different versions of the logging package.)
Use the traceback module.
logging.error(traceback.format_exc())
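Note that traceback.format_exc() only has something to report while an exception is being handled, so the call normally sits inside an except block (minimal sketch):

```python
import logging
import traceback

try:
    1 / 0
except ZeroDivisionError:
    # format_exc() returns the active exception's traceback as a string
    logging.error(traceback.format_exc())
```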
Here is an example that I hope can help you:
import inspect
import logging

logging.basicConfig(
    format = "%(levelname) -10s %(asctime)s %(message)s",
    level = logging.DEBUG
)

def test():
    caller_list = []
    frame = inspect.currentframe()
    this_frame = frame  # Save current frame.
    while frame.f_back:
        caller_list.append('{0}()'.format(frame.f_code.co_name))
        frame = frame.f_back
    caller_line = this_frame.f_back.f_lineno
    callers = '/'.join(reversed(caller_list))
    logging.info('Line {0} : {1}'.format(caller_line, callers))

def foo():
    test()

def bar():
    foo()

bar()
Result:
INFO 2011-02-23 17:03:26,426 Line 28 : bar()/foo()/test()
Look at the traceback module:
>>> import traceback
>>> def test():
...     print("/".join(str(x[2]) for x in traceback.extract_stack()))
...
>>> def main():
...     test()
...
>>> main()
<module>/launch_new_instance/mainloop/mainloop/interact/push/runsource/runcode/<module>/main/test
This is based on @mouad's answer, but made more useful (IMO) by including at each level the filename (but not its full path) and line number of the call stack, and by leaving the stack in most-recently-called-from (i.e. NOT reversed) order, because that's the way I want to read it :-)
Each entry has file:line:func() which is the same sequence as the normal stacktrace, but all on the same line so much more compact.
import inspect

def callers(self):
    caller_list = []
    frame = inspect.currentframe()
    while frame.f_back:
        caller_list.append('{2}:{1}:{0}()'.format(
            frame.f_code.co_name, frame.f_lineno,
            frame.f_code.co_filename.split("\\")[-1]))
        frame = frame.f_back
    callers = ' <= '.join(caller_list)
    return callers
You may need to add an extra f_back if you have any intervening calls to produce the log text.
frame = inspect.currentframe().f_back
Produces output like this:
file2.py:620:func1() <= file3.py:211:func2() <= file3.py:201:func3() <= main.py:795:func4() <= file4.py:295:run() <= main.py:881:main()
I only need this stack trace in two key functions, so I add the output of callers() to the text in the logger.debug() call, like this:
logger.debug("\nWIRE: justdoit request -----\n"+callers()+"\n\n")
I'm writing a Python package/module and would like the logging messages to mention what module/class/function they come from. I.e. if I run this code:
import mymodule.utils.worker as worker
w = worker.Worker()
w.run()
I'd like the logging messages to look like this:
2010-06-07 15:15:29 INFO mymodule.utils.worker.Worker.run <pid/threadid>: Hello from worker
How can I accomplish this?
Thanks.
I tend to use the logging module in my packages/modules like so:
import logging
log = logging.getLogger(__name__)
log.info("Whatever your info message.")
This sets the name of your logger to the name of the module for inclusion in the log message. You can control where the name is by where %(name)s is found in the format string. Similarly you can place the pid with %(process)d and the thread id with %(thread)d. See the docs for all the options.
Formatting example:
import logging
logging.basicConfig(format="%(asctime)s %(levelname)s %(name)s %(process)d/%(threadName)s: %(message)s")
logging.getLogger('this.is.the.module').warning('Testing for SO')
Gives me:
2010-06-07 08:43:10,494 WARNING this.is.the.module 14980/MainThread: Testing for SO
Here is my solution that came out of this discussion. Thanks to everyone for suggestions.
Usage:
>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)
>>> from hierlogger import hierlogger as logger
>>> def main():
...     logger().debug("test")
...
>>> main()
DEBUG:main:test
By default it will name the logger <module>.<class>.<method>. You can also control the depth by providing a parameter:
3 - module.class.method default
2 - module.class
1 - module only
Logger instances are also cached to prevent calculating logger name on each call.
I hope someone will enjoy it.
The code:
import logging
import inspect

class NullHandler(logging.Handler):
    def emit(self, record):
        pass

def hierlogger(level=3):
    callerFrame = inspect.stack()[1]
    caller = callerFrame[0]
    lname = '__heirlogger' + str(level) + '__'
    if lname not in caller.f_locals:
        loggerName = str()
        if level >= 1:
            try:
                loggerName += inspect.getmodule(inspect.stack()[1][0]).__name__
            except:
                pass
        if 'self' in caller.f_locals and (level >= 2):
            loggerName += ('.' if len(loggerName) > 0 else '') + \
                caller.f_locals['self'].__class__.__name__
        if callerFrame[3] != '<module>' and level >= 3:
            loggerName += ('.' if len(loggerName) > 0 else '') + callerFrame[3]
        caller.f_locals[lname] = logging.getLogger(loggerName)
        caller.f_locals[lname].addHandler(NullHandler())
    return caller.f_locals[lname]