Preferred method for controlled logging of Python debug messages? - python

I'm writing a large hardware simulation library in Python3. For logging, I use the Python3 Logging module.
For controlling debug messages with method-level granularity, I learned "on the street" (ok, here at StackOverflow) to create sub-loggers within each method I wanted to log from:
sub_logger = logging.getLogger(__name__).getChild("new_sublogger_name")
sub_logger.setLevel(logging.DEBUG)
# Sample debug message
sub_logger.debug("This is a debug message...")
By changing the call to setLevel(), the user is able to enable/disable debugging messages on a per-method basis.
Now the Boss Man don't like this approach. He's advocating a single-point at which all logging messages in the library can be enabled/disabled with the same method-level granularity. (This was to be accomplished by writing our own Python logging library BTW).
Not wanting to re-invent the logging wheel, I proposed to instead continue to use the Python Logging library, but instead use Filters to allow single-point control of logging messages.
Having not used Python Logging Filters very often, is there a consensus on using Filters vs Sublogger.setLevel() for this application? What are the pros/cons of each method?
I'm quite used to setLevel() after using it for a while, but that may be coloring my objectiveness. I DO NOT, however, wish to waste everyone's time writing another Python logging library.

I think the existing logging module does what you want. The trick is to separate the place where you call setLevel() (a configuration operation) from the places where you call getChild() (ongoing logging operations).
import logging
logger = logging.getLogger('mod1')
def fctn1():
    logger.getChild('fctn1').debug('I am chatty')
    # do stuff (notice, no setLevel)
def fctn2():
    logger.getChild('fctn2').debug('I am even more chatty')
    # do stuff (notice, no setLevel)
Notice there was no setLevel() there, which makes sense. Why call setLevel() every time, and since when does a method know what logging level the user wants?
You set your logging levels in a configuration step at the beginning of the program. You can do it with the dictionary based configuration, a python module that does a bunch of setLevel() calls or even something you cook up with ini files or whatever. But basically it boils down to:
def config_logger():
    logging.getLogger('abc.def').setLevel(logging.INFO)
    logging.getLogger('mod1').setLevel(logging.WARNING)
    logging.getLogger('mod1.fctn1').setLevel(logging.DEBUG)
    # (etc...)
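For the dictionary-based configuration mentioned above, a minimal sketch might look like the following. The logger names mirror the example ('mod1', 'mod1.fctn1'); the handler name "console" and the chosen levels are only assumptions for illustration.

```python
import logging
import logging.config

# Single point of control: every per-function level lives in this dict.
LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "handlers": {
        "console": {"class": "logging.StreamHandler"},
    },
    "root": {"level": "WARNING", "handlers": ["console"]},
    "loggers": {
        "mod1": {"level": "WARNING"},
        "mod1.fctn1": {"level": "DEBUG"},  # only fctn1 is allowed to be chatty
    },
}

logging.config.dictConfig(LOGGING_CONFIG)

# mod1.fctn2 has no level of its own, so it inherits WARNING from mod1.
fctn1_enabled = logging.getLogger("mod1.fctn1").isEnabledFor(logging.DEBUG)
fctn2_enabled = logging.getLogger("mod1.fctn2").isEnabledFor(logging.DEBUG)
```

Because logger.getChild('fctn1') on the 'mod1' logger returns the logger named 'mod1.fctn1', the functions themselves never need to change when the configuration does.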
Now, if you want to get fancy with filters, you can use them to inspect the stack frame and pull the method name out for you. But that gets more complicated.
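If you do want to try the filter route, here is a rough sketch. It relies on the funcName attribute that the logging module already stores on each LogRecord, so no manual stack inspection is needed. The class names FunctionDebugFilter and ListHandler and the logger name "sim_demo" are made up for this example.

```python
import logging

class FunctionDebugFilter(logging.Filter):
    """Pass DEBUG records only from functions named in `enabled`."""

    def __init__(self, enabled):
        super().__init__()
        self.enabled = set(enabled)

    def filter(self, record):
        if record.levelno > logging.DEBUG:
            return True  # INFO and above always pass
        return record.funcName in self.enabled


class ListHandler(logging.Handler):
    """Collects messages in a list so the effect is easy to inspect."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


logger = logging.getLogger("sim_demo")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False
handler = ListHandler()
handler.addFilter(FunctionDebugFilter({"fctn1"}))  # the single control point
logger.addHandler(handler)

def fctn1():
    logger.debug("I am chatty")

def fctn2():
    logger.debug("I am even more chatty")
    logger.warning("but this still gets through")

fctn1()
fctn2()
```

The filter is attached to the handler rather than to the logger because a filter on a logger only sees records logged directly on that logger, not records propagated up from its children.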

Related

How can I format log messages based on severity in Python's logging?

From reading the docs about logging, I know I can use the basicConfig for example to set the format of logging records, such as:
import logging
logging.basicConfig(format="%(levelname)s - %(message)s")
for example.
My question is, can I specify different formats for different levels? What I'm trying to achieve is to format logging records as Azure Pipelines logging commands, so that for example the Python code:
logging.error("some error")
Will be printed as:
##[error]some error
Now I know I can use %(levelname)s, but I don't want to rely on the correspondence between Azure Pipelines logging commands and the level names in Python's "logging" module. For example, in Python's logging there's an info level, but not in Azure Pipelines.
There's an Azure doc specifically for Python logging which you might find useful. If I understand it correctly (I've never used Azure), you should be able to send it Python logs without doing anything special.
However, the log messages shown as part of this Azure chart are plainly a bit different. If you want your messages to appear like that, one option is to set some custom levels with the level-names Azure prefers. The doc you referenced has a section on that, although it warns against doing it in complex libraries.
If you want to reuse standard levels, but rename "info" -> "command" or some such, you're probably going to need a separate formatter for each such rename.
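One way to get a separate format per level is a small Formatter subclass that delegates to a different Formatter depending on the record's level. This is only a sketch: LevelFormatter is a made-up name, and the "##[warning]" and "##[debug]" prefixes are assumed by analogy with the "##[error]" line in the question, so check them against the Azure Pipelines documentation.

```python
import logging

class LevelFormatter(logging.Formatter):
    """Choose a format string per level, with a default fallback."""

    def __init__(self, formats, default="%(message)s"):
        super().__init__()
        self._default = logging.Formatter(default)
        self._by_level = {
            level: logging.Formatter(fmt) for level, fmt in formats.items()
        }

    def format(self, record):
        chosen = self._by_level.get(record.levelno, self._default)
        return chosen.format(record)


# Assumed mapping from Python levels to pipeline prefixes (illustrative only).
formatter = LevelFormatter({
    logging.ERROR: "##[error]%(message)s",
    logging.WARNING: "##[warning]%(message)s",
    logging.DEBUG: "##[debug]%(message)s",
})

handler = logging.StreamHandler()
handler.setFormatter(formatter)

log = logging.getLogger("pipeline_demo")  # hypothetical logger name
log.setLevel(logging.DEBUG)
log.propagate = False
log.addHandler(handler)

log.error("some error")  # formatted with the ERROR entry
log.info("plain info")   # no entry for INFO, so the default format is used
```

Levels without an entry, such as INFO here, fall back to the plain message, which sidesteps the mismatch between the two sets of level names.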

How do I stop py2neo watch()?

I run this earlier in the code
watch("httpstream")
Subsequently, any py2neo command that triggers HTTP traffic results in verbose logging. How can I stop the effect of watch() logging without creating a new Graph instance?
You don't. That is, I've not written a way to do that.
The watch function is intended only as a debugging utility for an interactive console session. You shouldn't need to use it in an application.
You can set the logging level to a higher value, for example
import logging
logging.getLogger("httpstream").setLevel(logging.WARNING)
Get Logger Information
You can enumerate a list of all available loggers
print(logging.Logger.manager.loggerDict.keys())
Then, you can use either
logging.getLogger("httpstream").getEffectiveLevel()
or
logging.getLogger("httpstream").isEnabledFor(logging.DEBUG)
to get the logging level.
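Putting those calls together, a short sketch, assuming the verbose output really does go through the standard logger named "httpstream" as the answers above suggest:

```python
import logging

noisy = logging.getLogger("httpstream")
noisy.setLevel(logging.WARNING)  # drop DEBUG and INFO records from this logger

# getEffectiveLevel() returns a number; getLevelName() turns it into text.
level_name = logging.getLevelName(noisy.getEffectiveLevel())
debug_on = noisy.isEnabledFor(logging.DEBUG)
print(level_name, debug_on)
```

Note that getEffectiveLevel() reports the level, while isEnabledFor() only answers yes or no for one specific level.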

Python- How to flush the log? (django)

I'm working with Django-nonrel on Google App Engine, which forces me to use logging.debug() instead of print().
The "logging" module is provided by Django, but I'm having a rough time using it instead of print().
For example, if I need to verify the content held in the variable x, I will put
logging.debug('x is: %s' % x). But if the program crashes soon after (without flushing the stream), then it never gets printed.
So for debugging, I need debug() to be flushed before the program exits on error, and this is not happening.
I think this may work for you, assuming you're only using one (or the default) handler:
>>> import logging
>>> logger = logging.getLogger()
>>> logging.debug('wat wat')
>>> logger.handlers[0].flush()
It's kind of frowned upon in the documentation, though.
Application code should not directly instantiate and use instances of Handler. Instead, the Handler class is a base class that defines the interface that all handlers should have and establishes some default behavior that child classes can use (or override).
http://docs.python.org/2/howto/logging.html#handler-basic
And it could be a performance drain, but if you're really stuck, this may help with your debugging.
If the use case is that you have a python program that should flush its logs when exiting, use logging.shutdown().
From the python documentation:
logging.shutdown()
Informs the logging system to perform an orderly
shutdown by flushing and closing all handlers. This should be called
at application exit and no further use of the logging system should be
made after this call.
Django logging relies on the standard python logging module.
This module has a module-level function, logging.shutdown(), which flushes all of the handlers and shuts down the logging system (i.e. logging can no longer be used after it is called).
Inspecting the code of this function shows that currently (python 2.7) the logging module holds a list of weak references to all handlers in a module-level variable called _handlerList so all of the handlers can be flushed by doing something like
[h_weak_ref().flush() for h_weak_ref in logging._handlerList]
Because this solution uses the internals of the module, @Mike's solution above is better, but it relies on having access to a logger. It can be generalized as follows:
[h.flush() for h in my_logger.handlers]
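A slightly fuller sketch of the same idea, using only public attributes (handlers, propagate, parent). The helper name flush_logger and the CountingHandler class are invented for this example. It walks up the logger hierarchy because a record logged on a child logger is usually handled by handlers attached to an ancestor.

```python
import logging

def flush_logger(logger):
    """Flush every handler that a record logged on `logger` could reach."""
    current = logger
    while current is not None:
        for handler in current.handlers:
            handler.flush()
        if not current.propagate:
            break  # records stop here, so ancestors are not involved
        current = current.parent


class CountingHandler(logging.Handler):
    """Demo handler that only counts how often it is flushed."""

    def __init__(self):
        super().__init__()
        self.flushes = 0

    def emit(self, record):
        pass

    def flush(self):
        self.flushes += 1


parent = logging.getLogger("flush_demo")  # hypothetical logger names
child = parent.getChild("worker")
counter = CountingHandler()
parent.addHandler(counter)

flush_logger(child)  # reaches the handler attached to the parent
```

Calling flush_logger(logging.getLogger()) covers the common case where all handlers sit on the root logger.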
Here is a simple function that always works for recording your debug messages while programming. Don't use it in production, since it will not rotate the log file:
def make_log(message):
    import datetime
    with open('mylogfile.log','a') as f:
        f.write(f"{datetime.datetime.now()} {message}\n")
Then use it as
make_log('my message to register')
When you move to production, just comment out the last 2 lines:
def make_log(message):
    import datetime
    #with open('mylogfile.log','a') as f:
    #    f.write(f"{datetime.datetime.now()} {message}\n")

In python, why use logging instead of print?

For simple debugging in a complex project is there a reason to use the python logger instead of print? What about other use-cases? Is there an accepted best use-case for each (especially when you're only looking for stdout)?
I've always heard that this is a "best practice" but I haven't been able to figure out why.
The logging package has a lot of useful features:
Easy to see where and when (even what line no.) a logging call is being made from.
You can log to files, sockets, pretty much anything, all at the same time.
You can differentiate your logging based on severity.
Print doesn't have any of these.
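To make those three points concrete, here is a small sketch with two destinations at once, each with its own severity threshold, and a format that records where the call came from. The logger name "mypackage.io" is hypothetical, and the in-memory buffer merely stands in for a file or socket.

```python
import io
import logging
import sys

fmt = logging.Formatter(
    "%(levelname)s %(name)s %(funcName)s:%(lineno)d %(message)s"
)

console = logging.StreamHandler(sys.stderr)
console.setLevel(logging.WARNING)   # the console shows only warnings and up
console.setFormatter(fmt)

buffer = io.StringIO()              # stands in for a file or socket
capture = logging.StreamHandler(buffer)
capture.setLevel(logging.DEBUG)     # the second destination keeps everything
capture.setFormatter(fmt)

log = logging.getLogger("mypackage.io")  # hypothetical logger name
log.setLevel(logging.DEBUG)
log.propagate = False
log.addHandler(console)
log.addHandler(capture)

def load():
    log.debug("opening device")
    log.warning("device slow to respond")

load()
lines = buffer.getvalue().splitlines()
```

Swapping the StringIO handler for logging.FileHandler or one of the handlers in logging.handlers changes the destination without touching the load() function.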
Also, if your project is meant to be imported by other python tools, it's bad practice for your package to print things to stdout, since the user likely won't know where the print messages are coming from. With logging, users of your package can choose whether or not they want to propagate logging messages from your tool.
One of the biggest advantages of proper logging is that you can categorize messages and turn them on or off depending on what you need. For example, it might be useful to turn on debugging level messages for a certain part of the project, but tone it down for other parts, so as not to be taken over by information overload and to easily concentrate on the task for which you need logging.
Also, logs are configurable. You can easily filter them, send them to files, format them, add timestamps, and any other things you might need on a global basis. Print statements are not easily managed.
Print statements are sort of the worst of both worlds, combining the negative aspects of an online debugger with diagnostic instrumentation. You have to modify the program but you don't get more, useful code from it.
An online debugger allows you to inspect the state of a running program; But the nice thing about a real debugger is that you don't have to modify the source; neither before nor after the debugging session; You just load the program into the debugger, tell the debugger where you want to look, and you're all set.
Instrumenting the application might take some work up front, modifying the source code in some way, but the resulting diagnostic output can have enormous amounts of detail, and can be turned on or off to a very specific degree. The python logging module can show not just the message logged, but also the file and function that called it, a traceback if there was one, the actual time that the message was emitted, and so on. More than that, diagnostic instrumentation need never be removed. It's just as valid and useful when the program is finished and in production as it was the day it was added, but it can have its output stuck in a log file where it's not likely to annoy anyone, or the log level can be turned down to keep all but the most urgent messages out.
Anticipating the need or use for a debugger is really no harder than using ipython while you're testing, and becoming familiar with the commands it uses to control the built-in pdb debugger.
When you find yourself thinking that a print statement might be easier than using pdb (as it often is), you'll find that using a logger leaves your program in a much easier state to work on than if you use and later remove print statements.
I have my editor configured to highlight print statements as syntax errors, and logging statements as comments, since that's about how I regard them.
In brief, the advantages of using logging libraries outweigh those of print, for the following reasons:
Control what’s emitted
Define what types of information you want to include in your logs
Configure how it looks when it’s emitted
Most importantly, set the destination for your logs
In detail, segmenting log events by severity level is a good way to sift through which log messages may be most relevant at a given time. A log event’s severity level also gives you an indication of how worried you should be when you see a particular message. For instance, Python's logging module divides events into debug, info, warning, error, and critical. Timing can be everything when you’re trying to understand what went wrong with an application. You want to know the answers to questions like:
“Was this happening before or after my database connection died?”
“Exactly when did that request come in?”
Furthermore, it is easy to see where a log event occurred: the line number, the filename or method name, and even the thread.
Here's a functional logging library for Python named loguru.
If you use logging then the person responsible for deployment can configure the logger to send it to a custom location, with custom information. If you only print, then that's all they get.
Logging essentially creates a searchable plain text database of print outputs with other meta data (timestamp, loglevel, line number, process etc.).
This is pure gold, I can run egrep over the log file after the python script has run.
I can tune my egrep pattern search to pick exactly what I am interested in and ignore the rest. This reduction of cognitive load and freedom to pick my egrep pattern later on by trial and error is the key benefit for me.
tail -f mylogfile.log | egrep "key_word1|key_word2"
Now throw in other cool things that print can't do (sending to socket, setting debug levels, logrotate, adding meta data etc.), you have every reason to prefer logging over plain print statements.
I tend to use print statements because it's lazy and easy, and adding logging needs some boilerplate code. But hey, we have yasnippets (emacs) and ultisnips (vim) and other templating tools, so why give up logging for plain print statements!?
I would add to all the other mentioned advantages that the print function, in its standard configuration, is buffered. The output may be flushed only when the buffer fills up or when the program exits normally, not at each call to print.
This is true for any program launched in a non interactive shell (codebuild, gitlab-ci for instance) or whose output is redirected.
If for any reason the program is killed (kill -9, hard reset of the computer, …), you may be missing some lines of output if you used print for logging.
However, the logging library's stream handlers flush after every record, so messages sent to stderr or stdout are written out immediately at each call.
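The flushing behaviour can be checked with a small experiment. This sketch uses an in-memory stream that counts calls to flush(); the class name FlushCountingStream and the logger name are invented for the example.

```python
import io
import logging

class FlushCountingStream(io.StringIO):
    """In-memory stream that counts how often flush() is called."""

    def __init__(self):
        super().__init__()
        self.flush_calls = 0

    def flush(self):
        self.flush_calls += 1
        super().flush()


stream = FlushCountingStream()
log = logging.getLogger("flush_count_demo")  # hypothetical logger name
log.setLevel(logging.INFO)
log.propagate = False
log.addHandler(logging.StreamHandler(stream))

log.info("first")
log.info("second")

# StreamHandler flushes the stream as part of emitting each record.
print(stream.flush_calls, repr(stream.getvalue()))
```

With print, the closest equivalent is passing flush=True on every call or running Python with the -u option.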

google app engine python log level noise reduction

Does anyone know how to reduce the verbosity of logging output from dev_appserver.py?
The noise level of these logs is just driving me crazy. I know how to do this kind of config in Java with log4j, but am really lost here on google app engine python.
Solution 1.
You can instruct the logging library to only log statements at or above a given level with logging.getLogger().setLevel(), which sets the level on the root logger. If you set this level threshold higher than the level which contains the messages you don't want then you'll filter out the unwanted messages from dev_appserver.
To make your log messages show up, you need to do one of the following:
Ensure your logging messages are logged at or above the filtering threshold you set above (probably WARN).
Configure and use your own custom logger. Then you can control the logging level for your logger independently of the root logger used by the dev server.
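The second option can be sketched as follows. This only shows the standard-library mechanics and has not been checked against dev_appserver itself; the logger names "myapp" and "some.library" are placeholders.

```python
import logging

# Raise the threshold on the root logger to quiet everything by default.
logging.getLogger().setLevel(logging.WARNING)

# A named logger with its own level is not bound by the root threshold.
app_log = logging.getLogger("myapp")  # hypothetical application logger
app_log.setLevel(logging.DEBUG)

app_debug = app_log.isEnabledFor(logging.DEBUG)
other_debug = logging.getLogger("some.library").isEnabledFor(logging.DEBUG)
print(app_debug, other_debug)
```

Records from "myapp" still propagate to the root logger's handlers. The root logger's level is not re-checked for propagated records, but any level set on the handlers themselves still applies.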
Solution 2.
The workaround above is a little annoying because you either have to avoid DEBUG and INFO levels, or you have to create your own logger.
Another solution is to comment out the offending log messages from dev_appserver.py (and related modules). This would be quite a pain to do by hand, but I've written a tool which replaces logging calls in all files in a given folder (and its subfolders) - check out my post Python logging and performance: how to have your cake and eat it too.
