I wonder what the advantage of using warnings.warn over just using print is, and why I should use it.
Not only is the code a bit messier, but so is warnings.warn's output:
/path/to/script/script.py:42: UserWarning: Warning message.
warn("Warning message.", stacklevel=1)
I just don't see the need to print a script's path or a code fragment along with the desired message.
Are my thoughts relevant, or are there some qualities of warnings.warn I'm blind to?
Or maybe there are some other, better ways to handle warnings?
warnings.warn is different from print.
It can show different kinds of warnings: see Categories.
These warnings can be filtered.
So warnings are very configurable: they can be printed (to stderr), silently ignored, or turned into exceptions.
Logging is a completely different thing; I would say it is more useful for a typical application, to create logs in a file or just show them: Logging
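To make this concrete, here is a minimal sketch (the message strings are made up) showing a category being attached to a warning and two different filters applied:

import warnings

# Emit a warning with an explicit category.
warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)

# Filters decide what happens to each category.
warnings.simplefilter("ignore", DeprecationWarning)  # silence these entirely
warnings.simplefilter("error", UserWarning)          # escalate these to exceptions

try:
    warnings.warn("suspicious input", UserWarning)
except UserWarning as exc:
    print("caught as exception:", exc)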
This is an expansion of How do I find where a "Sorting because non-concatenation" warning is coming from?.
I'm still getting the same warning in my pytest run. I've looked at several questions here, and done:
import warnings
warnings.filterwarnings('error')
which is suggested in How do I catch a numpy warning like it's an exception (not just for testing)?
However, when I run pytest, it still shows me the warning, but nothing actually errors...
Try passing the -W flag when you run pytest, like this:
pytest -W error::RuntimeWarning
Specify the kind of warning you want to turn into an error, e.g. DeprecationWarning, FutureWarning, UserWarning.
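If you don't want to remember the flag, pytest can also read the same filter from its configuration file; a minimal pytest.ini sketch:

[pytest]
filterwarnings =
    error::RuntimeWarning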
Wanted to share another solution in hopes that it will help others as I spent way too long trying to solve this.
I specifically wanted only a single test to fail on a warning, not all of them. In my case an exception was being raised within a thread I wanted to test for, and I discovered that the pytest.mark.filterwarnings decorator can be used for this purpose.
The traceback:
raise SerialException(
serial.serialutil.SerialException: device reports readiness to read but returned no data (device disconnected or multiple access on port?)
warnings.warn(pytest.PytestUnhandledThreadExceptionWarning(msg))
-- Docs: https://docs.pytest.org/en/stable/warnings.html
The decorator to catch it:
@pytest.mark.filterwarnings("error::pytest.PytestUnhandledThreadExceptionWarning")
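For context, here is roughly how the decorator sits on a single test (the test name and body are made-up placeholders):

import pytest

@pytest.mark.filterwarnings("error::pytest.PytestUnhandledThreadExceptionWarning")
def test_serial_reader_thread():
    # hypothetical body: start the worker thread and assert on its result
    ...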
Is it idiomatic/Pythonic to do it like this, or is there a better way? I want all errors to end up in the log in case I don't have access to the console output. I also want to abort this code path if the problem arises.
try:
    with open(self._file_path, "wb") as out_f:
        out_f.write(...)
        ...
except OSError as e:
    log("Saving %s failed: %s" % (self._file_path, str(e)))
    raise
EDIT: this question is about handling exceptions in the correct place / with the correct idiom. It is not about the logging class.
A proven, working scheme is to have a generic except clause at the top level of your application code to make sure any unhandled error will be logged (and re-raised, of course); it also gives you an opportunity to try to do some cleanup before crashing.
Once you have this, adding specific "log and re-raise" exception handlers in your code makes sense if and when you want to capture more contextual information in your log message, as in your snippet. This means the exception might end up logged twice, but that is hardly an issue.
If you really want to be Pythonic (or if you value your error logs), use the stdlib's logging module and its logger.exception() method, which will automagically add the full traceback to the log.
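Applied to the snippet from the question, a sketch might look like this (the function, path and data names are stand-ins for the question's attributes):

import logging

logger = logging.getLogger(__name__)

def save(path, data):
    try:
        with open(path, "wb") as out_f:
            out_f.write(data)
    except OSError:
        # logger.exception logs at ERROR level and appends the full traceback
        logger.exception("Saving %s failed", path)
        raise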
Some other benefits of the logging module: it decouples the logging configuration (which should be handled by the app itself, and can be quite fine-grained) from the logging calls (which most often happen at library code level); it is compatible with well-written libs, which already use logging, so you just have to configure your loggers to get info from third-party libs (and this can really save your ass); and it lets you use different logging mechanisms (stderr, file, syslog, email alerts, whatever - you're not restricted to a single handler) according to the log source, severity, and deployment environment.
Update:
What would you say about re-raising the same exception (as in example) or re-raising custom exception (MyLibException) instead of original one?
This is a common pattern indeed, but beware of overdoing it - you only want to do this for exceptions that are actually expected and where you really know the cause. Some exception classes can have many different causes - cf OSError, IOError and RuntimeError - so never assume anything about what really caused the exception; either check it with a decently robust condition (for example the .errno field for IOError) or let the exception propagate. I once wasted a couple of hours trying to understand why some lib complained about a malformed input file when the real cause was a permission issue (which I found out by tracing the library code...).
Another possible issue with this pattern is that (in Python 2 at least) you will lose the original exception and traceback, so it's better to log them appropriately before raising your own exception. Python 3 handles this situation in a cleaner way with exception chaining (raise ... from ...), which preserves the original exception and traceback.
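A sketch of that pattern in Python 3 (MyLibError and parse_input are hypothetical names):

import logging

logger = logging.getLogger(__name__)

class MyLibError(Exception):
    """Library-level exception raised in place of low-level ones."""

def load(path):
    try:
        return parse_input(path)  # hypothetical helper that may raise OSError
    except OSError as exc:
        logger.exception("loading %s failed", path)
        # "raise ... from ..." chains the original exception as __cause__,
        # so its traceback is preserved in the report
        raise MyLibError("could not load %s" % path) from exc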
This is a request for more information - the warning mentioned below is not otherwise affecting my code. I would just like some advice on how to suppress warnings!
When running a script that plots a .fits file in Spyder, I receive the following warning:
C:\Users\an16975\AppData\Local\Continuum\Anaconda3\lib\site-packages\matplotlib\__init__.py:878:
UserWarning: axes.color_cycle is deprecated and replaced with axes.prop_cycle;
please use the latter.
warnings.warn(self.msg_depr % (key, alt_key))
From the most similar post on StackOverflow, a solution was:
import warnings
warnings.filterwarnings("ignore")
However, this does not work to suppress the warning.
Is there another way to suppress warnings? Would an earlier, more stable version of matplotlib avoid this problem, and if so, how would I install it?
Cheers,
Ailsa
You need to put the lines
import warnings
warnings.filterwarnings("ignore")
at the very beginning of your script.
The warning you get may be produced either by your script itself, which uses axes.color_cycle - in which case you need to replace it with axes.prop_cycle.
Or it may be produced by some module you import, in which case one would need to know which module actually causes it; possibly updating that module would help.
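If silencing everything feels too blunt, you can also target just this message; a minimal sketch (the filter has to be installed before the import that triggers the warning):

import warnings

# match only the deprecation message quoted above
warnings.filterwarnings(
    "ignore",
    message="axes.color_cycle is deprecated",
    category=UserWarning,
)

import matplotlib.pyplot as plt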
The following question seems relevant here: How to suppress matplotlib warning?
For simple debugging in a complex project, is there a reason to use the Python logger instead of print? What about other use cases? Is there an accepted best use case for each (especially when you're only looking for stdout)?
I've always heard that this is a "best practice" but I haven't been able to figure out why.
The logging package has a lot of useful features:
Easy to see where and when (even what line no.) a logging call is being made from.
You can log to files, sockets, pretty much anything, all at the same time.
You can differentiate your logging based on severity.
Print doesn't have any of these.
Also, if your project is meant to be imported by other Python tools, it's bad practice for your package to print things to stdout, since the user likely won't know where the print messages are coming from. With logging, users of your package can choose whether or not they want to propagate logging messages from your tool.
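The usual pattern for a package is a module-level logger plus a NullHandler, so nothing is emitted unless the application opts in; a minimal sketch:

import logging

# library code: name the logger after the module, never configure output here
logger = logging.getLogger(__name__)
logger.addHandler(logging.NullHandler())

def do_work():
    logger.debug("starting work")  # the application decides whether this is shown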
One of the biggest advantages of proper logging is that you can categorize messages and turn them on or off depending on what you need. For example, it might be useful to turn on debugging level messages for a certain part of the project, but tone it down for other parts, so as not to be taken over by information overload and to easily concentrate on the task for which you need logging.
Also, logs are configurable. You can easily filter them, send them to files, format them, add timestamps, and any other things you might need on a global basis. Print statements are not easily managed.
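For example, the level can be raised for one subsystem while the rest stays quiet; a sketch with a made-up logger name:

import logging

logging.basicConfig(level=logging.WARNING)  # quiet by default

# verbose output only for the part you are debugging ("myapp.parser" is hypothetical)
logging.getLogger("myapp.parser").setLevel(logging.DEBUG)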
Print statements are sort of the worst of both worlds, combining the negative aspects of an online debugger with those of diagnostic instrumentation: you have to modify the program, but you don't get more useful code out of it.
An online debugger allows you to inspect the state of a running program, and the nice thing about a real debugger is that you don't have to modify the source, neither before nor after the debugging session; you just load the program into the debugger, tell the debugger where you want to look, and you're all set.
Instrumenting the application might take some work up front, modifying the source code in some way, but the resulting diagnostic output can have enormous amounts of detail, and can be turned on or off to a very specific degree. The Python logging module can show not just the message logged, but also the file and function that called it, a traceback if there was one, the actual time the message was emitted, and so on. More than that, diagnostic instrumentation need never be removed; it's just as valid and useful when the program is finished and in production as it was the day it was added, but its output can be sent to a log file where it's not likely to annoy anyone, or the log level can be turned down to keep all but the most urgent messages out.
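A small sketch of the kind of detail the format string can pull in automatically:

import logging

logging.basicConfig(
    format="%(asctime)s %(levelname)s %(filename)s:%(lineno)d %(funcName)s() %(message)s",
    level=logging.DEBUG,
)
logging.getLogger(__name__).debug("cache miss")  # time, file and line come for free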
Anticipating the need for a debugger is really no harder than using ipython while you're testing, and becoming familiar with the commands it uses to control the built-in pdb debugger.
When you find yourself thinking that a print statement might be easier than using pdb (as it often is), you'll find that using a logger leaves your program in a much easier-to-work-on state than if you use and later remove print statements.
I have my editor configured to highlight print statements as syntax errors, and logging statements as comments, since that's about how I regard them.
In brief, the advantages of using logging libraries outweigh those of print for the reasons below:
Control what’s emitted
Define what types of information you want to include in your logs
Configure how it looks when it’s emitted
Most importantly, set the destination for your logs
In detail, segmenting log events by severity level is a good way to sift through which log messages may be most relevant at a given time. A log event's severity level also gives you an indication of how worried you should be when you see a particular message; for instance, dividing logging into debug, info, warning, error, and critical levels. Timing can be everything when you're trying to understand what went wrong with an application. You want to know the answers to questions like:
“Was this happening before or after my database connection died?”
“Exactly when did that request come in?”
Furthermore, it is easy to see where a log event occurred, through the line number and filename or method name, and even in which thread.
Here's a functional logging library for Python named loguru.
If you use logging then the person responsible for deployment can configure the logger to send it to a custom location, with custom information. If you only print, then that's all they get.
Logging essentially creates a searchable plain-text database of print outputs with other metadata (timestamp, log level, line number, process, etc.).
This is pure gold: I can run egrep over the log file after the Python script has run.
I can tune my egrep pattern search to pick exactly what I am interested in and ignore the rest. This reduction of cognitive load and freedom to pick my egrep pattern later on by trial and error is the key benefit for me.
tail -f mylogfile.log | egrep "key_word1|key_word2"
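A minimal sketch of the configuration that produces such a grep-friendly file (the file name and format are just examples):

import logging

logging.basicConfig(
    filename="mylogfile.log",
    format="%(asctime)s %(levelname)s %(process)d %(name)s:%(lineno)d %(message)s",
    level=logging.INFO,
)
logging.info("key_word1: job started")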
Now throw in the other cool things that print can't do (sending to a socket, setting debug levels, logrotate, adding metadata, etc.), and you have every reason to prefer logging over plain print statements.
I tend to use print statements because it's lazy and easy; adding logging needs some boilerplate code. But hey, we have yasnippets (emacs), ultisnips (vim), and other templating tools, so why give up logging for plain print statements?!
I would add to all the other mentioned advantages that the print function in its standard configuration is buffered: the flush may happen long after the call, for example when the buffer fills up or the program exits.
This is true for any program launched from a non-interactive shell (CodeBuild or GitLab CI, for instance) or whose output is redirected.
If for any reason the program is killed (kill -9, hard reset of the computer, …), you may be missing some lines of logs if you used print for them.
The logging library, however, flushes records written to stderr or stdout immediately on each call.
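A sketch of the difference, easiest to observe with stdout redirected to a file:

import logging
import sys

print("progress: step 1")              # may sit in the stdio buffer for a while
print("progress: step 2", flush=True)  # with print, an explicit flush is needed

logging.basicConfig(stream=sys.stderr, level=logging.INFO)
logging.info("progress: step 3")       # StreamHandler flushes after each record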
I'm thinking about where to write the log record around an operation. Here are two different styles. The first one writes the log before the operation.
Before:
log.info("Perform operation XXX")
operation()
And here is a different style, writing the log after the operation.
After:
operation()
log.info("Operation XXX is done.")
With the before-style, the logging records say what is about to be done. The pro of this style is that when something goes wrong, the developer can detect it easily, because they know what the program is doing. But the con is that you can't be sure the operation finished correctly; if something goes wrong inside the operation, for example a function call blocks and never returns, you can never know it by reading the logging records. With the after-style, you are sure the operation is done.
Of course, we can mix those two styles together.
Both:
log.info("Perform operation XXX")
operation()
log.info("Operation XXX is done.")
But I feel that is kind of verbose, and it produces double logging records. So here is my question: what is a good logging style? I would like to know what you think.
I'd typically use two different log levels.
The first one I put at the "debug" level, and the second one at the "info" level. That way typical production machines would only log what has been done, but I can turn on debug logging and see what the program tries to do before it errors out.
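In code, that comes down to something like this (log and operation as in the question):

log.debug("Performing operation XXX")  # only visible when debug logging is on
operation()
log.info("Operation XXX is done.")     # the record production machines keep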
It all depends on what you want to log. If you're interested in the code getting to the point where it's about to do an operation, log before. If you want to make sure the operation succeeded, log after. If you want both, do both.
Maybe you could use something like a try/except? Here's a naive Python example:
try:
    operation()
    log.info("Operation XXX is done.")
except Exception:
    log.info("Operation XXX failed.")
    raise  # optional: re-raise if you want to propagate the failure to another handler and/or crash eventually
The operation will be launched.
If it doesn't fail (no exception raised), you get a success statement in the logs.
If it fails (by raising an exception, like a full disk or whatever can go wrong in what you are trying to do), the exception is caught and you get a failure statement.
The log is more meaningful: you keep the verbosity to a one-liner and get to know whether the operation succeeded. Best of all choices.
Oh, and you get a hook point where you can add some code to be executed in case of failure.
I hope it helps.
There's another style that I've seen used in Linux boot scripts and in strace. It's got the advantages of your combined style with less verbosity, but you've got to make sure that your logging facility isn't doing any buffering. I don't know log.info, so here's a rough example with print:
print "Doing XXX... ", # Note lack of newline :)
operation()
print "Done."
(Since in most cases print uses buffering, using this example verbatim won't work properly. You won't see "Doing XXX" until you see the "Done". But you get the general idea.)
The other disadvantage of this style is that things can get mixed up if you have multiple threads writing to the same log.
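For completeness, a Python 3 sketch of the same style with the buffering problem fixed (operation() is the placeholder from the question):

print("Doing XXX... ", end="", flush=True)  # flush so the prefix shows up right away
operation()
print("Done.")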