What is the pythonic way to bubble up error conditions - python

I have been working on a Python project that has grown somewhat large, and has several layers of functions. Due to some boneheaded early decisions I'm finding that I have to go and fix a lot of crashers because the lower level functions are returning a type I did not expect in the higher level functions (usually None).
Before I go through and clean this up, I got to wondering what is the most pythonic way of indicating error conditions and handling them in higher functions?
What I have been doing for the most part is if a function can not complete and return its expected result, I'll return None. This gets a little gross, as you end up having to always check for None in all the functions that call it.
def lowLevel():
    if some_error_occurred:  # placeholder for the real failure check
        return None
    # processing was good, return normal result string
    return resultString

def highLevel():
    resultFromLow = lowLevel()
    if not resultFromLow:
        return None
    if some_processing_error_occurred:  # placeholder for the real failure check
        return None
    # processing was good, return normal result string
    return resultString
I guess another solution might be to throw exceptions. With that you still get a lot of code in the calling functions to handle the exception.
Nothing seems super elegant. What do other people use? In obj-c a common pattern is to return an error parameter by reference, and then the caller checks that.

It really depends on what you want to do about the fact this processing can't be completed. If these are really exceptions to the ordinary control flow, and you need to clean up, bail out, email people etc. then exceptions are what you want to throw. In this case the handling code is just necessary work, and is basically unavoidable, although you can rethrow up the stack and handle the errors in one place to keep it tidier.
If these errors can be tolerated, then one elegant solution is to use the Null Object pattern. Rather than returning None, you return an object that mimics the interface of the real object that would normally be returned, but just doesn't do anything. This allows all code downstream of the failure to continue to operate, oblivious to the fact there was a failure. The main downside of this pattern is that it can make it hard to spot real errors, since your code will not crash, but may not produce anything useful at the end of its run.
A common example of the Null Object pattern in Python is returning an empty list or dict when your lower-level function has come up empty. Any subsequent function using this returned value and iterating through elements will just fall through silently, without the need for error checking. Of course, if you require the list to have at least one element for some reason, then this won't work, and you're back to handling an exceptional situation again.
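To make the empty-list idea concrete, here is a minimal sketch. The function names `find_matches` and `shout_matches` are made up for illustration and are not from the question.

```python
def find_matches(records, keyword):
    """Return every record containing keyword; an empty list means 'nothing found'."""
    if not records:
        # Nothing to search: return an empty list rather than None.
        return []
    return [r for r in records if keyword in r]


def shout_matches(records, keyword):
    # No None check needed: looping over an empty list simply does nothing.
    return [match.upper() for match in find_matches(records, keyword)]


print(shout_matches(["spam", "eggs"], "sp"))
print(shout_matches([], "sp"))
```

The caller never has to test for None, which is exactly the convenience (and the risk of silently doing nothing) described above.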

On the bright side, you have discovered exactly the problem with using a return value to indicate an error condition.
Exceptions are the Pythonic way to deal with problems. The question you have to answer (and I suspect you already did) is: is there a useful default that can be returned by the low_level functions? If so, take it and run with it; otherwise, raise an exception (ValueError, TypeError, or even a custom error).
Then, further up the call stack, where you know how to deal with the problem, catch the exception and deal with it. You don't have to catch exceptions immediately -- if high_level calls mid-level calls low_level, it's okay to not have any try/except in mid_level and let high_level deal with it. It may be that all you can do is have a try/except at the top of your program to catch and log all uncaught and undealt-with errors, and that can be okay.
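A rough sketch of that layering might look like the following. `ProcessingError` and the bodies of the three functions are assumptions made up for the example; only the low/mid/high naming comes from the answer.

```python
import logging


class ProcessingError(Exception):
    """Hypothetical custom error for this sketch."""


def low_level(text):
    if not text:
        raise ProcessingError("no input to process")
    return text.strip().upper()


def mid_level(text):
    # No try/except here: an error from low_level simply propagates upward.
    return "result: " + low_level(text)


def high_level(text):
    try:
        return mid_level(text)
    except ProcessingError as exc:
        # This layer knows what to do about the failure.
        logging.warning("could not process input: %s", exc)
        return "result: <unavailable>"
```

Note that mid_level contains no error handling at all, so only the layer that can actually do something about the failure pays for it.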

This is not necessarily Pythonic as such, but experience has taught me to let exceptions "lie where they lie".
That is to say: don't unnecessarily hide them or re-raise a different exception.
It's sometimes good practice to let the callee fail rather than trying to capture and hide all kinds of error conditions.
Obviously this topic is, and can be, a little subjective; but if you don't hide or raise a different exception, then it's much easier to debug your code and much easier for the caller of your functions or API to understand what went wrong.
Note: This answer is not complete -- see comments. Some or all of the answers presented in this Q&A should probably be combined in a nice way, presenting the various problems and solutions in a clear and concise manner.

Related

Mutable Default Arguments - (Why) is my code dangerous?

My code triggers a warning in pylint:
def getInsertDefault(collection=['key', 'value'], usefile='defaultMode.xml'):
    return doInsert(collection, usefile, True)
The warning is pretty clear: it's about mutable default arguments. I get the point that in several instances this can give a wrong impression of what's happening. There are several posts on SO already, but it doesn't feel like this particular case is covered.
Most questions and examples deal with empty lists which are weak-referenced and can cause an error.
I'm also aware it's better practice to change the code to getInsertDefault(collection=None ...) but in this method for default-initialization, I don't intend to do anything with the list except reading, (why) is my code dangerous or could result in a pitfall?
--EDIT--
To the point: Why is the empty dictionary a dangerous default value in Python? would be answering the question.
Kind of: I am aware my code is against the convention and could result in a pitfall - but in this very specific case: Am I safe?
I found the suggestion in the comments useful to use collection=('key', 'value') instead as it's conventional and safe. Still, out of pure interest: Is my previous attempt able to create some kind of major problem?
Assuming that doInsert() (and whatever code doInsert is calling) is only ever reading collection, there is indeed no immediate issue - just a time bomb.
As soon as any part of the code that sees this list starts mutating it, your code will break in the most unexpected way, and you may have a hard time debugging the issue. Imagine that what changes it is a third-party library function tens of stack frames away. And that's the best case, where the issue and its root cause are still in the same direct branch of the call stack; the list might instead be stored as an instance attribute somewhere and mutated by some unrelated call, and then you're in for some fun.
Now the odds that this ever happens are rather low, but given the potential for subtle, hard-to-track bugs this introduces (it might just produce incorrect results once in a while, not necessarily crash the program), you should think twice before assuming that this is really "safe".
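To see the time bomb go off, here is a small sketch. `append_default` and `append_safe` are made-up names; the default value is borrowed from the question.

```python
def append_default(item, collection=['key', 'value']):
    # The default list is created once, when the function is defined,
    # and shared by every call that omits `collection`.
    collection.append(item)
    return collection


first = append_default('a')
second = append_default('b')
print(second)  # the 'a' from the first call is still in there


def append_safe(item, collection=None):
    # Conventional fix: build a fresh list on every call.
    if collection is None:
        collection = ['key', 'value']
    collection.append(item)
    return collection
```

With a tuple default such as ('key', 'value'), the same mistake would fail loudly instead, because tuples have no append method.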

Big exception handler or lots of try...except clauses

I have a question which is about code design in python.
I'm working on a certain project and I can see that there are a number of different types of errors that I have to handle often, which results in lots of places where the same try...except clause repeats itself.
Now the question is: would it be preferable to create one exception handler (a decorator) and decorate with it all those functions that have those repeating errors?
The trade-off here is that if I create this exception handler decorator, it will become quite a big class/function, which forces the person reading the code to understand another (maybe complicated) piece of logic just to see how the error is handled, whereas if I don't use the decorator, it's pretty clear to the reader how it is handled.
Another option is to create multiple decorators for each of the types of the errors.
Or maybe just leave all those try...except clauses even though they are being repeated.
Any opinions on the matter and maybe other solutions? Thanks!
A lot of this is subjective, but I personally think it's better for exception handling code to be close to where the error is occurring, for the sake of readability and debugging ease. So:
The trade-off here is that if I create this exception handler decorator, it will become quite a big class/function
I would recommend against the Mr. Fixit class. When an error occurs and the debugger drops you into the Mr. Fixit, you then have to walk back quite a bit before you can figure out why the error happened, and what needs to be fixed to make it go away. Likewise, an unfamiliar developer reading your code loses the ability to understand just one small snippet pertaining to a particular error, and now has to work through a large class. As an added issue, a lot of what's in the Mr. Fixit is irrelevant to the one error they're looking at, and the place where the error handling occurs is in an entirely different place. With decorators especially, I feel like you are sacrificing readability (especially for someone less familiar with decorators than you) while gaining not much.
If written with some care, try/except blocks are not very performance intensive and do not clutter up code too much. I would suggest erring on the side of more try/except blocks, with every one close to what it's handling, so that you can tell at a glance how errors are handled for any given piece of code (without having to go to a different file).
If you are repeating code a lot, you can either refactor by making the code inside the except clause a method that can be repeatedly called, or by making the code inside the try its own method that does its error handling inside its body. When in doubt, keep it simple.
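As an illustration of both refactorings, here is a sketch under made-up assumptions: `load_config` and `report_bad_config` are hypothetical names, and reading a JSON file merely stands in for whatever operation keeps failing in your project.

```python
import json
import logging


def report_bad_config(path, exc):
    # Shared handler body: the code that used to be repeated in each except clause.
    logging.error("cannot use config %s: %s", path, exc)
    return {}


def load_config(path):
    # The risky operation lives in its own function and handles its own errors.
    try:
        with open(path) as handle:
            return json.load(handle)
    except (OSError, json.JSONDecodeError) as exc:
        return report_bad_config(path, exc)
```

The try/except stays right next to the code it protects, and only the handler body is shared.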
Also, I hate being a close/off-topic Nazi so I won't flag, but I do think this question is more suited to Programmers#SE (being an abstract philosophy/conceptual question) and you might get better responses on that site.

Python: statically detect unhandled exceptions

I'm trying to write a highly-reliable piece of code in Python. The most common issue I run into is that after running for a while, some edge case will occur and raise an exception that I haven't handled. This most often happens when using external libraries - without reading the source code, I don't know of an easy way to get a list of all exceptions that might be raised when using a specific library function, so it's hard to know what to handle. I understand that it's bad practice to use catch-all handlers, so I haven't been doing that.
Is there a good solution for this? It seems like it should be possible for a static analysis tool to check that all exceptions are handled, but I haven't found one. Does this exist? If not, why? (is it impossible? a bad idea? etc) I especially would like it to analyze imported code for the reason explained above.
"it's bad practice to use catch-all handlers" to ignore exceptions:
Our web service has an except which wraps the main loop.
except:
    log_exception()
    attempt_recovery()
This is good, as it notifies us (necessary) of the unexpected error and then tries to recover (not necessary). We can then go look at those logs and figure out what went wrong so we can prevent it from hitting our general exception again.
This is what you want to avoid:
except:
    pass
Because it ignores the error... then you don't know an error happened and your data may be corrupted/invalid/gone/stolen by bears. Your server may be up/down/on fire. We have no idea because we ignored the exception.
Python doesn't require registering of what exceptions might be thrown, so there are no checks for all exceptions a module might throw, but most will give you some idea of what you should be ready to handle in the docs. Depending on your service, when it gets an unhandled exception, you might want to:
Log it and crash
Log it and attempt to continue
Log it and restart
Notice a trend? The action changes, but you never want to ignore it.
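A sketch of the "log it and attempt to continue" variant, using the standard logging module. `handle` and `main_loop` are hypothetical stand-ins, not the web service described above, and this version catches Exception rather than using a bare except.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("service")


def handle(request):
    # Hypothetical request handler; a zero request simulates an unexpected edge case.
    return 100 // request


def main_loop(requests):
    results = []
    for request in requests:
        try:
            results.append(handle(request))
        except Exception:
            # Log the full traceback, then carry on with the next request.
            log.exception("unexpected error while handling %r", request)
            results.append(None)
    return results
```

log.exception records the traceback along with the message, so the failure is never silently lost even though the loop keeps running.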
Great question.
You can try approaching the problem statically (by, for instance, introducing a custom flake8 rule?), but, I think, it's a problem of testing and test coverage scope. I would approach the problem by adding more "negative path"/"error handling" checks/tests to the places where third-party packages are used, introducing mock side effects whenever needed and monitoring coverage reports at the same time.
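For the "mock side effects" part, a negative-path test could be sketched like this with unittest.mock. `fetch_title` and the client's `get` method are assumptions invented for the example, not any real library's API.

```python
from unittest import mock


def fetch_title(client, url):
    """Hypothetical wrapper around a third-party client with a .get(url) method."""
    try:
        return client.get(url)["title"]
    except (ConnectionError, KeyError):
        # The negative paths we decided to handle.
        return None


def test_fetch_title_handles_connection_error():
    client = mock.Mock()
    # Make the fake third-party call raise instead of returning.
    client.get.side_effect = ConnectionError("network down")
    assert fetch_title(client, "http://example.invalid") is None


def test_fetch_title_handles_missing_key():
    client = mock.Mock()
    client.get.return_value = {}
    assert fetch_title(client, "http://example.invalid") is None
```

Each test forces one failure mode, so a coverage report will show whether every except branch is actually exercised.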
I would also look into Mutation Testing idea (check out Cosmic Ray Python package). I am not sure if mutants can be configured to also throw exceptions, but see if it could be helpful.
Instead of trying to handle all exceptions, which is very hard as you described, why don't you catch all, but exclude some, for example the KeyboardInterrupt you mentioned in the comments?
This may help
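One way to read "catch all, but exclude some" is sketched below. `run_guarded` is a made-up helper name for the example.

```python
def run_guarded(task):
    """Run task(); recover from ordinary errors but let interrupts and exits through."""
    try:
        return task()
    except (KeyboardInterrupt, SystemExit):
        # Explicitly excluded from the catch-all: re-raise untouched.
        raise
    except BaseException as exc:
        print(f"recovered from {type(exc).__name__}: {exc}")
        return None
```

In practice, a plain `except Exception:` already lets these two through, because KeyboardInterrupt and SystemExit derive from BaseException rather than from Exception.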

Can I change the behaviour of "raise" or "Exception"? [duplicate]

This question already has answers here:
Calling a hook function every time an Exception is raised
(4 answers)
Closed 6 years ago.
My project's code is full of blocks like the following:
try:
    execute_some_code()
except Exception:
    print(datetime.datetime.now())
    raise
simply because, if I get an error message, I'd like to know when it happened. I find it rather silly to repeat this code over and over, and I'd like to factor it away.
I don't want to decorate execute_some_code with something that does the error capturing (because sometimes it's just a block of code rather than a function call, and sometimes I don't need the exact same function to be decorated like that). I also don't want to divert stdout to some different stream that logs everything, because that would affect every other thing that gets sent to stdout as well.
Ideally, I'd like to over-ride the behaviour of either the raise statement (to also print datetime.datetime.now() on every execution) or the Exception class, to pre-pend all of its messages with the time. I can easily sub-class from Exception, but then I'd have to make sure my functions raise an instance of this subclass, and I'd have just as much code duplication as currently.
Is either of these options possible?
You might be able to modify python (I'd have to read code to be sure how complex that'd be), but:
You do not want to replace raise with different behaviour - trying and catching is a very pythonic approach to problem solving, so there's lots of code that works very well by e.g. calling a method and letting that method raise an exception, catching that under normal circumstances. So we can rule that approach out – you really only want to know about the exceptions you care about, not the ones that are normal during operation.
The same goes for triggering some action whenever an Exception instance is created – but:
You might be able to overwrite the global namespace, at least for things that get initialized after you declared your own Exception class. You could then add a message property that includes a timestamp. Don't do that, though: there might be people actually relying on the message to automatically react to Exceptions (bad style, but sadly not all that rare).
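An alternative that is not proposed in the answer above, but that leaves both raise and Exception alone, is to factor the repeated try/except into a context manager. This is only a sketch; `timestamped_errors` is a made-up name.

```python
import contextlib
import datetime


@contextlib.contextmanager
def timestamped_errors(stream=None):
    """Print the current time if the wrapped block raises, then re-raise."""
    try:
        yield
    except Exception:
        # stream=None makes print() fall back to sys.stdout, as in the question.
        print(datetime.datetime.now(), file=stream)
        raise


# Usage: wrap any block of code, not just a single function call.
# with timestamped_errors():
#     execute_some_code()
```

Since it wraps a with block, it works for arbitrary code and does not require decorating execute_some_code itself.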

Most pythonic way to call dependent methods

I have a class with a few methods. Each one sets some internal state, and usually requires some other method to be called first to set the stage.
Typical invocation goes like this:
c = MyMysteryClass()
c.connectToServer()
c.downloadData()
c.computeResults()
In some cases only connectToServer() and downloadData() will be called (or even just connectToServer() alone).
The question is: how should those methods behave when they are called in wrong order (or, in other words, when the internal state is not yet ready for their task)?
I see two solutions:
They should throw an exception
They should call correct previous method internally
Currently I'm using second approach, as it allows me to write less code (I can just write c.computeResults() and know that two other methods will be called if necessary). Plus, when I call them multiple times, I don't have to keep track of what was already called and so I avoid multiple reconnecting or downloading.
On the other hand, first approach seems more predictable from the caller perspective, and possibly less error prone.
And of course, there is the possibility of a hybrid solution: throw an exception, and add another layer of methods with internal state detection and proper calling of previous ones. But that seems to be a bit of overkill.
Your suggestions?
They should throw an exception. As said in the Zen of Python: Explicit is better than implicit. And, for that matter, Errors should never pass silently. Unless explicitly silenced. If the methods are called out of order that's a programmer's mistake, and you shouldn't try to fix that by guessing what they mean. You might accidentally cover up an oversight in a way that looks like it works, but is not actually an accurate reflection of the programmer's intent. (That programmer may be future you.)
If these methods are usually called immediately one after another, you could consider collating them by adding a new method that simply calls them all in a row. That way you can use that method and not have to worry about getting it wrong.
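Both suggestions together might be sketched as follows. `DataPipeline` is a hypothetical stand-in for the question's class, with no real networking, and RuntimeError is just one reasonable choice of exception.

```python
class DataPipeline:
    """Hypothetical stand-in for the question's class; no real networking."""

    def __init__(self):
        self._connected = False
        self._data = None

    def connect_to_server(self):
        self._connected = True

    def download_data(self):
        if not self._connected:
            raise RuntimeError("call connect_to_server() before download_data()")
        self._data = [1, 2, 3]

    def compute_results(self):
        if self._data is None:
            raise RuntimeError("call download_data() before compute_results()")
        return sum(self._data)

    def run_all(self):
        # Convenience method: the common sequence, in the right order.
        self.connect_to_server()
        self.download_data()
        return self.compute_results()
```

Calling compute_results() on a fresh instance fails immediately with a message naming the missing step, while run_all() covers the common case without any guessing.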
Note that classes that handle internal state in this way are sometimes called for but are often not, in fact, necessary. Depending on your use case and the needs of the rest of your application, you may be better off doing this with functions and actually passing connection objects, etc. from one method to another, rather than using a class to store internal state. See for instance Stop Writing Classes. This is just something to consider and not an imperative; plenty of reasonable people disagree with the theory behind Stop Writing Classes.
You should write exceptions. It is good programming practice to write Exceptions to make your code easier to understand for the following reasons:
What you are describing fits the literal description of "exception" -- it is an exception to normal proceedings.
If you build in some kind of work around, you will likely have "spaghetti code" = BAD.
When you, or someone else goes back and reads this code later, it will be difficult to understand if you do not provide the hint that it is an exception to have these methods executed out of order.
Here's a good source:
http://jeffknupp.com/blog/2013/02/06/write-cleaner-python-use-exceptions/
As my CS professor always said "Good programmers can write code that computers can read, but great programmers write code that humans and computers can read".
I hope this helps.
If it's possible, you should make the dependencies explicit.
For your example:
c = MyMysteryClass()
connection = c.connectToServer()
data = c.downloadData(connection)
results = c.computeResults(data)
This way, even if you don't know how the library works, there's only one order the methods could be called in.
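Taken one step further, the same idea works with plain functions that pass their results along. This is a sketch with made-up function bodies; the "connection" here is just a dict.

```python
def connect_to_server(host):
    # Hypothetical: returns a "connection" (here just a dict) instead of storing it.
    return {"host": host, "open": True}


def download_data(connection):
    if not connection["open"]:
        raise ConnectionError("connection is closed")
    return [1, 2, 3]


def compute_results(data):
    return sum(data)


connection = connect_to_server("example.invalid")
results = compute_results(download_data(connection))
print(results)
```

Skipping a step is now impossible to do silently: calling download_data() without a connection raises a TypeError for the missing argument.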
