Big exception handler or lots of try...except clauses - python

I have a question which is about code design in python.
I'm working on a project in which a handful of error types have to be handled often, which results in lots of places with a try...except clause that repeats itself.
Now the question is: would it be preferable to create one exception handler (a decorator) and decorate all the functions that raise those repeating errors with it?
The trade-off is that such a decorator would become quite a big class/function, so a person reading the code would have to work through another piece of (possibly complicated) logic to understand how the errors are handled, whereas without the decorator it's pretty clear to the reader how each error is handled.
Another option is to create multiple decorators, one for each error type.
Or maybe just leave all those try...except clauses, even though they are repeated.
Any opinions on the matter and maybe other solutions? Thanks!
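For illustration, the kind of decorator I have in mind would be something like this (the names and the handled exception types are just placeholders):
import functools

def handle_common_errors(func):
    """Hypothetical catch-all decorator for the errors that keep repeating."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except ConnectionError:
            pass  # reconnect / log / retry...
        except TimeoutError:
            pass  # retry / log...
        except ValueError:
            pass  # log / re-raise...
    return wrapper

@handle_common_errors
def fetch_record(record_id):
    ...  # the actual work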

A lot of this is subjective, but I personally think it's better for exception handling code to be close to where the error is occurring, for the sake of readability and debugging ease. So:
The trade-off is that such a decorator would become quite a big class/function
I would recommend against the Mr. Fixit class. When an error occurs and the debugger drops you into the Mr. Fixit, you then have to walk back quite a bit before you can figure out why the error happened and what needs to be fixed to make it go away. Likewise, an unfamiliar developer reading your code loses the ability to understand just one small snippet pertaining to a particular error, and now has to work through a large class. As an added issue, a lot of what's in the Mr. Fixit is irrelevant to the one error they're looking at, and the error handling happens in an entirely different place from the code it protects. With decorators especially, I feel like you are sacrificing readability (particularly for someone less familiar with decorators than you) while gaining little.
If written with some care, try/catch blocks are not very performance intensive and do not clutter up code too much. I would suggest erring on the side of more try/catches, with every try/catch close to what it's handling, so that you can tell at a glance how errors are handled for any given piece of code (without having to go to a different file).
If you are repeating code a lot, you can either refactor by making the code inside the catch a method that can be repeatedly called, or by making the code inside the try its own method that does its error handling inside its body. When in doubt, keep it simple.
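For example, the two refactorings might look roughly like this (the logging setup and the download function are made up for the sketch):
import logging

log = logging.getLogger(__name__)

def handle_download_error(exc):
    # shared handler, so the except bodies aren't copy-pasted everywhere
    log.warning("download failed: %s", exc)

def process_item(item):
    try:
        download(item)          # hypothetical risky call
    except IOError as exc:
        handle_download_error(exc)

def download_safely(item):
    # alternative: the risky code and its handling live together in one method
    try:
        download(item)
    except IOError as exc:
        log.warning("download failed: %s", exc)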
Also, I hate being a close/off-topic Nazi so I won't flag, but I do think this question is more suited to Programmers.SE (being an abstract philosophy/conceptual question) and you might get better responses on that site.


Mutable Default Arguments - (Why) is my code dangerous?

My code triggers a warning in pylint:
def getInsertDefault(collection=['key', 'value'], usefile='defaultMode.xml'):
    return doInsert(collection, usefile, True)
The warning is pretty clear: it's about mutable default arguments, and I get the point that in several situations they can give a wrong impression of what's happening. There are several posts about this on SO already, but none of them seems to cover this case.
Most questions and examples deal with empty lists that get mutated later and therefore cause errors.
I'm also aware it's better practice to change the code to getInsertDefault(collection=None, ...), but in this method the default is only used for initialization and I don't intend to do anything with the list except read it. (Why) is my code dangerous, or could it result in a pitfall?
--EDIT--
To the point: the question "Why is the empty dictionary a dangerous default value in Python?" would mostly answer this.
Kind of: I am aware my code goes against the convention and could result in a pitfall - but in this very specific case: am I safe?
I found the suggestion in the comments to use collection=('key', 'value') instead useful, as it's conventional and safe. Still, out of pure interest: could my previous attempt cause some kind of major problem?
Assuming that doInsert() (and whatever code doInsert is calling) is only ever reading collection, there is indeed no immediate issue - just a time bomb.
As soon as any part of the code that sees this list starts mutating it, your code will break in the most unexpected way, and you may have a hard time debugging the issue (imagine if what changes it is a third-party library function tens of stack frames away... and that's the best case, where the issue and its root cause are still in the same direct branch of the call stack - the list might instead be stored as an instance attribute somewhere and mutated by some unrelated call, and then you're in for some fun).
Now the odds that this ever happens are rather low, but given the potential for subtle bugs this introduces (it might just produce incorrect results once in a while rather than crash the program) and how hard they are to track down, you should think twice before assuming this is really "safe".
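A minimal demonstration of how the time bomb goes off, reusing the signature from the question:
def getInsertDefault(collection=['key', 'value']):
    return collection

result = getInsertDefault()
result.append('oops')       # some distant code mutates the list it was handed
print(getInsertDefault())   # ['key', 'value', 'oops'] -- the default has changed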

string to (possible) integer. Error Handling: Filter with regex or try...catch?

I'm sitting before this problem and I'm not sure which path I should choose.
I get string inputs representing an ID.
Most of the time they will be meaningful keynames used to look up the actual ID in a table/dict.
But direct integer ID input should be possible as well, and those integers will also arrive as strings.
I'm not sure whether this is more a style question or whether there is a strong leaning towards one option.
Option 1:
Use a regexp to check whether the string is an integer; if it matches, convert it.
Option 2:
try:
    ID = int(string)
except ValueError:
    ID = table[string]
I'm leaning towards the try...catch option as it looks cleaner, but I'm not sure whether avoiding error handling via the regex would deep down actually be the cleaner way and should be preferred.
I have no deeper knowledge of try...catch: is it actually pretty smooth ('use it if you like'), or a hiccup ('avoid it if you can')?
"He who would sacrifice correctness for performance deserves neither[!?]."
TL;DR:
try...catch can be very fast and looks cleaner than code that tries to steer around errors. On the other hand, it can be harder for others to follow when the try blocks are big and things happen behind the scenes that you possibly don't want. This is of course language dependent.
A regexp can be very slow, and a cheap filter in front of it can save you.
--
So trying to answer this myself now:
I hoped for a general answer, but in the end it is of course language dependent.
While doing research I often found the argument that exceptions are expensive: creating the exception object, collecting the call stack... Sure, that makes sense. But often there was no real explanation, and it felt more like a stigma or superstition: try...catch is bad.
From my tests (C++): the try...catch method was the fastest overall, a mere 3% drop in execution speed. Two simple "does the string start with digits?" regexps ("^-?\d+" for integer, plus one for float) caused a 50% drop, and since I do some substring analysis after these, the 50% compounded noticeably.
In the end I stumbled over Bjarne Stroustrup's (the creator of C++) own FAQ:
http://www.stroustrup.com/bs_faq2.html#exceptions-why
What good can using exceptions do for me? The basic answer is: Using
exceptions for error handling makes your code simpler, cleaner, and
less likely to miss errors. But what's wrong with "good old errno and
if-statements"? The basic answer is: Using those, your error handling
and your normal code are closely intertwined. That way, your code gets
messy and it becomes hard to ensure that you have dealt with all
errors (think "spaghetti code" or a "rat's nest of tests").
[...]
Common objections to the use of exceptions:
but exceptions are expensive!: Not really. Modern C++ implementations
reduce the overhead of using exceptions to a few percent (say, 3%) and
that's compared to no error handling. Writing code with error-return
codes and tests is not free either. As a rule of thumb, exception
handling is extremely cheap when you don't throw an exception. It
costs nothing on some implementations. All the cost is incurred when
you throw an exception: that is, "normal code" is faster than code
using error-return codes and tests. You incur cost only when you have
an error.
Well, in the end I decided to use a simple manual filter (is the first character in [-,0-9]?) before the regexp. That might be 10% slower than try...catch, but it does not throw errors 80% of the time. Still good performance and nice code :)
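For completeness, a rough Python sketch of that final approach (the lookup table is assumed; my actual tests were in C++):
import re

INT_RE = re.compile(r'^-?\d+$')

def to_id(s, table):
    # cheap first-character filter: most keynames bail out here without regex cost
    if s and (s[0].isdigit() or s[0] == '-') and INT_RE.match(s):
        return int(s)
    return table[s]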
From the Python glossary: https://docs.python.org/3/glossary.html#term-eafp
EAFP
Easier to ask for forgiveness than permission. This common Python
coding style assumes the existence of valid keys or attributes and
catches exceptions if the assumption proves false. This clean and fast
style is characterized by the presence of many try and except
statements. The technique contrasts with the LBYL style common to many
other languages such as C.
LBYL
Look before you leap. This coding style explicitly tests for pre-conditions before making calls or lookups. This style contrasts
with the EAFP approach and is characterized by the presence of many if
statements.
In a multi-threaded environment, the LBYL approach can risk introducing a race condition between “the looking” and “the leaping”.
For example, the code, if key in mapping: return mapping[key] can fail
if another thread removes key from mapping after the test, but before
the lookup. This issue can be solved with locks or by using the EAFP
approach.
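The glossary's dictionary example, written out both ways:
mapping = {'spam': 1}
key, default = 'eggs', 0

# LBYL: test first -- racy if another thread removes the key between test and lookup
if key in mapping:
    value = mapping[key]
else:
    value = default

# EAFP: just try it; the lookup and its failure handling cannot be separated
try:
    value = mapping[key]
except KeyError:
    value = default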

Python: statically detect unhandled exceptions

I'm trying to write a highly-reliable piece of code in Python. The most common issue I run into is that after running for a while, some edge case will occur and raise an exception that I haven't handled. This most often happens when using external libraries - without reading the source code, I don't know of an easy way to get a list of all exceptions that might be raised when using a specific library function, so it's hard to know what to handle. I understand that it's bad practice to use catch-all handlers, so I haven't been doing that.
Is there a good solution for this? It seems like it should be possible for a static analysis tool to check that all exceptions are handled, but I haven't found one. Does this exist? If not, why? (is it impossible? a bad idea? etc) I especially would like it to analyze imported code for the reason explained above.
"it's bad practice to use catch-all handlers" to ignore exceptions:
Our web service has an except which wraps the main loop.
except:
log_exception()
attempt_recovery()
This is good, as it notifies us (necessary) of the unexpected error and then tries to recover (not necessary). We can then go look at those logs and figure out what went wrong so we can prevent it from hitting our general exception again.
This is what you want to avoid:
except:
    pass
Because it ignores the error... then you don't know an error happened and your data may be corrupted/invalid/gone/stolen by bears. Your server may be up/down/on fire. We have no idea because we ignored the exception.
Python doesn't require declaring which exceptions might be thrown, so there is no check that you've handled every exception a module might throw, but most modules will give you some idea in their docs of what you should be ready to handle. Depending on your service, when it gets an unhandled exception, you might want to:
Log it and crash
Log it and attempt to continue
Log it and restart
Notice a trend? The action changes, but you never want to ignore it.
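For example, the "log it and crash" flavour might look like this (run_service is a stand-in for your real entry point):
import logging

try:
    run_service()                         # hypothetical service entry point
except Exception:
    logging.exception("unhandled error")  # log it, with the full traceback...
    raise                                 # ...and crash (or restart/continue instead)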
Great question.
You can try approaching the problem statically (by, for instance, introducing a custom flake8 rule?), but I think it's really a problem of testing and test-coverage scope. I would approach it by adding more "negative path"/"error handling" checks and tests to the places where third-party packages are used, introducing mock side effects whenever needed and monitoring coverage reports at the same time.
I would also look into the idea of Mutation Testing (check out the Cosmic Ray Python package). I am not sure whether mutants can be configured to also throw exceptions, but see if it could be helpful.
Instead of trying to handle every specific exception, which is very hard as you described, why not catch all of them but exclude a few, for example the KeyboardInterrupt you mentioned in the comments? The sketch below may help.
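One way to spell that (do_work and the handlers are placeholders; note that except Exception already excludes KeyboardInterrupt and SystemExit, since they derive from BaseException):
try:
    do_work()
except (KeyboardInterrupt, SystemExit):
    raise                    # let these propagate so Ctrl-C and clean exits still work
except Exception as exc:     # unlike a bare except:, this skips the two above
    log_exception(exc)
    attempt_recovery()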

Most pythonic way to call dependent methods

I have a class with a few methods; each one sets some internal state, and usually requires some other method to be called first to prepare the stage.
Typical invocation goes like this:
c = MyMysteryClass()
c.connectToServer()
c.downloadData()
c.computeResults()
In some cases only connectToServer() and downloadData() will be called (or even just connectToServer() alone).
The question is: how should those methods behave when they are called in wrong order (or, in other words, when the internal state is not yet ready for their task)?
I see two solutions:
They should throw an exception
They should call correct previous method internally
Currently I'm using the second approach, as it allows me to write less code (I can just call c.computeResults() and know that the two other methods will be called if necessary). Plus, when I call them multiple times, I don't have to keep track of what was already called, so I avoid reconnecting or re-downloading.
On the other hand, the first approach seems more predictable from the caller's perspective, and possibly less error prone.
And of course, there is a possibility for a hybrid solution: throw an exception, and add another layer of methods with internal state detection that call the previous ones properly. But that seems to be a bit of an overkill.
Your suggestions?
They should throw an exception. As said in the Zen of Python: Explicit is better than implicit. And, for that matter, Errors should never pass silently. Unless explicitly silenced. If the methods are called out of order that's a programmer's mistake, and you shouldn't try to fix that by guessing what they mean. You might accidentally cover up an oversight in a way that looks like it works, but is not actually an accurate reflection of the programmer's intent. (That programmer may be future you.)
If these methods are usually called immediately one after another, you could consider collating them by adding a new method that simply calls them all in a row. That way you can use that method and not have to worry about getting it wrong.
Note that classes that handle internal state in this way are sometimes called for but are often not, in fact, necessary. Depending on your use case and the needs of the rest of your application, you may be better off doing this with functions and actually passing connection objects, etc. from one method to another, rather than using a class to store internal state. See for instance Stop Writing Classes. This is just something to consider and not an imperative; plenty of reasonable people disagree with the theory behind Stop Writing Classes.
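For example, if you do keep the class, a sketch of both suggestions (the method bodies are placeholders):
class MyMysteryClass:
    def __init__(self):
        self.connection = None
        self.data = None

    def connectToServer(self):
        self.connection = ...            # whatever establishes the connection

    def downloadData(self):
        if self.connection is None:
            raise RuntimeError("call connectToServer() first")
        self.data = ...                  # fetch using self.connection

    def computeResults(self):
        if self.data is None:
            raise RuntimeError("call downloadData() first")
        return ...                       # compute from self.data

    def fetchAndCompute(self):
        """Convenience method that runs the steps in the right order."""
        self.connectToServer()
        self.downloadData()
        return self.computeResults()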
You should raise exceptions. It is good programming practice to raise exceptions to make your code easier to understand, for the following reasons:
What you describe fits the literal meaning of "exception": it is an exception to normal proceedings.
If you build in some kind of workaround, you will likely end up with "spaghetti code" = BAD.
When you, or someone else, goes back and reads this code later, it will be difficult to understand if you do not provide the hint that executing these methods out of order is exceptional.
Here's a good source:
http://jeffknupp.com/blog/2013/02/06/write-cleaner-python-use-exceptions/
As my CS professor always said "Good programmers can write code that computers can read, but great programmers write code that humans and computers can read".
I hope this helps.
If it's possible, you should make the dependencies explicit.
For your example:
c = MyMysteryClass()
connection = c.connectToServer()
data = c.downloadData(connection)
results = c.computeResults(data)
This way, even if you don't know how the library works, there's only one order the methods could be called in.

What is the pythonic way to bubble up error conditions

I have been working on a Python project that has grown somewhat large and has several layers of functions. Due to some boneheaded early decisions, I'm finding that I have to go and fix a lot of crashes because the lower-level functions return a type I did not expect in the higher-level functions (usually None).
Before I go through and clean this up, I got to wondering what is the most pythonic way of indicating error conditions and handling them in higher functions?
What I have been doing for the most part is: if a function cannot complete and return its expected result, I return None. This gets a little gross, as you end up always having to check for None in all the functions that call it.
def lowLevel():
    if some_error_occurred:       # placeholder condition
        return None
    # processing was good, return the normal result string
    return resultString

def highLevel():
    resultFromLow = lowLevel()
    if not resultFromLow:
        return None
    if some_processing_error:     # placeholder condition
        return None
    # processing was good, return the normal result string
    return resultString
I guess another solution might be to throw exceptions. With that you still get a lot of code in the calling functions to handle the exception.
Nothing seems super elegant. What do other people use? In Objective-C a common pattern is to return an error parameter by reference and have the caller check it.
It really depends on what you want to do about the fact this processing can't be completed. If these are really exceptions to the ordinary control flow, and you need to clean up, bail out, email people etc. then exceptions are what you want to throw. In this case the handling code is just necessary work, and is basically unavoidable, although you can rethrow up the stack and handle the errors in one place to keep it tidier.
If these errors can be tolerated, then one elegant solution is to use the Null Object pattern. Rather than returning None, you return an object that mimics the interface of the real object that would normally be returned, but just doesn't do anything. This allows all code downstream of the failure to continue to operate, oblivious to the fact there was a failure. The main downside of this pattern is that it can make it hard to spot real errors, since your code will not crash, but may not produce anything useful at the end of its run.
A common example of the Null Object pattern in Python is returning an empty list or dict when your lower-level function has come up empty. Any subsequent function using this return value and iterating through its elements will just fall through silently, with no need for error checking. Of course, if you require the list to have at least one element for some reason, then this won't work, and you're back to handling an exceptional situation again.
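A sketch of the pattern (the search call is invented for illustration):
def find_users(query):
    try:
        return database.search(query)   # hypothetical lookup that may fail
    except LookupError:
        return []                       # null object: an empty, but valid, result

# downstream code needs no error checks; the loop simply does nothing on failure
for user in find_users("inactive"):
    notify(user)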
On the bright side, you have discovered exactly the problem with using a return value to indicate an error condition.
Exceptions are the pythonic way to deal with problems. The question you have to answer (and I suspect you already did) is: is there a useful default that can be returned by low-level functions? If so, take it and run with it; otherwise, raise an exception (ValueError, TypeError, or even a custom error).
Then, further up the call stack, where you know how to deal with the problem, catch the exception and deal with it. You don't have to catch exceptions immediately: if high_level calls mid_level, which calls low_level, it's fine to have no try/except in mid_level and let high_level deal with it. It may be that all you can do is have a try/except at the top of your program to catch and log all otherwise-unhandled errors, and that can be okay.
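A sketch of that shape (the names are invented for illustration):
import logging

log = logging.getLogger(__name__)

def low_level(raw):
    if not raw:
        raise ValueError("no data to process")
    return raw.strip()

def mid_level(raw):
    # no try/except here: errors bubble up untouched
    return low_level(raw).upper()

def high_level(raw):
    try:
        return mid_level(raw)
    except ValueError as exc:
        log.error("processing failed: %s", exc)
        return None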
This is not necessarily Pythonic as such, but experience has taught me to let exceptions "lie where they lie".
That is to say: don't unnecessarily hide them or re-raise a different exception.
It's sometimes good practice to let a call fail rather than trying to capture and hide all kinds of error conditions.
Obviously this topic is, and can be, a little subjective; but if you don't hide exceptions or raise different ones, it is much easier to debug your code and much easier for the callers of your functions or API to understand what went wrong.
Note: this answer is not complete -- see the comments. Some or all of the answers presented in this Q&A should probably be combined in a nice way, presenting the various problems and solutions in a clear and concise manner.
