Should a validate method throw an exception? - python

I've implemented a little validation library which is used like this:
domain_object.validate()
# handle validation errors in some way ...
if domain_object.errors:
    for error in domain_object.errors:
        print(error)
validate() performs the checks and populates a list called errors.
I know that other validation libraries throw an exception when validation fails, with the error messages passed as a property of the exception.
What approach is better? Is it advantageous to throw validation exceptions?

No, I wouldn't think that a validation method should throw an exception.
That would create a bit of an anti-pattern, as the client code calling the method would reasonably expect an exception to be thrown, and would then need to catch the exception. Since it's generally recommended that exceptions not be used for flow control, why not just return a value indicating whether validation was successful or not? The client code could check the return value and proceed accordingly.
You essentially accomplish the same thing as you would by throwing an exception, but without the extra cost and poor semantics of actually throwing an exception.
Exceptions should be reserved for truly exceptional conditions, not normal operation of a program. Failing validation to me seems like it is a pretty normal condition, something to be expected during the day-to-day operation of an application. It would be easily handled by the calling code, and normal operation would continue. Generally, that's not the case with exceptions.

I argue for the exact opposite: In a validation layer you want to ensure that every validation error is handled. If you rely on return values, there might be a bug in the integration code (especially when you use the validators in a different environment).
Python exceptions will make that problem obvious.
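The exception-based approach argued for above could look roughly like the following. This is only a minimal sketch: ValidationError, User and collect_errors are hypothetical names chosen for illustration and are not taken from the asker's library.

```python
class ValidationError(Exception):
    """Hypothetical exception that carries all collected error messages."""

    def __init__(self, errors):
        super().__init__("; ".join(errors))
        self.errors = list(errors)


class User:
    """Hypothetical domain object used only for illustration."""

    def __init__(self, name, age):
        self.name = name
        self.age = age

    def validate(self):
        errors = []
        if not self.name:
            errors.append("name must not be empty")
        if self.age < 0:
            errors.append("age must not be negative")
        if errors:
            # Raising means a caller cannot silently forget to check the result.
            raise ValidationError(errors)


def collect_errors(user):
    """Return the list of validation errors, or an empty list if valid."""
    try:
        user.validate()
    except ValidationError as exc:
        return exc.errors
    return []
```

All checks still run before anything is raised, so the caller gets the complete list of messages in one place, much like the errors list in the question.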

Related

How expensive is raise in Python?

During development using drf, an efficient method for error handling was needed.
I found two methods, one using ErrorResponse created by inheriting Response, and one using APIException provided by drf.
The first method is done through return, and the second method uses the raise command.
I wonder which one is more efficient and why!
I apologize in advance if the question is too vague.
I'm not sure that efficiency and CPU time are the most important things here.
You have to understand the Django request-response cycle first. The next step after return Response (or raise Exception) is not the client's browser but the chain of middlewares you have enabled in your application. Which middleware code runs may differ depending on what happens inside the view.
When you raise something, you break the normal flow of this cycle.
Django handles the raised exception, writes extra error logs, and returns the specified error response to the client. You don't have to make sure that all the conditions for a correct response are satisfied, because an error has already happened and the response is already not a correct one. A returned Response, on the other hand, is delivered to the client in the normal way: Django makes sure all validations and steps are passed before the response reaches the client.
If you need to save milliseconds by choosing between return and raise, and you are thinking deeply about efficiency, first stop using Django. Seriously. It is the slowest framework, even by Python standards.
raise produces an error in the current level of the call stack. You can catch a raised error by covering the area where the error might be raised in a try and handling that error in an except.
return, on the other hand, returns a value to where the function was called from. Returning an exception is usually not what you want in a situation like this, because it is the raising of the exception, not the exception object itself, that triggers the except.
https://docs.python.org/3/reference/simple_stmts.html#raise
https://docs.python.org/3/reference/simple_stmts.html#return
So to answer your question, I would raise, because it is built for errors and return is not. Also, any difference in speed between the two is negligible here.
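The difference between returning an exception object and raising it can be seen in plain Python, independent of DRF. The function names below are made up for this sketch.

```python
def returns_error():
    # The exception object is just a value here; no handler is triggered.
    return ValueError("bad input")


def raises_error():
    # Raising unwinds the stack until a matching except clause is found.
    raise ValueError("bad input")


def describe(func):
    """Report whether calling func triggered the except clause."""
    try:
        result = func()
    except ValueError as exc:
        return f"caught: {exc}"
    return f"returned: {result!r}"
```

Calling describe(returns_error) takes the normal return path, while describe(raises_error) ends up in the except clause.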

Python: statically detect unhandled exceptions

I'm trying to write a highly-reliable piece of code in Python. The most common issue I run into is that after running for a while, some edge case will occur and raise an exception that I haven't handled. This most often happens when using external libraries - without reading the source code, I don't know of an easy way to get a list of all exceptions that might be raised when using a specific library function, so it's hard to know what to handle. I understand that it's bad practice to use catch-all handlers, so I haven't been doing that.
Is there a good solution for this? It seems like it should be possible for a static analysis tool to check that all exceptions are handled, but I haven't found one. Does this exist? If not, why? (is it impossible? a bad idea? etc) I especially would like it to analyze imported code for the reason explained above.
"it's bad practice to use catch-all handlers" to ignore exceptions:
Our web service has an except which wraps the main loop.
except:
    log_exception()
    attempt_recovery()
This is good, as it notifies us (necessary) of the unexpected error and then tries to recover (not necessary). We can then go look at those logs and figure out what went wrong so we can prevent it from hitting our general exception again.
This is what you want to avoid:
except:
    pass
Because it ignores the error... then you don't know an error happened and your data may be corrupted/invalid/gone/stolen by bears. Your server may be up/down/on fire. We have no idea because we ignored the exception.
Python doesn't require registering of what exceptions might be thrown, so there are no checks for all exceptions a module might throw, but most will give you some idea of what you should be ready to handle in the docs. Depending on your service, when it gets an unhandled exception, you might want to:
Log it and crash
Log it and attempt to continue
Log it and restart
Notice a trend? The action changes, but you never want to ignore it.
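As a rough illustration of the "log it and attempt to continue" option, here is a minimal sketch using the standard logging module. The run_jobs function and the "service" logger name are assumptions made for the example, not part of the web service described above.

```python
import logging

logger = logging.getLogger("service")


def run_jobs(jobs):
    """Run each job; log any unexpected failure and continue with the next."""
    failures = 0
    for job in jobs:
        try:
            job()
        except Exception:
            # logger.exception records the message plus the full traceback.
            logger.exception("unhandled error in job %r", job)
            failures += 1
    return failures
```

Swapping the body of the except block for a re-raise or a restart gives the other two variants; in every case the error is logged first.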
Great question.
You can try approaching the problem statically (by, for instance, introducing a custom flake8 rule?), but, I think, it's a problem of testing and test coverage scope. I would approach the problem by adding more "negative path"/"error handling" checks/tests to the places where third-party packages are used, introducing mock side effects whenever needed and monitoring coverage reports at the same time.
I would also look into the idea of mutation testing (check out the Cosmic Ray Python package). I am not sure whether mutants can be configured to also throw exceptions, but see if it could be helpful.
Instead of trying to handle all exceptions, which is very hard as you described, why don't you catch all, but exclude some, for example the KeyboardInterrupt you mentioned in the comments?
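A "catch all, but exclude some" handler might be sketched as follows. call_guarded and its default parameter are hypothetical names for this example.

```python
import logging


def call_guarded(func, default=None):
    """Call func, logging unexpected errors but letting interrupts through."""
    try:
        return func()
    except (KeyboardInterrupt, SystemExit):
        # Excluded from the catch-all: re-raise so Ctrl-C and exit still work.
        raise
    except BaseException:
        logging.getLogger(__name__).exception("unexpected error in %r", func)
        return default
```

Writing except Exception: instead of the last clause would have a similar effect, since KeyboardInterrupt and SystemExit do not derive from Exception.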

Python determine whether an exception was thrown (regardless of whether it is caught or not)

I am writing tests for some legacy code that is littered with catch-all constructs like
try:
    do_something()
    do_something_else()
    for x in some_huge_list():
        do_more_things()
except Exception:
    pass
and I want to tell whether an exception was thrown inside the try block.
I want to avoid introducing changes into the codebase just to support a few tests and I don't want to make the except cases more specific for fear of unintentionally introducing regressions.
Is there a way of extracting information about exceptions that were raised and subsequently handled from the runtime? Or some function with a similar API to eval/exec/apply/call that either records information on every raised exception, lets the user supply an exception handler that gets run first, or lets the user register a callback that gets run on events like an exception being raised or caught.
If there isn't a way to detect whether an exception was thrown without getting under the (C)Python runtime in a really nasty way, what are some good strategies for testing code with catch-all exceptions inside the units you're testing?
Your only realistic option is to instrument the except handlers.
Python does record exception information, which is retrievable with sys.exc_info(), but this information is cleared when a function exits (Python 2) or the try statement is done (Python 3).
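The Python 3 behaviour can be observed with a small sketch like the one below. It assumes Python 3 and that the function is not itself called from inside another except block; swallow_and_inspect is a name invented for the example.

```python
import sys


def swallow_and_inspect():
    """Return the exception type seen inside and after a catch-all handler."""
    try:
        raise ValueError("boom")
    except Exception:
        inside = sys.exc_info()[0]
    # Python 3: the handled exception is cleared once the except block ends.
    after = sys.exc_info()[0]
    return inside, after
```

Inside the handler the exception type is visible, but once the handler has finished nothing is left for a test to inspect from the outside.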
A good strategy would be testing observable behaviour. Since exceptions were explicitly excluded from the observable behaviour I do not think you should be testing whether an exception was raised or not.

What is the pythonic way to bubble up error conditions

I have been working on a Python project that has grown somewhat large, and has several layers of functions. Due to some boneheaded early decisions I'm finding that I have to go and fix a lot of crashers because the lower level functions are returning a type I did not expect in the higher level functions (usually None).
Before I go through and clean this up, I got to wondering what is the most pythonic way of indicating error conditions and handling them in higher functions?
What I have been doing for the most part is if a function can not complete and return its expected result, I'll return None. This gets a little gross, as you end up having to always check for None in all the functions that call it.
def lowLevel():
    if error_occurred:  # placeholder condition: some error occurred
        return None
    # processing was good, return normal result string
    return resultString

def highLevel():
    resultFromLow = lowLevel()
    if not resultFromLow:
        return None
    if error_occurred:  # placeholder condition: some processing error occurred
        return None
    # processing was good, return normal result string
    return resultString
I guess another solution might be to throw exceptions. With that you still get a lot of code in the calling functions to handle the exception.
Nothing seems super elegant. What do other people use? In obj-c a common pattern is to return an error parameter by reference, and then the caller checks that.
It really depends on what you want to do about the fact this processing can't be completed. If these are really exceptions to the ordinary control flow, and you need to clean up, bail out, email people etc. then exceptions are what you want to throw. In this case the handling code is just necessary work, and is basically unavoidable, although you can rethrow up the stack and handle the errors in one place to keep it tidier.
If these errors can be tolerated, then one elegant solution is to use the Null Object pattern. Rather than returning None, you return an object that mimics the interface of the real object that would normally be returned, but just doesn't do anything. This allows all code downstream of the failure to continue to operate, oblivious to the fact there was a failure. The main downside of this pattern is that it can make it hard to spot real errors, since your code will not crash, but may not produce anything useful at the end of its run.
A common example of the Null Object pattern in Python is returning an empty list or dict when your lower level function has come up empty. Any subsequent function using this returned value and iterating through elements will just fall through silently, without the need for error checking. Of course, if you require the list to have at least one element for some reason, then this won't work, and you're back to handling an exceptional situation again.
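The empty-list variant of the pattern can be sketched like this; find_matches and shout_matches are hypothetical functions used only to show the idea.

```python
def find_matches(words, prefix):
    """Return matching words; an empty list (not None) means nothing was found."""
    return [w for w in words if w.startswith(prefix)]


def shout_matches(words, prefix):
    # No None check needed: iterating over an empty list simply does nothing.
    return [w.upper() for w in find_matches(words, prefix)]
```

The caller never has to test for None, at the cost that "nothing found" and "something went wrong" become indistinguishable.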
On the bright side, you have discovered exactly the problem with using a return value to indicate an error condition.
Exceptions are the pythonic way to deal with problems. The question you have to answer (and I suspect you already did) is: Is there a useful default that can be returned by low_level functions? If so, take it and run with it; otherwise, raise an exception (ValueError, TypeError or even a custom error).
Then, further up the call stack, where you know how to deal with the problem, catch the exception and deal with it. You don't have to catch exceptions immediately -- if high_level calls mid_level calls low_level, it's okay to not have any try/except in mid_level and let high_level deal with it. It may be that all you can do is have a try/except at the top of your program to catch and log all uncaught and undealt-with errors, and that can be okay.
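A minimal sketch of that layering, reusing the high_level, mid_level and low_level names from the paragraph above; the function bodies are invented for illustration.

```python
def low_level(text):
    if not text.isdigit():
        raise ValueError(f"not a number: {text!r}")
    return int(text)


def mid_level(text):
    # No try/except here: an error from low_level simply propagates upward.
    return low_level(text) * 2


def high_level(text):
    try:
        return mid_level(text)
    except ValueError as exc:
        # This layer knows how to deal with the problem.
        return f"error: {exc}"
```

Only high_level contains error-handling code; mid_level stays free of None checks and try/except blocks.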
This is not necessarily Pythonic as such, but experience has taught me to let exceptions "lie where they lie".
That is to say: don't unnecessarily hide them or re-raise a different exception.
It's sometimes good practice to let the callee fail rather than trying to capture and hide all kinds of error conditions.
Obviously this topic is and can be a little subjective; but if you don't hide or raise a different exception, then it's much easier to debug your code and much easier for the caller of your functions or API to understand what went wrong.
Note: This answer is not complete -- see comments. Some or all of the answers presented in this Q&A should probably be combined in a nice way, presenting the various problems and solutions in a clear and concise manner.

Should I always specify an exception type in `except` statements?

When using PyCharm IDE the use of except: without an exception type triggers a reminder from the IDE that this exception clause is Too broad.
Should I be ignoring this advice? Or is it Pythonic to always specify the exception type?
It's almost always better to specify an explicit exception type. If you use a naked except: clause, you might end up catching exceptions other than the ones you expect to catch - this can hide bugs or make it harder to debug programs when they aren't doing what you expect.
For example, if you're inserting a row into a database, you might want to catch an exception that indicates that the row already exists, so you can do an update.
try:
    insert(connection, data)
except:
    update(connection, data)
If you specify a bare except:, you would also catch a socket error indicating that the database server has fallen over. It's best to only catch exceptions that you know how to handle - it's often better for the program to fail at the point of the exception than to continue but behave in weird unexpected ways.
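A narrower version of the insert-or-update idea might look like the sketch below. It assumes SQLite through the standard sqlite3 module (the answer does not say which database is meant), and the items table and upsert function are made up for the example.

```python
import sqlite3


def upsert(connection, key, value):
    """Insert a row; fall back to an update only on a constraint violation."""
    try:
        connection.execute(
            "INSERT INTO items (key, value) VALUES (?, ?)", (key, value)
        )
    except sqlite3.IntegrityError:
        # Raised for constraint violations such as a duplicate primary key;
        # unrelated errors (for example a closed connection) still propagate.
        connection.execute(
            "UPDATE items SET value = ? WHERE key = ?", (value, key)
        )


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE items (key TEXT PRIMARY KEY, value TEXT)")
upsert(connection, "a", "first")
upsert(connection, "a", "second")
rows = connection.execute("SELECT key, value FROM items").fetchall()
```

Because only sqlite3.IntegrityError is caught, a failure that has nothing to do with an existing row is not silently turned into an update.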
One case where you might want to use a bare except: is at the top-level of a program you need to always be running, like a network server. But then, you need to be very careful to log the exceptions, otherwise it'll be impossible to work out what's going wrong. Basically, there should only be at most one place in a program that does this.
A corollary to all of this is that your code should never do raise Exception('some message') because it forces client code to use except: (or except Exception: which is almost as bad). You should define an exception specific to the problem you want to signal (maybe inheriting from some built-in exception subclass like ValueError or TypeError). Or you should raise a specific built-in exception. This enables users of your code to be careful in catching just the exceptions they want to handle.
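Defining a problem-specific exception takes very little code. In this sketch DuplicateRowError, insert_row and insert_or_update are hypothetical names, and a plain dict stands in for the database table.

```python
class DuplicateRowError(ValueError):
    """Hypothetical exception: the row being inserted already exists."""


def insert_row(table, key, value):
    if key in table:
        raise DuplicateRowError(f"row {key!r} already exists")
    table[key] = value


def insert_or_update(table, key, value):
    try:
        insert_row(table, key, value)
    except DuplicateRowError:
        # Only the duplicate case is handled; anything else propagates.
        table[key] = value
```

Client code can catch DuplicateRowError precisely, or catch ValueError if it wants to treat the whole family of errors alike.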
You should not be ignoring the advice that the IDE gives you.
From the PEP-8 Style Guide for Python :
When catching exceptions, mention specific exceptions whenever
possible instead of using a bare except: clause.
For example, use:
try:
    import platform_specific_module
except ImportError:
    platform_specific_module = None
A bare except: clause will catch SystemExit and KeyboardInterrupt exceptions, making it harder to
interrupt a program with Control-C, and can disguise other problems.
If you want to catch all exceptions that signal program errors, use
except Exception: (bare except is equivalent to except
BaseException:).
A good rule of thumb is to limit use of bare 'except' clauses to two
cases:
If the exception handler will be printing out or logging the
traceback; at least the user will be aware that an error has occurred.
If the code needs to do some cleanup work, but then lets the exception
propagate upwards with raise. try...finally can be a better way to
handle this case.
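The difference between except Exception: and a bare except: described in the quoted passage can be checked with a small sketch; caught_by_except_exception is a helper invented for this example.

```python
def caught_by_except_exception(exc_type):
    """Return True if `except Exception:` would catch an exc_type instance."""
    try:
        raise exc_type()
    except Exception:
        return True
    except BaseException:
        # Reached only for types outside the Exception branch of the hierarchy.
        return False
```

KeyboardInterrupt and SystemExit fall through to the BaseException clause, which is why a bare except: interferes with Control-C while except Exception: does not.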
This is not specific to Python.
The whole point of exceptions is to deal with the problem as close to where it was caused as possible.
So you keep the code that could, in exceptional circumstances, trigger the problem and the resolution "next" to each other.
The thing is, you can't know all the exceptions that could be thrown by a piece of code. All you can know is that if it's, say, a file-not-found exception, then you could trap it and prompt the user to pick a file that does exist or cancel the operation.
If you put a catch-all try/catch round that, then no matter what problem there was in your file routine (read only, permissions, UAC, not really a PDF, etc.), every one will drop into your file-not-found catch, and your user is screaming "but it is there, this code is crap".
Now there are a couple of situations where you might catch everything, but they should be chosen consciously.
One is catch, undo some local action (such as creating or locking a resource, for instance opening a file on disk to write), then throw the exception again to be dealt with at a higher level.
The other is when you don't care why it went wrong. Printing, for instance. You might have a catch-all round that to say "There is some problem with your printer, please sort it out", and not kill the application because of it. In a similar vein, if your code executed a series of separate tasks using some sort of schedule, you wouldn't want the entire thing to die because one of the tasks failed.
Note: if you do the above, I can't recommend some sort of exception logging (e.g. try, catch, log, end) highly enough.
Always specify the exception type, there are many types you don't want to catch, like SyntaxError, KeyboardInterrupt, MemoryError etc.
You will also catch e.g. Control-C with that, so don't do it unless you "throw" it again. However, in that case you should rather use "finally".
Here are the places where I use except without a type:
quick and dirty prototyping
That's the main use in my code for unchecked exceptions.
top level main() function, where I log every uncaught exception
I always add this, so that production code does not spill stack traces.
between application layers
I have two ways to do it:
First way: when a higher-level layer calls a lower-level function, it wraps the calls in typed excepts to handle the "top" lower-level exceptions. But I add a generic except statement to detect unhandled lower-level exceptions in the lower-level functions.
I prefer it this way; I find it easier to detect which exceptions should have been caught appropriately: I "see" the problem better when a lower-level exception is logged by a higher level.
Second way: each top-level function of a lower-level layer has its code wrapped in a generic except, so it catches all unhandled exceptions on that specific layer.
Some coworkers prefer this way, as it keeps lower-level exceptions in lower-level functions, where they "belong".
Try this:
try:
    #code
except ValueError:
    pass
I got the answer from this link; check it out if anyone else runs into this issue.
