The best practice seems to be to use assert for a condition that should never happen if the code is correct, and an exception for a condition that is a bit unusual but can happen (e.g., when memory runs out, or user input is invalid, or external connections are broken). I understand the rationale behind this practice as follows:
assert will be disabled with the -O interpreter flag. Conditions that may arise from external factors must not be allowed to be silently ignored, so assert is inappropriate there. OTOH, conditions that may only arise if my code is incorrect are hopefully eliminated through testing and debugging, so assert is fine.
assert discourages the caller from handling the exception, since AssertionError is usually interpreted as "don't catch me, this is a catastrophic failure". Furthermore, it is too generic to catch. This is perfect when a bug is found; the typical handling for that would be to stop the execution and debug the code. It is not good if it's a common condition due to external reasons.
Suppose I write some code where I ensure that a certain function argument is always positive. If I find it to be negative, clearly I made a mistake in the code. Hence, I am going to assert that the argument is positive.
Later, someone finds this function useful in another application. They import it, and send all sorts of data to it. Now from the perspective of my function, receiving a negative value is actually quite likely; it is simply an invalid user input. Arguably, the assert is no longer appropriate, and should be replaced with an exception.
Since almost any code could potentially be reused one day, often without my knowledge, this argument seems to say "never use assert; only use exceptions". Obviously, this is not an accepted practice. What am I missing?
EDIT:
To be more specific, let's say the function cannot handle a negative argument at all. So once the argument is negative, the function will do one of the following:
raise an exception
fail an assert
continue execution, likely producing incorrect output
I can see how it would be nice if negative arguments were caught by the caller. But if the calls to the function are interspersed in dozens of places around the code, it's arguably detrimental to the code clarity due to the numerous repetitions of the same check. (Not to mention, it could be forgotten by accident.)
If the function you are writing or reusing is valid for both positive and negative numbers, it is not the function that should contain the assert. The calling function should have the assert, because it is the caller that is providing the invalid values.
def x():
    i = ...  # logic to set i; use an assertion to test that logic
    assert i > 0
    reused_func(i)
If reused_func(i) is not valid for negative numbers, it should raise an exception when passed a negative value.
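A minimal sketch of what that might look like (the error message and the check are illustrative, not the asker's actual code):

def reused_func(i):
    if i < 0:
        raise ValueError("reused_func requires a non-negative value, got %r" % (i,))
    # ... actual work with i ...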
assert statements are for things that you use while developing and debugging the code, not to guarantee critical API constraints.
For your positive number example, test the value and raise ValueError with a useful error message if using a negative value within your code would have bad consequences.
I tell people never to use assert statements. They are easy to write, yet they are so often inappropriate.
assert statements also fall over when people write long ones and use parentheses to make the condition and message span multiple lines. Wrapping (condition, message) in parentheses creates a two-element tuple, which is always truthy, so the assert can never fail; modern Python emits a SyntaxWarning about the tuple the author inadvertently created, but it hasn't always done so.
Another rule of thumb: If you ever see a unittest verifying that an AssertionError was raised, the code should not be using an assert.
If you don't use assert statements none of these things will ever bite you.
Related
In Python, assert is a statement, and not a function. Was this a deliberate decision? Are there any advantages to having assert be a statement (and reserved word) instead of a function?
According to the docs, assert expression1, expression2 is expanded to
if __debug__:
    if not expression1: raise AssertionError(expression2)
The docs also say that "The current code generator emits no code for an assert statement when optimization is requested at compile time." Without knowing the details, it seems like a special case was required to make this possible. But then, a special case could also be used to optimize away calls to an assert() function.
If assert were a function, you could write:
assert(some_long_condition,
"explanation")
But because assert is a statement, the parentheses create a two-element tuple, which always evaluates to True, and you get
SyntaxWarning: assertion is always true, perhaps remove parentheses?
The correct way to write it is
assert some_long_condition, \
"explanation"
which is arguably less pretty.
Are there any advantages to having assert be a statement (and reserved word) instead of a function?
It cannot be reassigned to a user-defined function, meaning it can be effectively disabled at compile time, as #mgilson pointed out.
The evaluation of the second, optional parameter is deferred until if/when the assertion fails. That is awkward to do with functions and function arguments (you would need to pass a lambda), and not deferring the evaluation of the second parameter would introduce additional overhead.
One of the wonderful things about assert in Python and in other languages (specifically C) is that you can remove asserts to optimize your code just by adding the correct #define (optionally on the command line with any compiler I've ever used) or optimization flags (-O in Python). If assert became a function, this feature would be impossible to add to Python, as you wouldn't know until runtime whether you have the builtin assert function or a user-defined function of the same name.
Also note that in Python, function calls are reasonably expensive. Expanding the assert inline to if __debug__: ... is probably a lot more efficient than making a function call, which could be significant if you put an assert statement in a performance-critical routine.
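As a rough sketch of the stripping behaviour (slow_check is a made-up, deliberately expensive validation helper), running the module normally executes the assert, while running it with python -O removes it entirely:

def slow_check(values):
    # hypothetical, deliberately expensive validation
    return all(v >= 0 for v in values)

def total(values):
    # removed completely under "python -O", i.e. when __debug__ is False
    assert slow_check(values), "values must be non-negative"
    return sum(values)

print(__debug__)         # True normally, False under -O
print(total([1, 2, 3]))  # 6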
In addition to the other answers (and sort of off-topic), a tip: to avoid the use of backslashes you can use implicit line joining inside parentheses. ;-)
Instead of:
assert some_long_condition, \
"explanation"
You could write:
assert some_long_condition, (
"explanation")
I am no expert in Python, but I believe performance is one of the biggest reasons.
If we had assert(expression, explanation) as a function, then even in non-debug mode Python would need to evaluate both arguments in order to pass them to the assert function, which could be costly if expression is expensive to evaluate.
By expanding the assert into a statement, the expression and the explanation are in fact not evaluated unless they are really needed (when __debug__ is true, and when the condition fails, respectively). I believe this is critical if we want assert not to affect performance when it isn't needed (i.e. no performance hit in a production system).
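A small sketch of that deferral (expensive_report is a hypothetical, deliberately expensive message builder):

calls = []

def expensive_report():
    calls.append(1)  # record that the message was actually built
    return "detailed diagnostic dump"

def double_positive(x):
    # the message expression runs only if the condition is false
    assert x > 0, expensive_report()
    return x * 2

double_positive(5)
print(len(calls))  # 0 -- the assertion passed, so the message was never evaluated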
[Edit to the question below] changed return 0 to return. Side effects of being a Python n00b. :)
I'm defining a function where I'm doing some 20 lines of processing. Before processing, I need to check whether a certain condition is met; if so, I should bypass all processing. I have defined the function this way.
def test_function(self, inputs):
    if inputs == 0:
        <Display Message box>
        return
    <20 line logic here>
Note that the 20-line logic does not return any value, and I'm not using any value returned from the first 'if'.
I want to know if this is better than the code below (in terms of performance, readability, or anything else), because the first method looks good to me as it uses one less level of indentation:
def test_function(self, inputs):
    if inputs == 0:
        <Display Message box>
    else:
        <20 line logic here>
In general, it improves code readability to handle failure conditions as early as possible. Then the meat of your code doesn't have to worry about these, and the reader of your code doesn't have to consider them any more. In many cases you'd be raising exceptions, but if you really want to do nothing, I don't see that as a major problem even if you generally hew to the "single exit point" style.
But why return 0 instead of just return, since you're not using the value?
First, you can use return with nothing after it; you don't have to force a return 0.
As for performance, this question seems to show that you won't notice any difference (unless you're really unlucky ;)).
In this context, I think it's important to ask why inputs can't be zero. Typically, the way most programs will handle this is to raise an exception if a bad value is passed. Then the exception can be handled (or not) in the calling routine.
You'll often see it written "Better to ask forgiveness" as opposed to "Look before you leap". Of course, if you're often passing 0 into the function, then the try/except clause could get expensive (try is cheap, except is not).
If you're set on "looking before you leap", I would probably use the first form to keep indentation down.
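For illustration only, assuming the 20-line body eventually divides by inputs and that show_message_box is a hypothetical UI helper, the two styles look like this:

# "Look before you leap":
def process_lbyl(inputs):
    if inputs == 0:
        show_message_box("inputs must be non-zero")
        return
    return 100 / inputs

# "Easier to ask forgiveness than permission":
def process_eafp(inputs):
    try:
        return 100 / inputs
    except ZeroDivisionError:
        show_message_box("inputs must be non-zero")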
I doubt the performance is going to be significantly different in either case. Like you I would tend to lean more toward the first method for readability.
In addition to the smaller indentation (which doesn't really matter much IMO), it removes the need to read further for the case when inputs == 0.
In the second method one might assume that there is additional processing after the if/else statement, whereas the first one makes it obvious that the method is complete upon that condition.
It really just comes down to personal preference though, you will see both methods used in practice.
Your second example will return after it displays the message box in this case.
I prefer to "return early" as in my opinion it leads to better readability. But then again, most of my returns that happen prior to the actual end of the function tend to be more around short circuiting application logic if certain conditions are not met.
Lately, I've been adding asserts to nearly every single function I make to validate every input as sort of a poor-man's replacement for type checking or to prevent myself from accidentally inputting malformed data while developing. For example,
def register_symbol(self, symbol, func, keypress=None):
    assert(isinstance(symbol, basestring))
    assert(len(symbol) == 1)
    assert(callable(func))
    assert(keypress is None or type(keypress) is int)
    self.symbols_map[symbol] = (func, keypress)
    return
However, I'm worried that this goes against the idea of duck typing, and that I might be going too overboard or constricting myself unnecessarily. Can you ever have too many assert statements? When's a good time to stop?
I only use asserts if they provide far better diagnostics than the error messages that I would get otherwise. Your third assert
assert(callable(func))
might be an example for such an assert -- if func is not callable, you will get an error message at a completely different line of code than where the actual error is, and it might not be obvious how the non-callable object ended up in self.symbols_map. I write "might" because this depends on the rest of your code -- if this is the only place where self.symbols_map gets updated, the assert might also be unnecessary.
The first and last assert definitely are against the idea of duck-typing, and the second one is redundant. If symbol isn't a string of length 1, chances are that self.symbols_map[symbol] will raise a KeyError anyway, so no need for the asserts.
The last assert is also wrong -- type(keypress) cannot be None, and type checks should be done with isinstance(). There might be very specialised applications where you cannot allow subtypes, but then the check should be performed with type(x) is int instead of type(x) == int. Checking for None should be done with x is None, not type(x) is NoneType.
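For example, the keypress check could be written with isinstance and an identity test (a sketch only):

    assert keypress is None or isinstance(keypress, int)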
You should probably write a good set of unit tests -- they will be far more useful than the asserts, and might make almost all of your asserts redundant.
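For instance, here is a minimal unittest sketch; SymbolHandler is a hypothetical rewrite of the question's class with the asserts replaced by an explicit check, and the tests exercise the behaviour you actually care about:

import unittest

class SymbolHandler(object):
    # hedged sketch: register_symbol rewritten without asserts
    def __init__(self):
        self.symbols_map = {}

    def register_symbol(self, symbol, func, keypress=None):
        if not callable(func):
            raise TypeError("func must be callable")
        self.symbols_map[symbol] = (func, keypress)

class RegisterSymbolTests(unittest.TestCase):
    def test_register_stores_callback_and_keypress(self):
        handler = SymbolHandler()
        callback = lambda: None
        handler.register_symbol("a", callback, keypress=65)
        self.assertEqual(handler.symbols_map["a"], (callback, 65))

    def test_non_callable_func_is_rejected(self):
        handler = SymbolHandler()
        with self.assertRaises(TypeError):
            handler.register_symbol("a", "not callable")

if __name__ == "__main__":
    unittest.main()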
Asserts in your code are not nearly as useful as unittests. Do more of the latter, less of the former.
Be aware that assert statements are stripped whenever Python generates optimized bytecode! Since that is the case in most production environments, assert statements may not be used to validate input.
In fact, I came to the conclusion that I can't use them for anything at all if I can't rely on them being executed. So if I need to check some condition, I use "if ... raise ..." instead, and if I just want to test my code, I write unittests.
I've read about when to use assert vs. exceptions, but I'm still not "getting it". It seems like whenever I think I'm in a situation where I should use assert, later on in development I find that I'm "looking before I leap" to make sure the assert doesn't fail when I call the function. Since there's another Python idiom about preferring to use try-except, I generally end up ditching the assert and throwing an exception instead. I have yet to find a place where it seems right to use an assert. Can anyone come up with some good examples?
A good guideline is to use assert when its failure indicates a bug in your code. When your code assumes something and acts upon that assumption, it's recommended to protect the assumption with an assert. The assert failing means your assumption isn't correct, which means your code isn't correct.
I tend to use assert to check for things that should never happen, sort of like a sanity check.
Another thing to realize is that asserts are removed when optimized:
The current code generator emits no code for an assert statement when optimization is requested at compile time.
Generally, assert is there to verify an assumption you have about your code, i.e. at that point in time, either the assert succeeds or your implementation is somehow buggy. An exception, by contrast, actually expects an error to happen and "embraces" it, i.e. allows you to handle it.
A good example is checking the arguments of a function for consistency:
def f(probability_vector, positive_number):
    assert sum(probability_vector) == 1., "probability vectors have to sum to 1"
    assert positive_number >= 0., "positive_number should be positive"
    # body of function goes here
My workplace has imposed a rule against raising exceptions (catching is allowed). If I have code like this
def f1():
    if bad_thing_happen():
        raise Exception('bad stuff')
    ...
    return something
I could change it to
def f1():
    if bad_thing_happen():
        return [-1, None]
    ...
    return [0, something]
f1's caller would then look like this:
def f1_caller():
    code, result = f1(param1)
    if code < 0:
        return code
    actual_work1()
    # call f1 again
    code, result = f1(param2)
    if code < 0:
        return code
    actual_work2()
    ...
Are there more elegant ways than this in Python ?
Exceptions in python are not something to be avoided, and are often a straightforward way to solve problems. Additionally, an exception carries a great deal of information with it that can help quickly locate (via stack trace) and identify problems (via exception class or message).
Whoever has come up with this blanket policy was surely thinking of another language (perhaps C++?) where throwing exceptions is a more expensive operation (and will reduce performance if your code is executing on a 20 year old computer).
To answer your question: the alternative is to return an error code. This means that you are mixing function results with error handling, which raises (ha!) its own problems. However, returning None is often a perfectly reasonable way to indicate function failure.
Returning None is reasonably common and works well conceptually. If you are expecting a return value, and you get none, that is a good indication that something went wrong.
Another possible approach, if you are expecting to return a list (or dictionary, etc.) is to return an empty list or dict. This can easily be tested for using if, because an empty container evaluates to False in Python, and if you are going to iterate over it, you may not even need to check for it (depending on what you want to do if the function fails).
Of course, these approaches don't tell you why the function failed. So you could return an exception instance, such as return ValueError("invalid index"). Then you can test for particular exceptions (or Exceptions in general) using isinstance() and print them to get decent error messages. (Or you could provide a helper function that tests a return code to see if it's derived from Exception.) You can still create your own Exception subclasses; you would simply be returning them rather than raising them.
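A small sketch of that pattern (find_index is a made-up helper, not part of any library):

def find_index(items, value):
    # return the index, or an exception instance describing the failure
    try:
        return items.index(value)
    except ValueError:
        return ValueError("%r not found in list" % (value,))

result = find_index(["a", "b"], "c")
if isinstance(result, Exception):
    print("lookup failed:", result)
else:
    print("found at index", result)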
Finally, I would work toward getting this ridiculous policy changed, as exceptions are an important part of how Python works, have low overhead, and will be expected by anyone using your functions.
You have to use return codes. Other alternatives would involve mutable global state (think C's errno) or passing in a mutable object (such as a list), but you almost always want to avoid both in Python. Perhaps you could try explaining to them how exceptions let you write better post-conditions instead of adding complication to return values, but are otherwise equivalent.