design of python: why is assert a statement and not a function? - python

In Python, assert is a statement, and not a function. Was this a deliberate decision? Are there any advantages to having assert be a statement (and reserved word) instead of a function?
According to the docs, assert expression1, expression2 is expanded to
if __debug__:
if not expression1: raise AssertionError(expression2)
The docs also say that "The current code generator emits no code for an assert statement when optimization is requested at compile time." Without knowing the details, it seems like a special case was required to make this possible. But then, a special case could also be used to optimize away calls to an assert() function.
If assert were a function, you could write:
assert(some_long_condition,
"explanation")
But because assert is a statement, the tuple always evaluates to True, and
you get
SyntaxWarning: assertion is always true, perhaps remove parentheses?
The correct way to write it is
assert some_long_condition, \
"explanation"
which is arguably less pretty.

Are there any advantages to having assert be a statement (and reserved word) instead of a function?
Cannot be reassigned to a user function, meaning it can be effectively disabled at compile time as #mgilson pointed out.
The evaluation of the second, optional parameter is deferred until if/when the assertion fails. Awkward to do that with functions and function arguments (would need to pass a lambda.) Not deferring the evaluation of the second parameter would introduce additional overhead.

One of the wonderful things about assert in python and in other languages (specifically C) is that you can remove them to optimize your code by just adding the correct #define (optionally on the commandline with any compiler I've ever used) or optimization flags (-O in python). If assert became a function, this feature would be impossible to add to python as you don't know until runtime whether you have the builtin assert function or user-defined function of the same name.
Also note that in python, function calls are reasonably expensive. Replacing inline with the code if __debug__: ... is probably a lot more efficient than doing a function call which could be significant if you put an assert statement in a performance critical routine.

In addition to the other answers (and sort of off-topic) a tip. To avoid the use of backslashes you can use implicit line joining inside parenthesis. ;-)
Instead of:
assert some_long_condition, \
"explanation"
You could write:
assert some_long_condition, (
"explanation")

I am no expert in Python, but I believe performance is one of the biggest reason.
if we have assert(expression, explanation) as function, if expression is expensive to evaluate, even we are in non-debug mode, Python needs to evaluate both expression to pass it to the assert function.
By expanding the assert, the expression and explanation statement is in fact not evaluated unless they are really needed (when debug evaluates to true). I believe it is critical if we want to make assert not affecting performance when not necessary (i.e. no performance hit in production system).

Related

Why is the simple string "Guido" a valid statement in Python?

Why is this a valid statement in Python?
"Guido"
This tripped me up with a multi-line string where I did not properly use parens:
# BAD
message = "Guido"
" van Rossum"
# GOOD
message = ("Guido"
" van Rossum")
Is it purely for the repl or is there some other reasoning for it?
Expressions are statements in Python (and most other imperative languages) for several good reasons:
What if a function, like foo() does something useful but also returns a value? foo() would be an invalid statement if the language didn't allow us to implicitly discard the return value.
Docstrings! They are just strings that happen to be the first statement of a module/class/function, so no special syntax was needed to support them.
In general, it's hard for the interpreter to determine whether an expression might have side effects. So it's been designed to not even try; it simply evaluates the expression, even if it's a simple constant, and discards the result.
To check for such mistakes as you mentioned, and many others, pylint can be helpful. It has a specific warning for this very case. However, it seems to not catch the mistake in your exact example code (using PyLint version 2.4.4); might be a bug.

Can you have too many asserts (in Python)?

Lately, I've been adding asserts to nearly every single function I make to validate every input as sort of a poor-man's replacement for type checking or to prevent myself from accidentally inputting malformed data while developing. For example,
def register_symbol(self, symbol, func, keypress=None):
assert(isinstance(symbol, basestring))
assert(len(symbol) == 1)
assert(callable(func))
assert(keypress is None or type(keypress) is int)
self.symbols_map[symbol] = (func, keypress)
return
However, I'm worried that this goes against the idea of duck typing, and that I might be going too overboard or constricting myself unnecessarily. Can you ever have too many assert statements? When's a good time to stop?
I only use asserts if they provide far better diagnostics than the error messages that I would get otherwise. Your third assert
assert(callable(func))
might be an example for such an assert -- if func is not callable, you will get an error message at a completely different line of code than where the actual error is, and it might not be obvious how the non-callable object ended up in self.symbols_map. I write "might" because this depends on the rest of your code -- if this is the only place where self.symbols_map gets updated, the assert might also be unnecessary.
The first and last assert definitely are against the idea of duck-typing, and the second one is redundant. If symbol isn't a string of length 1, chances are that self.symbols_map[symbol] will raise a KeyError anyway, so no need for the asserts.
The last assert is also wrong -- type(keypress) cannot be None, and type checks should be done with isinstance(). There might be very specialised applications where you cannot allow subtypes, but than the check should be performed with type(x) is int instead of type(x) == int. Checking for None should be done by x is None, not by type(x) is NoneType.
You should probably write a good set of unit tests -- they will be far more useful than the asserts, and might make almost all of your asserts redundant.
Asserts in your code are not nearly as useful as unittests. Do more of the latter, less of the former.
Be aware that assert statements are stripped whenever Python generates optimized bytecode! Since that is the case in most production environments, assert statements may not be used to validate input.
In fact, I came to the conclusion that I can't use them for anything at all if I can't rely on them being executed. So if I need to check some condition, I use "if ... raise ..." instead, and if I just want to test my code, I write unittests.

Python assert statement and code reusability

The best practice seems to be to use assert for a condition that should never happen if the code is correct, and an exception for a condition that is a bit unusual but can happen (e.g., when memory runs out, or user input is invalid, or external connections are broken). I understand the rationale behind this practice as follows:
assert will be disabled with -O interpreter flag. Conditions that may arise from external factors must not be allowed to be silently ignored, so assert there is inappropriate. OTOH, conditions that may only arise if my code is incorrect are hopefully eliminated through testing and debugging, so assert is fine.
assert discourages the caller from handling the exception, since AssertionError is usually interpreted as "don't catch me, this is a catastrophic failure". Furthermore, it is too generic to catch. This is perfect when a bug is found; the typical handling for that would be to stop the execution and debug the code. It is not good if it's a common condition due to external reasons.
Suppose I write some code where I ensure that a certain function argument is always positive. If I find it to be negative, clearly I made a mistake in the code. Hence, I am going to assert that the argument is positive.
Later, someone finds this function useful in another application. They import it, and send all sorts of data to it. Now from the perspective of my function, receiving a negative value is actually quite likely; it is simply an invalid user input. Arguably, the assert is no longer appropriate, and should be replaced with an exception.
Since almost any code could potentially be reused one day, often without my knowledge, this argument seems to say "never use assert; only use exceptions". Obviously, this is not an accepted practice. What am I missing?
EDIT:
To be more specific, let's say the function cannot handle a negative argument at all. So once the argument is negative, the function will do one of the following:
raise an exception
fail an assert
continue execution, likely producing incorrect output
I can see how it would be nice if negative arguments were caught by the caller. But if the calls to the function are interspersed in dozens of places around the code, it's arguably detrimental to the code clarity due to the numerous repetitions of the same check. (Not to mention, it could be forgotten by accident.)
If the function you are writing/reusing is valid with positive or negative numbers, it is not the method that should contain the assert. The function calling the re-used function should have the assert because it is the function providing the invalid values to the function.
function x() {
var i;
// logic to set i. use assertion to test the logic.
assert(i > 0);
reusedFunc(i);
}
if reusedFunc(i) is not valid with negative numbers, it should throw an exception if passed a negative value.
assert statements are for things that you use while developing and debugging the code, not to guarantee critical API constraints.
For your positive number example, test the value and raise ValueError with a useful error message if using a negative value within your code would have bad consequences.
I tell people never to use assert statements. They are easy to write but yet they are so often inappropriate.
assert statements also fall over when people write long ones and decide to use parentheses to make the statement and message span multiple lines... This generates a SyntaxWarning in modern Python about the tuple that the author inadvertently created by using (condition, message) on a non-function statement but it hasn't always done so.
Another rule of thumb: If you ever see a unittest verifying that an AssertionError was raised, the code should not be using an assert.
If you don't use assert statements none of these things will ever bite you.

Example use of assert in Python?

I've read about when to use assert vs. exceptions, but I'm still not "getting it". It seems like whenever I think I'm in a situation where I should use assert, later on in development I find that I'm "looking before I leap" to make sure the assert doesn't fail when I call the function. Since there's another Python idiom about preferring to use try-except, I generally end up ditching the assert and throwing an exception instead. I have yet to find a place where it seems right to use an assert. Can anyone come up with some good examples?
A good guideline is using assert when its triggering means a bug in your code. When your code assumes something and acts upon the assumption, it's recommended to protect this assumption with an assert. This assert failing means your assumption isn't correct, which means your code isn't correct.
tend to use assert to check for things that should never happen. sort of like a sanity check.
Another thing to realize is that asserts are removed when optimized:
The current code generator emits no code for an assert statement when optimization is requested at compile time.
Generelly, assert is there to verify an assumption you have about your code, i.e. at that point in time, either the assert succeeds, or your implementation is somehow buggy. An exception is acutally expecting an error to happen and "embracing" it, i.e. allowing you to handle it.
A good example is checking the arguments of a function for consistency:
def f(probability_vector, positive_number):
assert sum(probability_vector) == 1., "probability vectors have to sum to 1"
assert positive_number >= 0., "positive_number should be positive"
# body of function goes here

Checking arguments in numerical Python code

I find myself writing the same argument checking code all the time for number-crunching:
def myfun(a, b):
if a < 0:
raise ValueError('a cannot be < 0 (was a=%s)' % a)
# more if.. raise exception stuff here ...
return a + b
Is there a better way? I was told not to use 'assert' for these things (though I don't see the problem, apart from not knowing the value of the variable that caused the error).
edit: To clarify, the arguments are usually just numbers and the error checking conditions can be complex, non-trivial and will not necessarily lead to an exception later, but simply to a wrong result. (unstable algorithms, meaningless solutions etc)
assert gets optimized away if you run with python -O (modest optimizations, but sometimes nice to have). One preferable alternative if you have patterns that often repeat may be to use decorators -- great way to factor out repetition. E.g., say you have a zillion functions that must be called with arguments by-position (not by-keyword) and must have their first arguments positive; then...:
def firstargpos(f):
def wrapper(first, *args):
if first < 0:
raise ValueError(whateveryouwish)
return f(first, *args)
return wrapper
then you say something like:
#firstargpos
def myfun(a, b):
...
and the checks are performed in the decorators (or rather the wrapper closure it returns) once and for all. So, the only tricky part is figuring out exactly what checks your functions need and how best to call the decorator(s) to express those (hard to say, without seeing the set of functions you're defining and the set of checks each needs!-). Remember, DRY ("Don't Repeat Yourself") is close to the top spot among guiding principles in software development, and Python has reasonable support to allow you to implement DRY and avoid boilerplatey, repetitious code!-)
You don't want to use assert because your code can be run (and is by default on some systems) in such a way that assert lines are not checked and do not raise errors (-O command line flag).
If you're using a lot of variables that are all supposed to have those same properties, why not subclass whatever type you're using and add that check to the class itself? Then when you use your new class, you know you never have an invalid value, and don't have to go checking for it all over the place.
I'm not sure if this will answer your question, but it strikes me that checking a lot of arguments at the start of a function isn't very pythonic.
What I mean by this is that it is the assumption of most pythonistas that we are all consenting adults, and we trust each other not to do something stupid. Here's how I'd write your example:
def myfun(a, b):
'''a cannot be < 0'''
return a + b
This has three distinct advantages. First off, it's concise, there's really no extra code doing anything unrelated to what you're actually trying to get done. Second, it puts the information exactly where it belongs, in help(myfun), where pythonistas are expected to look for usage notes. Finally, is a non-positive value for a really an error? Although you might think so, unless something definitely will break if a is zero (here it probably wont), then maybe letting it slip through and cause an error up the call stream is wiser. after all, if a + b is in error, it raises an exception which gets passed up the call stack and behavior is still pretty much the same.

Categories

Resources