Python: Should I avoid initialization of variables inside blocks?

Problem
I have code like this:
if condition:
    a = f(x)
else:
    a = g(y)
Initializing a inside the block looks bad to me. Can it be written better?
I cannot use the ternary operator, because the names of the functions and/or the argument lists are long.
By "long" I mean that the following expression
a = f(x) if condition else g(y)
will take more than 79 (sometimes even more than 119) characters with the real names instead of a, f, g, x, y and condition.
Using multiple backslashes will make the code ugly and messy.
I don't want to initialize a with the result of one of the functions by default, because both functions are slow and I cannot allow such overhead:
a = g(y)
if condition:
    a = f(x)
I can initialize the variable with None, but is this solution pretty enough?
a = None
if condition:
    a = f(x)
else:
    a = g(y)
Let me explain my position: in C and C++ a variable declared inside a block has that block as its scope. ES6 introduced the let keyword, which creates variables with the same scoping rules as variables in C and C++. Variables defined with the old var keyword have scoping rules similar to Python's.
That's why I think that variables should be initialized outside blocks if I want to use them outside those blocks.
Update
Here is a more complicated example:
for obj in gen:
    # do something with the `obj`
    if predicate(obj):
        try:
            result = f(obj)
        except Exception as e:
            log(e)
            continue
    else:
        result = g(obj)
    # do something useful with the `result`
else:
    result = h(obj)
display(result)
I go through elements of some generator gen, process them and perform some actions on the result on each iteration.
Then I want to do something with the last result outside of the loop.
Is it pythonic enough to not assign a dummy value to the result beforehand?
Doesn't this make the code less readable?
Question
Is it good to initialize variables inside if/else/for/etc. in Python?

Python has no block scope... the scope is the whole function and it's perfectly pythonic to write
if <condition>:
    a = f()
else:
    a = g()
If you want to write in C++ then write in C++ using C++, don't write C++ using Python... it's a bad idea.
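As a small illustration of function-level scope, here is a minimal sketch. The names pick_value, f and g are placeholders made up for this example, not part of any real API. A name bound inside either branch is an ordinary local of the enclosing function and stays usable after the if/else:

```python
def f(x):
    # Placeholder for a slow function.
    return x * 2


def g(y):
    # Placeholder for another slow function.
    return y + 100


def pick_value(condition, x, y):
    if condition:
        a = f(x)
    else:
        a = g(y)
    # `a` was bound in whichever branch ran; if/else creates no new scope.
    return a


print(pick_value(True, 3, 4))
print(pick_value(False, 3, 4))
```

Only one of the two functions is called per invocation, so there is no wasted work and no dummy initial value.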

OK, there are two points fundamental to Python that need to be clarified here.
There is no variable declaration/initialization in Python. A statement like a = f(x) simply binds the name a to the object returned by f. That name a can later be bound to any other object, no matter what its type is. See this answer.
A block in Python is the body of a module, a class or a function. Any name defined inside one of these is visible to later code until the end of that block. A loop or an if-else is not a block, so any name defined before or outside the loop or if/else is visible inside it, and vice versa. See this. global and nonlocal names are a little different. There is no let in Python; every assigned name is scoped to the enclosing function, class or module.
In your case the only concern is how you use a further on in the code. If your code expects the type of object returned by f or g, it should work fine unless there is an error, because either the if or the else branch must run in normal operation, so a will refer to some object (if the names were different in the if and the else, that would be a problem). If you want to make sure that the subsequent code does not break, you can use a try-except-else to catch any error raised by the functions and assign a default value to a in the except clause, after appropriate reporting/logging of the error.
Hence, to summarize and to directly address your question: assigning names to objects inside an if-else statement or a loop is perfectly good practice provided that:
The same name is used in both the if and the else clause, so that the name is guaranteed to refer to an object at the end of the statement. Additional try-except-else error catching can take care of exceptions raised by the functions.
The names are not too short, generic or otherwise unclear about the intention of the code, like a, res etc. A sensible name leads to much better readability and prevents accidental reuse of the same name later for some other object, thereby losing the original.
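The two provisos can be sketched together as follows. This is only an illustration under assumptions: load_primary, load_fallback, lookup and the default of 0 are hypothetical names invented for the example, and the else clause of the try statement is left out because nothing extra needs to run on success.

```python
import logging

logger = logging.getLogger(__name__)


def load_primary(key):
    # Hypothetical lookup that can fail.
    if key < 0:
        raise ValueError("negative key")
    return key * 10


def load_fallback(key):
    # Hypothetical alternative lookup.
    return key + 1


def lookup(use_primary, key, default=0):
    try:
        # The same descriptive name is bound in both branches.
        if use_primary:
            lookup_result = load_primary(key)
        else:
            lookup_result = load_fallback(key)
    except ValueError:
        # Report the failure, then bind the name to a default value.
        logger.exception("lookup failed for key=%r", key)
        lookup_result = default
    return lookup_result
```

Whichever path runs, lookup_result is bound by the time the function returns.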

Let me clarify what I meant in my comments.
# this is not strictly needed, but it makes the
# exception handler more robust
a = b = None
try:
    if condition:
        a = f(x)
        b = v(x)
    else:
        a = g(y)
        b = v2(x)
    return w(a, b)
except Exception as e:
    logger.exception("exception:%s" % (e))
    logger.exception(" the value of a was:%s" % (a))
    logger.exception(" the value of b was:%s" % (b))
    raise
This is pretty standard code: you just want to wrap the whole thing in some logging code in case of exceptions. I re-raise the original exception, but could just as easily return a default value.
The problem is that, without the a = b = None line, unless the exception waits until return w(a, b) to happen, any access to a and b in the handler will throw its own NameError, because those names have not been bound yet.
This has happened to me a lot with custom web unit-testing code: I get a response from a GET or POST to a URL and run a bunch of tests against the response. If the original GET/POST failed, the response doesn't exist, so any diagnostics like pretty-printing that response's attributes will throw an exception, forcing you to clean things up before your exception handler is useful.
So, to guard against this, I initialize any variable referred to in the exception handler to None. Of course, if needed, you also have to guard against a being None, with something like logger("a.attr1:%s" % (getattr(a, "attr1", "?")))
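A runnable sketch of this guard, assuming a hypothetical fetch function that stands in for the failing GET (fetch, check_page and the status attribute are invented for illustration):

```python
import logging

logger = logging.getLogger(__name__)


def fetch(url):
    # Hypothetical stand-in for a GET request that fails.
    raise ConnectionError(f"cannot reach {url}")


def check_page(url):
    response = None  # bound up front so the handler can always refer to it
    try:
        response = fetch(url)
        return response.status
    except Exception:
        # getattr with a default copes with `response` still being None.
        status = getattr(response, "status", "?")
        logger.exception("request to %s failed; status was %s", url, status)
        raise
```

Because response is bound before the try block, the handler logs the failure and re-raises the original ConnectionError instead of failing with a second error of its own.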

Related

Inlining Python Function

In a C program, inlining a function is a fairly intuitive optimization. If the inlined function's body is sufficiently small, you end up saving the jump to the function and creation of the stack frame, and you store the return value wherever the function's result would have been stored, jumping to the end of the inlined function's "body" rather than long-jumping to the return pointer.
I'm interested in doing the same thing in Python, converting two python functions into another valid python function where the first got "inlined" into the second. An ideal solution to this might look something like the following:
def g(x):
    return x ** 2
def f(y):
    return g(y + 3)
# ... Becomes ...
def inlined_f(y):
    return (y + 3) ** 2
Clearly, in a language as dynamic as Python, this isn't trivial to do automatically. The best generic solution I have come up with is to use dict to capture the arguments passed to the function, wrap the function body in a one-iteration for loop, use break to jump to the end of the function, and replace uses of arguments with indexes into the argument dictionary. The result looks something like the following:
def inlined_f(y):
    _g = dict(x=y + 3)
    for ____ in [None]:
        _g['return'] = _g['x'] ** 2
        break
    _g_return = _g.get('return', None)
    del _g
    return _g_return
I don't care that it's ugly, but I do care that it doesn't support returns from within loops. E.g.:
def g(x):
    for i in range(x + 1):
        if i == x:
            return i ** 2
    print("Woops, you shouldn't get here")
def inlined_f(y):
    _g = dict(x=y + 3)
    for ____ in [None]:
        for _g['i'] in range(_g['x'] + 1):
            if _g['i'] == _g['x']:
                _g['return'] = _g['i'] ** 2
                break  # <-- Doesn't exit function, just innermost loop
        print("Woops, you shouldn't get here")
    _g_return = _g.get('return', None)
    del _g
    return _g_return
What approach could I take to this problem that avoids needing to use break to "jump" out of the inlined function's body? I'd also be open to an overall better, generic approach to inlining one Python function into another.
For reference, I'm working at the AST (abstract syntax tree) level, so using parsed Python code; clearly, outside of literal values, I don't know what value or type anything will have while performing this transformation. The resulting inlined function must behave identically to the original functions, and must support all features typically available when calling a function. Is this even possible in Python?
EDIT: I should clarify since I used the tag "optimization", that I'm not actually interested in a performance boost. The resulting code does not need to be faster, it just must not call the inlined function while still behaving identically. You can assume that both functions' source code is available as valid Python.
The only reasonable way on source level I see, simplified:
Parse the source into some AST (or just use the built-in AST).
Copy a subtree representing the function's body.
Rename the variables in the subtree, e.g. by adding a unique prefix.
At the call site, replace all passed arguments with assignments using the function's new variable names.
Remove the call and replace it with the function body you've prepared.
Serialize the AST back to source.
What poses real problems:
Generator functions; just don't inline them.
Returns from under try/finally that need to run the finally part. Might be pretty hard to rewrite correctly; IMHO, best left un-inlined.
Returns from under context managers that need to run the __exit__ parts. While not impossible, it's also tricky to rewrite preserving the semantics; likely also best left un-inlined.
Mid-function returns, especially from within multiple loop constructs. You might need to replace them with an extra variable and thread it into every condition of every while statement, and likely to add a conditional break to for statements. Again, not impossible but likely best left un-inlined.
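The parse, copy, rename and serialize steps above can be sketched with the standard ast module. This is a partial illustration only: it assumes Python 3.9+ for ast.unparse, the helper names local_names and PrefixLocals and the "_g_" prefix are made up here, and it does not attempt argument substitution at the call site or any of the hard cases listed above.

```python
import ast


def local_names(func_def):
    """Collect parameter names and every name assigned inside the function."""
    names = {arg.arg for arg in func_def.args.args}
    for node in ast.walk(func_def):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            names.add(node.id)
    return names


class PrefixLocals(ast.NodeTransformer):
    """Prefix local names only, leaving globals and builtins untouched."""

    def __init__(self, names, prefix):
        self.names = names
        self.prefix = prefix

    def visit_Name(self, node):
        if node.id in self.names:
            node.id = self.prefix + node.id
        return node


source = """
def g(x):
    square = x ** 2
    return square
"""

func_def = ast.parse(source).body[0]
names = local_names(func_def)
for stmt in func_def.body:
    PrefixLocals(names, "_g_").visit(stmt)

# Serialize the renamed body back to source text.
renamed_body = "\n".join(ast.unparse(stmt) for stmt in func_def.body)
print(renamed_body)
```

The remaining return statement still has to be rewritten into an assignment at the call site, which is where the difficulties with mid-function returns begin.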
Probably the closest analog to a return would be raising an exception, which would work to pop out of nested loops to the top of the "inlined function".
class ReturnException(Exception):
    pass
_g = dict(x=y + 3)
try:
    for j in some_loop:
        for _g['i'] in range(_g['x'] + 1):
            if _g['i'] == _g['x']:
                raise ReturnException(_g['i'] ** 2)
except ReturnException as e:
    _g['return'] = e.args[0]  # e.message exists only in Python 2
else:
    _g['return'] = None
I don't know how much overhead is associated with exceptions, though, or whether that would be faster than simply calling the function.
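Applied to the question's g and f, the exception idea might look like the sketch below. The _InlineReturn class is a hypothetical helper invented for this example. One caveat: if the inlined body contains a broad except Exception of its own, it would swallow the helper exception, so this is not a fully general transformation.

```python
class _InlineReturn(Exception):
    """Hypothetical helper: carries the inlined function's return value."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


def g(x):
    for i in range(x + 1):
        if i == x:
            return i ** 2
    print("Woops, you shouldn't get here")


def f(y):
    return g(y + 3)


def inlined_f(y):
    _g = dict(x=y + 3)
    try:
        for _g['i'] in range(_g['x'] + 1):
            if _g['i'] == _g['x']:
                # Stands in for `return i ** 2` inside g.
                raise _InlineReturn(_g['i'] ** 2)
        print("Woops, you shouldn't get here")
    except _InlineReturn as ret:
        _g_return = ret.value
    else:
        # Falling off the end of g returns None.
        _g_return = None
    del _g
    return _g_return


print(f(2), inlined_f(2))
```

The raise leaves every enclosing loop at once, which is exactly what break could not do.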

Where to put exception handling in python

When using try/except blocks in Python, is there a recommendation to delegate it to any methods that might raise an exception, or to catch it in the parent function, or both?
For example, which of the following is preferred?
def my_function():
    s = something.that.might.go_wrong()
    return s
def main():
    try:
        s = my_function()
    except Exception:
        print("Error")
or
def my_function():
    try:
        s = something.that.might.go_wrong()
        return s
    except Exception:
        print("Error")
def main():
    s = my_function()
PEP 8 seems to be quiet on the matter, and I seem to find examples of both cases everywhere.
It really depends on the semantics of the functions in question. In general if you're writing a library, your library probably should handle exceptions that get raised inside the library (optionally re-raising them as new library-specific exceptions).
At the individual function level, though, the main thing you want to think about is what context/scope you desire to handle the exception in - if there is a reasonable different thing you could do in exceptional cases within the inner function, it might be useful to handle it within the inner function; otherwise, it might make more sense to handle it in the outer function.
For the specific case of writing output, it's often useful to only do that at the highest level, and inner functions only ever (a) return values or (b) raise exceptions. That makes the code easier to test because you don't have to worry about testing side effect output.
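A minimal sketch of that division of labour, with invented names (ConfigError, read_port and main are hypothetical): the inner function translates low-level errors into a library-specific exception and produces no output, while the top level decides what to tell the user.

```python
class ConfigError(Exception):
    """Hypothetical library-specific exception."""


def read_port(settings):
    # Inner function: returns a value or raises, never prints.
    try:
        return int(settings["port"])
    except (KeyError, ValueError) as exc:
        raise ConfigError(f"invalid or missing port: {exc!r}") from exc


def main(settings):
    # Highest level: the only place that turns errors into user-facing text.
    try:
        port = read_port(settings)
    except ConfigError as exc:
        return f"Error: {exc}"
    return f"Listening on {port}"


print(main({"port": "8080"}))
print(main({}))
```

Because read_port has no output side effects, it can be tested by checking its return value or the exception it raises.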
If you are following the rule "one function should handle one task", then you shouldn't handle the exception in that function; let it fail loudly on unexpected input. The parent function that calls it may handle the exception to give the user a better experience.
We can look at Python's built-in functions to see the Pythonic way.
I always use the second one. Logically, it seems to me that the problems of a function should be dealt with in that function only. This gives the user a clean and hassle-free interface, so you could later put your code in a library.
There could be some cases where you would want to handle exceptions outside the function. For example, if you want to print a particular message when something goes wrong, you should handle the exception outside the function.
However, you could pass the exception message as an argument to the function if you want to give the user the ability to decide on their own message. So I guess the second example (exception handling inside the function) is more universal and thus should be preferred.

Is it advisable to use print statements in a python function rather than return

Let's say I have the function:
def function(a):
    c = a + b
    print(c)
Is it advisable to use the print statement in the function to display output, rather than placing a return statement at the end and using print(function(a))?
Also, what implications would there be if I used both a print statement and a return statement in a function to display the same output? Let's imagine I need to show the answer for c and then use the value of c somewhere else. Does this break any coding conventions?
So the highlight of the question isn't the difference between print and return, but rather whether it is considered good style to use both in the same function and whether it has a possible impact on a program. For example in:
def function(a):
    c = a + b
    print(c)
    return c
value = function(a)
print(value)
Would the result be two c's? Assume c = 5; would the output therefore be:
5
5
print and return solve two completely different problems. They appear to do the same thing when running trivial examples interactively, but they are completely different.
If you indeed just want to print a result, use print. If you need the result for further calculation, use return. It's relatively rare to use both, except during a debugging phase where the print statements help see what's going on if you don't use a debugger.
As a rule of thumb I think it's good to avoid adding print statement in functions, unless the explicit purpose of the function is to print something out.
In all other cases, a function should return a value. The function (or person) that calls the function can then decide to print it, write it to a file, pass it to another function, etc.
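A tiny sketch of that rule of thumb (the name add and the numbers are made up for illustration): the function only returns, and the caller chooses what to do with the value.

```python
def add(a, b):
    # No print here: the function's only job is to compute the value.
    return a + b


total = add(2, 3)

# The caller decides: print it, reuse it, or both.
print(total)
doubled = total * 2
```

The same add could just as well feed a file writer or another function without any change.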
"So the highlight of the question isn't the difference between print and return but rather if it is considered good style to use both in the same function and its possible impact on a program."
It's not good style, it's not bad style. There is no significant impact on the program other than the fact you end up printing a lot of stuff that may not need to be printed.
If you need the function to both print and return a value, it's perfectly acceptable. In general, this sort of thing is rarely done in programming. It goes back to the concept of what the function is designed to do. If it's designed to print, there's usually no point in returning a value, and if it's designed to return a value, there's usually no point in printing since the caller can print it if it wants.
Well, return and print are two entirely different things.
print displays information to the user through the console, whereas return hands data back from a function so that it can be used later in your program.
And to answer your question: I believe the output would show the value twice, since the function prints c itself and the caller then prints the returned value of c as well. Correct me if I'm wrong.
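This can be checked with a short sketch that captures standard output. It assumes the function is actually called and that b is passed in as a second parameter, which the question's snippet leaves undefined.

```python
import io
from contextlib import redirect_stdout


def function(a, b):
    c = a + b
    print(c)   # first line of output, printed inside the function
    return c


buffer = io.StringIO()
with redirect_stdout(buffer):
    value = function(2, 3)
    print(value)   # second line of output, printed by the caller

printed_lines = buffer.getvalue().splitlines()
print(printed_lines)
```

The value appears twice: once from the print inside the function and once from the caller printing the returned value.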

What's the pythonic way of conditional variable initialization?

Due to the scoping rules of Python, all variables once initialized within a scope are available thereafter. Since conditionals do not introduce new scope, constructs in other languages (such as initializing a variable before that condition) aren't necessarily needed. For example, we might have:
def foo(optionalvar=None):
    # some processing, resulting in...
    message = get_message()
    if optionalvar is not None:
        # some other processing, resulting in...
        message = get_other_message()
    # ... rest of function that uses message
or, we could have instead:
def foo(optionalvar=None):
    if optionalvar is None:
        # processing, resulting in...
        message = get_message()
    else:
        # other processing, resulting in...
        message = get_other_message()
    # ... rest of function that uses message
Of course, the get_message and get_other_message functions might be many lines of code and are basically irrelevant (you can assume that the state of the program after each path is the same); the goal here is making message ready for use beyond this section of the function.
I've seen the latter construct used several times in other questions, such as:
https://stackoverflow.com/a/6402327/18097
https://stackoverflow.com/a/7382688/18097
Which construct would be more acceptable?
Python also has a very useful conditional expression which you can use here:
message = get_other_message() if optionalvar else get_message()
Or, if you want to compare strictly with None:
message = get_other_message() if optionalvar is not None else get_message()
Unlike example 1 that you posted, this doesn't call get_message() unnecessarily.
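A sketch showing that the conditional expression evaluates only the chosen branch. The bodies of get_message and get_other_message and the calls list are stand-ins invented for this example. Wrapping the expression in parentheses also lets it span several lines when the real names are long.

```python
calls = []  # records which helper actually ran


def get_message():
    calls.append("get_message")
    return "default message"


def get_other_message():
    calls.append("get_other_message")
    return "other message"


def foo(optionalvar=None):
    # Parentheses allow the expression to span lines without backslashes.
    message = (
        get_other_message()
        if optionalvar is not None
        else get_message()
    )
    return message


print(foo())
print(foo("x"))
print(calls)
```

Each call to foo records exactly one helper in calls, so the unused branch is never evaluated.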
In general the second approach is better and more generic, because it doesn't involve calling get_message unconditionally. That may be OK if the function is not resource intensive, but consider a search function:
def search(engine):
    results = get_from_google()
    if engine == 'bing':
        results = get_from_bing()
Obviously this is not good. I can't think of such a bad scenario for the second case, so basically an approach which goes through all the options and finally falls back to the default is best, e.g.:
def search(engine):
    if engine == 'bing':
        results = get_from_bing()
    else:
        results = get_from_google()
I think it's more pythonic to not set an explicit rule about this, and instead just keep to the idea that smallish functions are better (in part because it's possible to keep in your mind just when new names are introduced).
I suppose though that if your conditional tests get much more complicated than an if/else you may run the risk of all of them failing and you later using an undefined name, resulting in a possible runtime error, unless you are very careful. That might be an argument for the first style, when it's possible.
The answer depends on if there are side effects of get_message() which are wanted.
In most cases clearly the second one wins, because the code which produces the unwanted result is not executed. But if you need the side effects, you should choose the first version.
It might be better (read: safer) to initialize your variable outside the conditions. If you later have to add other conditions or even remove some, the code that uses message further on might hit an unbound name (a NameError or UnboundLocalError).
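The failure mode can be sketched as follows (describe, describe_safely and the status codes are made-up examples): a chain of conditions without a final else leaves the name unbound when no branch matches.

```python
def describe(code):
    if code == 200:
        message = "ok"
    elif code == 404:
        message = "not found"
    # No else branch: any other code leaves `message` unbound.
    return message


def describe_safely(code):
    if code == 200:
        message = "ok"
    elif code == 404:
        message = "not found"
    else:
        # A final else guarantees the name is bound on every path.
        message = "unknown"
    return message


try:
    describe(500)
except UnboundLocalError as exc:
    error_name = type(exc).__name__
    print(error_name)

print(describe_safely(500))
```

Either a final else or an assignment before the conditions closes the gap; the else version avoids doing default work that is then thrown away.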

Python: I'm not allowed to raise exception. Are there other elegant python ways?

My workplace has imposed a rule against raising exceptions (catching is allowed). If I have code like this
def f1():
    if bad_thing_happen():
        raise Exception('bad stuff')
    ...
    return something
I could change it to
def f1():
    if bad_thing_happen():
        return [-1, None]
    ...
    return [0, something]
f1 caller would be like this
def f1_caller():
    code, result = f1(param1)
    if code < 0:
        return code
    actual_work1()
    # call f1 again
    code, result = f1(param2)
    if code < 0:
        return code
    actual_work2()
    ...
Are there more elegant ways than this in Python?
Exceptions in python are not something to be avoided, and are often a straightforward way to solve problems. Additionally, an exception carries a great deal of information with it that can help quickly locate (via stack trace) and identify problems (via exception class or message).
Whoever has come up with this blanket policy was surely thinking of another language (perhaps C++?) where throwing exceptions is a more expensive operation (and will reduce performance if your code is executing on a 20 year old computer).
To answer your question: the alternative is to return an error code. This means that you are mixing function results with error handling, which raises (ha!) its own problems. However, returning None is often a perfectly reasonable way to indicate function failure.
Returning None is reasonably common and works well conceptually. If you are expecting a return value, and you get none, that is a good indication that something went wrong.
Another possible approach, if you are expecting to return a list (or dictionary, etc.) is to return an empty list or dict. This can easily be tested for using if, because an empty container evaluates to False in Python, and if you are going to iterate over it, you may not even need to check for it (depending on what you want to do if the function fails).
Of course, these approaches don't tell you why the function failed. So you could return an exception instance, such as return ValueError("invalid index"). Then you can test for particular exceptions (or Exceptions in general) using isinstance() and print them to get decent error messages. (Or you could provide a helper function that tests a return code to see if it's derived from Exception.) You can still create your own Exception subclasses; you would simply be returning them rather than raising them.
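A minimal sketch of returning exception instances instead of raising them. The functions parse_index and is_error are hypothetical, written only to illustrate the idea.

```python
def parse_index(text, size):
    """Return an int index, or an exception instance describing the failure."""
    if not text.isdigit():
        return ValueError(f"not a number: {text!r}")
    index = int(text)
    if index >= size:
        return IndexError(f"index {index} out of range for size {size}")
    return index


def is_error(value):
    # Helper in the spirit of the text: is this return value an Exception?
    return isinstance(value, Exception)


good = parse_index("2", 5)
bad = parse_index("abc", 5)

if is_error(bad):
    # Printing the instance gives a readable error message.
    print(f"failed: {bad}")
if not is_error(good):
    print(f"index is {good}")
```

Callers can still distinguish failure kinds with isinstance(result, ValueError) and similar checks, at the cost of having to test every return value.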
Finally, I would work toward getting this ridiculous policy changed, as exceptions are an important part of how Python works, have low overhead, and will be expected by anyone using your functions.
You have to use return codes. Other alternatives would involve mutable global state (think C's errno) or passing in a mutable object (such as a list), but you almost always want to avoid both in Python. Perhaps you could try explaining to them how exceptions let you write better post-conditions instead of adding complication to return values, but are otherwise equivalent.
