Basic Python: Exception raising and local variable scope / binding - python

I have a basic "best practices" Python question. I see that there are already StackOverflow answers tangentially related to this question but they're mired in complicated examples or involve multiple factors.
Given this code:
#!/usr/bin/python
def test_function():
    try:
        a = str(5)
        raise
        b = str(6)
    except:
        print b
test_function()
what is the best way to avoid the inevitable "UnboundLocalError: local variable 'b' referenced before assignment" that I'm going to get in the exception handler?
Does python have an elegant way to handle this? If not, what about an inelegant way? In a complicated function I'd prefer to avoid testing the existence of every local variable before I, for example, printed debug information about them.

Does python have an elegant way to handle this?
To avoid exceptions from printing unbound names, the most elegant way is not to print them; the second most elegant is to ensure the names do get bound, e.g. by binding them at the start of the function (the placeholder None is popular for this purpose).
If not, what about an inelegant way?
try: print 'b is', b
except NameError: print 'b is not bound'
In a complicated function I'd prefer to avoid testing the existence of every local variable before I, for example, printed debug information about them
Keeping your functions simple (i.e., not complicated) is highly recommended, too. As Hoare wrote 30 years ago (in his Turing acceptance lecture "The Emperor's old clothes", reprinted e.g. in this PDF):
There are two ways of constructing a software design: One way is to make it so simple that there are obviously no deficiencies, and the other way is to make it so complicated that there are no obvious deficiencies. The first method is far more difficult.
Achieving and maintaining simplicity is indeed difficult: given that you have to implement a certain total functionality X, it's the most natural temptation in the world to do so via complicated accretion into a few complicated classes and functions of sundry bits and pieces, "clever" hacks, copy-and-paste-and-edit-a-bit episodes of "drive-by coding", etc, etc.
However, it's a worthwhile effort to strive instead to keep your functions "so simple that there are obviously no deficiencies". If a function's hard to completely unit-test, it's too complicated: break it up (i.e., refactor it) into its natural components, even though it will take work to unearth them. (That's actually one of the ways in which a strong focus on unit testing helps code quality: by spurring you relentlessly to keep all the code perfectly testable, it's at the same time spurring you to make it simple in its structure.)

You can initialize your variables outside of the try block
a = None
b = None
try:
    a = str(5)
    raise
    b = str(6)
except:
    print b

You could check whether the variable is defined in the local scope using the built-in function locals():
http://docs.python.org/library/functions.html#locals
#!/usr/bin/python
def test_function():
    try:
        a = str(5)
        raise
        b = str(6)
    except:
        if 'b' in locals(): print b
test_function()
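If the goal is to dump several locals for debugging without testing each one by hand, the same locals() idea can be wrapped in a small helper. This is only a sketch in Python 3 syntax; snapshot and UNBOUND are names I made up for illustration, and I raise a ValueError instead of the bare raise so the example is self-contained.

```python
UNBOUND = "<unbound>"  # hypothetical placeholder for names that were never assigned


def snapshot(namespace, *names):
    """Return {name: value} for the requested names, using UNBOUND for missing ones."""
    return {name: namespace.get(name, UNBOUND) for name in names}


def test_function():
    try:
        a = str(5)
        raise ValueError("simulated failure")
        b = str(6)  # never reached, so b stays unbound
    except ValueError:
        # locals() only contains the names that are bound at this point
        return snapshot(locals(), "a", "b")


print(test_function())
```

The handler never touches b directly, so no UnboundLocalError can be raised while the debug information is being collected.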

def test_function():
    try:
        a = str(5)
        raise
        b = str(6)
    except:
        print b
b = str(6) is never run; the program exits the try block right after raise. If you want to print a variable in the except block, evaluate it before raising the exception and put it into the exception you throw.
class MyException(Exception):
    def __init__(self, var):
        self.var = var
def test_function():
    try:
        a = str(5)
        b = str(6)
        raise MyException(b)
    except MyException, e:
        print e.var
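For readers on Python 3, where the `except MyException, e` spelling is a syntax error, the same idea might look like the sketch below. The only changes are the `as` form and passing the value on to Exception.__init__ so it also shows up in the message; returning the value instead of printing it is my own choice to make the result easy to check.

```python
class MyException(Exception):
    """Carries the value the handler will want to report."""

    def __init__(self, var):
        super().__init__(var)  # keeps str(exc) informative
        self.var = var


def test_function():
    try:
        a = str(5)
        b = str(6)  # evaluated before raising, so it is guaranteed to be bound
        raise MyException(b)
    except MyException as e:
        return e.var


print(test_function())  # prints 6
```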

Related

Modify *existing* variable in `locals()` or `frame.f_locals`

I have found some vaguely related questions to this question, but not any clean and specific solution for CPython. And I assume that a "valid" solution is interpreter specific.
First the things I think I understand:
locals() gives a non-modifiable dictionary.
A function may (and indeed does) use some kind of optimization to access its local variables
frame.f_locals gives a locals() like dictionary, but less prone to hackish things through exec. Or at least I have been less able to do hackish undocumented things like the locals()['var'] = value ; exec ""
exec is capable of doing weird things to the local variables, but it is not reliable, e.g. I read somewhere that it doesn't work in Python 3. Haven't tested.
So I understand that, given those limitations, it will never be safe to add extra variables to the locals, because it breaks the interpreter structure.
However, it should be possible to change an already existing variable, shouldn't it?
Things that I considered
In a function f, one can access the f.func_code.co_nlocals and f.func_code.co_varnames.
In a frame, the variables can be accessed / checked / read through the frame.f_locals. This is in the use case of setting a tracer through sys.settrace.
One can easily access the function that a frame belongs to, considering the use case of setting a trace and using it to "do things" with the local variables given a certain trigger or whatever.
The variables should be somewhere, preferably writeable... but I am not capable of finding it. Even if it is an array (for interpreter efficient access), or I need some extra C-specific wiring, I am ready to commit to it.
How can I achieve that modification of variables from a tracer function or from a decorated wrapped function or something like that?
A full solution will be of course appreciated, but even some pointers will help me greatly, because I'm stuck here with lots of non writeable dictionaries :-/
Edit: Hackish exec is doing things like this or this
There is an undocumented C-API call for doing things like that:
PyFrame_LocalsToFast
There is some more discussion in this PyDev blog post. The basic idea seems to be:
import ctypes
...
frame.f_locals.update({
    'a': 'newvalue',
    'b': other_local_value,
})
ctypes.pythonapi.PyFrame_LocalsToFast(
    ctypes.py_object(frame), ctypes.c_int(0))
I have yet to test if this works as expected.
Note that there might be some way to access the fast locals directly, avoiding an indirection if the only requirement is to modify an existing variable. But since this seems to be a mostly undocumented API, the source code is the documentation.
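A minimal self-contained version of that idea, written for Python 3, could look like the following. It is a CPython-specific sketch rather than a supported API: set_caller_local is a name invented here, and my understanding is that from CPython 3.13 on (PEP 667) frame.f_locals writes through by itself, so the PyFrame_LocalsToFast call only matters on older versions and is guarded in case the symbol is not available.

```python
import ctypes
import inspect


def set_caller_local(name, value):
    """Overwrite an existing local variable of the calling function (CPython only)."""
    frame = inspect.currentframe().f_back
    # Fetch f_locals once: on older CPython versions every access re-syncs the
    # dictionary from the fast locals and would discard the pending change.
    frame_locals = frame.f_locals
    if name not in frame_locals:
        raise NameError(name + " is not a bound local of the caller")
    frame_locals[name] = value
    try:
        # Copies the dictionary back into the fast locals on CPython <= 3.12.
        ctypes.pythonapi.PyFrame_LocalsToFast(
            ctypes.py_object(frame), ctypes.c_int(0))
    except AttributeError:
        pass  # symbol not exported by this interpreter build


def demo():
    a = 1
    set_caller_local("a", 2)
    return a


print(demo())
```

As with the snippet above, this only covers names that are already locals of the caller; adding new ones is a different story.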
Based on the notes from MariusSiuram, I wrote a recipe that shows the behavior.
The conclusions are:
we can modify an existing variable
we can delete an existing variable
we can NOT add a new variable.
So, here is the code:
import inspect
import ctypes
def parent():
    a = 1
    z = 'foo'
    print('- Trying to add a new variable ---------------')
    hack(case=0)  # just try to add a new variable 'b'
    print(a)
    print(z)
    assert a == 1
    assert z == 'foo'
    try:
        print (b)
        assert False  # never is going to reach this point
    except NameError, why:
        print("ok, global name 'b' is not defined")
    print('- Trying to remove an existing variable ------')
    hack(case=1)
    print(a)
    assert a == 2
    try:
        print (z)
    except NameError, why:
        print("ok, we've removed the 'z' var")
    print('- Trying to update an existing variable ------')
    hack(case=2)
    print(a)
    assert a == 3
def hack(case=0):
    frame = inspect.stack()[1][0]
    if case == 0:
        frame.f_locals['b'] = "don't work"
    elif case == 1:
        frame.f_locals.pop('z')
        frame.f_locals['a'] += 1
    else:
        frame.f_locals['a'] += 1
    # passing c_int(1) will remove and update variables as well
    # passing c_int(0) will only update
    ctypes.pythonapi.PyFrame_LocalsToFast(
        ctypes.py_object(frame),
        ctypes.c_int(1))
if __name__ == '__main__':
    parent()
The output would be like:
- Trying to add a new variable ---------------
1
foo
ok, global name 'b' is not defined
- Trying to remove an existing variable ------
2
foo
- Trying to update an existing variable ------
3

Seeking general advice on how to prevent relentless "NameErrors" in Python

I have a question that I am sure has been on the mind of every intermediate-level Python programmer at some point: that is, how to fix/prevent/avoid/work around those ever-so-persistent and equally frustrating NameErrors. I'm not talking about actual errors (like typos, etc.), but a bizarre problem that basically says a global name was not defined, when in reality it was defined further down. For whatever reason, Python seems to be extremely needy in this area: every single variable absolutely positively has to be defined above and only above anything that refers to it (or so it seems).
For example:
condition = True
if condition == True:
    doStuff()
def doStuff():
    it_worked = True
Causes Python to give me this:
Traceback (most recent call last):
  File "C:\Users\Owner\Desktop\Python projects\test7.py", line 4, in <module>
    doStuff()
NameError: name 'doStuff' is not defined
However, the name WAS defined, just not where Python apparently wanted it. So for a cheesy little function like doStuff() it's no big deal; just cut and paste the function into an area that satisfies the system's requirement for a certain order. But when you try to actually design something with it, it makes organizing code practically impossible (I've had to "un-organize" tons of code to accommodate this bug). I have never encountered this problem with any of the other languages I've written in, so it seems to be specific to Python... but anyway I've researched this in the docs and haven't found any solutions (or even potential leads to a possible solution), so I'd appreciate any tips, tricks, workarounds or other suggestions.
It may be as simple as learning a specific organizational structure (like some kind of "Pythonic" and very strategic approach to working around the bug), or maybe just use a lot of import statements so it'll be easier to organize those in a specific order that will keep the system from acting up...
Avoid writing code (other than declarations) at top-level, use a main() function in files meant to be executed directly:
def main():
    condition = True
    if condition:
        do_stuff()
def do_stuff():
    it_worked = True
if __name__ == '__main__':
    main()
This way you only need to make sure that the if..main construct follows the main() function (e.g. place it at the end of the file), the rest can be in any order. The file will be fully parsed (and thus all the names defined in the module can be resolved) by the time main() is executed.
As a rule of thumb: For most cases define all your functions first and then use them later in your code.
It is just the way it is: every name has to be defined at the time it is used.
This is especially true at code being executed at top level:
func()
def func():
    func2()
def func2():
    print "OK"
func()
The first func() will fail, because it is not defined yet.
But if I call func() at the end, everything will be OK, although func2() is defined after func().
Why? Because at the time of calling, func2() exists.
In short, the code of func() says "Call whatever is defined as func2 at the time of calling".
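The same point as a snippet that runs to completion instead of stopping at the first error (Python 3 syntax; helper, first_call and second_call are just illustrative names):

```python
def func():
    # "helper" is looked up when func() runs, not when func is defined
    return helper()


try:
    func()
    first_call = "OK"
except NameError:
    # helper does not exist yet at this point of the module's execution
    first_call = "NameError"


def helper():
    return "OK"


second_call = func()  # the lookup now succeeds
print(first_call, second_call)  # prints: NameError OK
```

Nothing about func changed between the two calls; only the set of names bound in the module did.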
In Python defining a function is an act which happens at runtime, not at compile time. During that act, the code compiled at compile time is assigned to the name of the function. This name then is a variable in the current scope. It can be overwritten later as any other variable can:
def f():
    print 42
f() # will print 42
def f():
    print 23
f() # will print 23
You can even assign functions like other values to variables:
def f():
    print 42
g = 23
f() # will print 42
g # will print 23
f, g = g, f
f # will print 23
g() # will print 42
When you say that you didn't come across this in other languages, it's because the other languages you are referring to aren't interpreted as a script. Try similar things in bash, for instance, and you will find that other languages can behave the same way as Python.
There are a few things to say about this:
If your code is so complex that you can't organize it in one file, think about using many files and import them into one smaller main file
If you put your function in a class it will work. Example:
class test():
    def __init__(self):
        self.do_something()
    def do_something(self):
        print 'test'
As Volatility said in a comment, that is a characteristic of interpreted languages.

How can I access variables from the caller, even if it isn't an enclosing scope (i.e., implement dynamic scoping)?

Consider this example:
def outer():
    s_outer = "outer\n"
    def inner():
        s_inner = "inner\n"
        do_something()
    inner()
I want the code in do_something to be able to access the variables of the calling functions further up the call stack, in this case s_outer and s_inner. More generally, I want to call it from various other functions, but always execute it in their respective context and access their respective scopes (implement dynamic scoping).
I know that in Python 3.x, the nonlocal keyword allows access to s_outer from within inner. Unfortunately, that only helps with do_something if it's defined within inner. Otherwise, inner isn't a lexically enclosing scope (similarly, neither is outer, unless do_something is defined within outer).
I figured out how to inspect stack frames with the standard library inspect, and made a small accessor that I can call from within do_something() like this:
def reach(name):
    for f in inspect.stack():
        if name in f[0].f_locals:
            return f[0].f_locals[name]
    return None
and then
def do_something():
    print( reach("s_outer"), reach("s_inner") )
works just fine.
Can reach be implemented more simply? How else can I solve the problem?
There is no and, in my opinion, should be no elegant way of implementing reach since that introduces a new non-standard indirection which is really hard to comprehend, debug, test and maintain. As the Python mantra (try import this) says:
Explicit is better than implicit.
So, just pass the arguments. You-from-the-future will be really grateful to you-from-today.
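For the example from the question, passing things explicitly could look like this. It is a sketch: I return the concatenated string rather than printing it so the behaviour is easy to verify.

```python
def do_something(s_outer, s_inner):
    # Everything this function depends on is visible in its signature.
    return s_outer + s_inner


def outer():
    s_outer = "outer\n"

    def inner():
        s_inner = "inner\n"
        # s_outer is reachable here through the ordinary lexical closure.
        return do_something(s_outer, s_inner)

    return inner()


print(outer(), end="")
```

do_something can now be called from any other function, or from a unit test, without a particular call stack having to exist.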
What I ended up doing was
scope = locals()
and make scope accessible from do_something. That way I don't have to reach, but I can still access the dictionary of local variables of the caller. This is quite similar to building a dictionary myself and passing it on.
We can get naughtier.
This is an answer to the "Is there a more elegant/shortened way to implement the reach() function?" half of the question.
We can give better syntax for the user: instead of reach("foo"), outer.foo.
This is nicer to type, and the language itself immediately tells you if you used a name that can't be a valid variable (attribute names and variable names have the same constraints).
We can raise an error, to properly distinguish "this doesn't exist" from "this was set to None".
If we actually want to smudge those cases together, we can getattr with the default parameter, or try-except AttributeError.
We can optimize: no need to pessimistically build a list big enough for all the frames at once.
In most cases we probably won't need to go all the way to the root of the call stack.
Just because we're inappropriately reaching up stack frames, violating one of the most important rules of programming to not have things far away invisibly affecting behavior, doesn't mean we can't be civilized.
If someone is trying to use this Serious API for Real Work on a Python without stack frame inspection support, we should helpfully let them know.
import inspect
class OuterScopeGetter(object):
    def __getattribute__(self, name):
        frame = inspect.currentframe()
        if frame is None:
            raise RuntimeError('cannot inspect stack frames')
        sentinel = object()
        frame = frame.f_back
        while frame is not None:
            value = frame.f_locals.get(name, sentinel)
            if value is not sentinel:
                return value
            frame = frame.f_back
        raise AttributeError(repr(name) + ' not found in any outer scope')
outer = OuterScopeGetter()
Excellent. Now we can just do:
>>> def f():
...     return outer.x
...
>>> f()
Traceback (most recent call last):
...
AttributeError: 'x' not found in any outer scope
>>>
>>> x = 1
>>> f()
1
>>> x = 2
>>> f()
2
>>>
>>> def do_something():
...     print(outer.y)
...     print(outer.z)
...
>>> def g():
...     y = 3
...     def h():
...         z = 4
...         do_something()
...     h()
...
>>> g()
3
4
Perversion elegantly achieved.
Is there a better way to solve this problem? (Other than wrapping the respective data into dicts and pass these dicts explicitly to do_something())
Passing the dicts explicitly is a better way.
What you're proposing sounds very unconventional. When code increases in size, you have to break down the code into a modular architecture, with clean APIs between modules. It also has to be something that is easy to comprehend, easy to explain, and easy to hand over to another programmer to modify/improve/debug it. What you're proposing sounds like it is not a clean API, unconventional, with a non-obvious data flow. I suspect it would probably make many programmers grumpy when they saw it. :)
Another option would be to make the functions members of a class, with the data being in the class instance. That could work well if your problem can be modelled as several functions operating on the data object.
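A rough sketch of the class-based variant, using the names from the question (the class name Context and the choice to return the string are my own assumptions):

```python
class Context:
    """Holds the shared data that the question kept in local variables."""

    def __init__(self, s_outer):
        self.s_outer = s_outer
        self.s_inner = None

    def inner(self):
        self.s_inner = "inner\n"
        return self.do_something()

    def do_something(self):
        # The data arrives through self rather than through the call stack.
        return self.s_outer + self.s_inner


print(Context("outer\n").inner(), end="")
```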

Way in Python to make vars visible in calling method scope?

I find myself doing something like this constantly to pull GET args into vars:
some_var = self.request.get('some_var', None)
other_var = self.request.get('other_var', None)
if None in [some_var, other_var]:
    logging.error("some arg was missing in " + self.request.path)
    exit()
What I would really want to do is:
pull_args('some_var', 'other_var')
And that would somehow pull these variables to be available in current scope, or log an error and exit if not (or return to calling method if possible). Is this possible in Python?
First, a disclaimer: "pulling" variables into the local scope in any way other than var = something is really really really not recommended. It tends to make your code really confusing for someone who isn't intimately familiar with what you're doing (i.e. anyone who isn't you, or who is you 6 months in the future, etc.)
That being said, for educational purposes only, there is a way. Your pull_args function could be implemented like this:
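import inspect
import logging
def pull_args(request, *args):
    pulled = {}
    try:
        for a in args:
            pulled[a] = request[a]
    except (KeyError, AttributeError):  # a mapping-style lookup reports a missing key as KeyError
        logging.error("some arg was missing in " + request.path)  # there is no self in this function
        exit()
    else:
        caller = inspect.stack()[1][0]
        caller.f_locals.update(pulled)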
At least, something to that effect worked when I came up with it probably about a year ago. I wouldn't necessarily count on it continuing to work in future Python versions (yet another reason not to do it). I personally have never found a good reason to use this code snippet.
No it's not and also pointless. Writing to outer namespaces completely destroys the purpose of namespaces, which is having only the things around that you explicitly set. Use lists!
def pull_args(*names):
    return [self.request.get(name, None) for name in names]
print None in pull_args('some_var', 'other_var')
Probably this works too, to check if all _var are set:
print all(name in self.request for name in ('some_var', 'other_var'))
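Putting the two ideas together, a helper that returns the values and complains about missing ones might look like this in Python 3. It is a sketch: the plain dict stands in for the request object, and MissingArgument is a hypothetical exception name; the caller decides whether to log, exit or return.

```python
class MissingArgument(Exception):
    """Raised when one or more required arguments are absent."""


def pull_args(params, *names):
    """Return the requested values in order, or raise if any is missing."""
    missing = [name for name in names if params.get(name) is None]
    if missing:
        raise MissingArgument(", ".join(missing))
    return tuple(params[name] for name in names)


# Stand-in for the GET parameters of a request
params = {"some_var": "1", "other_var": "2"}
some_var, other_var = pull_args(params, "some_var", "other_var")
print(some_var, other_var)
```

The assignment on the left-hand side keeps the names visible at the call site, which is what writing into the caller's namespace would have hidden.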

Call python function as if it were inline

I want to have a function in a different module that, when called, has access to all variables that its caller has access to, and functions just as if its body had been pasted into the caller rather than having its own context, basically like a C macro instead of a normal function. I know I can pass locals() into the function and then it can access the local variables as a dict, but I want to be able to access them normally (e.g. x.y, not x["y"]), and I want all the names the caller has access to, not just the locals, including things that were 'imported' into the caller's file but not into the module that contains the function.
Is this possible to pull off?
Edit 2 Here's the simplest possible example I can come up with of what I'm really trying to do:
def getObj(expression):
    ofs = expression.rfind(".")
    obj = eval(expression[:ofs])
    print "The part of the expression Left of the period is of type ", type(obj),
Problem is that 'expression' requires the imports and local variables of the caller in order to eval without error. In reality there's a lot more than just an eval, so I'm trying to avoid the solution of just passing locals() in and through to the eval() since that won't fix my general case problem.
And another, even uglier way to do it -- please don't do this, even if it's possible --
import sys
def insp():
    l = sys._getframe(1).f_locals
    expression = l["expression"]
    ofs = expression.rfind(".")
    expofs = expression[:ofs]
    obj = eval(expofs, globals(), l)
    print "The part of the expression %r Left of the period (%r) is of type %r" % (expression, expofs, type(obj)),
def foo():
    derp = 5
    expression = "derp.durr"
    insp()
foo()
outputs
The part of the expression 'derp.durr' Left of the period ('derp') is of type <type 'int'>
I don't presume this is the answer that you wanted to hear, but trying to access local variables from a caller module's scope is not a good idea. If you normally program in PHP or C, you might be used to this sort of thing?
If you still want to do this, you might consider creating a class and passing an instance of that class in place of locals():
#other_module.py
def some_func(lcls):
    print(lcls.x)
Then,
>>> import other_module
>>>
>>>
>>> x = 'Hello World'
>>>
>>> class MyLocals(object):
...     def __init__(self, lcls):
...         self.lcls = lcls
...     def __getattr__(self, name):
...         return self.lcls[name]
...
>>> # Call your function with an instance of this instead.
>>> other_module.some_func(MyLocals(locals()))
'Hello World'
Give it a whirl.
Is this possible to pull off?
Yes (sort of, in a very roundabout way), though I would strongly advise against it in general (more on that later).
Consider:
myfile.py
def func_in_caller():
    print "in caller"
import otherfile
globals()["imported_func"] = otherfile.remote_func
imported_func(123, globals())
otherfile.py
def remote_func(x1, extra):
    for k,v in extra.iteritems():
        globals()[k] = v
    print x1
    func_in_caller()
This yields (as expected):
123
in caller
What we're doing here is trickery: we just copy every item into another namespace in order to make this work. This can (and will) break very easily and/or lead to hard to find bugs.
There's almost certainly a better way of solving your problem / structuring your code (we need more information in general on what you're trying to achieve).
From The Zen of Python:
2) Explicit is better than implicit.
In other words, pass in the parameter and don't try to get really fancy just because you think it would be easier for you. Writing code is not just about you.
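Applied to the getObj example from the question, "pass in the parameter" could mean handing over the namespace the expression should be evaluated in. A Python 3 sketch, where type_left_of_dot is a name I made up and the expression is assumed to contain at least one period:

```python
def type_left_of_dot(expression, namespace):
    """Evaluate the part of `expression` left of the last period in `namespace`."""
    left, _, _ = expression.rpartition(".")
    # The caller supplies the names explicitly; nothing is read off the call stack.
    return type(eval(left, namespace))


def foo():
    derp = 5
    # Merge module-level names and locals so imports and local variables both resolve.
    return type_left_of_dot("derp.durr", dict(globals(), **locals()))


print(foo())  # prints <class 'int'>
```

The function that does the eval can live in any module, because the only namespace it relies on is the one it was given.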
