Precise rules of variable binding in nested scopes [duplicate] - python

I seem to have misunderstood something about Python variable binding. What are the precise rules for deciding which variable is accessed given a nested scope with shadowing names?
Let me illustrate with some examples. First the basic shadow.
a = 1
def foo():
    a = 2
    def _foo():
        return a
    return _foo()
print(foo())  # -> 2
Everything is fine here. The value is overwritten and returned accordingly. However, if the value is changed after the function definition, it is still the inner value:
def bar():
    def _bar():
        return a
    a = 2
    return _bar()
print(bar())  # -> 2
What's more, defining a function that references a non-existent variable is possible.
def baz():
    def _baz():
        return b
    return _baz()
Then, if b is defined later, the function can be executed. But not if it is defined in another inner scope:
def qux(f):
    b = 3
    return f()
print(qux(baz()))  # -> NameError
Now all of these cases could be explained by having Python know about lines that come later in the program, but that conflicts with my knowledge of Python being an interpreted language, advancing line by line. So are statements parsed at once instead of line by line?
A weird behaviour with shadowing class attributes throws me off a bit more.
class C:
    a = 2
    b = a
    def meth(self):
        return a
    c = meth
print(C.b, C().meth(), C.c)  # -> 2 1 C.meth
Here a is defined as a class attribute and is successfully used in b, but this does not carry over to the method definition. The method itself can be used in later attributes, but not for example in other methods without going through self.
Is my guess about the binding happening all at once correct? And in that case are class bodies an exception by design, or are they not a scope at all? Or is something else going on here?

I think you might be overthinking it.
By default, variables when created are put in the narrowest enclosing function's scope.
Variables from all enclosing scopes are available in a read-only capacity, be that an enclosing function's scope or the global scope. If you try to assign to such a name, Python creates a new variable in the narrowest enclosing scope, shadowing the outer one. Declaring the name with the global keyword (or nonlocal, for a name in an enclosing function) stops this from happening and lets you assign to the outer variable.
Additionally, keep in mind that a function object is created at the time when the def statement is executed. For nested functions, essentially, every new call of the outer function creates the inner functions anew. This also means that inner functions have read-only access to the scope of the outer functions. Same rules as usual.
Your bar() example works because, by the time python tries to access the variable a, it is present in at least one of the enclosing scopes. Python doesn't check these things until the last possible moment. Your qux() example doesn't work because the scope in which b is declared does not enclose the scope where _baz() is defined, and thus is not accessible.
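A minimal sketch of that late lookup, reusing the bar() example from the question (the names first and second are added here for illustration only):

```python
def bar():
    def _bar():
        # 'a' is looked up when _bar runs, not when it is defined
        return a
    a = 2            # bound after _bar is defined, but before it is called
    first = _bar()
    a = 5            # a later rebinding is visible to _bar as well
    second = _bar()
    return first, second

result = bar()
print(result)  # (2, 5)
```

The inner function never stores a copy of a; it reads whatever the enclosing variable holds at call time.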
Class scopes are weird. When the class is evaluated, all variables defined inside it are bound to the class. However, the class doesn't really count as a scope of its own, for the purpose of the methods enclosed inside it. Think of meth() as an unbound function, declared in the global scope, which C.meth refers to (and, now, C.c). Calling a function via dot notation is a syntactic shorthand:
# the following two are identical
C().meth()
C.meth(C())
and while C.meth is technically bound to C, it's not enclosed in C's class-level namespace. Trying to do C().meth() will fail, because a is not defined with respect to the function. (note that if a is defined in the global scope, the function will work as expected - C.meth() has the global scope as a parent, not C's class-level scope).
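To make this concrete, here is a small sketch based on the class from the question. The method meth_via_self is a hypothetical addition, included only to contrast a bare name with an attribute lookup:

```python
a = 1  # module-level name

class C:
    a = 2      # class attribute, stored in the class namespace
    b = a      # the class body itself can read it, so b == 2

    def meth(self):
        return a        # bare name: the class namespace is skipped, the global is found

    def meth_via_self(self):
        return self.a   # attribute lookup reaches the class attribute

obj = C()
print(C.b, obj.meth(), obj.meth_via_self())  # 2 1 2
```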

Related

Behaviour discrepancy between classes and functions scope in Python

As you may know, the scope of a variable is statically determined in python (What are the rules for local and global variables in Python?).
For instance :
a = "global"
def function():
    print(a)
    a = "local"
function()
# UnboundLocalError: local variable 'a' referenced before assignment
The same rule applies to classes, but it seems to default to the global scope instead of raising an AttributeError:
a = "global"
class C():
    print(a)
    a = "local"
# 'global'
Moreover, in the case of a nested function, the behavior is the same (without using nonlocal or global) :
a = "global"
def outer_func():
    a = "outer"
    def inner_func():
        print(a)
        a = "local"
    inner_func()
outer_func()
# UnboundLocalError: local variable 'a' referenced before assignment
But in the case of nested classes, it still defaults to the global scope, and not the outer scope (again without using global or nonlocal) :
a = "global"
def outer_func():
    a = "outer"
    class InnerClass:
        print(a)
        a = "local"
outer_func()
# 'global'
The weirdest part is that the nested class defaults to the outer scope when there is no assignment to a:
a = "global"
def outer_func():
    a = "outer"
    class InnerClass:
        print(a)
outer_func()
# 'outer'
So my questions are:
Why the discrepancy between functions and classes (one raising an exception, the other defaulting to the global scope)?
In nested classes, why does the default scope become global, instead of staying with the outer one, when the variable is assigned afterward?
The answer is given in great detail in Section 9.2 of the official docs. The crux of the matter is
... On the other hand, the actual search for names is done dynamically, at run time — however, the language definition is evolving towards static name resolution, at “compile” time, so don’t rely on dynamic name resolution! (In fact, local variables are already determined statically.)
When you are in the class definition, which at the moment of its execution is the innermost scope, dynamic name resolution applies. You therefore see printouts of the global value of a.
If the name resolution were static, as in function definitions, the name a would be recognized as a local name even in the print statement. That is why you can't print a in a function before assigning to it.
The rules for class body scoping are alluded to in Section 4.2.2:
Class definition blocks and arguments to exec() and eval() are special in the context of name resolution. A class definition is an executable statement that may use and define names. These references follow the normal rules for name resolution with an exception that unbound local variables are looked up in the global namespace.
Let's parse that last sentence carefully, because it fully covers your last two examples. First off, what is an unbound local variable in this context? A class body creates a new namespace, just like entering a function. If a name is bound somewhere in a class body, it is a local variable. This is determined statically, as mentioned above. If you attempt to reference the name before it is first bound, you have an unbound local variable. Instead of raising an error, as a function would do, Python jumps straight to the global namespace to perform the lookup, skipping any enclosing function scopes (built-ins are still searched if the global lookup fails). In all other cases (not local variables), normal LEGB lookup order applies.
This is indeed a bit counter-intuitive, and I would argue that it pushes if not outright breaks the rule of least surprise.
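The three situations can be checked side by side. This is only a sketch: the function and class names are made up for the demonstration, and the values are stored in a class attribute called seen instead of being printed from the class body:

```python
import builtins

a = "global"

def outer_assigning():
    a = "outer"
    class Inner:
        seen = a       # 'a' is assigned below, so it is local to the class body;
        a = "local"    # still unbound here, so the lookup goes to the globals
    return Inner.seen

def outer_reading_only():
    a = "outer"
    class Inner:
        seen = a       # 'a' is never assigned in the class: normal closure lookup
    return Inner.seen

class UsesBuiltin:
    seen = len         # unbound local, absent from globals, found in built-ins
    len = None

print(outer_assigning())                  # global
print(outer_reading_only())               # outer
print(UsesBuiltin.seen is builtins.len)   # True
```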

Consider Three Nested Functions. Can the Innermost Function Access the Namespace of the Outermost One?

I tested this out with a program I wrote myself:
>>> def f():
    f=['f',1,2]
    def g():
        g=1
        print('this prints out f from f(): ',f)
        print("id",id(f))
        def x():
            x=1
            print('this also prints out f from f():',f)
            print('id',id(f))
        x()
    g()
>>> f()  # output
this prints out f from f(): ['f', 1, 2]
id 140601546763464
this also prints out f from f(): ['f', 1, 2]
id 140601546763464
From what I learned, the innermost x() function can only access its own local namespace, the enclosing namespace, the global, and finally the built-in namespace. I initially thought that trying to access the list f declared in function f() from function x() would raise an error, as the f() function's namespace cannot be classified as any of the aforementioned elements. After running the program, I realized you indeed can access the list f from the function x(). I don't quite understand how this works though. My guess is that checking the enclosing namespace not only checks the local namespace of the enclosing function but the enclosing function for it as well, in a process that works almost recursively. Can somebody please explain how this works?
Python resolves names using the LEGB rule (LEGB stands for Local, Enclosing, Global, and Built-in).
Local scope:
contains the names that are defined inside the function.
visible only inside the function
created when the function is called (if we call the function multiple times, each call creates a new local scope)
destroyed once the function returns
Enclosing or nonlocal:
exists for nested functions
contains names defined in the enclosing function
visible in inner and enclosing functions.
Global:
contains all the names defined at the top level of a program
visible from everywhere inside the code.
exists throughout the life of the program
Built-in:
created whenever we run a script
contains the functions, exceptions, and other objects that are built into Python
visible everywhere in the code
The LEGB rule determines the order in which Python looks up names: it searches the local, enclosing, global, and built-in scopes, in that order. Code in an inner scope can access names from outer scopes, but code in an outer scope cannot access names from inner scopes.
When we use nested functions, names are resolved as follows:
Check the local scope (inside the function).
If not found, check the enclosing scopes of the outer functions, from the innermost to the outermost.
If not found, look in the global scope.
If not found, look in the built-ins.
If still not found, raise a NameError.
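The lookup order above can be sketched with three nested functions, much like the f/g/x example in the question. The function names here are illustrative only:

```python
name = "global"

def outer():
    name = "outer"
    def middle():
        def inner():
            # 'name' is not local to inner or middle, so the search continues
            # outward and stops at outer's binding
            return name, len("abc")   # 'len' is found in the built-in scope
        return inner()
    return middle()

def no_enclosing():
    return name   # no local or enclosing binding: falls through to the global

print(outer())          # ('outer', 3)
print(no_enclosing())   # global
```

So "enclosing" covers every function that surrounds the current one, not just the nearest.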

Python life variables in if statement [duplicate]

Can someone tell me where I can find some information about the life of variables in if statement?
In this code:
if 2 < 3:
    a = 3
else:
    b = 1
print(a)
It prints the variable a, but a looks to me like a local variable of the if statement. In C, in fact, I get an error if I create the variable a inside the if statement.
I think that this behaviour is because Python is an interpreted language. Am I right?
Python variables are scoped to the innermost function, class, or module in which they're assigned. Control blocks like if and while blocks don't count, so a variable assigned inside an if is still scoped to a function, class, or module. However, implicit functions defined by a generator expression or list/set/dict comprehension do count, as do lambda expressions. You can't stuff an assignment statement into any of those, but lambda parameters and for clause targets are implicit assignments.
Taking into consideration your example:
if 2 < 3:
    a = 3
else:
    b = 1
print(a)
Note that a isn't declared or initialized before the condition, unlike in C or Java. In other words, Python does not have block-level scopes.
Interpretation and compilation have nothing to do with it, and being interpreted is not a property of languages but of implementations.
You could compile Python and interpret C and get exactly the same result.
In Python, you don't need to declare variables and assigning to a variable that doesn't exist creates it.
With a different condition – if 3 < 2:, for instance – your Python code produces an error.
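A quick way to see this, as a sketch that is meant to be run as a fresh script (the try/except and the error variable are added only so the failure can be inspected instead of stopping the program):

```python
error = None

if 3 < 2:
    a = 3      # never executed, so 'a' is never created
else:
    b = 1

try:
    print(a)
except NameError as exc:
    error = str(exc)

print(error)   # name 'a' is not defined
print(b)       # 1
```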
An if statement does not define a scope, as it is not a class, module, or function, which are the only structures in Python that define a scope (see Python Scopes and Namespaces).
So a variable defined in an if statement is in the tightest outer scope of that if.
See:
What's the scope of a variable initialized in an if statement?
Python variables are scoped to the innermost function, class, or
module in which they're assigned. Control blocks like if and while
blocks don't count, so a variable assigned inside an if is still
scoped to a function, class, or module.
This also applies to for loops which is completely unlike C/C++; a variable created inside the for loop will also be visible to the enclosing function/class/module. Even more curiously, this doesn't apply to variables created inside list comprehensions, which are like self-contained functions, i.e.:
mylist = [zed for zed in range(10)]
In this case, zed is not visible to the enclosing scope! It's all consistent, just a bit different from some other languages. Designer's choice, I guess.
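Both behaviours can be checked in a few lines. This is a sketch for Python 3; the variable names other than zed are invented for the example:

```python
for i in range(3):
    last = i * 2     # both 'i' and 'last' belong to the enclosing scope

squares = [zed * zed for zed in range(4)]   # 'zed' stays inside the comprehension

try:
    zed
except NameError:
    zed_leaked = False
else:
    zed_leaked = True

print(i, last)      # 2 4
print(squares)      # [0, 1, 4, 9]
print(zed_leaked)   # False
```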
The C language and its compilers are designed so that missing declarations of identifiers are caught at compile time, i.e. this kind of code is not allowed. You can call it a feature or a limitation.
Python - https://www.python.org/dev/peps/pep-0330/
The Python Virtual Machine executes Python programs that have been compiled from the Python language into a bytecode representation.
However, Python (which is compiled and then interpreted, not just interpreted) is flexible, i.e. you can create and assign values to variables and access them globally, locally, and nonlocally (https://docs.python.org/3/reference/simple_stmts.html#grammar-token-nonlocal-stmt). There are many varieties you can cook up while writing your program, and the compiler allows them because this is a feature of the language itself.
Coming to the scope and lifetime of variables, see the following description; it might be helpful.
When you define a variable at the beginning of your program, it will be a global variable. This means it is accessible from anywhere in your script, including from within a function.
Example:
Program #1
a = 1
if 2 < 3:
    print(a)
This prints the a declared outside.
however, in the below example a is defined globally as 5, but it's defined again as 3, within a function. If you print the value of a from within the function, the value that was defined locally will be printed. If you print a outside of the function, its globally defined value will be printed. The a defined in function() is literally sealed off from the outside world. It can only be accessed locally, from within the same function. So the two a's are different, depending on where you access them from.
Program #2
a = 5
def function():
    a = 3
    print(a)
function()
print(a)
Here, you see 3 and 5 as output.
The scope of a variable refers to the places where you can see or access that variable.
CASE 1: If you define a variable at the top level of your script or module or notebook, this is a global variable:
>>> global_var = 3
>>> def my_first_func():
...     # my_func can 'see' the global variable
...     print('I see "global_var" = ', global_var, ' from "my_first_func"')
Output:
>>> my_first_func()
I see "global_var" = 3 from "my_first_func"
CASE 2: Variables defined inside a function or class are not global. Only the function or class can see the variable:
>>> def my_second_func():
...     local_var = 10
...     print('I see "local_var" = ', local_var, 'from "my_second_func"')
Output:
>>> my_second_func()
I see "local_var" = 10 from "my_second_func"
CASE 3: But here, down in the top (global) level of the notebook, we can’t see that local variable:
Output:
>>> local_var
Traceback (most recent call last):
...
NameError: name 'local_var' is not defined

Scope of Python variables in this case - Difference between Enclosing and Local Variable

I am confused about the scope of python variables. How is this working
Consider the following example
i = 12
if i == 12:
    str = "is equal"
else:
    str = "NOT"
print(str)  # prints "is equal": the name str is recognized
The variable str is only in the scope of the if statement, and its scope is lost at the else statement. Since there is no hoisting in Python, I am confused about how this example works. I read this post, and it states that variables are scoped in the following order:
1-L (local variables are given preference)
2-E (Enclosing variables)
3-G (Global variables)
4-B (Builtin)
My question is: what is the difference between an enclosing variable and a local variable?
The variable str is only in the scope of if statement and its scope is lost at the else statement.
Nope. Python is function scoped, not block scoped. Entering an if block doesn't create a new scope, so the str variable is still in scope for the print.
Enclosing variables are variables from functions enclosing a given function. They occur when you have closures:
def f():
    x = 3
    def g():
        print(x)  # enclosing variable
    g()
Python doesn't have general block scope, only function scope (with some additional weirdness for cases like class declarations). Any name assigned within a function will remain valid for the life of the function.
Enclosing scope applies when nesting function declarations, e.g.:
def foo(a):
    def bar(b):
        return a + b
    return bar
So in this case, foo(1)(2) will create a bar whose enclosing scope is a foo call with a == 1, then call bar(2), which will see a as 1.
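Spelled out as a runnable sketch (add_one and add_ten are names introduced here to show that each foo call gets its own enclosing scope):

```python
def foo(a):
    def bar(b):
        return a + b   # 'a' comes from the enclosing foo call
    return bar

add_one = foo(1)
add_ten = foo(10)      # a separate call, so a separate enclosing scope

print(foo(1)(2))                # 3
print(add_one(5), add_ten(5))   # 6 15
```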
Enclosing scope also applies to lambda functions; they can read variables available in the scope surrounding the point where lambda was used, so for something like this:
val_to_key = {...} # Some dictionary mapping values to sort key values
mylist.sort(key=lambda x: val_to_key[x])
val_to_key is available; it wouldn't be in scope inside the sort function, but the lambda function binds the enclosing scope at declaration time and can therefore use val_to_key.

Immediately enclosing namespace of a top-level function (for the purpose of parsing nonlocal)? [duplicate]

In Python 3.3.1, this works:
i = 76
def A():
    global i
    i += 10
print(i) # 76
A()
print(i) # 86
This also works:
def enclosing_function():
    i = 76
    def A():
        nonlocal i
        i += 10
    print(i) # 76
    A()
    print(i) # 86
enclosing_function()
But this doesn't work:
i = 76
def A():
    nonlocal i # "SyntaxError: no binding for nonlocal 'i' found"
    i += 10
print(i)
A()
print(i)
The documentation for the nonlocal keyword states (emphasis added):
The nonlocal statement causes the listed identifiers to refer to
previously bound variables in the nearest enclosing scope.
In the third example, the "nearest enclosing scope" just happens to be the global scope. So why doesn't it work?
PLEASE READ THIS BIT
I do notice that the documentation goes on to state (emphasis added):
The [nonlocal] statement allows encapsulated code to
rebind variables outside of the local scope besides the global
(module) scope.
but, strictly speaking, this doesn't mean that what I'm doing in the third example shouldn't work.
The search order for names is LEGB, i.e Local, Enclosing, Global, Builtin. So the global scope is not an enclosing scope.
EDIT
From the docs:
The nonlocal statement causes the listed identifiers to refer to
previously bound variables in the nearest enclosing scope. This is
important because the default behavior for binding is to search the
local namespace first. The statement allows encapsulated code to
rebind variables outside of the local scope besides the global
(module) scope.
Why is a module's scope considered global and not an enclosing one? It's still not global to other modules (well, unless you do from module import *), is it?
If you put a name into a module's namespace, it is visible in any module that uses that module, i.e., it is global for the whole Python process.
In general, your application should use as few mutable globals as possible. See Why globals are bad?:
Non-locality
No Access Control or Constraint Checking
Implicit coupling
Concurrency issues
Namespace pollution
Testing and Confinement
Therefore, it would be bad if nonlocal allowed creating globals by accident. If you want to modify a global variable, you can use the global keyword directly.
global is the most destructive: may affect all uses of the module anywhere in the program
nonlocal is less destructive: limited by the outer() function scope (the binding is checked at compile time)
no declaration (local variable) is the least destructive option: limited by inner() function scope
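The three levels of reach can be compared in one small sketch. All names are hypothetical; each inner function tries to change a variable called counter:

```python
counter = 0  # module-level name

def outer():
    counter = 100
    def bump_local():
        counter = 1      # creates a new local; nothing outside changes
    def bump_nonlocal():
        nonlocal counter
        counter += 1     # rebinds outer()'s counter
    def bump_global():
        global counter
        counter += 1     # rebinds the module-level counter
    bump_local()
    bump_nonlocal()
    bump_global()
    return counter

outer_value = outer()
print(outer_value, counter)  # 101 1
```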
You can read about the history and motivation behind nonlocal in PEP 3104, Access to Names in Outer Scopes.
It depends upon the boundary cases:
nonlocal comes with some subtleties that we need to be aware of. First, unlike with the global statement, nonlocal names really must have been previously assigned in an enclosing def's scope when a nonlocal is evaluated, or else you'll get an error; you cannot create them dynamically by assigning them anew in the enclosing scope. In fact, they are checked at function definition time, before either the enclosing or the nested function is called.
>>> def tester(start):
    def nested(label):
        nonlocal state  # nonlocals must already exist in enclosing def!
        state = 0
        print(label, state)
    return nested
SyntaxError: no binding for nonlocal 'state' found
>>> def tester(start):
    def nested(label):
        global state  # Globals don't have to exist yet when declared
        state = 0     # This creates the name in the module now
        print(label, state)
    return nested
>>> F = tester(0)
>>> F('abc')
abc 0
>>> state
0
Second, nonlocal restricts the scope lookup to just enclosing defs; nonlocals are not looked up in the enclosing module's global scope or the built-in scope outside all def's, even if they are already there:
For example:
>>> spam = 99
>>> def tester():
    def nested():
        nonlocal spam  # Must be in a def, not the module!
        print('current=', spam)
        spam += 1
    return nested
SyntaxError: no binding for nonlocal 'spam' found
These restrictions make sense once you realize that Python would not otherwise generally know which enclosing scope to create a brand-new name in. In the prior listing, should spam be assigned in tester, or the module outside? Because this is ambiguous, Python must resolve nonlocals at function creation time, not function call time.
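One way to confirm that the check happens before anything runs is to compile the source without executing it. A sketch using the built-in compile(); the source string repeats the spam listing above, and "<example>" is just a placeholder filename:

```python
source = """
spam = 99
def tester():
    def nested():
        nonlocal spam
        spam += 1
    return nested
"""

message = None
try:
    compile(source, "<example>", "exec")   # compile only; nothing is executed
except SyntaxError as exc:
    message = exc.msg

print(message)  # no binding for nonlocal 'spam' found
```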
The answer is that the global scope does not enclose anything - it is global to everything. Use the global keyword in such a case.
Historical reasons
In 2.x, nonlocal didn't exist yet. It wasn't considered necessary to be able to modify enclosing, non-global scopes; the global scope was seen as a special case. After all, the concept of a "global variable" is a lot easier to explain than lexical closures.
The global scope works differently
Because functions are objects, and in particular because a nested function could be returned from its enclosing function (producing an object that persists after the call to the enclosing function), Python needs to implement lookup into enclosing scopes differently from lookup into either local or global scopes. Specifically, in the reference implementation of 3.x, Python will attach a __closure__ attribute to the inner function, which is a tuple of cell instances that work like references (in the C++ sense) to the closed-over variables. (These are also references in the reference-counting garbage-collection sense; they keep the call frame data alive so that it can be accessed after the enclosing function returns.)
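The cells can be inspected directly in CPython. A small sketch, with make_counter and increment as made-up names:

```python
def make_counter():
    count = 0
    def increment():
        nonlocal count
        count += 1
        return count
    return increment

counter = make_counter()   # make_counter has returned, but 'count' lives on
counter()
counter()

print(counter.__code__.co_freevars)           # ('count',)
print(counter.__closure__[0].cell_contents)   # 2
```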
By contrast, global lookup works by doing a chained dictionary lookup: there's a dictionary that implements the global scope, and if that fails, a separate dictionary for the builtin scope is checked. (Of course, writing a global only writes to the global dict, not the builtin dict; there is no builtin keyword.)
Theoretically, of course, there's no reason why the implementation of nonlocal couldn't fall back on a lookup in the global (and then builtin) scope, in the same way that a lookup in the global scope falls back to builtins. Stack Overflow is not the right place to speculate on the reason behind the design decision. I can't find anything relevant in the PEP, so it may simply not have been considered.
The best I can offer is: like with local variable lookup, nonlocal lookup works by determining at compile time what the scope of the variable will be. If you consider builtins as simply pre-defined, shadow-able globals (i.e. the only real difference between the actual implementation and just dumping them into the global scope ahead of time, is that you can recover access to the builtin with del), then so does global lookup. As they say, "simple is better than complex" and "special cases aren't special enough to break the rules"; so, no fallback behaviour.
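The chained global-then-builtin lookup, including the del trick mentioned above, can be sketched as follows. It is meant to be run at module level; the results list is only there to collect the values:

```python
import builtins

results = [len("abc")]                # found in the builtin scope
len = lambda obj: -1                  # writes to the module's global dict only
results.append(len("abc"))            # the global now shadows the builtin
results.append(builtins.len("abc"))   # the builtin itself is untouched
del len                               # removing the global exposes the builtin again
results.append(len("abc"))

print(results)  # [3, -1, 3, 3]
```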
