When exactly do Python closures do their capture?

Here is a rather self-explanatory code snippet in Python:
globl = 1
def foo():
    def bar():
        return free + capture
    capture = globl  # not seen when bar is defined
    return bar
free = 2
a = foo()
globl = 4
b = foo()
print(a())  # 3
print(b())  # 6
print(a.__closure__[0].cell_contents)  # 1
print(b.__closure__[0].cell_contents)  # 4
When 'bar' is defined, both 'free' and 'capture' are free variables: they do not exist in the parent environment, nor at module level. When 'bar' is returned from 'foo', 'capture' gets captured. From a stack frame!
So I assume Python closes over the environment when a function returns. Why is that the case? Why not at definition time of 'bar'?
The same snippet also works if we replace bar with a lambda:
bar = lambda : free+capture

At compile time, when Python encounters a function definition, it makes some determinations. It looks at everything in the function body; it does not create a scope or a namespace yet, but it decides which names are going to be local, which are free (nonlocal), and which are global.
When you run foo(), i.e. at run time of foo, the bar function object gets created. Python has already determined that "capture" is non-local for bar (it is assigned in foo), so bar already carries a reference to that free variable; "free" is never assigned in foo, so inside bar it falls back to a plain global lookup.
"capture" is visible in two different scopes (foo's and bar's) but always references the same value: when Python determines the variables of "bar", it creates a cell object that both scopes share.
So when the outer function "foo" finishes running, this cell object still exists, and when the inner function "bar" is called later, it reads the current value out of that same cell.
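A small sketch (make_pair and get are illustrative names, not from the question) showing that what gets captured is the cell for the variable, not the value it had when the inner function was defined:
def make_pair():
    x = 1
    def get():
        return x          # x is a free variable of get, accessed through a cell
    x = 2                 # rebinding x after get was defined is still visible
    return get

g = make_pair()
print(g())                             # 2, not 1: the cell holds the latest binding
print(g.__closure__[0].cell_contents)  # 2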

Related

Why do two main functions in a single Python script work?

I was trying to do silly things in Python and tried the silliest of things (see below) to see how Python reacts. To my surprise, it executed perfectly, but I do not understand why.
How does Python know which foo to execute? Why does it not execute the same foo twice?
def main():
    foo()

def foo():
    print('this is foo 1.')

if __name__ == '__main__':
    main()

def main():
    foo()

def foo():
    print('this is foo 2.')

if __name__ == '__main__':
    main()
Python executes statements from top to bottom as they appear in the input file.
So first it defines two functions main and foo and then calls main, leading to the output from foo, "this is foo 1".
Then it defines two other functions which happen to also be named main and foo; these names now refer to the new functions, and the first two functions are no longer accessible by these names. You could also say that the new functions override the old ones, or that main and foo are redefined.
Then it calls main (which now refers to the new function) which leads to the output of the new function foo, "this is foo 2".
Also note that the function name main has no special meaning in Python.
The interpreter does these things, in the order they appear in the file:
Defines main as your first definition of that function.
Defines foo as your first definition of that function.
Because __name__ is __main__: executes the function (currently defined as main) which calls the function currently defined as foo.
Redefines main as your second definition of that function. (Albeit that's no different from the first definition.)
Redefines foo as your second definition of that function.
Because __name__ is __main__: executes the function (currently defined as main) which calls the function currently defined as foo, which by now is your second foo.
You might be assuming that the interpreter somehow does a "first pass" and processes all function definitions, and only then executes the code under if __name__ == '__main__'. But that's not how it works. The steps are executed in order, like they would be if you ran:
x = 5
y = 7
print(x+y)
x = 6
y = 10
print(x+y)
For similar reasons, this code would not work:
if __name__ == '__main__':
    foo()

def foo():
    print(2)
because foo is undefined at the point it is referenced.
When main is called the first time, foo is the function that prints "This is foo 1.". When main is called the second time, foo is the function that prints "This is foo 2." Therefore, the first time, "This is foo 1." is the output, and the second time, "This is foo 2." is the output.
Python is an interpreted language, so it executes each line as it encounters it.
This means that whether you run this in the Python shell or as a script, the output is determined by whichever definition is current at the time of the call: the first call to foo() finds the first definition (foo 1), and the second call, after the redefinition, finds the newer definition (foo 2).
Output:
this is foo 1.
this is foo 2.

Why isn't Python switching from enclosing to local variable scope?

I was just checking my mental model about scoping in Python and got confused. The first two examples match my model, the 3rd example doesn't.
I assumed that Python has 4 scopes:
Local
Enclosed
Global
Built-in
I imagine those 4 scopes like dictionaries. The built-in one is pre-defined and the other ones get generated after some actions:
Global: The main script file creates a variable. This scope is killed once the script finishes executing.
Local: Within a function, a variable is created. This scope is killed once the function finishes executing.
Enclosed: A function B is defined within a function A. Once B is called, the local scope of A becomes the enclosed scope of B. This scope is killed once B finishes executing.
I assumed that Python keeps those 4 dictionaries in memory and essentially tries all 4 every time:
Does the variable exist in the local scope? Use it. If not, go to 2.
Does the variable exist in the enclosed scope? Use it. If not, go to 3.
Does the variable exist in the global scope? Use it. If not, go to 4.
Does the variable exist in the built-ins? Use it. If not, throw a NameError.
I especially assumed that a name could switch from using the enclosing scope to using the local scope within the same function. This is obviously not the case. Could somebody explain why? Is there maybe a bigger difference between my mental model and what actually happens?
Example 1
This prints "local"
def foo():
    min = lambda n: "enclosing"
    def bar():
        """Bar is enclosed by 'foo'"""
        min = lambda n: "local"
        print(min([1, 2, 3]))
    bar()
foo()
Example 2
This prints "enclosing"
def foo():
    min = lambda n: "enclosing"
    def bar():
        """Bar is enclosed by 'foo'"""
        print(min([1, 2, 3]))
    bar()
foo()
Example 3
def foo():
    min = lambda n: "enclosing"
    def bar():
        """Bar is enclosed by 'foo'"""
        print(min([1, 2, 3]))
        min = lambda n: "local"
        print(min([1, 2, 3]))
    bar()
foo()
gives
Traceback (most recent call last):
  File "example.py", line 13, in <module>
    foo()
  File "example.py", line 10, in foo
    bar()
  File "example.py", line 6, in bar
    print(min([1,2,3]))
UnboundLocalError: local variable 'min' referenced before assignment
The rule is that if you assign to a variable anywhere inside the body of a function, the variable is considered local throughout that whole function.
If you want to refer to a global variable, you have to say so explicitly by declaring
global my_variable
and if you mean to refer to the variable in the closest enclosing scope that defines it, you have to declare it as
nonlocal my_variable
So, there is nothing special happening here: the general rule for deciding that a variable is local still applies.
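A small variation on Example 3 (not part of the original answer) showing the nonlocal declaration in action: every use of min inside bar now resolves to foo's variable, so the error goes away and the reassignment rebinds foo's min:
def foo():
    min = lambda n: "enclosing"
    def bar():
        nonlocal min               # all uses of min in bar refer to foo's min
        print(min([1, 2, 3]))      # "enclosing"
        min = lambda n: "local"    # rebinds foo's min, not a fresh local
        print(min([1, 2, 3]))      # "local"
    bar()
    print(min([1, 2, 3]))          # "local" - bar really rebound foo's variable
foo()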

Where liveth declared but undefined global variables in Python?

When you state that a variable is global, that does not create it for you (if it doesn't already exist). What does the global statement actually do to the variable? It obviously doesn't merely modify it, since the variable does not have to exist for it to be modified. Once this goes out of scope, can…
def foo():
    global cat, dog
    dog = 1

foo()
print('dog' in globals())  # => True
print(dog)                 # => 1
print('cat' in globals())  # => False
print(cat)                 # => NameError
This also raises an error (not surprising):
def foo():
    global cat, dog
    dog = 1

def bar():
    cat = 2

foo()
bar()
print(dog)
print(cat)  # => NameError
So obviously the global modifier only works within the scope of the function being executed. Is this, in any way, caused by the garbage collector? Is there some phantom globalizer object that waits for the creation of an object with the given name and gets cleared up upon the end of the function?
What does the global statement actually do to the variable?
Absolutely nothing.
global foo means that any occurrences of the variable name foo in the scope of the function refer to a module-global foo variable instead of a function-call-local variable. It does nothing to the variable itself.
As for where such variables live, they don't really "live" anywhere. When such a variable is assigned, an entry will be created for them in the module's global variable dict. If the variable is deleted, the global variable dict entry will be erased. This is identical to what would happen if you were assigning and deleting these variables at module level without a global declaration.
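A quick sketch of that behaviour (set_it, drop_it and thing are illustrative names, not from the answer):
def set_it():
    global thing
    thing = 42        # first assignment creates the entry in the module's globals()

def drop_it():
    global thing
    del thing         # deleting removes the entry again

print('thing' in globals())  # False: the global declaration alone created nothing
set_it()
print('thing' in globals())  # True
drop_it()
print('thing' in globals())  # False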
The global statement is a directive to the parser,
as written in the docs. This means it doesn't change anything by itself. Also notice that "it applies only to code parsed at the same time as the global statement". This can be tested with the example below:
a = 3

def foo():
    exec('global a')
    a = 4

foo()
print(a)  # 3
If global were a modifier as you suggested, the last line would print 4. But that's not the case.
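For contrast (a minimal sketch, not part of the quoted answer), a global statement that is parsed together with the assignment does rebind the module-level name:
a = 3

def foo():
    global a   # parsed together with the assignment below
    a = 4

foo()
print(a)  # 4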

Order of evaluation of classes?

In a file lib.py I defined a functional class C and an enumeration class E as follows:
class C:
    a = None
    def meth(self, v):
        if v == E.v1:
            print("In C.meth().v1")
            a = E.v1
        if v == E.v2:
            print("In C.meth().v2")
            a = E.v2

from enum import Enum

class E(Enum):
    print("In Enum")
    v1 = 1
    v2 = 2
Then, I import the two classes into my module main.py and use the enumeration:
from lib import C
from lib import E
c = C()
c.meth(E.v1)
When running, I get the following output:
In Enum
In C.meth().v1
Now, since Python is an interpreted language (at least, when using IDLE), I'd expect to get an error on the reference to the enumerations in the method meth. Since there is no error, and it seems to run OK, I wonder what are the (ordering) rules for referencing classes in the same module, and in between different modules? Why is there no error?
Name lookup happens at run time. So when you are defining class C and its method meth, then the lookup on E isn’t done yet. So it’s not a problem that you define it afterwards. Instead, the lookup happens when you call the method.
Also, name lookup happens by going up the scope, so meth will find the original E declared on module level, regardless of whether you import it in your main.py or not. Since you also import E in main.py, which is a reference to the same object, you can reference the same enum value in there too.
See also this example:
>>> def test():  # foo is not defined at this time
...     print(foo)
...
>>> test()
NameError: global name 'foo' is not defined
>>> foo = 'bar'  # after defining foo, it works:
>>> test()
bar
When defining methods, variables are never “embedded”; the methods only contain the names and those names are looked up at run-time. However, due to how Python does the lookup, names of local variables are always “around” even if they haven’t been initialized yet. This can result in UnboundLocalErrors:
>>> def test():
...     print(foo)
...     foo = 'baz'
...
>>> test()
UnboundLocalError: local variable 'foo' referenced before assignment
One might expect that foo would be looked up in the outer scope for the first print, but because there is a local foo (even if it wasn’t initialized yet), foo will always* resolve to the local foo.
(* The nonlocal statement allows you to make foo non-local, resolving it to the outer scope, again for all uses of foo in that method.)
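One way to see that compile-time decision (a small sketch, not part of the quoted answer) is to inspect the compiled code object: foo is already classified as a local variable of test before the function has ever run:
def test():
    print(foo)
    foo = 'baz'

print(test.__code__.co_varnames)        # ('foo',): foo was classified as local at compile time
print('foo' in test.__code__.co_names)  # False: it will not be looked up as a global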
When a module is imported, the statements are executed from top to bottom. Inside a class definition, the statements are also executed, which is what defines the methods inside the class. A def defines a method, but the statements inside the def are only parsed, not executed.
The simplest way to understand the order of evaluation in your code is to watch it execute:
http://dbgr.cc/q
Press the play button on the far right of the debug buttons and it will automatically step through.
I think what is confusing to you is that when class E is defined, all statements inside of the E class are run. This is the case for every class definition. This includes calling the print function to say "In Enum", as well as defining the v1 and v2 members of the E class.
The line c.meth(E.v1) isn't executed until both the C and the E classes have been defined, which means that E.v1 has also already been defined. This is why there is no error like you were expecting.
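A stripped-down sketch of that timing (the names are illustrative, not from the question): statements in a class body run when the class statement itself is executed, while statements inside a def only run when the function is called:
print("module top")

class Demo:
    print("class body runs now")  # executes while the class is being defined
    def meth(self):
        print("method body runs only when called")

print("after class definition")
Demo().meth()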

Python: Why is redefinition of a function not an error? Is there a hackish way to have that feature? [duplicate]

This question already has answers here:
Can the python interpreter fail on redeclared functions?
(2 answers)
Closed 8 years ago.
For example:
def foo():
    print 'first foo'

def foo():
    print 'second foo'

foo()
silently produces: second foo
Today I copy/pasted a function definition in the same file and changed a few lines in the body of the second definition, but forgot to change the function name itself. I scratched my head for a long time looking at the output, and it took me a while to figure it out.
How can I force the interpreter to throw at least a warning on redefinition of a function? Thanks in advance.
How about using pylint?
pylint your_code.py
Let your_code.py be
1 def dup():
2     print 'a'
3 def dup():
4     print 'a'
5
6 dup()
pylint shows
C: 1,0: Missing docstring
C: 1,0:dup: Missing docstring
E: 3,0:dup: function already defined line 1 <--- HERE!!!!
C: 3,0:dup: Missing docstring
...
If you are using PyDev, you can find the duplication interactively.
When you mouse over the second dup, it says Duplicated signature: dup.
This is one of the features of Python. Functions are values just as integers are, so you can pass them around and rebind them to names, just as you would in C++ using function pointers.
Look at this code:
def foo():  # we define a function and bind the function object to the name 'foo'
    print "this is foo"

foo()       # >>> this is foo

bar = foo   # we bind the name 'bar' to the same function object

def foo():  # a new function object is created and bound to foo
    print "this is new foo"

foo()       # foo now points to the new object
            # >>> this is new foo

bar()       # but the old function object is still unmodified:
            # >>> this is foo
Thus the interpreter works fine. In fact, it is common to redefine functions while you are working in the interactive interpreter, until you get them right, or when you use decorators.
If you want to be warned about redefining something in Python, you can use lint tools such as pylint (see function-redefined, E0102).
I think this is similar to what happens with variables (identifiers):
In [4]: a = 2
In [5]: a = 3
In [6]: a
Out[6]: 3
you don't see the interpreter whining about a being redefined.
EDIT: Somebody commented below and I think it might help clarify my answer:
[this is due to] function objects are not treated differently from other objects, and
that names defined via def aren't treated differently from names
defined via other means
See the language reference about def being a reserved identifier.
You need to know Python's philosophy of objects. Everything in Python is an object. When you create a function, you actually create an object of class function and bind it to your function's name.
When you re-define it, you simply replace the old object with a new one, just as you would when creating a new variable with the same name.
e.g.
>>> a=10
>>> print a
10
>>> a=20
>>> print a
20
In the same way, you can check the class of the function:
>>> def a():
... pass
...
>>> a.__class__
<type 'function'>
which indicates that your function is actually an object bound to a name, and that name can be rebound to any other object of the same class.
Well, you could check whether it already exists, like this:
def foo():
    pass

def check_foo(variable_dict):
    if 'foo' in variable_dict:
        print('Function foo already exists!')
    else:
        print('Function foo does not exist..')

>>> check_foo(locals())
Function foo already exists!
>>> del foo
>>> check_foo(locals())
Function foo does not exist..
