Python behavior for immutable default parameter values - python

>>> def a():
...     print "a executed"
...     return []
...
>>>
>>> def b(x=a()):
...     x.append(5)
...     print x
...
a executed
>>> b()
[5]
>>> b()
[5, 5]
x is bound to an empty list object when the function b is first defined. The empty list object gets modified each time b is called because x is bound to that same object.
What I don't get is when this happens to immutable objects:
>>> def a():
...     print "a executed"
...     return 0
...
>>>
>>> def b(x=a()):
...     x = x + 2
...     print x
...
a executed
>>> b()
2
>>> b()
2
From my POV, x is bound to the int object 0 when the function b is first defined. Then, x is modified when b() is called. Therefore subsequent calls to b() should re-bind x to 2, 4, 6, and so on. Why doesn't this occur? I am obviously missing something important here!
Thx :)

When you do x = ..., you're not modifying the object that x references; you're just changing the reference x to point to a different object, in this case another int. Here it's even irrelevant whether x points to an immutable object. If you did x = x + [5] with lists, the default would also remain unchanged. Note the difference:
def b1(x = []):
    x = x + [5]
    print(x)
def b2(x = []):
    x.append(5)
    print(x)
print("b1:")
b1()
print("b1:")
b1()
print("b2:")
b2()
print("b2:")
b2()
Gives:
b1:
[5]
b1:
[5]
b2:
[5]
b2:
[5, 5]
When the function is being executed, you're working on a local variable x that either was initialized using the default value, or provided by the caller. So what gets rebound is the local variable x, not the default value for the parameter.
You may want to also read about the difference between formal and actual parameters. It's only slightly related to this problem, but may help you understand this better. An example explanation can be found here.
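The same distinction shows up when the caller passes its own list instead of relying on the default. The following is a minimal Python 3 sketch; the function names rebind and mutate are made up for illustration:

```python
def rebind(x):
    # Builds a new list and rebinds the local name x;
    # the caller's list is left untouched.
    x = x + [5]
    return x

def mutate(x):
    # Changes the list object itself, so the caller sees the change.
    x.append(5)
    return x

original = [1]

result = rebind(original)
print(original, result)      # [1] [1, 5]
print(result is original)    # False

result = mutate(original)
print(original, result)      # [1, 5] [1, 5]
print(result is original)    # True
```

In both functions x is just a local name; only what the body does with the object it refers to differs.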

Careful, there's a huge difference between:
x.append(5)
and:
x = x + 1
Namely, the first mutates the object referenced by x whereas the second creates a new object which is the result of x + 1 and rebinds it to the name x.
Of course, this is a bit of an over-simplification -- e.g. what if you had used += ...
It really falls back on how __add__ and __iadd__ are defined in the first place, but this should get the point across ...
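To make the += case concrete, here is a small Python 3 sketch (the variable and function names are only illustrative). Lists define __iadd__ and extend themselves in place, while ints do not define it, so += falls back to creating a new object and rebinding the name:

```python
lst = [1]
alias = lst
lst += [2]              # list.__iadd__ mutates in place and returns the same object
print(alias)            # [1, 2]
print(lst is alias)     # True

n = 1
m = n
n += 1                  # int has no __iadd__, so this behaves like n = n + 1
print(m, n)             # 1 2

def b(x=[]):
    x += [5]            # in place, so the default list itself grows
    return x

print(b())              # [5]
print(b())              # [5, 5]
```

So with a list default, x += [5] behaves like x.append(5), not like x = x + [5].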
To go a little deeper, you can think of a function as an object or an instance of a class. It has some special attributes which you can even look at if you want to:
>>> lst = []
>>> def foo(x = lst): pass
...
>>> foo.func_defaults
([],)
>>> foo.func_defaults[0] is lst
True
When the function is defined, func_defaults¹ gets set. Every time the function gets called, Python looks at the defaults and the arguments present in the call, and it figures out which defaults to pass into the function and which values were provided already. The takeaway is that this is why, when you append to the list in the first case, the change persists: you're actually changing the value in func_defaults too. In the second case, where you use x = x + 1, you're not doing anything to change func_defaults. You're just creating something new and putting it into the function's namespace.
¹ The attribute is named __defaults__ in Python 3.x.
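In Python 3 the same inspection can be done through __defaults__. A minimal sketch, with function names invented for the example:

```python
def append_five(x=[]):
    x.append(5)          # mutates the object stored in __defaults__
    return x

def add_two(x=0):
    x = x + 2            # rebinds the local name only
    return x

print(append_five.__defaults__)   # ([],)
append_five()
append_five()
print(append_five.__defaults__)   # ([5, 5],)

add_two()
add_two()
print(add_two.__defaults__)       # (0,)
```

The tuple stored on the function never changes identity; only the mutable object inside it can change.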

Related

How do closures see context variables on the stack?

I would like to understand how the stack frame pushed by calling b() can access the value of x that lives in the stack frame pushed by a().
Is there a pointer from b() frame to a() frame? Or does the runtime copy the value of x as a local variable in the b() frame? Or is there another mechanism under the hood?
This example is in Python, but is there a universal mechanism for this, or do different languages use different mechanisms?
>>> def a():
...     x = 5
...     def b():
...         return x + 2
...     return b()
...
>>> a()
7
In CPython (the implementation most people use) b itself contains a reference to the value. Consider this modification to your function:
def a():
    x = 5
    def b():
        return x + 2
    # b.__closure__[0] corresponds to x
    print(b.__closure__[0].cell_contents)
    x = 9
    print(b.__closure__[0].cell_contents)
When you call a, note that the value of the cell content changes with the local variable x.
The __closure__ attribute is a tuple of cell objects, one per variable that b closes over. The cell object basically has one interesting attribute, cell_contents, that acts like a reference to the variable it represents. (You can even assign to the cell_contents attribute to change the value of the variable, but I can't imagine when that would be a good idea.)
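The cell is also what lets the variable outlive the outer call. The sketch below (Python 3; make_counter and increment are hypothetical names) returns the inner function and inspects its cell after the outer frame is gone:

```python
def make_counter():
    count = 0
    def increment():
        nonlocal count       # rebind the variable held in the shared cell
        count += 1
        return count
    return increment

counter = make_counter()     # make_counter's frame has finished by now
cell = counter.__closure__[0]

print(counter.__code__.co_freevars)   # ('count',)
print(cell.cell_contents)             # 0
counter()
counter()
print(cell.cell_contents)             # 2
```

So in CPython there is no pointer from one frame to the other; both functions reach the variable through the same cell object.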

Python dynamic function attribute

I came across an interesting issue while trying to achieve dynamic sort.
Given the following code:
>>> l = []
>>> for i in range(2):
...     def f():
...         return f.v
...     f.v = i
...     l.append(f)
...
You have to be careful about how to use the functions in l:
>>> l[0]()
1
>>> l[1]()
1
>>> [h() for h in l]
[1, 1]
>>> [f() for f in l]
[0, 1]
>>> f = l[0]
>>> f()
0
>>> k = l[1]
>>> k()
0
>>> f = l[1]
>>> k()
1
>>> del f
>>> k()
NameError: global name 'f' is not defined
The behavior of the function depends on what f currently is.
What should I do to avoid this issue? How can I set a function attribute that does not depend on the function's name?
Update
Reading your comments and answers, here is my actual problem.
I have some data that I want to sort according to user input (so I don't know sorting criteria in advance). User can choose on which part of the data to apply successive sorts, and these sorts can be ascending or descending.
So my first try was to loop over the user inputs, define a function for each criterion, store this function in a list and then use this list for sorted's key like this: key=lambda x: [f(x) for f in functions]. To avoid multiplying conditions into functions themselves, I was computing some needed values before the function definition and binding them to the function (different functions with different pre-computed values).
While debugging, I understood that function attribute was not the solution here, so I indeed wrote a class with a __call__ method.
The issue is due to the fact that return f.v loads the global f, and not the one you intend.¹ You can see this by disassembling the code:
>>> import dis
>>> dis.dis(l[0])
  3           0 LOAD_GLOBAL              0 (f)
              3 LOAD_ATTR                1 (v)
              6 RETURN_VALUE
After the loop that populates l, f is a reference to the last function created, as you can see here:
>>> l
[<function f at 0x02594170>, <function f at 0x02594130>]
>>> f
<function f at 0x02594130>
Thus, when you call l[0](), it still loads the f that points to the last function created, and it returns 1. When you redefine f by doing f = l[0], the global f then points to the first function.
What you seem to want is a function that has a state, which really is a class. You could therefore do something like this:
class MyFunction:
    def __init__(self, v):
        self.v = v
    def __call__(self):
        return self.v
l = [MyFunction(i) for i in range(2)]
l[0]() # 0
l[1]() # 1
Though it may be a good idea to explain your actual problem first, as there might be a better solution.
¹ Why does it load the global f and not the current instance, you may ask?
Recall that when you define a method in a class, you need to declare a self parameter, like so:
# ...
    def my_method(self):
        return self.value
self is actually a reference to the current instance of your object. That's how Python knows where to load the attribute value. It knows it has to look into the instance referenced by self. So when you do:
a.value = 1
a.my_method()
self is now a reference to a.
So when you do:
def f():
    return f.v
There's no way for Python to know what f actually is. It's not a parameter, so it has to load it from elsewhere. In your case, it's loaded from the global variables.
Thus, when you do f.v = i, while you do set an attribute v for the instance of f, there's no way to know which instance you are referring to in the body of your function.
Note that what you are doing here:
def f():
    return f.v
is not making a function which returns whatever its own v attribute is. It's returning whatever the f object's v attribute is. So it necessarily depends on the value of f. It's not that your v attribute "depends on the function's name". It really has nothing at all to do with the function's name.
Later, when you do
>>> f = l[0]
>>> k = l[1]
>>> k()
0
What you have done is bound k to the function at l[1]. When you call it, you of course get f.v, because that's what the function does.
But notice:
>>> k.v
1
>>> [h.v for h in l]
[0, 1]
So, a function is an object, and just like most objects, it can have attributes assigned to it (which you can access using dot notation, or the getattr() function, or inspecting the object's dictionary, etc.). But a function is not designed to access its own attributes from within its own code. For that, you want to use a class (as demonstrated by @VincentSavard).
In your particular case, the effect you seem to be after doesn't really need an "attribute" per se; you are apparently looking for a closure. You can implement a closure using a class, but a lighter-weight way is a nested function (one form of which is demonstrated by @TomKarzes; you could also use a named inner function instead of lambda).
Try this:
l = []
for i in range(2):
    def f(n):
        return lambda: n
    l.append(f(i))
This doesn't use attributes, but creates a closure for each value of i. The value of n is then locked once f returns. Here's some sample output:
>>> [f() for f in l]
[0, 1]
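Applied to the sorting problem from the update, the same factory idea with a named inner function might look like the sketch below. The field names, the sample rows, and the make_getter helper are all invented for illustration, and negating the value to get a descending order is an assumption that only works for numeric fields:

```python
def make_getter(field, sign):
    # field and sign are captured separately for every call of make_getter,
    # so each returned function keeps its own pre-computed values.
    def getter(record):
        return sign * record[field]
    return getter

# Hypothetical user input: (field, sign) pairs, where -1 means descending.
criteria = [("score", -1), ("age", 1)]
functions = [make_getter(field, sign) for field, sign in criteria]

rows = [
    {"score": 2, "age": 30},
    {"score": 3, "age": 25},
    {"score": 2, "age": 20},
]

ordered = sorted(rows, key=lambda x: [f(x) for f in functions])
print([(r["score"], r["age"]) for r in ordered])   # [(3, 25), (2, 20), (2, 30)]
```

No function attribute is needed: the values each key function depends on live in its closure rather than on a name that can be rebound later.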
As others said, return f.v looks up the name f when the function is called, and by then f refers to the last function defined.
To work around this you can simulate functions:
>>> class Function(object):
...     def __init__(self, return_value):
...         self.return_value = return_value
...     def __call__(self):
...         return self.return_value
...
>>> l = []
>>> for i in range(2):
...     l.append(Function(i))
...
>>> l[0]()
0
>>> l[1]()
1

Python closures using lambda

I saw the piece of code below in a tutorial and am wondering how it works.
Generally, a lambda takes an input and returns something, but here it does not take anything and it still works.
>>> for i in range(3):
...     a.append(lambda:i)
...
>>> a
[<function <lambda> at 0x028930B0>, <function <lambda> at 0x02893030>, <function <lambda> at 0x028930F0>]
lambda:i defines the constant function that returns i.
Try this:
>>> f = lambda:3
>>> f()
You get the value 3.
But there's something more going on. Try this:
>>> a = 4
>>> g = lambda:a
>>> g()
gives you 4. But after a = 5, g() returns 5. Python functions "remember" the environment in which they're defined and look names up in it when they are called. A function bundled with that environment is called a "closure". By modifying the data in that environment (e.g. the variable a in the second example) you can change the behavior of the functions defined in it.
In this case a is a list of function objects defined in the loop.
Each of which will return 2.
>>> a[0]()
2
To make these function objects remember i values sequentially you should rewrite the code to
>>> for i in range(3):
...     a.append(lambda x=i:x)
...
that will give you
>>> a[0]()
0
>>> a[1]()
1
>>> a[2]()
2
but in this case you get a side effect: the caller can pass an argument and override the remembered value
>>> a[0](42)
42
I'm not sure what you mean by "it works". It appears that it doesn't work at all. In the case you have presented, i is a global variable. It changes every time the loop iterates, so after the loop, i == 2. Now, since each lambda function simply says lambda:i each function call will simply return the most recent value of i. For example:
>>> a = []
>>> for i in range(3):
...     a.append(lambda:i)
...
>>> print a[0]()
2
>>> print a[1]()
2
>>> print a[2]()
2
In other words, this likely does not do what you expect it to do.
lambda defines an anonymous inline function. These functions are limited compared to the full functions you can define with def - they can't do assignments, and they just return a result. However, you can run into interesting issues with them, as defining an ordinary function inside a loop is not common, but lambda functions are often put into loops. This can create closure issues.
The following:
>>> a = []
>>> for i in range(3):
...     a.append(lambda:i)
adds three functions (which are first-class objects in Python) to a. These functions return the value of i. However, they look up i when they are called, not when they are defined, so they see the value i has after the loop has finished. Therefore, you can call any of these functions:
>>> a[0]()
2
>>> a[1]()
2
>>> a[2]()
2
and they will each return 2, the last value produced by the range object. If you want each to return a different number, use a default argument:
>>> b = []
>>> for i in range(3):
...     b.append(lambda i=i:i)
...
This will forcibly give each function an i as it was at that specific point during execution.
>>> b[0]()
0
>>> b[1]()
1
>>> b[2]()
2
Of course, since we're now able to pass an argument to that function, we can do this:
>>> b[0](5)
5
>>> b[0](range(3))
range(0, 3)
It all depends on what you're planning to do with it.
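If the extra parameter is unwanted, the value can be bound without exposing it as an overridable argument. The following Python 3 sketch shows two common options, functools.partial and a small factory function; the names identity and make_const are made up for the example:

```python
from functools import partial

def identity(value):
    return value

# partial freezes the current value of i into each callable.
a = [partial(identity, i) for i in range(3)]
print([f() for f in a])       # [0, 1, 2]

def make_const(value):
    # Each call creates a new scope, so every lambda sees its own value.
    return lambda: value

b = [make_const(i) for i in range(3)]
print([f() for f in b])       # [0, 1, 2]

# For comparison, the late-binding version from the question.
late = [lambda: i for i in range(3)]
print([f() for f in late])    # [2, 2, 2]
```

Calling b[0](42) raises a TypeError here, because the returned lambda takes no arguments.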

To use init or not to in Python classes

I have always defined variables for classes like:
class A():
    def __init__(self):
        self.x = 1
However, I discovered it is also simply possible to use:
class A():
    x = 1
In both cases, a new instance will have a variable x with a value of 1.
Is there any difference?
For further reading, in the Python Tutorial chapter on classes, that matter is discussed in detail. A summary follows:
There is a difference as soon as mutable data structures come into play.
>>> class A:
...     x = [1]
...
>>> a1 = A()
>>> a2 = A()
>>> a1.x.append(2)
>>> a1.x
[1, 2]
>>> a2.x
[1, 2]
In that case, the same instance of x is used for both class instances. When using __init__, new instances are created when a new A instance is created:
>>> class A:
...     def __init__(self):
...         self.x = [1]
...
>>> a1 = A()
>>> a2 = A()
>>> a1.x.append(2)
>>> a1.x
[1, 2]
>>> a2.x
[1]
In the first example, a list is created and bound to A.x. This can be accessed both using A.x and using A().x (for any A(), such as a1 or a2). They all share the same list object.
In the second example, A does not have an attribute x. Instead, the objects receive an attribute x during initialization, which is distinct for each object.
Your question is very imprecise. You speak about "variables for classes", but later you say "instance will have a variable". In fact, your examples are reversed. Second one shows a class A with a variable x, and the first one shows a class A with no variable x, but whose every instance (after __init__, unless deleted) has a variable x.
If the value is immutable, there is not much difference, since when you have a=A() and a doesn't have a variable x, a.x automatically delegates to A.x. But if the value is mutable, then it matters, since there is only one x in the second example, and as many xs as there are instances (zero, one, two, seventeen,...) in the first one.
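The delegation from instance to class can be observed directly with vars(). This is a small Python 3 sketch of the class-attribute case with an immutable value:

```python
class A:
    x = 1                # class attribute, stored on A itself

a1 = A()
a2 = A()

print(vars(a1))          # {}  (a1.x is found on the class)
a1.x = 2                 # creates an instance attribute that shadows A.x
print(vars(a1))          # {'x': 2}
print(a2.x, A.x)         # 1 1

A.x = 10                 # rebinding the class attribute shows through a2
print(a1.x, a2.x)        # 2 10
```

Assigning through an instance never changes the class attribute; it only adds an entry to that instance's own dictionary.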

Can someone please explain this bit of Python code?

I started working in Python just recently and haven't fully learned all the nuts and bolts of it. I came across this post that explains why Python has closures, and in it there is sample code that goes like this:
y = 0
def foo():
    x = [0]
    def bar():
        print x[0], y
    def change(z):
        global y
        x[0] = y = z
    change(1)
    bar()
    change(2)
    bar()
    change(3)
    bar()
    change(4)
    bar()
foo()
1 1
2 2
3 3
4 4
Basically I don't understand how it actually works, and what a construct like x[0] does in this case. Or rather, I understand what it's doing, I just don't get how it does it :)
Before the nonlocal keyword was added in Python 3 (and still today, if you're stuck on 2.* for whatever reason), a nested function just couldn't rebind a local barename of its outer function -- because, normally, an assignment statement to a barename, such as x = 23, means that x is a local name for the function containing that statement. global exists (and has existed for a long time) to allow assignments to bind or rebind module-level barenames -- but nothing (except for nonlocal in Python 3, as I said) to allow assignments to bind or rebind names in the outer function.
The solution is of course very simple: since you cannot bind or rebind such a barename, use instead a name that is not bare -- an indexing or an attribute of some object named in the outer function. Of course, said object must be of a type that lets you rebind an indexing (e.g., a list), or one that lets you bind or rebind an attribute (e.g., a function), and a list is normally the simplest and most direct approach for this. x is exactly that list in this code sample -- it exists only in order to let nested function change rebind x[0].
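For comparison, here is a sketch of how the same idea could be written in Python 3 with nonlocal, so the one-element list is no longer needed. It mirrors the structure of the question's code but returns values instead of printing them:

```python
def foo():
    x = 0
    def bar():
        return x
    def change(z):
        nonlocal x       # rebind foo's x instead of creating a local x
        x = z
    change(1)
    first = bar()
    change(2)
    second = bar()
    return first, second

print(foo())             # (1, 2)
```

Without the nonlocal line, x = z would create a new local x inside change and bar would keep returning 0.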
It might be simpler to understand if you look at this simplified code where I have removed the global variable:
def foo():
    x = [0]
    def bar():
        print x[0]
    def change(z):
        x[0] = z
    change(1)
    bar()
foo()
The first line in foo creates a list with one element. Then bar is defined to be a function which prints the first element in x and the function change modifies the first element of the list. When change(1) is called the value of x becomes [1].
This code is trying to explain when python creates a new variable, and when python reuses an existing variable. I rewrote the above code slightly to make the point more clear.
y = "lion"
def foo():
    x = ["tiger"]
    w = "bear"
    def bar():
        print y, x[0], w
    def change(z):
        global y
        x[0] = z
        y = z
        w = z
    bar()
    change("zap")
    bar()
foo()
This will produce this output:
lion tiger bear
zap zap bear
The point is that the inner function change is able to affect the variable y and the elements of the list x, but it does not change foo's w (because the assignment w = z gives change its own local variable w that is not shared).
x = [0] creates a new list with the value 0 in it. x[0] references the zero-eth element in the list, which also happens to be zero.
The example is referencing closures, or passable blocks of code within code.
