Why is default variable off by one in function definition? - python

I found an interesting issue while trying to do the following for a code golf challenge:
>>> f=lambda s,z=len(s): 5+z
>>> f("horse")
11 #Expected 10
>>>
>>> def g(s,z=len(s)):
...     print "z: ", z
...     print "sum: ", 5+z
...
>>> g("horse")
z: 6
sum: 11
>>>
>>> len("horse") + 5 #Expected function operation
10
Creating the function both ways seems to initialize z as 6 instead of the expected 5. Why does this happen?

The Python docs have a page that explains this:
Python’s default arguments are evaluated once when the function is
defined, not each time the function is called
In your case, s must already have been bound to a string of length 6 before you created the lambda function. When Python evaluated the lambda definition with z=len(s), it evaluated to z=6. The default is not evaluated again each time you call the function.
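To reproduce the effect outside the interactive session, here is a minimal sketch. It assumes a leftover global s of length 6 (the value "horses" is made up for illustration; the original session does not show what s actually was).

```python
# Hypothetical session state: a global s left over from earlier experiments.
s = "horses"  # length 6

# len(s) in the default refers to the outer s and is evaluated once, right here.
f = lambda s, z=len(s): 5 + z

with_stale_default = f("horse")   # z is still 6, whatever argument is passed
with_explicit_z = f("horse", 5)   # passing z explicitly bypasses the default

# Computing the length in the body defers it to call time.
g = lambda s: 5 + len(s)
computed_at_call = g("horse")
```

The first call gives 11 and the other two give 10, which matches the output in the question.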

The default z=len(s) can only be evaluated if a variable named s already exists when the lambda is defined. In your case s had been defined earlier, as mentioned in this comment, and because the default for z is bound at definition time (not at call time), it used that earlier value of s.
Demo:
>>> a = 9
>>> f = lambda b: a + b
>>> f(3)
12
>>> a = 11
>>> f(3)
14
>>> f = lambda b, a=a: a + b # "a" gets bound to previous value 11
>>> f(3)
14
>>> a = 3 # rebinding "a" no longer affects f
>>> f(3)
14
As you can see, if you use a=a in the lambda expression, the value gets bound at definition time and changing a afterwards has no effect, which is what happened in your case.
You should change your lambda expression like this:
>>> f = lambda s: 5 + len(s)
>>> f('horse')
10

Your function definition won't work because of the way you have defined your default arguments. Python’s default arguments are evaluated once when the function is defined, not each time the function is called (as they are in, say, Ruby). This means that if you use a mutable default argument and mutate it, you will have mutated that object for all future calls to the function as well. That is why your program does not behave as you expect.
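The mutable-default case mentioned here can be sketched as follows. The function names append_bad and append_good are made up for illustration; using None as a sentinel is a common idiom, not something taken from the question.

```python
def append_bad(item, bucket=[]):
    # The same list object is reused as the default on every call.
    bucket.append(item)
    return bucket

def append_good(item, bucket=None):
    # A fresh list is created at call time when no bucket is passed.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

first = append_bad(1)
second = append_bad(2)      # same list as first, now holding both items

fresh_one = append_good(1)
fresh_two = append_good(2)  # independent of fresh_one
```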

Related

how python interpreter treats the position of the function definition having default parameter

Why does the first code output 51 and the second output 21? I understand the second should output 21, but as I understood it, the first should also output 21 (the value of b is changed to 20 before the function f is called). What am I missing?
b = 50
def f(a, b=b):
    return a + b
b = 20
print(f(1))
Output: 51
b = 50
b = 20
def f(a, b=b):
    return a + b
print(f(1))
Output: 21
Edit: This is different from How to change default value of optional function parameter in Python 2.7? because here an unintentional change to the default parameter is being discussed, not how to intentionally change the value of a default parameter; i.e. this question focuses on how the Python interpreter treats the position of a function definition for functions that have default parameters.
Tip for Python beginners: if you use an IDE like PyCharm, you can attach a debugger and see what is happening with the variables.
We can get a better understanding of what is going on using id(b), which gives us the address of the particular object in memory:
Return the “identity” of an object. This is an integer which is
guaranteed to be unique and constant for this object during its
lifetime. Two objects with non-overlapping lifetimes may have the same
id() value.
CPython implementation detail: This is the address of the object in
memory.
Let me modify your code as follows:
b = 50
print("b=50 :", id(b))
def f(a, b=b):
    print("b in the function f :", id(b))
    print(id(b))
    return a + b
b = 20
print("b=20 :", id(b))
print(f(1))
The output is as follows:
b=50 : 4528710960
b=20 : 4528710000
b in the function f : 4528710960
4528710960
51
As you can see, the b inside the function and the b=50 have the same address.
When you do b=20, the name b is rebound to a different object; the object 50 that was captured as the default is untouched.
In Python, (almost) everything is an object. What we commonly refer to as "variables" in Python are more properly called names. Likewise, "assignment" is really the binding of a name to an object. Each binding has a scope that defines its visibility, usually the block in which the name originates.
In Python, when you do b=50, a binding of the name b to an int object is created in the scope of the block. When we later say b=20, the int object 50 is unaffected; b is simply rebound to another object. These are two different objects.
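The same point can be checked without a function at all. In this small sketch the name default_like is made up and plays the role of the stored default.

```python
b = 50
default_like = b                 # a second name bound to the same int object
same_before = default_like is b  # both names refer to one object

b = 20                           # rebinds the name b; the object 50 is untouched
same_after = default_like is b   # the two names now refer to different objects
```

After the rebinding, default_like is still 50, just as the default stored in f is.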
You can read more about it in these links.
Is Python call-by-value or call-by-reference? Neither.
Parameter Passing
Python id()
Think of how the interpreter treats this. In the first case, def f(a, b=b) is interpreted as def f(a, b=50) since the value of b in the current scope is 50. As such, f(1) binds a to 1, and thus a + b = 1 + 50 = 51.
Similarly, in the second case, the value of b in the current scope is 20 when the function is declared, so the definition is interpreted as def f(a, b=20). Therefore, f(1) = 21.
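One way to see the stored default directly is the function's __defaults__ attribute. This is a minimal sketch of the first case from the question.

```python
b = 50

def f(a, b=b):
    return a + b

b = 20

stored_defaults = f.__defaults__  # the tuple of defaults captured at def time
with_default = f(1)               # uses the stored 50
with_current_b = f(1, b)          # passes the current global b explicitly
```

Here stored_defaults is (50,), so f(1) gives 51, while passing b explicitly gives 21.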
The reason the different placement of the function results in different outputs is where the assignments to b sit relative to the definition.
Since the function f has a parameter b whose default is the variable b, it takes whatever value b has at the moment the def statement runs as the default for that parameter.
For example,
b = 50
def f(a, b=b):
    return a + b
b = 20
print(f(1))
As you pointed out, this results in the output 51
But if I were to change the code by a bit to
def f(a, b=b):
    return a + b
b = 50
b = 20
print(f(1))
It would result in the following error:
def f(a, b=b):
NameError: name 'b' is not defined
Hence, we can deduce that the placement of the variable which is taken as a named parameter to the function is causing the difference in outputs.
If you want f to follow later changes to b, you can also read the global variable inside the function body instead of using a default.
Because by the time you are defining the function f in case 1, you are assigning the value of b (at that time it's 50) to the second argument of the function.
While in case 2, at the time of assigning the value of b to the second argument of f it is 20.
This is the reason for the different answers in both cases.

Python dynamic function attribute

I came across an interesting issue while trying to achieve dynamic sort.
Given the following code:
>>> l = []
>>> for i in range(2):
...     def f():
...         return f.v
...     f.v = i
...     l.append(f)
...
You have to be careful about how to use the functions in l:
>>> l[0]()
1
>>> l[1]()
1
>>> [h() for h in l]
[1, 1]
>>> [f() for f in l]
[0, 1]
>>> f = l[0]
>>> f()
0
>>> k = l[1]
>>> k()
0
>>> f = l[1]
>>> k()
1
>>> del f
>>> k()
NameError: global name 'f' is not defined
The behavior of the function depends on what f currently is.
What should I do to avoid this issue? How can I set a function attribute that does not depend on the function's name?
Update
Reading your comments and answers, here is my actual problem.
I have some data that I want to sort according to user input (so I don't know sorting criteria in advance). User can choose on which part of the data to apply successive sorts, and these sorts can be ascending or descending.
So my first try was to loop over the user inputs, define a function for each criterion, store this function in a list and then use this list for sorted's key like this: key=lambda x: [f(x) for f in functions]. To avoid multiplying conditions into functions themselves, I was computing some needed values before the function definition and binding them to the function (different functions with different pre-computed values).
While debugging, I understood that function attribute was not the solution here, so I indeed wrote a class with a __call__ method.
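For the sorting problem described in this update, one possible sketch avoids per-function state altogether by relying on the stability of sorted(). The data, the field names and the criteria list below are made up for illustration; the actual data layout is not shown in the question.

```python
from operator import itemgetter

# Hypothetical data and user input: (field, descending?) in priority order.
rows = [
    {"name": "b", "size": 2},
    {"name": "a", "size": 2},
    {"name": "c", "size": 3},
]
criteria = [("size", True), ("name", False)]

def multisort(items, criteria):
    result = list(items)
    # sorted() is stable, so apply the least significant criterion first.
    for field, descending in reversed(criteria):
        result = sorted(result, key=itemgetter(field), reverse=descending)
    return result

ordered = multisort(rows, criteria)
names = [row["name"] for row in ordered]
```

Each pass has its own direction, so mixing ascending and descending criteria does not require any pre-computed values attached to functions.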
The issue is due to the fact that return f.v loads the global f, and not the one you intend [1]. You can see this by disassembling the code:
>>> import dis
>>> dis.dis(l[0])
  3           0 LOAD_GLOBAL              0 (f)
              3 LOAD_ATTR                1 (v)
              6 RETURN_VALUE
After the loop that populates l, f is a reference to the last function created, as you can see here:
>>> l
[<function f at 0x02594170>, <function f at 0x02594130>]
>>> f
<function f at 0x02594130>
Thus, when you call l[0](), it still loads the f that points to the last function created, and it returns 1. When you rebind f by doing f = l[0], the global f then points to the first function.
What you seem to want is a function that has a state, which really is a class. You could therefore do something like this:
class MyFunction:
    def __init__(self, v):
        self.v = v
    def __call__(self):
        return self.v
l = [MyFunction(i) for i in range(2)]
l[0]() # 0
l[1]() # 1
Though it may be a good idea to explain your actual problem first, as there might be a better solution.
[1] Why does it load the global f and not the current instance, you may ask?
Recall that when you define a method in a class, it takes a self parameter, like so:
# ...
    def my_method(self):
        return self.value
self is actually a reference to the current instance of your object. That's how Python knows where to load the attribute value. It knows it has to look into the instance referenced by self. So when you do:
a.value = 1
a.my_method()
self is now a reference to a.
So when you do:
def f():
    return f.v
There's no way for Python to know what f actually is. It's not a parameter, so it has to load it from elsewhere. In your case, it's loaded from the global variables.
Thus, when you do f.v = i, while you do set an attribute v for the instance of f, there's no way to know which instance you are referring to in the body of your function.
Note that what you are doing here:
def f():
    return f.v
is not making a function which returns whatever its own v attribute is. It's returning whatever the f object's v attribute is. So it necessarily depends on the value of f. It's not that your v attribute "depends on the function's name". It really has nothing at all to do with the function's name.
Later, when you do
>>> f = l[0]
>>> k = l[1]
>>> k()
0
What you have done is bound k to the function at l[1]. When you call it, you of course get f.v, because that's what the function does.
But notice:
>>> k.v
1
>>> [h.v for h in l]
[0, 1]
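A compact way to see that the lookup goes through the name f rather than through the function object itself (a sketch; the names first and second are made up for illustration):

```python
def f():
    return f.v      # looks up whatever the name f refers to at call time

f.v = 0
first = f           # keep a second reference to the first function

def f():            # rebinds the name f to a new function object
    return f.v

f.v = 1
second = f

own_attributes = (first.v, second.v)  # each object kept its own attribute
call_results = (first(), second())    # both calls go through the name f
```

The attributes are (0, 1), but both calls return 1, because both bodies read f.v from whatever f currently is.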
So, a function is an object, and just like most objects, it can have attributes assigned to it (which you can access using dot notation, or the getattr() function, or inspecting the object's dictionary, etc.). But a function is not designed to access its own attributes from within its own code. For that, you want to use a class (as demonstrated by @VincentSavard).
In your particular case, the effect you seem to be after doesn't really need an "attribute" per se; you are apparently looking for a closure. You can implement a closure using a class, but a lighter-weight way is a nested function (one form of which is demonstrated by @TomKarzes; you could also use a named inner function instead of lambda).
Try this:
l = []
for i in range(2):
    def f(n):
        return lambda: n
    l.append(f(i))
This doesn't use attributes, but creates a closure for each value of i. The value of n is then locked once f returns. Here's some sample output:
>>> [f() for f in l]
[0, 1]
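The same idea with a named inner function instead of a lambda might look like this (a sketch; make_getter and getter are made-up names):

```python
def make_getter(n):
    # Each call creates a new scope, so every getter sees its own n.
    def getter():
        return n
    return getter

getters = [make_getter(i) for i in range(2)]
values = [g() for g in getters]
```

Because n is a local of make_getter, the result does not depend on what any global name refers to later.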
As others said, return f.v looks up the name f at call time, and by then it refers to the last function defined.
To work around this you can simulate functions with a callable class:
>>> class Function(object):
...     def __init__(self, return_value):
...         self.return_value = return_value
...     def __call__(self):
...         return self.return_value
...
>>> l = []
>>> for i in range(2):
...     l.append(Function(i))
...
>>> l[0]()
0
>>> l[1]()
1

Python closures using lambda

I saw the piece of code below in a tutorial and am wondering how it works.
Generally, a lambda takes an input and returns something, but here it does not take anything and it still works.
>>> a = []
>>> for i in range(3):
...     a.append(lambda:i)
...
>>> a
[<function <lambda> at 0x028930B0>, <function <lambda> at 0x02893030>, <function <lambda> at 0x028930F0>]
lambda:i defines the constant function that returns i.
Try this:
>>> f = lambda:3
>>> f()
You get the value 3.
But there's something more going on. Try this:
>>> a = 4
>>> g = lambda:a
>>> g()
gives you 4. But after a = 5, g() returns 5. Python functions "remember" the environment in which they're defined, and look names up in it when they are called. A function together with that remembered environment is called a "closure". By rebinding a variable in that environment (e.g. the variable a in the second example) you can change the behavior of the functions defined in it.
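To make the "remembering" visible, here is a small sketch that wraps the second example in a function (demo is a made-up name) so that a lives in a closure cell that can be inspected:

```python
def demo():
    a = 4
    g = lambda: a           # g refers to the variable a, not to the value 4
    before = g()
    a = 5                   # rebinding a changes what g sees
    after = g()
    cell_value = g.__closure__[0].cell_contents  # current content of the cell
    return before, after, cell_value

outcome = demo()
```

The lambda holds a reference to the variable, so the cell shows the latest value rather than the value at definition time.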
In this case a is a list of function objects defined in the loop, each of which will return 2.
>>> a[0]()
2
To make these function objects remember i values sequentially you should rewrite the code to
>>> for i in range(3):
...     a.append(lambda x=i:x)
...
that will give you
>>> a[0]()
0
>>> a[1]()
1
>>> a[2]()
2
but in this case you get a side effect: the caller can pass an argument that overrides the remembered value
>>> a[0](42)
42
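If that side effect is unwanted, one alternative (not from the tutorial; just a sketch) is functools.partial, which fixes the argument when the object is created. The helper identity is a made-up name.

```python
from functools import partial

def identity(x):
    return x

# partial() stores the current value of i when each object is built.
a = [partial(identity, i) for i in range(3)]
results = [f() for f in a]

try:
    a[0](42)            # identity() would receive two positional arguments
    overridden = True
except TypeError:
    overridden = False
```

Here an extra argument raises TypeError instead of silently replacing the remembered value.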
I'm not sure what you mean by "it works". It appears that it doesn't work at all. In the case you have presented, i is a global variable. It changes every time the loop iterates, so after the loop, i == 2. Now, since each lambda function simply says lambda:i each function call will simply return the most recent value of i. For example:
>>> a = []
>>> for i in range(3):
...     a.append(lambda:i)
...
>>> print a[0]()
2
>>> print a[1]()
2
>>> print a[2]()
2
In other words, this likely does not do what you expect it to do.
lambda defines an anonymous inline function. These functions are limited compared to the full functions you can define with def - they can't do assignments, and they just return a result. However, you can run into interesting issues with them, as defining an ordinary function inside a loop is not common, but lambda functions are often put into loops. This can create closure issues.
The following:
>>> a = []
>>> for i in range(3):
...     a.append(lambda:i)
adds three functions (which are first-class objects in Python) to a. These functions return the value of i. However, they use the definition of i as it existed at the end of the loop. Therefore, you can call any of these functions:
>>> a[0]()
2
>>> a[1]()
2
>>> a[2]()
2
and they will each return 2, the last value produced by the range object. If you want each to return a different number, use a default argument:
>>> b = []
>>> for i in range(3):
...     b.append(lambda i=i:i)
...
This will forcibly give each function an i as it was at that specific point during execution.
>>> b[0]()
0
>>> b[1]()
1
>>> b[2]()
2
Of course, since we're now able to pass an argument to that function, we can do this:
>>> b[0](5)
5
>>> b[0](range(3))
range(0, 3)
It all depends on what you're planning to do with it.

Python: Efficiently calling subset variables of multiple returns function

I wanna know if I can prevent my function from working through its whole routine if I'm only interested in one (or fewer than all) of the variables it returns.
To elucidate, suppose I have a function with (a tuple of) multiple returns:
def func_123(n):
    x=n+1
    y=n+2
    z=n+3
    return x,y,z
If I'm only interested in the third value, I can just do:
_,_,three = func_123(0)
But I wanna know how it works in the function.
Does my function perform all three calculations and only then choose to 'drop' the first two and give me the one I want, or does it recognise it can do a lot less work if it only performs the subroutines needed to return the value I want? If the former, is there a way around this (besides, of course, creating a function for each calculation and giving up the single function that organizes all the subroutines)?
It will calculate, and return, all of the values. For example:
def foo(x):
    return x+1, x+2
When I call this function
>>> foo(1)
(2, 3)
>>> _, a = foo(1)
>>> a
3
>>> _
2
Note that _ is a perfectly valid, and usable, variable name. It is just used by convention to imply that you do not wish to use that variable.
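To confirm that every statement runs even when only one value is kept, here is a small sketch that records each computation. The helper step and the list calls are made up for illustration.

```python
calls = []

def step(label, value):
    calls.append(label)  # record that this computation ran
    return value

def func_123(n):
    x = step("x", n + 1)
    y = step("y", n + 2)
    z = step("z", n + 3)
    return x, y, z

_, _, three = func_123(0)
```

After the call, calls contains all three labels: the unpacking happens only after the whole tuple has been built.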
The closest thing to what you are describing would be to write your function as a generator. For example:
def func_123(n):
    for i in range(1,4):
        yield n + i
>>> a = func_123(1)
>>> next(a)
2
>>> next(a)
3
>>> next(a)
4
In this way, the values are evaluated and returned lazily, in other words only when they are needed. You could craft your function so that it yields them in the order that you want.
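Note that a generator still computes the values in order, so reaching the third one means computing the first two. If the goal is to skip them entirely, one possible sketch is to let the caller name what it wants. The want parameter and the dictionary of callables are assumptions made for illustration, not part of the original function.

```python
def func_123(n, want=("x", "y", "z")):
    # Each computation is wrapped in a callable and only run on request.
    computations = {
        "x": lambda: n + 1,
        "y": lambda: n + 2,
        "z": lambda: n + 3,
    }
    return tuple(computations[name]() for name in want)

everything = func_123(0)
(three,) = func_123(0, want=("z",))
```

Only the callables whose names appear in want are executed, so the cost of the unused subroutines is never paid.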
It doesn't "choose" or "drop" anything. What you're using is tuple assignment; specifically, you're unpacking the returned tuple into the names _, _ and three. The _ variable is just a convention for a "throw away" variable.
I would like to try something different using the functools standard-library module (this may not be exactly what you are looking for, but it may help you rethink what you are doing).
>>> import functools
>>> def func_123(n, m):
...     return n + m
...
>>> func_dict = dict()
>>> for r in [1,2,3]:
...     func_dict[r] = functools.partial(func_123, r)
...
>>> for k in [1,2,3]:
...     func_dict[k](10)
...
...
11
12
13
>>> func_dict[3](20)
23
>>>
OR
>>> func_1 = functools.partial(func_123, 1)
>>> func_2 = functools.partial(func_123, 2)
>>> func_3 = functools.partial(func_123, 3)
>>> func_1(5)
6
>>> func_2(5)
7
>>> func_3(5)
8
>>> func_3(3)
6
>>>
So, you don't need to worry about returning the output as a tuple and selecting the values you want.
It's only a convention to use _ for unused variables. So all the statements in the function do get evaluated.

In Python, why can a lambda expression refer to the variable being defined but not a list?

This is more a curiosity than anything, but I just noticed the following. If I am defining a self-referential lambda, I can do it easily:
>>> f = lambda: f
>>> f() is f
True
But if I am defining a self-referential list, I have to do it in more than one statement:
>>> a = [a]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
>>> a = []
>>> a.append(a)
>>> a[0] is a
True
>>> a
[[...]]
I also noticed that this is not limited to lists: it seems that no expression other than a lambda can reference the variable on the left of the assignment. For example, if you have a cyclic linked list with one node, you can't simply go:
>>> class Node(object):
...     def __init__(self, next_node):
...         self.next = next_node
...
>>> n = Node(n)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'n' is not defined
Instead, you have to do it in two statements:
>>> n = Node(None)
>>> n.next = n
>>> n is n.next
True
Does anyone know what the philosophy behind this difference is? I understand that recursive lambdas are used much more frequently, and hence supporting self-reference is important for lambdas, but why not allow it for any assignment?
EDIT: The answers below clarify this quite nicely. The reason is that variables in lambdas in Python are evaluated each time the lambda is called, not when it's defined. In this sense they are exactly like functions defined using def. I wrote the following bit of code to experiment with how this works, both with lambdas and def functions in case it might help clarify it for anyone.
>>> f = lambda: f
>>> f() is f
True
>>> g = f
>>> f = "something else"
>>> g()
'something else'
>>> f = "hello"
>>> g()
'hello'
>>> f = g
>>> g() is f
True
>>> def f():
...     print(f)
...
>>> f()
<function f at 0x10d125560>
>>> g = f
>>> g()
<function f at 0x10d125560>
>>> f = "test"
>>> g()
test
>>> f = "something else"
>>> g()
something else
The expression inside a lambda is evaluated when the function is called, not when it is defined.
In other words, Python will not evaluate the f inside your lambda until you call it. And, by then, f is already defined in the current scope (it is the lambda itself). Hence, no NameError is raised.
Note that this is not the case for a line like this:
a = [a]
When Python interprets this type of line (known as an assignment statement), it will evaluate the expression on the right of the = immediately. Moreover, a NameError will be raised for any name used on the right that is undefined in the current scope.
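The contrast between the deferred lambda body and the immediately evaluated right-hand side can be put side by side in a short sketch. The name cyc is made up for illustration and is assumed not to exist beforehand.

```python
f = lambda: f            # the body is not evaluated here, so f is not looked up yet
self_ref = f() is f      # by call time, the name f is bound to the lambda

try:
    cyc = [cyc]          # the right-hand side is evaluated first; cyc is unbound
    raised = False
except NameError:
    raised = True

cyc = []                 # bind the name first...
cyc.append(cyc)          # ...then the list can refer to itself
```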
Because a lambda is a function, and the function body is not executed until the function is called.
In other words, the other way to do it is this:
def f():
    return f
But you're correct that you can't do it in an expression because def is a statement, so it can't be used in an expression.
We can see this when we disassemble the lambda function (the output is identical in Python 2.6 and 3.3):
>>> import dis
>>> f = lambda: f
>>> dis.dis(f)
  1           0 LOAD_GLOBAL              0 (f)
              3 RETURN_VALUE
This shows that f is not loaded until the function is called, by which point it is already defined globally, so this works:
>>> f is f()
True
But when we do:
>>> a = [a]
We get an error (if a was previously undefined). If we disassemble a function containing this statement:
>>> def foo():
...     a = [a]
...
>>> dis.dis(foo)
  2           0 LOAD_FAST                0 (a)
              3 BUILD_LIST               1
              6 STORE_FAST               0 (a)
              9 LOAD_CONST               0 (None)
             12 RETURN_VALUE
we see that we attempt to load a before we have stored it.
There's no special-casing required to make this happen; it's just how it works.
A lambda expression is not any different from a normal function, really. Meaning, I can do this:
>>> x = 1
>>> def f():
...     print x + 2
...
>>> f()
3
>>> x = 2
>>> f()
4
As you can see, inside the function, x does not have a predefined value; it's looked up when we actually run f. This includes the name of the function itself: we don't look up what f refers to until we actually run it, and by then it exists.
Doing that as a lambda doesn't work any differently:
>>> del x
>>> f = lambda: x+2
>>> f()
NameError: global name 'x' is not defined
>>> x = 2
>>> f()
4
In this case, I went ahead and deleted x so it was no longer in scope when f was defined, and running f correctly reports that x doesn't exist. But after we define x, f works again.
This is different in the list case, because we are actually generating an object right now, and so everything on the right side has to be bound right now. The way Python works (as I understand it, and at least in practice this has been useful) is that everything on the right side is dereferenced and evaluated first, and only after that is complete are the name(s) on the left side bound to the result.
Since the same name is on the right side and the left, when Python tries to look up the name on the right side, it doesn't exist yet.
