Python Tutorial 4.7.1. Default Argument Values states the following:
Important warning: The default value is evaluated only once. This makes a difference
when the default is a mutable object such as a list, dictionary, or instances of most
classes. For example, the following function accumulates the arguments passed to it on
subsequent calls:
def f(a, L=[]):
    L.append(a)
    return L

print f(1)
print f(2)
print f(3)
This will print
[1]
[1, 2]
[1, 2, 3]
I don't quite understand the meaning of "evaluated only once" in terms of memory management. Apparently, the default value of the function is evaluated once when the function is first called and stored in a separate memory address even after the function has ended. (According to my understanding, all local variables should be freed after the function has ended?)
Am I correct?
In Python, functions are objects too, and the defaults are stored with the function object. Defaults are not locals; it is just that when the function is called, the arguments are bound to a default when not given an explicit value.
When Python encounters a def <functionname>(<arguments>): statement, it creates a function object for you there and then; this is 'definition time'; the function is not called but merely created. It is then that defaults are evaluated and stored, in an attribute on the function object.
Then when you call the function, the defaults have already been created and are used when you didn't provide a more concrete value for the argument. Because the defaults are stored with the function object, you get to see changes to mutable objects between function calls.
The locals are still cleared up of course, but as they are references (all identifiers in Python are), the objects they were bound to are only cleared up if nothing else is referencing them anymore either.
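One way to observe "definition time" directly is to use a default expression that has a visible side effect. The following is only a minimal sketch in Python 3; `make_default` and `evaluations` are hypothetical names introduced for illustration and do not appear in the question.

```python
evaluations = []  # hypothetical log of when the default expression runs


def make_default():
    """Record each evaluation and return a new list."""
    evaluations.append("evaluated")
    return []


def f(a, L=make_default()):  # make_default() runs here, while 'def' executes
    L.append(a)
    return L


count_after_def = len(evaluations)    # already 1, before any call to f
first = f(1)
second = f(2)
count_after_calls = len(evaluations)  # still 1: calls do not re-evaluate it

print(count_after_def, count_after_calls, second)
```

The expression is evaluated exactly once, when the def statement runs, and both calls receive the same list object.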
You can take a look at the defaults of any function object:
>>> def foo(bar='spam', eggs=[]):
...     eggs.append(bar)
...     return eggs
...
>>> foo.__defaults__
('spam', [])
>>> foo()
['spam']
>>> foo.__defaults__
('spam', ['spam'])
>>> foo() is foo.__defaults__[1]
True
The foo() function has a __defaults__ attribute, a tuple of default values to use when no values for the arguments have been passed in. You can see the mutable list change as the function is called, and because the function returns the eggs list, you can also see that it is the exact same object as the second value in that tuple.
If you don't want your defaults to be shared and instead need a new value for a parameter every time the function is called, but the parameter is not given, you need to set the default value to a sentinel object. If your parameter is still set to that sentinel in the function body, you can execute code to set a fresh default value. None is usually the best choice:
def foo(bar='spam', eggs=None):
    if eggs is None:
        eggs = []
If it should be possible to use None as a non-default value, use a singleton sentinel created beforehand:
_sentinel = object()
def foo(bar='spam', eggs=_sentinel):
    if eggs is _sentinel:
        eggs = []
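To make the difference between the two sentinels concrete, here is a small sketch in Python 3. The function `describe` and its return labels are hypothetical and exist only to show that, with a dedicated sentinel, an explicit None can be told apart from an omitted argument.

```python
_sentinel = object()  # unique marker meaning "argument not supplied"


def describe(eggs=_sentinel):
    """Return a label showing how the eggs argument was supplied."""
    if eggs is _sentinel:
        eggs = []  # a fresh list on every call that omits the argument
        return ("default", eggs)
    return ("explicit", eggs)


omitted_1 = describe()
omitted_2 = describe()
with_none = describe(None)  # None is now an ordinary, distinguishable value

print(omitted_1, with_none)
```

Each call that omits the argument gets its own list, while `describe(None)` is treated as an explicit value.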
The function f that you have defined is an object in its own right. When you define defaults, they are bound to the function object that you have created.
You can see this in action:
>>> def f(a, L=[]):
...     L.append(a)
...     return L
...
>>> print id(f)
4419902952
>>> print f.__defaults__
([],)
>>> f(1)
[1]
>>> print id(f)
4419902952
>>> print f.__defaults__
([1],)
Edit: further, you can see that the list object itself does not change either:
>>> print id(f.__defaults__[0])
4419887544
>>> f(2)
[1, 2]
>>> print id(f.__defaults__[0])
4419887544
On each subsequent call, the default list ("L") of your f function will have your a value appended.
A function is just an object in Python, created using the def syntax. Default values are stored within the function object when the function is defined, and they are not re-evaluated later.
This is sometimes used to create function variables that persist across invocations. You can inspect the __defaults__ attribute to check what the default values are for your function.
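As an illustration of a "persistent function variable", the sketch below (Python 3) uses a mutable default as a memoisation cache. The function `fib` and the parameter name `_cache` are assumptions chosen for the example; in practice `functools.lru_cache` is the more conventional tool for this.

```python
def fib(n, _cache={0: 0, 1: 1}):
    """Return the n-th Fibonacci number, memoised in the shared default dict."""
    if n not in _cache:
        _cache[n] = fib(n - 1) + fib(n - 2)
    return _cache[n]


result = fib(30)
cache_size = len(fib.__defaults__[0])  # the dict lives on the function object

print(result, cache_size)
```

The dict survives between calls because it belongs to the function object, not to any single call.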
A common way to initialize new objects instead of reusing the same is:
def f(a, L=None):
    if L is None:
        L = []
    L.append(a)
    return L
You can check this page for more details.
Sorry, this answer was meant for a different question, but I'll leave it here as a reference for anyone who wants to look at it. "Define once" means that at the first point when the code is executed, the default parameter is bound to an object which is retained within the function object itself.
Notice that the same object address is printed both times; the default list object is used.
def f(a, L=[]):
    print("id default: ", id(L))
    L.append(a)
    print("id used: ", id(L))
    return L
Notice that two different object addresses are printed: when you perform L = [] within the function, you bind L to a different list object, so the default list object does not get changed.
def f(a, L=[]):
    print("id default: ", id(L))
    if L == []:
        L = []
    L.append(a)
    print("id used: ", id(L))
    return L
The function above is basically the same as the one below, except that the one below uses the None object instead of an empty list object as the default.
def f(a, L=None):
    print("id default", id(L))
    if L is None:
        L = []
    L.append(a)
    print("id used: ", id(L))
    return L
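The two checks are not fully interchangeable, though. The sketch below (Python 3, with the hypothetical names `f_eq` and `f_none`) shows one difference: `L == []` is also true for an empty list supplied by the caller, so that list is silently replaced rather than appended to.

```python
def f_eq(a, L=[]):
    if L == []:      # true for ANY empty list, including the caller's
        L = []
    L.append(a)
    return L


def f_none(a, L=None):
    if L is None:    # true only when L was omitted or None was passed
        L = []
    L.append(a)
    return L


mine = []
f_eq(1, mine)            # the caller's empty list is replaced internally
eq_result = list(mine)

mine = []
f_none(1, mine)          # the caller's list is appended to
none_result = list(mine)

print(eq_result, none_result)
```

This is one reason the identity test against None (or another sentinel) is usually preferred over an equality test.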
This is covered by the Python Tutorial, but I still don't quite understand the reason why Python has this style. Is it purely convention, or is there some explanation behind why Python has the following style regarding default parameters:
My understanding is that Python prefers something=None as opposed to something=[] for default parameters for functions. But... why not use something=[]? Surely this is the convention in other languages, like C.
As an example, take these two functions, which are equivalent:
def function(arr, L=[]):
    L.append(arr)
    return L
and
def function(arr, L=None):
    if L is None:
        L = []
    L.append(arr)
    return L
My understanding is that the first is "incorrect style" for Python. Why?
EDIT: Ah, I finally understand. I am incorrect above: the two functions are NOT equivalent. Default arguments are evaluated once when the function is defined, not each time the function is called!
When you set a parameter to the value of a list, it is assigned when the function is defined, not when it is called. That is why you'll get different results for calling the function multiple times with the same input parameter. Beware!!!
def function(arr, L=[]):
    L.append(arr)
    return L
arr = [1, 2, 3]
>>> function(arr)
[[1, 2, 3]]
>>> function(arr)
[[1, 2, 3], [1, 2, 3]]
A default value that's an empty list will refer to one specific list object each time you call the function, instead of creating a new empty list every time.
>>> def add(defList=[]):
...     defList.append(1)
...     return defList
...
>>> add()
[1]
>>> add()
[1, 1]
It's a quirk of how mutable data works in Python. It could be occasionally useful, but often it's safer to use None. Then create an empty list if a value hasn't been passed.
The reason is that Python stores a value for L. In other words, the default for L is a fixed reference to one object. But that object itself can be updated.
You can say that Python stores it as:
               |--> []
               |
function (arr, L)
But that [] is an ordinary object (thus with state), that can be modified as well. Now if the function can modify or return L, you start modifying the state of L. In the example:
def function(arr, L=[]):
    L.append(arr)
    return L
You modify L. If you call this the first time (for instance with function(123)), the object is updated so, now it is represented as:
               |--> [123]
               |
function (arr, L)
As a result the behavior of function depends on a global state. In general a global state is seen as a bad smell in code design and furthermore it is not what people might expect. This doesn't hold for None, since you modify the local reference L (not the object itself).
You can say that the object is the same, but each time you call the function, you copy a reference to a local variable L.
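The distinction between changing the shared object and changing only the local reference can be sketched as follows (Python 3). The function names `rebinding` and `mutating` are hypothetical and chosen just for this illustration.

```python
def rebinding(L=[]):
    L = L + [1]      # builds a new list and rebinds the local name only
    return L


def mutating(L=[]):
    L += [1]         # in-place extend: changes the shared default object
    return L


r1, r2 = rebinding(), rebinding()
m1, m2 = mutating(), mutating()

print(r2, rebinding.__defaults__)
print(m2, mutating.__defaults__)
```

Only the in-place operation leaks state from one call into the next; rebinding the local name leaves the stored default untouched.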
Now for the second case:
                  |--> None
                  |
def function(arr, L):
    if L is None:
        L = []
    L.append(arr)
    return L
If you call this method, you assign a value to L, but L itself is not global (only the object to which L refers). So if you call this method, after the if, the function (in progress) will look like.
Because the argument L is assigned an initial value of [] NOT when you call the function but when it's first declared and the module it's in is interpreted.
See: Python: Common Gotcha (Thanks @ferhat elmas)
Example: (showing what's going on)
$ python
Python 2.7.9 (default, Mar 19 2015, 22:32:11)
[GCC 4.8.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> def f(L=[]):
...     L.append(1)
...     return id(L), L
...
>>> def g(L=None):
...     L = [] if L is None else L
...     L.append(1)
...     return id(L), L
...
We'll ignore what happens when you pass in another list to the L argument because that behaviour is well defined and acceptable. (takes a list, appends to it and returns it).
>>> f()
(139978918901088, [1])
>>> g()
(139978918964112, [1])
The first time we call f() and g() we might mistakenly think that we have successfully returned a new list from an initial empty one, right? But what happens when we call f() again, compared with the correct g()?
>>> f()
(139978918901088, [1, 1])
>>> g()
(139978918964112, [1])
So the initial value you assigned for the argument L is set once and only once, when the function is defined, not when it's called. This is a common gotcha, as explained in the link above.
NB: id() here returns the unique identity of objects, so you can clearly see why the first is an incorrect way to define a default value for a mutable object such as a list.
Also note that, in the examples above, the identity of L does not change between calls to f().
I am asking because of the classic problem where somebody creates a list of lambdas:
foo = []
for i in range(3):
    foo.append((lambda: i))
for l in foo:
    print(l())
and unexpectedly gets only twos as output.
The commonly proposed solution is to make i a named argument like this:
foo = []
for i in range(3):
    foo.append((lambda i=i: i))
for l in foo:
    print(l())
Which produces the desired output of 0, 1, 2 but now something magical has happened. It sort of did what is expected because Python is pass-by-reference and you didn't want a reference.
Still, just adding a new name to something, shouldn't that just create another reference?
So the question becomes what are the exact rules for when something is not a reference?
Considering that ints are immutable and the following works:
x = 3
y = x
x = 5
print(x, y)  # outputs 5 3
probably explains why adding that named parameter works. A local i with the same value was created and captured.
Now why, in the case of our lambdas, was the same i referenced? I pass an int to a function and it is referenced, and if I store it in a variable it is copied. Hm.
Basically I am looking for the most concise and abstract way possible to remember exactly how this works. When is the same value referenced, and when do I get a copy? If it has any common names and there are programming languages where it works the same, that would be interesting as well.
Here is my current assumption:
Arguments are always passed to functions by reference.
Assigning to a variable of immutable type creates a copy.
I am asking anyway, just to make sure and hopefully get some background.
The issue here is how you think of names.
In your first example, i is a variable that is assigned to every time the loop iterates. When you use lambda to make a function, you make a function that accesses the name i and returns its value. This means as the name i changes, the value returned by the functions also changes.
The reason the default argument trick works is that the name is evaluated when the function is defined. This means the default value is the value the i name points to at that time, not the name itself.
i is a label. 0, 1 and 2 are the objects. In the first case, the program assigns 0 to i, then makes a function that returns i - it then does this with 1 and 2. When the function is called, it looks up i (which is now 2) and then returns it.
In the second example, you assign 0 to i, then you make a function with a default argument. That default argument is the value that is gotten by evaluating i - that is the object 0. This is repeated for 1 and 2. When the function is called, it assigns that default value to a new variable i, local to the function and unrelated to the outer i.
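The two cases above, plus two other common ways of capturing the current value, can be compared side by side. This is a sketch in Python 3; `make_getter` is a hypothetical helper, and `functools.partial` is swapped in here as an alternative to the default-argument trick, not something the question used.

```python
from functools import partial


def make_getter(value):
    """Hypothetical factory: each call gets its own 'value' binding."""
    return lambda: value


late, by_default, by_factory, by_partial = [], [], [], []
for i in range(3):
    late.append(lambda: i)                      # looks up i when called
    by_default.append(lambda i=i: i)            # stores the current object
    by_factory.append(make_getter(i))           # new enclosing scope per call
    by_partial.append(partial(lambda x: x, i))  # argument frozen by partial

results = {
    "late": [fn() for fn in late],
    "default": [fn() for fn in by_default],
    "factory": [fn() for fn in by_factory],
    "partial": [fn() for fn in by_partial],
}
print(results)
```

Only the first list looks the name up at call time; the other three evaluate i while the loop is still on that iteration.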
Python doesn't exactly pass by reference or by value (at least, not the way you'd think of it, coming from a language like C++).
In many other languages (such as C++), variables can be thought of as synonymous with the values they hold.
However, in Python, variables are names that point to the objects in memory.
(This is a good explanation (with pictures!))
Because of this, you can get multiple names attached to one object, which can lead to interesting effects.
Consider these equivalent program snippets:
// C++:
int x;
x = 10; // line A
x = 20; // line B
and
# Python:
x = 10 # line C
x = 20 # line D
After line A, the int 10 is stored in memory, say, at the memory address 0x1111.
After line B, the memory at 0x1111 is overwritten, so 0x1111 now holds the int 20
However, the way this program works in python is quite different:
After line C, x points to some memory, say, 0x2222, and the value stored at 0x2222 is 10
After line D, x points to some different memory, say, 0x3333, and the value stored at 0x3333 is 20
Eventually, the orphaned memory at 0x2222 is garbage collected by Python.
Hopefully this helps you get a grasp of the subtle differences between variables in Python and most other languages.
(I know I didn't directly answer your question about lambdas, but I think this is good background knowledge to have before reading one of the good explanations here, such as @Lattyware's)
See this question for some more background info.
Here's some final background info, in the form of oft-quoted but instructive examples:
print 'Example 1: Expected:'
x = 3
y = x
x = 2
print 'x =', x
print 'y =', y
print 'Example 2: Surprising:'
x = [3]
y = x
x[0] = 2
print 'x =', x
print 'y =', y
print 'Example 3: Same logic as in Example 1:'
x = [3]
y = x
x = [2]
print 'x =', x
print 'y =', y
The output is:
Example 1: Expected:
x = 2
y = 3
Example 2: Surprising:
x = [2]
y = [2]
Example 3: Same logic as in Example 1:
x = [2]
y = [3]
foo = []
for i in range(3):
    foo.append((lambda: i))
Here, since all the lambdas were created in the same scope, all of them refer to the same global variable i, so whatever value i points to when they are actually called will be returned.
foo = []
for i in range(3):
    foo.append((lambda z = i: id(z)))
print id(i) #165618436
print(foo[-1]()) #165618436
Here, in each iteration we assign the value of i to a local variable z. Default arguments are evaluated when the function is defined (here, each time the lambda expression is executed), so z simply points to the object that i referred to during that iteration.
Arguments are always passed to functions by reference?
In fact the z in foo[-1] still points to the same object as i of the last iteration, so yes values are passed by reference but as integers are immutable so changing i won't affect z of the foo[-1] at all.
In the example below all lambda's point to some mutable object, so modifying items in lis will also affect the functions in foo:
foo = []
lis = ([], [], [])
for i in lis:
    foo.append((lambda z = i: z))
lis[0].append("bar")
print foo[0]() #prints ['bar']
i.append("foo") # `i` still points to lis[-1]
print foo[-1]() #prints ['foo']
Assigning to a variable of immutable type creates a copy?
No, values are never copied.
>>> x = 1000
>>> y = x # x and y point to the same object, but an immutable object.
>>> x += 1 # so modifying x won't affect y at all, in fact after this step
# x now points to some different object and y still points to
# the same object 1000
>>> x #x now points to an new object, new id()
1001
>>> y #still points to the same object, same id()
1000
>>> x = []
>>> y = x
>>> x.append("foo") #modify an mutable object
>>> x,y #changes can be seen in all references to the object
(['foo'], ['foo'])
The list of lambdas problem arises because the i referred to in both snippets is the same variable.
Two distinct variables with the same name exist only if they exist in two separate scopes. See the following link for when that happens, but basically any new function (including a lambda) or class establishes its own scope, as do modules, and pretty much nothing else does. See: http://docs.python.org/2/reference/executionmodel.html#naming-and-binding
HOWEVER, when reading the value of a variable, if it is not defined in the current local scope, the enclosing local scopes are searched*. Your first example is of exactly this behaviour:
foo = []
for i in range(3):
    foo.append((lambda: i))
for l in foo:
    print(l())
Each lambda creates no variables at all, so its own local scope is empty. When execution hits the locally undefined i, it is located in the enclosing scope.
In your second example, each lambda creates its own i variable in the parameter list:
foo = []
for i in range(3):
    foo.append((lambda i=i: i))
This is in fact equivalent to lambda a=i: a, because the i inside the body is the same as the i on the left hand side of the assignment, and not the i on the right hand side. The consequence is that i is not missing from the local scope, and so the value of the local i is used by each lambda.
Update: Both of your assumptions are incorrect.
Function arguments are passed by value. The value passed is the reference to the object. Pass-by-reference would allow the original variable to be altered.
No implicit copying ever occurs on function call or assignment, of any language-level object. Under the hood, because this is pass-by-value, the references to the parameter objects are copied when the function is called, as is usual in any language which passes references by value.
Update 2: The details of function evaluation are here: http://docs.python.org/2/reference/expressions.html#calls . See the link above for the details regarding name binding.
* No actual linear search occurs in CPython, because the correct variable to use can be determined at compile time.
The answer is that the references created in a closure (where a function is inside a function, and the inner function accesses variables from the outer one) are special. This is an implementation detail, but in CPython the value is a particular kind of object called a cell and it allows the variable's value to be changed without rebinding it to a new object. More info here.
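The cell mentioned above can be inspected in CPython. The following is a small sketch that assumes Python 3 (it uses `nonlocal`); `make_counter` and `increment` are hypothetical names used only for this demonstration.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count   # rebinding goes through the shared cell
        count += 1
        return count

    return increment


counter = make_counter()
counter()
counter()
cell = counter.__closure__[0]   # CPython exposes the cell object here

print(counter.__code__.co_freevars, cell.cell_contents)
```

The inner function holds the cell rather than the integer, which is why it sees the latest binding of the outer variable.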
The way variables work in Python is actually rather simple.
All variables contain references to objects.
Reassigning a variable points it to a different object.
All arguments are passed by value when calling functions (though the values being passed are references).
Some types of objects are mutable, which means they can be changed without changing what any of their variable names point to. Only these types can be changed when passed, since this does not require changing any references to the object.
Values are never copied implicitly. Never.
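These rules can be checked with a short experiment. The sketch below (Python 3) uses two hypothetical functions, `rebind` and `mutate`, to contrast rebinding a parameter with mutating the object it refers to.

```python
def rebind(items):
    items = ["replaced"]      # rebinds the local name; the caller is unaffected
    return items


def mutate(items):
    items.append("added")     # changes the object the caller also refers to


data = ["original"]
rebind(data)
after_rebind = list(data)
mutate(data)
after_mutate = list(data)

print(after_rebind, after_mutate)
```

If arguments were passed by reference in the C++ sense, `rebind` would have replaced the caller's list; it does not.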
The behaviour really has very little to do with how parameters are passed (which is always the same way; there is no distinction in Python where things are sometimes passed by reference and sometimes passed by value). Rather the problem is to do with how names themselves are found.
lambda: i
creates a function that is of course equivalent to:
def anonymous():
return i
That i is a name, within the scope of anonymous. But it's never bound within that scope (not even as a parameter). So for that to mean anything i must be a name from some outer scope. To find a suitable name i, Python will look at the scope in which anonymous was defined in the source code (and then similarly out from there), until it finds a definition for i. [1]
So this loop:
foo = []
for i in range(3):
    foo.append((lambda: i))
for l in foo:
    print(l())
Is almost exactly as if you had written this:
foo = []
for i in range(3):
    def anonymous():
        return i
    foo.append(anonymous)
for l in foo:
    print(l())
So that i in return i (or lambda: i) ends up being the same i from the outer scope, which is the loop variable. Not that they are all references to the same object, but that they are all the same name. So it's simply not possible for the functions stored in foo to return different values; they're all returning the object referred to by a single name.
To prove it, watch what happens when I remove the variable i after the loop:
>>> foo = []
>>> for i in range(3):
        foo.append((lambda: i))

>>> del i
>>> for l in foo:
        print(l())

Traceback (most recent call last):
  File "<pyshell#7>", line 2, in <module>
    print(l())
  File "<pyshell#3>", line 2, in <lambda>
    foo.append((lambda: i))
NameError: global name 'i' is not defined
You can see that the problem isn't that each function has a local i bound to the wrong thing, but rather that each function is returning the value of the same global variable, which I've now removed.
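A gentler variation of the same experiment is to rebind i instead of deleting it. This is a sketch in Python 3; the names `before` and `after` are introduced only for the illustration.

```python
foo = []
for i in range(3):
    foo.append(lambda: i)

before = [fn() for fn in foo]   # i is 2 once the loop has finished
i = "rebound"
after = [fn() for fn in foo]    # every function now sees the new binding

print(before, after)
```

All three functions follow the single name i, so one assignment changes what every one of them returns.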
OTOH, when your loop looks like this:
foo = []
for i in range(3):
    foo.append((lambda i=i: i))
for l in foo:
    print(l())
That is quite like this:
foo = []
for i in range(3):
    def anonymous(i=i):
        return i
    foo.append(anonymous)
for l in foo:
    print(l())
Now the i in return i is not the same i as in the outer scope; it's a local variable of the function anonymous. A new function is created in each iteration of the loop (stored temporarily in the outer scope variable anonymous, and then permanently in a slot of foo), so each one has its own local variables.
As each function is created, the default value of its parameter is set to the value of i (in the scope defining the functions). Like any other "read" of a variable, that pulls out whatever object is referenced by the variable at that time, and thereafter has no connection to the variable. [2]
So each function gets the default value of i as it is in the outer scope at the time it is created, and then when the function is called without an argument that default value becomes the value of the i in that function's local scope. Each function has no non-local references, so is completely unaffected by what happens outside it.
[1] This is done at "compile time" (when the Python file is converted to bytecode), with no regard for what the system is like at runtime; it is almost literally looking for an outer def block with i = ... in the source code. So local variables are actually statically resolved! If that lookup chain falls all the way out to the module global scope, then Python assumes that i will be defined in the global scope at the point that the code will be run, and just treats i as a global variable whether or not there is a statically visible binding for i at module scope, which is why you can dynamically create global variables but not local ones.
[2] Confusingly, this means that in lambda i=i: i, the three occurrences of i refer to three completely different "variables" in two different scopes on the one line.
The leftmost i is the "name" holding the value that will be used for the default value of i, which exists independently of any particular call of the function; it's almost exactly "member data" stored in the function object.
The second i is an expression evaluated as the function is created, to get the default value. So the i=i bit acts very like an independent statement the_function.default_i = i, evaluated in the same scope containing the lambda expression.
And finally the third i is actually the local variable inside the function, which only exists within a call to the anonymous function.
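The three roles of i can be made visible by inspecting the function object. The sketch below assumes CPython 3 and uses the hypothetical name `g` for the lambda; `__defaults__` and `__code__.co_varnames` are standard function and code object attributes.

```python
i = 7
g = lambda i=i: i

stored_default = g.__defaults__        # value read from the outer i at creation
local_names = g.__code__.co_varnames   # the parameter i is a local of g

i = 99   # rebinding the outer name afterwards changes nothing inside g

print(stored_default, local_names, g(), g(1))
```

The default lives on the function object, while the i in the body is a local that exists only during a call and is filled from either the default or the argument.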
I recently met a problem in Python.
Code:
def f(a, L=[]):
    L.append(a)
    return L

print f(1)
print f(2)
print f(3)
the output would be
[1]
[1, 2]
[1, 2, 3]
But why does the value of the local list L in the function f persist between calls?
Because L is a local variable, I think the output should be:
[1]
[2]
[3]
I tried another way to implement this function:
Code:
def f(a, L=None):
    if L is None:
        L = []
    L.append(a)
    return L
This time, the output is:
[1]
[2]
[3]
I just don't understand why...
Does anyone have some ideas? many thanks.
The default parameters are in fact initialized when the function is defined, so
def f(L = []): pass
is quite similar to
default_L = []  # a module-level (global) name
def f(L = default_L): pass
You can see this way that it is the same list object that is used in every invocation of the function.
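The analogy can be turned into a runnable check. This is a sketch in Python 3 that borrows the parameter a from the question's f; the name `same_object` is introduced only for the illustration.

```python
default_L = []  # created once, at module level


def f(a, L=default_L):
    L.append(a)
    return L


f(1)
f(2)
same_object = f.__defaults__[0] is default_L

print(default_L, same_object)
```

Both calls appended to the one list that was created before the function was defined, which is the same thing that happens with L=[] written inline.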
The list in def f(a, L=[]) is created when the function is defined. It is the same list every time you call the function without passing L.
Setting the default to None and checking / creating the list as you have done is the usual workaround for this sort of behaviour.