This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
“Least Astonishment” in Python: The Mutable Default Argument
Using python 2.7 I came across strange behaviour and I'm not sure how to explain it or if it even exists in any python docs. Using the code
class MyClass:
#staticmethod
def func(objects=[], a=None, b=None):
objects.append(a)
print 'objects: %s'%objects
print 'b: %s'%b
MyClass.func(a='one')
MyClass.func(a='two', b='foo')
MyClass.func(a='three')
I get the get output
objects: ['one']
b: None
objects: ['one', 'two']
b: foo
objects: ['one', 'two', 'three']
b: None
As you can see, the first list parameter (objects) of the method retains it's values across calls.. new values being appended to the last list even though in it's header declaration it has a default value of []. But the last parameter (b) does not retain it's value, it is reset to the default value between calls.
The expected (for me anyway) is that the objects parameter should be reset to it's default on any call to the method (like the b parameter is), but this doesn't seem to happen and only seems to occur on the first call.
Can anyone explain this behaviour? Is it a bug in this version of python or is it intended behaviour? Possibly something to do with the list reference being retained across calls but the string variable (b) is not? I'm very confused by this behaviour.
Thanks
It has nothing to do with being an staticmethod. This is a very common mistake in Python.
Functions in Python are first class objects, not just a block of code, so the empty list you assign to objects in the parameters is a real list attached to the function object. Everytime you call it and append something to that list, you are using the same list. It's easy to see it happening this way:
>>> def func(x, objects=[]):
... objects.append(x)
...
>>> func(1)
>>> func.func_defaults
([1],)
>>> func(2)
>>> func.func_defaults
([1, 2],)
>>> func(3)
>>> func.func_defaults
([1, 2, 3],)
func_defaults is the function object attribute that constains the default keyword parameters you set. See how the list is there and get changed?
The proper way to do what you want is:
class MyClass:
#staticmethod
def func(objects=None, a=None, b=None):
if objects is None:
objects = []
objects.append(a)
print 'objects: %s'%objects
print 'b: %s'%b
This behaviour is broadly known and by many treated as a feature, and is not staticmethod-specific. It works for every function. It happens when you assign mutable object as the default value to the argument.
See this answer on StackOverflow: Default Argument Gotchas / Dangers of Mutable Default arguments
If you understand mutability vs. immutability issue, think of the function / method like that:
when it is defined, the default argument values are assigned,
when it is invoked, the body is executed and if it changes mutable default argument value in-place, the changed value is then being used on every subsequent call where the default argument value has not been overriden by providing different value instead of it,
Related
I am very confused by the behaviour below. Cases 1, 3, and 4 perform as I would expect, but case 2 does not. Why does case 2 allow the function to change the value of the dictionary entry globally, even though the dictionary is never returned by the function? A main reason I am using functions is to isolate everything in the function from the rest of the code, but this does not seem to be possible if I choose to use the same variable names inside of the function. I was under the understanding that anything explicitly defined in a function is local to that function, but this does not seem to be the case if the dictionary is defined and passed as an input to the function.
Case 1
>>> def testfun1(a):
... a=2
...
>>> a=0
>>> testfun1(a)
>>> a
0
Case 2
>>> def testfun2(b):
... b['test']=2
...
>>> b={}
>>> testfun2(b)
>>> b
{'test': 2}
Case 3
>>> def testfun3():
... c=2
...
>>> c=0
>>> testfun3()
>>> c
0
Case 4
(explained by this question: Global dictionaries don't need keyword global to modify them?)
>>> def testfun4():
... d['test']=10
...
>>> d={}
>>> testfun4()
>>> d
{'test': 10}
Python's "parameter evaluation strategy" acts a bit different than the languages you're probably used to. Instead of having explicit call by value and call by reference semantics, python has call by sharing. You are essentially always passing the object itself, and the object's mutability determines whether or not it can be modified. Lists and Dicts are mutable objects. Numbers, Strings, and Tuples are not.
You are passing the dictionary to the function, not a copy. Thus when you modify it, you are also modifying the original copy.
To avoid this, you should first copy the dictionary before calling the function, or from within the function (passing the dictionary to the dict function should do it, i.e. testfun4(dict(d)) and defining the function as def testfun4(d):).
To support what #Casey Kuball said, every object in Python is passed by reference. Each function receives a reference to the actual object you passed. Modifying these objects depends on whether they are mutable data types.
In essence, one can say that mutable objects like dictionaries, sets, and lists are passed by reference. Immutable objects like int, str, tuple are passed by value.
You should also note that there are cases where mutable objects are overwritten in a function thereby losing reference to the actual object passed to the function.
>>> def testfun(b):
... b = b or {} # Creates a new object if b is false
... b['test'] = 2
...
>>> b = {}
>>> testfun(b)
>>> b
{}
When you pass a basic object like an integer or a string to a function, if you change it inside the function nothing occurs to the corresponding object outside the function because when you are leading with a basic object, python passes it by value.
However, if you pass a dictionary or a list to a function they are passed by reference, which means you will have that behaviour: the object outside the function is changed, as you have seen.
edit:
In addition, there is a difference between passing by value or by reference: by value, a "copy" of the object is made in order to be used in the function; by reference, the exactly same object is passed through reference and modifications to it inside the function are visible outside. By definition python passes its immutable objects by value, and its mutable objects by reference.
The global keyword is required only for assignment (and likely del, I've never tried it). Object mutations are perfectly valid.
You have passed a dict object to the function and modified it inside the function, so of course it will be modified after the function return. The object is not copied so you modify the same object that you passed, and this question has nothing to do with naming, similar names, scopes etc. as you passed the object explicitly.
I am using PyCharm (Python 3) to write a Python function which accepts a dictionary as an argument with attachment={}.
def put_object(self, parent_object, connection_name, **data):
...
def put_wall_post(self, message, attachment={}, profile_id="me"):
return self.put_object(profile_id, "feed", message=message, **attachment)
In the IDE, attachment={} is colored yellow. Moving the mouse over it shows a warning.
Default arguments value is mutable
This inspection detects when a mutable value as list or dictionary is
detected in a default value for an argument.
Default argument values are evaluated only once at function definition
time, which means that modifying the default value of the argument
will affect all subsequent calls of the function.
What does this mean and how can I resolve it?
If you don't alter the "mutable default argument" or pass it anywhere where it could be altered just ignore the message, because there is nothing to be "fixed".
In your case you only unpack (which does an implicit copy) the "mutable default argument" - so you're safe.
If you want to "remove that warning message" you could use None as default and set it to {} when it's None:
def put_wall_post(self,message,attachment=None,profile_id="me"):
if attachment is None:
attachment = {}
return self.put_object(profile_id,"feed",message = message,**attachment)
Just to explain the "what it means": Some types in Python are immutable (int, str, ...) others are mutable (like dict, set, list, ...). If you want to change immutable objects another object is created - but if you change mutable objects the object remains the same but it's contents are changed.
The tricky part is that class variables and default arguments are created when the function is loaded (and only once), that means that any changes to a "mutable default argument" or "mutable class variable" are permanent:
def func(key, value, a={}):
a[key] = value
return a
>>> print(func('a', 10)) # that's expected
{'a': 10}
>>> print(func('b', 20)) # that could be unexpected
{'b': 20, 'a': 10}
PyCharm probably shows this Warning because it's easy to get it wrong by accident (see for example Why do mutable default arguments remember mutations between function calls? and all linked questions). However, if you did it on purpose (Good uses for mutable function argument default values?) the Warning could be annoying.
You can replace mutable default arguments with None. Then check inside the function and assign the default:
def put_wall_post(self, message, attachment=None, profile_id="me"):
attachment = attachment if attachment else {}
return self.put_object(profile_id, "feed", message=message, **attachment)
This works because None evaluates to False so we then assign an empty dictionary.
In general you may want to explicitly check for None as other values could also evaluate to False, e.g. 0, '', set(), [], etc, are all False-y. If your default isn't 0 and is 5 for example, then you wouldn't want to stomp on 0 being passed as a valid parameter:
def function(param=None):
param = 5 if param is None else param
This is a warning from the interpreter that because your default argument is mutable, you might end up changing the default if you modify it in-place, which could lead to unexpected results in some cases. The default argument is really just a reference to the object you indicate, so much like when you alias a list to two different identifiers, e.g.,
>>> a={}
>>> b=a
>>> b['foo']='bar'
>>> a
{'foo': 'bar'}
if the object is changed through any reference, whether during that call to the function, a separate call, or even outside the function, it will affect future calls the function. If you're not expecting the behavior of the function to change at runtime, this could be a cause for bugs. Every time the function is called, it's the same name being bound to the same object. (in fact, I'm not sure if it even goes through the whole name binding process each time? I think it just gets another reference.)
The (likely unwanted) behavior
You can see the effect of this by declaring the following and calling it a few times:
>>> def mutable_default_arg (something = {'foo':1}):
something['foo'] += 1
print (something)
>>> mutable_default_arg()
{'foo': 2}
>>> mutable_default_arg()
{'foo': 3}
Wait, what? yes, because the object referenced by the argument doesn't change between calls, changing one of its elements changes the default. If you use an immutable type, you don't have to worry about this because it shouldn't be possible, under standard circumstances, to change an immutable's data. I don't know if this holds for user-defined classes, but that is why this is usually just addressed with "None" (that, and you only need it as a placeholder, nothing more. Why spend the extra RAM on something more complicated?)
Duct-taped problems...
In your case, you were saved by an implicit copy, as another answer pointed out, but it's never a good idea to rely on implicit behavior, especially unexpected implicit behavior, since it could change. That's why we say "explicit is better than implicit". Besides which, implicit behavior tends to hide what's going on, which could lead you or another programmer to removing the duct tape.
...with simple (permanent) solutions
You can avoid this bug magnet completely and satisfy the warning by, as others have suggested, using an immutable type such as None, checking for it at the start of the function, and if found, immediately replacing it before your function gets going:
def put_wall_post(self, message, attachment=None, profile_id="me"):
if attachment is None:
attachment = {}
return self.put_object(profile_id, "feed", message=message, **attachment)
Since immutable types force you to replace them (Technically, you are binding a new object to the same name. In the above, the reference to None is overwritten when attachment is rebound to the new empty dictionary) instead of updating them, you know attachment will always start as None unless specified in the call parameters, thus avoiding the risk of unexpected changes to the default.
(As an aside, when in doubt whether an object is the same as another object, compare them with is or check id(object). The former can check whether two references refer to the same object, and the latter can be useful for debugging by printing a unique identifier—typically the memory location—for the object.)
To rephrase the warning: every call to this function, if it uses the default, will use the same object. So long as you never change that object, the fact that it is mutable won't matter. But if you do change it, then subsequent calls will start with the modified value, which is probably not what you want.
One solution to avoid this issue would be to have the default be a immutable type like None, and set the parameter to {} if that default is used:
def put_wall_post(self,message,attachment=None,profile_id="me"):
if attachment==None:
attachment={}
return self.put_object(profile_id,"feed",message = message,**attachment)
Lists are mutable and as declaring default with def at declaration at compile time will assign a mutable list to the variable at some address
def abc(a=[]):
a.append(2)
print(a)
abc() #prints [2]
abc() #prints [2, 2] as mutable thus changed the same assigned list at func delaration points to same address and append at the end
abc([4]) #prints [4, 2] because new list is passed at a new address
abc() #prints [2, 2, 2] took same assigned list and append at the end
To correct this:
def abc(a=None):
if not a:
a=[]
a.append(2)
print(a)
This works as every time a new list is created and not referencing the old list as a value always null thus assigning new list at new address
I am very confused by the behaviour below. Cases 1, 3, and 4 perform as I would expect, but case 2 does not. Why does case 2 allow the function to change the value of the dictionary entry globally, even though the dictionary is never returned by the function? A main reason I am using functions is to isolate everything in the function from the rest of the code, but this does not seem to be possible if I choose to use the same variable names inside of the function. I was under the understanding that anything explicitly defined in a function is local to that function, but this does not seem to be the case if the dictionary is defined and passed as an input to the function.
Case 1
>>> def testfun1(a):
... a=2
...
>>> a=0
>>> testfun1(a)
>>> a
0
Case 2
>>> def testfun2(b):
... b['test']=2
...
>>> b={}
>>> testfun2(b)
>>> b
{'test': 2}
Case 3
>>> def testfun3():
... c=2
...
>>> c=0
>>> testfun3()
>>> c
0
Case 4
(explained by this question: Global dictionaries don't need keyword global to modify them?)
>>> def testfun4():
... d['test']=10
...
>>> d={}
>>> testfun4()
>>> d
{'test': 10}
Python's "parameter evaluation strategy" acts a bit different than the languages you're probably used to. Instead of having explicit call by value and call by reference semantics, python has call by sharing. You are essentially always passing the object itself, and the object's mutability determines whether or not it can be modified. Lists and Dicts are mutable objects. Numbers, Strings, and Tuples are not.
You are passing the dictionary to the function, not a copy. Thus when you modify it, you are also modifying the original copy.
To avoid this, you should first copy the dictionary before calling the function, or from within the function (passing the dictionary to the dict function should do it, i.e. testfun4(dict(d)) and defining the function as def testfun4(d):).
To support what #Casey Kuball said, every object in Python is passed by reference. Each function receives a reference to the actual object you passed. Modifying these objects depends on whether they are mutable data types.
In essence, one can say that mutable objects like dictionaries, sets, and lists are passed by reference. Immutable objects like int, str, tuple are passed by value.
You should also note that there are cases where mutable objects are overwritten in a function thereby losing reference to the actual object passed to the function.
>>> def testfun(b):
... b = b or {} # Creates a new object if b is false
... b['test'] = 2
...
>>> b = {}
>>> testfun(b)
>>> b
{}
When you pass a basic object like an integer or a string to a function, if you change it inside the function nothing occurs to the corresponding object outside the function because when you are leading with a basic object, python passes it by value.
However, if you pass a dictionary or a list to a function they are passed by reference, which means you will have that behaviour: the object outside the function is changed, as you have seen.
edit:
In addition, there is a difference between passing by value or by reference: by value, a "copy" of the object is made in order to be used in the function; by reference, the exactly same object is passed through reference and modifications to it inside the function are visible outside. By definition python passes its immutable objects by value, and its mutable objects by reference.
The global keyword is required only for assignment (and likely del, I've never tried it). Object mutations are perfectly valid.
You have passed a dict object to the function and modified it inside the function, so of course it will be modified after the function return. The object is not copied so you modify the same object that you passed, and this question has nothing to do with naming, similar names, scopes etc. as you passed the object explicitly.
This question already has answers here:
"Least Astonishment" and the Mutable Default Argument
(33 answers)
Closed 8 years ago.
Refer to this
>>> def foo(counter=[0]):
... counter[0] += 1
... print("Counter is %i." % counter[0]);
...
>>> foo()
Counter is 1.
>>> foo()
Counter is 2.
>>>
Default values are initialized only when the function is first evaluated, not each time it is executed, so you can use a list or any other mutable object to maintain static values.
Question> Why the counter can keep its updated value during different callings? Is it true that counter refers to the same memory used to store the temporary list of the default parameter so that it can refer to the same memory address and keep the updated values during the calls?
The object created as the default argument becomes part of the function and persists until the function is destroyed.
First things first: if you are coding in Python: forget about "memory address" - you will never need one. Yes, there are objects, and they are placed in memory, and if you are referring to the same object, it is in the same "memory address" - but that does not matter - there could even be an implementation where objects don't have a memory address at all (just a place in a data structure, for example).
Then, when Python encounters the function body, as it is defined above, it does create a code object with the contents of the function body, and executes the function definition line - resolving any expressions inlined there and setting the results of those expressions as the default parameters for that function. There is nothing "temporary" about these objects. The expressions (in this case [0]) are evaluated, the resulting objects (in this case a Python list with a single element) are created, and assigned to a reference in the function object (a position in the functions's "func_defaults" attribute - remember that functions themselves are objects in Python.)
Whenever you call that function, if you don't pass a value to the counter parameter, it is assigned to the object recorded in the func_defaults attribute -in this case, a Python list. And it is the same Python list that was created at function parsing time.
What happens is that Python lists themselves are mutable: one can change its contents, add more elements, etc...but they are still the same list.
What the code above does is exactly incrementing the element in the position 0 of the list.
You can access this list at any time in the example above by typing foo.func_defaults[0]
If you want to "reset" the counter, you can just do: foo.func_defaults[0][0]=0, for example.
Of course, it is a side effect of how thigns a reimplemented in Python, and though consistent, and even docuemnted, should not be used in "real code". If you need "static variables", use a class instead of a function like the above:
class Foo(object):
def __init__(self):
self.counter = 0
def __call__(self):
self.counter += 1
print("Counter is %i." % self.counter)
And on the console:
>>> foo = Foo()
>>> foo()
Counter is 1.
>>> foo()
Counter is 2.
>>>
Default arguments are evaluated at the function definition, and not its calls. When the foo function object is created (remember, functions are first class citizens in python) its arguments, including default arguments, are all local names bound to the function object. The function object is created when the def: statement is first encountered.
In this case counter is set to a mutable list in the definition of foo(), so calling it without an argument gives the original mutable list instantiated at definition the name counter. This makes calls to foo() use and modify the original list.
Like all function arguments, counter is a local name defined when the function is declared. The default argument, too, is evaluated once when the function is declared.
If, when calling foo(), no value is passed for counter, that default value, the exact object instance provided at function definition time, is given the name counter. Since this is a mutable object (in this case, a list), any changes made to it remain after the function has completed.
The function contains a reference to the list in its default argument tuple, func_defaults, which prevents it from being destroyed.
This question already has answers here:
Why does using `arg=None` fix Python's mutable default argument issue?
(5 answers)
"Least Astonishment" and the Mutable Default Argument
(33 answers)
Closed 9 months ago.
I was writing some code this afternoon, and stumbled across a bug in my code. I noticed that the default values for one of my newly created objects was carrying over from another object! For example:
class One(object):
def __init__(self, my_list=[]):
self.my_list = my_list
one1 = One()
print(one1.my_list)
[] # empty list, what you'd expect.
one1.my_list.append('hi')
print(one1.my_list)
['hi'] # list with the new value in it, what you'd expect.
one2 = One()
print(one2.my_list)
['hi'] # Hey! It saved the variable from the other One!
So I know it can be solved by doing this:
class One(object):
def __init__(self, my_list=None):
self.my_list = my_list if my_list is not None else []
What I would like to know is... Why? Why are Python classes structured so that the default values are saved across instances of the class?
This is a known behaviour of the way Python default values work, which is often surprising to the unwary. The empty array object [] is created at the time of definition of the function, rather than at the time it is called.
To fix it, try:
def __init__(self, my_list=None):
if my_list is None:
my_list = []
self.my_list = my_list
Several others have pointed out that this is an instance of the "mutable default argument" issue in Python. The basic reason is that the default arguments have to exist "outside" the function in order to be passed into it.
But the real root of this as a problem has nothing to do with default arguments. Any time it would be bad if a mutable default value was modified, you really need to ask yourself: would it be bad if an explicitly provided value was modified? Unless someone is extremely familiar with the guts of your class, the following behaviour would also be very surprising (and therefore lead to bugs):
>>> class One(object):
... def __init__(self, my_list=[]):
... self.my_list = my_list
...
>>> alist = ['hello']
>>> one1 = One(alist)
>>> alist.append('world')
>>> one2 = One(alist)
>>>
>>> print(one1.my_list) # Huh? This isn't what I initialised one1 with!
['hello', 'world']
>>> print(one2.my_list) # At least this one's okay...
['hello', 'world']
>>> del alist[0]
>>> print one2.my_list # What the hell? I just modified a local variable and a class instance somewhere else got changed?
['world']
9 times out of 10, if you discover yourself reaching for the "pattern" of using None as the default value and using if value is None: value = default, you shouldn't be. You should be just not modifying your arguments! Arguments should not be treated as owned by the called code unless it is explicitly documented as taking ownership of them.
In this case (especially because you're initialising a class instance, so the mutable variable is going to live a long time and be used by other methods and potentially other code that retrieves it from the instance) I would do the following:
class One(object):
def __init__(self, my_list=[])
self.my_list = list(my_list)
Now you're initialising the data of your class from a list provided as input, rather than taking ownership of a pre-existing list. There's no danger that two separate instances end up sharing the same list, nor that the list is shared with a variable in the caller which the caller may want to continue using. It also has the nice effect that your callers can provide tuples, generators, strings, sets, dictionaries, home-brewed custom iterable classes, etc, and you know you can still count on self.my_list having an append method, because you made it yourself.
There's still a potential problem here, if the elements contained in the list are themselves mutable then the caller and this instance can still accidentally interfere with each other. I find it not to very often be a problem in practice in my code (so I don't automatically take a deep copy of everything), but you have to be aware of it.
Another issue is that if my_list can be very large, the copy can be expensive. There you have to make a trade-off. In that case, maybe it is better to just use the passed-in list after all, and use the if my_list is None: my_list = [] pattern to prevent all default instances sharing the one list. But if you do that you need to make it clear, either in documentation or the name of the class, that callers are relinquishing ownership of the lists they use to initialise the instance. Or, if you really want to be constructing a list solely for the purpose of wrapping up in an instance of One, maybe you should figure out how to encapsulate the creation of the list inside the initialisation of One, rather than constructing it first; after all, it's really part of the instance, not an initialising value. Sometimes this isn't flexible enough though.
And sometimes you really honestly do want to have aliasing going on, and have code communicating by mutating values they both have access to. I think very hard before I commit to such a design, however. And it will surprise others (and you when you come back to the code in X months), so again documentation is your friend!
In my opinion, educating new Python programmers about the "mutable default argument" gotcha is actually (slightly) harmful. We should be asking them "Why are you modifying your arguments?" (and then pointing out the way default arguments work in Python). The very fact of a function having a sensible default argument is often a good indicator that it isn't intended as something that receives ownership of a pre-existing value, so it probably shouldn't be modifying the argument whether or not it got the default value.
Basically, python function objects store a tuple of default arguments, which is fine for immutable things like integers, but lists and other mutable objects are often modified in-place, resulting in the behavior you observed.
This is standard behavior of default arguments anywhere in Python, not just in classes.
For more explanation, see Mutable defaults for function/method arguments.
Python functions are objects. Default arguments of a function are attributes of that function. So if the default value of an argument is mutable and it's modified inside your function, the changes are reflected in subsequent calls to that function.
Not an answer, but it's worth noting this is also true for class variables defined outside any class functions.
Example:
>>> class one:
... myList = []
...
>>>
>>> one1 = one()
>>> one1.myList
[]
>>> one2 = one()
>>> one2.myList.append("Hello Thar!")
>>>
>>> one1.myList
['Hello Thar!']
>>>
Note that not only does the value of myList persist, but every instance of myList points to the same list.
I ran into this bug/feature myself, and spent something like 3 hours trying to figure out what was going on. It's rather challenging to debug when you are getting valid data, but it's not from the local computations, but previous ones.
It's made worse since this is not just a default argument. You can't just put myList in the class definition, it has to be set equal to something, although whatever it is set equal to is only evaluated once.
The solution, at least for me, was to simply create all the class variable inside __init__.