Method to replace assignation into functions - python

I know it's not possible to assign a new value to a variable passed as a parameter to a function.
>>> a = 1
>>> def foo(bar):
...     bar = 2
...
>>> foo(a)
>>> a
1
But it is possible to modify it with methods.
>>> a = [1, 2]
>>> def foo(bar):
...     bar.append(3)
...
>>> foo(a)
>>> a
[1, 2, 3]
But is there a method that replaces assignment (giving the variable a whole new value)? Something that would make my first example work:
>>> a = 1
>>> def foo(bar):
...     bar.assign(2)
...
>>> foo(a)
>>> a
2
The only alternatives I found are global variables and designing my own classes.
So my questions are:
Is there such a method (or an alternative)?
If there isn't, it must be a design choice rather than an oversight. Why this choice? If there are methods to modify part of the value/content, why not a method to replace it with a whole new value/content?

Everything in Python is an object, and objects in Python are mutable, except when they are not. Basic types like strings and numbers (int, float) are immutable, i.e. 1 is always 1, and you can never make 1 something else.
This is the case in most so-called object-oriented languages and is largely an optimization.
As you said, you would need to create your own class that wraps the immutable value so that you can mutate the internals of the wrapper object. A very simple class is enough:
class Mutatable:
    def __init__(self, value):
        self.value = value

    def assign(self, value):
        # in Python you cannot overload the = (assignment) operator
        self.value = value
Now from your example you can say:
>>> a = Mutatable(1)
>>> def foo(bar):
...     bar.assign(2)
...
>>> foo(a)
>>> a.value
2
As some of the other posters mentioned, here is some general programming advice: overusing mutation makes applications very hard to debug. Functions that return values and raise exceptions are much easier to test and debug.
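A minimal sketch of that advice, assuming the goal is simply to give `a` a new value: the function returns the result and the caller rebinds its own name. The function name `incremented` is hypothetical.

```python
def incremented(value, step=1):
    """Return a new value instead of trying to change the caller's variable."""
    return value + step


a = 1
a = incremented(a)  # the caller decides to rebind its own name
print(a)  # 2
```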

First off, everything in Python is an object, and every name lives in a scope: local, enclosing, global, or built-in. The main purpose of scopes is to keep a group of statements that serve a specific aim isolated from the rest of the program.
Therefore it is not reasonable to break this architecture by letting variables in different scopes affect each other.
That said, Python provides some tools for making a variable available in an outer scope (see note 1), such as the global statement.
Note: Regarding the second example, remember that changing a mutable object inside the function may affect the caller. That is because the parameter and the caller's variable refer to the same object; a list, for example, is a container of references to its items, and changing one of those references changes the list that the caller sees.
1. The hierarchy of scopes in Python, from inner to outer, is Local, Enclosing, Global, Built-in, known as the LEGB rule.
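The LEGB lookup order can be sketched with a few small functions. All names here (`outer`, `inner`, `reads_global`, and so on) are made up for illustration.

```python
x = "global"


def outer():
    x = "enclosing"

    def inner():
        x = "local"
        return x

    def inner_without_local():
        # No local x here, so the lookup falls through to outer()'s x.
        return x

    return inner(), inner_without_local()


def reads_global():
    # No local or enclosing x, so the module-level x is found.
    return x


def reads_builtin():
    # len is not defined locally, in an enclosing function, or in this module,
    # so the lookup reaches the built-in scope.
    return len("abc")


print(outer())          # ('local', 'enclosing')
print(reads_global())   # global
print(reads_builtin())  # 3
```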

It is a consequence of two design decisions. First, function arguments in Python are passed by assignment. That is, if you have a call like
foo(a)
it translates to, roughly
bar = a
<function body, verbatim>
That's why you can't just pass a pointer to any variable as you would in, say, C. If you want to mutate a function argument, it must have some mutable internal structure. That's where the second design decision comes in: integers in Python are immutable, along with other types including other kinds of numbers, strings or tuples. I don't know the original motivation behind this decision, but I see two main advantages:
easier for a human to reason about the code (you can be sure your integer does not magically change when you pass it to a function)
easier for a computer to reason about the code (for the purpose of optimization or static analysis), because a lot of function arguments are just numbers or strings
My personal opinion is that you should avoid functions that mutate their arguments where possible. At the very least, they should be methods and mutate the corresponding object. Otherwise the code becomes error-prone and hard to maintain and test. Therefore I fully support this immutability, and try to use immutable objects everywhere.
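As a rough illustration of "passed by assignment", the sketch below checks that the parameter is just another name for the caller's object, and that rebinding it leaves the caller untouched. The helper names `rebind` and `same_object` are hypothetical.

```python
def rebind(bar):
    # Rebinding the local name does not touch the caller's variable.
    bar = 2
    return bar


def same_object(bar, original):
    # The parameter is just another name for the object that was passed in.
    return bar is original


a = 1
rebind(a)
print(a)  # 1

items = [1, 2]
print(same_object(items, items))  # True
```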

If you want to change the value of a variable from inside a function without using the global statement or a return value, you can pass its name as a string and use globals():
>>> def foo(bar):
...     globals()[bar] = 3
...
>>> x = 7
>>> x
7
>>> foo('x')
>>> x
3

Related

Why does changing attributes of one object change the attributes of different objects from the same class? [duplicate]

I am very confused by the behaviour below. Cases 1, 3, and 4 perform as I would expect, but case 2 does not. Why does case 2 allow the function to change the value of the dictionary entry globally, even though the dictionary is never returned by the function? A main reason I am using functions is to isolate everything in the function from the rest of the code, but this does not seem to be possible if I choose to use the same variable names inside of the function. I was under the understanding that anything explicitly defined in a function is local to that function, but this does not seem to be the case if the dictionary is defined and passed as an input to the function.
Case 1
>>> def testfun1(a):
...     a=2
...
>>> a=0
>>> testfun1(a)
>>> a
0
Case 2
>>> def testfun2(b):
...     b['test']=2
...
>>> b={}
>>> testfun2(b)
>>> b
{'test': 2}
Case 3
>>> def testfun3():
...     c=2
...
>>> c=0
>>> testfun3()
>>> c
0
Case 4
(explained by this question: Global dictionaries don't need keyword global to modify them?)
>>> def testfun4():
...     d['test']=10
...
>>> d={}
>>> testfun4()
>>> d
{'test': 10}
Python's "parameter evaluation strategy" acts a bit differently from the languages you're probably used to. Instead of having explicit call-by-value and call-by-reference semantics, Python has call by sharing. You are essentially always passing the object itself, and the object's mutability determines whether or not it can be modified. Lists and dicts are mutable objects. Numbers, strings, and tuples are not.
You are passing the dictionary itself to the function, not a copy. Thus when you modify it, you are also modifying the original object.
To avoid this, copy the dictionary first, either before calling the function or from within it. Passing the dictionary to the dict constructor makes a shallow copy, i.e. call testfun4(dict(d)) and define the function as def testfun4(d):.
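One caveat worth sketching: `dict(d)` copies only the outer dictionary, so nested mutable values are still shared. The example below assumes a hypothetical function `add_test_key` and uses `copy.deepcopy` from the standard library when full isolation is wanted.

```python
import copy


def add_test_key(d):
    d["test"] = 2
    d["nested"].append("changed")


original = {"nested": []}

# dict(original) makes a shallow copy: new outer dict, shared inner list.
shallow = dict(original)
add_test_key(shallow)
print(original)  # {'nested': ['changed']}

# copy.deepcopy also copies the nested list, so nothing leaks back.
isolated = {"nested": []}
add_test_key(copy.deepcopy(isolated))
print(isolated)  # {'nested': []}
```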
To support what @Casey Kuball said, every object in Python is passed by reference in the sense that each function receives a reference to the actual object you passed. Whether these objects can be modified depends on whether their types are mutable.
In effect, mutable objects like dictionaries, sets, and lists behave as if passed by reference, while immutable objects like int, str, and tuple behave as if passed by value, because they cannot be changed in place.
You should also note that there are cases where a parameter that refers to a mutable object is rebound inside a function, thereby losing the reference to the actual object passed to the function.
>>> def testfun(b):
...     b = b or {} # Creates a new object if b is false
...     b['test'] = 2
...
>>> b = {}
>>> testfun(b)
>>> b
{}
When you pass a basic object like an integer or a string to a function and assign a new value to the parameter inside the function, nothing happens to the corresponding object outside the function. Such objects are immutable, so the effect is the same as if Python had passed them by value.
However, if you pass a dictionary or a list to a function, the function receives the same object, which behaves like pass by reference: changes made to the object inside the function are visible outside, as you have seen.
edit:
In addition, there is a difference between passing by value and by reference: by value, a "copy" of the object is made for use in the function; by reference, exactly the same object is passed and modifications to it inside the function are visible outside. Python never copies the argument; immutable objects merely look as if they were passed by value because they cannot be modified.
The global keyword is required only for assignment (and likely del, I've never tried it). Object mutations are perfectly valid.
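A small sketch of that distinction, using made-up module-level names `counter` and `log`: mutation works without any declaration, while assignment only reaches the global name when `global` is used.

```python
counter = 0
log = []


def mutate_without_global():
    # Mutating the object that a global name refers to needs no declaration.
    log.append("called")


def assign_with_global():
    global counter
    counter = counter + 1


def assign_without_global():
    # Without `global`, this creates a local name and the module-level
    # counter is left alone.
    counter = 100
    return counter


mutate_without_global()
assign_with_global()
assign_without_global()
print(log, counter)  # ['called'] 1
```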
You have passed a dict object to the function and modified it inside the function, so of course it will still be modified after the function returns. The object is not copied, so you modify the same object that you passed. This question has nothing to do with naming, similar names, scopes, etc., because you passed the object explicitly.

Why does the 'number' parameter does not pass? [duplicate]

Suppose I have a function like:
def foo():
    x = 'hello world'
How do I get the function to return x, in such a way that I can use it as the input for another function or use the variable within the body of a program? I tried using return and then using the x variable in another function, but I get a NameError that way.
For the specific case of communicating information between methods in the same class, it is often best to store the information in self. See Passing variables between methods in Python? for details.
def foo():
    x = 'hello world'
    return x # return 'hello world' would do, too
foo()
print(x) # NameError - x is not defined outside the function
y = foo()
print(y) # this works
x = foo()
print(x) # this also works, and it's a completely different x than that inside
# foo()
z = bar(x) # of course, now you can use x as you want
z = bar(foo()) # but you don't have to
Effectively, there are two ways: directly and indirectly.
The direct way is to return a value from the function, as you tried, and let the calling code use that value. This is normally what you want. The natural, simple, direct, explicit way to get information back from a function is to return it. Broadly speaking, the purpose of a function is to compute a value, and return signifies "this is the value we computed; we are done here".
Directly using return
The main trick here is that return returns a value, not a variable. So return x does not enable the calling code to use x after calling the function, and does not modify any existing value that x had in the context of the call. (That's presumably why you got a NameError.)
After we use return in the function:
def example():
    x = 'hello world'
    return x
we need to write the calling code to use the return value:
result = example()
print(result)
The other key point here is that a call to a function is an expression, so we can use it the same way that we use, say, the result of an addition. Just as we may say result = 'hello ' + 'world', we may say result = foo(). After that, result is our own, local name for that string, and we can do whatever we want with it.
We can use the same name, x, if we want. Or we can use a different name. The calling code doesn't have to know anything about how the function is written, or what names it uses for things.1
We can use the value directly to call another function: for example, print(foo()).2 We can return the value directly: simply return 'hello world', without assigning to x. (Again: we are returning a value, not a variable.)
The function can only return once each time it is called. return terminates the function - again, we just determined the result of the calculation, so there is no reason to calculate any further. If we want to return multiple pieces of information, therefore, we will need to come up with a single object (in Python, "value" and "object" are effectively synonyms; this doesn't work out so well for some other languages.)
We can make a tuple right on the return line; or we can use a dictionary, a namedtuple (Python 2.6+), a types.SimpleNamespace (Python 3.3+), a dataclass (Python 3.7+), or some other class (perhaps even one we write ourselves) to associate names with the values that are being returned; or we can accumulate values from a loop in a list; and so on. The possibilities are endless.
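A short sketch of three of those options side by side. The function names, the `Range` namedtuple, and the `Stats` dataclass are all invented for this example.

```python
from collections import namedtuple
from dataclasses import dataclass


def min_max_tuple(values):
    # A plain tuple, built right on the return line.
    return min(values), max(values)


Range = namedtuple("Range", ["low", "high"])


def min_max_named(values):
    return Range(low=min(values), high=max(values))


@dataclass
class Stats:
    low: int
    high: int


def min_max_dataclass(values):
    return Stats(low=min(values), high=max(values))


low, high = min_max_tuple([3, 1, 2])  # unpack the tuple at the call site
print(low, high)                       # 1 3
print(min_max_named([3, 1, 2]).high)   # 3
print(min_max_dataclass([3, 1, 2]))    # Stats(low=1, high=3)
```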
On the other hand, the function returns whether you like it or not (unless an exception is raised). If it reaches the end, it will implicitly return the special value None. You may or may not want to do it explicitly instead.
Indirect methods
Other than returning the result back to the caller directly, we can communicate it by modifying some existing object that the caller knows about. There are many ways to do that, but they're all variations on that same theme.
If you want the code to communicate information back this way, please just let it return None - don't also use the return value for something meaningful. That's how the built-in functionality works.
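The built-in list API illustrates this convention: `list.sort` mutates and returns `None`, while `sorted` leaves its input alone and returns a new list.

```python
numbers = [3, 1, 2]

result = numbers.sort()      # sorts in place and returns None
print(result, numbers)       # None [1, 2, 3]

words = ["pear", "apple"]
ordered = sorted(words)      # returns a new list, leaves the input alone
print(ordered, words)        # ['apple', 'pear'] ['pear', 'apple']
```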
In order to modify that object, the called function also has to know about it, of course. That means, having a name for the object that can be looked up in a current scope. So, let's go through those in order:
Local scope: Modifying a passed-in argument
If one of our parameters is mutable, we can just mutate it, and rely on the caller to examine the change. This is usually not a great idea, because it can be hard to reason about the code. It looks like:
def called(mutable):
    mutable.append('world')

def caller():
    my_value = ['hello'] # a list with just that string
    called(my_value)
    # now it contains both strings
If the value is an instance of our own class, we could also assign to an attribute:
class Test:
    def __init__(self, value):
        self.value = value

def called(mutable):
    mutable.value = 'world'

def caller():
    test = Test('hello')
    called(test)
    # now test.value has changed
Assigning to an attribute does not work for built-in types, including object; and it might not work for some classes that explicitly prevent you from doing it.
Local scope: Modifying self, in a method
We already have an example of this above: setting self.value in the Test.__init__ code. This is a special case of modifying a passed-in argument; but it's part of how classes work in Python, and something we're expected to do. Normally, when we do this, the calling code won't actually check for changes to self - it will just use the modified object in the next step of the logic. That's what makes it appropriate to write code this way: we're still presenting an interface, so the caller doesn't have to worry about the details.
class Example:
    def __init__(self):
        self._words = ['hello']

    def add_word(self):
        self._words.append('world')

    def display(self):
        print(*self._words)

x = Example()
x.add_word()
x.display()
In the example, calling add_word gave information back to the top-level code - but instead of looking for it, we just go ahead and call display.3
See also: Passing variables between methods in Python?
Enclosing scope
This is a rare special case when using nested functions. There isn't a lot to say here - it works the same way as with the global scope, just using the nonlocal keyword rather than global.4
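A minimal sketch of the enclosing-scope case, using a hypothetical `make_counter` function: the inner function assigns to a variable of the outer function via `nonlocal`.

```python
def make_counter():
    count = 0

    def increment():
        nonlocal count  # assign to make_counter's variable, not a new local
        count += 1
        return count

    return increment


counter = make_counter()
counter()
print(counter())  # 2
```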
Global scope: Modifying a global
Generally speaking, it is a bad idea to change anything in the global scope after setting it up in the first place. It makes code harder to reason about, because anything that uses that global (aside from whatever was responsible for the change) now has a "hidden" source of input.
If you still want to do it, the syntax is straightforward:
words = ['hello']

def add_global_word():
    words.append('world')

add_global_word() # `words` is changed
Global scope: Assigning to a new or existing global
This is actually a special case of modifying a global. I don't mean that assignment is a kind of modification (it isn't). I mean that when you assign a global name, Python automatically updates a dict that represents the global namespace. You can get that dict with globals(), and you can modify that dict and it will actually impact what global variables exist. (I.e., the return from globals() is the dictionary itself, not a copy.)5
But please don't. That's even worse of an idea than the previous one. If you really need to get the result from your function by assigning to a global variable, use the global keyword to tell Python that the name should be looked up in the global scope:
words = ['hello']

def replace_global_words():
    global words
    words = ['hello', 'world']

replace_global_words() # `words` is a new list with both words
Global scope: Assigning to or modifying an attribute of the function itself
This is a rare special case, but now that you've seen the other examples, the theory should be clear. In Python, functions are mutable (i.e. you can set attributes on them); and if we define a function at top level, it's in the global namespace. So this is really just modifying a global:
def set_own_words():
    set_own_words.words = ['hello', 'world']

set_own_words()
print(*set_own_words.words)
We shouldn't really use this to send information to the caller. It has all the usual problems with globals, and it's even harder to understand. But it can be useful to set a function's attributes from within the function, in order for the function to remember something in between calls. (It's similar to how methods remember things in between calls by modifying self.) The functools standard library does this, for example in the cache implementation.
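For the "remember something in between calls" use, a sketch might look like the following. The function name `count_calls` and its `calls` attribute are hypothetical; this is not how functools is implemented, only the same general idea.

```python
def count_calls():
    # Store state on the function object itself, creating it on first use.
    count_calls.calls = getattr(count_calls, "calls", 0) + 1
    return count_calls.calls


count_calls()
count_calls()
print(count_calls.calls)  # 2
```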
Builtin scope
This doesn't work. The builtin namespace doesn't contain any mutable objects, and you can't assign new builtin names (they'll go into the global namespace instead).
Some approaches that don't work in Python
Just calculating something before the function ends
In some other programming languages, there is some kind of hidden variable that automatically picks up the result of the last calculation, every time something is calculated; and if you reach the end of a function without returning anything, it gets returned. That doesn't work in Python. If you reach the end without returning anything, your function returns None.
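A quick sketch of the difference, with two made-up functions: the first evaluates an expression on its last line but never returns it.

```python
def computes_but_forgets():
    total = 1 + 2
    total * 10  # evaluated, then discarded


def computes_and_returns():
    total = 1 + 2
    return total * 10


print(computes_but_forgets())  # None
print(computes_and_returns())  # 30
```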
Assigning to the function's name
In some other programming languages, you are allowed (or expected) to assign to a variable with the same name as the function; and at the end of the function, that value is returned. That still doesn't work in Python. If you reach the end without returning anything, your function still returns None.
def broken():
    broken = 1

broken()
print(broken + 1) # causes a `TypeError`
It might seem like you can at least use the value that way, if you use the global keyword:
def subtly_broken():
    global subtly_broken
    subtly_broken = 1

subtly_broken()
print(subtly_broken + 1) # 2
But this, of course, is just a special case of assigning to a global. And there's a big problem with it - the same name can't refer to two things at once. By doing this, the function replaced its own name. So it will fail next time:
def subtly_broken():
    global subtly_broken
    subtly_broken = 1

subtly_broken()
subtly_broken() # causes a `TypeError`
Assigning to a parameter
Sometimes people expect to be able to assign to one of the function's parameters, and have it affect a variable that was used for the corresponding argument. However, this does not work:
def broken(words):
    words = ['hello', 'world']

data = ['hello']
broken(data) # `data` does not change
Just like how Python returns values, not variables, it also passes values, not variables. words is a local name; by definition the calling code doesn't know anything about that namespace.
One of the working methods that we saw is to modify the passed-in list. That works because if the list itself changes, then it changes - it doesn't matter what name is used for it, or what part of the code uses that name. However, assigning a new list to words does not cause the existing list to change. It just makes words start being a name for a different list.
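The contrast between rebinding the name and changing the list itself can be sketched with slice assignment, which replaces the contents of the existing list object. The function names are illustrative only.

```python
def rebinds(words):
    words = ["hello", "world"]  # only the local name changes


def replaces_contents(words):
    words[:] = ["hello", "world"]  # the existing list object is modified


data = ["hello"]
rebinds(data)
print(data)  # ['hello']

replaces_contents(data)
print(data)  # ['hello', 'world']
```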
For more information, see How do I pass a variable by reference?.
1 At least, not for getting the value back. If you want to use keyword arguments, you need to know what the keyword names are. But generally, the point of functions is that they're an abstraction; you only need to know about their interface, and you don't need to think about what they're doing internally.
2 In 2.x, print is a statement rather than a function, so this doesn't make an example of calling another function directly. However, print foo() still works with 2.x's print statement, and so does print(foo()) (in this case, the extra parentheses are just ordinary grouping parentheses). Aside from that, 2.7 (the last 2.x version) has been unsupported since the beginning of 2020 - which was nearly a 5 year extension of the normal schedule. But then, this question was originally asked in 2010.
3 Again: if the purpose of a method is to update the object, don't also return a value. Some people like to return self so that you can "chain" method calls; but in Python this is considered poor style. If you want that kind of "fluent" interface, then instead of writing methods that update self, write methods that create a new, modified instance of the class.
4 Except, of course, that if we're modifying a value rather than assigning, we don't need either keyword.
5 There's also a locals() that gives you a dict of local variables. However, this cannot be used to make new local variables - the behaviour is undefined in 2.x, and in 3.x the dict is created on the fly and assigning to it has no effect. Some of Python's optimizations depend on the local variables for a function being known ahead of time.
>>> def foo():
...     return 'hello world'
...
>>> x = foo()
>>> x
'hello world'
You can use the global statement to achieve what you want without returning a value from the function. For example:
def foo():
    global x
    x = "hello world"

foo()
print(x)
The above code will print "hello world".
But please be warned that using global is not a good idea at all, and it is better to avoid the usage shown in my example.
Also check this related discussion about the usage of the global statement in Python.

Python and reference passing. Limitation?

I would like to do something like the following:
class Foo(object):
    def __init__(self):
        self.member = 10
        pass

def factory(foo):
    foo = Foo()

aTestFoo = None
factory(aTestFoo)
print(aTestFoo.member)
However, it crashes with AttributeError: 'NoneType' object has no attribute 'member':
the object aTestFoo has not been modified inside the call to the function factory.
What is the Pythonic way of doing that? Is it a pattern to avoid? If it is a common mistake, what is it called?
In C++, in the function prototype, I would have added a reference to the pointer to be created in the factory... but maybe this is not the kind of thing I should think about in Python.
In C#, there's the keyword ref that allows modifying the reference itself, which is really close to the C++ way. I don't know about Java... and I do wonder about Python.
Python does not have pass by reference. One of the few things it shares with Java, by the way. Some people describe argument passing in Python as call by value (and define the values as references, where reference does not mean what it means in C++); some people describe it as pass by reference, with reasoning I find quite questionable (they redefine the term to mean what Python calls a "reference", and end up with something that has nothing to do with what has been known as pass by reference for decades); others go for terms which are not as widely used and abused (popular examples are "{pass,call} by {object,sharing}"). See Call By Object on effbot.org for a rather extensive discussion of the definitions of the various terms, their history, and the flaws in some of the arguments for the terms pass by reference and pass by value.
The short story, without naming it, goes like this:
Every variable, object attribute, collection item, etc. refers to an object.
Assignment, argument passing, etc. create another variable, object attribute, collection item, etc. which refers to the same object but has no knowledge which other variables, object attributes, collection items, etc. refer to that object.
Any variable, object attribute, collection item, etc. can be used to modify an object, and any other variable, object attribute, collection item, etc. can be used to observe that modification.
No variable, object attribute, collection item, etc. refers to another variable, object attribute, collection items, etc. and thus you can't emulate pass by reference (in the C++ sense) except by treating a mutable object/collection as your "namespace". This is excessively ugly, so don't use it when there's a much easier alternative (such as a return value, or exceptions, or multiple return values via iterable unpacking).
You may consider this like using pointers, but not pointers to pointers (but sometimes pointers to structures containing pointers) in C. And then passing those pointers by value. But don't read too much into this simile. Python's data model is significantly different from C's.
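To make the two options concrete, here is a rough sketch of the "mutable object as namespace" workaround next to the plain return value. The `Foo` class mirrors the question; the function names and the dict key `"foo"` are assumptions for illustration.

```python
class Foo:
    def __init__(self):
        self.member = 10


def factory_via_container(holder):
    # The caller's dict is shared, so storing into it is visible outside.
    holder["foo"] = Foo()


def factory_via_return():
    # The simpler alternative: hand the new object back to the caller.
    return Foo()


namespace = {}
factory_via_container(namespace)
print(namespace["foo"].member)  # 10

a_test_foo = factory_via_return()
print(a_test_foo.member)  # 10
```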
You are making a mistake here because in Python
"We call the argument passing technique _call by sharing_, because the argument objects are shared between the caller and the called routine. This technique does not correspond to most traditional argument passing techniques (it is similar to argument passing in LISP). In particular it is not call by value because mutations of arguments performed by the called routine will be visible to the caller. And it is not call by reference because access is not given to the variables of the caller, but merely to certain objects."
in Python, the variables in the formal argument list are bound to the actual argument objects. the objects are shared between caller and callee; there are no "fresh locations" or extra "stores" involved.
(which, of course, is why the CLU folks called this mechanism "call-by-sharing".)
and btw, Python functions don't run in an extended environment, either. function bodies have very limited access to the surrounding environment.
The Assignment Statements section of the Python docs might be interesting.
The = statement in Python acts differently depending on the situation, but in the case you present, it just binds the new object to a new local variable:
def factory(foo):
    # This makes a new instance of Foo,
    # and binds it to a local variable `foo`,
    foo = Foo()

# This binds `None` to a top-level variable `aTestFoo`
aTestFoo = None
# Call `factory` with first argument of `None`
factory(aTestFoo)
print(aTestFoo.member)
Although it can potentially be more confusing than helpful, the dis module can show you the byte-code representation of a function, which can reveal how Python works internally. Here is the disassembly of factory:
>>> import dis
>>> dis.dis(factory)
  4           0 LOAD_GLOBAL              0 (Foo)
              3 CALL_FUNCTION            0
              6 STORE_FAST               0 (foo)
              9 LOAD_CONST               0 (None)
             12 RETURN_VALUE
What that says is, Python loads the global Foo class by name (0), and calls it (3, instantiation and calling are very similar), then stores the result in a local variable (6, see STORE_FAST). Then it loads the default return value None (9) and returns it (12)
What is the Pythonic way of doing that? Is it a pattern to avoid? If it is a common mistake, what is it called?
Factory functions are rarely necessary in Python. In the occasional case where they are necessary, you would just return the new instance from your factory (instead of trying to assign it to a passed-in variable):
class Foo(object):
    def __init__(self):
        self.member = 10
        pass

def factory():
    return Foo()

aTestFoo = factory()
print(aTestFoo.member)
Your factory method doesn't return anything - and by default it will have a return value of None. You assign aTestFoo to None, but never re-assign it - which is where your actual error is coming from.
Fixing these issues:
class Foo(object):
    def __init__(self):
        self.member = 10
        pass

def factory(obj):
    return obj()

aTestFoo = factory(Foo)
print(aTestFoo.member)
This should do what I think you are after, although such patterns (i.e. factory methods) are not that typical in Python.

Python class function default variables are class objects? [duplicate]

I was writing some code this afternoon, and stumbled across a bug in my code. I noticed that the default values for one of my newly created objects was carrying over from another object! For example:
class One(object):
    def __init__(self, my_list=[]):
        self.my_list = my_list
one1 = One()
print(one1.my_list)  # [] (empty list, what you'd expect)
one1.my_list.append('hi')
print(one1.my_list)  # ['hi'] (list with the new value in it, what you'd expect)
one2 = One()
print(one2.my_list)  # ['hi'] (Hey! It saved the variable from the other One!)
So I know it can be solved by doing this:
class One(object):
    def __init__(self, my_list=None):
        self.my_list = my_list if my_list is not None else []
What I would like to know is... Why? Why are Python classes structured so that the default values are saved across instances of the class?
This is a known consequence of how Python default values work, and it often surprises the unwary. The empty list object [] is created once, when the function is defined, rather than each time it is called.
To fix it, try:
def __init__(self, my_list=None):
    if my_list is None:
        my_list = []
    self.my_list = my_list
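A small sketch can make the "created at definition time" point visible. The names make_default and append_item are hypothetical, introduced only for this illustration: the helper records every time the default expression is actually evaluated.

```python
calls = []

def make_default():
    # Hypothetical helper: records each time a default list is built.
    calls.append("built")
    return []

def append_item(item, bucket=make_default()):
    # `bucket` defaults to the single list built when `def` ran.
    bucket.append(item)
    return bucket

first = append_item("a")
second = append_item("b")

print(len(calls))       # 1: the default expression ran once, at definition
print(first is second)  # True: both calls used the same list object
print(second)           # ['a', 'b']
```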
Several others have pointed out that this is an instance of the "mutable default argument" issue in Python. The basic reason is that the default arguments have to exist "outside" the function in order to be passed into it.
But the real root of this as a problem has nothing to do with default arguments. Any time it would be bad if a mutable default value was modified, you really need to ask yourself: would it be bad if an explicitly provided value was modified? Unless someone is extremely familiar with the guts of your class, the following behaviour would also be very surprising (and therefore lead to bugs):
>>> class One(object):
...     def __init__(self, my_list=[]):
...         self.my_list = my_list
...
>>> alist = ['hello']
>>> one1 = One(alist)
>>> alist.append('world')
>>> one2 = One(alist)
>>>
>>> print(one1.my_list) # Huh? This isn't what I initialised one1 with!
['hello', 'world']
>>> print(one2.my_list) # At least this one's okay...
['hello', 'world']
>>> del alist[0]
>>> print(one2.my_list) # What the hell? I just modified a local variable and a class instance somewhere else got changed?
['world']
9 times out of 10, if you find yourself reaching for the "pattern" of using None as the default value and writing if value is None: value = default, you shouldn't be. You should simply not modify your arguments! Arguments should not be treated as owned by the called code unless it is explicitly documented as taking ownership of them.
In this case (especially because you're initialising a class instance, so the mutable variable is going to live a long time and be used by other methods and potentially other code that retrieves it from the instance) I would do the following:
class One(object):
    def __init__(self, my_list=[]):
        # Copy the input instead of keeping a reference to the caller's list.
        self.my_list = list(my_list)
Now you're initialising the data of your class from a list provided as input, rather than taking ownership of a pre-existing list. There's no danger that two separate instances end up sharing the same list, nor that the list is shared with a variable in the caller which the caller may want to continue using. It also has the nice effect that your callers can provide tuples, generators, strings, sets, dictionaries, home-brewed custom iterable classes, etc, and you know you can still count on self.my_list having an append method, because you made it yourself.
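As a quick check of those claims, here is a short sketch that reuses the copying constructor from above. The variable names one1 to one4 are just for illustration.

```python
class One(object):
    def __init__(self, my_list=[]):
        # Copy the input; the default [] is never mutated, so sharing it is harmless.
        self.my_list = list(my_list)

alist = ['hello']
one1 = One(alist)
alist.append('world')      # only the caller's list changes

one2 = One(('a', 'b'))     # any iterable works, e.g. a tuple
one2.my_list.append('c')

one3 = One()
one3.my_list.append('x')   # mutates one3's own copy, not the default
one4 = One()

print(one1.my_list)  # ['hello']
print(one2.my_list)  # ['a', 'b', 'c']
print(one4.my_list)  # []
```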
There's still a potential problem here: if the elements contained in the list are themselves mutable, then the caller and this instance can still accidentally interfere with each other. I find that this is not often a problem in practice in my code (so I don't automatically take a deep copy of everything), but you have to be aware of it.
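To illustrate the difference, here is a minimal sketch comparing the shallow copy made by list() with copy.deepcopy from the standard library. The variable names are made up for the example.

```python
import copy

inner = ['hello']
outer = [inner]

shallow = list(outer)          # new outer list, same inner list object
deep = copy.deepcopy(outer)    # new outer list and new inner list

inner.append('world')

print(shallow[0])  # ['hello', 'world']: still shared with the caller
print(deep[0])     # ['hello']: fully independent
```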
Another issue is that if my_list can be very large, the copy can be expensive. There you have to make a trade-off. In that case, maybe it is better to just use the passed-in list after all, and use the if my_list is None: my_list = [] pattern to prevent all default instances sharing the one list. But if you do that you need to make it clear, either in documentation or the name of the class, that callers are relinquishing ownership of the lists they use to initialise the instance. Or, if you really want to be constructing a list solely for the purpose of wrapping up in an instance of One, maybe you should figure out how to encapsulate the creation of the list inside the initialisation of One, rather than constructing it first; after all, it's really part of the instance, not an initialising value. Sometimes this isn't flexible enough though.
And sometimes you really honestly do want to have aliasing going on, and have code communicating by mutating values they both have access to. I think very hard before I commit to such a design, however. And it will surprise others (and you when you come back to the code in X months), so again documentation is your friend!
In my opinion, educating new Python programmers about the "mutable default argument" gotcha is actually (slightly) harmful. We should be asking them "Why are you modifying your arguments?" (and then pointing out the way default arguments work in Python). The very fact of a function having a sensible default argument is often a good indicator that it isn't intended as something that receives ownership of a pre-existing value, so it probably shouldn't be modifying the argument whether or not it got the default value.
Basically, Python function objects store a tuple of default arguments, which is fine for immutable things like integers, but lists and other mutable objects are often modified in place, resulting in the behavior you observed.
This is standard behavior of default arguments anywhere in Python, not just in classes.
For more explanation, see Mutable defaults for function/method arguments.
Python functions are objects. Default arguments of a function are attributes of that function. So if the default value of an argument is mutable and it's modified inside your function, the changes are reflected in subsequent calls to that function.
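You can inspect this directly: the defaults live in the function's __defaults__ attribute. A minimal sketch, with a hypothetical function remember used only for illustration:

```python
def remember(item, seen=[]):
    # `seen` is the one list stored on the function object itself.
    seen.append(item)
    return seen

print(remember.__defaults__)   # ([],)
remember('a')
remember('b')
print(remember.__defaults__)   # (['a', 'b'],)

# The list returned by a call is the very object stored in the tuple.
print(remember('c') is remember.__defaults__[0])  # True
```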
Not an answer, but it's worth noting that this is also true for class variables defined outside any method.
Example:
>>> class one:
...     myList = []
...
>>>
>>> one1 = one()
>>> one1.myList
[]
>>> one2 = one()
>>> one2.myList.append("Hello Thar!")
>>>
>>> one1.myList
['Hello Thar!']
>>>
Note that not only does the value of myList persist, but the myList of every instance refers to the same list.
I ran into this bug/feature myself, and spent something like 3 hours trying to figure out what was going on. It's rather challenging to debug when you are getting valid data, but it's not from the local computations, but previous ones.
It's made worse since this is not just a default argument. You can't just put myList in the class definition; it has to be set equal to something, and whatever it is set equal to is only evaluated once.
The solution, at least for me, was to simply create all of these variables inside __init__, so that each instance gets its own.
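A side-by-side sketch of the two approaches. The class names Shared and Separate are hypothetical, chosen only to label the behaviour.

```python
class Shared(object):
    my_list = []            # class attribute: one list for every instance

class Separate(object):
    def __init__(self):
        self.my_list = []   # instance attribute: a new list per instance

s1, s2 = Shared(), Shared()
s2.my_list.append('Hello Thar!')

p1, p2 = Separate(), Separate()
p2.my_list.append('Hello Thar!')

print(s1.my_list)  # ['Hello Thar!']
print(p1.my_list)  # []
```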
