Weird behaviour. Functions outside of class change its instance variables [duplicate] - python

This question already has answers here:
How do I pass a variable by reference?
(39 answers)
Closed 8 months ago.
I am not sure I understand the concept of Python's call by object style of passing function arguments (explained here http://effbot.org/zone/call-by-object.htm). There don't seem to be enough examples to clarify this concept well (or my google-fu is probably weak! :D)
I wrote this little contrived Python program to try to understand this concept
def foo( itnumber, ittuple, itlist, itdict ):
    itnumber +=1
    print id(itnumber) , itnumber
    print id(ittuple) , ittuple
    itlist.append(3.4)
    print id(itlist) , itlist
    itdict['mary'] = 2.3
    print id(itdict), itdict
# Initialize a number, a tuple, a list and a dictionary
tnumber = 1
print id( tnumber ), tnumber
ttuple = (1, 2, 3)
print id( ttuple ) , ttuple
tlist = [1, 2, 3]
print id( tlist ) , tlist
tdict = tel = {'jack': 4098, 'sape': 4139}
print id( tdict ), tdict
print '-------'
# Invoke a function and test it
foo(tnumber, ttuple, tlist , tdict)
print '-------'
#Test behaviour after the function call is over
print id(tnumber) , tnumber
print id(ttuple) , ttuple
print id(tlist) , tlist
print id(tdict), tdict
The output of the program is
146739376 1
3075201660 (1, 2, 3)
3075103916 [1, 2, 3]
3075193004 {'sape': 4139, 'jack': 4098}
---------
146739364 2
3075201660 (1, 2, 3)
3075103916 [1, 2, 3, 3.4]
3075193004 {'sape': 4139, 'jack': 4098, 'mary': 2.3}
---------
146739376 1
3075201660 (1, 2, 3)
3075103916 [1, 2, 3, 3.4]
3075193004 {'sape': 4139, 'jack': 4098, 'mary': 2.3}
As you can see, except for the integer that was passed, the object ids (which, as I understand it, refer to memory locations) remain unchanged.
So in the case of the integer, it was (effectively) passed by value, and the other data structures were (effectively) passed by reference. I tried changing the list, the number and the dictionary just to test whether the data structures were changed in place. The number was not, but the list and the dictionary were.
I use the word "effectively" above, since the 'call-by-object' style of argument passing seems to behave both ways depending on the data structure passed in the above code.
For more complicated data structures (say numpy arrays etc.), is there any quick rule of thumb to recognize which arguments will be passed by reference and which ones passed by value?

The key difference is that in C-style language, a variable is a box in memory in which you put stuff. In Python, a variable is a name.
Python is neither call-by-reference nor call-by-value. It's something much more sensible! (In fact, I learned Python before I learned the more common languages, so call-by-value and call-by-reference seem very strange to me.)
In Python, there are things and there are names. Lists, integers, strings, and custom objects are all things. x, y, and z are names. Writing
x = []
means "construct a new thing [] and give it the name x". Writing
x = []
foo = lambda x: x.append(None)
foo(x)
means "construct a new thing [] with name x, construct a new function (which is another thing) with name foo, and call foo on the thing with name x". Now foo just appends None to whatever it received, so this reduces to "append None to the empty list". Writing
x = 0
def foo(x):
    x += 1
foo(x)
means "construct a new thing 0 with name x, construct a new function foo, and call foo on x". Inside foo, the assignment just says "rebind the local name x to 1 plus what it used to be", but that doesn't change the thing 0.
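The names-and-things picture can be checked directly. Here is a minimal Python 3 sketch; the function names rebind and mutate are made up for this illustration and do not come from the question:

```python
def rebind(x):
    # Rebinding: the local name x now refers to a new int object;
    # the caller's object is untouched.
    x += 1
    return x


def mutate(x):
    # Mutation: the list object itself is changed, so every name
    # bound to it sees the change.
    x.append(None)
    return x


n = 0
items = []

rebound = rebind(n)
same_list = mutate(items)

print(n, rebound)            # 0 1
print(items)                 # [None]
print(same_list is items)    # True
```

The caller's n still names 0, while items and same_list are two names for one mutated list.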

Others have already posted good answers. One more thing that I think will help:
x = expr
evaluates expr and binds x to the result. On the other hand:
x.operate()
does something to x and hence can change it (resulting in the same underlying object having a different value).
The funny cases come in with things like:
x += expr
which translates into either x = x + expr (rebinding) or x = x.__iadd__(expr) (modifying in place, then rebinding to the result), sometimes in very peculiar ways:
>>> x = 1
>>> x += 2
>>> x
3
(so x was rebound, since integers are immutable)
>>> x = ([1], 2)
>>> x
([1], 2)
>>> x[0] += [3]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> x
([1, 3], 2)
Here x[0], which is itself mutable, was mutated in place; but then Python also attempted to assign the result back to x[0] (as with x.__setitem__), which raised an error because tuples are immutable. But by then x[0] was already mutated!
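A small Python 3 sketch of the two behaviours of +=, using an alias to tell rebinding from mutation. The variable names are illustrative only:

```python
# += on a list calls list.__iadd__, which extends the list in place.
a = [1]
alias = a
a += [2]
print(alias)          # [1, 2]
print(a is alias)     # True

# += on an int cannot mutate, so the name is rebound to a new object.
n = 1
old = n
n += 2
print(old, n)         # 1 3

# The tuple case: the inner list is extended first, then the
# assignment back into the tuple fails.
t = ([1], 2)
try:
    t[0] += [3]
except TypeError as exc:
    error = str(exc)
print(t)              # ([1, 3], 2)
```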

Numbers, strings, and tuples in Python are immutable; using augmented assignment will rebind the name.
Your other types are merely mutated, and remain the same object.

Related

List is unchanged even after element is changed

While trying to implement an algorithm, I couldn't get Python lists to mutate via a function. After I read up on the issue, this StackOverflow answer suggested using [:] in order to mutate the array passed in the function argument.
However, as seen in the following code snippet, the issue still persists when trying to mutate the list l. I am expecting the output to be Before: [1, 2, 3, 4] After: [69, 69, 69, 69], but instead I get back the original value of l as shown below.
def mutate_list(a, b):
    c = [69] * 4
    a[:] = c[:2] # changed the elements, but array's still unchanged outside function
    b[:] = c[2:]
if __name__ == '__main__':
    l = [1, 2, 3, 4]
    print("Before: {}" .format(l))
    mutate_list(l[:2], l[2:])
    print("After: {}" .format(l))
Output:
Before: [1, 2, 3, 4]
After: [1, 2, 3, 4]
Any insights into why this is happening?
The error is that you are not actually passing l but two slices of it, which are new lists. You should change it, for example:
def mutate_list(a):
    c = [69] * 4
    a[:2] = c[:2]
    a[2:] = c[2:]
if __name__ == '__main__':
    l = [1, 2, 3, 4]
    print("Before: {}" .format(l))
    mutate_list(l)
    print("After: {}" .format(l))
It's all about scope: mutation applies to the list object, not to the variable that refers to it.
The variables a and b are local variables, so their scope is always the function.
The operations you performed:
a[:]=c[:2]
b[:]=c[2:]
Note: a and b are both lists here, so inside the function you get:
[69,69],[69,69]
but if you join them with the + operator, the output is:
[69,69,69,69]
Everything above happens in local scope. If you want the change to be visible across the program, declare the list as global inside the function and make your changes to that variable. In this case you also don't need to pass any arguments:
def mutate_list():
    global l # telling python to use this global variable in a local function
    c = [69] * 4
    l=c # assigning new values to actual list i.e l
Now the output before will be [1,2,3,4]
and after will be [69,69,69,69]
As pointed out by others, the issue arose from the fact that the function parameters were slices of the original array. Slicing creates new list objects, so the function was mutating copies rather than the original (in effect, as if the parameters had been passed by value instead of by reference).
According to @Selcuk's suggestion, the correct way of doing such an operation would be to pass the original array along with its indices to the function and then perform any slicing inside the function.
NOTE: This concept comes in handy for (recursive) divide-and-conquer algorithms where subarrays must be mutated and combined to form the solution.
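One possible shape for the index-passing approach described above, as a Python 3 sketch. The helper name fill_range and its parameters are hypothetical and are not taken from the original answers:

```python
def fill_range(a, lo, hi, value):
    """Overwrite a[lo:hi] in place with copies of value."""
    # Slice assignment on the caller's list mutates that same object.
    a[lo:hi] = [value] * (hi - lo)


l = [1, 2, 3, 4]
mid = len(l) // 2
fill_range(l, 0, mid, 69)       # left half
fill_range(l, mid, len(l), 69)  # right half
print(l)  # [69, 69, 69, 69]
```

Because the function receives the original list and only the indices vary, a recursive divide-and-conquer routine could call it on each half without ever copying the data.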

Why do python lists act like this when using the = operator [duplicate]

This question already has answers here:
Variable assignment and modification (in python) [duplicate]
(6 answers)
Closed 4 years ago.
How come the following code:
a = [1,2,3]
b = a
b[0] = 3
print(a)
prints the list after it has been altered through b, i.e. [3, 2, 3]?
And if that is the case, why is it that the following code:
a = [1,2,3]
b = a
b = [0,0,0]
print(a,b)
prints [1, 2, 3] [0, 0, 0]?? This seems inconsistent. If the first code is true, then shouldn't the second code print [0,0,0][0,0,0]? Can someone please provide an explanation for this?
In python there are two types of data... mutable and immutable. Numbers, strings, boolean, tuples, and other simple types are immutable. Dicts, lists, sets, objects, classes, and other complex types are mutable.
When you say:
a = [1,2,3]
b = a
You've created a single mutable list in memory, assigned a to point to it, and then assigned b to point to it. It's the same thing in memory.
Therefore when you mutate it (modify it):
b[0] = 3
It is a modification (mutation) of the index [0] of the value which b points to at that same memory location.
However, when you replace it:
b = [0,0,0]
It is creating a new mutable list in memory and assigning b to point at it.
Check out the id() function. It will tell you the "address" of any variable. You can see which names are pointing to the same memory location with id(varname).
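For example, here is a short Python 3 sketch comparing identities before and after rebinding. The is operator compares object identity, which is the same thing id() reports:

```python
a = [1, 2, 3]
b = a
same_before = id(a) == id(b)   # both names refer to one list
b[0] = 3                       # mutation, visible through a

b = [0, 0, 0]                  # rebinding: b now names a new list
same_after = a is b

print(same_before, same_after)  # True False
print(a, b)                     # [3, 2, 3] [0, 0, 0]
```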
Bonus: Every value in python is passed by reference... meaning that when you assign it to a variable it simply causes that variable to point to that value where it was in memory. Having immutable types allows python to "reuse" the same memory location for common immutable types.
Consider some common values when the interpreter starts up:
>>> import sys
>>> sys.getrefcount('abc')
68
>>> sys.getrefcount(100)
110
>>> sys.getrefcount(2)
6471
However, a value that is definitely not present would return 2. This has to do with the fact that a couple of references to that value were in-use during the call to sys.getrefcount
>>> sys.getrefcount('nope not me. I am definitely not here already.')
2
Notice that an empty tuple has a lot of references:
>>> sys.getrefcount(tuple())
34571
But an empty list has no extra references:
>>> sys.getrefcount(list())
1
Why is this? Because tuple is immutable so it is fine to share that value across any number of variables. However, lists are mutable so they MUST NOT be shared across arbitrary variables or changes to one would affect the others.
Incidentally, this is also why you must NEVER use mutable types as default argument values to functions. Consider this innocent little function:
>>> def foo(value=[]):
...     value.append(1)
...     print(value)
...
When you call it you might expect to get [1] printed...
>>> foo()
[1]
However, when you call it again, you probably won't expect to get [1, 1] out...
>>> foo()
[1, 1]
And on and on...
>>> foo()
[1, 1, 1]
>>> foo()
[1, 1, 1, 1]
WHY IS THIS? Because default arguments to functions are evaluated once during function definition, and not at function run time. That way if you use a mutable value as a default argument value, then you will be stuck with that one value, mutating in unexpected ways as the function is called multiple times.
The proper way to do it is this:
>>> def foo(value=None):
...     if value is None:
...         value = []
...     value.append(1)
...     print(value)
...
>>> foo()
[1]
>>> foo()
[1]
>>> foo()
[1]
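The shared default can also be observed directly, since Python stores default values on the function object in its __defaults__ attribute. A Python 3 sketch, where append_one is a stand-in name that returns the list instead of printing it:

```python
def append_one(value=[]):
    # The default list is created once, when the def statement runs.
    value.append(1)
    return value


first = append_one()
second = append_one()

print(first is second)           # True: the same list both times
print(append_one.__defaults__)   # ([1, 1],)
```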

Passing variable and list to function in Python

I don't understand this behaviour:
def getvariable(v):
    v += 1
def getlist(l):
    l.append(8)
myvariable = 1
mylist = [5, 6, 7]
print myvariable, mylist
getvariable(myvariable)
getlist(mylist)
print myvariable, mylist
Output:
1 [5, 6, 7]
1 [5, 6, 7, 8]
Why did the list change, but the variable didn't?
How can I change the variable inside a function?
Many people talk about passing by value, by reference, or by object reference, so I am a bit confused and don't know how it really works.
In python integers are immutable. v += 1 only binds a new integer value to the name v, which is local in your function. It does not modify the integer in place.
Lists in python are mutable. You pass a list (by reference, as always in python), and the function changes it in place. That's why the change is "seen" externally to the function.
There is no such thing as "passing by value" in python.
What you probably want to do is return v+1 from your function, not to modify the value bound to the name v.
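A minimal Python 3 sketch of that suggestion; incremented is an illustrative name, not one from the question:

```python
def incremented(v):
    # Return the new value rather than trying to change the caller's name.
    return v + 1


myvariable = 1
myvariable = incremented(myvariable)  # rebind the name at the call site
print(myvariable)  # 2
```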
Because lists are mutable but integers are immutable.
Read more about it here: http://docs.python.org/2/reference/datamodel.html#objects-values-and-types

Why does 'remove' mutate a list when called locally? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Understanding Python's call-by-object style of passing function arguments
I recently came across this:
x = [1,2,3]
def change_1(x):
    x = x.remove(x[0])
    return x
Which results in:
>>> change_1(x)
>>> x
[2, 3]
I find this behavior surprising, as I thought that whatever goes on inside a function has no effect on outside variables. Furthermore, I constructed an example where I basically do the same thing but without using remove:
x = [1,2,3]
def change_2(x):
    x = x[1:]
    return x
Which results in:
>>> change_2(x)
[2, 3] # Also the output prints out here not sure why this is
>>> x
[1, 2, 3]
And I get the result that I would expect, the function does not change x.
So it must be something remove specific that has effect. What is going on here?
One of the things that is confusing is that you've called lots of different things 'x'. For example:
def change_1(x): # the x parameter is a reference to the global 'x' list
    x = x.remove(x[0]) # on the left, though, is a new 'x' that is local to the function
    return x # the local x is returned
>>> x = [1, 2, 3]
>>> y = change_1(x) # assign the return value to 'y'
>>> print y
None # this is None because x.remove() assigned None to the local 'x' inside the function
>>> print x
[2, 3] # but x.remove() modified the global x inside the function
def change_2(x):
    x = x[1:] # again, x on left is local, it gets a copy of the slice, but the 'x' parameter is not changed
    return x # return the slice (copy)
>>> x = [1, 2, 3]
>>> y = change_2(x)
>>> print x
[1, 2, 3] # the global 'x' is not changed!
>>> print y
[2, 3] # but the slice created in the function is assigned to 'y'
You would get the same result if you called the parameter of your function n, or q.
It's not the variable name that's being affected. The x in the scope of your function and the x outside that scope are two different "labels". However, since you passed the value that x was attached to into change_1(), they both refer to the same object. When you do x.remove() on the object in the function, you are basically saying: "get the object that x refers to. Now remove an element from that object." This is very different from a left-hand-side assignment. If you did y=0; x=y inside your function, you would not be doing anything to the list object. You would basically just be tearing the 'x' label off of [1, 2, 3] and putting it on whatever y is pointing to.
Your variable x is just a reference to the list you've created. When you call that function, you are passing that reference by value. But in the function, you have a reference to the same list, so when the function modifies it, it is modified in every scope.
Also, when executing commands in the interactive interpreter, python prints the return value, if it is not assigned to a variable.
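Both points can be verified in a few lines. A Python 3 sketch with illustrative names: list.remove mutates in place and returns None, while slicing builds a new list.

```python
x = [1, 2, 3]
result = x.remove(x[0])   # mutates x, returns None
print(result, x)          # None [2, 3]

y = [1, 2, 3]
tail = y[1:]              # a new list; y is untouched
print(tail, y)            # [2, 3] [1, 2, 3]
print(tail is y)          # False
```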

Weird behavior: Lambda inside list comprehension [duplicate]

This question already has answers here:
Creating functions (or lambdas) in a loop (or comprehension)
(6 answers)
Closed 8 months ago.
In python 2.6:
[x() for x in [lambda: m for m in [1,2,3]]]
results in:
[3, 3, 3]
I would expect the output to be [1, 2, 3]. I get the exact same problem even with a non list comprehension approach. And even after I copy m into a different variable.
What am I missing?
To make the lambdas remember the value of m, you could use an argument with a default value:
[x() for x in [lambda m=m: m for m in [1,2,3]]]
# [1, 2, 3]
This works because default values are set once, at definition time. Each lambda now uses its own default value of m instead of looking for m's value in an outer scope at lambda execution time.
The effect you’re encountering comes from closures: when you define a function that references non-local variables, the function retains a reference to the variable rather than getting its own copy. To illustrate, I’ll expand your code into an equivalent version without comprehensions or lambdas.
inner_list = []
for m in [1, 2, 3]:
    def Lambda():
        return m
    inner_list.append(Lambda)
So, at this point, inner_list has three functions in it, and each function, when called, will return the value of m. But the salient point is that they all see the very same m, even though m is changing, they never look at it until called much later.
outer_list = []
for x in inner_list:
    outer_list.append(x())
In particular, since the inner list is constructed completely before the outer list starts getting built, m has already reached its last value of 3, and all three functions see that same value.
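The expanded version above can be run as a self-contained Python 3 sketch, together with the default-argument variant for comparison. The names collect, late and early are made up here, and the loop is wrapped in a function so that m is a true closure variable:

```python
def collect():
    late_funcs = []
    early_funcs = []
    for m in [1, 2, 3]:
        def late():
            # m is looked up when the function is called, not when defined.
            return m

        def early(m=m):
            # The default captures the current value of m at definition time.
            return m

        late_funcs.append(late)
        early_funcs.append(early)
    # All calls happen after the loop has finished, when m is 3.
    return [f() for f in late_funcs], [f() for f in early_funcs]


late_results, early_results = collect()
print(late_results)    # [3, 3, 3]
print(early_results)   # [1, 2, 3]
```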
Long story short, you don't want to do this. More specifically, what you're encountering is an order of operations problem. You're creating three separate lambdas that all return m, but none of them are called immediately. Then, when you get to the outer list comprehension and they're all called, the residual value of m is 3, the last value from the inner list comprehension.
-- For comments --
>>> [lambda: m for m in range(3)]
[<function <lambda> at 0x021EA230>, <function <lambda> at 0x021EA1F0>, <function <lambda> at 0x021EA270>]
Those are three separate lambdas.
And, as further evidence:
>>> [id(m) for m in [lambda: m for m in range(3)]]
[35563248, 35563184, 35563312]
Again, three separate IDs.
Look at the __closure__ of the functions. All 3 point to the same cell object, which keeps a reference to m from the outer scope:
>>> print(*[x.__closure__[0] for x in [lambda: m for m in [1,2,3]]], sep='\n')
<cell at 0x00D17610: int object at 0x1E2139A8>
<cell at 0x00D17610: int object at 0x1E2139A8>
<cell at 0x00D17610: int object at 0x1E2139A8>
If you don't want your functions to take m as a keyword argument, as per unubtu's answer, you could instead use an additional lambda to evaluate m at each iteration:
>>> [x() for x in [(lambda x: lambda: x)(m) for m in [1,2,3]]]
[1, 2, 3]
Personally, I find this a more elegant solution. Lambda returns a function, so if we want to use the function, then we should use it. It's confusing to use the same symbol for the 'anonymous' variable in the lambda and for the generator, so in my example I use a different symbol to make it hopefully more clear.
>>> [ (lambda a:a)(i) for i in range(3)]
[0, 1, 2]
>>>
it's faster too.
>>> import timeit
>>> timeit.timeit('[(lambda a:a)(i) for i in range(10000)]',number=10000)
9.231263160705566
>>> timeit.timeit('[lambda a=i:a for i in range(10000)]',number=10000)
11.117988109588623
>>>
but not as fast as map:
>>> timeit.timeit('map(lambda a:a, range(10000))',number=10000)
5.746963977813721
(I ran these tests more than once and the result was the same. This was done in Python 2.7; results are different in Python 3: the two list comprehensions are much closer in performance and both a lot slower, while map remains much faster.)
@unubtu's answer is correct. I recreated the scenario in Groovy with closures. Perhaps it illustrates what is going on.
This is analogous to [x() for x in [lambda: m for m in [1,2,3]]]
arr = []
x = 0
while (x < 3) {
    x++
    arr.add({ -> x })
}
arr.collect { f -> f() } == [3, 3, 3]
This is analogous to [x() for x in [lambda m=m: m for m in [1,2,3]]]
arr = []
x = 0
while (x < 3) {
    x++
    arr.add({_x -> { -> _x }}(x))
}
arr.collect { f -> f() } == [1, 2, 3]
Note that this would not happen if I used [1,2,3].each {x -> ... } instead of a while loop. Groovy while loops and Python list comprehensions both share a single closure variable between iterations.
I noticed that too. I concluded that the lambda is created only once. So in fact your inner list comprehension will give 3 identical functions, all related to the last value of m.
Try it and check the id() of the elements.
[Note: this answer is not correct; see the comments]
