Say I have a list:
L = [1,2,3]
and I assigned L[0] to a variable a
a = L[0]
then if I change a, it won't affect L.
a = a + 1
print(L)  # [1, 2, 3], not affected
Why is this happening? Isn't Python passing everything around by reference? I thought that a was pointing to L[0].
The point is that a and L[0] are references to the same immutable object, so rebinding either one of them won't affect the other reference:
>>> L = [1, 2, [3]]
>>> a = L[0]
>>> a = a + 1
a now points to a new object, while L[0] still points to the same object.
>>> a, L[0]
(2, 1)
Now in this case b and L[2] are references to a mutable object (a list), so any in-place operation on it is visible through all the references:
>>> b = L[2]
>>> b.append(4) #list.append is an in-place operation
>>> b, L[2]
([3, 4], [3, 4])
>>> b = b + [5] #mutable object, but not an in-place operation
>>> b #b is assigned to new list object
[3, 4, 5]
>>> L[2] #L[2] is unchanged
[3, 4]
L[0] works like a name, and when you create the list L, you assign an object to that name, the integer 1. a is also a name, and when you assign a as in a = L[0], you make a point to the same object that L[0] points to.
But when you later do a = a + 1, this is another assignment. You are not modifying the object that a points to -- the = sign can't do that. You are creating a new object, the integer 2, and assigning that to a.
So in the end, you have two objects in memory; one is referred to by L[0] and the other is referred to by a.
Integers are immutable, which means that there is no possible way to change the properties of the objects in this example; however, that's not salient in this example exactly, because even if the object was mutable it wouldn't change the fact that you're doing assignment (with the = sign). In a case where the object in question was mutable, you could theoretically change the properties of the object when it is still referenced by L[0] and a, instead of doing any additional assignment with = as you are doing. At that point, you would see the properties change regardless of which name you used to inspect the object.
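A minimal sketch of the difference between rebinding and mutating, using the is operator to check whether two names refer to the same object. The names same_before, same_after and inner are only illustrative and are not part of the question.

```python
L = [1, 2, 3]
a = L[0]
same_before = a is L[0]      # both names refer to the same int object

a = a + 1                    # rebinding: a now refers to a new int object
same_after = a is L[0]
print(same_before, same_after, L)   # True False [1, 2, 3]

# With a mutable element, mutating (no "=" on the name) is visible via both.
L = [[1], 2, 3]
inner = L[0]
inner.append(99)             # mutates the list shared by inner and L[0]
print(L)                     # [[1, 99], 2, 3]

inner = inner + [100]        # rebinding again: L is not affected
print(L)                     # [[1, 99], 2, 3]
```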
Since the object in L[0] is immutable in your case, "changing" a can't affect the value of L[0]. When you do a = a + 1, a new object is created and a starts pointing to it.
See what happens if L[0] is of a mutable type:
>>> L = [[1],2,3]
>>> a = L[0]
>>> a.append(2)
>>> L
[[1, 2], 2, 3]
In this case a and L[0] both point to the same object.
Also see Raymond Hettinger's answer in the relevant thread.
Change the assignment to:
a = L
then when you change L as:
L[0] += 1
you will see that a also changes. This is the reference magic.
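As a short sketch of that suggestion (nothing here beyond the two names from the question):

```python
L = [1, 2, 3]
a = L            # a and L are now two names for one list object
L[0] += 1        # item assignment changes the list itself
print(a)         # [2, 2, 3]
print(a is L)    # True
```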
>>> a = [1,2,3]
>>> b = []
>>> b.append(a)
>>> print(b)
[[1, 2, 3]]
>>> num = a.pop(0)
>>> a.append(num)
>>> print(a)
[2, 3, 1]
>>> b.append(a)
>>> print(b)
[[2, 3, 1], [2, 3, 1]]
>>>
Why is this happening and how to fix it? I need the list like
[[1, 2, 3], [2, 3, 1]]
Thank you.
Edit:
Also, why is this working?
>>> a = []
>>> b = []
>>> a = [1,2,3]
>>> b.append(a)
>>> a = [1,2,3,4]
>>> b.append(a)
>>> print(b)
[[1, 2, 3], [1, 2, 3, 4]]
>>>
Append a copy of your list a, at least the first time. Otherwise, you've appended the same list both times.
b.append(a[:])
When you append the list a, Python stores a reference to that same list object inside the list b; it does not copy it. So when you later modify a in place, the change shows up in b as well. To get the desired result, append a copy of the list instead.
Every variable name in Python should be thought of as a reference to a piece of data. In your first listing, b contains two references to the same underlying object that is also referenced by the name a. That object gets changed in-place by the operations you’re using to rotate its members. The effect of that change is seen when you look at either of the two references to the object found in b, or indeed when you look at the reference associated with the name a.
That they are identical can be seen by using the id() function: id(a), id(b[0]) and id(b[1]) all return the same number, which is the unique identifier of the underlying list object that they all refer to. Or you can use the is operator: b[0] is b[1] evaluates to True.
By contrast, in the second listing, you reassign a. In other words, by using the assignment operator = you cause that name to become associated with a different object: in this case, a new list object that you just created with your square-bracketed literal expression. b still contains one reference to the old list, and now you append a new reference that points to this different piece of underlying data. So the two elements of b now look different from each other, and indeed they are different objects and accordingly have different id() numbers, only one of which is the same as the current id(a). b[0] is b[1] now evaluates to False.
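Both listings can be checked with is and id() along these lines. The names a2 and b2 are introduced here only to keep the second listing separate from the first.

```python
# First listing: a is rotated in place, so b holds the same object twice.
a = [1, 2, 3]
b = []
b.append(a)
a.append(a.pop(0))
b.append(a)
print(b[0] is b[1])                    # True
print(id(a) == id(b[0]) == id(b[1]))   # True

# Second listing: the name is rebound to a brand-new list before appending.
a2 = [1, 2, 3]
b2 = [a2]
a2 = [1, 2, 3, 4]
b2.append(a2)
print(b2[0] is b2[1])                  # False
print(b2)                              # [[1, 2, 3], [1, 2, 3, 4]]
```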
How to fix it? Reassign the name a before changing it: for example, create a copy:
a = list(a)
or:
import copy
a = copy.copy(a)
(or you could even use copy.deepcopy(); study the difference). Alternatively, rotate the members of a using methods that entail reassignment rather than in-place changes, e.g.:
a = a[1:] + a[:1]
(NB immutable objects such as the tuple avoid this whole confusion —not because they behave fundamentally differently but because they lack methods that produce in-place changes and therefore force you to use reassignment strategies.)
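Put together, the reassignment approach could look like the sketch below. The second half is only an illustration of how copy.copy() and copy.deepcopy() differ once the list holds mutable items; nested, shallow and deep are made-up names.

```python
import copy

a = [1, 2, 3]
b = [a]
a = a[1:] + a[:1]        # builds a new list and rebinds a; b[0] is untouched
b.append(a)
print(b)                 # [[1, 2, 3], [2, 3, 1]]

# Shallow and deep copies only differ when the list contains mutable items.
nested = [[1, 2], [3]]
shallow = copy.copy(nested)      # new outer list, same inner lists
deep = copy.deepcopy(nested)     # new outer list and new inner lists
nested[0].append(99)
print(shallow[0])        # [1, 2, 99]
print(deep[0])           # [1, 2]
```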
Besides copying a with a[:] before appending it to b, you can also use collections.deque.rotate to rotate your list:
from collections import deque
a = [1, 2, 3]
# Make a deque from a copy of a
b = deque(a[:])
# Rotate the deque right by len(a) - 1 steps (same as one step left)
b.rotate(len(a) - 1)
# Build the list of lists and print it
print([a, list(b)])
# [[1, 2, 3], [2, 3, 1]]
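A slightly shorter variant, as a sketch: deque(a) already copies the items out of a, and a negative argument to rotate() rotates to the left. The name d is just illustrative.

```python
from collections import deque

a = [1, 2, 3]
d = deque(a)       # the deque holds its own sequence; a is left alone
d.rotate(-1)       # negative n rotates left by one step
print([a, list(d)])   # [[1, 2, 3], [2, 3, 1]]
```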
In Python, lists are passed by reference to functions, right?
If that is so, what's happening here?
>>> def f(a):
...     print(a)
...     a = a[:2]
...     print(a)
...
>>> b = [1,2,3]
>>> f(b)
[1, 2, 3]
[1, 2]
>>> print(b)
[1, 2, 3]
>>>
In the statement:
a = a[:2]
you are rebinding the local name a (local to f()) to a new list object. It reuses the name of the input argument, but the list that the caller passed in is left untouched.
That is, what you are doing is equivalent to:
def f(a):
    print(a)
    b = a[:2]
    print(b)
Instead, you should be changing a in place such as:
def f(a):
    print(a)
    a[:] = a[:2]
    print(a)
When you do:
a = a[:2]
it reassigns a to a new value (The first two items of the list).
All Python arguments are passed by reference: the function receives a reference to the same object. You need to change the object that a refers to, instead of making a refer to a new object:
a[2:] = []
# or
del a[2:]
# or
a[:] = a[:2]
Where the first and last assign to slices of the list, changing the list in-place (affecting its value), and the middle one also changes the value of the list, by deleting the rest of the elements.
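To see the effect from the caller's side, the three in-place variants can be wrapped in small functions and compared with plain rebinding. The function names below are made up for this sketch.

```python
def truncate_slice_delete(a):
    a[2:] = []        # assign an empty list to the tail slice

def truncate_del(a):
    del a[2:]         # delete the tail slice

def truncate_slice_assign(a):
    a[:] = a[:2]      # replace the whole contents with the first two items

def truncate_rebind(a):
    a = a[:2]         # only rebinds the local name; the caller sees nothing

results = []
for func in (truncate_slice_delete, truncate_del,
             truncate_slice_assign, truncate_rebind):
    b = [1, 2, 3]
    func(b)
    results.append(b)

print(results)   # [[1, 2], [1, 2], [1, 2], [1, 2, 3]]
```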
Indeed, the objects are passed by reference, but a = a[:2] just rebinds the local name a to a new list that holds a slice of the original.
To modify the list object in place, you can assign to a slice of it (slice assignment).
Consider a and b here as equivalent to your global b and local a; assigning a new object to a doesn't affect b:
>>> a = b = [1, 2, 3]
>>> a = a[:2] # The identifier `a` now points to a new object, nothing changes for `b`.
>>> a, b
([1, 2], [1, 2, 3])
>>> id(a), id(b)
(4370921480, 4369473992) # `a` now points to a different object
Slice assignment works as expected:
>>> a = b = [1, 2, 3]
>>> a[:] = a[:2] # Updates the object in-place, hence affects all references.
>>> a, b
([1, 2], [1, 2])
>>> id(a), id(b)
(4370940488, 4370940488) # Both still point to the same object
Related: What is the difference between slice assignment that slices the whole list and direct assignment?
I am trying to write a function which removes the first item in a Python list. This is what I've tried. Why doesn't remove_first_wrong change l when I call the function on it? And why does the list slicing approach work when I do it in the main function?
def remove_first_wrong(lst):
    lst = lst[1:]
def remove_first_right(lst):
    lst.pop(0)
if __name__ == '__main__':
    l = [1, 2, 3, 4, 5]
    remove_first_wrong(l)
    print(l)
    l_2 = [1, 2, 3, 4, 5]
    remove_first_right(l_2)
    print(l_2)
    # Why does this work and remove_first_wrong doesn't?
    l_3 = [1, 2, 3, 4, 5]
    l_3 = l_3[1:]
    print(l_3)
Slicing a list returns a new list object, which contains a copy of the elements you selected with the slice. You then rebound lst (a local name in the function) to reference that new list instead. The old list is never altered in that process.
list.pop() on the other hand, operates on the list object itself. It doesn't matter what reference you used to reach the list.
You'd see the same thing without functions:
>>> a = [1, 2]
>>> b = a[:] # slice with all the elements, produces a *copy*
>>> b
[1, 2]
>>> a.pop() # removing an element from a won't change b
2
>>> b
[1, 2]
>>> a
[1]
Using [:] is one of several ways of making a shallow copy of a list, see How to clone or copy a list?
You may want to read or watch Ned Batchelder's Names and Values presentation, to further help understand how Python names and objects work.
Inside the function remove_first_wrong, the = sign rebinds the name lst to the object on the right, which is a brand-new object created by the slicing operation lst[1:]. Thus, the object that lst now refers to exists only inside that function (and it will actually disappear on return).
That is what Martijn means by "You then rebound lst (a local name in the function) to reference that new list instead."
On the contrary, lst.pop(0) is a method call on the given object: it operates on the object itself.
For example, this will work right too:
def remove_first_right2(lst):
    x = lst  # x is assigned to the same object as lst
    x.pop(0)  # pop the item from the object
Alternatively, you can use the del statement:
def remove_first_element(lst):
    del lst[0]
    return lst
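If the caller's list should not be mutated at all, another option is to return the slice and let the caller rebind, which is what l_3 = l_3[1:] does in the main block of the question. This is only a sketch; without_first and remove_first_in_place are hypothetical names.

```python
def without_first(lst):
    """Return a new list without the first item; lst itself is unchanged."""
    return lst[1:]

def remove_first_in_place(lst):
    """Remove the first item from the list object that was passed in."""
    del lst[0]

l = [1, 2, 3, 4, 5]
shorter = without_first(l)
print(l, shorter)        # [1, 2, 3, 4, 5] [2, 3, 4, 5]

remove_first_in_place(l)
print(l)                 # [2, 3, 4, 5]
```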
>>> a = [3, 2]
>>> a[0:1][0] = 1
>>> a
[3, 2]
>>> a[0:1] = [1]
>>> a
[1, 2]
What does a[0:1] mean?
If it's a pointer to the range of a, then a[0:1][0] = 1 should change the value of a.
If it's a copy of the range of a, then a[0:1] = [1] shouldn't change the value of a.
I think the result of the two is inconsistent with each other. Could you please help me work out the problem?
Internally, this is a big difference:
>>> a = [3, 2]
>>> a[0:1][0] = 1
is a shorthand for
temp = a[0:1]
temp[0] = 1
and is internally expressed as
a.__getitem__(slice(0, 1)).__setitem__(0, 1)
or, for the two-step version:
temp = a.__getitem__(slice(0, 1))
temp.__setitem__(0, 1)
so it reads a part of the list, which creates a separate object, does the assignment on that object, and then drops it.
On the other hand,
>>> a[0:1] = [1]
does
a.__setitem__(slice(0, 1), [1])
which just operates on the original object.
So, while they look similar, these expressions differ in what they mean.
Let's test that:
class Itemtest(object):
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.name
    def __setitem__(self, item, value):
        # called for obj[item] = value
        print("__setitem__", self, item, value)
    def __getitem__(self, item):
        # called for obj[item]; hands back a separate object
        print("__getitem__", self, item)
        return Itemtest("inner")
a = Itemtest("outer")
a[0:1] = [4]
temp = a[0:1]
temp[0] = 4
a[0:1][0] = 4
outputs
__setitem__ outer slice(0, 1, None) [4]
__getitem__ outer slice(0, 1, None)
__setitem__ inner 0 4
__getitem__ outer slice(0, 1, None)
__setitem__ inner 0 4
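The same thing can be observed on a real list by keeping hold of the temporary. A small sketch, reusing the name temp from above:

```python
a = [3, 2]
temp = a[0:1]        # __getitem__ with a slice: a new list, [3]
temp[0] = 1          # changes only the temporary list
print(temp, a)       # [1] [3, 2]
print(temp is a)     # False

a[0:1] = [1]         # __setitem__ with a slice: changes a itself
print(a)             # [1, 2]
```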
Slicing a list creates a shallow copy; it is not a reference to the original. So the slice you get is not bound to the original list a. You can change a single element of it, but since the copy is not stored in a variable, the change is lost and the original list is unaffected.
To clarify, in the former case you are calling __getitem__, which accesses part of the list (as a copy):
a[0:1][0] = 1
You are editing the slice a[0:1], which is only a shallow copy of part of a, so a itself is not edited.
But in the latter case you are calling __setitem__, which edits the object in place:
a[0:1] = [1]
You are directly referring to and editing part of a, so a itself changes.
The following statement:
>>> a[0:1] = [1]
replaces the slice of a from index 0 up to (but not including) index 1 with the contents of the list [1].
By doing a[0:1][0] you are getting the first element of the temporary list [3], which is 3. Assigning 1 to it changes only that temporary list, which is thrown away immediately, so a is unaffected. However, if you stick to a[0:1], you are addressing that part of a itself, which can be replaced with [1]. Hope that helps.
Examples
>>> a = [1, 2, 3, 4]
>>> a[1:4]
[2, 3, 4]
>>> a[1:4] = [6, 5, 4, 3, 2]
>>> a
[1, 6, 5, 4, 3, 2]
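As the example suggests, the replacement does not need to have the same length as the slice, and the list keeps its identity throughout. A brief sketch (alias is an illustrative name):

```python
a = [1, 2, 3, 4]
alias = a
a[1:4] = [6, 5, 4, 3, 2]   # three items replaced by five
print(a)                   # [1, 6, 5, 4, 3, 2]
print(alias is a)          # True

a[1:3] = []                # assigning an empty list deletes the slice
print(alias)               # [1, 4, 3, 2]
```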
I was told that += can have different effects than the standard notation i = i + 1. Is there a case in which i += 1 would be different from i = i + 1?
This depends entirely on the object i.
+= calls the __iadd__ method (if it exists, falling back on __add__ if it doesn't), whereas + calls the __add__ method [1] or the __radd__ method in a few cases [2].
From an API perspective, __iadd__ is supposed to be used for modifying mutable objects in place (returning the object which was mutated), whereas __add__ should return a new instance of something. For immutable objects, both operators produce a new instance, but += will put the new instance in the current namespace with the same name that the old instance had. This is why
i = 1
i += 1
seems to increment i. In reality, you get a new integer and assign it "on top of" i -- losing one reference to the old integer. In this case, i += 1 is exactly the same as i = i + 1. But, with most mutable objects, it's a different story:
As a concrete example:
a = [1, 2, 3]
b = a
b += [1, 2, 3]
print(a) # [1, 2, 3, 1, 2, 3]
print(b) # [1, 2, 3, 1, 2, 3]
compared to:
a = [1, 2, 3]
b = a
b = b + [1, 2, 3]
print(a) # [1, 2, 3]
print(b) # [1, 2, 3, 1, 2, 3]
Notice how in the first example, since b and a reference the same object, when I use += on b, it actually changes the list that b refers to (and a sees that change too; after all, it's referencing the same list). In the second case, however, when I do b = b + [1, 2, 3], this takes the list that b is referencing and concatenates it with a new list [1, 2, 3]. It then stores the concatenated list in the current namespace as b, with no regard for what b was the line before.
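The __iadd__/__add__ contract described above can be sketched with a small user-defined class. Bag is a hypothetical container invented for this illustration, not something from the standard library.

```python
class Bag:
    """Hypothetical mutable container, used only for illustration."""

    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        # "+" builds and returns a new Bag; self is left alone.
        return Bag(self.items + list(other))

    def __iadd__(self, other):
        # "+=" mutates self in place and returns it.
        self.items.extend(other)
        return self


x = Bag([1])
alias = x
x += [2]                                  # in place: alias sees the change
print(alias is x, alias.items)            # True [1, 2]

x = x + [3]                               # new object: alias is left behind
print(alias is x, alias.items, x.items)   # False [1, 2] [1, 2, 3]

# int defines no __iadd__, so "+=" on an int falls back to __add__.
print(hasattr(int, "__iadd__"), hasattr(list, "__iadd__"))   # False True
```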
[1] In the expression x + y, if x.__add__ isn't implemented or if x.__add__(y) returns NotImplemented and x and y have different types, then x + y tries to call y.__radd__(x). So, in the case where you have
foo_instance += bar_instance
if Foo doesn't implement __add__ or __iadd__ then the result here is the same as
foo_instance = bar_instance.__radd__(foo_instance)
[2] In the expression foo_instance + bar_instance, bar_instance.__radd__ will be tried before foo_instance.__add__ if the type of bar_instance is a subclass of the type of foo_instance (e.g. issubclass(Bar, Foo)) and it provides its own __radd__. The rationale for this is that Bar is in some sense a "higher-level" object than Foo, so Bar should get the option of overriding Foo's behavior.
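Both footnotes can be checked with a few toy classes. This is only a sketch: Foo and Bar follow the naming used above, while Base and Derived are extra hypothetical names for the subclass case.

```python
class Foo:
    """Defines neither __add__ nor __iadd__."""

class Bar:
    def __radd__(self, other):
        return ("Bar.__radd__", type(other).__name__)

foo_instance = Foo()
bar_instance = Bar()
foo_instance += bar_instance      # falls all the way back to Bar.__radd__
print(foo_instance)               # ('Bar.__radd__', 'Foo')


class Base:
    def __add__(self, other):
        return "Base.__add__"

class Derived(Base):
    def __radd__(self, other):
        return "Derived.__radd__"

# The right operand is an instance of a subclass that supplies __radd__,
# so its method is tried before Base.__add__.
result = Base() + Derived()
print(result)                     # Derived.__radd__
```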
Under the covers, i += 1 does something like this:
try:
i = i.__iadd__(1)
except AttributeError:
i = i.__add__(1)
While i = i + 1 does something like this:
i = i.__add__(1)
This is a slight oversimplification, but you get the idea: Python gives types a way to handle += specially, by creating an __iadd__ method as well as an __add__.
The intention is that mutable types, like list, will mutate themselves in __iadd__ (and then return self, unless you're doing something very tricky), while immutable types, like int, will just not implement it.
For example:
>>> l1 = []
>>> l2 = l1
>>> l1 += [3]
>>> l2
[3]
Because l2 is the same object as l1, and you mutated l1, you also mutated l2.
But:
>>> l1 = []
>>> l2 = l1
>>> l1 = l1 + [3]
>>> l2
[]
Here, you didn't mutate l1; instead, you created a new list, l1 + [3], and rebound the name l1 to point at it, leaving l2 pointing at the original list.
(In the += version, you were also rebinding l1, it's just that in that case you were rebinding it to the same list it was already bound to, so you can usually ignore that part.)
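One place where that rebinding step cannot be ignored is a list stored inside a tuple. In the sketch below (t and error are illustrative names), the in-place extension succeeds, and only then does the assignment back into the tuple fail.

```python
t = ([1],)
error = None
try:
    t[0] += [2]      # list.__iadd__ runs first, then "t[0] = ..." is attempted
except TypeError as exc:
    error = exc

print(type(error).__name__)   # TypeError
print(t)                      # ([1, 2],)
```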
Here is an example that directly compares i += x with i = i + x:
def foo(x):
    x = x + [42]  # rebinds the local name only
def bar(x):
    x += [42]  # mutates the list that was passed in
c = [27]
foo(c)  # c is not changed
bar(c)  # c is changed to [27, 42]