shallow copy in python - python

I am a little confused about how shallow copy works. My understanding is that when we do new_obj = copy.copy(mutable_obj), a new object is created whose elements still point to the elements of the old object.
Example of where I am confused -
## assignment
i = [1, 2, 3]
j = i
id(i[0]) == id (j[0]) # True
i[0] = 10
i # [10, 2, 3]
j # [10, 2, 3]
## shallow copy
import copy
k = copy.copy(i)
k # [10, 2, 3]
id(i) == id(k) # False (as these are two separate objects)
id(i[0]) == id(k[0]) # True (as they reference the same location, right?)
i[0] = 100
id(i[0]) == id (k[0]) # False (why did that value in that loc change?)
id(i[:]) == id (k[:]) # True (why is this still true if an element just changed?)
i # [100, 2, 3]
k # [10, 2, 3]
In shallow copy, isn't k[0] just pointing to i[0] similar to assignment? Shouldn't k[0] change when i[0] changes?
Why I expect these to be same, because -
i = [1, 2, [3]]
k = copy.copy(i)
i # [1, 2, [3]]
k # [1, 2, [3]]
i[2].append(4)
i # [1, 2, [3, 4]]
k # [1, 2, [3, 4]]
id(i[0]) == id (k[0]) # True
id(i[2]) == id (k[2]) # True
id(i[:]) == id (k[:]) # True

id(i) == id(k) # False (as these are two separate objects)
Correct.
id(i[0]) == id (k[0]) # True (as they reference the same location, right?)
Correct.
i[0] = 100
id(i[0]) == id (k[0]) # False (why did that value in that loc change?)
It changed because you changed it in the previous line. i[0] was pointing to 10, but you changed it to point to 100. Therefore, i[0] and k[0] no longer point to the same object.
Pointers (references) are one way. 10 does not know what is pointing to it. Neither does 100. They are just objects in memory. So if you change where i's first element is pointing, k doesn't care (since k and i are not the same list). k's first element is still pointing to what it was always pointing to.
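As a minimal sketch of the point above (using `is` rather than comparing `id` values, which is an assumption about style, not something from the question), rebinding one slot of `i` leaves the corresponding slot of `k` alone:

```python
import copy

i = [10, 2, 3]
k = copy.copy(i)      # new outer list holding the same element references

assert i is not k     # two distinct list objects
assert i[0] is k[0]   # both slots refer to the very same int object

i[0] = 100            # rebinds slot 0 of i only; the int 10 is untouched

print(i)  # [100, 2, 3]
print(k)  # [10, 2, 3]
```

The assignment replaces a reference stored in `i`; it never reaches into the object that `k[0]` still refers to.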
id(i[:]) == id (k[:]) # True (why is this still true if an element just changed?)
This one's a bit more subtle, but note that:
>>> id([1,2,3,4,5]) == id([1,2,3])
True
whereas
>>> x = [1,2,3,4,5]
>>> y = [1,2,3]
>>> id(x) == id(y)
False
It has to do with some subtleties of garbage collection and id, and it's answered in depth here: Unnamed Python objects have the same id.
Long story short, when you say id([1,2,3,4,5]) == id([1,2,3]), the first thing that happens is we create [1,2,3,4,5]. Then we grab where it is in memory with the call to id. However, [1,2,3,4,5] is anonymous, so as soon as id returns nothing refers to it any more and CPython reclaims it immediately (its reference count drops to zero). Then, we create another anonymous object, [1,2,3], and CPython happens to decide that it should go in the spot that it just cleaned up. [1,2,3] is also immediately deleted and cleaned up. If you store the references, though, both objects are alive at the same time, and then the ids are different.
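Applied to the slices from the question, a small sketch (the names `i_slice` and `k_slice` are hypothetical) shows that keeping the temporaries alive gives them distinct ids:

```python
i = [100, 2, 3]
k = [10, 2, 3]

# Bind each slice to a name so both temporary lists stay alive together.
i_slice = i[:]
k_slice = k[:]

# Two objects that exist at the same time can never share an id.
slices_distinct = id(i_slice) != id(k_slice)
print(slices_distinct)  # True
```

Whether the unnamed form `id(i[:]) == id(k[:])` prints True depends on how the interpreter reuses memory, so it should not be relied on either way.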
Mutables example
The same thing happens with mutable objects if you reassign them. Here's an example:
>>> import copy
>>> a = [ [1,2,3], [4,5,6], [7,8,9] ]
>>> b = copy.copy(a)
>>> a[0].append(123)
>>> b[0]
[1, 2, 3, 123]
>>> a
[[1, 2, 3, 123], [4, 5, 6], [7, 8, 9]]
>>> b
[[1, 2, 3, 123], [4, 5, 6], [7, 8, 9]]
>>> a[0] = [123]
>>> b[0]
[1, 2, 3, 123]
>>> a
[[123], [4, 5, 6], [7, 8, 9]]
>>> b
[[1, 2, 3, 123], [4, 5, 6], [7, 8, 9]]
The difference is when you say a[0].append(123), we're modifying whatever a[0] is pointing to. It happens to be the case that b[0] is pointing to the same object (a[0] and b[0] are references to the same object).
But if you point a[0] to a new object (through assignment, as in a[0] = [123]), then b[0] and a[0] no longer point to the same place.

In Python all things are objects. This includes integers. All lists only hold references to objects. Replacing an element of the list doesn't mean that the element itself changes.
Consider a different example:
class MyInt:
    def __init__(self, v):
        self.v = v
    def __repr__(self):
        return str(self.v)
>>> i = [MyInt(1), MyInt(2), MyInt(3)]
>>> i
[1, 2, 3]
>>> j = i[:] # This achieves the same as copy.copy(i)
>>> j
[1, 2, 3]
>>> j[0].v = 7
>>> j
[7, 2, 3]
>>> i
[7, 2, 3]
>>> i[0] = MyInt(1)
>>> i
[1, 2, 3]
>>> j
[7, 2, 3]
I am creating a class MyInt here which just holds an int.
By modifying an instance of the class, both lists "change". However as I replace a list entry, the lists are now different.
The same happens with integers. You just can't modify them.

In the first case, j = i is an assignment, so both j and i point to the same list object. When you change an element of that list and print i and j, both print the same output, because there is only one list and it is the list's element that has changed.
In the second case, k = copy.copy(i) is a shallow copy: a new list object is created and the references it holds are copied, but the objects those references point to are not copied.
A shallow copy doesn't create a copy of nested objects; instead it just copies the references to the nested objects. Please refer to https://www.programiz.com/python-programming/shallow-deep-copy
Thus i and k have different sets of references pointing to the same int objects. When you do i[0] = 100, the reference in list i points to a new int object with value 100, but the reference in k still points to the old int object with value 10.

Related

Function is changing both lists instead of one list based on '0' in Python

I'm a bit confused about why both outputs get changed after the restore when surely it should be just one(outputs are illustrated in the notes). Surely just the first one should change? If anyone could give me a suggestion as to why this happens I'd appreciate it
def switcher(y):
    # shifts two characters
    temp = y[0]
    y[0] = y[1]
    y[1] = temp
sub = [[1,2,3],[1,2,3]]
switcher(sub[0])
sub
#[[2, 1, 3], [1, 2, 3]]
#restore
sub[0] = sub[1]
sub
# [[1, 2, 3], [1, 2, 3]]
switcher(sub[0])
sub
#[[2, 1, 3], [2, 1, 3]]
With sub[0] = sub[1] you are defining both lists to be the same object, that's why the subsequent change is applied to both of them. Do sub[0] = sub[1][:] to create a copy, for example (there are more ways of doing this for a list).
When you are doing sub[0] = sub[1], you are assigning the reference to the value at index 1 i.e. [1, 2, 3] in your case to index 0, so ultimately both the lists reside in the same memory location, and change in either makes the corresponding change to the other.
You can verify this using id builtin which represents the memory reference for a given value:
ids after initialization:
>>> sub = [[1,2,3],[1,2,3]]
>>> [id(s) for s in sub]
[1385461417096, 1385461338824]
ids after calling switcher:
>>> switcher(sub[0])
>>> [id(s) for s in sub]
[1385461417096, 1385461338824]
ids after assigning sub[0] = sub[1]:
>>> sub[0] = sub[1]
>>> [id(s) for s in sub]
[1385461338824, 1385461338824]
As you can see, the ids are the same after assigning sub[0] = sub[1], so both sub-lists appear to change when you modify one of them.
The offending line is the assignment
sub[0] = sub[1]
Assignment never copies data.
You are telling Python that the references sub[0] and sub[1] now both point to the same list object in memory (with the content [1,2,3]).
In your specific case this is easily fixed by taking a (shallow) copy of the list on the right hand side of the assignment.
sub[0] = sub[1][:]
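A short sketch of the fix in context, assuming the same `switcher` behaviour as in the question (written here with tuple swapping):

```python
def switcher(y):
    # swap the first two items of the list in place
    y[0], y[1] = y[1], y[0]

sub = [[1, 2, 3], [1, 2, 3]]
switcher(sub[0])

sub[0] = sub[1][:]            # restore from a copy, not from an alias
assert sub[0] is not sub[1]   # still two separate list objects

switcher(sub[0])
print(sub)  # [[2, 1, 3], [1, 2, 3]]
```

Because the slice builds a new list, the second call to `switcher` only touches `sub[0]`.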
You have a problem with references. By defining:
sub = [[1,2,3],[1,2,3]]
you create a list of two different lists, but when you do:
sub[0] = sub[1]
you are telling Python to make sub[0] refer to the same list object as sub[1]; nothing is copied. Hence for Python your new list will be:
sub <- [ reference_to_memory_where_sub1_is, sub1 ]
To avoid this behaviour you can explicitly tell python to duplicate the objects in memory. You can do this with the module "copy":
import copy
def switcher(y):
    # shifts two characters
    temp = y[0]
    y[0] = y[1]
    y[1] = temp
l1 = [1,2,3]
l2 = [1,2,3]
sub = [copy.deepcopy(l1),copy.deepcopy(l2)]
switcher(sub[0])
print(sub)
#[[2, 1, 3], [1, 2, 3]]
#restore
sub[0] = l1
print(sub)
# [[1, 2, 3], [1, 2, 3]]
switcher(sub[0])
print(sub)
#[[2, 1, 3], [1, 2, 3]]

Why is append and concat giving me different results? [duplicate]

I have a function that generates all permutations of a string. It prints out all the possible permutations just fine. But now I want a list of all such permutations.
I tried making the global list as well as tried passing it as a parameter, but post appending the permutation all the lists previously in the main list get changed to the list last appended. Please explain this behavior
def permutationNum(a,lower,upper,perm):
    if(lower==upper):
        print(a)
        print(perm)
        perm.append(a)
        # perm = perm.append(a)
    else:
        for i in range(lower,upper+1):
            a[lower],a[i] = a[i],a[lower]
            permutationNum(a,lower+1,upper, perm)
            a[lower],a[i] = a[i],a[lower]
listy = [1,2,3]
perm = []
permutationNum(listy, 0, len(listy)-1, perm)
print(perm)
Output : [[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 2, 3]]
Expected Output : [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 2, 1], [3, 1, 2]]
UPDATE:
Turns out it was indeed deep copy problem after all. I just had a temp variable store a deep copy of a and appended that temp variable to the list. It all worked out.
In Python, arguments are never copied when passed into functions; the function receives a reference to the same object. What differs is whether the object can be changed in place.
If the object is immutable, nothing done inside the function can change it, so the original variable that was passed in is unaffected.
If the object is mutable, any in-place changes to it inside the function will affect the original.
Strings, ints and floats are immutable, while lists and most other objects are mutable. The same rules apply when assigning one variable to another:
a = 5
b = a
b = 6
print(a)
>>> 5
a = [5]
b = a
b.append(6)
print(a)
>>> [5, 6]
If you want to copy a list and not just reference it, there's multiple ways you can achieve this:
Copy Module
import copy
a = [5]
b = copy.copy(a)
b.append(6)
print(a)
>>> [5]
Slicing
a = [5]
b = a[:]
b.append(6)
print(a)
>>> [5]
.copy()
a = [5]
b = a.copy()
b.append(6)
print(a)
>>> [5]
list()
a = [5]
b = list(a)
b.append(6)
print(a)
>>> [5]
So in your case, you would change the following line from:
permutationNum(a,lower+1,upper, perm) to
permutationNum(a[:],lower+1,upper, perm)
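How arguments behave can be checked with a small sketch. The function names below are hypothetical and only serve the illustration: rebinding a parameter never affects the caller, while an in-place change to a list does.

```python
def rebind_int(value):
    value = value + 1    # binds the local name to a new int; caller unaffected

def rebind_list(items):
    items = items + [6]  # builds a new list and rebinds the local name only

def mutate_list(items):
    items.append(6)      # changes the list object the caller also refers to

n = 5
rebind_int(n)

a = [5]
rebind_list(a)
after_rebind = list(a)   # snapshot of a before the in-place change

mutate_list(a)
print(n, after_rebind, a)  # 5 [5] [5, 6]
```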
Change this line - to append new instance of list everytime
perm.append(list(a))
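Put together, a minimal sketch of the question's function with that one-line change (renamed to `permutation_num` here, and with the debugging prints dropped, both of which are my own choices):

```python
def permutation_num(a, lower, upper, perm):
    if lower == upper:
        perm.append(list(a))  # store a snapshot, not the list being swapped
    else:
        for i in range(lower, upper + 1):
            a[lower], a[i] = a[i], a[lower]
            permutation_num(a, lower + 1, upper, perm)
            a[lower], a[i] = a[i], a[lower]  # undo the swap

listy = [1, 2, 3]
perm = []
permutation_num(listy, 0, len(listy) - 1, perm)
print(perm)
# [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 2, 1], [3, 1, 2]]
```

Each stored permutation is its own list, so later swaps on `a` no longer rewrite earlier results.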
Another way to get permutations:
import itertools
def permutationNum(a):
    for x in itertools.permutations(a):
        perm.append(list(x))
listy = [1,2,3]
perm = []
permutationNum(listy)
print(perm)
or
import itertools
def permutationNum(a):
    return [list(x) for x in itertools.permutations(a)]
listy = [1,2,3]
print(permutationNum(listy))

Instance variable gets modified automatically

I made a new class that represents one position in the game of Tic Tac Toe. Basically what I'm trying to do is make a tree of all possibilities of game positions where each node is a Position object and find the best move for the player using minimax algorithm. The minimax algorithm isn't shown below as the Position class is not working as required.
The Position class has a generate_children method that makes a list of Position objects that can be reached from the current position. Executing the program we get the output that after each iteration the pos_matrix of the current Position object is changing which is undesirable. I have not touched the pos_matrix of the current Position object in the loop and play_move makes a copy of the matrix to avoid messing it up. Still the pos_matrix is changing each iteration.
What is happening? How do I debug it?
Tried: moved play_move out of class, didn't work.
Note: A 0 in the pos_matrix represents empty square while 1 represent "X" and -1 represents "O".
Also kiska_chance means "whose chance". :P
class Position:
    def __init__(self, parent_):
        self.parent = parent_
        self.children = []
        self.best_move = []
        self.pos_matrix = []
        self.last_move = []
    def set_pos_matrix(self, pos_matrix_):
        self.pos_matrix = list(pos_matrix_)
        # Avoiding copying problems by creating copy of list
    def set_last_move(self, last_move_):
        self.last_move = list(last_move_)
        # Avoiding copying problems by creating copy of list
    def play_move(self, move, kiska_chance):
        m2 = list(self.pos_matrix)
        x, y = move
        m2[x][y] = kiska_chance
        return m2
    def generate_children(self, kiska_chance):
        children_ = []
        for move in self.get_possible_moves():
            # Passing a Position object into the possible moves with
            # parent as self.
            pos_temp = Position(self)
            pos_temp.set_pos_matrix(self.play_move(move, kiska_chance))
            pos_temp.set_last_move(move)
            print self.pos_matrix
            children_.append(pos_temp)
        self.children = children_
        return children_
    def get_possible_moves(self):
        dem_moves = []
        for i in xrange(3):
            for j in xrange(3):
                if self.pos_matrix[i][j]==0:
                    dem_moves.append([i, j])
        return dem_moves
pos = Position(None)
pos.set_pos_matrix([[0, 0, 0],
                    [0, 0, 0],
                    [0, 0, 0]])
pos.generate_children(1)
You have nested lists in self.pos_matrix. You were only copying the outer list. Because of that all of the lists in the list are still shared by both lists. You need to copy the lists in the list. See corrected code:
def play_move(self, move, kiska_chance):
    m2 = list(list(l) for l in self.pos_matrix)
    x, y = move
    m2[x][y] = kiska_chance
    return m2
Also in:
def set_pos_matrix(self, pos_matrix_):
    self.pos_matrix = list(list(l) for l in pos_matrix_)
    # Avoiding copying problems by creating copy of list and lists in list
Generally you have to use deepcopy for this: lists are mutable objects, and a shallow copy still shares the inner lists with the original.
Lets see what happens for a list having both mutable and immutable objects.
>>> l1 = [1, 2]
>>> l2 = [3, 4]
>>> t1 = (1, 2, 3)
>>> l = [l1, l2, t1, 5]
>>> l
[[1, 2], [3, 4], (1, 2, 3), 5]
Here list l holds the lists l1 and l2, then the tuple t1, and then the number 5. Lists are mutable objects; numbers and tuples are not.
If you simply do list(l), you get a shallow copy. A shallow copy means that only the outermost object is copied; the inner objects are still the same objects.
You are better off using the copy module for this.
import copy
>>> l_copy = copy.copy(l)
We have now made a shallow copy. Let's see what happens.
>>> l_copy
[[1, 2], [3, 4], (1, 2, 3), 5]
We've got the same list as l. Let's try appending one value to the inner list.
>>> l_copy[0]
[1, 2]
>>> l_copy[0].append(0)
>>> l_copy
[[1, 2, 0], [3, 4], (1, 2, 3), 5]
Now l_copy[0], which was [1, 2], has one more value: [1, 2, 0]. If you check l and l1, you can see the same value there as well.
>>> l
[[1, 2, 0], [3, 4], (1, 2, 3), 5]
>>> l1
[1, 2, 0]
You didn't modify l and l1 directly, but they still got the new value 0. This is what happens with shared mutable objects.
Let's try deepcopy here. Note: you have to define l1, l2 and l again; don't reuse the old ones.
>>> l_copy = copy.deepcopy(l)
>>> l_copy
[[1, 2], [3, 4], (1, 2, 3), 5]
Now append a value to l_copy[0], which this time is a copy of l1.
>>> l_copy[0].append(0)
>>> l_copy
[[1, 2, 0], [3, 4], (1, 2, 3), 5]
And try checking with l1 and l..
>>> l
[[1, 2], [3, 4], (1, 2, 3), 5]
>>> l1
[1, 2]
As you can see, the value is now not reflected in l1 and l.
So you have to be careful with mutable objects. Refer to the doc below for more info about copy and deepcopy.
https://docs.python.org/2/library/copy.html
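Applied to the board from the question, a deepcopy-based move function might look like the sketch below. It is written as a standalone function with hypothetical parameter names rather than as the asker's method, so treat it as an illustration only:

```python
import copy

def play_move(pos_matrix, move, player):
    """Return a new board with the move applied; the input board is untouched."""
    new_matrix = copy.deepcopy(pos_matrix)  # copies the outer list and every row
    x, y = move
    new_matrix[x][y] = player
    return new_matrix

board = [[0, 0, 0],
         [0, 0, 0],
         [0, 0, 0]]
child = play_move(board, (1, 1), 1)

print(board)  # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
print(child)  # [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
```

For a list of flat rows, copying each row with list(row) gives the same result; deepcopy is the more general choice when the nesting depth is not known in advance.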

reverse method mutating input

For an assignment we were asked to create a function that would reverse all the elements in an arbitrarily nested list. So inputs to the function should return something like this:
>>> seq = [1,[2,[3]]]
>>> print arb_reverse(seq)
[[[3],2],1]
>>> seq = [9,[[],[0,1,[[],[2,[[],3]]]],[],[[[4],5]]]]
>>> print arb_reverse(seq)
[[[[5,[4]]],[],[[[[3,[]],2],[]],1,0],[]],9]
I came up with a recursive solution which works well:
def arb_reverse(seq):
    result = []
    for element in reversed(seq):
        if not is_list(element):
            result.append(element)
        else:
            result.append(arb_reverse(element))
    return result
But for a bit of a personal challenge I wanted to create a solution without the use of recursion. One version of this attempt resulted in some curious behavior which I am not understanding. For clarification, I was NOT expecting this version to work properly but the resulting input mutation does not make sense. Here is the iterative version in question:
def arb_reverse(seq):
    elements = list(seq) #so input is not mutated, also tried seq[:] just to be thorough
    result = []
    while elements:
        item = elements.pop()
        if isinstance(item, list):
            item.reverse() #this operation seems to be the culprit
            elements += item
        else:
            result.append(item)
    return result
This returns a flattened semi-reversed list (somewhat expected), but the interesting part is what it does to the input (not expected)...
>>> a = [1, [2, [3]]]
>>> arb_reverse(a)
[2, 3, 1]
>>> a
[1, [[3], 2]]
>>> p = [1, [2, 3, [4, [5, 6]]]]
>>> print arb_reverse(p)
[2, 3, 4, 5, 6, 1]
>>> print p
[1, [[[6, 5], 4], 3, 2]]
I was under the impression that by passing the values contained in the input to a variable using list() or input[:] as i did with elements, that I would avoid mutating the input. However, a few print statements later revealed that the reverse method had a hand in mutating the original list. Why is that?
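The list() call makes a new outer list, but its elements are references to the same inner lists as the original (a shallow copy), so reversing an inner list in place also changes the input.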
Try this (stolen from here):
from copy import deepcopy
listB = deepcopy(listA)
Try running the following code through this tool http://people.csail.mit.edu/pgbovine/python/tutor.html
o1 = [1, 2, 3]
o2 = [4, 5, 6]
l1 = [o1, o2]
l2 = list(l1)
l2[0].reverse()
print l2
print l1
Specifically look at what happens when l2[0].reverse() is called.
You'll see that when you call list() to create a copy of the list, the lists still reference the same objects.
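If the goal is only to stop the input from being mutated, one option is to extend with a reversed copy of each inner list instead of calling reverse() on it. The sketch below keeps the question's flattened result and renames the function to `arb_reverse_flat` to make that explicit; it is not a full nested reverse:

```python
def arb_reverse_flat(seq):
    elements = list(seq)  # shallow copy of the outer list
    result = []
    while elements:
        item = elements.pop()
        if isinstance(item, list):
            elements += item[::-1]  # reversed copy; the inner list is untouched
        else:
            result.append(item)
    return result

a = [1, [2, [3]]]
flat = arb_reverse_flat(a)
print(flat)  # [2, 3, 1]
print(a)     # [1, [2, [3]]]
```

The slice `item[::-1]` builds a new list, so no object reachable from the input is ever modified.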

Add to list vs. Increment

Given that in Python:
element = element + [0]
should be equal to:
element += [0]
Why does one modify a list and the other does not? Here is a example:
>>> a = [[0, 0], [0,0]]
>>> for element in a:
...     element = element + [0]
...
>>> a
[[0, 0], [0, 0]]
a is not modified. But if I increment:
>>> a = [[0, 0], [0,0]]
>>> for element in a:
...     element += [0]
...
>>> a
[[0, 0, 0], [0, 0, 0]]
a is modified.
Thanks,
Frank
This is a fun side effect of the += operator, which calls __iadd__ instead of __add__.
The statement x = x + y is equivalent to x = x.__add__(y), while x += y is equivalent to x = x.__iadd__(y).
This lets the list class optimize += by extending the existing list (i.e., x += y is roughly equivalent to x.extend(y)) instead of creating an entirely new list (which is what + needs to do).
For example:
>>> a = [1, 2, 3]
>>> original_a = a
>>> b = [1, 2, 3]
>>> original_b = b
>>> a += [4]
>>> b = b + [4]
>>> a is original_a
True
>>> b is original_b
False
You can see that using += maintains the identity of the left hand side (ie, a new list isn't created) while using + does not maintain the identity (ie, a new list is created).
For more, see: http://docs.python.org/library/operator.html#operator.iadd and the paragraph directly above the documentation for operator.iadd.
In the first case, element = element + [0], you are creating a new list.
In the second case, element += [0], you are modifying an existing list.
Since the list of lists, a, contains pointers to the elements, only modifying the elements will actually change things. (That is, creating a new list does not change the pointers in a.)
This is seen more clearly if we take a simple example showing how lists work:
>>> a = [1, 2, 3]
>>> b = a
>>> a = [4, 5, 6]
>>> a
[4, 5, 6]
>>> b
[1, 2, 3]
>>> a = [1, 2, 3]
>>> b = a
>>> a += [4, 5, 6]
>>> b
[1, 2, 3, 4, 5, 6]
>>> a
[1, 2, 3, 4, 5, 6]
Assigning a variable to a list simply assigns a pointer.
Adding to what others said, there is a difference in what these statements do:
element = element + [0]
does
element = element.__add__([0])
while
element += [0]
does
element = element.__iadd__([0])
__iadd__(), in this case, is free to determine what to return: the original object with a modification or a new object.
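In the case of an immutable object, the result is necessarily a different object (e.g., a = b = 8; a += 9 => a is not b); such types usually do not define __iadd__ at all, and Python falls back to __add__.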
But in the case of a mutable object, such as a list, it normally modifies this one:
a = b = []
a += [8]
=> a is b.
This different behaviour reflects in your for loop:
for element in a:
    element = element + [0]
=> name element gets rebound to a different object; original one remains untouched
for element in a:
    element += [0]
=> original object, which is as well contained in the outer list, a, gets modified. The fact that element is reassigned is irrelevant; it is not used.
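If the `+` form is preferred, the new list has to be stored back into the outer list explicitly. A small sketch, using enumerate for the index (my choice, not from the answers), together with a tuple to show the immutable case:

```python
a = [[0, 0], [0, 0]]
for index, element in enumerate(a):
    a[index] = element + [0]  # put the newly built list back into a
print(a)  # [[0, 0, 0], [0, 0, 0]]

# With an immutable type, += cannot modify in place, so the name is rebound.
t = u = (8,)
t += (9,)
print(t is u)  # False
print(u)       # (8,)
```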
