Python list slice syntax used for no obvious reason - python

I occasionally see the list slice syntax used in Python code like this:
newList = oldList[:]
Surely this is just the same as:
newList = oldList
Or am I missing something?

[:] Shallow copies the list, making a copy of the list structure containing references to the original list members. This means that operations on the copy do not affect the structure of the original. However, if you do something to the list members, both lists still refer to them, so the updates will show up if the members are accessed through the original.
A Deep Copy would make copies of all the list members as well.
The code snippet below shows a shallow copy in action.
# ================================================================
# === ShallowCopy.py =============================================
# ================================================================
#
class Foo:
    def __init__(self, data):
        self._data = data
aa = Foo ('aaa')
bb = Foo ('bbb')
# The initial list has two elements containing 'aaa' and 'bbb'
OldList = [aa,bb]
print OldList[0]._data
# The shallow copy makes a new list pointing to the old elements
NewList = OldList[:]
print NewList[0]._data
# Updating one of the elements through the new list sees the
# change reflected when you access that element through the
# old list.
NewList[0]._data = 'xxx'
print OldList[0]._data
# Updating the new list to point to something new is not reflected
# in the old list.
NewList[0] = Foo ('ccc')
print NewList[0]._data
print OldList[0]._data
Running it in a Python shell gives the following transcript. We can see the
new list being made with references to the old objects, not copies of them.
One of the objects has its state updated through the new list, and the update
can be seen when the object is accessed through the old list. Finally, changing
a reference in the new list is not reflected in the old list, as the new list
is now referring to a different object.
>>> # ================================================================
... # === ShallowCopy.py =============================================
... # ================================================================
... #
... class Foo:
...     def __init__(self, data):
...         self._data = data
...
>>> aa = Foo ('aaa')
>>> bb = Foo ('bbb')
>>>
>>> # The initial list has two elements containing 'aaa' and 'bbb'
... OldList = [aa,bb]
>>> print OldList[0]._data
aaa
>>>
>>> # The shallow copy makes a new list pointing to the old elements
... NewList = OldList[:]
>>> print NewList[0]._data
aaa
>>>
>>> # Updating one of the elements through the new list sees the
... # change reflected when you access that element through the
... # old list.
... NewList[0]._data = 'xxx'
>>> print OldList[0]._data
xxx
>>>
>>> # Updating the new list to point to something new is not reflected
... # in the old list.
... NewList[0] = Foo ('ccc')
>>> print NewList[0]._data
ccc
>>> print OldList[0]._data
xxx

Like NXC said, Python variable names actually point to an object, and not a specific spot in memory.
newList = oldList creates two different variables that point to the same object, so changing the list through oldList also changes what newList sees.
However, when you do newList = oldList[:], it "slices" the list and creates a new list. The default bounds for [:] are 0 and the end of the list, so it copies everything. It therefore creates a new list holding all the items of the first one, and either list can have items added, removed or replaced without changing the other. (The items themselves are shared, not copied.)
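A minimal sketch of the point about default slice bounds, written in Python 3 syntax (the thread itself uses Python 2 print statements). The variable names are made up for illustration:

```python
old_list = [1, 2, 3]

alias = old_list                        # no copy: a second name for the same list
full_slice = old_list[:]                # omitted bounds default to the whole list
explicit = old_list[0:len(old_list)]    # the same slice with the bounds written out

alias.append(4)                         # visible through old_list as well

print(old_list)                # [1, 2, 3, 4]
print(full_slice)              # [1, 2, 3]
print(full_slice == explicit)  # True: same contents
print(full_slice is old_list)  # False: a distinct list object
```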

As it has already been answered, I'll simply add a simple demonstration:
>>> a = [1, 2, 3, 4]
>>> b = a
>>> c = a[:]
>>> b[2] = 10
>>> c[3] = 20
>>> a
[1, 2, 10, 4]
>>> b
[1, 2, 10, 4]
>>> c
[1, 2, 3, 20]

Never think that 'a = b' in Python means 'copy b to a'. If there are variables on both sides, you can't really know that. Instead, think of it as 'give b the additional name a'.
If b is an immutable object (like a number, tuple or a string), a still names the very same object, not a copy. You just can't tell the difference, because an immutable (which maybe should have been called read only, unchangeable or WORM) can never be changed in place: anything that looks like a modification builds a new object and rebinds the name.
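A small sketch (Python 3 syntax, illustrative names) of how assignment behaves with an immutable object such as a tuple. Assignment only adds a second name, and an apparent modification rebinds that name to a new object:

```python
b = (1, 2, 3)       # a tuple is immutable
a = b               # a is a second name for the same tuple
print(a is b)       # True

a += (4,)           # builds a new tuple and rebinds a; b is untouched
print(a)            # (1, 2, 3, 4)
print(b)            # (1, 2, 3)
print(a is b)       # False
```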
If b is a mutable, you always have to do something extra to be sure you have a true copy. Always. With lists, it's as simple as a slice: a = b[:].
Mutability is also the reason that this:
def myfunction(mylist=[]):
    pass
... doesn't quite do what you think it does.
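To make the mutable default argument pitfall concrete, here is a sketch in Python 3 syntax. The function names append_shared and append_fresh are hypothetical, and the None sentinel is the usual workaround rather than something this answer spells out:

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the def statement runs,
    # so every call that omits bucket reuses the same list.
    bucket.append(item)
    return bucket

def append_fresh(item, bucket=None):
    # Common workaround: use None as a sentinel and build a new list per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket

first = append_shared(1)
second = append_shared(2)
print(second)            # [1, 2]
print(first is second)   # True

print(append_fresh(1))   # [1]
print(append_fresh(2))   # [2]
```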
If you're from a C-background: what's left of the '=' is a pointer, always. All variables are pointers, always. If you put variables in a list: a = [b, c], you've put pointers to the values pointed to by b and c in a list pointed to by a. If you then set a[0] = d, the pointer in position 0 is now pointing to whatever d points to.
See also the copy-module: http://docs.python.org/library/copy.html

Shallow copy (a new list object holding references to the same elements):
a = ['one','two','three']
b = a[:]
b[1] = 2
print id(a), a #Output: 1077248300 ['one', 'two', 'three']
print id(b), b #Output: 1077248908 ['one', 2, 'three']
Plain assignment (no copy at all; both names refer to the same list object):
a = ['one','two','three']
b = a
b[1] = 2
print id(a), a #Output: 1077248300 ['one', 2, 'three']
print id(b), b #Output: 1077248300 ['one', 2, 'three']
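For comparison, a sketch in Python 3 syntax that puts plain assignment, a shallow copy and a true deep copy (via copy.deepcopy from the standard library) side by side. The nested list is only there to make the difference visible:

```python
import copy

a = [['one'], ['two']]

alias = a                   # plain assignment: no copy at all
shallow = a[:]              # new outer list, same inner lists
deep = copy.deepcopy(a)     # new outer list and new inner lists

a[0].append('uno')          # mutate an inner list
a.append(['three'])         # change the outer list

print(alias)    # [['one', 'uno'], ['two'], ['three']]
print(shallow)  # [['one', 'uno'], ['two']]
print(deep)     # [['one'], ['two']]
```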

Related

Chained list assignment in python [duplicate]

When I ran this script (Python v2.6):
a = [1,2]
b = a
a.append(3)
print a
>>>> [1,2,3]
print b
>>>> [1,2,3]
I expected print b to output [1,2]. Why did b get changed when all I did was change a? Is b permanently tied to a? If so, can I make them independent? How?
Memory management in Python involves a private heap memory location containing all Python objects and data structures.
Python's runtime only deals in references to objects (which all live in the heap): what goes on Python's stack are always references to values that live elsewhere.
>>> a = [1, 2]
>>> b = a
>>> a.append(3)
Here the variable b is bound to the same object as a, so the appended element shows up under both names.
You can use the is operator to test whether two names refer to the very same object (in CPython, that means the same address in memory). This can also be tested using the id() function.
>>> a is b
True
>>> id(a) == id(b)
True
So, in this case, you must explicitly ask for a copy.
Once you've done that, there will be no more connection between the two distinct list objects.
>>> b = list(a)
>>> a is b
False
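A short sketch (Python 3 syntax) of the difference between equality and identity that the is test relies on. A copy compares equal to the original but is a different object:

```python
a = [1, 2]
b = a            # same object
c = list(a)      # independent shallow copy

print(a == b, a is b)   # True True
print(a == c, a is c)   # True False

a.append(3)
print(b)   # [1, 2, 3]
print(c)   # [1, 2]
```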
Variables in Python hold references: you aren't assigning the value of a to b, but a reference to the object that a is pointing to.
To emulate assignment by value, you can make a copy like so:
import copy
b = copy.copy(a)
# now the code works as "expected"
Be aware that copying costs time and memory proportional to the length of the list.
In the case of a list, there's a special idiom that relies on slices:
b = a[:]
# code also works as expected here
Update: in addition to this, with some objects you can use the constructor, and this includes lists:
b = list(a)
Short answer - Pointers.
When you type b = a it is setting b to look at the same list that a looks at. You have to make a new list with copies of the elements to separate them. In this case, something like b = [n for n in a] would work fine. For more complex operations you may want to check out http://docs.python.org/library/copy.html.
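The copying idioms mentioned in these answers can be compared in one sketch (Python 3 syntax). The dictionary labels are only for printing, and list.copy() is an extra option that exists in Python 3.3 and later but is not mentioned in the thread:

```python
import copy

a = [1, 2]

copies = {
    'slice': a[:],
    'constructor': list(a),
    'copy.copy': copy.copy(a),
    'comprehension': [n for n in a],
    'list.copy': a.copy(),      # list method, Python 3.3+
}

a.append(3)

for name, b in copies.items():
    # Each copy keeps the old contents and is a separate object from a.
    print(name, b, b is a)
```

All five are shallow copies, so they behave the same for a flat list like this one.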
The problem you have here is that a and b both point to the same memory location, so changing one changes the other. Instead, you want to do something like this:
a = [1,2]
b = list(a)
a is a pointer to the list [1,2].
When you do the assignment b = a the value of b is the address of the list [1,2].
So when you do a.append(3) you are not actually changing a, you are changing the list that a points to. Since a and b both point to the same list, a change made through one name is visible through the other.
If you simply want to copy the contents of list a to b, instead of making b a pointer to a:
b = a[:]
Using the slice operator will copy the contents of the list into b, such that your example would become:
a = [1,2]
b = a[:]
a.append(3)
print a
>>>> [1,2,3]
print b
>>>> [1,2]

Having trouble understanding immutable, mutable, scope in python functions

See my code in python 3.4. I can get around it fine. It bugs me a little. I'm guessing it's something to do with foo2 resetting a rather than treating it as list 1.
def foo1(a):
    a.append(3) ### add element 3 to end of list
    return()
def foo2(a):
    a=a+[3] #### add element 3 to end of list
    return()
list1=[1,2]
foo1(list1)
print(list1) ### shows [1,2,3]
list1=[1,2]
foo2(list1)
print(list1) #### shows [1,2]
In foo2 you do not mutate the original list referred to by a - instead, you create a new list from list1 and [3], and bind the result which is a new list to the local name a. So list1 is not changed at all.
There is a difference between append (or +=) and +:
>>> a = []
>>> id(a)
11814312
>>> a.append("hello")
>>> id(a)
11814312
>>> b = []
>>> id(b)
11828720
>>> c = b + ["hello"]
>>> id(c)
11833752
>>> b += ["hello"]
>>> id(b)
11828720
As you can see, append and += have the same result; they add the item to the list, without producing a new list. Using + adds the two lists and produces a new list.
In the first example, you're using a method that modifies a in-place. In the second example, you're making a new a that replaces the old a but without modifying the old a - that's usually what happens when you use the = to assign a new value. One exception is when you use slicing notation on the left-hand side: a[:] = a + [3] would work as your first example did.
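The three behaviours discussed here (rebinding with =, in-place += and slice assignment) can be sketched as follows in Python 3 syntax. The function names and the results dictionary are hypothetical, chosen only for this illustration:

```python
def rebind(a):
    a = a + [3]      # builds a new list; only the local name a is rebound

def extend_in_place(a):
    a += [3]         # for a list, += extends the existing object

def slice_assign(a):
    a[:] = a + [3]   # replaces the contents of the existing object

results = {}
for func in (rebind, extend_in_place, slice_assign):
    list1 = [1, 2]
    func(list1)
    results[func.__name__] = list1

print(results)
# {'rebind': [1, 2], 'extend_in_place': [1, 2, 3], 'slice_assign': [1, 2, 3]}
```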

Create a copy of the list not referencing the contained objects

Say I have a vector a defined as:
a = [[1,2,3],[-1,-2,-3]]
I have learned that to create a copy of the object a without referencing it I should use the following syntax:
b = a[:]
Indeed, if I execute the following statements:
b = []
print a
the output is
>>> [[1,2,3],[-1,-2,-3]]
exactly as I was expecting. Though, if I do the following:
b = a[:]
b[0][2] = 'change a'
print a
the output is
>>> [[1,2,'change a'],[-1,-2,-3]]
So it's clear to me that the object a[0] is being referenced even if contained in a. How can I create a copy of the object a in a way that even all its internal objects will not be referenced?
a[:] creates a shallow copy of the list.
You can use the copy.deepcopy() function to recursively copy the objects, or use a list comprehension:
b = [el[:] for el in a]
This creates a new list object with shallow copies of the nested list objects in a.
For that use deepcopy:
>>> from copy import deepcopy
>>> b = deepcopy(a)
>>> b[0][2] = 'change a'
>>> print a
[[1,2,3],[-1,-2,-3]]
Deepcopy: https://docs.python.org/2/library/copy.html#copy.deepcopy
Extension
Deepcopy also creates an individual copy of class instances. Please see simple example below.
from copy import deepcopy
class A:
    def __init__(self):
        self.val = 'A'
>>> a = A()
>>> b = deepcopy(a)
>>> b.val = 'B'
>>> print a.val
A
>>> print b.val
B
If you only "shallow copy" b = a[:], each sub-list b[n] is still a reference to the same sub-list referenced at a[n]. Instead you need to do a deep(er) copy, e.g. by
b = [l[:] for l in a]
This creates a shallow copy of each sub-list, but as their contents are immutable that isn't a problem. If you have more levels of container nesting, you need copy.deepcopy as the other answers suggest.
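To illustrate the remark about deeper nesting, here is a sketch in Python 3 syntax with three levels of lists. The data is made up; the point is only that the comprehension copies two levels while copy.deepcopy copies all of them:

```python
import copy

a = [[[1, 2], [3]], [[4]]]          # three levels of nesting

one_level = [sub[:] for sub in a]   # copies the outer and middle lists only
full = copy.deepcopy(a)             # copies every level

a[0][0].append(99)                  # mutate an innermost list

print(one_level[0][0])   # [1, 2, 99]  (innermost list is still shared)
print(full[0][0])        # [1, 2]

a[0].append('new')                  # mutate a middle list
print(one_level[0])      # [[1, 2, 99], [3]]  (middle lists were copied)
```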
Use copy.deepcopy
import copy
b = copy.deepcopy(a)
Quoting the docs:
A deep copy constructs a new compound object and then, recursively,
inserts copies into it of the objects found in the original
Example:
>>> a = list(range(1000))
>>> b = copy.deepcopy(a)
>>> a is b # b is a new object
False
>>>

python: list changes when global edited

a = [1]
def do():
    global a
    b=a
    print b
    a[0] = 2
    print b
do()
outputs:
[1]
[2]
I am pretty sure it has something to do with the fact that 'a' is a global list.
Could someone please explain to me why the variable b changes when the global changes, and how I could stop it from happening?
An extension to the question:
How would you handle further nesting, such as:
a = []
b = []
def do():
    global a, b
    b.append(a[:])
    print a, b
    a[0][0] +=1
    print a, b
a.append([1])
do()
In the line b=a you essentially create a reference b which points to the same list as a. In Python this does not create a new copy of the list, but just creates a new link to it.
If you want to create a copy of a then you need to do it explicitly. Using slice notation, you can do it in this way:
b = a[:]
This will create a copy of a which will be referenced by b. See it in action :
>>> a = [1]
>>> b = a #Same list
>>> a[0] = 2
>>> b
[2] #Problem you are experiencing
You can see for yourself whether they refer to the same object or not by :
>>> a is b
True
True signifies that they refer to the same object.
>>> b = a[:] #Solution <<--------------
Doing the same test again :
>>> a is b
False
And problem solved. They now refer to different objects.
>>> b
[2]
>>> a[0] = 3 #a changed
>>> a
[3]
>>> b
[2] #No change here
When you assign b = a you are copying the reference to a list object that is held in a to b, so they point at the same list. The changes to the underlying object will be reflected by either reference.
If you want to create a copy of the list use
b = list(a)
or, a method that will work on most objects:
import copy
b = copy.copy(a)
I think you have a misunderstanding of python's variable model. This is the article I read that made it click for me (the sections "Other languages have variables" and "Python has names").
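For the nested extension in the question, a slice is not enough because a[:] copies only the outer list. One possible approach, sketched here in Python 3 syntax with hypothetical names (snapshot, b_shallow, b_deep), is to store a copy.deepcopy of a instead:

```python
import copy

a = [[1]]
b_shallow = []
b_deep = []

def snapshot():
    # a[:] copies only the outer list; deepcopy also copies the inner lists.
    b_shallow.append(a[:])
    b_deep.append(copy.deepcopy(a))

snapshot()
a[0][0] += 1

print(a)          # [[2]]
print(b_shallow)  # [[[2]]]
print(b_deep)     # [[[1]]]
```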

Python: Problem with list editing

Simplified version of my code:
sequence = [['WT_1', 'AAAAAAAA'], ['WT_2', 'BBBBBBB']]
def speciate(sequence):
    lineage_1 = []
    lineage_2 = []
    for i in sequence:
        lineage_1.append(i)
    for k in sequence:
        lineage_2.append(k)
    lineage_1[0][0] = 'L1_A'
    lineage_1[1][0] = 'L1_B'
    lineage_2[0][0] = 'L2_A'
    lineage_2[1][0] = 'L2_B'
    print lineage_1
    print lineage_2
speciate(sequence)
outputs:
[['L2_A', 'AAAAAAAA'], ['L2_B','BBBBBBB']]
[['L2_A','AAAAAAAA'], ['L2_B','BBBBBBB']]
when I would expect to get this:
[['L1_A', 'AAAAAAAA'], ['L1_B','BBBBBBB']]
[['L2_A','AAAAAAAA'], ['L2_B','BBBBBBB']]
Does anybody know what the problem is?
You have to make a deep copy (or shallow copy suffices in this case) when you append. Else lineage_1[0][0] and lineage_2[0][0] reference the same object.
from copy import deepcopy
for i in sequence:
    lineage_1.append(deepcopy(i))
for k in sequence:
    lineage_2.append(deepcopy(k))
See also: http://docs.python.org/library/copy.html
In your for-loops you are appending the very same list objects (sequence[0] and sequence[1]) to both lineage lists.
So when you modify the first element of that list:
lineage_1[0][0] = 'L1_A'
lineage_1[1][0] = 'L1_B'
lineage_2[0][0] = 'L2_A'
lineage_2[1][0] = 'L2_B'
you're seeing it show up as modified in both of the lineage_X lists, because both contain references to the same lists that are in sequence, not copies of them.
Do something like:
import copy
for i in sequence:
    lineage_1.append(copy.copy(i))
for k in sequence:
    lineage_2.append(copy.copy(k))
this will make copies of the sublists of sequence so that you don't have this aliasing issue. (If the real code has deeper nesting, you can use copy.deepcopy instead of copy.copy.)
Consider this simple example:
>>> aa = [1, 2, 3]
>>> bb = aa
>>> bb[0] = 999
>>> aa
[999, 2, 3]
What happened here?
"Names" like aa and bb simply reference the list, the same list. Hence when you change the list through bb, aa sees it as well. Using id shows this in action:
>>> id(aa)
32343984
>>> id(bb)
32343984
Now, this is exactly what happens in your code:
for i in sequence:
    lineage_1.append(i)
for k in sequence:
    lineage_2.append(k)
You append references to the same lists to lineage_1 and lineage_2.
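Putting the suggested fix together, a runnable sketch in Python 3 syntax might look like this. It assumes copy.deepcopy on the whole sequence is acceptable (the answers above copy each sub-list in a loop instead), and it returns the lineages rather than printing them inside the function:

```python
import copy

sequence = [['WT_1', 'AAAAAAAA'], ['WT_2', 'BBBBBBB']]

def speciate(sequence):
    # Give each lineage its own copies of the inner lists.
    lineage_1 = copy.deepcopy(sequence)
    lineage_2 = copy.deepcopy(sequence)
    lineage_1[0][0] = 'L1_A'
    lineage_1[1][0] = 'L1_B'
    lineage_2[0][0] = 'L2_A'
    lineage_2[1][0] = 'L2_B'
    return lineage_1, lineage_2

lineage_1, lineage_2 = speciate(sequence)
print(lineage_1)  # [['L1_A', 'AAAAAAAA'], ['L1_B', 'BBBBBBB']]
print(lineage_2)  # [['L2_A', 'AAAAAAAA'], ['L2_B', 'BBBBBBB']]
print(sequence)   # [['WT_1', 'AAAAAAAA'], ['WT_2', 'BBBBBBB']]
```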
