Python: Problem with list editing - python

Simplified version of my code:
sequence = [['WT_1', 'AAAAAAAA'], ['WT_2', 'BBBBBBB']]
def speciate(sequence):
    lineage_1 = []
    lineage_2 = []
    for i in sequence:
        lineage_1.append(i)
    for k in sequence:
        lineage_2.append(k)
    lineage_1[0][0] = 'L1_A'
    lineage_1[1][0] = 'L1_B'
    lineage_2[0][0] = 'L2_A'
    lineage_2[1][0] = 'L2_B'
    print(lineage_1)
    print(lineage_2)
speciate(sequence)
outputs:
[['L2_A', 'AAAAAAAA'], ['L2_B','BBBBBBB']]
[['L2_A','AAAAAAAA'], ['L2_B','BBBBBBB']]
when I would expect to get this:
[['L1_A', 'AAAAAAAA'], ['L1_B','BBBBBBB']]
[['L2_A','AAAAAAAA'], ['L2_B','BBBBBBB']]
Does anybody know what the problem is?

You have to make a deep copy (or shallow copy suffices in this case) when you append. Else lineage_1[0][0] and lineage_2[0][0] reference the same object.
from copy import deepcopy
for i in sequence:
    lineage_1.append(deepcopy(i))
for k in sequence:
    lineage_2.append(deepcopy(k))
See also: http://docs.python.org/library/copy.html

In your for loops you are appending the very same list objects (sequence[0] and sequence[1]) to both lineages, not copies of them.
So when you modify the first element of that list:
lineage_1[0][0] = 'L1_A'
lineage_1[1][0] = 'L1_B'
lineage_2[0][0] = 'L2_A'
lineage_2[1][0] = 'L2_B'
you're seeing it show up as modified in both lineage_X lists, because both of them contain references to the same lists that are in sequence, not copies of them.
Do something like:
import copy
for i in sequence:
    lineage_1.append(copy.copy(i))
for k in sequence:
    lineage_2.append(copy.copy(k))
this will make copies of the sublists of sequence so that you don't have this aliasing issue. (If the real code has deeper nesting, you can use copy.deepcopy instead of copy.copy.)
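For reference, here is a minimal Python 3 sketch of the questioner's function with the shallow-copy fix applied. It returns the two lineages instead of printing them inside the function; that change is an assumption made here only to keep the example easy to check.

```python
import copy

def speciate(sequence):
    # Copy each [name, dna] pair so the two lineages do not share sublists.
    lineage_1 = [copy.copy(pair) for pair in sequence]
    lineage_2 = [copy.copy(pair) for pair in sequence]
    lineage_1[0][0] = 'L1_A'
    lineage_1[1][0] = 'L1_B'
    lineage_2[0][0] = 'L2_A'
    lineage_2[1][0] = 'L2_B'
    return lineage_1, lineage_2

sequence = [['WT_1', 'AAAAAAAA'], ['WT_2', 'BBBBBBB']]
lineage_1, lineage_2 = speciate(sequence)
print(lineage_1)  # [['L1_A', 'AAAAAAAA'], ['L1_B', 'BBBBBBB']]
print(lineage_2)  # [['L2_A', 'AAAAAAAA'], ['L2_B', 'BBBBBBB']]
print(sequence)   # [['WT_1', 'AAAAAAAA'], ['WT_2', 'BBBBBBB']]
```

A shallow copy per sublist is enough here because the sublists only hold strings, which cannot be modified in place.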

Consider this simple example:
>>> aa = [1, 2, 3]
>>> bb = aa
>>> bb[0] = 999
>>> aa
[999, 2, 3]
What happened here?
"Names" like aa and bb simply reference the list, the same list. Hence when you change the list through bb, aa sees it as well. Using id shows this in action:
>>> id(aa)
32343984
>>> id(bb)
32343984
Now, this is exactly what happens in your code:
for i in sequence:
    lineage_1.append(i)
for k in sequence:
    lineage_2.append(k)
You append references to the same lists to lineage_1 and lineage_2.

Related

List operation, keeping track of old list

After I apply an operation to a list, I would like to get access to both the modified list and the original one.
Somehow I am not able to.
In the following code snippet, I define two functions with which I modify the original list.
Afterwards, I get my values from a class and apply the transformation.
def get_min_by_col(li, col): # get minimum from list
    return min(li, key=lambda x: x[col - 1])[col - 1]
def hashCluster(coords): # transform to origin
    min_row = get_min_by_col(coords,0)
    min_col = get_min_by_col(coords,1)
    for pix in coords:
        pix[1] = pix[1] - min_row
        pix[0] = pix[0] - min_col
    return (coords)
pixCoords = hashCoords = originalPixCoords = [] # making sure they are empty
for j in dm.getPixelsForCluster(dm.clusters[i]):
    pixCoords.append([j['m_column'], j['m_row']]) # getting some values from a class -- ex: [[613, 265], [613, 266]] or [[615, 341], [615, 342], [616, 341], [616, 342]]
originalPixCoords = pixCoords.copy() # just to be safe, I make a copy of the original list
print ('Original : ', originalPixCoords)
hashCoords = hashCluster(pixCoords) # apply transformation
print ('Modified : ', hashCoords)
print ('Original : ', originalPixCoords) # should get the original list
Some results [Jupyter Notebook]:
Original : [[607, 268]]
Modified : [[0, 0]]
Original : [[0, 0]]
Original : [[602, 264], [603, 264]]
Modified : [[0, 0], [1, 0]]
Original : [[0, 0], [1, 0]]
Original : [[613, 265], [613, 266]]
Modified : [[0, 0], [0, 1]]
Original : [[0, 0], [0, 1]]
Is the function hashCluster able to modify the new list as well? Even after the .copy()?
What am I doing wrong? My goal is to have access to both the original and modified lists, with as few operations and copies of lists as possible (since I am looping over a very large document).
You have a list of lists, and are modifying the inner lists. The operation pixCoords.copy() creates a shallow copy of the outer list only. pixCoords and originalPixCoords are now two separate outer lists whose elements point to the same mutable inner lists. There are two ways to handle this situation, each with its own pros and cons.
The knee-jerk method that most users seem to have is to make a deep copy:
originalPixCoords = copy.deepcopy(pixCoords)
I would argue that this method is the less pythonic and more error prone approach. A better solution would be to make hashCluster actually return a new list. By doing that, you will make it treat the input as immutable, and eliminate the problem entirely. I consider this more pythonic because it reduces the maintenance burden. Also, conventionally, python functions that return a value create a new list without modifying the input while in-place operations generally don't return a value.
def hashCluster(coords):
    min_row = get_min_by_col(coords, 0)
    min_col = get_min_by_col(coords, 1)
    return [[pix[0] - min_col, pix[1] - min_row] for pix in coords]
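As a quick check of the non-mutating version, the sketch below runs it on one of the sample clusters from the question. The helper get_min_by_col is copied from the question unchanged; the standalone pixCoords list is a stand-in for the values that come from the questioner's dm object.

```python
def get_min_by_col(li, col):
    # Same helper as in the question: smallest value found at index col - 1.
    return min(li, key=lambda x: x[col - 1])[col - 1]

def hashCluster(coords):
    min_row = get_min_by_col(coords, 0)
    min_col = get_min_by_col(coords, 1)
    # Build brand-new inner lists; the input is only read, never written.
    return [[pix[0] - min_col, pix[1] - min_row] for pix in coords]

pixCoords = [[613, 265], [613, 266]]  # sample values taken from the question
hashCoords = hashCluster(pixCoords)
print('Original : ', pixCoords)   # Original :  [[613, 265], [613, 266]]
print('Modified : ', hashCoords)  # Modified :  [[0, 0], [0, 1]]
```

Because the function never writes to its argument, no copy of pixCoords is needed at all, which also suits the goal of keeping the number of list copies low.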
Use
import copy
originalPixCoords = copy.deepcopy(pixCoords)
What you're using is a shallow copy. It effectively means you created a new list whose entries still point to the old objects. If those objects get modified, your new list will reflect the updates, since both lists refer to the same objects.
>>> # Shallow Copy
>>> mylist = []
>>> mylist.append({"key": "original"})
>>> mynewlist = mylist.copy()
>>> mynewlist
[{'key': 'original'}]
>>> mylist[0]["key"] = "new value"
>>> mylist
[{'key': 'new value'}]
>>> mynewlist
[{'key': 'new value'}]
>>> # Now Deep Copy
>>> mylist = []
>>> mylist.append({"key": "original"})
>>> from copy import deepcopy
>>> mynewlist = deepcopy(mylist)
>>> mynewlist
[{'key': 'original'}]
>>> mylist[0]["key"] = "new value"
>>> mylist
[{'key': 'new value'}]
>>> mynewlist
[{'key': 'original'}]
Another similar question: What is the difference between shallow copy, deepcopy and normal assignment operation?
Setting multiple variables equal to the same value in one statement makes them all refer to the same object, which is the Python equivalent of several pointers to one value.
Check this out
a = b = [1,2,3]
a == b # True
a is b # True (same memory location)
b[1] = 3
print(b) # [1,3,3]
print(a) #[1,3,3]
Right now, you are creating shallow copies. If you need both copies (with different values and data history), you can simply assign the variables in the following manner:
import copy
original = data
original_copy = copy.deepcopy(data)
original_copy == original == data # True
original_copy is original # False
original_copy[0] = 4
original_copy == original # False

Python tuples to tuples

I have been stuck on a school assignment for a couple of days! The question asks me to copy a tuple into a new tuple that has a different id from the original tuple. This is my current code, but I still can't figure out how to make a copy with a different id!
def copy_tree(tree):
    mylist=[]
    for items in tree:
        mylist.append(items)
    mytuple=tuple(mylist)
    return mytuple
original = (1, 2, 3, 4)
Tuples in Python are immutable, so creating a copy is usually not needed. That's probably the reason why, unlike e.g. list, tuple will not automatically create a new tuple if the given parameter already is a tuple:
>>> l = [1,2,3]
>>> list(l) is l # new list ...
False
>>> t = (1,2,3)
>>> tuple(t) is t # but same tuple
True
You can, however, convert the tuple to a list first, and then create a new tuple from that list.
>>> tuple(list(t)) == t # equal ...
True
>>> tuple(list(t)) is t # ... but not the same
False
>>> id(tuple(list(t))), id(t) # different id
(139852830618896, 139852830618752)
Which is basically what you are currently doing, although in a few more lines, so your code should actually work just fine.
Note, however, that this will create a shallow copy of the tuple, i.e. the objects within the tuple (other tuples, list, whatever) are not copied. If you need to copy those, too, use copy.deepcopy as in the other answer. However, this, too, is so "smart" that it will not create a copy if the (nested) tuple only contains immutable values:
>>> import copy
>>> k = (1, (2, "3")) # all immutable
>>> copy.deepcopy(k) is k
True
>>> k = (1, (2, "3", [])) # contains mutable list
>>> copy.deepcopy(k) is k
False
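If the assignment really requires a new object at every level of a nested tuple, one possible sketch is to rebuild the tuples recursively. The nested sample value is made up for illustration, and getting a distinct object from tuple(some_list) is CPython behaviour (as in the transcript above), not something the language guarantees.

```python
def copy_tree(tree):
    # Rebuild tuples level by level; leave every other value as it is.
    if isinstance(tree, tuple):
        return tuple([copy_tree(item) for item in tree])
    return tree

original = (1, (2, 3), 4)    # hypothetical nested "tree"
copied = copy_tree(original)
print(copied == original)    # True: same contents
print(copied is original)    # False on CPython: a different object
print(id(copied) != id(original))
```

Going through a list comprehension forces a new tuple to be built, which is the same trick as tuple(list(t)) applied at each nesting level.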
#There is no need to copy immutables. For Academic Purpose:
from copy import deepcopy
#initialising first tuple k
k=(1,2)
id(k) # checking memory id of k
j=deepcopy(k) #deepcopying k to j
id(j) # checking memory id of j
Don't really know what you are looking for but :
t1 = (1, 2, 3, 4)
t2 = t1
print(t1)
print(t2)
Here t2 is not a copy of t1: it is a second name for the same tuple object, so id(t2) == id(t1).
As mentioned before, tuples are immutable. If you want to put a tuple inside another one, you can just use "," as a separator.
You could add numbers after it: "t2=t1,1,2,3,4", which nests t1 as the first element of a new tuple: ((1, 2, 3, 4), 1, 2, 3, 4).
or
You could take a single element from one tuple by indexing it, like t2=t1[2],1,2,3, which gives (3, 1, 2, 3).

How to modify values in list of lists

I have written the python code in the following form
temp=[]
x=[1,2,3]
for i in range(4):
    temp=temp+[x]
temp[1][1]=x[1]+1
print(temp[0])
print(temp[1])
Here, I wanted to change the value of just temp[1][1], but the value of temp[0][1] also gets changed. Is there a way of changing just one value? I created a new list and tried to add it to temp, but that does not seem to work either.
Update:
Thanks, but it did not seem to work in my case (which was a multi-dimensional array). I have the code as follows:
tempList=[]
for i in range(openList[0].hx):
    tempList=tempList+[copy.copy(abc)]
tempList[0][0][0]=123
print(sudokuList)
Here abc is a two dimensional list. Modifying the value of tempList[0][0][0] changes the value of tempList[1][0][0] and so on.
That's because you are assigning the same x to all of your list items, so all of them are references to one object, and once you change one of them you have actually changed all of them. To get rid of this problem you can use a list comprehension to define the temp list:
temp=[[1,2,3] for _ in range(4)]
temp[1][1]=7
print(temp[0])
print(temp[1])
result :
[1, 2, 3]
[1, 7, 3]
This is actually a common error for beginners to Python: How to clone or copy a list?
When you add x to temp four times, you're creating a temp which contains the same x four times.
So, temp[0], temp[1], temp[2] and temp[3] are all pointing to the same x you declared on the second line.
Just make a copy when adding:
temp=[]
x=[1,2,3]
for i in range(4):
    temp.append(x[:])  # append works in place and returns None, so do not reassign temp
temp[1][1]=x[1]+1
print(temp[0])
print(temp[1])
You can see it with id function, which returns a different value for different objects:
>>> temp=[]
>>> x=[1,2,3]
>>> for i in range(4):
...     temp=temp+[x]
...
>>> id(temp[0]), id(temp[1])
(4301992880, 4301992880) # they're the same
>>> temp=[]
>>> x=[1,2,3]
>>> for i in range(4):
...     temp=temp+[x[:]]
...
>>> id(temp[0]), id(temp[1])
(4301992088, 4302183024) # now they are not
Try the following. Each x added in the for loop is a reference to the original x and not a copy. Because of this shared reference, changing any element shows up in every entry. So you need to make a copy, as in the following snippet.
temp=[]
x=[1,2,3]
for i in range(4):
    temp=temp+[x[:]]
temp[1][1]=x[1]+1
print(temp[0])
print(temp[1])
----EDIT----
As per your comment, use copy.deepcopy to copy the list. deepcopy recursively copies all the referenced elements inside the list. So the code looks like:
import copy
temp=[]
x=[1,2,3]
for i in range(4):
    x_copy = copy.deepcopy(x)
    #do something with x_copy. use this in place of x in your code.
    #will work for 1D or 2D or any other higher order lists.
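To see why copy.copy was not enough in the update to the question, here is a small sketch contrasting it with copy.deepcopy on a two-dimensional list. The contents of abc and the count of three entries are assumptions made for illustration, since the question does not show them.

```python
import copy

abc = [[0, 0], [0, 0]]  # hypothetical stand-in for the 2D list `abc`

# copy.copy duplicates only the outer list; the rows are still shared.
shallow = [copy.copy(abc) for _ in range(3)]
shallow[0][0][0] = 123
print(shallow[1][0][0])  # 123

abc = [[0, 0], [0, 0]]
# copy.deepcopy duplicates the rows as well.
deep = [copy.deepcopy(abc) for _ in range(3)]
deep[0][0][0] = 123
print(deep[1][0][0])     # 0
```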

Having trouble understanding immutable, mutable, scope in python functions

See my code in python 3.4. I can get around it fine. It bugs me a little. I'm guessing it's something to do with foo2 resetting a rather than treating it as list 1.
def foo1(a):
    a.append(3) ### add element 3 to end of list
    return()
def foo2(a):
    a=a+[3] #### add element 3 to end of list
    return()
list1=[1,2]
foo1(list1)
print(list1) ### shows [1,2,3]
list1=[1,2]
foo2(list1)
print(list1) #### shows [1,2]
In foo2 you do not mutate the original list referred to by a - instead, you create a new list from list1 and [3], and bind the result which is a new list to the local name a. So list1 is not changed at all.
There is a difference between + on the one hand and append and += on the other:
>>> a = []
>>> id(a)
11814312
>>> a.append("hello")
>>> id(a)
11814312
>>> b = []
>>> id(b)
11828720
>>> c = b + ["hello"]
>>> id(c)
11833752
>>> b += ["hello"]
>>> id(b)
11828720
As you can see, append and += have the same result; they add the item to the list, without producing a new list. Using + adds the two lists and produces a new list.
In the first example, you're using a method that modifies a in-place. In the second example, you're making a new a that replaces the old a but without modifying the old a - that's usually what happens when you use the = to assign a new value. One exception is when you use slicing notation on the left-hand side: a[:] = a + [3] would work as your first example did.
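The slice-assignment exception can be checked with a short sketch. The name foo3 is hypothetical and is used here only for the variant that assigns to a[:].

```python
def foo2(a):
    a = a + [3]     # rebinds the local name; the caller's list is untouched

def foo3(a):
    a[:] = a + [3]  # slice assignment replaces the contents of the same list

list1 = [1, 2]
foo2(list1)
print(list1)  # [1, 2]
foo3(list1)
print(list1)  # [1, 2, 3]
```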

Python list slice syntax used for no obvious reason

I occasionally see the list slice syntax used in Python code like this:
newList = oldList[:]
Surely this is just the same as:
newList = oldList
Or am I missing something?
[:] Shallow copies the list, making a copy of the list structure containing references to the original list members. This means that operations on the copy do not affect the structure of the original. However, if you do something to the list members, both lists still refer to them, so the updates will show up if the members are accessed through the original.
A Deep Copy would make copies of all the list members as well.
The code snippet below shows a shallow copy in action.
# ================================================================
# === ShallowCopy.py =============================================
# ================================================================
#
class Foo:
    def __init__(self, data):
        self._data = data
aa = Foo ('aaa')
bb = Foo ('bbb')
# The initial list has two elements containing 'aaa' and 'bbb'
OldList = [aa,bb]
print(OldList[0]._data)
# The shallow copy makes a new list pointing to the old elements
NewList = OldList[:]
print(NewList[0]._data)
# Updating one of the elements through the new list sees the
# change reflected when you access that element through the
# old list.
NewList[0]._data = 'xxx'
print(OldList[0]._data)
# Updating the new list to point to something new is not reflected
# in the old list.
NewList[0] = Foo ('ccc')
print(NewList[0]._data)
print(OldList[0]._data)
Running it in a Python shell gives the following transcript. We can see the
new list being made with references to the old objects. One of the objects can
have its state updated through the new list, and the update can be seen when
the object is accessed through the old list. Finally, changing a reference in
the new list is not reflected in the old list, as the new list is now
referring to a different object.
>>> # ================================================================
... # === ShallowCopy.py =============================================
... # ================================================================
... #
... class Foo:
...     def __init__(self, data):
...         self._data = data
...
>>> aa = Foo ('aaa')
>>> bb = Foo ('bbb')
>>>
>>> # The initial list has two elements containing 'aaa' and 'bbb'
... OldList = [aa,bb]
>>> print(OldList[0]._data)
aaa
>>>
>>> # The shallow copy makes a new list pointing to the old elements
... NewList = OldList[:]
>>> print(NewList[0]._data)
aaa
>>>
>>> # Updating one of the elements through the new list sees the
... # change reflected when you access that element through the
... # old list.
... NewList[0]._data = 'xxx'
>>> print(OldList[0]._data)
xxx
>>>
>>> # Updating the new list to point to something new is not reflected
... # in the old list.
... NewList[0] = Foo ('ccc')
>>> print(NewList[0]._data)
ccc
>>> print(OldList[0]._data)
xxx
Like NXC said, Python variable names actually point to an object, and not a specific spot in memory.
newList = oldList would create two different variables that point to the same object, therefore, changing oldList would also change newList.
However, when you do newList = oldList[:], it "slices" the list, and creates a new list. The default values for [:] are 0 and the end of the list, so it copies everything. Therefore, it creates a new list with all the data contained in the first one, but both can be altered without changing the other.
As it has already been answered, I'll simply add a simple demonstration:
>>> a = [1, 2, 3, 4]
>>> b = a
>>> c = a[:]
>>> b[2] = 10
>>> c[3] = 20
>>> a
[1, 2, 10, 4]
>>> b
[1, 2, 10, 4]
>>> c
[1, 2, 3, 20]
Never think that 'a = b' in Python means 'copy b to a'. If there are variables on both sides, you can't really know that. Instead, think of it as 'give b the additional name a'.
If b is an immutable object (like a number, tuple or a string), then the effect is the same as if you had a copy. No copy is actually made: a and b still name the same object, but because immutables (which maybe should have been called read only, unchangeable or WORM) can never be changed, nothing you do through one name can show up through the other.
If b is a mutable, you always have to do something extra to be sure you have a true copy. Always. With lists, it's as simple as a slice: a = b[:].
Mutability is also the reason that this:
def myfunction(mylist=[]):
    pass
... doesn't quite do what you think it does.
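A small sketch of what goes wrong: the default list is created once, when the function is defined, and is then shared by every call that omits the argument. The function names append_one and append_one_safe are hypothetical.

```python
def append_one(mylist=[]):
    # The default list is built once and reused on every call.
    mylist.append(1)
    return mylist

print(append_one())  # [1]
print(append_one())  # [1, 1]

def append_one_safe(mylist=None):
    # Common workaround: use None as a sentinel and build a fresh list per call.
    if mylist is None:
        mylist = []
    mylist.append(1)
    return mylist

print(append_one_safe())  # [1]
print(append_one_safe())  # [1]
```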
If you're from a C-background: what's left of the '=' is a pointer, always. All variables are pointers, always. If you put variables in a list: a = [b, c], you've put pointers to the values pointed to by b and c in a list pointed to by a. If you then set a[0] = d, the pointer in position 0 is now pointing to whatever d points to.
See also the copy-module: http://docs.python.org/library/copy.html
Shallow copy (a new list object that holds references to the same elements):
a = ['one','two','three']
b = a[:]
b[1] = 2
print(id(a), a) #Output: 1077248300 ['one', 'two', 'three']
print(id(b), b) #Output: 1077248908 ['one', 2, 'three']
Plain assignment (no copy at all; both names reference the same list):
a = ['one','two','three']
b = a
b[1] = 2
print(id(a), a) #Output: 1077248300 ['one', 2, 'three']
print(id(b), b) #Output: 1077248300 ['one', 2, 'three']
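To compare plain assignment, a shallow copy and a deep copy side by side, here is a small sketch. The nested list is made up for illustration, so that the difference between the two kinds of copy becomes visible.

```python
import copy

a = [['one'], 'two']
alias = a                # plain assignment: the same list object
shallow = a[:]           # new outer list, shared inner list
deep = copy.deepcopy(a)  # new outer list and new inner list

a[1] = 2                 # replace a slot in the outer list
a[0].append('uno')       # mutate the inner list in place

print(alias)    # [['one', 'uno'], 2]
print(shallow)  # [['one', 'uno'], 'two']
print(deep)     # [['one'], 'two']
```

Only the deep copy is fully independent; the shallow copy is independent at the top level but still shares the inner list with the original.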
