Does Python's copy.deepcopy really copy everything? [duplicate] - python

This question already has answers here:
What is the difference between shallow copy, deepcopy and normal assignment operation?
(12 answers)
Closed 8 years ago.
I was under the impression that deepcopy copied everything recursively down a tree, but I came upon a situation that seemed to go against what I previously believed.
>>> import copy
>>> a = {}
>>> item = "hello"
>>> a["hello"] = item
>>> b = copy.deepcopy(a)
>>> id(a)
31995776
>>> id(b)
32733616 # I expected this
>>> id(a["hello"])
140651836041376
>>> id(b["hello"])
140651836041376 # I did not expect this
The ids of a and b are different, which I expected, but the internal item is still the same object. Does deepcopy only copy to a certain depth? Or is this something specific to the way Python stores strings? (I got a similar result with integers as well.)

deepcopy only needs to create copies of mutable objects, like lists and dictionaries. Strings and integers are immutable; they can't be changed in-place, so there's no need to explicitly create a copy, and a reference to the same object is inserted instead.
Here is a quick demo, showing the difference between lists (mutable) and tuples (immutable):
>>> import copy
>>> l = [[1, 2], (3, 4)]
>>> l2 = copy.deepcopy(l)
>>> l2[0] is l[0]
False # created new list
>>> l2[1] is l[1]
True # didn't create new tuple
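To round out the demo above, here is a small sketch of what happens when a tuple holds something mutable. It assumes the standard library copy module as shipped with CPython; the variable names are made up for illustration.

```python
import copy

# A tuple that holds only immutable values can be shared safely.
plain = (3, 4)
# A tuple that holds a mutable list cannot be shared, or the copy would
# still see changes made through the original.
mixed = (1, [2, 3])

plain_copy = copy.deepcopy(plain)
mixed_copy = copy.deepcopy(mixed)

print(plain_copy is plain)        # True: the same tuple object is reused
print(mixed_copy is mixed)        # False: a new tuple was built
print(mixed_copy[1] is mixed[1])  # False: the inner list was copied too

mixed[1].append(4)
print(mixed_copy)                 # (1, [2, 3]): unaffected by the change
```

So deepcopy is not stopping at a fixed depth; it just reuses objects when sharing them cannot be observed.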

Related

Python using type() to deepcopy

I was working on a leetcode question and ran into a problem where I'd have to deepcopy a list. I found a solution that used type() like such:
orignallist=[1,2,3]
deepcopylist=type(orignallist)(orignallist)
Sure enough, it works and deepcopylist is a deepcopy but how on earth is this working? Python's type() documentation doesn't make any mention of this and I also don't understand how parentheses work with the second (orignallist) added in.
First off, it's not a deep copy. You've made a shallow copy, exactly equivalent to what list(orignallist) would produce (it doesn't matter, because all the values contained in your example list are immutable types, specifically int, but if they weren't, the distinction between deep and shallow copies would be important).
Second, all type(orignallist) is doing is extracting the class that the object bound to orignallist is an instance of, in this case, list. It's runtime determined, so if orignallist was actually a set, it would get set, but right here it's getting list. After that, it's nothing special, it's just constructing an instance of whatever orignallist is using orignallist as the argument to the constructor. If you want to see what it's doing, you can do it piecemeal:
>>> orignallist=[1,2,3]
>>> type_of_orignallist = type(orignallist)
>>> type_of_orignallist is list # It's just another alias to list
True
>>> type_of_orignallist(orignallist) # Since it's an alias of list, calling it makes a new list
[1, 2, 3]
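Because the class is looked up at runtime, the same expression works for other containers too. A rough sketch, where shallow_copy_like is a hypothetical helper name and not anything from the standard library:

```python
def shallow_copy_like(obj):
    """Build a new container of the same class as obj (hypothetical helper)."""
    # type(obj) is the class itself (list, set, dict, ...); calling it with
    # obj as the argument constructs a new instance from obj's contents.
    return type(obj)(obj)

for original in ([1, 2, 3], {1, 2, 3}, {"a": 1}):
    duplicate = shallow_copy_like(original)
    # Prints the class name, then True (equal contents), then False (new object).
    print(type(duplicate).__name__, duplicate == original, duplicate is original)
```

This only works for classes whose constructor accepts an instance of the same class, and the result is still a shallow copy.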
In any event, the correct way to deep copy any object in Python is the copy.deepcopy routine:
>>> import copy
>>> lst_of_lst = [[]] # List with mutable element to demonstrate difference between shallow and deep copy
>>> shallow_copy = type(lst_of_lst)(lst_of_lst) # Or lst_of_lst[:], or lst_of_lst.copy(), or list(lst_of_lst)
>>> deep_copy = copy.deepcopy(lst_of_lst)
>>> lst_of_lst[0].append(1)
>>> lst_of_lst is shallow_copy # We copied the outer list structure
False
>>> lst_of_lst
[[1]]
>>> shallow_copy # Oops, shallow, not deep
[[1]]
>>> lst_of_lst[0] is shallow_copy[0] # Because we didn't copy the inner list
True
>>> deep_copy # Does what it says on the tin
[[]]
>>> lst_of_lst is deep_copy
False
>>> lst_of_lst[0] is deep_copy[0] # Yep, it recursively deepcopied so the inner list differs
False

.pop to remove items from lists causes weird problems with variable scopes in python [duplicate]

This question already has answers here:
How do I clone a list so that it doesn't change unexpectedly after assignment?
(24 answers)
Closed 2 years ago.
This one is making me absolutely crazy, so any help would be much appreciated. I have a program where I'm iterating through a list in a function. Here's a toy model of the problem:
masterList = ["person","woman","man","camera","television"]
workingList = masterList
def removeItem (workingList):
    item = workingList.pop(2)
    print("Test 1:",workingList)
removeItem(workingList)
print("Test 2:", workingList)
print("Test 3:", masterList)
As expected, "Test 1" prints out the list with an item removed.
However, "Test 2" also prints out the list with an item removed. I wasn't expecting that, but no matter. That's not the real problem, but I'm sure there's something I don't understand here about variable scopes and variable shadowing.
No, the real problem is "Test 3", which, as you can see, is printing out the masterList, which shouldn't even be touched by the removeItem function or the pop function within it. And yet, it too is printing out the list with an item removed.
How can this be happening, and how can I prevent it?
Thanks so much!
Cheers,
Ari
Python lists are mutable objects.
m = list([1, 2, 3])
n = m
a = 5
b = a
id(a) == id(b)
# id() returns the "identity" of the object.
# True: both a and b are references to the same object.
id(m) == id(n)
# True: both m and n are references to the same object.
b = b + 2
id(a) == id(b)
# False: a new object is created, and b now refers to it.
n.pop()
id(m) == id(n)
# True: the object is mutated in place, so m and n still refer to the same object.
Since Python lists are mutable, m and n still refer to the same object after the mutation. For immutable objects such as int, by contrast, an operation like b + 2 creates a new object, and the name is rebound to that new object.
The gist is that in your scenario there has only ever been one list object, because assignment binds a second name to it rather than copying it.
However, if you need the original list unchanged when the new list is modified, you can use the copy() method.
new_list = original_list.copy()
The ids of new_list and original_list are different.
Learn here about mutability in detail: https://medium.com/#meghamohan/mutable-and-immutable-side-of-python-c2145cf72747.
You have to make a copy of the masterList, otherwise workingList is nothing more than another reference to the same list.
If your list does not contain other containers (which yours doesn't), then a shallow copy is sufficient. There are numerous ways to make a shallow copy, and slicing is one of the shortest.
masterList = ["person","woman","man","camera","television"]
workingList = masterList[:] #copy via slice
# a bunch of other ways to make a shallow copy
workingList = masterList * 1
workingList = masterList.copy()
workingList = list(masterList)
workingList = [*masterList]
import copy
workingList = copy.copy(masterList)
If your list contains mutable objects (such as other containers or class instances), then you need to make a deepcopy.
import copy
a = [[1, 2, 3], ['a', 'b', 'c']]
b = copy.deepcopy(a)
Looks like I figured it out. The two lists are actually the same list unless you use list.copy()
So replacing the top two lines with:
masterList = ["person","woman","man","camera","television"]
workingList = masterList.copy()
makes everything perform as expected! Well, I still can't say I understand how the variable scopes work in their entirety, but at least this solves the major problem.
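On the "Test 2" surprise: it isn't really a scoping issue. The parameter inside the function is just another name for the list object that was passed in, so pop() changes the caller's list as well. One way to avoid that, sketched below with hypothetical names (remove_item, master_list), is to copy inside the function and return the result:

```python
def remove_item(items):
    """Return a new list without the element at index 2; the input is untouched."""
    working = items.copy()  # shallow copy, so pop() below cannot affect the caller
    working.pop(2)
    return working

master_list = ["person", "woman", "man", "camera", "television"]
working_list = remove_item(master_list)

print(working_list)  # ['person', 'woman', 'camera', 'television']
print(master_list)   # ['person', 'woman', 'man', 'camera', 'television']
```

Whether to copy at the call site or inside the function is a design choice; copying inside keeps the function free of side effects for every caller.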

Chained list assignment in python [duplicate]

This question already has answers here:
How do I clone a list so that it doesn't change unexpectedly after assignment?
(24 answers)
Closed 4 years ago.
When I ran this script (Python v2.6):
a = [1,2]
b = a
a.append(3)
print a
>>>> [1,2,3]
print b
>>>> [1,2,3]
I expected print b to output [1,2]. Why did b get changed when all I did was change a? Is b permanently tied to a? If so, can I make them independent? How?
Memory management in Python involves a private heap memory location containing all Python objects and data structures.
Python's runtime only deals in references to objects (which all live in the heap): what goes on Python's stack are always references to values that live elsewhere.
>>> a = [1, 2]
>>> b = a
>>> a.append(3)
Here we can clearly see that the variable b is bound to the same object as a.
You can use the is operator to test whether two names refer to the very same object (in CPython, that means the same address in memory). The same check can be made with the id() function.
>>> a is b
True
>>> id(a) == id(b)
True
So, in this case, you must explicitly ask for a copy.
Once you've done that, there will be no more connection between the two distinct list objects.
>>> b = list(a)
>>> a is b
False
Names in Python hold references to objects: b = a does not assign the value of a to b, it makes b refer to the same object that a refers to.
To emulate assignment by value, you can make a copy like so:
import copy
b = copy.copy(a)
# now the code works as "expected"
Be aware that copying costs time and memory proportional to the length of the list.
For a list, there's also a shorthand that relies on slicing:
b = a[:]
# code also works as expected here
Update: in addition to this, with some objects, including lists, you can use the constructor:
b = list(a)
Short answer - Pointers.
When you type b = a, it sets b to look at the same list that a looks at. You have to make a new list holding the same elements to separate them. In this case, something like b = [n for n in a] would work fine. For more complex operations you may want to check out http://docs.python.org/library/copy.html.
You might want to look at this link. The problem you have here is a and b both point to the same memory location, so changing one changes the other. Instead, you want to do something like this:
a = [1,2]
b = list(a)
a is a pointer to the list [1,2].
When you do the assignment b = a the value of b is the address of the list [1,2].
So when you do a.append(3) you are not actually changing a, you are changing the list that a points to. Since a and b both point to the same list, both appear to change when you modify the list through either name.
If you simply want to copy the contents of list a to b, instead of making b a pointer to a:
b = a[:]
Using the slice operator will copy the contents of the list into b, so that your example becomes:
a = [1,2]
b = a[:]
a.append(3)
print a
>>>> [1,2,3]
print b
>>>> [1,2]
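For anyone on Python 3, here is a small sketch of the same idea that also shows the difference between mutating a list and rebinding a name. The third name c is added only for illustration.

```python
a = [1, 2]
b = a          # b is a second name for the same list
c = a[:]       # c is a new list with the same elements

a.append(3)    # mutates the shared list in place
print(b)       # [1, 2, 3]
print(c)       # [1, 2]

a = a + [4]    # builds a new list and rebinds the name a to it
print(a)       # [1, 2, 3, 4]
print(b)       # [1, 2, 3]: b still refers to the old list
print(a is b)  # False
```

So b is not permanently tied to a; the two names simply share one object until one of them is rebound or a copy is made.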

python copying nested lists [duplicate]

This question already has answers here:
"Deep copy" nested list without using the deepcopy function
(5 answers)
Closed 4 years ago.
Let's say I have a list a, and I want to copy it to b so that I can alter a but keep its original form intact:
I use the traditional list() function...
a = [1,[2,3],4]
b = list(a)
print id(a), id(b)
# 2941136 35748600
a and b have different ids, so the copying is a success. But list() did not copy the sublist: altering a[1][0] will change b.
a[1][0]=3
print b
# [1, [3, 3], 4]
I'm aware of copy.deepcopy() to solve this sort of problem, but I'm wondering if there are other ways of handling this without using a module.
One way to copy nested lists (given your example) is:
def deepcopy_nested_list(data):
    out = []
    for el in data:
        if isinstance(el, list):
            out.append(deepcopy_nested_list(el))
        else:
            out.append(el)
    return out
This function copies the list to a new list and then recursively copies all nested lists to achieve a deep copy.
Please note that this only creates copies of lists; every other object is inserted by reference rather than copied, which is safe for immutable objects but not for mutable ones (e.g., dicts are not copied). It shows only the idea of how you would implement such a function and is not a full implementation.
In real world code you would of course use copy.deepcopy().
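If you want to push the idea a bit further without importing anything, a rough sketch that also handles tuples, sets and dicts could look like this. The name deepcopy_containers is made up, and this is still far from what copy.deepcopy() does: no class instances, no self-referencing structures, and tuple subclasses come back as plain tuples.

```python
def deepcopy_containers(data):
    """Recursively copy lists, tuples, sets and dicts (hypothetical helper).

    Anything else is returned as-is, which is only safe for immutable values.
    """
    if isinstance(data, list):
        return [deepcopy_containers(el) for el in data]
    if isinstance(data, tuple):
        return tuple(deepcopy_containers(el) for el in data)
    if isinstance(data, set):
        # Set elements must be hashable; recurse in case they are tuples.
        return {deepcopy_containers(el) for el in data}
    if isinstance(data, dict):
        # Keys are hashable, so only the values are copied recursively.
        return {key: deepcopy_containers(value) for key, value in data.items()}
    return data

a = [1, [2, 3], {"k": [4]}]
b = deepcopy_containers(a)
a[1][0] = 99
a[2]["k"].append(5)
print(b)  # [1, [2, 3], {'k': [4]}]
```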

When and why to use [:] in python [duplicate]

This question already has answers here:
Understanding slicing
(38 answers)
Closed 8 years ago.
sentence = "Hello"
print sentence
print sentence[:]
Both output the same thing, i.e. Hello
So, when and why to use/not use [:] ?
Thanks! :)
As Nobi pointed out in the comments, there's already a question regarding Python's slicing notation. As stated in the answer to that question, the slicing without start and end values ([:]) basically creates a copy of the original sequence.
However, you have hit a special case with strings. Since strings are immutable, it makes no sense to create a copy of a string: you can't modify any instance of it, so there's no need to have more than one in memory. So, basically, with s[:] (where s is a string) you're not creating a copy of the string; in CPython, at least, that expression returns the very same string object referenced by s. An easy way to see this is by using the id() (object identity) function:
>>> l1 = [1, 2, 3]
>>> l2 = l1[:]
>>> id(l1)
3075103852L
>>> id(l2)
3072580172L
Identities are different. However, with strings:
>>> s1 = "Hello"
>>> s2 = s1[:]
>>> id(s1)
3072585984L
>>> id(s2)
3072585984L
Identity is the same, meaning both are the same exact object.
>>> a = [1, 2, 3]
>>> b=a[:]
>>> id(b)
4387312200
>>> id(a)
4387379464
Use it when you want to make a shallow copy of a list.
>>> a='123'
>>> b=a[:]
>>> id(a)
4387372528
>>> id(b)
4387372528
But since a string is immutable, string[:] is no different from the string itself.
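The same comparison can be written with the is operator instead of comparing id() values by eye. A minimal sketch in Python 3, with illustrative variable names:

```python
numbers = [1, 2, 3]
numbers_copy = numbers[:]
print(numbers_copy == numbers)  # True: same contents
print(numbers_copy is numbers)  # False: a new list object

text = "Hello"
text_slice = text[:]
print(text_slice == text)       # True: same contents
# CPython returns the original string object here, but the language does
# not promise that, so code should not rely on the identity check below.
print(text_slice is text)
```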
P.S. I see most of people answering this question didn't understand what is the question at all.
The reason you are getting Hello as output is that you are not passing any slice parameters, so the defaults select the whole string.
L[start:stop:step]
Here L is your variable, which holds Hello. start is the index where the slice begins, stop is the index where it ends (not included), and step is the stride between the characters that are picked.
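A few quick examples of how the three parameters behave, using the string from the question (Python 3 syntax):

```python
sentence = "Hello"

print(sentence[:])     # Hello  (all defaults: the whole string)
print(sentence[1:4])   # ell    (index 1 up to, but not including, 4)
print(sentence[::2])   # Hlo    (every second character)
print(sentence[::-1])  # olleH  (a negative step walks backwards)
```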
For more information on this topic, visit this
See if that resolves your issue.
