How does the append method work in Python?

I am studying programming languages, and a quiz question (with its solution) was this:
def foo(x):
    x.append(3)
    x = [8]
    return x
x = [1, 5]
y = foo(x)
print x
print y
Why does this print the following?
[1, 5, 3]
[8]
Why doesn't x equal [8]?

The other two answers are great. I suggest you try id() to see the address of the object each name refers to.
See the following example:
def foo(x):
    x.append(3)
    print "global",id(x)
    x = [8]
    print "local ",id(x)
    return x
x = [1, 5]
print "global",id(x)
y = foo(x)
print "global",id(x)
print x
print y
And the output
global 140646798391920
global 140646798391920
local 140646798392928
global 140646798391920
[1, 5, 3]
[8]
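As you can see, the id of the list that x refers to stays the same when you mutate it with append, but it changes after x = [8]. The parameter x is already local to foo; assigning to it inside the function only rebinds that local name to a new object and leaves the caller's list untouched.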

You have a lot of things going on there. So let's go step by step.
x = [1, 5]: you are assigning the list [1, 5] to x.
y = foo(x): you are calling foo, passing in x, and assigning whatever foo returns to y.
Inside foo you call x.append(3), which appends 3 to the list that was passed in.
You then set x = [8], which rebinds the local name x to a new list. That new list is returned from foo, so y ends up as [8], while the global x still refers to the original list, now [1, 5, 3].
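The steps above can be traced in a short script. This is only a sketch in Python 3 syntax; the names `original` and `returned` are illustrative and do not come from the question.

```python
def foo(x):
    x.append(3)   # mutates the list the caller passed in
    x = [8]       # rebinds the local name x to a brand-new list
    return x

original = [1, 5]
returned = foo(original)

print(original)              # [1, 5, 3]
print(returned)              # [8]
print(original is returned)  # False
```

The `is` check at the end confirms that the caller's list and the returned list are two separate objects.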

Object References
The key to understanding this is that Python passes variables around using object references. These are similar to pointers in a language like C++, but they differ in some important ways.
When an assignment is made (using the assignment operator, =):
x = [1, 5]
Actually TWO things have been created. First is the object itself, which is the list [1, 5]. This object is a separate entity from the second thing, which is the (global) object reference to the object.
Object(s)     Object Reference(s)
[1, 5]        x (global)                      <--- New object created
In Python, objects are passed into functions by object reference; they are not passed "by reference" or "by value" as in C++. This means that when x is passed into the foo function, a new, local object reference to the same object is created.
Object(s)     Object Reference(s)
[1, 5]        x (global), x (local to foo)    <--- Now two object references
Now inside of foo we call x.append(3), which directly changes the object (referred to by the foo-local x object reference) itself:
Object(s)     Object Reference(s)
[1, 5, 3]     x (global), x (local to foo)    <--- Still two object references
Next, we do something different. We assign the local-foo x object reference (or re-assign, since the object reference already existed previously) to a NEW LIST.
Object(s)     Object Reference(s)
[1, 5, 3]     x (global)                      <--- One object reference remains
[8]           x (local to foo)                <--- New object created
Notice that the global object reference, x, is still there! It has not been impacted. We only re-assigned the local-foo x object reference to a NEW list.
And finally, it should be clear that when the function returns, we have this:
Object(s)     Object Reference(s)
[1, 5, 3]     x (global)                      <--- Unchanged
[8]           y (global)                      <--- New object reference created; foo's local x is gone
Sidebar
Notice that foo's local x is gone once the function returns, while the [8] object lives on because y now refers to it. Default arguments behave differently, and this is important to understand, because if we do something like this:
def f(a=[]):
    a.append(1)
    return a
print(f())
print(f())
print(f())
We will NOT get:
[1]
[1]
[1]
Instead, we will get:
[1]
[1, 1]
[1, 1, 1]
The default expression [] is evaluated only ONCE, when the def statement is executed, and the resulting list is stored on the function object (in f.__defaults__). It stays there for as long as the function exists.
As a result, when f() is called without an argument, the local name a is bound to that same stored list each time rather than to a fresh []. The list "remembers" its previous contents because the function object still holds a reference to it, so it is never garbage collected between calls.
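To see where the shared list lives, one can inspect the function object directly. The sketch below relies on the standard `__defaults__` attribute of function objects; the variable names are illustrative.

```python
def f(a=[]):
    a.append(1)
    return a

default_before = f.__defaults__[0]   # the list created when `def` ran
first = f()
second = f()

print(first is second)           # True
print(first is default_before)   # True
print(f.__defaults__)            # ([1, 1],)
```

Every call without an argument hands back the very same list object, which is why the contents keep growing.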
Contrast with Pointers
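One of the ways object references are different from pointers is in assignment. If assigning through an object reference wrote through to the referenced object, the way assigning through a dereferenced pointer (*b = 2) does in C or C++, you would expect behavior such as the following: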
a = 1
b = a
b = 2
assert a == 2
However, this assertion fails with an AssertionError. The reason is that b = 2 does NOT change the object that b referred to. It simply rebinds b to a different object (2), and a still refers to 1.
Another way object references are different from pointers is in deletion:
a = 1
b = a
del a
assert b is None
This assertion also produces an error. The reason is the same as in the example above; del a does NOT impact the object "pointed to" by the object reference, b. It simply deletes the object reference, a. The object reference, b, and the object it points to, 1, are not impacted.
You might be asking, "Well then how do I delete the actual object?" The answer is YOU CAN'T! You can only delete all the references to that object. Once there are no longer any references to an object, the object becomes eligible for garbage collection and it will be deleted for you (in CPython this usually happens immediately, and gc.collect() can be used to force a collection of reference cycles). This feature is known as automatic memory management. It is one of the primary strengths of Python, and it is also one of the reasons why Python uses object references in the first place.
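A weak reference makes it possible to watch an object disappear once its last reference is deleted. This is a sketch that assumes CPython's reference counting (other implementations may free the object later); `Thing` and `probe` are hypothetical names introduced for the demonstration.

```python
import weakref

class Thing:
    """Hypothetical placeholder class; plain ints and lists cannot be weakly referenced."""

a = Thing()
b = a
probe = weakref.ref(a)   # observes the object without keeping it alive

del a
alive_after_first_del = probe() is not None    # b still refers to the object

del b
alive_after_second_del = probe() is not None   # no references remain

print(alive_after_first_del, alive_after_second_del)   # True False on CPython
```

Deleting `a` removes only one name; the object itself goes away only after `b` is deleted as well.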
Mutability
Another subject that needs to be understood is that there are two types of objects: mutable, and immutable. Mutable objects can be changed, while immutable objects cannot be changed.
A list, such as [1, 5], is mutable. A tuple or int, on the other hand, is immutable.
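A small sketch of the difference, using illustrative names only: a list can be changed in place, a tuple rejects item assignment, and "changing" an int really rebinds the name to a different int.

```python
nums = [1, 5]
alias = nums
nums.append(3)          # in-place change, visible through both names
print(alias)            # [1, 5, 3]

point = (1, 5)
try:
    point[0] = 9        # tuples do not support item assignment
except TypeError as exc:
    print("TypeError:", exc)

count = 1
same = count
count += 1              # ints are immutable: this rebinds count to another int
print(count, same)      # 2 1
```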
The append Method
Now that all of this is clear, we should be able to intuit the answer to the question "How does append work in Python?" The result of the append() method is an operation on the mutable list object itself. The list is changed "in place", so to speak. No new list is created and assigned to the foo-local x. This is in contrast to the assignment operator =, which does not touch the old object at all: it binds the name on the left to whatever object the right-hand side evaluates to (here, a NEW list).
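The contrast can be made concrete by comparing append with list concatenation followed by assignment. The two helper functions below are hypothetical and exist only for this illustration.

```python
def append_in_place(items):
    items.append(3)        # same list object, now one element longer
    return items

def concat_copy(items):
    items = items + [3]    # builds a new list, then rebinds the local name
    return items

a = [1, 5]
b = [1, 5]
ra = append_in_place(a)
rb = concat_copy(b)

print(a, ra is a)   # [1, 5, 3] True
print(b, rb is b)   # [1, 5] False
```

Only the first function changes the list the caller owns; the second leaves it alone and returns a different object.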

The append function modifies the x that was passed into the function, whereas assigning something new to x changed the locally scoped value and returned it.

The scope of x inside foo is specific to the function and is independent from the main calling context. x inside foo starts out referencing the same object as x in the main context because that's the parameter that was passed in, but the moment you use the assignment operator to set it to [8] you have allocated a new object to which x inside foo points, which is now totally different from x in the main context. To illustrate further, try changing foo to this:
def foo(x):
    print("Inside foo(): id(x) = " + str(id(x)))
    x.append(3)
    print("After appending: id(x) = " + str(id(x)))
    x = [8]
    print("After setting x to [8], id(x) = " + str(id(x)))
    return x
When I executed, I got this output:
Inside foo(): id(x) = 4418625336
After appending: id(x) = 4418625336
After setting x to [8], id(x) = 4418719896
[1, 5, 3]
[8]
(the IDs you see will vary, but the point will still be clear I hope)
You can see that append just mutates the existing object, with no need to allocate a new one. But once x = [8] executes, a new list object is allocated for [8] and the local x is bound to it (it is eventually returned to the main context, where it is assigned to y).

Related

Can we delete reference cycle object and free its memory?

I read different articles about garbage collection. They say that we need gc module to clean the reference cycle object. But can we do the cleaning by simply using del?
For example, if I do the following, do I successfully free the memory of this reference cycle object? If yes, then why do we need gc module anyway? If not, then why not?
>>> import sys
>>> x = [1, 2, 3]
>>> x.append(x) # create reference cycle
>>> print(x)
[1, 2, 3, [...]]
>>> sys.getrefcount(x)
3
>>> del x[3]
>>> print(x)
[1, 2, 3]
>>> sys.getrefcount(x)
2
>>> del x # reference count of x goes to 0!
CPython uses reference counting. When an object's reference count goes to zero, the object is deleted, and that happens inline with the code that caused the decrement. The garbage collector is there to handle the case where an object is unreachable but its reference count is not zero. It's not that every circular reference needs a garbage collection; it matters how that reference is unwound.
>>> x = [1, 2, 3]
We created a list and bound it to x. The list itself is just an anonymous object in memory, currently reachable through x. Let's call the actual object "the list".
>>> x.append(x) # create reference cycle
>>> print(x)
[1, 2, 3, [...]]
>>> sys.getrefcount(x)
3
At this point, the list has references from x, its own 4th element and the getrefcount parameter (which disappears when the function returns, leaving 2 references).
>>> del x[3]
del doesn't actually delete things. It unbinds objects. In that case, the list's __delitem__ function is called unbinding the list from its own 4th element, leaving a single reference.
>>> del x
This time del unbinds the list from x and removes x from the namespace. The reference count went to zero and the list was deleted. The garbage collector was never involved.
Now let's mix it up.
>>> x = [1, 2, 3]
>>> x.append(x) # create reference cycle
We have 2 references to the list. But this time just
>>> del x
del unbinds the list from x and decrements its reference count to 1. That's not enough to delete the list. Now we have a problem: the list is no longer bound to any variable that we can get to. It's not in x because x doesn't exist any more, but it's still in memory. That is what the garbage collector attempts to fix.
We can't just use del because there is no variable to do the del on.
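The second scenario can be observed with a weak reference. This is a sketch assuming CPython; because built-in lists cannot be weakly referenced, a hypothetical `Node` class stands in for the list, and automatic collection is switched off for the duration so the timing is predictable.

```python
import gc
import weakref

class Node:
    """Hypothetical stand-in for the list; built-in lists cannot be weakly referenced."""

gc.disable()               # keep the automatic collector from running mid-demo
node = Node()
node.me = node             # reference cycle: the object refers to itself
probe = weakref.ref(node)

del node                   # the name is gone, but the cycle keeps the count above zero
still_alive = probe() is not None

gc.collect()               # the cycle detector finds and frees the unreachable object
collected = probe() is None
gc.enable()

print(still_alive, collected)   # True True on CPython
```

After `del node` there is no name left to delete, yet the object survives until the collector runs.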

Why do python lists act like this when using the = operator [duplicate]

How come the following code:
a = [1,2,3]
b = a
b[0] = 3
print(a)
print list b after it has been altered, i.e. [3, 2, 3]?
Also, why is that the case when the following code:
a = [1,2,3]
b = a
b = [0,0,0]
print(a,b)
prints [1, 2, 3] [0, 0, 0]?? This seems inconsistent. If the first code is true, then shouldn't the second code print [0,0,0][0,0,0]? Can someone please provide an explanation for this?
In Python there are two types of data: mutable and immutable. Numbers, strings, booleans, tuples, and other simple types are immutable. Dicts, lists, sets, instances of most user-defined classes, and other complex types are mutable.
When you say:
a = [1,2,3]
b = a
You've created a single mutable list in memory, assigned a to point to it, and then assigned b to point to it. It's the same thing in memory.
Therefore when you mutate it (modify it):
b[0] = 3
It is a modification (mutation) of the index [0] of the value which b points to at that same memory location.
However, when you replace it:
b = [0,0,0]
It is creating a new mutable list in memory and assigning b to point at it.
Check out the id() function. It will tell you the "address" of the object a name refers to (in CPython, literally its memory address). You can see which names are pointing to the same object by comparing id(varname) values.
Bonus: Every value in python is passed by reference... meaning that when you assign it to a variable it simply causes that variable to point to that value where it was in memory. Having immutable types allows python to "reuse" the same memory location for common immutable types.
Consider some common values when the interpreter starts up:
>>> import sys
>>> sys.getrefcount('abc')
68
>>> sys.getrefcount(100)
110
>>> sys.getrefcount(2)
6471
However, a value that is definitely not present would return 2. This has to do with the fact that a couple of references to that value were in-use during the call to sys.getrefcount
>>> sys.getrefcount('nope not me. I am definitely not here already.')
2
Notice that an empty tuple has a lot of references:
>>> sys.getrefcount(tuple())
34571
But an empty list has no extra references:
>>> sys.getrefcount(list())
1
Why is this? Because tuple is immutable so it is fine to share that value across any number of variables. However, lists are mutable so they MUST NOT be shared across arbitrary variables or changes to one would affect the others.
Incidentally, this is also why you must NEVER use mutable types as default argument values to functions. Consider this innocent little function:
>>> def foo(value=[]):
...     value.append(1)
...     print(value)
...
When you call it you might expect to get [1] printed...
>>> foo()
[1]
However, when you call it again, you probably won't expect to get [1, 1] out:
>>> foo()
[1, 1]
And on and on...
>>> foo()
[1, 1, 1]
>>> foo()
[1, 1, 1, 1]
WHY IS THIS? Because default arguments to functions are evaluated once during function definition, and not at function run time. That way if you use a mutable value as a default argument value, then you will be stuck with that one value, mutating in unexpected ways as the function is called multiple times.
The proper way to do it is this:
>>> def foo(value=None):
...     if value is None:
...         value = []
...     value.append(1)
...     print(value)
...
>>> foo()
[1]
>>> foo()
[1]
>>> foo()
[1]

Variables and memory allocated for them?

I have a question about how Python deals with memory when copying variables.
For example, I have a list(or string, tuple, dictionary, set) variable
A = [1,2,3]
then I assign the value of A to another variable B
B = A
then if I do "some changes" to A, e.g.,
A.pop(0)
then B also changes, i.e.,
print(A,B) will give me ([2,3], [2,3])
I read some material and they say "B=A did not copy the value of A to a new place in memory labelled by B. It just made the name B point to the same position in memory as A." Can I interpret this as we still only have one place of memory, but now it has 2 names?
However, I found that if I did some other changes to A, such as
A = [5,6] # I reassign A value,
Then I found
print(A,B)
gives me ([5,6],[1,2,3])
So I am confused here. It seems that now we have two places of memory
Your first understanding was correct. When you do
B = A
you now have two names pointing to the same object in memory.
Your misunderstanding is what happens when you do
A = [5, 6]
This doesn't copy [5, 6] to that location in memory. It allocates a new list [5, 6] and then changes the name A to point to this. But B still points to the same list that it pointed to before.
Basically, every time you do
A = <something>
you're changing where A points, not changing the thing that it points to.
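For contrast, here is a sketch showing that an in-place operation such as slice assignment does change the shared object, while plain assignment only moves the name. It reuses the names A and B from the question.

```python
A = [1, 2, 3]
B = A

A[:] = [5, 6]       # slice assignment replaces the contents of the existing list
print(B, A is B)    # [5, 6] True

A = [7, 8]          # plain assignment rebinds the name A to a new list
print(B, A is B)    # [5, 6] False
```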
Lists are objects, and names hold references to them. When you write B = A you get a second reference (a C pointer under the hood) to the object behind A, not to the name A itself, so, as your code already shows, A is B evaluates to True. Because the reference is to the object and not to A, the old list stays alive when you rebind A with A = [5, 6]: B still refers to it, so it is not garbage collected. Only the address stored in A changes.
If you then reassign B = A, however, B will be [5, 6].
The second time, you assign a new object to a:
>>> a= [1,2,3]
>>> id(a)
4353139632
>>> b = a
>>> id(b)
4353139632
>>> a= [4,5]
>>> id(a)
4353139776
>>> id(b)
4353139632
Lists, tuples, and objects are referenced in Python. You can see these variable names as pointers in C. So, A is a pointer to some location, storing an array, when you did B = A you copied the reference to that location ( the address ) to B.
So, when you change the contents at that location via A, you see the changed contents whether you access that memory location via A or via B.
However, if you would like to copy the elements, you can use
B = [i for i in A]
or something like that.
And when you assigned some other value to A, with A = [5,6], the reference in A started pointing to some other memory location while the one in B still pointed to the original location, so B stays the same.
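A few standard ways to copy a list, sketched with a nested list to show where a shallow copy still shares data. `copy.deepcopy` is from the standard library; the variable names are illustrative.

```python
import copy

A = [[1, 2], [3, 4]]
shallow = list(A)            # same result as A[:] or copy.copy(A)
deep = copy.deepcopy(A)

A.append([5, 6])             # only A grows
A[0].append(99)              # the inner list is shared with the shallow copy

print(shallow)   # [[1, 2, 99], [3, 4]]
print(deep)      # [[1, 2], [3, 4]]
```

A shallow copy is a new outer list holding references to the same inner objects, so it is enough for flat lists of immutable items but not for nested mutable data.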

Why does 'remove' mutate a list when called locally? [duplicate]

I recently came across this:
x = [1,2,3]
def change_1(x):
    x = x.remove(x[0])
    return x
Which results in:
>>> change_1(x)
>>> x
[2, 3]
I find this behavior surprising, as I thought that whatever goes on inside a function has no effect on outside variables. Furthermore, I constructed an example where I basically do the same thing but without using remove:
x = [1,2,3]
def change_2(x):
    x = x[1:]
    return x
Which results in:
>>> change_2(x)
[2, 3] # Also the output prints out here not sure why this is
>>> x
[1, 2, 3]
And I get the result that I would expect, the function does not change x.
So it must be something remove specific that has effect. What is going on here?
One of the things that is confusing is that you've called lots of different things 'x'. For example:
def change_1(x):            # the parameter x refers to the same list as the global 'x'
    x = x.remove(x[0])      # remove() mutates that list and returns None; the local name x is then rebound to None
    return x                # the local x (now None) is returned
>>> x = [1, 2, 3]
>>> y = change_1(x) # assign the return value to 'y'
>>> print y
None # this is None because x.remove() returned None, which was then assigned to the local 'x' inside the function
>>> print x
[2, 3] # but x.remove() modified the global x inside the function
def change_2(x):
    x = x[1:]   # again, x on left is local, it gets a copy of the slice, but the list passed in is not changed
    return x    # return the slice (copy)
>>> x = [1, 2, 3]
>>> y = change_2(x)
>>> print x
[1, 2, 3] # the global 'x' is not changed!
>>> print y
[2, 3] # but the slice created in the function is assigned to 'y'
You would get the same result if you called the parameter of your function n, or q.
It's not the variable name that's being affected. The x in the scope of your function and the x outside that scope are two different "labels". However, since you passed the value that x was attached to into change_1(), they both refer to the same object. When you do x.remove() on the object in the function, you are basically saying: "get the object that x refers to, and remove an element from that object." This is very different from an assignment with x on the left-hand side. If you did y = 0; x = y inside your function, you would not be doing anything to the list object. You would basically just be tearing the 'x' label off of [1, 2, 3] and putting it on whatever y is pointing to.
Your variable x is just a reference to the list you've created. When you call that function, you are passing that reference by value. But in the function, you have a reference to the same list. So when the function modifies it, it is modified in every scope.
Also, when you evaluate an expression in the interactive interpreter, Python prints its value if it is not assigned to a variable (unless the value is None). That is why change_2(x) echoed [2, 3] while change_1(x) printed nothing.
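One way to keep the two behaviors apart is to write each intent as its own function. The sketch below uses hypothetical names (`without_first`, `drop_first`) and follows the common Python convention that mutating methods return None.

```python
def without_first(items):
    """Return a new list; the caller's list is left alone."""
    return items[1:]

def drop_first(items):
    """Mutate the caller's list in place and, by convention, return None."""
    items.remove(items[0])

x = [1, 2, 3]
y = without_first(x)
print(x, y)            # [1, 2, 3] [2, 3]

result = drop_first(x)
print(x, result)       # [2, 3] None
```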

Python Variable Scope (passing by reference or copy?)

Why does the variable L get manipulated in the sorting(L) function call? In other languages, a copy of L would be passed to sorting(), so that any changes to x would not change the original variable.
def sorting(x):
    A = x #Passed by reference?
    A.sort()
def testScope():
    L = [5,4,3,2,1]
    sorting(L) #Passed by reference?
    return L
>>> print testScope()
[1, 2, 3, 4, 5]
Long story short: Python uses pass-by-value, but the things that are passed by value are references. The actual objects have 0 to infinity references pointing at them, and for purposes of mutating that object, it doesn't matter who you are and how you got a reference to the object.
Going through your example step by step:
L = [...] creates a list object somewhere in memory, the local variable L stores a reference to that object.
sorting (strictly speaking, the callable object pointed to by the global name sorting) gets called with a copy of the reference stored by L, and stores it in a local called x.
The method sort of the object pointed to by the reference contained in A (itself a copy of the one in x) is invoked. It gets a reference to the object (in the self parameter) as well. It mutates that object (the object itself, not some reference to the object, which is little more than a memory address).
Now, since references were copied, but not the object the references point to, all the other references we discussed still point to the same object. The one object that was modified "in-place".
testScope then returns another reference to that list object.
print uses it to request a string representation (calls the __str__ method) and outputs it. Since it's still the same object, of course it's printing the sorted list.
So whenever you pass an object anywhere, you share it with whoever receives it. Functions can (but usually won't) mutate the objects (pointed to by the references) they are passed, from calling mutating methods to assigning members. Note though that assigning a member is different from assigning a plain ol' name, which merely means mutating your local scope, not any of the caller's objects. So you can't mutate the caller's locals (this is why it's not pass-by-reference).
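The difference between assigning a member and assigning a plain name can be sketched as follows. `Box`, `set_member`, and `rebind_name` are hypothetical names introduced only for this illustration.

```python
class Box:
    """Hypothetical container used only for this illustration."""
    def __init__(self, value):
        self.value = value

def set_member(box):
    box.value = 2        # mutates the object the caller also refers to

def rebind_name(box):
    box = Box(3)         # only changes what the local name refers to

b = Box(1)
set_member(b)
print(b.value)   # 2
rebind_name(b)
print(b.value)   # 2
```

The second call has no visible effect on the caller because it never touches the shared object.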
Further reading: A discussion on effbot.org why it's not pass-by-reference and not what most people would call pass-by-value.
Python has the concept of Mutable and Immutable objects. An object like a string or integer is immutable - every change you make creates a new string or integer.
Lists are mutable and can be manipulated in place. See below.
a = [1, 2, 3]
b = [1, 2, 3]
c = a
print a is b, a is c
# False True
print a, b, c
# [1, 2, 3] [1, 2, 3] [1, 2, 3]
a.reverse()
print a, b, c
# [3, 2, 1] [1, 2, 3] [3, 2, 1]
print a is b, a is c
# False True
Note how c was reversed, because c "is" a. There are many ways to copy a list to a new object in memory. An easy method is to slice: c = a[:]
The documentation specifically mentions that the .sort() method mutates the list in place. If you want a sorted version without changing the original, use sorted(L) instead. It returns a new sorted list and leaves L as it was.
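A brief sketch of the two side by side, reusing the list from the question; `new_list` and `result` are illustrative names.

```python
L = [5, 4, 3, 2, 1]

new_list = sorted(L)      # returns a new sorted list
print(L, new_list)        # [5, 4, 3, 2, 1] [1, 2, 3, 4, 5]

result = L.sort()         # sorts in place and returns None
print(L, result)          # [1, 2, 3, 4, 5] None
```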
a = 1
b = a
a = 2
print b
References are not the same as separate objects.
.sort() also mutates the collection.
