I'm trying to understand how references work in Python

After line 7, I haven't written a single line of code that mentions the list named 'outer'. However, if you execute the code, you'll see that 'outer' (i.e., the nested list inside it) still changes due to lines 10 and 12.
I'm guessing it has something to do with reference vs. value. My question is: why didn't line 13 affect (change/update) the 'outer' list the same way that lines 10 and 12 did? I'm trying to understand this concept. How do I go about it? I know there are a lot of resources online, but I don't even know what to google. Please help.
inner = []
outer = []
lis = ['a', 'b', 'c']
inner.append(lis[0])
outer.append(inner)   # <<---- Line 7
print(outer)
inner.append(lis[1])  # <<---- Line 10
print(outer)
inner.append(lis[2])  # <<---- Line 12
print(outer)
lis[2] = 'x'          # <<---- Line 13
print(outer)

This is a boiled-down version of your example:
some_list = []
a = 2
some_list.append(a)
a = 3
print(some_list) # output: [2]
print(a) # output: 3
If we follow your original logic, you would expect some_list to contain the value 3 when we print it. But the reality is that we never appended a itself to the list. Instead, writing some_list.append(a) means appending the value referenced by a to the list some_list.
Remember, variables are simply references to a value. Here's the same snippet as above, but with an explanation of what's referencing what.
some_list = [] # the name "some_list" REFERENCES an empty list
a = 2 # the name "a" REFERENCES the integer value 2
some_list.append(a) # we append the value REFERENCED BY "a"
# (the integer 2) to the list REFERENCED
# BY "some_list". That list is not empty
# anymore, holding the value [2]
a = 3 # the name "a" now REFERENCES the integer value 3. This
# has no implications on the list REFERENCED BY "some_list".
# We simply move the "arrow" that pointed the name "a" to
# the value 2 to its new value of 3
print(some_list) # output: [2]
print(a) # output: 3
The key aspect to understand here is that variables are simply references to a value. Writing some_list.append(a) does not mean "place the variable a into the list" but rather "place the value that the variable a references at this moment in time into the list". Variables cannot keep track of other variables, only the values that they are a reference to.
This becomes even clearer if we append to some_list a second time, after modifying the value that a references:
some_list = []
a = 2
some_list.append(a)
a = 3
some_list.append(a)
print(some_list) # output: [2, 3]
print(a) # output: 3
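For contrast, here is a minimal sketch (not part of the original answer) of what happens when the appended value is itself mutable, in this case a list. It reuses the illustrative names from above; `after_mutation` is an extra name introduced only for this example. Mutating the object is visible through the list, while rebinding the name is not:

```python
some_list = []
a = [2]               # the name "a" references a mutable list this time
some_list.append(a)   # some_list now holds a reference to that same list

a.append(3)           # mutates the shared list object in place
after_mutation = [list(item) for item in some_list]  # snapshot of the contents

a = [99]              # rebinds the name "a" to a brand-new list

print(after_mutation)  # output: [[2, 3]]
print(some_list)       # output: [[2, 3]]  (rebinding "a" changed nothing here)
print(a)               # output: [99]
```

The list never tracked the name a. It only ever held a reference to the object that a referred to when append was called.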

In Python, when you store a list in a variable, you don't store the list itself, but a reference to a list somewhere in the computer's RAM. If you say
a = [0, 1, 2]
b = a
c = 3
then both a and b will be references to the same list, because you set b to a, which is itself a reference to a list. Modifying the list through a is then visible through b, and vice versa. c, however, refers to an integer. Integers are immutable, so you can picture c as simply holding its value. It looks like this:
  ┌───┐
a │ █━┿━━━━━━━━━━━━━━━┓  ┌───┬───┬───┐
  └───┘               ┠→ │ 0 │ 1 │ 2 │
  ┌───┐               ┃  └───┴───┴───┘
b │ █━┿━━━━━━━━━━━━━━━┛
  └───┘
  ┌───┐
c │ 3 │
  └───┘
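a and b are references to the same list. c refers to an integer; when you say d = c, d refers to that same integer, but because an integer can never be modified in place, c and d behave as if the value had been copied. No link between the names c and d is created.
So, let's go back to your program. When you say inner.append(lis[n]), you add the value that lis[n] refers to at that moment to the end of the list inner. You don't add a reference to slot n of the list lis; inner gets its own reference to the string, independent of lis. Assigning lis[2] = 'x' later only makes slot 2 of lis refer to a different string; the string that inner refers to is untouched.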
If you modify lis, then it will have an impact only on variables that are references to lis.
If you want inner to be modified if you modify lis, then replace the inner.append(lis[n])s by inner.append(lis).
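The following sketch replays the program from the question and checks the two claims with `is` and plain prints. The names `same_object` and `tracking` are introduced only for this illustration; the second half tries the inner.append(lis) variant suggested above.

```python
inner = []
outer = []
lis = ['a', 'b', 'c']

inner.append(lis[0])
outer.append(inner)        # outer[0] and inner are now the same list object
inner.append(lis[1])
inner.append(lis[2])

same_object = outer[0] is inner   # one list, reachable through two names

lis[2] = 'x'               # changes slot 2 of lis; inner is not involved
print(same_object)         # True
print(outer)               # [['a', 'b', 'c']]
print(lis)                 # ['a', 'b', 'x']

# Variant: append lis itself instead of its items.
tracking = []
tracking.append(lis)
lis[2] = 'y'
print(tracking)            # [['a', 'b', 'y']]
```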

Related

What happens in memory while unpacking a collection?

Say
list1=[4,8,12]
a,b,c=list1
output is a=4,b=8,c=12.
My confusion
My instructor told us that it is not as if a gets mapped to 4, b to 8, and c to 12. I didn't clearly understand what he said, although I listened to him several times. He said something like: an object is created for 4, and a is mapped to that object. But what is the difference between this and what I have presented in the figure below?
The thing about your picture that's misleading is that it implies that a, b, and c reference slices of list1. If you change list1, though, you will find that a, b, and c aren't affected by that change.
A better way to draw the picture might be to show 4, 8, and 12 separate from list1:
list1-->[ ][ ][ ]
         |  |  |
         V  V  V
         4  8  12
         ^  ^  ^
         |  |  |
         a  b  c
All of the variables are independent of one another, even though some of them (e.g. list1[0] and a) currently point to the same values.
To put it another way: saying a = list1[0] is saying "evaluate list1[0] and assign a to reference whatever that value is right now", which is not the same as saying "make a be an alias for list1[0]".
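The difference between "referencing the same value" and "being an alias for a slot" is easier to see when the elements are mutable. Here is a small sketch that assumes a list of lists instead of the integers from the question:

```python
list1 = [[4], [8], [12]]
a, b, c = list1

a.append(5)        # mutates the object that list1[0] also references
list1[1] = [80]    # rebinds slot 1 of list1; the name b is untouched

print(list1)       # [[4, 5], [80], [12]]
print(a)           # [4, 5]
print(b)           # [8]
print(a is list1[0], b is list1[1])  # True False
```

a still shares an object with list1[0], so the mutation shows up in both places. b was never tied to slot 1 itself, so replacing that slot leaves b alone.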
Try this:
# define the list
list1 = [4, 8, 12]
# unpacking binds the names a, b and c to the three integer objects
# that list1 currently holds; no integers are copied
# (a small int object takes about 28 bytes in 64-bit CPython 3,
# see sys.getsizeof(4))
a, b, c = list1
print(a, b, c)
# change the first value of the list
list1[0] = 56
print(list1)
# now check that "a" no longer refers to the same object as "list1[0]"
print(a)
But
You must be careful with this kind of assignment when lists are involved. Try this as well:
list2 = list1
print(list1, list2)
# then change any of them
list1[0] = -1
# check that "list2" refers to the same list object as "list1"
print(list1, list2)
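If the aliasing shown above is not what you want, one common option is to make a copy. This is a minimal sketch, assuming Python 3; the names `alias` and `snapshot` are illustrative only.

```python
list1 = [4, 8, 12]
alias = list1             # a second name for the same list object
snapshot = list1.copy()   # a new list; list(list1) or list1[:] work as well

list1[0] = -1

print(alias)      # [-1, 8, 12]
print(snapshot)   # [4, 8, 12]
print(alias is list1, snapshot is list1)  # True False
```

Note that this is a shallow copy: the new list is independent, but nested mutable elements would still be shared between the two lists.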

how del list[:] works [duplicate]

list[:] creates a copy of the list, so why does del list[:] remove all the list items?
Shouldn't it delete the copy of the list?
The answer is no, it shouldn't. It's intended to delete all of the elements of the list. Looking at the documentation, the result of s.clear() (where s is a mutable sequence type, for example a list) is:
removes all items from s (same as del s[:])
Hence, del s[:] is the same as s.clear() in that it removes all items from s.
Perhaps, this is a bit more understandable if you consider that the function called behind the scenes is __delitem__. From the docs:
Called to implement deletion of self[key]. Same note as for __getitem__(). This should only be implemented for mappings if the objects support removal of keys, or for sequences if elements can be removed from the sequence. The same exceptions should be raised for improper key values as for the __getitem__() method.
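To see which special method Python actually calls, here is a small sketch using a hypothetical list subclass, `TracedList`, that logs every call into a `calls` list. Both names are invented for this illustration, and the behaviour shown assumes Python 3.

```python
calls = []  # records which special method Python invoked, and with what key

class TracedList(list):
    """Hypothetical list subclass used only to log special-method calls."""

    def __getitem__(self, key):
        calls.append(('__getitem__', key))
        return super().__getitem__(key)

    def __delitem__(self, key):
        calls.append(('__delitem__', key))
        super().__delitem__(key)

s = TracedList([1, 2, 3])
copy_of_s = s[:]   # reading a slice: __getitem__ builds a new list
del s[:]           # deleting a slice: __delitem__ works on s itself

print(calls)
print(copy_of_s, s)  # [1, 2, 3] []
```

The del statement never goes through __getitem__, so no copy is created and then thrown away. The slice is simply passed as the key to __delitem__ on the original object.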
Consider the following difference:
a = [1,2,3]
b = a
del a
# print(a) ## raises an error
print(b) ## prints [1,2,3]
c = [1,2,3]
d = c
del c[:]
print(c) ## prints []
print(d) ## prints []
So why would you want del a[:] to behave this way? Well, think of it as just a special case of deleting a slice of a list. For example, say that you'd want to delete the 3rd, 4th, and 5th element of a long list a = list(range(40)). With the slice notation and the __delitem__ this is easy, just use del a[3:6]. Now try to do the same with a for loop and you'll soon find out it can get quite cumbersome. Heck, just try to delete all the items of a (but not the a itself!) with a for loop ;)
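The slice deletion and its loop equivalent can be sketched as follows. The names `alias` and `b` are added only for the comparison.

```python
a = list(range(40))
alias = a              # second name for the same list object

del a[3:6]             # removes the elements at indices 3, 4 and 5
print(len(a), a[:5])   # 37 [0, 1, 2, 6, 7]

# A loop has to delete from the back so the remaining indices stay valid.
b = list(range(40))
for index in reversed(range(3, 6)):
    del b[index]
print(a == b)          # True

del a[:]               # empties the list but keeps the object itself
print(a, alias is a)   # [] True
```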
As an expression, a[:] is simply a copy of a: a new list object that holds the very same element objects as a. The elements are identical, but the two lists are not.
Let's create a list and do some id checks:
a = [1, 2, 3]
print(id(a)) # 97731118088
print(id(a[:])) # 97731213576
print(id(a[:])) # 97731212104
print(id(a[:])) # 97731198600
Notice the changing ids of the copy a[:]. It is an object that is created on the fly every time the expression is evaluated, and most importantly, even a[:] is a[:] is not True! Let's look at the ids of their elements to come to a conclusion:
print(id(a[0])) # 1648192992
print(id(a[:][0])) # 1648192992
a[0] is a[:][0] # True
a[1] is a[:][1] # True
a[2] is a[:][2] # True
So, we can conclude that a[:] is an object that consists of the very elements of a but is a different object than a, and is a different object every time it is called. The elements of a and a[:] are all identical, but they themselves are not.
So what del a[:] does is remove all of the elements of a. That way a is mutated, and you end up with an empty a, i.e. []. However, del a removes the name a from the namespace completely, and when you ask Python to print a for you, you'll get a NameError: name 'a' is not defined.
But how do we know that? Well, let's gain some perspective by disassembling del on a vs a[:]:
Let's define two functions:
def deletion1(a):
    del a
def deletion2(a):
    del a[:]
Let's disassemble them:
import dis
dis.dis(x = deletion1)
1 0 DELETE_FAST 0 (a)
2 LOAD_CONST 0 (None)
4 RETURN_VALUE
dis.dis(x = deletion2)
1 0 LOAD_FAST 0 (a)
2 LOAD_CONST 0 (None)
4 LOAD_CONST 0 (None)
6 BUILD_SLICE 2
8 DELETE_SUBSCR
10 LOAD_CONST 0 (None)
12 RETURN_VALUE
The dis documentation indicates that the DELETE_FAST operation, which the first function does, simply "Deletes local co_varnames[var_num]". This is basically removal of that name a so that you can't reach the list object anymore. Beware, this does not remove the object that is referred to by the name a, but just removes its name so that the a is not a reference to anything anymore. The object 97731118088 is still the same list, [1, 2, 3]:
import gc
for obj in gc.get_objects():
    if id(obj) == 97731118088:
        print(obj)
# [1, 2, 3]
On the other hand, again from the documentation, DELETE_SUBSCR "Implements del TOS1[TOS]": it pops the list (TOS1) and the slice (TOS) off the stack and deletes that slice from the list in place. This way, the list's elements are removed, and you are left with the name a, which now refers to a list whose elements are deleted, i.e. just an "empty shell", if you will. After this operation, a becomes [], but still has the same id of 97731118088. Only its elements are gone, via an in-place deletion.
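The practical consequence for a caller can be checked directly. This sketch reuses deletion1 and deletion2 from above; `x` and `id_before` are illustrative names.

```python
def deletion1(a):
    del a        # unbinds the local name only; the list itself survives

def deletion2(a):
    del a[:]     # empties the list object that the caller also sees

x = [1, 2, 3]
id_before = id(x)

deletion1(x)
print(x)                       # [1, 2, 3]

deletion2(x)
print(x, id(x) == id_before)   # [] True
```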

How to have shared variables between modules in Python

I started lately to use Python instead of Matlab and I have a question to which the answer might be obvious but I can't figure it out yet.
I have the following module in python called shared_variables.py:
global a
global b
a = 2
b = 3
c = a
d = b
In my main.py script I do the following:
import shared_variables
for i in range(1, 4):
    shared_variables.a += 1
    shared_variables.b += 1
    print('a=', shared_variables.a)
    print('b=', shared_variables.b)
    print('c=', shared_variables.c)
    print('d=', shared_variables.d)
and the output is the following:
a= 3
b= 4
c= 2
d= 3
a= 4
b= 5
c= 2
d= 3
a= 5
b= 6
c= 2
d= 3
Basically, the values of c and d are not updated at each iteration. How can I solve this problem? I am asking because I have written a longer program in which I need to share common values between different modules, and I need to update them at each iteration.
The following lines set the values of the variables once (e.g., assign the current value of a to c):
a = 2
b = 3
c = a
d = b
It does not mean that c changes whenever a changes nor that d changes whenever b changes. If you want variables to change value you'll need to assign a new value to them explicitly.
Integers are immutable in Python. You can't change them.
For integers, a += 1 is syntactic sugar for a = a + 1, i.e., after the assignment a is a different int object. c is not a anymore.
If a and c were mutable objects such as lists, then changing a would change c. c = a makes c and a both refer to the same object, i.e., c is a.
For example,
a = [0]
c = a
a[0] += 1
print(a, c) # -> [1] [1]
Here are nice pictures to understand the relation between names and objects in Python
c and d start out as references to the same value as a and b, not to the same memory position. Once you assign new values to a and b, the other two references won't follow.
Python values are like balloons, and variable names are like labels. You can tie a thread between the label (name) and the balloon (value), and you can tie multiple labels to a given balloon. But assignment means you tied a new balloon to a given label. The other labels are not re-tied as well, they still are tied to the original balloon.
Python integers are immutable; they remain the same balloon throughout. Incrementing a by adding 1 with the in-place addition operator (a += 1) still has to find another balloon with the result of a + 1, and tie a to the new integer result. If a was tied to a balloon representing 2 before, it'll be replaced by a new balloon representing the value 3, and a will be retied to that. You cannot take a marker to the integer balloon and erase the 2 to replace it with 3.
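One possible way to get values that "follow" a and b, sketched here under assumptions and not taken from the answers above, is to look the value up at the moment it is needed instead of storing it once. `shared` is a hypothetical stand-in for the shared_variables module (a SimpleNamespace keeps the example self-contained), and `c()` and `d()` are hypothetical helper functions.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the shared_variables module.
shared = SimpleNamespace(a=2, b=3)

def c():
    """Look up the current value of a each time it is needed."""
    return shared.a

def d():
    """Look up the current value of b each time it is needed."""
    return shared.b

for _ in range(3):
    shared.a += 1     # rebinds the attribute a on the shared object
    shared.b += 1
    print('a=', shared.a, 'c=', c())

print(shared.a, c(), shared.b, d())   # 5 5 6 6
```

With a real module the same idea applies: other modules read shared_variables.a through the module object whenever they need it, rather than keeping their own name bound to an old integer.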

Python Referenced For Loop

I'm playing with for loops in Python and trying to get used to the way they handle variables.
Take the following piece for code:
a=[1,2,3,4,5]
b=a
b[0]=6
After doing this, the zeroth element of both b and a should be 6. The = sign points a reference at the array, yes?
Now, I take a for loop:
a=[1,2,3,4,5]
for i in a:
    i=6
My expectation would be that every element of a is now 6, because I would imagine that i points to the elements in a rather than copying them; however, this doesn't seem to be the case.
Clarification would be appreciated, thanks!
Everything in python is treated like a reference. What happens when you do b[0] = 6 is that you assign the 6 to an appropriate place defined by LHS of that expression.
In the second example, you assign the references from the array to i, so that i is 1, then 2, then 3, ... but i never is an element of the array. So when you assign 6 to it, you just change the thing i represents.
http://docs.python.org/reference/datamodel.html is an interesting read if you want to know more about the details.
That isn't how it works. The for loop is iterating through the values of a. The variable i actually has no sense of what is in a itself. Basically, what is happening:
# this is basically what the loop is doing:
# beginning of loop:
i = a[0]
i = 6
# next iteration of for loop:
i = a[1]
i = 6
# next iteration of for loop:
i = a[2]
i = 6
# you get the idea.
At no point does the value at the index change, the only thing to change is the value of i.
You're trying to do this:
for i in range(len(a)):  # xrange in Python 2
    a[i] = 6  # assign the value at index i
Just as you said, "The = sign points a reference". So your loop just reassigns the 'i' reference to 5 different numbers, each one in turn.
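A short sketch, assuming Python 3, of three ways to write the loop and how each one treats other references to the list. The `*_alias` names exist only to make the difference observable.

```python
# 1. Assign through the index: mutates the list in place.
a = [1, 2, 3, 4, 5]
a_alias = a
for index, _ in enumerate(a):
    a[index] = 6
print(a_alias)   # [6, 6, 6, 6, 6]

# 2. Slice assignment: replaces the contents, keeps the same list object.
b = [1, 2, 3, 4, 5]
b_alias = b
b[:] = [6 for _ in b]
print(b_alias)   # [6, 6, 6, 6, 6]

# 3. Plain assignment: rebinds the name to a new list; the old one is unchanged.
c = [1, 2, 3, 4, 5]
c_alias = c
c = [6 for _ in c]
print(c_alias)   # [1, 2, 3, 4, 5]
```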

Reason for unintuitive UnboundLocalError behaviour 2 [duplicate]

Following up on Reason for unintuitive UnboundLocalError behaviour (I will assume you've read it).
Consider the following Python script:
def f():
    # a+=1 # 1
    aa=a
    aa+=1
    # b+='b' # 2
    bb=b
    bb+='b'
    c[0]+='c' # 3
    c.append('c')
    cc=c
    cc.append('c')
    d['d']=5 # Update 1
    d['dd']=6 # Update 1
    dd=d # Update 1
    dd['ddd']=7 # Update 1
    e.add('e') # Update 2
    ee=e # Update 2
    ee.add('e') # Update 2
a=1
b='b'
c=['c']
d={'d':4} # Update 1
e=set(['e']) # Update 2
f()
print a
print b
print c
print d # Update 1
print e # Update 2
The result of the script is:
1
b
['cc', 'c', 'c']
{'dd': 6, 'd': 5, 'ddd': 7}
set(['e'])
The commented-out lines (marked 1 and 2) are lines that would throw an UnboundLocalError, and the SO question I referenced explains why. However, the line marked 3 works!
By default, lists are copied by reference in Python, therefore it's understandable that c changes when cc changes. But why should Python allow c to change in the first place, if it didn't allow changes to a and b directly from the method's scope?
I don't see how the fact that by default lists are copied by reference in Python should make this design decision inconsistent.
What am I missing folks?
UPDATES:
For completeness I also added the dictionary equivalent to the question above, i.e. I added the source code and marked the update with # Update
For further completeness I also added the set equivalent. The set's behavior is actually surprising to me. I expected it to act similarly to the list and the dictionary...
Unlike strings and integers, lists in Python are mutable objects. This means they are designed to be changed. The line
c[0] += 'c'
is identical to saying
c.__setitem__(0, c.__getitem__(0) + 'c')
which doesn't make any change to what the name c is bound to. Before and after this call, c is the same list – it's just the contents of this list that have changed.
Had you said
c += ['c']
c = [42]
in the function f(), the same UnboundLocalError would have occurred, because the second line makes c a local name, and the first line translates roughly to
c = c + ['c']
requiring the name c to be already bound to something, which (in this local scope) it isn't yet.
The important thing to think about is this: what object does a (or b or c) refer to? The line a += 1 is changing which integer a refers to. Integers are immutable, so when a changes from 1 to 2, it's really the same as a = a + 1, which is giving a an entirely new integer to refer to.
On the other hand, c[0] += 'c' doesn't change which list c refers to, it merely changes which string its first element refers to. Lists are mutable, so the same list can be modified without changing its identity.
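The distinction can be reproduced in a few lines. This is a minimal sketch written for Python 3 (the question itself uses Python 2 print statements). The function names are invented for the illustration, and the `global` variant is included only to show one way rebinding a module-level name from inside a function can be made to work.

```python
a = 1
c = ['c']

def rebind_without_global():
    a += 1          # the assignment makes "a" local, so reading it first fails

def mutate():
    c[0] += 'c'     # looks up the global list c, then stores into its slot 0

def rebind_with_global():
    global a        # declares that "a" refers to the module-level name
    a += 1

try:
    rebind_without_global()
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

mutate()
rebind_with_global()

print(error_name)   # UnboundLocalError
print(c)            # ['cc']
print(a)            # 2
```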
