Reason for unintuitive UnboundLocalError behaviour 2 [duplicate]

Following up on Reason for unintuitive UnboundLocalError behaviour (I will assume you've read it).
Consider the following Python script:
def f():
    # a+=1 # 1
    aa=a
    aa+=1
    # b+='b' # 2
    bb=b
    bb+='b'
    c[0]+='c' # 3
    c.append('c')
    cc=c
    cc.append('c')
    d['d']=5 # Update 1
    d['dd']=6 # Update 1
    dd=d # Update 1
    dd['ddd']=7 # Update 1
    e.add('e') # Update 2
    ee=e # Update 2
    ee.add('e') # Update 2
a=1
b='b'
c=['c']
d={'d':4} # Update 1
e=set(['e']) # Update 2
f()
print a
print b
print c
print d # Update 1
print e # Update 2
The result of the script is:
1
b
['cc', 'c', 'c']
{'dd': 6, 'd': 5, 'ddd': 7}
set(['e'])
The commented-out lines (marked 1, 2) are lines that would throw an UnboundLocalError, and the SO question I referenced explains why. However, the line marked 3 works!
By default, lists are copied by reference in Python, therefore it's understandable that c changes when cc changes. But why should Python allow c to change in the first place, if it didn't allow changes to a and b directly from the method's scope?
I don't see how the fact that by default lists are copied by reference in Python should make this design decision inconsistent.
What am I missing folks?
UPDATES:
For completeness I also added the dictionary equivalent to the question above, i.e. I added the source code and marked the update with # Update
For further completeness I also added the set equivalent. The set's behavior is actually surprising to me. I expected it to act similarly to list and dictionary...

Unlike strings and integers, lists in Python are mutable objects. This means they are designed to be changed. The line
c[0] += 'c'
is identical to saying
c.__setitem__(0, c.__getitem__(0) + 'c')
which doesn't make any change to what the name c is bound to. Before and after this call, c is the same list – it's just the contents of this list that have changed.
Had you said
c += ['c']
c = [42]
in the function f(), the same UnboundLocalError would have occurred, because assigning to c anywhere in the function (both lines do) makes c a local name, and the first line translates roughly to
c = c + ['c']
(more precisely, c = c.__iadd__(['c'])), requiring the name c to be already bound to something, which (in this local scope) it isn't yet.
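A minimal sketch of that difference, assuming Python 3. The function names mutate_global and rebind_global are made up for illustration and do not appear in the question:

```python
c = ['c']

def mutate_global():
    # Only item assignment happens here: the name c is never bound
    # inside the function, so it is looked up in the global scope.
    c[0] += 'c'

def rebind_global():
    # Augmented assignment binds the name c, which makes c local to
    # the whole function body, so reading it first fails.
    c += ['c']

mutate_global()
print(c)  # ['cc']

error = None
try:
    rebind_global()
except UnboundLocalError as exc:
    error = exc
    print("UnboundLocalError:", exc)
```

Declaring global c at the top of rebind_global would make the second function work as well, at the cost of rebinding the module-level name.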

The important thing to think about is this: what object does a (or b or c) refer to? The line a += 1 is changing which integer a refers to. Integers are immutable, so when a changes from 1 to 2, it's really the same as a = a + 1, which is giving a an entirely new integer to refer to.
On the other hand, c[0] += 'c' doesn't change which list c refers to, it merely changes which string its first element refers to. Lists are mutable, so the same list can be modified without changing its identity.
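The identity argument can be checked with id(). This is a small sketch; the helper names int_rebound and list_unchanged are only illustrative:

```python
a = 1
c = ['c']

int_id_before = id(a)
list_id_before = id(c)

a += 1        # binds a to a different int object
c[0] += 'c'   # replaces element 0; c is still the same list

int_rebound = id(a) != int_id_before
list_unchanged = id(c) == list_id_before
print(int_rebound, list_unchanged)  # True True
print(a, c)                         # 2 ['cc']
```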

Related

I'm trying to understand how reference works in python

After line 7, I haven't written a single line of code which mentions the list named 'outer'. However, if you execute it, you'll see that the 'outer' (i.e, the nested lists inside it) list would change/update due to lines 10 and 12...
I'm guessing it has something to do with reference vs value. My question is, why didn't line 13 affect (change/update) the 'outer' list the same way that lines 7 and 10 did? I'm trying to understand this concept. How do I go about it? I know there are a lot of resources online, but I don't even know what to google. Please help.
inner = []
outer = []
lis = ['a', 'b', 'c']
inner.append(lis[0])
outer.append(inner)    # <-- Line 7
print(outer)
inner.append(lis[1])   # <-- Line 10
print(outer)
inner.append(lis[2])   # <-- Line 12
print(outer)
lis[2] = 'x'           # <-- Line 13
print(outer)

This is a boiled-down version of your example:
some_list = []
a = 2
some_list.append(a)
a = 3
print(some_list) # output: [2]
print(a) # output: 3
If we follow your original logic, you would expect some_list to contain the value 3 when we print it. But the reality is that we never appended a itself to the list. Instead, writing some_list.append(a) means appending the value referenced by a to the list some_list.
Remember, variables are simply references to a value. Here's the same snippet as above, but with an explanation of what's referencing what.
some_list = [] # the name "some_list" REFERENCES an empty list
a = 2 # the name "a" REFERENCES the integer value 2
some_list.append(a) # we append the value REFERENCED BY "a"
# (the integer 2) to the list REFERENCED
# BY "some_list". That list is not empty
# anymore, holding the value [2]
a = 3 # the name "a" now REFERENCES the integer value 3. This
# has no implications on the list REFERENCED BY "some_list".
# We simply move the "arrow" that pointed the name "a" to
# the value 2 to its new value of 3
print(some_list) # output: [2]
print(a) # output: 3
The key aspect to understand here is that variables are simply references to a value. Writing some_list.append(a) does not mean "place the variable a into the list" but rather "place the value that the variable a references at this moment in time into the list". Variables cannot keep track of other variables, only the values that they are a reference to.
This becomes even clearer if we append to some_list a second time, after modifying the value that a references:
some_list = []
a = 2
some_list.append(a)
a = 3
some_list.append(a)
print(some_list) # output: [2, 3]
print(a) # output: 3

In Python, when you store a list in a variable you don't store the list itself, but a reference to a list somewhere in the computer's RAM. If you say
a = [0, 1, 2]
b = a
c = 3
then both a and b will be references to the same list, because you set b to a, which is a reference to a list. Modifying the list through a is then visible through b, and vice versa. c, however, is an integer; it works differently. It looks like this:
┌───┐
a │ █━┿━━━━━━━━━━━━━━━┓ ┌───┬───┬───┐
└───┘ ┠→│ 0 │ 1 │ 2 │
┌───┐ ┃ └───┴───┴───┘
b │ █━┿━━━━━━━━━━━━━━━┛
└───┘
┌───┐
c │ 3 │
└───┘
a and b are references to the same list. c refers to an integer, and integers are immutable: after d = c, both names refer to the same integer, but rebinding c to another number never affects d.
So, let's go back to your program. When you say inner.append(lis[n]), you add the object that lis[n] currently refers to at the end of the list inner. You don't add a reference to slot #n of the list lis, and no copy is made either: inner simply gets its own reference to the same string. Strings are immutable, and lis[2] = 'x' only makes that slot of lis refer to a different string, so inner keeps the old one.
If you modify lis, then it will have an impact only on variables that are references to lis.
If you want inner to be modified if you modify lis, then replace the inner.append(lis[n])s by inner.append(lis).
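A short sketch of both variants, assuming the same lis as in the question. The name holder is hypothetical and stands for a list that stores lis itself rather than one of its items:

```python
lis = ['a', 'b', 'c']

inner = [lis[2]]   # holds the string object 'c', not the slot lis[2]
holder = [lis]     # holds a reference to the list object itself

lis[2] = 'x'       # makes slot 2 of lis refer to a different string

print(inner)   # ['c']
print(holder)  # [['a', 'b', 'x']]
```

holder[0] and lis are the same object, so any change made through one name is visible through the other.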

Why does b+=(4,) work and b = b + (4,) doesn't work when b is a list?

If we take b = [1,2,3] and if we try doing: b+=(4,)
It returns b = [1,2,3,4], but if we try doing b = b + (4,) it doesn't work.
b = [1,2,3]
b+=(4,) # Prints out b = [1,2,3,4]
b = b + (4,) # Gives an error saying you can't add tuples and lists
I expected b+=(4,) to fail as you can't add a list and a tuple, but it worked. So I tried b = b + (4,) expecting to get the same result, but it didn't work.

The problem with "why" questions is that usually they can mean multiple different things. I will try to answer each one I think you might have in mind.
"Why is it possible for it to work differently?" which is answered by e.g. this. Basically, += tries to use different methods of the object: __iadd__ (which is only checked on the left-hand side), vs __add__ and __radd__ ("reverse add", checked on the right-hand side if the left-hand side doesn't have __add__) for +.
"What exactly does each version do?" In short, the list.__iadd__ method does the same thing as list.extend (but because of the language design, there is still an assignment back).
This also means for example that
>>> a = [1,2,3]
>>> b = a
>>> a += [4] # uses the .extend logic, so it is still the same object
>>> b # therefore a and b are still the same list, and b has the `4` added
[1, 2, 3, 4]
>>> b = b + [5] # makes a new list and assigns back to b
>>> a # so now a is a separate list and does not have the `5`
[1, 2, 3, 4]
+, of course, creates a new object, but explicitly requires another list instead of trying to pull elements out of a different sequence.
"Why is it useful for += to do this?" It's more efficient; the extend method doesn't have to create a new object. Of course, this has some surprising effects sometimes (like above), and generally Python is not really about efficiency, but these decisions were made a long time ago.
"What is the reason not to allow adding lists and tuples with +?" See here (thanks, @splash58); one idea is that (tuple + list) should produce the same type as (list + tuple), and it's not clear which type the result should be. += doesn't have this problem, because a += b obviously should not change the type of a.

They are not equivalent:
b += (4,)
is shorthand for:
b.extend((4,))
while + concatenates lists, so by:
b = b + (4,)
you're trying to concatenate a tuple to a list
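The two behaviours side by side, as a small sketch. The name alias is only there to show whether b is still the same list object:

```python
b = [1, 2, 3]
alias = b

b += (4,)              # extends the existing list from any iterable
print(b, alias is b)   # [1, 2, 3, 4] True

try:
    b = b + (4,)       # list + tuple is not defined
except TypeError as exc:
    print("TypeError:", exc)

b = b + list((5,))     # explicit conversion works, but builds a new list
print(b, alias is b)   # [1, 2, 3, 4, 5] False
print(alias)           # [1, 2, 3, 4]
```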

When you do this:
b += (4,)
it is converted to this:
b = b.__iadd__((4,))
Under the hood, list.__iadd__ calls b.extend((4,)). extend accepts any iterable, which is why this also works:
b = [1,2,3]
b += range(2) # b is now [1, 2, 3, 0, 1]
But when you do this:
b = b + (4,)
it is converted to this:
b = b.__add__((4,))
and list.__add__ accepts only a list object.

From the official docs, for mutable sequence types both:
s += t
s.extend(t)
are defined as:
extends s with the contents of t
Which is different than being defined as:
s = s + t # not equivalent in Python!
This also means any sequence type will work for t, including a tuple like in your example.
But it also works for ranges and generators! For instance, you can also do:
s += range(3)
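A quick sketch with a few different iterables on the right-hand side (the values are arbitrary):

```python
s = [0]
s += range(1, 3)                 # a range object
s += (n * n for n in range(3))   # a generator expression
s += "ab"                        # a string is iterable too
print(s)  # [0, 1, 2, 0, 1, 4, 'a', 'b']
```

The last line is a common source of surprises: a string on the right-hand side is consumed character by character.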

The "augmented" assignment operators like += were introduced in Python 2.0, which was released in October 2000. The design and rationale are described in PEP 203. One of the declared goals of these operators was the support of in-place operations. Writing
a = [1, 2, 3]
a += [4, 5, 6]
is supposed to update the list a in place. This matters if there are other references to the list a, e.g. when a was received as a function argument.
However, the operation can't always happen in place, since many Python types, including integers and strings, are immutable, so e.g. i += 1 for an integer i can't possibly operate in place.
In summary, augmented assignment operators were supposed to work in place when possible, and create a new object otherwise. To facilitate these design goals, the expression x += y was specified to behave as follows:
If x.__iadd__ is defined, x.__iadd__(y) is evaluated.
Otherwise, if x.__add__ is implemented x.__add__(y) is evaluated.
Otherwise, if y.__radd__ is implemented y.__radd__(x) is evaluated.
Otherwise raise an error.
The first result obtained by this process will be assigned back to x (unless that result is the NotImplemented singleton, in which case the lookup continues with the next step).
This process allows types that support in-place modification to implement __iadd__(). Types that don't support in-place modification don't need to add any new magic methods, since Python will automatically fall back to essentially x = x + y.
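The fallback described above can be observed with two toy classes. InPlace and Immutable are hypothetical names invented for this sketch, not types from the standard library:

```python
class InPlace:
    """Toy mutable type that defines __iadd__."""
    def __init__(self, items):
        self.items = list(items)

    def __iadd__(self, other):
        self.items.extend(other)
        return self  # the returned object is assigned back to the name


class Immutable:
    """Toy type with only __add__; += falls back to it."""
    def __init__(self, items):
        self.items = tuple(items)

    def __add__(self, other):
        return Immutable(self.items + tuple(other))


x = InPlace([1])
x_before = x
x += [2]
print(x is x_before, x.items)                  # True [1, 2]

y = Immutable([1])
y_before = y
y += [2]
print(y is y_before, y.items, y_before.items)  # False (1, 2) (1,)
```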
So let's finally come to your actual question – why you can add a tuple to a list with an augmented assignment operator. From memory, the history of this was roughly like this: The list.__iadd__() method was implemented to simply call the already existing list.extend() method in Python 2.0. When iterators were introduced in Python 2.1, the list.extend() method was updated to accept arbitrary iterators. The end result of these changes was that my_list += my_tuple worked starting from Python 2.1. The list.__add__() method, however, was never supposed to support arbitrary iterators as the right-hand argument – this was considered inappropriate for a strongly typed language.
I personally think the implementation of augmented operators ended up being a bit too complex in Python. It has many surprising side effects, e.g. this code:
t = ([42], [43])
t[0] += [44]
The second line raises TypeError: 'tuple' object does not support item assignment, but the operation is successfully performed anyway – t will be ([42, 44], [43]) after executing the line that raises the error.
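This can be reproduced in a few lines. The list is extended in place first, and only the final assignment back into the tuple fails:

```python
t = ([42], [43])
try:
    t[0] += [44]   # the list is extended, then t[0] = ... raises
except TypeError as exc:
    print("TypeError:", exc)
print(t)  # ([42, 44], [43])
```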

Most people would expect X += Y to be equivalent to X = X + Y. Indeed, the Python Pocket Reference (4th ed) by Mark Lutz says on page 57 "The following two formats are roughly equivalent: X = X + Y , X += Y". However, the people who specified Python did not make them equivalent. Possibly that was a mistake which will result in hours of debugging time by frustrated programmers for as long as Python remains in use, but it's now just the way Python is. If X is a mutable sequence type, X += Y is equivalent to X.extend( Y ) and not to X = X + Y.

As explained here, if list did not implement the __iadd__ method, b += (4,) would just be shorthand for b = b + (4,). Obviously it is not, so list does implement __iadd__. Apparently the implementation of __iadd__ is something like this:
def __iadd__(self, x):
    self.extend(x)
    return self
We know that the above code is not the actual implementation of __iadd__, but we can assume that it relies on something like the extend method, which accepts tuple inputs.

why is my for loop printing 2 at each iteration if b has not even been defined as a placeholder in the for loop declaration?

I am fairly new to python and I just learned about tuple unpacking, so I was playing around with this concept and got a weird result.
nList = [(1,2),4,5,6]
for a in nList:
    print(a)
    print(b)
I was expecting my program to crash since b is not defined as a placeholder and even if it was only the first element of my list is a tuple, but instead, I got the following result:
(1, 2)
2
4
2
5
2
6
2

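You have not assigned the variable b anywhere in this snippet. Python never reads uninitialized "garbage" memory: in a fresh interpreter this loop would raise a NameError on the first print(b). The value 2 most likely comes from a b that was assigned earlier in the same session (for example, an earlier tuple-unpacking experiment such as a, b = (1, 2) in a REPL or notebook) and is still defined. To avoid such scenarios, restart the interpreter or assign your variables explicitly before using them.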
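A sketch of what Python does with an unassigned name. It runs the loop from the question with exec in two namespaces: an empty one, and one that already contains b = 2 as if left over from earlier code. The names snippet, fresh and leftover are made up for this illustration, and the prints are replaced by appends so the result can be inspected:

```python
snippet = """
nList = [(1, 2), 4, 5, 6]
printed = []
for a in nList:
    printed.append(a)
    printed.append(b)
"""

# Empty namespace: b does not exist, so the lookup fails.
fresh = {}
try:
    exec(snippet, fresh)
    outcome = "ran"
except NameError as exc:
    outcome = "NameError"
    print("NameError:", exc)

# Namespace where an earlier statement already bound b to 2.
leftover = {"b": 2}
exec(snippet, leftover)
print(leftover["printed"])  # [(1, 2), 2, 4, 2, 5, 2, 6, 2]
```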

Variable assignment query in python

I am writing Fibonacci code in python. The below solution is mine.
While the other below solution is from python.org.
Can anyone tell me why it yields a different answer even though the logic of assigning the variables is the same?

Those two programs are not equivalent. The right hand side of the equals (=) is evaluated all together.
Doing:
a=b
b=a+b
Is different from:
a,b = b,a+b
This is in reality the same as:
c = a
a = b
b = b + c
Your example is actually covered in the Python documentation:
The first line contains a multiple assignment: the variables a and b simultaneously get the new values 0 and 1. On the last line this is used again, demonstrating that the expressions on the right-hand side are all evaluated first before any of the assignments take place. The right-hand side expressions are evaluated from the left to the right.
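Since the two original programs are not shown here, the following is only a sketch of the two assignment styles under the assumption that both start from a, b = 0, 1. The function names sequential and simultaneous are made up:

```python
def sequential(n):
    a, b = 0, 1
    out = []
    for _ in range(n):
        out.append(a)
        a = b
        b = a + b        # a was already overwritten, so b doubles
    return out

def simultaneous(n):
    a, b = 0, 1
    out = []
    for _ in range(n):
        out.append(a)
        a, b = b, a + b  # right-hand side is evaluated with the old a and b
    return out

print(sequential(6))    # [0, 1, 2, 4, 8, 16]
print(simultaneous(6))  # [0, 1, 1, 2, 3, 5]
```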

The lines
a = b # Assigns the value of 'b' to 'a'
b = a + b # Adds the new value of 'a' to 'b'
run one after the other, so the second line already sees the updated a, whereas
a, b = b, a+b
assigns the value of b to a and adds the existing (old) value of a to b.

The reason it works in the second example is because the a=b part doesn't take effect until both right-hand expressions have been evaluated. So when it gets to the a+b part, a still has its previous value. In your first example you are overwriting a before using it. In Python, when you assign variables this way, the right-hand side is first packed into a tuple. This means that until the whole right-hand side is evaluated the names retain their original values. Once the tuple is unpacked they are overwritten.

I see extra tabs in your solution, and the logic of your program is also wrong. As far as I understand, by writing fib(5) you want the 5th Fibonacci number in the series (which is 5), not the largest one that is less than 5 (which is 3).
a=b
b=a+b
and
a,b = b,a+b
are not equal.
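Look at the code below.
def fibonacci(num):
    a, b = 0, 1
    counter = 2
    # The loop condition was truncated in the original post ("while(a<=):").
    # "counter <= num" is an assumed reconstruction; it makes fibonacci(5) return 5.
    while counter <= num:
        a, b = b, a + b
        counter += 1
    return b
print(fibonacci(5))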

How to have shared variables between modules in Python

I started lately to use Python instead of Matlab and I have a question to which the answer might be obvious but I can't figure it out yet.
I have the following module in python called shared_variables.py:
global a
global b
a = 2
b = 3
c = a
d = b
in my main.py script I do the following things:
import shared_variables
for i in range(1,4):
    shared_variables.a += 1
    shared_variables.b += 1
    print 'a= ',shared_variables.a
    print 'b= ',shared_variables.b
    print 'c= ',shared_variables.c
    print 'd= ',shared_variables.d
and the output is the following:
a= 3
b= 4
c= 2
d= 3
a= 4
b= 5
c= 2
d= 3
a= 5
b= 6
c= 2
d= 3
Basically, the values of c and d are not updated at each iteration. How can I solve this problem? I am asking this question because I have written a longer program in which I need to share common values between different modules, and I need to update them at each iteration.

The following lines set the values of the variables once (e.g., assign the current value of a to c):
a = 2
b = 3
c = a
d = b
It does not mean that c changes whenever a changes nor that d changes whenever b changes. If you want variables to change value you'll need to assign a new value to them explicitly.
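One possible way to get a value that follows a is to compute it on demand instead of copying it once. This is only a sketch: shared is a SimpleNamespace standing in for the shared_variables module, and get_c is a hypothetical helper that is not part of the question's code:

```python
from types import SimpleNamespace

# Stand-in for the shared_variables module (attribute access works the
# same way on a real imported module).
shared = SimpleNamespace(a=2, b=3)

def get_c():
    # Read a at call time instead of copying it once at import time.
    return shared.a

snapshot_c = shared.a   # what "c = a" does: binds c to the current value

for _ in range(3):
    shared.a += 1

print(shared.a, snapshot_c, get_c())  # 5 2 5
```

In the real program, get_c would live in shared_variables.py and read the module-level a directly.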

Integers are immutable in Python. You can't change them.
For integers, a += 1 is syntactic sugar for a = a + 1, i.e., after the assignment a is a different int object. c is not a anymore.
If a and c were mutable objects such as lists then changing a would change c. c = a makes c and a both to refer to the same object i.e., c is a.
For example,
a = [0]
c = a
a[0] += 1
print(a, c) # -> [1] [1]
Here are nice pictures to understand the relation between names and objects in Python

c and d start out as references to the same value as a and b, not to the same memory position. Once you assign new values to a and b, the other two references won't follow.
Python values are like balloons, and variable names are like labels. You can tie a thread between the label (name) and the balloon (value), and you can tie multiple labels to a given balloon. But assignment means you tied a new balloon to a given label. The other labels are not re-tied as well, they still are tied to the original balloon.
Python integers are immutable; they remain the same balloon throughout. Incrementing a by adding 1 with the in-place addition operator (a += 1) still has to find another balloon with the result of a + 1, and tie a to the new integer result. If a was tied to a balloon representing 2 before, it'll be replaced by a new balloon representing the value 3, and a will be retied to that. You cannot take a marker to the integer balloon and erase the 2 to replace it with 3.
