Why is my (local) variable behaving like a global variable?

Why is my (local) variable behaving like a global variable? - python

I have no use for a global variable and never define one explicitly, and yet I seem to have one in my code. Can you help me make it local, please?
def algo(X): # randomized algorithm
while len(X)>2:
# do a bunch of things to nested list X
print(X)
# tracing: output is the same every time, where it shouldn't be.
return len(X[1][1])
def find_min(X): # iterate algo() multiple times to find minimum
m = float('inf')
for i in some_range:
new = algo(X)
m = min(m, new)
return m
X = [[[..], [...]],
[[..], [...]],
[[..], [...]]]
print(find_min(X))
print(X)
# same value as inside the algo() call, even though it shouldn't be affected.
X appears to be behaving like a global variable. The randomized algorithm algo() is really performed only once on the first call because with X retaining its changed value, it never makes it inside the while loop. The purpose of iterations in find_min is thus defeated.
I'm new to python and even newer to this forum, so let me know if I need to clarify my question. Thanks.
update Many thanks for all the answers so far. I almost understand it, except I've done something like this before with a happier result. Could you explain why this code below is different, please?
def qsort(X):
for ...
# recursively sort X in place
count+=1 # count number of operations
return X, count
X = [ , , , ]
Y, count = qsort(X)
print(Y) # sorted
print(X) # original, unsorted.
Thank you.
update II To answer my own second question, the difference seems to be the use of a list method in the first code (not shown) and the lack thereof in the second code.

As others have pointed out already, the problem is that the list is passed as a reference to the function, so the list inside the function body is the very same object as the one you passed to it as an argument. Any mutations your function performs are thus visible from outside.
To solve this, your algo function should operate on a copy of the list that it gets passed.
As you're operating on a nested list, you should use the deepcopy function from the copy module to create a copy of your list that you can freely mutate without affecting anything outside of your function. The built-in list function can also be used to copy lists, but it only creates shallow copies, which isn't what you want for nested lists, because the inner lists would still just be pointers to the same objects.
from copy import deepcopy
def algo (X):
X = deepcopy(X)
...

When you do find_min(X), you are passing the object X (a list in this case) to the function. If that function mutates the list (e.g., by appending to it) then yes, it will affect the original object. Python does not copy objects just because you pass them to a function.

When you pass an object to a python function, the object isn't copied, but rather a pointer to the object is passed.
This makes sense because it greatly speeds up execution - in the case of a long list, there is no need to copy all of its elements.
However, this means that when you modify a passed object (for example, your list X), the modification applies to that object, even after the function returns.
For example:
def foo(x):
x.extend('a')
print x
l = []
foo(l)
foo(l)
Will print:
['a']
['a', 'a']

Python lists are mutable (i.e., they can be changed) and the use of algo within find_min function call does change the value of X (i.e., it is pass-by-reference for lists). See this SO question, for example.

Related

Really simple question about reference of list in python

I have a really simple question about references in python.
I assume you are familiar with this:
aa = [1,2]
bb = aa
aa[0] = 100
print(bb)
As you might guess, the output will be
[100, 2]
and it's totally OK ✔
Let's do another example:
l = [[],[],[]]
a = l[0]
l[0] = [1,2]
print(a)
But here the output is:
[]
I know why that happened.
It's because, on line 3, we made an entirely different list and "replaced it"(not changing but replacing) with l[0]
Now my question is "Can I somehow replace l[0] with [1,2] and also keep a as a reference?
P.S not like l[0].append(1,2)

Short answer: Not really, but there might be something close enough.
The problem is, under the covers, a is a pointer-to-a-list, and l is a pointer-to-a-list-of-pointers-to-lists. When you a = l[0], what that actually translates to at the CPU is "dereference the pointer l, treat the resulting region of memory as a list object, get the first object (which will be the address of another list), and set the value of pointer a to that address". Once you've done that, a and l[0] are only related by concidence; the are two separate pointers that happen, for the moment, to point at the same object. If you assign to either variable, you're changing the value of a pointer, not the contents of the pointed-to object.
Broadly speaking, there's a few ways the computer could practically do what you ask.
Modify the pointed-to object (list) without modifying either pointer. That's what the append function does, along with the many other mutators of python lists. If you want to do this in a way that perhaps more clearly expresses your intent, you could do l[0][:] = [1,2]. That's a list copy operation, copying into the object pointed to by both l[0] and a. This is your best bet as a developer, though note that copy operations are O(n).
Implement a as a pointer-to-a-pointer-to-a-list that is automatically dereferenced (to merely a pointer-to-list) when accessed. This is not, AFAIK, something Python provides any support for; almost no language does. In C you could say list ** a = &(l[0]); but then any time you want to actually do anything with a you'd have to use *a instead.
Tell the interpreter to observe that a is an alias to l[0], rather than its own, separate variable. As far as I know, Python doesn't support this either. In C, you could do it as #define a (l[0]) though you'd want to #undef a when it went out of scope.
Rather than making a a list variable (which is implemented as a pointer-to-list), make it a function: a = lambda: l[0]. This means you have to use a() instead of a anywhere you want to get the actual content of l[0], and you can't assign to l[0] through a() (or through a directly). But it does work, in Python. You could even go so far as to use properties, which would let you skip the parentheses and assign through a, but at the cost of writing a bunch more code to wrap the lists (I'm not aware of a way to attach properties to lists directly, though one might exist, so you'd instead have to create a new object wrapping the list).

If a = l then calling a[0] in that case would yield [1,2], however, you are deleting the list that “a” was referencing, therefore the reference is destroyed as well. You need to either make a = l and call a[0] or reset the reference by calling a = l[0] again.

The original array at l[0] and the [1,2] array are two different objects. there is no way to do that short of re-assigning the a variable to l[0] again once the change has been made.

About the way to modify the list in-place in a function in Python

If I try to modify the 'board' list in-place in the way below, it doesn't work, it seems like it generate some new 'board' instead of modify in-place.
def func(self, board):
"""
:type board: List[List[str]]
"""
board = [['A' for j in range(len(board[0]))] for i in range(len(board))]
return
I have to do something like this to modify it in-place, what's the reason? Thanks.
for i in range(len(board)):
for j in range(len(board[0])):
board[i][j] = 'A'

You seem to understand the difference between these two cases, and want to know why Python makes you handle them differently?
I have to do something like this to modify it in-place, what's the reason?
Creating a new copy is something that has a value. So it makes sense for it to be an expression. In fact, list comprehensions would be useless if they weren't expressions.
Mutating a list in-place isn't something that has a value. So, there's no reason to make it an expression, and in fact, it would be weird to do so. Sure, you could come up with some kind of value (like, say, the list being mutated). But that would be at odds with everything else in the design of Python: spam.append(eggs) doesn't return spam, it returns nothing. spam = eggs doesn't have a value. And so on.
Secondarily, the comprehension style feeds very well into the iterable paradigm, which is fundamental to Python. For example, notice that you can turn a list comprehension into a generator comprehension (which gives you a lazy iterator over values that are computed on demand) just by changing the […] to (…). What useful equivalent could there be for mutation?
Making the transforming-copy more convenient also encourages people to use a non-mutating style, which often leads to better answers for many problems. When you want to know how to avoid writing three lines of nested statement to mutate some global, the answer is to stop mutating that global and instead pass in a parameter and return the new value.
Also, the syntax was copied from Haskell, where there is no mutation.
But of course all those "often" and "usually" don't mean "never". Sometimes (unless you're designing a language with no mutation), you need to do things in-place. That's why we have list.sort as well as sorted. (And a lot of work has gone into optimizing the hell out of list.sort; it's not just an afterthought.)
Python doesn't stop you from doing it. It just doesn't bend over quite as far to make it easy as it does for copying.

that is not modifying it in place. The list comprehension syntax [x for y in z] is creating a new list. The original list is not modified by this syntax. Making the name inside the function point to a new list won't change what list the name outside the function is pointing.
In other words, when calling a function python passes a reference to the object, not the name, so there is no easy way to change which object the variable name outside the function is refering to.

Why does modifying what list() return not work?

Let's say I have a list that contains three strings, and I want a new list that drops one of the strings. I know there are alternate ways of doing this, but I was surprised that the following does not work:
x = ['A','B','C']
y = list(x).remove('A')
Why does the above not work?
Edit: Thanks for the answers everyone!

Per the Python Programming FAQ (emphasis added):
Some operations (for example y.append(10) and y.sort()) mutate the object, whereas superficially similar operations (for example y = y + [10] and sorted(y)) create a new object. In general in Python (and in all cases in the standard library) a method that mutates an object will return None to help avoid getting the two types of operations confused. So if you mistakenly write y.sort() thinking it will give you a sorted copy of y, you’ll instead end up with None, which will likely cause your program to generate an easily diagnosed error.
Since remove is a mutating method (changes the list it's called on in-place), it follows the general pattern of returning None. If it didn't, a line like:
y = x.remove('A')
would appear to work, but it would be aliasing y to the same list referenced by x, not creating a new list at all, and it might take some time for that mistake to be noticed, even as you use x and y believing them to be independent. By returning None, any attempt to use y believing it to be a separate list (or a list at all), will likely fail loudly (as it does in your case, with or without the list wrapping, making your misuse of remove obvious).
This also generally encourages Python's (loose) guideline to avoid shoving too many steps in a process on a single line. If you want to copy a list and remove one element, you do it in two steps:
y = list(x)
y.remove('A')
and it works just fine.

Efficiency of giving function as list in list comprehension

When you supply a function as the old list in a list comprehension like this
my_new_list = [x * 2 for x in list_maker()]
is list_maker() called each time a new x is grabbed?
I'm wondering because I want to know if it'd be more efficient to do this
my_old_list = list_maker()
my_new_list = [x * 2 for x in my_old_list]
Thanks!

The answer, like most questions of "which is more effecient?", is "it depends".
Traditionally, list_maker would be called once, so whether you call it in the list comprehension or outside and assign to a variable makes no difference(1).
However (and this is what #PeterWood is referring to), list_maker could be a generator, which would cause it to be entered repeatedly (which is not exactly the same as called repeatedly, but probably close enough). (See also PEP 255.)
The question of which is more effecient, however, is not clear-cut -- a regular function returning the whole list would use more memory than a generator, which might or might not be more expensive.
(1) Except that the memory used to store the result of list_maker can be freed immediately after the list compreshension compeletes, where are the my_new_list would have to go out of scope unreferenced first.

How to make a variable (truly) local to a procedure or function

ie we have the global declaration, but no local.
"Normally" arguments are local, I think, or they certainly behave that way.
However if an argument is, say, a list and a method is applied which modifies the list, some surprising (to me) results can ensue.
I have 2 questions: what is the proper way to ensure that a variable is truly local?
I wound up using the following, which works, but it can hardly be the proper way of doing it:
def AexclB(a,b):
z = a+[] # yuk
for k in range(0, len(b)):
try: z.remove(b[k])
except: continue
return z
Absent the +[], "a" in the calling scope gets modified, which is not desired.
(The issue here is using a list method,
The supplementary question is, why is there no "local" declaration?
Finally, in trying to pin this down, I made various mickey mouse functions which all behaved as expected except the last one:
def fun4(a):
z = a
z = z.append(["!!"])
return z
a = ["hello"]
print "a=",a
print "fun4(a)=",fun4(a)
print "a=",a
which produced the following on the console:
a= ['hello']
fun4(a)= None
a= ['hello', ['!!']]
...
>>>
The 'None' result was not expected (by me).
Python 2.7 btw in case that matters.
PS: I've tried searching here and elsewhere but not succeeded in finding anything corresponding exactly - there's lots about making variables global, sadly.

It's not that z isn't a local variable in your function. Rather when you have the line z = a, you are making z refer to the same list in memory that a already points to. If you want z to be a copy of a, then you should write z = a[:] or z = list(a).
See this link for some illustrations and a bit more explanation http://henry.precheur.org/python/copy_list

Python will not copy objects unless you explicitly ask it to. Integers and strings are not modifiable, so every operation on them returns a new instance of the type. Lists, dictionaries, and basically every other object in Python are mutable, so operations like list.append happen in-place (and therefore return None).
If you want the variable to be a copy, you must explicitly copy it. In the case of lists, you slice them:
z = a[:]

There is a great answer than will cover most of your question in here which explains mutable and immutable types and how they are kept in memory and how they are referenced. First section of the answer is for you. (Before How do we get around this? header)
In the following line
z = z.append(["!!"])
Lists are mutable objects, so when you call append, it will update referenced object, it will not create a new one and return it. If a method or function do not retun anything, it means it returns None.
Above link also gives an immutable examle so you can see the real difference.
You can not make a mutable object act like it is immutable. But you can create a new one instead of passing the reference when you create a new object from an existing mutable one.
a = [1,2,3]
b = a[:]
For more options you can check here

What you're missing is that all variable assignment in python is by reference (or by pointer, if you like). Passing arguments to a function literally assigns values from the caller to the arguments of the function, by reference. If you dig into the reference, and change something inside it, the caller will see that change.
If you want to ensure that callers will not have their values changed, you can either try to use immutable values more often (tuple, frozenset, str, int, bool, NoneType), or be certain to take copies of your data before mutating it in place.
In summary, scoping isn't involved in your problem here. Mutability is.
Is that clear now?
Still not sure whats the 'correct' way to force the copy, there are
various suggestions here.
It differs by data type, but generally <type>(obj) will do the trick. For example list([1, 2]) and dict({1:2}) both return (shallow!) copies of their argument.
If, however, you have a tree of mutable objects and also you don't know a-priori which level of the tree you might modify, you need the copy module. That said, I've only needed this a handful of times (in 8 years of full-time python), and most of those ended up causing bugs. If you need this, it's a code smell, in my opinion.
The complexity of maintaining copies of mutable objects is the reason why there is a growing trend of using immutable objects by default. In the clojure language, all data types are immutable by default and mutability is treated as a special cases to be minimized.

If you need to work on a list or other object in a truly local context you need to explicitly make a copy or a deep copy of it.
from copy import copy
def fn(x):
y = copy(x)

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.