Related
This question already has answers here:
Using global variables in a function
(25 answers)
Closed 5 months ago.
I know I should avoid using global variables in the first place due to confusion like this, but if I were to use them, is the following a valid way to go about using them? (I am trying to call the global copy of a variable created in a separate function.)
x = "somevalue"
def func_A ():
global x
# Do things to x
return x
def func_B():
x = func_A()
# Do things
return x
func_A()
func_B()
Does the x that the second function uses have the same value of the global copy of x that func_a uses and modifies? When calling the functions after definition, does order matter?
If you want to simply access a global variable you just use its name. However to change its value you need to use the global keyword.
E.g.
global someVar
someVar = 55
This would change the value of the global variable to 55. Otherwise it would just assign 55 to a local variable.
The order of function definition listings doesn't matter (assuming they don't refer to each other in some way), the order they are called does.
Within a Python scope, any assignment to a variable not already declared within that scope creates a new local variable unless that variable is declared earlier in the function as referring to a globally scoped variable with the keyword global.
Let's look at a modified version of your pseudocode to see what happens:
# Here, we're creating a variable 'x', in the __main__ scope.
x = 'None!'
def func_A():
# The below declaration lets the function know that we
# mean the global 'x' when we refer to that variable, not
# any local one
global x
x = 'A'
return x
def func_B():
# Here, we are somewhat mislead. We're actually involving two different
# variables named 'x'. One is local to func_B, the other is global.
# By calling func_A(), we do two things: we're reassigning the value
# of the GLOBAL x as part of func_A, and then taking that same value
# since it's returned by func_A, and assigning it to a LOCAL variable
# named 'x'.
x = func_A() # look at this as: x_local = func_A()
# Here, we're assigning the value of 'B' to the LOCAL x.
x = 'B' # look at this as: x_local = 'B'
return x # look at this as: return x_local
In fact, you could rewrite all of func_B with the variable named x_local and it would work identically.
The order matters only as far as the order in which your functions do operations that change the value of the global x. Thus in our example, order doesn't matter, since func_B calls func_A. In this example, order does matter:
def a():
global foo
foo = 'A'
def b():
global foo
foo = 'B'
b()
a()
print foo
# prints 'A' because a() was the last function to modify 'foo'.
Note that global is only required to modify global objects. You can still access them from within a function without declaring global.
Thus, we have:
x = 5
def access_only():
return x
# This returns whatever the global value of 'x' is
def modify():
global x
x = 'modified'
return x
# This function makes the global 'x' equal to 'modified', and then returns that value
def create_locally():
x = 'local!'
return x
# This function creates a new local variable named 'x', and sets it as 'local',
# and returns that. The global 'x' is untouched.
Note the difference between create_locally and access_only -- access_only is accessing the global x despite not calling global, and even though create_locally doesn't use global either, it creates a local copy since it's assigning a value.
The confusion here is why you shouldn't use global variables.
You can directly access a global variable inside a function. If you want to change the value of that global variable, use "global variable_name". See the following example:
var = 1
def global_var_change():
global var
var = "value changed"
global_var_change() #call the function for changes
print var
Generally speaking, this is not a good programming practice. By breaking namespace logic, code can become difficult to understand and debug.
As others have noted, you need to declare a variable global in a function when you want that function to be able to modify the global variable. If you only want to access it, then you don't need global.
To go into a bit more detail on that, what "modify" means is this: if you want to re-bind the global name so it points to a different object, the name must be declared global in the function.
Many operations that modify (mutate) an object do not re-bind the global name to point to a different object, and so they are all valid without declaring the name global in the function.
d = {}
l = []
o = type("object", (object,), {})()
def valid(): # these are all valid without declaring any names global!
d[0] = 1 # changes what's in d, but d still points to the same object
d[0] += 1 # ditto
d.clear() # ditto! d is now empty but it`s still the same object!
l.append(0) # l is still the same list but has an additional member
o.test = 1 # creating new attribute on o, but o is still the same object
Here is one case that caught me out, using a global as a default value of a parameter.
globVar = None # initialize value of global variable
def func(param = globVar): # use globVar as default value for param
print 'param =', param, 'globVar =', globVar # display values
def test():
global globVar
globVar = 42 # change value of global
func()
test()
=========
output: param = None, globVar = 42
I had expected param to have a value of 42. Surprise. Python 2.7 evaluated the value of globVar when it first parsed the function func. Changing the value of globVar did not affect the default value assigned to param. Delaying the evaluation, as in the following, worked as I needed it to.
def func(param = eval('globVar')): # this seems to work
print 'param =', param, 'globVar =', globVar # display values
Or, if you want to be safe,
def func(param = None)):
if param == None:
param = globVar
print 'param =', param, 'globVar =', globVar # display values
You must use the global declaration when you wish to alter the value assigned to a global variable.
You do not need it to read from a global variable. Note that calling a method on an object (even if it alters the data within that object) does not alter the value of the variable holding that object (absent reflective magic).
def f():
print (x)
def g():
print (x)
x = 1
x = 3
f()
x = 3
g()
I am learning functions now, why are empty arguments used in these functions ?
No arguments are being used (as none are being passed).
Inside f, x is a free variable. There is no local variable by that name, so its value is looked up in the next enclosing scope, which in this case is the global scope.
Inside g, x is a local variable (by virtue of a value being assigned to it), but it is not yet defined when you call print(x). If you were to reverse the order of the assignment and the call to print, you would see 1 as the output, since the global variable by the same name is ignored.
This question already has answers here:
UnboundLocalError trying to use a variable (supposed to be global) that is (re)assigned (even after first use)
(14 answers)
Short description of the scoping rules?
(9 answers)
Closed 6 months ago.
If I run the following code:
x = 1
class Incr:
print(x)
x = x + 1
print(x)
print(x)
It prints:
1
2
1
Okay no problems, that's exactly what I expected. And if I do the following:
x = 1
class Incr:
global x
print(x)
x = x + 1
print(x)
print(x)
It prints:
1
2
2
Also what I expected. No problems there.
Now if I start making an increment function as follows:
x = 1
def incr():
print(x)
incr()
It prints 1 just as I expected. I assume it does this because it cannot find x in its local scope, so it searches its enclosing scope and finds x there. So far no problems.
Now if I do:
x = 1
def incr():
print(x)
x = x + 1
incr()
This gives me the following error in the traceback:
UnboundLocalError: local variable 'x' referenced before assignment.
Why does Python not just search the enclosing space for x when it cannot find a value of x to use for the assignment like my class Incr did? Note that I am not asking how to make this function work. I know the function will work if I do the following:
x = 1
def incr():
global x
print(x)
x = x + 1
print(x)
incr()
This will correctly print:
1
2
just as I expect. All I am asking is why it doesn't just pull x from the enclosing scope when the keyword global is not present just like it did for my class above. Why does the interpreter feel the need to report this as an UnboundLocalError when clearly it knows that some x exists. Since the function was able to read the value at x for printing, I know that it has x as part of its enclosing scope...so why does this not work just like the class example?
Why is using the value of x for print so different from using its value for assignment? I just don't get it.
Classes and functions are different, variables inside a class are actually assigned to the class's namespace as its attributes, while inside a function the variables are just normal variables that cannot be accessed outside of it.
The local variables inside a function are actually decided when the function gets parsed for the first time, and python will not search for them in global scope because it knows that you declared it as a local variable.
So, as soon as python sees a x = x + 1(assignment) and there's no global declared for that variable then python will not look for that variable in global or other scopes.
>>> x = 'outer'
>>> def func():
... x = 'inner' #x is a local variable now
... print x
...
>>> func()
inner
Common gotcha:
>>> x = 'outer'
>>> def func():
... print x #this won't access the global `x`
... x = 'inner' #`x` is a local variable
... print x
...
>>> func()
...
UnboundLocalError: local variable 'x' referenced before assignment
But when you use a global statement then python for look for that variable in global scope.
Read: Why am I getting an UnboundLocalError when the variable has a value?
nonlocal: For nested functions you can use the nonlocal statement in py3.x to modify a variable declared in an enclosing function.
But classes work differently, a variable x declared inside a class A actually becomes A.x:
>>> x = 'outer'
>>> class A:
... x += 'inside' #use the value of global `x` to create a new attribute `A.x`
... print x #prints `A.x`
...
outerinside
>>> print x
outer
You can also access the class attributes directly from global scope as well:
>>> A.x
'outerinside'
Using global in class:
>>> x = 'outer'
>>> class A:
... global x
... x += 'inner' #now x is not a class attribute, you just modified the global x
... print x
...
outerinner
>>> x
'outerinner'
>>> A.x
AttributeError: class A has no attribute 'x'
Function's gotcha will not raise an error in classes:
>>> x = 'outer'
>>> class A:
... print x #fetch from globals or builitns
... x = 'I am a class attribute' #declare a class attribute
... print x #print class attribute, i.e `A.x`
...
outer
I am a class attribute
>>> x
'outer'
>>> A.x
'I am a class attribute'
LEGB rule: if no global and nonlocal is used then python searches in this order.
>>> outer = 'global'
>>> def func():
enclosing = 'enclosing'
def inner():
inner = 'inner'
print inner #fetch from (L)ocal scope
print enclosing #fetch from (E)nclosing scope
print outer #fetch from (G)lobal scope
print any #fetch from (B)uilt-ins
inner()
...
>>> func()
inner
enclosing
global
<built-in function any>
From Python scopes and namespaces:
It is important to realize that scopes are determined textually: the global scope of a function defined in a module is that module’s namespace, no matter from where or by what alias the function is called. On the other hand, the actual search for names is done dynamically, at run time — however, the language definition is evolving towards static name resolution, at “compile” time, so don’t rely on dynamic name resolution! (In fact, local variables are already determined statically.)
Which means that, the scope for x = x + 1 is determined statically, before the function is called. And since this is an assignment, then 'x' becomes a local variable, and not looked up globally.
That is also the reason why from mod import * is disallowed in functions. Because the interpreter won't import modules for you in compile time to know the names you are using in the function. i.e, it must know all names referenced in the function at compile time.
It's the rule Python follows - get used to it ;-) There is a practical reason: both the compiler and human readers can determine which variables are local by looking only at the function. Which names are local has nothing to do with the context in which a function appears, and it's generally a Very Good Idea to follow rules that limit the amount of source code you have to stare at to answer a question.
About:
I assume it does this because it cannot find x in its local scope, so
it searches its enclosing scope and finds x.
Not quite: the compiler determines at compile time which names are and aren't local. There's no dynamic "hmm - is this local or global?" search going on at runtime. The precise rules are spelled out here.
As to why you don't need to declare a name global just to reference its value, I like Fredrik Lundh's old answer here. In practice, it is indeed valuable that a global statement alerts code readers to that a function may be rebinding a global name.
Because that's the way it was designed to work.
Basically, if you have an assignment anywhere in your function, then that variable becomes local to that function (unless you've used global, of course).
Because it would lead to very hard to track down bugs!
When you type that x = x + 1, you could have meant to increment an x in the enclosing scope... or you could have simply forgotten that you already used x somewhere else and been trying to declare a local variable.
I would prefer the interpreter to only allow you to change the parent namespace if you intend to- this way you can't do it by accident.
I can try and make an educated guess why it works this way.
When Python encounters a string x = x + 1 in your function, it has to decide where to look up the x.
It could say "the first occurrence of x is global, and the second one is local", but this is quite ambiguous (and therefore against Python philosophy). This could be made part of the syntax, but it potentially leads to tricky bugs. Therefore it was decided to be consistent about it, and treat all occurrences as either global or local variables.
There is an assignment, therefore if x was supposed to be global, there would be a global statement, but none is found.
Therefore, x is local, but it is not bound to anything, and yet it is used in the expression x + 1. Throw UnboundLocalError.
As an extra example to the newly created A.x created within the class. The reassigning of x to 'inner' within the class does not update the global value of x because it is now a class variable.
x = 'outer'
class A:
x = x
print(x)
x = 'inner'
print(x)
print(x)
print(A.x)
As a follow up to this question
Why can I change the global value of x.a within def A? I am guessing it has to do with the fact that it is a class since it would not work with a regular variable because it would be redefined in the local scope of def A, but I am not quite understanding what is happening.
Case 1
class X:
def __init__(self):
self.a = 1
self.b = 2
self.c = 3
class Y:
def A(self):
print(x.a,x.b,x.c)
x.a = 4
x = X()
y = Y()
y.A()
print(x.a,x.b,x.c)
If you set x = Y() in A scope it would create a local x in that scope. In this case however, you are not setting x, you are setting x.a. Looking up variables takes into account global variables too. Imagine you are doing this instead setattr(x, "a", 4) and it will make more sense.
Also if I remember correctly you can "import" global variables into a function scope by using the global keyword. (see Use of "global" keyword in Python)
Global names can be read within functions:
x = 5
def read_x():
print(x)
Globals that reference mutable types can also be mutated within functions:
x = [1, 2, 3]
def mutate_x():
x[0] = 'One'
What you can't do to a global within the scope of a function is assignment:
x = 5
def set_x():
# this simply assigns a variable named x local to this function -- it doesn't modify the global x
x = 3
# now, back outside the scope of set_x, x remains 5
print(x)
5
Unless you explicitly declare the global within the scope of the function:
x = 5
def set_x():
global x
x = 3
# back outside the function's scope
print(x)
3
What you're doing in your example is "mutating" -- modifying an attribute of an object. Assigning a value to an attribute of a user-defined type is one example of mutation, just like modifying an element of a list or a dictionary. That's why this works.
The class A doesn't have an init method that defines x, so the method A, when you try to access the attribute a of x, try to find the x object in the local scope, and since it can't find it then it looks the outside scope where an object name x is present, it grabs that object and override the attribute a.
So basically what you are doing is modify the actual attribute a of the object x that you create before you call y.A().
It's the very basic foundation of a closure: access a variable that is defined outside the local scope.
I am asking because of the classic problem where somebody creates a list of lambdas:
foo = []
for i in range(3):
foo.append((lambda: i))
for l in foo:
print(l())
and unexpectedly gets only twos as output.
The commonly proposed solution is to make i a named argument like this:
foo = []
for i in range(3):
foo.append((lambda i=i: i))
for l in foo:
print(l())
Which produces the desired output of 0, 1, 2 but now something magical has happened. It sort of did what is expected because Python is pass-by-reference and you didn't want a reference.
Still, just adding a new name to something, shouldn't that just create another reference?
So the question becomes what are the exact rules for when something is not a reference?
Considering that ints are immutable and the following works:
x = 3
y = x
x = 5
print(x, y) // outputs 5 3
probably explains why adding that named parameter works. A local i with the same value was created and captured.
Now why, in the case of our lambdas was the same i referenced? I pass an int to function and it is refenced and if I store it in a variable it is copied. Hm.
Basically I am looking for the most concise and abstract way possible to remember exactly how this works. When is the same value referenced, when do I get a copy. If it has any common names and there are programming languages were it works the same that would be interesting as well.
Here is my current assumption:
Arguments are always passed to functions by reference.
Assigning to a variable of immutable type creates a copy.
I am asking anyway, just to make sure and hopefully get some background.
The issue here is how you think of names.
In your first example, i is a variable that is assigned to every time the loop iterates. When you use lambda to make a function, you make a function that accesses the name i and returns it's value. This means as the name i changes, the value returned by the functions also changes.
The reason the default argument trick works is that the name is evaluated when the function is defined. This means the default value is the value the i name points to at that time, not the name itself.
i is a label. 0, 1 and 2 are the objects. In the first case, the program assigns 0 to i, then makes a function that returns i - it then does this with 1 and 2. When the function is called, it looks up i (which is now 2) and then returns it.
In the second example, you assign 0 to i, then you make a function with a default argument. That default argument is the value that is gotten by evaluating i - that is the object 0. This is repeated for 1 and 2. When the function is called, it assigns that default value to a new variable i, local to the function and unrelated to the outer i.
Python doesn't exactly pass by reference or by value (at least, not the way you'd think of it, coming from a language like C++).
In many other languages (such as C++), variables can be thought of as synonymous with the values they hold.
However, in Python, variables are names that point to the objects in memory.
(This is a good explanation (with pictures!))
Because of this, you can get multiple names attached to one object, which can lead to interesting effects.
Consider these equivalent program snippets:
// C++:
int x;
x = 10; // line A
x = 20; // line B
and
# Python:
x = 10 # line C
x = 20 # line D
After line A, the int 10 is stored in memory, say, at the memory address 0x1111.
After line B, the memory at 0x1111 is overwritten, so 0x1111 now holds the int 20
However, the way this program works in python is quite different:
After line C, x points to some memory, say, 0x2222, and the value stored at 0x2222 is 10
After line D, x points to some different memory, say, 0x3333, and the value stored at 0x3333 is 20
Eventually, the orphaned memory at 0x2222 is garbage collected by Python.
Hopefully this helps you get a grasp of the subtle differences between variables in Python and most other languages.
(I know I didn't directly answer your question about lambdas, but I think this is good background knowledge to have before reading one of the good explanations here, such as #Lattyware's)
See this question for some more background info.
Here's some final background info, in the form of oft-quoted but instructive examples:
print 'Example 1: Expected:'
x = 3
y = x
x = 2
print 'x =', x
print 'y =', y
print 'Example 2: Surprising:'
x = [3]
y = x
x[0] = 2
print 'x =', x
print 'y =', y
print 'Example 3: Same logic as in Example 1:'
x = [3]
y = x
x = [2]
print 'x =', x
print 'y =', y
The output is:
Example 1: Expected:
x = 2
y = 3
Example 2: Surprising:
x = [2]
y = [2]
Example 3: Same logic as in Example 1:
x = [2]
y = [3]
foo = []
for i in range(3):
foo.append((lambda: i))
Here since all the lambda's were created in the same scope so all of them point to the same global variable variable i. so, whatever value i points to will be returned when they are actually called.
foo = []
for i in range(3):
foo.append((lambda z = i: id(z)))
print id(i) #165618436
print(foo[-1]()) #165618436
Here in each loop we assign the value of i to a local variable z, as default arguments are calculated when the function is parsed so the value z simply points to the values stored by i during the iteration.
Arguments are always passed to functions by reference?
In fact the z in foo[-1] still points to the same object as i of the last iteration, so yes values are passed by reference but as integers are immutable so changing i won't affect z of the foo[-1] at all.
In the example below all lambda's point to some mutable object, so modifying items in lis will also affect the functions in foo:
foo = []
lis = ([], [], [])
for i in lis:
foo.append((lambda z = i: z))
lis[0].append("bar")
print foo[0]() #prints ['bar']
i.append("foo") # `i` still points to lis[-1]
print foo[-1]() #prints ['foo']
Assigning to a variable of immutable type creates a copy?
No values are never copied.
>>> x = 1000
>>> y = x # x and y point to the same object, but an immutable object.
>>> x += 1 # so modifying x won't affect y at all, in fact after this step
# x now points to some different object and y still points to
# the same object 1000
>>> x #x now points to an new object, new id()
1001
>>> y #still points to the same object, same id()
1000
>>> x = []
>>> y = x
>>> x.append("foo") #modify an mutable object
>>> x,y #changes can be seen in all references to the object
(['foo'], ['foo'])
The list of lambdas problem arises because the i referred to in both snippets is the same variable.
Two distinct variables with the same name exist only if they exist in two separate scopes. See the following link for when that happens, but basically any new function (including a lambda) or class establishes its own scope, as do modules, and pretty much nothing else does. See: http://docs.python.org/2/reference/executionmodel.html#naming-and-binding
HOWEVER, when reading the value of a variable, if it is not defined in the current local scope, the enclosing local scopes are searched*. Your first example is of exactly this behaviour:
foo = []
for i in range(3):
foo.append((lambda: i))
for l in foo:
print(l())
Each lambda creates no variables at all, so its own local scope is empty. When execution hits the locally undefined i, it is located in the enclosing scope.
In your second example, each lambda creates its own i variable in the parameter list:
foo = []
for i in range(3):
foo.append((lambda i=i: i))
This is in fact equivalent to lambda a=i: a, because the i inside the body is the same as the i on the left hand side of the assignment, and not the i on the right hand side. The consequence is that i is not missing from the local scope, and so the value of the local i is used by each lambda.
Update: Both of your assumptions are incorrect.
Function arguments are passed by value. The value passed is the reference to the object. Pass-by-reference would allow the original variable to be altered.
No implicit copying ever occurs on function call or assignment, of any language-level object. Under the hood, because this is pass-by-value, the references to the parameter objects are copied when the function is called, as is usual in any language which passes references by value.
Update 2: The details of function evaluation are here: http://docs.python.org/2/reference/expressions.html#calls . See the link above for the details regarding name binding.
* No actual linear search occurs in CPython, because the correct variable to use can be determined at compile time.
The answer is that the references created in a closure (where a function is inside a function, and the inner function accesses variables from the outer one) are special. This is an implementation detail, but in CPython the value is a particular kind of object called a cell and it allows the variable's value to be changed without rebinding it to a new object. More info here.
The way variables work in Python is actually rather simple.
All variables contain references to objects.
Reassigning a variable points it to a different object.
All arguments are passed by value when calling functions (though the values being passed are references).
Some types of objects are mutable, which means they can be changed without changing what any of their variable names point to. Only these types can be changed when passed, since this does not require changing any references to the object.
Values are never copied implicitly. Never.
The behaviour really has very little to do with how parameters are passed (which is always the same way; there is no distinction in Python where things are sometimes passed by reference and sometimes passed by value). Rather the problem is to do with how names themselves are found.
lambda: i
creates a function that is of course equivalent to:
def anonymous():
return i
That i is a name, within the scope of anonymous. But it's never bound within that scope (not even as a parameter). So for that to mean anything i must be a name from some outer scope. To find a suitable name i, Python will look at the scope in which anonymous was defined in the source code (and then similarly out from there), until it finds a definition for i.1
So this loop:
foo = []
for i in range(3):
foo.append((lambda: i))
for l in foo:
print(l())
Is almost exactly as if you had written this:
foo = []
for i in range(3):
def anonymous():
return i
foo.append(anonymous)
for l in foo:
print(l())
So that i in return i (or lambda: i) ends up being the same i from the outer scope, which is the loop variable. Not that they are all references to the same object, but that they are all the same name. So it's simply not possible for the functions stored in foo to return different values; they're all returning the object referred to by a single name.
To prove it, watch what happens when I remove the variable i after the loop:
>>> foo = []
>>> for i in range(3):
foo.append((lambda: i))
>>> del i
>>> for l in foo:
print(l())
Traceback (most recent call last):
File "<pyshell#7>", line 2, in <module>
print(l())
File "<pyshell#3>", line 2, in <lambda>
foo.append((lambda: i))
NameError: global name 'i' is not defined
You can see that the problem isn't that each function has a local i bound to the wrong thing, but rather than each function is returning the value of the same global variable, which I've now removed.
OTOH, when your loop looks like this:
foo = []
for i in range(3):
foo.append((lambda i=i: i))
for l in foo:
print(l())
That is quite like this:
foo = []
for i in range(3):
def anonymous(i=i):
return i
foo.append(anonymous)
for l in foo:
print(l())
Now the i in return i is not the same i as in the outer scope; it's a local variable of the function anonymous. A new function is created in each iteration of the loop (stored temporarily in the outer scope variable anonymous, and then permanently in a slot of foo), so each one has it's own local variables.
As each function is created, the default value of its parameter is set to the value of i (in the scope defining the functions). Like any other "read" of a variable, that pulls out whatever object is referenced by the variable at that time, and thereafter has no connection to the variable.2
So each function gets the default value of i as it is in the outer scope at the time it is created, and then when the function is called without an argument that default value becomes the value of the i in that function's local scope. Each function has no non-local references, so is completely unaffected by what happens outside it.
1 This is done at "compile time" (when the Python file is converted to bytecode), with no regard for what the system is like at runtime; it is almost literally looking for an outer def block with i = ... in the source code. So local variables are actually statically resolved! If that lookup chain falls all the way out to the module global scope, then Python assumes that i will be defined in the global scope at the point that the code will be run, and just treats i as a global variable whether or not there is a statically visible binding for i at module scope, hence why you can dynamically create global variables but not local ones.
2 Confusingly, this means that in lambda i=i: i, the three is refer to three completely different "variables" in two different scopes on the one line.
The leftmost i is the "name" holding the value that will be used for the default value of i, which exists independently of any particular call of the function; it's almost exactly "member data" stored in the function object.
The second i is an expression evaluated as the function is created, to get the default value. So the i=i bit acts very like an independent statement the_function.default_i = i, evaluated in the same scope containing the lambda expression.
And finally the third i is actually the local variable inside the function, which only exists within a call to the anonymous function.