Why is globals() a function in Python?

Why is globals() a function in Python? - python

Python offers the function globals() to access a dictionary of all global variables. Why is that a function and not a variable? The following works:
g = globals()
g["foo"] = "bar"
print foo # Works and outputs "bar"
What is the rationale behind hiding globals in a function? And is it better to call it only once and store a reference somewhere or should I call it each time I need it?
IMHO, this is not a duplicate of Reason for globals() in Python?, because I'm not asking why globals() exist but rather why it must be a function (instead of a variable __globals__).

Because it may depend on the Python implementation how much work it is to build that dictionary.
In CPython, globals are kept in just another mapping, and calling the globals() function returns a reference to that mapping. But other Python implementations are free to create a separate dictionary for the object, as needed, on demand.
This mirrors the locals() function, which in CPython has to create a dictionary on demand because locals are normally stored in an array (local names are translated to array access in CPython bytecode).
So you'd call globals() when you need access to the mapping of global names. Storing a reference to that mapping works in CPython, but don't count on other this in other implementations.

Related

Function closures and meaning of <locals> syntax in object name in Python

Suppose I have the following code:
def outer(information):
print(locals())
def inner():
print("The information given to me is: ", information)
return inner
func1 = outer("info1")
print(func1)
It returns:
{'information': 'info1'}
<function outer.<locals>.inner at 0x1004d9d30>
Of course, if I call func1, it will print with info1 in the statement. So, from printing the locals() in the outer function, I can see that there is some relationship between the local scope and the storage of the argument.
I was expecting func1 to simply be outer.inner, why does the syntax instead say outer.<locals>.inner? Is this a syntactical way of clarifying that there are different local scopes associated to each of these functions - imagine I made another one func2 = outer("info2") - I return using the outer function?
Also, is there something special about the enclosing <> syntax when used around a name? I see it around both the object and locals.

See PEP 3155 -- Qualified name for classes and functions and the example with nested functions.
For nested classes, methods, and nested functions, the __qualname__ attribute contains a dotted path leading to the object from the module top-level. A function's local namespace is represented in that dotted path by a component named <locals>.
Since the __repr__ of a function uses the __qualname__ attribute, you see this extra component in the output when printing a nested function.
I was expecting func1 to simply be outer.inner
That's not a fully qualified name. With this repr you might mistakenly assume you could import the name outer and dynamically access the attribute inner. Remember the qualname is a "dotted path leading to the object", but in this case attribute access is not possible because inner is a local variable.
Also, is there something special about the enclosing <> syntax when used around a name?
There is nothing special about it, but it gives a pretty strong hint to the programmer that you can't access this namespace directly, because the name is not a valid identifier.

You can think of outer.<locals>.inner as saying that inner is a local variable created by the function. inner is what is referred to a closure in computer science. Roughly speaking a closure is like a lambda in that it acts as a function, but it requires non-global data be bundled with it to operate. In memory it acts as a tuple between information and a reference to the function being called.
foo = outer("foo")
bar = outer("bar")
# In memory these more or less looks like the following:
("foo", outer.inner)
("bar", outer.inner)
# And since it was created from a local namespace and can not be accessed
# from a static context local variables bundled with the function, it
# represents that by adding <local> when printed.
# While something like this looks a whole lot more convenient, it gets way
# more annoying to work with when the local variables used are the length of
# your entire terminal screen.
<function outer."foo".inner at 0x1004d9d30>
There is nothing inherently special about the <> other than informing you that <local> has some special meaning.
Edit:
I was not completely sure when writing my answer, but after seeing #wim's answer <local> not only applies to closures created consuming variables within a local context. It can be applied more broadly to all functions (or anything else) created within a local namespace. So in summary foo.<local>.bar just means that "bar was created within the local namespace of foo".

When is the locals dictionary set?

A module holds a dictionary to keep track of itscontext, such as the names defined at some point of the execution. This dictionary can be accessed through vars(module) (or module.__dict__) if module was imported, or by a call to the locals built-in function in the module itself:
Update and return a dictionary representing the current local symbol table.
But I found myself a bit confused, when I tried accessing the locals dictionary from a function. The output of a script containing only the following is an empty dictionary:
def list_locals():
print(locals())
list_locals()
But on the other hand, if a script contains exclusively the following, the output is the expected dictionary, containing __name__, __doc__ and the other module-level variables:
print(locals())
So, when is the content of the locals dictionary set?
In addition, what does "update" mean in the definition of the locals function?

The namespace of a module is the global namespace, accessed through globals(). There is no separate locals namespace, so locals(), outside of functions, just returns the global namespace.
Only functions have a local namespace. Note that locals() is a one-way reflection of that namespace; the CPython implementation for functions is highly optimised and you can't add or alter local variables through the locals() dictionary. The dictionary returned by locals() is updated whenever the namespace has changed between calls to that function.
Note also that things like list / dict / set comprehensions, generator expressions and class bodies are in reality executed as functions too, albeit with different namespace rules. In such contexts locals() will also return the separate function local namespace.

If you call locals() more than once during the same function call, it will return the same dictionary.
>>> def f():
... d = locals()
... assert 'd' not in d
... locals()
... assert 'd' in d
... assert d is locals()
... print(d)
...
>>> f()
{'d': {...}}
"Update" in this case means that the contents of this dictionary are updated to reflect the current scope of local variables that exist, but if you kept a reference to this dictionary then that dictionary will still be used.
Notice also that the locals dictionary doesn't contain a key for 'f', even though you could have accessed it during the course of the function. In case it's not obvious, that's a global, not a local.

execfile() cannot be used reliably to modify a function’s locals

The python documentation states "execfile() cannot be used reliably to modify a function’s locals." on the page http://docs.python.org/2/library/functions.html#execfile
Can anyone provide any further details on this statement? The documentation is fairly minimal. The statement seems very contradictory to "If both dictionaries are omitted, the expression is executed in the environment where execfile() is called." which is also in the documentation. Is there a special case when excecfile is used within a function then execfile is then acting similar to a function in that it creates a new scoping level?
If I use execfile in a function such as
def testfun():
execfile('thefile.py',globals())
def testfun2():
print a
and there are objects created by the commands in 'thefile.py' (such as the object 'a'), how do I know if they are going to be local objects to testfun or global objects? So, in the function testfun2, 'a' will appear to be a global? If I omit globals() from the execfile statement, can anyone give a more detailed explanation why objects created by commands in 'thefile.py' are not available to 'testfun'?

In Python, the way names are looked up is highly optimized inside functions. One of the side effects is that the mapping returned by locals() gives you a copy of the local names inside a function, and altering that mapping does not actually influence the function:
def foo():
a = 'spam'
locals()['a'] = 'ham'
print(a) # prints 'spam'
Internally, Python uses the LOAD_FAST opcode to look up the a name in the current frame by index, instead of the slower LOAD_NAME, which would look for a local name (by name), then in the globals() mapping if not found in the first.
The python compiler can only emit LOAD_FAST opcodes for local names that are known at compile time; but if you allow the locals() to directly influence a functions' locals then you cannot know all the local names ahead of time. Nested functions using scoped names (free variables) complicates matters some more.
In Python 2, you can force the compiler to switch off the optimizations and use LOAD_NAME always by using an exec statement in the function:
def foo():
a = 'spam'
exec 'a == a' # a noop, but just the presence of `exec` is important
locals()['a'] = 'ham'
print(a) # prints 'ham'
In Python 3, exec has been replaced by exec() and the work-around is gone. In Python 3 all functions are optimized.
And if you didn't follow all this, that's fine too, but that is why the documentation glosses over this a little. It is all due to an implementation detail of the CPython compiler and interpreter that most Python users do not need to understand; all you need to know that using locals() to change local names in a function does not work, usually.

Locals are kind of weird in Python. Regular locals are generally accessed by index, not by name, in the bytecode (as this is faster), but this means that Python has to know all the local variables at compile time. And that means you can't add new ones at runtime.
Now, if you use exec in a function, in Python 2.x, Python knows not to do this and falls back to the slower method of accessing local variables by name, and you can make new ones programmatically. (This trick was removed in Python 3.) You'd think Python would also do this for execfile(), but it doesn't, because exec is a statement and execfile() is a function call, and the name execfile might not refer to the built-in function at runtime (it can be reassigned, after all).
What will happen in your example function? Well, try it and find out! As the documentation for execfile states, if you don't pass in a locals dict, the dict you pass in as globals will be used. You pass in globals() (your module's real global variables) so if it assigns to a, then a becomes a global.
Now you might try something like this:
def testfun():
execfile('thefile.py')
def testfun2():
print a
return testfun2
exec ""
The exec statement at the end forces testfun() to use the old-style name-based local variables. It doesn't even have to be executed, as it is not here; it just has to be in the function somewhere.
But this doesn't work either, because the name-based locals don't support nesting functions with free variables (a in this case). That functionality also requires Python know all the local variables at function definition time. You can't even define the above function—Python won't let you.
In short, trying to deal with local variables programmatically is a pain and the documentation is correct: execfile() cannot reliably be used to modify a function's locals.
A better solution, probably, is to just import the file as a module. You can do this within the function, then access values in the module the usual way.
def testfun():
import thefile
print thefile.a
If you won't know the name of the file to be imported until runtime, you can use __import__ instead. Also, you may need to modify sys.path to make sure the directory you want to import from is first in the path (and put it back afterward, probably).
You can also just pass in your own dictionary to execfile and afterward, access the variables from the executed file using myVarsDict['a'] and so on.

where python store global and local variables?

Almost same as question Where are the local, global, static, auto, register, extern, const, volatile variables are stored?, the difference is this thread is asking how Python language implement this.

Of all those, Python only has "local", "global" and "nonlocal" variables.
Some of those are stored in a Dictionary or dictionary like object, which usually can be explicitly addressed.
"global": Actually "global" variables are global relatively to the module where they are defined - sometimes they are referred to as "module level" variables instead of globals, since most of evils of using global variables in C do not apply - since one won't have neither naming conflicts neither won't know wether a certain name came from when using a module-level global variable.
Their value is stored in the dictionary available as the "__dict__" attribute of a module object. It is important to note that all names in a module are stored in this way - since names in Python point to any akind of object: that is, there is no distinction at the language level, of "variables", functions or classes in a module: the names for all these objects will be keys in the "__dict__" special attribute, which is accessed directly by the Language. Yes, one can insert or change the objects pointed by variables in a module at run time with the usual "setattr", or even changing the module's __dict__ directly.
"local": Local variables are available fr "user code" in a dictionary returned by the "locals()" buil-in function call. This dictionary is referenced by the "f_locals" attribute of the current code frame being run. Since there are ways of retrieving the code frame of functions that called the current running code, one can retrieve values of the variables available in those functions using the f_locals attribute, although in the CPython implementation, changing a value in the f_locals dictionary won't reflect on the actuall variable values of the running code - those values are cached by the bytecode machinery.
"nonlocal" Variables are special references, inside a function to variables defined in an outter scope, in the case of functions (or other code, like a class body) defined inside a function. They can be retrieved in running code, by getting the func_closure attribute - which is a tuple of "cell" objects. For example, to retrieve the value of the first nonlocal variable inside a function object, one does:_
function.func_closure[0].cell_contents - the values are kept separate from the variable names, which can be retrieved as function.func_code.co_varnames. (this naming scheme is valid for Python 2.x)
The bottom-line: Variable "values" are always kept inside objects that are compatible with Python objects and managed by the virtual machine. Some of these data can be made programmatically accessible through introspection - some of it is opaque. (For example, retrieving, through introspection, nonlocal variables from inside the function that owns them itself is a bit tricky)

Python, print names and values of all bound variables

Is there a way to have Python print the names and values of all bound variables?
(without redesigning the program to store them all in a single list)

Yes you can, it is a rather dirty way to do it, but it is good for debugging etc
from pprint import pprint
def getCurrentVariableState():
pprint(locals())
pprint(globals())

globals() and locals() should give you what you're looking for.

dir(...)
dir([object]) -> list of strings
If called without an argument, return the names in the current scope.
Else, return an alphabetized list of names comprising (some of) the attributes
of the given object, and of attributes reachable from it.
If the object supplies a method named __dir__, it will be used; otherwise
the default dir() logic is used and returns:
for a module object: the module's attributes.
for a class object: its attributes, and recursively the attributes
of its bases.
for any other object: its attributes, its class's attributes, and
recursively the attributes of its class's base classes.

globals() and locals() each return a dictionary representing the symbol table in global and local scopes respectively.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Why is globals() a function in Python? - python

Related

Function closures and meaning of <locals> syntax in object name in Python

When is the locals dictionary set?

execfile() cannot be used reliably to modify a function’s locals

where python store global and local variables?

Python, print names and values of all bound variables

Categories

Resources