python import inside function hides existing variable - python

I was wrestling with a weird "UnboundLocalError: local variable referenced before assignment" issue in a multi-submodule project I am working on and slimmed it down to this snippet (using the logging module from standard library):
import logging
def foo():
    logging.info('foo')
def bar():
    logging.info('bar')
    if False:
        import logging.handlers
        # With these alternatives things work:
        # import logging.handlers as logginghandlers
        # from logging.handlers import SocketHandler
logging.basicConfig(level=logging.INFO)
foo()
bar()
Which has this output (I tried python 2.7 and 3.3):
INFO:root:foo
Traceback (most recent call last):
  File "import-test01.py", line 16, in <module>
    bar()
  File "import-test01.py", line 7, in bar
    logging.info('bar')
UnboundLocalError: local variable 'logging' referenced before assignment
Apparently the presence of an import statement inside a function hides already existing variables with the same name in the function scope, even if the import is not executed.
This feels counter-intuitive and non-pythonic. I tried to find some information or documentation about this, without much success thus far. Has someone more information/insights about this behaviour?
thanks

The problem you're having is just a restatement of the usual way Python deals with locals masking globals.
To understand it, note that import foo is (approximately) syntactic sugar for:
foo = __import__("foo")
Thus, your code is equivalent to:
x = 1
def bar():
    print x
    if False:
        x = 2
An import statement binds a name just as an assignment does, and import logging.handlers binds the top-level name logging. Since that binding appears inside bar, the compiler treats logging as a local variable everywhere in bar, and so Python won't look for a global by the same name in that scope, even though the line that sets it can never be executed.
The usual workarounds for dealing with globals apply: use another name, as you have found, or:
def bar():
    global logging
    logging.info('bar')
    if False:
        import logging.handlers
so that python won't think logging is local.
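To see that the decision is made when the function is compiled rather than when it runs, you can inspect the function's code object. This is only a sketch: the co_varnames check is my own illustration rather than something from the question, it relies on a CPython detail, and it is written for Python 3.

```python
import logging


def bar():
    logging.info('bar')
    if False:
        # Never executed, but it still makes `logging` a local name in bar().
        import logging.handlers


# The compiler has already recorded `logging` as a local variable of bar().
print('logging' in bar.__code__.co_varnames)

try:
    bar()
except UnboundLocalError as exc:
    # The exact message differs between Python versions.
    print('UnboundLocalError:', exc)
```

The name shows up among the function's locals before bar() has ever been called, which is why the global logging module is never consulted.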

By that import statement you introduce logging as a local variable in the function, and you use that local variable before it's initialized.
def bar():
    import logging.handlers
    print locals()
>>> bar()
{'logging': <module 'logging' from '/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/logging/__init__.pyc'>}


Visibility of global variable declared in imported module

I just bumped into an unexpected (at least, for me!) behavior, and I'm trying to understand it.
Let's say I have a main file:
main.py
from my_packages.module_00 import my_function
def main():
    my_function()
if __name__ == "__main__":
    main()
and, in the folder "my_packages", the module module_00 containing the function definition for "my_function" and a global variable:
module_00.py
global_var = 'global variable'
def my_function():
    print(f'Do I know {global_var}???')
When I run main.py, it outputs:
Do I know global variable???
And I'm trying to figure out why it's working.
I would expect the variable global_var to have a scope limited only to the module where it's defined (the answer to this question seems to confirm it).
Basically, I assumed that importing my_function by
from my_packages.module_00 import my_function
was equivalent to copy/pasting the function definition in main.py. However, it seems that...the imported function somehow keeps track of the global variables declared in the module where the function itself has been defined?
Or am I missing something?
However, it seems that...the imported function somehow keeps track of the global variables declared in the module where the function itself has been defined?
That's exactly what it is doing.
>>> import module_00
>>> from module_00 import my_func
>>> my_func.__globals__['global_var']
'global variable'
>>> module_00.global_var is my_func.__globals__['global_var']
True
>>> module_00.global_var = 3
>>> my_func.__globals__['global_var']
3
__globals__ is a reference to the global namespace of the module where my_func was defined.
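If you want to try this without creating the package on disk, here is a minimal sketch that builds a stand-in for module_00 in memory. The use of types.ModuleType and exec is just my way of keeping the example self-contained, not how the question's project is set up.

```python
import types

# Build a stand-in for my_packages/module_00.py in memory.
module_00 = types.ModuleType("module_00")
source = (
    "global_var = 'global variable'\n"
    "def my_function():\n"
    "    return f'Do I know {global_var}???'\n"
)
exec(source, module_00.__dict__)

# This is the binding that "from ... import my_function" would create.
my_function = module_00.my_function

# A same-named global in the importing namespace is ignored by my_function.
global_var = 'a different value in the importing module'

print(my_function.__globals__ is module_00.__dict__)
print(my_function())
```

The function still reports the value stored in the module where it was defined, because it looks names up in that module's namespace and not in the caller's.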
Fix a couple of typos and your code works as expected. Each module in Python has its own private symbol table for the module's global variables.
import in main.py must be my_func not my_function matching the function name as defined in module_00.py.
ref in f-string in module_00.py must be {global_var} not {global variable}.
main.py
from my_packages.module_00 import my_func
                                  ^^^^^^^
def main():
    my_func() # my_func not my_function
if __name__ == "__main__":
    main()
module_00.py
global_var = 'global variable'
def my_func():
    print(f'Do I know {global_var}???')
                       ^^^^^^^^^^
Output:
Do I know global variable???
If you want to access a module's global variable then you can import the module and access the variable with the syntax: package.module_name.variable_name; e.g. my_packages.module_00.global_var

Python global variable in import * [duplicate]

I've run into a bit of a wall importing modules in a Python script. I'll do my best to describe the error, why I run into it, and why I'm tying this particular approach to solve my problem (which I will describe in a second):
Let's suppose I have a module in which I've defined some utility functions/classes, which refer to entities defined in the namespace into which this auxiliary module will be imported (let "a" be such an entity):
module1:
def f():
    print a
And then I have the main program, where "a" is defined, into which I want to import those utilities:
import module1
a=3
module1.f()
Executing the program will trigger the following error:
Traceback (most recent call last):
  File "Z:\Python\main.py", line 10, in <module>
    module1.f()
  File "Z:\Python\module1.py", line 3, in f
    print a
NameError: global name 'a' is not defined
Similar questions have been asked in the past (two days ago, d'uh) and several solutions have been suggested, however I don't really think these fit my requirements. Here's my particular context:
I'm trying to make a Python program which connects to a MySQL database server and displays/modifies data with a GUI. For cleanliness sake, I've defined the bunch of auxiliary/utility MySQL-related functions in a separate file. However they all have a common variable, which I had originally defined inside the utilities module, and which is the cursor object from MySQLdb module.
I later realised that the cursor object (which is used to communicate with the db server) should be defined in the main module, so that both the main module and anything that is imported into it can access that object.
End result would be something like this:
utilities_module.py:
def utility_1(args):
    code which references a variable named "cur"
def utility_n(args):
    etcetera
And my main module:
program.py:
import MySQLdb, Tkinter
db=MySQLdb.connect(#blahblah) ; cur=db.cursor() #cur is defined!
from utilities_module import *
And then, as soon as I try to call any of the utilities functions, it triggers the aforementioned "global name not defined" error.
A particular suggestion was to have a "from program import cur" statement in the utilities file, such as this:
utilities_module.py:
from program import cur
#rest of function definitions
program.py:
import Tkinter, MySQLdb
db=MySQLdb.connect(#blahblah) ; cur=db.cursor() #cur is defined!
from utilities_module import *
But that's cyclic import or something like that and, bottom line, it crashes too. So my question is:
How in hell can I make the "cur" object, defined in the main module, visible to those auxiliary functions which are imported into it?
Thanks for your time and my deepest apologies if the solution has been posted elsewhere. I just can't find the answer myself and I've got no more tricks in my book.
Globals in Python are global to a module, not across all modules. (Many people are confused by this, because in, say, C, a global is the same across all implementation files unless you explicitly make it static.)
There are different ways to solve this, depending on your actual use case.
Before even going down this path, ask yourself whether this really needs to be global. Maybe you really want a class, with f as an instance method, rather than just a free function? Then you could do something like this:
import module1
thingy1 = module1.Thingy(a=3)
thingy1.f()
If you really do want a global, but it's just there to be used by module1, set it in that module.
import module1
module1.a=3
module1.f()
On the other hand, if a is shared by a whole lot of modules, put it somewhere else, and have everyone import it:
import shared_stuff
import module1
shared_stuff.a = 3
module1.f()
… and, in module1.py:
import shared_stuff
def f():
    print shared_stuff.a
Don't use a from import unless the variable is intended to be a constant. from shared_stuff import a would create a new a variable initialized to whatever shared_stuff.a referred to at the time of the import, and this new a variable would not be affected by assignments to shared_stuff.a.
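A small sketch of that difference, in Python 3. The shared_stuff module here is created in memory and registered in sys.modules purely so the example runs on its own; in a real project it would be an ordinary shared_stuff.py file.

```python
import sys
import types

# In-memory stand-in for shared_stuff.py (an assumption made for this demo).
shared_stuff = types.ModuleType("shared_stuff")
shared_stuff.a = 1
sys.modules["shared_stuff"] = shared_stuff

from shared_stuff import a   # new name `a`, bound to the current value 1

shared_stuff.a = 3           # rebinds the module attribute only

print(a)                     # the from-imported name is unaffected
print(shared_stuff.a)        # attribute access sees the new value
```

Note that this is about rebinding. If a referred to a mutable object such as a list and you mutated it in place, both names would still see the change, since they refer to the same object.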
Or, in the rare case that you really do need it to be truly global everywhere, like a builtin, add it to the builtin module. The exact details differ between Python 2.x and 3.x. In 3.x, it works like this:
import builtins
import module1
builtins.a = 3
module1.f()
As a workaround, you could consider setting environment variables in the outer layer, like this.
main.py:
import os
os.environ['MYVAL'] = str(myintvariable)
mymodule.py:
import os
myval = None
if 'MYVAL' in os.environ:
    myval = os.environ['MYVAL']
As an extra precaution, handle the case when MYVAL is not defined inside the module.
This post is just an observation about Python behaviour I encountered. Maybe the advice you read above won't work for you if you did the same thing I did below.
Namely, I have a module which contains global/shared variables (as suggested above):
#sharedstuff.py
globaltimes_randomnode=[]
globalist_randomnode=[]
Then I had the main module which imports the shared stuff with:
import sharedstuff as shared
and some other modules that actually populated these arrays. These are called by the main module. When exiting these other modules I can clearly see that the arrays are populated. But when reading them back in the main module, they were empty. This was rather strange for me (well, I am new to Python). However, when I change the way I import the sharedstuff.py in the main module to:
from globals import *
it worked (the arrays were populated).
Just sayin'
A function uses the globals of the module it's defined in. Instead of setting a = 3, for example, you should be setting module1.a = 3. So, if you want cur available as a global in utilities_module, set utilities_module.cur.
A better solution: don't use globals. Pass the variables you need into the functions that need it, or create a class to bundle all the data together, and pass it when initializing the instance.
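As a rough sketch of the class-based variant: the Utilities class and its count_items method are hypothetical names, and sqlite3 with an in-memory database stands in for MySQLdb only so the example can run anywhere.

```python
import sqlite3


class Utilities:
    """Bundles the cursor with the helper functions that need it."""

    def __init__(self, cur):
        self.cur = cur

    def count_items(self):
        # Uses the cursor handed in at construction time; no global needed.
        self.cur.execute("SELECT COUNT(*) FROM items")
        return self.cur.fetchone()[0]


# Main program: create the cursor once and pass it in.
db = sqlite3.connect(":memory:")
cur = db.cursor()
cur.execute("CREATE TABLE items (name TEXT)")
cur.executemany("INSERT INTO items VALUES (?)", [("a",), ("b",)])

utils = Utilities(cur)
print(utils.count_items())
```

The main module keeps ownership of the cursor, and the utilities never have to guess which namespace it lives in.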
The easiest solution to this particular problem would have been to add another function within the module that would have stored the cursor in a variable global to the module. Then all the other functions could use it as well.
module1:
cursor = None
def setCursor(cur):
    global cursor
    cursor = cur
def method(some, args):
    global cursor
    do_stuff(cursor, some, args)
main program:
import module1
cursor = get_a_cursor()
module1.setCursor(cursor)
module1.method()
Since globals are module specific, you can add the following function to every imported module, and then use it either to add individual variables (in dictionary format) as globals of that module, or to transfer your main module's globals to it.
addglobals = lambda x: globals().update(x)
Then all you need to pass on current globals is:
import module
module.addglobals(globals())
Since I haven't seen it in the answers above, I thought I would add my simple workaround, which is just to add a global_dict argument to the function requiring the calling module's globals, and then pass the dict into the function when calling; e.g:
# external_module
def imported_function(global_dict=None):
    print(global_dict["a"])
# calling_module
a = 12
from external_module import imported_function
imported_function(global_dict=globals())
>>> 12
The OOP way of doing this would be to make your module a class instead of a set of unbound methods. Then you could use __init__ or a setter method to set the variables from the caller for use in the module methods.
Update
To test the theory, I created a module and put it on pypi. It all worked perfectly.
pip install superglobals
Short answer
This works fine in Python 2 or 3:
import inspect
def superglobals():
    _globals = dict(inspect.getmembers(
        inspect.stack()[len(inspect.stack()) - 1][0]))["f_globals"]
    return _globals
save as superglobals.py and employ in another module thusly:
from superglobals import *
superglobals()['var'] = value
Extended Answer
You can add some extra functions to make things more attractive.
def superglobals():
    _globals = dict(inspect.getmembers(
        inspect.stack()[len(inspect.stack()) - 1][0]))["f_globals"]
    return _globals
def getglobal(key, default=None):
    """
    getglobal(key[, default]) -> value
    Return the value for key if key is in the global dictionary, else default.
    """
    _globals = dict(inspect.getmembers(
        inspect.stack()[len(inspect.stack()) - 1][0]))["f_globals"]
    return _globals.get(key, default)
def setglobal(key, value):
    _globals = superglobals()
    _globals[key] = value
def defaultglobal(key, value):
    """
    defaultglobal(key, value)
    Set the value of global variable `key` if it is not otherwise set.
    """
    _globals = superglobals()
    if key not in _globals:
        _globals[key] = value
Then use thusly:
from superglobals import *
setglobal('test', 123)
defaultglobal('test', 456)
assert(getglobal('test') == 123)
Justification
The "python purity league" answers that litter this question are perfectly correct, but in some environments (such as IDAPython), which are basically single-threaded with a large globally instantiated API, it just doesn't matter as much.
It's still bad form and a bad practice to encourage, but sometimes it's just easier. Especially when the code you are writing isn't going to have a very long life.

Why is behavior different with respect to global variables in "import module" vs "from module import * "?

Let's have a.py be:
def foo():
    global spam
    spam = 42
    return 'this'
At a console, if I simply import a, things make sense to me:
>>> import a
>>> a.foo()
'this'
>>> a.spam
42
However, if I do the less popular thing and...
>>> from a import *
>>> foo()
'this'
>>> spam
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'spam' is not defined
>>> a.spam
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
I've read opinions about why people don't like "from module import * " from a namespace perspective, but I can't find anything on this behavior, and frankly I figured out that this was the issue I was having by accident.
When you ask for a.spam there happens a namespace search in the module a and spam is found. But when you ask for just spam:
>>> from a import * # imported foo, spam doesn't exist yet
>>> foo()
spam is created in the namespace a (you cannot access it with such import though), but not in the current module. And it seems like nobody promised us to add newly added globals of a to all the namespaces module a has been imported into via *. That will require storing import links inside the interpreter and probably will degrade performance if one heavily imported module does such tricks all the time.
And imagine you have defined spam in your main module prior to calling foo(). That would be downright name collision.
Just as illustration, you can do from a import * to get fresh updates for the module a:
from a import *
print(foo())
from a import *
print(spam)
Let's go through it step by step:
At the point of importing, a only has the symbol foo, which refers to a function.
Only once the function has been executed does a get the additional symbol spam.
In the first case, you do import a and get a "handle" to the module, which lets you see whatever happens to it later. If you did a.spam before calling a.foo(), you'd get an error.
In the second case, from a import * gives you whatever is currently in the module, and that's just foo(). After calling that, you could do from a import * again to get spam as well.
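Those steps can be reproduced in a single script. This is a sketch in Python 3 under a few assumptions of mine: module a is built in memory instead of living in a.py, and a plain dict called ns plays the role of the console's namespace so that the star-import can be repeated and inspected.

```python
import sys
import types

# In-memory stand-in for a.py.
mod_a = types.ModuleType("a")
exec(
    "def foo():\n"
    "    global spam\n"
    "    spam = 42\n"
    "    return 'this'\n",
    mod_a.__dict__,
)
sys.modules["a"] = mod_a

ns = {}                        # plays the role of the console's namespace
exec("from a import *", ns)    # copies the public names that exist right now
before = 'spam' in ns          # spam does not exist in module a yet
ns['foo']()                    # creates spam in module a only
after_call = 'spam' in ns      # the earlier snapshot is not updated
exec("from a import *", ns)    # a second star-import picks up spam
print(before, after_call, ns['spam'])
```

The star-import is a one-time copy of names, so anything module a gains afterwards is only visible through the module object or through another import.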
I generally agree with Vovanrock2002.
As was recently explained to me, the '.' is a scope resolution operator. import a and from a import * give you different syntaxes. from a import * imports each global variable from a separately, and binds them as variables in the local scope. A more practical example might be the difference between import datetime and from datetime import date. With the former, I have to create a date object using datetime.date(2015, 11, 12), with the latter I get to just use date(2015, 11, 12).
You can read more on the import statement
I would have to differ with you, however, in that I don't believe that spam is the meaning of life, the universe, and everything.

Bug in Python's documentation?

I am reading http://docs.python.org/2/tutorial/modules.html#more-on-modules and wonder if the following is correct:
Modules can import other modules. It is customary but not required to
place all import statements at the beginning of a module (or script,
for that matter). The imported module names are placed in the
importing module’s global symbol table.
Apparently not:
>>> def foo(): import sys
...
>>> foo()
>>> sys.path
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'sys' is not defined
See http://ideone.com/cLK09v for an online demo.
So, is it a bug in the Python's documentation or I don't understand something?
Yes, this is a documentation error. The import statement imports the names into the current namespace. Usually import is used outside of functions and classes, but as you've discovered, it does work within them. In your example function, the module is imported into the function's local namespace when the function is called. (Calling the function does not make the name available outside it.)
The global keyword does work here, however:
def foo():
    global sys
    import sys
foo()
sys.path
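To make the two cases visible side by side, here is a small sketch. The function names are made up for the illustration, and colorsys is used only because it is a small standard-library module that is unlikely to be imported already.

```python
def local_import():
    import colorsys               # binds `colorsys` only inside this call
    return 'colorsys' in locals()


def global_import():
    global colorsys               # make the import bind a module-level name
    import colorsys


print(local_import())
global_import()
print('colorsys' in global_import.__globals__)
```

In the first function the name lives and dies with the call. In the second, the global declaration redirects the binding made by import to the module's global namespace.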
I don't think this is actually an error in the documentation, but more of a misinterpretation. You simply have a scope issue. You are importing it in the scope of the function foo(). You could certainly do as the documentation suggests and put the import at the top of the file, or somewhere else in the file that still has the module's global scope. The problem is with "The imported module names are placed in the importing module’s global symbol table": here the scope you are importing into is the function foo(), not the module's global level.

Some confusion regarding imports in Python

I'm new to Python and there's something that's been bothering me for quite some time. I read in "Learning Python" by Mark Lutz that when we use a from statement to import a name present in a module, it first imports the module, then assigns a new name to it (i.e. the name of the function, class, etc. present in the imported module) and then deletes the module object with the del statement. However what happens if I try to import a name using from that references a name in the imported module that itself is not imported? Consider the following example in which there are two modules mod1.py and mod2.py:
#mod1.py
from mod2 import test
test('mod1.py')
#mod2.py
def countLines(name):
    print len(open(name).readlines())
def countChars(name):
    print len(open(name).read())
def test(name):
    print 'loading...'
    countLines(name)
    countChars(name)
    print '-'*10
Now see what happens when I run or import mod1:
>>>import mod1
loading...
3
44
----------
Here when I imported and ran the test function, it ran successfully although I didn't even import countChars or countLines, and the from statement had already deleted the mod2 module object.
So I basically need to know why this code works even though considering the problems I mentioned it shouldn't.
EDIT: Thanx alot to everyone who answered :)
Every function has a __globals__ attribute, which holds a reference to the environment where it searches for global variables and functions.
The test function is then linked to the global variables of mod2. So when it calls countLines the interpreter will always find the right function even if you wrote a new one with the same name in the module importing the function.
I think you're wrestling with the way python handles namespaces. when you type from module import thing you are bringing thing from module into your current namespace. So, in your example, when mod1 gets imported, the code is evaluated in the following order:
from mod2 import test #Import mod2, bring test function into current module namespace
test("mod1.py") #run the test function (defined in mod2)
And now for mod2:
#create a new function named 'test' in the current (mod2) namespace
#the first time this module is imported. Note that this function has
#access to the entire namespace where it is defined (mod2).
def test(name):
    print 'loading...'
    countLines(name)
    countChars(name)
    print '-'*10
The reason that all of this is important is that Python lets you choose exactly what you want to pull into your namespace. For example, say you have a module1 which defines function cool_func. Now you are writing another module (module2) and it makes sense for module2 to have a function cool_func also. Python allows you to keep those separate. In module3 you could do:
import module1
import module2
module1.cool_func()
module2.cool_func()
Or, you could do:
from module1 import cool_func
import module2
cool_func() #module1
module2.cool_func()
or you could do:
from module1 import cool_func as cool
from module2 import cool_func as cooler
cool() #module1
cooler() #module2
The possibilities go on ...
Hopefully my point is clear. When you import an object from a module, you are choosing how you want to reference that object in your current namespace.
The other answers are better articulated than this one, but if you run the following you can see that countChars and countLines are actually both defined in test.__globals__:
from pprint import pprint
from mod2 import test
pprint(test.__globals__)
test('mod1')
You can see that importing test brings along the other globals defined in mod2, letting you run the function without worrying about having to import everything you need.
Each module has its own scope. Within mod1, you cannot use the names countLines or countChars (or mod2).
mod2 itself isn't affected in the least by how it happens to be imported elsewhere; all names defined in it are available within the module.
If the webpage you reference really says that the module object is deleted with the del statement, it's wrong. del only removes names, it doesn't delete objects.
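A quick way to check this, sketched in Python 3 with math standing in for mod2. The dict ns is a hypothetical stand-in for the importing module's namespace, used so the result doesn't depend on what else is already imported.

```python
import sys

ns = {}                              # stands in for the importing module's namespace
exec("from math import sin", ns)

print('sin' in ns)                   # the imported name is bound
print('math' in ns)                  # the module's own name is not
print('math' in sys.modules)         # but the module object is still loaded
```

Only the name is missing from the importing namespace. The module object stays alive in sys.modules, and the imported function keeps working against it.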
From A GUIDE TO PYTHON NAMESPACES,
Even though modules have their own global namespaces, this doesn’t mean that all names can be used from everywhere in the module. A scope refers to a region of a program from where a namespace can be accessed without a prefix. Scopes are important for the isolation they provide within a module. At any time there are a number of scopes in operation: the scope of the current function you’re in, the scope of the module and then the scope of the Python builtins. This nesting of scopes means that one function can’t access names inside another function.
Namespaces are also searched for names inside out. This means that if there is a certain name declared in the module’s global namespace, you can reuse the name inside a function while being certain that any other function will get the global name. Of course, you can force the function to use the global name by prefixing the name with the ‘global’ keyword. But if you need to use this, then you might be better off using classes and objects.
An import statement loads the whole module in memory so that's why the test() function ran successfully.
But as you used from statement that's why you can't use the countLines and countChars directly but test can surely call them.
from statement basically loads the whole module and sets the imported function, variable etc to the global namespace.
for eg.
>>> from math import sin
>>> sin(90) #now sin() is a global variable in the module and can be accessed directly
0.89399666360055785
>>> math
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    math
NameError: name 'math' is not defined
>>> vars() #shows the current namespace, and there's sin() in it
{'__builtins__': <module '__builtin__' (built-in)>, '__file__': '/usr/bin/idle', '__package__': None, '__name__': '__main__', 'main': <function main at 0xb6ac702c>, 'sin': <built-in function sin>, '__doc__': None}
consider a simple file, file.py:
def f1():
    print 2+2
def f2():
    f1()
import only f2:
>>> from file import f2
>>> f2()
4
Though I only imported f2() and not f1(), it ran f1() successfully. That's because the whole module is loaded in memory: we can only access f2() directly, but f2() can access the other parts of its module.
