Related
The common explanation given (as in 12270954 and 710551) is that from X import *
clutters up the namespace, and hence is not recommended. However, the following example
shows that there is something more to it than just cluttering the namespace.
Consider the following code:
x1.py:
g_c = 5
class TestClass():
def run(self):
global g_c
g_c = 1
print(g_c) # prints 1
x2.py:
from x1 import *
t = TestClass()
t.run()
print(g_c) # prints 5, why?
x3.py:
import x1
t = x1.TestClass()
t.run()
print(x1.g_c) # prints 1
The results for running x2.py and x3.py are different (on python 3.6.8). Can someone please explain why
the two imports are behaving differently?
Extra Notes: (to demonstrate the note mentioned by #tdelaney about lists)
Change the assignments in x1.py like this:
g_c = [5]
...
g_c.append(1)
And now, both x2.py and x3.py give the same answer. It is only when an atomic type is being used that the problem arises.
You can imagine from x1 import * doing something approximately equal to
import m1
globals.update(m1.__dict__)
del m1
In other words it imports the module and then copies all the globals into your namespace. The function and classes inside the module however will keep referring to their original globals, not to the copies in your namespace.
Note the "implementation" above is not completely correct as not all names are copied and there are other subtleties, but it's I think an easy way to understand the general behavior.
The difference becomes evident when the module changes its globals but you don't see the change reflected in your global namespace.
After executing from x1 import * the variables sys.modules["x1"].g_c and g_c are two distinct variables with the same value; code in the module is referring to the one inside the module, not to your copy.
from X import * rebinds the objects in X's global namespace to the current module's namespace. If one of these variables is reassigned, the reassignment is private to the current namespace.
The import doesn't change the imported objects, though, it just adds a new reference. So for instance, the global namespace for class TestClass is still the x1.py module. TestClass.run always changes x1.g_c (its global namespace) regardless of which module calls it.
In x2.py, you import the original 5 referenced by x1.g_c to x2.g_c. But reassignment of x1.g_c doesn't affect x2.g_c which is a variable in a different namespace.
Suppose conversely that x1.g_c was a list and your code was appending to the list. In that case, all modules would see the appends because you are modifying the object referenced by x1.g_c, not reassigning it.
When you import X, you will have to use X to call a function inside X, ie:
X.func1()
But when you use from X import *, you import all names in the module's namespace that don't begin with _ or all names in module.__all__ , and you don't require to use X to call the function, ie:
func1()
In x2.py when you did:
from x1 import *
you just imported every thing from x1 , ie , including the variables. This means that your altered the namespace of the current script (x2). So when you call:
print(g_c)
it prints :
5
which is called from the current scripts namespace.
On the run of the class TestClass() in x2 , it changed the value of g_c in x1, but not in x2.
This indicates the Disadvantages of using Global Variables.
When variables are declared as global, then they remain in the memory till program execution is completed. That is, your global declaration is only valid inside x1. The declaration is valid inside x1's namespace.
And when you did:
import x1
t = x1.TestClass()
t.run()
This import x1 as a module and everything inside isin't imported unless you call it. And you did call the class TestClass(), ran it, resulting g_c inside the class being made a global variable. So then for x1, g_c is 1 since it was globally defined and the current script doesn't know x1 has a g_c = 5 since it wasn't imported/called.
On print(x1.g_c), it calls for the value of g_c in x1's namespace, which is 1.
The keys of this behavior is following:
Python makes namespaces for each files
global X makes variable X refer to the object that lives in the namespace where global X is declared
When t.run() is executed in x2.py, global g_c is evaluated and this makes g_c refers to the object in the namespace for x1.py not x2.py.
Trying the following snippet would be help.
x2_.py
from x1 import *
t = TestClass()
t.run()
print(g_c) # prints 5
import x1
print(x1.g_c) # this print 1
Let us take an example where you are importing the module suman in Python. It brings the entire module into your workspace i.e. In the import suman statement it does not give you access to any methods or function associated with like for example if their is a function display_name() in the suman module then you have to import it like suman.display_name()
In from suman import * it brings all of the names inside suman module (like display() for example) into your module. Now you can access those names without a prefix of name of your module like this:
from suman import *
print(display_name())
Backstory:
I was trying implementing one way to handle -v parameters to increase the verbosity of an application. To that end, I wanted to use a global variable that is pointing to an empty lambda function initially. If -v is given the variable is changed and gets another lambda function assigned that does print it input.
MWE:
I noticed that this did not work as expected when calling the lambda function from another module after importing it via from x import *...
mwe.py:
from mod import *
import mod
def f():
vprint("test in f")
vprint("test before")
print("before: %d" % foo)
set_verbosity(1)
vprint("test after")
print("after: %d" % foo)
f()
mod.vprint("explicit: %d" % mod.foo)
modf()
mod.py:
vprint = lambda *a, **k: None
foo = 42
def set_verbosity(verbose):
global vprint, foo
if verbose > 0:
vprint = lambda *args, **kwargs: print(*args, **kwargs)
foo = 0
def modf():
vprint("modf: %d" % foo)
The output is
before: 42
after: 42
explicit: 0
modf: 0
where the "explicit" and "modf" outputs are due to the mod.vprint and modf calls at the end of the mwe. All other invocations of vprint (that go through the imported version of vprint) are apparently not using the updated definition. Likewise, the value of foo seems to be imported only once.
Question:
It looks to me as if the from x import * type of imports copies the state of the globals of the imported module. I am not so much interested in workarounds per se but the actual reason for this behavior. Where is this defined in the documentation and what's the rationale?
Workaround:
As a side note, one way to implement this anyway is to wrap the global lambda variables by functions and export only them:
_vprint = lambda *a, **k: None
def vprint(*args, **kwargs):
_vprint(*args, **kwargs)
def set_verbosity(verbose):
global _vprint
if verbose > 0:
_vprint = lambda *args, **kwargs: print(*args, **kwargs)
This makes it work with the import-from way that allows the other module to simply call vprint instead of explicitly deferencing via the module's name.
TL;DR: when you do from module import *, you're copying the names and their associated references; changing the reference associated with the original name does not change the reference associated with the copy.
This deals with the underlying difference between names and references. The underlying reason for the behavior has to do with how python handles such things.
Only one thing is truly immutable in python: memory. You can't change individual bytes directly. However, almost everything in python deals with references to individual bytes, and you can change those references. When you do my_list[2] = 5, you're not changing any memory - rather, you're allocating a new block of memory to hold the value 5, and pointing the second index of my_list to it. The original data that my_list[2] used to be pointing at is still there, but since nothing refers to it any more, the garbage collector will take care of it eventually and free the memory it was using.
The same principle goes with names. Any given namespace in python is comparable to a dict - each name has a corresponding reference. And this is where the problems come in.
Consider the difference between the following two statements:
from module import *
import module
In both cases, module is loaded into memory.
In the latter case, only one thing is added to the local namespace - the name 'module', which references the entire memory block containing the module that was just loaded. Or, well, it references the memory block of that module's own namespace, which itself has references to all the names in the module, and so on all the way down.
In the former case, however, every name in module's namespace is copied into the local namespace. The same block of memory still exists, but instead of having one reference to all of it, you now have many references to small parts of it.
Now, let's say we do both of those statements in succession:
from module import *
import module
This leaves us with one name 'module' referencing all the memory the module was loaded into, and a bunch of other names that reference individual parts of that block. We can verify that they point to the same thing:
print(module.func_name == func_name)
# True
But now, we try to assign something else to module.attribute:
module.func_name = lambda x:pass
print(module.func_name == func_name)
# False
It's no longer the same. Why?
Well, when we did module.func_name = lambda x:pass, we first allocated some memory to store lambda x:pass, and then we changed module's 'func_name' name to reference that memory instead of what it was referencing. Note that, like the example I gave earlier with lists, we didn't change the thing that the module.func_name was previously referencing - it still exists, and the local func_name continues to reference it.
So when you do from module import *, you're copying the names and their associated references; changing the reference associated with the original name does not change the reference associated with the copy.
The workaround for this is to not do import *. In fact, this is pretty much the ultimate reason why using import * is usually considered poor practice, save for a handful of special cases. Consider the following code:
# module.py
variable = "Original"
# file1.py
import module
def func1():
module.variable = "New"
# file2.py
import module
import file1
print(module.variable)
file1.func1()
print(module.variable)
When you run python file2.py, you get the following output:
Original
New
Why? Because file1 and file2 both imported module, and in both of their namespaces 'module' is pointing to the same block of memory. module's namespace contains a name 'variable' referencing some value. Then, the following things happen:
file2 says "okay, module, please give me the value associated with the name 'variable' in your namespace.
file1.func1() says "okay, module, the name 'variable' in your namespace now references this other value.
file2 says "okay, module, please give me the value associated with the name 'variable' in your namespace.
Since file1 and file2 are still both talking to the same bit of memory, they stay coordinated.
Random stab in the dark ahead:
If you look at the docs for __import__, there's the bit:
On the other hand, the statement from spam.ham import eggs, sausage as saus results in
_temp = __import__('spam.ham', globals(), locals(), ['eggs', 'sausage'], 0)
eggs = _temp.eggs
saus = _temp.sausage
I think this is the key. If we infer that from mod import * results in something like:
_temp = __import__('mod', globals(), locals(), [], 0)
printv = _temp.printv
foo = _temp.foo
This shows what the problem is. printv is a reference to the old version of printv; what mod.printv was pointing to at the time of import. Reassigning what the printv in mod is pointing to doesn't effect anything in mwe, because the mwe reference to printv is still looking at the previous lambda.
It's similar to how this doesn't change b:
a = 1
b = a
a = 2
b is still pointing to 1, because reassigning a doesn't effect what b is looking at.
On the other hand, mod.printv does work because we are now using a direct reference to the global in mod instead of a reference that points to printv in mod.
This was a random stab because I think I know the answer based on some random reading I did awhile ago. If I'm incorrect, please let me know and I'll remove this to avoid confusion.
I am trying to authomatise a script that:
reads an input defined in another script (e.g. Input_01.py, or Input_02.py, just containing a variable definition such as: J = 6 or J = 7); and
use that variable within a function, i.e.:
def foo_function():
A = J
A += 3
print(A)
Now, if I don't need to authomatise anything, the thing is pretty straightforward:
I just type from Input_01 import J and that's it. Then I can do the same for Input_02 and repeat the operation.
But my idea is to make a script (Multi_Run.py) that allows me to authomatise the process of calling several times the key script (called "code_foo.py"), one for each Input file, and each time reading a different Input file (e.g. Input_01, Input_02,..., etc.)
This is my current "Multi_Run.py" script (simplified for the case of just two different input files):
Inputs = []
for i in range(2):
Inputs.append("Input_0%s" % (i+1))
for Counter in Inputs:
import code_foo
code_foo.foo_function()
But now I cannot say, within "code_foo.py", something like: from Counter import J, because that won't work for two reasons:
First, Counter is not a variable in code_foo (this seems to be solvable by adding a line like from __main__ import *)
Second, the import function is not able to read the string inside the Counter variable, but it wants directly the name of the module (e.g. Input_01), preventing authomatisation. I tried to solve this by using the importlib.import_module but it's not seeming to work properly (i.e. the "Input_##" module is "imported" somehow, but not "run", which means that the "J=#somenumber" line is not run when importing the script and thus I get an error because the J variable is not defined).
For better clarity, my current code_foo.py script is the following:
import importlib
from __main__ import *
importlib.import_module(Counter)
def foo_function():
A = J
A += 3
print(A)
Any hints? thanks a lot
AFAICT (your question is still quite unclear), it looks like you have two issues here.
The first one is about Python's imports and namespaces. Python imports are not "includes" à la C or PHP so importing a module doesn't make the names defined in the module directly available in the importer's namespace, you have to use the qualified name ("."). Also, Python doesn't have a "process global" namespace, Python's "globals" are only global to the current module's namespace, so defining a global name in module "a" doesn't make it magically available to modules imported by module "a". This means that your foo_function in code_foo.py cannot directly access names defined in your main script.
Actually - and this is the second issue - your problem comes from foo_function depending on a global name that is defined elsewhere. This is a design mistake. As a general rule, you should not use globals when you don't need them, and you very rarely need globals. Just write your functions so they take "external" values as argument, and everything becomes much simpler AND much easier to read, test and maintain.
In your case, all you need to do is to 1/ rewrite foo_function() so it takes J as an argument and 2/ rewrite multirun so that it gets J from your "inputs" modules and passes it to foo_function:
# code_foo.py
def foo_function(j):
a = j
a += 3
print(a)
and
# multirun.py
import importlib
import code_foo
def main():
inputs = ["Input_0%s" % i for i in range(1,3)]
for modname in inputs:
module = importlib.import_module(modname)
j = module.J
code_foo.foo_function(j)
if __name__ == "__main__":
main()
As a side note: python's naming conventions are to use all_lower names for modules and variables - ALL_CAPS names are for pseudo-constants.
first.py
myGlobal = "hello"
def changeGlobal():
myGlobal="bye"
second.py
from first import *
changeGlobal()
print myGlobal
The output I get is
hello
although I thought it should be
bye
Why doesn't the global variable myGlobal changes after the call to the changeGlobal() function?
Try:
def changeGlobal():
global myGlobal
myGlobal = "bye"
Actually, that doesn't work either. When you import *, you create a new local module global myGlobal that is immune to the change you intend (as long as you're not mutating the variable, see below). You can use this instead:
import nice
nice.changeGlobal()
print nice.myGlobal
Or:
myGlobal = "hello"
def changeGlobal():
global myGlobal
myGlobal="bye"
changeGlobal()
However, if your global is a mutable container, you're now holding a reference to a mutable and are able to see changes done to it:
myGlobal = ["hello"]
def changeGlobal():
myGlobal[0] = "bye"
I had once the same concern as yours and reading the following section from Norman Matloff's Quick and Painless Python Tutorial was really a good help. Here is what you need to understand (copied from Matloff's book):
Python does not truly allow global variables in the sense that C/C++ do. An imported Python module will not have direct access to the globals in the module which imports it, nor vice versa.
For instance, consider these two files, x.py,
# x.py
import y
def f():
global x
x = 6
def main():
global x
x = 3
f()
y.g()
if __name__ == ’__main__’:
main()
and y.py:
# y.py
def g():
global x
x += 1
The variable x in x.py is visible throughout the module x.py, but not in y.py. In fact, execution of the line
x += 1
in the latter will cause an error message to appear, “global name ’x’ is not defined.”
Indeed, a global variable in a module is merely an attribute (i.e. a member entity) of that module, similar to a class variable’s role within a class. When module B is imported by module A, B’s namespace is copied to A’s. If module B has a global variable X, then module A will create a variable of that name, whose initial value is whatever module B had for its variable of that name at the time of importing. But changes to X in one of the modules will NOT be reflected in the other.
Say X does change in B, but we want code in A to be able to get the latest value of X in B. We can do that by including a function, say named GetX() in B. Assuming that A imported everything from B, then A will get a function GetX() which is a copy of B’s function of that name, and whose sole purpose is to return the value of X. Unless B changes that function (which is possible, e.g. functions may be assigned), the functions in the two modules will always be the same, and thus A can use its function to get the value of X in B.
Python global variables are not global
As wassimans points out above they are essentially attributes within the scope of the module they are defined in (or the module that contains the function that defined them).
The first confusion(bug) people run into is not realizing that functions have a local name space and that setting a variable in a function makes it a local to the function even when they intended for it to change a (global) variable of the same name in the enclosing module. (declaring the name
in a 'global' statement in the function, or accessing the (global) variable before setting it.)
The second confusion(bug) people run into is that each module (ie imported file) contains its own so called 'global' name space. I guess python things the world(globe) is the module -- perhaps we are looking for 'universal' variables that span more than one globe.
The third confusion (that I'm starting to understand now) is where are the 'globals' in the __main__ module? Ie if you start python from the command line in interactive mode, or if you invoke python script (type the name of the foo.py from the command shell) -- there is no import of a module whose name you can use.
The contents of 'globals()' or globals().keys() -- which gives you a list of the globals -- seems to be accessible as: dir(sys.modules['__main__'])
It seems that the module for the loaded python script (or the interactive session with no loaded script), the one named in: __name__, has no global name, but is accessible as the module whose name is '__main__' in the system's list of all active modules, sys.modules
The __debug__ variable is handy in part because it affects every module. If I want to create another variable that works the same way, how would I do it?
The variable (let's be original and call it 'foo') doesn't have to be truly global, in the sense that if I change foo in one module, it is updated in others. I'd be fine if I could set foo before importing other modules and then they would see the same value for it.
If you need a global cross-module variable maybe just simple global module-level variable will suffice.
a.py:
var = 1
b.py:
import a
print a.var
import c
print a.var
c.py:
import a
a.var = 2
Test:
$ python b.py
# -> 1 2
Real-world example: Django's global_settings.py (though in Django apps settings are used by importing the object django.conf.settings).
I don't endorse this solution in any way, shape or form. But if you add a variable to the __builtin__ module, it will be accessible as if a global from any other module that includes __builtin__ -- which is all of them, by default.
a.py contains
print foo
b.py contains
import __builtin__
__builtin__.foo = 1
import a
The result is that "1" is printed.
Edit: The __builtin__ module is available as the local symbol __builtins__ -- that's the reason for the discrepancy between two of these answers. Also note that __builtin__ has been renamed to builtins in python3.
I believe that there are plenty of circumstances in which it does make sense and it simplifies programming to have some globals that are known across several (tightly coupled) modules. In this spirit, I would like to elaborate a bit on the idea of having a module of globals which is imported by those modules which need to reference them.
When there is only one such module, I name it "g". In it, I assign default values for every variable I intend to treat as global. In each module that uses any of them, I do not use "from g import var", as this only results in a local variable which is initialized from g only at the time of the import. I make most references in the form g.var, and the "g." serves as a constant reminder that I am dealing with a variable that is potentially accessible to other modules.
If the value of such a global variable is to be used frequently in some function in a module, then that function can make a local copy: var = g.var. However, it is important to realize that assignments to var are local, and global g.var cannot be updated without referencing g.var explicitly in an assignment.
Note that you can also have multiple such globals modules shared by different subsets of your modules to keep things a little more tightly controlled. The reason I use short names for my globals modules is to avoid cluttering up the code too much with occurrences of them. With only a little experience, they become mnemonic enough with only 1 or 2 characters.
It is still possible to make an assignment to, say, g.x when x was not already defined in g, and a different module can then access g.x. However, even though the interpreter permits it, this approach is not so transparent, and I do avoid it. There is still the possibility of accidentally creating a new variable in g as a result of a typo in the variable name for an assignment. Sometimes an examination of dir(g) is useful to discover any surprise names that may have arisen by such accident.
Define a module ( call it "globalbaz" ) and have the variables defined inside it. All the modules using this "pseudoglobal" should import the "globalbaz" module, and refer to it using "globalbaz.var_name"
This works regardless of the place of the change, you can change the variable before or after the import. The imported module will use the latest value. (I tested this in a toy example)
For clarification, globalbaz.py looks just like this:
var_name = "my_useful_string"
You can pass the globals of one module to onother:
In Module A:
import module_b
my_var=2
module_b.do_something_with_my_globals(globals())
print my_var
In Module B:
def do_something_with_my_globals(glob): # glob is simply a dict.
glob["my_var"]=3
Global variables are usually a bad idea, but you can do this by assigning to __builtins__:
__builtins__.foo = 'something'
print foo
Also, modules themselves are variables that you can access from any module. So if you define a module called my_globals.py:
# my_globals.py
foo = 'something'
Then you can use that from anywhere as well:
import my_globals
print my_globals.foo
Using modules rather than modifying __builtins__ is generally a cleaner way to do globals of this sort.
You can already do this with module-level variables. Modules are the same no matter what module they're being imported from. So you can make the variable a module-level variable in whatever module it makes sense to put it in, and access it or assign to it from other modules. It would be better to call a function to set the variable's value, or to make it a property of some singleton object. That way if you end up needing to run some code when the variable's changed, you can do so without breaking your module's external interface.
It's not usually a great way to do things — using globals seldom is — but I think this is the cleanest way to do it.
I wanted to post an answer that there is a case where the variable won't be found.
Cyclical imports may break the module behavior.
For example:
first.py
import second
var = 1
second.py
import first
print(first.var) # will throw an error because the order of execution happens before var gets declared.
main.py
import first
On this is example it should be obvious, but in a large code-base, this can be really confusing.
I wondered if it would be possible to avoid some of the disadvantages of using global variables (see e.g. http://wiki.c2.com/?GlobalVariablesAreBad) by using a class namespace rather than a global/module namespace to pass values of variables. The following code indicates that the two methods are essentially identical. There is a slight advantage in using class namespaces as explained below.
The following code fragments also show that attributes or variables may be dynamically created and deleted in both global/module namespaces and class namespaces.
wall.py
# Note no definition of global variables
class router:
""" Empty class """
I call this module 'wall' since it is used to bounce variables off of. It will act as a space to temporarily define global variables and class-wide attributes of the empty class 'router'.
source.py
import wall
def sourcefn():
msg = 'Hello world!'
wall.msg = msg
wall.router.msg = msg
This module imports wall and defines a single function sourcefn which defines a message and emits it by two different mechanisms, one via globals and one via the router function. Note that the variables wall.msg and wall.router.message are defined here for the first time in their respective namespaces.
dest.py
import wall
def destfn():
if hasattr(wall, 'msg'):
print 'global: ' + wall.msg
del wall.msg
else:
print 'global: ' + 'no message'
if hasattr(wall.router, 'msg'):
print 'router: ' + wall.router.msg
del wall.router.msg
else:
print 'router: ' + 'no message'
This module defines a function destfn which uses the two different mechanisms to receive the messages emitted by source. It allows for the possibility that the variable 'msg' may not exist. destfn also deletes the variables once they have been displayed.
main.py
import source, dest
source.sourcefn()
dest.destfn() # variables deleted after this call
dest.destfn()
This module calls the previously defined functions in sequence. After the first call to dest.destfn the variables wall.msg and wall.router.msg no longer exist.
The output from the program is:
global: Hello world!
router: Hello world!
global: no message
router: no message
The above code fragments show that the module/global and the class/class variable mechanisms are essentially identical.
If a lot of variables are to be shared, namespace pollution can be managed either by using several wall-type modules, e.g. wall1, wall2 etc. or by defining several router-type classes in a single file. The latter is slightly tidier, so perhaps represents a marginal advantage for use of the class-variable mechanism.
This sounds like modifying the __builtin__ name space. To do it:
import __builtin__
__builtin__.foo = 'some-value'
Do not use the __builtins__ directly (notice the extra "s") - apparently this can be a dictionary or a module. Thanks to ΤΖΩΤΖΙΟΥ for pointing this out, more can be found here.
Now foo is available for use everywhere.
I don't recommend doing this generally, but the use of this is up to the programmer.
Assigning to it must be done as above, just setting foo = 'some-other-value' will only set it in the current namespace.
I use this for a couple built-in primitive functions that I felt were really missing. One example is a find function that has the same usage semantics as filter, map, reduce.
def builtin_find(f, x, d=None):
for i in x:
if f(i):
return i
return d
import __builtin__
__builtin__.find = builtin_find
Once this is run (for instance, by importing near your entry point) all your modules can use find() as though, obviously, it was built in.
find(lambda i: i < 0, [1, 3, 0, -5, -10]) # Yields -5, the first negative.
Note: You can do this, of course, with filter and another line to test for zero length, or with reduce in one sort of weird line, but I always felt it was weird.
I could achieve cross-module modifiable (or mutable) variables by using a dictionary:
# in myapp.__init__
Timeouts = {} # cross-modules global mutable variables for testing purpose
Timeouts['WAIT_APP_UP_IN_SECONDS'] = 60
# in myapp.mod1
from myapp import Timeouts
def wait_app_up(project_name, port):
# wait for app until Timeouts['WAIT_APP_UP_IN_SECONDS']
# ...
# in myapp.test.test_mod1
from myapp import Timeouts
def test_wait_app_up_fail(self):
timeout_bak = Timeouts['WAIT_APP_UP_IN_SECONDS']
Timeouts['WAIT_APP_UP_IN_SECONDS'] = 3
with self.assertRaises(hlp.TimeoutException) as cm:
wait_app_up(PROJECT_NAME, PROJECT_PORT)
self.assertEqual("Timeout while waiting for App to start", str(cm.exception))
Timeouts['WAIT_JENKINS_UP_TIMEOUT_IN_SECONDS'] = timeout_bak
When launching test_wait_app_up_fail, the actual timeout duration is 3 seconds.