I have two specific situations where I don't understand how importing works in Python:
1st specific situation:
When I import the same module in two different Python scripts, the module isn't imported twice, right? The first time Python encounters it, it is imported; the second time, does it check whether the module has already been imported, or does it make a copy?
2nd specific situation:
Consider the following module, called bla.py:
a = 10
And then, we have foo.py, a module which imports bla.py:
from bla import *
def Stuff():
    return a
And after that, we have a script called bar.py, which gets executed by the user:
from foo import *
Stuff() #This should return 10
a = 5
Stuff()
Here I don't know: Does Stuff() return 10 or 5?
Part 1
The module is only loaded once, so there is no performance loss by importing it again. If you actually wanted it to be loaded/parsed again, you'd have to reload() the module (importlib.reload() in Python 3).
The first place checked is sys.modules, the cache of all modules that have been imported previously. [source]
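For instance, here is a minimal illustration of that cache at work (json is just an arbitrary standard-library module):

import sys

import json                               # first import: the module is loaded and cached
print('json' in sys.modules)              # True
import json as json_again                 # second import: just a lookup in sys.modules
print(json_again is sys.modules['json'])  # True - same module object, not a copy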
Part 2
from foo import * imports a into the importing module's namespace. When you assign a new value to a there, only that name is rebound to the new value - the original foo.a is not touched.
So unless you import foo and modify foo.a, both calls will return the same value.
For a mutable type such as a list or dict it would be different: modifying it in place would indeed be visible through the original name as well - but assigning a new value to the name would still not modify foo.whatever.
If you want some more detailed information, have a look at http://docs.python.org/reference/executionmodel.html:
The following constructs bind names: formal parameters to functions, import statements, class and function definitions (these bind the class or function name in the defining block), and targets that are identifiers if occurring in an assignment, for loop header, in the second position of an except clause header or after as in a with statement.
The two relevant parts for you are "import statements" and "targets that are identifiers if occurring in an assignment": first, the name a is bound to the value of foo.a during the import. Then, when doing a = 5, the name a is bound to 5. Since modifying a list/dict in place does not cause any rebinding, those operations would modify the original object (a name b and foo.b would be bound to the same object you operate on). Assigning a new object to b would be a binding operation again and thus separate b from foo.b.
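To make this concrete, here is a small sketch that continues the bla.py/foo.py example from the question; the last two lines show the only way to make Stuff() see a new value:

# bar.py
from foo import *

print(Stuff())   # 10 - Stuff() looks the name a up in foo's namespace
a = 5            # binds a new name a in bar.py only; foo.a is untouched
print(Stuff())   # still 10

import foo
foo.a = 5        # rebinds the name inside the foo module itself
print(Stuff())   # now 5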
It is also worth noting what exactly the import statement does:
import foo binds the module name to the module object in the current scope, so if you modify foo.whatever, you will work with the name in that module - any modifications/assignments will affect the variable in the module.
from foo import bar binds the given name(s) only (i.e. foo will remain unbound) to the element with the same name in foo - so operations on bar behave like explained earlier.
from foo import * behaves like the previous one, but it imports all global names which are not prefixed with an underscore. If the module defines __all__, only the names listed in that sequence are imported.
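A quick sketch of the __all__ rule (mymod is a hypothetical module name):

# mymod.py
__all__ = ['visible']

visible = 1
also_public = 2     # not listed in __all__, so not exported by *
_private = 3        # leading underscore: never exported by * anyway

# some other module
from mymod import *
print(visible)      # 1
print(also_public)  # NameError - __all__ is defined and doesn't list it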
Part 3 (which doesn't even exist in your question :p)
The Python documentation is extremely good and usually verbose - you can find an answer to almost every possible language-related question in there. Here are some useful links:
http://docs.python.org/reference/datamodel.html (classes, properties, magic methods, etc.)
http://docs.python.org/reference/executionmodel.html (how variables work in python)
http://docs.python.org/reference/expressions.html
http://docs.python.org/reference/simple_stmts.html (statements such as import, yield)
http://docs.python.org/reference/compound_stmts.html (block statements such as for, try, with)
To answer your first question:
No, the module does not get imported twice. When Python loads a module, it checks for the module in sys.modules. If it is not in there, it is put in there and loaded.
To answer your second question:
Modules can define which names they will export to a from camelot import * scenario, and the behavior is to create new names in the importing module bound to the existing values, not references to the existing variables (Python does not have variable references).
On a somewhat related topic, doing a from camelot import * is not the same as a regular import.
I have three modules:
constants, which contains loaded configuration and other stuff in a class
main, which initializes constants when it is run
user, which imports constants and accesses its configuration.
constants module (simplified) looks like this:
class Constants:
    def __init__(self, filename):
        # Read values from INI file
        config = self.read_inifile(filename)
        self.somevalue = config['ex']['ex']

    def read_inifile(self, filename):
        ...  # reads inifile

constants: Constants = None

def populate_constants(filename):
    global constants
    constants = Constants(filename)
A simple object which should hold the configuration, nothing groundbreaking.
The function populate_constants() is called from main on program startup.
Now the weirdness happens - when I import the constants module from user like this, constants is None:
from toplevelpkg.constants import constants
print(constants)
None
However, if I import it like this, constants is initialized as one would expect:
from toplevelpkg import constants
print(constants.constants)
<trumpet.constants.Constants object at 0x035AA130>
Why is that?
EDIT: in my code, the function attempting to read constants is run asynchronously via await loop.run_in_executor(None, method_needing_import). Not sure if this may cause issues?
(and as a side-question, is it a good practice to have a configuration-holding object which parses the config file and provides it as its member variables?)
There's indeed a difference between
from mymodule import obj
(...)
do_something_with(obj)
and
import mymodule
(...)
do_something_with(mymodule.obj)
In the first case, it acts as :
import mymodule
obj = mymodule.obj
del mymodule
which means that at this point, in the current module, obj is a "global" (which in Python actually means 'module-level', not 'application-wide') name bound to whatever mymodule.obj was when it was imported (in your case: None). From then on, mymodule.obj and the module-local obj name live in different namespaces (the first one in mymodule's namespace, the second one in the current module's namespace), and rebinding mymodule.obj from anywhere won't change what the current module's obj is bound to. Actually, it's exactly as if you were doing this:
a = 2
b = a
a = 4
After the third statement, b is obviously still bound to 2 - rebinding a to 4 doesn't impact b.
In the second case (import mymodule), what gets bound in the importing module's namespace is the whole mymodule module object, so if mymodule.obj gets rebound (from within mymodule or anywhere else) the change will be visible in the importing module. In this case it's equivalent to
a = {"x": 2}
b = a
a["x"] = 4
In which case the change will be visible from b["x"] as well since a and b are still bound to the very same object.
As for your side question: yes, having some "config" object is quite a common pattern. You just may want to make sure you can also build it "from scratch" (I mean, not necessarily from a config file) to make unit testing easier.
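For example, one way to allow that - just a sketch reworking the Constants class from the question, where the values parameter is an invented addition - is an alternate construction path that bypasses the INI file:

class Constants:
    def __init__(self, filename=None, values=None):
        if values is None:
            values = self.read_inifile(filename)   # normal path: parse the INI file
        self.somevalue = values['ex']['ex']

    def read_inifile(self, filename):
        ...  # reads inifile

# in a unit test, no file needed:
# consts = Constants(values={'ex': {'ex': 'fake value'}})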
I see that there are other questions related to cross module variables, but they don't really fully answer my question.
I have an application that I have split into 3 modules + 1 main application, mainly for ease of readability and maintainability.
2 of these modules have threads with variables that need to be modified from other modules and other module threads.
Whilst I can modify a module's variable from the main code, I don't appear to be able to modify one module's variable from another module unless I import every module into every other module.
In the example below, a and b are imported into main, and module a needs to access a variable in module b:

main
    module a
        var a
    module b
        var a

main
    a.a = 1
    b.a = 2
module a
    b.a = 3
module b
    a.a = 0
without importing module a into module b and importing module b into module a, can this be achieved globally through the main program ?
If I do have to import a and b into main, and then import a into b and b into a, what are the implications in terms of memory and resource usage / speed etc ?
I tried the suggestion from @abarnert:
#moda
vara = 10
#modb
print(str(vara))
#main
import moda
from moda import vara
import modb
however I get "name error vara is not defined"
If the code in the modules are defined as classes, and the main program creates instances of these classes, the main program can pass an instance of one module class to another, and changes to that instance will be reflected everywhere. There would be no need to import a or b into each other, because they would simply have references to each other.
If I do have to import a and b into main, and then import a into b and b into a, what are the implications in terms of memory and resource usage / speed etc ?
Absolutely none for memory—every module that imports a will get a reference to the exact same a module object. All you're doing is increasing its refcount, not creating new objects.
For speed, the time to discover that you're trying to import a module that already exists is almost nothing (it's just looking up the module name in a dictionary). It is slightly slower to access a.a than to just access a. But this is very rarely an issue. If it is, you're almost certainly going to want to copy that value into the locals of whatever function is accessing it over and over, at which point it won't matter which globals it came from.
without importing module a into module b and importing module b into module a, can this be achieved globally through the main program ?
Sure. All you have to do is import (with from a import a, or from a import a as aa, or whatever) or copy the variables from module a into main.
Note that this just makes a new name for each value; it doesn't make references to the variables. There is no such thing as a reference to a variable in Python.
This works if the variables are holding constants, or if they're holding mutable values that you modify. It just doesn't do anything useful if the variables are names that you want to rebind to new values. (If you do need to do that, just wrap the values in something mutable—e.g., turn each variable into a 1-item list, so you can rebind a[0] instead of a, which means anyone else who has a reference to a can see your new a[0] value.)
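A minimal sketch of that wrapping trick (shared, worker and main are hypothetical module names):

# shared.py
counter = [0]             # the list is the shared, mutable container

# worker.py
from shared import counter

def bump():
    counter[0] += 1       # mutation, not rebinding: every module holding the
                          # same list object sees the new value

# main.py
import shared, worker
worker.bump()
print(shared.counter[0])  # 1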
If you insist on a "true global", even that isn't impossible. See builtins for details. But you almost certainly don't want this.
If you want to be able to modify a module-level variable from a different module then yes, you will need to import the other module. I would question why you need to do this. Perhaps you should be breaking your code into classes instead of separate modules.
For example you could choose to encapsulate the variables that need to be modified by both modules inside a separate class and pass a single instance of that class to all classes (or modules but you should really use classes) that need it.
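A sketch of that idea, with invented names: the shared values live on one instance that main creates and passes around.

# shared_state.py
class SharedState:
    def __init__(self):
        self.a = 0
        self.b = 0

# module_a.py
def work(state):
    state.a = 1              # visible to every holder of the same instance

# module_b.py
def work(state):
    print(state.a)

# main.py
from shared_state import SharedState
import module_a, module_b

state = SharedState()
module_a.work(state)
module_b.work(state)         # prints 1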
See Circular (or cyclic) imports in Python for more information about cyclical imports.
I'm new to Python and programming in general (a couple of weeks at most).
Concerning Python and using modules, I realise that functions can imported using from a import *.
So instead of typing
a.sayHi()
a.sayBye()
I can say
sayHi()
sayBye()
which I find simplifies things a great deal. Now, say I have a bunch of variables that I want to use across modules and I have them all defined in one python module. How can I, using a similar method as mentioned above or an equally simple one, import these variables. I don't want to use import a and then be required to prefix all my variables with a..
The following situation would be ideal:
a.py
name = "Michael"
age = 15
b.py
def some_function():
    if name == "Michael":
        if age == 15:
            print("Simple!")
Output:
Simple!
You gave the solution yourself: from a import * will work just fine. Python does not differentiate between functions and variables in this respect.
>>> from a import *
>>> if name == "Michael" and age == 15:
...     print('Simple!')
...
Simple!
Just for some context, most linters will flag from module import * with a warning, because it's prone to namespace collisions that will cause headaches down the road.
Nobody has noted yet that, as an alternative, you can use the
from a import name, age
form and then use name and age directly (without the a. prefix). The from [module] import [identifiers] form is more future proof because you can easily see when one import will be overriding another.
Also note that "variables" aren't different from functions in Python in terms of how they're addressed -- every identifier like name or sayBye is pointing at some kind of object. The identifier name is pointing at a string object, sayBye is pointing at a function object, and age is pointing at an integer object. When you tell Python:
from a import name, age
you're saying "take those objects pointed at by name and age within module a and point at them in the current scope with the same identifiers".
Similarly, if you want to point at them with different identifiers on import, you can use the
from a import sayBye as bidFarewell
form. The same function object gets pointed at, except in the current scope the identifier pointing at it is bidFarewell whereas in module a the identifier pointing at it is sayBye.
Like others have said,
from module import *
will also import the modules variables.
However, you need to understand that you are not importing variables, just references to objects. Assigning something else to the imported names in the importing module won't affect the other modules.
Example: assume you have a module module.py containing the following code:
a= 1
b= 2
Then you have two other modules, mod1.py and mod2.py which both do the following:
from module import *
In each module, two names, a and b are created, pointing to the objects 1 and 2, respectively.
Now, if somewhere in mod1.py you assign something else to the global name a:
a= 3
the name a in module.py and the name a in mod2.py will still point to the object 1.
So from module import * will work if you want read-only globals, but it won't work if you want read-write globals. If you want the latter, you're better off just doing import module and then either getting the value (module.a) or setting the value (module.a = …), prefixed by the module name.
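Sticking with the same module names, a minimal sketch of the read-write variant:

# module.py
a = 1

# mod1.py
import module
module.a = 3        # rebinds the name inside module.py itself

# mod2.py
import module
print(module.a)     # 3, provided mod1 has already run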
You didn't say this directly, but I'm assuming you're having trouble with manipulating these global variables.
If you manipulate global variables from inside a function, you must declare them global
a = 10

def x():
    global a
    a = 15
    print a

x()
print a
If you don't do that, then a = 15 will just create a local variable and assign it 15, while the global a stays 10
The __debug__ variable is handy in part because it affects every module. If I want to create another variable that works the same way, how would I do it?
The variable (let's be original and call it 'foo') doesn't have to be truly global, in the sense that if I change foo in one module, it is updated in others. I'd be fine if I could set foo before importing other modules and then they would see the same value for it.
If you need a global cross-module variable, maybe a simple module-level global variable will suffice.
a.py:
var = 1
b.py:
import a
print a.var
import c
print a.var
c.py:
import a
a.var = 2
Test:
$ python b.py
# -> 1 2
Real-world example: Django's global_settings.py (though in Django apps settings are used by importing the object django.conf.settings).
I don't endorse this solution in any way, shape or form. But if you add a variable to the __builtin__ module, it will be accessible as if it were a global from any other module that includes __builtin__ -- which is all of them, by default.
a.py contains
print foo
b.py contains
import __builtin__
__builtin__.foo = 1
import a
The result is that "1" is printed.
Edit: The __builtin__ module is available as the local symbol __builtins__ -- that's the reason for the discrepancy between two of these answers. Also note that __builtin__ has been renamed to builtins in python3.
I believe that there are plenty of circumstances in which it does make sense and it simplifies programming to have some globals that are known across several (tightly coupled) modules. In this spirit, I would like to elaborate a bit on the idea of having a module of globals which is imported by those modules which need to reference them.
When there is only one such module, I name it "g". In it, I assign default values for every variable I intend to treat as global. In each module that uses any of them, I do not use "from g import var", as this only results in a name in the importing module which is initialized from g only at the time of the import. I make most references in the form g.var, and the "g." serves as a constant reminder that I am dealing with a variable that is potentially accessible to other modules.
If the value of such a global variable is to be used frequently in some function in a module, then that function can make a local copy: var = g.var. However, it is important to realize that assignments to var are local, and global g.var cannot be updated without referencing g.var explicitly in an assignment.
Note that you can also have multiple such globals modules shared by different subsets of your modules to keep things a little more tightly controlled. The reason I use short names for my globals modules is to avoid cluttering up the code too much with occurrences of them. With only a little experience, they become mnemonic enough with only 1 or 2 characters.
It is still possible to make an assignment to, say, g.x when x was not already defined in g, and a different module can then access g.x. However, even though the interpreter permits it, this approach is not so transparent, and I do avoid it. There is still the possibility of accidentally creating a new variable in g as a result of a typo in the variable name for an assignment. Sometimes an examination of dir(g) is useful to discover any surprise names that may have arisen by such accident.
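A small sketch of the pattern (the variable names here are made up):

# g.py - defaults for every cross-module global
debug = False
max_retries = 3

# worker.py
import g

def fetch():
    if g.debug:                          # always read through g. so the latest value is seen
        print('retries:', g.max_retries)

# main.py
import g, worker

g.debug = True                           # every module that did `import g` sees this
worker.fetch()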
Define a module (call it "globalbaz") and have the variables defined inside it. All the modules using this "pseudoglobal" should import the "globalbaz" module, and refer to it using "globalbaz.var_name".
This works regardless of where the change happens: you can change the variable before or after the import, and the importing module will see the latest value. (I tested this in a toy example.)
For clarification, globalbaz.py looks just like this:
var_name = "my_useful_string"
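A quick sketch of the before/after claim, using a hypothetical client module that reads the value only when asked:

# client.py
import globalbaz

def show():
    print(globalbaz.var_name)

# main.py
import globalbaz

globalbaz.var_name = "changed before importing client"
import client
client.show()            # prints the changed value

globalbaz.var_name = "changed again, after the import"
client.show()            # still sees the latest value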
You can pass the globals of one module to another:
In Module A:
import module_b
my_var=2
module_b.do_something_with_my_globals(globals())
print my_var
In Module B:
def do_something_with_my_globals(glob):  # glob is simply a dict.
    glob["my_var"] = 3
Global variables are usually a bad idea, but you can do this by assigning to __builtins__:
__builtins__.foo = 'something'
print foo
Also, modules themselves are objects that you can access from any module. So if you define a module called my_globals.py:
# my_globals.py
foo = 'something'
Then you can use that from anywhere as well:
import my_globals
print my_globals.foo
Using modules rather than modifying __builtins__ is generally a cleaner way to do globals of this sort.
You can already do this with module-level variables. Modules are the same no matter what module they're being imported from. So you can make the variable a module-level variable in whatever module it makes sense to put it in, and access it or assign to it from other modules. It would be better to call a function to set the variable's value, or to make it a property of some singleton object. That way if you end up needing to run some code when the variable's changed, you can do so without breaking your module's external interface.
It's not usually a great way to do things — using globals seldom is — but I think this is the cleanest way to do it.
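A sketch of the "call a function to set it" variant (config, set_value and get_value are invented names):

# config.py
_value = None

def set_value(new_value):
    """The single place where the shared value is rebound."""
    global _value
    _value = new_value
    # extra code that must run whenever the value changes goes here

def get_value():
    return _value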
I want to point out that there is a case where the variable won't be found.
Cyclical imports may break the module behavior.
For example:
first.py
import second
var = 1
second.py
import first
print(first.var)  # raises AttributeError: this runs while first.py is still executing, before var has been assigned
main.py
import first
In this example it should be obvious, but in a large code base this can be really confusing.
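One common workaround - a sketch, not the only option - is to delay the import until the value is actually needed, so first.py has finished executing by the time it is looked up:

# second.py
def show_var():
    import first             # resolved at call time, from sys.modules
    print(first.var)

# first.py
import second
var = 1

# main.py
import first
import second
second.show_var()            # prints 1 - var exists by now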
I wondered if it would be possible to avoid some of the disadvantages of using global variables (see e.g. http://wiki.c2.com/?GlobalVariablesAreBad) by using a class namespace rather than a global/module namespace to pass values of variables. The following code indicates that the two methods are essentially identical. There is a slight advantage in using class namespaces as explained below.
The following code fragments also show that attributes or variables may be dynamically created and deleted in both global/module namespaces and class namespaces.
wall.py
# Note no definition of global variables
class router:
    """ Empty class """
I call this module 'wall' since it is used to bounce variables off of. It will act as a space to temporarily define global variables and class-wide attributes of the empty class 'router'.
source.py
import wall
def sourcefn():
    msg = 'Hello world!'
    wall.msg = msg
    wall.router.msg = msg
This module imports wall and defines a single function sourcefn, which defines a message and emits it via two different mechanisms: one via a module-level attribute of wall and one via the router class. Note that the variables wall.msg and wall.router.msg are defined here for the first time in their respective namespaces.
dest.py
import wall
def destfn():
    if hasattr(wall, 'msg'):
        print 'global: ' + wall.msg
        del wall.msg
    else:
        print 'global: ' + 'no message'
    if hasattr(wall.router, 'msg'):
        print 'router: ' + wall.router.msg
        del wall.router.msg
    else:
        print 'router: ' + 'no message'
This module defines a function destfn which uses the two different mechanisms to receive the messages emitted by source. It allows for the possibility that the variable 'msg' may not exist. destfn also deletes the variables once they have been displayed.
main.py
import source, dest
source.sourcefn()
dest.destfn() # variables deleted after this call
dest.destfn()
This module calls the previously defined functions in sequence. After the first call to dest.destfn the variables wall.msg and wall.router.msg no longer exist.
The output from the program is:
global: Hello world!
router: Hello world!
global: no message
router: no message
The above code fragments show that the module/global and the class/class variable mechanisms are essentially identical.
If a lot of variables are to be shared, namespace pollution can be managed either by using several wall-type modules, e.g. wall1, wall2 etc. or by defining several router-type classes in a single file. The latter is slightly tidier, so perhaps represents a marginal advantage for use of the class-variable mechanism.
This sounds like modifying the __builtin__ namespace. To do it:
import __builtin__
__builtin__.foo = 'some-value'
Do not use the __builtins__ directly (notice the extra "s") - apparently this can be a dictionary or a module. Thanks to ΤΖΩΤΖΙΟΥ for pointing this out, more can be found here.
Now foo is available for use everywhere.
I don't recommend doing this generally, but the use of this is up to the programmer.
Assigning to it must be done as above, just setting foo = 'some-other-value' will only set it in the current namespace.
I use this for a couple built-in primitive functions that I felt were really missing. One example is a find function that has the same usage semantics as filter, map, reduce.
def builtin_find(f, x, d=None):
    for i in x:
        if f(i):
            return i
    return d
import __builtin__
__builtin__.find = builtin_find
Once this is run (for instance, by importing near your entry point) all your modules can use find() as though, obviously, it was built in.
find(lambda i: i < 0, [1, 3, 0, -5, -10]) # Yields -5, the first negative.
Note: You can do this, of course, with filter and another line to test for zero length, or with reduce in one sort of weird line, but I always felt it was weird.
I could achieve cross-module modifiable (or mutable) variables by using a dictionary:
# in myapp.__init__
Timeouts = {}  # cross-module global mutable variables for testing purposes
Timeouts['WAIT_APP_UP_IN_SECONDS'] = 60

# in myapp.mod1
from myapp import Timeouts

def wait_app_up(project_name, port):
    # wait for app until Timeouts['WAIT_APP_UP_IN_SECONDS']
    # ...

# in myapp.test.test_mod1
from myapp import Timeouts

def test_wait_app_up_fail(self):
    timeout_bak = Timeouts['WAIT_APP_UP_IN_SECONDS']
    Timeouts['WAIT_APP_UP_IN_SECONDS'] = 3
    with self.assertRaises(hlp.TimeoutException) as cm:
        wait_app_up(PROJECT_NAME, PROJECT_PORT)
    self.assertEqual("Timeout while waiting for App to start", str(cm.exception))
    Timeouts['WAIT_APP_UP_IN_SECONDS'] = timeout_bak  # restore the original value
When launching test_wait_app_up_fail, the actual timeout duration is 3 seconds.