(I've checked out Does Python have “private” variables in classes? -- it asks about classes rather than modules. As such, answers there don't cover import which is what I'm interested in.)
Suppose there is a module called x with a variable y. If another module imports x, how can I prevent the variable y from being loaded?
For example:
# x.py
y=10
We then use this in some other module:
import x
print(x.y)
How can I prevent x.y from being accessed from outside the module x.py?
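If you do import module, there's no way to hide any global members, and that's intentional: the module object returned is the "true" one, the very same that's used by the module's members. Hacks like __getattr__ don't help either: before Python 3.7 they were not supported on modules at all, and the module-level __getattr__ added in 3.7 (PEP 562) is only called for names that are not found, so it cannot hide a global that exists.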
One way is to mark "internal" entities with a leading underscore to hint to users that they are not intended for external use. This isn't necessary for references to other modules imported by yours, since the guidelines explicitly discourage external use of them (the only exception is if the referenced module is inaccessible the normal way).
When doing from module import *, however, you don't get a module reference but import things from it directly into the current namespace. By default, everything except names starting with an underscore is imported. You can override this by defining the __all__ module attribute.
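A minimal sketch of that difference, assuming a hypothetical module named demo_mod that is built in memory so the snippet runs on its own (in real code it would be an ordinary .py file):

```python
import sys
import types

# Build a hypothetical module "demo_mod" in memory (illustration only).
demo = types.ModuleType("demo_mod")
exec("public = 1\n_internal = 2\nhelper = 3\n", demo.__dict__)
sys.modules["demo_mod"] = demo

# Without __all__: every name not starting with an underscore is imported.
ns_default = {}
exec("from demo_mod import *", ns_default)

# With __all__: only the listed names are imported.
demo.__all__ = ["public"]
ns_all = {}
exec("from demo_mod import *", ns_all)

# exec() adds __builtins__ to each namespace, so filter dunder names out.
default_names = sorted(n for n in ns_default if not n.startswith("__"))
all_names = sorted(n for n in ns_all if not n.startswith("__"))

print(default_names)    # ['helper', 'public']
print(all_names)        # ['public']
print(demo._internal)   # 2, still reachable through the module object
```

Either way the underscore name stays reachable as demo_mod._internal; the convention and __all__ only affect what a star import copies.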
In normal use, import foo only loads the module once; all later imports see that it has already been imported and do not load it again.
Quoth https://docs.python.org/3/reference/import.html#the-module-cache:
During import, the module name is looked up in sys.modules and if
present, the associated value is the module satisfying the import, and
the process completes. However, if the value is None, then an
ImportError is raised. If the module name is missing, Python will
continue searching for the module.
This has long been a feature of the interpreter, going back at least to version 2.7 and probably earlier.
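The cache behaviour described in that quote can be checked with a short sketch. It uses the standard json module; the name hypothetical_blocked_mod is made up for the demonstration and does not exist:

```python
import sys
import json

first = json
import json  # second import: only a lookup in sys.modules

same_object = json is first and sys.modules["json"] is json
print(same_object)  # True

# A value of None in sys.modules makes the import fail immediately.
sys.modules["hypothetical_blocked_mod"] = None
try:
    import hypothetical_blocked_mod
    blocked = False
except ImportError:
    blocked = True
finally:
    del sys.modules["hypothetical_blocked_mod"]

print(blocked)  # True
```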
Speaking to your specific question, there is no variable y, there is x.y because that's what you imported. If you do the highly unrecommended from module import * you can end up with a y, but you shouldn't do that.
Related
If a large module is loaded by some submodule of your code, is there any benefit to referencing the module from that namespace instead of importing it again?
For example:
I have a module MyLib, which makes extensive use of ReallyBigLib. If I have code that imports MyLib, should I dig the module out like so
import MyLib
ReallyBigLib = MyLib.SomeModule.ReallyBigLib
or just
import MyLib
import ReallyBigLib
Python modules could be considered as singletons... no matter how many times you import them they get initialized only once, so it's better to do:
import MyLib
import ReallyBigLib
Relevant documentation on the import statement:
https://docs.python.org/2/reference/simple_stmts.html#the-import-statement
Once the name of the module is known (unless otherwise specified, the term “module” will refer to both packages and modules), searching for the module or package can begin. The first place checked is sys.modules, the cache of all modules that have been imported previously. If the module is found there then it is used in step (2) of import.
The imported modules are cached in sys.modules:
This is a dictionary that maps module names to modules which have already been loaded. This can be manipulated to force reloading of modules and other tricks. Note that removing a module from this dictionary is not the same as calling reload() on the corresponding module object.
As others have pointed out, Python maintains an internal registry of all modules that have been imported (the sys.modules dictionary). When you import a module for the first time, a module object is added to that registry, the module (a script) is executed in its own namespace until the end, and execution continues after the import statement.
Try this code:
# module/file a.py
print("Hello from a.py!")
import b
# module/file b.py
print("Hello from b.py!")
import a
There is no loop: there is only a cache lookup.
>>> import b
Hello from b.py!
Hello from a.py!
>>> import a
>>>
One of the beauties of Python is how everything devolves to executing a script in a namespace.
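For anyone who wants to run the experiment without creating files by hand, here is a sketch that writes two mutually importing modules to a temporary directory. The names circ_a and circ_b are hypothetical stand-ins for a.py and b.py, chosen to avoid clashing with real modules:

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile

# Write two hypothetical modules that import each other.
root = tempfile.mkdtemp()
with open(os.path.join(root, "circ_a.py"), "w") as f:
    f.write('print("Hello from circ_a!")\nimport circ_b\n')
with open(os.path.join(root, "circ_b.py"), "w") as f:
    f.write('print("Hello from circ_b!")\nimport circ_a\n')

sys.path.insert(0, root)
importlib.invalidate_caches()

# Capture what the modules print while they are imported.
buf = io.StringIO()
with contextlib.redirect_stdout(buf):
    importlib.import_module("circ_b")
    importlib.import_module("circ_a")  # already cached, prints nothing

lines = buf.getvalue().splitlines()
print(lines)  # ['Hello from circ_b!', 'Hello from circ_a!']
```

Each greeting appears exactly once, which matches the interactive session above.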
It makes no substantial difference. If the big module has already been loaded, the second import in your second example does nothing except adding 'ReallyBigLib' to the current namespace.
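A quick way to see this with the standard library: os happens to import sys internally, so sys is reachable as os.sys. That attribute is an implementation detail used here only for illustration, in the role of MyLib.SomeModule.ReallyBigLib from the question:

```python
import os
import sys

# "Digging the module out" of another module's namespace...
dug_out = os.sys

# ...yields the very same object as importing it directly.
same = dug_out is sys
print(same)                        # True
print(sys.modules["sys"] is sys)   # True
```

Since both routes give the same object, the plain import is the more readable choice.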
WARNING: Python does not guarantee that a module will not be initialized twice.
I've stumbled upon such an issue. See the discussion:
http://code.djangoproject.com/ticket/8193
The internal registry of imported modules is the sys.modules dictionary, which maps module names to module objects. You can look there to see all the modules that are currently imported.
You can also pull some useful tricks (if you need to) by monkeying with sys.modules - for example adding your own objects as pseudo-modules which can be imported by other modules.
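A small sketch of the pseudo-module trick, assuming a made-up name fake_settings and a plain namespace object standing in for the module:

```python
import sys
import types

# "fake_settings" is a hypothetical name used only for this illustration.
pseudo = types.SimpleNamespace(DEBUG=True, NAME="demo")
sys.modules["fake_settings"] = pseudo

# Both import forms are now satisfied from sys.modules.
import fake_settings
from fake_settings import NAME

is_same = fake_settings is pseudo
print(is_same)  # True
print(NAME)     # demo

# Clean up so the fake entry does not leak into other code.
del sys.modules["fake_settings"]
```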
It is the same performancewise. There is no JIT compiler in Python yet.
I'm noticing some weird situations where tests like the following fail:
x = <a function from some module, passed around some big application for a while>
mod = __import__(x.__module__)
x_ref = getattr(mod, x.__name__)
assert x_ref is x # Fails
(Code like this appears in the pickle module)
I don't think I have any import hooks, reload calls, or sys.modules manipulation that would mess with python's normal import caching behavior.
Is there any other reason why a module would be loaded twice? I've seen claims about this (e.g, https://stackoverflow.com/a/10989692/1332492), but I haven't been able to reproduce it in a simple, isolated script.
I believe you misunderstood how __import__ works:
>>> from my_package import my_module
>>> my_module.function.__module__
'my_package.my_module'
>>> __import__(my_module.function.__module__)
<module 'my_package' from './my_package/__init__.py'>
From the documentation:
When the name variable is of the form package.module, normally, the
top-level package (the name up till the first dot) is returned, not
the module named by name. However, when a non-empty fromlist
argument is given, the module named by name is returned.
As you can see, __import__ does not return the sub-module, but only the top-level package. If a function with the same name is also defined at the package level, you will indeed get a different reference to it.
If you want to just load a module you should use importlib.import_module instead of __import__.
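The difference is easy to check against a standard-library package. The sketch below uses email.mime.text purely as a convenient dotted name:

```python
import importlib

top = __import__("email.mime.text")                # returns the top-level package
leaf = importlib.import_module("email.mime.text")  # returns the named module

print(top.__name__)    # email
print(leaf.__name__)   # email.mime.text

# The submodule is still reachable from the package by attribute access.
reachable = top.mime.text is leaf
print(reachable)       # True
```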
To answer your actual question: AFAIK there is no way to import the same module, with the same name, twice without messing around with the import mechanism. However, you could have a submodule of a package whose directory is also on sys.path; in this case you can import it twice using different names:
from some.package import submodule
import submodule as submodule2
print(submodule is submodule2) # False. They have *no* relationships.
This sometimes can cause problems with, e.g., pickle. If you pickle something referenced by submodule you cannot unpickle it using submodule2 as reference.
However this doesn't address the specific example you gave us, because using the __module__ attribute the import should return the correct module.
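The two-names situation can be reproduced in isolation. This sketch builds a throwaway package in a temporary directory; dup_pkg and dup_sub are hypothetical names, and both the package's parent and the package directory itself are put on sys.path on purpose:

```python
import importlib
import os
import sys
import tempfile

# Create a hypothetical package dup_pkg containing a module dup_sub.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "dup_pkg")
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "dup_sub.py"), "w") as f:
    f.write("def func():\n    return 42\n")

# Deliberately expose the same file under two import names.
sys.path.insert(0, root)
sys.path.insert(0, pkg_dir)
importlib.invalidate_caches()

via_package = importlib.import_module("dup_pkg.dup_sub")
via_path = importlib.import_module("dup_sub")

print(via_package is via_path)            # False
print(via_package.func is via_path.func)  # False
print(via_package.func.__module__)        # dup_pkg.dup_sub
print(via_path.func.__module__)           # dup_sub
```

The file is executed once per name, so an identity check between the two copies of func fails even though the source is identical.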
I'm doing
module = __import__("client.elements.gui.button", globals(), locals(), [], 0)
But it's only returning client.
What is my problem?
That's what __import__ does.
When the name variable is of the form package.module, normally, the top-level package (the name up till the first dot) is returned, not the module named by name.
You're not really supposed to use __import__; if you want to import a module dynamically, use importlib.import_module.
Accepted answer is correct, but if you read on in the docs you'll find that this can be gotten around with an admittedly unsettling "hack" by using __import__ like so:
module = __import__('client.elements.gui.button', fromlist=[''])
It doesn't really matter what you pass in for fromlist so long as it's a non-empty list. This signals to the default __import__ implementation that you want to do a from x.y.z import foo style import, and it will return the module you're after.
As stated you should use importlib instead, but this is still a workaround if you need to support Python versions < 2.7.
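Here is a sketch of the fromlist effect against the standard library. It passes a real attribute name (MIMEText) rather than an empty string, which is a choice made for this example only; the point is simply that the list is non-empty:

```python
import sys

without_fromlist = __import__("email.mime.text")
with_fromlist = __import__("email.mime.text", fromlist=["MIMEText"])

print(without_fromlist.__name__)  # email
print(with_fromlist.__name__)     # email.mime.text

# The object returned with a fromlist is the cached submodule itself.
is_cached = with_fromlist is sys.modules["email.mime.text"]
print(is_cached)                  # True
```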
It only obtains the top level, but you can also work around this like so:
module_name = 'some.module.import.class'
module = __import__(module_name)
for n in module_name.split('.')[1:]:
    module = getattr(module, n)
# module is now equal to what would normally
# have been retrieved were you to properly import the file
http://docs.python.org/2/library/runpy.html#runpy.run_module
My question is about this part of the run_module documentation.
... and then executed in a fresh module namespace.
What is a "module namespace" in python? In what ways does runpy differ from import?
Every module executes with its own set of global variables, which become the module's attributes. A module namespace is where a module's globals go; "executed in a fresh module namespace" means "executed with its own global variable environment".
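To make that concrete, a short sketch using the standard json module shows that a module's attributes and the globals of its functions are one and the same dictionary. The attribute hypothetical_flag is invented for the demonstration:

```python
import json

namespace = vars(json)  # the module's global namespace (json.__dict__)

# Module attributes are entries of that dictionary.
attr_matches = namespace["dumps"] is json.dumps
print(attr_matches)        # True

# Functions defined in the module use the same dictionary as their globals.
globals_match = json.dumps.__globals__ is namespace
print(globals_match)       # True

# Setting an attribute on the module creates a global in its namespace.
json.hypothetical_flag = 123
flag_value = namespace["hypothetical_flag"]
print(flag_value)          # 123
del json.hypothetical_flag
```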
A Python interpreter only executes a module's code the first time it's imported in any given program. Further import statements simply return the existing module object. This prevents exponential import explosion when modules a and b both import modules c and d, which both import e and f, etc. It also means that all modules see the same versions of, say, collections.defaultdict, so type checks behave intuitively. runpy.run_module says "run the code in this module, whether or not it was imported already, and don't count it as an import." If you run_module a module and then __import__ it, the dict you got from run_module will contain objects very similar to, but distinct from, the objects in the module you got from __import__.
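The "very similar to, but distinct from" part can be observed directly. This sketch uses the standard colorsys module only because it is small and has no side effects when executed:

```python
import colorsys
import runpy
import sys

# Execute the module's code again in a fresh namespace.
result = runpy.run_module("colorsys")

is_dict = type(result) is dict
print(is_dict)  # True

# Same source, but a different function object than the imported one.
same_function = result["rgb_to_hsv"] is colorsys.rgb_to_hsv
print(same_function)  # False

# The two copies still behave identically.
same_answer = result["rgb_to_hsv"](1.0, 0.0, 0.0) == colorsys.rgb_to_hsv(1.0, 0.0, 0.0)
print(same_answer)  # True

# The imported module in sys.modules is left untouched.
untouched = sys.modules["colorsys"] is colorsys
print(untouched)  # True
```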
I have a basic understanding of Python, but I read somewhere that when we import a module using the following syntax, attributes of that module whose names start with _ (a single underscore) are not imported. Can anybody tell me how this happens and why it works that way?
from module.submodule import *
It's by design. Variables starting with an underscore are regarded as being for internal use only (not the same as private in other languages). They can still be accessed on the module directly, but they aren't imported by a * import.
From the documentation about * imports:
This imports all names except those beginning with an underscore (_). In most cases Python programmers do not use this facility since it introduces an unknown set of names into the interpreter, possibly hiding some things you have already defined.
This also tells you that using a * import is discouraged; it is better to explicitly import the things you need. The exception is modules that are designed to be used via a * import, which means they have an __all__ attribute (a list containing the names of everything the module wants to export).