This is my project structure (Python 3.5.1):
a
├── b.py
└── __init__.py
Case 1
File b.py is empty.
File __init__.py is:
print(b)
If we run import a, the output is:
NameError: name 'b' is not defined
Case 2
File b.py is empty.
File __init__.py is:
import a.b
print(b)
If we run import a, the output is:
<module 'a.b' from '/tmp/a/b.py'>
Question
Why doesn't the program fail in Case 2?
Usually if we run import a.b then we can only reference it by a.b, not b. Hopefully somebody can help explain what's happening to the namespace in Case 2.
Python adds modules as globals to the parent package after import.
So when you imported a.b, the name b was added as a global to the a module, created by a/__init__.py.
From the Python 3 import system documentation:
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule.
Bold emphasis mine. Note that the same applies to Python 2, but Python 3 made the process more explicit.
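A minimal sketch that reproduces Case 2 on disk. The package name demo_a is a hypothetical stand-in for a (chosen only to avoid clashing with anything already importable), and the seen_b variable is added purely so the result can be inspected from outside:

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package: demo_a/__init__.py and an empty demo_a/b.py.
# "demo_a" and "seen_b" are hypothetical names used only for this sketch.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_a")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("import demo_a.b\n")
    f.write("seen_b = b  # no NameError: 'b' is now a global of the package\n")
with open(os.path.join(pkg_dir, "b.py"), "w") as f:
    f.write("")

sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_a

# The submodule object was bound as an attribute of the parent package ...
print(demo_a.b is sys.modules["demo_a.b"])  # True
# ... and a package's attributes are the globals of its __init__.py.
print(demo_a.seen_b is demo_a.b)            # True
```

The bare name b resolves inside __init__.py only because the globals of __init__.py and the attributes of the package object are the same dictionary.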
An import statement brings a module into scope. You imported b, so there it is, a module object.
Read the documentation for import:
The basic import statement (no from clause) is executed in two steps:
find a module, loading and initializing it if necessary
define a name or names in the local namespace for the scope where the import statement occurs.
You didn't import b in the first case.
Related
The docs say:
If __all__ is not defined, the statement from sound.effects import * does not import all submodules from the package sound.effects into the current namespace; it only ensures that the package sound.effects has been imported (possibly running any initialization code in __init__.py) and then imports whatever names are defined in the package.
The first clause (before ';') says does not import, the second clause says then imports whatever names are defined. Isn't that a contradiction to the first clause? Logically, everything is imported eventually, so why this convoluted wording?
The important distinction is "imports whatever names are defined in the package" vs "does not import all submodules".
Submodules aren't automatically names in the package __init__ module (unless explicitly imported to be).
IOW, with a package
foo/
__init__.py
baz.py
with foo/__init__.py containing
def bar():
    print("bar")
the statement
from foo import *
will bring only the function bar (from foo/__init__.py) into scope, but not baz (a submodule in the foo package), unless baz is also a name in foo/__init__.py, à la
from . import baz
To refer to the sound/ hierarchy in the linked docs (in an abridged form, and assuming there is no code in those __init__.py files):
sound/
__init__.py # Initialize the sound package
formats/ # Subpackage for file format conversions
__init__.py
wavread.py # Read WAV files
this means that your application doing
from sound import *
will not put formats in the scope and
from sound.formats import *
won't give you wavread either.
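To check this without trusting the wording of the docs, here is a small sketch that builds the foo/baz layout in a temporary directory. The name demo_foo is a hypothetical stand-in for foo, and the star import is run through exec into a plain dict so the imported names can be inspected:

```python
import importlib
import os
import sys
import tempfile

# demo_foo/__init__.py defines bar(); demo_foo/baz.py is an empty submodule.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_foo")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("def bar():\n    return 'bar'\n")
with open(os.path.join(pkg_dir, "baz.py"), "w") as f:
    f.write("")

sys.path.insert(0, root)
importlib.invalidate_caches()

# Run the star import into an inspectable namespace.
namespace = {}
exec("from demo_foo import *", namespace)

print("bar" in namespace)              # True: a name defined in __init__.py
print("baz" in namespace)              # False: the submodule was never imported
print("demo_foo.baz" in sys.modules)   # False: baz.py was not even executed
```

Adding from . import baz to demo_foo/__init__.py would make baz an ordinary name of the package, and the star import would then pick it up.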
If __all__ is not defined it will import every name from the module that does not start with an _.
def _private(): ... # won't import
def public(): ... # will import
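The same rule can be observed directly. This is only a sketch: demo_names is a hypothetical module written to a temporary directory, with no __all__ defined.

```python
import importlib
import os
import sys
import tempfile

# A module without __all__ that defines one private and one public function.
root = tempfile.mkdtemp()
with open(os.path.join(root, "demo_names.py"), "w") as f:
    f.write("def _private():\n    pass\n\n")
    f.write("def public():\n    pass\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

namespace = {}
exec("from demo_names import *", namespace)

print("public" in namespace)    # True
print("_private" in namespace)  # False: leading underscore, skipped by *
```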
I have a python package of the form:
package
├── __init__.py
├── module_1.py
└── module_2.py
Inside __init__.py I have:
__all__ = ["module_1", "module_2"]
Module_1 is:
__all__ = ["foo"]
def foo(): pass
Module_2 is:
__all__ = ["Bar"]
class Bar: pass
From the documentation I thought that the following code would import foo and Bar:
import package as pk
However, when I run pk.foo() it throws an error: AttributeError: module 'package' has no attribute 'foo' (same for Bar).
Thanks to this answer, I know that to get the desired behavior, I can change __init__.py into:
from .module_1 import *
from .module_2 import *
The above works.
However, I do not understand the documentation. The lines:
if a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered
Sound like my original __init__.py should have worked (the one with __all__ = ["module_1", "module_2"]). That is, the line import package as pk should have imported the modules module_1 and module_2, which in turn make foo and Bar available.
What am I missing?
EDIT:
I have also tried using exactly what the documentation mentions. That is, using from package import *, and then tried with package.foo() and foo(), but neither worked.
The first, package.foo(), throws the error NameError: 'package' is not defined. The second, foo(), throws the error NameError: 'foo' is not defined.
What would a working example look like? One where __init__.py is of the form __all__ = ["module_1", "module_2"].
I will try to summarize the knowledge gained from the comments to my question, the documentation, my own tests, and this post.
1) __all__ behaves differently on __init__ and modules
1.1) Within a module
When __all__ is within a module, it determines what objects are made available when running from module import *.
Given this package structure:
package
├── __init__.py
├── module_1.py
└── module_2.py
And given the following code for module_1:
__all__ = ["foo"]
def foo(): pass
def baz(): pass
Running from package.module_1 import * will make foo available but not baz.
Moreover, foo can then be called using foo(), i.e., there is no need to reference the module.
1.2) Within __init__ (my original question)
When __all__ is within __init__,
it is taken to be the list of module names that should be imported when from package import * is encountered.
Running from package import * will then have two effects:
The scripts of the modules in __all__ will be run (they are imported).
These modules are made available in the namespace.
This means that if __init__ is of the form:
__all__ = ["module_1", "module_2"]
Then, running from package import * will run the scripts of module_1 and module_2 and make both modules available. So, now, the function foo inside module_1 can be called as module_1.foo() instead of package.module_1.foo().
However, if this is the intent, then using from package.module_1 import foo might be better, as it makes foo accessible as foo().
2) from package import * is not the same as import package
Running from package import * has the effect mentioned in 1.2). However, this is not true for running import package: i.e. module_1.foo() would not work in this scenario.
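A sketch of points 1.2) and 2), assuming the package layout from the question. The name demo_package is a hypothetical stand-in for package, and the star import is executed into a dict so that its effect can be inspected:

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_package")
os.mkdir(pkg_dir)
files = {
    "__init__.py": '__all__ = ["module_1", "module_2"]\n',
    "module_1.py": '__all__ = ["foo"]\ndef foo():\n    return "foo"\n',
    "module_2.py": '__all__ = ["Bar"]\nclass Bar:\n    pass\n',
}
for name, source in files.items():
    with open(os.path.join(pkg_dir, name), "w") as f:
        f.write(source)

sys.path.insert(0, root)
importlib.invalidate_caches()

# 2) A plain import runs only __init__.py; the submodules are not loaded.
import demo_package
print(hasattr(demo_package, "module_1"))   # False

# 1.2) The star import loads every module listed in the package's __all__.
namespace = {}
exec("from demo_package import *", namespace)
print(sorted(k for k in namespace if k.startswith("module_")))  # ['module_1', 'module_2']
print(namespace["module_1"].foo())          # foo
print("foo" in namespace)                   # False: only the modules are bound
```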
3) The alternative approach to __init__
(The following is based on this post) As I mentioned in my question, there is an alternative approach to __init__, in which the objects that you want to make available when the user calls from package import * are directly imported into __init__.
As an example, __init__ could contain the following code:
from .module_1 import *
from .module_2 import *
Then, when the user calls from package import * the objects from modules 1 and 2 would be available on the namespace.
If module_1 was that of 1.1), then the function foo could be called without referencing the module. i.e. foo(). However, the same would not work for baz.
As mentioned in 2), from package import * is different to import package. Calling import package in this scenario (with this __init__), makes foo available through package.foo(), instead of just foo(). Similarly import package as pk makes foo available as pk.foo().
This approach could be preferable to that of 1.2), in which foo is made available through module_1.foo().
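The alternative __init__ can be sketched the same way. Here demo_package_b is a hypothetical package name, and module_1 is the version from 1.1) that defines both foo and baz:

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_package_b")
os.mkdir(pkg_dir)
files = {
    "__init__.py": "from .module_1 import *\nfrom .module_2 import *\n",
    "module_1.py": '__all__ = ["foo"]\n'
                   'def foo():\n    return "foo"\n'
                   'def baz():\n    return "baz"\n',
    "module_2.py": '__all__ = ["Bar"]\nclass Bar:\n    pass\n',
}
for name, source in files.items():
    with open(os.path.join(pkg_dir, name), "w") as f:
        f.write(source)

sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_package_b as pk

print(pk.foo())             # foo
print(hasattr(pk, "Bar"))   # True
print(hasattr(pk, "baz"))   # False: module_1's __all__ does not list baz

# The star import on the package now exposes foo directly.
namespace = {}
exec("from demo_package_b import *", namespace)
print("foo" in namespace)   # True
```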
The key problem in your code is that you have defined __all__ in sub modules. Which allows to be exported only that module, but such module is not available in your directory. They are definitions you want to use.
So, just remove them from sub modules and you'll get it working.
(Python 3.6)
I have this folder structure:
package/
start.py
subpackage/
__init__.py
submodule.py
submodule.py:
def subfunc():
    print("This is submodule")
__init__.py:
from subpackage.submodule import subfunc
start.py:
import subpackage
subpackage.subfunc()
subpackage.submodule.subfunc()
I understand how and why
subpackage.subfunc()
works.
But I don't understand why:
subpackage.submodule.subfunc()
also works, if I have not done:
from subpackage import submodule
Nor:
import subpackage.submodule
Neither in __init__.py nor in start.py
Thank you very much if anyone may clear my doubt.
When you issue from subpackage.submodule import subfunc, Python does two things for you: first, it finds and executes the module named subpackage.submodule and puts it into the sys.modules cache; second, it looks up the subfunc object in that module and binds the name "subfunc" in the namespace of the current module:
The import statement combines two operations; it searches for the named module, then it binds the results of that search to a name in the local scope.
When subpackage.submodule is imported, the parent of submodule is imported as well:
While certain side-effects may occur, such as the importing of parent packages, and the updating of various caches (including sys.modules) ...
In the last stage of importing subpackage.submodule, Python sets the module as an attribute on its parent, subpackage. This behavior is documented:
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object.
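A short sketch that contrasts two packages makes the cause visible. Both names are hypothetical: demo_empty has an empty __init__.py, while demo_sub has the from-import from the question in its __init__.py (subfunc returns its string here instead of printing it, only to keep the output short):

```python
import importlib
import os
import sys
import tempfile


def make_package(root, name, init_source):
    """Create <root>/<name>/ with the given __init__.py and a submodule.py."""
    pkg_dir = os.path.join(root, name)
    os.mkdir(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write(init_source)
    with open(os.path.join(pkg_dir, "submodule.py"), "w") as f:
        f.write("def subfunc():\n    return 'This is submodule'\n")


root = tempfile.mkdtemp()
make_package(root, "demo_empty", "")
make_package(root, "demo_sub", "from demo_sub.submodule import subfunc\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_empty
import demo_sub

# Importing a package does not load its submodules by itself.
print(hasattr(demo_empty, "submodule"))   # False

# The from-import in __init__.py loaded the submodule, and loading it
# bound it as an attribute on the parent package.
print(demo_sub.submodule is sys.modules["demo_sub.submodule"])  # True
print(demo_sub.subfunc is demo_sub.submodule.subfunc)           # True
```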
If I'm getting this right, you have a folder called "package" in which there are 2 things: a .py file and another folder called "subpackage".
Inside "subpackage" you have __init__.py and submodule.py which the latter contains a function that just prints "This is submodule".
Now, when you call import subpackage, you call and "pull" everything that's inside "subpackage", including submodule and therefore, the subfunc() function.
When you write subpackage.submodule.subfunc() there's really nothing amazing going there, you just call the mainfolder/container (subpackage.), then the .py file (submodule.) and finally the function itself (subfunc() ).
I've stumbled across some odd python (2.7) import behaviour, which, whilst easy to work around, has me scratching my head.
Given the following folder structure:
test/
__init__.py
x.py
package/
__init__.py
x.py
Where test/package/__init__.py contains the following
from .. import x
print x
from .x import hello
print x
print x.hello
And test/package/x.py contains the following
hello = 1
Why would running import test.package from a REPL result in the following output?
<module 'test.x' from 'test/x.pyc'>
<module 'test.package.x' from 'test/package/x.pyc'>
1
I would have expected x to reference the top level x module, however what the second import does instead, is to import the whole local x module (not just hello as I expected), effectively trampling on the first import.
Can anyone explain the mechanics of the import here?
The from .x import name realizes that test.package.x needs to be a module. It then checks the corresponding entry in sys.modules; if it is found there, then sys.modules['test.package.x'].hello is imported into the calling module.
However, if sys.modules['test.package.x'] does not exist yet, the module is loaded; and as the last step of loading the sys.modules['test.package'].x is set to point to the newly loaded module, even if you explicitly did not ask for it. Thus the second import overrides the name of the first import.
This is by design, otherwise
import foo.bar.baz
foo.bar.baz.x()
and
from foo.bar import baz
baz.x()
wouldn't be interchangeable.
I am unable to find good documentation on this behaviour in the Python 2 documentation, but the Python 3 behaviour is essentially the same in this case:
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule.
[...]
The invariant holding is that if you have sys.modules['spam'] and sys.modules['spam.foo'] (as you would after the above import), the latter must appear as the foo attribute of the former.
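The rebinding can be reproduced under Python 3 with a sketch like the following. Two things here are assumptions and not part of the question: the top-level package is called demo_top (the name test would collide with the standard library's own test package), and the variables first and second are added so that both bindings of x can be inspected afterwards.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
top_dir = os.path.join(root, "demo_top")
sub_dir = os.path.join(top_dir, "package")
os.makedirs(sub_dir)

files = {
    os.path.join(top_dir, "__init__.py"): "",
    os.path.join(top_dir, "x.py"): "",
    os.path.join(sub_dir, "x.py"): "hello = 1\n",
    os.path.join(sub_dir, "__init__.py"): (
        "from .. import x\n"
        "first = x          # the top-level x module\n"
        "from .x import hello\n"
        "second = x         # rebound as a side effect of loading .x\n"
    ),
}
for path, source in files.items():
    with open(path, "w") as f:
        f.write(source)

sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_top.package as pkg

print(pkg.first.__name__)    # demo_top.x
print(pkg.second.__name__)   # demo_top.package.x
print(pkg.x is sys.modules["demo_top.package.x"])  # True
print(pkg.hello)             # 1
```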
Take the following code example:
File package1/__init__.py:
from moduleB import foo
print moduleB.__name__
File package1/moduleB.py:
def foo(): pass
Then from the current directory:
>>> import package1
package1.moduleB
This code works in CPython. What surprises me about it is that the from ... import statement in __init__.py makes the moduleB name visible. According to Python documentation, this should not be the case:
The from form does not bind the module name
Could someone please explain why CPython works that way? Is there any documentation describing this in detail?
The documentation misled you as it is written to describe the more common case of importing a module from outside of the parent package containing it.
For example, using "from example import submodule" in my own code, where "example" is some third party library completely unconnected to my own code, does not bind the name "example". It does still import both the example/__init__.py and example/submodule.py modules, create two module objects, and assign example.submodule to the second module object.
But, "from..import" of names from a submodule must set the submodule attribute on the parent package object. Consider if it didn't:
1. package/__init__.py executes when package is imported.
2. That __init__ does "from submodule import name".
3. At some point later, other completely different code does "import package.submodule".
At step 3, either sys.modules["package.submodule"] doesn't exist, in which case loading it again will give you two different module objects in different scopes; or sys.modules["package.submodule"] will exist but "submodule" won't be an attribute of the parent package object (sys.modules["package"]), and "import package.submodule" will do nothing. However, if it does nothing, the code using the import cannot access submodule as an attribute of package!
Theoretically, how importing a submodule works could be changed if the rest of the import machinery was changed to match.
If you just need to know what importing a submodule S from package P will do, then in a nutshell:
1. Ensure P is imported, or import it otherwise. (This step recurses to handle "import A.B.C.D".)
2. Execute S.py to get a module object. (Skipping details of .pyc files, etc.)
3. Store module object in sys.modules["P.S"].
4. setattr(sys.modules["P"], "S", sys.modules["P.S"])
5. If that import was of the form "import P.S", bind "P" in local scope.
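The steps above can be approximated with public importlib calls. This is a simplified sketch, not how the import machinery is actually implemented: import_submodule is a hypothetical helper, the package demo_p and its submodule s exist only for the demonstration, and unlike the list above the module is registered in sys.modules before its code runs, which is the order the importlib documentation recommends.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def import_submodule(parent_name, child_name):
    """Hypothetical re-enactment of importing submodule S of package P."""
    # Step 1: ensure the parent package is imported.
    parent = importlib.import_module(parent_name)
    full_name = parent_name + "." + child_name
    if full_name in sys.modules:
        return sys.modules[full_name]
    # Step 2: locate S and create an empty module object for it.
    spec = importlib.util.find_spec(full_name)
    if spec is None:
        raise ImportError("No module named " + repr(full_name))
    module = importlib.util.module_from_spec(spec)
    # Step 3: register the module, then execute its code.
    sys.modules[full_name] = module
    spec.loader.exec_module(module)
    # Step 4: bind the submodule as an attribute of the parent package.
    setattr(parent, child_name, module)
    # Step 5 (binding "P" in the caller's scope) is done by the import
    # statement itself and has no counterpart here.
    return module


# Throwaway package demo_p with a submodule s.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "demo_p"))
with open(os.path.join(root, "demo_p", "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(root, "demo_p", "s.py"), "w") as f:
    f.write("value = 42\n")
sys.path.insert(0, root)
importlib.invalidate_caches()

result = import_submodule("demo_p", "s")
print(sys.modules["demo_p"].s is result)   # True
print(result.value)                        # 42
```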
This is because __init__.py itself is the package1 module object at runtime, so every .py file next to it is treated as a submodule of that package, and rewriting __all__ makes no difference. If you put the same code in another file, e.g. example.py, instead of __init__.py, it will raise NameError.
I think the CPython runtime uses a special lookup algorithm for variables in __init__.py that differs from other Python files, maybe something like this:
looking for variable named "moduleB"
if not found:
    if __file__ == '__init__.py':  # don't raise NameError, look for a file named moduleB.py
        if current dir contains file named "moduleB.py":
            import moduleB
    else:
        raise NameError