Python import function from package - python

(Python 3.6)
I have this folder structure:
package/
start.py
subpackage/
__init__.py
submodule.py
submodule.py:
def subfunc():
print("This is submodule")
__ init __.py:
from subpackage.submodule import subfunc
start.py:
import subpackage
subpackage.subfunc()
subpackage.submodule.subfunc()
I understand how and why
subpackage.subfunc()
works.
But I don't understand why:
subpackage.submodule.subfunc()
also works, if I have not done:
from subpackage import submodule
Nor:
import subpackage.submodule
Neither in __ init __.py nor in start.py
Thank you very much if anyone may clear my doubt.

When issuing from subpackage.submodule import subfunc, python does two things for you: one, search and evaluate the module named subpackage.submodule, put it into sys.modules cache; two, populate subpackage.submodule.subfunc object and bind name "subfunc" to the namespace of the current module:
The import statement combines two operations; it searches for the named module, then it binds the results of that search to a name in the local scope.
When importing subpackage.submodule, parent of submodule also got imported:
While certain side-effects may occur, such as the importing of parent packages, and the updating of various caches (including sys.modules) ...
On the last stage of importing subpackage.submodule, python would set the module as an attribute on its parent subpackage, this behavior is documented:
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object.

If I'm getting this right, you have a folder called "package" in which there are 2 things: a .py file and another folder called "subpackage".
Inside "subpackage" you have __init__.py and submodule.py which the latter contains a function that just prints "This is submodule".
Now, when you call import subpackage, you call and "pull" everything that's inside "subpackage", including submodule and therefore, the subfunc() function.
When you write subpackage.submodule.subfunc() there's really nothing amazing going there, you just call the mainfolder/container (subpackage.), then the .py file (submodule.) and finally the function itself (subfunc() ).

Related

Python docs: Importing everything from python package convention not clear

The docs say:
If __all__ is not defined, the statement from sound.effects import * does not import all submodules from the package sound.effects into the current namespace; it only ensures that the package sound.effects has been imported (possibly running any initialization code in init.py) and then imports whatever names are defined in the package.
The first clause (before ';') says does not import, the second clause says then imports whatever names are defined. Isn't that a contradiction to the first clause? Logically, everything is imported eventually, so why this convolued wording?
The important distinction is "imports whatever names are defined in the package" vs "does not import all submodules".
Submodules aren't automatically names in the package __init__ module (unless explicitly imported to be).
IOW, with a package
foo/
__init__.py
baz.py
with foo/__init__.py containing
def bar():
print("bar")
the statement
from foo import *
will only bring the function bar into scope (from foo/__init__.py), but not baz (a submodule in the foo package) into scope, unless it's a name in foo/__init__.py, á la
from . import baz
To refer to the sound/ hierarchy in the linked docs (in an abridged form, and assuming there is no code in those __init__.py files):
sound/
__init__.py # Initialize the sound package
formats/ # Subpackage for file format conversions
__init__.py
wavread.py # Read WAV files
this means that your application doing
from sound import *
will not put formats in the scope and
from sound.formats import *
won't give you wavread either.
If __all__ is not defined it will import every name from the module that does not start with an _.
def _private(): ... # won't import
def public(): ... # will import

Behavior of __all__ in __init__ on Python when used to import modules

I have a python package of the form:
package
├── __init__.py
├── module_1.py
└── module_2.py
Inside __init__.py I have:
__all__ = ["module_1", "module_2"]
Module_1 is:
__all__ = ["foo"]
def foo(): pass
Module_2 is:
__all__ = ["Bar"]
class Bar: pass
From the documentation I thought that the following code would import foo and Bar:
import package as pk
However, when I run pk.foo() it throws an error: AttributeError: module 'package' has no attribute 'foo' (same for Bar).
Thanks to this answer, I know that to get the desired behavior, I can change __init__.py into:
from .module_1 import *
from .module_2 import *
The above works.
However, I do not understand the documentation. The lines:
if a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered
Sound like my original __init__.py should have worked (the one with __all__ = ["module_1", "module_2"]). That is, the line import package as pk should have imported the modules module_1 and module_2, which in turn make foo and Bar available.
What am I missing?
EDIT:
I have also tried using exactly what the documentation mentions. That is, using from package import *, and then tried with package.foo() and foo(), but neither worked.
The first, package.foo(), throws the error NameError: 'package' is not defined.. The second, foo(), throws the error NameError: 'foo' is not defined..
How would a working example look like?.. One where __init__.py is of the form __all__ = ["module_1", "module_2"].
I will try to summarize the knowledge gained from the comments to my question, the documentation, my own tests, and this post.
1) __all__ behaves differently on __init__ and modules
1.1) Within a module
When __all__ is within a module, it determines what objects are made available when running from module import *.
Given this package structure:
package
├── __init__.py
├── module_1.py
└── module_2.py
And given the following code for module_1:
__all__ = ["foo"]
def foo(): pass
def baz(): pass
Running from package.module_1 import * will make foo available but not baz.
Moreover, foo can then be called using foo(), i.e., there is no need to reference the module.
1.2) Within __init__ (my original question)
When __all__ is within __init__,
it is taken to be the list of module names that should be imported when from package import * is encountered.
Running from package import * will then have two effects:
The scripts of the modules in __all__ will be ran (they are imported).
These modules are made available in the namespace.
This means that if __init__ is of the form:
__all__ = ["module_1", "module_2"]
Then, running from package import * will run the scripts of module_1 and module_2 and make both modules available. So, now, the function foo inside module_1 can be called as module_1.foo() instead of package.module_1.foo().
However, if this is the intent, then using from package.module_1 import foo might be better. As it makes foo accessible as foo().
2) from package import * is not the same as import package
Running from package import * has the effect mentioned in 1.2). However, this is not true for running import package: i.e. module_1.foo() would not work in this scenario.
3) The alternative approach to __init__
(The following is based on this post) As I mentioned in my question, there is an alternative approach to __init__, in which the objects that you want to make available when the user calls from package import * are directly imported into __init__.
As an example, __init__ could contain the following code:
from .module_1 import *
from .module_2 import *
Then, when the user calls from package import * the objects from modules 1 and 2 would be available on the namespace.
If module_1 was that of 1.1), then the function foo could be called without referencing the module. i.e. foo(). However, the same would not work for baz.
As mentioned in 2), from package import * is different to import package. Calling import package in this scenario (with this __init__), makes foo available through package.foo(), instead of just foo(). Similarly import package as pk makes foo available as pk.foo().
This approach could be preferable to that of 1.2), in which foo is made available through module_1.foo().
The key problem in your code is that you have defined __all__ in sub modules. Which allows to be exported only that module, but such module is not available in your directory. They are definitions you want to use.
So, just remove them from sub modules and you'll get it working.

Import method from Python submodule in __init__, but not submodule itself

I have a Python module with the following structure:
mymod/
__init__.py
tools.py
# __init__.py
from .tools import foo
# tools.py
def foo():
return 42
Now, when import mymod, I see that it has the following members:
mymod.foo()
mymod.tools.foo()
I don't want the latter though; it just pollutes the namespace.
Funnily enough, if tools.py is called foo.py you get what you want:
mymod.foo()
(Obviously, this only works if there is just one function per file.)
How do I avoid importing tools? Note that putting foo() into __init__.py is not an option. (In reality, there are many functions like foo which would absolutely clutter the file.)
The existence of the mymod.tools attribute is crucial to maintaining proper function of the import system. One of the normal invariants of Python imports is that if a module x.y is registered in sys.modules, then the x module has a y attribute referring to the x.y module. Otherwise, things like
import x.y
x.y.y_function()
break, and depending on the Python version, even
from x import y
can break. Even if you don't think you're doing any of the things that would break, other tools and modules rely on these invariants, and trying to remove the attribute causes a slew of compatibility problems that are nowhere near worth it.
Trying to make tools not show up in your mymod module's namespace is kind of like trying to not make "private" (leading-underscore) attributes show up in your objects' namespaces. It's not how Python is designed to work, and trying to force it to work that way causes more problems than it solves.
The leading-underscore convention isn't just for instance variables. You could mark your tools module with a leading underscore, renaming it to _tools. This would prevent it from getting picked up by from mymod import * imports (unless you explicitly put it in an __all__ list), and it'd change how IDEs and linters treat attempts to access it directly.
You are not importing the tools module, it's just available when you import the package like you're doing:
import mymod
You will have access to everything defined in the __init__ file and all the modules of this package:
import mymod
# Reference a module
mymod.tools
# Reference a member of a module
mymod.tools.foo
# And any other modules from this package
mymod.tools.subtools.func
When you import foo inside __init__ you are are just making foo available there just like if you have defined it there, but of course you defined it in tools which is a way to organize your package, so now since you imported it inside __init__ you can:
import mymod
mymod.foo()
Or you can import foo alone:
from mymod import foo
foo()
But you can import foo without making it available inside __init__, you can do the following which is exactly the same as the example above:
from mymod.tools import foo
foo()
You can use both approaches, they're both right, in all these example you are not "cluttering the file" as you can see accessing foo using mymod.tools.foo is namespaced so you can have multiple foos defined in other modules.
Try putting this in your __init__.py file:
from .tools import foo
del tools

Imports in __init__.py

This is my project structure (Python 3.5.1.):
a
├── b.py
└── __init__.py
Case 1
File b.py is empty.
File __init__.py is:
print(b)
If we run import a, the output is:
NameError: name 'b' is not defined
Case 2
File b.py is empty.
File __init__.py is:
import a.b
print(b)
If we run import a, the output is:
<module 'a.b' from '/tmp/a/b.py'>
Question
Why doesn't the program fail in Case 2?
Usually if we run import a.b then we can only reference it by a.b, not b. Hopefully somebody can help explain what's happening to the namespace in Case 2.
Python adds modules as globals to the parent package after import.
So when you imported a.b, the name b was added as a global to the a module, created by a/__init__.py.
From the Python 3 import system documentation:
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule.
Bold emphasis mine. Note that the same applies to Python 2, but Python 3 made the process more explicit.
An import statement brings a module into scope. You imported b, so there it is, a module object.
Read the documentation for import:
The basic import statement (no from clause) is executed in two steps:
find a module, loading and initializing it if necessary
define a name or names in the local namespace for the scope where the import statement occurs.
You didn't import b in the first case.

from <module> import ... in __init__.py makes module name visible?

Take the following code example:
File package1/__init__.py:
from moduleB import foo
print moduleB.__name__
File package1/moduleB.py:
def foo(): pass
Then from the current directory:
>>> import package1
package1.moduleB
This code works in CPython. What surprises me about it is that the from ... import in __init__.py statement makes the moduleB name visible. According to Python documentation, this should not be the case:
The from form does not bind the module name
Could someone please explain why CPython works that way? Is there any documentation describing this in detail?
The documentation misled you as it is written to describe the more common case of importing a module from outside of the parent package containing it.
For example, using "from example import submodule" in my own code, where "example" is some third party library completely unconnected to my own code, does not bind the name "example". It does still import both the example/__init__.py and example/submodule.py modules, create two module objects, and assign example.submodule to the second module object.
But, "from..import" of names from a submodule must set the submodule attribute on the parent package object. Consider if it didn't:
package/__init__.py executes when package is imported.
That __init__ does "from submodule import name".
At some point later, other completely different code does "import package.submodule".
At step 3, either sys.modules["package.submodule"] doesn't exist, in which case loading it again will give you two different module objects in different scopes; or sys.modules["package.submodule"] will exist but "submodule" won't be an attribute of the parent package object (sys.modules["package"]), and "import package.submodule" will do nothing. However, if it does nothing, the code using the import cannot access submodule as an attribute of package!
Theoretically, how importing a submodule works could be changed if the rest of the import machinery was changed to match.
If you just need to know what importing a submodule S from package P will do, then in a nutshell:
Ensure P is imported, or import it otherwise. (This step recurses to handle "import A.B.C.D".)
Execute S.py to get a module object. (Skipping details of .pyc files, etc.)
Store module object in sys.modules["P.S"].
setattr(sys.modules["P"], "S", sys.modules["P.S"])
If that import was of the form "import P.S", bind "P" in local scope.
this is because __init__.py represent itself as package1 module object at runtime, so every .py file will be defined as an submodule. and rewrite __all__ will not make any sense. you can make another file e.g example.py and fill it with the same code in __init__.py and it will raise NameError.
i think CPython runtime takes special algorithm when __init__.py looking for variables differ from other python files, may be like this:
looking for variable named "moduleB"
if not found:
if __file__ == '__init__.py': #dont raise NameError, looking for file named moduleB.py
if current dir contains file named "moduleB.py":
import moduleB
else:
raise namerror

Categories

Resources