I was getting an import error (ImportError: cannot import name 'ClassB') with the following code:
Directory structure:
main.py
test_pkg/
__init__.py
a.py
b.py
main.py:
from test_pkg import ClassA, ClassB
__init__.py:
from .a import ClassA
from .b import ClassB
a.py:
from test_pkg import ClassB
class ClassA:
pass
b.py:
class ClassB:
pass
In the past I fixed it with a quick 'experiment', using the full module name in the import in a.py:
from test_pkg.b import ClassB
class ClassA:
pass
I have read about the import machinery, and according to the documentation:
This name will be used in various phases of the import search, and it may be the dotted path to a submodule, e.g. foo.bar.baz. In this case, Python first tries to import foo, then foo.bar, and finally foo.bar.baz. link2doc
I was expecting it to fail again, because it would try to import test_pkg during the test_pkg import, but it works. My question is: why?
Also, 2 additional questions:
Is it a proper approach to have cross-module dependencies?
Is it OK to have modules imported in the package's __init__.py?
My analysis:
Based on my reading, the issue is most probably that, because of
__init__.py:
from .a import ClassA
from .b import ClassB
the ClassA and ClassB imports are executed as part of the test_pkg import, but then execution hits the import statement in a.py:
a.py:
from test_pkg import ClassB
class ClassA:
pass
and it fails because a circular dependency occurs.
But it works when the import is written as:
from test_pkg.b import ClassB
and according to my understanding it shouldn't, because:
This name will be used in various phases of the import search, and it
may be the dotted path to a submodule, e.g. foo.bar.baz. In this case,
Python first tries to import foo, then foo.bar, and finally
foo.bar.baz. If any of the intermediate imports fail, a
ModuleNotFoundError is raised.
so I was expecting the same behavior for both imports.
It looks like the import with the full path does not launch the problematic test_pkg import process:
from test_pkg.b import ClassB
Your file is named b.py, but you're trying to import B, not import b.
Depending on your platform (see PEP 235 for details), this may work, or it may not. If it doesn't work, the symptoms will be exactly what you're seeing: ImportError: cannot import name 'B'.
The fix is to from test_pkg import b. Or, if you want the module to be named B, rename the file to B.py.
This actually has nothing to do with packages (except that the error message you get says cannot import name 'B' instead of No module named 'B', because in a failed from … import statement, Python can't tell whether you were failing to import a module from a package, or some global name from a module).
So, why does this work?
from test_pkg.b import B
I was expecting it will fail again, because it will try to import test_pkg during test_pkg import, but it is working. My question is why?
Because importing test_pkg isn't the problem in the first place; importing test_pkg.B is. And you solved that problem by importing test_pkg.b instead.
test_pkg.b is successfully found in test_pkg/b.py.
And then, test_pkg.b.B is found within that module, and imported into your module, because of course there's a class B: statement in b.py.
For your followup questions:
is it proper approach to have cross modules dependencies?
There's nothing wrong with cross-module dependencies as long as they aren't circular, and yours aren't.
It's perfectly fine for test_pkg.a to import test_pkg.b with an absolute import, like from test_pkg import b.
However, it's usually better to use a relative import, like from . import b (unless you need dual-version code that works the same on Python 2.x and 3.x). PEP 328 explains the reasons why relative imports are usually better for intra-package dependencies (and why it's only "usually" rather than "always").
is it ok to have modules imported in package __init__.py?
Yes. In fact, this is a pretty common idiom, used for multiple purposes.
For example, see asyncio.__init__.py, which imports all of the public exports from each of its submodules and re-exports them. There are a handful of rarely-used names in the submodules, which don't start with _ but aren't included in __all__, and if you want to use those you need to import the submodule explicitly. But everything you're likely to need in a typical program is included in __all__ and re-exported by the package, so you can just write, e.g., asyncio.Lock instead of asyncio.locks.Lock.
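As a quick check of the re-export idiom described above, the following sketch compares the package-level name with the submodule-level one. It only uses names mentioned in the paragraph (asyncio.Lock, asyncio.locks.Lock, __all__); the variable names are mine.

```python
import asyncio

# The name re-exported by the package and the name defined in the
# submodule refer to the very same class object.
same_object = asyncio.Lock is asyncio.locks.Lock

# The re-exported name is listed in the package-level __all__.
listed_in_all = "Lock" in asyncio.__all__

print("same object:", same_object)
print("listed in asyncio.__all__:", listed_in_all)
```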
The code I would prefer to write is probably:
main.py:
from test_pkg import A, B
b.py:
class B:
pass
a.py:
from .b import B
class A:
pass
__init__.py:
from .a import A
from .b import B
The proper approach is to have cross module dependencies but not circular. You should figure out the hierarchy of your project and arrange your dependency graph in a DAG (directed acyclic graph).
What you put in the package's __init__.py will be what you can access through the package. Also, you can refer to this question for the use of __all__ in __init__.py.
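To make the "not circular" advice concrete, here is a minimal sketch that writes two throwaway packages into a temporary directory. The package names cyclic_pkg and acyclic_pkg and the helper make_package are hypothetical, chosen only for this illustration; the file contents follow the ClassA/ClassB layout from the question. In the first package a.py imports from the package's own __init__.py while that file is still running; in the second, a.py depends only on its sibling b.py.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_package(root, name, a_import_line):
    """Write a throwaway package whose __init__.py re-exports ClassA and ClassB."""
    pkg = root / name
    pkg.mkdir()
    (pkg / "__init__.py").write_text("from .a import ClassA\nfrom .b import ClassB\n")
    (pkg / "a.py").write_text(a_import_line + "\n\nclass ClassA:\n    pass\n")
    (pkg / "b.py").write_text("class ClassB:\n    pass\n")


root = Path(tempfile.mkdtemp())
# a.py -> package __init__ -> a.py: a cycle.
make_package(root, "cyclic_pkg", "from cyclic_pkg import ClassB")
# a.py -> b.py only: the dependency graph stays acyclic.
make_package(root, "acyclic_pkg", "from .b import ClassB")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

try:
    importlib.import_module("cyclic_pkg")
    cyclic_error = None
except ImportError as exc:
    cyclic_error = exc

acyclic_pkg = importlib.import_module("acyclic_pkg")

print("cyclic import failed:", cyclic_error is not None)
print("acyclic package exposes:", acyclic_pkg.ClassA.__name__, acyclic_pkg.ClassB.__name__)
```

The cyclic variant fails because ClassB is not yet bound in the package namespace when a.py asks for it; the acyclic variant never asks the half-initialized package for anything.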
I have a python package of the form:
package
├── __init__.py
├── module_1.py
└── module_2.py
Inside __init__.py I have:
__all__ = ["module_1", "module_2"]
Module_1 is:
__all__ = ["foo"]
def foo(): pass
Module_2 is:
__all__ = ["Bar"]
class Bar: pass
From the documentation I thought that the following code would import foo and Bar:
import package as pk
However, when I run pk.foo() it throws an error: AttributeError: module 'package' has no attribute 'foo' (same for Bar).
Thanks to this answer, I know that to get the desired behavior, I can change __init__.py into:
from .module_1 import *
from .module_2 import *
The above works.
However, I do not understand the documentation. The lines:
if a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered
sound like my original __init__.py should have worked (the one with __all__ = ["module_1", "module_2"]). That is, the line import package as pk should have imported the modules module_1 and module_2, which in turn would make foo and Bar available.
What am I missing?
EDIT:
I have also tried using exactly what the documentation mentions. That is, using from package import *, and then tried with package.foo() and foo(), but neither worked.
The first, package.foo(), throws the error NameError: 'package' is not defined. The second, foo(), throws the error NameError: 'foo' is not defined.
What would a working example look like? One where __init__.py is of the form __all__ = ["module_1", "module_2"].
I will try to summarize the knowledge gained from the comments to my question, the documentation, my own tests, and this post.
1) __all__ behaves differently in __init__ and in modules
1.1) Within a module
When __all__ is within a module, it determines what objects are made available when running from module import *.
Given this package structure:
package
├── __init__.py
├── module_1.py
└── module_2.py
And given the following code for module_1:
__all__ = ["foo"]
def foo(): pass
def baz(): pass
Running from package.module_1 import * will make foo available but not baz.
Moreover, foo can then be called using foo(), i.e., there is no need to reference the module.
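A small sketch of 1.1), assuming a throwaway package written to a temporary directory. The package name allmod_pkg is hypothetical, and the functions return strings only so there is something to inspect. The star import is run through exec into a dict so the resulting namespace can be examined.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "allmod_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "module_1.py").write_text(
    '__all__ = ["foo"]\n'
    "\n"
    "def foo():\n"
    '    return "foo called"\n'
    "\n"
    "def baz():\n"
    '    return "baz called"\n'
)
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Run the star import in a fresh namespace so we can see what it binds.
namespace = {}
exec("from allmod_pkg.module_1 import *", namespace)

print("foo imported:", "foo" in namespace)
print("baz imported:", "baz" in namespace)
print(namespace["foo"]())
```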
1.2) Within __init__ (my original question)
When __all__ is within __init__,
it is taken to be the list of module names that should be imported when from package import * is encountered.
Running from package import * will then have two effects:
The scripts of the modules in __all__ will be run (they are imported).
These modules are made available in the namespace.
This means that if __init__ is of the form:
__all__ = ["module_1", "module_2"]
Then, running from package import * will run the scripts of module_1 and module_2 and make both modules available. So, now, the function foo inside module_1 can be called as module_1.foo() instead of package.module_1.foo().
However, if this is the intent, then using from package.module_1 import foo might be better, as it makes foo accessible as foo().
2) from package import * is not the same as import package
Running from package import * has the effect mentioned in 1.2). However, this is not true for running import package: i.e. module_1.foo() would not work in this scenario.
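The difference between 1.2) and 2) can be sketched as follows, again with a throwaway package in a temporary directory. The name initall_pkg is hypothetical; the module contents mirror the ones in the question, with foo returning a string so the call can be checked.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "initall_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text('__all__ = ["module_1", "module_2"]\n')
(pkg / "module_1.py").write_text(
    '__all__ = ["foo"]\n\ndef foo():\n    return "foo called"\n'
)
(pkg / "module_2.py").write_text('__all__ = ["Bar"]\n\nclass Bar:\n    pass\n')
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# A plain import only runs __init__.py; the submodules are not loaded.
initall_pkg = importlib.import_module("initall_pkg")
loaded_by_plain_import = hasattr(initall_pkg, "module_1")

# The star import consults __all__ and imports the listed submodules.
star_namespace = {}
exec("from initall_pkg import *", star_namespace)

print("module_1 after plain import:", loaded_by_plain_import)
print("module_1 after star import:", "module_1" in star_namespace)
print("foo directly available:", "foo" in star_namespace)
print(star_namespace["module_1"].foo())
```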
3) The alternative approach to __init__
(The following is based on this post) As I mentioned in my question, there is an alternative approach to __init__, in which the objects that you want to make available when the user calls from package import * are directly imported into __init__.
As an example, __init__ could contain the following code:
from .module_1 import *
from .module_2 import *
Then, when the user calls from package import * the objects from modules 1 and 2 would be available on the namespace.
If module_1 was that of 1.1), then the function foo could be called without referencing the module. i.e. foo(). However, the same would not work for baz.
As mentioned in 2), from package import * is different to import package. Calling import package in this scenario (with this __init__), makes foo available through package.foo(), instead of just foo(). Similarly import package as pk makes foo available as pk.foo().
This approach could be preferable to that of 1.2), in which foo is made available through module_1.foo().
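Here is a sketch of the approach in 3), under the same assumptions as before (a throwaway package in a temporary directory; the name reexport_pkg is hypothetical). __init__.py star-imports from both modules, so foo and Bar become attributes of the package itself, while baz stays hidden because it is not listed in module_1's __all__.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "reexport_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("from .module_1 import *\nfrom .module_2 import *\n")
(pkg / "module_1.py").write_text(
    '__all__ = ["foo"]\n'
    "\n"
    "def foo():\n"
    '    return "foo called"\n'
    "\n"
    "def baz():\n"
    '    return "baz called"\n'
)
(pkg / "module_2.py").write_text('__all__ = ["Bar"]\n\nclass Bar:\n    pass\n')
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# import package: names are reached through the package.
reexport_pkg = importlib.import_module("reexport_pkg")
print(reexport_pkg.foo())
print("baz on package:", hasattr(reexport_pkg, "baz"))

# from package import *: names are bound directly.
star_namespace = {}
exec("from reexport_pkg import *", star_namespace)
print("foo bound directly:", "foo" in star_namespace)
```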
The key problem in your code is that you have defined __all__ in the submodules. That restricts what each submodule exports to the listed names, but the names listed there are not modules in your directory; they are the definitions you want to use.
So, just remove __all__ from the submodules and you'll get it working.
I have a Python module with the following structure:
mymod/
__init__.py
tools.py
# __init__.py
from .tools import foo
# tools.py
def foo():
return 42
Now, when I import mymod, I see that it has the following members:
mymod.foo()
mymod.tools.foo()
I don't want the latter though; it just pollutes the namespace.
Funnily enough, if tools.py is called foo.py you get what you want:
mymod.foo()
(Obviously, this only works if there is just one function per file.)
How do I avoid importing tools? Note that putting foo() into __init__.py is not an option. (In reality, there are many functions like foo which would absolutely clutter the file.)
The existence of the mymod.tools attribute is crucial to maintaining proper function of the import system. One of the normal invariants of Python imports is that if a module x.y is registered in sys.modules, then the x module has a y attribute referring to the x.y module. Otherwise, things like
import x.y
x.y.y_function()
break, and depending on the Python version, even
from x import y
can break. Even if you don't think you're doing any of the things that would break, other tools and modules rely on these invariants, and trying to remove the attribute causes a slew of compatibility problems that are nowhere near worth it.
Trying to make tools not show up in your mymod module's namespace is kind of like trying to not make "private" (leading-underscore) attributes show up in your objects' namespaces. It's not how Python is designed to work, and trying to force it to work that way causes more problems than it solves.
The leading-underscore convention isn't just for instance variables. You could mark your tools module with a leading underscore, renaming it to _tools. This would prevent it from getting picked up by from mymod import * imports (unless you explicitly put it in an __all__ list), and it'd change how IDEs and linters treat attempts to access it directly.
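A minimal sketch of the underscore suggestion, assuming a throwaway package written to a temporary directory (the name mymod_private is hypothetical). It shows that _tools is still an attribute of the package, as the invariant described above requires, but a star import does not pick it up.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "mymod_private"
pkg.mkdir()
(pkg / "__init__.py").write_text("from ._tools import foo\n")
(pkg / "_tools.py").write_text("def foo():\n    return 42\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

mymod_private = importlib.import_module("mymod_private")

# The submodule is still bound on the package, matching sys.modules.
invariant_holds = mymod_private._tools is sys.modules["mymod_private._tools"]

# A star import skips names that start with an underscore.
star_namespace = {}
exec("from mymod_private import *", star_namespace)

print("invariant holds:", invariant_holds)
print("foo exported:", "foo" in star_namespace)
print("_tools exported:", "_tools" in star_namespace)
```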
You are not importing the tools module yourself; it's just available when you import the package like you're doing:
import mymod
You will have access to everything defined in the __init__ file and all the modules of this package:
import mymod
# Reference a module
mymod.tools
# Reference a member of a module
mymod.tools.foo
# And any other modules from this package
mymod.tools.subtools.func
When you import foo inside __init__, you are just making foo available there, as if you had defined it there. Of course, you defined it in tools, which is a way to organize your package, so now that you have imported it inside __init__ you can do:
import mymod
mymod.foo()
Or you can import foo alone:
from mymod import foo
foo()
But you can also import foo without making it available inside __init__. The following has exactly the same effect as the example above:
from mymod.tools import foo
foo()
You can use both approaches; they're both right. In all these examples you are not "cluttering the file": as you can see, accessing foo using mymod.tools.foo is namespaced, so you can have multiple foos defined in other modules.
Try putting this in your __init__.py file:
from .tools import foo
del tools
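For what it's worth, here is a sketch of what that del does, assuming a throwaway package in a temporary directory (the name mymod_del is hypothetical). It also illustrates the caveat raised in another answer here: the submodule stays cached in sys.modules, so a later import of mymod_del.tools does not put the attribute back.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "mymod_del"
pkg.mkdir()
(pkg / "__init__.py").write_text("from .tools import foo\ndel tools\n")
(pkg / "tools.py").write_text("def foo():\n    return 42\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

mymod_del = importlib.import_module("mymod_del")

result = mymod_del.foo()                          # the re-exported function still works
attr_after_del = hasattr(mymod_del, "tools")      # the attribute is gone
still_cached = "mymod_del.tools" in sys.modules   # but the module is still cached

# Importing the submodule again finds the cached module and does not
# rebind the attribute on the package.
importlib.import_module("mymod_del.tools")
attr_after_reimport = hasattr(mymod_del, "tools")

print(result, attr_after_del, still_cached, attr_after_reimport)
```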
(Python 3.6)
I have this folder structure:
package/
start.py
subpackage/
__init__.py
submodule.py
submodule.py:
def subfunc():
print("This is submodule")
__init__.py:
from subpackage.submodule import subfunc
start.py:
import subpackage
subpackage.subfunc()
subpackage.submodule.subfunc()
I understand how and why
subpackage.subfunc()
works.
But I don't understand why:
subpackage.submodule.subfunc()
also works, if I have not done:
from subpackage import submodule
Nor:
import subpackage.submodule
Neither in __init__.py nor in start.py.
Thank you very much if anyone may clear my doubt.
When you issue from subpackage.submodule import subfunc, Python does two things for you: first, it finds and executes the module named subpackage.submodule and puts it into the sys.modules cache; second, it looks up the subpackage.submodule.subfunc object and binds the name "subfunc" in the namespace of the current module:
The import statement combines two operations; it searches for the named module, then it binds the results of that search to a name in the local scope.
When subpackage.submodule is imported, the parent of submodule is imported as well:
While certain side-effects may occur, such as the importing of parent packages, and the updating of various caches (including sys.modules) ...
In the last stage of importing subpackage.submodule, Python sets the module as an attribute on its parent, subpackage. This behavior is documented:
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object.
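The binding described in that quote can be observed with a small sketch. It assumes a throwaway package in a temporary directory; the name subpkg_demo is hypothetical, and subfunc returns its message instead of printing it so the result can be checked.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "subpkg_demo"
pkg.mkdir()
# Same shape as the question: __init__.py only does a from-import.
(pkg / "__init__.py").write_text("from subpkg_demo.submodule import subfunc\n")
(pkg / "submodule.py").write_text('def subfunc():\n    return "This is submodule"\n')
sys.path.insert(0, str(root))
importlib.invalidate_caches()

subpkg_demo = importlib.import_module("subpkg_demo")

# Effect 1: the from-import bound the name "subfunc" in the package namespace.
print(subpkg_demo.subfunc())

# Effect 2: loading the submodule also bound it as an attribute of its parent.
bound_on_parent = subpkg_demo.submodule is sys.modules["subpkg_demo.submodule"]
print("submodule bound on parent:", bound_on_parent)
print(subpkg_demo.submodule.subfunc())
```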
If I'm getting this right, you have a folder called "package" containing 2 things: a .py file and another folder called "subpackage".
Inside "subpackage" you have __init__.py and submodule.py, the latter of which contains a function that just prints "This is submodule".
Now, when you call import subpackage, its __init__.py runs, and the import in that file "pulls" in submodule and therefore the subfunc() function.
When you write subpackage.submodule.subfunc(), there's really nothing amazing going on: you name the main folder/container (subpackage.), then the .py file (submodule.), and finally the function itself (subfunc()).
Can someone explain this to me?
When you import Tkinter.Messagebox what actually does this mean (Dot Notation)?
I know that you can import Tkinter but when you import Tkinter.Messagebox what actually is this? Is it a class inside a class?
I am new to Python and dot notation confuses me sometimes.
When you put that dot in your imports, you're referring to something inside the package/file you're importing from.
What you import can be a class, a package, or a file; each time you put a dot, you ask for something that is inside the thing before it.
parent/
__init__.py
file.py
one/
__init__.py
anotherfile.py
two/
__init__.py
three/
__init__.py
For example, given this layout, when you write import parent.file you're actually importing another Python module that may contain classes and variables. To refer to a specific variable or class inside that file, you would write, for example, from parent.file import SomeClass (where SomeClass stands for whatever name you need).
This may go further: you can import a package inside another package, or a class inside a file inside a package, etc. (like import parent.one.anotherfile).
For more info, read the Python documentation about this.
import a.b imports b into the namespace a; you can access it as a.b. Be aware that this only works if b is a module (e.g. import urllib.request in Python 3).
from a import b however imports b into the current namespace, accessible by b. This works for classes, functions etc.
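A short sketch of the two forms using the urllib.request example mentioned above. The os.path.join case is my own addition, included only to show the dotted form rejecting something that is not a module.

```python
import importlib
import urllib.request        # binds the top-level name "urllib"
from urllib import request   # binds the name "request" directly

# Both spellings reach the same module object.
same_module = request is urllib.request

# The dotted form only works for modules: os.path.join is a function,
# so trying to import it as a module raises an ImportError subclass.
try:
    importlib.import_module("os.path.join")
    dotted_function_import_failed = False
except ImportError:
    dotted_function_import_failed = True

print("same module:", same_module)
print("import os.path.join failed:", dotted_function_import_failed)
```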
Be careful when using from ... import:
from math import sqrt
from cmath import sqrt
Both statements import the function sqrt into the current namespace, however, the second import statement overrides the first one.
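The shadowing can be checked directly, and importing the modules themselves keeps both functions reachable. A small sketch (the variable names are mine):

```python
from math import sqrt
from cmath import sqrt  # rebinds the name sqrt; math's version is no longer reachable as sqrt

import cmath
import math

# The bare name now refers to cmath's function.
shadowed = sqrt is cmath.sqrt

# Qualifying the call with the module name avoids the clash.
real_root = math.sqrt(4)
complex_root = cmath.sqrt(-1)

print("sqrt is cmath.sqrt:", shadowed)
print(real_root, complex_root)
```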
This is my project structure (Python 3.5.1.):
a
├── b.py
└── __init__.py
Case 1
File b.py is empty.
File __init__.py is:
print(b)
If we run import a, the output is:
NameError: name 'b' is not defined
Case 2
File b.py is empty.
File __init__.py is:
import a.b
print(b)
If we run import a, the output is:
<module 'a.b' from '/tmp/a/b.py'>
Question
Why doesn't the program fail in Case 2?
Usually if we run import a.b then we can only reference it by a.b, not b. Hopefully somebody can help explain what's happening to the namespace in Case 2.
Python adds modules as globals to the parent package after import.
So when you imported a.b, the name b was added as a global to the a module, created by a/__init__.py.
From the Python 3 import system documentation:
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule.
Bold emphasis mine. Note that the same applies to Python 2, but Python 3 made the process more explicit.
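Both cases from the question can be reproduced with a sketch like the one below. It assumes throwaway packages in a temporary directory; the names case1_pkg and case2_pkg and the variable seen_in_init are hypothetical, and Case 2 stores b in a variable instead of printing it so the binding can be inspected.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# Case 1: __init__.py refers to b without importing anything.
case1 = root / "case1_pkg"
case1.mkdir()
(case1 / "b.py").write_text("")
(case1 / "__init__.py").write_text("print(b)\n")

# Case 2: __init__.py imports the submodule first, then refers to b.
case2 = root / "case2_pkg"
case2.mkdir()
(case2 / "b.py").write_text("")
(case2 / "__init__.py").write_text("import case2_pkg.b\nseen_in_init = b\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

try:
    importlib.import_module("case1_pkg")
    case1_error = None
except NameError as exc:
    case1_error = exc

case2_pkg = importlib.import_module("case2_pkg")

print("case 1 raised NameError:", case1_error is not None)
# Loading the submodule bound "b" in the package namespace, which is the
# global namespace of __init__.py, so the bare name resolved there.
print("case 2 saw the submodule:", case2_pkg.seen_in_init is sys.modules["case2_pkg.b"])
```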
An import statement brings a module into scope. You imported b, so there it is, a module object.
Read the documentation for import:
The basic import statement (no from clause) is executed in two steps:
find a module, loading and initializing it if necessary
define a name or names in the local namespace for the scope where the import statement occurs.
You didn't import b in the first case.