Intra-package imports do not always work - python

I have a Django project structured like so:
appname/
    models/
        __init__.py
        a.py
        base.py
        c.py
... where appname/models/__init__.py contains only statements like so:
from appname.models.base import Base
from appname.models.a import A
from appname.models.c import C
... and where appname/models/base.py contains:
import django.db.models
class Base(django.db.models.Model):
    ...
and where appname/models/a.py contains:
import appname.models as models
class A(models.Base):
    ....
...and similarly for appname/models/c.py, etc.
I am quite happy with this structure of my code, but of course it does not work, because of circular imports.
When appname/__init__.py is run, appname/models/a.py will get run, but that module imports "appname.models", which has not finished executing yet. Classic circular import.
So this supposedly indicates that my code is structured poorly and needs to be re-designed in order to avoid circular dependency.
What are the options to do that?
Some solutions I can think of and then why I don't want to use them:
Combine all my model code into a single file: Having 20+ classes in the same file is a far worse style than what I am trying to do (with separate files), in my opinion.
Move the "Base" model class into another package outside of "appname/models": This means that I would end up with a package in my project that contains base/parent classes that should ideally be split into the packages in which their child/sub classes are located. Why should I have base/parent classes for models, forms, views, etc. in the same package and not in their own packages (where the child/sub classes would be located), other than to avoid circular imports?
So my question is not just how to avoid circular imports, but to do so in a way that is just as clean (if not cleaner) than what I tried to implement.
Does anyone have a better way?

Edit
I have researched this more thoroughly and come to the conclusion that this is a bug in either core Python or the Python documentation. More information is available at this question and answer.
Python's PEP 8 indicates a clear preference for absolute over relative imports. This problem has a workaround that involves relative imports, and there is a possible fix in the import machinery.
My original answer below gives examples and workarounds.
Original answer
The problem, as you have correctly deduced, is circular dependencies. In some cases, Python can handle these just fine, but if you get too many nested imports, it has issues.
For example, if you only have one package level, it is actually fairly hard to get it to break (without mutual imports), but as soon as you nest packages, it works more like mutual imports, and it starts to become difficult to make it work. Here is an example that provokes the error:
level1/__init__.py
from level1.level2 import Base
level1/level2/__init__.py
from level1.level2.base import Base
from level1.level2.a import A
level1/level2/a.py
import level1.level2.base
class A(level1.level2.base.Base): pass
level1/level2/base.py
class Base: pass
The error can be "fixed" (for this small case) in several different ways, but many potential fixes are fragile. For example, if you don't need the import of A in the level2 __init__ file, removing that import will fix the problem (and your program can later execute from level1.level2.a import A), but if your package gets more complex, you will see the errors creeping in again.
Python sometimes does a good job of making these complex imports work, and the rules for when they will and won't work are not at all intuitive. One general rule is that from xxx.yyy import zzz can be more forgiving than import xxx.yyy followed by xxx.yyy.zzz. In the latter case, the interpreter has to have finished binding yyy into the xxx namespace when it is time to retrieve xxx.yyy.zzz, but in the former case, the interpreter can traverse the modules in the package before the top-level package namespace is completely set up.
So for this example, the real problem is the bare import in a.py. This could easily be fixed:
from level1.level2.base import Base
class A(Base): pass
Consistently using relative imports is a good way to enforce this use of from ... import for the simple reason that relative imports do not work without the from. To use relative imports with the example above, level1/level2/a.py should contain:
from .base import Base
class A(Base): pass
This breaks the problematic import cycle and everything else works fine. If the imported name (such as Base) is too confusingly generic when not prefixed with the source module name, you can easily rename it on import:
from .base import Base as BaseModel
class A(BaseModel): pass
Although that fixes the current problem, if the package structure gets more complex, you might want to consider using relative imports more generally. For example, level1/level2/__init__.py could be:
from .base import Base
from .a import A
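The difference between the failing and the fixed version of a.py can be reproduced in a throwaway directory. The following is a minimal sketch, not part of the original answer: the helper names make_package and try_import and the top-level package names demo_bare and demo_rel are made up for illustration, and the exact exception raised by the failing variant may differ between Python versions.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_package(root, top, a_source):
    """Write the level1/level2 layout from the answer under `root`.

    `top` stands in for `level1`; `a_source` is the body of level2/a.py.
    """
    inner = root / top / "level2"
    inner.mkdir(parents=True)
    (root / top / "__init__.py").write_text(f"from {top}.level2 import Base\n")
    (inner / "__init__.py").write_text(
        f"from {top}.level2.base import Base\n"
        f"from {top}.level2.a import A\n"
    )
    (inner / "base.py").write_text("class Base: pass\n")
    (inner / "a.py").write_text(a_source)


def try_import(top, a_source):
    """Import the generated package; return 'ok' or the exception class name."""
    with tempfile.TemporaryDirectory() as tmp:
        make_package(Path(tmp), top, a_source)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            importlib.import_module(top)
            return "ok"
        except (AttributeError, ImportError) as exc:
            return type(exc).__name__
        finally:
            sys.path.remove(tmp)
            # Drop the (possibly half-initialised) modules again.
            for name in [n for n in sys.modules if n.split(".")[0] == top]:
                del sys.modules[name]


# Variant 1: bare import followed by attribute access through the package.
bare_result = try_import(
    "demo_bare",
    "import demo_bare.level2.base\n"
    "class A(demo_bare.level2.base.Base): pass\n",
)

# Variant 2: relative from-import, as recommended above.
relative_result = try_import(
    "demo_rel",
    "from .base import Base\n"
    "class A(Base): pass\n",
)

print("bare import:    ", bare_result)
print("relative import:", relative_result)
```

The bare variant fails because the level2 attribute is bound on the top-level package only after level2/__init__.py has finished executing, while the from-import never needs that attribute.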

Related

How to deal with objects not being equal because of different imports

First a small example to illustrate what I mean. Suppose I have a class MyClass in a subpackage in somepackage. Suppose I can import in these two ways, because the sys path is set accordingly:
from somepackage.subpackage import MyClass as mc1
from subpackage import MyClass as mc2
then:
mc1() == mc2()
will return False. Because apparently Python thinks these are two objects of different classes as far as I understand it.
Is my reasoning correct here? How to deal with such situations? This seems like an easy way to break code.
To make this a little easier, here is a more concrete example of the issue. The above class is actually part of a library. This library seems to rely on the fact that somepackage is on sys.path and imports directly from subpackages, as in the second line above. For this reason I need to modify sys.path in my application as well to include somepackage, so that the library's imports resolve when I use it.
I think it's already bad enough that I have to modify sys.path at all, but at least I didn't want to rely on this modification in my own code, which I control, so I import the library by its full name ("the regular way"). I then ran into exactly the issue described above.
I have from somepackage.subpackage import MyClass as mc1 and pass an object created from it to the library code. The library compares that object with a second one it created after importing the class with from subpackage import MyClass as mc2. And then the code failed.
Do I just have to accept that the library relies on the sys.path modification and import all libraries the same way? This is what I currently do, but it feels really bad. Is there any better way?
How can such issues be detected in general? I was "lucky" that in my use case this led to an exception, so I found the problem very quickly. But in general these sorts of bugs seem extremely dangerous to me.
Small Bonus Question: Is there some guideline or something similar, which says that libraries shouldn't import stuff like this? Something I could just send to the author to convince them that it would be better to not do it like this. Because at least to me it seems like bad style.
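The reasoning can be checked directly. A minimal sketch, assuming the layout from the question (somepackage/subpackage/__init__.py defining MyClass, written here to a temporary directory) and comparing the class objects themselves, because instances of a class without __eq__ compare by identity anyway:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    outer = Path(tmp) / "somepackage"
    inner = outer / "subpackage"
    inner.mkdir(parents=True)
    (outer / "__init__.py").write_text("")
    (inner / "__init__.py").write_text("class MyClass: pass\n")

    # Both the parent directory and somepackage itself are on sys.path,
    # so the same file is reachable under two different module names.
    extra_paths = [tmp, str(outer)]
    sys.path[:0] = extra_paths
    try:
        importlib.invalidate_caches()
        mc1 = importlib.import_module("somepackage.subpackage").MyClass
        mc2 = importlib.import_module("subpackage").MyClass
    finally:
        for path in extra_paths:
            sys.path.remove(path)

same_class = mc1 is mc2
cross_isinstance = isinstance(mc1(), mc2)
module_names = (mc1.__module__, mc2.__module__)

print(same_class)        # False: two module objects, two class objects
print(cross_isinstance)  # False
print(module_names)      # ('somepackage.subpackage', 'subpackage')
```

Each module name gets its own entry in sys.modules, so the file is executed twice and defines the class twice.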

Is it possible to enforce via CI that module_a does not import anything from module_b?

I'm maintaining several open source projects and I want to write code at work that nudges people to do the right thing.
I have a situation where I see people importing stuff in module_a from module_b, but that should not happen. There are two reasons for it:
Production code importing stuff from test code: I hope I don't need to explain why that is a bad idea.
Import Cycles: Some modules are so basic, that they should not import any other modules from the package (e.g. constants.py / errors.py / utils.py).
For this question, you can assume that all imports happen on module level (hence not inside a function).
Is it possible to enforce via CI (e.g. mypy / pytest / flake8) that module_a does not import anything from module_b?
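One possible approach, sketched here with only the standard library ast module: parse each source file and fail if an import statement names the forbidden module. The function name forbidden_import_lines and the module names in the usage example are hypothetical, and the sketch only understands absolute imports (relative imports such as from . import x would need the file's package name to resolve).

```python
import ast


def forbidden_import_lines(source, forbidden):
    """Return sorted line numbers of imports that reference `forbidden`.

    `forbidden` is a dotted module name; submodules of it also count.
    Relative imports (level > 0) are skipped in this sketch.
    """
    def matches(name):
        return name == forbidden or name.startswith(forbidden + ".")

    lines = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Cover both `from pkg.module_b import x` and `from pkg import module_b`.
            names = [node.module] + [f"{node.module}.{a.name}" for a in node.names]
        else:
            continue
        if any(matches(name) for name in names):
            lines.add(node.lineno)
    return sorted(lines)


example_source = (
    "import os\n"
    "from mypackage import module_b\n"
    "from mypackage.module_b.helpers import thing\n"
    "import mypackage.module_c\n"
)
violations = forbidden_import_lines(example_source, "mypackage.module_b")
print(violations)  # [2, 3]
```

In CI this could be wrapped in a pytest test that reads the source of module_a with pathlib.Path.read_text() and asserts that the returned list is empty.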

Circular imports hell

Python is an extremely elegant language. Well, except... except imports. I still can't get them to work the way that seems natural to me.
I have a class MyObjectA which is in file mypackage/myobjecta.py. This object uses some utility functions which are in mypackage/utils.py. So in my first lines in myobjecta.py I write:
from mypackage.utils import util_func1, util_func2
But some of the utility functions create and return new instances of MyObjectA. So I need to write in utils.py:
from mypackage.myobjecta import MyObjectA
Well, no I can't. This is a circular import and Python will refuse to do that.
There are many questions here regarding this issue, but none seems to give a satisfactory answer. From what I can read in all the answers:
1) Reorganize your modules, you are doing it wrong! But I do not know how better to organize my modules even in such a simple case as I presented.
2) Try just import ... rather than from ... import ... (personally I hate to write and potentially refactor all the full name qualifiers; I love to see what exactly I am importing into the module from the outside world). Would that help? I am not sure, still there are circular imports.
3) Do hacks like importing something in the inner scope of a function body just one line before you use something from the other module.
I am still hoping there is a solution number 4) which would be Pythonic in the sense of being functional and elegant and simple and working. Or is there not?
Note: I am primarily a C++ programmer, the example above is so easily solved by including the corresponding headers that I can't believe it is not possible in Python.
There is nothing hackish about importing something in a function body, it's an absolutely valid pattern:
def some_function():
    # the import runs when the function is called, not when the module loads
    import logging
    logging.info("doing some logging")
Usually ImportErrors are only raised because of the way __import__() evaluates top level statements of the entire file when called.
In case you do not have a logical circular dependency, nothing is impossible in Python...
There is a way around it if you positively want your imports on top:
From David Beazley's excellent talk Modules and Packages: Live and Let Die! - PyCon 2015, 1:54:00, here is a way to deal with circular imports in Python:
try:
    from images.serializers import SimplifiedImageSerializer
except ImportError:
    import sys
    # Note: sys.modules maps module names to module objects, so this lookup
    # only succeeds if a module is registered under exactly this key.
    SimplifiedImageSerializer = sys.modules[__package__ + '.SimplifiedImageSerializer']
This tries to import SimplifiedImageSerializer and, if an ImportError is raised (due to a circular import or to the name not existing), pulls it from the import cache (sys.modules).
PS: You have to read this entire post in David Beazley's voice.
Don't import mypackage.utils to your main module, it already exists in mypackage.myobjecta. Once you import mypackage.myobjecta the code from that module is being executed and you don't need to import anything to your current module, because mypackage.myobjecta is already complete.
What you want isn't possible. There's no way for Python to know in which order it needs to execute the top-level code in order to do what you ask.
Assume you import utils first. Python will begin by evaluating the first statement, from mypackage.myobjecta import MyObjectA, which requires executing the top level of the myobjecta module. Python must then execute from mypackage.utils import util_func1, util_func2, but it can't do that until it resolves the myobjecta import.
Instead of recursing infinitely, Python resolves this situation by allowing the innermost import to complete without finishing. Thus, the utils import completes without executing the rest of the file, and your import statement fails because util_func1 doesn't exist yet.
The reason import myobjecta works is that it allows the symbols to be resolved later, after the body of every module has executed. Personally, I've run into a lot of confusion even with this kind of circular import, and so I don't recommend using them at all.
If you really want to use a circular import anyway, and you want them to be "from" imports, I think the only way it can reliably work is this: Define all symbols used by another module before importing from that module. In this case, your definitions for util_func1 and util_func2 must be before your from mypackage.myobjecta import MyObjectA statement in utils, and the definition of MyObjectA must be before from mypackage.utils import util_func1, util_func2 in myobjecta.
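The ordering rule above can be tried out on a throwaway package. This is only a sketch under stated assumptions: the package names demo_top and demo_bottom and the helper try_import are hypothetical, and the module bodies are reduced to the bare minimum.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def try_import(pkg, utils_src, myobjecta_src):
    """Create package `pkg` with the two modules and import pkg.utils.

    Returns 'ok' or the name of the exception class that was raised.
    """
    with tempfile.TemporaryDirectory() as tmp:
        pkg_dir = Path(tmp) / pkg
        pkg_dir.mkdir()
        (pkg_dir / "__init__.py").write_text("")
        (pkg_dir / "utils.py").write_text(utils_src.format(pkg=pkg))
        (pkg_dir / "myobjecta.py").write_text(myobjecta_src.format(pkg=pkg))
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            utils = importlib.import_module(f"{pkg}.utils")
            utils.util_func1()  # uses the name imported from the other module
            return "ok"
        except ImportError as exc:
            return type(exc).__name__
        finally:
            sys.path.remove(tmp)
            for name in [n for n in sys.modules if n.split(".")[0] == pkg]:
                del sys.modules[name]


# Imports first: the inner import finds a half-initialised module.
top_result = try_import(
    "demo_top",
    "from {pkg}.myobjecta import MyObjectA\n"
    "def util_func1(): return MyObjectA()\n"
    "def util_func2(): return 2\n",
    "from {pkg}.utils import util_func1, util_func2\n"
    "class MyObjectA: pass\n",
)

# Definitions first, imports last: every name exists when it is requested.
bottom_result = try_import(
    "demo_bottom",
    "def util_func1(): return MyObjectA()\n"
    "def util_func2(): return 2\n"
    "from {pkg}.myobjecta import MyObjectA\n",
    "class MyObjectA: pass\n"
    "from {pkg}.utils import util_func1, util_func2\n",
)

print("imports at top:   ", top_result)
print("imports at bottom:", bottom_result)
```

With the imports at the bottom, either module can be imported first, because each one has already defined the names the other asks for.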
Compiled languages like C# can handle situations like this because the top level is a collection of definitions, not instructions. They don't have to create every class and every function in the order given. They can work things out in whatever order is required to avoid any cycles. (C++ does it by duplicating information in prototypes, which I personally feel is a rather hacky solution, but that's also not how Python works.)
The advantage of a system like Python is that it's highly dynamic. Yes you can define a class or a function differently based on something you only know at runtime. Or modify a class after it's been created. Or try to import dependencies and go without them if they're not available. If you don't feel these things are worth the inconvenience of adhering to a strict dependency tree, that's totally reasonable, and maybe you'd be better served by a compiled language.
Pythonistas frown upon importing from a function. Pythonistas usually frown upon global variables. Yet, I saw both and don't think the projects that used them were any worse than others done by some strict Pythonistas. The feature exists, and I'm not going into a long argument over its utility.
Importing inside a function does have one cost. When you import at the top of a file (or the bottom, really), the import takes some time (a small amount, but some), and Python then caches the module, so if another file needs the same import, Python can retrieve the module quickly without loading it again. If you import inside a function, things get more complicated: Python has to process the import line each time you call the function, which might, in a tiny way, slow your program down.
A solution to this is to cache the module independently. Okay, this uses imports inside function bodies AND global variables. Wow!
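_MODULEA = None
def util1():
    # without this declaration the import below would bind a local name
    # and the "is None" check would raise UnboundLocalError
    global _MODULEA
    if _MODULEA is None:
        from mymodule import modulea as _MODULEA
    obj = _MODULEA.ClassYouWant
    return obj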
I saw this strategy adopted with a project using a flat API. Whether you like it or not (and I'm not sure about that myself), it works and is fast, because the import line is executed only once (when the function first executes). Still, I would recommend restructuring: problems with circular imports show a problem in structure, usually, and this is always worth fixing. I do agree, though, it would be nice if Python provided more useful errors when this kind of situation happens.

Automatically import to all Python files in the given folder?

I am relatively new to Python and am trying to learn the "Pythonic" way of doing things to build a solid foundation in terms of Python development. Perhaps what I want to achieve is not Pythonic at all, but I am nonetheless seeking to find out the "right" way to solve this issue.
I am building an application, for which I am creating modules. I just noticed that a module of mine has 7 different .py files, all importing the same 3 things. So all these files share these imports.
I tried removing them and inserting these imports into the empty __init__.py in the folder, but it did not do the trick.
If possible, since these imports are needed by all these module files, I would not like them to be imported in each file one by one.
What can I do to perform the common import?
Thank you very much, I really appreciate your kind help.
As the Zen of Python states, "Explicit is better than implicit", and this is a good example.
It's very useful to have the dependencies of a module listed explicitly in the imports and it means that every symbol in a file can be traced to its origin with a simple text search. E.g. if you search for some_identifier in your file, you'll either find a definition in the file, or from some_module import some_identifier. It's even more obvious with direct references to some_module.some_identifier. (This is also one reason why you should not do from module import *.)
One thing you could do, without losing the above property, is to import your three shared modules into a fourth module:
#fourth.py
import first
import second
import third
then...
#another.py
import fourth
fourth.first.some_function()
#etc.
If you can't stomach that (it does make calls more verbose, after all) then the duplication of three imports is fine, really.
I agree with DrewV, it is perfectly pythonic to do
File1:
import xyz
import abc
...
File2:
import xyz
An almost identical question has also been addressed in the following link:
python multiple imports for a common module
As it explains, Python does the job of optimising the module load, so you can write multiple import statements and not worry about performance losses, because the module is only loaded once. In fact, listing out all the imports in each file makes it explicitly clear what each file depends on.
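That the module is loaded only once can be checked with a small experiment. A sketch, assuming three hypothetical modules (demo_shared, demo_file1, demo_file2) written to a temporary directory:

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "demo_shared.py").write_text("VALUE = 42\n")
    # Two separate files, each with its own explicit import of the shared module.
    (root / "demo_file1.py").write_text("import demo_shared\n")
    (root / "demo_file2.py").write_text("import demo_shared\n")
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        file1 = importlib.import_module("demo_file1")
        file2 = importlib.import_module("demo_file2")
    finally:
        sys.path.remove(tmp)

shared_once = file1.demo_shared is file2.demo_shared is sys.modules["demo_shared"]
print(shared_once)  # True: both files see the one cached module object
```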
And for a discussion of how imports interact with namespaces, see:
Python imports across modules and global variables

Python module getting too big

My module is all in one big file that is getting hard to maintain. What is the standard way of breaking things up?
I have one module in a file my_module.py, which I import like this:
import my_module
"my_module" will soon be a thousand lines, which is pushing the limits of my ability to keep everything straight. I was thinking of adding files my_module_base.py, my_module_blah.py, etc. And then, replacing my_module.py with
from my_module_base import *
from my_module_blah import *
# etc.
Then, the user code does not need to change:
import my_module # still works...
Is this the standard pattern?
It depends on what your module is actually doing. It is usually a good idea to make your module a directory with an '__init__.py' file inside. So you would first transform your your_module.py into something like your_module/__init__.py.
After that you continue according to your business logic. Here are some examples:
If you have utility functions which are not directly used by the module's API, put them in a file called utils.py.
If you have classes dealing with the database or representing your database models, put them in models.py.
If you have some internal configuration, it might make sense to put it into an extra file called settings.py or config.py.
These are just examples (a little bit stolen from the Django approach of reusable apps ^^). As said, it depends a lot on what your module does. If it is still too big afterwards, it also makes sense to create submodules (as subdirectories with their own __init__.py).
I'm sure there are lots of opinions on this, but I'd say you break it into more well-defined functional units (modules), contained in a package. Then you use:
from mypackage import modulex
Then use the package name to reference the object:
modulex.MyClass()
etc.
You should (almost) never use
from mypackage import *
Since that can introduce bugs (duplicate names from different modules will end up clobbering one another).
No, that is not the standard pattern. from something import * is usually not good practice, as it will import a lot of things you did not intend to. Instead, follow the same approach as you did, but import the names explicitly from one module into another, e.g.:
If base.py defines def myfunc, then in main.py use from base import myfunc, so that main.myfunc works for your users too. Of course, you need to take care that you don't end up with a circular import.
Also, if you find that from something import * is required, control the imported names using the __all__ construct.
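Putting the suggestions together, the split could look like the following. This is a sketch, not the standard layout: the submodule names base and blah echo the question, while base_func, blah_func and _internal_helper are invented for illustration. The package is written to a temporary directory so the example runs on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "my_module"
    pkg.mkdir()
    (pkg / "base.py").write_text(
        "def base_func(): return 'base'\n"
        "def _internal_helper(): return 'private'\n"
    )
    (pkg / "blah.py").write_text("def blah_func(): return 'blah'\n")
    # __init__.py re-exports the public names explicitly instead of using *.
    (pkg / "__init__.py").write_text(
        "from .base import base_func\n"
        "from .blah import blah_func\n"
        "__all__ = ['base_func', 'blah_func']\n"
    )
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        my_module = importlib.import_module("my_module")
        star_namespace = {}
        exec("from my_module import *", star_namespace)
    finally:
        sys.path.remove(tmp)

public_result = my_module.base_func() + "/" + my_module.blah_func()
star_names = sorted(n for n in star_namespace if not n.startswith("__"))
print(public_result)  # base/blah
print(star_names)     # ['base_func', 'blah_func']
```

User code that does import my_module keeps working, and __all__ limits what a star import exposes to the two public functions.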
