circular import dependencies in a package with inheritance - python

I have basically the following setup in my package:
thing.py:
from otherthing import *

class Thing(Base):
    def action(self):
        ...do something with Otherthing()...

subthing.py:
from thing import *

class Subthing(Thing):
    pass

otherthing.py:
from subthing import *

class Otherthing(Base):
    def action(self):
        ... do something with Subthing()...
If I put all objects into one file, it will work, but that file would just become way too big and it'll be harder to maintain. How do I solve this problem?

This is treading into the dreaded Python circular imports argument but, IMHO, you can have an excellent design and still need circular references.
So, try this approach:
thing.py:
class Thing(Base):
    def action(self):
        ...do something with otherthing.Otherthing()...

import otherthing

subthing.py:
import thing

class Subthing(thing.Thing):
    pass

otherthing.py:
class Otherthing(Base):
    def action(self):
        ... do something with subthing.Subthing()...

import subthing
There are a couple of things going on here. First, some background.
Due to the way importing works in Python, a module that is in the process of being imported (but has not finished executing yet) is already registered in sys.modules, so it is considered already imported when later import statements in other modules reference it. You can therefore end up holding a reference to a module object that is still only partially initialized - and if execution of that module has not yet reached the definition of the symbol you need, the lookup will fail and raise an exception.
One way to deal with this is to use "tail imports". The purpose of this technique is to define any symbols that other modules referring to this one might need before potentially triggering the import of those other modules.
Another way to deal with circular references is to switch from from-style imports to plain import statements. How does this help? With a from-style import, the target module is imported and then the symbol named in the from clause is looked up on the module object immediately, at import time.
With a normal import statement, the lookup of the reference is delayed until something does an actual attribute reference on the module. This can usually be pushed down into a function or method which should not normally be executed until all of your importing is complete.
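As a minimal sketch (the module names a and b are made up for illustration), the deferred lookup is what lets the plain-import version survive a circular dependency where a from-style import would raise an ImportError:
a.py:
import b  # binds the (possibly half-initialized) module object; no attribute lookup yet

class A:
    def use_b(self):
        # the attribute lookup happens here, at call time, when b is fully imported
        return b.B()

b.py:
import a

class B:
    def use_a(self):
        return a.A()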
The case where these two techniques don't work is when you have circular references in your class hierarchy. The import has to come before the subclass definition and the attribute representing the super class must be there when the class statement is hit. The best you can do is use a normal import, reference the super class via the module and hope you can rearrange enough of the rest of your code to make it work.
If you are still stuck at that point, another technique that can help is to use accessor functions to mediate the access between one module and another. For instance, if you have class A in one module and want to reference it from another module but can't due to a circular reference, you can sometimes create a third module with a function in it that just returns a reference to class A. If you generalize this into a suite of accessor functions, this doesn't end up as much of a hack as it sounds.
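As a rough sketch of that idea (the module names accessors, module_a and module_b are hypothetical), the accessor module does a deferred import so that neither side has to import the other at module level:
accessors.py:
def get_a():
    # deferred import: only resolved when the accessor is called
    import module_a
    return module_a.A

module_b.py:
import accessors

class B:
    def make_a(self):
        # obtain the class through the accessor, then instantiate it
        return accessors.get_a()()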
If all else fails, you can move import statements into your functions and methods - but I usually leave that as the very last resort.
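For completeness, that last resort looks something like this (a sketch reusing the names from the question, with Base defined elsewhere in the package):
thing.py:
class Thing(Base):
    def action(self):
        # import deferred until the method actually runs,
        # by which time otherthing has been fully imported
        from otherthing import Otherthing
        return Otherthing()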
--- EDIT ---
Just wanted to add something new I discovered recently. In a "class" statement, the super class is actually a Python expression. So, you can do something like this:
>>> b=lambda :object
>>> class A(b()):
... pass
...
>>> a=A()
>>> a
<__main__.A object at 0x1fbdad0>
>>> a.__class__.__mro__
(<class '__main__.A'>, <type 'object'>)
>>>
This allows you to define and import an accessor function to get access to a class from another class definition.
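Sketched out with the accessor-module idea from above and the names from the question (and assuming thing.Thing is already defined by the time the class statement below runs), that looks like:
accessors.py:
def get_thing():
    # deferred import, resolved when the base-class expression is evaluated
    import thing
    return thing.Thing

subthing.py:
from accessors import get_thing

class Subthing(get_thing()):  # the base-class expression runs at class-definition time
    pass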

Stop writing circular imports. It's simple. thing cannot possibly depend on everything that's in otherthing.
1) Search for other questions exactly like yours.
2) Read those answers.
3) Rewrite otherthing so that thing depends on part of otherthing, not all of otherthing.

Related

How to override or replace an imported function call on a different module

I am leveraging an existing Python-based framework for a tool-building activity. Let me get straight into my issue:
Let's say the framework I am using has a module named m1.py containing the following function:
def func_should_not_run(*args, **kwargs):
    <doing something>
And there is another module named m2.py containing the following class:
from m1 import func_should_not_run

class JustAClass:
    def __init__(self, *args, **kwargs):
        <All kind of initialisation..>

    def run_something(self, *args, **kwargs):
        <lots of code before>
        func_should_not_run(*args, **kwargs)
        <lots of code after>
Now my own module, my_mod.py, has the class below, where I create an instance of the framework class above and call m2.JustAClass.run_something inside another method:
class JustAnotherClass:
    def __init__(self, *args, **kwargs):
        <All kind of initialisation..>
        self.obj1 = JustAClass(*some_args, **some_kwargs)

    def run(self):
        <some code before>
        self.obj1.run_something(*some_other_args, **some_other_kwargs)
        <some code after>
Now, due to an implementation issue with m1.func_should_not_run, which gets called inside m2.JustAClass.run_something, I need to replace it with my own function func_should_run, so that whenever func_should_not_run would be called inside m2.JustAClass.run_something, func_should_run from my module is executed instead.
How can I achieve this?
Is there any way I can override the import statement "from m1 import func_should_not_run" in m2.py from my_mod.py?
This solution is risky in some respects and could potentially fail given its side effects, but it is worth mentioning in my opinion.
The idea is to replace (or better, reload) the module that depends on the module you want to change, after some adjustment. I will start with the code and then show the problems and limits of this approach:
from m2 import JustAClass

def func_should_run():
    print('This is the function you want to call')

class JustAnotherClass:
    def __init__(self, *args, **kwargs):
        self.obj1 = JustAClass(*args, **kwargs)

    def run(self):
        self.obj1.run_something()

if __name__ == '__main__':
    import importlib

    importlib.import_module('m1').func_should_not_run = func_should_run
    importlib.reload(importlib.import_module('m2'))

    janc = JustAnotherClass()
    janc.run()
Output:
This is the function you want to call
After importing importlib:
importlib.import_module('m1').func_should_not_run = func_should_run: I am importing module m1 and rebinding func_should_not_run to func_should_run. This means that, for all subsequent calls to func_should_not_run, the code executed is that of func_should_run. Obviously, this does not affect objects that still reference the old func_should_not_run, like m2.JustAClass, so
importlib.reload(importlib.import_module('m2')): here I am reloading module m2, which will pick up the new version of func_should_not_run because module m1 is already in the module cache (i.e. sys.modules) and therefore is not reloaded (for the same reason, transitive reloading does not happen unless you trigger it explicitly).
From now on, every instance of JustAnotherClass correctly calls func_should_run
Should you use importlib.reload() for this?
Typically, reloading a module is useful when you have applied changes to it and do not want to restart the whole system to see those changes. In your case, unless you have all the risks of this approach clearly in mind, you are somewhat abusing reload().
What are the main side-effects of this solution?
For a start, reloading has its costs, especially if the module contains initialization code that you do not want to re-execute. This means:
You are inevitably going to execute the module code twice (at least)
Be sure to comment your code to explain that every occurrence of func_should_not_run is actually replaced with func_should_run; even so, this is definitely not good practice and is not maintainable if used in many places.
To conclude, it is as simple as it is risky: a solution that can be adopted with all the necessary precautions, in the awareness that it is a hack and not a reasonable design decision.
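If the reload bothers you, a lighter-weight variant (a sketch under the same m1/m2 layout as above, assuming JustAClass can be constructed without arguments) is to rebind the name directly in m2's namespace, since that is the binding run_something actually resolves at call time:
import m2

def func_should_run(*args, **kwargs):
    print('This is the function you want to call')

# m2 did "from m1 import func_should_not_run", which created a name inside m2;
# rebinding that name is enough, and no reload is required
m2.func_should_not_run = func_should_run

obj = m2.JustAClass()
obj.run_something()  # now calls func_should_run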

Misunderstanding differences between inside-class and outside-class imports in Python [duplicate]

This question already has answers here:
Short description of the scoping rules?
(9 answers)
Closed 1 year ago.
Context: I'm writing a translator from one Python API to another, both in Python 3.5+. I load the file to be translated with a class named FileLoader, defined in FileLoader.py. This file loader allows me to transfer the file's content to the other classes doing the translation job.
All of the .py files describing each class are in the same folder.
I tried two different ways to import my FileLoader module inside the other modules containing the classes doing the translation job. One seems to work, but the other didn't and I don't understand why.
Here are two code examples illustrating both ways:
The working way
import FileLoader

class Parser:
    #
    def __init__(self, fileLoader):
        if isinstance(fileLoader, FileLoader.FileLoader):
            self._fileLoader = fileLoader
        else:
            # raise a nice exception
The crashing way
class Parser:
    import FileLoader
    #
    def __init__(self, fileLoader):
        if isinstance(fileLoader, FileLoader.FileLoader):
            self._fileLoader = fileLoader
        else:
            # raise a nice exception
I thought doing the import inside the class's scope (where it's the only scope FileLoader is used) would be enough, since it would know how to relate to the FileLoader module and its content. I'm obviously wrong since it's the first way which worked.
What am I missing about scopes in Python? Or is it about something different?
Two things: this won't work, and there is no benefit to doing it this way.
First, why not?
class Parser:
    # this assigns to the Parser namespace; to refer to it
    # within a method you need to use `self.FileLoader` or
    # `Parser.FileLoader`
    import FileLoader

    # `FileLoader` works fine here, under the Parser indentation
    # (in its namespace, but outside of any method)
    copy_of_FileLoader = FileLoader

    #
    def __init__(self, fileLoader):
        # you need to refer to modules in the Parser namespace
        # with that `self`, just like you would with any other
        # class or instance variable 👇
        if isinstance(fileLoader, self.FileLoader.FileLoader):
            self._fileLoader = fileLoader
        else:
            # raise a nice exception
            pass

    # works here again, since we are outside of any method,
    # in `Parser` scope/indent.
    copy2_of_FileLoader = FileLoader
Second, it's not Pythonic and it doesn't help.
Customary for the Python community would be to put import FileLoader at the top of the program. Since it seems to be one of your own modules, it would go after std library imports and after third party module imports. You would not put it under a class declaration.
Unless... you had a good (probably bad, actually) reason to.
My own code, and this doesn't reflect all that well on me, sometimes has stuff like:
class MainManager(batchhelper.BatchManager):
    ....
    def _load(self, *args, **kwargs):
        # 👉
        from pssystem.models import NotificationConfig
So, after stating this wasn't a good thing, why am I doing this?
Well, there are some specific circumstances to my code going here. This is a batch, command-line, script, usable within a Django context and it uses some Django ORM models. In order for those to be used, Django needs to be imported first and then setup. But that often happens too early in the context of these types of batch programs and I get circular import errors, with Django complaining that it hasn't initialized yet.
The solution? Defer execution until the method is called, when all the other modules have been imported and Django has been setup elsewhere.
NotificationConfig is now available, but only within that method as it is a local variable in it. It works, but... it's really not great practice.
Remember: anything at module global scope gets executed at module load time, anything directly under a class statement also runs at module load time, and anything within method/function bodies runs only when the method/function is called.
# happens at module load time, you could have circular import errors
import X1

class DoImportsLater:
    .

    # happens at module load time, you could have circular import errors
    import X2

    def _load(self, *args, **kwargs):
        # only happens when this method is called, if ever,
        # so you shouldn't be seeing circular imports
        import X3
import X1 is std practice, Pythonic.
import X2, what you are doing, is not and doesn't help.
import X3, what I did, is a hack and is covering up circular import references. But it "fixes" the issue.

How to mock Python classes when nested several dependencies deep

If I have the following architecture...
Please note the edits below. It occurred to me (after some recent refactoring) that there are actually three classes in three different files. Sorry that the file/class names are getting ridiculous. I assure you those are not the real names. :)
main_class.py
class MainClass(object):
    def do_some_stuff(self):
        dependent_class = DependentClass()

dependent_class.py
class DependentClass(object):
    def __init__(self):
        dependent_dependent_class = DependentDependentClass()
        dependent_dependent_class.do_dependent_stuff()

dependent_dependent_class.py
class DependentDependentClass(object):
    def do_dependent_stuff(self):
        print "I'm gonna do production stuff that I want to mock"
        print "Like access a database or interact with a remote server"

class MockDependentDependentClass(object):
    def do_dependent_stuff(self):
        print "Respond as if the production stuff was all successful."
and I want to call main_class.do_some_stuff during testing but, during its execution, have instances of DependentDependentClass replaced with MockDependentDependentClass, how can I do that Pythonically using best practices?
Currently, the best thing I could come up with is to conditionally instantiate one class or the other based on the presence/value of an environment variable. It certainly works but is pretty dirty.
I spent some time reading about the unittest.mock and mock.patch functions and they seem like they might be able to help but each description that I could wrap my head around seemed to be a little different than my actual use case.
The key is that I don't want to define mock return values or attributes but that I want the namespace changed, globally, I guess, such that when my application thinks it is instantiating DependentClass it is actually instantiating MockDependentClass.
The fact that I can't find any examples of anyone doing exactly this means one of two things:
It's because I'm doing it in a very dumb/naive way.
I'm doing something so genius that no one else has ever encountered it.
... I assume it's number 1...
Full disclosure, unit testing is not something with which I am skilled. It's an effort that my internal tools development team is trying to catch up to step our game up a bit. It's possible that I'm not thinking about testing correctly.
Any thoughts would be most welcome. Thank you, in advance!
SOLUTION!!!
Thanks to @de1 for the help. Given my clever architecture shown above, the following accomplishes what I want.
The following code is located in main_class.py:
from mock import patch  # or: from unittest.mock import patch

import dependent_class
from dependent_dependent_class import MockDependentDependentClass

with patch.object(dependent_class, "DependentDependentClass", MockDependentDependentClass):
    main_class = MainClass()
    main_class.do_some_stuff()
The code seems to (and hell if I know how it's doing this) manipulate the namespace within the module dependent_class so that, while inside the with block (that's a context manager for anyone who is hung up on that part) anything referring to the class object DependentDependentClass will actually be referencing MockDependentDependentClass.
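Conceptually (a simplified sketch, not mock's real implementation; MainClass is assumed to be defined in the surrounding main_class.py, as in the snippet above), patch.object here is roughly equivalent to swapping the attribute on the dependent_class module object and restoring it when the block exits:
import dependent_class
from dependent_dependent_class import MockDependentDependentClass

original = dependent_class.DependentDependentClass
dependent_class.DependentDependentClass = MockDependentDependentClass  # swap in the mock
try:
    main_class = MainClass()
    main_class.do_some_stuff()
finally:
    dependent_class.DependentDependentClass = original  # always restore the real class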
The mock module does indeed seem to be a good fit in this case. You can specify the mock (your mock) to use when calling the various patch methods.
If you are importing only the class rather than the module, you can patch the imported DependentDependentClass in DependentClass:
from . import DependentClass as dependent_class
from .DependentDependentClass import MockDependentDependentClass

with patch.object(dependent_class, 'DependentDependentClass', MockDependentDependentClass):
    # do something while class is patched
Alternatively:
with patch('yourmodule.DependentClass.DependentDependentClass', MockDependentDependentClass):
    # do something while class is patched
or the following, which will only work if you access the class via a module or import it after it has been patched:
with patch('yourmodule.DependentDependentClass.DependentDependentClass', MockDependentDependentClass):
    # do something while class is patched
Just bear in mind which object is being patched, and when.
Note: you might find it less confusing naming your files in lower case, slightly different to the embedded class(es).
Note 2: If you need to mock a dependency of a dependency of the module under test then it might suggest that you are not testing at the right level.

Proper use of `isinstance(obj, class)`

As I write it, it seems almost surreal to me that I'm actually experiencing this problem.
I have a list of objects. Each of these objects is an instance of an Individual class that I wrote.
Thus, conventional wisdom says that isinstance(myObj, Individual) should return True. However, this was not the case. So I thought that there was a bug in my programming, and printed type(myObj), which to my surprise printed instance and myObj.__class__ gave me Individual!
>>> type(pop[0])
<type 'instance'>
>>> isinstance(pop[0], Individual) # with all the proper imports
False
>>> pop[0].__class__
Genetic.individual.Individual
I'm stumped! What gives?
EDIT: My Individual class
from itertools import count

class Individual:
    ID = count()

    def __init__(self, chromosomes):
        # managed as a list, as order is used to identify chromosomal functions
        # (i.e. chromosome i encodes functionality f)
        self.chromosomes = chromosomes[:]
        self.id = self.ID.next()

    # other methods
This error indicates that the Individual class somehow got created twice. You created pop[0] with one version of Individual, and are checking isinstance against the other one. Although they are pretty much identical, Python doesn't know that, and isinstance fails. To verify this, check whether pop[0].__class__ is Individual evaluates to False.
Normally classes don't get created twice (unless you use reload) because modules are imported only once, and all class objects effectively remain singletons. However, using packages and relative imports can leave a trap that leads to a module being imported twice. This happens when a script (started with python bla, as opposed to being imported from another module with import bla) contains a relative import. When running the script, Python doesn't know that its imports refer to the Genetic package, so it processes them as absolute, creating a top-level individual module with its own individual.Individual class. Another module then correctly imports the Genetic package, which ends up importing Genetic.individual, and that results in the creation of the doppelganger, Genetic.individual.Individual.
To fix the problem, make sure that your script only uses absolute imports, such as import Genetic.individual even if a relative import like import individual appears to work just fine. And if you want to save on typing, use import Genetic.individual as individual. Also note that despite your use of old-style classes, isinstance should still work, since it predates new-style classes. Having said that, it would be highly advisable to switch to new-style classes.
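A quick way to confirm the double import (a diagnostic sketch, assuming the package layout described above) is to look for two module entries backed by the same file in sys.modules:
import sys

for name, mod in list(sys.modules.items()):
    if mod is not None and 'individual' in name.lower():
        # works under both Python 2 and 3
        print(name + ' -> ' + str(getattr(mod, '__file__', None)))
# seeing both 'individual' and 'Genetic.individual' backed by the same file
# means the Individual class really has been created twice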
You need to use new-style classes, which inherit from object:
class ClassName(object):
    pass

From your example, you are using old-style classes, which do not:
class Classname:
    pass
EDIT: As @user4815162342 said,
>>> type(pop[0])
<type 'instance'>
is caused by using an old-style class, but this is not the cause of your issues with isinstance. You should instead make sure you don't create the class in more than one place, or if you do, use distinct names. Importing it more than once should not be an issue.

Python namespaces: How to make unique objects accessible in other modules?

I am writing a moderate-sized (a few KLOC) PyQt app. I started out writing it in nice modules for ease of comprehension but I am foundering on the rules of Python namespaces. At several points it is important to instantiate just one object of a class as a resource for other code.
For example: an object that represents Aspell attached as a subprocess, offering a check(word) method. Another example: the app features a single QTextEdit and other code needs to call on methods of this singular object, e.g. "if theEditWidget.document().isEmpty()..."
No matter where I instantiate such an object, it can only be referenced from code in that module and no other. So e.g. the code of the edit widget can't call on the Aspell gateway object unless the Aspell object is created in the same module. Fine except it is also needed from other modules.
In this question the bunch class is offered, but it seems to me a bunch has exactly the same problem: it's a unique object that can only be used in the module where it's created. Or am I completely missing the boat here?
OK, as suggested elsewhere, this seems like a simple answer to my problem. I just tested the following:
junk_main.py:
import junk_A
singularResource = junk_A.thing()
import junk_B
junk_B.handle = singularResource
print junk_B.look()

junk_A.py:
class thing():
    def __init__(self):
        self.member = 99

junk_B.py:
def look():
    return handle.member
When I run junk_main it prints 99. So the main code can inject names into modules just by assignment. I am trying to think of reasons this is a bad idea.
You can access objects in a module with the . operator, just as you would a function. So, for example:
# Module a.py
a = 3
>>> import a
>>> print a.a
3
This is a trivial example, but you might want to do something like:
# Module EditWidget.py
theEditWidget = EditWidget()
...
# Another module
import EditWidget
if EditWidget.theEditWidget.document().isEmpty():
Or...
from EditWidget import *
if theEditWidget.document().isEmpty():
If you do go the from EditWidget import * route, you can even define a list named __all__ in your modules with the names (as strings) of all the objects you want your module to export to *. So if you wanted only theEditWidget to be exported, you could do:
# Module EditWidget.py
__all__ = ["theEditWidget"]
theEditWidget = EditWidget()
...
It turns out the answer is simpler than I thought. As I noted in the question, the main module can add names to an imported module. And any code can add members to an object. So the simple way to create an inter-module communication area is to create a very basic object in the main module, say IMC (for inter-module communicator), and assign to it, as members, anything that should be available to other modules:
IMC.special = A.thingy()
IMC.important_global_constant = 0x0001
etc. After importing any module, just assign IMC to it:
import B
B.IMC = IMC
Now, this is probably not the greatest idea from a software design standpoint. If you just limit IMC to holding named constants, it acts like a C header file. If it's just there to give access to singular resources, it's like an extern declaration. But because of Python's liberal rules, code in any module can modify or add members to IMC. Used in an undisciplined way, "who changed that?" could become a debugging issue. If there are multiple processes, race conditions are a danger.
At several points it is important to instantiate just one object of a class as a resource for other code.
Instead of trying to create some sort of singleton factory, can you not create the single-use object somewhere between the main point of entry for the program and instantiating the object that needs it? The single-use object can just be passed as a parameter to the other object. Logically, then, you won't create the single-use object more than once.
For example:
def main(...):
    aspell_instance = ...
    myapp = MyAppClass(aspell_instance)
or...
class SomeWidget(...):
    def __init__(self, edit_widget):
        self.edit_widget = edit_widget

    def onSomeEvent(self, ...):
        if self.edit_widget.document().isEmpty():
            ....
I don't know if that's clear enough, or if it's applicable to your situation. But to be honest, the only time I've found I can't do this is in a CherryPy-based webserver, where the points of entry were pretty much everywhere.
