I have file A.py:
...
from System.IO import Directory
...
execfile('B.py')
and file B.py:
...
result = ['a','b'].Contains('a')
This combination works fine.
But if I comment out that import line in A.py, B.py complains:
AttributeError: 'list' object has no attribute 'Contains'
This seems strange to me, especially since B.py runs fine on its own.
Is there some 'list' override in the System.IO module? Is there a way to determine what changes during this import, or to avoid this strange behavior?
B.py on its own (on IPy 2.7.4) also results in the error you provided. It should, as there is no built-in list method that would bind to that call. You could also try to reproduce this on standard Python to make sure.
The way you "include" B.py into A.py has inherent risks, not least because you lose control over the scope in which the code in B is executed. If you wrap your code in a proper module or class, you can be sure there are no issues of that sort.
This could (simplified, just a function, no classes etc.) look like:
from B import getResult
getResult()
from System.IO import Directory
getResult()
and
def getResult():
    from System.IO import Directory
    result = ['a','b'].Contains('a')
    return result
As you can see the function getResult can ensure that all imports and other scope properties are correct and it will work whether or not the call site has the given import.
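To make the scope issue concrete, here is a minimal sketch in standard Python 3, where exec on source text plays the role of execfile. The names b_source and helper are hypothetical stand-ins, not part of the original files. The point it illustrates is only that executed code works when the namespace it runs against happens to provide the names it uses, and fails otherwise.

```python
# Hypothetical stand-in for the contents of B.py: it relies on a name
# ("helper") that it never defines or imports itself.
b_source = "result = helper('a')"

# A namespace that happens to provide the name, like A.py with its import.
namespace_with_helper = {"helper": str.upper}
exec(b_source, namespace_with_helper)
print(namespace_with_helper["result"])  # A

# The same source run against a namespace without the name fails.
try:
    exec(b_source, {})
    failed = False
except NameError:
    failed = True
print(failed)  # True
```

A function in a real module avoids this because it carries its own imports with it.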
Why importing System.IO.Directory causes the call to bind to IronPython.dll!IronPython.Runtime.List.Contains(object value) is not clear; answering that would require a close look at IronPython's internal implementation.
If you mind the actual import of System.IO.Directory but don't want to refactor the code, you could go with LINQ for IronPython/ImportExtensions, as that provides a Contains method as well. This will not reduce but actually increase your import boilerplate, but it is more to the point.
import clr
clr.AddReference("System.Core")
import System
clr.ImportExtensions(System.Linq)
result = ['a','b'].Contains('a')
I need to use a utility function that exists in an external module I do not control. The module has many imports that are not used by the specific function I care about. I don't want to install those unnecessary dependencies (and especially not bloat my package's dependency list).
It doesn't seem possible to use the function without the imports. I wondered if it's possible to mock the module? I had briefly hoped this would work:
""" module_a """
import unused_module
def f():
    return something_of_value
""" module_b: needs to use module_a.f() but doesn't have unused_module installed. """
import mock_module as unused_module
global unused_module
import module_a
module_a.f()
But I couldn't inject my mock of unused_module. Is there a workaround I could use here?
What about this? We're exploiting the module caching system, where modules only get imported once and get stored in sys.modules which is a dict.
utili.py (which I should have called module_a.py)
""" module_a """
import unused_module
def f():
    return "something_of_value"
Then the client code that needs to use your function.
user.py
import sys
# We're not really doing anything here, just
# slotting a fake entry into sys.modules
# so that it won't bother to "reimport" unused_module.
sys.modules["unused_module"] = 1
from utili import f
print(f"{f()=}")
output of executing user.py:
python user.py
f()='something_of_value'
And, to be clear, I don't have an actual unused_module anywhere, so the import fails if utili.py is run on its own:
py utili.py
Traceback (most recent call last):
  File "/Users/myuser/explore/test_384_mockmo/utili.py", line 2, in <module>
    import unused_module
ModuleNotFoundError: No module named 'unused_module'
Things will also not go well if any code tries to use unused_module, so your f definitely can't require its actual existence. The same goes for any code that gets executed when utili.py is loaded. You could, however, assign a mock of some sort instead of the literal 1.
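As a rough sketch of the mock variant, the stub below uses unittest.mock.MagicMock in place of the literal 1, so that attribute access and calls made at import time are absorbed instead of failing. To keep the example self-contained, module_a is built in memory from a source string; that source, and the unused_module.setup() call inside it, are hypothetical and only there to show an import-time use of the missing dependency.

```python
import sys
import types
from unittest import mock

# Hypothetical stand-in for the missing dependency. A MagicMock accepts
# arbitrary attribute access and calls, unlike the integer 1.
sys.modules["unused_module"] = mock.MagicMock(name="unused_module")

# Throwaway in-memory version of module_a (assumed content, for illustration).
module_a_source = '''
import unused_module
unused_module.setup()  # import-time use, absorbed by the mock

def f():
    return "something_of_value"
'''
module_a = types.ModuleType("module_a")
exec(module_a_source, module_a.__dict__)

print(module_a.f())  # something_of_value
```

The same caveat applies as before: anything f actually needs from unused_module will silently get mock objects back rather than real results.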
I'm working with a project that contains about 30 unique modules. It wasn't designed too well, so it's common that I create circular imports when adding some new functionality to the project.
Of course, when I add the circular import, I'm unaware of it. Sometimes it's pretty obvious I've made a circular import when I get an error like AttributeError: 'module' object has no attribute 'attribute' where I clearly defined 'attribute'. But other times, the code doesn't throw exceptions because of the way it's used.
So, to my question:
Is it possible to programmatically detect when and where a circular import is occurring?
The only solution I can think of so far is to have a module importTracking that contains a dict importingModules, a function importInProgress(file), which increments importingModules[file], and throws an error if it's greater than 1, and a function importComplete(file) which decrements importingModules[file]. All other modules would look like:
import importTracking
importTracking.importInProgress(__file__)
#module code goes here.
importTracking.importComplete(__file__)
But that looks really nasty, there's got to be a better way to do it, right?
To avoid having to alter every module, you could put your import-tracking functionality in an import hook, or in a customized __import__ installed in the built-ins. The latter, for once, might work better, because __import__ gets called even if the module being imported is already in sys.modules, which is the case during circular imports.
For the implementation I'd simply use a set of the modules "in the process of being imported", something like (benjaoming edit: Inserting a working snippet derived from original):
beingimported = set()
originalimport = __import__
def newimport(modulename, *args, **kwargs):
    if modulename in beingimported:
        print "Importing in circles", modulename, args, kwargs
        print " Import stack trace -> ", beingimported
        # sys.exit(1) # Normally exiting is a bad idea.
    beingimported.add(modulename)
    result = originalimport(modulename, *args, **kwargs)
    if modulename in beingimported:
        beingimported.remove(modulename)
    return result
import __builtin__
__builtin__.__import__ = newimport
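The snippet above is Python 2 (print statement, __builtin__ module). A comparable sketch for Python 3 might look like the following. It is an adaptation under assumptions, not the original author's code: it keeps a list instead of a set so the import chain can be reported in order, it only compares the names passed to __import__ (so relative imports are not resolved), and the modules cyc_a and cyc_b are hypothetical throwaway files created just for the demonstration.

```python
import builtins
import importlib
import sys
import tempfile
from pathlib import Path

in_progress = []   # stack of module names whose import has started
cycles = []        # (stack snapshot, repeated module name) pairs
original_import = builtins.__import__

def tracking_import(name, globals=None, locals=None, fromlist=(), level=0):
    # A name already on the stack is being imported again before its
    # first import finished, which indicates a circular import.
    if name in in_progress:
        cycles.append((tuple(in_progress), name))
    in_progress.append(name)
    try:
        return original_import(name, globals, locals, fromlist, level)
    finally:
        in_progress.pop()

# Demo: two hypothetical modules that import each other.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "cyc_a.py").write_text("import cyc_b\n")
    Path(tmp, "cyc_b.py").write_text("import cyc_a\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    builtins.__import__ = tracking_import
    try:
        import cyc_a
    finally:
        # Always restore the real import function and the path.
        builtins.__import__ = original_import
        sys.path.remove(tmp)

for stack, name in cycles:
    print("circular import of", name, "while importing", " -> ".join(stack))
```

Recording the cycles rather than exiting keeps the hook usable in a test run, where you may want to see all of them at once.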
Not all circular imports are a problem, as you've found when an exception is not thrown.
When they are a problem, you'll get an exception the next time you try to run any of your tests. You can change the code when this happens.
I don't see any change required from this situation.
Example of when it's not a problem:
a.py
import b
a = 42
def f():
    return b.b
b.py
import a
b = 42
def f():
    return a.a
Circular imports in Python are not like PHP includes.
Python imported modules are loaded the first time into an import "handler", and kept there for the duration of the process. This handler assigns names in the local namespace for whatever is imported from that module, for every subsequent import. A module is unique, and a reference to that module name will always point to the same loaded module, regardless of where it was imported.
So if you have a circular module import, the loading of each file will happen once, and then each module will have names relating to the other module created into its namespace.
There could of course be problems when referring to specific names within both modules (when the circular imports occur BEFORE the class/function definitions that are referenced in the imports of the opposite modules), but you'll get an error if that happens.
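The problematic case described above can be reproduced with a small sketch. The modules early_x and early_y are hypothetical and written to a temporary directory purely for the demonstration; each one asks for a name from the other before defining its own, so on Python 3 the import is expected to fail with an ImportError.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Each module requests a name from the other BEFORE defining its own.
    Path(tmp, "early_x.py").write_text(
        "from early_y import y_value\nx_value = 1\n")
    Path(tmp, "early_y.py").write_text(
        "from early_x import x_value\ny_value = 2\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import early_x
        error = None
    except ImportError as exc:
        # early_x was only partially initialized when early_y asked for x_value.
        error = exc
    finally:
        sys.path.remove(tmp)

print(type(error).__name__, error)
```

Moving the name definitions above the import lines, or switching to plain import early_y with attribute access at call time, is the kind of change that makes the error go away.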
import uses __builtin__.__import__(), so if you monkeypatch that then every import everywhere will pick up the changes. Note that a circular import is not necessarily a problem though.
main:
import fileb
favouritePizza = "pineapple"
fileb.eatPizza
fileb:
from main import favouritePizza
def eatPizza():
    print("favouritePizza")
This is what I tried, however, I get no such attribute. I looked at other problems and these wouldn't help.
This is a classic example of a circular dependency. main imports fileb, while fileb requires main.
Your case is hard (impossible?) to solve even in theory. In reality, the Python import machinery does something even less expected: every time you import anything from a module, the whole module is read and imported into the global (per-process) module namespace. In fact, from module import function is just syntactic sugar that lets you avoid littering your namespace with everything from a particular module (as from module import * would); behind the scenes, it is almost the same as import module; module.function(...).
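A quick way to check the claim that a from-import still loads the whole module is to look at sys.modules afterwards. This sketch uses the standard-library module colorsys only as a convenient example; any module would do.

```python
import sys
from colorsys import rgb_to_hsv

# The whole module was loaded and cached, even though only one name was requested.
print("colorsys" in sys.modules)  # True

# The imported name is the very same object as the attribute on the cached module.
print(sys.modules["colorsys"].rgb_to_hsv is rgb_to_hsv)  # True

# Whether the module name itself is bound here depends only on this namespace;
# the from form does not bind it.
print("colorsys" in globals())
```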
From a composition/architecture point of view, the basic program structure is:
top modules import bottom
bottom modules get parameters from top when being called
You probably want to use the favouritePizza variable somehow in fileb, i.e. in the eatPizza function. A good approach is to make this function accept a parameter that is passed in from wherever it is used:
# fileb.py
def eatPizza(name):
    print(name)
And call it with
# main.py
import fileb
favouritePizza = "pineapple"
fileb.eatPizza(favouritePizza)
I have a testing environment to try to understand how python circular dependencies can be avoided importing the modules with an import x statement, instead of using a from x import y:
test/
    __init__.py
    testing.py
    a/
        __init__.py
        m_a.py
    b/
        __init__.py
        m_b.py
The files have the following content:
testing.py:
from a.m_a import A
m_a.py:
import b.m_b
print b.m_b
class A:
    pass
m_b.py:
import a.m_a
print a.m_a
class B:
    pass
There is a situation which I can't understand:
If I remove the print statements from modules m_a.py and m_b.py or only from m_b.py this works OK, but if the print is present at m_b.py, then the following error is thrown:
  File "testing.py", line 1, in <module>
    from a.m_a import A
  File "/home/enric/test/a/m_a.py", line 1, in <module>
    import b.m_b
  File "/home/enric/test/b/m_b.py", line 3, in <module>
    print a.m_a
AttributeError: 'module' object has no attribute 'm_a'
Do you have any ideas?
It only "works" with the print statements removed because you're not actually doing anything that depends on the imports. It's still a broken circular import.
Either run this in the debugger, or add a print statement after each line, and you'll see what happens:
testing.py: from a.m_a import A
a.m_a: import b.m_b
b.m_b: import a.m_a
b.m_b: print a.m_a
It's clearly trying to access a.m_a before the module finished importing. (In fact, you can see the rest of a.m_a on the stack in your backtrace.)
If you dump out sys.modules at this point, you'll find two partial modules named a and a.m_a, but if you dir(a), there's no m_a there yet.
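The state described above can be observed directly. The sketch below is a Python 3 adaptation (the question uses Python 2.7), with hypothetical package names pkg_a and pkg_b created in a temporary directory. Instead of printing a.m_a, the inner module records what it can see at that moment, so the import completes and the observations can be inspected afterwards. On CPython 3 the submodule should already be in sys.modules while not yet being an attribute of its package.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for pkg in ("pkg_a", "pkg_b"):
        (root / pkg).mkdir()
        (root / pkg / "__init__.py").write_text("")
    (root / "pkg_a" / "m_a.py").write_text(
        "import pkg_b.m_b\n"
        "class A:\n"
        "    pass\n")
    (root / "pkg_b" / "m_b.py").write_text(
        "import sys\n"
        "import pkg_a.m_a\n"
        "# Record the state while pkg_a.m_a is still only partially imported.\n"
        "in_sys_modules = 'pkg_a.m_a' in sys.modules\n"
        "bound_on_package = hasattr(pkg_a, 'm_a')\n"
        "class B:\n"
        "    pass\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        from pkg_a.m_a import A
        import pkg_b.m_b as m_b
    finally:
        sys.path.remove(tmp)

print("in sys.modules during the cycle:", m_b.in_sys_modules)
print("attribute on package during the cycle:", m_b.bound_on_package)
```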
As far as I can tell, the fact that m_a doesn't get added to a until m_a.py finishes evaluating is not documented anywhere in the Python 2.7 documentation. (3.x has a much more complete specification of the import process, but it's also a very different import process.) So, you can't rely on this either failing or succeeding; either one is perfectly legal for an implementation. (But it happens to fail in at least CPython and PyPy…)
More generally, using import foo instead of from foo import bar doesn't magically solve all circular-import problems. It just solves one particular class of circular-import problems (or, rather, makes that class moot). (I realize there is some misleading text in the FAQ about this.)
There are various tricks to work around circular imports while still letting you have circular top-level dependencies. But really, it's almost always simpler to get rid of the circular top-level dependencies.
In this toy case, there's really no reason for a.m_a to depend on b.m_b at all. If you need something that prints out a.m_a, there are better ways to get it than from a completely independent package!
In real-life code, there probably is some stuff in m_a that m_b needs and vice-versa. But usually, you can separate it out into two levels: stuff in m_a that needs m_b, and stuff in m_a that's needed by m_b. So, just split it into two modules. It's really the same thing as the common fix for a bunch of modules that try to reach back up and import main: split a utils off main.
What if there really is something that m_b needs from m_a, that also needs m_b? Well, in that case, you may have to insert a level of indirection. For example, maybe you can pass the thing-from-m_b into the function/constructor/whatever from m_a, so it can access it as a local parameter value instead of as a global. (It's hard to be more specific without a more specific problem.)
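As a very small illustration of that kind of indirection, the sketch below passes the object in as a parameter instead of importing its module at the top level. Both definitions live in one file here only to keep the example runnable; B stands in for something defined in m_b and summarize for something defined in m_a, and both names are hypothetical.

```python
# Stand-in for something that would live in m_b (hypothetical).
class B:
    def describe(self):
        return "B instance"

# Stand-in for code that would live in m_a (hypothetical). It receives its
# collaborator as an argument, so m_a would need no module-level import of m_b.
def summarize(collaborator):
    return "A uses " + collaborator.describe()

print(summarize(B()))  # A uses B instance
```

Only the caller, which sits above both modules, needs to import both of them.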
If worst comes to worst, and you can't remove the import via indirection, you have to move the import out of the way. That may again mean doing an import inside a function call, etc. (as explained in the FAQ immediately after the paragraph that set you off), or just moving some code above the import, or all kinds of other possibilities. But consider these last-ditch solutions to something which just can't be designed cleanly, not a roadmap to follow for your designs.
Take the following code example:
File package1/__init__.py:
from moduleB import foo
print moduleB.__name__
File package1/moduleB.py:
def foo(): pass
Then from the current directory:
>>> import package1
package1.moduleB
This code works in CPython. What surprises me is that the from ... import statement in __init__.py makes the moduleB name visible. According to the Python documentation, this should not be the case:
The from form does not bind the module name
Could someone please explain why CPython works that way? Is there any documentation describing this in detail?
The documentation misled you as it is written to describe the more common case of importing a module from outside of the parent package containing it.
For example, using "from example import submodule" in my own code, where "example" is some third party library completely unconnected to my own code, does not bind the name "example". It does still import both the example/__init__.py and example/submodule.py modules, create two module objects, and assign example.submodule to the second module object.
But, "from..import" of names from a submodule must set the submodule attribute on the parent package object. Consider if it didn't:
1. package/__init__.py executes when package is imported.
2. That __init__ does "from submodule import name".
3. At some point later, other completely different code does "import package.submodule".
At step 3, either sys.modules["package.submodule"] doesn't exist, in which case loading it again will give you two different module objects in different scopes; or sys.modules["package.submodule"] will exist but "submodule" won't be an attribute of the parent package object (sys.modules["package"]), and "import package.submodule" will do nothing. However, if it does nothing, the code using the import cannot access submodule as an attribute of package!
Theoretically, how importing a submodule works could be changed if the rest of the import machinery was changed to match.
If you just need to know what importing a submodule S from package P will do, then in a nutshell:
1. Ensure P is imported, or import it otherwise. (This step recurses to handle "import A.B.C.D".)
2. Execute S.py to get a module object. (Skipping details of .pyc files, etc.)
3. Store module object in sys.modules["P.S"].
4. setattr(sys.modules["P"], "S", sys.modules["P.S"])
5. If that import was of the form "import P.S", bind "P" in local scope.
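The caching and setattr steps in the list above can be checked against a standard-library package. This sketch uses email.mime.text purely as a convenient example of a submodule S inside a package P; nothing about it is specific to that package.

```python
import sys
from email.mime.text import MIMEText

parent = sys.modules["email.mime"]          # the package P
submodule = sys.modules["email.mime.text"]  # the submodule S, cached as "P.S"

# The submodule was also set as an attribute on its parent package.
print(parent.text is submodule)        # True

# The requested name comes from that same module object.
print(submodule.MIMEText is MIMEText)  # True

# The from form binds only the requested name; whether "email" is bound
# here depends on what else this namespace has imported.
print("email" in globals())
```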
This is because __init__.py is what becomes the package1 module object at runtime, so every .py file in the package is defined as a submodule of it, and rewriting __all__ makes no difference. If you create another file, e.g. example.py, and fill it with the same code as __init__.py, it will raise a NameError.
I think the CPython runtime uses a special lookup algorithm for variables in __init__.py that differs from other Python files, maybe something like this:
looking for variable named "moduleB"
if not found:
    if __file__ == '__init__.py':  # don't raise NameError, look for a file named moduleB.py
        if current dir contains file named "moduleB.py":
            import moduleB
        else:
            raise NameError