I have classes which require dependencies in order to be instantiated but are otherwise optional. I'd like to lazily import the dependencies and fail to instantiate the class if they aren't available. Note that these dependencies are not required at the package level (otherwise they'd be mandatory via setuptools). I currently have something like this:
class Foo:
    def __init__(self):
        try:
            import module
        except ImportError:
            raise ModuleNotFoundError("...")

    def foo(self):
        import module
Because this try/except pattern is common, I'd like to abstract it into a lazy importer. Ideally if module is available, I won't need to import it again in Foo.foo so I'd like module to be available once it's been imported in __init__. I've tried the following, which populates globals() and fails to instantiate the class if numpy isn't available, but it pollutes the global namespace.
def lazy_import(name, as_=None):
    # Doesn't handle error_msg well yet
    import importlib
    mod = importlib.import_module(name)
    if as_ is not None:
        name = as_
    # yuck...
    globals()[name] = mod

class NeedsNumpyFoo:
    def __init__(self):
        lazy_import("numpy", as_="np")

    def foo(self):
        return np.array([1, 2])
I could instantiate the module outside the class and point to the imported module if import doesn't fail, but that is the same as the globals() approach. Alternatively lazy_import could return the mod and I could call it whenever the module is needed, but this is tantamount to just importing it everywhere as before.
Is there a better way to handle this?
pandas has a function, import_optional_dependency, that makes a good example; it is used in SQLAlchemyEngine, among other places.
However, it is only called during class __init__, either to produce a meaningful error (raise ImportError(...) by default) or to warn about a missing or outdated dependency. The version check is likely the more practical use: an older or newer dependency may import without error yet not work, not be explicitly tested against, or even be an accidental local import.
I'd consider doing something similar: either skip the special handling entirely, or do it only in __init__ (and perhaps only in the few cases where you care about the version), and otherwise simply import where needed:
class Foo():
    def __init__(self, ...):
        import bar  # only tests for existence

    def usebar(self, value):
        import bar
        bar.baz(value)
Plausibly you could assign the module to an attribute of the instance, but this may cause some confusion (once imported, the module is already cached in sys.modules, so importing it again where needed is cheap):
class Foo():
    def __init__(self, ...):
        import bar
        self.bar = bar

    def usebar(self, value):
        self.bar.baz(value)
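The two ideas above (a meaningful error in __init__, and keeping a reference on the instance) can be combined in a small helper. This is only a sketch: require_optional is a hypothetical name, not the pandas function, and the standard-library json module stands in for a real optional dependency so that the example runs anywhere.

```python
import importlib


def require_optional(name, feature):
    """Import ``name`` or raise an ImportError that names the feature needing it."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires the optional dependency '{name}'; "
            "install it to use this feature."
        ) from exc


class Foo:
    def __init__(self):
        # Fails here, with a clear message, if the dependency is missing.
        # 'json' is a stand-in for a real optional dependency.
        self._json = require_optional("json", "Foo")

    def dumps(self, value):
        # Reuse the module reference stored at construction time.
        return self._json.dumps(value)
```

Storing the module on the instance avoids touching globals(), at the cost of writing self._json instead of a bare module name inside methods.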
Gave it a quick test with a wrapper, seems to work fine:
def requires_math(fn):
    def wrapper(*args, **kwargs):
        global math
        try:
            math
        except NameError:
            import math
        return fn(*args, **kwargs)
    return wrapper

@requires_math
def func():
    return math.ceil(5.5)

print(func())
Edit: More advanced one that works with any module, and ensures it is a module in case it's been set to something else.
from types import ModuleType

def requires_import(*mods):
    def decorator(fn):
        def wrapper(*args, **kwargs):
            for mod in mods:
                if mod not in globals() or not isinstance(globals()[mod], ModuleType):
                    globals()[mod] = __import__(mod)
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@requires_import('math', 'random')
def func():
    return math.ceil(random.uniform(0, 10))

print(func())
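One caveat: globals() inside the decorator refers to the module where the decorator is defined, so the approach above only helps functions living in that same module. A possible variant, sketched here under the assumption that writing into the decorated function's own module namespace is acceptable, uses fn.__globals__ instead. The names requires_modules and round_up are illustrative only, and the sketch handles top-level module names, not dotted ones.

```python
import functools
import importlib


def requires_modules(*names):
    """Import each named module into the decorated function's own globals."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            for name in names:
                # fn.__globals__ is the namespace of the module defining fn,
                # which is where fn will look the name up when it runs.
                if name not in fn.__globals__:
                    fn.__globals__[name] = importlib.import_module(name)
            return fn(*args, **kwargs)
        return wrapper
    return decorator


@requires_modules("math")
def round_up(x):
    return math.ceil(x)
```

This still adds names to a module namespace, so it shares the "pollution" concern raised in the question; it only makes the decorator usable from other modules.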
Related
I have a library that can use two modules; one is fast but available only on Linux and macOS, the other is slower but it's multi-platform. My solution was to make the library compatible with both and have something like the following:
try:
    import fastmodule
except ImportError:
    import slowmodule
Now I want to compare the timing of the library when using either module. Is there any way of masking the fastmodule without changing the source code (i.e. within a Jupyter Notebook), in an environment where both modules are installed, so that the slowmodule is used?
This is a bit of a hack, but here it goes:
You can write your own importer and register it (note that this is Python 3-specific, Python 2 had another API for this):
import sut
import functools
import importlib
import sys

def use_slow(f):
    @functools.wraps(f)
    def wrapped(*args, **kwargs):
        ImportRaiser.use_slow = True
        if 'fastmodule' in sys.modules:
            del sys.modules['fastmodule']  # otherwise it will remain cached
        importlib.reload(sut)
        return f(*args, **kwargs)
    return wrapped

def use_fast(f):
    @functools.wraps(f)
    def wrapped(*args, **kwargs):
        ImportRaiser.use_slow = False
        importlib.reload(sut)
        return f(*args, **kwargs)
    return wrapped

class ImportRaiser:
    use_slow = False

    def find_spec(self, fullname, path, target=None):
        if fullname == 'fastmodule':
            if self.use_slow:
                raise ImportError()
        # Falling through returns None, so the remaining finders handle the import.

sys.meta_path.insert(0, ImportRaiser())

@use_fast
def test_fast():
    pass  # test code

@use_slow
def test_slow():
    pass  # test code
Here, sut is your module under test, which you have to reload in order to change the behavior. I added decorators for readability, but this can of course be done by some function or in the test setup.
If you use the slow version, importing fastmodule will raise ImportError and slowmodule will be used instead. In the "fast" case, everything works as usual.
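A lighter-weight alternative to a custom finder relies on a documented behavior of the import system: if sys.modules maps a name to None, importing that name raises ModuleNotFoundError (a subclass of ImportError). The sketch below wraps that in a context manager; hide_module is a hypothetical helper name, and the standard-library colorsys module stands in for fastmodule so the example is runnable. As with the finder approach, the module under test still has to be imported or reloaded inside the block for its try/except to run again.

```python
import contextlib
import sys

_MISSING = object()  # sentinel: the module was not in sys.modules before


@contextlib.contextmanager
def hide_module(name):
    """Make ``import name`` fail inside the block, then restore the old state."""
    saved = sys.modules.get(name, _MISSING)
    sys.modules[name] = None  # a None entry makes the import raise ModuleNotFoundError
    try:
        yield
    finally:
        if saved is _MISSING:
            del sys.modules[name]
        else:
            sys.modules[name] = saved


# 'colorsys' is only a stand-in for 'fastmodule'.
with hide_module("colorsys"):
    try:
        import colorsys
    except ImportError:
        blocked = True
    else:
        blocked = False
```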
Is there a way to import a package twice in the same Python session, under the same name but in different scopes, in a multi-threaded environment?
I would like to import the package, then override some of its functions to change its behavior only when it is used in a specific class.
For instance, is it possible to achieve something like this?
import mod

class MyClass:
    mod = __import__('mod')

    def __init__(self):
        mod.function = new_function  # override module function

    def method(self):
        mod.function()  # call new_function

mod.function()  # call original function
It might seem weird, but in this case the user deriving the class wouldn't have to change his code to use the improved package.
To import a module as a copy:
def freshimport(name):
    import sys, importlib
    if name in sys.modules:
        del sys.modules[name]
    mod = importlib.import_module(name)
    sys.modules[name] = mod
    return mod
Test:
import mymodule as m1
m2 = freshimport('mymodule')
assert m1.func is not m2.func
Note:
importlib.reload will not do the job, as it always "thoughtfully" updates the old module:
import importlib
import mymodule as m1
print(id(m1.func))
m2 = importlib.reload(m1)
print(id(m1.func))
print(id(m2.func))
Sample output:
139681606300944
139681606050680
139681606050680
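Note that freshimport above leaves the new copy registered in sys.modules, so later import statements elsewhere will see the copy rather than the original. If that is not wanted, a private copy can be built from the module's spec without touching sys.modules. This is a sketch using importlib.util; load_private_copy is a hypothetical name, and colorsys is just a convenient pure-Python stand-in for mymodule.

```python
import importlib.util
import sys


def load_private_copy(name):
    """Execute a second, independent copy of a module without registering it."""
    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot find a loadable spec for {name!r}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the module code in the new namespace
    return module


import colorsys as m1  # stand-in for 'mymodule'

m2 = load_private_copy("colorsys")
```

Functions patched on m2 do not affect m1, but any objects the two copies share through other imported modules remain shared.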
It looks like a job for a context manager
import modul

def newfunc():
    print('newfunc')

class MyClass:
    def __enter__(self):
        self._f = modul.func
        modul.func = newfunc
        return self

    def __exit__(self, type, value, tb):
        modul.func = self._f

    def method(self):
        modul.func()

modul.func()
with MyClass() as obj:
    obj.method()
    modul.func()
modul.func()
outputs
func
newfunc
newfunc
func
where modul.py contains
def func():
    print('func')
NOTE: this solution suits single-threaded applications only (unspecified in the OP)
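The same temporary swap can be written with the standard library's unittest.mock.patch.object, which saves and restores the attribute for you. In this sketch the module is simulated with types.ModuleType so the example is self-contained; that stand-in is an assumption, not part of the original modul.py. The patch is still process-wide, so the single-threaded caveat applies here too.

```python
import types
from unittest import mock

# Stand-in for 'modul.py', built in memory so the sketch runs on its own.
modul = types.ModuleType("modul")


def func():
    return "func"


def newfunc():
    return "newfunc"


modul.func = func

before = modul.func()
with mock.patch.object(modul, "func", newfunc):
    inside = modul.func()  # the replacement is active only inside the block
after = modul.func()       # the original is restored on exit
```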
This works in a script to recognise if a is of class myproject.aa.RefClass
isinstance(a, myproject.aa.RefClass)
But how can I do this without specifying the full namespace? I would like to be able to type:
isinstance(a, RefClass)
How is this done in Python?
EDIT: let me give more details.
In module aa.referencedatatable.py:
class ReferenceDataTable(object):
    def __init__(self, name):
        self.name = name

    def __call__(self, f):
        self._myfn = f
        return self

def referencedatatable_from_tag(tag):
    import definitions
    defn_lst = [definitions]
    for defn in defn_lst:
        referencedatatable_instance_lst = [getattr(defn, a) for a in dir(defn) if isinstance(getattr(defn, a), ReferenceDataTable)]
        for referencedatatable_instance in referencedatatable_instance_lst:
            if referencedatatable_instance.name == tag:
                return referencedatatable_instance
    raise LookupError("could not find")

def main():
    referencedatatable_from_tag("Example")
In module aa.definitions.py:
from aa.referencedatatable import ReferenceDataTable

@ReferenceDataTable("Example")
def EXAMPLE():
    raise NotImplementedError("not written")
For some reason, calling main from within aa.referencedatatable.py raises, because the decorated object is not recognised as an instance of the class. But if I copy this main into another module, it works:
import aa.referencedatatable
a = aa.referencedatatable.referencedatatable_from_tag("Example")
print a
This second example works; for some reason, calling the function inside the same module where the class is declared does not.
The 'namespace' is just a module object, and so is the class. You can always assign the class to a different name:
RefClass = myproject.aa.RefClass
or better yet, import it directly into your own namespace:
from myproject.aa import RefClass
Either way, now you have a global name RefClass that references the class object, so you can do:
isinstance(a, RefClass)
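As a quick check that both spellings refer to the same class object, here is a sketch that uses collections.OrderedDict as a stand-in for myproject.aa.RefClass (an assumption, chosen only so the example runs without your project):

```python
import collections
from collections import OrderedDict  # binds the class to a short local name

a = collections.OrderedDict()

# Both names point at the very same class object, so either works in isinstance.
same_object = OrderedDict is collections.OrderedDict
short_check = isinstance(a, OrderedDict)
full_check = isinstance(a, collections.OrderedDict)
```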
I'm writing a test suite, and the code I'm testing makes excessive use of delayed module imports. So it's possible that with 5 different inputs to the same method, this may end up importing 5 additional modules. What I'd like to be able to do is set up tests so that I can assert that running the method with one input causes one import, and doesn't cause the other 4.
I had a few ideas of how to start on this, but none so far have been successful. I already have a custom importer, and I can put the logging code in the importer. But this doesn't work, because the import statements only run once. I need the log statement to be executed regardless of if the module has been previously imported. Just running del sys.modules['modname'] also doesn't work, because that runs in the test code, and I can't reload the module in the code being tested.
The next thing I tried was subclassing dict to do the monitoring, and replace sys.modules with this subclass. This subclass has a reimplemented __getitem__ method, but calling import module doesn't seem to trigger the __getitem__ call in the subclass. I also can't assign directly to sys.modules.__getitem__, because it is read-only.
Is what I'm trying to do even possible?
UPDATE
nneonneo's answer seems to only work if the implementation of logImports() is in the same module as where it is used. If I make a base test class containing this functionality, it has problems. The first is that it can't find just __import__, erroring with:
# old_import = __import__
# UnboundLocalError: local variable '__import__' referenced before assignment
When I change that to __builtins__.__import__, I get another error:
myunittest.py:
import unittest

class TestCase(unittest.TestCase):
    def logImports(self):
        old_import = __builtins__.__import__
        def __import__(*args, **kwargs):
            print args, kwargs
            return old_import(*args, **kwargs)
        __builtins__.__import__ = __import__
test.py:
import myunittest
import unittest

class RealTest(myunittest.TestCase):
    def setUp(self):
        self.logImports()

    def testSomething(self):
        import unittest
        self.assertTrue(True)

unittest.main()
# old_import = __builtins__.__import__
# AttributeError: 'dict' object has no attribute '__import__'
Try
old_import = __import__
def __import__(*args, **kwargs):
    print args, kwargs
    return old_import(*args, **kwargs)
__builtins__.__import__ = __import__
This overrides __import__ completely, allowing you to monitor every invocation of import.
Building on the previous answer, in Python 3, I've had success with the following.
import builtins

old_import = __import__

def importWithLog(*args, **kwargs):
    print(args[0])  # This is the module name
    print(args, kwargs)
    return old_import(*args, **kwargs)

builtins.__import__ = importWithLog

# Rest of the code goes here.
import time
import myModule
...
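For a test suite, it may be convenient to install the hook only for the duration of one test and collect the requested names instead of printing them. The following Python 3 sketch does that with a context manager; record_imports and code_under_test are hypothetical names, and json and csv stand in for the lazily imported modules. Because every import statement goes through builtins.__import__, the name is recorded even when the module is already cached in sys.modules.

```python
import builtins
import contextlib


@contextlib.contextmanager
def record_imports():
    """Collect the names passed to __import__ while the block runs."""
    seen = []
    original = builtins.__import__

    def logging_import(name, *args, **kwargs):
        seen.append(name)
        return original(name, *args, **kwargs)

    builtins.__import__ = logging_import
    try:
        yield seen
    finally:
        builtins.__import__ = original  # always restore the real hook


def code_under_test(kind):
    # Stand-in for a method that performs delayed imports.
    if kind == "json":
        import json
        return json.dumps([1])
    import csv
    return csv.QUOTE_ALL


with record_imports() as seen:
    code_under_test("json")
```

The list may also contain names imported indirectly by a module being loaded for the first time, so membership checks are more robust than comparing against an exact list.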
I am trying to get a module to import, but only if an object of a specific class is called. For example:
class One(object):
    try:
        import OneHelper
    except ImportError:
        pass

    def __init__(self):
        # this function doesn't use OneHelper
        ...

    def blah(self):
        # this function does
        OneHelper.blah()
This causes a NameError: global name 'OneHelper' is not defined when the One.blah() function is called. So far the only thing I have found that works is importing the module into the actual functions that use it. So:
class One(object):
    def __init__(self):
        # this function doesn't use OneHelper
        ...

    def blah(self):
        try:
            import OneHelper
        except ImportError:
            pass
        # this function does
        OneHelper.blah()
But I don't want to have to import the module in each function I want to use it in, I want it to be available to the whole class, but only if an instance of that class is instantiated. Apologies if I'm not being clear enough...
The import OneHelper works fine in the class, making it a class attribute. You can verify this with dir(One) after defining your class -- there's your OneHelper attribute. One.OneHelper is a reference to the module. In an instance, of course, you may access it as self.OneHelper from your methods. (You could also continue to access it as One.OneHelper.)
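To illustrate the class-attribute behavior described above, here is a small sketch in which the standard-library json module stands in for OneHelper (an assumption made so that the example runs anywhere). Note that the import executes when the class statement runs, not when an instance is created.

```python
class One:
    # An import in the class body binds the module as a class attribute.
    try:
        import json as helper  # stand-in for OneHelper
    except ImportError:
        helper = None

    def blah(self):
        # A bare 'helper' would be a NameError here; go through self or One.
        return self.helper.dumps({"a": 1})


result = One().blah()
```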
Import it in __init__ and assign it to an attribute:
class One(object):
    def __init__(self):
        try:
            import OneHelper
        except ImportError:
            self.OneHelper = None
        else:
            self.OneHelper = OneHelper

    def blah(self):
        if self.OneHelper:
            self.OneHelper.blah()
Your example looks funny because if the module fails to import what is the point of calling it later?
You might also consider using global OneHelper before importing the module. This adds the OneHelper to the global namespace.