I'm trying to temporarily remove a python module from sys.modules so that I can import it as part of a test case (with various system functions mocked out) and then put it back again. (Yes, that's a bit crazy and I'm probably going to end up restructuring the code instead but now I'm curious...)
I can remove the module and reimport it just fine but I can't seem to put it back to the original module once I'm finished. (Maybe that's just not possible?) Here's a test case that I wrote to test out the idea:
class Test(unittest.TestCase):
    def test_assumptions(self):
        import meta.common.fileutils as fu1
        del(sys.modules["meta.common.fileutils"])
        import meta.common.fileutils
        del(sys.modules["meta.common.fileutils"])
        sys.modules["meta.common.fileutils"] = fu1 # I hoped this would set the module back
        import meta.common.fileutils as fu2
        self.assertEqual(fu1, fu2) # assert fails, fu2 is a new copy of module :-(
Can anyone suggest why it might be failing?
Edit, using pop() as suggested by one of the answers also fails:
class Test(unittest.TestCase):
    def test_assumptions(self):
        import meta.common.fileutils as fu1
        orig = sys.modules.pop("meta.common.fileutils")
        import meta.common.fileutils
        del(sys.modules["meta.common.fileutils"])
        sys.modules["meta.common.fileutils"] = orig
        import meta.common.fileutils as fu2
        self.assertEqual(fu1, orig) # passes
        self.assertEqual(fu2, orig) # fails
        self.assertEqual(fu1, fu2) # fails
It looks to me like the issue here has to do with packages. In particular, for a module that lives in a package (eg meta.common), there are two ways to access it: via sys.modules, and via the parent package's dictionary (i.e., meta.common.__dict__). It looks to me like the import meta.common.fileutils as fu2 line is getting fu2's value from meta.common.__dict__, and not from sys.modules.
So the solution: in addition to monkey-patching sys.modules, you should also monkey-patch the parent package. I.e., add something like this:
>>> import meta.common
>>> meta.common.fileutils = fu1
right before the sys.modules["meta.common.fileutils"] = fu1 line.
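Here is a minimal sketch of that fix, for illustration only. It builds a throwaway package on disk so it can run anywhere; demo_pkg.sub.leaf and the helper restore_module are hypothetical names of mine, not part of the question's code. The helper restores the original module in both places, sys.modules and the parent package:

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package: demo_pkg/sub/leaf.py (hypothetical names).
root = Path(tempfile.mkdtemp())
sub = root / "demo_pkg" / "sub"
sub.mkdir(parents=True)
(root / "demo_pkg" / "__init__.py").write_text("")
(sub / "__init__.py").write_text("")
(sub / "leaf.py").write_text("VALUE = 1\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()


def restore_module(name, module):
    """Rebind `module` in sys.modules and as an attribute of its parent package."""
    sys.modules[name] = module
    parent_name, _, child_name = name.rpartition(".")
    if parent_name:
        setattr(sys.modules[parent_name], child_name, module)


import demo_pkg.sub.leaf as first

del sys.modules["demo_pkg.sub.leaf"]
import demo_pkg.sub.leaf as second  # fresh copy; the parent package now points at it

restore_module("demo_pkg.sub.leaf", first)
import demo_pkg.sub.leaf as third

print(second is first)  # False: the re-import created a new module object
print(third is first)   # True: both lookup paths lead back to the original
```

Without the setattr call in restore_module, the last import would still pick up the second copy through the parent package.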
The sys.modules structure is really just a Python dict. You can remove modules from it, and you can also put them back in.
Store the original module object in a local variable, using dict.pop() to both remove the module and return it:
orig = sys.modules.pop('meta.common.fileutils')
then, when it comes to restoring it, just put that object back into sys.modules:
sys.modules['meta.common.fileutils'] = orig
Related
I found quite a few answers to the question of how to re-import a module (e.g. after I changed it during programming), but I want to re-import it under an alias (import ... as ...). In other words I would like to repeat
import main.mydir.mymodule as mymod
and have my changes incorporated into my console without restarting the console.
What I am trying currently when I try to reload is the following. I might run
import main.warp.optimisation as opt
res = opt.combiascend(par)
then I do some changes, for example I put a print('Yes, this worked.') at the end of the method combiascend, then I run
import importlib
import main
importlib.reload(main)
importlib.reload(main.warp.optimisation)
opt = main.warp.optimisation
res = opt.combiascend(par)
This does not work: I am not getting any error, but the changes I made in the module optimisation were simply not applied. In my example, I do not get the expected output.
After employing one of those other answers to "refresh" main.mydir.mymodule, simply do:
mymod = main.mydir.mymodule
importlib.reload re-executes the module's code in the existing module object, so every reference to that module, including an alias, sees the update. If the original import used an alias, you can simply reload the alias. Given empty foo/__init__.py and foo/bar/__init__.py, and a foo/bar/test.py containing this:
def func():
    print("a")
Then I get this:
>>> import foo.bar.test as mod
>>> mod.func()
a
>>> import importlib
>>> # (Updating the file now to print b instead)
>>> importlib.reload(mod)
<module 'foo.bar.test' from '/home/aasmund/foo/bar/test.py'>
>>> mod.func()
b
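The same session can be sketched as a self-contained script. This is only an illustration under my own assumptions: reload_demo is a hypothetical module written to a temporary directory, and bytecode writing is switched off so that an edit made within the same second cannot be masked by a cached .pyc file.

```python
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # avoid a stale .pyc when the file changes within one second

workdir = Path(tempfile.mkdtemp())
source_file = workdir / "reload_demo.py"  # hypothetical module name
source_file.write_text("def func():\n    return 'a'\n")
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()

import reload_demo as mod

before = mod.func()

# Edit the file on disk, then reload through the alias.
source_file.write_text("def func():\n    return 'b (edited)'\n")
reloaded = importlib.reload(mod)

after = mod.func()
print(before, "->", after)
print(reloaded is mod)  # True: reload() hands back the very same module object
```

Because the module object is reused, names bound with "from module import func" before the reload still point at the old function; only access through the module (or its alias) sees the new code.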
Is there a way to do this in python 3.6+?
import -force mymodule
I just want a single python command that both:
(1) loads the module for the first time, and
(2) forces a reload of the module if it already loaded without barfing.
(This is not a duplicate question because I'm asking for something different. What I want is a single function call that will do items (1) and (2) above. I don't want to decide manually whether to issue "import" or "imp.reload". I just want Python code for a single function that can detect which case applies and proceed automatically, (1) or (2).)
I'm thinking that it is something like this:
def import_force(m):
    import sys
    if m not in sys.modules:
        import m
    else:
        import importlib
        importlib.reload(m)
Except, I can't figure out how to pass a module name as a parameter. Just gives me an error no such module named 'm'
There is one missing step that you semi-corrected in your new answer, which is that you need to assign the new module in every scope that uses it. The easiest way is to return the module object and bind it to the name you want outside your function. Your original implementation was 90% correct:
import sys, importlib
def import_force(m):
    if m not in sys.modules:
        return __import__(m)
    else:
        return importlib.reload(sys.modules[m])
Now you can use this function from the command line to replace import, e.g.:
my_module = import_force('my_module')
Any time you find yourself using exec to perform a task for which there is so much well defined machinery already available, you have code smell. There is also no reason to re-import sys and importlib every time.
This function should do what you want:
def import_force(name):
    needs_reload = name in sys.modules
    module = importlib.import_module(name)
    if needs_reload:
        module = importlib.reload(module)
    return module
# Usage example:
os = import_force('os')
An alternative approach is to write your own import hooks, which I won't describe.
However please note that this is an anti-pattern and I would discourage the practice of reloading modules at every import.
If this is for debugging purposes, then I would suggest using one of the many auto-reloader solutions available online: they watch your Python files for changes, and when you make modifications they automatically re-import the modules.
Your function didn't work for two reasons:
(1) The import keyword does not resolve variables, so import m does not mean "import the module whose name is in the variable m"; it means "import the module named m".
(2) importlib.reload wants a module object, not a module name.
import sys
import importlib
# importing with a sledgehammer... simple, effective, and it always works
def import_force(name):
    module = importlib.import_module(name)
    module = importlib.reload(module)
    return module
#assuming mymodule.py is in the current directory
mymodule = import_force("mymodule")
It's possible, but a little tricky to get right the first time...
import sys
import importlib
def import_force(modstr):
    if modstr not in sys.modules:
        print("IMPORT " + modstr)
        cmd = "globals()['%s'] = importlib.import_module('%s')" % (modstr, modstr)
        exec(cmd)
    else:
        print("RELOAD " + modstr)
        cmd = "globals()['%s'] = importlib.reload(%s)" % (modstr, modstr)
        exec(cmd)
If you have a module file in your current directory called "mymodule.py", then use it like this:
Py> import_force("mymodule")
Version 2.0:
def import_force(modstr):
    if modstr not in sys.modules:
        print("IMPORT " + modstr)
        globals()[modstr] = importlib.import_module(modstr)
    else:
        print("RELOAD " + modstr)
        globals()[modstr] = importlib.reload(sys.modules[modstr])
I'm writing a python application in which I want to make use of dynamic, one-time-runnable plugins.
By this I mean that at various times during the running of this application, it looks for python source files with special names in specific locations. If any such source file is found, I want my application to load it, run a pre-named function within it (if such a function exists), and then forget about that source file.
Later during the running of the application, that file might have changed, and I want my python application to reload it afresh, execute its method, and then forget about it, like before.
The standard import system keeps the module resident after the initial load, and this means that subsequent "import" or "__import__" calls won't reload the same module after its initial import. Therefore, any changes to the python code within this source file are ignored during its second through n-th imports.
In order for such packages to be loaded uniquely each time, I came up with the following procedure. It works, but it seems kind of "hacky" to me. Are there any more elegant or preferred ways of doing this? (note that the following is an over-simplified, illustrative example)
import sys
import imp
# The following module name can be anything, as long as it doesn't
# change throughout the life of the application ...
modname = '__whatever__'
def myimport(path):
    '''Dynamically load python code from "path"'''
    # get rid of previous instance, if it exists
    try:
        del sys.modules[modname]
    except KeyError:
        pass
    # load the module
    try:
        return imp.load_source(modname, path)
    except Exception, e:
        print 'exception: {}'.format(e)
        return None
plugin_path = '/path/to/plugin.py'
mymod = myimport(plugin_path)
if mymod is not None:
    # call the plugin function:
    try:
        mymod.func()
    except:
        print 'func() not defined in plugin: {}'.format(plugin_path)
Addendum: one problem with this is that func() runs within a separate module context, and it has no access to any functions or variables within the caller's space. I therefore have to do inelegant things like the following if I want func_one(), func_two() and abc to be accessible within the invocation of func():
def func_one():
    pass  # whatever
def func_two():
    pass  # whatever
abc = '123'
# Load the module as shown above, but before invoking mymod.func(),
# the following has to be done ...
mymod.func_one = func_one
mymod.func_two = func_two
mymod.abc = abc
# This is a PITA, and I'm hoping there's a better way to do all of
# this.
Thank you very much.
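For readers on Python 3, where the imp module is deprecated (and removed in 3.12), the same load-it-fresh-each-time idea can be sketched with importlib.util. This is only an illustration, not the code from the question: load_fresh, run_plugin and the module name "_plugin_" are hypothetical, and the plugin file is written to a temporary directory so the example runs on its own.

```python
import importlib.util
import tempfile
from pathlib import Path


def load_fresh(path, modname="_plugin_"):
    """Run the file at `path` as a brand-new module; sys.modules is not touched."""
    spec = importlib.util.spec_from_file_location(modname, str(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def run_plugin(path):
    """Load the plugin and call its func() if it defines one."""
    module = load_fresh(path)
    func = getattr(module, "func", None)
    return func() if callable(func) else None


plugin = Path(tempfile.mkdtemp()) / "plugin.py"
plugin.write_text("def func():\n    return 'first version'\n")
first = run_plugin(plugin)

# The file changes on disk; the next load picks up the new code.
plugin.write_text("def func():\n    return 'second, edited version'\n")
second = run_plugin(plugin)

print(first)
print(second)
```

Since the module is never registered in sys.modules, there is nothing to delete between runs; the module object is simply dropped once the caller lets go of it.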
I use the following code to do this sort of thing.
Note that I don't actually import the code as a module, but instead execute the code in a particular context. This lets me define a bunch of api functions automatically available to the plugins without users having to import anything.
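def load_plugin(filename, context):
    # read the source and close the file again
    with open(filename) as f:
        source = f.read()
    code = compile(source, filename, 'exec')
    # run the plugin with `context` as its global namespace
    exec(code, context)
    return context['func']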
context = { 'func_one': func_one, 'func_two': func_two, 'abc': abc }
func = load_plugin(filename, context)
func()
This method works in python 2.6+ and python 3.3+
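A self-contained variation on this idea, as a rough sketch: shout, PREFIX and the plugin contents are made-up names for the demo, and the plugin is written to a temporary file. Unlike the version above, this one copies the context before executing, which is my own design choice so that one plugin's definitions do not leak into the next load.

```python
import tempfile
from pathlib import Path


def load_plugin(filename, context):
    """Execute a plugin file in a copy of `context` and return its func, if any."""
    with open(filename) as handle:
        code = compile(handle.read(), str(filename), "exec")
    namespace = dict(context)  # copy, so plugins cannot leak names into each other
    exec(code, namespace)
    return namespace.get("func")


def shout(text):
    """Example API function made available to plugins (hypothetical)."""
    return text.upper() + "!"


plugin_file = Path(tempfile.mkdtemp()) / "plugin.py"
plugin_file.write_text("def func():\n    return shout(PREFIX + ' from plugin')\n")

func = load_plugin(plugin_file, {"shout": shout, "PREFIX": "hello"})
result = func()
print(result)  # HELLO FROM PLUGIN!
```

The plugin never imports anything: shout and PREFIX are found because the dictionary passed to exec becomes the global namespace of the plugin's functions.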
The approach you use is totally fine. Regarding this point:
one problem with this is that func() runs within a separate module context, and it has no access to any functions or variables within the caller's space.
It may be better to use the execfile function (exec in Python 3):
# main.py
def func1():
    print('func1 called')
exec(open('/path/to/plugin.py', 'r').read(), globals())  # this is similar to import except everything is done in the current module
# execfile('/path/to/plugin.py', globals())  # python 2 version
func()
Test it:
#/path/to/plugin.py
def func():
    func1()
Result:
python main.py
# func1 called
One potential problem with this approach is namespace pollution: every file is run in the current namespace, which increases the chance of name conflicts.
In interactive python I'd like to import a module that is in, say,
C:\Modules\Module1\module.py
What I've been able to do is to create an empty
C:\Modules\Module1\__init__.py
and then do:
>>> import sys
>>> sys.path.append(r'C:\Modules\Module1')
>>> import module
And that works, but I'm having to append to sys.path. And if there were another file called module.py elsewhere on sys.path, how would I unambiguously resolve to the one that I really want to import?
Is there another way to import that doesn't involve appending to sys.path?
EDIT: Here's something I'd forgotten about: Is this correct way to import python scripts residing in arbitrary folders? I'll leave the rest of my answer here for reference.
There is, but you'd basically wind up writing your own importer which manually creates a new module object and uses execfile to run the module's code in that object's "namespace". If you want to do that, take a look at the mod_python importer for an example.
For a simpler solution, you could just add the directory of the file you want to import to the beginning of sys.path, not the end, like so:
>>> import sys
>>> sys.path.insert(0, r'C:\Modules\Module1')
>>> import module
You shouldn't need to create the __init__.py file, not unless you're importing from within a package (so, if you were doing import package.module then you'd need __init__.py).
Inserting into sys.path (at the very first place) works better:
>>> import sys
>>> sys.path.insert(0, 'C:/Modules/Module1')
>>> import module
>>> del sys.path[0] # if you don't want that directory in the path
Appending to a list puts the item in the last place, so it's quite possible that earlier entries in the path take precedence; putting the directory in the first place is therefore a sounder approach.
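The insert-then-delete pattern can be wrapped in a small context manager so the path entry is removed even if the import fails. This is just a sketch; prepended_sys_path and path_demo_module are hypothetical names, and a temporary directory stands in for C:/Modules/Module1.

```python
import contextlib
import importlib
import sys
import tempfile
from pathlib import Path


@contextlib.contextmanager
def prepended_sys_path(directory):
    """Temporarily put `directory` at the front of sys.path."""
    directory = str(directory)
    sys.path.insert(0, directory)
    try:
        yield
    finally:
        # Remove our entry again, even if the import raised.
        with contextlib.suppress(ValueError):
            sys.path.remove(directory)


module_dir = Path(tempfile.mkdtemp())
(module_dir / "path_demo_module.py").write_text("WHERE = 'temporary directory'\n")

with prepended_sys_path(module_dir):
    importlib.invalidate_caches()
    import path_demo_module

print(path_demo_module.WHERE)
print(str(module_dir) in sys.path)  # False: the entry was removed again
```

The imported module stays usable after the block, because it lives in sys.modules; only the search path is restored.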
Summary: when a certain python module is imported, I want to be able to intercept this action, and instead of loading the required class, I want to load another class of my choice.
Reason: I am working on some legacy code. I need to write some unit test code before I start some enhancement/refactoring. The code imports a certain module which will fail in a unit test setting, however. (Because of database server dependency)
Pseudo code:
from LegacyDataLoader import load_me_data
...
def do_something():
    data = load_me_data()
So, ideally, when Python executes the import line above in a unit test, an alternative class, say MockDataLoader, is loaded instead.
I am still using 2.4.3. I suppose there is an import hook I can manipulate.
Edit
Thanks a lot for the answers so far. They are all very helpful.
One particular type of suggestion is about manipulation of PYTHONPATH. It does not work in my case. So I will elaborate my particular situation here.
The original codebase is organised in this way
./dir1/myapp/database/LegacyDataLoader.py
./dir1/myapp/database/Other.py
./dir1/myapp/database/__init__.py
./dir1/myapp/__init__.py
My goal is to enhance the Other class in the Other module. But since it is legacy code, I do not feel comfortable working on it without strapping a test suite around it first.
Now I introduce this unit test code
./unit_test/test.py
The content is simply:
from myapp.database.Other import Other
def test1():
    o = Other()
    o.do_something()
if __name__ == "__main__":
    test1()
When the CI server runs the above test, the test fails. This is because class Other uses LegacyDataLoader, and LegacyDataLoader cannot establish a database connection to the db server from the CI box.
Now let's add a fake class as suggested:
./unit_test_fake/myapp/database/LegacyDataLoader.py
./unit_test_fake/myapp/database/__init__.py
./unit_test_fake/myapp/__init__.py
Modify the PYTHONPATH to
export PYTHONPATH=unit_test_fake:dir1:unit_test
Now the test fails for another reason
File "unit_test/test.py", line 1, in <module>
from myapp.database.Other import Other
ImportError: No module named Other
It has something to do with the way python resolves classes/attributes in a module
You can intercept import and from ... import statements by defining your own __import__ function and assigning it to __builtin__.__import__ (make sure to save the previous value, since your override will no doubt want to delegate to it; and you'll need to import __builtin__ to get the builtin-objects module).
For example (Py2.4 specific, since that's what you're asking about), save in aim.py the following:
import __builtin__
realimp = __builtin__.__import__
def my_import(name, globals={}, locals={}, fromlist=[]):
    print 'importing', name, fromlist
    return realimp(name, globals, locals, fromlist)
__builtin__.__import__ = my_import
from os import path
and now:
$ python2.4 aim.py
importing os ('path',)
So this lets you intercept any specific import request you want, and alter the imported module[s] as you wish before you return them -- see the specs here. This is the kind of "hook" you're looking for, right?
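For comparison, here is a rough Python 3 sketch of the same hook. In Python 3 the module is named builtins rather than __builtin__, and __import__ takes an additional level argument. The names real_import, logging_import and seen are mine, and the hook is removed again in a finally block so that it does not outlive the demo.

```python
import builtins

real_import = builtins.__import__
seen = []  # (module name, fromlist) pairs recorded by the hook


def logging_import(name, globals=None, locals=None, fromlist=(), level=0):
    """Record the request, then delegate to the real import machinery."""
    seen.append((name, tuple(fromlist or ())))
    return real_import(name, globals, locals, fromlist, level)


builtins.__import__ = logging_import
try:
    from os import path
finally:
    builtins.__import__ = real_import  # always restore the original

print(("os", ("path",)) in seen)
```

A hook like this sees every import statement executed while it is installed, so a test that uses it to swap in a fake module would compare name against the target and return the replacement instead of delegating.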
There are cleaner ways to do this, but I'll assume that you can't modify the file containing from LegacyDataLoader import load_me_data.
The simplest thing to do is probably to create a new directory called testing_shims, and create LegacyDataLoader.py file in it. In that file, define whatever fake load_me_data you like. When running the unit tests, put testing_shims into your PYTHONPATH environment variable as the first directory. Alternately, you can modify your test runner to insert testing_shims as the first value in sys.path.
This way, your file will be found when importing LegacyDataLoader, and your code will be loaded instead of the real code.
The import statement just grabs stuff from sys.modules if a matching name is found there, so the simplest thing is to make sure you insert your own module into sys.modules under the target name before anything else tries to import the real thing.
# in test code
import sys
import MockDataLoader
sys.modules['LegacyDataLoader'] = MockDataLoader
import module_under_test
There are a handful of variations on the theme, but that basic approach should work fine to do what you describe in the question. A slightly simpler approach would be this, using just a mock function to replace the one in question:
# in test code
import module_under_test
def mock_load_me_data():
    pass  # do mock stuff here
module_under_test.load_me_data = mock_load_me_data
That simply replaces the appropriate name right in the module itself, so when you invoke the code under test, presumably do_something() in your question, it calls your mock routine.
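On Python 3 the sys.modules variant can be sketched with the standard library's unittest.mock, which also takes care of undoing the change. The fake module here is built in memory with types.ModuleType; fake_loader and its return value are invented for the demo, and the sketch assumes no real LegacyDataLoader is importable.

```python
import sys
import types
from unittest import mock

# Build a stand-in module in memory; the names mirror the question.
fake_loader = types.ModuleType("LegacyDataLoader")
fake_loader.load_me_data = lambda: ["fake", "rows"]

with mock.patch.dict(sys.modules, {"LegacyDataLoader": fake_loader}):
    # Inside the block, the import statement finds the fake in sys.modules.
    from LegacyDataLoader import load_me_data
    data = load_me_data()

print(data)
print("LegacyDataLoader" in sys.modules)  # False: patch.dict restored sys.modules
```

In a real test the import of the module under test would go inside the with block, so that its own "from LegacyDataLoader import load_me_data" resolves to the fake.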
Well, if the import fails by raising an exception, you could put it in a try...except block:
try:
    from LegacyDataLoader import load_me_data
except: # put error that occurs here, so as not to mask actual problems
    from MockDataLoader import load_me_data
Is that what you're looking for? If it fails, but doesn't raise an exception, you could have it run the unit test with a special command line tag, like --unittest, like this:
import sys
if "--unittest" in sys.argv:
    from MockDataLoader import load_me_data
else:
    from LegacyDataLoader import load_me_data