Recursive version of 'reload' - python

When I'm developing Python code, I usually test it in an ad-hoc way in the interpreter. I'll import some_module, test it, find a bug, fix the bug and save, and then use the built-in reload function to reload(some_module) and test again.
However, suppose that in some_module I have import some_other_module, and while testing some_module I discover a bug in some_other_module and fix it. Now calling reload(some_module) won't recursively re-import some_other_module. I have to either manually reimport the dependency (by doing something like reload(some_module.some_other_module), or import some_other_module; reload(some_other_module)), or, if I've changed a whole bunch of dependencies and lost track of what I need to reload, restart the entire interpreter.
What'd be more convenient is if there were some recursive_reload function, and I could just do recursive_reload(some_module) and have Python not only reload some_module, but also recursively reload every module that some_module imports (and every module that each of those modules imports, and so on) so that I could be sure that I wasn't using an old version of any of the other modules upon which some_module depends.
I don't think there's anything built in to Python that behaves like the recursive_reload function I describe here, but is there an easy way to hack such a thing together?

I've run up against the same issue, and you inspired me to actually solve the problem.
from types import ModuleType

try:
    from importlib import reload  # Python 3.4+
except ImportError:
    # Needed for Python 3.0-3.3; harmless in Python 2.7 where imp.reload is
    # just an alias for the builtin reload.
    from imp import reload

def rreload(module):
    """Recursively reload modules."""
    reload(module)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if type(attribute) is ModuleType:
            rreload(attribute)
Or, if you are using IPython, just use dreload or pass --deep-reload on startup.
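For reference, a minimal IPython session using deep reload (a sketch; some_module is a hypothetical module under development, and dreload can also be imported explicitly with from IPython.lib.deepreload import reload as dreload):

import some_module              # hypothetical module being developed
some_module.do_something()
# ... edit some_module, or one of the modules it imports, on disk ...
dreload(some_module)            # reloads some_module and everything it imports
some_module.do_something()      # now runs the updated code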

I've run up against the same issue, and I've built on @Mattew's and @osa's answers.
from types import ModuleType
import os, sys
# `reload` here is the Python 2 builtin, or imported as in the answer above.

def rreload(module, paths=None, mdict=None):
    """Recursively reload modules."""
    if paths is None:
        paths = ['']
    if mdict is None:
        mdict = {}
    if module not in mdict:
        # modules reloaded from this module
        mdict[module] = []
    reload(module)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if type(attribute) is ModuleType:
            if attribute not in mdict[module]:
                if attribute.__name__ not in sys.builtin_module_names:
                    if os.path.dirname(attribute.__file__) in paths:
                        mdict[module].append(attribute)
                        rreload(attribute, paths, mdict)
    reload(module)
    #return mdict
There are three differences:
In the general case, reload(module) has to be called at the end of the function as well, as @osa pointed out.
With circular import dependencies the code posted earlier would loop forever, so I've added a dictionary of lists to keep track of the set of modules loaded by other modules. While circular dependencies are not cool, Python allows them, so this reload function deals with them as well.
I've added a list of paths (default is ['']) from which reloading is allowed. Some modules don't like being reloaded the normal way (as shown here).
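For example, a hypothetical call (the package name and paths are invented for illustration):

import mypackage
# reload mypackage plus anything it imports from the current directory
# or from the package's own folder
rreload(mypackage, paths=['', '/home/me/project/mypackage'])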

The code worked great for dependency modules imported simply as import another_module, but it failed when the module imported functions with from another_module import some_func.
I expanded on @redsk's answer to try to be smart about these functions. I've also added a blacklist because, unfortunately, typing and importlib don't appear in sys.builtin_module_names (maybe there are more). Also, I wanted to prevent reloading of some dependencies I knew about.
I also track the reloaded module names and return them.
Tested on Python 3.7.4 on Windows:
import os
import sys
from types import ModuleType
from importlib import reload

def rreload(module, paths=None, mdict=None, base_module=None, blacklist=None, reloaded_modules=None):
    """Recursively reload modules."""
    if paths is None:
        paths = [""]
    if mdict is None:
        mdict = {}
    if module not in mdict:
        # modules reloaded from this module
        mdict[module] = []
    if base_module is None:
        base_module = module
    if blacklist is None:
        blacklist = ["importlib", "typing"]
    if reloaded_modules is None:
        reloaded_modules = []
    reload(module)
    reloaded_modules.append(module.__name__)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if type(attribute) is ModuleType and attribute.__name__ not in blacklist:
            if attribute not in mdict[module]:
                if attribute.__name__ not in sys.builtin_module_names:
                    if os.path.dirname(attribute.__file__) in paths:
                        mdict[module].append(attribute)
                        reloaded_modules = rreload(attribute, paths, mdict, base_module, blacklist, reloaded_modules)
        elif callable(attribute) and attribute.__module__ not in blacklist:
            if attribute.__module__ not in sys.builtin_module_names and f"_{attribute.__module__}" not in sys.builtin_module_names:
                if sys.modules[attribute.__module__] != base_module:
                    if sys.modules[attribute.__module__] not in mdict:
                        mdict[sys.modules[attribute.__module__]] = [attribute]
                        reloaded_modules = rreload(sys.modules[attribute.__module__], paths, mdict, base_module, blacklist, reloaded_modules)
    reload(module)
    return reloaded_modules
Some notes:
I don't know why some builtin_module_names are prefixed with an underscore (for example, collections is listed as _collections), so I have to do the double string check.
callable() returns True for classes. I guess that's expected, but it was one of the reasons I had to blacklist extra modules.
At least now I am able to deep-reload a module at runtime, and from my tests I was able to go multiple levels deep with from foo import bar and see the result at each call to rreload().
(Apologies for the long and ugly depth, but the black formatted version doesn't look so readable on SO)
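For instance, a hypothetical session (module and function names invented for illustration), where app.py does from helpers import compute:

import app
# ... edit helpers.compute on disk ...
reloaded = rreload(app, paths=[""])
print(reloaded)     # e.g. ['app', 'helpers']
app.run()           # now uses the updated compute()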

Wouldn't it be simpler to actually write some test cases and run them every time you are done modifying your module?
What you are doing is cool (you are in essence using TDD (test-driven development)), but you are doing it wrong.
Consider that with written unit tests (using the default Python unittest module, or better yet nose) you get tests that are reusable, stable, and that help you detect inconsistencies in your code much faster and better than testing your module in the interactive environment.
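For example, a minimal test module using the standard unittest module (a sketch; some_module and compute_answer are hypothetical names):

# test_some_module.py
import unittest
import some_module                      # hypothetical module under test

class TestSomeModule(unittest.TestCase):
    def test_compute_answer(self):
        # hypothetical expected value, for illustration only
        self.assertEqual(some_module.compute_answer(), 42)

if __name__ == '__main__':
    unittest.main()

Running python test_some_module.py after each change replaces the manual reload-and-poke cycle in the interpreter.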

I found the idea of just clearing all the modules and then reimporting your module here, which suggested simply doing this:
import sys
sys.modules.clear()
This would also clear modules that you don't want to reload (if you only want to reload your own modules). My idea is to clear only the modules whose names include your own folders. Something like this:
import sys
import importlib

def reload_all():
    delete_folders = ["yourfolder", "yourotherfolder"]
    for module in list(sys.modules.keys()):
        if any(folder in module for folder in delete_folders):
            del sys.modules[module]
    # And then you can reimport the file that you are running.
    importlib.import_module("yourfolder.entrypoint")
Reimporting your entry point will reimport all of its imports since the modules were cleared and it's automatically recursive.
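A hypothetical session (folder and module names invented for illustration):

import yourfolder.entrypoint
# ... edit files under yourfolder/ ...
reload_all()                   # clears the cached modules and re-imports the entry point
import yourfolder.entrypoint   # rebind the name to the freshly imported module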

Technically, in each file you could put a reload command, to ensure that it is reloaded each time it is imported:
a.py:
def testa():
    print 'hi!'
b.py:
import a
reload(a)

def testb():
    a.testa()
Now, interactively:
import b
b.testb()
#hi!
#<modify a.py>
reload(b)
b.testb()
#hello again!

I found redsk's answer very useful.
I propose a simplified (for the user, not as code) version where the path to the module is automatically gathered and recursion works for an arbitrary number of levels.
Everything is self-contained in a single function.
Tested on Python 3.4. I guess for Python 3.3 one must import reload from imp instead of from importlib.
It also checks that the __file__ attribute is present, which might not be the case if the coder forgot to define an __init__.py file in a submodule. In such a case, an exception is raised.
def rreload(module):
    """
    Recursive reload of the specified module and (recursively) the used ones.
    Mandatory! Every submodule must have an __init__.py file
    Usage:
        import mymodule
        rreload(mymodule)
    :param module: the module to load (the module itself, not a string)
    :return: nothing
    """
    import os.path
    import sys

    def rreload_deep_scan(module, rootpath, mdict=None):
        from types import ModuleType
        from importlib import reload

        if mdict is None:
            mdict = {}
        if module not in mdict:
            # modules reloaded from this module
            mdict[module] = []
        # print("RReloading " + str(module))
        reload(module)
        for attribute_name in dir(module):
            attribute = getattr(module, attribute_name)
            # print("for attr " + attribute_name)
            if type(attribute) is ModuleType:
                # print("typeok")
                if attribute not in mdict[module]:
                    # print("not in mdict")
                    if attribute.__name__ not in sys.builtin_module_names:
                        # print("not a builtin")
                        # If the submodule is a python file, it will have a __file__ attribute
                        if not hasattr(attribute, '__file__'):
                            raise BaseException("Could not find attribute __file__ for module '" + str(attribute) + "'. Maybe a missing __init__.py file?")

                        attribute_path = os.path.dirname(attribute.__file__)
                        if attribute_path.startswith(rootpath):
                            # print("in path")
                            mdict[module].append(attribute)
                            rreload_deep_scan(attribute, rootpath, mdict)

    rreload_deep_scan(module, rootpath=os.path.dirname(module.__file__))

For Python 3.6+ you can use:
from types import ModuleType
import sys
import importlib

def deep_reload(m: ModuleType):
    name = m.__name__  # get the name that is used in sys.modules
    name_ext = name + '.'  # support finding submodules or packages

    def compare(loaded: str):
        return (loaded == name) or loaded.startswith(name_ext)

    all_mods = tuple(sys.modules)  # prevent changing iterable while iterating over it
    sub_mods = filter(compare, all_mods)
    for pkg in sorted(sub_mods, key=lambda item: item.count('.'), reverse=True):
        importlib.reload(sys.modules[pkg])  # reload packages, beginning with the most deeply nested
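For example, with a hypothetical package layout pkg/__init__.py, pkg/sub/__init__.py, pkg/sub/leaf.py (names invented for illustration):

import pkg
import pkg.sub.leaf
# ... edit files on disk ...
deep_reload(pkg)   # reloads pkg.sub.leaf, then pkg.sub, then pkg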

Below is the recursive reload function that I use, including a magic function for ipython/jupyter.
It does a depth-first search through all sub-modules and reloads them in the correct order of dependence.
import logging
from importlib import reload, import_module
from types import ModuleType

from IPython.core.magic import register_line_magic

logger = logging.getLogger(__name__)


def reload_recursive(module, reload_external_modules=False):
    """
    Recursively reload a module (in order of dependence).

    Parameters
    ----------
    module : ModuleType or str
        The module to reload.

    reload_external_modules : bool, optional
        Whether to reload all referenced modules, including external ones which
        aren't submodules of ``module``.
    """
    _reload(module, reload_external_modules, set())


@register_line_magic('reload')
def reload_magic(module):
    """
    Reload module on demand.

    Examples
    --------
    >>> %reload my_module
    reloading module: my_module
    """
    reload_recursive(module)


def _reload(module, reload_all, reloaded):
    if isinstance(module, ModuleType):
        module_name = module.__name__
    elif isinstance(module, str):
        module_name, module = module, import_module(module)
    else:
        raise TypeError(
            "'module' must be either a module or str; "
            f"got: {module.__class__.__name__}")

    for attr_name in dir(module):
        attr = getattr(module, attr_name)
        check = (
            # is it a module?
            isinstance(attr, ModuleType)
            # has it already been reloaded?
            and attr.__name__ not in reloaded
            # is it a proper submodule? (or just reload all)
            and (reload_all or attr.__name__.startswith(module_name))
        )
        if check:
            _reload(attr, reload_all, reloaded)

    logger.debug(f"reloading module: {module.__name__}")
    reload(module)
    reloaded.add(module_name)

It is a tricky thing to do - I have a working example in this answer:
how to find list of modules which depend upon a specific module in python

Related

What is happening in m5/objects/__init__.py file gem5

I am new with gem5 simulator. I was reading the documentation (http://www.m5sim.org/Configuration_/_Simulation_Scripts) trying to understand how everything is implemented. When they write about Python classes they say the following:
gem5 provides a collection of Python object classes that correspond to its C++ simulation object classes. These Python classes are defined in a Python module called "m5.objects". The Python class definitions for these objects can be found in .py files in src, typically in the same directory as their C++ definitions.
To make the Python classes visible, the configuration file must first import the class definitions from the m5 module
In the m5/objects directory there is only one file "__init__.py". This is the code:
from __future__ import print_function
from __future__ import absolute_import

from m5.internal import params
from m5.SimObject import *

try:
    modules = __loader__.modules
except NameError:
    modules = { }

for module in modules.keys():
    if module.startswith('m5.objects.'):
        exec("from %s import *" % module)
Normally I don't program in Python, so perhaps that is the problem, but I haven't fully understood what is going on here. In this other post, Python's __loader__, what is it?, they discuss what a loader is, but I feel I am missing something. Any help would be appreciated. Thanks in advance.
The __loader__
Consider the following code:
import sys

class FooImporter:
    def find_module(self, module_name, package_path):
        return self if module_name == 'foo' else None

    def load_module(self, module_name):
        print('FooImporter is working.')
        sys.modules[module_name] = __import__('sys')

# This activates the importer
sys.meta_path.append(FooImporter())

# This should trigger our importer to import 'foo'
import foo

# Show what we've just got
print(foo)
This will result in output:
FooImporter is working.
<module 'sys' (built-in)>
as long as you do not have a Python module named foo on your PYTHONPATH.
Python Import Hook (PEP 302) allows us to customize the behavior of import. In the above example, module foo was said to be found and handled by the FooImporter. Note the importer will create the module foo as an alias of sys. The complete importer (unlike the simplified one we've seen) will be responsible for setting the __loader__ attribute for the imported module to the importer itself.
Gem5's import hook
Back to your question, gem5 is using the same mechanism for loading SimObject's by its design of modulization. You can find the very importer class at src/python/importer.py with the class name CodeImporter.
When the module m5.objects is being imported, say, with
from m5.objects import Root
The CodeImporter will be responsible for handling the import task, in which the __loader__ attribute will be set for the imported module (in this case m5.objects). If you try printing __loader__ in m5/objects/__init__.py, you'll get something like:
<importer.CodeImporter object at 0x7f4f58941d60>
__loader__.modules is a dictionary containing gem5-maintained SimObjects, where each item is added by addModule() calls from src/sim/init.cc.
As long as a SimObject's C++ counterpart has called the constructor for EmbeddedPython, it will be added to a list, so the gem5 initialization will remember to add it to the instance of CodeImporter. For example, one should be able to find a Root.py.cc file in the build folder that registers the Root object. The loop at the end of m5/objects/__init__.py is just importing a list of known SimObjects by this mechanism.
I think this should be sufficient to give someone a picture of the underlying magic and (hopefully) resolve their curiosity.
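To see this in action from a config script, one could inspect the loader (a sketch; the exact contents depend on the gem5 build, and __loader__.modules is the dictionary described above):

import m5.objects
print(m5.objects.__loader__)        # <importer.CodeImporter object at 0x...>
for name in sorted(m5.objects.__loader__.modules):
    if name.startswith('m5.objects.'):
        print(name)                 # e.g. m5.objects.Root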

isinstance fails for a type imported via package and from the same module directly

/Project
|-- main.py
|--/lib
| |--__init__.py
| |--foo.py
| |--Types.py
/Project/lib has been added to the PYTHONPATH variable.
Types.py:
class Custom(object):
    def __init__(self):
        a = 1
        b = 2
foo.py:
from Types import Custom

def foo(o):
    assert isinstance(o, Custom)
Finally, from main.py:
from lib.Types import Custom
from lib.foo import foo
a = Custom()
foo(a)
The problem now is that a is of type lib.Types.Custom, while the isinstance call inside foo checks against Types.Custom, which obviously returns False.
How can I avoid this problem, without having to change anything in the library (lib)?
You should not both make lib a package and add it to PYTHONPATH. This makes it possible to import its modules both as lib. and directly, setting yourself up for failure.
As you can see,
lib.Types.Custom != Types.Custom
because of the way Python imports work.
Python searches the import path and imports the first appropriate entry it finds.
When you import lib.Types, it imports the lib directory as a package, then lib/Types.py as a submodule inside it, creating module objects lib and lib.Types in sys.modules.
When you import Types, it imports Types.py as a standalone module, creating a module object Types in sys.modules.
So, Types and lib.Types end up as two different module objects. Python doesn't check if they are the same file to keep things simple and to avoid second-guessing you.
(This is actually listed in the Traps for the Unwary in Python’s Import System article as the "double import trap".)
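A quick way to see the two distinct module objects (a sketch, run from the Project directory with Project/lib on sys.path):

import lib.Types
import Types
print(lib.Types is Types)                  # False: two distinct module objects
print(lib.Types.Custom is Types.Custom)    # False: two distinct classes as well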
If you remove lib from PYTHONPATH, the import in lib/foo.py would need to become a relative import:
from .Types import Custom
or an absolute import:
from lib.Types import Custom
When a module is imported through two different paths in the same process - like here with import Types in foo.py and import lib.Types in main.py - it is really imported twice, yielding two distinct module objects, each with its own distinct functions and class instances (you can check for yourself using id(obj_or_class)), effectively breaking is and isinstance tests.
The solution here would be to add Project (not Project/lib) to your pythonpath (fwiw, that's what should have been done anyway - pythonpath/sys.path should be a list of directories containing packages and modules, not the package directories themselves) and use from lib.Types import Custom everywhere, so you only have one single instance of the module.
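If you cannot change how the library gets imported, another workaround is to stop relying on class identity altogether and compare an explicit per-class UUID attribute instead, as the following snippet does: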
# For generating a class UUID: uuidgen -n "<MODULE_UUID>" -N <Python class name> -s
# Example: uuidgen -n "dec9b2e9-07c0-4f59-af97-92f171e6fe33" -N Args -s
MODULE_UUID = "dec9b2e9-07c0-4f59-af97-92f171e6fe33"

def get_class_uuid(obj_or_cls):
    if isinstance(obj_or_cls, type):
        # it's a class
        return getattr(obj_or_cls, "CLASS_UUID", None)
    # it's an object
    return getattr(obj_or_cls.__class__, "CLASS_UUID", None)

def same_type(obj, cls):
    return get_class_uuid(obj) == get_class_uuid(cls)

class Foo:
    CLASS_UUID = "340637d8-5cb7-53b1-975e-d3f30bb825cd"

    @staticmethod
    def check_type(obj, accept_none=True):
        if obj is None:
            return accept_none
        return same_type(obj, Foo)

...
assert Foo.check_type(obj)

Export __all__ as something that is not itself

I want my_module to export __all__ as empty list, i.e.
from my_module import *
assert '__all__' in dir() and __all__ == []
I can export __all__ like this (in 'my_module.py'):
__all__ = ['__all__']
However, it predictably binds __all__ to itself, so that
from my_module import *
assert '__all__' in dir() and __all__ == ['__all__']
How can I export __all__ as an empty list? Failing that, how can I hook into the import process to put __all__ into the importing module's __dict__ on every top-level import my_module statement, circumventing module caching?
I'll start by saying this is, in my mind, a terrible idea. You really should not implicitly alter what is exported from a module; this goes counter to the Zen of Python: Explicit is better than implicit.
I also agree with the highest-voted answer on the question you cite; Python already has a mechanism to mark functions 'private', by convention we use a leading underscore to indicate a function should not be considered part of the module API. This approach works with existing tools, vs. the decorator dynamically setting __all__ which certainly breaks static code analysers.
That out of the way, here is a shotgun pointing at your foot. Use it with care.
What you want here is a way to detect when names are imported. You cannot normally do this; there are no hooks for import statements. Once a module has been imported from source, a module object is added to sys.modules and re-used for subsequent imports, but that object is not notified of imports.
What you can do is hook into attribute access. Not with the default module object, but you can stuff any object into sys.modules and it'll be treated as a module. You could even just subclass the module type and add a __getattribute__ method to it. It'll be called when importing any name with from module import name, and for all names listed in __all__ when using from module import *; in Python 3, __spec__ is accessed for all import forms, even when doing just import module.
You can then use this to hack your way into the calling frame globals, via sys._getframe():
import sys
import types

class AttributeAccessHookModule(types.ModuleType):
    def __getattribute__(self, name):
        if name == '__all__':
            # assume we are being imported with from module import *
            g = sys._getframe(1).f_globals
            if '__all__' not in g:
                g['__all__'] = []
        return super(AttributeAccessHookModule, self).__getattribute__(name)

# replace *this* module with our hacked-up version;
# this part goes at the *end* of your module.
replacement = sys.modules[__name__] = AttributeAccessHookModule(__name__, __doc__)
for name, obj in globals().items():
    setattr(replacement, name, obj)
The guy there sets __all__ on first decorator application, so not explicitly exporting anything causes it to implicitly export everything. I am trying to improve on this design: if the decorator is imported, then export nothing by default, regardless of its usage.
Just set __all__ to an empty list at the start of your module, e.g.:
# this is my_module.py
from utilitymodule import public
__all__ = []
# and now you can use your @public decorator to optionally add names to it
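For completeness, a minimal sketch of such a public decorator (utilitymodule and public are the hypothetical names from the snippet above; this assumes the module has already defined __all__ = []):

# utilitymodule.py (hypothetical)
import sys

def public(obj):
    # append the decorated object's name to its own module's __all__
    sys.modules[obj.__module__].__all__.append(obj.__name__)
    return obj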

Python difference between __import__ and import as

I am trying to dynamically import a configuration file in Python.
My code works fine when I use:
import conf.config as config
but doesn't work correctly when I use:
config = __import__("conf.config")
Below are to sample programs and the results I get when running them:
# regularimport.py
import conf.config as config

def read_values(cfg):
    for varname in cfg.__dict__.keys():
        if varname.startswith('__'):
            continue
        value = getattr(cfg, varname)
        yield (varname, value)

for name, value in read_values(config):
    print "Current config: %s = %s" % (name, value)
Results:
$python regularimport.py
Current config: SETTING_TWO = another setting
Current config: SETTING_ONE = some setting
Dynamic import:
#dynamicimport.py
conf_str = "conf.config"
config = __import__(conf_str)
def read_values(cfg):
for varname in cfg.__dict__.keys():
if varname.startswith('__'):
continue
value = getattr(cfg, varname)
yield (varname, value)
for name,value in read_values(config):
print "Current config: %s = %s" % (name, value)
Results:
$ python dynamicimport.py
Current config: config = <module 'conf.config' from '/home/ubuntu/importex/conf/config.pyc'>
My question is why the difference? And more importantly, how can I make the dynamic import example work as it does with the regular import?
As the documentation explains:
When the name variable is of the form package.module, normally, the top-level package (the name up till the first dot) is returned, not the module named by name.
So, when you do this:
config = __import__("conf.config")
That's not the same as:
import conf.config as config
But rather something more like:
import conf.config
config = conf
Why?
Because conf, not conf.config, is the thing that gets bound by an import statement. (Sure, in import foo as bar, obviously bar gets bound… but __import__ isn't meant to be an equivalent of import foo as bar, but of import foo.) The docs explain further. But the upshot is that you probably shouldn't be using __import__ in the first place.
At the very top of the function documentation it says:
Note: This is an advanced function that is not needed in everyday Python programming, unlike importlib.import_module().
And at the bottom, after explaining how __import__ works with packages and why it works that way, it says:
If you simply want to import a module (potentially within a package) by name, use importlib.import_module().
So, as you might guess, the simple solution is to use importlib.import_module.
If you have to use Python 2.6, where importlib doesn't exist… well, there just is no easy solution. You can build something like import_module yourself out of imp. Or use __import__ and then dig through sys.modules. Or __import__ each piece and then getattr your way through the results. Or in various other hacky ways. And yes, that sucks—which is why 3.0 and 2.7 fixed it.
The 2.6 docs give an example of the second hack. Adapting it to your case:
import sys

__import__("conf.config")
config = sys.modules["conf.config"]
config = __import__("conf.config") is not equivalent to import conf.config as config.
For example:
>>> import os.path as path
>>> path
<module 'posixpath' from '/usr/lib/python2.7/posixpath.pyc'>
>>> __import__('os.path')
<module 'os' from '/usr/lib/python2.7/os.pyc'>
Instead of __import__ use importlib.import_module to get the subpackage / submodule.
>>> import importlib
>>> importlib.import_module('os.path')
<module 'posixpath' from '/usr/lib/python2.7/posixpath.pyc'>

Python: How to load a module twice?

Is there a way to load a module twice in the same python session?
To fill this question with an example: Here is a module:
Mod.py
x = 0
Now I would like to import that module twice, like creating two instances of a class to have actually two copies of x.
To answer right away the question from the comments, "why would anyone want to do that if they could just create a class with x as a variable":
You are correct, but there is a huge amount of existing source that would have to be rewritten, and loading a module twice would be a quick fix^^.
Yes, you can load a module twice:
import mod
import sys
del sys.modules["mod"]
import mod as mod2
Now, mod and mod2 are two instances of the same module.
That said, I doubt this is ever useful. Use classes instead -- eventually it will be less work.
Edit: In Python 2.x, you can also use the following code to "manually" import a module:
import imp

def my_import(name):
    file, pathname, description = imp.find_module(name)
    code = compile(file.read(), pathname, "exec", dont_inherit=True)
    file.close()
    module = imp.new_module(name)
    exec code in module.__dict__
    return module
This solution might be more flexible than the first one. You no longer have to "fight" the import mechanism since you are (partly) rolling your own one. (Note that this implementation doesn't set the __file__, __path__ and __package__ attributes of the module -- if these are needed, just add code to set them.)
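A sketch of setting those attributes, if needed (placed inside my_import, between imp.new_module and the exec; __path__ applies only to packages, so it is omitted here):

module.__file__ = pathname
module.__package__ = ''  # top-level module; adjust for modules inside packages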
Deleting an entry from sys.modules will not necessarily work (e.g. it will fail when importing recurly twice, if you want to work with multiple recurly accounts in the same app etc.)
Another way to accomplish this is:
>>> import importlib.util
>>> spec = importlib.util.find_spec(module_name)
>>> instance_one = importlib.util.module_from_spec(spec)
>>> instance_two = importlib.util.module_from_spec(spec)
>>> instance_one == instance_two
False
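Note that module_from_spec returns a fresh, unexecuted module; to actually run the module body in each copy you would still do something like:

>>> spec.loader.exec_module(instance_one)
>>> spec.loader.exec_module(instance_two)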
You could use the __import__ function.
module1 = __import__("module")
module2 = __import__("module")
Edit: As it turns out, this does not import two separate versions of the module, instead module1 and module2 will point to the same object, as pointed out by Sven.
