Python: Dynamically import module's code from string with importlib

I wish to dynamically import a module in Python (3.7), whereby the module's code is defined within a string.
Below is a working example that uses the imp module, which is deprecated in favour of importlib (as of version 3.4):
import imp

def import_code(code, name):
    # create a blank module
    module = imp.new_module(name)
    # populate the module with code
    exec(code, module.__dict__)
    return module

code = """
def testFunc():
    print('spam!')
"""

m = import_code(code, 'test')
m.testFunc()
Python's documentation states that importlib.util.module_from_spec() should be used instead of imp.new_module(). However, there doesn't seem to be a way to create a blank module object using the importlib module, like I could with imp.
How can I use importlib instead of imp to achieve the same result?

You can simply instantiate types.ModuleType:
import types
mod = types.ModuleType("mod")
Then you can populate it with exec just like you did:
exec(code, mod.__dict__)
mod.testFunc() # will print 'spam!'
So your code will look like this:
import types

def import_code(code, name):
    # create a blank module
    module = types.ModuleType(name)
    # populate the module with code
    exec(code, module.__dict__)
    return module

code = """
def testFunc():
    print('spam!')
"""

m = import_code(code, 'test')
m.testFunc()
As commented by @Error - Syntactical Remorse, keep in mind that exec executes whatever code is contained in the string you give it, so use it with extra care.
At the very least validate what you're given; ideally, restrict it to predefined strings.
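One lightweight guard (a sketch, not a security boundary) is to run the string through ast.parse before handing it to exec, so syntax errors surface before any of the code runs:

```python
import ast
import types

def import_code_checked(code, name):
    # Parse first: raises SyntaxError on malformed input before exec runs anything.
    # Note this only catches syntax errors; it does NOT make exec safe
    # against malicious strings.
    tree = ast.parse(code)
    module = types.ModuleType(name)
    exec(compile(tree, filename=f"<{name}>", mode="exec"), module.__dict__)
    return module

m = import_code_checked("def testFunc():\n    return 'spam!'", "test")
print(m.testFunc())  # prints: spam!
```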

According to the Python documentation for module_from_spec():
importlib.util.module_from_spec(spec)
...
This function is preferred over using types.ModuleType to create a new module as spec is used to set as many import-controlled attributes on the module as possible.
Here is what I came up with to load a module from source code located in a GitHub repo, without writing the file to disk:
import importlib.util
import requests

url = "https://github.com/udacity/deep-learning-v2-pytorch/raw/master/intro-to-pytorch/helper.py"
r = requests.get(url)

spec = importlib.util.spec_from_loader('helper', loader=None, origin=url)
helper = importlib.util.module_from_spec(spec)
exec(r.content, helper.__dict__)

helper.view_classify()  # executes a function from the GitHub file
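If later code should also be able to do a plain `import helper`, the freshly built module can be registered in sys.modules. A sketch, using an inline source string in place of the downloaded r.content:

```python
import importlib.util
import sys

source = "def view_classify():\n    return 'rendered'"  # stand-in for r.content

spec = importlib.util.spec_from_loader('helper', loader=None)
helper = importlib.util.module_from_spec(spec)
exec(source, helper.__dict__)

sys.modules['helper'] = helper  # a normal import will now find it

import helper as h2
print(h2.view_classify())  # prints: rendered
```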

Related

Python import all * from dynamically created module

I have found many variations for importing dynamically created/named modules by reference to their names as text, but all of them import the module as a whole and do not seem to facilitate an `import *`-style import of its contents.
In my case, the objects within the file are dynamically created and named, so their identities cannot be discovered beforehand.
This works, but is there a better way, perhaps using importlib?
PREFIX = "my_super_new"
active_data_module = "{0}_data_module".format(PREFIX)
exec("from {0} import *".format(active_data_module))
You could use vars with the module. This would return a dictionary of all attributes on the module (I think). Then you can assign the dictionary to the globals dictionary to make it accessible in the current module:
import importlib
PREFIX = "my_super_new"
active_data_module = "{0}_data_module".format(PREFIX)
module = importlib.import_module(active_data_module)
globals().update(vars(module))
Using Peter Wood's answer, I created a small utility function:
import importlib

def import_everything_from_module_by_name(module_name):
    globals().update(vars(importlib.import_module(module_name)))

modules_for_import = [
    "module_a",
    "module_b",
    "module_c",
]

for module_name in modules_for_import:
    import_everything_from_module_by_name(module_name)
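A slightly safer variant of the same idea honours __all__ when the module defines it and otherwise skips underscore-prefixed names, mimicking the filtering that from module import * applies (import_star is a hypothetical helper, not a stdlib function):

```python
import importlib

def import_star(module_name, target_globals):
    # Mimic `from module import *`: honour __all__ if present,
    # otherwise copy every non-underscore name.
    module = importlib.import_module(module_name)
    if hasattr(module, "__all__"):
        names = module.__all__
    else:
        names = [n for n in vars(module) if not n.startswith("_")]
    target_globals.update({n: getattr(module, n) for n in names})

import_star("math", globals())
print(pi)  # math.pi is now available as a top-level name
```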

How to use a function to import a variable in python?

I want to write a function which takes the name of a variable, a file name, and a third string, and tries to import the given variable from the file and if it can not do that, it sets the variable to the third string. Let me show you. This is in my config.py:
variable = 'value'
This is my function (it doesn't work):
#!/usr/bin/python

def importvar(var, fname, notfound):
    try:
        from fname import var
    except:
        var = notfound
    return var

value = importvar('variable', 'config', 'value not found')
print value  # prints 'value not found'
This is what I am trying to achieve:
from config import variable
print variable #prints 'value'
This question is similar to "How to use a variable name as a variable in python?", but the answers I found to those didn't seem to work for me. I don't necessarily need to store them in a variable, but I couldn't come up with anything better. I know this is a perfect example of "What you shouldn't do in python", but I still need this. Thanks for the help!
What you want to do is dynamically importing a module starting from a string describing the path of the module. You can do this by using import_module from the importlib package.
import importlib

def importvar(var, fname, notfound):
    try:
        return getattr(importlib.import_module(fname), var)
    except:
        return notfound
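For instance, with the stdlib math module standing in for config (the function is restated here so the snippet is self-contained):

```python
import importlib

def importvar(var, fname, notfound):
    try:
        # Import the module by name, then pull the attribute off it.
        return getattr(importlib.import_module(fname), var)
    except Exception:
        return notfound

print(importvar('pi', 'math', 'value not found'))            # prints math.pi
print(importvar('no_such_name', 'math', 'value not found'))  # prints: value not found
```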
This should give you a clue:
>>> from importlib import import_module
>>> config = import_module('config')
>>> print( getattr(config, 'variable') )
value
See the docs for getattr.
Basically, getattr(x, 'variable') is equivalent to x.variable
A function to import and return the imported variable:
def importvar(var, fname, notfound):
    try:
        exec('from {f} import {v}'.format(f=fname, v=var))
        return locals().get(var, notfound)
    except:
        return notfound
If you just want a simple import from a string, the __import__ builtin may be good enough. It takes the module name as a string and returns the module. If you also need to get an attribute from it programmatically, use the builtin getattr, which takes the attribute name as a string.
If you're trying to import a package submodule, though, importlib.import_module is easier: you can import a dotted name and get the submodule directly. It just calls __import__ for you. Compare __import__("logging.config").config vs import_module("logging.config").
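A quick runnable check of that difference, using the stdlib logging package:

```python
import importlib

top = __import__("logging.config")               # returns the top-level package
sub = importlib.import_module("logging.config")  # returns the submodule itself

print(top.__name__)       # prints: logging
print(sub.__name__)       # prints: logging.config
print(top.config is sub)  # prints: True
```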
If you're trying to import an arbitrary file not on the Python path, it gets a little more involved. The Python docs have a recipe for this.
import importlib.util
spec = importlib.util.spec_from_file_location(module_name, file_path)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
Unlike __import__, this doesn't add the module to the cache, because it doesn't have a canonical import name. But you can add it yourself (using whatever name you want) if you want to import it normally later, e.g.
import sys
sys.modules["foo_module"] = module
After running this, it allows you to get the same module instance again with a simple
import foo_module

Python difference between __import__ and import as

I am trying to dynamically import a configuration file in Python.
My code works fine when I use:
import conf.config as config
but doesn't work correctly when I use:
config = __import__("conf.config")
Below are two sample programs and the results I get when running them:
# regularimport.py
import conf.config as config

def read_values(cfg):
    for varname in cfg.__dict__.keys():
        if varname.startswith('__'):
            continue
        value = getattr(cfg, varname)
        yield (varname, value)

for name, value in read_values(config):
    print "Current config: %s = %s" % (name, value)
Results:
$python regularimport.py
Current config: SETTING_TWO = another setting
Current config: SETTING_ONE = some setting
Dynamic import:
# dynamicimport.py
conf_str = "conf.config"
config = __import__(conf_str)

def read_values(cfg):
    for varname in cfg.__dict__.keys():
        if varname.startswith('__'):
            continue
        value = getattr(cfg, varname)
        yield (varname, value)

for name, value in read_values(config):
    print "Current config: %s = %s" % (name, value)
Results:
$ python dynamicimport.py
Current config: config = <module 'conf.config' from '/home/ubuntu/importex/conf/config.pyc'>
My question is why the difference? And more importantly, how can I make the dynamic import example work as it does with the regular import?
As the documentation explains:
When the name variable is of the form package.module, normally, the top-level package (the name up till the first dot) is returned, not the module named by name.
So, when you do this:
config = __import__("conf.config")
That's not the same as:
import conf.config as config
But rather something more like:
import conf.config
config = conf
Why?
Because conf, not conf.config, is the thing that gets bound by an import statement. (Sure, in import foo as bar, obviously bar gets bound… but __import__ isn't meant to be an equivalent of import foo as bar, but of import foo.) The docs explain further. But the upshot is that you probably shouldn't be using __import__ in the first place.
At the very top of the function documentation it says:
Note: This is an advanced function that is not needed in everyday Python programming, unlike importlib.import_module().
And at the bottom, after explaining how __import__ works with packages and why it works that way, it says:
If you simply want to import a module (potentially within a package) by name, use importlib.import_module().
So, as you might guess, the simple solution is to use importlib.import_module.
If you have to use Python 2.6, where importlib doesn't exist… well, there just is no easy solution. You can build something like import_module yourself out of imp. Or use __import__ and then dig through sys.modules. Or __import__ each piece and then getattr your way through the results. Or in various other hacky ways. And yes, that sucks—which is why 3.0 and 2.7 fixed it.
The 2.6 docs give an example of the second hack. Adapting it to your case:
import sys

__import__("conf.config")
config = sys.modules["conf.config"]
config = __import__("conf.config") is not equivalent to import conf.config as config.
For example:
>>> import os.path as path
>>> path
<module 'posixpath' from '/usr/lib/python2.7/posixpath.pyc'>
>>> __import__('os.path')
<module 'os' from '/usr/lib/python2.7/os.pyc'>
Instead of __import__ use importlib.import_module to get the subpackage / submodule.
>>> import importlib
>>> importlib.import_module('os.path')
<module 'posixpath' from '/usr/lib/python2.7/posixpath.pyc'>

Recursive version of 'reload'

When I'm developing Python code, I usually test it in an ad-hoc way in the interpreter. I'll import some_module, test it, find a bug, fix the bug and save, and then use the built-in reload function to reload(some_module) and test again.
However, suppose that in some_module I have import some_other_module, and while testing some_module I discover a bug in some_other_module and fix it. Now calling reload(some_module) won't recursively re-import some_other_module. I have to either manually reimport the dependency (by doing something like reload(some_module.some_other_module), or import some_other_module; reload(some_other_module)), or, if I've changed a whole bunch of dependencies and lost track of what I need to reload, restart the entire interpreter.
What'd be more convenient is if there were some recursive_reload function, and I could just do recursive_reload(some_module) and have Python not only reload some_module, but also recursively reload every module that some_module imports (and every module that each of those modules imports, and so on) so that I could be sure that I wasn't using an old version of any of the other modules upon which some_module depends.
I don't think there's anything built in to Python that behaves like the recursive_reload function I describe here, but is there an easy way to hack such a thing together?
I've run up against the same issue, and you inspired me to actually solve the problem.
from types import ModuleType

try:
    from importlib import reload  # Python 3.4+
except ImportError:
    # Needed for Python 3.0-3.3; harmless in Python 2.7 where imp.reload
    # is just an alias for the builtin reload.
    from imp import reload

def rreload(module):
    """Recursively reload modules."""
    reload(module)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if type(attribute) is ModuleType:
            rreload(attribute)
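The function above can be exercised end-to-end with two throwaway modules written to a temp directory (dep_a and dep_b are made-up names; bytecode caching is disabled so the reload re-reads the edited source, and rreload is restated so the snippet is self-contained):

```python
import os
import sys
import tempfile
from importlib import reload
from types import ModuleType

def rreload(module):
    """Recursively reload modules."""
    reload(module)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if type(attribute) is ModuleType:
            rreload(attribute)

sys.dont_write_bytecode = True  # force reload to re-read the .py source
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "dep_a.py"), "w") as f:
    f.write("VALUE = 1\n")
with open(os.path.join(tmp, "dep_b.py"), "w") as f:
    f.write("import dep_a\n")

sys.path.insert(0, tmp)
import dep_b
print(dep_b.dep_a.VALUE)  # prints: 1

with open(os.path.join(tmp, "dep_a.py"), "w") as f:
    f.write("VALUE = 2\n")

rreload(dep_b)  # a plain reload(dep_b) would leave dep_a.VALUE at 1
print(dep_b.dep_a.VALUE)  # prints: 2
```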
Or, if you are using IPython, just use dreload or pass --deep-reload on startup.
I've run up against the same issue and built on @Mattew's and @osa's answers.
from types import ModuleType
import os, sys
from importlib import reload  # assumed from the answer above; on Python 2 the builtin reload works

def rreload(module, paths=None, mdict=None):
    """Recursively reload modules."""
    if paths is None:
        paths = ['']
    if mdict is None:
        mdict = {}
    if module not in mdict:
        # modules reloaded from this module
        mdict[module] = []
    reload(module)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if type(attribute) is ModuleType:
            if attribute not in mdict[module]:
                if attribute.__name__ not in sys.builtin_module_names:
                    if os.path.dirname(attribute.__file__) in paths:
                        mdict[module].append(attribute)
                        rreload(attribute, paths, mdict)
    reload(module)
    #return mdict
There are three differences:
In the general case, reload(module) has to be called at the end of the function as well, as @osa pointed out.
With circular import dependencies the code posted earlier would loop forever so I've added a dictionary of lists to keep track of the set of modules loaded by other modules. While circular dependencies are not cool, Python allows them, so this reload function deals with them as well.
I've added a list of paths (default is ['']) from which reloading is allowed. Some modules don't like being reloaded the normal way (as shown here).
The code worked great for dependency modules imported just as import another_module, but it failed when the module imported functions with from another_module import some_func.
I expanded on @redsk's answer to try to be smart about these functions. I've also added a blacklist because unfortunately typing and importlib don't appear in sys.builtin_module_names (maybe there are more). Also, I wanted to prevent reloading of some dependencies I knew about.
I also track the reloaded module names and return them.
Tested on Python 3.7.4 Windows:
# imports carried over from the answers above
import os
import sys
from types import ModuleType
from importlib import reload

def rreload(module, paths=None, mdict=None, base_module=None, blacklist=None, reloaded_modules=None):
    """Recursively reload modules."""
    if paths is None:
        paths = [""]
    if mdict is None:
        mdict = {}
    if module not in mdict:
        # modules reloaded from this module
        mdict[module] = []
    if base_module is None:
        base_module = module
    if blacklist is None:
        blacklist = ["importlib", "typing"]
    if reloaded_modules is None:
        reloaded_modules = []
    reload(module)
    reloaded_modules.append(module.__name__)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if type(attribute) is ModuleType and attribute.__name__ not in blacklist:
            if attribute not in mdict[module]:
                if attribute.__name__ not in sys.builtin_module_names:
                    if os.path.dirname(attribute.__file__) in paths:
                        mdict[module].append(attribute)
                        reloaded_modules = rreload(attribute, paths, mdict, base_module, blacklist, reloaded_modules)
        elif callable(attribute) and attribute.__module__ not in blacklist:
            if attribute.__module__ not in sys.builtin_module_names and f"_{attribute.__module__}" not in sys.builtin_module_names:
                if sys.modules[attribute.__module__] != base_module:
                    if sys.modules[attribute.__module__] not in mdict:
                        mdict[sys.modules[attribute.__module__]] = [attribute]
                        reloaded_modules = rreload(sys.modules[attribute.__module__], paths, mdict, base_module, blacklist, reloaded_modules)
    reload(module)
    return reloaded_modules
Some notes:
I don't know why some builtin_module_names are prefixed with an underscore (for example, collections is listed as _collections), so I have to do the double string check.
callable() returns True for classes; I guess that's expected, but that was one of the reasons I had to blacklist extra modules.
At least now I am able to deep reload a module at runtime and from my tests I was able to go multiple levels deep with from foo import bar and see the result at each call to rreload()
(Apologies for the long and ugly depth, but the black-formatted version doesn't look so readable on SO.)
Wouldn't it be simpler to write some test cases and run them every time you are done modifying your module?
What you are doing is cool (you are, in essence, using TDD, i.e. test-driven development), but you are doing it wrong.
Consider that with written unit tests (using the default Python unittest module, or better yet nose) you get tests that are reusable and stable, and that help you detect inconsistencies in your code much faster and better than testing your module in the interactive environment.
I found the idea of just clearing all the modules and then reimporting your module here, which suggests simply doing this:
import sys
sys.modules.clear()
This would mess up modules loaded that you don't want to reload (if you only want to reload your own modules). My idea is to only clear the modules that include your own folders. Something like this:
import sys
import importlib

def reload_all():
    delete_folders = ["yourfolder", "yourotherfolder"]
    for module in list(sys.modules.keys()):
        if any(folder in module for folder in delete_folders):
            del sys.modules[module]
    # And then you can reimport the file that you are running.
    importlib.import_module("yourfolder.entrypoint")
Reimporting your entry point will reimport all of its imports since the modules were cleared and it's automatically recursive.
Technically, in each file you could put a reload command, to ensure that it reloads each time it imports
a.py:
def testa():
    print 'hi!'
b.py:
import a
reload(a)

def testb():
    a.testa()
Now, interactively:
import b
b.testb()
# hi!
# <modify a.py>
reload(b)
b.testb()
# hello again!
I found the answer of redsk very useful.
I propose a simplified (for the user, not as code) version where the path to the module is automatically gathered and recursion works for an arbitrary number of levels.
Everything is self-contained in a single function.
Tested on Python 3.4. I guess for python 3.3 one must import reload from imp instead of ... from importlib.
It also checks that the __file__ attribute is present, which might be missing if the coder forgets to define an __init__.py file in a submodule. In that case, an exception is raised.
def rreload(module):
    """
    Recursive reload of the specified module and (recursively) the used ones.
    Mandatory! Every submodule must have an __init__.py file
    Usage:
        import mymodule
        rreload(mymodule)
    :param module: the module to load (the module itself, not a string)
    :return: nothing
    """
    import os.path
    import sys

    def rreload_deep_scan(module, rootpath, mdict=None):
        from types import ModuleType
        from importlib import reload
        if mdict is None:
            mdict = {}
        if module not in mdict:
            # modules reloaded from this module
            mdict[module] = []
        # print("RReloading " + str(module))
        reload(module)
        for attribute_name in dir(module):
            attribute = getattr(module, attribute_name)
            # print("for attr " + attribute_name)
            if type(attribute) is ModuleType:
                # print("type ok")
                if attribute not in mdict[module]:
                    # print("not in mdict")
                    if attribute.__name__ not in sys.builtin_module_names:
                        # print("not a builtin")
                        # If the submodule is a python file, it will have a __file__ attribute
                        if not hasattr(attribute, '__file__'):
                            raise BaseException("Could not find attribute __file__ for module '" + str(attribute) + "'. Maybe a missing __init__.py file?")
                        attribute_path = os.path.dirname(attribute.__file__)
                        if attribute_path.startswith(rootpath):
                            # print("in path")
                            mdict[module].append(attribute)
                            rreload_deep_scan(attribute, rootpath, mdict)

    rreload_deep_scan(module, rootpath=os.path.dirname(module.__file__))
For Python 3.6+ you can use:
from types import ModuleType
import sys
import importlib

def deep_reload(m: ModuleType):
    name = m.__name__  # get the name that is used in sys.modules
    name_ext = name + '.'  # support finding sub modules or packages

    def compare(loaded: str):
        return (loaded == name) or loaded.startswith(name_ext)

    all_mods = tuple(sys.modules)  # prevent changing iterable while iterating over it
    sub_mods = filter(compare, all_mods)
    for pkg in sorted(sub_mods, key=lambda item: item.count('.'), reverse=True):
        importlib.reload(sys.modules[pkg])  # reload packages, beginning with the most deeply nested
Below is the recursive reload function that I use, including a magic function for ipython/jupyter.
It does a depth-first search through all sub-modules and reloads them in the correct order of dependence.
import logging
from importlib import reload, import_module
from types import ModuleType

from IPython.core.magic import register_line_magic

logger = logging.getLogger(__name__)


def reload_recursive(module, reload_external_modules=False):
    """
    Recursively reload a module (in order of dependence).

    Parameters
    ----------
    module : ModuleType or str
        The module to reload.
    reload_external_modules : bool, optional
        Whether to reload all referenced modules, including external ones which
        aren't submodules of ``module``.
    """
    _reload(module, reload_external_modules, set())


@register_line_magic('reload')
def reload_magic(module):
    """
    Reload module on demand.

    Examples
    --------
    >>> %reload my_module
    reloading module: my_module
    """
    reload_recursive(module)


def _reload(module, reload_all, reloaded):
    if isinstance(module, ModuleType):
        module_name = module.__name__
    elif isinstance(module, str):
        module_name, module = module, import_module(module)
    else:
        raise TypeError(
            "'module' must be either a module or str; "
            f"got: {module.__class__.__name__}")

    for attr_name in dir(module):
        attr = getattr(module, attr_name)
        check = (
            # is it a module?
            isinstance(attr, ModuleType)
            # has it already been reloaded?
            and attr.__name__ not in reloaded
            # is it a proper submodule? (or just reload all)
            and (reload_all or attr.__name__.startswith(module_name))
        )
        if check:
            _reload(attr, reload_all, reloaded)

    logger.debug(f"reloading module: {module.__name__}")
    reload(module)
    reloaded.add(module_name)
It is a tricky thing to do; I have a working example in this answer:
how to find list of modules which depend upon a specific module in python

Python: How to load a module twice?

Is there a way to load a module twice in the same python session?
To fill this question with an example: Here is a module:
Mod.py
x = 0
Now I would like to import that module twice, like creating two instances of a class to have actually two copies of x.
To already answer the questions in the comments, "why anyone would want to do that if they could just create a class with x as a variable":
You are correct, but there exists some huge amount of source that would have to be rewritten, and loading a module twice would be a quick fix^^.
Yes, you can load a module twice:
import mod
import sys
del sys.modules["mod"]
import mod as mod2
Now, mod and mod2 are two instances of the same module.
That said, I doubt this is ever useful. Use classes instead -- eventually it will be less work.
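A runnable illustration of the sys.modules trick, using the stdlib json module as a stand-in for mod (the cache entry is restored afterwards so later imports are unaffected):

```python
import sys
import json

first = json
del sys.modules["json"]  # forget the cached module object
import json as second    # this re-executes the module's source

print(first is second)                        # prints: False
print(first.dumps([1]) == second.dumps([1]))  # prints: True

sys.modules["json"] = first  # restore the original cache entry
```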
Edit: In Python 2.x, you can also use the following code to "manually" import a module:
import imp

def my_import(name):
    file, pathname, description = imp.find_module(name)
    code = compile(file.read(), pathname, "exec", dont_inherit=True)
    file.close()
    module = imp.new_module(name)
    exec code in module.__dict__
    return module
This solution might be more flexible than the first one. You no longer have to "fight" the import mechanism since you are (partly) rolling your own one. (Note that this implementation doesn't set the __file__, __path__ and __package__ attributes of the module -- if these are needed, just add code to set them.)
Deleting an entry from sys.modules will not necessarily work (e.g. it can fail when importing recurly twice, if you want to work with multiple recurly accounts in the same app, etc.).
Another way to accomplish this is:
>>> import importlib.util
>>> spec = importlib.util.find_spec(module_name)
>>> instance_one = importlib.util.module_from_spec(spec)
>>> instance_two = importlib.util.module_from_spec(spec)
>>> instance_one == instance_two
False
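One caveat worth noting: module_from_spec only creates the module object; running the module's code is a separate step (spec.loader.exec_module), and each call produces an independent namespace. A sketch with the stdlib json module (the marker attribute is just for illustration):

```python
import importlib.util

spec = importlib.util.find_spec("json")
instance_one = importlib.util.module_from_spec(spec)
instance_two = importlib.util.module_from_spec(spec)

# Execute the module's code in each copy; until this runs,
# the modules are empty shells.
spec.loader.exec_module(instance_one)
spec.loader.exec_module(instance_two)

instance_one.marker = "one"  # mutate one copy only
print(hasattr(instance_two, "marker"))  # prints: False
print(instance_one.dumps([1]))          # prints: [1]
```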
You could use the __import__ function.
module1 = __import__("module")
module2 = __import__("module")
Edit: As it turns out, this does not import two separate versions of the module, instead module1 and module2 will point to the same object, as pointed out by Sven.
