Python: difference between __import__ and import as

I am trying to dynamically import a configuration file in Python.
My code works fine when I use:
import conf.config as config
but doesn't work correctly when I use:
config = __import__("conf.config")
Below are two sample programs and the results I get when running them:
#regularimport.py
import conf.config as config

def read_values(cfg):
    for varname in cfg.__dict__.keys():
        if varname.startswith('__'):
            continue
        value = getattr(cfg, varname)
        yield (varname, value)

for name, value in read_values(config):
    print "Current config: %s = %s" % (name, value)
Results:
$python regularimport.py
Current config: SETTING_TWO = another setting
Current config: SETTING_ONE = some setting
Dynamic import:
#dynamicimport.py
conf_str = "conf.config"
config = __import__(conf_str)

def read_values(cfg):
    for varname in cfg.__dict__.keys():
        if varname.startswith('__'):
            continue
        value = getattr(cfg, varname)
        yield (varname, value)

for name, value in read_values(config):
    print "Current config: %s = %s" % (name, value)
Results:
$ python dynamicimport.py
Current config: config = <module 'conf.config' from '/home/ubuntu/importex/conf/config.pyc'>
My question is why the difference? And more importantly, how can I make the dynamic import example work as it does with the regular import?

As the documentation explains:
When the name variable is of the form package.module, normally, the top-level package (the name up till the first dot) is returned, not the module named by name.
So, when you do this:
config = __import__("conf.config")
That's not the same as:
import conf.config as config
But rather something more like:
import conf.config
config = conf
Why?
Because conf, not conf.config, is the thing that gets bound by an import statement. (Sure, in import foo as bar, obviously bar gets bound… but __import__ isn't meant to be an equivalent of import foo as bar, but of import foo.) The docs explain further. But the upshot is that you probably shouldn't be using __import__ in the first place.
At the very top of the function documentation it says:
Note: This is an advanced function that is not needed in everyday Python programming, unlike importlib.import_module().
And at the bottom, after explaining how __import__ works with packages and why it works that way, it says:
If you simply want to import a module (potentially within a package) by name, use importlib.import_module().
So, as you might guess, the simple solution is to use importlib.import_module.
If you have to use Python 2.6, where importlib doesn't exist… well, there just is no easy solution. You can build something like import_module yourself out of imp. Or use __import__ and then dig through sys.modules. Or __import__ each piece and then getattr your way through the results. Or in various other hacky ways. And yes, that sucks—which is why 3.0 and 2.7 fixed it.
The 2.6 docs give an example of the second hack. Adapting it to your case:
import sys

__import__("conf.config")
config = sys.modules["conf.config"]
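The third hack (__import__ each piece, then getattr your way down) can be sketched like this; import_dotted is a hypothetical helper name, and functools.reduce does the walking (demonstrated with a stdlib package, since conf.config isn't on everyone's path):

```python
from functools import reduce

def import_dotted(name):
    """Return the leaf module of a dotted import path such as "conf.config"."""
    top = __import__(name)  # imports the whole chain, returns the top-level package
    # walk the remaining components: getattr(conf, "config"), and so on
    return reduce(getattr, name.split('.')[1:], top)

path_mod = import_dotted("os.path")
```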

config = __import__("conf.config") is not equivalent to import conf.config as config.
For example:
>>> import os.path as path
>>> path
<module 'posixpath' from '/usr/lib/python2.7/posixpath.pyc'>
>>> __import__('os.path')
<module 'os' from '/usr/lib/python2.7/os.pyc'>
Instead of __import__ use importlib.import_module to get the subpackage / submodule.
>>> import importlib
>>> importlib.import_module('os.path')
<module 'posixpath' from '/usr/lib/python2.7/posixpath.pyc'>

Related

Use string to specify `from my_package import my_class as my_custom_name`

I'd like to make the following line dynamic :
from my_package import my_class as my_custom_name
I know how to dynamically import modules via string
import importlib
module_name = "my_package"
my_module = importlib.import_module(module_name)
as suggested here. However it still doesn't let me specify the class I want to import (my_class) and the alias I want to assign to the class name (my_custom_name). I'm using python 3.6
Two steps. First, you can reference a module directly using importlib:
importlib.import_module('my_package.my_module') # You can use '.'.join((my_package, my_module))
Your class will be contained in the module itself as an attribute, as in any import. As such, just use
my_custom_name = importlib.import_module('my_package.my_module').__dict__['my_class']
or even better
my_custom_name = getattr(importlib.import_module('my_package.my_module'), 'my_class')
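Wrapped into a helper, the dynamic equivalent of from my_package import my_class as my_custom_name looks like this (import_as is a made-up name, and the demo uses the stdlib since my_package is hypothetical):

```python
import importlib

def import_as(module_path, attr_name):
    """Dynamic version of: from <module_path> import <attr_name>."""
    module = importlib.import_module(module_path)
    return getattr(module, attr_name)

# dynamic equivalent of: from collections import OrderedDict as my_custom_name
my_custom_name = import_as("collections", "OrderedDict")
```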

How to use a function to import a variable in python?

I want to write a function which takes the name of a variable, a file name, and a third string, and tries to import the given variable from the file and if it can not do that, it sets the variable to the third string. Let me show you. This is in my config.py:
variable = 'value'
This is my function (it doesn't work):
#!/usr/bin/python
def importvar(var, fname, notfound):
    try:
        from fname import var
    except:
        var = notfound
    return var

value = importvar('variable', 'config', 'value not found')
print value  # prints 'value not found'
This is what I am trying to achieve:
from config import variable
print variable #prints 'value'
This question is similar to "How to use a variable name as a variable in python?", but the answers I found to those didn't seem to work for me. I don't necessarily need to store them in a variable, but I couldn't come up with anything better. I know this is a perfect example of "What you shouldn't do in python", but I still need this. Thanks for the help!
What you want to do is dynamically importing a module starting from a string describing the path of the module. You can do this by using import_module from the importlib package.
import importlib

def importvar(var, fname, notfound):
    try:
        return getattr(importlib.import_module(fname), var)
    except:
        return notfound
This should give you the clue:
>>> from importlib import import_module
>>> config = import_module('config')
>>> print( getattr(config, 'variable') )
value
See the docs for getattr.
Basically, getattr(x, 'variable') is equivalent to x.variable
a function for import & return imported variable:
def importvar(var, fname, notfound):
    try:
        exec('from {f} import {v}'.format(f=fname, v=var))
        return locals().get(var, notfound)
    except:
        return notfound
If you just want a simple import from a string, the __import__ builtin may be good enough. It takes the module name as a string and returns it. If you also need to get an attribute from it programmatically use the builtin getattr, which takes the attribute name as a string.
If you're trying to import a package submodule, though, importlib.import_module is easier--you can import a name with a dot in it and get the module directly. This just calls __import__ for you. Compare __import__("logging.config").config vs import_module("logging.config").
If you're trying to import an arbitrary file not on the Python path, it gets a little more involved. The Python docs have a recipe for this.
import importlib.util
spec = importlib.util.spec_from_file_location(module_name, file_path)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
Unlike __import__, this doesn't add the module to the cache, because it doesn't have a canonical import name. But you can add it yourself (using whatever name you want) if you want to import it normally later, e.g.
import sys
sys.modules["foo_module"] = module
After running this, it allows you to get the same module instance again with a simple
import foo_module
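End to end, the recipe plus the sys.modules registration can be sketched like this; the temp file and the names scratch.py and foo_module are made up so the snippet runs anywhere:

```python
import importlib.util
import os
import sys
import tempfile

# write a throwaway module to load by file path
tmp_dir = tempfile.mkdtemp()
file_path = os.path.join(tmp_dir, "scratch.py")
with open(file_path, "w") as f:
    f.write("ANSWER = 42\n")

spec = importlib.util.spec_from_file_location("foo_module", file_path)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)     # runs scratch.py inside the new module object

sys.modules["foo_module"] = module  # register under a name of our choosing
import foo_module                   # now a normal import finds it in the cache
```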

Recursive version of 'reload'

When I'm developing Python code, I usually test it in an ad-hoc way in the interpreter. I'll import some_module, test it, find a bug, fix the bug and save, and then use the built-in reload function to reload(some_module) and test again.
However, suppose that in some_module I have import some_other_module, and while testing some_module I discover a bug in some_other_module and fix it. Now calling reload(some_module) won't recursively re-import some_other_module. I have to either manually reimport the dependency (by doing something like reload(some_module.some_other_module), or import some_other_module; reload(some_other_module), or, if I've changed a whole bunch of dependencies and lost track of what I need to reload, I need to restart the entire interpreter.
What'd be more convenient is if there were some recursive_reload function, and I could just do recursive_reload(some_module) and have Python not only reload some_module, but also recursively reload every module that some_module imports (and every module that each of those modules imports, and so on) so that I could be sure that I wasn't using an old version of any of the other modules upon which some_module depends.
I don't think there's anything built in to Python that behaves like the recursive_reload function I describe here, but is there an easy way to hack such a thing together?
I've run up against the same issue, and you inspired me to actually solve the problem.
from types import ModuleType
try:
    from importlib import reload  # Python 3.4+
except ImportError:
    # Needed for Python 3.0-3.3; harmless in Python 2.7 where imp.reload is just an
    # alias for the builtin reload.
    from imp import reload

def rreload(module):
    """Recursively reload modules."""
    reload(module)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if type(attribute) is ModuleType:
            rreload(attribute)
Or, if you are using IPython, just use dreload or pass --deep-reload on startup.
I ran up against the same issue and built on #Mattew's and #osa's answers.
from types import ModuleType
import os, sys

def rreload(module, paths=None, mdict=None):
    """Recursively reload modules."""
    if paths is None:
        paths = ['']
    if mdict is None:
        mdict = {}
    if module not in mdict:
        # modules reloaded from this module
        mdict[module] = []
    reload(module)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if type(attribute) is ModuleType:
            if attribute not in mdict[module]:
                if attribute.__name__ not in sys.builtin_module_names:
                    if os.path.dirname(attribute.__file__) in paths:
                        mdict[module].append(attribute)
                        rreload(attribute, paths, mdict)
    reload(module)
    #return mdict
There are three differences:
In the general case, reload(module) has to be called at the end of the function as well, as #osa pointed out.
With circular import dependencies the code posted earlier would loop forever, so I've added a dictionary of lists to keep track of the modules loaded by each module. While circular dependencies are not cool, Python allows them, so this reload function deals with them as well.
I've added a list of paths (default is ['']) from which reloading is allowed. Some modules don't like being reloaded the normal way (as shown here).
The code worked great for dependency modules imported just as import another_module, but it failed when the module imported functions with from another_module import some_func.
I expanded on #redsk's answer to try and be smart about these functions. I've also added a blacklist because unfortunately typing and importlib don't appear in sys.builtin_module_names (maybe there are more). Also I wanted to prevent reloading of some dependencies I knew about.
I also track the reloaded module names and return them.
Tested on Python 3.7.4 Windows:
def rreload(module, paths=None, mdict=None, base_module=None, blacklist=None, reloaded_modules=None):
    """Recursively reload modules."""
    if paths is None:
        paths = [""]
    if mdict is None:
        mdict = {}
    if module not in mdict:
        # modules reloaded from this module
        mdict[module] = []
    if base_module is None:
        base_module = module
    if blacklist is None:
        blacklist = ["importlib", "typing"]
    if reloaded_modules is None:
        reloaded_modules = []
    reload(module)
    reloaded_modules.append(module.__name__)
    for attribute_name in dir(module):
        attribute = getattr(module, attribute_name)
        if type(attribute) is ModuleType and attribute.__name__ not in blacklist:
            if attribute not in mdict[module]:
                if attribute.__name__ not in sys.builtin_module_names:
                    if os.path.dirname(attribute.__file__) in paths:
                        mdict[module].append(attribute)
                        reloaded_modules = rreload(attribute, paths, mdict, base_module, blacklist, reloaded_modules)
        elif callable(attribute) and attribute.__module__ not in blacklist:
            if attribute.__module__ not in sys.builtin_module_names and f"_{attribute.__module__}" not in sys.builtin_module_names:
                if sys.modules[attribute.__module__] != base_module:
                    if sys.modules[attribute.__module__] not in mdict:
                        mdict[sys.modules[attribute.__module__]] = [attribute]
                        reloaded_modules = rreload(sys.modules[attribute.__module__], paths, mdict, base_module, blacklist, reloaded_modules)
    reload(module)
    return reloaded_modules
Some notes:
I don't know why some builtin_module_names are prefixed with an underscore (for example collections is listed as _collections), so I have to do the double string check.
callable() returns True for classes; I guess that's expected, but that was one of the reasons I had to blacklist extra modules.
At least now I am able to deep reload a module at runtime and from my tests I was able to go multiple levels deep with from foo import bar and see the result at each call to rreload()
(Apologies for the long and ugly depth, but the black formatted version doesn't look so readable on SO)
Wouldn't it be simpler to actually write some test cases and run them every time you are done with modifying your module?
What you are doing is cool (you are in essence using TDD, test-driven development), but you are doing it wrong.
Consider that with written unit tests (using the default python unittest module, or better yet nose) you get tests that are reusable and stable, and that help you detect inconsistencies in your code much faster and better than testing your module in the interactive environment.
I found the idea to just clear all the modules and then reimport your module here, which suggested to just do this:
import sys
sys.modules.clear()
This would mess up modules loaded that you don't want to reload (if you only want to reload your own modules). My idea is to only clear the modules that include your own folders. Something like this:
import sys
import importlib
def reload_all():
    delete_folders = ["yourfolder", "yourotherfolder"]
    for module in list(sys.modules.keys()):
        if any(folder in module for folder in delete_folders):
            del sys.modules[module]
    # And then you can reimport the file that you are running.
    importlib.import_module("yourfolder.entrypoint")
Reimporting your entry point will reimport all of its imports since the modules were cleared and it's automatically recursive.
Technically, in each file you could put a reload command, to ensure that it reloads each time it imports
a.py:
def testa():
    print 'hi!'
b.py:
import a
reload(a)
def testb():
    a.testa()
Now, interactively:
import b
b.testb()
#hi!
#<modify a.py>
reload(b)
b.testb()
#hello again!
I found the answer of redsk very useful.
I propose a simplified (for the user, not as code) version where the path to the module is automatically gathered and recursion works for an arbitrary number of levels.
Everything is self-contained in a single function.
Tested on Python 3.4. I guess for python 3.3 one must import reload from imp instead of ... from importlib.
It also checks that the __file__ attribute is present, which it might not be if the coder forgets to define an __init__.py file in a submodule. In that case, an exception is raised.
def rreload(module):
    """
    Recursive reload of the specified module and (recursively) the used ones.
    Mandatory! Every submodule must have an __init__.py file
    Usage:
        import mymodule
        rreload(mymodule)
    :param module: the module to load (the module itself, not a string)
    :return: nothing
    """
    import os.path
    import sys

    def rreload_deep_scan(module, rootpath, mdict=None):
        from types import ModuleType
        from importlib import reload
        if mdict is None:
            mdict = {}
        if module not in mdict:
            # modules reloaded from this module
            mdict[module] = []
        # print("RReloading " + str(module))
        reload(module)
        for attribute_name in dir(module):
            attribute = getattr(module, attribute_name)
            # print("for attr " + attribute_name)
            if type(attribute) is ModuleType:
                # print("type ok")
                if attribute not in mdict[module]:
                    # print("not in mdict")
                    if attribute.__name__ not in sys.builtin_module_names:
                        # print("not a builtin")
                        # If the submodule is a python file, it will have a __file__ attribute
                        if not hasattr(attribute, '__file__'):
                            raise BaseException("Could not find attribute __file__ for module '" + str(attribute) + "'. Maybe a missing __init__.py file?")
                        attribute_path = os.path.dirname(attribute.__file__)
                        if attribute_path.startswith(rootpath):
                            # print("in path")
                            mdict[module].append(attribute)
                            rreload_deep_scan(attribute, rootpath, mdict)

    rreload_deep_scan(module, rootpath=os.path.dirname(module.__file__))
For Python 3.6+ you can use:
from types import ModuleType
import sys
import importlib
def deep_reload(m: ModuleType):
    name = m.__name__  # get the name that is used in sys.modules
    name_ext = name + '.'  # support finding sub modules or packages

    def compare(loaded: str):
        return (loaded == name) or loaded.startswith(name_ext)

    all_mods = tuple(sys.modules)  # prevent changing iterable while iterating over it
    sub_mods = filter(compare, all_mods)
    for pkg in sorted(sub_mods, key=lambda item: item.count('.'), reverse=True):
        importlib.reload(sys.modules[pkg])  # reload packages, beginning with the most deeply nested
Below is the recursive reload function that I use, including a magic function for ipython/jupyter.
It does a depth-first search through all sub-modules and reloads them in the correct order of dependence.
import logging
from importlib import reload, import_module
from types import ModuleType
from IPython.core.magic import register_line_magic

logger = logging.getLogger(__name__)


def reload_recursive(module, reload_external_modules=False):
    """
    Recursively reload a module (in order of dependence).

    Parameters
    ----------
    module : ModuleType or str
        The module to reload.
    reload_external_modules : bool, optional
        Whether to reload all referenced modules, including external ones which
        aren't submodules of ``module``.
    """
    _reload(module, reload_external_modules, set())


@register_line_magic('reload')
def reload_magic(module):
    """
    Reload module on demand.

    Examples
    --------
    >>> %reload my_module
    reloading module: my_module
    """
    reload_recursive(module)


def _reload(module, reload_all, reloaded):
    if isinstance(module, ModuleType):
        module_name = module.__name__
    elif isinstance(module, str):
        module_name, module = module, import_module(module)
    else:
        raise TypeError(
            "'module' must be either a module or str; "
            f"got: {module.__class__.__name__}")

    for attr_name in dir(module):
        attr = getattr(module, attr_name)
        check = (
            # is it a module?
            isinstance(attr, ModuleType)
            # has it already been reloaded?
            and attr.__name__ not in reloaded
            # is it a proper submodule? (or just reload all)
            and (reload_all or attr.__name__.startswith(module_name))
        )
        if check:
            _reload(attr, reload_all, reloaded)

    logger.debug(f"reloading module: {module.__name__}")
    reload(module)
    reloaded.add(module_name)
It is a tricky thing to do - I have an working example in this answer:
how to find list of modules which depend upon a specific module in python

How to use the __import__ function to import a name from a submodule?

I'm trying to replicate from foo.bar import object using the __import__ function and I seem to have hit a wall.
A simpler case from glob import glob is easy: glob = __import__("glob").glob
The problem I'm having is that I am importing a name from a subpackage (i.e. from foo.bar):
So what I'd like is something like
string_to_import = "bar"
object = __import__("foo." + string_to_import).object
But this just imported the top-level foo package, not the foo.bar subpackage:
__import__("foo.bar")
<module 'foo' from 'foo/__init__.pyc'>
How to use python's __import__() function properly?
There are two kinds of uses:
direct importing
a hook to alter import behavior
For the most part, you don't really need to do either.
For user-space importing
Best practice is to use importlib instead. But if you insist:
Trivial usage:
>>> sys = __import__('sys')
>>> sys
<module 'sys' (built-in)>
Complicated:
>>> os = __import__('os.path')
>>> os
<module 'os' from '/home/myuser/anaconda3/lib/python3.6/os.py'>
>>> os.path
<module 'posixpath' from '/home/myuser/anaconda3/lib/python3.6/posixpath.py'>
If you want the rightmost child module in the name, pass a nonempty list, e.g. [None], to fromlist:
>>> path = __import__('os.path', fromlist=[None])
>>> path
<module 'posixpath' from '/home/myuser/anaconda3/lib/python3.6/posixpath.py'>
Or, as the documentation declares, use importlib.import_module:
>>> importlib = __import__('importlib')
>>> futures = importlib.import_module('concurrent.futures')
>>> futures
<module 'concurrent.futures' from '/home/myuser/anaconda3/lib/python3.6/concurrent/futures/__init__.py'>
Documentation
The docs for __import__ are the most confusing of the builtin functions.
__import__(...)
__import__(name, globals=None, locals=None, fromlist=(), level=0) -> module
Import a module. Because this function is meant for use by the Python
interpreter and not for general use it is better to use
importlib.import_module() to programmatically import a module.
The globals argument is only used to determine the context;
they are not modified. The locals argument is unused. The fromlist
should be a list of names to emulate ``from name import ...'', or an
empty list to emulate ``import name''.
When importing a module from a package, note that __import__('A.B', ...)
returns package A when fromlist is empty, but its submodule B when
fromlist is not empty. Level is used to determine whether to perform
absolute or relative imports. 0 is absolute while a positive number
is the number of parent directories to search relative to the current module.
If you read it carefully, you get the sense that the API was originally intended to allow lazy loading of functions from modules. However, this is not how CPython works, and I am unaware of any other implementation of Python that has managed to do this.
Instead, CPython executes all of the code in the module's namespace on its first import, after which the module is cached in sys.modules.
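That caching is easy to observe: repeated imports, by any mechanism, hand back the same object out of sys.modules:

```python
import sys

first = __import__("json")
second = __import__("json")
import json

# all three names refer to the single cached module object
assert first is second is json is sys.modules["json"]
```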
__import__ can still be useful. But understanding what it does based on the documentation is rather hard.
Full Usage of __import__
To adapt the full functionality to demonstrate the current __import__ API, here is a wrapper function with a cleaner, better documented, API.
def importer(name, root_package=False, relative_globals=None, level=0):
    """ We only import modules, functions can be looked up on the module.
    Usage:

    from foo.bar import baz
    >>> baz = importer('foo.bar.baz')

    import foo.bar.baz
    >>> foo = importer('foo.bar.baz', root_package=True)
    >>> foo.bar.baz

    from .. import baz (level = number of dots)
    >>> baz = importer('baz', relative_globals=globals(), level=2)
    """
    return __import__(name, locals=None,  # locals has no use
                      globals=relative_globals,
                      fromlist=[] if root_package else [None],
                      level=level)
To demonstrate, e.g. from a sister package to baz:
baz = importer('foo.bar.baz')
foo = importer('foo.bar.baz', root_package=True)
baz2 = importer('bar.baz', relative_globals=globals(), level=2)
assert foo.bar.baz is baz is baz2
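Run against the stdlib instead of the hypothetical foo.bar, the two non-relative modes of the wrapper behave like this (the wrapper is repeated so the snippet stands alone):

```python
def importer(name, root_package=False, relative_globals=None, level=0):
    # same wrapper as above, repeated so this snippet is self-contained
    return __import__(name, locals=None,
                      globals=relative_globals,
                      fromlist=[] if root_package else [None],
                      level=level)

path = importer('os.path')                       # like: from os import path
os_pkg = importer('os.path', root_package=True)  # like: import os.path (binds os)

assert os_pkg.path is path
```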
Dynamic access of names in the module
To dynamically access globals by name from the baz module, use getattr. For example:
for name in dir(baz):
    print(getattr(baz, name))
Hook to alter import behavior
You can use __import__ to alter or intercept importing behavior. In this case, let's just print the arguments it gets to demonstrate we're intercepting it:
old_import = __import__

def noisy_importer(name, globals, locals, fromlist, level):
    print(f'name: {name!r}')
    print(f'fromlist: {fromlist}')
    print(f'level: {level}')
    return old_import(name, globals, locals, fromlist, level)

import builtins
builtins.__import__ = noisy_importer
And now when you import you can see these important arguments.
>>> from os.path import join as opj
name: 'os.path'
fromlist: ('join',)
level: 0
>>> opj
<function join at 0x7fd08d882618>
Perhaps in this context getting the globals or locals could be useful, but no specific uses for this immediately come to mind.
The __import__ function will return the top level module of a package, unless you pass a nonempty fromlist argument:
_temp = __import__('foo.bar', fromlist=['object'])
object = _temp.object
See the Python docs on the __import__ function.
You should use importlib.import_module, __import__ is not advised outside the interpreter.
In __import__'s docstring:
Import a module. Because this function is meant for use by the Python
interpreter and not for general use it is better to use
importlib.import_module() to programmatically import a module.
It also supports relative imports.
Rather than use the __import__ function I would use the getattr function:
model = getattr(module, model_s)
where module is the module to look in and model_s is your model string. The __import__ function is not meant to be used loosely, whereas this function will get you what you want.
In addition to these excellent answers, I use __import__ for convenience, to call a one-liner on the fly. Examples like the following can also be saved as auto-triggered snippets in your IDE.
Plant an ipdb break-point (triggered with "ipdb")
__import__("ipdb").set_trace(context=9)
Print prettily (triggered with "pp")
__import__("pprint").pprint(<cursor-position>)
This way, you get a temporary object, that is not referenced by anything, and call an attribute on the spot. Also, you can easily comment, uncomment or delete a single line.

Python: How to load a module twice?

Is there a way to load a module twice in the same python session?
To fill this question with an example: Here is a module:
Mod.py
x = 0
Now I would like to import that module twice, like creating two instances of a class to have actually two copies of x.
To already answer the questions in the comments, "why anyone would want to do that if they could just create a class with x as a variable":
You are correct, but there exists some huge amount of source that would have to be rewritten, and loading a module twice would be a quick fix^^.
Yes, you can load a module twice:
import mod
import sys
del sys.modules["mod"]
import mod as mod2
Now, mod and mod2 are two instances of the same module.
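Here's a runnable sketch of that trick; the question's Mod.py is written to a temp directory first, since it doesn't exist outside the example:

```python
import os
import sys
import tempfile

# create the question's module as a throwaway file on the import path
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "mod.py"), "w") as f:
    f.write("x = 0\n")
sys.path.insert(0, tmp_dir)

import mod
mod.x = 99              # mutate the first copy

del sys.modules["mod"]  # forget it, forcing a fresh import
import mod as mod2      # second, independent instance

assert mod is not mod2
assert mod.x == 99 and mod2.x == 0
```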
That said, I doubt this is ever useful. Use classes instead -- eventually it will be less work.
Edit: In Python 2.x, you can also use the following code to "manually" import a module:
import imp

def my_import(name):
    file, pathname, description = imp.find_module(name)
    code = compile(file.read(), pathname, "exec", dont_inherit=True)
    file.close()
    module = imp.new_module(name)
    exec code in module.__dict__
    return module
This solution might be more flexible than the first one. You no longer have to "fight" the import mechanism since you are (partly) rolling your own one. (Note that this implementation doesn't set the __file__, __path__ and __package__ attributes of the module -- if these are needed, just add code to set them.)
Deleting an entry from sys.modules will not necessarily work (e.g. it fails when importing recurly twice, say if you want to work with multiple recurly accounts in the same app).
Another way to accomplish this is:
>>> import importlib
>>> spec = importlib.util.find_spec(module_name)
>>> instance_one = importlib.util.module_from_spec(spec)
>>> instance_two = importlib.util.module_from_spec(spec)
>>> instance_one == instance_two
False
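One caveat with that approach: module_from_spec only creates empty module objects, so each copy still needs its own exec_module call before its contents exist. Sketched with a stdlib module so it runs as-is:

```python
import importlib.util

spec = importlib.util.find_spec("json.tool")
instance_one = importlib.util.module_from_spec(spec)
instance_two = importlib.util.module_from_spec(spec)

# run the module body separately in each copy
spec.loader.exec_module(instance_one)
spec.loader.exec_module(instance_two)

# two genuinely independent copies of the same module
assert instance_one is not instance_two
assert instance_one.main is not instance_two.main
```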
You could use the __import__ function.
module1 = __import__("module")
module2 = __import__("module")
Edit: As it turns out, this does not import two separate versions of the module, instead module1 and module2 will point to the same object, as pointed out by Sven.
