Dynamically importing a Python module

I have a trusted remote server that stores many custom Python modules. I can fetch them via HTTP (e.g. using urllib2.urlopen) as text/plain, but I cannot save the fetched module code to the local hard disk. How can I import the code as a fully operable Python module, including its global variables and imports?
I suppose I have to use some combination of exec and imp module's functions, but I've been unable to make it work yet.

It looks like this should do the trick: importing a dynamically generated module
>>> import imp
>>> foo = imp.new_module("foo")
>>> foo_code = """
... class Foo:
...     pass
... """
>>> exec foo_code in foo.__dict__
>>> foo.Foo.__module__
'foo'
>>>
Also, as suggested in the ActiveState article, you might want to add your new module to sys.modules:
>>> import sys
>>> sys.modules["foo"] = foo
>>> from foo import Foo
>>> Foo
<class 'Foo' …>
>>>
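Putting that together with the fetch step from the question, a minimal sketch might look like the following (Python 3 shown, so urllib.request stands in for urllib2; the URL and module name are placeholders, and since this execs fetched code it assumes the server really is trusted):
import sys
import types
import urllib.request

def import_remote_module(name, url):
    # Fetch the module source as text without ever touching the local disk
    source = urllib.request.urlopen(url).read().decode("utf-8")
    # Create an empty module object and run the fetched code in its namespace
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    # Register it so later `import name` / `from name import ...` statements work
    sys.modules[name] = module
    return module

# remote_utils = import_remote_module("remote_utils", "http://example.com/remote_utils.py")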

Here's something I bookmarked a while back that covers something similar:
Customizing the Python Import System
It's a bit beyond what you want, but the basic idea is there.

Python3 version
(I attempted to edit the other answer, but the edit queue is full)
import imp
my_dynamic_module = imp.new_module("my_dynamic_module")
exec("""
class Foo:
    pass
""", my_dynamic_module.__dict__)
Foo = my_dynamic_module.Foo
foo_object = Foo()
# register it on sys
import sys
sys.modules[my_dynamic_module.__name__] = my_dynamic_module

I recently encountered trying to do this while trying to write unit tests for source code examples I put into a project's readme (I wanted to avoid just linking to small files or duplicating the text in a way that could get out of sync).
I came up with the following
import sys
import types
from importlib import import_module
def compile_and_install_module(module_name: str, source_code: str) -> types.ModuleType:
    """Compile source code and install it as a module.

    End result is that `import <module_name>` and `from <module_name> import ...` should work.
    """
    module = types.ModuleType(module_name, "Module created from source code")
    # Execute source in context of empty/fake module
    exec(source_code, module.__dict__)
    # Insert fake module into sys.modules. It's now a real module
    sys.modules[module_name] = module
    # Imports should work now
    return import_module(module_name)
And a quick example of how you can use it
$ cat hello.py
def foo():
    print("Hello world")

bar = 42
$ python
Python 3.9.5 (tags/v3.9.5:0a7dcbd, May 3 2021, 17:27:52) [MSC v.1928 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from compile import compile_and_install_module
>>> compile_and_install_module("hello", open("hello.py").read())
<module 'hello'>
>>> import hello
>>> hello.foo()
Hello world
>>> from hello import bar
>>> bar
42
You can remove the return value and the importlib import if you don't need the freshly imported module returned from the function.

Related

How to execute if __name__ != "__main__"?

I have code which I cannot alter, named temp.py, which contains:
if __name__ == "__main__":
    *some pieces of code, not a function call*
I want to import temp into another file which I can edit, but importing does not run the part above. I know I can use subprocess to run temp.py, but that is not what I want. I want to import the module entirely but cannot alter the code.
I have heard of a module called imp, but it is deprecated now.
EDIT:
I am aware that code under the if statement is not meant to be executed when imported, but let's just assume temp.py is written in a really bad way and I cannot alter it.
You are trying to go against the import system. Pretty much anything you end up doing to achieve this will be a hack.
Consider we have a file:
(py39) Juans-MacBook-Pro:~ juan$ cat subvert.py
def foo(x):
    print("Hello, ", x)

if __name__ == "__main__":
    foo("Goodbye")
Note, if I open up a REPL, I'm in __main__, so at your top-level script, you could just do:
(py39) Juans-MacBook-Pro:~ juan$ python
Python 3.9.5 (default, May 18 2021, 12:31:01)
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> __name__
'__main__'
>>> exec(open("subvert.py").read())
Hello, Goodbye
Of course, just execing the source code directly in the same namespace is probably not what you want...
Note, we can pass our own namespace:
>>> namespace = {"__name__": "__main__"}
>>> exec(open("subvert.py").read(), namespace)
Hello, Goodbye
So our module's namespace won't be clobbered.
Now, if you want an actual module after it, you could hack together something like:
>>> import types
>>> module = types.ModuleType("__main__")
>>> module
<module '__main__'>
>>> exec(open("subvert.py").read(), module.__dict__)
Hello, Goodbye
>>> module.foo('bar')
Hello, bar
I wouldn't expect any of this to work well. It won't have all the attributes of a fully loaded module, look at importlib if you find you need to flesh it out more.
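If the goal is simply to run the guarded block without hand-rolling the exec, the standard library's runpy module does essentially this; a small sketch, assuming the subvert.py file from above:
import runpy

# Execute subvert.py as if it were run as a script: __name__ is set to
# "__main__", so the guarded block fires. The resulting namespace is returned.
namespace = runpy.run_path("subvert.py", run_name="__main__")
namespace["foo"]("bar")   # the module's functions live in the returned dict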
Something like this is really bad practice, but it will work:
file1.py has the "main" part:
def foo1():
    print("foo1")

if __name__ == '__main__':
    print("main function started")
    foo1()
    print("main function finished")
file2.py is an intermediary file:
from file1 import *
with open('file1.py') as f:
    lines = f.readlines()
a = "if __name__ == '__main__':\n"
new_main = ["def main():\n"] + lines[lines.index(a)+1:]
exec("".join(new_main))
file3.py is used to import file2 as if it were the file1 module:
import file2
file2.main()
file2.foo1()
output when running file3.py:
main function started
foo1
main function finished
foo1
Python programmers use the following mechanism to prevent a script's main code from running when it is imported by other scripts:
if __name__ == "__main__":
    # TODO
Read about the philosophy behind it here.
When code is guarded by if __name__ == "__main__":, that is a signal that it isn't meant to be executed on import, and working around that isn't a good idea.
But you can use this mechanism:
def main():
    # TODO

if __name__ == "__main__":
    main()
This way, anyone can import your main script and call its main function.
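With that convention in place, other scripts can import the module and decide when to run it; a tiny sketch (assuming the guarded script above is saved as temp.py with its main code wrapped in main()):
# other_script.py -- temp.py is assumed to follow the main() convention above
import temp      # importing no longer runs the main code
temp.main()      # call it explicitly when you actually want it to run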

Python moving modules into sub directory (without breaking existing import structure)

Supposing I am writing library code with a directory structure as follows:
- mylibrary/
|
|-----foo.py
|-----bar.py
|-----baz.py
|-----__init__.py
And to better organise I create a sub directory:
- mylibrary/
|
|-----foobar/
| |-----foo.py
| |-----bar.py
|-----baz.py
|-----__init__.py
I want all client code to keep working without updates, so I want to update __init__.py so that imports don't break.
I've tried adding this to __init__.py:
from foobar import foo
Now if I open a shell I can do:
from mylibrary import foo
print(foo.Foo)
However if I do this:
from mylibrary.foo import Foo
I get No module named mylibrary.foo error.
Here is the traceback from my actual example:
Type "help", "copyright", "credits" or "license" for more information.
>>> from global_toolkit import section
>>> section.Section
<class 'global_toolkit.serialization.section.Section'>
>>> from global_toolkit.section import Section
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'global_toolkit.section'
>>>
Can anyone explain this behaviour?
Add this in your __init__.py :
from .foobar import foo, bar
import sys

for i in ['foo', 'bar']:
    sys.modules['mylibrary.' + i] = sys.modules['mylibrary.foobar.' + i]
Now, from mylibrary.foo import Foo should work.

Python Pickle failing on Windows but works on Mac [duplicate]

I have a class MyClass defined in my_module. MyClass has a method pickle_myself which pickles the instance of the class in question:
def pickle_myself(self, pkl_file_path):
    with open(pkl_file_path, 'w+') as f:
        pkl.dump(self, f, protocol=2)
I have made sure that my_module is in PYTHONPATH. In the interpreter, executing __import__('my_module') works fine:
>>> __import__('my_module')
<module 'my_module' from 'A:\my_stuff\my_module.pyc'>
However, when eventually loading the file, I get:
File "A:\Anaconda\lib\pickle.py", line 1128, in find_class
__import__(module)
ImportError: No module named my_module
Some things I have made sure of:
I have not changed the location of my_module.py (Python pickling after changing a module's directory)
I have tried to use dill instead, but still get the same error (More on python ImportError No module named)
EDIT -- A toy example that reproduces the error:
The example itself is spread over a bunch of files.
First, we have the module ball (stored in a file called ball.py):
class Ball():
    def __init__(self, ball_radius):
        self.ball_radius = ball_radius

    def say_hello(self):
        print "Hi, I'm a ball with radius {}!".format(self.ball_radius)
Then, we have the module test_environment:
import os
import ball
#import dill as pkl
import pickle as pkl
class Environment():
    def __init__(self, store_dir, num_balls, default_ball_radius):
        self.store_dir = store_dir
        self.balls_in_environment = [ball.Ball(default_ball_radius) for x in range(num_balls)]

    def persist(self):
        pkl_file_path = os.path.join(self.store_dir, "test_stored_env.pkl")
        with open(pkl_file_path, 'w+') as f:
            pkl.dump(self, f, protocol=2)
Then, we have a module that has functions to make environments, persist them, and load them, called make_persist_load:
import os
import test_environment
#import pickle as pkl
import dill as pkl
def make_env_and_persist():
    cwd = os.getcwd()
    my_env = test_environment.Environment(cwd, 5, 5)
    my_env.persist()

def load_env(store_path):
    stored_env = None
    with open(store_path, 'rb') as pkl_f:
        stored_env = pkl.load(pkl_f)
    return stored_env
Then we have a script to put it all together, in test_serialization.py:
import os
import make_persist_load
MAKE_AND_PERSIST = True
LOAD = (not MAKE_AND_PERSIST)
cwd = os.getcwd()
store_path = os.path.join(cwd, "test_stored_env.pkl")
if MAKE_AND_PERSIST == True:
    make_persist_load.make_env_and_persist()

if LOAD == True:
    loaded_env = make_persist_load.load_env(store_path)
In order to make it easy to use this toy example, I have put it all up in a GitHub repository that simply needs to be cloned into your directory of choice. Please see the README containing instructions, which I also reproduce here:
Instructions:
1) Clone repository into a directory.
2) Add repository directory to PYTHONPATH.
3) Open up test_serialization.py, and set the variable MAKE_AND_PERSIST to True. Run the script in an interpreter.
4) Close the previous interpreter instance, and start up a new one. In test_serialization.py, change MAKE_AND_PERSIST to False, and this will programmatically set LOAD to True. Run the script in an interpreter, causing ImportError: No module named test_environment.
5) By default, the test is set to use dill, instead of pickle. In order to change this, go into test_environment.py and make_persist_load.py, to change imports as required.
EDIT: after switching to dill '0.2.5.dev0', dill.detect.trace(True) output
C2: test_environment.Environment
# C2
D2: <dict object at 0x000000000A9BDAE8>
C2: ball.Ball
# C2
D2: <dict object at 0x000000000AA25048>
# D2
D2: <dict object at 0x000000000AA25268>
# D2
D2: <dict object at 0x000000000A9BD598>
# D2
D2: <dict object at 0x000000000A9BD9D8>
# D2
D2: <dict object at 0x000000000A9B0BF8>
# D2
# D2
EDIT: the toy example works perfectly well when run on Mac/Ubuntu (i.e. Unix-like systems?). It only fails on Windows.
I can tell from your question that you are probably doing something like this, with a class method that is attempting to pickle the instance of the class. It's ill-advised to do that, if you are doing that… it's much more sane to use pkl.dump external to the class instead (where pkl is pickle or dill etc). However, it can still work with this design, see below:
>>> class Thing(object):
...     def pickle_myself(self, pkl_file_path):
...         with open(pkl_file_path, 'w+') as f:
...             pkl.dump(self, f, protocol=2)
...
>>> import dill as pkl
>>>
>>> t = Thing()
>>> t.pickle_myself('foo.pkl')
Then restarting...
Python 2.7.10 (default, Sep 2 2015, 17:36:25)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> f = open('foo.pkl', 'r')
>>> t = dill.load(f)
>>> t
<__main__.Thing object at 0x1060ff410>
If you have a much more complicated class, which I'm sure you do, then you are likely to run into trouble, especially if that class uses another file that is sitting in the same directory.
>>> import dill
>>> from bar import Zap
>>> print dill.source.getsource(Zap)
class Zap(object):
    x = 1
    def __init__(self, y):
        self.y = y
>>>
>>> class Thing2(Zap):
...     def pickle_myself(self, pkl_file_path):
...         with open(pkl_file_path, 'w+') as f:
...             dill.dump(self, f, protocol=2)
...
>>> t = Thing2(2)
>>> t.pickle_myself('foo2.pkl')
Then restarting…
Python 2.7.10 (default, Sep 2 2015, 17:36:25)
[GCC 4.2.1 Compatible Apple LLVM 5.1 (clang-503.0.40)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> f = open('foo2.pkl', 'r')
>>> t = dill.load(f)
>>> t
<__main__.Thing2 object at 0x10eca8090>
>>> t.y
2
>>>
Well… shoot, that works too. You'll have to post your code, so we can see what pattern you are using that dill (and pickle) fails for. I know that having one module import another that is not "installed" (i.e. sitting in some local directory) and expecting the serialization to "just work" doesn't hold in all cases.
See dill issues:
https://github.com/uqfoundation/dill/issues/128
https://github.com/uqfoundation/dill/issues/129
and this SO question:
Why dill dumps external classes by reference, no matter what?
for some examples of failure and potential workarounds.
EDIT with regard to updated question:
I don't see your issue. Running from the command line, importing from the interpreter (import test_serialization), and running the script in the interpreter (as below, and indicated in your steps 3-5) all work. That leads me to think you might be using an older version of dill?
>>> import os
>>> import make_persist_load
>>>
>>> MAKE_AND_PERSIST = False #True
>>> LOAD = (not MAKE_AND_PERSIST)
>>>
>>> cwd = os.getcwd()
>>> store_path = os.path.join(cwd, "test_stored_env.pkl")
>>>
>>> if MAKE_AND_PERSIST == True:
...     make_persist_load.make_env_and_persist()
...
>>> if LOAD == True:
...     loaded_env = make_persist_load.load_env(store_path)
...
>>>
EDIT based on discussion in comments:
Looks like it's probably an issue with Windows, as that seems to be the only OS where the error appears.
EDIT after some work (see: https://github.com/uqfoundation/dill/issues/140):
Using this minimal example, I can reproduce the same error on Windows, while on MacOSX it still works…
# test.py
class Environment():
    def __init__(self):
        pass
and
# doit.py
import test
import dill
env = test.Environment()
path = "test.pkl"
with open(path, 'w+') as f:
    dill.dump(env, f)
with open(path, 'rb') as _f:
    _env = dill.load(_f)
print _env
However, if you use open(path, 'r') as _f, it works on both Windows and MacOSX. So it looks like the __import__ on Windows is more sensitive to file type than on non-Windows systems. Still, throwing an ImportError is weird… but this one small change should make it work.
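A related point that is easy to miss: pickle data is binary, so opening the files in binary mode on both ends ('wb'/'rb') is the conventional fix and should sidestep the newline translation Windows applies in text mode. A sketch mirroring doit.py above:
# doit_binary.py -- same as doit.py, but with binary-mode file handles
import test
import dill

env = test.Environment()
path = "test.pkl"

with open(path, 'wb') as f:      # 'wb' instead of 'w+'
    dill.dump(env, f)

with open(path, 'rb') as _f:
    _env = dill.load(_f)

print(_env)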
In case someone is having the same problem: I had it with Python 2.7, and the cause was that the pickle file had been created on Windows while I was running on Linux. What I had to do was run dos2unix on it, which has to be installed first:
sudo yum install dos2unix
And then convert the pickle file, for example:
dos2unix data.p

python: how to import a generated module [duplicate]

I'm writing a Python application that takes a command as an argument, for example:
$ python myapp.py command1
I want the application to be extensible, that is, to be able to add new modules that implement new commands without having to change the main application source. The tree looks something like:
myapp/
__init__.py
commands/
__init__.py
command1.py
command2.py
foo.py
bar.py
So I want the application to find the available command modules at runtime and execute the appropriate one.
Python defines an __import__() function, which takes a string for a module name:
__import__(name, globals=None, locals=None, fromlist=(), level=0)
The function imports the module name, potentially using the given globals and locals to determine how to interpret the name in a package context. The fromlist gives the names of objects or submodules that should be imported from the module given by name.
Source: https://docs.python.org/3/library/functions.html#__import__
So currently I have something like:
command = sys.argv[1]
try:
    command_module = __import__("myapp.commands.%s" % command, fromlist=["myapp.commands"])
except ImportError:
    # Display error message

command_module.run()
This works just fine, I'm just wondering if there is possibly a more idiomatic way to accomplish what we are doing with this code.
Note that I specifically don't want to get in to using eggs or extension points. This is not an open-source project and I don't expect there to be "plugins". The point is to simplify the main application code and remove the need to modify it each time a new command module is added.
See also: How do I import a module given the full path?
With Python older than 2.7/3.1, that's pretty much how you do it.
For newer versions, see importlib.import_module for Python 2 and Python 3.
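For the dispatch in the question, the importlib version might look roughly like this (a sketch using the same myapp.commands layout):
import sys
import importlib

command = sys.argv[1]
try:
    # e.g. `python myapp.py command1` imports myapp.commands.command1
    command_module = importlib.import_module("myapp.commands." + command)
except ImportError:
    sys.exit("Unknown command: %s" % command)
command_module.run()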
Or using __import__ you can import a list of modules by doing this:
>>> moduleNames = ['sys', 'os', 're', 'unittest']
>>> moduleNames
['sys', 'os', 're', 'unittest']
>>> modules = map(__import__, moduleNames)
Ripped straight from Dive Into Python.
The recommended way for Python 2.7 and 3.1 and later is to use importlib module:
importlib.import_module(name, package=None)
Import a module. The name argument specifies what module to import in absolute or relative terms (e.g. either pkg.mod or ..mod). If the name is specified in relative terms, then the package argument must be set to the name of the package which is to act as the anchor for resolving the package name (e.g. import_module('..mod', 'pkg.subpkg') will import pkg.mod).
e.g.
my_module = importlib.import_module('os.path')
Note: imp is deprecated since Python 3.4 in favor of importlib
As mentioned, the imp module provides loading functions:
imp.load_source(name, path)
imp.load_compiled(name, path)
I've used these before to perform something similar.
In my case I defined a specific class with defined methods that were required.
Once I loaded the module I would check if the class was in the module, and then create an instance of that class, something like this:
import imp
import os
def load_from_file(filepath):
    class_inst = None
    expected_class = 'MyClass'
    mod_name, file_ext = os.path.splitext(os.path.split(filepath)[-1])
    if file_ext.lower() == '.py':
        py_mod = imp.load_source(mod_name, filepath)
    elif file_ext.lower() == '.pyc':
        py_mod = imp.load_compiled(mod_name, filepath)
    if hasattr(py_mod, expected_class):
        class_inst = getattr(py_mod, expected_class)()
    return class_inst
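Note that imp has been deprecated since Python 3.4; a rough importlib.util-based equivalent of the same idea (handling only .py files, with the same MyClass convention; the helper name is just illustrative) could be sketched as:
import importlib.util
import os

def load_class_from_file(filepath, expected_class='MyClass'):
    # Load a .py file as a module and instantiate expected_class if it is defined there
    mod_name = os.path.splitext(os.path.basename(filepath))[0]
    spec = importlib.util.spec_from_file_location(mod_name, filepath)
    py_mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(py_mod)
    if hasattr(py_mod, expected_class):
        return getattr(py_mod, expected_class)()
    return None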
Using importlib
Importing a source file
Here is a slightly adapted example from the documentation:
import sys
import importlib.util
file_path = 'pluginX.py'
module_name = 'pluginX'
spec = importlib.util.spec_from_file_location(module_name, file_path)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)
# Verify contents of the module:
print(dir(module))
From here, module will be a module object representing the pluginX module (the same thing that would be assigned to pluginX by doing import pluginX). Thus, to call e.g. a hello function (with no parameters) defined in pluginX, use module.hello().
To get the effect of "importing" functionality from the module instead, store it in the in-memory cache of loaded modules, and then do the corresponding from import:
sys.modules[module_name] = module
from pluginX import hello
hello()
Importing a package
To import a package instead, calling import_module is sufficient. Suppose there is a package folder pluginX in the current working directory; then just do
import importlib
pkg = importlib.import_module('pluginX')
# check if it's all there..
print(dir(pkg))
Use the imp module, or the more direct __import__() function.
You can use exec:
exec("import myapp.commands.%s" % command)
If you want it in your locals:
>>> mod = 'sys'
>>> locals()['my_module'] = __import__(mod)
>>> my_module.version
'2.6.6 (r266:84297, Aug 24 2010, 18:46:32) [MSC v.1500 32 bit (Intel)]'
same would work with globals()
Similar to #monkut's solution, but reusable and error tolerant, as described here: http://stamat.wordpress.com/dynamic-module-import-in-python/
import os
import imp
def importFromURI(uri, absl):
    mod = None
    if not absl:
        uri = os.path.normpath(os.path.join(os.path.dirname(__file__), uri))
    path, fname = os.path.split(uri)
    mname, ext = os.path.splitext(fname)

    if os.path.exists(os.path.join(path, mname) + '.pyc'):
        try:
            return imp.load_compiled(mname, uri)
        except:
            pass
    if os.path.exists(os.path.join(path, mname) + '.py'):
        try:
            return imp.load_source(mname, uri)
        except:
            pass

    return mod
The below piece worked for me:
>>> import imp
>>> fp, pathname, description = imp.find_module("/home/test_module")
>>> test_module = imp.load_module("test_module", fp, pathname, description)
>>> print test_module.print_hello()
If you want to import it from a shell script:
python -c '<above entire code in one line>'
The following worked for me:
import sys, glob
sys.path.append('/home/marc/python/importtest/modus')
fl = glob.glob('modus/*.py')
modulist = []
adapters=[]
for i in range(len(fl)):
    fl[i] = fl[i].split('/')[1]
    fl[i] = fl[i][0:(len(fl[i])-3)]
    modulist.append(getattr(__import__(fl[i]), fl[i]))
    adapters.append(modulist[i]())
It loads modules from the folder 'modus'. The modules have a single class with the same name as the module name. E.g. the file modus/modu1.py contains:
class modu1():
    def __init__(self):
        self.x = 1
        print self.x
The result is a list of dynamically loaded classes "adapters".

Do something every time a module is imported

Is there a way to do something (print "funkymodule imported", for example) every time a module is imported from any other module, and not only the first time it's imported into the runtime or when it's reloaded?
One possibility would be to monkey patch __import__:
>>> old_import = __import__
>>> def my_import(module,*args,**kwargs):
...     print module, 'loaded'
...     return old_import(module,*args,**kwargs)
...
>>> __builtins__.__import__ = my_import
>>> import datetime
datetime loaded
>>> import datetime
datetime loaded
>>> import django
django loaded
It worked fine on the command line (using Python 2.7.3 on Windows XP), but I don't know if it would work in other environments.
To access the module object (instead of just the module name - so you can do something useful with it) just intercept the return value instead of the argument:
>>> def my_import(*args,**kwargs):
...     ret = old_import(*args,**kwargs)
...     print ret
...     return ret
...
>>> __builtins__.__import__ = my_import
>>> import datetime
<module 'datetime' (built-in)>
>>> import django
<module 'django' from 'C:\Python27\lib\site-packages\django\__init__.pyc'>
Update: Just confirmed it works if used inside a python file too - though in this case, the correct way of assigning it is __builtins__['__import__'] = my_import.
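On Python 3, the same monkey patch goes through the builtins module rather than __builtins__; a sketch:
import builtins

_original_import = builtins.__import__

def my_import(name, globals=None, locals=None, fromlist=(), level=0):
    # Called for every import statement, even when the module is already
    # cached in sys.modules, so the hook fires on repeated imports too.
    module = _original_import(name, globals, locals, fromlist, level)
    print(name, 'loaded')
    return module

builtins.__import__ = my_import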
