Dynamically load and execute Python modules in isolation - python

This is an extension to the question posed here on dynamically execution of python modules
Dynamic module import in Python
While the consensus seems to be to use import or importlib to accomplish the dynamic loading/execution of python modules, this solution tends to break down when you have additional imports defined inside of the dynamically loaded module.
Take the original example
myapp/
__init__.py
commands/
__init__.py
command1.py
command2.py
foo.py
bar.py
If command1.py imports command2.py then when you try to dynamically load command1.py using importlib or import it will fail with
ModuleNotFoundError: No module named 'command2'
Now I can get around this by adding commands directory to sys.path but that will pollute the global namespace. This can get even more problematic if there are multiple commands folder with different pip third party library dependencies. One command may depend on a different version of pip installed library than another command.
So in essence, I am looking for a way to dynamically load/execute a python module in isolation. Any ideas on how to achieve this?

Since myapp is your project's root folder, not myapp/commands, you should do:
from commands import command2
in command1.py, so that the interpreter will be able to load command2.py.

Related

Importing module from a directory above current working directory

First of all, there are a bunch of solutions on stackoverflow regarding this but from the ones I tried none of them is working. I am working on a remote machine (linux). I am prototyping within the dir-2/module_2.py file using an ipython interpreter. Also I am trying to avoid using absolute paths as the absolute path in this remote machine is long and ugly, and I want my code to run on other machines upon download.
My directory structure is as follows:
/project-dir/
-/dir-1/
-/__ init__.py
-/module_1.py
-/dir-2/
-/__ init__.py
-/module_2.py
-/module_3.py
Now I want to import module_1 from module_2. However the solution mentioned in this stackoverflow post: link of using
sys.path.append('../..')
import module_2
Does not work. I get the error: ModuleNotFoundError: No module named 'module_1'
Moreover, within the ipython interpreter things like import .module_3 within module_2 throws error:
import .module_3
^ SyntaxError: invalid syntax
Isn't the dot operator supposed to work within the same directory as well. Overall I am quite confused by the importing mechanism. Any help with the initial problem is greatly appreciated! Thanks a lot!
Why it didn't work?
If you run the module1.py file and you want to import module2 then you need something like
sys.path.append("../dir-2")
If you use sys.path.append("../..") then the folder you added to the path is the folder containing project-dirand there is notmodule2.py` file inside it.
The syntax import .module_3 is for relative imports. if you tried to execute module2.py and it contains import .module_3 it does not work because you are using module2.py as a script. To use relative imports you need to treat both module2.py and module_3.py as modules. That is, some other file imports module2 and module2 import something from module3 using this syntax.
Suggestion on how you can proceed
One possible solution that solves both problems is property organizing the project and (optionally, ut a good idea) packaging your library (that is, make your code "installable"). Then, once your library is installed (in the virtual environment you are working) you don't need hacky sys.path solutions. You will be able to import your library from any folder.
Furthermore, don't treat your modules as scripts (don't run your modules). Use a separate python file as your "executable" (or entry point) and import everything you need from there. With this, relative imports in your module*.py files will work correctly and you don't get confused.
A possible directory structure could be
/project-dir/
- apps/
- main.py
- yourlib/
-/__ init__.py
-/dir-1/
-/__ init__.py
-/module_1.py
-/dir-2/
-/__ init__.py
-/module_2.py
-/module_3.py
Notice that the the yourlib folder as well as subfolders contain an __init__.py file. With this structure, you only run main.py (the name does not need to be main.py).
Case 1: You don't want to package your library
If you don't want to package your library, then you can add sys.path.append("../") in main.py to add "the project-dir/ folder to the path. With that your yourlib library will be "importable" in main.py. You can do something like from yourlib import module_2 and it will work correctly (and module_2 can use relative imports). Alternatively, you can also directly put main.py in the project-dir/ folder and you don't need to change sys.path at all, since project-dir/ will be the "working directory" in that case.
Note that you can also have a tests folder inside project-dir and to run a test file you can do the same as you did to run main.py.
Case 2: You want to package your library
The previous solution already solves your problems, but going the extra mile adds some benefits, such as dependency management and no need to change sys.path no matter where you are. There are several options to package your library and I will show one option using poetry due to its simplicity.
After installing poetry, you can run the command below in a terminal to create a new project
poetry new mylib
This creates the following folder structure
mylib/
- README.rst
- mylib/
- __init__.py
- pyproject.toml
- tests
You can then add the apps folder if you want, as well as subfolders inside mylib/ (each with a __init__.py file).
The pyproject.toml file specifies the dependencies and project metadata. You can edit it by hand and/or use poetry to add new dependencies, such as
poetry add pandas
poetry add --dev mypy
to add pandas as a dependency and mypy as a development dependency, for instance. After that, you can run
poetry build
to create a virtual environment and install your library in it. You can activate the virtual environment with poetry shell and you will be able to import your library from anywhere. Note that you can change your library files without the need to run poetry build again.
At last, if you want to publish your library in PyPi for everyone to see you can use
poetry publish --username your_pypi_username --password _passowrd_
TL; DR
Use an organized project structure with a clear place for the scripts you execute. Particularly, it is better if the script you execute is outside the folder with your modules. Also, don't run a module as a script (otherwise you can't use relative imports).

Resolving module conflict in python [duplicate]

Okay, the scenario is very simple. I have this file structure:
.
├── interface.py
├── pkg
│   ├── __init__.py
│   ├── mod1.py
│   ├── mod2.py
Now, these are my conditions:
mod2 needs to import mod1.
both interface.py and mod2 needs to be run independently as a main script. If you want, think interface as the actual program and mod2 as an internal tester of the package.
So, in Python 2 I would simply do import mod1 inside mod2.py and both python2 mod2.py and python2 interface.py would work as expected.
However, and this is the part I less understand, using Python 3.5.2, if I do import mod1; then I can do python3 mod2.py, but python3 interface.py throws: ImportError: No module named 'mod1' :(
So, apparently, python 3 proposes to use import pkg.mod1 to avoid collisions against built-in modules. Ok, If I use that I can do python3 interface.py; but then I can't python3 mod2.py because: ImportError: No module named 'pkg'
Similarly, If I use relative import:
from . import mod1 then python3 interface.py works; but mod2.py says SystemError: Parent module '' not loaded, cannot perform relative import :( :(
The only "solution", I've found is to go up one folder and do python -m pkg.mod2 and then it works. But do we have to be adding the package prefix pkg to every import to other modules within that package? Even more, to run any scripts inside the package, do I have to remember to go one folder up and use the -m switch? That's the only way to go??
I'm confused. This scenario was pretty straightforward with python 2, but looks awkward in python 3.
UPDATE: I have upload those files with the (referred as "solution" above) working source code here: https://gitlab.com/Akronix/test_python3_packages. Note that I still don't like it, and looks much uglier than the python2 solution.
Related SO questions I've already read:
Python -- import the package in a module that is inside the same package
How to do relative imports in Python?
Absolute import module in same package
Related links:
https://docs.python.org/3.5/tutorial/modules.html
https://www.python.org/dev/peps/pep-0328/
https://www.python.org/dev/peps/pep-0366/
TLDR:
Run your code with python -m pkg.mod2.
Import your code with from . import mod1.
The only "solution", I've found is to go up one folder and do python -m pkg.mod2 and then it works.
Using the -m switch is indeed the "only" solution - it was already the only solution before. The old behaviour simply only ever worked out of sheer luck; it could be broken without even modifying your code.
Going "one folder up" merely adds your package to the search path. Installing your package or modifying the search path works as well. See below for details.
But do we have to be adding the package prefix pkg to every import to other modules within that package?
You must have a reference to your package - otherwise it is ambiguous which module you want. The package reference can be either absolute or relative.
A relative import is usually what you want. It saves writing pkg explicitly, making it easier to refactor and move modules.
# inside mod1.py
# import mod2 - this is wrong! It can pull in an arbitrary mod2 module
# these are correct, they uniquely identify the module
import pkg.mod2
from pkg import mod2
from . import mod2
from .mod2 import foo # if pkg.mod2.foo exists
Note that you can always use <import> as <name> to bind your import to a different name. For example, import pkg.mod2 as mod2 lets you work with just the module name.
Even more, to run any scripts inside the package, do I have to remember to go one folder up and use the -m switch? That's the only way to go??
If your package is properly installed, you can use the -m switch from anywhere. For example, you can always use python3 -m json.tool.
echo '{"json":"obj"}' | python -m json.tool
If your package is not installed (yet), you can set PYTHONPATH to its base directory. This includes your package in the search path, and allows the -m switch to find it properly.
If you are in the executable's directory, you can execute export PYTHONPATH="$(pwd)/.." to quickly mount the package for import.
I'm confused. This scenario was pretty straightforward with python 2, but looks awkward in python 3.
This scenario was basically broken in python 2. While it was straightforward in many cases, it was difficult or outright impossible to fix in any other cases.
The new behaviour is more awkward in the straightforward case, but robust and reliable in any case.
I had similar problem.
I solved it adding
import sys
sys.path.insert(0,".package_name")
into the __init__.py file in the package folder.

Why can I import successfully without __init__.py?

What exactly is the use of __init__.py? Yes, I know this file makes a directory into an importable package. However, consider the following example:
project/
foo/
__init__.py
a.py
bar/
b.py
If I want to import a into b, I have to add following statement:
sys.path.append('/path_to_foo')
import foo.a
This will run successfully with or without __init__.py. However, if there is not an sys.path.append statement, a "no module" error will occur, with or without __init__.py. This makes it seem lik eonly the system path matters, and that __init__.py does not have any effect.
Why would this import work without __init__.py?
__init__.py has nothing to do with whether Python can find your package. You've run your code in such a way that your package isn't on the search path by default, but if you had run it differently or configured your PYTHONPATH differently, the sys.path.append would have been unnecessary.
__init__.py used to be necessary to create a package, and in most cases, you should still provide it. Since Python 3.3, though, a folder without an __init__.py can be considered part of an implicit namespace package, a feature for splitting a package across multiple directories.
During import processing, the import machinery will continue to
iterate over each directory in the parent path as it does in Python
3.2. While looking for a module or package named "foo", for each directory in the parent path:
If <directory>/foo/__init__.py is found, a regular package is imported and returned.
If not, but <directory>/foo.{py,pyc,so,pyd} is found, a module is imported and returned. The exact list of extension varies by platform
and whether the -O flag is specified. The list here is
representative.
If not, but <directory>/foo is found and is a directory, it is recorded and the scan continues with the next directory in the parent
path.
Otherwise the scan continues with the next directory in the parent path.
If the scan completes without returning a module or package, and at
least one directory was recorded, then a namespace package is created.
If you really want to avoid __init__.py for some reason, you don't sys.path. Rather, create a module object and set its __path__ to a list of directories.
if I want to import a into b, I have to add following statement:
No! You'd just say: import foo.a. All this is provided you run the entire package at once using python -m main.module where main.module is the entry point to your entire application. It imports all other modules, and the modules that import more modules will try to look for them from the root of this project. For instance, foo.bar.c will import as foo.bar.b
Then it seems that only the system path matters and init.py does not have any effect.
You need to modify sys.path only when you are importing modules from locations that are not in your project, or the places where python looks for libraries. __init__.py not only makes a folder look like a package, it also does a few more things like "export" objects to outside world (__all__)
When you import something it has to either:
Retrieve an already loaded module or
Load the module that was imported
When you do import foo and python finds a folder called foo in a folder on your sys.path then it will look in that folder for an __init__.py to be considered the top level module.
(Note that if the package is not on your sys.path then you would need to append it's location to be able to import it.)
If that is not present it will look for a __init__.pyc version possibly in the __pycache__ folder, if that is also missing then that folder foo is not considered a loadable python package. If no other options for foo are found then an ImportError is raised.
If you try deleting the __init__.pyc file as well you will see that the the initializer script for a package is indeed necessary.

Override Python Module

I want to override a module of an existing Python application. The application is structured in modules and submodules and I have registered an extension. Within the extension I want to provide a module (new_module.py) which overrides the original module (module.py). So whenever another module imports it, my version is used.
/application
__init__.py
folder_a
folder_b
__init__.py
folder_b_a
__init__.py
module.py
/extension
__init__.py
new_module.py
I guess this can be achieved by setting it in the sys module, like this:
import extension.new_module
sys.modules["application.folder_b.folder_b_a.module"] = extension.new_module
But I am not sure where to put those lines. I tried it in all of the init files, but it does not work. Or is there another way?
Documented here: https://docs.python.org/2/using/cmdline.html#envvar-PYTHONSTARTUP
Force python to load your extension via the environment variable, and in that extension import sys and mess about with sys.modules.

Python package structure

I have a Python package with several subpackages.
myproject/
__init__.py
models/
__init__.py
...
controllers/
__init__.py
..
scripts/
__init__.py
myscript.py
Within myproject.scripts.myscript, how can I access myproject.models? I've tried
from myproject import models # No module named myproject
import models # No module named models
from .. import models # Attempted relative import in non-package
I've had to solve this before, but I can never remember how it's supposed to be done. It's just not intuitive to me.
This is the correct version:
from myproject import models
If it fails with ImportError: No module named foo it is because you haven't set PYTHONPATH to include the directory which contains myproject/.
I'm afraid other people will suggest tricks to let you avoid setting PYTHONPATH. I urge you to disregard them. This is why PYTHONPATH exists: to tell Python where to look for code to load. It is robust, reasonably well documented, and portable to many environments. Tricks people play to avoid having to set it are none of these things.
The explicit relative import will work even without PYTHONPATH being set, since it can just walk up the directory hierarchy until it finds the right place, it doesn't need to find the top and then walk down. However, it doesn't work in a script you pass as a command line argument to python (or equivalently, invoke directly with a #!/usr/bin/python line). This is because in both these cases, it becomes the __main__ module of the process. There's nowhere to walk up to from __main__ - it's already at the top! If you invoke the code in your script by importing that module, then it will be fine. That is, compare:
python myproject/scripts/myscript.py
to
python -c 'import myproject.scripts.myscript'
You can take advantage of this by not executing your script module directly, but creating a bin/myscript that does the import and perhaps calls a main function:
import myprojects.scripts.myscript
myprojects.scripts.myscript.main()
Compare to how Twisted's command line scripts are defined: http://twistedmatrix.com/trac/browser/trunk/bin/twistd
Your project is not in your path.
Option A
Install your package so that python can find it via its absolute name from anywhere (using from myproject import models )
Option B
Trickery to add the relative parent to your path
sys.path.append(os.path.abspath('..'))
The former option is recommended.

Categories

Resources