Python importing only modules within package - python

I am creating a Python package with multiple modules. I want to make sure that when I import modules within the package, they import only from the package and not from something outside the package that has the same name.
Is the correct way of doing this to use relative imports? Will this interfere when I move my package to a different location on my machine (or when it gets installed somewhere on a customer's machine)?

Modern relative imports (here's a reference) are package-relative and package-specific, so as long as the internal structure of your package does not change, you can move the package as a whole around wherever you want.
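For example, given a package laid out like this (the names here are purely illustrative):
mypackage/
    __init__.py
    utils.py
    core.py
then inside mypackage/core.py an explicit relative import always resolves to the sibling module inside the package, never to some other utils that happens to be on sys.path:
from . import utils
from .utils import some_helper  # assuming utils.py defines some_helper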
While Joran Beasley's answer should also work (though it does not seem necessary in the older versions of Python where absolute imports aren't the default, since the old import style checked the package's own directory first), I personally don't like modifying the import path like that when you don't have to, especially if you ever need to load one of the outside packages or modules that your own modules now shadow.
A warning, however: relative imports require that the module in question is loaded as part of a package, or at least has its __name__ set to indicate a location in a package. Relative imports won't work for a module whose __name__ == '__main__', so if you're writing a simple/casual script that uses another module in the same directory (and you want the script to keep referring to that directory; things won't work right if the current working directory is not the script's), you could do something like import os, sys; sys.path.insert(0, os.path.dirname(os.path.abspath(__file__))) (with thanks to https://stackoverflow.com/a/1432949/138772 for the idea). As noted in S.Lott's answer to the same question, this probably isn't something you'd want to do professionally or as part of a team project, but for something personal where you're just automating some menial task it should be fine.
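Spelled out a bit more verbosely, that one-liner amounts to something like this (helper_module is a hypothetical module sitting next to the script):
import os
import sys

# Put the directory containing this script at the front of the search path,
# so its sibling modules are found regardless of the current working directory.
script_dir = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, script_dir)

import helper_module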

sys.path tells Python where to look for imports
add
import sys
sys.path.insert(0,".")
to the top of your main Python script; this will ensure local packages are imported BEFORE builtin packages (although tbh I think this happens automagically)
if you really want to import only packages in your folder, do
import sys
sys.path = ["."]
however I do not recommend this at all as it will probably break lots of your stuff ...
most IDEs (Eclipse/PyCharm/etc.) provide mechanisms to set up the environment a project uses, including its paths
really the best option is not to name your packages the same as builtin packages or 3rd-party modules that are installed on your system
better still, distribute it as a correctly bundled package; this should more than suffice
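for example, a minimal setup.py along these lines (just a sketch; adjust the name, version and packages to your project) is enough to make the package pip-installable:
# setup.py -- minimal sketch, the names here are placeholders
from setuptools import setup, find_packages

setup(
    name="mypackage",           # hypothetical distribution name
    version="0.1",
    packages=find_packages(),   # picks up mypackage/ and any subpackages with __init__.py
)
once installed (pip install ., or pip install -e . while developing), the package can be imported from anywhere without touching sys.path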

Related

python: include files from other directories into a project

I have multiple python projects which should use a number of shared files but I can not figure out how to do this in python.
If I just copy the file into the python working directory, it works fine with:
from HBonds import calculateHBondsForSeveralReplicas, compareSameShapeHBMatrices, calculateHBonds
But I do not want to copy it. I want to include it from: /home/b/data/pythonWorkspace/util/HBonds
For me it would make sense to do it like this (but it does not work):
from /home/b/data/pythonWorkspace/util/HBonds/HBonds.py import calculateHBondsForSeveralReplicas, compareSameShapeHBMatrices, calculateHBonds
How can I do this?
You have to make sure PYTHONPATH includes the path to that directory, as was pointed out in the previous answer.
Or you can use a nastier way: make it available at runtime with a piece of code like this.
import os
import sys
folder = os.path.dirname('/home/b/data/pythonWorkspace/util/')
if folder not in sys.path:
    sys.path.append(folder)
from HBonds import HBonds
For 3rd-party libraries, it's best to install them the stock way - either to the system's site-packages or into a virtualenv.
For project(s) you're actively developing on the machine where they run, a maintainable solution is to add their root directory(ies) to PYTHONPATH so that you can import <top_level_module>.<submodule>.<etc> from anywhere. That's what we did at my previous job. The main plus is that updating or switching code bases becomes trivial.
Another way is to use relative imports, but they are intended for intra-package references, so that you don't have to reiterate the package's name everywhere. If many otherwise unrelated parts of the code use the same module(s), it's probably more convenient to make the shared part a separate package which would be a dependency for all of them.

How do I structure my Python project to allow named modules to be imported from sub directories

This is my directory structure:
Projects
  + Project_1
  + Project_2
  - Project_3
      - Lib1
          __init__.py  # empty
          moduleA.py
      - Tests
          __init__.py  # empty
          foo_tests.py
          bar_tests.py
          setpath.py
      __init__.py  # empty
      foo.py
      bar.py
Goals:
Have an organized project structure
Be able to independently run each .py file when necessary
Be able to reference/import both sibling and cousin modules
Keep all import/from statements at the beginning of each file.
I achieved #1 by using the above structure
I've mostly achieved 2, 3, and 4 by doing the following (as recommended by this excellent guide)
In any package that needs to access parent or cousin modules (such as the Tests directory above) I include a file called setpath.py which has the following code:
import os
import sys
sys.path.insert(0, os.path.abspath('..'))
sys.path.insert(0, os.path.abspath('.'))
sys.path.insert(0, os.path.abspath('...'))
Then, in each module that needs parent/cousin access, such as foo_tests.py, I can write a nice clean list of imports like so:
import setpath # Annoyingly, PyCharm warns me that this is an unused import statement
import foo.py
Inside setpath.py, the second and third inserts are not strictly necessary for this example, but are included as a troubleshooting step.
My problem is that this only works for imports that reference the module name directly, and not for imports that reference the package. For example, inside bar_tests.py, neither of the two statements below work when running bar_tests.py directly.
import setpath
import Project_3.foo.py # Error
from Project_3 import foo # Error
I receive the error "ImportError: No module named 'Project_3'".
What is odd is that I can run the file directly from within PyCharm and it works fine. I know that PyCharm is doing some behind-the-scenes magic with the Python path variable to make everything work, but I can't figure out what it is. As PyCharm simply runs python.exe and sets some environment variables, it should be possible to clone this behavior from within a Python script itself.
For reasons not really germane to this question, I have to reference bar using the Project_3 qualifier.
I'm open to any solution that accomplishes the above while still meeting my earlier goals. I'm also open to an alternate directory structure if there is one that works better. I've read the Python doc on imports and packages but am still at a loss. I think one possible avenue might be manually setting the __path__ variable, but I'm not sure which one needs to be changed or what to set it to.
Those types of questions qualify as "primarily opinion based", so let me share my opinion on how I would do it.
First "be able to independently run each .py file when necessary": either the file is an module, so it should not be called directly, or it is standalone executable, then it should import its dependencies starting from top level (you may avoid it in code or rather move it to common place, by using setup.py entry_points, but then your former executable effectively converts to a module). And yes, it is one of weak points of Python modules model, that causes misunderstandings.
Second, use virtualenv (or venv in Python 3) and put each of your Project_x into a separate one. This way the project's name won't be part of the Python module path.
Third, the link you've provided mentions setup.py – make use of it. Put your custom code into Project_x/src/mylib1, create src/mylib1/setup.py, and finally put your modules into src/mylib1/mylib1/module.py. Then you can install your code with pip like any other package (or use pip install -e so you can work on the code directly without reinstalling it, though that unfortunately has some limitations).
And finally, as you've already confirmed in a comment ;) the problem with your current setup was that in sys.path.insert(0, os.path.abspath('...')) you had mistakenly used Python's relative-import notation, which is incorrect for filesystem paths and should be replaced with '../..' to work as expected.
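In other words, a corrected setpath.py would look like this:
import os
import sys

sys.path.insert(0, os.path.abspath('..'))
sys.path.insert(0, os.path.abspath('.'))
sys.path.insert(0, os.path.abspath('../..'))  # was '...', which is only meaningful in import syntax, not as a filesystem path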
I think your goals are not reasonable. Specifically, goal number 2 is a problem:
Be able to independently run each .py file when necessary
This doesn't work well for modules in a package. At least, not if you're running the .py files naively (e.g. with python foo_tests.py on the command line). When you run the files that way, Python can't tell where the package hierarchy should start.
There are two alternatives that can work. The first option is to run your scripts from the top level folder (e.g. projects) using the -m flag to the interpreter to give it a dotted path to the main module, and using explicit relative imports to get the sibling and cousin modules. So rather than running python foo_tests.py directly, run python -m project_3.tests.foo_tests from the projects folder (or python -m tests.foo_tests from within project_3 perhaps), and have foo_tests.py use from .. import foo.
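For illustration, a minimal sketch (assuming the lowercase package names used above):
# project_3/tests/foo_tests.py
# run from the "projects" folder with:  python -m project_3.tests.foo_tests

from .. import foo  # explicit relative import: foo.py from the parent package

if __name__ == '__main__':
    print(foo)  # placeholder for the real test code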
The other (less good) option is to add a top-level folder to your Python installation's module search path on a system-wide basis (e.g. add the projects folder to the PYTHONPATH environment variable), and then use absolute imports for all your modules (e.g. import project_3.foo). This is effectively what your setpath module does, but doing it system-wide as part of your system's configuration, rather than at run time, is much cleaner. It also avoids the multiple names that setpath allows you to use to import a module (e.g. try import foo_tests, then tests.foo_tests, and you'll get two separate copies of the same module).

Some way to create a cross-platform, self-contained, cloud-synchronized python library of modules for personal use? [duplicate]

I need to ship a collection of Python programs that use multiple packages stored in a local Library directory: the goal is to avoid having users install packages before using my programs (the packages are shipped in the Library directory). What is the best way of importing the packages contained in Library?
I tried three methods, but none of them appears perfect: is there a simpler and robust method? or is one of these methods the best one can do?
In the first method, the Library folder is simply added to the library path:
import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'Library'))
import package_from_Library
The Library folder is put at the beginning so that the packages shipped with my programs have priority over the same modules installed by the user (this way I am sure that they have the correct version to work with my programs). This method also works when the Library folder is not in the current directory, which is good. However, this approach has drawbacks. Each and every one of my programs adds a copy of the same path to sys.path, which is a waste. In addition, all programs must contain the same three path-modifying lines, which goes against the Don't Repeat Yourself principle.
An improvement over the above problems consists in trying to add the Library path only once, by doing it in an imported module:
# In module add_Library_path:
import os
import sys

sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'Library'))
and then to use, in each of my programs:
import add_Library_path
import package_from_Library
This way, thanks to the caching mechanism of CPython, the module add_Library_path is only run once, and the Library path is added only once to sys.path. However, a drawback of this approach is that import add_Library_path has an invisible side effect, and that the order of the imports matters: this makes the code less legible, and more fragile. Also, this forces my distribution of programs to include an add_Library_path.py program that users will not use.
Python modules from Library can also be imported by making it a package (empty __init__.py file stored inside), which allows one to do:
from Library import module_from_Library
However, this breaks for packages in Library, as they might do something like from xlutils.filter import …, which fails because xlutils itself is not found through sys.path. So, this method works only for modules in Library, not for packages.
All these methods have some drawback.
Is there a better way of shipping programs with a collection of packages (that they use) stored in a local Library directory? or is one of the methods above (method 1?) the best one can do?
PS: In my case, all the packages from Library are pure Python packages, but a more general solution that works for any operating system is best.
PPS: The goal is that the user be able to use my programs without having to install anything (beyond copying the directory I ship them regularly), like in the examples above.
PPPS: More precisely, the goal is to have the flexibility of easily updating both my collection of programs and their associated third-party packages from Library by having my users do a simple copy of a directory containing my programs and the Library folder of "hidden" third-party packages. (I do frequent updates, so I prefer not forcing the users to update their Python distribution too.)
Messing around with sys.path leads to pain... The modern package template and Distribute contain a vast array of information and were in part set up to solve your problem.
What I would do is set up setup.py to install all your packages to a specific site-packages location, or, if you can, to the system's site-packages. In the former case, the local site-packages would then be added to the PYTHONPATH of the system/user. In the latter case, nothing needs to change.
You could use a batch file to set the Python path as well, or change the python executable to point to a shell script that sets a modified PYTHONPATH and then executes the Python interpreter. The latter, of course, means that you have to have access to the user's machine, which you do not. However, if your users only run scripts and do not import your own libraries, you could use your own wrapper for scripts:
#!/path/to/my/python
And the /path/to/my/python script would be something like:
#!/bin/sh
PYTHONPATH=/whatever/lib/path:$PYTHONPATH /usr/bin/python "$@"
I think you should have a look at path import hooks, which let you modify the behaviour of Python when it searches for modules.
For example, you could do something like KDE's script engine does for Python plugins [1].
It adds a special token to sys.path (like "<plasmaXXXXXX>", with XXXXXX being a random number just to avoid name collisions), and then when Python tries to import a module and can't find it in the other paths, it will call your importer, which can deal with it.
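A minimal sketch of that idea using the importlib machinery (the token and directory here are made up, and a real plugin loader would do more than this):
import sys
import importlib.machinery

TOKEN = "<mylibrary>"             # sentinel entry we control
LIBRARY_DIR = "/path/to/Library"  # hypothetical directory holding the bundled modules

def library_hook(path_entry):
    # Only claim our sentinel token; anything else falls through to the default hooks.
    if path_entry != TOKEN:
        raise ImportError
    return importlib.machinery.FileFinder(
        LIBRARY_DIR,
        (importlib.machinery.SourceFileLoader,
         importlib.machinery.SOURCE_SUFFIXES),
    )

sys.path_hooks.insert(0, library_hook)
sys.path.append(TOKEN)  # imports that fail everywhere else now fall back to LIBRARY_DIR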
A simpler alternative is to have a main script used as a launcher, which simply adds the path to sys.path and executes the target file (so that you can safely avoid putting the sys.path.append(...) line in every file).
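Such a launcher can be just a few lines, for example (a sketch, assuming the programs sit next to a Library folder as described in the question):
# launcher.py -- prepend the bundled Library, then run the requested program
import os
import runpy
import sys

here = os.path.dirname(os.path.abspath(__file__))
sys.path.insert(0, os.path.join(here, 'Library'))

# usage: python launcher.py some_program.py [args...]
target = sys.argv[1]
sys.argv = sys.argv[1:]  # let the target see its own argv
runpy.run_path(target, run_name='__main__')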
Yet another alternative, which works on Python 2.6+, would be to install the library under the per-user site-packages directory.
[1] You can find the source code under /usr/share/kde4/apps/plasma_scriptengine_python in a Linux installation with KDE.

Modules paths in Python

I have created a folder with all the modules for my GAE application, together with external libraries like Jinja2, to keep everything in one place. My folder structure looks like this:
lib\
    \utils\
        \__init__.py
        \firepython
        \jinja2
        \jsonpickle
    __init__.py
    sessions.py
When I try to load Jinja from utils/__init__.py, I get the error ImportError: No module named jinja2.environment. When I look at Jinja2's own import statements, I see they look like from jinja2.loaders. I tried changing them to from lib.jinja2.loaders, but then other import errors appear. More to the point, I don't think it's good practice to change imports in an external library's source if there is a more convenient and correct way to import the modules properly. I have also added some paths to PYTHONPATH, but it doesn't solve all the problems. How can I properly import an external package that is placed in another folder, maybe with a deep structure?
Indeed you should not have to change imports in external libraries - though depending on your environment, you might even have to.
PYTHONPATH
Modifying your PYTHONPATH should suffice; PYTHONPATH should contain a 'lib' path that is either absolute or relative to your home directory.
Then you could simply do
from jinja2 import WHATEVER
sys.path.append
Another way to go without PYTHONPATH is to use sys.path.append() and add your paths from your Python code. I actually favor that, as it also allows you to have per-application paths.
use virtualenv
The details would be a bit long to put here, but please follow the official doc.
These options apply to general Python development rather than GAE specifics; if it does not work on your development machine, you should post more details (exact imports, absolute paths, PYTHONPATH...).
A proper project structure and use of appcfg.py should work out dependencies when uploading to Google: please take a look at this good answer: How do I manage third-party Python libraries with Google App Engine? (virtualenv? pip?) and follow those guidelines.
A nice way to go with GAE is through yaml application directives. Please take a look at the doc for includes: http://code.google.com/appengine/docs/python/config/appconfig.html#Includes
Also remember that GAE officially supports Python 2.5, and 2.7 support is experimental.
Update: Python 2.7 is now officially supported.
To properly import a module, you need to make sure, that python knows where to find it.
To do so, for each external library append its parent directory to sys.path (at run time), or set up the PYTHONPATH environment variable (before running).
For example:
import sys
sys.path.append('/my/lib')
# now we can import from lib
import jsonpickle # will load /my/lib/jsonpickle/__init__.py
See http://docs.python.org/tutorial/modules.html#the-module-search-path to understand what Python does when you call import.

Importing Python modules from a distant directory

What's the shortest way to import a module from a distant, relative directory?
We've been using this code, which isn't too bad, except that your current working directory has to be the same as this code's directory or the relative path breaks, which can be error-prone and confusing to users.
import sys
sys.path.append('../../../Path/To/Shared/Code')
This code (I think) fixes that problem but is a lot more to type.
import os,sys
sys.path.append(os.path.realpath(os.path.join(os.path.dirname(__file__), '../../../Path/To/Shared/Code')))
Is there a shorter way to append the absolute path? Brevity matters because this is going to have to be typed/appear in a lot of our files. (We can't factor it out, because then it would be in the shared code and we couldn't get to it. Chicken & egg, bootstrapping, etc.)
Plus it bothers me that we keep blindly appending to sys.path, but checking first would be even more code. I sure wish something in the standard library could help with this.
This will typically appear in script files which are run from the command line. We're running Python 2.6.2.
Edit:
The reason we're using relative paths is that we typically have multiple, independent copies of the codebase on our computers. It's important that each copy of the codebase use its own copy of the shared code. So any solution which supports only a single code base (e.g., 'Put it in site-packages.') won't work for us.
Any suggestions? Thank you!
Since you don't want to install it in site-packages, you should use buildout or virtualenv to create isolated development environments. That solves the problem and means you don't have to fiddle with sys.path anymore (in fact, Buildout does exactly that for you).
You've explained in a comment why you don't want to install "a single site-packages directory", but what about putting in site-packages a single, tiny module, say jebootstrap.py:
import os, sys

def relative_dir(apath):
    return os.path.realpath(
        os.path.join(os.path.dirname(apath),
                     '../../../Path/To/Shared/Code'))

def addpack(apath):
    relative = relative_dir(apath)
    if relative not in sys.path:
        sys.path.append(relative)
Now everywhere in your code you can just have
import jebootstrap
jebootstrap.addpack(__file__)
and all the rest of your shared codebase can remain independent per-installation.
Any reason you wouldn't want to make your own shared-code dir under site-packages? Then you could just import shared.code.module...
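For illustration (the names are purely hypothetical), that would look like:
# site-packages/
#     shared/
#         __init__.py
#         code/
#             __init__.py
#             module.py
# after which any of your scripts, wherever they live, can simply do:
from shared.code import module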
You have several ways to handle imports, all documented in the Python language manual.
See http://docs.python.org/library/site.html and http://docs.python.org/reference/simple_stmts.html#the-import-statement
Put it in site-packages and have multiple Python installations. You select the installation using the ordinary PATH environment variable.
Put the directory in your PYTHONPATH environment variable. This is a per-individual-person setting, so you can manage to have multiple versions of the codebase this way.
Put the directory in a .pth file in your site-packages (a .pth file simply lists one directory per line, and each listed directory is added to sys.path). You select the installation using the ordinary PATH environment variable.
