Expand Python Search Path to Other Source

Expand Python Search Path to Other Source - python

I have just joined a project with a rather large existing code base. We develop in linux and do not use and IDE. We run through the command line. I'm trying to figure out how to get python to search for the right path when I run project modules. For instance, when I run something like:
python someprojectfile.py
I get
ImportError: no module named core.'somemodule'
I get this for all of my imports to I assume it's an issue with the path.
How do I get Python to search ~/codez/project/ and all the files and folders for *.py files during import statements?

There are a few possible ways to do this:
Set the environment variable PYTHONPATH to a colon-separated list of directories to search for imported modules.
In your program, use sys.path.append('/path/to/search') to add the names of directories you want Python to search for imported modules. sys.path is just the list of directories Python searches every time it gets asked to import a module, and you can alter it as needed (although I wouldn't recommend removing any of the standard directories!). Any directories you put in the environment variable PYTHONPATH will be inserted into sys.path when Python starts up.
Use site.addsitedir to add a directory to sys.path. The difference between this and just plain appending is that when you use addsitedir, it also looks for .pth files within that directory and uses them to possibly add additional directories to sys.path based on the contents of the files. See the documentation for more detail.
Which one of these you want to use depends on your situation. Remember that when you distribute your project to other users, they typically install it in such a manner that the Python code files will be automatically detected by Python's importer (i.e. packages are usually installed in the site-packages directory), so if you mess with sys.path in your code, that may be unnecessary and might even have adverse effects when that code runs on another computer. For development, I would venture a guess that setting PYTHONPATH is usually the best way to go.
However, when you're using something that just runs on your own computer (or when you have nonstandard setups, e.g. sometimes in web app frameworks), it's not entirely uncommon to do something like
import sys
from os.path import dirname
sys.path.append(dirname(__file__))

You should also read about python packages here: http://docs.python.org/tutorial/modules.html.
From your example, I would guess that you really have a package at ~/codez/project. The file __init__.py in a python directory maps a directory into a namespace. If your subdirectories all have an __init__.py file, then you only need to add the base directory to your PYTHONPATH. For example:
PYTHONPATH=$PYTHONPATH:$HOME/adaifotis/project
In addition to testing your PYTHONPATH environment variable, as David explains, you can test it in python like this:
$ python
>>> import project # should work if PYTHONPATH set
>>> import sys
>>> for line in sys.path: print line # print current python path
...

I know this thread is a bit old, but it took me some time to get to the heart of this, so I wanted to share.
In my project, I had the main script in a parent directory, and, to differentiate the modules, I put all the supporting modules in a sub-folder called "modules". In my main script, I import these modules like this (for a module called report.py):
from modules.report import report, reportError
If I call my main script, this works. HOWEVER, I wanted to test each module by including a main() in each, and calling each directly, as:
python modules/report.py
Now Python complains that it can't find "a module called modules". The key here is that, by default, Python includes the folder of the script in its search path, BUT NOT THE CWD. So what this error says, really, is "I can't find a modules subfolder". The is because there is no "modules" subdirectory from the directory where the report.py module resides.
I find that the neatest solution to this is to append the CWD in Python search path by including this at the top:
import sys
sys.path.append(".")
Now Python searches the CWD (current directory), finds the "modules" sub-folder, and all is well.

The easiest way I find is to create a file any_name.pth and put it in your folder \Lib\site-packages. You should find that folder wherever python is installed.
In that file, put a list of directories where you want to keep modules for importing. For instance, make a line in that file like this:
C:\Users\example\...\example
You will be able to tell it works by running this in python:
import sys
for line in sys.path:
print line
You will see your directory printed out, amongst others from where you can also import. Now you can import a mymodule.py file that sits in that directory as easily as:
import mymodule
This will not import subfolders. For that you could imagine creating a python script to create a .pth file containing all sub folders of a folder you define. Have it run at startup perhaps.

I read this question looking for an answer, and didn't like any of them.
So I wrote a quick and dirty solution. Just put this somewhere on your sys.path, and it'll add any directory under folder (from the current working directory), or under abspath:
#using.py
import sys, os.path
def all_from(folder='', abspath=None):
"""add all dirs under `folder` to sys.path if any .py files are found.
Use an abspath if you'd rather do it that way.
Uses the current working directory as the location of using.py.
Keep in mind that os.walk goes *all the way* down the directory tree.
With that, try not to use this on something too close to '/'
"""
add = set(sys.path)
if abspath is None:
cwd = os.path.abspath(os.path.curdir)
abspath = os.path.join(cwd, folder)
for root, dirs, files in os.walk(abspath):
for f in files:
if f[-3:] in '.py':
add.add(root)
break
for i in add: sys.path.append(i)
>>> import using, sys, pprint
>>> using.all_from('py') #if in ~, /home/user/py/
>>> pprint.pprint(sys.path)
[
#that was easy
]
And I like it because I can have a folder for some random tools and not have them be a part of packages or anything, and still get access to some (or all) of them in a couple lines of code.

the first task is to find out where the current environment gets its standard libraries from. And that's where I would store my OOP modules.
Import numpy
numpy.__file__
You get a catalog on your environment. For another computer, it is a good idea to create a separate directory for OOP modules.
Import sys
sys.path.append('F:/Python/My_libra/')

New option for old question.
Installing fail2ban package on Debian, looks like it's hardcoded to install on /usr/lib/python3/dist-packages/fail2ban a path not on python3 sys.path.
> python3
Python 3.7.3 (v3.7.3:ef4ec6ed12, Jun 25 2019, 18:51:50)
[GCC 6.3.0 20170516] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', '/usr/lib/python37.zip', '/usr/lib/python3.7', '/usr/lib/python3.7/lib-dynload', '/usr/lib/python3.7/site-packages']
>>>
so, instead of just copying, I (bash) linked the library to newer versions. Future updates to the original app, will also be automatically applied to the linked versions.
if [ -d /usr/lib/python3/dist-packages/fail2ban ]
then
for d in /usr/lib/python3.*
do
[ -d ${d}/fail2ban ] || \
ln -vs /usr/lib/python3/dist-packages/fail2ban ${d}/
done
fi

Related

How to manage Python's sys.path for a project in a system independent way

I have a project directory that looks like this
/work
/venv
/project1
app.py
/package1
/package2
etc
where the script app.py runs the project I'm working on. When I run
python -m site from the /project1 directory I get, as expected
sys.path = [
'/work/project1',
plus the relevant /lib/pythonX.Y paths. However, when running app.py (from within project1) this import
from package1 import ...
fails with a ModuleNotFoundError, whereas from .package1 import ... works fine since it tells python to search directory that app.py is in. So, I added this to the beginning of app.py
import sys
print(sys.path)
and the result is
sys.path = [
'/work',
Instead of /work/project1 the directory that is being searched when importing to app.py is /work. What is the cause of this discrepancy between python -m site and print(sys.path) and what is the best way to go about making sure that /work/project1 is always part of sys.path?
I would like to avoid using something like site.addsitedir() and perhaps use a .pth file instead. From what I've read though a .pth file belongs in /lib/pythonX.Y/sitepackages, but I also need this solution to be system independent so that a collaborator who clones /project1 from github wont have to add their own .pth file.

I don't know if it applies to you, but I've helped a lot of others by pointing out what our shop does. We don't worry about the path upon execution at all. All we rely upon is that from the initial script location, we know where all the dependencies are. Then we add directories to sys.path by computing their positions relative the directory containing the initial script, which is always available.
We put this logic in a file named init.py in the same dir as the main script. Here's what one looks like:
repoPath = os.path.dirname(__file__) + "/../../Development/repo1"
sys.path.append(repoPath + "/common/common-pyutil")
sys.path.append(repoPath + "/common/common-pyutil/thirdparty")
and then in the main script do import init. We have directories with lots of scripts in them, and they all just do import init to have their sys.path fixed up to find all of our utility modules and such.

PEP-8: module at top of file

Desiring to improve my Python style, I ran a PEP-8 style checker on one of my script and it complained about something I don't know how to fix. The prologue of the script is something like:
#! /bin/env python3
import sys
import os
exe_name = os.path.basename(os.path.realpath(__file__))
bin_dir = os.path.dirname(os.path.realpath(__file__))
inst_dir = os.path.dirname(bin_dir)
sys.path.insert(0, inst_dir+'/path/to/packages')
import mypackage.mymodule
and the style checker complain on the import mymodule line, stating that it should be a top of file. But I obviously can't move it before setting the path where it will be found. Is there a good way to achieve this (mandating an environment variable or a shell wrapper are not what I find better than my current code) while respecting PEP-8 recommendations at the same time?

If you want to avoid path manipulation, you may be able to do so by using the under-known .pth feature.
sys.path should begin with the directory containing the main program either by name or by reference as ''. I assume that the file importing mymodule is not part of mypackage, so that the '' entry is not useful for importing mymodule.
sys.path should end with the site-packages directory for the executing binary. That is the normal place for added packages. If you do not want to move mypackage into site-packages, you can extend the latter 'vitually' by putting a mystuff.pth file in it. It should contain one line: the path to the directory containing mypackage. Call it myprojects. Then mypackage and any other package in myprojects can be imported as if they were in site-packages.
One advantage of .pth files is that you can put identical copies in multiple site-packages directories. For instance, I have multiple projects in F:/python. I have multiple versions of Python installed. So I have put python.pth containing that one line in the site-packages for each.

The best strategy would be to put the sys.path related code in separate file and import it in working code file.
So, I will split above code in two files. One named my_module.py and other named working_module.py
Now my_module will contain below lines
import sys
import os
exe_name = os.path.basename(os.path.realpath(__file__))
bin_dir = os.path.dirname(os.path.realpath(__file__))
inst_dir = os.path.dirname(bin_dir)
sys.path.insert(0, inst_dir+'/path/to/packages')
and working_module will contain
import my_module
import mypackage.mymodule
because we are importing my_module before mypackage, it will execute the path related code before and your package will be available in path.

You can use importlib module (python 3.5) or imp for python 2.7 to load mypackage.mymodule programatically. Both have the same purpose:
mechanisms used to implement the import statement
This question might help you more:
How to import a module given the full path?
https://docs.python.org/3/library/importlib.html#examples
https://docs.python.org/2/library/imp.html

importing a package from within another package in python

Please assume the following project structure:
/project
/packages
/files
__init__.py
fileChecker.py
/hasher
__init__.py
fileHash.py
mainProject.py
/test
I would like to get access to the module fileChecker.py from within the module fileHash.py.
This is some kind of global package.
One way is to append paths to sys.path.
[Is this PYTHONPATH by the way?]
What would be a solution when distributing the project?
Same as above? --> But then there could be paths to modules with the
same name in PYTHONPATH?
Is setuptools doing all the work?
How can I achieve it in a nice and clean way?
Thanks alot.
Update:
Also see my answer below
--> when calling fileHash.py (including an import like from files import fileChecker) directly from within its package directory, the project's path needs to be added to sys.path (described below).
Test cases situated within /test (see structure above) also need the path added to sys.path, when called from within /test.

Thanks mguijarr.
I found a solution here on stackoverflow:
source:
How to fix "Attempted relative import in non-package" even with __init__.py
when I am in the project folder /project, I can call the module like this:
python -m packages.files.fileHash (no .py here, because it is a package)
This is wokring well.
In this case PYTHONPATH is known and the import can look like this:
from packages.files import fileChecker
If it is not called directly, but from within the package directory in my case /packages/hasher --> setting the PYTHONPATH is needed:
if __package__ is None:
import sys
from os import path
sys.path.append( path.dirname( path.dirname( path.abspath(__file__) ) ) )
from packages.files import fileChecker
else:
from packages.files import fileChecker
The important thing for me here is, that the path to include is the PROJECT path.
The code snippet above(the last one) already includes the case describes both cases (called as package and directly).
Thanks alot for your help.
Update:
Just to make my answer more complete
Python adds the current path to the PYTHONPATH automatically when doing
python fileHash.py
Another option, in addition to the one above, is to set the PYTHONPATH when running the program like this
PYTHONPATH=/path/to/project python fileHash.py
I gained some experience, I would like to share:
I don't run modules from within their directories anymore.
Starting the app, running tests or sphinx or pylint or whatever is all done from the project directory.
This ensures that the project directory is contained in the python path and all packages, modules are found without doing additional stuff on imports.
The only place I still set the python path to the project folder using sys.path is in my setup.py in order to make codeship work.
Still, in my opinion this is somehow not an easy matter and I find myself reflecting the PYTHONPATH often enough :)

I finally found another solution, quite intuitive, no sys path or anything needed: https://www.tutorialsteacher.com/python/python-package
And then do what they explain in ''Install a Package Globally''.
In your case, put the "setup.py" file in the packages directory and add the two packages:
from setuptools import setup
setup(name='mypackage',
version='0.1',
description='Testing installation of Package',
url='#',
author='malhar',
author_email='mlathkar#gmail.com',
license='MIT',
packages=['files', 'hasher'], ## here the names
zip_safe=False)

Relative Import in Python

I am having the following directory structure and I'm new to python. In folder bin I have a single file "script.py" in which I want to import "module.py" from code package. I have seen many solutions on stackoverflow to this problem that involve modifying sys-path or giving full pathname. Is there any way I could import these with an import statement that works relatively, instead of specifying full path?
project/
bin/
script.py
code/
module.py
__init__.py

To make sure that bin/script.py can be run without configuring environment, add this preambule to script.py before from code import module line (
from twisted/bin/_preambule.py):
# insert `code/__init__.py` parent directory into `sys.path`
import sys, os
path = os.path.abspath(sys.argv[0]) # path to bin/script.py
while os.path.dirname(path) != path: # until top-most directory
if os.path.exists(os.path.join(path, 'code', '__init__.py')):
sys.path.insert(0, path)
break
path = os.path.dirname(path)
The while loop is to support running it from bin/some-other-directory/.../script.py

While correct, I think that the "dynamically add my current directory to the path" is a dirty, dirty hack.
Add a (possibly blank) __init__.py to project. Then add a .pth file containing the path to project to your sites-packages directory. Then:
from project.code import module
This has two advantages, in my opinion:
1) If you refactor, you just need to change the from project.code line and avoid messing with anything else.
2) There will be nothing special about your code -- it will behave exactly like any other package that you've installed through PyPi.
It may seem messy to add your project directory to your PYTHONPATH but I think it's a much, much cleaner solution than any of the alternatives. Personally, I've added the parent directory that all of my python code lives in to my PYTHONPATH with a .pth file, so I can deal with all of the code I write just like 3rd party libraries.
I don't think that there is any issue with cluttering up your PYTHONPATH, since only folders with an __init__.py will be importable.

Importing Python modules from different working directory

I have a Python script that uses built-in modules but also imports a number of custom modules that exist in the same directory as the main script itself.
For example, I would call
python agent.py
and agent.py has a number of imports, including:
import checks
where checks is in a file in the same directory as agent.py
agent/agent.py
agent/checks.py
When the current working directory is agent/ then everything is fine. However, if I call agent.py from any other directory, it is obviously unable to import checks.py and so errors.
How can I ensure that the custom modules can be imported regardless of where the agent.py is called from e.g.
python /home/bob/scripts/agent/agent.py

Actually your example works because checks.py is in the same directory as agent.py, but say checks.py was in the preceeding directory, eg;
agent/agent.py
checks.py
Then you could do the following:
path = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
if not path in sys.path:
sys.path.insert(1, path)
del path
Note the use of __file__.

You should NOT need to fiddle with sys.path. To quote from the Python 2.6 docs for sys.path:
As initialized upon program startup, the first item of this list,
path[0], is the directory containing the script that was used to
invoke the Python interpreter. If the script directory is not
available (e.g. if the interpreter is invoked interactively or if the
script is read from standard input), path[0] is the empty string,
which directs Python to search modules in the current directory first.
Notice that the script directory is inserted before the entries
inserted as a result of PYTHONPATH.
=== amod.py ===
def whoami():
return __file__
=== ascript.py ===
import sys
print "sys.argv", sys.argv
print "sys.path", sys.path
import amod
print "amod __file__", amod.whoami()
=== result of running ascript.py from afar ===
C:\somewhere_else>\python26\python \junk\timport\ascript.py
sys.argv ['\\junk\\timport\\ascript.py']
sys.path ['C:\\junk\\timport', 'C:\\WINDOWS\\system32\\python26.zip', SNIP]
amod __file__ C:\junk\timport\amod.py
and if it's re-run, the last line changes of course to ...amod.pyc. This appears not to be a novelty, it works with Python 2.1 and 1.5.2.
Debug hints for you: Try two simple files like I have. Try running Python with -v and -vv. Show us the results of your failing tests, including full traceback and error message, and your two files. Tell us what platform you are running on, and what version of Python. Check the permissions on the checks.py file. Is there a checks.something_else that's causing interference?

You need to add the path to the currently executing module to the sys.path variable. Since you called it on the command line, the path to the script will always be in sys.argv[0].
import sys
import os
sys.path.append(os.path.split(sys.argv[0])[0])
Now when import searches for the module, it will also look in the folder that hosts the agent.py file.

There are several ways to add things to the PYTHONPATH.
Read http://docs.python.org/library/site.html
Set the PYTHONPATH environment variable prior to running your script.
You can do this python -m agent to run agent.py from your PYTHONPATH.
Create .pth files in your lib/site-packages directory.
Install your modules in lib/site-packages.

I think you should consider making the agent directory into a proper Python package. Then you place this package anywhere on the python path, and you can import checks as
from agent import checks
See http://docs.python.org/tutorial/modules.html

If you know full path to check.py use this recipe (http://code.activestate.com/recipes/159571/)
If you want to add directory to system path -- this recipe (http://code.activestate.com/recipes/52662/). In this case I have to determine application directory (sys.argv[0]) an pass this value to AddSysPath function. If you want to look at production sample please leave a comment on this thread so I post it later.
Regards.

To generalize my understanding of your goal, you want to be able to import custom packages using import custom_package_name no matter where you are calling python from and no matter where your python script is located.
A number of answers mention what I'm about to describe, but I feel like most of the answers assume a lot of previous knowledge. I'll try to be as explicit as I can.
To achieve the goal of allowing custom packages to be imported via the import statement, they have to be discoverable somewhere through the path that python uses to search for packages. Python actually uses multiple paths, but we'll only focus on the one that can be found by combining the output of sys.prefix (in your python interpreter) with /lib/pythonX.Y/site-packages (or lib/site-packages if you're using windows) where X.Y is your python version.
Concretely, find the path that your python uses by running:
import sys
your_path = sys.prefix + '/lib/pythonX.Y/site-packages'
print(your_path)
This path should look something like /usr/local/lib/python3.5/site-packages if you're using python 3.5, but it could be much different depending on your setup.
Python uses this path (and a few others) to find the packages that you want to import. So what you can do is put your custom packages in the /usr/local/lib/python3.5/site-packages folder. Don't forget to add an init.py file to the folder.
To be concrete again, in terminal type:
cd your_path
cp path_to_custom_package/custom_package ./
Now you should be able to import everything your custom package like you would if the package was located in the same directory (i.e. import package.subpackage for each subpackage file in your package should work).

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Expand Python Search Path to Other Source - python

Related

How to manage Python's sys.path for a project in a system independent way

PEP-8: module at top of file

importing a package from within another package in python

Relative Import in Python

Importing Python modules from different working directory

Categories

Resources