Structuring python projects without path hacks - python

I have a shared python library that I use in multiple projects, so the structure looks like this:
Project1/
    main.py    <--- (one of the projects that uses the library)
    ...
sharedlib/
    __init__.py
    ps_lib.py
    another.py
Now in each project's main.py I use the following hack to make it work:
import os
import sys
sys.path.insert(0, os.path.abspath('..'))
import sharedlib.ps_lib
...
Is there a way to do it without using this hack? Or is there a better way to organize the projects structure?

I think the best way would be to make sharedlib a real package. That means changing the structure a bit:
sharedlib/
    sharedlib/
        __init__.py
        ps_lib.py
        another.py
    setup.py
And using something like this in the setup.py (taken partially from Python-packaging "Minimal Structure"):
from setuptools import setup

setup(name='sharedlib',
      version='0.1',
      description='...',
      license='...',
      packages=['sharedlib'],  # you might need to change this if you have subfolders
      zip_safe=False)
Then install it with python setup.py develop or pip install -e . when in the root folder of the sharedlib package.
That way (using the develop or -e option) changes to the contents of the sharedlib/sharedlib/* files are visible without re-installing the sharedlib package - although you may need to restart any running interactive interpreter session, because the interpreter caches already-imported packages.
From the setuptools documentation:
Setuptools allows you to deploy your projects for use in a common directory or staging area, but without copying any files. Thus, you can edit each project’s code in its checkout directory, and only need to run build commands when you change a project’s C extensions or similarly compiled files. [...]
To do this, use the setup.py develop command.
(emphasis mine)
The most important thing is that you can import sharedlib everywhere now - no need to insert the sharedlib package in the PATH or PYTHONPATH anymore because Python (or at least the Python where you installed it) now treats sharedlib like any other installed package.
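The interpreter-caching caveat mentioned above can be seen directly. A minimal sketch (demo_mod is a throwaway module created just for the demonstration, not part of sharedlib):

```python
# A second `import` returns the cached module object, so edits to the
# source are only picked up after importlib.reload() (or a restart).
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True          # keep the demo free of stale .pyc files
tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "demo_mod.py").write_text("VALUE = 1\n")
sys.path.insert(0, str(tmp))

import demo_mod
print(demo_mod.VALUE)                   # 1

(tmp / "demo_mod.py").write_text("VALUE = 2\n")
import demo_mod                         # cached: still the old module object
print(demo_mod.VALUE)                   # still 1

importlib.reload(demo_mod)              # re-executes the (edited) source
print(demo_mod.VALUE)                   # 2
```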

The way we do it is to use bash entry-scripts for the python scripts. Our directory structure would look similar to the following:
/opt/stackoverflow/
-> bin
-> conf
-> lib
-> log
Our lib folder then contains all of our sub-projects
/opt/stackoverflow/lib/
-> python_algorithms
-> python_data_structures
-> python_shared_libraries
and then when we want to execute a python script, we'll execute it via a bash script within the bin directory
/opt/stackoverflow/bin/
-> quick_sort.sh
-> merge_sort.sh
and if we cat one of our entry scripts
cat merge_sort.sh
#!/bin/bash
export STACKOVERFLOW_HOME=/opt/stackoverflow
export STACKOVERFLOW_BIN=${STACKOVERFLOW_HOME}/bin
export STACKOVERFLOW_LIB=${STACKOVERFLOW_HOME}/lib
export STACKOVERFLOW_LOG=${STACKOVERFLOW_HOME}/log
export STACKOVERFLOW_CONF=${STACKOVERFLOW_HOME}/conf
# Do any pre-script server work here
export PYTHONPATH=${PYTHONPATH}:${STACKOVERFLOW_LIB}
/usr/bin/python "${STACKOVERFLOW_LIB}/python_algorithms/merge_sort.py" "$@" 2>&1
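On the Python side, the launched script can read the exported variables back out of the environment. A hedged sketch (the fallback defaults are my assumption for when the script is run without the wrapper):

```python
# Sketch of how merge_sort.py might consume the variables exported by
# the bash wrapper above. Defaults are assumptions for direct runs.
import os

STACKOVERFLOW_HOME = os.environ.get("STACKOVERFLOW_HOME", "/opt/stackoverflow")
CONF_DIR = os.environ.get("STACKOVERFLOW_CONF",
                          os.path.join(STACKOVERFLOW_HOME, "conf"))
LOG_DIR = os.environ.get("STACKOVERFLOW_LOG",
                         os.path.join(STACKOVERFLOW_HOME, "log"))

print(CONF_DIR)
```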


Importing module from a directory above current working directory

First of all, there are a bunch of solutions on Stack Overflow for this, but none of the ones I tried works. I am working on a remote (Linux) machine, prototyping in the dir-2/module_2.py file with an IPython interpreter. I am also trying to avoid absolute paths, since the absolute path on this remote machine is long and ugly, and I want my code to run on other machines after download.
My directory structure is as follows:
/project-dir/
    /dir-1/
        __init__.py
        module_1.py
    /dir-2/
        __init__.py
        module_2.py
        module_3.py
Now I want to import module_1 from module_2. However, the solution mentioned in this stackoverflow post (link) of using
sys.path.append('../..')
import module_1
does not work. I get the error: ModuleNotFoundError: No module named 'module_1'
Moreover, within the IPython interpreter, things like import .module_3 inside module_2 throw an error:
import .module_3
^ SyntaxError: invalid syntax
Isn't the dot operator supposed to work within the same directory as well? Overall I am quite confused by the importing mechanism. Any help with the initial problem is greatly appreciated! Thanks a lot!
Why didn't it work?
If you run the module_2.py file and you want to import module_1, then you need something like
sys.path.append("../dir-1")
If you use sys.path.append("../..") then the folder you added to the path is the folder containing project-dir, and there is no module_1.py file inside it.
The syntax import .module_3 is invalid; the relative-import form is from . import module_3. Even so, if you execute module_2.py directly and it contains a relative import, it does not work, because you are running module_2.py as a script. To use relative imports you need to treat both module_2.py and module_3.py as modules: some other file imports module_2, and module_2 imports something from module_3 using this syntax.
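A runnable illustration of that rule: a relative import inside module_2 works once module_2 is imported as part of a package rather than executed directly. The package here is a throwaway built in a temp directory for the demonstration:

```python
# Build a tiny package on the fly, then import module_2 *as a package
# member* so its relative import of module_3 succeeds.
import pathlib
import sys
import tempfile

root = pathlib.Path(tempfile.mkdtemp())
pkg = root / "relpkg_demo"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "module_3.py").write_text("NAME = 'module_3'\n")
(pkg / "module_2.py").write_text("from .module_3 import NAME\n")

sys.path.insert(0, str(root))
from relpkg_demo import module_2    # imported as relpkg_demo.module_2, so
                                    # the relative import inside it works
print(module_2.NAME)                # module_3
```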
Suggestion on how you can proceed
One possible solution that solves both problems is properly organizing the project and (optionally, but a good idea) packaging your library (that is, making your code "installable"). Then, once your library is installed (in the virtual environment you are working in), you don't need hacky sys.path solutions. You will be able to import your library from any folder.
Furthermore, don't treat your modules as scripts (don't run your modules). Use a separate Python file as your "executable" (or entry point) and import everything you need from there. With this, relative imports in your module_*.py files will work correctly and you won't get confused.
A possible directory structure could be
/project-dir/
    apps/
        main.py
    yourlib/
        __init__.py
        dir-1/
            __init__.py
            module_1.py
        dir-2/
            __init__.py
            module_2.py
            module_3.py
Notice that the yourlib folder, as well as its subfolders, contains an __init__.py file. With this structure, you only run main.py (the name does not need to be main.py).
Case 1: You don't want to package your library
If you don't want to package your library, then you can add sys.path.append("../") in main.py to add the project-dir/ folder to the path. With that, your yourlib library will be "importable" in main.py. You can do something like from yourlib import module_2 and it will work correctly (and module_2 can use relative imports). Alternatively, you can also put main.py directly in the project-dir/ folder, and then you don't need to change sys.path at all, since project-dir/ will be the working directory in that case.
Note that you can also have a tests folder inside project-dir and to run a test file you can do the same as you did to run main.py.
Case 2: You want to package your library
The previous solution already solves your problems, but going the extra mile adds some benefits, such as dependency management and no need to change sys.path no matter where you are. There are several options to package your library and I will show one option using poetry due to its simplicity.
After installing poetry, you can run the command below in a terminal to create a new project
poetry new mylib
This creates the following folder structure
mylib/
    README.rst
    mylib/
        __init__.py
    pyproject.toml
    tests/
You can then add the apps folder if you want, as well as subfolders inside mylib/ (each with a __init__.py file).
The pyproject.toml file specifies the dependencies and project metadata. You can edit it by hand and/or use poetry to add new dependencies, such as
poetry add pandas
poetry add --dev mypy
to add pandas as a dependency and mypy as a development dependency, for instance. After that, you can run
poetry install
to create a virtual environment and install your library in it. (poetry build, by contrast, builds the distributable sdist/wheel archives.) You can activate the virtual environment with poetry shell and you will be able to import your library from anywhere. Note that you can change your library files without the need to run poetry install again.
At last, if you want to publish your library in PyPi for everyone to see you can use
poetry publish --username your_pypi_username --password _password_
TL;DR
Use an organized project structure with a clear place for the scripts you execute. Particularly, it is better if the script you execute is outside the folder with your modules. Also, don't run a module as a script (otherwise you can't use relative imports).

Including a python library with my script

My system administrator will not allow global installation of python packages.
I'm writing a script that people will invoke to perform certain actions for them. The script I'm writing needs certain libraries like sqlalchemy and coloredlogs. I am, however, allowed to install Python libs in any local folder, i.e. not site-packages.
How would I go about installing the libs in the same folder as the script so that the script has access to them?
My folder hierarchy is like so
script_to_invoke.py
scriptpack/
    bin/
        coloredlogs
        coloredlogs.egg
        ...
    utils/
        util1.py
        util2.py
(all the folders indicated have an __init__.py)
What I've tried so far:
within script_to_invoke.py I use
from scriptpack.utils import util1  # no problem here
from scriptpack.bin import coloredlogs # fails to find the import
I've looked at some other SO answers, but I'm not sure how to relate them to my problem.
I figured it out!
Python had to be directed to find the .egg files
This can be done by either
Editing the PYTHONPATH var BEFORE the interpreter is started (or)
Appending the full path to the eggs to the sys path
Code Below:
import sys
for entry in [<list of full path to egg files in bin dir>]:
sys.path.append(str(entry))
# Proceed with local imports
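One way to flesh out that loop (the scriptpack/bin layout from the question is an assumption here): collect every .egg under the local bin/ directory and append each one to sys.path before doing the local imports.

```python
# Append each *.egg found under a local bin/ directory to sys.path so
# the eggs' contents become importable.
import pathlib
import sys

def add_local_eggs(bin_dir):
    """Append each *.egg in bin_dir to sys.path; return what was added."""
    added = []
    for egg in sorted(pathlib.Path(bin_dir).glob("*.egg")):
        sys.path.append(str(egg))
        added.append(str(egg))
    return added

added = add_local_eggs("scriptpack/bin")   # hypothetical layout
# Proceed with local imports
```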
You might want to try packaging everything up as a zipapp. Doing so makes a single zip file that acts as a Python script, but can contain a whole multitude of embedded packages. The steps to make it are:
Make a folder with the name of your program (testapp in my example)
Name your main script __main__.py and put it in that folder
Using pip, install the required packages to the folder with --target=/path/to/testapp
Run python3 -m zipapp testapp -p='/usr/bin/env python3' (providing the shebang line is optional; without it, users will need to run the package with python3 testapp.pyz, while with the shebang, they can just do ./testapp.pyz)
That creates a zip file with all your requirements embedded in it alongside your script, that doesn't even need to be unpacked to run (Python knows how to run zip apps natively). As a trivial example:
$ mkdir testapp
$ echo -e '#!/usr/bin/python3\nimport sqlalchemy\nprint(sqlalchemy)' > testapp/__main__.py
$ pip3 install --target=./testapp sqlalchemy
$ python3 -m zipapp testapp -p='/usr/bin/env python3'
$ ./testapp.pyz
<module 'sqlalchemy' from './testapp.pyz/sqlalchemy/__init__.py'>
showing how the simple main was able to access sqlalchemy from within the same zipapp. It's also smaller (thanks to the zipping) than distributing the uncompressed modules:
$ du -s -h testapp*
13M testapp
8.1M testapp.pyz
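The same archive can also be built from Python using the stdlib zipapp module. A sketch (the directory and script here are throwaways created just for the demonstration):

```python
# Programmatic equivalent of the `python3 -m zipapp testapp` step.
import pathlib
import tempfile
import zipapp

app = pathlib.Path(tempfile.mkdtemp()) / "testapp"
app.mkdir()
(app / "__main__.py").write_text("print('hello from the zipapp')\n")

target = app.with_suffix(".pyz")
zipapp.create_archive(app, target=target, interpreter="/usr/bin/env python3")
print(target.exists())              # True
```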
You can install these packages in a non-global location (generally in ~/.local/lib/python<x.y>) using the --user flag, e.g.:
pip install --user sqlalchemy coloredlogs
That way you don't have to worry about changing how imports work, and you're still compliant with your sysadmin's policies.

Package a python project using batch scripts

Hi, I currently have a Python project that uses subprocess.Popen to run some batch files.
Is it possible to package the batch files as source? Then, when some of our other Python projects use setup.py to include the current project in install_requires, the other project could install and update those batch files and use them from source (i.e. run these scripts with subprocess.Popen as well)?
Anyone have some idea how should I do it?
Thanks in advance!
If you have batch/shell scripts that are required to run your Python package, you can store them within your package folder and have them included when installing the package with setuptools. Note that non-Python files are only picked up if you declare them, e.g. via package_data={'myproject': ['batch.sh']} or include_package_data=True in setup.py. Here is an example of a possible folder structure:
/myproject
    /myproject
        __init__.py
        main.py
        batch.sh
    setup.py
In the main.py you could access the batch file by:
import os.path
import subprocess
current_dir = os.path.dirname(os.path.abspath(__file__))
batch_script = os.path.join(current_dir, 'batch.sh')
subprocess.call(batch_script)
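A self-contained variant of that pattern (POSIX only; the script body and temp location are stand-ins for the packaged batch.sh): build the script's absolute path, run it, and fail loudly on a non-zero exit status.

```python
# Write a tiny shell script, then run it via an absolute path with
# subprocess.run(check=True) so failures raise instead of passing silently.
import os
import subprocess
import tempfile

script_dir = tempfile.mkdtemp()
batch_script = os.path.join(script_dir, "batch.sh")
with open(batch_script, "w") as f:
    f.write("#!/bin/sh\necho done\n")
os.chmod(batch_script, 0o755)       # make it executable

result = subprocess.run([batch_script], capture_output=True, text=True,
                        check=True)
print(result.stdout.strip())        # done
```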
UPDATE
Based on other comments, if you instead need a way to make batch scripts accessible to third-party packages, you could list them under the scripts keyword in setuptools. You can see this option documented in the setuptools docs here.

How could I share my python command line tool to the wild

I have just created a Python command line tool called xiber (xiber.py) that creates an iPad.xib file from an iPhone.xib. On my own computer I do:
alias xiber='python path_to/xiber.py'
Then I can use xiber anywhere on my own computer. I want to share this tool with other developers.
My question is: how could they use it without doing the alias xiber='....' stuff, just using the xiber command?
Thanks.
Setup your Git repo like this:
bin/
    xiber          # formerly xiber.py
README.md
setup.py
In the setup.py file:
from distutils.core import setup

# Typing this from memory (not tested).
setup(
    name='Xiber',
    version='0.1',
    scripts=['bin/xiber'],
    # Other options if desired ...
)
People who want to install your script can use ordinary tools like pip or easy_install, which will put your script in the appropriate bin directory of the Python installation. For example:
pip install git+git://github.com/liaa/xiber
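For comparison, here is a setuptools-based sketch (my assumption, not part of the original answer) that replaces the bin/ wrapper with a console_scripts entry point; it assumes the tool's code lives in xiber.py with a main() function, and pip then generates the xiber executable at install time. This is a configuration fragment, not meant to be run on its own:

```python
# setup.py sketch using a console_scripts entry point (assumes xiber.py
# defines main()); pip creates the `xiber` wrapper when installing.
from setuptools import setup

setup(
    name='Xiber',
    version='0.1',
    py_modules=['xiber'],
    entry_points={'console_scripts': ['xiber = xiber:main']},
)
```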
Another way is to provide a xiber bash script and a xiber.bat batch file in a bin/ folder distributed with the project. Then the end user can add that bin folder to their PATH.
Create an executable with PyInstaller
Another option could be creating a standalone executable file using PyInstaller.
If you check the tutorial, you will find an example starting from exactly your scenario of turning a Python script into an executable (I used it, eventually settling on the "single file" method), and it works.

Tests and python package structure

I have some problems in structuring my python project. Currently it is a bunch of files in the same folder. I have tried to structure it like
proj/
    __init__.py
    foo.py
    ...
    bar/
        __init__.py
        foobar.py
        ...
    tests/
        foo_test.py
        foobar_test.py
        ...
The problem is that I'm not able, from inner directories, to import modules from the outer directories. This is particularly annoying with tests.
I have read PEP 328 about relative imports and PEP 366 about relative imports from the main module. But both these methods require the base package to be in my PYTHONPATH. Indeed I obtain the following error
ValueError: Attempted relative import in non-package.
So I added the following boilerplate code on top of the test files
import os, sys
sys.path.append(os.path.join(os.getcwd(), os.path.pardir))
Still I get the same error. What is the correct way to
structure a package, complete with tests, and
add the base directory to the path to allow imports?
EDIT As requested in the comment, I add an example import that fails (in the file foo_test.py)
import os, sys
sys.path.append(os.path.join(os.getcwd(), os.path.pardir))
from ..foo import Foo
When you use the -m switch to run code, the current directory is added to sys.path. So the easiest way to run your tests is from the parent directory of proj, using the command:
python -m proj.tests.foo_test
To make that work, you will need to include an __init__.py file in your tests directory so that the tests are correctly recognised as part of the package.
I like to import modules using the full proj.NAME package prefix whenever possible. This is the approach the Google Python styleguide recommends.
One option to allow you to keep your package structure, use full package paths, and still move forward with development would be to use a virtualenv and put your project in develop mode. Your project's setup.py will need to use setuptools instead of distutils, to get the develop command.
This will let you avoid the sys.path.append stuff above:
% virtualenv ~/virt
% . ~/virt/bin/activate
(virt)~% cd ~/myproject
(virt)~/myproject% python setup.py develop
(virt)~/myproject% python tests/foo_test.py
Where foo_test.py uses:
from proj.foo import Foo
Now when you run python from within your virtualenv your PYTHONPATH will point to all of the packages in your project. You can create a shorter shell alias to enter your virtualenv without having to type . ~/virt/bin/activate every time.
