Can pip install editable avoid recompiling from sratch? - python

I'm developing a python module with a long pybind11 C++ extension compilation step invoked via cmake in setup.py.
When making changes to a C++ file, cmake invoked via python setup.py develop will avoid recompiling units whose dependent files have not changed. However, invoking setup.py directly ignores the settings in my pyproject.toml and I understand that the modern way to do a developmental build is with python -m pip install -e .
While pip install -e successfully builds, it unfortunately starts from scratch inside a clean temporary directory every invocation. Is there a way to instruct pip install -e to maintain my CMakeCache.txt and compilation dependency tracking?
(And/or does this somehow indicate that I misunderstand pip install -e or am using it incorrectly?)
This previous unanswered question is quite similar sounding. Perhaps, I have the added detail that my python setup.py develop is working in this regard.

Related

What is the preferred way to develop a python package without using setup.py

I am developing a python package, and I don't want to have to keep running pip install . to reinstall my package every time I change something. Using the -e or --editable doesn't seem to work unless I have a setup.py file, nor does --no-use-pep517. I have a pyproject.toml instead, as is preferred nowadays if I am not mistaken. So, what is the preferred way to do this nowadays?
My package is just a CLI script, but it imports some functions from another file in the same directory called utils.py. When developing, I can't just run the script manually rfrom the terminal, because then I get name_of_package is not a package from the line
from name_of_package.utils import function, whereas If i just have
from utils import function, I can run the script from the terminal, but when I pip install it, it says there is no module named utils.
I did install poetry and installed my dependencies, ran poetry shell and then tried to run my script with poetry run /path/to/script.py, but I kept getting an error that my package wasn't a package.
If you want to keep using setuptools as "build back-end" then you can replace the setup.py script with a setup.cfg declarative configuration file and still be able to do "editable" installations (independently of whether or not you have a pyproject.toml file).
There is now PEP 660 which standardizes editable installations. The following tools have support for PEP 660:
PDM
Flit
Hatch
Poetry
On top of setuptools-based projects, all projects that use a PEP 660 build back-end should be installable as editable by pip (python -m pip install --editable .).

How to build a python package in a dev mode from source code with conda?

I am working on an open source python project which depends on a lot of non-python packages (perl, r, ...). Consequently, they use conda to install the dependencies, as they can't be simply installed with pip.
You can run $ conda install --channel bioconda <awesome_package> to install the stable build of the package. I want to install it in the development mode. For a purely python project, I'd do something like this:
pull the source code from github
$ cd path/to/awesome_package && pip install -e .
run tests
modify the code
run tests
etc
In the step 2 above, the pip command installs all I need to work. It uses setup.py script to do its job, and requirements.txt together with requirements_dev.txt to install the dependencies. And, I don't need to rebuild/reinstall anything when I change something in the source code.
How do I do the same with conda instead of pip? How do you provide a list of requirements (python and non-python) to conda?
The most relevant thing I found is this: http://conda.pydata.org/docs/building/bpp.html
However, with this approach I'd need to rebuild and reinstall the package locally very time I want to run the tests, which is something I'd like to avoid.
I am only going to change the python source code of the package I'm working on.
TL;DR: how to build a package with conda in a dev mode from the source code?
Any help is much appreciated.

What's the standard way to package a python project with dependencies?

I have a python project that has a few dependencies (defined under install_requires in setup.py). My ops people requires a package to be self contained and only depend on a python installation. The litmus test would be that they're able to get a zip-file and then unzip and run it without an internet connection.
Is there an easy way to package an install including dependencies? It is acceptable if I have to build on the OS/architecture that it will eventually be run on.
For what it's worth, I've tried both setup.py build and setup.py sdist, but they don't seem to fit the bill since they do not include dependencies. I've also considered virtualenv (which could be installed if absolutely necessary), but that has hard coded paths which makes it less than ideal.
There are a few nuances to how pip works. Unfortunately, using --prefix vendor to store all the dependencies of the project doesn't work if any of those dependencies, or dependencies of dependencies are installed into a place where pip can find them. It will skip those dependencies and just install the rest to your vendor folder.
In the past I've used virtualenv's --no-site-packages option to solve this issue. At one company we would ship the whole virtualenv, which includes the python binary. In the interest of only shipping the dependencies, you can combine using a virtualenv with the --prefix switch on pip to give yourself a clean environment that installs to the right place.
I'll provide an example script that creates a temporary virtualenv, activates it, then installs the dependencies to a local vendor folder. This is handy if you are running in CI.
#!/bin/bash
tempdir=$(mktemp -d -t project.XXX) # create a temporary directory
trap "rm -rf $tempdir" EXIT # ensure it is cleaned up
# create the virtualenv and exclude packages outside of it
virtualenv --python=$(which python2.7) --no-site-packages $tempdir/venv
# activate the virtualenv
source $tempdir/venv/bin/activate
# install the dependencies as above
pip install -r requirements.txt --prefix=vendor
In most cases you should be able to "vendor" all the dependencies. It's basically a crude version of virtualenv.
For example look at how the requests package includes chardet and urllib3 in its own source tree. Here's an example script that should do the initial downloading and copying for you: https://gist.github.com/proppy/1136723
Once you have the dependencies installed, you can reference them with from .some.namespace import dependency_name to make sure that you're using your local versions.
It's possible to do this with recent versions of pip (I'm using 8.1.2). On the build machine:
pip install -r requirements.txt --prefix vendor
Then run it:
PYTHONPATH=vendor/lib/python2.7/site-packages python yourapp.py
(This is basically an expansion of #valentjedi comment. Thanks!)
let's say you have python module app.py with dependencies in requirements.txt file.
first, install all your dependencies in appdeps folder.
python -m pip install -r requirements.txt --target=./appdeps
then in your app.py module add this dependency folder to the pythonpath
# app.py
import sys
sys.path.append('appdeps')
# rest of your module normally
#...
this will work the same way as if you were running this script from venv with all the dependencies installed inside ;>

When would the -e, --editable option be useful with pip install?

When would the -e, or --editable option be useful with pip install?
For some projects the last line in requirements.txt is -e .. What does it do exactly?
As the man page says it:
-e,--editable <path/url>
Install a project in editable mode (i.e. setuptools "develop mode") from a local project path or a VCS url.
So you would use this when trying to install a package locally, most often in the case when you are developing it on your system. It will just link the package to the original location, basically meaning any changes to the original package would reflect directly in your environment.
Some nuggets around the same here and here.
An example run can be:
pip install -e .
or
pip install -e ~/ultimate-utils/ultimate-utils-proj-src/
note the second is the full path to where the setup.py would be at.
Concrete example of using --editable in development
If you play with this test package as in:
cd ~
git clone https://github.com/cirosantilli/vcdvcd
cd vcdvcd
git checkout 5dd4205c37ed0244ecaf443d8106fadb2f9cfbb8
python -m pip install --editable . --user
it outputs:
Obtaining file:///home/ciro/bak/git/vcdvcd
Installing collected packages: vcdvcd
Attempting uninstall: vcdvcd
Found existing installation: vcdvcd 1.0.6
Can't uninstall 'vcdvcd'. No files were found to uninstall.
Running setup.py develop for vcdvcd
Successfully installed vcdvcd-1.0.6
The Can't uninstall 'vcdvcd' is normal: it tried to uninstall any existing vcdvcd to then replace them with the "symlink-like mechanism" that is produced in the following steps, but failed because there were no previous installations.
Then it generates a file:
~/.local/lib/python3.8/site-packages/vcdvcd.egg-link
which contains:
/home/ciro/vcdvcd
.
and acts as a "symlink" to the Python interpreter.
So now, if I make any changes to the git source code under /home/ciro/vcdvcd, it reflects automatically on importers who can from any directory do:
python -c 'import vcdvcd'
Note however that at my pip version at least, binary files installed with --editable, such as the vcdcat script provided by that package via scripts= on setup.py, do not get symlinked, just copied to:
~/.local/bin/vcdcat
just like for regular installs, and therefore updates to the git repository won't directly affect them.
By comparison, a regular non --editable install from the git source:
python -m pip uninstall vcdvcd
python -m pip install --user .
produces a copy of the installed files under:
~/.local/lib/python3.8/site-packages/vcdvcd
Uninstall of an editable package as done above requires a new enough pip as mentioned at: How to uninstall editable packages with pip (installed with -e)
Tested in Python 3.8, pip 20.0.2, Ubuntu 20.04.
Recommendation: develop directly in-tree whenever possible
The editable setup is useful when you are testing your patch to a package through another project.
If however you can fully test your change in-tree, just do that instead of generating an editable install which is more complex.
E.g., the vcdvcd package above is setup in a way that you can just cd into the source and do ./vcdcat without pip installing the package itself (in general, you might need to install dependencies from requirements.txt though), and the import vcdvcd that that executable does (or possibly your own custom test) just finds the package correctly in the same directory it lives in.
From Working in "development" mode:
Although not required, it’s common to locally install your project in
“editable” or “develop” mode while you’re working on it. This allows
your project to be both installed and editable in project form.
Assuming you’re in the root of your project directory, then run:
pip install -e .
Although somewhat cryptic, -e is short for
--editable, and . refers to the current working directory, so together, it means to install the current directory (i.e. your
project) in editable mode.
Some additional insights into the internals of setuptools and distutils from “Development Mode”:
Under normal circumstances, the distutils assume that you are going to
build a distribution of your project, not use it in its “raw” or
“unbuilt” form. If you were to use the distutils that way, you would
have to rebuild and reinstall your project every time you made a
change to it during development.
Another problem that sometimes comes up with the distutils is that you
may need to do development on two related projects at the same time.
You may need to put both projects’ packages in the same directory to
run them, but need to keep them separate for revision control
purposes. How can you do this?
Setuptools allows you to deploy your projects for use in a common
directory or staging area, but without copying any files. Thus, you
can edit each project’s code in its checkout directory, and only need
to run build commands when you change a project’s C extensions or
similarly compiled files. You can even deploy a project into another
project’s checkout directory, if that’s your preferred way of working
(as opposed to using a common independent staging area or the
site-packages directory).
To do this, use the setup.py develop command. It works very similarly
to setup.py install, except that it doesn’t actually install anything.
Instead, it creates a special .egg-link file in the deployment
directory, that links to your project’s source code. And, if your
deployment directory is Python’s site-packages directory, it will also
update the easy-install.pth file to include your project’s source
code, thereby making it available on sys.path for all programs using
that Python installation.
It is important to note that pip uninstall can not uninstall a module that has been installed with pip install -e. So if you go down this route, be prepared for things to get very messy if you ever need to uninstall. A partial solution is to (1) reinstall, keeping a record of files created, as in sudo python3 -m setup.py install --record installed_files.txt, and then (2) manually delete all the files listed, as in e.g. sudo rm -r /usr/local/lib/python3.7/dist-packages/tdc7201-0.1a2-py3.7.egg/ (for release 0.1a2 of module tdc7201). This does not 100% clean everything up however; even after you've done it, importing the (removed!) local library may succeed, and attempting to install the same version from a remote server may fail to do anything (because it thinks your (deleted!) local version is already up to date).
As suggested in previous answers, there is no symlinks that are getting created.
How does '-e' option work? -> It just updates the file "PYTHONDIR/site-packages/easy-install.pth" with the project path specified in the 'command pip install -e'.
So each time python search for a package it will check this directory as well => any changes to the files in this directory is instantly reflected.

python pip specify a library directory and an include directory

I am using pip and trying to install a python module called pyodbc which has some dependencies on non-python libraries like unixodbc-dev, unixodbc-bin, unixodbc. I cannot install these dependencies system wide at the moment, as I am only playing, so I have installed them in a non-standard location. How do I tell pip where to look for these dependencies ? More exactly, how do I pass information through pip of include dirs (gcc -I) and library dirs (gcc -L -l) to be used when building the pyodbc extension ?
pip has a --global-option flag
You can use it to pass additional flags to build_ext.
For instance, to add a --library-dirs (-L) flag:
pip install --global-option=build_ext --global-option="-L/path/to/local" pyodbc
gcc supports also environment variables:
http://gcc.gnu.org/onlinedocs/gcc/Environment-Variables.html
I couldn't find any build_ext documentation, so here is the command line help
Options for 'build_ext' command:
--build-lib (-b) directory for compiled extension modules
--build-temp (-t) directory for temporary files (build by-products)
--plat-name (-p) platform name to cross-compile for, if supported
(default: linux-x86_64)
--inplace (-i) ignore build-lib and put compiled extensions into the
source directory alongside your pure Python modules
--include-dirs (-I) list of directories to search for header files
(separated by ':')
--define (-D) C preprocessor macros to define
--undef (-U) C preprocessor macros to undefine
--libraries (-l) external C libraries to link with
--library-dirs (-L) directories to search for external C libraries
(separated by ':')
--rpath (-R) directories to search for shared C libraries at runtime
--link-objects (-O) extra explicit link objects to include in the link
--debug (-g) compile/link with debugging information
--force (-f) forcibly build everything (ignore file timestamps)
--compiler (-c) specify the compiler type
--swig-cpp make SWIG create C++ files (default is C)
--swig-opts list of SWIG command line options
--swig path to the SWIG executable
--user add user include, library and rpath
--help-compiler list available compilers
Building on Thorfin's answer and assuming that your desired include and library locations are in /usr/local, you can pass both in like so:
sudo pip install --global-option=build_ext --global-option="-I/usr/local/include/" --global-option="-L/usr/local/lib" <you package name>
Another way to indicate the location of include files and libraries are set relevant environment variables before running pip e.g.
export LDFLAGS=-L/usr/local/opt/openssl/lib
export CPPFLAGS=-I/usr/local/opt/openssl/include
pip install cryptography
Just FYI... If you are having trouble installing a package with pip, then you can use the
--no-clean option to see what is exactly going on (that is, why the build did not work). For instance, if numpy is not installing properly, you could try
pip install --no-clean numpy
then look at the Temporary folder to see how far the build got. On a Windows machine, this should be located at something like:
C:\Users\Bob\AppData\Local\Temp\pip_build_Bob\numpy
Just to be clear, the --no-clean option tries to install the package, but does not clean up after itself, letting you see what pip was trying to do.
Otherwise, if you just want to download the source code, then I would use the -d flag. For instance, to download the Numpy source code .tar file to the current directory, use:
pip install -d %cd% numpy
I was also helped by Thorfin's answer; I was building GTK3+ on windows and installing pygobject, I was having difficulties on how to include multiple folders with pip install.
I tried creating pip config file as per pip documentation. but failed.
the one working is with the command line:
pip install --global-option=build_ext --global-option="-IlistOfDirectories"
# and/or with: --global-option="-LlistofDirectories"
the separator that works with multiple folders in windows is ';' semicolon, NOT colon ':' it might be different in other OS.
sample working command line:
pip install --global-option=build_ext --global-option="-Ic:/gtk-build/gtk/x64/release/include;d:/gtk-build/gtk/x64/release/include/gobject-introspection-1.0" --global-option="-Lc:\gtk-build\gtk\x64\release\lib" pygobject==3.27.1
you can use '' or '/' for path, but make sure do not type backslash next to "
this below will fail because there is backslash next to double quote
pip install --global-option=build_ext --global-option="-Ic:\willFail\" --global-option="-Lc:\willFail\" pygobject==3.27.1
Have you ever used virtualenv? It's Python package that let's you create and maintain multiple isolated environments on one machine. Each can use different modules independent of one another without screwing up dependencies in your system library or a separate virtual environment.
If you don't have root privileges, you can download and use the virtualenv package from source:
$ curl -O https://pypi.python.org/packages/source/v/virtualenv/virtualenv-X.X.tar.gz
$ tar xvfz virtualenv-X.X.tar.gz
$ cd virtualenv-X.X
$ python virtualenv.py myVE
I followed the above steps this weekend on Ubuntu Server 12.0.4 and it worked perfectly. Each new virtual environment you create comes with PIP by default so installing packages into your new environment is easy.
Just in case it's of help to somebody, I still could not find a way to do it through pip, so ended up simply downloading the package and doing through its 'setup.py'. Also switched to what seems an easier to install API called 'pymssql'.

Categories

Resources