pip install editable working dir to custom path using requirements.txt - python

Short version:
Is it possible to use the -e parameter in requirements.txt with a path where the editable package should be installed?
First approach
requirements.txt:
-e git+https://github.com/snake-soft/imap-storage.git#egg=imap-storage
Pro: Automated install
Contra: Editable directory is inside virtualenv src folder (not in workspace)
Second approach (Edit: Don't use this unless you know what you're doing; see the bottom of this post)
If I clone the repo and install it like this (virtualenv activated):
cd /home/user/workspace
git clone https://github.com/snake-soft/imap-storage.git
cd imap-storage
pip install -e .
This gives the structure I want:
workspace
├── imap-storage
└── django-project # uses imap-storage module
I have what I want: the repository (imap-storage) sits next to the django-project that uses it.
It is importable because it is installed inside the virtualenv.
Pro: Editable directory is inside my workspace
Contra: Not automated, not intuitive
Goal
pip install -r requirements.txt to install module from git (like first approach)
Module is in pythonpath of virtualenv -> importable
Editable working dir of the module is in my workspace (like second approach)
PS: Or am I thinking about this completely wrong, and should I go for something entirely different?

Why did I ask such a crazy question?
I thought I could make my life a little easier if both the package and the Django project that uses it were editable inside my workspace, because I work on them in parallel.
My conclusion
I experimented a bit with the second approach and, in the end, decided to go with the first approach.
Reason
With both methods, PyDev won't show it as an installed package.
If you mix both methods like this:
install the package via requirements.txt (with the -e switch)
uninstall it
clone it into your workspace (e.g. ~/workspace/)
install it with 'pip install -e .' from inside the package directory
then you will end up in a bad situation.
The 'virtualenv/src/' directory is not deleted, and PyDev still recognizes it as the source for the package.
Yet when you run the Django instance that uses the package, it runs the package code from '~/workspace/'.
Suggestion
Use the first approach, import that source directory ('virtualenv/src/') as a project in PyDev, and create a link to it in the file manager of your choice.
It will save you from a confusing mistake.
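One option not mentioned above is pip's --src option, which sets the directory that editable VCS requirements are checked out into (inside a virtualenv the default is the virtualenv's src folder). The following is only a sketch of how the command could be assembled: the helper name editable_install_command and the paths are hypothetical, and the command is printed rather than executed.

```python
import sys
from pathlib import Path


def editable_install_command(requirements, src_dir):
    """Build a pip command that checks out editable VCS requirements into src_dir.

    Hypothetical helper for illustration; it only assembles the argument list.
    """
    return [
        sys.executable, "-m", "pip", "install",
        "--src", str(Path(src_dir).expanduser()),  # checkout location for -e git+... lines
        "-r", str(requirements),
    ]


if __name__ == "__main__":
    # Assumed paths; adjust to your own workspace before running the command.
    print(" ".join(editable_install_command("requirements.txt", "~/workspace")))
```

With the requirements line from the first approach, the checkout would be expected to land in a folder named after the egg fragment (imap-storage) inside the given directory, which matches the layout wanted in the second approach.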

Related

How to integrate and work with complex structure GIT repository sync and stored in a folder in the PYTHONPATH

I have my github project MYMODULE structure following the guidelines e.g.:
README.rst
LICENSE
sample/__init__.py
sample/core.py
sample/helpers.py
The core of my project is inside the sample folder, where __init__.py is stored.
I would like to keep it accessible by including the sample folder in the PYTHONPATH. But if I git clone the project into a folder listed in the PYTHONPATH, there will be another folder, MYMODULE, above the sample folder that holds the __init__.py file, so I won't be able to import it. I know that moving the sample folder one level up would make the module importable, but that would break the sync with my GitHub repository.
Is there any guideline or best practice for solving this issue?
I've tried git sparse checkout, but it doesn't solve the issue because it still stores the selected folder inside a parent folder.
git is a development tool, not a distribution/deployment tool. To install a Python package you should have a setup.py. Then you can install the package directly from git using pip, or you can clone the repository yourself and install it with pip install -e . or even with python setup.py install
During development you can clone the code and point $PYTHONPATH to the current directory: export PYTHONPATH=$(pwd)
But the best practice is virtualenv. Create setup.py for every package, create a virtualenv for every separate project, install packages with pip install -e .
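To make the import problem from the question concrete, here is a small sketch that recreates the layout in a temporary directory and checks which directory has to be on sys.path. The package name sample_demo_pkg stands in for sample (chosen only to avoid clashing with anything installed), and can_import is a hypothetical helper.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def can_import(name, search_dir):
    """Return True if `name` is importable with search_dir prepended to sys.path."""
    sys.path.insert(0, str(search_dir))
    importlib.invalidate_caches()
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False
    finally:
        # Undo the changes so repeated checks do not influence each other.
        sys.path.remove(str(search_dir))
        sys.modules.pop(name, None)


with tempfile.TemporaryDirectory() as tmp:
    # Layout from the question: <tmp>/MYMODULE/sample_demo_pkg/__init__.py
    repo = Path(tmp) / "MYMODULE"
    (repo / "sample_demo_pkg").mkdir(parents=True)
    (repo / "sample_demo_pkg" / "__init__.py").write_text("VALUE = 1\n")

    parent_on_path = can_import("sample_demo_pkg", tmp)  # directory containing the clone
    repo_on_path = can_import("sample_demo_pkg", repo)   # the clone itself

print(parent_on_path, repo_on_path)
```

The package only becomes importable when the clone itself (the folder that contains the package folder) is on the path, which is what export PYTHONPATH=$(pwd) from inside the clone, or pip install -e ., arranges.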

one python project with multiple packages

I'm moving on from single scripts to a bigger python application.
It's an application with multiple packages.
package1-> package1/.py files
package2-> package2/.py files
As package 1 should be able to be used stand alone, I keep it in a separate git repo.
I'd love to do in package2: import package1
It feels like the easiest way to do it would be to have package1 (in its own git repo) in a subdirectory of package2, but that doesn't sound like a nice solution.
Some answers I found feel dated and I couldn't get them to work (python setup.py install).
Adding the package1 location to the PYTHONPATH is a solution, but it's not very nice if I want to distribute it to co-workers. Ideally, I "install" the package as easily as possible.
I read that "pip" would be preferred, but I would need some directions on where to start looking for how to create a package. Also, distribution would be local only.
(python3.6. Code will be used on linux and windows. )
The following is an excerpt from an excellent (but somewhat hidden) answer using pip, given by np8 to the question "Importing modules from parent folder". Check out the full answer!
1) Add a setup.py to the root folder
The contents of the setup.py can be simply
from setuptools import setup, find_packages
setup(name='myproject', version='1.0', packages=find_packages())
Basically "any" setup.py would work. This is just a minimal working example.
2) Use a virtual environment
3) pip install your project in editable state
Install your top level package myproject using pip. The trick is to use the -e flag when doing the install. This way it is installed in an editable state, and all the edits made to the .py files will be automatically included in the installed package.
In the root directory, run
pip install -e . (note the dot, it stands for "current directory")
You can also see that it is installed by using pip freeze
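Besides pip freeze, the installation can also be checked from Python itself. This is a minimal sketch assuming Python 3.8 or newer, where importlib.metadata is part of the standard library (on the Python 3.6 mentioned in the question it is not available); installed_version is a hypothetical helper, and myproject is the name from the setup.py above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the version string after "pip install -e ." and None before it.
    print(installed_version("myproject"))
```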

What's the standard way to package a python project with dependencies?

I have a Python project that has a few dependencies (defined under install_requires in setup.py). My ops people require a package to be self-contained and to depend only on a Python installation. The litmus test would be that they're able to get a zip file, unzip it, and run it without an internet connection.
Is there an easy way to package an install including dependencies? It is acceptable if I have to build on the OS/architecture that it will eventually be run on.
For what it's worth, I've tried both setup.py build and setup.py sdist, but they don't seem to fit the bill since they do not include dependencies. I've also considered virtualenv (which could be installed if absolutely necessary), but that has hard coded paths which makes it less than ideal.
There are a few nuances to how pip works. Unfortunately, using --prefix vendor to store all the dependencies of the project doesn't work if any of those dependencies, or dependencies of dependencies are installed into a place where pip can find them. It will skip those dependencies and just install the rest to your vendor folder.
In the past I've used virtualenv's --no-site-packages option to solve this issue. At one company we would ship the whole virtualenv, which includes the python binary. In the interest of only shipping the dependencies, you can combine using a virtualenv with the --prefix switch on pip to give yourself a clean environment that installs to the right place.
I'll provide an example script that creates a temporary virtualenv, activates it, then installs the dependencies to a local vendor folder. This is handy if you are running in CI.
#!/bin/bash
tempdir=$(mktemp -d -t project.XXX) # create a temporary directory
trap "rm -rf $tempdir" EXIT # ensure it is cleaned up
# create the virtualenv and exclude packages outside of it
virtualenv --python=$(which python2.7) --no-site-packages $tempdir/venv
# activate the virtualenv
source $tempdir/venv/bin/activate
# install the dependencies as above
pip install -r requirements.txt --prefix=vendor
In most cases you should be able to "vendor" all the dependencies. It's basically a crude version of virtualenv.
For example look at how the requests package includes chardet and urllib3 in its own source tree. Here's an example script that should do the initial downloading and copying for you: https://gist.github.com/proppy/1136723
Once you have the dependencies installed, you can reference them with from .some.namespace import dependency_name to make sure that you're using your local versions.
It's possible to do this with recent versions of pip (I'm using 8.1.2). On the build machine:
pip install -r requirements.txt --prefix vendor
Then run it:
PYTHONPATH=vendor/lib/python2.7/site-packages python yourapp.py
(This is basically an expansion of @valentjedi's comment. Thanks!)
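The effect of setting PYTHONPATH at launch can be sketched without installing anything. The example below uses a temporary folder in place of vendor/lib/python2.7/site-packages and a made-up module vendored_dep in place of a real dependency; both are assumptions for illustration only.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the vendor site-packages folder, holding one fake dependency.
    vendor = Path(tmp) / "vendor"
    vendor.mkdir()
    (vendor / "vendored_dep.py").write_text("NAME = 'vendored'\n")

    # Start a child interpreter with PYTHONPATH pointing at the vendor folder.
    env = dict(os.environ, PYTHONPATH=str(vendor))
    result = subprocess.run(
        [sys.executable, "-c", "import vendored_dep; print(vendored_dep.NAME)"],
        env=env, capture_output=True, text=True, check=True,
    )
    output = result.stdout.strip()

print(output)
```

The child process finds the module only because of the environment variable; the parent process's own sys.path is left untouched.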
Let's say you have a Python module app.py with its dependencies listed in a requirements.txt file.
First, install all your dependencies into an appdeps folder:
python -m pip install -r requirements.txt --target=./appdeps
Then, in your app.py module, add this dependency folder to the Python path:
# app.py
import sys
sys.path.append('appdeps')
# rest of your module normally
#...
This works the same way as if you were running the script from a venv with all the dependencies installed inside it.
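One caveat with sys.path.append('appdeps') is that the relative path is resolved against the current working directory, not against the location of app.py. A possible variant, sketched here with a hypothetical helper add_vendor_dir, anchors the folder to the script's own directory instead.

```python
import sys
from pathlib import Path


def add_vendor_dir(script_file, dirname="appdeps"):
    """Prepend <directory of script_file>/<dirname> to sys.path and return it."""
    vendor = str(Path(script_file).resolve().parent / dirname)
    if vendor not in sys.path:
        # insert(0, ...) lets the vendored copies win over globally installed ones;
        # use append() instead to keep the precedence of the original snippet.
        sys.path.insert(0, vendor)
    return vendor
```

At the top of app.py this would be called as add_vendor_dir(__file__), before importing any of the vendored dependencies.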

When would the -e, --editable option be useful with pip install?

When would the -e, or --editable option be useful with pip install?
For some projects the last line in requirements.txt is "-e ." (including the dot). What does it do exactly?
As the man page says:
-e,--editable <path/url>
Install a project in editable mode (i.e. setuptools "develop mode") from a local project path or a VCS url.
So you would use this when trying to install a package locally, most often in the case when you are developing it on your system. It will just link the package to the original location, basically meaning any changes to the original package would reflect directly in your environment.
An example run can be:
pip install -e .
or
pip install -e ~/ultimate-utils/ultimate-utils-proj-src/
Note that the second form uses the full path to the directory that contains setup.py.
Concrete example of using --editable in development
If you play with this test package as in:
cd ~
git clone https://github.com/cirosantilli/vcdvcd
cd vcdvcd
git checkout 5dd4205c37ed0244ecaf443d8106fadb2f9cfbb8
python -m pip install --editable . --user
it outputs:
Obtaining file:///home/ciro/bak/git/vcdvcd
Installing collected packages: vcdvcd
Attempting uninstall: vcdvcd
Found existing installation: vcdvcd 1.0.6
Can't uninstall 'vcdvcd'. No files were found to uninstall.
Running setup.py develop for vcdvcd
Successfully installed vcdvcd-1.0.6
The "Can't uninstall 'vcdvcd'" message is normal: pip tried to uninstall any existing vcdvcd installation in order to replace it with the "symlink-like mechanism" produced in the following steps, but failed because there was no previous installation.
Then it generates a file:
~/.local/lib/python3.8/site-packages/vcdvcd.egg-link
which contains:
/home/ciro/vcdvcd
.
and acts as a "symlink" for the Python interpreter.
So now, if I make any changes to the git source code under /home/ciro/vcdvcd, they are picked up automatically by importers, who can do the following from any directory:
python -c 'import vcdvcd'
Note however that, with my pip version at least, executable scripts installed with --editable, such as the vcdcat script provided by that package via scripts= in setup.py, do not get symlinked but are just copied to:
~/.local/bin/vcdcat
just like for regular installs, and therefore updates to the git repository won't directly affect them.
By comparison, a regular non --editable install from the git source:
python -m pip uninstall vcdvcd
python -m pip install --user .
produces a copy of the installed files under:
~/.local/lib/python3.8/site-packages/vcdvcd
Uninstall of an editable package as done above requires a new enough pip as mentioned at: How to uninstall editable packages with pip (installed with -e)
Tested in Python 3.8, pip 20.0.2, Ubuntu 20.04.
Recommendation: develop directly in-tree whenever possible
The editable setup is useful when you are testing your patch to a package through another project.
If however you can fully test your change in-tree, just do that instead of generating an editable install which is more complex.
E.g., the vcdvcd package above is set up in a way that you can just cd into the source and run ./vcdcat without pip installing the package itself (in general, you might need to install dependencies from requirements.txt though), and the import vcdvcd that the executable does (or possibly your own custom test) just finds the package correctly in the same directory it lives in.
From Working in "development" mode:
Although not required, it’s common to locally install your project in
“editable” or “develop” mode while you’re working on it. This allows
your project to be both installed and editable in project form.
Assuming you’re in the root of your project directory, then run:
pip install -e .
Although somewhat cryptic, -e is short for --editable, and . refers to
the current working directory, so together, it means to install the
current directory (i.e. your project) in editable mode.
Some additional insights into the internals of setuptools and distutils from “Development Mode”:
Under normal circumstances, the distutils assume that you are going to
build a distribution of your project, not use it in its “raw” or
“unbuilt” form. If you were to use the distutils that way, you would
have to rebuild and reinstall your project every time you made a
change to it during development.
Another problem that sometimes comes up with the distutils is that you
may need to do development on two related projects at the same time.
You may need to put both projects’ packages in the same directory to
run them, but need to keep them separate for revision control
purposes. How can you do this?
Setuptools allows you to deploy your projects for use in a common
directory or staging area, but without copying any files. Thus, you
can edit each project’s code in its checkout directory, and only need
to run build commands when you change a project’s C extensions or
similarly compiled files. You can even deploy a project into another
project’s checkout directory, if that’s your preferred way of working
(as opposed to using a common independent staging area or the
site-packages directory).
To do this, use the setup.py develop command. It works very similarly
to setup.py install, except that it doesn’t actually install anything.
Instead, it creates a special .egg-link file in the deployment
directory, that links to your project’s source code. And, if your
deployment directory is Python’s site-packages directory, it will also
update the easy-install.pth file to include your project’s source
code, thereby making it available on sys.path for all programs using
that Python installation.
It is important to note that older versions of pip cannot uninstall a module that was installed with pip install -e, so if you go down this route with an old pip, be prepared for things to get messy if you ever need to uninstall. A partial workaround is to (1) reinstall while keeping a record of the files created, as in sudo python3 setup.py install --record installed_files.txt, and then (2) manually delete all the files listed, e.g. sudo rm -r /usr/local/lib/python3.7/dist-packages/tdc7201-0.1a2-py3.7.egg/ (for release 0.1a2 of module tdc7201). This does not clean everything up, however: even afterwards, importing the (removed!) local library may succeed, and attempting to install the same version from a remote server may do nothing (because pip thinks your (deleted!) local version is already up to date).
Despite the "symlink" wording in previous answers, no actual symlinks are created.
How does the '-e' option work? It just adds the project path given to 'pip install -e' to the file "PYTHONDIR/site-packages/easy-install.pth".
So each time Python searches for a package it checks this directory as well, which means any change to the files in this directory is reflected immediately.
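The .pth mechanism can be tried in isolation with the standard library's site module. This is only a sketch of the general idea, not of what pip or setuptools do internally: the temporary folders stand in for site-packages and for the project checkout, and the names demo.pth and pth_demo_mod are made up.

```python
import importlib
import site
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    checkout = Path(tmp) / "checkout"        # stands in for the project source tree
    fake_site = Path(tmp) / "site-packages"  # stands in for the real site-packages
    checkout.mkdir()
    fake_site.mkdir()
    (checkout / "pth_demo_mod.py").write_text("ANSWER = 42\n")

    # A .pth file lists one directory per line; existing ones get added to sys.path.
    (fake_site / "demo.pth").write_text(str(checkout) + "\n")

    # Process the folder the way Python processes site-packages at startup.
    site.addsitedir(str(fake_site))
    importlib.invalidate_caches()

    module = importlib.import_module("pth_demo_mod")
    answer = module.ANSWER

print(answer)
```

Because only a path is recorded, nothing is copied: the module is loaded straight from the checkout directory, which is why edits there are visible on the next import.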

Pythonic Ways of Importing Custom Modules?

I've needed to deal with this for some time, but never really figured out what the most pythonic way of importing/setting up PYTHONPATH for custom modules is. I know I can use virtualenv to manage it, I know I can set it inside of scripts, or through pth files, but none of these seem very clean and pythonic to me, so I'm guessing I'm missing something.
Almost always, all custom modules I'm interested in are contained in the git directory I've cloned down that has whatever script I'm running, if that simplifies things.
I'm guessing virtualenv is the answer, but figured I'd ask in case I'm missing anything.
EDIT: To clarify, this is only a question about custom modules. I'm already using pip for modules from PyPI.
You can also use pip to install packages that are not on PyPI. You just need a URI endpoint and a valid Python package:
Examples:
$ pip install https://github.com/pypa/pip/archive/develop.zip#egg=pip
$ pip install git+https://github.com/pypa/pip.git#egg=pip
$ pip install git+git://github.com/pypa/pip.git#egg=pip
$ pip install /path/to/pip.tar.gz
$ pip install .
Read more at https://pip-installer.org/en/latest/usage.html#pip-install
virtualenv is a good start.
There are also package managers like pip and easy_install that manage third party modules.
In code you can use:
import sys
sys.path.append('/path/to/customModule')
Virtualenv is the way to go with this.
pip install virtualenv
Then make a folder to setup your environments. Inside that folder:
virtualenv <new_env_name>
That'll create a new folder in that directory. Inside it there's a bin folder; source the activate script in that bin folder. You can then run pip install and it will install only into that environment.
If you're cloning a git repo whose code you also want to browse easily (for example, because you're also working on that repo), clone it into your work_dir and then symlink the package folder into the site-packages directory inside that virtualenv's lib directory. Otherwise, if it's packaged correctly, python setup.py install should install it properly into that virtualenv.
