I have a Python program that has dependencies on other Python libraries. I have used virtualenv and pip to get requirements.txt for all the libs needed to run the app, and thus keeping the environment clean of unnecessary libs. Things work great and I can make progress in developing the app.
This works on my machine, but the issue is that I need to package the app and deploy/distribute to an environment where the requirements.txt and pip cannot be used to simply download the dependencies. The target environment needs a fully functional application.
I'm a bit confused with all these tools offered by Python, such as setuptools and distutils, since none of them seem to offer this (at least easily).
I'm used to Java way, with Maven/Gradle etc. where one simply states dependencies and they are added to the distributable jar/war, unless explicitly stated otherwise.
The dependencies are install inside my virtual environment, under scripts/dir. Is there some easy way to get the dependencies bundled within my app with standard tools, or do I need to roll my own for this?
Related
TL;DR;
The package ColorMe, depends on mypy 3.
The package ColorMe is installed in project Rainbow and Rainbow depends on mypy 2.
How to distribute *ColorMe with all the dependencies, so those who use ColorMe don't need to follow its dependencies versions.
Complete Story:
I have created a project validator (code linter, env checker, etc etc) in my company, which needs to be installed as a dependency in your python solution (normally via pip, or poetry). Currently, it's packaged and distributed as a whl file via poetry build.
Though it's a project that keeps growing and has a lot of dependencies, it starting to conflict with the packages existing in the developers' solutions.
I don't want that my package influences their project, as it must be used as a stand-alone CLI and not as a code source for their project.
Is it possible to create a build or something that isolates the dependencies?
So when devs install my CLI tool (with everything needed for it), which has a dependency on mypy, flake8, etc., don't make the main project dependent on those packages as well?
I want to download python libraries like NumPy, scipy, etc. in a separate folder. I want to include that folder in the python project so that whenever I switch to some other laptop, I don't need to install the libraries again rather I import libraries from that folder. Is there any way?
You can easily install python virtualenv.
Your libraries will be installed in directory created by virtualenv.
https://pypi.org/project/virtualenv/.
Other option, you can also use docker.
I suggest using virtual environment in this case. You could use pipenv so that the project hast exactly the libraries you need for it to run.
You can do it.
for numpy library: https://pypi.org/project/numpy/#files
You can download the files statically from pypi.
I would not recommend you go with this approach. There are several reasons to do that.
There would be a dependency on this kind of library. So you have to keep these dependencies along with the NumPy package.
These libraries are getting updated after some time with some newly added functionality and some bug fixes. So with the time other libraries might not compatible with this library.
Recommended way:
Just create an requirement.txt file that contains all the dependency with its version number.
whenever you want to use your project elsewhere, just install all these libraries with below command.
pip install -r requirement.txt
There are two major way you can install python libs to a separate folder: a virtual environment or a container.
Virtula environment (like venv, pipenv, etc) is good as this is the simplest way to your project's own liblaries set which is not impact any other pythonic script in your system. The downside of this case is thet you really have to set up an environment (including lib install) on every computer you move your script to. This can and should be autimated, of course, but this should be done either way.
The container, in other hand, requires additional resources to handle and to build, build, but it is exactly the box with a specific version of your script along with all libs and binaries it requires. No need to reinstall libs while moving to new laptop/desktop/server/cloud/whatever. For this case I would recommend the Docker/Kubernetes. But it's better to start with Docker.
Our command line utility is written with Python.
On Linux/OS X this is usually not a problem since both come with Python 2.x pre installed. However on Windows, Python isn't installed by default.
Additional problem is that few of our dependencies require compiling which yet again is not a trivial problem for Windows users since it requires tinkering with MSVC/Cygwin/etc'.
Up until now we solved this issue by using Pyinstaller to create "frozen" Python package with pre-installed dependencies. This worked well, however made our utility non extendable - we cannot add additional Python modules by using utilities such as pip for example. Since our CLI depends on this ability in order to add additional usability, this limitation became a blocker for us and we would like to rethink our approach.
Searching around, I found how Rhodecode solve this. Basicly their installer brings Python and everything else (including pre-compiled dependencies).
This seem as a good idea for us, the only limitation I see here is that their installer actually installs Python from .msi which puts stuff in Windows Registry. So there can be only one Python of version X.Y installed on Windows (from .msi)
For a server application this might be reasonable since server application tends to act like it's the only thing installed on the PC, but for command line utility, this is completely unacceptable.
Looking around I found few projects that claim to make Python portable - for example Portable Python. However I'm not sure how "portable" it really is, especially after issues like this.
So questions are:
Is it possible to install same Python version multiple times on Windows without creating collisions between the instances?
Would you choose other workaround to solving this problem (please no "smart" solutions such as: drop Windows support/don't use Python)
Thanks!
Frankly I would stick with PyInstaller or something similar. That will always provide you with the correct version of Python whether or not the target machine has Python installed. It also protects you from clobbering a previously installed version of Python.
If you need to add plugins, then you should build that into your app. There are a few projects that might help you with that. Here are a couple of examples:
http://yapsy.sourceforge.net/
https://pypi.python.org/pypi/Plugins/
You might also take a look at how Django or Flask handles extensions. For example, you can add an extension to Flask to allow it to work with SQLAlchemy. You should be able to do something similar with your own application. The pip utility won't work with a frozen application after all. Alternatively, have you looked at conda? It might work for your purposes.
This is an answer for my second question, sadly I still haven't figured out a better solution for number 1.
For now here's how we changed our approach for creating setup file:
We package our code and its dependencies as set of pre-built Python wheels. It's relatively easy to create pre-built wheels on Windows since the release of Visual C++ compiler for Python2.7.
We package Python setup MSI together with Pip, Setuptools and Virtualenv wheels.
Before install starts we check whether Python, Pip and Virtualenv are already installed (by looking in the registry and \Scripts), if it's not, we install it from the packaged wheels. Pip wheel is installed by using get-pip.py script which we bundle as well.
We create separate Virtualenv and install our wheels into it.
Uninstall is done by removing the virtualenv folder. Python, Pip and Virtualenv install are left behind.
Installer created by Inno Setup.
I'm still not fully satisfied with this solution since some components are globally installed and might collide with what user had already installed before (older/newer version of python, pip, setuptools or virtualenv). This creates potential for unpredictable bugs during install or runtime. Or the possibility for user to upgrade one of the components in the future and somehow break the application.
Also, uninstall is dirty and leaves stuff behind.
I'd like to distribute a whole virtualenv, or a bunch of Python wheels of exact versions with their runtime dependencies, for example:
pycurl
pycurl.so
libcurl.so
libz.so
libssl.so
libcrypto.so
libgssapi_krb5.so
libkrb5.so
libresolv.so
I suppose I could rely on the system to have libssl.so installed, but surely not libcurl.so of the correct version and probably not Kerberos.
What is the easiest way to package one library in a wheel with all the run-time dependency?
Or is that a fool's errand and I should package entire virtualenv?
How to do that reliably?
P.S. compiling on the fly is not an option, some modules are patched.
AFAIK, there is no good standard way to portably install dependencies with your package. Continuum has made conda for precisely this purpose. The numpy guys wrote their own distutils submodule in their package to install some complicated dependencies, and now at least some of them advocate conda as a solution. Unfortunately, you may have to make conda packages for some of these dependencies yourself.
If you're fine without portability, then targeting the package manager of the target machines will obviously work. Otherwise, for a portable package manager, conda is the only option I know of.
Alternatively, from your post ("compiling on the fly is not an option") it sounds like portability may not be an issue for you, in which case you could also install all the requirements to a prefix directory (most installers I've come across support a configure --prefix=/some/dir/ option). If you have a guaranteed single architecture, you could probably prefix-install all your dependencies to a single directory and pass that around like a file. The conda approach would probably be cleaner, but I've used prefix installs quite a bit and they tend to be one of the easiest solutions to get going.
Edit:
As for conda, it is simultaneously a package-manager and a "virtualenv"-like environment/python install. While virtualenv is added on top of an existing python install, conda takes over the whole install, so you can be more sure that all the dependencies are accounted for. Compared to pip, it is designed for adding generalized non-Python dependencies, instead of just compiling C/Cpp exentions. For more info, I would see:
pip vs conda (also recommends buildout as a possibility)
conda as a python install
As for how to use conda for your purpose, the docs explain how to create a recipe:
Conda build framework
Building a package requires a recipe. A recipe is flat directory which
contains the following files:
meta.yaml (metadata file)
build.sh (Unix build script which is executed using bash)
bld.bat (Windows build script which is executed using cmd)
run_test.py (optional Python test file)
patches to the source (optional, see below)
other resources, which are not included in the source and cannot be
generated by the build scripts.
The same recipe should be used to build a package on all platforms.
When building a package, the following steps are invoked:
read the metadata
download the source (into a cache)
extract the source in a source directory
apply the patches
create a build environment (build dependencies are installed here)
run the actual build script. The current working directory is the source
directory with environment variables set. The build script installs into
the build environment
do some necessary post processing steps: shebang, rpath, etc.
add conda metadata to the build environment
package up the new files in the build environment into a conda package
test the new conda package:
create a test environment with the package (and its dependencies)
run the test scripts
There are example recipes for many conda packages in the conda-recipes
<https://github.com/continuumio/conda-recipes>_ repo.
The :ref:conda skeleton <skeleton_ref> command can help to make skeleton recipes for common
repositories, such as PyPI <https://pypi.python.org/pypi>_.
Then, as a client, you would install the package similar to how you would install from pip
Lastly, docker may also be interesting to you, though I haven't seen it much used for Python.
You may want to look into PEX: https://pex.readthedocs.io/en/stable/whatispex.html
'Files with the .pex extension – “PEX files” or ”.pex files” – are self-contained executable Python virtual environments. PEX files make it easy to deploy Python applications: the deployment process becomes simply scp.'
I'm an author of a pure Python library that aims to be also convenient to use from a command line. For Windows users it would be nice just installing the package from an .exe or .msi package.
However I cannot get the installer to install package dependencies (especially the dependency on setuptools itself, so that running the software fails with an import error on pkg_resources). I don't believe that providing an easy .exe installer makes much sense, if the user then needs to manually install setuptools and other libraries on top. I'd rather tell them how to add easy_install to their PATH and go through this way (http://stackoverflow.com/questions/1449494/how-do-i-install-python-packages-on-windows).
I've build .exe packages in the past, but don't remember if that ever worked the way I'd preferred it to.
It is quite common to distribute packages that have dependencies, especially those as you have, but I understand your wish to make installation as simple as possible.
Have a look at deployment bootstrapper, a tool dedicated to solving the problem of delivering software including its prerequisites.
Regardless of what packaging method you eventually choose, maintain your sanity by staying away from including MSIs in other MSI in any way. That just does not work because of transactional installation requirements and locking of the Windows Installer database.