Is there a way to "version" my python distribution?

I'm working by myself right now, but am looking at ways to scale my operation.
I'd like to find an easy way to version my Python distribution, so that I can recreate it very easily. Is there a tool to do this? Or can I add /usr/local/lib/python2.7/site-packages/ (or whatever) to an svn repo? This doesn't solve the problems with PATHs, but I can always write a script to alter the path. Ideally, the solution would be to build my Python env in a VM, and then hand copies of the VM out.
How have other people solved this?

virtualenv + requirements.txt are your friend.
You can create several virtual Python installs for your projects, each containing exactly the library versions you need. (Tip: pip freeze spits out a requirements.txt with the exact library versions.)
Find a good reference to virtualenv here: http://simononsoftware.com/virtualenv-tutorial/ (it's from this question Comprehensive beginner's virtualenv tutorial?).
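A minimal sketch of that workflow (the package name is only an example):
virtualenv env
source env/bin/activate
pip install requests                  # install whatever your project actually needs
pip freeze > requirements.txt         # record the exact versions in use

# later, on another machine or a fresh checkout
virtualenv env
source env/bin/activate
pip install -r requirements.txt       # recreate the same set of packages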
Alternatively, if you just want to distribute your code together with its libraries, PyInstaller is worth a try. It packages everything into a standalone executable, so you don't even have to install the software on the target machine.

You want to use virtualenv. It lets you create an application-specific directory for installed packages. You can also use pip to generate a requirements.txt and install from it.

For the same goal, i.e. having the exact same Python distribution as my colleagues, I tried to create a virtual environment on a network drive, so that all of us could use it without anybody having to make a local copy.
The idea was to share the same packages installed in a shared folder.
Outcome: Python ran so unbearably slowly that it could not be used. Installing a package was also very sluggish.
So it looks like there is no other way than using virtualenv and a requirements file. (Even if, unfortunately, it does not always work smoothly on Windows and requires manual installation of some packages and dependencies, at least at the time of writing.)

Related

How to manage python project that depends on multiple versions of a shared library?

I am on macOS, using brew, pyenv, and virtualenv.
I have a Python project that depends on bokeh and gdal (both python packages were installed with pip inside a virtual environment). Both bokeh and gdal depend on a system version of libopenssl, but they depend on different versions (1.0 and 1.1).
I have had this project working at various points in the past with some combination of libraries (using pip for all Python packages and brew for system packages), but when I change Python versions and environments (using pyenv) to work on other projects and then come back to this project, it no longer works. Usually I get something along these lines - a problem finding a shared library for openssl:
$ ./my_python_program.py
...
ImportError: dlopen(/Users/userBob/.pyenv/versions/3.7.0/lib/python3.7/lib-dynload/_ssl.cpython-37m-darwin.so, 2):
Library not loaded: /usr/local/opt/openssl@1.1/lib/libssl.1.1.dylib
Referenced from: /Users/userBob/.pyenv/versions/3.7.0/lib/python3.7/lib-dynload/_ssl.cpython-37m-darwin.so
Reason: image not found
I feel like I am eventually able to get things to work by trying random combinations of installing and uninstalling various package versions using pip and brew. But this is a fragile and inefficient way to maintain my projects.
In general what is the best way to handle this kind of situation? Do I need to simply record the exact brew and pip install/uninstall commands to get it working? Am I missing the concept of version "pinning"? Are there additional options with brew and pyenv that I am missing that might make this process easier?
I'm not sure this is the best way to do it, but I can tell you what I do usually.
First of all, I'm using Anaconda.
When I'm on a project, I switch to the relevant virtual environment.
Before switching out, when I commit/push my modifications, I also create an export file of my environment (the .yml file that conda env export produces).
I also track this file with git, this way, if I make any modification when working on the environment, it's stored in the .yml file.
This way I can reinstall all the dependencies needed for the project if I format my machine, get a new one, etc. The reference for every dependency I need is stored in the cloud along with my sources, so if I start getting weird behaviour I just restore my environment from the reference file from the time when it was working.
I'm not switching between projects often enough to justify automating this process, but I'm sure that's feasible if you want to.
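A minimal sketch of that export/restore cycle (the environment name is only an example):
conda env export > environment.yml    # record the current environment, including exact versions
git add environment.yml               # track the reference file alongside the sources

# later, on a fresh machine or after something breaks
conda env create -f environment.yml   # recreate the environment from the file
conda activate myproject              # the name comes from the .yml; 'myproject' is a placeholder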

Make virtualenv share already existing site-packages?

My layout is as follows:
I have various different python projects under ~/projects, each with the following structure:
~/projects/$project_name/env #This is the virtualenv
~/projects/$project_name/scripts #This is where the code actually lives
~/projects/$project_name/scripts/requirements.txt #This helps keep track of this project's dependencies
Now, this setup works great as it does the following:
Each project has its own dependencies in its corresponding env
I can easily redeploy this project somewhere else by cloning the scripts directory, creating a new virtualenv and doing pip install -r requirements.txt
The main downside of this setup is that I have multiple copies of the same packages in multiple virtual environments. I regularly end up with a couple of hundred megs for each virtual environment.
My question is:
Is there a way to share packages between multiple virtualenvs?
Things I've tried that do not work:
virtualenv --system-site-packages. This makes the system-wide packages available in the virtualenv, but:
it makes it impossible to get a list of specific dependencies
I can't have multiple versions of the same dependency installed (e.g. pandas 0.16 vs pandas 0.15) which I need, as different projects have different needs.
virtualenv --extra-search-dir=/path/to/dist only works for pip, AFAICT, so not good for me.
Scratch my earlier comment - maybe I do know an answer. It appears as though Anaconda's package management system does use symlinks, so that would basically be a virtualenv but with the feature you want. See here: How to free disk space taken up by (ana)conda?
That said, there's a large initial harddisk cost to using Conda, so investigate a bit more and decide if it will work for you.
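A minimal sketch of what that looks like with conda (environment names and versions are only illustrative):
conda create -n project_a python=2.7 pandas=0.16
conda create -n project_b python=2.7 pandas=0.15
# Each package version is downloaded once into conda's package cache (e.g. ~/anaconda/pkgs)
# and linked into every environment that needs it, so two environments using the same
# version of a package reuse the files already on disk instead of duplicating them.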

Debian build package: Add python virtualenv into dpkg-buildpackage to be uploaded to launchpad

I would like to pack a python program and ship it in a deb package.
For reasons (I know that in 99% of cases it is bad practice) I want to ship the program in a Python virtual environment within a Debian package.
I know I can do this using dh-virtualenv. This works great - generally no problem.
But the problem arises when I want to upload this to launchpad. Uploading to launchpad means uploading a source package, and in dh-virtualenv terms a source package is just the package description - the virtualenv has not been created yet.
What happens when I upload this to launchpad is that the package will not build: dh-virtualenv, executed during the build process on launchpad, tries to install the Python modules into the virtualenv, which means installing them from PyPI, and that cannot work because launchpad does not allow external network access.
So basically there are two possible solutions:
Approach A
Prepare the virtualenv, somehow incorporate it into the source package, and have the dh build process simply "move" this prepared virtualenv to its final location. This could work with virtualenv --relocatable, BUT the relocation strips the utf-8 marker at the beginning of all Python scripts, rendering every Python script in the virtualenv broken.
Approach B
Somehow cache all necessary python packages in the source package and have dh_virtualenv install from the cache instead of from PyPI.
This seems to be doable with pip2pi, but my experiments show that it will not install packages even though they are located in the local package index.
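For reference, the workflow I experimented with looks roughly like this (command names are from pip2pi's documentation; exact flags may differ between versions, and /path/to/packages is a placeholder):
pip2tgz packages/ -r requirements.txt      # download the needed distributions into packages/
dir2pi packages/                           # turn packages/ into a pip-compatible simple index under packages/simple/
pip install --index-url=file:///path/to/packages/simple/ -r requirements.txt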
Both approaches seem a bit clumsy and prone to errors.
What do you think of this?
What are your experiences?
What would you recommend?

How can I escape from python environment hell?

I don't know how I've gotten here but I have many competing installations of python on my Ubuntu 16.04 path. Some I use, some I don't.
I'm at the point now where I want to clean things up to save headaches when troubleshooting issues, but I don't know any strategies or tools for tackling this.
What is the best way I can find out which environments are being used and not used?
How can I determine which python directories are being pointed to and which ones are abandoned?
What's a quick way I can get a list of the non-standard packages installed in each environment?
Here is what you can try:
Run which python (usually Python 2.x) and which python3 (for Python 3.x) to see which interpreters your shell resolves.
Then decide which version you want to use by default. You can add an alias pointing at the required interpreter path to your shell profile for a permanent change, or use alias python=PATH in the current shell for temporary usage.
Also check where pip and pip3 are pointing by running which pip and which pip3, and use the matching one to install the required packages.
I would recommend using virtualenv or pipenv so that you get more fine-grained control over interpreter selection according to the needs of your project.
Note: do not uninstall any of the above Pythons without some research, as there may be system dependencies on them and removing them could break your system.
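A minimal sketch of that kind of audit (output will vary per machine):
which -a python python3               # every interpreter on the PATH, in resolution order
python -V; python3 -V                 # confirm which versions those actually are
which -a pip pip3                     # see which interpreters the pip commands belong to
python3 -m pip list                   # non-standard packages visible to that interpreter
python3 -m pip list --user            # packages installed with --user only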

Creating a Portable Python (local install) for Linux

I'm looking to create the following:
A portable version of Python that can be run on any system (with any previous version of Python, or no Python, installed), pre-configured with various Python packages (e.g. django, lxml, pysqlite, etc.)
The closest I've found to the above is virtualenv, but this only goes so far.
If I package up a nice virtualenv for Python on one machine, it contains symlinks to a lot of the libraries it needs. I can replace those symlinks with the actual files, but if I try to move the entire directory to another machine, I get segfault after segfault.
To launch python on a different machine, I'm using:
LD_LIBRARY_PATH=lib/ ./bin/python
and in lib/ I have all of the shared libraries I copied from the original machine. The problem here is that these shared libraries might rely on other shared libraries that I'm not including, so executing this on other Linux distros does not work - probably because it falls back on older shared libraries installed on the system that do not work with what I copied over.
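One way to see what is still missing is ldd (the paths here are just examples based on the layout above):
LD_LIBRARY_PATH=lib/ ldd bin/python   # lists every shared object the interpreter needs and where each one resolves
ldd lib/libssl.so                     # the same check for an individual copied library; lines marked 'not found' are the gaps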
Anyone have an idea on how to get this working? Is this even possible?
EDIT:
To clarify, the desired outcome is to create a tar.gz of a Python binary and associated packages (django, lxml, pysqlite, etc.) that can be extracted and run on any Linux-based system (e.g. Ubuntu 8.04, Red Hat 5, SUSE 11), all 32-bit distros, where the locally installed version of Python doesn't impact what's in the tar.gz.
I just tested this and it works great.
Get the copy of Python you want to install, untar it, and cd into the untarred folder first.
Also get a copy of setuptools and untar that.
/opt/portapy used below is of course just the name I came up with for this post; it could be any path. The full path should be tarred up and the same path should be used on any system you put this on, because of absolute-path linking.
mkdir /opt/portapy
cd <python source dir>
./configure --prefix=/opt/portapy && make && make install
cd <setuptools source dir>
/opt/portapy/bin/python ./setup.py install
Make the virtual env folder inside the portapy folder.
mkdir /opt/portapy/virtenv
/opt/portapy/bin/virtualenv /opt/portapy/virtenv
cd /opt/portapy/virtenv
source bin/activate
Done. You are ready to install all of your libraries here and have the option of creating multiple virtual envs this way.
You can then tar up the whole /opt/portapy folder and transport it to any Linux system of the same arch, within reason I suspect.
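For example (a sketch; the archive name is arbitrary):
tar -czf portapy.tar.gz /opt/portapy  # tar strips the leading / when creating the archive
tar -xzf portapy.tar.gz -C /          # on the target machine, extract back to /opt/portapy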
I compiled 2.7.5 on CentOS 5.8 64-bit and moved the folder to a CentOS 6.9 system, and it runs perfectly.
I don't know how this is even possible. If it were, they wouldn't need to distribute binary packages of Python for different platforms. You can't simply distribute a Python that will run on any platform; it has to be built from source for that arch. Virtualenv will expect you to tell it which system Python to use (using links).
This pretty much goes for almost any binary package that links against system libs. Again, if it were possible, we wouldn't need any platform specific binary distributions.
You can, however, achieve part of what you want: running Python on another machine that doesn't have Python installed, as long as it's the same arch. This is the same concept behind freezing, or py2exe/py2app/pyinstaller: an interpreter is bundled into a standalone environment, so the app can run on any similar platform.
Edit
I just realized that while your question speaks about "system" agnostically, your title contains the reference "linux". There are different flavors of Linux, so for this to work you would have to build it fat for multiple archs and also completely contain the standalone links. You might try building a package with PyInstaller and including that in your project.
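For example, something along these lines (yourscript.py is a placeholder):
pyinstaller --onefile yourscript.py   # bundles the interpreter and dependencies into a single executable for the platform it is built on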
You can try just building python from source, in your virtualenv:
$ ./configure --prefix=/path/to/virtualenv && make && make install
If you still have problems with the links to libs, you can also investigate building it statically.
I'm not sure that working solely in Python is the way to go here. You might have better luck with Puppet or Chef, which are configuration tools that can be used to create a local environment. There is plenty of code out there to install virtualenv and Python on just about any Linux plus OS X (probably not Windows, though).
Your workflow would be to install Chef or Puppet (your choice), run a script to install the Python you want, then enter a virtualenv and pip install any packages you might need.
Sorry this isn't as easy as virtualenv alone, but it is much more robust.
Well, since I rarely accept "can't be done", there is a way to do it. Warning: it isn't pretty and you should probably look into a different scenario.
What you will need to do is determine a standard location for this top-level directory. Second, using that directory as your root, you will need to compile Python on each Linux distribution you want to run this on. For this you would use something like "/usr/local/myappname/platform/" to configure and compile Python to live in; in each case substitute "platform" with the name of the platform, such as "/usr/local/rhel/". If memory serves, the configure option you are looking for here is --prefix.
Once you have each distribution compiled you will need a script to determine which one to use and either set environment variables or have it create symlinks to the appropriate "installation" of python. I would then use virtualenv and bootstrap in that tree to keep the "in-use" python libraries even more specific.
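A rough sketch of such a selector script (the layout and the detection logic are only illustrative):
#!/bin/sh
# pick the Python build that matches this distribution
if [ -f /etc/redhat-release ]; then
    PLATFORM=rhel
elif [ -f /etc/debian_version ]; then
    PLATFORM=debian
else
    PLATFORM=generic
fi
# point a stable path at the matching build and put it on the PATH
ln -sfn "/usr/local/myappname/$PLATFORM" /usr/local/myappname/current
export PATH="/usr/local/myappname/current/bin:$PATH"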
I can't think of a common Linux distribution that doesn't have Python by default. As such, you could use setup.py and/or basic Python scripts to script this out, since you should be able to rely on Python being present - even if it's ye olde version, as in RHEL installs. Personally I find the above method overly complicated, but it would meet your stated requirements, with the allowance for a final script. Of course, you could use shar (SHell ARchive) to bundle all of this into a runnable shell script that does the installation and avoids the need for secondary scripts. If you gzip the resulting shell archive, you can decompress it on the target systems and execute it to set everything up.
All that said, I would not recommend this. I would recommend determining the minimum Python version you can run on, ensuring it is installed by the distribution whenever possible, and if need be pulling it down from a repo and installing it. Then use virtualenv and bootstrap with a requirements.txt to install the necessary Python libraries and apps into the virtualenv. For that, see the virtualenv documentation.
I faced the same problem, so I created PortableVirtualenv. Your question is pretty much the definition of it.
I use it as a base for commercial multiplatform app I develop. (But PortableVirtualenv is public domain - use it freely.)
If needed, you can pip-install any package and zip the whole directory to distribute also packages you need.
One nice option is to make a "snap" portable Linux application. Snaps have a python mode which lets you specify exactly what modules you need. From https://snapcraft.io/first-snap#python :
Snaps let you distribute a dependency-isolated Python app in an app store experience for end users.
Another option is to containerize your application with something like Docker. Then, instead of executing your script directly, the user actually runs a small OS with just your application and its dependencies. https://www.infoq.com/articles/docker-executable-images/ has more about executable containers.
Container images can also be used for short lived processes: a containerized executable meant to be run on your computer. These containers execute a single task, are short lived and can generally be removed after use. We call these executable images. Examples are compilers (Golang) or build tools (Maven), presentation software (I love to hack a simple presentation in Markdown format and let a RevealJS Docker image serve that) and browsers (a fresh contained browser to follow that fishy link). A real evangelist for executable images is Docker's own Jessie Frazelle. To get some great inspiration be sure to read her blog about them or check out this presentation at DockerCon 2015.
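A minimal sketch of that pattern (the image name is a placeholder, and you would supply a Dockerfile that copies the script and installs its dependencies):
docker build -t myscript .            # build the image from that Dockerfile
docker run --rm myscript              # run the containerized script; --rm removes the container when it exits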
