When working with JVM languages a pattern commonly followed is to use a build system (ant+ivy / maven / gradle), where using a build file, the dependencies of your code can be defined. The build system is able to fetch these dependencies when you build your code. Moreover IDEs like Eclipse/IntelliJ are also able to read these build files and continuously build/verify your code as you write it.
How is something similar done while developing in Python? While there may not necessarily be a build step, I want a developer to be able to checkout my code and then run a single bootstrap command that will setup a virtualenv and pull in any thirdy-party dependencies necessary to run the code. I could include some sort of a script to do this, but I am wondering if there is a tool to do this? Most of my search so far has led me to packaging tools, which are more for distribution to end-user than for this purpose (or so I understand).
This is managed by virtualenv and the pip install -r requirements.txt command. More info here: Virtual Environments
I guess requirements.txt is what you are looking for. For example, PyCharm IDE will definitely see it as a dependency list.
Related
I have jupyter-notebook running on my own Mac with the caylsto-processing library plugged in so I can run processing scripts in a notebook in a browser tab. But I am trying to be able to run this all in binder, so that I can share my processing scripts with students during class. I created a Github repository and have it linked to a binder, and the binder builds and launches, but the only kernel available is python 3.
I have read that I can include a bunch of configuration files, but I'm new to these and I don't see any examples that bring in the calysto-processing kernel, so I'm unsure on how to proceed.
Screenshot of my binder with the jupyter-notebook with a processing script - but when you click on kernels, the only kernel it shows is python:
Any help would be appreciated.
Very good question. Ayman suggestion is good.
I've just installed calysto_processing and noticed 3 things are necessary:
installing the calysto_processing package via pip,
running install on the calysto_processing package.
installing Processing.
First point should be easy with requirements.txt.
I'm unsure what the best option is for the second step (maybe a custom setup.py ?).
Step 3 feels the trickiest.
Installing Processing currently isn't supported with apt-get so Dockerfile might be way forward (even through mybinder recommend that only as a last resort).
Let's assume a Dockerfile would contain all the steps to manually download/install processing (and I'm not super experienced with Docker at the moment btw), it will need to be executed which will require a windowing system to render the Processing window.
I don't know how well that plays with Docker, sounds like it's getting into virtual machine territory.
That being said, looking at the source code right here:
Processing is used only to validate the sketch, and pull syntax errors to display them otherwise.
ProcessingJS is used to actually render the processing code in a <canvas/> element within the Jupyter Notebook
I'm not sure what the easiest way to run the current calysto_processing in mybinder as is.
My pragmatic (even hacky if you will) suggestion is to:
fork the project and remove the processing-java dependency (which means might loose error checking)
install the cloned/tweaked version via pip/requirements.txt (pip can install a package from a github repo)
Update I have tried the above: you can run test kernel here
The source is here and the module is installed from this fork which simply comments out the processing-java part.
In terms of the mybinder configuration it boils down to:
create a binder folder in the repo containing the notebook
add requirements.txt which points to the tweaked version of calysto_processing stripped off the processing-java dependency: git+https://github.com/orgicus/calysto_processing.git#hotfix/PJS-only-test
add postBuild file which runs install on the calysto_processing module: python -m calysto_processing install --user
Notes
With this workaround java error checking is gone
Although Processing syntax is used it's execute as javascript and rendered in <canvas/> using ProcessingJS: this means no processing-java libraries, no threads or other java specific features,(buggy or no 3D),etc. just basic Processing drawing sketches
It might be worth looking at replacing ProcessingJS with p5.js and checking out other JS notebooks ? (e.g. Observable or IJavascript)
Using from setuptools.command.install import install, I can easily run a custom post-install script if I run python setup.py install. This is fairly trivial to do.
Currently, the script does nothing but print some text but I want it to deal with system changes that need to happen when a new package is installed -- for example, back up the database that the package is using.
I want to generate the a Python wheel for my package and then copy that and install it on a a set of deployment machines. However, my custom install script is no longer run on the deployment machine.
What am I doing wrong? Is that even possible?
Do not mix package installation and system deployment
Installation of Python packages (using any sort of packaging tools or formats) shall be focused on making that package usable from Python code.
Deployment, what might include database modifications etc. is definitely out of scope and shall be handled by other tools like fab, salt-stack etc.
The fact, that something seems fairly trivial does not mean, one shall do it.
The risk is, you will make your package installation difficult to reuse, as it will be spoiled by others things, which are not related to pure package installation.
The option to hook into installation process and modify environment is by some people even considered flaw in design, causing big mess in Python packaging situation - see Armin Roacher in Python Packaging: Hate, Hate, Hate Everywhere, chapter "PTH: The failed Design that enabled it all"
PEP 427 which specifies the wheel package format does not leave any provisions for custom pre or post installation scripts.
Therefore running a custom script is not possible during wheel package installation.
You'll have to add the custom script to a place in your package where you expect the developer to execute first.
I'm working by myself right now, but am looking at ways to scale my operation.
I'd like to find an easy way to version my Python distribution, so that I can recreate it very easily. Is there a tool to do this? Or can I add /usr/local/lib/python2.7/site-packages/ (or whatever) to an svn repo? This doesn't solve the problems with PATHs, but I can always write a script to alter the path. Ideally, the solution would be to build my Python env in a VM, and then hand copies of the VM out.
How have other people solved this?
virtualenv + requirements.txt are your friend.
You can create several virtual python installs for your projects, everything containing exactly those library versions you need (Tip: pip freeze spits out a requirements.txt with the exact library versions).
Find a good reference to virtualenv here: http://simononsoftware.com/virtualenv-tutorial/ (it's from this question Comprehensive beginner's virtualenv tutorial?).
Alternatively, if you just want to distribute your code together with libraries, PyInstaller is worth a try. You can package everything together in a static executable - you don't even have to install the software afterwards.
You want to use virtualenv. It lets you create an application(s) specific directory for installed packages. You can also use pip to generate and build a requirements.txt
For the same goal, i.e. having the exact same Python distribution as my colleagues, I tried to create a virtual environment in a network drive, so that everybody of us would be able to use it, without anybody making his local copy.
The idea was to share the same packages installed in a shared folder.
Outcome: Python run so unbearably slow that it could not be used. Also installing a package was very very sluggish.
So it looks there is no other way than using virtualenv and a requirements file. (Even if unfortunately often it does not always work smoothly on Windows and it requires manual installation of some packages and dependencies, at least at this time of writing.)
I'm looking to create the following:
A portable version of python that can be run on any system (with any previous version of python or no python installed) and have it pre-configured with various python packages (ie, django, lxml, pysqlite, etc)
The closest I've found to the above is virtualenv, but this only goes so far.
If I package up a nice virtualenv for python on one machine, it contains sym links to a lot of the libraries it needs. I can take those sym links and convert them to their actual files, but if I try to move this entire directory to another machine, I get seg fault after seg fault.
To launch python on a different machine, I'm using:
LD_LIBRARY_PATH=lib/ ./bin/python
and in lib/ I have all of the shared libraries I copied from the original machine. The problem here is these shared libraries might rely on other shared libraries that I'm not including, so executing this on other linux distros does not work. Probably due to it falling back on older shared libaries installed on the system that do not work with what I copied over.
Anyone have an idea on how to get this working? Is this even possible?
EDIT:
To clarify, the desired outcome is to create a tar.gz of a python binary and associated packages (django, lxml, pysqlite, etc) that can be extracted and run on any linux based system, ie (ubuntu 8.04, redhat 5, suse 11, etc), all 32bit distros, where the locally installed version of python doesn't impact what's in the tar.gz.
I just tested this and it works great.
Get the copy of python you want to install and untar it and cd to the untarred folder first.
Also get a copy of setuptools and untar that.
/opt/portapy used below is of course just the name I came up with for this post, it could be any path and the full path should be tarred up and the same path should be used on any systems you put this on due to absolute path linking.
mkdir /opt/portapy
cd <python source dir>
./configure --prefix=/opt/portapy && make && make install
cd <setuptools source dir>
/opt/portapy/bin/python ./setup.py install
Make the virtual env folder inside the portapy folder.
mkdir /opt/portapy/virtenv
/opt/portapy/bin/virtualenv /opt/portapy/virtenv
cd /opt/portapy/virtenv
source bin/activate
Done. You are ready to install all of your libraries here and have the option of creating multiple virtual envs this way.
You can then tar up the whole /opt/portapy folder and transport it to any Linux system of the same arch, within reason I suspect.
I compiled 2.7.5 ond centOS 5.8 64bit and moved the folder to a Cent6.9 system and it runs perfectly.
I don't know how this is even possible. If it were, they woudn't need to distribute binary packages of python for different platforms. You can't simply distribute python that will run on any platform. It has to be built from source for that arch. Virtualenv will expect you to tell it which system python to use (using links).
This pretty much goes for almost any binary package that links against system libs. Again, if it were possible, we wouldn't need any platform specific binary distributions.
You can, however, achieve part of what you want. That is, running python on another machine that doesn't have python installed as long as its the same arch. This is the same concept behind freezing, or py2exe/py2app/pyinstaller. An interpreter is bundled into a standalone environment. So the app can run on any similar platform.
Edit
I just realized that while your question speaks about "system" agnostically, your title contains the reference "linux". There are different flavors of linux, so in order for it to work you would have to build it fat for multiple archs and also completely contain the standalone links. You might try building a package with pyinstaller and using that to include in your project.
You can try just building python from source, in your virtualenv:
$ ./configure --prefix=/path/to/virtualenv && make && make install
If you still have problems with the links to libs, you can also investigate building it statically
I'm not sure that working solely in Python is the way to go here. You might have better luck with Puppet of Chef, which are configuration tools that can be used to create a local environment. There is plenty of code out there to install virtualenv and python on just about any Linux plus OSX (probably not Windows though).
Your workflow would be to install chef or Puppet (your choice), run a script to install the Python you want, then enter a virtualenv and pip install any packages you might need.
Sorry this isn't as easy as virtualenv alone, but it is much more robust.
Well, since I rarely accept "can't be done", there is a way to do it. Warning: it isn't pretty and you should probably look into a different scenario.
What you will need to to is determine a standard location for this top level directory. Second, using that directory as your root you will need to compile Python on each Linux distribution you want to run this on. For this you would use something like "/usr/local/myappname/platform/" to configure and compile Python to live in. In each case substitute "platform" with the name of the platform such as "/usr/local/rhel/". If memory serves the configure option you are looking for here is --prefix.
Once you have each distribution compiled you will need a script to determine which one to use and either set environment variables or have it create symlinks to the appropriate "installation" of python. I would then use virtualenv and bootstrap in that tree to keep the "in-use" python libraries even more specific.
I can't think of a common Linux distribution that doesn't have Python by default. As such you could use setup.py and/or basic python scripts to script this out since you should be able to rely in Python being present - even if its ye olde version as in RHEL installs. Personally I find the above method overly complicated but it would meet your stated requirements with the allowance for a final script. Of course, you could use shar (SHell ARchive) to tar all of this into a runnable shell script to do the installation and avoid the need for secondary scripts. If you gzip the resulting shel archive then you can decompress it on target systems and execute it to set everything up.
All that said, I would not recommend this. I would recommend determining the minimum Python version you can run on and ensuring that is installed by the distribution whenever possible and if needs be pulling down from a repo and installing. Then, use virtualenv and bootstrap with a requirements.txt to install necessary python libraries and apps into the virutalenv. For that see this documentation
I faced the same problem, so I created PortableVirtualenv. Your Question is just the definition of it.
I use it as a base for commercial multiplatform app I develop. (But PortableVirtualenv is public domain - use it freely.)
If needed, you can pip-install any package and zip the whole directory to distribute also packages you need.
One nice option is to make a "snap" portable linux application. They have a python mode which lets you specify you specify exactly what modules you need. From https://snapcraft.io/first-snap#python :
Snaps let you distribute a dependency-isolated Python app in an app store experience for end users.
Another option is to containerize your application with something like docker. Then instead of executing your script directly, the user is actually running a small OS with just your application and its dependencies. https://www.infoq.com/articles/docker-executable-images/ has more about executable containers.
Container images can also be used for short lived processes: a containerized executable meant to be run on your computer. These containers execute a single task, are short lived and can generally be removed after use. We call these executable images. Examples are compilers (Golang) or build tools (Maven), presentation software (I love to hack a simple presentation in Markdown format and let a RevealJS Docker image serve that) and browsers (a fresh contained browser to follow that fishy link). A real evangelist for executable images is Docker's own Jessie Frazelle. To get some great inspiration be sure to read her blog about them or check out this presentation at DockerCon 2015.
I want to distribute some python code, with a few external dependencies, to machines with only core python installed (and users that unfamiliar with easy_install etc.).
I was wondering if perhaps virtualenv can be used for this purpose? I should be able to write some bash scripts that trigger the virtualenv (with the suitable packages) and then run my code.. but this seems somewhat messy, and I'm wondering if I'm re-inventing the wheel?
Are there any simple solutions to distributing python code with dependencies, that ideally doesn't require sudo on client machines?
Buildout - http://pypi.python.org/pypi/zc.buildout
As sample look at my clean project: http://hg.jackleo.info/hyde-0.5.3-buildout-enviroment/src its only 2 files that do the magic, more over Makefile is optional but then you'll need bootstrap.py (Make file downloads it, but it runs only on Linux). buildout.cfg is the main file where you write dependency's and configuration how project is laid down.
To get bootstrap.py just download from http://svn.zope.org/repos/main/zc.buildout/trunk/bootstrap/bootstrap.py
Then run python bootstap.py and bin/buildout. I do not recommend to install buildout locally although it is possible, just use the one bootstrap downloads.
I must admit that buildout is not the easiest solution but its really powerful. So learning is worth time.
UPDATE 2014-05-30
Since It was recently up-voted and used as an answer (probably), I wan to notify of few changes.
First of - buildout is now downloaded from github https://raw.githubusercontent.com/buildout/buildout/master/bootstrap/bootstrap.py
That hyde project would probably fail due to buildout 2 breaking changes.
Here you can find better samples http://www.buildout.org/en/latest/docs/index.html also I want to suggest to look at "collection of links related to Buildout" part, it might contain info for your project.
Secondly I am personally more in favor of setup.py script that can be installed using python. More about the egg structure can be found here http://peak.telecommunity.com/DevCenter/PythonEggs and if that looks too scary - look up google (query for python egg). It's actually more simple in my opinion than buildout (definitely easier to debug) as well as it is probably more useful since it can be distributed more easily and installed anywhere with a help of virtualenv or globally where with buildout you have to provide all of the building scripts with the source all of the time.
You can use a tool like PyInstaller for this purpose. Your application will appear as a single executable on all platforms, and include dependencies. The user doesn't even need Python installed!
See as an example my logview package, which has dependencies on PyQt4 and ZeroMQ and includes distributions for Linux, Mac OSX and Windows all created using PyInstaller.
You don't want to distribute your virtualenv, if that's what you're asking. But you can use pip to create a requirements file - typically called requirements.txt - and tell your users to create a virtualenv then run pip install -r requirements.txt, which will install all the dependencies for them.
See the pip docs for a description of the requirements file format, and the Pinax project for an example of a project that does this very well.