Tools / best practices for managing application dependencies? - python

What tools or best practices are available for tracking and managing dependencies of the software I'm developing? I'm using Python / Django, and to date all my software requirements are open source.
I'm developing a web application that, while modest, has a number of dependencies. At a minimum, I'd like to track the software and version number for these. I suppose I'd also like to track the configuration of the required software, and possibly some system-level details (the userid, if any, under which each required process runs, and the permissions it needs).
(Better yet would be something that would help me set up a server for the application when I'm ready to deploy. Better still would be something that lets me track the HTTP and DNS name servers used to support the app. But rumor has it that Puppet is a tool for that sort of thing.)

Use pip and virtualenv. With virtualenv, you can create a "virtual environment" which has all your Python packages installed into a local directory. With pip install -r, you can install all packages listed in a specific requirements file.
Rough example:
virtualenv /path/to/env --no-site-packages --unzip-setuptools # create virtual environment
source /path/to/env/bin/activate # activate environment
easy_install pip # install pip into environment
source /path/to/env/bin/activate # reload to get access to pip
pip install -r requirements.txt
Where requirements.txt contains lines like this:
django==1.3
The great thing about this is that requirements.txt serves both as documentation and as part of the installation procedure, so there's no need to synchronize the two.
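A fuller requirements file is just more of the same, one pinned package per line (the extra packages and versions below are only illustrative):
django==1.3
South==0.7.3        # hypothetical extra dependencies, pinned the same way
psycopg2==2.4.1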

Related

Pyramid not included in Python virtualenv

I'm an experienced developer, but not very familiar with Python and Pyramid.
I'm trying to follow some (a bit old and unclear) instructions on deploying a Pyramid web application. My OS is Ubuntu 16.04.
The instructions say to install virtualenv and Pyramid - I do so with apt install virtualenv and apt install python-pyramid. Then they say I should run the app in a virtual environment, so I build that with virtualenv . -ppython3, and activate it with source bin/activate. I install the application from a ready-to-run buildout from GitHub. The buildout includes a "production.ini" file with parameters to pserve.
But Pyramid is not included in the virtual environment built with virtualenv. (There is no "pserve" in the bin directory, for example.) So I can't run the application with bin/pserve etc/production.ini, as the instructions say. And if I try with plain "pserve", I get errors when trying to access files like "var/waitress-%(process_num)s.sock", files that the app expects to find in the virtual environment.
I've looked for flags to tell virtualenv to include Pyramid, but couldn't find any. Am I overlooking something? I'd be most grateful for some help! :-)
/Anders from Sweden
Perhaps you might want to try installing Pyramid into your virtual environment using pip: packages installed with apt go into the system Python's dist-packages, so they aren't visible from inside an isolated virtual environment. Since the guide expects Pyramid to be importable by your program from within the virtual environment, pip is the better fit than apt here. Once you've activated the environment, all you have to do is run pip install pyramid. That way the package is available inside the virtual environment, and only there.
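Put together, and reusing the commands from the question, the sequence would be roughly:
virtualenv . -ppython3          # create the virtual environment, as in the question
source bin/activate             # enter it
pip install pyramid             # installs Pyramid and its console scripts into bin/
bin/pserve etc/production.ini   # pserve should now be present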
You mentioned it's using buildout - I assume this is zc.buildout. Buildout usually manages its own virtualenv and handles installing all of the necessary dependencies. It really depends on how that buildout is configured, as there's no standard for what to do or how to run your app. I would normally expect pserve to be exposed in the bin folder, but maybe another app-specific script is exposed instead.
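If it is a classic zc.buildout project, the usual sequence is roughly this (a sketch; older buildouts ship a bootstrap.py, newer ones expect zc.buildout to be installed already):
python bootstrap.py    # generates bin/buildout (or: pip install zc.buildout)
bin/buildout           # reads buildout.cfg, fetches eggs, generates scripts in bin/
ls bin/                # look for pserve or an app-specific runner here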

What is the use case for `pip install -e`?

When I need to work on one of my pet projects, I simply clone the repository as usual (git clone <url>), edit what I need, run the tests, update the setup.py version, commit, push, build the packages and upload them to PyPI.
What is the advantage of using pip install -e? Should I be using it? How would it improve my workflow?
I find pip install -e extremely useful when simultaneously developing a product and a dependency, which I do a lot.
Example:
You build websites using Django for numerous clients, and have also developed an in-house Django app called locations which you reuse across many projects, so you publish it on PyPI and version it.
When you work on a project, you install the requirements as usual, which installs locations into site-packages.
But you soon discover that locations could do with some improvements.
So you grab a copy of the locations repository and start making changes. Of course, you need to test these changes in the context of a Django project.
Simply go into your project and type:
pip install -e /path/to/locations/repo
This replaces the installed copy of locations in site-packages with a link to the repository, meaning any changes to the code in there are reflected automatically - just reload the page (so long as you're using the development server).
Because the link points at the live files in the repository, you can switch branches to see changes, try different things, and so on.
The alternative would be to cut a new version, push it to PyPI, and hope you've not forgotten anything. If you have many such in-house apps, this quickly becomes untenable.
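The day-to-day loop then looks something like this (all paths here are hypothetical):
cd ~/projects/client-site               # a Django project that depends on locations
source env/bin/activate                 # its virtual environment
pip install -e ~/src/locations          # link your checkout into site-packages
# edit code in ~/src/locations, reload the dev server, see the change, repeat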
For those who don't have time:
If you install your project with an -e flag (e.g. pip install -e mynumpy) and use it in your code (e.g. from mynumpy import some_function), when you make any change to some_function, you should be able to use the updated function without reinstalling it.
pip install -e installs a package in setuptools "development mode", with its dependencies resolved and installed by pip.
What you typically do is to install the dependencies:
git clone URL
cd project
run pip install -e . or pip install -e .[dev]*
And now all the dependencies should be installed.
*[dev] is the name of an extras group (extras_require) from setup.py; a sketch follows below.
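For reference, a minimal sketch of how such a [dev] group might be declared in setup.py (the project and package names here are hypothetical):
from setuptools import setup, find_packages

setup(
    name="myproject",
    version="0.1.0",
    packages=find_packages(),
    install_requires=["requests"],        # normal runtime dependencies
    extras_require={
        "dev": ["pytest", "flake8"],      # installed by: pip install -e .[dev]
    },
)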
Besides setuptools eggs there is also the wheel format for distributing Python packages. Both are built distributions, meaning that no compilation or build step is needed at install time.

How to handle python dependencies throughout the project?

Let's say a developer is working on a project when he realizes he needs to use some package.
He uses pip to install it. Now, after installing it, would the developer write it down as a dependency in the requirements file / setup.py?
What does that same dev do if he forgot to write down all the dependencies of the project (or if he didn't know better, since he hasn't been doing it long)?
What I'm asking is: what's the workflow when working with external packages from PyPI?
The command:
pip freeze > requirements.txt
will copy all of the dependencies currently in your python environment into requirements.txt. http://pip.readthedocs.org/en/latest/reference/pip_freeze.html
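Its output is one pinned requirement per line, ready to be fed straight back into pip install -r; the packages and versions below are only illustrative:
Django==1.6.5
South==0.8.4
requests==2.3.0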
It depends on the project.
If you're working on a library, you'll want to put your dependencies in setup.py so that if you're putting the library on PyPi, people will be able to install it, and its dependencies automatically.
If you're working on an application in Python (possibly web application), a requirements.txt file will be easier for deploying. You can copy all your code to where you need it, set up a virtual environment with virtualenv or pyvenv, and then do pip install -r requirements.txt. (You should be doing this for development as well so that you don't have a mess of libraries globally).
It's certainly easier to write the packages you're installing to your requirements.txt as soon as you've installed them than to try to figure out which ones you need at the end. What I do, so that I never forget, is write the package names to the file first and then install with pip install -r.
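That write-first workflow is just two steps (the package name is only an example):
echo "requests" >> requirements.txt    # record the dependency first
pip install -r requirements.txt        # then install everything listed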
pip freeze helps if you've forgotten what you've installed, but you should always read the file it created to make sure that you actually need everything that's in there. If you're using virtualenv it'll give better results than if you're installing all packages globally.

How to manage libraries in deployment

I run Vagrant on Mac OS X. I am coding inside a virtual machine with CentOS 6, and I have the same versions of Python and Ruby in my development and production environment. I have these restrictions:
I cannot manually install. Everything must come through RPM.
I cannot use pip install and gem install to install the libraries I want as the system is managed through Puppet, and everything I add will be removed.
yum has old packages. I usually cannot find the latest versions of the libraries.
I would like to put my libraries locally in a lib directory near my scripts, and create an RPM that includes those frozen versions of dependencies. I cannot find an easy way to bundle my libraries for my scripts and push everything into my production server. I would like to know the easiest way to gather my dependencies in Python and Ruby.
I tried:
virtualenv (with --relocatable option)
PYTHONPATH
sys.path.append("lib path")
I don't know which is the right way to go. Also, for Ruby, is there any way to solve my problems with Bundler? I see that Bundler is mostly associated with Rails. Does it work for small custom scripts?
I like the approach of Node.js and npm: all packages are stored locally in node_modules. I have the nodejs RPM installed, and I deploy a folder with my application onto the production server. I would like to do it this way in Ruby and Python.
I don't know Node, but what you describe for npm seems to be exactly what a virtualenv is. Once the virtualenv is activated, pip installs only within that virtualenv - so Puppet won't interfere. You can write out your current list of packages to a requirements.txt file with pip freeze, and recreate the whole thing again with pip install -r requirements.txt. Ideally you would then deploy with Puppet, and the deploy step would involve creating or updating the virtualenv, activating it, then running that pip command.
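Sketched out, that flow might look like this (the install paths are hypothetical; in practice they'd come from your Puppet manifests):
# on the development machine
pip freeze > requirements.txt

# on the production host, e.g. from a Puppet exec resource
virtualenv /opt/myapp/env
/opt/myapp/env/bin/pip install -r requirements.txt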
Maybe take a look at Docker?
With Docker you could create an image of your specific environment and deploy that.
https://www.docker.com/whatisdocker/

Best way to install python packages locally for development

Being new to the Python game, I seem to have missed out on some knowledge of how you can develop on a program but also keep it in your live environment.
Programs like gpodder can be run directly from the source checkout, which is really handy; however, others want to be "installed" to run.
A lot of programs are distributed with a setup.py, with instructions to run "python ./setup.py install" as root, which will put stuff somewhere in your file-system. There are even install commands like "develop" which seem to hold the promise of what I want. So I tried:
export PYTHONPATH=/home/alex/python
python ./setup.py develop --install-dir=/home/alex/python
Which downloaded a bunch of stuff locally and seems to magically ensure the application I'm hacking on is still being run out of the source tree. So I guess my roundabout question is: is this the correct way of developing Python code? How do things like easy_install and pip factor into this?
So I tried the following:
python /usr/share/pyshared/virtualenv.py /home/alex/src/goobook
cd /home/alex/src/goobook/googbook.git
/home/alex/src/goobook/bin/python ./setup.py develop
And finally linked the program in question to my ~/bin
cd /home/alex/src/goobook
linkbin.pl bin/goobook
However invocation throws up a load of extra chatter which seems to imply it's wrong:
17:17 alex#socrates/i686 [goobook] >goobook --help
/home/alex/bin/goobook:5: UserWarning: Module pkg_resources was already imported from /home/alex/src/goobook/lib/python2.5/site-packages/setuptools-0.6c8-py2.5.egg/pkg_resources.py, but /home/alex/src/goobook/lib/python2.5/site-packages/distribute-0.6.10-py2.5.egg is being added to sys.path
from pkg_resources import load_entry_point
/home/alex/bin/goobook:5: UserWarning: Module site was already imported from /home/alex/src/goobook/lib/python2.5/site.pyc, but /home/alex/src/goobook/lib/python2.5/site-packages/distribute-0.6.10-py2.5.egg is being added to sys.path
from pkg_resources import load_entry_point
Install virtualenv (http://pypi.python.org/pypi/virtualenv) to set up a localized virtual environment for your libraries, and setuptools (http://pypi.python.org/pypi/setuptools), i.e. "easy_install", to install new things.
Virtualenv allows you to work in completely independent and isolated Python environments. It lets you easily create multiple environments which have different Python packages installed, or different versions of the same package. Virtualenv also lets you easily switch between your different environments.
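For example, switching between two environments is just a matter of activating and deactivating them:
virtualenv envA                 # two independent environments
virtualenv envB
source envA/bin/activate        # work inside envA
deactivate                      # leave it
source envB/bin/activate        # now work inside envB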
As of 2012, the de facto preferred tool for package management in Python is pip rather than setuptools. Pip can handle dependencies and install/uninstall packages globally or inside a virtual environment. Pip even comes bundled with virtualenv out of the box.
Python 3
Also worth mentioning is the fact that virtual environments are becoming a part of Python itself in release 3.3, with the implementation of PEP 405.
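With PEP 405 in place, the same setup needs no third-party tool (and from Python 3.4 on, pip is bootstrapped into the new environment automatically):
python3 -m venv /path/to/env        # standard-library replacement for virtualenv
source /path/to/env/bin/activate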
The Python Packaging User Guide, which "aims to be the authoritative resource on how to package, publish and install Python distributions using current tools", recommends using pip to install in "development mode":
pip install -e <path>
Thus, in the root directory of your package, you can simply run:
pip install -e .
See installing from a local source tree.
The best way to develop Python apps with dependencies is to:
Download the desired version of the Python interpreter.
Install and use buildout (http://www.buildout.org/).
Buildout is something like Maven for Java (it will fetch all needed packages automatically).
This way your Python interpreter will not be polluted by third-party packages (this is important if you will be running the developed application on other machines). Additionally, you can integrate buildout with the virtualenv package (this allows you to create virtual Python interpreters for each project).
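A minimal buildout.cfg for such a project might look like this (the egg name is hypothetical):
[buildout]
parts = app

[app]
recipe = zc.recipe.egg
eggs = myproject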
