Related
I'm working on a DevOps project for a client who's using Python. Though I never used it professionally, I know a few things, such as using virtualenv and pip - though not in great detail.
When I looked at the staging box, which I am trying to prepare for running a functional test suite, I saw chaos. Tons of packages installed globally, and those installed inside a virtualenv not matching the requirements.txt of the project. OK, thought I, there's a lot of cleaning up. Starting with global packages.
However, I ran into a problem at once:
➜ ~ pip uninstall PyYAML
Not uninstalling PyYAML at /usr/lib/python2.7/dist-packages, owned by OS
OK, someone must've done a 'sudo pip install PyYAML'. I think I know how to fix it:
➜ ~ sudo pip uninstall PyYAML
Not uninstalling PyYAML at /usr/lib/python2.7/dist-packages, owned by OS
Uh, apparently I don't.
A search revealed some similar conflicts caused by users installing packages bypassing pip, but I'm not convinced - why would pip even know about them, if that was the case? Unless the "other" way is placing them in the same location pip would use - but if that's the case, why would it fail to uninstall under sudo?
Pip denies to uninstall these packages because Debian developers patched it to behave so. This allows you to use both pip and apt simultaneously. The "original" pip program doesn't have such functionality
Update: my answer is relevant only to old versions of Pip. For the latest versions, Pip is configured to modify only the files which reside only in its "home directory" - that is /usr/local/lib/python3.* for Debian. For the latest tools, you will get these errors when you try to delete the package, installed by apt:
For pip 9.0.1-2.3~ubuntu1 (installed from Ubuntu repository):
Not uninstalling pyyaml at /usr/lib/python3/dist-packages, outside environment /usr
For pip 10.0.1 (original, installed from pypi.org):
Cannot uninstall 'PyYAML'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
The point is not that pip cannot install the package because you don't have enough permissions, but because it is not a package installed through pip, so it doesn't want to uninstall it.
dist-packages is where packages installed by the OS package manager reside; as they are handled by another package manager (e.g. apt on Ubuntu/Debian, pacman on Arch, rpm/yum on CentOS, ... ) pip won't touch them (but still has to know about them as they are installed packages, so they can be used to satisfy dependencies of pip-installed packages).
You should also probably avoid to touch them unless you use the correct package manager, and even so, they may have been installed automatically to satisfy the dependencies of some program, so you may not remove them without breaking it. This can usually be checked quite easily, although the exact way depends from the precise Linux distribution you are using.
I'm trying to install the basic scipy stack (numpy, scipy, matplotlib, pandas, sympy, ipython, nose) into a virtualenv; currently, I'm using OSX Mountain Lion. From the installation pages for these packages, as well as various threads here and around the web, it seems that pandas, sympy, and nose can be installed easily with just pip (although some list numpy/scipy/etc. as dependencies).
However, there seems to be conflicting and kind of convoluted instructions for properly installing numpy, scipy, matplotlib, and, to a certain extent, ipython*. Installing solely with pip doesn't seem to be the proper way to install these packages; it seems that some dependencies need to be installed with homebrew, but various places list different packages to brew install before pip installing numpy/scipy/etc. Is there a comprehensive and current list of package dependencies that need to be installed with homebrew before pip installing numpy, scipy, and matplotlib?
Just as a note, I've looked at tapping homebrew/python for installing numpy, scipy, and matplotlib properly with homebrew, but I want to install into a virtualenv and I don't think I can use homebrew to do that.
Any help would be greatly appreciated; thanks in advance!
*Also, for ipython, the installation page says that pip install ipython[all] should be sufficient, but some other sources (http://www.coderstart.com/setup/python-setup.html; http://www.lowindata.com/2013/installing-scientific-python-on-mac-os-x/) seem to suggest that the qt, pyqt, and zmq packages need to be brew installed first before pip installing ipython; are the homebrew installation necessary or is it fine to just pip install according to the ipython installation page?
Answering my own question here, but hopefully this is helpful; feel free to correct if there are any mistakes. The original version is a gigantic wall of text, so I've added a tl;dr to the top with just the steps to hopefully make the process more clear.
tl;dr: In terminal/bash, go into a virtualenv (if you want to install into one) and enter these commands in order. This is tested for OSX Mountain Lion.
pip install numpy
brew install gcc
pip install scipy
brew install freetype
pip install matplotlib
pip install nose
pip install pandas
pip install sympy
pip install ipython[all]
brew install pyqt
brew install qt
brew install sip
echo "export PYTHONPATH=/usr/local/lib/python2.7/site-packages:$PYTHONPATH" >> ~/.bash_profile
source ~/.bash_profile
*Notes: brew installing pyqt might already install qt and sip; if it does, no need to install qt and sip after installing pyqt. For the second to last line, might be be more reliable to directly c/p that line into ~/.bash_profile, since it might need to be at the top of the ~/.bash_profile contents. Also, brew install pandoc is optional, but necessary for ipython notebook's nbconvert command work properly.
edit 10/13/14 [see edit at bottom]: editing PYTHONPATH in ~/.bash_profile forces virtual environments to inherit global packages; if you want to be able to make isolated environments, don't do the last two steps. Instead, assuming virtualenvwrapper is installed, edit the local postactivate and predeactivate scripts in the bin directory under the virtualenv that contains the scipy stuff.
In postactivate, enter:
export PYTHONPATH=/usr/local/lib/python2.7/site-packages:$PYTHONPATH
In predeactivate, enter:
unset PYTHONPATH
This should edit PYTHONPATH when when the virtualenv with scipy stuff is activated so that ipython qtconsole works, but should reset PYTHONPATH when the virtualenv is deactivated, so that other virtualenvs are not affected by PYTHONPATH changes.
Below is the long narrative version.
Anyway, from some trial and error after I originally posted the question, I found that the following steps work; I used the two sets of instructions that I linked above as general guides, and this was tested on OSX Mountain Lion.
After activating the virtualenv in which the packages are going to be installed, first pip install numpy; this should work as expected and numpy should be installed (note that numpy should be installed first since a lot of the other packages in the scipy stack depend on numpy).
Now, before installing scipy, several sources note that gfortran (this seems to be the most common, but I think any fortran compiler should work) needs to be installed; brew install gfortran returns an error that says that gcc now includes gfortran so the gfortran formula is deprecated. Hence, we brew install gcc (note that even though xcode command line tools, which is generally installed before homebrew, already includes gcc, its version of gcc somehow doesn't work or doesn't include gfortran). After installing gcc, pip install scipy works as expected, and scipy should be installed. Quick aside: brew install gcc installs gcc as well as a bunch of dependencies, namely cloog, gcc, gmp, isl, libmpc, mpfr. [These should all be installed into /usr/local/Cellar, which is the default install location for homebrew.]
For installing matplotlib, freetype needs to be installed first, so we brew install freetype; this should install freetype and libpng, which seems to be a freetype dependency. Afterwards, pip install matplotlib works as expected, installing matplotlib successfully. Note that matplotlib installs with mock (needed to run matplotlib test suite), pyparsing (required for mathtext support), python-dateutil (required for date axis support), six (no reason given) along with it. [These should all be installed, along with any other pip install within a virtualenv, into the site-packages directory within the virtualenv.]
Installing nose, sympy, and pandas simply require pip installing, as they don't have any dependencies that need to be brew installed. However, of these, note that at least pandas depends on numpy, so installing pandas (not sure about the others) after installing numpy is probably preferred. Also, note that pandas installs with pytz (for time zone calculations).
Installing ipython is pretty simple, but setting it up is a little more convoluted. As a heads up, ipython can be used with both a qt console and something called ipython notebook, which have various benefits. You can choose to just install ipython with pip install ipython and install optional dependencies later on as needed, but I installed all the main optional dependencies with pip install ipython[all]. This installs ipython, along with a lot of other packages dependencies (installs with backports.ssl-match-hostname (from tornado), certifi (from tornado), docutils (from sphinx), gnureadline, ipython, jinja2, markupsafe (from jinja2), numpydoc (from ipython[all]), pygments, pyzmq, sphinx, tornado). This should be a good foundation for ipython to use both the standard ipython shell, the qt console, and the ipython notebook. However, it's not fully set up if you want to use either the qt console or the notebook.
To use the qt console, the pyqt, qt, and sip packages must be brew installed, as these are dependencies that cannot be installed with pip; from experience, brew install pyqt seems to install all three packages, but individually installing the three might be a safer bet. After doing this, go into ~/.bash_profile and add the line 'export PYTHONPATH=/usr/local/lib/python2.7/site-packages:$PYTHONPATH' to it; then 'source ~/.bash_profile' in terminal to reload the shell. This should successfully allow the qt console to be launched. [I'm not totally sure why this line needs to be added, since I had already edited the PATH variable to put /usr/local/bin before /usr/bin, but perhaps qt/pyqt/sip were still trying to link themselves with the system default python instead of homebrew installed python.]
The notebook seems to work fine out of the box as far as I've seen, but there's one thing of note: in order to use nbconvert (converting the notebook to different file formats), the pandoc package must be installed, presumably with homebrew. Like qt/pyqt/sip, it can't be installed with pip which is why it wasn't installed with pip install ipython[all].
edit 10/13/14: So apparently, editing PYTHONPATH will nullify empty virtual environments, making global packages also available in a virtualenv (how to isolate virtualenv from local dist-packages?); this, for the most part, defeats the purpose of a virtualenv, assuming that you want to a fresh environment, but is necessary for ipython qtconsole to work correctly.
The fix is to edit (assuming virtualenvwrapper is installed) the local postactivate and predeactivate scripts in the bin file of your virtualenv. In postactivate, enter the line "export PYTHONPATH=/usr/local/lib/python2.7/site-packages:$PYTHONPATH; in predeactivate, enter the line "reset PYTHONPATH". Don't do the last two steps of the original sequence, or delete the line that was added to ~/bash_profile. This should make it so that the changes to PYTHONPATH are made only when the virtualenv with our installed packages is activated, so that qtconsole works, but are reset before the virtualenv deactivates, so that other environments are not affected.
I'm on OSX 10.9, but I had good luck following these instructions and get roughly the same environment that you outlined above
http://hackercodex.com/guide/python-development-environment-on-mac-osx/
Hope this helps
Until today, I have been using the macports version of python27 and installing python packages through macports. Today, I needed some packages which were not available through macports; I learned about pip and found them there. After installing these packages through pip, however, I realized that neither pip nor macports could see what had been installed by the other. So, for consistency, I decided to uninstall all macports packages, install python27 and py27-pip through macports and then proceed to install all of my python packages through pip.
This worked fine, but since macports does not know about my pip-installed python packages, I ran into trouble when installing something else which depends on python (e.g., inkscape): macports tried to install its own version of, e.g. py27-numpy (already installed by pip) and then failed installation because it "already exists and does not belong to a registered port."
Is there a consistent way to use pip and to get macports to recognize that the python packages it might need for something else are already installed?
The solution is: dont use Macports for installing Python's packages.
Macports is a general package manager and it registers installed packages in its database.
Pip is a package manager for Python so if you want to install Python package, use appropriate package management tool. Pip doesnt have it's own database to keep evidence about installed stuff - it just checks Python's path to see if the package is there (and that's what you want).
Sooner or later you'll use Virtualenv anyway and you'll need pip to install packages in there too so it's better to use it everywhere.
The output of pip freeze on my machine has included the following odd line:
command-not-found==0.2.44
When trying to install requirements on a different machine, I got the obvious No distributions at all found for command-not-found==0.2.44. Is this a pip bug? Or is there any real python package of that name, one which does not exist in pypi?
Indeed, as mentioned in the follow up comments, Ubuntu has a python package, installed via dpkg/apt that is called "python-commandnotfound"
$apt-cache search command-not-found
command-not-found - Suggest installation of packages in interactive bash sessions
command-not-found-data - Set of data files for command-not-found.
python-commandnotfound - Python 2 bindings for command-not-found.
python3-commandnotfound - Python 3 bindings for command-not-found.
As this is provided via apt, and not available in the pypi repo, you won't be able to install it via pip, but pip will see that it is installed. For the purposes of showing installed packages, pip doesn't care if a package is installed via apt, easy_install, pip, manually, etc.
In short, if you actually need it on another host (which I assume you don't) you'll need to apt-get install python-commandnotfound.
A tweet reads:
Don't use easy_install, unless you
like stabbing yourself in the face.
Use pip.
Why use pip over easy_install? Doesn't the fault lie with PyPI and package authors mostly? If an author uploads crap source tarball (eg: missing files, no setup.py) to PyPI, then both pip and easy_install will fail. Other than cosmetic differences, why do Python people (like in the above tweet) seem to strongly favor pip over easy_install?
(Let's assume that we're talking about easy_install from the Distribute package, that is maintained by the community)
From Ian Bicking's own introduction to pip:
pip was originally written to improve on easy_install in the following ways
All packages are downloaded before installation. Partially-completed installation doesn’t occur as a result.
Care is taken to present useful output on the console.
The reasons for actions are kept track of. For instance, if a package is being installed, pip keeps track of why that package was required.
Error messages should be useful.
The code is relatively concise and cohesive, making it easier to use programmatically.
Packages don’t have to be installed as egg archives, they can be installed flat (while keeping the egg metadata).
Native support for other version control systems (Git, Mercurial and Bazaar)
Uninstallation of packages.
Simple to define fixed sets of requirements and reliably reproduce a set of packages.
Many of the answers here are out of date for 2015 (although the initially accepted one from Daniel Roseman is not). Here's the current state of things:
Binary packages are now distributed as wheels (.whl files)—not just on PyPI, but in third-party repositories like Christoph Gohlke's Extension Packages for Windows. pip can handle wheels; easy_install cannot.
Virtual environments (which come built-in with 3.4, or can be added to 2.6+/3.1+ with virtualenv) have become a very important and prominent tool (and recommended in the official docs); they include pip out of the box, but don't even work properly with easy_install.
The distribute package that included easy_install is no longer maintained. Its improvements over setuptools got merged back into setuptools. Trying to install distribute will just install setuptools instead.
easy_install itself is only quasi-maintained.
All of the cases where pip used to be inferior to easy_install—installing from an unpacked source tree, from a DVCS repo, etc.—are long-gone; you can pip install ., pip install git+https://.
pip comes with the official Python 2.7 and 3.4+ packages from python.org, and a pip bootstrap is included by default if you build from source.
The various incomplete bits of documentation on installing, using, and building packages have been replaced by the Python Packaging User Guide. Python's own documentation on Installing Python Modules now defers to this user guide, and explicitly calls out pip as "the preferred installer program".
Other new features have been added to pip over the years that will never be in easy_install. For example, pip makes it easy to clone your site-packages by building a requirements file and then installing it with a single command on each side. Or to convert your requirements file to a local repo to use for in-house development. And so on.
The only good reason that I know of to use easy_install in 2015 is the special case of using Apple's pre-installed Python versions with OS X 10.5-10.8. Since 10.5, Apple has included easy_install, but as of 10.10 they still don't include pip. With 10.9+, you should still just use get-pip.py, but for 10.5-10.8, this has some problems, so it's easier to sudo easy_install pip. (In general, easy_install pip is a bad idea; it's only for OS X 10.5-10.8 that you want to do this.) Also, 10.5-10.8 include readline in a way that easy_install knows how to kludge around but pip doesn't, so you also want to sudo easy_install readline if you want to upgrade that.
Another—as of yet unmentioned—reason for favoring pip is because it is the new hotness and will continue to be used in the future.
The infographic below—from the Current State of Packaging section in the The Hitchhiker's Guide to Packaging v1.0—shows that setuptools/easy_install will go away in the future.
Here's another infographic from distribute's documentation showing that Setuptools and easy_install will be replaced by the new hotness—distribute and pip. While pip is still the new hotness, Distribute merged with Setuptools in 2013 with the release of Setuptools v0.7.
Two reasons, there may be more:
pip provides an uninstall command
if an installation fails in the middle, pip will leave you in a clean state.
REQUIREMENTS files.
Seriously, I use this in conjunction with virtualenv every day.
QUICK DEPENDENCY MANAGEMENT TUTORIAL, FOLKS
Requirements files allow you to create a snapshot of all packages that have been installed through pip. By encapsulating those packages in a virtualenvironment, you can have your codebase work off a very specific set of packages and share that codebase with others.
From Heroku's documentation https://devcenter.heroku.com/articles/python
You create a virtual environment, and set your shell to use it. (bash/*nix instructions)
virtualenv env
source env/bin/activate
Now all python scripts run with this shell will use this environment's packages and configuration. Now you can install a package locally to this environment without needing to install it globally on your machine.
pip install flask
Now you can dump the info about which packages are installed with
pip freeze > requirements.txt
If you checked that file into version control, when someone else gets your code, they can setup their own virtual environment and install all the dependencies with:
pip install -r requirements.txt
Any time you can automate tedium like this is awesome.
pip won't install binary packages and isn't well tested on Windows.
As Windows doesn't come with a compiler by default pip often can't be used there. easy_install can install binary packages for Windows.
UPDATE: setuptools has absorbed distribute as opposed to the other way around, as some thought. setuptools is up-to-date with the latest distutils changes and the wheel format. Hence, easy_install and pip are more or less on equal footing now.
Source: http://pythonhosted.org/setuptools/merge-faq.html#why-setuptools-and-not-distribute-or-another-name
As an addition to fuzzyman's reply:
pip won't install binary packages and isn't well tested on Windows.
As Windows doesn't come with a compiler by default pip often can't be
used there. easy_install can install binary packages for Windows.
Here is a trick on Windows:
you can use easy_install <package> to install binary packages to avoid building a binary
you can use pip uninstall <package> even if you used easy_install.
This is just a work-around that works for me on windows.
Actually I always use pip if no binaries are involved.
See the current pip doku: http://www.pip-installer.org/en/latest/other-tools.html#pip-compared-to-easy-install
I will ask on the mailing list what is planned for that.
Here is the latest update:
The new supported way to install binaries is going to be wheel!
It is not yet in the standard, but almost. Current version is still an alpha: 1.0.0a1
https://pypi.python.org/pypi/wheel
http://wheel.readthedocs.org/en/latest/
I will test wheel by creating an OS X installer for PySide using wheel instead of eggs. Will get back and report about this.
cheers - Chris
A quick update:
The transition to wheel is almost over. Most packages are supporting wheel.
I promised to build wheels for PySide, and I did that last summer. Works great!
HINT:
A few developers failed so far to support the wheel format, simply because they forget to
replace distutils by setuptools.
Often, it is easy to convert such packages by replacing this single word in setup.py.
Just met one special case that I had to use easy_install instead of pip, or I have to pull the source codes directly.
For the package GitPython, the version in pip is too old, which is 0.1.7, while the one from easy_install is the latest which is 0.3.2.rc1.
I'm using Python 2.7.8. I'm not sure about the underlay mechanism of easy_install and pip, but at least the versions of some packages may be different from each other, and sometimes easy_install is the one with newer version.
easy_install GitPython