It's a great hassle when installing some packages in a VE and conda or pip downloads them again even when I already have it in my base environment. Since I have limited internet bandwidth and I'm assuming I'll work with many different VE's, it will take a lot of time to download basic packages such as OpenCV/Tensorflow.
By default, pip caches anything it downloads, and will use the cached version whenever possible. This cache is shared between your base environment and all virtual environments. So unless you pass the --no-cache-dir option, pip downloading a package means it has not previously downloaded a compatible version of that package. If you already have that package installed in your base environment or another virtual environment and it downloads it anyway, this probably means one or more of the following is true:
You installed your existing version with a method other than pip.
There is a newer version available, and you didn't specify, for example, pip install pandas==1.1.5 (if that's the version you already have elsewhere). Pip will install the newest compatible version for your environment, unless you tell it otherwise.
The VE you're installing to is a different Python version (e.g. created with Pyenv), and needs a different build.
I'm less familiar with the specifics of conda, and I can't seem to find anything in its online docs that focuses on the default caching behavior. However, a how-to for modifying the cache location seems to assume that the default behavior is similar to how pip works. Perhaps someone else with more Anaconda experience can chime in as well.
So except for the caveats above, as long as you're installing a package with the same method you did last time, you shouldn't have to download anything.
If you want to simplify the process of installing all the same packages (that were installed via pip) in a new VE that you already have in another environment, pip can automate that too. Run pip freeze > requirements.txt in the first environment, and copy the resulting file to your newly created VE. There, run pip install -r requirements.txt and pip will install all the packages that were installed (via pip) in the first environment. (Note that pip freeze records version numbers as well, so this won't install newer versions that may be available -- whether this is a good or bad thing depends on your needs.)
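To make the snapshot idea above concrete, here is a minimal sketch of roughly what pip freeze reports, written against the standard library's importlib.metadata. It is not pip's own implementation; the function name freeze_lines is made up for illustration, and the real command also handles editable and VCS installs, which this sketch ignores.

```python
from importlib import metadata


def freeze_lines():
    """Return sorted 'name==version' strings for distributions this interpreter can see."""
    lines = set()
    for dist in metadata.distributions():
        try:
            meta = dist.metadata
        except Exception:
            # Unreadable or missing metadata: skip the entry rather than fail.
            continue
        if meta is None:
            continue
        name, version = meta.get("Name"), meta.get("Version")
        if name and version:
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


if __name__ == "__main__":
    # Redirect this output to requirements.txt to get a file similar to pip freeze's.
    print("\n".join(freeze_lines()))
```

Run inside the first environment, this lists only what that environment's interpreter can import, which is why the resulting file reproduces that environment and not your base one.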
I am trying to install psutil with the command pip install -U psutil and that gives me the error:
Cannot uninstall 'psutil'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
It seems like this is a known issue in pip with versions > 10, and I understand that part (I currently have pip 18). But I just found that I can solve it by directly doing a pip install psutil without using the Upgrade flag. I was wondering if there is a reasoning behind that. My initial sense is that in the first case, where pip tries to upgrade, it first tries to remove the package, which it cannot, but in the latter case it tries to install directly, and hence does not get the error. My question is does it still not have to remove the package first and install (when not using the Upgrade flag), or why specifically is it that pip gives an error with an Upgrade flag but no error without it.
EDIT: So, I tried to run pip install -v psutil as hoefling suggested, and I got a whole bunch of text, as opposed to a message saying the requirement was already met, which means that psutil didn't get installed in the first place. I tried to figure this out a bit, and this is what I understand so far: I was running inside a Python virtualenv and installing by means of pip install -U -r requirements.txt, where requirements.txt contains a bunch of packages including psutil. When I remove the -U flag, it skips installing psutil and jumps over to the other packages. This raises another question: is this how pip is supposed to behave when there is no -U flag? It's interesting that the first time, when it's installing the packages with the -U flag, it looks inside the main Python installation instead of the virtual environment one, and when the -U flag is removed it doesn't do that and skips psutil entirely.
There are some setups where you have a bunch of packages installed somewhere that isn't the normal install location for setuptools, and comes after the normal install location on sys.path.
Probably the most common of these setups is Apple's pre-installed Python 2.7, so I'll use it as an example. Even if that isn't your setup, it will hopefully still be instructive.
Apple includes an Extras directory (at /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python), with a bunch of third-party packages that either Apple's own tools need, or that Apple thought you might want (back when Apple cared about providing the best Python experience of any platform).
For example, on macOS 10.13, that directory will include NumPy 1.8.0.
These packages are all installed as distribute-style eggs.
(Some Linux distros do, or at least used to do, similar things, with Python packages built as RPM/DEB/etc. packages, which go into a distutils directory, unlike things you install via pip or manually, which go into a setuptools directory. The details are a bit different, but the effects, and the workaround, end up being the same.)
If you install pip, and then try to pip install -U numpy or pip uninstall numpy, pip will see the distribute-style numpy-1.8.0rc1-py2.7.egg-info file and refuse to touch it for fear of breaking everything.
If you just pip install numpy, however, it will look only in the standard site-packages installation location used by setuptools, /Library/Python/2.7/site-packages, see nothing there, and happily install a modern version of NumPy for you.
And, because /Library/Python/2.7/site-packages comes before /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python on your sys.path, the new NumPy will hide the ancient NumPy, and everything will just work as intended.
There can be a few problems with this. Most notably, if you try to install something which isn't included in Extras itself, but which has a dependency that is included in Extras, it may fail with mysterious and hard-to-debug errors. For example, on macOS 10.12, pip install pandas will throw a bunch of errors at you about not being able to upgrade dateutil, which you didn't even know you were trying to do. The only thing you can do is look at the dependencies for pandas, see which ones are pre-installed in Extras, and manually pip install shadowing versions of all of them.
But, for the most part, it works.
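The shadowing described above can be reproduced in miniature. This is only a sketch: the two temporary directories stand in for the Extras directory and site-packages, and the module name shadowdemo and both version strings are hypothetical, not real packages.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_shadowing():
    """Put two copies of one module on sys.path and report which copy gets imported."""
    with tempfile.TemporaryDirectory() as tmp:
        extras_dir = Path(tmp) / "extras"        # stand-in for the pre-installed location
        site_dir = Path(tmp) / "site-packages"   # stand-in for the normal install location
        for directory, version in ((extras_dir, "1.8.0"), (site_dir, "2.0.0")):
            directory.mkdir()
            (directory / "shadowdemo.py").write_text(f"__version__ = {version!r}\n")

        saved_path = list(sys.path)
        try:
            # site-packages is listed first, so its copy hides the one in extras.
            sys.path[:0] = [str(site_dir), str(extras_dir)]
            importlib.invalidate_caches()
            module = importlib.import_module("shadowdemo")
            return module.__version__, Path(module.__file__).parent.name
        finally:
            # Restore the interpreter state so the demo leaves nothing behind.
            sys.path[:] = saved_path
            sys.modules.pop("shadowdemo", None)


if __name__ == "__main__":
    print(demo_shadowing())
```

The older copy is still on disk and still on sys.path; it is simply never reached, which is the same reason the pre-installed package stays untouched after a plain pip install.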
A tweet reads:
Don't use easy_install, unless you
like stabbing yourself in the face.
Use pip.
Why use pip over easy_install? Doesn't the fault lie mostly with PyPI and package authors? If an author uploads a crap source tarball (e.g. missing files, no setup.py) to PyPI, then both pip and easy_install will fail. Other than cosmetic differences, why do Python people (like in the above tweet) seem to strongly favor pip over easy_install?
(Let's assume that we're talking about easy_install from the Distribute package, that is maintained by the community)
From Ian Bicking's own introduction to pip:
pip was originally written to improve on easy_install in the following ways
All packages are downloaded before installation. Partially-completed installation doesn’t occur as a result.
Care is taken to present useful output on the console.
The reasons for actions are kept track of. For instance, if a package is being installed, pip keeps track of why that package was required.
Error messages should be useful.
The code is relatively concise and cohesive, making it easier to use programmatically.
Packages don’t have to be installed as egg archives, they can be installed flat (while keeping the egg metadata).
Native support for other version control systems (Git, Mercurial and Bazaar)
Uninstallation of packages.
Simple to define fixed sets of requirements and reliably reproduce a set of packages.
Many of the answers here are out of date for 2015 (although the initially accepted one from Daniel Roseman is not). Here's the current state of things:
Binary packages are now distributed as wheels (.whl files)—not just on PyPI, but in third-party repositories like Christoph Gohlke's Extension Packages for Windows. pip can handle wheels; easy_install cannot.
Virtual environments (which come built-in with 3.4, or can be added to 2.6+/3.1+ with virtualenv) have become a very important and prominent tool (and recommended in the official docs); they include pip out of the box, but don't even work properly with easy_install.
The distribute package that included easy_install is no longer maintained. Its improvements over setuptools got merged back into setuptools. Trying to install distribute will just install setuptools instead.
easy_install itself is only quasi-maintained.
All of the cases where pip used to be inferior to easy_install—installing from an unpacked source tree, from a DVCS repo, etc.—are long-gone; you can pip install ., pip install git+https://.
pip comes with the official Python 2.7 and 3.4+ packages from python.org, and a pip bootstrap is included by default if you build from source.
The various incomplete bits of documentation on installing, using, and building packages have been replaced by the Python Packaging User Guide. Python's own documentation on Installing Python Modules now defers to this user guide, and explicitly calls out pip as "the preferred installer program".
Other new features have been added to pip over the years that will never be in easy_install. For example, pip makes it easy to clone your site-packages by building a requirements file and then installing it with a single command on each side. Or to convert your requirements file to a local repo to use for in-house development. And so on.
The only good reason that I know of to use easy_install in 2015 is the special case of using Apple's pre-installed Python versions with OS X 10.5-10.8. Since 10.5, Apple has included easy_install, but as of 10.10 they still don't include pip. With 10.9+, you should still just use get-pip.py, but for 10.5-10.8, this has some problems, so it's easier to sudo easy_install pip. (In general, easy_install pip is a bad idea; it's only for OS X 10.5-10.8 that you want to do this.) Also, 10.5-10.8 include readline in a way that easy_install knows how to kludge around but pip doesn't, so you also want to sudo easy_install readline if you want to upgrade that.
Another—as of yet unmentioned—reason for favoring pip is because it is the new hotness and will continue to be used in the future.
The infographic below, from the Current State of Packaging section in The Hitchhiker's Guide to Packaging v1.0, shows that setuptools/easy_install will go away in the future.
Here's another infographic from distribute's documentation showing that Setuptools and easy_install will be replaced by the new hotness—distribute and pip. While pip is still the new hotness, Distribute merged with Setuptools in 2013 with the release of Setuptools v0.7.
Two reasons, there may be more:
pip provides an uninstall command
if an installation fails in the middle, pip will leave you in a clean state.
REQUIREMENTS files.
Seriously, I use this in conjunction with virtualenv every day.
QUICK DEPENDENCY MANAGEMENT TUTORIAL, FOLKS
Requirements files allow you to create a snapshot of all packages that have been installed through pip. By encapsulating those packages in a virtual environment, you can have your codebase work off a very specific set of packages and share that codebase with others.
From Heroku's documentation https://devcenter.heroku.com/articles/python
You create a virtual environment, and set your shell to use it. (bash/*nix instructions)
virtualenv env
source env/bin/activate
Now all python scripts run with this shell will use this environment's packages and configuration. Now you can install a package locally to this environment without needing to install it globally on your machine.
pip install flask
Now you can dump the info about which packages are installed with
pip freeze > requirements.txt
If you check that file into version control, then when someone else gets your code, they can set up their own virtual environment and install all the dependencies with:
pip install -r requirements.txt
Any time you can automate tedium like this is awesome.
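The first step of the tutorial above can also be scripted. The sketch below swaps in the standard library's venv module (Python 3.3+) for the virtualenv command the tutorial uses, so it illustrates the same idea rather than the same tool; the helper name create_env is made up for illustration.

```python
import tempfile
import venv
from pathlib import Path


def create_env(target):
    """Create a virtual environment at target and return the path to its config file."""
    # with_pip=False keeps this sketch offline-friendly; with_pip=True would
    # also bootstrap pip into the new environment.
    venv.create(target, with_pip=False)
    return Path(target) / "pyvenv.cfg"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_env(Path(tmp) / "env")
        # pyvenv.cfg records which base interpreter the environment points back to.
        print(cfg.read_text())
```

Activating the environment and running pip install -r requirements.txt would still happen in the shell, exactly as in the steps above.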
pip won't install binary packages and isn't well tested on Windows.
As Windows doesn't come with a compiler by default pip often can't be used there. easy_install can install binary packages for Windows.
UPDATE: setuptools has absorbed distribute as opposed to the other way around, as some thought. setuptools is up-to-date with the latest distutils changes and the wheel format. Hence, easy_install and pip are more or less on equal footing now.
Source: http://pythonhosted.org/setuptools/merge-faq.html#why-setuptools-and-not-distribute-or-another-name
As an addition to fuzzyman's reply:
pip won't install binary packages and isn't well tested on Windows.
As Windows doesn't come with a compiler by default pip often can't be
used there. easy_install can install binary packages for Windows.
Here is a trick on Windows:
you can use easy_install <package> to install binary packages to avoid building a binary
you can use pip uninstall <package> even if you used easy_install.
This is just a work-around that works for me on windows.
Actually I always use pip if no binaries are involved.
See the current pip documentation: http://www.pip-installer.org/en/latest/other-tools.html#pip-compared-to-easy-install
I will ask on the mailing list what is planned for that.
Here is the latest update:
The new supported way to install binaries is going to be wheel!
It is not yet in the standard, but almost. Current version is still an alpha: 1.0.0a1
https://pypi.python.org/pypi/wheel
http://wheel.readthedocs.org/en/latest/
I will test wheel by creating an OS X installer for PySide using wheel instead of eggs. Will get back and report about this.
cheers - Chris
A quick update:
The transition to wheel is almost over. Most packages are supporting wheel.
I promised to build wheels for PySide, and I did that last summer. Works great!
HINT:
A few developers failed so far to support the wheel format, simply because they forget to
replace distutils by setuptools.
Often, it is easy to convert such packages by replacing this single word in setup.py.
I just met one special case where I had to use easy_install instead of pip, or else pull the source code directly.
For the package GitPython, the version available through pip is too old (0.1.7), while the one from easy_install is the latest (0.3.2.rc1).
I'm using Python 2.7.8. I'm not sure about the underlying mechanisms of easy_install and pip, but at least the versions of some packages may differ between them, and sometimes easy_install is the one with the newer version.
easy_install GitPython
I recently began learning Python, and I am a bit confused about how packages are distributed and installed.
I understand that the official way of installing packages is distutils: you download the source tarball, unpack it, and run: python setup.py install, then the module will automagically install itself
I also know about setuptools, which comes with the easy_install helper script. It uses eggs for distribution and, from what I understand, is built on top of distutils and does the same thing as above, plus it takes care of any required dependencies, all fetched from PyPI.
Then there is also pip, and I'm still not sure how it differs from the others.
Finally, as I am on a Windows machine, a lot of packages also offer binary builds through a Windows installer, especially the ones that require compiling C/Fortran code, which would otherwise be a nightmare to compile manually on Windows (this assumes you have an MSVC or MinGW/Cygwin dev environment with all the necessary libraries set up; nonetheless, try to build numpy or scipy yourself and you will understand!)
So can someone help me make sense of all this, and explain the differences, pros/cons of each method. I'd like to know how each keeps track of packages (Windows Registry, config files, ..). In particular, how would you manage all your third-party libraries (be able to list installed packages, disable/uninstall, etc..)
I use pip, and not on Windows, so I can't provide comparison with the Windows-installer option, just some information about pip:
Pip is built on top of setuptools, and requires it to be installed.
Pip is a replacement (improvement) for setuptools' easy_install. It does everything easy_install does, plus a lot more (make sure all desired distributions can be downloaded before actually installing any of them to avoid broken installs, list installed distributions and versions, uninstall, search PyPI, install from a requirements file listing multiple distributions and versions...).
Pip currently does not support installing any form of precompiled or binary distributions, so any distributions with extensions requiring compilation can only be installed if you have the appropriate compiler available. Supporting installation from Windows binary installers is on the roadmap, but it's not clear when it will happen.
Until recently, pip's Windows support was flaky and untested. Thanks to a lot of work from Dave Abrahams, pip trunk now passes all its tests on Windows (and there's a continuous integration server helping us ensure it stays that way), but a release has not yet been made including that work. So more reliable Windows support should be coming with the next release.
All the standard Python package installation mechanisms store all metadata about installed distributions in a file or files next to the actual installed package(s). Distutils uses a distribution_name-X.X-pyX.X.egg-info file, pip uses a similarly-named directory with multiple metadata files in it. Easy_install puts all the installed Python code for a distribution inside its own zipfile or directory, and places an EGG-INFO directory inside that directory with metadata in it. If you import a Python package from the interactive prompt, check the value of package.__file__; you should find the metadata for that package's distribution nearby.
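As a rough illustration of "look next to package.__file__", the sketch below lists metadata entries that sit beside a module. The helper names and the fake fakepkg layout are invented for the demo; the .dist-info suffix is included as well because newer pip versions write that directory name instead of .egg-info.

```python
import tempfile
import types
from pathlib import Path


def metadata_entries_near(module):
    """List *.egg-info / *.dist-info entries in the directory that holds a module or package."""
    location = Path(module.__file__).resolve()
    # For a package, __file__ is pkg/__init__.py, so step up past the package directory.
    container = location.parent.parent if location.name == "__init__.py" else location.parent
    return sorted(
        entry.name
        for entry in container.iterdir()
        if entry.name.endswith((".egg-info", ".dist-info"))
    )


def demo():
    """Build a throwaway install layout and show what the helper finds in it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "fakepkg.py").write_text("")
        (root / "fakepkg-1.0-py2.7.egg-info").write_text("Name: fakepkg\n")
        (root / "unrelated.txt").write_text("")
        # A stand-in object with the one attribute the helper reads.
        fake_module = types.SimpleNamespace(__file__=str(root / "fakepkg.py"))
        return metadata_entries_near(fake_module)


if __name__ == "__main__":
    print(demo())
```

Passing a real imported module (for example one installed with pip) instead of the stand-in object lists every metadata entry in that site-packages directory, including the one for the module itself.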
Info about installed distributions is only stored in any kind of global registry by OS-specific packaging tools such as Windows installers, Apt, or RPM. The standard Python packaging tools don't modify or pay attention to these listings.
Pip (or, in my opinion, any Python packaging tool) is best used with virtualenv, which allows you to create isolated per-project Python mini-environments into which you can install packages without affecting your overall system. Every new virtualenv automatically comes with pip installed in it.
A couple other projects you may want to be aware of as well (yes, there's more!):
distribute is a fork of setuptools which has some additional bugfixes and features.
distutils2 is intended to be the "next generation" of Python packaging. It is (hopefully) adopting the best features of distutils/setuptools/distribute/pip. It is being developed independently and is not ready for use yet, but eventually should replace distutils in the Python standard library and become the de facto Python packaging solution.
Hope all that helped clarify something! Good luck.
I use windows and python. It is somewhat frustrating, because pip doesn't always work to install things. Python is moving to pip, so I still use it. Pip is nice, because you can uninstall items and use
pip freeze > requirements.txt
pip install -r requirements.txt
Another reason I like pip is for virtual environments like venv with python 3.4. I have found venv a lot easier to use on windows than virtualenv.
If you cannot install a package you have to find the binary for it. http://www.lfd.uci.edu/~gohlke/pythonlibs/
I have found these binaries to be very useful.
Pip is trying to make something called a wheel for binary installations.
pip install wheel
wheel convert path\to\binary.exe
pip install converted_wheel.whl
You will also have to do this for any required libraries that do not install and are required for that package.
The simplest way to deal with python package installations, so far, to me, has been to check out the source from the source control system and then add a symbolic link in the python dist-packages folder.
Clearly since source control provides the complete control to downgrade, upgrade to any branch, tag, it works very well.
Is there a way using one of the Package installers (easy_install or pip or other), one can achieve the same.
easy_install obtains the tar.gz and install them using the setup.py install which installs in the dist-packages folder in python2.6. Is there a way to configure it, or pip to use the source version control system (SVN/GIT/Hg/Bzr) instead.
Using pip this is quite easy. For instance:
pip install -e hg+http://bitbucket.org/andrewgodwin/south/#egg=South
Pip will automatically clone the source repo and run "setup.py develop" for you to install it into your environment (which hopefully is a virtualenv). Git, Subversion, Bazaar and Mercurial are all supported.
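The URL in that command packs three pieces of information into one string: the version control system, the repository location, and the project name after #egg=. The sketch below pulls them apart with urllib.parse purely to show the format; it is not pip's own parser, and the function name split_vcs_requirement is made up for illustration.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def split_vcs_requirement(requirement):
    """Split 'vcs+scheme://host/path#egg=Name' into (vcs, repository URL, egg name)."""
    vcs, _, url = requirement.partition("+")
    if not url:
        raise ValueError(f"not a VCS requirement: {requirement!r}")
    parts = urlsplit(url)
    # The fragment looks like a query string, e.g. 'egg=South'.
    egg = parse_qs(parts.fragment).get("egg", [None])[0]
    repository = urlunsplit(parts._replace(fragment=""))
    return vcs, repository, egg


if __name__ == "__main__":
    print(split_vcs_requirement("hg+http://bitbucket.org/andrewgodwin/south/#egg=South"))
```

The prefix before the plus sign is what tells pip which tool to clone with, so a Git repository would use git+ in the same position.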
You can also then run "pip freeze" and it will output a list of your currently-installed packages with their exact versions (including, for develop-installs, the exact revision from the VCS). You can put this straight into a requirements file and later run
pip install -r requirements.txt
to install that same set of packages at the exact same versions.
If you download or check out the source distribution of a package (the one that has its "setup.py" inside of it), then if the package is based on "setuptools" (which also powers easy_install), you can move into that directory and say:
$ python setup.py develop
and it will create the right symlinks in dist-packages so that the .py files in the source distribution are the ones that get imported, rather than copies installed separately (which is what "setup.py install" would do — create separate copies that don't change immediately when you edit the source code to try a change).
As the other response indicates, you should try reading the "setuptools" documentation to learn more. "setup.py develop" is a really useful feature! Try using it in combination with a virtualenv, and you can "setup.py develop" painlessly and without messing up your system-wide Python with packages you are only developing on temporarily:
http://pypi.python.org/pypi/virtualenv
easy_install has support for downloading specific versions. For example:
easy_install python-dateutil==1.4.0
Will install v1.4, while the latest version 1.4.1 would be picked if no version was specified.
There is also support for svn checkouts, but using that doesn't give you much benefit over your manual approach. See the manual for more information.
Being able to switch to specific branches is rarely useful unless you are developing the packages in question, and then it's typically not a good idea to install them in site-packages anyway.
easy_install accepts a URL for the source tree too. Works at least when the sources are in Subversion.