Why use pip over easy_install? - python

A tweet reads:
Don't use easy_install, unless you
like stabbing yourself in the face.
Use pip.
Why use pip over easy_install? Doesn't the fault lie with PyPI and package authors mostly? If an author uploads crap source tarball (eg: missing files, no setup.py) to PyPI, then both pip and easy_install will fail. Other than cosmetic differences, why do Python people (like in the above tweet) seem to strongly favor pip over easy_install?
(Let's assume that we're talking about easy_install from the Distribute package, that is maintained by the community)

From Ian Bicking's own introduction to pip:
pip was originally written to improve on easy_install in the following ways
All packages are downloaded before installation. Partially-completed installation doesn’t occur as a result.
Care is taken to present useful output on the console.
The reasons for actions are kept track of. For instance, if a package is being installed, pip keeps track of why that package was required.
Error messages should be useful.
The code is relatively concise and cohesive, making it easier to use programmatically.
Packages don’t have to be installed as egg archives, they can be installed flat (while keeping the egg metadata).
Native support for other version control systems (Git, Mercurial and Bazaar)
Uninstallation of packages.
Simple to define fixed sets of requirements and reliably reproduce a set of packages.

Many of the answers here are out of date for 2015 (although the initially accepted one from Daniel Roseman is not). Here's the current state of things:
Binary packages are now distributed as wheels (.whl files)—not just on PyPI, but in third-party repositories like Christoph Gohlke's Extension Packages for Windows. pip can handle wheels; easy_install cannot.
Virtual environments (which come built-in with 3.4, or can be added to 2.6+/3.1+ with virtualenv) have become a very important and prominent tool (and recommended in the official docs); they include pip out of the box, but don't even work properly with easy_install.
The distribute package that included easy_install is no longer maintained. Its improvements over setuptools got merged back into setuptools. Trying to install distribute will just install setuptools instead.
easy_install itself is only quasi-maintained.
All of the cases where pip used to be inferior to easy_install—installing from an unpacked source tree, from a DVCS repo, etc.—are long-gone; you can pip install ., pip install git+https://.
pip comes with the official Python 2.7 and 3.4+ packages from python.org, and a pip bootstrap is included by default if you build from source.
The various incomplete bits of documentation on installing, using, and building packages have been replaced by the Python Packaging User Guide. Python's own documentation on Installing Python Modules now defers to this user guide, and explicitly calls out pip as "the preferred installer program".
Other new features have been added to pip over the years that will never be in easy_install. For example, pip makes it easy to clone your site-packages by building a requirements file and then installing it with a single command on each side. Or to convert your requirements file to a local repo to use for in-house development. And so on.
The only good reason that I know of to use easy_install in 2015 is the special case of using Apple's pre-installed Python versions with OS X 10.5-10.8. Since 10.5, Apple has included easy_install, but as of 10.10 they still don't include pip. With 10.9+, you should still just use get-pip.py, but for 10.5-10.8, this has some problems, so it's easier to sudo easy_install pip. (In general, easy_install pip is a bad idea; it's only for OS X 10.5-10.8 that you want to do this.) Also, 10.5-10.8 include readline in a way that easy_install knows how to kludge around but pip doesn't, so you also want to sudo easy_install readline if you want to upgrade that.

Another—as of yet unmentioned—reason for favoring pip is because it is the new hotness and will continue to be used in the future.
The infographic below—from the Current State of Packaging section in the The Hitchhiker's Guide to Packaging v1.0—shows that setuptools/easy_install will go away in the future.
Here's another infographic from distribute's documentation showing that Setuptools and easy_install will be replaced by the new hotness—distribute and pip. While pip is still the new hotness, Distribute merged with Setuptools in 2013 with the release of Setuptools v0.7.

Two reasons, there may be more:
pip provides an uninstall command
if an installation fails in the middle, pip will leave you in a clean state.

REQUIREMENTS files.
Seriously, I use this in conjunction with virtualenv every day.
QUICK DEPENDENCY MANAGEMENT TUTORIAL, FOLKS
Requirements files allow you to create a snapshot of all packages that have been installed through pip. By encapsulating those packages in a virtualenvironment, you can have your codebase work off a very specific set of packages and share that codebase with others.
From Heroku's documentation https://devcenter.heroku.com/articles/python
You create a virtual environment, and set your shell to use it. (bash/*nix instructions)
virtualenv env
source env/bin/activate
Now all python scripts run with this shell will use this environment's packages and configuration. Now you can install a package locally to this environment without needing to install it globally on your machine.
pip install flask
Now you can dump the info about which packages are installed with
pip freeze > requirements.txt
If you checked that file into version control, when someone else gets your code, they can setup their own virtual environment and install all the dependencies with:
pip install -r requirements.txt
Any time you can automate tedium like this is awesome.

pip won't install binary packages and isn't well tested on Windows.
As Windows doesn't come with a compiler by default pip often can't be used there. easy_install can install binary packages for Windows.

UPDATE: setuptools has absorbed distribute as opposed to the other way around, as some thought. setuptools is up-to-date with the latest distutils changes and the wheel format. Hence, easy_install and pip are more or less on equal footing now.
Source: http://pythonhosted.org/setuptools/merge-faq.html#why-setuptools-and-not-distribute-or-another-name

As an addition to fuzzyman's reply:
pip won't install binary packages and isn't well tested on Windows.
As Windows doesn't come with a compiler by default pip often can't be
used there. easy_install can install binary packages for Windows.
Here is a trick on Windows:
you can use easy_install <package> to install binary packages to avoid building a binary
you can use pip uninstall <package> even if you used easy_install.
This is just a work-around that works for me on windows.
Actually I always use pip if no binaries are involved.
See the current pip doku: http://www.pip-installer.org/en/latest/other-tools.html#pip-compared-to-easy-install
I will ask on the mailing list what is planned for that.
Here is the latest update:
The new supported way to install binaries is going to be wheel!
It is not yet in the standard, but almost. Current version is still an alpha: 1.0.0a1
https://pypi.python.org/pypi/wheel
http://wheel.readthedocs.org/en/latest/
I will test wheel by creating an OS X installer for PySide using wheel instead of eggs. Will get back and report about this.
cheers - Chris
A quick update:
The transition to wheel is almost over. Most packages are supporting wheel.
I promised to build wheels for PySide, and I did that last summer. Works great!
HINT:
A few developers failed so far to support the wheel format, simply because they forget to
replace distutils by setuptools.
Often, it is easy to convert such packages by replacing this single word in setup.py.

Just met one special case that I had to use easy_install instead of pip, or I have to pull the source codes directly.
For the package GitPython, the version in pip is too old, which is 0.1.7, while the one from easy_install is the latest which is 0.3.2.rc1.
I'm using Python 2.7.8. I'm not sure about the underlay mechanism of easy_install and pip, but at least the versions of some packages may be different from each other, and sometimes easy_install is the one with newer version.
easy_install GitPython

Related

pip uninstall fails with "owned by OS" - even under sudo

I'm working on a DevOps project for a client who's using Python. Though I never used it professionally, I know a few things, such as using virtualenv and pip - though not in great detail.
When I looked at the staging box, which I am trying to prepare for running a functional test suite, I saw chaos. Tons of packages installed globally, and those installed inside a virtualenv not matching the requirements.txt of the project. OK, thought I, there's a lot of cleaning up. Starting with global packages.
However, I ran into a problem at once:
➜ ~ pip uninstall PyYAML
Not uninstalling PyYAML at /usr/lib/python2.7/dist-packages, owned by OS
OK, someone must've done a 'sudo pip install PyYAML'. I think I know how to fix it:
➜ ~ sudo pip uninstall PyYAML
Not uninstalling PyYAML at /usr/lib/python2.7/dist-packages, owned by OS
Uh, apparently I don't.
A search revealed some similar conflicts caused by users installing packages bypassing pip, but I'm not convinced - why would pip even know about them, if that was the case? Unless the "other" way is placing them in the same location pip would use - but if that's the case, why would it fail to uninstall under sudo?
Pip denies to uninstall these packages because Debian developers patched it to behave so. This allows you to use both pip and apt simultaneously. The "original" pip program doesn't have such functionality
Update: my answer is relevant only to old versions of Pip. For the latest versions, Pip is configured to modify only the files which reside only in its "home directory" - that is /usr/local/lib/python3.* for Debian. For the latest tools, you will get these errors when you try to delete the package, installed by apt:
For pip 9.0.1-2.3~ubuntu1 (installed from Ubuntu repository):
Not uninstalling pyyaml at /usr/lib/python3/dist-packages, outside environment /usr
For pip 10.0.1 (original, installed from pypi.org):
Cannot uninstall 'PyYAML'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
The point is not that pip cannot install the package because you don't have enough permissions, but because it is not a package installed through pip, so it doesn't want to uninstall it.
dist-packages is where packages installed by the OS package manager reside; as they are handled by another package manager (e.g. apt on Ubuntu/Debian, pacman on Arch, rpm/yum on CentOS, ... ) pip won't touch them (but still has to know about them as they are installed packages, so they can be used to satisfy dependencies of pip-installed packages).
You should also probably avoid to touch them unless you use the correct package manager, and even so, they may have been installed automatically to satisfy the dependencies of some program, so you may not remove them without breaking it. This can usually be checked quite easily, although the exact way depends from the precise Linux distribution you are using.

applying debian packages instead of pip packages

I've a strange problem with co-existence of debian package and pip package. For example, I've python-requests (deb version 0.8.2) installed. Then when i install the requests (pip version 2.2.1), the system only apply the deb version instead of pip new version. Does anyone can resolve this problem? Thank you in advance.
In regard to installing python packages by system packages and pip, you have to define clear plan.
Personally, I follow these rules:
Install only minimal set of python packages by system installation packages
There I include supervisord in case, I am not on too old system.
Do not install pip or virtualenv by system package.
Especially with pip in last year there were many situations, when system packages were far back behind what was really needed.
Use Virtualenv and prefer to install packages (by pip) in here
This will keep your system wide Python rather clean. It takes a moment to get used, but it is rather easy to follow, especially, if you use virtualenvwrapper which helps a lot during development.
Prepare conditions for quick installation of compiled packages
Some packages require compilation and this often fails on missing dependencies.
Such packages include e.g. lxml, pyzmq, pyyaml.
Make sure, which ones you are going to use, prepare packages in the system and you are able to install them into virtualenv.
Fine-tuning speed of installation of compiled packages
There is great package format (usable by pip) called wheel. This allows to install a package (like lxml) to install on the same platform within fraction of a second (compared to minutes of compilation). See my answer at SO on this topic

What is the difference between pip and conda?

I know pip is a package manager for python packages. However, I saw the installation on IPython's website use conda to install IPython.
Can I use pip to install IPython? Why should I use conda as another python package manager when I already have pip?
What is the difference between pip and conda?
Quoting from the Conda blog:
Having been involved in the python world for so long, we are all aware of pip, easy_install, and virtualenv, but these tools did not meet all of our specific requirements. The main problem is that they are focused around Python, neglecting non-Python library dependencies, such as HDF5, MKL, LLVM, etc., which do not have a setup.py in their source code and also do not install files into Python’s site-packages directory.
So Conda is a packaging tool and installer that aims to do more than what pip does; handle library dependencies outside of the Python packages as well as the Python packages themselves. Conda also creates a virtual environment, like virtualenv does.
As such, Conda should be compared to Buildout perhaps, another tool that lets you handle both Python and non-Python installation tasks.
Because Conda introduces a new packaging format, you cannot use pip and Conda interchangeably; pip cannot install the Conda package format. You can use the two tools side by side (by installing pip with conda install pip) but they do not interoperate either.
Since writing this answer, Anaconda has published a new page on Understanding Conda and Pip, which echoes this as well:
This highlights a key difference between conda and pip. Pip installs Python packages whereas conda installs packages which may contain software written in any language. For example, before using pip, a Python interpreter must be installed via a system package manager or by downloading and running an installer. Conda on the other hand can install Python packages as well as the Python interpreter directly.
and further on
Occasionally a package is needed which is not available as a conda package but is available on PyPI and can be installed with pip. In these cases, it makes sense to try to use both conda and pip.
Disclaimer: This answer describes the state of things as it was a decade ago, at that time pip did not support binary packages. Conda was specifically created to better support building and distributing binary packages, in particular data science libraries with C extensions. For reference, pip only gained widespread support for portable binary packages with wheels (pip 1.4 in 2013) and the manylinux1 specification (pip 8.1 in March 2016). See the more recent answer for more history.
Here is a short rundown:
pip
Python packages only.
Compiles everything from source. EDIT: pip now installs binary wheels, if they are available.
Blessed by the core Python community (i.e., Python 3.4+ includes code that automatically bootstraps pip).
conda
Python agnostic. The main focus of existing packages are for Python, and indeed Conda itself is written in Python, but you can also have Conda packages for C libraries, or R packages, or really anything.
Installs binaries. There is a tool called conda build that builds packages from source, but conda install itself installs things from already built Conda packages.
External. conda is an environment and package manager. It is included in the Anaconda Python distribution provided by Continuum Analytics (now called Anaconda, Inc.).
conda is an environment manager written in Python and is language-agnostic. conda environment management functions cover the functionality provided by venv, virtualenv, pipenv, pyenv, and other Python-specific package managers. You could use conda within an existing Python installation by pip installing it (though this is not recommended unless you have a good reason to use an existing installation). As of 2022, conda and pip are not fully aware of one another package management activities within a virtual environment, not are they interoperable for Python package management.
In both cases:
Written in Python
Open source (conda is BSD and pip is MIT)
Warning: While conda itself is open-source, the package repositories are hosted by Anaconda Inc and have restrictions around commercial usage.
The first two bullet points of conda are really what make it advantageous over pip for many packages. Since pip installs from source, it can be painful to install things with it if you are unable to compile the source code (this is especially true on Windows, but it can even be true on Linux if the packages have some difficult C or FORTRAN library dependencies). conda installs from binary, meaning that someone (e.g., Continuum) has already done the hard work of compiling the package, and so the installation is easy.
There are also some differences if you are interested in building your own packages. For instance, pip is built on top of setuptools, whereas conda uses its own format, which has some advantages (like being static, and again, Python agnostic).
The other answers give a fair description of the details, but I want to highlight some high-level points.
pip is a package manager that facilitates installation, upgrade, and uninstallation of python packages. It also works with virtual python environments.
conda is a package manager for any software (installation, upgrade and uninstallation). It also works with virtual system environments.
One of the goals with the design of conda is to facilitate package management for the entire software stack required by users, of which one or more python versions may only be a small part. This includes low-level libraries, such as linear algebra, compilers, such as mingw on Windows, editors, version control tools like Hg and Git, or whatever else requires distribution and management.
For version management, pip allows you to switch between and manage multiple python environments.
Conda allows you to switch between and manage multiple general purpose environments across which multiple other things can vary in version number, like C-libraries, or compilers, or test-suites, or database engines and so on.
Conda is not Windows-centric, but on Windows it is by far the superior solution currently available when complex scientific packages requiring compilation are required to be installed and managed.
I want to weep when I think of how much time I have lost trying to compile many of these packages via pip on Windows, or debug failed pip install sessions when compilation was required.
As a final point, Continuum Analytics also hosts (free) binstar.org (now called anaconda.org) to allow regular package developers to create their own custom (built!) software stacks that their package-users will be able to conda install from.
(2021 UPDATE)
TL;DR Use pip, it's the official package manager since Python 3.
pip
basics
pip is the default package manager for python
pip is built-in as of Python 3.0
Usage: python3 -m venv myenv; source myenv/bin/activate; python3 -m pip install requests
Packages are downloaded from pypi.org, the official public python repository
It can install precompiled binaries (wheels) when available, or source (tar/zip archive).
Compiled binaries are important because many packages are mixed Python/C/other with third-party dependencies and complex build chains. They MUST be distributed as binaries to be ready-to-use.
advanced
pip can actually install from any archive, wheel, or git/svn repo...
...that can be located on disk, or on a HTTP URL, or a personal pypi server.
pip install git+https://github.com/psf/requests.git#v2.25.0 for example (it can be useful for testing patches on a branch).
pip install https://download.pytorch.org/whl/cpu/torch-1.9.0%2Bcpu-cp39-cp39-linux_x86_64.whl (that wheel is Python 3.9 on Linux).
when installing from source, pip will automatically build the package. (it's not always possible, try building TensorFlow without the google build system :D)
binary wheels can be python-version specific and OS specific, see manylinux specification to maximize portability.
conda
You are NOT permitted to use Anaconda or packages from Anaconda repositories for commercial use, unless you acquire a license.
Conda is a third party package manager from conda.
It's popularized by anaconda, a Python distribution including most common data science libraries ready-to-use.
You will use conda when you use anaconda.
Packages are downloaded from the anaconda repo.
It only installs precompiled packages.
Conda has its own format of packages. It doesn't use wheels.
conda install to install a package.
conda build to build a package.
conda can build the python interpreter (and other C packages it depends on). That's how an interpreter is built and bundled for anaconda.
conda allows to install and upgrade the Python interpreter (pip does not).
advanced
Historically, the selling point of conda was to support building and installing binary packages, because pip did not support binary packages very well (until wheels and manylinux2010 spec).
Emphasis on building packages. Conda has extensive build settings and it stores extensive metadata, to work with dependencies and build chains.
Some projects use conda to initiate complex build systems and generate a wheel, that is published to pypi.org for pip.
easy_install/egg
For historical reference only. DO NOT USE
egg is an abandoned format of package, it was used up to mid 2010s and completely replaced by wheels.
an egg is a zip archive, it contains python source files and/or compiled libraries.
eggs are used with easy_install and the first releases of pip.
easy_install was yet another package manager, that preceded pip and conda. It was removed in setuptools v58.3 (year 2021).
it too caused a lot of confusion, just like pip vs conda :D
egg files are slow to load, poorly specified, and OS specific.
Each egg was setup in a separate directory, an import mypackage would have to look for mypackage.py in potentially hundreds of directories (how many libraries were installed?). That was slow and not friendly to the filesystem cache.
Historically, the above three tools were open-source and written in Python.
However the company behind conda updated their Terms of Service in 2020 to prohibit commercial usage, watch out!
Funfact: The only strictly-required dependency to build the Python interpreter is zlib (a zip library), because compression is necessary to load more packages. Eggs and wheels packages are zip files.
Why so many options?
A good question.
Let's delve into the history of Python and computers. =D
Pure python packages have always worked fine with any of these packagers. The troubles were with not-only-Python packages.
Most of the code in the world depends on C. That is true for the Python interpreter, that is written in C. That is true for numerous Python packages, that are python wrappers around C libraries or projects mixing python/C/C++ code.
Anything that involves SSL, compression, GUI (X11 and Windows subsystems), math libraries, GPU, CUDA, etc... is typically coupled with some C code.
This creates troubles to package and distribute Python libraries because it's not just Python code that can run anywhere. The library must be compiled, compilation requires compilers and system libraries and third party libraries, then once compiled, the generated binary code only works for the specific system and python version it was compiled on.
Originally, python could distribute pure-python libraries just fine, but there was little support for distributing binary libraries. In and around 2010 you'd get a lot of errors trying to use numpy or cassandra. It downloaded the source and failed to compile, because of missing dependencies. Or it downloaded a prebuilt package (maybe an egg at the time) and it crashed with a SEGFAULT when used, because it was built for another system. It was a nightmare.
This was resolved by pip and wheels from 2012 onward. Then wait many years for people to adopt the tools and for the tools to propagate to stable Linux distributions (many developers rely on /usr/bin/python). The issues with binary packages extended to the late 2010s.
For reference, that's why the first command to run is python3 -m venv myvenv && source myvenv/bin/activate && pip install --upgrade pip setuptools on antiquated systems, because the OS comes with an old python+pip from 5 years ago that's buggy and can't recognize the current package format.
Conda worked on their own solution in parallel. Anaconda was specifically meant to make data science libraries easy to use out-of-the-box (data science = C and C++ everywhere), hence they had to come up with a package manager specifically meant to address building and distributing binary packages, conda.
If you install any package with pip install xxx nowadays, it just works. That's the recommended way to install packages and it's built-in in current versions of Python.
Not to confuse you further,
but you can also use pip within your conda environment, which validates the general vs. python specific managers comments above.
conda install -n testenv pip
source activate testenv
pip <pip command>
you can also add pip to default packages of any environment so it is present each time so you don't have to follow the above snippet.
Quote from Conda for Data Science article onto Continuum's website:
Conda vs pip
Python programmers are probably familiar with pip to download packages from PyPI and manage their requirements. Although, both conda and pip are package managers, they are very different:
Pip is specific for Python packages and conda is language-agnostic, which means we can use conda to manage packages from any language
Pip compiles from source and conda installs binaries, removing the burden of compilation
Conda creates language-agnostic environments natively whereas pip relies on virtualenv to manage only Python environments
Though it is recommended to always use conda packages, conda also includes pip, so you don’t have to choose between the two. For example, to install a python package that does not have a conda package, but is available through pip, just run, for example:
conda install pip
pip install gensim
pip is a package manager.
conda is both a package manager and an environment manager.
Detail:
Dependency check
Pip and conda also differ in how dependency relationships within an environment are fulfilled. When installing packages, pip installs dependencies in a recursive, serial loop. No effort is made to ensure that the dependencies of all packages are fulfilled simultaneously. This can lead to environments that are broken in subtle ways, if packages installed earlier in the order have incompatible dependency versions relative to packages installed later in the order. In contrast, conda uses a satisfiability (SAT) solver to verify that all requirements of all packages installed in an environment are met. This check can take extra time but helps prevent the creation of broken environments. As long as package metadata about dependencies is correct, conda will predictably produce working environments.
References
Understanding Conda and Pip
Quoting from Conda: Myths and Misconceptions (a comprehensive description):
...
Myth #3: Conda and pip are direct competitors
Reality: Conda and pip serve different purposes, and only directly compete in a small subset of tasks: namely installing Python packages in isolated environments.
Pip, which stands for Pip Installs Packages, is Python's officially-sanctioned package manager, and is most commonly used to install packages published on the Python Package Index (PyPI). Both pip and PyPI are governed and supported by the Python Packaging Authority (PyPA).
In short, pip is a general-purpose manager for Python packages; conda is a language-agnostic cross-platform environment manager. For the user, the most salient distinction is probably this: pip installs python packages within any environment; conda installs any package within conda environments. If all you are doing is installing Python packages within an isolated environment, conda and pip+virtualenv are mostly interchangeable, modulo some difference in dependency handling and package availability. By isolated environment I mean a conda-env or virtualenv, in which you can install packages without modifying your system Python installation.
Even setting aside Myth #2, if we focus on just installation of Python packages, conda and pip serve different audiences and different purposes. If you want to, say, manage Python packages within an existing system Python installation, conda can't help you: by design, it can only install packages within conda environments. If you want to, say, work with the many Python packages which rely on external dependencies (NumPy, SciPy, and Matplotlib are common examples), while tracking those dependencies in a meaningful way, pip can't help you: by design, it manages Python packages and only Python packages.
Conda and pip are not competitors, but rather tools focused on different groups of users and patterns of use.
For WINDOWS users
"standard" packaging tools situation is improving recently:
on pypi itself, there are now 48% of wheel packages as of sept. 11th 2015 (up from 38% in may 2015 , 24% in sept. 2014),
the wheel format is now supported out-of-the-box per latest python 2.7.9,
"standard"+"tweaks" packaging tools situation is improving also:
you can find nearly all scientific packages on wheel format at http://www.lfd.uci.edu/~gohlke/pythonlibs,
the mingwpy project may bring one day a 'compilation' package to windows users, allowing to install everything from source when needed.
"Conda" packaging remains better for the market it serves, and highlights areas where the "standard" should improve.
(also, the dependency specification multiple-effort, in standard wheel system and in conda system, or buildout, is not very pythonic, it would be nice if all these packaging 'core' techniques could converge, via a sort of PEP)
(2022 UPDATE) This answer was derived from the one above by #user5994461
You can use pip for package management. Pip is the official built-in package manager for Python.org since Python 3.
pip is not a virtual environment manager.
pip
basics
pip is the default package manager for python
pip is built-in as of Python 3.0
Usage: python3 -m venv myenv; source myenv/bin/activate; python3 -m pip install requests
Packages are downloaded from pypi.org, the official public python repository
It can install precompiled binaries (wheels) when available, or source (tar/zip archive).
Compiled binaries are important because many packages are mixed Python/C/other with third-party dependencies and complex build chains. They MUST be distributed as binaries to be ready-to-use.
advanced
pip can actually install from any archive, wheel, or git/svn repo...
...that can be located on disk, or on a HTTP URL, or a personal pypi server.
pip install git+https://github.com/psf/requests.git#v2.25.0 for example (it can be useful for testing patches on a branch).
pip install https://download.pytorch.org/whl/cpu/torch-1.9.0%2Bcpu-cp39-cp39-linux_x86_64.whl (that wheel is Python 3.9 on Linux).
when installing from source, pip will automatically build the package. (it's not always possible, try building TensorFlow without the google build system :D)
binary wheels can be python-version specific and OS specific, see manylinux specification to maximize portability.
conda
conda is an open source environment manager AND package manager maintained by the open source community. It is separate from Anaconda, Inc. and does not require a commercial license to use.
conda is also bundled into Anaconda Navigator, a popular commercial Python distribution from Anaconda, Inc. Anaconda) that includes most common data science and Python developer libraries ready-to-use.
You will use conda when you use Anaconda Navigator GUI.
Packages may be downloaded from conda-forge, anaconda repo4, and other public and private conda package "channels" (aka repos).
It only installs precompiled packages.
conda has its own package format. It doesn't use wheels.
conda install to install a package.
conda build to build a package.
conda can build the python interpreter (and other C packages it depends on). That's how an interpreter is built and bundled for Anaconda Navigator.
conda allows to install and upgrade the Python interpreter (pip does not).
advanced
Historically, one selling point of conda was to support building and installing binary packages, because pip did not support binary packages very well (until wheels and manylinux2010 spec).
Emphasis on building packages. conda has extensive build settings and it stores extensive metadata, to work with dependencies and build chains.
Some projects use conda to initiate complex build systems and generate a wheel, that is published to pypi.org for pip.
conda emphasizes building and managing virtual environments. conda is by design a programming language-agnostic virtual environment manager. conda can install and manage other package managers such as npm, pip, and other language package managers.
Can I use Anaconda Navigator packages for commercial use?
The new language states that use by individual hobbyists, students, universities, non-profit organizations, or businesses with less than 200 employees is allowed, and all other usage is considered commercial and thus requires a business relationship with Anaconda. (as of Oct 28, 2020)
IF you are a large developer organization, i.e., greater than 200 employees, you are NOT permitted to use Anaconda or packages from Anaconda repository for commercial use, unless you acquire a license.
Pulling and using (properly open-sourced) packages from conda-forge repository do not require commercial licenses from Anaconda, Inc. Developers are free to build their own conda packages using the packaging tools provided in the conda-forge infrastructure.
easy_install/egg
For historical reference only. DO NOT USE
egg is an abandoned format of package, it was used up to mid 2010s and completely replaced by wheels.
an egg is a zip archive, it contains python source files and/or compiled libraries.
eggs are used with easy_install and the first releases of pip.
easy_install was yet another package manager, that preceded pip and conda. It was removed in setuptools v58.3 (year 2021).
it too caused a lot of confusion, just like pip vs conda :D
egg files are slow to load, poorly specified, and OS specific.
Each egg was setup in a separate directory, an import mypackage would have to look for mypackage.py in potentially hundreds of directories (how many libraries were installed?). That was slow and not friendly to the filesystem cache.
Funfact: The only strictly-required dependency to build the Python interpreter is zlib (a zip library), because compression is necessary to load more packages. Eggs and wheels packages are zip files.
Why so many options?
A good question.
Let's delve into the history of Python and computers. =D
Pure python packages have always worked fine with any of these packagers. The troubles were with not-only-Python packages.
Most of the code in the world depends on C. That is true for the Python interpreter, that is written in C. That is true for numerous Python packages, that are python wrappers around C libraries or projects mixing python/C/C++ code.
Anything that involves SSL, compression, GUI (X11 and Windows subsystems), math libraries, GPU, CUDA, etc... is typically coupled with some C code.
This creates troubles to package and distribute Python libraries because it's not just Python code that can run anywhere. The library must be compiled, compilation requires compilers and system libraries and third party libraries, then once compiled, the generated binary code only works for the specific system and python version it was compiled on.
Originally, python could distribute pure-python libraries just fine, but there was little support for distributing binary libraries. In and around 2010 you'd get a lot of errors trying to use numpy or cassandra. It downloaded the source and failed to compile, because of missing dependencies. Or it downloaded a prebuilt package (maybe an egg at the time) and it crashed with a SEGFAULT when used, because it was built for another system. It was a nightmare.
This was resolved by pip and wheels from 2012 onward. Then wait many years for people to adopt the tools and for the tools to propagate to stable Linux distributions (many developers rely on /usr/bin/python). The issues with binary packages extended to the late 2010s.
For reference, that's why the first command to run is python3 -m venv myvenv && source myvenv/bin/activate && pip install --upgrade pip setuptools on antiquated systems, because the OS comes with an old python+pip from 5 years ago that's buggy and can't recognize the current package format.
Continuum Analytics (later renamed Anaconda, Inc.) worked on their own solution (released as Anaconda Navigator) in parallel. Anaconda Navigator was specifically meant to make data science libraries easy to use out-of-the-box (data science = C and C++ everywhere), hence they came up with a package manager specifically meant to address building and distributing binary packages, and built it into the environment manager, conda.
If you install any package with pip install xxx nowadays, it usually just works. pip is a recommended way to install packages that is built into current versions of Python.
To answer the original question,
For installing packages, PIP and Conda are different ways to accomplish the same thing. Both are standard applications to install packages. The main difference is the source of the package files.
PIP/PyPI will have more "experimental" packages, or newer, less common, versions of packages
Conda will typically have more well established packages or versions
An important cautionary side note: If you use both sources (pip and conda) to install packages in the same environment, this may cause issues later.
Recreate the environment will be more difficult
Fix package incompatibilities becomes more complicated
Best practice is to select one application, PIP or Conda, to install packages, and use that application to install any packages you need.
However, there are many exceptions or reasons to still use pip from within a conda environment, and vice versa.
For example:
When there are packages you need that only exist on one, and the
other doesn't have them.
You need a certain version that is only available in one environment
Can I use pip to install iPython?
Sure, both (first approach on page)
pip install ipython
and (third approach, second is conda)
You can manually download IPython from GitHub or PyPI. To install one
of these versions, unpack it and run the following from the top-level
source directory using the Terminal:
pip install .
are officially recommended ways to install.
Why should I use conda as another python package manager when I already have pip?
As said here:
If you need a specific package, maybe only for one project, or if you need to share the project with someone else, conda seems more appropriate.
Conda surpasses pip in (YMMV)
projects that use non-python tools
sharing with colleagues
switching between versions
switching between projects with different library versions
What is the difference between pip and conda?
That is extensively answered by everyone else.
pip is for Python only
conda is only for Anaconda + other scientific packages like R dependencies etc. NOT everyone needs Anaconda that already comes with Python. Anaconda is mostly for those who do Machine learning/deep learning etc. Casual Python dev won't run Anaconda on his laptop.
I may have found one further difference of a minor nature. I have my python environments under /usr rather than /home or whatever. In order to install to it, I would have to use sudo install pip. For me, the undesired side effect of sudo install pip was slightly different than what are widely reported elsewhere: after doing so, I had to run python with sudo in order to import any of the sudo-installed packages. I gave up on that and eventually found I could use sudo conda to install packages to an environment under /usr which then imported normally without needing sudo permission for python. I even used sudo conda to fix a broken pip rather than using sudo pip uninstall pip or sudo pip --upgrade install pip.

Python packages installation in Windows

I recently began learning Python, and I am a bit confused about how packages are distributed and installed.
I understand that the official way of installing packages is distutils: you download the source tarball, unpack it, and run: python setup.py install, then the module will automagically install itself
I also know about setuptools which comes with easy_install helper script. It uses eggs for distribution, and from what I understand, is built on top of distutils and does the same thing as above, plus it takes care of any dependencies required, all fetched from PyPi
Then there is also pip, which I'm still not sure how it differ from the others.
Finally, as I am on a windows machine, a lot of packages also offers binary builds through a windows installer, especially the ones that requires compiling C/Fortran code, which otherwise would be a nightmare to manually compile on windows (assumes you have MSVC or MinGW/Cygwin dev environment with all necessary libraries setup.. nonetheless try to build numpy or scipy yourself and you will understand!)
So can someone help me make sense of all this, and explain the differences, pros/cons of each method. I'd like to know how each keeps track of packages (Windows Registry, config files, ..). In particular, how would you manage all your third-party libraries (be able to list installed packages, disable/uninstall, etc..)
I use pip, and not on Windows, so I can't provide comparison with the Windows-installer option, just some information about pip:
Pip is built on top of setuptools, and requires it to be installed.
Pip is a replacement (improvement) for setuptools' easy_install. It does everything easy_install does, plus a lot more (make sure all desired distributions can be downloaded before actually installing any of them to avoid broken installs, list installed distributions and versions, uninstall, search PyPI, install from a requirements file listing multiple distributions and versions...).
Pip currently does not support installing any form of precompiled or binary distributions, so any distributions with extensions requiring compilation can only be installed if you have the appropriate compiler available. Supporting installation from Windows binary installers is on the roadmap, but it's not clear when it will happen.
Until recently, pip's Windows support was flaky and untested. Thanks to a lot of work from Dave Abrahams, pip trunk now passes all its tests on Windows (and there's a continuous integration server helping us ensure it stays that way), but a release has not yet been made including that work. So more reliable Windows support should be coming with the next release.
All the standard Python package installation mechanisms store all metadata about installed distributions in a file or files next to the actual installed package(s). Distutils uses a distribution_name-X.X-pyX.X.egg-info file, pip uses a similarly-named directory with multiple metadata files in it. Easy_install puts all the installed Python code for a distribution inside its own zipfile or directory, and places an EGG-INFO directory inside that directory with metadata in it. If you import a Python package from the interactive prompt, check the value of package.__file__; you should find the metadata for that package's distribution nearby.
Info about installed distributions is only stored in any kind of global registry by OS-specific packaging tools such as Windows installers, Apt, or RPM. The standard Python packaging tools don't modify or pay attention to these listings.
Pip (or, in my opinion, any Python packaging tool) is best used with virtualenv, which allows you to create isolated per-project Python mini-environments into which you can install packages without affecting your overall system. Every new virtualenv automatically comes with pip installed in it.
A couple other projects you may want to be aware of as well (yes, there's more!):
distribute is a fork of setuptools which has some additional bugfixes and features.
distutils2 is intended to be the "next generation" of Python packaging. It is (hopefully) adopting the best features of distutils/setuptools/distribute/pip. It is being developed independently and is not ready for use yet, but eventually should replace distutils in the Python standard library and become the de facto Python packaging solution.
Hope all that helped clarify something! Good luck.
I use windows and python. It is somewhat frustrating, because pip doesn't always work to install things. Python is moving to pip, so I still use it. Pip is nice, because you can uninstall items and use
pip freeze > requirements.txt
pip install -r requirements.txt
Another reason I like pip is for virtual environments like venv with python 3.4. I have found venv a lot easier to use on windows than virtualenv.
If you cannot install a package you have to find the binary for it. http://www.lfd.uci.edu/~gohlke/pythonlibs/
I have found these binaries to be very useful.
Pip is trying to make something called a wheel for binary installations.
pip install wheel
wheel convert path\to\binary.exe
pip install converted_wheel.whl
You will also have to do this for any required libraries that do not install and are required for that package.

Python package install using pip or easy_install from repos

The simplest way to deal with python package installations, so far, to me, has been to check out the source from the source control system and then add a symbolic link in the python dist-packages folder.
Clearly since source control provides the complete control to downgrade, upgrade to any branch, tag, it works very well.
Is there a way using one of the Package installers (easy_install or pip or other), one can achieve the same.
easy_install obtains the tar.gz and install them using the setup.py install which installs in the dist-packages folder in python2.6. Is there a way to configure it, or pip to use the source version control system (SVN/GIT/Hg/Bzr) instead.
Using pip this is quite easy. For instance:
pip install -e hg+http://bitbucket.org/andrewgodwin/south/#egg=South
Pip will automatically clone the source repo and run "setup.py develop" for you to install it into your environment (which hopefully is a virtualenv). Git, Subversion, Bazaar and Mercurial are all supported.
You can also then run "pip freeze" and it will output a list of your currently-installed packages with their exact versions (including, for develop-installs, the exact revision from the VCS). You can put this straight into a requirements file and later run
pip install -r requirements.txt
to install that same set of packages at the exact same versions.
If you download or check out the source distribution of a package — the one that has its "setup.py" inside of it — then if the package is based on the "setuptools" (which also power easy_install), you can move into that directory and say:
$ python setup.py develop
and it will create the right symlinks in dist-packages so that the .py files in the source distribution are the ones that get imported, rather than copies installed separately (which is what "setup.py install" would do — create separate copies that don't change immediately when you edit the source code to try a change).
As the other response indicates, you should try reading the "setuptools" documentation to learn more. "setup.py develop" is a really useful feature! Try using it in combination with a virtualenv, and you can "setup.py develop" painlessly and without messing up your system-wide Python with packages you are only developing on temporarily:
http://pypi.python.org/pypi/virtualenv
easy_install has support for downloading specific versions. For example:
easy_install python-dateutil==1.4.0
Will install v1.4, while the latest version 1.4.1 would be picked if no version was specified.
There is also support for svn checkouts, but using that doesn't give you much benefits from your manual version. See the manual for more information above.
Being able to switch to specific branches is rarely useful unless you are developing the packages in question, and then it's typically not a good idea to install them in site-packages anyway.
easy_install accepts a URL for the source tree too. Works at least when the sources are in Subversion.

Categories

Resources