Pip chosing requirements version - python

I am asking myself, which version of the library pip will install in this scenario:
requirements.txt contains:
numpy<=1.14
scikit-learn
Now imagine, that scikit-learn depends on numpy>=1.10.
If I start pip install -r requirements.txt now, how will pip install the dependencies?
Does it parse the whole dependency structure before installing and finds a valid version of numpy?
Does it just parse the file and dependencies sequentially (package by package) and tries to go for the best "last" dependency?
In my example this would be:
numpy==1.14
numpy==latest
The essential question is: In which order will pip install its dependencies? How does it determine the proper version, respecting all cross dependencies?
EDIT: My initial guess would be, that it has an internal list with valid version and cancels out invalid versions by parsing all dependencies before installing. Then it takes the highest valid remaining version of each package.

First thing to know: Most likely the order will change soon. pip is currently (today: September 2020) slowly rolling out a new dependency resolver. It can be used today already but is not the default. So depending which dependency resolver you are using, results might differ.
A couple of pointers:
pip's documentation chapter "Installation Order"
pip's documentation chapter "Changes to the pip dependency resolver in 20.2 (2020)"

Related

pip's dependency resolver takes way too long to solve the conflict

I've been trying to install a package through pip on my rpi 3 model B
my operating system is raspbian. Debian based pip version is 21.0.1 and python version is 3.7.4
the command I'm using is:
python3 -m pip install librosa
the problem is that the dependency resolver takes way too long to resolve the conflicts.
and after a few hours, it keeps repeating this line over and over again for hours ( I even left the installation running for 2 days overnights )
INFO: pip is looking at multiple versions of <Python from requires-Python> to determine which version is compatible with other requirements. this could take a while.
INFO: This is taking longer than usual. You might need to provide the dependency resolver with stricter constraints to reduce runtime. If you want to abort this run you can press ctrl + c to do so.
I've tried using a stricter constraint such as adding "numpy > 1.20.0" and other stuff but now the popped up and I have no clue what I can do now.
So as of pip 20.3, a new (not always working) resolver has been introduced. As of pip 21.0 the old (working) resolver is unsupported and slated for removal dependent on pip team resources.
Changes to the pip dependency resolver in 20.3
I have hit the same issue trying to build jupyter, my solution was to pin pip back to the 20.2 release which is the last release with the old resolver. This got past the point my builds were choking at using the new resolver under pip 21.1.1.
A second approach that might work (untested) is to use the flag:
--use-deprecated=legacy-resolver
which appears to have been added when 20.3 switched over to the new resolver. This would allow the benefits of newer pip releases, until the backtracking issue is resolved, assuming it works.
What is happening, according to the devs on this Github issue, is "pip downloads multiple versions [of each package] because the versions it downloaded conflict with other packages you specified, which is called backtracking and is a feature. The versions need to be downloaded to detect the conflicts." But it takes a very long time to download all of these versions. Pip explains this in detail, along with ways to resolve it or speed it up, at https://pip.pypa.io/en/latest/topics/dependency-resolution/.
If you run
pip install -r requirements.txt --use-deprecated=legacy-resolver
you will not get this backtracking behavior, but your install will complete, and you will see an error at the end that is useful for troubleshooting:
ERROR: pip's legacy dependency resolver does not consider dependency conflicts when selecting packages. This behaviour is the source of the following dependency conflicts.
apache-airflow-providers-amazon 2.6.0 requires boto3<1.19.0,>=1.15.0, but you'll have boto3 1.9.253 which is incompatible.
package_xyz 0.0.1 requires PyJWT==2.1.0, but you'll have pyjwt 1.7.1 which is incompatible.
Upgrading my pip to 21.3.1 worked
python.exe -m pip install --upgrade pip

pip not properly resolving child/grandchild dependencies

I have a dependency tree of modules that works like this (→ indicating a dependency):
a → b, c
b → ruamel.yaml >= 0.16.5
c → ruamel.yaml < 0.16.6, >=0.12.4
It's very clear to me that ruamel.yaml 0.16.5 will resolve both of these dependencies correctly. However, when I pip install a, I get the following logs:
Collecting ruamel.yaml>=0.16.5
Downloading ruamel.yaml-0.16.10-py2.py3-none-any.whl (111 kB)
And then later:
ERROR: <package c> 0.4.0 has requirement ruamel.yaml<0.16.6,>=0.12.4, but you'll have ruamel-yaml 0.16.10 which is incompatible.
So pip has completely ignored the grandchild dependencies when choosing which packages to install. But it realises that it has messed up at the end. Why is pip not choosing the correct package here. Is there a way to help it work better?
I believe this is a well known problem that is currently being worked on. Message from one week ago: http://pyfound.blogspot.com/2020/03/new-pip-resolver-to-roll-out-this-year.html
In the meantime there are some measures that can be taken to try and mitigate this kind of issues:
Revert the order of the dependencies (in your example a could list c before b)
Use an additional requirements.txt or constraints.txt file
Depending on the actual needs, an alternative tool could help (I believe poetry, pipenv, and most likely others as well might have better dependency resolvers, but they are not a one-to-one replacement for pip)
It appears to be already possible to test pip's future dependency resolver today:
Install pip from source
Run path/to/python -m pip install --unstable-feature=resolver ...
In a way it also seems to be possible to somewhat test this dependency resolver in current releases or pip via the pip check command.
Some more references on the topic:
https://pradyunsg.me/blog/2020/03/27/pip-resolver-testing/
https://discuss.python.org/t/an-update-on-pip-and-dependency-resolution/1898/2

Install a new package from requirement.txt without upgrading the dependencies which are already satisfied

I am using requirement.txt to specify the package dependencies that are used in my python application. And everything seems to work fine for packages of which either there are no internal dependencies or for the one using the package dependencies which are not already installed.
The issue occurs when i try to install a package which has a nested dependency on some other package and an older version of this package is already installed.
I know i can avoid this while installing a package manually bu using pip install -U --no-deps <package_name>. I want to understand how to do this using the requirement.txt as the deployment and requirement installation is an automated process.
Note:
The already installed package is not something i am directly using in my project but is part of a different project on the same server.
Thanks in advance.
Dependency resolution is a fairly complicated problem. A requirements.txt just specifies your dependencies with optional version ranges. If you want to "lock" your transitive dependencies (dependencies of dependencies) in place you would have to produce a requirements.txt that contains exact versions of every package you install with something like pip freeze. This doesn't solve the problem but it would at least point out to you on an install which dependencies conflict so that you can manually pick the right versions.
That being said the new (as of writing) officially supported tool for managing application dependencies is Pipenv. This tool will both manage the exact versions of transitive dependencies for you (so you won't have to maintain a "requirements.txt" manually) and it will isolate the packages that your code requires from the rest of the system. (It does this using the virtualenv tool under the hood). This isolation should fix your problems with breaking a colocated project since your project can have different versions of libraries than the rest of the system.
(TL;DR Try using Pipenv and see if your problem just disappears)

How do I install Python Babel library in it's trunk version?

After hours of finding out why there are missing some documented functions in my Babel installation, I learned there are two branches of Babel development:
Babel has two separate development paths (0.9.x branch and trunk) in
parallel for about 4 years now despite very few developers working on
the project. We try to resolve that situation by releasing a stable
1.0 version but well, real live is not always friendly to open source contribution.
Babel's FAQ confirms that. I want to use Flask-Babel in my project. It's dependency in setup.py says I need just Babel. It means my pip takes any version installed in my environment or searches PyPI for the newest version, where is version 0.9.6. Unlogically, Flask-Babel uses functions which are not present in 0.9.x branch. Maybe I am missing something, maybe I am just confused, but how can I easily install the trunk version, where is the most of new features? And how can I enforce using such a version in my setup.py? How it all works for the people who use Flask-Babel? (I know, the last question is rather Flask-specific and should go here, but all the other questions can answer anyone else.)
Thank you for any suggestions. The bold questions are the most important, the rest is rather Flask-Babel-specific "nice to have".
Have you tried using pip with the url to the branch that you need?
$ sudo pip install http://svn.edgewall.org/repos/babel/trunk
After that, pip should be happy with the dependency:
$ sudo pip install Flask-Babel
...
Requirement already satisfied (use --upgrade to upgrade): Babel in /usr/local/lib/python2.7/dist-packages (from Flask-Babel)
...
Regarding how to force a dependency in you setup.py. Since you're already using pip, you can give a try to a requirements file.

Why use pip over easy_install?

A tweet reads:
Don't use easy_install, unless you
like stabbing yourself in the face.
Use pip.
Why use pip over easy_install? Doesn't the fault lie with PyPI and package authors mostly? If an author uploads crap source tarball (eg: missing files, no setup.py) to PyPI, then both pip and easy_install will fail. Other than cosmetic differences, why do Python people (like in the above tweet) seem to strongly favor pip over easy_install?
(Let's assume that we're talking about easy_install from the Distribute package, that is maintained by the community)
From Ian Bicking's own introduction to pip:
pip was originally written to improve on easy_install in the following ways
All packages are downloaded before installation. Partially-completed installation doesn’t occur as a result.
Care is taken to present useful output on the console.
The reasons for actions are kept track of. For instance, if a package is being installed, pip keeps track of why that package was required.
Error messages should be useful.
The code is relatively concise and cohesive, making it easier to use programmatically.
Packages don’t have to be installed as egg archives, they can be installed flat (while keeping the egg metadata).
Native support for other version control systems (Git, Mercurial and Bazaar)
Uninstallation of packages.
Simple to define fixed sets of requirements and reliably reproduce a set of packages.
Many of the answers here are out of date for 2015 (although the initially accepted one from Daniel Roseman is not). Here's the current state of things:
Binary packages are now distributed as wheels (.whl files)—not just on PyPI, but in third-party repositories like Christoph Gohlke's Extension Packages for Windows. pip can handle wheels; easy_install cannot.
Virtual environments (which come built-in with 3.4, or can be added to 2.6+/3.1+ with virtualenv) have become a very important and prominent tool (and recommended in the official docs); they include pip out of the box, but don't even work properly with easy_install.
The distribute package that included easy_install is no longer maintained. Its improvements over setuptools got merged back into setuptools. Trying to install distribute will just install setuptools instead.
easy_install itself is only quasi-maintained.
All of the cases where pip used to be inferior to easy_install—installing from an unpacked source tree, from a DVCS repo, etc.—are long-gone; you can pip install ., pip install git+https://.
pip comes with the official Python 2.7 and 3.4+ packages from python.org, and a pip bootstrap is included by default if you build from source.
The various incomplete bits of documentation on installing, using, and building packages have been replaced by the Python Packaging User Guide. Python's own documentation on Installing Python Modules now defers to this user guide, and explicitly calls out pip as "the preferred installer program".
Other new features have been added to pip over the years that will never be in easy_install. For example, pip makes it easy to clone your site-packages by building a requirements file and then installing it with a single command on each side. Or to convert your requirements file to a local repo to use for in-house development. And so on.
The only good reason that I know of to use easy_install in 2015 is the special case of using Apple's pre-installed Python versions with OS X 10.5-10.8. Since 10.5, Apple has included easy_install, but as of 10.10 they still don't include pip. With 10.9+, you should still just use get-pip.py, but for 10.5-10.8, this has some problems, so it's easier to sudo easy_install pip. (In general, easy_install pip is a bad idea; it's only for OS X 10.5-10.8 that you want to do this.) Also, 10.5-10.8 include readline in a way that easy_install knows how to kludge around but pip doesn't, so you also want to sudo easy_install readline if you want to upgrade that.
Another—as of yet unmentioned—reason for favoring pip is because it is the new hotness and will continue to be used in the future.
The infographic below—from the Current State of Packaging section in the The Hitchhiker's Guide to Packaging v1.0—shows that setuptools/easy_install will go away in the future.
Here's another infographic from distribute's documentation showing that Setuptools and easy_install will be replaced by the new hotness—distribute and pip. While pip is still the new hotness, Distribute merged with Setuptools in 2013 with the release of Setuptools v0.7.
Two reasons, there may be more:
pip provides an uninstall command
if an installation fails in the middle, pip will leave you in a clean state.
REQUIREMENTS files.
Seriously, I use this in conjunction with virtualenv every day.
QUICK DEPENDENCY MANAGEMENT TUTORIAL, FOLKS
Requirements files allow you to create a snapshot of all packages that have been installed through pip. By encapsulating those packages in a virtualenvironment, you can have your codebase work off a very specific set of packages and share that codebase with others.
From Heroku's documentation https://devcenter.heroku.com/articles/python
You create a virtual environment, and set your shell to use it. (bash/*nix instructions)
virtualenv env
source env/bin/activate
Now all python scripts run with this shell will use this environment's packages and configuration. Now you can install a package locally to this environment without needing to install it globally on your machine.
pip install flask
Now you can dump the info about which packages are installed with
pip freeze > requirements.txt
If you checked that file into version control, when someone else gets your code, they can setup their own virtual environment and install all the dependencies with:
pip install -r requirements.txt
Any time you can automate tedium like this is awesome.
pip won't install binary packages and isn't well tested on Windows.
As Windows doesn't come with a compiler by default pip often can't be used there. easy_install can install binary packages for Windows.
UPDATE: setuptools has absorbed distribute as opposed to the other way around, as some thought. setuptools is up-to-date with the latest distutils changes and the wheel format. Hence, easy_install and pip are more or less on equal footing now.
Source: http://pythonhosted.org/setuptools/merge-faq.html#why-setuptools-and-not-distribute-or-another-name
As an addition to fuzzyman's reply:
pip won't install binary packages and isn't well tested on Windows.
As Windows doesn't come with a compiler by default pip often can't be
used there. easy_install can install binary packages for Windows.
Here is a trick on Windows:
you can use easy_install <package> to install binary packages to avoid building a binary
you can use pip uninstall <package> even if you used easy_install.
This is just a work-around that works for me on windows.
Actually I always use pip if no binaries are involved.
See the current pip doku: http://www.pip-installer.org/en/latest/other-tools.html#pip-compared-to-easy-install
I will ask on the mailing list what is planned for that.
Here is the latest update:
The new supported way to install binaries is going to be wheel!
It is not yet in the standard, but almost. Current version is still an alpha: 1.0.0a1
https://pypi.python.org/pypi/wheel
http://wheel.readthedocs.org/en/latest/
I will test wheel by creating an OS X installer for PySide using wheel instead of eggs. Will get back and report about this.
cheers - Chris
A quick update:
The transition to wheel is almost over. Most packages are supporting wheel.
I promised to build wheels for PySide, and I did that last summer. Works great!
HINT:
A few developers failed so far to support the wheel format, simply because they forget to
replace distutils by setuptools.
Often, it is easy to convert such packages by replacing this single word in setup.py.
Just met one special case that I had to use easy_install instead of pip, or I have to pull the source codes directly.
For the package GitPython, the version in pip is too old, which is 0.1.7, while the one from easy_install is the latest which is 0.3.2.rc1.
I'm using Python 2.7.8. I'm not sure about the underlay mechanism of easy_install and pip, but at least the versions of some packages may be different from each other, and sometimes easy_install is the one with newer version.
easy_install GitPython

Categories

Resources