pip hg+ and git+ always download packages instead of detecting satisfied requirement - python

My other question here just got answered about why pip svn+ was always re-downloading entire packages.
Now I have a handful more packages in my pip_requirements file that always get downloaded instead of detecting that the package requirements are satisfied.
They are the following types:
git+git://github.com/yuchant/django-jinja2.git
hg+https://bitbucket.org/yuchant/django-storages
With svn+ my packages are detected as satisfied regardless of whether I specify trunk or a specific revision. Is the pattern different for git and mercurial?

Short Answer
When using any VCS URL in a pip requirements file, you should always append #egg=[egg-name]
So your requirements file should contain:
git+git://github.com/yuchant/django-jinja2.git#egg=django-jinja2
hg+https://bitbucket.org/yuchant/django-storages#egg=django-storages
Long Answer
Suppose you specify the pip requirements exactly as in your question, without #egg=[egg-name] (I'm going to call that string the egg identifier). The problem is very similar to your last question: pip uses the egg identifier to search the currently installed Python modules.
This is what happens if an egg identifier isn't specified:
Pip searches for git+git://github.com/yuchant/django-jinja2.git in the installed modules
Pip doesn't find it so it attempts to install it again
If you use an egg identifier, you won't have this problem.
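The role of the fragment can be sketched with the standard library. This is only an illustration of how a tool could pull an egg name out of a VCS URL, not pip's actual implementation:

```python
from urllib.parse import urlparse, parse_qs

def egg_name(vcs_url):
    # The egg identifier lives in the URL fragment, e.g. "#egg=django-jinja2".
    fragment = urlparse(vcs_url).fragment
    return parse_qs(fragment).get("egg", [None])[0]

# With the fragment, there is a name to match against installed packages:
print(egg_name("git+git://github.com/yuchant/django-jinja2.git#egg=django-jinja2"))
# → django-jinja2
# Without it, there is nothing to match, so the package is fetched again:
print(egg_name("hg+https://bitbucket.org/yuchant/django-storages"))
# → None
```

Without a name to look up, the installer has no way to decide that the requirement is already satisfied.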

Related

How does pip resolve entries in install_requires especially in regards to already installed packages from conda?

In my project I have a dependency on a Python package that is only available via conda, not pip. I'm aware that pip cannot find it, and hence cannot install it if it is not already installed. But I still wanted to add it to install_requires so that the install at least fails and the user knows what is missing.
In fact I tried that, and for this specific dependency pip always says it's not installed, even though the correct version is in fact installed. I also depend on numpy, which happens to be installed by conda as well, and that one is found by pip (the same holds for my other dependencies).
So I'm a bit confused on how the resolution works, how pip determines a package is installed or not?
As far as I can tell, pip checks whether an egg-info or dist-info folder for the package exists and reads the version from the folder name. Example: numpy-1.19.2.dist-info
For dist-info, the mere presence of the folder is not enough: it must also contain a METADATA file, although that file may be empty.
The conclusion is that if a package does not create one of these folders upon installation, pip doesn't see that it's installed and will fail due to a missing requirement.
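That check can be reproduced from Python with the standard library's importlib.metadata (Python 3.8+), which reads exactly those dist-info/egg-info records. This is a sketch, not pip's internal code path:

```python
from importlib import metadata

def installed_version(name):
    # importlib.metadata resolves names by scanning *.dist-info / *.egg-info
    # directories -- essentially the same records pip relies on.
    try:
        return metadata.distribution(name).version
    except metadata.PackageNotFoundError:
        return None
```

A package installed by conda without such a metadata directory comes back as None here, matching the behavior described above.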

If I uninstall Python, will it also uninstall downloaded packages?

So, I recently used Sublime Text for Python code and I couldn't run it because I forgot to check the 'add to PATH' option, so I thought I'd just reinstall Python properly. But I have downloaded some libraries with pip; will they also be removed completely if I delete/uninstall Python?
Surprisingly, I couldn't find this question asked on Stack Overflow.
Also: I came across this link while searching for my question, which was about packages/modules. Is there a difference between libraries, packages, and modules? I thought the words were used interchangeably, but the code syntax suggests otherwise.
I don't know the answer, but here is a work-around:
Get all currently installed packages and store them in a file
pip freeze > current.txt
Then you can uninstall Python.
Next step would be to re-install Python, followed by re-installing your packages.
pip install -r current.txt
Each package will either be installed again, or pip will report that the requirement is already satisfied.
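What pip freeze records can be approximated from Python with the standard library, which makes it clear the workaround only depends on the installed metadata folders, not on the interpreter you are about to remove. A rough sketch, not pip's own code:

```python
from importlib import metadata

def freeze():
    # Enumerate every *.dist-info / *.egg-info record and format each entry
    # the way a requirements file expects: name==version.
    return sorted(
        f"{dist.metadata['Name']}=={dist.version}"
        for dist in metadata.distributions()
        if dist.metadata["Name"]
    )

print("\n".join(freeze()))
```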

Pip choosing requirements version

I am wondering which version of the library pip will install in this scenario:
requirements.txt contains:
numpy<=1.14
scikit-learn
Now imagine, that scikit-learn depends on numpy>=1.10.
If I start pip install -r requirements.txt now, how will pip install the dependencies?
Does it parse the whole dependency structure before installing and finds a valid version of numpy?
Does it just parse the file and dependencies sequentially (package by package) and tries to go for the best "last" dependency?
In my example this would be:
numpy==1.14
numpy==latest
The essential question is: In which order will pip install its dependencies? How does it determine the proper version, respecting all cross dependencies?
EDIT: My initial guess would be, that it has an internal list with valid version and cancels out invalid versions by parsing all dependencies before installing. Then it takes the highest valid remaining version of each package.
First thing to know: most likely the order will change soon. pip is currently (today: September 2020) slowly rolling out a new dependency resolver. It can already be used today, but it is not the default. So depending on which dependency resolver you are using, results might differ.
A couple of pointers:
pip's documentation chapter "Installation Order"
pip's documentation chapter "Changes to the pip dependency resolver in 20.2 (2020)"
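The guess in the question's EDIT — gather all constraints first, discard invalid versions, then take the highest survivor — can be sketched in plain Python. This is a toy model with versions as tuples, not real pip code:

```python
def pick_version(candidates, constraints):
    # Keep only versions that satisfy every constraint, then take the highest.
    valid = [v for v in candidates if all(ok(v) for ok in constraints)]
    return max(valid) if valid else None

# Toy model of the question's scenario:
available_numpy = [(1, 10), (1, 12), (1, 14), (1, 19)]
constraints = [
    lambda v: v <= (1, 14),  # from requirements.txt: numpy<=1.14
    lambda v: v >= (1, 10),  # from scikit-learn:     numpy>=1.10
]
print(pick_version(available_numpy, constraints))  # → (1, 14)
```

The real resolver additionally has to backtrack when a choice for one package invalidates an earlier choice for another, which is what makes full dependency resolution hard.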

Do `conda uninstall` and `pip uninstall` also remove dependencies, but only ones that are not used by other packages?

Do conda uninstall and pip uninstall also remove the dependencies of the package I specify in the uninstall command, but only those that are not dependencies of any other currently installed package?
The documentation for Conda remove isn't clear. It doesn't explicitly address the situation of whether shared dependencies are removed and the statement "--unless a replacement can be found without that dependency" only further confuses things.
This answer from Merv, if correct, indicates shared dependencies are not removed:
In the uninstallation, Conda...will attempt to remove the requested package, plus any of its dependencies that were not explicitly installed or required by other packages.
If you take a look at the docs, the answer is easy to come by.
Conda remove man page plainly says:
This command will also remove any package that depends on any of the
specified packages as well---unless a replacement can be found without
that dependency. If you wish to skip this dependency checking and
remove just the requested packages, add the '--force' option. Note
however that this may result in a broken environment, so use this with
caution.
pip uninstall, on the other hand, never removes dependencies; its -r option only lets you uninstall every package listed in a requirements file.
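For pip-managed environments you can at least check, before uninstalling, whether anything else still declares a dependency on a package. A sketch using the standard library (conda stores its metadata differently, so this only sees pip-visible packages):

```python
import re
from importlib import metadata

def dependents(target):
    # Scan every installed distribution's declared requirements and
    # collect the distributions that name `target`.
    target = target.lower().replace("_", "-")
    found = []
    for dist in metadata.distributions():
        for req in dist.requires or []:
            # Requirement strings start with the project name, e.g. "numpy (>=1.10)".
            m = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", req)
            if m and m.group(0).lower().replace("_", "-") == target:
                if dist.metadata["Name"]:
                    found.append(dist.metadata["Name"])
    return sorted(set(found))
```

If the list is empty, removing the package won't break the declared requirements of any other pip-installed distribution.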

Install a new package from requirement.txt without upgrading the dependencies which are already satisfied

I am using requirement.txt to specify the package dependencies used in my Python application. Everything seems to work fine for packages that either have no internal dependencies or whose dependencies are not already installed.
The issue occurs when I try to install a package which has a nested dependency on some other package, and an older version of that package is already installed.
I know I can avoid this when installing a package manually by using pip install -U --no-deps <package_name>. I want to understand how to do this using requirement.txt, as deployment and requirement installation are an automated process.
Note:
The already installed package is not something I am directly using in my project; it is part of a different project on the same server.
Thanks in advance.
Dependency resolution is a fairly complicated problem. A requirements.txt just specifies your dependencies with optional version ranges. If you want to "lock" your transitive dependencies (dependencies of dependencies) in place you would have to produce a requirements.txt that contains exact versions of every package you install with something like pip freeze. This doesn't solve the problem but it would at least point out to you on an install which dependencies conflict so that you can manually pick the right versions.
That being said the new (as of writing) officially supported tool for managing application dependencies is Pipenv. This tool will both manage the exact versions of transitive dependencies for you (so you won't have to maintain a "requirements.txt" manually) and it will isolate the packages that your code requires from the rest of the system. (It does this using the virtualenv tool under the hood). This isolation should fix your problems with breaking a colocated project since your project can have different versions of libraries than the rest of the system.
(TL;DR Try using Pipenv and see if your problem just disappears)
