I'm using pip to install modules from a requirements file produced with pip freeze. However, the problem is that sometimes pip is unable to download or install one module, and then the whole run fails and nothing gets installed. Is there a way to make it install the modules whose requirements it can satisfy?
With pip only, I would say no. pip and Python packages generally are designed so that you might need dependencies installed before the package itself can be installed. Thus, there is no option to keep trying despite failures.
However, pip install -r requirements.txt simply goes through the file line by line. You can iterate over every item yourself and call pip install for each one, without caring about the result (whether the installation succeeded or not). With shell scripting this could be done, e.g.:
cat requirements.txt | xargs -n 1 pip install
The example does not understand comments, spaces, etc., so you might need to have something more complex in place for a real-life scenario.
Alternatively, you can simply run pip in a loop until it returns a successful exit code.
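The line-by-line approach can be sketched a bit more robustly, assuming a POSIX shell: skip comments and blank lines, and keep going when one package fails. The echo makes this a dry run that only prints the commands; replace it with a real pip install "$pkg" to actually install, and point it at your own requirements file instead of the generated stand-in.

```shell
# Generate a stand-in requirements file for the demo.
printf 'requests\n# a comment\n\nnumpy\n' > requirements-demo.txt

# Skip comments and blank lines; don't stop when one package fails.
grep -vE '^[[:space:]]*(#|$)' requirements-demo.txt | while IFS= read -r pkg; do
    echo pip install "$pkg" || echo "failed: $pkg" >&2
done
```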
But as a real solution I would recommend setting up your own Python package mirror server, or a local cache - which would be another question.
I have come across some nasty dependencies.
Looking around I found solutions like upgrade this, downgrade that...
Some solutions work for some but not for others.
Is there a more 'rounded' way to tackle this issue?
To test this:
I wrote a script that creates a virtual environment, attempts to install a requirements file, and then deletes the environment. I run it to quickly test whether changes to the requirements result in a successful installation or not.
I wrote a "test app" which uses the basic functionalities of all the libraries I use in one place, to see if despite a successful installation, there are dependencies that pip is unaware of (due to problematic 3rd party library architecture) that break. I can edit the dependencies of this app, then commit to run build actions that run it and see if it completes or crashes.
This way I can upgrade and add libraries more rapidly.
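The throwaway-environment check described above can be sketched roughly like this, assuming python3 with the venv module is available. The empty requirements-demo.txt is a stand-in; point the script at your real requirements file.

```shell
#!/bin/sh
# Create a fresh virtual environment, try installing the requirements
# file into it, record the outcome, then delete the environment.
set -e
: > requirements-demo.txt            # stand-in; use your real file here
VENV_DIR=$(mktemp -d)
python3 -m venv "$VENV_DIR"
if "$VENV_DIR/bin/pip" install -r requirements-demo.txt > /dev/null 2>&1; then
    echo "install: OK" | tee install-result.txt
else
    echo "install: FAILED" | tee install-result.txt
fi
rm -rf "$VENV_DIR"
```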
If all libraries used semantic versioning, declared all their dependencies via requirement files and did not define upper versions in requirement files, this would be a lot simpler. Unfortunately, one of the DB-vendors I use (which I shall not name) is very problematic and has a problematic Python library (amongst many other problems).
First, you need to understand that pip resolves problems one at a time, and when you back it into a corner, it can't go any further.
But if you give pip the 'big problem' all at once, it has a nice way of trying to resolve it. It may not always work, but for most cases it will.
The solutions you normally find out there are in some cases a coincidence. Someone has an environment similar to another person and one solution works for both.
But if the solution takes into consideration 'your present environment' then the solution should work for more people than just 'the coincidences'.
Disclaimer: the commands below are for a Linux terminal.
Upgrade pip. We want the smartest pip we can get.
pip install --upgrade pip
Extract the list of packages you want to install.
In my case (these and many others, trimmed for brevity)
google-cloud-texttospeech attrdict google-cloud-language transformers
Give them all at once to pip.
pip install google-cloud-texttospeech attrdict google-cloud-language transformers
It will try all the combinations of versions and dependencies' versions until it finds something suitable. This can download a ton of packages just to inspect their dependencies, so you only want to do this once.
If happy with the result, extract the requirements file.
pip freeze > requirements.txt
This contains all the installed packages; we are not interested in all of them.
And from it, extract the specific versions of your desired packages.
cat requirements.txt | egrep -i "google-cloud-texttospeech|attrdict|google-cloud-language|transformers"
Note: The command above may not list all your required packages, just the ones that appear in the requirements.txt, so check that all show up.
attrdict==2.0.1
google-cloud-language==1.2.0
google-cloud-texttospeech==2.12.3
transformers==2.11.0
Now you can put that in a file like resolved-dependencies.txt
And next time, you can install the packages directly at the known-valid and compatible versions with:
pip install -r resolved-dependencies.txt
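The extract-and-pin steps above can also be combined into one line, filtering pip freeze straight into the resolved-dependencies file (the package names here are the ones from the example; substitute your own):

```shell
# Keep only the pinned lines for the packages of interest.
pip freeze | grep -iE '^(google-cloud-texttospeech|attrdict|google-cloud-language|transformers)==' > resolved-dependencies.txt || true
cat resolved-dependencies.txt
```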
pip install -e . is a great feature. It allows you to work on your package without having to uninstall-reinstall constantly. It seemingly doesn't, however, keep track of your build files (e.g. your setup.cfg or setup.py). Say, you change these (e.g. add, subtract, change version for dependencies, or change which modules you include in the package). What is then the best way to have pip recheck these requirements?
Or more generally, what is the way you're supposed to handle changes in your setup.cfg or setup.py when using pip install -e .?
What I usually end up doing is just running pip install -e . in the root directory again. This walks through your entire setup configuration, installs any new or changed dependencies, and then uninstalls your package before reinstalling it. That definitely isn't always necessary, and it slows things down a lot.
While this certainly works, it feels against the idea of the 'editable' package.
Is there a proper way of doing this?
FYI, I know you can just install dependencies that aren't listed in your setup.cfg yourself by just pip install ..., my question is aimed at learning a better way of doing things.
It is probably the best way of doing things. The reason pip install -e . reinstalls everything is because it is environment agnostic.
Let's say you have two dependencies numpy >= 1.7.2 and pandas==1.4.2.
Now, pandas 1.4.2 requires a minimum version of numpy==1.18.2, so when you do a pip install (editable or not), it would probably pick the greatest compatible version (1.22.3).
Let's say you now want to fix numpy at 1.20.2. The only way pip knows it is compatible with pandas is by walking through your whole list of requirements.
If you do end up wanting a "better" tool than pip, check out pipenv or poetry.
I am trying to install psutil with the command pip install -U psutil and that gives me the error:
Cannot uninstall 'psutil'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
It seems like this is a known issue in pip versions > 10, and I understand that part (I currently have pip 18). But I just found that I can get around it by doing a plain pip install psutil, without the upgrade flag, and I was wondering what the reasoning behind that is. My initial sense is that in the first case, where pip tries to upgrade, it first tries to remove the package, which it cannot do; in the latter case it installs directly and hence does not hit the error. My question is: doesn't pip still have to remove the package first before installing (even without the upgrade flag)? Why exactly does pip give an error with the upgrade flag but not without it?
EDIT: So, I tried running pip install -v psutil as hoefling suggested, and I got a whole bunch of text, as opposed to a message that the requirement was already satisfied, which means that psutil hadn't been installed in the first place. I tried to figure this out a bit, and this is what I understand so far: I was running inside a Python virtualenv and installing via pip install -U -r requirements.txt, where requirements.txt contains a bunch of packages including psutil. When I remove the -U flag, it skips installing psutil and jumps over to the other packages. Which raises another question: whether this is how pip is supposed to behave when there is no -U flag. It's interesting that the first time, when it's installing the packages with the -U flag, it looks inside the main Python installation instead of the virtual environment's, and when the -U flag is removed it doesn't do that and skips entirely.
There are some setups where you have a bunch of packages installed somewhere that isn't the normal install location for setuptools, and comes after the normal install location on sys.path.
Probably the most common of these setups is Apple's pre-installed Python 2.7, so I'll use it as an example. Even if that isn't your setup, it will hopefully still be instructive.
Apple includes an Extras directory (at /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python), with a bunch of third-party packages that either Apple's own tools need, or that Apple thought you might want (back when Apple cared about providing the best Python experience of any platform).
For example, on macOS 10.13, that directory will include NumPy 1.8.0.
These packages are all installed as distribute-style eggs.
(Some Linux distros do, or at least used to do, similar things, with Python packages built as RPM/DEB/etc. packages, which go into a distutils directory, unlike things you install via pip or manually, which go into a setuptools directory. The details are a bit different, but the effects, and the workaround, end up being the same.)
If you install pip, and then try to pip install -U numpy or pip uninstall numpy, pip will see the distribute-style numpy-1.8.0rc1-py2.7.egg-info file and refuse to touch it for fear of breaking everything.
If you just pip install numpy, however, it will look only in the standard site-packages installation location used by setuptools, /Library/Python/2.7/site-packages, see nothing there, and happily install a modern version of NumPy for you.
And, because /Library/Python/2.7/site-packages comes before /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python on your sys.path, the new NumPy will hide the ancient NumPy, and everything will just work as intended.
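You can check this ordering on your own machine by printing sys.path; the first directory containing a given package wins the import:

```shell
# Print Python's import search path, one entry per line.
python3 -c 'import sys; print("\n".join(sys.path))'
```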
There can be a few problems with this. Most notably, if you try to install something which isn't included in Extras itself, but which has a dependency that is included in Extras, it may fail with mysterious and hard-to-debug errors. For example, on macOS 10.12, pip install pandas will throw a bunch of errors at you about not being able to upgrade dateutil, which you didn't even know you were trying to do. The only thing you can do is look at the dependencies for pandas, see which ones are pre-installed in Extras, and manually pip install shadowing versions of all of them.
But, for the most part, it works.
The following works:
pip install git+git://github.com/pydata/pandas#master
But the following doesn't:
pip install -e git+git://github.com/pydata/pandas#master
The error is:
--editable=git+git://github.com/pydata/pandas#master is not the right format; it must have #egg=Package
Why?
Also, I read that -e does the following:
--egg
Install as self contained egg file, like easy_install does.
what is the value of this? When would this be helpful? (I always work on a virtualenv and install through pip)
Generally, you don't want to install as a .egg file. However, there are a few rare cases where you might. For example:
It's one of a handful of packages that needs to override a built-in package, and knows how to do so when installed as a .egg. With Apple Python, readline is such a package. I don't know of any other common exceptions.
The egg has binary dependencies that point to other eggs on PyPI, and can serve as a binary dependency for yet other eggs on PyPI. This is pretty rare nowadays, because it doesn't actually work in many important cases.
You want a package embedded in a single file that you can copy-and-paste, FTP, etc. from one installation to another.
You want a package that you can install into another installation straight out of site-packages.
The package is badly broken (and you can't fix it, for whatever reason), so that setup.py install does not work, but it can properly build an egg and run out of an egg.
Meanwhile, if you want to use editable mode, the package, and all other packages it depends on, have to be egg-compatible, whether or not you install them as eggs; pip will add #egg=<project name> to the VCS URL for each one, and if any of them don't understand that, it will fail.
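Concretely, the fix for the error in the question is to append the #egg fragment yourself. The command is only printed here rather than run, since executing it would clone the full pandas repository (and GitHub has since disabled the git:// protocol, so https is shown):

```shell
# Dry run: show the corrected editable-install command.
echo 'pip install -e "git+https://github.com/pydata/pandas.git#egg=pandas"'
```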
I feel like there must be a way to do this, but for the life of me I can't figure out how: I want to run pip against a requirements file in a virtualenv so that no matter what packages are in the virtualenv before I run pip, the requirements file is totally fulfilled (including specific versions) after I run it.
The problem now is that if I have an older version of a package installed in the virtualenv than is listed in the requirements file, it complains about the version mismatch and exits (it should just update the package to the given version). The command I'm running is pip install -I -r requirements.txt and according to pip's help, -I is supposed to make pip "Ignore the installed packages (reinstalling instead)" but it definitely isn't doing that.
What am I missing?
(It'd be nice if pip skipped the packages that are already fulfilled too.)
I figured out what the cause of my pip problems was. Long story short, source left over in the virtualenv's build directory was causing an error that made package upgrades fail. What I actually should have been doing was clearing out that directory (which pip apparently doesn't always do) before running pip install; after that, it does everything I want when paired with the --upgrade/-U flag.
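As a sketch of that fix, assuming a virtualenv directory named venv (adjust the path to your environment) and an empty stand-in requirements file to keep the demo self-contained:

```shell
# Clear any leftover source in the virtualenv's build directory,
# then rerun pip with the upgrade flag.
: > requirements-demo.txt            # stand-in; use your real file here
rm -rf venv/build                    # the "venv" path is an assumption
pip install --upgrade -r requirements-demo.txt && echo "upgrade: OK"
```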