How to extract dependencies from a PyPI package - python

My goal is simple: I want to get the dependencies of a PyPI package remotely, without needing to download it completely.
From reading the pip code, I understand that when resolving dependencies pip reads the egg metadata once the package has been downloaded...
Is there any other way?

Use pipdeptree to view dependencies of installed PyPI packages.
Install:
pip install pipdeptree
Then run:
pipdeptree
You'll see something like this:
Warning!!! Possible conflicting dependencies found:
* Mako==0.9.1 -> MarkupSafe [required: >=0.9.2, installed: 0.18]
Jinja2==2.7.2 -> MarkupSafe [installed: 0.18]
------------------------------------------------------------------------
Lookupy==0.1
wsgiref==0.1.2
argparse==1.2.1
psycopg2==2.5.2
Flask-Script==0.6.6
- Flask [installed: 0.10.1]
- Werkzeug [required: >=0.7, installed: 0.9.4]
- Jinja2 [required: >=2.4, installed: 2.7.2]
- MarkupSafe [installed: 0.18]
- itsdangerous [required: >=0.21, installed: 0.23]
alembic==0.6.2
- SQLAlchemy [required: >=0.7.3, installed: 0.9.1]
- Mako [installed: 0.9.1]
- MarkupSafe [required: >=0.9.2, installed: 0.18]
ipython==2.0.0
slugify==0.0.1
redis==2.9.1

As jinghli notes, there isn't currently a reliable way to get the dependencies of an arbitrary PyPI package remotely without downloading it completely. In fact, the dependencies sometimes depend on your environment, so an approach like Brian's of executing the setup.py code is needed in the general case.
The way the Python ecosystem handles dependencies started evolving in the 1990s, before the problem was well understood. PEP 508 -- Dependency specification for Python Software Packages sets us on course to improve the situation, and an "aspirational" draft approach in PEP 426 -- Metadata for Python Software Packages 2.0 may improve it further, in conjunction with the reimplementation of PyPI as Warehouse.
The current situation is described well in the document Python Dependency Resolution.
PyPI does provide a JSON interface for downloading metadata for each package. The info.requires_dist field contains a list of names of required packages, with optional version restrictions etc. It is often missing, but it is one place to start.
E.g. Django (json) indicates:
{
  "info": {
    ...
    "requires_dist": [
      "bcrypt; extra == 'bcrypt'",
      "argon2-cffi (>=16.1.0); extra == 'argon2'",
      "pytz"
    ],
    ...
  }
}
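A minimal sketch of fetching this metadata remotely, assuming the standard pypi.org JSON endpoint and using only the standard library:

import json
import urllib.request

def remote_requires_dist(package, version=None):
    """Return the requires_dist list from PyPI's JSON API, or [] if absent."""
    url = (f"https://pypi.org/pypi/{package}/json" if version is None
           else f"https://pypi.org/pypi/{package}/{version}/json")
    with urllib.request.urlopen(url) as response:
        data = json.load(response)
    # requires_dist can be null for packages that never uploaded this metadata
    return data["info"].get("requires_dist") or []

print(remote_requires_dist("Django"))

Note that this only reflects whatever the package uploaded; as mentioned above, it can be empty even when the package does have dependencies.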

I've just needed to find a way to do this, and this is what I came up with (stolen from pip).
import distutils.core

def dist_metadata(setup_py):
    '''Get the dist object for a setup.py file'''
    with open(setup_py) as f:
        d = f.read()
    try:
        # we have to do this with current globals else
        # imports will fail. secure? not really. A
        # problem? not really if your setup.py sources are
        # trusted
        exec(d, globals(), globals())
    except SystemExit:
        pass
    return distutils.core._setup_distribution
https://stackoverflow.com/a/12505166/3332282 answers why the exec incantation is subtle and hard to get right.
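A usage sketch (the path below is hypothetical, and which attributes the returned distribution object exposes depends on whether the setup.py uses setuptools or plain distutils):

dist = dist_metadata('path/to/setup.py')  # hypothetical path to an unpacked sdist
print(dist.metadata.name)
# setuptools-based setup.py files usually populate this attribute:
print(getattr(dist, 'install_requires', 'not available'))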

Sadly, pip doesn't have this function. The metadata available for packages on PyPI does not always include information about the dependencies.
Normally, you can find the detailed dependencies in the README file or on the project website.
pip search can give some information about the package. It can tell you what it is based on.
$ pip search flask
Flask - A microframework based on Werkzeug, Jinja2 and good intentions

Related

Install transitive dependency version automatically in python

I am writing a Python application and I have the following dependencies:
Requires-Dist: keyring (==23.9.3)
Requires-Dist: keyrings.alt (==4.2.0)
Requires-Dist: pandas (==1.3.5)
Requires-Dist: pyarrow (==10.0.0)
Requires-Dist: requests (==2.28.1)
Requires-Dist: requests-toolbelt (==0.10.1)
Requires-Dist: toml (==0.10.2)
In the above list, each dependency has its own transitive dependencies. For example, "requests" has a urllib3 dependency whose version must be at least 1.21.1 and below 1.27.
- requests [required: ==2.28.1, installed: 2.28.1]
  - certifi [required: >=2017.4.17, installed: 2022.9.14]
  - charset-normalizer [required: >=2,<3, installed: 2.0.4]
  - idna [required: >=2.5,<4, installed: 3.3]
  - urllib3 [required: >=1.21.1,<1.27, installed: 1.25.8]
When my application wheel file is installed (using pip install), is there any way I can make sure that the highest required/supported version of each transitive dependency is also automatically installed?
Currently my application fails to run because the APIs I am using from "requests" rely on functionality that only exists in urllib3 version 1.26 or above:
ERROR - __init__() got an unexpected keyword argument 'allowed_methods'
So if there is a way to automatically install the highest required/supported version of the transitive dependencies, this problem could be solved. A similar issue can be observed with other transitive dependency versions as well. Any help would be appreciated. Thanks in advance.
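One possible workaround, shown below as a hedged sketch (not from the original question; the project name is hypothetical): declare the transitive requirement directly in your own install_requires, so pip's resolver must pick a urllib3 that is new enough while still satisfying requests:

# setup.py (sketch): declaring the transitive requirement explicitly
from setuptools import setup, find_packages

setup(
    name="myapp",  # hypothetical project name
    version="1.0.0",
    packages=find_packages(),
    install_requires=[
        "requests==2.28.1",
        # Retry(allowed_methods=...) needs urllib3 1.26+, which still falls
        # inside requests' own >=1.21.1,<1.27 window
        "urllib3>=1.26,<1.27",
    ],
)

Alternatively, the same bound can be expressed in a constraints file passed to pip install with -c.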

Install specific version of setuptools as a dependency of package

My package has setuptools in its dependencies. I am trying to restrict the version of setuptools used when installing my package.
The package has the following restriction in setup.py:
setup(
    setup_requires=[
        'setuptools==50.2.0',
        'pip>=19,!=20.0,!=20.0.1,<21'
    ],
    ...
)
And it has the same restriction in pyproject.toml:
[build-system]
requires = ["setuptools==50.2.0", "pip>=19,!=20.0,!=20.0.1,<21", "wheel"] # PEP 508 specifications.
However, when installing my package with pip, it downloads the latest setuptools 50.3.0.
Why does it ignore the requirements? How can I make it not install the latest version?
I think you're getting confused between build time (setup_requires / pyproject.toml build-system requires) and install time (install_requires). At install time, you're getting unpinned setuptools because it's a transitive dependency without version restrictions.
setuptools is being pulled in via a transitive dependency in install_requires (notably: jsonschema):
$ visualize-requirements t.txt
cryptography>=2.4.2,<3
- cffi!=1.11.3,>=1.8
- pycparser
- six>=1.4.1
click>=7.0,<8
intelhex<3,>=2.2.1
python-jose<4,>=3.0.1
- pyasn1
- rsa
- pyasn1>=0.1.3
- ecdsa<0.15
- six
- six<2.0
jsonschema<4,>=3.0.0
- six>=1.11.0
- attrs>=17.4.0
- setuptools
- pyrsistent>=0.14.0
pyocd==0.27.3
- intervaltree<4.0,>=3.0.2
- sortedcontainers<3.0,>=2.0
- pylink-square
- six
- psutil>=5.2.2
- future
- cmsis-pack-manager>=0.2.7
- milksnake>=0.1.2
- cffi>=1.6.0
- pycparser
- appdirs>=1.4
- pyyaml>=3.12
- pyelftools
- six<2.0,>=1.0
- colorama
- prettytable
- pyusb>=1.0.0b2,<2.0
- pyyaml<6.0,>=5.1
- intelhex<3.0,>=2.0
cbor==1.0.0
imgtool==1.7.0a1
- intelhex>=2.2.1
- click
- cryptography>=2.4.2
- cffi!=1.11.3,>=1.8
- pycparser
- six>=1.4.1
- cbor>=1.0.0
I'm using visualize-requirements from a tool I wrote called requirements-tools.
Seems accurate: 50.3.0 is greater than 40.0, less than 51, and not equal to 46.0 or 50.0. You may need to further restrict your requirements. If you know which version you want, just specify it explicitly.
EDIT:
I created a fresh venv and checked pip list; it seems that with a high enough version of pip, setuptools comes in at 50.3.0.
$ pip3 -V
pip 8.1.1 from /usr/lib/python3/dist-packages (python 3.5)
$ pip3 list | grep setup
setuptools (20.7.0)
You are using pip version 8.1.1, however version 20.2.3 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Inside the venv (assuming Python 3.x)
$ . vv/bin/activate
(vv) $ pip3 -V
pip 20.2.3 from /home/user/vv/lib/python3.5/site-packages/pip (python 3.5)
(vv) $ pip3 list | grep setup
DEPRECATION: Python 3.5 reached the end of its life on September 13th, 2020. Please upgrade your Python as Python 3.5 is no longer maintained. pip 21.0 will drop support for Python 3.5 in January 2021. pip 21.0 will remove support for this functionality.
setuptools 50.3.0
Thanks to the answers and comments, I can draw a conclusion.
To use a specific version of setuptools, it is necessary to pin it in both locations: in pyproject.toml and at the beginning of install_requires in setup.py.
A tool like pip will use the version from pyproject.toml to build the project. However, if any dependency allows the latest version of setuptools in its requirements, then the latest version will be used when installing that dependency. Also, the environment will keep whichever version was installed last.
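A minimal sketch of that conclusion (the package name is illustrative), to be combined with the build-system requires pin already shown in the question's pyproject.toml:

# setup.py (sketch): pin setuptools for the installed environment as well
from setuptools import setup, find_packages

setup(
    name="mypackage",  # hypothetical name
    version="0.1.0",
    packages=find_packages(),
    install_requires=[
        "setuptools==50.2.0",  # install-time pin, listed first
        # ... the rest of the runtime dependencies
    ],
)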

How to get egg or wheel file of pip-installed python package?

I have a similar import error on Spark executors as described here, just with psycopg2: ImportError: No module named numpy on spark workers
Here it says "Although pandas is too complex to distribute as a *.py file, you can create an egg for it and its dependencies and send that to executors".
So the question is: "How to create an egg file from a package and its dependencies?" Or a wheel, in case eggs are legacy. Is there any command for this in pip?
You want to be making a wheel. Wheels are newer and more robust than eggs, and are supported by both Python 2 and 3.
For something as popular as numpy, you don't need to bother making the wheel yourself. They package wheels in their distribution, so you can just download it. Many Python libraries will have a wheel as part of their distribution. See here: https://pypi.python.org/pypi/numpy
If you're curious, see here how to make one in general: https://pip.pypa.io/en/stable/reference/pip_wheel/.
Alternatively, you could just install numpy on your target workers.
EDIT:
After your comments, I think it's pertinent to mention the pipdeptree utility. If you need to see by hand what the pip dependencies are, this utility will list them for you. Here's an example:
$ pipdeptree
3to2==1.1.1
anaconda-navigator==1.2.1
ansible==2.2.1.0
- jinja2 [required: <2.9, installed: 2.8]
- MarkupSafe [required: Any, installed: 0.23]
- paramiko [required: Any, installed: 2.1.1]
- cryptography [required: >=1.1, installed: 1.4]
- cffi [required: >=1.4.1, installed: 1.6.0]
- pycparser [required: Any, installed: 2.14]
- enum34 [required: Any, installed: 1.1.6]
- idna [required: >=2.0, installed: 2.1]
- ipaddress [required: Any, installed: 1.0.16]
- pyasn1 [required: >=0.1.8, installed: 0.1.9]
- setuptools [required: >=11.3, installed: 23.0.0]
- six [required: >=1.4.1, installed: 1.10.0]
- pyasn1 [required: >=0.1.7, installed: 0.1.9]
- pycrypto [required: >=2.6, installed: 2.6.1]
- PyYAML [required: Any, installed: 3.11]
- setuptools [required: Any, installed: 23.0.0]
If you're using Pyspark and need to package your dependencies, pip can't do this for you automatically. Pyspark has its own dependency management that pip knows nothing about. The best you can do is list the dependencies and shove them over by hand, as far as I know.
Additionally, Pyspark isn't dependent on numpy or psycopg2, so pip can't possibly tell you that you'd need them if all you're telling pip is your version of Pyspark. That dependency has been introduced by you, so you're responsible for giving it to Pyspark.
As a side note, we use bootstrap scripts that install our dependencies (like numpy) before we boot our clusters. It seems to work well. That way you list the libs you need once in a script, and then you can forget about it.
HTH.
You can install wheel using pip install wheel.
Then create a .whl using python setup.py bdist_wheel. You'll find it in the dist directory in the root directory of the Python package. You might also want to pass --universal if you want a single .whl file for both Python 2 and Python 3.
More info on wheel.
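For reference, a minimal sketch of a setup.py that the above command could build into a wheel (the package and dependency names are illustrative):

# setup.py (sketch): minimal project that `python setup.py bdist_wheel` can build
from setuptools import setup, find_packages

setup(
    name="my_spark_helpers",  # hypothetical package name
    version="0.1.0",
    packages=find_packages(),
    # listed dependencies end up as Requires-Dist metadata in the wheel;
    # the dependency code itself is not bundled inside the wheel
    install_requires=["psycopg2"],
)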

Anaconda: Error while building from PyPi package ("Package XY missing in current linux-64 channels")

I am trying to build a conda package of the open energy modelling framework (oemof) PyPi package as described in the respective manual. The oemof package has the Pyomo package as a requirement which I had installed in advance using a suitable recipe.
My problem is that I now get an error during the build process:
Package missing in current linux-64 channels:
- pyomo >=4.2.0
whereas my installed Pyomo version seems to be above 4.2:
cord#crd-Laptop:~/.anaconda3/bin$ ./conda update pyomo
pyomo 4.2.10784 py35_10 cachemeorg
What's my mistake here and how can I build my package as described in the conda manual?
Thanks in advance!
Below you can see the steps I went through so far:
cord#crd-Laptop:~/.anaconda3/bin$ ./conda skeleton pypi oemof
Warning, the following versions were found for oemof
0.0.6
0.0.5
0.0.4
0.0.3
Using 0.0.6
Use --version to specify a different version.
Using url https://pypi.python.org/packages/3b/1f/5a82acf8cbcb3d0adb537346b2939cb6fa415e9c347f734af19c8a1b50d1/oemof-0.0.6.tar.gz (52 KB) for oemof.
Downloading oemof
Using cached download
Unpacking oemof...
done
working in /tmp/tmpd67mbpi2conda_skeleton_oemof-0.0.6.tar.gz
Using Anaconda Cloud api site https://api.anaconda.org
Fetching package metadata: ......
Solving package specifications: .........
The following NEW packages will be INSTALLED:
mkl: 11.3.1-0
numpy: 1.11.0-py35_0
openssl: 1.0.2g-0
pip: 8.1.1-py35_1
python: 3.5.1-0
pyyaml: 3.11-py35_1
readline: 6.2-2
setuptools: 20.7.0-py35_0
sqlite: 3.9.2-0
tk: 8.5.18-0
wheel: 0.29.0-py35_0
xz: 5.0.5-1
yaml: 0.1.6-0
zlib: 1.2.8-0
Linking packages ...
[ COMPLETE ]|###########################################################################################| 100%
Applying patch: '/tmp/tmpd67mbpi2conda_skeleton_oemof-0.0.6.tar.gz/pypi-distutils.patch'
patching file core.py
Hunk #1 succeeded at 167 with fuzz 2 (offset 1 line).
Using "UNKNOWN" for the license
Writing recipe for oemof
Done
cord#crd-Laptop:~/.anaconda3/bin$ ./conda build oemof
Removing old build environment
Removing old work directory
BUILD START: oemof-0.0.6-py35_0
Using Anaconda Cloud api site https://api.anaconda.org
Fetching package metadata: ......
Solving package specifications: .
Package missing in current linux-64 channels:
- pyomo >=4.2.0
Missing dependency pyomo, but found recipe directory, so building pyomo first
Ignoring non-recipe: pyomo
Removing old build environment
Removing old work directory
BUILD START: oemof-0.0.6-py35_0
Fetching package metadata: ......
Solving package specifications: .
Package missing in current linux-64 channels:
- pyomo >=4.2.0
cord#crd-Laptop:~/.anaconda3/bin$ ./conda update pyomo
Using Anaconda Cloud api site https://api.anaconda.org
Fetching package metadata: ....
# All requested packages already installed.
# packages in environment at /home/cord/.anaconda3:
#
pyomo 4.2.10784 py35_10 cachemeorg
cord#crd-Laptop:~/.anaconda3/bin$
For your build step please try conda build -c cachemeorg oemof.
I believe the problem here is that conda build creates a whole new conda environment when it is building and it will install all the package dependencies, including pyomo, in that environment. It installs them by looking for them in the channels and not via your currently installed packages in your root. In this example you have pyomo installed as a package but that didn't come from a channel in your channels list as you installed it yourself. Therefore it fails to find the pyomo package when searching your conda channels. But if we add a channel to the list that conda build is looking at (via the -c flag) which has pyomo then it should work. It looks like cachemeorg has this package and therefore the above command should work.

Identifying the dependency relationship for python packages installed with pip

When I do a pip freeze I see a large number of Python packages that I didn't explicitly install, e.g.
$ pip freeze
Cheetah==2.4.3
GnuPGInterface==0.3.2
Landscape-Client==11.01
M2Crypto==0.20.1
PAM==0.4.2
PIL==1.1.7
PyYAML==3.09
Twisted-Core==10.2.0
Twisted-Web==10.2.0
(etc.)
Is there a way for me to determine why pip installed these particular dependent packages? In other words, how do I determine the parent package that had these packages as dependencies?
For example, I might want to use Twisted, but I don't want to depend on a package until I know more about the risk of accidentally uninstalling or upgrading it.
You could try pipdeptree which displays dependencies as a tree structure e.g.:
$ pipdeptree
Lookupy==0.1
wsgiref==0.1.2
argparse==1.2.1
psycopg2==2.5.2
Flask-Script==0.6.6
- Flask [installed: 0.10.1]
- Werkzeug [required: >=0.7, installed: 0.9.4]
- Jinja2 [required: >=2.4, installed: 2.7.2]
- MarkupSafe [installed: 0.18]
- itsdangerous [required: >=0.21, installed: 0.23]
alembic==0.6.2
- SQLAlchemy [required: >=0.7.3, installed: 0.9.1]
- Mako [installed: 0.9.1]
- MarkupSafe [required: >=0.9.2, installed: 0.18]
ipython==2.0.0
slugify==0.0.1
redis==2.9.1
To get it run:
pip install pipdeptree
EDIT: as noted by @Esteban in the comments, you can also list the tree in reverse with -r, or for a single package with -p <package_name>. So to find what installed Werkzeug you could run:
$ pipdeptree -r -p Werkzeug
Werkzeug==0.11.15
- Flask==0.12 [requires: Werkzeug>=0.7]
The pip show command will show what packages are required for the specified package (note that the specified package must already be installed):
$ pip show specloud
Package: specloud
Version: 0.4.4
Requires:
nose
figleaf
pinocchio
pip show was introduced in pip version 1.4rc5
As I recently said in a HN thread, I'll recommend the following:
Have a commented requirements.txt file with your main dependencies:
## this is needed for whatever reason
package1
Install your dependencies: pip install -r requirements.txt.
Now you get the full list of your dependencies with pip freeze -r requirements.txt:
## this is needed for whatever reason
package1==1.2.3
## The following requirements were added by pip --freeze:
package1-dependency1==1.2.3
package1-dependency1==1.2.3
This allows you to keep your file structure with comments, nicely separating your dependencies from the dependencies of your dependencies. This way you'll have a much nicer time the day you need to remove one of them :)
Note the following:
You can have a clean requirements.raw with version control to rebuild your full requirements.txt.
Beware of git urls being replaced by egg names in the process.
The dependencies of your dependencies are still alphabetically sorted so you don't directly know which one was required by which package but at this point you don't really need it.
Use pip install --no-install <package_name> to list specific requirements.
Use virtualenv if you don't already.
You may also use a one-line command which pipes the packages in requirements.txt to pip show.
cut -d'=' -f1 requirements.txt | xargs pip show
The following command will show requirements of all installed packages:
pip3 freeze | awk '{print $1}' | cut -d '=' -f1 | xargs pip3 show
First of all, pip freeze displays all currently installed Python packages, not necessarily ones installed using pip.
Secondly, Python packages do contain information about dependent packages as well as required versions. You can see the dependencies of a particular package using the methods described here. When you're upgrading a package, an installer like pip will handle the upgrade of dependencies for you.
To manage the updating of packages, I recommend using pip requirements files. You can define which packages and versions you need, and install them all at once using pip install.
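For example, a small sketch using the standard library's importlib.metadata (available since Python 3.8) to print the declared requirements of an installed package:

from importlib.metadata import requires

# prints the Requires-Dist entries recorded in the installed package's metadata
for req in requires("requests") or []:
    print(req)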
(workaround, not true answer)
Had the same problem: lxml was not installing, and I wanted to know who needed lxml, not what lxml needed. I ended up bypassing the issue by:
noting where my site-packages were being put;
going there and recursively grepping for the import (the last grep's --invert-match serves to remove lxml's own files from consideration).
Yes, not an answer as to how to use pip to do it, but I didn't get any success out of the suggestions here, for whatever reason.
site-packages me$ egrep -i --include=*.py -r -n lxml . | grep import | grep --invert-match /lxml/
I wrote a quick script to solve this problem. The following script will display the parent (dependant) package(s) for any given package. This way you can be sure it is safe to upgrade or install any particular package. It can be used as follows: dependants.py PACKAGENAME
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""Find dependants of a Python package"""
import logging
import sys

import pkg_resources
# note: this is pip's internal API; it is available in older pip releases
from pip._internal.utils.misc import get_installed_distributions

__program__ = 'dependants.py'


def get_dependants(target_name):
    for package in get_installed_distributions():
        for requirement_package in package.requires():
            requirement_name = requirement_package.project_name
            if requirement_name == target_name:
                yield package.project_name


# configure logging
logging.basicConfig(format='%(levelname)s: %(message)s',
                    level=logging.INFO)

try:
    target_name = sys.argv[1]
except IndexError:
    logging.error('missing package name')
    sys.exit(1)

try:
    pkg_resources.get_distribution(target_name)
except pkg_resources.DistributionNotFound:
    logging.error("'%s' is not a valid package", target_name)
    sys.exit(1)

print(list(get_dependants(target_name)))
You have two options here.
The first will output all top-level packages, excluding sub-packages. Note that this will also exclude packages like requests when another installed package depends on them, even if you installed them explicitly.
pip3 list --not-required --format freeze --exclude pip --exclude setuptools
The second option is to print the packages based on the existing requirements.txt file.
pip3 freeze -r requirements.txt
This will generate a file in the format:
existing-package==1.0.0
## The following requirements were added by pip freeze:
dependency-package==1.0.0
You can remove all the additionally added packages by using sed:
pip3 freeze -r requirements.txt | sed -n '/## The following requirements were added by pip freeze:/q;p'
With Graphviz, as seen on TV
If you like graphs you can use graphviz (Documentation).
pip install graphviz
Then do something like this:
#! /usr/bin/env python3
import graphviz
import pkg_resources

GRAPH_NAME = "pipdeps"


def init_grph():
    grph = graphviz.Digraph(GRAPH_NAME,
                            node_attr={'color': 'lightblue2', 'style': 'filled'})
    # This does not seem to be interpreted on websites I tested
    grph.attr(engine='neato')
    return grph


def add_pip_dependencies_to_graph(grph):
    l_packages = [p for p in pkg_resources.working_set]
    for package in l_packages:
        name = package.key
        for dep in package.requires():
            grph.edge(dep.key, name)


def main():
    grph = init_grph()
    add_pip_dependencies_to_graph(grph)
    print(grph.source)
    # grph.view()


main()
This prints the graph source in Graphviz DOT format (a digraph).
You can view it on an online graphviz visualiser (e.g. )
If you have a graphic interface (😮) you can use grph.view()
hth
