Checking a Python module version at runtime

Many third-party Python modules have an attribute which holds the version information for the module (usually something like module.VERSION or module.__version__), but some do not.
Particular examples of such modules are libxslt and libxml2.
I need to check that the correct versions of these modules are being used at runtime. Is there a way to do this?
A potential solution would be to read in the source at runtime, hash it, and then compare it to the hash of the known version, but that's nasty.
Is there a better solution?

Use pkg_resources. Anything installed from PyPI at least should have a version number.
>>> import pkg_resources
>>> pkg_resources.get_distribution("blogofile").version
'0.7.1'

If you're on Python >= 3.8 you can use a module from the standard library for that. To check a package's version (in this example lxml) run:
>>> from importlib.metadata import version
>>> version('lxml')
'4.3.1'
This functionality has been backported to older versions of Python (< 3.8) as well, but you need to install a separate library first:
pip install importlib_metadata
and then to check a package's version (in this example lxml) run:
>>> from importlib_metadata import version
>>> version('lxml')
'4.3.1'
Keep in mind that this works only for packages installed from PyPI. Also, you must pass the distribution (package) name as an argument to the version function, rather than the name of the module that the package provides (although they're usually the same).

I'd stay away from hashing. The version of libxslt being used might contain some kind of patch that doesn't affect your use of it.
As an alternative, I'd suggest that you don't check at run time (I don't know if that's a hard requirement or not). For the Python code I write that has external dependencies (third-party libraries), I write a script that users can run to check their Python install and see whether the appropriate versions of modules are installed.
For modules that don't have a defined 'version' attribute, you can inspect the interfaces they contain (classes and methods) and see if they match the interface you expect. Then, in the actual code you're working on, assume the third-party modules have the interface you expect.
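A minimal sketch of such an interface check, using hasattr (the module and attribute names below are placeholders for whatever your code actually relies on):

```python
import xml.etree.ElementTree as candidate  # stand-in for the real third-party module

# Attribute names your code depends on; adjust to the real module's interface.
required_attrs = ["Element", "SubElement", "fromstring"]

missing = [name for name in required_attrs if not hasattr(candidate, name)]
if missing:
    raise ImportError(f"{candidate.__name__} lacks expected attributes: {missing}")
print("interface check passed")
```

Run this once at startup (or in a standalone check script) so that the rest of the code can safely assume the expected interface.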

Some ideas:
Try checking for functions that exist or don't exist in your needed versions.
If there are no function differences, inspect function arguments and signatures.
If you can't figure it out from function signatures, set up some stub calls at import time and check their behavior.
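The second idea can be sketched with the standard inspect module; here a standard-library function stands in for the third-party callable you would actually probe:

```python
import inspect
import json  # stand-in: probe the callable whose version behaviour matters to you

# If a keyword argument was added in the version you need, its presence in the
# signature is a cheap version probe that needs no version attribute at all.
params = inspect.signature(json.dumps).parameters
print("sort_keys" in params)  # json.dumps does accept a sort_keys keyword
```

Note that inspect.signature can fail for C-implemented callables, so this probe is best combined with the attribute-existence check above.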

I found it quite unreliable to use the various tools available (including the best one, pkg_resources, mentioned in this other answer), as most of them do not cover all cases. For example:
built-in modules
modules not installed but just added to the Python path (by your IDE, for example)
two versions of the same module available (one on the Python path superseding the one installed)
Since we needed a reliable way to get the version of any package, module or submodule, I ended up writing getversion. It is quite simple to use:
from getversion import get_module_version
import foo
version, details = get_module_version(foo)
See the documentation for details.

You can use
pip freeze
to see the installed packages in requirements format.

For modules which do not provide __version__, the following is ugly but works:
#!/usr/bin/env python3.6
import re
import subprocess

sp = subprocess.run(["pip3", "show", "numpy"], stdout=subprocess.PIPE)
# The second line of `pip show` output is "Version: x.y.z".
ver = sp.stdout.decode('utf-8').strip().split('\n')[1]
res = re.search(r'^Version: (.*)$', ver)
print(res.group(1))
or, on Python 3.7+, where subprocess.run() gained capture_output:
#!/usr/bin/env python3.7
import re
import subprocess

sp = subprocess.run(["pip3", "show", "numpy"], capture_output=True)
ver = sp.stdout.decode('utf-8').strip().split('\n')[1]
res = re.search(r'^Version: (.*)$', ver)
print(res.group(1))

Related

Check version of imported library in python script

In a Python file, how can I check the versions of all the libraries that are imported in exactly that file by "import xxxxxxxxxxxx"?
This is useful for later building an environment that has just enough libraries to run that script, without needing to clone the exact original environment.
example:
import pandas
import tensorflow
def version_list():
    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
# expect version_list to return
# pandas==xxx tensorflow==xxx
# without returning other things
Many modules have their version set to the variable __version__. For example
import numpy
print(numpy.__version__)
prints the version of numpy which you have imported.
This may be related: Standard way to embed version into Python package?
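Building on that, a version_list sketch could scan the file's globals for imported modules and look their versions up via importlib.metadata. This assumes the distribution name matches the import name, which is not always true; json and csv stand in for pandas and tensorflow here:

```python
import types
from importlib.metadata import PackageNotFoundError, version

import json  # stand-ins for `import pandas` / `import tensorflow`
import csv

def version_list():
    """Return 'name==version' for every pip-installed module imported in this file."""
    results = []
    for value in list(globals().values()):
        if isinstance(value, types.ModuleType):
            try:
                results.append(f"{value.__name__}=={version(value.__name__)}")
            except PackageNotFoundError:
                pass  # stdlib modules have no installed distribution metadata
    return results

print(version_list())
```

For stdlib stand-ins the list comes back empty; with pandas and tensorflow imported it would contain entries like pandas==xxx.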

Why are some Python package names different than their import name?

Some packages are imported with a string which is different from the name of the package on PyPI, e.g.:
$ pip list | grep -i "yaml\|qt"
PyYAML 3.13
QtPy 1.5.2
pyyaml (pip install pyyaml), but import yaml
qtpy (pip install qtpy); the import name is qtpy but the PyPI package is QtPy
Several tools can't handle that, e.g. Sphinx:
$ make html
WARNING: autodoc: failed to import module 'wireshark' from module 'logcollector.plugins'; the following exception was raised:
No module named 'qtpy'
I don't remember which tool it was right now, but the same goes for tools which scan the requirements.txt file and print warnings that the yaml package isn't installed (it is, but under the name pyyaml).
There are multiple reasons why authors choose to use different names in different environments:
Drop-in replacements: Sometimes it is helpful when you can install a fork and keep the rest of your code the same. I guess the most famous example is pyyaml / yaml. I did it when I created propy3 which can be used as a drop-in replacement for propy. I would say that this is also what happened with pillow.
Convenience: beautifulsoup4 can be imported as bs4 (+ package parking for bs4)
Lost credentials: I don't know of an example where the import name was changed as well, but I think for flask-restx the package name and the import name were changed.
A word of caution
As Ziyad Edher has pointed out in a related discussion, typosquatting is an issue on PyPI (source). If you add packages with different names, this gets more likely.
Other examples
Name in the docs vs "import" package name vs pypi package name vs anaconda packages vs Debian:
scikit-learn vs sklearn vs scikit-learn vs scikit-learn vs python-sklearn and python3-sklearn
OpenCV-Python vs cv2 vs opencv-python vs py-opencv vs python-opencv
PyTables vs tables vs tables vs pytables vs python-tables
Because these two concepts are not really related.
One is a python concept of package/module names, the other one a package manager concept.
Look at a simple packaging command with zip:
zip -r MyCoolTool.zip tool.py
The tool is named tool, which is probably not unique, and if you don't know that it's MyCoolTool, you don't know which tool it is. When I upload it somewhere I name it MyCoolTool, so you now have a more unique and somewhat more descriptive name.
The other point is that a pip package may include more than one module. PyYAML could, for example, include a second Python module yaml2xml in addition to yaml.
Finally, there can be several implementations. PyYAML sounds like a pure-Python implementation. Now assume you need a really fast parser; then you might write CYAML with a C backend but the same interface, under the name yaml.
In case of sphinx you can mock 3rd party packages with: autodoc_mock_imports
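For the Sphinx case, that is a one-line setting in conf.py, listing whichever import names are unavailable at documentation build time:

```python
# conf.py -- tell autodoc to replace these imports with mock objects
autodoc_mock_imports = ["qtpy", "yaml"]
```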

How to install a missing python package from inside the script that needs it?

Assuming that you already have pip or easy_install installed on your python distribution, I would like to know how can I installed a required package in the user directory from within the script itself.
From what I know pip is also a python module so the solution should look like:
try:
    import zumba
except ImportError:
    import pip
    # ... do "pip install --user zumba" or throw exception <-- how?
    import zumba
What I am missing is doing "pip install --user zumba" from inside Python. I don't want to do it using os.system(), as this may create other problems.
I assume it is possible...
Updated for newer pip versions (>= 10.0):
try:
    import zumba
except ImportError:
    from pip._internal import main as pip
    pip(['install', '--user', 'zumba'])
    import zumba
Thanks to @Joop I was able to come up with the proper answer.
try:
    import zumba
except ImportError:
    import pip
    pip.main(['install', '--user', 'zumba'])
    import zumba
One important remark is that this will work without requiring root access as it will install the module in user directory.
Not sure if it will work for binary modules or ones that would require compilation, but it clearly works well for pure-python modules.
Now you can write self contained scripts and not worry about dependencies.
As of pip version >= 10.0.0, the above solutions will not work because of internal package restructuring. The new way to use pip inside a script is now as follows:
try:
    import abc
except ImportError:
    from pip._internal import main as pip
    pip(['install', '--user', 'abc'])
    import abc
I wanted to note that the current accepted answer could result in a possible app name collision. Importing from the app namespace doesn't give you the full picture of what's installed on the system.
A better way would be:
import pip
packages = [package.project_name for package in pip.get_installed_distributions()]
if 'package' not in packages:
    pip.main(['install', 'package'])
Do not use pip.main or pip._internal.main.
Quoting directly from the official documentation (boldface emphasis and editing comments mine, italics theirs):
As noted previously, pip is a command line program. While it is... available from your Python code via import pip, you must not use pip’s internal APIs in this way. There are a number of reasons for this:
The pip code assumes that [it] is in sole control of the global state of the program. pip manages things like... without considering the possibility that user code might be affected.
pip’s code is not thread safe. If you were to run pip in a thread, there is no guarantee that either your code or pip’s would work as you expect.
pip assumes that once it has finished its work, the process will terminate... calling pip twice in the same process is likely to have issues.
This does not mean that the pip developers are opposed in principle to the idea that pip could be used as a library - it’s just that this isn’t how it was written, and it would be a lot of work to redesign the internals for use as a library [with a] stable API... And we simply don’t currently have the resources....
...[E]verything inside of pip is considered an implementation detail. Even the fact that the import name is pip is subject to change without notice. While we do try not to break things as much as possible, all the internal APIs can change at any time, for any reason....
...[I]nstalling packages into sys.path in a running Python process is something that should only be done with care. The import system caches certain data, and installing new packages while a program is running may not always behave as expected....
Having said all of the above[:] The most reliable approach, and the one that is fully supported, is to run pip in a subprocess. This is easily done using the standard subprocess module:
import subprocess
import sys

subprocess.check_call([sys.executable, '-m', 'pip', 'install', 'my_package'])
It goes on to describe other more appropriate tools and approaches for related problems.
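Putting the supported approach together, a helper along these lines can be written (ensure_package is an illustrative name, not a pip or stdlib API):

```python
import importlib
import subprocess
import sys

def ensure_package(module_name, package_name=None):
    """Import module_name, pip-installing package_name first if it is missing.

    Pass package_name when the PyPI name differs from the import name
    (e.g. distribution 'PyYAML' vs. `import yaml`); it defaults to module_name.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        subprocess.check_call(
            [sys.executable, "-m", "pip", "install", "--user", package_name or module_name]
        )
        importlib.invalidate_caches()  # make the fresh install visible to import
        return importlib.import_module(module_name)
```

Because pip runs in a subprocess, this avoids all of the global-state and thread-safety caveats quoted above; only the final import touches the running process.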

Getting package version using pkg_resources?

What is the recommended way of getting hold of the version of a package on the $PYTHONPATH or sys.path?
I remember that the pkg_resources module has some functionality for this, but I cannot find any related information. Please don't point me to a solution using a version.py file and reading it somehow. Using pkg_resources is the way to go, but how exactly?
>>> import pkg_resources
>>> pkg_resources.get_distribution("PIL").version
'1.1.7'
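Worth knowing: pkg_resources (part of setuptools) is nowadays deprecated in favour of the standard-library importlib.metadata (Python 3.8+), which offers the same lookup plus a dedicated exception for missing distributions:

```python
from importlib.metadata import PackageNotFoundError, version

try:
    # Use the distribution name, exactly as with pkg_resources.get_distribution.
    print(version("Pillow"))
except PackageNotFoundError:
    print("Pillow is not installed")
```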

How do I get the version of an installed module in Python programmatically?

For the modules:
required_modules = ['nose', 'coverage', 'webunit', 'MySQLdb', 'pgdb', 'memcache']
and programs:
required_programs = ['psql', 'mysql', 'gpsd', 'sox', 'memcached']
Something like:
# Report on the versions of the modules installed
import importlib

for module_name in required_modules:
    try:
        module = importlib.import_module(module_name)
        print(module.__version__)
    except (ImportError, AttributeError):
        pass
Unfortunately, module.__version__ isn't present in all modules.
A workaround is to use a package manager. When you install a library using easy_install or pip, it keeps a record of the installed version. Then you can do:
import pkg_resources
version = pkg_resources.get_distribution("nose").version
