Imagine a project MyLibrary which used to have its own requirements.txt file specifying the exact version needed for each of its dependencies...
lib_a==0.1
lib_b==0.11
lib_c==0.1.1
lib_d==0.1.2
lib_e==0.1.8
And a project ChildProject which happens to have the same kind of setup, with its own requirements.txt file and everything.
ChildProject uses MyLibrary because it needs some common functionality from it. The problem with these two is that ChildProject specifies a library that MyLibrary also specifies, but at a different version, which causes a conflict and makes the build fail.
What I've done to get rid of the problem is to erase the pinned dependencies in MyLibrary and specify the minimum and maximum versions for each of the libraries in the install_requires argument of the setup() call...
from setuptools import setup

setup(
    setup_requires=['pbr', 'pytest-runner'],
    install_requires=[
        'lib_a>=0,<1',
        'lib_b>=0,<2',
        'lib_c>=0,<3',
        'lib_d>=0,<4',
        'lib_e>=0,<5',
    ],
    pbr=True,
)
And here is where I get lost...
Should I remove requirements.txt in MyLibrary and leave all the versioning to the child projects that use it?
If so, how do I know that ChildProject is specifying all of the needed dependencies? What if I forget to specify lib_a in ChildProject?
Does the latest version that complies with the install_requires constraints get installed automatically, or how does it work? (I ask because, AFAIK, install_requires just specifies the constraints but doesn't bundle any library with the project.)
General suggestions for managing dependency versions:
libraries don't pin versions (i.e. install_requires either has no versions at all or only loose restrictions, e.g. <4). That's what you have already.
applications can do whatever is needed. In practice, it's highly recommended to pin your dependencies to exact versions (and better yet, provide hashes, to protect yourself from forged libraries). The reason: you cannot guarantee that third-party libraries follow semver, which means that having >2, <3 in your requirements.txt may lead to a broken build/deployment because a third-party lib released 2.5, which turns out to be backward-incompatible with 2.4. So you must do your best to avoid breaking builds simply by re-building at a different time. In other words, your build should not depend on the current state of PyPI.
In general: you pin versions at some state, test your application, and commit/save/build/however you deliver. Some time later, you revise the versions (e.g. to update a framework or apply a security patch), update requirements.txt, and test your app with the new dependency state; if nothing conflicts or breaks, you "freeze" that state with pinned versions and build/deploy/etc. This kind of loop gives you room to update your requirements periodically and stay up to date, while at the same time your code will not be broken by simply re-installing dependencies.
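As a minimal sketch of that loop (package names reused from the question, the MyLibrary version is hypothetical): the library keeps its loose ranges in install_requires, while the application freezes the exact state it last tested, e.g. with pip freeze:

# ChildProject's requirements.txt, regenerated after a green test run (pip freeze > requirements.txt)
MyLibrary==1.2.0
lib_a==0.1
lib_b==0.11
lib_c==0.1.1
lib_d==0.1.2
lib_e==0.1.8

Re-installing from this file reproduces the same dependency state regardless of what has been released on PyPI since.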
If you're looking for easier dependency management with versions, I suggest taking a look at pipenv.
Related
Imagine I have:
Library Z
Library Y, which depends on Library Z
Application A, which depends on Library Y
To fully test out changes to Library Z, I'd like to run the tests of Application A with any Development releases of Library Z.
To do this I can set up Library Z to publish packages to some package index for development releases under the versioning scheme {major}.{minor}.{micro}.dev{build}, then have Library Y specify its dependency range for Library Z as >={major},<{major+1} for instance, and use pip install --pre ... on Application A to ensure the Development releases of Library Z are picked up.
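For illustration, a rough sketch of the constraint side of that setup (the names and version numbers here are hypothetical):

# Library Y's setup.py: the range admits development builds of Library Z
# such as 3.1.0.dev12, but pip only considers them when pre-releases are
# enabled (e.g. via the --pre flag when installing Application A).
from setuptools import setup

setup(
    name='library-y',
    version='1.4.0',
    install_requires=['library-z>=3,<4'],
)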
This all works fine, until we have > 1 maintainer of Library Z making changes, likely in different git branches, and effectively competing on the {build} number. Wondering how folks have solved this problem?
This problem gets potentially worse if in Application A you are also in a situation where > 1 maintainer is making changes and not everyone wants to ingest the Development release, so the --pre flag needs to be passed optionally and ideally scoped to just the dependency in question (possible with poetry via the more granular allow-prereleases flag, see docs here).
Editable installs are likely considered out of scope; this setup is a trivial case. In reality the dependency graph could be deeper, and it is often paired with Docker to make it commercially viable when paired with C dependencies, so the complexity of hooking up volume mounts makes that very hard. Also, the user developing Library Z may be different from the person testing Application A.
Whilst I used pip in the examples here, in reality our system uses poetry (and pip in places).
[given the information exchanged in the comments...] Then version control branches should do. If you agree that a certain branch in lib Z will carry the new features app A needs for the purposes of build W, just keeping that branch updated won't require any changes to the configuration of W, and no package building: at that point you are down to being able to switch the requirement list (and the pointers to each branch) in the build process of A.
I don't know how Poetry could handle this, but with a setup.py file it should be trivial: just read different text files listing the requirements (with pointers to specific Git branches or tags), based on a system variable or other configuration you can easily change for the build. Since setup.py is a plain Python code file, one can use if statements along with os.environ to check environment variables, and read the contents of requirements_setup_W.txt to feed the install_requires parameter of the call to setup.
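A rough sketch of that idea (the file naming scheme and the environment variable are made up for illustration):

# setup.py: choose a requirements file based on an environment variable,
# e.g. BUILD_PROFILE=setup_W -> requirements_setup_W.txt
import os
from setuptools import setup

profile = os.environ.get('BUILD_PROFILE', 'default')
with open('requirements_%s.txt' % profile) as f:
    # one requirement per line; entries may point at specific Git branches or tags,
    # e.g. "libZ @ git+https://example.com/libZ.git@feature-branch"
    requirements = [
        line.strip()
        for line in f
        if line.strip() and not line.strip().startswith('#')
    ]

setup(
    name='application-a',
    version='0.1.0',
    install_requires=requirements,
)

Switching builds is then just a matter of exporting a different BUILD_PROFILE before running the build of A.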
Either way, you are saying each package might have more than one "pre-release" state, which would be of interest to different builds. I think managing that through branches in version control would be better than uploading several differing packages to a repository. Or maybe change the uploaded package name for each purpose, so you could have a package named libZ_setupW on the local PyPI (and rotate the requirements in code in setup.py). (Again, the main pain point of Poetry is that it throws away executable code, expecting all build needs to be expressible as config files: they can't be. I just looked around, and there seems to be no "poetry pre-build hook" which could be used to rename (or dynamically rewrite) your pyproject.toml according to the desired setup. One could add a script to do just that, but it would need to be called manually before calling poetry install.)
I have a project that needs some DevOps TLC, so I am finally building my installation script. This will eventually be a package that will be installable by pip locally, but may not end up on PyPI.
It has a dependency on a module called u2py. It is this package, created for U2 Database operations, not this package, for... something else. The one I want is only ever installed by a 3rd-party vendor (Rocket); the one I don't want is on PyPI.
What should be the expected behavior of my package in this case? I will include a blurb about this in my readme doc, but is that sufficient?
I've thought about throwing an exception to identify when the wrong package is present, but that makes me feel weird. It seems that maybe the most pythonic thing is to NOT add this to my install script, and blindly assume import u2py results in a module I can use. If it quacks like a duck, parses DynArrays like a duck, and call()s SUBROUTINEs like a duck, then it's a duck, right? Otherwise, if there is an error the user will just go and actually read the docs.
I've looked at classifiers, but I'm not sure if they apply here.
Ideally there would be a way at install time (in setup.py) to detect whether the package is being installed into a "u2 environment" or not, and to fail the installation (with an appropriate error message) if it is not.
With this solution, you won't be able to provide built distributions (wheels) since they don't execute the setup.py file at install-time, but just publishing source distributions should be fine.
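A minimal sketch of such a check, assuming the vendor's u2py exposes something the PyPI package does not (the DynArray probe below is a guess based on the question, not a verified marker):

# setup.py: refuse to install outside a "u2 environment".
# This only runs for source distributions; wheels skip setup.py at install time.
import sys
from setuptools import setup

try:
    import u2py  # must already be present, installed by the Rocket vendor tooling
    if not hasattr(u2py, 'DynArray'):  # hypothetical marker distinguishing the vendor package
        raise ImportError('wrong u2py')
except ImportError:
    sys.exit('This package requires the Rocket u2py module (not the u2py project on PyPI). '
             'Install this package from within a U2 environment.')

setup(
    name='mypackage',
    version='0.1.0',
    # u2py is intentionally not listed in install_requires
)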
It's a case where it would be nice if Python projects had namespaces (pip install com.rocket.u2py and import com.rocket.u2py as u2py).
From my point of view there are 2 aspects to consider: the project level and the package level.
1. project (distribution package)
I believe it is a bad practice to force alternative download sources onto the end user of your project. By default, pip should download from PyPI and nowhere else, unless the user decides it themselves (via --find-links or similar options, which you could instruct your users to do in your documentation).
Since it is such a niche dependency, I think I would simply not add it to install_requires. I would assume the end users of your project know about the dependency already and are able to install it themselves directly.
Also I don't believe it is possible to check reliably at install-time if the correct dependency is installed, since setup.py does not always run (overriding the bdist_wheel command can help, but probably not 100% effective).
2. package (importable package)
I am not sure a specific action is needed. The code would most likely fail sooner or later naturally, because a module or function is not importable. Which might be okay-ish, maybe?
But probably detecting if the dependency is installed (and it is the correct one), is relatively easy and would provide a better user experience. Either check that some specific modules or functions are importable. Or inspect the meta-data (import importlib_metadata; importlib_metadata.distribution('u2py').metadata['Author']).
In case of an application, I would try to fail gracefully as soon as possible. In case of a library I would try to find one strategic spot to place the check and raise a custom exception (CannotFindU2pyException).
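A sketch of such a check (CannotFindU2pyException as suggested above; the DynArray probe is a hypothetical way to tell the vendor package apart):

# one strategic spot inside the importable package, e.g. mypackage/_u2check.py
class CannotFindU2pyException(ImportError):
    """Raised when the Rocket u2py module is missing or the wrong u2py is installed."""

def require_u2py():
    try:
        import u2py
    except ImportError as exc:
        raise CannotFindU2pyException('u2py is not installed') from exc
    if not hasattr(u2py, 'DynArray'):  # hypothetical marker for the vendor package
        raise CannotFindU2pyException('the installed u2py does not look like the Rocket package')
    return u2py

Calling require_u2py() once at import time of the library's entry module (or at application start-up) makes the failure early and explicit.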
Links:
Prevent pip from caching a package
https://docs.python.org/3/library/importlib.metadata.html#distribution-metadata
https://github.com/pypa/pip/issues/4187#issuecomment-415067034
Equivalent for `--find-links` in `setup.py`
You can specify the url to the package in install_requires using setuptools (requires pip version 18.1 or greater).
Requirement specifiers
Example
setup.py
import setuptools

setuptools.setup(
    name='MyPackage',
    version='1.0.0',
    # ...
    install_requires=[
        'requests @ https://github.com/psf/requests/archive/v2.22.0.zip',
    ],
    # ...
)
and do python setup.py install
Also
Since version 19.1, pip also supports direct references like so:
SomeProject @ file:///somewhere/...
Ref
https://www.python.org/dev/peps/pep-0508/
https://github.com/pypa/pip/pull/4175
I am really confused on how to properly combine versioning for my own extension with the versioning of the upstream library I am interfacing to.
Background in a nutshell
There is an upstream static library (say, version 2_2_0_5), and I am building a Cython extension named foobar to provide an interface to it (say, version 0.5.3). Because the upstream library is huge, I won't be able to cover its interface in a single shot, so I need to version my own extension as well.
Once a new version of upstream is released, I take the latest version of my extension for the previous upstream version and start adjusting it according to the changes in the upstream release, increasing my own version number.
Requirements
I have to provide an easy way to restrict the version of my extension being installed to the given upstream version, something like pip install foobar[upstream==2.2.0.5].
Options I have considered
foobar-2.2.0.5+0.5.3
I have considered using local version labels, like 2.2.0.5+0.5.3, but because the local version label has no semantics, I cannot install a specific version of my extension without explicitly mentioning the version of upstream: there is seemingly no way to tell pip to do something like pip install foobar>=*+0.5.3.
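For reference, this is roughly what that variant looks like in setup.py (versions taken from the example above); everything after the + is an opaque local label, which is why pip cannot range-match on it:

from setuptools import setup

setup(
    name='foobar',
    # upstream release as the public version, my extension version as the local label
    version='2.2.0.5+0.5.3',
)

Note also that public indexes such as PyPI reject local version labels on upload, so this scheme only works against a private index.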
foobar-0.5.3+2.2.0.5
An approach where my version is the "primary" version and the upstream version is my local version label suffers from the inability to pinpoint an exact upstream version.
foobar_2_2_0_5-0.5.3
Another option I had considered is to inline the upstream version into the name of the package itself, so I would have foobar_2_2_0_5-0.5.3 and foobar_2_2_1_9-0.6.8, but this seems extremely ugly to me, and does not offer any filtering capabilities: I cannot install foobar for upstream~=2.2.
foobar[upstream.2.2.0.5]-0.5.3
There are "extras" in setuptools, so package can be installed with foobar[hello,there], but there is no way to version extras — they are mere flags. And I am not sure whether having extra like foobar[upstream.2.2.0.5] would actually do any help, and readability suffers anyway.
I could theoretically make foobar a "virtual package" that would install relevant dependency given correct extra:
from setuptools import setup

setup(
    name='foobar',
    extras_require={
        '2.2.0.5': ['foobar_2_2_0_5'],
        '2.2.1.9': ['foobar_2_2_1_9'],
    },
)
In practice this would effectively remove the ability to specify the foobar version, and is essentially no different from the foobar-2.2.0.5 approach.
What next?
I also have to keep in mind that upstream developers may use my extension with private builds, so it's necessary to make it possible to keep build tags in the upstream version. I don't know how to accomplish that.
Also, reading explanations of the purpose of local version labels, it is clear that my use contradicts the intended one. However, I don't see any other mechanism I could make use of.
Given all this, I am really confused about what to do next: is there a way to solve this without rewriting half of setuptools, or am I missing some obvious solution after all?
How can I build a python distribution RPM that is only dependent on an earlier version of python?
Why? I'm trying to build distribution RPMs for RHEL6/CentOS 6, which only includes Python 2.6, but I am usually building on machines with Python 2.7.
This is an open source project, and I have already ensured that it shouldn't be including any libraries/APIs that are not in 2.6.
I am building the RPMs with:
python setup.py bdist_rpm
setup.py file:
from distutils.core import setup

setup(name='pyresttest',
      version='0.1',
      description='Text',
      maintainer='Not listing here',
      maintainer_email='no,just no',
      url='project url here',
      keywords='rest web http testing',
      packages=['pyresttest'],
      license='Apache License, Version 2.0',
      requires=['yaml', 'pycurl']
      )
(Specifics removed for the url, maintainer, email and description).
The RPM appears to be valid, but when I try to install on RHEL6, I get this error:
python(abi) = 2.7 is needed by pyresttest-0.1-1.noarch
There should be some way to get it to override the default python version to require, or supply a custom SPEC file, but after several hours of fiddling with it, I'm stuck. Ideas?
EDIT: I suppose I should clarify why I'm doing a RPM for python code, instead of just using setuptools or pip: this will hopefully go to production at work, where all deployments are RPM-based and most VMs are still RHEL6. Asking them to adopt another packaging tool is likely to be a non-starter, since our company is closely tied to the RPM format.
Actually, there's no such thing as a generic "rpm package". There are rpm packages for RHEL6, rpm packages for FedoraNN, rpm packages for OpenSUSE-X.Y and so on. And besides, there are Debian, Ubuntu, Arch and Gentoo :)
You have the following possibilities with your Python package:
You may completely avoid rpm, deb and other "native Linux packaging systems", and opt to use a "Python-native" packaging system like pip. Thus you completely avoid the complexity and the compatibility gaps between packaging systems across the various versions and flavours of Linux. And for a package which doesn't "infiltrate" deeply into the "core system", this could be the best solution.
You may continue to use RPM as an archive format for your package but completely turn off automatic dependency calculation. This can be done with the AutoReqProv: no directive in the spec. To be able to work with a customized spec, one may use the --spec-only and --spec-file distutils options. But remember that a package built this way is even worse than a zip from point 1: without proper dependencies it contains less of the necessary metainformation and thus "defames" the whole idea behind Linux packaging systems, which were invented to build consistent systems, to avoid problems like "DLL hell", and to be suitable for automatic maintenance and updates. You may add dependency information manually, via the Requires: <something> tag, but this can become even more hard and boring if you target several Linux platforms at once (a short spec sketch follows this list).
In order to take into account all those complex and boring details and nuances of a particular packaging system, you may create "build sandboxes" with appropriate versions of the necessary Linux flavours. My preferred way to create such sandboxes is to use pre-created "OpenVZ templates", but without OpenVZ per se: simply unpack a given archive into a subdirectory (as root, to preserve permissions), then chroot into the subdirectory, and voila! you've got Debian, RHEL etc. The Fedora people have created Mock for the same purposes, and Mock is likely a more elaborate solution. As @BobMcGee suggests in the comments, one may also consider the Jenkins Docker plugin.
Once you have a build sandbox with the Python distribution, distutils, etc. specific to that system, you may automate the build process using simple scripting, in bash or Python.
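Referring back to point 2: after generating an editable spec with python setup.py bdist_rpm --spec-only, the manual-dependency route looks roughly like this (the spec fragment is illustrative; actual package names and tags depend on the target distro):

# customized spec file: turn off automatic dependency calculation
AutoReqProv: no
# and declare what the package really needs by hand
Requires: python >= 2.6
Requires: PyYAML
Requires: python-pycurl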
That's it.
I do not do very much Python work but have done some RPM packaging. You probably need to do what one would normally do in the RPM's spec file and require a particular Python release, like so ...
# this would be in your spec file
requires: python <= 2.6
Take a look here for more info:
http://ftp.rpm.org/max-rpm/s1-rpm-depend-manual-dependencies.html
One issue that comes up during Pinax development is dealing with development versions of external apps. I am trying to come up with a solution that doesn't involve bringing in the version control systems. The reason being I'd rather not have to install all the possible version control systems on my system (or force that upon contributors) and deal with the problems that might arise during environment creation.
Take this situation (knowing how Pinax works will be beneficial to understanding):
We are beginning development on a new version of Pinax. The previous version has a pip requirements file with explicit versions set. A bug report comes in for an external app that we'd like to get resolved. To get that bug fix into Pinax, the current process is to simply make a minor release of the app, assuming we have control of the app. For apps we don't control, we just deal with the release cycle of the app author, or force them to make releases ;-) I am not too fond of constantly making minor releases for bug fixes, as in some cases I'd like to be working on new features for apps as well. Of course, branching the older version is what we do, and then we backport as needed.
I'd love to hear some thoughts on this.
Could you handle this using the "==dev" version specifier? If the distribution's page on PyPI includes a link to a .tgz of the current dev version (such as both GitHub and Bitbucket provide automatically) and you append "#egg=project_name-dev" to the link, both easy_install and pip will use that .tgz if ==dev is requested.
This doesn't allow you to pin to anything more specific than "most recent tip/head", but in a lot of cases that might be good enough?
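To make that concrete (the project name and URL below are placeholders following the mechanism described above):

# link advertised on the project's PyPI/download page, pointing at a tarball of the current tip:
#   http://github.com/example/project_name/tarball/master#egg=project_name-dev
# and the requirement requested by Pinax:
project_name==dev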
I meant to mention that the solution I had considered before asking was to put up a Pinax PyPI and make development releases on it. We could put up an instance of chishop. We are already using pip's --find-links to point at pypi.pinaxproject.com for packages we've had to release ourselves.
Most open source distributors (the Debians, Ubuntus, MacPorts, et al.) use some sort of patch management mechanism. So something like: import the base source code for each package as released, as a tarball or as an SCM snapshot. Then manage any necessary modifications on top of it using a patch manager, like quilt or Mercurial's Queues. Then bundle up each external package with any applied patches in a consistent format. Or have URLs to the base packages and URLs to the individual patches and have them applied during installation. That's essentially what MacPorts does.
EDIT: To take it one step further, you could then version control the set of patches across all of the external packages and make that available as a unit. That's quite easy to do with Mercurial Queues. Then you've simplified the problem to just publishing one set of patches using one SCM system, with the patches applied locally as above or available for developers to pull and apply to their copies of the base release packages.
EDIT: I am not sure I am reading your question correctly so the following may not answer your question directly.
Something I've considered, but haven't tested, is using pip's freeze bundle feature. Perhaps using that and distributing the bundle with Pinax would work? My only concern would be how different OS's are handled. For example, I've never used pip on Windows, so I wouldn't know how a bundle would interact there.
The full idea I hope to try is creating a paver script that controls management of the bundles, making it easy for users to upgrade to newer versions. This would require a bit of scaffolding though.
One other option may be for you to keep a mirror of the apps you don't control, in a consistent VCS, and then distribute your mirrored versions. This would take away the need for "everyone" to have many different programs installed.
Other than that, it seems the only real solution is what you guys are doing; there isn't a hassle-free way that I've been able to find.