I was wondering what the best way is to resume (or pause and alter) a PEP 517 build.
It comes up quite frequently on my setup (Windows with Cygwin) that some Python package fails to build when running, for example, pip wheel. But when I suspend the build shortly before the error (with Ctrl-Z) and inspect the build directory, makefiles, or sources, I frequently find the problem that causes the error. Even then, I'm often unable to modify the right files at the right time during the pip build (for example, makefiles that are auto-generated and would have to be edited right after generation, but before the build script calls make).
The current example is that I want to build SciPy 1.10.0 on Cygwin. There the Meson build fails because some symbols defined in math.h can't be found in the given configuration. I fixed the build by editing the build.ninja file and ran meson compile myself (successfully). But now I have this directory with built object files and DLLs that gets ignored when I run pip wheel again (it generates a new random build directory and starts the failing build all over). It would be great if I could tell the PEP 517 frontend to skip this build step, pause, wait for me to copy the already-built directory to its new location, and then resume building the wheel.
There are many different ways that Python packages manage their version data. Is there a build-system independent way to extract the version from a package source directory?
I'm aware of the PEP 517-compatible Python package builder build, which does this internally. For instance, in an example source directory for a Python package my_pkg:
$ python -m build --sdist
...
Successfully built my_pkg-1.2.0.tar.gz
So is there a clever way to extract just the version number without building the distribution?
No. The only mandatory hooks currently specified for a PEP 517 build backend are the build hooks:
def build_sdist(sdist_directory, config_settings=None):
...
def build_wheel(wheel_directory, config_settings=None, metadata_directory=None):
...
The build process also generates the package metadata, including the Version field. In the general case, it is necessary to execute a build to get the version info.
Note that it's also fairly common for the version info to be generated dynamically, e.g. sourcing it from the underlying version control system, so discovering the version from the source directory without a build would only be possible in a subset of cases anyway.
Some build backends may provide other ways to get the version, for example in setuptools you could use:
python3 -c 'import setuptools; setuptools.setup()' --version
However, PEP 517 has nothing to say about this, and it will be specific to the build backend.
For a backend-agnostic way to generate the version, you could use build.util.project_wheel_metadata, but this may actually execute a build.
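A minimal sketch of that approach, assuming the build package is installed and ./my_pkg is a hypothetical source directory:

# Resolving the metadata may run the backend's metadata hooks,
# which for some projects amounts to a full build.
from build.util import project_wheel_metadata

metadata = project_wheel_metadata("./my_pkg")
print(metadata["Version"])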
I went through the tutorial on uploading packages to https://test.pypi.org/ and was successful in doing so.
However, $ python setup.py sdist bdist_wheel produces a .whl file and a .tar.gz file in the dist/ directory. twine allows uploading just the .whl file, just the .tar.gz file, or both. I see that many projects on https://pypi.org/ have uploaded both formats.
I want to understand what the best practice is. Is one format preferred over the other? If a .whl file is enough for distributing my code, should I upload the .tar.gz file too? Or is there anything else I am completely missing here?
Best practice is to provide both.
A "built distribution" (.whl) for users that are able to use that distribution. This saves install time as a "built distribution" is pre-built and can just be dropped into place on the users machine, without any compilation step or without executing setup.py. There may be more than one built distribution for a given release -- once you start including compiled binaries with your distribution, they become platform-specific (see https://pypi.org/project/tensorflow/#files for example)
A "source distribution" (.tar.gz) is essentially a fallback for any user that cannot use your built distribution(s). Source distributions are not "built" meaning they may require compilation to install. At the minimum, they require executing a build-backend (for most projects, this is invoking setup.py with setuptools as the build-backend). Any installer should be able to install from source. In addition, a source distribution makes it easier for users who want to audit your source code (although this is possible with built distributions as well).
For the majority of Python projects, turning a "source distribution" into a "built distribution" results in a single pure-Python wheel (which is indicated by the none-any in a filename like projectname-1.2.3-py2.py3-none-any.whl). There's not much difference between this and the source distribution, but it's still best practice to upload both.
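For reference, a minimal sketch of producing and uploading both formats with current tooling (assuming the build and twine packages are installed):

$ python -m build         # writes the sdist and the wheel into dist/
$ twine upload dist/*     # uploads the .tar.gz and the .whl together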
A .tar.gz is a so-called source distribution. It contains the source code of your package and instructions on how to build it, and the target system will perform the build before installing it.
A .whl (spec details in PEP 427) is a built distribution, which means that the target system doesn't need to build it any more. Installing a wheel usually just means copying its contents into the right site-packages directory.
The wheel sounds downright superior, because it is. It is still best practice to upload both wheels and a source distribution, because any built distribution format only works for a subset of target systems. For a package that contains only Python code, that subset is "everything" - people still often upload source distributions though, maybe to be forward compatible in case a new standard turns up[1], maybe to anticipate system-specific extensions that would suddenly require source distributions in order to support all platforms, maybe to give users the option to run a custom build with specific build parameters.
A good example package to observe the different cases is numpy, which uploads a whopping 25 wheels to cover the most popular platforms, plus a source distribution. If you install numpy on any of the supported platforms, you'll get a nice short install taking a couple of seconds, where the contents of the wheel are copied over. If you are on an unsupported platform (such as Alpine), a normal computer will probably take at least 20 minutes to build numpy from source before it can be installed, and you need to have all kinds of system-level dev tools for building C extensions installed. A bit of a pain, but still better than not being able to install it at all.
[1] After all, before wheel there was egg, and the adoption of the wheel format would have been a lot harder than it already was if package maintainers had decided to only upload egg distributions and no sources.
You can upload just the .whl file and install it with the command below.
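For example (a sketch with a hypothetical file name; pip can install directly from a wheel file):

$ pip install my_pkg-1.2.0-py3-none-any.whl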
I want to create a pip-installable Python package (this is important; we already have a mostly-working easy_install version, but we want to switch to pip), which is essentially a wrapper for some C functions. As I understand it, I cannot count on users having compilers installed (e.g. on Windows), so preferably I would precompile these files and upload them to a server. What I would like is for pip to download a suitable file during installation (I would prefer if it wasn't necessary for all these files to be shipped with the package). I've tried reading the docs, but failed to find any solutions for my problem there. Is pip able to download a compiled C file from a server during installation? If so, what is the course of action? Should I perhaps include a Python script, to be run at installation, which would determine the OS and the architecture, and then access a specific link?
You are correct in most of your assumptions. You can offer a source distribution, or sdist, which requires build tools on the target machine in order to be installed. It is often uploaded as a fallback when the platform wheel that you need doesn't exist, or if you want your users to be able to build it themselves.
Speaking of wheels, that is the name of the current standard for binary Python distributions, or bdists. If your package contains code that needs to be compiled, wheels will end up being platform-specific - depending on the build system used, that can be Linux, macOS, or Windows. See for example the sklearn entry on PyPI, which features one wheel per OS for all supported Python versions (plus 32/64-bit support, but that's another story).
If you specify an index (or just a directory that has the wheels in it) where pip should install from, it will automatically make sure that it downloads/installs the correct wheel, which avoids writing platform-specific code into your source code. The hard part is building the wheels.
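A minimal sketch of both options (the index URL and the wheelhouse directory are hypothetical):

$ pip install my_pkg --index-url https://my.index.example/simple/
$ pip install my_pkg --find-links ./wheelhouse/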
Related questions:
Pip install and platform specific wheels
How to avoid building C library with my python package? (ends up building platform specific wheels)
I maintain a Python utility that allows bpy to be installable as a Python module. Due to the huge size of the source code and the length of time it takes to download the libraries, I have chosen to provide this module as a wheel.
Unfortunately, platform differences and Blender runtime expectations make support for this tricky at times.
Currently, one of my big goals is to get the Blender addon scripts directory to install into the correct location. The directory (simply named after the Blender API version) has to exist in the same directory as the Python executable.
Unfortunately, the way that setuptools works (or at least the way that I have it configured), the 2.79 directory is not always placed as a sibling of the Python executable. It fails on Windows platforms outside of virtual environments.
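For reference, the requirement boils down to a check like this (a sketch; 2.79 is the Blender API version mentioned above):

# Blender expects its scripts directory next to the running interpreter.
import os
import sys

scripts_dir = os.path.join(os.path.dirname(sys.executable), "2.79")
print(scripts_dir, os.path.isdir(scripts_dir))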
However, I noticed in setuptools documentation that you can specify eager_resources that supposedly guarantees the location of extracted files.
https://setuptools.readthedocs.io/en/latest/setuptools.html#automatic-resource-extraction
https://setuptools.readthedocs.io/en/latest/pkg_resources.html#resource-extraction
There was a lot of hand-waving and jargon in the documentation, and zero examples. I'm really confused as to how to structure my setup.py file in order to guarantee the resource extraction. Currently, I just label the whole 2.79 directory as "scripts" in my setuptools Extension and ship it.
Is there a way to write my setup.py and package my module so as to guarantee that the 2.79 directory ends up alongside the currently running Python executable when someone runs
py -3.6.8-32 -m pip install bpy
Besides simply "hacking it in"? I was considering writing an install_requires module that would simply move it if possible, but that is meddling with the user's file system and kind of hacky. However, it's the route I am going to go if this proves impossible.
Here is the original issue for anyone interested.
https://github.com/TylerGubala/blenderpy/issues/13
My build process is identical to the process described in my answer here:
https://stackoverflow.com/a/51575996/6767685
Maybe try the data_files option of distutils/setuptools.
You could start by adding data_files=[('mydata', ['setup.py'],)], to your setuptools.setup function call. Build a wheel, then install it and see if you can find mydata/setup.py somewhere in your sys.prefix.
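A minimal sketch of that experiment, assuming a hypothetical package named my_pkg:

# Experimental setup.py: ship setup.py itself as a data file under 'mydata'
# to see where data_files entries land after installing the built wheel.
from setuptools import setup

setup(
    name="my_pkg",
    version="1.2.0",
    packages=["my_pkg"],
    data_files=[("mydata", ["setup.py"])],
)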
In your case, the difficult part will be to compute the actual target directory (mydata in this example). It will depend on the platform (Linux, Windows, etc.), whether it's in a virtual environment or not, whether it's a global or local install (not actually feasible with wheels currently, see the update below), and so on.
Finally, of course, check that everything gets removed cleanly on uninstall. It's a bit unnecessary when working with virtual environments, but very important in the case of a global installation.
Update
It looks like your use case requires a custom step at install time of your package (since the location of the Python interpreter binary relative to sys.prefix cannot be known in advance). This cannot currently be done with wheels. You have seen it yourself in this discussion.
Knowing this, my recommendation would be to follow the advice from Jan Vlcinsky in his comment for his answer to this question:
Post install script after installing a wheel.
Add an extra setuptools console entry point to your package (let's call it bpyconfigure; see the sketch after this list).
Instruct the users of your package to run it immediately after installing your package (pip install bpy && bpyconfigure).
The purpose of bpyconfigure should be clearly stated (in the documentation and maybe also as a notice shown in the console right after starting bpyconfigure) since it would write into locations of the file system where pip install does not usually write.
bpyconfigure should figure out where the Python interpreter is, and where to write the extra data.
The extra data to write should be packaged as package_data, so that it can be found with pkg_resources.
Of course bpyconfigure --uninstall should be available as well!
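A minimal sketch of wiring up such an entry point in setup.py (the bpy_post_install package and its main function are hypothetical):

from setuptools import setup

setup(
    name="bpy",
    version="2.79",
    packages=["bpy_post_install"],
    # Expose bpyconfigure as a console command to run right after pip install.
    entry_points={
        "console_scripts": [
            "bpyconfigure = bpy_post_install.cli:main",
        ],
    },
    # Ship the extra data inside the package so pkg_resources can locate it.
    package_data={"bpy_post_install": ["2.79/*"]},
)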
Using from setuptools.command.install import install, I can easily run a custom post-install script if I run python setup.py install. This is fairly trivial to do.
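For context, this is the pattern in question (a sketch; the print stands in for the real post-install work, and the package name is hypothetical):

from setuptools import setup
from setuptools.command.install import install

class PostInstall(install):
    def run(self):
        install.run(self)  # perform the normal installation first
        print("running custom post-install step")  # e.g. back up the database

setup(
    name="my_pkg",
    version="1.2.0",
    cmdclass={"install": PostInstall},
)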
Currently, the script does nothing but print some text, but I want it to deal with system changes that need to happen when a new package is installed -- for example, backing up the database that the package is using.
I want to generate a Python wheel for my package and then copy it to and install it on a set of deployment machines. However, my custom install script is no longer run on the deployment machines.
What am I doing wrong? Is that even possible?
Do not mix package installation and system deployment
Installation of Python packages (using any sort of packaging tools or formats) should be focused on making that package usable from Python code.
Deployment, which might include database modifications etc., is definitely out of scope and should be handled by other tools like fab, salt-stack, etc.
The fact that something seems fairly trivial does not mean one should do it.
The risk is that you will make your package installation difficult to reuse, as it will be entangled with other things that are not related to pure package installation.
The option to hook into the installation process and modify the environment is even considered by some people a design flaw that caused a big mess in the Python packaging situation - see Armin Ronacher's Python Packaging: Hate, Hate, Hate Everywhere, chapter "PTH: The failed Design that enabled it all".
PEP 427, which specifies the wheel package format, does not leave any provisions for custom pre- or post-installation scripts.
Therefore running a custom script is not possible during wheel package installation.
You'll have to put the custom script somewhere in your package and instruct the user to execute it after installation.