Compile python wheel with SWIG for manylinux without linking to libpython - python

I'm trying to dig a bit deeper into binary packaging for Python with the wheel format, and I thought I might try wheelifying one of my dependencies as an exercise.
To maximize the reusability of the resulting package I want to build it using the manylinux containers, which compile the binaries included in the package against the lowest common denominator I'm likely to encounter. However, I'm unsure how to avoid linking against a specific libpython.so. The docs describe this in a rather abstract way:
Note that libpythonX.Y.so.1 is not on the list of libraries that a manylinux1 extension is allowed to link to. Explicitly linking to libpythonX.Y.so.1 is unnecessary in almost all cases: the way ELF linking works, extension modules that are loaded into the interpreter automatically get access to all of the interpreter's symbols, regardless of whether or not the extension itself is explicitly linked against libpython. Furthermore, explicit linking to libpython creates problems in the common configuration where Python is not built with --enable-shared. In particular, on Debian and Ubuntu systems, apt install pythonX.Y does not even install libpythonX.Y.so.1, meaning that any wheel that did depend on libpythonX.Y.so.1 could fail to import.
So I'm wondering: SWIG generates the Python wrappers for the C functions and obviously makes use of the libpython functions, so the linker will try to link in libpython, which manylinux doesn't provide. If I understand correctly, I am supposed to somehow leave the references dangling/unlinked, and the interpreter will provide those symbols at runtime when loading the shared library - but how am I supposed to compile this in the first place?
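(A minimal sketch of what this looks like in practice, with hypothetical module/file names: on Linux, setuptools links extension modules with cc -shared and does not pass -lpythonX.Y, so the Python symbols simply remain undefined in the resulting .so, and the dynamic linker resolves them from the interpreter process at import time. Unlike executables, shared libraries are allowed to have undefined symbols at link time.)

from setuptools import setup, Extension

# Hypothetical SWIG extension; no explicit libpython is linked on Linux.
setup(
    name="example",
    version="0.1",
    py_modules=["example"],
    ext_modules=[
        Extension(
            "_example",                        # SWIG convention: _<module>
            sources=["example.i", "example.c"],
            swig_opts=["-python"],
        )
    ],
)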

Related

Programmatically obtain Python install paths in prefix, without distutils

As distutils is deprecated and removed from Python in versions after 3.10, and setuptools will not be added to the stdlib, I want to replace an existing setup.py recipe for building/installing a Cython extension to a C++ library (i.e. not primarily a Python package, not run in a venv, etc.) with some custom code.
The Cython part is working fine, and I just about managed to construct an equivalent call to the C++ compiler from that previously executed by distutils, by using config-var info from sysconfig... though the latter was very trial and error, with no documentation or particular consistency to the config-var collection as far as I could tell.
But I am now stuck on identifying which directory to install my built extension .so into, within the target prefix of my build. Depending on the platform, the path scheme in use, and the prefix itself, the subdirs could be under lib or lib64, in a pythonX.Y subdir of some sort, and in a final site-packages, dist-packages or other directory. This decision was previously made by distutils, but I can't find any equivalent code to return such path decisions in other stdlib packages.
Any suggestions of answers or best-practice approaches? (Other than "use setuptools", please!)
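(For what it's worth, a minimal sketch of how sysconfig can answer the install-path part, assuming a hypothetical target prefix: sysconfig.get_path() accepts a vars mapping that overrides base/platbase, which re-roots the active scheme onto your own prefix.)

import sysconfig

prefix = "/opt/mylib"  # hypothetical install prefix

# 'platlib' is the scheme key for platform-specific modules (.so files).
scheme = sysconfig.get_default_scheme()  # Python >= 3.10
platlib = sysconfig.get_path(
    "platlib", scheme, vars={"base": prefix, "platbase": prefix}
)
print(platlib)  # e.g. /opt/mylib/lib/python3.11/site-packages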

How to edit and use a python package containing an .so file

I have installed a Python package as usual using pip install package_name. It contains the main/most relevant file as a .so extension. I want to MODIFY it and use it for my work. Is that even possible? Is there underlying source code for the .so file that comes with the package, or is it a standalone binary?
Go to the site whence pip fetches things, find your package, and follow the link to its source distribution. Building that yourself often requires more tools and expertise than using pip, which is the cost of customization. (The GPL, despite being more restrictive in this peculiar sense than most free licenses, certainly allows merely providing Internet access to the sources for binaries so distributed.)
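(A sketch of automating that fetch with pip itself; "package_name" is a placeholder.)

import subprocess

# Download the source distribution rather than a wheel, without dependencies.
subprocess.run(
    ["pip", "download", "--no-binary", ":all:", "--no-deps", "package_name"],
    check=True,
)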

Importing cython generated *.so-module with another python-version or on another OS

How should a file myModule.cpython-35m-x86_64-linux-gnu.so be imported in python? Is it possible?
I tried the regular way:
import myModule
and the interpreter says:
`ModuleNotFoundError: No module named 'myModule'`
This is software that I can't install on the cluster I am working on, so I just extracted the .deb package; it does not have a wheel file or a structure to install.
It is problematic to use a C extension built for one Python version with another Python version. Normally (at least for Python 3) there is a mechanism in place to differentiate C extensions for different Python versions, so they can coexist in the same directory.
In your example, the suffix is cpython-35m-x86_64-linux-gnu, so this C extension will be picked up by CPython 3.5 on x86_64 Linux. If you try to import this extension with another Python version or on another platform, the module isn't visible and ModuleNotFoundError is raised.
It is possible to see which suffixes are accepted by the current Python version, e.g. via:
>>> import _imp
>>> _imp.extension_suffixes()
['.cpython-36m-x86_64-linux-gnu.so', '.abi3.so', '.so']
A possibility is to use the stable C API, which can be used with multiple Python versions without recompilation. Cython started to support it in version 3.0 (see this PR); see also this SO post about setuptools and the stable C API.
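(A minimal sketch of the setuptools side of that, assuming a hypothetical single-file extension named "spam": pinning Py_LIMITED_API to a minimum version and tagging the extension as abi3 lets one binary serve that CPython version and newer ones.)

from setuptools import setup, Extension

setup(
    name="spam",
    version="1.0",
    ext_modules=[
        Extension(
            "spam",
            sources=["spam.c"],
            define_macros=[("Py_LIMITED_API", "0x03050000")],  # CPython >= 3.5
            py_limited_api=True,  # produces spam.abi3.so on CPython/Linux
        )
    ],
)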
One might want to be clever and rename the extension to a plain .so so it can be picked up by the finder - this can/does work for some Python-version combinations on some platforms for some extensions - yet this approach cannot be sustained in the long run and is not the right thing to do.
The right thing to do is to build the C extension for/with the right Python version on the right OS/platform, or to use the right wheel (or use the stable C API).
In general, a C extension built for one Python version (say Python A.B) cannot be used by another Python version (say Python C.D), because those extensions/modules are linked against a version-specific Python library, and the needed functionality might no longer/not yet be present in the library of another version.
This is different from *.py files and more similar to *.pyc files, which cannot be used with a different version.
While PEP 3147 regulates the suffixes of *.pyc files, PEP 3149 does the same for C extensions. PEP 3149 is however not quite state-of-the-art, as some of the problems were fixed only in Python 3.5; the whole discussion can be found here.

Should I bundle C libraries with my Python application?

If I have a Python package that depends on some C libraries (like say the Gnu Scientific Library (GSL) for numerical computations), is it a good idea to bundle the library with my code?
I'd like to make my package as easy to install as possible for users and I don't want them to have to download C libraries by hand and supply include-paths. Also I could always ensure that the version of the library that I ship is compatible with my code.
However, is it possible that there are clashes if the user has the library installed already, or are there any other reasons why I shouldn't do this?
I know that I can make it easier for users by just providing a binary distribution, but I'd like to avoid having to maintain binary distributions for all possible OSs. So, I'd like to stick to a source distribution, but for the user (who proudly owns a C compiler) installation should be as easy as python setup.py install.
Distribution is one of the hard parts for any software project. Java and .NET lift part of this burden by defining a standard runtime and then just saying "just distribute everything else." Of course there's a drawback: everything must be rewritten in a language supported by the runtime - as soon as you want to use native code, you lose all the advantages.
That's harder in Python, as it is in Ruby, C, C++ and other languages, because they usually leverage existing native libraries.
Generally speaking:
Make it possible to get a source distribution (sdist), e.g. via pypi.python.org. Correctly set your install_requires (probably you'll require Python bindings for GSL, not GSL itself). Use the standard setuptools/distribute layout. This will let anyone - say, a package maintainer for any distro - pick up your software and package it; a minimal sketch follows below.
Additionally, consider providing a full-blown installable package for your audience. You don't have to support all distros and operating systems; pick one or two that you expect will be used most. Tools like PyInstaller will let you create an installable, runnable package for many operating systems, but especially on Linux you might want the user to install the distribution's own version of transitive deps (libgsl?) - you'll need a full-blown deb or rpm package to satisfy that. Again, don't try to support each and every distro or you'll go mad; support what you use most, and let other users help you with other packaging needs.
Also take a look at Python Packaging Guide
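(The sketch mentioned above; "mypackage" and "pygsl" stand in for your package and a hypothetical GSL binding.)

from setuptools import setup, find_packages

setup(
    name="mypackage",
    version="0.1",
    packages=find_packages(),
    install_requires=["pygsl"],  # depend on the binding, not on GSL itself
)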
You could have two separate branches of the source, one containing the libraries and one that doesn't. That way you can explicitly warn your users in case they already have the libraries installed. Another solution (if the licences of the libraries allow it) is to wrap them up in a single file.
I think there's no unique solution, but these are the ideas I could think of so far.
Good luck
You can use virtualenv to create a private Python environment for your application. This avoids conflicts with other libraries. It is best if you package modules and dependencies such as your libraries using Distribute. Distutils is something else that is worth researching.

Building a ctypes-"based" C library with distutils

Following this recommendation, I have written a native C extension library to optimise part of a Python module via ctypes. I chose ctypes over writing a CPython-native library because it was quicker and easier (just a few functions with all tight loops inside).
I've now hit a snag. If I want my work to be easily installable with python setup.py install, then distutils needs to be able to build my shared library and install it (presumably into /usr/lib/myproject). However, this is not a Python extension module, and so as far as I can tell, distutils cannot do this.
I've found a few references to other people with this problem:
Someone on numpy-discussion with a hack back in 2006.
Somebody asking on distutils-sig and not getting an answer.
Somebody asking on the main python list and being pointed to the innards of an existing project.
I am aware that I can do something native and not use distutils for the shared library, or indeed use my distribution's packaging system. My concern is that this will limit usability as not everyone will be able to install it easily.
So my question is: what is the current best way of distributing a shared library with distutils that will be used by ctypes but otherwise is OS-native and not a Python extension module?
Feel free to answer with one of the hacks linked to above if you can expand on it and justify why that is the best way. If there is nothing better, at least all the information will be in one place.
The distutils documentation here states that:
A C extension for CPython is a shared library (e.g. a .so file on Linux, .pyd on Windows), which exports an initialization function.
So the only difference from a plain shared library seems to be the initialization function (besides a sensible file naming convention, which I don't think poses any problem here). Now, if you take a look at distutils.command.build_ext, you will see it defines a get_export_symbols() method that:
Return the list of symbols that a shared extension has to export. This either uses 'ext.export_symbols' or, if it's not provided, "PyInit_" + module_name. Only relevant on Windows, where the .pyd file (DLL) must export the module "PyInit_" function.
So using it for plain shared libraries should work out of the box, except on Windows. But it's easy to fix that too. The return value of get_export_symbols() is passed to distutils.ccompiler.CCompiler.link(), whose documentation states:
'export_symbols' is a list of symbols that the shared library will export. (This appears to be relevant only on Windows.)
So not adding the initialization function to the export symbols will do the trick. For that you just need to trivially override build_ext.get_export_symbols().
Also, you might want to simplify the module name. Here is a complete example of a build_ext subclass that can build ctypes modules as well as extension modules:
from distutils.core import setup, Extension
from distutils.command.build_ext import build_ext


class build_ext(build_ext):
    def build_extension(self, ext):
        # Remember whether the extension being built is a ctypes one.
        self._ctypes = isinstance(ext, CTypes)
        return super().build_extension(ext)

    def get_export_symbols(self, ext):
        # Don't force a PyInit_* symbol for plain shared libraries (Windows).
        if self._ctypes:
            return ext.export_symbols
        return super().get_export_symbols(ext)

    def get_ext_filename(self, ext_name):
        # Give ctypes libraries a plain '.so' name on every platform.
        if self._ctypes:
            return ext_name + '.so'
        return super().get_ext_filename(ext_name)


class CTypes(Extension):
    pass


setup(name='testct', version='1.0',
      ext_modules=[CTypes('ct', sources=['testct/ct.c']),
                   Extension('ext', sources=['testct/ext.c'])],
      cmdclass={'build_ext': build_ext})
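(For completeness, a usage sketch: after python setup.py build_ext --inplace, the ctypes library built above should carry a plain .so suffix and can be loaded directly.)

import ctypes

ct = ctypes.CDLL("./ct.so")  # built from testct/ct.c above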
I have set up a minimal working Python package with a ctypes extension here:
https://github.com/himbeles/ctypes-example
which works on Windows, Mac, and Linux.
It takes the approach of memeplex above of overriding build_ext.get_export_symbols() and forcing the library extension to be the same (.so) for all operating systems.
Additionally, a compiler directive in the C/C++ source code ensures proper export of the shared library symbols on Windows vs. Unix.
As a bonus, the binary wheels are automatically compiled by a GitHub Action for all operating systems :-)
Some clarifications here:
It's not a "ctypes-based" library. It's just a standard C library, and you want to install it with distutils. Whether you use a C extension, ctypes or Cython to wrap that library is irrelevant to the question.
Since the library apparently isn't generic, but just contains optimizations for your application, the recommendation you link to doesn't apply to you; in your case it is probably easier to write a C extension or to use Cython, in which case your problem is avoided.
As for the actual question: you can always use your own custom distutils command, and in fact one of the discussions linked to exactly such a command, the OOF2 build_shlib command, which does what you want. In this case though you want to install a custom library that really isn't shared, so I think you don't need to install it into /usr/lib/yourproject; you can install it into the package directory under /usr/lib/python-x.x/site-packages/yourmodule, together with your Python files. But I'm not 100% sure of that, so you'll have to try.
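(A sketch of the loading side under that layout, with hypothetical names: the shared library sits next to the Python module inside site-packages, so it can be located relative to __file__ rather than via a system-wide path.)

import ctypes
import pathlib

# Load the library installed alongside this module, not from /usr/lib.
_libpath = pathlib.Path(__file__).parent / "libmyproject.so"
_lib = ctypes.CDLL(str(_libpath))

_lib.my_fast_function.argtypes = [ctypes.c_int]
_lib.my_fast_function.restype = ctypes.c_int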
