pip package: proper way of compiling code that depends on libclang

I am building a Python library that I want to be installable via pip. The installation process requires a C++ file to be compiled, and that file depends on libclang (in particular, it includes some of the clang-c header files and needs to be linked against libclang.so).
I am assuming that the end user has clang++ installed. However, I don't know where that installation is. For example, when I installed clang++ locally, even though it installed all the headers and the library I need, compiling a blank C++ file that contains
#include <clang-c/CXCompilationDatabase.h>
fails to find the header. I need to explicitly provide the path via a command-line argument or CPLUS_INCLUDE_PATH.
Now I need some way for the script that pip invokes to find those headers. I could obviously ask the user to set CPLUS_INCLUDE_PATH and LD_LIBRARY_PATH to include the clang++ paths before running the installation, but that seems ugly. I could add the headers and the library to my package, but I would rather build against the version the user has. Is there a way to find a clang++ installation if the user has one (or at least if they installed it via apt-get or another package manager)? Or, in general, what is the correct way of solving this issue when building a pip package?
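One common approach is to query llvm-config, which distro packages (e.g. apt's llvm-dev) install alongside clang, for the include and library directories. This is only a sketch under that assumption; the helper names find_llvm_config and clang_build_flags, and the version range tried, are my own:

```python
# Sketch: locate a libclang installation by asking llvm-config, which
# distro packages typically install next to clang.
import shutil
import subprocess

def find_llvm_config():
    """Return the path of an llvm-config binary, or None if none is found."""
    # Try the unversioned name first, then versioned ones (newest first).
    candidates = ["llvm-config"] + [f"llvm-config-{v}" for v in range(20, 6, -1)]
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None

def clang_build_flags(llvm_config):
    """Query the include/library directories needed to build against libclang."""
    def query(flag):
        return subprocess.check_output([llvm_config, flag], text=True).strip()
    return {
        "include_dirs": [query("--includedir")],  # where clang-c/ headers live
        "library_dirs": [query("--libdir")],      # where libclang.so lives
        "libraries": ["clang"],                   # link with -lclang
    }
```

The resulting dictionary can be passed straight through to a setuptools Extension, so the build uses whatever clang version the user actually has.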

Related

Building a PEP517 package with MinGW

I am a package maintainer for a Python package that provides an interface to an academic project. The main code is written in C, and there are some Python scripts to add extra functionality. The core C code is relatively portable (for C) but does not currently build with the MS Visual Studio compiler. I do not have enough Windows users for this to be a problem right now, and I am fine with simply telling them that they need to do some extra work with MinGW to install the Python package (as they need to do to use the core package).
The package uses PEP517 because deprecation warnings from prior standards gave me the impression that I needed to migrate. However, I have now discovered that the --global-option flag (specified by this post) is unavailable for PEP517 packages meaning there is no clear way to specify MinGW as the compiler.
I discovered that there is a --config-settings flag for PEP517 packages, but trying the obvious settings (compiler=mingw64) did not change anything.
How can I tell pip to use mingw64 as the compiler for my package on Windows? If there was a way to put this information in the package itself, that would be even better, but I am happy with just being able to install it with some extra command-line parameters.
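One hedged possibility for putting this in the package itself: distutils-style builds read the [build] section of configuration files such as setup.cfg, so a compiler choice can be recorded alongside setup.py. Whether every PEP 517 build frontend still honours this is not guaranteed, so treat it as a sketch:

```ini
# setup.cfg (next to setup.py)
[build]
compiler = mingw32
```

Note that mingw32 is the distutils name for the MinGW compiler class, even when the toolchain itself is a 64-bit MinGW-w64.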

Python packaging: Boost library as dependency

Assume that someone wants to package a Python (Cython) library that depends on the C++ boost library.
What is the best way to configure the setup.py so that the user is properly informed that the Boost library is required (i.e., apt-get install libboost-dev on Ubuntu, and the equivalent on other OSes)? Or is it better practice to include the Boost library in the Python package distribution?
The question is better asked as: what is the best way to distribute a Python extension with an external library dependency?
This is best dealt with via binary wheel packages.
The user does not need to know anything about setup.py, which is used for building and installing source code; they just need to download and install a binary wheel package.
Including just the header files does not solve the problem: you still need the library itself to build against and link to. It also opens up version-incompatibility issues.
So setup.py need not do anything special here: it just needs to know where to find the headers (a sub-directory of your project, if you bundle them) and which libraries to link against.
The documentation should include instructions on how to build from source, for which more than just boost is needed (python header files, appropriate compilers etc).
Tools like auditwheel then take care of bundling external library dependencies into the binary wheel, so end-users need not have the library installed to use your package.
See also manylinux for distributing binary Python extensions and this demo project.
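As a sketch of the "nothing special in setup.py" point, the extension wiring just names the header locations and link libraries. The module name mypkg._core, the source path, and the choice of boost_system below are illustrative assumptions, not anything from the question:

```python
# Sketch: a setuptools Extension for a Cython/C++ module linking against Boost.
from setuptools import Extension

boost_ext = Extension(
    "mypkg._core",
    sources=["src/core.cpp"],       # Cython-generated or hand-written C++
    include_dirs=["include"],       # bundled headers, if you ship any
    libraries=["boost_system"],     # link against the system libboost_system
    language="c++",
)

# In setup.py you would then pass it on:
#   setup(name="mypkg", version="0.1.0", ext_modules=[boost_ext])
```

After `python -m build` produces a wheel, running auditwheel over it bundles the linked Boost shared libraries into the wheel itself.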

How to use precompiled headers with python wheels?

I'm writing a Python package which I would like to distribute as a wheel, so it can be easily installed via pip install.
As part of the functionality of the package itself, I compile C++ code. For that, I distribute a set of header files with the package for the C++ code to include. Now, in order to speed up those compilation operations, I'd like to provide a precompiled header as part of the package.
I am able to do this if the package is installed via python setup.py install because I can add a step after the installation that generates the precompiled header in the installation directory (some-virtualenv/lib/python3.5/site-packages/...) directly.
But now I can't figure out how to do this when I distribute a wheel. It seems that the installation process of a wheel is supposed to be a simple unpack-and-copy, and it provides no way of performing extra configuration on the installed package (such as generating that precompiled header).
As part of my search of how to do this I stumbled across this, but no solution is offered there.
Is there any way around this or am I forced to use a source distribution for my package?
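One workaround consistent with wheels being a plain unpack-and-copy is to generate the precompiled header lazily, at first use rather than at install time. A minimal sketch, assuming g++ and GCC-style .gch files (the helper names pch_command and ensure_pch are hypothetical; adapt the command for your toolchain):

```python
# Sketch: build the precompiled header on first use, since a wheel install
# offers no post-install hook.
import os
import subprocess

def pch_command(header, compiler="g++"):
    """Build the command that precompiles `header` into `header`.gch."""
    return [compiler, "-x", "c++-header", header, "-o", header + ".gch"]

def ensure_pch(header):
    """Create the precompiled header next to the installed files if missing."""
    gch = header + ".gch"
    if not os.path.exists(gch):
        subprocess.check_call(pch_command(header))
    return gch
```

Calling ensure_pch() from the code path that triggers your in-package compilation means the cost is paid once per environment instead of at install time, which also keeps the wheel itself toolchain-independent.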

Centos RPM with python virtual env

I am trying to deploy my Python virtualenv as a CentOS RPM.
These are the steps I have taken:
Created a virtual environment with the required dependencies.
One of the requirements is pyOpenSSL.
Built the RPM package.
Now, when it is installed on a fresh CentOS instance, I get the error
'No module named OpenSSL'.
Is there a different procedure for the pyOpenSSL module, or do we need to install openssl-devel and openssl explicitly on the new machine?
First, I'm pretty sure pyOpenSSL only requires openssl-devel as a build dependency, not a runtime dependency. So, as long as you're distributing a pre-built copy (whether by tarring up a portable virtualenv, using a wheel file in a custom repo, or otherwise), openssl-devel shouldn't be a problem, only openssl.
But openssl is a problem. pyOpenSSL is a pure-Python wrapper; all it does is dlopen the .so file and call a bunch of functions out of it. Without the .so, it can't do anything useful at all.
I'm a bit surprised that any (non-embedded) linux distro doesn't come with OpenSSL. If CentOS comes with openssl pre-installed, then you already have all your dependencies, and there's nothing to do here.
But, given that, as far as I know, OpenSSL still isn't part of LSB, this is at least a conceivable problem, so let's deal with it.
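As a quick sanity check for that conceivable problem, the presence of a system OpenSSL can be probed at runtime with ctypes.util.find_library (the has_openssl helper name is my own):

```python
# Sketch: probe whether the machine has an OpenSSL shared library that
# pyOpenSSL/cryptography could load.
import ctypes.util

def has_openssl():
    """Return True if a system libssl or libcrypto can be located."""
    return (ctypes.util.find_library("ssl") is not None
            or ctypes.util.find_library("crypto") is not None)
```

A check like this could run in a post-install script to fail early with a clear message instead of an ImportError deep inside pyOpenSSL.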
As a brief sidebar: If you switched to a different OpenSSL wrapper (there are at least three major alternatives) that's written in C (or SWIG or SIP) rather than Python and links OpenSSL directly rather than by cffi, everything would be a lot simpler. For example, I believe with M2Crypto, all you have to do is create a static-only build of OpenSSL, then build M2Crypto with --openssl=/path/to/static/openssl, and you're done.
But if you're set on pyOpenSSL, is there any way around this? Yes, but it's not going to be easy.
The good news is that pyOpenSSL isn't a wrapper around OpenSSL itself, but around the lower-level cryptography module. And cryptography uses cffi rather than ctypes, which means it can be distributed with a pre-built helper module, which can be statically linked against OpenSSL.
Unfortunately, while cryptography does have built-in support for pre-building its helpers with statically linked OpenSSL on Windows, last time I checked, it did not have that support on other platforms. That means you're either going to need to add that support yourself, or trick it in some way.
If you don't know much about this stuff, you're almost certainly going to need step-by-step help with this. Read the docs on cffi, and both the Building cryptography on Linux docs and the bundled INSTALL file, and make sure you understand static linking and GNU ld, and then go to the cryptography-dev mailing list. Or maybe go to the bug tracker and search for or create an issue for static linking OpenSSL on linux.
But if you know what you're doing, here's the basic idea:
To build in static linking support, look in the OpenSSL binding and binding utilities source to see what it does on Windows and do the equivalent on Linux.
To work around it, what you want to do is build OpenSSL locally, copy it inside the cryptography source tree, trick cryptography into linking to the OpenSSL shared libraries (libssl.so and friends) with relative rather than absolute paths, and modify its setup.py to copy the .so files into the package itself.
If you want to get really hacky, instead of modifying setup.py, just copy the .so files into the package manually after installation. (This will obviously only work if you're using a distribution mechanism based on bundling up the virtualenv, rather than rebuilding it.)
Another option is to use an executable packager to automatically bundle any dependent non-system libraries into the executable. Unfortunately, the only one I know of that does that kind of thing on linux is PyInstaller, and last I checked it didn't know how to handle cryptography/pyOpenSSL, so again you'd have to figure it out and extend PyInstaller yourself. If you want to go this way, take a look at the other recipes there. Also, I believe py2app can handle pyOpenSSL; while it's Mac-specific, looking at what it does may help you do the same on linux.

What exactly does distutils do?

I have read the documentation, but I don't understand.
Why do I have to use distutils to install Python modules?
Why can't I just save the modules somewhere on the Python path?
You don't have to use distutils. You can install modules manually, just like you can compile a C++ library manually (compile every implementation file, then link the .obj files) or install an application manually (compile, put it into its own directory, add a shortcut for launching). It just gets tedious and error-prone, as does every repetitive task done manually.
Moreover, the manual steps I listed in the examples are pretty optimistic - often, you want to do more. For example, PyQt adds the .ui-to-.py compiler to the path so you can invoke it via the command line.
So you end up with a stack of work that could be automated. This alone is a good argument.
Also, the devs would have to write installation instructions. With distutils etc., you only have to specify what your project consists of (and fancy extras only if you need them) - for example, you don't need to tell it to put everything in a new folder in site-packages, because it already knows that.
So in the end, it's easier for developers and for users.
Which Python modules? To install a Python package, if it exists on PyPI you should do:
pip install <name_of_package>
If not, download the .tar.gz (or whatever archive it ships as), see if you find a setup.py, and run it like this:
python setup.py install
Or, if you want to install it in development mode (you can change the package and see the result without installing it again):
python setup.py develop
This is the usual way to distribute a Python package (the setup.py), and this setup.py is the one that calls distutils.
To summarize: distutils is a Python package that helps developers create a package installer that will build and install a given package by just running setup.py install.
So, basically, what distutils does (listing only the important stuff):
It searches for the package's dependencies (and installs them automatically).
It copies the package modules into site-packages, or just creates a symlink in develop mode.
You can create an egg of your package.
It can also run tests over your package.
You can use it to upload your package to PyPI.
If you want more detail, see http://docs.python.org/library/distutils.html
You don't have to use distutils to get your own modules working on your own machine; saving them in your python path is sufficient.
When you decide to publish your modules for other people to use, distutils provides a standard way for them to install your modules on their machines. (The "dist" in "distutils" means distribution, as in distributing your software to others.)
