Changing output directory of library created through pybind11_add_module

I am using CMake to build some Python bindings for my code using pybind11. It works well, but the modules get compiled into the main build directory. I would like them to be built in a build/python directory instead. I am trying the following:
pybind11_add_module(myModule src/main.cpp)
set_target_properties(myModule PROPERTIES RUNTIME_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/python")
But it is not working as intended: myModule is still built in the build directory, as if set_target_properties were not being called.
In the official pybind11_add_module documentation it is written:
This function behaves very much like CMake's builtin add_library (in fact, it's a wrapper function around that command). It will add a library target called <name> to be built from the listed source files. In addition, it will take care of all the Python-specific compiler and linker flags as well as the OS- and Python-version-specific file extension. The produced target can be further manipulated with regular CMake commands.
So I assumed that set_target_properties could be used to indicate a different output directory after it, is this not the case? If not, how can this be done?
Thank you in advance!

A pybind11 module is a library of either SHARED or MODULE type.
The output directory for SHARED libraries is specified via LIBRARY_OUTPUT_DIRECTORY on all platforms except Windows, where the .dll is a runtime artifact and goes to RUNTIME_OUTPUT_DIRECTORY.
The output directory for MODULE libraries is specified via LIBRARY_OUTPUT_DIRECTORY on all platforms, without exception.
A detailed description of the output artifact types in CMake and their corresponding OUTPUT_DIRECTORY properties can be found in the documentation.
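Since pybind11_add_module creates a MODULE library by default, setting LIBRARY_OUTPUT_DIRECTORY should do what you want; a minimal sketch based on the snippet from the question:

pybind11_add_module(myModule src/main.cpp)
# MODULE libraries honor LIBRARY_OUTPUT_DIRECTORY, not RUNTIME_OUTPUT_DIRECTORY
set_target_properties(myModule PROPERTIES
    LIBRARY_OUTPUT_DIRECTORY "${CMAKE_BINARY_DIR}/python")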

Related

What's the difference between Bazel's py_library and py_binary rules

Bazel's documentation explains how to use each, but I don't understand the difference in their nature. Aren't all python files executables anyway? What difference does it make to specify it is a binary?
Executable or not?
Aren't all python files executables anyway?
If we set file permissions aside and judge purely by behavior, the answer to that question is: not necessarily. In practice, it depends on what is written in the Python file.
What do I mean by "observing the behavior"?
Python files can be executable in the sense that they perform "actual work": calling functions, performing calculations, handling files, logging, etc.
Keep in mind that Python files may also serve a different purpose: simply defining the functions, enums, or classes that are to be used from separate Python files.
To understand the distinction drawn above, my recommendation is to read more about __main__ in Python.
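A minimal illustration of that distinction (the file name is hypothetical):

# greetings.py - passive when imported, "executable" when run directly
def hello():
    print("Hello!")

if __name__ == "__main__":  # true only when the file is run directly, not when imported
    hello()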
Providing functionality
Now imagine a use case in which you need to provide a set of functions, enums, or classes for someone else to use, without using them in the file in which they are written. Most of the time when there is such a requirement, the answer is to develop a library that provides all the necessary functionality.
py_library
To create an artifact that is "passive" and meant to be used by other executables, you can use the py_library Bazel Python rule. A py_library is not an executable; it creates an artifact that can be used by other artifacts such as py_test or py_binary.
Creating a py_library allows you to use that library in multiple Bazel targets and provides a clear message to the developers that the current target has a dependency on the specified py_library.
py_binary
py_binary is the executable that is intended to be run and to perform a certain set of tasks.
TL;DR
What difference does it make to specify it is a binary?
Use py_binary when you need to actually execute the code written in the Python source files
Use py_binary when you need to execute functionality defined in a separate file (e.g. perform setup/cleanup), as described in the second example for py_binary
Use py_library when you need to provide passive functionality that will be used from a separate Python (or even different-language) file/artifact; see the BUILD sketch below
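A minimal sketch of a BUILD file wiring the two together (target and file names are hypothetical):

py_library(
    name = "greetings",        # passive library target
    srcs = ["greetings.py"],   # only defines functions/classes
)

py_binary(
    name = "app",              # executable target containing the entry point
    srcs = ["app.py"],
    deps = [":greetings"],     # declares the dependency on the library
)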
I also recommend watching this YouTube video, which gives good insight into when and how to use, and how to "connect", py_binary, py_library, and py_test.
A py_library is a group of Python source files that can be used by other Bazel targets as a dependency. It is not itself an executable. A py_binary, on the other hand, is an independently executable Python application: it is built from its source files (and any py_library targets it depends on) and can be run directly.
The distinction is analogous to that between a static library and an executable in other programming languages: an executable is a standalone program that can be run directly, while a static library is a collection of object files meant to be linked into executables.
Any .py file can be fed to the Python interpreter directly, but not all Python files are written to be executed on their own; many exist only to be imported and used as libraries by other Python programs. In Bazel, such files are declared with the py_library rule, whereas directly executable Python applications are declared with the py_binary rule.

Python Extension Dll Installation

I have a large program written in C++ that I wish to make usable via Python. I've written a python extension to expose an interface through which python code can call the C++ functions. The issue I'm having with this is that installing seems to be nontrivial.
All documentation I can find seems to indicate that I should create a setup.py which creates a distutils.core.Extension. In every example I've found, the Extension object being created is given a list of source files, which it compiles. If my code was one or two files, this would be fine. Unfortunately, it's dozens of files, and I use a number of relatively complicated visual studio build settings. As a result, building by listing .c files seems to be challenging to say the least.
I've currently configured my Python extension to build as a .dll and link against python39.lib. I tried changing the extension to .pyd and including the file in a MANIFEST.in. After I created a setup.py and ran it, it created a .egg file that I verified did include the .pyd. However, after installing it, when I imported the module into Python, the module was completely empty (and I verified that the PyInit_[module] function was not called). Python dll Extension Import says that I can import the dll if I change the extension to .pyd and place the file in the DLLs directory of Python's installation. I've encountered two problems with this.
The first is that it seems to me that it's not very distributable like this. I'd like to package this into a python wheel, and I'm not sure how a wheel could do this. The second is even more problematic - it doesn't exactly work. It calls the initialization function of my extension, and I've verified in WinDbg that it's returning a python module. However, this is what I always get from the console.
>>> import bluespawn
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
SystemError: initialization of bluespawn did not return an extension module
The Python documentation has a section on publishing binary extensions, but for the past four years it has been left as a placeholder. The GitHub issue linked here isn't that helpful either; it boils down to either "use distutils to build" or "use enscons to build". But since my build is a fairly complicated procedure, completely rewriting it to use enscons is less than desirable, to say the least.
It seems to me like placing the file in the DLLs directory is the wrong way of going about this. Given that I have a DLL and making setuptools compile everything itself seems infeasible, how should I go about installing my extension?
For reference, here's my initialization function, in case that's incorrect.
PyModuleDef bsModule{ PyModuleDef_HEAD_INIT, "bluespawn", "Bluespawn python bindings", -1, methods };

PyMODINIT_FUNC PyInit_bluespawn() {
    PyObject* m;
    Py_Initialize();
    PyEval_InitThreads();
    PyGILState_STATE state = PyGILState_Ensure(); // Crashes without this. Call to PyEval_InitThreads() required for this.
    m = PyModule_Create(&bsModule);
    PyGILState_Release(state);
    Py_Finalize();
    return m;
}
The python interface is available here: https://github.com/ION28/BLUESPAWN/blob/client-add-pylib/BLUESPAWN-win-client/src/user/python/PythonInterface.cpp
EDIT: I have a working solution that I am sure is not best practice. I created a very small C file that simply passes all calls it receives onto the large DLL I've already created. The C file is responsible for initializing the module, but everything else is handled inside the DLL. It works, but it seems like a very bad way of doing things. What I'm looking for is a better way of doing this.
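A rough sketch of what such a shim can look like (bluespawn_get_methods is a hypothetical function assumed to be exported by the existing large DLL; the real forwarding code would mirror whatever the DLL actually exposes):

// shim.c - the only translation unit setuptools needs to compile
#include <Python.h>

// assumed to be exported by the pre-built DLL that does the real work
extern PyMethodDef* bluespawn_get_methods(void);

static PyModuleDef shim_def = {
    PyModuleDef_HEAD_INIT, "bluespawn", "Bluespawn python bindings", -1, NULL
};

PyMODINIT_FUNC PyInit_bluespawn(void) {
    shim_def.m_methods = bluespawn_get_methods();  // forward the method table from the DLL
    return PyModule_Create(&shim_def);
}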
Let me try and divide your post into two separate questions:
How to package a C++ library with a non-trivial compilation process using setuptools
Is it possible to distribute a python package with a precompiled library
1. How to package a C++ library with a non-trivial compilation process using setuptools
It is possible. I was quite surprised to see how many ways setuptools offers to override the compilation process; see the documentation here. For example, you can use the keyword argument extra_compile_args to pass extra arguments to the compiler.
In addition, since setup.py is a Python file, you can relatively easily write some code to automatically collect all the files needed for compilation. I've done this myself in a project (on GitHub), and it worked quite well for me.
Here's some code from the setup.py:
libinjector = Extension('pyinjector.libinjector',
                        sources=[str(c.relative_to(PROJECT_ROOT))
                                 for c in [LIBINJECTOR_WRAPPER, *LIBINJECTOR_SRC.iterdir()]
                                 if c.suffix == '.c'],
                        include_dirs=[str(LIBINJECTOR_DIR.relative_to(PROJECT_ROOT) / 'include')],
                        export_symbols=['injector_attach', 'injector_inject', 'injector_detach'],
                        define_macros=[('EM_AARCH64', '183')])
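To show how the extra_compile_args keyword mentioned above fits into the same picture, here is a minimal, self-contained sketch (module name, source list, and flags are hypothetical):

from setuptools import setup, Extension

ext = Extension(
    'mymodule',                      # hypothetical module name
    sources=['src/mymodule.c'],      # hypothetical source list
    extra_compile_args=['/O2'],      # flags passed straight to the compiler (MSVC-style here)
)

setup(name='mymodule', version='0.1', ext_modules=[ext])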
2. Is it possible to distribute a python package with a precompiled library
I understand from your edit that you've managed to get it to work, but I'll say a few words anyway. Shipping precompiled binaries in your source distribution is possible, and it is possible to release your manually compiled binaries in a wheel file as well, but it is not recommended.
The main reason is compatibility with the target architecture. First, you would have to include two DLLs in your distribution, one for x64 and one for x86. Second, you might lose some nice optimizations, because you have to instruct the compiler to ignore optimizations available only on specific CPU types (note that this applies to normal wheel distributions as well). If you compile against the Windows SDK, you'll probably want to match the user's SDK version too. In addition, including two DLLs in your release might grow it to an awkward size for a source distribution.

Statically linking Python, but still supporting external .pyd modules

I'm looking at statically linking Python with my application. The reason for this is that in some test cases I've seen a 10% speed increase. My application uses the Python C API heavily, and it seems that Whole Program Optimization is able to do some good optimizations; I expect Profile Guided Optimization will gain a little more too. This is all being done in MSVC 2015.
So far I've recompiled the pythoncore project (python35.dll) into a static library and linked it with my application (let's call it myapp.exe). FYI, apart from changing the project type to static, the only other thing that needs doing is defining Py_NO_ENABLE_SHARED both when compiling the static lib and when compiling myapp.exe. This works fine, and it's how I obtained the 10% speed improvement in testing.
So the next step is continuing to support external python modules that have .pyd files (which are .dll files renamed to .pyd). These modules will have been compiled expecting to dynamically link with python35.dll, so I need to provide a workaround for that requirement, since all of the python functions are now embedded into myapp.exe.
First I use a .def file to export all of the public Python functions from myapp.exe. This works fine.
The missing piece is how do I create a python35.dll which redirects all the calls to the functions exported from myapp.exe.
My first attempt is using DLL forwarding. I made a custom python35.dll which has a .def file with lines such as:
PyArg_Parse=myapp.PyArg_Parse
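Written out in full, such a forwarding .def file would look roughly like this (only one forward shown; the real file would list every exported function):

LIBRARY python35
EXPORTS
    PyArg_Parse=myapp.PyArg_Parse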
In theory, this works. If I use Dependency Walker on socket.pyd, it correctly opens my python35.dll and shows that all the calls are being forwarded to myapp.exe.
However, when actually running myapp.exe and trying to import socket, it fails to load the required entry points from myapp.exe. 'import socket' in Python causes a LoadLibrary("socket.pyd") to occur, which loads my custom python35.dll implicitly. The failure occurs while loading python35.dll: it's unable to find the entry points for its forwards. The reason seems to be that myapp.exe isn't part of the library search path. I can verify this by copying myapp.exe to myapp.dll: if I do that, the python35.dll load works. That isn't a solution, however, since it would result in two copies of the Python environment (one in myapp.exe, one in myapp.dll).
Other avenues I've looked into without finding the right solution:
Somehow getting .exe files to be part of the library search path
Using Windows manifest/configuration to redirect the library somehow
Manually using __declspec(naked) and jmp statements to wrap the .dll more explicitly. I'm working in x64, so I don't think that's possible anymore?
I could wrap the whole Python API, function by function. This is doable if I can find a way to generate the function definitions for all the exports, so it's not an insane amount of manual work.
In summary, is there a way to redirect/forward calls to a .dll to functions/data exported from an .exe? Thanks!
I ended up going with the solution that @martineau suggested in the comments, which was to put all of my application, including Python, into a single .dll instead of an .exe. The .exe is then just a simple file that calls into the .dll and does nothing else.

Python + setuptools: distributing a pre-compiled shared library with boost.python bindings

I have a C++ library (we'll call it Example in the following) for which I wrote Python bindings using the boost.python library. This Python-wrapped library will be called pyExample. The entire project is built using CMake and the resulting Python-wrapped library is a file named libpyExample.so.
When I use the Python bindings from a Python script located in the same directory as libpyExample.so, I simply have to write:
import libpyExample
libpyExample.hello_world()
and this executes a hello_world() function exposed by the wrapping process.
What I want to do
For convenience, I would like my pyExample library to be available from anywhere simply using the command
import pyExample
I also want pyExample to be easily installable in any virtualenv in just one command. So I thought a convenient process would be to use setuptools to make that happen. That would therefore imply:
Making libpyExample.so visible for any Python script
Changing the name under which the module is accessed
I did find many things about compiling C++ extensions with setuptools, but nothing about packaging a pre-compiled C++ extension. Is what I want to do even possible?
What I do not want to do
I don't want to build the pyExample library with setuptools; I would like to avoid modifying the existing project too much. The CMake build is just fine, and I can retrieve the libpyExample.so file very easily.
If I understand your question correctly, you have the following situation:
you have an existing CMake-based build of a C++ library with Python bindings
you want to package this library with setuptools
The latter then allows you to call python setup.py install --user, which installs the lib in the site-packages directory and makes it available from every path in your system.
What you want is possible if you override the classes that setuptools uses to build extensions, so that those classes actually call your CMake build system. This is not trivial, but you can find a working example here, provided by the pybind11 project:
https://github.com/pybind/cmake_example
Have a look into setup.py, you will see how the classes build_ext and Extension are inherited from and modified to execute the CMake build.
This should work for you out of the box, or with little modification if your build requires special -D flags to be set.
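Stripped to its essentials, the pattern in that setup.py looks roughly like this (a simplified sketch of the cmake_example approach, not a drop-in copy):

import os
import pathlib
import subprocess
from setuptools import setup, Extension
from setuptools.command.build_ext import build_ext

class CMakeExtension(Extension):
    def __init__(self, name, sourcedir=''):
        super().__init__(name, sources=[])  # no sources: CMake builds everything
        self.sourcedir = str(pathlib.Path(sourcedir).resolve())

class CMakeBuild(build_ext):
    def build_extension(self, ext):
        # directory where setuptools expects the built extension to land
        extdir = str(pathlib.Path(self.get_ext_fullpath(ext.name)).parent.resolve())
        os.makedirs(self.build_temp, exist_ok=True)
        # configure, pointing CMake's library output at that directory
        subprocess.check_call(
            ['cmake', ext.sourcedir,
             f'-DCMAKE_LIBRARY_OUTPUT_DIRECTORY={extdir}'],
            cwd=self.build_temp)
        # build
        subprocess.check_call(['cmake', '--build', '.'], cwd=self.build_temp)

setup(name='pyExample',
      ext_modules=[CMakeExtension('pyExample', sourcedir='.')],
      cmdclass={'build_ext': CMakeBuild})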
I hope this helps!

Embedding Python on Windows: why does it have to be a DLL?

I'm trying to write a software plug-in that embeds Python. On Windows the plug-in is technically a DLL (this may be relevant). The Python Windows FAQ says:
1. Do not build Python into your .exe file directly. On Windows, Python must be a DLL to handle importing modules that are themselves DLL’s. (This is the first key undocumented fact.) Instead, link to pythonNN.dll; it is typically installed in C:\Windows\System. NN is the Python version, a number such as “23” for Python 2.3.
My question is: why exactly must Python be a DLL? If, as in my case, the host application is not an .exe but also a DLL, could I build Python into it? Or does this note perhaps mean that third-party C extensions rely on pythonN.N.dll being present, and another DLL won't do? Assuming I really want a single DLL, what should I do?
I see there's the dynload_win.c file, which appears to be the module that imports C extensions on Windows; as far as I can see, it scans the extension file to find which pythonX.X.dll it imports, but I'm not experienced with Windows and I don't quite understand all the code there.
You need to link to pythonXY.dll as a DLL, instead of linking the relevant code directly into your executable, because otherwise the Python runtime can't load other DLLs (the extension modules it relies on). If you make your own DLL, you could in theory link all the Python code into that DLL directly, since it then ends up in a DLL rather than in the executable. You'll have to take care to do the linking correctly, however, as pretty much none of the standard tools (like distutils) will do this for you.
However, regardless of how you embed Python, you can't make do with just the DLL, nor can you make do with just any DLL. The ABI changes between Python versions, so if you compiled your code against Python 2.6, you need python26.dll; you can't use python25.dll or python27.dll. And Python isn't just a DLL; it also needs its standard library, which includes extension modules (which are DLLs themselves, although they have the .pyd extension). The code in dynload_win.c that you ran into is for loading those DLLs, and is not related to loading pythonXY.dll.
In short, in order to embed Python in your plugin, you need to either ship Python with the plugin, or require that the right Python version is already installed.
(Sorry, I did a stupid thing: I first wrote the question and then registered, and now I cannot edit it or comment on the replies, because Stack Overflow's engine doesn't think I'm the author. I cannot even properly thank those who replied :( So this is actually an update to the question and comments.)
Thanks for all the advice, it's very valuable. As far as I understand, with some effort I can link Python statically into a custom DLL, provided that I compile the dynamically loaded extensions myself and link them against the same DLL. (I know I need to ship the standard library too; my plan is to append a zipped archive to the DLL file. As far as I understand, I will even be able to import pure Python modules from it.)
I also found an interesting place in dynload_win.c. (I understand it loads dynamic extensions that use the Python C API, e.g. _ctypes.) As far as I can see, it not only looks for the init_ctypes symbol (or whatever matches the extension name), but also scans the .pyd file's import table looking for (regex) python\d+\. and then compares the found name with the known pythonNN string to make sure the extension was compiled for this version of Python. If the import table has no such name, or it refers to another version, it raises an error.
For me it means that:
If I link an extension against pythonNN.dll and try to load it from my custom DLL that includes a statically linked Python, it will pass the check, but here I'm not sure: will it fail because there's no pythonNN.dll (i.e. even before getting to the check), or will it happily load the symbols?
And if I link it against my custom DLL, it will find symbols, but won't pass the check :)
I guess I could rewrite this piece to suit my needs... Are there any other such places, I wonder.
Python needs to be a DLL (with a standard name) so that your application and the plugin can use the same instance of Python.
Plugin DLLs already expect to load (and use Python from) a python26.dll (or whichever version); if your Python is statically embedded in your exe, then two different instances of the Python library would end up managing the same data structures.
If the Python libraries use no static variables at all, and the compile settings are exactly the same, this should not be a problem. However, it's generally far safer to simply ensure that only one instance of the Python interpreter is being used.
On *nix, all shared objects in a process, including the executable, contribute their exported names to a common pool; any of the shared objects can then pull any of those names from the pool and use them. This allows e.g. cStringIO.so to pull the relevant Python library functions from the main executable when the Python library is statically linked.
On Windows, each shared object has its own independent pool of names: an import must name the specific shared object it pulls each function from. Since it is a lot of work to route all of those imports through the main executable, the Python functions are separated out into their own DLL.
