I have a large program written in C++ that I wish to make usable via Python. I've written a python extension to expose an interface through which python code can call the C++ functions. The issue I'm having with this is that installing seems to be nontrivial.
All documentation I can find seems to indicate that I should create a setup.py which creates a distutils.core.Extension. In every example I've found, the Extension object being created is given a list of source files, which it compiles. If my code was one or two files, this would be fine. Unfortunately, it's dozens of files, and I use a number of relatively complicated visual studio build settings. As a result, building by listing .c files seems to be challenging to say the least.
I've currently configured my Python extension to build as a .dll and link against python39.lib. I tried changing the extension to .pyd and including the file in a manifest.in. After I created a setup.py and ran it, it created a .egg file that I verified did include the .pyd I created. However, after installing it, when I imported the module into python, the module was completely empty (and I verified that the PyInit_[module] function was not called). Python dll Extension Import says that I can import the dll if I change the extension to .pyd and place the file in the Dlls directory of python's installation. I've encountered two problems with this.
The first is that it seems to me that it's not very distributable like this. I'd like to package this into a python wheel, and I'm not sure how a wheel could do this. The second is even more problematic - it doesn't exactly work. It calls the initialization function of my extension, and I've verified in WinDbg that it's returning a python module. However, this is what I always get from the console.
>>> import bluespawn
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
SystemError: initialization of bluespawn did not return an extension module
The Python documentation has a section on publishing binary extensions, but for the past four years, it has been left as a placeholder. The github issue linked here isn't that helpful either; it boils down to either use distutils to build or use enscons to build. But since my build is a fairly complicated procedure, completely rewriting it to use enscons is less than desirable, to say the least.
It seems to me like placing the file in the DLLs directory is the wrong way of going about this. Given that I have a DLL and making setuptools compile everything itself seems infeasible, how should I go about installing my extension?
For reference, here's my initialization function, in case that's incorrect.
PyModuleDef bsModule{ PyModuleDef_HEAD_INIT, "bluespawn", "Bluespawn python bindings", -1, methods };
PyMODINIT_FUNC PyInit_bluespawn() {
PyObject* m;
Py_Initialize();
PyEval_InitThreads();
PyGILState_STATE state = PyGILState_Ensure(); // Crashes without this. Call to PyEval_InitThreads() required for this.
m = PyModule_Create(&bsModule);
PyGILState_Release(state);
Py_Finalize();
return m;
}
The python interface is available here: https://github.com/ION28/BLUESPAWN/blob/client-add-pylib/BLUESPAWN-win-client/src/user/python/PythonInterface.cpp
EDIT: I have a working solution that I am sure is not best practice. I created a very small C file that simply passes all calls it receives onto the large DLL I've already created. The C file is responsible for initializing the module, but everything else is handled inside the DLL. It works, but it seems like a very bad way of doing things. What I'm looking for is a better way of doing this.
Let me try and divide your post into two separate questions:
How to package a C++ library with a non-trivial compilation process using setuptools
Is it possible to distribute a python package with a precompiled library
1. How to package a C++ library with a non-trivial compilation process using setuptools
It is possible. I was quite surprised to see that setuptools offers many ways to override the compilation process, see the documentation here. For example, you can use the keyword argument extra_compile_args to pass extra arguments to the compiler.
In addition, as setup.py is a python file, you could relatively easily write some code to automatically collect all files needed for compilation. I'd done this myself in a project (github), and it worked quite well for me.
Here's some code from the setup.py:
libinjector = Extension('pyinjector.libinjector',
sources=[str(c.relative_to(PROJECT_ROOT))
for c in [LIBINJECTOR_WRAPPER, *LIBINJECTOR_SRC.iterdir()]
if c.suffix == '.c'],
include_dirs=[str(LIBINJECTOR_DIR.relative_to(PROJECT_ROOT) / 'include')],
export_symbols=['injector_attach', 'injector_inject', 'injector_detach'],
define_macros=[('EM_AARCH64', '183')])
2. Is it possible to distribute a python package with a precompiled library
I understand from your edit that you've managed to get it to work, but I'll say a few words anyway. Releasing precompiled binaries with your source distribution is possible, and it is possible to release your manually-compiled binaries in a wheel file as well, but it is not recommended.
The main reason is compatibility with the target architecture. First, you'll have to include two DLLs in your distribution, one for x64 and one for x86. Second, you might lose some nice optimizations, because you'll have to instruct the compiler to ignore optimizations available for the specific CPU type (note that this applies to normal wheel distributions as well). If you're compiling against windows SDK, you'll probably want to use the user's version too. In addition, including two DLLs in your release might grow it to an awkward size for a source distribution.
Related
TL;DR
Adding pybind11 bindings to a working C++ DLL project allows me to import and use the resulting DLL in Python but breaks my ability to use it in C++ code using boost/dll machinery.
Summary
I've got a C++ library that I compile to FooLib.dll. I use boost/dll's BOOST_DLL_ALIAS and boost::dll::import_alias() to export and load a class Foo that does some work in other C++ code.
Some details omitted but it all works great, following this recipe.
I'd like to be able to call the same library code from Python to do some complicated functional testing and do comparisons to numpy/scipy prototypes without having to write so much test code in C++.
So I tried to add pybind11 bindings to the FooLib DLL project using PYBIND11_MODULE.
It compiles, I get a FooLib.dll. I can copy and rename that to FooLib.pyd, import it as a Python module, and it all works fine. I export Foo as a Python class, and it works.
However, when I compile in the pybind bindings, the boost/dll import machinery can no longer load the original FooLib.dll. I verify with boost::dll::library_info() that the appropriate CreateFoo symbol is exported to the DLL. But loading with boost::dll::import_alias() fails with:
boost::dll::shared_library::load() failed: The specified module could not be found
Minimal Example
Unfortunately, something that needs a C++ and Python executable and compiled boost isn't exactly minimal, but I did my best here:
https://github.com/danzimmerman/pybind-boostdll-minimal
Direct links to the source files:
DLL Project Files
HelloSayerLib.h
HelloSayerImp.cpp
C++ Test Code
HelloSayerLibCppTest.cpp
Python Test Code
HelloSayerLibPythonTests.py
Any advice for next steps?
Is it even possible to compile to one binary that works for both C++ and Python like this?
The suggestion in #n.'pronouns'm. comment is correct. Simply copying the python DLL from the Anaconda distribution I built against to the C++ program's run directory fixes the problem. Makes sense in retrospect, but didn't occur to me.
Makes it even more likely that I should keep the builds separate or at least set up my real project to only build with pybind bindings on my machine.
Under Windows, I'm trying to use a 3rd party DLL (SomeLib.dll) programmed in C++ from Python 2.7 using ctypes.
For some of its features, this library uses another COM DLL (SomeCOMlib.dll), which itself uses other DLL (LibA.dll).
Please note that this isn't about using a COM DLL directly from Python, but it's about using a DLL that uses it from Python.
To make the integration with Python easier, I've grouped the calls I want to use into my own functions in a new DLL (MyLib.dll), also programmed in C++, simply to make the calls from ctypes more convenient (using extern "C" and tailoring the functions for specific scenarios).
Essentially, my library exposes 2 functions: doSomethingSimple(), doSomethingWithCOMobj() (all returning void, no parameters).
The "effective" dependency hierarchy is as follows:
MyLib.dll
SomeLib.dll
SomeCOMlib.dll
LibA.dll
I'm able to write a simple C++ console application (Visual C++) that uses MyLib.dll and makes those 2 consecutive calls without problem.
Using Python/ctypes, the first call works fine, but the call that makes use of COM throws a WindowsError: [Error -529697949] Windows Error 0xE06D7363.
From the rest of the library behaviour, I can see that the problem comes exactly where that COM call is made.
(The simple test C++ application also fails more or less at the same place if LibA.dll is missing, but I'm not sure if it's related.)
I've looked at the dependency hierarchy using Dependency Walker.
SomeCOMlib.dll isn't listed as a dependency of SomeLib.dll, although it's clearly required, and LibA.dll isn't listed as a dependency of SomeCOMlib.dll, although it's also clearly required at run time.
I'm running everything from the command line, from within the directory where these DLLs are (and the C++ sample executable works fine).
I've tried to force the PATH to include that directory, and I've also tried to copy the DLLs to various places where I guessed they might be picked up (C:\Windows\System32 and C:\Python27\DLLs), without success.
SomeCOMlib.dll was also registered with regasm.exe.
What could cause this difference between using this DLL from a plain C++ application and from Python's ctypes when it comes to its own usage of the COM mechanism (and possibly the subsequent loading of other DLLs there)?
Which steps would give at least a bit more information than Windows Error 0xE06D7363 from Python, to be able to investigate the problem further?
The Python code looks like this:
import ctypes
myDll = ctypes.WinDLL("MyLib.dll")
myDll.doSomethingSimple()
myDll.doSomethingWithCOMobj() # This statement throws the Windows Error
(The test C++ standalone application that works, linked to MyLib.dll makes exactly the same calls within main.)
When you need an in-proc COM object, you don't link directly to the implementing DLL. You usually use CoCreateInstance/CoCreateInstanceEx, which will load the DLL for you.
The lookup goes through the application's manifest and its dependant assembly manifests. This is done to support registration-free COM.
If there's no application manifest or if none of the dependant assembly manifests declare your class in a comClass XML element, the lookup defaults to the registry, which will check HKEY_CLASSES_ROOT\CLSID1 for a subkey named {<your-CLSID>}, itself with an InProcServer32 subkey stating the DLL.
This explains why SomeCOMlib.dll doesn't appear as a dependency. It doesn't explain why LibA.dll doesn't appear as a dependency of it, probably because it's dynamically loaded. If you profile your app within Dependency Walker, you'll see a log of LoadLibrary calls in the bottom pane. To profile it, open your Python executable in Dependency Walker, then go to menu option Profile->Start profiling..., set the parameters to run your .py file and click Ok.
The 0xE06D7363 exception code is a Visual C++ exception code. You should check the source code of doSomethingWithCOMobj. To debug it, use your preferred tool (Visual C++, WinDbg, etc.), open Python's executable, setting up the arguments to run your .py file, and enable a breakpoint on the first statement of the function before running the application. Then run it and step through each instruction.
It's really hard to guess what's different about your native C++ application and Python, but it may be that different COM initialization arguments are used by Python and doSomethingWithCOMobj, or that you haven't declared it __stdcall (although being a void function that shouldn't matter), or that it tries to write to stdout and you're using pythonw.exe which isn't a console application, etc.
1. HKEY_CLASSES_ROOT is a mix of HKEY_CURRENT_USER\Software\Classes and HKEY_LOCAL_MACHINE\Software\Classes.
You could use something like processexplorer to find out if all the required libraries are loaded into the running process.
Did you initialize the COM runtime somewhere in your Python code? In your C++ code?
One difference between C++ code and Python/ctypes is that in the C++ code the dll's DllMain() function is called when the dll is loaded, Python/ctypes doesn't do this because of the fully dynamic loading of the dll.
I'm going to take a guess that you have the same issue that I had, although I don't really know because unlike this case, I know nothing about the dll I am using. 6 years later I don't know if you can still test this, but I wonder if importing pythoncom would solve the problem.
import ctypes
import pythoncom
myDll = ctypes.WinDLL("MyLib.dll")
myDll.doSomethingSimple()
myDll.doSomethingWithCOMobj() # This statement throws the Windows Error
Following this recommendation, I have written a native C extension library to optimise part of a Python module via ctypes. I chose ctypes over writing a CPython-native library because it was quicker and easier (just a few functions with all tight loops inside).
I've now hit a snag. If I want my work to be easily installable using distutils using python setup.py install, then distutils needs to be able to build my shared library and install it (presumably into /usr/lib/myproject). However, this not a Python extension module, and so as far as I can tell, distutils cannot do this.
I've found a few references to people other people with this problem:
Someone on numpy-discussion with a hack back in 2006.
Somebody asking on distutils-sig and not getting an answer.
Somebody asking on the main python list and being pointed to the innards of an existing project.
I am aware that I can do something native and not use distutils for the shared library, or indeed use my distribution's packaging system. My concern is that this will limit usability as not everyone will be able to install it easily.
So my question is: what is the current best way of distributing a shared library with distutils that will be used by ctypes but otherwise is OS-native and not a Python extension module?
Feel free to answer with one of the hacks linked to above if you can expand on it and justify why that is the best way. If there is nothing better, at least all the information will be in one place.
The distutils documentation here states that:
A C extension for CPython is a shared library (e.g. a .so file on Linux, .pyd on Windows), which exports an initialization function.
So the only difference regarding a plain shared library seems to be the initialization function (besides a sensible file naming convention I don't think you have any problem with). Now, if you take a look at distutils.command.build_ext you will see it defines a get_export_symbols() method that:
Return the list of symbols that a shared extension has to export. This either uses 'ext.export_symbols' or, if it's not provided, "PyInit_" + module_name. Only relevant on Windows, where the .pyd file (DLL) must export the module "PyInit_" function.
So using it for plain shared libraries should work out-of-the-box except in
Windows. But it's easy to also fix that. The return value of get_export_symbols() is passed to distutils.ccompiler.CCompiler.link(), which documentation states:
'export_symbols' is a list of symbols that the shared library will export. (This appears to be relevant only on Windows.)
So not adding the initialization function to the export symbols will do the trick. For that you just need to trivially override build_ext.get_export_symbols().
Also, you might want to simplify the module name. Here is a complete example of a build_ext subclass that can build ctypes modules as well as extension modules:
from distutils.core import setup, Extension
from distutils.command.build_ext import build_ext
class build_ext(build_ext):
def build_extension(self, ext):
self._ctypes = isinstance(ext, CTypes)
return super().build_extension(ext)
def get_export_symbols(self, ext):
if self._ctypes:
return ext.export_symbols
return super().get_export_symbols(ext)
def get_ext_filename(self, ext_name):
if self._ctypes:
return ext_name + '.so'
return super().get_ext_filename(ext_name)
class CTypes(Extension): pass
setup(name='testct', version='1.0',
ext_modules=[CTypes('ct', sources=['testct/ct.c']),
Extension('ext', sources=['testct/ext.c'])],
cmdclass={'build_ext': build_ext})
I have setup a minimal working python package with ctypes extension here:
https://github.com/himbeles/ctypes-example
which works on Windows, Mac, Linux.
It takes the approach of memeplex above of overwriting build_ext.get_export_symbols() and forcing the library extension to be the same (.so) for all operating systems.
Additionally, a compiler directive in the c / c++ source code ensures proper export of the shared library symbols in case of Windows vs. Unix.
As a bonus, the binary wheels are automatically compiled by a GitHub Action for all operating systems :-)
Some clarifications here:
It's not a "ctypes based" library. It's just a standard C library, and you want to install it with distutils. If you use a C-extension, ctypes or cython to wrap that library is irrelevant for the question.
Since the library apparently isn't generic, but just contains optimizations for your application, the recommendation you link to doesn't apply to you, in your case it is probably easier to write a C-extension or to use Cython, in which case your problem is avoided.
For the actual question, you can always use your own custom distutils command, and in fact one of the discussions linked to just such a command, the OOF2 build_shlib command, that does what you want. In this case though you want to install a custom library that really isn't shared, and then I think you don't need to install it in /usr/lib/yourproject, but you can install it into the package directory in /usr/lib/python-x.x/site-packages/yourmodule, together with your python files. But I'm not 100% sure of that so you'll have to try.
I'm trying to write a software plug-in that embeds Python. On Windows the plug-in is technically a DLL (this may be relevant). The Python Windows FAQ says:
1.Do not build Python into your .exe file directly. On Windows, Python must be a DLL to handle importing modules that are themselves DLL’s. (This is the first key undocumented fact.) Instead, link to pythonNN.dll; it is typically installed in C:\Windows\System. NN is the Python version, a number such as “23” for Python 2.3.
My question is why exactly Python must be a DLL? If, as in my case, the host application is not an .exe, but also a DLL, could I build Python into it? Or, perhaps, this note means that third-party C extensions rely on pythonN.N.dll to be present and other DLL won't do? Assuming that I'd really want to have a single DLL, what should I do?
I see there's the dynload_win.c file, which appears to be the module to import C extensions on Windows and, as far as I can see, it scans the extension file to find which pythonX.X.dll it imports; but I'm not experienced with Windows and I don't quite understand all the code there.
You need to link to pythonXY.dll as a DLL, instead of linking the relevant code directly into your executable, because otherwise the Python runtime can't load other DLLs (the extension modules it relies on.) If you make your own DLL you could theoretically link all the Python code in that DLL directly, since it doesn't end up in the executable but still in a DLL. You'll have to take care to do the linking correctly, however, as pretty much none of the standard tools (like distutils) will do this for you.
However, regardless of how you embed Python, you can't make do with just the DLL, nor can you make do with just any DLL. The ABI changes between Python versions, so if you compiled your code against Python 2.6, you need python26.dll; you can't use python25.dll or python27.dll. And Python isn't just a DLL; it also needs its standard library, which includes extension modules (which are DLLs themselves, although they have the .pyd extension.) The code in dynload_win.c you ran into is for loading those DLLs, and are not related to loading of pythonXY.dll.
In short, in order to embed Python in your plugin, you need to either ship Python with the plugin, or require that the right Python version is already installed.
(Sorry, I did a stupid thing, I first wrote the question, and then registered, and now I cannot alter it or comment on the replies, because StackOverflow's engine doesn't think I'm the author. I cannot even properly thank those who replied :( So this is actually an update to the question and comments.)
Thanks for all the advice, it's very valuable. As far as I understand with some effort I can link Python statically into a custom DLL, provided that I compile other dynamically loaded extensions myself and link them against the same DLL. (I know I need to ship the standard library too; my plan was to append a zipped archive to the DLL file. As far as I understand, I will even be able to import pure Python modules from it.)
I also found an interesting place in dynload_win.c. (I understand it loads dynamic extensions that use Python C API, e.g. _ctypes.) As far as I can see it not only looks for init_ctypes symbol or whatever the extension name is, but also scans the .pyd file's import table looking for (regex) python\d+\. and then compares the found symbol with known pythonNN. string to make sure the extension was compiled for this version of Python. If the import table doesn't have such a symbol or it refers to another version, it raises an error.
For me it means that:
If I link an extension against pythonNN.dll and try to load it from my custom DLL that includes a statically linked Python, it will pass the check, but — well, here I'm not sure: will it fail because there's no pythonNN.dll (i.e. even before getting to the check) or it will happily load the symbols?
And if I link it against my custom DLL, it will find symbols, but won't pass the check :)
I guess I could rewrite this piece to suit my needs... Are there any other such places, I wonder.
Python needs to be a dll (with a standard name) such that your application, and the plugin, can use the same instance of python.
Plugin dlls are already going to expect to be loading (and using python from) a python26.dll (or whichever version) - if your python is statically embedded in your exe, then two different instances of the python library would be managing the same data structures.
If the python libraries use no static variables at all, and the compile settings are exactly the same this should not be a problem. However, generally its far safer to simply ensure that only one instance of the python interpreter is being used.
On *nix, all shared objects in a process, including the executable, contribute their exported names into a common pool; any of the shared objects can then pull any of the names from the pool and use them as they like. This allows e.g. cStringIO.so to pull the relevant Python library functions from the main executable when the Python library is statically-linked.
On Windows, each shared object has its own independent pool of names it can use. This means that it must read the relevant different shared objects it needs functions from. Since it is a lot of work to get all the names from the main executable, the Python functions are separated out into their own DLL.
I'm developing on Windows, and I've searched everywhere without finding anyone talking about this kind of thing.
I made a C++ app on my desktop that embedded Python 3.1 using MSVC. I linked python31.lib and included python31.dll in the app's run folder alongside the executable. It works great. My extension and embedding code definitely works and there are no crashes.
I sent the run folder to my friend who doesn't have Python installed, and the app crashes for him during the scripting setup phase.
A few hours ago, I tried the app on my laptop that has Python 2.6 installed. I got the same crash behavior as my friend, and through debugging found that it was the Py_Initialize() call that fails.
I installed Python 3.1 on my laptop without changing the app code. I ran it and it runs perfectly. I uninstalled Python 3.1 and the app crashes again. I put in code in my app to dynamically link from the local python31.dll, to ensure that it was using it, but I still get the crash.
I don't know if the interpreter needs more than the DLL to start up or what. I haven't been able to find any resources on this. The Python documentation and other guides do not seem to ever address how to distribute your C/C++ applications that use Python embedding without having the users install Python locally. I know it's more of an issue on Windows than on Unix, but I've seen a number of Windows C/C++ applications that embed Python locally and I'm not sure how they do it.
What else do I need other than the DLL? Why does it work when I install Python and then stop working when I uninstall it? It sounds like it should be so trivial; maybe that's why nobody really talks about it. Nevertheless, I can't really explain how to deal with this crash issue.
Thank you very much in advance.
In addition to pythonxy.dll, you also need the entire Python library, i.e. the contents of the lib folder, plus the extension modules, i.e. the contents of the DLLs folder. Without the standard library, Python won't even start, since it tries to find os.py (in 3.x; string.py in 2.x). On startup, it imports a number of modules, in particular site.py.
There are various locations where it searches for the standard library; in your cases, it eventually finds it in the registry. Before, uses the executable name (as set through Py_SetProgramName) trying to find the landmark; it also checks for a file python31.zip which should be a zipped copy of the standard library. It also checks for a environment variable PYTHONHOME.
You are free to strip the library from stuff that you don't need; there are various tools that compute dependencies statically (modulefinder in particular).
If you want to minimize the number of files, you can
link all extension modules statically into your pythonxy.dll, or even link pythonxy.dll statically into your application
use the freeze tool; this will allow linking the byte code of the standard library into your pythonxy.dll.
(alternatively to 2.) use pythonxy.zip for the standard library.
Nice. And if you do not want to zip, copy Python26\DLLs & Python26\lib to your exe directory as:
.\myexe.exe
.\python26.dll
.\Python26\DLLs
.\Python26\lib
And then set PYTHONHOME with Py_SetPythonHome() API. Apparently, this API is not in the list of "allowed" calls before Py_Initialize();
Below worked for me on Windows (Python not installed):
#include "stdafx.h"
#include <iostream>
#include "Python.h"
using namespace std;
int _tmain(int argc, _TCHAR* argv[])
{
char pySearchPath[] = "Python26";
Py_SetPythonHome(pySearchPath);
Py_Initialize();
PyRun_SimpleString("from time import time,ctime\n"
"print 'Today is',ctime(time())\n");
//cerr << Py_GetPath() << endl;
Py_Finalize();
return 0;
}
Good that the search path is relative w.r.t the exe. Py_GetPath can show you where all it is looking for the modules.
A zip of the Python standard library worked for me with Python27.
I zipped the contents of Lib and dll, and made sure there was no additional python27-subfolder or Lib or dll subfolder. i.e. just a zip named python27.zip containing all the files.
I copied that zip and the python27.dll alongside the executable.
I wanted to add some additional info for others who might still be having troubles with this, as I was. I was eventually able to get my application working using the method proposed by user sambha, that is:
Program Files (x86)\
MyApplicationFolder\
MyApplication.exe
python27.dll
Python27\
DLLs\ (contents of DLLs folder)
Lib\ (contents of Lib folder)
...with one important addition: I also needed to install the MSVCR90.DLL. I'm using Python 2.7 and apparently python27.dll requires the MSVCR90.DLL (and maybe other MSVC*90.DLLs).
I solved this by downloading, installing, and running the 'vcredist_x86.exe' package from http://www.microsoft.com/en-us/download/details.aspx?id=29 . I think, though I am not certain, that you need to do it this way at least on Win7, as opposed to simply placing the MSVC*90.DLLs alongside your .exe as you may have done in the past. The Microsoft installer places the files and registers them in a special way under Win7.
I also tried the .zip file method but that did not work, even with the MSVCR90.DLL installed.