I am trying to make one unix executable file from my python source files.
I have two files, p1.py and p2.py.
p1.py:
from p2 import test_func
print (test_func())
p2.py:
def test_func():
    return ('Test')
Now, as we can see, p1.py depends on p2.py. I want to make an executable file by combining the two files. I am using Cython.
I changed the file names to p1.pyx and p2.pyx respectively.
Now, I can make the file executable by using Cython:
cython p1.pyx --embed
It will generate a C source file called p1.c. Next we can use gcc to make it executable:
gcc -Os -I /usr/include/python3.5m -o test p1.c -lpython3.5m -lpthread -lm -lutil -ldl
But how do I combine the two files into one executable?
People are tempted to do this because it's fairly easy to do for the simplest case (one module, no dependencies). @ead's answer is good, but honestly pretty fiddly, and it handles only the next-simplest case (two modules that you have complete control of, no dependencies).
In general a Python program will depend on a range of external modules. Python comes with a large standard library which most programs use to some extent. There's a wide range of third-party libraries for maths, GUIs, web frameworks. Even tracing those dependencies through the libraries and working out what you need to build is complicated, and tools such as PyInstaller attempt it but aren't 100% reliable.
When you're compiling all these Python modules you're likely to come across a few Cython incompatibilities/bugs. It's generally pretty good, but struggles with features like introspection, so it's unlikely a large project will compile cleanly and entirely.
On top of that many of those modules are compiled modules written either in C, or using tools such as SWIG, F2Py, Cython, boost-python, etc. These compiled modules may have their own unique idiosyncrasies that make them difficult to link together into one large blob.
In summary, it may be possible, but for non-trivial programs it is not a good idea however appealing it seems. Tools like PyInstaller and Py2Exe that use a much simpler approach (bundle everything into a giant zip file) are much more suitable for this task (and even then they struggle to be really robust).
Note this answer is posted with the intention of making this question a canonical duplicate for this problem. While an answer showing how it might be done is useful, "don't do this" is probably the best solution for the vast majority of people.
There are some hoops you have to jump through to make it work.
First, you must be aware that the resulting executable is a very thin layer which just delegates the whole work to (i.e. calls functions from) pythonX.Ym.so. You can see this dependency when calling
ldd test
...
libpythonX.Ym.so.1.0 => not found
...
So, to run the program you either need to have LD_LIBRARY_PATH pointing to the location of libpythonX.Ym.so, or build the executable with the --rpath option; otherwise, at start-up of test the dynamic loader will throw an error similar to
/test: error while loading shared libraries: libpythonX.Ym.so.1.0: cannot open shared object file: No such file or directory
The generic build command would look like the following:
gcc -fPIC <other flags> -o test p1.c -I<path_python_include> -L<path_python_lib> -Wl,-rpath=<path_python_lib> -lpython3.6m <other_needed_libs>
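For instance, on a typical Linux system with Python 3.6, the filled-in command might look like the following (the include and lib paths here are assumptions; query the correct ones for your system with python3.6-config --includes --ldflags):
gcc -fPIC -Os -o test p1.c \
    -I/usr/include/python3.6m \
    -L/usr/lib/python3.6/config-3.6m-x86_64-linux-gnu \
    -Wl,-rpath=/usr/lib/python3.6/config-3.6m-x86_64-linux-gnu \
    -lpython3.6m -lpthread -lm -lutil -ldl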
It is also possible to build against a static version of the Python library, thus eliminating the run-time dependency on libpythonX.Ym; see for example this SO-post.
The resulting executable test behaves exactly as if it were the Python interpreter. This means that test will now fail because it cannot find the module p2.
One simple solution would be to cythonize the p2 module in place (cythonize p2.pyx -i): you would get the desired behavior; however, you would have to distribute the resulting shared object p2.so along with test.
It is easy to bundle both extensions into one executable - just pass both cythonized C files to gcc:
# creates p1.c:
cython --embed p1.pyx
# creates p2.c:
cython p2.pyx
gcc ... -o test p1.c p2.c ...
But now a new (or old) problem arises: the resulting test executable once again cannot find the module p2, because there is no p2.py and no p2.so on the Python path.
There are two similar SO questions about this problem, here and here. In your case the proposed solutions are overkill; here it is enough to initialize the p2 module before it gets imported in the p1.pyx file to make it work:
# make the init-function of the other module accessible:
cdef extern object PyInit_p2()
# init/load the p2-module manually
PyInit_p2()  # Cython handles the error, i.e. if NULL is returned
# actually uses the already cached, imported module -
# no search on the python path is needed
from p2 import test_func
print(test_func())
Calling the init-function of a module prior to importing it (actually the module will not really be imported a second time, only looked up in the cache) also works if there are cyclic dependencies between modules, for example if module p2 imports module p3, which imports p2 in turn.
Warning: Since Cython 0.29, Cython uses multi-phase initialization by default for Python>=3.5, thus calling PyInit_p2 is not enough (see e.g. this SO-post). To switch off this multi-phase initialization, -DCYTHON_PEP489_MULTI_PHASE_INIT=0 should be passed to gcc (or the equivalent to other compilers).
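For the two-module example above, a build with multi-phase initialization switched off might thus look like this (paths once again being assumptions for your system):
gcc -Os -DCYTHON_PEP489_MULTI_PHASE_INIT=0 \
    -I /usr/include/python3.6m -o test p1.c p2.c \
    -lpython3.6m -lpthread -lm -lutil -ldl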
Note: However, even after all of the above, the embedded interpreter will need its standard libraries (see for example this SO-post) - there is much more work to do to make it truly standalone! So maybe one should heed @DavidW's advice:
"don't do this" is probably the best solution for the vast majority of
people.
A word of warning: if we declare PyInit_p2() as
from cpython cimport PyObject
cdef extern PyObject *PyInit_p2();
PyInit_p2(); # TODO: error handling if NULL is returned
Cython will no longer handle the errors, and it becomes our responsibility. Instead of
PyObject *__pyx_t_1 = NULL;
__pyx_t_1 = PyInit_p2(); if (unlikely(!__pyx_t_1)) __PYX_ERR(0, 4, __pyx_L1_error)
__Pyx_GOTREF(__pyx_t_1);
__Pyx_DECREF(__pyx_t_1); __pyx_t_1 = 0;
which is produced for the object version, the generated code becomes just:
(void)(PyInit_p2());
i.e. no error checking!
On the other hand using
cdef extern from *:
    """
    PyObject *PyInit_p2(void);
    """
    object PyInit_p2()
will not work with g++ - one has to add extern "C" to the declaration.
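A sketch of the g++-friendly variant, with the C linkage spelled out inside the verbatim-C block:
cdef extern from *:
    """
    extern "C" PyObject *PyInit_p2(void);
    """
    object PyInit_p2()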
I have a large program written in C++ that I wish to make usable via Python. I've written a python extension to expose an interface through which python code can call the C++ functions. The issue I'm having with this is that installing seems to be nontrivial.
All documentation I can find seems to indicate that I should create a setup.py which creates a distutils.core.Extension. In every example I've found, the Extension object being created is given a list of source files, which it compiles. If my code was one or two files, this would be fine. Unfortunately, it's dozens of files, and I use a number of relatively complicated visual studio build settings. As a result, building by listing .c files seems to be challenging to say the least.
I've currently configured my Python extension to build as a .dll and link against python39.lib. I tried changing the extension to .pyd and including the file in a MANIFEST.in. After I created a setup.py and ran it, it created a .egg file that I verified did include the .pyd I created. However, after installing it, when I imported the module into python, the module was completely empty (and I verified that the PyInit_[module] function was not called). Python dll Extension Import says that I can import the dll if I change the extension to .pyd and place the file in the DLLs directory of python's installation. I've encountered two problems with this.
The first is that it seems to me that it's not very distributable like this. I'd like to package this into a python wheel, and I'm not sure how a wheel could do this. The second is even more problematic - it doesn't exactly work. It calls the initialization function of my extension, and I've verified in WinDbg that it's returning a python module. However, this is what I always get from the console.
>>> import bluespawn
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: initialization of bluespawn did not return an extension module
The Python documentation has a section on publishing binary extensions, but for the past four years, it has been left as a placeholder. The github issue linked here isn't that helpful either; it boils down to either use distutils to build or use enscons to build. But since my build is a fairly complicated procedure, completely rewriting it to use enscons is less than desirable, to say the least.
It seems to me like placing the file in the DLLs directory is the wrong way of going about this. Given that I have a DLL and making setuptools compile everything itself seems infeasible, how should I go about installing my extension?
For reference, here's my initialization function, in case that's incorrect.
PyModuleDef bsModule{ PyModuleDef_HEAD_INIT, "bluespawn", "Bluespawn python bindings", -1, methods };

PyMODINIT_FUNC PyInit_bluespawn() {
    PyObject* m;
    Py_Initialize();
    PyEval_InitThreads();
    PyGILState_STATE state = PyGILState_Ensure(); // Crashes without this. Call to PyEval_InitThreads() required for this.
    m = PyModule_Create(&bsModule);
    PyGILState_Release(state);
    Py_Finalize();
    return m;
}
The python interface is available here: https://github.com/ION28/BLUESPAWN/blob/client-add-pylib/BLUESPAWN-win-client/src/user/python/PythonInterface.cpp
EDIT: I have a working solution that I am sure is not best practice. I created a very small C file that simply passes all calls it receives onto the large DLL I've already created. The C file is responsible for initializing the module, but everything else is handled inside the DLL. It works, but it seems like a very bad way of doing things. What I'm looking for is a better way of doing this.
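For illustration, here is a minimal sketch of that shim approach (the forwarded function Bluespawn_Scan and its signature are hypothetical; everything but the module boilerplate stays in the DLL):
#define PY_SSIZE_T_CLEAN
#include <Python.h>

// Implemented inside the large DLL; the name and signature are made up here.
extern PyObject* Bluespawn_Scan(PyObject* self, PyObject* args);

static PyMethodDef methods[] = {
    {"scan", Bluespawn_Scan, METH_VARARGS, "Forwarded to the DLL."},
    {NULL, NULL, 0, NULL}
};

static PyModuleDef bsModule = {
    PyModuleDef_HEAD_INIT, "bluespawn", "Bluespawn python bindings", -1, methods
};

// Note: no Py_Initialize()/Py_Finalize() here - the interpreter importing
// the module already exists; the init function only creates the module.
PyMODINIT_FUNC PyInit_bluespawn(void) {
    return PyModule_Create(&bsModule);
}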
Let me try to divide your post into two separate questions:
How to package a C++ library with a non-trivial compilation process using setuptools
Is it possible to distribute a python package with a precompiled library
1. How to package a C++ library with a non-trivial compilation process using setuptools
It is possible. I was quite surprised to see that setuptools offers many ways to override the compilation process; see the documentation here. For example, you can use the keyword argument extra_compile_args to pass extra arguments to the compiler.
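A minimal sketch of what that could look like (package name, sources, and flags are all made up for illustration):
from setuptools import setup, Extension

setup(name='mypkg', version='0.1',
      ext_modules=[Extension('mypkg._native',
                             sources=['src/native.cpp'],
                             extra_compile_args=['-O2', '-std=c++17'])])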
In addition, as setup.py is a python file, you could relatively easily write some code to automatically collect all files needed for compilation. I've done this myself in a project (github), and it has worked quite well for me.
Here's some code from the setup.py:
libinjector = Extension('pyinjector.libinjector',
                        sources=[str(c.relative_to(PROJECT_ROOT))
                                 for c in [LIBINJECTOR_WRAPPER, *LIBINJECTOR_SRC.iterdir()]
                                 if c.suffix == '.c'],
                        include_dirs=[str(LIBINJECTOR_DIR.relative_to(PROJECT_ROOT) / 'include')],
                        export_symbols=['injector_attach', 'injector_inject', 'injector_detach'],
                        define_macros=[('EM_AARCH64', '183')])
2. Is it possible to distribute a python package with a precompiled library
I understand from your edit that you've managed to get it to work, but I'll say a few words anyway. Releasing precompiled binaries with your source distribution is possible, and it is possible to release your manually-compiled binaries in a wheel file as well, but it is not recommended.
The main reason is compatibility with the target architecture. First, you'll have to include two DLLs in your distribution, one for x64 and one for x86. Second, you might lose some nice optimizations, because you'll have to instruct the compiler to ignore optimizations available only for specific CPU types (note that this applies to normal wheel distributions as well). If you're compiling against the Windows SDK, you'll probably want to match the user's version too. In addition, including two DLLs in your release might grow it to an awkward size for a source distribution.
I'm new to all of this, so please excuse me if I did something stupid here. Treating and explaining to me as if I'm a total noob would be helpful.
I have a simple function written in Python, in a file named a.pyx:
#!/usr/bin/env python
import os
import sys
def get_syspath():
    ret = sys.path
    print "Syspath:{}".format(ret)
    return ret
I want it to be usable from Tcl.
I read through the Cython page and followed it.
I ran this:
cython -o a.c a.pyx
I then ran this command to generate the object file a.o:
gcc -fpic -c a.c -I/usr/local/include -I/tools/share/python/2.7.1/linux64/include/python2.7
And then ran this to generate the shared library a.so:
gcc -shared a.o -o a.so
When I load it from a tclsh, it fails:
$ tclsh
% load ./a.so
couldn't load file "./a.so": ./a.so: undefined symbol: PyExc_RuntimeError
Am I taking the correct path here? If not, can you please explain to me what went wrong and what I should be doing?
Thanks in advance.
The object code needs to be linked to the libraries it depends on when you're building the loadable library. This means adding appropriate -l... options and possibly some -L... options as well. I'm guessing that the option will be something like -lpython2.7 (which links to a libpython2.7.so somewhere on your library search path; the library search path is modified with the -L... options), but I don't know. Paths will depend a lot on exactly how your system is set up, and experimentation is likely required on your part.
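Given the include path you used above, a plausible link step might be the following (the lib directory is a guess; adjust it to wherever your libpython2.7.so actually lives):
gcc -shared a.o -o a.so -L/tools/share/python/2.7.1/linux64/lib -lpython2.7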
It still probably won't work as a loadable library in Tcl. Tcl expects there to be a library initialisation function (in your case, it will look for A_Init) that takes a Tcl_Interp* as its only argument so that the library can install the commands it defines into the Tcl interpreter context. I would be astonished if Python made such a thing by default. It's not failing with that for you yet because the failures are still happening during the internal dlopen() call and not the dlsym() call, but I can confidently predict that you'll still face them.
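For reference, a skeleton of the init function Tcl looks for (just a sketch; the actual command registration is omitted):
#include <tcl.h>

/* Tcl derives the expected entry point from the file name: a.so -> A_Init */
int A_Init(Tcl_Interp *interp) {
    /* commands would be registered here via Tcl_CreateObjCommand(...) */
    return Tcl_PkgProvide(interp, "a", "1.0");
}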
The easiest way to “integrate” that sort of functionality is by running the command in a subprocess.
Here's the Python code you might use:
import os
import sys
print sys.path
And here's the Tcl code to use it:
set syspath [exec python /path/to/yourcode.py]
# If it is in the same directory as this script, use this:
#set syspath [exec python [file join [file dirname [info script]] yourcode.py]]
It's not the most efficient way, but it's super-easy to make it work since you don't have to solve the compilation and linking of the two languages. It's programmer-efficient…
Maybe have a look at tclpython or libtclpy.
Both allow calling Python code from Tcl.
But if you want to wrap things in a nicer way, e.g. have nicer, already-wrapped APIs, maybe you should also look at Elmer, which seems to aim at the task you're attempting.
I have an application in C and at some point I need to solve a non-linear optimization problem. Unfortunately AFAIK there are very limited resources to do that in C (please let me know otherwise). However it is quite simple to do it in Python, e.g. scipy.optimize.minimize.
While I was trying to do that I encountered some of what it seems to be very frequent pitfalls, e.g. Python.h not found, module not loading, segmentation fault on function call, etc.
What is a quick and easy first-timer’s way to link the two programs?
There are some things that you have to make sure are in place in order to make this work:
Make sure you have Python installed (you may need the python-dev package).
Locate your Python.h file, e.g. by locate Python.h. One of the occurrences should be in a sub(sub)folder in the include folder, e.g. the path should be something like ../include/python2.7/Python.h.
Insert #include "<path_to_Python.h>" in your C code in order to be able to use the Python API.
Use any tutorial to call your Python function. I used this one and it did the trick. However there were a couple of small points missing:
Whenever you use any Py<Name> function, e.g. PyImport_Import(), always check the result to make sure there was no error, e.g.
// Load the module object
pModule = PyImport_Import(pName);
if (!pModule)
{
    PyErr_Print();
    printf("ERROR in pModule\n");
    exit(1);
}
Immediately after initializing the Python interpreter, i.e. after Py_Initialize();, you have to append the current path to sys.path in order to be able to load your module (assuming it is located in your current directory):
PyObject *sys = PyImport_ImportModule("sys");
PyObject *path = PyObject_GetAttrString(sys, "path");
PyList_Append(path, PyString_FromString("."));
Keep in mind that when you give the name of your Python file, it has to be without the extension .py.
Lastly, you have to do the following during compiling/linking:
Remember the ../include/python2.7/Python.h file you used before? Include its include folder in the list of header-file directories with the -I option in the gcc options during compilation, e.g. -I /System/Library/Frameworks/Python.framework/Versions/2.7/include.
Also pass to the linker the folder with the required libraries. It should be inside the same folder where the include folder is located, e.g. -L /System/Library/Frameworks/Python.framework/Versions/2.7/lib, along with the -lpython2.7 option (of course adjusting it accordingly to your Python version).
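Putting it together, the compile-and-link invocation might look like the following (my_c_program.c is a placeholder; the framework paths are the macOS-style ones from the examples above - adjust for your system and Python version):
gcc my_c_program.c -o my_c_program \
    -I /System/Library/Frameworks/Python.framework/Versions/2.7/include \
    -L /System/Library/Frameworks/Python.framework/Versions/2.7/lib \
    -lpython2.7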
Now you should be able to successfully compile and execute your C program that calls your Python program.
I hope this was helpful and good luck!
Sources:
How do you call Python code from C code?
http://www.linuxjournal.com/article/8497?page=0,1
http://www.codeproject.com/Articles/11805/Embedding-Python-in-C-C-Part-I
http://www.codeproject.com/Articles/11843/Embedding-Python-in-C-C-Part-II
Python C API doesn't load module
What sets up sys.path with Python, and when?
http://linux.die.net/man/1/gcc
PyObject segfault on function call
I have Python on my Ubuntu system, but gcc can't find Python.h
Following this recommendation, I have written a native C extension library to optimise part of a Python module via ctypes. I chose ctypes over writing a CPython-native library because it was quicker and easier (just a few functions with all tight loops inside).
I've now hit a snag. If I want my work to be easily installable using distutils via python setup.py install, then distutils needs to be able to build my shared library and install it (presumably into /usr/lib/myproject). However, this is not a Python extension module, so as far as I can tell, distutils cannot do this.
I've found a few references to other people with this problem:
Someone on numpy-discussion with a hack back in 2006.
Somebody asking on distutils-sig and not getting an answer.
Somebody asking on the main python list and being pointed to the innards of an existing project.
I am aware that I can do something native and not use distutils for the shared library, or indeed use my distribution's packaging system. My concern is that this will limit usability as not everyone will be able to install it easily.
So my question is: what is the current best way of distributing a shared library with distutils that will be used by ctypes but otherwise is OS-native and not a Python extension module?
Feel free to answer with one of the hacks linked to above if you can expand on it and justify why that is the best way. If there is nothing better, at least all the information will be in one place.
The distutils documentation here states that:
A C extension for CPython is a shared library (e.g. a .so file on Linux, .pyd on Windows), which exports an initialization function.
So the only difference regarding a plain shared library seems to be the initialization function (besides a sensible file naming convention I don't think you have any problem with). Now, if you take a look at distutils.command.build_ext you will see it defines a get_export_symbols() method that:
Return the list of symbols that a shared extension has to export. This either uses 'ext.export_symbols' or, if it's not provided, "PyInit_" + module_name. Only relevant on Windows, where the .pyd file (DLL) must export the module "PyInit_" function.
So using it for plain shared libraries should work out-of-the-box, except on Windows. But that is easy to fix as well. The return value of get_export_symbols() is passed to distutils.ccompiler.CCompiler.link(), whose documentation states:
'export_symbols' is a list of symbols that the shared library will export. (This appears to be relevant only on Windows.)
So not adding the initialization function to the export symbols will do the trick. For that you just need to trivially override build_ext.get_export_symbols().
Also, you might want to simplify the module name. Here is a complete example of a build_ext subclass that can build ctypes modules as well as extension modules:
from distutils.core import setup, Extension
from distutils.command.build_ext import build_ext


class build_ext(build_ext):

    def build_extension(self, ext):
        self._ctypes = isinstance(ext, CTypes)
        return super().build_extension(ext)

    def get_export_symbols(self, ext):
        if self._ctypes:
            return ext.export_symbols
        return super().get_export_symbols(ext)

    def get_ext_filename(self, ext_name):
        if self._ctypes:
            return ext_name + '.so'
        return super().get_ext_filename(ext_name)


class CTypes(Extension):
    pass


setup(name='testct', version='1.0',
      ext_modules=[CTypes('ct', sources=['testct/ct.c']),
                   Extension('ext', sources=['testct/ext.c'])],
      cmdclass={'build_ext': build_ext})
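Once built (e.g. with python setup.py build_ext --inplace), the ctypes module can be loaded like any plain shared library; a minimal sketch, assuming the layout above:
import ctypes

lib = ctypes.CDLL('./ct.so')  # exported symbols are whatever testct/ct.c defines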
I have setup a minimal working python package with ctypes extension here:
https://github.com/himbeles/ctypes-example
which works on Windows, Mac, Linux.
It takes the approach of memeplex above, overriding build_ext.get_export_symbols() and forcing the library extension to be the same (.so) for all operating systems.
Additionally, a compiler directive in the C/C++ source code ensures proper export of the shared library symbols on Windows vs. Unix.
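Such a directive typically looks something like this (a sketch; the exported function is made up):
#if defined(_WIN32)
#define EXPORT __declspec(dllexport)
#else
#define EXPORT __attribute__((visibility("default")))
#endif

/* hypothetical exported function */
EXPORT int add(int a, int b) { return a + b; }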
As a bonus, the binary wheels are automatically compiled by a GitHub Action for all operating systems :-)
Some clarifications here:
It's not a "ctypes-based" library. It's just a standard C library, and you want to install it with distutils. Whether you use a C extension, ctypes, or Cython to wrap that library is irrelevant to the question.
Since the library apparently isn't generic, but just contains optimizations for your application, the recommendation you link to doesn't apply to you; in your case it is probably easier to write a C extension or to use Cython, in which case your problem is avoided.
For the actual question: you can always use your own custom distutils command, and in fact one of the discussions linked to just such a command, the OOF2 build_shlib command, which does what you want. In this case, though, you want to install a custom library that really isn't shared, so I think you don't need to install it in /usr/lib/yourproject; you can install it into the package directory in /usr/lib/python-x.x/site-packages/yourmodule, together with your Python files. But I'm not 100% sure of that, so you'll have to try.