python embed gcc not linking after compiling python myself [duplicate]

I have a pure Python script that I would like to distribute to systems with unknown Python configurations. Therefore, I would like to compile the Python code into a stand-alone executable.
I run cython --embed ./foo.py without problems giving foo.c. Then, I run
gcc $(python3-config --cflags) $(python3-config --ldflags) ./foo.c
where python3-config --cflags gives
-I/usr/include/python3.5m -I/usr/include/python3.5m -Wno-unused-result -Wsign-compare -g -fdebug-prefix-map=/build/python3.5-MLq5fN/python3.5-3.5.3=. -fstack-protector-strong -Wformat -Werror=format-security -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes
and python3-config --ldflags gives
-L/usr/lib/python3.5/config-3.5m-x86_64-linux-gnu -L/usr/lib -lpython3.5m -lpthread -ldl -lutil -lm -Xlinker -export-dynamic -Wl,-O1 -Wl,-Bsymbolic-functions
This way I obtain a dynamically linked executable that runs without a problem. ldd a.out yields
linux-vdso.so.1 (0x00007ffcd57fd000)
libpython3.5m.so.1.0 => /usr/lib/x86_64-linux-gnu/libpython3.5m.so.1.0 (0x00007fda76823000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fda76603000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fda763fb000)
libutil.so.1 => /lib/x86_64-linux-gnu/libutil.so.1 (0x00007fda761f3000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fda75eeb000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fda75b4b000)
libexpat.so.1 => /lib/x86_64-linux-gnu/libexpat.so.1 (0x00007fda7591b000)
libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007fda756fb000)
/lib64/ld-linux-x86-64.so.2 (0x00007fda77103000)
Now, I try to add the option -static to gcc, but this results in an error:
/usr/bin/ld: dynamic STT_GNU_IFUNC symbol `strcmp' with pointer equality in `/usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/libc.a(strcmp.o)' can not be used when making an executable; recompile with -fPIE and relink with -pie
collect2: error: ld returned 1 exit status
I checked that all shared libraries given by ldd are also installed as static libraries.
So, is this some incompatibility with the options given by python3-config?

The problems you are seeing come from the linker (gcc starts a linker under the hood; to see it, just run gcc with -v, i.e. in verbose mode). So let's start with a short reminder of how the linkage process works:
The linker keeps a list of all symbols it still needs to resolve. In the beginning, this list contains only the symbol main. What happens when the linker inspects a library?
If it is a static library, the linker looks at every object file in this library, and if an object file defines some of the looked-for symbols, the whole object file is included (which means some symbols become resolved, while new unresolved symbols may be added). The linker might need to make multiple passes over a static library.
If it is a shared library, the linker views it as a single huge object file (after all, the library has to be loaded at run time as a whole; there is no point in making multiple passes to prune unused symbols): if at least one needed symbol is defined in it, the whole library is "linked" (the real linkage happens at run time; this is a kind of dry run); if not, the whole library is discarded and never looked at again.
For example if you link with:
gcc -L/path -lpython3.x <other libs> foo.o
you will get a problem no matter whether python3.x is a shared or a static lib: when the linker sees it, it is only looking for the symbol main, and since this symbol is not defined in the python-lib, the python-lib is discarded and never looked at again. Only when the linker sees the object file foo.o does it realize that all the Python symbols are needed, but by then it is already too late.
There is a simple rule to handle this problem: put the object files first! That means:
gcc -L/path foo.o -lpython3.x <other libs>
Now the linker knows what it needs from the python-lib when it first sees it.
There are other ways to achieve a similar result.
A) Let the linker reiterate over a group of archives as long as at least one new symbol definition is added per sweep:
gcc -L/path -Wl,--start-group -lpython3.x <other libs> foo.o -Wl,--end-group
The linker options -Wl,--start-group and -Wl,--end-group tell the linker to iterate more than once over this group of archives, so it gets a second chance (or more) to include symbols. This option can lead to longer link times.
B) Switching on the option --no-as-needed causes a shared library (and only a shared library) to be linked in, no matter whether the symbols defined in it are needed or not.
gcc -L/path -Wl,--no-as-needed -lpython3.x -Wl,--as-needed <other libs> foo.o
Actually, the default ld behavior is --no-as-needed, but the gcc frontend calls ld with the option --as-needed, so we can restore the default behavior by adding --no-as-needed in front of the python-library and then switching --as-needed back on after it.
Now to your problem of static linking. I don't think it is advisable to use static versions of all standard libraries (above all glibc); what you should probably do is link only the python-library statically.
The rules of linkage are simple: by default, the linker tries to open the shared version of a library first and then the static version. That is, for the library libmylib and the search paths A and B, as in
-L/A -L/B -lmylib
it tries to open libraries in the following order:
A/libmylib.so
A/libmylib.a
B/libmylib.so
B/libmylib.a
Thus, if folder A has only a static version, that static version is used (no matter whether there is a shared version in folder B).
Because it is quite opaque which library file is really used (it depends on the setup of your system), one would usually switch on the linker's logging via -Wl,--verbose to troubleshoot.
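For example, a quick way to see which library file the linker actually picked (a sketch; the exact log format varies with the binutils version):
gcc foo.o -L/path -lpython3.5m -Wl,--verbose 2>&1 | grep 'attempt to open'
Lines such as "attempt to open /path/libpython3.5m.a succeeded" reveal the file that won.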
By using the option -Bstatic one can enforce the usage of the static version of a library:
gcc foo.o -L/path -Wl,-Bstatic -lpython3.x -Wl,-Bdynamic <other libs> -Wl,--verbose -o foo
Things to note:
foo.o is linked before the libraries.
the static mode is switched off directly after the python-library, so the subsequent libraries are linked dynamically.
And now:
gcc <cflags> -L/paths foo.c -Wl,-Bstatic -lpython3.X -Wl,-Bdynamic <other libs> -o foo -Wl,--verbose
...
attempt to open path/libpython3.6m.a succeeded
...
ldd foo shows no dependency on python-lib
./foo
It works!
And yes, if you link against a static glibc (which I don't recommend), you will need to delete -Xlinker -export-dynamic from the command line.
An executable compiled without -Xlinker -export-dynamic will not be able to load some C extensions, which depend on this property of the executable into which they are loaded with dlopen.
Possible issues due to the implicit -pie option.
Recent versions of gcc build with the pie option per default. Often, older Python versions were built with an older gcc version, so python-config --cflags misses the now-necessary -no-pie, as it was not needed back then. In this case the linker will produce an error message like:
relocation R_X86_64_32S against symbol `XXXXX' can not be used when
making a PIE object; recompile with -fPIC
In this case, the -no-pie option should be added to <cflags>.
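For example, a hedged sketch of the resulting build line with -no-pie added (paths and the python version are whatever your python3-config reports, as in the question above):
gcc $(python3-config --cflags) -no-pie foo.c $(python3-config --ldflags) -o foo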

Related

Using cython and gcc compiler [duplicate]

Is it possible (and how) to use MinGW-w64 for building of C-extensions for Python or embedding Python on Windows?
Let's take the following cython extension foo.pyx as an example:
print("foo loaded")
from which the C code can be generated either via cython -3 foo.pyx, or via cython -3 --embed foo.pyx if the interpreter should be embedded.
While the mingw-w64 compiler is not really supported (the only supported Windows compiler is MSVC), it can be used to create C extensions or to embed Python. There is, however, no guarantee that this won't break in future versions.
distutils does not support mingw-w64, so there is no gain in setting up a setup.py-file - the steps must be performed manually.
First we need some information usually provided by distutils:
Headers: We need the path to the Python includes. For a way to find them see this SO-post, or the sketch below.
DLL: mingw-w64's linker works differently from MSVC's: the python-dll and not the python-lib is needed. So we need the path to pythonXY.dll, which usually sits next to python.exe.
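One way to query both paths from the target interpreter itself (a sketch; assumes the Python you are building for is the one on PATH):
import os, sys, sysconfig
print(sysconfig.get_paths()["include"])   # directory containing Python.h
print(os.path.dirname(sys.executable))    # directory that usually contains pythonXY.dll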
Once the C code is generated, the extension can be built via
x86_64-w64-mingw32-gcc -shared foo.c -DMS_WIN64 -O2 <other_options> -I <path_to_python_include> -L <path_to_python_dll> -lpython37 -o foo.pyd
The important details are:
it is probably OK to use just -O2 for optimization and leave <other_options> empty.
It is important to define the MS_WIN64 macro (e.g. via -DMS_WIN64). It must be set in order to build for x64 on Windows, but it works out of the box only for MSVC (defining _WIN64 instead could have slightly different outcomes):
#ifdef _WIN64
#define MS_WIN64
#endif
If this is not done, then at least for files generated by Cython, the compiler will produce the following error message:
error: enumerator value for ‘__pyx_check_sizeof_voidp’ is not an integer constant
201 | enum { __pyx_check_sizeof_voidp = 1 / (int)(SIZEOF_VOID_P == sizeof(void*)) };
A pyd is just a dll in disguise, thus we need the -shared option, which means a dynamic library (i.e. a shared object in the Linux world) will be created.
It is important that the python-library (pythonXY) be the dll itself and not the lib (see this SO-post). Thus we use the path to pythonXY.dll (in my case python37) and not to pythonXY.lib, as it would be with MSVC.
One should probably add the proper suffix to the resulting pyd-file; I use the old convention here for simplicity.
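Assuming the build succeeded and foo.pyd sits in the current directory, a quick smoke test (the expected output comes from the print in foo.pyx):
python -c "import foo"
foo loaded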
Embedded Python:
In this case an executable should be built (e.g. the C file is generated by Cython with the --embed option: cython -3 --embed foo.pyx), and thus the command line looks as follows:
x86_64-w64-mingw32-gcc foo.c -DMS_WIN64 -O2 <other_options> -I <path_to_python_include> -L <path_to_python_dll> -lpython37 -o foo.exe -municode
There are two important differences:
-shared should no longer be used, as the result is no longer a dynamic library (which is what a *.pyd-file is, after all) but an executable.
-municode is needed because, for Windows, Cython defines int wmain(int argc, wchar_t **argv) instead of int main(int argc, char **argv). Without this option, an error message like
/build/mingw-w64-_1w3Xm/mingw-w64-4.0.4/mingw-w64-crt/crt/crt0_c.c:18: undefined reference to 'WinMain'
collect2: error: ld returned 1 exit status
would appear (see this SO-post for more information).
Note: for the resulting executable to run, a whole Python distribution (and not only the dll) is needed (see also this SO-post); otherwise the resulting executable will abort with an error (either the python dll wasn't found, or the python installation, or the site-packages, depending on the configuration of the machine on which the exe has to run).
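Given a complete distribution next to it, running the executable should again print the message from foo.pyx:
foo.exe
foo loaded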
mingw-w64 can also be used on Linux for cross-compilation for Windows, see this SO-post.

Embedding Python in C Generates Import Errors for Python Shared Libraries

The Problem
I've been learning the ins and outs of both Cython and the C API for Python and ran into a peculiar problem. I've created a C program that simply embeds a Python interpreter into it, imports a local .py file, and extracts a function/method from this file. I've been successful in calling a method from a localized file and returning the value to C, so I've gotten the process down to some degree.
I've been working on two different variants: 1) creating a standalone executable and 2) creating a shared library and allowing another executable to access the functions I've built in Python. For case 1, I have no issues. For case 2, I can only successfully make this work if there are no Python dependencies. If my Python function imports a Python core library or any library that is dynamically loaded (e.g. import math, import numpy, import ctypes, import matplotlib) an ImportError will be generated. The paths to these imports are typically in ../pathtopython/lib/pythonX.Y/lib-dynload for the python core libraries/modules.
Some Notes:
I'm using an Anaconda distribution of Python 3.7 on CentOS 7. I do not have administrative access, and there are system pythons present that might be interfering - I've demonstrated that I can make this work by using the system python, but then I lose the ability to add packages. So I will highlight my findings from using the system python too.
My C Code:
#include "Python.h"

struct results cfuncEval(double height, double diameter){
    Py_Initialize();
    /* prepend the directory containing myModule.py to sys.path */
    PyObject *sys_path, *path;
    sys_path = PySys_GetObject("path");
    path = PyUnicode_DecodeFSDefault("path/to/my/python/file/");
    PyList_Insert(sys_path, 0, path);
    /* import myModule */
    PyObject *pName, *pModule;
    pName = PyUnicode_DecodeFSDefault("myModule");
    pModule = PyImport_Import(pName);
    PyErr_Print();
    /* look up cone_funcEval and call it */
    PyObject *pFunc = PyObject_GetAttrString(pModule, "cone_funcEval");
    PyObject *args;
    args = PyTuple_Pack(2, PyFloat_FromDouble(height), PyFloat_FromDouble(diameter));
    PyObject *ret = PyObject_CallObject(pFunc, args);
I do things successfully with ret if there are no import statements in my .py file; however, if I import a shared library as mentioned before (such as import math), I get an error. It's fair to note that I can successfully import other .py files from that .py file without error.
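For context, a hedged sketch of the kind of thing done with ret afterwards (the field names of struct results are hypothetical, and error checks are omitted):
/* hypothetical continuation of cfuncEval: unpack the returned 3-tuple */
struct results r;
r.volume      = PyFloat_AsDouble(PyTuple_GetItem(ret, 0)); /* borrowed reference */
r.slantHeight = PyFloat_AsDouble(PyTuple_GetItem(ret, 1));
r.surfaceArea = PyFloat_AsDouble(PyTuple_GetItem(ret, 2));
return r;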
My myModule.py Code:
try:
    import math
except ImportError as e:
    print(e)

def cone_funcEval(diameter, height):
    radius = diameter/2.0
    volume = math.pi*radius*radius*height/3.0
    slantHeight = ( radius**2.0 + height**2.0 )**0.5
    surfaceArea = math.pi*radius*(radius + slantHeight)
    return volume, slantHeight, surfaceArea
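As a sanity check of the module itself under a plain interpreter (a hypothetical quick test; values rounded):
>>> import myModule
>>> [round(x, 4) for x in myModule.cone_funcEval(2.0, 3.0)]
[3.1416, 3.1623, 13.0762]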
Please note that I know this logic is faster in C; I am just trying to demonstrate creating the linkage, and I consider importing core modules an expected feature. So this is just my example problem. With this .py file, the ImportError I trigger C to print is the following:
/path/to/anaconda3/lib/python3.7/lib-dynload/math.cpython-37m-x86_64-linux-gnu.so: undefined symbol: PyArg_Parse
Indeed, Python is not linked to the math .so if I run ldd on it, but I suppose that linkage is meant to happen at runtime. I can make this work if I create a standalone executable using the -export-dynamic flag with gcc, but I cannot make it work otherwise, as that flag appears not to be used when compiling a shared library.
A Word About the System Python:
The system Python's core modules are located in /usr/lib64/pythonX.Y/lib-dynload. When I run ldd on /usr/lib64/pythonX.Y/lib-dynload/math.cpython-XYm-x86_64-linux-gnu.so it identifies the following:
linux-vdso.so.1 => (0x00007fffaf3ab000)
libm.so.6 => /usr/lib64/libm.so.6 (0x00007f2cd3aff000)
libpythonX.Ym.so.1.0 => /usr/lib64/libpythonX.Ym.so.1.0 (0x00007f2cd35da000)
libpthread.so.0 => /usr/lib64/libpthread.so.0 (0x00007f2cd33be000)
libc.so.6 => /usr/lib64/libc.so.6 (0x00007f2cd2ff0000)
/lib64/ld-linux-x86-64.so.2 (0x00007f2cd400d000)
libdl.so.2 => /usr/lib64/libdl.so.2 (0x00007f2cd2dec000)
libutil.so.1 => /usr/lib64/libutil.so.1 (0x00007f2cd2be9000)
whereas when I run ldd on my anaconda3's math it produces the following:
linux-vdso.so.1 => (0x00007ffe1338a000)
libpthread.so.0 => /usr/lib64/libpthread.so.0 (0x00007f1774dc3000)
libc.so.6 => /usr/lib64/libc.so.6 (0x00007f17749f5000)
libm.so.6 => /usr/lib64/libm.so.6 (0x00007f17746f3000)
/lib64/ld-linux-x86-64.so.2 (0x00007f1774fdf000)
This, to me, indicates they're compiled somewhat differently. Most notably, the system Python's module is linked against its libpython.so, whereas my anaconda3 copy is not. I did attempt to compile a brand-new Python from source using --enable-shared to replicate this, to no avail.
My Compilation Flags:
CFLAGS as produced by python3-config --cflags
-I/path/to/anaconda3/include/python3.7m -Wno-unused-result -Wsign-compare -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O3 -ffunction-sections -pipe -isystem /path/to/anaconda3/include -fuse-linker-plugin -ffat-lto-objects -flto-partition=none -flto -DNDEBUG -fwrapv -O3 -Wall
LDFLAGS
-L/path/to/anaconda3/lib/python3.7/config-3.7m-x86_64-linux-gnu -L/path/to/anaconda3/lib -lm -lrt -lutil -ldl -lpthread -lc -lpython3 -fno-lto
Ideal Solution:
I've searched all over SO and found no solutions that have worked. Ideally I would like to be able to take one of the following solutions:
Figure out what flags or steps I need so that my shared library successfully passes the symbols coming from Python.h through to the Python libraries needed at runtime
Figure out how to properly compile a Python from source that looks like the system Python's (or an explanation as to why I can't do that)
A good lecturing as to how I have no clue what I'm doing

How can I wrap a C-Library in SWIG, which has usually to be linked during C-compilation?

Given a C library that has to be linked during compilation if I want to use its functions, I want to access these functions in Python using SWIG. I can only find examples and introductions where C code (example.c) is wrapped using SWIG, not a method for wrapping an existing dynamic library (example.so).
All you need to do to make the .so (or .a) library case work is to link the library appropriately during the compile step of the example build process. You will still have to compile the example_wrap.c that gets generated; this is where you can link against things.
So modified from the SWIG docs that would be:
$ swig -python example.i
$ gcc -O2 -fPIC -c example.c
$ gcc -O2 -fPIC -c example_wrap.c -I/usr/local/include/python2.5
$ gcc -shared example.o example_wrap.o -o _example.so -lmylib
In reality, you can also skip linking at compile time and use dlopen at runtime instead, by injecting some extra code into the Python part of your module that calls dlopen before the shared object from SWIG gets loaded.
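A hedged sketch of that runtime approach (library and module names are illustrative):
import ctypes
# load the dependency with global symbol visibility before the SWIG module
ctypes.CDLL("libmylib.so", mode=ctypes.RTLD_GLOBAL)
import example  # _example.so can now resolve libmylib's symbols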

Compile file .c with embedded Python/C functions

I'm starting to study the Python/C API, and I wrote this first bit of code to test some functions:
file: test.c
#include "Python.h"
int main() {
PyObject* none = Py_BuildValue("");
}
I compile with command:
gcc -I/usr/include/python2.7 test.c
I get the error: undefined reference to `Py_BuildValue'
After I run:
gcc -I/usr/include/python2.7 --shared -fPIC hashmem.c
this compiles without errors, but when I run the compiled file I get a
Segmentation fault (core dumped)
How do I set the gcc parameters?
I've ubuntu 12.04, python 2.7.3, gcc 4.6.3 and I installed python-dev.
Thanks.
In the comments, @Pablo provided the solution:
gcc -I/usr/include/python2.7 test.c -lpython2.7
I forgot to link the python library with the "-l" parameter.
-llibrary
-l library
Search the library named library when linking. (The second alternative, with the library as a separate argument, is only for POSIX compliance and is not recommended.)
It makes a difference where in the command you write this option; the linker searches and processes libraries and object files in the order they are specified. Thus, `foo.o -lz bar.o' searches library `z' after file foo.o but before bar.o. If bar.o refers to functions in `z', those functions may not be loaded.
The linker searches a standard list of directories for the library, which is actually a file named liblibrary.a. The linker then uses this file as if it had been specified precisely by name.
The directories searched include several standard system directories plus any that you specify with -L.
Normally the files found this way are library files - archive files whose members are object files. The linker handles an archive file by scanning through it for members which define symbols that have so far been referenced but not defined. But if the file that is found is an ordinary object file, it is linked in the usual fashion. The only difference between using an -l option and specifying a file name is that -l surrounds library with `lib' and `.a' and searches several directories.
Parameter description source

How to statically link a library when compiling a python module extension

I would like to modify a setup.py file such that the command "python setup.py build" compiles a C-based extension module that is statically (rather than dynamically) linked to a library.
The extension is currently dynamically linked to a number of libraries. I would like to leave everything unchanged except for statically linking to just one library. I have successfully done this by manually modifying the call to gcc that distutils runs, although it required that I explicitly listed the dependent libraries.
Perhaps this is too much information, but for clarity this is the final linking command that was executed during the "python setup.py build" script:
gcc -pthread -shared -L/system/lib64 -L/system/lib/ -I/system/include build/temp.linux-x86_64-2.7/src/*.o -L/system/lib -L/usr/local/lib -L/usr/lib -ligraph -o build/lib.linux-x86_64-2.7/igraph/core.so
And this is my manual modification:
gcc -pthread -shared -L/system/lib64 -L/system/lib/ -I/system/include build/temp.linux-x86_64-2.7/src/*.o -L/system/lib -L/usr/local/lib -L/usr/lib /system/lib/libigraph.a -lxml2 -lz -lgmp -lstdc++ -lm -ldl -o build/lib.linux-x86_64-2.7/igraph/core.so
Section 2.3.4 of Distributing Python Modules discusses the specification of libraries, but only "library_dirs" is appropriate and those libraries are dynamically linked.
I'm using a Linux environment for development but the package will also be compiled and installed on Windows, so a portable solution is what I'm after.
Can someone tell me where to look for instructions, or how to modify the setup.py script? (Thanks in advance!)
I'm new to StackOverflow, so my apologies if I haven't correctly tagged this question, or if I have made some other error in this posting.
6-7 years later, static linking with Python extensions is still poorly documented. As the OP points out in a comment, the usage is OS dependent.
On Linux / Unix
Static libraries are linked just like object files and should go, with full path and extension, into extra_objects.
On Windows
The compiler detects whether a linked library is static or dynamic, so the static library name goes into the libraries list and its directory into library_dirs.
Solution for both platforms
For the example below, I will use the same library scenario as the OP: linking igraph statically, and z, xml2, and gmp dynamically. This solution is a bit hackish, but at least it does the right thing on each platform.
import sys
from distutils.core import Extension  # setuptools.Extension works the same way

static_libraries = ['igraph']
static_lib_dir = '/system/lib'
libraries = ['z', 'xml2', 'gmp']
library_dirs = ['/system/lib', '/system/lib64']

if sys.platform == 'win32':
    libraries.extend(static_libraries)
    library_dirs.append(static_lib_dir)
    extra_objects = []
else:  # POSIX
    extra_objects = ['{}/lib{}.a'.format(static_lib_dir, l)
                     for l in static_libraries]

ext = Extension('igraph.core',
                sources=source_file_list,
                libraries=libraries,
                library_dirs=library_dirs,
                include_dirs=include_dirs,
                extra_objects=extra_objects)
On MacOS
I guess this also works for MacOS (using the else path), but I have not tested it.
If all else fails, there are always the little-documented extra_compile_args and extra_link_args options of the Extension builder. (See also here.)
You might need to hack in some OS-dependent code to get the right argument format for a particular platform, though.
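A hedged sketch of that fallback on Linux (the flags are illustrative and mirror the -Bstatic trick from the first answer above):
ext = Extension('igraph.core',
                sources=source_file_list,
                extra_link_args=['-Wl,-Bstatic', '-ligraph', '-Wl,-Bdynamic'])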
Any possibility that this might work?
g++ -Wl,-Bstatic -lfoo -Wl,-Bdynamic -lbar -Wl,--as-needed
