Using mkl_set_num_threads with numpy

I'm trying to set the number of threads for numpy calculations with mkl_set_num_threads like this
import numpy
import ctypes
mkl_rt = ctypes.CDLL('libmkl_rt.so')
mkl_rt.mkl_set_num_threads(4)
but I keep getting a segmentation fault:
Program received signal SIGSEGV, Segmentation fault.
0x00002aaab34d7561 in mkl_set_num_threads__ () from /../libmkl_intel_lp64.so
Getting the number of threads is no problem:
print mkl_rt.mkl_get_max_threads()
How can I get my code working?
Or is there another way to set the number of threads at runtime?

Ophion pointed me in the right direction. Contrary to the documentation, the argument of mkl_set_num_threads has to be passed by reference.
I have now defined two functions for getting and setting the number of threads:
import numpy
import ctypes
mkl_rt = ctypes.CDLL('libmkl_rt.so')
mkl_get_max_threads = mkl_rt.mkl_get_max_threads
def mkl_set_num_threads(cores):
    mkl_rt.mkl_set_num_threads(ctypes.byref(ctypes.c_int(cores)))
mkl_set_num_threads(4)
print mkl_get_max_threads() # says 4
and they work as expected.
Edit: according to Rufflewind, the C functions have mixed-case names (MKL_Set_Num_Threads and so on) and expect their parameters by value:
import ctypes
mkl_rt = ctypes.CDLL('libmkl_rt.so')
mkl_set_num_threads = mkl_rt.MKL_Set_Num_Threads
mkl_get_max_threads = mkl_rt.MKL_Get_Max_Threads

Long story short, use MKL_Set_Num_Threads and its CamelCased friends when calling MKL from Python. The same applies to C if you don't #include <mkl.h>.
The MKL documentation seems to suggest that the correct type signature in C is:
void mkl_set_num_threads(int nt);
Okay, let's try a minimal program then:
void mkl_set_num_threads(int);
int main(void) {
    mkl_set_num_threads(1);
    return 0;
}
Compile it with GCC and boom, Segmentation fault again. So it seems the problem isn't restricted to Python.
Running it through a debugger (GDB) reveals:
Program received signal SIGSEGV, Segmentation fault.
0x0000… in mkl_set_num_threads_ ()
from /…/mkl/lib/intel64/libmkl_intel_lp64.so
Wait a second, mkl_set_num_threads_?? That's the Fortran version of mkl_set_num_threads! How did we end up calling the Fortran version? (Keep in mind that Fortran's calling convention requires arguments to be passed as pointers rather than by value.)
It turns out the documentation was a complete façade. If you actually inspect the header files for the recent versions of MKL, you will find this cute little definition:
void MKL_Set_Num_Threads(int nth);
#define mkl_set_num_threads MKL_Set_Num_Threads
… and now everything makes sense! The correct function to call (for C code) is MKL_Set_Num_Threads, not mkl_set_num_threads. Inspecting the symbol table reveals that there are actually four different variants defined:
nm -D /…/mkl/lib/intel64/libmkl_rt.so | grep -i mkl_set_num_threads
00000000000e3060 T MKL_SET_NUM_THREADS
…
00000000000e30b0 T MKL_Set_Num_Threads
…
00000000000e3060 T mkl_set_num_threads
00000000000e3060 T mkl_set_num_threads_
…
Why did Intel put in four different variants of one function despite there being only C and Fortran variants in the documentation? I don't know for certain, but I suspect it's for compatibility with different Fortran compilers. You see, Fortran calling convention is not standardized. Different compilers will mangle the names of the functions differently:
some use upper case,
some use lower case with a trailing underscore, and
some use lower case with no decoration at all.
There may even be other ways that I'm not aware of. This trick allows the MKL library to be used with most Fortran compilers without any modification, the downside being that C functions need to be "mangled" to make room for the 3 variants of the Fortran calling convention.
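A minimal sketch of the advice above, assuming the runtime library is named libmkl_rt.so as in the question. It binds the mixed-case C entry points and declares their signatures, so ctypes passes the thread count by value. The helper name load_mkl_threading is made up for illustration, and it simply returns None when the library cannot be loaded:

```python
import ctypes


def load_mkl_threading(libname="libmkl_rt.so"):
    """Return (set_num_threads, get_max_threads), or None if MKL is not loadable.

    The library name is an assumption taken from the question; adjust it
    for your platform (for example mkl_rt.dll on Windows).
    """
    try:
        mkl_rt = ctypes.CDLL(libname)
    except OSError:
        return None

    # C variant: void MKL_Set_Num_Threads(int nth), argument passed by value.
    set_num_threads = mkl_rt.MKL_Set_Num_Threads
    set_num_threads.argtypes = [ctypes.c_int]
    set_num_threads.restype = None

    # C variant of the getter, returns an int.
    get_max_threads = mkl_rt.MKL_Get_Max_Threads
    get_max_threads.argtypes = []
    get_max_threads.restype = ctypes.c_int
    return set_num_threads, get_max_threads


funcs = load_mkl_threading()
if funcs is not None:
    set_num_threads, get_max_threads = funcs
    set_num_threads(4)
    print(get_max_threads())
```

Declaring argtypes is not strictly required for a plain int, but it makes the by-value intent explicit and turns an accidental byref(...) into a ctypes error instead of a crash.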

For people looking for a cross platform and packaged solution, note that we have recently released threadpoolctl, a module to limit the number of threads used in C-level threadpools called by python (OpenBLAS, OpenMP and MKL). See this answer for more info.
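A short sketch of how that could look, assuming threadpoolctl is installed (it is a third-party package). The wrapper name blas_thread_limit is made up for illustration and falls back to a no-op context manager when the package is missing:

```python
import contextlib

try:
    from threadpoolctl import threadpool_limits
except ImportError:  # third-party package, may not be installed
    threadpool_limits = None


def blas_thread_limit(n):
    """Context manager that limits BLAS thread pools (MKL, OpenBLAS) to n threads.

    Hypothetical convenience wrapper: does nothing if threadpoolctl is missing.
    """
    if threadpool_limits is None:
        return contextlib.nullcontext()
    return threadpool_limits(limits=n, user_api="blas")


with blas_thread_limit(2):
    # numpy / BLAS-heavy code would go here
    total = sum(range(10))
```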

For people looking for the complete solution, you can use a context manager:
import ctypes
class MKLThreads(object):
    _mkl_rt = None
    @classmethod
    def _mkl(cls):
        # Load the MKL runtime once and cache the handle on the class
        if cls._mkl_rt is None:
            try:
                cls._mkl_rt = ctypes.CDLL('libmkl_rt.so')
            except OSError:
                cls._mkl_rt = ctypes.CDLL('mkl_rt.dll')
        return cls._mkl_rt
    @classmethod
    def get_max_threads(cls):
        return cls._mkl().mkl_get_max_threads()
    @classmethod
    def set_num_threads(cls, n):
        assert type(n) == int
        # The lower-case symbol is the Fortran variant: pass by reference
        cls._mkl().mkl_set_num_threads(ctypes.byref(ctypes.c_int(n)))
    def __init__(self, num_threads):
        self._n = num_threads
        self._saved_n = self.get_max_threads()
    def __enter__(self):
        self.set_num_threads(self._n)
        return self
    def __exit__(self, type, value, traceback):
        self.set_num_threads(self._saved_n)
Then use it like:
with MKLThreads(2):
    # do some stuff on two cores
    pass
Or change the configuration directly by calling the class methods:
# Example
MKLThreads.set_num_threads(3)
print(MKLThreads.get_max_threads())
Code is also available in this gist.
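The save-and-restore idea behind this class can also be sketched without MKL. The following is only an illustration: temporary_setting, fake_get and fake_set are made-up names, and the getter/setter pair stands in for get_max_threads/set_num_threads. Unlike the class above, it reads the old value on entry rather than at construction time:

```python
from contextlib import contextmanager


@contextmanager
def temporary_setting(getter, setter, value):
    """Set a value for the duration of a with-block, then restore the old one."""
    saved = getter()
    setter(value)
    try:
        yield
    finally:
        # Runs even if the block raises, so the old setting always comes back.
        setter(saved)


# Stand-in for the MKL getter/setter pair, so the sketch runs anywhere.
state = {"threads": 8}


def fake_get():
    return state["threads"]


def fake_set(n):
    state["threads"] = n


with temporary_setting(fake_get, fake_set, 2):
    inside = fake_get()
after = fake_get()
print(inside, after)
```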

Related

Specifying Exact CPU Instruction Set with Cythonized Python Wheels

I have a Python package with a native extension compiled by Cython. Due to some performance needs, the compilation is done with -march=native, -mtune=native flags. This basically enables the compiler to use any of the available ISA extensions.
Additionally, we keep a non-cythonized, pure-python version of this package. It should be used in environments which are less performance sensitive.
Hence, in total we have two versions published:
Cythonized wheel built for a very specific platform
Pure-python wheel.
Some other packages depend on this package, and some of the machines differ a bit from the one the package was compiled on. Because we used -march=native, we get SIGILL there, since some ISA extension is missing on the server.
So, in essence, I'd like to somehow make pip disregard the native wheel if the host CPU is not compatible with the wheel.
The native wheel does have the cp37 tag and the platform name, but I don't see a way to define more granular ISA requirements here. I can always use the --implementation flag for pip, but I wonder if there's a better way for pip to differentiate among different ISAs.
Thanks,

The pip infrastructure doesn't support such granularity.
I think a better approach would be to have two versions of the Cython-extension compiled: with -march=native and without, to install both and to decide at the run time which one should be loaded.
Here is a proof of concept.
The first hoop to jump through: how to check at run time which instructions are supported by the CPU/OS combination. For simplicity we will check only for AVX (this SO-post has more details), and I offer only a gcc-specific solution (see also this), called impl_picker.pyx:
cdef extern from *:
    """
    int cpu_supports_avx(void){
        return __builtin_cpu_supports("avx");
    }
    """
    int cpu_supports_avx()
def cpu_has_avx_support():
    return cpu_supports_avx() != 0
The second problem: the pyx-file and the module must have the same name. To avoid code duplication, the actual code is in a pxi-file:
# worker.pxi
cdef extern from *:
    """
    int compiled_with_avx(void){
    #ifdef __AVX__
        return 1;
    #else
        return 0;
    #endif
    }
    """
    int compiled_with_avx()
def compiled_with_avx_support():
    return compiled_with_avx() != 0
As one can see, the function compiled_with_avx_support will yield different results, depending on whether it was compiled with -march=native or not.
And now we can define two versions of the module just by including the actual code from the *.pxi-file. One module called worker_native.pyx:
# distutils: extra_compile_args=["-march=native"]
include "worker.pxi"
and worker_fallback.pyx:
include "worker.pxi"
Building everything, e.g. via cythonize -i -3 *.pyx, it can be used as follows:
from impl_picker import cpu_has_avx_support
# overhead once when imported:
if cpu_has_avx_support():
    import worker_native as worker
else:
    print("using fallback worker")
    import worker_fallback as worker
print("compiled_with_avx_support:", worker.compiled_with_avx_support())
On my machine the above would lead to compiled_with_avx_support: True, on older machines the "slower" worker_fallback will be used and the result will be compiled_with_avx_support: False.
The goal of this post is not to give a working setup.py, but to outline how one could pick the correct version at run time. Obviously, the setup.py could get considerably more complicated: e.g. one would need to compile multiple C files with different compiler settings (see this SO-post for how this could be achieved).
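If compiling a detection helper is not an option, the same dispatch can be roughly sketched in pure Python. This assumes a Linux x86 machine where /proc/cpuinfo has a "flags" line; the function names are made up for illustration, and worker_native / worker_fallback are the two modules from above:

```python
import importlib


def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def pick_worker_name(flags, required=("avx",)):
    """Choose the native build only if every required flag is present."""
    if all(flag in flags for flag in required):
        return "worker_native"
    return "worker_fallback"


def load_worker(cpuinfo_path="/proc/cpuinfo"):
    """Import whichever worker module matches this machine (Linux-only assumption)."""
    with open(cpuinfo_path) as fh:
        flags = parse_cpu_flags(fh.read())
    return importlib.import_module(pick_worker_name(flags))


sample = "processor\t: 0\nflags\t\t: fpu sse sse2 avx avx2\n"
print(pick_worker_name(parse_cpu_flags(sample)))
```

With -march=native the compiler may use more than AVX, so in practice the required tuple would have to list every extension the build machine offers.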

Resultpointer in function call

I want to use functions from DLLs via ctypes. I can call the function without errors, and even the function's error code is 0, meaning the function finished successfully. But when I try to access the result variable, it is empty.
I implemented the lookup in Free Pascal several years ago and would like to port it to Python now. The interface also allows access via the cdecl convention, and I tried to reimplement it in Python 3.7.4 with ctypes.
The last working Pascal prototype was:
PROCEDURE pGetCallInfo(DriveInfo: pointer; ACall: pointer; AInfo: pointer;
var AErrorCode: SmallInt); pascal; external 'raccd32a.dll';
My best version in Python so far is the following:
from ctypes import *
callBookDLL = CDLL('raccd32a')
AInfo = create_string_buffer(400)
err = callBookDLL.cGetCallInfo("self.txt_CallBookPath.text()","DG1ATN",AInfo)
The result is:
err
0
AInfo.value
b''
AInfo should contain a string buffer of at most 400 characters with a result containing name, address and so on.
As I have a second library that I have to access the same way, I searched for my mistake but was not able to find it. I think my problem is the handling of pointers and the type conversion.
I have already checked the ctypes howto, but I cannot solve this problem.
Thanks a lot so far ...

Check [Python 3.Docs]: ctypes - A foreign function library for Python. It contains (almost) every piece of info that you need.
There are a number of problems:
ctypes doesn't support pascal calling convention, only cdecl and stdcall (applies to 32bit only). That means (after reading the manual) that you shouldn't use the p* functions, but the c* (or s*)
You didn't specify argtypes (and restype) for your function. This results in undefined behavior (UB). Some effects of this:
[SO]: Python ctypes cdll.LoadLibrary, instantiate an object, execute its method, private variable address truncated (@CristiFati's answer)
[SO]: python ctypes issue on different OSes (@CristiFati's answer)
It is a procedure (a function that returns void). Anyway, this is a minor one
Here's some sample code (of course it's blind, as I didn't test it):
#!/usr/bin/env python3
import sys
import ctypes
dll = ctypes.CDLL("raccd32a.dll")
cGetCallInfo = dll.cGetCallInfo
cGetCallInfo.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p, ctypes.POINTER(ctypes.c_short)]
cGetCallInfo.restype = None
ADriveInfo = self.txt_CallBookPath.text().encode()
#ADriveInfo = b"C:\\callbook2019\\" # Notice the double bkslashes
ACall = b"DG1ATN"
AInfo = ctypes.create_string_buffer(400)
result = ctypes.c_short(0)
cGetCallInfo(ADriveInfo, ACall, AInfo, ctypes.byref(result))
@EDIT0:
From the beginning, I wanted to say that the 1st argument passed to the function doesn't make much sense. Then, there are problems regarding the 2nd one as well. According to the manual ([AMT-I]: TECHNICAL INFORMATION about RACCD32a.DLL (emphasis is mine)):
ADriveInfo, ACall and AInfo are pointers to zero-terminated strings. These
strings has to exist at the moment of calling xGetCallInfo. The calling
program is responsible for creating them. AInfo must be long enough to
comfort xGetCallInfo (at least 400 characters).
Note: "Length of AInfo" refers to the length of the string AInfo points at.
ADriveInfo and ACall are treated in the same manner for short.
In ADriveInfo the procedure expects the path to the CD ROM drive. Use
"G:\"
if "G:" designates the CD ROM drive with the callbook CD ROM.
Keep in mind that this information is a *must* and the calling program
has to know it.
Note: If the active directory on drive G: is not the root, ADriveInfo = "G:"
will lead to an error 3. So always use "G:\".
The calling program has to ensure that the length of ADriveInfo does not
exceed 80 characters.
ACall contains the call you are looking for, all letters in lower case,
no additional spaces etc. The calling program has to ensure that ACall is
not longer than 15 characters. However, there is no call longer than 6
characters in the database.
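Since the DLL itself can't be tested here, the calling pattern the manual describes can be sketched against a stand-in. Everything named fake_* below is hypothetical: it is a Python function wrapped with CFUNCTYPE so that it has the same shape as the documented procedure (three C strings plus a pointer to a 16-bit error code). The output buffer is declared as POINTER(c_char) in the stand-in so the callback can write into it; for the real DLL function the argtypes from the sample above play the same role:

```python
import ctypes

# Same shape as the documented procedure: drive info, call sign, output
# buffer, pointer to a SmallInt error code. Used only to build a stand-in.
PROTO = ctypes.CFUNCTYPE(
    None,
    ctypes.c_char_p,
    ctypes.c_char_p,
    ctypes.POINTER(ctypes.c_char),
    ctypes.POINTER(ctypes.c_short),
)


def _fake_get_call_info(drive_info, call, info, err):
    # Hypothetical behaviour: echo the inputs into the caller's buffer.
    text = b"call=" + call + b" drive=" + drive_info + b"\x00"
    ctypes.memmove(info, text, len(text))
    err[0] = 0


fake_get_call_info = PROTO(_fake_get_call_info)


def lookup(drive, call):
    """Call the stand-in the way the manual asks: bytes in, 400-byte buffer out."""
    info = ctypes.create_string_buffer(400)   # the caller owns the buffer
    err = ctypes.c_short(-1)
    fake_get_call_info(
        drive.encode("ascii"),                # e.g. "G:\\" including the backslash
        call.lower().encode("ascii"),         # the manual wants lower case
        info,
        ctypes.byref(err),
    )
    return err.value, info.value.decode("ascii")


print(lookup("G:\\", "DG1ATN"))
```

The points that carry over to the real call are: pass bytes rather than str, lower-case the call sign, hand in a buffer of at least 400 bytes, and read the error code from the c_short after the call.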

How to Grab dll message-dispatching procedure and redirect to python stdout?

I'm trying to figure out how to proceed and whether this is feasible at all.
I'm working with an external DLL to control my mechanical delay line.
This API has its own internal procedure that outputs messages in a separate window. I would very much like to catch this message flow and present it in my Python (PyQt5) application.
In the API description there is a function:
int LS_SetProcessMessagesProc(void *pProc);
The function returns 0 if there is no error and 1 if there is one.
According to the DLL description:
It enables the replacement of the internal message-dispatching procedure of the LStep API.
The LStep API processes during waiting for confirmation of the LStep in the main-thread messages. If you want to switch of the Message-Dispatching or replace with your onw Code, you can use SetProcessMessagesProc for using a callback-procedure.
pProc must be a pointer to a stdcall-procedure without a parameter:
void MyProcessMessages() {...}
Example:
LS.SetProcessMessagesProc(&MyProcessMessages);
Taking Python's stdout as an example, how can I send the messages to it?

I'm going to illustrate everything on the:
[MSDN]: EnumWindows function - which enumerates all windows on the screen, and for each of them calls a callback function - check next bullet
[MSDN]: EnumWindowsProc callback function - which is used to handle every enumerated window
which is (a slightly more complicated example of) what you need: a function defined in an external .dll which needs to call another custom function (written by you in Python), via [Python]: ctypes module (on Win).
The code:
import ctypes
from ctypes import wintypes
try:
    from win32gui import GetWindowText
    pywin32_present = True
except ImportError:
    pywin32_present = False
def enum_windows_proc(hwnd, l_param):
    print("HWND: {}\n".format(hwnd))
    if pywin32_present:
        txt = GetWindowText(hwnd)
        if txt and "MSCTFIME UI" not in txt and "Default IME" not in txt:
            print(" Window text: {}\n".format(txt))
    return 1
def main():
    user32_dll = ctypes.windll.LoadLibrary("user32.dll")
    enum_windows = user32_dll.EnumWindows
    WND_ENUM_PROC_TYPE = ctypes.WINFUNCTYPE(wintypes.BOOL, wintypes.HWND, wintypes.LPARAM)
    enum_windows.argtypes = (WND_ENUM_PROC_TYPE, wintypes.LPARAM)
    enum_windows.restype = wintypes.BOOL
    enum_windows(WND_ENUM_PROC_TYPE(enum_windows_proc), wintypes.LPARAM(0))
if __name__ == "__main__":
    main()
Notes:
imports:
wintypes is a ctypes sub-module that defines a bunch of Ms specific data (constants, structs, enums, ...)
[Python]: pywin32 is a Python wrapper over C Win functions, it's basically a more advanced (and pythonic) approach of what ctypes does. It doesn't come with Python by default, it must be manually installed; in our example, it's optional
def enum_windows_proc(hwnd, l_param)::
It's the Python form of BOOL CALLBACK EnumWindowsProc(_In_ HWND hwnd, _In_ LPARAM lParam);
Prints the hwnd (window handle) for each window (note that there will be lots of such windows, since most of them are "invisible" to the user)
If pywin32 module is installed, it will be used to extract each window's title(caption). Of course, that can also be done with ctypes but it's a little bit more complicated
The title filtering is to avoid printing useless text (for most of the windows). If you want more details, check: [SO]: Get the title of a window of another program using the process name
The main function:
First, the .dll that contains the function (in our case user32.dll) needs to be loaded. This is done using [MSDN]: LoadLibrary function (Ux: [man]: dlopen). Also, the internal structure (needed for next line to look so simple) of the returned object (user32_dll) is initialized
The function (or better: a pointer to it) is being retrieved (enum_windows) using [MSDN]: GetProcAddress function (Ux: [man]: dlsym)
The next 3 lines of code are used to let Python know the details about the loaded function pointer (return type and argument types). Note that (in some cases) there is a simpler way (codewise) to do all that, but for learning purposes, it's OK to go through the whole thing
Finally, call the external function, with our custom function as an argument
Since there will be lots of output (and will be mostly memory addresses), I won't paste it here.
Going to your problem, based on the function headers you pasted in the question, we can take the same approach (note that the code will not work copy/pasted OOTB):
import ctypes
from ctypes import wintypes
def my_process_messages():
    # Your code here (delete the next (`pass`) line)
    pass
dll_name = "your dll path (full or relative)"
dll_object = ctypes.windll.LoadLibrary(dll_name)
ls_set_process_messages_proc = dll_object.LS_SetProcessMessagesProc
PROCESS_MESSAGES_TYPE = ctypes.WINFUNCTYPE(None)
ls_set_process_messages_proc.argtypes = (PROCESS_MESSAGES_TYPE,)
ls_set_process_messages_proc.restype = ctypes.c_int
print("ls_set_process_messages_proc returned: {}\n".format(ls_set_process_messages_proc(PROCESS_MESSAGES_TYPE(my_process_messages))))
Note: The example is based on the fact that the external .dll:
Is Win style (uses stdcall calling convention). If that's not true (it uses cdecl), you need to change (for rigorousity's sake I'm going to say 2 things):
ctypes.windll to ctypes.cdll
ctypes.WINFUNCTYPE to ctypes.CFUNCTYPE
Exports C style functions (not C++ which mangles function names), which I'm almost 100% sure. But, if this is not the case then, sorry, nothing to do here. For more details on this topic, check: [SO]: Excel VBA, Can't Find DLL Entry Point from a DLL file.
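One detail worth sketching separately: ctypes does not keep the callback wrapper alive for you, so if the DLL stores the pointer and calls it later, the Python side has to hold a reference to it. Below is a small, platform-neutral illustration using CFUNCTYPE (swap in WINFUNCTYPE for a stdcall DLL on Windows). The MessagePump class is a made-up name, and because the callback receives no arguments, what it writes to the stream is entirely up to your code:

```python
import ctypes
import io
import sys

# No arguments, no return value; use ctypes.WINFUNCTYPE(None) for stdcall.
PROCESS_MESSAGES_TYPE = ctypes.CFUNCTYPE(None)


class MessagePump:
    """Owns the ctypes callback so it is not garbage collected while in use."""

    def __init__(self, stream=None):
        self.stream = stream if stream is not None else sys.stdout
        self.calls = 0
        # Keep the wrapper on the instance; a temporary passed directly to
        # the DLL could be freed while the DLL still holds the pointer.
        self.c_callback = PROCESS_MESSAGES_TYPE(self._on_process_messages)

    def _on_process_messages(self):
        self.calls += 1
        self.stream.write("process-messages callback #{}\n".format(self.calls))


# Simulate the DLL invoking the stored function pointer twice.
demo = MessagePump(io.StringIO())
demo.c_callback()
demo.c_callback()
print(demo.stream.getvalue(), end="")
```

With the real library you would pass demo.c_callback to ls_set_process_messages_proc and keep demo alive for as long as the callback is registered.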

Read the ctypes tutorial, in particular these sections:
Loading-dynamic-link-libraries
Accessing functions from loaded dlls
Calling functions
The Linux example there uses the standard C library's qsort function:
Load the libc.so.6 dll.
from ctypes import *
libc = CDLL("libc.so.6")
Get a function pointer to qsort.
qsort = libc.qsort
qsort.restype = None
Create the type for the callback function
and implement the Python callback function.
CMPFUNC = CFUNCTYPE(c_int, POINTER(c_int), POINTER(c_int))
def py_cmp_func(a, b):
    return a[0] - b[0]
cmp_func = CMPFUNC(py_cmp_func)
Define a C-Type Integer Array with values
and use qsort to sort the Array using cmp_func.
IntArray5 = c_int * 5
ia = IntArray5(5, 1, 7, 33, 99)
qsort(ia, len(ia), sizeof(c_int), cmp_func)
for i in ia:
    print(i)
Output
1 5 7 33 99

How to read file capabilities using Python?

On Linux systems root privileges can be granted more selectively than adding the setuid bit using file capabilities. See capabilities(7) for details. These are attributes of files and can be read using the getcap program. How can these attributes be retrieved in Python?
Even though running the getcap program using e.g. subprocess for answering such a question is possible it is not desirable when retrieving very many capabilities.
It should be possible to devise a solution using ctypes. Are there alternatives to this approach or even libraries facilitating this task?

Python 3.3 comes with os.getxattr. If not, yeah... one way would be using ctypes, at least to get the raw stuff, or maybe use pyxattr
For pyxattr:
>>> import xattr
>>> xattr.listxattr("/bin/ping")
(u'security.capability',)
>>> xattr.getxattr("/bin/ping", "security.capability")
'\x00\x00\x00\x02\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
For Python 3.3's version, it's essentially the same, just importing os, instead of xattr. ctypes is a bit more involved, though.
Now, we're getting the raw result, meaning that those two approaches are most useful for retrieving textual attributes. But... we can use the same approach as getcap, through libcap itself:
import ctypes
libcap = ctypes.cdll.LoadLibrary("libcap.so")
cap_t = libcap.cap_get_file('/bin/ping')
libcap.cap_to_text.restype = ctypes.c_char_p
libcap.cap_to_text(cap_t, None)
which gives me:
'= cap_net_raw+p'
probably more useful for you.
PS: note that cap_to_text returns a malloced string. It's your job to deallocate it using cap_free
Hint about the "binary gibberish":
>>> import struct
>>> caps = '\x00\x00\x00\x02\x00 \x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
>>> struct.unpack("<IIIII", caps)
(33554432, 8192, 0, 0, 0)
In that 8192, the only active bit is the 13th. If you go to linux/capability.h, you'll see that CAP_NET_RAW is defined at 13.
Now, if you want to write a module with all those constants, you can decode the info. But I'd say it's much more laborious than just using ctypes + libcap.
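For the curious, here is a rough sketch of such a decoder for the 20-byte value shown above. It assumes the revision-2 layout from linux/capability.h (one magic/flags word followed by two permitted/inheritable pairs, all little-endian), and the name table is deliberately partial; decode_file_caps is a made-up helper name:

```python
import struct

# Partial bit-number table taken from linux/capability.h.
CAP_NAMES = {
    0: "cap_chown",
    1: "cap_dac_override",
    10: "cap_net_bind_service",
    12: "cap_net_admin",
    13: "cap_net_raw",
    19: "cap_sys_ptrace",
    21: "cap_sys_admin",
}


def decode_file_caps(raw):
    """Decode a 20-byte security.capability value (revision-2 layout assumed)."""
    magic_etc, perm_lo, inh_lo, perm_hi, inh_hi = struct.unpack("<IIIII", raw)
    permitted = perm_lo | (perm_hi << 32)
    inheritable = inh_lo | (inh_hi << 32)

    def names(mask):
        return [CAP_NAMES.get(bit, "cap_%d" % bit)
                for bit in range(64) if (mask >> bit) & 1]

    return {
        "revision": magic_etc >> 24,          # top byte holds the revision
        "effective": bool(magic_etc & 1),     # lowest bit is the effective flag
        "permitted": names(permitted),
        "inheritable": names(inheritable),
    }


# The same bytes as in the struct.unpack example above.
raw = b"\x00\x00\x00\x02\x00 \x00\x00" + b"\x00" * 12
print(decode_file_caps(raw))
```

On Python 3.3+ under Linux the raw bytes could come from os.getxattr(path, "security.capability"), which raises OSError when the file has no such attribute.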

I tried the code from Ricardo Cárdenes's answer, but it did not work properly for me, because some details of the ctypes invocation were incorrect. As a result, a truncated path string was passed to getxattr(...) inside libcap, which then returned the capabilities of the wrong item (the / directory, or whatever the first path character names, and not the actual path).
It is very important to remember and account for the difference between str and bytes in Python 3.X. This code works properly on Python 3.5/3.6:
#!/usr/bin/env python3
import ctypes
import os
import sys
# load shared library
libcap = ctypes.cdll.LoadLibrary('libcap.so')
class libcap_auto_c_char_p(ctypes.c_char_p):
    def __del__(self):
        libcap.cap_free(self)
# cap_t cap_get_file(const char *path_p)
libcap.cap_get_file.argtypes = [ctypes.c_char_p]
libcap.cap_get_file.restype = ctypes.c_void_p
# char* cap_to_text(cap_t caps, ssize_t *length_p)
libcap.cap_to_text.argtypes = [ctypes.c_void_p, ctypes.c_void_p]
libcap.cap_to_text.restype = libcap_auto_c_char_p
def cap_get_file(path):
    cap_t = libcap.cap_get_file(path.encode('utf-8'))
    if cap_t is None:
        return ''
    else:
        return libcap.cap_to_text(cap_t, None).value.decode('utf-8')
print(cap_get_file('/usr/bin/traceroute6.iputils'))
print(cap_get_file('/usr/bin/systemd-detect-virt'))
print(cap_get_file('/usr/bin/mtr'))
print(cap_get_file('/usr/bin/tar'))
print(cap_get_file('/usr/bin/bogus'))
The output will look like this (anything nonexistent, or with no capabilities set, just returns ''):
= cap_net_raw+ep
= cap_dac_override,cap_sys_ptrace+ep
= cap_net_raw+ep

How do you call Python code from C code?

I want to extend a large C project with some new functionality, but I really want to write it in Python. Basically, I want to call Python code from C code. However, Python->C wrappers like SWIG allow for the OPPOSITE, that is writing C modules and calling C from Python.
I'm considering an approach involving IPC or RPC (I don't mind having multiple processes); that is, having my pure-Python component run in a separate process (on the same machine) and having my C project communicate with it by writing/reading from a socket (or unix pipe). my python component can read/write to socket to communicate. Is that a reasonable approach? Is there something better? Like some special RPC mechanism?
Thanks for the answer so far - however, i'd like to focus on IPC-based approaches since I want to have my Python program in a separate process as my C program. I don't want to embed a Python interpreter. Thanks!

I recommend the approaches detailed here. It starts by explaining how to execute strings of Python code, then from there details how to set up a Python environment to interact with your C program, call Python functions from your C code, manipulate Python objects from your C code, etc.
EDIT: If you really want to go the route of IPC, then you'll want to use the struct module or better yet, protlib. Most communication between a Python and C process revolves around passing structs back and forth, either over a socket or through shared memory.
I recommend creating a Command struct with fields and codes to represent commands and their arguments. I can't give much more specific advice without knowing more about what you want to accomplish, but in general I recommend the protlib library, since it's what I use to communicate between C and Python programs (disclaimer: I am the author of protlib).
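As an illustration of the struct-passing idea using only the standard library's struct module (not protlib), here is a sketch with a made-up wire layout: a 16-bit command code, a 16-bit argument count and four 32-bit integer slots, all in network byte order. The matching C struct would hold a uint16_t code, a uint16_t argc and an int32_t args[4], converted with ntohs/ntohl on the C side:

```python
import struct

# Hypothetical layout: uint16 code, uint16 argc, int32 args[4], big-endian,
# no padding ("!" disables native alignment).
COMMAND_FORMAT = "!HH4i"
COMMAND_SIZE = struct.calcsize(COMMAND_FORMAT)
MAX_ARGS = 4


def pack_command(code, args):
    """Serialize a command and up to four integer arguments into fixed-size bytes."""
    if len(args) > MAX_ARGS:
        raise ValueError("at most %d arguments are supported" % MAX_ARGS)
    padded = list(args) + [0] * (MAX_ARGS - len(args))
    return struct.pack(COMMAND_FORMAT, code, len(args), *padded)


def unpack_command(data):
    """Inverse of pack_command: returns (code, list_of_args)."""
    code, argc, *slots = struct.unpack(COMMAND_FORMAT, data)
    return code, slots[:argc]


message = pack_command(7, [1, -2])
print(COMMAND_SIZE, unpack_command(message))
```

A fixed-size record like this is easy to read from a socket on both sides, because the receiver always knows how many bytes to wait for.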

Have you considered just wrapping your python application in a shell script and invoking it from within your C application?
Not the most elegant solution, but it is very simple.

See the relevant chapter in the manual: http://docs.python.org/extending/
Essentially you'll have to embed the python interpreter into your program.

I haven't used an IPC approach for Python<->C communication but it should work pretty well. I would have the C program do a standard fork-exec and use redirected stdin and stdout in the child process for the communication. A nice text-based communication will make it very easy to develop and test the Python program.
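The Python side of such a text protocol might look like the sketch below. The commands (ADD, QUIT) and the OK/ERR reply format are invented for illustration; the C program would write one request per line to the child's stdin and read one reply line from its stdout:

```python
import io
import sys


def serve(reader=sys.stdin, writer=sys.stdout):
    """Answer line-based requests until QUIT or end of input."""
    for line in reader:
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "QUIT":
            break
        if parts[0] == "ADD":
            try:
                result = sum(float(p) for p in parts[1:])
                writer.write("OK {}\n".format(result))
            except ValueError:
                writer.write("ERR bad number\n")
        else:
            writer.write("ERR unknown command\n")
        # Flush after every reply, otherwise the C side may block on a pipe.
        writer.flush()


# Exercise the handler with in-memory streams instead of real pipes.
replies = io.StringIO()
serve(io.StringIO("ADD 1 2.5\nFOO\nQUIT\nADD 1 1\n"), replies)
print(replies.getvalue(), end="")
```

Because the protocol is plain text, the same handler can be tested by hand from a terminal by calling serve() with the default streams.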

If I had decided to go with IPC, I'd probably splurge with XML-RPC -- cross-platform, lets you easily put the Python server project on a different node later if you want, has many excellent implementations (see here for many, including C and Python ones, and here for the simple XML-RPC server that's part the Python standard library -- not as highly scalable as other approaches but probably fine and convenient for your use case).
It may not be a perfect IPC approach for all cases (or even a perfect RPC one, by all means!), but the convenience, flexibility, robustness, and broad range of implementations outweigh a lot of minor defects, in my opinion.
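For reference, a minimal sketch of the Python server side using the standard library's xmlrpc.server. The function name add_numbers and the helper names are chosen for illustration, and the client shown here is Python's own xmlrpc.client standing in for the C program, which would use a C XML-RPC library instead:

```python
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer


def add_numbers(x, y):
    return x + y


def make_server(host="127.0.0.1", port=0):
    """Create a server on a free local port (port 0 lets the OS choose)."""
    server = SimpleXMLRPCServer((host, port), logRequests=False)
    server.timeout = 5  # handle_request() gives up after 5 seconds
    server.register_function(add_numbers, "add_numbers")
    return server


def call_once(x, y):
    """Serve exactly one request from a background thread and return its result."""
    server = make_server()
    worker = threading.Thread(target=server.handle_request)
    worker.start()
    try:
        port = server.server_address[1]
        proxy = xmlrpc.client.ServerProxy("http://127.0.0.1:{}".format(port))
        return proxy.add_numbers(x, y)
    finally:
        worker.join()
        server.server_close()
```

Calling call_once(1.5, 2.0) performs a full round trip over the loopback interface; a long-running service would call server.serve_forever() instead of handle_request().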

This seems quite nice http://thrift.apache.org/, there is even a book about it.
Details:
The Apache Thrift software framework, for scalable cross-language
services development, combines a software stack with a code generation
engine to build services that work efficiently and seamlessly between
C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa,
JavaScript, Node.js, Smalltalk, OCaml and Delphi and other languages.

I've used the "standard" approach of Embedding Python in Another Application. But it's complicated/tedious. Each new function in Python is painful to implement.
I saw an example of Calling PyPy from C. It uses CFFI to simplify the interface but it requires PyPy, not Python. Read and understand this example first, at least at a high level.
I modified the C/PyPy example to work with Python. Here's how to call Python from C using CFFI.
My example is more complicated because I implemented three functions in Python instead of one. I wanted to cover additional aspects of passing data back and forth.
The complicated part is now isolated to passing the address of api to Python. That only has to be implemented once. After that it's easy to add new functions in Python.
interface.h
// These are the three functions that I implemented in Python.
// Any additional function would be added here.
struct API {
    double (*add_numbers)(double x, double y);
    char* (*dump_buffer)(char *buffer, int buffer_size);
    int (*release_object)(char *obj);
};
test_cffi.c
//
// Calling Python from C.
// Based on Calling PyPy from C:
// http://doc.pypy.org/en/latest/embedding.html#more-complete-example
//
#include <stdio.h>
#include <assert.h>
#include "Python.h"
#include "interface.h"
struct API api; /* global var */
int main(int argc, char *argv[])
{
    int rc;
    // Start Python interpreter and initialize "api" in interface.py using
    // old style "Embedding Python in Another Application":
    // https://docs.python.org/2/extending/embedding.html#embedding-python-in-another-application
    PyObject *pName, *pModule, *py_results;
    PyObject *fill_api;
#define PYVERIFY(exp) if ((exp) == 0) { fprintf(stderr, "%s[%d]: ", __FILE__, __LINE__); PyErr_Print(); exit(1); }
    Py_SetProgramName(argv[0]); /* optional but recommended */
    Py_Initialize();
    PyRun_SimpleString(
        "import sys;"
        "sys.path.insert(0, '.')" );
    PYVERIFY( pName = PyString_FromString("interface") )
    PYVERIFY( pModule = PyImport_Import(pName) )
    Py_DECREF(pName);
    PYVERIFY( fill_api = PyObject_GetAttrString(pModule, "fill_api") )
    // "k" = [unsigned long],
    // see https://docs.python.org/2/c-api/arg.html#c.Py_BuildValue
    PYVERIFY( py_results = PyObject_CallFunction(fill_api, "k", &api) )
    assert(py_results == Py_None);
    // Call Python function from C using cffi.
    printf("sum: %f\n", api.add_numbers(12.3, 45.6));
    // More complex example.
    char buffer[20];
    char * result = api.dump_buffer(buffer, sizeof buffer);
    assert(result != 0);
    printf("buffer: %s\n", result);
    // Let Python perform garbage collection on result now.
    rc = api.release_object(result);
    assert(rc == 0);
    // Close Python interpreter.
    Py_Finalize();
    return 0;
}
interface.py
import cffi
import sys
import traceback
ffi = cffi.FFI()
ffi.cdef(file('interface.h').read())
# Hold references to objects to prevent garbage collection.
noGCDict = {}
# Add two numbers.
# This function was copied from the PyPy example.
@ffi.callback("double (double, double)")
def add_numbers(x, y):
    return x + y
# Convert input buffer to repr(buffer).
@ffi.callback("char *(char*, int)")
def dump_buffer(buffer, buffer_len):
    try:
        # First attempt to access data in buffer.
        # Using the ffi/lib objects:
        # http://cffi.readthedocs.org/en/latest/using.html#using-the-ffi-lib-objects
        # One char at time, Looks inefficient.
        #data = ''.join([buffer[i] for i in xrange(buffer_len)])
        # Second attempt.
        # FFI Interface:
        # http://cffi.readthedocs.org/en/latest/using.html#ffi-interface
        # Works but doc says "str() gives inconsistent results".
        #data = str( ffi.buffer(buffer, buffer_len) )
        # Convert C buffer to Python str.
        # Doc says [:] is recommended instead of str().
        data = ffi.buffer(buffer, buffer_len)[:]
        # The goal is to return repr(data)
        # but it has to be converted to a C buffer.
        result = ffi.new('char []', repr(data))
        # Save reference to data so it's not freed until released by C program.
        noGCDict[ffi.addressof(result)] = result
        return result
    except:
        print >>sys.stderr, traceback.format_exc()
        return ffi.NULL
# Release object so that Python can reclaim the memory.
@ffi.callback("int (char*)")
def release_object(ptr):
    try:
        del noGCDict[ptr]
        return 0
    except:
        print >>sys.stderr, traceback.format_exc()
        return 1
def fill_api(ptr):
    global api
    api = ffi.cast("struct API*", ptr)
    api.add_numbers = add_numbers
    api.dump_buffer = dump_buffer
    api.release_object = release_object
Compile:
gcc -o test_cffi test_cffi.c -I/home/jmudd/pgsql-native/Python-2.7.10.install/include/python2.7 -L/home/jmudd/pgsql-native/Python-2.7.10.install/lib -lpython2.7
Execute:
$ test_cffi
sum: 57.900000
buffer: 'T\x9e\x04\x08\xa8\x93\xff\xbf]\x86\x04\x08\x00\x00\x00\x00\x00\x00\x00\x00'
$

A few tips for making it work with Python 3:
file() not supported, use open()
ffi.cdef(open('interface.h').read())
PyObject* PyStr_FromString(const char *u)
Create a PyStr from a UTF-8 encoded null-terminated character buffer.
Python 2: PyString_FromString
Python 3: PyUnicode_FromString
Change to: PYVERIFY( pName = PyUnicode_FromString("interface") )
Program name
wchar_t *name = Py_DecodeLocale(argv[0], NULL);
Py_SetProgramName(name);
for compiling
gcc cc.c -o cc -I/usr/include/python3.6m -I/usr/include/x86_64-linux-gnu/python3.6m -lpython3.6m
I butchered the dump_buffer function; maybe it will give you some ideas:
def get_prediction(buffer, buffer_len):
    try:
        data = ffi.buffer(buffer, buffer_len)[:]
        result = ffi.new('char []', data)
        print('\n I am doing something here........', data)
        resultA = ffi.new('char []', b"Failed") ### New message
        ##noGCDict[ffi.addressof(resultA)] = resultA
        return resultA
    except:
        print(traceback.format_exc(), file=sys.stderr)
        return ffi.NULL
Hopefully it will help and save you some time.

Apparently Python would need to be able to compile to a Win32 DLL; that would solve the problem,
in the same way that converting C# code to Win32 DLLs makes it usable by any development tool.
