referencing opaque types with python ctypes - python

I'm using a 3rd party C library that defines an opaque type:
foo_t
And uses pointers to this type in its functions:
void foo_init(foo_t *foo);
Typical usage would be allocating a foo_t on the stack and passing a reference:
{
foo_t foo;
foo_init(&foo);
...
}
How do I call foo_init() with ctypes without knowing what constitutes a foo_t?
I think if I knew sizeof(foo_t) I could create a buffer of that size and cast, but is it possible to get the size with ctypes?
I could write a one-liner C program:
printf("sizeof(foo_t) = %zu\n", sizeof(foo_t));
and hard-code that value into my python, but that would get ugly in a hurry: I'd have to touch my python source with every upgrade to the library.
A slightly cleaner way would be to write a python c-ext to export the size value, but that too would require a recompile with every library upgrade.
Does anyone have a recipe for using ctypes with such opaque types?

I think this is the simplest solution...
Create a C file, say, foosizes.c:
#include <stddef.h> /* size_t */
/* also #include the library header that defines foo_t */
size_t SIZEOF_FOO = sizeof(foo_t);
And compile it into a shared object, foosizes.so. Then in a python script:
from ctypes import *
foosizeslib = CDLL('foosizes.so')
sizeof_foo = c_size_t.in_dll(foosizeslib, 'SIZEOF_FOO').value
I can then create a buffer of the appropriate size and pass it to functions, by reference, as a pointer to the opaque type. So far, so good.
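A minimal sketch of that last step, assuming the size has already been read into a Python int. The value 64 below is only a placeholder, not the real sizeof(foo_t), and alloc_opaque is a hypothetical helper name. One caveat: a ctypes char buffer carries no formal alignment guarantee, so this assumes foo_t has no unusual alignment requirement.

```python
import ctypes

def alloc_opaque(size):
    """Return a zero-filled, writable buffer of `size` bytes."""
    return ctypes.create_string_buffer(size)

# Placeholder size; in practice this would come from SIZEOF_FOO.
foo_buf = alloc_opaque(64)

# A void pointer to the buffer, usable wherever the C API expects foo_t *.
foo_ptr = ctypes.cast(foo_buf, ctypes.c_void_p)

# With the real library loaded as `lib`, the call would look like this
# (hypothetical, since the library is not available here):
#   lib.foo_init.argtypes = [ctypes.c_void_p]
#   lib.foo_init.restype = None
#   lib.foo_init(foo_ptr)
```

Keep foo_buf referenced for as long as the C library may touch the object; once Python frees the buffer, the pointer dangles.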

It is not possible to get the size with ctypes. C does not support runtime reflection: unlike Java or C#/.NET, the compiled binary stores no metadata about types.
As you said, one way to get the size is to create a simple C program that includes the header defining the type and uses the sizeof operator to print the size. Taking that a step further, you could use a C compiler written in Python to compile and execute your C code, obtaining the size when your Python code runs. You might even be able to get it without executing the result, by walking the data structures provided by the compiler.
That said, are you certain you need to create the memory yourself? Frequently C libraries provide a method to create an opaque type that their other functions operate on. Update: from the comments it is certain that the memory must be allocated by the caller.

Related

Call Windows Function in Python

https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-ntqueryinformationfile?redirectedfrom=MSDN
How can I call the above kernel method in python? I found an example on another stackoverflow post: Winapi: Get the process which has specific handle of a file
The answer on this other post is essentially what I want to do, but in python. The goal is to be able to get a list of processes which currently are accessing/locking a file. This NtQueryInformationFile method seems to be exactly what I want. I know this can be done with ctypes, but I am not familiar or comfortable enough with ctypes to do this myself. How can I do this?
If there's no available wrapper for the function, you'll need to call the function yourself using ctypes.
The DLLs Windows uses are exposed through ctypes.windll, with ctypes.windll.ntdll being the one that exposes the function you need.
To help Python convert arguments, it's usually a good idea to specify the function's argument and return types, which can be done through the argtypes and restype attributes on the function object, like so:
function = ctypes.windll.ntdll.NtQueryInformationFile
function.argtypes = [ctypes.wintypes.HANDLE, ...]
function.restype = ctypes.c_long
ctypes exposes the common Windows types in the ctypes.wintypes module, but for most structures, such as the one behind PIO_STATUS_BLOCK in your function, you'll need to define the struct yourself and add a pointer to it to the argument list to use it properly. If the parameter is optional, declaring it as a void pointer and passing None will suffice.
Also keep in mind that Windows handles are not the file descriptors that Python exposes; to convert between the two, use the ..._osfhandle functions from the msvcrt module.
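As an illustration of "define the struct yourself", here is a sketch of the structure that PIO_STATUS_BLOCK points to. The layout (a union of a 32-bit status and a pointer, followed by a pointer-sized integer) follows my reading of the Windows headers, so verify it against the SDK before relying on it. Plain ctypes types are used instead of ctypes.wintypes so the definition can be checked on any platform, and the name _StatusOrPointer is made up for this example.

```python
import ctypes

class _StatusOrPointer(ctypes.Union):
    # Hypothetical name for the anonymous union inside the struct.
    _fields_ = [
        ("Status", ctypes.c_int32),    # NTSTATUS
        ("Pointer", ctypes.c_void_p),  # PVOID
    ]

class IO_STATUS_BLOCK(ctypes.Structure):
    _anonymous_ = ("u",)
    _fields_ = [
        ("u", _StatusOrPointer),
        ("Information", ctypes.c_size_t),  # ULONG_PTR
    ]

# The pointer type that would go into the function's argtypes list.
PIO_STATUS_BLOCK = ctypes.POINTER(IO_STATUS_BLOCK)

# Allocate one block and take a reference suitable for passing to the call.
iosb = IO_STATUS_BLOCK()
iosb_ref = ctypes.byref(iosb)
```

After the call returns, iosb.Status and iosb.Information can be read directly from Python.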

Python ctypes misbehaves when a C function returns a dynamic array

I'm working on Python wrapper classes for Matlab's dynamic libraries to read Matlab MAT files in Python, and I'm encountering an odd behavior that I cannot explain from the ctypes interface.
The C function signature looks like this:
const mwSize *mxGetDimensions(const mxArray *);
Here, mwSize is a renamed size_t, and mxArray* is an opaque pointer. This function returns the "shape" of the Matlab array. The returned pointer points to the size_t array, which is stored internally within mxArray object and is not null terminated (its size is obtained via another function).
To call this function from Python, I set up the library as follows:
libmx = ctypes.cdll.LoadLibrary('libmx.dll')
libmx.mxGetDimensions.restype = ctypes.POINTER(ctypes.c_size_t)
libmx.mxGetDimensions.argtypes = [ctypes.c_void_p]
and after obtaining an mxArray* in VAR, I called:
dims = libmx.mxGetDimensions(VAR)
print(dims[0],dims[1])
VAR is known to be 2-D and has a shape of (1, 13) (validated with a C program) but my Python code returns (55834574849 0) in c_ulonglong... Results are consistently garbage across all the variables stored in the test MAT file.
What am I doing wrong? Other library calls using VAR seem to be working properly, so VAR is pointing to a valid object. As stated above, mxGetDimensions() called in a C program works as expected.
Any inputs would be appreciated! Thanks
@Neitsa solved my immediate issue in his comment under the OP, and further investigation of libmx.dll resolved the remaining discrepancy between the Python and C versions.
Because Matlab's libmx.dll goes back a really long time (it was first written in the 32-bit era), the DLL contains multiple versions of its functions for backward compatibility. As it turned out, const mwSize *mxGetDimensions(const mxArray *); is the oldest version of the function, and the associated C header file (matrix.h) has the line #define mxGetDimensions mxGetDimensions_800 to override the function with its newest version. Python's ctypes doesn't read the C header file, so it was left to me to sift through the header and figure out which version of the function to use.
In the end, the proper behavior with POINTER(c_size_t) was obtained when I changed my code to:
libmx = ctypes.cdll.LoadLibrary('libmx.dll')
libmx.mxGetDimensions_800.restype = ctypes.POINTER(ctypes.c_size_t)
libmx.mxGetDimensions_800.argtypes = [ctypes.c_void_p]
dims = libmx.mxGetDimensions_800(VAR)
So, there you have it: Study the associated header file thoroughly if you are wrapping a 3rd-party dynamic/shared library.
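Once the right symbol is bound, the returned pointer still has no length attached, so it has to be read with an explicit count. A small sketch of that step, using a locally built c_size_t array as a stand-in for the library's internal storage (dims_to_tuple is a hypothetical helper, and the count would really come from whichever library function reports the number of dimensions):

```python
import ctypes

def dims_to_tuple(ptr, ndim):
    """Copy `ndim` entries out of a POINTER(c_size_t) into a tuple."""
    return tuple(ptr[i] for i in range(ndim))

# Stand-in for the array stored inside the mxArray object.
backing = (ctypes.c_size_t * 2)(1, 13)
dims = ctypes.cast(backing, ctypes.POINTER(ctypes.c_size_t))

shape = dims_to_tuple(dims, 2)
```

Copying into a tuple also means the Python side no longer depends on memory owned by the library.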

How to avoid a memory leak when Python objects are passed to C using callbacks?

To avoid a memory leak in the situation described below, I would like to call Py_DecRef directly from Python. Is there a way to do this? Is there a better solution for this problem?
I am using ctypes to interface my Python code to a C library for which I do not have the code. The C library was not written for Python, so it doesn't know anything about Python objects.
The C library uses two callbacks: the first creates an object and returns a void* pointer to it, and the second gets the pointer as parameter and is supposed to destroy it. In the C header files, the types of these callback functions are defined as follows:
typedef void* (*CreateCallback)();
typedef void (*DestroyCallback)(void*);
These callbacks could be defined in Python as shown below (simplified code). The current code has a memory leak as explained in the comments.
import ctypes

CreateCallback = ctypes.CFUNCTYPE(ctypes.py_object)
DestroyCallback = ctypes.CFUNCTYPE(None, ctypes.py_object)

class Object:
    pass  # In the real application, this contains more code

@CreateCallback
def create():
    return Object()
    # Ctypes correctly increments the reference count of the
    # object, to make sure it does not get garbage collected
    # while the C code holds a reference to it.

@DestroyCallback
def destroy(object):
    pass
    # Above, the reference count of the object should be
    # decremented, because the C code no longer holds a
    # reference to it. However, Ctypes does not know this so
    # cannot do it automatically. How can I do this from
    # Python? Is it possible to call Py_DecRef or similar
    # directly from Python?
One option would be to create a C function that calls Py_DecRef, compile that C function into a DLL (or .so on Linux), and call that from the destroy function above. That solution has at least two disadvantages:
It seems overly complex to create a dll just for one function
The C code would have to be compiled against a specific version of Python, instead of using whatever version of Python is running my Python code. Note that I need this to work on Windows, where a dll cannot contain undefined globals.
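For what it's worth, ctypes exposes the running interpreter's own C API as ctypes.pythonapi, so a separate DLL may not be needed. A minimal sketch, assuming CPython; whether the extra reference described in the comments above really exists in a given setup should be confirmed first (for example with sys.getrefcount), since an unbalanced Py_DecRef can crash the interpreter.

```python
import ctypes
import sys

# Bind the C API functions from the interpreter that is running this code.
py_incref = ctypes.pythonapi.Py_IncRef
py_incref.argtypes = [ctypes.py_object]
py_incref.restype = None

py_decref = ctypes.pythonapi.Py_DecRef
py_decref.argtypes = [ctypes.py_object]
py_decref.restype = None

class Object:
    pass

obj = Object()
before = sys.getrefcount(obj)

py_incref(obj)                 # simulate the reference held by C code
during = sys.getrefcount(obj)

py_decref(obj)                 # what the destroy callback would do
after = sys.getrefcount(obj)
```

Because pythonapi always refers to the interpreter that is executing the script, this avoids compiling anything against a specific Python version.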

how to include shared object in python [duplicate]

I'm just getting started with ctypes and would like to use a C++ class that I have exported in a dll file from within python using ctypes.
So lets say my C++ code looks something like this:
class MyClass {
public:
int test();
...
I would now create a .dll file that contains this class and then load the .dll file in Python using ctypes.
Now how would I create an Object of type MyClass and call its test function? Is that even possible with ctypes? Alternatively I would consider using SWIG or Boost.Python but ctypes seems like the easiest option for small projects.
Besides Boost.Python (which is probably a friendlier solution for larger projects that require a one-to-one mapping of C++ classes to Python classes), you could provide a C interface on the C++ side. It's one solution of many, so it has its own trade-offs, but I will present it for the benefit of those who aren't familiar with the technique. For full disclosure, with this approach one wouldn't be interfacing C++ to Python, but C++ to C to Python. Below I included an example that meets your requirements to show you the general idea of the extern "C" facility of C++ compilers.
//YourFile.cpp (compiled into a .dll or .so file)
#include <new> //For std::nothrow
//Either include a header defining your class, or define it here.
extern "C" //Tells the compiler to use C linkage for the next scope.
{
    //Note: The interface in this linkage region needs to use C only.
    void * CreateInstanceOfClass( void )
    {
        // Note: Inside the function body, I can use C++.
        return new(std::nothrow) MyClass;
    }
    //Thanks Chris.
    void DeleteInstanceOfClass (void *ptr)
    {
        // Cast back to the real type first: deleting through a void * is not valid C++.
        delete static_cast<MyClass *>(ptr);
    }
    int CallMemberTest(void *ptr)
    {
        // Note: A downside here is the lack of type safety.
        // You could always internally (in the C++ library) save a reference to all
        // pointers created of type MyClass and verify it is an element in that
        // structure.
        //
        // Per comments with Andre, we should avoid throwing exceptions.
        try
        {
            MyClass * ref = reinterpret_cast<MyClass *>(ptr);
            return ref->test();
        }
        catch(...)
        {
            return -1; //assuming -1 is an error condition.
        }
    }
} //End C linkage scope.
You can compile this code with
g++ -shared -fPIC -o test.so test.cpp
#creates test.so in your current working directory.
In your python code you could do something like this (interactive prompt from 2.7 shown):
>>> from ctypes import cdll
>>> stdc=cdll.LoadLibrary("libc.so.6") # or similar to load c library
>>> stdcpp=cdll.LoadLibrary("libstdc++.so.6") # or similar to load c++ library
>>> myLib=cdll.LoadLibrary("/path/to/test.so")
>>> spam = myLib.CreateInstanceOfClass()
>>> spam
[outputs the pointer address of the element]
>>> value=myLib.CallMemberTest(spam)
[does whatever Test does to the spam reference of the object]
I'm sure Boost.Python does something similar under the hood, but perhaps understanding the lower levels concepts is helpful. I would be more excited about this method if you were attempting to access functionality of a C++ library and a one-to-one mapping was not required.
For more information on C/C++ interaction check out this page from Sun: http://dsc.sun.com/solaris/articles/mixing.html#cpp_from_c
The short story is that there is no standard binary interface for C++ in the way that there is for C. Different compilers output different binaries for the same C++ dynamic libraries, due to name mangling and different ways to handle the stack between library function calls.
So, unfortunately, there really isn't a portable way to access C++ libraries in general. But, for one compiler at a time, it's no problem.
This blog post also has a short overview of why this currently won't work. Maybe after C++0x comes out, we'll have a standard ABI for C++? Until then, you're probably not going to have any way to access C++ classes through Python's ctypes.
The answer by AudaAero is very good but not complete (at least for me).
On my system (Debian Stretch x64 with GCC and G++ 6.3.0, Python 3.5.3) I get segfaults as soon as I call a member function that accesses a member value of the class.
By printing pointer values to stdout, I diagnosed that the void* pointer, which is 64 bits wide in the wrappers, is being represented on 32 bits in Python. Big problems therefore occur when it is passed back to a member function wrapper.
The solution I found is to change:
spam = myLib.CreateInstanceOfClass()
Into
Class_ctor_wrapper = myLib.CreateInstanceOfClass
Class_ctor_wrapper.restype = c_void_p
spam = c_void_p(Class_ctor_wrapper())
So two things were missing: setting the return type to c_void_p (the default is int) and then creating a c_void_p object (not just an integer).
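The truncation can be reproduced without any C++ library. The sketch below uses a made-up address (not one obtained from a real allocation) and compares what a C int keeps of it with what c_void_p keeps on a 64-bit build:

```python
import ctypes

# A made-up 64-bit address, similar in shape to a heap pointer on x86-64 Linux.
address = 0x7FFF12345678

# A C int silently keeps only the low 32 bits (ctypes does no overflow check).
as_int = ctypes.c_int(address).value

# c_void_p is pointer-sized, so the full value survives on a 64-bit build.
as_ptr = ctypes.c_void_p(address).value
```

Passing the truncated value back into a wrapper makes the C++ side dereference an address that was never allocated, which is consistent with the segfaults described above.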
I wish I could have written a comment but I still lack 27 rep points.
Extending AudaAero's and Gabriel Devillers' answers, I would complete the class object instance creation (with restype set to c_void_p as above) by:
spam = c_void_p(myLib.CreateInstanceOfClass())
Using the ctypes c_void_p data type ensures the proper representation of the class object pointer within Python.
Also make sure that the dll's memory management be handled by the dll (allocated memory in the dll should be deallocated also in the dll, and not in python)!
I ran into the same problem. From trial and error and some internet research (not necessarily from knowing the g++ compiler or C++ very well), I came across this particular solution that seems to be working quite well for me.
//model.hpp
#include <cstdint> // for uint32_t
class Model{
public:
    static Model* CreateModel(char* model_name) asm("CreateModel"); // static method, creates an instance of the class
    double GetValue(uint32_t index) asm("GetValue"); // object method
};
#model.py
from ctypes import ...
if __name__ == '__main__':
    # load dll as model_dll
    # Static Method Signature
    fCreateModel = getattr(model_dll, 'CreateModel') # or model_dll.CreateModel
    fCreateModel.argtypes = [c_char_p]
    fCreateModel.restype = c_void_p
    # Object Method Signature
    fGetValue = getattr(model_dll, 'GetValue') # or model_dll.GetValue
    fGetValue.argtypes = [c_void_p, c_uint32] # Notice two Params
    fGetValue.restype = c_double
    # Calling the Methods
    obj_ptr = fCreateModel(c_char_p(b"new_model"))
    val = fGetValue(obj_ptr, c_uint32(0)) # pass in obj_ptr as first param of obj method
$ nm -Dg libmodel.so
                 U cbrt@GLIBC_2.2.5
                 U close@GLIBC_2.2.5
00000000000033a0 T CreateModel # <----- Static Method
                 U __cxa_atexit@GLIBC_2.2.5
                 w __cxa_finalize@GLIBC_2.2.5
                 U fprintf@GLIBC_2.2.5
0000000000002b40 T GetValue # <----- Object Method
                 w __gmon_start__
...
...
... # Mangled Symbol Names Below
0000000000002430 T _ZN12SHMEMWrapper4HashEPKc
0000000000006120 B _ZN12SHMEMWrapper8info_mapE
00000000000033f0 T _ZN5Model12DestroyModelEPKc
0000000000002b20 T _ZN5Model14GetLinearIndexElll
First, I was able to avoid the extern "C" directive completely by instead using the asm keyword which, to my knowledge, asks the compiler to use a given name instead of the generated one when exporting the function to the shared object lib's symbol table. This allowed me to avoid the weird symbol names that the C++ compiler generates automatically. They look something like the _ZN1... pattern you see above. Then in a program using Python ctypes, I was able to access the class functions directly using the custom name I gave them. The program looks like fhandle = mydll.myfunc or fhandle = getattr(mydll, 'myfunc') instead of fhandle = getattr(mydll, '_ZN12...myfunc...'). Of course, you could just use the long name; it would make no difference, but I figure the shorter name is a little cleaner and doesn't require using nm to read the symbol table and extract the names in the first place.
Second, in the spirit of Python's style of object-oriented programming, I decided to try passing in my class' object pointer as the first argument of the class object method, just like when we pass self in as the first argument in Python object methods. To my surprise, it worked! See the Python section above. Apparently, if you set the first entry of fhandle.argtypes to c_void_p and pass in the pointer you get from your class' static factory method, the program should execute cleanly. Class static methods seem to work as one would expect, like in Python; just use the original function signature.
I'm using g++ 12.1.1, python 3.10.5 on Arch Linux. I hope this helps someone.
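The getattr / argtypes / restype pattern repeated for each function in model.py can be folded into a small helper. This is only a sketch: bind is a hypothetical name, and since libmodel.so is not available here it is demonstrated on the interpreter's own C API (ctypes.pythonapi) rather than on the Model functions.

```python
import ctypes

def bind(lib, name, restype, *argtypes):
    """Look up `name` in `lib` and declare its signature in one step."""
    func = getattr(lib, name)
    func.argtypes = list(argtypes)
    func.restype = restype
    return func

# Demonstrated on a C function that is always present in CPython.
is_initialized = bind(ctypes.pythonapi, "Py_IsInitialized", ctypes.c_int)
status = is_initialized()

# With the library from this answer loaded as model_dll, the same helper
# would be used like this (not run here):
#   fGetValue = bind(model_dll, "GetValue", ctypes.c_double,
#                    ctypes.c_void_p, ctypes.c_uint32)
```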

How to pass an array from C to an embedded python script

I am running into some problems and would like some help. I have a piece of code which is used to embed a Python script. This Python script contains a function which expects to receive an array as an argument (in this case I am using a numpy array within the Python script).
I would like to know how can I pass an array from C to the embedded python script as an argument for the function within the script. More specifically can someone show me a simple example of this.
Really, the best answer here is probably to use numpy arrays exclusively, even from your C code. But if that's not possible, then you have the same problem as any code that shares data between C types and Python types.
In general, there are at least five options for sharing data between C and Python:
1. Create a Python list or other object to pass.
2. Define a new Python type (in your C code) to wrap and represent the array, with the same methods you'd define for a sequence object in Python (__getitem__, etc.).
3. Cast the pointer to the array to intptr_t, or to explicit ctypes type, or just leave it un-cast; then use ctypes on the Python side to access it.
4. Cast the pointer to the array to const char * and pass it as a str (or, in Py3, bytes), and use struct or ctypes on the Python side to access it.
5. Create an object matching the buffer protocol, and again use struct or ctypes on the Python side.
In your case, you want to use numpy.arrays in Python. So, the general cases become:
1. Create a numpy.array to pass.
2. (probably not appropriate)
3. Pass the pointer to the array as-is, and from Python, use ctypes to get it into a type that numpy can convert into an array.
4. Cast the pointer to the array to const char * and pass it as a str (or, in Py3, bytes), which is already a type that numpy can convert into an array.
5. Create an object matching the buffer protocol, and which again I believe numpy can convert directly.
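To make the pointer-based and bytes-based options concrete, here is what the Python side could look like using only the standard library. It is a sketch under assumptions: the element type is a C int, the helper names are made up, and a ctypes array stands in for the memory the C program would own.

```python
import ctypes
import struct

def ints_from_address(address, count):
    """Pointer option: view `count` C ints at a raw address, then copy them."""
    view = (ctypes.c_int * count).from_address(address)
    return list(view)

def ints_from_bytes(raw):
    """Bytes option: decode a bytes object holding native-endian C ints."""
    count = len(raw) // struct.calcsize("i")
    return list(struct.unpack(f"{count}i", raw))

# Stand-in for the C array; in the embedding case C would pass the address
# (as an integer) or the raw bytes into the script.
source = (ctypes.c_int * 4)(10, 20, 30, 40)

via_address = ints_from_address(ctypes.addressof(source), 4)
via_bytes = ints_from_bytes(bytes(source))
```

The address-based version reads memory it does not own, so it is only safe while the C side keeps the array alive; the bytes-based version works on a copy.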
For 1, here's how to do it with a list, just because it's a very simple example (and I already wrote it…):
PyObject *makelist(int array[], size_t size) {
    PyObject *l = PyList_New(size);
    for (size_t i = 0; i != size; ++i) {
        PyList_SET_ITEM(l, i, PyInt_FromLong(array[i]));
    }
    return l;
}
And here's the numpy.array equivalent (assuming you can rely on the C array not to be deleted—see Creating arrays in the docs for more details on your options here):
PyObject *makearray(int array[], size_t size) {
    npy_intp dim = size;
    return PyArray_SimpleNewFromData(1, &dim, NPY_INT, (void *)array);
}
At any rate, however you do this, you will end up with something that looks like a PyObject * from C (and has a single refcount), so you can pass it as a function argument, while on the Python side it will look like a numpy.array, list, bytes, or whatever else is appropriate.
Now, how do you actually pass function arguments? Well, the sample code in Pure Embedding that you referenced in your comment shows how to do this, but doesn't really explain what's going on. There's actually more explanation in the extending docs than the embedding docs, specifically, Calling Python Functions from C. Also, keep in mind that the standard library source code is chock full of examples of this (although some of them aren't as readable as they could be, either because of optimization, or just because they haven't been updated to take advantage of new simplified C API features).
Skip the first example about getting a Python function from Python, because presumably you already have that. The second example (and the paragraph right above it) shows the easy way to do it: Creating an argument tuple with Py_BuildValue. So, let's say we want to call a function you've got stored in myfunc with the list mylist returned by that makelist function above. Here's what you do:
if (!PyCallable_Check(myfunc)) {
PyErr_SetString(PyExc_TypeError, "function is not callable?!");
return NULL;
}
PyObject *arglist = Py_BuildValue("(O)", mylist);
PyObject *result = PyObject_CallObject(myfunc, arglist);
Py_DECREF(arglist);
return result;
You can skip the callable check if you're sure you've got a valid callable object, of course. (And it's usually better to check when you first get myfunc, if appropriate, because you can give both earlier and better error feedback that way.)
If you want to actually understand what's going on, try it without Py_BuildValue. As the docs say, the second argument to PyObject_CallObject is a tuple, and PyObject_CallObject(callable_object, args) is equivalent to apply(callable_object, args), which is equivalent to callable_object(*args). So, if you wanted to call myfunc(mylist) in Python, you have to turn that into, effectively, myfunc(*(mylist,)) so you can translate it to C. You can construct a tuple like this:
PyObject *arglist = PyTuple_Pack(1, mylist);
But usually, Py_BuildValue is easier (especially if you haven't already packed everything up as Python objects), and the intention in your code is clearer (just as using PyArg_ParseTuple is simpler and clearer than using explicit tuple functions in the other direction).
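The equivalence is easy to check from the Python side. In this small sketch the body of myfunc is made up purely for illustration, and the one-element tuple plays the role of what PyTuple_Pack(1, mylist) builds in C:

```python
def myfunc(seq):
    # Placeholder body; the real function is whatever the script defines.
    return sum(seq)

mylist = [1, 2, 3]

# One-element argument tuple, the Python counterpart of PyTuple_Pack(1, mylist).
arglist = (mylist,)

direct = myfunc(mylist)       # myfunc(mylist)
unpacked = myfunc(*arglist)   # myfunc(*(mylist,))
```

Passing the bare list instead of a tuple to PyObject_CallObject would not work, because the second argument has to be a tuple (or NULL).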
So, how do you get that myfunc? Well, if you've created the function from the embedding code, just keep the pointer around. If you want it passed in from the Python code, that's exactly what the first example does. If you want to, e.g., look it up by name from a module or other context, the APIs for concrete types like PyModule and abstract types like PyMapping are pretty simple, and it's generally obvious how to convert Python code into the equivalent C code, even if the result is mostly ugly boilerplate.
Putting it all together, let's say I've got a C array of integers, and I want to import mymodule and call a function mymodule.myfunc(mylist) that returns an int. Here's a stripped-down example (not actually tested, and no error handling, but it should show all the parts):
int callModuleFunc(int array[], size_t size) {
    PyObject *mymodule = PyImport_ImportModule("mymodule");
    PyObject *myfunc = PyObject_GetAttrString(mymodule, "myfunc");
    PyObject *mylist = PyList_New(size);
    for (size_t i = 0; i != size; ++i) {
        PyList_SET_ITEM(mylist, i, PyInt_FromLong(array[i]));
    }
    PyObject *arglist = Py_BuildValue("(O)", mylist);
    PyObject *result = PyObject_CallObject(myfunc, arglist);
    int retval = (int)PyInt_AsLong(result);
    Py_DECREF(result);
    Py_DECREF(arglist);
    Py_DECREF(mylist);
    Py_DECREF(myfunc);
    Py_DECREF(mymodule);
    return retval;
}
If you're using C++, you probably want to look into some kind of scope-guard/janitor/etc. to handle all those Py_DECREF calls, especially once you start doing proper error handling (which usually means early return NULL calls peppered through the function). If you're using C++11 or Boost, unique_ptr<PyObject, decltype(&Py_DecRef)> (constructed with Py_DecRef as the deleter) may be all you need.
But really, a better way to reduce all that ugly boilerplate, if you plan to do a lot of C<->Python communication, is to look at all of the familiar frameworks designed for improving extending Python—Cython, boost::python, etc. Even though you're embedding, you're effectively doing the same work as extending, so they can help in the same ways.
For that matter, some of them also have tools to help the embedding part, if you search around the docs. For example, you can write your main program in Cython, using both C code and Python code, and cython --embed. You may want to cross your fingers and/or sacrifice some chickens, but if it works, it's amazingly simple and productive. Boost isn't nearly as trivial to get started, but once you've got things together, almost everything is done in exactly the way you'd expect, and just works, and that's just as true for embedding as extending. And so on.
The Python function will need a Python object to be passed in. Since you want that Python object to be a NumPy array, you should use one of the NumPy C-API functions for creating arrays; PyArray_SimpleNewFromData() is probably a good start. It will use the buffer provided, without copying the data.
That said, it is almost always easier to write the main program in Python and use a C extension module for the C code. This approach makes it easier to let Python do the memory management, and the ctypes module together with NumPy's ctypes support (numpy.ctypeslib) makes it easy to pass a NumPy array to a C function.
