Initializing Cython objects with existing C Objects - python

C++ Model
Say I have the following C++ data structures I wish to expose to Python.
#include <memory>
#include <vector>
struct mystruct
{
    int a, b, c, d, e, f, g, h, i, j, k, l, m;
};
typedef std::vector<std::shared_ptr<mystruct>> mystruct_list;
Boost Python
I can wrap these fairly effectively using boost::python with the following code, easily allowing me to use the existing mystruct (copying the shared_ptr) rather than recreating an existing object.
#include "mystruct.h"
#include <boost/python.hpp>
using namespace boost::python;
BOOST_PYTHON_MODULE(example)
{
    class_<mystruct, std::shared_ptr<mystruct>>("MyStruct", init<>())
        .def_readwrite("a", &mystruct::a);
    // add the rest of the member variables
    class_<mystruct_list>("MyStructList", init<>())
        .def("at", &mystruct_list::at, return_value_policy<copy_const_reference>());
    // add the rest of the member functions
}
Cython
In Cython, I have no idea how to extract an item from mystruct_list, without copying the underlying data. I have no idea how I could initialize MyStruct from the existing shared_ptr<mystruct>, without copying all the data over in one of various forms.
from libcpp.memory cimport shared_ptr
from libcpp.vector cimport vector
from cython.operator cimport dereference
cdef extern from "mystruct.h" nogil:
    cdef cppclass mystruct:
        int a, b, c, d, e, f, g, h, i, j, k, l, m
    ctypedef vector[shared_ptr[mystruct]] mystruct_list
cdef class MyStruct:
    cdef shared_ptr[mystruct] ptr
    def __cinit__(MyStruct self):
        self.ptr.reset(new mystruct)
    property a:
        def __get__(MyStruct self):
            return dereference(self.ptr).a
        def __set__(MyStruct self, int value):
            dereference(self.ptr).a = value
cdef class MyStructList:
    cdef mystruct_list c
    cdef mystruct_list.iterator it
    def __cinit__(MyStructList self):
        pass
    def __getitem__(MyStructList self, int index):
        # How do I return MyStruct without copying the underlying `mystruct`?
        pass
I see many possible workarounds, and none of them are very satisfactory:
I could initialize an empty MyStruct, and in Cython assign over the shared_ptr. However, this would waste an initialized struct for absolutely no reason.
cdef MyStruct value = MyStruct()
value.ptr = self.c.at(index)
return value
I could also copy the data from the existing mystruct to the new mystruct. However, this suffers from similar bloat.
cdef MyStruct value = MyStruct()
dereference(value.ptr).a = dereference(self.c.at(index)).a
return value
I could also expose a init=True flag for each __cinit__ method, which would prevent reconstructing the object internally if the C-object exists already (when init is False). However, this could cause catastrophic issues, since it would be exposed to the Python API and would allow dereferencing a null or uninitialized pointer.
def __cinit__(MyStruct self, bint init=True):
    if init:
        self.ptr.reset(new mystruct)
I could also overload __init__ with the Python-exposed constructor (which would reset self.ptr), but this would have risky memory safety if __new__ was used from the Python layer.
Bottom-Line
I would love to use Cython, for compilation speed, syntactical sugar, and numerous other reasons, as opposed to the fairly clunky boost::python. I'm looking at pybind11 right now, and it may solve the compilation speed issues, but I would still prefer to use Cython.
Is there any way I can do such a simple task idiomatically in Cython? Thanks.

The way this works in Cython is by having a factory class to create Python objects out of the shared pointer. This gives you access to the underlying C/C++ structure without copying.
Example Cython code:
<..>
cdef class MyStruct:
    cdef shared_ptr[mystruct] ptr
    def __cinit__(self):
        # Do not create new ref here, we will
        # pass one in from Cython code
        self.ptr = NULL
    def __dealloc__(self):
        # Do de-allocation here, important!
        if self.ptr is not NULL:
            <de-alloc>
    <rest per MyStruct code above>
cdef object PyStruct(shared_ptr[mystruct] MyStruct_ptr):
    """Python object factory function taking a shared pointer
    to a Cpp mystruct as argument
    """
    # Create new MyStruct object. This does not create
    # new structure but does allocate a null pointer
    cdef MyStruct _mystruct = MyStruct()
    # Set pointer of cdef class to existing struct ptr
    _mystruct.ptr = MyStruct_ptr
    # Return the wrapped MyStruct object with MyStruct_ptr
    return _mystruct
def make_structure():
    """Function to create new Cpp mystruct and return
    python object representation of it
    """
    # Wrap the raw pointer explicitly: a raw pointer does not
    # convert implicitly to a shared_ptr
    cdef MyStruct mypystruct = PyStruct(shared_ptr[mystruct](new mystruct()))
    return mypystruct
Note that the argument of PyStruct is a (shared) pointer to the C++ struct.
mypystruct is then a Python object of class MyStruct, as returned by the factory, which refers to the C++ mystruct without copying it. mypystruct can be safely returned from def functions and used in Python space, as in the make_structure code.
To return a Python object of an existing Cpp mystruct pointer just wrap it with PyStruct like
return PyStruct(my_cpp_struct_ptr)
anywhere in your Cython code.
Only def functions are visible from Python, so any C++ function calls you want to use in Python space need to be wrapped inside MyStruct as well. Those wrappers are also the place to release the GIL around the C++ calls, which is probably worth doing.
For a real-world example see this Cython extension code and the underlying C code bindings in Cython. Also see this code for Python function wrapping of C function calls that let go of GIL. Not Cpp but same applies.
See also official Cython documentation on when a factory class/function is needed (Note that all constructor arguments will be passed as Python objects). For built in types, Cython does this conversion for you but for custom structures or objects a factory class/function is needed.
The Cpp structure initialisation could be handled in __new__ of PyStruct if needed, per suggestion above, if you want the factory class to actually create the C++ structure for you (depends on the use case really).
The benefit of a factory with pointer arguments is that it lets you take existing pointers to C/C++ structures and wrap them in a Python extension class, rather than always having to create new ones. It would be perfectly safe, for example, to have multiple Python objects referring to the same underlying C struct. Python's ref counting ensures they won't be de-allocated prematurely. You should still check for null when deallocating, though, as the shared pointer could already have been de-allocated explicitly (e.g. by del).
Note that there is, however, some overhead in creating new python objects even if they do point to the same C++ structure. Not a lot, but still.
IMO this auto de-allocation and ref counting of C/C++ pointers is one of the greatest features of Python's C extension API. As all that acts on Python objects (alone), the C/C++ structures need to be wrapped in a compatible Python object class definition.
Note - My experience is mostly in C, the above may need adjusting as I'm more familiar with regular C pointers than C++'s shared pointers.
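As a rough illustration of the factory idea, here is a plain-Python analogue (not Cython). It is only a sketch under stated assumptions: a ctypes.Structure stands in for the C++ mystruct, and the names wrap, append and _obj are hypothetical. The factory bypasses __init__ via cls.__new__(cls), so no throwaway struct is allocated and the new wrapper refers to the existing object instead of a copy.

```python
import ctypes


class mystruct(ctypes.Structure):
    """Stand-in for the C++ struct (only two of the fields are shown)."""
    _fields_ = [("a", ctypes.c_int), ("b", ctypes.c_int)]


class MyStruct:
    """Python wrapper that holds a reference to a mystruct."""

    def __init__(self):
        # Normal construction allocates a fresh struct.
        self._obj = mystruct()

    @classmethod
    def wrap(cls, existing):
        """Hypothetical factory: wrap an existing struct without copying."""
        self = cls.__new__(cls)  # skips __init__, so nothing is allocated
        self._obj = existing
        return self

    @property
    def a(self):
        return self._obj.a

    @a.setter
    def a(self, value):
        self._obj.a = value


class MyStructList:
    """Container that stores the structs and hands out wrappers."""

    def __init__(self):
        self._items = []

    def append(self, wrapper):
        self._items.append(wrapper._obj)

    def __getitem__(self, index):
        # A new wrapper object each time, but always the same struct.
        return MyStruct.wrap(self._items[index])


items = MyStructList()
first = MyStruct()
first.a = 7
items.append(first)

view = items[0]  # second wrapper around the same struct
view.a = 99      # the change is visible through every wrapper
```

Each call to items[0] still creates a new Python wrapper object, which is the small overhead mentioned above; only the underlying struct is shared.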

Related

Wrong result using a PyCapsule created from a method in Cython

We need to create a PyCapsule from a method of a class in Cython. We managed to write code that compiles and even runs without error, but the results are wrong.
A simple example is here: https://github.com/paugier/cython_capi/tree/master/using_cpython_pycapsule_class
The capsules are executed by Pythran (one needs to use the version on github https://github.com/serge-sans-paille/pythran).
The .pyx file:
from cpython.pycapsule cimport PyCapsule_New
cdef int twice_func(int c):
    return 2*c
cdef class Twice:
    cdef public dict __pyx_capi__
    def __init__(self):
        self.__pyx_capi__ = self.get_capi()
    cpdef get_capi(self):
        return {
            'twice_func': PyCapsule_New(
                <void *>twice_func, 'int (int)', NULL),
            'twice_cpdef': PyCapsule_New(
                <void *>self.twice_cpdef, 'int (int)', NULL),
            'twice_cdef': PyCapsule_New(
                <void *>self.twice_cdef, 'int (int)', NULL),
            'twice_static': PyCapsule_New(
                <void *>self.twice_static, 'int (int)', NULL)}
    cpdef int twice_cpdef(self, int c):
        return 2*c
    cdef int twice_cdef(self, int c):
        return 2*c
    @staticmethod
    cdef int twice_static(int c):
        return 2*c
The file compiled by pythran (call_capsule_pythran.py).
# pythran export call_capsule(int(int), int)
def call_capsule(capsule, n):
    r = capsule(n)
    return r
Once again it is a new feature of Pythran so one needs the version on github...
And the test file:
try:
    import faulthandler
    faulthandler.enable()
except ImportError:
    pass
import unittest
from twice import Twice
from call_capsule_pythran import call_capsule
class TestAll(unittest.TestCase):
    def setUp(self):
        self.obj = Twice()
        self.capi = self.obj.__pyx_capi__
    def test_pythran(self):
        value = 41
        print('\n')
        for name, capsule in self.capi.items():
            print('capsule', name)
            result = call_capsule(capsule, value)
            if name.startswith('twice'):
                if result != 2*value:
                    how = 'wrong'
                else:
                    how = 'good'
            print(how, f'result ({result})\n')
if __name__ == '__main__':
    unittest.main()
It is buggy and gives:
capsule twice_func
good result (82)
capsule twice_cpdef
wrong result (4006664390)
capsule twice_cdef
wrong result (4006664390)
capsule twice_static
good result (82)
It shows that it works fine for the standard function and for the static function but that there is a problem for the methods.
Note that the fact that it works for two capsules seems to indicate that the problem does not come from Pythran.
Edit
After DavidW's comments, I understand that we would have to create at run time (for example in get_capi) a C function with the signature int(int) from the bound method twice_cdef whose signature is actually int(Twice, int).
I don't know if this is really impossible to do with Cython...
To follow up/expand on my comments:
The basic issue is that Pythran expects the PyCapsule to contain a C function pointer with the signature int f(int). However, the signature of your methods is int(PyObject* self, int c). The integer argument gets passed as self (not causing disaster since self isn't actually used...) and some arbitrary bit of memory is used in place of the int c. Unfortunately it isn't possible to use pure C code to create a C function pointer with "bound arguments", so Cython can't (and realistically won't be able to) do it.
Modification 1 is to get better compile-time type checking of what you're passing to your PyCapsules by creating a function that accepts the correct types and casting in there, rather than just casting to <void*> blindly. This doesn't solve your problem but warns you at compile-time when it isn't going to work:
ctypedef int(*f_ptr_type)(int)
cdef make_PyCapsule(f_ptr_type f, string):
    return PyCapsule_New(
        <void *>f, string, NULL)
# then in get_capi:
'twice_func': make_PyCapsule(twice_func, b'int (int)'), # etc
It is actually possible to create a C function from an arbitrary Python callable using ctypes (or cffi) - see Using function pointers to methods of classes without the gil (bottom of answer). This adds an extra layer of Python calls, so it isn't terribly quick, and the code is a bit messy. ctypes achieves this by using runtime code generation (which isn't that portable, nor something you can do in pure C) to build a function on the fly and then create a pointer to that.
Although you claim in the comments that you don't think you can use the Python interpreter, I don't think this is true - Pythran generates Python extension modules (so is pretty bound to the Python interpreter) and it seems to work in your test case shown here:
_func_cache = []
cdef f_ptr_type py_to_fptr(f):
    import ctypes
    functype = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)
    ctypes_f = functype(f)
    _func_cache.append(ctypes_f)  # ensure references are kept
    return (<f_ptr_type*><size_t>ctypes.addressof(ctypes_f))[0]
# then in get_capi:
'twice_cpdef': make_PyCapsule(py_to_fptr(self.twice_cpdef), b'int (int)')
Unfortunately it only works for cpdef and not cdef functions since it does rely on having a Python callable. cdef functions can be made to work with a lambda (provided you change get_capi to def instead of cpdef):
'twice_cdef': make_PyCapsule(py_to_fptr(lambda x: self.twice_cdef(x)), b'int (int)'),
It's all a little messy but can be made to work.
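The ctypes step can be tried on its own in plain Python, without Cython or Pythran. The following is a minimal sketch under that assumption: the Twice class here is a pure-Python stand-in for the cdef class, and INT_INT is a name chosen for illustration. It turns a bound method into a C function pointer with the signature int (int), takes its raw address, and calls it back through that address.

```python
import ctypes


class Twice:
    """Pure-Python stand-in for the cdef class in the question."""

    def twice(self, c):
        return 2 * c


# C signature "int (int)": the first argument is the return type.
INT_INT = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)

obj = Twice()

# ctypes generates a small C function at run time that calls obj.twice.
# This object must stay referenced for as long as the pointer is in use.
c_callback = INT_INT(obj.twice)

# Raw address of the generated function, i.e. what a capsule would store.
address = ctypes.cast(c_callback, ctypes.c_void_p).value

# Rebuild a callable from the bare address and call it with a single int.
from_address = INT_INT(address)
result = from_address(41)
print(result)
```

The generated function takes only the int; self is captured by the bound method on the Python side, which is exactly the "bound argument" that a plain C function pointer cannot carry.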

Call cdef function by name in Cython

I have a bunch of cdef functions in Cython, that are called by a def function in a pyx file, e.g.:
cdef inline void myfunc_c(...):
    (...)
    return
def wrapper(...):
    myfunc_c(...)
    return
This works well. But to avoid writing a Python wrapper for each cdef function, I tried to look the cdef functions up by name, either by putting them in a dictionary or with something like:
def wrapper(operation):
    if operation == 'my_func':
        func = myfunc_c
    func(...)
    return
But this doesn't work. Cython complains that it doesn't know the type of myfunc_c.
Is there any way to index or call the cpdef functions by name (e.g. use a string)? I also tried things like locals()['myfunc_c'], but that doesn't work either.
For general cdef functions this is impossible: they only define a C interface, not a Python interface, so there's no introspection available.
For a cdef function declared with api (e.g. cdef api funcname()) it is actually possible. Such functions are listed in an undocumented module-level dictionary, __pyx_capi__, which maps each name to a PyCapsule containing the function pointer. You'd then do
capsule_name = PyCapsule_GetName(obj)
func = PyCapsule_GetPointer(obj, capsule_name)
(where PyCapsule_* are functions cimported from the Python C API). func is a void* that you can cast into a function pointer of an appropriate type. Getting the type right is important, and up to you!
Although undocumented, the __pyx_capi__ interface is relatively stable and used by Scipy for its LowLevelCallable feature, for example.
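To make the capsule round trip concrete, here is a sketch in plain Python that reaches the same C API functions through ctypes.pythonapi (CPython only). Everything outside the PyCapsule_* calls is an assumption for illustration: fake_capi imitates what a __pyx_capi__ dictionary holds, and the function pointer is produced with ctypes rather than by a compiled Cython module.

```python
import ctypes

# Python C API functions, reached through ctypes for this sketch.
api = ctypes.pythonapi
api.PyCapsule_New.restype = ctypes.py_object
api.PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]
api.PyCapsule_GetName.restype = ctypes.c_char_p
api.PyCapsule_GetName.argtypes = [ctypes.py_object]
api.PyCapsule_GetPointer.restype = ctypes.c_void_p
api.PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]

INT_INT = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)


def twice(c):
    return 2 * c


c_twice = INT_INT(twice)   # keep alive: the capsule only stores an address
SIGNATURE = b"int (int)"   # keep alive: the capsule does not copy its name

# Hypothetical stand-in for a module's __pyx_capi__ dictionary.
fake_capi = {
    "twice": api.PyCapsule_New(
        ctypes.cast(c_twice, ctypes.c_void_p), SIGNATURE, None),
}


def call_by_name(name, arg):
    capsule = fake_capi[name]
    capsule_name = api.PyCapsule_GetName(capsule)
    pointer = api.PyCapsule_GetPointer(capsule, capsule_name)
    # The cast is unchecked: choosing the right type is up to the caller.
    func = INT_INT(pointer)
    return func(arg)


result = call_by_name("twice", 21)
```

In this sketch the name stored in the capsule doubles as a description of the signature, but nothing enforces it; casting to the wrong function type would not be detected.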
cpdef functions define both a Python and a C interface. Within a function they will be available in globals() rather than locals() (since locals() only gives the variables defined in that function.)
I don't actually think you want to do this though. I think you just want to use cpdef instead of cdef since this automatically generates a Python wrapper for the function.
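For functions that do have a Python interface, looking them up by name is straightforward. The sketch below is plain Python with hypothetical function names; in a .pyx file the two functions would be declared cpdef. It shows the globals() lookup mentioned above next to an explicit dispatch dictionary, which is an alternative added here because it restricts callers to a known set of names.

```python
def add_one(x):
    return x + 1


def double(x):
    return 2 * x


# Explicit table: only the names listed here can be called.
_OPERATIONS = {"add_one": add_one, "double": double}


def wrapper(operation, value):
    """Call one of the registered functions by name."""
    try:
        func = _OPERATIONS[operation]
    except KeyError:
        raise ValueError(f"unknown operation: {operation!r}") from None
    return func(value)


def wrapper_globals(operation, value):
    """Same idea using module globals; any module-level name is reachable."""
    return globals()[operation](value)
```

Usage: wrapper("double", 4) and wrapper_globals("double", 4) both call double(4).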

Cython extension class: How do I expose methods in the auto-generated C struct?

I have existing C++ code that defines some classes I need to use, but I need to be able to send those classes to Python code. Specifically, I need to create class instances in C++, create Python objects to serve as wrappers for these C++ objects, then pass these Python objects to Python code for processing. This is just one piece of a larger C++ program, so it needs to be done ultimately in C++ using the C/Python API.
To make my life easier, I have used Cython to define extension classes (cdef classes) that serve as the Python wrappers for my C++ objects. I am using the typical format where the cdef class contains a pointer to the C++ class, which is then initialized when the cdef class instance is created. Since I also want to be able to replace the pointer if I have an existing C++ object to wrap, I have added methods to my cdef classes to accept() the C++ object and take its pointer. My other cdef classes successfully use the accept() method in Cython, for example when one object owns another.
Here is a sample of my Cython code:
MyCPlus.pxd
cdef extern from "MyCPlus.h" namespace "mynamespace":
    cdef cppclass MyCPlus_Class:
        MyCPlus_Class() except +
PyModule.pyx
cimport MyCPlus
from libcpp cimport bool
cdef class Py_Class [object Py_Class, type PyType_Class]:
    cdef MyCPlus.MyCPlus_Class* thisptr
    cdef bool owned
    cdef void accept(self, MyCPlus.MyCPlus_Class &indata):
        if self.owned:
            del self.thisptr
        self.thisptr = &indata
        self.owned = False
    def __cinit__(self):
        self.thisptr = new MyCPlus.MyCPlus_Class()
        self.owned = True
    def __dealloc__(self):
        if self.owned:
            del self.thisptr
The problem comes when I try to access the accept() method from C++. I tried using the public and api keywords on my cdef class and on the accept() method, but I cannot figure out how to expose this method in the C struct in Cython's auto-generated .h file. No matter what I try, the C struct looks like this:
PyModule.h (auto-generated by Cython)
struct Py_Class {
    PyObject_HEAD
    struct __pyx_vtabstruct_11PyModule_Py_Class *__pyx_vtab;
    mynamespace::MyCPlus_Class *thisptr;
    bool owned;
};
I also tried typing the self input as a Py_Class, and I even tried forward-declaring Py_Class with the public and api keywords. I also experimented with making accept() a static method. Nothing I've tried works to expose the accept() method so that I can use it from C++. I did try accessing it through __pyx_vtab, but I got a compiler error, "invalid use of incomplete type". I have searched quite a bit, but haven't seen a solution to this. Can anyone help me? Please and thank you!
As you pointed out in your comment, it does seem that the __pyx_vtab member is for Cython's use only, since the exported header(s) don't even define its struct type.
Adding to your response, one approach could also be:
cdef api class Py_Class [object Py_Class, type Py_ClassType]:
    ...
    cdef void accept(self, MyCPlus.MyCPlus_Class &indata):
        ... # do stuff here
    ...
cdef api void (*Py_Class_accept)(Py_Class self, MyCPlus.MyCPlus_Class &indata)
Py_Class_accept = &Py_Class.accept
Basically, we define a function pointer and set it to the extension type's method we want to expose. This is not that different from the cdef'd function in your response; the main difference is that we can define our methods as usual in the class definition, without duplicating functionality or forwarding calls to another function just to expose it. One caveat is that the function pointer's signature has to match the method's almost verbatim, including the extension type of self (in this case); then again, this also applies to regular functions.
Do note that I tried this on a C-level Cython .pyx file; I haven't tested it with a C++ implementation file and do not intend to. Hopefully it works just as well.
This is not really a solution, but I came up with a workaround for my problem. I am still hoping for a solution that allows me to tell Cython to expose the accept() method to C++.
My workaround is that I wrote a separate function for my Python class (not a method). I then gave the api keyword both to my Python class and to the new function:
cdef api class Py_Class [object Py_Class, type PyType_Class]:
    (etc.)
cdef api Py_Class wrap_MyCPlusClass(MyCPlus.MyCPlus_Class &indata):
    # Type the local so that thisptr and owned are accessed as C attributes
    cdef Py_Class wrapper = Py_Class()
    del wrapper.thisptr
    wrapper.thisptr = &indata
    wrapper.owned = False
    return wrapper
This gets a little unwieldy with the number of different classes I need to wrap, but at least Cython puts the function in the API where it is easy to use:
struct Py_Class* wrap_MyCPlusClass(mynamespace::MyCPlus_Class &);
You probably want to use cpdef instead of cdef when declaring accept. See the docs:
Callable from Python and C
* Are declared with the cpdef statement.
* Can be called from anywhere, because it uses a little Cython magic.
* Uses the faster C calling conventions when being called from other Cython code.
Try that!

Passing a bounded method in Cython as argument

I am trying to wrap some C++ code into Cython and I came up with some trouble trying to pass a method from a class as an argument to a function.
I do not know if it makes it more clear, but class A represents a statistical model (so myAMethod uses not only the arguments passed but many instance variables) and B has different methods for minimizing the function passed.
In C++ I have something of this style:
class A
{
public:
    double myAMethod(double*);
};
class B
{
public:
    double myBMethod(A&, double (A::*f)(double*));
};
So what I am trying to do is to use instances of both A and B in Cython code. I had no trouble wrapping the classes, but when I try to use myBMethod, I don't know how to pass a pointer of the kind A::*myAMethod
If I do this:
myBMethod(ptrToAObj[0], &ptrToAObj.myAMethod),
then Cython compiles this code to [...] &ptrToAObj->myAMethod [...], and I get the message one would expect from g++:
"ISO C++ forbids taking the address of a bound member function to form a pointer to member function."
But if I try to point straight to the class method, and do myBMethod(ptrToAObj[0], A.myAMethod), then Cython won't compile and say that
myAMethod is not a static member from A.
And that's pretty much as far as I was able to get. I could work at the C++ level and avoid any of these annoyances, but if I were able to use instances of A and B in Python (via Cython) interactively, it would help speed up my development pace.
Any help will be really appreciated, and I apologize if this question has already been answered and/or is covered in a reference - I searched SO, the Cython reference and Smith's "Cython" book and did not find this topic addressed.
Thanks in advance!
I have a partial (if horrendous) solution. I'm prepared to believe there's a better way, but I don't know it.
In cpp_bit.hpp:
class A {
public:
    double myAMethod(double*) { return 0.0; }
};
typedef double (A::*A_f_ptr)(double *);
class B {
public:
    double myBMethod(A& a, A_f_ptr f) {
        double x = 0.1;
        return (a.*f)(&x);
    }
};
A_f_ptr getAMethod() {
    return &A::myAMethod;
}
I've given the functions very basic implementations, just so I can check for really obvious crashes. I've also created a function, getAMethod, which returns a member function pointer to myAMethod. You'll need to do this for every method you want to wrap.
In py_bit.pyx
# distutils: language = c++
from cython.operator cimport dereference
cdef extern from "cpp_bit.hpp":
    cdef cppclass A:
        double myAMethod(double*)
    cdef cppclass A_f_ptr:
        pass
    cdef cppclass B:
        double myBMethod(A&, A_f_ptr)
    cdef A_f_ptr getAMethod()
cdef class PyA:
    cdef A* thisptr
    def __cinit__(self):
        self.thisptr = new A()
    def __dealloc__(self):
        del self.thisptr
    cpdef myAMethod(self, double[:] o):
        return self.thisptr.myAMethod(&o[0])
cdef class PyB:
    cdef B* thisptr
    def __cinit__(self):
        self.thisptr = new B()
    def __dealloc__(self):
        del self.thisptr
    cpdef myBMethod(self, PyA a):
        return self.thisptr.myBMethod(dereference(a.thisptr), getAMethod())
I couldn't figure out how to typedef a member function pointer in Cython, so instead I created an empty cppclass with the same name. This works because cython just seems to use it for type-checking and nothing more, and since it includes cpp_bit.hpp (where it's defined) you can use it no problem.
All I've done is left the task of getting the member function pointer to c++ (in getAMethod, which I call). I don't think it's entirely satisfactory, but it looks workable, and is only a short extra c++ function for every member function you want to access. You could play with where you put it to encapsulate it more cleanly.
An alternative, untested approach: (Edit: further thought suggests this might be very tricky! Attempt this at your own risk!)
Personally, I'd be tempted to change the c++ interface so that myBMethod is defined as
double myBMethod(std::function<double (double*)>)
(since presumably you always call it with the A instance it's passed with). Then use lambda functions in c++(11!) to wrap the A instance and function together
b.myBMethod([&](double* d){ return a.myAMethod(d); });
It may then take a bit of hugely complicated Cython wrapping that I haven't yet considered, but it should be possible to convert a simple double (double*) function pointer to the c++ function object, and so use it more directly.
It's also possible that your actual design is more complicated in ways I haven't considered, and this approach isn't flexible enough anyway.
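The difference between the two interfaces can be illustrated with a loose Python analogy (a sketch only; Python has no member function pointers, and all names below are hypothetical). Passing the instance together with the unbound function A.my_a_method corresponds to the A& plus A::* pair, while passing one callable that already carries the instance corresponds to the std::function plus lambda variant.

```python
class A:
    """Stand-in for the statistical model; uses instance state."""

    def __init__(self, offset):
        self.offset = offset

    def my_a_method(self, x):
        return (x - self.offset) ** 2


class B:
    def evaluate_member(self, a, f, x):
        # Analogue of myBMethod(A&, double (A::*f)(double*)):
        # the instance and the method arrive separately.
        return f(a, x)

    def evaluate_callable(self, f, x):
        # Analogue of myBMethod(std::function<double (double*)>):
        # a single callable that already carries its instance.
        return f(x)


a = A(3.0)
b = B()

via_member = b.evaluate_member(a, A.my_a_method, 5.0)
via_bound = b.evaluate_callable(a.my_a_method, 5.0)
via_lambda = b.evaluate_callable(lambda x: a.my_a_method(x), 5.0)
```

All three calls evaluate the same method on the same instance; only the way the instance reaches the callee differs.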

How to wrap a C++ functor in Cython

I'm trying to wrap a C++ library in which the logic is implemented as templatized functors in .hpp files, and I'm struggling to find the right way to expose the C++ functors as Cython/Python functions. How are functors like the one below supposed to be wrapped in Cython?
I believe this should be possible, at least for template classes and functions, according to the Cython 0.20 docs.
Note: I think I've figured out how to wrap normal C++ functions—the problem occurs when I'm trying to wrap a templatized functor, i.e. a template struct that overloads the () operator (making it act like a function when a data type is fixed).
Disclaimer: I'm a total novice in C++ and very new to Cython so apologies if I'm making obvious mistakes here.
The functor I'm trying to wrap:
#include <vector>
#include "EMD_DEFS.hpp"
#include "flow_utils.hpp"
template<typename NUM_T, FLOW_TYPE_T FLOW_TYPE= NO_FLOW>
struct emd_hat_gd_metric {
    NUM_T operator()(const std::vector<NUM_T>& P, const std::vector<NUM_T>& Q,
                     const std::vector< std::vector<NUM_T> >& C,
                     NUM_T extra_mass_penalty= -1,
                     std::vector< std::vector<NUM_T> >* F= NULL);
};
My wrapper.pyx file:
# distutils: language = c++
from libcpp.vector cimport vector
cdef extern from "lib/emd_hat.hpp":
    # Apparently `cppclass` is necessary here even though
    # `emd_hat_gd_metric` is not a class...?
    cdef cppclass emd_hat_gd_metric[NUM_T]:
        NUM_T operator()(vector[NUM_T]& P,
                         vector[NUM_T]& Q,
                         vector[vector[NUM_T]]& C) except +
cdef class EMD:
    cdef emd_hat_gd_metric *__thisptr
    def __cinit__(self):
        self.__thisptr = new emd_hat_gd_metric()
    def __dealloc__(self):
        del self.__thisptr
    def calculate(self, P, Q, C):
        # What goes here? How do I call the functor as a function?
        return self.__thisptr(P, Q, C)
The above just gives a Calling non-function type 'emd_hat_gd_metric[NUM_T]' error when I try to compile it with cython --cplus wrapper.pyx.
Here's the full library I'm trying to wrap.
End goal: to be able to call emd_hat_gd_metric as a Cython/Python function, with arguments being NumPy arrays.
I couldn't find a real solution, but here's a workaround (that requires modifying the C++ code): just instantiate the template function with the data type you need in the C++ header, then declare that function normally in your .pyx file.
It's a little unwieldy if you need many different data types, but I only needed double. It would also be nicer if it wasn't necessary to modify the external library… but it works.
In the C++ some_library.hpp file:
Instantiate the functor with the data type you need (say, double):
template<typename T>
struct some_template_functor {
    T operator()(T x);
};
// Add this:
some_template_functor<double> some_template_functor_double;
In the Cython .pyx file:
Declare the function normally (no need for cppclass):
cdef extern from "path/to/some_library.hpp":
    cdef double some_template_functor_double(double x)
Then you can call some_template_functor_double from within Cython.
