We have Cython code built on top of an existing C library -- a very simplified example here just to show the problem. The some_c_api_t structure is opaque, with an allocate function, a free function, and several functions to manipulate the structure.
data.pxd:
cdef extern from "<some_c_api.h>" nogil:
    ctypedef struct some_c_api_t:
        pass

    some_c_api_t *some_c_api_alloc()
    void some_c_api_free(some_c_api_t *scap)
    int some_c_api_get_count(const some_c_api_t *scap)
Class Data is the Python representation of the C structure. Class Data uses a helper class, Helper, to implement __getitem__, __setitem__, and other operations on part of the C type in a more Pythonic way than the C API provides. Here, all we show is __len__. For the sake of example, we also have an equivalent function, get_count, that tries to access the "scap" pointer from Data.
data.pyx:
from cpython.exc cimport PyErr_SetFromErrno

cdef class Helper:
    cdef some_c_api_t *scap

    def __cinit__(self, data, some_c_api_t *scap):
        self.data = data
        self.scap = scap

    def __len__(self):
        return some_c_api_get_count(self.scap)

    def get_count(self):
        return some_c_api_get_count(self.data.scap)

cdef class Data:
    cdef some_c_api_t *scap

    def __cinit__(self):
        self.scap = some_c_api_alloc()
        if self.scap == NULL:
            PyErr_SetFromErrno(OSError)

    def __dealloc__(self):
        some_c_api_free(self.scap)

    @property
    def values(self):
        return Helper(self, self.scap)
When compiled, we get two errors. First, the call to Helper's constructor tries to convert "scap" to a Python object, even though it is declared as the C type in Helper's constructor.
    @property
    def values(self):
        return Helper(self, self.scap)
                                 ^
------------------------------------------------------------
vna/data.pyx:30:32: Cannot convert 'some_c_api_t *' to Python object
Similarly, when get_count tries to get the scap pointer from Data, Cython treats it as a Python object.
    def get_count(self):
        return some_c_api_get_count(self.data.scap)
                                             ^
------------------------------------------------------------
vna/data.pyx:14:45: Cannot convert Python object to 'const some_c_api_t *'
Another approach we tried was to define a "get_scap" method in Data:
    cdef some_c_api_t *get_scap(self):
        return self.scap
and call it from Helper. In every case, however, attempts to pass the C pointer to Helper fail, with Cython insisting that the API between these two "cdef" classes must consist only of Python objects.
Is it possible for the Helper class to get access to the C pointer?
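One pattern that avoids both errors (a sketch, assuming Helper and Data stay in the same data.pyx) is to give Helper a typed Data attribute. Only Python objects then cross the def-level constructor call, and the pointer is reached through data.scap at the C level:
from cpython.exc cimport PyErr_SetFromErrno

cdef class Data:
    cdef some_c_api_t *scap

    def __cinit__(self):
        self.scap = some_c_api_alloc()
        if self.scap == NULL:
            PyErr_SetFromErrno(OSError)

    def __dealloc__(self):
        some_c_api_free(self.scap)

    @property
    def values(self):
        # only a Python object crosses the def-level call
        return Helper(self)

cdef class Helper:
    # typed attribute: within the same module Cython can see Data's
    # cdef fields, so self.data.scap is a C-level access
    cdef Data data

    def __cinit__(self, Data data):
        self.data = data

    def __len__(self):
        return some_c_api_get_count(self.data.scap)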
Related
I'm trying to pass an np.ndarray from Python to instantiate a Cython class. However, I can't work out how to do it for an array with an arbitrary number of dimensions. I'd like my .pyx interface to look like:
wrapper.pyx:
import numpy as np
cimport numpy as np

cdef extern from "myClass.h":
    cdef cppclass C_myClass "myClass":
        void myClass(np.float32_t*, int*, int)

cdef class array:
    cdef C_myClass* cython_class
    cdef int shape[8]  # max number of dimensions = 8
    cdef int ndim

    def __init__(self, np.ndarray[dtype=np.float32_t] numpy_array):
        self.ndim = numpy_array.ndim
        for dim in range(self.ndim):
            self.shape[dim] = numpy_array.shape[dim]
        self.cython_class = new C_myClass(&numpy_array[0], &self.shape[0], self.ndim)

    def __del__(self):
        del self.cython_class
Such that the class constructor can look like:
myClass.h:
myClass(float* array_, int* shape_, int ndim_);
Do any of you know how to handle an array with any number of dimensions within Cython, while still being able to get the array shape parameters? (I don't want the user to have to flatten the array or pass in the array shapes themselves.)
There isn't a direct Cython type for "numpy array with any number of dimensions" (or indeed "typed memoryview with any number of dimensions").
Therefore, I suggest leaving the array argument to the constructor untyped and using the buffer protocol yourself.
The following code is untested, but it should give you the right idea:
from cpython.buffer cimport Py_buffer, PyObject_GetBuffer, PyBUF_ND, PyBUF_FORMAT, PyBuffer_Release

...

    def __init__(self, array):
        cdef Py_buffer buf
        # shape and ndim can be passed as before - they don't
        # need the array to be typed
        self.ndim = array.ndim
        for dim in range(self.ndim):
            self.shape[dim] = array.shape[dim]
        # Note that PyBUF_ND requires a C-contiguous array. This looks
        # to be an implicit requirement of your C++ class that you
        # probably don't realize you have.
        PyObject_GetBuffer(array, &buf, PyBUF_ND | PyBUF_FORMAT)
        if buf.format != b"f":
            PyBuffer_Release(&buf)
            raise TypeError("Numpy array must have dtype float32")
        self.cython_class = new C_myClass(<float*>buf.buf, &self.shape[0], self.ndim)
You then need to make sure you release the buffer with PyBuffer_Release.
It isn't clear whether C_myClass copies the data or holds a pointer to it.
If C_myClass copies the data, then call PyBuffer_Release straight away after constructing self.cython_class.
If C_myClass holds a reference to the data, then call PyBuffer_Release in the destructor, after self.cython_class is deleted.
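A minimal sketch of the second case (assuming C_myClass keeps the pointer, and reusing the declarations from the snippet above): keep the Py_buffer on the instance and release it only after the C++ object is gone.
cdef class array:
    cdef C_myClass* cython_class
    cdef Py_buffer buf       # kept for the lifetime of the C++ object
    cdef bint buf_acquired   # zero-initialised, i.e. False, by default
    cdef int shape[8]
    cdef int ndim

    def __init__(self, array):
        self.ndim = array.ndim
        for dim in range(self.ndim):
            self.shape[dim] = array.shape[dim]
        PyObject_GetBuffer(array, &self.buf, PyBUF_ND | PyBUF_FORMAT)
        self.buf_acquired = True
        if self.buf.format != b"f":
            PyBuffer_Release(&self.buf)
            self.buf_acquired = False
            raise TypeError("Numpy array must have dtype float32")
        self.cython_class = new C_myClass(<float*>self.buf.buf, &self.shape[0], self.ndim)

    def __dealloc__(self):
        # delete the C++ object first, then hand the buffer back
        if self.cython_class != NULL:
            del self.cython_class
        if self.buf_acquired:
            PyBuffer_Release(&self.buf)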
Assume we have a Cython class A with a pointer to float, as in:
# A.pyx
cdef class A:
    cdef float * ptr
We also have a Cython class B in another module which needs access to the data under ptr:
# B.pyx
cdef class B:
    cdef float * f_ptr
    cpdef submit(self, ptr_var):
        self.f_ptr = get_from(ptr_var)  # ???
The corresponding Python code using A and B might be something like
from A import A
from B import B
a = A()
b = B()
ptr = a.get_ptr()
b.submit(ptr)
How can we define get_ptr() and what would we use for get_from in B?
The solution is to wrap the pointer variable in a Python object. The module libc.stdint offers a type named uintptr_t, which is an integer type large enough to store any kind of pointer. With this, the solution might look as follows.
from libc.stdint cimport uintptr_t

cdef class A:
    cdef float * ptr
    def get_ptr(self):
        return <uintptr_t>self.ptr
The expression in angle brackets <uintptr_t> corresponds to a cast to uintptr_t. In class B we then have to cast the variable back to a pointer to float.
from libc.stdint cimport uintptr_t

cdef class B:
    cdef float * f_ptr
    cpdef submit(self, uintptr_t ptr_var):
        self.f_ptr = <float *>ptr_var
This works for any kind of pointer, not only pointers to float. One has to make sure that both modules (A and B) deal with the same kind of pointer, since that type information is lost once the pointer is wrapped in a uintptr_t.
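A hypothetical end-to-end sketch of the round trip (the malloc/free is added here only so the pointer refers to real storage; it is not part of the original A/B code):
# A.pyx
from libc.stdint cimport uintptr_t
from libc.stdlib cimport malloc, free

cdef class A:
    cdef float * ptr

    def __cinit__(self):
        self.ptr = <float *>malloc(sizeof(float))
        self.ptr[0] = 1.5

    def __dealloc__(self):
        free(self.ptr)

    def get_ptr(self):
        return <uintptr_t>self.ptr

# B.pyx
from libc.stdint cimport uintptr_t

cdef class B:
    cdef float * f_ptr

    cpdef submit(self, uintptr_t ptr_var):
        self.f_ptr = <float *>ptr_var

    def first(self):
        return self.f_ptr[0]

# usage: b.submit(a.get_ptr()); b.first() returns 1.5 as long as a is alive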
I have some classes implemented as cdef class in Cython. In client Python code, I would like to compose the classes with multiple inheritance, but I'm getting a type error. Here is a minimal reproducible example:
In [1]: %load_ext cython

In [2]: %%cython
   ...: cdef class A:
   ...:     cdef int x
   ...:     def __init__(self):
   ...:         self.x = 0
   ...: cdef class B:
   ...:     cdef int y
   ...:     def __init__(self):
   ...:         self.y = 0
   ...:

In [3]: class C(A, B):
   ...:     pass
   ...:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-3-83ef5091d3a6> in <module>()
----> 1 class C(A, B):
      2     pass

TypeError: Error when calling the metaclass bases
    multiple bases have instance lay-out conflict
Is there any way to get around this?
The docs say that:
A Python class can inherit from multiple extension types provided that the usual Python rules for multiple inheritance are followed (i.e. the C layouts of all the base classes must be compatible).
I'm trying to understand what this could possibly mean given the trivial example above.
It's pretty restricted. As best I can tell, all but one of the base classes has to be empty. Empty classes can have def methods, but not cdef functions or cdef attributes.
Take a Cython class:
cdef class A:
    cdef int x
This translates to C code:
struct __pyx_obj_2bc_A {  // the name might be different
    PyObject_HEAD
    int x;
};
Essentially just a C struct containing the basic Python object stuff, and an integer.
The restriction is that a derived class must contain only one PyObject_HEAD and that its PyObject* should also be interpretable as a struct __pyx_obj_2bc_A* or a struct __pyx_obj_2bc_B*.
In your case the two integers x and y would attempt to occupy the same memory (hence the conflict). However, if one of the types were empty, they would share the PyObject_HEAD but not conflict further.
cdef functions cause a struct __pyx_vtabstruct_2bc_A *__pyx_vtab; to be added to the struct (so it's not empty). This contains function pointers, which allow derived classes to override the cdef functions.
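For example, the following combination is expected to be accepted, because B adds nothing beyond PyObject_HEAD (a sketch of the "all but one base is empty" rule, compiled in one Cython module):
cdef class A:
    cdef int x          # A extends the instance layout

cdef class B:
    # no cdef attributes and no cdef methods, so B's layout is just PyObject_HEAD
    def describe(self):
        return "empty at the C level"

class Combined(A, B):   # accepted: only A adds to the layout
    pass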
Having two cdef classes that inherit from a common third class is OK, even if the common third class is not empty.
cdef class A:
    cdef int x

cdef class B(A):
    cdef int y

cdef class C(A):
    pass

class D(B, C):
    pass
The internal Python code that does this check is the function best_base, if you really want to investigate the details of the algorithm.
With reference to "is there any way to get around this?", the answer is "not really". Your best option is probably composition rather than inheritance (i.e. have class C hold an A object and a B object, rather than inherit from A and B), as sketched below.
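A rough sketch of that composition approach, reusing the A and B from the example above:
class C:
    # compose instead of inheriting: hold an A and a B
    def __init__(self):
        self.a = A()
        self.b = B()
Any behaviour needed from A or B is then forwarded through self.a and self.b.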
main.h
#ifndef MAIN_H
#define MAIN_H

#ifdef __cplusplus
extern "C" {
#endif

typedef struct Pythonout {
    int pn;
    double *px;
} Pythonout;

struct Pythonout l1tf_main(char *ifile_y, double lambda, int rflag);

#ifdef __cplusplus
}
#endif

#endif /* MAIN_H */
The following is the Cython .pyx file using main.h:
.pyx
cimport numpy as np

cdef extern from "main.h":
    ctypedef struct Pythonout:
        int n
        double *x
    cdef Pythonout l1tf_main(char *ifile_y,
                             double lambdaval,
                             int rflag)

cdef class Pyclass:
    cdef Pythonout pnx

    def __cinit__(self, char *pfilename, lambdaval, rflag):
        self.pnx = l1tf_main(pfilename, lambdaval, rflag)

    @property
    def n(self):
        return self.pnx.n

    @property
    def x(self):
        cdef np.npy_float64 shape[1]
        shape[0] = <np.npy_intp> self.pnx.n
        ndarray = np.PyArray_SimpleNewFromData(1,
                                               &(self.pnx.n), np.NPY_FLOAT64,
                                               <void *> self.pnx.x)
        np.PyArray_UpdateFlags(ndarray,
                               ndarray.flags.num | np.NPY_OWNDATA)
        return ndarray

cpdef filtered_trend(char *pfilename, double lambdaval, int rflag):
    pnx = Pyclass(pfilename, lambdaval, rflag)
    return pnx.x
When compiling the class, I get the following errors:
‘Pythonout {aka struct Pythonout}’ has no member named ‘n’
‘Pythonout {aka struct Pythonout}’ has no member named ‘x’
They occur where the members pnx.n and pnx.x are accessed.
There are at least two issues with your code.
The trivial issue causing the compilation errors: in C you call the struct attributes px and pn, while in Cython you call them x and n. This means that the code Cython generates does not match the C header. Make these consistent.
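For instance, the extern block could be made to match the header like this (a sketch; only the field names change):
cdef extern from "main.h":
    ctypedef struct Pythonout:
        int pn          # must match the names in main.h
        double *px
    cdef Pythonout l1tf_main(char *ifile_y, double lambdaval, int rflag)
The properties then read self.pnx.pn and self.pnx.px.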
np.PyArray_UpdateFlags(ndarray,
                       ndarray.flags.num | np.NPY_OWNDATA)
This tells Numpy that it now owns the data in x and is responsible for deallocating it. However, suppose you have the Python code:
x1 = PyClassInstance.x
x2 = PyClassInstance.x
You now have two Numpy arrays that each believe they own the same data, and both will try to deallocate it. (Similarly, if you never access x, then pnx.x is never deallocated.) What you should probably do is have the Pyclass instance be responsible for deallocating its pnx.x (in a __dealloc__ function). Then, in your x property, do:
ndarray = PyArray_SimpleNewFromData(...)
Py_INCREF(self) # SetBaseObject doesn't do this so you must do it manually
PyArray_SetBaseObject(ndarray, self) # check return value for errors...
Now the Numpy array treats the Pyclass instance as owning the data.
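Putting those pieces together, here is a sketch of how the x property could look. It assumes the struct fields have been renamed to pn/px to match main.h, and that pnx.px was allocated with malloc so free is the right deallocator; PyArray_SetBaseObject is available from recent numpy .pxd files, otherwise declare it yourself from numpy/arrayobject.h.
cimport numpy as np
from cpython.ref cimport Py_INCREF
from libc.stdlib cimport free

np.import_array()   # required before using numpy's C API

cdef class Pyclass:
    cdef Pythonout pnx
    # __cinit__ calling l1tf_main as before

    def __dealloc__(self):
        # the wrapper, not the ndarray, owns the C buffer
        free(self.pnx.px)

    @property
    def x(self):
        cdef np.npy_intp shape[1]
        shape[0] = <np.npy_intp> self.pnx.pn
        ndarray = np.PyArray_SimpleNewFromData(1, &shape[0], np.NPY_FLOAT64,
                                               <void *> self.pnx.px)
        Py_INCREF(self)   # SetBaseObject steals the reference we add here
        if np.PyArray_SetBaseObject(ndarray, self) < 0:
            raise RuntimeError("PyArray_SetBaseObject failed")
        return ndarray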
In another question I learnt how to expose a function returning a C++ object to Python by copying the object. Having to perform a copy does not seem optimal. How can I return the object without copying it? That is, how can I directly access the peaks returned by self.thisptr.getPeaks(data) in PyPeakDetection.getPeaks (defined in peak_detection_.pyx)?
peak_detection.hpp
#ifndef PEAKDETECTION_H
#define PEAKDETECTION_H

#include <string>
#include <map>
#include <vector>

#include "peak.hpp"

class PeakDetection
{
  public:
    PeakDetection(std::map<std::string, std::string> config);
    std::vector<Peak> getPeaks(std::vector<float> &data);

  private:
    float _threshold;
};

#endif
peak_detection.cpp
#include <iostream>
#include <string>

#include "peak.hpp"
#include "peak_detection.hpp"

using namespace std;

PeakDetection::PeakDetection(map<string, string> config)
{
    _threshold = stof(config["_threshold"]);
}

vector<Peak> PeakDetection::getPeaks(vector<float> &data){
    Peak peak1 = Peak(10,1);
    Peak peak2 = Peak(20,2);

    vector<Peak> test;
    test.push_back(peak1);
    test.push_back(peak2);

    return test;
}
peak.hpp
#ifndef PEAK_H
#define PEAK_H

class Peak {
  public:
    float freq;
    float mag;
    Peak() : freq(), mag() {}
    Peak(float f, float m) : freq(f), mag(m) {}
};

#endif
peak_detection_.pyx
# distutils: language = c++
# distutils: sources = peak_detection.cpp

from libcpp.vector cimport vector
from libcpp.map cimport map
from libcpp.string cimport string

cdef extern from "peak.hpp":
    cdef cppclass Peak:
        Peak()
        Peak(Peak &)
        float freq, mag

cdef class PyPeak:
    cdef Peak *thisptr

    def __cinit__(self):
        self.thisptr = new Peak()
    def __dealloc__(self):
        del self.thisptr
    cdef copy(self, Peak &other):
        del self.thisptr
        self.thisptr = new Peak(other)
    def __repr__(self):
        return "<Peak: freq={0}, mag={1}>".format(self.freq, self.mag)

    property freq:
        def __get__(self): return self.thisptr.freq
        def __set__(self, freq): self.thisptr.freq = freq
    property mag:
        def __get__(self): return self.thisptr.mag
        def __set__(self, mag): self.thisptr.mag = mag

cdef extern from "peak_detection.hpp":
    cdef cppclass PeakDetection:
        PeakDetection(map[string, string])
        vector[Peak] getPeaks(vector[float])

cdef class PyPeakDetection:
    cdef PeakDetection *thisptr

    def __cinit__(self, map[string, string] config):
        self.thisptr = new PeakDetection(config)
    def __dealloc__(self):
        del self.thisptr

    def getPeaks(self, data):
        cdef Peak peak
        cdef PyPeak new_peak
        cdef vector[Peak] peaks = self.thisptr.getPeaks(data)
        retval = []
        for peak in peaks:
            new_peak = PyPeak()
            new_peak.copy(peak)  # how can I avoid that copy?
            retval.append(new_peak)
        return retval
If you have a modern C++ compiler and can use rvalue references, move constructors and std::move, it's pretty straightforward. I think the easiest way is to create a Cython wrapper for the vector, and then use a move constructor to take hold of the contents of the vector.
All code shown goes in peak_detection_.pyx.
First, wrap std::move. For simplicity I've just wrapped the one case we want (vector<Peak>) rather than messing about with templates.
cdef extern from "<utility>":
    vector[Peak]&& move(vector[Peak]&&)  # just define it for vector[Peak] rather than anything else
Second, create a vector wrapper class. This defines the Python functions necessary to access it like a list. It also defines a function to call the move assignment operator.
cdef class PyPeakVector:
    cdef vector[Peak] vec

    cdef move_from(self, vector[Peak]&& move_this):
        self.vec = move(move_this)

    def __getitem__(self, idx):
        return PyPeak2(self, idx)

    def __len__(self):
        return self.vec.size()
Then define the class that wraps the Peak. This is slightly different from your other class in that it doesn't own the Peak it wraps (the vector does). Otherwise, most of the functions remain the same.
cdef class PyPeak2:
    cdef int idx
    cdef PyPeakVector vector  # keep this alive, since it owns the Peak rather than PyPeak2

    def __cinit__(self, PyPeakVector vec, idx):
        self.vector = vec
        self.idx = idx

    cdef Peak* getthisptr(self):
        # look up the pointer each time - it isn't generally safe
        # to store pointers in case the vector is resized
        return &self.vector.vec[self.idx]

    # rest of the functions as in PyPeak
    # don't define a destructor since we don't own the Peak
Finally, implement getPeaks()
cdef class PyPeakDetection:
    # ...

    def getPeaks(self, data):
        cdef Peak peak
        cdef PyPeak new_peak
        cdef vector[Peak] peaks = self.thisptr.getPeaks(data)
        retval = PyPeakVector()
        retval.move_from(move(peaks))
        return retval
Alternative approaches:
If Peak were nontrivial, you could go for an approach where you call move on Peak rather than on the vector, as you construct your PyPeaks. For the case you have here, move and copy will be equivalent for Peak.
If you can't use C++11 features, you'll need to change the interface a little. Instead of having your C++ getPeaks function return a vector, have it take an empty vector reference (owned by PyPeakVector) as an input argument and write into it. Much of the rest of the wrapping remains the same; a rough sketch follows.
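A sketch of that variant, reusing the cimports, the Peak declaration and PyPeakVector from above; the changed C++ signature is an assumption, not code from the question:
# Assumed C++ signature in peak_detection.hpp:
#     void getPeaks(std::vector<float> &data, std::vector<Peak> &out);

cdef extern from "peak_detection.hpp":
    cdef cppclass PeakDetection:
        PeakDetection(map[string, string])
        void getPeaks(vector[float] &data, vector[Peak] &out)

cdef class PyPeakDetection:
    cdef PeakDetection *thisptr

    def __cinit__(self, map[string, string] config):
        self.thisptr = new PeakDetection(config)

    def __dealloc__(self):
        del self.thisptr

    def getPeaks(self, data):
        cdef vector[float] cdata = data
        cdef PyPeakVector retval = PyPeakVector()
        # the C++ code writes straight into the vector owned by the wrapper,
        # so nothing is copied on the Cython side
        self.thisptr.getPeaks(cdata, retval.vec)
        return retval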
Two projects that accomplish interfacing C++ code with Python have withstood the test of time: Boost.Python and SWIG. Both work by adding additional markup to the pertinent C/C++ code and generating dynamically loaded Python extension libraries (.so files) and the related Python modules.
However, depending on your use case, there may still be some additional markup that looks like "copying." That said, the copying should not be as extensive, and it will all be exposed in the C++ code rather than being explicitly copied verbatim in Cython/Pyrex.