I am writing a Cython wrapper around a C library we are maintaining. I am getting the following error message:
analog.pyx:6:66: Cannot convert 'unsigned short (*)' to Python object
Here's the code I am trying to write:
cimport company as lib
def get_value(device, channel, index):
cdef unsigned short aValue
err = library_get_data_val(device, channel, index, &aValue) # line 6
# Ignore the err return value for StackOverflow.
return aValue
The prototype of the C function I am trying to use is:
unsigned long library_get_data_val(unsigned long device, int channel,
int index, unsigned short *pValue);
The library function returns the requested value in the aValue parameter. It's just an unsigned short primitive. What's the expected way of returning primitives (i.e. not struct) from these type of functions? I am new to Cython so the answer may be quite simple but I didn't see anything obvious through Google.
I think the problem is that you haven't defined library_get_data_val properly, so Cython thinks it's a Python type function you're calling, and doesn't know what to do with the pointer to aValue
Try:
cdef extern from "header_containing_library_get_data_val.h":
# I've taken a guess at the signature of library_get_data_val
# Update it to match reality
int library_get_data_val(int device, int channel, int index, int* value)
That way Cython knows it's a C-function that expects a pointer, and will be happy.
(Edited to be significantly changed from my original answer, where I misunderstood the problem!)
I found out what my problem was. You can probably tell what it is now that I've edited the question. There's a company.pxd file that's cimported by the .pyx file. Once I copied the C prototype into company.pxd it worked.
I also needed to use the lib prefix in my call:
err = lib.library_get_data_val(device, channel, index, &aValue) # line 6
Related
We would need to create a PyCapsule from a method of a class in Cython. We managed to write a code which compiles and even runs without error but the results are wrong.
A simple example is here: https://github.com/paugier/cython_capi/tree/master/using_cpython_pycapsule_class
The capsules are executed by Pythran (one needs to use the version on github https://github.com/serge-sans-paille/pythran).
The .pyx file:
from cpython.pycapsule cimport PyCapsule_New
cdef int twice_func(int c):
return 2*c
cdef class Twice:
cdef public dict __pyx_capi__
def __init__(self):
self.__pyx_capi__ = self.get_capi()
cpdef get_capi(self):
return {
'twice_func': PyCapsule_New(
<void *>twice_func, 'int (int)', NULL),
'twice_cpdef': PyCapsule_New(
<void *>self.twice_cpdef, 'int (int)', NULL),
'twice_cdef': PyCapsule_New(
<void *>self.twice_cdef, 'int (int)', NULL),
'twice_static': PyCapsule_New(
<void *>self.twice_static, 'int (int)', NULL)}
cpdef int twice_cpdef(self, int c):
return 2*c
cdef int twice_cdef(self, int c):
return 2*c
#staticmethod
cdef int twice_static(int c):
return 2*c
The file compiled by pythran (call_capsule_pythran.py).
# pythran export call_capsule(int(int), int)
def call_capsule(capsule, n):
r = capsule(n)
return r
Once again it is a new feature of Pythran so one needs the version on github...
And the test file:
try:
import faulthandler
faulthandler.enable()
except ImportError:
pass
import unittest
from twice import Twice
from call_capsule_pythran import call_capsule
class TestAll(unittest.TestCase):
def setUp(self):
self.obj = Twice()
self.capi = self.obj.__pyx_capi__
def test_pythran(self):
value = 41
print('\n')
for name, capsule in self.capi.items():
print('capsule', name)
result = call_capsule(capsule, value)
if name.startswith('twice'):
if result != 2*value:
how = 'wrong'
else:
how = 'good'
print(how, f'result ({result})\n')
if __name__ == '__main__':
unittest.main()
It is buggy and gives:
capsule twice_func
good result (82)
capsule twice_cpdef
wrong result (4006664390)
capsule twice_cdef
wrong result (4006664390)
capsule twice_static
good result (82)
It shows that it works fine for the standard function and for the static function but that there is a problem for the methods.
Note that the fact that it works for two capsules seems to indicate that the problem does not come from Pythran.
Edit
After DavidW's comments, I understand that we would have to create at run time (for example in get_capi) a C function with the signature int(int) from the bound method twice_cdef whose signature is actually int(Twice, int).
I don't know if this is really impossible to do with Cython...
To follow up/expand on my comments:
The basic issue is that the Pythran is expecting a C function pointer with the signature int f(int) to be contained within the PyCapsule. However, the signature of your methods is int(PyObject* self, int c). The 2 gets passed as self (not causing disaster since it isn't actually used...) and some arbitrary bit of memory is used in place of the int c. Unfortunately it isn't possible to use pure C code to create a C function pointer with "bound arguments" so Cython can't (and realistically won't be able to) do it.
Modification 1 is to get better compile-time type checking of what you're passing to your PyCapsules by creating a function that accepts the correct types and casting in there, rather than just casting to <void*> blindly. This doesn't solve your problem but warns you at compile-time when it isn't going to work:
ctypedef int(*f_ptr_type)(int)
cdef make_PyCapsule(f_ptr_type f, string):
return PyCapsule_New(
<void *>f, string, NULL)
# then in get_capi:
'twice_func': make_PyCapsule(twice_func, b'int (int)'), # etc
It is actually possible to create C function from arbitrary Python callables using ctypes (or cffi) - see Using function pointers to methods of classes without the gil (bottom of answer). This adds an extra layer of Python calls so isn't terribly quick, and the code is a bit messy. ctypes achieves this by using runtime code generation (which isn't that portable or something you can do in pure C) to build a function on the fly and then create a pointer to that.
Although you claim in the comments that you don't think you can use the Python interpreter, I don't think this is true - Pythran generates Python extension modules (so is pretty bound to the Python interpreter) and it seems to work in your test case shown here:
_func_cache = []
cdef f_ptr_type py_to_fptr(f):
import ctypes
functype = ctypes.CFUNCTYPE(ctypes.c_int,ctypes.c_int)
ctypes_f = functype(f)
_func_cache.append(ctypes_f) # ensure references are kept
return (<f_ptr_type*><size_t>ctypes.addressof(ctypes_f))[0]
# then in make_capi:
'twice_cpdef': make_PyCapsule(py_to_fptr(self.twice_cpdef), b'int (int)')
Unfortunately it only works for cpdef and not cdef functions since it does rely on having a Python callable. cdef functions can be made to work with a lambda (provided you change get_capi to def instead of cpdef):
'twice_cdef': make_PyCapsule(py_to_fptr(lambda x: self.twice_cdef(x)), b'int (int)'),
It's all a little messy but can be made to work.
This (I hope) is quite a simple issue, but despite doing some reading (I'm v. new to SWIG, and fairly green C-wise) I'm just not able to make the "connection" in my head.
I have a function from a library (legacy code, keen not to edit):
extern int myfunction(char *infile, char *maskfile, int check, float *median, char *msg)
My aim is to create a wrapper for this in Python using SWIG.
The values of the median and msg variables are changed by the C function. When the return int != 0 then there will be some error information in the msg arg. Where the return int == 0, then median variable will contain a float with value assigned from myfunction.
This generally runs OK where the return value is 0. I use %array_functions and %pointer_functions to create the pointers needing to be passed, as per this .i file:
%module test
%include "cpointer.i"
%include "carrays.i"
%{
#include <stdint.h>
%}
extern int myfunction(char *infile, char *maskfile, int check, float *median, char *msg)
%pointer_functions(float, floatp);
%pointer_functions(char, charp);
%array_functions(char, charArray);
After swig-ing, compiling and linking, I can call the function in python:
import test
errmsg_buffer = 1024
_infile = 'test2.dat'
infile = imstat.new_charArray(len(_infile))
for i in xrange(len(_infile)):
imstat.charArray_setitem(infile,i,_infile[i])
maskfile = imstat.new_charArray(1)
imstat.charArray_setitem(maskfile,0,'')
check = 0
med = imstat.new_floatp()
errmsg = imstat.new_charArray(errmsg_buffer)
out = test.myfunction(infile,maskfile,check,med,errmsg)
median = test.floatp_value(med)
This works sometimes, but often not - I get a lot of segfaults which are generally fixed by changing the errmsg_buffer length (clearly not a useful fix!). The C code that changes the msg string is:
(void)sprintf(errmsg,"file not found");
My main issue is in proper handling of msg string, which I suspect is causing the segfaults (and might be due to incorrect implementation via new_charArray?).
What is the best way to do this?
Can I add something to the .i that converts the char *msg into a python str?
Can this be done without "pre-initialising" with new_CharArray? I'd presumably get a buffer overflow if errmsg_buffer is too small.
I hope this is clear - happy to add comments for further discussion.
Your wrapper can be much simplified using SWIG. Try this SWIG interface file (details below):
%module test
%include "typemaps.i"
%include "cstring.i"
%apply float *OUTPUT { float *median };
%cstring_bounded_output(char *msg, 1024);
extern int myfunction(char *infile, char *maskfile, int check, float *median, char *msg);
Then, from python, use the module in the following way:
import test
infile = 'test2.dat'
maskfile = ''
check = 0
out, median, errmsg = test.myfunction(infile,maskfile,check)
if out == 0: print(errmsg)
...
However, from what you write, it is not quite clear to me why your approach segfaults.
Details
The typemaps.i file contains the float *OUTPUT typemap, which is then applied to the float *median argument and turns this from an argument into a float output value. See the SWIG docs on argument handling for details.
The cstrings.i file contains SWIG macros to deal with C strings. Here, I used the %cstring_bounded_output macro. This creates a char * buffer of the given size 1024 and passes this as the argument for char *msg automatically. Then, the contents after the function complete are converted into a python string and appended to the output. See here for details.
SWIG handles the first two char * arguments by default, that is converting python strings to appropriate char * and passing these. Note that the passed char * for these arguments are immutable, i.e., if your myfunction attempts to modify these, bad things will happen. Read about how SWIG handles C strings here.
So, your wrapped myfunction then is used as shown above and has the following signature in python:
myfunction(infile, maskfile, check) -> (out, median, msg)
EDIT:
The SWIG docs about carrays.i state:
Note: %array_functions() and %array_class() should not be used with types of char or char *.
I think your code is not creating correctly NULL-terminated C char *, so perhaps this could be causing the segfaults.
I am not learn SWIG very deeply.But I try give you some suggestions.
1.
If your program modifies the input parameter or uses it to return data, consider using the cstring.i library file described in the SWIG Library chapter.
Data is copied into a new Python string and returned.
If your program needs to work with binary data, you can use a typemap to expand a Python string into a pointer/length argument pair. As luck would have it, just such a typemap is already defined. Just do this:
%apply (char *STRING, int LENGTH) { (char *data, int size) };
...
int parity(char *data, int size, int initial);
Python:
parity("e\x09ffss\x00\x00\x01\nx", 0)
If you need to return binary data, you might use the cstring.i library file. The cdata.i library can also be used to extra binary data from arbitrary pointers.
2.I think "pre-initialising" maybe necessary.
I'm new to SWIG and if my question is documented, feel free to just post the link and I'll read through it.
I have a C function that takes the form:
int myFunc(char *output, const char *input)
I generated the Python wrapper, and I tried calling this function (in Python) with:
m=""
n="valid input string"
myFunc(m,n)
This simply prints the (int) return code, and m is still "". What am I doing wrong?
Thanks!
From the SWIG manual section 8.3.4:
If your C function is declared like this:
int myFunc(char *myOutput, const char *myInput);
Then you can use the following SWIG interface syntax:
%include "cstring.i"
%cstring_bounded_output(char *myOutput, 1024);
int myFunc(char *myOutput, const char *myInput);
This should result in a Python wrapper function taking a single string argument (myInput) and returning a tuple of an integer (the C function's return value) and a string (myOutput). Memory for the string will be allocated by SWIG and be 1024 bytes in length, in this example.
OK, it is not recommended practice (see my comment), and YMMV, but this seems to work for me with Python 2.4:
#pre-allocate enough space for m
m = "\x00"*100
n = "valid input string"
myFunc(m,n)
# now m.rstrip("\x00") has what you want
I am trying to write a Cython extension to CPython to wrap the mcrypt library, so that I can use it with Python 3. However, I am running into a problem where I segfault while trying to use one of the mcrypt APIs.
The code that is failing is:
def _real_encrypt(self, source):
src_len = len(source)
cdef char* ciphertext = source
cmc.mcrypt_generic(self._mcStream, <void *>ciphertext, src_len)
retval = source[:src_len]
return retval
Now, the way I understand the Cython documentation, the assignment on line 3 should copy the contents of the buffer (a object in Python 3) to the C string pointer. I would figure that this would also mean that it would allocate the memory, but when I made this modification:
def _real_encrypt(self, source):
src_len = len(source)
cdef char* ciphertext = <char *>malloc(src_len)
ciphertext = source
cmc.mcrypt_generic(self._mcStream, <void *>ciphertext, src_len)
retval = source[:src_len]
return retval
it still crashed with a segfault. It's crashing inside of mcrypt_generic, but when I use plain C code I am able to make it work just fine, so there has to be something that I am not quite understanding about how Cython is working with C data here.
Thanks for any help!
ETA: The problem was a bug on my part. I was working on this after being awake for far too many hours (isn't that something we've all done at some point?) and missed something stupid. The code that I now have, which works, is:
def _real_encrypt(self, source):
src_len = len(source)
cdef char *ciphertext = <char *>malloc(src_len)
cmc.strncpy(ciphertext, source, src_len)
cmc.mcrypt_generic_init(self._mcStream, <void *>self._key,
len(self._key), NULL)
cmc.mcrypt_generic(self._mcStream, <void *>ciphertext,
src_len)
retval = ciphertext[:src_len]
cmc.mcrypt_generic_deinit(self._mcStream)
return retval
It's probably not the most efficient code in the world, as it makes a copy to do the encryption and then a second copy to the return value. I'm not sure if it is possible to avoid that, though, since I'm not sure if it is possible to take a newly-allocated buffer and return it to Python in-place as a bytestring. But now that I have a working function, I'm going to implement a block-by-block method as well, so that one can provide an iterable of blocks for encryption or decryption, and be able to do it without having the entire source and destination all in memory all at once---that way, it'd be possible to encrypt/decrypt huge files without having to worry about holding up to three copies of it in memory at any one point...
Thanks for the help, everyone!
The first one is pointing the char* at the Python string. The second allocates memory, but then re-points the pointer to the Python string and ignores the newly allocated memory. You should be invoking the C library function strcpy from Cython, presumably; but I don't know the details.
A few comments on your code to help improve it, IMHO. There are functions provided by the python C API that do exactly what you need to do, and make sure everything conforms to the Python way of doing things. It will handle embedded NULL's without a problem.
Rather than calling malloc directly, change this:
cdef char *ciphertext = <char *>malloc(src_len)
to
cdef str retval = PyString_FromStringAndSize(PyString_AsString(source), <Py_ssize_t>src_len)
cdef char *ciphertext = PyString_AsString(retval)
The above lines will create a brand new Python str object initialized to the contents of source. The second line points ciphertext to retval's internal char * buffer without copying. Whatever modifies ciphertext will modify retval. Since retval is a brand new Python str, it can be modified by C code before being returned from _real_encrypt.
See the Python C/API docs on the above functions for more details, here and here.
The net effect saves you a copy. The whole code would be something like:
cdef extern from "Python.h":
object PyString_FromStringAndSize(char *, Py_ssize_t)
char *PyString_AsString(object)
def _real_encrypt(self, source):
src_len = len(source)
cdef str retval = PyString_FromStringAndSize(PyString_AsString(source), <Py_ssize_t>src_len)
cdef char *ciphertext = PyString_AsString(retval)
cmc.mcrypt_generic_init(self._mcStream, <void *>self._key,
len(self._key), NULL)
cmc.mcrypt_generic(self._mcStream, <void *>ciphertext,
src_len)
# since the above initialized ciphertext, the retval str is also correctly initialized, too.
cmc.mcrypt_generic_deinit(self._mcStream)
return retval
The approach I've used (with Python 2.x) is to declare the string type parameters in the function signature so that the Cython code does all conversions and type checking automatically:
def _real_encrypt(self,char* src):
...
I'd like to use some existing C++ code, NvTriStrip, in a Python tool.
SWIG easily handles the functions with simple parameters, but the main function, GenerateStrips, is much more complicated.
What do I need to put in the SWIG interface file to indicate that primGroups is really an output parameter and that it must be cleaned up with delete[]?
///////////////////////////////////////////////////////////////////////////
// GenerateStrips()
//
// in_indices: input index list, the indices you would use to render
// in_numIndices: number of entries in in_indices
// primGroups: array of optimized/stripified PrimitiveGroups
// numGroups: number of groups returned
//
// Be sure to call delete[] on the returned primGroups to avoid leaking mem
//
bool GenerateStrips( const unsigned short* in_indices,
const unsigned int in_numIndices,
PrimitiveGroup** primGroups,
unsigned short* numGroups,
bool validateEnabled = false );
FYI, here is the PrimitiveGroup declaration:
enum PrimType
{
PT_LIST,
PT_STRIP,
PT_FAN
};
struct PrimitiveGroup
{
PrimType type;
unsigned int numIndices;
unsigned short* indices;
PrimitiveGroup() : type(PT_STRIP), numIndices(0), indices(NULL) {}
~PrimitiveGroup()
{
if(indices)
delete[] indices;
indices = NULL;
}
};
Have you looked at the documentation of SWIG regarding their "cpointer.i" and "carray.i" libraries? They're found here. That's how you have to manipulate things unless you want to create your own utility libraries to accompany the wrapped code. Here's the link to the Python handling of pointers with SWIG.
Onto your question on getting it to recognize input versus output. They've got another section in the documentation here, that describes exactly that. You lable things OUTPUT in the *.i file. So in your case you'd write:
%inline{
extern bool GenerateStrips( const unsigned short* in_dices,
const unsigned short* in_numIndices,
PrimitiveGroup** OUTPUT,
unsigned short* numGroups,
bool validated );
%}
which gives you a function that returns both the bool and the PrimitiveGroup* array as a tuple.
Does that help?
It's actually so easy to make python bindings for things directly that I don't know why people bother with confusing wrapper stuff like SWIG.
Just use Py_BuildValue once per element of the outer array, producing one tuple per row. Store those tuples in a C array. Then Call PyList_New and PyList_SetSlice to generate a list of tuples, and return the list pointer from your C function.
I don't know how to do it with SWIG, but you might want to consider moving to a more modern binding system like Pyrex or Cython.
For example, Pyrex gives you access to C++ delete for cases like this. Here's an excerpt from the documentation:
Disposal
The del statement can be applied to a pointer to a C++ struct
to deallocate it. This is equivalent to delete in C++.
cdef Shrubbery *big_sh
big_sh = new Shrubbery(42.0)
display_in_garden_show(big_sh)
del big_sh
http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/Manual/using_with_c++.html