Calling C from Python: passing list of numpy pointers - python

I have a variable number of numpy arrays, which I'd like to pass to a C function. I managed to pass each individual array (using <ndarray>.ctypes.data_as(c_void_p)), but the number of array may vary a lot.
I thought I could pass all of these "pointers" in a list and use the PyList_GetItem() function in the C code. It works like a charm, except that the values of all elements are not the pointers I usually get when they are passed as function arguments.
Though, if I have :
from numpy import array
from ctypes import py_object
a1 = array([1., 2., 3.8])
a2 = array([222.3, 33.5])
values = [a1, a2]
my_cfunc(py_object(values), c_long(len(values)))
And my C code looks like :
void my_cfunc(PyObject *values)
{
int i, n;
n = PyObject_Length(values)
for(i = 0; i < n; i++)
{
unsigned long long *pointer;
pointer = (unsigned long long *)(PyList_GetItem(values, i);
printf("value 0 : %f\n", *pointer);
}
}
The printed value are all 0.0000
I have tried a lot of different solutions, using ctypes.byref(), ctypes.pointer(), etc. But I can't seem to be able to retrieve the real pointer values. I even have the impression the values converted by c_void_p() are truncated to 32 bits...
While there are many documentations about passing numpy pointers to C, I haven't seen anything about c_types within Python list (I admit this may seem strange...).
Any clue ?

After a few hours spent reading many pages of documentation and digging in numpy include files, I've finally managed to understand exactly how it works. Since I've spent a great amount of time searching for these exact explanations, I'm providing the following text as a way to avoid anyone to waste its time.
I repeat the question :
How to transfer a list of numpy arrays, from Python to C
(I also assume you know how to compile, link and import your C module in Python)
Passing a Numpy array from Python to C is rather simple, as long as it's going to be passed as an argument in a C function. You just need to do something like this in Python
from numpy import array
from ctypes import c_long
values = array([1.0, 2.2, 3.3, 4.4, 5.5])
my_c_func(values.ctypes.data_as(c_void_p), c_long(values.size))
And the C code could look like :
void my_c_func(double *value, long size)
{
int i;
for (i = 0; i < size; i++)
printf("%ld : %.10f\n", i, values[i]);
}
That's simple... but what if I have a variable number of arrays ? Of course, I could use the techniques which parses the function's argument list (many examples in Stackoverflow), but I'd like to do something different.
I'd like to store all my arrays in a list and pass this list to the C function, and let the C code handle all the arrays.
In fact, it's extremely simple, easy et coherent... once you understand how it's done ! There is simply one very simple fact to remember :
Any member of a list/tuple/dictionary is a Python object... on the C side of the code !
You can't expect to directly pass a pointer as I initially, and wrongly, thought. Once said, it sounds very simple :-) Though, let's write some Python code :
from numpy import array
my_list = (array([1.0, 2.2, 3.3, 4.4, 5.5]),
array([2.9, 3.8. 4.7, 5.6]))
my_c_func(py_object(my_list))
Well, you don't need to change anything in the list, but you need to specify that you are passing the list as a PyObject argument.
And here is the how all this is being accessed in C.
void my_c_func(PyObject *list)
{
int i, n_arrays;
// Get the number of elements in the list
n_arrays = PyObject_Length(list);
for (i = 0; i LT n_arrays; i++)
{
PyArrayObject *elem;
double *pd;
elem = PyList_GetItem(list,
i);
pd = PyArray_DATA(elem);
printf("Value 0 : %.10f\n", *pd);
}
}
Explanation :
The list is received as a pointer to a PyObject
We get the number of array from the list by using the PyObject_Length() function.
PyList_GetItem() always return a PyObject (in fact a void *)
We retrieve the pointer to the array of data by using the PyArray_DATA() macro.
Normally, PyList_GetItem() returns a PyObject *, but, if you look in the Python.h and ndarraytypes.h, you'll find that they are both defined as (I've expanded the macros !):
typedef struct _object {
Py_ssize_t ob_refcnt;
struct _typeobject *ob_type;
} PyObject;
And the PyArrayObject... is exactly the same. Though, it's perfectly interchangeable at this level. The content of ob_type is accessible for both objects and contain everything which is needed to manipulate any generic Python object. I admit that I've used one of its member during my investigations. The struct member tp_name is the string containing the name of the object... in clear text; and believe me, it helped ! This is how I discovered what each list element was containing.
While these structures don't contain anything else, how is it that we can access the pointer of this ndarray object ? Simply using object macros... which use an extended structure, allowing the compiler to know how to access the additional object's elements, behind the ob_type pointer. The PyArray_DATA() macro is defined as :
#define PyArray_DATA(obj) ((void *)((PyArrayObject_fields *)(obj))->data)
There, it's casting the PyArayObject * as a PyArrayObject_fields * and this latest structure is simply (simplified and macros expanded !) :
typedef struct tagPyArrayObject_fields {
Py_ssize_t ob_refcnt;
struct _typeobject *ob_type;
char *data;
int nd;
npy_intp *dimensions;
npy_intp *strides;
PyObject *base;
PyArray_Descr *descr;
int flags;
PyObject *weakreflist;
} PyArrayObject_fields;
As you can see, the first two element of the structure are the same as a PyObject and PyArrayObject, but additional elements can be addressed using this definition. It is tempting to directly access these elements, but it's a very bad and dangerous practice which is more than strongly discouraged. You must rather use the macros and don't bother with the details and elements in all these structures. I just thought you might be interested by some internals.
Note that all PyArrayObject macros are documented in http://docs.scipy.org/doc/numpy/reference/c-api.array.html
For instance, the size of a PyArrayObject can be obtained using the macro PyArray_SIZE(PyArrayObject *)
Finally, it's very simple and logical, once you know it :-)

Related

Python C Api transfer a PyObject * into c array

I used python c api and wish to get an array back from python. I returned a python array from the python side and want to transfer the PyObject* result into a c array so I can use it.
Is there anyway I can do that?
Side questions: In what circumstance trying to return an element in the array like
return arr[3]
will cause the PyObject_callobject return NULL?
return arr
does give me something
In Python an "array" is a list datatype. Look at the PyList_* functions in the C API. You can find this reference here for 2.7 and here for 3.5. The functions of interest are:
Py_ssize_t PyList_Size(PyObject *list) to get the number of list members
and
PyObject* PyList_GetItem(PyObject *list, Py_ssize_t index) to get a reference to the member at that index.
A simple use of malloc or similar allocator and a for loop should do the trick.
Also note that only borrowed references are involved, so no reference count maintenance of these objects via void Py_INCREF(PyObject *o) and void Py_DECREF(PyObject *o) is necessary.

Convert C array of pointers to Python array of structures

I am writing a Python app that makes use of PulseAudio API. The implementation is heavily using callbacks written in Python and invoked by PulseAudio's C code.
The most information is passed into the callback by a specific structure, for instance pa_sink_info, which is defined in C as follows:
typedef struct pa_sink_info {
const char *name;
uint32_t index;
const char *description;
pa_sample_spec sample_spec;
pa_channel_map channel_map;
uint32_t owner_module;
pa_cvolume volume;
int mute;
uint32_t monitor_source;
const char *monitor_source_name;
pa_usec_t latency;
const char *driver;
pa_sink_flags_t flags;
pa_proplist *proplist;
pa_usec_t configured_latency;
pa_volume_t base_volume;
pa_sink_state_t state;
uint32_t n_volume_steps;
uint32_t card;
uint32_t n_ports;
pa_sink_port_info** ports;
pa_sink_port_info* active_port;
uint8_t n_formats;
pa_format_info **formats;
} pa_sink_info;
From this structure it's very easy to get scalar values, eg.:
self.some_proc(
struct.contents.index,
struct.contents.name,
struct.contents.description)
But I have a difficulty dealing with ports and active_port, which in Python are described as:
('n_ports', uint32_t),
('ports', POINTER(POINTER(pa_sink_port_info))),
('active_port', POINTER(pa_sink_port_info)),
Here n_ports specifies number of elements in ports, which is a pointer to array of pointers to structures of type pa_sink_port_info. Actually, I don't even know how I can convert these to Python types at all.
What is the most efficient way of converting ports into Python dictionary containing pa_sink_port_info's?
Solving this problem required careful reading of Python's ctypes reference. Once the mechanism of ctypes type translation implementation was clear, it's not so difficult to get to the desired values.
The main idea about pointers is that you use their contents attribute to get to the data the pointer points to. Another useful thing to know is that pointers can be indexed like arrays (it's not validated by the interpreter, so it's your own responsibility to make sure it is indeed an array).
For this particular PulseAudio example, we can process the ports structure member (which is a pointer to array of pointers) as follows:
port_list = []
if struct.contents.ports:
i = 0
while True:
port_ptr = struct.contents.ports[i]
# NULL pointer terminates the array
if port_ptr:
port_struct = port_ptr.contents
port_list.append(port_struct.name)
i += 1
else:
break

C array to PyArray

I'm writing a Python C-Extension without using Cython.
I want to allocate a double array in C, use it in an internal function (that happens to be in Fortran) and return it. I point out that the C-Fortran interface works perfectly in C.
static PyObject *
Py_drecur(PyObject *self, PyObject *args)
{
// INPUT
int n;
int ipoly;
double al;
double be;
if (!PyArg_ParseTuple(args, "iidd", &n, &ipoly, &al, &be))
return NULL;
// OUTPUT
int nd = 1;
npy_intp dims[] = {n};
double a[n];
double b[n];
int ierr;
drecur_(n, ipoly, al, be, a, b, ierr);
// Create PyArray
PyObject* alpha = PyArray_SimpleNewFromData(nd, dims, NPY_DOUBLE, a);
PyObject* beta = PyArray_SimpleNewFromData(nd, dims, NPY_DOUBLE, b);
Py_INCREF(alpha);
Py_INCREF(beta);
return Py_BuildValue("OO", alpha, beta);
}
I debugged this code and I get a Segmentation fault when I try to create alpha out of a. Up to there everything works fine. The function drecur_ works and I get the same problem if it is removed.
Now, what's the standard way of defining a PyArray around C data? I found documentation but no good example. Also, what about memory leakage? Is it correct to INCREF before return so that the instance of alpha and beta are preserved? What about the deallocation when they are not needed anymore?
EDIT
I finally got it right with the approach found in NumPy cookbook.
static PyObject *
Py_drecur(PyObject *self, PyObject *args)
{
// INPUT
int n;
int ipoly;
double al;
double be;
double *a, *b;
PyArrayObject *alpha, *beta;
if (!PyArg_ParseTuple(args, "iidd", &n, &ipoly, &al, &be))
return NULL;
// OUTPUT
int nd = 1;
int dims[2];
dims[0] = n;
alpha = (PyArrayObject*) PyArray_FromDims(nd, dims, NPY_DOUBLE);
beta = (PyArrayObject*) PyArray_FromDims(nd, dims, NPY_DOUBLE);
a = pyvector_to_Carrayptrs(alpha);
b = pyvector_to_Carrayptrs(beta);
int ierr;
drecur_(n, ipoly, al, be, a, b, ierr);
return Py_BuildValue("OO", alpha, beta);
}
double *pyvector_to_Carrayptrs(PyArrayObject *arrayin) {
int n=arrayin->dimensions[0];
return (double *) arrayin->data; /* pointer to arrayin data as double */
}
Feel free to comment on this and thanks for the answers.
So the first things that looks suspicious, is that your array a and b are in the local scope of the function. That means after the return you will get an illegal memory access.
So you should declare the arrays with
double *a = malloc(n*sizeof(double));
Then you need to make sure that the memory is later freed by the object you have created.
See this quote of the documentation:
PyObject PyArray_SimpleNewFromData(int nd, npy_intp dims, int typenum, void* data)
Sometimes, you want to wrap memory allocated elsewhere into an ndarray object for downstream use. This routine makes it straightforward to do that. The first three arguments are the same as in PyArray_SimpleNew, the final argument is a pointer to a block of contiguous memory that the ndarray should use as it’s data-buffer which will be interpreted in C-style contiguous fashion. A new reference to an ndarray is returned, but the ndarray will not own its data. When this ndarray is deallocated, the pointer will not be freed.
You should ensure that the provided memory is not freed while the returned array is in existence. The easiest way to handle this is if data comes from another reference-counted Python object. The reference count on this object should be increased after the pointer is passed in, and the base member of the returned ndarray should point to the Python object that owns the data. Then, when the ndarray is deallocated, the base-member will be DECREF’d appropriately. If you want the memory to be freed as soon as the ndarray is deallocated then simply set the OWNDATA flag on the returned ndarray.
For your second question the Py_INCREF(alpha); is generally only necessary if you intend to keep the reference in a global variable or a class member.
But since you are only wrapping a function you don't have to do it.
Sadly it could be that the function PyArray_SimpleNewFromData does not set the reference counter to 1, if that would be the case you would have to increase it to 1. I hope that was understandable ;).
One problem might be that your arrays (a,b) have to last at least as long as the numpy-array that contains it. You've created your arrays in local scope so they will be destroyed when you leave the method.
Try to have python allocating the array (e.g. using PyArray_SimpleNew), copy your content into it and pass it a pointer. You might also want to use boost::python for taking care of these details, if building against boost is an option.

Swig, returning an array of doubles

I know, there are often many ways to solve certain problems. But here I know which way I want to have it, but I am unable to make it work with Python and SWIG...
I have a C-function, which returns me an array of double values:
double *my(int x)
{
double a,b,*buf;
buf = malloc (x * sizeof(double));
a=3.14;
b=2.7;
buf[0]=a;
buf[1]=b;
return buf;
}
Here, I definitively want to have the array as a return value. Not, as in many examples a 'void' function, which writes into an input array. Now, I would like to get a SWIG-python wrapper, which might be used as:
>>> import example
>>> print example.my(7)
[3.14,2.7]
Whatever I do, I have some conceptual problems here - I always get s.th. like <Swig Object of type 'double *' at 0xFABCABA12>
I tried to define some typemaps in my swg file:
%typemap(out) double [ANY] {
int i;
$result = PyList_New($1_dim0);
for (i = 0; i < $1_dim0; i++) {
PyObject *o = PyFloat_FromDouble((double) $1[i]);
PyList_SetItem($result,i,o);
}
}
But still I am unable to get out my results as required. Does anyone have a simple code-example to achieve this task?
The first problem is that your typemap doesn't match, you'll need a %typemap(out) double * { ... } since your function returns a pointer to double and not a double array.
If your list is of fixed size (i.e. an integer literal) as in the example you gave (which I assume is not what you want) you could simply change the typemap as I gave above and exchange $1_dim0 for the fixed size.
Otherwise your problem is that your %typemap(out) double * cannot possibly know the value of your parameter int x. You could return a struct that carries both the pointer and the size. Then you can easily define a typemap to turn that into a list (or a NumPy array, see also my response to Wrap C struct with array member for access in python: SWIG? cython? ctypes?).
Incidentally it's not possible to return a fixed sized array in C (see also this answer: Declaring a C function to return an array), so a %typemap(out) double [ANY] { ... } can never match.
I suffered similar problem and solved it in following way.
// example.i
%module example
%include "carrays.i"
%array_class(float, floatArray);
float * FloatArray(int N);
float SumFloats(float * f);
# ipython
> a = example.floatArray(23) # array generated by swig's class constructor
> a
<example.floatArray; proxy of <Swig Object of type 'floatArray *' at 0x2e74180> >
> a[0]
-2.6762280573445764e-37 # unfortunately it is created uninitialized..
> b = example.FloatArray(23) # array generated by function
> b
<Swig Object of type 'float *' at 0x2e6ad80>
> b[0]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
# .....
TypeError: 'SwigPyObject' object is not subscriptable
> #But there is a way to access b!!
> p = example.floatArray_frompointer(b) # i found this function by example. and twice tab
> p
<example.floatArray; proxy of <Swig Object of type 'floatArray *' at 0x2e66750> >
> p[0]
0.0
> p[0] = 42
> p[0]
42.0
Fortunately, all of these types (float *, floatArray *, and proxy of floatArray *) may be successfully passed to C++ function (such as SumFloats).
You might want to look at the documentation around carray.i:
%include "carrays.i"
%array_class(int, intArray);
http://www.swig.org/Doc2.0/Python.html#Python_nn48
If you don't mind pulling in the numpy python module in your python code, you can do the following:
In the SWIG interface file:
%{
#define SWIG_FILE_WITH_INIT
%}
%include "numpy.i"
%init %{
import_array();
%}
%apply(float ARGOUT_ARRAY1[ANY]) {(float outarray1d[9])};
void rf(float outarray1d[9]);
Only the last two lines are specific to this example, the first stuff is default for numpy.i (see the numpy.i documentation elsewhere: http://docs.scipy.org/doc/numpy/reference/swig.interface-file.html).
In the C file (can also be inlined in .i file):
void rf(float outarray1d[9]) {
float _internal_rf[9];
/* ... */
memcpy(outarray1d, _internal_rf, 9*sizeof(float));
}
Then you have a function which you can call from python as
import mymodule
a = mymodule.rf()
# a is a numpy array of float32's, with len 9
Now, if you don't want to be forced to pull in the numpy module in your python project, then I suggest you check numpy.i to see how they do the %typemap trick -- as I understand it, it's done with SWIG typemaps and not inherently tied to numpy - should be possible to do the same trick with tuples or lists as return value.
I don't know how much C you know - so apologies if I'm teaching you to suck eggs here...
There is no Array class in plain-ole C. An array is always a pointer to a piece of memory, not a "thing" in itself and therefore cannot just be printed-out by itself.
In this case - your "buf" is of type "double *". AFAICRem, if you want to print out the actual values stored at "the memory pointed-to by buf" you have to deallocate each one eg (in pseudocode): for i = 0 to buflength print buf[i]

In Python, how to use a C++ function which returns an allocated array of structs via a ** parameter?

I'd like to use some existing C++ code, NvTriStrip, in a Python tool.
SWIG easily handles the functions with simple parameters, but the main function, GenerateStrips, is much more complicated.
What do I need to put in the SWIG interface file to indicate that primGroups is really an output parameter and that it must be cleaned up with delete[]?
///////////////////////////////////////////////////////////////////////////
// GenerateStrips()
//
// in_indices: input index list, the indices you would use to render
// in_numIndices: number of entries in in_indices
// primGroups: array of optimized/stripified PrimitiveGroups
// numGroups: number of groups returned
//
// Be sure to call delete[] on the returned primGroups to avoid leaking mem
//
bool GenerateStrips( const unsigned short* in_indices,
const unsigned int in_numIndices,
PrimitiveGroup** primGroups,
unsigned short* numGroups,
bool validateEnabled = false );
FYI, here is the PrimitiveGroup declaration:
enum PrimType
{
PT_LIST,
PT_STRIP,
PT_FAN
};
struct PrimitiveGroup
{
PrimType type;
unsigned int numIndices;
unsigned short* indices;
PrimitiveGroup() : type(PT_STRIP), numIndices(0), indices(NULL) {}
~PrimitiveGroup()
{
if(indices)
delete[] indices;
indices = NULL;
}
};
Have you looked at the documentation of SWIG regarding their "cpointer.i" and "carray.i" libraries? They're found here. That's how you have to manipulate things unless you want to create your own utility libraries to accompany the wrapped code. Here's the link to the Python handling of pointers with SWIG.
Onto your question on getting it to recognize input versus output. They've got another section in the documentation here, that describes exactly that. You lable things OUTPUT in the *.i file. So in your case you'd write:
%inline{
extern bool GenerateStrips( const unsigned short* in_dices,
const unsigned short* in_numIndices,
PrimitiveGroup** OUTPUT,
unsigned short* numGroups,
bool validated );
%}
which gives you a function that returns both the bool and the PrimitiveGroup* array as a tuple.
Does that help?
It's actually so easy to make python bindings for things directly that I don't know why people bother with confusing wrapper stuff like SWIG.
Just use Py_BuildValue once per element of the outer array, producing one tuple per row. Store those tuples in a C array. Then Call PyList_New and PyList_SetSlice to generate a list of tuples, and return the list pointer from your C function.
I don't know how to do it with SWIG, but you might want to consider moving to a more modern binding system like Pyrex or Cython.
For example, Pyrex gives you access to C++ delete for cases like this. Here's an excerpt from the documentation:
Disposal
The del statement can be applied to a pointer to a C++ struct
to deallocate it. This is equivalent to delete in C++.
cdef Shrubbery *big_sh
big_sh = new Shrubbery(42.0)
display_in_garden_show(big_sh)
del big_sh
http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/Manual/using_with_c++.html

Categories

Resources