I'd like to call my C function from Python, in order to manipulate some NumPy arrays. The function is like this:
void c_func(int *in_array, int n, int *out_array);
where the results are supplied in out_array, whose size I know in advance (it is not my function, actually). In the corresponding .pyx file I tried the following, in order to be able to pass the input to the function from a NumPy array and store the result in a NumPy array:
def pyfunc(np.ndarray[np.int32_t, ndim=1] in_array):
    n = len(in_array)
    out_array = np.zeros((512,), dtype = np.int32)
    mymodule.c_func(<int *> in_array.data, n, <int *> out_array.data)
    return out_array
But I get
"Python objects cannot be cast to pointers of primitive types" error for the output assignment. How do I accomplish this?
(If I require that the Python caller allocates the proper output array, then I can do
def pyfunc(np.ndarray[np.int32_t, ndim=1] in_array, np.ndarray[np.int32_t, ndim=1] out_array):
    n = len(in_array)
    mymodule.c_func(<int *> in_array.data, n, <int*> out_array.data)
But can I do this in a way that the caller doesn't have to pre-allocate the appropriately sized output array?)
You should add cdef np.ndarray before the out_array assignment, so that Cython knows out_array is an ndarray and can access its .data pointer:
def pyfunc(np.ndarray[np.int32_t, ndim=1] in_array):
    cdef np.ndarray out_array = np.zeros((512,), dtype = np.int32)
    n = len(in_array)
    mymodule.c_func(<int *> in_array.data, n, <int *> out_array.data)
    return out_array
Here is an example how to manipulate NumPy arrays using code written in C/C++ through ctypes.
I wrote a small function in C that takes the squares of the numbers in a first array and writes the results to a second array. The number of elements is given by a third parameter. This code is compiled as a shared object.
squares.c compiled to squares.so:
void square(double* pin, double* pout, int n) {
    for (int i=0; i<n; ++i) {
        pout[i] = pin[i] * pin[i];
    }
}
In python, you just load the library using ctypes and call the function. The array pointers are obtained from the NumPy ctypes interface.
import numpy as np
import ctypes
n = 5
a = np.arange(n, dtype=np.double)
b = np.zeros(n, dtype=np.double)
# load the shared object built from squares.c
square = ctypes.cdll.LoadLibrary("./squares.so")
aptr = a.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
bptr = b.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
square.square(aptr, bptr, n)
print(b)
This will work for any c-library, you just have to know which argument types to pass, possibly rebuilding c-structs in python using ctypes.
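As a minimal sketch of what rebuilding a C struct with ctypes might look like, assume a hypothetical C declaration `struct Point { double x; double y; };` (it is not part of the library above). The ctypes mirror lists the fields in the same order and with matching types:

```python
import ctypes

# Mirror of a hypothetical C struct: struct Point { double x; double y; };
class Point(ctypes.Structure):
    _fields_ = [("x", ctypes.c_double),
                ("y", ctypes.c_double)]

p = Point(1.5, 2.5)

# byref(p) is what you would pass to a C function expecting `struct Point *`.
p_ref = ctypes.byref(p)

# A real pointer object lets Python read and write the same memory.
p_ptr = ctypes.pointer(p)
p_ptr.contents.x = 3.0

print(p.x, p.y, ctypes.sizeof(Point))
```

The field order and types have to match the C header exactly, otherwise the C side reads the wrong bytes.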
I'm currently learning about ctypes. My goal is to generate a NumPy
array A in Python from 0 to 4*pi in 500 steps. That array is passed to
C code which calculates the tangent of those values. The C code also
passes those values back to a NumPy array B in Python.
Yesterday I tried simply to convert one value from Python to C and
(after some help) succeeded. Today I am trying to pass a whole array,
not a single value.
I think it's a good idea to add another function to the C library to
process the array. The new function should loop over A, pass each value
to the function tan1(), and store the result in array B.
I have two issues:
Writing the function that processes the NumPy array A
Passing the NumPy array between the Python and C code
I read the following info:
https://nenadmarkus.com/p/numpy-to-native/
How to use NumPy array with ctypes?
Helpful, but I still don't know how to solve my problem.
C code (Only the piece that seems relevant):
double tan1(f) double f;
{
    return sin1(f)/cos1(f);
}
void loop(double A, int n);
{
    double *B;
    B = (double*) malloc(n * sizeof(double));
    for(i=0; i<= n, i++)
    {
        B[i] = tan1(A[i])
    }
}
Python code:
import numpy as np
import ctypes
A = np.array(np.linspace(0,4*np.pi,500), dtype=np.float64)
testlib = ctypes.CDLL('./testlib.so')
testlib.loop.argtypes = ctypes.c_double,
testlib.loop.restype = ctypes.c_double
#print(testlib.tan1(3))
I'm aware that ctypes.c_double is wrong in this context, but that is what I had in the single-value version, and I don't yet know what to substitute for it.
Could I please get some feedback on how to achieve this goal?
You need to return the dynamically allocated memory, e.g. change your C code to something like:
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
double tan1(double f) {
    return sin(f)/cos(f);
}
double *loop(double *arr, int n) {
    double *b = malloc(n * sizeof(double));
    for(int i = 0; i < n; i++) {
        b[i] = tan(arr[i]);
    }
    return b;
}
void freeArray(double *b) {
    free(b);
}
On the Python side you have to declare the parameter and return types. As others mentioned in the comments, you should also free the dynamically allocated memory. Note that on the C side, arrays always decay into pointers, so you need an additional parameter that gives the number of elements in the array.
Also, if you return a pointer to double to the Python side, you must specify the size of the array. With np.frombuffer you can work with the data without making a copy of it.
import numpy as np
from ctypes import CDLL, POINTER, c_double, c_int
testlib = CDLL('./testlib.so')
n = 500
dtype = np.float64
input_array = np.array(np.linspace(0, 4 * np.pi, n), dtype=dtype)
input_ptr = input_array.ctypes.data_as(POINTER(c_double))
testlib.loop.argtypes = (POINTER(c_double), c_int)
testlib.loop.restype = POINTER(c_double * n)
testlib.freeArray.argtypes = POINTER(c_double * n),
testlib.freeArray.restype = None
result_ptr = testlib.loop(input_ptr, n)
result_array = np.frombuffer(result_ptr.contents)
# ...do some processing
for value in result_array:
    print(value)
# free buffer; result_array must not be used afterwards
testlib.freeArray(result_ptr)
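If the output size is known up front, an alternative is to let NumPy allocate the result and pass both pointers to C, so nothing has to be malloc'ed or freed. The sketch below assumes a hypothetical C signature `void loop_into(const double *a, double *b, int n)`, which is not in the code above. So that it runs without compiling anything, a ctypes callback written in Python stands in for the compiled function:

```python
import ctypes
import math
import numpy as np

DoublePtr = ctypes.POINTER(ctypes.c_double)

# Stand-in for a compiled `void loop_into(const double *a, double *b, int n)`.
# With a real library you would instead set
#   testlib.loop_into.argtypes = (DoublePtr, DoublePtr, ctypes.c_int)
#   testlib.loop_into.restype = None
@ctypes.CFUNCTYPE(None, DoublePtr, DoublePtr, ctypes.c_int)
def loop_into(a, b, n):
    for i in range(n):
        b[i] = math.tan(a[i])

n = 5
input_array = np.linspace(0.0, 1.0, n)         # float64, C-contiguous
output_array = np.empty(n, dtype=np.float64)   # memory owned by NumPy

loop_into(input_array.ctypes.data_as(DoublePtr),
          output_array.ctypes.data_as(DoublePtr),
          n)
print(output_array)
```

Because NumPy owns output_array, its lifetime is handled by Python's garbage collector and no freeArray call is needed.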
Is there a way to use AES-NI instructions within Cython code?
Closest I could find is how someone accessed SIMD instructions:
https://groups.google.com/forum/#!msg/cython-users/nTnyI7A6sMc/a6_GnOOsLuQJ
AES-NI in Python thread was not answered:
Python support for AES-NI
You should be able to just define the intrinsics as if they're normal C functions in Cython. Something like
cdef extern from "emmintrin.h": # I'm going off the microsoft documentation for where the headers are
    # define the datatype as an opaque type
    ctypedef struct __m128i:
        pass
    __m128i _mm_set_epi32 (int i3, int i2, int i1, int i0)
cdef extern from "wmmintrin.h":
    __m128i _mm_aesdec_si128(__m128i v,__m128i rkey)
# then in some Cython function
def f():
    cdef __m128i v = _mm_set_epi32(1,2,3,4)
    cdef __m128i key = _mm_set_epi32(5,6,7,8)
    cdef __m128i result = _mm_aesdec_si128(v,key)
As for the question of how to apply this over a bytes array: first, get a char* to the bytes array, then iterate over it with range (being careful not to run off the end).
# assuming you already have an __m128i key
cdef __m128i v, result
cdef char* array = python_bytes_array # auto conversion
cdef int i, j
# you NEED to ensure that the byte array has a length divisible by
# 16, otherwise you'll probably get a segmentation fault.
for i in range(0,len(python_bytes_array),16):
    # go over in chunks of 16
    v = _mm_set_epi8(array[i+15],array[i+14],array[i+13],
                     # etc... fill in the rest
                     array[i+1], array[i])
    # result is declared above, since cdef is not allowed inside a loop
    result = _mm_aesdec_si128(v,key)
    # write back to the same place?
    for j in range(16):
        array[i+j] = _mm_extract_epi8(result,j)
I want to write a pure C-style function that takes an array as an argument (a pointer) and does something with it, but I cannot find out how to define an array argument for a cdef function. Here is some toy code I have made.
cdef void test(double[] array ) except? -2:
    cdef int i,n
    i = 0
    n = len(array)
    for i in range(0,n):
        array[i] = array[i]+1.0
def ctest(a):
    n = len(a)
    #Make a C-array on the heap.
    cdef double *v
    v = <double *>malloc(n*sizeof(double))
    #Copy in the python array
    for i in range(n):
        v[i] = float(a[i])
    #Calling the C-function which do something with the array
    test(v)
    #Puttint the changed C-array back into python
    for i in range(n):
        a[i] = v[i]
    free(v)
    return a
The code will not compile. I have searched for how to define C arrays in Cython, but have not found how to do it. The double[] array clearly does not work. I have also tried:
cdef void test(double* array ) except? -2:
I can manage to do the same in pure C, but not in Cython :(
D:\cython-test\ python setup.py build_ext --inplace
Compiling ctest.pyx because it changed.
[1/1] Cythonizing ctest.pyx
Error compiling Cython file:
------------------------------------------------------------
...
from libc.stdlib cimport malloc, free
cdef void test(double[] array):
cdef int i,n
n = len(array)
^
------------------------------------------------------------
ctest.pyx:5:17: Cannot convert 'double *' to Python object
Error compiling Cython file:
------------------------------------------------------------
...
from libc.stdlib cimport malloc, free
cdef void test(double[] array):
cdef int i,n
n = len(array)
for i in range(0,len(array)):
^
------------------------------------------------------------
ctest.pyx:6:30: Cannot convert 'double *' to Python object
Traceback (most recent call last):
  File "setup.py", line 10, in <module>
    ext_modules = cythonize("ctest.pyx"),
  File "C:\Anaconda\lib\site-packages\Cython\Build\Dependencies.py", line 877, in cythonize
    cythonize_one(*args)
  File "C:\Anaconda\lib\site-packages\Cython\Build\Dependencies.py", line 997, in cythonize_one
    raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: ctest.pyx
E:\GD\UD\Software\BendStiffener\curvmom>
UPDATE:
I have updated my code following all the advice and it compiles now :) But my array still does not update. I would expect 5.0 to be added to every entry, but that does not happen.
from libc.stdlib cimport malloc, free
cdef void test(double[] array):
    cdef int i,n
    n = sizeof(array)/sizeof(double)
    for i in range(0,n):
        array[i] = array[i]+5.0
def ctest(a):
    n = len(a)
    #Make a C-array on the heap.
    cdef double* v
    v = <double*>malloc(n*sizeof(double))
    #Copy in the python array
    for i in range(n):
        v[i] = float(a[i])
    #Calling the C-function which do something with the array
    test(v)
    #Puttint the changed C-array back into python
    for i in range(n):
        a[i] = v[i]
    free(v)
    for x in a:
        print x
    return a
Here is a Python test program for testing my code:
import ctest
a = [0,0,0]
ctest.ctest(a)
So there is still something I am doing wrong. Any suggestions?
len() is a python function that works only on python objects. This is why it won't compile.
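For a C array whose size is known at compile time you could replace n=len(array) by n = sizeof(array) / sizeof(double). That does not work for a function parameter, though: there the array decays to a double*, so sizeof(array) is the size of a pointer and n comes out as 1 on a typical 64-bit platform, which is why only part of the array gets updated. Pass the length to the function as an extra argument instead.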
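The difference between the size of an array and the size of a pointer to its elements can be checked from Python with ctypes, which reports the sizes of the C types on the platform it runs on. This is only an illustration of the underlying C rule, not part of the Cython code above:

```python
import ctypes

n = 3
ArrayOfDoubles = ctypes.c_double * n                # like `double v[3]`
PointerToDouble = ctypes.POINTER(ctypes.c_double)   # like `double *v`

array_bytes = ctypes.sizeof(ArrayOfDoubles)
pointer_bytes = ctypes.sizeof(PointerToDouble)
double_bytes = ctypes.sizeof(ctypes.c_double)

# For a real array the division recovers the element count ...
print(array_bytes // double_bytes)
# ... for a pointer it only compares the pointer size with the double size.
print(pointer_bytes // double_bytes)
```

The second number does not depend on how many elements were allocated, so the length has to travel alongside the pointer.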
You might want to take a look at typed memoryviews and the buffer interface. These provide a nice interface to array like data structures like those underlying numpy arrays, but can also be used to work with C arrays. From the documentation:
For example, they can handle C arrays and the Cython array type (Cython arrays).
In your case this might help:
cdef void test(double[:] array):
    ...
The double[:] allows all 1d double arrays to be passed to the function. Those can then be modified. As the [:] defines a memoryview, all changes will be made in the array you created the memoryview on (the variable you passed as the parameter to test). The length is then available as array.shape[0]. A raw pointer such as the malloc'ed v has to be cast with its length first, e.g. <double[:n]> v.
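The write-through behaviour is the same one Python's built-in memoryview shows, since both rely on the buffer protocol. A minimal pure-Python sketch, using the standard-library array module as a stand-in for the buffer (this is not Cython code, and add_five is a made-up name):

```python
from array import array

def add_five(view):
    # Writes go straight into the buffer the view was created on.
    for i in range(len(view)):
        view[i] = view[i] + 5.0

data = array("d", [0.0, 0.0, 0.0])   # contiguous buffer of C doubles
add_five(memoryview(data))
print(list(data))
```

No copy back into data is needed after the call, because the view and data share the same memory.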
I am transferring a double-array from a c-function to a python-function. My code for that is:
C-Code:
double *compute(int size, const double a[])
{
    double* array;
    array = malloc(sizeof(double)*size);
    for (int i=0; i<size; i++)
    {
        array[i] = 3*a[i];
    }
    //printf("Array in compute-function is: \n[");
    //for(int i = 0; i < size; i++)
    //printf("%f, ", array[i]);
    //printf("]\n");
    return array;
}
pyx-code:
cdef class ArrayWrapper:
    cdef void* data_ptr
    cdef int size
    cdef set_data(self, int size, void* data_ptr):
        """ Set the data of the array
        This cannot be done in the constructor as it must recieve C-level
        arguments.
        Parameters:
        -----------
        size: int
            Length of the array.
        data_ptr: void*
            Pointer to the data
        """
        self.data_ptr = data_ptr
        self.size = size
    def __array__(self):
        """ Here we use the __array__ method, that is called when numpy
        tries to get an array from the object."""
        cdef np.npy_intp shape[1]
        shape[0] = <np.npy_intp> self.size
        # Create a 1D array, of length 'size'
        ndarray = np.PyArray_SimpleNewFromData(1, shape,
                                               np.NPY_INT, self.data_ptr)
        return ndarray
    def __dealloc__(self):
        """ Frees the array. This is called by Python when all the
        references to the object are gone. """
        free(<void*>self.data_ptr)
def py_compute(int size, np.ndarray[np.double_t,ndim=1] a):
    """ Python binding of the 'compute' function in 'GNLSE_RHS.c' that does
    not copy the data allocated in C.
    """
    cdef double *array
    cdef np.ndarray ndarray
    # Call the C function
    array = compute(size, <double*> a.data)
    array_wrapper = ArrayWrapper()
    array_wrapper.set_data(size, <void*> array)
    ndarray = np.array(array_wrapper, copy=False)
    # Assign our object to the 'base' of the ndarray object
    ndarray.base = <PyObject*> array_wrapper
    # Increment the reference count, as the above assignement was done in
    # C, and Python does not know that there is this additional reference
    Py_INCREF(array_wrapper)
    return ndarray
python-code:
for i in xrange(10):
    x[i] = i;
a = cython_wrapper.py_compute(10, x)
print a
But my result is
[ 0 0 0 1074266112 0 1075314688 0 1075970048 0 1076363264]
instead of the expected
[ 0. 3. 6. 9. 12. 15. 18. 21. 24. 27.]
Where is my mistake? I assume it has something to do with a problematic pointer transfer, but I am not sure.
The mistake here is that in the line
ndarray = np.PyArray_SimpleNewFromData(1, shape,
np.NPY_INT, self.data_ptr)
you are telling numpy that self.data_ptr is pointing to an array of ints and not one of doubles.
You can fix your code by telling numpy the correct datatype like so:
ndarray = np.PyArray_SimpleNewFromData(1, shape,
np.NPY_DOUBLE, self.data_ptr)
and it should work as expected.
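The numbers in the wrong output can be reproduced in plain NumPy by reinterpreting the bytes of the doubles as 32-bit integers. This is a sketch that assumes a 32-bit int and little-endian layout (as on common x86 platforms), which is why the byte order is spelled out in the dtypes:

```python
import numpy as np

# The doubles that compute() wrote for the first five inputs (3 * i).
doubles = np.array([0.0, 3.0, 6.0, 9.0, 12.0], dtype="<f8")

# Same bytes, read as 32-bit little-endian ints: what NPY_INT asked for.
as_ints = doubles.view("<i4")
# Same bytes, read with the right type again.
as_doubles = as_ints.view("<f8")

print(as_ints)
print(as_doubles)
```

Each double shows up as two ints, so an int array of length 10 only covers the first five doubles, which matches the output in the question.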
In addition to this, you can simplify your wrapper code slightly by not passing in the size of the input array, as it is already contained in the np.ndarray that you pass to py_compute:
def py_compute(np.ndarray[np.double_t,ndim=1] a):
    """ Python binding of the 'compute' function in 'GNLSE_RHS.c' that does
    not copy the data allocated in C.
    """
    cdef double *array
    cdef np.ndarray ndarray
    # typed as a C int, since compute() takes an int size
    cdef int size = a.shape[0]
    # Call the C function
    array = compute(size, &a[0])
    array_wrapper = ArrayWrapper()
    array_wrapper.set_data(size, <void*> array)
    ndarray = np.array(array_wrapper, copy=False)
    # Assign our object to the 'base' of the ndarray object
    ndarray.base = <PyObject*> array_wrapper
    # Increment the reference count, as the above assignement was done in
    # C, and Python does not know that there is this additional reference
    Py_INCREF(array_wrapper)
    return ndarray
I am trying to speed up my Numpy code and decided that I wanted to implement one particular function where my code spent most of the time in C.
I'm actually a rookie in C, but I managed to write the function which normalizes every row in a matrix to sum to 1. I can compile it and I tested it with some data (in C) and it does what I want. At that point I was very proud of myself.
Now I'm trying to call my glorious function from Python where it should accept a 2d-Numpy array.
The various things I've tried are
SWIG
SWIG + numpy.i
ctypes
My function has the prototype
void normalize_logspace_matrix(size_t nrow, size_t ncol, double mat[nrow][ncol]);
So it takes a pointer to a variable-length array and modifies it in place.
I tried the following pure SWIG interface file:
%module c_utils
%{
extern void normalize_logspace_matrix(size_t, size_t, double mat[*][*]);
%}
extern void normalize_logspace_matrix(size_t, size_t, double** mat);
Then I would do (on Mac OS X 64bit):
> swig -python c-utils.i
> gcc -fPIC c-utils_wrap.c -o c-utils_wrap.o \
-I/Library/Frameworks/Python.framework/Versions/6.2/include/python2.6/ \
-L/Library/Frameworks/Python.framework/Versions/6.2/lib/python2.6/ -c
c-utils_wrap.c: In function ‘_wrap_normalize_logspace_matrix’:
c-utils_wrap.c:2867: warning: passing argument 3 of ‘normalize_logspace_matrix’ from incompatible pointer type
> g++ -dynamiclib c-utils.o -o _c_utils.so
In Python I then get the following error on importing my module:
>>> import c_utils
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: dynamic module does not define init function (initc_utils)
Next I tried this approach using SWIG + numpy.i:
%module c_utils
%{
#define SWIG_FILE_WITH_INIT
#include "c-utils.h"
%}
%include "numpy.i"
%init %{
import_array();
%}
%apply ( int DIM1, int DIM2, DATA_TYPE* INPLACE_ARRAY2 )
{(size_t nrow, size_t ncol, double* mat)};
%include "c-utils.h"
However, I don't get any further than this:
> swig -python c-utils.i
c-utils.i:13: Warning 453: Can't apply (int DIM1,int DIM2,DATA_TYPE *INPLACE_ARRAY2). No typemaps are defined.
SWIG doesn't seem to find the typemaps defined in numpy.i, but I don't understand why, because numpy.i is in the same directory and SWIG doesn't complain that it can't find it.
With ctypes I didn't get very far, but got lost in the docs pretty quickly since I couldn't figure out how to pass it a 2d-array and then get the result back.
So could somebody show me the magic trick how to make my function available in Python/Numpy?
Unless you have a really good reason not to, you should use cython to interface C and python. (We are starting to use cython instead of raw C inside numpy/scipy themselves).
You can see a simple example in my scikits talkbox (since cython has improved quite a bit since then, I think you could write it better today).
def cslfilter(c_np.ndarray b, c_np.ndarray a, c_np.ndarray x):
    """Fast version of slfilter for a set of frames and filter coefficients.
    More precisely, given rank 2 arrays for coefficients and input, this
    computes:

    for i in range(x.shape[0]):
        y[i] = lfilter(b[i], a[i], x[i])

    This is mostly useful for processing on a set of windows with variable
    filters, e.g. to compute LPC residual from a signal chopped into a set of
    windows.

    Parameters
    ----------
    b: array
        recursive coefficients
    a: array
        non-recursive coefficients
    x: array
        signal to filter

    Note
    ----
    This is a specialized function, and does not handle other types than
    double, nor initial conditions."""
    cdef int na, nb, nfr, i, nx
    cdef double *raw_x, *raw_a, *raw_b, *raw_y
    cdef c_np.ndarray[double, ndim=2] tb
    cdef c_np.ndarray[double, ndim=2] ta
    cdef c_np.ndarray[double, ndim=2] tx
    cdef c_np.ndarray[double, ndim=2] ty
    dt = np.common_type(a, b, x)
    if not dt == np.float64:
        raise ValueError("Only float64 supported for now")
    if not x.ndim == 2:
        raise ValueError("Only input of rank 2 support")
    if not b.ndim == 2:
        raise ValueError("Only b of rank 2 support")
    if not a.ndim == 2:
        raise ValueError("Only a of rank 2 support")
    nfr = a.shape[0]
    if not nfr == b.shape[0]:
        raise ValueError("Number of filters should be the same")
    if not nfr == x.shape[0]:
        raise ValueError, \
              "Number of filters and number of frames should be the same"
    tx = np.ascontiguousarray(x, dtype=dt)
    ty = np.ones((x.shape[0], x.shape[1]), dt)
    na = a.shape[1]
    nb = b.shape[1]
    nx = x.shape[1]
    ta = np.ascontiguousarray(np.copy(a), dtype=dt)
    tb = np.ascontiguousarray(np.copy(b), dtype=dt)
    raw_x = <double*>tx.data
    raw_b = <double*>tb.data
    raw_a = <double*>ta.data
    raw_y = <double*>ty.data
    for i in range(nfr):
        filter_double(raw_b, nb, raw_a, na, raw_x, nx, raw_y)
        raw_b += nb
        raw_a += na
        raw_x += nx
        raw_y += nx
    return ty
As you can see, besides the usual argument checking you would do in python, it is almost the same thing (filter_double is a function which can be written in pure C in a separate library if you want to). Of course, since it is compiled code, failing to check your argument will crash your interpreter instead of raising exception (there are several levels of safety vs speed tradeoffs available with recent cython, though).
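One of the checks that matters before handing a raw data pointer to C is contiguity, which is what the np.ascontiguousarray calls take care of. A small sketch with made-up data shows the difference between a strided view and its packed copy:

```python
import numpy as np

x = np.arange(12, dtype=np.float64).reshape(3, 4)
every_other_column = x[:, ::2]        # a strided view, not contiguous

# Packs the selected elements into a fresh C-contiguous buffer.
packed = np.ascontiguousarray(every_other_column, dtype=np.float64)

print(every_other_column.flags["C_CONTIGUOUS"], packed.flags["C_CONTIGUOUS"])
# An array that is already contiguous is not copied.
print(np.shares_memory(np.ascontiguousarray(x), x))
```

Walking the strided view's pointer with C pointer arithmetic would also visit the skipped columns, whereas the packed array can be read element by element.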
To answer the real question: SWIG doesn't tell you it can't find any typemaps. It tells you it can't apply the typemap (int DIM1,int DIM2,DATA_TYPE *INPLACE_ARRAY2), which is because there is no typemap defined for DATA_TYPE *. You need to tell it you want to apply it to a double*:
%apply ( int DIM1, int DIM2, double* INPLACE_ARRAY2 )
{(size_t nrow, size_t ncol, double* mat)};
First, are you sure that you were writing the fastest possible numpy code? If by normalise you mean divide the whole row by its sum, then you can write fast vectorised code which looks something like this:
matrix /= matrix.sum(axis=1, keepdims=True)
If this is not what you had in mind and you are still sure that you need a fast C extension, I would strongly recommend you write it in cython instead of C. This will save you all the overhead and difficulties in wrapping code, and allow you to write something which looks like python code but which can be made to run as fast as C in most circumstances.
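Going back to the vectorised NumPy version: to divide each row by its own sum, the sum has to be taken along axis=1 and kept as a column so that it broadcasts against the rows. A small sketch on a made-up 2x3 matrix:

```python
import numpy as np

matrix = np.array([[1.0, 1.0, 2.0],
                   [2.0, 6.0, 0.0]])

# Sum along axis=1 (across each row); keepdims=True keeps shape (2, 1)
# so the in-place division broadcasts row by row.
row_sums = matrix.sum(axis=1, keepdims=True)
matrix /= row_sums

print(matrix)
print(matrix.sum(axis=1))
```

Summing along axis=0 instead would give one sum per column and normalise the columns, not the rows.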
I agree with others that a little Cython is well worth learning.
But if you must write C or C++, use a 1d array which overlays the 2d, like this:
// sum1rows.cpp: 2d A as 1d A1
// Unfortunately
//     void f( int m, int n, double a[m][n] ) { ... }
// is valid c but not c++ .
// See also
// http://stackoverflow.com/questions/3959457/high-performance-c-multi-dimensional-arrays
// http://stackoverflow.com/questions/tagged/multidimensional-array c++
#include <stdio.h>
void sum1( int n, double x[] ) // x /= sum(x)
{
    double sum = 0; // accumulate in double, matching the data type
    for( int j = 0; j < n; j ++ )
        sum += x[j];
    for( int j = 0; j < n; j ++ )
        x[j] /= sum;
}
void sum1rows( int nrow, int ncol, double A1[] ) // 1d A1 == 2d A[nrow][ncol]
{
    for( int j = 0; j < nrow*ncol; j += ncol )
        sum1( ncol, &A1[j] );
}
int main( int argc, char** argv )
{
    const int nrow = 100, ncol = 10; // const, so A is a fixed-size array in c++
    double A[nrow][ncol];
    for( int j = 0; j < nrow; j ++ )
        for( int k = 0; k < ncol; k ++ )
            A[j][k] = (j+1) * k;
    double* A1 = &A[0][0]; // A as 1d array -- bad practice
    sum1rows( nrow, ncol, A1 );
    for( int j = 0; j < 2; j ++ ){
        for( int k = 0; k < ncol; k ++ ){
            printf( "%.2g ", A[j][k] );
        }
        printf( "\n" );
    }
}
Added 8 Nov: as you probably know, numpy.reshape can overlay a numpy 2d array with a 1d view to pass to sum1rows, like this:
import numpy as np
A = np.arange(10).reshape((2,5))
A1 = A.reshape(A.size) # a 1d view of A, not a copy
# sum1rows( 2, 5, A1 )
A[1,1] += 10
print("A:", A)
print("A1:", A1)
SciPy has an extension tutorial with example code for arrays.
http://docs.scipy.org/doc/numpy/user/c-info.how-to-extend.html