I want to write a pure C-style function that takes an array as an argument (a pointer) and does something with it, but I cannot figure out how to define an array argument for a cdef function. Here is some toy code I have made.
cdef void test(double[] array ) except? -2:
    cdef int i,n
    i = 0
    n = len(array)
    for i in range(0,n):
        array[i] = array[i]+1.0

def ctest(a):
    n = len(a)
    # Make a C array on the heap.
    cdef double *v
    v = <double *>malloc(n*sizeof(double))
    # Copy in the Python array
    for i in range(n):
        v[i] = float(a[i])
    # Call the C function, which does something with the array
    test(v)
    # Put the changed C array back into Python
    for i in range(n):
        a[i] = v[i]
    free(v)
    return a
The code will not compile. I have searched for how to define C arrays in Cython, but have not found out how to do it. The double[] array clearly does not work. I have also tried:
cdef void test(double* array ) except? -2:
I can do the same in pure C, but not in Cython :(
D:\cython-test\ python setup.py build_ext --inplace
Compiling ctest.pyx because it changed.
[1/1] Cythonizing ctest.pyx
Error compiling Cython file:
------------------------------------------------------------
...
from libc.stdlib cimport malloc, free
cdef void test(double[] array):
cdef int i,n
n = len(array)
^
------------------------------------------------------------
ctest.pyx:5:17: Cannot convert 'double *' to Python object
Error compiling Cython file:
------------------------------------------------------------
...
from libc.stdlib cimport malloc, free
cdef void test(double[] array):
cdef int i,n
n = len(array)
for i in range(0,len(array)):
^
------------------------------------------------------------
ctest.pyx:6:30: Cannot convert 'double *' to Python object
Traceback (most recent call last):
File "setup.py", line 10, in <module>
ext_modules = cythonize("ctest.pyx"),
File "C:\Anaconda\lib\site-packages\Cython\Build\Dependencies.py", line 877, i
n cythonize
cythonize_one(*args)
File "C:\Anaconda\lib\site-packages\Cython\Build\Dependencies.py", line 997, i
n cythonize_one
raise CompileError(None, pyx_file)
Cython.Compiler.Errors.CompileError: ctest.pyx
E:\GD\UD\Software\BendStiffener\curvmom>
UPDATE:
I have updated my code following all the advice, and it compiles now :) But my array still does not get updated. I expect 5.0 to be added to every entry, but that does not happen.
from libc.stdlib cimport malloc, free

cdef void test(double[] array):
    cdef int i,n
    n = sizeof(array)/sizeof(double)
    for i in range(0,n):
        array[i] = array[i]+5.0

def ctest(a):
    n = len(a)
    # Make a C array on the heap.
    cdef double* v
    v = <double*>malloc(n*sizeof(double))
    # Copy in the Python array
    for i in range(n):
        v[i] = float(a[i])
    # Call the C function, which does something with the array
    test(v)
    # Put the changed C array back into Python
    for i in range(n):
        a[i] = v[i]
    free(v)
    for x in a:
        print x
    return a
Here is a Python test program for testing my code:
import ctest
a = [0,0,0]
ctest.ctest(a)
So there is still something I am doing wrong. Any suggestion?
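len() is a Python function that works only on Python objects, which is why the code won't compile.
For a true C array (one declared with a fixed size in the same scope), you could replace n = len(array) by n = sizeof(array) / sizeof(double). That does not work for a pointer argument such as double[] array or double* array, though: inside test, sizeof(array) is the size of a pointer, not of the data it points to, so the length has to be passed in as an extra argument.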
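The difference between the size of an array and the size of a pointer can be checked from plain Python with ctypes. This is only an illustrative sketch; the variable names are made up for the example, and the pointer size depends on the platform.

```python
import ctypes

n = 3
ArrayType = ctypes.c_double * n  # a real C array type: double[3]

array_bytes = ctypes.sizeof(ArrayType)                            # n * sizeof(double)
element_bytes = ctypes.sizeof(ctypes.c_double)
pointer_bytes = ctypes.sizeof(ctypes.POINTER(ctypes.c_double))    # size of a double*

# Dividing the array size by the element size recovers the length...
count_from_array = array_bytes // element_bytes
# ...but doing the same with a pointer only measures the pointer itself.
count_from_pointer = pointer_bytes // element_bytes

print(count_from_array, count_from_pointer)
```

On a typical 64-bit build the second number is 1 and on a 32-bit build it is 0, which would explain a loop that updates at most one element.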
You might want to take a look at typed memoryviews and the buffer interface. These provide a nice interface to array-like data structures such as those underlying numpy arrays, but can also be used to work with C arrays. From the documentation:
For example, they can handle C arrays and the Cython array type (Cython arrays).
In your case this might help:
cdef test(double[:] array) except? -2:
...
The double[:] allows all 1d double arrays to be passed to the function. Those can then be modified. As the [:] defines a memoryview, all changes will be made in the array you created the memoryview on (the variable you passed as the parameter to test).
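Python's built-in memoryview uses the same buffer protocol, so the "changes show up in the original" behaviour can be tried without compiling anything. The sketch below is an analogy in pure Python, not Cython code; add_five is a hypothetical name.

```python
from array import array

def add_five(view):
    # view is a memoryview over a buffer of C doubles; writes go
    # straight into the underlying buffer, no copy is made.
    for i in range(len(view)):
        view[i] += 5.0

data = array('d', [0.0, 0.0, 0.0])   # contiguous buffer of C doubles
add_five(memoryview(data))
updated = data.tolist()
print(updated)
```

Because the view shares memory with data, the original array holds the updated values after the call.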
Is there a way to use AES-NI instructions within Cython code?
Closest I could find is how someone accessed SIMD instructions:
https://groups.google.com/forum/#!msg/cython-users/nTnyI7A6sMc/a6_GnOOsLuQJ
AES-NI in Python thread was not answered:
Python support for AES-NI
You should be able to just define the intrinsics as if they're normal C functions in Cython. Something like
cdef extern from "emmintrin.h": # I'm going off the microsoft documentation for where the headers are
    # define the datatype as an opaque type
    ctypedef struct __m128i:
        pass
    __m128i _mm_set_epi32 (int i3, int i2, int i1, int i0)

cdef extern from "wmmintrin.h":
    __m128i _mm_aesdec_si128(__m128i v,__m128i rkey)

# then in some Cython function
def f():
    cdef __m128i v = _mm_set_epi32(1,2,3,4)
    cdef __m128i key = _mm_set_epi32(5,6,7,8)
    cdef __m128i result = _mm_aesdec_si128(v,key)
As for the question "how do I apply this over a bytes array?": first, get a char* from the bytes object. Then iterate over it with range, being careful not to run off the end.
# assuming you already have an __m128i key, and that _mm_set_epi8 and
# _mm_storeu_si128 are also declared in the cdef extern block for "emmintrin.h"
cdef __m128i v, result
cdef char* array = python_bytes_array # auto conversion
cdef int i
# you NEED to ensure that the byte array has a length divisible by
# 16, otherwise you'll probably get a segmentation fault.
for i in range(0, len(python_bytes_array), 16):
    # go over in chunks of 16
    v = _mm_set_epi8(array[i+15], array[i+14], array[i+13],
                     # etc... fill in the rest
                     array[i+1], array[i])
    result = _mm_aesdec_si128(v, key)
    # write back to the same place?
    # (_mm_extract_epi8 needs a compile-time constant index, so it cannot be
    # called in a loop over j; an unaligned store writes all 16 bytes at once)
    _mm_storeu_si128(<__m128i*>(array + i), result)
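The chunking logic itself (16-byte steps, a length check up front, and a writable copy for the result) can be sketched in plain Python. This is only an illustration: xor_blocks is a hypothetical name and the XOR is a stand-in for the AES round, not real encryption.

```python
def xor_blocks(data, key_byte, block_size=16):
    """Process data in fixed-size blocks; XOR stands in for the AES round."""
    # Refuse input that would make the last block run off the end.
    if len(data) % block_size != 0:
        raise ValueError("data length must be a multiple of %d" % block_size)
    out = bytearray(data)  # bytes objects are immutable, so work on a copy
    for start in range(0, len(out), block_size):
        for j in range(start, start + block_size):
            out[j] ^= key_byte
    return bytes(out)

plain = bytes(range(32))
scrambled = xor_blocks(plain, 0x5A)
restored = xor_blocks(scrambled, 0x5A)
```

Working on a bytearray copy sidesteps the question of writing into an immutable bytes object, which the in-place write-back above leaves open.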
Well, this seems easy, but I can't find a single reference on the web. In C we can create a char array of n null-characters as follows:
char arr[n] = "";
But when I try to do the same in Cython with
cdef char arr[n] = ""
I get this compilation error:
Error compiling Cython file:
------------------------------------------------------------
...
cdef char a[n] = ""
^
------------------------------------------------------------
Syntax error in C variable declaration
Obviously Cython doesn't allow declaring arrays this way, but is there an alternative? I don't want to set each item in the array manually; that is, I'm not looking for something like this:
cdef char a[10]
for i in range(0, 10, 1):
    a[i] = b"\0"
You don't have to set each element to make a length-zero C string. It is sufficient to just zero the first element:
cdef char arr[n]
arr[0] = 0
Next, if you want to zero the whole char array, use memset
from libc.string cimport memset
cdef char arr[n]
memset(arr, 0, n)
And if C purists complain about the 0 instead of '\0', note that the '\0' is a Python string (unicode in Python 3) in Cython. '\0' is not a C char in Cython! memset expects an integer value for its second argument, not a Python string.
If you really want to know the int value of a C '\0' in Cython, one way is to write a helper function in C:
/* zerochar.h */
static int zerochar()
{
return '\0';
}
And now:
cdef extern from "zerochar.h":
    int zerochar()

cdef char arr[n]
arr[0] = zerochar()
or
cdef extern from "zerochar.h":
    int zerochar()

from libc.string cimport memset
cdef char arr[n]
memset(arr, zerochar(), n)
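Both points (zeroing only the first byte gives an empty C string, and memset clears the whole buffer) can be tried from plain Python with ctypes. This is a sketch for experimenting, not Cython code; the buffer contents are arbitrary.

```python
import ctypes

n = 8
# 7 characters plus the terminating NUL fill the 8-byte buffer.
buf = ctypes.create_string_buffer(b"abcdefg", n)

buf[0] = b"\0"                 # zero only the first element
as_c_string = buf.value        # read up to the first NUL: an empty string
rest = buf.raw[1:7]            # the remaining bytes are untouched

ctypes.memset(buf, 0, n)       # now zero every byte
all_zero = buf.raw == b"\0" * n

nul_value = ord(b"\0")         # the integer value of the NUL character
```

The integer value of the NUL character is 0, so passing 0 to memset is equivalent to passing '\0' in C.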
In C, '' is used for a char and "" for a string. But an 'empty char' does not really make sense; what you probably want is '\0' or just 0.
Maybe:
import cython
from libc.stdlib cimport malloc, free

cdef char * test():
    cdef int i, n = 10
    cdef char *arr = <char *>malloc(n * sizeof(char))
    # use a separate loop index instead of reusing n
    for i in range(n):
        arr[i] = '\0'
    return arr
Edit: calloc does that for you, since it returns zero-initialized memory:
void *calloc(size_t count, size_t size);
How about:
cdef char *arr = ['\0']*n
I'd like to call my C function from Python, in order to manipulate some NumPy arrays. The function is like this:
void c_func(int *in_array, int n, int *out_array);
where the results are supplied in out_array, whose size I know in advance (not my function, actually). In the corresponding .pyx file I tried the following, in order to be able to pass the input to the function from a NumPy array and store the result in a NumPy array:
def pyfunc(np.ndarray[np.int32_t, ndim=1] in_array):
    n = len(in_array)
    out_array = np.zeros((512,), dtype = np.int32)
    mymodule.c_func(<int *> in_array.data, n, <int *> out_array.data)
    return out_array
But I get
"Python objects cannot be cast to pointers of primitive types" error for the output assignment. How do I accomplish this?
If I require that the Python caller allocates the proper output array, then I can do
def pyfunc(np.ndarray[np.int32_t, ndim=1] in_array, np.ndarray[np.int32_t, ndim=1] out_array):
    n = len(in_array)
    mymodule.c_func(<int *> in_array.data, n, <int*> out_array.data)
But can I do this in a way that the caller doesn't have to pre-allocate the appropriately sized output array?
You should add cdef np.ndarray before the out_array assignment, so that Cython knows out_array has a .data pointer:
def pyfunc(np.ndarray[np.int32_t, ndim=1] in_array):
    cdef np.ndarray out_array = np.zeros((512,), dtype = np.int32)
    n = len(in_array)
    mymodule.c_func(<int *> in_array.data, n, <int *> out_array.data)
    return out_array
Here is an example how to manipulate NumPy arrays using code written in C/C++ through ctypes.
I wrote a small function in C, taking the square of numbers from a first array and writing the result to a second array. The number of elements is given by a third parameter. This code is compiled as shared object.
squares.c compiled to squares.so:
void square(double* pin, double* pout, int n) {
for (int i=0; i<n; ++i) {
pout[i] = pin[i] * pin[i];
}
}
In python, you just load the library using ctypes and call the function. The array pointers are obtained from the NumPy ctypes interface.
import numpy as np
import ctypes
n = 5
a = np.arange(n, dtype=np.double)
b = np.zeros(n, dtype=np.double)
square = ctypes.cdll.LoadLibrary("./squares.so")
aptr = a.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
bptr = b.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
square.square(aptr, bptr, n)
print b
This will work for any c-library, you just have to know which argument types to pass, possibly rebuilding c-structs in python using ctypes.
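The pointer handling can also be tried without NumPy and without a compiled library. In the sketch below the arrays are built with ctypes itself, and a plain Python loop stands in for the call to the C function (an assumption made so the example runs on its own).

```python
import ctypes

values = [0.0, 1.0, 2.0, 3.0, 4.0]
n = len(values)

DoubleArray = ctypes.c_double * n
a = DoubleArray(*values)   # input array, filled from the Python list
b = DoubleArray()          # output array, zero-initialised by ctypes

# Pointers to the first element, as a C function taking double* would receive
aptr = ctypes.cast(a, ctypes.POINTER(ctypes.c_double))
bptr = ctypes.cast(b, ctypes.POINTER(ctypes.c_double))

# Stand-in for the C call square(aptr, bptr, n): write through the pointer
for i in range(n):
    bptr[i] = aptr[i] * aptr[i]

result = list(b)           # the writes are visible in the original array
print(result)
```

The pointers alias the arrays' memory, which is why results written by the callee appear in b without any copying back.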
I am trying to speed up some python code with cython, and I'm making use of cython's -a option to see where I can improve things. My understanding is that in the generated html file, the highlighted lines are ones where python functions are called - is that correct?
In the following trivial function, I have declared the numpy array argument arr using the buffer syntax. I thought that this allows indexing operations to take place purely in C without having to call python functions. However, cython -a (version 0.15) highlights the line where I set the value of an element of arr, though not the one where i read one of its elements. Why does this happen? Is there a more efficient way of accessing numpy array elements?
import numpy
cimport numpy
def foo(numpy.ndarray[double, ndim=1] arr not None):
    cdef int i
    cdef double elem
    for i in xrange(10):
        elem = arr[i] #not highlighted
        arr[i] = 1.0 + elem #highlighted
EDIT: Also, how does the mode buffer argument interact with numpy? Assuming I haven't changed the order argument of numpy.array from the default, is it always safe to use mode='c'? Does this actually make a difference to performance?
EDIT after delnan's comment: arr[i] += 1 also gets highlighted (that is why I split it up in the first place, to see which part of the operation was causing the issue). If I turn off bounds checking to simplify things (this makes no difference to what gets highlighted), the generated c code is:
/* "ct.pyx":11
* cdef int i
* cdef double elem
* for i in xrange(10): # <<<<<<<<<<<<<<
* elem = arr[i]
* arr[i] = 1.0 + elem
*/
for (__pyx_t_1 = 0; __pyx_t_1 < 10; __pyx_t_1+=1) {
__pyx_v_i = __pyx_t_1;
/* "ct.pyx":12
* cdef double elem
* for i in xrange(10):
* elem = arr[i] # <<<<<<<<<<<<<<
* arr[i] = 1.0 + elem
*/
__pyx_t_2 = __pyx_v_i;
__pyx_v_elem = (*__Pyx_BufPtrStrided1d(double *, __pyx_bstruct_arr.buf, __pyx_t_2, __pyx_bstride_0_arr));
/* "ct.pyx":13
* for i in xrange(10):
* elem = arr[i]
* arr[i] = 1.0 + elem # <<<<<<<<<<<<<<
*/
__pyx_t_3 = __pyx_v_i;
*__Pyx_BufPtrStrided1d(double *, __pyx_bstruct_arr.buf, __pyx_t_3, __pyx_bstride_0_arr) = (1.0 + __pyx_v_elem);
}
The answer is that the highlighter fools the reader.
I compiled your code and the instructions generated under the highlight are those needed
to handle the error cases and the return value, they are not related to the array assignment.
Indeed if you change the code to read :
def foo(numpy.ndarray[double, ndim=1] arr not None):
    cdef int i
    cdef double elem
    for i in xrange(10):
        elem = arr[i]
        arr[i] = 1.0 + elem
    return # + add this
With that change, the highlight moves to the last line and is no longer on the assignment.
You can further speed up your code by using the @cython.boundscheck decorator:
import numpy
cimport numpy
cimport cython

@cython.boundscheck(False)
def foo(numpy.ndarray[double, ndim=1] arr not None):
    cdef int i
    cdef double elem
    for i in xrange(10):
        elem = arr[i]
        arr[i] = 1.0 + elem
    return
I am trying to speed up my Numpy code and decided that I wanted to implement one particular function where my code spent most of the time in C.
I'm actually a rookie in C, but I managed to write the function which normalizes every row in a matrix to sum to 1. I can compile it and I tested it with some data (in C) and it does what I want. At that point I was very proud of myself.
Now I'm trying to call my glorious function from Python where it should accept a 2d-Numpy array.
The various things I've tried are
SWIG
SWIG + numpy.i
ctypes
My function has the prototype
void normalize_logspace_matrix(size_t nrow, size_t ncol, double mat[nrow][ncol]);
So it takes a pointer to a variable-length array and modifies it in place.
I tried the following pure SWIG interface file:
%module c_utils
%{
extern void normalize_logspace_matrix(size_t, size_t, double mat[*][*]);
%}
extern void normalize_logspace_matrix(size_t, size_t, double** mat);
Then I would do (on Mac OS X 64bit):
> swig -python c-utils.i
> gcc -fPIC c-utils_wrap.c -o c-utils_wrap.o \
-I/Library/Frameworks/Python.framework/Versions/6.2/include/python2.6/ \
-L/Library/Frameworks/Python.framework/Versions/6.2/lib/python2.6/ -c
c-utils_wrap.c: In function ‘_wrap_normalize_logspace_matrix’:
c-utils_wrap.c:2867: warning: passing argument 3 of ‘normalize_logspace_matrix’ from incompatible pointer type
> g++ -dynamiclib c-utils.o -o _c_utils.so
In Python I then get the following error on importing my module:
>>> import c_utils
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: dynamic module does not define init function (initc_utils)
Next I tried this approach using SWIG + numpy.i:
%module c_utils
%{
#define SWIG_FILE_WITH_INIT
#include "c-utils.h"
%}
%include "numpy.i"
%init %{
import_array();
%}
%apply ( int DIM1, int DIM2, DATA_TYPE* INPLACE_ARRAY2 )
{(size_t nrow, size_t ncol, double* mat)};
%include "c-utils.h"
However, I don't get any further than this:
> swig -python c-utils.i
c-utils.i:13: Warning 453: Can't apply (int DIM1,int DIM2,DATA_TYPE *INPLACE_ARRAY2). No typemaps are defined.
SWIG doesn't seem to find the typemaps defined in numpy.i, but I don't understand why, because numpy.i is in the same directory and SWIG doesn't complain that it can't find it.
With ctypes I didn't get very far, but got lost in the docs pretty quickly since I couldn't figure out how to pass it a 2d-array and then get the result back.
So could somebody show me the magic trick how to make my function available in Python/Numpy?
Unless you have a really good reason not to, you should use cython to interface C and python. (We are starting to use cython instead of raw C inside numpy/scipy themselves).
You can see a simple example in my scikits talkbox (since cython has improved quite a bit since then, I think you could write it better today).
def cslfilter(c_np.ndarray b, c_np.ndarray a, c_np.ndarray x):
    """Fast version of slfilter for a set of frames and filter coefficients.
    More precisely, given rank 2 arrays for coefficients and input, this
    computes:

    for i in range(x.shape[0]):
        y[i] = lfilter(b[i], a[i], x[i])

    This is mostly useful for processing on a set of windows with variable
    filters, e.g. to compute LPC residual from a signal chopped into a set of
    windows.

    Parameters
    ----------
    b: array
        recursive coefficients
    a: array
        non-recursive coefficients
    x: array
        signal to filter

    Note
    ----
    This is a specialized function, and does not handle other types than
    double, nor initial conditions."""
    cdef int na, nb, nfr, i, nx
    cdef double *raw_x, *raw_a, *raw_b, *raw_y
    cdef c_np.ndarray[double, ndim=2] tb
    cdef c_np.ndarray[double, ndim=2] ta
    cdef c_np.ndarray[double, ndim=2] tx
    cdef c_np.ndarray[double, ndim=2] ty

    dt = np.common_type(a, b, x)
    if not dt == np.float64:
        raise ValueError("Only float64 supported for now")
    if not x.ndim == 2:
        raise ValueError("Only input of rank 2 support")
    if not b.ndim == 2:
        raise ValueError("Only b of rank 2 support")
    if not a.ndim == 2:
        raise ValueError("Only a of rank 2 support")

    nfr = a.shape[0]
    if not nfr == b.shape[0]:
        raise ValueError("Number of filters should be the same")
    if not nfr == x.shape[0]:
        raise ValueError, \
              "Number of filters and number of frames should be the same"

    tx = np.ascontiguousarray(x, dtype=dt)
    ty = np.ones((x.shape[0], x.shape[1]), dt)
    na = a.shape[1]
    nb = b.shape[1]
    nx = x.shape[1]
    ta = np.ascontiguousarray(np.copy(a), dtype=dt)
    tb = np.ascontiguousarray(np.copy(b), dtype=dt)

    raw_x = <double*>tx.data
    raw_b = <double*>tb.data
    raw_a = <double*>ta.data
    raw_y = <double*>ty.data

    for i in range(nfr):
        filter_double(raw_b, nb, raw_a, na, raw_x, nx, raw_y)
        raw_b += nb
        raw_a += na
        raw_x += nx
        raw_y += nx

    return ty
As you can see, besides the usual argument checking you would do in python, it is almost the same thing (filter_double is a function which can be written in pure C in a separate library if you want to). Of course, since it is compiled code, failing to check your argument will crash your interpreter instead of raising exception (there are several levels of safety vs speed tradeoffs available with recent cython, though).
To answer the real question: SWIG doesn't tell you it can't find any typemaps. It tells you it can't apply the typemap (int DIM1,int DIM2,DATA_TYPE *INPLACE_ARRAY2), which is because there is no typemap defined for DATA_TYPE *. You need to tell it you want to apply it to a double*:
%apply ( int DIM1, int DIM2, double* INPLACE_ARRAY2 )
{(size_t nrow, size_t ncol, double* mat)};
First, are you sure that you were writing the fastest possible numpy code? If by normalise you mean divide each row by its sum, then you can write fast vectorised code which looks something like this (note axis=1 for row sums; axis=0 would sum down the columns):
matrix /= matrix.sum(axis=1, keepdims=True)
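Written out in plain Python, dividing each row by its own sum looks like the sketch below. Row sums are what NumPy computes along axis 1; summing along axis 0 would give column sums instead. The sample values are made up for the illustration.

```python
rows = [[1.0, 3.0],
        [2.0, 6.0]]

row_sums = [sum(r) for r in rows]           # one total per row (axis 1)
col_sums = [sum(c) for c in zip(*rows)]     # one total per column (axis 0)

# Normalise every row so that it sums to 1
normalised = [[x / total for x in r] for r, total in zip(rows, row_sums)]
print(row_sums, col_sums, normalised)
```

After normalisation each row sums to 1, which is the property the C function in the question is meant to provide.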
If this is not what you had in mind and you are still sure that you need a fast C extension, I would strongly recommend you write it in cython instead of C. This will save you all the overhead and difficulties in wrapping code, and allow you to write something which looks like python code but which can be made to run as fast as C in most circumstances.
I agree with others that a little Cython is well worth learning.
But if you must write C or C++, use a 1d array which overlays the 2d, like this:
// sum1rows.cpp: 2d A as 1d A1
// Unfortunately
// void f( int m, int n, double a[m][n] ) { ... }
// is valid c but not c++ .
// See also
// http://stackoverflow.com/questions/3959457/high-performance-c-multi-dimensional-arrays
// http://stackoverflow.com/questions/tagged/multidimensional-array c++
#include <stdio.h>
void sum1( int n, double x[] ) // x /= sum(x)
{
double sum = 0;  // double, to match the element type and avoid losing precision
for( int j = 0; j < n; j ++ )
sum += x[j];
for( int j = 0; j < n; j ++ )
x[j] /= sum;
}
void sum1rows( int nrow, int ncol, double A1[] ) // 1d A1 == 2d A[nrow][ncol]
{
for( int j = 0; j < nrow*ncol; j += ncol )
sum1( ncol, &A1[j] );
}
int main( int argc, char** argv )
{
int nrow = 100, ncol = 10;
double A[nrow][ncol];
for( int j = 0; j < nrow; j ++ )
for( int k = 0; k < ncol; k ++ )
A[j][k] = (j+1) * k;
double* A1 = &A[0][0]; // A as 1d array -- bad practice
sum1rows( nrow, ncol, A1 );
for( int j = 0; j < 2; j ++ ){
for( int k = 0; k < ncol; k ++ ){
printf( "%.2g ", A[j][k] );
}
printf( "\n" );
}
}
Added 8 Nov: as you probably know, numpy.reshape can overlay a numpy 2d array with a 1d view to pass to sum1rows, like this:
import numpy as np
A = np.arange(10).reshape((2,5))
A1 = A.reshape(A.size) # a 1d view of A, not a copy
# sum1rows( 2, 5, A1 )
A[1,1] += 10
print "A:", A
print "A1:", A1
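The row-major overlay used by sum1rows can be sketched in pure Python as well: element [i][j] of an nrow x ncol matrix sits at index i*ncol + j of the flat array. normalize_rows_flat is a hypothetical name for this illustration.

```python
def normalize_rows_flat(flat, nrow, ncol):
    """Divide each row of a row-major flat matrix by its sum, in place."""
    for start in range(0, nrow * ncol, ncol):
        # flat[start:start + ncol] is one complete row
        total = sum(flat[start:start + ncol])
        for j in range(start, start + ncol):
            flat[j] /= total

nrow, ncol = 2, 3
flat = [1.0, 1.0, 2.0,    # row 0
        2.0, 2.0, 4.0]    # row 1
normalize_rows_flat(flat, nrow, ncol)

# Element [1][2] of the 2d view lives at flat index 1*ncol + 2
last = flat[1 * ncol + 2]
print(flat)
```

This mirrors the loop structure of the C code, with the outer loop stepping by ncol from one row start to the next.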
SciPy has an extension tutorial with example code for arrays.
http://docs.scipy.org/doc/numpy/user/c-info.how-to-extend.html