What is the canonical way of getting a numpy matrix as an argument to a C function which takes a double pointer?
Context: I'm using numpy to validate some C code-to wit, I have a C function which takes a const double ** const, and I'm using ctypes to call the .so from Python.
I've tried:
func.argtypes = ctypeslib.ndpointer(dtype=double, ndim=2, flags="C_CONTIGUOUS")
and passed the numpy matrix directly (didn't work), as well as
func.argtypes = ctypes.POINTER(ctypes.POINTER(ctypes.c_double))
and then passed the numpy matrix via various casts. Casting led to the Python error
TypeError: _type_ must have storage info
Note: This question came up a few years ago here, but there was no completely successful resolution.
I think you're looking for the ctypes interface in numpy's ndarrays (or matrix for that matter). You may have a look here for more information on numpy's manual.
Please note that numpy C-API stores ndarrays (or matrices for that matter) with a single pointer (see http://docs.scipy.org/doc/numpy/reference/c-api.types-and-structures.html#c.PyArrayObject). You cannot convert this single pointer into a double pointer in C, just because these are different types. Furthermore, numpy not only stores the data on these objects, it also stores information about the shape of the matrix and the strides (how the data is organised on the provided data pointer). Without knowing all these information, your code will not work on all conditions. For example, if you transpose a matrix before interfacing it to your code, you'll get unexpected results!
There are a couple of solutions depending on your settings:
If you can modify your library's API, change it so that you can pass numpy information concerning not only the data pointer, but also the shape of the matrix and its strides. Then, use the above link to figure out how to interface numpy/ctypes support with your new API.
If you cannot modify your API, I suggest you create a Python function based on ctypes that converts the contents of the numpy array into a double-pointer matrix created with ctypes itself (as suggested on this discussion). Also add support for the numpy object shape and stride so that you can run the conversion properly. After converting, pass the newly created double pointer structure to your original function.
Related
I'm using Cython to wrap a C++ library. In the C++ code there is some data that represents a list of 3D vectors. It is stored in the object std::vector< std::array<double, 3> >. My current method to convert this into a python object is to loop over the vector and use the method arrayd3ToNumpy in the answer to my previous question on each element. This is quite slow, however, when the vector is very large. I'm not as worried about making copies of the data (since I believe the auto casting of vector to list creates a copy), but more about the speed of the casting.
This this case the data is a simple C type, continguous in memory. I'd therefore expose it to Python via a wrapper cdef class that has the buffer protocol.
There's a guide (with an example) in the Cython documentation. In your case you don't need to store ncols - it's just 3 defined by the array type.
The key advantage is that you can largely eliminate copying. You could use your type directly with Cython's typed memoryviews to quickly access the underlying data. You could also do np.asarray to create a Numpy array that just wraps the underlying data that already exists.
I'm looking at python code working with numba, and have some questions. There is less tutorial on numba, so that I come here to ask.
In numba, data type is pre-declared to help processing. I'm not clear on the rule to declare data type. One example is numba.float64[:,::1]. I feel it's declaring a 2D array in float type. However, I'm not sure what ::1 means here. Another example is nb.types.NPDatetime('M')[::1]. Is it slicing the 1D array?
I still have questions on ListType(), which is imported from numba.types. Only one element is allowed here? In my code, one class type is saved and passed to ListType() as single argument. What if I need to explicitly define this class type, and pass it here? Thanks.
I feel there is few tutorial or documents on numba module. If ok, please share some resources on numba. That's very appreciated.
One example is numba.float64[:,::1]. I feel it's declaring a 2D array in float type. However, I'm not sure what ::1 means here
Exactly. ::1 means that the array axis is contiguous. This enable further optimizations like the use of SIMD instructions.
Another example is nb.types.NPDatetime('M')[::1]. Is it slicing the 1D array?
nb.types.NPDatetime('M') is a Numpy datetime type (where 'M' is meant to specify the granularity of the datetime. eg. months) here and [::1] means that this is a 1D contiguous array (containing datetime objects).
One should not be confused between object instances and object types. In Python, this is quite frequent to mix both but this is due to the dynamic nature of the language. Statically-typed compiled languages like C or C++ clearly separate the two concepts and types cannot be manipulated at runtime.
Only one element is allowed here?
ListType is a class representing the type of a typed list. Its unique parameter defines the type of the item in the resulting type of list. For example nb.types.ListType(nb.types.int32) returns an object representing the type of a typed list containing 32-bit integers. Note that it is not a list instance. It is meant to be provided to Numba signature or other types.
For a project I am working on, I require passing a PyTorch tensor (with dtype torch.int16) to a C++ method which expects an std::vector (basically, a list of integers).
I have the binding figured out, and I am already able to accomplish this task if I first convert my tensor into a numpy array (and use pybind11's numpy support), but for optimizing my code I want to be able to do this without this conversion to a numpy array.
Based on my research, I assume that I will have to write a custom type-caster. How should I proceed?
Say there is a C++ class in which we would like to define a function to be called in python. On the python side the goal is being able to call this function with:
Input: of type 2D numpy-array(float32), or list of lists, or other suggestions
Output: of type 2D numpy-array(float32), or list of lists, or other suggestions
and if it helps latency/simplicity 1D array is also ok.
One would for example define a function in the header with:
bool func(const std::string& name);
which has string type as input and bool as output.
What can be a good choice with the above requirements to write in the header?
And finally, after the header file, what should be written in the pyx/pyd file for Cython?
Cython Input
The most natural Cython type to use for the input interface between Python and Cython would be a 2D typed memoryview. This will take a 2D numpy array, as well as any other 2D array type that exports the buffer interface (there aren't too many other types since Numpy is pretty ubiquitous, but some image-handling libraries have some alternatives).
I'd avoid using list-of-lists as an interface - the length of the second dimension is poorly defined. However, Numpy arrays are easily created from list-of-lists.
Cython Output
For output you return either a Cython memoryview, or a Numpy array (easily created from a memoryview with np.asarray(memview)). I'd probably return a Numpy array, but make a decision based on whether you want to make Numpy a hard dependency.
C++ interface
This is very difficult to answer without knowing about your code. If you have existing code you should just use the type that's most natural to that if at all possible.
You can get a pointer from your memoryview with &memview[0,0], and access its attributes .shape and .strides to get information about how the data is stored. (If you make the memoryview contiguous then you know strides from shape so it's simpler). You then need to decided whether to copy the data, or just use a pointer to the Python-owned data (if C++ only keeps the data for the duration of the function call then using a pointer is good).
Similar considerations apply to the output data, but it's hard to know without knowing what you're trying to do in C++.
I am trying to implement an image classification algorithm in Python. The problem is that python takes very long with looping through the array. That's why I decided to write a Delphi dll which performs the array-processing. My problem is that I don't know how to pass the multidimensional python-array to my dll-function.
Delphi dll extract: (I use this function only for testing)
type
TImgArray = array of array of Integer;
function count(a: TImgArray): Integer; cdecl;
begin
result:= high(a);
end;
relevant Python code:
arraydll = cdll.LoadLibrary("C:\\ArrayFunctions.dll")
c_int_p = ctypes.POINTER(ctypes.c_int32)
data = valBD.ReadAsArray()
data = data.astype(np.int32)
data_p = data.ctypes.data_as(c_int_p)
print arraydll.count(data_p)
The value returned by the dll-function is not the right one (it is 2816 instead of
7339). That's why I guess that there's somethin wrong with my type-conversion :(
Thanks in advance,
Mario
What you're doing won't work, and is likely to corrupt memory too.
A Delphi dynamic array is implemented under the hood as a data structure that holds some metadata about the array, including the length. But what you're passing to it is a C-style pointer-as-array, which is a pointer, not a Delphi dynamic array data structure. That structure is specific to Delphi and you can't use it in other languages. (Even if you did manage to implement the exact same structure in another language, you still couldn't pass it to a Delphi DLL and have it work right, because Delphi's memory manager is involved under the hood. Doing it this way is just asking for heap corruption and/or exceptions being raised by the memory manager.)
If you want to pass a C-style array into a DLL, you have to do the C way, by passing a second parameter that includes the length of the array. Your Python code should already know the length, and it shouldn't take any time to calculate it. You can definitely use Delphi to speed up image processing. But the array dimensions have to come from the Python side. There's no shortcut you can take here.
Your Delphi function declaration should look something like this:
type
TImgArray = array[0..MAXINT] of Integer;
PImgArray = ^TImgArray;
function ProcessSomething(a: PImgArray; size: integer): Integer; cdecl;
Normal Python arrays are normally called "Lists". A numpy.array type in Python is a special type that is more memory efficient than a normal Python list of normal Python floating point objects. Numpy.array wraps a standard block of memory that is accessed as a native C array type. This in turn, does NOT map cleanly to Delphi array types.
As David says, if you're willing to use C, this will all be easier. If you want to use Delphi and access Numpy.array, I suspect that the easiest way to do it would be to find a way to export some simple C functions that access the Numpy.array type. In C I would import the numpy headers, and then write functions that I can call from Pascal. Then I would import these functions from a DLL:
function GetNumpyArrayValue( arrayObj:Pointer; index:Integer):Double;
I haven't written any CPython wrapper code in a while. This would be easier if you wanted to simply access CORE PYTHON types from Delphi. The existing Python-for-delphi wrappers will help you. Using numpy with Delphi is just a lot more work.
Since you're only writing a DLL and not a whole application, I would seriously advise you forget about Delphi and write this puppy in plain C, which is what Python extensions (which is what you're writing) really should be written in.
In short, since you're writing a DLL, in Pascal, you're going to need at least another small DLL in C, just to bridge the types between the Python extension types (numpy.array) and the Python floating point values. And even then, you're not going to easily (quickly) get an array value you could read in Delphi as a native delphi array type.
The very fastest access mechanism I can think of is this:
type
PDouble = ^Double;
function GetNumpyArrayValue( arrayObj:Pointer; var doubleVector:PDouble; var doubleVectorLen:Integer):Boolean;
You could then use doubleVector (pointer) type math to access the underlying C array memory type.