Boost-wrapped Python process dies when array is deleted or freed - python

I'm stuck wrapping a C DLL with Python.
The DLL side is provided as a black box, and I made an additional wrapper class around it.
Python hands the C DLL a numpy ndarray; the DLL processes it and returns it.
When I delete the array in C, the whole program dies, with or without a deep copy. Below is brief code to reproduce the problem.
Python main:
import mypyd, numpy
myobject = mypyd.myimpro()
image = readimg() #ndarray
result = myobject.process(image)  # <- process dies here with exit code -1073741819
DLL module header:
BOOST_PYTHON_MODULE(mypyd){
    class_<mywrapclass>("myimpro")
        .def("process", &mywrapclass::process);
}
C wrapper header:
namespace np = boost::python::numpy;

class mywrapclass{
    dllhandler m_handler;
public:
    mywrapclass();
    np::ndarray process(np::ndarray);
};
C wrapper code:
mywrapclass::mywrapclass(){
    Py_Initialize();
    np::initialize();
    m_handler = initInDLL();
}

np::ndarray mywrapclass::process(np::ndarray input){
    myimagestruct image = ndarray2struct(input);    // deep copies the data array
    myimagestruct result = processInDLL(m_handler, image);
    np::ndarray ret = struct2ndarray(result);       // also deep copies the data array, just in case
    return ret;
}
C header for the DLL (I can't modify this code; it ships with the DLL):
typedef struct {
    void *data;
    /* ... size, type, etc. */
} myimagestruct;

typedef void * dllhandler;

dllhandler initInDLL();
myimagestruct processInDLL(dllhandler, myimagestruct);
Sorry for the rough code. Initially I naively thought Boost.Python or deep copying would solve this heap-memory problem. I've tried almost every keyword I could think of, but couldn't even find a point to start from.
The DLL works perfectly without the Boost wrapping, so it must be my part that needs fixing.
Thanks for reading anyway.
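For what it's worth, exit code -1073741819 is 0xC0000005, a Windows access violation, and a classic trigger when wrapping a prebuilt DLL is cross-module memory ownership: a buffer allocated by one module (numpy, or the wrapper) being freed by another (the DLL), which may link a different CRT heap. Below is a minimal sketch of a deep copy that keeps ownership on the wrapper's side; ndarray2struct is the question's own helper, but the 1-D assumption and the exact myimagestruct fields are guesses, not the real code:

// needs <cstdlib> and <cstring>
myimagestruct ndarray2struct(np::ndarray input) {
    myimagestruct s;
    // assuming a 1-D array; multiply over all dimensions in the general case
    size_t nbytes = input.shape(0) * input.get_dtype().get_itemsize();
    s.data = malloc(nbytes);                   // buffer owned by this module, not by numpy
    memcpy(s.data, input.get_data(), nbytes);  // copy out of numpy's internal buffer
    // ... fill in size/type fields ...
    return s;
}

The key question to ask the DLL vendor is who frees image.data and result.data: whichever module allocates a buffer should be the one that frees it.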

Related

Writing address of a numpy array to a file and then opening it in C++ via ctypes

I was wondering if it's possible to write the address of a numpy array to a file, e.g. via ctypeslib.ndpointer or something similar, then open this file in a C++ function (also called through ctypes in the same Python process), read the address back, and treat it as e.g. a C++ double array.
This would all be happening in the same Python process.
I am aware that it's possible to pass the array as a function argument, and that works, but that isn't what I need.
This is roughly how the code would look; don't mind the syntax errors:
test.py
with open(path) as f:
    f.write(matrix.ctypes.data_as(np.ctypeslib.ndpointer(dtype=np.float64, ndim=2, flags='C_CONTIGUOUS')))
and the C++ side:
void function()
{
    // read file, get the address stored into double* array;
    // e.g. then print out the values
}
Where could I be wrong?
I work on a project where we write a numpy array to a file and then read that file in C++, which is wasteful. I want to try adjusting it to write, and later read, just the address. Passing an ndpointer or something else as a function argument won't work, as that would require editing a big portion of the project.
I think that the data of your np.array will be lost once the Python program terminates, so you will not be able to access its memory location once the program ends.
Unfortunately, I don't know how to do it using ctypes, only using the C-API extension.
With it, you access the Python variable directly from C. It is represented by a pointer, so you can take the address of any Python object (including ndarrays).
In Python you would write:
import c_module
import numpy as np
...
a = np.array([...])  # generate the numpy array
...
c_module.c_fun(a)
Then in your C code you will receive the memory address:
static PyObject* py_f_roots(PyObject* self, PyObject* args) {
    PyObject *np_array_py;
    if (!PyArg_ParseTuple(args, "O", &np_array_py))  // "O": a single object argument
        return NULL;
    // now np_array_py points to the Python-side numpy array a;
    // to access it, cast it to a PyArrayObject *
    PyArrayObject *np_array = (PyArrayObject *) np_array_py;
    // you can access the data
    double *data = (double *) PyArray_DATA(np_array);
    Py_RETURN_NONE;  // return None with a correct reference count
}
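A small hedged addition to the snippet above: before casting the data pointer, it is worth validating the array with the same NumPy C-API. These checks are mine, not part of the original answer:

// reject anything that is not a float64 array before touching the data
if (PyArray_TYPE(np_array) != NPY_DOUBLE) {
    PyErr_SetString(PyExc_TypeError, "expected a float64 numpy array");
    return NULL;
}
npy_intp n = PyArray_SIZE(np_array);  /* total number of elements */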
The documentation for the NumPy C API
The reference manual for the CPython extension API
If the Python and C code run in the same process, then the address you write from Python will be valid in C. I think you want the following:
test.py
import ctypes as ct
import numpy as np

matrix = np.array([1.1, 2.2, 3.3, 4.4, 5.5])

# use binary mode to write the address
with open('addr.bin', 'wb') as f:
    # the type of pointer doesn't matter, we just need the address
    f.write(matrix.ctypes.data_as(ct.c_void_p))

# test function receives the filename
dll = ct.CDLL('./test')
dll.func.argtypes = ct.c_char_p,
dll.func.restype = None
dll.func(b'addr.bin')
test.c
#include <stdio.h>

__declspec(dllexport)
void func(const char* file) {
    double* p;
    FILE* fp = fopen(file, "rb");  // read the pointer back
    fread(&p, 1, sizeof(p), fp);
    fclose(fp);
    for (int i = 0; i < 5; ++i)    // dump the elements
        printf("%lf\n", p[i]);
}
Output:
1.100000
2.200000
3.300000
4.400000
5.500000

Calling C program function in Python - segmentation fault

So I have a C program that I am running from Python, but I am getting a segmentation fault. When I run the C program alone, it runs fine. The C program interfaces with a fingerprint sensor using the fprint library.
#include <poll.h>
#include <stdlib.h>
#include <sys/time.h>
#include <stdio.h>
#include <libfprint/fprint.h>
int main(){
    struct fp_dscv_dev **devices;
    struct fp_dev *device;
    struct fp_img **img;
    int r;

    r = fp_init();
    if (r < 0) {
        printf("Error");
        return 1;
    }
    devices = fp_discover_devs();
    if (devices) {
        device = fp_dev_open(*devices);
        fp_dscv_devs_free(devices);
    }
    if (device == NULL) {
        printf("NO Device\n");
        return 1;
    } else {
        printf("Yes\n");
    }
    int caps;
    caps = fp_dev_img_capture(device, 0, img);
    printf("bloody status %i \n", caps);
    // save the fingerprint image to file. ** this is the block that
    // causes the segmentation fault.
    int imrstx;
    imrstx = fp_img_save_to_file(*img, "enrolledx.pgm");
    fp_img_free(*img);
    fp_exit();
    return 0;
}
The Python code:
from ctypes import *
so_file = "/home/arkounts/Desktop/pythonsdk/capture.so"
my_functions = CDLL(so_file)
a = my_functions.main()
print(a)
print("Done")
The capture.so is built and accessed in Python, but when calling it from Python I get a segmentation fault. What could be my problem?
Thanks a lot.
Although I am unfamiliar with libfprint, after taking a look at your code and comparing it with the documentation, I see two issues with your code that can both cause a segmentation fault:
First issue:
According to the documentation of the function fp_discover_devs, NULL is returned on error. On success, a NULL-terminated list is returned, which may be empty.
In the following code, you check for failure/success, but don't check for an empty list:
devices = fp_discover_devs();
if (devices) {
    device = fp_dev_open(*devices);
    fp_dscv_devs_free(devices);
}
If devices is non-NULL, but empty, then devices[0] (which is equivalent to *devices) is NULL. In that case, you pass this NULL pointer to fp_dev_open. This may cause a segmentation fault.
I don't think that this is the reason for your segmentation fault though, because this error in your code would only be triggered if an empty list were returned.
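For completeness, a guard covering both the error case and the empty-list case could look like this (my sketch, following the answer's own reasoning):

devices = fp_discover_devs();
if (devices && devices[0]) {           // non-NULL and at least one device found
    device = fp_dev_open(devices[0]);  // devices[0] is equivalent to *devices
    fp_dscv_devs_free(devices);
} else {
    printf("NO Device\n");
    return 1;
}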
Second issue:
The last parameter of fp_dev_img_capture should be a pointer to an allocated variable of type struct fp_img *. This tells the function the address of the variable that it should write to. However, with the code
struct fp_img **img;
[...]
caps=fp_dev_img_capture(device,0,img);
you are passing that function a wild pointer, because img does not point to any valid object. This can cause a segmentation fault as soon as the wild pointer is dereferenced by the function or cause some other kind of undefined behavior, such as overwriting other variables in your program.
I suggest you write the following code instead:
struct fp_img *img;
[...]
caps=fp_dev_img_capture(device,0,&img);
Now the third parameter is pointing to a valid object (to the variable img).
Since img is now a single pointer and not a double pointer, you must pass img instead of *img to the functions fp_img_save_to_file and fp_img_free.
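Concretely, the saving block from the question would become:

imrstx = fp_img_save_to_file(img, "enrolledx.pgm");
fp_img_free(img);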
This second issue is probably the reason for your segmentation fault. It seems that you were just "lucky" that your program did not segfault as a standalone program.

Python C extension - memory leaks

I'm relatively new to Python and this is my first attempt at writing a C extension.
Background
In my Python 3.x project I need to load and parse large binary files (10-100 MB) to extract data for further processing. The binary content is organized in frames: headers followed by a variable amount of data. Due to the low performance in Python, I decided to write a C extension to speed up the loading part.
The standalone C code outperforms Python by a factor of 20x-500x, so I am pretty satisfied with it.
The problem: the memory keeps growing when I invoke the function from my C extension multiple times within the same Python module.
my_c_ext.c
#include <Python.h>
#include <numpy/arrayobject.h>
#include "my_c_ext.h"

static unsigned short *X, *Y;

static PyObject* c_load(PyObject* self, PyObject* args)
{
    char *filename;
    if (!PyArg_ParseTuple(args, "s", &filename))
        return NULL;
    PyObject *PyX, *PyY;
    __load(filename);
    npy_intp dims[1] = {n_events};
    PyX = PyArray_SimpleNewFromData(1, dims, NPY_UINT16, X);
    PyArray_ENABLEFLAGS((PyArrayObject*)PyX, NPY_ARRAY_OWNDATA);
    PyY = PyArray_SimpleNewFromData(1, dims, NPY_UINT16, Y);
    PyArray_ENABLEFLAGS((PyArrayObject*)PyY, NPY_ARRAY_OWNDATA);
    PyObject *xy = Py_BuildValue("NN", PyX, PyY);
    return xy;
}

...
// More Python C-extension boilerplate (methods, etc.)
...

void __load(char *filename) {
    // open file, extract frame header and compute new_size
    X = realloc(X, new_size * sizeof(*X));
    Y = realloc(Y, new_size * sizeof(*Y));
    X[i] = ...
    Y[i] = ...
    return;
}
test.py
import my_c_ext as ce

binary_files = ['file1.bin', ..., 'fileN.bin']

for f in binary_files:
    x, y = ce.c_load(f)
    del x, y
Here I am deleting the returned objects in the hope of lowering memory usage.
After reading several posts (e.g. this, this and this), I am still stuck.
I tried adding and removing the PyArray_ENABLEFLAGS call that sets the NPY_ARRAY_OWNDATA flag without seeing any difference. It is not yet clear to me whether NPY_ARRAY_OWNDATA implies a free(X) in C. If I explicitly free the arrays in C, I run into a segfault when trying to load the second file in the for loop in test.py.
Any idea of what I am doing wrong?
This looks like a memory management disaster. NPY_ARRAY_OWNDATA should cause it to call free on the data (or at least PyArray_free, which isn't necessarily the same thing...).
However, once this is done you still have the global variables X and Y pointing to a now-invalid area of memory. You then call realloc on those invalid pointers. At this point you're well into undefined behaviour, so anything could happen.
If it's a global variable, then the memory needs to be managed globally, not by NumPy. If the memory is managed by the NumPy array, then you need to ensure that you keep no other way to access it except through that NumPy array. Anything else is going to cause you problems.
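A minimal sketch of the second option (letting the NumPy arrays own the memory), assuming __load can be reworked to fill caller-provided pointers and return the element count; that reworked signature is my assumption, not the original code:

static PyObject* c_load(PyObject* self, PyObject* args)
{
    char *filename;
    if (!PyArg_ParseTuple(args, "s", &filename))
        return NULL;

    /* locals instead of globals: no pointer survives the call */
    unsigned short *X = NULL, *Y = NULL;
    npy_intp n_events = __load(filename, &X, &Y);  /* hypothetical reworked signature */

    npy_intp dims[1] = {n_events};
    PyObject *PyX = PyArray_SimpleNewFromData(1, dims, NPY_UINT16, X);
    PyArray_ENABLEFLAGS((PyArrayObject*)PyX, NPY_ARRAY_OWNDATA);  /* PyX now frees X */
    PyObject *PyY = PyArray_SimpleNewFromData(1, dims, NPY_UINT16, Y);
    PyArray_ENABLEFLAGS((PyArrayObject*)PyY, NPY_ARRAY_OWNDATA);  /* PyY now frees Y */
    return Py_BuildValue("NN", PyX, PyY);
}

The buffers inside __load should come from plain malloc (not realloc of an already-freed global) so the free implied by NPY_ARRAY_OWNDATA matches the allocation; with this shape, del x, y in test.py genuinely releases the memory.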

Python ctypes how to read string modified by C code

After following this answer, I wrote a C library which I call from Python using ctypes. The C library contains a function which takes a char* and modifies its contents. I would like to read the contents in Python; however, after calling the library function, I get an empty string, even though I see the correct output in the terminal if I include a printf statement in the C code. What am I doing wrong?
C code:
void somefunction(char * pythonbuffer)
{
    /* do something */
    printf("%s", pythonbuffer);
}
Python code:
lib.somefunction.argtypes = [ctypes.c_char_p]
lib.somefunction.restype = ctypes.c_void_p
buffer = ctypes.create_string_buffer(upper_limit)
lib.somefunction(buffer)
print(buffer.value)
Output:
b''
Found the problem. It turns out that the code in the /* do something */ part somehow changed the address held by pythonbuffer (it reassigned the local pointer instead of writing into the buffer). I solved it by using a temporary char * and then copying it into pythonbuffer with strcpy.
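A minimal sketch of that fix, assuming the /* do something */ code produces its result in a temporary buffer (tmp and its contents are placeholders):

#include <stdio.h>
#include <string.h>

void somefunction(char * pythonbuffer)
{
    char tmp[64];
    /* do something that fills tmp instead of reassigning pythonbuffer */
    snprintf(tmp, sizeof tmp, "result goes here");
    strcpy(pythonbuffer, tmp);  /* write into the caller's buffer in place */
    printf("%s", pythonbuffer);
}

Reassigning the pythonbuffer parameter only changes the local pointer; ctypes still reads from the original create_string_buffer storage, which is why the buffer must be filled in place.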

How to send an array of bytes from C++ to Python with Boost.Python and return it modified?

I am creating a small GUI system and I would like to do my rendering with Python and the cairo and pystacia libraries. For C++/Python interaction I am using Boost.Python, but I am having trouble with pointers.
I have seen this kind of question asked a few times but didn't quite understand how to solve it.
If I have a struct/class holding only a pointer to image data:
struct ImageData{
    unsigned char* data;
    void setData(unsigned char* data) { this->data = data; } // let's assume there is more code here that manages memory
    unsigned char* getData() { return data; }
};
how can I make this available to Python so I can do the following (C++):
ImageData myimage;
myimage.data = some_image_data;
global["myimage"] = python::ptr(&myimage);
and in Python:
import mymodule
from mymodule import ImageData
myimagedata = myimage.getData()
# manipulate the image data; those manipulations can then be read from the C++ side through the passed data pointer
My code works for basic method calls on the pointer to the class that I pass in. This is probably a basic use case, but I haven't been able to make it work. I tried shared_ptr but failed. Should it be solved using shared_ptr, a proxy class, or some other way?
You have a problem here with the locality of your variable: myimage will get deleted when you go out of its scope. To fix this, create it with dynamic memory allocation and then pass the pointer to Python:
ImageData * myimage = new ImageData();
myimage->data = new ImageRawData(some_image_data); // assuming copy of the buffer
global["myimage"] = boost::python::ptr(myimage);
Please note that this does not take care of memory handling on the Python side. You should use boost::python::handle<> to state explicitly that you are transferring memory management to Python.
