Python C Extension Nested Dictionary Segmentation Fault

I am trying to create a Python (2.7.12) extension in C that does the following:
Provide a read-only nested dictionary with module-level scope for the Python programmer.
A background thread invisible to the Python programmer will add, delete, and modify entries in the dictionary.
The extension will be built directly into the Python interpreter.
I created a simplified version of this extension that adds one entry to the dictionary and then constantly modifies it with new values. Below is the C file, with comments describing what it is doing along with my understanding of how the reference counts are being handled.
#include <Python.h>
#include <pthread.h>

static PyObject *module;
static PyObject *pyitem_error;
static PyObject *item;
static PyObject *item_handle;
static pthread_t thread;

void *stuff(void *param)
{
    int garbage = 0;
    PyObject *size;
    PyObject *value;
    while(1)
    {
        // Reset both pointers so the error path never sees a stale value
        size = NULL;
        value = NULL;
        // Build a dictionary called size containing two integer objects
        // Py_BuildValue will pass ownership of its reference to size to this thread
        size = Py_BuildValue("{s:i,s:i}", "l", garbage, "w", garbage);
        if(size == NULL)
        {
            goto error;
        }
        // Build a dictionary containing an integer object and the size dictionary
        // Py_BuildValue will create and own a reference to the size dictionary but not steal it
        // Py_BuildValue will pass ownership of its reference to value to this thread
        value = Py_BuildValue("{s:i,s:O}", "h", garbage, "base", size);
        if(value == NULL)
        {
            goto error;
        }
        // Add the new data to the dictionary
        // PyDict_SetItemString will borrow a reference to value
        PyDict_SetItemString(item, "dim", value);
error:
        Py_XDECREF(size);
        Py_XDECREF(value);
        garbage++;
    }
    return NULL;
}

// There will be methods for this module in the future
static PyMethodDef pyitem_methods[] =
{
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC initpyitem(void)
{
    // Create a module object
    // Own a reference to it since Py_InitModule returns a borrowed reference
    module = Py_InitModule("pyitem", pyitem_methods);
    Py_INCREF(module);
    // Create an exception object for future use
    // Own a second reference to it since PyModule_AddObject will steal a reference
    pyitem_error = PyErr_NewException("pyitem.error", NULL, NULL);
    Py_INCREF(pyitem_error);
    PyModule_AddObject(module, "error", pyitem_error);
    // Create a dictionary object and a proxy object that makes it read only
    // Own a second reference to the proxy object since PyModule_AddObject will steal a reference
    item = PyDict_New();
    item_handle = PyDictProxy_New(item);
    Py_INCREF(item_handle);
    PyModule_AddObject(module, "item", item_handle);
    // Start the background thread that modifies the dictionary
    pthread_create(&thread, NULL, stuff, NULL);
}
Below is a Python program using this extension. All it does is print out what is in the dictionary.
import pyitem

while True:
    print pyitem.item
    print
This extension seems to work for a while and then crashes with a segmentation fault. An examination of the core dump reveals the following:
Core was generated by `python pyitem_test.py'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 PyObject_Malloc (nbytes=nbytes@entry=42) at Objects/obmalloc.c:831
831 if ((pool->freeblock = *(block **)bp) != NULL) {
[Current thread is 1 (Thread 0x7f144a824700 (LWP 3931))]
This core dump leads me to believe the issue might have to do with my handling of the object reference counts. I suspect this because others who reported the same core dump traced their problem to mishandled reference counts. However, I do not see anything wrong with my handling of them.
Another thought is that the print statement in Python likely only borrows references to the contents of the dictionary. While it is printing the dictionary (or accessing its contents in any other way), the background thread comes along and replaces the old entry with a new one. This drops the reference count of the old entry to zero and the object is deallocated. The print statement, however, is still using the old reference, which causes an error.
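To illustrate what I mean, here is a single-threaded sketch of a borrowed reference going stale when a dict entry is replaced (illustrative only, not part of my extension):

PyObject *d = PyDict_New();
PyObject *v1 = Py_BuildValue("{s:i}", "l", 1);
PyDict_SetItemString(d, "dim", v1);     /* dict takes its own reference */
Py_DECREF(v1);                          /* dict now holds the only reference */

PyObject *borrowed = PyDict_GetItemString(d, "dim");  /* borrowed, not owned */

PyObject *v2 = Py_BuildValue("{s:i}", "l", 2);
PyDict_SetItemString(d, "dim", v2);     /* old value's refcount hits 0: freed */
Py_DECREF(v2);

/* 'borrowed' now points at freed memory; any use is undefined behavior.
   A second thread doing the replacement turns this into a race. */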
Something I found interesting is that I can change how quickly the extension hits the segmentation fault merely by changing the names of the keys in the dictionaries.
Does anyone have any insights as to what the issue may be? Is there a better way to create the extension and still have the properties that I want?

I believe I have found the cause of the segmentation fault. The background thread is modifying the state of the interpreter without obtaining the Global Interpreter Lock (GIL). This would indeed cause the interpreter to behave in unexpected ways.
To fix this, I first call PyEval_InitThreads() in the module initialization function. The next thing to do is enclose any instructions in the background thread that use the Python C API between calls to PyGILState_Ensure() and PyGILState_Release(). Below is the modified source code with this fix.
#include <Python.h>
#include <pthread.h>

static PyObject *module;
static PyObject *pyitem_error;
static PyObject *item;
static PyObject *item_handle;
static pthread_t thread;

void *stuff(void *param)
{
    int garbage = 0;
    PyObject *size;
    PyObject *value;
    PyGILState_STATE state; // Needed for PyGILState_Ensure() and PyGILState_Release()
    while(1)
    {
        // Obtain the GIL
        state = PyGILState_Ensure();
        // Reset both pointers so the error path never sees a stale value
        size = NULL;
        value = NULL;
        size = Py_BuildValue("{s:i,s:i}", "l", garbage, "w", garbage);
        if(size == NULL)
        {
            goto error;
        }
        value = Py_BuildValue("{s:i,s:O}", "h", garbage, "base", size);
        if(value == NULL)
        {
            goto error;
        }
        PyDict_SetItemString(item, "dim", value);
error:
        Py_XDECREF(size);
        Py_XDECREF(value);
        // Release the GIL
        PyGILState_Release(state);
        garbage++;
    }
    return NULL;
}

static PyMethodDef pyitem_methods[] =
{
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC initpyitem(void)
{
    module = Py_InitModule("pyitem", pyitem_methods);
    Py_INCREF(module);
    pyitem_error = PyErr_NewException("pyitem.error", NULL, NULL);
    Py_INCREF(pyitem_error);
    PyModule_AddObject(module, "error", pyitem_error);
    item = PyDict_New();
    item_handle = PyDictProxy_New(item);
    Py_INCREF(item_handle);
    PyModule_AddObject(module, "item", item_handle);
    // Initialize threading support and the Global Interpreter Lock (GIL)
    PyEval_InitThreads();
    pthread_create(&thread, NULL, stuff, NULL);
}
The extension now runs without any segmentation faults.

Related

Python-C-api, reference count and read access violation

I have a piece of C++ code, included in a larger native Python project, that triggers various random read access violations. I suspect there is an issue with the handling of the reference counts but I cannot figure it out.
The code features a C++ class with two attributes, wrapped into a Python object.
typedef struct
{
    PyObject_HEAD
    MyCustomClass *self;
} PyMyCustomClass;

class MyCustomClass {
public:
    PyObject *values;
    PyObject *incr_values;
    ...
};
Both of the attributes are tuple initialized to None and MyCustomClass features the following methods:
MyCustomClass() {
    values = Py_BuildValue("");
    incr_values = Py_BuildValue("");
}

~MyCustomClass() {
    Py_DECREF(this->values);
    Py_DECREF(this->incr_values);
}

PyObject *get_values() {
    Py_INCREF(this->values);
    return this->values;
}

int set_incr_values(PyObject *new_values) {
    Py_DECREF(this->incr_values);
    Py_INCREF(new_values);
    this->incr_values = new_values;
    return 0;
}

PyObject *compute_incr_values() {
    if( condition )
        return this->get_values(); // new reference
    else { // add 1 to all values
        PyObject *one = Py_BuildValue("i", 1);
        Py_ssize_t size = PyTuple_GET_SIZE(this->values);
        PyObject *new_values = PyTuple_New(size);
        for(Py_ssize_t i = 0; i < size; i++) {
            PyObject *item = PyTuple_GET_ITEM(this->values, i);
            auto add_fct = Py_TYPE(item)->tp_as_number->nb_add;
            PyTuple_SET_ITEM(new_values, i, add_fct(item, one));
        }
        Py_DECREF(one);
        return new_values; // new reference
    }
}

static PyObject *compute_incr_values(PyMyCustomClass *obj, PyObject *Py_UNUSED(ignored)) {
    PyObject *new_values = obj->self->compute_incr_values();
    obj->self->set_incr_values(new_values);
    Py_DECREF(new_values); // get rid of the now-unneeded reference
    Py_RETURN_NONE;
}
The code as presented triggers various random read access violations in the Python code. However, if I remove Py_DECREF(this->values); from the destructor and Py_DECREF(new_values); from the compute_incr_values wrapper, it works.
I do not understand the issue here. Is there a problem with my handling of the reference counts?
I can see at least two issues with your code.
Your set_incr_values function is broken
int set_incr_values(PyObject *new_values) {
    Py_DECREF(this->incr_values);
    Py_INCREF(new_values);
    this->incr_values = new_values;
    return 0;
}
There are actually two issues here. First, it can fail if new_values is the same object as this->incr_values: the Py_DECREF may destroy the object before you incref it. Second, Py_DECREF can cause arbitrary code to be executed (read the big red warning in the documentation for Py_DECREF). Therefore, you must ensure the object is in a valid state before the decref. Doing the assignment first is the easiest way to achieve this.
The better way to do it would be:
int set_incr_values(PyObject *new_values) {
    PyObject *old = this->incr_values;
    Py_INCREF(new_values);
    this->incr_values = new_values;
    Py_DECREF(old);
    return 0;
}
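Incidentally, newer CPython releases provide a macro for exactly this swap-and-release pattern (documented as public API from 3.12); a minimal sketch, assuming a new enough CPython:

Py_INCREF(new_values);
Py_SETREF(this->incr_values, new_values);  // store new_values, then decref the old value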
values is not a tuple
You uncritically use values as a tuple inside compute_incr_values (e.g. PyTuple_GET_SIZE). However, when you create values you assign None to it (which you know, because you point it out in the question, although I don't think "tuple initialized" means anything). PyTuple_GET_SIZE and PyTuple_GET_ITEM are unchecked macros, so applying them to None is undefined behavior.
values = Py_BuildValue("");
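One way to satisfy that assumption (a sketch on my part; the intended semantics of values aren't shown in the question) is to initialize both attributes to an empty tuple instead of None:

MyCustomClass() {
    values = PyTuple_New(0);       // empty tuple, new reference
    incr_values = PyTuple_New(0);  // keeps the "always a tuple" invariant from the start
    // real code must check both results for NULL
}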
There's also no error checking. You say in the comments "I have removed the error checking for simplicity". This is generally unhelpful: my first assumption when reading C API code with no error checking is that it's failing simply because some exception is being ignored. That assumption is usually right.
It isn't possible to tell exactly what's wrong with the code because there's no minimal reproducible example. However, there's plenty of issues based on a quick skim-read so I'd be suspicious of the rest of it.

Embedding Python in DLL: Access violation reading location when Py_DECREF list object

I am trying to embed Python into an XLL to allow Python functions to be called from within Excel. An XLL is a DLL that also exports, at a minimum, 2 functions that tell Excel how to register or unregister the DLL's exported functions, so it can otherwise be treated exactly as a traditional DLL. The issue I am having is that when decreasing the reference count of a Python list object the program crashes with the following error:
Exception thrown at 0x1E14E37D (python37.dll) in EXCEL.EXE: 0xC0000005: Access violation reading location 0x00000064.
I do not have this issue when decreasing the reference count of strings, floats, bools, etc. Only list and tuple objects give me this issue.
Below I have made a simple example with 2 functions exposed to Excel, testFloat and testList. Both are kept very simple to help debug the issue: the functions take no arguments and both return xltypeNil to Excel, which fills the cell with 0. Each function creates its own Python object (a float or a list) and then decrements the reference count.
#define PY_SSIZE_T_CLEAN
#include <Python.h>
#include <Windows.h>
#include <XLCALL.H>
#include <FRAMEWRK.H>

LPCWCHAR uFuncs[][3]{
    {L"testFloat", L"Q", L"testFloat"},
    {L"testList", L"Q", L"testList"},
};

BOOL APIENTRY DllMain(HMODULE hModule,
                      DWORD ul_reason_for_call,
                      LPVOID lpReserved
)
{
    switch (ul_reason_for_call)
    {
    case DLL_PROCESS_ATTACH:
    case DLL_THREAD_ATTACH:
    case DLL_THREAD_DETACH:
    case DLL_PROCESS_DETACH:
        break;
    }
    return TRUE;
}

int WINAPI xlAutoOpen(void) {
    if (!Py_IsInitialized()) {
        Py_InitializeEx(0);
    }
    // register functions with Excel
    XLOPER12 xDLL;
    Excel12f(xlGetName, &xDLL, 0);
    for (int i{ 0 }; i < sizeof(uFuncs) / sizeof(uFuncs[0]); i++) {
        Excel12f(xlfRegister, 0, 4,
            (LPXLOPER12)&xDLL,
            (LPXLOPER12)TempStr12(uFuncs[i][0]),
            (LPXLOPER12)TempStr12(uFuncs[i][1]),
            (LPXLOPER12)TempStr12(uFuncs[i][2])
        );
    }
    Excel12f(xlFree, &xDLL, 0);
    return 1;
}

int WINAPI xlAutoClose(void) {
    if (Py_IsInitialized()) {
        Py_FinalizeEx();
    }
    return 1;
}

LPXLOPER12 testFloat(void) {
    static XLOPER12 xRet;
    PyObject* obj{ PyFloat_FromDouble(2.5) };
    Py_DECREF(obj);
    xRet.xltype = xltypeNil;
    return &xRet;
}

LPXLOPER12 testList(void) {
    static XLOPER12 xRet;
    PyObject* obj{ Py_BuildValue("[dd]", 3.4, 4.5) };
    Py_DECREF(obj); // <---- This is where the debugger says the error is
    xRet.xltype = xltypeNil;
    return &xRet;
}
I am expecting this to run with no errors and return 0 to the Excel cell calling the function.
You need to acquire the GIL before you can safely do anything with the Python C API. Quoting directly from the Python C API documentation:
PyGILState_STATE gstate;
gstate = PyGILState_Ensure();

/* Perform Python actions here. */
result = CallSomeFunction();
/* evaluate result or handle exception */

/* Release the thread. No Python API allowed beyond this point. */
PyGILState_Release(gstate);
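Applied to the testList function from the question, a minimal sketch might look like this (one additional caveat, my addition rather than part of the quoted docs: after Py_InitializeEx the initializing thread holds the GIL, so xlAutoOpen would typically also call PyEval_SaveThread() once initialization is done, so that later calls on other threads can acquire it):

LPXLOPER12 testList(void) {
    static XLOPER12 xRet;

    PyGILState_STATE gstate = PyGILState_Ensure();   // take the GIL
    PyObject *obj = Py_BuildValue("[dd]", 3.4, 4.5);
    Py_XDECREF(obj);                                 // safe while the GIL is held
    PyGILState_Release(gstate);                      // no Python API past this point

    xRet.xltype = xltypeNil;
    return &xRet;
}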
I'd suggest using pybind11 rather than tackling the Python C API directly.
Going even further, I'd suggest either using, or at least cribbing code from, my Excel/Python library, which lets you interface between Excel and Python without needing to touch a C API at all!
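For illustration, here is a sketch of the same GIL handling with pybind11 (assuming pybind11 is available and the interpreter was initialized as in xlAutoOpen above):

#include <pybind11/pybind11.h>
namespace py = pybind11;

LPXLOPER12 testList(void) {
    static XLOPER12 xRet;
    {
        py::gil_scoped_acquire gil;  // RAII: acquires the GIL here...
        py::list values;
        values.append(3.4);
        values.append(4.5);
    }                                // ...and releases it at scope exit
    xRet.xltype = xltypeNil;
    return &xRet;
}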

Python C API: Assigning PyObjects to a dictionary causes memory leak

I am writing a C++ wrapper for Python using the Python C API. In my case I have to make larger amounts of byte-oriented data accessible to the Python script. For this purpose I use the PyByteArray_FromStringAndSize method to produce a Python bytearray (https://docs.python.org/2.7/c-api/bytearray.html).
When I return this bytearray directly, I see no problems. However, when I add the bytearray to a Python dict, the bytearray's memory is not released once the dict is destroyed.
This can be solved by calling Py_DECREF on the bytearray object after adding it to the dict.
Below is a complete working example of my code containing a method dummyArrPlain that returns the plain bytearray and a method dummyArrInDict that returns a bytearray in a dict. The second method produces a memory leak unless Py_DECREF(pyData); is called.
My question is: why is Py_DECREF necessary at this point? Intuitively, I would have expected the reference to be released once the dict is destroyed.
I also assign values to a dict like this:
PyDict_SetItem(dict, PyString_FromString("i"), PyInt_FromLong(i));
Will this also produce a memory leak when Py_DECREF is not called on the created string and long?
This is my dummy C++ wrapper:
#include <python2.7/Python.h>

static char module_docstring[] = "This is a module causing a memory leak";

static PyObject *dummyArrPlain(PyObject *self, PyObject *args);
static PyObject *dummyArrInDict(PyObject *self, PyObject *args);

static PyMethodDef module_methods[] = {
    {"dummy_arr_plain", dummyArrPlain, METH_VARARGS, "returns a plain dummy bytearray"},
    {"dummy_arr_in_dict", dummyArrInDict, METH_VARARGS, "returns a dummy bytearray in a dict"},
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC initlibdummy(void)
{
    PyObject *m = Py_InitModule("libdummy", module_methods);
    if (m == NULL)
        return;
}

static PyObject *dummyArrPlain(PyObject *self, PyObject *args)
{
    int len = 10000000;
    char* data = new char[len];
    for(int i=0; i<len; i++) {
        data[i] = 0;
    }
    PyObject *pyData = PyByteArray_FromStringAndSize(data, len);
    delete [] data;
    return pyData;
}

static PyObject *dummyArrInDict(PyObject *self, PyObject *args)
{
    int len = 10000000;
    char* data = new char[len];
    for(int i=0; i<len; i++) {
        data[i] = 0;
    }
    PyObject *pyData = PyByteArray_FromStringAndSize(data, len);
    delete [] data;
    PyObject *dict = PyDict_New();
    PyDict_SetItem(dict, PyString_FromString("data"), pyData);
    // memory leak without Py_DECREF(pyData);
    return dict;
}
And a dummy python script using the wrapper:
import libdummy
import time

while True:
    a = libdummy.dummy_arr_in_dict()
    time.sleep(0.01)
It's a matter of [Python 2.Docs]: Ownership rules. I'm going to exemplify on Python 2.7.10 (pretty old, but I don't think the behavior has changed significantly along the way).
PyByteArray_FromStringAndSize (bytearrayobject.c:168) creates a new object (using PyObject_New) and allocates memory for the buffer as well.
By default, the refcount of that object (or better: of any newly created object) is 1 (set by _Py_NewReference), so that when the user calls del upon it, or at program exit, the refcount is decreased, and when it reaches 0, the object is deallocated.
This is the behavior on the flow where the object is returned directly.
But in dummyArrInDict's case, PyDict_SetItem does (indirectly) a Py_INCREF of pyData (it does other stuff as well, but only this is relevant here), ending up with a refcount of 2: one reference held by the dict plus the one you still own and never release - hence the memory leak.
It's basically the same thing that you're doing with data: you allocate memory for it, and when you no longer need it, you free it (you're not returning it; you only use it temporarily).
Note: it's safer to use the X macros (e.g. [Python 2.Docs]: Py_XDECREF), especially since you're not testing the returned PyObjects for NULL.
For more details, also take a look at [Python 2.Docs]: C API Reference.
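To make the fix concrete, here is a sketch of dummyArrInDict with the references balanced (my sketch, not from the original answer). Note that the PyDict_SetItem(dict, PyString_FromString("i"), PyInt_FromLong(i)) pattern from the question leaks both the key and the value for the same reason, since PyDict_SetItem increfs rather than steals:

static PyObject *dummyArrInDict(PyObject *self, PyObject *args)
{
    int len = 10000000;
    char *data = new char[len]();       // zero-initialized
    PyObject *pyData = PyByteArray_FromStringAndSize(data, len);
    delete [] data;

    PyObject *dict = PyDict_New();
    PyObject *key = PyString_FromString("data");
    PyDict_SetItem(dict, key, pyData);  // the dict increfs both key and value
    Py_XDECREF(key);                    // drop our reference to the key
    Py_XDECREF(pyData);                 // drop our reference to the value
    // real code should also NULL-check pyData, dict and key
    return dict;                        // the caller owns the dict's reference
}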

Pointer ownership when PyCapsule_New fails

PyCapsule_New accepts a destructor function, which is called when the capsule is destroyed:
PyObject* PyCapsule_New(void *pointer, const char *name, PyCapsule_Destructor destructor)
I am trying to use this mechanism to pass ownership of an object created by C++ code to Python. Specifically, the destructor simply calls "delete" for the object.
auto ptr = make_unique<ObjType>(arg);
PyObject *ret = PyCapsule_New(ptr.release(), nullptr, Destroyer);

void Destroyer(PyObject *capsule)
{
    auto rawPtr = static_cast<ObjType*>(PyCapsule_GetPointer(capsule, nullptr));
    delete rawPtr;
}
It seems to me there is a potential memory leak here: if PyCapsule_New fails, the released raw pointer is leaked, since nothing will ever delete the object. I tried to get confirmation from the Python C API documentation; however, it only mentions that upon failure an exception is set and NULL is returned. It doesn't talk about ownership.
It seems reasonable to assume the pointer is leaked, because if the capsule is never created, there is no capsule to pass to the destructor.
However, I am not sure if PyCapsule_New internally calls the destructor or not, specifically:
Inside PyCapsule_New, the capsule construction is almost complete.
A failure happens just before it returns.
PyCapsule_New sets an exception, returns NULL, after calling the destructor (???)
If the highlighted part is never going to happen, it seems to me the above code will have to be re-written as
auto ptr = make_unique<ObjType>(arg);
PyObject *ret = PyCapsule_New(ptr.get(), nullptr, Destroyer);
if (ret != nullptr)
    ptr.release();
Could someone please help confirm if that's the case?
Changing comment to answer, as suggested.
Short answer: No, when PyCapsule_New fails, it does not call the destroyer.
See implementation in https://github.com/python/cpython/blob/master/Objects/capsule.c#L44
PyObject *
PyCapsule_New(void *pointer, const char *name, PyCapsule_Destructor destructor)
{
    PyCapsule *capsule;

    if (!pointer) {
        PyErr_SetString(PyExc_ValueError, "PyCapsule_New called with null pointer");
        return NULL;
    }

    capsule = PyObject_NEW(PyCapsule, &PyCapsule_Type);
    if (capsule == NULL) {
        return NULL;
    }

    capsule->pointer = pointer;
    capsule->name = name;
    capsule->context = NULL;
    capsule->destructor = destructor;

    return (PyObject *)capsule;
}
As a result, the first implementation does contain a potential memory leak: release() should only be called once PyCapsule_New has succeeded.
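Putting the pieces together, a leak-safe sketch (ObjType and the surrounding factory function are hypothetical stand-ins for the question's types):

#include <Python.h>
#include <memory>

struct ObjType { /* ... */ };

static void Destroyer(PyObject *capsule)
{
    delete static_cast<ObjType*>(PyCapsule_GetPointer(capsule, nullptr));
}

static PyObject *MakeCapsule()
{
    auto ptr = std::make_unique<ObjType>();
    PyObject *ret = PyCapsule_New(ptr.get(), nullptr, Destroyer);
    if (ret != nullptr)
        ptr.release();  // the capsule now owns the object
    // on failure, ~unique_ptr deletes the object and the exception is already set
    return ret;
}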

Pickling a Python Extension type defined as a C struct having PyObject* members

I am running C++ code via Python and would like to pickle an extension type.
I have a C++ struct (py_db_manager) containing pointers to a database object and an object manager object (both written in C++), which I wrapped with a Python type object (t_db_manager). My problem is that this Python type needs to know how to pickle the two pointers in order to send the object to child multicore processes. So I registered the type with the copy_reg module (this is equivalent to writing a __reduce__() method on the type). However, I'm not too sure what to put in it. Should I build a tuple with the PyObject* members, or just the integer pointer values? Can anyone help?
typedef struct
{
    PyObject_HEAD
    PyObject* man_inst_;
    PyObject* db_inst_;
} py_db_manager;
Here's the PyTypeObject:
PyTypeObject t_db_manager = {
    PyObject_HEAD_INIT(0)   /* tp_head */
    0,                      /* tp_internal */
    ".py_db_manager",       /* tp_name */
    sizeof(py_db_manager)};
And here's the code that would back the __reduce__ method:
PyObject *pickle_manager(PyObject *module, PyObject *args)
{
    py_db_manager *cpp_manager = 0;
    PyObject *values = NULL,
             *tuple = NULL;
    char text[512];

    if (!PyArg_ParseTuple(args, "O!", &t_db_manager, &cpp_manager))
        goto error;
    sprintf(text, "man_inst_, db_inst_");
    if ((values = Py_BuildValue("(sii)", text,
            cpp_manager->man_inst_, cpp_manager->db_inst_)) == NULL)
        goto error;
    tuple = Py_BuildValue("(OO)", manager_constructor, values);
error:
    Py_XDECREF(values);
    return tuple;
}
Because this will be passed to another process, pickling just the integer pointer values will not work the way you want: each process has its own address space, so a pointer from one process is meaningless in another.
So, to answer your question: you should pickle the full objects and reconstruct them on the receiving end.
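A sketch of what that change might look like in pickle_manager, assuming man_inst_ and db_inst_ are themselves picklable Python objects (an assumption; the question doesn't show their types):

/* Replaces the Py_BuildValue("(sii)", ...) call above: pack the member
   objects themselves, not their addresses. manager_constructor must then
   rebuild the manager from the two unpickled objects. */
if ((values = Py_BuildValue("(OO)",
        cpp_manager->man_inst_, cpp_manager->db_inst_)) == NULL)
    goto error;
tuple = Py_BuildValue("(OO)", manager_constructor, values);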
