I have a piece of C++ code, included in a larger native Python project, that triggers various random read access violations. I suspect there is an issue with the handling of the reference count but I cannot figure it out.
The code features a C++ class with 2 attributes wrapped into a Python Object.
typedef struct
{
PyObject_HEAD
MyCustomClass *self;
} PyMyCustomClass;
class MyCustomClass {
public:
PyObject *values;
PyObject *incr_values;
...
}
Both of the attributes are tuple initialized to None and MyCustomClass features the following methods:
MyCustomClass(){
values = Py_BuildValue("");
incr_values= Py_BuildValue("");
}
~MyCustomClass(){
Py_DECREF(this->values);
Py_DECREF(this->incr_values);
}
PyObject *get_values() {
Py_INCREF(this->values);
return this->values;
}
int set_incr_values( PyObject *new_values) {
Py_DECREF(this->incr_values);
Py_INCREF(new_values);
this->incr_values = new_values;
return 0;
}
PyObject *compute_incr_values() {
if( condition )
return this->get_values(); //new reference
else { //add 1 to all values
PyObject *one = Py_BuildValue("i", 1);
Py_ssize_t size = PyTuple_GET_SIZE(this->values);
PyObject *new_values = PyTuple_New(size);
for(Py_ssize_t i = 0; i < size; i++ ) {
PyObject *item = PyTuple_GET_ITEM(this->values,i);
auto add_fct = Py_TYPE(item)->tp_as_number->nb_add;
PyTuple_SET_ITEM(new_values, i, add_fct(item,one) );
}
Py_DECREF(one);
return new_values; //new reference
}
}
static PyObject *compute_incr_values(PyMyCustomClass *obj, PyObject *Py_UNUSED) {
PyObject *new_values = obj->self->compute_incr_values();
obj->self->set_incr_values(new_values);
Py_DECREF(new_values); //Get rid of unused object
Py_RETURN_NONE;
}
The code as presented causes various random read access violations to be triggered in the Python code. However, if I remove Py_DECREF(this->values); in the destructor and remove Py_DECREF(new_values); in the compute_incr_values method, it then works.
I do not understand the issue here. Is there an issue with the handling of the reference count?
I can see at least two issues with your code.
Your set_incr_values function is broken
int set_incr_values( PyObject *new_values) {
Py_DECREF(this->incr_values);
Py_INCREF(new_values);
this->incr_values = new_values;
return 0;
}
There are actually two issues here. First, it can fail if new_values is the same object as this->incr_values: the Py_DECREF may drop the count to zero and free the object before the Py_INCREF runs. Second, Py_DECREF can cause arbitrary code to be executed (read the big red warning in the documentation for Py_DECREF). Therefore, you must ensure that self is in a valid state before the decref. Doing the assignment first is the easiest way of doing this.
The better way to do it would be:
int set_incr_values( PyObject *new_values) {
PyObject *old = this->incr_values;
this->incr_values = new_values;
Py_INCREF(new_values);
Py_DECREF(old); // release the previous value only after the new one is in place
return 0;
}
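The "arbitrary code" point can be illustrated without any C at all. The following is a minimal pure-Python sketch (Python 3 syntax), assuming CPython, where an object is finalized as soon as its last reference goes away. Holder, Noisy and run_demo are hypothetical names used only for this illustration. The finalizer of the old value runs in the middle of the attribute assignment and looks at the holder, which is exactly the situation in which a C++ object must already be in a valid state.

```python
def run_demo():
    """Return what each finalizer saw in the holder when it ran (CPython behaviour)."""
    observed = []

    class Holder:
        # Stands in for MyCustomClass: owns one reference to `value`.
        value = None

    holder = Holder()

    class Noisy:
        def __init__(self, name):
            self.name = name

        def __del__(self):
            # Runs when the last reference disappears, i.e. "inside the decref".
            current = holder.value
            seen = current.name if current is not None else None
            observed.append((self.name, seen))

    holder.value = Noisy("old")
    holder.value = Noisy("new")  # drops the last reference to "old"
    holder.value = None          # drops the last reference to "new"
    return observed


print(run_demo())
```

On CPython the finalizer of "old" already sees "new" stored in the holder, because the interpreter stores the new value before releasing the old one. That is the same ordering the corrected set_incr_values uses.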
values is not a tuple
You uncritically use values as a tuple inside compute_incr_values (e.g. PyTuple_GET_SIZE). However, when you create values you assign None to it (which you know because you point it out in the question, although I don't think "tuple initialized" means anything).
values = Py_BuildValue("");
There's also no error checking. You say in the comments "I have removed the error checking for simplicity". This is generally unhelpful - my first assumption when reading C API code with no error checking is that it's failing just because they're ignoring some exception. That assumption is usually right.
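Expressed in Python terms, the checks that compute_incr_values skips amount to something like the sketch below (Python 3 syntax). The function name increment_all is hypothetical; the point is only that a non-tuple such as None has to be rejected before it is indexed, and that a failing addition has to be reported rather than ignored.

```python
def increment_all(values):
    """Return a new tuple with 1 added to every item of `values`."""
    # The C code uses PyTuple_GET_SIZE without checking the type first;
    # this is the equivalent of the missing PyTuple_Check.
    if not isinstance(values, tuple):
        raise TypeError(f"expected a tuple, got {type(values).__name__}")
    # `item + 1` raises TypeError for items that do not support addition,
    # which mirrors checking the result of the C-level add for NULL.
    return tuple(item + 1 for item in values)


print(increment_all((1, 2.5)))  # (2, 3.5)
```

Calling increment_all(None) raises TypeError instead of reading memory that is not a tuple.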
It isn't possible to tell exactly what's wrong with the code because there's no minimal reproducible example. However, there are plenty of issues based on a quick skim-read, so I'd be suspicious of the rest of it.
Related
According to the accepted answer in "Py_INCREF/DECREF: When", Python objects that are created by functions but not explicitly returned should have their reference counts decremented via DECREF. Does this guideline apply to temporary variables? For example, I could use this:
void PythonInterface::SetModule (const char *filename)
{
PyObject *name = PyUnicode_DecodeFSDefault (filename);
_module = PyImport_Import (name);
Py_XDECREF (name);
}
or this:
void PythonInterface::SetModule (const char *filename)
{
_module = PyImport_Import (PyUnicode_DecodeFSDefault (filename));
}
Are these two bits of code identical, or will the second example cause problems?
Whenever I call this function, memory usage increases a lot per call, so I think there is a memory leak here.
PyObject *pScript, *pModule, *pFunc, *pValue;
PyObject *pArgs = NULL;
long ret = 1;
// Initialize python, set system path and load the module
pScript = SetPyObjectString(PYTHON_SCRIPT_NAME);
PyRun_SimpleString("import sys");
PyRun_SimpleString("sys.path.append('"PYTHON_SCRIPT_PATH"')");
pModule = PyImport_Import(pScript);
Py_XDECREF(pScript);
if (pModule != NULL) {
// Get function object from python module
pFunc = PyObject_GetAttrString(pModule, operation.c_str());
if (pFunc && PyCallable_Check(pFunc)) {
// Create argument(s) as Python tuples
if (operation == UPDATE_KEY) {
// If operation is Update key, create two arguments - key and value
pArgs = PyTuple_New(2);
}
else {
pArgs = PyTuple_New(1);
}
pValue = SetPyObjectString(key.c_str());
// Set argument(s) with key/value strings
PyTuple_SetItem(pArgs, 0, pValue);
if (operation == UPDATE_KEY) {
// If operation is Update key, set two arguments - key and value
pValue = SetPyObjectString(value.c_str());
PyTuple_SetItem(pArgs, 1, pValue);
}
// Call the function using function object and arguments
pValue = PyObject_CallObject(pFunc, pArgs);
Py_XDECREF(pArgs);
if (pValue != NULL) {
// Parse the return values
ret = PyLong_AsLong(PyList_GetItem(pValue, 0));
value = GetPyObjectString(PyList_GetItem(pValue, 1));
}
else {
ERROR("Function call to %s failed", operation.c_str());
}
Py_XDECREF(pValue);
Py_XDECREF(pFunc);
}
else {
ERROR("Cannot find function in python module");
}
Py_XDECREF(pModule);
}
else {
ERROR("Failed to load python module");
}
I am leaking some memory when this C++ snippet in my code calls the python script and I want to know why. I think I am doing something wrong with my Py_DECREFs. Any help would be much appreciated.
I spotted one missing decref from a quick glance:
pFunc = PyObject_GetAttrString(pModule, operation.c_str());
if (pFunc && PyCallable_Check(pFunc)) {
// ...
Py_XDECREF(pFunc);
}
This will leak any non-callable attribute matching operation.
The two reassignments of pValue… I think that's OK, because PyTuple_SetItem steals the reference to each of the original values.
For this line that you asked about:
value = GetPyObjectString(PyList_GetItem(pValue, 1));
The PyList_GetItem returns a borrowed reference, so the fact that you don't decref it is correct.
But I don't see the declaration for value anywhere, or GetPyObjectString, so I have no idea what that part is doing. Maybe it's just getting a borrowed buffer out of a PyUnicodeObject * and copying it into some C++ wstring or UTF-32 string type, or maybe it's leaking a Python object or a copied buffer, or returning a raw C buffer that you just leak here, or… who knows?
But I certainly wouldn't trust that some guy on the internet found all of them on a quick scan. Learn to use a memory debugger.
Or: You're using C++. RAII is almost the whole point of using C++—in other words, instead of using raw PyObject * values, you can use a smart pointer that decrefs things for you automatically. Or, even better, use a ready-made library like PyCXX.
I am trying to create a Python (2.7.12) extension in C that does the following:
Provide a read only nested dictionary with module level scope for the Python programmer.
A background thread invisible to the Python programmer will add, delete, and modify entries in the dictionary.
The extension will be built directly into the Python interpreter.
I created a simplified version of this extension that adds one entry to the dictionary and then constantly modifies it with new values. Below is the C file containing comments about what it is doing along with my understanding how the reference counts are being handled.
#include <Python.h>
#include <pthread.h>
static PyObject *module;
static PyObject *pyitem_error;
static PyObject *item;
static PyObject *item_handle;
static pthread_t thread;
void *stuff(void *param)
{
int garbage = 0;
PyObject *size;
PyObject *value;
while(1)
{
// Build a dictionary called size containing two integer objects
// Py_BuildValue will pass ownership of its reference to size to this thread
size = NULL;
size = Py_BuildValue("{s:i,s:i}", "l", garbage, "w", garbage);
if(size == NULL)
{
goto error;
}
// Build a dictionary containing an integer object and the size dictionary
// Py_BuildValue will create and own a reference to the size dictionary but not steal it
// Py_BuildValue will pass ownership of its reference to value to this thread
value = NULL;
value = Py_BuildValue("{s:i,s:O}", "h", garbage, "base", size);
if(value == NULL)
{
goto error;
}
// Add the new data to the dictionary
// PyDict_SetItemString will borrow a reference to value
PyDict_SetItemString(item, "dim", value);
error:
Py_XDECREF(size);
Py_XDECREF(value);
garbage++;
}
return NULL;
}
// There will be methods for this module in the future
static PyMethodDef pyitem_methods[] =
{
{NULL, NULL, 0, NULL}
};
PyMODINIT_FUNC initpyitem(void)
{
// Create a module object
// Own a reference to it since Py_InitModule returns a borrowed reference
module = Py_InitModule("pyitem", pyitem_methods);
Py_INCREF(module);
// Create an exception object for future use
// Own a second reference to it since PyModule_AddObject will steal a reference
pyitem_error = PyErr_NewException("pyitem.error", NULL, NULL);
Py_INCREF(pyitem_error);
PyModule_AddObject(module, "error", pyitem_error);
// Create a dictionary object and a proxy object that makes it read only
// Own a second reference to the proxy object since PyModule_AddObject will steal a reference
item = PyDict_New();
item_handle = PyDictProxy_New(item);
Py_INCREF(item_handle);
PyModule_AddObject(module, "item", item_handle);
// Start the background thread that modifies the dictionary
pthread_create(&thread, NULL, stuff, NULL);
}
Below is a Python program using this extension. All it does is print out what is in the dictionary.
import pyitem
while True:
print pyitem.item
print
This extension seems to work for a while and then crashes with a segmentation fault. An examination of the core dump reveals the following:
Core was generated by `python pyitem_test.py'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 PyObject_Malloc (nbytes=nbytes@entry=42) at Objects/obmalloc.c:831
831 if ((pool->freeblock = *(block **)bp) != NULL) {
[Current thread is 1 (Thread 0x7f144a824700 (LWP 3931))]
This core dump leads me to believe the issue might have to do with my handling of the object reference counts. I believe this might be one cause since problems posed by others with the same core dump resolved the issue by properly handling reference counts. However, I do not see anything wrong with my handling of the object reference counts.
Another thing that comes to mind is that the print function in Python is likely only borrowing a reference to the contents of the dictionary. When it is trying to print the dictionary (or access its contents in any other way), the background thread comes along and replaces the old entry with a new one. This causes the reference count of the old entry to decrease and the object is then removed by the garbage collector. However, the print function is still trying to use the old reference, which causes an error.
Something that I found interesting is that I can change how quickly or slowly the extension has a segmentation fault by only changing the names of the keys in the dictionaries.
Does anyone have any insights as to what the issue may be? Is there a better way to create the extension and still have the properties that I want?
I believe I have found the cause of the segmentation fault. The background thread is modifying the state of the interpreter without obtaining the Global Interpreter Lock (GIL). This would indeed cause the interpreter to behave in unexpected ways.
To fix this, I first call the function PyEval_InitThreads() in the module initialization function. The next thing to do is enclose any instructions in the background thread that make use of Python C API with the functions PyGILState_Ensure() and PyGILState_Release(). Below is the modified source code with this fix.
#include <Python.h>
#include <pthread.h>
static PyObject *module;
static PyObject *pyitem_error;
static PyObject *item;
static PyObject *item_handle;
static pthread_t thread;
void *stuff(void *param)
{
int garbage = 0;
PyObject *size;
PyObject *value;
PyGILState_STATE state; // Needed for PyGILState_Ensure() and PyGILState_Release()
while(1)
{
// Obtain the GIL
state = PyGILState_Ensure();
size = NULL;
size = Py_BuildValue("{s:i,s:i}", "l", garbage, "w", garbage);
if(size == NULL)
{
goto error;
}
value = NULL;
value = Py_BuildValue("{s:i,s:O}", "h", garbage, "base", size);
if(value == NULL)
{
goto error;
}
PyDict_SetItemString(item, "dim", value);
error:
Py_XDECREF(size);
Py_XDECREF(value);
// Release the GIL
PyGILState_Release(state);
garbage++;
}
return NULL;
}
static PyMethodDef pyitem_methods[] =
{
{NULL, NULL, 0, NULL}
};
PyMODINIT_FUNC initpyitem(void)
{
module = Py_InitModule("pyitem", pyitem_methods);
Py_INCREF(module);
pyitem_error = PyErr_NewException("pyitem.error", NULL, NULL);
Py_INCREF(pyitem_error);
PyModule_AddObject(module, "error", pyitem_error);
item = PyDict_New();
item_handle = PyDictProxy_New(item);
Py_INCREF(item_handle);
PyModule_AddObject(module, "item", item_handle);
// Initialize Global Interpreter Lock (GIL)
PyEval_InitThreads();
pthread_create(&thread, NULL, stuff, NULL);
}
The extension now runs without any segmentation faults.
I am new to the business of writing custom Python modules and I am a bit confused about how Capsules work. I use Python 2.7.6 from the system OSX installation and try to use Capsules (as recommended for Python > 2.7) for passing pointers around (previously PyCObject was used for that). My code does not work at the moment and I would like to get some insights into how things should be handled in principle here. The code should define a class LuscherClm and I want to be able to do the following:
>>> c40=Luscher(4,0)
>>>
>>> c40(0.12)
>>> <print the result of the evaluation>
First question: at the moment I would have to do something like:
>>> c40=Luscher.init(4,0)
>>>
>>> c40.eval(0.12)
Segfault
My first question is therefore: how do I have to modify the method table to have more operator-style casts instead of the member functions init and eval.
However, my code has other problems and here is the relevant part (the underlying C++ class works smoothly, I use it in production a lot):
The destructor:
//destructor
static void clm_destruct(PyObject* capsule){
void* ptr=PyCapsule_GetPointer(capsule,"zetfunc");
Zetafunc* zetptr=static_cast<Zetafunc*>(ptr);
delete zetptr;
return;
}
The constructor: it returns the pointer to the capsule. I do not know whether this is correct. Because in this case when I call, clm=LuscherClm.init(l,m), the clm object is a PyCapsule and has no attribute eval so that I cannot call clm.eval(x) on that. How should this be handled?
//constructor
static PyObject* clm_init(PyObject* self, PyObject *args){
//return value
PyObject* result=NULL;
//parse variables
unsigned int lval=0;
int mval=0;
if(!PyArg_ParseTuple(args,"li",&lval,&mval)){
::std::cout << "Please specify l and m!" << ::std::endl;
return result;
}
//class instance:
Zetafunc* zetfunc=new Zetafunc(lval,mval);
instanceCapsule=PyCapsule_New(static_cast<void*> (zetfunc),"zetfunc",&clm_destruct);
return instanceCapsule;
}
So how is the capsule passed to the evaluate function? the code below is not correct since I have not updated it after moving from CObjects to Capsules. Shall the capsule be a global variable (I do not like that) or how can I pass it to the evaluation function? Or shall I call it on self, but what is self at the moment?
//evaluate the function
static PyObject* clm_evaluate(PyObject* self, PyObject* args){
//get the PyCObject from the capsule:
void* tmpzetfunc=PyCapsule_GetPointer(instanceCapsule,"zetfunc");
if (PyErr_Occurred()){
std::cerr << "Some Error occured!" << std::endl;
return NULL;
}
Zetafunc* zetfunc=static_cast< Zetafunc* >(tmpzetfunc);
//parse value:
double x;
if(!PyArg_ParseTuple(args,"d",&x)){
std::cerr << "Specify a number at which you want to evaluate the function" << std::endl;
return NULL;
}
double result=(*zetfunc)(x).re();
//return the result as a packed function:
return Py_BuildValue("d",result);
}
//methods
static PyMethodDef LuscherClmMethods[] = {
{"init", clm_init, METH_VARARGS, "Initialize clm class!"},
{"eval", clm_evaluate, METH_VARARGS, "Evaluate the Zeta-Function!"},
{NULL, NULL, 0, NULL} /* Sentinel */
};
Python < 3 initialisation function:
PyMODINIT_FUNC
initLuscherClm(void)
{
PyObject *m = Py_InitModule("LuscherClm", LuscherClmMethods);
return;
}
Can you explain to me what is wrong and why? I would like to stay away from SWIG or boost if possible, since this module should be easily portable and I want to avoid having to install additional packages every time I want to use it somewhere else.
Further: what is the overhead produced by the C/API in calling the function? I need to call it an order of O(10^6) times and I would still like it to be fast.
Ok, I am using boost.python now but I get a segfault when I run object.eval(). That is my procedure now:
BOOST_PYTHON_MODULE(threevecd)
{
class_< threevec<double> >("threevecd",init<double,double,double>());
}
BOOST_PYTHON_MODULE(LuscherClm)
{
class_<Zetafunc>("LuscherClm",init<int,int, optional<double,threevec<double>,double,int> >())
.def("eval",&Zetafunc::operator(),return_value_policy<return_by_value>());
boost::python::to_python_converter<dcomplex,dcomplex_to_python_object>();
}
dcomplex is my own complex number implementation. So I had to write a converter:
struct dcomplex_to_python_object
{
static PyObject* convert(dcomplex const& comp)
{
if(fabs(comp.im())<std::numeric_limits<double>::epsilon()){
boost::python::object result=boost::python::object(complex<double>(comp.re(),comp.im()));
return boost::python::incref(result.ptr());
}
else{
return Py_BuildValue("d",comp.re());
}
}
};
Complex128 is a numpy extension which is not understood by boost. So my questions are:
1) how can I return a complex number as a python datatype (is complex a standard python type?)
2) Why do I get a segfault? My result in my test case is real, so it should default to the else statement. I guess that the pointer runs out of scope and that's it. But even in the if-case (where I take care of ref-increments), it segfaults. Can someone help me with the type conversion issue?
Thanks
Thorsten
Ok, I got it. The following converter does the job:
struct dcomplex_to_python_object
{
static PyObject* convert(dcomplex const& comp)
{
PyObject* result;
if(std::abs(comp.im())<=std::numeric_limits<double>::epsilon()){
result=PyFloat_FromDouble(comp.re());
}
else{
result=PyComplex_FromDoubles(comp.re(),comp.im());
}
// No Py_INCREF here: both calls above already return a new reference.
return result;
}
};
Using this converter and the post by Wouter, I suppose my question is answered. Thanks
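On the first question: complex is a built-in Python type, so no NumPy type is needed on the Python side. The decision the converter makes can be written in plain Python as follows (Python 3 syntax). This is only a sketch of the same rule; the function name dcomplex_to_python and the use of sys.float_info.epsilon as the threshold are assumptions made to mirror std::numeric_limits<double>::epsilon() in the C++ code.

```python
import sys


def dcomplex_to_python(re, im):
    """Return a float when the imaginary part is negligible, else a complex."""
    if abs(im) <= sys.float_info.epsilon:
        return float(re)
    return complex(re, im)


print(type(dcomplex_to_python(1.5, 0.0)).__name__)  # float
print(type(dcomplex_to_python(1.0, 2.0)).__name__)  # complex
```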
I have written a Python extension for a C library. I have a data structure that looks like this:
typedef struct _mystruct{
double * clientdata;
size_t len;
} MyStruct;
This datatype maps directly to the list data type in Python. I therefore want to create 'list-like' behavior for the exported struct, so that code written using my C extension is more 'Pythonic'.
In particular, this is what I want to be able to do (from python code)
Note: py_cstruct is a cstruct datatype being accessed in Python.
My requirements can be summarized as:
list(py_cstruct) returns a Python list with all contents copied out from the C struct
py_cstruct[i] returns the ith element (preferably throws IndexError on invalid index)
for elem in py_cstruct: ability to enumerate
According to PEP 234, an object can be iterated over with "for" if it implements __iter__() or __getitem__(). Using that logic, I think that by adding the following attributes (via rename) to my SWIG interface file, I will have the desired behavior (apart from req. #1 above, which I still don't know how to achieve):
__len__
__getitem__
__setitem__
I am now able to index the C object in Python. I have not yet implemented the Python exception throwing; however, if array bounds are exceeded, I return a magic number (error code).
The interesting thing is that when I attempt to iterate over the struct using 'for x in' syntax for example:
for i in py_cstruct:
print i
Python enters an infinite loop that simply prints the magic (error) number mentioned above on the console, which suggests to me that there is something wrong with the indexing.
Last but not least, how can I implement requirement 1? This involves (as I understand it):
'Handling' the function call list() from Python
Returning a Python (list) data type from C code
[[Update]]
I would be interested in seeing a little code snippet on what (if any) declarations I need to put in my interface file, so that I can iterate over the elements of the c struct, from Python.
The simplest solution to this is to implement __getitem__ and throw an IndexError exception for an invalid index.
I put together an example of this, using %extend and %exception in SWIG to implement __getitem__ and raise an exception respectively:
%module test
%include "exception.i"
%{
#include <assert.h>
#include "test.h"
static int myErr = 0; // flag to save error state
%}
%exception MyStruct::__getitem__ {
assert(!myErr);
$action
if (myErr) {
myErr = 0; // clear flag for next time
// You could also check the value in $result, but it's a PyObject here
SWIG_exception(SWIG_IndexError, "Index out of bounds");
}
}
%include "test.h"
%extend MyStruct {
double __getitem__(size_t i) {
if (i >= $self->len) {
myErr = 1;
return 0;
}
return $self->clientdata[i];
}
}
I tested it by adding to test.h:
static MyStruct *test() {
static MyStruct inst = {0,0};
if (!inst.clientdata) {
inst.len = 10;
inst.clientdata = malloc(sizeof(double)*inst.len);
for (size_t i = 0; i < inst.len; ++i) {
inst.clientdata[i] = i;
}
}
return &inst;
}
And running the following Python:
import test
for i in test.test():
print i
Which prints:
python run.py
0.0
1.0
2.0
3.0
4.0
5.0
6.0
7.0
8.0
9.0
and then finishes.
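Why the __getitem__ approach above terminates can be seen in a pure-Python stand-in (Python 3 syntax, so print is a function here). When a class has no __iter__, the interpreter falls back to calling __getitem__ with 0, 1, 2, ... and stops at the first IndexError. The class below is a hypothetical model of the wrapped struct, not SWIG output.

```python
class MyStruct:
    """Pure-Python model of the wrapped struct: a fixed array of doubles."""

    def __init__(self, data):
        self._data = [float(x) for x in data]

    def __len__(self):
        return len(self._data)

    def __getitem__(self, i):
        # Raising IndexError is what ends the for loop; returning a
        # magic number here would make the loop run forever.
        if not 0 <= i < len(self._data):
            raise IndexError("Index out of bounds")
        return self._data[i]


s = MyStruct(range(3))
for x in s:
    print(x)
print(list(s))  # [0.0, 1.0, 2.0]
```

The same fallback is what makes list(s) work, which covers requirement 1 without any extra code.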
An alternative approach, using a typemap to map MyStruct onto a PyList directly is possible too:
%module test
%{
#include "test.h"
%}
%typemap(out) (MyStruct *) {
PyObject *list = PyList_New($1->len);
for (size_t i = 0; i < $1->len; ++i) {
PyList_SetItem(list, i, PyFloat_FromDouble($1->clientdata[i]));
}
$result = list;
}
%include "test.h"
This will create a PyList with the return value from any function that returns a MyStruct *. I tested this %typemap(out) with the exact same function as the previous method.
You can also write a corresponding %typemap(in) and %typemap(freearg) for the reverse, something like this untested code:
%typemap(in) (MyStruct *) {
if (!PyList_Check($input)) {
SWIG_exception(SWIG_TypeError, "Expecting a PyList");
return NULL;
}
MyStruct *tmp = malloc(sizeof(MyStruct));
tmp->len = PyList_Size($input);
tmp->clientdata = malloc(sizeof(double) * tmp->len);
for (size_t i = 0; i < tmp->len; ++i) {
tmp->clientdata[i] = PyFloat_AsDouble(PyList_GetItem($input, i));
if (PyErr_Occurred()) {
free(tmp->clientdata);
free(tmp);
SWIG_exception(SWIG_TypeError, "Expecting a double");
return NULL;
}
}
$1 = tmp;
}
%typemap(freearg) (MyStruct *) {
free($1->clientdata);
free($1);
}
Using an iterator would make more sense for containers like linked lists, but for completeness sake here's how you might go about doing it for MyStruct with __iter__. The key bit is that you get SWIG to wrap another type for you, which provides the __iter__() and next() needed, in this case MyStructIter which is defined and wrapped at the same time using %inline since it's not part of the normal C API:
%module test
%include "exception.i"
%{
#include <assert.h>
#include "test.h"
static int myErr = 0;
%}
%exception MyStructIter::next {
assert(!myErr);
$action
if (myErr) {
myErr = 0; // clear flag for next time
PyErr_SetString(PyExc_StopIteration, "End of iterator");
return NULL;
}
}
%inline %{
struct MyStructIter {
double *ptr;
size_t len;
};
%}
%include "test.h"
%extend MyStructIter {
struct MyStructIter *__iter__() {
return $self;
}
double next() {
if ($self->len--) {
return *$self->ptr++;
}
myErr = 1;
return 0;
}
}
%extend MyStruct {
struct MyStructIter __iter__() {
struct MyStructIter ret = { $self->clientdata, $self->len };
return ret;
}
}
The requirements for iteration over containers are that the container implements __iter__() and returns a new iterator from it. The iterator, in addition to next(), which returns the next item and advances the iterator, must itself also supply an __iter__() method. This means that either the container or an iterator can be used identically.
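The container/iterator split described here looks like this in plain Python (Python 3 syntax, where the method is spelled __next__ rather than next). The class names mirror the SWIG example, but this is a hypothetical pure-Python sketch, not what SWIG generates.

```python
class MyStructIter:
    """Iterator: remembers a position and hands out one item per call."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    def __iter__(self):
        # An iterator returns itself, so it can be used wherever
        # the container can.
        return self

    def __next__(self):
        if self._pos >= len(self._data):
            raise StopIteration
        value = self._data[self._pos]
        self._pos += 1
        return value


class MyStruct:
    """Container: every call to __iter__ returns a fresh iterator."""

    def __init__(self, data):
        self._data = list(data)

    def __iter__(self):
        return MyStructIter(self._data)


s = MyStruct([1.0, 2.0, 3.0])
print(list(s))        # [1.0, 2.0, 3.0]
print(list(iter(s)))  # [1.0, 2.0, 3.0]
```

Because the container returns a new iterator each time, it can be looped over repeatedly, whereas a single iterator is exhausted after one pass.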
MyStructIter needs to keep track of the current state of iteration: where we are and how much we have left. In this example I did that by keeping a pointer to the next item and a counter that we use to tell when we hit the end. You could also have kept track of the state by keeping a pointer to the MyStruct the iterator is using and a counter for the position within that, something like:
%inline %{
struct MyStructIter {
MyStruct *list;
size_t pos;
};
%}
%include "test.h"
%extend MyStructIter {
struct MyStructIter *__iter__() {
return $self;
}
double next() {
if ($self->pos < $self->list->len) {
return $self->list->clientdata[$self->pos++];
}
myErr = 1;
return 0;
}
}
%extend MyStruct {
struct MyStructIter __iter__() {
struct MyStructIter ret = { $self, 0 };
return ret;
}
}
(In this instance we could actually have just used the container itself as the iterator, by supplying an __iter__() that returned a copy of the container and a next() similar to the first type. I didn't do that in my original answer because I thought that would be less clear than having two distinct types: a container and an iterator for that container.)
Look up using the %typemap swig command. http://www.swig.org/Doc2.0/SWIGDocumentation.html#Typemaps
http://www.swig.org/Doc2.0/SWIGDocumentation.html#Typemaps_nn25
The memberin typemap might do what you want.
http://www.swig.org/Doc2.0/SWIGDocumentation.html#Typemaps_nn35
I have a typemap that I found in the Python section that allows me to transfer char** data into the C++ as a list of Python strings. I would guess there would be similar functionality.
Also, you can define %pythoncode in your interface inside the struct inside the swig "i" file. This will allow you to add python methods in the object that gets created for the struct. There is another command %addmethod (I think) that allows you to add methods to the struct or a class as well. Then you can create methods for indexing the objects in C++ or C if you want. There are a lot of ways to solve this.
For an interface I am working on, I used a class object that has some methods for accessing the data in my code. Those methods are written in C++. Then I used the %pythoncode directive inside the class inside of the "i" file and created "__getitem__" and "__setitem__" methods in Python code that use the exposed C++ methods to make it look like dictionary-style access.
You say you have yet to implement Python exception throwing - that's the problem. From PEP 234:
A new exception is defined, StopIteration, which can be used to signal the end of an iteration.
You must set this exception at the end of your iteration. Since your code doesn't do this, you're running into the situation you've described:
The interpreter loops through your list's custom iternext function
Your function gets to the end of the array, and rather than correctly setting the StopIteration exception, simply returns your 'magic number'.
The interpreter, seeing no good reason to stop iterating, simply continues to print the value returned by iternext... your magic number. To the interpreter, it's just yet another list member.
Fortunately, this is a pretty simple fix, though may not seem as straightforward, because C has no exception facility. The Python C API simply uses a global error indicator that you set when an exception situation is raised, and then the API standards dictate you return NULL all the way up the stack to the interpreter, which then looks at the output of PyErr_Occurred() to see if an error is set, and if it is, prints the relevant exception and traceback.
So in your function, when you reach the end of the array, you just need this:
PyErr_SetString(PyExc_StopIteration,"End of list");
return NULL;
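The difference between returning a magic number and raising StopIteration can be reproduced in pure Python (Python 3 syntax). In this sketch BrokenIter, FixedIter and MAGIC are hypothetical names; itertools.islice is used only to cut the otherwise endless loop short.

```python
from itertools import islice

MAGIC = -1.0  # stand-in for the error code returned past the end


class BrokenIter:
    """Returns MAGIC past the end, so the loop never learns it should stop."""

    def __init__(self, data):
        self._data = list(data)
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._data):
            return MAGIC
        value = self._data[self._pos]
        self._pos += 1
        return value


class FixedIter(BrokenIter):
    """Same iterator, but signals the end the way the interpreter expects."""

    def __next__(self):
        if self._pos >= len(self._data):
            raise StopIteration
        return super().__next__()


print(list(islice(BrokenIter([1.0, 2.0]), 5)))  # [1.0, 2.0, -1.0, -1.0, -1.0]
print(list(FixedIter([1.0, 2.0])))              # [1.0, 2.0]
```

Without the islice limit, the first loop would keep producing the magic number indefinitely, which is the behaviour described in the question.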
Here's another great answer for further reading on this issue: How to create a generator/iterator with the Python C API?
I encountered the very same problem with Python 2.6, and solved it thanks to @aphex's reply.
But I wanted to avoid any magic value or extra boolean to pass the end-of-list condition. Sure enough, my iterator has an atEnd() method that tells me I am past the end of the list.
So in fact, it is fairly easy with SWIG exception handling. I just had to add the following magic:
%ignore MyStructIter::atEnd();
%exception MyStructIter::next {
if( $self->list->atEnd() ) {
PyErr_SetString(PyExc_StopIteration,"End of list");
SWIG_fail;
}
$action
}
The point is that this snippet skips the next() call completely once you are past the end of the list.
If you stick to your idioms, it should look like:
%exception MyStructIter::next {
if( $self->pos >= $self->list->len ) {
PyErr_SetString(PyExc_StopIteration,"End of list");
SWIG_fail;
}
$action
}
NOTE FOR PYTHON 3.x:
In Python 3 the iterator method must carry the magic double-underscore prefix and suffix, i.e. __next__() instead of next(). One option is simply to add:
%rename(__next__) MyStructIter::next;