I was wondering what is the correct way to wrap an array of strings in C to a Python list using SWIG.
The array is inside a struct :
typedef struct {
char** my_array;
char* some_string;
}Foo;
SWIG automatically wraps some_string to a python string.
What should I put in the SWIG interface file so that I can access my_array in Python as a regular Python string list ['string1', 'string2' ] ?
I have used a typemap as suggested:
%typemap(python,out) char** {
int len,i;
len = 0;
while ($1[len]) len++;
$result = PyList_New(len);
for (i = 0; i < len; i++) {
PyList_SetItem($result,i,PyString_FromString($1[i]));
}
}
But that still didn't work. In Python, the my_array variable appears as SwigPyObject: _20afba0100000000_p_p_char.
I wonder if that is because the char** is inside a struct? Maybe I need to inform SWIG that?
Any ideas?
I don't think there is an option to handle this conversion automatically in SWIG. You need to use SWIG's typemap feature and write the type converter manually. Here you can find a conversion from a Python list to char**: http://www.swig.org/Doc1.3/Python.html#Python_nn59, so half of the job is done. What you need to do now is check the rest of the typemap documentation and write a converter from char** to a Python list.
I am not an expert on this but I think:
%typemap(python,out) char** {
applies to a function that returns char **. Your char ** is inside a structure. Have a look at the code generated by SWIG to confirm whether the map got applied or not.
You might have to use something like:
%typemap(python,out) struct Foo {
To have a map that works on a structure Foo that gets returned.
Background: I have used the same typemap definition successfully, but for a function returning char **.
I am sorry for being slightly off-topic, but if it is an option for you I would strongly recommend using ctypes instead of swig. Here is a related question I asked previously in ctypes context: Passing a list of strings to from python/ctypes to C function expecting char **
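As a rough illustration of the ctypes route, the sketch below mirrors the loop in the question's typemap: it walks a NULL-terminated char** and collects the entries into a Python list. The Foo layout follows the struct in the question; the helper name charpp_to_list and the sample data built in Python (instead of coming from a real C library) are assumptions made for the example.

```python
import ctypes


class Foo(ctypes.Structure):
    # Mirrors the C struct from the question.
    _fields_ = [
        ("my_array", ctypes.POINTER(ctypes.c_char_p)),
        ("some_string", ctypes.c_char_p),
    ]


def charpp_to_list(arr):
    """Collect the strings of a NULL-terminated char** into a Python list."""
    result = []
    i = 0
    # ctypes returns None when the char* at index i is NULL.
    while arr[i] is not None:
        result.append(arr[i].decode("utf-8"))
        i += 1
    return result


# Hypothetical sample data; in real use the struct would be filled in by C code.
_storage = (ctypes.c_char_p * 3)(b"string1", b"string2", None)
foo = Foo(ctypes.cast(_storage, ctypes.POINTER(ctypes.c_char_p)), b"hello")

strings = charpp_to_list(foo.my_array)
print(strings)
```

The array must really end with a NULL entry, otherwise the loop reads past the end, exactly as the while loop in the typemap would.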
I would like to pass as argument of a function in my C module an array of uint8_t's.
I couldn't find a method to directly parse this array, so I'm parsing it to a PyObject_t and then iterating as a PyTuple_t object. This way, I need to cast each element PyObject_t of this tuple to uint8_t.
How can I do that, given that there is no PyInt_FromUINT8_t function or anything like it?
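You can usually get away with the B format unit, which stores its result in an unsigned char. According to the Parsing Arguments documentation, you should be able to do:
uint8_t b;
/* args is the argument tuple passed to your extension function */
if (!PyArg_ParseTuple(args, "B", &b)) {
return NULL;
}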
If you are not parsing arguments directly (e.g. you are dealing with a PyObject), simply use one of the PyInt_* (Python 2 only), PyLong_* or PyNumber_* functions (https://docs.python.org/3/c-api/number.html?highlight=pynumber#c.PyNumber_AsSsize_t).
Converting from a uint8_t to a PyObject is simple as well: you can use PyInt_FromLong (Python 2) or PyLong_FromLong.
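The difference between a range-checked and an unchecked conversion to an 8-bit unsigned value (the "b" and "B" format units, respectively) can be sketched from the Python side with ctypes. This is only an illustration of the semantics, not C API code; the helper names are made up for the example.

```python
import ctypes

UINT8_MAX = 0xFF


def to_uint8_checked(value):
    """Reject values that do not fit in a uint8_t (like the "b" format unit)."""
    if not 0 <= value <= UINT8_MAX:
        raise OverflowError("value %r does not fit in uint8_t" % (value,))
    return value


def to_uint8_unchecked(value):
    """Silently keep only the low 8 bits (like the "B" format unit)."""
    return ctypes.c_uint8(value).value


def tuple_to_uint8_array(values):
    """Build a C array of uint8_t from a Python tuple, with range checking."""
    checked = [to_uint8_checked(v) for v in values]
    return (ctypes.c_uint8 * len(checked))(*checked)


arr = tuple_to_uint8_array((1, 2, 255))
print(list(arr))
print(to_uint8_unchecked(300))
```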
This (I hope) is quite a simple issue, but despite doing some reading (I'm v. new to SWIG, and fairly green C-wise) I'm just not able to make the "connection" in my head.
I have a function from a library (legacy code, keen not to edit):
extern int myfunction(char *infile, char *maskfile, int check, float *median, char *msg)
My aim is to create a wrapper for this in Python using SWIG.
The values of the median and msg variables are changed by the C function. When the return int != 0 then there will be some error information in the msg arg. Where the return int == 0, then median variable will contain a float with value assigned from myfunction.
This generally runs OK where the return value is 0. I use %array_functions and %pointer_functions to create the pointers needing to be passed, as per this .i file:
%module test
%include "cpointer.i"
%include "carrays.i"
%{
#include <stdint.h>
%}
extern int myfunction(char *infile, char *maskfile, int check, float *median, char *msg);
%pointer_functions(float, floatp);
%pointer_functions(char, charp);
%array_functions(char, charArray);
After swig-ing, compiling and linking, I can call the function in python:
import test
errmsg_buffer = 1024
_infile = 'test2.dat'
infile = test.new_charArray(len(_infile))
for i in xrange(len(_infile)):
test.charArray_setitem(infile,i,_infile[i])
maskfile = test.new_charArray(1)
test.charArray_setitem(maskfile,0,'')
check = 0
med = test.new_floatp()
errmsg = test.new_charArray(errmsg_buffer)
out = test.myfunction(infile,maskfile,check,med,errmsg)
median = test.floatp_value(med)
This works sometimes, but often not - I get a lot of segfaults which are generally fixed by changing the errmsg_buffer length (clearly not a useful fix!). The C code that changes the msg string is:
(void)sprintf(errmsg,"file not found");
My main issue is in proper handling of msg string, which I suspect is causing the segfaults (and might be due to incorrect implementation via new_charArray?).
What is the best way to do this?
Can I add something to the .i that converts the char *msg into a python str?
Can this be done without "pre-initialising" with new_charArray? I'd presumably get a buffer overflow if errmsg_buffer is too small.
I hope this is clear - happy to add comments for further discussion.
Your wrapper can be much simplified using SWIG. Try this SWIG interface file (details below):
%module test
%include "typemaps.i"
%include "cstring.i"
%apply float *OUTPUT { float *median };
%cstring_bounded_output(char *msg, 1024);
extern int myfunction(char *infile, char *maskfile, int check, float *median, char *msg);
Then, from python, use the module in the following way:
import test
infile = 'test2.dat'
maskfile = ''
check = 0
out, median, errmsg = test.myfunction(infile,maskfile,check)
if out != 0: print(errmsg)
...
However, from what you write, it is not quite clear to me why your approach segfaults.
Details
The typemaps.i file contains the float *OUTPUT typemap, which is then applied to the float *median argument and turns this from an argument into a float output value. See the SWIG docs on argument handling for details.
The cstring.i file contains SWIG macros to deal with C strings. Here, I used the %cstring_bounded_output macro. This creates a char * buffer of the given size 1024 and passes it as the argument for char *msg automatically. Then, once the function completes, the contents are converted into a Python string and appended to the output. See here for details.
SWIG handles the first two char * arguments by default, that is converting python strings to appropriate char * and passing these. Note that the passed char * for these arguments are immutable, i.e., if your myfunction attempts to modify these, bad things will happen. Read about how SWIG handles C strings here.
So, your wrapped myfunction then is used as shown above and has the following signature in python:
myfunction(infile, maskfile, check) -> (out, median, msg)
EDIT:
The SWIG docs about carrays.i state:
Note: %array_functions() and %array_class() should not be used with types of char or char *.
I think your code does not create correctly NUL-terminated C strings (char *), which could be causing the segfaults.
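To make the termination issue concrete, here is a small ctypes sketch (not SWIG code) showing that a C string needs one byte more than the Python string's length, which new_charArray(len(_infile)) in the question does not provide. The errmsg assignment only imitates what the C function's sprintf would do.

```python
import ctypes

name = b"test2.dat"

# create_string_buffer(bytes) allocates len(name) + 1 bytes and appends the NUL.
buf = ctypes.create_string_buffer(name)
print(len(name), len(buf))

# A fixed-size, zero-filled output buffer, comparable to the 1024-byte msg buffer.
errmsg = ctypes.create_string_buffer(1024)
errmsg.value = b"file not found"  # stands in for the sprintf in the C code

# .value reads up to the first NUL, like a C string.
print(errmsg.value)
```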
I have not studied SWIG very deeply, but I will try to give you some suggestions.
1.
If your program modifies the input parameter or uses it to return data, consider using the cstring.i library file described in the SWIG Library chapter.
Data is copied into a new Python string and returned.
If your program needs to work with binary data, you can use a typemap to expand a Python string into a pointer/length argument pair. As luck would have it, just such a typemap is already defined. Just do this:
%apply (char *STRING, int LENGTH) { (char *data, int size) };
...
int parity(char *data, int size, int initial);
Python:
parity("e\x09ffss\x00\x00\x01\nx", 0)
If you need to return binary data, you might use the cstring.i library file. The cdata.i library can also be used to extract binary data from arbitrary pointers.
2. I think "pre-initialising" may be necessary.
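Regarding the binary data mentioned in point 1, the reason a pointer/length pair is needed can be sketched with ctypes (an illustration only, independent of SWIG): reading the data as a plain C string stops at the first NUL byte, while reading with an explicit length keeps everything.

```python
import ctypes

data = b"e\x09ffss\x00\x00\x01\nx"

# Interpreted as a NUL-terminated C string, the data is cut at the first \x00.
as_c_string = ctypes.c_char_p(data).value

# Read back with an explicit length, all bytes survive.
buf = ctypes.create_string_buffer(data)
as_binary = ctypes.string_at(buf, len(data))

print(len(as_c_string), len(as_binary))
```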
I use SWIG for generating wrappers. Therefore I need a function which looks like
%inline %{
// Serializes into a string
void* SerCmd(Class *v, int *length, char *str)
{
QByteArray ba;
QDataStream out(&ba, QIODevice::WriteOnly);
out << *v;
*length = ba.size();
str = new char[ba.size()];
memcpy(str, ba.constData(), ba.size());
return str;
}
%}
This function is called from python then but who is deleting the memory I allocate with new? Is python doing that for me or how can that be achieved?
Thanks!
If this doesn't answer your question, I will remove it. But according to the SWIG information found here:
http://www.swig.org/Doc1.3/Library.html#Library_stl_cpp_library
a std::string can be used instead of manually allocating memory. Given this information, this more than likely can be used.
%inline %{
// Serializes into a string
void SerCmd(Class *v, int *length, std::string& str)
{
QByteArray ba;
QDataStream out(&ba, QIODevice::WriteOnly);
out << *v;
*length = ba.size();
str.clear();
str.append(ba.constData(), ba.size());
}
%}
Since you noted that the std::string can contain NULLs, then the proper way to handle it is to use the string::append() function.
http://en.cppreference.com/w/cpp/string/basic_string/append
Please note item 4) in the link above (null characters are perfectly ok). Note that std::string does not determine its size by a null character, unlike C-strings.
Now, to get to this data, use the string::data() function, along with the string::size() function to tell you how much data you have in the string.
I don't know SWIG either, but since you asked, "Is python doing that for me or how can that be achieved?"
Python garbage collection deletes stuff when it is no longer in scope i.e. no longer has anything pointing to it. However, it cannot delete stuff it is not aware of. These docs may help.
Here is the doc on how memory management works:
https://docs.python.org/2/c-api/memory.html
And here are the docs for the gc module, which give you more control over the process.
https://docs.python.org/2/library/gc.html
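The point about Python only reclaiming objects it knows about can be sketched as follows. The Payload class is invented for the example; the weak reference is just a probe to observe when the object disappears. Memory obtained with new inside C++ code is not covered by this mechanism and still has to be released by the C++ side.

```python
import gc
import weakref


class Payload:
    """A Python-owned object; the interpreter tracks and frees it."""

    def __init__(self, data):
        self.data = data


obj = Payload(bytearray(16))
probe = weakref.ref(obj)  # does not keep obj alive

alive_before = probe() is not None

del obj       # drop the last strong reference
gc.collect()  # not strictly needed in CPython, but makes the point explicit

alive_after = probe() is not None
print(alive_before, alive_after)
```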
Swig's manual is kinda confusing to me. I am wrapping my C library to python so that I can test my C code in python. Now I want to know how I can access the C pointer address in Python,
for example, this is the code I have
typedef struct _buffer_t {
char buf[1024];
char *next_ptr;
} buffer_t;
void parse_buffer(buffer_t * buf_p) {
buf_p -> next_ptr ++;
}
What I wanted to do is below, in C code
buffer_t my_buf;
my_buf.next_ptr = my_buf.buf;
parse_buffer(&my_buf);
expect_equal(my_buf.buf + 1, my_buf.next_ptr);
How do I do the same thing in Python?
After importing the SWIG-wrapped module, I have the buffer_t class in Python.
The problem is that SWIG is trying to wrap the buffer as a String in Python, which is not just a pointer. The synthesised assignment for next_ptr will allocate memory and make a string copy, not just do a pointer assignment. You can work around this in several ways.
The simplest is to use %extend to add a "reset buffer" method in Python:
%extend {
void resetPtr() {
$self->next_ptr=$self->buf;
}
}
which can then be called in Python to make the assignment you want.
Alternatively if you want to force the buffer to be treated as just another pointer you could force SWIG to treat both members as void* instead of char* and char[]. I had hoped that would be as simple as a %apply per type, but it seems not to work correctly for memberin and memberout in my testing:
%apply void * { char *next_ptr };
%apply void * { char buf[ANY] };
Given that the memberin/memberout typemaps are critical to making this work I think %extend is by far the cleanest solution.
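For comparison, the "treat it as a plain pointer" idea can be sketched outside SWIG with ctypes, where declaring next_ptr as a void pointer avoids any string conversion. The Python versions of reset_ptr and parse_buffer below are stand-ins written for this example; they imitate the %extend method and the C function rather than calling them.

```python
import ctypes


class BufferT(ctypes.Structure):
    # Same layout as buffer_t, but next_ptr is exposed as a raw address.
    _fields_ = [
        ("buf", ctypes.c_char * 1024),
        ("next_ptr", ctypes.c_void_p),
    ]


def buf_address(b):
    """Address of the first byte of b.buf."""
    return ctypes.addressof(b) + BufferT.buf.offset


def reset_ptr(b):
    """Stand-in for the resetPtr() method: next_ptr = buf."""
    b.next_ptr = buf_address(b)


def parse_buffer(b):
    """Stand-in for the C parse_buffer(): next_ptr++."""
    b.next_ptr = b.next_ptr + 1


my_buf = BufferT()
reset_ptr(my_buf)
parse_buffer(my_buf)

offset = my_buf.next_ptr - buf_address(my_buf)
print(offset)
```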
I'd like to use some existing C++ code, NvTriStrip, in a Python tool.
SWIG easily handles the functions with simple parameters, but the main function, GenerateStrips, is much more complicated.
What do I need to put in the SWIG interface file to indicate that primGroups is really an output parameter and that it must be cleaned up with delete[]?
///////////////////////////////////////////////////////////////////////////
// GenerateStrips()
//
// in_indices: input index list, the indices you would use to render
// in_numIndices: number of entries in in_indices
// primGroups: array of optimized/stripified PrimitiveGroups
// numGroups: number of groups returned
//
// Be sure to call delete[] on the returned primGroups to avoid leaking mem
//
bool GenerateStrips( const unsigned short* in_indices,
const unsigned int in_numIndices,
PrimitiveGroup** primGroups,
unsigned short* numGroups,
bool validateEnabled = false );
FYI, here is the PrimitiveGroup declaration:
enum PrimType
{
PT_LIST,
PT_STRIP,
PT_FAN
};
struct PrimitiveGroup
{
PrimType type;
unsigned int numIndices;
unsigned short* indices;
PrimitiveGroup() : type(PT_STRIP), numIndices(0), indices(NULL) {}
~PrimitiveGroup()
{
if(indices)
delete[] indices;
indices = NULL;
}
};
Have you looked at the SWIG documentation regarding the "cpointer.i" and "carrays.i" libraries? They're found here. That's how you have to manipulate things unless you want to create your own utility libraries to accompany the wrapped code. Here's the link to the Python handling of pointers with SWIG.
On to your question about getting it to recognize input versus output. There is another section in the documentation here that describes exactly that. You label things OUTPUT in the *.i file. So in your case you'd write:
%inline %{
extern bool GenerateStrips( const unsigned short* in_indices,
const unsigned int in_numIndices,
PrimitiveGroup** OUTPUT,
unsigned short* numGroups,
bool validateEnabled );
%}
which gives you a function that returns both the bool and the PrimitiveGroup* array as a tuple.
Does that help?
It's actually so easy to make python bindings for things directly that I don't know why people bother with confusing wrapper stuff like SWIG.
Just use Py_BuildValue once per element of the outer array, producing one tuple per row. Store those tuples in a C array. Then call PyList_New and PyList_SetSlice to generate a list of tuples, and return the list pointer from your C function.
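The target shape, one tuple per group collected into a list, can be sketched from Python with ctypes. This is only an illustration of the resulting data structure, not the C API code described above: the struct mirrors PrimitiveGroup under the assumption that the enum is stored as an int, groups_to_list is a made-up helper, and the sample groups are built in Python instead of being returned by GenerateStrips.

```python
import ctypes

PT_LIST, PT_STRIP, PT_FAN = range(3)


class PrimitiveGroup(ctypes.Structure):
    # Assumes PrimType is represented as a C int.
    _fields_ = [
        ("type", ctypes.c_int),
        ("numIndices", ctypes.c_uint),
        ("indices", ctypes.POINTER(ctypes.c_ushort)),
    ]


def groups_to_list(groups, num_groups):
    """Convert an array of PrimitiveGroup into a list of (type, indices) tuples."""
    result = []
    for g in range(num_groups):
        grp = groups[g]
        indices = [grp.indices[i] for i in range(grp.numIndices)]
        result.append((grp.type, indices))
    return result


# Hypothetical sample data standing in for the output of GenerateStrips.
idx_a = (ctypes.c_ushort * 3)(0, 1, 2)
idx_b = (ctypes.c_ushort * 4)(2, 1, 3, 4)
groups = (PrimitiveGroup * 2)()
groups[0].type, groups[0].numIndices = PT_LIST, 3
groups[0].indices = ctypes.cast(idx_a, ctypes.POINTER(ctypes.c_ushort))
groups[1].type, groups[1].numIndices = PT_STRIP, 4
groups[1].indices = ctypes.cast(idx_b, ctypes.POINTER(ctypes.c_ushort))

as_list = groups_to_list(groups, 2)
print(as_list)
```

With the real library, the array would still have to be released with delete[] on the C++ side after the conversion, as the header comment requires.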
I don't know how to do it with SWIG, but you might want to consider moving to a more modern binding system like Pyrex or Cython.
For example, Pyrex gives you access to C++ delete for cases like this. Here's an excerpt from the documentation:
Disposal
The del statement can be applied to a pointer to a C++ struct
to deallocate it. This is equivalent to delete in C++.
cdef Shrubbery *big_sh
big_sh = new Shrubbery(42.0)
display_in_garden_show(big_sh)
del big_sh
http://www.cosc.canterbury.ac.nz/greg.ewing/python/Pyrex/version/Doc/Manual/using_with_c++.html