Q on Python serialization/deserialization - python

What options do I have to instantiate, keep, and serialize/deserialize to/from binary data Python classes reflecting this pattern (adapted from RFC 2246 [TLS]):
enum { apple, orange } VariantTag;

struct {
    uint16 number;
    opaque string<0..10>;   /* variable length */
} V1;

struct {
    uint32 number;
    opaque string[10];      /* fixed length */
} V2;

struct {
    select (VariantTag) {   /* value of selector is implicit */
        case apple:  V1;    /* VariantBody, tag = apple */
        case orange: V2;    /* VariantBody, tag = orange */
    } variant_body;         /* optional label on variant */
} VariantRecord;
Basically I would have to define a (variant) class VariantRecord, whose layout varies depending on the value of VariantTag. That's not that difficult. The challenge is to find the most generic way to build a class that serializes/deserializes to and from a byte stream... Pickle, Google protocol buffers and marshal are all not an option.
I had little success with an explicit "def serialize" in my class, and I'm not very happy with it because it's not generic enough.
I hope I have expressed the problem clearly.
My current solution for the case VariantTag = apple looks like this, but I don't like it too much:
import binascii
import struct
class VariantRecord(object):
    def __init__(self, number, opaque):
        self.number = number
        self.opaque = opaque

    def serialize(self):
        out = struct.pack('>HB%ds' % len(self.opaque), self.number, len(self.opaque), self.opaque)
        return out
v = VariantRecord(10, 'Hello')
print binascii.hexlify(v.serialize())
>> 000a0548656c6c6f
Regards

Two suggestions:
1. For the variable-length structure, use a fixed format and just slice the result.
2. Use struct.Struct.
For example, if I've understood your formats correctly (is the length byte that appeared in your example, but wasn't mentioned originally, also present in the other variant?):
>>> import binascii
>>> import struct
>>> V1 = struct.Struct(">H10p")
>>> V2 = struct.Struct(">L10p")
>>> def serialize(variant, n, s):
...     if variant:
...         return V2.pack(n, s)
...     else:
...         return V1.pack(n, s)[:len(s)+3]
...
>>> print binascii.hexlify(serialize(False, 10, 'hello')) #V1
000a0568656c6c6f
>>> print binascii.hexlify(serialize(True, 10, 'hello')) #V2
0000000a0568656c6c6f00000000
>>>
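Going the other way is symmetric; here is a minimal deserialization sketch under the same assumptions (the parse helper is mine, not part of the original answer). For the sliced V1 case the bytes are padded back to the fixed Struct size before unpacking:

>>> def parse(variant, data):
...     if variant:
...         n, s = V2.unpack(data)
...     else:
...         # pad the sliced V1 bytes back to the fixed size before unpacking
...         n, s = V1.unpack(data + '\x00' * (V1.size - len(data)))
...     return n, s
...
>>> parse(False, serialize(False, 10, 'hello'))
(10, 'hello')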

How to create an Enum object in Python C API?

I'm struggling with how to create a Python Enum object inside the Python C API. The enum class has tp_base assigned to PyEnum_Type, so it inherits from Enum. But I can't figure out a way to tell the Enum base class which items are in the enum. I want to allow iteration and lookup from Python using the __members__ attribute that every Python Enum provides.
Thank you,
Jelle
It is not straightforward at all. Enum is a Python class that uses a Python metaclass. It is possible to create it in C, but that would just be emulating the constructing Python code in C - the end result is the same, and while it speeds things up slightly, you'll most probably run the code only once per program run.
In any case it is possible, but it is not easy at all. I'll show how to do it in Python:
from enum import Enum
class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3
print(Color)
print(Color.RED)
is the same as:
from enum import Enum
name = 'Color'
bases = (Enum,)
enum_meta = type(Enum)
namespace = enum_meta.__prepare__(name, bases)
namespace['RED'] = 1
namespace['GREEN'] = 2
namespace['BLUE'] = 3
Color = enum_meta(name, bases, namespace)
print(Color)
print(Color.RED)
The latter is the code that you need to translate into C.
Edited note: An answer on a very similar question details how enum.Enum has a functional interface that can be used instead. That is almost certainly the correct approach. I think my answer here is a useful alternative approach to be aware of, although it probably isn't the best solution to this problem.
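For reference, the functional interface mentioned above looks roughly like this (a short sketch of my own, not taken from that answer):

from enum import Enum

Color = Enum('Color', {'RED': 1, 'GREEN': 2, 'BLUE': 3})
print(Color.RED)                 # Color.RED
print(list(Color.__members__))   # ['RED', 'GREEN', 'BLUE']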
I'm aware that this answer is slightly cheating, but this is exactly the kind of code that's better written in Python, and in the C API we still have access to the full Python interpreter. My reasoning for this is that the main reason to keep things entirely in C is performance, and it seems unlikely that creating enum objects will be performance critical.
I'll give three versions, essentially depending on the level of complexity.
First, the simplest case: the enum is entirely known and defined at compile time. Here we simply set up an empty global dict, run the Python code, then extract the enum from the global dict:
PyObject* get_enum(void) {
    const char str[] = "from enum import Enum\n"
                       "class Colour(Enum):\n"
                       "    RED = 1\n"
                       "    GREEN = 2\n"
                       "    BLUE = 3\n"
                       "";

    PyObject *global_dict=NULL, *should_be_none=NULL, *output=NULL;

    global_dict = PyDict_New();
    if (!global_dict) goto cleanup;

    should_be_none = PyRun_String(str, Py_file_input, global_dict, global_dict);
    if (!should_be_none) goto cleanup;

    // extract Colour from global_dict
    output = PyDict_GetItemString(global_dict, "Colour");
    if (!output) {
        // PyDict_GetItemString does not set an exception
        PyErr_SetString(PyExc_KeyError, "could not get 'Colour'");
    } else {
        Py_INCREF(output); // PyDict_GetItemString returns a borrowed reference
    }

cleanup:
    Py_XDECREF(global_dict);
    Py_XDECREF(should_be_none);
    return output;
}
Second, we might want to change what we define in C at runtime. For example, maybe the input parameters pick the enum values. Here, I'm going to use string formatting to insert the appropriate values into our string. There are a number of options here: sprintf, PyBytes_Format, the C++ standard library, or using Python strings (perhaps with another call into Python code?). Pick whichever you're most comfortable with.
PyObject* get_enum_fmt(int red, int green, int blue) {
    const char str[] = "from enum import Enum\n"
                       "class Colour(Enum):\n"
                       "    RED = %d\n"
                       "    GREEN = %d\n"
                       "    BLUE = %d\n"
                       "";

    PyObject *formatted_str=NULL, *global_dict=NULL, *should_be_none=NULL, *output=NULL;

    formatted_str = PyBytes_FromFormat(str, red, green, blue);
    if (!formatted_str) goto cleanup;

    global_dict = PyDict_New();
    if (!global_dict) goto cleanup;

    should_be_none = PyRun_String(PyBytes_AsString(formatted_str), Py_file_input, global_dict, global_dict);
    if (!should_be_none) goto cleanup;

    // extract Colour from global_dict
    output = PyDict_GetItemString(global_dict, "Colour");
    if (!output) {
        // PyDict_GetItemString does not set an exception
        PyErr_SetString(PyExc_KeyError, "could not get 'Colour'");
    } else {
        Py_INCREF(output); // PyDict_GetItemString returns a borrowed reference
    }

cleanup:
    Py_XDECREF(formatted_str);
    Py_XDECREF(global_dict);
    Py_XDECREF(should_be_none);
    return output;
}
Obviously you can do as much or as little as you like with string formatting - I've just picked a simple example to show the point. The main differences from the previous version are the call to PyBytes_FromFormat to set up the string, and the call to PyBytes_AsString that gets the underlying char* out of the prepared bytes object.
Finally, we could prepare the enum attributes in a Python dict on the C side and pass it in. This necessitates a bit of a change. Essentially I use @AnttiHaapala's lower-level Python code, but insert namespace.update(contents) after the call to __prepare__.
PyObject* get_enum_dict(const char* key1, int value1, const char* key2, int value2) {
    const char str[] = "from enum import Enum\n"
                       "name = 'Colour'\n"
                       "bases = (Enum,)\n"
                       "enum_meta = type(Enum)\n"
                       "namespace = enum_meta.__prepare__(name, bases)\n"
                       "namespace.update(contents)\n"
                       "Colour = enum_meta(name, bases, namespace)\n";

    PyObject *global_dict=NULL, *contents_dict=NULL, *value_as_object=NULL, *should_be_none=NULL, *output=NULL;

    global_dict = PyDict_New();
    if (!global_dict) goto cleanup;

    // create and fill the contents dictionary
    contents_dict = PyDict_New();
    if (!contents_dict) goto cleanup;

    value_as_object = PyLong_FromLong(value1);
    if (!value_as_object) goto cleanup;
    int set_item_result = PyDict_SetItemString(contents_dict, key1, value_as_object);
    Py_CLEAR(value_as_object);
    if (set_item_result != 0) goto cleanup;

    value_as_object = PyLong_FromLong(value2);
    if (!value_as_object) goto cleanup;
    set_item_result = PyDict_SetItemString(contents_dict, key2, value_as_object);
    Py_CLEAR(value_as_object);
    if (set_item_result != 0) goto cleanup;

    set_item_result = PyDict_SetItemString(global_dict, "contents", contents_dict);
    if (set_item_result != 0) goto cleanup;

    should_be_none = PyRun_String(str, Py_file_input, global_dict, global_dict);
    if (!should_be_none) goto cleanup;

    // extract Colour from global_dict
    output = PyDict_GetItemString(global_dict, "Colour");
    if (!output) {
        // PyDict_GetItemString does not set an exception
        PyErr_SetString(PyExc_KeyError, "could not get 'Colour'");
    } else {
        Py_INCREF(output); // PyDict_GetItemString returns a borrowed reference
    }

cleanup:
    Py_XDECREF(contents_dict);
    Py_XDECREF(global_dict);
    Py_XDECREF(should_be_none);
    return output;
}
Again, this presents a reasonably flexible way to get values from C into a generated enum.
For the sake of testing I used the following simple Cython wrapper - this is presented just for completeness, to help people try these functions.
cdef extern from "cenum.c":
    object get_enum()
    object get_enum_fmt(int, int, int)
    object get_enum_dict(char*, int, char*, int)

def py_get_enum():
    return get_enum()

def py_get_enum_fmt(red, green, blue):
    return get_enum_fmt(red, green, blue)

def py_get_enum_dict(key1, value1, key2, value2):
    return get_enum_dict(key1, value1, key2, value2)
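Assuming the wrapper is compiled into a module (I'll call it cenum_wrapper here; the name is my own choice), usage would then look something like this:

from cenum_wrapper import py_get_enum_dict   # hypothetical compiled module name

Colour = py_get_enum_dict(b"RED", 1, b"GREEN", 2)
print(list(Colour.__members__))   # expected: ['RED', 'GREEN']
print(Colour.RED)                 # expected: Colour.RED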
To reiterate: this answer is only partly in the C API, but the approach of calling Python from C is one that I've found productive at times for "run-once" code that would be tricky to write entirely in C.

operator.index with custom class instance

I have a simple class below,
class MyClass(int):
    def __index__(self):
        return 1
According to operator.index documentation,
operator.index(a)
Return a converted to an integer. Equivalent to a.__index__()
But when I use operator.index with a MyClass instance, I get 100 instead of 1 (I get 1 if I use a.__index__()). Why is that?
>>> a = MyClass(100)
>>>
>>> import operator
>>> print(operator.index(a))
100
>>> print(a.__index__())
1
This actually appears to be a deep-rooted issue in cpython. If you look at the source code for operator.py, you can see the definition of index:
def index(a):
    "Same as a.__index__()."
    return a.__index__()
So...why is it not equivalent? It's literally calling __index__. Well, at the bottom of the source, there's the culprit:
try:
    from _operator import *
except ImportError:
    pass
else:
    from _operator import __doc__
It's overwriting the definitions with a native _operator module. In fact, if you comment this out (either by modifying the actual library or making your own fake operator.py* and importing that), it works. So, we can find the source code for the native _operator library, and look at the related part:
static PyObject *
_operator_index(PyObject *module, PyObject *a)
{
    return PyNumber_Index(a);
}
So, it's a wrapper around the PyNumber_Index function. PyNumber_Index is a wrapper around _PyNumber_Index, so we can look at that:
PyObject *
_PyNumber_Index(PyObject *item)
{
    PyObject *result = NULL;
    if (item == NULL) {
        return null_error();
    }

    if (PyLong_Check(item)) {
        Py_INCREF(item);
        return item;
    }
    if (!_PyIndex_Check(item)) {
        PyErr_Format(PyExc_TypeError,
                     "'%.200s' object cannot be interpreted "
                     "as an integer", Py_TYPE(item)->tp_name);
        return NULL;
    }

    result = Py_TYPE(item)->tp_as_number->nb_index(item);
    if (!result || PyLong_CheckExact(result))
        return result;

    if (!PyLong_Check(result)) {
        PyErr_Format(PyExc_TypeError,
                     "__index__ returned non-int (type %.200s)",
                     Py_TYPE(result)->tp_name);
        Py_DECREF(result);
        return NULL;
    }
    /* Issue #17576: warn if 'result' not of exact type int. */
    if (PyErr_WarnFormat(PyExc_DeprecationWarning, 1,
                         "__index__ returned non-int (type %.200s). "
                         "The ability to return an instance of a strict subclass of int "
                         "is deprecated, and may be removed in a future version of Python.",
                         Py_TYPE(result)->tp_name)) {
        Py_DECREF(result);
        return NULL;
    }
    return result;
}

PyObject *
PyNumber_Index(PyObject *item)
{
    PyObject *result = _PyNumber_Index(item);
    if (result != NULL && !PyLong_CheckExact(result)) {
        Py_SETREF(result, _PyLong_Copy((PyLongObject *)result));
    }
    return result;
}
You can see before it even calls nb_index (the C name for __index__), it calls PyLong_Check on the argument, and if it's true, it just returns the item with no modification. PyLong_Check is a macro that checks for long subtyping (int in python is a PyLong):
#define PyLong_Check(op) \
PyType_FastSubclass(Py_TYPE(op), Py_TPFLAGS_LONG_SUBCLASS)
#define PyLong_CheckExact(op) Py_IS_TYPE(op, &PyLong_Type)
So, basically, the takeaway is that, probably for speed, int subclasses don't get their __index__ method called; instead their value just gets _PyLong_Copy'd into the return value. This happens only in the native _operator module, not in the pure-Python operator.py. This conflict between the implementations, together with the inconsistent documentation, leads me to believe that this is an issue in either the documentation or the implementation, and you may want to raise it as one.
It's likely a documentation and not an implementation issue, as cpython has a habit of sacrificing correctness for speed: (nan,) == (nan,) but nan != nan.
* You may have to name it something like fake_operator.py then import it with import fake_operator as operator
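To see the short-circuit in isolation, compare with a class that is not an int subclass (a quick check of my own, not from the original post); here __index__ really is called:

>>> import operator
>>> class NotAnInt(object):
...     def __index__(self):
...         return 1
...
>>> operator.index(NotAnInt())
1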
This is because your type is an int subclass. __index__ will not be used because the instance is already an integer. That much is by design, and unlikely to be considered a bug in CPython. PyPy behaves the same.
In _operator.c:
static PyObject *
_operator_index(PyObject *module, PyObject *a)
/*[clinic end generated code: output=d972b0764ac305fc input=6f54d50ea64a579c]*/
{
    return PyNumber_Index(a);
}
Note that the Python code in operator.py is generally not used; it is only a fallback for the case where the compiled _operator module is not available. That explains why the result differs from a.__index__().
In abstract.c, cropped after the relevant PyLong_Check part:
/* Return an exact Python int from the object item.
   Raise TypeError if the result is not an int
   or if the object cannot be interpreted as an index.
*/
PyObject *
PyNumber_Index(PyObject *item)
{
    PyObject *result = _PyNumber_Index(item);
    if (result != NULL && !PyLong_CheckExact(result)) {
        Py_SETREF(result, _PyLong_Copy((PyLongObject *)result));
    }
    return result;
}

...

/* Return a Python int from the object item.
   Can return an instance of int subclass.
   Raise TypeError if the result is not an int
   or if the object cannot be interpreted as an index.
*/
PyObject *
_PyNumber_Index(PyObject *item)
{
    PyObject *result = NULL;
    if (item == NULL) {
        return null_error();
    }

    if (PyLong_Check(item)) {
        Py_INCREF(item);
        return item;   /* <---- short-circuited here */
    }
    ...
}
The documentation for operator.index is inaccurate, so this may be considered a minor documentation issue:
>>> import operator
>>> operator.index.__doc__
'Same as a.__index__()'
So, why isn't __index__ considered for integers? The probable answer is found in PEP 357, under the discussion section titled Speed:
Implementation should not slow down Python because integers and long integers used as indexes will complete in the same number of instructions. The only change will be that what used to generate an error will now be acceptable.
We do not want to slow down the most common case for slicing with integers, having to check for an nb_index slot every time.
Update
This answer is incorrect; I misread the documentation. See Aplet123's answer instead. Tl;dr: the problem is actually that the C implementation doesn't match the documentation and the Python implementation. The C implementation behaves more like: a if isinstance(a, int) else a.__index__().
To prove it, try defining MyClass.__int__(). The outcome will be the same.
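For instance (the class name and values here are mine, just to illustrate):

>>> import operator
>>> class MyClass2(int):
...     def __index__(self):
...         return 1
...     def __int__(self):
...         return 2
...
>>> operator.index(MyClass2(100))
100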
Original answer
See the documentation for object.__index__():
object.__index__(self)
Called to implement operator.index(), and whenever Python needs to losslessly convert the numeric object to an integer object (such as in slicing, or in the built-in bin(), hex() and oct() functions). Presence of this method indicates that the numeric object is an integer type. Must return an integer.
If __int__(), __float__() and __complex__() are not defined then corresponding built-in functions int(), float() and complex() fall back to __index__().
(added bold)
a.__int__() exists, so its return value is used instead.
>>> a.__int__
<method-wrapper '__int__' of MyClass object at 0x7f2c5f0f4ec8>
>>> a.__int__()
100

How do you recreate these structs in Python

I have these two nested structs in C below
typedef struct tag_interest {
    float *high;   // array
    float *low;    // array
} sinterest;

typedef struct tag_sfutures {
    int time;
    float result;
    sinterest *interest;   // array
} sfutures;
What is their equivalent in Python?
EDIT
I had tried this. I am yet to parse and check it because I am still in the process of debugging some code that comes before this.
class CInterest(object):
    high = []
    low = []

    def add_high(self, High):
        self.high.append(High)

    def add_low(self, Low):
        self.low.append(Low)

class CFutures(object):
    interest = [CInterest]

    def add_interest(self, interest):
        self.interest.append(interest)

    def set_time(self, time):
        self.time = time

    def set_put(self, put):
        self.put = put
Take a look at cstruct.
https://pypi.python.org/pypi/cstruct
It will take your struct definition as a string and create a Python class you can instantiate and use to pack/unpack binary data. I use it to auto-generate hundreds of C structures with no issues, other than that its list of C primitives is not comprehensive.
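If you want to stay within the standard library instead, ctypes.Structure can mirror the same layout; a minimal sketch (the Python class names are my own choice, and the float pointers are assumed to point at separately allocated arrays):

import ctypes

class SInterest(ctypes.Structure):
    _fields_ = [("high", ctypes.POINTER(ctypes.c_float)),  # array
                ("low",  ctypes.POINTER(ctypes.c_float))]  # array

class SFutures(ctypes.Structure):
    _fields_ = [("time",     ctypes.c_int),
                ("result",   ctypes.c_float),
                ("interest", ctypes.POINTER(SInterest))]   # array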

Convert Rust vector of tuples to a C compatible structure

Following these answers, I've currently defined a Rust 1.0 function as follows, in order to be callable from Python using ctypes:
use std::vec;
extern crate libc;
use libc::{c_int, c_float, size_t};
use std::slice;

#[no_mangle]
pub extern fn convert_vec(input_lon: *const c_float,
                          lon_size: size_t,
                          input_lat: *const c_float,
                          lat_size: size_t) -> Vec<(i32, i32)> {
    let input_lon = unsafe {
        slice::from_raw_parts(input_lon, lon_size as usize)
    };
    let input_lat = unsafe {
        slice::from_raw_parts(input_lat, lat_size as usize)
    };

    let combined: Vec<(i32, i32)> = input_lon
        .iter()
        .zip(input_lat.iter())
        .map(|each| convert(*each.0, *each.1))
        .collect();

    return combined
}
And I'm setting up the Python part like so:
from ctypes import *

class Int32_2(Structure):
    _fields_ = [("array", c_int32 * 2)]

rust_bng_vec = lib.convert_vec_py
rust_bng_vec.argtypes = [POINTER(c_float), c_size_t,
                         POINTER(c_float), c_size_t]
rust_bng_vec.restype = POINTER(Int32_2)
This seems to be OK, but I'm:
Not sure how to transform combined (a Vec<(i32, i32)>) to a C-compatible structure, so it can be returned to my Python script.
Not sure whether I should be returning a reference (return &combined?) and how I would have to annotate the function with the appropriate lifetime specifier if I did
The most important thing to note is that there is no such thing as a tuple in C. C is the lingua franca of library interoperability, and you will be required to restrict yourself to abilities of this language. It doesn't matter if you are talking between Rust and another high-level language; you have to speak C.
There may not be tuples in C, but there are structs. A two-element tuple is just a struct with two members!
Let's start with the C code that we would write:
#include <stdio.h>
#include <stdint.h>

typedef struct {
    uint32_t a;
    uint32_t b;
} tuple_t;

typedef struct {
    void *data;
    size_t len;
} array_t;

extern array_t convert_vec(array_t lat, array_t lon);

int main() {
    uint32_t lats[3] = {0, 1, 2};
    uint32_t lons[3] = {9, 8, 7};

    array_t lat = { .data = lats, .len = 3 };
    array_t lon = { .data = lons, .len = 3 };

    array_t fixed = convert_vec(lat, lon);
    tuple_t *real = fixed.data;

    for (int i = 0; i < fixed.len; i++) {
        printf("%d, %d\n", real[i].a, real[i].b);
    }

    return 0;
}
We've defined two structs — one to represent our tuple, and another to represent an array, as we will be passing those back and forth a bit.
We will follow this up by defining the exact same structs in Rust and define them to have the exact same members (types, ordering, names). Importantly, we use #[repr(C)] to let the Rust compiler know to not do anything funky with reordering the data.
extern crate libc;

use std::slice;
use std::mem;

#[repr(C)]
pub struct Tuple {
    a: libc::uint32_t,
    b: libc::uint32_t,
}

#[repr(C)]
pub struct Array {
    data: *const libc::c_void,
    len: libc::size_t,
}

impl Array {
    unsafe fn as_u32_slice(&self) -> &[u32] {
        assert!(!self.data.is_null());
        slice::from_raw_parts(self.data as *const u32, self.len as usize)
    }

    fn from_vec<T>(mut vec: Vec<T>) -> Array {
        // Important to make length and capacity match
        // A better solution is to track both length and capacity
        vec.shrink_to_fit();

        let array = Array { data: vec.as_ptr() as *const libc::c_void, len: vec.len() as libc::size_t };

        // Whee! Leak the memory, and now the raw pointer (and
        // eventually C) is the owner.
        mem::forget(vec);

        array
    }
}

#[no_mangle]
pub extern fn convert_vec(lon: Array, lat: Array) -> Array {
    let lon = unsafe { lon.as_u32_slice() };
    let lat = unsafe { lat.as_u32_slice() };

    let vec =
        lat.iter().zip(lon.iter())
           .map(|(&lat, &lon)| Tuple { a: lat, b: lon })
           .collect();

    Array::from_vec(vec)
}
We must never accept or return non-repr(C) types across the FFI boundary, so we pass across our Array. Note that there's a good amount of unsafe code, as we have to convert an unknown pointer to data (c_void) to a specific type. That's the price of being generic in C world.
Let's turn our eye to Python now. Basically, we just have to mimic what the C code did:
import ctypes

class FFITuple(ctypes.Structure):
    _fields_ = [("a", ctypes.c_uint32),
                ("b", ctypes.c_uint32)]

class FFIArray(ctypes.Structure):
    _fields_ = [("data", ctypes.c_void_p),
                ("len", ctypes.c_size_t)]

    # Allow implicit conversions from a sequence of 32-bit unsigned
    # integers.
    @classmethod
    def from_param(cls, seq):
        return cls(seq)

    # Wrap sequence of values. You can specify another type besides a
    # 32-bit unsigned integer.
    def __init__(self, seq, data_type=ctypes.c_uint32):
        array_type = data_type * len(seq)
        raw_seq = array_type(*seq)
        self.data = ctypes.cast(raw_seq, ctypes.c_void_p)
        self.len = len(seq)

# A conversion function that cleans up the result value to make it
# nicer to consume.
def void_array_to_tuple_list(array, _func, _args):
    tuple_array = ctypes.cast(array.data, ctypes.POINTER(FFITuple))
    return [tuple_array[i] for i in range(0, array.len)]

lib = ctypes.cdll.LoadLibrary("./target/debug/libtupleffi.dylib")
lib.convert_vec.argtypes = (FFIArray, FFIArray)
lib.convert_vec.restype = FFIArray
lib.convert_vec.errcheck = void_array_to_tuple_list

for tupl in lib.convert_vec([1,2,3], [9,8,7]):
    print tupl.a, tupl.b
Forgive my rudimentary Python. I'm sure an experienced Pythonista could make this look a lot prettier! Thanks to @eryksun for some nice advice on how to make the consumer side of calling the method much nicer.
A word about ownership and memory leaks
In this example code, we've leaked the memory allocated by the Vec. Theoretically, the FFI code now owns the memory, but realistically, it can't do anything useful with it. To have a fully correct example, you'd need to add another method that would accept the pointer back from the callee, transform it back into a Vec, then allow Rust to drop the value. This is the only safe way, as Rust is almost guaranteed to use a different memory allocator than the one your FFI language is using.
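A minimal sketch of what such a clean-up function could look like, assuming the Array and Tuple definitions above and that length equals capacity (which from_vec guarantees via shrink_to_fit); the name drop_array is my own:

#[no_mangle]
pub extern fn drop_array(arr: Array) {
    unsafe {
        // Rebuild the Vec from the raw parts handed out by from_vec(), then
        // drop it so Rust frees the memory with its own allocator.
        // Length doubles as capacity because from_vec() called shrink_to_fit().
        drop(Vec::from_raw_parts(arr.data as *mut Tuple, arr.len as usize, arr.len as usize));
    }
}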
Not sure whether I should be returning a reference and how I would have to annotate the function with the appropriate lifetime specifier if I did
No, you don't want to (read: can't) return a reference. If you could, then the ownership of the item would end with the function call, and the reference would point to nothing. This is why we need to do the two-step dance with mem::forget and returning a raw pointer.

Python ctype - How to pass data between C functions

I have a self-made C library that I want to access using Python. The problem is that the code consists essentially of two parts: an initialization that reads in data from a number of files and does a few calculations that need to happen only once, and a second part that is called in a loop and repeatedly uses the data generated before. To this second function I want to pass parameters from Python.
My idea was to write two C wrapper functions, "init" and "loop" - "init" reads the data and returns a void pointer to a structure that "loop" can use together with additional parameters that I can pass on from python. Something like
void *init() {
    mystruct *ret = (mystruct *)malloc(sizeof(mystruct));
    /* Fill ret with data */
    return ret;
}

float loop(void *data, float par1, float par2) {
    /* do stuff with data, par1, par2, return result */
}
I tried calling "init" from python as a c_void_p, but since "loop" changes some of the contents of "data" and ctypes' void pointers are immutable, this did not work.
Other solutions to similar problems I saw seem to require knowledge of how much memory "init" would use, and I do not know that.
Is there a way to pass data from one C function to another through python without telling python exactly what or how much it is? Or is there another way to solve my problem?
I tried (and failed) to write a minimum crashing example, and after some debugging it turned out there was a bug in my C code. Thanks to everyone who replied!
Hoping that this might help other people, here is a sort-of-minimal working version (still without separate 'free' - sorry):
pybug.c:
#include <stdio.h>
#include <stdlib.h>

typedef struct inner_struct_s {
    int length;
    float *array;
} inner_struct_t;

typedef struct mystruct_S {
    int id;
    float start;
    float end;
    inner_struct_t *inner;
} mystruct_t;

void init(void **data) {
    int i;
    mystruct_t *mystruct = (mystruct_t *)malloc(sizeof(mystruct_t));
    inner_struct_t *inner = (inner_struct_t *)malloc(sizeof(inner_struct_t));

    inner->length = 10;
    inner->array = calloc(inner->length, sizeof(float));
    for (i = 0; i < inner->length; i++)
        inner->array[i] = 2 * i;

    mystruct->id = 0;
    mystruct->start = 0;
    mystruct->end = inner->length;
    mystruct->inner = inner;

    *data = mystruct;
}

float loop(void *data, float par1, float par2, int newsize) {
    mystruct_t *str = data;
    inner_struct_t *inner = str->inner;
    int i;

    inner->length = newsize;
    inner->array = realloc(inner->array, newsize * sizeof(float));
    for (i = 0; i < inner->length; i++)
        inner->array[i] = par1 + i * par2;

    return inner->array[inner->length - 1];
}
compile as
cc -c -fPIC pybug.c
cc -shared -o libbug.so pybug.o
Run in python:
from ctypes import *

sl = CDLL('libbug.so')

# What arguments do the functions take / return?
sl.init.argtypes = [POINTER(c_void_p)]
sl.loop.restype = c_float
sl.loop.argtypes = [c_void_p, c_float, c_float, c_int]

# init takes a pointer to a pointer
px = c_void_p()
sl.init(byref(px))

# Call the loop a couple of times
for i in range(10):
    print sl.loop(px, i, 5, 10*i+5)
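To plug the missing 'free', a corresponding clean-up function could look roughly like this (a sketch only; the name free_data is mine and not part of the code above). In pybug.c:

void free_data(void *data) {
    mystruct_t *str = data;
    free(str->inner->array);
    free(str->inner);
    free(str);
}

and on the Python side, once the loop is done:

sl.free_data.argtypes = [c_void_p]
sl.free_data(px)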
You should have a corresponding function to free the data buffer when the caller is done. Otherwise I don't see the issue. Just pass the pointer to loop that you get from init.
init.restype = c_void_p
loop.argtypes = [c_void_p, c_float, c_float]
loop.restype = c_float
I'm not sure what you mean by "ctypes' void pointers are immutable", unless you're talking about c_char_p and c_wchar_p. The issue there is if you pass a Python string as an argument it uses Python's private pointer to the string buffer. If a function can change the string, you should first copy it to a c_char or c_wchar array.
Here's a simple example showing the problem of passing a Python string (2.x byte string) as an argument to a function that modifies it. In this case it changes index 0 to '\x00':
>>> import os
>>> from ctypes import *
>>> open('tmp.c', 'w').write("void f(char *s) {s[0] = 0;}")
>>> os.system('gcc -shared -fPIC -o tmp.so tmp.c')
0
>>> tmp = CDLL('./tmp.so')
>>> tmp.f.argtypes = [c_void_p]
>>> tmp.f.restype = None
>>> tmp.f('a')
>>> 'a'
'\x00'
>>> s = 'abc'
>>> tmp.f(s)
>>> s
'\x00bc'
This is specific to passing Python strings as arguments. It isn't a problem to pass pointers to data structures that are intended to be mutable, either ctypes data objects such as a Structure, or pointers returned by libraries.
Is your C code in a DLL? If so, you might consider creating a global pointer in there. init() will do any initialization required and set the pointer to the newly allocated memory, and loop() will operate on that memory. Also don't forget to free it up with a close() function.
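A rough sketch of that approach, with placeholder names and a made-up payload (mydata_t, close_data and the array contents are assumptions, not your actual code):

#include <stdlib.h>

/* Global pointer shared by init()/loop()/close_data(). */
typedef struct { int length; float *array; } mydata_t;

static mydata_t *g_data = NULL;

void init(void) {
    g_data = malloc(sizeof(mydata_t));
    g_data->length = 1;
    g_data->array = calloc(g_data->length, sizeof(float));
    /* read files and fill the real data here ... */
}

float loop(float par1, float par2) {
    /* use the previously initialised data together with par1, par2 */
    return g_data->array[0] * par1 + par2;
}

void close_data(void) {
    free(g_data->array);
    free(g_data);
    g_data = NULL;
}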
