I want to use Boost.Python to create a Python wrapper for a C++ constructor with optional arguments. I want the Python wrapper to act like this:
class Foo():
    def __init__(self, filename, phase, stages=None, level=0):
        """
        filename -- string
        phase -- int
        stages -- optional list of strings
        level -- optional int
        """
        if stages is None:
            stages = []
        # ...
How do I do this with Boost.Python? I don't see how to do it with make_constructor, and I don't know how to make a constructor with raw_function. Is there some better documentation than this out there?
My specific problem is trying to add two optional arguments (stages and level) to these two constructors:
https://github.com/BVLC/caffe/blob/rc3/python/caffe/_caffe.cpp#L76-L96
Thanks to Dan's comments, I found a solution that works. I'll copy most of it here since there are some interesting tidbits about how to extract objects from bp::object, etc.
// Net constructor
shared_ptr<Net<Dtype> > Net_Init(string param_file, int phase,
    const int level, const bp::object& stages,
    const bp::object& weights_file) {
  CheckFile(param_file);

  // Convert stages from list to vector
  vector<string> stages_vector;
  if (!stages.is_none()) {
    for (int i = 0; i < len(stages); i++) {
      stages_vector.push_back(bp::extract<string>(stages[i]));
    }
  }

  // Initialize net
  shared_ptr<Net<Dtype> > net(new Net<Dtype>(param_file,
      static_cast<Phase>(phase), level, &stages_vector));

  // Load weights
  if (!weights_file.is_none()) {
    std::string weights_file_str = bp::extract<std::string>(weights_file);
    CheckFile(weights_file_str);
    net->CopyTrainedLayersFrom(weights_file_str);
  }

  return net;
}
BOOST_PYTHON_MODULE(_caffe) {
  bp::class_<Net<Dtype>, shared_ptr<Net<Dtype> >, boost::noncopyable >("Net",
      bp::no_init)
    .def("__init__", bp::make_constructor(&Net_Init,
        bp::default_call_policies(), (bp::arg("network_file"), "phase",
        bp::arg("level")=0, bp::arg("stages")=bp::object(),
        bp::arg("weights_file")=bp::object())));
}
The generated signature is:
__init__(boost::python::api::object, std::string network_file, int phase,
int level=0, boost::python::api::object stages=None,
boost::python::api::object weights_file=None)
And I can use it like:
net = caffe.Net('network.prototxt', weights_file='weights.caffemodel',
phase=caffe.TEST, level=1, stages=['deploy'])
Full code available in pull request here: https://github.com/BVLC/caffe/pull/3863
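For reference, here is a minimal, self-contained sketch of the same make_constructor pattern applied to a hypothetical Foo class matching the Python signature from the question. The class, factory function, and module name below are mine, not part of the Caffe code:

#include <boost/python.hpp>
#include <boost/shared_ptr.hpp>
#include <boost/make_shared.hpp>
#include <string>
#include <vector>

namespace bp = boost::python;

struct Foo {
  Foo(const std::string& filename, int phase,
      const std::vector<std::string>& stages, int level) { /* ... */ }
};

// Factory used in place of the real constructor; converts the optional
// Python list (or None) into a std::vector before constructing Foo.
static boost::shared_ptr<Foo> Foo_Init(const std::string& filename, int phase,
    const bp::object& stages, int level) {
  std::vector<std::string> stages_vector;
  if (!stages.is_none()) {
    for (int i = 0; i < bp::len(stages); ++i) {
      stages_vector.push_back(bp::extract<std::string>(stages[i]));
    }
  }
  return boost::make_shared<Foo>(filename, phase, stages_vector, level);
}

BOOST_PYTHON_MODULE(foo_module) {
  bp::class_<Foo, boost::shared_ptr<Foo> >("Foo", bp::no_init)
    .def("__init__", bp::make_constructor(&Foo_Init,
        bp::default_call_policies(),
        (bp::arg("filename"), bp::arg("phase"),
         bp::arg("stages")=bp::object(),  // defaults to None
         bp::arg("level")=0)));
}

With this, Foo('net.prototxt', 1) and Foo('net.prototxt', 1, stages=['deploy'], level=1) both work from Python, just like the pure-Python class in the question.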
In Python you can do list[2:] to get every element after the second element. Is there any way to do the same thing with an array in C++?
I hope I've figured out an acceptable answer. Unfortunately, in C++ you cannot overload operator[] with more than one argument, so I used operator() instead.
#include <iostream>
#include <vector>

template <typename T>
class WrapperVector
{
private:
    std::vector<T> data;
public:
    WrapperVector(size_t reserve_size)
    {
        data.reserve(reserve_size);
    }
    WrapperVector(typename std::vector<T>::iterator start, typename std::vector<T>::iterator end)
    {
        data = std::vector<T>(start, end);
    }
    // appends element to the end of container, just like in Python
    void append(T element)
    {
        data.push_back(element);
    }
    /* instead of operator[], operator() must be used
       because operator[] can't accept more than one argument */
    // instead of self[x:y], use this(x, y)
    WrapperVector<T> operator()(size_t start, size_t end)
    {
        return WrapperVector<T>(data.begin() + start, data.begin() + end);
    }
    // instead of self[x:], use this(x)
    WrapperVector<T> operator()(size_t start)
    {
        return WrapperVector<T>(data.begin() + start, data.end());
    }
    // prints all elements to cout
    void print()
    {
        if (!data.size())
        {
            std::cout << "No elements.\n";
            return;
        }
        std::cout << data[0];
        size_t length = data.size();
        for (size_t i = 1; i < length; i++)
            std::cout << ' ' << data[i];
        std::cout << '\n';
    }
};

int main()
{
    WrapperVector<int> w(5);
    w.append(1);
    w.append(2);
    w.append(3);
    w.append(4);
    w.append(5);
    w(0).print();
    w(1, 3).print();
    // you can also save the slice
    WrapperVector<int> w2 = w(2);
    WrapperVector<int> w3 = w(2, 4);
    w2.print();
    w3.print();
    return 0;
}
Now you could even overload it to accept a third argument to account for the step, just like in Python. I'll leave that as an exercise for you.
Instead of creating a custom class, which introduces some amount of maintenance and certain limitations, you can use the concept of iterators. Two iterators are used to represent a range of values. For example, the function
template<typename TIterator>
void foo(TIterator start, TIterator end);
would take a range of objects that is only specified by two iterators. It's a template, so it can take iterators from different containers. To select a sub-range from a range given this way you can use std::next() and std::prev(). For example for a
std::vector<int> list;
you can call foo with a subrange starting from the third element as:
foo(std::next(list.begin(), 2), list.end());
or call foo with a subrange from the third to the second last element:
foo(std::next(list.begin(), 2), std::prev(list.end(), 1));
If you need/want to copy a subrange you can easily do that. For example
std::vector<int> subList(std::next(list.begin(), 2),
std::prev(list.end(), 1));
would create a vector subList that contains the third through the second-to-last element of list.
Of course, those are just simple examples. In a real-world application you would need to check that those subranges are valid/exist.
The advantages are:
no extra wrapper classes
no need to copy any data (only the iterators themselves)
the standard library works almost exclusively with iterators to represent ranges, so it's good to stick to that concept.
With templates you can easily support all containers in the standard library, as long as their iterators satisfy the requirements listed in the reference. (You could use the above examples with std::array, std::list and most other containers without any modifications, except for the type of list, of course.)
Iterators are also an interface to user-defined or third-party containers, as long as those containers provide iterators that satisfy the requirements listed in the reference.
On the contra side:
the code gets a bit more complex
it may take some time to understand iterators, because the concept is quite different from what many other languages use.
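To make the approach concrete, here is a small, self-contained sketch; the container contents and the printing body of foo() are my own choices for illustration:

#include <iostream>
#include <iterator>
#include <vector>

// Takes any range specified by a pair of iterators and prints it.
template<typename TIterator>
void foo(TIterator start, TIterator end)
{
    for (TIterator it = start; it != end; ++it)
        std::cout << *it << ' ';
    std::cout << '\n';
}

int main()
{
    std::vector<int> list = {10, 11, 12, 13, 14};

    // subrange from the third element to the end: 12 13 14
    foo(std::next(list.begin(), 2), list.end());

    // subrange from the third to the second-to-last element: 12 13
    foo(std::next(list.begin(), 2), std::prev(list.end(), 1));

    // copying a subrange into its own container
    std::vector<int> subList(std::next(list.begin(), 2),
                             std::prev(list.end(), 1));
    foo(subList.begin(), subList.end());   // 12 13
    return 0;
}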
How does one go about parsing a group of required but mutually exclusive arguments using the Python C API?
E.g. given the function definition
static PyObject* my_func(PyObject *self, PyObject *args, PyObject *kwargs) {
    double a;           // first argument, required
    double b=0, c=0;    // second argument, required but mutually exclusive, b is default keyword if no keyword is set
    char *d = "...";    // third argument, optional
    // parse arguments
    ...
}
My idea here was to parse the input arguments twice, i.e. replacing ... above with:
static const char *kwList1[] = {"a","b","c","d",NULL};
static const char *kwList2[] = {"a","b","d",NULL};
int ret;
if (!(ret = PyArg_ParseTupleAndKeywords(args,kwargs,"d|dds",(char **)kwList1,&a,&b,&c,&d))) {
    ret = PyArg_ParseTupleAndKeywords(args,kwargs,"d|ds",(char **)kwList2,&a,&b,&d);
}
if (!ret) return NULL;
// verify that one of, but not both, variables b and c are non-zero
...
However, the second call to PyArg_ParseTupleAndKeywords() returns 0 for valid input, so I assume that the first call to PyArg_ParseTupleAndKeywords() leaves some state behind that causes the second call to fail (the resulting Python error is: TypeError: a float is required).
I'm aware that the above could be solved using the argparse Python module, but I would prefer a solution using the C API directly. One idea would be to first copy the input args and kwargs into two new PyObject variables and use these in the second call to PyArg_ParseTupleAndKeywords(); however, I can't find any API function to do so (I guess I would also need to know how to release the memory allocated for this).
It seems the issue was that the first call to PyArg_ParseTupleAndKeywords() set the error indicator, which caused the second call to fail. The solution is to insert a call to PyErr_Clear() between the two calls to PyArg_ParseTupleAndKeywords(). In summary, the following code performs the task:
static PyObject* my_func(PyObject *self, PyObject *args, PyObject *kwargs) {
    double a;           // first argument, required
    double b=0, c=0;    // second argument, required but mutually exclusive, b is default keyword if no keyword is set
    char *d = "...";    // third argument, optional

    // parse arguments
    static const char *kwList1[] = {"a","b","c","d",NULL};
    static const char *kwList2[] = {"a","b","d",NULL};
    int ret;
    if (!(ret = PyArg_ParseTupleAndKeywords(args,kwargs,"d|dds",(char **)kwList1,&a,&b,&c,&d))) {
        PyErr_Clear();
        ret = PyArg_ParseTupleAndKeywords(args,kwargs,"d|ds",(char **)kwList2,&a,&b,&d);
    }
    if (!ret) return NULL;

    // verify that one of, but not both, variables b and c are non-zero
    if (b==0 && c==0) {
        PyErr_SetString(PyExc_TypeError,"Required mutually exclusive arguments 'b' or 'c' (pos 2) not found (or input with value 0)");
        return NULL;
    } else if (b!=0 && c!=0) {
        PyErr_SetString(PyExc_TypeError,"Use of multiple mutually exclusive required arguments 'b' and 'c' (pos 2)");
        return NULL;
    }
    ...
}
Then again, this does not guard against calling the function with both b and c when one of them is 0 and the other is not. However, this is a minor problem.
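If that corner case ever matters, one way to close it (my own adaptation, not part of the solution above) is to parse b and c as raw objects with the O format, so that "not supplied" is distinguishable from "supplied as 0":

static PyObject* my_func(PyObject *self, PyObject *args, PyObject *kwargs) {
    static const char *kwList[] = {"a","b","c","d",NULL};
    double a, bc;
    PyObject *b_obj = NULL, *c_obj = NULL;  // stay NULL when not supplied
    const char *d = NULL;                   // optional string argument

    if (!PyArg_ParseTupleAndKeywords(args, kwargs, "d|OOs", (char **)kwList,
                                     &a, &b_obj, &c_obj, &d))
        return NULL;

    // exactly one of b and c must have been passed
    if ((b_obj == NULL) == (c_obj == NULL)) {
        PyErr_SetString(PyExc_TypeError,
                        "exactly one of the arguments 'b' and 'c' must be given");
        return NULL;
    }

    bc = PyFloat_AsDouble(b_obj ? b_obj : c_obj);
    if (bc == -1.0 && PyErr_Occurred())
        return NULL;

    // ... computation using a, bc and d ...
    Py_RETURN_NONE;
}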
Following these answers, I've currently defined a Rust 1.0 function as follows, in order to be callable from Python using ctypes:
use std::vec;
extern crate libc;
use libc::{c_int, c_float, size_t};
use std::slice;

#[no_mangle]
pub extern fn convert_vec(input_lon: *const c_float,
                          lon_size: size_t,
                          input_lat: *const c_float,
                          lat_size: size_t) -> Vec<(i32, i32)> {
    let input_lon = unsafe {
        slice::from_raw_parts(input_lon, lon_size as usize)
    };
    let input_lat = unsafe {
        slice::from_raw_parts(input_lat, lat_size as usize)
    };
    let combined: Vec<(i32, i32)> = input_lon
        .iter()
        .zip(input_lat.iter())
        .map(|each| convert(*each.0, *each.1))
        .collect();
    return combined
}
And I'm setting up the Python part like so:
from ctypes import *
class Int32_2(Structure):
    _fields_ = [("array", c_int32 * 2)]

rust_bng_vec = lib.convert_vec_py
rust_bng_vec.argtypes = [POINTER(c_float), c_size_t,
                         POINTER(c_float), c_size_t]
rust_bng_vec.restype = POINTER(Int32_2)
This seems to be OK, but I'm:
Not sure how to transform combined (a Vec<(i32, i32)>) to a C-compatible structure, so it can be returned to my Python script.
Not sure whether I should be returning a reference (return &combined?) and how I would have to annotate the function with the appropriate lifetime specifier if I did
The most important thing to note is that there is no such thing as a tuple in C. C is the lingua franca of library interoperability, and you will be required to restrict yourself to the abilities of this language. It doesn't matter if you are talking between Rust and another high-level language; you have to speak C.
There may not be tuples in C, but there are structs. A two-element tuple is just a struct with two members!
Let's start with the C code that we would write:
#include <stdio.h>
#include <stdint.h>

typedef struct {
    uint32_t a;
    uint32_t b;
} tuple_t;

typedef struct {
    void *data;
    size_t len;
} array_t;

extern array_t convert_vec(array_t lat, array_t lon);

int main() {
    uint32_t lats[3] = {0, 1, 2};
    uint32_t lons[3] = {9, 8, 7};

    array_t lat = { .data = lats, .len = 3 };
    array_t lon = { .data = lons, .len = 3 };

    array_t fixed = convert_vec(lat, lon);
    tuple_t *real = fixed.data;

    for (int i = 0; i < fixed.len; i++) {
        printf("%d, %d\n", real[i].a, real[i].b);
    }

    return 0;
}
We've defined two structs — one to represent our tuple, and another to represent an array, as we will be passing those back and forth a bit.
We will follow this up by defining the exact same structs in Rust and define them to have the exact same members (types, ordering, names). Importantly, we use #[repr(C)] to let the Rust compiler know to not do anything funky with reordering the data.
extern crate libc;

use std::slice;
use std::mem;

#[repr(C)]
pub struct Tuple {
    a: libc::uint32_t,
    b: libc::uint32_t,
}

#[repr(C)]
pub struct Array {
    data: *const libc::c_void,
    len: libc::size_t,
}

impl Array {
    unsafe fn as_u32_slice(&self) -> &[u32] {
        assert!(!self.data.is_null());
        slice::from_raw_parts(self.data as *const u32, self.len as usize)
    }

    fn from_vec<T>(mut vec: Vec<T>) -> Array {
        // Important to make length and capacity match
        // A better solution is to track both length and capacity
        vec.shrink_to_fit();

        let array = Array { data: vec.as_ptr() as *const libc::c_void, len: vec.len() as libc::size_t };

        // Whee! Leak the memory, and now the raw pointer (and
        // eventually C) is the owner.
        mem::forget(vec);

        array
    }
}

#[no_mangle]
pub extern fn convert_vec(lon: Array, lat: Array) -> Array {
    let lon = unsafe { lon.as_u32_slice() };
    let lat = unsafe { lat.as_u32_slice() };

    let vec =
        lat.iter().zip(lon.iter())
           .map(|(&lat, &lon)| Tuple { a: lat, b: lon })
           .collect();

    Array::from_vec(vec)
}
We must never accept or return non-repr(C) types across the FFI boundary, so we pass across our Array. Note that there's a good amount of unsafe code, as we have to convert an unknown pointer to data (c_void) to a specific type. That's the price of being generic in C world.
Let's turn our eye to Python now. Basically, we just have to mimic what the C code did:
import ctypes

class FFITuple(ctypes.Structure):
    _fields_ = [("a", ctypes.c_uint32),
                ("b", ctypes.c_uint32)]

class FFIArray(ctypes.Structure):
    _fields_ = [("data", ctypes.c_void_p),
                ("len", ctypes.c_size_t)]

    # Allow implicit conversions from a sequence of 32-bit unsigned
    # integers.
    @classmethod
    def from_param(cls, seq):
        return cls(seq)

    # Wrap sequence of values. You can specify another type besides a
    # 32-bit unsigned integer.
    def __init__(self, seq, data_type = ctypes.c_uint32):
        array_type = data_type * len(seq)
        raw_seq = array_type(*seq)
        self.data = ctypes.cast(raw_seq, ctypes.c_void_p)
        self.len = len(seq)

# A conversion function that cleans up the result value to make it
# nicer to consume.
def void_array_to_tuple_list(array, _func, _args):
    tuple_array = ctypes.cast(array.data, ctypes.POINTER(FFITuple))
    return [tuple_array[i] for i in range(0, array.len)]

lib = ctypes.cdll.LoadLibrary("./target/debug/libtupleffi.dylib")

lib.convert_vec.argtypes = (FFIArray, FFIArray)
lib.convert_vec.restype = FFIArray
lib.convert_vec.errcheck = void_array_to_tuple_list

for tupl in lib.convert_vec([1,2,3], [9,8,7]):
    print tupl.a, tupl.b
Forgive my rudimentary Python. I'm sure an experienced Pythonista could make this look a lot prettier! Thanks to @eryksun for some nice advice on how to make the consumer side of calling the method much nicer.
A word about ownership and memory leaks
In this example code, we've leaked the memory allocated by the Vec. Theoretically, the FFI code now owns the memory, but realistically, it can't do anything useful with it. To have a fully correct example, you'd need to add another method that would accept the pointer back from the callee, transform it back into a Vec, then allow Rust to drop the value. This is the only safe way, as Rust is almost guaranteed to use a different memory allocator than the one your FFI language is using.
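For illustration only, here is roughly what that round trip could look like from the C side. free_tuple_array is a hypothetical export that this answer does not actually define; its Rust body would reconstruct the Vec from the data pointer and length and simply let it drop:

/* hypothetical counterpart to convert_vec, exported by the Rust library */
extern void free_tuple_array(array_t arr);

/* ... at the end of main() above, once the tuples have been printed ... */
free_tuple_array(fixed);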
Not sure whether I should be returning a reference and how I would have to annotate the function with the appropriate lifetime specifier if I did
No, you don't want to (read: can't) return a reference. If you could, then the ownership of the item would end with the function call, and the reference would point to nothing. This is why we need to do the two-step dance with mem::forget and returning a raw pointer.
Asked because of this: Default argument in c++
Say I have a function such as this: void f(int p1=1, int p2=2, int p3=3, int p4=4);
And I want to call it using only some of the arguments - the rest will be the defaults.
Something like this would work:
template<bool P1=true, bool P2=true, bool P3=true, bool P4=true>
void f(int p1=1, int p2=2, int p3=3, int p4=4);

// specialize:
template<>
void f<false, true, false, false>(int p1) {
    f(1, p1);
}
template<>
void f<false, true, true, false>(int p1, int p2) {
    f(1, p1, p2);
}
// ... and so on.
// Would need a specialization for each combination of arguments
// which is very tedious and error-prone

// Use:
f<false, true, false, false>(5); // passes 5 as p2 argument
But it requires too much code to be practical.
Is there a better way to do this?
Use the Named Parameters Idiom (→ FAQ link).
The Boost.Parameter library (→ link) can also solve this task, but at the cost of code verbosity and greatly reduced clarity. It's also deficient in handling constructors. And it requires having the Boost library installed, of course.
Have a look at the Boost.Parameter library.
It implements named parameters in C++. Example:
#include <boost/parameter/name.hpp>
#include <boost/parameter/preprocessor.hpp>
#include <iostream>

//Define
BOOST_PARAMETER_NAME(p1)
BOOST_PARAMETER_NAME(p2)
BOOST_PARAMETER_NAME(p3)
BOOST_PARAMETER_NAME(p4)

BOOST_PARAMETER_FUNCTION(
    (void),
    f,
    tag,
    (optional
        (p1, *, 1)
        (p2, *, 2)
        (p3, *, 3)
        (p4, *, 4)))
{
    std::cout << "p1: " << p1
              << ", p2: " << p2
              << ", p3: " << p3
              << ", p4: " << p4 << "\n";
}

//Use
int main()
{
    //Prints "p1: 1, p2: 5, p3: 3, p4: 4"
    f(_p2=5);
}
Although Boost.Parameter is amusing, it unfortunately suffers from a number of issues, among which placeholder collisions (and having to debug quirky preprocessor/template errors):
BOOST_PARAMETER_NAME(p1)
Will create the _p1 placeholder that you then use later on. If you have two different headers declaring the same placeholder, you get a conflict. Not fun.
There is a much simpler answer, both conceptually and practically, based somewhat on the Builder Pattern: the Named Parameters Idiom.
Instead of specifying such a function:
void f(int a, int b, int c = 10, int d = 20);
You specify a structure, on which you will override the operator():
the constructor is used to ask for mandatory arguments (not strictly in the Named Parameters Idiom, but nobody said you had to follow it blindly), and default values are set for the optional ones
each optional parameter is given a setter
Generally, it is combined with chaining, which consists of making the setters return a reference to the current object so that the calls can be chained on a single line.
class f {
public:
    // Take mandatory arguments, set default values
    f(int a, int b): _a(a), _b(b), _c(10), _d(20) {}

    // Define setters for optional arguments
    // Remember the Chaining idiom
    f& c(int v) { _c = v; return *this; }
    f& d(int v) { _d = v; return *this; }

    // Finally define the invocation function
    void operator()() const;

private:
    int _a;
    int _b;
    int _c;
    int _d;
}; // class f
The invocation is:
f(/*a=*/1, /*b=*/2).c(3)(); // the last () being to actually invoke the function
I've seen a variant that puts the mandatory arguments as parameters to operator(); this avoids keeping the arguments as attributes, but the syntax is a bit weirder:
f().c(3)(/*a=*/1, /*b=*/2);
Once the compiler has inlined all the constructor and setter calls (which is why they are defined inline here, while operator() is not), this should result in code about as efficient as the "regular" function invocation.
This isn't really an answer, but...
In C++ Template Metaprogramming by David Abrahams and Aleksey Gurtovoy (published in 2004!) the authors talk about this:
While writing this book, we reconsidered the interface used for named
function parameter support. With a little experimentation we
discovered that it’s possible to provide the ideal syntax by using
keyword objects with overloaded assignment operators:
f(slew = .799, name = "z");
They go on to say:
We’re not going to get into the implementation details of this named
parameter library here; it’s straightforward enough that we suggest
you try implementing it yourself as an exercise.
This was in the context of template metaprogramming and Boost::MPL. I'm not too sure how their "straightforward" implementation would interact with default parameters, but I assume it would be transparent.
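Here is one minimal sketch of how such keyword objects could be built today. This is my own guess at an implementation (using C++17 fold expressions), not the book's code; the keywords slew and name, their defaults, and the f_args helper are all assumptions for illustration:

#include <iostream>
#include <string>

// A named argument is just a value tagged with the keyword's tag type.
template <typename Tag, typename T>
struct named_arg {
    T value;
};

// A keyword object: its overloaded operator= wraps a value into a named_arg.
template <typename Tag>
struct keyword {
    template <typename T>
    named_arg<Tag, T> operator=(T v) const { return {v}; }
};

struct slew_tag {};
struct name_tag {};
constexpr keyword<slew_tag> slew{};
constexpr keyword<name_tag> name{};

// Collects the supplied arguments; members keep their defaults when a
// keyword is not passed.
struct f_args {
    double slew = 0.5;
    std::string name = "a";

    void set(named_arg<slew_tag, double> arg) { slew = arg.value; }
    void set(named_arg<name_tag, const char*> arg) { name = arg.value; }
};

template <typename... Args>
void f(Args... args) {
    f_args parsed;
    (parsed.set(args), ...);  // apply every named argument, in any order
    std::cout << "slew=" << parsed.slew << " name=" << parsed.name << "\n";
}

int main() {
    f(slew = .799, name = "z");  // the syntax from the book
    f(name = "q");               // omitted keywords keep their defaults
}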
Referring to http://mail.python.org/pipermail/python-dev/2009-June/090210.html
and http://dan.iel.fm/posts/python-c-extensions/,
here are the other places I searched regarding my question:
http://article.gmane.org/gmane.comp.python.general/424736
http://joyrex.spc.uchicago.edu/bookshelves/python/cookbook/pythoncook-CHP-16-SECT-3.html
http://docs.python.org/2/c-api/sequence.html#PySequence_Check
Python extension module with variable number of arguments
I am inexperienced with the Python/C API.
I have the following code:
sm_int_list = (1,20,3)
c_int_array = (ctypes.c_int * len(sm_int_list))(*sm_int_list)
sm_str_tuple = ('some','text', 'here')
On the C extension side, i have done something like this:
static PyObject* stuff_here(PyObject *self, PyObject *args)
{
    char* input;
    int *i1, *i2;
    char *s1, *s2;
    // args = (('some','text', 'here'), [1,20,3], ('some','text', 'here'), [1,20,3])
    PyArg_ParseTuple(args, "(s#:):#(i:)#(s#:):#(i:)#", &s1, &i1, &s2, &i2);
    /*stuff*/
}
such that:
stuff.here(('some','text', 'here'), [1,20,3], ('some','text', 'here'), [1,20,3])
returns data in the same form as args after some computation.
I would like to know whether this PyArg_ParseTuple expression is the proper way to parse:
an array of varying strings
an array of integers
UPDATE NEW
Is this the correct way?
static PyObject* stuff_here(PyObject *self, PyObject *args)
{
    unsigned int tint[], cint[];
    ttotal=0, ctotal=0;
    char *tstr, *cstr;
    int *t_counts, *c_counts;
    Py_ssize_t size;
    PyObject *t_str1, *t_int1, *c_str2, *c_int2; //the C var that takes in the py variable value
    PyObject *tseq, *cseq;
    int t_seqlen=0, c_seqlen=0;

    if (!PyArg_ParseTuple(args, "OOiOOi", &t_str1, &t_int1, &ttotal, &c_str2, &c_int2, &ctotal))
    {
        return NULL;
    }
    if (!PySequence_Check(t_str1) && !PySequence_Check(c_str2)) return NULL;
    else
    {
        //All things t
        tseq = PySequence_Fast(t_str1, "iterable");
        t_seqlen = PySequence_Fast_GET_SIZE(tseq);
        t_counts = PySequence_Fast(t_int1);
        //All things c
        cseq = PySequence_Fast(c_str2);
        c_seqlen = PySequence_Fast_GET_SIZE(cseq);
        c_counts = PySequence_Fast(c_int2);
        //Make c arrays of all things tag and cat
        for (i=0; i<t_seqlen; i++)
        {
            tstr[i] = PySequence_Fast_GET_ITEM(tseq, i);
            tcounts[i] = PySequence_Fast_GET_ITEM(t_counts, i);
        }
        for (i=0; i<c_seqlen; i++)
        {
            cstr[i] = PySequence_Fast_GET_ITEM(cseq, i);
            ccounts[i] = PySequence_Fast_GET_ITEM(c_counts, i);
        }
    }
OR
PyArg_ParseTuple(args, "(s:)(i:)(s:)(i:)", &s1, &i1, &s2, &i2)
And then again while returning,
Py_BuildValue("sisi", arr_str1,arr_int1,arr_str2,arr_int2) ??
In fact, if someone could clarify the various PyArg_ParseTuple formats in detail, that would be of great benefit. The Python/C API documentation, as I find it, is not exactly a tutorial on things to do.
You can use PyArg_ParseTuple to parse a real tuple that has a fixed structure. In particular, the number of items in the subtuples cannot change.
As the 2.7.5 documentation says, your format "(s#:):#(i:)#(s#:):#(i:)#" is wrong since : cannot occur in nested parentheses. The format "(sss)(iii)(sss)(iii)", along with a total of 12 pointer arguments, should match your arguments. Likewise, for Py_BuildValue you can use the same format string (which creates 4 tuples within 1 tuple), or "(sss)[iii](sss)[iii]" if the type matters (this makes the integers end up in lists instead of tuples).
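For reference, here is a minimal sketch of what that fixed-shape variant could look like. The function and variable names are mine, and it assumes the sub-sequences really do always hold exactly three items each:

static PyObject* stuff_here_fixed(PyObject *self, PyObject *args)
{
    const char *s1a, *s1b, *s1c, *s2a, *s2b, *s2c;
    int i1a, i1b, i1c, i2a, i2b, i2c;

    /* each parenthesised group consumes one sequence argument and
       unpacks exactly three items from it */
    if (!PyArg_ParseTuple(args, "(sss)(iii)(sss)(iii)",
                          &s1a, &s1b, &s1c, &i1a, &i1b, &i1c,
                          &s2a, &s2b, &s2c, &i2a, &i2b, &i2c))
        return NULL;

    /* ... computation ... */

    /* rebuild the same shape on the way out; use [iii] instead of (iii)
       if the integers should come back as lists rather than tuples */
    return Py_BuildValue("(sss)(iii)(sss)(iii)",
                         s1a, s1b, s1c, i1a, i1b, i1c,
                         s2a, s2b, s2c, i2a, i2b, i2c);
}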