In Python you can write list[2:] to get every element after the second element. Is there any way to do the same thing with an array in C++?
I hope I have come up with an acceptable answer. Unfortunately, before C++23 you cannot overload operator[] with more than one argument, so I used operator() instead.
#include <iostream>
#include <vector>
template <typename T>
class WrapperVector
{
private:
std::vector<T> data;
public:
WrapperVector(size_t reserve_size)
{
data.reserve(reserve_size);
}
WrapperVector(typename std::vector<T>::iterator start, typename std::vector<T>::iterator end)
{
data = std::vector<T>(start, end);
}
// appends element to the end of container, just like in Python
void append(T element)
{
data.push_back(element);
}
/* instead of operator[], operator() must be used
because operator[] can't accept more than one argument */
// instead of self[x:y], use this(x, y)
WrapperVector<T> operator()(size_t start, size_t end)
{
return WrapperVector<T>(data.begin() + start, data.begin() + end);
}
// instead of self[x:], use this(x)
WrapperVector<T> operator()(size_t start)
{
return WrapperVector<T>(data.begin() + start, data.end());
}
// prints all elements to cout
void print()
{
if (!data.size())
{
std::cout << "No elements.\n";
return;
}
std::cout << data[0];
size_t length = data.size();
for(size_t i=1; i < length; i++)
std::cout << ' ' << data[i];
std::cout << '\n';
}
};
int main()
{
WrapperVector<int> w(5);
w.append(1);
w.append(2);
w.append(3);
w.append(4);
w.append(5);
w(0).print();
w(1, 3).print();
// you can also save the slice
WrapperVector<int> w2 = w(2);
WrapperVector<int> w3 = w(2, 4);
w2.print();
w3.print();
return 0;
}
Now you could even overload it to accept three arguments to account for the step, just like in Python. I leave that as an exercise for you.
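A minimal sketch of what the three-argument version could look like. It is written as a free function on std::vector rather than as a member of WrapperVector so that it stands alone. The name slice_step is hypothetical, and unlike Python it only handles a positive step.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: copies v[start:end:step] for a positive step.
// An end past the vector's size is clamped, mirroring Python's lenient slicing.
template <typename T>
std::vector<T> slice_step(const std::vector<T>& v,
                          std::size_t start, std::size_t end, std::size_t step)
{
    if (step == 0)
        throw std::invalid_argument("step must be positive");
    if (end > v.size())
        end = v.size();
    std::vector<T> out;
    for (std::size_t i = start; i < end; ) {
        out.push_back(v[i]);
        if (end - i <= step)
            break;          // the next index would reach or pass end
        i += step;
    }
    return out;
}
```

With a vector holding 1 2 3 4 5, slice_step(v, 0, 5, 2) would play the role of Python's v[0:5:2].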
Instead of creating a custom class that introduces some amount of maintenance and certain limitations you can use the concept of iterators. Two iterators are used to represent a range of values. For example the function
template<typename TIterator>
void foo(TIterator start, TIterator end);
would take a range of objects that is only specified by two iterators. It's a template, so it can take iterators from different containers. To select a sub-range from a range given this way you can use std::next() and std::prev(). For example for a
std::vector<int> list;
you can call foo with a subrange starting from third element as:
foo(std::next(list.begin(), 2), list.end());
or call foo with a subrange from the third to the second last element:
foo(std::next(list.begin(), 2), std::prev(list.end(), 1));
If you need/want to copy a subrange you can easily do that. For example
std::vector<int> subList(std::next(list.begin(), 2),
std::prev(list.end(), 1));
would create a vector subList that contains the third through second-to-last elements of list.
Of course, those are just simple examples. In a real-world application you need to check that those subranges are valid and actually exist.
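One way such a check could look, as a sketch only. The helper name checked_subrange is hypothetical, and it takes indices instead of iterators so that the bounds can be compared against size() before any iterator is advanced.

```cpp
#include <cstddef>
#include <iterator>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper: returns iterators for the half-open index range
// [first, last) of v, or throws if that range does not exist.
template <typename T>
std::pair<typename std::vector<T>::const_iterator,
          typename std::vector<T>::const_iterator>
checked_subrange(const std::vector<T>& v, std::size_t first, std::size_t last)
{
    if (first > last || last > v.size())
        throw std::out_of_range("subrange is outside the vector");
    return {std::next(v.begin(), static_cast<std::ptrdiff_t>(first)),
            std::next(v.begin(), static_cast<std::ptrdiff_t>(last))};
}
```

The returned pair can be passed on as r.first and r.second wherever a start and end iterator are expected, for example to the std::vector range constructor.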
The advantages are:
no extra wrapper classes
no need to copy any data (only the iterators themselves)
the standard library works almost exclusively with iterators to represent ranges, so it's good to stick to that concept.
With templates you can easily support all containers in the standard library as long as their iterators satisfy the requirements listed in the reference. (You could use the above examples with std::array, std::list and most other containers without any modifications, except for the type of list, of course.)
iterators are an interface for user or third party containers, as long as they provide iterators that satisfy the requirements listed in the reference.
On the downside:
the code gets a bit more complex
it may take some time to understand iterators, because the concept is quite different from what other languages use.
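To illustrate the point about supporting different containers, here is a small sketch. sum_range is a hypothetical stand-in for foo, and sum_from_third is an invented wrapper that selects the subrange starting at the third element.

```cpp
#include <array>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical stand-in for foo: sums every element in [start, end).
template <typename TIterator>
long sum_range(TIterator start, TIterator end)
{
    long total = 0;
    for (; start != end; ++start)
        total += *start;
    return total;
}

// Sums everything from the third element on. Assumes the container holds
// at least two elements, otherwise std::next would step past end().
template <typename Container>
long sum_from_third(const Container& c)
{
    return sum_range(std::next(c.begin(), 2), c.end());
}
```

The same sum_from_third call compiles unchanged for a std::vector<int>, a std::list<int> and a std::array<int, 5>, since all of them provide iterators that std::next can advance.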
I'm venturing into C extensions for the first time, and am somewhat new to C as well. I've got a working C extension; however, if I repeatedly call the utility in Python, I eventually get a segmentation fault: 11.
#include <Python.h>
static PyObject *getasof(PyObject *self, PyObject *args) {
PyObject *fmap;
long dt;
if (!PyArg_ParseTuple(args, "Ol", &fmap, &dt))
return NULL;
long length = PyList_Size(fmap);
for (int i = 0; i < length; i++) {
PyObject *event = PyList_GetItem(fmap, i);
long dti = PyInt_AsLong(PyList_GetItem(event, 0));
if (dti > dt) {
PyObject *output = PyList_GetItem(event, 1);
return output;
}
}
Py_RETURN_NONE;
};
The function args are
a time series (list of lists): ex [[1, 'a'], [5, 'b']]
a time (long): ex 4
It's supposed to iterate over the list of lists until it finds a value greater than the time given, then return that value. As I mentioned, it correctly returns the answer, but if I call it enough times, it segfaults.
My gut feeling is that this has to do with reference counting, but I'm not familiar enough with the concept to know if this is the direct cause.
Any help would be appreciated.
"My gut feeling is that this has to do with reference counting..." Your instincts are correct.
PyList_GetItem returns a borrowed reference, which means your function doesn't "own" a reference to the item. So there is a problem here:
PyObject *output = PyList_GetItem(event, 1);
return output;
You don't own a reference to the item, but you return it to the caller, so the caller doesn't own a reference either. The caller will run into a problem if the item is garbage collected while the caller is still trying to use it. So you'll need to increase the reference count of the item before you return it:
PyObject *output = PyList_GetItem(event, 1);
Py_INCREF(output);
return output;
That assumes that PyList_GetItem(event, 1) doesn't fail! Except for PyArg_ParseTuple, you aren't checking the return values of the C API functions, which means you are assuming the input argument always has the exact structure that you expect. That's fine while you're testing code and figuring out how this works, but eventually you should be checking the return values of the C API functions for failure, and handling it appropriately.
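The ownership rule can be illustrated without Python at all. The following is a toy model in plain C++, not the CPython API: every name in it (ToyObject, toy_incref, toy_decref, toy_return_owned) is invented for the illustration, and it only mimics what Py_INCREF and Py_DECREF do to an object's reference count.

```cpp
// Toy model of reference counting, NOT the CPython API.
struct ToyObject {
    int refcount;
    int value;
};

// Creates an object whose single reference belongs to the caller.
ToyObject* toy_new(int value) { return new ToyObject{1, value}; }

void toy_incref(ToyObject* o) { ++o->refcount; }

// Drops one reference; frees the object and returns true when none remain.
bool toy_decref(ToyObject* o)
{
    if (--o->refcount == 0) {
        delete o;
        return true;
    }
    return false;
}

// Mirrors the corrected return path: a borrowed pointer becomes a new
// owned reference before it is handed to the caller.
ToyObject* toy_return_owned(ToyObject* borrowed)
{
    toy_incref(borrowed);
    return borrowed;
}
```

If the container later drops its reference with toy_decref, the count falls from 2 to 1 and the caller's pointer stays valid. Without the extra toy_incref, the same release would free the object while the caller still holds a pointer to it, which is the situation behind the segfault.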
Asked because of this: Default argument in c++
Say I have a function such as this: void f(int p1=1, int p2=2, int p3=3, int p4=4);
And I want to call it using only some of the arguments - the rest will be the defaults.
Something like this would work:
template<bool P1=true, bool P2=true, bool P3=true, bool P4=true>
void f(int p1=1, int p2=2, int p3=3, int p4=4);
// specialize:
template<>
void f<false, true, false, false>(int p1) {
f(1, p1);
}
template<>
void f<false, true, true, false>(int p1, int p2) {
f(1, p1, p2);
}
// ... and so on.
// Would need a specialization for each combination of arguments
// which is very tedious and error-prone
// Use:
f<false, true, false, false>(5); // passes 5 as p2 argument
But it requires too much code to be practical.
Is there a better way to do this?
Use the Named Parameters Idiom (→ FAQ link).
The Boost.Parameter library (→ link) can also solve this task, but at the cost of verbose code and greatly reduced clarity. It is also deficient in handling constructors. And it requires having the Boost library installed, of course.
Have a look at the Boost.Parameter library.
It implements named parameters in C++. Example:
#include <boost/parameter/name.hpp>
#include <boost/parameter/preprocessor.hpp>
#include <iostream>
//Define
BOOST_PARAMETER_NAME(p1)
BOOST_PARAMETER_NAME(p2)
BOOST_PARAMETER_NAME(p3)
BOOST_PARAMETER_NAME(p4)
BOOST_PARAMETER_FUNCTION(
(void),
f,
tag,
(optional
(p1, *, 1)
(p2, *, 2)
(p3, *, 3)
(p4, *, 4)))
{
std::cout << "p1: " << p1
<< ", p2: " << p2
<< ", p3: " << p3
<< ", p4: " << p4 << "\n";
}
//Use
int main()
{
//Prints "p1: 1, p2: 5, p3: 3, p4: 4"
f(_p2=5);
}
Although Boost.Parameter is amusing, it unfortunately suffers from a number of issues, among them placeholder collision (and having to debug quirky preprocessor and template errors):
BOOST_PARAMETER_NAME(p1)
Will create the _p1 placeholder that you then use later on. If you have two different headers declaring the same placeholder, you get a conflict. Not fun.
There is a much simpler answer (both conceptually and practically), loosely based on the Builder Pattern: the Named Parameters Idiom.
Instead of specifying such a function:
void f(int a, int b, int c = 10, int d = 20);
You specify a structure, on which you will overload operator():
the constructor is used to ask for mandatory arguments (not strictly in the Named Parameters Idiom, but nobody said you had to follow it blindly), and default values are set for the optional ones
each optional parameter is given a setter
Generally, this is combined with chaining, which consists of making the setters return a reference to the current object so that the calls can be chained on a single line.
class f {
public:
// Take mandatory arguments, set default values
f(int a, int b): _a(a), _b(b), _c(10), _d(20) {}
// Define setters for optional arguments
// Remember the Chaining idiom
f& c(int v) { _c = v; return *this; }
f& d(int v) { _d = v; return *this; }
// Finally define the invocation function
void operator()() const;
private:
int _a;
int _b;
int _c;
int _d;
}; // class f
The invocation is:
f(/*a=*/1, /*b=*/2).c(3)(); // the last () being to actually invoke the function
I've seen a variant that puts the mandatory arguments as parameters to operator(); this avoids keeping the arguments as attributes, but the syntax is a bit weirder:
f().c(3)(/*a=*/1, /*b=*/2);
Once the compiler has inlined all the constructor and setter calls (which is why they are defined here, while operator() is not), the result should be about as efficient as a "regular" function invocation.
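Applied to the function from the question, where all four parameters are optional, the idiom might look like the sketch below. The class name FArgs is hypothetical, and operator() simply returns the four values the real f would receive so that the effect is easy to inspect.

```cpp
#include <array>

// Hypothetical argument object for f: every parameter is optional and
// starts at the default from the question (1, 2, 3, 4).
class FArgs {
public:
    // Chained setters, one per optional parameter
    FArgs& p1(int v) { _p1 = v; return *this; }
    FArgs& p2(int v) { _p2 = v; return *this; }
    FArgs& p3(int v) { _p3 = v; return *this; }
    FArgs& p4(int v) { _p4 = v; return *this; }

    // Invocation: here it only reports the values f would be called with.
    std::array<int, 4> operator()() const { return {{_p1, _p2, _p3, _p4}}; }

private:
    int _p1 = 1;
    int _p2 = 2;
    int _p3 = 3;
    int _p4 = 4;
};
```

FArgs().p2(5)() then stands in for the question's f<false, true, false, false>(5): only p2 is overridden and the other three keep their defaults.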
This isn't really an answer, but...
In C++ Template Metaprogramming by David Abrahams and Aleksey Gurtovoy (published in 2004!) the authors talk about this:
While writing this book, we reconsidered the interface used for named
function parameter support. With a little experimentation we
discovered that it’s possible to provide the ideal syntax by using
keyword objects with overloaded assignment operators:
f(slew = .799, name = "z");
They go on to say:
We’re not going to get into the implementation details of this named
parameter library here; it’s straightforward enough that we suggest
you try implementing it yourself as an exercise.
This was in the context of template metaprogramming and Boost.MPL. I'm not too sure how their "straightforward" implementation would jibe with default parameters, but I assume it would be transparent.
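For what it's worth, here is one rough way such keyword objects could be built. This is not the implementation from the book or from Boost.Parameter, just a sketch of the idea: all type names, and the default values 0.0 and "unnamed", are assumptions made for the example.

```cpp
#include <string>
#include <utility>

// The parameters f understands, with hypothetical defaults.
struct Params {
    double slew = 0.0;
    std::string name = "unnamed";
};

// Tagged values produced by "keyword = value"; each knows which field it sets.
struct SlewArg { double value; void apply(Params& p) const { p.slew = value; } };
struct NameArg { std::string value; void apply(Params& p) const { p.name = value; } };

// Keyword objects: their overloaded operator= builds a tagged value
// instead of assigning anything.
struct SlewKeyword { SlewArg operator=(double v) const { return SlewArg{v}; } };
struct NameKeyword { NameArg operator=(std::string v) const { return NameArg{std::move(v)}; } };

const SlewKeyword slew{};
const NameKeyword name{};

// Accepts any number of tagged values in any order; omitted ones keep defaults.
template <typename... Args>
Params f(const Args&... args)
{
    Params p;
    int expand[] = {0, (args.apply(p), 0)...};  // applies each argument in order
    (void)expand;
    return p;
}
```

With this in place, f(slew = .799, name = "z") compiles as written in the quote, and f(name = "z") leaves slew at its default, so at least in this sketch default parameters fall out naturally.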
I have written a C++ class and I am using SWIG to make a Python version of it. I would like to overload the constructor so that it can take in Python lists. For example:
>>> import example
>>> a = example.Array([1,2,3,4])
I was attempting to use the typemap feature in SWIG, but the scope of a typemap does not include code in %extend.
Here is a similar example to what I have...
%typemap(in) double[]
{
if (!PyList_Check($input))
return NULL;
int size = PyList_Size($input);
int i = 0;
$1 = (double *) malloc((size+1)*sizeof(double));
for (i = 0; i < size; i++)
{
PyObject *o = PyList_GetItem($input,i);
if (PyNumber_Check(o))
$1[i] = PyFloat_AsDouble(o);
else
{
PyErr_SetString(PyExc_TypeError,"list must contain numbers");
free($1);
return NULL;
}
}
$1[i] = 0;
}
%include "Array.h"
%extend Array
{
Array(double lst[])
{
Array *a = new Array();
...
/* do stuff with lst[] */
...
return a;
}
}
I know the typemap is working correctly (I wrote a small test function that just prints out elements in the double[]).
I attempted putting the typemap inside the extend clause, but that did not solve the problem.
Maybe there is another way to use Python Lists inside of the extend, but I could not find any examples.
Thanks for the help in advance.
You're really close: instead of a double lst[], extend with std::list<double>:
%include "std_list.i" // or std_vector.i
%include "Array.h"
%extend Array
{
Array(const std::list<double>& numbers) {
Array* arr = new Array;
// ... put the items of "numbers" into "arr", then
return arr; // interpreter will take ownership
}
}
SWIG should automatically convert the Python list to the std::list.
Referring to http://mail.python.org/pipermail/python-dev/2009-June/090210.html
and http://dan.iel.fm/posts/python-c-extensions/
Here are the other places I searched regarding my question:
http://article.gmane.org/gmane.comp.python.general/424736
http://joyrex.spc.uchicago.edu/bookshelves/python/cookbook/pythoncook-CHP-16-SECT-3.html
http://docs.python.org/2/c-api/sequence.html#PySequence_Check
Python extension module with variable number of arguments
I am inexperienced with the Python/C API.
I have the following code:
sm_int_list = (1,20,3)
c_int_array = (ctypes.c_int * len(sm_int_list))(*sm_int_list)
sm_str_tuple = ('some','text', 'here')
On the C extension side, I have done something like this:
static PyObject* stuff_here(PyObject *self, PyObject *args)
{
char* input;
int *i1, *i2;
char *s1, *s2;
// args = (('some','text', 'here'), [1,20,3], ('some','text', 'here'), [1,20,3])
PyArg_ParseTuple(args, "(s#:):#(i:)#(s#:):#(i:)#", &s1, &i1, &s2, &i2);
/*stuff*/
}
such that:
stuff.here(('some','text', 'here'), [1,20,3], ('some','text', 'here'), [1,20,3])
returns data in the same form as args after some computation.
I would like to know whether this PyArg_ParseTuple expression is the proper way to parse:
an array of varying string
an array of integers
UPDATE NEW
Is this the correct way?:
static PyObject* stuff_here(PyObject *self, PyObject *args)
unsigned int tint[], cint[];
ttotal=0, ctotal=0;
char *tstr, *cstr;
int *t_counts, *c_counts;
Py_ssize_t size;
PyObject *t_str1, *t_int1, *c_str2, *c_int2; //the C var that takes in the py variable value
PyObject *tseq, cseq;
int t_seqlen=0, c_seqlen=0;
if (!PyArg_ParseTuple(args, "OOiOOi", &t_str1, &t_int1, &ttotal, &c_str2, &c_int2, &ctotal))
{
return NULL;
}
if (!PySequence_Check(tag_str1) && !PySequence_Check(cat_str2)) return NULL;
else:
{
//All things t
tseq = PySequence_Fast(t_str1, "iterable");
t_seqlen = PySequence_Fast_GET_SIZE(tseq);
t_counts = PySequence_Fast(t_int1);
//All things c
cseq = PySequence_Fast(c_str2);
c_seqlen = PySequence_Fast_GET_SIZE(cseq);
c_counts = PySequence_Fast(c_int2);
//Make c arrays of all things tag and cat
for (i=0; i<t_seqlen; i++)
{
tstr[i] = PySequence_Fast_GET_ITEM(tseq, i);
tcounts[i] = PySequence_Fast_GET_ITEM(t_counts, i);
}
for (i=0; i<c_seqlen; i++)
{
cstr[i] = PySequence_Fast_GET_ITEM(cseq, i);
ccounts[i] = PySequence_Fast_GET_ITEM(c_counts, i);
}
}
OR
PyArg_ParseTuple(args, "(s:)(i:)(s:)(i:)", &s1, &i1, &s2, &i2)
And then again while returning,
Py_BuildValue("sisi", arr_str1,arr_int1,arr_str2,arr_int2) ??
In fact, it would be of great benefit if someone could clarify the various PyArg_ParseTuple formats in detail. The Python C API documentation, as I find it, is not exactly a tutorial on what to do.
You can use PyArg_ParseTuple to parse a real tuple that has a fixed structure. In particular, the number of items in the subtuples cannot change.
As the 2.7.5 documentation says, your format "(s#:):#(i:)#(s#:):#(i:)#" is wrong, since : cannot occur inside nested parentheses. The format "(sss)(iii)(sss)(iii)", along with a total of 12 pointer arguments, should match your arguments. Likewise, for Py_BuildValue you can use the same format string (which creates 4 tuples within 1 tuple), or "(sss)[iii](sss)[iii]" if the type matters (this puts the integers in lists instead of tuples).
I have a little project that works beautifully with SWIG. In particular, some of my functions return std::vectors, which get translated to tuples in Python. Now, I do a lot of numerics, so I just have SWIG convert these to numpy arrays after they're returned from the C++ code. To do this, I use something like the following in SWIG.
%feature("pythonappend") My::Cool::Namespace::Data() const %{ if isinstance(val, tuple) : val = numpy.array(val) %}
(Actually, there are several functions named Data, some of which return floats, which is why I check that val is actually a tuple.) This works just beautifully.
But, I'd also like to use the -builtin flag that's now available. Calls to these Data functions are rare and mostly interactive, so their slowness is not a problem, but there are other slow loops that speed up significantly with the builtin option.
The problem is that when I use that flag, the pythonappend feature is silently ignored. Now, Data just returns a tuple again. Is there any way I could still return numpy arrays? I tried using typemaps, but it turned into a giant mess.
Edit:
Borealid has answered the question very nicely. Just for completeness, I include a couple of related but subtly different typemaps that I need because I return by const reference and I use vectors of vectors (don't start!). These are different enough that I wouldn't want anyone else stumbling around trying to figure out the minor differences.
%typemap(out) std::vector<int>& {
npy_intp result_size = $1->size();
npy_intp dims[1] = { result_size };
PyArrayObject* npy_arr = (PyArrayObject*)PyArray_SimpleNew(1, dims, NPY_INT);
int* dat = (int*) PyArray_DATA(npy_arr);
for (size_t i = 0; i < result_size; ++i) { dat[i] = (*$1)[i]; }
$result = PyArray_Return(npy_arr);
}
%typemap(out) std::vector<std::vector<int> >& {
npy_intp result_size = $1->size();
npy_intp result_size2 = (result_size>0 ? (*$1)[0].size() : 0);
npy_intp dims[2] = { result_size, result_size2 };
PyArrayObject* npy_arr = (PyArrayObject*)PyArray_SimpleNew(2, dims, NPY_INT);
int* dat = (int*) PyArray_DATA(npy_arr);
for (size_t i = 0; i < result_size; ++i) { for (size_t j = 0; j < result_size2; ++j) { dat[i*result_size2+j] = (*$1)[i][j]; } }
$result = PyArray_Return(npy_arr);
}
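One caveat with the vector-of-vectors typemap: it takes the row length from the first row, so it assumes every row has the same size. The standalone sketch below shows the same row-major copy without any NumPy or SWIG involved, with that assumption made explicit. The function name flatten_row_major is hypothetical.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: copies a vector of vectors into one row-major buffer,
// using the same i * cols + j indexing as the typemap above.
// Throws if the rows differ in length instead of reading out of bounds.
std::vector<int> flatten_row_major(const std::vector<std::vector<int>>& rows)
{
    const std::size_t cols = rows.empty() ? 0 : rows[0].size();
    std::vector<int> flat(rows.size() * cols);
    for (std::size_t i = 0; i < rows.size(); ++i) {
        if (rows[i].size() != cols)
            throw std::invalid_argument("all rows must have the same length");
        for (std::size_t j = 0; j < cols; ++j)
            flat[i * cols + j] = rows[i][j];
    }
    return flat;
}
```

A similar length check could be added inside the typemap itself before the copy loop, if ragged input is a possibility.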
Edit 2:
Though not quite what I was looking for, similar problems may also be solved using #MONK's approach (explained here).
I agree with you that using typemap gets a little messy, but it is the right way to accomplish this task. You are also right that the SWIG documentation does not directly say that %pythonappend is incompatible with -builtin, but it is strongly implied: %pythonappend adds to the Python proxy class, and the Python proxy class does not exist at all in conjunction with the -builtin flag.
Before, what you were doing was having SWIG convert the C++ std::vector objects into Python tuples, and then passing those tuples back down to numpy - where they were converted again.
What you really want to do is convert them once, at the C level.
Here's some code which will turn all std::vector<int> objects into NumPy integer arrays:
%{
#include "numpy/arrayobject.h"
%}
%init %{
import_array();
%}
%typemap(out) std::vector<int> {
npy_intp result_size = $1.size();
npy_intp dims[1] = { result_size };
PyArrayObject* npy_arr = (PyArrayObject*)PyArray_SimpleNew(1, dims, NPY_INT);
int* dat = (int*) PyArray_DATA(npy_arr);
for (size_t i = 0; i < result_size; ++i) {
dat[i] = $1[i];
}
$result = PyArray_Return(npy_arr);
}
This uses the C-level numpy functions to construct and return an array. In order, it:
Ensures NumPy's arrayobject.h file is included in the C++ output file
Causes import_array to be called when the Python module is loaded (otherwise, all NumPy methods will segfault)
Maps any returns of std::vector<int> into NumPy arrays with a typemap
This code should be placed before you %import the headers which contain the functions returning std::vector<int>. Other than that restriction, it's entirely self-contained, so it shouldn't add too much subjective "mess" to your codebase.
If you need other vector types, duplicate the typemap above and change NPY_INT and all the int* and int occurrences to match the new element type.