How to pass an array from C to an embedded python script - python

I am running to some problems and would like some help. I have a piece code, which is used to embed a python script. This python script contains a function which will expect to receive an array as an argument (in this case I am using numpy array within the python script).
I would like to know how can I pass an array from C to the embedded python script as an argument for the function within the script. More specifically can someone show me a simple example of this.

Really, the best answer here is probably to use numpy arrays exclusively, even from your C code. But if that's not possible, then you have the same problem as any code that shares data between C types and Python types.
In general, there are at least five options for sharing data between C and Python:
Create a Python list or other object to pass.
Define a new Python type (in your C code) to wrap and represent the array, with the same methods you'd define for a sequence object in Python (__getitem__, etc.).
Cast the pointer to the array to intptr_t, or to explicit ctypes type, or just leave it un-cast; then use ctypes on the Python side to access it.
Cast the pointer to the array to const char * and pass it as a str (or, in Py3, bytes), and use struct or ctypes on the Python side to access it.
Create an object matching the buffer protocol, and again use struct or ctypes on the Python side.
In your case, you want to use numpy.arrays in Python. So, the general cases become:
Create a numpy.array to pass.
(probably not appropriate)
Pass the pointer to the array as-is, and from Python, use ctypes to get it into a type that numpy can convert into an array.
Cast the pointer to the array to const char * and pass it as a str (or, in Py3, bytes), which is already a type that numpy can convert into an array.
Create an object matching the buffer protocol, and which again I believe numpy can convert directly.
For 1, here's how to do it with a list, just because it's a very simple example (and I already wrote it…):
PyObject *makelist(int array[], size_t size) {
PyObject *l = PyList_New(size);
for (size_t i = 0; i != size; ++i) {
PyList_SET_ITEM(l, i, PyInt_FromLong(array[i]));
}
return l;
}
And here's the numpy.array equivalent (assuming you can rely on the C array not to be deleted—see Creating arrays in the docs for more details on your options here):
PyObject *makearray(int array[], size_t size) {
npy_int dim = size;
return PyArray_SimpleNewFromData(1, &dim, (void *)array);
}
At any rate, however you do this, you will end up with something that looks like a PyObject * from C (and has a single refcount), so you can pass it as a function argument, while on the Python side it will look like a numpy.array, list, bytes, or whatever else is appropriate.
Now, how do you actually pass function arguments? Well, the sample code in Pure Embedding that you referenced in your comment shows how to do this, but doesn't really explain what's going on. There's actually more explanation in the extending docs than the embedding docs, specifically, Calling Python Functions from C. Also, keep in mind that the standard library source code is chock full of examples of this (although some of them aren't as readable as they could be, either because of optimization, or just because they haven't been updated to take advantage of new simplified C API features).
Skip the first example about getting a Python function from Python, because presumably you already have that. The second example (and the paragraph right about it) shows the easy way to do it: Creating an argument tuple with Py_BuildValue. So, let's say we want to call a function you've got stored in myfunc with the list mylist returned by that makelist function above. Here's what you do:
if (!PyCallable_Check(myfunc)) {
PyErr_SetString(PyExc_TypeError, "function is not callable?!");
return NULL;
}
PyObject *arglist = Py_BuildValue("(o)", mylist);
PyObject *result = PyObject_CallObject(myfunc, arglist);
Py_DECREF(arglist);
return result;
You can skip the callable check if you're sure you've got a valid callable object, of course. (And it's usually better to check when you first get myfunc, if appropriate, because you can give both earlier and better error feedback that way.)
If you want to actually understand what's going on, try it without Py_BuildValue. As the docs say, the second argument to [PyObject_CallObject][6] is a tuple, and PyObject_CallObject(callable_object, args) is equivalent to apply(callable_object, args), which is equivalent to callable_object(*args). So, if you wanted to call myfunc(mylist) in Python, you have to turn that into, effectively, myfunc(*(mylist,)) so you can translate it to C. You can construct a tuple like this:
PyObject *arglist = PyTuple_Pack(1, mylist);
But usually, Py_BuildValue is easier (especially if you haven't already packed everything up as Python objects), and the intention in your code is clearer (just as using PyArg_ParseTuple is simpler and clearer than using explicit tuple functions in the other direction).
So, how do you get that myfunc? Well, if you've created the function from the embedding code, just keep the pointer around. If you want it passed in from the Python code, that's exactly what the first example does. If you want to, e.g., look it up by name from a module or other context, the APIs for concrete types like PyModule and abstract types like PyMapping are pretty simple, and it's generally obvious how to convert Python code into the equivalent C code, even if the result is mostly ugly boilerplate.
Putting it all together, let's say I've got a C array of integers, and I want to import mymodule and call a function mymodule.myfunc(mylist) that returns an int. Here's a stripped-down example (not actually tested, and no error handling, but it should show all the parts):
int callModuleFunc(int array[], size_t size) {
PyObject *mymodule = PyImport_ImportModule("mymodule");
PyObject *myfunc = PyObject_GetAttrString(mymodule, "myfunc");
PyObject *mylist = PyList_New(size);
for (size_t i = 0; i != size; ++i) {
PyList_SET_ITEM(l, i, PyInt_FromLong(array[i]));
}
PyObject *arglist = Py_BuildValue("(o)", mylist);
PyObject *result = PyObject_CallObject(myfunc, arglist);
int retval = (int)PyInt_AsLong(result);
Py_DECREF(result);
Py_DECREF(arglist);
Py_DECREF(mylist);
Py_DECREF(myfunc);
Py_DECREF(mymodule);
return retval;
}
If you're using C++, you probably want to look into some kind of scope-guard/janitor/etc. to handle all those Py_DECREF calls, especially once you start doing proper error handling (which usually means early return NULL calls peppered through the function). If you're using C++11 or Boost, unique_ptr<PyObject, Py_DecRef> may be all you need.
But really, a better way to reduce all that ugly boilerplate, if you plan to do a lot of C<->Python communication, is to look at all of the familiar frameworks designed for improving extending Python—Cython, boost::python, etc. Even though you're embedding, you're effectively doing the same work as extending, so they can help in the same ways.
For that matter, some of them also have tools to help the embedding part, if you search around the docs. For example, you can write your main program in Cython, using both C code and Python code, and cython --embed. You may want to cross your fingers and/or sacrifice some chickens, but if it works, it's amazingly simple and productive. Boost isn't nearly as trivial to get started, but once you've got things together, almost everything is done in exactly the way you'd expect, and just works, and that's just as true for embedding as extending. And so on.

The Python function will need a Python object to be passed in. Since you want that Python object to be a NumPy array, you should use one of the NumPy C-API functions for creating arrays; PyArray_SimpleNewFromData() is probably a good start. It will use the buffer provided, without copying the data.
That said, it is almost always easier to write the main program in Python and use a C extension module for the C code. This approach makes it easier to let Python do the memory management, and the ctypes module together with Numpy's cpython extensions make it easy to pass a NumPy array to a C function.

Related

How to use C keyword restrict to decorate return pointer correctly?

In some documents, I learned that we shall use 'restrict' to decorate function parameters or memory allocation statements. Like this:
void(int*restrict paraA){}
int*restrict A = (int*)malloc(10 * sizeof(int);
And I'm confused about whether a 'restricted' pointer can be returned by any function.
Here I have a function funcA:
int* funcA(paraA, paraB){
int*restrict result = (int*)calloc(10,sizeof(int));
......
return result;
}
Could I just use the return pointer like this? Will it be safe?
void funcB(){
int*restrict pA = funcA(0, 0);
pA[0] = 1;
}
In msvc, I learn about __declspec(restrict).
Should I use __declspec(restrict) int* funcA() or int*restrict funcA() instead in gcc?
And if I want to use funcA in Python, will __declspec(dllexport,restrict) int* funcA() effect the result in Python compared with declspec(dllexport) int* funcA()?
This usage of restrict doesn't make much sense. The keyword only serves a purpose when you have two or more pointers to compatible types and the compiler can't know if they potentially point at the same object. Or if they possibly point to an object with external linkage ("global") in the same translation unit. More details here.
In funcA the pointer has no relation to any other pointer or object in the program, since it is assigned to dynamic memory inside the function. restrict fills no obvious purpose at all in that function.

Converting Arrays and Tensors in Chaquopy

How do I do this?
I saw your post saying that you can just pass java Objects to Python methods but this is not working for numpy arrays and TensorFlow tensors. The following, and various variation of this, is what I tried and to no avail.
double[][] anchors = new double[][]{{0.57273, 0.677385}, {1.87446, 2.06253}, {3.33843, 5.47434}, {7.88282, 3.52778}, {9.77052, 9.16828}};
PyObject anchors_ = numpy.callAttr("array", anchors);
I also tried to use concatenate to create this but it does not work. This is because concatenate (and stack, etc.) require a sequence containing the names of the arrays to be passed as an argument and there does not seem to be a way to do that with Chaquopy in Java.
Any advice?
I assume the error you received was "ValueError: only 2 non-keyword arguments accepted".
You probably also got a warning from Android Studio in the call to numpy.array, saying "Confusing argument 'anchors', unclear if a varargs or non-varargs call is desired". This is the source of the problem. You intended to pass one double[][] argument, but unfortunately Java has interpreted it as five double[] arguments.
Android Studio should offer you an automatic fix of casting the parameter to Object, i.e.:
numpy.callAttr("array", (Object)anchors);
This tells the Java compiler that you intend to pass only one argument, and numpy.array will then work correctly.
I have managed to find two ways that actually work in converting this toy array into proper Python arrays.
In Java:
import com.chaquo.python.*;
Python py = Python.getInstance();
PyObject np = py.getModule("numpy");
PyObject anchors_final = np.callAttr("array", anchors[0]);
anchors_final = np.callAttr("expand_dims", anchors_final, 0);
for (int i=1; i < anchors.length; i++){
PyObject temp_arr = np.callAttr("expand_dims", anchors[i], 0);
anchors_final = np.callAttr("append", anchors_final, temp_arr, 0);
}
// Then you can pass it to your Python file to do whatever
In Python (the simpler way)
After passing the array to your Python function, using for example:
import com.chaquo.python.*;
Python py = Python.getInstance();
PyObject pp = py.getModule("file_name");
PyObject output = pp.callAttr("fnc_head", anchors);
In your Python file, you can simply do:
def fnc_head():
anchors = [list(x) for x in anchors]
...
return result
These were tested with 2-d arrays. Other array types would likely require modifications.

Are __next__ and __str__ invoked by the equivalent next and str functions internally?

From Learning python book 5th Edition:
Page 421, footnote2:
Technically speaking, the for loop calls the internal equivalent of I.__next__, instead of the next(I) used here, though there is rarely any difference between the two. Your manual iterations can generally use either call scheme.
What does this exactly mean? Does it mean that that I.__next__ is invoked by a C function instead of str builtin function in the forloop or any builtin iteration contexts?
Page 914:
__str__ is tried first for the print operation and the str built-in function (the internal equivalent of which print runs). It generally should return a user-friendly display.
Aside from book, Does Python calls __str__ or __next__ using C functions internally as I understood from the book?
Python C implementations use C functions that are essentially the same thing as the Python functions, in that the Python functions like str() and next() are usually thin wrappers around the C functions.
These C functions then take care of calling the right hook; this could be the C version of the hook (a slot in a structure pointing to a function), or the Python function on a class.
Now, both str() and next() are a little more than wrappers here, because there is additional functionality defined by these functions that require a little more implementation work; next() takes a 2nd argument that defines a default, for example.
So I'll take len() as an example instead. The function is defined in the builtin_len() C function:
static PyObject *
builtin_len(PyObject *self, PyObject *v)
{
Py_ssize_t res;
res = PyObject_Size(v);
if (res < 0 && PyErr_Occurred())
return NULL;
return PyInt_FromSsize_t(res);
}
Note the call PyObject_Size(); that's what C code would use to get the length of an object. The rest is just error handling and producing a Python int object.
PyObject_Size() then is implemented like this:
Py_ssize_t
PyObject_Size(PyObject *o)
{
PySequenceMethods *m;
if (o == NULL) {
null_error();
return -1;
}
m = o->ob_type->tp_as_sequence;
if (m && m->sq_length)
return m->sq_length(o);
return PyMapping_Size(o);
}
It takes a PyObject structure, finds the ob_type structure from there, which has an optional tp_as_sequence structure, which can define a sq_length function pointer. If it exists, it is called to produce the actual length. Different types can define that function, and a special C structure for Python instances can handle redirecting back to a Python method.
All this shows that Python's internal implementation uses a lot of abstractions to implement objects, allowing both C-defined types and Python classes to be treated the same, mostly. If you want to dig deeper, the Python documentation has full coverage of the C-API, including a dedicated tutorial.
Circling back to your original two functions, the internal equivalent of next() is PyIter_Next(), and str(), as used for string conversions of arbitrary objects, is PyObject_Str().

Create object from string repesentation in C API

I am working on a system which is embedding a Python interpreter, and I need to construct a PyObject* given a string from the C API.
I have a const char* representing a dictionary, in the proper format for eval() to work properly from within Python, ie: "{'bar': 42, 'baz': 50}".
Currently, this is being passed into Python as a PyObject* using the Py_Unicode_ api (representing a string), so in my python interpreter, I can successfully write:
foo = eval(myObject.value)
print(foo['bar']) # prints 42
I would like to change this to automatically "eval" the const char* on the C side, and return a PyObject* representing a completed dictionary. How do I go about converting this string into a dictionary in the C API?
There are two basic ways to do this.
The first is to simply call eval the same way you do in Python. The only trick is that you need a handle to the builtins module, because you don't get that for free in the C API. There are a number of ways to do this, but one really easy way is to just import it:
/* or PyEval_GetBuiltins() if you know you're at the interpreter's top level */
PyObject *builtins = PyImport_ImportModule("builtins");
PyObject *eval = PyObject_GetAttrString(builtins, "eval");
PyObject *args = Py_BuildValue("(s)", expression_as_c_string);
PyObject *result = PyObject_Call(eval, args);
(This is untested code, and it at least leaks references, and doesn't check for NULL return if you want to handle exceptions on the C side… But it should be enough to get the idea across.)
One nice thing about this is that you can use ast.literal_eval in exactly the same way as eval (which means you get some free validation); just change "builtins" to "ast", and "eval" to "literal_eval". But the real win is that you're doing exactly what eval does in Python, which you already know is exactly what you wanted.
The alternative is to use the compilation APIs. At the really high level, you can just build a Python statement out of "foo = eval(%s)" and PyRun_SimpleString it. Below that, use Py_CompileString to parse and compile the expression (you can also parse and compile in separate steps, but that isn't useful here), then PyEval_EvalCode to evaluate it in the appropriate globals and locals. (If you're not tracking globals yourself, use the interpreter-reflection APIs PyEval_GetLocals and PyEval_GetGlobals.) Note that I'm giving the super-simplified version of each function; often you want to use one of the sibling functions. But you can find them easily in the docs.

Boost.Python function pointers as class constructor argument

I have a C++ class that requires a function pointer in it's constructor (float(*myfunction)(vector<float>*))
I've already exposed some function pointers to Python.
The ideal way to use this class is something like this:
import mymodule
mymodule.some_class(mymodule.some_function)
So I tell Boost about this class like so:
class_<SomeClass>("some_class", init<float(*)(vector<float>*)>);
But I get:
error: no matching function for call to 'register_shared_ptr1(Sample (*)(std::vector<double, std::allocator<double> >*))'
when I try to compile it.
So, does anyone have any ideas on how I can fix the error without losing the flexibility gained from function pointers (ie no falling back to strings that indicate which function to call)?
Also, the main point of writing this code in C++ is for speed. So it would be nice if I was still able to keep that benefit (the function pointer gets assigned to a member variable during initialization and will get called over a million times later on).
OK, so this is a fairly difficult question to answer in general. The root cause of your problem is that there really is no python type which is exactly equivalent to a C function pointer. Python functions are sort-of close, but their interface doesn't match for a few reasons.
Firstly, I want to mention the technique for wrapping a constructor from here:
http://wiki.python.org/moin/boost.python/HowTo#namedconstructors.2BAC8factories.28asPythoninitializers.29. This lets you write an __init__ function for your object that doesn't directly correspond to an actual C++ constructor. Note also, that you might have to specify boost::python::no_init in the boost::python::class_ construction, and then def a real __init__ function later, if your object isn't default-constructible.
Back to the question:
Is there only a small set of functions that you'll usually want to pass in? In that case, you could just declare a special enum (or specialized class), make an overload of your constructor that accepts the enum, and use that to look up the real function pointer. You can't directly call the functions yourself from python using this approach, but it's not that bad, and the performance will be the same as using real function pointers.
If you want to provide a general approach that will work for any python callable, things get more complex. You'll have to add a constructor to your C++ object that accepts a general functor, e.g. using boost::function or std::tr1::function. You could replace the existing constructor if you wanted, because function pointers will convert to this type correctly.
So, assuming you've added a boost::function constructor to SomeClass, you should add these functions to your python wrapping code:
struct WrapPythonCallable
{
typedef float * result_type;
explicit WrapPythonCallable(const boost::python::object & wrapped)
: wrapped_(wrapped)
{ }
float * operator()(vector<float>* arg) const
{
//Do whatever you need to do to convert into a
//boost::python::object here
boost::python::object arg_as_python_object = /* ... */;
//Call out to python with the object - note that wrapped_
//is callable using an operator() overload, and returns
//a boost::python::object.
//Also, the call can throw boost::python::error_already_set -
//you might want to handle that here.
boost::python::object result_object = wrapped_(arg_as_python_object);
//Do whatever you need to do to extract a float * from result_object,
//maybe using boost::python::extract
float * result = /* ... */;
return result;
}
boost::python::object wrapped_;
};
//This function is the "constructor wrapper" that you'll add to SomeClass.
//Change the return type to match the holder type for SomeClass, like if it's
//held using a shared_ptr.
std::auto_ptr<SomeClass> CreateSomeClassFromPython(
const boost::python::object & callable)
{
return std::auto_ptr<SomeClass>(
new SomeClass(WrapPythonCallable(callable)));
}
//Later, when telling Boost.Python about SomeClass:
class_<SomeClass>("some_class", no_init)
.def("__init__", make_constructor(&CreateSomeClassFromPython));
I've left out details on how to convert pointers to and from python - that's obviously something that you'll have to work out, because there are object lifetime issues there.
If you need to call the function pointers that you'll pass in to this function from Python, then you'll need to def these functions using Boost.Python at some point. This second approach will work fine with these def'd functions, but calling them will be slow, because objects will be unnecessarily converted to and from Python every time they're called.
To fix this, you can modify CreateSomeClassFromPython to recognize known or common function objects, and replace them with their real function pointers. You can compare python objects' identity in C++ using object1.ptr() == object2.ptr(), equivalent to id(object1) == id(object2) in python.
Finally, you can of course combine the general approach with the enum approach. Be aware when doing this, that boost::python's overloading rules are different from C++'s, and this can bite you when dealing with functions like CreateSomeClassFromPython. Boost.Python tests functions in the order that they are def'd to see if the runtime arguments can be converted to the C++ argument types. So, CreateSomeClassFromPython will prevent single-argument constructors def'd later than it from being used, because its argument matches any python object. Be sure to put it after other single-argument __init__ functions.
If you find yourself doing this sort of thing a lot, then you might want to look at the general boost::function wrapping technique (mentioned on the same page with the named constructor technique): http://wiki.python.org/moin/boost.python/HowTo?action=AttachFile&do=view&target=py_boost_function.hpp.

Categories

Resources