Return reference to member field in PyO3

Suppose I have a Rust struct like this
struct X { ... }
struct Y {
    x: X,
}
I'd like to be able to write python code that accesses X through Y
y = Y()
y.x.some_method()
What would be the best way to implement it in PyO3? Currently I made two wrapper classes
#[pyclass]
struct XWrapper {
    x: X,
}

#[pyclass]
struct YWrapper {
    y: Y,
}

#[pymethods]
impl YWrapper {
    #[getter]
    pub fn x(&self) -> XWrapper {
        XWrapper { x: self.y.x.clone() }
    }
}
However, this requires clone(). I'd rather return a reference. Of course, I know that if X were a pyclass, then I could easily return a PyRef to it. But the problem is that X and Y come from a Rust library, and I cannot willy-nilly add #[pyclass] to them.

I don't think what you say is possible without some rejigging of the interface:
Your XWrapper owns the x and your Y owns its x as well. That means creating an XWrapper will always involve a clone (or a new).
Could we change XWrapper so that it merely contains a reference to an x? Not really, because that would require giving XWrapper a lifetime annotation, and PyO3, as far as I know, doesn't allow pyclasses with lifetime annotations. That makes sense: passing an object to Python puts it on the Python heap, at which point Rust loses control over the object.
So what can we do?
Some thoughts: Do you really need to expose the composition structure of y to the Python module? Just because that's the way it's organized within Rust doesn't mean it needs to be that way in Python. Your YWrapper could provide methods in the Python interface that, behind the scenes, forward the request to the x instance:
#[pymethods]
impl YWrapper {
    pub fn some_method(&self) {
        self.y.x.some_method();
    }
}
This would also be a welcome sight to strict adherents of the Law of Demeter ;)
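To make the forwarding idea concrete, here's a plain-Python sketch of the same pattern (all names here, and the return value 42, are hypothetical stand-ins for the Rust types):

```python
class X:
    def some_method(self):
        return 42

class Y:
    def __init__(self):
        self.x = X()

class YWrapper:
    """Exposes y's capabilities without exposing y.x itself."""
    def __init__(self):
        self._y = Y()

    def some_method(self):
        # Behind the scenes, forward the request to the inner x.
        return self._y.x.some_method()

y = YWrapper()
print(y.some_method())  # 42 - the caller never touches y.x directly
```

The caller sees one flat object, which is exactly the Demeter-friendly interface described above.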
I'm trying to think of other clever ways. Depending on some of the details of how y.x is accessed and modified by the methods of y itself, it might be possible to add a field x: XWrapper to the YWrapper. Then you create the XWrapper (including a clone of y.x) once when YWrapper is created, and from then on you can return references to that XWrapper in your pub fn x. Of course that becomes much more cumbersome when x gets frequently changed and updated via the methods of y...
In a way, this demonstrates the clash between Python's ref-counted object model and Rust's ownership object model. Rust enforces that you can't arbitrarily mess with objects unless you're their owner.

It is indeed possible to share objects and return or mutate them, just as in Python. What Lagerbaer suggested works, and it's actually pretty ideal for small codebases. However, as the number of methods increases there will be a lot of repetition and boilerplate (and worse, it multiplies every time you increase the depth of your nesting).
I have no idea if this is something we are supposed to do. But from what I understood, the way to do it is using Py<T>. I wish I had a habit of reading the docs thoroughly before experimenting.
The MAIN PAGE of the docs (https://docs.rs/pyo3/latest/pyo3/#the-gil-independent-types) says:
When wrapped in Py<...>, like with Py<PyAny> or Py<SomePyClass>, Python objects no longer have a limited lifetime which makes them easier to store in structs and pass between functions. However, you cannot do much with them without a Python<'py> token, for which you'd need to reacquire the GIL.
A Py<T> is "a GIL-independent reference to an object allocated on the Python heap" (https://docs.rs/pyo3/latest/pyo3/prelude/struct.Py.html).
In other words, to return pyclass objects, we need to wrap them as Py<PyclassStructName>.
Your example is fairly involved and, to be honest, I don't fully understand what you are trying to do, but here is an alternative version which suits my own use case more closely. Since this is basically one of the only results that pops up in Google, I see it fit to paste it here even if it is not an exact response to the example provided above.
So here we go...
Suppose we have a Rust struct X and we cannot modify the lib as you mentioned. We need an XWrapper (let's call it PyX) pyclass to hold it.
So we define them here:
// in lib.rs
pub struct X {}

// in py_bindings.rs
#[pyclass]
struct PyX {
    value: X,
}
impl_new_for!(PyX); // helper macro from my own code that generates a `new`
Then, for the usage, all we have to do is initialize the object under the GIL (here, in the __new__ of the wrapper) and then define a getter for it. The important note here is that you call clone_ref on it and do NOT return the object itself.
This gives you a nested class system afterwards, and the nested object is immutable (though it has interior mutability), so it's a fantastic way to nest your code as well.
In the example below, I used my X as a PyX inside yet another wrapper called Api.
#[pyclass]
struct Api {
    x: Py<PyX>,
}

#[pymethods]
impl Api {
    #[new]
    fn __new__() -> PyResult<Self> {
        Python::with_gil(|py| {
            Ok(Self {
                x: Py::new(py, PyX::new())?,
            })
        })
    }

    #[getter(x)]
    fn x(&self, py: Python) -> Py<PyX> {
        self.x.clone_ref(py)
    }
}
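The reason for clone_ref rather than clone is copy-vs-alias semantics. A plain-Python analogue (a hypothetical class, not PyO3 itself) of the difference the getter relies on:

```python
import copy

class X:
    def __init__(self):
        self.counter = 0

x = X()

snapshot = copy.deepcopy(x)  # like returning a clone(): an independent copy
alias = x                    # like clone_ref(): another handle to the same object

x.counter += 1

print(snapshot.counter)  # 0 - the copy never sees the update
print(alias.counter)     # 1 - the shared reference does
```

Handing out shared references is what makes `api.x.mutate(); api.x.read()` behave the way a Python user expects.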

Related

How is every object related to PyObject when C does not have inheritance?

I have been going through the source code of Python. It looks like every object is derived from PyObject. But in C, there is no concept of object-oriented programming. So how exactly is this implemented without inheritance?
Your assertion that C has no concept of object-oriented programming is wrong. C doesn't explicitly have OOP, and it wasn't built with it in mind, but you can certainly do OOP things with C without too much effort. This comes from leveraging the fact that C doesn't actually really care what a struct's internal memory layout looks like. If you have two structs:
struct A {
    int field1;
    int field2;
    double field3;
};

struct B {
    struct A fieldA;
    int field4;
    float field5;
};
then that essentially lets B behave as a subclass of A. After all, the first part of B's memory layout is exactly the same as A's memory layout. If you pass it around as a void pointer, then you can typecast away and C doesn't really care:
void doSomething(void *obj) {
    int field2value = ((struct A*) obj)->field2;
    float field5value = ((struct B*) obj)->field5;
    printf("field2: %d\nfield5: %f\n", field2value, field5value);
}
You tell C what type you think the void pointer is supposed to be, and it makes that happen. And you get unexpected behavior if you guess wrong, or you get a segfault if the type you think it's supposed to be is larger than the type it actually is. You can use this to clumsily implement inheritance:
#include <stdlib.h>

void constructA(void *obj) {
    struct A *a = (struct A*) obj;
    a->field1 = 4;
    a->field2 = 2;
    a->field3 = 3.14;
}

void constructB(void *obj) {
    constructA(obj);
    struct B *b = (struct B*) obj;
    b->field4 = 7;
    b->field5 = 6.28;
}

int main() {
    struct B *myObj = malloc(sizeof(struct B));
    constructB(myObj);
    free(myObj);
    return 0;
}
If one of A's variables is a function pointer, then that's fine. That function pointer gets passed down alongside the rest. You can call it from anywhere, after all. You can supplant its functionality in a "subclass", and then still call the original version later on, if you don't actually override its spot in memory - or if you do, you could have your replacement manually call the function it was pointing to.
A lot of advanced C code uses a similar pattern for replicating the idea of inheritance (or, alternatively, just uses C++, which optimizes this whole arrangement and abstracts it away so that programmers can work with a more intuitive syntax).
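The layout trick above can even be demonstrated from Python with ctypes, which builds C-compatible structs: because B's first member is an A, a pointer to B can be reinterpreted as a pointer to A (a sketch mirroring the structs above, with the same made-up field values):

```python
import ctypes

class A(ctypes.Structure):
    _fields_ = [("field1", ctypes.c_int),
                ("field2", ctypes.c_int),
                ("field3", ctypes.c_double)]

class B(ctypes.Structure):
    # The embedded A comes first, so B's memory starts with A's layout.
    _fields_ = [("fieldA", A),
                ("field4", ctypes.c_int),
                ("field5", ctypes.c_float)]

b = B(A(4, 2, 3.14), 7, 6.28)

# "Cast" a B* to an A*, exactly like the void-pointer trick in C.
a_view = ctypes.cast(ctypes.pointer(b), ctypes.POINTER(A)).contents
print(a_view.field2)  # 2 - reading B's memory through A's layout
```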
But even then, that's missing the point. One of my favorite things about python is, at the deepest level, its consistency - everything is an object, and all objects are basically hashmaps with names pointing to a references. Python's idea of duck typing works not because the underlying C code has any idea of inheritance, but because the code just looks for an attribute with the right name, and if it finds one, it uses it.
Subclasses in python, then, are the same as above - a new python object that initializes itself from the top down, adding more and more fields to the hashmap as it gets closer to the bottom.
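That "objects are basically hashmaps" claim is easy to check: each __init__ in the chain just adds names to the same per-instance dict (a minimal illustration with made-up classes):

```python
class Base:
    def __init__(self):
        self.a = 1          # the parent adds its fields first...

class Child(Base):
    def __init__(self):
        super().__init__()
        self.b = 2          # ...then the subclass adds more

c = Child()
print(c.__dict__)  # {'a': 1, 'b': 2} - one flat mapping, built top-down
```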
What makes the object-oriented programming paradigm is the relation between "classes" as templates for a data set and the functions that operate on that data set, plus the inheritance mechanism, which is a relation from a class to its ancestor classes.
These relations, however, do not depend on a particular language syntax - just that they are present in some way.
So nothing stops one from doing "object orientation" in C, and in fact organized libraries, even without an OO framework, end up with an organization related to OO.
It happens that the Python object system is entirely defined in pure C, with objects having a __class__ slot that points to their class with a C pointer - only when viewed from Python is the full representation of the class presented. Classes in their turn have __mro__ and __bases__ slots that point to the different arrangements of superclasses (the pointers this time are to containers that will be seen from Python as sequences).
So, when coding in C using the definitions and API of the Python runtime, one can use OOP just the same way as when coding in Python - and in fact use Python objects that are interoperable with the Python language. (The Cython project will even transpile a superset of the Python language to C and provide transparent ways of writing native code with Python syntax.)
There are other frameworks available to C that provide different OOP systems that are equally conformant - for example, glib, which defines "gobject" and is the base for all GTK+ and GNOME applications.

Create a PyObject with attached functions and return to Python

I wonder how I can create a PyObject in C++ and then return it to Python.
Sadly the documentation is not very explicit about it.
There is no PyObject_Create so I wonder whether allocating sizeof(PyObject) via PyObject_Malloc and initializing the struct is sufficient.
For now I only need an object with functions attached.
Do you really want a (1) PyObject, as in what Python calls object, or (2) an object of some subtype? That you "need an object with functions attached" seems to indicate you want either methods or attributes. That needs (2) in any case. I'm no expert on the C API, but generally you'd define your own PyTypeObject, then create an instance of that via PyObject_New (refcount and type field are initialized, other fields you might add are not).

Create object from string representation in C API

I am working on a system which is embedding a Python interpreter, and I need to construct a PyObject* given a string from the C API.
I have a const char* representing a dictionary, in the proper format for eval() to work from within Python, i.e. "{'bar': 42, 'baz': 50}".
Currently, this is being passed into Python as a PyObject* using the Py_Unicode_ api (representing a string), so in my python interpreter, I can successfully write:
foo = eval(myObject.value)
print(foo['bar']) # prints 42
I would like to change this to automatically "eval" the const char* on the C side, and return a PyObject* representing a completed dictionary. How do I go about converting this string into a dictionary in the C API?
There are two basic ways to do this.
The first is to simply call eval the same way you do in Python. The only trick is that you need a handle to the builtins module, because you don't get that for free in the C API. There are a number of ways to do this, but one really easy way is to just import it:
/* or PyEval_GetBuiltins() if you know you're at the interpreter's top level */
PyObject *builtins = PyImport_ImportModule("builtins");
PyObject *eval = PyObject_GetAttrString(builtins, "eval");
PyObject *args = Py_BuildValue("(s)", expression_as_c_string);
PyObject *result = PyObject_CallObject(eval, args);
(This is untested code, and it at least leaks references, and doesn't check for NULL return if you want to handle exceptions on the C side… But it should be enough to get the idea across.)
One nice thing about this is that you can use ast.literal_eval in exactly the same way as eval (which means you get some free validation); just change "builtins" to "ast", and "eval" to "literal_eval". But the real win is that you're doing exactly what eval does in Python, which you already know is exactly what you wanted.
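For reference, here's what the eval / ast.literal_eval swap looks like on the Python side, using the same dictionary string as in the question:

```python
import ast

s = "{'bar': 42, 'baz': 50}"

d1 = eval(s)              # full evaluator: works, but can run arbitrary code
d2 = ast.literal_eval(s)  # only accepts literals: safer for untrusted input

print(d1['bar'], d2['bar'])  # 42 42

# literal_eval rejects anything that isn't a plain literal:
try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError:
    print("rejected")
```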
The alternative is to use the compilation APIs. At the really high level, you can just build a Python statement out of "foo = eval(%s)" and PyRun_SimpleString it. Below that, use Py_CompileString to parse and compile the expression (you can also parse and compile in separate steps, but that isn't useful here), then PyEval_EvalCode to evaluate it in the appropriate globals and locals. (If you're not tracking globals yourself, use the interpreter-reflection APIs PyEval_GetLocals and PyEval_GetGlobals.) Note that I'm giving the super-simplified version of each function; often you want to use one of the sibling functions. But you can find them easily in the docs.

How to pass an array from C to an embedded python script

I am running into some problems and would like some help. I have a piece of code which is used to embed a Python script. This Python script contains a function which expects to receive an array as an argument (in this case I am using a numpy array within the Python script).
I would like to know how can I pass an array from C to the embedded python script as an argument for the function within the script. More specifically can someone show me a simple example of this.
Really, the best answer here is probably to use numpy arrays exclusively, even from your C code. But if that's not possible, then you have the same problem as any code that shares data between C types and Python types.
In general, there are at least five options for sharing data between C and Python:
Create a Python list or other object to pass.
Define a new Python type (in your C code) to wrap and represent the array, with the same methods you'd define for a sequence object in Python (__getitem__, etc.).
Cast the pointer to the array to intptr_t, or to an explicit ctypes type, or just leave it un-cast; then use ctypes on the Python side to access it.
Cast the pointer to the array to const char * and pass it as a str (or, in Py3, bytes), and use struct or ctypes on the Python side to access it.
Create an object matching the buffer protocol, and again use struct or ctypes on the Python side.
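Option 4 can be sketched entirely in Python, with a bytes object standing in for the const char * buffer (the values here are made up):

```python
import struct

# Pretend this buffer arrived from C: four little-endian 32-bit ints.
buf = struct.pack("<4i", 10, 20, 30, 40)

values = struct.unpack("<4i", buf)
print(values)  # (10, 20, 30, 40)
```

On the real C side you'd pass the raw pointer and length as a bytes object; struct (or ctypes) then decodes it exactly as shown.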
In your case, you want to use numpy.arrays in Python. So, the general cases become:
Create a numpy.array to pass.
(probably not appropriate)
Pass the pointer to the array as-is, and from Python, use ctypes to get it into a type that numpy can convert into an array.
Cast the pointer to the array to const char * and pass it as a str (or, in Py3, bytes), which is already a type that numpy can convert into an array.
Create an object matching the buffer protocol, and which again I believe numpy can convert directly.
For 1, here's how to do it with a list, just because it's a very simple example (and I already wrote it…):
PyObject *makelist(int array[], size_t size) {
PyObject *l = PyList_New(size);
for (size_t i = 0; i != size; ++i) {
PyList_SET_ITEM(l, i, PyInt_FromLong(array[i]));
}
return l;
}
And here's the numpy.array equivalent (assuming you can rely on the C array not to be deleted—see Creating arrays in the docs for more details on your options here):
PyObject *makearray(int array[], size_t size) {
    npy_intp dim = size;
    return PyArray_SimpleNewFromData(1, &dim, NPY_INT, (void *)array);
}
At any rate, however you do this, you will end up with something that looks like a PyObject * from C (and has a single refcount), so you can pass it as a function argument, while on the Python side it will look like a numpy.array, list, bytes, or whatever else is appropriate.
Now, how do you actually pass function arguments? Well, the sample code in Pure Embedding that you referenced in your comment shows how to do this, but doesn't really explain what's going on. There's actually more explanation in the extending docs than the embedding docs, specifically, Calling Python Functions from C. Also, keep in mind that the standard library source code is chock full of examples of this (although some of them aren't as readable as they could be, either because of optimization, or just because they haven't been updated to take advantage of new simplified C API features).
Skip the first example about getting a Python function from Python, because presumably you already have that. The second example (and the paragraph right above it) shows the easy way to do it: creating an argument tuple with Py_BuildValue. So, let's say we want to call a function you've got stored in myfunc with the list mylist returned by that makelist function above. Here's what you do:
if (!PyCallable_Check(myfunc)) {
    PyErr_SetString(PyExc_TypeError, "function is not callable?!");
    return NULL;
}
PyObject *arglist = Py_BuildValue("(O)", mylist);
PyObject *result = PyObject_CallObject(myfunc, arglist);
Py_DECREF(arglist);
return result;
You can skip the callable check if you're sure you've got a valid callable object, of course. (And it's usually better to check when you first get myfunc, if appropriate, because you can give both earlier and better error feedback that way.)
If you want to actually understand what's going on, try it without Py_BuildValue. As the docs say, the second argument to PyObject_CallObject is a tuple, and PyObject_CallObject(callable_object, args) is equivalent to apply(callable_object, args), which is equivalent to callable_object(*args). So, if you wanted to call myfunc(mylist) in Python, you have to turn that into, effectively, myfunc(*(mylist,)) so you can translate it to C. You can construct a tuple like this:
PyObject *arglist = PyTuple_Pack(1, mylist);
But usually, Py_BuildValue is easier (especially if you haven't already packed everything up as Python objects), and the intention in your code is clearer (just as using PyArg_ParseTuple is simpler and clearer than using explicit tuple functions in the other direction).
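The equivalence between PyObject_CallObject(callable, args) and callable_object(*args) is easy to verify in Python itself (myfunc and mylist here are hypothetical stand-ins):

```python
def myfunc(lst):
    return sum(lst)

mylist = [1, 2, 3]

# PyObject_CallObject(myfunc, arglist) in C corresponds to myfunc(*arglist):
arglist = (mylist,)      # the one-element tuple built by PyTuple_Pack
print(myfunc(*arglist))  # 6 - same as calling myfunc(mylist) directly
```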
So, how do you get that myfunc? Well, if you've created the function from the embedding code, just keep the pointer around. If you want it passed in from the Python code, that's exactly what the first example does. If you want to, e.g., look it up by name from a module or other context, the APIs for concrete types like PyModule and abstract types like PyMapping are pretty simple, and it's generally obvious how to convert Python code into the equivalent C code, even if the result is mostly ugly boilerplate.
Putting it all together, let's say I've got a C array of integers, and I want to import mymodule and call a function mymodule.myfunc(mylist) that returns an int. Here's a stripped-down example (not actually tested, and no error handling, but it should show all the parts):
int callModuleFunc(int array[], size_t size) {
    PyObject *mymodule = PyImport_ImportModule("mymodule");
    PyObject *myfunc = PyObject_GetAttrString(mymodule, "myfunc");
    PyObject *mylist = PyList_New(size);
    for (size_t i = 0; i != size; ++i) {
        PyList_SET_ITEM(mylist, i, PyInt_FromLong(array[i]));
    }
    PyObject *arglist = Py_BuildValue("(O)", mylist);
    PyObject *result = PyObject_CallObject(myfunc, arglist);
    int retval = (int)PyInt_AsLong(result);
    Py_DECREF(result);
    Py_DECREF(arglist);
    Py_DECREF(mylist);
    Py_DECREF(myfunc);
    Py_DECREF(mymodule);
    return retval;
}
If you're using C++, you probably want to look into some kind of scope guard/janitor/etc. to handle all those Py_DECREF calls, especially once you start doing proper error handling (which usually means early return NULL calls peppered through the function). If you're using C++11 or Boost, a unique_ptr<PyObject, ...> with Py_DecRef as its deleter may be all you need.
But really, a better way to reduce all that ugly boilerplate, if you plan to do a lot of C<->Python communication, is to look at all of the familiar frameworks designed for improving extending Python—Cython, boost::python, etc. Even though you're embedding, you're effectively doing the same work as extending, so they can help in the same ways.
For that matter, some of them also have tools to help the embedding part, if you search around the docs. For example, you can write your main program in Cython, using both C code and Python code, and cython --embed. You may want to cross your fingers and/or sacrifice some chickens, but if it works, it's amazingly simple and productive. Boost isn't nearly as trivial to get started, but once you've got things together, almost everything is done in exactly the way you'd expect, and just works, and that's just as true for embedding as extending. And so on.
The Python function will need a Python object to be passed in. Since you want that Python object to be a NumPy array, you should use one of the NumPy C-API functions for creating arrays; PyArray_SimpleNewFromData() is probably a good start. It will use the buffer provided, without copying the data.
That said, it is almost always easier to write the main program in Python and use a C extension module for the C code. This approach makes it easier to let Python do the memory management, and the ctypes module together with Numpy's cpython extensions make it easy to pass a NumPy array to a C function.

Boost.Python function pointers as class constructor argument

I have a C++ class that requires a function pointer in its constructor (float (*myfunction)(vector<float>*)).
I've already exposed some function pointers to Python.
The ideal way to use this class is something like this:
import mymodule
mymodule.some_class(mymodule.some_function)
So I tell Boost about this class like so:
class_<SomeClass>("some_class", init<float (*)(vector<float>*)>());
But I get:
error: no matching function for call to 'register_shared_ptr1(Sample (*)(std::vector<double, std::allocator<double> >*))'
when I try to compile it.
So, does anyone have any ideas on how I can fix the error without losing the flexibility gained from function pointers (ie no falling back to strings that indicate which function to call)?
Also, the main point of writing this code in C++ is for speed. So it would be nice if I was still able to keep that benefit (the function pointer gets assigned to a member variable during initialization and will get called over a million times later on).
OK, so this is a fairly difficult question to answer in general. The root cause of your problem is that there really is no python type which is exactly equivalent to a C function pointer. Python functions are sort-of close, but their interface doesn't match for a few reasons.
Firstly, I want to mention the technique for wrapping a constructor from here:
http://wiki.python.org/moin/boost.python/HowTo#namedconstructors.2BAC8factories.28asPythoninitializers.29. This lets you write an __init__ function for your object that doesn't directly correspond to an actual C++ constructor. Note also, that you might have to specify boost::python::no_init in the boost::python::class_ construction, and then def a real __init__ function later, if your object isn't default-constructible.
Back to the question:
Is there only a small set of functions that you'll usually want to pass in? In that case, you could just declare a special enum (or specialized class), make an overload of your constructor that accepts the enum, and use that to look up the real function pointer. You can't directly call the functions yourself from python using this approach, but it's not that bad, and the performance will be the same as using real function pointers.
If you want to provide a general approach that will work for any python callable, things get more complex. You'll have to add a constructor to your C++ object that accepts a general functor, e.g. using boost::function or std::tr1::function. You could replace the existing constructor if you wanted, because function pointers will convert to this type correctly.
So, assuming you've added a boost::function constructor to SomeClass, you should add these functions to your python wrapping code:
struct WrapPythonCallable
{
    typedef float result_type;

    explicit WrapPythonCallable(const boost::python::object & wrapped)
        : wrapped_(wrapped)
    { }

    float operator()(vector<float>* arg) const
    {
        //Do whatever you need to do to convert into a
        //boost::python::object here
        boost::python::object arg_as_python_object = /* ... */;

        //Call out to python with the object - note that wrapped_
        //is callable using an operator() overload, and returns
        //a boost::python::object.
        //Also, the call can throw boost::python::error_already_set -
        //you might want to handle that here.
        boost::python::object result_object = wrapped_(arg_as_python_object);

        //Do whatever you need to do to extract a float from result_object,
        //maybe using boost::python::extract
        float result = /* ... */;
        return result;
    }

    boost::python::object wrapped_;
};
//This function is the "constructor wrapper" that you'll add to SomeClass.
//Change the return type to match the holder type for SomeClass, like if it's
//held using a shared_ptr.
std::auto_ptr<SomeClass> CreateSomeClassFromPython(
    const boost::python::object & callable)
{
    return std::auto_ptr<SomeClass>(
        new SomeClass(WrapPythonCallable(callable)));
}

//Later, when telling Boost.Python about SomeClass:
class_<SomeClass>("some_class", no_init)
    .def("__init__", make_constructor(&CreateSomeClassFromPython));
I've left out details on how to convert pointers to and from python - that's obviously something that you'll have to work out, because there are object lifetime issues there.
If you need to call the function pointers that you'll pass in to this function from Python, then you'll need to def these functions using Boost.Python at some point. This second approach will work fine with these def'd functions, but calling them will be slow, because objects will be unnecessarily converted to and from Python every time they're called.
To fix this, you can modify CreateSomeClassFromPython to recognize known or common function objects, and replace them with their real function pointers. You can compare python objects' identity in C++ using object1.ptr() == object2.ptr(), equivalent to id(object1) == id(object2) in python.
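The object1.ptr() == object2.ptr() check corresponds to Python's identity test, which is what lets you recognize a specific def'd function without comparing behavior (a small illustration with throwaway functions):

```python
def f(x):
    return x

g = f            # another name for the same function object
h = lambda x: x  # behaves the same, but is a different object

print(g is f, id(g) == id(f))  # True True
print(h is f)                  # False - equal behavior, different identity
```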
Finally, you can of course combine the general approach with the enum approach. Be aware when doing this, that boost::python's overloading rules are different from C++'s, and this can bite you when dealing with functions like CreateSomeClassFromPython. Boost.Python tests functions in the order that they are def'd to see if the runtime arguments can be converted to the C++ argument types. So, CreateSomeClassFromPython will prevent single-argument constructors def'd later than it from being used, because its argument matches any python object. Be sure to put it after other single-argument __init__ functions.
If you find yourself doing this sort of thing a lot, then you might want to look at the general boost::function wrapping technique (mentioned on the same page with the named constructor technique): http://wiki.python.org/moin/boost.python/HowTo?action=AttachFile&do=view&target=py_boost_function.hpp.
