I am trying to write a Python function in Crystal through the Python C API.
My code follows:
METH_VARARGS = 0x0001
#[Link("python3.5m")]
lib Python
alias PyObject = Void*
struct PyMethodDef
name : UInt8*
func : Void*
flags : LibC::Int
doc : UInt8*
end
fun Py_Initialize
fun Py_Finalize
fun PyObject_CallObject(func : PyObject, args : PyObject) : PyObject
fun PyCFunction_NewEx(method : PyMethodDef*, __self__ : PyObject, ) : PyObject
fun PyLong_AsLong(n : PyObject) : Int64
fun PyLong_FromLong(n : Int64) : PyObject
end
def new_method_def(name : String, function, flags : LibC::Int)
x = Pointer(Python::PyMethodDef).malloc(1)
x.value.name = name.to_unsafe
x.value.func = function
x.value.flags = flags
x.value.doc = nil
x
end
Python.Py_Initialize
a = ->(args : Void*) {
puts Python.PyLong_AsLong(args)
Pointer(Void).null
}
name = "num"
number = Python.PyLong_FromLong(1)
Python.Py_IncRef(number)
method = Python.PyCFunction_NewEx(new_method_def(name,a.pointer,METH_VARARGS),number)
Python.PyObject_CallObject(method,Pointer(Void).null)
Python.Py_Finalize
Everything works if I pass nil instead of number to PyCFunction_NewEx, but as the code stands, it throws an invalid memory access exception when Py_Finalize is called.
I can't understand what's causing it.
Can someone help me?
The root problem here is that you're calling a C function of three parameters with only two arguments.
Regrettably, PyCFunction_NewEx is missing from the documentation, despite being a public API function. But all of the examples using it pass three arguments. And if you go to the source:
PyObject *
PyCFunction_NewEx(PyMethodDef *ml, PyObject *self, PyObject *module)
That's 3.7, but this is the same in 3.5 and in 2.7, and in every other version since the function was added to the API in 2.3. The whole point of NewEx is to allow you to pass a module.
Presumably, the function is expecting that third argument either in a register or on the stack, and you haven't put anything there, so it's completely arbitrary what you're passing. Slightly different code will leave completely different values in those places, so it's not surprising that you get different results:
If the value happens to be 0, that's fine; you're allowed to pass NULL as the module value.
If the value happens to be something that points to unmapped memory, like, say, 1 (as in the raw C long/long long, not a PyLongObject), you should get a segfault from the attempt to incref the module.
If the value happens to be a pointer to some random thing in memory, the incref will work, but will corrupt that random thing. Which could do just about anything, but a mysterious segfault at some arbitrary later point is almost the least surprising thing it could do.
Meanwhile, from a comment:
I am calling PyCFunction_NewEx because PyCFunction_New is a macro in the source code.
If you're using Python 2.3-2.6 or 3.0-3.2, then sure. But in later versions, including the 3.5 you say you're using, CPython goes out of its way to define PyCFunction_New as a function specifically so that it will be present in the API (and even the stable API, for 3.x). See 3.5 for example:
/* undefine macro trampoline to PyCFunction_NewEx */
#undef PyCFunction_New

PyAPI_FUNC(PyObject *)
PyCFunction_New(PyMethodDef *ml, PyObject *self)
{
    return PyCFunction_NewEx(ml, self, NULL);
}
So, you really can just call PyCFunction_New.
I'm new to the Python C-API and browsing through some source code to pick parts of it up.
Here is a minimal version of a function that I found, in the C source of a package that contains extension modules:
#define PY_SSIZE_T_CLEAN
#include <Python.h>
static PyObject *
modulename_myfunc(PyObject *self, PyObject *args) {
    // Call PyArg_ParseTuple, etc ...

    // Dummy values; in the real function they are calculated
    int is_found = 1;
    Py_ssize_t n_bytes_found = 1024;

    PyObject *result;
    result = Py_BuildValue("(Oi)",
                           is_found ? Py_True : Py_False,  // Py_INCREF?
                           n_bytes_found);
    return result;
}
Does this introduce a small memory leak by failing to use Py_INCREF on either Py_True or Py_False? The C-API docs for Boolean object seem pretty explicit about always needing to incref/decref Py_True and Py_False.
If a Py_INCREF does need to be introduced, how can it most properly be used here, assuming that Py_RETURN_TRUE/Py_RETURN_FALSE aren't really applicable because a tuple is being returned?
The reason a Py_INCREF is not used here is because Py_BuildValue, when being passed an object with "O" will increment the reference count for you:
O (object) [PyObject *]
Pass a Python object untouched (except for its reference count, which is incremented by one). If the object passed in is a NULL pointer, it is assumed that this was caused because the call producing the argument found an error and set an exception. Therefore, Py_BuildValue() will return NULL but won’t raise an exception. If no exception has been raised yet, SystemError is set.
You'll see a similar usage here in CPython itself for example.
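The same ownership rule can be observed from pure Python, as a loose analogue rather than a call to Py_BuildValue itself: a container takes its own reference to whatever is placed in it and releases that reference when it is destroyed. The marker object below is a hypothetical stand-in for the value passed with "O". An ordinary object is used instead of True or False because recent CPython versions treat those singletons as immortal, so their reported counts do not move.

```python
import sys

marker = object()  # hypothetical stand-in for the object passed with "O"

before = sys.getrefcount(marker)
result = (marker, 1024)           # the tuple takes its own reference to marker
inside = sys.getrefcount(marker)
del result                        # destroying the tuple releases that reference
after = sys.getrefcount(marker)

print(before, inside, after)
```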
Our code base currently supports a single SWIG interface file (for Python) that has grown over the years to include roughly 300 C++ classes (technically interfaces), all of which inherit from a single base class, and all of which exist in a single global namespace. This allows us, with a minimal amount of SWIG code, to implement dynamic casting among the C++ classes that the SWIG classes represent while at the same time simplifying by keeping the C++ inheritance structure out of SWIG.
As long as we compiled our SWIG interface in a single module, this mechanism worked well -- but as the SWIG interface file has grown it has become difficult to manage, and compile/link times have grown. To address this I split the interface file up into separate modules by the names of the derived classes -- one module for class names beginning with "A" to "G", one for names beginning with "H" to "N", etc., resulting in four derived-class modules and a base class module. I was able to get these modules to compile and link, and exhibit expected behavior for the dynamic casting, following the method outlined here: (http://www.swig.org/Doc3.0/SWIGDocumentation.html#Modules_nn1)
However, breaking the single module into four parts (five parts counting the base class) causes problems with the namespace when containers come into play. Consider the following function, from a class in my v-to-z interface file:
void RemoveIsolated(const std::vector<global::IFoo*> spRemoveIsolated) {
…
}
That takes a vector of one of the derived classes that exist in the global namespace. This worked without issue when I had only one module but now class IFoo lives in the a-to-g module -- so if I cast something to an IFoo*, it's an a-to-g.IFoo*. However, the function demands a global::IFoo*.
This seems to be a situation that could be addressed by the SWIG template mechanism. I've seen discussions in which people have had success by means of at one point (possibly in the interface file for the base class??) declaring
%template(FooVector) std::vector<global::Foo*>;
And at another point (possibly in the interface file for the derived class??):
%template () std::vector<global::Foo*>;
But my attempts to implement this have not been successful. The discussions are somewhat ambiguous, so it's possible that I'm doing something wrong. Can anyone provide clarification, ideally with an example?
The piece of information it looks like you're missing is the %import directive, which lets modules cooperate with the definition of types, without repeating them and still ending up with a single wrapped type. The documentation suggests using this to reduce module size even.
Probably all you need to do is have your v-to-z module %import the a-to-g module to get this working for you. (Personally I'd have tried to divide them up by functionality rather than alphabetically though, so the dependency between them wouldn't be an issue.)
Thanks for your suggestion, Flexo. Importing the a-to-g module did not work; the C++ compiler complained that all of the classes (interfaces) declared there were not part of the global namespace when it tried to compile the v-to-z wrapper file. However, going through the exercise led me to question why we had been having success previously when we were compiling a single module. It turned out that we were using a typemapping macro in the interface file for the single module that would take a
const std::vector<global::IFoo*>
and map it thusly:
TYPEMAPMACRO(global::IFoo, SWIGTYPE_p_global__IFoo)
for vector containers. The macro itself, for anyone who's interested, is:
%define TYPEMAPMACRO(type, name)
%typemap(in) const std::vector<type*> {
/*Check if is a list */
std::vector<type*> vec;
void *pobj = 0;
if(PyTuple_Check($input))
{
size_t size = PyTuple_Size($input);
for (size_t j = 0; j < size; j++) {
PyObject *o = PyTuple_GetItem($input, j);
void *argp1 = 0 ;
int res1 = SWIG_ConvertPtr(o, &argp1, name, 0 | 0 );
if (!SWIG_IsOK(res1)) {
SWIG_exception_fail(SWIG_ArgError(res1), "in method '" "Typemap of std::vector" "', argument " "1"" of type '" """'");
}
vec.push_back(reinterpret_cast< type * >(argp1));
}
$1 = vec;
}
else if (SWIG_IsOK(SWIG_ConvertPtr($input, &pobj, name, 0 | 0 ))) {
PyObject *o = $input;
void *argp1 = 0 ;
int res1 = SWIG_ConvertPtr(o, &argp1, name, 0 | 0 );
if (!SWIG_IsOK(res1)) {
SWIG_exception_fail(SWIG_ArgError(res1), "in method '" "Typemap of std::vector" "', argument " "1"" of type '" """'");
}
vec.push_back(reinterpret_cast< type * >(argp1));
$1 = vec;
}
else {
PyErr_SetString(PyExc_TypeError, "not a list");
return NULL;
}
}
%typecheck(SWIG_TYPECHECK_POINTER) std::vector<type*> {
void *pobj = 0;
if(!PyTuple_Check($input) && !SWIG_IsOK(SWIG_ConvertPtr($input, &pobj, name, 0 | 0 ))) {
$1 = 0;
PyErr_Clear();
} else {
$1 = 1;
}
}
%enddef
My sense is that this is standard boilerplate; I don't claim to understand it well, as it's someone else's code. What I do understand now that I did not before is that the typemap macro must be placed before any function that uses the typemap (e.g. the "RemoveIsolated" example above). That ordering was broken when I divided my big module into smaller ones.
I have a python extension module written in C++, which contains multiple functions. One of these generates an instance of a custom structure, which I then want to use with other functions of my module in Python as follows
import MyModule
var = MyModule.genFunc()
MyModule.readFunc(var)
To do this, I've tried using PyCapsule objects to pass a pointer to these objects between Python and C, but this produces an error when attempting to read them in the second C function ("PyCapsule_GetPointer called with invalid PyCapsule object"). Python, however, if asked to print the PyCapsule object (var), correctly identifies it as a capsule object named "testcapsule". My C code appears as follows:
struct MyStruct {
    int value;
};

static PyObject* genFunc(PyObject* self, PyObject *args) {
    MyStruct var;
    PyObject *capsuleTest;

    var.value = 1;
    capsuleTest = PyCapsule_New(&var, "testcapsule", NULL);
    return capsuleTest;
}

static PyObject* readFunc(PyObject* self, PyObject *args) {
    PyCapsule_GetPointer(args, "testcapsule");
    return 0;
}
Thank you for your help.
As stated in a comment on your question, you'll run into trouble reading data through a pointer to the local variable MyStruct var, because that variable no longer exists once genFunc returns. Allocate the struct on the heap instead and free it in a destructor, which is the third parameter of PyCapsule_New.
But that's not the cause of the error you're seeing now. You're calling PyCapsule_GetPointer(args, "testcapsule") on the args parameter. args is not a capsule, even though var is one: you have presumably registered the function as METH_VARARGS, in which case args is the tuple of arguments. You need to either unpack the tuple first (for example with PyArg_ParseTuple) or register the function as METH_O.
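The difference between the tuple and the capsule inside it can be reproduced without compiling anything, as a sketch that uses ctypes in place of the C extension. It assumes a standard CPython where ctypes.pythonapi exposes the capsule functions; gen_func and read_func only mirror the names in the question, and payload is a hypothetical module-level value that outlives the capsule, which sidesteps the dangling-pointer issue.

```python
import ctypes

api = ctypes.pythonapi
api.PyCapsule_New.restype = ctypes.py_object
api.PyCapsule_New.argtypes = [ctypes.c_void_p, ctypes.c_char_p, ctypes.c_void_p]
api.PyCapsule_GetPointer.restype = ctypes.c_void_p
api.PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]

NAME = b"testcapsule"      # must outlive every capsule that uses it
payload = ctypes.c_int(1)  # module-level, so the pointer stays valid


def gen_func():
    # Wrap the address of payload; no destructor is registered (NULL).
    return api.PyCapsule_New(ctypes.addressof(payload), NAME, None)


def read_func(*args):
    # Like a METH_VARARGS function, this receives a tuple of arguments,
    # so the capsule has to be taken out of the tuple first.
    capsule = args[0]
    pointer = api.PyCapsule_GetPointer(capsule, NAME)
    return ctypes.cast(pointer, ctypes.POINTER(ctypes.c_int)).contents.value


var = gen_func()
print(read_func(var))

try:
    # Passing the whole argument tuple reproduces the error from the question.
    api.PyCapsule_GetPointer((var,), NAME)
except ValueError as error:
    print(error)
```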
Sorry if this is too vague. I was recently reading about python's list.sort() method and read that it was written in C for performance reasons.
I'm assuming that the python code just passes a list to the C code and the C code passes a list back, but how does the python code know where to pass it or that C gave it the correct data type, and how does the C code know what data type it was given?
Python can be extended in C/C++ (more info here)
It basically means that you can wrap a C module like this
#include "Python.h"
// Static function returning a PyObject pointer
static PyObject *
keywdarg_parrot(PyObject *self, PyObject *args, PyObject *keywds)
// takes self, args and kwargs.
{
int voltage;
// No such thing as strings here. Its a tough life.
char *state = "a stiff";
char *action = "voom";
char *type = "Norwegian Blue";
// Possible keywords
static char *kwlist[] = {"voltage", "state", "action", "type", NULL};
// unpack arguments
if (!PyArg_ParseTupleAndKeywords(args, keywds, "i|sss", kwlist,
&voltage, &state, &action, &type))
return NULL;
// print to stdout
printf("-- This parrot wouldn't %s if you put %i Volts through it.\n",
action, voltage);
printf("-- Lovely plumage, the %s -- It's %s!\n", type, state);
// Reference count some None.
Py_INCREF(Py_None);
// return some none.
return Py_None;
}
// Static PyMethodDef
static PyMethodDef keywdarg_methods[] = {
/* The cast of the function is necessary since PyCFunction values
* only take two PyObject* parameters, and keywdarg_parrot() takes
* three.
*/
// Declare the parrot function, say what it takes and give it a doc string.
{"parrot", (PyCFunction)keywdarg_parrot, METH_VARARGS | METH_KEYWORDS,
"Print a lovely skit to standard output."},
{NULL, NULL, 0, NULL} /* sentinel */
};
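Including the Python header files gives the C/C++ code the types and macros it needs to define these entry points, and the method table tells the interpreter which C function to call for each Python-level name and which calling style that function uses.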
I can't speak to Python/C interaction directly, but I can give some background to how these sorts of things work in general.
On a particular platform or implementation, there is a calling convention that specifies how parameters are passed to subroutines and how values are returned to the caller. Compilers and interpreters that target that platform or implementation generate code to conform to that convention, so that subroutines/modules/whatever written in different languages can communicate with each other.
In my assembly class, we had an assignment where we had to write a program using VAX assembler, C, and Pascal (this was in the mid-1980s). The driver was in one of C or Pascal (can't remember which anymore), which called the assembly routine, which called the other routine (which was written in whichever language the driver wasn't). Our assembly code had to pop and push parameters from the stack based on the VMS calling convention.
Each computing platform has (or should have) an application binary interface (ABI). This is a specification of how parameters are passed between routines, how values are returned, what state the machine should be in and so on.
The ABI will specify things such as (for example):
The first integer argument (up to some number of bits, say 32) will be passed in a certain register (such as %EAX or R3). The second will be passed in another specific register, and so on.
After the list of registers is used up, additional integer arguments will be passed on the stack, starting at a certain offset from the value of the stack pointer when the call is made.
Pointer arguments will be treated the same as integer arguments.
Floating-point arguments will be passed in floating-point registers F1, F2, and so on, until those registers are used up, and then on the stack.
Compound arguments (such as structures) will be passed as integer arguments if they are very small (e.g., four char objects in one structure) or on the stack if they are large.
Each compiler or other language implementation will generate code that conforms to the ABI, at least where its routines call or are called from other routines that might be outside the language.
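From the caller's side, conforming to the ABI mostly means declaring the callee's exact signature so that arguments and return values are placed where the callee expects them. A minimal sketch with ctypes, assuming a standard CPython whose C API is reachable through ctypes.pythonapi (the variable names are made up for the example):

```python
import ctypes

api = ctypes.pythonapi

# PyObject *PyLong_FromLong(long v)
api.PyLong_FromLong.argtypes = [ctypes.c_long]
api.PyLong_FromLong.restype = ctypes.py_object

# long PyLong_AsLong(PyObject *obj)
api.PyLong_AsLong.argtypes = [ctypes.py_object]
api.PyLong_AsLong.restype = ctypes.c_long

boxed = api.PyLong_FromLong(41)         # C long in, Python int out
unboxed = api.PyLong_AsLong(boxed + 1)  # Python int in, C long out

print(boxed, unboxed)
```

With the parameter and return types declared, ctypes can lay out each call according to the platform's convention. A declaration that omits a parameter leaves the callee reading whatever happens to be in that register or stack slot.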
We're working on some Python/C-API code, and we've encountered a method that would like to be passed a callback. The method will ship periodic updates to the callback as a form of feedback. As it turns out, we're not that interested in periodic feedback. The only way to disable the method's default feedback mechanism is to pass it some kind of callback.
The technique we've employed is to declare a module level function that just returns None, ie:
static PyObject*
donothing(PyObject* self, PyObject* args) {
    Py_INCREF(Py_None);  /* the caller receives a new reference */
    return Py_None;
}
But of course, this function also needs to be registered with the module's method table, ie:
static PyMethodDef methods[] = {
{"donothing", donothing, METH_VARARGS, "do nothing"},
...
{NULL}
};
Then, when we go to call the method, we need to grab a reference to this method, ie: PyObject_GetAttrString(module_reference, "donothing").
All of this feels like we're spending too much time spinning our wheels just to do nothing. Then it occurred to me: hey, this seems like a perfect use for lambda x: None. But after spending an hour with the Python/C-API docs, I can't figure out how one creates lambdas.
I see there are references to closures on the page http://docs.python.org/2/c-api/function.html, but I can't sort out the details on how one creates them.
Any pointers (or references to RTFM) would be greatly appreciated.
A lambda expression is used to create a simple anonymous function. Such a function is a PyFunction_Type object wrapping an object of PyCode_Type, which is a chunk of executable code. But you're already on the C side, so creating a Python function would be overkill. Instead you should create an object of PyCFunction_Type. This is similar to what you've tried to do with the module methods.
The boilerplate in C is small, only a few lines:
static PyObject *
donothing(PyObject *self, PyObject *args) {
    Py_RETURN_NONE;
}

static PyMethodDef donothing_ml = {"donothing", donothing, METH_VARARGS, "doc"};
The object then is created with PyCFunction_New(&donothing_ml, NULL) which yields a <built-in function donothing>. This function is independent of your module and can be used like any other PyObject.
It's not exactly a high level lambda, but rather a low level implementation of lambda *args: None.
However if you'd really like to create a high level lambda you can do this with a single statement like dastrobu proposed
l = PyRun_String("lambda *args: None", Py_eval_input, PyEval_GetGlobals(), NULL);
or if you'd like to assemble it yourself you could do
PyCodeObject *c = (PyCodeObject *) Py_CompileString("None", "fn", Py_eval_input);
#if PY_MAJOR_VERSION >= 3
c->co_name = PyUnicode_FromString("<c-lambda>"); // function name
#else
c->co_name = PyString_FromString("<c-lambda>"); // function name
#endif
c->co_flags |= CO_VARARGS; // accept *args
c->co_nlocals = 1; // needed in Python 3
l = PyFunction_New((PyObject *) c, PyEval_GetGlobals());
In both cases you'll get a function whose disassembled code, dis(l), is equivalent to that of a lambda:
  1           0 LOAD_CONST               0 (None)
              3 RETURN_VALUE
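For reference, the Python-level object that the snippets above emulate can be inspected directly. This is only a sketch of the target behaviour; do_nothing is a made-up name, and the exact opcodes that dis reports differ between CPython versions, so the listing above should be read as version-specific.

```python
import dis
import inspect

do_nothing = lambda *args: None

# CO_VARARGS is the flag the C snippet sets by hand on the code object.
accepts_varargs = bool(do_nothing.__code__.co_flags & inspect.CO_VARARGS)

# Opcode names vary by CPython version, so they are collected, not hard-coded.
opnames = [instruction.opname for instruction in dis.get_instructions(do_nothing)]

print(do_nothing(1, 2, 3), accepts_varargs, opnames)
```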