Add a signature, with annotations, to extension methods - python

When embedding Python in my application and writing an extension type, I can add a signature to a method by using a properly crafted docstring (the ml_doc field of its PyMethodDef entry).
static PyMethodDef Answer_methods[] = {
    { "ultimate", (PyCFunction)Answer_ultimate, METH_VARARGS,
      "ultimate(self, question='Life, the universe, everything!')\n"
      "--\n"
      "\n"
      "Return the ultimate answer to the given question." },
    { NULL }
};
When help(Answer) is executed, the following is returned (abbreviated):
class Answer(builtins.object)
|
| ultimate(self, question='Life, the universe, everything!')
| Return the ultimate answer to the given question.
This is good, but I'm using Python 3.6, which has support for annotations. I'd like to annotate question as a string, and the function as returning an int. I've tried:
static PyMethodDef Answer_methods[] = {
    { "ultimate", (PyCFunction)Answer_is_ultimate, METH_VARARGS,
      "ultimate(self, question:str='Life, the universe, everything!') -> int\n"
      "--\n"
      "\n"
      "Return the ultimate answer to the given question." },
    { NULL }
};
but this reverts to the (...) notation, and the documentation becomes:
| ultimate(...)
| ultimate(self, question:str='Life, the universe, everything!') -> int
| --
|
| Return the ultimate answer to the given question.
and asking for inspect.signature(Answer.ultimate) results in an exception.
Traceback (most recent call last):
File "<string>", line 11, in <module>
File "inspect.py", line 3037, in signature
File "inspect.py", line 2787, in from_callable
File "inspect.py", line 2266, in _signature_from_callable
File "inspect.py", line 2090, in _signature_from_builtin
ValueError: no signature found for builtin <built-in method ultimate of example.Answer object at 0x000002179F3A11B0>
I've tried to add the annotations after the fact with Python code:
example.Answer.ultimate.__annotations__ = {'return': bool}
But the builtin method descriptors can't have annotations added this way.
Traceback (most recent call last):
File "<string>", line 2, in <module>
AttributeError: 'method_descriptor' object has no attribute '__annotations__'
Is there a way to add annotations to extension methods, using the C-API?
Argument Clinic looked promising and may still be very useful, but as of 3.6.5, it doesn't support annotations.
annotation
The annotation value for this parameter. Not currently supported, because PEP 8 mandates that the Python library may not use annotations.

TL;DR There is currently no way to do this.
How do signatures and C extensions work together?
In theory it works like this (for Python C extension objects):
If the C function has the "correct docstring" the signature is stored in the __text_signature__ attribute.
If you call help or inspect.signature on such an object it parses the __text_signature__ and tries to construct a signature from that.
If you use the argument clinic you don't need to write the "correct docstring" yourself. The signature line is generated based on comments in the code. However the 2 steps mentioned before still happen. They just happen to the automatically generated signature line.
That's why built-in Python functions like sum have a __text_signature__:
>>> sum.__text_signature__
'($module, iterable, start=0, /)'
The signature in this case is generated through the argument clinic based on the comments around the sum implementation.
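The two steps described above can be observed directly. This is a minimal sketch, assuming CPython, where sum is implemented through Argument Clinic. The exact signature text differs between Python versions, so no output is shown.

```python
import inspect

# Step 1: the signature line that CPython extracted from the C docstring.
raw = sum.__text_signature__
print(raw)

# Step 2: inspect.signature parses that string into a Signature object.
sig = inspect.signature(sum)
print(sig)
print(list(sig.parameters))
```

If the docstring does not follow the expected format, the first step yields None and the second raises ValueError, which matches the behaviour reported in the question.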
What are the problems with annotations?
There are several problems with annotations:
Return annotations break the contract of a "correct docstring". So the __text_signature__ will be empty when you add a return annotation. That's a major problem because a workaround would necessarily involve re-writing the part of the CPython C code that is responsible for the docstring -> __text_signature__ translation! That's not only complicated but you would also have to provide the changed CPython version so that it works for the people using your functions.
Just as example, if you use this "signature":
ultimate(self, question:str='Life, the universe, everything!') -> int
You get:
>>> ultimate.__text_signature__ is None
True
But if you remove the return annotation:
ultimate(self, question:str='Life, the universe, everything!')
It gives you a __text_signature__:
>>> ultimate.__text_signature__
"(self, question:str='Life, the universe, everything!')"
If you don't have the return annotation it still won't work because annotations are explicitly not supported (currently).
Assuming you have this signature:
ultimate(self, question:str='Life, the universe, everything!')
It doesn't work with inspect.signature (the exception message actually says it all):
>>> import inspect
>>> inspect.signature(ultimate)
Traceback (most recent call last):
...
raise ValueError("Annotations are not currently supported")
ValueError: Annotations are not currently supported
The function responsible for parsing __text_signature__ is inspect._signature_fromstr. In theory you might be able to make parameter annotations work by monkey-patching it (return annotations still wouldn't work!). But maybe not: several other places make assumptions about __text_signature__ that may not hold with annotations.
Would PyFunction_SetAnnotations work?
In the comments this C API function was mentioned. However that deliberately doesn't work with C extension functions. If you try to call it on a C extension function it will raise a SystemError: bad argument to internal function call. I tested this with a small Cython Jupyter "script":
%load_ext cython

%%cython
cdef extern from "Python.h":
    bint PyFunction_SetAnnotations(object func, dict annotations) except -1

cpdef call_PyFunction_SetAnnotations(object func, dict annotations):
    PyFunction_SetAnnotations(func, annotations)
>>> call_PyFunction_SetAnnotations(sum, {})
---------------------------------------------------------------------------
SystemError Traceback (most recent call last)
<ipython-input-4-120260516322> in <module>()
----> 1 call_PyFunction_SetAnnotations(sum, {})
SystemError: ..\Objects\funcobject.c:211: bad argument to internal function
So that also doesn't work with C extension functions.
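The same limitation is visible from pure Python, without Cython. The sketch below assumes CPython and contrasts a Python-level function, which accepts new annotations, with the builtin sum, which rejects them. The name py_func is hypothetical and used only for illustration.

```python
def py_func(x):
    return x

# Python functions carry a writable __annotations__ dict.
py_func.__annotations__ = {"x": int, "return": int}

# Builtins have no instance __dict__, so the assignment is rejected.
try:
    sum.__annotations__ = {}
except AttributeError as exc:
    builtin_error = exc
else:
    builtin_error = None

print(py_func.__annotations__)
print(type(builtin_error).__name__)
```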
Summary
So return annotations are completely out of the question currently (at least without distributing your own CPython with the program). Parameter annotations could work if you monkey-patch a private function in the inspect module. It's a Python module, so it could be feasible, but I haven't made a proof of concept, so treat this as possibly feasible but probably very complicated and almost certainly not worth the trouble.
However, you can always just wrap the C extension function with a Python function (just a very thin wrapper). This Python wrapper can have function annotations. It's more maintenance and a tiny bit slower, but it saves you all the hassle with signatures and C extensions. I'm not exactly sure, but if you use Cython to wrap your C or C++ code it might even have some automated tooling (writing the Python wrappers automatically).
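A minimal sketch of such a thin wrapper follows. The extension module from the question is not available here, so the builtin len stands in for the C-implemented function; _c_impl is a hypothetical name, and in real code it would refer to something like example.Answer.ultimate.

```python
import inspect

# Stand-in for the C-implemented callable (an assumption for illustration).
_c_impl = len


def ultimate(question: str = 'Life, the universe, everything!') -> int:
    """Return the ultimate answer to the given question."""
    # Delegate all real work to the C function; the wrapper only adds metadata.
    return _c_impl(question)


sig = inspect.signature(ultimate)
print(sig)
print(ultimate.__annotations__)
```

Because the wrapper is an ordinary Python function, help() and inspect.signature() see both the parameter and the return annotation.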

Related

Subtype Polymorphism is broken in Cython v3.0.0a11?

Trying to pass an instance of a derived class to a function which accepts instances of the superclass gives an error in Cython v3.0.0a11:
test.pyx:
class MyInt(int):
    pass

def takes_int(a: int):
    pass
try.py:
from test import takes_int, MyInt
takes_int(MyInt(1))
try.py OUTPUT:
Traceback (most recent call last):
File "C:\Users\LENOVO PC\PycharmProjects\MyProject\cython_src\try.py", line 3, in <module>
takes_int(MyInt(1))
TypeError: Argument 'a' has incorrect type (expected int, got MyInt)
Changing to v0.29.32, cleaning the generated C file and the object files, and re-running, gets rid of the error.
This is (kind of) expected.
Cython has never allowed subtype polymorphism for builtin type arguments. See https://cython.readthedocs.io/en/latest/src/userguide/language_basics.html#types:
This requires an exact match of the class, it does not allow subclasses. This allows Cython to optimize code by accessing internals of the builtin class, which is the main reason for declaring builtin types in the first place.
This is a restriction which applies only to builtin types - for Cython-defined cdef classes it works fine. It's also slightly different from the usual rule for annotations, but it's there because it's the only way that Cython can do much with these annotations.
What's changed is that an int annotation is interpreted as "any object" in Cython 0.29.x and a Python int in Cython 3. (Note that cdef int declares a C int though.) The reason for not using an int annotation in earlier versions of Cython is that Python 2 has two Python integer types, and it isn't easy to accept both of those and usefully use the types.
I'm not quite sure exactly what the final version of Cython 3 will end up doing with int annotations though.
If you don't want Cython to use the annotation (for example, you would like your int subclass to be accepted) then you can turn off annotation_typing locally with the @cython.annotation_typing(False) decorator.
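The difference between the two behaviours can be illustrated in plain Python. This is only an analogy, not the code Cython generates: takes_exact_int mimics the exact-match rule quoted above, and takes_any_int mimics the usual subclass-friendly reading of an annotation. Both function names are hypothetical.

```python
class MyInt(int):
    pass


def takes_exact_int(a):
    # Exact match: subclasses are rejected, as in the quoted documentation.
    if type(a) is not int:
        raise TypeError(
            f"Argument 'a' has incorrect type (expected int, got {type(a).__name__})"
        )
    return a


def takes_any_int(a):
    # Usual annotation semantics: subclasses of int are accepted.
    if not isinstance(a, int):
        raise TypeError(f"expected int, got {type(a).__name__}")
    return a


takes_any_int(MyInt(1))        # accepted
try:
    takes_exact_int(MyInt(1))  # rejected, like the error in the question
except TypeError as exc:
    print(exc)
```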

Python: type-hinting CTypes "pointer to X" types in a way that MyPy accepts

I have a large Python binding over a C library, with complex memory management. To help with that, I have drawn up the following aliases for strings (here is a minimal reproducible example of my definitions and the subsequent problem).
from typing import Type, TypeAlias
from ctypes import POINTER, pointer, c_char, c_char_p
#normal python string
p_useable_p_str = str
#the result of a my_str.encode("utf-8")
c_useable_p_str = bytes
#used for string literals returned by the C, which should not be freed
p_useable_c_str = c_char_p
#used for allocated strings returned by the C, which need to be freed later
c_useable_c_str = POINTER(c_char) #the problematic line
def example(hello: c_useable_c_str): # source of the MyPy error
    pass
My code runs fine (memory freed properly, consistent inheritance of the above if necessary, etc.) when following the above definitions and usage conventions. POINTER(c_char) has the intended behavior in the rest of the code.
However, analyzing the above with MyPy, I get:
playground.py:10: error: Variable "playground.c_useable_c_str" is not valid as a type
playground.py:10: note: See https://mypy.readthedocs.io/en/latest/common_issues.html#variables-vs-type-aliases
Found 1 error in 1 file (checked 1 source file)
I get this error for anything that uses the c_useable_c_str alias. I, of course, read the section in the documentation linked above, and tried using Type and TypeAlias in a bunch of different ways - to no avail.
The only syntax that seems to make MyPy happy is
c_useable_c_str = pointer[c_char]
However, when actually running the code with this definition of the type alias, I get the following error (not seen by MyPy, so I suspect a bug on MyPy's end, or a lack of typing in the standard):
Traceback (most recent call last):
File "/home/fulguritude/ProfessionalWork/LEDR/Orchestra-AvesTerra/Python_binding/playground2.py", line 92, in <module>
c_useable_c_str = pointer[c_char]
TypeError: 'builtin_function_or_method' object is not subscriptable
Any ideas as to how I'm supposed to make things consistent ?
TLDR: what's the valid way to type hint, and alias, a "pointer to X" with MyPy and CTypes ?
I think you can use something like:
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # valid for mypy
    c_useable_c_str = pointer[c_char]  # the problematic line
else:  # valid at run time
    c_useable_c_str = pointer
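A runnable variation on this idea is sketched below. It swaps in POINTER(c_char) for the run-time branch so that the alias is the real ctypes pointer type, which is a variation and not the answerer's exact code. Whether mypy accepts the pointer[c_char] spelling depends on the typeshed stubs bundled with your mypy version; that part is an assumption carried over from the question. The helper first_byte is hypothetical.

```python
from ctypes import POINTER, c_char, cast, create_string_buffer
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by the type checker; never executed.
    from ctypes import pointer
    c_useable_c_str = pointer[c_char]
else:
    # Seen only at run time: the real ctypes pointer type.
    c_useable_c_str = POINTER(c_char)


def first_byte(hello: c_useable_c_str) -> bytes:
    # Indexing a POINTER(c_char) instance yields a one-byte bytes object.
    return hello[0]


buf = create_string_buffer(b"hi")
ptr = cast(buf, c_useable_c_str)
print(first_byte(ptr))
```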
I found an imperfect (but good enough) solution, based on what I found here: https://github.com/python/mypy/issues/7540 (comment by thijsmie, commented on May 21, 2021).
if not TYPE_CHECKING:
    # Monkeypatch typed pointer from typeshed into ctypes
    # NB: files that wish to import ctypes.pointer should all import `pointer` from this file instead
    class pointer_fix:
        @classmethod
        def __class_getitem__(cls, item):
            return POINTER(item)

    pointer = pointer_fix
It does solve MyPy's complaints concerning c_useable_c_str, and functions that use c_useable_c_str still get proper typing errors when the argument provided isn't of a compatible type, so that's good.
Well, EXCEPT for the following case:
class t_string(c_useable_c_str):
    pass

class t_string_p(pointer[t_string]):
    pass
where MyPy does not seem to be able to understand that t_string_p is compatible with the type pointer[pointer[c_char]]. For these cases, I used a type union; not ideal, but it got the job done.
I'll be leaving this question unanswered for now, in case someone, someday, can find a better solution.

How to clone a function in Cython? I'm getting SystemError: unknown opcode

In CPython this code would work:
import inspect
from types import FunctionType


def f(a, b):  # line 5
    print(a, b)


f_clone = FunctionType(
    f.__code__,
    f.__globals__,
    closure=f.__closure__,
    name=f.__name__
)
f_clone.__annotations__ = {'a': int, 'b': int}
f_clone.__defaults__ = (1, 2)

print(inspect.signature(f_clone))  # (a: int = 1, b: int = 2)
print(inspect.signature(f))  # (a, b)
f_clone()  # 1 2
f(1, 2)  # 1 2
try:
    f()
except TypeError as e:
    print(e)  # f() missing 2 required positional arguments: 'a' and 'b'
However in cython when calling f_clone, I get:
XXX lineno: 5, opcode: 0
Traceback (most recent call last):
...
File "test.py", line 5, in f # line of f definition
SystemError: unknown opcode
I need this to create a copy of a class's __init__ method on each class creation and modify its signature, while keeping the original __init__ signature untouched.
Edit:
Changes made to signature of copied object must not affect runtime calls and needed only for inspection purposes.
I am relatively convinced this is never going to work well. If I were you I'd modify your code to fail elegantly for unclonable functions (maybe by just using the original __init__ and not replacing it, since this seems to be a purely cosmetic approach to generate prettier docstrings). After that you could submit an issue to the Cython issue tracker - however the maintainers of Cython know that full-introspection compatibility with Python is very challenging, so may not be hugely interested.
One of the main reasons I think you should just handle the error rather than find a workaround is that Cython is not the only method to accelerate Python. For example Numba can generate classes containing JIT accelerated code, or people can write their own functions in C (either as a C-API function, or perhaps wrapped with Ctypes or CFFI). These are all situations where your rather fragile introspection approach is likely to break. Handling the error fixes it for all of these; while you're likely to need an individual workaround for each one, plus all the methods I haven't thought of, plus any that are developed in the future.
Some details about Cython functions: at the moment Cython has a compilation option called binding that can generate functions in two different modes:
With binding=False functions have the type builtin_function_or_method, which has minimum introspection capacities, and so no __code__, __globals__, __closure__ (or most other) attributes.
With binding=True functions have the type cython_function_or_method. This has improved introspection capacity, so it does provide most of the expected attributes. However, some of them are nonsense defaults - specifically __code__. The __code__ attribute is expected to contain Python bytecode, but Cython doesn't use Python bytecode (since it's compiled to C). Therefore it just provides a dummy attribute.
It looks like Cython defaults to binding=True when compiling a .py file and when compiling a regular (non-cdef) class, giving the behaviour you report. However, when compiling a .pyx file it currently defaults to binding=False. It's possible you may also want to handle the binding=False case in some circumstances too.
Having established that trying to create a regular Python function object with the __code__ attribute of a cython_function_or_method isn't going to work, let's look at a few other options:
>>> print(f)
<cyfunction f at 0x7f08a1c63550>
>>> type(f)()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: cannot create 'cython_function_or_method' instances
So you can't create your own cython_function_or_method and populate it from Python - the type does not have a user callable constructor.
copy.copy appears to work, but doesn't actually create a new instance:
>>> import copy
>>> copy.copy(f)
<cyfunction f at 0x7f08a1c63550>
Note, however, that this has exactly the same address - it isn't a copy:
>>> copy.copy(f) is f
True
At which point I'm out of ideas.
What I don't quite get is why you don't use functools.wraps?
import functools

@functools.wraps(f)
def wrapper(*args, **kwargs):
    return f(*args, **kwargs)
This updates wrapper with most of the relevant introspection attributes from f, works for both types of Cython function (to an extent - the binding=False case doesn't provide much useful information), and should work for most other types of function too.
It's possible I'm missing something, but it seems a whole lot less fragile than your scheme of copying code objects.
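A sketch of how the wrapper approach could also cover the "signature for inspection only" requirement: inspect.signature honours a __signature__ attribute on the wrapper, so the displayed signature can be changed without touching the original function or its runtime behaviour. The helper clone_with_signature is hypothetical, and it assumes inspect.signature(func) succeeds, which may not hold for Cython functions compiled with binding=False.

```python
import functools
import inspect


def f(a, b):
    return (a, b)


def clone_with_signature(func, annotations, defaults):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    # Build a modified Signature used only by introspection tools.
    sig = inspect.signature(func)
    params = [
        p.replace(
            annotation=annotations.get(name, p.annotation),
            default=defaults.get(name, p.default),
        )
        for name, p in sig.parameters.items()
    ]
    wrapper.__signature__ = sig.replace(parameters=params)
    return wrapper


f_clone = clone_with_signature(f, {'a': int, 'b': int}, {'a': 1, 'b': 2})
print(inspect.signature(f_clone))  # (a: int = 1, b: int = 2)
print(inspect.signature(f))        # (a, b)
```

The defaults here are cosmetic: calling f_clone() without arguments still raises TypeError, because the call is forwarded unchanged to f.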

python optional argument to optional argument validation

In python, how to validate optional-to-optional keyword arguments?
This question is an extension of an earlier question of mine about this optional-to-optional-arguments thing (does it have a better name, by the way?).
We know we can define optional arguments in this style:
os.fdopen(fd[, mode[, bufsize]])
so that if I mistakenly call fdopen by fdopen(sth, bufsize=16), python will point out I must specify mode to use bufsize argument.
How is this implemented? I can obviously write a long chain of if-else statements to make this work, but that would result in some really messed-up code for really complicated functions, for example:
cv2.dilate(src, kernel[, dst[, anchor[, iterations[, borderType[, borderValue]]]]]) → dst
There's no specific syntax for this at Python level. You have to define ordinary optional arguments and do the validation yourself.
The specific case you're looking at is implemented in C. Depending on which platform you're on, the C implementation is different. Here's the version for POSIX, Windows, and OS/2:
static PyObject *
posix_fdopen(PyObject *self, PyObject *args)
{
    ...
    if (!PyArg_ParseTuple(args, "i|si", &fd, &orgmode, &bufsize))
        return NULL;
The use of PyArg_ParseTuple means that this function doesn't actually accept any arguments by name. If you do os.fdopen(sth, bufsize=16), you'll get a TypeError:
>>> os.fdopen('', bufsize=16)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: fdopen() takes no keyword arguments
Note that there's no real reason the bufsize argument should depend on the mode argument. I think this function probably predates keyword arguments, though it's hard to be sure. Keyword arguments were added to Python all the way back in 1.3, and the earliest Python documentation available on python.org is for 1.4, by which time os.fdopen was definitely around.
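For a function written in Python, "doing the validation yourself" could look like the sketch below. The names fdopen_like and fdopen_positional, and the defaults "r" and -1, are assumptions for illustration only. The first variant uses a sentinel to detect which arguments were actually supplied; the second uses positional-only parameters (Python 3.8+), which reject keywords entirely, similar in effect to a C function parsed with PyArg_ParseTuple.

```python
_MISSING = object()  # sentinel meaning "argument not supplied"


def fdopen_like(fd, mode=_MISSING, bufsize=_MISSING):
    # Nested optionals: bufsize is only accepted when mode was given too.
    if bufsize is not _MISSING and mode is _MISSING:
        raise TypeError("fdopen_like(): 'bufsize' requires 'mode'")
    mode = "r" if mode is _MISSING else mode
    bufsize = -1 if bufsize is _MISSING else bufsize
    return fd, mode, bufsize


def fdopen_positional(fd, mode="r", bufsize=-1, /):
    # Positional-only parameters cannot be passed by keyword at all.
    return fd, mode, bufsize


print(fdopen_like(3, "w", 16))
for call in (lambda: fdopen_like(3, bufsize=16),
             lambda: fdopen_positional(3, bufsize=16)):
    try:
        call()
    except TypeError as exc:
        print("TypeError:", exc)
```

For deeply nested optionals such as the cv2.dilate example, the positional-only form avoids one check per argument, at the cost of forbidding keyword use.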

Higher-Order Programming Using Boost::Python

So, I have a simple event library, written in C++ and using the Boost libraries. I wanted to expose said library to Python, so naturally I turned to Boost::Python. I got the code to compile, eventually, but now I'm faced with quite the problem: my library uses higher-order programming techniques. For example, the library is made up of three main classes: an event class, an event manager class, and an event listener class. The event listener class poses a problem. Code:
class listener{
public:
    listener(){}
    void alert(cham::event::event e){
        if (responses[e.getName()])
            responses[e.getName()](e.getData());
    }
    void setResponse(std::string n, boost::function<void (std::string d)> c){responses.insert(make_pair(n, c));}
    void setManager(_manager<listener> *m){manager = m;}
private:
    std::map<std::string, boost::function<void (std::string d)> > responses;
    _manager<listener> *manager;
};
As you can see, the function setResponse is the problem. It requires a function to be passed to it, and, unfortunately, Boost::Python does not apply its converter magic in this situation. When called like the following:
>>> import chameleon
>>> man = chameleon.manager()
>>> lis = chameleon.listener()
>>> def oup(s):
... print s
...
>>> lis.setResponse("event", oup)
it gives this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
Boost.Python.ArgumentError: Python argument types in
listener.setResponse(listener, str, function)
did not match C++ signature:
setResponse(cham::event::listener {lvalue}, std::string, boost::function<void ()(std::string)>)
So, my question is, how could I fix this? It would have to either use overloading or a wrapper, as I would like the library to remain callable by C++.
You will need a wrapper around setResponse, which takes a boost::python::object instead of a function. It should store this bp::object in a known location (probably a member variable of a listener subclass).
Then pass a different c++ function to the base setResponse, that will know how to lookup and call the function in the bp::object. If events are to be called on a different thread, you will also need to ensure proper handling of python's Global Interpreter Lock, as discussed here: boost.python not supporting parallelism?.
