python optional argument to optional argument validation - python

In Python, how do I validate optional-to-optional keyword arguments?
This question extends an earlier question of mine about optional-to-optional arguments (is there a better name for this, by the way?).
We know we can define optional arguments in this style:
os.fdopen(fd[, mode[, bufsize]])
so that if I mistakenly call fdopen as fdopen(sth, bufsize=16), Python will point out that I must specify mode in order to use the bufsize argument.
How is this implemented? I could obviously write a pile of if-else branches to make this work, but that would produce really messy code for complicated functions, for example:
cv2.dilate(src, kernel[, dst[, anchor[, iterations[, borderType[, borderValue]]]]]) → dst

There's no specific syntax for this at Python level. You have to define ordinary optional arguments and do the validation yourself.
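A minimal sketch of what "do the validation yourself" could look like without a wall of if-else branches. The helper check_optional_chain, the sentinel _MISSING, and the function fdopen_like are all hypothetical names made up for illustration; nothing like this ships with Python. It relies on keyword arguments keeping their order, which holds on Python 3.7 and later.

```python
_MISSING = object()  # sentinel: distinguishes "not passed" from any real value


def check_optional_chain(func_name, **given):
    """Raise TypeError if a later optional argument is passed without an earlier one.

    The keyword arguments must be listed in their documented order.
    """
    first_missing = None
    for name, value in given.items():
        if value is _MISSING:
            if first_missing is None:
                first_missing = name
        elif first_missing is not None:
            raise TypeError(
                f"{func_name}(): {name!r} requires {first_missing!r} to be given"
            )


def fdopen_like(fd, mode=_MISSING, bufsize=_MISSING):
    # One call validates the whole chain, however long it is.
    check_optional_chain("fdopen_like", mode=mode, bufsize=bufsize)
    mode = "r" if mode is _MISSING else mode
    bufsize = -1 if bufsize is _MISSING else bufsize
    return fd, mode, bufsize
```

With this, fdopen_like(3, bufsize=16) raises a TypeError naming mode, while fdopen_like(3, "w", 16) goes through. The same helper would cover a longer chain like the cv2.dilate one by listing more names.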
The specific case you're looking at is implemented in C. Depending on which platform you're on, the C implementation is different. Here's the version for POSIX, Windows, and OS/2:
static PyObject *
posix_fdopen(PyObject *self, PyObject *args)
{
    ...
    if (!PyArg_ParseTuple(args, "i|si", &fd, &orgmode, &bufsize))
        return NULL;
The use of PyArg_ParseTuple means that this function doesn't actually accept any arguments by name. If you do os.fdopen(sth, bufsize=16), you'll get a TypeError:
>>> os.fdopen('', bufsize=16)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: fdopen() takes no keyword arguments
Note that there's no real reason the bufsize argument should depend on the mode argument. I think this function probably predates keyword arguments, though it's hard to be sure. Keyword arguments were added to Python all the way back in 1.3, and the earliest Python documentation available on python.org is for 1.4, by which time os.fdopen was definitely around.
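If the goal is simply to reproduce what PyArg_ParseTuple gives you here (arguments accepted by position only, so a later one can never be supplied without the earlier ones), Python 3.8 and later can express that directly with the / marker. A small sketch, with fdopen_positional as a made-up stand-in rather than the real os.fdopen:

```python
def fdopen_positional(fd, mode="r", bufsize=-1, /):
    # Everything before "/" is positional-only, so bufsize cannot be passed
    # by name and therefore cannot be passed without mode.
    return fd, mode, bufsize


print(fdopen_positional(3, "w", 16))

try:
    fdopen_positional(3, bufsize=16)
except TypeError as exc:
    print("rejected:", exc)
```

The error message differs from the C version's, but the effect on the caller is the same.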

Related

Subtype Polymorphism is broken in Cython v3.0.0a11?

Trying to pass an instance of a derived class to a function which accepts instances of the superclass gives an error in Cython v3.0.0a11:
test.pyx:
class MyInt(int):
    pass

def takes_int(a: int):
    pass
try.py:
from test import takes_int, MyInt
takes_int(MyInt(1))
try.py OUTPUT:
Traceback (most recent call last):
  File "C:\Users\LENOVO PC\PycharmProjects\MyProject\cython_src\try.py", line 3, in <module>
    takes_int(MyInt(1))
TypeError: Argument 'a' has incorrect type (expected int, got MyInt)
Changing to v0.29.32, cleaning the generated C file and the object files, and re-running, gets rid of the error.
This is (kind of) expected.
Cython has never allowed subtype polymorphism for builtin type arguments. See https://cython.readthedocs.io/en/latest/src/userguide/language_basics.html#types:
This requires an exact match of the class, it does not allow subclasses. This allows Cython to optimize code by accessing internals of the builtin class, which is the main reason for declaring builtin types in the first place.
This is a restriction which applies only to builtin types - for Cython-defined cdef classes it works fine. It's also slightly different to the usual rule for annotations, but it's there because it's the only way that Cython can do much with these annotations.
What's changed is that an int annotation is interpreted as "any object" in Cython 0.29.x and a Python int in Cython 3. (Note that cdef int declares a C int though.) The reason for not using an int annotation in earlier versions of Cython is that Python 2 has two Python integer types, and it isn't easy to accept both of those and usefully use the types.
I'm not quite sure exactly what the final version of Cython 3 will end up doing with int annotations though.
If you don't want Cython to use the annotation (for example you would like your int subclass to be accepted) then you can turn off annotation_typing locally with the @cython.annotation_typing(False) decorator.
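To make the "exact match" rule concrete, here is a plain-Python emulation of the two kinds of check. This is only an illustration of the semantics described above, not what Cython actually generates; takes_exact_int and takes_any_int are hypothetical names.

```python
class MyInt(int):
    pass


def takes_exact_int(a):
    # Emulates the exact-type rule: subclasses of int are rejected.
    if type(a) is not int:
        raise TypeError(
            f"Argument 'a' has incorrect type (expected int, got {type(a).__name__})"
        )
    return a


def takes_any_int(a):
    # The usual Python convention: any subclass of int is accepted.
    if not isinstance(a, int):
        raise TypeError(f"expected int, got {type(a).__name__}")
    return a


takes_any_int(MyInt(1))        # accepted
try:
    takes_exact_int(MyInt(1))  # rejected, like the Cython 3 behaviour above
except TypeError as exc:
    print(exc)
```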

How to clone a function in Cython? I'm getting SystemError: unknown opcode

In cpython this code would work:
import inspect
from types import FunctionType


def f(a, b):  # line 5
    print(a, b)


f_clone = FunctionType(
    f.__code__,
    f.__globals__,
    closure=f.__closure__,
    name=f.__name__
)
f_clone.__annotations__ = {'a': int, 'b': int}
f_clone.__defaults__ = (1, 2)

print(inspect.signature(f_clone))  # (a: int = 1, b: int = 2)
print(inspect.signature(f))  # (a, b)
f_clone()  # 1 2
f(1, 2)  # 1 2
try:
    f()
except TypeError as e:
    print(e)  # f() missing 2 required positional arguments: 'a' and 'b'
However in cython when calling f_clone, I get:
XXX lineno: 5, opcode: 0
Traceback (most recent call last):
  ...
  File "test.py", line 5, in f # line of f definition
SystemError: unknown opcode
I need this to create a copy of a class's __init__ method on each class creation and modify its signature, while keeping the original __init__ signature untouched.
Edit:
Changes made to the signature of the copied object must not affect runtime calls; they are needed only for inspection purposes.
I am relatively convinced this is never going to work well. If I were you I'd modify your code to fail elegantly for unclonable functions (maybe by just using the original __init__ and not replacing it, since this seems to be a purely cosmetic approach to generate prettier docstrings). After that you could submit an issue to the Cython issue tracker - however the maintainers of Cython know that full-introspection compatibility with Python is very challenging, so may not be hugely interested.
One of the main reasons I think you should just handle the error rather than find a workaround is that Cython is not the only method to accelerate Python. For example Numba can generate classes containing JIT accelerated code, or people can write their own functions in C (either as a C-API function, or perhaps wrapped with Ctypes or CFFI). These are all situations where your rather fragile introspection approach is likely to break. Handling the error fixes it for all of these; while you're likely to need an individual workaround for each one, plus all the methods I haven't thought of, plus any that are developed in the future.
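A rough sketch of the "fail elegantly" idea: only attempt the copy for plain Python functions and hand back the original otherwise. The name clone_or_original is made up, and the isinstance test assumes that compiled callables (Cython, C extensions and so on) are not instances of types.FunctionType.

```python
import types


def clone_or_original(func):
    """Return a copy of func if it is a plain Python function, else func itself."""
    if not isinstance(func, types.FunctionType):
        # Compiled or otherwise unclonable callable: leave it alone.
        return func
    clone = types.FunctionType(
        func.__code__,
        func.__globals__,
        name=func.__name__,
        argdefs=func.__defaults__,
        closure=func.__closure__,
    )
    clone.__kwdefaults__ = func.__kwdefaults__
    clone.__annotations__ = dict(func.__annotations__)
    clone.__qualname__ = func.__qualname__
    clone.__doc__ = func.__doc__
    clone.__dict__.update(func.__dict__)
    return clone


def f(a, b):
    return a + b


g = clone_or_original(f)
g.__defaults__ = (1, 2)       # only the clone changes
print(g(), f.__defaults__)
print(clone_or_original(len) is len)
```

Anything that is not a plain function simply keeps its original signature, which costs nothing beyond a less pretty docstring.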
Some details about Cython functions: at the moment Cython has a compilation option called binding that can generate functions in two different modes:
With binding=False functions have the type builtin_function_or_method, which has minimal introspection capabilities, and so no __code__, __globals__, __closure__ (or most other) attributes.
With binding=True functions have the type cython_function_or_method. This has improved introspection capabilities, so it does provide most of the expected attributes. However some of them are nonsense defaults - specifically __code__. The __code__ attribute is expected to be Python bytecode, however Cython doesn't use Python bytecode (since it's compiled to C). Therefore it just provides a dummy attribute.
It looks like Cython defaults to binding=True when compiling a .py file and when compiling a regular (non-cdef) class, giving the behaviour you report. However, when compiling a .pyx file it currently defaults to binding=False. It's possible you may also want to handle the binding=False case in some circumstances too.
Having established that trying to create a regular Python function object with the __code__ attribute of a cython_function_or_method isn't going to work, let's look at a few other options:
>>> print(f)
<cyfunction f at 0x7f08a1c63550>
>>> type(f)()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot create 'cython_function_or_method' instances
So you can't create your own cython_function_or_method and populate it from Python - the type does not have a user callable constructor.
copy.copy appears to work, but doesn't actually create a new instance:
>>> import copy
>>> copy.copy(f)
<cyfunction f at 0x7f08a1c63550>
Note, however, that this has exactly the same address - it isn't a copy:
>>> copy.copy(f) is f
True
At which point I'm out of ideas.
What I don't quite get is why you don't use functools.wraps?
@functools.wraps(f)
def wrapper(*args, **kwargs):
    return f(*args, **kwargs)
This updates wrapper with most of the relevant introspection attributes from f, works for both types of Cython function (to an extent - the binding=False case doesn't provide much useful information), and should work for most other types of function too.
It's possible I'm missing something, but it seems a whole lot less fragile than your scheme of copying code objects.
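Building on the functools.wraps suggestion, one way to get an inspection-only signature is to set __signature__ on the wrapper, since inspect.signature honours that attribute. This is a sketch under that assumption; with_signature is a hypothetical helper, and f is a plain function here although the wrapper itself does not care what kind of callable it wraps.

```python
import functools
import inspect


def with_signature(func, new_signature):
    """Wrap func so that inspect.signature reports new_signature instead."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    # Affects introspection only; calls are still forwarded unchanged.
    wrapper.__signature__ = new_signature
    return wrapper


def f(a, b):
    return a, b


kind = inspect.Parameter.POSITIONAL_OR_KEYWORD
shown = inspect.Signature([
    inspect.Parameter("a", kind, default=1, annotation=int),
    inspect.Parameter("b", kind, default=2, annotation=int),
])
f_view = with_signature(f, shown)

print(inspect.signature(f_view))  # (a: int = 1, b: int = 2)
print(inspect.signature(f))       # (a, b)
```

Because the defaults exist only in the displayed signature, f_view() without arguments still fails exactly as f() does, which matches the "must not affect runtime calls" requirement.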

Add a signature, with annotations, to extension methods

When embedding Python in my application, and writing an extension type, I can add a signature to the method by using a properly crafted .tp_doc string.
static PyMethodDef Answer_methods[] = {
    { "ultimate", (PyCFunction)Answer_ultimate, METH_VARARGS,
      "ultimate(self, question='Life, the universe, everything!')\n"
      "--\n"
      "\n"
      "Return the ultimate answer to the given question." },
    { NULL }
};
When help(Answer) is executed, the following is returned (abbreviated):
class Answer(builtins.object)
|
| ultimate(self, question='Life, the universe, everything!')
| Return the ultimate answer to the given question.
This is good, but I'm using Python3.6, which has support for annotations. I'd like to annotate question to be a string, and the function to return an int. I've tried:
static PyMethodDef Answer_methods[] = {
    { "ultimate", (PyCFunction)Answer_is_ultimate, METH_VARARGS,
      "ultimate(self, question:str='Life, the universe, everything!') -> int\n"
      "--\n"
      "\n"
      "Return the ultimate answer to the given question." },
    { NULL }
};
but this reverts to the (...) notation, and the documentation becomes:
| ultimate(...)
| ultimate(self, question:str='Life, the universe, everything!') -> int
| --
|
| Return the ultimate answer to the given question.
and asking for inspect.signature(Answer.ultimate) results in an exception.
Traceback (most recent call last):
  File "<string>", line 11, in <module>
  File "inspect.py", line 3037, in signature
  File "inspect.py", line 2787, in from_callable
  File "inspect.py", line 2266, in _signature_from_callable
  File "inspect.py", line 2090, in _signature_from_builtin
ValueError: no signature found for builtin <built-in method ultimate of example.Answer object at 0x000002179F3A11B0>
I've tried to add the annotations after the fact with Python code:
example.Answer.ultimate.__annotations__ = {'return': bool}
But the builtin method descriptors can't have annotations added this way.
Traceback (most recent call last):
  File "<string>", line 2, in <module>
AttributeError: 'method_descriptor' object has no attribute '__annotations__'
Is there a way to add annotations to extension methods, using the C-API?
Argument Clinic looked promising and may still be very useful, but as of 3.6.5, it doesn't support annotations.
annotation
The annotation value for this parameter. Not currently supported, because PEP 8 mandates that the Python library may not use annotations.
TL;DR There is currently no way to do this.
How do signatures and C extensions work together?
In theory it works like this (for Python C extension objects):
If the C function has the "correct docstring" the signature is stored in the __text_signature__ attribute.
If you call help or inspect.signature on such an object it parses the __text_signature__ and tries to construct a signature from that.
If you use the argument clinic you don't need to write the "correct docstring" yourself. The signature line is generated based on comments in the code. However the 2 steps mentioned before still happen. They just happen to the automatically generated signature line.
That's why built-in Python functions like sum have a __text_signature__:
>>> sum.__text_signature__
'($module, iterable, start=0, /)'
The signature in this case is generated through the argument clinic based on the comments around the sum implementation.
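The two steps can be watched from Python. This small sketch assumes CPython, where sum carries a __text_signature__; describe_builtin is a hypothetical helper, and the exact strings printed vary between Python versions.

```python
import inspect


def describe_builtin(func):
    """Return (raw __text_signature__, parsed Signature or None) for func."""
    text_sig = getattr(func, "__text_signature__", None)
    try:
        parsed = inspect.signature(func)   # parses __text_signature__ for builtins
    except (ValueError, TypeError):
        parsed = None                      # no usable signature line was found
    return text_sig, parsed


text_sig, parsed = describe_builtin(sum)
print(text_sig)
print(parsed)
```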
What are the problems with annotations?
There are several problems with annotations:
Return annotations break the contract of a "correct docstring". So the __text_signature__ will be empty when you add a return annotation. That's a major problem because a workaround would necessarily involve re-writing the part of the CPython C code that is responsible for the docstring -> __text_signature__ translation! That's not only complicated but you would also have to provide the changed CPython version so that it works for the people using your functions.
Just as example, if you use this "signature":
ultimate(self, question:str='Life, the universe, everything!') -> int
You get:
>>> ultimate.__text_signature__ is None
True
But if you remove the return annotation:
ultimate(self, question:str='Life, the universe, everything!')
It gives you a __text_signature__:
>>> ultimate.__text_signature__
"(self, question:str='Life, the universe, everything!')"
If you don't have the return annotation it still won't work because annotations are explicitly not supported (currently).
Assuming you have this signature:
ultimate(self, question:str='Life, the universe, everything!')
It doesn't work with inspect.signature (the exception message actually says it all):
>>> import inspect
>>> inspect.signature(ultimate)
Traceback (most recent call last):
  ...
    raise ValueError("Annotations are not currently supported")
ValueError: Annotations are not currently supported
The function responsible for parsing __text_signature__ is inspect._signature_fromstr. In theory you might be able to make it work by monkey-patching that function (return annotations still wouldn't work!). But maybe not: several places make assumptions about __text_signature__ that may not hold with annotations.
Would PyFunction_SetAnnotations work?
In the comments this C API function was mentioned. However that deliberately doesn't work with C extension functions. If you try to call it on a C extension function it will raise a SystemError: bad argument to internal function call. I tested this with a small Cython Jupyter "script":
%load_ext cython

%%cython
cdef extern from "Python.h":
    bint PyFunction_SetAnnotations(object func, dict annotations) except -1

cpdef call_PyFunction_SetAnnotations(object func, dict annotations):
    PyFunction_SetAnnotations(func, annotations)
>>> call_PyFunction_SetAnnotations(sum, {})
---------------------------------------------------------------------------
SystemError Traceback (most recent call last)
<ipython-input-4-120260516322> in <module>()
----> 1 call_PyFunction_SetAnnotations(sum, {})
SystemError: ..\Objects\funcobject.c:211: bad argument to internal function
So that also doesn't work with C extension functions.
Summary
So return annotations are completely out of the question currently (at least without distributing your own CPython with the program). Parameter annotations could work if you monkey-patch a private function in the inspect module. It's a Python module, so it could be feasible, but I haven't made a proof of concept, so treat this as possible in principle but probably very complicated and almost certainly not worth the trouble.
However you can always just wrap the C extension function with a Python function (just a very thin wrapper). This Python wrapper can have function annotations. It's more maintenance and a tiny bit slower but saves you all the hassle with signatures and C extensions. I'm not exactly sure, but if you use Cython to wrap your C or C++ code it might even have some automated tooling (writing the Python wrappers automatically).
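A sketch of the thin-wrapper idea. Since the example.Answer extension type from the question isn't available here, math.isqrt (a C-implemented builtin, Python 3.8+) stands in for the extension function; the wrapper name and its annotations are purely illustrative.

```python
import inspect
import math


def isqrt(n: int) -> int:
    """Return the integer square root of n (thin wrapper around math.isqrt)."""
    # All the real work still happens in the C implementation.
    return math.isqrt(n)


print(inspect.signature(isqrt))   # (n: int) -> int
print(isqrt.__annotations__)
print(isqrt(17))
```

The wrapper is an ordinary Python function, so help(), inspect.signature and type checkers all see the parameter and return annotations without any docstring tricks.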

Python SWIG bindings with SomeType ** as function argument

I couldn't find any working Python bindings for ffmpeg, so I decided to generate them with SWIG. Generation was quick and easy (no customization, just the default SWIG interface), but there is a problem using some functions, such as int avformat_open_input(AVFormatContext **ps, const char *filename, AVInputFormat *fmt, AVDictionary **options); from libavformat/avformat.h. In C this can be called simply by:
AVFormatContext *pFormatCtx = NULL;
int status;
status = avformat_open_input(&pFormatCtx, "/path/to/my/file.ext", NULL, NULL);
In Python I tried the following:
>>> from ppmpeg import *
>>> av_register_all()
>>> FormatCtx = AVFormatContext()
>>> FormatCtx
<ppmpeg.AVFormatContext; proxy of <Swig Object of type 'struct AVFormatContext *' at 0x173eed0> >
>>> avformat_open_input(FormatCtx, '/path/to/my/file.ext', None, None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: in method 'avformat_open_input', argument 1 of type 'AVFormatContext **'
The problem is that Python does not have an equivalent of &. I tried to use cpointer.i and its pointer_class (%pointer_class(AVFormatContext, new_ctx)), but new_ctx() returns a pointer, which is definitely not what I want. %pointer_class(AVFormatContext *, new_ctx) is illegal and gives a syntax error. I would be grateful for any help. Thanks.
EDIT:
I forgot to mention that I tried to use typemaps, but I don't know how to write a custom typemap for a struct, and the documentation only has examples for basic types like int or float...
That looks like an out parameter. C needs these because a C function can return only one value, whereas Python can return several. SWIG lets you mark an argument as OUTPUT or INOUT, which should accomplish what you want.
You can also do it manually with a typemap. A typemap lets you specify an arbitrary conversion.
For example, you likely need in and argout typemaps as described in the typemap docs.
Note that since you're using custom datatypes you need to make sure that the headers that declare the struct are included in the generated .cpp. If SWIG doesn't take care of this automatically then put something like this at the top of your .i
// This block gets copied verbatim into the header area of the generated wrapper.
%{
#include "the_required_header.h"
%}

Python's C API and __add__ calls

I am writing a binding system that exposes classes and functions to python in a slightly unusual way.
Normally one would create a python type and provide a list of functions that represent the methods of that type, and then allow python to use its generic tp_getattro function to select the right one.
For reasons I won't go into here, I can't do it this way, and must provide my own tp_getattro function that selects methods from elsewhere and returns my own 'bound method' wrapper. This works fine, but means that a type's methods are not listed in its dictionary (so dir(MyType()) doesn't show anything interesting).
The problem is that I cannot seem to get __add__ methods working. See the following sample:
>>> from mymod import Vec3
>>> v=Vec3()
>>> v.__add__
<Bound Method of a mymod Class object at 0xb754e080>
>>> v.__add__(v)
<mymod.Vec3 object at 0xb751d710>
>>> v+v
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'mymod.Vec3' and 'mymod.Vec3'
As you can see, Vec3 has an __add__ method which can be called, but python's + refuses to use it.
How can I get python to use it? How does the + operator actually work in python, and what method does it use to see if you can add two arbitrary objects?
Thanks.
(P.S. I am aware of other systems such as Boost.Python and SWIG which do this automatically, and I have good reason for not using them, however wonderful they may be.)
Do you have an nb_add in your type's number methods structure (pointed to by the tp_as_number field of your type object)?
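The underlying reason is that the + operator looks for the addition slot on the type and does not go through attribute lookup on the instance. A pure-Python analogue of the situation in the question (the class names are made up; in the C API the counterpart of defining __add__ on the class is filling in nb_add):

```python
class ViaGetattr:
    """Serves __add__ only through custom attribute lookup, like a custom tp_getattro."""

    def __getattr__(self, name):
        if name == "__add__":
            return lambda other: "added via __getattr__"
        raise AttributeError(name)


class ViaType:
    """Defines __add__ on the class, which is what fills the type's add slot."""

    def __add__(self, other):
        return "added via type slot"


v = ViaGetattr()
print(v.__add__(v))           # explicit attribute access works
try:
    v + v                     # the operator ignores instance-level lookup
except TypeError as exc:
    print(exc)

print(ViaType() + ViaType())  # the operator finds the slot on the type
```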
