Python's C API and __add__ calls

I am writing a binding system that exposes classes and functions to python in a slightly unusual way.
Normally one would create a python type and provide a list of functions that represent the methods of that type, and then allow python to use its generic tp_getattro function to select the right one.
For reasons I won't go into here, I can't do it this way and must provide my own tp_getattro function, which selects methods from elsewhere and returns my own 'bound method' wrapper. This works fine, but it means that a type's methods are not listed in its dictionary (so dir(MyType()) doesn't show anything interesting).
The problem is that I cannot seem to get __add__ methods working. See the following sample:
>>> from mymod import Vec3
>>> v=Vec3()
>>> v.__add__
<Bound Method of a mymod Class object at 0xb754e080>
>>> v.__add__(v)
<mymod.Vec3 object at 0xb751d710>
>>> v+v
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'mymod.Vec3' and 'mymod.Vec3'
As you can see, Vec3 has an __add__ method which can be called, but python's + refuses to use it.
How can I get python to use it? How does the + operator actually work in python, and what method does it use to see if you can add two arbitrary objects?
Thanks.
(P.S. I am aware of other systems such as Boost.Python and SWIG which do this automatically, and I have good reason for not using them, however wonderful they may be.)

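Do you have an nb_add entry in your type's number methods structure (pointed to by the tp_as_number field of your type object)? The + operator does not go through tp_getattro; it looks up the nb_add slot on the type directly, so an __add__ that is reachable only through attribute lookup is never consulted.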
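A pure-Python analogy may make the behaviour easier to see. This is only a sketch, not the C binding itself: the classes Vec3 and Vec3WithSlot are hypothetical stand-ins, and __getattr__ plays the role of a custom tp_getattro that hands out methods on demand.

```python
class Vec3:
    """Stand-in for the C type: __add__ is only reachable via attribute lookup."""

    def __getattr__(self, name):
        # Called only when normal lookup fails, like a custom tp_getattro fallback.
        if name == "__add__":
            return lambda other: Vec3()
        raise AttributeError(name)


class Vec3WithSlot:
    """Defining __add__ on the class itself fills the type-level slot."""

    def __add__(self, other):
        return Vec3WithSlot()


v = Vec3()
explicit = v.__add__(v)  # explicit attribute access finds the method

try:
    v + v  # the operator looks at the type, not at instance attribute lookup
    operator_worked = True
except TypeError:
    operator_worked = False

w = Vec3WithSlot()
slot_result = w + w  # works because the type itself defines __add__
```

In the C API the equivalent of "defining __add__ on the class" is filling nb_add in the PyNumberMethods struct referenced by tp_as_number.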

Related

Subtype Polymorphism is broken in Cython v3.0.0a11?

Trying to pass an instance of a derived class to a function which accepts instances of the superclass gives an error in Cython v3.0.0a11:
test.pyx:
class MyInt(int):
    pass

def takes_int(a: int):
    pass
try.py:
from test import takes_int, MyInt
takes_int(MyInt(1))
try.py OUTPUT:
Traceback (most recent call last):
  File "C:\Users\LENOVO PC\PycharmProjects\MyProject\cython_src\try.py", line 3, in <module>
    takes_int(MyInt(1))
TypeError: Argument 'a' has incorrect type (expected int, got MyInt)
Changing to v0.29.32, cleaning the generated C file and the object files, and re-running, gets rid of the error.
This is (kind of) expected.
Cython has never allowed subtype polymorphism for builtin type arguments. See https://cython.readthedocs.io/en/latest/src/userguide/language_basics.html#types:
This requires an exact match of the class, it does not allow subclasses. This allows Cython to optimize code by accessing internals of the builtin class, which is the main reason for declaring builtin types in the first place.
This is a restriction which applies only to builtin types - for Cython-defined cdef classes it works fine. It's also slightly different to the usual rule for annotations, but it's there because it's the only way that Cython can do much with these annotations.
What's changed is that an int annotation is interpreted as "any object" in Cython 0.29.x and a Python int in Cython 3. (Note that cdef int declares a C int though.) The reason for not using an int annotation in earlier versions of Cython is that Python 2 has two Python integer types, and it isn't easy to accept both of those and usefully use the types.
I'm not quite sure exactly what the final version of Cython 3 will end up doing with int annotations though.
If you don't want Cython to use the annotation (for example you would like your int subclass to be accepted) then you can turn off annotation_typing locally with the @cython.annotation_typing(False) decorator.
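The exact-match rule quoted above can be illustrated in plain Python. This is only an analogy for the kind of check involved, not Cython's generated code; the helper names accepts_exact_int and accepts_any_int are made up for the illustration.

```python
class MyInt(int):
    pass


def accepts_exact_int(a):
    # Exact-type check: subclasses are rejected, as with a builtin-typed argument.
    return type(a) is int


def accepts_any_int(a):
    # Subtype-polymorphic check: subclasses are accepted.
    return isinstance(a, int)


exact_plain = accepts_exact_int(1)
exact_sub = accepts_exact_int(MyInt(1))
poly_sub = accepts_any_int(MyInt(1))
```

Here exact_sub is False while poly_sub is True, which mirrors the difference between the error above and the behaviour the question expected.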

Location of `object.__init__`

Where is object.__init__ located in the cpython repository?
I searched for __init__ in Objects/object.c, but it gives no results.
It appears that all the immutable data types use object.__init__, so I would like to know the implementation of it.
Objects/object.c is where (most of) the object protocol is implemented, not where object is implemented.
object is implemented along with type in Objects/typeobject.c, and its __init__ method is object_init in that file.
(Note that the very similar-sounding PyObject_Init function is actually completely unrelated to object.__init__. PyObject_Init is a generic helper function that performs type pointer and refcount initialization for a newly-allocated object struct.)
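From the Python side you can get a hint of this without reading the C source. The following is a small sketch that assumes CPython; other implementations may report different types.

```python
# object.__init__ is a C-level slot wrapper, not a Python function.
init_kind = type(object.__init__).__name__

# Immutable builtins such as int and tuple do not define their own __init__;
# attribute lookup falls through the MRO to the one defined on object.
int_defines_init = "__init__" in vars(int)
int_uses_object_init = int.__init__ is object.__init__
tuple_uses_object_init = tuple.__init__ is object.__init__

print(init_kind, int_defines_init, int_uses_object_init, tuple_uses_object_init)
```

The repr of int.__init__ also names object as the owner of the slot wrapper, which points you to where object (rather than int) is implemented.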

What is Python's "Namespace" object?

I know what namespaces are. But when running
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('bar')
parser.parse_args(['XXX']) # outputs: Namespace(bar='XXX')
What kind of object is Namespace(bar='XXX')? I find this totally confusing.
Reading the argparse docs, it says "Most ArgumentParser actions add some value as an attribute of the object returned by parse_args()". Shouldn't this object then appear when running globals()? Or how can I introspect it?
Samwise's answer is very good, but let me answer the other part of the question.
Or how can I introspect it?
Being able to introspect objects is a valuable skill in any language, so let's approach this as though Namespace is a completely unknown type.
>>> obj = parser.parse_args(['XXX']) # outputs: Namespace(bar='XXX')
Your first instinct is good. See if there's a Namespace in the global scope, which there isn't.
>>> Namespace
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'Namespace' is not defined
So let's see the actual type of the thing. The Namespace(bar='XXX') printer syntax is coming from a __str__ or __repr__ method somewhere, so let's see what the type actually is.
>>> type(obj)
<class 'argparse.Namespace'>
and its module
>>> type(obj).__module__
'argparse'
Now it's a pretty safe bet that we can do from argparse import Namespace and get the type. Beyond that, we can do
>>> help(argparse.Namespace)
in the interactive interpreter to get detailed documentation on the Namespace class, all with no Internet connection necessary.
It's simply a container for the data that parse_args generates.
https://docs.python.org/3/library/argparse.html#argparse.Namespace
This class is deliberately simple, just an object subclass with a readable string representation.
Just do parser.parse_args(...).bar to get the value of your bar argument. That's all there is to that object. Per the doc, you can also convert it to a dict via vars().
The symbol Namespace doesn't appear when running globals() because you didn't import it individually. (You can access it as argparse.Namespace if you want to.) It's not necessary to touch it at all, though, because you don't need to instantiate a Namespace yourself. I've used argparse many times and until seeing this question never paid attention to the name of the object type that it returns -- it's totally unimportant to the practical applications of argparse.
Namespace is basically just a bare-bones class, on whose instances you can define attributes, with a few niceties:
A nice __repr__
Only keyword arguments can be used to instantiate it, preventing "anonymous" attributes.
A convenient method to check if an attribute exists ('foo' in Namespace(bar=3) evaluates to False)
Equality with other Namespace instances based on having identical attributes and attribute values. (E.g., Namespace(foo=3, bar=5) == Namespace(bar=5, foo=3))
Instances of Namespace are returned by parse_args:
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('bar')
args = parser.parse_args(['XXX'])
assert args.bar == 'XXX'
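The niceties listed above can be checked directly. A short sketch using only the standard library:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('bar')
args = parser.parse_args(['XXX'])

# vars() exposes the attributes as a plain dict.
as_dict = vars(args)

# Membership tests check whether an attribute of that name exists.
has_bar = 'bar' in args
has_foo = 'foo' in args

# Equality compares attributes and values, regardless of keyword order.
same = argparse.Namespace(foo=3, bar=5) == argparse.Namespace(bar=5, foo=3)

print(as_dict, has_bar, has_foo, same)
```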

How to clone a function in Cython? I'm getting SystemError: unknown opcode

In CPython this code would work:
import inspect
from types import FunctionType
def f(a, b):  # line 5
    print(a, b)

f_clone = FunctionType(
    f.__code__,
    f.__globals__,
    closure=f.__closure__,
    name=f.__name__
)
f_clone.__annotations__ = {'a': int, 'b': int}
f_clone.__defaults__ = (1, 2)
print(inspect.signature(f_clone))  # (a: int = 1, b: int = 2)
print(inspect.signature(f))  # (a, b)
f_clone()  # 1 2
f(1, 2)  # 1 2
try:
    f()
except TypeError as e:
    print(e)  # f() missing 2 required positional arguments: 'a' and 'b'
However, in Cython, calling f_clone gives:
XXX lineno: 5, opcode: 0
Traceback (most recent call last):
  ...
  File "test.py", line 5, in f  # line of f definition
SystemError: unknown opcode
I need this to create a copy of a class's __init__ method on each class creation and modify its signature, while keeping the original __init__ signature untouched.
Edit:
Changes made to the signature of the copied object must not affect runtime calls; they are needed only for inspection purposes.
I am relatively convinced this is never going to work well. If I were you I'd modify your code to fail elegantly for unclonable functions (maybe by just using the original __init__ and not replacing it, since this seems to be a purely cosmetic approach to generate prettier docstrings). After that you could submit an issue to the Cython issue tracker - however the maintainers of Cython know that full-introspection compatibility with Python is very challenging, so may not be hugely interested.
One of the main reasons I think you should just handle the error rather than find a workaround is that Cython is not the only method to accelerate Python. For example Numba can generate classes containing JIT accelerated code, or people can write their own functions in C (either as a C-API function, or perhaps wrapped with Ctypes or CFFI). These are all situations where your rather fragile introspection approach is likely to break. Handling the error fixes it for all of these; while you're likely to need an individual workaround for each one, plus all the methods I haven't thought of, plus any that are developed in the future.
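As a rough sketch of what "fail elegantly" could look like: clone only genuine Python functions and hand anything else back unchanged. The helper name clone_or_keep is hypothetical, and the isinstance check is one possible guard, not the only one.

```python
import types


def clone_or_keep(func):
    """Return a copy of a pure-Python function, or func itself if it can't be cloned."""
    if not isinstance(func, types.FunctionType):
        # Compiled callables (Cython, C extensions, builtins) are left alone.
        return func
    clone = types.FunctionType(
        func.__code__,
        func.__globals__,
        name=func.__name__,
        argdefs=func.__defaults__,
        closure=func.__closure__,
    )
    clone.__kwdefaults__ = func.__kwdefaults__
    return clone


def f(a, b):
    return (a, b)


f_clone = clone_or_keep(f)
builtin_kept = clone_or_keep(len)
```

With this guard, a class whose __init__ was compiled simply keeps its original __init__ instead of raising at call time.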
Some details about Cython functions: at the moment Cython has a compilation option called binding that can generate functions in two different modes:
With binding=False functions have the type builtin_function_or_method, which has minimum introspection capacities, and so no __code__, __globals__, __closure__ (or most other) attributes.
With binding=True functions have the type cython_function_or_method. This has improved introspection capacity, so it does provide most of the expected attributes. However, some of them are nonsense defaults - specifically __code__. The __code__ attribute is expected to be Python bytecode, but Cython doesn't use Python bytecode (since it's compiled to C). Therefore it just provides a dummy attribute.
It looks like Cython defaults to binding=True when compiling a .py file and when compiling a regular (non-cdef) class, giving the behaviour you report. However, when compiling a .pyx file it currently defaults to binding=False. It's possible you may also want to handle the binding=False case in some circumstances too.
Having established that trying to create a regular Python function object with the __code__ attribute of a cython_function_or_method isn't going to work, let's look at a few other options:
>>> print(f)
<cyfunction f at 0x7f08a1c63550>
>>> type(f)()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot create 'cython_function_or_method' instances
So you can't create your own cython_function_or_method and populate it from Python - the type does not have a user callable constructor.
copy.copy appears to work, but doesn't actually create a new instance:
>>> import copy
>>> copy.copy(f)
<cyfunction f at 0x7f08a1c63550>
Note, however, that this has exactly the same address - it isn't a copy:
>>> copy.copy(f) is f
True
At which point I'm out of ideas.
What I don't quite get is why you don't use functools.wraps?
@functools.wraps(f)
def wrapper(*args, **kwargs):
    return f(*args, **kwargs)
This updates wrapper with most of the relevant introspection attributes from f, works for both types of Cython function (to an extent - the binding=False case doesn't provide much useful information), and should work for most other types of function too.
It's possible I'm missing something, but it seems a whole lot less fragile than your scheme of copying code objects.
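If the goal is a different signature for inspection only, one possible extension of the functools.wraps idea is to set __signature__ on the wrapper, which inspect.signature honours. This is a sketch under assumptions: the helper name wrap_with_defaults is made up, and it relies on inspect.signature being able to read the original function, which may not hold for every compiled callable.

```python
import functools
import inspect


def wrap_with_defaults(func, defaults):
    """Wrap func and advertise extra defaults to inspect.signature only."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    sig = inspect.signature(func)
    params = [
        p.replace(default=defaults.get(name, p.default))
        for name, p in sig.parameters.items()
    ]
    # Cosmetic only: runtime calls still go straight to func.
    wrapper.__signature__ = sig.replace(parameters=params)
    return wrapper


def f(a, b):
    return (a, b)


f_clone = wrap_with_defaults(f, {'a': 1, 'b': 2})
print(inspect.signature(f_clone))
print(inspect.signature(f))
```

The original function's signature is untouched, and calling f_clone() with no arguments still raises TypeError, matching the requirement that the change must not affect runtime calls.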

Make SWIG wrapped builtin class "hashable" in Python

I use SWIG to expose our C++ libraries to Python. For performance reasons, I'm interested in switching some of the wrapping to use SWIG's -builtin option, which removes the layers of Python proxy objects.
However, the wrapped class can no longer be used in Python sets or as a key in Python dicts. It is unhashable!
>>> wrapped_object = WrappedObject()
>>> hash(wrapped_object)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'structure.WrappedObject'
I have defined __hash__(), __eq__(), and __ne__() methods for my class.
>>> wrapped_object.__hash__
<built-in method __hash__ of structure.WrappedObject object at 0x7fa9e0e4c378>
>>> wrapped_object.__eq__
<method-wrapper '__eq__' of structure.WrappedObject object at 0x7fa9e0e4c378>
What do I need to do to make this class hashable?
For builtin objects, Python uses the type's hash slot (tp_hash) rather than looking up the __hash__() method. Thus, the new builtin object needs to fill the hash slot, which requires a specific method prototype.
In the WrappedObject C++ files:
long WrappedObject::getHash();
And in the SWIG wrapper definition files:
%rename(__hash__) WrappedObject::getHash;
%feature("python:slot", "tp_hash", functype="hashfunc") WrappedObject::getHash;
This worked for me!
