I use SWIG to expose our C++ libraries to Python. For performance reasons, I'm interested in switching some of the wrapping to use SWIG's -builtin option, which removes the layers of Python proxy objects.
However, the wrapped class can no longer be used in Python sets or as a key in Python dicts. It is unhashable!
>>> wrapped_object = WrappedObject()
>>> hash(wrapped_object)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'structure.WrappedObject'
I have defined __hash__(), __eq__(), and __ne__() methods for my class.
>>> wrapped_object.__hash__
<built-in method __hash__ of structure.WrappedObject object at 0x7fa9e0e4c378>
>>> wrapped_object.__eq__
<method-wrapper '__eq__' of structure.WrappedObject object at 0x7fa9e0e4c378>
What do I need to do to make this class hashable?
For builtin types, Python uses the type's tp_hash slot rather than looking up a __hash__() method on the instance. Thus, the new builtin type needs to fill the hash slot, and this requires a specific method prototype.
In the WrappedObject C++ files:
long WrappedObject::getHash();
And in the SWIG wrapper definition files:
%rename(__hash__) WrappedObject::getHash;
%feature("python:slot", "tp_hash", functype="hashfunc") WrappedObject::getHash;
This worked for me!
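For intuition about why the slot matters: even in pure Python, hash() reads tp_hash from the type and never consults an instance attribute, which is the same type-level lookup the -builtin wrapper relies on. A minimal illustration (the class name is mine, not from the original post):

class C:
    pass

c = C()
c.__hash__ = lambda: 42        # instance attribute: never reaches tp_hash
print(hash(c) == 42)           # False: hash() uses type(c)'s slot

C.__hash__ = lambda self: 42   # defined on the type: updates the slot
print(hash(c))                 # 42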
Where is object.__init__ located in the cpython repository?
I searched for __init__ in Objects/object.c, but it gives no results.
It appears that all the immutable data types use object.__init__, so I would like to see its implementation.
Objects/object.c is where (most of) the object protocol is implemented, not where object is implemented.
object is implemented along with type in Objects/typeobject.c, and its __init__ method is object_init in that file.
(Note that the very similar-sounding PyObject_Init function is actually completely unrelated to object.__init__. PyObject_Init is a generic helper function that performs type pointer and refcount initialization for a newly-allocated object struct.)
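You can confirm from the prompt that object.__init__ is a C-level slot wrapper rather than a Python function, which is why no .py source exists for it:

import inspect

print(object.__init__)        # <slot wrapper '__init__' of 'object' objects>
print(type(object.__init__))  # <class 'wrapper_descriptor'>

try:
    inspect.getsource(object.__init__)
except TypeError as e:
    print(e)  # there is no Python source to return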
In cpython this code would work:
import inspect
from types import FunctionType
def f(a, b):  # line 5
    print(a, b)

f_clone = FunctionType(
    f.__code__,
    f.__globals__,
    closure=f.__closure__,
    name=f.__name__
)
f_clone.__annotations__ = {'a': int, 'b': int}
f_clone.__defaults__ = (1, 2)

print(inspect.signature(f_clone))  # (a: int = 1, b: int = 2)
print(inspect.signature(f))        # (a, b)

f_clone()  # 1 2
f(1, 2)    # 1 2

try:
    f()
except TypeError as e:
    print(e)  # f() missing 2 required positional arguments: 'a' and 'b'
However, in Cython, when calling f_clone I get:
XXX lineno: 5, opcode: 0
Traceback (most recent call last):
...
File "test.py", line 5, in f # line of f definitio
SystemError: unknown opcode
I need this to create a copy of the class's __init__ method on each class creation and modify its signature, while keeping the original __init__ signature untouched.
Edit:
Changes made to the signature of the copied object must not affect runtime calls; they are needed only for inspection purposes.
I am relatively convinced this is never going to work well. If I were you I'd modify your code to fail elegantly for unclonable functions (maybe by just using the original __init__ and not replacing it, since this seems to be a purely cosmetic approach to generate prettier docstrings). After that you could submit an issue to the Cython issue tracker - however the maintainers of Cython know that full-introspection compatibility with Python is very challenging, so may not be hugely interested.
One of the main reasons I think you should just handle the error rather than find a workaround is that Cython is not the only method to accelerate Python. For example Numba can generate classes containing JIT accelerated code, or people can write their own functions in C (either as a C-API function, or perhaps wrapped with Ctypes or CFFI). These are all situations where your rather fragile introspection approach is likely to break. Handling the error fixes it for all of these; while you're likely to need an individual workaround for each one, plus all the methods I haven't thought of, plus any that are developed in the future.
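As a sketch of the fail-elegantly route (the helper name and the isinstance test are mine, not part of any library):

import types

def clone_function(f):
    # Rebuild a copy of a plain Python function from its code object;
    # for anything else (builtins, cyfunctions, other C extensions)
    # fall back to returning the original untouched.
    if isinstance(f, types.FunctionType):
        return types.FunctionType(f.__code__, f.__globals__,
                                  name=f.__name__,
                                  argdefs=f.__defaults__,
                                  closure=f.__closure__)
    return f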
Some details about Cython functions: at the moment Cython has a compilation option called binding that can generate functions in two different modes:
With binding=False functions have the type builtin_function_or_method, which has minimum introspection capacities, and so no __code__, __globals__, __closure__ (or most other) attributes.
With binding=True functions have the type cython_function_or_method. This has improved introspection capacity, so it does provide most of the expected attributes. However, some of them are nonsense defaults - specifically __code__. The __code__ attribute is expected to contain Python bytecode, but Cython doesn't use Python bytecode (since it's compiled to C). Therefore it just provides a dummy attribute.
It looks like Cython defaults to binding=True when compiling a .py file and when compiling a regular (non-cdef) class, giving the behaviour you report. When compiling a .pyx file, however, it currently defaults to binding=False. You may want to handle the binding=False case in some circumstances too.
Having established that trying to create a regular Python function object with the __code__ attribute of a cython_function_or_method isn't going to work, let's look at a few other options:
>>> print(f)
<cyfunction f at 0x7f08a1c63550>
>>> type(f)()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: cannot create 'cython_function_or_method' instances
So you can't create your own cython_function_or_method and populate it from Python - the type does not have a user-callable constructor.
copy.copy appears to work, but doesn't actually create a new instance:
>>> import copy
>>> copy.copy(f)
<cyfunction f at 0x7f08a1c63550>
Note, however, that this has exactly the same address - it isn't a copy:
>>> copy.copy(f) is f
True
At which point I'm out of ideas.
What I don't quite get is why you don't use functools.wraps?
import functools

@functools.wraps(f)
def wrapper(*args, **kwargs):
    return f(*args, **kwargs)
This updates wrapper with most of the relevant introspection attributes from f, works for both types of Cython function (to an extent - the binding=False case doesn't provide much useful information), and should work for most other types of function too.
It's possible I'm missing something, but it seems a whole lot less fragile than your scheme of copying code objects.
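Also, since your edit says the signature changes are needed for inspection only: inspect.signature() checks for a __signature__ attribute, so you can set the cosmetic signature directly on the wrapper without touching runtime calls. Continuing from the wrapper above (the parameters mirror the earlier f(a, b) example):

import inspect

wrapper.__signature__ = inspect.Signature([
    inspect.Parameter('a', inspect.Parameter.POSITIONAL_OR_KEYWORD,
                      default=1, annotation=int),
    inspect.Parameter('b', inspect.Parameter.POSITIONAL_OR_KEYWORD,
                      default=2, annotation=int),
])
print(inspect.signature(wrapper))  # (a: int = 1, b: int = 2)
wrapper(1, 2)                      # still calls f; runtime is unaffected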
What is the difference between a Python built-in and a normal object? We often say that in Python everything is an object.
For example, when I do this in Python 3.6:
>>> import os, inspect
>>> inspect.getsource(os.scandir)
TypeError: <built-in function scandir> is not a module, class, method, function, traceback, frame, or code object
I have two questions:
Is a built-in function an object? If not, is that why getsource throws a TypeError?
Why can't I find scandir listed as a built-in in the Python 3 documentation?
You can't access the source of builtins and other modules that were written using the C API, since there isn't a Python source for them.
From the documentation for inspect.getsourcefile:
Return the name of the Python source file in which an object was defined. This will fail with a TypeError if the object is a built-in module, class, or function.
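Both points in one quick check: a built-in function is an object like any other; it just has no Python source for inspect to find:

import inspect
import os

print(isinstance(os.scandir, object))  # True: built-ins are objects too
print(type(os.scandir))                # <class 'builtin_function_or_method'>

try:
    inspect.getsourcefile(os.scandir)
except TypeError as e:
    print(e)  # no Python source file exists for C-level functions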
So it's quite a simple question: how do I add __getitem__ to a Python module? I mostly just want it for ease of use, but it's confusing why it won't let me 'just set it'. Below is a simple example of __getitem__ semi-working; however, I want other['test'] to work as well.
Here's the full output:
hello
hello
Traceback (most recent call last):
File "main.py", line 4, in <module>
print other['test']
TypeError: 'module' object has no attribute '__getitem__'
main.py
import other
print other.get('test')
print other.__getitem__('test')
print other['test']
other.py
test = 'hello'
def __getitem__(name):
return globals()[name]
get = __getitem__
I've tried to set __getitem__ using globals() as well: globals()['__getitem__'] = __getitem__. It didn't work. I also tried to set it in main.py. So I'm confused as to why it's so adamant about not allowing me to use other['test'].
If it's impossible, then a short reason would be good.
Special methods are looked up on the type, not on an instance. Python looks for type(other).__getitem__() and that isn't available. You'd have to add the __getitem__ method to the module type; you can't in Python.
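You can watch the type-level lookup with an ordinary class (the names here are illustrative):

class C:
    pass

c = C()
c.__getitem__ = lambda key: key.upper()  # on the instance: ignored by []
try:
    c['x']
except TypeError as e:
    print(e)  # 'C' object is not subscriptable

C.__getitem__ = lambda self, key: key.upper()  # on the type: used by []
print(c['x'])  # X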
You'd have to replace the whole module instance in sys.modules with an instance of your own class to achieve what you want:
class MyModule(object):
    def __init__(self, namespace):
        self.__dict__.update(namespace)

    def __getitem__(self, name):
        return self.__dict__[name]
import other
import sys
sys.modules[other.__name__] = MyModule(other.__dict__)
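Usage sketch: the name other in the importing module still refers to the original module object after the swap, so re-import it; the import statement just fetches whatever is in sys.modules:

import sys
import other

sys.modules[other.__name__] = MyModule(other.__dict__)

import other          # rebinds the name to the replacement instance
print(other['test'])  # hello
print(other.test)     # plain attribute access still works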
This limitation doesn't just apply to modules; it applies to any object whose special methods aren't defined on its type (somewhere in the mro), since the instance itself is never searched.
For example, you can also see this happening with the type type:
In [32]: class Foo(type):
....: pass
....:
In [33]: type(Foo)
Out[33]: type
In [34]: Foo.__getitem__ = lambda x, y: x.__dict__.get(y)
In [35]: Foo.foo = "hello"
In [36]: Foo['foo']
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-38-e354ca231ddc> in <module>()
----> 1 Foo['foo']
TypeError: 'type' object has no attribute '__getitem__'
In [37]: Foo.__dict__.get('foo')
Out[37]: 'hello'
The reason is that at the C-API level, both module and type are particular instances of PyTypeObject which don't implement the required protocol for inducing the same search mechanism that the PyTypeObject implementation of object and friends does implement.
To change this aspect of the language itself, rather than hacking a replacement into sys.modules, you would need to change the C source definitions for PyModule_Type and PyType_Type so that C functions implementing __getitem__ were created and slotted into the appropriate location of the big PyTypeObject struct of magic functions instead of 0 (the sentinel for "does not exist"), and then recompile Python with these modified implementations of module and type.
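Short of recompiling, you can also watch CPython refuse the shortcut directly; static (built-in) types reject attribute assignment:

import types

try:
    types.ModuleType.__getitem__ = lambda self, name: self.__dict__[name]
except TypeError as e:
    print(e)  # can't set attributes of built-in/extension type 'module'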
I am writing a binding system that exposes classes and functions to python in a slightly unusual way.
Normally one would create a Python type and provide a list of functions that represent the methods of that type, and then allow Python to use its generic tp_getattro function to select the right one.
For reasons I won't go into here, I can't do it this way, and must provide my own tp_getattro function that selects methods from elsewhere and returns my own 'bound method' wrapper. This works fine, but means that a type's methods are not listed in its dictionary (so dir(MyType()) doesn't show anything interesting).
The problem is that I cannot seem to get __add__ methods working. see the following sample:
>>> from mymod import Vec3
>>> v=Vec3()
>>> v.__add__
<Bound Method of a mymod Class object at 0xb754e080>
>>> v.__add__(v)
<mymod.Vec3 object at 0xb751d710>
>>> v+v
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'mymod.Vec3' and 'mymod.Vec3'
As you can see, Vec3 has an __add__ method which can be called, but python's + refuses to use it.
How can I get python to use it? How does the + operator actually work in python, and what method does it use to see if you can add two arbitrary objects?
Thanks.
(P.S. I am aware of other systems such as Boost.Python and SWIG which do this automatically, and I have good reason for not using them, however wonderful they may be.)
Do you have an nb_add in your type's number-methods structure (pointed to by the tp_as_number field of your type object)?
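If nb_add is 0, the + operator fails no matter what tp_getattro returns, because the interpreter reads the slot from the operands' types and never performs attribute lookup on the instances. The same effect, reproduced in pure Python (the class name is illustrative):

class Vec3:
    pass

v = Vec3()
v.__add__ = lambda other: 'sum'  # reachable by attribute lookup...
print(v.__add__(v))              # sum

try:
    v + v                        # ...but + never looks at the instance
except TypeError as e:
    print(e)  # unsupported operand type(s) for +: 'Vec3' and 'Vec3'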