Why is __getattribute__ not invoked on an implicit __getitem__-invocation? - python

While trying to wrap arbitrary objects, I came across a problem with dictionaries and lists. Investigating, I managed to come up with a simple piece of code whose behaviour I simply do not understand. I hope some of you can tell me what is going on:
>>> class Cl(object): # simple class that prints (and suppresses) each attribute lookup
...     def __getattribute__(self, name):
...         print 'Access:', name
...
>>> i = Cl() # instance of class
>>> i.test # test that __getattribute__ override works
Access: test
>>> i.__getitem__ # test that it works for special functions, too
Access: __getitem__
>>> i['foo'] # but why doesn't this work?
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'Cl' object has no attribute '__getitem__'

Magic __methods__() are treated specially: They are internally assigned to "slots" in the type data structure to speed up their look-up, and they are only looked up in these slots. If the slot is empty, you get the error message you got.
See Special method lookup for new-style classes in the documentation for further details. Excerpt:
In addition to bypassing any instance attributes in the interest of correctness, implicit special method lookup generally also bypasses the __getattribute__() method even of the object’s metaclass.
[…]
Bypassing the __getattribute__() machinery in this fashion provides significant scope for speed optimisations within the interpreter, at the cost of some flexibility in the handling of special methods (the special method must be set on the class object itself in order to be consistently invoked by the interpreter).

Related

Python: Why do functools.partial functions not become bound methods when set as class attributes?

I was reading about how functions become bound methods when being set as class attributes. I then observed that this is not the case for functions that are wrapped by functools.partial. What is the explanation for this?
Simple example:
from functools import partial
def func1():
    print("foo")
func1_partial = partial(func1)
class A:
    f = func1
    g = func1_partial
a = A()
a.f() # TypeError: func1() takes 0 positional arguments but 1 was given
a.g() # prints "foo"
I kind of expected them both to behave in the same way.
The trick that allows functions to become bound methods is the __get__ magic method.
To summarize the descriptor protocol very briefly: when you access a field on an instance, say foo.bar, Python first checks whether bar exists in foo's __dict__ (or __slots__, if it has one). If it does, we return it, no harm done. (Strictly speaking, a data descriptor on the class, such as a property, is consulted even before the instance's __dict__, but that detail does not matter here.) If not, then we look on type(foo). However, when we find bar on the class Foo while going through an instance, something magical happens. When we write foo.bar, assuming there is no bar in foo's __dict__ (resp. __slots__) and the object stored on the class defines __get__, then we actually call Foo.bar.__get__(foo, Foo). That is, Python calls a magic method asking the object how it would like to be retrieved.
This is how properties are implemented, and it's also how bound methods are implemented. Somewhere deep down (probably written in C), there's a __get__ function on the type function that binds the method when accessed through an instance.
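As a rough illustration of the property case, here is a stripped-down stand-in for the built-in property. SimpleProperty and Circle are hypothetical names and this is not how CPython actually implements property, just a sketch of the same __get__ mechanism.

```python
class SimpleProperty:
    """Hypothetical, stripped-down stand-in for the built-in property."""

    def __init__(self, getter):
        self.getter = getter

    def __get__(self, instance, owner=None):
        if instance is None:
            return self               # accessed on the class: hand back the descriptor
        return self.getter(instance)  # accessed on an instance: compute the value

class Circle:
    def __init__(self, radius):
        self.radius = radius

    @SimpleProperty
    def diameter(self):
        return 2 * self.radius

circle = Circle(3)
print(circle.diameter)                 # 6
print(type(Circle.diameter).__name__)  # SimpleProperty
```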
functools.partial, despite looking a lot like a function, is not an instance of the type function. It's just a random class that happens to implement __call__, and it doesn't implement __get__. Why doesn't it? Well, they probably just didn't think it was worth it, or it's possible nobody even considered it. Regardless, the "bound method" trick applies to the type called function, not to all callable objects.
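The same point can be shown without relying on partial itself. In the sketch below, CallableThing is a hypothetical class that, like the partial objects described above, implements __call__ but not __get__. functools.partialmethod is the standard-library helper for the case where you do want partial application that takes part in method binding.

```python
from functools import partialmethod

class CallableThing:
    """Hypothetical callable without __get__: it is never bound to an instance."""
    def __call__(self, *args):
        return args

def plain(*args):
    return args

class A:
    f = plain                          # a function: has __get__, so it binds
    g = CallableThing()                # no __get__: returned as-is
    h = partialmethod(plain, "extra")  # descriptor-aware partial application

a = A()
print(a.f() == (a,))          # True: the instance was passed as first argument
print(a.g() == ())            # True: nothing was passed
print(a.h() == (a, "extra"))  # True: bound, then the stored argument
```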
Another useful resource on magic methods, and __get__ in particular: https://rszalski.github.io/magicmethods/#descriptor
The type function implements the __get__ method:
>>> import types
>>> types.FunctionType.__get__
<slot wrapper '__get__' of 'function' objects>
partial does not.
>>> from functools import partial
>>> partial.__get__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'functools.partial' has no attribute '__get__'. Did you mean: '__ge__'?
The __get__ method is what makes a.f evaluate to a value of type method instead of A.f. Without __get__, a.g is equivalent to A.g.

Self-deleting class in Python

EDIT: Disclaimer - I don't mean deletion in the sense that applies to languages that aren't memory-managed (e.g. free in C++). Deletion here is to be understood as the fact that the superclass doesn't have the subclass as one of its subclasses anymore after it has been deleted.
In Python, you can delete a class (yes I do mean a class, not an instance) by doing the following:
class Super:
    ...
class DeleteMe(Super):
    ...
print(Super.__subclasses__())
# [<class '__main__.DeleteMe'>]
del DeleteMe
import gc
gc.collect() # Force a collection
print(Super.__subclasses__())
# []
I am trying to emulate this behaviour but I want the DeleteMe class to be able to destroy itself. Here is what I've tried:
class Super:
    ...
class DeleteMe(Super):
    def self_delete(self):
        print(self.__class__)
        # <class '__main__.DeleteMe'>, this looks right
        del self.__class__ # this fails
        import gc
        gc.collect()
        print(Super.__subclasses__())
        # [<class '__main__.DeleteMe'>]
DeleteMe().self_delete()
It fails with the following traceback:
Traceback (most recent call last):
  File "/Users/rayan/Desktop/test.py", line 10, in <module>
    DeleteMe().self_delete()
  File "/Users/rayan/Desktop/test.py", line 4, in self_delete
    del self.__class__
TypeError: can't delete __class__ attribute
How can I achieve this self-destructing behaviour?
Note: not a duplicate of How to remove classes from __subclasses__?, that question covers the first case where the deletion happens outside of the class
del DeleteMe
This is not deleting the class. This is deleting the name that happens to refer to the class. If there are no other references to the class (and that includes the name you just deleted, any module that's ever imported the class, any instances of the class, and any other places where the class might happen to be stored), then the garbage collector might delete the class when you gc.collect().
Now an instance always knows its own class, via the __class__ attribute. It makes little sense to delete self.__class__, because then what would we be left with? An instance with no class? What can we do with it? We can't call methods on it since those are defined on the class, and we can't do anything object-like on it since it's no longer an instance of object (a superclass of the class we just removed). So really we have a sort of silly looking dictionary that doesn't even do all of the dict things in Python. Hence, disallowed.
You cannot delete data in Python. That's the garbage collector's job. There is no Python equivalent of C's free or C++'s delete. del in Python deletes bindings or dictionary entries. It does not remove data; it removes pointers that happen to point to data.
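A small sketch of the "del removes a name, not the object" point, using a second name (alias) and a weak reference (watcher) as illustrative helpers. Whether and when the class finally disappears after the last reference is dropped is up to the garbage collector, so the final result is printed rather than promised.

```python
import gc
import weakref

class Super:
    pass

class DeleteMe(Super):
    pass

alias = DeleteMe                 # a second reference to the same class object
watcher = weakref.ref(DeleteMe)  # observes the class without keeping it alive

del DeleteMe                     # removes the name only
gc.collect()
names_while_aliased = [cls.__name__ for cls in Super.__subclasses__()]
alive_while_aliased = watcher() is alias

del alias                        # drop the last strong reference we hold
gc.collect()
names_after = [cls.__name__ for cls in Super.__subclasses__()]

print(names_while_aliased, alive_while_aliased)  # ['DeleteMe'] True
print(names_after)  # normally [] on CPython, but that is the collector's decision
```

As long as alias exists, the class is still listed in Super.__subclasses__(), no matter how many times the original name is deleted.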

__get__() of a method/function in Python

I have a piece of code that I am trying to understand, and even with the existing answers I couldn't work out the purpose of the following code. Can someone please help me understand it?
I have already looked at various relevant questions about __get__() here and couldn't find specific answers. I understand that the class below is trying to create a method on the fly (possibly we get to this class from a __getattr__() method which fails to find an attribute) and return the method to the caller. I have commented right above the lines of code I need help understanding.
class MethodGen(object):
    def __getattr__(self, name):
        method = self.method_gen(name)
        if method:
            return self.method_gen(name)
    def method_gen(self, name):
        def method(*args, **kwargs):
            print("Creating a method here")
        # Below are the two lines of code I need help understanding with
        method.__name__ = name
        setattr(self, name, method.__get__(self))
        return method
If I am not wrong, the __name__ attribute of the method() function has been set, but in the setattr() call, what is the attribute called name on the MethodGen instance set to?
This question really intrigued me. The two answers provided didn't seem to tell the whole story. What bothered me was the fact that in this line:
setattr(self, name, method.__get__(self))
the code is not setting things up so that method.__get__ will be called at some point. Rather, method.__get__ is actually being called right there! But isn't the idea that this __get__ method will be called when a particular attribute of an object, an instance of MethodGen in this case, is actually referenced? If you read the docs, this is the impression you get: that an attribute is linked to a descriptor that implements __get__, and that implementation determines what gets returned when that attribute is referenced. But again, that's not what's going on here. This is all happening before that point. So what is really going on here?
The answer lies in the Python documentation on descriptors. The key language is this:
To support method calls, functions include the __get__() method for
binding methods during attribute access. This means that all functions
are non-data descriptors which return bound methods when they are
invoked from an object.
method.__get__(self) is exactly what's being described here. So what method.__get__(self) is actually doing is returning a reference to the "method" function that is bound to self. Since in our case, self is an instance of MethodGen, this call is returning a reference to the "method" function that is bound to an instance of MethodGen. In this case, the __get__ method has nothing to do with the act of referencing an attribute. Rather, this call is turning a function reference into a method reference!
So now we have a reference to a method we've created on the fly. But how do we set it up so it gets called at the right time, when an attribute with the right name is referenced on the instance it is bound to? That's where the setattr(self, name, X) part comes in. This call takes our new method and binds it to the attribute with name name on our instance.
All of the above then is why:
setattr(self, name, method.__get__(self))
is adding a new method to self, the instance of the MethodGen class on which method_gen has been called.
The method.__name__ = name part is not all that important. Executing just the line of code discussed above gives you all the behavior you really want. This extra step just attaches a name to our new method so that code that asks for the name of the method, like code that uses introspection to write documentation, will get the right name. It is the instance attribute's name...the name passed to setattr...that really matters, and really "names" the method.
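To make the difference concrete, here is a small sketch (Box and describe are made-up names) comparing a plain function stored on an instance with the result of calling __get__ on it first. types.MethodType is shown as an equivalent, more explicit spelling.

```python
import types

class Box:
    pass

def describe(self):
    return "I am a " + type(self).__name__

box = Box()
box.plain = describe                              # plain function in the instance __dict__
box.bound = describe.__get__(box)                 # bound method, as in method_gen
box.also_bound = types.MethodType(describe, box)  # equivalent spelling

try:
    box.plain()                  # instance attributes are not bound: self is missing
    plain_call_worked = True
except TypeError:
    plain_call_worked = False

print(box.bound(), box.also_bound(), plain_call_worked)
```

Descriptors are only invoked when found on the class, so a function placed directly on an instance has to be bound by hand, which is exactly what method.__get__(self) does.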
Interesting, never seen this done before, seems tough to maintain (probably will make some fellow developers want to hang you).
I changed some code so you can see a little more of what is happening.
class MethodGen(object):
    def method_gen(self, name):
        print("Creating a method here")
        def method(*args, **kwargs):
            print("Calling method")
            print(args) # so we can see what is actually being passed in
        # Below are the two lines of code I need help understanding with
        method.__name__ = name # Sets the function's __name__ to name (for introspection only)
        # The following adds the new method to the current instance.
        setattr(self, name, method.__get__(self)) # Adds the bound method to this instance
        # I would do: setattr(self, name, method) though and remove the __get__
        # (note that without __get__ the function is stored unbound and does not receive self)
        return method # Returns the plain function
m = MethodGen()
test = m.method_gen("my_method") # I created a method on the instance m called my_method
test("test") # It returned a reference to the plain function, which I can call directly
m.my_method("test") # Or I can now call the bound method on the instance
m.method_gen("method_2")
m.method_2("test2")
Consider the class below:
class Foo:
    def bar(self):
        print("hi")
f = Foo()
f.bar()
bar is a class attribute that has a function as its value. Because function implements the descriptor protocol, however, accessing it as Foo.bar or f.bar does not immediately return the function itself; it causes the function's __get__ method to be invoked, and that returns either the original function (as in Foo.bar) or a new value of type method, called instancemethod in Python 2 (as in f.bar). f.bar() is evaluated as Foo.bar.__get__(f, Foo)().
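The two lookups can be reproduced by hand in Python 3 by taking the raw function out of the class __dict__ and calling __get__ on it directly. This is only a sketch of what attribute access does, with bar changed to return a value so the result can be checked.

```python
class Foo:
    def bar(self):
        return "hi"

f = Foo()
func = Foo.__dict__["bar"]           # the raw function, no descriptor call involved

via_class = func.__get__(None, Foo)  # what Foo.bar does
via_instance = func.__get__(f, Foo)  # what f.bar does

print(via_class is func)             # True: class access returns the function itself
print(type(via_instance).__name__)   # method
print(via_instance.__self__ is f)    # True: the method remembers its instance
print(via_instance())                # hi
```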
method_gen takes the function named method, and attaches an actual method retrieved by calling the function's __get__ method to an object. The intent is so that something like this works:
>>> m = MethodGen()
>>> n = MethodGen()
>>> m.foo()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'MethodGen' object has no attribute 'foo'
>>> n.foo()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'MethodGen' object has no attribute 'foo'
>>> m.method_gen('foo')
<function foo at 0x10465c758>
>>> m.foo()
Creating a method here
>>> n.foo()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'MethodGen' object has no attribute 'foo'
Initially, MethodGen does not have any methods other than method_gen. You can see the exception raised when attempting to invoke a method named foo on either of two instances. Calling method_gen, however, attaches a new method to just that particular instance. After calling m.method_gen("foo"), m.foo() calls the method defined by method_gen. That call does not affect other instances of MethodGen like n.
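For completeness, here is one way the per-instance behaviour could be combined with __getattr__ so that methods are generated lazily on first access. This is a variation on the question's class, not the original code: the generated method takes self explicitly and returns a tuple instead of printing, and the underscore check is an added assumption to keep special names out of the way.

```python
class MethodGen:
    def __getattr__(self, name):
        # Only reached when normal lookup fails, i.e. on the first access.
        if name.startswith("_"):
            raise AttributeError(name)  # leave private and special names alone
        return self.method_gen(name)

    def method_gen(self, name):
        def method(self, *args):
            return (name, args)
        method.__name__ = name
        bound = method.__get__(self)    # bind to this instance only
        setattr(self, name, bound)      # cache it in the instance __dict__
        return bound

m = MethodGen()
n = MethodGen()
result = m.foo(1, 2)
print(result)                              # ('foo', (1, 2))
print("foo" in vars(m), "foo" in vars(n))  # True False
```

After the first call, foo lives in m's own __dict__, so __getattr__ is no longer involved for m, and n is unaffected.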

Descriptors and direct access: Python reference

The Python 3.3 documentation tells me that direct access to a property descriptor should be possible, although I'm skeptical of its syntax x.__get__(a). But the example that I constructed below fails. Am I missing something?
class MyDescriptor(object):
    """Descriptor"""
    def __get__(self, instance, owner):
        print "hello"
        return 42
class Owner(object):
    x = MyDescriptor()
    def do_direct_access(self):
        self.x.__get__(self)
if __name__ == '__main__':
    my_instance = Owner()
    print my_instance.x
    my_instance.do_direct_access()
Here's the error I get in Python 2.7 (and also Python 3.2 after porting the snippet of code). The error message makes sense to me, but that doesn't seem to be how the documentation said it would work.
Traceback (most recent call last):
  File "descriptor_test.py", line 15, in <module>
    my_instance.do_direct_access()
  File "descriptor_test.py", line 10, in do_direct_access
    self.x.__get__(self)
AttributeError: 'int' object has no attribute '__get__'
shell returned 1
By accessing the descriptor on self you invoked __get__ already. The value 42 is being returned.
For any attribute access, Python will look to the type of the object (so type(self) here) to see if there is a descriptor object there (an object with a .__get__() method, for example), and will then invoke that descriptor.
That's how methods work; a function object is found, which has a .__get__() method, which is invoked and returns a method object bound to self.
If you wanted to access the descriptor directly, you'd have to bypass this mechanism; access x in the __dict__ dictionary of Owner:
>>> Owner.__dict__['x']
<__main__.MyDescriptor object at 0x100e48e10>
>>> Owner.__dict__['x'].__get__(None, Owner)
hello
42
This behaviour is documented right above where you saw the x.__get__(a) direct call:
The default behavior for attribute access is to get, set, or delete the attribute from an object’s dictionary. For instance, a.x has a lookup chain starting with a.__dict__['x'], then type(a).__dict__['x'], and continuing through the base classes of type(a) excluding metaclasses.
The Direct Call scenario in the documentation only applies when you have a direct reference to the descriptor object (not invoked); the Owner.__dict__['x'] expression is such a reference.
Your code on the other hand, is an example of the Instance Binding scenario:
Instance Binding
If binding to an object instance, a.x is transformed into the call: type(a).__dict__['x'].__get__(a, type(a)).
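Put together as a Python 3 sketch (with the print call dropped from the descriptor so the values are easy to compare), the difference between instance binding and a direct call looks like this. inspect.getattr_static is the standard-library way to fetch the descriptor object without triggering __get__.

```python
import inspect

class MyDescriptor:
    def __get__(self, instance, owner=None):
        return 42

class Owner:
    x = MyDescriptor()

inst = Owner()

bound_value = inst.x                          # instance binding: __get__ already ran
raw = type(inst).__dict__["x"]                # the descriptor object itself
same_raw = inspect.getattr_static(inst, "x")  # same object, fetched without __get__
direct_value = raw.__get__(inst, type(inst))  # the "direct call" from the docs

print(bound_value, direct_value, raw is same_raw)  # 42 42 True
```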

How to tell python non-class objects from class objects

I am new to Python. I think non-class objects do not have a __bases__ attribute whereas class objects do, but I am not sure. How does Python/CPython check whether an object is a class or a non-class, and pass the correct arguments to the object's descriptor attribute accordingly during attribute access?
============================================
updated:
I was learning how __getattribute__ and descriptors cooperate to make bound methods. I was wondering how class objects and non-class objects invoke the descriptor's __get__ differently. I thought those two kinds of objects shared the same __getattribute__ CPython function and that this function would have to know whether the invoking object was a class or a non-class. But I was wrong. This article explains it well:
http://docs.python.org/dev/howto/descriptor.html#functions-and-methods
So class objects use type.__getattribute__ whereas non-class objects use object.__getattribute__. They are different CPython functions. And super has a third __getattribute__ CPython implementation as well.
However, about super, the above article states:
The object returned by super() also has a custom __getattribute__() method for invoking descriptors. The call super(B, obj).m() searches obj.__class__.__mro__ for the base class A immediately following B and then returns A.__dict__['m'].__get__(obj, A). If not a descriptor, m is returned unchanged. If not in the dictionary, m reverts to a search using object.__getattribute__().
The statement above didn't seem to match my experiment with Python 3.1. What I saw, which seems reasonable to me, is:
super(B, obj).m ---> A.__dict__['m'].__get__(obj, type(obj))
objclass = type(obj)
super(B, objclass).m ---> A.__dict__['m'].__get__(None, objclass)
A was never passed to __get__.
It seems reasonable to me because I believe it is objclass's MRO chain (rather than A's) that is needed within m, especially in the second case.
Was I doing something wrong? Or did I misunderstand it?
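The experiment described above can be reproduced with a small recording descriptor. Recorder and the A/B/C hierarchy are illustrative names, and the results shown are what CPython 3 reports.

```python
class Recorder:
    """Hypothetical descriptor that reports the arguments __get__ receives."""
    def __get__(self, instance, owner=None):
        return (instance, owner)

class A:
    m = Recorder()

class B(A):
    pass

class C(B):
    pass

obj = C()
instance_seen, owner_seen = super(B, obj).m            # lookup through an instance
class_instance_seen, class_owner_seen = super(B, C).m  # lookup through a class

print(instance_seen is obj, owner_seen is C)               # True True
print(class_instance_seen is None, class_owner_seen is C)  # True True
```

In both cases the owner passed to __get__ is the starting type (C here), not the class A in whose __dict__ the descriptor was found.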
As the commenters asked: Why do you care? Usually that's a sign of not using Python the way it was meant to be used.
A very powerful concept of Python is duck typing. You don't care about the type or class of an object as long as it exposes the attributes you need.
How about inspect.isclass(objectname)?
More info here: http://docs.python.org/library/inspect.html
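A quick sketch of that check, using a throwaway Widget class. In Python 3, inspect.isclass gives the same answer as isinstance(obj, type) for these cases.

```python
import inspect

class Widget:
    pass

checks = {
    "Widget": inspect.isclass(Widget),      # a class
    "Widget()": inspect.isclass(Widget()),  # an instance
    "int": inspect.isclass(int),            # a built-in class
    "3": inspect.isclass(3),                # a built-in instance
}
print(checks)
print(isinstance(Widget, type), isinstance(Widget(), type))  # True False
```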
