How is `getattr` related to `object.__getattribute__` and to `object.__getattr__`? - python

From https://docs.python.org/3.6/library/functions.html#getattr
getattr(object, name[, default])
Return the value of the named attribute of object. name must be a
string. If the string is the name of one of the object’s attributes,
the result is the value of that attribute. For example, getattr(x,
'foobar') is equivalent to x.foobar. If the named attribute does
not exist, default is returned if provided, otherwise
AttributeError is raised.
How is getattr related to object.__getattribute__ and to object.__getattr__?
Does getattr call object.__getattribute__ or object.__getattr__? I would guess the former?

Summary Answer
In general, a dotted lookup invokes __getattribute__.
If the code in __getattribute__ doesn't find the attribute, it looks to see if __getattr__ is defined. If so, it is called. Otherwise, AttributeError is raised.
The getattr() function is just an alternative way to call the above methods. For example getattr(a, 'x') is equivalent to a.x.
The getattr() function is mainly useful when you don't know the name of an attribute in advance (i.e. when it is stored in a variable). For example, k = 'x'; getattr(a, k) is equivalent to a.x.
Analogy and high level point-of-view
The best way to think of it is that __getattribute__ is the first called primary method and __getattr__ is the fallback which is called when attributes are missing. In this way, it is very much like the relationship between __getitem__ and __missing__ for square bracket lookups in dictionaries.
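The dictionary half of that analogy can be sketched as follows. `DefaultingDict` is a hypothetical class name made up for this illustration; only `dict.__getitem__` and the `__missing__` hook come from Python itself.

```python
class DefaultingDict(dict):
    """Hypothetical dict subclass used only to illustrate the analogy."""

    def __missing__(self, key):
        # dict.__getitem__ calls this hook only when the key is absent,
        # much like __getattr__ is called only when normal lookup fails.
        return f'no {key!r} here'


d = DefaultingDict(x=10)
found = d['x']       # primary path: the key exists, __missing__ is not used
fallback = d['y']    # fallback path: the key is absent, __missing__ runs
print(found, fallback)
```

Just as `__getattr__` does not store anything by itself, this `__missing__` does not insert the key; it only supplies a value for the failed lookup.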
Demonstration code
Here is a worked-out example:
>>> class A(object):
...     x = 10
...     def __getattribute__(self, attr):
...         print(f'Looking up {attr!r}')
...         return object.__getattribute__(self, attr)
...     def __getattr__(self, attr):
...         print(f'Invoked the fallback method for missing {attr!r}')
...         return 42
...
>>> a = A()
>>> a.x
Looking up 'x'
10
>>> a.y
Looking up 'y'
Invoked the fallback method for missing 'y'
42
>>> # Equivalent calls with getattr()
>>> getattr(a, 'x')
Looking up 'x'
10
>>> getattr(a, 'y')
Looking up 'y'
Invoked the fallback method for missing 'y'
42
Official Documentation
Here are the relevant parts of the docs:
object.__getattr__(self, name) Called when an attribute lookup has
not found the attribute in the usual places (i.e. it is not an
instance attribute nor is it found in the class tree for self). name
is the attribute name. This method should return the (computed)
attribute value or raise an AttributeError exception.
Note that if the attribute is found through the normal mechanism,
__getattr__() is not called. (This is an intentional asymmetry between __getattr__() and __setattr__().) This is done both for efficiency reasons and because otherwise __getattr__() would have no way to
access other attributes of the instance. Note that at least for
instance variables, you can fake total control by not inserting any
values in the instance attribute dictionary (but instead inserting
them in another object). See the __getattribute__() method below for a
way to actually get total control over attribute access.
object.__getattribute__(self, name) Called unconditionally to
implement attribute accesses for instances of the class. If the class
also defines __getattr__(), the latter will not be called unless
__getattribute__() either calls it explicitly or raises an AttributeError. This method should return the (computed) attribute
value or raise an AttributeError exception. In order to avoid infinite
recursion in this method, its implementation should always call the
base class method with the same name to access any attributes it
needs, for example, object.__getattribute__(self, name).

getattr(foo, 'bar') (which is basically foo.bar) calls __getattribute__, and from there __getattr__ (if it exists) can be called when:
AttributeError is raised by __getattribute__
__getattribute__ calls it explicitly
From docs:
If the class also defines __getattr__(), the latter will not be called
unless __getattribute__() either calls it explicitly or raises an
AttributeError.
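A small sketch of the AttributeError case, under the assumption that a class wants to hide one attribute and let another fail loudly. The class name `Strict` and the attribute names `secret`, `broken` and `nope` are invented for the example.

```python
class Strict:
    def __getattribute__(self, name):
        if name == 'secret':
            # Raising AttributeError here hands control to __getattr__.
            raise AttributeError(name)
        if name == 'broken':
            # Any other exception propagates; __getattr__ is not consulted.
            raise ValueError(name)
        return object.__getattribute__(self, name)

    def __getattr__(self, name):
        return f'fallback for {name}'


s = Strict()
via_error = s.secret                # AttributeError raised explicitly above
via_missing = getattr(s, 'nope')    # AttributeError raised by object.__getattribute__

try:
    s.broken
    outcome = 'no exception'
except ValueError:
    outcome = 'ValueError propagated'

print(via_error, via_missing, outcome)
```

Only AttributeError triggers the fallback; the ValueError reaches the caller untouched.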
Defining __getattr__ can be useful if you don't want arbitrary attribute access to raise AttributeError. The Mock library is a great example of that.
>>> from unittest.mock import Mock
>>> m = Mock()
>>> m.foo.bar.spam
<Mock name='mock.foo.bar.spam' id='4367274280'>
If you don't want to define __getattr__ or handle AttributeError every time, you can either use the three-argument form of getattr() to return a default value when the attribute isn't found, or use hasattr() to check for its existence.
>>> getattr([], 'foo')
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-60-f1834c32d3ce> in <module>()
----> 1 getattr([], 'foo')
AttributeError: 'list' object has no attribute 'foo'
>>> getattr([], 'foo', 'default')
'default'
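One detail worth illustrating: the default of the three-argument form covers AttributeError only. The sketch below uses a hypothetical `Settings` class whose `broken` property raises a different exception.

```python
class Settings:
    timeout = 30

    @property
    def broken(self):
        # Not an AttributeError, so getattr()'s default does not cover it.
        raise ValueError('unexpected failure')


s = Settings()
timeout = getattr(s, 'timeout', 5)     # attribute exists, default ignored
retries = getattr(s, 'retries', 3)     # attribute missing, default returned
has_retries = hasattr(s, 'retries')    # same lookup, reduced to a bool

try:
    getattr(s, 'broken', None)
    outcome = 'default returned'
except ValueError:
    outcome = 'ValueError propagated'

print(timeout, retries, has_retries, outcome)
```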
Demo:
class A:
    def __getattr__(self, attr):
        print('Inside A.__getattr__')
        return 'eggs'
print(A().foo)
outputs:
Inside A.__getattr__
eggs
Note that the data model documentation writes it as object.__getattr__, but that doesn't mean the method exists on the built-in object type. It is a hook that any class may define. "object" in that notation effectively means the object's type, because dunder methods are looked up on an object's type rather than on the object itself.
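Both halves of that note can be checked with a short sketch. `Plain` is a throwaway class invented for the example.

```python
# The built-in object type provides __getattribute__ but no __getattr__.
has_hook = hasattr(object, '__getattr__')
has_primary = hasattr(object, '__getattribute__')


class Plain:
    pass


p = Plain()

# A __getattr__ stored on the instance is ignored: dunder hooks are
# looked up on the type, not on the instance.
p.__getattr__ = lambda name: 'from instance'
try:
    p.missing
    instance_hook_used = True
except AttributeError:
    instance_hook_used = False

# Defining the hook on the class makes it take effect.
Plain.__getattr__ = lambda self, name: 'from class'
class_result = p.missing

print(has_hook, has_primary, instance_hook_used, class_result)
```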

Related

Why is the __get__ method of a descriptor in Python called inside hasattr()?

Suppose
class D:
    def __init__(self, id): self.id = id
    def __get__(self, obj, type=None):
        print(self.id, "__get__ is called")
class C:
    d1 = D(1)
    d2 = D(2)
    d3 = D(3)
c = C()
then during the call hasattr(c,"d1") the __get__ method of C.d1 is called. Why? Shouldn't hasattr() just check the dictionaries?
The same (but weirder) happens when <tab> is pressed for completion in an interactive CPython 3.6.10 session, as in c.d<tab>. In this case __get__ will be called for all matching attributes except the last one.
What is going on here?
If you look at the help for hasattr (or the documentation):
Help on built-in function hasattr in module builtins:
hasattr(obj, name, /)
Return whether the object has an attribute with the given name.
This is done by calling getattr(obj, name) and catching AttributeError.
So no, it doesn't just "check dictionaries", it couldn't do that because not all objects have dictionaries as namespaces to begin with, e.g. built-in objects, or user-defined objects with __slots__.
getattr(obj, name) invokes the same machinery as:
obj.name
which calls the descriptor's __get__.
Not sure about the tab completion, but hasattr does call __get__. It is documented as well:
The arguments are an object and a string. The result is True if the
string is the name of one of the object’s attributes, False if not.
(This is implemented by calling getattr(object, name) and seeing
whether it raises an AttributeError or not.)
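If the goal is to test for an attribute without running descriptor code, one option is `inspect.getattr_static` from the standard library. The sketch below uses a hypothetical counting descriptor `Loud` to show the difference; it is an illustration, not a drop-in replacement for hasattr().

```python
import inspect


class Loud:
    """Hypothetical descriptor that counts how often __get__ runs."""
    calls = 0

    def __get__(self, obj, objtype=None):
        Loud.calls += 1
        return 'value'


class C:
    d = Loud()


c = C()

present = hasattr(c, 'd')              # goes through getattr(), so __get__ runs
calls_after_hasattr = Loud.calls

raw = inspect.getattr_static(c, 'd')   # returns the descriptor without calling it
calls_after_static = Loud.calls

in_class_dict = 'd' in vars(type(c))   # plain dictionary check, no descriptor call

print(present, calls_after_hasattr, calls_after_static, in_class_dict)
```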

the fundamental differences of the way to overwrite getattr and setattr

My hope is to make attributes case-insensitive. But overriding __getattr__ and overriding __setattr__ turn out to work somewhat differently, as the following toy example shows:
class A(object):
    x = 10
    def __getattr__(self, attribute):
        return getattr(self, attribute.lower())
    ## following alternatives don't work ##
    # def __getattr__(self, attribute):
    #     return self.__getattr__(attribute.lower())
    # def __getattr__(self, attribute):
    #     return super().__getattr__(attribute.lower())
    def __setattr__(self, attribute, value):
        return super().__setattr__(attribute.lower(), value)
    ## following alternative doesn't work ##
    # def __setattr__(self, attribute, value):
    #     return setattr(self, attribute.lower(), value)
a = A()
print(a.x) ## output is 10
a.X = 2
print(a.X) ## output is 2
I am confused by two points.
I assume getattr() is a syntactic sugar for __getattr__, but they behave differently.
Why does __setattr__ need to call super(), while __getattr__ doesn't?
I assume getattr() is a syntactic sugar for __getattr__, but they behave differently.
That's because the assumption is incorrect. getattr() goes through the entire attribute lookup process, of which __getattr__ is only a part.
Attribute lookup first invokes a different hook, namely the __getattribute__ method, which by default performs the familiar search through the instance dict and class hierarchy. __getattr__ will be called only if the attribute hasn't been found by __getattribute__. From the __getattr__ documentation:
Called when the default attribute access fails with an AttributeError (either __getattribute__() raises an AttributeError because name is not an instance attribute or an attribute in the class tree for self; or __get__() of a name property raises AttributeError).
In other words, __getattr__ is an extra hook to access attributes that don't exist, and would otherwise raise AttributeError.
Also, functions like getattr() or len() are not syntactic sugar for a dunder method. They almost always do more work, with the dunder method a hook for that process to call. Sometimes there are multiple hooks involved, such as here, or when creating an instance of a class by calling the class. Sometimes the connection is fairly direct, such as in len(), but even in the simple cases there are additional checks being made that the hook itself is not responsible for.
Why does __setattr__ need to call super(), while __getattr__ doesn't?
__getattr__ is an optional hook. There is no default implementation, which is why super().__getattr__() doesn't work. __setattr__ is not optional, so object provides you with a default implementation.
Note that by using getattr() you created an infinite loop! instance.non_existing will call __getattribute__('non_existing') and then __getattr__('non_existing'), at which point you use getattr(..., 'non_existing') which calls __getattribute__() and then __getattr__, etc.
In this case, you should override __getattribute__ instead:
class A(object):
    x = 10
    def __getattribute__(self, attribute):
        return super().__getattribute__(attribute.lower())
    def __setattr__(self, attribute, value):
        return super().__setattr__(attribute.lower(), value)
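A quick check of the __getattribute__ approach, written as a self-contained sketch. The class is renamed `CaseInsensitive` here purely to keep the example separate from the code above.

```python
class CaseInsensitive:
    x = 10

    def __getattribute__(self, name):
        # Every lookup is normalised before the default machinery runs.
        return super().__getattribute__(name.lower())

    def __setattr__(self, name, value):
        super().__setattr__(name.lower(), value)


a = CaseInsensitive()
before = a.X                        # resolves to the class attribute x
a.X = 2                             # stored under the key 'x'
after = (a.x, a.X, getattr(a, 'X'))

try:
    a.Missing                       # fails cleanly, no recursion
    missing = 'found'
except AttributeError:
    missing = 'AttributeError'

print(before, after, missing)
```

One consequence of this design is that any attribute defined in the class body with uppercase letters becomes unreachable through normal access, because the lookup always uses the lowercased name.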

Why is cls.__dict__[meth] different than getattr(cls, meth) for classmethods/staticmethods?

I've never seen anything else work like this before.
Is there anything else that does this?
>>> class NothingSpecial:
...     @classmethod
...     def meth(cls): pass
...
>>> NothingSpecial.meth
<bound method classobj.meth of <class __main__.NothingSpecial at 0x02C68C70>>
>>> NothingSpecial.__dict__['meth']
<classmethod object at 0x03F15FD0>
>>> getattr(NothingSpecial, 'meth')
<bound method NothingSpecial.meth of <class '__main__.NothingSpecial'>>
>>> object.__getattribute__(NothingSpecial, 'meth')
<classmethod object at 0x03FAFE90>
>>> type.__getattribute__(NothingSpecial, 'meth')
<bound method NothingSpecial.meth of <class '__main__.NothingSpecial'>>
Getattr Uses Descriptor Logic
The main difference is that the dictionary lookup does no extra processing while the attribute fetch incorporates extra logic (see my Descriptor How-To Guide for all the details).
There Are Two Different Underlying Methods
1) The call NothingSpecial.__dict__['meth'] uses the square brackets operator which dispatches to dict.__getitem__ which does a simple hash table lookup or raises KeyError if not found.
2) The call NothingSpecial.meth uses the dot operator which dispatches to type.__getattribute__ which does a simple lookup followed by a special case for descriptors. If the lookup fails, an AttributeError is raised.
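The two paths can be compared side by side in a short sketch. The class is the one from the question, except that `meth` returns `cls` so the binding is visible.

```python
class NothingSpecial:
    @classmethod
    def meth(cls):
        return cls


# 1) Square brackets: dict lookup, no extra processing.
raw = NothingSpecial.__dict__['meth']

# 2) Dot operator: type.__getattribute__ finds the same object,
#    notices it is a descriptor, and calls its __get__.
bound = NothingSpecial.meth

# Doing the descriptor step by hand gives an equivalent bound method.
manual = raw.__get__(None, NothingSpecial)

# The failure modes differ as well: KeyError versus AttributeError.
try:
    NothingSpecial.__dict__['nope']
except KeyError:
    dict_failure = 'KeyError'
try:
    NothingSpecial.nope
except AttributeError:
    attr_failure = 'AttributeError'

print(type(raw).__name__, bound(), manual(), dict_failure, attr_failure)
```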
How It Works
The overall logic is documented here and here.
In general, a descriptor is an object attribute with “binding
behavior”, one whose attribute access has been overridden by methods
in the descriptor protocol: __get__(), __set__(), and/or __delete__(). If
any of those methods are defined for an object, it is said to be a
descriptor.
The default behavior for attribute access is to get, set, or delete
the attribute from an object’s dictionary. For instance, a.x has a
lookup chain starting with a.__dict__['x'], then
type(a).__dict__['x'], and continuing through the base classes of
type(a) excluding metaclasses.
However, if the looked-up value is an object defining one of the
descriptor methods, then Python may override the default behavior and
invoke the descriptor method instead. Where this occurs in the
precedence chain depends on which descriptor methods were defined and
how they were called
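The remark about the precedence chain can be made concrete with a sketch. A descriptor that defines __set__ (a data descriptor) takes priority over the instance dictionary, while one that defines only __get__ does not. The class names `NonData`, `Data` and `Host` are invented for this example.

```python
class NonData:
    def __get__(self, obj, objtype=None):
        return 'from non-data descriptor'


class Data:
    def __get__(self, obj, objtype=None):
        return 'from data descriptor'

    def __set__(self, obj, value):
        raise AttributeError('read-only')


class Host:
    nd = NonData()
    dd = Data()


h = Host()
# Write straight into the instance dictionary, bypassing __set__.
h.__dict__['nd'] = 'from instance dict'
h.__dict__['dd'] = 'from instance dict'

nd_result = h.nd    # the instance dict wins over a non-data descriptor
dd_result = h.dd    # a data descriptor wins over the instance dict

print(nd_result, dd_result)
```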
Hope you've found all of this to be helpful. The kind of exploring you're doing is a great way to learn about Python :-)
P.S. You might also enjoy reading the original Whatsnew in Python 2.2 entry for descriptors or looking at PEP 252 where Guido van Rossum originally proposed the idea.
object.__getattribute__(NothingSpecial, 'meth')
and
NothingSpecial.__dict__['meth']
return the same object in this case. You can quickly check it by doing:
NothingSpecial.__dict__['meth'] is object.__getattribute__(NothingSpecial, 'meth')
$True
Both of them point to the same descriptor object.
On the other hand:
object.__getattribute__(NothingSpecial, 'meth') is getattr(NothingSpecial, 'meth')
$False
Basically, they are neither the same object nor the same type:
type(object.__getattribute__(NothingSpecial, 'meth'))
$<class 'classmethod'>
type(getattr(NothingSpecial, 'meth'))
$<class 'method'>
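So the answer is that getattr() automatically invokes the __get__ method of an attribute found in a class namespace, whereas a plain __dict__ lookup does not, and neither does object.__getattribute__ when it is applied to the class itself (it treats the class as an ordinary instance and returns the entry from the class's own __dict__ unchanged). The following function demonstrates that: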
from inspect import isclass
class Nothing:
    @classmethod
    def a(cls):
        return cls()
    @staticmethod
    def b():
        return 'b'
    def c(self):
        return 'c'
def gitter(obj, name):
    # Fetch the raw value, then apply the descriptor step by hand.
    value = object.__getattribute__(obj, name)
    if hasattr(value, '__get__'):
        if isclass(obj):
            instance, cls = None, obj
        else:
            instance, cls = obj, type(obj)
        return value.__get__(instance, cls)
    return value
>>> gitter(Nothing, 'a')()
<__main__.Nothing object at 0x03E97930>
>>> gitter(Nothing, 'b')()
'b'
>>> gitter(Nothing(), 'c')()
'c'
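However, gitter(Nothing(), 'b')() does not work as written: when called on an instance, object.__getattribute__ already runs the staticmethod's __get__ and returns the plain function, which gitter then binds to the instance a second time. Still, this is enough to show the mechanism.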
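A variant that also handles the instance case can be sketched with `inspect.getattr_static`, which returns the raw descriptor for classes and instances alike. The function name `fetch` is hypothetical, and the sketch deliberately ignores finer points such as values stored in the instance dictionary and data-descriptor precedence.

```python
import inspect


class Nothing:
    @classmethod
    def a(cls):
        return cls()

    @staticmethod
    def b():
        return 'b'

    def c(self):
        return 'c'


def fetch(obj, name):
    """Simplified emulation of attribute access: raw lookup, then __get__."""
    raw = inspect.getattr_static(obj, name)
    # Look the hook up on the type of the raw value, as Python itself does.
    getter = getattr(type(raw), '__get__', None)
    if getter is None:
        return raw
    if isinstance(obj, type):
        return getter(raw, None, obj)      # class access: no instance
    return getter(raw, obj, type(obj))     # instance access


results = (
    type(fetch(Nothing, 'a')()).__name__,
    fetch(Nothing, 'b')(),
    fetch(Nothing(), 'b')(),
    fetch(Nothing(), 'c')(),
)
print(results)
```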

Why do people default owner parameter to None in __get__?

I've seen this quite often:
def __get__(self, instance, owner=None):
Why do some people use the default value of None for the owner parameter?
This is even done in the Python docs:
descr.__get__(self, obj, type=None) --> value
Because the owner can easily be derived from the instance, the second argument is optional. Only when there is no instance to derive an owner from, is the owner argument needed.
This is described in the proposal that introduced descriptors, PEP 252 - Making Types Look More Like Classes:
__get__: a function callable with one or two arguments that
retrieves the attribute value from an object. This is also
referred to as a "binding" operation, because it may return a
"bound method" object in the case of method descriptors. The
first argument, X, is the object from which the attribute must
be retrieved or to which it must be bound. When X is None,
the optional second argument, T, should be meta-object and the
binding operation may return an unbound method restricted to
instances of T.
(Bold emphasis mine).
Binding, from day one, was meant to be applicable to the instance alone, with the type being optional. Methods don't need it, for example, since they can be bound to the instance alone:
>>> class Foo: pass
...
>>> def bar(self): return self
...
>>> foo = Foo()
>>> foo.bar = bar.__get__(foo) # look ma! no class!
>>> foo.bar
<bound method Foo.bar of <__main__.Foo object at 0x10a0c2710>>
>>> foo.bar()
<__main__.Foo object at 0x10a0c2710>
Besides, the second argument can easily be derived from the first argument; witness a classmethod still binding to the class even though we did not pass one in:
>>> classmethod(bar).__get__(foo)
<bound method type.bar of <class '__main__.Foo'>>
>>> classmethod(bar).__get__(foo)()
<class '__main__.Foo'>
The only reason the argument is there in the first place is to support binding to class, e.g. when there is no instance to bind to. The class method again; binding to None as the instance won't work, it only works if we actually pass in the class:
>>> classmethod(bar).__get__(None)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: __get__(None, None) is invalid
>>> classmethod(bar).__get__(None, Foo)
<bound method type.bar of <class '__main__.Foo'>>
This is the standard way to do it; all Python built-in descriptors I've seen do it, including functions, properties, staticmethods, etc. I know of no case in the descriptor protocol where __get__ will be called without the owner argument, but if you want to call __get__ manually, it can be useful not to have to pass an owner. The owner argument usually doesn't do much.
As an example, you might want a cleaner way to give individual objects new methods. The following decorator cleans up the syntax and lets the methods have access to self:
def method_of(instance):
    def method_adder(function):
        setattr(instance, function.__name__, function.__get__(instance))
        return function
    return method_adder
@method_of(a)
def foo(self, arg1, arg2):
    stuff()
Now a has a foo method. We manually used the __get__ method of the foo function to create a bound method object like any other, except that since this method isn't associated with a class, we didn't pass __get__ a class. Pretty much the only difference is that when you print the method object, you see ?.foo instead of SomeClassName.foo.
Because that's how the descriptor protocol is specified:
descr.__get__(self, obj, type=None) --> value
cf https://docs.python.org/2/howto/descriptor.html#descriptor-protocol
The type argument gives access to the class on which the descriptor is looked up when it is looked up on a class instead of an instance. Since you can get the class from the instance, the argument is somewhat redundant when the descriptor is looked up on an instance, so it has been made optional to allow the less verbose desc.__get__(obj) call (instead of desc.__get__(obj, type(obj))).
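For a user-defined descriptor, the same convention might look like the sketch below. `Describe` and `Widget` are hypothetical names; the point is only that defaulting `owner` to None lets the descriptor support manual one-argument calls by deriving the class from the instance.

```python
class Describe:
    def __get__(self, instance, owner=None):
        if owner is None:
            # Manual call without an owner: derive it from the instance.
            owner = type(instance)
        if instance is None:
            return f'accessed on class {owner.__name__}'
        return f'accessed on instance of {owner.__name__}'


class Widget:
    attr = Describe()


w = Widget()
descriptor = Widget.__dict__['attr']

results = [
    w.attr,                   # instance access: Python passes both arguments
    Widget.attr,              # class access: instance is None
    descriptor.__get__(w),    # manual call with the owner omitted
]
print(results)
```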

Descriptors and direct access: Python reference

The Python 3.3 documentation tells me that direct access to a property descriptor should be possible, although I'm skeptical of its syntax x.__get__(a). But the example that I constructed below fails. Am I missing something?
class MyDescriptor(object):
    """Descriptor"""
    def __get__(self, instance, owner):
        print("hello")
        return 42
class Owner(object):
    x = MyDescriptor()
    def do_direct_access(self):
        self.x.__get__(self)
if __name__ == '__main__':
    my_instance = Owner()
    print(my_instance.x)
    my_instance.do_direct_access()
Here's the error I get in Python 2.7 (and also Python 3.2 after porting the snippet of code). The error message makes sense to me, but that doesn't seem to be how the documentation said it would work.
Traceback (most recent call last):
File "descriptor_test.py", line 15, in <module>
my_instance.do_direct_access()
File "descriptor_test.py", line 10, in do_direct_access
self.x.__get__(self)
AttributeError: 'int' object has no attribute '__get__'
shell returned 1
By accessing the descriptor on self you invoked __get__ already. The value 42 is being returned.
For any attribute access, Python will look to the type of the object (so type(self) here) to see if there is a descriptor object there (an object with a .__get__() method, for example), and will then invoke that descriptor.
That's how methods work; a function object is found, which has a .__get__() method, which is invoked and returns a method object bound to self.
If you wanted to access the descriptor directly, you'd have to bypass this mechanism; access x in the __dict__ dictionary of Owner:
>>> Owner.__dict__['x']
<__main__.MyDescriptor object at 0x100e48e10>
>>> Owner.__dict__['x'].__get__(None, Owner)
hello
42
This behaviour is documented right above where you saw the x.__get__(a) direct call:
The default behavior for attribute access is to get, set, or delete the attribute from an object’s dictionary. For instance, a.x has a lookup chain starting with a.__dict__['x'], then type(a).__dict__['x'], and continuing through the base classes of type(a) excluding metaclasses.
The Direct Call scenario in the documentation only applies when you have a direct reference to the descriptor object (not invoked); the Owner.__dict__['x'] expression is such a reference.
Your code on the other hand, is an example of the Instance Binding scenario:
Instance Binding
If binding to an object instance, a.x is transformed into the call: type(a).__dict__['x'].__get__(a, type(a)).
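The transformation can be verified directly. This sketch reuses the names from the question, with a simplified descriptor that just returns 42.

```python
class MyDescriptor:
    def __get__(self, instance, owner=None):
        return 42


class Owner:
    x = MyDescriptor()


a = Owner()

implicit = a.x                                         # descriptor invoked by the dot operator
explicit = type(a).__dict__['x'].__get__(a, type(a))   # the same call spelled out

# By the time a.x has been evaluated, only the int 42 is left,
# which is why calling .__get__ on it fails in the question.
result_has_get = hasattr(a.x, '__get__')

print(implicit, explicit, result_has_get)
```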
