Assume that we are using Python 3.x or newer, not Python 2.x.
Python, like many languages, has a dot-operator:
# Create a new instance of the Rectangle class
robby = Rectangle(3, 10)
# INVOKE THE DOT OPERATOR
x = robby.length
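(For reference, the examples above and below assume a minimal Rectangle class along these lines; the question never shows its definition, and only the length attribute matters here.)

class Rectangle:
    def __init__(self, length, width):
        self.length = length
        self.width = width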
Python's dot-operator is sometimes implemented as __getattribute__.
The following is equivalent to x = robby.length:
x = Rectangle.__getattribute__(robby, "length")
However, the dot-operator is not always implemented as __getattribute__.
Python has "magic methods"
A magic method is any method whose name begins and ends with two underscore characters.
__len__() is an example of a magic method.
You can get a list of most of python's magic methods by executing the following code:
print("\n".join(filter(lambda s: s.startswith("__"), dir(int))))
The output is:
__abs__
__add__
__and__
__bool__
__ceil__
__class__
__delattr__
__dir__
__divmod__
__doc__
__eq__
__float__
[... truncated / abridged ...]
__rtruediv__
__rxor__
__setattr__
__sizeof__
__str__
__sub__
__subclasshook__
__truediv__
__trunc__
__xor__
Suppose we write a class named Rectangle, a subclass of object.
My attempts to override object.__getattribute__ inside the Rectangle class usually fail.
The following shows an example of a class where python sometimes ignores an overridden dot-operator:
class Klass:
    def __getattribute__(self, attr_name):
        return print

obj = Klass()
obj.append()  # WORKS FINE. `obj.append == print`
obj.delete()  # WORKS FINE. `obj.delete == print`
obj.length    # WORKS FINE
obj.x         # WORKS FINE

# None of the following work, because they
# invoke magic methods.

# The following line is similar to:
# x = Klass.__len__(obj)
len(obj)

# obj + 5
# is similar to:
# x = Klass.__add__(obj, 5)
x = obj + 5

# The following line is similar to:
# x = Klass.__radd__(obj, 2)
x = 2 + obj
There is more than one way to override python's dot operator.
What is an example of one way to do it which is readable, clean, and consistent?
By consistent, I mean that our custom dot operator gets called whenever . is used in source code, no matter whether the method is a magic method or not.
I am unwilling to manually type in every single magic method under the sun.
I don't want to see thousands of lines of code which looks like:
def __len__(*args, **kwargs):
    return getattr(args[0], "__len__")(*args, **kwargs)
I understand the distinction between __getattr__ and __getattribute__
Overriding __getattribute__ instead of __getattr__ is not the issue at hand.
__getattribute__ already does what you're literally asking for - overriding __getattribute__ is all you need to handle all uses of the . operator. (Strictly speaking, Python will fall back to __getattr__ if __getattribute__ fails, but you don't have to worry about that as long as you don't implement __getattr__.)
You say you want your operator to be called "whenever . is used in source code", but len, +, and all the other things you're worried about don't use .. There is no . in len(obj), obj + 5, or 2 + obj.
Most magic method lookups don't use attribute access. If you actually look up yourobj.__len__ or yourobj.__add__, that will go through attribute access, and your __getattribute__ will be invoked, but when Python looks up a magic method to implement language functionality, it does a direct search of the object's type's MRO. The . operator is not involved.
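A small demonstration of that (the Empty class here is made up): attaching __len__ to an instance does nothing for len(), but attaching it to the class works, because the lookup goes through the type:

class Empty:
    pass

e = Empty()
e.__len__ = lambda: 3            # attach __len__ to the *instance*
try:
    print(len(e))
except TypeError as exc:
    print(exc)                   # object of type 'Empty' has no len()

Empty.__len__ = lambda self: 3   # attach it to the *class* instead
print(len(e))                    # 3, found via type(e)'s MRO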
There is no way to override magic method lookup. That's a hardcoded process with no override hook. The closest thing you can do is override individual magic methods to delegate to __getattribute__, but that's not the same thing as overriding magic method lookup (or overriding .), and it's easy to get infinite recursion bugs that way.
If all you really want to do is avoid repetitive individual magic method overrides, you could put them in a class decorator or mixin.
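Here is one sketch of the class-decorator idea; the decorator name, the list of delegated methods, and the Wrapper class are all assumptions, not a standard recipe:

def delegate_magic(*names):
    """Generate magic methods that route through __getattribute__."""
    def decorator(cls):
        def make_method(name):
            def method(self, *args, **kwargs):
                # Exactly the kind of boilerplate the question wants to avoid
                # writing by hand, generated in one place instead.
                return cls.__getattribute__(self, name)(*args, **kwargs)
            method.__name__ = name
            return method
        for name in names:
            setattr(cls, name, make_method(name))
        return cls
    return decorator

@delegate_magic("__len__", "__add__")
class Wrapper:
    def __init__(self, value):
        self._value = value

    def __getattribute__(self, attr_name):
        # Custom "dot operator": forward everything except _value itself.
        if attr_name == "_value":
            return object.__getattribute__(self, attr_name)
        return getattr(object.__getattribute__(self, "_value"), attr_name)

w = Wrapper([1, 2, 3])
print(len(w))    # 3: the generated __len__ went through __getattribute__
print(w + [4])   # [1, 2, 3, 4]: the generated __add__ did too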
Related
I have a class as follows:
class Lz:
    def __init__(self, b):
        self.b = b

    def __getattr__(self, item):
        return self.b.__getattribute__(item)
And I create an instance and print it:
a = Lz('abc')
print(a)
Result is: abc
I have set a breakpoint at the line return self.b.__getattribute__(item), and item shows '__str__'.
I don't know why it calls __getattr__, and why item is '__str__', when I access the instance.
print calls __str__ (see this question for details), but as Lz does not have a __str__ method, a lookup for an attribute named '__str__' takes place using __getattr__.
So if you add a __str__ method, __getattr__ should not be called anymore when printing objects of the Lz class.
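A quick check of that suggestion (shown here in Python 3; the original snippet looks like Python 2, but the fix is the same):

class Lz:
    def __init__(self, b):
        self.b = b

    def __getattr__(self, item):
        return getattr(self.b, item)

    def __str__(self):
        # print() now finds __str__ on the class, so __getattr__ is not consulted.
        return str(self.b)

a = Lz('abc')
print(a)   # abc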
print(obj) invokes str(obj) (to get a printable representation), which in turn tries to invoke obj.__str__() (and falls back to something else if this fails, but that's not the point here).
You defined Lz as an old-style class, so it doesn't by default have a __str__ method (new-style classes inherit this from object), but you defined a __getattr__() method, so this is what gets invoked in the end (__getattr__() is the last thing the attribute lookup will invoke when everything else has failed).
NB: in case you don't already know, since everything in Python is an object (including classes, functions, methods, etc.), Python doesn't make a distinction between "data" attributes and "method" attributes: those are all attributes, period.
NB2: directly accessing __magic__ names is considered bad practice. Those names are implementation support for operators or operator-like generic functions (i.e. len(), type() etc.), and you are supposed to use the operator or generic function instead. IOW, this:
return self.b.__getattribute__(item)
should be written as
return getattr(self.b, item)
(getattr() is the generic function version of the "dot" attribute lookup operator (.))
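For example, with the Lz instance a from above, these two lines perform the same lookup:

x = a.b               # dot operator
x = getattr(a, "b")   # generic-function spelling of the same thing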
If I want to change the behavior of an inherited method, I would do something like this:
class a:
    def changeMe(self):
        print('called a')

class b(a):
    def changeMe(self):
        print('called b')
I believe this is an example of overriding in Python.
However, if I want to overload an operator, I do something very similar:
class c:
    aNumber = 0

    def __add__(self, operand):
        print("words")
        return self.aNumber + operand.aNumber
a = c()
b = c()
a.aNumber += 1
b.aNumber += 2
print(a + b) # prints "words\n3"
I thought that maybe the operator methods are really overridden in python since we overload using optional parameters and we just call it operator overloading out of convention.
But it also couldn't be an override, since '__add__' in object.__dict__.keys() is False; a method needs to be a member of the parent class in order to be overridden (and all classes inherit from object when created).
Where is the gap in my understanding?
I guess since the original question specifically asked about the gap in my own understanding, I am best-positioned to answer it. Go figure.
What I failed to understand was that whereas overriding depends on inheritance, overloading does not. Rather, Python matches methods for overloading based on name only.
For a subclass to override a method, the method does indeed need to exist in the parent class. Therefore, the def __add__ portion is not an example of overriding.
(In this case, I also did not fully understand that if the interpreter sees a + operator, it will look to the class of the operands for a definition of the __add__ magic method.)
Because the + operator is essentially an alias for __add__(), the same name is being used. Operator overloading is in fact an example of overloading because we are changing the behavior of the name (+ or __add__) when it is called with novel parameters (in my example, objects of class c).
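To see that in code (reusing a, b, and c from the snippet above):

print(a + b)                          # words, then 3
print(type(a).__add__(a, b))          # words, then 3: the same call, spelled out
print('__add__' in c.__dict__)        # True: defined on c itself
print('__add__' in object.__dict__)   # False: not inherited from object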
Overloading means two methods with the same name but different signatures and return types. Overriding means two methods with the same name, where the subclass's method provides different functionality. The main difference between overloading and overriding is that with overloading we can reuse the same function name with different parameters for different tasks within one class, while with overriding we reuse the same function name, with the same parameters as the base class, in the derived class; this is also a form of code reuse.
I know I should have come up with a better title, but anyway...
Say I make a class that inherits from int in Python:
class Foo(int):
    def is_even(self):
        return self % 2 == 0
and do something like this
a = Foo(3)
b = Foo(5)
print(type(a+b)) #=> <class 'int'>
I understand this behaviour is not surprising at all, as the __add__ called here is defined to return int instances. But I would like to create a class so that a+b returns Foo(8). In other words, I'd like the result of a+b to have the is_even method.
Is there any way I can achieve this conveniently? Or do I have to overwrite __add__ and everything?
Background information: I'm trying to write an interpreter for an esoteric programming language called Grass. In that attempt, I want to have a class that behaves like a 'callable int' (actually, numpy.uint8), whose __call__ would be like:
def __call__(self, other):
    if self == other:
        return lambda x: lambda y: x
    else:
        return lambda x: lambda y: y
There are tricks that you could do with metaclasses (the __metaclass__ class variable in Python 2, or the metaclass= keyword argument in Python 3) or the __getattribute__ special method. But the documentation states:
Bypassing the __getattribute__() machinery in this fashion provides significant scope for speed optimisations within the interpreter, at the cost of some flexibility in the handling of special methods (the special method must be set on the class object itself in order to be consistently invoked by the interpreter)
Which means that if you want to make sure that the parent class is never handled directly, you need to intercept everything. And for int, that is described as emulating numeric types (i.e.: implementing all those methods).
That said, I believe you could implement all those methods in your class quite easily by creating a lambda or generic method that takes two parameters and just calls super on them. And then assign that method to all the specific methods that you need to implement. So you implement once and reuse it.
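A sketch of that idea, assuming the Foo(int) class from the question (the helper name make_op and the list of operators are my own choices):

class Foo(int):
    def is_even(self):
        return self % 2 == 0

def make_op(op_name):
    def method(self, other):
        # Call the inherited int implementation, then re-wrap the result.
        result = getattr(super(Foo, self), op_name)(other)
        return Foo(result) if isinstance(result, int) else result
    return method

# One generic implementation, assigned to every operator we care about.
for name in ("__add__", "__radd__", "__sub__", "__mul__"):
    setattr(Foo, name, make_op(name))

a = Foo(3)
b = Foo(5)
print(type(a + b))        # <class '__main__.Foo'>
print((a + b).is_even())  # True, since 8 is even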
I am trying to implement a class in which an attempt to access any attributes that do not exist in the current class or any of its ancestors will attempt to access those attributes from a member. Below is a trivial version of what I am trying to do.
class Foo:
    def __init__(self, value):
        self._value = value

    def __getattr__(self, name):
        return getattr(self._value, name)

if __name__ == '__main__':
    print(Foo(5) > Foo(4))  # should do 5 > 4 (or (5).__gt__(4))
However, this raises a TypeError. Even using the operator module's attrgetter class does the same thing. I was taking a look at the documentation regarding customizing attribute access, but I didn't find it an easy read. How can I get around this?
If I understand you correctly, what you are doing is correct, but it still won't work for what you're trying to use it for. The reason is that implicit magic-method lookup does not use __getattr__ (or __getattribute__ or any other such thing). The methods have to actually, explicitly be there with their magic names. Your approach will work for normal attributes, but not magic methods. (Note that if you do Foo(5).__lt__(4) explicitly, it will work; it's only the implicit "magic" lookup, e.g. calling __lt__ when < is used, that is blocked.)
This post describes an approach for autogenerating magic methods using a metaclass. If you only need certain methods, you can just define them on the class manually.
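For example, adding just the comparison the question needs, while keeping __getattr__ for ordinary attributes (the unwrapping of the other operand is an assumption about how Foo is meant to be used):

class Foo:
    def __init__(self, value):
        self._value = value

    def __getattr__(self, name):
        # Still forwards ordinary attribute access, e.g. Foo(5).bit_length()
        return getattr(self._value, name)

    def __gt__(self, other):
        other_value = other._value if isinstance(other, Foo) else other
        return self._value > other_value

print(Foo(5) > Foo(4))  # True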
__*__ methods will not work unless they actually exist - so neither __getattr__ nor __getattribute__ will allow you to proxy those calls. You must create every single method manually.
Yes, this does involve quite a bit of copy&paste. And yes, it's perfectly fine in this case.
You might be able to use the werkzeug LocalProxy class as a base or instead of your own class; your code would look like this when using LocalProxy:
from werkzeug.local import LocalProxy

print(LocalProxy(lambda: 5) > LocalProxy(lambda: 4))
class Foo(object):
    pass

foo = Foo()

def bar(self):
    print('bar')

Foo.bar = bar
foo.bar()  # bar
Coming from JavaScript: if a "class" prototype is augmented with a certain attribute, it is known that all instances of that "class" will have that attribute in their prototype chain, hence no modifications have to be made to any of its instances or "sub-classes".
In that sense, how can a Class-based language like Python achieve Monkey patching?
The real question is, how can it not? In Python, classes are first-class objects in their own right. Attribute access on instances of a class is resolved by looking up attributes on the instance, and then the class, and then the parent classes (in the method resolution order.) These lookups are all done at runtime (as is everything in Python.) If you add an attribute to a class after you create an instance, the instance will still "see" the new attribute, simply because nothing prevents it.
In other words, it works because Python doesn't cache attributes (unless your code does), because it doesn't use negative caching or shadowclasses or any of the optimization techniques that would inhibit it (or, when Python implementations do, they take into account the class might change) and because everything is runtime.
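For instance (the class and function names here are made up), instances and subclasses created before the patch still see the new attribute:

class Base(object):
    pass

class Child(Base):
    pass

b = Base()
c = Child()

def greet(self):
    return "hello from %s" % type(self).__name__

Base.greet = greet   # patch the class after the instances already exist

print(b.greet())     # hello from Base
print(c.greet())     # hello from Child, found via the MRO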
I just read through a bunch of documentation, and as far as I can tell, the whole story of how foo.bar is resolved, is as follows:
Can we find foo.__getattribute__ by the following process? If so, use the result of foo.__getattribute__('bar').
(Looking up __getattribute__ will not cause infinite recursion, but the implementation of it might.)
(In reality, we will always find __getattribute__ in new-style objects, as a default implementation is provided in object - but that implementation is of the following process. ;) )
(If we define a __getattribute__ method in Foo, and access foo.__getattribute__, foo.__getattribute__('__getattribute__') will be called! But this does not imply infinite recursion - if you are careful ;) )
Is bar a "special" name for an attribute provided by the Python runtime (e.g. __dict__, __class__, __bases__, __mro__)? If so, use that. (As far as I can tell, __getattribute__ falls into this category, which avoids infinite recursion.)
Is bar in the foo.__dict__ dict? If so, use foo.__dict__['bar'].
Does foo.__mro__ exist (i.e., is foo actually a class)? If so,
For each base-class base in foo.__mro__[1:]:
(Note that the first one will be foo itself, which we already searched.)
Is bar in base.__dict__? If so:
Let x be base.__dict__['bar'].
Can we find (again, recursively, but it won't cause a problem) x.__get__?
If so, use x.__get__(foo, foo.__class__).
(Note that the function bar is, itself, an object, and the Python compiler automatically gives functions a __get__ attribute which is designed to be used this way.)
Otherwise, use x.
For each base-class base of foo.__class__.__mro__:
(Note that this recursion is not a problem: those attributes should always exist, and fall into the "provided by the Python runtime" case. foo.__class__.__mro__[0] will always be foo.__class__, i.e. Foo in our example.)
(Note that we do this even if foo.__mro__ exists. This is because classes have a class, too: its name is type, and it provides, among other things, the method used to calculate __mro__ attributes in the first place.)
Is bar in base.__dict__? If so:
Let x be base.__dict__['bar'].
Can we find (again, recursively, but it won't cause a problem) x.__get__?
If so, use x.__get__(foo, foo.__class__).
(Note that the function bar is, itself, an object, and the Python compiler automatically gives functions a __get__ attribute which is designed to be used this way.)
Otherwise, use x.
If we still haven't found something to use: can we find foo.__getattr__ by the preceding process? If so, use the result of foo.__getattr__('bar').
If everything failed, raise AttributeError.
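A small sanity check of that ordering (the Demo class is made up): the instance __dict__ wins over a plain class attribute, and __getattr__ only runs when everything else has failed.

class Demo:
    bar = "class value"

    def __getattr__(self, name):
        return "from __getattr__: " + name

d = Demo()
print(d.bar)       # class value (found via the class)
d.bar = "instance value"
print(d.bar)       # instance value (instance __dict__ wins)
print(d.missing)   # from __getattr__: missing (the last-resort fallback)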
bar.__get__ is not really a function - it's a "method-wrapper" - but you can imagine it being implemented vaguely like this:
# Somewhere in the Python internals
class __method_wrapper(object):
    def __init__(self, func):
        self.func = func
    def __call__(self, obj, cls):
        return lambda *args, **kwargs: self.func(obj, *args, **kwargs)
        # Except it actually returns a "bound method" object
        # that uses cls for its __repr__
        # and there is a __repr__ for the method_wrapper that I *think*
        # uses the hashcode of the underlying function, rather than of itself,
        # but I'm not sure.

# Automatically done after compiling bar
bar.__get__ = __method_wrapper(bar)
The "binding" that happens within the __get__ automatically attached to bar (called a descriptor), by the way, is more or less the reason why you have to specify self parameters explicitly for Python methods. In Javascript, this itself is magical; in Python, it is merely the process of binding things to self that is magical. ;)
And yes, you can explicitly set a __get__ method on your own objects and have it do special things when you set a class attribute to an instance of the object and then access it from an instance of that other class. Python is extremely reflective. :) But if you want to learn how to do that, and get a really full understanding of the situation, you have a lot of reading to do. ;)
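A minimal example of that last point (all names here are invented): a class with its own __get__ acts as a descriptor when stored as a class attribute on another class.

class Shouty:
    def __init__(self, text):
        self.text = text

    def __get__(self, obj, objtype=None):
        # Invoked automatically when the attribute is reached through Holder.
        return self.text.upper()

class Holder:
    greeting = Shouty("hello")   # class attribute set to a descriptor instance

h = Holder()
print(h.greeting)       # HELLO, via Shouty.__get__
print(Holder.greeting)  # HELLO, also via __get__, with obj=None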