Order of base classes and super() usage in multiple inheritance - python

Could you please help me to understand the difference between these two cases?
class B1:
    def f(self):
        super().temp()

class B2:
    def temp(self):
        print("B2")

class A(B1, B2):
    pass

A().f()
It prints "B2".
If we switch B1 and B2:
class A(B2, B1):
    pass

A().f()
I get AttributeError: 'super' object has no attribute 'temp'

Python uses something called C3 linearization to decide what order the base classes are in: the "method resolution order". This has basically two parts when stated informally:
A class must always come before its superclasses in the order, even indirect ones. As such, no cycles are allowed, and issubclass(X, Y) and issubclass(Y, Z) together imply issubclass(X, Z).
Where the rule above does not force an order, classes are ordered by the number of steps to the super-most class (fewer steps means earlier in the chain) and then by their position in the class lists (earlier in the list means earlier in the chain).
The hierarchy here is:
      A
     / \
    /   \
  B1     B2   # Possibly switched
    \   /
     \ /
    object
In the first case the order after C3 linearization is
  super  super  super
A   →  B1  →  B2  →  object
which we can find out with:
A.mro()
#>>> [<class 'A'>, <class 'B1'>, <class 'B2'>, <class 'object'>]
So the super() calls would resolve as:
class A(B1, B2):
    pass

class B1:
    def f(self):
        # super() proxies the next link in the chain,
        # which is B2. It implicitly passes self along.
        B2.temp(self)

class B2:
    def temp(self):
        print("B2")
so calling A().f() tries:
Is f on the instance? No, so
Is f on the first class, A? No, so
Is f on the next class, B1? Yes!
Then B1.f is called, and this calls B2.temp(self), which checks:
Is temp on the class, B2? Yes!
And it is called, printing B2.
In the second case we have
  super  super  super
A   →  B2  →  B1  →  object
So the super() calls would resolve as:
class A(B2, B1):
    pass

class B2:
    def temp(self):
        print("B2")

class B1:
    def f(self):
        # super() proxies the next link in the chain,
        # which is object. It implicitly passes self along.
        object.temp(self)
Is f on the instance? No, so
Is f on the first class, A? No, so
Is f on the next class, B2? No, so
Is f on the next class, B1? Yes!
So B1.f is called, and this calls object.temp(self), which checks:
Is temp on the class, object? No.
There are no more superclasses, so we have failed to find the attribute.
Raise AttributeError("{!r} object has no attribute {!r}".format(instance, attribute_name)).
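The two cases can be reproduced side by side. This is a minimal Python 3 sketch; the class names First and Second are hypothetical stand-ins for the two versions of A, and temp returns a string instead of printing so the result can be inspected.

```python
class B1:
    def f(self):
        # Delegates to whichever class follows B1 in the MRO of type(self).
        return super().temp()


class B2:
    def temp(self):
        return "B2"


class First(B1, B2):   # MRO: First, B1, B2, object
    pass


class Second(B2, B1):  # MRO: Second, B2, B1, object
    pass


first_result = First().f()

try:
    Second().f()
    second_error = None
except AttributeError as exc:
    # After B1 only object remains, and object has no temp().
    second_error = str(exc)

print(first_result)
print(second_error)
```

The only difference between the two classes is the order of the bases, which is enough to change what follows B1 in the chain.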

The difference is simply the order of classes in the MRO of class A in the two cases:
class A1(B1, B2):
    pass

class A2(B2, B1):
    pass

print(A1.mro())
print(A2.mro())
Which returns:
[<class '__main__.A1'>, <class '__main__.B1'>, <class '__main__.B2'>, <class 'object'>]
and
[<class '__main__.A2'>, <class '__main__.B2'>, <class '__main__.B1'>, <class 'object'>]
Now when you call A1().f() or A2().f(), the attribute f is found in B1, and there you call super().temp(), which means: call temp() on the next class after B1 in the MRO, moving on to the class after that if temp is not found, and so on.
In the case of A2, the next and only class after B1 is object, which has no temp() method, so an error is raised.
In the case of A1, the next class after B1 is B2, which has a temp() method, so no error is raised.
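The "next class in the MRO" search can be imitated by hand. This is only a rough Python 3 sketch of the idea, not how super() is actually implemented; the helper next_provider is a hypothetical name introduced for illustration.

```python
class B1:
    def f(self):
        return super().temp()


class B2:
    def temp(self):
        return "B2"


class A1(B1, B2):
    pass


class A2(B2, B1):
    pass


def next_provider(instance, current_class, name):
    """Search the MRO of type(instance), starting just after current_class,
    and return the first class that defines name (or None)."""
    mro = type(instance).mro()
    start = mro.index(current_class) + 1
    for klass in mro[start:]:
        if name in vars(klass):
            return klass
    return None


found_for_a1 = next_provider(A1(), B1, "temp")  # B2 follows B1 in A1's MRO
found_for_a2 = next_provider(A2(), B1, "temp")  # only object follows B1
print(found_for_a1, found_for_a2)
```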

Related

Class inheritance via super with two arguments

In the code below, I replaced the arguments with numbers to demonstrate which classes are inherited.
class Animal:
    def __init__(self, animalName):
        print(animalName, 'is an animal.')

class Mammal(Animal):
    def __init__(self, mammalName):
        print(mammalName, 'is a mammal.')
        super().__init__(mammalName)

class CannotFly(Mammal):
    def __init__(self, mammalThatCantFly):
        print('2', "cannot fly.")
        super().__init__('2')

class CannotSwim(Mammal):
    def __init__(self, mammalThatCantSwim):
        print('1', "cannot swim.")
        super().__init__('1')

# Cat inherits CannotSwim and CannotFly
class Cat(CannotSwim, CannotFly):
    def __init__(self):
        print('I am a cat.')
        super().__init__('Cat')

cat = Cat()
returns
I am a cat.
1 cannot swim.
2 cannot fly.
2 is a mammal.
2 is an animal.
Why is it not the below?
I am a cat.
1 cannot swim.
1 is a mammal.
1 is an animal.
2 cannot fly.
2 is a mammal.
2 is an animal.
There are effectively two call streams, no?
You can see the method resolution order (MRO) for Cat:
>>> Cat.mro()
[<class '__main__.Cat'>, <class '__main__.CannotSwim'>, <class '__main__.CannotFly'>, <class '__main__.Mammal'>, <class '__main__.Animal'>, <class 'object'>]
Each class appears once in the MRO, due to the C3 linearization algorithm. Very briefly, this constructs the MRO from the inheritance graph using a few simple rules:
Each class in the graph appears once.
Each class precedes any of its parent classes.
When a class has two parents, the left-to-right order of the parents is preserved.
("Linearization", because it produces a linear ordering of the nodes in the inheritance graph.)
super() is misnamed; a better name would have been something like nextclass, because it does not use the current class's list of parents, but the MRO of the self argument. When you call Cat, you are seeing the following calls:
1. Cat.__init__
2. Cat.__init__ uses super to call CannotSwim.__init__
3. CannotSwim.__init__ uses super to call CannotFly.__init__
4. CannotFly.__init__ uses super to call Mammal.__init__
5. Mammal.__init__ uses super to call Animal.__init__
6. Animal.__init__ does not use super in this code, so the chain ends there and object.__init__ is never reached.
In particular, note #3: CannotSwim causes its "sibling" in the inheritance graph to be used, not its own parent.
Take a look at this post: What does 'super' do in Python? - difference between super().__init__() and explicit superclass __init__()
What it says is that super() follows the instance's MRO, so each __init__ that calls super().__init__() hands over to the next class in that order.
You can print the MRO with print(Cat.__mro__), which prints
(<class '__main__.Cat'>, <class '__main__.CannotSwim'>, <class '__main__.CannotFly'>, <class '__main__.Mammal'>, <class '__main__.Animal'>, <class 'object'>)
As you can see, that is the order of the calls. As for why '2' is used and not '1': CannotSwim.__init__ passes '1' to CannotFly.__init__, which ignores its argument and passes '2' up the chain (see also the answer in the comments by hussic).
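To see the single call stream directly, the prints can be swapped for a log. This is a Python 3 sketch of the question's classes; the calls list and the shortened parameter name are introduced here for illustration only.

```python
calls = []  # records (class name, argument received) in call order


class Animal:
    def __init__(self, name):
        calls.append(("Animal", name))


class Mammal(Animal):
    def __init__(self, name):
        calls.append(("Mammal", name))
        super().__init__(name)


class CannotFly(Mammal):
    def __init__(self, name):
        calls.append(("CannotFly", name))
        super().__init__('2')  # the received argument is dropped here


class CannotSwim(Mammal):
    def __init__(self, name):
        calls.append(("CannotSwim", name))
        super().__init__('1')  # next in Cat's MRO is CannotFly, not Mammal


class Cat(CannotSwim, CannotFly):
    def __init__(self):
        calls.append(("Cat", None))
        super().__init__('Cat')


Cat()
for entry in calls:
    print(entry)
```

The log shows CannotFly receiving '1' from CannotSwim, and Mammal and Animal each running exactly once with '2'.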

Converting a subclass instance into its parent class instance

Let's say I have two classes, A and B, where B inherits from A. B overrides some methods of A and has a couple more attributes. Once I have created an object b of type B, is it possible to convert it into type A, and only A? This is to get the base-class behavior of the methods.
I don't know how safe it is, but you can reassign the __class__ attribute of the object.
class A:
    def f(self):
        print("A")

class B(A):
    def f(self):
        print("B")

b = B()
b.f()  # prints B
b.__class__ = A
b.f()  # prints A
This only changes the class of the object, it doesn't update any of the attributes. In Python, attributes are added dynamically to objects, and nothing intrinsically links them to specific classes, so there's no way to automatically update the attributes if you change the class.
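If the goal is only to run A's version of a method, it may not be necessary to change the class at all. The Python 3 sketch below compares calling the parent implementation directly with reassigning __class__; the extra attribute is a hypothetical addition used to show that instance attributes survive the reassignment.

```python
class A:
    def f(self):
        return "A"


class B(A):
    def __init__(self):
        self.extra = 42  # attribute set only by B's initializer

    def f(self):
        return "B"


b = B()

# Option 1: call the parent implementation explicitly; b stays a B.
direct = A.f(b)
via_super = super(B, b).f()

# Option 2: reassign __class__; b now behaves as an A from here on.
b.__class__ = A
after = b.f()
leftover = vars(b)  # attributes set by B.__init__ are still there

print(direct, via_super, after, leftover)
```

Option 1 leaves the object untouched, which is usually the less surprising choice when only one call needs the base behavior.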

method resolution order MRO

Why, after searching B, does it not go deeper to search Y or Z, but goes on to search A?
Y is a parent of A, so it should search A first, but Y is also a parent of B, so it should search Y first. Why does this not throw an MRO error?
Can someone explain how this lookup works?
class X(object): pass
class Y(object): pass
class Z(object): pass
class A(X, Y): pass
class B(Y, Z): pass
class M(B, A, Z): pass
print(M.__mro__)
gives
(<class '__main__.M'>, <class '__main__.B'>, <class '__main__.A'>, <class '__main__.X'>, <class '__main__.Y'>, <class '__main__.Z'>, <type 'object'>)
In your specific example, after searching B, we can't consider Y immediately because it is a parent of A, and A has not been searched yet. We can't consider Z immediately because M inherits from A before it inherits from Z.
Python uses the C3 method resolution order.
C3 resolution order solves the diamond inheritance problem well.
In the example below, we have a very generic class Object that's a superclass of B and C. We only want method implementations (say of __repr__ or something) in Object to be considered if neither B nor C have an implementation.
  Object
  /    \
 B      C
  \    /
    A
In other words, each possible parent in the transitive closure of the parent classes of A is considered, but the classes are ordered according to the "latest" path from the base class to the class in question.
There are two paths to object:
A -> B -> Object
A -> C -> Object
The "latest" path is A -> C -> Object because A -> B -> Object would be earlier in a left-biased depth-first search.
C3 linearization satisfies two key invariants:
if X inherits from Y, X is checked before Y.
if Z inherits from U and then V in that order, U is checked before V.
Indeed C3 linearization guarantees that both of those properties hold.
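Both invariants can be checked mechanically against the MRO from the question. This is a Python 3 sketch; respects_invariants is a hypothetical helper name, and it only tests the two properties listed above rather than reimplementing C3.

```python
class X: pass
class Y: pass
class Z: pass
class A(X, Y): pass
class B(Y, Z): pass
class M(B, A, Z): pass


def respects_invariants(mro):
    """Return True if every class precedes its direct bases and each
    class's bases appear in their declared left-to-right order."""
    position = {klass: index for index, klass in enumerate(mro)}
    for klass in mro:
        bases = klass.__bases__
        if any(position[klass] >= position[base] for base in bases):
            return False
        if any(position[a] >= position[b] for a, b in zip(bases, bases[1:])):
            return False
    return True


mro = M.mro()
print([klass.__name__ for klass in mro])
print(respects_invariants(mro))
```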
It's possible to construct hierarchies that can't be linearized, in which case you get an exception at class definition time.
running inherit.py
class E: pass
class F: pass
class A(E, F): pass
class B(F, E): pass
class Z(A, B): pass
produces the following error.
Traceback (most recent call last):
  File "inherit.py", line 5, in <module>
    class Z(A, B): pass
TypeError: Cannot create a consistent method resolution
order (MRO) for bases E, F
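Because the failure happens when the class statement runs, it can be caught like any other exception. A small Python 3 sketch, assuming the same E, F, A, B hierarchy; the message variable is introduced here for illustration.

```python
class E: pass
class F: pass
class A(E, F): pass  # requires E before F
class B(F, E): pass  # requires F before E

try:
    # The class statement itself raises; no instance is ever created.
    class Z(A, B):
        pass
    message = None
except TypeError as exc:
    message = str(exc)

print(message)
```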

Grandchild inheriting from Parent class - Python

I am learning all about Python classes and I have a lot of ground to cover.
I came across an example that got me a bit confused.
These are the parent classes
Class X
Class Y
Class Z
Child classes are:
Class A (X,Y)
Class B (Y,Z)
Grandchild class is:
Class M (A,B,Z)
Doesn't Class M inherit Class Z through inheriting from Class B or what would the reason be for this type of structure? Class M would just ignore the second time Class Z is inherited wouldn't it be, or am I missing something?
Class M would just inherit the Class Z attributes twice (redundant) wouldn't it be, or am I missing something?
No, there are no "duplicated" attributes. Python performs a linearization called the Method Resolution Order (MRO). You are, however, correct that adding Z to the list here does not change anything.
They first construct MRO's for the parents, so:
MRO(X) = (X,object)
MRO(Y) = (Y,object)
MRO(Z) = (Z,object)
MRO(A) = (A,X,Y,object)
MRO(B) = (B,Y,Z,object)
and then they construct an MRO for M by merging:
MRO(M) = (M,) + merge((A,X,Y,object), (B,Y,Z,object), (Z,object))
       = (M,A,X,B,Y,Z,object)
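The merge step can be sketched in a few lines. This is an illustrative Python 3 reimplementation, not CPython's actual code; c3_merge and c3_mro are hypothetical names, and the sketch also merges in the list of direct bases, as the usual statement of C3 does.

```python
def c3_merge(sequences):
    """Repeatedly take the first head that appears in no other tail."""
    sequences = [list(seq) for seq in sequences if seq]
    result = []
    while sequences:
        for seq in sequences:
            head = seq[0]
            if not any(head in other[1:] for other in sequences):
                break
        else:
            raise TypeError("cannot create a consistent MRO")
        result.append(head)
        # Drop the chosen class everywhere, then discard emptied lists.
        sequences = [[c for c in seq if c is not head] for seq in sequences]
        sequences = [seq for seq in sequences if seq]
    return result


def c3_mro(cls):
    bases = list(cls.__bases__)
    return [cls] + c3_merge([c3_mro(base) for base in bases] + [bases])


class X: pass
class Y: pass
class Z: pass
class A(X, Y): pass
class B(Y, Z): pass
class M(A, B, Z): pass

print([klass.__name__ for klass in c3_mro(M)])
```

Comparing c3_mro(M) with the built-in M.mro() is a quick way to check the sketch against the interpreter.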
Now each time you look up an attribute such as a method, Python will first check whether it is in the internal dictionary self.__dict__ of that object. If not, Python will walk through the MRO and attempt to find an attribute with that name. As soon as it finds one, it stops searching.
Finally super() is a proxy-object that does the same resolution, but starts in the MRO at the stage of the class. So in this case if you have:
class B(Y, Z):
    def foo(self):
        super().bar()
and you construct an object m = M() and call m.foo() then - given the foo() of B is called, super().bar will first attempt to find a bar in Y, if that fails, it will look for a bar in Z and finally in object.
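A runnable Python 3 version of this scenario is sketched below. Putting bar on X and Z is an assumption made here to show that super() inside B only looks at classes after B in the MRO, even though X comes earlier in M's full MRO.

```python
class X:
    def bar(self):
        return "X.bar"


class Y:
    pass


class Z:
    def bar(self):
        return "Z.bar"


class A(X, Y):
    pass


class B(Y, Z):
    def foo(self):
        # Starts searching after B in type(self).mro(): Y, Z, object.
        return super().bar()


class M(A, B, Z):
    pass


m = M()
from_super = m.foo()   # Y has no bar, so Z.bar is used
from_lookup = m.bar()  # normal lookup starts at M and finds X.bar first
print(from_super, from_lookup)
```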
Attributes are not inherited twice. If you add an attribute like:
self.qux = 1425
then it is simply added to the internal self.__dict__ dictionary of that object.
Stating Z explicitly can, however, be beneficial if the designer of B is not sure whether Z is a real requirement of B: you then know for sure that Z will still be in the MRO of M if B is later altered.
Apart from what @Willem has mentioned, I would like to add that you are talking about the multiple inheritance problem. In Python, object instantiation is a bit different compared to other languages like Java. Here, object instantiation is divided into two parts: object creation (using the __new__ method) and object initialization (using the __init__ method). Moreover, a child class does not necessarily have its parent class's instance attributes. The child gets the attributes that the parent's __init__ sets only if that __init__ actually runs, for example because the child's __init__ invokes it explicitly.
>>> class A(object):
...     def __init__(self):
...         self.a = 23
...
>>> class B(A):
...     def __init__(self):
...         self.b = 33
...
>>> class C(A):
...     def __init__(self):
...         self.c = 44
...         super(C, self).__init__()
...
>>> a = A()
>>> b = B()
>>> c = C()
>>> print(a.a)
23
>>> print(b.a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'B' object has no attribute 'a'
>>> print(c.a)
23
In the above code snippet, B does not invoke A's __init__ method, so its instances do not have a as a member variable, despite the fact that B inherits from A. This is not the case in a language like Java, where a class has a fixed template of attributes. This is how Python differs from other languages.
The attributes that an object has are stored in the object's __dict__ member, and it is the __getattribute__ magic method in the object class that implements attribute lookup according to the MRO described by Willem. You can use vars() and dir() to introspect an instance.
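As a quick illustration of that introspection, vars() shows exactly which instance attributes each initializer created. This Python 3 sketch reuses the A, B, C classes from the snippet above; the variable names are introduced here.

```python
class A:
    def __init__(self):
        self.a = 23


class B(A):
    def __init__(self):
        self.b = 33  # A.__init__ is never called


class C(A):
    def __init__(self):
        self.c = 44
        super().__init__()  # runs A.__init__, which sets self.a


b_attrs = vars(B())  # instance dictionary of a B
c_attrs = vars(C())  # instance dictionary of a C
b_names = dir(B())   # instance attributes plus names found via the classes

print(b_attrs)
print(c_attrs)
print("a" in b_names, "__init__" in b_names)
```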

Simple class inheritance in Python

I am fairly new to Python and OOP. Suppose I have two classes, Base1 and Base2. Suppose also that Base1 has computed some values a1, b1 and that Base2 has a method that multiplies two values. My question is, what is the correct way of passing a1 and b1 of Base1 to multiply in Base2?
One way to do this by defining a derived class, Derived as follows:
class Base1:
    def __init__(self, x1, y1):
        # perform some computation on x1 and y1
        # store result in a1 and b1
        self.a1 = x1
        self.b1 = y1

class Base2:
    def __init__(self, x2, y2):
        self.a2 = x2
        self.b2 = y2
        self.c2 = self.multiply(self.a1, self.b1)  # using a1 and b1 of Base1

    def multiply(self, p, q):
        return p * q

class Derived(Base1, Base2):
    def __init__(self):
        self.describe = 'Derived Class'
        Base1.__init__(self, 3, 4)
        Base2.__init__(self, 5, 6)
Then:
>>> f = Derived()
>>> f.c2
12
However, in a more complex situation, it is easy to lose track of where self.a1, self.b1 came from. It is also not obvious to me why the two base classes can access the attributes and the methods of each other in this way?
Edit: This is Python 2.7.10.
In Python 2, always inherit from object. Otherwise you get old-style classes, which you should not use:
class Base1(object):
    def __init__(self, x1, y1):
        # perform some computation on x1 and y1
        # store result in a1 and b1
        self.a1 = x1
        self.b1 = y1

class Base2(object):
    def __init__(self, x2, y2):
        self.a2 = x2
        self.b2 = y2
        self.c2 = self.multiply(self.a1, self.b1)  # using a1 and b1 of Base1

    def multiply(self, p, q):
        return p * q

class Derived(Base1, Base2):
    def __init__(self):
        self.describe = 'Derived Class'
        Base1.__init__(self, 3, 4)
        Base2.__init__(self, 5, 6)
Python looks for methods using the method resolution order (mro). You can find out the current order:
>>> Derived.mro()
[__main__.Derived, __main__.Base1, __main__.Base2, object]
That means Python looks for a method multiply() in the class Derived first. If it finds it there, it will use it. Otherwise it keeps searching along the MRO until it finds it. Try changing the order of Base1 and Base2 in Derived(Base1, Base2) and check how this affects the MRO:
class Derived2(Base2, Base1):
    pass
>>> Derived2.mro()
[__main__.Derived2, __main__.Base2, __main__.Base1, object]
The self always refers to the instance. In this case f (f = Derived()). It does not matter how f gets its attributes. The assignment self.x = something can happen in any method of any of the classes involved.
TL;DR
Python is dynamic. It doesn't check if attributes are present until the actual line of code that tries to access them. So your code happens to work. Just because you can do this, doesn't mean you should, though. Python depends on you to make good decisions in organizing your code rather than trying to protect you from doing dumb things; we're all adults here.
Why you can access variables
The reason really boils down to the fact that Python is a dynamic language. No types are assigned to variables, so Python doesn't know ahead of time what to expect in that variable. Alongside that design decision, Python doesn't actually check for the existence of an attribute until it actually tries to access the attribute.
Let's modify Base2 a little bit to get some clarity. First, make Base1 and Base2 inherit from object. (That's necessary so we can tell what types we're actually dealing with.) Then add the following prints to Base2:
class Base2(object):
    def __init__(self, x2, y2):
        print type(self)
        print id(self)
        self.a2 = x2
        self.b2 = y2
        self.c2 = self.multiply(self.a1, self.b1)  # using a1 and b1 of Base1

    def multiply(self, p, q):
        return p * q
Now let's try it out:
>>> d = Derived()
<class '__main__.Derived'>
42223600
>>> print id(d)
42223600
So we can see that even in Base2's initializer, Python knows that self contains a Derived instance. Because Python uses duck typing, it doesn't check ahead of time whether self has a1 or b1 attributes; it just tries to access them. If they are there, it works. If they are not, it throws an error. You can see this by instantiating an instance of Base2 directly:
>>> Base2(1, 2)
<class '__main__.Base2'>
41403888
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in __init__
AttributeError: 'Base2' object has no attribute 'a1'
Note that even with the error, it still executes the print statements before trying to access a1. Python doesn't check that the attribute is there until the line of code is executed.
We can get even crazier and add attributes to objects as the code runs:
>>> b = Base1(1,2)
>>> b.a1
1
>>> b.notyet
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'Base1' object has no attribute 'notyet'
>>> b.notyet = 'adding an attribute'
>>> b.notyet
'adding an attribute'
How you should organize this code
Base2 should not try to access those variables without inheriting from Base1. Even though it's possible to do this if you only ever instantiate instances of Derived, you should assume that someone else might use Base2 directly or create a different class that inherits from Base2 and not Base1. In other words, you should just ignore that this is possible. A lot of things are like that in Python. It doesn't restrict you from doing them, but you shouldn't do them because they will confuse you or other people or cause problems later. Python is known for not trying to restrict functionality and depending on you, the developer, to use the language wisely. The community has a catchphrase for that approach: we're all adults here.
I'm going to assume that Base2 is primarily intended to be just a mix-in to provide the multiply method. In that case, we should define c2 on the subclass, Derived, since it will have access to both multiply and the attributes a1 and b1.
For a purely derived value, you should use a property:
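class Derived(Base1, Base2):
    def __init__(self):
        self.describe = 'Derived Class'
        Base1.__init__(self, 3, 4)
        # Assumes Base2.__init__ no longer assigns self.c2; assigning to a
        # read-only property would raise AttributeError.
        Base2.__init__(self, 5, 6)

    @property
    def c2(self):
        return self.multiply(self.a1, self.b1)  # using a1 and b1 of Base1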
This prevents callers from changing the value (unless you explicitly create a setter) and avoids the issue of tracking where it came from. It will always be computed on the fly, even though using it looks like just using a normal attribute:
x = Derived()
print x.c2
This would give 12 as expected.
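The behavior described above can be checked with a self-contained Python 3 sketch. It assumes Base2 is reduced to a pure mix-in with no __init__ of its own, which is a simplification made here and not part of the original code.

```python
class Base1:
    def __init__(self, x1, y1):
        self.a1 = x1
        self.b1 = y1


class Base2:
    # Assumed to be a pure mix-in: it only contributes multiply().
    def multiply(self, p, q):
        return p * q


class Derived(Base1, Base2):
    def __init__(self):
        Base1.__init__(self, 3, 4)

    @property
    def c2(self):
        # Computed on every access from the current a1 and b1.
        return self.multiply(self.a1, self.b1)


d = Derived()
value = d.c2       # computed from a1=3, b1=4
d.a1 = 10
updated = d.c2     # reflects the new a1 without any extra bookkeeping

try:
    d.c2 = 99      # no setter was defined, so assignment is rejected
    assign_error = None
except AttributeError as exc:
    assign_error = exc

print(value, updated, assign_error)
```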
Alternatively, you can provide a multiply method in a class that inherits from Base1 and so can assume that a1 and b1 have been defined by the base class.
The code will look like this:
class Base1(object):
    def __init__(self, a1, b1):
        self.a1 = a1
        self.b1 = b1

class Base2(Base1):
    def multiply(self):
        return self.a1 * self.b1
Here, as you haven't provided an __init__ for Base2, it will use the __init__ method of Base1, which takes a1 and b1 as parameters.
So now:
base = Base2(5,4)
print(base.multiply())
