This is a follow-up from this question:
Effective Java 2nd Edition, Item 17: Design and document for inheritance, or else prohibit it:
There are a few more restrictions that a class must obey to allow inheritance. Constructors must not invoke overridable methods, directly or indirectly. If you violate this rule, program failure will result. The superclass constructor runs before the subclass constructor, so the overriding method in the subclass will be invoked before the subclass constructor has run. If the overriding method depends on any initialization performed by the subclass constructor, the method will not behave as expected.
Since we are only going to touch the initialization aspect of Java's constructor, I assume it's safe to compare Java's constructor with Python's __init__() in this question.
My guess is that, since Python lets me decide when to call my ancestor's __init__() (in this case, after initializing my current class's data attributes), whereas in Java super() must be the first statement in a constructor, I might be safe in calling overridden methods from __init__()?
If I am correct in guessing the aforementioned, wouldn't I, as an ancestor class designer, be left at the mercy of my subclass designer? If the subclass designer calls my __init__() before he initializes his data attributes which will be used by one of his overridden methods, and I call that method, the result is program failure!
I might be safe in calling overridden methods from __init__()?
You are. The self binding has already happened before __init__() is evaluated.
ancestor class designer getting left at the mercy of my subclass designer?
That's universally true and has nothing to do with __init__() or anything else.
Subclass developers can do anything. Also, this is Python: they have your source, and can modify that also.
If the subclass designer calls my __init__() before he initializes his data attributes which will be used by one of his overridden methods, and I call that method, the result is program failure!
Correct. The way this will happen is that the subclass designer picked a name which you were already using. Through their choice of name, they ensured that the program would fail.
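The ordering issue discussed above can be sketched as follows. This is only an illustration: the class names Base, SafeChild and FragileChild and the attribute label are hypothetical, not taken from the question.

```python
class Base:
    def __init__(self):
        # Base.__init__ calls a method that subclasses may override.
        self.description = self.describe()

    def describe(self):
        return "base"


class SafeChild(Base):
    def __init__(self):
        self.label = "safe"   # set the attribute first...
        super().__init__()    # ...then let Base.__init__ call describe()

    def describe(self):
        return f"child:{self.label}"


class FragileChild(Base):
    def __init__(self):
        super().__init__()    # describe() runs before self.label exists
        self.label = "fragile"

    def describe(self):
        return f"child:{self.label}"


safe = SafeChild()

try:
    FragileChild()
    fragile_error = None
except AttributeError as exc:
    fragile_error = exc       # describe() could not find self.label
```

Here the subclass author decides whether the base class call is safe, which is the "at the mercy of my subclass designer" situation from the question.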
Related
When (and why) was the Python __new__() function introduced?
There are three steps in creating an instance of a class, e.g. MyClass():
MyClass.__call__() is called. This method must be defined in the metaclass of MyClass.
MyClass.__new__() is called (by __call__). Defined on MyClass itself. This creates the instance.
MyClass.__init__() is called (also by __call__). This initializes the instance.
Creation of the instance can be influenced either by overloading __call__ or __new__. There usually is little reason to overload __call__ instead of __new__ (e.g. Using the __call__ method of a metaclass instead of __new__?).
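The three steps listed above can be observed with a small tracing sketch, assuming Python 3 syntax. TracingMeta and the calls list are hypothetical names used only for this illustration.

```python
calls = []


class TracingMeta(type):
    def __call__(cls, *args, **kwargs):
        # Step 1: MyClass(...) first reaches the metaclass's __call__.
        calls.append("metaclass __call__")
        # type.__call__ then runs __new__ and __init__ for us.
        return super().__call__(*args, **kwargs)


class MyClass(metaclass=TracingMeta):
    def __new__(cls, *args, **kwargs):
        # Step 2: create the (still uninitialized) instance.
        calls.append("__new__")
        return super().__new__(cls)

    def __init__(self, value):
        # Step 3: initialize the instance that __new__ returned.
        calls.append("__init__")
        self.value = value


obj = MyClass(42)
# calls is now ["metaclass __call__", "__new__", "__init__"]
```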
We have some old code (still running strong!) where __call__ is overloaded. The reason given was that __new__ was not available at the time. So I tried to learn more about the history of both Python and our code, but I could not figure out when __new__ was introduced.
__new__ appears in the documentation for Python 2.4 and not in that for Python 2.3, but it does not appear in the "What's New" document of any of the Python 2 versions. The first commit that introduced __new__ (Merge of descr-branch back into trunk.) that I could find is from 2001, but the 'back into trunk' message is an indication that there was something before. PEP 252 (Making Types Look More Like Classes) and PEP 253 (Subtyping Built-in Types) from a few months earlier seem to be relevant.
Learning more about the introduction of __new__ would teach us more about why Python is the way it is.
Edit for clarification:
It seems that class.__new__ duplicates functionality that is already provided by metaclass.__call__. It seems un-Pythonic to add a method only to replicate existing functionality in a better way.
__new__ is one of the few class methods that you get out of the box (i.e. with cls as first argument), thereby introducing complexity that wasn't there before. If the class is the first argument of a function, then it can be argued that the function should be a normal method of the metaclass. But that method did already exist: __call__(). I feel like I'm missing something.
There should be one-- and preferably only one --obvious way to do it.
The blog post The Inside Story on New-Style Classes (from the aptly named http://python-history.blogspot.com), written by Guido van Rossum (Python's BDFL), provides some good information regarding this subject.
Some relevant quotes:
New-style classes introduced a new class method __new__() that lets
the class author customize how new class instances are created. By
overriding __new__() a class author can implement patterns like the
Singleton Pattern, return a previously created instance (e.g., from a
free list), or to return an instance of a different class (e.g., a
subclass). However, the use of __new__ has other important
applications. For example, in the pickle module, __new__ is used to
create instances when unserializing objects. In this case, instances
are created, but the __init__ method is not invoked.
Another use of __new__ is to help with the subclassing of immutable
types. By the nature of their immutability, these kinds of objects can
not be initialized through a standard __init__() method. Instead, any
kind of special initialization must be performed as the object is
created; for instance, if the class wanted to modify the value being
stored in the immutable object, the __new__ method can do this by
passing the modified value to the base class __new__ method.
You can read the entire post for more information on this subject.
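As a rough sketch of the unserializing use case mentioned in the quote, an instance can be created through __new__ alone, so that __init__ never runs. This is not pickle's actual implementation, only an illustration of the idea; the Point class and the init_ran flag are hypothetical.

```python
class Point:
    def __init__(self, x, y):
        self.init_ran = True   # marker so we can tell whether __init__ ran
        self.x = x
        self.y = y


# Create the instance without initializing it...
raw = Point.__new__(Point)

# ...then restore previously saved state directly.
raw.__dict__.update({"x": 1, "y": 2})
```

After this, raw is a Point with x and y set, but it has no init_ran attribute because __init__ was never invoked.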
Another post about New-style Classes which was written along with the above quoted post has some additional information.
Edit:
In response to OP's edit and the quote from the Zen of Python, I would say this.
The Zen of Python was written not by the creator of the language but by Tim Peters, and was published only on August 19, 2004. We have to take into account the fact that __new__ appears only in the documentation of Python 2.4 (which was released on November 30, 2004), and this particular guideline (or aphorism) did not even exist publicly when __new__ was introduced into the language.
Even if such a document of guidelines existed informally before, I do not think that the author(s) intended them to be misinterpreted as a design document for an entire language and ecosystem.
I will not explain the history of __new__ here because I have only used Python since 2005, that is, after it was introduced into the language. But here is the rationale behind it.
The normal configuration method for a new object is the __init__ method of its class. The object has already been created (usually via an indirect call to object.__new__) and the method just initializes it. Simply put, if you have a truly immutable object, it is too late by then.
In that use case the Pythonic way is the __new__ method, which builds and returns the new object. The nice thing about it is that it is still part of the class definition and does not require a specific metaclass. The standard documentation states:
__new__() is intended mainly to allow subclasses of immutable types (like int, str, or tuple) to customize instance creation. It is also commonly overridden in custom metaclasses in order to customize class creation.
Defining a __call__ method on the metaclass is indeed allowed but is IMHO un-Pythonic, because __new__ should be enough. In addition, __init__, __new__ and metaclasses each dig deeper into the internal Python machinery. So the rule should be: do not use __new__ if __init__ is enough, and do not use metaclasses if __new__ is enough.
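The immutable-type use case can be sketched like this. UpperStr and TooLate are hypothetical classes written for this illustration; the point is that the value of a str must be chosen in __new__, because by the time __init__ runs the string already exists.

```python
class UpperStr(str):
    def __new__(cls, value):
        # Pass the modified value to the base class __new__.
        return super().__new__(cls, value.upper())


class TooLate(str):
    def __init__(self, value):
        # The string has already been created with the original value;
        # there is no way to change its contents from here.
        pass


shouted = UpperStr("abc")     # value was modified during creation
unchanged = TooLate("abc")    # value could not be modified afterwards
```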
Functions in a Python class can be either instance methods, class methods or static methods.
The first is characterised by self as its first (implicit) argument, acts directly on the instance of the class, and does not require any decorator to be treated as such.
The other two, however, need the decorators @classmethod and @staticmethod before the definition of the method. This is why I refer to the instance method as the "default" one, i.e. the one for which a wrapper is not needed.
My question is: suppose I am in a class, and I am breaking up my calculation into several functions for readability. Only one of these methods will need access to the self.something variables that I share instance-wise, but most of the others do not need to know about the class they belong to; they are just there for "housekeeping".
Should I make these functions (the ones that do not need any self.something knowledge) all @staticmethod? Doing so would require a decorator and hence an extra step. It would be easier (not requiring the extra step of using a decorator) for every method to just be an instance method, thus inheriting a lot of potential but also wasting it, since it is not needed for the scope of the functions in question.
Why is the instance method the "default"? Why not have every method be a static method by default, and give it the extra functionality associated with being an instance method with a wrapper?
The reason to default to instance methods is that that's usually what you want when you're doing object-oriented programming. I can't think of a single language that claims to support OOP and has methods default to anything but instance methods. Classes are templates for "data with behaviors", so the default is to make methods that provide behaviors to each instantiation of the class. If you just want a collection of functions, you can define them at the top level of a module and skip the unnecessary class altogether.
In general, @staticmethod is used to mean "I know this isn't a behavior of the class or its instances, but it helps implement the real behaviors and isn't very useful outside the class, so I'll namespace it inside it." If the features are useful outside the class, you'd just make it a plain top-level function rather than putting it inside the class at all. It is advantageous to use @staticmethod where appropriate; it's a little faster to call than an instance method, so if you don't need the instance, @staticmethod will speed up your code a bit (note: This may not be true in 3.7+, where they added an optimization to avoid the creation of bound methods, which may speed up instance/class methods).
@classmethod basically has two use cases:
(Primary) Defining alternate constructors in a subclass-friendly way (the cls it receives is the actual subclass, if applicable, not just the class it was defined in)
(Mostly unnecessary) As an alternative to @staticmethod when the method needs to call other static methods and you'd rather not have to refer to the class by name over and over
Point is, @staticmethod is mostly for when you're opting out of OOP, and @classmethod is for niche use cases; instance methods are just more useful, so they're the default. Beyond that, as a historical note, static and class methods were introduced later, so making them the default would have broken all existing Python code, for no real benefit.
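The primary classmethod use case, a subclass-friendly alternate constructor, might look like the following sketch. Temperature, LabTemperature and from_fahrenheit are hypothetical names chosen for the example.

```python
class Temperature:
    def __init__(self, celsius):
        self.celsius = celsius

    @classmethod
    def from_fahrenheit(cls, fahrenheit):
        # cls is whichever class the method was called on,
        # so subclasses get instances of their own type.
        return cls((fahrenheit - 32) * 5 / 9)


class LabTemperature(Temperature):
    pass


reading = LabTemperature.from_fahrenheit(212)
# reading is a LabTemperature, not a plain Temperature
```

Had from_fahrenheit been a static method that returned Temperature(...) by name, the subclass call would have produced a plain Temperature instead.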
The main reason to use @staticmethod over instance methods with an ignored self (when self isn't needed) is that it will continue to work when called on the class itself, not just on instances of the class; if you tried to call MyClass.notreallystatic(), it would die for lack of a self, while MyClass.actuallystatic() would work.
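A minimal sketch of that difference, assuming Python 3 and reusing the method names from the paragraph above:

```python
class MyClass:
    def notreallystatic(self):
        # Never touches self, but still requires an instance to be called.
        return "needs an instance"

    @staticmethod
    def actuallystatic():
        return "works on the class"


instance_result = MyClass().notreallystatic()   # fine: called on an instance
static_result = MyClass.actuallystatic()        # fine: called on the class

try:
    MyClass.notreallystatic()                   # no instance to bind as self
    class_call_error = None
except TypeError as exc:
    class_call_error = exc
```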
I am confused by a concept in Python: base class overriding. I learned that you can have two different functions with the same name in different classes, and the correct function will be called on an object depending on which class the object is from.
However, I have just learned about the super call, and I learned that you can use it if you overrode (correct past tense?) a function that you need back. I'm confused because the overridden function isn't gone in the first place, is it? Why do I need to "restore" it using the super call?
The child's type is first in MRO, so its method will get called even if any of its parents have the same method. super "restarts" MRO at the next link in the inheritance chain, and allows discovery of attributes belonging to parent classes.
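A small sketch of what this means in practice; Parent, Child and greet are hypothetical names. The overridden method is not gone, it is just found later in the MRO, and super() is how the child reaches it.

```python
class Parent:
    def greet(self):
        return "hello from Parent"


class Child(Parent):
    def greet(self):
        # Child.greet is found first; super() continues the lookup
        # at the next class in the MRO, which is Parent.
        return "Child, then " + super().greet()


mro_names = [cls.__name__ for cls in Child.__mro__]
message = Child().greet()
# mro_names is ["Child", "Parent", "object"]
```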
I know that in Python 3, you can write super() and Python automatically passes the correct arguments to super.
It's also possible to introduce subtle bugs by accidentally writing super(Parent, self).
Are there any scenarios where you wouldn't want to pass the current class as the first argument to super?
Yes, potentially you might want to skip the immediate superclass method, but still call the methods further up the hierarchy. This might happen for example when you know you have replaced the logic with your own, and calling the superclass method would be pointless or even harmful.
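Skipping the immediate superclass could be sketched like this, with hypothetical classes Grandparent, Parent and Child. Passing Parent as the first argument makes the lookup start after Parent in the MRO.

```python
class Grandparent:
    def setup(self):
        return ["Grandparent"]


class Parent(Grandparent):
    def setup(self):
        return super().setup() + ["Parent"]


class Child(Parent):
    def setup(self):
        # Deliberately skip Parent.setup: start the search after Parent.
        return super(Parent, self).setup() + ["Child"]


steps = Child().setup()
# steps is ["Grandparent", "Child"]; Parent.setup never ran
```

The same spelling written by accident is exactly the subtle bug mentioned in the question, so a comment explaining the intent is worth adding.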
Python will resolve a method name in the class of the method and all parent classes of that class until it resolves.
Does this apply to the constructor as well? I.e., if a class does not define __init__() but its parent does, will the parent constructor automatically be called?
The short answer is: yes. This is how inheritance works.
This is also the reason why you should call the parent constructor explicitly most of the time (unless you want to do otherwise for some reason) when you override __init__() in a child class: once the child defines its own, the parent's is no longer found first.
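The three cases can be sketched as follows; all class and attribute names here are hypothetical.

```python
class Parent:
    def __init__(self):
        self.ready = True


class Inherits(Parent):
    # No __init__ here, so lookup finds Parent.__init__ and runs it.
    pass


class Overrides(Parent):
    def __init__(self):
        # Parent.__init__ is NOT called automatically once we override.
        self.extra = 1


class OverridesAndCalls(Parent):
    def __init__(self):
        super().__init__()   # explicitly run the parent's initialization
        self.extra = 1


inherited = Inherits()
overridden = Overrides()
chained = OverridesAndCalls()
```

Only overridden lacks the ready attribute, because its __init__ replaced the parent's without calling it.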
It is also worth learning about Method Resolution Order in Python: Method Resolution Order (MRO) in new style Python classes. It defines the order in which methods are resolved (especially important in the case of multiple inheritance).