I am trying to understand some basic OOP in python. If I try to subclass a class like list, how do I invoke the parent constructor? After tinkering for a bit I found that it is:
super(subclass_name, self).__init__(args).
However, I don't intuitively understand this. Why can't I just do list(args)? or
list.__init__(args)?
The following is the relevant snippet:
class slist(list):
    def __init__(self, iterable):
        # super(slist, self).__init__(iterable)  <--- This works
        list.__init__(iterable) # This does not work
        self.__l = list(iterable)
    def __str__(self):
        return ",".join([str(s) for s in self.__l])
list.__init__(iterable) is missing the information of which list to initialize, and list(iterable) builds a different list entirely unrelated to the one you're trying to initialize.
If you don't want to use super, you can do list.__init__(self, iterable).
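As a minimal sketch of the difference (the class names BrokenList and WorkingList are made up for illustration, and __str__ here iterates over self instead of keeping a second copy of the data), the two calls behave roughly like this:

```python
class BrokenList(list):
    def __init__(self, iterable):
        # No self is passed, so `iterable` itself is treated as the
        # object to initialize; the new instance is never filled.
        list.__init__(iterable)


class WorkingList(list):
    def __init__(self, iterable):
        # Passing self explicitly initializes this instance.
        list.__init__(self, iterable)

    def __str__(self):
        return ",".join(str(s) for s in self)


broken = BrokenList([1, 2, 3])
working = WorkingList([1, 2, 3])
print(len(broken))   # 0
print(str(working))  # 1,2,3
```

Note that the broken version raises no error when given a list; it just silently leaves the new object empty. With a non-list argument such as a tuple, the same call fails with a TypeError instead.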
list.__init__(iterable) is incorrect. You need to tell __init__() which object it is initializing. list(args) is even more incorrect, as it creates and initializes a completely new list object rather than initializing your object. It goes through list.__new__() to build that separate object, so your own instance is never touched. You need to pass self to the constructor call to correctly initialize the parent class:
list.__init__(self, args)
and that would work. Using super() though usually allows for cleaner syntax. For example, the above could be rewritten as:
super(slist, self).__init__(args)
However, the main reason use of super() is encouraged over calling the parent constructor directly is multiple inheritance. When every class in the hierarchy calls super().__init__(), each parent's constructor runs exactly once, in the correct order. This is closely related to Python's Method Resolution Order (MRO).
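To illustrate the multiple inheritance point, here is a small sketch with a diamond-shaped hierarchy. The class names (Base, Left, Right, Both) and the `calls` list are invented for the example; the only thing it relies on is that each class calls super().__init__() cooperatively.

```python
calls = []  # records the order in which constructors run


class Base:
    def __init__(self):
        calls.append("Base")


class Left(Base):
    def __init__(self):
        calls.append("Left")
        super().__init__()


class Right(Base):
    def __init__(self):
        calls.append("Right")
        super().__init__()


class Both(Left, Right):
    def __init__(self):
        calls.append("Both")
        super().__init__()


Both()
print(calls)                               # ['Both', 'Left', 'Right', 'Base']
print([c.__name__ for c in Both.__mro__])  # ['Both', 'Left', 'Right', 'Base', 'object']
```

Had Left and Right called Base.__init__(self) directly instead, Base's constructor would have run twice.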
I want to be able to recycle a method from a parent class that uses a second method in that parent class. However the second method is overridden in the child class. I want the recycled parent method to use the new overridden second method when it is called from the child class. An example of how I want it to work will hopefully make this clearer:
class Parent:
    def method1(self, num):
        return num**2
    def method2(self, list_size):
        return [self.method1(i) for i in range(list_size)] #List of squares
class Child(Parent):
    def method1(self, num): #Overrides corresponding parent method
        return num**3
    def method2(self, list_size):
        return super().method2(list_size) #Returns a list of cubes using child's method 1.
Is this possible in python3? Or will calling the parent's method 2 also use the parent's method 1? I'm hoping to reuse large parts of a parent class as the child class differs in only a few ways. The methods nesting like that in the parent class make it a lot more general.
Thanks!
EDIT: I forgot to test it with simple code! It does work like I wanted if anyone was wondering!
Short answer: yes.
Just tried a slightly modified version of your code with prints.
class Parent:
    def method1(self):
        print("Parent method1")
    def method2(self):
        print("Parent method2")
        self.method1()
class Child(Parent):
    def method1(self):
        print("Child method1")
    def method2(self):
        print("Child method2")
        super().method2()
c = Child()
c.method2()
This is the output:
Child method2
Parent method2
Child method1
As you can see, the method1 used is the child one.
Yes, this works the way you want it to.
You can easily test this yourself. Unless you pass in nothing but 0s and 1s, it should be pretty obvious whether they're getting squared or cubed.
And, in cases where it's less obvious, just add a debugger breakpoint to Child.method1 and Parent.method1 and see which one gets hit. Or add print(f'Child.method1({self}, {num})') to the method and see if it gets printed out.
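For instance, a quick check along those lines, reusing the classes from the question with any input larger than 1, could look like this:

```python
class Parent:
    def method1(self, num):
        return num ** 2

    def method2(self, list_size):
        # self.method1 is looked up on the actual instance's class
        return [self.method1(i) for i in range(list_size)]


class Child(Parent):
    def method1(self, num):
        return num ** 3

    def method2(self, list_size):
        return super().method2(list_size)


print(Parent().method2(4))  # [0, 1, 4, 9]
print(Child().method2(4))   # [0, 1, 8, 27]
```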
If you're coming from another language with C++ OO semantics instead of Smalltalk OO semantics, it may help to think of it this way: Every method is always virtual.
Are __init__ calls virtual? Yes.
What if you call a method during __init__? Yes.
What if you call a method inside a super call? Yes.
What about a @classmethod? Yes.
What if…? Yes.
The only exceptions are when you go out of your way to explicitly tell Python not to make a virtual function call:
Calls on super() use the implementation from the next class in the MRO chain, because that's the whole point of super.
If you look the method up on the parent class and call that, like Parent.method1(self, num), you obviously get Parent.method1, because you named the class explicitly.
If you go digging into the class dicts and run the descriptor protocol manually, you obviously get whatever you do manually.
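The three exceptions above can be sketched side by side. The helper method names (via_self, via_super, via_class, via_dict) are made up for this illustration:

```python
class Parent:
    def method1(self, num):
        return num ** 2


class Child(Parent):
    def method1(self, num):
        return num ** 3

    def via_self(self, num):
        # Ordinary "virtual" lookup: finds Child.method1
        return self.method1(num)

    def via_super(self, num):
        # Next class in the MRO after Child: Parent.method1
        return super().method1(num)

    def via_class(self, num):
        # Explicitly named class: Parent.method1
        return Parent.method1(self, num)

    def via_dict(self, num):
        # Manual descriptor protocol on Parent's class dict
        func = Parent.__dict__["method1"]
        return func.__get__(self, type(self))(num)


c = Child()
print(c.via_self(2))   # 8
print(c.via_super(2))  # 4
print(c.via_class(2))  # 4
print(c.via_dict(2))   # 4
```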
If you're not trying to understand Python in terms of Java, and just want a deeper understanding of Python on its own terms, what you need to understand is what happens when you call self.method1(i).
First, self.method1 doesn't know or care that you're going to call it. It's an attribute lookup, just like, say, self.name would be.
The way Python resolves this is described in the Descriptor HOWTO, but an oversimplified version looks like this:
Does self.__dict__ have anything named method1? No.
Does type(self).__dict__ have anything named method1? Yes.
Return type(self).__dict__['method1'].__get__(self).
If that second lookup failed, Python would loop over type(self).mro() and do the same test for each one. But here, that doesn't come up. type(self) is always going to be Child for an instance of Child, and Child.__dict__['method1'] exists, so it binds Child.method1 to self, and the result is what self.method1 means.
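The oversimplified lookup described above can be written out as a short function. This is only a sketch: simplified_lookup is a hypothetical helper, and it deliberately ignores details of the real algorithm such as data descriptors taking priority over the instance dict and __getattr__ fallbacks.

```python
def simplified_lookup(obj, name):
    # Step 1: the instance's own __dict__
    if name in vars(obj):
        return vars(obj)[name]
    # Steps 2-3: each class in the MRO, starting with type(obj)
    for klass in type(obj).mro():
        if name in klass.__dict__:
            attr = klass.__dict__[name]
            if hasattr(attr, "__get__"):
                # Bind the function (or other descriptor) to obj
                return attr.__get__(obj, type(obj))
            return attr
    raise AttributeError(name)


class Parent:
    def method1(self, num):
        return num ** 2

    def method2(self, num):
        return self.method1(num)


class Child(Parent):
    def method1(self, num):
        return num ** 3


c = Child()
print(simplified_lookup(c, "method1")(2))  # 8, found in Child.__dict__
print(simplified_lookup(c, "method2")(2))  # 8, found in Parent.__dict__ via the MRO
```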
I'm porting a legacy codebase from Python 2.7 to Python 3.6. In that codebase I have a number of instances of things like:
class EntityName(unicode):
    @staticmethod
    def __new__(cls, s):
        clean = cls.strip_junk(s)
        return super(EntityName, cls).__new__(cls, clean)
    def __init__(self, s):
        self._clean = s
        self._normalized = normalized_name(self._clean)
        self._simplified = simplified_name(self._clean)
        self._is_all_caps = None
        self._is_all_lower = None
        super(EntityName, self).__init__(self._clean)
It might be called like this:
EntityName("Guy DeFalt")
When porting this to Python 3 the above code fails because unicode is no longer a class you can extend (at least, if there is an equivalent class I cannot find it). Given that str is unicode now, I tried to just swap str in, but the parent __init__ doesn't take the string value I'm trying to pass:
TypeError: object.__init__() takes no parameters
This makes sense because str does not have an __init__ method - this does not seem to be an idiomatic way of using this class. So my question has two major branches:
Is there a better way to be porting classes that sub-classed the old unicode class?
If subclassing str is appropriate, how should I modify the __init__ function for idiomatic behavior?
The right way to subclass a string or another immutable class in Python 3 is the same as in Python 2:
class MyString(str):
    def __new__(cls, initial_arguments): # no staticmethod
        desired_string_value = get_desired_string_value(initial_arguments)
        return super(MyString, cls).__new__(cls, desired_string_value)
        # can be shortened to super().__new__(...)
    def __init__(self, initial_arguments): # arguments are unused
        self.whatever = whatever(self)
        # no need to call super().__init__(), but if you must, do not pass arguments
There are several issues with your sample. First, why is __new__ decorated with @staticmethod? It is already a static method that Python special-cases to receive the class as its first argument, so you don't need to specify this. Second, the code seems to operate under the assumption that when you call __new__ of the superclass, it somehow calls your __init__ as well. I'm deriving this from looking at how self._clean is supposed to be set. This is not the case. When you call MyString(arguments), the following happens:
First Python calls __new__ with the class parameter (usually called cls) and arguments. __new__ must return the class instance. To do this it can create it, as we do, or do something else; e.g. it may return an existing one or, in fact, anything.
Then Python calls __init__ with the instance it received from __new__ (this parameter is usually called self) and the same arguments.
(There's a special case: Python won't call __init__ if __new__ returned something that is not an instance of the passed class.)
Python uses class hierarchy to see which __new__ and __init__ to call. It's up to you to correctly sort out the arguments and use proper superclass calls in these two methods.
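Applied to the question's class, a Python 3 port might look roughly like the sketch below. The body of strip_junk and the lower-casing in __init__ are hypothetical stand-ins, since the original helpers (strip_junk, normalized_name, simplified_name) are not shown.

```python
class EntityName(str):
    def __new__(cls, s):
        # The string value must be decided here, because str is immutable.
        clean = cls.strip_junk(s)
        return super().__new__(cls, clean)

    def __init__(self, s):
        # __init__ receives the original argument again; the cleaned
        # value is the string itself (self). No arguments are passed
        # on to the parent __init__.
        self._simplified = self.lower()  # stand-in for simplified_name()

    @staticmethod
    def strip_junk(s):
        # Hypothetical stand-in: collapse runs of whitespace.
        return " ".join(s.split())


name = EntityName("  Guy   DeFalt ")
print(name)              # Guy DeFalt
print(name._simplified)  # guy defalt
```

The instance still behaves as a regular str everywhere, while carrying the extra attributes set in __init__.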
I just can't see why we need to use @staticmethod. Let's start with an example.
class test1:
    def __init__(self,value):
        self.value=value
    @staticmethod
    def static_add_one(value):
        return value+1
    @property
    def new_val(self):
        self.value=self.static_add_one(self.value)
        return self.value
a=test1(3)
print(a.new_val) ## >>> 4
class test2:
    def __init__(self,value):
        self.value=value
    def static_add_one(self,value):
        return value+1
    @property
    def new_val(self):
        self.value=self.static_add_one(self.value)
        return self.value
b=test2(3)
print(b.new_val) ## >>> 4
In the example above, the method static_add_one in both classes does not need the instance (self) for its calculation.
The method static_add_one in class test1 is decorated with @staticmethod and works properly.
At the same time, the method static_add_one in class test2, which has no @staticmethod decoration, also works properly by using a trick: it accepts self as an argument but doesn't use it at all.
So what is the benefit of using @staticmethod? Does it improve performance? Or is it just due to the Zen of Python, which states that "Explicit is better than implicit"?
The reason to use staticmethod is if you have something that could be written as a standalone function (not part of any class), but you want to keep it within the class because it's somehow semantically related to the class. (For instance, it could be a function that doesn't require any information from the class, but whose behavior is specific to the class, so that subclasses might want to override it.) In many cases, it could make just as much sense to write something as a standalone function instead of a staticmethod.
Your example isn't really the same. A key difference is that, even though you don't use self, you still need an instance to call static_add_one --- you can't call it directly on the class with test2.static_add_one(1). So there is a genuine difference in behavior there. The most serious "rival" to a staticmethod isn't a regular method that ignores self, but a standalone function.
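A small sketch of that behavioral difference in Python 3, using shortened, made-up class names in place of test1 and test2:

```python
class WithStatic:
    @staticmethod
    def add_one(value):
        return value + 1


class WithoutStatic:
    def add_one(self, value):  # self is accepted but never used
        return value + 1


print(WithStatic.add_one(1))       # 2, no instance needed
print(WithoutStatic().add_one(1))  # 2, but only through an instance

error = None
try:
    # Called on the class, 1 is taken as self and value is missing.
    WithoutStatic.add_one(1)
except TypeError as exc:
    error = exc
    print("TypeError:", exc)
```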
Today I suddenly found a benefit of using @staticmethod.
If you create a staticmethod within a class, you don't need to create an instance of the class before using it.
For example,
class File1:
    def __init__(self, path):
        out=self.parse(path)
    def parse(self, path):
        # ..parsing works..
        return x
class File2:
    def __init__(self, path):
        out=self.parse(path)
    @staticmethod
    def parse(path):
        # ..parsing works..
        return x
if __name__=='__main__':
    path='abc.txt'
    File1.parse(path) #TypeError: unbound method parse() ....
    File2.parse(path) #Goal!!!!!!!!!!!!!!!!!!!!
Since the method parse is strongly related to the classes File1 and File2, it is more natural to put it inside the class. However, this parse method may sometimes also be needed by other classes. To use it from File1, you must first create an instance of File1 before calling parse. With the staticmethod in File2, you can call the method directly as File2.parse.
This makes your work more convenient and natural.
I will add something the other answers didn't mention. It's not only a matter of modularity, of putting something next to other logically related parts. It's also that the method could be non-static at another point of the hierarchy (i.e. in a subclass or superclass) and thus participate in polymorphism (type-based dispatching). So if you put that function outside the class you will be precluding subclasses from effectively overriding it. Now, say you realize you don't need self in function C.f of class C. You have three options:
1. Put it outside the class. But we just decided against this.
2. Do nothing new: while unused, still keep the self parameter.
3. Declare you are not using the self parameter, while still letting other C methods call f as self.f, which is required if you wish to keep open the possibility of further overrides of f that do depend on some instance state.
Option 2 demands less conceptual baggage (you already have to know about self and methods-as-bound-functions, because it's the more general case). But you may still prefer to be explicit about self not being used (and the interpreter could even reward you with some optimization, not having to partially apply a function to self). In that case, you pick option 3 and add @staticmethod on top of your function.
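As a sketch of option 3, here is a static f in C that a subclass later overrides with a version that does depend on instance state. The subclass D, the run method, and the offset attribute are invented for the illustration.

```python
class C:
    @staticmethod
    def f(x):
        # Needs no instance state in C
        return x + 1

    def run(self, x):
        # Calling through self keeps f overridable in subclasses
        return self.f(x)


class D(C):
    def __init__(self, offset):
        self.offset = offset

    def f(self, x):
        # Override that does use instance state
        return x + self.offset


print(C().run(1))    # 2
print(D(10).run(1))  # 11
```

Had f been a module-level function called as f(x) inside run, D would have had no way to substitute its own version.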
Use @staticmethod for methods that don't need to operate on a specific object, but that you still want located in the scope of the class (as opposed to module scope).
Your example in test2.static_add_one wastes its time passing an unused self parameter, but otherwise works the same as test1.static_add_one. Note that this extraneous parameter can't be optimized away.
One example I can think of is in a Django project I have, where a model class represents a database table, and an object of that class represents a record. There are some functions used by the class that are stand-alone and do not need an object to operate on, for example a function that converts a title into a "slug", which is a representation of the title that follows the character set limits imposed by URL syntax. The function that converts a title to a slug is declared as a staticmethod precisely to strongly associate it with the class that uses it.
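A rough, framework-free sketch of that slug idea follows. The Article class and the exact slug rules are assumptions made for the example; this is not Django's own slugify and not the code from the project described above.

```python
import re


class Article:
    def __init__(self, title):
        self.title = title
        self.slug = self.slugify(title)

    @staticmethod
    def slugify(title):
        # Replace every run of characters outside a-z and 0-9 with a
        # single hyphen, then trim hyphens from both ends.
        slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
        return slug.strip("-")


print(Article.slugify("Hello, World!"))         # hello-world
print(Article("Static Methods in 2 Steps").slug)  # static-methods-in-2-steps
```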
I have somewhat of a strange question here. Let's say I'm making a simple, basic class as follows:
class MyClass(object):
    def __init__(self):
        super(MyClass, self).__init__()
Is there any purpose in calling super()? My class only has the default object parent class. The reason why I'm asking this is because my IDE automagically gives me this snippet when I create a new class. I usually remove the super() function because I don't see any purpose in leaving it. But maybe I'm missing something ?
You're not obliged to call object.__init__ (via super or otherwise). It does nothing.
However, the purpose of writing that snippet in that way in an __init__ function (or any function that calls the superclass) is to give you some flexibility to change the superclass without modifying that code.
So it doesn't buy you much, but it does buy you the ability to change the superclass of MyClass to a different class whose __init__ likewise accepts no arguments, but which perhaps does do something and does need to be called by subclass __init__ functions. All without modifying your MyClass.__init__.
Your call whether that's worth having.
Also in this particular example you can leave out MyClass.__init__ entirely, because yours does nothing too.
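To make that flexibility concrete, here is a sketch in which only the base class in the class statement changes. Audited is a hypothetical replacement superclass, and the log attribute exists only to show that its __init__ ran.

```python
class Audited(object):
    def __init__(self):
        # Unlike object.__init__, this one does real work.
        self.log = ["Audited.__init__"]


class MyClass(Audited):  # previously: class MyClass(object)
    def __init__(self):
        # Unchanged from the original snippet
        super(MyClass, self).__init__()
        self.log.append("MyClass.__init__")


print(MyClass().log)  # ['Audited.__init__', 'MyClass.__init__']
```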
I'm having trouble with Python (2.7) inheritance. I'm trying to refer from derived classes to parents and back, which is easy enough if you hard-code the classes, but that seems like an ugly approach to me. Is it? Anyway, here we go:
class Alpha(object):
    def fie(self):
        pass
class Beta(Alpha):
    def fie(self):
        super(self.__class__, self).fie()
class Gamma(Beta):
    pass
Alpha().fie()
Beta().fie()
Gamma().fie()
The last one calls fie as defined on Beta, but since it's called on a Gamma instance, self.__class__ is Gamma and the super will refer to Beta. As such it'll call itself again and start an infinite recursion.
Is there a way to reference the class for which the function is initially defined? Or the class highest up the chain (besides object)? Or possibly an even better way to accomplish this without hard-coding class names?
Nope - you just have to write it as:
class Beta(Alpha):
    def fie(self):
        super(Beta, self).fie()
See: http://yergler.net/blog/2011/07/04/super-self/ - and quoted from there (as it explains it better than I could!):
According to the Python 2.7.2 standard library documentation, super “return[s] a proxy object that delegates method calls to a parent or sibling class of type.” So in the case of single inheritance, it delegates access to the super class, it does not return an instance of the super class. In the example above, this means that when you instantiate B, the following happens:
1. enter B.__init__()
2. call super on B and call __init__ on the proxy object
3. enter A.__init__()
4. call super on self.__class__ and call __init__ on the proxy object
The problem is that when we get to step four, self still refers to our instance of B, so calling super points back to A again. In technical terms: Ka-bloom.
And within that article is a link to a blog by Raymond Hettinger (and they're always worth reading): http://rhettinger.wordpress.com/2011/05/26/super-considered-super/
NB: read the comment where a user suggests using type(self) (equivalent to your self.__class__) and why it doesn't work
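The failure and the fix can be reproduced with a short sketch, written for Python 3 (where the runaway recursion surfaces as RecursionError). The Broken* class names are made up so that both variants can live side by side.

```python
class Alpha(object):
    def fie(self):
        return "Alpha.fie"


class BrokenBeta(Alpha):
    def fie(self):
        # self.__class__ is the runtime class, not the defining class
        return super(self.__class__, self).fie()


class BrokenGamma(BrokenBeta):
    pass


class Beta(Alpha):
    def fie(self):
        # Hard-coding the defining class is what makes this correct
        return super(Beta, self).fie()


class Gamma(Beta):
    pass


print(BrokenBeta().fie())  # Alpha.fie (works by accident)
print(Gamma().fie())       # Alpha.fie

recursed = False
try:
    BrokenGamma().fie()    # super(BrokenGamma, self) -> BrokenBeta.fie, forever
except RecursionError:
    recursed = True
print(recursed)            # True
```

In Python 3 the zero-argument form super().fie() avoids the problem as well, since it is tied to the class where the method is defined.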