Why doesn't object.__init__ take *args, **kwargs as arguments? This breaks some simple code in a highly annoying manner without any upsides as far as I can see:
Say we want to make sure that the __init__s of all parent classes get called. As long as every __init__ follows the simple convention of calling super().__init__, this guarantees that the whole hierarchy is run through, and exactly once (without ever having to name the parent explicitly). The problem appears when we pass arguments along:
class Foo:
    def __init__(self, *args, **kwargs):
        print("foo-init")
        super().__init__(*args, **kwargs)  # error if there are arguments!

class Bar:
    def __init__(self, *args, **kwargs):
        print("bar-init")
        super().__init__(*args, **kwargs)

class Baz(Bar, Foo):
    def __init__(self, *args, **kwargs):
        print("baz-init")
        super().__init__(*args, **kwargs)

b1 = Baz()         # works
b2 = Baz("error")  # TypeError: object.__init__() takes exactly one argument
What's the reasoning for this, and what's the best general workaround? (It's easily solvable in my specific case, but that relies on additional knowledge of the hierarchy.) The best I can see is to check whether the parent is object and, in that case, not give it any args. Horribly ugly, that.
You can see http://bugs.python.org/issue1683368 for a discussion. Note that someone there actually asked for it to cause an error. Also see the discussion on python-dev.
Anyway, your design is rather odd. Why are you writing every single class to take unspecified *args and **kwargs? In general it's better to have methods accept the arguments they need. Accepting open-ended arguments for everything can lead to all sorts of bugs if someone mistypes a keyword name, for instance. Sometimes it's necessary, but it shouldn't be the default way of doing things.
Raymond Hettinger's "Python's super() considered super" has some information about how to deal with this. It's in the section "Practical Advice".
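One piece of that practical advice is to put a dedicated root class just above object that swallows the leftover keyword arguments before they reach object.__init__. A minimal sketch of that pattern (class and parameter names here are illustrative, not from the question):

```python
class Root:
    def __init__(self, **kwargs):
        # End of the cooperative chain: anything left over is an error
        # we can report more helpfully than object.__init__ would.
        assert not kwargs, f"unexpected arguments: {kwargs}"
        super().__init__()

class Foo(Root):
    def __init__(self, foo=None, **kwargs):
        self.foo = foo
        super().__init__(**kwargs)

class Bar(Root):
    def __init__(self, bar=None, **kwargs):
        self.bar = bar
        super().__init__(**kwargs)

class Baz(Bar, Foo):
    pass

b = Baz(foo=1, bar=2)  # every __init__ in the MRO runs exactly once
```

Each class consumes the keywords it knows about and forwards the rest, so the chain works for any combination of mixins without anyone naming a parent explicitly.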
Preamble
This is a rather basic question, I realize, but I haven't been able to find a sturdy reference for it, which would likely be a mixture of technical details and best practices for well-behaved classes.
Question
When a parent and child class both define the same initialization parameter, but with default values, what's the best way to get sane behavior when the child class is created?
My assumptions are:
Classes only accept named parameters; I don't need to deal with positional arguments. That simplifies things both theoretically (in reasoning about situations) and practically (in taking arguments from external config files, etc.).
__init__ methods may be more sophisticated than just setting self.foo = foo for each argument: they may transform a value before storing it, use it to set other parameters, etc., and I'd like to be as respectful of that as possible.
Subclasses never break the interfaces of their parents, both for __init__ parameters and for attributes. Having a different default value is not considered "breaking".
Classes should never have to be aware of their subclasses, they should just do things in "reasonable ways" and it's up to subclasses to ensure everything still works properly. However, it's sometimes necessary to modify a superclass to be "more reasonable" if it's doing things that aren't amenable to being subclassed - this can form a set of principles that help everyone get along well.
Examples
In general, my idea of a "best practice" template for a derived class looks like this:
class Child(Parent):
    def __init__(self, arg=1, **kwargs):
        self.arg = arg
        super().__init__(**kwargs)
That works well in most situations - deal with our stuff, then delegate all the rest to our superclass.
However, it doesn't work well if arg is shared by both Child and Parent - neither the caller's argument nor the Child default is respected:
class Parent:
    def __init__(self, arg=0):
        self.arg = arg

class Child(Parent):
    def __init__(self, arg=1, **kwargs):
        self.arg = arg
        super().__init__(**kwargs)

print(Child(arg=6).arg)
# Prints `0` - bad
A better approach is probably for Child to acknowledge that the argument is shared:
class Parent:
    def __init__(self, arg=0):
        self.arg = arg

class Child(Parent):
    def __init__(self, arg=1, **kwargs):
        super().__init__(arg=arg, **kwargs)

print(Child(arg=6).arg)
# Prints `6` - good

print(Child().arg)
# Prints `1` - good
That successfully gets the defaults working according to expectations. What I'm not sure of is whether this plays well with the expectations of Parent. So I think my questions are:
If Parent.__init__ does some Fancy Stuff with arg and/or self.arg, how should Child be set up to respect that?
In general does this require knowing Too Much about the internals of Parent and how self.arg is used? Or are there reasonable practices that everyone can follow to draw that part of the interface contract in a clean way?
Are there any specific gotchas to keep in mind?
Parent.__init__ only expects that the caller may choose to omit an argument for the arg parameter. It doesn't matter if any particular caller (Child.__init__, in this case) always provides an argument, nor does it matter how the caller produces the value it passes.
Your third example is what I would write, with the addition that Parent.__init__ itself also uses super().__init__: it doesn't assume that it's the end of whatever MRO is in force for its self argument.
class Parent:
    def __init__(self, arg=0, **kwargs):
        super().__init__(**kwargs)
        self.arg = arg

class Child(Parent):
    def __init__(self, arg=1, **kwargs):
        super().__init__(arg=arg, **kwargs)
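Restating that version with a quick check that both the defaults and an explicit argument behave as expected:

```python
class Parent:
    def __init__(self, arg=0, **kwargs):
        super().__init__(**kwargs)
        self.arg = arg

class Child(Parent):
    def __init__(self, arg=1, **kwargs):
        # Pass the shared argument up explicitly; Parent does the storing.
        super().__init__(arg=arg, **kwargs)

assert Child(arg=6).arg == 6  # caller's value wins
assert Child().arg == 1       # Child's default wins
assert Parent().arg == 0      # Parent's default is untouched
```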
Can someone help me understand how MRO works in Python?
Suppose I have four classes: Character, Thief, Agile, and Sneaky. Character is the superclass of Thief; Agile and Sneaky are siblings. Please see my code and question below.
import random  # needed for pickpocket()

class Character:
    def __init__(self, name="", **kwargs):
        if not name:
            raise ValueError("'name' is required")
        self.name = name
        for key, value in kwargs.items():
            setattr(self, key, value)

class Agile:
    agile = True
    def __init__(self, agile=True, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.agile = agile

class Sneaky:
    sneaky = True
    def __init__(self, sneaky=True, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.sneaky = sneaky

class Thief(Agile, Sneaky, Character):
    def pickpocket(self):
        return self.sneaky and bool(random.randint(0, 1))

parker = Thief(name="Parker", sneaky=False)
So, here is what I think is going on, please let me know if I'm understanding it correctly.
Since Agile is first in the list, all arguments are first sent to Agile, where they're matched against Agile's parameters. Whatever matches is assigned, and everything without a matching keyword is packed back into **kwargs and sent to the Sneaky class (via super), where the same thing happens: all arguments get unpacked, matched against Sneaky's parameters (this is when sneaky=False is set), then packed into **kwargs and sent to Character. Then everything within Character's __init__ method runs and the remaining values get set (like name="Parker").
HOW I THINK MRO WORKS ON THE WAY BACK
Now that everything has made it to the Character class and everything in Character's __init__ method has run, control has to go back to the Agile and Sneaky classes and finish running everything in their __init__ methods (everything after their super() calls). So it will first go back to the Sneaky class and finish its __init__ method, then go back to the Agile class and finish the rest of its __init__ method.
Do I have it confused anywhere? Phew. I'm sorry, I know this is a lot, but I'm really stuck here and I'm trying to get a clear understanding of how MRO works.
Thank you, everyone.
Your code as posted doesn't even compile, much less run. But, guessing at how it's supposed to work…
Yes, you've got things basically right.
But you should be able to verify this yourself, in two ways. And knowing how to verify it may be even more important than knowing the answer.
First, just print out Thief.mro(). It should look something like this:
[Thief, Agile, Sneaky, Character, object]
And then you can see which classes provide an __init__ method, and therefore how they'll be chained up if everyone just calls super:
>>> [cls for cls in Thief.mro() if '__init__' in cls.__dict__]
[Agile, Sneaky, Character, object]
And, just to make sure Agile really does get called first:
>>> Thief.__init__
<function Agile.__init__>
Second, you can run your code in the debugger and step through the calls.
Or you can just add print statements at the top and bottom of each one, like this:
def __init__(self, agile=True, *args, **kwargs):
    print(f'>Agile.__init__(agile={agile}, args={args}, kwargs={kwargs})')
    super().__init__(*args, **kwargs)
    self.agile = agile
    print(f'<Agile.__init__: agile={agile}')
(You could even write a decorator that does this automatically, with a bit of inspect magic.)
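A bare-bones sketch of such a decorator (simplified: it just wraps __init__ and logs the call, without the fuller inspect-based introspection the answer alludes to):

```python
import functools

def trace_init(cls):
    """Class decorator (illustrative) that logs entry/exit of __init__."""
    original = cls.__init__

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        print(f'>{cls.__name__}.__init__(args={args}, kwargs={kwargs})')
        original(self, *args, **kwargs)
        print(f'<{cls.__name__}.__init__')

    cls.__init__ = wrapper
    return cls

@trace_init
class Base:
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

@trace_init
class Derived(Base):
    def __init__(self, name='', **kwargs):
        super().__init__(**kwargs)
        self.name = name

d = Derived(name='Parker')  # prints nested >Derived / >Base / <Base / <Derived
```

Because each wrapped __init__ still calls super(), the nested prints show the MRO being walked down and the stack being popped back up.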
If you do that, it'll print out something like:
>Agile.__init__(agile=True, args=(), kwargs={'name': 'Parker', 'sneaky': False})
>Sneaky.__init__(sneaky=False, args=(), kwargs={'name': 'Parker'})
>Character.__init__(name='Parker', kwargs={})
<Character.__init__: name='Parker'
<Sneaky.__init__: sneaky=False
<Agile.__init__: agile=True
So, you're right about the order things get called via super, and the order the stack gets popped on the way back is obviously the exact opposite.
But, meanwhile, you've got one detail wrong:
sent to the Sneaky class (via super), where the same thing will happen - all arguments get unpacked, cross-referenced with the Sneaky parameters (this is when sneaky = False is set)
This is where the parameter/local variable sneaky gets set, but self.sneaky doesn't get set until after the super returns. Until then (including during Character.__init__, and similarly for any other mixins that you choose to throw in after Sneaky), there is no sneaky in self.__dict__, so if anyone were to try to look up self.sneaky, they'd only be able to find the class attribute—which has the wrong value.
Which raises another point: What are those class attributes for? If you wanted them to provide default values, you've already got default values on the initializer parameters for that, so they're useless.
If you wanted them to provide values during initialization, then they're potentially wrong, so they're worse than useless. If you need to have a self.sneaky before calling Character.__init__, the way to do that is simple: just move self.sneaky = sneaky up before the super() call.
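That reordering, restating only the Sneaky class from the example, looks like this:

```python
class Sneaky:
    def __init__(self, sneaky=True, *args, **kwargs):
        # Set the attribute *before* delegating, so everything later
        # in the MRO (Character, other mixins) can already see it.
        self.sneaky = sneaky
        super().__init__(*args, **kwargs)
```

With this version there is never a window during initialization where self.sneaky falls back to the (possibly wrong) class attribute.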
In fact, that's one of the strengths of Python's "explicit super" model. In some languages, like C++, constructors are always called automatically, whether from the inside out or the outside in. Python forcing you to do it explicitly is less convenient and easier to get wrong, but it means you can choose to do your setup either before or after the base class gets its chance (or, of course, a little of each), which is sometimes useful.
I am writing a GUI in wxPython, and am creating a custom control for displaying a terminal window, as I have not been able to find one currently in existence.
My control TerminalCtrl extends upon wx.Control, and my init definition starts as follows:
def __init__(self, *args, **kwargs):
    super().__init__(*args, **kwargs)
I would like to enforce the following style:
style=wx.BORDER_NONE
That is, no borders will ever be allowed on this window. However, I would still like to allow other styles to be applied, at the programmer's discretion.
For reference, the __init__ function for wx.Control is defined as follows
__init__ (self, parent, id=ID_ANY, pos=DefaultPosition, size=DefaultSize, style=0, validator=DefaultValidator, name=ControlNameStr)
What I would like to achieve is that I may filter the style parameter to enforce the wx.BORDER_NONE style. It is my understanding that this could be in either *args or **kwargs, depending on whether the parameters are passed by position or by specifically referencing the parameter name such as (style=wx.BORDER_NONE).
Is there a standard/recommended/pythonic way that I may enforce such a filter upon a parameter before passing it on to wx.Control.__init__ and if so how may I achieve that?
The cleanest way is probably to just copy the base class's signature:
def __init__(self, parent, id=ID_ANY, pos=DefaultPosition,
             size=DefaultSize, style=0, validator=DefaultValidator,
             name=ControlNameStr):
    style |= wx.BORDER_NONE
    super().__init__(parent, id, pos, size, style, validator, name)
This can get a bit ugly if you're doing this for a whole bunch of classes whose construction signatures all have a whole bunch of positional-or-keyword parameters. Or if you're doing it for an API that changes regularly.
For those cases, you can always do it dynamically, with inspect:
import inspect

_wxControlSig = inspect.signature(wx.Control)

class TerminalCtrl(wx.Control):
    def __init__(self, *args, **kwargs):
        bound = _wxControlSig.bind(*args, **kwargs)
        bound.apply_defaults()
        bound.arguments['style'] |= wx.BORDER_NONE
        super().__init__(*bound.args, **bound.kwargs)
If you were doing dozens of these, you'd probably want to write a decorator to help out. And you might also want to apply functools.wraps or do the equivalent manually to make your signature introspectable. (And if you weren't doing dozens of these, you'd probably want to just be explicit, as in the example at the top of the answer.)
If you have something which is just a bit too repetitive and annoying to do explicitly, but not worth going crazy with the introspection, the only thing in between is something decidedly hacky, like this:
def __init__(self, *args, **kwargs):
    if len(args) > 4:
        # style is the fifth positional argument after self:
        # (parent, id, pos, size, style, ...)
        args = list(args)
        args[4] |= wx.BORDER_NONE
    elif 'style' in kwargs:
        kwargs['style'] |= wx.BORDER_NONE
    else:
        kwargs['style'] = wx.BORDER_NONE
    super().__init__(*args, **kwargs)
For Python 2.x (or 3.0-3.2), where you don't have signature, only getargspec and friends, this might be tempting. But for 3.3+, the only reason to avoid signature would be optimizing out a few nanoseconds. And when the function in question is the constructor for a widget that involves talking to the system window manager, that would be a pretty silly thing to worry about.
Base class looks like this (code comes from Django, but the question isn't Django specific):
class BaseModelForm(BaseForm):
    def __init__(self, data=None, files=None, auto_id='id_%s', prefix=None,
                 initial=None, error_class=ErrorList, label_suffix=':',
                 empty_permitted=False, instance=None):
        ...

class ModelForm(BaseModelForm):
    __metaclass__ = ModelFormMetaclass
Derived class looks like this:
class MyForm(forms.ModelForm):
    def __init__(self, data=None, files=None, *args, **kwargs):
        super(MyForm, self).__init__(data, files, *args, **kwargs)
        self.Meta.model.get_others(data, files, kwargs.get('instance', None))
If the coder passes instance as a keyword argument, this is all fine and should work. But what if they don't, and pass it positionally in args instead? What is a sane way to extract instance when dealing with *args and **kwargs? Admittedly, in this case the chance of instance being in args is small, but it would be larger if instance were the 3rd parameter instead of the last (not counting self).
Well, in Django, if instance is not passed or isn't properly set, it becomes None, and __init__ will then create a new instance for you:
http://docs.nullpobug.com/django/trunk/django.forms.models-pysrc.html#BaseModelForm
But this isn't your question.
If the other variables aren't properly set an exception will probably be raised.
It's really up to you whether to check that the caller has used your API properly; in Python this can be quite daunting, and checking for instance types isn't very pythonic. There are heated debates throughout the web about whether it's good or bad to have a language that's so dynamically typed. On one hand, you can be very productive in a dynamically typed language; on the other, you can get really nasty bugs whose solutions aren't apparent. That's just from my own experience, though.
I believe consistency is one of the most crucial things a system can have, so I tend to always use keyword arguments, all defaulting to None, and then simply assert that the required ones aren't None (you could have a function that just checks your parameters are good). Keyword arguments tend to be the most flexible in my experience, since you don't need to keep track of the order; you pay the price by making the caller remember the names.
If you really need 'instance', you can iterate over args and/or kwargs using isinstance to find it, though like I said, this isn't very pythonic....
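A sketch of that keyword-only-with-validation style (class and parameter names are illustrative, and the keyword-only * syntax requires Python 3):

```python
class Form:
    def __init__(self, *, data=None, files=None, instance=None):
        # Fail fast if a required argument is missing, instead of
        # letting a confusing error surface later.
        if data is None:
            raise ValueError("'data' is required")
        self.data = data
        self.files = files
        self.instance = instance

f = Form(data={'field': 1})
# Positional calls like Form({'field': 1}) now raise TypeError,
# so arguments can never land in the wrong slot.
```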
If you insist on accepting *args, one way to handle this situation if getting instance matters to MyForm, is to explicitly include instance as a keyword argument to MyForm, add instance to kwargs, and then pass up kwargs.
class MyForm(forms.ModelForm):
    def __init__(self, data=None, files=None, instance=None, *args, **kwargs):
        kwargs['instance'] = instance
        super(MyForm, self).__init__(data, files, *args, **kwargs)
Note that, if someone put instance as the third positional argument, they would be making a very explicit error, since instance is the last argument to BaseModelForm.
However, a better way to handle this situation would be to specifically not allow additional positional arguments. E.g.:
class MyForm(forms.ModelForm):
    def __init__(self, data=None, files=None, **kwargs):
        super(MyForm, self).__init__(data, files, **kwargs)
        self.Meta.model.get_others(data, files, kwargs.get('instance', None))
That way, MyForm can only be called with up to two positional arguments (besides self). Any more and the Python interpreter will raise a TypeError along the lines of __init__() takes at most 3 arguments (4 given).
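A quick illustration of that, using a stripped-down stand-in for MyForm (no Django involved):

```python
class MyForm:
    def __init__(self, data=None, files=None, **kwargs):
        self.data = data
        self.files = files
        self.instance = kwargs.get('instance')

ok = MyForm('d', 'f', instance=42)  # two positional arguments: fine
try:
    MyForm('d', 'f', 42)            # a third positional argument
except TypeError:
    print('rejected')               # extra positionals are refused up front
```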
If you need to use the instance argument, you should explicitly include it in the keyword arguments to MyForm. Failing that, you at least should note it in the doc string to the function.
If you're subclassing BaseModelForm and it sets a self.instance attribute, you can simply read self.instance after the super(MyForm, self).__init__(...) call returns. That way, you let the superclass handle whatever it needs to do with instance and grab the result yourself afterwards. (Be careful here: super(MyForm, self).instance would not find an instance attribute, since super() only searches class dicts, not the instance's __dict__.)
Is this a legal use of super()?
class A(object):
    def method(self, arg):
        pass

class B(A):
    def method(self, arg):
        super(B, self).method(arg)

class C(B):
    def method(self, arg):
        super(B, self).method(arg)
Thank you.
It will work, but it will probably confuse anyone trying to read your code (including you, unless you remember it specifically). Don't forget that if you want to call a method from a particular parent class, you can just do:
A.method(self, arg)
Well, "legal" is a questionable term here. The code will end up calling A.method, since the type given to super is excluded from the search. I would consider this usage of super flaky to say the least, since it skips a member of the inheritance hierarchy (seemingly haphazardly), which is inconsistent with what I would expect as a developer. Since users of super are already encouraged to maintain consistency, I'd recommend against this practice.
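To see the skip concretely, here's a small variant of the question's hierarchy where each method reports which classes actually ran:

```python
class A:
    def method(self, arg):
        return 'A'

class B(A):
    def method(self, arg):
        return 'B->' + super(B, self).method(arg)

class C(B):
    def method(self, arg):
        # super(B, self) starts the MRO search *after* B,
        # so B.method is skipped entirely.
        return 'C->' + super(B, self).method(arg)

print(B().method(None))  # B->A
print(C().method(None))  # C->A  (B.method never ran)
```

Writing super(C, self) in C.method instead would restore the normal chain and produce C->B->A.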