Can someone help me understand how MRO (method resolution order) works in Python?
Suppose I have four classes: Character, Thief, Agile, and Sneaky. Character is the superclass of Thief; Agile and Sneaky are siblings. Please see my code and question below.
class Character:
    def __init__(self, name="", **kwargs):
        if not name:
            raise ValueError("'name' is required")
        self.name = name
        for key, value in kwargs.items():
            setattr(self, key, value)
class Agile:
    agile = True

    def __init__(self, agile=True, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.agile = agile
class Sneaky:
    sneaky = True

    def __init__(self, sneaky=True, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.sneaky = sneaky
import random

class Thief(Agile, Sneaky, Character):
    def pickpocket(self):
        return self.sneaky and bool(random.randint(0, 1))

parker = Thief(name="Parker", sneaky=False)
So, here is what I think is going on; please let me know if I'm understanding it correctly.
Since Agile is first in the list, all arguments are first sent to Agile, where they are matched against Agile's parameters. Anything that matches is assigned, and everything without a matching keyword is packed up in **kwargs and sent to the Sneaky class (via super()), where the same thing happens: all arguments get unpacked and matched against Sneaky's parameters (this is when sneaky=False is set), then the rest is packed up in **kwargs and sent to Character. Then everything within Character's __init__ method runs and the remaining values are set (like name="Parker").
HOW I THINK MRO WORKS ON THE WAY BACK
Now that everything has made it to the Character class and everything in Character's __init__ method has run, control has to go back to the Sneaky and Agile classes to finish running everything in their __init__ methods (everything after their super() calls). So it will first go back to Sneaky and finish its __init__ method, then go back to Agile and finish the rest of its __init__ method (respectively).
Do I have it confused anywhere? Phew. I'm sorry, I know this is a lot, but I'm really stuck here and I'm trying to get a clear understanding of how MRO works.
Thank you, everyone.
Your code as posted doesn't quite run as-is. But, guessing at how it's supposed to work…
Yes, you've got things basically right.
But you should be able to verify this yourself, in two ways. And knowing how to verify it may be even more important than knowing the answer.
First, just print out Thief.mro(). It should look something like this:
[Thief, Agile, Sneaky, Character, object]
And then you can see which classes provide an __init__ method, and therefore how they'll be chained up if everyone just calls super:
>>> [cls for cls in Thief.mro() if '__init__' in cls.__dict__]
[Agile, Sneaky, Character, object]
And, just to make sure Agile really does get called first:
>>> Thief.__init__
<function Agile.__init__>
Second, you can run your code in the debugger and step through the calls.
Or you can just add print statements at the top and bottom of each one, like this:
def __init__(self, agile=True, *args, **kwargs):
    print(f'> Agile.__init__(agile={agile}, args={args}, kwargs={kwargs})')
    super().__init__(*args, **kwargs)
    self.agile = agile
    print(f'< Agile.__init__: agile={agile}')
(You could even write a decorator that does this automatically, with a bit of inspect magic.)
If you do that, it'll print out something like:
> Agile.__init__(agile=True, args=(), kwargs={'name': 'Parker', 'sneaky': False})
> Sneaky.__init__(sneaky=False, args=(), kwargs={'name': 'Parker'})
> Character.__init__(name='Parker', args=(), kwargs={})
< Character.__init__: name='Parker'
< Sneaky.__init__: sneaky=False
< Agile.__init__: agile=True
So, you're right about the order things get called via super, and the order the stack gets popped on the way back is obviously the exact opposite.
But, meanwhile, you've got one detail wrong:
sent to the Sneaky class (via super), where the same thing will happen - all arguments get unpacked, cross-referenced with the Sneaky parameters (this is when sneaky = False is set)
This is where the parameter/local variable sneaky gets set, but self.sneaky doesn't get set until after the super returns. Until then (including during Character.__init__, and similarly for any other mixins that you choose to throw in after Sneaky), there is no sneaky in self.__dict__, so if anyone were to try to look up self.sneaky, they'd only be able to find the class attribute—which has the wrong value.
Which raises another point: What are those class attributes for? If you wanted them to provide default values, you've already got default values on the initializer parameters for that, so they're useless.
If you wanted them to provide values during initialization, then they're potentially wrong, so they're worse than useless. If you need to have a self.sneaky before calling Character.__init__, the way to do that is simple: just move self.sneaky = sneaky up before the super() call.
In fact, that's one of the strengths of Python's "explicit super" model. In some languages, like C++, constructors are always called automatically, whether from inside out or outside in. Python forcing you to do it explicitly is less convenient, and easier to get wrong. But it means you can choose to do your setup either before or after the base class gets its chance (or, of course, a little of each), which is sometimes useful.
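For instance, here is a minimal sketch of that "assign before super()" ordering; Base and seen_during_init are hypothetical names added just to make the ordering observable:

```python
class Base:
    def __init__(self, **kwargs):
        # This runs while the super() chain is still unwinding, so it sees
        # whatever self.sneaky is at this moment.
        self.seen_during_init = getattr(self, "sneaky", None)

class Sneaky(Base):
    sneaky = True  # class-level default, as in the question

    def __init__(self, sneaky=True, **kwargs):
        self.sneaky = sneaky        # set BEFORE chaining up...
        super().__init__(**kwargs)  # ...so Base.__init__ sees the real value

s = Sneaky(sneaky=False)
print(s.seen_during_init)  # False, not the stale class default True
```

If the assignment were moved back below the super() call, Base.__init__ would see the class attribute True instead.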
Related
I'm trying to add flexibility to a Python class, so that it notices when one of the __init__ arguments is already an instance of that class. Skip "Initial situation" if you don't care how I got here.
Initial situation
I have this class:
class Pet:
    def __init__(self, animal):
        self._animal = animal

    @property
    def present(self):
        return "This pet is a " + self._animal

    ...
and there are many functions which accept an instance of this class as an argument (def f(pet, ...)). Everything worked as expected.
I then wanted to add some flexibility to the usage of these functions: if the caller passes a Pet instance, everything keeps on working as before. In all other cases, a Pet instance is created. One way to achieve that, is like this:
def f(pet_or_animal, ...):
    if isinstance(pet_or_animal, Pet):  # Pet instance was passed
        pet = pet_or_animal
    else:  # animal string was passed
        pet = Pet(pet_or_animal)
    ...
This also works as expected, but these lines are repeated in every function. Not DRY, not good.
Goal
So, I'd like to extract the if/else from each of the functions, and integrate it into the Pet class itself. I tried changing its __init__ method to
class PetA:  # I've changed the name to facilitate discussion here.
    def __init__(self, pet_or_animal):
        if isinstance(pet_or_animal, PetA):
            self = pet_or_animal
        else:
            self._animal = pet_or_animal
    ...
and start each function with
def f(pet_or_animal, ...):
    pet = PetA(pet_or_animal)
    ...
However, that is not working. If a Pet instance is passed, everything is good, but if a string is passed, a Pet instance is not correctly created.
Current (ugly) solution
What is working, is to add a class method to the class, like so:
class PetB:  # I've changed the name to facilitate discussion here.
    @classmethod
    def init(cls, pet_or_animal):
        if isinstance(pet_or_animal, PetB):
            return pet_or_animal
        else:
            return cls(pet_or_animal)

    def __init__(self, animal):
        self._animal = animal
    ...
and also change the functions to
def f(pet_or_animal, ...):
    pet = PetB.init(pet_or_animal)  # ugly
    ...
Questions
Does anyone know how to change class PetA so that it has the intended behavior? To be sure, here is the quick test:
pb1 = PetB.init('dog')
pb2 = PetB.init(pb1) #correctly initialized; points to same instance as pb1 (as desired)
pa1 = PetA('cat')
pa2 = PetA(pa1) #incorrectly initialized; pa1 != pa2
More generally, is this the right way to go about adding this flexibility? Another option I considered was writing a separate function to just do the checking, but this too is rather ugly and yet another thing to keep track of. I'd rather keep everything neat and wrapped in the class itself.
And one final remark: I realize that some people might find the added class method (PetB) a more elegant solution. The reason I prefer to add to the __init__ method (PetA) is that, in my real-world use, I already allow for many different types of initialization arguments. So, there is already a chain of if/elif/elif/... statements that checks which of the possibilities the creator used. I'd like to extend that by one more case, namely, when an initialized instance is passed.
Many thanks
I believe your current "ugly" solution is actually the correct approach.
This pushes the messy flexibility up as far as possible, toward the caller. Even though Python allows arbitrary types and values to float around, your users and you will thank you for keeping that flexibility constrained to the outermost levels.
I would think of it as (don't need to implement it this way)
class Pet:
    @classmethod
    def from_animal(cls, ...):
        ...

    @classmethod
    def from_pet(cls, ...):
        ...

    @classmethod
    def auto(cls, ...):
        if is_pet(...):
            return cls.from_pet(...)
        return cls.from_animal(...)

    def __init__(self, internal_rep):
        ...

etc.
It is a code smell if you don't know whether your function is taking an object or an initializer. See if you can do processing as up-front as possible with user input and standardize everything beyond there.
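A filled-in, runnable version of that sketch might look like this (the internal string representation and the method bodies are just assumptions for illustration):

```python
class Pet:
    def __init__(self, animal):
        # __init__ only ever sees the internal representation.
        self._animal = animal

    @classmethod
    def from_animal(cls, animal):
        return cls(animal)

    @classmethod
    def from_pet(cls, pet):
        # Reuse the existing instance (could also copy it instead).
        return pet

    @classmethod
    def auto(cls, pet_or_animal):
        # The only place that has to care which kind of input it got.
        if isinstance(pet_or_animal, cls):
            return cls.from_pet(pet_or_animal)
        return cls.from_animal(pet_or_animal)

p1 = Pet.auto("dog")
p2 = Pet.auto(p1)
print(p2 is p1)  # True: the existing instance is reused
```

This keeps __init__ simple and pushes the type dispatch into one named entry point.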
You could use a function instead to get the same behaviour you want:
def make_pet_if_required(pet_or_animal):
    if isinstance(pet_or_animal, Pet):
        return pet_or_animal
    else:
        return Pet(pet_or_animal)
And then:
def f(pet_or_animal, ...):
    pet = make_pet_if_required(pet_or_animal)
    ...
For more "beauty" you can try turning that function call into a decorator.
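A sketch of that decorator idea (the decorator name is made up) might be:

```python
import functools

class Pet:
    def __init__(self, animal):
        self._animal = animal

def accepts_pet_or_animal(func):
    """Wrap func so its first argument is always coerced to a Pet."""
    @functools.wraps(func)
    def wrapper(pet_or_animal, *args, **kwargs):
        if not isinstance(pet_or_animal, Pet):
            pet_or_animal = Pet(pet_or_animal)
        return func(pet_or_animal, *args, **kwargs)
    return wrapper

@accepts_pet_or_animal
def f(pet):
    return pet._animal

print(f("dog"))       # a string is wrapped into a Pet first
print(f(Pet("cat")))  # an existing Pet passes straight through
```

The if/else now lives in exactly one place, and each function just declares that it wants the coercion.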
I am writing a GUI in wxPython, and am creating a custom control for displaying a terminal window, as I have not been able to find one currently in existence.
My control TerminalCtrl extends wx.Control, and my __init__ definition starts as follows:
def __init__(self, *args, **kwargs):
    super().__init__(*args, **kwargs)
I would like to enforce the following style:
style=wx.BORDER_NONE
That is, no borders will ever be allowed on this window. However, I would still like to allow other styles to be applied, at the programmer's discretion.
For reference, the __init__ function for wx.Control is defined as follows
__init__ (self, parent, id=ID_ANY, pos=DefaultPosition, size=DefaultSize, style=0, validator=DefaultValidator, name=ControlNameStr)
What I would like to achieve is that I may filter the style parameter to enforce the wx.BORDER_NONE style. It is my understanding that this could be in either *args or **kwargs, depending on whether the parameters are passed by position or by specifically referencing the parameter name such as (style=wx.BORDER_NONE).
Is there a standard/recommended/pythonic way that I may enforce such a filter upon a parameter before passing it on to wx.Control.__init__ and if so how may I achieve that?
The cleanest way is probably to just copy the base class's signature:
def __init__(self, parent, id=ID_ANY, pos=DefaultPosition,
             size=DefaultSize, style=0, validator=DefaultValidator,
             name=ControlNameStr):
    style |= wx.BORDER_NONE
    super().__init__(parent, id, pos, size, style, validator, name)
This can get a bit ugly if you're doing this for a whole bunch of classes whose construction signatures all have a whole bunch of positional-or-keyword parameters. Or if you're doing it for an API that changes regularly.
For those cases, you can always do it dynamically, with inspect:
import inspect

_wxControlSig = inspect.signature(wx.Control)

class TerminalCtrl(wx.Control):
    def __init__(self, *args, **kwargs):
        bound = _wxControlSig.bind(*args, **kwargs)
        bound.apply_defaults()
        bound.arguments['style'] |= wx.BORDER_NONE
        super().__init__(*bound.args, **bound.kwargs)
If you were doing dozens of these, you'd probably want to write a decorator to help out. And you might also want to apply functools.wraps or do the equivalent manually to make your signature introspectable. (And if you weren't doing dozens of these, you'd probably want to just be explicit, as in the example at the top of the answer.)
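Such a decorator might look something like this; it's shown with a plain function and a stand-in flag value rather than wx, so the sketch stays self-contained:

```python
import functools
import inspect

def force_flag(param_name, flag):
    """Decorator factory: OR `flag` into `param_name`, however it was passed."""
    def decorate(func):
        sig = inspect.signature(func)
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind(*args, **kwargs)
            bound.apply_defaults()
            bound.arguments[param_name] |= flag
            return func(*bound.args, **bound.kwargs)
        return wrapper
    return decorate

BORDER_NONE = 0x200000  # stand-in for wx.BORDER_NONE

@force_flag('style', BORDER_NONE)
def make_control(parent, style=0):
    return style  # just report the effective style

print(hex(make_control(None)))              # flag applied even when defaulted
print(hex(make_control(None, style=0x10)))  # flag OR'd into an explicit style
```

Thanks to signature binding, it doesn't matter whether the caller passed style positionally, by keyword, or not at all.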
If you have something which is just a bit too repetitive and annoying to do explicitly, but not worth going crazy with the introspection, the only thing in between is something decidedly hacky, like this:
def __init__(self, *args, **kwargs):
    if len(args) > 4:  # style is the fifth positional argument
        args = list(args)
        args[4] |= wx.BORDER_NONE
    elif 'style' in kwargs:
        kwargs['style'] |= wx.BORDER_NONE
    else:
        kwargs['style'] = wx.BORDER_NONE
    super().__init__(*args, **kwargs)
For Python 2.x (or 3.0-3.2), where you don't have signature, only getargspec and friends, this might be tempting. But for 3.3+, the only reason to avoid signature would be optimizing out a few nanoseconds. And when the function in question is the constructor for a widget that involves talking to the system window manager, that would be a pretty silly thing to worry about.
Is there a generally accepted best practice for creating a class whose instances will have many (non-defaultable) variables?
For example, by explicit arguments:
class Circle(object):
    def __init__(self, x, y, radius):
        self.x = x
        self.y = y
        self.radius = radius
using **kwargs:
class Circle(object):
    def __init__(self, **kwargs):
        if 'x' in kwargs:
            self.x = kwargs['x']
        if 'y' in kwargs:
            self.y = kwargs['y']
        if 'radius' in kwargs:
            self.radius = kwargs['radius']
or using properties:
class Circle(object):
    def __init__(self):
        pass

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value

    @property
    def y(self):
        return self._y

    @y.setter
    def y(self, value):
        self._y = value

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        self._radius = value
For classes which implement a small number of instance variables (like the example above), it seems like the natural solution is to use explicit arguments, but this approach quickly becomes unruly as the number of variables grows. Is there a preferred approach when the number of instance variables grows lengthy?
I'm sure there are many different schools of thought on this, but here's how I've usually thought about it:
Explicit Keyword Arguments
Pros
Simple, less code
Very explicit, clear what attributes you can pass to the class
Cons
Can get very unwieldy, as you mention, when you have LOTS of things to pass in
Prognosis
This should usually be your method of first attack. If you find, however, that the list of things you are passing in is getting too long, it is likely pointing to a structural problem with the code. Do some of the things you are passing in share common ground? Could you encapsulate that in a separate object? Sometimes I've used config objects for this, and then you go from passing in a gazillion args to passing in 1 or 2.
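As a sketch of that config-object idea (all the names here are made up):

```python
from dataclasses import dataclass

@dataclass
class RetryConfig:
    # Three related knobs bundled into one object.
    attempts: int = 3
    delay: float = 0.5
    backoff: float = 2.0

class Client:
    def __init__(self, host, retry=None):
        self.host = host
        # One config argument instead of three separate keyword arguments.
        self.retry = retry if retry is not None else RetryConfig()

c = Client("example.com", retry=RetryConfig(attempts=5))
print(c.retry.attempts)  # 5
```

The grouping also gives the related parameters a name and a place to document themselves.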
Using **kwargs
Pros
Seamlessly modify or transform arguments before passing it to a wrapped system
Great when you want to make a variable number of arguments look like part of the api, e.g. if you have a list or dictionary
Avoid endlessly long and hard-to-maintain passthrough definitions to a lower-level system, e.g.:
def do_it(a, b, thing=None, zip=2, zap=100, zimmer='okay', zammer=True):
    # do some stuff with a and b
    # ...
    get_er_done(abcombo, thing=thing, zip=zip, zap=zap, zimmer=zimmer, zammer=zammer)
Instead becomes:
def do_it(a, b, **kwargs):
    # do some stuff with a and b
    # ...
    get_er_done(abcombo, **kwargs)
Much cleaner in cases like this, and you can check get_er_done for the full signature (although good docstrings can also just list all the arguments as if they were real arguments accepted by do_it).
Cons
Makes it less readable and explicit what the arguments are in cases where it is not a more or less simple passthrough
Can really easily hide bugs and obfuscate things for maintainers if you are not careful
Prognosis
The *args and **kwargs syntax is super useful, but also can be super dangerous and hard to maintain as you lose the explicit nature of what arguments you can pass in. I usually like to use these in situations when I have a method that basically is just a wrapper around another method or system and you want to just pass things through without defining everything again, or in interesting cases where the arguments need to be pre-filtered or made more dynamic, etc. If you are just using it to hide the fact that you have tons and tons of arguments and keyword arguments, **kwargs will probably just exacerbate the problem by making your code even more unwieldy and arcane.
Using Properties
Pros
Very explicit
Provides a great way of creating objects that are somehow still "valid" when not all parameters are yet known, and of passing half-formed objects through a pipeline to slowly populate their attributes. Also, for attributes that don't need to be set but could be, it sometimes provides a clean way of paring down your __init__'s
Are great when you want to present a simple interface of attributes, e.g. for an api, but under the hood are doing more complicated cooler things like maintaining caches, or other fun stuff
Cons
A lot more verbose, more code to maintain
As a counterpoint to the above, can introduce danger by allowing invalid, partially initialized objects to exist when they should never be allowed to
Prognosis
I actually really like taking advantage of the getter and setter properties, especially when I am doing tricky stuff with private versions of those attributes that I don't want to expose. It can also be good for config objects and other things and is nice and explicit, which I like. However, if I am initializing an object where I don't want to allow half-formed ones to be walking around and they are serving no purpose, it's still better to just go with explicit argument and keyword arguments.
TL;DR
**kwargs and properties have nice specific use cases, but just stick to explicit keyword arguments whenever practical/possible. If there are too many instance variables, consider breaking up your class into hierarchical container objects.
Without really knowing the particulars of your situation, the classic answer is this: if your class initializer requires a whole bunch of arguments, then it is probably doing too much, and it should be factored into several classes.
Take a Car class defined as such:
class Car:
    def __init__(self, tire_size, tire_tread, tire_age, paint_color,
                 paint_condition, engine_size, engine_horsepower):
        self.tire_size = tire_size
        self.tire_tread = tire_tread
        # ...
        self.engine_horsepower = engine_horsepower
Clearly a better approach would be to define Engine, Tire, and Paint classes (or namedtuples) and pass instances of these into Car():
class Car:
    def __init__(self, tire, paint, engine):
        self.tire = tire
        self.paint = paint
        self.engine = engine
If something is required to make an instance of a class, for example, radius in your Circle class, it should be a required argument to __init__ (or factored into a smaller class which is passed into __init__, or set by an alternative constructor). The reason is this: IDEs, automatic documentation generators, code autocompleters, linters, and the like can read a method's argument list. If it's just **kwargs, there's no information there. But if it has the names of the arguments you expect, then these tools can do their work.
Now, properties are pretty cool, but I'd hesitate to use them until necessary (and you'll know when they are necessary). Leave your attributes as they are and allow people to access them directly. If they shouldn't be set or changed, document it.
Lastly, if you really must have a whole bunch of arguments, but don't want to write a bunch of assignments in your __init__, you might be interested in Alex Martelli's answer to a related question.
Passing arguments to the __init__ is usually the best practice like in any Object Oriented programming language. In your example, setters/getters would allow the object to be in this weird state where it doesn't have any attribute yet.
Specifying the arguments, or using **kwargs depends on the situation. Here's a good rule of thumb:
If you have many arguments, **kwargs is a good solution, since it avoids code like this:
def __init__(self, first, second, third, fourth, fifth, sixth, seventh,
             eighth, ninth, tenth, eleventh, twelfth, thirteenth,
             fourteenth, ...
             )
If you're heavily using inheritance, **kwargs is the best solution:
class Parent:
    def __init__(self, many, arguments, here):
        self.many = many
        self.arguments = arguments
        self.here = here

class Child(Parent):
    def __init__(self, **kwargs):
        self.extra = kwargs.pop('extra')
        super().__init__(**kwargs)
avoids writing:
class Child(Parent):
    def __init__(self, many, arguments, here, extra):
        self.extra = extra
        super().__init__(many, arguments, here)
For all other cases, specifying the arguments is better since it allows developers to use both positional and named arguments, like this:
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
This can be instantiated as Point(1, 2) or Point(x=1, y=2).
For general knowledge, you can see how namedtuple does it and use it.
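For example, a namedtuple gives you exactly that kind of explicit positional-or-keyword constructor for free:

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])

p1 = Point(1, 2)      # positional
p2 = Point(x=1, y=2)  # keyword
print(p1 == p2)       # True: namedtuples compare like plain tuples
print(p1.x, p1.y)     # 1 2
```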
Your second approach can be written in more elegant way:
class A:
    def __init__(self, **kwargs):
        self.__dict__ = {**self.__dict__, **kwargs}

a = A(x=1, y=2, verbose=False)
b = A(x=5, y=6, z=7, comment='bar')
print(a.x + b.x)
But all already mentioned disadvantages persist...
Why doesn't object.__init__ take *args, **kwargs as arguments? This breaks some simple code in a highly annoying manner without any upsides as far as I can see:
Say we want to make sure that all __init__s of all parent classes are called. As long as every __init__ follows the simple convention of calling super().__init__, this guarantees that the whole hierarchy is run through, each class exactly once (and without ever having to name the parent explicitly). The problem appears when we pass arguments along:
class Foo:
    def __init__(self, *args, **kwargs):
        print("foo-init")
        super().__init__(*args, **kwargs)  # error if there are arguments!

class Bar:
    def __init__(self, *args, **kwargs):
        print("bar-init")
        super().__init__(*args, **kwargs)

class Baz(Bar, Foo):
    def __init__(self, *args, **kwargs):
        print("baz-init")
        super().__init__(*args, **kwargs)
b1 = Baz()         # works
b2 = Baz("error")  # raises TypeError
What's the reasoning for this, and what's the best general workaround? (It's easily solvable in my specific case, but that relies on additional knowledge of the hierarchy.) The best I can see is to check whether the parent is object and, in that case, not give it any args… horribly ugly, that.
You can see http://bugs.python.org/issue1683368 for a discussion. Note that someone there actually asked for it to cause an error. Also see the discussion on python-dev.
Anyway, your design is rather odd. Why are you writing every single class to take unspecified *args and **kwargs? In general it's better to have methods accept the arguments they need. Accepting open-ended arguments for everything can lead to all sorts of bugs if someone mistypes a keyword name, for instance. Sometimes it's necessary, but it shouldn't be the default way of doing things.
Raymond Hettinger's super() considered super has some information about how to deal with this. It's in the section "Practical advice".
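The core of that advice: each cooperative class peels off the keyword arguments it knows about and passes the rest along, so the dict is empty by the time the chain reaches object. Roughly, following the Shape example from that post:

```python
class Shape:
    def __init__(self, shapename, **kwargs):
        self.shapename = shapename
        super().__init__(**kwargs)  # kwargs should be empty by now

class ColoredShape(Shape):
    def __init__(self, color, **kwargs):
        self.color = color
        super().__init__(**kwargs)  # forward everything Shape might want

cs = ColoredShape(color='red', shapename='circle')
print(cs.shapename, cs.color)  # circle red
```

Because every class consumes its own keywords, object.__init__ ends up called with no arguments and nothing breaks; a leftover keyword would instead fail loudly, which is usually what you want.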
I've been striving mightily for three days to wrap my head around __init__ and "self", starting at Learn Python the Hard Way exercise 42, and moving on to read parts of the Python documentation, Alan Gauld's chapter on Object-Oriented Programming, Stack threads like this one on "self", and this one, and frankly, I'm getting ready to hit myself in the face with a brick until I pass out.
That being said, I've noticed a really common convention in initial __init__ definitions, which is to follow up with (self, foo) and then immediately declare, within that definition, that self.foo = foo.
From LPTHW, ex42:
class Game(object):
    def __init__(self, start):
        self.quips = ["a list", "of phrases", "here"]
        self.start = start
From Alan Gauld:
def __init__(self, val): self.val = val
I'm in that horrible space where I can see that there's just One Big Thing I'm not getting, and it's remaining opaque no matter how much I read about it and try to figure it out. Maybe if somebody can explain this little bit of consistency to me, the light will turn on. Is this because we need to say that "foo," the variable, will always be equal to the (foo) parameter, which is itself contained in the "self" parameter that's automatically assigned to the def it's attached to?
You might want to study up on object-oriented programming.
Loosely speaking, when you say
class Game(object):
    def __init__(self, start):
        self.start = start
you're saying:
I have a type of "thing" named Game
Whenever a new Game is created, it will demand some extra piece of information from me: start. (This is because the Game's initializer, named __init__, asks for this information.)
The initializer (also referred to as the "constructor", although that's a slight misnomer) needs to know which object (which was created just a moment ago) it's initializing. That's the first parameter -- which is usually called self by convention (but which you could call anything else...).
The game probably needs to remember what the start I gave it was. So it stores this information "inside" itself, by creating an instance variable also named start (nothing special, it's just whatever name you want), and assigning the value of the start parameter to the start variable.
If it doesn't store the value of the parameter, it won't have that information available for later use.
Hope this explains what's happening.
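Concretely (the start value here is just an example string):

```python
class Game(object):
    def __init__(self, start):
        self.start = start  # remember the argument on the instance

g = Game("central_corridor")
print(g.start)  # central_corridor
```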
I'm not quite sure what you're missing, so let me hit some basic items.
There are two "special" intialization names in a Python class object, one that is relatively rare for users to worry about, called __new__, and one that is much more usual, called __init__.
When you invoke a class-object constructor, e.g. (based on your example) x = Game(args), this first calls Game.__new__ to obtain memory in which to hold the object, and then Game.__init__ to fill in that memory. Most of the time, you can allow the underlying object.__new__ to allocate the memory, and you just need to fill it in. (You can use your own allocator for special weird rare cases like objects that never change and may share identities, the way ordinary integers do for instance. It's also for "metaclasses" that do weird stuff. But that's all a topic for much later.)
Your Game.__init__ function is called with "all the arguments to the constructor" plus one stashed in the front, which is the memory allocated for that object itself. (For "ordinary" objects that's mostly a dictionary of "attributes", plus the magic glue for classes, but for objects with __slots__ the attributes dictionary is omitted.) Naming that first argument self is just a convention—but don't violate it, people will hate you if you do. :-)
There's nothing that requires you to save all the arguments to the constructor. You can set any or all instance attributes you like:
class Weird(object):
    def __init__(self, required_arg1, required_arg2, optional_arg3='spam'):
        self.irrelevant = False

    def __str__(self):
        ...
The thing is that a Weird() instance is pretty useless after initialization, because you're required to pass two arguments that are simply thrown away, and may pass a third optional argument that is also thrown away:
x = Weird(42, 0.0, 'maybe')
The only point in requiring those thrown-away arguments is for future expansion, as it were (you might have these unused fields during early development). So if you're not immediately using and/or saving arguments to __init__, something is definitely weird in Weird.
Incidentally, the only reason for using (object) in the class definition is to indicate to Python 2.x that this is a "new-style" class (as distinguished from very-old-Python "instance only" classes). But it's generally best to use it—it makes what I said above about object.__new__ true, for instance :-) —until Python 3, where the old-style stuff is gone entirely.
Parameter names should be meaningful, to convey the role they play in the function/method or some information about their content.
You can see parameters of constructors to be even more important because they are often required for the working of the new instance and contain information which is needed in other methods of the class as well.
Imagine you have a Game class which accepts a playerList.
class Game:
    def __init__(self, playerList):
        self.playerList = playerList  # or self.players = playerList

    def printPlayerList(self):
        print(self.playerList)  # or print(self.players)
This list is needed in various methods of the class. Hence it makes sense to assign it to self.playerList. You could also assign it to self.players, whatever you feel more comfortable with and you think is understandable. But if you don't assign it to self.<somename> it won't be accessible in other methods.
So there is nothing special about how to name parameters/attributes/etc (there are some special class methods though), but using meaningful names makes the code easier to understand. Or would you understand the meaning of the above class if you had:
class G:
    def __init__(self, x):
        self.y = x

    def ppl(self):
        print(self.y)
? :) It does exactly the same but is harder to understand...