Is there a generally accepted best practice for creating a class whose instances will have many (non-defaultable) variables?
For example, by explicit arguments:
class Circle(object):
    def __init__(self, x, y, radius):
        self.x = x
        self.y = y
        self.radius = radius
using **kwargs:
class Circle(object):
    def __init__(self, **kwargs):
        if 'x' in kwargs:
            self.x = kwargs['x']
        if 'y' in kwargs:
            self.y = kwargs['y']
        if 'radius' in kwargs:
            self.radius = kwargs['radius']
or using properties:
class Circle(object):
    def __init__(self):
        pass

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value

    @property
    def y(self):
        return self._y

    @y.setter
    def y(self, value):
        self._y = value

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        self._radius = value
For classes which implement a small number of instance variables (like the example above), it seems like the natural solution is to use explicit arguments, but this approach quickly becomes unruly as the number of variables grows. Is there a preferred approach when the number of instance variables grows lengthy?
I'm sure there are many different schools of thought on this, but here's how I've usually thought about it:
Explicit Keyword Arguments
Pros
Simple, less code
Very explicit, clear what attributes you can pass to the class
Cons
Can get very unwieldy, as you mention, when you have lots of things to pass in
Prognosis
This should usually be your method of first attack. If you find, however, that the list of things you are passing in is getting too long, it likely points to a structural problem with the code. Do some of the things you are passing in share common ground? Could you encapsulate that in a separate object? Sometimes I've used config objects for this, and then you go from passing in a gazillion args to passing in one or two.
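For instance, a rough sketch of the config-object idea (the CircleStyle name and its fields are invented for illustration):

class CircleStyle:
    """Groups the drawing-related parameters that would otherwise be separate args."""
    def __init__(self, color='black', line_width=1, fill=False):
        self.color = color
        self.line_width = line_width
        self.fill = fill

class Circle(object):
    def __init__(self, x, y, radius, style=None):
        self.x = x
        self.y = y
        self.radius = radius
        self.style = style or CircleStyle()  # one object instead of several args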
Using **kwargs
Pros
Seamlessly modify or transform arguments before passing them to a wrapped system
Great when you want to make a variable number of arguments look like part of the API, e.g. if you have a list or dictionary
Avoid endlessly long and hard-to-maintain passthrough definitions to a lower-level system, e.g.:
def do_it(a, b, thing=None, zip=2, zap=100, zimmer='okay', zammer=True):
    # do some stuff with a and b
    # ...
    get_er_done(abcombo, thing=thing, zip=zip, zap=zap, zimmer=zimmer, zammer=zammer)
Instead becomes:
def do_it(a, b, **kwargs):
    # do some stuff with a and b
    # ...
    get_er_done(abcombo, **kwargs)
Much cleaner in cases like this, and you can look at get_er_done for the full signature, although a good docstring can also simply list all the arguments as if they were real arguments accepted by do_it
Cons
Makes it less readable and less explicit what the arguments are in cases that are not a more or less simple passthrough
Can easily hide bugs and obfuscate things for maintainers if you are not careful
Prognosis
The *args and **kwargs syntax is super useful, but it can also be dangerous and hard to maintain, since you lose the explicit declaration of which arguments you can pass in. I usually use it when I have a method that is basically a wrapper around another method or system and I want to pass things through without defining everything again, or in interesting cases where the arguments need to be pre-filtered or made more dynamic. If you are just using it to hide the fact that you have tons and tons of arguments and keyword arguments, **kwargs will probably just exacerbate the problem by making your code even more unwieldy and arcane.
Using Properties
Pros
Very explicit
Provides a great way of creating objects that are somehow still "valid" when not all parameters are yet known, so you can pass half-formed objects through a pipeline that slowly populates them. Also, for attributes that don't need to be set but could be, it sometimes provides a clean way of paring down your __init__
Great when you want to present a simple interface of attributes, e.g. for an API, but under the hood do more complicated, cooler things like maintaining caches or other fun stuff
Cons
A lot more verbose, more code to maintain
As a counterpoint to the above, properties can introduce danger by allowing invalid objects, with some attributes not yet initialized, to exist when they never should
Prognosis
I actually really like taking advantage of getter and setter properties, especially when I am doing tricky stuff with private versions of those attributes that I don't want to expose. It can also be good for config objects and other things, and it is nice and explicit, which I like. However, if I am initializing an object where I don't want half-formed instances walking around serving no purpose, it's still better to just go with explicit arguments and keyword arguments.
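As a minimal sketch of that getter/setter pattern (the validation logic here is illustrative, not from the original question):

class Circle(object):
    def __init__(self):
        self._radius = None  # half-formed until a radius is assigned

    @property
    def radius(self):
        if self._radius is None:
            raise AttributeError("radius has not been set yet")
        return self._radius

    @radius.setter
    def radius(self, value):
        if value <= 0:
            raise ValueError("radius must be positive")
        self._radius = value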
TL;DR
**kwargs and properties have nice specific use cases, but just stick to explicit keyword arguments whenever practical/possible. If there are too many instance variables, consider breaking up your class into hierarchical container objects.
Without really knowing the particulars of your situation, the classic answer is this: if your class initializer requires a whole bunch of arguments, then it is probably doing too much, and it should be factored into several classes.
Take a Car class defined as such:
class Car:
    def __init__(self, tire_size, tire_tread, tire_age, paint_color,
                 paint_condition, engine_size, engine_horsepower):
        self.tire_size = tire_size
        self.tire_tread = tire_tread
        # ...
        self.engine_horsepower = engine_horsepower
Clearly a better approach would be to define Engine, Tire, and Paint
classes (or namedtuples) and pass instances of these into Car():
class Car:
    def __init__(self, tire, paint, engine):
        self.tire = tire
        self.paint = paint
        self.engine = engine
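For example, the components could be lightweight namedtuples (the field names below are invented for illustration):

from collections import namedtuple

Tire = namedtuple('Tire', ['size', 'tread', 'age'])
Paint = namedtuple('Paint', ['color', 'condition'])
Engine = namedtuple('Engine', ['size', 'horsepower'])

car = Car(tire=Tire(size=17, tread='new', age=0),
          paint=Paint(color='red', condition='mint'),
          engine=Engine(size=2.0, horsepower=150))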
If something is required to make an instance of a class, for example, radius in your Circle class, it should be a required argument to __init__ (or factored into a smaller class which is passed into __init__, or set by an alternative constructor). The reason is this: IDEs, automatic documentation generators, code autocompleters, linters, and the like can read a method's argument list. If it's just **kwargs, there's no information there. But if it has the names of the arguments you expect, then these tools can do their work.
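An alternative constructor is typically written as a classmethod; here is a minimal sketch for the Circle example (from_diameter is a hypothetical name):

class Circle(object):
    def __init__(self, radius):
        self.radius = radius

    @classmethod
    def from_diameter(cls, diameter):
        # Derives the required radius argument instead of demanding it directly.
        return cls(radius=diameter / 2)

c = Circle.from_diameter(10)  # c.radius == 5.0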
Now, properties are pretty cool, but I'd hesitate to use them until necessary (and you'll know when they are necessary). Leave your attributes as they are and allow people to access them directly. If they shouldn't be set or changed, document it.
Lastly, if you really must have a whole bunch of arguments, but don't want to write a bunch of assignments in your __init__, you might be interested in Alex Martelli's answer to a related question.
Passing arguments to __init__ is usually the best practice, as in any object-oriented programming language. In your example, setters/getters would allow the object to be in a weird state where it doesn't have any attributes yet.
Specifying the arguments, or using **kwargs depends on the situation. Here's a good rule of thumb:
If you have many arguments, **kwargs is a good solution, since it avoids code like this:
def __init__(self, first, second, third, fourth, fifth, sixth, seventh,
             eighth, ninth, tenth, eleventh, twelfth, thirteenth,
             fourteenth, ...
            ):
If you're heavily using inheritance, **kwargs is the best solution:
class Parent:
    def __init__(self, many, arguments, here):
        self.many = many
        self.arguments = arguments
        self.here = here

class Child(Parent):
    def __init__(self, **kwargs):
        self.extra = kwargs.pop('extra')
        super().__init__(**kwargs)
This avoids writing:
class Child(Parent):
    def __init__(self, many, arguments, here, extra):
        self.extra = extra
        super().__init__(many, arguments, here)
For all other cases, specifying the arguments is better since it allows developers to use both positional and named arguments, like this:
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y
Can be instantiated by Point(1, 2) or Point(x=1, y=2).
For general knowledge, you can see how namedtuple does it and use it.
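For instance, a namedtuple gives you the same positional-or-keyword flexibility with no boilerplate:

from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])

p = Point(1, 2)      # positional
q = Point(x=1, y=2)  # keyword
print(p.x + q.y)     # 3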
Your second approach can be written in a more elegant way:
class A:
    def __init__(self, **kwargs):
        self.__dict__ = {**self.__dict__, **kwargs}

a = A(x=1, y=2, verbose=False)
b = A(x=5, y=6, z=7, comment='bar')
print(a.x + b.x)
But all the already-mentioned disadvantages persist...
Related
Preamble
This is a rather basic question, I realize, but I haven't been able to find a sturdy reference for it, which would likely be a mixture of technical details and best practices for well-behaved classes.
Question
When a parent and child class both define the same initialization parameter, but with default values, what's the best way to get sane behavior when the child class is created?
My assumptions are:
Classes only accept named parameters; I don't need to deal with positional arguments. That simplifies many things, both theoretically in reasoning about situations and practically in taking arguments from external config files, etc.
__init__ methods may be more sophisticated than just setting self.foo = foo for their arguments - they may transform arguments before storing them, use them to set other params, etc., and I'd like to be as respectful of that as possible.
Subclasses never break the interfaces of their parents, both for __init__ parameters and for attributes. Having a different default value is not considered "breaking".
Classes should never have to be aware of their subclasses; they should just do things in "reasonable ways" and it's up to subclasses to ensure everything still works properly. However, it's sometimes necessary to modify a superclass to be "more reasonable" if it's doing things that aren't amenable to being subclassed - this can form a set of principles that help everyone get along well.
Examples
In general, my idea of a "best practice" template for a derived class looks like this:
class Child(Parent):
    def __init__(self, arg=1, **kwargs):
        self.arg = arg
        super().__init__(**kwargs)
That works well in most situations - deal with our stuff, then delegate all the rest to our superclass.
However, it doesn't work well if arg is shared by both Child and Parent - neither the caller's argument nor the Child default is respected:
class Parent:
    def __init__(self, arg=0):
        self.arg = arg

class Child(Parent):
    def __init__(self, arg=1, **kwargs):
        self.arg = arg
        super().__init__(**kwargs)

print(Child(arg=6).arg)
# Prints `0` - bad
A better approach is probably for Child to acknowledge that the argument is shared:
class Parent:
    def __init__(self, arg=0):
        self.arg = arg

class Child(Parent):
    def __init__(self, arg=1, **kwargs):
        super().__init__(arg=arg, **kwargs)

print(Child(arg=6).arg)
# Prints `6` - good

print(Child().arg)
# Prints `1` - good
That successfully gets the defaults working according to expectations. What I'm not sure of is whether this plays well with the expectations of Parent. So I think my questions are:
If Parent.__init__ does some Fancy Stuff with arg and/or self.arg, how should Child be set up to respect that?
In general does this require knowing Too Much about the internals of Parent and how self.arg is used? Or are there reasonable practices that everyone can follow to draw that part of the interface contract in a clean way?
Are there any specific gotchas to keep in mind?
Parent.__init__ only expects that the caller may choose to omit an argument for the arg parameter. It doesn't matter if any particular caller (Child.__init__, in this case) always provides an argument, nor does it matter how the caller produces the value it passes.
Your third example is what I would write, with the addition that Parent.__init__ itself also uses super().__init__: it doesn't assume that it's the end of whatever MRO is in force for its self argument.
class Parent:
    def __init__(self, arg=0, **kwargs):
        super().__init__(**kwargs)
        self.arg = arg

class Child(Parent):
    def __init__(self, arg=1, **kwargs):
        super().__init__(arg=arg, **kwargs)
I am building a plotting class in Python, and am hoping to do the following. I want a graphics window using PyQt5 that also inherits from some custom classes I have made (such as a curve fitting class). In order for the curve fitting class to manipulate data that persists in the plotting class, it must have a reference to the data that is contained in the plotting class. Because of this, I have chosen the plotting class to inherit from the CurveFitting class.
The problem seems to arise in inheriting both from PyQt5's GraphicsWindow class and my custom class, which accept different numbers of arguments. I have read that Python does not play nice with classes that inherit different numbers of arguments using the "super" functionality, so I decided to make my custom CurveFitting class accept **kwargs, which would then give it a reference to the parent. However, I then encountered a different error which I do not understand. Below is a tidied up example of what I'm trying to do
import numpy as np
from pyqtgraph import GraphicsWindow

class ClassA():
    def __init__(self, **kwargs):
        super().__init__()
        self.kwargs = kwargs
        self.parent = self.kwargs['parent']
        self.xdata = self.parent.xdata

    def print_data(self):
        print(self.parent.xdata)
        print(self.parent.ydata)

class classC(GraphicsWindow, ClassA):
    def __init__(self):
        kwargs = {}
        kwargs['parent'] = self
        self.xdata = np.linspace(0, 100, 101)
        self.ydata = np.linspace(0, 200, 101)
        super().__init__(**kwargs)
        # ClassA.__init__(self, **kwargs)
        # GraphicsWindow.__init__(self)

instC = classC()
instC.print_data()
When I run the above I get "RuntimeError: super-class __init__() of type classC was never called" on the super().__init__(**kwargs) line, which I honestly do not understand at all, and I have tried googling for a while but to no avail.
Additionally, I have tried commenting out that line and uncommenting the next two lines to initialize each base class manually, but this also does not work. What I find pretty weird is that if I comment one of those two lines out, each works individually, but together they do not. For example, if I run it with both lines, it gives me an error that kwargs has no keyword 'parent', as if it didn't even pass **kwargs.
Is there a way to inherit from two classes that take a different number of initialization parameters like this? Is there a totally different way I could be approaching this problem? Thanks.
The immediate problem with your code is that classC inherits from GraphicsWindow as its first base class, with ClassA as the second base class. When you call super, only one __init__ gets called (GraphicsWindow's), and if that class was not designed to work with multiple inheritance (as seems to be the case), it may not call super itself or may not pass on the arguments that ClassA expects.
Just switching the order of the base classes may be enough to make it work. Python guarantees that the base classes will be called in the same relative order that they appear in the class statement in (though other classes may be inserted between them in the MRO if more inheritance happens later). Since ClassA.__init__ does call super, it should work better!
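Concretely, the reordering would look something like this (an untested sketch against the question's code; whether GraphicsWindow tolerates being initialized this way is an assumption):

class classC(ClassA, GraphicsWindow):  # ClassA first, so its __init__ runs first
    def __init__(self):
        self.xdata = np.linspace(0, 100, 101)
        self.ydata = np.linspace(0, 200, 101)
        # ClassA consumes 'parent' and then calls GraphicsWindow.__init__ via super()
        super().__init__(parent=self)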
It can be tricky to make __init__ methods work with multiple inheritance, though, even if all the classes involved are designed for it. This is why positional arguments are often avoided in this situation: their order can become very confusing, since child classes can only add positional arguments ahead of their parent's positional arguments unless they want to repeat all the names. Using keyword arguments is definitely a better approach.
But the code you have is making keyword-argument handling a bit more complicated than it should be. You shouldn't need to explicitly create dictionaries to pass on with **kwargs syntax, nor should you need to extract keyword values from a dict you accepted with a **kwargs argument. Usually each function should name the arguments it takes, and only use **kwargs for unknown arguments (that may be needed by some other class in the MRO). Here's what that looks like:
class Base1:
    def __init__(self, *, arg1, arg2, arg3, **kwargs):  # the * makes the following args keyword-only
        super().__init__(**kwargs)  # always pass on all unknown arguments
        ...  # use the named args here (not kwargs)

class Base2:
    def __init__(self, *, arg4, arg5, arg6, **kwargs):
        super().__init__(**kwargs)
        ...

class Derived(Base1, Base2):
    def __init__(self, *, arg2, arg7, **kwargs):  # a child can accept args used by its parents
        # it can then modify args, or create from scratch args to pass to its parents
        super().__init__(arg2=arg2 + 1, arg6=3, **kwargs)
        ...

obj = Derived(arg1=1, arg2=2, arg3=3, arg4=4, arg5=5, arg7=7)
# note: we've skipped arg6, and Base1 will get 3 for arg2
But I'd also give serious thought to whether inheritance makes any sense in your situation. It may make more sense for one of your two base classes to be encapsulated within your child class, rather than being inherited from. That is, you'd inherit from only one of ClassA or GraphicsWindow, and store an instance of the other in each instance of classC. (You could even inherit from neither base class, and encapsulate them both.) Encapsulation is often a lot easier to reason about and get right than inheritance.
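A minimal sketch of the encapsulation option (also untested; it reuses the question's ClassA unchanged):

class classC(GraphicsWindow):
    def __init__(self):
        super().__init__()
        self.xdata = np.linspace(0, 100, 101)
        self.ydata = np.linspace(0, 200, 101)
        self.fitter = ClassA(parent=self)  # composition: held as an attribute, not inherited

    def print_data(self):
        self.fitter.print_data()  # delegate explicitly where needed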
By using the @property decorator, Python has completely eliminated the need for getters and setters on object properties (some might say 'attributes'). This makes code much simpler, while maintaining the extensibility when things do need to get more complex.
I was wondering what the Pythonic approach to the following kind of method is, though. Say I have the following class:
class A(object):
    def is_winner(self):
        return True  # typically a more arcane method to determine the answer
Such methods typically take no arguments, and have no side effects. One might call these predicates. And given their name, they often closely resemble something one might also have stored as a property.
I am inclined to add a @property decorator to the above, in order to be able to call it as an object property (i.e. foo.is_winner), but I was wondering if this is the standard thing to do. At first glance, I could not find any documentation on this subject. Is there a common standard for this situation?
It seems that the general consensus is that attributes are seen as being instant and next-to-free to use, so if the computation being decorated as a @property is expensive, it's probably best either to cache the outcome for repeated use (as @Martijn Pieters suggests) or to leave it as a method, as methods are generally expected to take more time than attribute lookups. PEP 8 notes specifically:
Note 2: Try to keep the functional behavior side-effect free, although side-effects such as caching are generally fine.
Note 3: Avoid using properties for computationally expensive operations; the attribute notation makes the caller believe that access is (relatively) cheap.
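For the caching case mentioned above, one option on Python 3.8+ (my addition, not from the quoted answers) is functools.cached_property, which pays the computation cost once and then behaves like a plain attribute:

from functools import cached_property

class Dataset:
    def __init__(self, values):
        self.values = values

    @cached_property
    def mean(self):
        # Computed on first access, then stored on the instance.
        return sum(self.values) / len(self.values)

d = Dataset(range(1_000_000))
d.mean  # slow the first time
d.mean  # instant: reads the cached value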
One particular use case of the @property decorator is to add some behavior to a class without requiring that users of the class change from foo.bar references to foo.bar() calls -- for example, if you wanted to count the number of times that an attribute was referenced, you could convert the attribute into a @property where the decorated method manipulates some state before returning the requested data.
Here is an example of the original class:
class Cat(object):
    def __init__(self, name):
        self.name = name

# In user code
baxter = Cat('Baxter')
print(baxter.name)  # => Baxter
With the @property decorator, we can now add some under-the-hood machinery without affecting the user code:
class Cat(object):
    def __init__(self, name):
        self._name = name
        self._name_access_count = 0

    @property
    def name(self):
        self._name_access_count += 1
        return self._name

# User code remains unchanged
baxter = Cat('Baxter')
print(baxter.name)  # => Baxter

# Also have information available about the number of times baxter's name was accessed
print(baxter._name_access_count)  # => 1
baxter.name  # => 'Baxter'
print(baxter._name_access_count)  # => 2
This treatment of the @property decorator has been mentioned in some blog posts (1, 2) as one of its main use cases -- allowing us to initially write the simplest code possible, and then later switch over to @property-decorated methods when we need the functionality.
I'm writing a class for something and I keep stumbling across the same tiresome to type out construction. Is there some simple way I can set up class so that all the parameters in the constructor get initialized as their own name, i.e. fish = 0 -> self.fish = fish?
class Example(object):
    def __init__(self, fish=0, birds=0, sheep=0):
        self.fish = fish
        self.birds = birds
        self.sheep = sheep
Short answer: no. You are not required to initialize everything in the constructor (you could do it lazily), unless you need it immediately or expose it (meaning that you don't control access). But since in Python you don't declare data fields, it becomes much more difficult to track them all if they appear in different parts of the code.
More comprehensive answer: you could do some magic with **kwargs (which holds a dictionary of argument name/value pairs), but that is highly discouraged, because it makes documenting the changes almost impossible and makes it difficult for users to check whether a certain argument is accepted or not. Use it only for optional, internal flags. It could be useful when you have 20 or more parameters to pass, but in that case I would suggest rethinking the design and clustering the data.
In case you need a simple key/value storage, consider using a builtin, such as dict.
You could use the inspect module:
import inspect

class Example(object):
    def __init__(self, fish=0, birds=0, sheep=0):
        frame = inspect.currentframe()
        args, _, _, values = inspect.getargvalues(frame)
        for name in args[1:]:  # skip 'self'
            setattr(self, name, values[name])
This works, but it is more complicated than just setting the attributes manually. It should be possible to hide this with a decorator:
@set_attributes
def __init__(self, fish=0, birds=0, sheep=0):
    pass
but defining set_attributes gets tricky because the decorator inserts another stack frame into the mix, and I can't quite get the details right.
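One way to sidestep the extra stack frame is to bind the arguments with inspect.signature instead of inspecting frames; a sketch (lightly tested at best):

import functools
import inspect

def set_attributes(init):
    sig = inspect.signature(init)

    @functools.wraps(init)
    def wrapper(self, *args, **kwargs):
        bound = sig.bind(self, *args, **kwargs)
        bound.apply_defaults()
        # bound.arguments maps parameter names to values, in declaration order
        for name, value in list(bound.arguments.items())[1:]:  # skip 'self'
            setattr(self, name, value)
        return init(self, *args, **kwargs)

    return wrapper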
For Python 3.7+, you can try using data classes in combination with type annotations.
https://docs.python.org/3/library/dataclasses.html
Import the module and use the decorator. Type-annotate your variables, and there's no need to define an __init__ method, because it will be created for you automatically.
from dataclasses import dataclass

@dataclass
class Example:
    fish: int = 0
    birds: int = 0
    sheep: int = 0
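An Example(fish=2) instance then behaves like the hand-written version (self.fish == 2, with the other fields defaulting to 0), and the decorator also generates __repr__ and __eq__ for free.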
I have class Base. I'd like to extend its functionality in a class Derived. I was planning to write:
class Derived(Base):
    def __init__(self, base_arg1, base_arg2, derived_arg1, derived_arg2):
        super().__init__(base_arg1, base_arg2)
        ...

    def derived_method1(self):
        ...
Sometimes I already have a Base instance, and I want to create a Derived instance based on it, i.e., a Derived instance that shares the Base object (doesn't re-create it from scratch). I thought I could write a static method to do that:
b = Base(arg1, arg2) # very large object, expensive to create or copy
d = Derived.from_base(b, derived_arg1, derived_arg2) # reuses existing b object
but it seems impossible. Either I'm missing a way to make this work, or (more likely) I'm missing a very big reason why it can't be allowed to work. Can someone explain which one it is?
[Of course, if I used composition rather than inheritance, this would all be easy to do. But I was hoping to avoid the delegation of all the Base methods to Derived through __getattr__.]
Rely on what your Base class is doing with base_arg1 and base_arg2:
class Base(object):
    def __init__(self, base_arg1, base_arg2):
        self.base_arg1 = base_arg1
        self.base_arg2 = base_arg2
        ...

class Derived(Base):
    def __init__(self, base_arg1, base_arg2, derived_arg1, derived_arg2):
        super().__init__(base_arg1, base_arg2)
        ...

    @classmethod
    def from_base(cls, b, da1, da2):
        return cls(b.base_arg1, b.base_arg2, da1, da2)
The alternative approach to Alexey's answer (my +1) is to pass the base object in the base_arg1 argument and to check whether that argument actually holds a base object (i.e. whether it is an instance of the base class). The other arguments can be made technically optional (say, defaulting to None) and checked explicitly inside the code.
The difference is that the argument type alone decides which of the two possible ways of creation is used. This is necessary if the creation of the object cannot be explicitly captured in the source code (e.g. some structure contains a mix of argument tuples, some with initial values, some with references to existing objects). In that case you would probably need to pass the arguments as keyword arguments:
d = Derived(b, derived_arg1=derived_arg1, derived_arg2=derived_arg2)
Updated: Sharing internal structures with the initial class is possible with both approaches. However, you must be aware that if one of the objects tries to modify the shared data, the usual funny things can happen.
To be clear here, I'll make an answer with code. pepr talks about this solution, but code is always clearer than English. In this case Base should not be subclassed; instead, an instance of it should be a member of Derived:
class Base(object):
    def __init__(self, base_arg1, base_arg2):
        self.base_arg1 = base_arg1
        self.base_arg2 = base_arg2

class Derived(object):
    def __init__(self, base, derived_arg1, derived_arg2):
        self.base = base
        self.derived_arg1 = derived_arg1
        self.derived_arg2 = derived_arg2

    def derived_method1(self):
        return self.base.base_arg1 * self.derived_arg1
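Usage then looks like this (the values are illustrative):

b = Base(10, 'spam')
d = Derived(b, 3, 4)
print(d.derived_method1())  # 30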