What is the most pythonic way to add several identical methods to several classes?
I could use a class decorator, but that seems to bring in a fair bit of complication, and it's harder to write and read than the other methods.
I could make a base class with all the methods and let the other classes inherit, but then for some of the classes I would be very tempted to allow multiple inheritance, which I have read frequently is to be avoided or minimized. Also, the "is-a" relationship does not apply.
I could also change them from being methods to make them stand-alone functions which just expect their values to supply the appropriate properties through duck-typing. This is in some ways clean, but it is less object oriented and makes it less clear when a function could be used on that type of object.
I could use delegation, but this requires every class that wants these methods to define its own methods that forward to the helper class's methods. This would make the code base much longer than the other options and require adding a delegating method every time I add a new function to the helper class.
I know giving one class an instance of the other as an attribute works nicely in some cases, but it does not always work cleanly and can make calls more complicated than they would be otherwise.
After playing around with it a bit, I am leaning towards inheritance even when it leads to multiple inheritance. But I hesitate due to numerous texts warning very strongly against ever allowing multiple inheritance, and some (such as the Wikipedia entry) going so far as to say that inheritance just for code reuse such as this should be minimized.
This may be more clear with an example, so for a simplified example say we are dealing with numerous distinct classes which all have a location on an x, y grid. There are a lot of operations we might want to make methods of everything with an x, y location, such as a method to get the distance between two such entities, or the relative direction, midpoint between them, etc.
What would be the most pythonic way to give all such classes access to these methods that rely only on having x and y as attributes?
For your specific example, I would try to take advantage of duck-typing. Write plain simple functions that take objects which are assumed to have x and y attributes:
import math

def distance(a, b):
    """
    Returns distance between `a` and `b`.
    `a` and `b` should have `x` and `y` attributes.
    """
    return math.sqrt((a.x-b.x)**2 + (a.y-b.y)**2)
Very simple. To make it clear how the function can be used, just document it.
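As a rough sketch of how this plays out with unrelated classes: the `City` and `Robot` classes and the `midpoint` helper below are made up for illustration, and the only assumption is that every object exposes `x` and `y` attributes.

```python
import math


def distance(a, b):
    """Distance between two objects that expose `x` and `y` attributes."""
    return math.hypot(a.x - b.x, a.y - b.y)


def midpoint(a, b):
    """Midpoint between two objects, returned as an (x, y) tuple."""
    return ((a.x + b.x) / 2, (a.y + b.y) / 2)


class City:  # hypothetical class, shares no base class with Robot
    def __init__(self, x, y):
        self.x = x
        self.y = y


class Robot:  # hypothetical class
    def __init__(self, x, y):
        self.x = x
        self.y = y


home = City(0, 0)
bot = Robot(3, 4)
print(distance(home, bot))
print(midpoint(home, bot))
```

No inheritance or delegation is involved; any future class with `x` and `y` works with these functions without being touched.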
Plain old functions are best for this problem. E.g. instead of this ...
class BaseGeoObject:
    def distanceFromZeroZero(self):
        return math.sqrt(self.x()**2 + self.y()**2)
    ...
... just have functions like this one:
def distanceFromZeroZero(point):
    return math.sqrt(point.x()**2 + point.y()**2)
This is a good solution because it's also easy to test - it's not necessary to subclass just to exercise a specific function.
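To illustrate the testing point, here is a minimal sketch that exercises such a function with a throwaway stand-in object. The name `distance_from_zero_zero` and the use of `types.SimpleNamespace` are my own choices for the example; as in the answer, `x` and `y` are assumed to be callables.

```python
import math
from types import SimpleNamespace


def distance_from_zero_zero(point):
    # Assumes `point.x()` and `point.y()` are callables returning numbers.
    return math.sqrt(point.x() ** 2 + point.y() ** 2)


# A stand-in object built on the spot; no base class or subclass is needed.
fake_point = SimpleNamespace(x=lambda: 3.0, y=lambda: 4.0)
result = distance_from_zero_zero(fake_point)
print(result)
```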
I've been using python for scientific purposes for some years now. I recently became more familiar with class writing, but I feel like I'm missing something regarding the standard way to instantiate classes.
Say I define a class MyClass.
class MyClass:
    def __init__(self):
        pass
Then I know that I can map x to an instance of MyClass simply with
x = MyClass()
This works well and exactly as I expect.
However, it seems to me that when I use code from standard libraries or from numpy or scipy, I don't create objects in the same way: as far as I know, I generally don't use the name of a class to instantiate it. From what I understand, I'd say that this implies that I use neither class methods nor the default constructor of a class, but rather other functions which are defined outside the class.
For example, numpy's random module uses a class Generator to generate random numbers. However, numpy explicitly recommends not to use the class constructor to get a Generator instance, and to use instead the default_rng function from the random module. So if I want to generate random numbers, I use
rng = numpy.random.default_rng()
to create a Generator instance. This is done without using explicitly the name of the class.
It seems to me that most of the code that I use is written in the latter way. Why is that so? Is it somehow considered bad practice to directly call default class constructors? Is it considered better practice to have separate functions in a module to create class instances? Is it only because some preprocessing must usually be done before creating an instance of a class? (I guess not, because in that case, why not do that in the initialization of the class?)
No, it is not bad practice to use the normal constructor, but sometimes it can be useful to have an alternative constructor.
Reasons for using a function as an alternative constructor to create an object:
(not a complete list and not in any order)
Decouple the creation of an object from its implementation.
Decoupling is often aimed for in OOP.
Hide complexity
The constructor could have many parameters, but often a default object is needed.
Easier to read/write and understand
numpy.random.default_rng() vs numpy.random.Generator(numpy.random.PCG64())
A factory that creates and returns a (different) object, based on sometimes complex conditions.
E.g. Python's open() returns different objects for text files and for binary files.
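The open() case is easy to check for yourself. A small sketch using only the standard library (the temporary file name is arbitrary):

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")
    with open(path, "w") as f:
        f.write("hello")

    # Same function, different mode, different class of object returned.
    with open(path, "r") as text_file:
        text_type = type(text_file)
    with open(path, "rb") as binary_file:
        binary_type = type(binary_file)

print(text_type.__name__, binary_type.__name__)
```

The caller never names either class; open() picks the concrete type based on the mode.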
Where to implement these?
In some other languages, these would be implemented as class methods of the class they instantiate, or even of a new class.
This could be done in Python, too, but it is often shorter and more convenient to implement them as functions at module level.
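A minimal sketch of the two placements, side by side. The `Connection` class, its parameters, and both factory names are invented for this example and do not come from any real library.

```python
class Connection:
    """Hypothetical class with a verbose constructor."""

    def __init__(self, host, port, timeout, retries):
        self.host = host
        self.port = port
        self.timeout = timeout
        self.retries = retries

    @classmethod
    def local(cls):
        # Alternative constructor implemented as a class method.
        return cls("localhost", 8080, timeout=5.0, retries=3)


def default_connection():
    # The same idea implemented as a module-level function.
    return Connection("localhost", 8080, timeout=5.0, retries=3)


a = Connection.local()
b = default_connection()
print(a.host, b.port)
```

Both produce an equivalent object; the module-level version simply saves the caller from naming the class at all.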
I think the np.array call, which creates an np.ndarray, is probably one of the most common cases in which an object is created by calling another function. Here is an explanation of that:
What is the difference between ndarray and array in numpy?
I cannot answer for all cases in which we use a function to "wrap" the construction of an object, but I have used such functions to simplify object creation in many situations which results in cleaner code. I can speak of such situations.
For example, the underlying class definition may expose a lot of parameters. It may not make sense to ask the user to provide parameters values for all parameters of the class in 99.9% of the cases (say). These "spurious" parameters may be fixed, or may be inferred from other parameter values in most such situations (e.g., parameter b is 2x parameter a in most cases). The code becomes unwieldy in these 99.9% of cases to explicitly provide values for such parameters, so a wrapper function is written to make it cleaner.
It is possible to use default parameters to deal with many such situations, but it may not make sense to push the inference of parameter values into the class's __init__ function itself. For example, while something like b = 2 * a if b is None else b seems reasonable to put in the __init__ function, where a, b are parameters, it may not be so simple in practice (e.g., b may have a complex relationship with a, c, d, f, etc., or it may be a class object itself), or there may be 1000 such parameter inferences to be made. So it is logical to separate such "glue" code (which is a customization for ease of use) into another function and keep the base code (which implements a specific functionality) clean and to the point.
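A small sketch of that separation, under the assumption that b defaults to twice a. The `Model` class, its third parameter `c`, and the `make_model` wrapper are all hypothetical names used only to show the shape of the idea.

```python
class Model:
    """Base code: takes every parameter explicitly and infers nothing."""

    def __init__(self, a, b, c):
        self.a = a
        self.b = b
        self.c = c


def make_model(a, b=None, c=0.1):
    """Glue code: fills in the parameters most callers never set."""
    if b is None:
        b = 2 * a  # assumed convention for the common case
    return Model(a, b, c)


common_case = make_model(3)
special_case = make_model(3, b=10)
print(common_case.b, special_case.b)
```

The class stays free of usage conventions, and the wrapper can change or multiply without touching it.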
Do we want to write another class wrapper instead of a function wrapper? In this case, the new class wrapper will present a simplified interface. But writing a class wrapper in this situation is unnecessary since class implies many things, while a function implies just procedural execution.
Note that this happens mostly in library-type code, which has the largest number of use cases and where you want to make usage as easy as possible for most people. Such issues do not exist for most "user" code, where we simply write classes for a specific application. So in practice, when we write applications, we should create classes directly using constructors when possible.
There is also the popular Factory design pattern that @ekhumoro referenced above, which is very similar to this. But based on the textbook definition, the Factory design pattern seems to be restricted to super/sub classes (I could be wrong, and this might be useless semantics).
This question already has answers here:
How to set class names dynamically?
(3 answers)
Closed 5 years ago.
Would it be correct to say that if some Python class is well written, then
class Subclass(BaseClass):
    pass
should be sufficient to create a well-behaved class with behavior similar to that of BaseClass?
(I write "similar" rather than "identical" because, for example, Subclass.__name__ and Subclass.__qualname__ would not be the same as their counterparts in BaseClass, and this would possibly (probably) also extend to __str__ and __repr__ and possibly other metadata.)
Would it make sense to use such "empty" inheritance to rename a class for better semantics? E.g. would you inherit from collections.Counter and call it GuestsCount if you want to count how many adults / children / babies will be attending some event? Or call it "Polynomial" and use the count values to represent the coefficients of variables raised to some power (i.e. X^2 or Y^3), and so on?
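To make the inheritance vs. aliasing distinction concrete, here is a small sketch comparing an empty subclass of collections.Counter with a plain alias. The names `GuestsCount` and `GuestsAlias` are just illustrative.

```python
from collections import Counter


class GuestsCount(Counter):
    """Empty subclass: a new class object with its own name."""
    pass


GuestsAlias = Counter  # alias: a second name for the very same class

guests = GuestsCount(adults=2, children=1)
guests.update(adults=1)  # inherited behavior works unchanged

print(GuestsCount.__name__, GuestsAlias.__name__)
print(repr(guests))
print(isinstance(guests, Counter), GuestsAlias is Counter)
```

The subclass keeps the inherited behavior but reports its own name in metadata and in its repr, whereas the alias is indistinguishable from Counter.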
EDIT:
I don't see how my Q is even related to dynamic renaming of class in any way AS IS.
I am talking about inheritance vs. aliasing (or possibly just instantiating), not about renaming an existing class dynamically, nor about dynamically creating classes and the issues of how to name those dynamically created classes, as discussed in the so-called duplicate mentioned here :(
Although it is not common, I think it does make sense. I'd refer you to this article by Robert Martin. The last paragraph especially supports your rationale. Although the article deals with renaming functions, the same arguments could hold for renaming classes.
Additionally, concepts as different as PersonCounter and Polynomial will most likely soon diverge in terms of functionality too, although they start from the same class, so it makes sense to make them different classes.
Note: A closely related, common pattern in Python frameworks is subclasses that have only one class attribute:
class GuestCounter(Counter):
    datatype = Person

class Polynomial(Counter):
    datatype = float
which could be useful if you create factory functions/type checkers/adapter functions for your objects.
An additional advantage is that you could add attributes to the SubClass,
while it might not be possible for the BaseClass (dict or list for example).
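The attribute point can be checked quickly. In this sketch, `TaggedDict` and the `note` attribute are made-up names:

```python
plain = dict(a=1)
try:
    plain.note = "built-in dict instances reject new attributes"
    plain_accepts_attributes = True
except AttributeError:
    plain_accepts_attributes = False


class TaggedDict(dict):
    """Empty subclass; its instances can carry arbitrary attributes."""
    pass


tagged = TaggedDict(a=1)
tagged.note = "subclass instances accept new attributes"

print(plain_accepts_attributes, tagged.note)
```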
It might make sense if you actually get better semantics, but in your examples you do not.
Having AdultCounter and ChildCounter suggests that counting adults is somehow different from counting children, which is false and misleading. Not because they happen to share the same implementation (as explained in the Uncle Bob article linked in the other answer, that would be fine) but because they are conceptually the same. Counting abstracts over any attributes of the items counted. Counting adults, children, sheep or a mix of them, it is all the same.
adults = AdultCounter({ev: ev.num_adults() for ev in events})
children = ChildCounter({ev: ev.num_children() for ev in events})
The occasional reader would wonder what these counters do that a bare Counter does not. They would have to look at the definitions to find out the answer: nothing. So what's the point? The part that is actually different, the filtering, is done outside the counters.
And for polynomials, a Counter does not look like a good abstraction. Why would mypoly.most_common() return the term with the highest coefficient? Why does poly1 + poly2 work while 2 * poly does not and poly1 - poly2 is buggy?
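These oddities are easy to reproduce. In the sketch below the term labels ("x^2", "x") are just strings picked for the example; the behavior shown is that of a plain collections.Counter.

```python
from collections import Counter

poly1 = Counter({"x^2": 1, "x": 5})
poly2 = Counter({"x^2": 3, "x": 2})

# Addition happens to look like polynomial addition.
total = poly1 + poly2

# Subtraction silently drops terms whose result is zero or negative,
# so the -2 coefficient for "x^2" disappears.
difference = poly1 - poly2

# most_common() ranks terms by coefficient, which has no polynomial meaning.
top_term = poly1.most_common(1)

# Scalar multiplication is not defined for Counter at all.
try:
    doubled = 2 * poly1
except TypeError:
    doubled = None

print(total, difference, top_term, doubled)
```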
I need several very similar plotting functions in python that share many arguments, but differ in some and of course also differ slightly in what they do. This is what I came up with so far:
Obviously just defining them one after the other and copying the code they share is a possibility, though not a very good one, I reckon.
One could also transfer the "shared" part of the code to helper functions and call these from inside the different plotting functions. This would make it tedious though, to later add features that all functions should have.
And finally I've also thought of implementing one "big" function, making possibly not needed arguments optional and then deciding on what to do in the function body based on additional arguments. This, I believe, would make it difficult though, to find out what really happens in a specific case as one would face a forest of arguments.
I can rule out the first option, but I'm hard pressed to decide between the second and third. So I started wondering: is there another, maybe object-oriented, way? And if not, how does one decide between option two and three?
I hope this question is not too general and I guess it is not really python-specific, but since I am rather new to programming (I've never done OOP) and first thought about this now, I guess I will add the python tag.
EDIT:
As pointed out by many, this question is quite general and it was intended to be so, but I understand that this makes answering it rather difficult. So here's some info on the problem that caused me to ask:
I need to plot simulation data, so all the plotting problems have simulation parameters in common (location of files, physical parameters,...). I also want the figure design to be the same. But depending on the quantity, some plots will be 1D, some 2D, some should contain more than one figure, sometimes I need to normalize the data or take a logarithm before plotting it. The output format might also vary.
I hope this helps a bit.
How about something like this? You can create a Base class with a method foo that holds all the shared code. Each of your other classes then inherits from Base, overrides the method of interest, calls the base implementation through super, and extends it with whatever extra functionality it needs.
Here is an example of how it works. Note the difference between how super is called in Python 2 and Python 3.
class Base:
    def foo(self, *args, **kwargs):
        print("foo stuff from Base")
        return "return something here"

class SomeClass(Base):
    def foo(self, *args, **kwargs):
        # python 2 (Base must then be declared as class Base(object))
        #x = super(SomeClass, self).foo(*args, **kwargs)
        # python 3
        x = super().foo(*args, **kwargs)
        print(x)
        print("SomeClass extension of foo")

s = SomeClass()
s.foo()
Output:
foo stuff from Base
return something here
SomeClass extension of foo
More information is needed to fully understand the context. But, in a general sense, I'd do a mix of all of them. Use helper functions for the "shared" parts, and use conditional statements too. Honestly, a lot of it comes down to what is easiest for you to do.
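As a rough sketch of the helper-function approach: the functions below return plain dictionaries describing a plot instead of drawing anything, so that the example runs without a plotting library. Every name here (`prepare`, `base_style`, `plot_1d`, `plot_log`) is hypothetical; in real code the last step of each function would call your plotting library.

```python
import math


def prepare(values, normalize=False, log=False):
    """Shared preprocessing step used by every plotting function."""
    data = list(values)
    if normalize:
        peak = max(abs(v) for v in data)  # assumes at least one non-zero value
        data = [v / peak for v in data]
    if log:
        data = [math.log10(v) for v in data]  # assumes strictly positive values
    return data


def base_style():
    """Shared figure design; change it here and every plot follows."""
    return {"font_size": 12, "line_width": 1.5}


def plot_1d(values, normalize=False):
    return {"kind": "line",
            "data": prepare(values, normalize=normalize),
            "style": base_style()}


def plot_log(values):
    return {"kind": "line",
            "data": prepare(values, log=True),
            "style": base_style(),
            "y_label": "log10(value)"}


print(plot_1d([1, 2, 4], normalize=True)["data"])
print(plot_log([1, 10, 100])["data"])
```

A feature that all plots should share goes into `prepare` or `base_style` once, which addresses the worry about having to touch every function later.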
Long story short, I need to find a shortcut to calling the same method in multiple objects which were made from multiple classes.
Now, all of the classes have the same parent class, and even though the method differs between the different classes, I figured that methods with the same name would work. So I thought I might just be able to do something like this:
for object in listOfObjects:
    object.method()
It hasn't worked. It might very well be a misspelling by me, but I can't find it. I think I could solve it by making a list that only adds the objects I need, but that would require a lot of coding, including changing other classes.
~~ skip to last paragraph for pseudo code accurately describing what I need~~
At this point, I will begin to go more in detail as to what specifically I am doing. I hope that it will better illustrate the scope of my question, and that the answering of my question will be more broadly applicable. The more general usage of this question are above, but this might help answer the question. Please be aware that I will change the question once I get an answer to more closely represent what I need done, so that it can apply to a wide variety of problems.
I am working on a gravity simulator. Whereas most simulators make objects which interact with one another and represent full bodies where their center of gravity is the actual attraction point, I am attempting to write a program which will simulate the distribution of gravity across all given points within an object.
As such, each object (not in programming terms, in literal terms) is made up of a bunch of tiny objects (both literally and figuratively). What I am trying to do is call the object.gravity() method, which takes into account all of the gravity from all other objects in the simulation and then moves the position of this particular object based on that input.
Now, either due to a syntactical bug (which I kinda doubt) or due to Python's limitations, I am unable to get all of the particles to behave properly all at once. The code snippet I posted before doesn't seem to be working.
tl;dr:
As such, I am wondering if there is a way (save adding all objects to a list and then iterating through it) to simply call the .gravity() method on every object that has the method. Basically, even though this is sort of in list format, this is what I want to do:
for ALL_OBJECTS:
    if OBJECT has .gravity():
        OBJECT.gravity()
You want the hasattr() function here:
for obj in all_objects:
    if hasattr(obj, 'gravity'):
        obj.gravity()
or, if the gravity method is defined by a specific parent class, you can test for that too:
for obj in all_objects:
    if isinstance(obj, Planet):
        obj.gravity()
You can also do it this way, which is arguably more Pythonic:
for obj in all_objects:
    try:
        obj.gravity()
    except AttributeError:
        pass
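One caveat with the try/except version: it also swallows an AttributeError raised inside gravity() itself. The sketch below uses three made-up classes (`Particle`, `Marker`, `BrokenParticle`) to show the difference between the broad form and a variant that separates the lookup from the call.

```python
class Particle:
    def __init__(self):
        self.moved = False

    def gravity(self):
        self.moved = True


class Marker:
    """Has no gravity() method at all."""


class BrokenParticle:
    def gravity(self):
        # Bug: `mass` was never set, so this raises AttributeError internally.
        return self.mass * 9.81


all_objects = [Particle(), Marker(), BrokenParticle()]

# Broad form: a missing method and a bug inside the method look the same.
silently_skipped = []
for obj in all_objects:
    try:
        obj.gravity()
    except AttributeError:
        silently_skipped.append(type(obj).__name__)

# Narrower form: look the method up first, then call it separately.
internal_errors = []
for obj in all_objects:
    gravity = getattr(obj, "gravity", None)
    if gravity is None:
        continue  # genuinely has no gravity() method
    try:
        gravity()
    except AttributeError:
        internal_errors.append(type(obj).__name__)

print(silently_skipped, internal_errors)
```

In the second loop only the object with the real bug is reported, so the problem is not hidden.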
Or use getattr with a no-op default (lambda: None), so objects without the method are silently skipped:
for obj in all_objects:
    getattr(obj, 'gravity', lambda: None)()
(Using Python 3.2, though I doubt it matters.)
I have class Data, class Rules, and class Result. I use lowercase to denote an instance of the class.
A rules object contains rules that, if applied to a data object, can create a result object.
I'm deciding where to put the (rather complicated and evolving) code that actually applies the rules to the data. I can see two choices:
Put that code inside a class Result method, say parse_rules. Result constructor would take as an argument a rules object, and pass it onto self.parse_rules.
Put that code inside a new class ResultFactory. ResultFactory would be a singleton class, which has a method, say build_result, which takes rules as an argument and returns a newly built result object.
What are the pros and cons of the two approaches?
The GRASP design principles provide guidelines for assigning responsibility to classes and objects in object-oriented design. For example, the Creator pattern suggests: In general, a class B should be responsible for creating instances of class A if one, or preferably more, of the following apply:
Instances of B contain or compositely aggregate instances of A
Instances of B record instances of A
Instances of B closely use instances of A
Instances of B have the initializing information for instances of A and pass it on creation.
In your example, you have complicated and evolving code for applying rules to data. That suggests the use of the Factory Pattern.
Putting the code in Result is contraindicated because 1) results don't create results, and 2) results aren't the information expert (i.e. they don't have most of the knowledge that is needed).
In short, the ResultFactory seems like a reasonable place to concentrate the knowledge of how to apply rules to data to generate results. If you were to try to push all of this logic into class constructors for either Result or Rules, it would lead to tight coupling and loss of cohesion.
Third scenario:
You may want to consider a third scenario:
Put the code inside the method Rules.__call__.
Instantiating Result like: result = rules(data)
Pros:
Results can be totally unaware of the Rules that generates them (and maybe even of the original Data).
Every Rules sub-class can customize its Result creation.
It feels natural (to me): Rules applied to Data yield Result.
And you'll have a couple of GRASP principles on your side:
Creator: Instances of Rules have the initializing information for instances of Result and pass it on creation.
Information Expert: Information Expert will lead to placing the responsibility on the class with the most information required to fulfill it.
Side effects:
Coupling: You'll raise the coupling between Rules and Data:
You need to pass the whole data set to every Rules
Which means that each Rules should be able to decide on which data it'll be applied.
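A minimal sketch of this third scenario. To keep it short, Data is represented by a plain list and each rule by a function applied to every item; both are simplifying assumptions, as is the `transforms` attribute.

```python
class Result:
    """Knows nothing about Rules or about the original data."""

    def __init__(self, values):
        self.values = values


class Rules:
    def __init__(self, transforms):
        # Hypothetical representation: an ordered list of per-item functions.
        self.transforms = list(transforms)

    def __call__(self, data):
        values = list(data)
        for transform in self.transforms:
            values = [transform(v) for v in values]
        return Result(values)


rules = Rules([lambda v: v + 1, lambda v: v * 2])
result = rules([1, 2, 3])  # reads as: rules applied to data yield a result
print(result.values)
```

A Rules subclass can override `__call__` to customize how its Result is built, without Result changing.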
Why not put the rules in their own classes? If you create a RuleBase class, then each rule can derive from it. This way, polymorphism can be used when Data needs rules applied. Data doesn't need to know or care which Rule instances were applied (unless Data itself is the one who knows which rules should be applied).
When rules need to be invoked, a data instance can call RuleBase.ExecuteRules() and pass itself in as an argument. The correct subclass of Rule can be chosen directly from Data, if Data knows which Rule is necessary. Or some other design pattern can be used, like Chain of Responsibility, where Data invokes the pattern and lets a Result come back.
This would make a great whiteboard discussion.
Can you make ResultFactory a pure function? It's not useful to create a singleton object if all you need is a function.
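For instance, a pure-function version might look like the sketch below. The shape of `Result` (accepted/rejected tuples) and the idea that each rule is a predicate are assumptions made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    accepted: tuple
    rejected: tuple


def build_result(rules, data):
    """Apply every rule to every item; no factory object or state needed."""
    accepted, rejected = [], []
    for item in data:
        if all(rule(item) for rule in rules):
            accepted.append(item)
        else:
            rejected.append(item)
    return Result(tuple(accepted), tuple(rejected))


rules = [lambda n: n > 0, lambda n: n % 2 == 0]
result = build_result(rules, [-2, 1, 2, 4])
print(result)
```

Because the function holds no state, there is nothing for a singleton to manage.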
Well, the second is downright silly, especially with all the singletonness. If Result requires Rules to create an instance, and you can't create one without it, it should take that as an argument to __init__. No pattern shopping necessary.