So, this is more like a philosophical question for someone who is trying to understand classes.
Most of time, how i use class is actually a very bad way to use it. I think of a lot of functions and after a time just indent the code and makes it a class and replacing few stuff with self.variable if a variable is repeated a lot. (I know its bad practise)
But anyways... What i am asking is:
class FooBar:
def __init__(self,foo,bar):
self._foo = foo
self._bar = bar
self.ans = self.__execute()
def __execute(self):
return something(self._foo, self._bar)
Now there are many ways to do this:
class FooBar:
def __init__(self,foo):
self._foo = foo
def execute(self,bar):
return something(self._foo, bar)
Can you suggest which one is bad and which one is worse?
or any other way to do this.
This is just a toy example (offcourse). I mean, there is no need to have a class here if there is one function.. but lets say in __execute something() calls a whole set of other methods.. ??
Thanks
If each FooBar is responsible for bar then the first is correct. If bar is only needed for execute() but not FooBar's problem otherwise, the second is correct.
In a single phrase, the formal term you want to worry about here is Separation of Concerns.
In response to your specific example, which of the two examples you choose depends on the concerns solved by FooBar and bar. If bar is in the same problem domain as FooBar or if it otherwise makes sense for FooBar to store a reference to bar, then the second example is correct. For instance, if FooBar has multiple methods that accept a bar and if you always pass the same instance of bar to each of a particular FooBar instance's bar-related methods, then the prior example is correct. Otherwise, the latter is more correct.
Ideally, each class you create should model exactly one major concern of your program. This can get a bit tricky, because you have to decide the granularity of "major concern" for yourself. Look to your dependency tree to determine if you're doing this correctly. It should be relatively easy to pull each individual class out of your program and test it in isolation from all other classes. If this is very difficult, you haven't split your concerns correctly. More formally, good separation of concerns is accomplished by designing classes which are cohesive and loosely coupled.
While this isn't useful for the example you posted, one simple pattern that helps accomplish this on a broader scale is Inversion of Control (IoC, sometimes called Dependency Injection). Under IoC, classes are written such that they aren't aware of their dependencies directly, only the interfaces (protocols in Python-speak) which their dependencies implement. Then at run time, typically during application initialization, instances of major classes are created by factories, and they are assigned references to their actual dependencies. See this article (with this example) for an explanation of how this can be accomplished in Python.
Finally, learn the four tenets of object-oriented programming.
Related
I have a lot of different child classes that inherit from one base class. However all the different child classes implement very similar methods. So if I want to change code in the child classes, I have to change it multiple times.
For me this sounds like bad practice and I would like to implement it correcty. But after a lot of googling I still didn't find a coherent way of how this should be done.
Here is an example of what I mean:
from ABC import ABC, abstractmethod
import logging.config
class BaseModel(ABC):
def __init__(self):
# initialize logging
logging.config.fileConfig(os.path.join(os.getcwd(),
'../myconfig.ini'))
self.logger = logging.getLogger(__name__)
#abstractmethod
def prepare_data(self):
"""
Prepares the needed data.
"""
self.logger.info('Data preparation started.\n')
pass
So this is my BaseClass. Now from this class multiple other classes inherit the init and prepare_data method. The prepare_data method is very similar for every class.
class Class_One(BaseModel):
def __init__(self):
super.__init()__
def prepare_data(self):
super().prepare_data()
# Some code that this method does
class Class_Two(BaseModel):
def __init__(self):
super.__init()__
def prepare_data(self):
super().prepare_data()
# Some code that this method does
# Code is almost the same as for Class_One
class Class_Three(BaseModel):
def __init__(self):
super.__init()__
def prepare_data(self):
super().prepare_data()
# Some code that this method does
# Code is almost the same as for Class_One and Class_Two
# etc.
I suppose you could refactor the methods into another file and then call them in each class. I would love to know how to do this correctly. Thanks a lot in advance!
I'm afraid there's no generic one-size-fits-all magic answer - it all really depend on the "almost" part AND on the forces that will drive change in those parts of the code. IOW, one can only really answer on a concrete example...
This being said, there are a couple lessons learned from experience, which are mostly summmarized in the famous (but unfortunately often misunderstood) GOF "Design Patterns" book. If you take time to first read the first part of the book, you understand that most of (if not all) the patterns in the catalog are based on the same principle: separate the variant from the invariant. Once you can tell one from the other in your code (warning: there's a trap here and beginner almost always fall into it), which pattern to apply is usually obvious (sometimes to the point you only realize you used this and that patterns after you refactored your code).
Now as I said, there IS a trap: accidental duplication. Just because two pieces of code look similar doesn't mean they are duplicates - quite often, they are only "accidentally" similar now but the forces that will make one or the other change are mostly unrelated. If you try to immediatly refactor this code, you'll soon find yourself making the "generic" case more and more complicated to support changes that are actually unrelated, and end up with an overcomplicated, undecipherable mess that only make your code unmaintainable. So the trick here is to carefully examine the whole context, ask yourself what would drive change in one or the other "similar" parts, and if in doubt, wait until you know more. If it happens than each time you change A you have to make the exact same change in B for the exact same reasons then you DO have real duplicate.
For a more practical, short-term advise based on what we can guess from your way too abstract example (and from experience), there are at least two patterns that are most often involved in factoring out duplication in a class hierarchy: the template method and the strategy.
NB : I said "unfortunately often misunderstood" because most people seem to jump to the patterns catalog and try to forcefit all of them in their code (whether it makes sense for the problem at hand or not), and usually by copy-pasting the canonical textbook _implementation_ (usually Java or C++ based) instead of understanding the _concept_ and implementing it in a way that's both idiomatic and adapted to the concrete use case (example: when functions are first class object, you don't necessarily need a Strategie class with abstract base and concrete subclasses - most often plain old callback functions JustWork(tm)).
EDIT totally unrelated but this:
def __init__(self):
# initialize logging
logging.config.fileConfig(os.path.join(os.getcwd(),
'../myconfig.ini'))
self.logger = logging.getLogger(__name__)
is NOT how to use logging. Library code can use loggers, but must not configure anything - this is the application's (your main script / function / whatever) responsability, the rational being that the proper logging config depends on the context - which type of application is using the lib (a CLI app, a local GUI app and a backend web app don't have the same needs at all) and in which kind of environment (a local dev env will want much more logs than a production one for example).
Also, with the logger created with __name__ in your base class module, all child classes will send their log to the same logger, which is certainly not what you want (you want them to have their own package / module specific loggers so you can fine tune the config per package / module).
And finally, this:
os.path.join(os.getcwd(), '../myconfig.ini')
certainly doesn't work as you expect - your cwd can be just anything at this point and you have no way of knowing in advance. If you want to reference a path relative to the current file's directory, you want os.path.dirname(os.path.realpath(__file__)). And of course adding system specific path stuff (ie "../") in a os.path.join() call totally defeats the whole point of using os.path.
The following surprised me, although perhaps it shouldn't have. However, I've never seen this done elsewhere
def bar_method(self):
print( f"in Foo method bar, baz={self.baz}")
# and pages more code in the real world
class Foo(object):
def __init__(self):
self.baz = "Quux"
bar = bar_method
>>> foo = Foo()
>>> foo.bar()
in Foo method bar, baz=Quux
>>>
This follows from the definition of def which is an assignment of a function object to its name in the current context.
The great advantage, as far as I am concerned, is that I can move large method definitions outside the class body and even into a different file, and merely link them into the class with a single assignment. Instantiating the class binds them as usual.
My question is simply, why have I never seen this done before? Is there anything lurking here that might bite me? Or if it's regarded as bad style, why?
(If you are wondering about my context, it's about Django View and Form subclasses. I would dearly love to keep them short, so that the business logic behind them is easy to follow. I'd far rather that methods of only cosmetic significance were moved elsewhere).
The great advantage, as far as I am concerned, is that I can move large method definitions outside the class body and even into a different file
I personnally wouldn't consider this as "a great advantage".
My question is simply, why have I never seen this done before?
Because there are very few good reasons to do so, and a lot of good reasons to NOT do so.
Or if it's regarded as bad style, why?
Because it makes it much harder to understand what's going on, quite simply. Debugging is difficult enough, and each file you have to navigate too adds to the "mental charge".
I would dearly love to keep them short, so that the business logic behind them is easy to follow. I'd far rather that methods of only cosmetic significance were moved elsewhere
Then put the important parts at the beginning of the class statement and the cosmetic ones at the end.
Also, ask yourself whether you're doing things at the right place - as far as I'm concerned, "business logic" mainly belongs to the domain layer (models), and "cosmetic" rather suggests presentation layer (views / templates). Of course some objects (forms and views specially) are actually dealing with both business rules and presentation, but even then you can often move some of this domain part to models (which the view or form will call on) and some of this presentation part to the template layer (using filters / custom tags).
My 2 cents...
NB: sorry if the above sounds a bit patronizing - I don't know what your code looks like, so I can't tell whether you're obviously already aware of this and doing the right thing already and just trying to improve further, or you're a beginner still fighting with how to achieve proper code separation...
I have a class BigStructure that builds a complicated data structure from some input. It also includes methods that perform operations on that data structure.
The class grew too large, so I'm trying to split it in two to help maintainability. I was thinking that it would be natural to move the operations into a new class, say class OperationsOnBigStructure.
Unfortunately, since class BigStructure is quite unique, OperationsOnBigStructure cannot be reasonably reused with any other class. In a sense, it's forever tied to BigStructure. For example, a typical operation may consist of traversing a big structure instance in a way that is only meaningful for a BigStructure object.
Now, I have two classes, but it feels like I haven't improved anything. In fact, I made things slightly more complicated, since I now need to pass the BigStructure object to the methods in OperationsOnBigStructure, and they need to store that object internally.
Should I just live with one big class?
The solution I came up with for this problem was to create a package to contain the class.
Something along the lines of:
MyClass/
__init__.py
method_a.py
method_b.py
...
in my case __init__.py contains the actual datastructure definition, but no methods. To 'attach' the methods to the class I just import them into the class' namespace.
Contents of method_a.py:
def method_a(self, msg):
print 'a: %s' % str(msg)
Contents of __init__.py:
class MyClass():
from method_a import method_a
from method_b import method_b
def method_c(self):
print 'c'
In the python console:
>>> from MyClass import MyClass
>>> a = MyClass()
>>> dir(a)
['__doc__', '__module__', 'method_a', 'method_b', 'method_c']
>>> a.method_a('hello world')
a: hello world
>>> a.method_c()
c
This has worked for me.
I was thinking that it would be natural to move the operations into a new class, say class OperationsOnBigStructure.
I would say, that's quite the opposite of what Object Oriented Design is all about. The idea behind OOD is to keep data and methods together.
Usually a (too) big class is a sign of too much responsibility: i.e. your class is simply doing too much. It seems that you first defined a data structure and then added functions to it. You could try to break the data structure into substructures and define independent classes for these (i.e. use aggregation). But without knowing more it's difficult to say...
Of course sometimes, a program just runs fine with one big class. But if you feel incomfortable with it yourself, that's a strong hint to start doing something against ...
"""Now, I have two classes, but it feels like I haven't improved anything. In fact, I made things slightly more complicated, since I now need to pass the BigStructure object to the methods in OperationsOnBigStructure, and they need to store that object internally."""
I think a natural approach there would be to have "OperationsOnBigStructure" to inherit from bigstructure - therefore you have all the relevant code in one place, without the extra parameter passing,as the data it needs to operate on will be contained in "self".
For example, a typical operation may consist of traversing a big
structure instance in a way that is only meaningful for a BigStructure
object.
Perhaps you could write some generators as methods of BigStructure that would do the grunt work of traversal. Then, OperationsOnBigStructure could just loop over an iterator when doing a task, which might improve readability of the code.
So, by having two classes instead of one, you are raising the level of abstraction at two stages.
At first make sure you have high test coverage, this will boost your refactoring experience. If there are no or not enough unittests, create them.
Then make reasonable small refactoring steps and keep the unittests working:
As a rule of thumb try to keep the core functionality together in the big class. Try not to draw a border if there is too much coupling.
At first refactor sub tasks to functions in seperate libraries. If it is possible to abstract things, move that functionality to libraries.
Then make the code more clean and reorder it, until you can see the structure it really 'wants' to have.
If in the end you feel you can still cut it in two classes and this is quite natural according to the inner structure, then consider really doing it.
Always keep the test coverage high enough and refactor always after you made some changes. AFter some time you will have much more beautiful code.
General Python Question
I'm importing a Python library (call it animals.py) with the following class structure:
class Animal(object): pass
class Rat(Animal): pass
class Bat(Animal): pass
class Cat(Animal): pass
...
I want to add a parent class (Pet) to each of the species classes (Rat, Bat, Cat, ...); however, I cannot change the actual source of the library I'm importing, so it has to be a run time change.
The following seems to work:
import animals
class Pet(object): pass
for klass in (animals.Rat, animals.Bat, animals.Cat, ...):
klass.__bases__ = (Pet,) + klass.__bases__
Is this the best way to inject a parent class into an inheritance tree in Python without making modification to the source definition of the class to be modified?
Motivating Circumstances
I'm trying to graft persistence onto the a large library that controls lab equipment. Messing with it is out of the question. I want to give ZODB's Persistent a try. I don't want to write the mixin/facade wrapper library because I'm dealing with 100+ classes and lots of imports in my application code that would need to be updated. I'm testing options by hacking on my entry point only: setting up the DB, patching as shown above (but pulling the species classes w/ introspection on the animals module instead of explicit listing) then closing out the DB as I exit.
Mea Culpa / Request
This is an intentionally general question. I'm interested in different approaches to injecting a parent and comments on the pros and cons of those approaches. I agree that this sort of runtime chicanery would make for really confusing code. If I settle on ZODB I'll do something explicit. For now, as a regular user of python, I'm curious about the general case.
Your method is pretty much how to do it dynamically. The real question is: What does this new parent class add? If you are trying to insert your own methods in a method chain that exists in the classes already there, and they were not written properly, you won't be able to; if you are adding original methods (e.g. an interface layer), then you could possibly just use functions instead.
I am one who embraces Python's dynamic nature, and would have no problem using the code you have presented. Make sure you have good unit tests in place (dynamic or not ;), and that modifying the inheritance tree actually lets you do what you need, and enjoy Python!
You should try really hard not to do this. It is strange, and will likely end in tears.
As #agf mentions, you can use Pet as a mixin. If you tell us more about why you want to insert a parent class, we can help you find a nicer solution.
Let's suppose I have several functions for a RPG I'm working on...
def name_of_function():
action
and wanted to implement axe class (see below) into each function without having to rewrite each class. How would I create the class as a global class. I'm not sure if I'm using the correct terminology or not on that, but please help. This has always held me abck from creating Text based RPG games. An example of a global class would be awesome!
class axe:
attack = 5
weight = 6
description = "A lightweight battle axe."
level_required = 1
price = 10
You can't create anything that's truly global in Python - that is, something that's magically available to any module no matter what. But it hardly matters once you understand how modules and importing work.
Typically, you create classes and organize them into modules. Then you import them into whatever module needs them, which adds the class to the module's symbol table.
So for instance, you might create a module called weapons.py, and create a WeaponBase class in it, and then Axe and Broadsword classes derived from WeaponsBase. Then, in any module that needed to use weapons, you'd put
import weapons
at the top of the file. Once you do this, weapons.Axe returns the Axe class, weapons.Broadsword returns the Broadsword class, and so on. You could also use:
from weapons import Axe, Broadsword
which adds Axe and Broadsword to the module's symbol table, allowing code to do pretty much exactly what you are saying you want it to do.
You can also use
from weapons import *
but this generally is not a great idea for two reasons. First, it imports everything in the module whether you're going to use it or not - WeaponsBase, for instance. Second, you run into all kinds of confusing problems if there's a function in weapons that's got the same name as a function in the importing module.
There are a lot of subtleties in the import system. You have to be careful to make sure that modules don't try to import each other, for instance. And eventually your project gets large enough that you don't want to put all of its modules in the same directory, and you'll have to learn about things like __init__.py. But you can worry about that down the road.
i beg to differ with the view that you can't create something truly global in python. in fact, it is easy. in Python 3.1, it looks like this:
def get_builtins():
"""Due to the way Python works, ``__builtins__`` can strangely be either a module or a dictionary,
depending on whether the file is executed directly or as an import. I couldn’t care less about this
detail, so here is a method that simply returns the namespace as a dictionary."""
return getattr( __builtins__, '__dict__', __builtins__ )
like a bunch of other things, builtins are one point where Py3 differs in details from the way it used to work in Py2. read the "What's New in Python X.X" documents on python.org for details. i have no idea what the reason for the convention mentioned above might be; i just want to ignore that stuff. i think that above code should work in Py2 as well.
so the point here is there is a __builtins__ thingie which holds a lot of stuff that comes as, well, built-into Python. all the sum, max, range stuff you've come to love. well, almost everything. but you don't need the details, really. the simplest thing you could do is to say
G = get_builtins()
G[ 'G' ] = G
G[ 'axe' ] = axe
at a point in your code that is always guaranteed to execute. G stands in for the globally available namespace, and since i've registered G itself within G, G now magically transcends its existence into the background of every module. means you should use it with care. where naming collisions occur between whatever is held in G and in a module's namespace, the module's namespace should win (as it gets inspected first). also, be prepared for everybody to jump on you when you tell them you're POLLUTING THE GLOBAL NAMESPACE dammit. i'm relly surprised noone has copmplained about that as yet, here.
well, those people would be quite right, in a way. personally, however, this is one of my main application composition techniques: what you do is you take a step away from the all-purpose module (which shouldn't do such a thing) towards a fine-tuned application-specific namespace. your modules are bound not to work outside that namespace, but then they're not supposed to, either. i actually started this style of programming as an experimental rebellion against (1) established views, hah!, and (2) the desperation that befalls me whenever i want to accomplish something less than trivial using Python's import statement. these days, i only use import for standard library stuff and regularly-installed modules; for my own stuff, i use a homegrown system. what can i say, it works!
ah yes, two more points: do yourself a favor, if you like this solution, and write yourself a publish() method or the like that oversees you never publish a name that has already been taken. in most cases, you do not want that.
lastly, let me second the first commenter: i have been programming in exactly the style you show above, coz that's what you find in the textbook examples (most of the time using cars, not axes to be sure). for a rather substantial number of reasons, i've pretty much given up on that.
consider this: JSON defines but seven kinds of data: null, true, false, numbers, texts, lists, dictionaries, that's it. i claim you can model any other useful datatype with those.
there is still a lot of justification for things like sets, bags, ordered dictionaries and so on. the claim here is not that it is always convenient or appropriate to fall back on a pure, directly JSON-compatible form; the claim is only that it is possible to simulate. right now, i'm implementing a sparse list for use in a messaging system, and that data type i do implement in classical OOP. that's what it's good for.
but i never define classes that go beyond these generic datatypes. rather, i write libraries that take generic datatypes and that provide the functionality you need. all of my business data (in your case probably representations of players, scenes, implements and so on) go into generic data container (as a rule, mostly dicts). i know there are open questions with this way of architecturing things, but programming has become ever so much easier, so much more fluent since i broke thru the BS that part of OOP propaganda is (apart from the really useful and nice things that another part of OOP is).
oh yes, and did i mention that as long as you keep your business data in JSON-compatible objects you can always write them to and resurrect them from the disk? or send them over the wire so you can interact with remote players? and how incredibly twisted the serialization business can become in classical OOP if you want to do that (read this for the tip of the iceberg)? most of the technical detail you have to know in this field is completely meaningless for the rest of your life.
You can add (or change existing) Python built-in functions and classes by doing either of the following, at least in Py 2.x. Afterwards, whatever you add will available to all code by default, although not permanently.
Disclaimer: Doing this sort of thing can be dangerous due to possible name clashes and problematic due to the fact that it's extremely non-explicit. But, hey, as they say, we're all adults here, right?
class MyCLass: pass
# one way
setattr(__builtins__, 'MyCLass', MyCLass)
# another way
import __builtin__
__builtin__.MyCLass = MyCLass
Another way is to create a singleton:
class Singleton(type):
def __init__(cls, name, bases, dict):
super(Singleton, cls).__init__(name, bases, dict)
cls.instance = None
class GlobalClass(object):
__metaclass__ = Singleton
def __init__():
pinrt("I am global and whenever attributes are added in one instance, any other instance will be affected as well.")