defining object method outside its class block, any reason not to? - python

The following surprised me, although perhaps it shouldn't have. However, I've never seen this done elsewhere
def bar_method(self):
print( f"in Foo method bar, baz={self.baz}")
# and pages more code in the real world
class Foo(object):
def __init__(self):
self.baz = "Quux"
bar = bar_method
>>> foo = Foo()
>>> foo.bar()
in Foo method bar, baz=Quux
>>>
This follows from the definition of def which is an assignment of a function object to its name in the current context.
The great advantage, as far as I am concerned, is that I can move large method definitions outside the class body and even into a different file, and merely link them into the class with a single assignment. Instantiating the class binds them as usual.
My question is simply, why have I never seen this done before? Is there anything lurking here that might bite me? Or if it's regarded as bad style, why?
(If you are wondering about my context, it's about Django View and Form subclasses. I would dearly love to keep them short, so that the business logic behind them is easy to follow. I'd far rather that methods of only cosmetic significance were moved elsewhere).

The great advantage, as far as I am concerned, is that I can move large method definitions outside the class body and even into a different file
I personnally wouldn't consider this as "a great advantage".
My question is simply, why have I never seen this done before?
Because there are very few good reasons to do so, and a lot of good reasons to NOT do so.
Or if it's regarded as bad style, why?
Because it makes it much harder to understand what's going on, quite simply. Debugging is difficult enough, and each file you have to navigate too adds to the "mental charge".
I would dearly love to keep them short, so that the business logic behind them is easy to follow. I'd far rather that methods of only cosmetic significance were moved elsewhere
Then put the important parts at the beginning of the class statement and the cosmetic ones at the end.
Also, ask yourself whether you're doing things at the right place - as far as I'm concerned, "business logic" mainly belongs to the domain layer (models), and "cosmetic" rather suggests presentation layer (views / templates). Of course some objects (forms and views specially) are actually dealing with both business rules and presentation, but even then you can often move some of this domain part to models (which the view or form will call on) and some of this presentation part to the template layer (using filters / custom tags).
My 2 cents...
NB: sorry if the above sounds a bit patronizing - I don't know what your code looks like, so I can't tell whether you're obviously already aware of this and doing the right thing already and just trying to improve further, or you're a beginner still fighting with how to achieve proper code separation...

Related

Python: What is wrong with subclassing a class from a parent package?

When I try to design package structures and class hierarchies within those packages in Python 3 projects, I'm constantly affected with circular import issues. This is whenever I want to implement a class in some module, subclassing a base class from a parent package (i.e. from some __init__.py from some parent directory).
Although I technically understand why that happens, I was not able to come up with a good solution so far. I found some threads here, but none of them go deeper in that particular scenario, let alone mentioning solutions.
In particular:
Putting everything in one file is maybe not great. It potentially can be quite a mass of things. Maybe 90% of the entire project code?! Maybe it wants to define a common subclass of things (whatever, e.g. widget base class for a ui lib), and then just lots of subclasses in nicely organized subpackages? There are obvious reasons why we would not write masses of stuff in one file.
Implementing your base class outside of __init__.py and just importing it there can work for a few places, but messes up the hierarchy with lots of aliases for the same thing (e.g. myproject.BaseClass vs. myproject.foo.BaseClass. Just not nice, is it? It also leads to a lot of boilerplate code that also wants to be maintained.
Implementing it outside, and even not importing it in __init__.py makes those "fully qualified" notations longer everywhere in the code, because it has to contain .foo everywhere I use BaseClass (yes, I usually do import myproject...somemodule, which is not what everybody does, but should not be bad afaik).
Some kinds of rather dirty tricks could help, e.g. defining those subclasses inside some kind of factory methods, so they are not defined at module level. Can work in a few situations, but is just terrible in general.
All of them are maybe okay in a single isolated situation, more as kind of a workaround imho, but in larger scale it ruins a lot. Also, the dirtier the tricks, the more it also breaks the services an IDE can provide.
All kinds of vague statements like 'then your class structure is probably bad and needs reworking' puzzle me a bit. I'm more or less distantly aware of some kinds of general good coding practices, although I never read "Clean Code" or similar stuff.
So, what is wrong with subclassing a class from a parent module from that abstract perspective? And if it is not, what is the secret workaround that people use in Python3? Is there a one, or is everybody basically dealing with one of the mentioned hacks? Sure, everywhere are tradeoffs to be made. But here I'm really struggling, even after many years of writing Python code.
Okay, thank you! I think I was a bit on the wrong way here in understanding the actual problem. It's essentially what #tdelaney said. So it's more complicated than what I initially wrote. It's allowed to do that, but you must not import the submodule in some parent packages. This is what I do here in one place for offering some convenience things. So, it's maybe not perfect, maybe the point where you could argue that my structure is not well designed (but, well, things are always a compromise). At least it's admittedly not true what I posted initially. And the workaround to use a few function level imports is maybe kind of okay in that place.

Best way to design a class in python

So, this is more like a philosophical question for someone who is trying to understand classes.
Most of time, how i use class is actually a very bad way to use it. I think of a lot of functions and after a time just indent the code and makes it a class and replacing few stuff with self.variable if a variable is repeated a lot. (I know its bad practise)
But anyways... What i am asking is:
class FooBar:
def __init__(self,foo,bar):
self._foo = foo
self._bar = bar
self.ans = self.__execute()
def __execute(self):
return something(self._foo, self._bar)
Now there are many ways to do this:
class FooBar:
def __init__(self,foo):
self._foo = foo
def execute(self,bar):
return something(self._foo, bar)
Can you suggest which one is bad and which one is worse?
or any other way to do this.
This is just a toy example (offcourse). I mean, there is no need to have a class here if there is one function.. but lets say in __execute something() calls a whole set of other methods.. ??
Thanks
If each FooBar is responsible for bar then the first is correct. If bar is only needed for execute() but not FooBar's problem otherwise, the second is correct.
In a single phrase, the formal term you want to worry about here is Separation of Concerns.
In response to your specific example, which of the two examples you choose depends on the concerns solved by FooBar and bar. If bar is in the same problem domain as FooBar or if it otherwise makes sense for FooBar to store a reference to bar, then the second example is correct. For instance, if FooBar has multiple methods that accept a bar and if you always pass the same instance of bar to each of a particular FooBar instance's bar-related methods, then the prior example is correct. Otherwise, the latter is more correct.
Ideally, each class you create should model exactly one major concern of your program. This can get a bit tricky, because you have to decide the granularity of "major concern" for yourself. Look to your dependency tree to determine if you're doing this correctly. It should be relatively easy to pull each individual class out of your program and test it in isolation from all other classes. If this is very difficult, you haven't split your concerns correctly. More formally, good separation of concerns is accomplished by designing classes which are cohesive and loosely coupled.
While this isn't useful for the example you posted, one simple pattern that helps accomplish this on a broader scale is Inversion of Control (IoC, sometimes called Dependency Injection). Under IoC, classes are written such that they aren't aware of their dependencies directly, only the interfaces (protocols in Python-speak) which their dependencies implement. Then at run time, typically during application initialization, instances of major classes are created by factories, and they are assigned references to their actual dependencies. See this article (with this example) for an explanation of how this can be accomplished in Python.
Finally, learn the four tenets of object-oriented programming.

splitting a class that is too large

I have a class BigStructure that builds a complicated data structure from some input. It also includes methods that perform operations on that data structure.
The class grew too large, so I'm trying to split it in two to help maintainability. I was thinking that it would be natural to move the operations into a new class, say class OperationsOnBigStructure.
Unfortunately, since class BigStructure is quite unique, OperationsOnBigStructure cannot be reasonably reused with any other class. In a sense, it's forever tied to BigStructure. For example, a typical operation may consist of traversing a big structure instance in a way that is only meaningful for a BigStructure object.
Now, I have two classes, but it feels like I haven't improved anything. In fact, I made things slightly more complicated, since I now need to pass the BigStructure object to the methods in OperationsOnBigStructure, and they need to store that object internally.
Should I just live with one big class?
The solution I came up with for this problem was to create a package to contain the class.
Something along the lines of:
MyClass/
__init__.py
method_a.py
method_b.py
...
in my case __init__.py contains the actual datastructure definition, but no methods. To 'attach' the methods to the class I just import them into the class' namespace.
Contents of method_a.py:
def method_a(self, msg):
print 'a: %s' % str(msg)
Contents of __init__.py:
class MyClass():
from method_a import method_a
from method_b import method_b
def method_c(self):
print 'c'
In the python console:
>>> from MyClass import MyClass
>>> a = MyClass()
>>> dir(a)
['__doc__', '__module__', 'method_a', 'method_b', 'method_c']
>>> a.method_a('hello world')
a: hello world
>>> a.method_c()
c
This has worked for me.
I was thinking that it would be natural to move the operations into a new class, say class OperationsOnBigStructure.
I would say, that's quite the opposite of what Object Oriented Design is all about. The idea behind OOD is to keep data and methods together.
Usually a (too) big class is a sign of too much responsibility: i.e. your class is simply doing too much. It seems that you first defined a data structure and then added functions to it. You could try to break the data structure into substructures and define independent classes for these (i.e. use aggregation). But without knowing more it's difficult to say...
Of course sometimes, a program just runs fine with one big class. But if you feel incomfortable with it yourself, that's a strong hint to start doing something against ...
"""Now, I have two classes, but it feels like I haven't improved anything. In fact, I made things slightly more complicated, since I now need to pass the BigStructure object to the methods in OperationsOnBigStructure, and they need to store that object internally."""
I think a natural approach there would be to have "OperationsOnBigStructure" to inherit from bigstructure - therefore you have all the relevant code in one place, without the extra parameter passing,as the data it needs to operate on will be contained in "self".
For example, a typical operation may consist of traversing a big
structure instance in a way that is only meaningful for a BigStructure
object.
Perhaps you could write some generators as methods of BigStructure that would do the grunt work of traversal. Then, OperationsOnBigStructure could just loop over an iterator when doing a task, which might improve readability of the code.
So, by having two classes instead of one, you are raising the level of abstraction at two stages.
At first make sure you have high test coverage, this will boost your refactoring experience. If there are no or not enough unittests, create them.
Then make reasonable small refactoring steps and keep the unittests working:
As a rule of thumb try to keep the core functionality together in the big class. Try not to draw a border if there is too much coupling.
At first refactor sub tasks to functions in seperate libraries. If it is possible to abstract things, move that functionality to libraries.
Then make the code more clean and reorder it, until you can see the structure it really 'wants' to have.
If in the end you feel you can still cut it in two classes and this is quite natural according to the inner structure, then consider really doing it.
Always keep the test coverage high enough and refactor always after you made some changes. AFter some time you will have much more beautiful code.

How can I create global classes in Python (if possible)?

Let's suppose I have several functions for a RPG I'm working on...
def name_of_function():
action
and wanted to implement axe class (see below) into each function without having to rewrite each class. How would I create the class as a global class. I'm not sure if I'm using the correct terminology or not on that, but please help. This has always held me abck from creating Text based RPG games. An example of a global class would be awesome!
class axe:
attack = 5
weight = 6
description = "A lightweight battle axe."
level_required = 1
price = 10
You can't create anything that's truly global in Python - that is, something that's magically available to any module no matter what. But it hardly matters once you understand how modules and importing work.
Typically, you create classes and organize them into modules. Then you import them into whatever module needs them, which adds the class to the module's symbol table.
So for instance, you might create a module called weapons.py, and create a WeaponBase class in it, and then Axe and Broadsword classes derived from WeaponsBase. Then, in any module that needed to use weapons, you'd put
import weapons
at the top of the file. Once you do this, weapons.Axe returns the Axe class, weapons.Broadsword returns the Broadsword class, and so on. You could also use:
from weapons import Axe, Broadsword
which adds Axe and Broadsword to the module's symbol table, allowing code to do pretty much exactly what you are saying you want it to do.
You can also use
from weapons import *
but this generally is not a great idea for two reasons. First, it imports everything in the module whether you're going to use it or not - WeaponsBase, for instance. Second, you run into all kinds of confusing problems if there's a function in weapons that's got the same name as a function in the importing module.
There are a lot of subtleties in the import system. You have to be careful to make sure that modules don't try to import each other, for instance. And eventually your project gets large enough that you don't want to put all of its modules in the same directory, and you'll have to learn about things like __init__.py. But you can worry about that down the road.
i beg to differ with the view that you can't create something truly global in python. in fact, it is easy. in Python 3.1, it looks like this:
def get_builtins():
"""Due to the way Python works, ``__builtins__`` can strangely be either a module or a dictionary,
depending on whether the file is executed directly or as an import. I couldn’t care less about this
detail, so here is a method that simply returns the namespace as a dictionary."""
return getattr( __builtins__, '__dict__', __builtins__ )
like a bunch of other things, builtins are one point where Py3 differs in details from the way it used to work in Py2. read the "What's New in Python X.X" documents on python.org for details. i have no idea what the reason for the convention mentioned above might be; i just want to ignore that stuff. i think that above code should work in Py2 as well.
so the point here is there is a __builtins__ thingie which holds a lot of stuff that comes as, well, built-into Python. all the sum, max, range stuff you've come to love. well, almost everything. but you don't need the details, really. the simplest thing you could do is to say
G = get_builtins()
G[ 'G' ] = G
G[ 'axe' ] = axe
at a point in your code that is always guaranteed to execute. G stands in for the globally available namespace, and since i've registered G itself within G, G now magically transcends its existence into the background of every module. means you should use it with care. where naming collisions occur between whatever is held in G and in a module's namespace, the module's namespace should win (as it gets inspected first). also, be prepared for everybody to jump on you when you tell them you're POLLUTING THE GLOBAL NAMESPACE dammit. i'm relly surprised noone has copmplained about that as yet, here.
well, those people would be quite right, in a way. personally, however, this is one of my main application composition techniques: what you do is you take a step away from the all-purpose module (which shouldn't do such a thing) towards a fine-tuned application-specific namespace. your modules are bound not to work outside that namespace, but then they're not supposed to, either. i actually started this style of programming as an experimental rebellion against (1) established views, hah!, and (2) the desperation that befalls me whenever i want to accomplish something less than trivial using Python's import statement. these days, i only use import for standard library stuff and regularly-installed modules; for my own stuff, i use a homegrown system. what can i say, it works!
ah yes, two more points: do yourself a favor, if you like this solution, and write yourself a publish() method or the like that oversees you never publish a name that has already been taken. in most cases, you do not want that.
lastly, let me second the first commenter: i have been programming in exactly the style you show above, coz that's what you find in the textbook examples (most of the time using cars, not axes to be sure). for a rather substantial number of reasons, i've pretty much given up on that.
consider this: JSON defines but seven kinds of data: null, true, false, numbers, texts, lists, dictionaries, that's it. i claim you can model any other useful datatype with those.
there is still a lot of justification for things like sets, bags, ordered dictionaries and so on. the claim here is not that it is always convenient or appropriate to fall back on a pure, directly JSON-compatible form; the claim is only that it is possible to simulate. right now, i'm implementing a sparse list for use in a messaging system, and that data type i do implement in classical OOP. that's what it's good for.
but i never define classes that go beyond these generic datatypes. rather, i write libraries that take generic datatypes and that provide the functionality you need. all of my business data (in your case probably representations of players, scenes, implements and so on) go into generic data container (as a rule, mostly dicts). i know there are open questions with this way of architecturing things, but programming has become ever so much easier, so much more fluent since i broke thru the BS that part of OOP propaganda is (apart from the really useful and nice things that another part of OOP is).
oh yes, and did i mention that as long as you keep your business data in JSON-compatible objects you can always write them to and resurrect them from the disk? or send them over the wire so you can interact with remote players? and how incredibly twisted the serialization business can become in classical OOP if you want to do that (read this for the tip of the iceberg)? most of the technical detail you have to know in this field is completely meaningless for the rest of your life.
You can add (or change existing) Python built-in functions and classes by doing either of the following, at least in Py 2.x. Afterwards, whatever you add will available to all code by default, although not permanently.
Disclaimer: Doing this sort of thing can be dangerous due to possible name clashes and problematic due to the fact that it's extremely non-explicit. But, hey, as they say, we're all adults here, right?
class MyCLass: pass
# one way
setattr(__builtins__, 'MyCLass', MyCLass)
# another way
import __builtin__
__builtin__.MyCLass = MyCLass
Another way is to create a singleton:
class Singleton(type):
def __init__(cls, name, bases, dict):
super(Singleton, cls).__init__(name, bases, dict)
cls.instance = None
class GlobalClass(object):
__metaclass__ = Singleton
def __init__():
pinrt("I am global and whenever attributes are added in one instance, any other instance will be affected as well.")

Are accessors in Python ever justified?

I realize that in most cases, it's preferred in Python to just access attributes directly, since there's no real concept of encapsulation like there is in Java and the like. However, I'm wondering if there aren't any exceptions, particularly with abstract classes that have disparate implementations.
Let's say I'm writing a bunch of abstract classes (because I am) and that they represent things having to do with version control systems like repositories and revisions (because they do). Something like an SvnRevision and an HgRevision and a GitRevision are very closely semantically linked, and I want them to be able to do the same things (so that I can have code elsewhere that acts on any kind of Repository object, and is agnostic of the subclass), which is why I want them to inherit from an abstract class. However, their implementations vary considerably.
So far, the subclasses that have been implemented share a lot of attribute names, and in a lot of code outside of the classes themselves, direct attribute access is used. For example, every subclass of Revision has an author attribute, and a date attribute, and so on. However, the attributes aren't described anywhere in the abstract class. This seems to me like a very fragile design.
If someone wants to write another implementation of the Revision class, I feel like they should be able to do so just by looking at the abstract class. However, an implementation of the class that satisfies all of the abstract methods will almost certainly fail, because the author won't know that they need attributes called 'author' and 'date' and so on, so code that tries to access Revision.author will throw an exception. Probably not hard to find the source of the problem, but irritating nonetheless, and it just feels like an inelegant design.
My solution was to write accessor methods for the abstract classes (get_id, get_author, etc.). I thought this was actually a pretty clean solution, since it eliminates arbitrary restrictions on how attributes are named and stored, and just makes clear what data the object needs to be able to access. Any class that implements all of the methods of the abstract class will work... that feels right.
Anyways, the team I'm working with hates this solution (seemingly for the reason that accessors are unpythonic, which I can't really argue with). So... what's the alternative? Documentation? Or is the problem I'm imagining a non-issue?
Note: I've considered properties, but I don't think they're a cleaner solution.
Note: I've considered properties, but I don't think they're a cleaner solution.
But they are. By using properties, you'll have the class signature you want, while being able to use the property as an attribute itself.
def _get_id(self):
return self._id
def _set_id(self, newid):
self._id = newid
Is likely similar to what you have now. To placate your team, you'd just need to add the following:
id = property(_get_id, _set_id)
You could also use property as a decorator:
#property
def id(self):
return self._id
#id.setter
def id(self, newid):
self._id = newid
And to make it readonly, just leave out set_id/the id.setter bit.
You've missed the point. It isn't the lack of encapsulation that removes the need for accessors, it's the fact that, by changing from a direct attribute to a property, you can add an accessor at a later time without changing the published interface in any way.
In many other languages, if you expose an attribute as public and then later want to wrap some code round it on access or mutation then you have to change the interface and anyone using the code has at the very least to recompile and possibly to edit their code also. Python isn't like that: you can flip flop between attribute or property just as much as you want and no code that uses the class will break.
Only behind a property.
"they should be able to do so just by looking at the abstract class"
Don't know what this should be true. A "programmer's guide", a "how to extend" document, plus some training seems appropriate to me.
"the author won't know that they need attributes called 'author' and 'date' and so on".
In that case, the documentation isn't complete. Perhaps the abstract class needs a better docstring. Or a "programmer's guide", or a "how to extend" document.
Also, it doesn't seem very difficult to (1) document these attributes in the docstring and (2) provide default values in the __init__ method.
What's wrong with providing extra support for programmers?
It sounds like you have a social problem, not a technical one. Writing code to solve a social problem seems like a waste of time and money.
The discussion already ended a year ago, but this snippet seemed telltale, it's worth discussing:
However, an implementation of the
class that satisfies all of the
abstract methods will almost certainly
fail, because the author won't know
that they need attributes called
'author' and 'date' and so on, so code
that tries to access Revision.author
will throw an exception.
Uh, something is deeply wrong. (What S. Lott said, minus the personal comments).
If these are required members, aren't they referenced (if not required) in the constructor, and defined by docstring? or at very least, as required args of methods, and again documented?
How could users of the class not know what the required members are?
To be the devil's advocate, what if your constructor(s) requires you to supply all the members that will/may be required, what issue does that cause?
Also, are you checking the parameters when passed, and throwing informative exceptions?
(The ideological argument of accessor-vs-property is a sidebar. Properties are preferable but I don't think that's the issue with your class design.)

Categories

Resources