best practise to init member - python

I have a question about best practices for initialising a class. Consider a class that is complex and has a lot of members. It's bad practice to initialise them outside of __init__(), but how can I handle this without ending up with an enormous __init__() method?
An example:
class A:
    def __init__(self):
        self.member0 = "a"
        # to keep the init method from getting too big,
        # move some initialisation out of it
        self.init_other_stuff()
    def init_other_stuff(self):
        self.member1 = "b"
        self.member2 = "c"
        ...
Thanks in advance.
[update] To clarify. The goal is not to put the stuff into another long method of course. Instead you can split the initialisation into different parts like:
def init_network_stuff(self):
    """ init network setup """
    self.my_new_socket = socket.socket(..)
def init_local_stuff(self):
    """ init local setup """
    self.my_new_logpath = "/foo/bar/log"
...

I would also agree that having too many attributes is probably a sign of insufficient abstraction, which is usually harder to develop & debug, and so changing your design is probably a good idea.
But, you do have a gold badge, so you've obviously been around and (probably) know what you are doing, and might have a reason to do this. In that case, I think it's a good idea to split up the initialization by category like you suggested. The only suggestion I would have is to put a leading underscore on the sub-init methods to signal to a user that they aren't intended for normal use (the example below uses two underscores, which additionally triggers name mangling), i.e.
class A:
    def __init__(self):
        self.member0 = "a"
        self.__init_other_stuff()
    def __init_other_stuff(self):
        self.member1 = "b"
        self.member2 = "c"
etc. This also hides them from tab-completion in most consoles & editors.
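One detail worth knowing about the double-underscore form used above: inside a class body, Python rewrites a name such as __init_other_stuff to _A__init_other_stuff (name mangling). A minimal sketch, reusing the class from this answer, of what that looks like from the outside:

```python
class A:
    def __init__(self):
        self.member0 = "a"
        self.__init_other_stuff()  # compiled as self._A__init_other_stuff()

    def __init_other_stuff(self):
        self.member1 = "b"
        self.member2 = "c"


a = A()
# Only the mangled name exists on the class.
print(hasattr(a, "_A__init_other_stuff"))  # True
print(hasattr(a, "__init_other_stuff"))    # False
```

A single leading underscore gives the same "internal use" signal without mangling, which is usually the simpler choice if subclasses may want to override the helper.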
Your other choice is to make this class a subclass of multiple classes that implement part of the interface if you really need A to contain those items directly, i.e. do something like this:
class Logger(object):
    def __init__(self):
        self.logparam = 1
class NetworkSource(object):
    def __init__(self):
        self.netparam = 2
class A(Logger, NetworkSource):
    def __init__(self):
        Logger.__init__(self)
        NetworkSource.__init__(self)
In [2]: a = A()
In [3]: a.<tab>
a.logparam  a.netparam
Then it would get the functionality of both classes, while having a relatively short init. Multiple inheritance is conceptually a little more complicated though IMHO.
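For what it's worth, the same multiple-inheritance layout can also be written with cooperative super() calls instead of naming each base explicitly. This is only a sketch, assuming Python 3 (zero-argument super()) and assuming every class in the hierarchy forwards leftover keyword arguments; member0 is carried over from the question.

```python
class Logger(object):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)  # hand over to the next class in the MRO
        self.logparam = 1


class NetworkSource(object):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.netparam = 2


class A(Logger, NetworkSource):
    def __init__(self):
        super().__init__()  # enters Logger.__init__, which chains to NetworkSource
        self.member0 = "a"


a = A()
print([cls.__name__ for cls in A.__mro__])
# ['A', 'Logger', 'NetworkSource', 'object']
print(a.logparam, a.netparam, a.member0)  # 1 2 a
```

The method resolution order printed above is what decides which __init__ runs next, which is part of why multiple inheritance takes a bit more thought.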

Related

At work, see python programmer define an empty class inside of a class to get "dot structure". Is this bad?

I have seen this from a coworker, and want to know if this is legit. A class is defined in order to get the self.thing.morethings structure.
import numpy as np
class MyBigThing:
    def __init__(self):
        self.thing_builder()
    def compute(self):
        print(self.thing.morethings)
    def thing_builder(self):
        self.x = 2
        self.y = 2
        self.z1 = (2, 2)
        class thing:
            pass
        self.thing = thing
        self.thing.morethings = np.zeros(self.z1)
Personally, I would define an entirely separate class. I think the first implementation serves no purpose and would be better written as below, which lets us construct more Things later on rather than creating them ad hoc inside MyBigThing.
import numpy as np
class MyBigThing:
    def __init__(self):
        self.thing_builder()
    def compute(self):
        print(self.thing.morethings)
    def thing_builder(self):
        self.x = 2
        self.y = 2
        self.z1 = (2, 2)
        self.thing = Thing(self.z1)
class Thing:
    def __init__(self, shape):
        self.morethings = np.zeros(shape)
You can test either one with:
import test
x = test.MyBigThing()
x.compute()
You should get with either implementation:
[[0. 0.]
 [0. 0.]]
From a software development standpoint, doing such a thing is pretty non-standard and can lead to some confusion. I can imagine that there might be some benefit of grouping different variables inside common namespaces, however my spidey senses are telling me that this programmer is not following good development practices.
First of all, the class thing only exists inside of the function and not in class mybigthing, so right away we know that the code is invalid. What the programmer is intending to do is to have self.thing be the actual class thing and serve as some sort of namespace separator between variables, but the code you posted in fact is not even valid in Python:
class mybigthing:
    def thing_builder(self):
        class thing:
            pass
        self.thing.morethings=1
a = mybigthing()
a.thing_builder()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 5, in thing_builder
AttributeError: 'mybigthing' object has no attribute 'thing'
Okay, maybe you meant to do something like this:
class mybigthing:
    # code stuff
    class thing:
        pass
    # code stuff here
    def thing_builder(self):
        self.thing.morethings = 1
a = mybigthing()
a.thing_builder()
print(a.thing)
print(a.thing.morethings)
>>> <class '__main__.mybigthing.thing'>
>>> 1
That makes more sense. This is behaving as I expected. As to the question of whether this is valid: yes, this is valid Python, but it is extremely bad software development practice because the attribute morethings is only ever assigned inside mybigthing, rather than in thing as we would logically expect. This makes it unclear to other programmers what thing is supposed to represent and what attributes it is supposed to have. You first need to ask yourself why you are creating a dummy class, and for what purpose these variables need to be grouped in separate namespaces/classes.
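If the only goal is the self.thing.morethings dot structure, one option (not something the coworker's code uses, just a standard-library alternative) is types.SimpleNamespace, which gives each instance its own attribute bag without defining an empty class. A rough sketch, with plain nested lists standing in for np.zeros so that it runs without NumPy:

```python
from types import SimpleNamespace


class MyBigThing:
    def __init__(self):
        self.x = 2
        self.y = 2
        self.z1 = (2, 2)
        rows, cols = self.z1
        # A per-instance namespace instead of an ad hoc empty class.
        self.thing = SimpleNamespace(
            morethings=[[0.0] * cols for _ in range(rows)]
        )

    def compute(self):
        return self.thing.morethings


big = MyBigThing()
print(big.compute())  # [[0.0, 0.0], [0.0, 0.0]]
```

This keeps the grouping but still does not document what thing is meant to hold, so a real class remains the clearer choice once the group has behaviour of its own.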
The pattern your co-worker is probably using is this one:
class Class:
    class AnotherClass:
        pass
    def __init__(self, **attrs):
        pass
Which is functionally equivalent to:
class AnotherClass:
    pass
class Class:
    AnotherClass = AnotherClass
While this isn't common practice, it is far from unprecedented or bad practice in Python and there are some major frameworks that use this pattern, like Django which uses this pattern to define the Meta options class for a given model:
class Book(models.Model):
    class Meta:
        unique_together = (("author", "title"),)
    author = models.CharField(max_length=100)
    title = models.CharField(max_length=100)
When used this way it's essentially a class that will only ever be used by Book via the Meta class attribute name. Defining this inline is an idiom which groups vital information about the behavior of the Book model with the rest of the Book model definition, which makes the code more coherent.
I have used this idiom to define schemas for rest endpoints with marshmallow:
class API(Resource):
    class Schema(marshmallow.Schema):
        value = marshmallow.fields.String()
    def get(self):
        return self.Schema().dump({ "value": "yes" })
Neither of these examples makes use of this idiom as a way to namespace class definitions, which I would count as a very poor reason to use this idiom. If I encountered this in code review, I would want to verify that this class was only ever used by the class whose scope it was defined in, and that it implemented a contract expected by the parent class.

setattr for parent class to use in children

I have a library with one parent and a dozen of children:
# mylib1.py:
#
class Foo(object):
    def __init__(self, a):
        self.a = a
class FooChild(Foo):
    def __init__(self, a, b):
        super(FooChild, self).__init__(a)
        self.b = b
# more children here...
Now I want to extend that library with a simple method (though one that is a bit specific, for use in another approach). So I would like to change the parent class and use its children.
# mylib2.py:
#
import mylib1
def fooMethod(self):
    print('a={}, b={}'.format(self.a, self.b))
setattr(mylib1.Foo, 'fooMethod', fooMethod)
And now I can use it like this:
# way ONE:
import mylib2
fc = mylib2.mylib1.FooChild(3, 4)
fc.fooMethod()
or like this:
# way TWO:
# order doesn't matter here:
import mylib1
import mylib2
fc = mylib1.FooChild(3, 4)
fc.fooMethod()
So, my questions are:
Is this a good thing to do?
How could this be done in a better way?
A common approach is to use a mixin.
If you want, you could add it dynamically; see "How do I dynamically add mixins as base classes without getting MRO errors?".
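As a rough illustration of the mixin idea, the extra method can live in its own small class and be combined with the library class in an opt-in subclass. DescribeMixin and DescribedFooChild are hypothetical names, and fooMethod returns the string here rather than printing it so the result is easy to check:

```python
class Foo(object):
    def __init__(self, a):
        self.a = a


class FooChild(Foo):
    def __init__(self, a, b):
        super(FooChild, self).__init__(a)
        self.b = b


class DescribeMixin(object):
    """Hypothetical mixin: adds the extra method without touching Foo."""

    def fooMethod(self):
        return 'a={}, b={}'.format(self.a, self.b)


class DescribedFooChild(DescribeMixin, FooChild):
    """Opt-in subclass that would live in the extending module."""


fc = DescribedFooChild(3, 4)
print(fc.fooMethod())                  # a=3, b=4
print(hasattr(FooChild, 'fooMethod'))  # False: the original class is unchanged
```

The trade-off is one extra subclass per child that needs the method, in exchange for leaving the original library classes untouched.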
There is a general rule in programming, that you should avoid dependence on global state. Which in other words means that your globals should be if possible constant. Classes are (mostly) globals.
Your approach is called monkey patching. And if you don't have a really really good reason to explain it, you should avoid it. This is because monkey patching violates the above rule.
Imagine you have two separate modules and both of them use this approach. One of them sets Foo.fooMethod to some method. The other - to another. Then you somehow switch control between these modules. The result would be, that it would be hard to determine what fooMethod is used where. This means hard to debug problems.
There are people (e.g. Brandon Craig-Rhodes), who believe that patching is bad even in tests.
What I would suggest is to use some attribute that you would set when instantiating instances of your Foo() class (and its children), that would control the behaviour of your fooMethod. Then the behaviour of this method would depend on how you instantiated the object, not on global state.
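A minimal sketch of that suggestion, assuming you are free to edit the library classes themselves. The formatter parameter and the with_b function are hypothetical names introduced only for this example:

```python
class Foo(object):
    def __init__(self, a, formatter=None):
        self.a = a
        # Hypothetical hook: a callable chosen per instance, not per class.
        self._formatter = formatter or (lambda obj: 'a={}'.format(obj.a))

    def fooMethod(self):
        return self._formatter(self)


class FooChild(Foo):
    def __init__(self, a, b, formatter=None):
        super(FooChild, self).__init__(a, formatter=formatter)
        self.b = b


def with_b(obj):
    return 'a={}, b={}'.format(obj.a, obj.b)


plain = FooChild(3, 4)
custom = FooChild(3, 4, formatter=with_b)
print(plain.fooMethod())   # a=3
print(custom.fooMethod())  # a=3, b=4
```

Two modules can now ask for different behaviour without affecting each other, because nothing on the shared class object is modified.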

Overriding an inner function of a method in python

That is a kind of best practices question.
I have a class structure with some methods defined. In some cases I want to override a particular part of a method. First thought on that is splitting my method to more atomic pieces and override related parts like below.
class myTest(object):
    def __init__(self):
        pass
    def myfunc(self):
        self._do_atomic_job()
        ...
        ...
    def _do_atomic_job(self):
        print("Hello")
That is a practical-looking way to solve the problem. But since there are too many parameters that need to be passed to and received back from _do_atomic_job(), I do not want to pass and retrieve tons of parameters. Another option is storing these parameters as instance attributes with self.param_var etc., but those parameters are used in only a small part of the code, and using self is not my preferred way of solving this.
Last option I thought is using inner functions. (I know I will have problems in variable scopes but as I said, this is a best practise and just ignore them and think scope and all things about the inner functions are working as expected)
class MyTest2(object):
    mytext = ""
    def myfunc(self):
        def _do_atomic_job():
            mytext = "Hello"
        _do_atomic_job()
        print(mytext)
Lets assume that works as expected. What I want to do is overriding the inner function _do_atomic_job()
class MyTest3(MyTest2):
    def __init__(self):
        super(MyTest3, self).__init__()
        self.myfunc._do_atomic_job = self._alt_do_atomic_job # Of course this does not work!
    def _alt_do_atomic_job(self):
        mytext = "Hollla!"
So what I want to achieve is overriding the inner function _do_atomic_job of the inherited class's method.
Is it possible?
Either factoring _do_atomic_job() into a proper method, or maybe factoring it
into its own class, seems like the best approach to take. Overriding an inner
function can't work, because you won't have access to the local variables of the
containing method.
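A small sketch of the "proper method" route, assuming the job can take its inputs as arguments and hand its result back as a return value. The name parameter and the upper-casing step are made up for illustration:

```python
class MyTest(object):
    def myfunc(self):
        text = self._do_atomic_job("World")
        return text.upper()

    def _do_atomic_job(self, name):
        # Inputs arrive as arguments, results leave via the return value.
        return "Hello " + name


class MyTest3(MyTest):
    def _do_atomic_job(self, name):
        return "Hollla " + name


print(MyTest().myfunc())   # HELLO WORLD
print(MyTest3().myfunc())  # HOLLLA WORLD
```

Because myfunc looks the helper up through self, the subclass version is picked automatically and myfunc itself does not need to be rewritten.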
You say that _do_atomic_job() takes a lot of parameters and returns lots of values. Maybe you can group some of these parameters into reasonable objects:
_do_atomic_job(start_x, start_y, end_x, end_y) # Separate coordinates
_do_atomic_job(start, end) # Better: start/end points
_do_atomic_job(rect) # Even better: rectangle
If you can't do that, and _do_atomic_job() is reasonably self-contained,
you could create helper classes AtomicJobParams and AtomicJobResult.
An example using namedtuples instead of classes:
from collections import namedtuple
AtomicJobParams = namedtuple('AtomicJobParams', ['a', 'b', 'c', 'd'])
jobparams = AtomicJobParams(a, b, c, d)
_do_atomic_job(jobparams) # Returns AtomicJobResult
Finally, if the atomic job is self-contained, you can even factor it into its
own class AtomicJob.
class AtomicJob:
    def __init__(self, a, b, c, d):
        self.a = a
        self.b = b
        self.c = c
        self.d = d
        self._do_atomic_job()
    def _do_atomic_job(self):
        ...
        self.result_1 = 42
        self.result_2 = 23
        self.result_3 = 443
Overall, this seems more like a code factorization problem. Aim for rather lean
classes that delegate work to helpers where appropriate. Follow the single responsibility principle. If values belong together, bundle them up in a value class.
As David Miller (a prominent Linux kernel developer) recently said:
If you write interfaces with more than 4 or 5 function arguments, it's
possible that you and I cannot be friends.
Inner variables are related to where they are defined and not where they are executed. This prints "hello".
class MyTest2(object):
    def __init__(self):
        localvariable = "hello"
        def do_atomic_job():
            print(localvariable)
        self.do_atomic_job = do_atomic_job
    def myfunc(self):
        localvariable = "hollla!"
        self.do_atomic_job()
MyTest2().myfunc()
So I can't see any way you could use the local variables without passing them, which is probably the best way to do it.
Note: Passing locals() will get you a dict of the variables, this is considered quite bad style though.

do's and don'ts of __init__ method

I was just wondering if it's considered wildly inappropriate, just messy, or unconventional at all to use the init method to set variables by calling, one after another, the rest of the functions within a class. I have done things like, self.age = ch_age(), where ch_age is a function within the same class, and set more variables the same way, like self.name=ch_name() etc. Also, what about prompting for user input within init specifically to get the arguments with which to call ch_age? The latter feels a little wrong I must say. Any advice, suggestions, admonishments welcome!
I always favor being lazy: if you NEED to initialize everything in the constructor, you should--in a lot of cases, I put a general "reset" method in my class. Then you can call that method in init, and can re-initialize the class instance easily.
But if you don't need those variables initially, I feel it's better to wait to initialize things until you actually need them.
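One way to sketch this "wait until needed" idea is functools.cached_property, assuming Python 3.8 or later. The Report class, its path argument and the load_count counter are hypothetical and exist only to show when the work actually happens:

```python
from functools import cached_property


class Report(object):
    def __init__(self, path):
        self.path = path     # cheap, so set eagerly
        self.load_count = 0  # only here to show when loading happens

    @cached_property
    def lines(self):
        # Stand-in for expensive work; runs on first access only.
        self.load_count += 1
        return ["line from " + self.path]


r = Report("/foo/bar/log")
print(r.load_count)  # 0
r.lines
r.lines
print(r.load_count)  # 1
```

The value is computed on first access and then stored on the instance, so __init__ stays short and nothing is paid for attributes that are never used.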
For your specific case
class Blah1(object):
    def __init__(self):
        self.name=self.ch_name()
    def ch_name(self):
        return 'Ozzy'
you might as well use the property decorator. The following will have the same effect:
class Blah2(object):
    def __init__(self):
        pass
    @property
    def name(self):
        return 'Ozzy'
In both of the implementations above, the following code should not issue any exceptions:
>>> b1 = Blah1()
>>> b2 = Blah2()
>>> assert b1.name == 'Ozzy'
>>> assert b2.name == 'Ozzy'
If you wanted to provide a reset method, it might look something like this:
class Blah3(object):
    def __init__(self, name):
        self.reset(name)
    def reset(self, name):
        self.name = name

Is there a benefit to defining a class inside another class in Python?

What I'm talking about here are nested classes. Essentially, I have two classes that I'm modeling. A DownloadManager class and a DownloadThread class. The obvious OOP concept here is composition. However, composition doesn't necessarily mean nesting, right?
I have code that looks something like this:
class DownloadThread:
    def foo(self):
        pass
class DownloadManager():
    def __init__(self):
        self.dwld_threads = []
    def create_new_thread(self):
        self.dwld_threads.append(DownloadThread())
But now I'm wondering if there's a situation where nesting would be better. Something like:
class DownloadManager():
    class DownloadThread:
        def foo(self):
            pass
    def __init__(self):
        self.dwld_threads = []
    def create_new_thread(self):
        self.dwld_threads.append(DownloadManager.DownloadThread())
You might want to do this when the "inner" class is a one-off, which will never be used outside the definition of the outer class. For example to use a metaclass, it's sometimes handy to do
class Foo(object):
    class __metaclass__(type):
        ....
instead of defining a metaclass separately, if you're only using it once.
The only other time I've used nested classes like that, I used the outer class only as a namespace to group a bunch of closely related classes together:
class Group(object):
    class cls1(object):
        ...
    class cls2(object):
        ...
Then from another module, you can import Group and refer to these as Group.cls1, Group.cls2 etc. However one might argue that you can accomplish exactly the same (perhaps in a less confusing way) by using a module.
I don't know Python, but your question seems very general. Ignore me if it's specific to Python.
Class nesting is all about scope. If you think that one class will only make sense in the context of another one, then the former is probably a good candidate to become a nested class.
It is a common pattern to make helper classes private, nested classes.
There is another usage for nested class, when one wants to construct inherited classes whose enhanced functionalities are encapsulated in a specific nested class.
See this example:
class foo:
    class bar:
        ... # functionalities of a specific sub-feature of foo
    def __init__(self):
        self.a = self.bar()
        ...
    ... # other features of foo
class foo2(foo):
    class bar(foo.bar):
        ... # enhanced functionalities for this specific feature
    def __init__(self):
        foo.__init__(self)
Note that in the constructor of foo, the line self.a = self.bar() will construct a foo.bar when the object being constructed is actually a foo object, and a foo2.bar object when the object being constructed is actually a foo2 object.
If the class bar were defined outside of class foo instead, along with its inherited version (which would be called bar2, for example), then defining the new class foo2 would be much more painful, because the constructor of foo2 would need to have its first line replaced by self.a = bar2(), which implies rewriting the whole constructor.
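A runnable version of that pattern, with capitalised names and a hypothetical describe() method standing in for the elided functionality:

```python
class Foo:
    class Bar:
        def describe(self):
            return "basic"

    def __init__(self):
        # Looked up through the instance's class, so subclasses can swap Bar.
        self.a = self.Bar()


class Foo2(Foo):
    class Bar(Foo.Bar):
        def describe(self):
            return "enhanced"
    # No __init__ needed: Foo.__init__ picks up Foo2.Bar via self.Bar.


print(type(Foo().a).__qualname__, Foo().a.describe())    # Foo.Bar basic
print(type(Foo2().a).__qualname__, Foo2().a.describe())  # Foo2.Bar enhanced
```

The key is that the constructor refers to self.Bar rather than to a module-level name, so attribute lookup on the subclass decides which nested class gets instantiated.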
You could be using a class as class generator. Like (in some off the cuff code :)
class gen(object):
    class base_1(object): pass
    ...
    class base_n(object): pass
    def __init__(self, ...):
        ...
    def mk_cls(self, ..., type):
        '''makes a class based on the type passed in, the current state of
        the class, and the other inputs to the method'''
I feel like when you need this functionality it will be very clear to you. If you don't need to be doing something similar then it probably isn't a good use case.
There is really no benefit to doing this, except if you are dealing with metaclasses.
The class: suite really isn't what you think it is. It is a weird scope, and it does strange things. It really doesn't even make a class! It is just a way of collecting some variables: the name of the class, the bases, a little dictionary of attributes, and a metaclass.
The name, the dictionary and the bases are all passed to the function that is the metaclass, and then it is assigned to the variable 'name' in the scope where the class: suite was.
What you can gain by messing with metaclasses, and indeed by nesting classes within your stock standard classes, is harder to read code, harder to understand code, and odd errors that are terribly difficult to understand without being intimately familiar with why the 'class' scope is entirely different to any other python scope.
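To make the description of the class statement above a little more concrete, here is a rough sketch (Python 3, default metaclass type) comparing a class statement with the equivalent direct call to the metaclass. Base, Point, Point2 and norm1 are hypothetical names, and the two forms are only approximately the same (attributes such as __qualname__ can differ):

```python
class Base(object):
    pass


# The class statement...
class Point(Base):
    dims = 2

    def norm1(self):
        return abs(self.x) + abs(self.y)


# ...is roughly the same as calling the metaclass with
# the name, the bases and the dictionary of attributes.
def norm1(self):
    return abs(self.x) + abs(self.y)


Point2 = type("Point2", (Base,), {"dims": 2, "norm1": norm1})

p = Point2()
p.x, p.y = 3, -4
print(type(Point) is type(Point2) is type)  # True
print(p.norm1(), Point2.dims)               # 7 2
```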
A good use case for this feature is Error/Exception handling, e.g.:
class DownloadManager(object):
    class DownloadException(Exception):
        pass
    def download(self):
        ...
Now the one who is reading the code knows all the possible exceptions related to this class.
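A short sketch of how a caller might use such a nested exception. The url parameter and the failure condition are hypothetical and exist only to trigger the error:

```python
class DownloadManager(object):
    class DownloadException(Exception):
        pass

    def download(self, url):
        # Hypothetical failure condition, only to trigger the exception.
        if not url.startswith("http"):
            raise self.DownloadException("unsupported url: " + url)
        return "ok"


manager = DownloadManager()
try:
    manager.download("ftp://example.invalid/file")
except DownloadManager.DownloadException as exc:
    message = str(exc)
    print(message)  # unsupported url: ftp://example.invalid/file
```

Callers only need to import DownloadManager to be able to catch its errors, which is the discoverability benefit this answer describes.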
Either way, defined inside or outside of a class, would work. Here is an employee pay schedule program where the helper class EmpInit is embedded inside the class Employee:
class Employee:
    def level(self, j):
        return j * 5E3
    def __init__(self, name, deg, yrs):
        self.name = name
        self.deg = deg
        self.yrs = yrs
        self.empInit = Employee.EmpInit(self.deg, self.level)
        self.base = Employee.EmpInit(self.deg, self.level).pay
    def pay(self):
        if self.deg in self.base:
            return self.base[self.deg]() + self.level(self.yrs)
        print(f"Degree {self.deg} is not in the database {self.base.keys()}")
        return 0
    class EmpInit:
        def __init__(self, deg, level):
            self.level = level
            self.j = deg
            self.pay = {1: self.t1, 2: self.t2, 3: self.t3}
        def t1(self): return self.level(1*self.j)
        def t2(self): return self.level(2*self.j)
        def t3(self): return self.level(3*self.j)
if __name__ == '__main__':
    for loop in range(10):
        lst = [item for item in input(f"Enter name, degree and years : ").split(' ')]
        e1 = Employee(lst[0], int(lst[1]), int(lst[2]))
        print(f'Employee {e1.name} with degree {e1.deg} and years {e1.yrs} is making {e1.pay()} dollars')
        print("EmpInit deg {0}\nlevel {1}\npay[deg]: {2}".format(e1.empInit.j, e1.empInit.level, e1.base[e1.empInit.j]))
To define it outside, just un-indent EmpInit and change Employee.EmpInit() to simply EmpInit() as a regular "has-a" composition. However, since Employee is the controller of EmpInit and users don't instantiate or interface with it directly, it makes sense to define it inside as it is not a standalone class. Also note that the instance method level() is designed to be called in both classes here. Hence it can also be conveniently defined as a static method in Employee so that we don't need to pass it into EmpInit, instead just invoke it with Employee.level().
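A trimmed sketch of the static-method variant described in the last two sentences. It assumes the same pay rules as the program above, drops the interactive loop, and uses a made-up employee for the check at the end:

```python
class Employee:
    @staticmethod
    def level(j):
        return j * 5E3

    class EmpInit:
        def __init__(self, deg):
            self.j = deg
            self.pay = {1: self.t1, 2: self.t2, 3: self.t3}

        # Employee is resolved when the method runs, so this works
        # even though EmpInit is defined inside Employee.
        def t1(self): return Employee.level(1 * self.j)
        def t2(self): return Employee.level(2 * self.j)
        def t3(self): return Employee.level(3 * self.j)

    def __init__(self, name, deg, yrs):
        self.name = name
        self.deg = deg
        self.yrs = yrs
        self.base = Employee.EmpInit(self.deg).pay

    def pay(self):
        if self.deg in self.base:
            return self.base[self.deg]() + self.level(self.yrs)
        return 0


e1 = Employee("Ann", 2, 3)  # hypothetical employee
print(e1.pay())  # 35000.0
```

EmpInit no longer needs a level argument, at the cost of naming Employee explicitly inside the nested class.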
