In a large Python project (OpenERP) I keep encountering the following pattern:
In a module, a class with its methods is defined. Then, in the same module and immediately after the class definition, an instance of the class is created, and that instance is then used from other modules.
# in module_a.py:
class ClassA(object):
    def __init__(self, default="Hello world!"):
        self.default = default
    def my_method(self, data):
        print self.default
        print data
object_a = ClassA()
To me it looks simpler to define the methods as module functions, without the class lookup overhead:
# in module_b.py:
default = "Hello world!"
def my_method(data):
    print default
    print data
Seen from other modules, the usage is very similar:
from module_a import object_a as prefix
prefix.my_method("I'm so objective!")
versus:
import module_b as prefix
prefix.my_method("I'm so modular!")
Is there any rationale to prefer pattern A over pattern B? Or is pattern B more pythonic?
Sometimes, you want different clients to be able to use your module with different settings in such a way that they don't conflict with each other. For example, Python's random module provides a bunch of random number generation functions that are actually bound methods of a hidden Random instance. Most users don't care too much what algorithm generates their random numbers or whether other modules asking for random numbers will change the sequence. However, users who do care can get their own Random object and generate sequences of random numbers that won't be affected by other modules asking for random numbers.
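As a small sketch of that point (written in Python 3 syntax, with the seed values chosen arbitrarily for illustration), two clients can each hold a private random.Random instance, and draws from the shared module-level generator do not change their sequences:

```python
import random

# Two clients that each want a reproducible sequence create their own generators.
client_a = random.Random(42)
client_b = random.Random(42)

first_a = client_a.random()

# Some other module draws from the shared, module-level generator...
random.seed(0)
random.random()

# ...which does not disturb the private generators.
first_b = client_b.random()
second_a = client_a.random()
second_b = client_b.random()

print(first_a == first_b)
print(second_a == second_b)
```

Both comparisons hold because each Random instance carries its own state, which is exactly what a module-level function backed by a single hidden instance cannot offer.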
Sometimes, something that's global now might not always be global. For example, if you're working on a planetary-scale RTS, you might have a Planet class with one instance, because the battle only happens on one planet. However, you don't want to rule out the possibility of building something like Planetary Annihilation, with battles stretching across entire solar systems and dropping extinction-event asteroids as superweapons. If you get rid of the Planet class and make its methods and attributes module-level, it'll be much harder to go back and add more planets later.
Sometimes, it's more readable to have objects doing things instead of modules. For example, suppose module joebob defines two objects evil_overlord_bob and good_guy_joe.
class Bob(object):
    def slaughter_everything(self):
        print "Muahahaha! Die for my amusement!"
class Joe(object):
    def stop_bob(self):
        print "I won't let you hurt those innocents!"
evil_overlord_bob = Bob()
good_guy_joe = Joe()
Suppose Bob and Joe are very unique people. It's unthinkable that you'd want to create another object anything like Bob or Joe. In that case, you could move slaughter_everything and stop_bob to module-level and get rid of the Bob and Joe classes and objects entirely. However, then you'd be writing
joebob.slaughter_everything()
joebob.stop_bob()
It's much clearer what's going on if you can say
evil_overlord_bob.slaughter_everything()
good_guy_joe.stop_bob()
even if you'll never need to instantiate Bob's equally-evil twin brother greg_the_fleshripper.
Among other benefits, using classes lets you use introspection on the instances (for example, checking their type or listing their attributes), which plain module-level functions do not give you in the same way.
More generally, both approaches are "pythonic". Which one to use really depends on the type of project (small/big, with/without GUI, ...).
I am writing some code that is an upside-down triangle of inheritance. I have a base Linux class that has a CLIENT attribute, which holds a connection. I have several APIs that are logically separated (kvm, yum, gdb, dhcp, etc.) that use CLIENT, but I only want the user to need to create a single instance of the Linux class and still be able to call all the methods from the parent classes, while maintaining the nice logical code separation among the parents:
class Linux(
        SSHClient,
        yum.Yum,
        kvm.Kvm,
        ldap.Ldap,
        lshw.Lshw,
        packet_capture.Tcpdump,
        tc.TrafficControl,
        networking.Networking,
        gdb.Gdb,
        dhcp.Dhcp,
        httputil.Http,
        scp.Scp,
        fileutils.FileUtils):
I made a little example:
class Dad(object):
    def __init__(self):
        raise NotImplementedError("Create Baby instead")
    def dadCallBaby(self):
        print('sup {}'.format(self.babyName))
class Mom(object):
    def __init__(self):
        raise NotImplementedError("Create Baby instead")
    def momCallBaby(self):
        print('goochi goo {}'.format(self.babyName))
class Baby(Mom, Dad):
    def __init__(self, name):
        self.babyName = name
    def greeting(self):
        self.momCallBaby()
        self.dadCallBaby()
x = Baby('Joe')
x.greeting()
What is this pattern called? Is this duck typing? And is there a better option?
There's really no such thing as "child-only attributes".
The attribute babyName is just stored in each object's namespace, and looked up there. Python doesn't care that it happened to be stored by Baby.__init__. In fact, you can store the same attribute on a Mom that isn't a Baby and it will work the same way:
class NotABaby(Mom):
    def __init__(self): pass
mom = NotABaby()
mom.babyName = 'Me?'
mom.momCallBaby()
Also, it's hard to suggest a better way to do what you're doing, because what you're doing is inherently confusing and probably shouldn't be done.
Inheritance normally means subtyping; that is, Baby should only be a subclass of Mom if every Baby instance is usable as a Mom. [1]
But a baby is not a mom and a dad. [2] A baby has a mom and a dad. And the way to represent that is by giving Baby attributes for its mom and dad:
class Baby(object):
    def __init__(self, mom, dad, name):
        self.mom, self.dad, self.name = mom, dad, name
    def greeting(self):
        self.mom.momCallBaby(self.name)
        self.dad.dadCallBaby(self.name)
Notice that, e.g., this means that the same woman can be the mom of two babies. Since that's also true of the real-life thing you're modeling here, that's a sign that you're modeling things correctly.
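Here is a fuller sketch of that composition-based design, in Python 3 syntax. The parent names ('Ann', 'Bob') and the choice to return strings instead of printing are assumptions made only so the example is self-contained and easy to check:

```python
class Mom(object):
    def __init__(self, name):
        self.name = name

    def momCallBaby(self, baby_name):
        return 'goochi goo {}'.format(baby_name)


class Dad(object):
    def __init__(self, name):
        self.name = name

    def dadCallBaby(self, baby_name):
        return 'sup {}'.format(baby_name)


class Baby(object):
    def __init__(self, mom, dad, name):
        # A baby *has* a mom and a dad; it does not inherit from them.
        self.mom, self.dad, self.name = mom, dad, name

    def greeting(self):
        return [self.mom.momCallBaby(self.name),
                self.dad.dadCallBaby(self.name)]


mom = Mom('Ann')  # hypothetical names
dad = Dad('Bob')
joe = Baby(mom, dad, 'Joe')
sue = Baby(mom, dad, 'Sue')  # the same parents can have two babies
print(joe.greeting())
print(sue.greeting())
```

Mom and Dad are now ordinary, instantiable classes, so there is no need for an __init__ that raises NotImplementedError.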
Your "real" example is a little less clear, but I suspect the same thing is going on there.
The only reason you want to use inheritance, as far as I can tell, is:
I only want the user to need to create a single instance of Linux class
You don't need, or want, inheritance for that:
class Linux(object):
    def __init__(self):
        self.ssh_client = SSHClient()
        self.yum = yum.Yum()
        # etc.
… but be able to call all the methods from the Parent classes
If yum.Yum, ldap.Ldap and dhcp.Dhcp all have methods named lookup, which one would be called by Linux.lookup?
What you probably want is to just leave the attributes as public attributes, and use them explicitly:
system = Linux()
print(system.yum.lookup(package))
print(system.ldap.lookup(name))
print(system.dhcp.lookup(reservation))
Or you'll want to provide a "Linux API" that wraps all the underlying APIs:
def lookup_package(self, package):
    return self.yum.lookup(package)
def lookup_ldap_name(self, name):
    return self.ldap.lookup(name)
def lookup_reservation(self, reservation):
    return self.dhcp.lookup(reservation)
If you really do want to just forward every method of all of your different components, and you're sure that none of them conflict with each other, and there are way too many to write out manually, you can always do it programmatically, by iterating all of the classes, iterating inspect.getmembers of each one, filtering out the ones that start with _ or aren't unbound methods, creating a proxy function, and setattr-ing it onto Linux.
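A minimal sketch of that programmatic forwarding, in Python 3 syntax. The Yum and Dhcp classes and their methods are hypothetical stand-ins for the real components. Python 3 has no unbound methods, so the filter here is inspect.isfunction instead:

```python
import inspect


class Yum(object):
    """Hypothetical stand-in for yum.Yum."""
    def install(self, package):
        return "yum install {}".format(package)


class Dhcp(object):
    """Hypothetical stand-in for dhcp.Dhcp."""
    def reserve(self, address):
        return "dhcp reserve {}".format(address)


class Linux(object):
    def __init__(self):
        self.yum = Yum()
        self.dhcp = Dhcp()


def _make_proxy(attr_name, method_name):
    # Build a method that forwards to the named component at call time.
    def proxy(self, *args, **kwargs):
        component = getattr(self, attr_name)
        return getattr(component, method_name)(*args, **kwargs)
    proxy.__name__ = method_name
    return proxy


for attr_name, cls in [("yum", Yum), ("dhcp", Dhcp)]:
    for method_name, member in inspect.getmembers(cls, inspect.isfunction):
        if method_name.startswith("_"):
            continue  # skip private and special methods
        if hasattr(Linux, method_name):
            # Refuse to silently pick a winner when two components clash.
            raise AttributeError("name conflict: " + method_name)
        setattr(Linux, method_name, _make_proxy(attr_name, method_name))


system = Linux()
print(system.install("vim"))
print(system.reserve("10.0.0.5"))
```

The explicit conflict check is a design choice: it turns the "which lookup wins?" ambiguity into an error at import time.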
Or, alternatively (probably not as good an idea in this case, but very commonly useful in cases that aren't that different), you can proxy dynamically, at method lookup time, by implementing a __getattr__ method (and, often, a __dir__ method).
I think one of these two kinds of proxying may be what you're really after here.
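The dynamic variant could be sketched as follows, in Python 3 syntax. The component classes and the _components attribute are hypothetical names chosen for illustration:

```python
class Yum(object):
    """Hypothetical stand-in for yum.Yum."""
    def install(self, package):
        return "yum install {}".format(package)


class Dhcp(object):
    """Hypothetical stand-in for dhcp.Dhcp."""
    def reserve(self, address):
        return "dhcp reserve {}".format(address)


class Linux(object):
    def __init__(self):
        self._components = [Yum(), Dhcp()]

    def __getattr__(self, name):
        # __getattr__ runs only when normal attribute lookup has failed.
        if name.startswith("_"):
            raise AttributeError(name)  # never proxy private names
        for component in self._components:
            if hasattr(component, name):
                return getattr(component, name)
        raise AttributeError(name)

    def __dir__(self):
        # Advertise the proxied public names so dir() and tab completion see them.
        names = set(super(Linux, self).__dir__())
        for component in self._components:
            names.update(n for n in dir(component) if not n.startswith("_"))
        return sorted(names)


system = Linux()
print(system.install("vim"))
print("reserve" in dir(system))
```

Here the first component that has the requested name wins, so name conflicts are resolved silently by list order; that is one reason the static approach may be preferable in this case.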
1. There are some cases where you want to inherit for reasons other than subtyping. For example, you inherit a mixin class to get implementations for a bunch of methods. The question of whether your class is usable wherever that mixin's instances are usable doesn't really make sense, because the mixin isn't usable anywhere (except as a base class). But the subtyping is still the standard that you're bending there.
2. If it is, call Child Protective Services. And also call Professor X, because that shouldn't be physically possible.
Python is supposed to be fun, simple and easy to learn.
Instead, it's been a huge pain.
I've discovered that all the errors I'm getting are related to me not declaring each variable global in each function.
So for my toy program of dressUp, I have to write:
hatColor = ""
shirtColor = ""
pantsColor = ""
def pickWardrobe(hat, shirt, pants):
    global hatColor
    global shirtColor
    global pantsColor
    ...
This gets really annoying when I have 20 functions, and each one needs to have 20 global declarations at the beginning.
Is there any way to avoid this?
Thanks!
ADDED
I am getting tons of "UnboundLocalError: local variable X referenced before assignment" errors.
Why am I doing this? Because I need to write a py file that can do some calculations for me. I don't want it all in the same function, or it gets messy and I can't reuse code. But if I split the work among a few functions, I have to declare these annoying globals over and over.
Classes versus global variables
A global is common to all code in the module.
A class is a template for an object that represents something; here it could be a person dressed up in some way.
A class might have class properties. These are less commonly used, as they are shared by all instances (a sort of "global for the class").
Classes come to life when you instantiate them: the pattern described by the class definition is realized in the form of a unique object.
Such an object, called an instance, might have its own properties, which are not shared with other instances.
I sometimes think of a class as a can: the class definition says "a can is something you can put things into", and an instance is a real, tangible can with its own name. In Python, I put property values into it, and they are bound to that particular can.
DressUp class with real instance properties
The properties in the "holmeswatson" solution are bound to the class definition. You would run into problems if you used multiple instances of DressUp, because they would share those properties through the class definition.
It is better and safer to use instance variables, which are bound through self to an instance of the class rather than to the class definition.
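To make the difference concrete, here is a small sketch in Python 3 syntax. The two wardrobe classes are hypothetical and exist only to contrast a class-level attribute with an instance-level one:

```python
class SharedWardrobe(object):
    colors = []  # class attribute: one list shared by every instance

    def add(self, color):
        self.colors.append(color)


class OwnWardrobe(object):
    def __init__(self):
        self.colors = []  # instance attribute: a fresh list per instance

    def add(self, color):
        self.colors.append(color)


tom, jane = SharedWardrobe(), SharedWardrobe()
tom.add("red")
print(jane.colors)   # ['red']: Jane sees Tom's choice

tom2, jane2 = OwnWardrobe(), OwnWardrobe()
tom2.add("red")
print(jane2.colors)  # []: Jane's wardrobe is untouched
```

The sharing is most visible with mutable values such as lists, or when code assigns to the class itself rather than through self.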
Modified code:
class DressUp:
    def __init__(self, name, hatColor="", shirtColor=""):
        self.name = name
        self.hatColor = hatColor
        self.shirtColor = shirtColor
    def pickWardrobe(self, hat, shirt):
        self.hatColor = hat
        self.shirtColor = shirt
    def __repr__(self):
        name = self.name
        hatColor = self.hatColor
        shirtColor = self.shirtColor
        templ = "<Person:{name}: hat:{hatColor}, shirt:{shirtColor}>"
        return templ.format(name=name, hatColor=hatColor, shirtColor=shirtColor)
tom = DressUp("Tom")
tom.pickWardrobe("red", "yellow")
print "tom's hat is", tom.hatColor
print "simple print:", tom
print "__repr__ call:", tom.__repr__()
jane = DressUp("Jane")
jane.pickWardrobe("pink", "pink")
print "jane's hat is", jane.hatColor
print "simple print:", jane
print "__repr__ call:", jane.__repr__()
The __repr__ method is what gets called when you run print tom or print jane, because the class defines no __str__ method.
It is used here to show how an instance method can access instance properties.
Is there any way around it? Yes, there are several. If you're using global variables on a regular basis, you're making a mistake in your design.
One common pattern when you have many functions that will operate on the same, related data is to create a class and then declare instances of that class. Each instance has its own set of data and methods, and the methods within that instance can operate on the data within that instance.
This is called object-oriented programming; it is a common and basic paradigm in modern programming.
Several respondents have sketched out what a class might look like in your case but I don't think you've given enough information (which would include the method signatures of the other functions) to actually write out what you need. If you post more information you might get some better examples.
If it is appropriate, you could use classes.
class DressUp:
    def __init__(self, name):
        self.name = name
    def pickWardrobe(self, hat, shirt, pants):
        self.hatColor = hat
        self.shirtColor = shirt
        self.pantsColor = pants
obj1 = DressUp("Tom")
obj1.pickWardrobe("red", "yellow", "blue")
print obj1.hatColor
Have a look:
http://www.tutorialspoint.com/python/python_classes_objects.htm
I want to create a class with two methods at this point (I also want to be able to alter the class, obviously).
class ogrGeo(object):
    def __init__(self):
        pass
    def CreateLine(self, o_file, xy):
        # lots of code
        pass
    def CreatePoint(self, o_file, xy):
        # lots of the same code as CreateLine(),
        # only minor differences
        pass
To keep things clean and repeat as little code as possible, I'm asking for some advice. The two methods CreateLine() and CreatePoint() share a lot of code. To reduce redundancy:
Should I define a third method that both methods can call?
In this case you could still call
o = ogrGeo()
o.CreateLine(...)
o.CreatePoint(...)
separately.
Or should I merge them into one method? Is there another solution I haven't thought of or don't know about?
Thanks already for any suggestions.
Whether you should merge the methods into one is a matter of API design. If the functions have different purposes, keep them separate. I would merge them if client code is likely to follow the pattern
if some_condition:
    o.CreateLine(f, xy)
else:
    o.CreatePoint(f, xy)
But otherwise, don't merge. Instead, refactor the common code into a private method, or even a freestanding function if the common code does not touch self. Python has no notion of "private method" built into the language, but names with a leading _ will be recognized as such.
It's perfectly normal to factor out common code into a (private) helper method:
class ogrGeo(object):
    def __init__(self):
        pass
    def CreateLine(self, o_file, xy):
        # lots of code
        value = self._utility_method(xy)
    def CreatePoint(self, o_file, xy):
        # lots of the same code as CreateLine(),
        # only minor differences
        value = self._utility_method(xy)
    def _utility_method(self, xy):
        # Common code here, computing value
        return value
The method could return a value, or it could directly manipulate the attributes on self.
A word of advice: read the Python style guide and stick to its conventions. Most other Python projects do, and it'll make your code easier to comprehend for other Python developers if you do.
For the pieces of code that will overlap, consider whether those can be their own separate functions as well. Then CreateLine would be comprised of several calls to certain functions, with parameter choices that make sense for CreateLine, meanwhile CreatePoint would be several function calls with appropriate parameters for creating a point.
Even if those new auxiliary functions aren't going to be used elsewhere, it's better to modularize them as separate functions than to copy/paste code. But if the auxiliary functions needed to create these structures are pretty specific, then why not break them out into their own classes?
You could make an "Object" class that involves all of the basics for creating objects, and then have "Line" and "Point" classes which derive from "Object". Within those classes, override the necessary functions so that the construction is specific, relying on auxiliary functions in the base "Object" class for the portions of code that overlap.
Then the ogrGeo class will construct instances of these other classes. Even if the ultimate consumer of "Line" or "Shape" doesn't need a full blown class object, you can still use this design, and give ogrGeo the ability to return the sub-pieces of a Line instance or a Point instance that the consumer does wish to use.
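One way this class layout might look, as a sketch in Python 3 syntax. GeoObject, build_geometry, the dictionary "record" and the file name are all hypothetical; no real OGR calls are made, and plain Python data stands in for the shared code:

```python
class GeoObject(object):
    """Hypothetical base class holding the code shared by all geometry types."""
    kind = None  # set by subclasses

    def create(self, o_file, xy):
        # Shared steps live here; only build_geometry differs per subclass.
        record = {"file": o_file, "kind": self.kind}
        record["geometry"] = self.build_geometry(xy)
        return record

    def build_geometry(self, xy):
        raise NotImplementedError


class Line(GeoObject):
    kind = "line"

    def build_geometry(self, xy):
        return list(xy)  # a line keeps every vertex


class Point(GeoObject):
    kind = "point"

    def build_geometry(self, xy):
        return xy[0]  # a point uses only the first coordinate pair


class ogrGeo(object):
    def CreateLine(self, o_file, xy):
        return Line().create(o_file, xy)

    def CreatePoint(self, o_file, xy):
        return Point().create(o_file, xy)


o = ogrGeo()
line = o.CreateLine("out.shp", [(0, 0), (1, 1)])
point = o.CreatePoint("out.shp", [(2, 3)])
print(line)
print(point)
```

The public API of ogrGeo is unchanged; the duplicated code now lives once, in GeoObject.create.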
It hardly matters. You want the class methods to be as usable as possible for the calling programs, and it's slightly easier and more efficient to have two methods than to have a single method with an additional parameter for the type of object to be created:
def CreateObj(self, obj, o_file, xy):  # obj = 0 for Point, 1 for Line, ...
Recommendation: use separate API calls and factor the common code into method(s) that can be called within your class.
You could also go in the other direction, especially if the following is the case:
def methA/B(...):
    lots of common code
    small difference
    lots of common code
then you could do
def _common(..., callback):
    lots of common code
    callback()
    lots of common code
def methA(...):
    def _mypart(): do what A does
    _common(..., _mypart)
def methB(...):
    def _mypart(): do what B does
    _common(..., _mypart)
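A runnable version of that callback pattern might look like this, in Python 3 syntax. The filtering step, the result dictionary, and the use of sum and max as the differing parts are assumptions chosen only to make the sketch concrete:

```python
def _common(values, callback):
    # Shared setup: drop missing entries.
    cleaned = [v for v in values if v is not None]
    # The one step that differs between callers.
    result = callback(cleaned)
    # Shared wrap-up: package the result the same way for everyone.
    return {"count": len(cleaned), "result": result}


def meth_a(values):
    return _common(values, sum)


def meth_b(values):
    return _common(values, max)


print(meth_a([1, None, 2, 3]))
print(meth_b([1, None, 2, 3]))
```

Any callable works as the callback, so a nested function such as _mypart is only needed when the differing part has to capture local state.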
I've been striving mightily for three days to wrap my head around __init__ and "self", starting at Learn Python the Hard Way exercise 42, and moving on to read parts of the Python documentation, Alan Gauld's chapter on Object-Oriented Programming, Stack threads like this one on "self", and this one, and frankly, I'm getting ready to hit myself in the face with a brick until I pass out.
That being said, I've noticed a really common convention in initial __init__ definitions, which is to follow up with (self, foo) and then immediately declare, within that definition, that self.foo = foo.
From LPTHW, ex42:
class Game(object):
    def __init__(self, start):
        self.quips = ["a list", "of phrases", "here"]
        self.start = start
From Alan Gauld:
def __init__(self,val): self.val = val
I'm in that horrible space where I can see that there's just One Big Thing I'm not getting, and it's remaining opaque no matter how much I read about it and try to figure it out. Maybe if somebody can explain this little bit of consistency to me, the light will turn on. Is this because we need to say that "foo," the variable, will always be equal to the (foo) parameter, which is itself contained in the "self" parameter that's automatically assigned to the def it's attached to?
You might want to study up on object-oriented programming.
Loosely speaking, when you say
class Game(object):
    def __init__(self, start):
        self.start = start
you're saying:
I have a type of "thing" named Game
Whenever a new Game is created, it will ask me for an extra piece of information, start. (This is because the Game's initializer, named __init__, asks for this information.)
The initializer (also referred to as the "constructor", although that's a slight misnomer) needs to know which object (which was created just a moment ago) it's initializing. That's the first parameter -- which is usually called self by convention (but which you could call anything else...).
The game probably needs to remember what the start I gave it was. So it stores this information "inside" itself, by creating an instance variable also named start (nothing special, it's just whatever name you want), and assigning the value of the start parameter to the start variable.
If it doesn't store the value of the parameter, it won't have that information available for later use.
Hope this explains what's happening.
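To see the difference that self.start = start makes, here is a small sketch in Python 3 syntax. The Forgetful class, the describe method and the room names are hypothetical additions for illustration:

```python
class Forgetful(object):
    def __init__(self, start):
        pass  # start is only a local name; it vanishes when __init__ returns


class Game(object):
    def __init__(self, start):
        self.start = start  # keep the value on this particular instance

    def describe(self):
        # Other methods can reach the stored value through self.
        return "This game starts at {}".format(self.start)


a = Game("the lobby")
b = Game("the cellar")
print(a.describe())
print(b.describe())
print(hasattr(Forgetful("the lobby"), "start"))
```

Each Game instance remembers its own start, while a Forgetful instance has no start attribute at all.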
I'm not quite sure what you're missing, so let me hit some basic items.
There are two "special" initialization names in a Python class object, one that is relatively rare for users to worry about, called __new__, and one that is much more usual, called __init__.
When you invoke a class-object constructor, e.g. (based on your example) x = Game(args), this first calls Game.__new__ to obtain memory in which to hold the object, and then Game.__init__ to fill in that memory. Most of the time, you can allow the underlying object.__new__ to allocate the memory, and you just need to fill it in. (You can use your own allocator for special weird rare cases like objects that never change and may share identities, the way ordinary integers do for instance. It's also for "metaclasses" that do weird stuff. But that's all a topic for much later.)
Your Game.__init__ function is called with "all the arguments to the constructor" plus one stashed in the front, which is the memory allocated for that object itself. (For "ordinary" objects that's mostly a dictionary of "attributes", plus the magic glue for classes, but for objects with __slots__ the attributes dictionary is omitted.) Naming that first argument self is just a convention—but don't violate it, people will hate you if you do. :-)
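The order of those two calls can be observed with a short sketch (Python 3 syntax; the calls list exists purely for tracing and is not something you would normally write):

```python
calls = []  # records which special method ran, in order


class Game(object):
    def __new__(cls, *args, **kwargs):
        calls.append("__new__")
        # Let object.__new__ allocate the instance, as it normally would.
        return super(Game, cls).__new__(cls)

    def __init__(self, start):
        calls.append("__init__")
        self.start = start  # self is the object that __new__ just returned


x = Game("the lobby")
print(calls)
print(x.start)
```

In everyday code you would leave __new__ out entirely and write only __init__.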
There's nothing that requires you to save all the arguments to the constructor. You can set any or all instance attributes you like:
class Weird(object):
    def __init__(self, required_arg1, required_arg2, optional_arg3='spam'):
        self.irrelevant = False
    def __str__(self):
        ...
The thing is that a Weird() instance is pretty useless after initialization, because you're required to pass two arguments that are simply thrown away, and given a third optional argument that is also thrown away:
x = Weird(42, 0.0, 'maybe')
The only point in requiring those thrown-away arguments is for future expansion, as it were (you might have these unused fields during early development). So if you're not immediately using and/or saving arguments to __init__, something is definitely weird in Weird.
Incidentally, the only reason for using (object) in the class definition is to indicate to Python 2.x that this is a "new-style" class (as distinguished from very-old-Python "instance only" classes). But it's generally best to use it—it makes what I said above about object.__new__ true, for instance :-) —until Python 3, where the old-style stuff is gone entirely.
Parameter names should be meaningful, to convey the role they play in the function/method or some information about their content.
You can see parameters of constructors to be even more important because they are often required for the working of the new instance and contain information which is needed in other methods of the class as well.
Imagine you have a Game class which accepts a playerList.
class Game:
    def __init__(self, playerList):
        self.playerList = playerList  # or self.players = playerList
    def printPlayerList(self):
        print self.playerList  # or print self.players
This list is needed in various methods of the class. Hence it makes sense to assign it to self.playerList. You could also assign it to self.players, whatever you feel more comfortable with and you think is understandable. But if you don't assign it to self.<somename> it won't be accessible in other methods.
So there is nothing special about how to name parameters/attributes/etc (there are some special class methods though), but using meaningful names makes the code easier to understand. Or would you understand the meaning of the above class if you had:
class G:
    def __init__(self, x):
        self.y = x
    def ppl(self):
        print self.y
? :) It does exactly the same but is harder to understand...
I'm fairly new to object oriented programming so some of the abstraction ideas are a little blurry to me. I'm writing an interpreter for an old game language. Part of this has made me need to implement custom types from said language and place them on a stack to be manipulated as needed.
Now, I can put a string on a list. I can put a number on a list, and I've even found I can put symbols on a list. But I'm a bit fuzzy on how I would put a custom object instance on a list when I can't just drop it into a variable (since, after all, I don't know how many there will be and can't go about defining them by hand while the code is running :)
I've made a class for one of the simplest data types: a DBREF. The DBREF just contains a database reference number. I can't just use an integer, string, dictionary, etc., because there are type-checking mechanisms in the language I have to implement and that would confuse matters, since those are already used elsewhere as their closest analogues.
Here is my code and my reasoning behind it:
class dbref:
    dbnumber=0
    def __init__(self, number):
        global number
        dbnumber=number
    def getdbref:
        global number
        return number
I create a class named dbref. All it does (for now) is take a number and store it in a variable. My hope is that if I were to do:
examplelist=[ dbref(5) ]
That the dbref object would be on the stack. Is that possible? Further, will I be able to do:
if typeof(examplelist[0]) is dbref:
    print "It's a DBREF."
else:
    print "Nope."
...or am I misunderstanding how Python classes work? Also, is my class definition wonky in any way?
If you used...
class dbref:
    dbnumber=0
that would share the same number among all instances of the class, because dbnumber would be a class attribute, rather than an instance attribute. Try this instead:
class dbref(object):
    def __init__(self, number):
        self.dbnumber = number
    def getdbref(self):
        return self.dbnumber
self is a reference to the object instance itself that's automatically passed by Python when you call one of the instance's methods.
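As for the other parts of the question: instances can be placed in a list like any other value, and the built-in isinstance (Python has no typeof) performs the type check. A sketch in Python 3 syntax, where the stack contents are made up for illustration:

```python
class dbref(object):
    def __init__(self, number):
        self.dbnumber = number

    def getdbref(self):
        return self.dbnumber


# A stack holding a mix of plain values and dbref instances.
stack = [dbref(5), 42, "a string", dbref(7)]

for item in stack:
    if isinstance(item, dbref):
        print("It's a DBREF: #{}".format(item.getdbref()))
    else:
        print("Nope.")

# Collect the reference numbers of every dbref on the stack.
found = [item.getdbref() for item in stack if isinstance(item, dbref)]
print(found)
```

Because 42 is a plain int rather than a dbref, the interpreter's type checks can tell the two apart even though both carry a number.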