How should I choose between using instance vs. class attributes? - python

I tried this example code:
class testclass:
    classvar = 'its classvariable LITERAL'
    def __init__(self,x,y):
        self.z = x
        self.classvar = 'its initvariable LITERAL'
        self.test()
    def test(self):
        print('class var',testclass.classvar)
        print('instance var',self.classvar)
if __name__ == '__main__':
    x = testclass(2,3)
I need some clarification. In both cases, I'm able to access the class attribute and the instance attribute in the test method.
So, suppose I have to define a literal that needs to be used across all methods. Which would be the better way to define it: an instance attribute or a class attribute?

I found this in an old presentation made by Guido van Rossum in 1999 ( http://legacy.python.org/doc/essays/ppt/acm-ws/sld001.htm ) and I think it explains the topic beautifully:
Instance variable rules
On use via instance (self.x), search order:
(1) instance, (2) class, (3) base classes
this also works for method lookup
On assignment via instance (self.x = ...):
always makes an instance variable
Class variables "default" for instance variables
But...!
mutable class variable: one copy shared by all
mutable instance variable: each instance its own
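The rules on that slide can be checked directly. This is only a minimal sketch; the class and attribute names (Base, Widget, colour, size) are made up for illustration:

```python
class Base:
    colour = "red"          # found last: lives on a base class


class Widget(Base):
    size = 1                # class variable, acts as a default for instances


w1 = Widget()
w2 = Widget()

# Lookup via the instance falls through instance -> class -> base classes.
lookup_before = (w1.size, w1.colour)

# Assignment via the instance always creates an instance variable.
w1.size = 5
instance_dict = dict(vars(w1))   # what w1 itself now stores

# The class variable, and every other instance, is untouched.
after = (w1.size, w2.size, Widget.size)
```

Here vars(w1) is empty before the assignment and holds only size afterwards, which is the "class variables default for instance variables" rule in action.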

Class variables are quite good for "constants" used by all the instances (that's all methods are technically). You could use module globals, but using a class variable makes it more clearly associated with the class.
There are often uses for class variables that you actually change, too, but it's usually best to stay away from them for the same reason you stay away from having different parts of your program communicate by altering global variables.
Instance variables are for data that is actually part of the instance. They could be different for each particular instance, and they often change over the lifetime of a single particular instance. It's best to use instance variables for data that is conceptually part of an instance, even if in your program you happen to only have one instance, or you have a few instances that in practice always have the same value.

It's good practice to only use class attributes if they are going to remain fixed, and one great thing about them is that they can be accessed outside of an instance:
class MyClass():
    var1 = 1
    def __init__(self):
        self.var2 = 2
MyClass.var1 # 1 (you can reference var1 without instantiating)
MyClass.var2 # AttributeError: type object 'MyClass' has no attribute 'var2'
If MyClass.var1 is defined, it should be the same in every instance of MyClass; otherwise you get the following behaviour, which is considered confusing.
a = MyClass()
b = MyClass()
a.var1, a.var2 # (1,2)
a.var1, a.var2 = (3,4) # you can change these variables
a.var1, a.var2 # (3,4)
b.var1, b.var2 # (1,2) # but they don't change in b
MyClass.var1 # 1 (nor in MyClass)

You should define it as a class attribute if you want it to be shared among all instances. You should define it as an instance variable if you want a separate one for each instance (e.g., if different instances might have different values for the variable).

Related

Properties seem to set to the same value for all objects (Python) [duplicate]

What is the difference between class and instance variables in Python?
class Complex:
    a = 1
and
class Complex:
    def __init__(self):
        self.a = 1
Using the call: x = Complex().a in both cases assigns x to 1.
A more in-depth answer about __init__() and self will be appreciated.
When you write a class block, you create class attributes (or class variables). All the names you assign in the class block, including methods you define with def, become class attributes.
After a class instance is created, anything with a reference to the instance can create instance attributes on it. Inside methods, the "current" instance is almost always bound to the name self, which is why you are thinking of these as "self variables". Usually in object-oriented design, the code attached to a class is supposed to have control over the attributes of instances of that class, so almost all instance attribute assignment is done inside methods, using the reference to the instance received in the self parameter of the method.
Class attributes are often compared to static variables (or methods) as found in languages like Java, C#, or C++. However, if you want to aim for deeper understanding I would avoid thinking of class attributes as "the same" as static variables. While they are often used for the same purposes, the underlying concept is quite different. More on this in the "advanced" section below the line.
An example!
class SomeClass:
    def __init__(self):
        self.foo = 'I am an instance attribute called foo'
        self.foo_list = []
    bar = 'I am a class attribute called bar'
    bar_list = []
After executing this block, there is a class SomeClass, with 3 class attributes: __init__, bar, and bar_list.
Then we'll create an instance:
instance = SomeClass()
When this happens, SomeClass's __init__ method is executed, receiving the new instance in its self parameter. This method creates two instance attributes: foo and foo_list. Then this instance is assigned into the instance variable, so it's bound to a thing with those two instance attributes: foo and foo_list.
But:
print(instance.bar)
gives:
I am a class attribute called bar
How did this happen? When we try to retrieve an attribute through the dot syntax, and the attribute doesn't exist, Python goes through a bunch of steps to try and fulfill your request anyway. The next thing it will try is to look at the class attributes of the class of your instance. In this case, it found an attribute bar in SomeClass, so it returned that.
That's also how method calls work by the way. When you call mylist.append(5), for example, mylist doesn't have an attribute named append. But the class of mylist does, and it's bound to a method object. That method object is returned by the mylist.append bit, and then the (5) bit calls the method with the argument 5.
The way this is useful is that all instances of SomeClass will have access to the same bar attribute. We could create a million instances, but we only need to store that one string in memory, because they can all find it.
But you have to be a bit careful. Have a look at the following operations:
sc1 = SomeClass()
sc1.foo_list.append(1)
sc1.bar_list.append(2)
sc2 = SomeClass()
sc2.foo_list.append(10)
sc2.bar_list.append(20)
print(sc1.foo_list)
print(sc1.bar_list)
print(sc2.foo_list)
print(sc2.bar_list)
What do you think this prints?
[1]
[2, 20]
[10]
[2, 20]
This is because each instance has its own copy of foo_list, so they were appended to separately. But all instances share access to the same bar_list. So when we did sc1.bar_list.append(2) it affected sc2, even though sc2 didn't exist yet! And likewise sc2.bar_list.append(20) affected the bar_list retrieved through sc1. This is often not what you want.
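If you want to convince yourself that there really is only one bar_list, you can compare object identities and look at where each name is stored. A small sketch, reusing the attribute names from the example above:

```python
class SomeClass:
    bar_list = []                 # one list object, stored on the class

    def __init__(self):
        self.foo_list = []        # a fresh list for every instance


sc1 = SomeClass()
sc2 = SomeClass()

shared = sc1.bar_list is sc2.bar_list      # same object reached via both instances
separate = sc1.foo_list is sc2.foo_list    # two different list objects

# bar_list is not stored on the instances at all, only on the class.
on_instance = "bar_list" in vars(sc1)
on_class = "bar_list" in vars(SomeClass)
```

The usual way to avoid the surprise is the one foo_list already uses: create the mutable object inside __init__ so each instance gets its own.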
Advanced study follows. :)
To really grok Python, coming from traditional statically typed OO-languages like Java and C#, you have to learn to rethink classes a little bit.
In Java, a class isn't really a thing in its own right. When you write a class you're more declaring a bunch of things that all instances of that class have in common. At runtime, there's only instances (and static methods/variables, but those are really just global variables and functions in a namespace associated with a class, nothing to do with OO really). Classes are the way you write down in your source code what the instances will be like at runtime; they only "exist" in your source code, not in the running program.
In Python, a class is nothing special. It's an object just like anything else. So "class attributes" are in fact exactly the same thing as "instance attributes"; in reality there's just "attributes". The only reason for drawing a distinction is that we tend to use objects which are classes differently from objects which are not classes. The underlying machinery is all the same. This is why I say it would be a mistake to think of class attributes as static variables from other languages.
But the thing that really makes Python classes different from Java-style classes is that just like any other object each class is an instance of some class!
In Python, most classes are instances of a builtin class called type. It is this class that controls the common behaviour of classes, and makes all the OO stuff the way it does. The default OO way of having instances of classes that have their own attributes, and have common methods/attributes defined by their class, is just a protocol in Python. You can change most aspects of it if you want. If you've ever heard of using a metaclass, all that is is defining a class that is an instance of a different class than type.
The only really "special" thing about classes (aside from all the builtin machinery to make them work the way they do by default), is the class block syntax, to make it easier for you to create instances of type. This:
class Foo(BaseFoo):
    def __init__(self, foo):
        self.foo = foo
    z = 28
is roughly equivalent to the following:
def __init__(self, foo):
    self.foo = foo
classdict = {'__init__': __init__, 'z': 28 }
Foo = type('Foo', (BaseFoo,), classdict)
And it will arrange for all the contents of classdict to become attributes of the object that gets created.
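Here is a version of that equivalence you can actually run. BaseFoo is not defined anywhere in the answer, so the empty class below is just a placeholder assumption:

```python
class BaseFoo:
    """Placeholder base class; assumed here only so the example runs."""


def __init__(self, foo):
    self.foo = foo


# type(name, bases, namespace) builds a class object at runtime.
Foo = type("Foo", (BaseFoo,), {"__init__": __init__, "z": 28})

f = Foo("hello")
```

After this, Foo behaves like a class written with the class statement: f.foo is set by __init__, and z is reachable both as Foo.z and as f.z.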
So then it becomes almost trivial to see that you can access a class attribute by Class.attribute just as easily as i = Class(); i.attribute. Both i and Class are objects, and objects have attributes. This also makes it easy to understand how you can modify a class after it's been created; just assign its attributes the same way you would with any other object!
In fact, instances have no particular special relationship with the class used to create them. The way Python knows which class to search for attributes that aren't found in the instance is by the hidden __class__ attribute. Which you can read to find out what class this is an instance of, just as with any other attribute: c = some_instance.__class__. Now you have a variable c bound to a class, even though it probably doesn't have the same name as the class. You can use this to access class attributes, or even call it to create more instances of it (even though you don't know what class it is!).
And you can even assign to i.__class__ to change what class it is an instance of! If you do this, nothing in particular happens immediately. It's not earth-shattering. All that it means is that when you look up attributes that don't exist in the instance, Python will go look at the new contents of __class__. Since that includes most methods, and methods usually expect the instance they're operating on to be in certain states, this usually results in errors if you do it at random, and it's very confusing, but it can be done. If you're very careful, the thing you store in __class__ doesn't even have to be a class object; all Python's going to do with it is look up attributes under certain circumstances, so all you need is an object that has the right kind of attributes (some caveats aside where Python does get picky about things being classes or instances of a particular class).
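A tiny demonstration of both points, reading __class__ and reassigning it. Dog and Cat are hypothetical classes invented for this sketch, and this is something to try in a shell rather than a pattern to use in real code:

```python
class Dog:
    def speak(self):
        return "Woof"


class Cat:
    def speak(self):
        return "Meow"


pet = Dog()
c = pet.__class__            # the class object, read like any other attribute
another = c()                # call it to create another instance

before = pet.speak()
pet.__class__ = Cat          # re-point the instance at a different class
after = pet.speak()          # methods are now looked up on Cat
```

Nothing about pet's own attributes changes; only the place where Python continues the lookup does.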
That's probably enough for now. Hopefully (if you've even read this far) I haven't confused you too much. Python is neat when you learn how it works. :)
What you're calling an "instance" variable isn't actually an instance variable; it's a class variable. See the language reference about classes.
In your example, a appears to be an instance variable because it is immutable. Its nature as a class variable can be seen when you assign a mutable object:
>>> class Complex:
...     a = []
...
>>> b = Complex()
>>> c = Complex()
>>>
>>> # What do they look like?
>>> b.a
[]
>>> c.a
[]
>>>
>>> # Change b...
>>> b.a.append('Hello')
>>> b.a
['Hello']
>>> # What does c look like?
>>> c.a
['Hello']
If you used self, then it would be a true instance variable, and thus each instance would have its own unique a. An object's __init__ function is called when a new instance is created, and self is a reference to that instance.

python diff instance method [duplicate]

From what I understand, each instance of a class stores references to the instance's methods.
I thought, in concept, all instances of a class have the same instance methods. If so, both memory savings and logical clarity seem to suggest that instance methods should be stored in the class object rather than the instance object (with the instance object looking them up through the class object; of course, each instance has a reference to its class). Why is this not done?
A secondary question. Why are instance methods not accessible in a way similar to instance attributes, i.e., through __dict__, or through some other system attribute? Is there any way to look at (and perhaps change) the names and the references to instance methods?
EDIT:
Oops, sorry. I was totally wrong. I saw the following Python 2 code, and incorrectly concluded from it that instance methods are stored in the instances. I am not sure what it does, since I don't use Python 2, and new is gone from Python 3.
import new
class X(object):
    def f(self):
        print 'f'
a = X()
b = X()
def g(self):
    print 'g'
# I thought this modified instance method just in a, not in b
X.f = new.instancemethod(g, a, X)
Attribute lookup on objects in Python is non-trivial. But instance methods are certainly not stored on the instance object!
The default behavior for attribute access is to get, set, or delete the attribute from an object’s dictionary. For instance, a.x has a lookup chain starting with a.__dict__['x'], then type(a).__dict__['x'], and continuing through the base classes of type(a) excluding metaclasses.
(docs)
Note that it is possible to store a function on an instance. But that's not an instance method! When the interpreter looks up an attribute and finds that it is (a) a function and (b) on the class object, it automatically wraps it in a bound method object which passes self.
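You can watch this happen in Python 3. The sketch below reuses the X and f names from the question:

```python
class X:
    def f(self):
        return "f"


a = X()

in_instance = "f" in vars(a)     # nothing is stored on the instance
raw = vars(X)["f"]               # a plain function stored on the class
bound = a.f                      # lookup through the instance wraps it

# The bound method remembers both the function and the instance.
same_function = bound.__func__ is raw
bound_to = bound.__self__ is a
```

Calling bound() is then equivalent to calling raw(a), which is how self gets filled in.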
Is there any way to look at (and perhaps change) the names and the references to instance methods?
Well, you can certainly modify the class object after defining it. But I assume what you mean is "can you make the x method of a particular instance do something different?"
This being Python, the answer is "yes": just define a.x to be some new function. Then you will get that function back before looking on the class.
This may cause you a lot of confusion when you're trying to understand the code, though!
From what I understand, each instance of a class stores references to the instance's methods.
I don't know where you got this from, but it's wrong. They don't.
Why are instance methods not accessible in a way similar to instance attributes, i.e., through __dict__, or through some other system attribute?
Well, because they are not stored on the instance.
Is there any way to look at (and perhaps change) the names and the references to instance methods?
Since these references don't exist, you cannot change them. You can of course create any attribute you want by normal assignments, but note that functions stored on the instance are not treated like ordinary methods -- the mechanism that implicitly passes the self parameter does not apply for them.
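To illustrate that last point: a function assigned to an instance is handed back unchanged, so calling it fails for lack of self. If you really want per-instance behaviour, one option (an addition here, not something the answer prescribes) is to bind the function yourself with types.MethodType. Greeter and custom are made-up names:

```python
import types


class Greeter:
    def greet(self):
        return "hello from the class"


def custom(self):
    return "hello from " + type(self).__name__


g = Greeter()

# A plain function stored on the instance is returned as-is: no self is passed.
g.greet = custom
try:
    g.greet()
    error = None
except TypeError as exc:
    error = exc

# Binding it explicitly restores method behaviour, for this instance only.
g.greet = types.MethodType(custom, g)
bound_result = g.greet()
other_result = Greeter().greet()
```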
Incorrect. Instances do not store references to each method.
For example:
class Foo():
    def bar(self):
        print('bar')
f = Foo()
def alternate_bar(self):
    print('alternate bar')
f.bar()
Foo.bar = alternate_bar # modifies the class!
f.bar()
prints
bar
alternate bar
This is also why you provide a self to each method you define in a class. Without a reference to self, the method has no idea which instance it is working on.
Another example
class Point:
    def __init__(self, xcoord, ycoord):
        self.x = xcoord
        self.y = ycoord
    def draw(self):
        print(self.x, " ", self.y)
p = Point(205.12, 305.21)
# draw the coordinates of the point instance
p.draw()
# now define a new point drawing function vdraw()
def vdraw(q):
    print("[", q.x, ",", q.y, "]")
# p.draw()
# now reassign the draw() method to vdraw()
Point.draw = vdraw
# now print the coordinates of the point instance
print(p.x)
print(p.y)
# now draw the coordinates of the point instance
p.draw()

Python weird class variables usage

Suppose we have the following code:
class A:
    var = 0
a = A()
I do understand that a.var and A.var are different variables, and I think I understand why this thing happens. I thought it was just a side effect of python's data model, since why would someone want to modify a class variable in an instance?
However, today I came across a strange example of such a usage: it is in the Google App Engine db.Model reference. The Google App Engine datastore assumes we inherit from the db.Model class and introduce keys as class variables:
class Story(db.Model):
    title = db.StringProperty()
    body = db.TextProperty()
    created = db.DateTimeProperty(auto_now_add=True)
s = Story(title="The Three Little Pigs")
I don't understand why they expect me to do it like that. Why not introduce a constructor and use only instance variables?
The db.Model class is a 'Model' style class in classic Model View Controller design pattern.
Each of the assignments in there are actually setting up columns in the database, while also giving an easy to use interface for you to program with. This is why
title="The Three Little Pigs"
will update the object as well as the column in the database.
There is a constructor (no doubt in db.Model) that handles this pass-off logic, and it will take a keyword args list and digest it to create this relational model.
This is why the variables are setup the way they are, so that relation is maintained.
Edit: Let me describe that better. A normal class just sets up the blueprint for an object. It has instance variables and class variables. Because of the inheritance from db.Model, this is actually doing a third thing: setting up column definitions in a database. In order to do this third task, it is making EXTENSIVE behind-the-scenes changes to things like attribute setting and getting. Pretty much once you inherit from db.Model you aren't really a class anymore, but a DB template. Long story short, this is a VERY specific edge case of the use of a class.
If all variables are declared as instance variables then the classes using Story class as superclass will inherit nothing from it.
From the Model and Property docs, it looks like Model has overridden __getattr__ and __setattr__ methods so that, in effect, "Story.title = ..." does not actually set the instance attribute; instead it sets the value stored with the instance's Property.
If you ask for story.__dict__['title'], what does it give you?
I do understand that a.var and A.var are different variables
First off: as of now, no, they aren't.
In Python, everything you declare inside the class block belongs to the class. You can look up attributes of the class via the instance, if the instance doesn't already have something with that name. When you assign to an attribute of an instance, the instance now has that attribute, regardless of whether it had one before. (__init__, in this regard, is just another function; it's called automatically by Python's machinery, but it simply adds attributes to an object, it doesn't magically specify some kind of template for the contents of all instances of the class - there's the magic __slots__ class attribute for that, but it still doesn't do quite what you might expect.)
But right now, a has no .var of its own, so a.var refers to A.var. And you can modify a class attribute via an instance - but note modify, not replace. This requires, of course, that the original value of the attribute is something modifiable - a list qualifies, a str doesn't.
Your GAE example, though, is something totally different. The class Story has attributes which specifically are "properties", which can do assorted magic when you "assign to" them. This works by using the class' __getattr__, __setattr__ etc. methods to change the behaviour of the assignment syntax.
The other answers have it mostly right, but miss one critical thing.
If you define a class like this:
class Foo(object):
    a = 5
and an instance:
myinstance = Foo()
Then Foo.a and myinstance.a are the very same variable. Changing one will change the other, and if you create multiple instances of Foo, the .a property on each will be the same variable. This is because of the way Python resolves attribute access: First it looks in the object's dict, and if it doesn't find it there, it looks in the class's dict, and so forth.
That also helps explain why assignments don't work the way you'd expect given the shared nature of the variable:
>>> bar = Foo()
>>> baz = Foo()
>>> Foo.a = 6
>>> bar.a = 7
>>> bar.a
7
>>> baz.a
6
What happened here is that when we assigned to Foo.a, it modified the variable that all instances of Foo normally resolve when you ask for instance.a. But when we assigned to bar.a, Python created a new variable on that instance called a, which now masks the class variable - from now on, that particular instance will always see its own local value.
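The masking can also be undone, which makes it clearer that the class variable was never replaced. A short sketch using the same Foo, bar and baz names:

```python
class Foo:
    a = 5


bar = Foo()
baz = Foo()

bar.a = 7                    # creates an instance attribute that masks Foo.a
masked = (bar.a, baz.a, Foo.a)

del bar.a                    # removes only the instance attribute
unmasked = bar.a             # lookup falls through to the class again
```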
If you wanted each instance of your class to have a separate variable initialized to 5, the normal way to do it would be like this:
class Foo(object):
    def __init__(self):
        self.a = 5
That is, you define a class with a constructor that sets the a variable on the new instance to 5.
Finally, what App Engine is doing is an entirely different kind of black magic called descriptors. In short, Python allows objects to define special __get__ and __set__ methods. When an instance of a class that defines these special methods is attached to a class, and you create an instance of that class, attempts to access the attribute will call the special __get__ and __set__ methods instead of setting or returning the instance or class variable. A much more comprehensive introduction to descriptors can be found here, but here's a simple demo:
class MultiplyDescriptor(object):
    def __init__(self, multiplicand, initial=0):
        self.multiplicand = multiplicand
        self.value = initial
    def __get__(self, obj, objtype):
        if obj is None:
            return self
        return self.multiplicand * self.value
    def __set__(self, obj, value):
        self.value = value
Now you can do something like this:
class Foo(object):
    a = MultiplyDescriptor(2)
bar = Foo()
bar.a = 10
print(bar.a) # Prints 20!
Descriptors are the secret sauce behind a surprising amount of the Python language. For instance, property is implemented using descriptors, as are methods, static and class methods, and a bunch of other stuff.
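One thing to be aware of in the demo above: the value is stored on the descriptor object itself, and there is only one descriptor per class attribute, so every instance of Foo would see the same value. A possible per-instance variant is sketched below. It relies on __set_name__ (available since Python 3.6), and the "_" + name storage convention is just an assumption of this sketch:

```python
class MultiplyDescriptor:
    def __init__(self, multiplicand, initial=0):
        self.multiplicand = multiplicand
        self.initial = initial

    def __set_name__(self, owner, name):
        # Called when the owning class is created; remember where to keep data.
        self.storage_name = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return self.multiplicand * getattr(obj, self.storage_name, self.initial)

    def __set__(self, obj, value):
        # Store the raw value on the instance, not on the descriptor.
        setattr(obj, self.storage_name, value)


class Foo:
    a = MultiplyDescriptor(2)


bar = Foo()
baz = Foo()
bar.a = 10
```

With this version bar.a gives 20 while baz.a still gives 0, because each instance keeps its own raw value.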
These class variables are metadata that Google App Engine uses to generate its models.
FYI, in your example, a.var == A.var.
>>> class A:
...     var = 0
...
>>> a = A()
>>> A.var = 3
>>> a.var == A.var
True

Python constructors and __init__

Why are constructors indeed called "Constructors"? What is their purpose and how are they different from methods in a class?
Also, can there be more than one __init__ in a class? I tried the following, can someone please explain the result?
>>> class test:
...     def __init__(self):
...         print("init 1")
...     def __init__(self):
...         print("init 2")
...
>>> s=test()
init 2
Finally, is __init__ an operator overloader?
There is no function overloading in Python, meaning that you can't have multiple functions with the same name but different arguments.
In your code example, you're not overloading __init__(). What happens is that the second definition rebinds the name __init__ to the new method, rendering the first method inaccessible.
As to your general question about constructors, Wikipedia is a good starting point. For Python-specific stuff, I highly recommend the Python docs.
Why are constructors indeed called "Constructors" ?
The constructor (named __new__) creates and returns a new instance of the class. So the C.__new__ class method is the constructor for the class C.
The C.__init__ instance method is called on a specific instance, after it is created, to initialise it before being passed back to the caller. So that method is the initialiser for new instances of C.
How are they different from methods in a class?
As stated in the official documentation __init__ is called after the instance is created. Other methods do not receive this treatment.
What is their purpose?
The purpose of the constructor C.__new__ is to define custom behaviour during construction of a new C instance.
The purpose of the initialiser C.__init__ is to define custom initialisation of each instance of C after it is created.
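The order of the two calls is easy to observe. In this sketch the calls list exists only to record what happens, and C is the placeholder class name used above:

```python
calls = []


class C:
    def __new__(cls, *args, **kwargs):
        calls.append("__new__")
        # Create the bare instance; object.__new__ only needs the class here.
        return super().__new__(cls)

    def __init__(self, x):
        calls.append("__init__")
        self.x = x


c = C(10)
```

By the time __init__ runs, the instance already exists; __init__ only fills it in.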
For example Python allows you to do:
class Test(object):
    pass
t = Test()
t.x = 10 # here you're building your object t
print(t.x)
But if you want every instance of Test to have an attribute x equal to 10, you can put that code inside __init__:
class Test(object):
    def __init__(self):
        self.x = 10
t = Test()
print(t.x)
Every instance method (a method called on a specific instance of a class) receives the instance as its first argument. That argument is conventionally named self.
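The constructor __new__ instead receives the class as its first argument. (Strictly speaking, __new__ is a static method that Python special-cases so that the class is passed in, rather than a class method.)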
Now, if you want custom values for the x attribute all you have to do is pass that value as argument to __init__:
class Test(object):
    def __init__(self, x):
        self.x = x
t = Test(10)
print(t.x)
z = Test(20)
print(z.x)
I hope this will help you clear some doubts, and since you've already received good answers to the other questions I will stop here :)
Classes are simply blueprints to create objects from. The constructor is some code that is run every time you create an object. Therefore it doesn't make sense to have two constructors. What happens is that the second overwrites the first.
What you typically use them for is create variables for that object like this:
>>> class testing:
...     def __init__(self, init_value):
...         self.some_value = init_value
So what you could do then is to create an object from this class like this:
>>> testobject = testing(5)
The testobject will then have an attribute called some_value that in this sample will be 5.
>>> testobject.some_value
5
But you don't need to set a value for each object like I did in my sample. You can also do it like this:
>>> class testing:
...     def __init__(self):
...         self.some_value = 5
Then the value of some_value will be 5 and you don't have to set it when you create the object.
>>> testobject = testing()
>>> testobject.some_value
5
The >>> and ... in my sample are not something you type. That's just how it looks in pyshell.
Constructors are called automatically when you create a new object, thereby "constructing" the object. The reason you can define more than one __init__ is that names are just references in Python, and you are allowed to change what each name references whenever you want (hence dynamic typing):
def func(): # now func refers to an empty function
    pass
...
func=5 # now func refers to the number 5
def func():
    print("something") # now func refers to a different function
In your class definition, it just keeps the later one.
There is no notion of method overloading in Python. But you can achieve a similar effect by specifying optional and keyword arguments
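For instance, one __init__ with default values covers several call styles. The classmethod below is a common companion idiom for "alternative constructors"; it goes beyond what the answer says, and from_tuple is a name invented for this sketch:

```python
class Point:
    def __init__(self, x=0.0, y=0.0):
        self.x = x
        self.y = y

    @classmethod
    def from_tuple(cls, pair):
        # A named entry point instead of an overloaded constructor.
        x, y = pair
        return cls(x, y)


origin = Point()                          # no arguments: both defaults used
on_axis = Point(y=2.5)                    # keyword argument: only y given
from_pair = Point.from_tuple((1.0, 3.0))  # alternative constructor
```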

Python subclass inheritance

I am trying to build some classes that inherit from a parent class, which contains subclasses that inherit from other parent classes. But when I change attributes in the subclasses in any children, the change affects all child classes. I am looking to avoid having to create instances, as I am using that feature later.
The code below boils down the problem. The final line shows the unexpected result.
class SubclsParent(object):
    a = "Hello"
class Parent(object):
    class Subcls(SubclsParent):
        pass
class Child1(Parent):
    pass
class Child2(Parent):
    pass
Child1.Subcls.a # Returns "Hello"
Child2.Subcls.a # Returns "Hello"
Child1.Subcls.a = "Goodbye"
Child1.Subcls.a # Returns "Goodbye"
Child2.Subcls.a # Returns "Goodbye" / Should still return "Hello"!
The behaviour you are seeing is exactly what you should expect. When you define a class
>>> class Foo(object): pass
...
you can modify that class -- not instances of it, the class itself -- because the class is just another object, stored in the variable Foo. So, for instance, you can get and set attributes of the class:
>>> Foo.a = 1
>>> Foo.a
1
In other words, the class keyword creates a new type of object and binds the specified name to that object.
Now, if you define a class inside another class (which is a weird thing to do, by the way), that is equivalent to defining a local variable inside the class body. And you know what defining local variables inside the body of a class does: it sets them as class attributes. In other words, variables defined locally are stored on the class object and not on individual instances. Thus,
>>> class Foo(object):
...     class Bar(object): pass
...
defines a class Foo with one class attribute, Bar, which happens itself to be a class. There is no subclassing going on here, though -- the classes Foo and Bar are entirely independent. (The behaviour you have achieved could be replicated as follows:
>>> class Foo(object): pass
...
>>> class Bar(object): pass
...
>>> Foo.Bar = Bar
.)
So you are always modifying the same variable! Of course you will change the values you see; you have changed them yourself!
Your problem seems to be that you are somewhat confused between instance and class attributes, which are not the same thing at all.
A class attribute is a variable which is defined over the whole class. That is, any instance of the class will share the same variable. For instance, most methods are class attributes, since you want to call the same methods on every instance (usually). You can also use class attributes for things like global counters (how many times have you instantiated this class?) and other properties which should be shared amongst instances.
An instance attribute is a variable peculiar to an instance of the class. That is, each instance has a different copy of the variable, with possibly-different contents. This is where you store the data in classes -- if you have a Page class, say, you would like the contents attribute to be stored per-instance, since different Pages will of course need different contents.
In your example, you want Child1.Subcls.a and Child2.Subcls.a to be different variables. Naturally, then, they should depend on the instance!
This may be a bit of a leap of faith, but are you trying to implement Java-style interfaces in Python? In other words, are you trying to specify what properties and methods a class should have, without actually defining those properties?
This used to be considered something of a non-Pythonic thing to do, since the prevailing consensus was that you should allow the classes to do whatever they want and catch the exceptions which arise when they don't define a needed property or method. However, recently people have realised that interfaces are actually sometimes a good thing, and new functionality was added to Python to allow this: abstract base classes.
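In case that is what you are after, here is roughly what an abstract base class looks like with the standard abc module. Shape and Square are hypothetical names, not anything from your code:

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    @abstractmethod
    def area(self):
        """Return the area of the shape."""


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


# The abstract class itself cannot be instantiated.
try:
    Shape()
    error = None
except TypeError as exc:
    error = exc

square_area = Square(3).area()
```

A subclass that forgets to define area would fail at instantiation in the same way, which is the interface-like check.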
Try this
class SubclsParent(object):
    def __init__(self):
        self.a = "Hello"
When you define SubclsParent.a directly on the class, you are defining it as static.
When you use Child1.Subcls, python sees there is no Child1.Subcls, and so checks Parent where it finds it and returns it. The same thing happens for Child2.Subcls. As a result, both of those expressions refer to the same class. Child1 and Child2 do not get their own subclasses of it, rather they have access to the original.
I am looking to avoid having to create instances, as I am using that feature later.
I don't understand what you mean here.
Your problem is that when you access the attributes you are accessing inherited routines that were created in the parent class, which all refer to the same variable. You can either make those instance variables, or else you can create the attributes in the child classes to get independent attributes.
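If you want to stay with classes only (no instances), the second option could look something like this. It is only a sketch of one way to do it: each child defines its own nested Subcls that inherits from the parent's, so assignments land on separate class objects:

```python
class SubclsParent:
    a = "Hello"


class Parent:
    class Subcls(SubclsParent):
        pass


class Child1(Parent):
    # A separate nested class, so Child1 has its own attribute holder.
    class Subcls(Parent.Subcls):
        pass


class Child2(Parent):
    class Subcls(Parent.Subcls):
        pass


# This now sets an attribute on Child1's own nested class only.
Child1.Subcls.a = "Goodbye"
```

Child2.Subcls.a still resolves to "Hello" through inheritance, because nothing was assigned on Child2.Subcls, Parent.Subcls or SubclsParent.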
It may be that what you really want is metaclasses.
