Best of two ways to declare a class variable in Python - python

The way I usually declare a class variable to be used in instances in Python is the following:
class MyClass(object):
    def __init__(self):
        self.a_member = 0

my_object = MyClass()
my_object.a_member  # evaluates to 0
But the following also works. Is it bad practice? If so, why?
class MyClass(object):
    a_member = 0

my_object = MyClass()
my_object.a_member  # also evaluates to 0
The second method is used all over Zope, but I haven't seen it anywhere else. Why is that?
Edit: in response to sr2222's answer. I understand that the two are essentially different. However, if the class is only ever used to instantiate objects, the two will work the same way. So is it bad to use a class variable as an instance variable? It feels like it would be, but I can't explain why.

The question is whether this is an attribute of the class itself or of a particular object. If the whole class of things has a certain attribute (possibly with minor exceptions), then by all means, assign an attribute onto the class. If some strange objects, or subclasses differ in this attribute, they can override it as necessary. Also, this is more memory-efficient than assigning an essentially constant attribute onto every object; only the class's __dict__ has a single entry for that attribute, and the __dict__ of each object may remain empty (at least for that particular attribute).
In short, both of your examples are quite idiomatic code, but they mean somewhat different things, both at the machine level, and at the human semantic level.
Let me explain this:
>>> class MyClass(object):
...     a_member = 'a'
...
>>> o = MyClass()
>>> p = MyClass()
>>> o.a_member
'a'
>>> p.a_member
'a'
>>> o.a_member = 'b'
>>> p.a_member
'a'
On line two, you're setting a "class attribute". This is literally an attribute of the object named "MyClass". It is stored as MyClass.__dict__['a_member'] = 'a'. On later lines, you're setting the object attribute o.a_member to 'b'. This is completely equivalent to o.__dict__['a_member'] = 'b'. You can see that this has nothing to do with the separate dictionary p.__dict__. When accessing a_member of p, it is not found in the object dictionary, so the lookup is deferred to its class dictionary: MyClass.a_member. This is why modifying the attributes of o does not affect the attributes of p: it doesn't affect the attributes of MyClass.
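The dictionaries described above can be inspected directly with vars(). This is only a small sketch of the same session written as a script; the names before and after are mine, not part of the original example.

```python
class MyClass(object):
    a_member = 'a'          # stored in MyClass.__dict__

o = MyClass()
p = MyClass()

before = dict(vars(o))      # snapshot of o.__dict__ before assignment
o.a_member = 'b'            # creates an entry in o.__dict__ only
after = dict(vars(o))       # snapshot of o.__dict__ after assignment

print(before)
print(after)
print(vars(p))                          # p still has no entry of its own
print(MyClass.__dict__['a_member'])     # the class attribute is untouched
```

The instance dictionary of p stays empty, so p.a_member keeps falling back to the class.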

The first is an instance attribute, the second a class attribute. They are not the same at all. An instance attribute is attached to an actual created object of the type whereas the class variable is attached to the class (the type) itself.
>>> class A(object):
...     cls_attr = 'a'
...     def __init__(self, x):
...         self.ins_attr = x
...
>>> a1 = A(1)
>>> a2 = A(2)
>>> a1.cls_attr
'a'
>>> a2.cls_attr
'a'
>>> a1.ins_attr
1
>>> a2.ins_attr
2
>>> a1.__class__.cls_attr = 'b'
>>> a2.cls_attr
'b'
>>> a1.ins_attr = 3
>>> a2.ins_attr
2

Even if you are never modifying the objects' contents, the two are not interchangeable. The way I understand it, accessing class attributes is slightly slower than accessing instance attributes, because the interpreter essentially has to take an extra step to look up the class attribute.
Instance attribute
"What's a.thing?"
Class attribute
"What's a.thing? Oh, a has no instance attribute thing, I'll check its class..."

I have my answer! I owe it to @mjgpy3's reference in a comment on the original post. The difference shows up when the value assigned to the class variable is MUTABLE. Then a change made in place through one instance is visible through all of them. The members split only when a new value replaces the old one.
>>> class MyClass(object):
...     my_str = 'a'
...     my_list = []
...
>>> a1, a2 = MyClass(), MyClass()
>>> a1.my_str # This is the CLASS variable.
'a'
>>> a2.my_str # This is the exact same class variable.
'a'
>>> a1.my_str = 'b' # This is a completely new instance variable. Strings are not mutable.
>>> a2.my_str # This is still the old, unchanged class variable.
'a'
>>> a1.my_list.append('w') # We're changing the mutable class variable, but not reassigning it.
>>> a2.my_list # This is the same old class variable, but with a new value.
['w']
Edit: this is pretty much what bukzor wrote. They get the best answer mark.
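For comparison, here is a minimal sketch of the usual way to avoid the shared list: create the mutable value inside __init__ so that each instance gets its own. The class names Shared and PerInstance are hypothetical and only serve the illustration.

```python
class Shared(object):
    items = []               # one list object, reachable from every instance


class PerInstance(object):
    def __init__(self):
        self.items = []      # a fresh list for each instance


s1, s2 = Shared(), Shared()
s1.items.append('w')         # mutates the single class-level list

p1, p2 = PerInstance(), PerInstance()
p1.items.append('w')         # mutates only p1's own list

print(s2.items)
print(p2.items)
```

Whether sharing is a bug or a feature depends on intent; the point is only that the two spellings behave differently once the value is mutated in place.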

Related

Python: understanding class and instance variables

I think I have some misconception about class and instance variables. Here is an example code:
class Animal(object):
    energy = 10
    skills = []

    def work(self):
        print 'I do something'
        self.energy -= 1

    def new_skill(self, skill):
        self.skills.append(skill)

if __name__ == '__main__':
    a1 = Animal()
    a2 = Animal()
    a1.work()
    print a1.energy  # result:9
    print a2.energy  # result:10
    a1.new_skill('bark')
    a2.new_skill('sleep')
    print a1.skills  # result:['bark', 'sleep']
    print a2.skills  # result:['bark', 'sleep']
I thought that energy and skills were class variables, because I declared them outside of any method. I modify their values inside the methods in the same way (through self, which may be incorrect?). But the results show that energy takes different values for each object (like an instance variable), while skills seems to be shared (like a class variable). I think I've missed something important...
The trick here is in understanding what self.energy -= 1 does. It's really two expressions; one getting the value of self.energy - 1, and one assigning that back to self.energy.
But the thing that's confusing you is that the references are not interpreted the same way on both sides of that assignment. When Python is told to get self.energy, it tries to find that attribute on the instance, fails, and falls back to the class attribute. However, when it assigns to self.energy, it will always assign to an instance attribute, even though that hadn't previously existed.
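Spelled out by hand, the read and the write look roughly like the two steps below. This is a sketch for an int attribute only; the name tmp is mine, and the class is reduced to the one attribute that matters here.

```python
class Animal(object):
    energy = 10


a1 = Animal()
a2 = Animal()

# a1.energy -= 1 behaves roughly like these two steps:
tmp = a1.energy - 1      # read: not found on a1, falls back to Animal.energy
a1.energy = tmp          # write: always lands in a1's own __dict__

print(a1.energy, a2.energy, Animal.energy)
print(vars(a1), vars(a2))
```

After the write, a1 has its own energy entry while a2 still has none and keeps reading the class attribute.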
You are running into initialization issues based around mutability.
First, the fix. skills and energy are class attributes.
It is a good practice to consider them as read only, as initial values for instance attributes. The classic way to build your class is:
class Animal(object):
    energy = 10
    skills = []
    def __init__(self, en=energy, sk=None):
        self.energy = en
        self.skills = [] if sk is None else sk
    ....
Then each instance will have its own attributes, and the sharing problem disappears.
Second, what's happening with this code?
Why is skills shared, when energy is per-instance?
The -= operator is subtle: it modifies the object in place when the type supports that, and otherwise produces a new object and rebinds the name. The difference here is that lists are mutable, so in-place modification occurs:
In [6]:
b=[]
print(b,id(b))
b+=['strong']
print(b,id(b))
[] 201781512
['strong'] 201781512
So a1.skills and a2.skills are the same list, which is also accessible as Animal.skills. But energy is an immutable int, so in-place modification is impossible. In this case a new int object is created and assigned on the instance, so each instance that has called work() manages its own energy value:
In [7]:
a=10
print(a,id(a))
a-=1
print(a,id(a))
10 1360251232
9 1360251200
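One corner case follows from the two snippets above and is worth trying yourself: augmented assignment on a list attribute reached through an instance. As far as I understand the semantics, the list is extended in place and the result is then assigned back on the instance, so the instance ends up with its own entry that still points at the shared list. A sketch, with the class cut down to the skills attribute:

```python
class Animal(object):
    skills = []


a1, a2 = Animal(), Animal()

# Step 1: list += extends the shared class-level list in place.
# Step 2: the (same) list is then bound as an instance attribute of a1.
a1.skills += ['bark']

print(Animal.skills)                 # the class-level list was mutated
print(a2.skills)                     # a2 sees the change through the class
print('skills' in vars(a1))          # a1 now has its own entry...
print(a1.skills is Animal.skills)    # ...which is still the very same list
```

This is one more reason to treat mutable class attributes as read-only defaults.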
Upon initial creation both attributes are the same object:
>>> a1 = Animal()
>>> a2 = Animal()
>>> a1.energy is a2.energy
True
>>> a1.skills is a2.skills
True
>>> a1 is a2
False
When you assign to the attribute through an instance, a new attribute local to that instance is created, and the class attribute is left alone:
>>> id(a1.energy)
31346816
>>> id(a2.energy)
31346816
>>> a1.work()
I do something
>>> id(a1.energy)
31346840 # id changes as attribute is made local to instance
>>> id(a2.energy)
31346816
The new_skill() method does not assign a new value to the skills list; rather, it calls append, which modifies the list in place.
If you were to assign a new list to a1.skills manually, then the skills list would become local to the instance:
>>> id(a1.skills)
140668681481032
>>> a1.skills = ['sit', 'jump']
>>> id(a1.skills)
140668681617704
>>> id(a2.skills)
140668681481032
>>> a1.skills
['sit', 'jump']
>>> a2.skills
['bark', 'sleep']
Finally, if you were to delete the instance attribute a1.skills, the reference would revert back to the class attribute:
>>> a1.skills
['sit', 'jump']
>>> del a1.skills
>>> a1.skills
['bark', 'sleep']
>>> id(a1.skills)
140668681481032
Access the class variables through the class, not through self:
class Animal(object):
    energy = 10
    skills = []

    def work(self):
        print 'I do something'
        self.__class__.energy -= 1

    def new_skill(self, skill):
        self.__class__.skills.append(skill)
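A short sketch of what this buys you, and of one catch with subclasses. I use type(self), which is equivalent to self.__class__ for ordinary classes; the Dog subclass is hypothetical and only there to show the catch.

```python
class Animal(object):
    energy = 10

    def work(self):
        # Rebinds the attribute on the class of self, not on self.
        type(self).energy -= 1


class Dog(Animal):           # hypothetical subclass for illustration
    pass


a1, a2 = Animal(), Animal()
a1.work()                    # Animal.energy becomes 9 for every Animal

d = Dog()
d.work()                     # reads 9 via inheritance, then writes Dog.energy

print(Animal.energy, a2.energy, Dog.energy)
print(vars(a1), 'energy' in vars(Dog))
```

If the counter is meant to be shared across the whole hierarchy, naming the base class explicitly (Animal.energy -= 1) avoids the subclass getting its own copy.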
Actually, in your code
a1.work()
print a1.energy
print a2.energy
when you call a1.work(), an instance variable named 'energy' is created on the a1 object.
When the interpreter reaches 'print a1.energy', it reads the instance variable of a1.
When the interpreter reaches 'print a2.energy', it reads the class variable, and since you have not changed the value of the class variable, it prints 10.

python - class instance variable inheritance and class variable inheritance

In the code below, the Iter class inherits from the Parser class: class Iter(Parser).
Is it unnecessary to define Iter-specific variables that duplicate the Parser class variables?
That is, self.totalEntriesI merely receives the value of the Parser class variable totalEntries (written Parser.totalEntries in the code) so that work can be done with that value.
Is this necessary, or could I achieve the same thing without it?
class Iter(Parser):
    def __init__(self, Parser):
        self.totalEntriesI = Parser.totalEntries
        self.perPageI = Parser.perPage
        self.currentPageI = Parser.currentPage
Correct, it's unnecessary. The class attributes ("variables") of Parser are also available on its subclass Iter.
If you assign them to instance attributes as shown, then each Iter instance will get its own copy of the values -- useful if you need to modify them later on a per-instance basis, but otherwise a waste of space and attention :)
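As a sketch of what this means for the question's classes: the attribute values below and the pages() method are made up for illustration, since the original Parser is not shown. The subclass simply reads the inherited class attributes through self, with no copying in __init__.

```python
class Parser(object):
    # Hypothetical values; the real Parser from the question is not shown.
    totalEntries = 120
    perPage = 20
    currentPage = 1


class Iter(Parser):
    def pages(self):
        # Inherited class attributes are reachable through self.
        # Ceiling division without importing math.
        return -(-self.totalEntries // self.perPage)


it = Iter()
print(it.pages())
print(vars(it))     # nothing was copied onto the instance
```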
A subtlety to be aware of: if you subsequently assign a value to one of these attributes via the subclass Iter, then Iter will get its own copy of the attribute. For example:
>>> class A(): my_attr = 'foo'
>>> class B(A): pass
As you'd expect,
>>> A.my_attr == B.my_attr == 'foo'
True
However, observe:
>>> B.my_attr = 'bar'
>>> B.my_attr
'bar'
>>> A.my_attr
'foo'

Is there a reason to prefer list or tuple for __slots__?

You can define __slots__ in new-style python classes using either list or tuple (or perhaps any iterable?). The type persists after instances are created.
Given that tuples are always a little more efficient than lists and are immutable, is there any reason why you would not want to use a tuple for __slots__?
>>> class foo(object):
...     __slots__ = ('a',)
...
>>> class foo2(object):
...     __slots__ = ['a']
...
>>> foo().__slots__
('a',)
>>> foo2().__slots__
['a']
First, tuples aren't any more efficient than lists; they both support the exact same fast iteration mechanism from C API code, and use the same code for both indexing and iterating from Python.
More importantly, the __slots__ mechanism doesn't actually use the __slots__ member except during construction. This may not be that clearly explained by the documentation, but if you read all of the bullet points carefully enough the information is there.
And really, it has to be true. Otherwise, this wouldn't work:
class Foo(object):
    __slots__ = (x for x in ['a', 'b', 'c'] if x != 'b')
… and, worse, this would:
slots = ['a', 'b', 'c']
class Foo(object):
    __slots__ = slots
foo = Foo()
slots.append('d')
foo.d = 4
For further proof:
>>> a = ['a', 'b']
>>> class Foo(object):
...     __slots__ = a
...
>>> del Foo.__slots__
>>> foo = Foo()
>>> foo.d = 3
AttributeError: 'Foo' object has no attribute 'd'
>>> foo.__dict__
AttributeError: 'Foo' object has no attribute '__dict__'
>>> foo.__slots__
AttributeError: 'Foo' object has no attribute '__slots__'
So, that __slots__ member in Foo is really only there for documentation and introspection purposes. Which means there is no performance issue, or behavior issue, just a stylistic one.
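To see where the slots actually live after class creation, the following sketch can be run under CPython (the descriptor type name is a CPython detail, so treat it as an assumption on other implementations). Variable names such as late_slot_worked are mine.

```python
class Foo(object):
    __slots__ = ['a', 'b']


foo = Foo()
foo.a = 1

# Mutating the list after the class exists does not add a slot.
Foo.__slots__.append('c')
try:
    foo.c = 3
    late_slot_worked = True
except AttributeError:
    late_slot_worked = False

# Each declared name became a descriptor stored on the class itself.
descriptor_type = type(Foo.__dict__['a']).__name__

print(late_slot_worked)
print(descriptor_type)
print(hasattr(foo, '__dict__'))
```

The per-name descriptors, not the __slots__ object, are what attribute access goes through, which is why the container type does not matter at runtime.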
According to the Python docs..
This class variable can be assigned a string, iterable, or sequence of
strings with variable names used by instances.
So, you can define it using any iterable. Which one you use is up to you, but in terms of which to "prefer", I would use a list.
First, let's look at what would be the preferred choice if performance were not an issue, which would mean it would be the same decision you would make between lists and tuples in all Python code. I would say a list, and the reason is that a tuple is designed to have semantic structure: it should semantically mean something that you stored an element as the first item rather than the second. For example, if you stored the first value of an (X,Y) coordinate tuple (the X) as the second item, you just completely changed the semantic value of the structure. If you rearrange the names of the attributes in the __slots__ list, you haven't semantically changed anything. Therefore, in this case, you should use a list.
Now, about performance. First, this is probably premature optimization. I don't know about the performance difference between lists and tuples, but I would guess there isn't any. But even assuming there is, it would really only come into play if the __slots__ variable is accessed many times.
I haven't actually looked at the code for when __slots__ is accessed, but I ran the following test..
print('Defining slotter..')
class Slotter(object):
    def __iter__(self):
        print('Looking for slots')
        yield 'A'
        yield 'B'
        yield 'C'

print('Defining Mine..')
class Mine(object):
    __slots__ = Slotter()

print('Creating first mine...')
m1 = Mine()
m1.A = 1
m1.B = 2
print('Creating second mine...')
m2 = Mine()
m2.A = 1
m2.C = 2
Basically, I use a custom class so that I can see exactly when the slots variable is actually iterated. You'll see that it is done exactly once, when the class is defined.
Defining slotter..
Defining Mine..
Looking for slots
Creating first mine...
Creating second mine...
Unless there is a case that I'm missing where the __slots__ variable is iterated again, I think that the performance difference can be declared negligible at worst.

Please help me understand Python object instantiation.

I am not a programmer and I am trying to learn Python at the moment, but I am a little confused about object instantiation. I think of a class as a template, with an object being made (instantiated) from that template. Doesn't that mean that once an object is created (e.g. classinst1 = MyClass()), a change to the template shouldn't affect what's in the object?
In addition, the code below shows that I can change the class variable "common" and see the change through an object, but only if I haven't assigned a new value to "common" on that object. If I assign a new value to "common" on my object (say classinst1.common = 99), then changing the class variable "common" no longer affects the value of classinst1.common.
Can someone please clarify why the code below behaves this way? Is it common to all OO languages or just one of the quirky aspects of Python?
===============
>>> class MyClass(object):
...     common = 10
...     def __init__(self):
...         self.myvar=3
...     def myfunction(self,arg1,arg2):
...         return self.myvar
...
>>> classinst1 = MyClass()
>>> classinst1.myfunction(1,2)
3
>>> classinst2 = MyClass()
>>> classinst2.common
10
>>> classinst1.common
10
>>> MyClass.common = 50
>>> classinst1.common
50
>>> classinst2.common
50
>>> classinst1.common = 99
>>> classinst2.common
50
>>> classinst1.common
99
>>> MyClass.common = 7000
>>> classinst1.common
99
>>> classinst2.common
7000
You have the general idea of class declaration and instantiation right. But the reason why the output in your example doesn't seem to make sense is that there are actually two variables called common. The first is the class variable declared and instantiated at the top of your code, in the class declaration. This is the only common for most of your example.
When you execute this line:
classinst1.common = 99
you are creating an object variable, a member of classinst1. Since this has the same name as a class variable, it shadows or hides MyClass.common. All further references to classinst1.common now refer to that object variable, while all references to classinst2.common continue to fall back to MyClass.common since there is no object variable called common that is a member of classinst2.
So when you execute:
MyClass.common = 7000
this changes MyClass.common but classinst1.common remains equal to 99. And in the final lines of your example when you ask the interpreter for the values of classinst1.common and classinst2.common, the former refers to the classinst1 object member variable common while the latter refers to the class variable MyClass.common.
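To check which of the two variables a given name resolves to, a small helper along these lines can be handy. where_is is a hypothetical function of mine, not a standard one, and it deliberately ignores descriptors and __slots__, so take it as a rough sketch for plain attributes.

```python
def where_is(obj, name):
    """Report whether a plain attribute lives on the instance or on a class."""
    if name in vars(obj):
        return 'instance'
    for klass in type(obj).__mro__:      # the class, then its bases
        if name in vars(klass):
            return 'class ' + klass.__name__
    return 'missing'


class MyClass(object):
    common = 10


classinst1 = MyClass()
classinst2 = MyClass()
classinst1.common = 99       # creates the shadowing instance variable
MyClass.common = 7000        # changes only the class variable

print(where_is(classinst1, 'common'), classinst1.common)
print(where_is(classinst2, 'common'), classinst2.common)
```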

Confusion about Python class and instance variables

I saw the following Python documentation which says that "define variables in a Class" will be class variables:
"Programmer's note: Variables defined in the class definition are
class variables; they are shared by all instances. "
but as I wrote sample code like this:
class CustomizedMethods(object):
    class_var1 = 'foo'
    class_var2 = 'bar'

cm1 = CustomizedMethods()
cm2 = CustomizedMethods()
print cm1.class_var1, cm1.class_var2  # 'foo bar'
print cm2.class_var1, cm2.class_var2  # 'foo bar'
cm2.class_var1, cm2.class_var2 = 'bar', 'for'
print cm1.class_var1, cm1.class_var2  # 'foo bar'; not changed, contrary to my expectation
print cm2.class_var1, cm2.class_var2  # 'bar for'; changed, but they seem to have become instance variables
I'm confused since what I tried is different from Python's official documentation.
When you assign an attribute on the instance, it is assigned on the instance, even if it previously existed on the class. At first, class_var1 and class_var2 are indeed class attributes. But when you do cm2.class_var1 = "bar", you are not changing this class attribute. Rather, you are creating a new attribute, also called class_var1, but this one is an instance attribute on the instance cm2.
Here is another example showing the difference, although it still may be a bit tough to grasp:
>>> class A(object):
...     var = []
...
>>> a = A()
>>> a.var is A.var
True
>>> a.var = []
>>> a.var is A.var
False
At first, a.var is A.var is true (i.e., they are the same object): since a doesn't have its own attribute called var, trying to access that goes through to the class. After you give a its own instance attribute, it is no longer the same as the one on the class.
You're assigning attributes on the instances, so yes, they become instance variables at that point. Python looks for attributes on whatever object you specify, then if it can't find them there, looks up the inheritance chain (to the class, the class's parents, etc.). So the attribute you assign on the instance "shadows" or "hides" the class's attribute of the same name.
Strings are immutable, so the difference between a class and instance variable isn't as noticeable. For immutable variables in a class definition, the main thing to notice is less use of memory (i.e., if you have 1,000 instances of CustomizedMethods, there's still only one instance of the string "foo" stored in memory.)
However, using mutable variables in a class can introduce subtle bugs if you don't know what you're doing.
Consider:
class CustomizedMethods(object):
    class_var = {}

cm1 = CustomizedMethods()
cm2 = CustomizedMethods()
cm1.class_var['test'] = 'foo'
print cm2.class_var
{'test': 'foo'}
cm2.class_var['test'] = 'bar'
print cm1.class_var
{'test': 'bar'}
When you reassigned the cm2 variables, you created new instance variables that "hid" the class variables.
>>> CustomizedMethods.class_var1 = 'one'
>>> CustomizedMethods.class_var2 = 'two'
>>> print cm1.class_var1, cm1.class_var2
one two
>>> print cm2.class_var1, cm2.class_var2
bar for
Try
print cm1.__dict__
print cm2.__dict__
it will be enlightening...
When you ask cm2 for an attribute, it first looks among the attributes of the instance; if no attribute there matches the name, it then looks among the class attributes.
So class_var1 and class_var2 are the names of the class attributes.
Try also the following:
cm2.__class__.class_var1 = "bar_foo"
print cm1.class_var1
what do you expect?
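If you want to check your prediction, here is the experiment as a self-contained script. It repeats the question's setup first, under the assumption that cm2 has already been given its own instance attributes as in the original code.

```python
class CustomizedMethods(object):
    class_var1 = 'foo'
    class_var2 = 'bar'


cm1 = CustomizedMethods()
cm2 = CustomizedMethods()
cm2.class_var1, cm2.class_var2 = 'bar', 'for'   # instance attributes on cm2

# Assigning through the class changes the class attribute itself.
cm2.__class__.class_var1 = 'bar_foo'

print(cm1.class_var1)    # cm1 has no instance attribute, so it reads the class
print(cm2.class_var1)    # cm2's own attribute still shadows the class one
print(vars(cm1), vars(cm2))
```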
