A random class definition:
class ABC:
    x = 6
Setting some values, first for the abc instance, later for the static variable:
abc = ABC()
abc.x = 2
ABC.x = 5
and then print the results:
print(abc.x)
print(ABC.x)
which prints
2
5
Now, I don't really get what is going on, because if I replace x = 6 in the class definition with "pass", it outputs the same thing. My question is: what is the purpose of defining a variable in the class definition in Python if it seems I can set any variable at any time without doing so?
Also, does python know the difference between instance and static variables? From what I saw, I'd say so.
Warning: the following is an oversimplification; I'm ignoring __new__() and a bunch of other special class methods, and handwaving a lot of details. But this explanation will get you pretty far in Python.
When you create an instance of a class in Python, like calling ABC() in your example:
abc = ABC()
Python creates a new empty object and sets its class to ABC. Then it calls the __init__() if there is one. Finally it returns the object.
When you ask for an attribute of an object, Python first looks in the instance. If it doesn't find it there, it looks in the instance's class, then in the base class(es), and so on. If it never finds anybody with the attribute defined, it raises an AttributeError.
When you assign to an attribute of an object, it creates that attribute if the object doesn't already have one. Then it sets the attribute to that value. If the object already had an attribute with that name, it drops the reference to the old value and takes a reference to the new one.
These rules make the behavior you observe easy to predict. After this line:
abc = ABC()
only the ABC object (the class) has an attribute named x. The abc instance doesn't have its own x yet, so if you ask for one you're going to get the value of ABC.x. But then you reassign the attribute x on both the class and the object. And when you subsequently examine those attributes you observe the values you put there are still there.
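The sequence described above can be replayed with vars(), which returns an object's own attribute dictionary. This is a minimal sketch for illustration; it reuses the class from the question and copies each dictionary so that later assignments don't change the recorded snapshot.

```python
class ABC:
    x = 6

abc = ABC()

# The instance has no attributes of its own yet, so abc.x falls back to ABC.x.
before = (dict(vars(abc)), abc.x)

abc.x = 2   # creates an instance attribute that shadows ABC.x
ABC.x = 5   # rebinds the class attribute; the instance attribute is untouched

after = (dict(vars(abc)), abc.x, ABC.x)

print(before)  # ({}, 6)
print(after)   # ({'x': 2}, 2, 5)
```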
Now you should be able to predict what this code does:
class ABC:
    x = 6

a = ABC()
ABC.xyz = 5
print(ABC.xyz, a.xyz)
Yes: it prints two fives. You might have expected it to throw an AttributeError exception. But Python finds the attribute in the class--even though it was added after the instance was created.
This behavior can really get you into trouble. One classic beginner mistake in Python:
class ABC:
    x = []

a = ABC()
a.x.append(1)
b = ABC()
print(b.x)
That will print [1]. All instances of ABC() are sharing the same list. What you probably wanted was this:
class ABC:
    def __init__(self):
        self.x = []

a = ABC()
a.x.append(1)
b = ABC()
print(b.x)
That will print an empty list as you expect.
To answer your exact questions:
My question is: what is the purpose of defining a variable in the class definition in Python if it seems I can set any variable at any time without doing so?
I assume this means "why should I assign members inside the class, instead of inside the __init__ method?"
As a practical matter, this means the instances don't have their own copy of the attribute (or at least not yet). This means the instances are smaller; it also means accessing the attribute is slower. It also means the instances all share the same value for that attribute, which in the case of mutable objects may or may not be what you want. Finally, assignments here mean that the value is an attribute of the class, and that's the most straightforward way to set attributes on the class.
As a purely stylistic matter it's shorter code, as you don't have all those instances of self. all over. Beyond that it doesn't make much difference. However, assigning attributes in the __init__ method ensures they are unambiguously instance variables.
I'm not terribly consistent myself. The only thing I'm sure to do is assign all the mutable objects that I don't want shared in the __init__ method.
Also, does python know the difference between instance and static variables? From what I saw, I'd say so.
Python classes don't have class static variables like C++ does. There are only attributes: attributes of the class object, and attributes of the instance object. And if you ask for an attribute, and the instance doesn't have it, you'll get the attribute from the class.
The closest approximation of a class static variable in Python would be a hidden module attribute, like so:
_x = 3

class ABC:
    def method(self):
        global _x
        # ...
It's not part of the class per se. But this is a common Python idiom.
class SomeClass:
    x = 6  # class variable

    def __init__(self):
        self.y = 666  # instance variable
There is virtue in declaring a class-scoped variable: for one, it serves as a default for instances. Think of a class-scoped variable as you would think of "static" variables in some other languages.
Python makes a distinction between the two. The purpose could be multiple, but one example is this:
class token(object):
    id = 0

    def __init__(self, value):
        self.value = value
        self.id = token.id
        token.id += 1
Here, the class variable token.id is incremented each time a new instance is created, and the instance stores the previous value as its own unique ID in self.id. The two are stored in different places: one in the class object, the other in the instance object. You can indeed compare that to static and instance variables in some OO languages like C++ or C#.
In that example, if you do:
print(token.id)
you will see the next ID to be assigned, whereas:
x = token(10)
print(x.id)
will give the id of that instance.
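The counter idea can be sketched in Python 3 as follows. The names Token and next_id are renamed from the example above purely for readability (an assumption of this sketch, not part of the original answer); the mechanism is the same.

```python
class Token:
    next_id = 0  # class attribute: the ID the next instance will receive

    def __init__(self, value):
        self.value = value
        self.id = Token.next_id   # instance attribute: this object's own ID
        Token.next_id += 1        # update the shared counter on the class

first = Token("a")
second = Token("b")
print(first.id, second.id, Token.next_id)  # 0 1 2
```

Note that the increment is written against the class (Token.next_id). Writing self.next_id += 1 instead would create an instance attribute and leave the shared counter unchanged.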
Anyone can also put other attributes on an instance or on a class, that's right, but that wouldn't be interesting since the class code is not intended to use them. The point of an example like the one above is that the class code uses them.
A class-level variable (called "static" in other languages) is owned by the class, and shared by all instances of the class.
An instance variable belongs to each distinct instance of the class.
However.
You can add a new instance variable any time you want.
So getting abc.x requires first checking for an instance variable. If there is no instance variable, it will try the class variable.
And setting abc.x will create (or replace) an instance variable.
Every object has a __dict__. The class ABC and its instance, abc, are both objects, and so each has their own separate __dict__:
In [3]: class ABC:
   ...:     x=6
Notice ABC.__dict__ has a 'x' key:
In [4]: ABC.__dict__
Out[4]: {'__doc__': None, '__module__': '__main__', 'x': 6}
In [5]: abc=ABC()
In [6]: abc.__dict__
Out[6]: {}
Notice that if 'x' is not in abc.__dict__, then the __dict__ of abc's class, and then those of its base classes, are searched. So abc.x is "inherited" from ABC:
In [14]: abc.x
Out[14]: 6
But if we set abc.x then we are changing abc.__dict__, not ABC.__dict__:
In [7]: abc.x = 2
In [8]: abc.__dict__
Out[8]: {'x': 2}
In [9]: ABC.__dict__
Out[9]: {'__doc__': None, '__module__': '__main__', 'x': 6}
Of course, we can change ABC.__dict__ if we wish:
In [10]: ABC.x = 5
In [11]: ABC.__dict__
Out[11]: {'__doc__': None, '__module__': '__main__', 'x': 5}
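The session above shows Python 2 output. In Python 3 the class __dict__ holds a few more entries and is exposed as a read-only view, so the printed dictionaries look different. The following sketch checks only the 'x' key and uses vars(), which returns the same mapping as __dict__.

```python
class ABC:
    x = 6

abc = ABC()

print("x" in vars(ABC), "x" in vars(abc))   # True False
abc.x = 2
print(vars(abc), vars(ABC)["x"])            # {'x': 2} 6

# The class namespace is a read-only view, so item assignment fails;
# normal attribute assignment on the class is the way to change it.
try:
    vars(ABC)["x"] = 5
except TypeError as exc:
    error = type(exc).__name__
ABC.x = 5
print(error, vars(ABC)["x"])                # TypeError 5
```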
The benefit of a "static" or in Python a "class attribute" is that each instance of the class will have access to the same class attribute. This is not true for instance attributes as you may be aware.
Take for example:
class A(object):
    b = 1

A.b # => 1
inst = A()
inst2 = A()
inst.b # => 1
inst2.b # => 1
A.b = 5
inst.b # => 5
inst2.b # => 5
As you can see the instance of the class has access to the class attribute which can be set by specifying the class name and then the class attribute.
The tricky part is when you have a class attribute and instance attribute named the same thing. This requires an understanding of what is going on under the hood.
inst.__dict__ # => {}
A.__dict__ # => {..., 'b': 5}
Notice how the instance does not have b as an attribute? Above, when we called inst.b Python actually checks inst.__dict__ for the attribute, if it cannot be found, then it searches A.__dict__ (the class's attributes). Of course, when Python looks up b in the class's attributes it is found and returned.
You can get some confusing output if you then set an instance attribute with the same name.
For example:
inst.b = 10
inst.__dict__ #=> {'b': 10}
A.b # => 5
inst.b # => 10
You can see that the instance of the class now has the b instance attribute and therefore Python returns that value.
I am learning Python and have started a chapter on "classes" and also class/instance attributes. The chapter starts off with a very basic example of creating an empty class
class Contact:
    pass

x=Contact()
So an empty class is created and an instance of the class is created. Then it also throws in the following line of code
x.name='Mr.Roger'
So this threw me for a loop as the class definition is totally empty with no variables. Similarly the object is created with no variables.
It's explained that apparently this is a "data attribute". I tried to google this and most documentation speaks to class/instance attributes, though I was able to find a reference to data attributes here: https://docs.python.org/3/tutorial/classes.html#instance-objects
In my very basic mind - What I am seeing happening is that an empty object is instantiated. Then seemingly new variables can then be created and attached to this object (in this case x.name). I am assuming that we can create any number of attributes in this manner so we could even do
x.firstname='Roger'
x.middlename='Sam'
x.lastname='Jacobs'
etc.
Since there are already class and instance attributes - I am confused why one would do this and for what situations or use-cases? Is this not a recommended way of creating attributes or is this frowned upon?
If I create a second object and then attach other attributes to it - How can I find all the attributes attached to this object or any other object that is implemented in a similar way?
Python is a very dynamic language. Classes act like molds: they create instances according to a specific shape, but unlike other languages where shapes are fixed, in Python you can (nearly) always modify their shape.
I have never heard of "data attribute" in this context, so I'm not surprised that you found nothing to explain this behavior.
Instead, I recommend the Python data model documentation. Under "Class instances":
[...] A class instance has a namespace implemented as a dictionary which is the first place in which attribute references are searched. When an attribute is not found there, and the instance’s class has an attribute by that name, the search continues with the class attributes.
[...]
Special attributes: __dict__ is the attribute dictionary; __class__ is the instance’s class.
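Since each instance keeps its own attribute dictionary, that dictionary is also where you can see which attributes were attached to a particular object. A minimal sketch, reusing the Contact class from the question (the email value is made up for illustration):

```python
class Contact:
    pass

x = Contact()
x.firstname = "Roger"
x.lastname = "Jacobs"

y = Contact()
y.email = "sam@example.com"   # hypothetical value

# vars(obj) returns obj.__dict__: only the attributes stored on that instance.
print(vars(x))   # {'firstname': 'Roger', 'lastname': 'Jacobs'}
print(vars(y))   # {'email': 'sam@example.com'}
print(x.__class__ is Contact)   # True
```

dir(x) would also list the names reachable through the class, which is usually more than you want when looking for attributes added to a single object.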
Python looks simple on the surface level, but what happens when you do a.my_value is rather complex. For the simple cases, my_value is an instance variable, which usually is defined during the class declaration, like so :
class Something:
    def __init__(self, parameter):
        self.my_value = parameter # storing the parameter in an instance variable (self)

a = Something(1)
b = Something(2)
# instance variables are not shared (by default)
print(a.my_value) # 1
print(b.my_value) # 2
a.my_value = 10
b.my_value = 20
print(a.my_value) # 10
print(b.my_value) # 20
But it would have worked without the __init__:
class Something:
    pass # nothing special

a = Something()
a.my_value = 1 # we have to set it ourselves, because there is no more __init__
b = Something()
b.my_value = 2 # same
# and we get the same results as before :
print(a.my_value) # 1
print(b.my_value) # 2
a.my_value = 10
b.my_value = 20
print(a.my_value) # 10
print(b.my_value) # 20
Because each instance uses a dictionary to store its own attributes (its fields; methods are stored on the class), and you can edit this dictionary, you can edit the fields of any object at any moment. This is very handy sometimes and very annoying at other times.
Example of the instance's __dict__ attribute :
class Something:
    pass # nothing special

a = Something()
print(a.__dict__) # {}
a.my_value = 1
print(a.__dict__) # {'my_value': 1}
a.my_value = 10
print(a.__dict__) # {'my_value': 10}
Because it did not exist before, it got added to the __dict__. Then it just got modified.
And if we create another Something:
b = Something()
print(a.__dict__) # {'my_value': 10}
print(b.__dict__) # {}
They were created with the same mold (the Something class) but one got modified afterwards.
The usual way to set attributes to instances is with the __init__ method :
class Something:
    def __init__(self, param):
        print(self.__dict__) # {}
        self.my_value = param
        print(self.__dict__) # {'my_value': 1}

a = Something(1)
print(a.__dict__) # {'my_value': 1}
It does exactly what we did before: add a new entry in the instance's __dict__. In that way, __init__ is not much more than a convention for where to put all your field declarations, but you can do without it.
It comes from the fact that everything in Python is a dynamic object that you can edit at any time. For example, that's the way modules work too:
import sys
this_module = sys.modules[__name__]
print(this_module.__dict__) # {... a bunch of things ...}
MODULE_VAR = 4
print(this_module.__dict__) # {... a bunch of things ..., 'MODULE_VAR': 4}
This is a core feature of Python; its dynamic nature sometimes makes things easy. For example, it enables duck typing, monkey patching, introspection, ... But in a large codebase without coding rules, you can quickly end up with a mess of undeclared attributes everywhere. Nowadays, we try to write less clever, more reliable code, so adding new attributes to instances outside of __init__ is indeed frowned upon.
From what I understand, each instance of a class stores references to the instance's methods.
I thought, in concept, all instances of a class have the same instance methods. If so, both memory savings and logical clarity seem to suggest that instance methods should be stored in the class object rather than the instance object (with the instance object looking them up through the class object; of course, each instance has a reference to its class). Why is this not done?
A secondary question. Why are instance methods not accessible in a way similar to instance attributes, i.e., through __dict__, or through some other system attribute? Is there any way to look at (and perhaps change) the names and the references to instance methods?
EDIT:
Oops, sorry. I was totally wrong. I saw the following Python 2 code, and incorrectly concluded from it that instance methods are stored in the instances. I am not sure what it does, since I don't use Python 2, and new is gone from Python 3.
import new
class X(object):
    def f(self):
        print 'f'

a = X()
b = X()

def g(self):
    print 'g'

# I thought this modified instance method just in a, not in b
X.f = new.instancemethod(g, a, X)
Attribute lookup on objects in Python is non-trivial. But instance methods are certainly not stored on the instance object!
The default behavior for attribute access is to get, set, or delete the attribute from an object’s dictionary. For instance, a.x has a lookup chain starting with a.__dict__['x'], then type(a).__dict__['x'], and continuing through the base classes of type(a) excluding metaclasses.
(docs)
Note that it is possible to store a function on an instance. But that's not an instance method! When the interpreter looks up an attribute and finds that it is (a) a function and (b) on the class object, it automatically wraps it in a bound method object which passes self.
Is there any way to look at (and perhaps change) the names and the references to instance methods?
Well, you can certainly modify the class object after defining it. But I assume what you mean is "can you make the x method of a particular instance do something different?"
This being Python, the answer is "yes": just define a.x to be some new function. Then you will get that function back before looking on the class.
This may cause you a lot of confusion when you're trying to understand the code, though!
From what I understand, each instance of a class stores references to the instance's methods.
I don't know where you got this from, but it's wrong. They don't.
Why are instance methods not accessible in a way similar to instance attributes, i.e., through __dict__, or through some other system attribute?
Well, because they are not stored on the instance.
Is there any way to look at (and perhaps change) the names and the references to instance methods?
Since these references don't exist, you cannot change them. You can of course create any attribute you want by normal assignments, but note that functions stored on the instance are not treated like ordinary methods -- the mechanism that implicitly passes the self parameter does not apply for them.
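The difference between a function found on the class and one stored on an instance can be sketched like this. The Greeter class and the shout function are hypothetical names used only for illustration; types.MethodType is the standard-library way to bind a function to one specific instance.

```python
import types

class Greeter:
    def hello(self):
        return "hello from the class"

def shout(self):
    return "HELLO from " + type(self).__name__

a = Greeter()
b = Greeter()

# A plain function stored on the instance is returned as-is: no self is passed.
a.hello = lambda: "hello from the instance only"

# types.MethodType binds a function to one instance explicitly.
b.hello = types.MethodType(shout, b)

print(a.hello())           # hello from the instance only
print(b.hello())           # HELLO from Greeter
print(Greeter().hello())   # hello from the class

# The method itself lives in the class namespace, not in each instance.
print("hello" in vars(Greeter), "hello" in vars(Greeter()))   # True False
```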
Incorrect. Instances do not store references to each method.
For example:
class Foo():
    def bar(self):
        print('bar')

f = Foo()

def alternate_bar(self):
    print('alternate bar')

f.bar()
Foo.bar = alternate_bar # modifies the class!
f.bar()
prints
bar
alternate bar
This is also why you provide a self to each method you define in a class. Without a reference to self, the method has no idea which instance it is working on.
Another example
class Point:
    def __init__(self, xcoord, ycoord):
        self.x = xcoord
        self.y = ycoord

    def draw(self):
        print(self.x, " ", self.y)

p = Point(205.12, 305.21)
# draw the coordinates of the point instance
p.draw()

# now define a new point drawing function vdraw()
def vdraw(q):
    print("[", q.x, ",", q.y, "]")

#p.draw()
# now reassign the draw() method to vdraw()
Point.draw = vdraw
# now print the coordinates of the point instance
print(p.x)
print(p.y)
# now draw the coordinates of the point instance
p.draw()
I was wondering what was the difference between the Foo.var= user input and self.var= userinput in the 2 classes.
class foo():
    var=None
    def __init__(self,userinput):
        foo.var=userinput

class bar():
    var=None
    def __init__(self,userinput):
        self.var=userinput
foo refers to the class, self refers to the object.
Class members are a property of the class (and thus are shared between all objects of that class), while instance members are a property of the specific object, so a change to an instance member affects only the given object.
When you operate on an object, the members it has are a merge of the class members and the instance members. When two members with the same name are defined, the instance members have the priority.
Thus:
bar sets an instance variable; that change has effect only on the current instance, so if you do:
b=bar(10)
c=bar(20)
you'll see that c.var is 20 and b.var is 10; nothing strange here;
foo sets a class variable, which is common to all the instances; so, if you do:
f=foo(10)
g=foo(20)
you'll see that both f.var and g.var will be 20, because they both actually refer to foo.var, that was last set to 20 in g's constructor;
on the other hand, instance variables shadow class variables; so, if you do
f=foo(10)
g=foo(20)
f.var=30
you'll have g.var==foo.var==20, but f.var==30, since now f.var refers to the instance variable f.var; but, if you do
del f.var
now the instance (f's) attribute var no longer exists, and thus f.var refers again to the class attribute var (thus f.var==g.var==foo.var==20).
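The whole sequence above can be run end to end. This is a small sketch that follows the two classes from the question, capitalized as Foo and Bar (a naming choice of this sketch):

```python
class Foo:
    var = None

    def __init__(self, userinput):
        Foo.var = userinput      # rebinds the class attribute


class Bar:
    var = None

    def __init__(self, userinput):
        self.var = userinput     # creates an instance attribute


b, c = Bar(10), Bar(20)
f, g = Foo(10), Foo(20)
print(b.var, c.var, Bar.var)     # 10 20 None
print(f.var, g.var, Foo.var)     # 20 20 20

f.var = 30                       # shadows Foo.var on f only
print(f.var, g.var)              # 30 20
del f.var                        # removes the shadow; lookup falls back to Foo.var
print(f.var)                     # 20
```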
Long story short: normally you'll want to use self.var (i.e. instance members); classname.var is only for sharing stuff between all instances of a given class.
I'd like to point to an existing post which explains the difference perfectly in my opinion.
Python: Difference between class and instance attributes
Yes,
In the first case you are setting the variable for all instances of foo, because it is a class variable.
In the second case you are only setting the variable for that instance of bar.
For Example:
class pie():
    def __init__(self, j):
        pie.var = "pies" + str(j)
        print (self.var)

    def __str__(self):
        return self.var

a = pie(1)
b = pie(2)
print (a)
print (b)
I tried this example code:
class testclass:
    classvar = 'its classvariable LITERAL'

    def __init__(self,x,y):
        self.z = x
        self.classvar = 'its initvariable LITERAL'
        self.test()

    def test(self):
        print('class var',testclass.classvar)
        print('instance var',self.classvar)

if __name__ == '__main__':
    x = testclass(2,3)
I need some clarification. In both cases, I'm able to access the class attribute and instance in the test method.
So, suppose if I have to define a literal that needs to be used across all function, which would be the better way to define it: an instance attribute or a class attribute?
I found this in an old presentation made by Guido van Rossum in 1999 ( http://legacy.python.org/doc/essays/ppt/acm-ws/sld001.htm ) and I think it explains the topic beautifully:
Instance variable rules
On use via instance (self.x), search order:
(1) instance, (2) class, (3) base classes
this also works for method lookup
On assignment via instance (self.x = ...):
always makes an instance variable
Class variables "default" for instance variables
But...!
mutable class variable: one copy shared by all
mutable instance variable: each instance its own
Class variables are quite good for "constants" used by all the instances (that's all methods are technically). You could use module globals, but using a class variable makes it more clearly associated with the class.
There are often uses for class variables that you actually change, too, but it's usually best to stay away from them for the same reason you stay away from having different parts of your program communicate by altering global variables.
Instance variables are for data that is actually part of the instance. They could be different for each particular instance, and they often change over the lifetime of a single particular instance. It's best to use instance variables for data that is conceptually part of an instance, even if in your program you happen to only have one instance, or you have a few instances that in practice always have the same value.
It's good practice to only use class attributes if they are going to remain fixed, and one great thing about them is that they can be accessed outside of an instance:
class MyClass():
    var1 = 1

    def __init__(self):
        self.var2 = 2

MyClass.var1 # 1 (you can reference var1 without instantiating)
MyClass.var2 # AttributeError: class MyClass has no attribute 'var2'
If MyClass.var1 is defined, it should be the same in every instance of MyClass; otherwise you get the following behaviour, which is considered confusing.
a = MyClass()
b = MyClass()
a.var1, a.var2 # (1,2)
a.var1, a.var2 = (3,4) # you can change these variables
a.var1, a.var2 # (3,4)
b.var1, b.var2 # (1,2) # but they don't change in b
MyClass.var1 # 1 nor in MyClass
You should define it as a class attribute if you want it to be shared among all instances. You should define it as an instance variable if you want a separate one for each instance (e.g., if different instances might have different values for the variable).
For example, this code in Python:
a = object()
a.b = 3
throws AttributeError: 'object' object has no attribute 'b'
But, this piece of code:
class c(object): pass
a = c()
a.b = 3
is just fine. Why can I assign property b when class c does not have that property? How can I make my classes have only the properties I define?
The object type is a built-in class written in C and doesn't let you add attributes to it. It has been expressly coded to prevent it.
The easiest way to get the same behavior in your own classes is to use the __slots__ attribute to define a list of the exact attributes you want to support. Python will reserve space for just those attributes and not allow any others.
class c(object):
    __slots__ = "foo", "bar", "baz"

a = c()
a.foo = 3 # works
a.b = 3 # AttributeError
Of course, there are some caveats with this approach: the oldest pickle protocols can't pickle such objects unless you also define __getstate__(), and code that expects every object to have a __dict__ attribute will break. A "more Pythonic" way would be to use a custom __setattr__() as shown by another poster. Of course there are plenty of ways around that, and no way around setting __slots__ (aside from subclassing and adding your attributes to the subclass).
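The effect of __slots__, including the subclassing escape hatch mentioned above, can be sketched as follows. Point and Point3D are hypothetical class names used only for illustration.

```python
class Point:
    __slots__ = ("x", "y")   # the only instance attributes allowed

p = Point()
p.x = 3                      # works: 'x' is a declared slot

try:
    p.z = 3                  # 'z' is not a declared slot
    rejected = False
except AttributeError:
    rejected = True

print(rejected)                 # True
print(hasattr(p, "__dict__"))   # False: slotted instances have no __dict__

class Point3D(Point):           # a subclass that does not define __slots__
    pass

q = Point3D()
q.z = 3                         # works again: the subclass gets a __dict__
print(vars(q))                  # {'z': 3}
```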
In general, this is not something you should actually want to do in Python. If the user of your class wants to store some extra attributes on instances of the class, there's no reason not to let them, and in fact a lot of reasons why you might want to.
You can override the behavior of the __setattr__ magic method like so.
class C(object):
    def __setattr__(self, name, value):
        allowed_attrs = ('a', 'b', 'c')
        if name not in allowed_attrs:
            # reject names that are not on the approved list
            # (or do something else here)
            raise AttributeError("can't set attribute %r" % name)
        self.__dict__[name] = value
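Of course, this will only prevent you from setting attributes with the dot form, like a.b. You can still set attributes using a.__dict__['b'] = value. Note that __dict__ is an attribute rather than a method, so closing that route takes more than an override (for example, using __slots__ so that instances have no __dict__ at all).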
Python generally allows you to set any attribute on any object. This is a special case where the object class acts differently. There are also some modules implemented in C that act similarly.
If you want your object to behave like this, you can define a __setattr__(self, name, value) method that explicitly does a raise AttributeError() if you try to set a member that's not on the "approved list" (see http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/389916)
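A minimal sketch of that approach is shown below. The class name Restricted and the _allowed set are hypothetical; this is not the code from the linked recipe. It delegates to the default implementation with super().__setattr__() instead of writing to __dict__ directly.

```python
class Restricted:
    _allowed = frozenset({"a", "b", "c"})   # hypothetical approved list

    def __setattr__(self, name, value):
        if name not in self._allowed:
            raise AttributeError(
                f"{type(self).__name__!r} object has no attribute {name!r}"
            )
        super().__setattr__(name, value)    # delegate to the default behaviour

r = Restricted()
r.a = 1               # allowed

try:
    r.d = 4           # not on the approved list
except AttributeError as exc:
    print(exc)        # 'Restricted' object has no attribute 'd'

vars(r)["d"] = 4      # the check is bypassed by writing to __dict__ directly
print(r.d)            # 4
```

The last two lines show why this is a guard against accidents rather than real enforcement.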
A bare object instance has no features of its own. Therefore setting attributes on an instance of the base object type is expressly disabled. You must subclass it to be able to create attributes.
Hint: If you want a simple object to use as something on which to store properties, you can do so by creating an anonymous function with lambda. Functions, being objects, are able to store attributes as well, so this is perfectly legit:
>>> a = lambda: None
>>> a.b = 3
>>> a.b
3
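In Python 3 the standard library also provides types.SimpleNamespace for exactly this purpose, which avoids repurposing a function object. A short sketch (the attribute names are only examples):

```python
from types import SimpleNamespace

a = SimpleNamespace()
a.b = 3
print(a.b)       # 3
print(vars(a))   # {'b': 3}

# Attributes can also be supplied as keyword arguments.
contact = SimpleNamespace(firstname="Roger", lastname="Jacobs")
print(contact.firstname)   # Roger
```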
This happens because when you say a.b = 3, it creates a variable in a that represents b. For example,
class a: pass
print a.b
raises AttributeError: class a has no attribute b
However this code,
class a: pass
a.b = 3
print a.b
returns 3 as it sets the value of b in a, to 3.