In Python, class variables can be accessed via that class instance:
>>> class A(object):
... x = 4
...
>>> a = A()
>>> a.x
4
It's easy to show that a.x is really resolved to A.x, not copied to an instance during construction:
>>> A.x = 5
>>> a.x
5
Despite the fact that this behavior is well known and widely used, I couldn't find any definitive documentation covering it. The closest I could find in Python docs was the section on classes:
class MyClass:
"""A simple example class"""
i = 12345
def f(self):
return 'hello world'
[snip]
... By definition, all attributes of a class that are function objects define corresponding methods of its instances. So in our example, x.f is a valid method reference, since MyClass.f is a function, but x.i is not, since MyClass.i is not. ...
However, this part talks specifically about methods so it's probably not relevant to the general case.
My question is, is this documented? Can I rely on this behavior?
Refs the Classes and Class instances parts in the Python data model documentation
A class has a namespace implemented by a dictionary object. Class
attribute references are translated to lookups in this dictionary,
e.g., C.x is translated to C.__dict__["x"] (although for new-style classes in particular there are a number of hooks which allow for other means of locating attributes).
...
A class instance is created by calling a class object (see above). A
class instance has a namespace implemented as a dictionary which is
the first place in which attribute references are searched. When an
attribute is not found there, and the instance’s class has an
attribute by that name, the search continues with the class
attributes.
Generally, this usage is fine, except the special cases mentioned as "for new-style classes in particular there are a number of hooks which allow for other means of locating attributes".
Not only can you rely on this behavior, you constantly do.
Think about methods. A method is merely a function that has been made a class attribute. You then look it up on the instance.
>>> def foo(self, x):
... print "foo:", self, x
...
>>> class C(object):
... method = foo # What a weird way to write this! But perhaps illustrative?
...
>>> C().method("hello")
foo: <__main__.C object at 0xadad50> hello
In the case of objects like functions, this isn't a plain lookup, but some magic occurs to pass self automatically. You may have used other objects that are meant to be stored as class attributes and looked up on the instance; properties are an example (check out the property builtin if you're not familiar with it.)
As okm notes, the way this works is described in the data model reference (including information about and links to more information about the magic that makes methods and properties work). The Data Model page is by far the most useful part of the Language Reference; it also includes among other things documentation about almost all the __foo__ methods and names.
Related
Class objects have a __bases__ (and a __base__) attribute:
>>> class Foo(object):
... pass
...
>>> Foo.__bases__
(<class 'object'>,)
Sadly, these attributes aren't accessible in the class body, which would be very convenient for accessing parent class attributes without having to hard-code the name:
class Foo:
cls_attr = 3
class Bar(Foo):
cls_attr = __base__.cls_attr + 2
# throws NameError: name '__base__' is not defined
Is there a reason why __bases__ and __base__ can't be accessed in the class body?
(To be clear, I'm asking if this is a conscious design decision. I'm not asking about the implementation; I know that __bases__ is a descriptor in type and that this descriptor can't be accessed until a class object has been created. I want to know why python doesn't create __bases__ as a local variable in the class body.)
I want to know why python doesn't create __bases__ as a local variable in the class body
As you know, class is mostly a shortcut for type.__new__() - when the runtime hits a class statements, it executes all statements at the top-level of the class body, collects all resulting bindings in a dedicated namespace dict, calls type() with the concrete metaclass, the class name, the base classes and the namespace dict, and binds the resulting class object to the class name in the enclosing scope (usually but not necessarily the module's top-level namespace).
The important point here is that it's the metaclass responsabilty to build the class object, and to allow for class object creation customisations, the metaclass must be free to do whatever it wants with its arguments. Most often a custom metaclass will mainly work on the attrs dict, but it must also be able to mess with the bases argument. Now since the metaclass is only invoked AFTER the class body statements have been executed, there's no way the runtime can reliably expose the bases in the class body scope since those bases could be modified afterward by the metaclass.
There are also some more philosophical considerations here, notably wrt/ explicit vs implicit, and as shx2 mentions, Python designers try to avoid magic variables popping out of the blue. There are indeed a couple implementation variables (__module__ and, in py3, __qualname__) that are "automagically" defined in the class body namespace, but those are just names, mostly intended as additional debugging / inspection informations for developers) and have absolutely no impact on the class object creation nor on its properties, behaviour and whatnots.
As always with Python, you have to consider the whole context (the execution model, the object model, how the different parts work together etc) to really understand the design choices. Whether you agree with the whole design and philosophy is another debate (and one that doesn't belong here), but you can be sure that yes, those choices are "conscious design decisions".
I am not answering as to why it was decided to be implemented the way it was, I'm answering why it wasn't implemented as a "local variable in the class body":
Simply because nothing in python is a local variable magically defined in the class body. Python doesn't like names magically appearing out of nowhere.
It's because it's simply is not yet created.
Consider the following:
>>> class baselessmeta(type):
... def __new__(metaclass, class_name, bases, classdict):
... return type.__new__(
... metaclass,
... class_name,
... (), # I can just ignore all the base
... {}
... )
...
>>> class Baseless(int, metaclass=baselessmeta):
... # imaginary print(__bases__, __base__)
... ...
...
>>> Baseless.__bases__
(<class 'object'>,)
>>> Baseless.__base__
<class 'object'>
>>>
What should the imaginary print result in?
Every python class is created via the type metaclass one way or another.
You have the int argument for the type() in bases argument, yet you do not know what is the return value is going to be. You may use that directly as a base in your metaclass, or you may return another base with your LOC.
Just realized your to be clear part and now my answer is useless haha. Oh welp.
What is the difference between class and instance variables in Python?
class Complex:
a = 1
and
class Complex:
def __init__(self):
self.a = 1
Using the call: x = Complex().a in both cases assigns x to 1.
A more in-depth answer about __init__() and self will be appreciated.
When you write a class block, you create class attributes (or class variables). All the names you assign in the class block, including methods you define with def become class attributes.
After a class instance is created, anything with a reference to the instance can create instance attributes on it. Inside methods, the "current" instance is almost always bound to the name self, which is why you are thinking of these as "self variables". Usually in object-oriented design, the code attached to a class is supposed to have control over the attributes of instances of that class, so almost all instance attribute assignment is done inside methods, using the reference to the instance received in the self parameter of the method.
Class attributes are often compared to static variables (or methods) as found in languages like Java, C#, or C++. However, if you want to aim for deeper understanding I would avoid thinking of class attributes as "the same" as static variables. While they are often used for the same purposes, the underlying concept is quite different. More on this in the "advanced" section below the line.
An example!
class SomeClass:
def __init__(self):
self.foo = 'I am an instance attribute called foo'
self.foo_list = []
bar = 'I am a class attribute called bar'
bar_list = []
After executing this block, there is a class SomeClass, with 3 class attributes: __init__, bar, and bar_list.
Then we'll create an instance:
instance = SomeClass()
When this happens, SomeClass's __init__ method is executed, receiving the new instance in its self parameter. This method creates two instance attributes: foo and foo_list. Then this instance is assigned into the instance variable, so it's bound to a thing with those two instance attributes: foo and foo_list.
But:
print instance.bar
gives:
I am a class attribute called bar
How did this happen? When we try to retrieve an attribute through the dot syntax, and the attribute doesn't exist, Python goes through a bunch of steps to try and fulfill your request anyway. The next thing it will try is to look at the class attributes of the class of your instance. In this case, it found an attribute bar in SomeClass, so it returned that.
That's also how method calls work by the way. When you call mylist.append(5), for example, mylist doesn't have an attribute named append. But the class of mylist does, and it's bound to a method object. That method object is returned by the mylist.append bit, and then the (5) bit calls the method with the argument 5.
The way this is useful is that all instances of SomeClass will have access to the same bar attribute. We could create a million instances, but we only need to store that one string in memory, because they can all find it.
But you have to be a bit careful. Have a look at the following operations:
sc1 = SomeClass()
sc1.foo_list.append(1)
sc1.bar_list.append(2)
sc2 = SomeClass()
sc2.foo_list.append(10)
sc2.bar_list.append(20)
print sc1.foo_list
print sc1.bar_list
print sc2.foo_list
print sc2.bar_list
What do you think this prints?
[1]
[2, 20]
[10]
[2, 20]
This is because each instance has its own copy of foo_list, so they were appended to separately. But all instances share access to the same bar_list. So when we did sc1.bar_list.append(2) it affected sc2, even though sc2 didn't exist yet! And likewise sc2.bar_list.append(20) affected the bar_list retrieved through sc1. This is often not what you want.
Advanced study follows. :)
To really grok Python, coming from traditional statically typed OO-languages like Java and C#, you have to learn to rethink classes a little bit.
In Java, a class isn't really a thing in its own right. When you write a class you're more declaring a bunch of things that all instances of that class have in common. At runtime, there's only instances (and static methods/variables, but those are really just global variables and functions in a namespace associated with a class, nothing to do with OO really). Classes are the way you write down in your source code what the instances will be like at runtime; they only "exist" in your source code, not in the running program.
In Python, a class is nothing special. It's an object just like anything else. So "class attributes" are in fact exactly the same thing as "instance attributes"; in reality there's just "attributes". The only reason for drawing a distinction is that we tend to use objects which are classes differently from objects which are not classes. The underlying machinery is all the same. This is why I say it would be a mistake to think of class attributes as static variables from other languages.
But the thing that really makes Python classes different from Java-style classes is that just like any other object each class is an instance of some class!
In Python, most classes are instances of a builtin class called type. It is this class that controls the common behaviour of classes, and makes all the OO stuff the way it does. The default OO way of having instances of classes that have their own attributes, and have common methods/attributes defined by their class, is just a protocol in Python. You can change most aspects of it if you want. If you've ever heard of using a metaclass, all that is is defining a class that is an instance of a different class than type.
The only really "special" thing about classes (aside from all the builtin machinery to make them work they way they do by default), is the class block syntax, to make it easier for you to create instances of type. This:
class Foo(BaseFoo):
def __init__(self, foo):
self.foo = foo
z = 28
is roughly equivalent to the following:
def __init__(self, foo):
self.foo = foo
classdict = {'__init__': __init__, 'z': 28 }
Foo = type('Foo', (BaseFoo,) classdict)
And it will arrange for all the contents of classdict to become attributes of the object that gets created.
So then it becomes almost trivial to see that you can access a class attribute by Class.attribute just as easily as i = Class(); i.attribute. Both i and Class are objects, and objects have attributes. This also makes it easy to understand how you can modify a class after it's been created; just assign its attributes the same way you would with any other object!
In fact, instances have no particular special relationship with the class used to create them. The way Python knows which class to search for attributes that aren't found in the instance is by the hidden __class__ attribute. Which you can read to find out what class this is an instance of, just as with any other attribute: c = some_instance.__class__. Now you have a variable c bound to a class, even though it probably doesn't have the same name as the class. You can use this to access class attributes, or even call it to create more instances of it (even though you don't know what class it is!).
And you can even assign to i.__class__ to change what class it is an instance of! If you do this, nothing in particular happens immediately. It's not earth-shattering. All that it means is that when you look up attributes that don't exist in the instance, Python will go look at the new contents of __class__. Since that includes most methods, and methods usually expect the instance they're operating on to be in certain states, this usually results in errors if you do it at random, and it's very confusing, but it can be done. If you're very careful, the thing you store in __class__ doesn't even have to be a class object; all Python's going to do with it is look up attributes under certain circumstances, so all you need is an object that has the right kind of attributes (some caveats aside where Python does get picky about things being classes or instances of a particular class).
That's probably enough for now. Hopefully (if you've even read this far) I haven't confused you too much. Python is neat when you learn how it works. :)
What you're calling an "instance" variable isn't actually an instance variable; it's a class variable. See the language reference about classes.
In your example, the a appears to be an instance variable because it is immutable. It's nature as a class variable can be seen in the case when you assign a mutable object:
>>> class Complex:
>>> a = []
>>>
>>> b = Complex()
>>> c = Complex()
>>>
>>> # What do they look like?
>>> b.a
[]
>>> c.a
[]
>>>
>>> # Change b...
>>> b.a.append('Hello')
>>> b.a
['Hello']
>>> # What does c look like?
>>> c.a
['Hello']
If you used self, then it would be a true instance variable, and thus each instance would have it's own unique a. An object's __init__ function is called when a new instance is created, and self is a reference to that instance.
I am a complete Python novice so my question might seem to be dumb. I've seen there are two ways of assigning values to object attributes in Python:
Using __dict__:
class A(object):
def __init__(self,a,b):
self.__dict__['a'] = a
self.__dict__['b'] = b
Without __dict__:
class A(object):
def __init__(self,a,b):
self.a = a
self.b = b
Can anyone explain where is the difference?
Unless you need set attributes dynamically and bypass descriptors or the .__setattr__() hook, do not assign attributes directly to .__dict__.
Not all instances have a .__dict__ attribute even, not if the class defined a .__slots__ attribute to save memory.
If you do need to set attributes dynamically but don't need to bypass a descriptor or a .__setattr__() hook, you'd use the setattr() function normally.
From the Python documentation:
Both class types (new-style classes) and class objects
(old-style/classic classes) are typically created by class definitions
(see section Class definitions). A class has a namespace implemented
by a dictionary object. Class attribute references are translated to
lookups in this dictionary, e.g., C.x is translated to C.__dict__["x"]
(although for new-style classes in particular there are a number of
hooks which allow for other means of locating attributes).
The much more typical way of assigning a new element to a class is your second example, which allows for a number of customizations (__seattr__, __getattr__, etc.).
I was just reading the Python documentation about classes; it says, in Python "classes themselves are objects". How is that different from classes in C#, Java, Ruby, or Smalltalk?
What advantages and disadvantages does this type of classes have compared with those other languages?
In Python, classes are objects in the sense that you can assign them to variables, pass them to functions, etc. just like any other objects. For example
>>> t = type(10)
>>> t
<type 'int'>
>>> len(t.__dict__)
55
>>> t() # construct an int
0
>>> t(10)
10
Java has Class objects which provide some information about a class, but you can't use them in place of explicit class names. They aren't really classes, just class information structures.
Class C = x.getClass();
new C(); // won't work
Declaring a class is simply declaring a variable:
class foo(object):
def bar(self): pass
print foo # <class '__main__.foo'>
They can be assigned and stored like any variable:
class foo(object):
pass
class bar(object):
pass
baz = bar # simple variable assignment
items = [foo, bar]
my_foo = items[0]() # creates a foo
for x in (foo, bar): # create one of each type
print x()
and passed around as a variable:
class foo(object):
def __init__(self):
print "created foo"
def func(f):
f()
func(foo)
They can be created by functions, including the base class list:
def func(base_class, var):
class cls(base_class):
def g(self):
print var
return cls
class my_base(object):
def f(self): print "hello"
new_class = func(my_base, 10)
obj = new_class()
obj.f() # hello
obj.g() # 10
By contrast, while classes in Java have objects representing them, eg. String.class, the class name itself--String--isn't an object and can't be manipulated as one. That's inherent to statically-typed languages.
In C# and Java the classes are not objects. They are types, in the sense in which those languages are statically typed. True you can get an object representing a specific class - but that's not the same as the class itself.
In python what looks like a class is actually an object too.
It's exlpained here much better than I can ever do :)
The main difference is that they mean you can easily manipulate the class as an object. The same facility is available in Java, where you can use the methods of Class to get at information about the class of an object. In languages like Python, Ruby, and Smalltalk, the more dynamic nature of the language lets you "open" the class and change it, which is sometimes called "monkey patching".
Personally I don't think the differences are all that much of a big deal, but I'm sure we can get a good religious war started about it.
Classes are objects in that they are manipulable in Python code just like any object. Others have shown how you can pass them around to functions, allowing them to be operated upon like any object. Here is how you might do this:
class Foo(object):
pass
f = Foo()
f.a = "a" # assigns attribute on instance f
Foo.b = "b" # assigns attribute on class Foo, and thus on all instances including f
print f.a, f.b
Second, like all objects, classes are instantiated at runtime. That is, a class definition is code that is executed rather than a structure that is compiled before anything runs. This means a class can "bake in" things that are only known when the program is run, such as environment variables or user input. These are evaluated once when the class is declared and then become a part of the class. This is different from compiled languages like C# which require this sort of behavior to be implemented differently.
Finally, classes, like any object, are built from classes. Just as an object is built from a class, so is a class built from a special kind of class called a metaclass. You can write your own metaclasses to change how classes are defined.
Another advantage of classes being objects is that objects can change their class at runtime:
>>> class MyClass(object):
... def foo(self):
... print "Yo There! I'm a MyCLass-Object!"
...
>>> class YourClass(object):
... def foo(self):
... print "Guess what?! I'm a YourClass-Object!"
...
>>> o = MyClass()
>>> o.foo()
Yo There! I'm a MyCLass-Object!
>>> o.__class__ = YourClass
>>> o.foo()
Guess what?! I'm a YourClass-Object!
Objects have a special attribute __class__ that points to the class of which they are an instance. This is possible only because classes are objects themself, and therefore can be bound to an attribute like __class__.
As this question has a Smalltalk tag, this answer is from a Smalltalk perspective. In Object-Oriented programming, things get done through message-passing. You send a message to an object, if the object understands that message, it executes the corresponding method and returns a value. But how is the object created in the first place? If special syntax is introduced for creating objects that will break the simple syntax based on message passing. This is what happens in languages like Java:
p = new Point(10, 20); // Creates a new Point object with the help of a special keyword - new.
p.draw(); // Sends the message `draw` to the Point object.
As it is evident from the above code, the language has two ways to get things done - one imperative and the other Object Oriented. In contrast, Smalltalk has a consistent syntax based only on messaging:
p := Point new: 10 y: 20.
p draw.
Here new is a message send to a singleton object called Point which is an instance of a Metaclass. In addition to giving the language a consistent model of computation, metaclasses allow dynamic modification of classes. For instance, the following statement will add a new instance variable to the Point class without requiring a recompilation or VM restart:
Point addInstVarName: 'z'.
The best reading on this subject is The Art of the Metaobject Protocol.
I am trying to build some classes that inherit from a parent class, which contains subclasses that inherit from other parent classes. But when I change attributes in the subclasses in any children, the change affects all child classes. I am looking to avoid having to create instances, as I am using that feature later.
The code below boils down the problem. The final line shows the unexpected result.
class SubclsParent(object):
a = "Hello"
class Parent(object):
class Subcls(SubclsParent):
pass
class Child1(Parent):
pass
class Child2(Parent):
pass
Child1.Subcls.a # Returns "Hello"
Child2.Subcls.a # Returns "Hello"
Child1.Subcls.a = "Goodbye"
Child1.Subcls.a # Returns "Goodbye"
Child2.Subcls.a # Returns "Goodbye" / Should still return "Hello"!
The behaviour you are seeing is exactly what you should expect. When you define a class
>>> class Foo(object): pass
...
you can modify that class -- not instances of it, the class itself -- because the class is just another object, stored in the variable Foo. So, for instance, you can get and set attributes of the class:
>>> Foo.a = 1
>>> Foo.a
1
In other words, the class keyword creates a new type of object and binds the specified name to that object.
Now, if you define a class inside another class (which is a weird thing to do, by the way), that is equivalent to defining a local variable inside the class body. And you know what defining local variables inside the body of a class does: it sets them as class attributes. In other words, variables defined locally are stored on the class object and not on individual instances. Thus,
>>> class Foo(object):
... class Bar(object): pass
...
defines a class Foo with one class attribute, Bar, which happens itself to be a class. There is no subclassing going on here, though -- the classes Foo and Bar are entirely independent. (The behaviour you have achieved could be replicated as follows:
>>> class Foo(object):
... class Bar(object): pass
...
>>> class Foo(object): pass
...
>>> class Bar(object): pass
...
>>> Foo.Bar = Bar
.)
So you are always modifying the same variable! Of course you will change the values you see; you have changed them yourself!
Your problem seems to be that you are somewhat confused between instance and class attributes, which are not the same thing at all.
A class attribute is a variable which is defined over the whole class. That is, any instance of the class will share the same variable. For instance, most methods are class attributes, since you want to call the same methods on every instance (usually). You can also use class attributes for things like global counters (how many times have you instantiated this class?) and other properties which should be shared amongst instances.
An instance attribute is a variable peculiar to an instance of the class. That is, each instance has a different copy of the variable, with possibly-different contents. This is where you store the data in classes -- if you have a Page class, say, you would like the contents attribute to be stored per-instance, since different Pages will of course need different contents.
In your example, you want Child1.Subcls.a and Child2.Subcls.a to be different variables. Naturally, then, they should depend on the instance!
This may be a bit of a leap of faith, but are you trying to implement Java-style interfaces in Python? In other words, are you trying to specify what properties and methods a class should have, without actually defining those properties?
This used to be considered something of a non-Pythonic thing to do, since the prevailing consensus was that you should allow the classes to do whatever they want and catch the exceptions which arise when they don't define a needed property or method. However, recently people have realised that interfaces are actually sometimes a good thing, and new functionality was added to Python to allow this: abstract base classes.
Try this
class SubclsParent(object):
def __init__(self):
self.a = "Hello"
When you define SubclsParent.a directly on the class, you are defining it as static.
When you use Child1.Subcls, python sees there is no Child1.Subcls, and so checks Parent where it finds it and returns it. The same thing happens for Child2.Subcls. As a result, both of those expressions refer to the same class. Child1 and Child2 do not get their own subclasses of it, rather they have access to the original.
*I am looking to avoid having to create instances, as I am using that feature later. *
I don't understand what you mean here.
Your problem is that when you access the attributes you are accessing inherited routines that were created in the parent class, which all refer to the same variable. You can either make those instance variables, or else you can create the attributes in the child classes to get independent attributes.
It may be that what you really want is metaclasses.