In an attempt to discover the boundaries of Python as a language I'm exploring whether it is possible to go further with information hiding than the convention of using a leading underscore to denote 'private' implementation details.
I have managed to achieve some additional level of privacy of fields and methods using code such as this to copy 'public' stuff from a locally defined class:
from __future__ import print_function
class Dog(object):
def __init__(self):
class Guts(object):
def __init__(self):
self._dog_sound = "woof"
self._repeat = 1
def _make_sound(self):
for _ in range(self._repeat):
print(self._dog_sound)
def get_repeat(self):
return self._repeat
def set_repeat(self, value):
self._repeat = value
#property
def repeat(self):
return self._repeat
#repeat.setter
def repeat(self, value):
self._repeat = value
def speak(self):
self._make_sound()
guts = Guts()
# Make public methods
self.speak = guts.speak
self.set_repeat = guts.set_repeat
self.get_repeat = guts.get_repeat
dog = Dog()
print("Speak once:")
dog.speak()
print("Speak twice:")
dog.set_repeat(2)
dog.speak()
However, I'm struggling to find a way to do the same for the property setter and getter.
I want to be able to write code like this:
print("Speak thrice:")
dog.repeat = 3
dog.speak()
and for it to actually print 'woof' three times.
I've tried all of the following in Dog.__init__, none of which blow up, but neither do they seem to have any effect:
Dog.repeat = guts.repeat
self.repeat = guts.repeat
Dog.repeat = Guts.repeat
self.repeat = Guts.repeat
self.repeat = property(Guts.repeat.getter, Guts.repeat.setter)
self.repeat = property(Guts.repeat.fget, Guts.repeat.fset)
Descriptors only work when defined on the class, not the instance. See this previous question and this one and the documentation that abarnert already pointed you to. The key statement is:
For objects, the machinery is in object.__getattribute__() which transforms b.x into type(b).__dict__['x'].__get__(b, type(b)).
Note the reference to type(b). There is only one property object for the whole class, and the instance information is passed in at access time.
That means you can't have a property on an individual Dog instance that deals only with that particular dog's guts. You have to define a property on the Dog class, and have it access the guts of the individual dog via self. Except you can't do that with your setup, because you're not storing a reference to the dog's guts on self, because you're trying to hide the guts.
The bottom line is that you can't effectively proxy attribute access to an underlying "guts" object without storing a reference to that object on the "outward-facing" object. And if you store such a reference, people can use it to modify the guts in an "unauthorized" way.
Also, sad to say, even your existing example doesn't really protect the guts. I can do this:
>>> d = Dog()
>>> d.speak.__self__._repeat = 3
>>> d.speak()
woof
woof
woof
Even though you try to hide the guts by exposing only the public methods of the guts, those public methods themselves contain a reference to the actual Guts object, allowing anyone to sneak in and modify the guts directly, bypassing your information-hiding scheme. This is why it's futile to try to enforce information-hiding in Python: core parts of the language like methods already expose a lot of stuff, and you just can't plug all those leaks.
You could set up a system with descriptors and a metaclass where the Dog metaclass creates descriptors for all the public attributes of the class and constructs a SecretDog class containing all the private methods, then have each Dog instance has a shadow SecretDog instance tracked by the descriptors that houses your 'private' implementation. However, this would be going an awfully long way to "secure" the private implementation in a language that by it's nature can't really have anything private. You'll also have a hell of a time getting inheritance to work reliably.
Ultimately, if you want to hide a private implementation in Python, you should probably be writing it as a C extension (or not trying to in the first place). If your goal is a deeper understanding of the language, looking at writing a C extension isn't a bad place to start.
You will need something like this
>>> class hi:
... def __setattr__(self, attr, value):
... getattr(self, attr)(value)
... def haha(self, val):
... print val
...
>>> a = hi()
>>> a.haha = 10
10
just be careful with this
Related
I know that Python is a dynamically typed language, and that I am likely trying to recreate Java behavior here. However, I have a team of people working on this code base, and my goal with the code is to ensure that they are doing things in a consistent manner. Let me give an example:
class Company:
def __init__(self, j):
self.locations = []
When they instantiate a Company object, an empty list that holds locations is created. Now, with Python anything can be added to the list. However, I would like for this list to only contain Location objects:
class Location:
def __init__(self, j):
self.address = None
self.city = None
self.state = None
self.zip = None
I'm doing this with classes so that the code is self documenting. In other words, "location has only these attributes". My goal is that they do this:
c = Company()
l = Location()
l.city = "New York"
c.locations.append(l)
Unfortunately, nothing is stopping them from simply doing c.locations.append("foo"), and nothing indicates to them that c.locations should be a list of Location objects.
What is the Pythonic way to enforce consistency when working with a team of developers?
An OOP solution is to make sure the users of your class' API do not have to interact directly with your instance attributes.
Methods
One approach is to implement methods which encapsulate the logic of adding a location.
Example
class Company:
def __init__(self, j):
self.locations = []
def add_location(self, location):
if isinstance(location, Location):
self.locations.append(location)
else:
raise TypeError("argument 'location' should be a Location object")
Properties
Another OOP concept you can use is a property. Properties are a simple way to define getter and setters for your instance attributes.
Example
Suppose we want to enforce a certain format for a Location.zip attribute
class Location:
def __init__(self):
self._zip = None
#property
def zip(self):
return self._zip
#zip.setter
def zip(self, value):
if some_condition_on_value:
self._zip = value
else:
raise ValueError('Incorrect format')
#zip.deleter
def zip(self):
self._zip = None
Notice that the attribute Location()._zip is still accessible and writable. While the underscore denotes what should be a private attribute, nothing is really private in Python.
Final word
Due to Python's high introspection capabilities, nothing will ever be totally safe. You will have to sit down with your team and discuss the tools and practice you want to adopt.
Nothing is really private in python. No class or class instance can
keep you away from all what's inside (this makes introspection
possible and powerful). Python trusts you. It says "hey, if you want
to go poking around in dark places, I'm gonna trust that you've got a
good reason and you're not making trouble."
After all, we're all consenting adults here.
--- Karl Fast
You could also define a new class ListOfLocations that make the safety checks. Something like this
class ListOfLocations(list):
def append(self,l):
if not isinstance(l, Location): raise TypeError("Location required here")
else: super().append(l)
I'm coming from the Java world and reading Bruce Eckels' Python 3 Patterns, Recipes and Idioms.
While reading about classes, it goes on to say that in Python there is no need to declare instance variables. You just use them in the constructor, and boom, they are there.
So for example:
class Simple:
def __init__(self, s):
print("inside the simple constructor")
self.s = s
def show(self):
print(self.s)
def showMsg(self, msg):
print(msg + ':', self.show())
If that’s true, then any object of class Simple can just change the value of variable s outside of the class.
For example:
if __name__ == "__main__":
x = Simple("constructor argument")
x.s = "test15" # this changes the value
x.show()
x.showMsg("A message")
In Java, we have been taught about public/private/protected variables. Those keywords make sense because at times you want variables in a class to which no one outside the class has access to.
Why is that not required in Python?
It's cultural. In Python, you don't write to other classes' instance or class variables. In Java, nothing prevents you from doing the same if you really want to - after all, you can always edit the source of the class itself to achieve the same effect. Python drops that pretence of security and encourages programmers to be responsible. In practice, this works very nicely.
If you want to emulate private variables for some reason, you can always use the __ prefix from PEP 8. Python mangles the names of variables like __foo so that they're not easily visible to code outside the namespace that contains them (although you can get around it if you're determined enough, just like you can get around Java's protections if you work at it).
By the same convention, the _ prefix means _variable should be used internally in the class (or module) only, even if you're not technically prevented from accessing it from somewhere else. You don't play around with another class's variables that look like __foo or _bar.
Private variables in Python is more or less a hack: the interpreter intentionally renames the variable.
class A:
def __init__(self):
self.__var = 123
def printVar(self):
print self.__var
Now, if you try to access __var outside the class definition, it will fail:
>>> x = A()
>>> x.__var # this will return error: "A has no attribute __var"
>>> x.printVar() # this gives back 123
But you can easily get away with this:
>>> x.__dict__ # this will show everything that is contained in object x
# which in this case is something like {'_A__var' : 123}
>>> x._A__var = 456 # you now know the masked name of private variables
>>> x.printVar() # this gives back 456
You probably know that methods in OOP are invoked like this: x.printVar() => A.printVar(x). If A.printVar() can access some field in x, this field can also be accessed outside A.printVar()... After all, functions are created for reusability, and there isn't any special power given to the statements inside.
As correctly mentioned by many of the comments above, let's not forget the main goal of Access Modifiers: To help users of code understand what is supposed to change and what is supposed not to. When you see a private field you don't mess around with it. So it's mostly syntactic sugar which is easily achieved in Python by the _ and __.
Python does not have any private variables like C++ or Java does. You could access any member variable at any time if wanted, too. However, you don't need private variables in Python, because in Python it is not bad to expose your classes' member variables. If you have the need to encapsulate a member variable, you can do this by using "#property" later on without breaking existing client code.
In Python, the single underscore "_" is used to indicate that a method or variable is not considered as part of the public API of a class and that this part of the API could change between different versions. You can use these methods and variables, but your code could break, if you use a newer version of this class.
The double underscore "__" does not mean a "private variable". You use it to define variables which are "class local" and which can not be easily overridden by subclasses. It mangles the variables name.
For example:
class A(object):
def __init__(self):
self.__foobar = None # Will be automatically mangled to self._A__foobar
class B(A):
def __init__(self):
self.__foobar = 1 # Will be automatically mangled to self._B__foobar
self.__foobar's name is automatically mangled to self._A__foobar in class A. In class B it is mangled to self._B__foobar. So every subclass can define its own variable __foobar without overriding its parents variable(s). But nothing prevents you from accessing variables beginning with double underscores. However, name mangling prevents you from calling this variables /methods incidentally.
I strongly recommend you watch Raymond Hettinger's Python's class development toolkit from PyCon 2013, which gives a good example why and how you should use #property and "__"-instance variables.
If you have exposed public variables and you have the need to encapsulate them, then you can use #property. Therefore you can start with the simplest solution possible. You can leave member variables public unless you have a concrete reason to not do so. Here is an example:
class Distance:
def __init__(self, meter):
self.meter = meter
d = Distance(1.0)
print(d.meter)
# prints 1.0
class Distance:
def __init__(self, meter):
# Customer request: Distances must be stored in millimeters.
# Public available internals must be changed.
# This would break client code in C++.
# This is why you never expose public variables in C++ or Java.
# However, this is Python.
self.millimeter = meter * 1000
# In Python we have #property to the rescue.
#property
def meter(self):
return self.millimeter *0.001
#meter.setter
def meter(self, value):
self.millimeter = value * 1000
d = Distance(1.0)
print(d.meter)
# prints 1.0
There is a variation of private variables in the underscore convention.
In [5]: class Test(object):
...: def __private_method(self):
...: return "Boo"
...: def public_method(self):
...: return self.__private_method()
...:
In [6]: x = Test()
In [7]: x.public_method()
Out[7]: 'Boo'
In [8]: x.__private_method()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-8-fa17ce05d8bc> in <module>()
----> 1 x.__private_method()
AttributeError: 'Test' object has no attribute '__private_method'
There are some subtle differences, but for the sake of programming pattern ideological purity, it's good enough.
There are examples out there of #private decorators that more closely implement the concept, but your mileage may vary. Arguably, one could also write a class definition that uses meta.
As mentioned earlier, you can indicate that a variable or method is private by prefixing it with an underscore. If you don't feel like this is enough, you can always use the property decorator. Here's an example:
class Foo:
def __init__(self, bar):
self._bar = bar
#property
def bar(self):
"""Getter for '_bar'."""
return self._bar
This way, someone or something that references bar is actually referencing the return value of the bar function rather than the variable itself, and therefore it can be accessed but not changed. However, if someone really wanted to, they could simply use _bar and assign a new value to it. There is no surefire way to prevent someone from accessing variables and methods that you wish to hide, as has been said repeatedly. However, using property is the clearest message you can send that a variable is not to be edited. property can also be used for more complex getter/setter/deleter access paths, as explained here: https://docs.python.org/3/library/functions.html#property
Python has limited support for private identifiers, through a feature that automatically prepends the class name to any identifiers starting with two underscores. This is transparent to the programmer, for the most part, but the net effect is that any variables named this way can be used as private variables.
See here for more on that.
In general, Python's implementation of object orientation is a bit primitive compared to other languages. But I enjoy this, actually. It's a very conceptually simple implementation and fits well with the dynamic style of the language.
The only time I ever use private variables is when I need to do other things when writing to or reading from the variable and as such I need to force the use of a setter and/or getter.
Again this goes to culture, as already stated. I've been working on projects where reading and writing other classes variables was free-for-all. When one implementation became deprecated it took a lot longer to identify all code paths that used that function. When use of setters and getters was forced, a debug statement could easily be written to identify that the deprecated method had been called and the code path that calls it.
When you are on a project where anyone can write an extension, notifying users about deprecated methods that are to disappear in a few releases hence is vital to keep module breakage at a minimum upon upgrades.
So my answer is; if you and your colleagues maintain a simple code set then protecting class variables is not always necessary. If you are writing an extensible system then it becomes imperative when changes to the core is made that needs to be caught by all extensions using the code.
"In java, we have been taught about public/private/protected variables"
"Why is that not required in python?"
For the same reason, it's not required in Java.
You're free to use -- or not use private and protected.
As a Python and Java programmer, I've found that private and protected are very, very important design concepts. But as a practical matter, in tens of thousands of lines of Java and Python, I've never actually used private or protected.
Why not?
Here's my question "protected from whom?"
Other programmers on my team? They have the source. What does protected mean when they can change it?
Other programmers on other teams? They work for the same company. They can -- with a phone call -- get the source.
Clients? It's work-for-hire programming (generally). The clients (generally) own the code.
So, who -- precisely -- am I protecting it from?
In Python 3, if you just want to "encapsulate" the class attributes, like in Java, you can just do the same thing like this:
class Simple:
def __init__(self, str):
print("inside the simple constructor")
self.__s = str
def show(self):
print(self.__s)
def showMsg(self, msg):
print(msg + ':', self.show())
To instantiate this do:
ss = Simple("lol")
ss.show()
Note that: print(ss.__s) will throw an error.
In practice, Python 3 will obfuscate the global attribute name. It is turning this like a "private" attribute, like in Java. The attribute's name is still global, but in an inaccessible way, like a private attribute in other languages.
But don't be afraid of it. It doesn't matter. It does the job too. ;)
Private and protected concepts are very important. But Python is just a tool for prototyping and rapid development with restricted resources available for development, and that is why some of the protection levels are not so strictly followed in Python. You can use "__" in a class member. It works properly, but it does not look good enough. Each access to such field contains these characters.
Also, you can notice that the Python OOP concept is not perfect. Smalltalk or Ruby are much closer to a pure OOP concept. Even C# or Java are closer.
Python is a very good tool. But it is a simplified OOP language. Syntactically and conceptually simplified. The main goal of Python's existence is to bring to developers the possibility to write easy readable code with a high abstraction level in a very fast manner.
Here's how I handle Python 3 class fields:
class MyClass:
def __init__(self, public_read_variable, private_variable):
self.public_read_variable_ = public_read_variable
self.__private_variable = private_variable
I access the __private_variable with two underscores only inside MyClass methods.
I do read access of the public_read_variable_ with one underscore
outside the class, but never modify the variable:
my_class = MyClass("public", "private")
print(my_class.public_read_variable_) # OK
my_class.public_read_variable_ = 'another value' # NOT OK, don't do that.
So I’m new to Python but I have a background in C# and JavaScript. Python feels like a mix of the two in terms of features. JavaScript also struggles in this area and the way around it here, is to create a closure. This prevents access to data you don’t want to expose by returning a different object.
def print_msg(msg):
# This is the outer enclosing function
def printer():
# This is the nested function
print(msg)
return printer # returns the nested function
# Now let's try calling this function.
# Output: Hello
another = print_msg("Hello")
another()
https://www.programiz.com/python-programming/closure
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Closures#emulating_private_methods_with_closures
About sources (to change the access rights and thus bypass language encapsulation like Java or C++):
You don't always have the sources and even if you do, the sources are managed by a system that only allows certain programmers to access a source (in a professional context). Often, every programmer is responsible for certain classes and therefore knows what he can and cannot do. The source manager also locks the sources being modified and of course, manages the access rights of programmers.
So I trust more in software than in human, by experience. So convention is good, but multiple protections are better, like access management (real private variable) + sources management.
I have been thinking about private class attributes and methods (named members in further reading) since I have started to develop a package that I want to publish. The thought behind it were never to make it impossible to overwrite these members, but to have a warning for those who touch them. I came up with a few solutions that might help. The first solution is used in one of my favorite Python books, Fluent Python.
Upsides of technique 1:
It is unlikely to be overwritten by accident.
It is easily understood and implemented.
Its easier to handle than leading double underscore for instance attributes.
*In the book the hash-symbol was used, but you could use integer converted to strings as well. In Python it is forbidden to use klass.1
class Technique1:
def __init__(self, name, value):
setattr(self, f'private#{name}', value)
setattr(self, f'1{name}', value)
Downsides of technique 1:
Methods are not easily protected with this technique though. It is possible.
Attribute lookups are just possible via getattr
Still no warning to the user
Another solution I came across was to write __setattr__. Pros:
It is easily implemented and understood
It works with methods
Lookup is not affected
The user gets a warning or error
class Demonstration:
def __init__(self):
self.a = 1
def method(self):
return None
def __setattr__(self, name, value):
if not getattr(self, name, None):
super().__setattr__(name, value)
else:
raise ValueError(f'Already reserved name: {name}')
d = Demonstration()
#d.a = 2
d.method = None
Cons:
You can still overwrite the class
To have variables not just constants, you need to map allowed input.
Subclasses can still overwrite methods
To prevent subclasses from overwriting methods you can use __init_subclass__:
class Demonstration:
__protected = ['method']
def method(self):
return None
def __init_subclass__(cls):
protected_methods = Demonstration.__protected
subclass_methods = dir(cls)
for i in protected_methods:
p = getattr(Demonstration,i)
j = getattr(cls, i)
if not p is j:
raise ValueError(f'Protected method "{i}" was touched')
You see, there are ways to protect your class members, but it isn't any guarantee that users don't overwrite them anyway. This should just give you some ideas. In the end, you could also use a meta class, but this might open up new dangers to encounter. The techniques used here are also very simple minded and you should definitely take a look at the documentation, you can find useful feature to this technique and customize them to your need.
I'm surprised that the following code runs without error.
# ABC
class Foo(object):
__metaclass__ = ABCMeta
a = 1
def __init__(self, b, c):
self.b = b
self.c = c
def get_scaled_a(self):
return self.a / Bar1.a # why can I access Bar1.a?
#abstractmethod
def class_type(self):
pass
# Derived class 1
class Bar1(Foo):
a = 100
def class_type(self):
return 'bar1'
# Derived class 2
class Bar2(Foo):
a = 10
def class_type(self):
return 'bar2'
my_bar2_inst = Bar2(0, 0)
print(my_bar2_inst.get_scaled_a())
# 0.1
# Why can I access Bar.a?
Because Python assumes that developers are mature human beings. Instead of the interpreter checking whether you have access to a certain attribute, in Python you usually have access to all attributes. It is up to you to be mature enough and don't break anything.
There is however a convention that attributes starting with a lowercase, like _foo, __bar and __qux__ are considered private. It means it is usually a bad idea to access those yourself. But there is no mechanism in place to prevent you from accessing them: the variable name more or less asks that you would be so kind not to access it. In case you absolutely need it, it is your responsibility.
Now the a of Bar is a member of the Bar class, not of a Bar instance. So in some other languages, it would be considered to be "static". That's another reason why you can access it.
When you run get_scaled_a, the Bar1 class has already been defined, and so has its a attribute. Remember that classes are objects too, not just class instances. So when Bar1 is defined, you can access certain attributes without creating an instance of it.
The fact that it is a subclass doesn't play a role at all.
You can access the child class attribute for the same reason you can do this:
>>> class myobj: pass
>>> def f(obj):
print(obj.a)
>>> obj1, obj2 = myobj(), myobj()
>>> obj1.a, obj2.a = 1, 2
>>> f(obj1)
1
>>> f(obj2)
2
Classes are just objects. Metaclasses are little more than class-object-factories.
Here's another illustration. Say I have a Reuben maker:
>>> Reuben = type('Reuben', (), {'mayo':False}) # <-- shortcut for making a class
Now say I have some kind of worthless sandwich making factory that puts mayo on everything (gross):
>>> def MakeMeASandwich(sandwich_type):
sandwich_type.mayo = True
return sandwich_type()
Would you expect that to work? If you said yes, you're right:
>>> s = MakeMeASandwich(Reuben)
# reuben with mayo!
Why did you expect that to work? Probably because there is no reason the function shouldn't be able to access mayo. It's there. It's not hidden. So of course it can get to it.
Again: a metaclass is little more than a class making factory. It is very much the same as any other factory (though they do have some nifty extra bells and whistles that you probably don't need).
It's possible to use type in Python to create a new class object, as you probably know:
A = type('A', (object,), {})
a = A() # create an instance of A
What I'm curious about is whether there's any problem with creating different class objects with the same name, eg, following on from the above:
B = type('A', (object,), {})
In other words, is there an issue with this second class object, B, having the same name as our first class object, A?
The motivation for this is that I'd like to get a clean copy of a class to apply different decorators to without using the inheritance approach described in this question.
So I'd like to define a class normally, eg:
class Fruit(object):
pass
and then make a fresh copy of it to play with:
def copy_class(cls):
return type(cls.__name__, cls.__bases__, dict(cls.__dict__))
FreshFruit = copy_class(fruit)
In my testing, things I do with FreshFruit are properly decoupled from things I do to Fruit.
However, I'm unsure whether I should also be mangling the name in copy_class in order to avoid unexpected problems.
In particular, one concern I have is that this could cause the class to be replaced in the module's dictionary, such that future imports (eg, from module import Fruit return the copied class).
There is no reason why you can't have 2 classes with the same __name__ in the same module if you want to and have a good reason to do so.
e.g. In your example from module import Fruit -- python doesn't care at all about the __name__ of the class. It looks in the module's globals for Fruit and imports what it finds there.
Note that, in general, this approach isn't great if you're using super (although the same can be said for class decorators ...):
class A(Base):
def foo(self):
super(A, self).foo()
B = copy_class(A)
In this case, when B.foo is called, it will end up calling super(A, self) which could lead to funky behaviour in a number of circumstances. . .
Suppose we have the following code:
class A:
var = 0
a = A()
I do understand that a.var and A.var are different variables, and I think I understand why this thing happens. I thought it was just a side effect of python's data model, since why would someone want to modify a class variable in an instance?
However, today I came across a strange example of such a usage: it is in google app engine db.Model reference. Google app engine datastore assumes we inherit db.Model class and introduce keys as class variables:
class Story(db.Model):
title = db.StringProperty()
body = db.TextProperty()
created = db.DateTimeProperty(auto_now_add=True)
s = Story(title="The Three Little Pigs")
I don't understand why do they expect me to do like that? Why not introduce a constructor and use only instance variables?
The db.Model class is a 'Model' style class in classic Model View Controller design pattern.
Each of the assignments in there are actually setting up columns in the database, while also giving an easy to use interface for you to program with. This is why
title="The Three Little Pigs"
will update the object as well as the column in the database.
There is a constructor (no doubt in db.Model) that handles this pass-off logic, and it will take a keyword args list and digest it to create this relational model.
This is why the variables are setup the way they are, so that relation is maintained.
Edit: Let me describe that better. A normal class just sets up the blue print for an object. It has instance variables and class variables. Because of the inheritence to db.Model, this is actually doing a third thing: Setting up column definitions in a database. In order to do this third task it is making EXTENSIVE behinds the scenes changes to things like attribute setting and getting. Pretty much once you inherit from db.Model you aren't really a class anymore, but a DB template. Long story short, this is a VERY specific edge case of the use of a class
If all variables are declared as instance variables then the classes using Story class as superclass will inherit nothing from it.
From the Model and Property docs, it looks like Model has overridden __getattr__ and __setattr__ methods so that, in effect, "Story.title = ..." does not actually set the instance attribute; instead it sets the value stored with the instance's Property.
If you ask for story.__dict__['title'], what does it give you?
I do understand that a.var and A.var are different variables
First off: as of now, no, they aren't.
In Python, everything you declare inside the class block belongs to the class. You can look up attributes of the class via the instance, if the instance doesn't already have something with that name. When you assign to an attribute of an instance, the instance now has that attribute, regardless of whether it had one before. (__init__, in this regard, is just another function; it's called automatically by Python's machinery, but it simply adds attributes to an object, it doesn't magically specify some kind of template for the contents of all instances of the class - there's the magic __slots__ class attribute for that, but it still doesn't do quite what you might expect.)
But right now, a has no .var of its own, so a.var refers to A.var. And you can modify a class attribute via an instance - but note modify, not replace. This requires, of course, that the original value of the attribute is something modifiable - a list qualifies, a str doesn't.
Your GAE example, though, is something totally different. The class Story has attributes which specifically are "properties", which can do assorted magic when you "assign to" them. This works by using the class' __getattr__, __setattr__ etc. methods to change the behaviour of the assignment syntax.
The other answers have it mostly right, but miss one critical thing.
If you define a class like this:
class Foo(object):
a = 5
and an instance:
myinstance = Foo()
Then Foo.a and myinstance.a are the very same variable. Changing one will change the other, and if you create multiple instances of Foo, the .a property on each will be the same variable. This is because of the way Python resolves attribute access: First it looks in the object's dict, and if it doesn't find it there, it looks in the class's dict, and so forth.
That also helps explain why assignments don't work the way you'd expect given the shared nature of the variable:
>>> bar = Foo()
>>> baz = Foo()
>>> Foo.a = 6
>>> bar.a = 7
>>> bar.a
7
>>> baz.a
6
What happened here is that when we assigned to Foo.a, it modified the variable that all instance of Foo normally resolve when you ask for instance.a. But when we assigned to bar.a, Python created a new variable on that instance called a, which now masks the class variable - from now on, that particular instance will always see its own local value.
If you wanted each instance of your class to have a separate variable initialized to 5, the normal way to do it would be like this:
class Foo(object);
def __init__(self):
self.a = 5
That is, you define a class with a constructor that sets the a variable on the new instance to 5.
Finally, what App Engine is doing is an entirely different kind of black magic called descriptors. In short, Python allows objects to define special __get__ and __set__ methods. When an instance of a class that defines these special methods is attached to a class, and you create an instance of that class, attempts to access the attribute will, instead of setting or returning the instance or class variable, they call the special __get__ and __set__ methods. A much more comprehensive introduction to descriptors can be found here, but here's a simple demo:
class MultiplyDescriptor(object):
def __init__(self, multiplicand, initial=0):
self.multiplicand = multiplicand
self.value = initial
def __get__(self, obj, objtype):
if obj is None:
return self
return self.multiplicand * self.value
def __set__(self, obj, value):
self.value = value
Now you can do something like this:
class Foo(object):
a = MultiplyDescriptor(2)
bar = Foo()
bar.a = 10
print bar.a # Prints 20!
Descriptors are the secret sauce behind a surprising amount of the Python language. For instance, property is implemented using descriptors, as are methods, static and class methods, and a bunch of other stuff.
These class variables are metadata to Google App Engine generate their models.
FYI, in your example, a.var == A.var.
>>> class A:
... var = 0
...
... a = A()
... A.var = 3
... a.var == A.var
1: True