I've a class that has some callbacks and its own interface, something like:
class Service:
def __init__(self):
connect("service_resolved", self.service_resolved)
def service_resolved(self, a,b c):
''' This function is called when it's triggered
service resolved signal and has a lot of parameters'''
the connect function is for example the gtkwidget.connect, but I want that this connection is something more general, so I've decided to use a "twisted like" approach:
class MyService(Service):
def my_on_service_resolved(self, little_param):
''' it's a decorated version of srvice_resolved '''
def service_resolved(self,a,b,c):
super(MyService,self).service_resolved(a,b,c)
little_param = "something that's obtained from a,b,c"
self.my_on_service_resolved(little_param)
So I can use MyService by overriding my_on_service_resolved.
The problem is the "attributes" pollution. In the real implementation, Service has some attributes that can accidentally be overriden in MyService and those who subclass MyService.
How can I avoid attribute pollution?
What I've thought is a "wrapper" like approach but I don't know if it's a good solution:
class WrapperService():
def __init__(self):
self._service = service_resolved
# how to override self._service.service_resolved callback?
def my_on_service_resolved(self,param):
'''
'''
Avoiding accidental clashes with derived classes is the reason the "double-leading-underscore" naming approach exists: if you name an attribute in class Service __foo, the Python compiler will internally "mangle" that name to _Service__foo, making accidental clashes unlikely. Alas, not impossible: a subclass might also be named Service (and live in another module) and also have its own __foo attribute... which would be named-mangled exactly the same way, resulting in a conflict again. Ameliorations include naming base classes BaseService and the like, exploiting the fact that a derived class is very unlikely to be named Basewhatever. But, in the end, there's no alternative to clearly documenting a convention such as this (at least if subclasses are going to be written by programmers with little Python experience).
I don't think the "accidental clash" issue is enough to force you to forego subclassing entirely (in favor of using wrapping exclusively), which is basically the only way to avoid accidental name clashes for certain. The approach you call "twisted-like" (and is actually a case of the Template Method design pattern) is quite viable, and a mix of convention for naming, and documentation of your design choices, in real-life practice proves sufficient to avoid its "clash" risks.
Related
I'm coming from the Java world and reading Bruce Eckels' Python 3 Patterns, Recipes and Idioms.
While reading about classes, it goes on to say that in Python there is no need to declare instance variables. You just use them in the constructor, and boom, they are there.
So for example:
class Simple:
def __init__(self, s):
print("inside the simple constructor")
self.s = s
def show(self):
print(self.s)
def showMsg(self, msg):
print(msg + ':', self.show())
If that’s true, then any object of class Simple can just change the value of variable s outside of the class.
For example:
if __name__ == "__main__":
x = Simple("constructor argument")
x.s = "test15" # this changes the value
x.show()
x.showMsg("A message")
In Java, we have been taught about public/private/protected variables. Those keywords make sense because at times you want variables in a class to which no one outside the class has access to.
Why is that not required in Python?
It's cultural. In Python, you don't write to other classes' instance or class variables. In Java, nothing prevents you from doing the same if you really want to - after all, you can always edit the source of the class itself to achieve the same effect. Python drops that pretence of security and encourages programmers to be responsible. In practice, this works very nicely.
If you want to emulate private variables for some reason, you can always use the __ prefix from PEP 8. Python mangles the names of variables like __foo so that they're not easily visible to code outside the namespace that contains them (although you can get around it if you're determined enough, just like you can get around Java's protections if you work at it).
By the same convention, the _ prefix means _variable should be used internally in the class (or module) only, even if you're not technically prevented from accessing it from somewhere else. You don't play around with another class's variables that look like __foo or _bar.
Private variables in Python is more or less a hack: the interpreter intentionally renames the variable.
class A:
def __init__(self):
self.__var = 123
def printVar(self):
print self.__var
Now, if you try to access __var outside the class definition, it will fail:
>>> x = A()
>>> x.__var # this will return error: "A has no attribute __var"
>>> x.printVar() # this gives back 123
But you can easily get away with this:
>>> x.__dict__ # this will show everything that is contained in object x
# which in this case is something like {'_A__var' : 123}
>>> x._A__var = 456 # you now know the masked name of private variables
>>> x.printVar() # this gives back 456
You probably know that methods in OOP are invoked like this: x.printVar() => A.printVar(x). If A.printVar() can access some field in x, this field can also be accessed outside A.printVar()... After all, functions are created for reusability, and there isn't any special power given to the statements inside.
As correctly mentioned by many of the comments above, let's not forget the main goal of Access Modifiers: To help users of code understand what is supposed to change and what is supposed not to. When you see a private field you don't mess around with it. So it's mostly syntactic sugar which is easily achieved in Python by the _ and __.
Python does not have any private variables like C++ or Java does. You could access any member variable at any time if wanted, too. However, you don't need private variables in Python, because in Python it is not bad to expose your classes' member variables. If you have the need to encapsulate a member variable, you can do this by using "#property" later on without breaking existing client code.
In Python, the single underscore "_" is used to indicate that a method or variable is not considered as part of the public API of a class and that this part of the API could change between different versions. You can use these methods and variables, but your code could break, if you use a newer version of this class.
The double underscore "__" does not mean a "private variable". You use it to define variables which are "class local" and which can not be easily overridden by subclasses. It mangles the variables name.
For example:
class A(object):
def __init__(self):
self.__foobar = None # Will be automatically mangled to self._A__foobar
class B(A):
def __init__(self):
self.__foobar = 1 # Will be automatically mangled to self._B__foobar
self.__foobar's name is automatically mangled to self._A__foobar in class A. In class B it is mangled to self._B__foobar. So every subclass can define its own variable __foobar without overriding its parents variable(s). But nothing prevents you from accessing variables beginning with double underscores. However, name mangling prevents you from calling this variables /methods incidentally.
I strongly recommend you watch Raymond Hettinger's Python's class development toolkit from PyCon 2013, which gives a good example why and how you should use #property and "__"-instance variables.
If you have exposed public variables and you have the need to encapsulate them, then you can use #property. Therefore you can start with the simplest solution possible. You can leave member variables public unless you have a concrete reason to not do so. Here is an example:
class Distance:
def __init__(self, meter):
self.meter = meter
d = Distance(1.0)
print(d.meter)
# prints 1.0
class Distance:
def __init__(self, meter):
# Customer request: Distances must be stored in millimeters.
# Public available internals must be changed.
# This would break client code in C++.
# This is why you never expose public variables in C++ or Java.
# However, this is Python.
self.millimeter = meter * 1000
# In Python we have #property to the rescue.
#property
def meter(self):
return self.millimeter *0.001
#meter.setter
def meter(self, value):
self.millimeter = value * 1000
d = Distance(1.0)
print(d.meter)
# prints 1.0
There is a variation of private variables in the underscore convention.
In [5]: class Test(object):
...: def __private_method(self):
...: return "Boo"
...: def public_method(self):
...: return self.__private_method()
...:
In [6]: x = Test()
In [7]: x.public_method()
Out[7]: 'Boo'
In [8]: x.__private_method()
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-8-fa17ce05d8bc> in <module>()
----> 1 x.__private_method()
AttributeError: 'Test' object has no attribute '__private_method'
There are some subtle differences, but for the sake of programming pattern ideological purity, it's good enough.
There are examples out there of #private decorators that more closely implement the concept, but your mileage may vary. Arguably, one could also write a class definition that uses meta.
As mentioned earlier, you can indicate that a variable or method is private by prefixing it with an underscore. If you don't feel like this is enough, you can always use the property decorator. Here's an example:
class Foo:
def __init__(self, bar):
self._bar = bar
#property
def bar(self):
"""Getter for '_bar'."""
return self._bar
This way, someone or something that references bar is actually referencing the return value of the bar function rather than the variable itself, and therefore it can be accessed but not changed. However, if someone really wanted to, they could simply use _bar and assign a new value to it. There is no surefire way to prevent someone from accessing variables and methods that you wish to hide, as has been said repeatedly. However, using property is the clearest message you can send that a variable is not to be edited. property can also be used for more complex getter/setter/deleter access paths, as explained here: https://docs.python.org/3/library/functions.html#property
Python has limited support for private identifiers, through a feature that automatically prepends the class name to any identifiers starting with two underscores. This is transparent to the programmer, for the most part, but the net effect is that any variables named this way can be used as private variables.
See here for more on that.
In general, Python's implementation of object orientation is a bit primitive compared to other languages. But I enjoy this, actually. It's a very conceptually simple implementation and fits well with the dynamic style of the language.
The only time I ever use private variables is when I need to do other things when writing to or reading from the variable and as such I need to force the use of a setter and/or getter.
Again this goes to culture, as already stated. I've been working on projects where reading and writing other classes variables was free-for-all. When one implementation became deprecated it took a lot longer to identify all code paths that used that function. When use of setters and getters was forced, a debug statement could easily be written to identify that the deprecated method had been called and the code path that calls it.
When you are on a project where anyone can write an extension, notifying users about deprecated methods that are to disappear in a few releases hence is vital to keep module breakage at a minimum upon upgrades.
So my answer is; if you and your colleagues maintain a simple code set then protecting class variables is not always necessary. If you are writing an extensible system then it becomes imperative when changes to the core is made that needs to be caught by all extensions using the code.
"In java, we have been taught about public/private/protected variables"
"Why is that not required in python?"
For the same reason, it's not required in Java.
You're free to use -- or not use private and protected.
As a Python and Java programmer, I've found that private and protected are very, very important design concepts. But as a practical matter, in tens of thousands of lines of Java and Python, I've never actually used private or protected.
Why not?
Here's my question "protected from whom?"
Other programmers on my team? They have the source. What does protected mean when they can change it?
Other programmers on other teams? They work for the same company. They can -- with a phone call -- get the source.
Clients? It's work-for-hire programming (generally). The clients (generally) own the code.
So, who -- precisely -- am I protecting it from?
In Python 3, if you just want to "encapsulate" the class attributes, like in Java, you can just do the same thing like this:
class Simple:
def __init__(self, str):
print("inside the simple constructor")
self.__s = str
def show(self):
print(self.__s)
def showMsg(self, msg):
print(msg + ':', self.show())
To instantiate this do:
ss = Simple("lol")
ss.show()
Note that: print(ss.__s) will throw an error.
In practice, Python 3 will obfuscate the global attribute name. It is turning this like a "private" attribute, like in Java. The attribute's name is still global, but in an inaccessible way, like a private attribute in other languages.
But don't be afraid of it. It doesn't matter. It does the job too. ;)
Private and protected concepts are very important. But Python is just a tool for prototyping and rapid development with restricted resources available for development, and that is why some of the protection levels are not so strictly followed in Python. You can use "__" in a class member. It works properly, but it does not look good enough. Each access to such field contains these characters.
Also, you can notice that the Python OOP concept is not perfect. Smalltalk or Ruby are much closer to a pure OOP concept. Even C# or Java are closer.
Python is a very good tool. But it is a simplified OOP language. Syntactically and conceptually simplified. The main goal of Python's existence is to bring to developers the possibility to write easy readable code with a high abstraction level in a very fast manner.
Here's how I handle Python 3 class fields:
class MyClass:
def __init__(self, public_read_variable, private_variable):
self.public_read_variable_ = public_read_variable
self.__private_variable = private_variable
I access the __private_variable with two underscores only inside MyClass methods.
I do read access of the public_read_variable_ with one underscore
outside the class, but never modify the variable:
my_class = MyClass("public", "private")
print(my_class.public_read_variable_) # OK
my_class.public_read_variable_ = 'another value' # NOT OK, don't do that.
So I’m new to Python but I have a background in C# and JavaScript. Python feels like a mix of the two in terms of features. JavaScript also struggles in this area and the way around it here, is to create a closure. This prevents access to data you don’t want to expose by returning a different object.
def print_msg(msg):
# This is the outer enclosing function
def printer():
# This is the nested function
print(msg)
return printer # returns the nested function
# Now let's try calling this function.
# Output: Hello
another = print_msg("Hello")
another()
https://www.programiz.com/python-programming/closure
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Closures#emulating_private_methods_with_closures
About sources (to change the access rights and thus bypass language encapsulation like Java or C++):
You don't always have the sources and even if you do, the sources are managed by a system that only allows certain programmers to access a source (in a professional context). Often, every programmer is responsible for certain classes and therefore knows what he can and cannot do. The source manager also locks the sources being modified and of course, manages the access rights of programmers.
So I trust more in software than in human, by experience. So convention is good, but multiple protections are better, like access management (real private variable) + sources management.
I have been thinking about private class attributes and methods (named members in further reading) since I have started to develop a package that I want to publish. The thought behind it were never to make it impossible to overwrite these members, but to have a warning for those who touch them. I came up with a few solutions that might help. The first solution is used in one of my favorite Python books, Fluent Python.
Upsides of technique 1:
It is unlikely to be overwritten by accident.
It is easily understood and implemented.
Its easier to handle than leading double underscore for instance attributes.
*In the book the hash-symbol was used, but you could use integer converted to strings as well. In Python it is forbidden to use klass.1
class Technique1:
def __init__(self, name, value):
setattr(self, f'private#{name}', value)
setattr(self, f'1{name}', value)
Downsides of technique 1:
Methods are not easily protected with this technique though. It is possible.
Attribute lookups are just possible via getattr
Still no warning to the user
Another solution I came across was to write __setattr__. Pros:
It is easily implemented and understood
It works with methods
Lookup is not affected
The user gets a warning or error
class Demonstration:
def __init__(self):
self.a = 1
def method(self):
return None
def __setattr__(self, name, value):
if not getattr(self, name, None):
super().__setattr__(name, value)
else:
raise ValueError(f'Already reserved name: {name}')
d = Demonstration()
#d.a = 2
d.method = None
Cons:
You can still overwrite the class
To have variables not just constants, you need to map allowed input.
Subclasses can still overwrite methods
To prevent subclasses from overwriting methods you can use __init_subclass__:
class Demonstration:
__protected = ['method']
def method(self):
return None
def __init_subclass__(cls):
protected_methods = Demonstration.__protected
subclass_methods = dir(cls)
for i in protected_methods:
p = getattr(Demonstration,i)
j = getattr(cls, i)
if not p is j:
raise ValueError(f'Protected method "{i}" was touched')
You see, there are ways to protect your class members, but it isn't any guarantee that users don't overwrite them anyway. This should just give you some ideas. In the end, you could also use a meta class, but this might open up new dangers to encounter. The techniques used here are also very simple minded and you should definitely take a look at the documentation, you can find useful feature to this technique and customize them to your need.
I have a class that includes some auxiliary functions that do not operate on object data. Ordinarily I would leave these methods private, but I am in Python so there is no such thing. In testing, I am finding it slightly goofy to have to instantiate an instance of my class in order to be able to call these methods. Is there a solid theoretical reason to choose to keep these methods non-static or to make them static?
If a method does not need access to the current instance, you may want to make it either a classmethod, a staticmethod or a plain function.
A classmethod will get the current class as first param. This enable it to access the class attributes including other classmethods or staticmethods. This is the right choice if your method needs to call other classmethods or staticmethods.
A staticmethod only gets it's explicit argument - actually it nothing but a function that can be resolved on the class or instance. The main points of staticmethods are specialization - you can override a staticmethod in a child class and have a method (classmethod or instancemethod) of the base class call the overridden version of the staticmethod - and eventually ease of use - you don't need to import the function apart from the class (at this point it's bordering on lazyness but I've had a couple cases with dynamic imports etc where it happened to be handy).
A plain function is, well, just a plain function - no class-based dispatch, no inheritance, no fancy stuff. But if it's only a helper function used internally by a couple classes in the same module and not part of the classes nor module API, it's possibly just what you're looking for.
As a last note: you can have "some kind of" privacy in Python. Mostly, prefixing a name (whether an attribute / method / class or plain function) with a single leading underscore means "this is an implementation detail, it's NOT part of the API, you're not even supposed to know it exists, it might change or disappear without notice, so if you use it and your code breaks then it's your problem".
If you want to keep said methods in the class just for structural reasons, you might as well make them static, by using the #staticmethod decorator:
class Foo():
#staticmethod
def my_static_method(*args, **kwargs):
....
Your first argument will not be interpretted as the object itself, and you can use it from either the class or an object from that class. If you still need to access class attributes in your method though, you can make it a class method:
class Bar():
counter = 0
#classmethod
def my_class_method(cls, *args, **kwargs):
cls.counter += 1
...
Your first argument of the class method will obviously be the class instead of the instance.
If you do not use any class or instance attribute, I can see no "theoretical" reason to not make them static. Some IDE's even highlight this as a soft warning to prompt you to make the method static if it does not use or mutate any class/instance attribute.
I have a large program with multiple classes.
class Dog(object):
def __init__(self):
self.food = 10
def print_dog_food(self):
print(self.food)
class Cat(object):
def __init__(self):
self.food = 5
def print_cat_food(self):
print(self.food)
They are inherited by other classes:
class AllAnimals(Cat, Dog):
def __init__(self):
Cat.__init__(self)
Dog.__init__(self)
Some variables will be having similar names. I am afraid of accidentally overriding an existing variable and creating nasty bugs:
# Wrong result.
AllAnimals().print_cat_food() # prints 10
AllAnimals().print_dog_food() # prints 10
Keeping track of all those variables to avoid accidents seems impractical, so I was thinking of using mangling:
def __init__(self):
self.__food = ......
and get what i expect:
# Correct result.
AllAnimals().print_cat_food() # prints 5
AllAnimals().print_dog_food() # prints 10
Mangling solves my problem, but reading this highly upvoted answer stating that:
If you want to use it eventually, you can, but it is neither usual nor
recommended.
and the "This is culture" section of that answer makes me sceptical.
Questions:
- Should I use mangling as in the example?
- If not, what are my alternatives to avoid accidental overriding of variables?
Using mangling to produce 'private' attributes is neither usual nor recommended, because there is no such thing as truly private attributes in Python.
Using mangling for what it was designed to do, reduce the likelyhood of subclasses clashing on attribute names, is absolutely pythonic and fine.
From the documentation:
Names in this category, when used within the context of a class definition, are re-written to use a mangled form to help avoid name clashes between “private” attributes of base and derived classes.
Emphasis mine.
However, the use-case is pretty rare; there are not many situations where you want the attribute to not be accidentally overridden. But if you are building base classes that are going to be widely subclassed (in a large project or as part of a framework or library intended for a wide range of uses), then using double-underscore names for implementation details is perfectly fine.
It is unusual, because what you're doing here is unusual. However, given your requirements, I would say that using name-mangled variables is a perfectly appropriate thing to do.
I have the sense that this must be kind of a dumb question—nub here. So I'm open to an answer of the sort "This is ass-backwards, don't do it, please try this: [proper way]".
I'm using Python 2.7.5.
General Form of the Problem
This causes an infinite loop unless Thesaurus (an app-wide singleton) does not call Baseclass.__init__()
class Baseclass():
def __init__(self):
thes = Thesaurus()
#do stuff
class Thesaurus(Baseclass):
def __init__(self):
Baseclass.__init__(self)
#do stuff
My Specific Case
I have a base class that virtually every other class in my app extends (just some basic conventions for functionality within the app; perhaps should just be an interface). This base class is meant to house a singleton of a Thesaurus class that grants some flexibility with user input by inferring some synonyms (ie. {'yes':'yep', 'ok'}).
But since the subclass calls the superclass's __init__(), which in turn creates another subclass, loops ensue. Not calling the superclass's __init__() works just fine, but I'm concerned that's merely a lucky coincidence, and that my Thesaurus class may eventually be modified to require it's parent __init__().
Advice?
Well, I'm stopping to look at your code, and I'll just base my answer on what you say:
I have a base class that virtually every other class in my app extends (just some basic conventions for functionality within the app; perhaps should just be an interface).
this would be ThesaurusBase in the code below
This base class is meant to house a singleton of a Thesaurus class that grants some flexibility with user input by inferring some synonyms (ie. {'yes':'yep', 'ok'}).
That would be ThesaurusSingleton, that you can call with a better name and make it actually useful.
class ThesaurusBase():
def __init__(self, singleton=None):
self.singleton = singleton
def mymethod1(self):
raise NotImplementedError
def mymethod2(self):
raise NotImplementedError
class ThesaurusSingleton(ThesaurusBase):
def mymethod1(self):
return "meaw!"
class Thesaurus(TheraususBase):
def __init__(self, singleton=None):
TheraususBase.__init__(self, singleton)
def mymethod1(self):
return "quack!"
def mymethod2(self):
return "\\_o<"
now you can create your objects as follows:
singleton = ThesaurusSingleton()
thesaurus = Thesaurus(singleton)
edit:
Basically, what I've done here is build a "Base" class that is just an interface defining an expected behavior for all its children classes. The class ThesaurusSingleton (I know that's a terrible name) is also implementing that interface, because you said it had too and I did not want to discuss your design, you may always have good reasons for weird constraints.
And finally, do you really need to instantiate your singleton inside the class that is defining the singleton object? Though there may be some hackish way to do so, there's often a better design that avoids the "hackish" part.
What I think is that however you create your singleton, you should better do it explicitly. That's in the "Zen of python": explicit is better than implicit. Why? because then people reading your code (and that might be you in six months) will be able to understand what's happening and what you were thinking when you wrote that code. If you try to make things more implicit (like using sophisticated meta classes and weird self-inheritance) you may wonder what this code does in less than three weeks!
I'm not telling to avoid that kind of options, but to only use sophisticated stuff when you're out of simple ones!
Based on what you said I think the solution I gave can be a starting point. But as you focus on some obscure, yet not very useful hackish stuff instead of talking about your design, I can't be sure if my example is that appropriate, and hint you on the design.
edit2:
There's an another way to achieve what you say you want (but be sure that's really the design you want). You may want to use a class method that will act on the class itself (instead of the instances) and thus enable you to store a class-wide instance of itself:
>>> class ThesaurusBase:
... #classmethod
... def initClassWide(cls):
... cls._shared = cls()
...
>>> class T(ThesaurusBase):
... def foo(self):
... print self._shared
...
>>> ThesaurusBase.initClassWide()
>>> t = T()
>>> t.foo()
<__main__.ThesaurusBase instance at 0x7ff299a7def0>
and you can call the initClassWide method at the module level of where you declare ThesaurusBase, so whenever you import that module, it will have the singleton loaded (the import mechanism ensuring that python modules are run only once).
the short answer is:
do not instantiate an instance of a sub class from the super class constructor
longer answer:
if the motive you have to try to do this is the fact the Thesaurus is a singleton then you'll be better off exposing the singleton using a static method in the class (Thesaurus) and calling this method when you need the singleton
Python classes have no concept of public/private, so we are told to not touch something that starts with an underscore unless we created it. But does this not require complete knowledge of all classes from which we inherit, directly or indirectly? Witness:
class Base(object):
def __init__(self):
super(Base, self).__init__()
self._foo = 0
def foo(self):
return self._foo + 1
class Sub(Base):
def __init__(self):
super(Sub, self).__init__()
self._foo = None
Sub().foo()
Expectedly, a TypeError is raised when None + 1 is evaluated. So I have to know that _foo exists in the base class. To get around this, __foo can be used instead, which solves the problem by mangling the name. This seems to be, if not elegant, an acceptable solution. However, what happens if Base inherits from a class (in a separate package) called Sub? Now __foo in my Sub overrides __foo in the grandparent Sub.
This implies that I have to know the entire inheritance chain, including all "private" objects each uses. The fact that Python is dynamically-typed makes this even harder, since there are no declarations to search for. The worst part, however, is probably the fact Base might inherit from object right now, but in some future release, it switches to inheriting from Sub. Clearly if I know Sub is inherited from, I can rename my class, however annoying that is. But I can't see into the future.
Is this not a case where a true private data type would prevent a problem? How, in Python, can I be sure that I'm not accidentally stepping on somebody's toes if those toes might spring into existence at some point in the future?
EDIT: I've apparently not made clear the primary question. I'm familiar with name mangling and the difference between a single and a double underscore. The question is: how do I deal with the fact that I might clash with classes whose existence I don't know of right now? If my parent class (which is in a package I did not write) happens to start inheriting from a class with the same name as my class, even name mangling won't help. Am I wrong in seeing this as a (corner) case that true private members would solve, but that Python has trouble with?
EDIT: As requested, the following is a full example:
File parent.py:
class Sub(object):
def __init__(self):
self.__foo = 12
def foo(self):
return self.__foo + 1
class Base(Sub):
pass
File sub.py:
import parent
class Sub(parent.Base):
def __init__(self):
super(Sub, self).__init__()
self.__foo = None
Sub().foo()
The grandparent's foo is called, but my __foo is used.
Obviously you wouldn't write code like this yourself, but parent could easily be provided by a third party, the details of which could change at any time.
Use private names (instead of protected ones), starting with a double underscore:
class Sub(Base):
def __init__(self):
super(Sub, self).__init__()
self.__foo = None
# ^^
will not conflict with _foo or __foo in Base. This is because Python replaces the double underscore with a single underscore and the name of the class; the following two lines are equivalent:
class Sub(Base):
def x(self):
self.__foo = None # .. is the same as ..
self._Sub__foo = None
(In response to the edit:) The chance that two classes in a class hierarchy not only have the same name, but that they are both using the same property name, and are both using the private mangled (__) form is so minuscule that it can be safely ignored in practice (I for one haven't heard of a single case so far).
In theory, however, you are correct in that in order to formally verify correctness of a program, one most know the entire inheritance chain. Luckily, formal verification usually requires a fixed set of libraries in any case.
This is in the spirit of the Zen of Python, which includes
practicality beats purity.
Name mangling includes the class so your Base.__foo and Sub.__foo will have different names. This was the entire reason for adding the name mangling feature to Python in the first place. One will be _Base__foo, the other _Sub__foo.
Many people prefer to use composition (has-a) instead of inheritance (is-a) for some of these very reasons.
This implies that I have to know the entire inheritance chain. . .
Yes, you should know the entire inheritance chain, or the docs for the object you are directly sub-classing should tell you what you need to know.
Subclassing is an advanced feature, and should be treated with care.
A good example of docs specifying what should be overridden in a subclass is the threading class:
This class represents an activity that is run in a separate thread of control. There are two ways to specify the activity: by passing a callable object to the constructor, or by overriding the run() method in a subclass. No other methods (except for the constructor) should be overridden in a subclass. In other words, only override the __init__() and run() methods of this class.
How often do you modify base classes in inheritance chains to introduce inheritance from a class with the same name as a subclass further down the chain???
Less flippantly, yes, you have to know the code you are working with. You certainly have to know the public names being used, after all. Python being python, discovering the public names in use by your ancestor classes takes pretty much the same effort as discovering the private ones.
In years of Python programming, I have never found this to be much of an issue in practice. When you're naming instance variables, you should have a pretty good idea whether (a) a name is generic enough that it's likely to be used in other contexts and (b) the class you're writing is likely to be involved in an inheritance hierarchy with other unknown classes. In such cases, you think a bit more carefully about the names you're using; self.value isn't a great idea for an attribute name, and neither is something like Adaptor a great class name.
In contrast, I have run into difficulties with the overuse of double-underscore names a number of times. Python being Python, even "private" names tend to be accessed by code defined outside the class. You might think that it would always be bad practice to let an external function access "private" attributes, but what about things like getattr and hasattr? The invocation of them can be in the class's own code, so the class is still controlling all access to the private attributes, but they still don't work without you doing the name-mangling manually. If Python had actually-enforced private variables you couldn't use functions like those on them at all. These days I tend to reserve double-underscore names for cases when I'm writing something very generic like a decorator, metaclass, or mixin that needs to add a "secret attribute" to the instances of the (unknown) classes it's applied to.
And of course there's the standard dynamic language argument: the reality is that you have to test your code thoroughly to have much justification in making the claim "my software works". Such testing will be very unlikely to miss the bugs caused by accidentally clashing names. If you are not doing that testing, then many more uncaught bugs will be introduced by other means than by accidental name clashes.
In summation, the lack of private variables is just not that big a deal in idiomatic Python code in practice, and the addition of true private variables would cause more frequent problems in other ways IMHO.
Mangling happens with double underscores. Single underscores are more of a "please don't".
You don't need to know all the details of all parent classes (note that deep inheritance is usually best avoided), because you can still dir() and help() and any other form of introspection you can come up with.
As noted, you can use name mangling. However, you can stick with a single underscore (or none!) if you document your code adequately - you should not have so many private variables that this proves to be a problem. Just say if a method relies on a private variable, and add either the variable, or the name of the method to the class docstring to alert users.
Further, if you create unit tests, you should create tests that check invariants on members, and accordingly these should be able to show up such name clashes.
If you really want to have "private" variables, and for whatever reason name-mangling doesn't meet your needs, you can factor your private state into another object:
class Foo(object):
class Stateholder(object): pass
def __init__(self):
self._state = Stateholder()
self.state.private = 1