Subclassing a class with private members

Subclassing a class with private members - python

One of the really nice things about python is the simplicity with which you can name variables that have the same name as the accessor:
self.__value = 1
def value():
return self.__value
Is there a simple way of providing access to the private members of a class that I wish to subclass? Often I wish to simply work with the raw data objects inside of a class without having to use accessors and mutators all the time.
I know this seems to go against the general idea of private and public, but usually the class I am trying to subclass is one of my own which I am quite happy to expose the members from to a subclass but not to an instance of that class. Is there a clean way of providing this distinction?

Not conveniently, without further breaking encapsulation. The double-underscore attribute is name-mangled by prepending '_ClassName' for the class it is being accessed in. So, if you have a 'ContainerThing' class that has a '__value' attribute, the attribute is actually being stored as '_ContainerThing__value'. Changing the class name (or refactoring where the attribute is assigned to) would mean breaking all subclasses that try to access that attribute.
This is exactly why the double-underscore name-mangling (which is not really "private", just "inconvenient") is a bad idea to use. Just use a single leading underscore. Everyone will know not to touch your 'private' attribute and you will still be able to access it in subclasses and other situations where it's darned handy. The name-mangling of double-underscore attributes is useful only to avoid name-clashes for attributes that are truly specific to a particular class, which is extremely rare. It provides no extra 'security' since even the name-mangled attributes are trivially accessible.
For the record, '__value' and 'value' (and '_value') are not the same name. The underscores are part of the name.

"I know this seems to go against the general idea of private and public" Not really "against", just different from C++ and Java.
Private -- as implemented in C++ and Java is not a very useful concept. It helps, sometimes, to isolate implementation details. But it is way overused.
Python names beginning with two __ are special and you should not, as a normal thing, be defining attributes with names like this. Names with __ are special and part of the implementation. And exposed for your use.
Names beginning with one _ are "private". Sometimes they are concealed, a little. Most of the time, the "consenting adults" rule applies -- don't use them foolishly, they're subject to change without notice.
We put "private" in quotes because it's just an agreement between you and your users. You've marked things with _. Your users (and yourself) should honor that.
Often, we have method function names with a leading _ to indicate that we consider them to be "private" and subject to change without notice.
The endless getters and setters that Java requires aren't as often used in Python. Python introspection is more flexible, you have access to an object's internal dictionary of attribute values, and you have first class functions like getattr() and setattr().
Further, you have the property() function which is often used to bind getters and setters to a single name that behaves like a simple attribute, but is actually well-defined method function calls.

Not sure of where to cite it from, but the following statement in regard to access protection is Pythonic canon: "We're all consenting adults here".
Just as Thomas Wouters has stated, a single leading underscore is the idiomatic way of marking an attribute as being a part of the object's internal state. Two underscores just provides name mangling to prevent easy access to the attribute.
After that, you should just expect that the client of your library won't go and shoot themselves in the foot by meddling with the "private" attributes.

Related

Python get and set methods versus #property decorator

I am exploring decorators in Python, and as a person who came to Python from other languages, I am a bit confused about the purpose of #property and its #xxx.setter brother. In Java and C++ get_xxx() and set_xxx() are usually the way to organize encapsulation. In Python we have these two decorators, which require specific syntax, and name matching in order to work. How is #property better than get-set methods?
I have checked this post and still, what are the advantages of #property besides the availability of the += operator?

The best part of using property for an attribute is that you don't need it.
The philosophy in Python is that classes attributes and methods are all public, but by convention - when you prefix their name with a single "_"
The mechanism behing "property", the descriptor protocol, allows one to change a previous dumb plain attribute into an instrumented attribute, guarded with code for the getter and setter, if the system evolves to a situation where it is needed.
But by default, a name attribute in a class, is just a plain attribute. You do person.name = "Name"- no guards needed, no setting method needed nor recommended. When and if it one needs a guard for that (say, to capitalize the name, or filter on improper words), whatever code uses that attribute needs no change: with the use of property, attribute assignment still takes place with the "=" operator.
Other than that, if using "=" does not look prettier than person.set_name("Name") for you, I think it does for most people. Of course, that is subjective.

Python private method for public usage

I have a class A that need to implement a method meth().
Now, I don't want this method to be called by the end-user of my package. Thus, I have to make this method private (i.e. _meth(). I know that it's not really, private, but conventions matter.)
The problem though is that I have yet another class B in my package that has to call that method _meth(). Problem is that I now get the warning method that say that B tries to access a protected method of a class. Thus, I have to make the method public, i.e. without the leading underscore. This contradicts my intentions.
What is the most pythonic way to solve this dilemma?
I know I can re-implement that method outside of A, but it will lead to code duplication and, as meth() uses private attributes of A, will lead to the same problem.
Inheriting from a single metaclass is not an option as those classes have entirely different purposes and that will be contributing towards a ghastly mess.

The fact that pylint/your editor/whatever external tool gives you a warning doesn't prevent code execution. I don't know about your editor but pylint warnings can be disabled on a case-by-case basis using special comments (nb: "case by case" meaning: "do not warn me for this line or block", not "totally disable this warning").
And it's perfectly ok for your own code to access protected attributes and methods in the same package - the "_protected" naming convention does not mean "None shall pass", just "are you sure you understand what you're doing and willing to take responsability if you break something ?". Since you're the author/maintainer of the package and those are intra-package access you are obviously entitled to take this responsability ;)

The "most pythonic way" would be to not care about private and protected, as these concepts do not exist in Python.
Everything is public. Adding a underscore in the name does not make it private, it just indicates the method is for internal use in the class (not to prevent usage by some end-user).
If you need to use the method from another class, it shows that you're not using classes and objects correctly, and you probably come from a different language like Java where classes are used to group methods together in some namespace.
Just move the function to the module level (outside the class), as you're not using the object (self) anyway.

How to represent protected methods in Python classes?

Reading this question on method ordering, I thought about where to put protected methods and whether they should be private _method(self) or public method(self) in Python. I know that Python doesn't provide a language feature for protected methods.
Private: By convention, attributes starting with an underscore are private. They can still normally be accessed from the outside but should not. Starting protected methods with an underscore feels weird since it is unclear that the subclass actually overrides the method rather than declaring its own implementation detail.
Public: Without the underscore, it is more likely that someone would take a look at the base class to see whether the method is already there. Thus this is nicer for people who subclass. However, people who want to use the subclass don't know that the method is just an implementation detail and might try to call it from the outside.
What is the preferred way to define protected methods in Python?

Just use names starting with a single underscore.
A protected method is a implementation detail that you want to share with subclasses, so such methods are not part of the public API. Anything not part of the public API is best named with an initial underscore.
In other words, 'protected' should be treated just the same as 'private'. Protected methods only need to exist in a language with a strict privacy model where making such implementation details private would preclude sharing such methods with subclasses. Python has no such problem.
Whatever you do, do not use a leading double underscore; such names are considered class private and are namespaced to the class that defines them (they are renamed by the compiler by prefixing _ClassName in front), to ensure that subclasses don't accidentally overwrite them.

How do I create a package protected constructor in Python?

I'd like to create a factory pattern in Python, where one class has some configuration, and knows how to build another class' object (or several classes) on demand. To make this complete, I would like to prevent the created class from being created outside of the factory. In Java, I would put both in the same package, and make the class' constructor package protected.
For regular method names or variables, one can follow the Python convention and use single or double underscores ("_foo" or "__foo"). Is there a way to do something like that for a constructor?
Thank you

You can't. The Python mentality is often summed up as "we're all grown-ups here"; that is, you can't stop people calling methods, changing attributes, instantiating classes, and so on. Instead, you should make an obvious way to construct an instance of your class and then assume that it will be used.

Don't bother, it's not the Python way.
The preferred solution is to simply document which constructor or factory method clients are supposed to call, and not worry too much about public/private (which doesn't mean much in Python anyway; everything is essentially public-in-code.)

The convention in Python is to prefix the name of internal things (members or classes) with an underscore. There is no way to enforce limited access, but the underscore serves as a signal that "you shouldn't be touching this".
From the python tutorial:
“Private” instance variables that cannot be accessed except from inside an object don’t exist in Python. However, there is a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice.

Based on a comment from Wim, one can name the class of the object to be created starting with a single or double underscore. This way it is clear that the constructor is private, and should not be called directly.

What is your convention to distinguish between object methods to be called by the outside, and object methods to be called by a subclass?

I know most of the ins and outs of Python's approach to private variables/members/functions/...
However, I can't make my mind up on how to distinguish between methods for external use or subclassing use.
Consider the following example:
class EventMixin(object):
def subscribe(self, **kwargs):
'''kwargs should be a dict of event -> callable, to be specialized in the subclass'''
def event(self, name, *args, **kwargs):
...
def _somePrivateMethod(self):
...
In this example, I want to make it clear that subscribe is a method to be used by external users of the class/object, while event is a method that should not be called from the outside, but rather by subclass implementations.
Right now, I consider both part of the public API, hence don't use any underscores. However, for this particular situation, it would feel cleaner to, for example, use no underscores for the external API, one underscore for the subclassable API, and two underscores for the private/internal API. However, that would become unwieldy because then the internal API would need to be invoked as
self._EventMixin__somePrivateMethod()
So, what are your conventions, coding-wise, documentationwise, or otherwise ?

use no underscores for the external API,
one underscore for the subclassable API,
and two underscores for the private/internal API
This is a reasonable and relatively common way of doing it, yes. The double-underline-for-actually-private (as opposed to ‘protected’ in C++ terms) is in practice pretty rare. You never really know what behaviours a subclass might want to override, so assuming ‘protected’ is generally a good bet unless there's a really good reason why messing with a member might be particularly dangerous.
However, that would become unwieldy because then the internal API would
need to be invoked as self._EventMixin__somePrivateMethod()
Nope, you can just use the double-underlined version and it will be munged automatically. It's ugly but it works.

I generally find using double __ to be more trouble that they are worth, as it makes unit testing very painful. using single _ as convention for methods/attributes that are not intended to be part of the public interface of a particular class/module is my preferred approach.

I'd like to make the suggestion that when you find yourself encountering this kind of distinction, it may be a good idea to consider using composition instead of inheritance; in other words, instantiating EventMixin (presumably the name would change) instead of inheriting it.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Subclassing a class with private members - python

Related

Python get and set methods versus #property decorator

Python private method for public usage

How to represent protected methods in Python classes?

How do I create a package protected constructor in Python?

What is your convention to distinguish between object methods to be called by the outside, and object methods to be called by a subclass?

Categories

Resources