How to represent protected methods in Python classes?

Reading this question on method ordering, I thought about where to put protected methods and whether they should be private _method(self) or public method(self) in Python. I know that Python doesn't provide a language feature for protected methods.
Private: By convention, attributes starting with an underscore are private. They can still normally be accessed from the outside but should not. Starting protected methods with an underscore feels weird since it is unclear that the subclass actually overrides the method rather than declaring its own implementation detail.
Public: Without the underscore, it is more likely that someone would take a look at the base class to see whether the method is already there. Thus this is nicer for people who subclass. However, people who want to use the subclass don't know that the method is just an implementation detail and might try to call it from the outside.
What is the preferred way to define protected methods in Python?

Just use names starting with a single underscore.
A protected method is an implementation detail that you want to share with subclasses, so such methods are not part of the public API. Anything not part of the public API is best named with an initial underscore.
In other words, 'protected' should be treated just the same as 'private'. Protected methods only need to exist in a language with a strict privacy model where making such implementation details private would preclude sharing such methods with subclasses. Python has no such problem.
Whatever you do, do not use a leading double underscore; such names are considered class private and are namespaced to the class that defines them (the compiler renames them by prefixing _ClassName), to ensure that subclasses don't accidentally overwrite them.
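As a minimal sketch of this convention (the class and method names are hypothetical, chosen only for illustration), a base class can expose one public method that delegates to an underscore-prefixed hook, and a subclass overrides that hook:

```python
class Report:
    """Base class; `_format_body` is an internal hook shared with subclasses."""

    def render(self):
        # Public API: callers are expected to use only this method.
        return "== Report ==\n" + self._format_body()

    def _format_body(self):
        # Non-public hook; subclasses may override it.
        return "(empty)"


class SalesReport(Report):
    def _format_body(self):
        # A single leading underscore is not name-mangled,
        # so this really does override the inherited hook.
        return "sales: 42"


print(SalesReport().render())
```

The underscore tells users of SalesReport that _format_body is not meant to be called from the outside, while subclass authors can still find and override it.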

Related

can a static class member be private?

Can a static member be private in a python class?
What is the best practice for calling a getter?
If not, is it a bad practice and why?
In python, nothing really is private.
Best practice for anything considered private is to prefix the member with an underscore; see What is the meaning of a single and a double underscore before an object name?
If you like a getter/setter based approach, check out the Python property decorator. However, in my experience, you would typically just leave such members public (not prefixed with an underscore). The typical use case I would use a property for is some calculated attribute (i.e. a method call disguised as an attribute).
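A small sketch of that use case, with a hypothetical Rectangle class: the stored values stay plain public attributes, and only the calculated value is a property.

```python
class Rectangle:
    def __init__(self, width, height):
        # Plain public attributes; no getters or setters needed.
        self.width = width
        self.height = height

    @property
    def area(self):
        # Computed on each access, but read like an attribute.
        return self.width * self.height


r = Rectangle(3, 4)
print(r.area)  # 12
r.width = 5
print(r.area)  # 20
```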

Python private method for public usage

I have a class A that needs to implement a method meth().
Now, I don't want this method to be called by the end-user of my package. Thus, I have to make this method private, i.e. _meth(). (I know that it's not really private, but conventions matter.)
The problem, though, is that I have yet another class B in my package that has to call that method _meth(). I now get a warning saying that B tries to access a protected method of a class. Thus, I have to make the method public, i.e. without the leading underscore. This contradicts my intentions.
What is the most pythonic way to solve this dilemma?
I know I can re-implement that method outside of A, but it will lead to code duplication and, as meth() uses private attributes of A, will lead to the same problem.
Inheriting from a single metaclass is not an option as those classes have entirely different purposes and that will be contributing towards a ghastly mess.
The fact that pylint/your editor/whatever external tool gives you a warning doesn't prevent code execution. I don't know about your editor but pylint warnings can be disabled on a case-by-case basis using special comments (nb: "case by case" meaning: "do not warn me for this line or block", not "totally disable this warning").
And it's perfectly ok for your own code to access protected attributes and methods in the same package - the "_protected" naming convention does not mean "None shall pass", just "are you sure you understand what you're doing and willing to take responsibility if you break something?". Since you're the author/maintainer of the package and those are intra-package accesses, you are obviously entitled to take this responsibility ;)
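For illustration, and assuming the warning comes from pylint's protected-access check (the question does not say which tool reports it), the per-line suppression could look like this. The bodies of A and B are made up for the example:

```python
class A:
    def __init__(self, value):
        self._value = value

    def _meth(self):
        # Non-public: not meant for end users of the package.
        return self._value * 2


class B:
    def __init__(self, a):
        self._a = a

    def total(self):
        # Intra-package access is deliberate, so the check is
        # silenced for this one line only.
        return self._a._meth() + 1  # pylint: disable=protected-access


print(B(A(10)).total())  # 21
```

The comment affects only the line it sits on, so other accidental uses of protected members are still reported.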
The "most pythonic way" would be to not care about private and protected, as these concepts do not exist in Python.
Everything is public. Adding an underscore to the name does not make it private; it just indicates that the method is for internal use in the class (it does not prevent usage by some end-user).
If you need to use the method from another class, it shows that you're not using classes and objects correctly, and you probably come from a different language like Java where classes are used to group methods together in some namespace.
Just move the function to the module level (outside the class), as you're not using the object (self) anyway.

Python attribute scope best practices [duplicate]

A very experienced engineer has told me to NEVER use double underscores when defining methods and variables inside a class, because they are reserved for magic methods, and to use only a single underscore. I understand that double underscores make attributes private to the class, and a single underscore makes them protected. I also understand that protected attributes are just a mutual understanding between developers. I find it hard to believe that private attributes should never be used; if so, why was the concept created in the first place? So my questions are:
Is it really bad practice to use double underscores even when it makes sense to make attributes non public?
Since protected attributes are "not really protected", wouldn't it make sense to just make them private, because that would lead to fewer mistakes?
Here are some corrections to your statements that will hopefully clarify what you are asking about:
Magic methods and attributes are prefixed and suffixed by a double underscore. A double underscore only in the prefix is specifically to make things private.
Attributes defined in a class that are only prefixed with a double underscore (and do not also end with one) get their name mangled to make them more private; this applies in Python 2 as well as Python 3. You will be unable to access them outside the class using the literal name. This can cause issues outside of classes, so do not use a double-underscore prefix for, say, module-level attributes: How to access private variable of Python module from class. However, do use them in classes to make things private. If the feature was not intended to be used, it would not have been added to Python.
As far as privacy and protection goes in general, there is no such concept in Python. It is just an expectation that object oriented programmers have coming in from other languages, so there is an established convention for marking attributes as private.
The single underscore prefix is generally the preferred way to mark things as private because it does not mangle the name, leaving privacy at the discretion of the API's user. This sort of privacy/protection is really more of a way to indicate that the attribute is an implementation detail that may change in future versions. There is nothing stopping you from using the attribute, especially if you are OK with your code breaking when it is linked against different versions of libraries.
Keep in mind that even mangled names follow a fixed pattern. The mangling is intended more to prevent you from accidentally overriding something you didn't intend to than to make attributes truly private. It just prefixes your attribute name with an underscore and the class name (__attr defined in class C becomes _C__attr), so you can still access it directly if you know how.
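A short demonstration of the mangling pattern, using a hypothetical Account class:

```python
class Account:
    def __init__(self):
        self.__balance = 100   # stored as _Account__balance
        self._owner = "alice"  # stored unchanged


acct = Account()
print(sorted(vars(acct)))           # ['_Account__balance', '_owner']
print(acct._Account__balance)       # 100
print(hasattr(acct, "__balance"))   # False
```

The mangled attribute is still an ordinary entry in the instance dictionary; only the spelling differs from what was written inside the class body.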
Here is a good description of pretty much everything I just wrote from the docs: https://docs.python.org/2/tutorial/classes.html#private-variables-and-class-local-references

How do I create a package protected constructor in Python?

I'd like to create a factory pattern in Python, where one class has some configuration, and knows how to build another class' object (or several classes) on demand. To make this complete, I would like to prevent the created class from being created outside of the factory. In Java, I would put both in the same package, and make the class' constructor package protected.
For regular method names or variables, one can follow the Python convention and use single or double underscores ("_foo" or "__foo"). Is there a way to do something like that for a constructor?
Thank you
You can't. The Python mentality is often summed up as "we're all grown-ups here"; that is, you can't stop people calling methods, changing attributes, instantiating classes, and so on. Instead, you should make an obvious way to construct an instance of your class and then assume that it will be used.
Don't bother, it's not the Python way.
The preferred solution is to simply document which constructor or factory method clients are supposed to call, and not worry too much about public/private (which doesn't mean much in Python anyway; everything is essentially public-in-code.)
The convention in Python is to prefix the name of internal things (members or classes) with an underscore. There is no way to enforce limited access, but the underscore serves as a signal that "you shouldn't be touching this".
From the python tutorial:
“Private” instance variables that cannot be accessed except from inside an object don’t exist in Python. However, there is a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice.
Based on a comment from Wim, one can name the class of the object to be created starting with a single or double underscore. This way it is clear that the constructor is private, and should not be called directly.

Subclassing a class with private members

One of the really nice things about python is the simplicity with which you can name variables that have the same name as the accessor:
class ContainerThing:
    def __init__(self):
        self.__value = 1  # name-mangled to _ContainerThing__value
    def value(self):
        return self.__value
Is there a simple way of providing access to the private members of a class that I wish to subclass? Often I wish to simply work with the raw data objects inside of a class without having to use accessors and mutators all the time.
I know this seems to go against the general idea of private and public, but usually the class I am trying to subclass is one of my own which I am quite happy to expose the members from to a subclass but not to an instance of that class. Is there a clean way of providing this distinction?
Not conveniently, without further breaking encapsulation. The double-underscore attribute is name-mangled by prepending '_ClassName' for the class it is being accessed in. So, if you have a 'ContainerThing' class that has a '__value' attribute, the attribute is actually being stored as '_ContainerThing__value'. Changing the class name (or refactoring where the attribute is assigned to) would mean breaking all subclasses that try to access that attribute.
This is exactly why the double-underscore name-mangling (which is not really "private", just "inconvenient") is a bad idea to use. Just use a single leading underscore. Everyone will know not to touch your 'private' attribute and you will still be able to access it in subclasses and other situations where it's darned handy. The name-mangling of double-underscore attributes is useful only to avoid name-clashes for attributes that are truly specific to a particular class, which is extremely rare. It provides no extra 'security' since even the name-mangled attributes are trivially accessible.
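The difference shows up as soon as a subclass tries to read both kinds of attribute. The following sketch uses hypothetical Base and Child classes:

```python
class Base:
    def __init__(self):
        self.__hidden = 1   # stored as _Base__hidden
        self._shared = 2    # stored as _shared


class Child(Base):
    def read_shared(self):
        return self._shared

    def read_hidden(self):
        # Inside Child this is compiled as self._Child__hidden,
        # which was never assigned.
        return self.__hidden


c = Child()
print(c.read_shared())  # 2
try:
    c.read_hidden()
except AttributeError as exc:
    print("AttributeError:", exc)
```

The subclass could still reach the value as self._Base__hidden, but that hard-codes the base class name, which is the fragility described above.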
For the record, '__value' and 'value' (and '_value') are not the same name. The underscores are part of the name.
"I know this seems to go against the general idea of private and public" Not really "against", just different from C++ and Java.
Private -- as implemented in C++ and Java is not a very useful concept. It helps, sometimes, to isolate implementation details. But it is way overused.
Python names that begin and end with two underscores (such as __init__) are special, and you should not, as a normal thing, be defining attributes with names like this. Such names are part of the implementation, and are exposed for your use.
Names beginning with one _ are "private". Sometimes they are concealed, a little. Most of the time, the "consenting adults" rule applies -- don't use them foolishly, they're subject to change without notice.
We put "private" in quotes because it's just an agreement between you and your users. You've marked things with _. Your users (and yourself) should honor that.
Often, we have method function names with a leading _ to indicate that we consider them to be "private" and subject to change without notice.
The endless getters and setters that Java requires aren't as often used in Python. Python introspection is more flexible, you have access to an object's internal dictionary of attribute values, and you have first class functions like getattr() and setattr().
Further, you have the property() function, which is often used to bind getters and setters to a single name that behaves like a simple attribute but is actually backed by well-defined method calls.
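As a sketch of that binding (the Thermostat class and its validation rule are made up for the example), property() can attach a getter and a setter to one attribute-like name:

```python
class Thermostat:
    def __init__(self, celsius):
        self._celsius = celsius

    def _get_celsius(self):
        return self._celsius

    def _set_celsius(self, value):
        if value < -273.15:
            raise ValueError("below absolute zero")
        self._celsius = value

    # property() binds both functions to one attribute-like name.
    celsius = property(_get_celsius, _set_celsius)


t = Thermostat(20)
t.celsius = 25       # calls _set_celsius
print(t.celsius)     # 25
```

Callers keep writing t.celsius as if it were a plain attribute, so validation can be added later without changing the public interface.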
Not sure of where to cite it from, but the following statement in regard to access protection is Pythonic canon: "We're all consenting adults here".
Just as Thomas Wouters has stated, a single leading underscore is the idiomatic way of marking an attribute as being a part of the object's internal state. Two underscores just provide name mangling to prevent easy access to the attribute.
After that, you should just expect that the client of your library won't go and shoot themselves in the foot by meddling with the "private" attributes.
