Python private method for public usage - python

I have a class A that need to implement a method meth().
Now, I don't want this method to be called by the end-user of my package. Thus, I have to make this method private (i.e. _meth(). I know that it's not really, private, but conventions matter.)
The problem though is that I have yet another class B in my package that has to call that method _meth(). Problem is that I now get the warning method that say that B tries to access a protected method of a class. Thus, I have to make the method public, i.e. without the leading underscore. This contradicts my intentions.
What is the most pythonic way to solve this dilemma?
I know I can re-implement that method outside of A, but it will lead to code duplication and, as meth() uses private attributes of A, will lead to the same problem.
Inheriting from a single metaclass is not an option as those classes have entirely different purposes and that will be contributing towards a ghastly mess.

The fact that pylint/your editor/whatever external tool gives you a warning doesn't prevent code execution. I don't know about your editor but pylint warnings can be disabled on a case-by-case basis using special comments (nb: "case by case" meaning: "do not warn me for this line or block", not "totally disable this warning").
And it's perfectly ok for your own code to access protected attributes and methods in the same package - the "_protected" naming convention does not mean "None shall pass", just "are you sure you understand what you're doing and willing to take responsability if you break something ?". Since you're the author/maintainer of the package and those are intra-package access you are obviously entitled to take this responsability ;)

The "most pythonic way" would be to not care about private and protected, as these concepts do not exist in Python.
Everything is public. Adding a underscore in the name does not make it private, it just indicates the method is for internal use in the class (not to prevent usage by some end-user).
If you need to use the method from another class, it shows that you're not using classes and objects correctly, and you probably come from a different language like Java where classes are used to group methods together in some namespace.
Just move the function to the module level (outside the class), as you're not using the object (self) anyway.

Related

Advantages of using static methods over instance methods in python

My IDE keeps suggesting I convert my instance methods to static methods. I guess because I haven't referenced any self within these methods.
An example is :
class NotificationViewSet(NSViewSet):
def pre_create_processing(self, request, obj):
log.debug(" creating messages ")
# Ensure data is consistent and belongs to the sending bot.
obj['user_id'] = request.auth.owner.id
obj['bot_id'] = request.auth.id
So my question would be: do I lose anything by just ignoring the IDE suggestions, or is there more to it?
This is a matter of workflow, intentions with your design, and also a somewhat subjective decision.
First of all, you are right, your IDE suggests converting the method to a static method because the method does not use the instance. It is most likely a good idea to follow this suggestion, but you might have a few reasons to ignore it.
Possible reasons to ignore it:
The code is soon to be changed to use the instance (on the other hand, the idea of soon is subjective, so be careful)
The code is legacy and not entirely understood/known
The interface is used in a polymorphic/duck typed way (e.g. you have a collection of objects with this method and you want to call them in a uniform way, but the implementation in this class happens to not need to use the instance - which is a bit of a code smell)
The interface is specified externally and cannot be changed (this is analog to the previous reason)
The AST of the code is read/manipulated either by itself or something that uses it and expects this method to be an instance method (this again is an external dependency on the interface)
I'm sure there can be more, but failing these types of reasons I would follow the suggestion. However, if the method does not belong to the class (e.g. factory method or something similar), I would refactor it to not be part of the class.
I think that you might be mixing up some terminology - the example is not a class method. Class methods receive the class as the first argument, they do not receive the instance. In this case you have a normal instance method that is not using its instance.
If the method does not belong in the class, you can move it out of the class and make it a standard function. Otherwise, if it should be bundled as part of the class, e.g. it's a factory function, then you should probably make it a static method as this (at a minimum) serves as useful documentation to users of your class that the method is coupled to the class, but not dependent on it's state.
Making the method static also has the advantage this it can be overridden in subclasses of the class. If the method was moved outside of the class as a regular function then subclassing is not possible.

Duck typing: how to avoid name collisions?

I think understand the idea of duck typing, and would like to use it more often in my code. However, I am concerned about one potential problem: name collision.
Suppose I want an object to do something. I know the appropriate method, so I simply call it and see what happens. In general, there are three possible outcomes:
The method is not found and AttributeError exception is raised. This indicates that the object isn't what I think it is. That's fine, since with duck typing I'm either catching such an exception, or I am willing to let the outer scope deal with it (or let the program terminate).
The method is found, it does precisely what I want, and everything is great.
The method is found, but it's not the method that I want; it's a same-name method from an entirely unrelated class. The execution continues, until either inconsistent state is detected later, or, in the worst case, the program silently produces incorrect output.
Now, I can see how good quality names can reduce the chances of outcome #3. But projects are combined, code is reused, libraries are swapped, and it's quite possible that at some point two methods have the same name and are completely unrelated (i.e., they are not intended to substitute for each other in a polymorphism).
One solution I was thinking about is to add a registry of method names. Each registry record would contain:
method name (unique; i.e., only one record per name)
its generalized description (i.e., applicable to any instance it might be called on)
the set of classes which it is intended to be used in
If a method is added to a new class, the class needs to be added to the registry (by hand). At that time, the programmer would presumably notice if the method is not consistent with the meaning already attached to it, and if necessary, use another name.
Whenever a method is called, the program would automatically verify that the name is in the registry and the class of the instance is one of the classes in the record. If not, an exception would be raised.
I understand this is a very heavy approach, but in some cases where precision is critical, I can see it might be useful. Has it been tried (in Python or other dynamically typed languages)? Are there any tools that do something similar? Are there any other approaches worth considering?
Note: I'm not referring to name clashes at the global level, where avoiding namespace pollution would be the right approach. I'm referring to clashes at the method names; these are not affected by namespaces.
Well, if this is critical, you probably should not be using duck typing...
In practice, programs are finite systems, and the range of possible types passed into any particular routine does not cause the issues you are worrying about (most often there's only ever one type passed in).
But if you want address this issue anyway, python provides ABCs (abstract base classes). these allow you to associated a "type" with any set of methods and so would work something like the registry you suggest (you can either inherit from an ABC in the normal way, or simply "register" with it).
You can then check for these types manually or automate the checking with decorators from pytyp.
But, despite being the author of pytyp, and finding these questions interesting, I personally do not find such an approach useful. In practice, what you are worrying about simply does not happen (if you want to worry about something, focus on the lack of documentation from types when using higher order functions!).
PS note - ABCs are purely metadata. They do not enforce anything. Also, checking with pytyp decorators is horrendously inefficient - you really want to do this only where it is critical.
If you are following good programming practice or let me rather say if your code is Pythoic then chances are you would seldom face such issues. Refer the FAQ What are the “best practices” for using import in a module?.
It is generally not advised to clutter the namespace and the only time when there could be a conflict if you are trying to reuse the Python reserved names and or standard libraries or name conflicts with module name. But if you encounter conflict as such then there is a serious issue with the code. For example
Why would someone name a variable as list or define a function called len?
Why would someone name a variable difflib when s/he is intending to import it in the current namespace?
To address your problem, look at abstract base classes. They're a Pythonic way to deal with this issue; you can define common behavior in a base class, and even define ways to determine if a particular object is a "virtual baseclass" of the abstract base class. This somewhat mimics the registry you're describing, without requiring all classes know about the registry beforehand.
In practice, though, this issue doesn't crop up as often as you might expect. Objects with an __iter__ method or a __str__ method are simply broken if the methods don't work the way you expect. Likewise, if you say an argument to your function requires a .callback() method defined on it, people are going to do the right thing.
If you are worried that the lack of static type checking will let some bugs get through, the answer isn't to bolt on type checking, it is to write tests.
In the presence of unit tests, a type checking system becomes largely redundant as a means of catching bugs. While it is true that a type checking system can catch some bugs, it will only catch a small subset of potential bugs. To catch the rest you'll need tests. Those unit tests will necessarily catch most of the type errors that a type checking system would have caught, as well as bugs that the type checking system cannot catch.

How do I create a package protected constructor in Python?

I'd like to create a factory pattern in Python, where one class has some configuration, and knows how to build another class' object (or several classes) on demand. To make this complete, I would like to prevent the created class from being created outside of the factory. In Java, I would put both in the same package, and make the class' constructor package protected.
For regular method names or variables, one can follow the Python convention and use single or double underscores ("_foo" or "__foo"). Is there a way to do something like that for a constructor?
Thank you
You can't. The Python mentality is often summed up as "we're all grown-ups here"; that is, you can't stop people calling methods, changing attributes, instantiating classes, and so on. Instead, you should make an obvious way to construct an instance of your class and then assume that it will be used.
Don't bother, it's not the Python way.
The preferred solution is to simply document which constructor or factory method clients are supposed to call, and not worry too much about public/private (which doesn't mean much in Python anyway; everything is essentially public-in-code.)
The convention in Python is to prefix the name of internal things (members or classes) with an underscore. There is no way to enforce limited access, but the underscore serves as a signal that "you shouldn't be touching this".
From the python tutorial:
“Private” instance variables that cannot be accessed except from inside an object don’t exist in Python. However, there is a convention that is followed by most Python code: a name prefixed with an underscore (e.g. _spam) should be treated as a non-public part of the API (whether it is a function, a method or a data member). It should be considered an implementation detail and subject to change without notice.
Based on a comment from Wim, one can name the class of the object to be created starting with a single or double underscore. This way it is clear that the constructor is private, and should not be called directly.

What is your convention to distinguish between object methods to be called by the outside, and object methods to be called by a subclass?

I know most of the ins and outs of Python's approach to private variables/members/functions/...
However, I can't make my mind up on how to distinguish between methods for external use or subclassing use.
Consider the following example:
class EventMixin(object):
def subscribe(self, **kwargs):
'''kwargs should be a dict of event -> callable, to be specialized in the subclass'''
def event(self, name, *args, **kwargs):
...
def _somePrivateMethod(self):
...
In this example, I want to make it clear that subscribe is a method to be used by external users of the class/object, while event is a method that should not be called from the outside, but rather by subclass implementations.
Right now, I consider both part of the public API, hence don't use any underscores. However, for this particular situation, it would feel cleaner to, for example, use no underscores for the external API, one underscore for the subclassable API, and two underscores for the private/internal API. However, that would become unwieldy because then the internal API would need to be invoked as
self._EventMixin__somePrivateMethod()
So, what are your conventions, coding-wise, documentationwise, or otherwise ?
use no underscores for the external API,
one underscore for the subclassable API,
and two underscores for the private/internal API
This is a reasonable and relatively common way of doing it, yes. The double-underline-for-actually-private (as opposed to ‘protected’ in C++ terms) is in practice pretty rare. You never really know what behaviours a subclass might want to override, so assuming ‘protected’ is generally a good bet unless there's a really good reason why messing with a member might be particularly dangerous.
However, that would become unwieldy because then the internal API would
need to be invoked as self._EventMixin__somePrivateMethod()
Nope, you can just use the double-underlined version and it will be munged automatically. It's ugly but it works.
I generally find using double __ to be more trouble that they are worth, as it makes unit testing very painful. using single _ as convention for methods/attributes that are not intended to be part of the public interface of a particular class/module is my preferred approach.
I'd like to make the suggestion that when you find yourself encountering this kind of distinction, it may be a good idea to consider using composition instead of inheritance; in other words, instantiating EventMixin (presumably the name would change) instead of inheriting it.

Subclassing a class with private members

One of the really nice things about python is the simplicity with which you can name variables that have the same name as the accessor:
self.__value = 1
def value():
return self.__value
Is there a simple way of providing access to the private members of a class that I wish to subclass? Often I wish to simply work with the raw data objects inside of a class without having to use accessors and mutators all the time.
I know this seems to go against the general idea of private and public, but usually the class I am trying to subclass is one of my own which I am quite happy to expose the members from to a subclass but not to an instance of that class. Is there a clean way of providing this distinction?
Not conveniently, without further breaking encapsulation. The double-underscore attribute is name-mangled by prepending '_ClassName' for the class it is being accessed in. So, if you have a 'ContainerThing' class that has a '__value' attribute, the attribute is actually being stored as '_ContainerThing__value'. Changing the class name (or refactoring where the attribute is assigned to) would mean breaking all subclasses that try to access that attribute.
This is exactly why the double-underscore name-mangling (which is not really "private", just "inconvenient") is a bad idea to use. Just use a single leading underscore. Everyone will know not to touch your 'private' attribute and you will still be able to access it in subclasses and other situations where it's darned handy. The name-mangling of double-underscore attributes is useful only to avoid name-clashes for attributes that are truly specific to a particular class, which is extremely rare. It provides no extra 'security' since even the name-mangled attributes are trivially accessible.
For the record, '__value' and 'value' (and '_value') are not the same name. The underscores are part of the name.
"I know this seems to go against the general idea of private and public" Not really "against", just different from C++ and Java.
Private -- as implemented in C++ and Java is not a very useful concept. It helps, sometimes, to isolate implementation details. But it is way overused.
Python names beginning with two __ are special and you should not, as a normal thing, be defining attributes with names like this. Names with __ are special and part of the implementation. And exposed for your use.
Names beginning with one _ are "private". Sometimes they are concealed, a little. Most of the time, the "consenting adults" rule applies -- don't use them foolishly, they're subject to change without notice.
We put "private" in quotes because it's just an agreement between you and your users. You've marked things with _. Your users (and yourself) should honor that.
Often, we have method function names with a leading _ to indicate that we consider them to be "private" and subject to change without notice.
The endless getters and setters that Java requires aren't as often used in Python. Python introspection is more flexible, you have access to an object's internal dictionary of attribute values, and you have first class functions like getattr() and setattr().
Further, you have the property() function which is often used to bind getters and setters to a single name that behaves like a simple attribute, but is actually well-defined method function calls.
Not sure of where to cite it from, but the following statement in regard to access protection is Pythonic canon: "We're all consenting adults here".
Just as Thomas Wouters has stated, a single leading underscore is the idiomatic way of marking an attribute as being a part of the object's internal state. Two underscores just provides name mangling to prevent easy access to the attribute.
After that, you should just expect that the client of your library won't go and shoot themselves in the foot by meddling with the "private" attributes.

Categories

Resources