I have a Python method with the following signature:
def basic_sizer(self, ctrl):
where ctrl can be any wxPython control derived from wx.Control. Is there a standard Python annotation to indicate this, other than either
def basic_sizer(self, ctrl: wx.Control):
or
def basic_sizer(self, ctrl: Union[wx.SpinCtrl, wx.BitmapButton, <other possible controls>]):
I have tried
def basic_sizer(self, ctrl: Type[wx.Control]):
as suggested here. This approach is also presented in the official documentation, but PyCharm does not accept it and flags a mismatched type. I do not want to use a PyCharm-specific hack, even if one is available. Rather, I am interested in whether the Python typing module provides a generic approach for this situation.
Abstraction
You have some base class SomeBase. You want to write and annotate a function foo that takes an argument arg. That argument arg can be an instance of SomeBase or of any subclass of SomeBase. This is how you write that:
def foo(arg: SomeBase):
...
Say now there are classes DerivedA and DerivedB that both inherit from SomeBase and you realize that arg should actually only ever be an instance of any of those subclasses and not be of the type SomeBase (directly). Here is how you write that:
def foo(arg: DerivedA | DerivedB):
...
Or in Python <3.10:
from typing import Union
def foo(arg: Union[DerivedA, DerivedB]):
...
To my knowledge, there is currently no way to annotate that arg should be an instance of any subclass of SomeBase but not of the class SomeBase itself.
Concrete
I am not familiar with wxPython, but you stated that you want the argument ctrl to
be any wxPython control derived from wx.Control.
According to the documentation, wx.Control is in fact a class. Your statement is still ambiguous as to whether the ctrl argument should be assumed to be any instance of wx.Control. If so, you would write:
def basic_sizer(self, ctrl: wx.Control):
...
If you want to restrict it to specific subclasses, you use the Union.
But this is wrong:
def basic_sizer(self, ctrl: Type[wx.Control]):
...
That would state that ctrl must be a class (as opposed to an instance of a class), namely wx.Control or any subclass of it. Unless of course that is in fact what you want... Again, your statement is ambiguous.
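To illustrate the difference, here is a minimal sketch using the SomeBase/DerivedA names from the abstraction above rather than wxPython itself:
from typing import Type

class SomeBase: ...
class DerivedA(SomeBase): ...

def takes_instance(arg: SomeBase) -> None:
    ...   # accepts an instance of SomeBase or of any subclass

def takes_class(arg: Type[SomeBase]) -> None:
    ...   # accepts the class object itself (SomeBase or a subclass), not an instance

takes_instance(DerivedA())   # OK: an instance
takes_class(DerivedA)        # OK: the class itself
takes_class(DerivedA())      # flagged by a type checker: instance passed where a class is expected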
Mismatched types
Possible reasons for PyCharm complaining about "mismatched types" include:
You are calling the method basic_sizer providing an argument for ctrl that is not actually an instance of wx.Control.
wxPython messed up big time in their typing.
PyCharm has a bug in its static type checker.
If you provide the code that produces the PyCharm complaint and the specific message by PyCharm, we can sort this out.
PS:
If the PyCharm complaint arises in some other place because you assume that ctrl has certain attributes that it may not have, that would probably indicate that you actually need it to be an instance of specific subclasses. There are multiple ways to handle this, depending on the situation.
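For example, a minimal sketch of narrowing inside the method with isinstance (the wx.SpinCtrl/SetRange call is just one illustration of a subclass-specific attribute):
import wx

def basic_sizer(self, ctrl: wx.Control) -> None:
    # Narrow the type before touching subclass-specific attributes;
    # static checkers understand isinstance checks.
    if isinstance(ctrl, wx.SpinCtrl):
        ctrl.SetRange(0, 100)
    # ... layout code that only needs the common wx.Control interface ...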
Consider the following code :
class parent_print():
    def print_word(self):
        self.print_hello()

class child_print(parent_print):
    def print_hello(self):
        print('Hello')

basic_print = child_print()
basic_print.print_word()
Here I am assuming that print_hello() is a virtual function in the parent class, and that all children of the parent (parent_print) will implement a method named print_hello, so that all children can call a single function, print_word, and the appropriate binding to print_hello is done based on the child's implementation.
But in languages like C++ we need to explicitly declare that a function is virtual in the parent class with the virtual keyword (and use the ~ symbol for the destructor).
But how is this possible in Python when there is no mention in the parent class of print_hello() being a virtual function?
I am assuming that the concept Python uses here is virtual functions; if I am wrong, please correct me and explain the concept.
Rather than thinking of this as being a matter of "virtual functions", it might be more useful to think of this as an example of "duck typing". In this expression:
self.print_hello()
we simply access the print_hello attribute of self (whatever it might be), and then call it. If it doesn't have such an attribute, an AttributeError is raised at runtime. If the attribute isn't callable, a TypeError is raised. That's all there is to it. No assumptions are made about the type of self -- we simply ask it to "quack like a duck" and see if it can.
The danger of duck typing is that it's very easy to accidentally ask something to quack that does not in fact know how to quack -- if you instantiate a parent_print and call print_word on it, it will fail:
abstract_print = parent_print()
abstract_print.print_word()
# raises AttributeError: 'parent_print' object has no attribute 'print_hello'
Python does have support for static type declarations and the concept of abstract classes, which can help you avoid mistakes. For example, if we run a static type checker (mypy) on your code as-is, we'll get an error:
test.py:4: error: "parent_print" has no attribute "print_hello"
which is exactly correct -- from a static typing perspective, it's not valid to call print_hello since we haven't established that all parent_print instances have such an attribute.
To fix this, we can declare print_hello as an abstract method (I'll also clean up the names to match standard Python conventions, in the interest of building good habits):
from abc import ABC, abstractmethod

class ParentPrint(ABC):
    @abstractmethod
    def print_hello(self) -> None: ...

    def print_word(self) -> None:
        self.print_hello()

class ChildPrint(ParentPrint):
    def print_hello(self) -> None:
        print('Hello')

basic_print = ChildPrint()
basic_print.print_word()
Now the code typechecks with no issues.
The @abstractmethod decorator also indicates that the class is abstract ("pure virtual") and can't be instantiated. If we attempt to create a ParentPrint, or any subclass of it that doesn't provide an implementation of the abstract method, we get an error, both statically from mypy and at runtime in the form of a TypeError that is raised as soon as you try to instantiate the object (before you even try to call the abstract method):
abstract_print = ParentPrint()
# raises error: Cannot instantiate abstract class "ParentPrint" with abstract attribute "print_hello"
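The same applies to a subclass that fails to provide an implementation; a minimal sketch (the IncompletePrint name is just for illustration):
class IncompletePrint(ParentPrint):
    # Does not implement print_hello, so the class is still abstract.
    pass

incomplete = IncompletePrint()
# raises TypeError: Can't instantiate abstract class IncompletePrint with abstract method print_hello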
I get that a metaclass can be substituted for type and define how a newly created class behaves.
ex:
class NoMixedCase(type):
    def __new__(cls, clsname, base, clsdict):
        for name in clsdict:
            if name.lower() != name:
                raise TypeError("Bad name. Don't mix case!")
        return super().__new__(cls, clsname, base, clsdict)

class Root(metaclass=NoMixedCase):
    pass

class B(Root):
    def Foo(self):  # raises TypeError
        pass
However, is there a way of setting NoMixedCase globally, so that any time a new class is created its behavior is defined by NoMixedCase by default, without having to inherit from Root?
So if you did...
class B:
    def Foo(self):
        pass
...it would still check case on method names.
As for your question: no, it is not ordinarily possible, and probably not even with some extraordinary trick, because a lot of CPython's internals are tied to the type class and hardcoded to it.
What you could try, without crashing the interpreter right away, would be to write a wrapper for type.__new__ and use ctypes to patch it directly into the type.__new__ slot (ordinary assignment won't do it). You'd probably still crash things.
So, in real life, if you decide not to go via a linter program with a plug-in and commit hooks as I suggested in the comment above, the way to go is to have a Base class that uses your metaclass, and get everyone in your project to inherit from that Base.
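A minimal sketch of that pattern, reusing the NoMixedCase metaclass defined in the question (ProjectBase is just a placeholder name):
class ProjectBase(metaclass=NoMixedCase):
    """Every class in the project inherits from this instead of naming the metaclass directly."""
    pass

class Widget(ProjectBase):
    def do_work(self):  # all lowercase: accepted
        pass

# Defining a class with a mixed-case method name fails at class-creation time:
try:
    class BadWidget(ProjectBase):
        def DoWork(self):
            pass
except TypeError as exc:
    print(exc)  # Bad name. Don't mix case!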
I have a specific problem closely related to PyCharm (Community 3.1.1). The following simple example illustrates this. I will use the screenshot of PyCharm rather than type the code, for reasons that will be clear shortly.
As you can see, the call to self.say_hello() is highlighted in yellow by PyCharm, and presumably this is because say_hello() is not implemented in the Base class. The fact that say_hello() is not implemented in the base class is intentional on my part, because I want a kind of "abstract" effect, so that an instance of Base cannot call say_hello() (and therefore shouldn't call hello()), but that an instance of Child can call hello() (implemented in the Base class). How do I get this "abstract" effect without PyCharm complaining?
As I learned from here, I could use the abc module. But that, to me, would be rather cumbersome and not very Pythonic. What are your recommendations?
I would implement say_hello() as a stub:
class Base(object):
    # ...as above...
    def say_hello(self):
        raise NotImplementedError
Alternatively, put only pass in the body of say_hello().
This would also signal to the user of your Base class that say_hello() should be implemented before she gets an AttributeError when calling obj.hello().
Whether to raise an Exception or to pass depends on whether doing nothing is sensible default behaviour. If you require the user to supply her own method, raise an exception.
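Since the original code is only shown in a screenshot, here is a hedged reconstruction of what the stub looks like in context (the hello method is assumed from the question's description):
class Base(object):
    def hello(self):
        # Implemented in Base; relies on subclasses providing say_hello().
        self.say_hello()

    def say_hello(self):
        raise NotImplementedError  # stub: subclasses must override this

class Child(Base):
    def say_hello(self):
        print("Hello!")

Child().hello()   # prints "Hello!"
# Base().hello()  # would raise NotImplementedError instead of a confusing AttributeError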
Python classes have no concept of public/private, so we are told to not touch something that starts with an underscore unless we created it. But does this not require complete knowledge of all classes from which we inherit, directly or indirectly? Witness:
class Base(object):
    def __init__(self):
        super(Base, self).__init__()
        self._foo = 0

    def foo(self):
        return self._foo + 1

class Sub(Base):
    def __init__(self):
        super(Sub, self).__init__()
        self._foo = None

Sub().foo()
Expectedly, a TypeError is raised when None + 1 is evaluated. So I have to know that _foo exists in the base class. To get around this, __foo can be used instead, which solves the problem by mangling the name. This seems to be, if not elegant, an acceptable solution. However, what happens if Base inherits from a class (in a separate package) called Sub? Now __foo in my Sub overrides __foo in the grandparent Sub.
This implies that I have to know the entire inheritance chain, including all "private" objects each uses. The fact that Python is dynamically-typed makes this even harder, since there are no declarations to search for. The worst part, however, is probably the fact Base might inherit from object right now, but in some future release, it switches to inheriting from Sub. Clearly if I know Sub is inherited from, I can rename my class, however annoying that is. But I can't see into the future.
Is this not a case where a true private data type would prevent a problem? How, in Python, can I be sure that I'm not accidentally stepping on somebody's toes if those toes might spring into existence at some point in the future?
EDIT: I've apparently not made clear the primary question. I'm familiar with name mangling and the difference between a single and a double underscore. The question is: how do I deal with the fact that I might clash with classes whose existence I don't know of right now? If my parent class (which is in a package I did not write) happens to start inheriting from a class with the same name as my class, even name mangling won't help. Am I wrong in seeing this as a (corner) case that true private members would solve, but that Python has trouble with?
EDIT: As requested, the following is a full example:
File parent.py:
class Sub(object):
    def __init__(self):
        self.__foo = 12

    def foo(self):
        return self.__foo + 1

class Base(Sub):
    pass
File sub.py:
import parent
class Sub(parent.Base):
    def __init__(self):
        super(Sub, self).__init__()
        self.__foo = None

Sub().foo()
The grandparent's foo is called, but my __foo is used.
Obviously you wouldn't write code like this yourself, but parent could easily be provided by a third party, the details of which could change at any time.
Use private names (instead of protected ones), starting with a double underscore:
class Sub(Base):
    def __init__(self):
        super(Sub, self).__init__()
        self.__foo = None
        #    ^^
will not conflict with _foo or __foo in Base. This is because Python prefixes the name with an underscore and the name of the class; the following two lines are equivalent:
class Sub(Base):
    def x(self):
        self.__foo = None       # .. is the same as ..
        self._Sub__foo = None
(In response to the edit:) The chance that two classes in a class hierarchy not only have the same name, but that they are both using the same property name, and are both using the private mangled (__) form is so minuscule that it can be safely ignored in practice (I for one haven't heard of a single case so far).
In theory, however, you are correct in that in order to formally verify correctness of a program, one must know the entire inheritance chain. Luckily, formal verification usually requires a fixed set of libraries in any case.
This is in the spirit of the Zen of Python, which includes
practicality beats purity.
Name mangling includes the class so your Base.__foo and Sub.__foo will have different names. This was the entire reason for adding the name mangling feature to Python in the first place. One will be _Base__foo, the other _Sub__foo.
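A quick sketch showing the two mangled names side by side (the Base and Sub names mirror the example above):
class Base(object):
    def __init__(self):
        self.__foo = 0          # stored as _Base__foo

class Sub(Base):
    def __init__(self):
        super(Sub, self).__init__()
        self.__foo = None       # stored as _Sub__foo

print(Sub().__dict__)           # {'_Base__foo': 0, '_Sub__foo': None}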
Many people prefer to use composition (has-a) instead of inheritance (is-a) for some of these very reasons.
This implies that I have to know the entire inheritance chain. . .
Yes, you should know the entire inheritance chain, or the docs for the object you are directly sub-classing should tell you what you need to know.
Subclassing is an advanced feature, and should be treated with care.
A good example of docs specifying what should be overridden in a subclass is the threading.Thread class:
This class represents an activity that is run in a separate thread of control. There are two ways to specify the activity: by passing a callable object to the constructor, or by overriding the run() method in a subclass. No other methods (except for the constructor) should be overridden in a subclass. In other words, only override the __init__() and run() methods of this class.
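A minimal sketch of that convention (Worker and worker_name are illustrative names):
import threading

class Worker(threading.Thread):
    def __init__(self, name):
        super().__init__()      # per the docs, the base constructor must be invoked first
        self.worker_name = name

    def run(self):              # the only other method meant to be overridden
        print(self.worker_name, "running in its own thread")

w = Worker("worker-1")
w.start()
w.join()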
How often do you modify base classes in inheritance chains to introduce inheritance from a class with the same name as a subclass further down the chain???
Less flippantly, yes, you have to know the code you are working with. You certainly have to know the public names being used, after all. Python being python, discovering the public names in use by your ancestor classes takes pretty much the same effort as discovering the private ones.
In years of Python programming, I have never found this to be much of an issue in practice. When you're naming instance variables, you should have a pretty good idea whether (a) a name is generic enough that it's likely to be used in other contexts and (b) the class you're writing is likely to be involved in an inheritance hierarchy with other unknown classes. In such cases, you think a bit more carefully about the names you're using; self.value isn't a great idea for an attribute name, and neither is something like Adaptor a great class name.
In contrast, I have run into difficulties with the overuse of double-underscore names a number of times. Python being Python, even "private" names tend to be accessed by code defined outside the class. You might think that it would always be bad practice to let an external function access "private" attributes, but what about things like getattr and hasattr? The invocation of them can be in the class's own code, so the class is still controlling all access to the private attributes, but they still don't work without you doing the name-mangling manually. If Python had actually-enforced private variables you couldn't use functions like those on them at all. These days I tend to reserve double-underscore names for cases when I'm writing something very generic like a decorator, metaclass, or mixin that needs to add a "secret attribute" to the instances of the (unknown) classes it's applied to.
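A small sketch of the getattr wrinkle mentioned above (names are illustrative): name mangling applies to identifiers in the class body, not to string literals, so you have to spell the mangled name out yourself:
class Config(object):
    def __init__(self):
        self.__secret = 42                    # stored as _Config__secret

    def read_secret_dynamically(self):
        # getattr(self, '__secret') would raise AttributeError here,
        # because string literals are not mangled, only identifiers are.
        return getattr(self, '_Config__secret')

print(Config().read_secret_dynamically())    # 42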
And of course there's the standard dynamic language argument: the reality is that you have to test your code thoroughly to have much justification in making the claim "my software works". Such testing will be very unlikely to miss the bugs caused by accidentally clashing names. If you are not doing that testing, then many more uncaught bugs will be introduced by other means than by accidental name clashes.
In summation, the lack of private variables is just not that big a deal in idiomatic Python code in practice, and the addition of true private variables would cause more frequent problems in other ways IMHO.
Mangling happens with double underscores. Single underscores are more of a "please don't".
You don't need to know all the details of all parent classes (note that deep inheritance is usually best avoided), because you can still dir() and help() and any other form of introspection you can come up with.
As noted, you can use name mangling. However, you can stick with a single underscore (or none!) if you document your code adequately; you should not have so many private variables that this proves to be a problem. Just note when a method relies on a private variable, and add either the variable or the method's name to the class docstring to alert users.
Further, if you create unit tests, you should create tests that check invariants on members, and accordingly these should be able to show up such name clashes.
If you really want to have "private" variables, and for whatever reason name-mangling doesn't meet your needs, you can factor your private state into another object:
class Foo(object):
    class Stateholder(object):
        pass

    def __init__(self):
        self._state = self.Stateholder()
        self._state.private = 1
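Usage then looks something like this (a sketch; the private attribute name is just a placeholder):
foo = Foo()
print(foo._state.private)   # 1; all "private" state lives on the nested holder object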