How to tell if a class is abstract in Python 3? - python

I wrote a metaclass that automatically registers its classes in a dict at runtime. In order for it to work properly, it must be able to ignore abstract classes.
The code works really well in Python 2, but I've run into a wall trying to make it compatible with Python 3.
Here's what the code looks like currently:
from abc import ABCMeta

def AutoRegister(registry, base_type=ABCMeta):
    class _metaclass(base_type):
        def __init__(self, what, bases=None, attrs=None):
            super(_metaclass, self).__init__(what, bases, attrs)
            # Do not register abstract classes.
            # Note that we do not use `inspect.isabstract` here, as
            # that only detects classes with unimplemented abstract
            # methods - which is a valid approach, but not what we
            # want here.
            # :see: http://stackoverflow.com/a/14410942/
            metaclass = attrs.get('__metaclass__')
            if not (metaclass and issubclass(metaclass, ABCMeta)):
                registry.register(self)
    return _metaclass
Usage in Python 2 looks like this:
# Abstract classes; these are not registered.
class BaseWidget(object): __metaclass__ = AutoRegister(widget_registry)
class BaseGizmo(BaseWidget): __metaclass__ = ABCMeta
# Concrete classes; these get registered.
class AlphaWidget(BaseWidget): pass
class BravoGizmo(BaseGizmo): pass
What I can't figure out, though, is how to make this work in Python 3.
How can a metaclass determine if it is initializing an abstract class in Python 3?

PEP 3119 describes how the ABCMeta metaclass "marks" abstract methods and creates an __abstractmethods__ frozenset that contains all methods of a class that are still abstract. So, to check whether a class cls is abstract, check whether cls.__abstractmethods__ is non-empty.
I also found this relevant post on abstract classes useful.
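A minimal sketch of the check described above. Shape, Square and is_abstract are hypothetical names used only for illustration; the getattr default covers classes that were not created by ABCMeta and therefore have no __abstractmethods__ attribute at all.

```python
import inspect
from abc import ABC, abstractmethod


class Shape(ABC):
    @abstractmethod
    def area(self):
        ...


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


def is_abstract(cls):
    # A non-empty __abstractmethods__ means some abstract methods are
    # still unimplemented; classes without the attribute are concrete.
    return bool(getattr(cls, "__abstractmethods__", frozenset()))


print(is_abstract(Shape), is_abstract(Square))
print(inspect.isabstract(Shape), inspect.isabstract(Square))
```

inspect.isabstract gives the same answers for these two classes, so in most code it is probably the more readable choice.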

I couldn't shake the feeling as I was posting this question that I was dealing with an XY Problem. As it turns out, that's exactly what was going on.
The real issue here is that the AutoRegister metaclass, as implemented, relies on a flawed understanding of what an abstract class is. Python or not, one of the most important criteria of an abstract class is that it is not instantiable.
In the example posted in the question, BaseWidget and BaseGizmo are instantiable, so they are not abstract.
Aren't we just bifurcating rabbits here?
Well, why was I having so much trouble getting AutoRegister to work in Python 3? Because I was trying to build something whose behavior contradicts the way classes work in Python.
The fact that inspect.isabstract wasn't returning the result I wanted should have been a major red flag: AutoRegister is a warranty-voider.
So what's the real solution then?
First, we have to recognize that BaseWidget and BaseGizmo have no reason to exist. They do not provide enough functionality to be instantiable, nor do they declare abstract methods that describe the functionality that they are missing.
One could argue that they could be used to "categorize" their sub-classes, but a) that's clearly not what's going on in this case, and b) quack.
Instead, we could embrace Python's definition of "abstract":
1. Modify BaseWidget and BaseGizmo so that they define one or more abstract methods.
   - If we can't come up with any abstract methods, then can we remove them entirely?
   - If we can't remove them but also can't make them properly abstract, it might be worthwhile to take a step back and see if there are other ways we might solve this problem.
2. Modify the definition of AutoRegister so that it uses inspect.isabstract to decide if a class is abstract: see final implementation.
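As a rough sketch of what an inspect.isabstract-based AutoRegister could look like on Python 3 (this is not the linked final implementation; the Registry class and the render method are assumptions made up for the example):

```python
import inspect
from abc import ABCMeta, abstractmethod


class Registry:
    """Hypothetical registry: maps class names to classes."""

    def __init__(self):
        self.classes = {}

    def register(self, cls):
        self.classes[cls.__name__] = cls
        return cls  # returning cls also lets register() work as a decorator


def AutoRegister(registry, base_type=ABCMeta):
    class _metaclass(base_type):
        def __init__(cls, name, bases, attrs):
            super().__init__(name, bases, attrs)
            # ABCMeta has already computed __abstractmethods__ in __new__,
            # so inspect.isabstract is reliable at this point.
            if not inspect.isabstract(cls):
                registry.register(cls)

    return _metaclass


widget_registry = Registry()


class BaseWidget(metaclass=AutoRegister(widget_registry)):
    @abstractmethod
    def render(self):
        ...


class AlphaWidget(BaseWidget):
    def render(self):
        return "alpha"


print(sorted(widget_registry.classes))
```

BaseWidget is skipped because it still has an unimplemented abstract method; AlphaWidget is registered because it does not.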
That's cool and all, but what if I can't change the base classes?
Or, if you have to maintain backwards compatibility with existing code (as was the case for me), a decorator is probably easier:
@widget_registry.register
class AlphaWidget(object):
    pass
@widget_registry.register
class BravoGizmo(object):
    pass

Related

Are there any unique features provided only by metaclasses in Python?

I have read answers for this question: What are metaclasses in Python? and this question: In Python, when should I use a meta class? and skimmed through documentation: Data model.
It is very possible I missed something, and I would like to clarify: is there anything that metaclasses can do that cannot be properly or improperly (unpythonic, etc) done with the help of other tools (decorators, inheritance, etc)?
That is a bit tricky to answer -
However, it is a very nice question to ask at this point, and there are certainly a few things that are easier to do with metaclasses.
So, first, I think it is important to note the things for which one used to need a metaclass and no longer does: with the release of Python 3.6 and the inclusion of the __init_subclass__ and __set_name__ dunder methods, a lot, maybe the majority, of the cases I had always written a metaclass for became outdated (most of them were for answering questions or in toy code; no one creates that many production-code metaclasses even in a lifetime as a programmer).
In particular, __init_subclass__ adds the convenience of being able to transform any attribute or method, as a class decorator can, but it is applied automatically on inheritance, which does not happen with decorators.
I guess reading about it was a factor motivating your question, since most metaclasses found in the wild deal with transforming these attributes in the metaclass __new__ and __init__ methods.
However, note that if one needs to transform any attribute prior to having it included in the class, the metaclass __new__ method is the only place it can be done. In most cases, however, one can simply transform it in the final new class namespace.
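To illustrate the __init_subclass__ point above, here is a small sketch of subclass registration without any metaclass. Plugin, CsvPlugin and JsonPlugin are made-up names for the example.

```python
class Plugin:
    registry = {}

    def __init_subclass__(cls, **kwargs):
        # Called automatically for every subclass, direct or indirect.
        super().__init_subclass__(**kwargs)
        Plugin.registry[cls.__name__] = cls


class CsvPlugin(Plugin):
    pass


class JsonPlugin(CsvPlugin):  # an indirect subclass is picked up as well
    pass


print(sorted(Plugin.registry))
```

A class decorator could do the same registration, but it would have to be repeated on every subclass.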
Then, one version forward, in 3.7, we had __class_getitem__ implemented - since using the [ ] (__getitem__) operator directly on classes became popular due to typing annotations. Before that, one would have to create a metaclass with a __getitem__ method for the sole purpose of being able to indicate to the type-checker toolchain some extra information like generic variables.
One interesting possibility that did not exist in Python 2, was introduced in Python 3, then outdated, and now can only serve very specific cases is the use of the __prepare__ method on the metaclass:
I don't know if this is written in any official docs, but the obvious primary motivation for the metaclass __prepare__ method, which allows a custom namespace for the class body, was to return an ordered dict, so that one could have ordered attributes in classes that would work as data entities. It turns out that, from Python 3.6 on, class body namespaces were always ordered (and in Python 3.7 insertion ordering was formalized for all Python dictionaries). However, although no longer needed for returning an OrderedDict, __prepare__ is still a unique thing in the language in that it allows a custom mapping class to be used as the namespace of a piece of Python code (even if that is limited to class bodies). For example, one can trivially create an "auto-enumeration" metaclass by returning a dict subclass that defines __missing__:
class MD(dict):
    def __init__(self, *args, **kw):
        super().__init__(*args, **kw)
        self.counter = 0
    def __missing__(self, key):
        # Let dunder lookups such as __name__ fall through to globals;
        # otherwise they would silently consume a counter value.
        if key.startswith('__') and key.endswith('__'):
            raise KeyError(key)
        counter = self[key] = self.counter
        self.counter += 1
        return counter
class MC(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwd):
        return MD()
class Colors(metaclass=MC):
    RED
    GREEN
    BLUE
(an example similar to this is included in Luciano Ramalho's 'Fluent Python' 2nd edition)
The __call__ method on the metaclass is also peculiar: it controls the calls to __new__ and __init__ whenever an instance of the class is created. There are recipes around that use this to create a "singleton" - I find those terrible and overkill: if I need a singleton, I just create an instance of the singleton class at module level. However, overriding type.__call__ offers a level of control over class instantiation that may be hard to achieve in the class's own __new__ and __init__. But this definitely can be done by correctly keeping the desired states in the class object itself.
__subclasscheck__ and __instancecheck__: these are metaclass-only methods, and the only workaround would be to make a class decorator that re-creates the class object so that it is a "real" subclass of the intended base class (and that is not always possible).
"hidden" class attributes: now, this can be useful, and is less known, as it derives from the language behavior itself: any attribute or method (besides the dunder methods) defined in a metaclass can be used from a class, but not from instances of that class. An example of this is the .register method in classes using abc.ABCMeta. This contrasts with ordinary classmethods, which can be used normally from an instance.
And finally, any behavior defined with the dunder methods for a Python object can be implemented to work on classes if they are defined in the metaclass. So if you have any use case for "add-able" classes, or want a special repr for your classes, just implement __add__ or __repr__ on the metaclass: this behavior obviously can't be obtained by other means.
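A short sketch of the last two points: dunder methods and "hidden" attributes defined on the metaclass. All names here (Meta, Alpha, Bravo, describe) are invented for the example.

```python
class Meta(type):
    def __repr__(cls):
        return f"<widget class {cls.__name__}>"

    def __add__(cls, other):
        # "Adding" two classes builds a new class that inherits from both.
        return type(cls)(f"{cls.__name__}{other.__name__}", (cls, other), {})

    def describe(cls):
        # Reachable from the class, but not from its instances.
        return f"{cls.__name__} is described by its metaclass"


class Alpha(metaclass=Meta):
    pass


class Bravo(metaclass=Meta):
    pass


Combined = Alpha + Bravo
print(repr(Alpha), Combined.__mro__)
print(Alpha.describe())
print(hasattr(Alpha(), "describe"))
```

Instance attribute lookup goes through the class and its bases, never through the metaclass, which is why describe is invisible on Alpha().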
I think I got all covered there.

Is there a way to decorate a class injecting a parent class?

I have a base class A, and a decorator behavior. Both have different behaviors, but sometimes they can be used at the same time.
Is there a way to implement a new class decorator new_behavior that applies behavior and "injects" A as a parent class?
Something like this:
@new_behavior
class B:
    ...
So B will behave just as if it was declared like class B(A):, but B also inherits all @behavior behaviors?
Broadly speaking, by the time a decorator gets a chance to operate on a class, it's too late to change fundamental properties of the class, like its bases. But that doesn't necessarily mean you can't do what you want, it only rules out direct approaches.
You could have your decorator create a new class with the desired bases, and add the contents of the old class to the new one. But there are a lot of subtle details that might go wrong, like methods that don't play correctly with super and other stuff that make it somewhat challenging. I would not want to do this on a whim.
One possible option that might be simpler than most is to make a new class that inherits from both the class you're decorating, and the base class you want to add. That isn't exactly the same as injecting a base class as a base of the decorated, but it will usually wind up with the same MRO, and super should work just fine. Here's how I'd implement that:
def new_behavior(cls):
    class NewClass(cls, A): # do the multiple inheritance by adding A here
        pass
    NewClass.__name__ = f'New{cls.__name__}' # should modify __qualname__ too
    return NewClass
I'm not applying any other decorators in that code, but you could do that by changing the last line to return some_other_decorator(NewClass) or just applying the decorator to the class statement with @decorator syntax. In order to make introspection nicer, you might want to modify a few attributes of NewClass before returning it. I demonstrate altering the __name__ attribute, but you would probably also want to change __qualname__ (which I've skipped doing because it would be a bit more fiddly and annoying to get something appropriate), and maybe some others that I can't think of off the top of my head.
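For a runnable picture of how this behaves, here is a sketch with stand-in definitions. The bodies of A and behavior are assumptions (the question does not show them), and behavior is applied inside new_behavior as suggested above.

```python
class A:
    def greet(self):
        return "hello from A"


def behavior(cls):
    # Hypothetical stand-in for the question's decorator.
    cls.tagged = True
    return cls


def new_behavior(cls):
    class NewClass(cls, A):  # cls first, so its methods win in the MRO
        pass

    NewClass.__name__ = f"New{cls.__name__}"
    return behavior(NewClass)


@new_behavior
class B:
    def greet(self):
        # super() still works: A comes after the original B in the MRO.
        return "B, then " + super().greet()


print(B().greet())
print([klass.__name__ for klass in B.__mro__])
```

Note that the name B now refers to the generated subclass, so the original class only remains reachable through the MRO.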

Using ABC, PolymorphicModel, django-models gives metaclass conflict

So far every other answer on SO answers in the exact same way: construct your metaclasses and then inherit the 'joined' version of those metaclasses, i.e.
class M_A(type): pass
class M_B(type): pass
class A(metaclass=M_A): pass
class B(metaclass=M_B): pass
class M_C(M_A, M_B): pass
class C(A, B, metaclass=M_C): pass
But I don't know what world these people are living in, where they're constructing their own metaclasses! Obviously, one would be using classes from other libraries and unless you have a perfect handle on meta programming, how are you supposed to know whether you can just override a class's metaclass? (Clearly I do not have a handle on them yet).
My problem is:
class InterfaceToTransactions(ABC):
    def account(self):
        return None
    ...
class Category(PolymorphicModel, InterfaceToTransactions):
    def account(self):
        return self.source_account
    ...
class Income(TimeStampedModel, InterfaceToTransactions):
    def account(self):
        return self.destination_account
    ...
Which of course gives me the error: "metaclass conflict: the metaclass of a derived class must be a (non-strict) subclass of the metaclasses of all its bases"
I've tried many variations of the solution given above, the following does not work, gives the same error.
class InterfaceToTransactionsIntermediaryMeta(type(PolymorphicModel), type(InterfaceToTransactions)):
    pass
class Category(PolymorphicModel, InterfaceToTransactions):
    __metaclass__ = InterfaceToTransactionsIntermediaryMeta
    ...
Nor does putting anything inside the class Meta function. I've read every single other SO question on this topic, please don't simply mark it as duplicate.
-------------------Edited 1/8/18 after accepting the solution-------
Oddly enough, if I try to makemigrations with this new configuration (the one I accepted), it starts giving the metaclass error again, but it still works during runtime. If I comment out the metaclass parts then makemigrations and migrate, it will do it successfully, but then I have to put it back in there after migrating every time.
If you are using Python 3, you are trying to use your derived metaclass incorrectly.
And since you get "the same error", and not other possible, more subtle, error, I'd say this is what is happening.
Try just changing to:
class IntermediaryMeta(type(InterfaceToTransactions), type(PolymorphicModel)):
    pass
class Category(PolymorphicModel, InterfaceToTransactions, metaclass=IntermediaryMeta):
    ...
(At least the ABCMeta class is guaranteed to work collaboratively using super; that is reason enough to place it first in the bases tuple.)
If that yields you new and improved errors, this means that one or both of those classes can't really collaborate properly, for one of several reasons. Then, the way to go is to force your inheritance tree that depends on ABCMeta not to do so, since its role is almost aesthetic in a language like Python, where everything else is for "consenting adults".
Unfortunately, the way to do that is to use varying methods of brute force, from the safe "rewriting everything" to monkey patching ABCMeta and abstractmethod in the place where "InterfaceToTransactions" is defined so that they simply do nothing.
If you need to get there, and need some help, please post another question.
Sorry - this is actually one of the major drawbacks of using metaclasses.
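Since Django is not needed to see the mechanics, here is a self-contained sketch of the conflict and of the combined-metaclass fix. ModelMeta and Model are hypothetical stand-ins for a third-party metaclass and its base class; whether the real django-polymorphic metaclass cooperates this cleanly is not something this example demonstrates.

```python
from abc import ABC, ABCMeta, abstractmethod


class ModelMeta(type):
    """Hypothetical stand-in for a library's metaclass."""


class Model(metaclass=ModelMeta):
    pass


class Interface(ABC):
    @abstractmethod
    def account(self):
        ...


conflict = None
try:
    class Broken(Model, Interface):  # ModelMeta vs ABCMeta: no common winner
        def account(self):
            return None
except TypeError as exc:
    conflict = str(exc)


class CombinedMeta(ABCMeta, ModelMeta):  # ABCMeta first, as suggested above
    pass


class Category(Model, Interface, metaclass=CombinedMeta):
    def account(self):
        return "source"


print(conflict)
print(type(Category).__name__, Category().account())
```

The metaclass keyword in the class statement is what makes the difference on Python 3; a __metaclass__ attribute in the class body is simply ignored there.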
Unless django-polymorphic decides to inherit from abc.ABC this is going to be very difficult to achieve. A good solution would be to "manually" create your interface. For instance:
class InterfaceToTransactions:
    def account(self):
        raise NotImplementedError("Account method must be implemented.")
    ...
class Category(PolymorphicModel, InterfaceToTransactions):
    def account(self):
        return self.source_account
    ...
class Income(TimeStampedModel, InterfaceToTransactions):
    def account(self):
        return self.destination_account
    ...

Does Python require intimate knowledge of all classes in the inheritance chain?

Python classes have no concept of public/private, so we are told to not touch something that starts with an underscore unless we created it. But does this not require complete knowledge of all classes from which we inherit, directly or indirectly? Witness:
class Base(object):
    def __init__(self):
        super(Base, self).__init__()
        self._foo = 0
    def foo(self):
        return self._foo + 1
class Sub(Base):
    def __init__(self):
        super(Sub, self).__init__()
        self._foo = None
Sub().foo()
Expectedly, a TypeError is raised when None + 1 is evaluated. So I have to know that _foo exists in the base class. To get around this, __foo can be used instead, which solves the problem by mangling the name. This seems to be, if not elegant, an acceptable solution. However, what happens if Base inherits from a class (in a separate package) called Sub? Now __foo in my Sub overrides __foo in the grandparent Sub.
This implies that I have to know the entire inheritance chain, including all "private" objects each uses. The fact that Python is dynamically-typed makes this even harder, since there are no declarations to search for. The worst part, however, is probably the fact Base might inherit from object right now, but in some future release, it switches to inheriting from Sub. Clearly if I know Sub is inherited from, I can rename my class, however annoying that is. But I can't see into the future.
Is this not a case where a true private data type would prevent a problem? How, in Python, can I be sure that I'm not accidentally stepping on somebody's toes if those toes might spring into existence at some point in the future?
EDIT: I've apparently not made clear the primary question. I'm familiar with name mangling and the difference between a single and a double underscore. The question is: how do I deal with the fact that I might clash with classes whose existence I don't know of right now? If my parent class (which is in a package I did not write) happens to start inheriting from a class with the same name as my class, even name mangling won't help. Am I wrong in seeing this as a (corner) case that true private members would solve, but that Python has trouble with?
EDIT: As requested, the following is a full example:
File parent.py:
class Sub(object):
    def __init__(self):
        self.__foo = 12
    def foo(self):
        return self.__foo + 1
class Base(Sub):
    pass
File sub.py:
import parent
class Sub(parent.Base):
    def __init__(self):
        super(Sub, self).__init__()
        self.__foo = None
Sub().foo()
The grandparent's foo is called, but my __foo is used.
Obviously you wouldn't write code like this yourself, but parent could easily be provided by a third party, the details of which could change at any time.
Use private names (instead of protected ones), starting with a double underscore:
class Sub(Base):
    def __init__(self):
        super(Sub, self).__init__()
        self.__foo = None
        #    ^^
will not conflict with _foo or __foo in Base. This is because Python prefixes a double-underscore name with a single underscore and the name of the enclosing class; the following two lines are equivalent:
class Sub(Base):
    def x(self):
        self.__foo = None # .. is the same as ..
        self._Sub__foo = None
(In response to the edit:) The chance that two classes in a class hierarchy not only have the same name, but that they are both using the same property name, and are both using the private mangled (__) form is so minuscule that it can be safely ignored in practice (I for one haven't heard of a single case so far).
In theory, however, you are correct in that in order to formally verify correctness of a program, one must know the entire inheritance chain. Luckily, formal verification usually requires a fixed set of libraries in any case.
This is in the spirit of the Zen of Python, which includes
practicality beats purity.
Name mangling includes the class so your Base.__foo and Sub.__foo will have different names. This was the entire reason for adding the name mangling feature to Python in the first place. One will be _Base__foo, the other _Sub__foo.
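A quick sketch that makes the mangled names visible, using the question's Base and Sub with __foo in place of _foo:

```python
class Base:
    def __init__(self):
        self.__foo = 0  # stored on the instance as _Base__foo

    def foo(self):
        return self.__foo + 1  # reads _Base__foo


class Sub(Base):
    def __init__(self):
        super().__init__()
        self.__foo = None  # stored as _Sub__foo, so no clash with Base


s = Sub()
print(sorted(vars(s)))
print(s.foo())
```

The clash from the question's edit only comes back when both classes are literally named the same, because then both attributes mangle to the same name.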
Many people prefer to use composition (has-a) instead of inheritance (is-a) for some of these very reasons.
This implies that I have to know the entire inheritance chain. . .
Yes, you should know the entire inheritance chain, or the docs for the object you are directly sub-classing should tell you what you need to know.
Subclassing is an advanced feature, and should be treated with care.
A good example of docs specifying what should be overridden in a subclass is the threading.Thread class:
This class represents an activity that is run in a separate thread of control. There are two ways to specify the activity: by passing a callable object to the constructor, or by overriding the run() method in a subclass. No other methods (except for the constructor) should be overridden in a subclass. In other words, only override the __init__() and run() methods of this class.
How often do you modify base classes in inheritance chains to introduce inheritance from a class with the same name as a subclass further down the chain???
Less flippantly, yes, you have to know the code you are working with. You certainly have to know the public names being used, after all. Python being python, discovering the public names in use by your ancestor classes takes pretty much the same effort as discovering the private ones.
In years of Python programming, I have never found this to be much of an issue in practice. When you're naming instance variables, you should have a pretty good idea whether (a) a name is generic enough that it's likely to be used in other contexts and (b) the class you're writing is likely to be involved in an inheritance hierarchy with other unknown classes. In such cases, you think a bit more carefully about the names you're using; self.value isn't a great idea for an attribute name, and neither is something like Adaptor a great class name.
In contrast, I have run into difficulties with the overuse of double-underscore names a number of times. Python being Python, even "private" names tend to be accessed by code defined outside the class. You might think that it would always be bad practice to let an external function access "private" attributes, but what about things like getattr and hasattr? The invocation of them can be in the class's own code, so the class is still controlling all access to the private attributes, but they still don't work without you doing the name-mangling manually. If Python had actually-enforced private variables you couldn't use functions like those on them at all. These days I tend to reserve double-underscore names for cases when I'm writing something very generic like a decorator, metaclass, or mixin that needs to add a "secret attribute" to the instances of the (unknown) classes it's applied to.
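The getattr/hasattr difficulty mentioned above is easy to reproduce. In this sketch Account and its method names are invented for illustration; the point is that attribute names passed as strings are not mangled.

```python
class Account:
    def __init__(self):
        self.__balance = 10  # stored as _Account__balance

    def has_balance_unmangled(self):
        # The string "__balance" is looked up literally, so this is False
        # even though the code lives inside the class.
        return hasattr(self, "__balance")

    def balance_mangled(self):
        # The mangling has to be done by hand when using getattr.
        return getattr(self, "_Account__balance")


acct = Account()
print(acct.has_balance_unmangled())
print(acct.balance_mangled())
```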
And of course there's the standard dynamic language argument: the reality is that you have to test your code thoroughly to have much justification in making the claim "my software works". Such testing will be very unlikely to miss the bugs caused by accidentally clashing names. If you are not doing that testing, then many more uncaught bugs will be introduced by other means than by accidental name clashes.
In summation, the lack of private variables is just not that big a deal in idiomatic Python code in practice, and the addition of true private variables would cause more frequent problems in other ways IMHO.
Mangling happens with double underscores. Single underscores are more of a "please don't".
You don't need to know all the details of all parent classes (note that deep inheritance is usually best avoided), because you can still dir() and help() and any other form of introspection you can come up with.
As noted, you can use name mangling. However, you can stick with a single underscore (or none!) if you document your code adequately - you should not have so many private variables that this proves to be a problem. Just say if a method relies on a private variable, and add either the variable, or the name of the method to the class docstring to alert users.
Further, if you create unit tests, you should create tests that check invariants on members, and accordingly these should be able to show up such name clashes.
If you really want to have "private" variables, and for whatever reason name-mangling doesn't meet your needs, you can factor your private state into another object:
class Foo(object):
    class Stateholder(object): pass
    def __init__(self):
        # Stateholder is nested in Foo, so it must be reached through self
        self._state = self.Stateholder()
        self._state.private = 1

In Python, when should I use a meta class?

I have gone through this: What is a metaclass in Python?
But can any one explain more specifically when should I use the meta class concept and when it's very handy?
Suppose I have a class like below:
class Book(object):
    CATEGORIES = ['programming', 'literature', 'physics']

    def _get_book_name(self, book):
        return book['title']

    def _get_category(self, book):
        for cat in self.CATEGORIES:
            if book['title'].find(cat) > -1:
                return cat
        return "Other"

if __name__ == '__main__':
    b = Book()
    dummy_book = {'title': 'Python Guide of Programming', 'status': 'available'}
    print(b._get_category(dummy_book))
For this class.
In which situation should I use a meta class and why is it useful?
Thanks in advance.
You use metaclasses when you want to mutate the class as it is being created. Metaclasses are hardly ever needed, they're hard to debug, and they're difficult to understand -- but occasionally they can make frameworks easier to use. In our 600Kloc code base we've used metaclasses 7 times: ABCMeta once, 4x models.SubfieldBase from Django, and twice a metaclass that makes classes usable as views in Django. As @Ignacio writes, if you don't know that you need a metaclass (and have considered all other options), you don't need a metaclass.
Conceptually, a class exists to define what a set of objects (the instances of the class) have in common. That's all. It allows you to think about the instances of the class according to that shared pattern defined by the class. If every object was different, we wouldn't bother using classes, we'd just use dictionaries.
A metaclass is an ordinary class, and it exists for the same reason; to define what is common to its instances. The default metaclass type provides all the normal rules that make classes and instances work the way you're used to, such as:
- Attribute lookup on an instance checks the instance followed by its class, followed by all superclasses in MRO order
- Calling MyClass(*args, **kwargs) invokes i = MyClass.__new__(MyClass, *args, **kwargs) to get an instance, then invokes i.__init__(*args, **kwargs) to initialise it
- A class is created from the definitions in a class block by making all the names bound in the class block into attributes of the class
- Etc
If you want to have some classes that work differently to normal classes, you can define a metaclass and make your unusual classes instances of the metaclass rather than type. Your metaclass will almost certainly be a subclass of type, because you probably don't want to make your different kind of class completely different; just as you might want to have some sub-set of Books behave a bit differently (say, books that are compilations of other works) and use a subclass of Book rather than a completely different class.
If you're not trying to define a way of making some classes work differently to normal classes, then a metaclass is probably not the most appropriate solution. Note that the "classes define how their instances work" is already a very flexible and abstract paradigm; most of the time you do not need to change how classes work.
If you google around, you'll see a lot of examples of metaclasses that are really just being used to go do a bunch of stuff around class creation; often automatically processing the class attributes, or finding new ones automatically from somewhere. I wouldn't really call those great uses of metaclasses. They're not changing how classes work, they're just processing some classes. A factory function to create the classes, or a class method that you invoke immediately after class creation, or best of all a class decorator, would be a better way to implement this sort of thing, in my opinion.
But occasionally you find yourself writing complex code to get Python's default behaviour of classes to do something conceptually simple, and it actually helps to step "further out" and implement it at the metaclass level.
A fairly trivial example is the "singleton pattern", where you have a class of which there can only be one instance; calling the class will return an existing instance if one has already been created. Personally I am against singletons and would not advise their use (I think they're just global variables, cunningly disguised to look like newly created instances in order to be even more likely to cause subtle bugs). But people use them, and there are huge numbers of recipes for making singleton classes using __new__ and __init__. Doing it this way can be a little irritating, mainly because Python wants to call __new__ and then call __init__ on the result of that, so you have to find a way of not having your initialisation code re-run every time someone requests access to the singleton. But wouldn't it be easier if we could just tell Python directly what we want to happen when we call the class, rather than trying to set up the things that Python wants to do so that they happen to do what we want in the end?
class Singleton(type):
    def __init__(self, *args, **kwargs):
        super(Singleton, self).__init__(*args, **kwargs)
        self.__instance = None
    def __call__(self, *args, **kwargs):
        if self.__instance is None:
            self.__instance = super(Singleton, self).__call__(*args, **kwargs)
        return self.__instance
Under 10 lines, and it turns normal classes into singletons simply by adding __metaclass__ = Singleton, i.e. nothing more than a declaration that they are a singleton. It's just easier to implement this sort of thing at this level, than to hack something out at the class level directly.
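On Python 3 the __metaclass__ attribute is ignored, and the metaclass is passed as a keyword in the class statement instead. A sketch of the same idea in Python 3 syntax follows; Config and its init_calls counter are hypothetical, added only to show that __init__ runs once.

```python
class Singleton(type):
    def __init__(cls, *args, **kwargs):
        super().__init__(*args, **kwargs)
        cls._instance = None  # one slot per class using this metaclass

    def __call__(cls, *args, **kwargs):
        # Only the first call goes through the normal __new__/__init__ path.
        if cls._instance is None:
            cls._instance = super().__call__(*args, **kwargs)
        return cls._instance


class Config(metaclass=Singleton):
    init_calls = 0

    def __init__(self):
        Config.init_calls += 1


first = Config()
second = Config()
print(first is second, Config.init_calls)
```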
But for your specific Book class, it doesn't look like you have any need to do anything that would be helped by a metaclass. You really don't need to reach for metaclasses unless you find the normal rules of how classes work are preventing you from doing something that should be simple in a simple way (which is different from "man, I wish I didn't have to type so much for all these classes, I wonder if I could auto-generate the common bits?"). In fact, I have never actually used a metaclass for something real, despite using Python every day at work; all my metaclasses have been toy examples like the above Singleton or else just silly exploration.
A metaclass is used whenever you need to override the default behavior for classes, including their creation.
A class gets created from the name, a tuple of bases, and a class dict. You can intercept the creation process to make changes to any of those inputs.
You can also override any of the services provided by classes:
- __call__ which is used to create instances
- __getattribute__ which is used to look up attributes and methods on a class
- __setattr__ which controls setting attributes
- __repr__ which controls how the class is displayed
In summary, metaclasses are used when you need to control how classes are created or when you need to alter any of the services provided by classes.
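To make the "name, tuple of bases, and class dict" description concrete, here is a small sketch. The first part builds a class by calling type directly; the second intercepts creation in a metaclass __new__. Tagged, Novel and the tag attribute are invented for the example.

```python
# A class statement is roughly equivalent to calling the metaclass
# with the name, the bases, and the namespace built from the body.
Book = type("Book", (object,), {"category": "Other"})


class Tagged(type):
    def __new__(mcls, name, bases, namespace):
        # Intercept creation: add an attribute before the class exists.
        namespace = dict(namespace, tag=name.lower())
        return super().__new__(mcls, name, bases, namespace)


class Novel(metaclass=Tagged):
    pass


print(Book.__name__, Book.category)
print(Novel.tag)
```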
If you for whatever reason want to do stuff like Class[x], x in Class etc., you have to use metaclasses:
class Meta(type):
    def __getitem__(cls, x):
        return x ** 2
    def __contains__(cls, x):
        return int(x ** (0.5)) == x ** 0.5

# Python 2.x (use one definition or the other)
class Class(object):
    __metaclass__ = Meta

# Python 3.x
class Class(metaclass=Meta):
    pass

print(Class[2])
print(4 in Class)
check the link Meta Class Made Easy to know how and when to use meta class.
