Subclasses vs Mixins in Python [duplicate] - python

In Programming Python, Mark Lutz mentions the term mixin. I am from a C/C++/C# background and I have not heard the term before. What is a mixin?
Reading between the lines of this example (which I have linked to because it is quite long), I am presuming it is a case of using multiple inheritance to extend a class as opposed to proper subclassing. Is this right?
Why would I want to do that rather than put the new functionality into a subclass? For that matter, why would a mixin/multiple inheritance approach be better than using composition?
What separates a mixin from multiple inheritance? Is it just a matter of semantics?

A mixin is a special kind of multiple inheritance. There are two main situations where mixins are used:
You want to provide a lot of optional features for a class.
You want to use one particular feature in a lot of different classes.
For an example of number one, consider werkzeug's request and response system. I can make a plain old request object by saying:
from werkzeug import BaseRequest
class Request(BaseRequest):
    pass
If I want to add accept header support, I would make that
from werkzeug import BaseRequest, AcceptMixin
class Request(AcceptMixin, BaseRequest):
    pass
If I wanted to make a request object that supports accept headers, etags, authentication, and user agent support, I could do this:
from werkzeug import BaseRequest, AcceptMixin, ETagRequestMixin, UserAgentMixin, AuthenticationMixin
class Request(AcceptMixin, ETagRequestMixin, UserAgentMixin, AuthenticationMixin, BaseRequest):
    pass
The difference is subtle, but in the above examples, the mixin classes weren't made to stand on their own. In more traditional multiple inheritance, the AuthenticationMixin (for example) would probably be something more like Authenticator. That is, the class would probably be designed to stand on its own.

First, you should note that mixins only exist in multiple-inheritance languages. You can't do a mixin in Java or C#.
Basically, a mixin is a stand-alone base type that provides limited functionality and polymorphic resonance for a child class. If you're thinking in C#, think of an interface that you don't have to actually implement because it's already implemented; you just inherit from it and benefit from its functionality.
Mixins are typically narrow in scope and not meant to be extended.
[edit -- as to why:]
I suppose I should address why, since you asked. The big benefit is that you don't have to do it yourself over and over again. In C#, the biggest place where a mixin could benefit might be from the Disposal pattern. Whenever you implement IDisposable, you almost always want to follow the same pattern, but you end up writing and re-writing the same basic code with minor variations. If there were an extendable Disposal mixin, you could save yourself a lot of extra typing.
[edit 2 -- to answer your other questions]
What separates a mixin from multiple inheritance? Is it just a matter of semantics?
Yes. The difference between a mixin and standard multiple inheritance is just a matter of semantics; a class that has multiple inheritance might utilize a mixin as part of that multiple inheritance.
The point of a mixin is to create a type that can be "mixed in" to any other type via inheritance without affecting the inheriting type while still offering some beneficial functionality for that type.
Again, think of an interface that is already implemented.
I personally don't use mixins since I develop primarily in a language that doesn't support them, so I'm having a really difficult time coming up with a decent example that will just supply that "ahah!" moment for you. But I'll try again. I'm going to use an example that's contrived -- most languages already provide the feature in some way or another -- but that will, hopefully, explain how mixins are supposed to be created and used. Here goes:
Suppose you have a type that you want to be able to serialize to and from XML. You want the type to provide a "ToXML" method that returns a string containing an XML fragment with the data values of the type, and a "FromXML" that allows the type to reconstruct its data values from an XML fragment in a string. Again, this is a contrived example, so perhaps you use a file stream, or an XML Writer class from your language's runtime library... whatever. The point is that you want to serialize your object to XML and get a new object back from XML.
The other important point in this example is that you want to do this in a generic way. You don't want to have to implement a "ToXML" and "FromXML" method for every type that you want to serialize, you want some generic means of ensuring that your type will do this and it just works. You want code reuse.
If your language supported it, you could create the XmlSerializable mixin to do your work for you. This type would implement the ToXML and the FromXML methods. It would, using some mechanism that's not important to the example, be capable of gathering all the necessary data from any type that it's mixed in with to build the XML fragment returned by ToXML and it would be equally capable of restoring that data when FromXML is called.
And.. that's it. To use it, you would have any type that needs to be serialized to XML inherit from XmlSerializable. Whenever you needed to serialize or deserialize that type, you would simply call ToXML or FromXML. In fact, since XmlSerializable is a fully-fledged type and polymorphic, you could conceivably build a document serializer that doesn't know anything about your original type, accepting only, say, an array of XmlSerializable types.
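Since this is a Python thread, here is a minimal sketch of what such a mixin could look like in Python. The names `XmlSerializableMixin`, `to_xml`, `from_xml` and `Point` are made up for illustration, and the sketch assumes that all state lives in the instance `__dict__` and that values can round-trip as strings; it is not a general-purpose serializer.

```python
import xml.etree.ElementTree as ET


class XmlSerializableMixin:
    """Hypothetical mixin: serializes the instance attributes as XML."""

    def to_xml(self):
        # One child element per instance attribute, named after the attribute.
        root = ET.Element(type(self).__name__)
        for name, value in vars(self).items():
            ET.SubElement(root, name).text = str(value)
        return ET.tostring(root, encoding="unicode")

    @classmethod
    def from_xml(cls, text):
        # Bypass __init__; every value is restored as a string (assumption).
        obj = cls.__new__(cls)
        for child in ET.fromstring(text):
            setattr(obj, child.tag, child.text)
        return obj


class Point(XmlSerializableMixin):
    def __init__(self, x, y):
        self.x = x
        self.y = y


xml_text = Point(1, 2).to_xml()
restored = Point.from_xml(xml_text)
```

`Point` never mentions XML itself; it gets both methods only by listing the mixin as a base class.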
Now imagine using this scenario for other things, like creating a mixin that ensures that every class that mixes it in logs every method call, or a mixin that provides transactionality to the type that mixes it in. The list can go on and on.
If you just think of a mixin as a small base type designed to add a small amount of functionality to a type without otherwise affecting that type, then you're golden.
Hopefully. :)

This answer aims to explain mixins with examples that are:
self-contained: short, with no need to know any libraries to understand the example.
in Python, not in other languages.
It is understandable that there were examples from other languages such as Ruby since the term is much more common in those languages, but this is a Python thread.
It shall also consider the controversial question:
Is multiple inheritance necessary or not to characterize a mixin?
Definitions
I have yet to see a citation from an "authoritative" source clearly saying what is a mixin in Python.
I have seen 2 possible definitions of a mixin (if they are to be considered as different from other similar concepts such as abstract base classes), and people don't entirely agree on which one is correct.
The consensus may vary between different languages.
Definition 1: no multiple inheritance
A mixin is a class such that some method of the class uses a method which is not defined in the class.
Therefore the class is not meant to be instantiated, but rather serve as a base class. Otherwise the instance would have methods that cannot be called without raising an exception.
A constraint which some sources add is that the class may not contain data, only methods, but I don't see why this is necessary. In practice however, many useful mixins don't have any data, and base classes without data are simpler to use.
A classic example is the implementation of all comparison operators from only <= and ==:
class ComparableMixin(object):
    """This class has methods which use `<=` and `==`,
    but this class does NOT implement those methods."""
    def __ne__(self, other):
        return not (self == other)
    def __lt__(self, other):
        return self <= other and (self != other)
    def __gt__(self, other):
        return not self <= other
    def __ge__(self, other):
        return self == other or self > other
class Integer(ComparableMixin):
    def __init__(self, i):
        self.i = i
    def __le__(self, other):
        return self.i <= other.i
    def __eq__(self, other):
        return self.i == other.i
assert Integer(0) < Integer(1)
assert Integer(0) != Integer(1)
assert Integer(1) > Integer(0)
assert Integer(1) >= Integer(1)
# It is possible to instantiate a mixin:
o = ComparableMixin()
# but methods that rely on the missing `<=` raise a TypeError:
#o < o
This particular example could have been achieved via the functools.total_ordering() decorator, but the game here was to reinvent the wheel:
import functools
@functools.total_ordering
class Integer(object):
    def __init__(self, i):
        self.i = i
    def __le__(self, other):
        return self.i <= other.i
    def __eq__(self, other):
        return self.i == other.i
assert Integer(0) < Integer(1)
assert Integer(0) != Integer(1)
assert Integer(1) > Integer(0)
assert Integer(1) >= Integer(1)
Definition 2: multiple inheritance
A mixin is a design pattern in which some method of a base class uses a method it does not define, and that method is meant to be implemented by another base class, not by the derived like in Definition 1.
The term mixin class refers to base classes which are intended to be used in that design pattern (TODO those that use the method, or those that implement it?)
It is not easy to decide if a given class is a mixin or not: the method could be just implemented on the derived class, in which case we're back to Definition 1. You have to consider the author's intentions.
This pattern is interesting because it is possible to recombine functionalities with different choices of base classes:
class HasMethod1(object):
    def method(self):
        return 1
class HasMethod2(object):
    def method(self):
        return 2
class UsesMethod10(object):
    def usesMethod(self):
        return self.method() + 10
class UsesMethod20(object):
    def usesMethod(self):
        return self.method() + 20
class C1_10(HasMethod1, UsesMethod10): pass
class C1_20(HasMethod1, UsesMethod20): pass
class C2_10(HasMethod2, UsesMethod10): pass
class C2_20(HasMethod2, UsesMethod20): pass
assert C1_10().usesMethod() == 11
assert C1_20().usesMethod() == 21
assert C2_10().usesMethod() == 12
assert C2_20().usesMethod() == 22
# Nothing prevents implementing the method
# on the derived class like in Definition 1:
class C3_10(UsesMethod10):
    def method(self):
        return 3
assert C3_10().usesMethod() == 13
Authoritative Python occurrences
The official documentation for collections.abc explicitly uses the term Mixin Methods.
It states that if a class:
implements __next__
inherits from a single class Iterator
then the class gets an __iter__ mixin method for free.
Therefore, at least at this point in the documentation, a mixin does not require multiple inheritance, which is coherent with Definition 1.
The documentation could of course be contradictory at different points, and other important Python libraries might be using the other definition in their documentation.
This page also uses the term Set mixin, which clearly suggests that classes like Set and Iterator can be called Mixin classes.
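The `Iterator` case above can be checked with a short sketch. `Countdown` is a made-up class for illustration; it only defines `__next__` and relies on `collections.abc.Iterator` for `__iter__`.

```python
from collections.abc import Iterator


class Countdown(Iterator):
    """Hypothetical iterator: counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1


# list() calls iter(), which uses the __iter__ inherited from Iterator.
values = list(Countdown(3))
defines_iter_itself = "__iter__" in Countdown.__dict__
```

Only single inheritance is involved here, which is what makes this usage fit Definition 1.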
In other languages
Ruby: Clearly does not require multiple inheritance for mixins, as mentioned in major reference books such as Programming Ruby and The Ruby Programming Language.
C++: A virtual method that is set =0 is a pure virtual method.
Definition 1 coincides with the definition of an abstract class (a class that has a pure virtual method).
That class cannot be instantiated.
Definition 2 is possible with virtual inheritance: Multiple Inheritance from two derived classes

I think of them as a disciplined way of using multiple inheritance - because ultimately a mixin is just another python class that (might) follow the conventions about classes that are called mixins.
My understanding of the conventions that govern something you would call a Mixin are that a Mixin:
adds methods but not instance variables (class constants are OK)
only inherits from object (in Python)
That way it limits the potential complexity of multiple inheritance, and makes it reasonably easy to track the flow of your program by limiting where you have to look (compared to full multiple inheritance). They are similar to ruby modules.
If I want to add instance variables (with more flexibility than allowed for by single inheritance) then I tend to go for composition.
Having said that, I have seen classes called XYZMixin that do have instance variables.

What separates a mixin from multiple inheritance? Is it just a matter of semantics?
A mixin is a limited form of multiple inheritance. In some languages the mechanism for adding a mixin to a class is slightly different (in terms of syntax) from that of inheritance.
In the context of Python especially, a mixin is a parent class that provides functionality to subclasses but is not intended to be instantiated itself.
What might cause you to say, "that's just multiple inheritance, not really a mixin" is if the class that might be confused for a mixin can actually be instantiated and used - so indeed it is a semantic, and very real, difference.
Example of Multiple Inheritance
This example, from the documentation, is an OrderedCounter:
from collections import Counter, OrderedDict
class OrderedCounter(Counter, OrderedDict):
    'Counter that remembers the order elements are first encountered'
    def __repr__(self):
        return '%s(%r)' % (self.__class__.__name__, OrderedDict(self))
    def __reduce__(self):
        return self.__class__, (OrderedDict(self),)
It subclasses both the Counter and the OrderedDict from the collections module.
Both Counter and OrderedDict are intended to be instantiated and used on their own. However, by subclassing them both, we can have a counter that is ordered and reuses the code in each object.
This is a powerful way to reuse code, but it can also be problematic. If it turns out there's a bug in one of the objects, fixing it without care could create a bug in the subclass.
Example of a Mixin
Mixins are usually promoted as the way to get code reuse without potential coupling issues that cooperative multiple inheritance, like the OrderedCounter, could have. When you use mixins, you use functionality that isn't as tightly coupled to the data.
Unlike the example above, a mixin is not intended to be used on its own. It provides new or different functionality.
For example, the standard library has a couple of mixins in the socketserver library.
Forking and threading versions of each type of server can be created
using these mix-in classes. For instance, ThreadingUDPServer is
created as follows:
class ThreadingUDPServer(ThreadingMixIn, UDPServer):
    pass
The mix-in class comes first, since it overrides a method defined in
UDPServer. Setting the various attributes also changes the behavior of
the underlying server mechanism.
In this case, the mixin methods override the methods in the UDPServer object definition to allow for concurrency.
The overridden method appears to be process_request and it also provides another method, process_request_thread. Here it is from the source code:
class ThreadingMixIn:
    """Mix-in class to handle each request in a new thread."""
    # Decides how threads will act upon termination of the
    # main process
    daemon_threads = False
    def process_request_thread(self, request, client_address):
        """Same as in BaseServer but as a thread.
        In addition, exception handling is done here.
        """
        try:
            self.finish_request(request, client_address)
        except Exception:
            self.handle_error(request, client_address)
        finally:
            self.shutdown_request(request)
    def process_request(self, request, client_address):
        """Start a new thread to process the request."""
        t = threading.Thread(target = self.process_request_thread,
                             args = (request, client_address))
        t.daemon = self.daemon_threads
        t.start()
A Contrived Example
This is a mixin that is mostly for demonstration purposes - most objects will evolve beyond the usefulness of this repr:
class SimpleInitReprMixin(object):
    """mixin, don't instantiate - useful for classes instantiable
    by keyword arguments to their __init__ method.
    """
    __slots__ = () # allow subclasses to use __slots__ to prevent __dict__
    def __repr__(self):
        kwarg_strings = []
        d = getattr(self, '__dict__', None)
        if d is not None:
            for k, v in d.items():
                kwarg_strings.append('{k}={v}'.format(k=k, v=repr(v)))
        slots = getattr(self, '__slots__', None)
        if slots is not None:
            for k in slots:
                v = getattr(self, k, None)
                kwarg_strings.append('{k}={v}'.format(k=k, v=repr(v)))
        return '{name}({kwargs})'.format(
            name=type(self).__name__,
            kwargs=', '.join(kwarg_strings)
        )
A class that uses it could look like this:
class Foo(SimpleInitReprMixin): # add other mixins and/or extend another class here
    __slots__ = 'foo',
    def __init__(self, foo=None):
        self.foo = foo
        super(Foo, self).__init__()
And in an interactive session:
>>> f1 = Foo('bar')
>>> f2 = Foo()
>>> f1
Foo(foo='bar')
>>> f2
Foo(foo=None)

I think previous responses defined very well what MixIns are. However,
in order to better understand them, it might be useful to compare MixIns with Abstract Classes and Interfaces from the code/implementation perspective:
1. Abstract Class
Class that needs to contain one or more abstract methods
Abstract Class can contain state (instance variables) and non-abstract methods
2. Interface
Interface contains abstract methods only (no non-abstract methods and no internal state)
3. MixIns
MixIns (like Interfaces) do not contain internal state (instance variables)
MixIns contain one or more non-abstract methods (they can contain non-abstract methods unlike interfaces)
In e.g. Python these are just conventions, because all of the above are defined as classes. However, the common feature of both Abstract Classes, Interfaces and MixIns is that they should not exist on their own, i.e. should not be instantiated.
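To make the three conventions concrete, here is a rough sketch using the standard `abc` module. All class names (`Shape`, `Drawable`, `JsonMixin`, `Square`) are invented for this illustration, and Python itself does not enforce the "no state" part for interfaces or mixins.

```python
import json
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract class: holds state and mixes abstract and concrete methods."""

    def __init__(self, name):
        self.name = name

    @abstractmethod
    def area(self):
        ...

    def describe(self):
        return f"{self.name} with area {self.area()}"


class Drawable(ABC):
    """Interface-like class: abstract methods only, no state."""

    @abstractmethod
    def draw(self):
        ...


class JsonMixin:
    """Mixin: a concrete method, no state of its own."""

    def to_json(self):
        return json.dumps(vars(self))


class Square(JsonMixin, Shape, Drawable):
    def __init__(self, side):
        super().__init__("square")  # reaches Shape.__init__ via the MRO
        self.side = side

    def area(self):
        return self.side ** 2

    def draw(self):
        return "[]"


sq = Square(2)
```

Trying `Shape("x")` raises `TypeError` because of the abstract method, whereas nothing stops `JsonMixin()` from being instantiated; for mixins, "do not instantiate" really is only a convention.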

A mixin is a programming concept in which a class provides functionality but is not meant to be instantiated on its own. The main purpose of mixins is to provide standalone functionality, and it is best if mixins do not inherit from other mixins and avoid holding state. Languages such as Ruby have direct language support for mixins; Python does not. However, you can use multiple inheritance to get the same functionality in Python.
I watched this video http://www.youtube.com/watch?v=v_uKI2NOLEM to understand the basics of mixins. It is quite useful for a beginner to understand the basics of mixins and how they work and the problems you might face in implementing them.
Wikipedia is still the best: http://en.wikipedia.org/wiki/Mixin

I think there have been some good explanations here but I wanted to provide another perspective.
In Scala, you can do mixins as has been described here but what is very interesting is that the mixins are actually 'fused' together to create a new kind of class to inherit from. In essence, you do not inherit from multiple classes/mixins, but rather, generate a new kind of class with all the properties of the mixin to inherit from. This makes sense since Scala is based on the JVM where multiple-inheritance is not currently supported (as of Java 8). This mixin class type, by the way, is a special type called a Trait in Scala.
It's hinted at in the way a class is defined:
class NewClass extends FirstMixin with SecondMixin with ThirdMixin
...
I'm not sure if the CPython interpreter does the same (mixin class-composition) but I wouldn't be surprised. Also, coming from a C++ background, I would not call an ABC or 'interface' equivalent to a mixin -- it's a similar concept but divergent in use and implementation.

I'd advise against mix-ins in new Python code, if you can find any other way around it (such as composition-instead-of-inheritance, or just monkey-patching methods into your own classes) that isn't much more effort.
In old-style classes you could use mix-ins as a way of grabbing a few methods from another class. But in the new-style world everything, even the mix-in, inherits from object. That means that any use of multiple inheritance naturally introduces MRO issues.
There are ways to make multiple-inheritance MRO work in Python, most notably the super() function, but it means you have to do your whole class hierarchy using super(), and it's considerably more difficult to understand the flow of control.
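A small sketch of what "doing the whole hierarchy using super()" means in practice. The class names are invented for illustration; the point is only that each `greet` delegates to whatever class comes next in the MRO, not necessarily to its own parent.

```python
class Base:
    def greet(self):
        return ["Base"]


class LoudMixin:
    def greet(self):
        # super() here resolves to the next class in the *instance's* MRO.
        return ["Loud"] + super().greet()


class PoliteMixin:
    def greet(self):
        return ["Polite"] + super().greet()


class Greeter(LoudMixin, PoliteMixin, Base):
    pass


mro_names = [cls.__name__ for cls in Greeter.__mro__]
call_order = Greeter().greet()
```

Here `mro_names` is `['Greeter', 'LoudMixin', 'PoliteMixin', 'Base', 'object']`, and the call chain follows that order. Used on its own, `LoudMixin().greet()` fails with `AttributeError`, because `object` has no `greet`; this is the kind of hidden coupling that makes the flow of control harder to follow.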

Perhaps a couple of examples will help.
If you're building a class and you want it to act like a dictionary, you can define all the various special (double-underscore) methods necessary. But that's a bit of a pain. As an alternative, you can just define a few, and inherit (in addition to any other inheritance) from UserDict.DictMixin (in Python 3, the equivalent is collections.abc.MutableMapping). This will have the effect of automatically defining all the rest of the dictionary API.
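In current Python 3 the same idea is available through `collections.abc.MutableMapping`. As a minimal sketch (the class `LowerCaseDict` is a made-up example), defining five methods is enough to get `update`, `get`, `pop`, `items`, `in` and the rest for free:

```python
from collections.abc import MutableMapping


class LowerCaseDict(MutableMapping):
    """Hypothetical mapping that lower-cases its string keys."""

    def __init__(self):
        self._data = {}

    # The five methods MutableMapping requires:
    def __getitem__(self, key):
        return self._data[key.lower()]

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __delitem__(self, key):
        del self._data[key.lower()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


d = LowerCaseDict()
d["Content-Type"] = "text/html"
d.update({"ACCEPT": "*/*"})  # update() is a mixin method
```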
A second example: the GUI toolkit wxPython allows you to make list controls with multiple columns (like, say, the file display in Windows Explorer). By default, these lists are fairly basic. You can add additional functionality, such as the ability to sort the list by a particular column by clicking on the column header, by inheriting from ListCtrl and adding appropriate mixins.

It's not a Python example, but in the D programming language the term mixin is used to refer to a construct used in much the same way: adding a pile of stuff to a class.
In D (which by the way doesn't do MI) this is done by inserting a template (think syntactically aware and safe macros and you will be close) into a scope. This allows for a single line of code in a class, struct, function, module or whatever to expand to any number of declarations.

The OP mentioned never having heard of mixins in C++; perhaps that is because in C++ they are known as the Curiously Recurring Template Pattern (CRTP). Also, @Ciro Santilli mentioned that a mixin is implemented via an abstract base class in C++. While an abstract base class can be used to implement a mixin, it is overkill, since the run-time behavior of virtual functions can be achieved with templates at compile time, without the overhead of a virtual table lookup at run time.
The CRTP pattern is described in detail here
I have converted the Python example in @Ciro Santilli's answer into C++ using a class template below:
#include <iostream>
#include <cassert>
template <class T>
class ComparableMixin {
public:
    // Logical negation (!) is required here; bitwise ~ on a bool is always true.
    bool operator !=(ComparableMixin &other) {
        return !(*static_cast<T*>(this) == static_cast<T&>(other));
    }
    bool operator <(ComparableMixin &other) {
        return ((*(this) != other) && (*static_cast<T*>(this) <= static_cast<T&>(other)));
    }
    bool operator >(ComparableMixin &other) {
        return !(*static_cast<T*>(this) <= static_cast<T&>(other));
    }
    bool operator >=(ComparableMixin &other) {
        return ((*static_cast<T*>(this) == static_cast<T&>(other)) || (*(this) > other));
    }
protected:
    ComparableMixin() {}
};
class Integer: public ComparableMixin<Integer> {
public:
    Integer(int i) {
        this->i = i;
    }
    int i;
    bool operator <=(Integer &other) {
        return (this->i <= other.i);
    }
    bool operator ==(Integer &other) {
        return (this->i == other.i);
    }
};
int main() {
    Integer i(0);
    Integer j(1);
    //ComparableMixin<Integer> c; // this will cause compilation error because constructor is protected.
    assert (i < j);
    assert (i != j);
    assert (j > i);
    assert (j >= i);
    return 0;
}
EDIT: Added protected constructor in ComparableMixin so that it can only be inherited and not instantiated. Updated the example to show how protected constructor will cause compilation error when an object of ComparableMixin is created.

The concept comes from Steve’s Ice Cream, an ice cream store founded by Steve Herrell in Somerville, Massachusetts, in 1973, where mix-ins (candies, cakes, etc.) were mixed into basic ice cream flavors (vanilla, chocolate, etc.).
Inspired by Steve’s Ice Cream, the designers of the Lisp object system Flavors included the concept in a programming language for the first time, where mix-ins were small helper classes designed for enhancing other classes, and flavors were large standalone classes.
So the main idea is that a mix-in is a reusable extension (’reusable’ as opposed to ‘exclusive’; ‘extension’ as opposed to ‘base’).
The concept is orthogonal to the concepts of single or multiple inheritance and abstract or concrete class. Mix-in classes can be used in single or multiple inheritance and can be abstract or concrete classes. Mix-in classes have incomplete interfaces while abstract classes have incomplete implementations and concrete classes have complete implementations.
Mix-in class names are conventionally suffixed with ‘-MixIn’, ‘-able’, or ‘-ible’ to emphasize their nature, like in the Python standard library with the ThreadingMixIn and ForkingMixIn classes of the socketserver module, and the Hashable, Iterable, Callable, Awaitable, AsyncIterable, and Reversible classes of the collections.abc module.
Here is an example of a mix-in class used for extending the Python built-in list and dict classes with logging capability:
import logging
class LoggingMixIn:
    def __setitem__(self, key, value):
        logging.info('Setting %r to %r', key, value)
        super().__setitem__(key, value)
    def __delitem__(self, key):
        logging.info('Deleting %r', key)
        super().__delitem__(key)
class LoggingList(LoggingMixIn, list):
    pass
class LoggingDict(LoggingMixIn, dict):
    pass
>>> logging.basicConfig(level=logging.INFO)
>>> l = LoggingList([False])
>>> d = LoggingDict({'a': False})
>>> l[0] = True
INFO:root:Setting 0 to True
>>> d['a'] = True
INFO:root:Setting 'a' to True
>>> del l[0]
INFO:root:Deleting 0
>>> del d['a']
INFO:root:Deleting 'a'

A mixin gives you a way to add functionality to a class: you can use methods defined in a module by including that module inside the desired class. Ruby doesn't support multiple inheritance, but it provides mixins as an alternative way to achieve the same effect.
Here is an example that shows how multiple inheritance is emulated using mixins.
module A # you create a module
  def a1 # let's have a method 'a1' in it
  end
  def a2 # another method 'a2'
  end
end
module B # let's say we have another module
  def b1 # a method 'b1'
  end
  def b2 # another method 'b2'
  end
end
class Sample # we create a class 'Sample'
  include A # including module 'A' in the class 'Sample' (mixin)
  include B # including module B as well
  def s1 # class 'Sample' contains a method 's1'
  end
end
samp = Sample.new # creating an instance object 'samp'
# we can access methods from module A and B in our class (power of mixin)
samp.a1 # accessing method 'a1' from module A
samp.a2 # accessing method 'a2' from module A
samp.b1 # accessing method 'b1' from module B
samp.b2 # accessing method 'b2' from module B
samp.s1 # accessing method 's1' inside the class Sample

I just used a Python mixin to implement unit testing for Python milters. Normally, a milter talks to an MTA, making unit testing difficult. The test mixin overrides the methods that talk to the MTA and creates a simulated environment driven by test cases instead.
So, you take an unmodified milter application, like spfmilter, and mixin TestBase, like this:
class TestMilter(TestBase,spfmilter.spfMilter):
    def __init__(self):
        TestBase.__init__(self)
        spfmilter.config = spfmilter.Config()
        spfmilter.config.access_file = 'test/access.db'
        spfmilter.spfMilter.__init__(self)
Then, use TestMilter in the test cases for the milter application:
def testPass(self):
    milter = TestMilter()
    rc = milter.connect('mail.example.com',ip='192.0.2.1')
    self.assertEqual(rc,Milter.CONTINUE)
    rc = milter.feedMsg('test1',sender='good@example.com')
    self.assertEqual(rc,Milter.CONTINUE)
    milter.close()
http://pymilter.cvs.sourceforge.net/viewvc/pymilter/pymilter/Milter/test.py?revision=1.6&view=markup

Maybe an example from ruby can help:
You can include the mixin Comparable and define one function "<=>(other)", the mixin provides all those functions:
<(other)
>(other)
==(other)
<=(other)
>=(other)
between?(min, max)
It does this by invoking <=>(other) and giving back the right result.
"instance <=> other" returns 0 if both objects are equal, a value less than 0 if instance is smaller than other, and a value greater than 0 if instance is bigger than other.

I read that you have a c# background. So a good starting point might be a mixin implementation for .NET.
You might want to check out the codeplex project at http://remix.codeplex.com/
Watch the lang.net Symposium link to get an overview. There is still more to come on documentation on codeplex page.
regards
Stefan

Roughly summarizing all great answers above:
States / Methods | Concrete Method | Abstract Method
Concrete State   | Class           | Abstract Class
Abstract State   | Mixin           | Interface

Related

Are there any unique features provided only by metaclasses in Python?

I have read answers for this question: What are metaclasses in Python? and this question: In Python, when should I use a meta class? and skimmed through documentation: Data model.
It is very possible I missed something, and I would like to clarify: is there anything that metaclasses can do that cannot be properly or improperly (unpythonic, etc) done with the help of other tools (decorators, inheritance, etc)?
That is a bit tricky to answer -
However, it is a very nice question to ask at this point, and there are certainly a few things that are easier to do with metaclasses.
So, first, I think it is important to note the things for which one used to need a metaclass in the past, and no longer needs to: I'd say that with the release of Python 3.6 and the inclusion of __init_subclass__ and __set_name__ dunder methods, a lot, maybe the majority of the cases I had always written a metaclass for (most of them for answering questions or in toy code - no one creates that many production-code metaclasses even in a lifetime as a programmer) became outdated.
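As a rough illustration of `__set_name__`, here is a descriptor that learns the name it was assigned to without any metaclass. The names `Positive` and `Order` are invented for this sketch.

```python
class Positive:
    """Hypothetical descriptor: only accepts values greater than zero."""

    def __set_name__(self, owner, name):
        # Called automatically when the owning class body is finished.
        self.name = name
        self.private = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.private)

    def __set__(self, obj, value):
        if value <= 0:
            raise ValueError(f"{self.name} must be positive")
        setattr(obj, self.private, value)


class Order:
    quantity = Positive()

    def __init__(self, quantity):
        self.quantity = quantity


order = Order(3)
```

Before Python 3.6, telling the descriptor its own attribute name typically meant either repeating the name as an argument or walking the class namespace in a metaclass.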
In particular, __init_subclass__ adds the convenience of being able to transform any attribute or method, like class decorators do, but it is automatically applied on inheritance, which does not happen with decorators.
I guess reading about it was a factor motivating your question, since most metaclasses found in the wild deal with transforming these attributes in the __new__ and __init__ metaclass methods.
However, note that if one needs to transform any attribute prior to having it included in the class, the metaclass __new__ method is the only place it can be done. In most cases, however, one can simply transform it in the final new class namespace.
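A minimal sketch of the kind of thing that used to need a metaclass and can now be done with `__init_subclass__`: a registry that picks up every subclass automatically. `Plugin` and its subclasses are made-up names.

```python
class Plugin:
    registry = {}

    def __init_subclass__(cls, **kwargs):
        # Runs once for every new subclass, direct or indirect.
        super().__init_subclass__(**kwargs)
        cls.registry[cls.__name__.lower()] = cls


class Csv(Plugin):
    pass


class Json(Plugin):
    pass


class PrettyJson(Json):  # indirect subclasses are registered too
    pass
```

Unlike a class decorator, nothing has to be repeated on `PrettyJson`; inheriting is enough. Note that the hook runs after the class object already exists, which is why transformations that must happen before class creation still belong in the metaclass `__new__`.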
Then, one version forward, in 3.7, we had __class_getitem__ implemented - since using the [ ] (__getitem__) operator directly on classes became popular due to typing annotations. Before that, one would have to create a metaclass with a __getitem__ method for the sole purpose of being able to indicate to the type-checker toolchain some extra information like generic variables.
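For illustration, a small sketch of `__class_getitem__` on an ordinary class, with no metaclass involved. `Box` is a made-up class, and the use of `types.GenericAlias` assumes Python 3.9 or later.

```python
from types import GenericAlias


class Box:
    def __init__(self, value):
        self.value = value

    # Implicitly a classmethod; called for Box[...]
    def __class_getitem__(cls, item):
        return GenericAlias(cls, item)


alias = Box[int]
origin = alias.__origin__
args = alias.__args__
boxed = alias(3)  # calling the alias still creates a plain Box
```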
One interesting possibility that did not exist in Python 2, was introduced in Python 3, then outdated, and now can only serve very specific cases is the use of the __prepare__ method on the metaclass:
I don't know if this is written in any official docs, but the obvious primary motivation for metaclass __prepare__, which allows a custom namespace for the class body, was to return an ordered dict, so that one could have ordered attributes in classes that would work as data entities. It turns out that, from Python 3.6 on, class body namespaces were always ordered (which was later formalized for all Python dictionaries in Python 3.7). However, although it is no longer needed for returning an OrderedDict, __prepare__ is still a unique thing in the language, in that it allows a custom mapping class to be used as the namespace in a piece of Python code (even if that is limited to class bodies). For example, one can trivially create an "auto-enumeration" metaclass by returning a custom mapping like this one from __prepare__:
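class MD(dict):
    def __init__(self, *args, **kw):
        super().__init__(*args, **kw)
        self.counter = 0
    def __missing__(self, key):
        # Let dunder names such as __name__ fall through to globals/builtins,
        # otherwise the implicit `__module__ = __name__` would consume a number
        if key.startswith('__') and key.endswith('__'):
            raise KeyError(key)
        counter = self[key] = self.counter
        self.counter += 1
        return counter
class MC(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwd):
        return MD()
class Colors(metaclass=MC):
    RED
    GREEN
    BLUE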
(an example similar to this is included in Luciano Ramalho's 'Fluent Python' 2nd edition)
The __call__ method on the metaclass is also peculiar: it controls the calls to __new__ and __init__ whenever an instance of the class is created. There are recipes around that use this to create a "singleton" - I find those terrible and overkill: if I need a singleton, I just create an instance of the singleton class at module level. However, overriding type.__call__ offers a level of control over class instantiation that may be hard to achieve in the class's __new__ and __init__ themselves. But this definitely can be done by correctly keeping the desired states in the class object itself.
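A minimal sketch of overriding `__call__` on a metaclass (not a singleton recipe): the hypothetical `CountingMeta` wraps the normal instantiation and counts the instances created per class.

```python
class CountingMeta(type):
    def __call__(cls, *args, **kwargs):
        # type.__call__ is what runs cls.__new__ and then cls.__init__.
        instance = super().__call__(*args, **kwargs)
        cls.created = getattr(cls, "created", 0) + 1
        return instance


class Widget(metaclass=CountingMeta):
    def __init__(self, name):
        self.name = name


first = Widget("a")
second = Widget("b")
```

`Widget` itself contains no counting code; the interception happens entirely one level up.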
__subclasscheck__ and __instancecheck__: these are metaclass-only methods, and the only workaround would be to make a class decorator that re-creates the class object so that it becomes a "real" subclass of the intended base class (and that is not always possible).
"hidden" class attributes: now, this can be useful, and is less known, as it derives from the language behavior itself: any attribute or method, besides the dunder methods, defined in a metaclass can be used from a class, but not from instances of that class. An example of this is the .register method in classes using abc.ABCMeta. This contrasts with ordinary classmethods, which can be used normally from an instance.
And finally, any behavior defined with the dunder methods for a Python object can be implemented to work on classes if they are defined in the metaclass. So if you have any use case for "add-able" classes, or want a special repr for your classes, just implement __add__ or __repr__ on the metaclass: this behavior obviously can't be obtained by other means.
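The last two points can be sketched together. In this illustrative example (ComponentMeta, Visual and Collidable are made-up names), register is a plain metaclass method, so it is visible on the classes but not on their instances, while __repr__ and __add__ on the metaclass give the classes themselves a custom repr and make them "add-able":

```python
class ComponentMeta(type):
    def register(cls, plugin):
        # Ordinary method on the metaclass: reachable as Visual.register(...),
        # but not through instances of Visual.
        cls.plugins.append(plugin)
        return plugin

    def __repr__(cls):
        return f"<component {cls.__name__}>"

    def __add__(cls, other):
        # "Adding" two classes builds a new class that inherits from both.
        return ComponentMeta(cls.__name__ + other.__name__, (cls, other), {})


class Visual(metaclass=ComponentMeta):
    plugins = []

    def draw(self):
        return "draw"


class Collidable(metaclass=ComponentMeta):
    plugins = []

    def collide(self):
        return "collide"


Visual.register("shadow")
Sprite = Visual + Collidable

print(repr(Sprite))                   # <component VisualCollidable>
print(hasattr(Visual(), "register"))  # False
```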
I think I got all covered there.

Undoing a decade of singleton pattern and class-level configuration

Overview
I need to duplicate a whole inheritance tree of classes. Simply deep-copying the class objects does not work; a proper factory pattern involves a huge amount of code changes; I'm not sure how to use metaclasses to accomplish this.
Background
The software I work on implements support for specialized external hardware, connected to the host computer via USB. Many years ago, it was assumed that there would only ever be one type of hardware in use at a time. Consequently, the hardware object is used as a singleton. Over the years, secondary classes were configured based on the currently active hardware class.
At the moment, it is impossible to use this library with two types of hardware at the same time, since the classobjects cannot be configured for both hardware at the same time.
In recent years, we have avoided this issue by creating one python process for each hardware, but this is becoming untenable.
Here is an extremely simplified example of the architecture:
# ----------
# Hardware classes
class HwBase():
    def customizeComponent(self, compDict):
        compDict['ComponentBase'].hardware = self

class HwA(HwBase):
    def customizeComponent(self, compDict):
        super().customizeComponent(compDict)
        compDict['AnotherComponent'].prop.configure(1,2,3)

class HwB(HwBase):
    def customizeComponent(self, compDict):
        super().customizeComponent(compDict)
        compDict['AnotherComponent'].prop.configure(4,5,6)

# ----------
# Property classes
class SpecialProperty(property):
    def __init__(self, fvalidate):
        self.fvalidate = fvalidate
        # handle fset, fget, etc. here.
        # super().__init__()

# ----------
# Component classes
class ComponentBase():
    hardware = None
    def validateProp(self, val):
        return val < self.maxVal
    prop = SpecialProperty(fvalidate=validateProp)

class SomeComponent():
    """Users directly instantiate and use this component via an interactive shell.
    This component does complex operations with the hardware attribute"""
    def validateThing(self, val):
        return isinstance(val, ComponentBase)
    thing = SpecialProperty(fvalidate=validateThing)

class AnotherComponent():
    """Users directly instantiate and use this component via an interactive shell
    This component does complex operations with the hardware attribute"""
    maxVal = 15

# ----------
# Initialization
def initialize():
    """ This is only called once per python instance."""
    #activeCls = HwA
    activeCls = HwB
    allComponents = {
        'ComponentBase': ComponentBase,
        'SomeComponent': SomeComponent,
        'AnotherComponent': AnotherComponent
    }
    hwInstance = activeCls()
    hwInstance.customizeComponent(allComponents)
    return allComponents

components = initialize()

# ----------
# User code goes here
someInstance1 = components['SomeComponent']()
someInstance2 = components['SomeComponent']()
someInstance1.prop = 10
someInstance2.prop = 10
The overarching goal would be to interact with both HwA and HwB at the same time. Since most interactions are done via components instead of the Hw objects themselves, I believe the solution involves having multiple versions of the components, e.g.: two separate inheritance trees, for a total of 6 final components, one tree/set configured for each hardware. This is what I need help with.
Potential solutions
Consider that I have tens of different hardware types to configure for. Furthermore, there are hundreds of different leaf component classes, with many extra bases and mixin classes.
Move all configuration steps in the component's init method
Not possible due to the use of properties; these need to be set on the class.
Deepcopy the classobjects
Copy all classobjects, swap in the appropriate __bases__. Mutable class variables need to be carefully handled. However, I'm not sure how to deal with properties for this, since classbody references within the property objects (such as fvalidate) need to be updated to that of the copied class.
This requires a significant amount of manual intervention to work. Not impossible, but prone to breaking in the long term.
Factory pattern
Wrap all component definition in a factory function:
def ComponentBaseFactory(hw):
    class SomeComponent(cache[hw].ComponentBase):
        pass
    return SomeComponent
and have some sort of component cache which would handle creating all classobjects during initialize()
This is what I consider the most architecturally-correct option available. Since the class body is re-executed
on every factory call, the attributes of the properties will reference the appropriate class object.
Downside: huge code footprint. I am familiar with doing codebase-wide changes via sed or python scripts, but this would be quite a lot.
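A minimal sketch of what such a cached factory could look like follows. The make_components function, the limits attribute and the use of functools.lru_cache as the "component cache" are assumptions made for illustration, not part of the real code base:

```python
from functools import lru_cache


class HwA:
    limits = (1, 2, 3)  # hypothetical per-hardware configuration


class HwB:
    limits = (4, 5, 6)


@lru_cache(maxsize=None)
def make_components(hw_cls):
    """Build (once per hardware class) an independent tree of components."""
    hw_instance = hw_cls()

    # The class bodies below are re-executed on every uncached call, so each
    # hardware type gets its own class objects and its own class attributes.
    class ComponentBase:
        hardware = hw_instance

    class SomeComponent(ComponentBase):
        pass

    class AnotherComponent(ComponentBase):
        maxVal = 15
        limits = hw_cls.limits

    return {
        "ComponentBase": ComponentBase,
        "SomeComponent": SomeComponent,
        "AnotherComponent": AnotherComponent,
    }


components_a = make_components(HwA)
components_b = make_components(HwB)
print(components_a["AnotherComponent"].limits)  # (1, 2, 3)
print(components_b["AnotherComponent"].limits)  # (4, 5, 6)
```

Both sets of classes can then be used side by side in the same process, since nothing is shared between the two trees.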
Add metaclasses on components
I am not sure how to proceed for this. Based on the python data model (py3.7), the following happens at class creation (which happens right after the class definition indentation ends):
MRO entries are resolved;
the appropriate metaclass is determined;
the class namespace is prepared;
the class body is executed;
the class object is created.
I would need to redo these steps after the class has been defined (like a factory function!), but I'm not sure how to redo step 4. Specifically, the python documentation states in section 3.3.3.5 that the class body is executed as with a "special?" form of the exec() builtin. How can I re-exec the class body with a different set of locals/globals? Even if I access the class body's code with inspect shenanigans, I'm not sure I'll be able to reproduce the module environment properly.
Even if I mess with __prepare__ and __new__, I don't see how I can fix the cross-references introduced in the class code block regarding the property instantiation.
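For reference, the five steps listed above can be reproduced by hand with the standard types module, provided the class body is available as a callable rather than as a class statement. The sketch below does not re-execute the code of an existing class block; the body function and the build helper are made up for illustration:

```python
import types


def body(ns):
    # Stands in for the class body (step 4): it fills the prepared namespace.
    ns["maxVal"] = 15
    ns["describe"] = lambda self: f"{type(self).__name__} up to {self.maxVal}"


def build(name, bases, exec_body):
    resolved = types.resolve_bases(bases)                 # step 1
    meta, ns, kwds = types.prepare_class(name, resolved)  # steps 2 and 3
    exec_body(ns)                                         # step 4
    return meta(name, resolved, ns, **kwds)               # step 5


# Each call runs the "body" again and yields a distinct class object.
ForHwA = build("AnotherComponent", (), body)
# types.new_class bundles the same steps into a single call.
ForHwB = types.new_class("AnotherComponent", (), exec_body=body)

print(ForHwA is ForHwB)     # False
print(ForHwA().describe())  # AnotherComponent up to 15
```

This is essentially the factory approach again, just spelled with the lower-level hooks.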
Components as metaclasses
A metaclass is a class factory, just like a class is an object factory. SomeComponent and AnotherComponent could be declared as metaclasses, then get instantiated with the Hw object during initialize():
SomeComponent = SomeComponentMeta(hw)
This is similar to the factory pattern, but would also require quite a few code changes: a lot of class code would have to be moved to the metaclass __init__.
I'd have to spend a lot more time here to properly understand what you need, but if your "TL;DR" of executing the class body with different globals/nonlocal variables is the bottom line, the factory approach is a very clean and readable way, as you had considered.
At first, I don't think a metaclass could be a good approach here - although it could be used to customize your special properties (in my first read, I could not figure out what they actually do, and how they should differ between your final classes). If the function as a class factory can specialize your properties, it would work nonetheless.
If what you need is that the properties are independent for HwA and HwB, as in accessing a different list object in HwA than is accessed in HwB, then yes, a metaclass could take care of that by automatically recreating any properties when creating a subclass (so that the property objects themselves are not shared with the superclasses and across the hierarchy).
If that is what you need, leave a comment, and I can write some proof of concept code.
Anyway, it is possible to create a metaclass that, upon instantiating a subclass, will look through the hierarchy for all SpecialProperty instances and create new instances of those for the subclass, so that a base value set on a superclass remains valid for the subclasses, but when configuration runs, each class will have an independent configuration. (As it turns out, no metaclass is needed: we are covered by __init_subclass__.)
Another thing to take care of is that subclasses of property cannot simply be copied with Python's copy.copy (tested empirically), so we need a way to create reliable copies of those. I include one function below, but it might need to be improved to work with the actual SpecialProperty class.
from copy import copy

def copy_property(prop):
    cls = prop.__class__
    new_prop = cls.__new__(cls)
    # Initialize the attributes that can't be set from Python code, inplace:
    property.__init__(new_prop, prop.fget, prop.fset, prop.fdel)
    if hasattr(prop, "__dict__"): # only exists for subclasses of property
        # Possible adaptation needed: it may be that for some attributes of
        # SpecialProperty, a deepcopy would be needed.
        # But for the given example attribute of "fvalidate" a simple copy is better:
        new_prop.__dict__ = copy(prop.__dict__)
    return new_prop

# Python 3.6 introduced `__init_subclass__` which is called at subclass _creation_
# time. With it, the logic can be inserted in ComponentBase and there is no need for
# a metaclass.
class ComponentBase():
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        for attrname in dir(cls):
            attr = getattr(cls, attrname)
            if not isinstance(attr, SpecialProperty):
                continue
            new_prop = copy_property(attr)
            setattr(cls, attrname, new_prop)

    hardware = None
    ...
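To see the effect, here is a self-contained sketch that exercises the same idea. The SpecialProperty below is only a stand-in with a hypothetical configure method and settings attribute, and the two subclasses are invented names; the real class will differ:

```python
from copy import copy


class SpecialProperty(property):
    """Stand-in for the real class: a property carrying per-class settings."""

    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        super().__init__(fget, fset, fdel, doc)
        self.settings = None

    def configure(self, *settings):
        self.settings = settings


def copy_property(prop):
    cls = prop.__class__
    new_prop = cls.__new__(cls)
    property.__init__(new_prop, prop.fget, prop.fset, prop.fdel)
    new_prop.__dict__ = copy(prop.__dict__)
    return new_prop


class ComponentBase:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Give every new subclass its own copy of each SpecialProperty.
        for attrname in dir(cls):
            attr = getattr(cls, attrname)
            if isinstance(attr, SpecialProperty):
                setattr(cls, attrname, copy_property(attr))

    prop = SpecialProperty(lambda self: 0)


class ComponentForHwA(ComponentBase):
    pass


class ComponentForHwB(ComponentBase):
    pass


# Configuring one subclass no longer leaks into its siblings or the base.
ComponentForHwA.prop.configure(1, 2, 3)
ComponentForHwB.prop.configure(4, 5, 6)
print(ComponentForHwA.prop.settings)  # (1, 2, 3)
print(ComponentForHwB.prop.settings)  # (4, 5, 6)
print(ComponentBase.prop.settings)    # None
```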
As you can see, there are some workarounds that had to be done because your project opted for subclassing property. I am leaving this remark here as a reminder that unless property fits one's exact needs, it is cleaner to write a new class implementing the descriptor protocol, just by implementing __set__, __get__ and __delete__ directly.
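A minimal sketch of such a descriptor, assuming all that is needed is validation on assignment (the Validated class and its validate callback are illustrative names, not the project's API):

```python
import copy


class Validated:
    """Hypothetical descriptor: validates on set, stores the value per instance."""

    def __init__(self, validate):
        self.validate = validate

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # class access returns the descriptor itself
        try:
            return instance.__dict__[self.name]
        except KeyError:
            raise AttributeError(self.name) from None

    def __set__(self, instance, value):
        if not self.validate(instance, value):
            raise ValueError(f"invalid value for {self.name}: {value!r}")
        instance.__dict__[self.name] = value

    def __delete__(self, instance):
        instance.__dict__.pop(self.name, None)


class AnotherComponent:
    maxVal = 15
    prop = Validated(lambda self, val: val < self.maxVal)


component = AnotherComponent()
component.prop = 10
print(component.prop)  # 10

# Being a plain Python object, the descriptor can be copied with copy.copy.
clone = copy.copy(AnotherComponent.prop)
```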

pythonic way to expose user override hooks

Note: although my particular use is Flask related, I think the question is more general.
I am building a Flask web application meant to be customized by the user. For example, the user is expected to provide a concrete subclass of a DatabaseInterface and may add to the list of certain ModelObjects that the application knows how to handle.
What is the best way to expose the various hooks to users, and indicate required and optional status? 'Best' here primarily means most 'pythonic', or "easiest for python users to grasp", but other criteria like not causing headaches down the road are certainly worth mentioning.
Some approaches I've considered:
Rely solely on documentation
Create a template file with documented overrides, much like default config files for many servers. E.g.
app = mycode.get_app()
##Add your list of extra foo classes here
#app.extra_foos = []
Create a UserOverrides class with an attr/method for each of the hooks; possibly split into RequiredOverrides and OptionalOverrides
Create an empty class with unimplemented methods that the user must subclass into a concrete instance
One method is by using abstract base classes (abc module). For example, you can define an ABC with abstract methods that must be overridden by child classes like this:
from abc import ABC, abstractmethod

class MyClass(ABC): # inherit from ABC
    def __init__(self):
        pass

    @abstractmethod
    def some_method(self, args):
        # must be overridden by child class
        pass
You would then implement a child class like:
class MyChild(MyClass):
    # uses parent's __init__ by default
    def some_method(self, args):
        # overrides the abstract method
        pass
You can specify what everything needs to do in the overridden methods with documentation. There are also decorators for abstract properties, class methods, and static methods. Attempting to instantiate an ABC that does not have all of its abstract methods/properties overridden will result in an error.
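As a short illustration of required versus optional hooks, here is a sketch using the DatabaseInterface name from the question. The connect and on_startup methods are invented for the example:

```python
from abc import ABC, abstractmethod


class DatabaseInterface(ABC):
    @abstractmethod
    def connect(self):
        """Required hook: subclasses must implement this."""

    def on_startup(self):
        """Optional hook: the default implementation does nothing."""
        return None


class Incomplete(DatabaseInterface):
    pass  # forgot to implement connect()


class SqliteDatabase(DatabaseInterface):
    def connect(self):
        return "connected"


try:
    Incomplete()
except TypeError as exc:
    error = str(exc)  # the message names the missing abstract method

db = SqliteDatabase()
print(db.connect())  # connected
```

The failure happens at instantiation time, not when the subclass is defined, so the user sees the problem as soon as the application tries to create their object.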
Inheritance. Is. Bad.
This is especially true in Python, which gives you a nice precedent to avoid the issue. Consider the following code:
len({1,2,3}) # set with length 3
len([1,2,3]) # list with length 3
len((1,2,3)) # tuple with length 3
Which is cool and all for the built-in data structures, but what if you want to make your own data structure and have it work with Python's len? Simple:
class Duple(object):
    def __init__(self, fst, snd):
        super(Duple, self).__init__()
        self.fst = fst
        self.snd = snd

    def __len__(self):
        return 2
A Duple is a two-element (only) data structure (calling it with more or fewer arguments raises a TypeError) and now works with len:
len(Duple(1,2)) # 2
Which is exactly how you should do this:
def foo(arg):
    return arg.__foo__()
Any class that wants to work with your foo function just implements the __foo__ magic method, which is how len works under the hood.

In Python, when should I use a meta class?

I have gone through this: What is a metaclass in Python?
But can any one explain more specifically when should I use the meta class concept and when it's very handy?
Suppose I have a class like below:
class Book(object):
    CATEGORIES = ['programming','literature','physics']

    def _get_book_name(self,book):
        return book['title']

    def _get_category(self, book):
        for cat in self.CATEGORIES:
            if book['title'].find(cat) > -1:
                return cat
        return "Other"

if __name__ == '__main__':
    b = Book()
    dummy_book = {'title':'Python Guide of Programming', 'status':'available'}
    print(b._get_category(dummy_book))
For this class.
In which situation should I use a meta class and why is it useful?
Thanks in advance.
You use metaclasses when you want to mutate the class as it is being created. Metaclasses are hardly ever needed, they're hard to debug, and they're difficult to understand -- but occasionally they can make frameworks easier to use. In our 600Kloc code base we've used metaclasses 7 times: ABCMeta once, 4x models.SubfieldBase from Django, and twice a metaclass that makes classes usable as views in Django. As @Ignacio writes, if you don't know that you need a metaclass (and have considered all other options), you don't need a metaclass.
Conceptually, a class exists to define what a set of objects (the instances of the class) have in common. That's all. It allows you to think about the instances of the class according to that shared pattern defined by the class. If every object was different, we wouldn't bother using classes, we'd just use dictionaries.
A metaclass is an ordinary class, and it exists for the same reason; to define what is common to its instances. The default metaclass type provides all the normal rules that make classes and instances work the way you're used to, such as:
Attribute lookup on an instance checks the instance followed by its class, followed by all superclasses in MRO order
Calling MyClass(*args, **kwargs) invokes i = MyClass.__new__(MyClass, *args, **kwargs) to get an instance, then invokes i.__init__(*args, **kwargs) to initialise it
A class is created from the definitions in a class block by making all the names bound in the class block into attributes of the class
Etc
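The second rule can be made concrete with a rough sketch of what the default type.__call__ does. The construct function below is a simplified illustration, not the actual implementation, and Point is a made-up class:

```python
calls = []


class Point:
    def __new__(cls, *args, **kwargs):
        calls.append("new")
        return super().__new__(cls)

    def __init__(self, x, y):
        calls.append("init")
        self.x = x
        self.y = y


def construct(cls, *args, **kwargs):
    """Simplified picture of what happens when you write cls(*args, **kwargs)."""
    obj = cls.__new__(cls, *args, **kwargs)
    # __init__ only runs if __new__ returned an instance of cls.
    if isinstance(obj, cls):
        type(obj).__init__(obj, *args, **kwargs)
    return obj


p = construct(Point, 1, 2)  # spelled out by hand
q = Point(1, 2)             # the same steps, driven by type.__call__
print(calls)                # ['new', 'init', 'new', 'init']
```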
If you want to have some classes that work differently to normal classes, you can define a metaclass and make your unusual classes instances of the metaclass rather than type. Your metaclass will almost certainly be a subclass of type, because you probably don't want to make your different kind of class completely different; just as you might want to have some sub-set of Books behave a bit differently (say, books that are compilations of other works) and use a subclass of Book rather than a completely different class.
If you're not trying to define a way of making some classes work differently to normal classes, then a metaclass is probably not the most appropriate solution. Note that the "classes define how their instances work" is already a very flexible and abstract paradigm; most of the time you do not need to change how classes work.
If you google around, you'll see a lot of examples of metaclasses that are really just being used to go do a bunch of stuff around class creation; often automatically processing the class attributes, or finding new ones automatically from somewhere. I wouldn't really call those great uses of metaclasses. They're not changing how classes work, they're just processing some classes. A factory function to create the classes, or a class method that you invoke immediately after class creation, or best of all a class decorator, would be a better way to implement this sort of thing, in my opinion.
But occasionally you find yourself writing complex code to get Python's default behaviour of classes to do something conceptually simple, and it actually helps to step "further out" and implement it at the metaclass level.
A fairly trivial example is the "singleton pattern", where you have a class of which there can only be one instance; calling the class will return an existing instance if one has already been created. Personally I am against singletons and would not advise their use (I think they're just global variables, cunningly disguised to look like newly created instances in order to be even more likely to cause subtle bugs). But people use them, and there are huge numbers of recipes for making singleton classes using __new__ and __init__. Doing it this way can be a little irritating, mainly because Python wants to call __new__ and then call __init__ on the result of that, so you have to find a way of not having your initialisation code re-run every time someone requests access to the singleton. But wouldn't it be easier if we could just tell Python directly what we want to happen when we call the class, rather than trying to set up the things that Python wants to do so that they happen to do what we want in the end?
class Singleton(type):
    def __init__(self, *args, **kwargs):
        super(Singleton, self).__init__(*args, **kwargs)
        self.__instance = None

    def __call__(self, *args, **kwargs):
        if self.__instance is None:
            self.__instance = super(Singleton, self).__call__(*args, **kwargs)
        return self.__instance
Under 10 lines, and it turns normal classes into singletons simply by adding __metaclass__ = Singleton (Python 2 syntax; in Python 3, write class Foo(metaclass=Singleton) instead), i.e. nothing more than a declaration that they are a singleton. It's just easier to implement this sort of thing at this level, than to hack something out at the class level directly.
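For reference, a Python 3 spelling of the same idea with a usage example. The Config class is made up for illustration, and _instance replaces the name-mangled attribute purely for readability:

```python
class Singleton(type):
    def __init__(cls, *args, **kwargs):
        super().__init__(*args, **kwargs)
        cls._instance = None

    def __call__(cls, *args, **kwargs):
        # Only the first call reaches type.__call__, so __new__ and
        # __init__ of the class run exactly once.
        if cls._instance is None:
            cls._instance = super().__call__(*args, **kwargs)
        return cls._instance


class Config(metaclass=Singleton):
    init_count = 0

    def __init__(self, name="default"):
        type(self).init_count += 1
        self.name = name


first = Config("a")
second = Config("b")      # arguments are ignored: the instance already exists
print(first is second)    # True
print(second.name)        # a
print(Config.init_count)  # 1
```

Note how the second call silently discards its arguments, which is one of the subtle traps of the pattern mentioned above.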
But for your specific Book class, it doesn't look like you have any need to do anything that would be helped by a metaclass. You really don't need to reach for metaclasses unless you find the normal rules of how classes work are preventing you from doing something that should be simple in a simple way (which is different from "man, I wish I didn't have to type so much for all these classes, I wonder if I could auto-generate the common bits?"). In fact, I have never actually used a metaclass for something real, despite using Python every day at work; all my metaclasses have been toy examples like the above Singleton or else just silly exploration.
A metaclass is used whenever you need to override the default behavior for classes, including their creation.
A class gets created from the name, a tuple of bases, and a class dict. You can intercept the creation process to make changes to any of those inputs.
You can also override any of the services provided by classes:
__call__ which is used to create instances
__getattribute__ which is used to lookup attributes and methods on a class
__setattr__ which controls setting attributes
__repr__ which controls how the class is displayed
In summary, metaclasses are used when you need to control how classes are created or when you need to alter any of the services provided by classes.
If you for whatever reason want to do stuff like Class[x], x in Class etc., you have to use metaclasses:
class Meta(type):
    def __getitem__(cls, x):
        return x ** 2

    def __contains__(cls, x):
        return int(x ** (0.5)) == x ** 0.5

# Python 2.x (use only the definition that matches your version)
class Class(object):
    __metaclass__ = Meta

# Python 3.x
class Class(metaclass=Meta):
    pass

print(Class[2])
print(4 in Class)
check the link Meta Class Made Easy to know how and when to use meta class.

Dynamic sub-classing in Python

I have a number of atomic classes (Components/Mixins, not really sure what to call them) in a library I'm developing, which are meant to be subclassed by applications. This atomicity was created so that applications can only use the features that they need, and combine the components through multiple inheritance.
However, sometimes this atomicity cannot be ensured because some component may depend on another one. For example, imagine I have a component that gives a graphical representation to an object, and another component which uses this graphical representation to perform some collision checking. The first is purely atomic, however the latter requires that the current object already subclassed this graphical representation component, so that its methods are available to it. This is a problem, because we have to somehow tell the users of this library, that in order to use a certain Component, they also have to subclass this other one. We could make this collision component sub class the visual component, but if the user also subclasses this visual component, it wouldn't work because the class is not on the same level (unlike a simple diamond relationship, which is desired), and would give the cryptic meta class errors which are hard to understand for the programmer.
Therefore, I would like to know if there is any cool way, maybe through metaclass redefinition or class decorators, to mark these unatomic components so that, when they are subclassed, the additional dependency is injected into the current object if it's not yet available. Example:
class AtomicComponent(object):
    pass

@depends(AtomicComponent) # <- something like this?
class UnAtomicComponent(object):
    pass

class UserClass(UnAtomicComponent): #automatically includes AtomicComponent
    pass

class UserClass2(AtomicComponent, UnAtomicComponent): #also works without problem
    pass
Can someone give me a hint on how I can do this? Or is it even possible?
edit:
Since it is debatable that the meta class solution is the best one, I'll leave this unaccepted for 2 days.
Other solutions might be to improve error messages, for example, doing something like UserClass2 would give an error saying that UnAtomicComponent already extends this component. This however creates the problem that it is impossible to use two UnAtomicComponents, given that they would subclass object on different levels.
"Metaclasses"
This is what they are for! At class creation time, the class parameters run through the metaclass code, where you can check the bases and change them, for example.
This runs without error, though it does not preserve the order of the needed classes marked with the "depends" decorator:
class AutoSubclass(type):
    def __new__(metacls, name, bases, dct):
        new_bases = set()
        for base in bases:
            if hasattr(base, "_depends"):
                for dependence in base._depends:
                    if dependence not in bases:
                        new_bases.add(dependence)
        bases = bases + tuple(new_bases)
        return type.__new__(metacls, name, bases, dct)

__metaclass__ = AutoSubclass

def depends(*args):
    def decorator(cls):
        cls._depends = args
        return cls
    return decorator

class AtomicComponent:
    pass

@depends(AtomicComponent) # <- something like this?
class UnAtomicComponent:
    pass

class UserClass(UnAtomicComponent): #automatically includes AtomicComponent
    pass

class UserClass2(AtomicComponent, UnAtomicComponent): #also works without problem
    pass
(I removed inheritance from "object", as I declared a global __metaclass__ variable. All classes will still be new-style classes and have this metaclass. Inheriting from object or another class does override the global __metaclass__ variable, and a class-level __metaclass__ will have to be declared. Note that the module-level __metaclass__ hook only exists in Python 2.)
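On Python 3 the same idea needs the metaclass to be attached through a common base class. The following is a sketch under that assumption: the Component root class and the draw/collide methods are invented for illustration. It also keeps the dependencies in declaration order and skips those already present somewhere in the hierarchy:

```python
class AutoSubclass(type):
    def __new__(metacls, name, bases, namespace, **kwargs):
        extra = []
        for base in bases:
            for dep in getattr(base, "_depends", ()):
                already_there = any(issubclass(b, dep) for b in bases)
                if not already_there and dep not in extra:
                    extra.append(dep)  # a list preserves declaration order
        return super().__new__(metacls, name, bases + tuple(extra), namespace, **kwargs)


def depends(*deps):
    def decorator(cls):
        cls._depends = deps
        return cls
    return decorator


class Component(metaclass=AutoSubclass):
    """Hypothetical common root that carries the metaclass."""


class AtomicComponent(Component):
    def draw(self):
        return "drawn"


@depends(AtomicComponent)
class UnAtomicComponent(Component):
    def collide(self):
        return "collision check using " + self.draw()


class UserClass(UnAtomicComponent):  # AtomicComponent is injected
    pass


class UserClass2(AtomicComponent, UnAtomicComponent):  # explicit, also fine
    pass


print(UserClass().collide())  # collision check using drawn
```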
-- edit --
Without metaclasses, the way to go is to have your classes properly inherit from their dependencies. They will no longer be that "atomic", but since they could not work being that atomic, it may not matter.
In the example below, classes C and D would be your User classes:
>>> class A(object): pass
...
>>> class B(A, object): pass
...
>>>
>>> class C(B): pass
...
>>> class D(B,A): pass
...
>>>
