Is this a valid use of metaclasses - python

I've been watching some videos on decorators and metaclasses and I think I understand them better now. One maxim I took away was "don't use metaclasses if you can do it more simply without using them". Some time ago I wrote a metaclass without really understanding what I was doing and I went back and reviewed it. I'm pretty certain that I've done something sensible here but I thought I'd check ....
PS: I'm mildly concerned that the Colour class is used in the metaclass definition; I feel it ought to be used at the class level, but that would complicate the code.
import webcolors

# This is a holding class for demo purposes
# actual code allows much more, e.g. 0.3*c0 + 0.7*c1
class Colour(list):
    def __init__(self, *arg):
        super().__init__(arg)

# define a metaclass to implement Class.name
class MetaColour(type):
    def __getattr__(cls, name):
        try:
            _ = webcolors.name_to_rgb(name)
            return Colour(_.blue, _.green, _.red)
        except ValueError as e:
            raise ValueError(f"{name} is not a valid name for a colour ({e})")

# a class based on the metaclass MetaColour
class Colours(metaclass=MetaColour):
    pass

print("blue = ", Colours.blue)
print("green = ", Colours.green)
print("lime = ", Colours.lime)
print("orange = ", Colours.orange)
print()
print("lilac = ", Colours.lilac)
Edit: I realise I could have written the Colour class so that Colour("red") was equivalent to Colours.red but felt at the time that using Colours.red was more elegant and added the implication that the Colour 'red' was a constant, not something that has to be looked up and can vary.

If you really need Colours to be a class, then this metaclass just does its job - and seems fine. There is no problem at all in using Colour inside it - there is no such rule as "metaclass code cannot make use of any 'ordinary' class" - it is Python code as usual.
The one remark I'd make is that maybe you don't need Colours to be a class at all: instead, create the Colours class with all the functionality you need, and create a single instance of it. The remainder of the code then uses this instance instead of the Colours class.
Yes, a single instance is the "singleton pattern" - but unlike some complicated code you can find around on how to make your class "be a singleton" (including some widely spread bad practice about needing a metaclass to have a singleton in Python), you can just create the instance, assign it to a name, and be done with it - just like the webcolors object you are already using in your example.
For an extra singleton bonus, you can name your single instance of Colours Colours, shadowing the class and preventing any accidental use of the class instead of the instance.
(And, although it might be obvious, for the sake of completeness: in the "use Colours as an instance" case there is no need for this metaclass at all - the same __getattr__ method goes into the class body.)
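A minimal sketch of that instance-based version, reusing the Colour class and webcolors lookup from the question (the _Colours name is just a placeholder):

import webcolors

class Colour(list):
    def __init__(self, *arg):
        super().__init__(arg)

class _Colours:
    def __getattr__(self, name):
        # instance __getattr__ runs only when normal lookup fails,
        # so every unknown attribute becomes a colour lookup
        try:
            rgb = webcolors.name_to_rgb(name)
            return Colour(rgb.blue, rgb.green, rgb.red)
        except ValueError as e:
            raise ValueError(f"{name} is not a valid name for a colour ({e})")

Colours = _Colours()  # the single instance; the rest of the code uses it as before

print("blue = ", Colours.blue)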
Of course, again, if you have uses for Colours as a class, there is no problem with this design.

Related

How to test a method where the setup and checking depend on untested methods?

I've created an example class (a bitmask class) which has 4 really simple functions. I've also created a unit-test for this class.
import unittest

class BitMask:
    def __init__(self):
        self.__mask = 0

    def set(self, slot):
        self.__mask |= (1 << slot)

    def remove(self, slot):
        self.__mask &= ~(1 << slot)

    def has(self, slot):
        return (self.__mask >> slot) & 1

    def clear(self):
        self.__mask = 0

class TestBitmask(unittest.TestCase):
    def setUp(self):
        self.bitmask = BitMask()

    def test_set_on_valid_input(self):
        self.bitmask.set(5)
        self.assertEqual(self.bitmask.has(5), True)

    def test_has_on_valid_input(self):
        self.bitmask.set(5)
        self.assertEqual(self.bitmask.has(5), True)

    def test_remove_on_valid_input(self):
        self.bitmask.set(5)
        self.bitmask.remove(5)
        self.assertEqual(self.bitmask.has(5), False)

    def test_clear(self):
        for i in range(16):
            self.bitmask.set(i)
        self.bitmask.clear()
        for j in range(16):
            with self.subTest(j=j):
                self.assertEqual(self.bitmask.has(j), False)
The problem I'm facing is that all these tests require both the set and has methods for setting and checking values in the bitmask, but those methods are themselves untested. I cannot confirm that one is correct without knowing that the other one is.
This example class isn't the first time I've experienced this issue. It usually occurs when I need to set up and check values/states within a class in order to test a method.
I've tried to find resources that explain this, but unfortunately their examples only use pure functions, or classes where the changed attribute can be read directly. I could solve the problem by extracting the methods into pure functions, or by using a read-only property that returns the attribute __mask.
But is this the preferred approach? If not, how do I test a method that needs to be set up and/or checked using untested methods?
Not sure this answers your question, as it involves changing the initial class design, but here goes.
You have made a closed-off class with no constructor arguments or properties, which hides the state of your object. The real problem is not that the set or has methods are untested; it is that the state of the object is unobservable. If you had a .value property revealing self.__mask, that would settle the question of testing .set() and .has().
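For instance, a minimal sketch of such a property:

class BitMask:
    def __init__(self):
        self.__mask = 0

    @property
    def value(self):
        # read-only view of the internal state, handy for test assertions
        return self.__mask

Tests can then assert on bitmask.value directly instead of routing every check through has().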
Also, I would strongly consider a default value in the constructor, which makes for better-looking instantiation and allows easier testing (some advice on avoiding setters in Python is here).

def __init__(self, mask=0):
    self.__mask = mask
If there are any design considerations that prevent you from having a .value property, perhaps an __eq__ method can be used, if __init__ accepts a value.

a = BitMask(0)
b = BitMask(1 << 5)  # bit 5 set, i.e. the state that set(5) should produce
a.set(5)
assert a == b
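A minimal sketch of that __eq__, assuming the mask=0 constructor suggested above:

class BitMask:
    def __init__(self, mask=0):
        self.__mask = mask

    def __eq__(self, other):
        if not isinstance(other, BitMask):
            return NotImplemented
        # name mangling still resolves inside the defining class
        return self.__mask == other.__mask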
Of course, you can challenge that on how __eq__ is tested itself.
Finally, perhaps you are familiar with patching or monkey-patching - a technique for blocking something inside an object under test or making it work differently (e.g. imitating a web response without an actual call). With any of the patching libraries I think you would still end up performing a kind of x.__mask = value assignment, which is not too reasonable for a small, nice, locally-defined class like the one here.
Hope this helps with what you are exploring.
I would’ve used single underscore instead of double, and just looked directly at the _mask in unit test.
Python doesn’t really have private attributes or methods, even double underscore attributes are accessible on your instance like this: obj._BitMask__mask.
Double underscore is used when you want subclasses to not overwrite the attribute of superclass. To indicate “private” you should use single underscore.
Allowing access to private fields is a part of python's design, so using this ability responsibly is not considered wrong, doubly so if you are accessing your own class.
The rationale behind "Do not touch the private fields" is that you as the developer can mess something up with the internals of the class, also private interface of s library can change at any point and break your code.
When you are writing unit tests you are not afraid of messing with your own class, and is accepting that you have to change unit test if you change your class, so this programming idiom is not useful for you to apply.
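A sketch of that style, with the attribute renamed to the single-underscore _mask so the test can inspect the state directly:

import unittest

class BitMask:
    def __init__(self):
        self._mask = 0

    def set(self, slot):
        self._mask |= (1 << slot)

class TestBitmask(unittest.TestCase):
    def test_set_on_valid_input(self):
        bitmask = BitMask()
        bitmask.set(5)
        # assert on the state itself instead of going through has()
        self.assertEqual(bitmask._mask, 1 << 5)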

Tricky method for overriding a method in several sibling classes required

Imagine a situation in which a large set of animal classes, which cannot be modified, all inherit from the same parent class "Animal", and each implements a method called "make_noise", each with a slightly different signature, but all with a shared parameter volume:

class Cat(Animal):
    def make_noise(self, volume, duration):
        ...  # some code here

class Mouse(Animal):
    def make_noise(self, volume, pitch):
        ...  # some different code here
A different "controller" class, which also cannot be modified, is instructing a list of these animal instances (a list which I have populated) to make sounds at particular volumes (and duration/pitch/etc as appropriate). However, I need to get between the controller and animal classes to modify the behaviour of "make_noise" in all animal classes, so that I can reduce the value of volume before the sound is made.
One option would be to do something like:
def animal_monkeypatcher(animal_class, volume_reduction_factor):
    class QuietAnimal(animal_class):
        def make_noise(self, volume, **kwargs):
            volume = volume * volume_reduction_factor
            super(QuietAnimal, self).make_noise(volume, **kwargs)
    return QuietAnimal
However, I also need to pickle these objects, and that doesn't work with this approach. The next approach I thought about was a class which had an instance of the animal like so...
class QuietAnimal:
    def __init__(self, animal_class, init_kwargs):
        self.animal = animal_class(**init_kwargs)

    def make_noise(self, volume, **kwargs):
        # (volume_reduction_factor would need to be stored somewhere, e.g. in __init__)
        volume = volume * volume_reduction_factor
        self.animal.make_noise(volume, **kwargs)

    # ... lots of other functions ...
However, this is also not suitable because the controller sometimes needs to create new instances of animals. It does this by getting the class of an animal (which is QuietAnimal, instead of say Mouse) and then using the same set of init_kwargs to create it, which does not match the signature of QuietAnimal, so again we're stuck...
At the moment I have a horrible hack, which basically forks the init depending on whether or not an animal_class has been passed in or not, and records some info in some class variables. It's frankly dreadful, and not useful if I need to create more than one type of animal (which in my use case I don't, but still...). It's also rubbish because I have to include all of the methods from all of the animals.
What is the appropriate way to wrap/proxy/whatever this set of classes to achieve the above? Some sample code would be greatly appreciated. I am a Python novice.
It turns out that what I needed was standard "monkey patching". Not a great idea either, but slightly cleaner than creating classes on the fly.
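For illustration, a minimal sketch of that approach; the quieten helper and the 0.5 factor are invented for the example, and it assumes every make_noise takes volume as its first argument:

def quieten(animal_class, volume_reduction_factor):
    # rebind make_noise on the class itself, keeping the class identity
    original_make_noise = animal_class.make_noise

    def quiet_make_noise(self, volume, **kwargs):
        return original_make_noise(self, volume * volume_reduction_factor, **kwargs)

    animal_class.make_noise = quiet_make_noise

for animal_class in (Cat, Mouse):
    quieten(animal_class, 0.5)

Because Cat is still Cat afterwards, pickling keeps working and the controller can keep re-instantiating animals from their classes without ever seeing a wrapper class.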

How to tell if a class is abstract in Python 3?

I wrote a metaclass that automatically registers its classes in a dict at runtime. In order for it to work properly, it must be able to ignore abstract classes.
The code works really well in Python 2, but I've run into a wall trying to make it compatible with Python 3.
Here's what the code looks like currently:
from abc import ABCMeta

def AutoRegister(registry, base_type=ABCMeta):
    class _metaclass(base_type):
        def __init__(self, what, bases=None, attrs=None):
            super(_metaclass, self).__init__(what, bases, attrs)

            # Do not register abstract classes.
            # Note that we do not use `inspect.isabstract` here, as
            # that only detects classes with unimplemented abstract
            # methods - which is a valid approach, but not what we
            # want here.
            # :see: http://stackoverflow.com/a/14410942/
            metaclass = attrs.get('__metaclass__')
            if not (metaclass and issubclass(metaclass, ABCMeta)):
                registry.register(self)

    return _metaclass
Usage in Python 2 looks like this:
# Abstract classes; these are not registered.
class BaseWidget(object): __metaclass__ = AutoRegister(widget_registry)
class BaseGizmo(BaseWidget): __metaclass__ = ABCMeta
# Concrete classes; these get registered.
class AlphaWidget(BaseWidget): pass
class BravoGizmo(BaseGizmo): pass
What I can't figure out, though, is how to make this work in Python 3.
How can a metaclass determine if it is initializing an abstract class in Python 3?
PEP 3119 describes how the ABCMeta metaclass "marks" abstract methods and creates an __abstractmethods__ frozenset containing all methods of a class that are still abstract. So, to check whether a class cls is abstract, check whether cls.__abstractmethods__ is empty or not.
I also found this relevant post on abstract classes useful.
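For illustration, a quick sketch of that check:

from abc import ABC, abstractmethod

class Base(ABC):
    @abstractmethod
    def run(self):
        ...

class Concrete(Base):
    def run(self):
        return 42

# a non-empty frozenset means the class is still abstract
print(Base.__abstractmethods__)      # frozenset({'run'})
print(Concrete.__abstractmethods__)  # frozenset()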
I couldn't shake the feeling as I was posting this question that I was dealing with an XY Problem. As it turns out, that's exactly what was going on.
The real issue here is that the AutoRegister metaclass, as implemented, relies on a flawed understanding of what an abstract class is. Python or not, one of the most important criteria of an abstract class is that it is not instantiable.
In the example posted in the question, BaseWidget and BaseGizmo are instantiable, so they are not abstract.
Aren't we just bifurcating rabbits here?
Well, why was I having so much trouble getting AutoRegister to work in Python 3? Because I was trying to build something whose behavior contradicts the way classes work in Python.
The fact that inspect.isabstract wasn't returning the result I wanted should have been a major red flag: AutoRegister is a warranty-voider.
So what's the real solution then?
First, we have to recognize that BaseWidget and BaseGizmo have no reason to exist. They do not provide enough functionality to be instantiable, nor do they declare abstract methods that describe the functionality that they are missing.
One could argue that they could be used to "categorize" their sub-classes, but a) that's clearly not what's going on in this case, and b) quack.
Instead, we could embrace Python's definition of "abstract":
Modify BaseWidget and BaseGizmo so that they define one or more abstract methods.
If we can't come up with any abstract methods, then can we remove them entirely?
If we can't remove them but also can't make them properly abstract, it might be worthwhile to take a step back and see if there are other ways we might solve this problem.
Modify the definition of AutoRegister so that it uses inspect.isabstract to decide if a class is abstract; a sketch of that final implementation follows below.
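A sketch of that version, assuming (as in the question) that registry exposes a register(cls) method:

import inspect
from abc import ABCMeta

def AutoRegister(registry, base_type=ABCMeta):
    class _metaclass(base_type):
        def __init__(cls, what, bases=None, attrs=None):
            super(_metaclass, cls).__init__(what, bases, attrs)
            # by this point ABCMeta has populated __abstractmethods__,
            # so inspect.isabstract gives the right answer
            if not inspect.isabstract(cls):
                registry.register(cls)
    return _metaclass

# Python 3 usage: class BaseWidget(metaclass=AutoRegister(widget_registry)): ...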
That's cool and all, but what if I can't change the base classes?
Or, if you have to maintain backwards compatibility with existing code (as was the case for me), a decorator is probably easier:
@widget_registry.register
class AlphaWidget(object):
    pass

@widget_registry.register
class BravoGizmo(object):
    pass
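For this to work, register just has to return the class it is given; a hypothetical minimal registry (the Registry name is invented for the example):

class Registry:
    def __init__(self):
        self._classes = {}

    def register(self, cls):
        self._classes[cls.__name__] = cls
        return cls  # hand the class back so the decorator leaves it intact

widget_registry = Registry()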

Does Python require intimate knowledge of all classes in the inheritance chain?

Python classes have no concept of public/private, so we are told to not touch something that starts with an underscore unless we created it. But does this not require complete knowledge of all classes from which we inherit, directly or indirectly? Witness:
class Base(object):
    def __init__(self):
        super(Base, self).__init__()
        self._foo = 0

    def foo(self):
        return self._foo + 1

class Sub(Base):
    def __init__(self):
        super(Sub, self).__init__()
        self._foo = None

Sub().foo()
Expectedly, a TypeError is raised when None + 1 is evaluated. So I have to know that _foo exists in the base class. To get around this, __foo can be used instead, which solves the problem by mangling the name. This seems to be, if not elegant, an acceptable solution. However, what happens if Base inherits from a class (in a separate package) called Sub? Now __foo in my Sub overrides __foo in the grandparent Sub.
This implies that I have to know the entire inheritance chain, including all "private" objects each uses. The fact that Python is dynamically-typed makes this even harder, since there are no declarations to search for. The worst part, however, is probably the fact Base might inherit from object right now, but in some future release, it switches to inheriting from Sub. Clearly if I know Sub is inherited from, I can rename my class, however annoying that is. But I can't see into the future.
Is this not a case where a true private data type would prevent a problem? How, in Python, can I be sure that I'm not accidentally stepping on somebody's toes if those toes might spring into existence at some point in the future?
EDIT: I've apparently not made clear the primary question. I'm familiar with name mangling and the difference between a single and a double underscore. The question is: how do I deal with the fact that I might clash with classes whose existence I don't know of right now? If my parent class (which is in a package I did not write) happens to start inheriting from a class with the same name as my class, even name mangling won't help. Am I wrong in seeing this as a (corner) case that true private members would solve, but that Python has trouble with?
EDIT: As requested, the following is a full example:
File parent.py:
class Sub(object):
    def __init__(self):
        self.__foo = 12

    def foo(self):
        return self.__foo + 1

class Base(Sub):
    pass
File sub.py:
import parent

class Sub(parent.Base):
    def __init__(self):
        super(Sub, self).__init__()
        self.__foo = None

Sub().foo()
The grandparent's foo is called, but my __foo is used: since both classes are named Sub, both __foo references mangle to the same _Sub__foo attribute.
Obviously you wouldn't write code like this yourself, but parent could easily be provided by a third party, the details of which could change at any time.
Use private names (instead of protected ones), starting with a double underscore:
class Sub(Base):
    def __init__(self):
        super(Sub, self).__init__()
        self.__foo = None
        # ^^

will not conflict with _foo or __foo in Base. This is because Python prefixes the name with an underscore and the name of the class; the following two lines are equivalent:

class Sub(Base):
    def x(self):
        self.__foo = None  # .. is the same as ..
        self._Sub__foo = None
(In response to the edit:) The chance that two classes in a class hierarchy not only have the same name, but that they are both using the same property name, and are both using the private mangled (__) form is so minuscule that it can be safely ignored in practice (I for one haven't heard of a single case so far).
In theory, however, you are correct: in order to formally verify the correctness of a program, one must know the entire inheritance chain. Luckily, formal verification usually requires a fixed set of libraries in any case.
This is in the spirit of the Zen of Python, which includes
practicality beats purity.
Name mangling includes the class so your Base.__foo and Sub.__foo will have different names. This was the entire reason for adding the name mangling feature to Python in the first place. One will be _Base__foo, the other _Sub__foo.
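A quick demonstration of the two mangled names side by side:

class Base(object):
    def __init__(self):
        self.__foo = 0

class Sub(Base):
    def __init__(self):
        super(Sub, self).__init__()
        self.__foo = None

print(Sub().__dict__)  # {'_Base__foo': 0, '_Sub__foo': None}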
Many people prefer to use composition (has-a) instead of inheritance (is-a) for some of these very reasons.
This implies that I have to know the entire inheritance chain. . .
Yes, you should know the entire inheritance chain, or the docs for the object you are directly sub-classing should tell you what you need to know.
Subclassing is an advanced feature, and should be treated with care.
A good example of docs specifying what should be overridden in a subclass is the threading.Thread class:
This class represents an activity that is run in a separate thread of control. There are two ways to specify the activity: by passing a callable object to the constructor, or by overriding the run() method in a subclass. No other methods (except for the constructor) should be overridden in a subclass. In other words, only override the __init__() and run() methods of this class.
How often do you modify base classes in inheritance chains to introduce inheritance from a class with the same name as a subclass further down the chain???
Less flippantly, yes, you have to know the code you are working with. You certainly have to know the public names being used, after all. Python being python, discovering the public names in use by your ancestor classes takes pretty much the same effort as discovering the private ones.
In years of Python programming, I have never found this to be much of an issue in practice. When you're naming instance variables, you should have a pretty good idea whether (a) a name is generic enough that it's likely to be used in other contexts and (b) the class you're writing is likely to be involved in an inheritance hierarchy with other unknown classes. In such cases, you think a bit more carefully about the names you're using; self.value isn't a great idea for an attribute name, and neither is something like Adaptor a great class name.
In contrast, I have run into difficulties with the overuse of double-underscore names a number of times. Python being Python, even "private" names tend to be accessed by code defined outside the class. You might think that it would always be bad practice to let an external function access "private" attributes, but what about things like getattr and hasattr? The invocation of them can be in the class's own code, so the class is still controlling all access to the private attributes, but they still don't work without you doing the name-mangling manually. If Python had actually-enforced private variables you couldn't use functions like those on them at all. These days I tend to reserve double-underscore names for cases when I'm writing something very generic like a decorator, metaclass, or mixin that needs to add a "secret attribute" to the instances of the (unknown) classes it's applied to.
And of course there's the standard dynamic language argument: the reality is that you have to test your code thoroughly to have much justification in making the claim "my software works". Such testing will be very unlikely to miss the bugs caused by accidentally clashing names. If you are not doing that testing, then many more uncaught bugs will be introduced by other means than by accidental name clashes.
In summation, the lack of private variables is just not that big a deal in idiomatic Python code in practice, and the addition of true private variables would cause more frequent problems in other ways IMHO.
Mangling happens with double underscores. Single underscores are more of a "please don't".
You don't need to know all the details of all parent classes (note that deep inheritance is usually best avoided), because you can still dir() and help() and any other form of introspection you can come up with.
As noted, you can use name mangling. However, you can stick with a single underscore (or none!) if you document your code adequately - you should not have so many private variables that this proves to be a problem. Just say if a method relies on a private variable, and add either the variable, or the name of the method to the class docstring to alert users.
Further, if you create unit tests, you should create tests that check invariants on members, and accordingly these should be able to show up such name clashes.
If you really want to have "private" variables, and for whatever reason name-mangling doesn't meet your needs, you can factor your private state into another object:

class Foo(object):
    class Stateholder(object):
        pass

    def __init__(self):
        self._state = self.Stateholder()
        self._state.private = 1

Python: must __init__(self, foo) always be followed by self.foo = foo?

I've been striving mightily for three days to wrap my head around __init__ and "self", starting at Learn Python the Hard Way exercise 42, and moving on to read parts of the Python documentation, Alan Gauld's chapter on Object-Oriented Programming, Stack threads like this one on "self", and this one, and frankly, I'm getting ready to hit myself in the face with a brick until I pass out.
That being said, I've noticed a really common convention in initial __init__ definitions, which is to follow up with (self, foo) and then immediately declare, within that definition, that self.foo = foo.
From LPTHW, ex42:
class Game(object):
    def __init__(self, start):
        self.quips = ["a list", "of phrases", "here"]
        self.start = start
From Alan Gauld:
def __init__(self,val): self.val = val
I'm in that horrible space where I can see that there's just One Big Thing I'm not getting, and it's remaining opaque no matter how much I read about it and try to figure it out. Maybe if somebody can explain this little bit of consistency to me, the light will turn on. Is this because we need to say that "foo", the variable, will always be equal to the (foo) parameter, which is itself contained in the "self" parameter that's automatically assigned to the def it's attached to?
You might want to study up on object-oriented programming.
Loosely speaking, when you say
class Game(object):
    def __init__(self, start):
        self.start = start
you're saying:
I have a type of "thing" named Game
Whenever a new Game is created, it will demand some extra piece of information from me: start. (This is because the Game's initializer, named __init__, asks for this information.)
The initializer (also referred to as the "constructor", although that's a slight misnomer) needs to know which object (which was created just a moment ago) it's initializing. That's the first parameter -- which is usually called self by convention (but which you could call anything else...).
The game probably needs to remember what the start I gave it was. So it stores this information "inside" itself, by creating an instance variable also named start (nothing special, it's just whatever name you want), and assigning the value of the start parameter to the start variable.
If it doesn't store the value of the parameter, it won't have that information available for later use.
Hope this explains what's happening.
I'm not quite sure what you're missing, so let me hit some basic items.
There are two "special" initialization names in a Python class object: one that is relatively rare for users to worry about, called __new__, and one that is much more usual, called __init__.
When you invoke a class-object constructor, e.g. (based on your example) x = Game(args), this first calls Game.__new__ to obtain memory in which to hold the object, and then Game.__init__ to fill in that memory. Most of the time, you can allow the underlying object.__new__ to allocate the memory, and you just need to fill it in. (You can use your own allocator for special weird rare cases like objects that never change and may share identities, the way ordinary integers do for instance. It's also for "metaclasses" that do weird stuff. But that's all a topic for much later.)
Your Game.__init__ function is called with "all the arguments to the constructor" plus one stashed in the front, which is the memory allocated for that object itself. (For "ordinary" objects that's mostly a dictionary of "attributes", plus the magic glue for classes, but for objects with __slots__ the attributes dictionary is omitted.) Naming that first argument self is just a convention—but don't violate it, people will hate you if you do. :-)
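A small illustration of that two-step construction, with prints added just to show the order:

class Game(object):
    def __new__(cls, *args, **kwargs):
        print("__new__: allocating the object")
        return super(Game, cls).__new__(cls)

    def __init__(self, start):
        print("__init__: filling in the allocated object")
        self.start = start

x = Game(42)  # prints the __new__ line first, then the __init__ line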
There's nothing that requires you to save all the arguments to the constructor. You can set any or all instance attributes you like:
class Weird(object):
    def __init__(self, required_arg1, required_arg2, optional_arg3='spam'):
        self.irrelevant = False

    def __str__(self):
        ...
The thing is that a Weird() instance is pretty useless after initialization, because you're required to pass two arguments that are simply thrown away, and given a third optional argument that is also thrown away:
x = Weird(42, 0.0, 'maybe')
The only point in requiring those thrown-away arguments is for future expansion, as it were (you might have these unused fields during early development). So if you're not immediately using and/or saving arguments to __init__, something is definitely weird in Weird.
Incidentally, the only reason for using (object) in the class definition is to indicate to Python 2.x that this is a "new-style" class (as distinguished from very-old-Python "instance only" classes). But it's generally best to use it—it makes what I said above about object.__new__ true, for instance :-) —until Python 3, where the old-style stuff is gone entirely.
Parameter names should be meaningful, to convey the role they play in the function/method or some information about their content.
You can consider constructor parameters to be even more important, because they are often required for the workings of the new instance and contain information which is needed in other methods of the class as well.
Imagine you have a Game class which accepts a playerList.
class Game:
    def __init__(self, playerList):
        self.playerList = playerList  # or self.players = playerList

    def printPlayerList(self):
        print self.playerList  # or print self.players
This list is needed in various methods of the class. Hence it makes sense to assign it to self.playerList. You could also assign it to self.players, whatever you feel more comfortable with and you think is understandable. But if you don't assign it to self.<somename> it won't be accessible in other methods.
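For example, a sketch of what goes wrong without that assignment (hypothetical):

class Game:
    def __init__(self, playerList):
        pass  # playerList is only a local parameter here; it is never saved

    def printPlayerList(self):
        print(self.playerList)  # AttributeError: no attribute 'playerList'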
So there is nothing special about how to name parameters/attributes/etc (there are some special class methods though), but using meaningful names makes the code easier to understand. Or would you understand the meaning of the above class if you had:
class G:
    def __init__(self, x):
        self.y = x

    def ppl(self):
        print self.y
? :) It does exactly the same but is harder to understand...
