Intercepting __getattr__ in a Python 2.3 old-style mixin class? - python

I have a large Python 2.3 based installation with 200k LOC. As part of a migration project I need to intercept all attribute lookups on all old-style classes.
Old legacy code:
class Foo(Bar):
    ...
My idea is to inject a common mixin class like
class Foo(Bar, Mixin):
    ...
class Mixin:
    def __getattr__(self, k):
        print repr(self), k
        return Foo.__getattr__(self, k)
However, I always run into infinite recursion because Foo.__getattr__ resolves to Mixin.__getattr__.
Is there any way to fix the code for Python 2.3 old-style classes?

If you are already injecting mixins, why not add object as a parent, to make them new-style classes?
class Foo(Mixin, Bar, object):
    ...
And then use super:
class Mixin(object):
    def __getattr__(self, k):
        print repr(self), k
        return super(Mixin, self).__getattr__(k)
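A minimal sketch of this cooperative pattern, written so that it runs on Python 3 (the explicit two-argument super() form also exists in Python 2 new-style classes). One caveat it illustrates: object itself defines no __getattr__, so super(...).__getattr__ only exists when some later class in the MRO defines one. The names LoggingMixin, Legacy, Plain and the seen list are hypothetical and exist only to make the interception observable.

```python
class LoggingMixin(object):
    """Hypothetical mixin: records every failed normal lookup, then delegates."""
    seen = []

    def __getattr__(self, name):
        # __getattr__ only runs after the normal attribute lookup has failed.
        LoggingMixin.seen.append(name)
        parent = super(LoggingMixin, self)
        # object has no __getattr__, so check before delegating.
        if hasattr(parent, "__getattr__"):
            return parent.__getattr__(name)
        raise AttributeError(name)


class Legacy(object):
    """Stand-in for an existing class that has its own __getattr__."""
    def __getattr__(self, name):
        if name == "answer":
            return 42
        raise AttributeError(name)


class Foo(LoggingMixin, Legacy):
    pass


class Plain(LoggingMixin):
    pass
```

Because the mixin comes first in the bases, its __getattr__ is reached before Legacy's, and the hasattr check keeps classes such as Plain from failing with a confusing error about a missing __getattr__.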

Assuming that none of the classes in your code base implement __setattr__ or __getattr__, one approach is to intercept __setattr__ in your Mixin, write the value to a reserved attribute, and read it back in __getattr__:
class Mixin:
    def __setattr__(self, attr, value):
        # write the value into some special reserved space
        namespace = self.__dict__.setdefault("_namespace", {})
        namespace[attr] = value
    def __getattr__(self, attr):
        # reject special methods so e.g. __repr__ can't recurse
        if attr.startswith("__") and attr.endswith("__"):
            raise AttributeError
        # do whatever you wish to do here ...
        print repr(self), attr
        # read the value from the reserved space
        namespace = self.__dict__.get("_namespace", {})
        return namespace[attr]
Example:
class Foo(Mixin):
    def __init__(self):
        self.x = 1
Then
>>> Foo().x
<__main__.Foo instance at 0x10c4dad88> x
Clearly this won't work if any of your Foo classes implement __setattr__ or __getattr__ themselves.

Related

A good practice to implement with python multiple inheritance class?

The Scenario:
class A:
    def __init__(self, key, secret):
        self.key = key
        self.secret = secret
    def same_name_method(self):
        do_some_staff
    def method_a(self):
        pass
class B:
    def __init__(self, key, secret):
        self.key = key
        self.secret = secret
    def same_name_method(self):
        do_another_staff
    def method_b(self):
        pass
class C(A,B):
    def __init__(self, *args, **kwargs):
        # I want to init both class A and B's key and secret
        ## I want to rename class A and B's same method
        any_ideas()
    ...
What I Want:
I want an instance of class C to initialize both class A and class B, because they use different API keys.
I also want to rename the same_name_method of A and B, so that I am not confused about which same_name_method is being called.
What I Have Done:
For problem one, I have done this:
class C(A,B):
    def __init__(self, *args, **kwargs):
        A.__init__(self, a_api_key,a_api_secret)
        B.__init__(self, b_api_key,b_api_secret)
Comment: I know about super(), but I do not know how to use it in this situation.
For problem two, I added a __new__ to class C:
def __new__(cls, *args, **kwargs):
    cls.platforms = []
    cls.rename_method = []
    for platform in cls.__bases__:
        # fetch platform module name
        module_name = platform.__module__.split('.')[0]
        cls.platforms.append(module_name)
        # rename attr
        for k, v in platform.__dict__.items():
            if not k.startswith('__'):
                setattr(cls, module_name+'_'+k, v)
                cls.rename_method.append(k)
    for i in cls.rename_method:
        delattr(cls, i) ## this line will raise AttributeError!!
    return super().__new__(cls)
Comment: because I add the renamed methods to cls as new attributes, I need to delete the old method attributes, but I do not know how to do that with delattr. For now I just leave the old methods in place.
Question:
Any Suggestions?
So, you want some pretty advanced things, some complicated things, and you don't understand well how classes behave in Python.
So, for your first point, initializing both classes (and any other method that should run in all classes): the correct solution is to use cooperative calls to super().
A call to super() in Python returns a special proxy object that exposes the methods available in the next class, following the proper method resolution order (MRO).
So, if A.__init__ and B.__init__ both have to be called, both methods should include a super().__init__ call, and one will call the other's __init__ in the appropriate order, regardless of how they are used as bases in subclasses. As object also has an __init__, the last super().__init__ call will just reach it, which is a no-op. If your classes have more methods that should run in all base classes, you should build a proper base class so that the top-most super() call doesn't try to propagate to a non-existent method.
Otherwise, it is just:
class A:
    def __init__(self, akey, asecret, **kwargs):
        self.key = akey
        self.secret = asecret
        super().__init__(**kwargs)
class B:
    def __init__(self, bkey, bsecret, **kwargs):
        self.key = bkey
        self.secret = bsecret
        super().__init__(**kwargs)
class C(A,B):
    # does not even need an explicit `__init__`.
    pass
I think you can get the idea. Of course, the parameter names have to differ. Ideally, when writing C you don't have to worry about parameter order, but when calling C you do have to supply all mandatory parameters for C and its bases. If you can't rename the parameters in A or B to be distinct, you could rely on positional order instead, with each __init__ consuming two positional parameters, but that requires some extra care with the inheritance order.
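A small runnable sketch of the cooperative pattern described above. Note that in the snippet above both classes store into self.key and self.secret, so whichever __init__ runs last would overwrite the other; this sketch assumes distinct attribute names (a_key, b_key and so on, which are not in the original) to keep both credentials. The key values are placeholders.

```python
class A:
    def __init__(self, akey, asecret, **kwargs):
        # Distinct attribute names (an assumption of this sketch) so that
        # B.__init__ does not overwrite what A stored.
        self.a_key = akey
        self.a_secret = asecret
        super().__init__(**kwargs)


class B:
    def __init__(self, bkey, bsecret, **kwargs):
        self.b_key = bkey
        self.b_secret = bsecret
        super().__init__(**kwargs)


class C(A, B):
    pass  # inherits A.__init__, which chains to B.__init__ through the MRO


c = C(akey="a-key", asecret="a-secret", bkey="b-key", bsecret="b-secret")
mro_names = [cls.__name__ for cls in C.__mro__]
```

Passing everything by keyword means the caller does not depend on the order of the bases; each __init__ takes what it knows and forwards the rest.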
So - up to this point, it is basic Python multiple-inheritance "howto", and should be pretty straightforward. Now comes your strange stuff.
As for the auto-renaming of methods, first things first:
Are you quite sure you need inheritance? Maybe having granular classes for each external service, plus a registry and dispatch class that calls the methods on the others by composition, would be saner. (I will come back to this later.)
Are you aware that __new__ is called for each instantiation of the class, so all the class-attribute mangling you perform there happens again for every new instance of your classes?
So, if the needed method-renaming + shadowing needs to take place at class creation time, you can do that using the special method __init_subclass__, which exists from Python 3.6 on. It is a special class method that is called once for each derived class of the class it is defined on. So, just create a base class, from which A and B themselves will inherit, and move a properly modified version of what you are putting in __new__ there. If you are not using Python 3.6 or later, this should be done in the __new__ or __init__ of a metaclass, not in the __new__ of the class itself.
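A minimal sketch of the __init_subclass__ idea, under some assumptions: the base class name ServiceBase is hypothetical, and the prefix is taken from the lower-cased class name rather than the module name so that the example runs in a single file. It only adds prefixed aliases once, at class creation; it does not hide the original names.

```python
class ServiceBase:
    """Hypothetical common base; A and B would inherit from it."""

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs once per derived class, not once per instance.
        for base in cls.__bases__:
            if base is ServiceBase:
                continue
            prefix = base.__name__.lower()
            for name, value in vars(base).items():
                if not name.startswith("_") and callable(value):
                    setattr(cls, prefix + "_" + name, value)


class A(ServiceBase):
    def same_name_method(self):
        return "from A"


class B(ServiceBase):
    def same_name_method(self):
        return "from B"


class C(A, B):
    pass


c = C()
```

With this in place c.a_same_name_method() and c.b_same_name_method() reach the two implementations explicitly, while c.same_name_method() still follows the normal MRO.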
Another approach would be to have a custom __getattribute__ method, which could be crafted to provide namespaces for the base classes. It would work only on instances, not on the classes themselves (but could be made to, again, by using a metaclass). __getattribute__ can even hide the same-name methods.
class Base:
    @classmethod
    def _get_base_modules(cls):
        result = {}
        for base in cls.__bases__:
            # key each direct base by the top-level module it is defined in
            module_name = base.__module__.split(".")[0]
            result[module_name] = base
        return result
    def _proxy(self, module_name):
        # instance method: the proxy binds the base's attributes to this instance
        class base:
            def __dir__(base_self):
                return dir(self._base_modules[module_name])
            def __getattr__(base_self, attr):
                original_value = self._base_modules[module_name].__dict__[attr]
                if hasattr(original_value, "__get__"):
                    original_value = original_value.__get__(self, self.__class__)
                return original_value
        base.__name__ = module_name
        return base()
    def __init_subclass__(cls):
        cls._base_modules = cls._get_base_modules()
        cls._shadowed = {name for module_class in cls._base_modules.values() for name in module_class.__dict__ if not name.startswith("_")}
    def __getattribute__(self, attr):
        if attr.startswith("_"):
            return super().__getattribute__(attr)
        cls = self.__class__
        if attr in cls._shadowed:
            raise AttributeError(attr)
        if attr in cls._base_modules:
            return self._proxy(attr)
        return super().__getattribute__(attr)
    def __dir__(self):
        return super().__dir__() + list(self._base_modules)
class A(Base):
    ...
class B(Base):
    ...
class C(A, B):
    ...
As you can see, this is some fun, but it starts getting really complicated - and all the hoops one has to jump through to retrieve the actual attributes from the superclasses after adding an artificial namespace seem to indicate that your problem does not call for inheritance after all, as I suggested above.
Since you have your small, functional, atomic classes for each "service", you could use a plain, simple, non-meta-at-all class that works as a registry for the various services - and you can even enhance it to call the equivalent method on several of the services it handles with a single call:
class Services:
    def __init__(self):
        self.registry = {}
    def register(self, cls, key, secret):
        name = cls.__module__.split(".")[0]
        service = cls(key, secret)
        self.registry[name] = service
    def __getattr__(self, attr):
        if attr in self.registry:
            return self.registry[attr]
        # unknown names must raise instead of silently returning None
        raise AttributeError(attr)

How to make a class attribute exclusive to the super class

I have a master class for a planet:
class Planet:
    def __init__(self,name):
        self.name = name
        (...)
    def destroy(self):
        (...)
I also have a few classes that inherit from Planet and I want to make one of them unable to be destroyed (not to inherit the destroy function).
Example:
class Undestroyable(Planet):
    def __init__(self,name):
        super().__init__(name)
        (...)
    #Now it shouldn't have the destroy(self) function
So when this is run,
Undestroyable('This Planet').destroy()
it should produce an error like:
AttributeError: Undestroyable has no attribute 'destroy'
The mixin approach in other answers is nice, and probably better for most cases. But nevertheless, it spoils part of the fun - maybe obliging you to have separate planet-hierarchies - like having to live with two abstract classes each ancestor of "destroyable" and "non-destroyable".
First approach: descriptor decorator
But Python has a powerful mechanism, called the "descriptor protocol", which is used to retrieve any attribute from a class or instance - it is even used to ordinarily retrieve methods from instances - so, it is possible to customize the method retrieval in a way it checks if it "should belong" to that class, and raise attribute error otherwise.
The descriptor protocol mandates that whenever you try to get any attribute from an instance object in Python, Python will check if the attribute exists in that object's class, and if so, if the attribute itself has a method named __get__. If it has, __get__ is called (with the instance and class where it is defined as parameters) - and whatever it returns is the attribute. Python uses this to implement methods: functions in Python 3 have a __get__ method that when called, will return another callable object that, in turn, when called will insert the self parameter in a call to the original function.
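To make the mechanism concrete, here is a short sketch (Python 3) that calls a function's __get__ by hand and obtains the same bound method that ordinary attribute access produces. The Greeter class is a made-up example.

```python
class Greeter:
    def hello(self):
        return "hello from " + type(self).__name__


g = Greeter()
# The plain function object exactly as stored in the class namespace.
raw_function = Greeter.__dict__["hello"]
# Calling __get__ by hand yields the bound method that g.hello would return.
bound = raw_function.__get__(g, Greeter)
```

Calling bound() here is equivalent to calling g.hello(); the bound method simply remembers g and inserts it as self.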
So, it is possible to create a class whose __get__ method decides whether to return a function as a bound method depending on how the outer class has been marked - for example, it could check a specific flag such as non_destructible. This can be done by using a decorator to wrap the method with this descriptor functionality:
class Muteable:
    def __init__(self, flag_attr):
        self.flag_attr = flag_attr
    def __call__(self, func):
        """Called when the decorator is applied"""
        self.func = func
        return self
    def __get__(self, instance, owner):
        if instance and getattr(instance, self.flag_attr, False):
            raise AttributeError('Objects of type {0} have no {1} method'.format(instance.__class__.__name__, self.func.__name__))
        return self.func.__get__(instance, owner)
class Planet:
    def __init__(self, name=""):
        pass
    @Muteable("undestroyable")
    def destroy(self):
        print("Destroyed")
class BorgWorld(Planet):
    undestroyable = True
And on the interactive prompt:
In [110]: Planet().destroy()
Destroyed
In [111]: BorgWorld().destroy()
...
AttributeError: Objects of type BorgWorld have no destroy method
In [112]: BorgWorld().destroy
AttributeError: Objects of type BorgWorld have no destroy method
Note that, unlike simply overriding the method, this approach raises the error when the attribute is retrieved, and even makes hasattr work:
In [113]: hasattr(BorgWorld(), "destroy")
Out[113]: False
As written, though, it won't work if one tries to retrieve the method directly from the class instead of from an instance: in that case the instance parameter to __get__ is None, so the flag check is skipped. (The owner parameter is the class the lookup went through, so the check could be extended to inspect owner as well.)
In [114]: BorgWorld.destroy
Out[114]: <function __main__.Planet.destroy>
Second approach: __delattr__ on the metaclass:
While writing the above, it occurred to me that Python does have the __delattr__ special method. If the Planet class itself implemented __delattr__ and we tried to delete the destroy method on specific derived classes, it would not work: __delattr__ guards the deletion of attributes on instances, and if you tried to del the "destroy" method on an instance it would fail anyway, since the method lives in the class.
However, in Python, the class itself is an instance - of its "metaclass", which is usually type. A proper __delattr__ on the metaclass of "Planet" could make possible the "disinheritance" of the "destroy" method by issuing a "del UndestructiblePlanet.destroy" after class creation.
Again, we use the descriptor protocol to have a proper "deleted method on the subclass":
class Deleted:
    def __init__(self, cls, name):
        self.cls = cls.__name__
        self.name = name
    def __get__(self, instance, owner):
        raise AttributeError("Objects of type '{0}' have no '{1}' method".format(self.cls, self.name))
class Deletable(type):
    def __delattr__(cls, attr):
        print("deleting from", cls)
        setattr(cls, attr, Deleted(cls, attr))
class Planet(metaclass=Deletable):
    def __init__(self, name=""):
        pass
    def destroy(self):
        print("Destroyed")
class BorgWorld(Planet):
    pass
del BorgWorld.destroy
And with this method, even trying to retrieve the method, or to check for its existence, on the class itself will work:
In [129]: BorgWorld.destroy
...
AttributeError: Objects of type 'BorgWorld' have no 'destroy' method
In [130]: hasattr(BorgWorld, "destroy")
Out[130]: False
Metaclass with a custom __prepare__ method:
Since metaclasses allow one to customize the object that holds the class namespace, it is possible to have an object that responds to a del statement within the class body by adding a Deleted descriptor.
For the user (programmer) of this metaclass it is almost the same thing, except that the del statement is allowed in the class body itself:
class Deleted:
    def __init__(self, name):
        self.name = name
    def __get__(self, instance, owner):
        raise AttributeError("No '{0}' method on class '{1}'".format(self.name, owner.__name__))
class Deletable(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        class D(dict):
            def __delitem__(self, attr):
                # "del name" in the class body installs a Deleted descriptor instead
                self[attr] = Deleted(attr)
        return D()
class Planet(metaclass=Deletable):
    def destroy(self):
        print("destroyed")
class BorgPlanet(Planet):
    del destroy
(The 'deleted' descriptor is the correct form to mark a method as 'deleted' - in this method, though, it can't know the class name at class creation time)
As a class decorator:
And given the "deleted" descriptor, one could simply inform the methods to be removed as a class decorator - there is no need for a metaclass in this case:
class Deleted:
    def __init__(self, cls, name):
        self.cls = cls.__name__
        self.name = name
    def __get__(self, instance, owner):
        raise AttributeError("Objects of type '{0}' have no '{1}' method".format(self.cls, self.name))
def mute(*methods):
    def decorator(cls):
        for method in methods:
            setattr(cls, method, Deleted(cls, method))
        return cls
    return decorator
class Planet:
    def destroy(self):
        print("destroyed")
@mute('destroy')
class BorgPlanet(Planet):
    pass
Modifying the __getattribute__ mechanism:
For the sake of completeness - what really makes Python reach methods and attributes on the superclass is what happens inside the __getattribute__ call. The object version of __getattribute__ is where the algorithm with the priorities for "data descriptor, instance, class, chain of base classes, ..." for attribute retrieval is encoded.
So, changing that for the class is an easy, single point at which to get a "legitimate" attribute error, without the need for the "non-existent" descriptor used in the previous methods.
The problem is that object's __getattribute__ does not make use of type's one to search for the attribute in the class - if it did, just implementing __getattribute__ on the metaclass would suffice. One has to do it on the instance to block instance lookup of a method, and on the metaclass to block lookup on the class itself. A metaclass can, of course, inject the needed code:
def blocker_getattribute(target, attr, attr_base):
    try:
        muted = attr_base.__getattribute__(target, '__muted__')
    except AttributeError:
        muted = []
    if attr in muted:
        raise AttributeError("object {} has no attribute '{}'".format(target, attr))
    return attr_base.__getattribute__(target, attr)
def instance_getattribute(self, attr):
    return blocker_getattribute(self, attr, object)
class M(type):
    def __init__(cls, name, bases, namespace):
        cls.__getattribute__ = instance_getattribute
    def __getattribute__(cls, attr):
        return blocker_getattribute(cls, attr, type)
class Planet(metaclass=M):
    def destroy(self):
        print("destroyed")
class BorgPlanet(Planet):
    __muted__=['destroy'] # or use a decorator to set this! :-)
    pass
If Undestroyable is a unique (or at least unusual) case, it's probably easiest to just redefine destroy():
class Undestroyable(Planet):
    # ...
    def destroy(self):
        cls_name = self.__class__.__name__
        raise AttributeError("%s has no attribute 'destroy'" % cls_name)
From the point of view of the user of the class, this will behave as though Undestroyable.destroy() doesn't exist … unless they go poking around with hasattr(Undestroyable, 'destroy'), which is always a possibility.
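A quick sketch of that caveat, using stand-in Planet and Undestroyable classes: because the overriding method still exists as an attribute, hasattr reports True on both the class and its instances, and the error only appears when the method is actually called.

```python
class Planet:
    def destroy(self):
        return "destroyed"


class Undestroyable(Planet):
    def destroy(self):
        raise AttributeError("%s has no attribute 'destroy'" % type(self).__name__)


planet = Undestroyable()
# Attribute lookup succeeds; only the call itself raises.
visible_on_class = hasattr(Undestroyable, "destroy")
visible_on_instance = hasattr(planet, "destroy")
try:
    planet.destroy()
    call_failed = False
except AttributeError:
    call_failed = True
```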
If it happens more often that you want subclasses to inherit some properties and not others, the mixin approach in chepner's answer is likely to be more maintainable. You can improve it further by making Destructible an abstract base class:
from abc import abstractmethod, ABCMeta
class Destructible(metaclass=ABCMeta):
    @abstractmethod
    def destroy(self):
        pass
class BasePlanet:
    # ...
    pass
class Planet(BasePlanet, Destructible):
    def destroy(self):
        # ...
        pass
class IndestructiblePlanet(BasePlanet):
    # ...
    pass
This has the advantage that if you try to instantiate the abstract class Destructible, you'll get an error pointing you at the problem:
>>> Destructible()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class Destructible with abstract methods destroy
… similarly if you inherit from Destructible but forget to define destroy():
class InscrutablePlanet(BasePlanet, Destructible):
    pass
>>> InscrutablePlanet()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: Can't instantiate abstract class InscrutablePlanet with abstract methods destroy
Rather than remove an attribute that is inherited, only inherit destroy in the subclasses where it is applicable, via a mix-in class. This preserves the correct "is-a" semantics of inheritance.
class Destructible(object):
    def destroy(self):
        pass
class BasePlanet(object):
    ...
class Planet(BasePlanet, Destructible):
    ...
class IndestructiblePlanet(BasePlanet): # Does *not* inherit from Destructible
    ...
You can provide suitable definitions for destroy in any of Destructible, Planet, or any class that inherits from Planet.
Metaclasses and descriptor protocols are fun, but perhaps overkill. Sometimes, for raw functionality, you can't beat good ole' __slots__.
class Planet(object):
    def __init__(self, name):
        self.name = name
    def destroy(self):
        print("Boom! %s is toast!\n" % self.name)
class Undestroyable(Planet):
    __slots__ = ['destroy']
    def __init__(self,name):
        super().__init__(name)
print()
x = Planet('Pluto') # Small, easy to destroy
y = Undestroyable('Jupiter') # Too big to fail
x.destroy()
y.destroy()
Boom! Pluto is toast!
Traceback (most recent call last):
  File "planets.py", line 95, in <module>
    y.destroy()
AttributeError: destroy
You cannot inherit only a portion of a class. It's all or nothing.
What you can do is put the destroy function in a second level of the class hierarchy: you have the Planet class without the destroy function, and then you make a DestroyablePlanet class that adds the destroy function, which all the destroyable planets use.
Or you can put a flag in the constructor of the Planet class that determines whether the destroy function is able to succeed, and check it in the destroy function.
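A minimal sketch of the flag variant, assuming a hypothetical destroyable constructor parameter and a destroyed attribute that are not part of the original question. Note that with this approach the method still exists on every planet; it just refuses to run.

```python
class Planet:
    def __init__(self, name, destroyable=True):
        self.name = name
        self.destroyable = destroyable  # hypothetical flag
        self.destroyed = False

    def destroy(self):
        # The flag is checked at call time.
        if not self.destroyable:
            raise RuntimeError("%s cannot be destroyed" % self.name)
        self.destroyed = True


class Undestroyable(Planet):
    def __init__(self, name):
        super().__init__(name, destroyable=False)


pluto = Planet("Pluto")
pluto.destroy()
jupiter = Undestroyable("Jupiter")
```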

How can one locate where an inherited variable comes from in Python?

If you have multiple layers of inheritance and know that a particular variable exists, is there a way to trace back to where the variable originated? Without having to navigate backwards by looking through each file and classes. Possibly calling some sort of function that will do it?
Example:
parent.py
class parent(object):
    def __init__(self):
        findMe = "Here I am!"
child.py
from parent import parent
class child(parent):
    pass
grandson.py
from child import child
class grandson(child):
    def printVar(self):
        print self.findMe
Try to locate where the findMe variable came from with a function call.
If the "variable" is an instance variable, that is, if at any point in the chain of __init__ methods you do:
def __init__(self):
    self.findMe = "Here I am!"
then it is an instance variable from that point on and cannot, for all practical purposes, be told apart from any other instance variable. (Unless you put in place a mechanism, like a class with a special __setattr__ method, that keeps track of attribute changes and introspects which part of the code set the attribute - see the last example in this answer.)
Please also note that in your example,
class parent(object):
    def __init__(self):
        findMe = "Here I am!"
findMe is defined as a local variable of that method and does not even exist after __init__ has finished.
Now, if your variable is set as a class attribute somewhere on the inheritance chain:
class parent(object):
    findMe = False
class childone(parent):
    ...
It is possible to find the class where findMe is defined by introspecting each class' __dict__ along the MRO (method resolution order) chain. Of course, there is no way, and no sense, in doing that without introspecting all classes in the MRO chain - except if one keeps track of attributes as they are defined, like in the example below - but introspecting the MRO itself is a one-liner in Python:
def __init__(self):
    super().__init__()
    ...
    findme_definer = [cls for cls in self.__class__.__mro__ if "findMe" in cls.__dict__][0]
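The same MRO walk can be wrapped in a small helper. This is only a sketch: find_definer and the example classes are hypothetical names, and the helper returns None instead of raising IndexError when no class defines the attribute (for example, when it only exists on the instance).

```python
def find_definer(obj, name):
    """Return the first class in obj's MRO whose own __dict__ defines name, else None."""
    for cls in type(obj).__mro__:
        if name in vars(cls):
            return cls
    return None


class Parent:
    findMe = False


class Child(Parent):
    pass


class GrandChild(Child):
    def __init__(self):
        self.instance_only = 1  # lives in the instance, not in any class


g = GrandChild()
```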
Again - it would be possible to have a metaclass on your inheritance chain that keeps track of all defined attributes in the inheritance tree and uses a dictionary to retrieve where each attribute is defined. The same metaclass could also auto-decorate all __init__ methods (or all methods) and set a special __setattr__, so that it could track instance attributes as they are created, as mentioned above.
That can be done, is a bit complicated, would be hard to maintain, and probably is a signal you are taking the wrong approach to your problem.
So, the metaclass to record just class attributes could simply be (python3 syntax - define a __metaclass__ attribute on the class body if you are still using Python 2.7):
class MetaBase(type):
    definitions = {}
    def __init__(cls, name, bases, dct):
        for attr in dct.keys():
            cls.__class__.definitions[attr] = cls
class parent(metaclass=MetaBase):
    findMe = 5
    def __init__(self):
        print(self.__class__.definitions["findMe"])
Now, if one wants to find which of the superclasses set an attribute of the current instance, a "live" tracking mechanism that wraps each method in each class can work - but it is a lot trickier.
I've made it - even if you won't need this much. It combines both methods, keeping track of class attributes in the metaclass' definitions dictionary and of instance attributes in a per-instance _definitions dictionary, since in each created instance an arbitrary method might have been the last to set a particular instance attribute. (This is pure Python 3, and porting it to Python 2 may not be that straightforward, because what Python 2 treats as an "unbound method" is a simple function in Python 3.)
from threading import current_thread
from functools import wraps
from types import MethodType
from collections import defaultdict
def method_decorator(func, cls):
    @wraps(func)
    def wrapper(self, *args, **kw):
        self.__class__.__class__.current_running_class[current_thread()].append(cls)
        result = MethodType(func, self)(*args, **kw)
        self.__class__.__class__.current_running_class[current_thread()].pop()
        return result
    return wrapper
class MetaBase(type):
    definitions = {}
    current_running_class = defaultdict(list)
    def __init__(cls, name, bases, dct):
        for attrname, attr in dct.items():
            # record by attribute name, as in the simpler metaclass above
            cls.__class__.definitions[attrname] = cls
            if callable(attr) and attrname != "__setattr__":
                setattr(cls, attrname, method_decorator(attr, cls))
class Base(object, metaclass=MetaBase):
    def __setattr__(self, attr, value):
        if not hasattr(self, "_definitions"):
            super().__setattr__("_definitions", {})
        self._definitions[attr] = self.__class__.current_running_class[current_thread()][-1]
        return super().__setattr__(attr,value)
Example Classes for the code above:
class Parent(Base):
    def __init__(self):
        super().__init__()
        self.findMe = 10
class Child1(Parent):
    def __init__(self):
        super().__init__()
        self.findMe1 = 20
class Child2(Parent):
    def __init__(self):
        super().__init__()
        self.findMe2 = 30
class GrandChild(Child1, Child2):
    def __init__(self):
        super().__init__()
    def findall(self):
        for attr in "findMe findMe1 findMe2".split():
            print("Attr '{}' defined in class '{}' ".format(attr, self._definitions[attr].__name__))
And on the console one will get this result:
In [87]: g = GrandChild()
In [88]: g.findall()
Attr 'findMe' defined in class 'Parent'
Attr 'findMe1' defined in class 'Child1'
Attr 'findMe2' defined in class 'Child2'

How to access the subclasses of a class as properties?

So I have a .py file containing a class where its subclasses can be accessed as properties. All these subclasses are defined beforehand. I also need all the subclasses to have the same ability (having their own subclasses be accessible as properties). The biggest problem I've been facing is that I don't know how to access the current class within my implementation of __getattr__(), so that'd be a good place to start.
Here's some Python+pseudocode with what I've tried so far. I'm pretty sure it won't work, since __getattr__() seems to work only with instances of a class. If that is the case, sorry, I am not as familiar with OOP in Python as I would like.
class A(object):
    def __getattr__(self, name):
        subclasses = [c.__name__ for c in current_class.__subclasses__()]
        if name in subclasses:
            return name
        raise AttributeError
If I've understood your question properly, you can do what you want by using a custom metaclass that adds a classmethod to its instances. Here's an example:
class SubclassAttributes(type):
    def __getattr__(cls, name): # classmethod of instances
        for subclass in cls.__subclasses__():
            if subclass.__name__ == name:
                return subclass
        else:
            raise TypeError('Class {!r} has no subclass '
                            'named {!r}'.format(cls.__name__, name))
class Base(object):
    __metaclass__ = SubclassAttributes # Python 2 metaclass syntax
#class Base(object, metaclass=SubclassAttributes): # Python 3 metaclass syntax
#    """ nothing to see here """
class Derived1(Base): pass
class Derived2(Base): pass
print(Base.Derived1) # -> <class '__main__.Derived1'>
print(Base.Derived2) # -> <class '__main__.Derived2'>
print(Base.Derived3) # -> TypeError: Class 'Base' has no subclass named 'Derived3'
For something that works in both Python 2 and 3, define the class as shown below. It derives Base from a class that has SubclassAttributes as its metaclass. This is similar to what the six module's with_metaclass() function does:
class Base(type.__new__(type('TemporaryMeta', (SubclassAttributes,), {}),
                        'TemporaryClass', (), {})): pass
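For comparison, a Python 3 only sketch of the same metaclass. As a variation that is not in the answer above, it raises AttributeError rather than TypeError, which keeps hasattr() and getattr() with a default working on the class. Since every subclass shares the metaclass, the lookup also chains (Base.Derived1.Derived2).

```python
class SubclassAttributes(type):
    def __getattr__(cls, name):
        # Runs only when normal lookup on the class object fails.
        for subclass in cls.__subclasses__():
            if subclass.__name__ == name:
                return subclass
        raise AttributeError(
            "class {!r} has no subclass named {!r}".format(cls.__name__, name))


class Base(metaclass=SubclassAttributes):
    pass


class Derived1(Base):
    pass


class Derived2(Derived1):
    pass
```

Only direct subclasses are found, because __subclasses__() does not recurse; Base.Derived2 is therefore an AttributeError in this sketch.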
class A(object):
    def __getattr__(self, key):
        for subclass in self.__class__.__subclasses__():
            if (subclass.__name__ == key):
                return subclass
        raise AttributeError, key
Out of curiosity, what is this designed to be used for?
>>> class A(object):
...     pass
...
>>> foo = A()
>>> foo.__class__
<class '__main__.A'>

Python metaclasses: Why isn't __setattr__ called for attributes set during class definition?

I have the following python code:
class FooMeta(type):
    def __setattr__(self, name, value):
        print name, value
        return super(FooMeta, self).__setattr__(name, value)
class Foo(object):
    __metaclass__ = FooMeta
    FOO = 123
    def a(self):
        pass
I would have expected __setattr__ of the meta class being called for both FOO and a. However, it is not called at all. When I assign something to Foo.whatever after the class has been defined the method is called.
What's the reason for this behaviour and is there a way to intercept the assignments that happen during the creation of the class? Using attrs in __new__ won't work since I'd like to check if a method is being redefined.
A class block is roughly syntactic sugar for building a dictionary, and then invoking a metaclass to build the class object.
This:
class Foo(object):
    __metaclass__ = FooMeta
    FOO = 123
    def a(self):
        pass
Comes out pretty much as if you'd written:
d = {}
d['__metaclass__'] = FooMeta
d['FOO'] = 123
def a(self):
    pass
d['a'] = a
Foo = d.get('__metaclass__', type)('Foo', (object,), d)
Only without the namespace pollution (and in reality there's also a search through all the bases to determine the metaclass, or whether there's a metaclass conflict, but I'm ignoring that here).
The metaclass' __setattr__ can control what happens when you try to set an attribute on one of its instances (the class object), but inside the class block you're not doing that, you're inserting into a dictionary object, so the dict class controls what's going on, not your metaclass. So you're out of luck.
Unless you're using Python 3.x! In Python 3.x you can define a __prepare__ classmethod (or staticmethod) on a metaclass, which controls what object is used to accumulate attributes set within a class block before they're passed to the metaclass constructor. The default __prepare__ simply returns a normal dictionary, but you could build a custom dict-like class that doesn't allow keys to be redefined, and use that to accumulate your attributes:
from collections.abc import MutableMapping
class SingleAssignDict(MutableMapping):
    def __init__(self, *args, **kwargs):
        self._d = dict(*args, **kwargs)
    def __getitem__(self, key):
        return self._d[key]
    def __setitem__(self, key, value):
        if key in self._d:
            raise ValueError(
                'Key {!r} already exists in SingleAssignDict'.format(key)
            )
        else:
            self._d[key] = value
    def __delitem__(self, key):
        del self._d[key]
    def __iter__(self):
        return iter(self._d)
    def __len__(self):
        return len(self._d)
    def __contains__(self, key):
        return key in self._d
    def __repr__(self):
        return '{}({!r})'.format(type(self).__name__, self._d)
class RedefBlocker(type):
    @classmethod
    def __prepare__(metacls, name, bases, **kwargs):
        return SingleAssignDict()
    def __new__(metacls, name, bases, sad):
        return super().__new__(metacls, name, bases, dict(sad))
class Okay(metaclass=RedefBlocker):
    a = 1
    b = 2
class Boom(metaclass=RedefBlocker):
    a = 1
    b = 2
    a = 3
Running this gives me:
Traceback (most recent call last):
  File "/tmp/redef.py", line 50, in <module>
    class Boom(metaclass=RedefBlocker):
  File "/tmp/redef.py", line 53, in Boom
    a = 3
  File "/tmp/redef.py", line 15, in __setitem__
    'Key {!r} already exists in SingleAssignDict'.format(key)
ValueError: Key 'a' already exists in SingleAssignDict
Some notes:
1. __prepare__ has to be a classmethod or staticmethod, because it's being called before the metaclass' instance (your class) exists.
2. type still needs its third parameter to be a real dict, so you have to have a __new__ method that converts the SingleAssignDict to a normal one.
3. I could have subclassed dict, which would probably have avoided (2), but I really dislike doing that because of how the non-basic methods like update don't respect your overrides of the basic methods like __setitem__. So I prefer to subclass collections.MutableMapping and wrap a dictionary.
4. The actual Okay.__dict__ object is a normal dictionary, because it was set by type and type is finicky about the kind of dictionary it wants. This means that overwriting class attributes after class creation does not raise an exception. You can overwrite the __dict__ attribute after the superclass call in __new__ if you want to maintain the no-overwriting forced by the class object's dictionary.
Sadly this technique is unavailable in Python 2.x (I checked). The __prepare__ method isn't invoked, which makes sense as in Python 2.x the metaclass is determined by the __metaclass__ magic attribute rather than a special keyword in the classblock; which means the dict object used to accumulate attributes for the class block already exists by the time the metaclass is known.
Compare Python 2:
class Foo(object):
    __metaclass__ = FooMeta
    FOO = 123
    def a(self):
        pass
Being roughly equivalent to:
d = {}
d['__metaclass__'] = FooMeta
d['FOO'] = 123
def a(self):
    pass
d['a'] = a
Foo = d.get('__metaclass__', type)('Foo', (object,), d)
Where the metaclass to invoke is determined from the dictionary, versus Python 3:
class Foo(metaclass=FooMeta):
    FOO = 123
    def a(self):
        pass
Being roughly equivalent to:
d = FooMeta.__prepare__('Foo', ())
d['FOO'] = 123
def a(self):
    pass
d['a'] = a
Foo = FooMeta('Foo', (), d)
Where the dictionary to use is determined from the metaclass.
There are no assignments happening during the creation of the class. Or: they are happening, but not in the context you think they are. All class attributes are collected from class body scope and passed to metaclass' __new__, as the last argument:
class FooMeta(type):
    def __new__(self, name, bases, attrs):
        print attrs
        return type.__new__(self, name, bases, attrs)
class Foo(object):
    __metaclass__ = FooMeta
    FOO = 123
Reason: when the code in the class body executes, there's no class yet. Which means there's no opportunity for metaclass to intercept anything yet.
Class attributes are passed to the metaclass as a single dictionary and my hypothesis is that this is used to update the __dict__ attribute of the class all at once, e.g. something like cls.__dict__.update(dct) rather than doing setattr() on each item. More to the point, it's all handled in C-land and simply wasn't written to call a custom __setattr__().
It's easy enough to do whatever you want to the attributes of the class in your metaclass's __init__() method, since you're passed the class namespace as a dict, so just do that.
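As an illustration of working from the namespace dict in the metaclass' __init__(), here is a sketch (Python 3 syntax) that records which names a class body overrides from its bases. The metaclass name OverrideReporter and the overrides attribute are hypothetical. The dict only holds the final value of each name, so this cannot detect a name defined twice within one class body.

```python
class OverrideReporter(type):
    """Hypothetical metaclass: records which inherited names a class body overrides."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # namespace is the plain dict collected from the class body.
        cls.overrides = sorted(
            attr for attr in namespace
            if not attr.startswith("__")
            and any(hasattr(base, attr) for base in bases)
        )


class Base(metaclass=OverrideReporter):
    FOO = 123

    def a(self):
        return "base"


class Child(Base):
    def a(self):
        return "child"

    def b(self):
        return "new"
```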
During the class creation, your namespace is evaluated to a dict and passed as an argument to the metaclass, together with the class name and base classes. Because of that, assigning a class attribute inside the class definition wouldn't work the way you expect. It doesn't create an empty class and assign everything. You also can't have duplicated keys in a dict, so during class creation attributes are already deduplicated. Only by setting an attribute after the class definition you can trigger your custom __setattr__.
Because the namespace is a dict, there's no way for you to check duplicated methods, as suggested by your other question. The only practical way to do that is parsing the source code.
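A minimal sketch of the source-parsing route, using the standard ast module (Python 3). The helper name duplicate_methods and the SAMPLE source are made up for illustration; the helper only looks at functions defined directly in a class body.

```python
import ast
from collections import Counter


def duplicate_methods(source):
    """Map class name -> sorted method names defined more than once in its body."""
    duplicates = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            counts = Counter(
                stmt.name for stmt in node.body
                if isinstance(stmt, (ast.FunctionDef, ast.AsyncFunctionDef))
            )
            repeated = sorted(name for name, n in counts.items() if n > 1)
            if repeated:
                duplicates[node.name] = repeated
    return duplicates


SAMPLE = '''
class Foo(object):
    def a(self):
        pass
    def b(self):
        pass
    def a(self):
        pass
'''

result = duplicate_methods(SAMPLE)
```

Because this works on the source text, it sees every definition in the class body, including those that a later definition silently replaces in the namespace dict.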
