I watched a great video on YouTube about Python metaprogramming. I tried to write the following code (which is almost the same as in the video):
class Descriptor:
    def __init__(self, name):
        self.name = name

    def __get__(self, instance, cls):
        return instance.__dict__[self.name]

    def __set__(self, instance, val):
        instance.__dict__[self.name] = val

    def __delete__(self, instance):
        del instance.__dict__[self.name]
class Type(Descriptor):
    ty = object

    def __set__(self, instance, val):
        if not isinstance(val, self.ty):
            raise TypeError("%s should be of type %s" % (self.name, self.ty))
        super().__set__(instance, val)

class String(Type):
    ty = str

class Integer(Type):
    ty = int

class Positive(Descriptor):
    def __set__(self, instance, val):
        if val <= 0:
            raise ValueError("Must be > 0")
        super().__set__(instance, val)

class PositiveInteger(Integer, Positive):
    pass
# StructMeta comes from the video; its definition is omitted here
class Person(metaclass=StructMeta):
    _fields = ['name', 'gender', 'age']
    name = String('name')
    gender = String('gender')
    age = PositiveInteger('age')
So PositiveInteger inherits from both Integer and Positive, and both of those classes define a __set__ method that does some validation. I wrote some test code to convince myself that both methods would run:
class A:
    def test(self):
        self.a = 'OK'

class B:
    def test(self):
        self.b = 'OK'

class C(A, B):
    pass

c = C()
c.test()
print(c.a)
print(c.b)
Only to find that only the first print statement works. The second raises an AttributeError, which indicates that when there's a name conflict, the first base class wins.
So I wonder: why do both validations run? It's even weirder that when only the Integer check passes (e.g. person.age = -3), the super().__set__(instance, val) call has no effect, leaving person.age untouched.
The validation logic of both Positive and Integer runs because both Type and Positive have this line in __set__:
super().__set__(instance, val)
This doesn't skip to Descriptor.__set__. Instead, it calls the next method in method resolution order. Type.__set__ gets called, and its super().__set__(instance, val) calls Positive.__set__. Positive.__set__ runs its validation and calls Descriptor.__set__, which does the setting. This behavior is one of the reasons we have super.
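You can see the chain concretely by printing the MRO of the classes above:

print(PositiveInteger.__mro__)
# (<class 'PositiveInteger'>, <class 'Integer'>, <class 'Type'>,
#  <class 'Positive'>, <class 'Descriptor'>, <class 'object'>)

Because Positive sits between Type and Descriptor in this list, Type's super() call reaches Positive, not Descriptor.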
If you wanted your test methods to behave like that, you would need to do two things. First, you would need to make A and B inherit from a common base class with a test method that doesn't do anything, so the super chains end at a place with a test method instead of going to object:
class Base:
    def test(self):
        pass
Then, you would need to add super().test() to both A.test and B.test:
class A(Base):
    def test(self):
        self.a = 'OK'
        super().test()

class B(Base):
    def test(self):
        self.b = 'OK'
        super().test()
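With those changes in place, a quick check shows that both methods now run:

class C(A, B):
    pass

c = C()
c.test()         # A.test runs, its super().test() reaches B.test, then Base.test
print(c.a, c.b)  # OK OK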
For more reading, see Python's super() considered super.
Sorry, my bad.
The video gives a perfect explanation just a minute after the point where I paused and asked this question.
So when multiple inheritance happens, there is an MRO (Method Resolution Order) defined for each class that determines the order in which methods are resolved in the super() chain.
The order is computed by the C3 linearization, which behaves roughly like a depth-first, left-to-right traversal, e.g.
class A:
    pass

class B(A):
    pass

class C(B):
    pass

class D(A):
    pass

class E(C, D):
    pass
E.__mro__ will be:
(<class '__main__.E'>, <class '__main__.C'>, <class '__main__.B'>, <class '__main__.D'>, <class '__main__.A'>, <class 'object'>)
One thing to notice is that A appears in the inheritance tree multiple times, but it gets only one slot in the MRO list: the last position among all the places where an A appears.
Here's the trick: a call to super() won't necessarily go to the class's own base. Instead, it goes to whatever comes next in the MRO list.
So to explain what happens in the code:
The super() call in Integer.__set__ (which is inherited from Type.__set__) won't go to Descriptor.__set__, because Descriptor appears last in the MRO list. It falls into Positive.__set__ instead, and that method's own super() call falls into Descriptor, which eventually sets the value of the attribute.
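A quick check of that behavior without the StructMeta machinery (Holder here is a hypothetical stand-in for Person):

class Holder:
    age = PositiveInteger('age')

h = Holder()
h.age = 3            # passes the Type check, then the Positive check, then gets stored
try:
    h.age = -3       # passes the Type check, but Positive raises ValueError
except ValueError as e:
    print(e)         # Must be > 0
print(h.age)         # still 3 - Descriptor.__set__ was never reached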
I followed the tutorial in the docs and an example from Fluent Python. The book teaches how to avoid the AttributeError on get (e.g., when you do z = Testing.x), and I wanted to do something similar with the set method. But it seems to lead to a broken class with no error.
To be more specific about the issue:
With the line Testing.x = 1 commented out, the __set__ methods are invoked.
With the line Testing.x = 1 uncommented, the __set__ methods are not invoked.
Can someone explain why it behaves this way?
import abc

class Descriptor:
    def __init__(self):
        cls = self.__class__
        self.storage_name = cls.__name__

    def __get__(self, instance, owner):
        if instance is None:
            return self
        else:
            return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        print(instance, self.storage_name)
        setattr(instance, self.storage_name, value)

class Validator(Descriptor):
    def __set__(self, instance, value):
        value = self.validate(instance, value)
        super().__set__(instance, value)

    @abc.abstractmethod
    def validate(self, instance, value):
        """return validated value or raise ValueError"""

class NonNegative(Validator):
    def validate(self, instance, value):
        if value <= 0:
            raise ValueError(f'{value!r} must be > 0')
        return value

class Testing:
    x = NonNegative()

    def __init__(self, number):
        self.x = number

# Testing.x = 1
t = Testing(1)
t.x = 1
Attribute access is generally handled by object.__getattribute__ and type.__getattribute__ (for instances of type, i.e. classes). When an attribute lookup of the form a.x involves a descriptor as x, then various binding rules come into effect, based on what x is:
1. Instance binding: If binding to an object instance, a.x is transformed into the call: type(a).__dict__['x'].__get__(a, type(a)).
2. Class binding: If binding to a class, A.x is transformed into the call: A.__dict__['x'].__get__(None, A).
3. Super binding: [...]
For the scope of this question, only (2) is relevant. Here, Testing.x invokes the descriptor via __get__(None, Testing). Now one might ask why this is done instead of simply returning the descriptor object itself (as if it were any other object, say an int). This behavior is useful for implementing the classmethod decorator. The Descriptor HowTo guide provides an example implementation:
class ClassMethod:
    def __init__(self, f):
        self.f = f

    def __get__(self, obj, cls=None):
        print(f'{obj = }, {cls = }')
        return self.f.__get__(cls, cls)  # simplified version

class Test:
    @ClassMethod
    def func(cls, x):
        pass

Test().func(2)  # call from instance
Test.func(1)    # this requires binding without any instance
We can observe that in the second case, Test.func(1), there is no instance involved, but the ClassMethod descriptor can still bind to the cls.
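To make the two binding rules concrete, here is a minimal sketch showing the calls that a.x and A.x are transformed into:

class D:
    def __get__(self, obj, objtype=None):
        return (obj, objtype)

class Owner:
    x = D()

o = Owner()
assert o.x == Owner.__dict__['x'].__get__(o, Owner)         # instance binding
assert Owner.x == Owner.__dict__['x'].__get__(None, Owner)  # class binding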
Given that __get__ is used for both instance and class binding, one might ask why the same isn't true for __set__. Specifically, for x.y = z, if y is a data descriptor, why doesn't it invoke y.__set__(None, z)? I guess the reason is that there is no good use case for that, and it would unnecessarily complicate the descriptor API. What would the descriptor do with that information anyway? Typically, managing how attributes are set is done by the class (or the metaclass for types), via object.__setattr__ or type.__setattr__.
So to prevent Testing.x from being replaced by a user, you could use a custom metaclass:
class ProtectDataDescriptors(type):
    def __setattr__(self, name, value):
        if hasattr(getattr(self, name, None), '__set__'):
            raise AttributeError(f'Cannot override data descriptor {name!r}')
        super().__setattr__(name, value)

class Testing(metaclass=ProtectDataDescriptors):
    x = NonNegative()

    def __init__(self, number):
        self.x = number

Testing.x = 1  # now this raises AttributeError
However, this is not an absolute guarantee as users can still use type.__setattr__ directly to override that attribute:
type.__setattr__(Testing, 'x', 1) # this will bypass ProtectDataDescriptors.__setattr__
The line
Testing.x = 1
replaces the descriptor you've set as a class attribute for Testing with an integer.
Since the descriptor is no more, self.x = ... or t.x = ... is just an assignment that doesn't involve a descriptor.
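To see this concretely (a quick sketch against the Testing class from the question):

print(Testing.__dict__['x'])   # the NonNegative descriptor instance
Testing.x = 1                  # rebinds the class attribute...
print(Testing.__dict__['x'])   # ...to the plain int 1

t = Testing(1)                 # self.x = number is now an ordinary assignment
t.x = -5                       # no validation happens anymore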
As an aside, surely you've noticed that there is no true x attribute anymore with your descriptor, and that you can't use more than one instance of the same descriptor class without conflicts?
class Testing:
    x = NonNegative()
    y = NonNegative()

    def __init__(self, number):
        self.x = number

t = Testing(2345)
t.x = 1234
t.y = 5678
print(vars(t))
prints out
{'NonNegative': 5678}
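A common fix (a sketch assuming Python 3.6+, where __set_name__ lets each descriptor learn the attribute name it was assigned to, so each one gets its own storage key):

class NonNegative:
    def __set_name__(self, owner, name):
        # called once at class-creation time with the attribute's name
        self.storage_name = name

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance.__dict__[self.storage_name]

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f'{value!r} must be > 0')
        # reading/writing instance.__dict__ directly avoids recursing
        # through the data descriptor
        instance.__dict__[self.storage_name] = value

class Testing:
    x = NonNegative()
    y = NonNegative()

    def __init__(self, number):
        self.x = number

t = Testing(2345)
t.x = 1234
t.y = 5678
print(vars(t))   # {'x': 1234, 'y': 5678}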
Let's say we have a class A, a class B that inherits from A and classes C, D and E that inherit from B.
We want all of those classes to have an attribute _f initialized with a default value, and we want that attribute to be mutable and to have a separate value for each instance of the class, i.e. it should not be a static, constant value of A used by all subclasses.
One way to do this is to define _f in A's __init__ method, and then rely on this method in the subclasses:
class A:
    def __init__(self):
        self._f = 'default_value'

class B(A):
    def __init__(self):
        super(B, self).__init__()

class C(B):
    def __init__(self):
        super(C, self).__init__()
Is there any nice Pythonic way to avoid this, and possibly avoid using metaclasses?
If your goal is to simplify subclass constructors by eliminating the need to call the base class constructor, while still being able to override the default value in subclasses, there's a common paradigm that exploits the fact that Python returns the class's value for an attribute when the attribute doesn't exist on the instance.
Using a slightly more concrete example, instead of doing...
class Human(object):
    def __init__(self):
        self._fingers = 10

    def __repr__(self):
        return 'I am a %s with %d fingers' % (self.__class__.__name__, self._fingers)

class MutatedHuman(Human):
    def __init__(self, fingers):
        super(MutatedHuman, self).__init__()
        self._fingers = fingers

print(MutatedHuman(fingers=11))
print(Human())
...you can use...
class Human(object):
    _fingers = 10

    def __repr__(self):
        return 'I am a %s with %d fingers' % (self.__class__.__name__, self._fingers)

class MutatedHuman(Human):
    def __init__(self, fingers):
        self._fingers = fingers

print(MutatedHuman(fingers=11))
print(Human())
...both of which output...
I am a MutatedHuman with 11 fingers
I am a Human with 10 fingers
The important point being that the line self._fingers = fingers in the second example doesn't overwrite the default value set on class Human, but merely hides it when referenced as self._fingers.
It's slightly hairy when the variable refers to a mutable type, such as a list. You have to be careful not to perform an operation on the default value that modifies it in place, although it's still safe to rebind with self.name = value.
What's neat about this approach is it tends to lead to fewer lines of code than other approaches, which is usually a Good Thing (tm).
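To illustrate the mutable-type caveat, a quick sketch (Team is a hypothetical example class):

class Team(object):
    members = []                 # a single list shared as the class-level default

t1, t2 = Team(), Team()
t1.members.append('alice')       # mutates the shared default in place
print(t2.members)                # ['alice'] - visible from every instance

t1.members = ['bob']             # rebinding creates an instance attribute; safe
print(t2.members)                # still ['alice'] from the shared class default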
I have a parent class and two child classes. The parent class is an abstract base class that has a method combine which the child classes inherit. But each child implements combine differently from a parameter perspective, so each of their methods takes a different number of parameters. In Python, when a child inherits a method and re-implements it, must the newly re-implemented method match the original parameter by parameter? Is there a way around this, i.e. can the inherited method have a dynamic parameter composition?
This code demonstrates that the signature of an overridden method can easily change.
class Parent(object):
    def foo(self, number):
        for _ in range(number):
            print("Hello from parent")

class Child(Parent):
    def foo(self, number, greeting):
        for _ in range(number):
            print(greeting)

class GrandChild(Child):
    def foo(self):
        super(GrandChild, self).foo(1, "hey")

p = Parent()
p.foo(3)

c = Child()
c.foo(2, "Hi")

g = GrandChild()
g.foo()
As the other answer demonstrates for plain classes, the signature of an overridden inherited method can be different in the child than in the parent.
The same is true even if the parent is an abstract base class:
import abc

class Foo(abc.ABC):
    @abc.abstractmethod
    def bar(self, x, y):
        return x + y

class ChildFoo(Foo):
    def bar(self, x):
        # note: super(self.__class__, self) is fragile under further
        # subclassing; the explicit two-argument form is safer
        return super(ChildFoo, self).bar(x, 3)

class DumbFoo(Foo):
    def bar(self):
        return "derp derp derp"

cf = ChildFoo()
print(cf.bar(5))

df = DumbFoo()
print(df.bar())
Inappropriately complicated detour
It is an interesting exercise in Python metaclasses to try to restrict the ability to override methods, such that their argument signature must match that of the base class. Here is an attempt.
Note: I'm not endorsing this as a good engineering idea, and I did not spend time tying up loose ends, so there are likely small caveats about the code below that could make it more efficient or robust.
import types
import inspect

def strict(func):
    """Add some info for functions having a strict signature.
    """
    arg_sig = inspect.getfullargspec(func)
    func.is_strict = True
    func.arg_signature = arg_sig
    return func

class StrictSignature(type):
    def __new__(cls, name, bases, attrs):
        func_types = (types.FunctionType,)
        # Check each attribute in the class being created.
        for attr_name, attr_value in attrs.items():
            if isinstance(attr_value, func_types):
                # Check every base for @strict functions of the same name.
                for base in bases:
                    base_attr = base.__dict__.get(attr_name)
                    base_attr_is_function = isinstance(base_attr, func_types)
                    base_attr_is_strict = hasattr(base_attr, "is_strict")
                    # Assert that the inspected signatures match.
                    if base_attr_is_function and base_attr_is_strict:
                        assert (inspect.getfullargspec(attr_value) ==
                                base_attr.arg_signature)
        # If everything passed, create the class.
        return super().__new__(cls, name, bases, attrs)

# Make a base class to try out strictness
class Base(metaclass=StrictSignature):
    @strict
    def foo(self, a, b, c="blah"):
        return a + b + len(c)

    def bar(self, x, y, z):
        return x

#####
# Now try to make some classes inheriting from Base.
#####

class GoodChild(Base):
    # Was declared strict, better match the signature.
    def foo(self, a, b, c="blah"):
        return c

    # Was never declared as strict, so no rules!
    def bar(im_a_little, teapot):
        return teapot / 2

# These below can't even be created. Uncomment and try to run the file
# and see. It's not just that you can't instantiate them, you can't
# even get the *class object* defined at class creation time.
#
# class WrongChild(Base):
#     def foo(self, a):
#         return super().foo(a, 5)
#
# class BadChild(Base):
#     def foo(self, a, b, c="halb"):
#         return super().foo(a, b, c)
Note that, like with most "strict" or "private" type ideas in Python, you are still free to monkey-patch functions onto even a "good class", and those monkey-patched functions don't have to satisfy the signature constraint.
# Instance level
gc = GoodChild()
gc.foo = lambda self=gc: "Haha, I changed the signature!"
# Class level
GoodChild.foo = lambda self: "Haha, I changed the signature!"
And even if you add more complexity to the metaclass, so that it checks whenever any method-type attribute is updated in the class's __dict__ and re-runs the assert whenever the class is modified, you can still use type.__setattr__ directly to bypass the customized behavior and set an attribute anyway.
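For instance (a sketch against the GoodChild class above):

# type.__setattr__ goes straight to the default machinery,
# skipping any custom __setattr__ a metaclass might define:
type.__setattr__(GoodChild, 'foo', lambda self: 'no signature check ran')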
In these cases, I imagine Jeff Goldblum as Ian Malcolm from Jurassic Park, looking at you blankly and saying "Consenting adults, uhh, find a way.."
I am using the python mock framework for testing (http://www.voidspace.org.uk/python/mock/) and I want to mock out a superclass and focus on testing the subclasses' added behavior.
(For those interested I have extended pymongo.collection.Collection and I want to only test my added behavior. I do not want to have to run mongodb as another process for testing purposes.)
For this discussion, A is the superclass and B is the subclass. Furthermore, I define direct and indirect superclass calls as shown below:
class A(object):
    def method(self):
        ...

    def another_method(self):
        ...

class B(A):
    def direct_superclass_call(self):
        ...
        A.method(self)

    def indirect_superclass_call(self):
        ...
        super(B, self).another_method()
Approach #1
Define a mock class for A called MockA and use mock.patch to substitute it for the test at runtime. This handles direct superclass calls. Then manipulate B.__bases__ to handle indirect superclass calls. (see below)
The issue that arises is that I have to write MockA and in some cases (as in the case for pymongo.collection.Collection) this can involve a lot of work to unravel all of the internal calls to mock out.
Approach #2
The desired approach is to somehow use a mock.Mock() instance to handle calls on the mock just in time, defining return_value or side_effect in place in the test. In this manner, I have less work to do by avoiding the definition of MockA.
The issue I am having is that I cannot figure out how to alter B.__bases__ so that an instance of mock.Mock() can be put in place as a superclass (I must need to somehow do some direct binding here). Thus far I have determined that super() examines the MRO and then calls the first class that defines the method in question. I cannot figure out how to get the superclass lookup to succeed when it comes across a mock class; __getattr__ does not seem to be used in this case. I want super to think that the method is defined at this point and then use the mock.Mock() functionality as usual.
How does super() discover what attributes are defined within the class in the MRO sequence? And is there a way for me to interject here and to somehow get it to utilize a mock.Mock() on the fly?
import mock

class A(object):
    def __init__(self, value):
        self.value = value

    def get_value_direct(self):
        return self.value

    def get_value_indirect(self):
        return self.value

class B(A):
    def __init__(self, value):
        A.__init__(self, value)

    def get_value_direct(self):
        return A.get_value_direct(self)

    def get_value_indirect(self):
        return super(B, self).get_value_indirect()

# approach 1 - use a defined MockA
class MockA(object):
    def __init__(self, value):
        pass

    def get_value_direct(self):
        return 0

    def get_value_indirect(self):
        return 0

B.__bases__ = (MockA, )  # - mock superclass
with mock.patch('__main__.A', MockA):
    b2 = B(7)
    print('\nApproach 1')
    print('expected result = 0')
    print('direct =', b2.get_value_direct())
    print('indirect =', b2.get_value_indirect())
B.__bases__ = (A, )  # - original superclass

# approach 2 - use mock module to mock out superclass
# what does XXX need to be below to use mock.Mock()?
# B.__bases__ = (XXX, )
with mock.patch('__main__.A') as mymock:
    b3 = B(7)
    mymock.get_value_direct.return_value = 0
    mymock.get_value_indirect.return_value = 0
    print('\nApproach 2')
    print('expected result = 0')
    print('direct =', b3.get_value_direct())
    print('indirect =', b3.get_value_indirect())  # FAILS HERE as the old superclass is called
# B.__bases__ = (A, )  # - original superclass
is there a way for me to interject here and to somehow get it to utilize a mock.Mock() on the fly?
There may be better approaches, but you can always write your own super() and inject it into the module that contains the class you're mocking. Have it return whatever it should based on what's calling it.
You can either just define super() in the current namespace (in which case the redefinition only applies to the current module after the definition), or you can import builtins (__builtin__ in Python 2) and apply the redefinition to builtins.super, in which case it will apply globally in the Python session.
You can capture the original super function (if you need to call it from your implementation) using a default argument:
def super(type, obj=None, super=super):
# inside the function, super refers to the built-in
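A minimal sketch of that idea, assuming the class under test lives in a hypothetical module mymodule: since the name super is resolved through module globals before falling back to builtins, planting a super name in mymodule shadows the built-in for code defined there.

from unittest import mock

def fake_super(type_=None, obj=None):
    # hand back a MagicMock so any superclass method call is absorbed
    return mock.MagicMock()

import mymodule               # hypothetical module containing class B(A)
mymodule.super = fake_super   # B's super() calls now resolve to fake_super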
I played around with mocking out super() as suggested by kindall. Unfortunately, after a great deal of effort it became quite complicated to handle complex inheritance cases.
After some work I realized that super() accesses the __dict__ of classes directly when resolving attributes through the MRO (it does not do a getattr type of call). The solution is to extend a mock.MagicMock() object and wrap it with a class to accomplish this. The wrapped class can then be placed in the __bases__ variable of a subclass.
The wrapped object reflects all defined attributes of the target class into the __dict__ of the wrapping class, so that super() calls resolve to the properly patched-in attributes within the internal MagicMock.
The following code is the solution that I have found to work thus far. Note that I actually implement this within a context handler. Also, care has to be taken to patch in the proper namespaces if importing from other modules.
This is a simple example illustrating the approach:
from mock import MagicMock
import inspect

class _WrappedMagicMock(MagicMock):
    def __init__(self, *args, **kwds):
        object.__setattr__(self, '_mockclass_wrapper', None)
        super(_WrappedMagicMock, self).__init__(*args, **kwds)

    def wrap(self, cls):
        # get defined attributes of the spec class that need to be preset
        base_attrs = dir(type('Dummy', (object,), {}))
        attrs = inspect.getmembers(self._spec_class)
        new_attrs = [a[0] for a in attrs if a[0] not in base_attrs]

        # preset mocks for attributes in the target mock class
        for name in new_attrs:
            setattr(cls, name, getattr(self, name))

        # eat up any attempts to initialize the target mock class
        setattr(cls, '__init__', lambda *args, **kwds: None)
        object.__setattr__(self, '_mockclass_wrapper', cls)

    def unwrap(self):
        object.__setattr__(self, '_mockclass_wrapper', None)

    def __setattr__(self, name, value):
        super(_WrappedMagicMock, self).__setattr__(name, value)
        # be sure to reflect changes to the wrapper class if it is active
        if self._mockclass_wrapper is not None:
            setattr(self._mockclass_wrapper, name, value)

    def _get_child_mock(self, **kwds):
        # child mocks created from this one need only be MagicMocks
        return MagicMock(**kwds)

class A(object):
    x = 1

    def __init__(self, value):
        self.value = value

    def get_value_direct(self):
        return self.value

    def get_value_indirect(self):
        return self.value

class B(A):
    def __init__(self, value):
        super(B, self).__init__(value)

    def f(self):
        return 2

    def get_value_direct(self):
        return A.get_value_direct(self)

    def get_value_indirect(self):
        return super(B, self).get_value_indirect()

# nominal behavior
b = B(3)
assert b.get_value_direct() == 3
assert b.get_value_indirect() == 3
assert b.f() == 2
assert b.x == 1

# using the mock class
MockClass = type('MockClassWrapper', (), {})
mock = _WrappedMagicMock(A)
mock.wrap(MockClass)

# patch the mock in
B.__bases__ = (MockClass, )
A = MockClass

# set values within the mock
mock.x = 0
mock.get_value_direct.return_value = 0
mock.get_value_indirect.return_value = 0

# mocked behavior
b = B(7)
assert b.get_value_direct() == 0
assert b.get_value_indirect() == 0
assert b.f() == 2
assert b.x == 0
In class B below, I wanted the __set__ function of class A to be called whenever you assign a value to B().a. Instead, assigning to B().a overwrites the attribute with the value. Assigning to C().a works correctly for class C, but I wanted to have a separate value of 'a' for each instance, i.e. I don't want changing 'a' in one instance of C() to change 'a' in all other instances. I wrote a couple of tests to help illustrate the problem. Can you help me define a class that will pass both test1 and test2?
class A(object):
    def __set__(self, instance, value):
        print("__set__ called:", value)

class B(object):
    def __init__(self):
        self.a = A()

class C(object):
    a = A()

def test1(class_in):
    o = class_in()
    o.a = "test"
    if isinstance(o.a, A):
        print("pass")
    else:
        print("fail")

def test2(class_in):
    o1, o2 = class_in(), class_in()
    if o1.a is o2.a:
        print("fail")
    else:
        print("pass")
According to the documentation:
The following methods only apply when an instance of the class containing the method (a so-called descriptor class) appears in the class dictionary of another new-style class, known as the owner class. In the examples below, "the attribute" refers to the attribute whose name is the key of the property in the owner class' __dict__. Descriptors can only be implemented as new-style classes themselves.
So you can't have descriptors on instances.
However, since the descriptor gets a ref to the instance being used to access it, just use that as a key to storing state and you can have different behavior depending on the instance.
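A minimal sketch of that idea: key the stored state on the instance, using a WeakKeyDictionary so the descriptor doesn't keep instances alive:

from weakref import WeakKeyDictionary

class A(object):
    def __init__(self, default=None):
        self._values = WeakKeyDictionary()   # per-instance storage
        self._default = default

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self._values.get(instance, self._default)

    def __set__(self, instance, value):
        print("__set__ called:", value)
        self._values[instance] = value

Each instance of the owner class now gets its own value, even though they all share one descriptor object.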
Here's a class that can pass the original tests, but don't try using it in most situations: it fails the isinstance test on itself!
class E(object):
    def __new__(cls, state):
        class E(object):
            a = A(state)  # assumes a variant of A whose __init__ accepts state

            def __init__(self, state):
                self.state = state

        return E(state)

# >>> isinstance(E(1), E)
# False
I was bitten by a similar issue in that I wanted class objects with attributes governed by a descriptor. When I did this, I noticed that the attributes were being overwritten in all of the objects, so they weren't individual.
I raised a SO question and the resultant answer is here: class attribute changing value for no reason
A good document link discussing descriptors is here: http://martyalchin.com/2007/nov/24/python-descriptors-part-2-of-2/
An example descriptor from the aforementioned link is below:
class Numberise(object):
    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner):
        if self.name not in instance.__dict__:
            raise AttributeError(self.name)
        return '%o' % (instance.__dict__[self.name])

    def __set__(self, instance, value):
        print('setting value to: %d' % value)
        instance.__dict__[self.name] = value
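For completeness, a hypothetical usage sketch (the owner class Account is not from the linked article):

class Account:
    balance = Numberise('balance')

acct = Account()
acct.balance = 8      # prints: setting value to: 8
print(acct.balance)   # '10' - the octal representation of 8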