How to replace/bypass a class property? - python

I would like to have a class with an attribute attr that, when accessed for the first time, runs a function and returns a value, and then becomes this value (its type changes, etc.).
A similar behavior can be obtained with:
class MyClass(object):
    @property
    def attr(self):
        try:
            return self._cached_result
        except AttributeError:
            result = ...
            self._cached_result = result
            return result

obj = MyClass()
print obj.attr  # First calculation
print obj.attr  # Cached result is used
However, .attr does not itself become the initial result when doing this; it would be more efficient if it did.
A difficulty is that once obj.attr is defined as a property, it cannot easily be set to something else, because infinite loops appear naturally. Thus, in the code above, the obj.attr property has no setter, so it cannot be modified directly. If a setter is defined, then replacing obj.attr from within that setter creates an infinite loop (the setter is invoked from inside itself). I also thought of deleting the property first, with del self.attr, so as to be able to do a regular self.attr = …, but deletion goes through the property deleter (if any), which recreates the infinite-loop problem (modifications of self.attr anywhere generally go through the property machinery).
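A minimal sketch of the loop that paragraph describes (hypothetical `Broken` class, Python 3; any assignment inside the setter re-enters the setter):

```python
class Broken(object):
    @property
    def attr(self):
        return self._value

    @attr.setter
    def attr(self, value):
        # this assignment goes through the property again -> recursion
        self.attr = value

obj = Broken()
try:
    obj.attr = 1
except RecursionError:
    print('infinite loop, as described')
```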
So, is there a way to bypass the property mechanism and replace the bound property obj.attr by anything, from within MyClass.attr.__getter__?

This looks a bit like premature optimization: you want to skip a method call by making a descriptor change itself.
It's perfectly possible, but it would have to be justified.
To modify the descriptor from your property, you'd have to be editing your class, which is probably not what you want.
I think a better way to implement this would be to:

- not define obj.attr at all;
- override __getattr__: if the argument is "attr", set obj.attr = new_value and return it; otherwise raise AttributeError.

As soon as obj.attr is set, __getattr__ will not be called any more, as it is only called when the attribute does not exist. (__getattribute__ is the one that would get called every time.)
The main difference from your initial proposal is that the first attribute access is slower, because of the method call overhead of __getattr__, but after that it is as fast as a regular __dict__ lookup.
Example:

class MyClass(object):
    def __getattr__(self, name):
        if name == 'attr':
            self.attr = ...
            return self.attr
        raise AttributeError(name)

obj = MyClass()
print obj.attr  # First calculation
print obj.attr  # Cached result is used
EDIT: Please see the other answer, especially if you use Python 3.6 or later.
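For reference, since Python 3.8 the standard library ships this exact pattern as functools.cached_property, which stores the computed value in the instance __dict__ on first access. A quick sketch:

```python
from functools import cached_property

class MyClass:
    @cached_property
    def attr(self):
        print('first calculation')
        return 42

obj = MyClass()
print(obj.attr)  # runs the method once: prints "first calculation", then 42
print(obj.attr)  # 42, served straight from obj.__dict__
```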

For new-style classes, which use the descriptor protocol, you can do this by creating a custom descriptor class whose __get__() method will be called at most once. When that happens, the result is cached by creating an instance attribute with the same name as the decorated method.
Here's what I mean.
from __future__ import print_function

class cached_property(object):
    """Descriptor that lazily evaluates a method once and caches the result."""
    def __init__(self, func):
        self.func = func
    def __get__(self, inst, cls):
        if inst is None:
            return self
        else:
            value = self.func(inst)
            setattr(inst, self.func.__name__, value)
            return value

class MyClass(object):
    @cached_property
    def attr(self):
        print('doing long calculation...', end='')
        result = 42
        return result

obj = MyClass()
print(obj.attr)  # -> doing long calculation...42
print(obj.attr)  # -> 42
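To verify that the descriptor really is bypassed after the first access, you can inspect the instance __dict__; a self-contained sketch repeating the cached_property class above:

```python
class cached_property(object):
    def __init__(self, func):
        self.func = func
    def __get__(self, inst, cls):
        if inst is None:
            return self
        value = self.func(inst)
        setattr(inst, self.func.__name__, value)  # plain instance attribute now
        return value

class MyClass(object):
    @cached_property
    def attr(self):
        return 42

obj = MyClass()
assert 'attr' not in obj.__dict__  # descriptor still in charge
obj.attr                           # first access runs the function...
assert obj.__dict__['attr'] == 42  # ...and shadows the descriptor
```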

Related

Don't understand the default implementation of Non-Overriding Descriptors in Python

I don't understand the design decision to render non-overriding descriptors ineffective when an instance attribute exists, e.g.
>>> class Descriptor:
...     def __get__(self, obj, objtype=None):
...         return 4
...
>>> class Class:
...     attr = Descriptor()
...     def __init__(self):
...         self.attr = 'instance attr'
...
>>> Class().attr  # why doesn't this return 4?
'instance attr'
To me, overriding descriptors make sense in that if we have a descriptor with __set__, then that __set__ pretty much always gets used for something like obj.attr = <new value>.
Why aren't non-overriding descriptors this simple in the language, i.e. why isn't __get__ pretty much always used when attributes are accessed, e.g. obj.attr?
Here is how I benefit from this behavior:
class Lazy:
    def __init__(self, function):
        self.name = function.__name__
        self.function = function
    def __get__(self, obj, type=None):
        print("get value by heavier __get__")
        obj.__dict__[self.name] = self.function(obj)
        return obj.__dict__[self.name]

class C:
    @Lazy
    def attr(self):
        # doing heavy calculation
        return "heavy calculation result"

>>> c = C()
>>> c.attr
get value by heavier __get__
'heavy calculation result'
>>> c.attr  # get value by __dict__
'heavy calculation result'
In the attribute lookup logic for instances, __dict__ has higher priority than a non-data descriptor (the "non-overriding descriptor" of your question). On the first attempt to access the attribute, since it doesn't exist in __dict__ yet, it can be calculated in __get__ and cached in the traditional location (__dict__). Because __get__ has the lower priority, later lookups are answered by __dict__ directly.
__get__ can host complicated, heavy lookup logic. So, if you have two approaches that yield the identical value, the heavier one should have the lower priority.
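That priority rule can be demonstrated in isolation (hypothetical `NonData` class; a non-data descriptor defines __get__ but no __set__):

```python
class NonData:
    def __get__(self, obj, objtype=None):
        return 'from descriptor'

class C:
    attr = NonData()

c = C()
print(c.attr)                         # 'from descriptor': nothing cached yet
c.__dict__['attr'] = 'from __dict__'  # plant an instance attribute
print(c.attr)                         # the instance __dict__ now wins
```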
TLDR: An instance variable definition, i.e. self.attr = 'instance attr', overrides a class variable definition, i.e. attr = Descriptor(), whenever there's a name clash. Either change the names, or use the class name to access class variables, like so:

Class.attr  # instead of Class().attr
Long Answer
There are class variables and instance variables. Watch this video for a quick understanding. Class variables can be accessed using the class name or an instance (if the name hasn't yet been overridden), but instance variables can only be accessed through instances.
In your example, the following line sets up a class variable attr, which if accessed will return 4
attr = Descriptor() # returns 4 because of __get__() definition
However with this next line inside the constructor of Class i.e. __init__(), you set up an instance variable with the same name attr, thus overriding the class variable.
self.attr = 'instance attr'
Whenever you access the properties of an instance, first the instance variables are checked and only if it's not found then the class variables are checked. Class() creates an instance, and so Class().attr will search for attr as an instance variable and since it's defined, the value of 'instance attr' will be returned. If you hadn't defined it in __init__(), then it wouldn't be found, and so as the next step attr will be searched for as a class variable, and since that would be defined with the value 4, it will be returned.

Is it possible to inverse inheritance using a Python meta class?

Out of curiosity, I'm interested whether it's possible to write a meta class that causes methods of parent classes to have preference over methods of sub classes. I'd like to play around with it for a while. It would not be possible to override methods anymore. The base class would have to call the sub method explicitly, for example using a reference to a base instance.
class Base(metaclass=InvertedInheritance):
    def apply(self, param):
        print('Validate parameter')
        result = self.subclass.apply(param)
        print('Validate result')
        return result

class Child(Base):
    def apply(self, param):
        print('Compute')
        result = 42 * param
        return result

child = Child()
child.apply(2)
With the output:
Validate parameter
Compute
Validate result
If you only care about making lookups on instances go in reverse order (not classes), you don't even need a metaclass. You can just override __getattribute__:
class ReverseLookup:
    def __getattribute__(self, attr):
        if attr.startswith('__'):
            return super().__getattribute__(attr)
        cls = self.__class__
        if attr in self.__dict__:
            return self.__dict__[attr]
        # Using [-3::-1] skips the topmost two base classes,
        # which will be ReverseLookup and object
        for base in cls.__mro__[-3::-1]:
            if attr in base.__dict__:
                value = base.__dict__[attr]
                # handle descriptors
                if hasattr(value, '__get__'):
                    return value.__get__(self, cls)
                else:
                    return value
        raise AttributeError("Attribute {} not found".format(attr))

class Base(ReverseLookup):
    def apply(self, param):
        print('Validate parameter')
        result = self.__class__.apply(self, param)
        print('Validate result')
        return result

class Child(Base):
    def apply(self, param):
        print('Compute')
        result = 42 * param
        return result
>>> Child().apply(2)
Validate parameter
Compute
Validate result
84
This mechanism is relatively simple because lookups on the class aren't in reverse:
>>> Child.apply
<function Child.apply at 0x0000000002E06048>
This makes it easy to get a "normal" lookup just by doing it on a class instead of an instance. However, it could cause confusion in other cases, such as when a base class method tries to access a different method on the subclass, but that method doesn't actually exist on that subclass; lookup then proceeds in the normal direction and may find the method on a higher class. In other words, when doing this you have to be sure that you don't look any methods up on a class unless you're sure they're defined on that specific class.
There may well be other corner cases where this approach doesn't work. In particular you can see that I jury-rigged descriptor handling; I wouldn't be surprised if it does something weird for descriptors with a __set__, or for more complicated descriptors that make more intense use of the class/object parameters passed to __get__. Also, this implementation falls back on the default behavior for any attributes beginning with two underscores; changing this would require careful thought about how it's going to work with magic methods like __init__ and __repr__.

Making a LazilyEvaluatedConstantProperty class in Python

There's a little thing I want to do in Python, similar to the built-in property, that I'm not sure how to do.
I call this class LazilyEvaluatedConstantProperty. It is intended for properties that should be calculated only once and do not change, but they should be created lazily rather than on object creation, for performance.
Here's the usage:
class MyObject(object):
    # ... Regular definitions here
    def _get_personality(self):
        # Time consuming process that creates a personality for this object.
        print('Calculating personality...')
        time.sleep(5)
        return 'Nice person'
    personality = LazilyEvaluatedConstantProperty(_get_personality)
You can see that the usage is similar to property, except there's only a getter, and no setter or deleter.
The intention is that on the first access to my_object.personality, the _get_personality method will be called, and then the result will be cached and _get_personality will never be called again for this object.
What is my problem with implementing this? I want to do something a bit tricky to improve performance: I want that after the first access and _get_personality call, personality will become a data attribute of the object, so lookup will be faster on subsequent calls. But I don't know how it's possible since I don't have a reference to the object.
Does anyone have an idea?
I implemented it:
class CachedProperty(object):
    '''
    A property that is calculated (a) lazily and (b) only once for an object.

    Usage:

        class MyObject(object):
            # ... Regular definitions here
            def _get_personality(self):
                print('Calculating personality...')
                time.sleep(5)  # Time consuming process that creates personality
                return 'Nice person'
            personality = CachedProperty(_get_personality)
    '''
    def __init__(self, getter, name=None):
        '''
        Construct the cached property.

        You may optionally pass in the name that this property has in the
        class; this will save a bit of processing later.
        '''
        self.getter = getter
        self.our_name = name

    def __get__(self, obj, our_type=None):
        if obj is None:
            # We're being accessed from the class itself, not from an object
            return self
        value = self.getter(obj)
        if not self.our_name:
            if not our_type:
                our_type = type(obj)
            (self.our_name,) = (key for (key, value) in
                                vars(our_type).iteritems()  # .items() on Python 3
                                if value is self)
        setattr(obj, self.our_name, value)
        return value
For the future, the maintained implementation could probably be found here:
https://github.com/cool-RR/GarlicSim/blob/master/garlicsim/garlicsim/general_misc/caching/cached_property.py
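On Python 3.6+, the name-discovery scan over vars(our_type) is unnecessary: the __set_name__ hook receives the attribute name when the owning class is created. A sketch of that variant (not the maintained GarlicSim code):

```python
class CachedProperty:
    def __init__(self, getter):
        self.getter = getter

    def __set_name__(self, owner, name):
        # called automatically at class creation time (Python 3.6+)
        self.our_name = name

    def __get__(self, obj, our_type=None):
        if obj is None:
            return self
        value = self.getter(obj)
        setattr(obj, self.our_name, value)  # cache as a plain attribute
        return value

class MyObject:
    @CachedProperty
    def personality(self):
        return 'Nice person'

obj = MyObject()
print(obj.personality)                # computed on first access
print('personality' in obj.__dict__)  # True: cached now
```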

Why is getattr() not working like I think it should? I think this code should print 'sss'

Here is my code:
class foo:
    def __init__(self):
        self.a = "a"
    def __getattr__(self, x, defalut):
        if x in self:
            return x
        else:
            return defalut

a = foo()
print getattr(a, 'b', 'sss')
I know __getattr__ takes two arguments, but I want to get a default attribute back when the attribute does not exist. How can I do that? Thanks.
Also, I found that if __setattr__ is defined, the following code can't run either:
class foo:
    def __init__(self):
        self.a = {}
    def __setattr__(self, name, value):
        self.a[name] = value

a = foo()  # error, why?
Hi Alex, I changed your example:
class foo(object):
    def __init__(self):
        self.a = {'a': 'boh'}
    def __getattr__(self, x):
        if x in self.a:
            return self.a[x]
        raise AttributeError

a = foo()
print getattr(a, 'a', 'sss')
It prints {'a': 'boh'}, not 'boh'. I think it prints self.a rather than self.a['a'], which is obviously not what I want to see. Why, and is there a way to avoid it?
Your problem number one: you're defining an old-style class (we know you're on Python 2.something, even though you don't tell us, because you're using print as a keyword;-). In Python 2:
class foo:
means you're defining an old-style, aka legacy, class, whose behavior can be rather quirky at times. Never do that -- there's no good reason! The old-style classes exist only for compatibility with old legacy code that relies on their quirks (and were finally abolished in Python 3). Use new style classes instead:
class foo(object):
and then the check if x in self: will not cause a recursive __getattr__ call. It will however cause a failure anyway, because your class does not define a __contains__ method and therefore you cannot check if x is contained in an instance of that class.
If what you're trying to do is check whether x is defined in the instance dict of self, don't bother: __getattr__ doesn't even get called in that case -- it's only called when the attribute is not otherwise found in self.
To support three-arguments calls to the getattr built-in, just raise AttributeError in your __getattr__ method if necessary (just as would happen if you had no __getattr__ method at all), and the built-in will do its job (it's the built-in's job to intercept such cases and return the default if provided). That's the reason one never ever calls special methods such as __getattr__ directly but rather uses built-ins and operators which internally call them -- the built-ins and operators provide substantial added value.
So to give an example which makes somewhat sense:
class foo(object):
    def __init__(self):
        self.blah = {'a': 'boh'}
    def __getattr__(self, x):
        if x in self.blah:
            return self.blah[x]
        raise AttributeError

a = foo()
print getattr(a, 'b', 'sss')
This prints sss, as desired.
If you add a __setattr__ method, that one intercepts every attempt to set attributes on self -- including self.blah = whatever. So -- when you need to bypass the very __setattr__ you're defining -- you must use a different approach. For example:
class foo(object):
    def __init__(self):
        self.__dict__['blah'] = {}
    def __setattr__(self, name, value):
        self.blah[name] = value
    def __getattr__(self, x):
        if x in self.blah:
            return self.blah[x]
        raise AttributeError

a = foo()
print getattr(a, 'b', 'sss')
This also prints sss. Instead of
self.__dict__['blah'] = {}
you could also use
object.__setattr__(self, 'blah', {})
Such "upcalls to the superclass's implementation" (which you could also obtain via the super built-in) are one of the rare exceptions to the rules "don't call special methods directly, call the built-in or use the operator instead" -- here, you want to specifically bypass the normal behavior, so the explicit special-method call is a possibility.
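The super route looks like this, sketched in Python 3 syntax (the snippets above are Python 2, where you would write super(foo, self).__setattr__(...)):

```python
class foo(object):
    def __init__(self):
        # bypass our own __setattr__ for the initial assignment
        super().__setattr__('blah', {})

    def __setattr__(self, name, value):
        self.blah[name] = value

    def __getattr__(self, x):
        if x in self.blah:
            return self.blah[x]
        raise AttributeError(x)

a = foo()
a.b = 1
print(getattr(a, 'b', 'sss'))   # 1
print(getattr(a, 'c', 'sss'))   # sss
```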
You are confusing the getattr built-in function, which retrieves some attribute binding of an object dynamically (by name), at runtime, and the __getattr__ method, which is invoked when you access some missing attribute of an object.
You can't ask
if x in self:
from within __getattr__, because the in operator will cause __getattr__ to be invoked, leading to infinite recursion.
If you simply want to have undefined attributes all be defined as some value, then
def __getattr__(self, ignored):
    return "Bob Dobbs"

How to find class of bound method during class construction in Python 3.1?

I want to write a decorator that enables methods of classes to become visible to other parties; the problem I am describing is, however, independent of that detail. The code will look roughly like this:
def CLASS_WHERE_METHOD_IS_DEFINED(method):
    ???

def foobar(method):
    print(CLASS_WHERE_METHOD_IS_DEFINED(method))

class X:
    @foobar
    def f(self, x):
        return x ** 2
My problem here is that at the very moment the decorator, foobar(), gets to see the method, it is not yet callable as a method; instead, it sees a plain, unbound function. Maybe this can be resolved by using another decorator on the class that will take care of whatever has to be done to the bound method. The next thing I will try is to simply earmark the decorated method with an attribute when it goes through the decorator, and then use a class decorator or a metaclass to do the postprocessing. If I get that to work, I do not have to solve this riddle, which still puzzles me:
Can anyone, in the above code, fill out meaningful lines under CLASS_WHERE_METHOD_IS_DEFINED so that the decorator can actually print out the class where f is defined, the moment it gets defined? Or is that possibility precluded in Python 3?
When the decorator is called, it's called with a function as its argument, not a method -- therefore it will avail the decorator nothing to examine and introspect that function as much as it wants, because it's only a function and carries no information whatsoever about the enclosing class. I hope this solves your "riddle", although in the negative sense!
Other approaches might be tried, such as deep introspection on nested stack frames, but they're hacky, fragile, and sure not to carry over to other implementations of Python 3 such as pynie; I would therefore heartily recommend avoiding them, in favor of the class-decorator solution that you're already considering and is much cleaner and more solid.
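A sketch of that class-decorator approach (hypothetical names expose and process_exposed, not from the question; the method decorator only marks the plain function, and the class decorator pairs the marks with the finished class):

```python
def expose(func):
    # the method decorator can only mark the plain function...
    func._exposed = True
    return func

def process_exposed(cls):
    # ...the class decorator runs after the class exists, so it can
    # pair each marked function with its defining class
    for name, attr in vars(cls).items():
        if getattr(attr, '_exposed', False):
            print('%s is defined on %s' % (name, cls.__name__))
    return cls

@process_exposed
class X:
    @expose
    def f(self, x):
        return x ** 2
```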
As I mentioned in some other answers, since Python 3.6 the solution to this problem is very easy thanks to object.__set_name__ which gets called with the class object that is being defined.
We can use it to define a decorator that has access to the class in the following way:
class class_decorator:
    def __init__(self, fn):
        self.fn = fn

    def __set_name__(self, owner, name):
        # do something with "owner" (i.e. the class)
        print(f"decorating {self.fn} and using {owner}")
        # then replace ourself with the original method
        setattr(owner, name, self.fn)
Which can then be used as a normal decorator:
>>> class A:
...     @class_decorator
...     def hello(self, x=42):
...         return x
...
decorating <function A.hello at 0x7f9bedf66bf8> and using <class '__main__.A'>
>>> A.hello
<function __main__.A.hello(self, x=42)>
This is a very old post, but introspection isn't the way to solve this problem, because it can be more easily solved with a metaclass and a bit of clever class construction logic using descriptors.
import types

# a descriptor as a decorator
class foobar(object):
    owned_by = None

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        # a proxy for `func` that gets used when
        # `foobar` is called through the class
        return self.func(*args, **kwargs)

    def __get__(self, inst, cls=None):
        if inst is not None:
            # return a bound method when `foobar` is referenced
            # from an instance (the two-argument form of
            # types.MethodType works on both Python 2 and 3)
            return types.MethodType(self.func, inst)
        else:
            return self

    def init_self(self, name, cls):
        print("I am named '%s' and owned by %r" % (name, cls))
        self.named_as = name
        self.owned_by = cls

    def init_cls(self, cls):
        print("I exist in the mro of %r instances" % cls)
        # don't set `self.owned_by` here because
        # this descriptor exists in the mro of
        # many classes, but is only owned by one.
        print('')
The key to making this work is the metaclass - it searches through the attributes defined on the classes it creates to find foobar descriptors. Once it does, it passes them information about the classes they are involved in through the descriptor's init_self and init_cls methods.
init_self is called only for the class the descriptor is defined on. This is where modifications to foobar itself should be made, because the method is called only once. init_cls, by contrast, is called for every class that has access to the decorated method; this is where modifications to the classes that can reference foobar should be made.
import inspect

class MetaX(type):
    def __init__(cls, name, bases, classdict):
        # The classdict contains all the attributes
        # defined on **this** class - no attribute in
        # the classdict is inherited from a parent.
        for k, v in classdict.items():
            if isinstance(v, foobar):
                v.init_self(k, cls)

        # getmembers retrieves all attributes
        # including those inherited from parents
        for k, v in inspect.getmembers(cls):
            if isinstance(v, foobar):
                v.init_cls(cls)
Example:

# for compatibility
import six

class X(six.with_metaclass(MetaX, object)):
    def __init__(self):
        self.value = 1

    @foobar
    def f(self, x):
        return self.value + x**2

class Y(X): pass

# PRINTS:
# I am named 'f' and owned by <class '__main__.X'>
# I exist in the mro of <class '__main__.X'> instances
# I exist in the mro of <class '__main__.Y'> instances

print('CLASS CONSTRUCTION OVER\n')
print(Y().f(3))
# PRINTS:
# 10
