The proper way of completely overriding attribute access in Python? - python

This naive class attempts to mimic the attribute access of basic Python objects. dict and cls explicitly store the attributes and the class. The effect is that accessing .x of an instance will return dict["x"], or if that fails, cls.x, just like normal objects.
class Instance(object):
    __slots__ = ["dict", "cls"]
    def __getattribute__(self, key):
        try:
            return self.dict[key]
        except KeyError:
            return getattr(self.cls, key)
    def __setattr__(self, key, value):
        if key == "__class__":
            self.cls = value
        else:
            self.dict[key] = value
But it's nowhere near as simple as that. One obvious issue is the complete disregard for descriptors. Just imagine that cls has properties. Doing instance.some_property = 10 on an instance of Instance should go through the property as defined in cls, but will instead happily set some_property as an attribute in dict.
Then there is the issue of binding methods of cls to instances of Instance, and possibly more that I don't even know about.
There seem to be a lot of details involved in getting the above class to behave as closely to Python objects as possible, and the docs for descriptors I've read so far haven't made it clear how to get, simply put, everything right.
What I am asking for is a reference for implementing a complete replacement for python's attribute access. That is, the above class, but correct.

Well, I needed this answer so I had to do the research. The below code covers the following:
Data descriptors are given precedence both when setting and getting attributes.
Non-data descriptors are properly called in __getattribute__.
There may be typos in the code below as I had to translate it from an internal project. I am also not sure it behaves 100% like Python objects, so if anyone can spot errors, that would be great.
_sentinel = object()
def find_classattr(cls, key):
    for base in cls.__mro__: # Using __mro__ for speed.
        try: return base.__dict__[key]
        except KeyError: pass
    return _sentinel
class Instance(object):
    __slots__ = ["dict", "cls"]
    def __init__(self, d, cls):
        object.__setattr__(self, "dict", d)
        object.__setattr__(self, "cls", cls)
    def __getattribute__(self, key):
        d = object.__getattribute__(self, "dict")
        cls = object.__getattribute__(self, "cls")
        if key == "__class__":
            return cls
        # Data descriptors in the class, defined by presence of '__set__',
        # override any other kind of attribute access.
        cls_attr = find_classattr(cls, key)
        if hasattr(cls_attr, '__set__'):
            return cls_attr.__get__(self, cls)
        # Next in order of precedence are instance attributes.
        try:
            return d[key]
        except KeyError:
            # Finally class attributes, that may or may not be non-data descriptors.
            if hasattr(cls_attr, "__get__"):
                return cls_attr.__get__(self, cls)
            if cls_attr is not _sentinel:
                return cls_attr
        raise AttributeError("'{}' object has no attribute '{}'".format(
            getattr(cls, '__name__', "?"), key))
    def __setattr__(self, key, value):
        d = object.__getattribute__(self, "dict")
        cls = object.__getattribute__(self, "cls")
        if key == "__class__":
            object.__setattr__(self, "cls", value)
            return
        # Again, data descriptors override instance attributes.
        cls_attr = find_classattr(cls, key)
        if hasattr(cls_attr, '__set__'):
            cls_attr.__set__(self, value)
        else:
            d[key] = value
Funny thing is, I realized I had written exactly the same stuff a couple of years ago, but the descriptor protocol is so arcane that I had forgotten it since.
EDIT: Fixed a bug where using getattr to find an attribute on the class would call its descriptors at the class level (i.e. without the instance). Replaced it with a function that looks directly in the __dict__ of the bases.

Related

Is there a recommended way of ensuring immutability

I am observing the following behavior. Is it because Python passes objects by reference?
class Person(object):
    pass
person = Person()
person.name = 'UI'
def test(person):
    person.name = 'Test'
test(person)
print(person.name)
>>> Test
I found copy.deepcopy() to deepcopy object to prevent modifying the passed object. Are there any other recommendations ?
import copy
class Person(object):
    pass
person = Person()
person.name = 'UI'
def test(person):
    person_copy = copy.deepcopy(person)
    person_copy.name = 'Test'
test(person)
print(person.name)
>>> UI
I am observing the following behavior. Is it because Python passes objects by reference?
Not really. It's a subtle question; see "How do I pass a variable by reference?" on Stack Overflow.
Personally, I don't fully agree with the accepted answer and recommend you google call by sharing. Then, you can make your own decision on this subtle question.
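A minimal sketch of what "call by sharing" means in practice, using two hypothetical helper functions: the callee receives the same object as the caller, so mutating it is visible outside, while rebinding the parameter name is not.

```python
def rebind(names):
    # Rebinding the parameter only changes the local name.
    names = ["Test"]

def mutate(names):
    # Mutating the shared object is visible to the caller.
    names.append("Test")

names = ["UI"]
rebind(names)
after_rebind = list(names)
mutate(names)
after_mutate = list(names)

print(after_rebind)  # ['UI']
print(after_mutate)  # ['UI', 'Test']
```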
I found copy.deepcopy() to deepcopy object to prevent modifying the passed object. Are there any other recommendations ?
As far as I know, there is no better way unless you use a third-party package.
You can use the __setattr__ magic method to implement a base class that allows you to "freeze" an object after you're done with it.
This is not bullet-proof; you can still access __dict__ to mutate the object, and you can also unfreeze the object by unsetting _frozen, and if the attribute's value itself is mutable, this doesn't help much (x.things.append('x') would work for a list of things).
class Freezable:
    def freeze(self):
        self._frozen = True
    def __setattr__(self, key, value):
        if getattr(self, "_frozen", False):
            raise RuntimeError("%r is frozen" % self)
        super().__setattr__(key, value)
class Person(Freezable):
    def __init__(self, name):
        self.name = name
p = Person("x")
print(p.name)
p.name = "y"
print(p.name)
p.freeze()
p.name = "q"
outputs
x
y
Traceback (most recent call last):
  File "freezable.py", line 21, in <module>
    p.name = 'q'
RuntimeError: <__main__.Person object at 0x10f82f3c8> is frozen
There is no truly watertight way, but you can make it difficult to inadvertently mutate an object that you want to keep frozen. The recommended way for most people is probably to use a frozen dataclass or a frozen attrs class.
In his talk on dataclasses (2018), Raymond Hettinger mentions three approaches: using a metaclass; giving attributes a read-only property, as the fractions module does; and, as the dataclasses module does, extending __setattr__ and __delattr__ and overriding __hash__:
-> use a metaclass.
Good resources include David Beazley's books and talks on Python.
-> give attributes a read-only property
class SimpleFrozenObject:
    def __init__(self, x=0):
        self._x = x
    @property
    def x(self):
        return self._x
f = SimpleFrozenObject()
f.x = 2 # raises AttributeError: can't set attribute
-> extend __setattr__ and __delattr__, and override __hash__
# Pseudocode: cls stands for the frozen class itself.
class FrozenObject:
    ...
    def __setattr__(self, name, value):
        if type(self) is cls or name in (tuple of attributes to freeze,):
            raise FrozenInstanceError(f'cannot assign to field {name}')
        super(cls, self).__setattr__(name, value)
    def __delattr__(self, name):
        if type(self) is cls or name in (tuple of attributes to freeze,):
            raise FrozenInstanceError(f'cannot delete field {name}')
        super(cls, self).__delattr__(name)
    def __hash__(self):
        return hash((tuple of attributes to freeze,))
    ...
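A runnable sketch of the same idea follows. FrozenPoint, its fields x and y, and the _frozen_fields tuple are hypothetical names chosen for illustration, and FrozenInstanceError is defined locally as a stand-in rather than imported; this is not the dataclasses implementation itself.

```python
class FrozenInstanceError(AttributeError):
    """Local stand-in for the error raised on frozen attributes."""

class FrozenPoint:  # hypothetical example class
    _frozen_fields = ("x", "y")  # assumed tuple of attributes to freeze

    def __init__(self, x, y):
        # Bypass our own __setattr__ while the object is being built.
        object.__setattr__(self, "x", x)
        object.__setattr__(self, "y", y)

    def __setattr__(self, name, value):
        if name in self._frozen_fields:
            raise FrozenInstanceError(f"cannot assign to field {name!r}")
        super().__setattr__(name, value)

    def __delattr__(self, name):
        if name in self._frozen_fields:
            raise FrozenInstanceError(f"cannot delete field {name!r}")
        super().__delattr__(name)

    def __eq__(self, other):
        if not isinstance(other, FrozenPoint):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Hash only the frozen fields, which cannot change after __init__.
        return hash(tuple(getattr(self, name) for name in self._frozen_fields))

p = FrozenPoint(1, 2)
try:
    p.x = 10
except FrozenInstanceError as exc:
    print(exc)  # cannot assign to field 'x'
p.label = "a"  # names outside _frozen_fields stay assignable in this sketch
```

Unlike the pseudocode, this version freezes only the listed names; rejecting every assignment (the `type(self) is cls` branch) would be a stricter variant.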
The library attrs also offers options to create immutable objects.

Why are properties class attributes in Python?

I'm reading Fluent Python chapter 19 > A Proper Look at Properties, and I'm confused about the following words:
Properties are always class attributes, but they actually manage attribute access in the instances of the class.
The example code is:
class LineItem:
    def __init__(self, description, weight, price):
        self.description = description
        self.weight = weight # <1>
        self.price = price
    def subtotal(self):
        return self.weight * self.price
    @property # <2>
    def weight(self): # <3>
        return self.__weight # <4>
    @weight.setter # <5>
    def weight(self, value):
        if value > 0:
            self.__weight = value # <6>
        else:
            raise ValueError('value must be > 0') # <7>
From my previous experience, class attributes belong to the class itself and are shared by all the instances. But here, weight, the property, is an instance method and the value returned by it is different between instances. How is it eligible to be a class attribute? Shouldn't all class attributes be the same for every instance?
I think I misunderstand something, so I hope to get a correct explanation. Thanks!
A distinction is made because when you define a @property on a class, that property object becomes an attribute on the class. Whereas when you define attributes on an instance of your class (in your __init__ method), that attribute only exists on that object. This might be confusing if you do:
>>> dir(LineItem)
['__class__', ..., '__weakref__', 'subtotal', 'weight']
>>> item = LineItem("an item", 3, 1.12)
>>> dir(item)
['__class__', ..., '__weakref__', 'description', 'price', 'subtotal', 'weight']
Notice how both subtotal and weight exist as attributes on your class.
I think it's also worth noting that when you define a class, code under that class is executed. This includes defining variables (which then become class attributes), defining functions, and anything else.
>>> import requests
>>> class KindOfJustANamespace:
...     text = requests.get("https://example.com").text
...     while True:
...         break
...     for x in range(2):
...         print(x)
...
0
1
>>> KindOfJustANamespace.text
'<!doctype html>\n<html>\n<head>\n <title>Example Domain...'
A @decorator is just "syntactic sugar", meaning that @property over a function is the same as function = property(function). This applies to functions defined inside a class as well, but now the function is part of the class's namespace.
class TestClass:
    @property
    def foo(self):
        return "foo"
    # ^ is the same as:
    def bar(self):
        return "bar"
    bar = property(bar)
A good explanation of property in Python can be found here: https://stackoverflow.com/a/17330273/7220776
From my previous experience, class attributes belong to the class itself and are shared by all the instances.
That's right.
But here, weight, the property, is an instance method
No, it's a property object. When you do:
@decorator
def func():
    return 42
it's actually syntactic sugar for
def func():
    return 42
func = decorator(func)
IOW the def statement is executed and the function object is created, but instead of being bound to its name, it's passed to the decorator callable, and the name is bound to whatever decorator() returned.
In this case the decorator is the property class itself, so the weight attribute is a property instance. You can check this out by yourself by inspecting LineItem.weight (which will return the property object itself).
and the value returned by it is different between instances.
Well yes, of course. How is this surprising? LineItem.subtotal is also a class attribute (like all methods), yet it returns values from the instance it's called on (which is passed to the function as the self parameter).
How is it eligible to be a class attribute? Shouldn't all class attributes be the same for every instance?
The class attributes ARE the same for all instances of a class, yes. There's only one single subtotal function for all instances of LineItem.
A property is mainly a shortcut to make a function (or a pair of functions if you specify a setter) look like a plain attribute, so when you type mylineitem.weight, what is really executed is LineItem.weight.fget(mylineitem), where fget is the getter function you decorated with @property. The mechanism behind this is known as the "descriptor protocol", which is also used to turn mylineitem.subtotal() into LineItem.subtotal(mylineitem) (Python functions implement the descriptor protocol to return "method" objects, which are themselves wrappers around the function and the current instance and insert the instance as the first argument to the function call).
So it's not surprising that properties are class attributes: you only need one property instance to "serve" all instances of the class. Moreover, properties, like all descriptors FWIW, MUST actually be class attributes to work as expected, since the descriptor protocol is only invoked on class attributes (there's no use case for a "per instance" computed attribute, since the function in charge of the "computation" will get the instance as a parameter).
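The translation described above can be checked directly. This sketch uses a simplified LineItem (an assumption: the description field is dropped and a single-underscore _weight replaces the name-mangled attribute) and compares normal attribute access with the explicit calls:

```python
class LineItem:
    def __init__(self, weight, price):
        self.weight = weight  # goes through the property setter
        self.price = price

    def subtotal(self):
        return self.weight * self.price

    @property
    def weight(self):
        return self._weight

    @weight.setter
    def weight(self, value):
        if value <= 0:
            raise ValueError("value must be > 0")
        self._weight = value

item = LineItem(3, 2)
prop = LineItem.__dict__["weight"]  # the single property object on the class

print(type(prop).__name__)                          # property
print(item.weight == prop.fget(item))               # True
print(item.weight == prop.__get__(item, LineItem))  # True
print(item.subtotal() == LineItem.subtotal(item))   # True
print("weight" in vars(item))                       # False
```

The last line shows that the instance only stores _weight and price; weight itself lives on the class.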
I finally understand the descriptor and property concept through Simeon Franklin's excellent presentation, the following contents can be seen as a summary on his lecture notes. Thanks to him!
To understand properties, you first need to understand descriptors, because a property is implemented by a descriptor and python's decorator syntactic sugar. Don't worry, it's not that difficult.
What is a descriptor?
A descriptor is any object that implements at least one of the methods __get__(), __set__(), and __delete__().
Descriptors can be divided into two categories:
A data descriptor implements both __get__() and __set__().
A non-data descriptor implements only __get__().
According to python's HowTo:
a descriptor is an object attribute with “binding behavior”, one whose attribute access has been overridden by methods in the descriptor protocol.
Then what is the descriptor protocol? Basically, when the Python interpreter comes across an attribute access like obj.attr, it searches in a fixed order to resolve attr. If attr is a descriptor, it takes a particular place in that order, and the attribute access is translated into a method call on the descriptor, possibly shadowing an instance attribute or class attribute of the same name. More concretely: if attr is a data descriptor on the class, obj.attr evaluates to the result of calling the descriptor's __get__ method; otherwise, if attr is an instance attribute, the instance attribute is returned; otherwise, if attr is a non-data descriptor, we get the result of calling its __get__ method. Full rules on attribute resolution can be found here.
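The lookup order described above can be observed with a small sketch. The class names DataDesc, NonDataDesc and Demo are hypothetical; namesake instance attributes are written straight into the instance __dict__ to see which one wins:

```python
class DataDesc:
    def __get__(self, obj, objtype=None):
        return "from data descriptor"

    def __set__(self, obj, value):
        raise AttributeError("read-only")

class NonDataDesc:
    def __get__(self, obj, objtype=None):
        return "from non-data descriptor"

class Demo:
    a = DataDesc()
    b = NonDataDesc()
    c = NonDataDesc()

d = Demo()
# Create namesake instance attributes without going through __setattr__.
d.__dict__["a"] = "from instance dict"
d.__dict__["b"] = "from instance dict"

print(d.a)  # from data descriptor
print(d.b)  # from instance dict
print(d.c)  # from non-data descriptor
```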
Now let's talk about property. If you have looked at Python's descriptor HowTo, you will have found a pure Python implementation of property:
class Property(object):
    "Emulate PyProperty_Type() in Objects/descrobject.c"
    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        if doc is None and fget is not None:
            doc = fget.__doc__
        self.__doc__ = doc
    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        return self.fget(obj)
    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(obj, value)
    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(obj)
    def getter(self, fget):
        return type(self)(fget, self.fset, self.fdel, self.__doc__)
    def setter(self, fset):
        return type(self)(self.fget, fset, self.fdel, self.__doc__)
    def deleter(self, fdel):
        return type(self)(self.fget, self.fset, fdel, self.__doc__)
Evidently, property is a data descriptor!
@property just uses Python's decorator syntactic sugar.
@property
def attr(self):
    pass
is equivalent to:
attr = property(attr)
So, attr is no longer an instance method as I stated in the question, but is turned into a class attribute by the decorator syntactic sugar, as the author said. It's a descriptor object attribute.
How is it eligible to be a class attribute?
OK, we solved it now.
Then:
Shouldn't all class attributes be the same for every instance?
No!
I borrow an example from Simeon Franklin's excellent presentation.
>>> class MyDescriptor(object):
...     def __get__(self, obj, type):
...         print(self, obj, type)
...     def __set__(self, obj, val):
...         print("Got %s" % val)
...
>>> class MyClass(object):
...     x = MyDescriptor()  # Attached at class definition time!
...
>>> obj = MyClass()
>>> obj.x  # a function call is hiding here
<...MyDescriptor object ...> <....MyClass object ...> <class '__main__.MyClass'>
>>>
>>> MyClass.x  # and here!
<...MyDescriptor object ...> None <class '__main__.MyClass'>
>>>
>>> obj.x = 4  # and here
Got 4
Pay attention to obj.x and its output. The second element in its output is <....MyClass object ...>. It's the specific instance obj. In short, because this attribute access has been translated into a __get__ method call, and this __get__ method receives the specific instance as an argument, as its signature descr.__get__(self, obj, type=None) demands, it can return different values depending on which instance it is called on.
Note: my explanation in English may not be clear enough, so I highly recommend that you look at Simeon Franklin's notes and Python's descriptor HowTo.
You didn't misunderstand. Don't worry, just read on. It will become clear in the next chapter.
The same book explains in chapter 20 that they can be class attributes because of the descriptor protocol. The documentation explains how properties are implemented as descriptors.
As you see from the example, properties are really class attributes (methods). When called, they get a reference to the instance, and write to or read from its underlying __dict__.
I think the example is wrong; the init should look like this:
def __init__(self, description, weight, price):
    self.description = description
    self.__weight = weight # <1>
    self.__price = price
self.__weight and self.__price are the internal attributes hidden in the class by the properties

Intercepting __getattr__ in a Python 2.3 old-style mixin class?

I have a large Python 2.3 based installation with 200k LOC. As part of a migration project I need to intercept all attribute lookups of all old-style classes.
Old legacy code:
class Foo(Bar):
    ...
My idea is to inject a common mixin class like
class Foo(Bar, Mixin):
    ...
class Mixin:
    def __getattr__(self, k):
        print repr(self), k
        return Foo.__getattr__(self, k)
However, I always run into a recursion because Foo.__getattr__ resolves to Mixin.__getattr__.
Is there any way to fix the code for Python 2.3 old-style classes?
If you are already injecting mixins, why not add object as a parent to make them new-style?
class Foo(Mixin, Bar, object):
    ...
And then use super:
class Mixin(object):
    def __getattr__(self, k):
        print repr(self), k
        return super(Mixin, self).__getattr__(k)
Assuming that none of the classes in your code base implement __setattr__ or __getattr__, one approach is to intercept __setattr__ in your Mixin, writing the value to another reserved attribute, and then read it back in __getattr__:
class Mixin:
    def __setattr__(self, attr, value):
        # write the value into some special reserved space
        namespace = self.__dict__.setdefault("_namespace", {})
        namespace[attr] = value
    def __getattr__(self, attr):
        # reject special methods so e.g. __repr__ can't recurse
        if attr.startswith("__") and attr.endswith("__"):
            raise AttributeError
        # do whatever you wish to do here ...
        print repr(self), attr
        # read the value from the reserved space
        namespace = self.__dict__.get("_namespace", {})
        return namespace[attr]
Example:
class Foo(Mixin):
    def __init__(self):
        self.x = 1
Then
>>> Foo().x
<__main__.Foo instance at 0x10c4dad88> x
Clearly this won't work if any of your Foo classes implement __setattr__ or __getattr__ themselves.
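For comparison only, and not applicable to Python 2.3 old-style classes: on Python 3, where every class is new-style, a mixin can intercept every lookup with __getattribute__ and delegate through super(). This is a sketch with hypothetical Bar/Foo classes and a module-level list standing in for the print call:

```python
accessed = []  # records every attribute name looked up on an instance

class Mixin:
    def __getattribute__(self, name):
        accessed.append(name)
        # Delegate to the default lookup; no recursion because we do not
        # touch attributes of self inside this method.
        return super().__getattribute__(name)

class Bar:
    def __init__(self):
        self.x = 1

class Foo(Mixin, Bar):
    pass

foo = Foo()
value = foo.x
print(value)     # 1
print(accessed)  # includes 'x'
```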

How does a descriptor with __set__ but without __get__ work?

I read somewhere about the fact you can have a descriptor with __set__ and without __get__.
How does it work?
Does it count as a data descriptor? Is it a non-data descriptor?
Here is a code example:
class Desc:
    def __init__(self, name):
        self.name = name
    def __set__(self, inst, value):
        inst.__dict__[self.name] = value
        print("set", self.name)
class Test:
    attr = Desc("attr")
>>> myinst = Test()
>>> myinst.attr = 1234
set attr
>>> myinst.attr
1234
>>> myinst.attr = 5678
set attr
>>> myinst.attr
5678
The descriptor you've given in the example is a data descriptor.
Upon setting the attribute, as any other data descriptor, it takes the highest priority and is called like so:
type(myinst).__dict__["attr"].__set__(myinst, 1234)
This in turn, adds attr to the instance dictionary according to your __set__ method.
Upon attribute access, the descriptor is checked for a __get__ method. Since it has none, the search is redirected to the instance's dictionary like so:
myinst.__dict__["attr"]
If it is not found in the instance dictionary, the descriptor itself is returned.
This behavior is briefly documented in the data model like so:
If it does not define __get__(), then accessing the attribute will
return the descriptor object itself unless there is a value in the
object’s instance dictionary.
Common use cases include avoiding {instance: value} dictionaries inside the descriptors, and caching values in an efficient way.
In Python 3.6, __set_name__ was added to the descriptor protocol thus eliminating the need for specifying the name inside the descriptor. This way, your descriptor can be written like so:
class Desc:
    def __set_name__(self, owner, name):
        self.name = name
    def __set__(self, inst, value):
        inst.__dict__[self.name] = value
        print("set", self.name)
class Test:
    attr = Desc()
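Both fallbacks described above (the instance dictionary first, then the descriptor object itself) can be checked with a short sketch based on the same Desc/Test classes, with the print call left out:

```python
class Desc:
    def __set_name__(self, owner, name):
        self.name = name

    def __set__(self, inst, value):
        # Store the value in the instance dict under the attribute's own name.
        inst.__dict__[self.name] = value

class Test:
    attr = Desc()

t = Test()
before = t.attr  # nothing in the instance dict yet: the descriptor itself
t.attr = 1234    # routed through Desc.__set__
after = t.attr   # no __get__, so the instance dict value is returned

print(before is Test.__dict__["attr"])  # True
print(after)                            # 1234
print(vars(t))                          # {'attr': 1234}
```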

Python properties: why doesn't this code run the function?

I'm trying to use properties and I tried to adapt the code from the Python documentation. I'd expect the following to print something, but it doesn't. Why does it not print anything?
class User:
    def getter(self, name):
        def get_prop(self):
            print 'Getting {}'.format(name)
            return getattr(self, name)
        return get_prop
    def setter(self, name):
        def set_prop(self, value):
            print 'Setting {} to {}'.format(name, value)
            return setattr(self, name, value)
        return set_prop
    user_id = property(getter, setter)
u = User()
u.user_id = 10
u.user_id
There are two reasons your property doesn't work:
You need to use a new-style class (by basing your class on object); otherwise you cannot use a property with a setter (only a getter is supported for old-style classes).
You are generating accessors as nested functions; you need to call those outer methods to generate the accessors, since the property() function will not do this for you. As such, you can move those functions out of the class and use them as plain functions instead.
The following code works:
def getter(name):
    def get_prop(self):
        print 'Getting {}'.format(name)
        return getattr(self, name)
    return get_prop
def setter(name):
    def set_prop(self, value):
        print 'Setting {} to {}'.format(name, value)
        return setattr(self, name, value)
    return set_prop
class User(object):
    user_id = property(getter('_user_id'), setter('_user_id'))
Note that I used _user_id for the property 'name' here, otherwise the getattr(self, name) call will trigger an infinite recursion; u.user_id would trigger a getattr(u, 'user_id') which triggers the property again.
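For readers on Python 3, where all classes are new-style, the same idea can be sketched with a single factory function. The name logged_property is hypothetical, and _user_id is the backing attribute as in the answer above:

```python
def logged_property(name):
    # `name` is the backing attribute, e.g. "_user_id"; it must differ from
    # the property's own name to avoid infinite recursion.
    def get_prop(self):
        print("Getting {}".format(name))
        return getattr(self, name)

    def set_prop(self, value):
        print("Setting {} to {}".format(name, value))
        setattr(self, name, value)

    return property(get_prop, set_prop)

class User:
    user_id = logged_property("_user_id")

u = User()
u.user_id = 10     # prints: Setting _user_id to 10
value = u.user_id  # prints: Getting _user_id
```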
Probably because properties only work with new-style objects. Change your class statement to:
class User(object):
Looking further, your getter and setter functions are returning functions, that you then do not call.
You seem to want dynamic attribute names, yet you have just one attribute name returned from property, in your case this is user_id. What is it you are trying to achieve?
