This question is about which one is preferable in which use case, not about the technical background.
In Python, you can control attribute access via a property, a descriptor, or magic methods. Which one is most Pythonic in which use case? All of them seem to have the same effect (see the examples below).
I am looking for an answer like:
Property: Should be used in case of …
Descriptor: In the case of … it should be used instead of a property.
Magic method: Only use if ….
Example
A use case would be an attribute that cannot be set in the __init__ method, for example because the object is not present in the database yet, but only at some later time. Each time the attribute is accessed, the code should try to set it and then return it.
As an example that works with copy & paste in the Python shell, here is a class that wants to reveal its attribute only the second time it is asked for it. So, which one is the best way, or are there different situations in which one of them is preferable? Here are the three ways to implement it:
With Property::
class ContactBook(object):
    intents = 0

    def __init__(self):
        self.__first_person = None

    def get_first_person(self):
        ContactBook.intents += 1
        if self.__first_person is None:
            if ContactBook.intents > 1:
                value = 'Mr. First'
                self.__first_person = value
            else:
                return None
        return self.__first_person

    def set_first_person(self, value):
        self.__first_person = value

    first_person = property(get_first_person, set_first_person)
With __getattribute__::
class ContactBook(object):
    intents = 0

    def __init__(self):
        self.first_person = None

    def __getattribute__(self, name):
        if name == 'first_person' \
                and object.__getattribute__(self, name) is None:
            ContactBook.intents += 1
            if ContactBook.intents > 1:
                value = 'Mr. First'
                self.first_person = value
            else:
                value = None
        else:
            value = object.__getattribute__(self, name)
        return value
Descriptor::
class FirstPerson(object):
    def __init__(self, value=None):
        self.value = value

    def __get__(self, instance, owner):
        if self.value is None:
            ContactBook.intents += 1
            if ContactBook.intents > 1:
                self.value = 'Mr. First'
            else:
                return None
        return self.value

class ContactBook(object):
    intents = 0
    first_person = FirstPerson()
Each one of them shows this behavior::
book = ContactBook()
print(book.first_person)
# >>None
print(book.first_person)
# >>Mr. First
Basically, use the simplest one you can. Roughly speaking, the order of complexity/heavy-duty-ness goes: regular attribute, property, __getattr__, __getattribute__/descriptor. (__getattribute__ and custom descriptors are both things you probably won't need to do very often.) This leads to some simple rules of thumb:
Don't use a property if a normal attribute will work.
Don't write your own descriptor if a property will work.
Don't use __getattr__ if a property will work.
Don't use __getattribute__ if __getattr__ will work.
Stated a bit more specifically: use a property to customize handling of one or a small set of attributes; use __getattr__ to customize handling of all attributes, or all except a small set; use __getattribute__ if you were hoping to use __getattr__ but it doesn't quite work; write your own descriptor class if you are doing something very complicated.
You use a property when you have one or a small set of attributes whose getting/setting you want to hook into. That is, you want things like obj.prop and obj.prop = 2 to secretly call a function that you write to customize what happens.
You would use __getattr__ when you want to do that for so many attributes that you don't actually want to define them all individually, but rather want to customize the whole attribute-access process as a whole. In other words, instead of hooking into obj.prop1, and obj.prop2, etc., you have so many that you want to be able to hook into obj.<anything>, and handle that in general.
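For instance, here is a minimal sketch of that kind of blanket hook (the class and attribute names are invented for illustration); __getattr__ is only called when normal lookup fails, so it catches "anything" you haven't defined:

class LazyRecord(object):
    """Illustrative only: answers for any attribute it does not already have."""
    def __init__(self, **known):
        # Normal attributes live in __dict__ as usual.
        self.__dict__.update(known)

    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for "missing" attributes.
        return 'computed value for {!r}'.format(name)

record = LazyRecord(title='known title')
print(record.title)     # 'known title' -- found normally, __getattr__ not called
print(record.anything)  # "computed value for 'anything'"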
However, __getattr__ still won't let you override what happens for attributes that really do exist, it just lets you hook in with a blanket handling for any use of attributes that would otherwise raise an AttributeError. Using __getattribute__ lets you hook in to handle everything, even normal attributes that would have worked without messing with __getattribute__. Because of this, using __getattribute__ has the potential to break fairly basic behavior, so you should only use it if you considered using __getattr__ and it wasn't enough. It also can have a noticeable performance impact. You might for instance need to use __getattribute__ if you're wrapping a class that defines some attributes, and you want to be able to wrap those attributes in a custom way, so that they work as usual in some situations but get custom behavior in other situations.
Finally, I would say writing your own descriptor is a fairly advanced task. property is a descriptor, and for probably 95% of cases it's the only one you'll need. A good simple example of why you might write your own descriptor is given here: basically, you might do it if you would otherwise have to write several properties with similar behavior; a descriptor lets you factor out the common behavior to avoid code repetition. Custom descriptors are used, for instance, to drive systems like Django and SQLAlchemy. If you find yourself writing something at that level of complexity, you might need to write a custom descriptor.
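As a rough sketch of that "factor out several similar properties" case (all names here are invented for illustration, not taken from the question): instead of writing three nearly identical property getter/setter pairs, one small descriptor can be reused:

class StrippedString(object):
    """Stores a string on the instance under a private name, stripped of whitespace."""
    def __init__(self, name):
        self.storage_name = '_' + name   # where the value lives on the instance

    def __get__(self, instance, owner):
        if instance is None:             # accessed on the class itself
            return self
        return getattr(instance, self.storage_name, '')

    def __set__(self, instance, value):
        setattr(instance, self.storage_name, value.strip())

class Person(object):
    # Three attributes share the same behaviour without three separate properties.
    first = StrippedString('first')
    last = StrippedString('last')
    city = StrippedString('city')

p = Person()
p.first = '  Ada '
print(p.first)   # 'Ada'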
In your example, property would be the best choice. It is usually (not always) a red flag if you're doing if name == 'somespecificname' inside __getattribute__. If you only need to specially handle one specific name, you can probably do it without stooping to the level of __getattribute__. Likewise, it doesn't make sense to write your own descriptor if all you write for its __get__ is something you could have written in a property's getter method.
__getattribute__ is the hook that enables property (and other descriptors) to work in the first place and is called for all attribute access on an object. Consider it a lower-level API when a property or even a custom descriptor is not enough for your needs. __getattr__ is called by __getattribute__ when no attribute has been located through other means, as a fallback.
Use property for dynamic attributes with a fixed name, __getattr__ for attributes of a more dynamic nature (e.g. a series of attributes that map to values in an algorithmic manner).
Descriptors are used when you need to bind arbitrary objects to an instance. When you need to replace method objects with something more advanced for example; a recent example is a class-based decorator wrapping methods that needed to support additional attributes and methods on the method object. Generally, when you are still thinking in terms of scalar attributes, you don't need descriptors.
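A very rough sketch of that kind of method-wrapping descriptor (the decorator name and the call_count attribute are made up for illustration, not a reference to any particular library):

import functools

class counted(object):
    """Class-based decorator: the object that replaces the method still binds
    to instances like a method, but also carries extra state (call_count)."""
    def __init__(self, func):
        functools.update_wrapper(self, func)
        self.func = func
        self.call_count = 0

    def __get__(self, instance, owner):
        if instance is None:
            return self
        # Binding step: return a callable that already knows its instance.
        return functools.partial(self, instance)

    def __call__(self, instance, *args, **kwargs):
        self.call_count += 1
        return self.func(instance, *args, **kwargs)

class Greeter(object):
    @counted
    def hello(self, name):
        return 'hello ' + name

g = Greeter()
print(g.hello('world'))          # 'hello world'
print(Greeter.hello.call_count)  # 1 -- extra attribute on the "method" object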
Related
I've created an example class (a bitmask class) which has 4 really simple functions. I've also created a unit-test for this class.
import unittest
class BitMask:
    def __init__(self):
        self.__mask = 0

    def set(self, slot):
        self.__mask |= (1 << slot)

    def remove(self, slot):
        self.__mask &= ~(1 << slot)

    def has(self, slot):
        return (self.__mask >> slot) & 1

    def clear(self):
        self.__mask = 0

class TestBitmask(unittest.TestCase):
    def setUp(self):
        self.bitmask = BitMask()

    def test_set_on_valid_input(self):
        self.bitmask.set(5)
        self.assertEqual(self.bitmask.has(5), True)

    def test_has_on_valid_input(self):
        self.bitmask.set(5)
        self.assertEqual(self.bitmask.has(5), True)

    def test_remove_on_valid_input(self):
        self.bitmask.set(5)
        self.bitmask.remove(5)
        self.assertEqual(self.bitmask.has(5), False)

    def test_clear(self):
        for i in range(16):
            self.bitmask.set(i)
        self.bitmask.clear()
        for j in range(16):
            with self.subTest(j=j):
                self.assertEqual(self.bitmask.has(j), False)
The problem I'm facing is that all these tests require both the set and has methods for setting and checking values in the bitmask, but those methods are themselves untested. I cannot confirm that one is correct without knowing that the other one is.
This example class isn't the first time I've experienced this issue. It usually occurs when I need to set up and check values/states within a class in order to test a method.
I've tried to find resources that explain this, but unfortunately their examples only use pure functions, or classes where the changed attribute can be read directly. I could solve the problem by extracting the methods into pure functions, or by using a read-only property that returns the attribute __mask.
But is this the preferred approach? If not, how do I test a method that needs to be set up and/or checked using untested methods?
Not sure this answers your question, as it deals with changing the initial class design, but here it goes.
You made a lazy class with no constructor arguments and no property, which hides the state of your object. It is not that the set or has methods are untested; the issue is that the state of the object is unknown. If you had a .value property revealing self.__mask, that would solve the question of testing set() and has().
I would also strongly consider a default value in the constructor, which makes for better-looking instantiation and allows easier testing (some advice on avoiding setters in Python is here):
def __init__(self, mask=0):
    self.__mask = mask
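A minimal sketch of such a read-only property (the name value is just a suggestion, and the class is trimmed to the relevant methods) could look like this; it lets tests assert on the state without going through has():

class BitMask:
    def __init__(self, mask=0):
        self.__mask = mask

    @property
    def value(self):
        # Read-only view of the internal state, handy for assertions in tests.
        return self.__mask

    def set(self, slot):
        self.__mask |= (1 << slot)

b = BitMask()
b.set(5)
assert b.value == 1 << 5   # the state is now observable without has()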
If there are any design considerations that prevent you from having a .value property, perhaps an __eq__ method can be used, provided __init__ accepts a value:
a = BitMask(0)
b = BitMask(1 << 5)   # the mask that set(5) is expected to produce
a.set(5)
assert a == b
Of course, you can challenge that by asking how __eq__ itself is tested.
Finally, perhaps you are familiar with patching or monkey-patching - a technique for blocking something inside an object under test or making it work differently (e.g. imitating a web response without the actual call). With any of the patching libraries I think you would still end up performing a kind of x.__mask = value assignment, which is not very reasonable for a small, nice, locally-defined class like the one here.
Hope this helps along the lines of what you are exploring.
I would’ve used a single underscore instead of a double one, and just looked directly at _mask in the unit tests.
Python doesn’t really have private attributes or methods; even double-underscore attributes are accessible on your instance like this: obj._BitMask__mask.
The double underscore is used when you want subclasses not to overwrite the attribute of the superclass. To indicate “private” you should use a single underscore.
Allowing access to private fields is part of Python's design, so using this ability responsibly is not considered wrong, doubly so if you are accessing your own class.
The rationale behind "do not touch the private fields" is that you as the developer can mess something up with the internals of the class, and also that the private interface of a library can change at any point and break your code.
When you are writing unit tests you are not afraid of messing with your own class, and you accept that you have to change the unit test if you change your class, so this programming idiom is not useful for you to apply here.
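For example, a test along these lines (assuming the BitMask class from the question is in scope; with a single underscore the attribute would simply be _mask, with the original double underscore you go through the mangled name):

import unittest

class TestBitmaskInternals(unittest.TestCase):
    def test_set_on_valid_input(self):
        bitmask = BitMask()                 # the BitMask class from the question
        bitmask.set(5)
        # With self._mask this would be bitmask._mask; with self.__mask the
        # name-mangled attribute is reachable as _BitMask__mask:
        self.assertEqual(bitmask._BitMask__mask, 1 << 5)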
I have an instance of a cx_Oracle.Connection called x and I'm trying to print x.clientinfo or x.module and getting:
attribute 'module' of 'cx_Oracle.Connection' objects is not readable
(What's weird is that I can do print x.username)
I can still do dir(x) successfully, and I don't have time to dig through the source code of cx_Oracle (lots of it is implemented in C), so I'm wondering how the implementer was able to do this. Was it by rolling descriptors? Or something related to __getitem__? What would be the motivation for this?
You can do this pretty easily in Python with a custom descriptor.
Look at the Descriptor Example in the HOWTO. If you just change the __get__ method to raise an AttributeError… that's it. We might as well rename it and strip out the logging stuff to make it simpler.
class WriteOnly(object):
    """A data descriptor that can't be read."""
    def __init__(self, initval=None, name='var'):
        self.val = initval
        self.name = name

    def __get__(self, obj, objtype):
        raise AttributeError("No peeking at attribute '{}'!".format(self.name))

    def __set__(self, obj, val):
        self.val = val

class MyClass(object):
    x = WriteOnly(0, 'x')

m = MyClass()
m.x = 20      # works
print(m.x)    # raises AttributeError
Note that in 2.x, if you forget the (object) and create a classic class, descriptors won't work. (I believe descriptors themselves can actually be classic classes… but don't do that.) In 3.x, there are no classic classes, so that's not a problem.
So, if the value is write-only, how would you ever read it?
Well, this toy example is useless. But you could, e.g., set some private attribute on obj rather than on yourself, at which point code that knows where the data are stored can find it, but casual introspection can't.
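A rough sketch of that variant, storing the real value under a "hidden" per-instance name (the _hidden_ prefix is arbitrary and invented for illustration):

class WriteOnly(object):
    """Write-only data descriptor that stores the value on the instance."""
    def __init__(self, name):
        self.storage_name = '_hidden_' + name

    def __get__(self, obj, objtype=None):
        raise AttributeError("No peeking at '{}'!".format(self.storage_name))

    def __set__(self, obj, val):
        # The data lives on the instance, so code that knows the storage name
        # can still reach it; casual obj.x access cannot.
        setattr(obj, self.storage_name, val)

class MyClass(object):
    x = WriteOnly('x')

m = MyClass()
m.x = 20
print(m._hidden_x)   # 20 -- retrievable if you know where it is stored
# print(m.x)         # would raise AttributeError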
But you don't even need descriptors. If you want an attribute that's write-only no matter what class you attach it to, that's one thing, but if you just want to block read access to certain members of a particular class, there's an easier way:
class MyClass(object):
    def __getattribute__(self, name):
        if name in ('x', 'y', 'z'):
            raise AttributeError("No! Bad user! You cannot see my '{}'!".format(name))
        return super().__getattribute__(name)

m = MyClass()
m.x = 20
m.x  # raises the AttributeError above
For more details, see the __getattr__ and __getattribute__ documentation from the data model chapter in the docs.
In 2.x, if you leave the (object) off and create a classic class, the rules for attribute lookup are completely different, and not completely documented, and you really don't want to learn them unless you're planning to spend a lot of time in the 90s, so… don't do that. Also, 2.x will obviously need the 2.x-style explicit super call instead of the 3.x-style magic super().
From the C API side, you've got most of the same hooks, but they're a bit different. See PyTypeObjects for details, but basically:
tp_getset lets you automatically build descriptors out of getter and setter functions, which is similar to @property but not identical.
tp_descr_get and tp_descr_set are for building descriptors separately.
tp_getattro and tp_setattro are similar to __getattr__ and __setattr__, except that the rules for when they get called are a little different, and you typically call PyObject_GenericGetAttr instead of delegating to super() when you know you have no base classes that need to hook attribute access.
Still, why would you do that?
Personally, I've done stuff like this to learn more about the Python data model and descriptors, but that's hardly a reason to put it in a published library.
I'm guessing that more often than not, someone does it because they're trying to force a mistaken notion of OO encapsulation (based on the traditional C++ model) on Python—or, worse, trying to build Java-style security-by-encapsulation (which doesn't work without a secure class loader and all that comes with it).
But there could be cases where there's some generic code that uses these objects via introspection, and "tricking" that code could be useful in a way that trying to trick human users isn't. For example, imagine a serialization library that tries to pickle or JSON-ify or whatever all of the attributes. You could easily write it to ignore non-readable attributes. (Of course you could just as easily make it, say, ignore attributes prefixed with a _…)
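A small sketch of what such tolerant introspection could look like (the function name is made up; this is not from any particular serialization library):

def readable_attrs(obj):
    """Collect the attributes of obj that can actually be read, silently
    skipping anything whose getter raises AttributeError."""
    result = {}
    for name in dir(obj):
        if name.startswith('_'):
            continue                 # skip private/dunder names
        try:
            result[name] = getattr(obj, name)
        except AttributeError:
            pass                     # write-only / blocked attribute: ignore it
    return result

# With the WriteOnly example above, readable_attrs(m) would simply omit 'x'
# instead of blowing up.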
As for why cx_Oracle did it… I've never even looked at it, so I have no idea.
I am trying to implement a class in which any attempt to access an attribute that does not exist on the current class or any of its ancestors falls back to looking up that attribute on a member. Below is a trivial version of what I am trying to do.
class Foo:
    def __init__(self, value):
        self._value = value

    def __getattr__(self, name):
        return getattr(self._value, name)

if __name__ == '__main__':
    print(Foo(5) > Foo(4))  # should do 5 > 4 (or (5).__gt__(4))
However, this raises a TypeError. Even using the operator module's attrgetter class does the same thing. I was taking a look at the documentation regarding customizing attribute access, but I didn't find it an easy read. How can I get around this?
If I understand you correctly, what you are doing is correct, but it still won't work for what you're trying to use it for. The reason is that implicit magic-method lookup does not use __getattr__ (or __getattribute__ or any other such hook). The methods have to actually be there on the class, under their magic names. Your approach will work for normal attributes, but not for magic methods. (Note that if you call Foo(5).__lt__(4) explicitly, it will work; it's only the implicit "magic" lookup --- e.g., calling __lt__ when < is used --- that is blocked.)
This post describes an approach for autogenerating magic methods using a metaclass. If you only need certain methods, you can just define them on the class manually.
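If you only need a couple of operators, the manual version stays short; a sketch for the Foo class from the question (delegating to the wrapped value, assuming the other operand is also a Foo):

class Foo:
    def __init__(self, value):
        self._value = value

    def __getattr__(self, name):
        # Still useful for ordinary attributes and explicit dunder lookups.
        return getattr(self._value, name)

    # The implicitly-invoked operators must exist on the class itself:
    def __gt__(self, other):
        return self._value > other._value

    def __lt__(self, other):
        return self._value < other._value

print(Foo(5) > Foo(4))   # True -- __gt__ is found on the class, so > works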
__*__ methods will not work unless they actually exist on the class - so neither __getattr__ nor __getattribute__ will allow you to proxy those calls. You must create every single method manually.
Yes, this does involve quite a bit of copy&paste. And yes, it's perfectly fine in this case.
You might be able to use the werkzeug LocalProxy class as a base or instead of your own class; your code would look like this when using LocalProxy:
print(LocalProxy(lambda: 5) > LocalProxy(lambda: 4))
What's the correct idiom for this please?
I want to define an object containing properties which can (optionally) be initialized from a dict (the dict comes from JSON; it may be incomplete). Later on I may modify the properties via setters.
There are actually 13+ properties, and I want to be able to use default getters and setters, but that doesn't seem to work for this case:
But I don't want to have to write explicit descriptors for all of prop1... propn
Also, I'd like to move the default assignments out of __init__() and into the accessors... but then I'd need explicit descriptors.
What's the most elegant solution? (Other than moving all the setter calls out of __init__() and into a method/classmethod _make()?)
[DELETED COMMENT The code for badprop using a default descriptor was due to a comment by a previous SO user, who gave the impression that it gives you a default setter. But it doesn't - the setter is undefined and it necessarily throws AttributeError.]
class DubiousPropertyExample(object):
    def __init__(self, dct=None):
        self.prop1 = 'some default'
        self.prop2 = 'other default'
        #self.badprop = 'This throws AttributeError: can\'t set attribute'
        if dct is None: dct = dict()  # or use defaultdict
        for prop, val in dct.items():
            self.__setattr__(prop, val)

    # How do I do default property descriptors? this is wrong
    #@property
    #def badprop(self): pass

    # Explicit descriptors for all properties - yukk
    @property
    def prop1(self): return self._prop1
    @prop1.setter
    def prop1(self, value): self._prop1 = value

    @property
    def prop2(self): return self._prop2
    @prop2.setter
    def prop2(self, value): self._prop2 = value

dub = DubiousPropertyExample({'prop2': 'crashandburn'})
print dub.__dict__
# {'_prop2': 'crashandburn', '_prop1': 'some default'}
If you run this with line 5 self.badprop = ... uncommented, it fails:
self.badprop = 'This throws AttributeError: can\'t set attribute'
AttributeError: can't set attribute
[As ever, I read the SO posts on descriptors, implicit descriptors, calling them from init]
I think you're slightly misunderstanding how properties work. There is no "default setter". It throws an AttributeError on setting badprop not because it doesn't yet know that badprop is a property rather than a normal attribute (if that were the case it would just set the attribute with no error, because that's how normal attributes behave), but because you haven't provided a setter for badprop, only a getter.
Have a look at this:
>>> class Foo(object):
        @property
        def foo(self):
            return self._foo
        def __init__(self):
            self._foo = 1

>>> f = Foo()
>>> f.foo = 2
Traceback (most recent call last):
  File "<pyshell#12>", line 1, in <module>
    f.foo = 2
AttributeError: can't set attribute
You can't set such an attribute even from outside of __init__, after the instance is constructed. If you just use @property, then what you have is a read-only property (effectively a method call that looks like an attribute read).
If all you're doing in your getters and setters is redirecting read/write access to an attribute of the same name but with an underscore prepended, then by far the simplest thing to do is get rid of the properties altogether and just use normal attributes. Python isn't Java (and even in Java I'm not convinced of the virtue of private fields with the obvious public getter/setter anyway). An attribute that is directly accessible to the outside world is a perfectly reasonable part of your "public" interface. If you later discover that you need to run some code whenever an attribute is read/written you can make it a property then without changing your interface (this is actually what descriptors were originally intended for, not so that we could start writing Java style getters/setters for every single attribute).
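For example, a plain attribute can later be swapped for a property without callers noticing; a minimal sketch (the Circle/radius names are invented for illustration):

class Circle(object):
    def __init__(self, radius):
        self.radius = radius        # this assignment goes through the setter

    @property
    def radius(self):
        return self._radius

    @radius.setter
    def radius(self, value):
        # What used to be a plain attribute now gets validation, while callers
        # keep reading and writing c.radius exactly as before.
        if value < 0:
            raise ValueError("radius must be non-negative")
        self._radius = value

c = Circle(2)
c.radius = 3        # same public interface as a plain attribute
print(c.radius)     # 3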
If you're actually doing something in the properties other than changing the name of the attribute, and you do want your attributes to be readonly, then your best bet is probably to treat the initialisation in __init__ as directly setting the underlying data attributes with the underscore prepended. Then your class can be straightforwardly initialised without AttributeErrors, and thereafter the properties will do their thing as the attributes are read.
If you're actually doing something in the properties other than changing the name of the attribute, and you want your attributes to be readable and writable, then you'll need to actually specify what happens when you get/set them. If each attribute has independent custom behaviour, then there is no more efficient way to do this than explicitly providing a getter and a setter for each attribute.
If you're running exactly the same (or very similar) code in every single getter/setter (and it's not just adding an underscore to the real attribute name), and that's why you object to writing them all out (rightly so!), then you may be better served by implementing some of __getattr__, __getattribute__, and __setattr__. These allow you to redirect attribute reading/writing to the same code each time (with the name of the attribute as a parameter), rather than to two functions for each attribute (getting/setting).
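A minimal sketch of that shared-code approach (the underscore-prefixed storage convention here mirrors the question; the class name is made up):

class Underscored(object):
    """Redirects every public attribute to an underscore-prefixed slot,
    running the same hook code for all of them."""
    def __getattr__(self, name):
        # Called only when normal lookup fails, i.e. for the public names.
        if name.startswith('_'):
            raise AttributeError(name)
        # Shared "getter" code for every attribute goes here.
        return getattr(self, '_' + name)

    def __setattr__(self, name, value):
        # Shared "setter" code for every attribute goes here.
        if name.startswith('_'):
            super(Underscored, self).__setattr__(name, value)
        else:
            super(Underscored, self).__setattr__('_' + name, value)

obj = Underscored()
obj.prop1 = 'some default'
print(obj.prop1)    # 'some default', actually stored as obj._prop1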
It seems like the easiest way to go about this is to just implement __getattr__ and __setattr__ such that they will access any key in your parsed JSON dict, which you should set as an instance member. Alternatively, you could call update() on self.__dict__ with your parsed JSON, but that's not really the best way to go about things, as it means your input dict could potentially trample members of your instance.
As to your setters and getters, you should only be creating them if they actually do something special other than directly set or retrieve the value in question. Python isn't Java (or C++ or anything else), you shouldn't try to mimic the private/set/get paradigm that is common in those languages.
I simply keep the dict on the instance and get/set my properties there.
class test(object):
    def __init__(self, **kwargs):
        self.kwargs = kwargs
        #self.value = 20   # assigning from __init__ is possible

    @property
    def value(self):
        if self.kwargs.get('value') is None:
            self.kwargs.update(value=0)  # default
        return self.kwargs.get('value')

    @value.setter
    def value(self, v):
        print(v)  # do something with v
        self.kwargs.update(value=v)

x = test()
print(x.value)
x.value = 10
x.value = 5
Output
0
10
5
class Foo(object):
    pass

foo = Foo()

def bar(self):
    print 'bar'

Foo.bar = bar
foo.bar()  # bar
Coming from JavaScript: if a "class" prototype is augmented with a certain attribute, it is known that all instances of that "class" will have that attribute in their prototype chain, hence no modifications have to be made to any of its instances or "sub-classes".
In that sense, how can a class-based language like Python achieve monkey patching?
The real question is, how can it not? In Python, classes are first-class objects in their own right. Attribute access on instances of a class is resolved by looking up attributes on the instance, and then the class, and then the parent classes (in the method resolution order.) These lookups are all done at runtime (as is everything in Python.) If you add an attribute to a class after you create an instance, the instance will still "see" the new attribute, simply because nothing prevents it.
In other words, it works because Python doesn't cache attributes (unless your code does), because it doesn't use negative caching or shadowclasses or any of the optimization techniques that would inhibit it (or, when Python implementations do, they take into account the class might change) and because everything is runtime.
I just read through a bunch of documentation, and as far as I can tell, the whole story of how foo.bar is resolved, is as follows:
Can we find foo.__getattribute__ by the following process? If so, use the result of foo.__getattribute__('bar').
(Looking up __getattribute__ will not cause infinite recursion, but the implementation of it might.)
(In reality, we will always find __getattribute__ in new-style objects, as a default implementation is provided in object - but that implementation is of the following process. ;) )
(If we define a __getattribute__ method in Foo, and access foo.__getattribute__, foo.__getattribute__('__getattribute__') will be called! But this does not imply infinite recursion - if you are careful ;) )
Is bar a "special" name for an attribute provided by the Python runtime (e.g. __dict__, __class__, __bases__, __mro__)? If so, use that. (As far as I can tell, __getattribute__ falls into this category, which avoids infinite recursion.)
Is bar in the foo.__dict__ dict? If so, use foo.__dict__['bar'].
Does foo.__mro__ exist (i.e., is foo actually a class)? If so,
For each base-class base in foo.__mro__[1:]:
(We skip the first entry because it is foo itself, whose __dict__ we already searched.)
Is bar in base.__dict__? If so:
Let x be base.__dict__['bar'].
Can we find (again, recursively, but it won't cause a problem) x.__get__?
If so, use x.__get__(foo, foo.__class__).
(Note that the function bar is, itself, an object, and the Python compiler automatically gives functions a __get__ attribute which is designed to be used this way.)
Otherwise, use x.
For each base-class base of foo.__class__.__mro__:
(Note that this recursion is not a problem: those attributes should always exist, and fall into the "provided by the Python runtime" case. foo.__class__.__mro__[0] will always be foo.__class__, i.e. Foo in our example.)
(Note that we do this even if foo.__mro__ exists. This is because classes have a class, too: its name is type, and it provides, among other things, the method used to calculate __mro__ attributes in the first place.)
Is bar in base.__dict__? If so:
Let x be base.__dict__['bar'].
Can we find (again, recursively, but it won't cause a problem) x.__get__?
If so, use x.__get__(foo, foo.__class__).
(Note that the function bar is, itself, an object, and the Python compiler automatically gives functions a __get__ attribute which is designed to be used this way.)
Otherwise, use x.
If we still haven't found something to use: can we find foo.__getattr__ by the preceding process? If so, use the result of foo.__getattr__('bar').
If everything failed, raise AttributeError.
bar.__get__ is not really a function - it's a "method-wrapper" - but you can imagine it being implemented vaguely like this:
# Somewhere in the Python internals
class __method_wrapper(object):
    def __init__(self, func):
        self.func = func
    def __call__(self, obj, cls):
        return lambda *args, **kwargs: self.func(obj, *args, **kwargs)
        # Except it actually returns a "bound method" object
        # that uses cls for its __repr__,
        # and there is a __repr__ for the method_wrapper that I *think*
        # uses the hashcode of the underlying function, rather than of itself,
        # but I'm not sure.

# Automatically done after compiling bar
bar.__get__ = __method_wrapper(bar)
The "binding" that happens within the __get__ automatically attached to bar (called a descriptor), by the way, is more or less the reason why you have to specify self parameters explicitly for Python methods. In Javascript, this itself is magical; in Python, it is merely the process of binding things to self that is magical. ;)
And yes, you can explicitly set a __get__ method on your own objects and have it do special things when you set a class attribute to an instance of the object and then access it from an instance of that other class. Python is extremely reflective. :) But if you want to learn how to do that, and get a really full understanding of the situation, you have a lot of reading to do. ;)
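As a small taste of that last point, here is a toy descriptor with an explicit __get__ (all names invented for illustration); when an instance of it is used as a class attribute, attribute access on instances of the other class runs __get__, just like functions becoming bound methods:

class Doubler(object):
    def __init__(self, number):
        self.number = number

    def __get__(self, obj, objtype=None):
        # obj is the instance the attribute was looked up on
        # (or None when it was looked up on the class itself).
        return self.number * 2

class Holder(object):
    n = Doubler(21)

h = Holder()
print(h.n)        # 42 -- __get__ ran instead of returning the Doubler
print(Holder.n)   # 42 -- also called for class access, with obj=None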