Which descriptor implementation is correct?

The following functions are listed for the Descriptor Protocol in this section of the official Python documentation.
descr.__get__(self, obj, type=None) -> value
descr.__set__(self, obj, value) -> None
descr.__delete__(self, obj) -> None
And there is an example like this there:
import logging

logging.basicConfig(level=logging.INFO)

class LoggedAgeAccess:

    def __get__(self, obj, objtype=None):
        value = obj._age
        logging.info('Accessing %r giving %r', 'age', value)
        return value

    def __set__(self, obj, value):
        logging.info('Updating %r to %r', 'age', value)
        obj._age = value

class Person:

    age = LoggedAgeAccess()             # Descriptor instance

    def __init__(self, name, age):
        self.name = name                # Regular instance attribute
        self.age = age                  # Calls __set__()

    def birthday(self):
        self.age += 1                   # Calls both __get__() and __set__()
However, I can implement this example using the signatures from the Descriptor definition given in another official Python document:
import logging

logging.basicConfig(level=logging.INFO)

class LoggedAgeAccess:

    def __get__(self, instance, owner=None):
        value = instance._age
        logging.info('Accessing %r giving %r', 'age', value)
        return value

    def __set__(self, instance, value):
        logging.info('Updating %r to %r', 'age', value)
        instance._age = value

class Person:

    age = LoggedAgeAccess()             # Descriptor instance

    def __init__(self, name, age):
        self.name = name                # Regular instance attribute
        self.age = age                  # Calls __set__()

    def birthday(self):
        self.age += 1                   # Calls both __get__() and __set__()
I couldn't figure out what was different about these documents. Which document should I refer to?

Both documents are correct.
The signatures are the same. The parameter names differ - but that is all. Keep in mind that Python is a dynamically typed language, and parameter names say nothing about the types those parameters should receive. In your examples, obj, objtype, instance and owner are names as arbitrary as foo and bar.
Usually, code that creates descriptors will use the instance and owner nomenclature, but that is entirely up to the author and makes no difference when running the code.
It could only make a difference if these parameters were ever passed as named (keyword) arguments instead of positional arguments - but the language does not do that: descriptor methods are always called with positional arguments.
Otherwise, one of the documents would simply be incorrect, the example would not work, and a bug against the docs should be filed. But as you noticed, both work perfectly fine.
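As a quick sketch of that point (my own illustration, not from either document - the class names are arbitrary): you can give the parameters any names you like and the descriptor still works, because the interpreter always passes the instance and the owner class positionally.

class Shouty:
    def __get__(self, foo, bar=None):   # deliberately arbitrary parameter names
        # foo receives the instance (or None for class access), bar receives the owner class
        return 'accessed via %r' % bar.__name__

class Holder:
    attr = Shouty()

print(Holder().attr)   # accessed via 'Holder'
print(Holder.attr)     # also works: foo is None, bar is Holder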

Related

How do I use same getter and setter properties and functions for different attributes of a class the pythonic way?

I've got this class that I'm working on that stores employees' details.
I want all attributes to be protected and to be set and retrieved with specific logic, but not each in a unique way. I would like the same logic to apply to my _f_name and _l_name attributes, and perhaps the same logic to be applied to attributes that take booleans and other general cases.
I've got this for the first attribute:
@property
def f_name(self):
    return self._f_name

@f_name.setter
def f_name(self, f_name):
    if f_name != str(f_name):
        raise TypeError("Name must be set to a string")
    else:
        self._f_name = self._clean_up_string(f_name)

@f_name.deleter
def f_name(self):
    raise AttributeError("Can't delete, you can only change this value.")
How can I apply the same functions and properties to other attributes?
Thaaaanks!
While it may seem like defining a subclass of property is possible, too many details of how a particular property works are left to the getter and setter to define, so it's more straightforward to define a custom property-like descriptor.
class CleanableStringProperty:
    def __set_name__(self, owner, name):
        self._private_name = "_" + name
        self.name = name

    def __get__(self, obj, objtype=None):
        # Boilerplate to handle accessing the property
        # via a class, rather than an instance of the class.
        if obj is None:
            return self
        return getattr(obj, self._private_name)

    def __set__(self, obj, value):
        if not isinstance(value, str):
            raise TypeError(f'{self.name} value must be a str')
        setattr(obj, self._private_name, obj._clean_up_string(value))

    def __delete__(self, obj):
        raise AttributeError("Can't delete, you can only change this value.")
__set_name__ constructs the name of the instance attribute that the getter and setter will use. __get__ acts as the getter, using getattr to retrieve the constructed attribute name from the given object. __set__ validates and modifies the value before using setattr to set the constructed attribute name. __delete__ simply raises an AttributeError, independent of whatever object the caller is trying to remove the attribute from.
Here's a simple demonstration which causes all values assigned to the descriptor to be put into title case.
class Foo:
    f_name = CleanableStringProperty()
    l_name = CleanableStringProperty()

    def __init__(self, first, last):
        self.f_name = first
        self.l_name = last

    def _clean_up_string(self, v):
        return v.title()

f = Foo("john", "doe")
assert f.f_name == "John"
assert f.l_name == "Doe"

try:
    del f.f_name
except AttributeError:
    print("Prevented first name from being deleted")
It would also be possible for the cleaning function, rather than being something that obj is expected to provide, to be passed as an argument to CleanableStringProperty itself. __init__ and __set__ would be modified as
def __init__(self, cleaner):
    self.cleaner = cleaner

def __set__(self, obj, value):
    if not isinstance(value, str):
        raise TypeError(f'{self.name} value must be a str')
    setattr(obj, self._private_name, self.cleaner(value))
and the descriptor would be initialized with
class Foo:
    fname = CleanableStringProperty(str.title)
Note that Foo is no longer responsible for providing a cleaning method.
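For illustration, a quick usage sketch of this variant (assuming the rest of CleanableStringProperty is unchanged and Foo keeps an __init__ like the one above):

class Foo:
    f_name = CleanableStringProperty(str.title)
    l_name = CleanableStringProperty(str.title)

    def __init__(self, first, last):
        self.f_name = first
        self.l_name = last

f = Foo("john", "doe")
assert f.f_name == "John"   # cleaning is done by str.title, not by a Foo method
assert f.l_name == "Doe"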
A property is just an implementation of a descriptor, so to create a custom property, you need an object with a __get__, __set__, and/or __delete__ method.
In your case, you could do something like this:
from typing import Any, Callable, Tuple

class ValidatedProperty:
    def __set_name__(self, obj, name):
        self.name = name
        self.storage = f"_{name}"

    def __init__(self, validation: Callable[[Any], Tuple[str, Any]] = None):
        """Initializes a ValidatedProperty object

        Args:
            validation (Callable[[Any], Tuple[str, Any]], optional): A Callable that takes the given value and returns an error string (empty string if no error) and the cleaned-up value. Defaults to None.
        """
        self.validation = validation

    def __get__(self, instance, owner):
        return getattr(instance, self.storage)

    def __set__(self, instance, value):
        if self.validation:
            error, value = self.validation(value)
            if error:
                raise ValueError(f"Error setting property {self.name}: {error}")
        setattr(instance, self.storage, value)

    def __delete__(self, instance):
        raise AttributeError("Can't delete, you can only change this value.")
Let's define an example class to use this:
class User:
    def __name_validation(value):
        if not isinstance(value, str):
            return (f"Expected string value, received {type(value).__name__}", None)
        return ("", value.strip().title())

    f_name = ValidatedProperty(validation=__name_validation)
    l_name = ValidatedProperty(validation=__name_validation)

    def __init__(self, fname, lname):
        self.f_name = fname
        self.l_name = lname
and test:
u = User("Test", "User")
print(repr(u.f_name)) # 'Test'
u.f_name = 123 # ValueError: Error setting property f_name: Expected string value, received int
u.f_name = "robinson " # Notice the trailing space
print(repr(u.f_name)) # 'Robinson'
u.l_name = "crusoe "
print(repr(u.l_name)) # 'Crusoe'

Python - Descriptor - Broken class

I am following the tutorial from the docs and an example from Fluent Python. In the book they show how to avoid the AttributeError in __get__ (e.g., when you do z = Testing.x), and I wanted to do something similar with the __set__ method. But it seems like it leads to a broken class with no error.
To be more specific about the issue:
With the line Testing.x = 1 commented out, the __set__ methods are invoked.
With the line Testing.x = 1 uncommented, the __set__ methods are not invoked.
Can someone teach me why it behaves this way?
import abc

class Descriptor:
    def __init__(self):
        cls = self.__class__
        self.storage_name = cls.__name__

    def __get__(self, instance, owner):
        if instance is None:
            return self
        else:
            return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        print(instance, self.storage_name)
        setattr(instance, self.storage_name, value)

class Validator(Descriptor):
    def __set__(self, instance, value):
        value = self.validate(instance, value)
        super().__set__(instance, value)

    @abc.abstractmethod
    def validate(self, instance, value):
        """return validated value or raise ValueError"""

class NonNegative(Validator):
    def validate(self, instance, value):
        if value <= 0:
            raise ValueError(f'{value!r} must be > 0')
        return value

class Testing:
    x = NonNegative()

    def __init__(self, number):
        self.x = number

# Testing.x = 1
t = Testing(1)
t.x = 1
Attribute access is generally handled by object.__getattribute__ and type.__getattribute__ (for instances of type, i.e. classes). When an attribute lookup of the form a.x involves a descriptor as x, then various binding rules come into effect, based on what x is:
Instance binding: If binding to an object instance, a.x is transformed into the call: type(a).__dict__['x'].__get__(a, type(a)).
Class binding: If binding to a class, A.x is transformed into the call: A.__dict__['x'].__get__(None, A).
Super binding: [...]
For the scope of this question, only the class binding case is relevant. Here, Testing.x invokes the descriptor via __get__(None, Testing). Now one might ask why this is done instead of simply returning the descriptor object itself (as if it were any other object, say an int). This behavior is useful for implementing the classmethod decorator. The descriptor HowTo guide provides an example implementation:
class ClassMethod:
    def __init__(self, f):
        self.f = f

    def __get__(self, obj, cls=None):
        print(f'{obj = }, {cls = }')
        return self.f.__get__(cls, cls)  # simplified version

class Test:
    @ClassMethod
    def func(cls, x):
        pass

Test().func(2)  # call from instance
Test.func(1)    # this requires binding without any instance
We can observe that for the second case Test.func(1) there is no instance involved, but the ClassMethod descriptor can still bind to the cls.
Given that __get__ is used for both instance and class binding, one might ask why this isn't the case for __set__. Specifically, for x.y = z, if y is a data descriptor, why doesn't it invoke y.__set__(None, z)? I guess the reason is that there is no good use case for it and it would unnecessarily complicate the descriptor API. What would the descriptor do with that information anyway? Typically, managing how attributes are set is done by the class (or metaclass for types), via object.__setattr__ or type.__setattr__.
So to prevent Testing.x from being replaced by a user, you could use a custom metaclass:
class ProtectDataDescriptors(type):
    def __setattr__(self, name, value):
        if hasattr(getattr(self, name, None), '__set__'):
            raise AttributeError(f'Cannot override data descriptor {name!r}')
        super().__setattr__(name, value)

class Testing(metaclass=ProtectDataDescriptors):
    x = NonNegative()

    def __init__(self, number):
        self.x = number

Testing.x = 1  # now this raises AttributeError
However, this is not an absolute guarantee as users can still use type.__setattr__ directly to override that attribute:
type.__setattr__(Testing, 'x', 1) # this will bypass ProtectDataDescriptors.__setattr__
The line
Testing.x = 1
replaces the descriptor you've set as a class attribute for Testing with an integer.
Since the descriptor is no more, self.x = ... or t.x = ... is just an assignment that doesn't involve a descriptor.
As an aside, surely you've noticed there is no true x attribute anymore with your descriptor, and you can't use more than one instance of the same descriptor without conflicts?
class Testing:
    x = NonNegative()
    y = NonNegative()

    def __init__(self, number):
        self.x = number

t = Testing(2345)
t.x = 1234
t.y = 5678
print(vars(t))
prints out
{'NonNegative': 5678}
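One way to avoid that conflict (a condensed sketch of my own, not part of the original code) is to let each descriptor learn its attribute name via __set_name__ and store the value under a per-field private name:

class NonNegative:
    def __set_name__(self, owner, name):
        # called once per field, so each descriptor instance gets its own storage slot
        self.storage_name = '_' + name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return getattr(instance, self.storage_name)

    def __set__(self, instance, value):
        if value <= 0:
            raise ValueError(f'{value!r} must be > 0')
        setattr(instance, self.storage_name, value)

class Testing:
    x = NonNegative()
    y = NonNegative()

    def __init__(self, number):
        self.x = number

t = Testing(2345)
t.y = 5678
print(vars(t))  # {'_x': 2345, '_y': 5678}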

How to dynamically change the signature of a function without modifying the AST?

I'm writing an equivalent of Python's @dataclass and I'm trying to monkeypatch the constructor of decorated classes. My decorator is currently as follows:
def decorator(decorated_class):
    def __init__(self, **kwargs):
        ...  # do stuff
    decorated_class.__init__ = __init__
    return decorated_class
What I want to be able to do is dynamically set the arguments of the newly created __init__ method based on the class attributes. For example say we have a class Person:
@decorator
class Person:
    name: str
    age: int
My desired behaviour would be to dynamically generate the following constructor and monkey-patch it, as I'm currently already doing in the decorator:
def __init__(self, name, age):
    self.name = name
    self.age = age
This behaviour is probably achievable by messing around with the AST at runtime but it obviously comes at a performance price which I would like to avoid. Is there any way of doing this by (for example) interacting directly with the function object?
Note: the solution provided by this Stack Overflow post is poorly described and doesn't run.
You can use setattr and *args/**kwargs to do it without messing with the AST. From your explanation, I assume you don't care about the original __init__ function.
The following code generates a __init__ function tailored for the class members:
def decorator(cls):
    # Find out the attributes. Standard class attributes have priority over annotations
    attributes = [name for name in cls.__dict__ if not name.startswith('_')]
    if '__annotations__' in cls.__dict__:
        for attr_name in cls.__dict__['__annotations__']:
            if attr_name not in attributes:
                attributes.append(attr_name)

    def __init__(self, *args, **kwargs):
        for attr_name, value in zip(attributes, args):
            setattr(self, attr_name, value)
        for tentative_name, value in kwargs.items():
            if tentative_name in attributes:
                setattr(self, tentative_name, value)

    cls.__init__ = __init__
    return cls
It basically searches for entries in __dict__ and __annotations__ that do not start with a _ and allows them to be passed to the constructor. I excluded entries starting with a single _ because they typically wouldn't be set from the constructor. Here are some tests:
@decorator
class Person:
    name: str
    age: int

john = Person('John', 32)
assert john.name == 'John'
assert john.age == 32

mary = Person(name='Mary', age=18)
assert mary.name == 'Mary'
assert mary.age == 18
This also works for classes with default attributes:
@decorator
class OtherPerson:
    name = 'John Doe'
    age = 21

mustafa = OtherPerson('Mustafa', 15)
assert mustafa.name == 'Mustafa'
assert mustafa.age == 15

yuko = OtherPerson(name='Yuko', age=71)
assert yuko.name == 'Yuko'
assert yuko.age == 71
I think it is possible to wrap the original __init__ but I don't believe there is anything very intelligent that can be done about the original arguments.
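If you also want the generated __init__ to advertise the per-class parameters (so inspect.signature and help show name and age instead of *args/**kwargs), one option is to attach an inspect.Signature to it. This is a sketch of my own built on top of the decorator above, not part of the original answer:

import inspect

def decorator(cls):
    attributes = [name for name in cls.__dict__ if not name.startswith('_')]
    for attr_name in cls.__dict__.get('__annotations__', {}):
        if attr_name not in attributes:
            attributes.append(attr_name)

    def __init__(self, *args, **kwargs):
        # bind() enforces the advertised signature, raising TypeError on bad calls
        bound = __init__.__signature__.bind(self, *args, **kwargs)
        for name, value in bound.arguments.items():
            if name != 'self':
                setattr(self, name, value)

    params = [inspect.Parameter('self', inspect.Parameter.POSITIONAL_OR_KEYWORD)]
    params += [inspect.Parameter(name, inspect.Parameter.POSITIONAL_OR_KEYWORD)
               for name in attributes]
    __init__.__signature__ = inspect.Signature(params)
    cls.__init__ = __init__
    return cls

@decorator
class Person:
    name: str
    age: int

print(inspect.signature(Person.__init__))  # (self, name, age)
p = Person('John', age=32)
assert p.name == 'John' and p.age == 32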

Python type checking system

I am trying to make a custom type system in Python. The following is the code.
from inspect import Signature, Parameter

class Descriptor():
    def __init__(self, name=None):
        self.name = name

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value

    def __get__(self, instance, cls):
        return instance.__dict__[self.name]

class Typed(Descriptor):
    ty = object

    def __set__(self, instance, value):
        if not isinstance(value, self.ty):
            raise TypeError('Expected %s' % self.ty)
        super().__set__(instance, value)

class Integer(Typed):
    ty = int

class Float(Typed):
    ty = float

class String(Typed):
    ty = str

class Positive(Descriptor):
    def __set__(self, instance, value):
        if value < 0:
            raise ValueError('Expected >= 0')
        super().__set__(instance, value)

class PosInteger(Integer, Positive):
    pass

class Sized(Descriptor):
    def __init__(self, *args, maxlen, **kwargs):
        self.maxlen = maxlen
        super().__init__(*args, **kwargs)

    def __set__(self, instance, value):
        if len(value) > self.maxlen:
            raise ValueError('TooBig')
        super().__set__(instance, value)

class SizedString(String, Sized):
    pass

def make_signature(names):
    return Signature([Parameter(name, Parameter.POSITIONAL_OR_KEYWORD) for name in names])

class StructMeta(type):
    def __new__(cls, name, bases, clsdict):
        fields = [key for key, value in clsdict.items() if isinstance(value, Descriptor)]
        for name in fields:
            # print(type(clsdict[name]))
            clsdict[name].name = name
        clsobj = super().__new__(cls, name, bases, clsdict)
        sig = make_signature(fields)
        setattr(clsobj, '__signature__', sig)
        return clsobj

class Structure(metaclass=StructMeta):
    def __init__(self, *args, **kwargs):
        bound = self.__signature__.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            setattr(self, name, value)
Using the above type system, I got rid of all the boilerplate and duplicate code that I would otherwise have to write in classes (mostly inside __init__) for checking types, validating values, etc.
By using the code above, my classes would look as simple as this
class Stock(Structure):
    name = SizedString(maxlen=9)
    shares = PosInteger()
    price = Float()

stock = Stock('AMZN', 100, 1600.0)
Up to here things work fine. Now I want to extend this type-checking functionality and create classes holding objects of other classes. For example, price is now no longer a Float but of type Price (i.e. another class, Price).
class Price(Structure):
    currency = SizedString(maxlen=3)
    value = Float()

class Stock(Structure):
    name = SizedString(maxlen=9)
    shares = PosInteger()
    price = Price()  # This won't work.
This won't work because the line "price = Price()" calls the constructor of Price, which expects currency and value to be passed, since Price is a Structure and not a Descriptor. It throws "TypeError: missing a required argument: 'currency'".
But I want it to work and to look like the above, because at the end of the day Price is also a type, just like PosInteger, yet at the same time it has to be a Structure too. That is, Price should inherit from Structure but also act as a descriptor.
I can make it work by defining another class, say "PriceType":
class Price(Structure):
    currency = SizedString(maxlen=3)
    value = Float()

class PriceType(Typed):
    ty = Price

class Stock(Structure):
    name = SizedString(maxlen=9)
    shares = PosInteger()
    price = PriceType()

stock = Stock('AMZN', 100, Price('INR', 2400.0))
But this looks a bit weird - Price and PriceType as two different classes. Can someone help me understand whether I can avoid creating the PriceType class?
I am also losing the ability to provide default values for fields.
For example, how can I give the shares field in Stock a default value of 0, or the currency field in Price a default of 'USD'? i.e. something like below.
class Stock:
    def __init__(self, name, price, shares=0): ...

class Price:
    def __init__(self, value, currency='USD'): ...
A quick thing to do there is to have a simple function that will build the "PriceType" (and equivalents) when you declare the fields.
Since uniqueness of the descriptor classes themselves is not needed, and the relatively long time a class takes to be created is not an issue (fields in a class body are only created at program-load time), you should be fine with:
def typefield(cls, *args, extra_checkers=(), **kwargs):
    descriptor_class = type(
        cls.__name__,
        (Typed,) + extra_checkers,
        {'ty': cls}
    )
    return descriptor_class(*args, **kwargs)
And now, code like this should just work:
class Stock(Structure):
    name = SizedString(maxlen=9)
    shares = PosInteger()
    price = typefield(Price, "price")
(Also note that Python 3.6+ has the __set_name__ method incorporated into the descriptor protocol - if you use it, you won't need to pass the field name as a parameter to the default descriptor __init__ and type field names twice.)
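For reference, a minimal sketch of what that would look like (my own illustration, not part of the answer): the base descriptor picks up its own name when the owner class body is executed, so neither the metaclass nor the caller has to supply it.

class Descriptor:
    def __set_name__(self, owner, name):
        # invoked automatically when the owner class is created
        self.name = name

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value

    def __get__(self, instance, cls=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

class Integer(Descriptor):
    def __set__(self, instance, value):
        if not isinstance(value, int):
            raise TypeError('Expected int')
        super().__set__(instance, value)

class Point:
    x = Integer()   # no name argument and no metaclass bookkeeping needed
    y = Integer()

p = Point()
p.x = 3
p.y = 4
print(vars(p))  # {'x': 3, 'y': 4}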
update
In your comment, you seem to imply that you want your Structure classes to work themselves as descriptors - that would not work well: a descriptor instance lives on the owner class and its __get__ and __set__ are shared by all instances, whereas you want the fields to be populated with actual instances of your structures.
What can be done is to move the typefield function above to a classmethod on Structure, have it annotate the default parameters you want, and create a new intermediate descriptor class for this kind of field that will automatically create an instance with the default values when it is read. Also, ty can simply be an instance attribute on the descriptor, so there is no need to create dynamic classes for the fields:
class StructField(Typed):
    def __init__(self, *args, ty=None, def_args=(), def_kw=None, **kw):
        self.def_args = def_args
        self.def_kw = def_kw or {}
        self.ty = ty
        super().__init__(*args, **kw)

    def __get__(self, instance, owner):
        if self.name not in instance.__dict__:
            instance.__dict__[self.name] = self.ty(*self.def_args, **self.def_kw)
        return super().__get__(instance, owner)

...

class Structure(metaclass=StructMeta):
    ...

    @classmethod
    def field(cls, *args, **kw):
        # Change the signature if you want extra parameters
        # for the field, like extra validators, and such
        return StructField(ty=cls, def_args=args, def_kw=kw)

...

class Stock(Structure):
    ...
    price = Price.field("USD", 20.00)

How to raise exception for non-existent class member?

I have a method for automatically creating Python classes that wrap database tables, with class members that have the same name as the fields in the table. The class files look like this:
class CpsaUpldBuildChrgResultSet(Recordset):
    def __init__(self, connection):
        super().__init__(connection)
        self.DefaultTableName = 'cpsa_upld_build_chrg_result'
        self._keyFields.append('j_trans_seq')
        self._keyFields.append('j_index')

    @property
    def j_trans_seq(self):
        return self.GetValue('j_trans_seq')

    @j_trans_seq.setter
    def j_trans_seq(self, value):
        self.SetKeyValue('j_trans_seq', value)

    @property
    def j_index(self):
        return self.GetValue('j_index')

    @j_index.setter
    def j_index(self, value):
        self.SetKeyValue('j_index', value)
I just found that if I try to set a value for a non-existent class member, such as J_TRANS_SEQ, no exception is thrown. Is there something I can add to this class so that an attempt to access a non-existent member would raise an exception?
You can add a __setattr__ method to your class that raises an AttributeError whenever an invalid attribute is assigned to. I'm not sure exactly how you'd want to determine which attributes are valid and which are not, but one approach might be something like this:
def __setattr__(self, name, value):
    if hasattr(self, name):
        super().__setattr__(name, value)
    else:
        raise AttributeError("{} object has no attribute {!r}".format(type(self), name))
This assumes that any attribute that can be looked up is also valid to be assigned to. It might break if your property's getters don't work unless the setter is called before the getter. It might also be too permissive, since it would allow setting of instance attributes that override class attributes (such as __init__). Another approach might be to check the name against a white-list of known attributes (but be sure to include the attributes that you need for the inherited class machinery, like DefaultTableName and _keyFields).
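As a rough sketch of that whitelist variant (the _allowed set and its contents are assumptions for illustration, and Recordset is the base class from the question):

class CpsaUpldBuildChrgResultSet(Recordset):
    # attributes that may legitimately be assigned on instances of this class
    _allowed = {'DefaultTableName', '_keyFields', 'j_trans_seq', 'j_index'}

    def __setattr__(self, name, value):
        if name in self._allowed:
            super().__setattr__(name, value)   # property setters still intercept these
        else:
            raise AttributeError(
                "{} object has no attribute {!r}".format(type(self).__name__, name))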
I think @Blckknght has the right idea, but left out some important details in his answer, such as how class attributes (class members) are set the first time, when they don't preexist, as in the typical scenario when the class's __init__() method executes. Here's a more fully fleshed-out answer that works in Python 3 and addresses that deficiency.
It also shows how to minimize the coding of a bunch of repetitive properties.
class Recordset(object):
    def __init__(self, connection):
        print('Recordset.__init__({!r}) called'.format(connection))

    def SetKeyValue(self, name, value):
        print('SetKeyValue({!r}, {!r}) called'.format(name, value))

    def GetValue(self, name):
        print('GetValue({!r}) called'.format(name))

def fieldname_property(name):
    storage_name = '_' + name

    @property
    def prop(self):
        return self.GetValue(storage_name)

    @prop.setter
    def prop(self, value):
        self.SetKeyValue(storage_name, value)

    return prop

class CpsaUpldBuildChrgResultSet(Recordset):
    # define properties for valid fieldnames
    j_trans_seq = fieldname_property('j_trans_seq')
    j_index = fieldname_property('j_index')

    def __init__(self, connection):
        super().__init__(connection)
        self._setter('DefaultTableName', 'cpsa_upld_build_chrg_result')

    def __setattr__(self, name, value):
        if hasattr(self, name):
            self._setter(name, value)
        else:
            raise AttributeError("No field named %r" % name)

    def _setter(self, name, value):
        """Provides way to intentionally bypass overloaded __setattr__."""
        super().__setattr__(name, value)

print('start')
db_table = CpsaUpldBuildChrgResultSet('SomeConnection')
print('assigning attributes...')
db_table.j_trans_seq = 42  # OK
db_table.j_index = 13      # OK
db_table.J_TRANS_SEQ = 99  # -> AttributeError: No field named 'J_TRANS_SEQ'
print('done')
