Where does a Python descriptor's state go?

Background
I'm trying to figure out Python's descriptors by reading Lutz's Learning Python's section on the topic in which he says: "Like properties, descriptors are designed to handle specific attributes... Unlike properties, descriptors have their own state..."
Throughout the chapter he shows examples in which the managed attribute is actually stashed on the containing/wrapping object, as in:
def __set__(self, instance, value):
    instance._name = value.lower()
I understand these examples and they seem to be common in write-ups on the topic. That said, their benefit over properties isn't obvious to me, and they seem to fall short of the internal state promised in the above quote.
At the end of the chapter he shows an example that is closer to what I pictured after reading "have their own state", as in:
def __set__(self, instance, value):
    self.name = value.lower()
The example runs but does not do what I'd expect it to do. As the example is a bit long I've put it on Pastebin and added a last line that shows the unexpected behavior (Bob's name is now Sue). Here's a shorter demo snippet:
class Wrapper(object):
    class ExampleDescriptor(object):
        def __get__(self, instance, owner):
            print "get %s" % self.state
            return self.state

        def __set__(self, instance, value):
            print "set %s" % value
            self.state = value

    ex = ExampleDescriptor()

w1 = Wrapper()
w1.ex = 1
print w1.ex
w2 = Wrapper()
print w2.ex
w2.ex = 2
print w1.ex
print w1.ex is w2.ex
The output of which is:
set 1
get 1
1
get 1
1
set 2
get 2
2
get 2
get 2
True
None of this execution comes as a surprise after looking at the code carefully. The validation logic in the descriptor is making a de facto singleton out of this attribute on the wrapper class; however, it's hard to imagine this shared state was Lutz's intention, or the intention in this widely linked tutorial on the topic.
Question
Is it possible to make a descriptor that has internal state that is unique to the wrapping object without stashing that state on the wrapping object instances (as in the first snippet)? Is it possible to modify the CardHolder class from the linked example such that Bob does not end up as Sue?

"Like properties, descriptors are designed to handle specific attributes... Unlike properties, descriptors have their own state..."
I am not sure what point Lutz is trying to make, as properties are, in fact, descriptors themselves.
But even though descriptors do have their own state, it's not widely useful because, as you have discovered, you only get one descriptor object per class attribute rather than one per instance. This is why the instance is passed in: so that instance-unique values can be saved and accessed.
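For contrast, here is a minimal sketch of that usual pattern (the class names are made up, not from the book): the descriptor itself keeps no per-instance data and routes values through the instance that is passed in.
import __main__  # no imports actually needed; sketch only

class LowerName(object):
    def __get__(self, instance, owner):
        if instance is None:
            return self
        return instance._name            # per-instance value lives on the instance

    def __set__(self, instance, value):
        instance._name = value.lower()   # each instance gets its own _name

class Person(object):
    name = LowerName()

p1, p2 = Person(), Person()
p1.name = "BOB"
p2.name = "SUE"
print p1.name, p2.name   # bob sue -- no shared state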
To prove the point that it is one descriptor object per attribute, you can try this slightly modified code from one of your links:
class RevealAccess(object):
    """A data descriptor that sets and returns values
    normally and prints a message logging their access.
    """
    def __init__(self, initval=None, name='var'):
        self.val = initval
        self.name = name

    def __get__(self, obj, objtype):
        print 'Retrieving', self.name
        return self.val

    def __set__(self, obj, val):
        print 'Updating', self.name
        self.val = val

class MyClass(object):
    x = RevealAccess(10, 'var "x"')
    y = RevealAccess(5, 'var "y"')

m = MyClass()
m.x
m.x = 20
m.x
m.y
What you should see:
Retrieving var "x"
Updating var "x"
Retrieving var "x"
Retrieving var "y"
To answer your question: Yes. But it's a pain.
class Stored(object):
    """A data descriptor that stores instance values in itself."""
    instances = dict()

    def __init__(self, val):
        self.instances[self, None] = val

    def __get__(self, obj, objtype):
        return self.instances[self, obj]

    def __set__(self, obj, val):
        self.instances[self, obj] = val

class MyClass(object):
    x = Stored(3)
    y = Stored(9)

print(MyClass.x)
print(MyClass.y)
m = MyClass()
m.x = 42
print(m.x)
m.y = 19
print(m.y)
print(m.x)

As you've stated already, a descriptor instance lives at the class level, so its state is shared across every instance of the class.
The descriptor could store an internal hash of instances it's wrapping. To avoid circular references it'd be smarter to have the key be the id of the instance. The only reason I'd see to do this is if the descriptor's purpose is to aggregate these properties from different instances.
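A rough sketch of that id-keyed approach (the names are illustrative; note that an id can be recycled once an instance is garbage collected, which is what the weak-reference answer below addresses):
class IdStored(object):
    def __init__(self, default=None):
        self.data = {}                   # internal hash of per-instance values
        self.default = default

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self.data.get(id(instance), self.default)

    def __set__(self, instance, value):
        self.data[id(instance)] = value  # keying by id() avoids a circular reference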
As for the second part of your question, just do as you stated already and store the underlying state on the instance instead of the descriptor and then Bob will not be Sue.

(A complement to other answers)
To attach state to instances you do not control without disturbing them, simply use a weak container; weakref.WeakKeyDictionary is appropriate here. The garbage collector will make sure that the descriptor's extra state doesn't linger after the instances are collected, and that the descriptor doesn't cause the instances to live longer than they normally would.
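A minimal sketch of what that could look like (the class name is made up; it assumes the managed instances are hashable and weak-referenceable, which plain classes are):
import weakref

class WeakStored(object):
    def __init__(self, default=None):
        self.data = weakref.WeakKeyDictionary()
        self.default = default

    def __get__(self, instance, owner):
        if instance is None:
            return self
        return self.data.get(instance, self.default)

    def __set__(self, instance, value):
        self.data[instance] = value   # the entry vanishes when the instance is collected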

Related

Replacing the object from one of its methods

I am using Python and have an object, and that object has a method. I am looking for a simple way to replace the entire object from within that method.
E.g.:
class a():
    def b(self):
        self = other_object
How can you do that?
Thanks
You use a proxy/facade object to hold a reference to the actual object (the self, if you wish), and that proxy (a better term than Facade, but I'm not changing my code now) is what the rest of your codebase sees. However, any attribute/method access is forwarded on to the actual object, which is swappable.
The code below should give you a rough idea. Note that you need to be careful about recursion around __theinstance, which is why I am assigning to __dict__ directly. It's a bit messy, since it's been a while since I've written code that wraps getattr and setattr entirely.
class Facade:
    def __init__(self, instance):
        self.set_obj(instance)

    def set_obj(self, instance):
        self.__dict__["__theinstance"] = instance

    def __getattr__(self, attrname):
        if attrname == "__theinstance":
            return self.__dict__["__theinstance"]
        return getattr(self.__dict__["__theinstance"], attrname)

    def __setattr__(self, attrname, value):
        if attrname == "__theinstance":
            self.set_obj(value)
        return setattr(self.__dict__["__theinstance"], attrname, value)
class Test:
    def __init__(self, name, cntr):
        self.name = name
        self.cntr = cntr

    def __repr__(self):
        return "%s[%s]" % (self.__class__.__name__, self.__dict__)
obj1 = Test("first object", 1)
obj2 = Test("second", 2)
obj2.message = "greetings"
def pretend_client_code(facade):
    print(id(facade), facade.name, facade.cntr, getattr(facade, "value", None))
facade = Facade(obj1)
pretend_client_code(facade)
facade.set_obj(obj2)
pretend_client_code(facade)
facade.value = 3
pretend_client_code(facade)
facade.set_obj(obj1)
pretend_client_code(facade)
output:
4467187104 first object 1 None
4467187104 second 2 None
4467187104 second 2 3
4467187104 first object 1 None
So basically, the "client code" always sees the same facade object, but what it is actually accessing depends on what your equivalent of def b has done.
Facade has a specific meaning in Design Patterns terminology and it may not be really applicable here, but close enough. Maybe Proxy would have been better.
Note that if you want to change the class of the same object, that is a different thing, done by assigning to self.__class__. For example, say an RPG game with an EnemyClass that gets swapped to DeadEnemyClass once killed: self.__class__ = DeadEnemyClass.
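A tiny sketch of that class swap (the class bodies are made up to match the example):
class EnemyClass(object):
    def hit_points(self):
        return 50

    def die(self):
        self.__class__ = DeadEnemyClass   # same object, new class

class DeadEnemyClass(EnemyClass):
    def hit_points(self):
        return 0

orc = EnemyClass()
orc.die()
print(orc.hit_points())   # 0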
You can't directly do that. What you can do is save it as an instance variable.
class A():
    def __init__(self, instance=None):
        self.instance = instance or self
        # yes, you can make it a property as well.

    def set_val(self, obj):
        self.instance = obj

    def get_val(self):
        return self.instance
It is unlikely that replacing the 'self' variable will accomplish
whatever you're trying to do, that couldn't just be accomplished by
storing the result of func(self) in a different variable. 'self' is
effectively a local variable only defined for the duration of the
method call, used to pass in the instance of the class which is being
operated upon. Replacing self will not actually replace references to
the original instance of the class held by other objects, nor will it
create a lasting reference to the new instance which was assigned to
it.
Original source: Is it safe to replace a self object by another object of the same type in a method?

Dynamically creating @attribute.setter methods for all properties in a class (Python)

I have code that someone else wrote like this:
class MyClass(object):
    def __init__(self, data):
        self.data = data

    @property
    def attribute1(self):
        return self.data.another_name1

    @property
    def attribute2(self):
        return self.data.another_name2
and I want to automatically create the corresponding property setters at run time so I don't have to modify the other person's code. The property setters should look like this:
@attribute1.setter
def attribute1(self, val):
    self.data.another_name1 = val

@attribute2.setter
def attribute2(self, val):
    self.data.another_name2 = val
How do I dynamically add these setter methods to the class?
You can write a custom Descriptor like this:
from operator import attrgetter

class CustomProperty(object):
    def __init__(self, attr):
        self.attr = attr

    def __get__(self, ins, type):
        print 'inside __get__'
        if ins is None:
            return self
        else:
            return attrgetter(self.attr)(ins)

    def __set__(self, ins, value):
        print 'inside __set__'
        head, tail = self.attr.rsplit('.', 1)
        obj = attrgetter(head)(ins)
        setattr(obj, tail, value)

class MyClass(object):
    def __init__(self, data):
        self.data = data

    attribute1 = CustomProperty('data.another_name1')
    attribute2 = CustomProperty('data.another_name2')
Demo:
>>> class Foo():
... pass
...
>>> bar = MyClass(Foo())
>>>
>>> bar.attribute1 = 10
inside __set__
>>> bar.attribute2 = 20
inside __set__
>>> bar.attribute1
inside __get__
10
>>> bar.attribute2
inside __get__
20
>>> bar.data.another_name1
10
>>> bar.data.another_name2
20
This is the author of the question. I came up with a very jerry-rigged solution, but I don't know another way to do it. (I am using Python 3.4, by the way.)
I'll start with the problems I ran into.
First, I thought about overwriting the property entirely, something like this:
Given this class
class A(object):
    def __init__(self):
        self._value = 42

    @property
    def value(self):
        return self._value
and you can overwrite the property entirely by doing something like this:
a = A()
A.value = 31  # This just redirects A.value from the @property to the int 31
a.value # Returns 31
The problem is that this is done at the class level and not at the instance level, so if I make a new instance of A then this happens:
a2 = A()
a2.value  # Returns 31, because the class itself was modified in the previous code block.
I want that to return a2._value because a2 is a totally new instance of A() and therefore shouldn't be influenced by what I did to a.
The solution to this was to overwrite A.value with a new property rather than whatever I wanted to assign the instance _value to. I learned that you can create a new property that instantiates itself from the old property using the special getter, setter, and deleter methods (see here). So I can overwrite A's value property and make a setter for it by doing this:
def make_setter(name):
    def value_setter(self, val):
        setattr(self, name, val)
    return value_setter

my_setter = make_setter('_value')

A.value = A.value.setter(my_setter)  # This takes the property defined in the above class and overwrites the setter with my_setter
setattr(A, 'value', getattr(A, 'value').setter(my_setter))  # This does the same thing as the line above, I think, so you only need one of them
This is all well and good as long as the original class's property definition is extremely simple (in this case, just return self._value). However, as soon as it gets more complicated, to something like return self.data._value as I have, things get nasty -- like @BrenBarn said in his comment on my post. I used the inspect.getsourcelines(A.value.fget) function to get the source line that contains the return value and parsed that. If I failed to parse the string, I raised an exception. The result looks something like this:
def make_setter(name, attrname=None):
    def setter(self, val):
        try:
            split_name = name.split('.')
            child_attr = getattr(self, split_name[0])
            for i in range(len(split_name) - 2):
                child_attr = getattr(child_attr, split_name[i + 1])
            setattr(child_attr, split_name[-1], val)
        except:
            raise Exception("Failed to set property attribute {0}".format(name))
    return setter
It seems to work but there are probably bugs.
Now the question is, what to do if the thing failed? That's up to you and sort of off track from this question. Personally, I did a bit of nasty stuff that involves creating a new class that inherits from A (let's call this class B). Then if the setter worked for A, it will work for the instance of B because A is a base class. However, if it didn't work (because the return value defined in A was something nasty), I ran a setattr(B, name, val) on the class B. This would normally change all other instances that were created from B (like in the 2nd code block in this post), but I dynamically create B using type('B', (A,), {}) and only use it once ever, so changing the class itself has no effect on anything else.
There is a lot of black-magic type stuff going on here I think, but it's pretty cool and quite versatile in the day or so I've been using it. None of this is copy-pastable code, but if you understand it then you can write your modifications.
I really hope/wish there is a better way, but I do not know of one. Maybe metaclasses or descriptors created from classes can do some nice magic for you, but I do not know enough about them yet to be sure.
Comments appreciated!

Get static variable value

I'm trying to create a static variable that can be accessed from different classes, assign a value to it, and get that value when needed. I went this way to achieve it, which led me to include a property as follows:
class GetPartition(Partition):
    _i = 100

    def __init__(self):
        super(Partition, self).__init__("get")

    def get_i(self):
        return type(self)._i

    def set_i(self, val):
        type(self)._i = val

    i = property(get_i, set_i)
and this is class Partition if needed:
class Partition(BaseCommand):
    def __init__(self, type):
        super(Partition, self).__init__("databaseTest")
        self.type = type
So, when assigning a value to i from another class, I'm assigning it directly, like:
GetPartition.i = 5
and within that class, printing GetPartition.i gives me 5, but when trying to get this value from another class:
partitionNum = GetPartition()
print(partitionNum.i) # 100
print(partitionNum.get_i()) # 100
print(GetPartition.i) # <property object at 0x12A938D0>
print(GetPartition._i) # 100
As I explained in my comment, the problem comes when you assign 5 to i by way of:
GetPartition.i = 5
With this line of code, you are overwriting the property and bypassing the property setter. What I mean by that is: the property setter is not called when you assign to the attribute name on the class; it is only called when you assign to it on a class instance.
Since the property has been overwritten, it no longer exists at that point, and references to the i attribute, whether from class instances or from the class itself, no longer retrieve the same object but distinct ones.
You can confirm this problem by doing this:
gp = GetPartition()
print(GetPartition.i) # the property is returned
GetPartition.i = 5 # here the property is overwritten
print(GetPartition.i) # 5 ; the property is gone
print(gp.i) # 5 because gp instance doesn't have its own i
gp.i = 2 # now gp does have its own i
print(gp.i) # 2
print(GetPartition.i) # 5 ; i is not synced
As I said above, the property getters and setters (and descriptors in general) only work with instances of GetPartition, not the class itself. They can be forced to work with the class itself by creating a metaclass - which is the class of a class - for your class; this is considered "deep black magic" by many people, and I don't recommend going that route if you can avoid it.
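For completeness, a hedged sketch of what that metaclass route could look like (Python 3 syntax; the names mirror the question, but the metaclass is made up and the Partition base class is dropped for brevity):
class PartitionMeta(type):
    @property
    def i(cls):
        return cls._i

    @i.setter
    def i(cls, val):
        cls._i = val

class GetPartition(metaclass=PartitionMeta):
    _i = 100

GetPartition.i = 5         # goes through the metaclass setter instead of clobbering a property
print(GetPartition.i)      # 5
print(GetPartition()._i)   # 5 -- note: instances do not see the metaclass property itself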
I believe the below example is probably the simplest way to implement the behavior you want. This approach abandons the use of properties in favor of overriding the attribute getter and setter methods directly:
class Example():
    i = 1  # this is a "static variable"
    j = 3  # this is a regular class attribute

    # designate which of the class attributes are "static"
    statics = {'i'}

    def __getattribute__(self, attr):
        '''Overrides default attribute retrieval behavior.'''
        if attr in Example.statics:
            # use class version if attr is a static var
            return getattr(Example, attr)
        else:
            # default behavior if attr is not static var
            return super().__getattribute__(attr)

    def __setattr__(self, attr, value):
        '''Overrides default attribute setting behavior.'''
        if attr in Example.statics:
            # use class version if attr is a static var
            setattr(Example, attr, value)
        else:
            # default behavior if attr is not static var
            super().__setattr__(attr, value)
# testing
if __name__ == '__main__':
    print("\n\nBEGIN TESTING\n\n")
    e = Example()

    # confirm instance and class versions of i are the same
    test = "assert e.i is Example.i"
    exec(test)
    print(test)

    e.i = 5
    # confirm they remain the same after instance change
    test = "assert e.i is Example.i"
    exec(test)
    print(test)

    Example.i = 100
    # confirm they remain the same after class change
    test = "assert e.i is Example.i"
    exec(test)
    print(test)

    e.j = 12
    # confirm both versions of j are distinct
    test = "assert e.j is not Example.j"
    exec(test)
    print(test)

    print("\n\nTESTING COMPLETE\n\n")
If you are not familiar with __getattribute__ and __setattr__, I should let you know that overriding them is often quite perilous and can cause big problems (especially __getattribute__). You'll find many people simply say "don't do it; rethink your problem and find another solution". Doing the overrides correctly requires a deep understanding of a wide range python topics.
I do not claim to have this deep understanding (though I think I have a pretty good understanding), so I cannot be 100% certain that my overrides as given above will not lead to some other problem for you. I believe they are sound, but just be aware that these particular corners of python can be pretty tricky.

OO design: an object that can be exported to a "row", while accessing header names, without repeating myself

Sorry, badly worded title. I hope a simple example will make it clear. Here's the easiest way to do what I want to do:
import csv

class Lemon(object):
    headers = ['ripeness', 'colour', 'juiciness', 'seeds?']

    def to_row(self):
        return [self.ripeness, self.colour, self.juiciness, self.seeds > 0]

def save_lemons(lemonset):
    f = open('lemons.csv', 'w')
    out = csv.writer(f)
    out.writerow(Lemon.headers)
    for lemon in lemonset:
        out.writerow(lemon.to_row())
This works alright for this small example, but I feel like I'm "repeating myself" in the Lemon class. And in the actual code I'm trying to write (where the number of variables I'm exporting is ~50 rather than 4, and where to_row calls a number of private methods that do a bunch of weird calculations), it becomes awkward.
As I write the code to generate a row, I need to constantly refer to the "headers" variable to make sure I'm building my list in the correct order. If I want to change the variables being outputted, I need to make sure to_row and headers are being changed in parallel (exactly the kind of thing that DRY is meant to prevent, right?).
Is there a better way I could design this code? I've been playing with function decorators, but nothing has stuck. Ideally I should still be able to get at the headers without having a particular lemon instance (i.e. it should be a class variable or class method), and I don't want to have a separate method for each variable.
In this case, getattr() is your friend: it allows you to get a variable based on a string name. For example:
def to_row(self):
    return [getattr(self, head) for head in self.headers]
EDIT: To properly use the header seeds?, you would need to set the attribute seeds? on the object, with setattr(self, 'seeds?', self.seeds > 0) right above the return statement.
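Putting the EDIT together with the comprehension, the whole thing might look like this (a sketch; the __init__ is mine, added only to make the snippet self-contained):
class Lemon(object):
    headers = ['ripeness', 'colour', 'juiciness', 'seeds?']

    def __init__(self, ripeness, colour, juiciness, seeds):
        self.ripeness = ripeness
        self.colour = colour
        self.juiciness = juiciness
        self.seeds = seeds

    def to_row(self):
        # materialise the computed 'seeds?' column first, as the EDIT suggests
        setattr(self, 'seeds?', self.seeds > 0)
        return [getattr(self, head) for head in self.headers]

print Lemon('ripe', 'yellow', 0.9, 12).to_row()
# ['ripe', 'yellow', 0.9, True]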
We could use some metaclass shenanigans to do this...
In Python 2, attributes are passed to the metaclass in a dict, without preserving order. We'll also want a base class to work with so we can distinguish the class attributes that should be mapped into the row. In Python 3, we could dispense with just about all of this base descriptor class.
import itertools
import functools

@functools.total_ordering
class DryDescriptor(object):
    _order_gen = itertools.count()

    def __init__(self, alias=None):
        self.alias = alias
        self.order = next(self._order_gen)

    def __lt__(self, other):
        return self.order < other.order
We will want a Python descriptor for every attribute we wish to map into the row. __slots__ are a nice way to get data descriptors without much work. One caveat, though: we'll have to manually remove the helper instance to make the real slot descriptor visible.
class slot(DryDescriptor):
    def annotate(self, attr, attrs):
        del attrs[attr]
        self.attr = attr
        attrs.setdefault('__slots__', []).append(attr)

    def annotate_class(self, cls):
        if self.alias is not None:
            setattr(cls, self.alias, getattr(cls, self.attr))
For computed fields, we can memoize results. Memoizing off of the annotated instance is tricky to do without a memory leak, so we need weakref. Alternatively, we could have arranged for another slot just to store the cached value. This also isn't quite thread-safe, but it's pretty close.
import weakref

class memo(DryDescriptor):
    _memo = None

    def __call__(self, method):
        self.getter = method
        return self

    def annotate(self, attr, attrs):
        if self.alias is not None:
            attrs[self.alias] = self

    def annotate_class(self, cls):
        pass

    def __get__(self, instance, owner):
        if instance is None:
            return self
        if self._memo is None:
            self._memo = weakref.WeakKeyDictionary()
        try:
            return self._memo[instance]
        except KeyError:
            return self._memo.setdefault(instance, self.getter(instance))
On the metaclass, all of the descriptors we created above are found, sorted by creation order, and instructed to annotate the newly created class. This does not correctly handle derived classes and could use some other conveniences, like an __init__ for all the slots.
class DryMeta(type):
    def __new__(mcls, name, bases, attrs):
        descriptors = sorted((value, key)
                             for key, value in attrs.iteritems()
                             if isinstance(value, DryDescriptor))
        for descriptor, attr in descriptors:
            descriptor.annotate(attr, attrs)
        cls = type.__new__(mcls, name, bases, attrs)
        for descriptor, attr in descriptors:
            descriptor.annotate_class(cls)
        cls._header_descriptors = [getattr(cls, attr) for descriptor, attr in descriptors]
        return cls
Finally, we want a base class to inherit from so that we can have a to_row method. This just invokes __get__ for each of the respective descriptors, in order.
class DryBase(object):
    __metaclass__ = DryMeta

    def to_row(self):
        cls = type(self)
        return [desc.__get__(self, cls) for desc in cls._header_descriptors]
Assuming all of that is tucked away, out of sight, the definition of a class that uses this feature is mostly free of repetition. The only shortcoming is that, to be practical, every field needs a Python-friendly name, which is why we had the alias keyword to associate 'seeds?' with has_seeds:
class ADryRow(DryBase):
    __slots__ = ['seeds']

    ripeness = slot()
    colour = slot()
    juiciness = slot()

    @memo(alias='seeds?')
    def has_seeds(self):
        print "Expensive!!!"
        return self.seeds > 0
>>> my_row = ADryRow()
>>> my_row.ripeness = "tart"
>>> my_row.colour = "#8C2"
>>> my_row.juiciness = 0.3479
>>> my_row.seeds = 19
>>>
>>> print my_row.to_row()
Expensive!!!
['tart', '#8C2', 0.3479, True]
>>> print my_row.to_row()
['tart', '#8C2', 0.3479, True]

Track changes of attributes in an instance (Python)

I want to implement a function which takes any object as an argument and tracks changes to the values of specific attributes, saving the old value of each attribute in an old_name attribute.
For example:
class MyObject(object):
    attr_one = None
    attr_two = 1
Let's call my magic function magic_function().
So that I can do this:
obj = MyObject()
obj = magic_function(obj)
obj.attr_one = 'new value'
obj.attr_two = 2
and it saves the old values so I can get them like this:
print obj.old_attr_one   # None
print obj.attr_one       # new value
and
print obj.old_attr_two   # 1
print obj.attr_two       # 2
Something like this. I wonder how I can do this without touching the class of the instance?
This is a start:
class MagicWrapper(object):
    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, attr):
        return getattr(self._wrapped, attr)

    def __setattr__(self, attr, val):
        if attr == '_wrapped':
            super(MagicWrapper, self).__setattr__('_wrapped', val)
        else:
            setattr(self._wrapped, 'old_' + attr, getattr(self._wrapped, attr))
            setattr(self._wrapped, attr, val)

class MyObject(object):
    def __init__(self):
        self.attr_one = None
        self.attr_two = 1
obj = MyObject()
obj = MagicWrapper(obj)
obj.attr_one = 'new value'
obj.attr_two = 2
print obj.old_attr_one
print obj.attr_one
print obj.old_attr_two
print obj.attr_two
This isn't bullet-proof when you're trying to wrap weird objects (very little in Python is), but it should work for "normal" classes. You could write a lot more code to get a little bit closer to fully cloning the behaviour of the wrapped object, but it's probably impossible to do perfectly. The main thing to be aware of here is that many special methods will not be redirected to the wrapped object.
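A quick illustration of that caveat, reusing the MagicWrapper above around a plain list:
wrapped = MagicWrapper([1, 2, 3])
print wrapped.count(2)      # 1 -- ordinary attribute access is forwarded via __getattr__
try:
    print len(wrapped)
except TypeError:
    # special methods like __len__ are looked up on the type, not the instance,
    # so they bypass __getattr__ and are not forwarded to the wrapped list
    print "len() is not forwarded"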
If you want to do this without wrapping obj in some way, it's going to get messy. Here's an option:
def add_old_setattr_to_class(cls):
    def __setattr__(self, attr, val):
        super_setattr = super(self.__class__, self).__setattr__
        if attr.startswith('old_'):
            super_setattr(attr, val)
        else:
            super_setattr('old_' + attr, getattr(self, attr))
            super_setattr(attr, val)
    cls.__setattr__ = __setattr__

class MyObject(object):
    def __init__(self):
        self.attr_one = None
        self.attr_two = 1
obj = MyObject()
add_old_setattr_to_class(obj.__class__)
obj.attr_one = 'new value'
obj.attr_two = 2
print obj.old_attr_one
print obj.attr_one
print obj.old_attr_two
print obj.attr_two
Note that this is extremely invasive if you're using it on externally provided objects. It globally modifies the class of the object you're applying the magic to, not just that one instance. This is because like several other special methods, __setattr__ is not looked up in the instance's attribute dictionary; the lookup skips straight to the class, so there's no way to just override __setattr__ on the instance. I would characterise this sort of code as a bizarre hack if I encountered it in the wild (it's "nifty cleverness" if I write it myself, of course ;) ).
This version may or may not play nicely with objects that already play tricks with __setattr__ and __getattr__/__getattribute__. If you end up modifying the same class several times, I think this still works, but you end up with an ever-increasing number of wrapped __setattr__ definitions. You should probably try to avoid that; maybe by setting a "secret flag" on the class and checking for it in add_old_setattr_to_class before modifying cls. You should probably also use a more-unlikely prefix than just old_, since you're essentially trying to create a whole separate namespace.
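A sketch of that "secret flag" guard (the flag name is made up; closing over cls instead of using self.__class__ also avoids stacking wrappers if the function is applied repeatedly):
def add_old_setattr_to_class(cls):
    if cls.__dict__.get('_old_tracking_installed'):
        return                              # already patched; don't wrap __setattr__ again

    def __setattr__(self, attr, val):
        super_setattr = super(cls, self).__setattr__   # close over cls, not self.__class__
        if attr.startswith('old_'):
            super_setattr(attr, val)
        else:
            super_setattr('old_' + attr, getattr(self, attr))
            super_setattr(attr, val)

    cls.__setattr__ = __setattr__
    cls._old_tracking_installed = True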
You can substitute all attributes with custom properties at runtime. What are you trying to achieve though? Maybe migrating to completely immutable types would be a better choice?
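If you go the property route, a rough sketch of what that substitution could look like (make_tracking_property is a made-up helper; note this still modifies the class, much like the __setattr__ approach above):
def make_tracking_property(name, initial):
    storage = '_' + name

    def getter(self):
        return getattr(self, storage, initial)

    def setter(self, value):
        # remember the previous value before overwriting it
        setattr(self, 'old_' + name, getattr(self, storage, initial))
        setattr(self, storage, value)

    return property(getter, setter)

class MyObject(object):
    attr_one = None
    attr_two = 1

for _attr in ('attr_one', 'attr_two'):
    setattr(MyObject, _attr, make_tracking_property(_attr, getattr(MyObject, _attr)))

obj = MyObject()
obj.attr_one = 'new value'
obj.attr_two = 2
print obj.old_attr_one, obj.attr_one   # None new value
print obj.old_attr_two, obj.attr_two   # 1 2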
