Declaring class attributes in __init__ vs with @property - python

If I'm creating a class that needs to store properties, when is it appropriate to use the @property decorator and when should I simply define them in __init__?
The reasons I can think of:
Say I have a class like
class Apple:
    def __init__(self):
        self.foodType = "fruit"
        self.edible = True
        self.color = "red"
This works fine. In this case, it's pretty clear to me that I shouldn't write the class as:
class Apple:
    @property
    def foodType(self):
        return "fruit"
    @property
    def edible(self):
        return True
    @property
    def color(self):
        return "red"
But say I have a more complicated class, which has slower methods (say, fetching data over the internet).
I could implement this by assigning attributes in __init__:
import requests
class Apple:
    def __init__(self):
        self.wikipedia_url = "https://en.wikipedia.org/wiki/Apple"
        self.wikipedia_article_content = requests.get(self.wikipedia_url).text
or I could implement this with @property:
class Apple:
    def __init__(self):
        self.wikipedia_url = "https://en.wikipedia.org/wiki/Apple"
    @property
    def wikipedia_article_content(self):
        return requests.get(self.wikipedia_url).text
In this case, the latter is about 50,000 times faster to instantiate. However, I could argue that if I were fetching wikipedia_article_content multiple times, the former is faster:
a = Apple()
a.wikipedia_article_content
a.wikipedia_article_content
a.wikipedia_article_content
In which case, the former is ~3 times faster because it has one third the number of requests.
My question
Is the only difference between assigning properties in these two ways the ones I've thought of? What else does @property allow me to do other than save time (in some cases)? For properties that take some time to assign, is there a "right way" to assign them?

Using a property allows for more complex behavior, such as fetching the article content only when it has changed and only after a certain time period has passed.
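As a rough illustration of the "only after a certain time period has passed" idea, here is a minimal Python 3 sketch. The names CachedArticle, fetch, max_age and clock are hypothetical, and the fetch callable stands in for something like requests.get(url).text so the example runs without a network connection.

```python
import time


class CachedArticle:
    """Refetches content only when the cached copy is older than max_age seconds."""

    def __init__(self, fetch, max_age=60.0, clock=time.monotonic):
        self._fetch = fetch        # hypothetical callable, e.g. a wrapper around requests.get
        self._max_age = max_age    # seconds a cached value stays valid
        self._clock = clock        # injectable so the expiry logic can be exercised quickly
        self._content = None
        self._fetched_at = None

    @property
    def content(self):
        now = self._clock()
        expired = self._fetched_at is None or now - self._fetched_at >= self._max_age
        if expired:
            self._content = self._fetch()
            self._fetched_at = now
        return self._content


calls = []


def fake_fetch():
    # Stand-in for a slow network request
    calls.append(1)
    return "article text, version {}".format(len(calls))


fake_now = [0.0]
article = CachedArticle(fake_fetch, max_age=10.0, clock=lambda: fake_now[0])
first = article.content     # triggers a fetch
second = article.content    # served from the cache
fake_now[0] = 11.0          # pretend 11 seconds have passed
third = article.content     # cache expired, so it fetches again
```

An attribute assigned in __init__ could not do this, because it is computed exactly once when the object is created.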

Yes, I would suggest using a property for attributes like these. If you want it to be lazy or cached, you can subclass property.
This is just one implementation of a lazy property. It does some work inside the property and returns the result. The result is saved on the instance under another name, and each subsequent access of the property just returns the saved result.
class LazyProperty(property):
    def __init__(self, *args, **kwargs):
        # Let property set everything up
        super(LazyProperty, self).__init__(*args, **kwargs)
        # We need a name to save the cached result. If the property is called
        # "test" save the result as "_test".
        self._key = '_{0}'.format(self.fget.__name__)
    def __get__(self, obj, owner=None):
        # Called on the class not the instance
        if obj is None:
            return self
        # Value is already fetched so just return the stored value
        elif self._key in obj.__dict__:
            return obj.__dict__[self._key]
        # Value is not fetched, so fetch, save and return it
        else:
            val = self.fget(obj)
            obj.__dict__[self._key] = val
            return val
This allows you to calculate the value once and then always return it:
class Test:
    def __init__(self):
        pass
    @LazyProperty
    def test(self):
        print('Doing some very slow stuff.')
        return 100
This is how it would work (obviously you need to adapt it for your case):
>>> a = Test()
>>> a._test # The property hasn't been called so there is no result saved yet.
AttributeError: 'Test' object has no attribute '_test'
>>> a.test # First property access will evaluate the code you have in your property
Doing some very slow stuff.
100
>>> a.test # Accessing the property again will give you the saved result
100
>>> a._test # Or access the saved result directly
100
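For comparison, on Python 3.8 and later the standard library ships functools.cached_property, which covers the same compute-once use case. The sketch below reuses the Test class from above and assumes you do not need the separate "_test" name; cached_property stores the result in the instance __dict__ under the property's own name.

```python
from functools import cached_property


class Test:
    @cached_property
    def test(self):
        print('Doing some very slow stuff.')
        return 100


a = Test()
before = 'test' in a.__dict__   # nothing is cached before the first access
value = a.test                  # runs the method and stores the result on the instance
after = 'test' in a.__dict__    # the result now lives in a.__dict__['test']
again = a.test                  # returns the stored value without running the method
```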

Related

Replacing the object from one of its methods

I am using Python and have an object with a method. I am looking for a simple way to replace the entire object from within that method.
E.g.
class a():
    def b(self):
        self = other_object
How can you do that?
Thanks
You use a proxy/facade object to hold a reference to the actual object (the self, if you wish), and that proxy (a better term than Facade, but I'm not changing my code now) is what the rest of your codebase sees. Any attribute or method access is forwarded on to the actual object, which is swappable.
The code below should give you a rough idea. Note that you need to be careful about recursion around __theinstance, which is why I am assigning to __dict__ directly. It's a bit messy, since it's been a while since I've written code that wraps getattr and setattr entirely.
class Facade:
    def __init__(self, instance):
        self.set_obj(instance)
    def set_obj(self, instance):
        # Write to __dict__ directly so that __setattr__ is not triggered
        self.__dict__["__theinstance"] = instance
    def __getattr__(self, attrname):
        # Only called when normal lookup fails, so forward to the wrapped object
        if attrname == "__theinstance":
            return self.__dict__["__theinstance"]
        return getattr(self.__dict__["__theinstance"], attrname)
    def __setattr__(self, attrname, value):
        if attrname == "__theinstance":
            # Swap the wrapped object instead of forwarding the assignment
            return self.set_obj(value)
        return setattr(self.__dict__["__theinstance"], attrname, value)
class Test:
    def __init__(self, name, cntr):
        self.name = name
        self.cntr = cntr
    def __repr__(self):
        return "%s[%s]" % (self.__class__.__name__, self.__dict__)
obj1 = Test("first object", 1)
obj2 = Test("second", 2)
obj2.message = "greetings"
def pretend_client_code(facade):
    print(id(facade), facade.name, facade.cntr, getattr(facade, "value", None))
facade = Facade(obj1)
pretend_client_code(facade)
facade.set_obj(obj2)
pretend_client_code(facade)
facade.value = 3
pretend_client_code(facade)
facade.set_obj(obj1)
pretend_client_code(facade)
output:
4467187104 first object 1 None
4467187104 second 2 None
4467187104 second 2 3
4467187104 first object 1 None
So basically, the "client code" always sees the same facade object, but what it is actually accessing depends on what your equivalent of def b has done.
Facade has a specific meaning in Design Patterns terminology and it may not be really applicable here, but close enough. Maybe Proxy would have been better.
Note that if you want to change the class on the same object, that is a different thing, done through assigning self.__class__ . For example, say an RPG game with an EnemyClass who gets swapped to DeadEnemyClass once killed: self.__class__ = DeadEnemyClass
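The class-swapping variant mentioned above can be sketched as follows. Enemy and DeadEnemy are hypothetical classes made up for this illustration, written for Python 3. Assigning to self.__class__ works here because both are ordinary Python classes with compatible instance layouts.

```python
class Enemy:
    def __init__(self, name, hp):
        self.name = name
        self.hp = hp

    def take_damage(self, amount):
        self.hp -= amount
        if self.hp <= 0:
            # Same object, same attributes, but method lookups now go to DeadEnemy
            self.__class__ = DeadEnemy

    def describe(self):
        return "{} is alive".format(self.name)


class DeadEnemy(Enemy):
    def take_damage(self, amount):
        pass  # nothing left to damage

    def describe(self):
        return "{} is dead".format(self.name)


orc = Enemy("orc", 10)
same_object = orc          # a second reference held elsewhere in the program
orc.take_damage(15)
```

Every existing reference sees the change, since the object itself was never replaced.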
You can't directly do that. What you can do is save it as an instance variable.
class A():
    def __init__(self, instance=None):
        self.instance = instance or self
    # yes, you can make it a property as well.
    def set_val(self, obj):
        self.instance = obj
    def get_val(self):
        return self.instance
It is unlikely that replacing the 'self' variable will accomplish
whatever you're trying to do, that couldn't just be accomplished by
storing the result of func(self) in a different variable. 'self' is
effectively a local variable only defined for the duration of the
method call, used to pass in the instance of the class which is being
operated upon. Replacing self will not actually replace references to
the original instance of the class held by other objects, nor will it
create a lasting reference to the new instance which was assigned to
it.
Original source: Is it safe to replace a self object by another object of the same type in a method?
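The point made in the quoted text is easy to check. In this small Python 3 sketch (the class A and the method try_replace are hypothetical names), rebinding self inside a method changes only the local name, and the caller's references are untouched.

```python
class A:
    def __init__(self, label):
        self.label = label

    def try_replace(self, other):
        self = other          # rebinds the local name only
        return self.label     # inside the method, self now refers to other


first = A("first")
second = A("second")
alias = first
seen_inside = first.try_replace(second)
```

After the call, first and alias still refer to the original instance, even though the method saw the other object through its local name.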

Python Class "Main" Value

I'm pretty new to Python and have been writing code that reads from and writes to a binary file.
I decided to create classes for every type of data that will be contained within the file, and to keep them organized I made one class they would all inherit, called InteriorIO. I want each class to have a read and write method which would read/write the data to/from the file. At the same time as inheriting InteriorIO however, I want them to behave like str or int in that they return the value they contain, so I'd modify either __str__ or __int__ depending on what they most closely resemble.
from abc import ABCMeta, abstractmethod
import struct
class InteriorIO(object):
    __metaclass__ = ABCMeta
    @abstractmethod
    def read(this, f):
        pass
    @abstractmethod
    def write(this, f):
        pass
class byteIO(InteriorIO):
    def __init__(this, value=None):
        this.value = value
    def read(this, f):
        this.value = struct.unpack("B", f.read(1))[0]
    def __str__(this):
        return str(this.value)
class U16IO(InteriorIO):
    def __init__(this, value=None):
        this.value = value
    def read(this, f):
        this.value = struct.unpack("<H", f.read(2))[0]
    def __int__(this):
        return this.value
# how I'd like it to work
f = open("C:/some/path/file.bin", "r+b")
# In the file, the fileVersion is a U16
fileVersion = U16IO()
# We read the value from the file, storing it in fileVersion
fileVersion.read(f)
# prints the fileVersion that was just read from the file
print(str(fileVersion))
# now let's say we want to write the number 35 to the file in the form of a U16, so we store the value 35 in valueToWrite
valueToWrite = U16IO(35)
# prints the value 35
print(valueToWrite)
# writes the number 35 to the file
valueToWrite.write(f)
f.close()
The code on the bottom works, but the classes feel wrong and too ambiguous. I'm setting this.value, which is a random name I came up with, for every object as a sort of "main" value, and then returning said value as the type I want it to be.
What is the cleanest way to organize my classes such that they all inherit from InteriorIO, yet behave like a str or int in that they return their value?
I think in that case you may want to consider the Factory Design Pattern.
Here is a simple example to explain the idea:
class Cup:
    color = ""
    # This is the factory method
    @staticmethod
    def getCup(cupColor, value):
        if (cupColor == "red"):
            return RedCup(value)
        elif (cupColor == "blue"):
            return BlueCup(value)
        else:
            return None
class RedCup(Cup):
    color = "Red"
    def __init__(self, value):
        self.value = value
class BlueCup(Cup):
    color = "Blue"
    def __init__(self, value):
        self.value = value
# A little testing
redCup = Cup.getCup("red", 10)
print("{} ({})".format(redCup.color, redCup.__class__.__name__))
blueCup = Cup.getCup("blue", 20)
print("{} ({})".format(blueCup.color, blueCup.__class__.__name__))
So you have a factory, Cup, which contains a static method getCup that, given a colour and a value, decides which object to "generate", hence the name "factory".
Then in your code you only need to call the factory's getCup method, and it will return an instance of the appropriate class to work with.
As for __int__ and __str__, I think that in the classes where either one is missing you could just implement it and return None. So your U16IO would implement a __str__ method that returns None, and your byteIO would implement an __int__ that also returns None. Be aware, though, that str() and int() raise a TypeError when these methods return None, so callers would have to handle that error.
Why are you even using classes here? It seems overly complicated.
You could just define two functions, read and write:
import struct
def bread(format, binaryfile):
    # struct.calcsize gives the number of bytes the format string describes
    return struct.unpack(format, binaryfile.read(struct.calcsize(format)))
def bwrite(format, binaryfile, *args):
    binaryfile.write(struct.pack(format, *args))
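A quick way to try helper functions like these is an in-memory file. The Python 3 sketch below repeats the two helpers so that it is self-contained, uses struct.calcsize to work out how many bytes to read, and lets io.BytesIO stand in for the real binary file. The values 35 and 7 are arbitrary test data.

```python
import io
import struct


def bread(fmt, binaryfile):
    # Read exactly as many bytes as the format string describes
    size = struct.calcsize(fmt)
    return struct.unpack(fmt, binaryfile.read(size))


def bwrite(fmt, binaryfile, *args):
    binaryfile.write(struct.pack(fmt, *args))


buf = io.BytesIO()
bwrite("<H", buf, 35)    # little-endian unsigned 16-bit value
bwrite("B", buf, 7)      # single unsigned byte
raw = buf.getvalue()

buf.seek(0)
(file_version,) = bread("<H", buf)   # struct.unpack always returns a tuple
(flag,) = bread("B", buf)
```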

Dynamically creating @attribute.setter methods for all properties in class (Python)

I have code that someone else wrote like this:
class MyClass(object):
    def __init__(self, data):
        self.data = data
    @property
    def attribute1(self):
        return self.data.another_name1
    @property
    def attribute2(self):
        return self.data.another_name2
and I want to automatically create the corresponding property setters at run time so I don't have to modify the other person's code. The property setters should look like this:
    @attribute1.setter
    def attribute1(self, val):
        self.data.another_name1 = val
    @attribute2.setter
    def attribute2(self, val):
        self.data.another_name2 = val
How do I dynamically add these setter methods to the class?
You can write a custom Descriptor like this:
from operator import attrgetter
class CustomProperty(object):
    def __init__(self, attr):
        self.attr = attr
    def __get__(self, ins, type):
        print('inside __get__')
        if ins is None:
            return self
        else:
            return attrgetter(self.attr)(ins)
    def __set__(self, ins, value):
        print('inside __set__')
        # Split 'data.another_name1' into the owner path and the final attribute
        head, tail = self.attr.rsplit('.', 1)
        obj = attrgetter(head)(ins)
        setattr(obj, tail, value)
class MyClass(object):
    def __init__(self, data):
        self.data = data
    attribute1 = CustomProperty('data.another_name1')
    attribute2 = CustomProperty('data.another_name2')
Demo:
>>> class Foo():
... pass
...
>>> bar = MyClass(Foo())
>>>
>>> bar.attribute1 = 10
inside __set__
>>> bar.attribute2 = 20
inside __set__
>>> bar.attribute1
inside __get__
10
>>> bar.attribute2
inside __get__
20
>>> bar.data.another_name1
10
>>> bar.data.another_name2
20
This is the author of the question. I found out a very jerry-rigged solution, but I don't know another way to do it. (I am using python 3.4 by the way.)
I'll start with the problems I ran into.
First, I thought about overwriting the property entirely, something like this:
Given this class
class A(object):
    def __init__(self):
        self._value = 42
    @property
    def value(self):
        return self._value
You can overwrite the property entirely by doing something like this:
a = A()
A.value = 31 # This just rebinds A.value from the @property to the int 31
a.value # Returns 31
The problem is that this is done at the class level and not at the instance level, so if I make a new instance of A then this happens:
a2 = A()
a2.value # Returns 31, because the class itself was modified in the previous code block.
I want that to return a2._value because a2 is a totally new instance of A() and therefore shouldn't be influenced by what I did to a.
The solution to this was to overwrite A.value with a new property rather than whatever I wanted to assign the instance _value to. I learned that you can create a new property that instantiates itself from the old property using the special getter, setter, and deleter methods (see here). So I can overwrite A's value property and make a setter for it by doing this:
def make_setter(name):
    def value_setter(self, val):
        setattr(self, name, val)
    return value_setter
my_setter = make_setter('_value')
A.value = A.value.setter(my_setter) # This takes the property defined in the above class and overwrites the setter with my_setter
setattr(A, 'value', getattr(A, 'value').setter(my_setter)) # This does the same thing as the line above I think so you only need one of them
This is all well and good as long as the original class's property definition is extremely simple (in this case it was just return self._value). However, as soon as it gets more complicated, to something like return self.data._value as in my case, things get nasty, as @BrenBarn said in his comment on my post. I used the inspect.getsourcelines(A.value.fget) function to get the source code line that contains the return value and parsed that. If I failed to parse the string, I raised an exception. The result looks something like this:
def make_setter(name, attrname=None):
    def setter(self, val):
        try:
            split_name = name.split('.')
            # Walk every part of the dotted path except the last one
            target = self
            for part in split_name[:-1]:
                target = getattr(target, part)
            setattr(target, split_name[-1], val)
        except Exception:
            raise Exception("Failed to set property attribute {0}".format(name))
    return setter
It seems to work but there are probably bugs.
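For reference, here is a self-contained Python 3 sketch of the approach described above. It assumes the dotted path ('data.another_name1') is already known rather than parsed out of the getter's source, and the Data class is a hypothetical stand-in for whatever object the real code wraps.

```python
from operator import attrgetter


def make_setter(path):
    """Build a setter that assigns to a dotted attribute path such as 'data.another_name1'."""
    head, _, tail = path.rpartition('.')

    def setter(self, val):
        # With no dot in the path, head is empty and the attribute lives on self
        target = attrgetter(head)(self) if head else self
        setattr(target, tail, val)

    return setter


class Data:
    another_name1 = None


class MyClass:
    def __init__(self, data):
        self.data = data

    @property
    def attribute1(self):
        return self.data.another_name1


# property.setter returns a new property object, so it must be stored back on the class
MyClass.attribute1 = MyClass.attribute1.setter(make_setter('data.another_name1'))

obj = MyClass(Data())
obj.attribute1 = 10
```

Because the new property is installed on the class, every instance of MyClass gains the setter, which is usually what you want when retrofitting setters.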
Now the question is, what to do if the thing failed? That's up to you and sort of off track from this question. Personally, I did a bit of nasty stuff that involves creating a new class that inherits from A (let's call this class B). Then if the setter worked for A, it will work for the instance of B because A is a base class. However, if it didn't work (because the return value defined in A was something nasty), I ran a setattr(B, name, val) on the class B. This would normally change all other instances that were created from B (like in the 2nd code block in this post), but I dynamically create B using type('B', (A,), {}) and only use it once ever, so changing the class itself has no effect on anything else.
There is a lot of black-magic type stuff going on here I think, but it's pretty cool and quite versatile in the day or so I've been using it. None of this is copy-pastable code, but if you understand it then you can write your modifications.
I really hope/wish there is a better way, but I do not know of one. Maybe metaclasses or descriptors created from classes can do some nice magic for you, but I do not know enough about them yet to be sure.
Comments appreciated!

OO design: an object that can be exported to a "row", while accessing header names, without repeating myself

Sorry, badly worded title. I hope a simple example will make it clear. Here's the easiest way to do what I want to do:
import csv
class Lemon(object):
    headers = ['ripeness', 'colour', 'juiciness', 'seeds?']
    def to_row(self):
        return [self.ripeness, self.colour, self.juiciness, self.seeds > 0]
def save_lemons(lemonset):
    f = open('lemons.csv', 'w')
    out = csv.writer(f)
    out.writerow(Lemon.headers)
    for lemon in lemonset:
        out.writerow(lemon.to_row())
    f.close()
This works alright for this small example, but I feel like I'm "repeating myself" in the Lemon class. And in the actual code I'm trying to write (where the number of variables I'm exporting is ~50 rather than 4, and where to_row calls a number of private methods that do a bunch of weird calculations), it becomes awkward.
As I write the code to generate a row, I need to constantly refer to the "headers" variable to make sure I'm building my list in the correct order. If I want to change the variables being outputted, I need to make sure to_row and headers are being changed in parallel (exactly the kind of thing that DRY is meant to prevent, right?).
Is there a better way I could design this code? I've been playing with function decorators, but nothing has stuck. Ideally I should still be able to get at the headers without having a particular lemon instance (i.e. it should be a class variable or class method), and I don't want to have a separate method for each variable.
In this case, getattr() is your friend: it allows you to get a variable based on a string name. For example:
def to_row(self):
    return [getattr(self, head) for head in self.headers]
EDIT: to properly use the header seeds?, you would need to set the attribute seeds? for the objects. setattr(self, 'seeds?', self.seeds > 0) right above the return statement.
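One way to avoid setting an attribute literally named seeds? is to pair each header with a getter, so the column order is defined in exactly one place. This is a Python 3 sketch rather than the answer's own method; the columns list and the constructor arguments are assumptions made for the example.

```python
from operator import attrgetter


class Lemon:
    # Each column is a (header, getter) pair, so headers and values cannot drift apart
    columns = [
        ('ripeness', attrgetter('ripeness')),
        ('colour', attrgetter('colour')),
        ('juiciness', attrgetter('juiciness')),
        ('seeds?', lambda lemon: lemon.seeds > 0),
    ]
    # Available on the class itself, without needing an instance
    headers = [header for header, _ in columns]

    def __init__(self, ripeness, colour, juiciness, seeds):
        self.ripeness = ripeness
        self.colour = colour
        self.juiciness = juiciness
        self.seeds = seeds

    def to_row(self):
        return [getter(self) for _, getter in self.columns]


lemon = Lemon('ripe', 'yellow', 0.8, 3)
row = lemon.to_row()
```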
We could use some metaclass shenanigans to do this...
In Python 2, attributes are passed to the metaclass in a dict, without preserving order, so each descriptor records its own creation order. We'll also want a base class to work with so we can distinguish the class attributes that should be mapped into the row. In Python 3, we could dispense with just about all of this base descriptor class.
import itertools
import functools
@functools.total_ordering
class DryDescriptor(object):
    _order_gen = itertools.count()
    def __init__(self, alias=None):
        self.alias = alias
        self.order = next(self._order_gen)
    def __lt__(self, other):
        return self.order < other.order
We will want a python descriptor for every attribute we wish to map into the
row. slots are a nice way to get data descriptors without much work. One
caveat, though, we'll have to manually remove the helper instance to make the
real slot descriptor visible.
class slot(DryDescriptor):
    def annotate(self, attr, attrs):
        # Remove this helper so the real slot descriptor becomes visible
        del attrs[attr]
        self.attr = attr
        attrs.setdefault('__slots__', []).append(attr)
    def annotate_class(self, cls):
        if self.alias is not None:
            setattr(cls, self.alias, getattr(cls, self.attr))
For computed fields, we can memoize results. Memoizing off of the annotated
instance is tricky without a memory leak, so we need weakref. Alternatively, we
could have arranged for another slot just to store the cached value. This also isn't quite thread safe, but pretty close.
import weakref
class memo(DryDescriptor):
    _memo = None
    def __call__(self, method):
        self.getter = method
        return self
    def annotate(self, attr, attrs):
        if self.alias is not None:
            attrs[self.alias] = self
    def annotate_class(self, cls): pass
    def __get__(self, instance, owner):
        if instance is None:
            return self
        if self._memo is None:
            self._memo = weakref.WeakKeyDictionary()
        try:
            return self._memo[instance]
        except KeyError:
            return self._memo.setdefault(instance, self.getter(instance))
On the metaclass, all of the descriptors we created above are found, sorted by
creation order, and instructed to annotate the new, created class. This does
not correctly treat derived classes and could use some other conveniences like
an __init__ for all the slots.
class DryMeta(type):
    def __new__(mcls, name, bases, attrs):
        descriptors = sorted((value, key)
                             for key, value
                             in attrs.iteritems()
                             if isinstance(value, DryDescriptor))
        for descriptor, attr in descriptors:
            descriptor.annotate(attr, attrs)
        cls = type.__new__(mcls, name, bases, attrs)
        for descriptor, attr in descriptors:
            descriptor.annotate_class(cls)
        cls._header_descriptors = [getattr(cls, attr) for descriptor, attr in descriptors]
        return cls
Finally, we want a base class to inherit from so that we can have a to_row
method. This just invokes the __get__ of each of the respective
descriptors, in order.
class DryBase(object):
    __metaclass__ = DryMeta
    def to_row(self):
        cls = type(self)
        return [desc.__get__(self, cls) for desc in cls._header_descriptors]
Assuming all of that is tucked away, out of sight, the definition of a class
that uses this feature is mostly free of repetition. The only shortcoming is
that to be practical, every field needs a Python-friendly name, thus we had the
alias key to associate 'seeds?' with has_seeds.
class ADryRow(DryBase):
    __slots__ = ['seeds']
    ripeness = slot()
    colour = slot()
    juiciness = slot()
    @memo(alias='seeds?')
    def has_seeds(self):
        print "Expensive!!!"
        return self.seeds > 0
>>> my_row = ADryRow()
>>> my_row.ripeness = "tart"
>>> my_row.colour = "#8C2"
>>> my_row.juiciness = 0.3479
>>> my_row.seeds = 19
>>>
>>> print my_row.to_row()
Expensive!!!
['tart', '#8C2', 0.3479, True]
>>> print my_row.to_row()
['tart', '#8C2', 0.3479, True]
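As noted earlier, much of this machinery is unnecessary in Python 3, where class bodies preserve definition order. The sketch below is one possible simplification, not a port of the code above: column, RowBase and LemonRow are hypothetical names, it uses __init_subclass__ instead of a metaclass, and it leaves out memoization and inherited columns.

```python
class column:
    """Decorator that tags a method as a row column with the given header."""

    def __init__(self, header):
        self.header = header

    def __call__(self, func):
        func.column_header = self.header
        return func


class RowBase:
    _columns = ()

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # vars(cls) iterates in definition order, so no creation counter is needed
        cls._columns = tuple(
            value for value in vars(cls).values()
            if callable(value) and hasattr(value, 'column_header')
        )

    @classmethod
    def headers(cls):
        return [func.column_header for func in cls._columns]

    def to_row(self):
        return [func(self) for func in self._columns]


class LemonRow(RowBase):
    def __init__(self, ripeness, seeds):
        self.ripeness_value = ripeness
        self.seeds = seeds

    @column('ripeness')
    def ripeness(self):
        return self.ripeness_value

    @column('seeds?')
    def has_seeds(self):
        return self.seeds > 0


row = LemonRow('tart', 19).to_row()
```

The headers remain available from the class itself through LemonRow.headers(), without creating an instance.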

Python "callable" attribute (pseudo-property)

In python, I can alter the state of an instance by directly assigning to attributes, or by making method calls which alter the state of the attributes:
foo.thing = 'baz'
or:
foo.thing('baz')
Is there a nice way to create a class which would accept both of the above forms which scales to large numbers of attributes that behave this way? (Shortly, I'll show an example of an implementation that I don't particularly like.) If you're thinking that this is a stupid API, let me know, but perhaps a more concrete example is in order. Say I have a Document class. Document could have an attribute title. However, title may want to have some state as well (font,fontsize,justification,...), but the average user might be happy enough just setting the title to a string and being done with it ...
One way to accomplish this would be to:
class Title(object):
    def __init__(self,text,font='times',size=12):
        self.text = text
        self.font = font
        self.size = size
    def __call__(self,*text,**kwargs):
        if(text):
            self.text = text[0]
        for k,v in kwargs.items():
            setattr(self,k,v)
    def __str__(self):
        return '<title font={font}, size={size}>{text}</title>'.format(text=self.text,size=self.size,font=self.font)
class Document(object):
    _special_attr = set(['title'])
    def __setattr__(self,k,v):
        if k in self._special_attr and hasattr(self,k):
            getattr(self,k)(v)
        else:
            object.__setattr__(self,k,v)
    def __init__(self,text="",title=""):
        self.title = Title(title)
        self.text = text
    def __str__(self):
        return str(self.title)+'<body>'+self.text+'</body>'
Now I can use this as follows:
doc = Document()
doc.title = "Hello World"
print (str(doc))
doc.title("Goodbye World",font="Helvetica")
print (str(doc))
This implementation seems a little messy though (with _special_attr). Maybe that's because this is a messed up API. I'm not sure. Is there a better way to do this? Or did I stray a little too far from the beaten path on this one?
I realize I could use @property for this as well, but that wouldn't scale well at all if I had more than just one attribute which is to behave this way -- I'd need to write a getter and setter for each, yuck.
It is a bit harder than the previous answers assume.
Any value stored in the descriptor will be shared between all instances, so it is not the right place to store per-instance data.
Also, obj.attrib(...) is performed in two steps:
tmp = obj.attrib
tmp(...)
Python doesn't know in advance that the second step will follow, so you always have to return something that is callable and has a reference to its parent object.
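The two-step evaluation can be seen with any ordinary method. In this minimal Python 3 sketch, Greeter is a hypothetical class used only for illustration.

```python
class Greeter:
    def greet(self, name):
        return "hello " + name


g = Greeter()
tmp = g.greet                  # step 1: plain attribute lookup, returns a bound method
two_step = tmp("world")        # step 2: call whatever the lookup returned
one_line = g.greet("world")    # the usual spelling performs the same two steps
```

The bound method keeps a reference to its instance in tmp.__self__, and any callable returned from a property has to carry that kind of back-reference too.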
In the following example that reference is implied in the set argument:
class CallableString(str):
    def __new__(class_, set, value):
        inst = str.__new__(class_, value)
        inst._set = set
        return inst
    def __call__(self, value):
        self._set(value)
class A(object):
    def __init__(self):
        self._attrib = "foo"
    def get_attrib(self):
        return CallableString(self.set_attrib, self._attrib)
    def set_attrib(self, value):
        try:
            value = value._value
        except AttributeError:
            pass
        self._attrib = value
    attrib = property(get_attrib, set_attrib)
a = A()
print(a.attrib)
a.attrib = "bar"
print(a.attrib)
a.attrib("baz")
print(a.attrib)
In short: what you want cannot be done transparently. You'll write better Python code if you don't insist on hacking around this limitation.
You can avoid having to use @property on potentially hundreds of attributes by simply creating a descriptor class that follows the appropriate rules:
# Warning: Untested code ahead
class DocAttribute(object):
    tag_str = "<{tag}{attrs}>{text}</{tag}>"
    def __init__(self, tag_name, default_attrs=None):
        self._tag_name = tag_name
        self._attrs = default_attrs if default_attrs is not None else {}
    def __call__(self, *text, **attrs):
        self._text = "".join(text)
        self._attrs.update(attrs)
        return self
    def __get__(self, instance, cls):
        return self
    def __set__(self, instance, value):
        self._text = value
    def __str__(self):
        # Attrs left as an exercise for the reader; an empty string keeps the template valid
        return self.tag_str.format(tag=self._tag_name, attrs="", text=self._text)
Then you can use Document's __setattr__ method to add a descriptor based on this class if it is in a white list of approved names (or not in a black list of forbidden ones, depending on your domain):
class Document(object):
    # prelude
    def __setattr__(self, name, value):
        if self.is_allowed(name): # Again, left as an exercise for the reader
            object.__setattr__(self, name, DocAttribute(name)(value))
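Since anything stored on a descriptor itself is shared between all instances, a descriptor for this API needs to keep its state on each instance. The following is a Python 3.6+ sketch under that assumption, not a drop-in replacement for the code above. CallableAttribute is a hypothetical name, and it relies on __set_name__ to learn which attribute it was assigned to.

```python
class Title:
    def __init__(self, text="", font="times", size=12):
        self.text = text
        self.font = font
        self.size = size

    def __call__(self, *text, **kwargs):
        if text:
            self.text = text[0]
        for key, value in kwargs.items():
            setattr(self, key, value)

    def __str__(self):
        return '<title font={}, size={}>{}</title>'.format(self.font, self.size, self.text)


class CallableAttribute:
    """Descriptor: obj.x = v and obj.x(v, ...) both update a per-instance object."""

    def __init__(self, factory):
        self.factory = factory

    def __set_name__(self, owner, name):
        self.storage = '_' + name    # per-instance slot in the instance __dict__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.storage not in instance.__dict__:
            instance.__dict__[self.storage] = self.factory()
        return instance.__dict__[self.storage]

    def __set__(self, instance, value):
        # Plain assignment is routed through the stored object's __call__
        self.__get__(instance)(value)


class Document:
    title = CallableAttribute(Title)

    def __init__(self, title=""):
        self.title = title


doc = Document()
other = Document("Untouched")
doc.title = "Hello World"
doc.title("Goodbye World", font="Helvetica")
```

Adding another attribute of this kind is then one line per attribute in the class body, with no __setattr__ override or whitelist.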
