Python Class "Main" Value - python

I'm pretty new to Python and have been writing code that will read and write from a binary file.
I decided to create classes for every type of data that will be contained within the file, and to keep them organized I made one class they would all inherit from, called InteriorIO. I want each class to have a read and write method which would read/write the data from/to the file. At the same time as inheriting from InteriorIO, however, I want them to behave like str or int in that they return the value they contain, so I'd override either __str__ or __int__ depending on what they most closely resemble.
from abc import ABCMeta, abstractmethod
import struct

class InteriorIO(object):
    __metaclass__ = ABCMeta

    @abstractmethod
    def read(this, f):
        pass

    @abstractmethod
    def write(this, f):
        pass

class byteIO(InteriorIO):
    def __init__(this, value=None):
        this.value = value

    def read(this, f):
        this.value = struct.unpack("B", f.read(1))[0]

    def __str__(this):
        return str(this.value)

class U16IO(InteriorIO):
    def __init__(this, value=None):
        this.value = value

    def read(this, f):
        this.value = struct.unpack("<H", f.read(2))[0]

    def __int__(this):
        return this.value
# how I'd like it to work
f = open("C:/some/path/file.bin", "rb+")  # open for binary read/write

# In the file, the fileVersion is a U16
fileVersion = U16IO()
# We read the value from the file, storing it in fileVersion
fileVersion.read(f)
# prints the fileVersion that was just read from the file
print(str(fileVersion))

# now let's say we want to write the number 35 to the file in the form
# of a U16, so we store the value 35 in valueToWrite
valueToWrite = U16IO(35)
# prints the value 35
print(valueToWrite)
# writes the number 35 to the file
valueToWrite.write(f)

f.close()
The code at the bottom works, but the classes feel wrong and too ambiguous. I'm setting this.value, which is just a name I came up with, on every object as a sort of "main" value, and then returning that value as the type I want it to be.
What is the cleanest way to organize my classes such that they all inherit from InteriorIO, yet behave like a str or int in that they return their value?

I think in that case you may want to consider the Factory Design Pattern.
Here is a simple example to explain the idea:
class Cup:
    color = ""

    # This is the factory method
    @staticmethod
    def getCup(cupColor, value):
        if cupColor == "red":
            return RedCup(value)
        elif cupColor == "blue":
            return BlueCup(value)
        else:
            return None

class RedCup(Cup):
    color = "Red"

    def __init__(self, value):
        self.value = value

class BlueCup(Cup):
    color = "Blue"

    def __init__(self, value):
        self.value = value

# A little testing
redCup = Cup.getCup("red", 10)
print("{} ({})".format(redCup.color, redCup.__class__.__name__))

blueCup = Cup.getCup("blue", 20)
print("{} ({})".format(blueCup.color, blueCup.__class__.__name__))
So you have a factory Cup which contains a static method getCup which, given a color and a value, decides which object to "generate", hence the name "factory".
Then in your code you only need to call the factory's getCup method, and it returns the appropriate class to work with.
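Applied to the question's classes, the same idea might look like this minimal sketch (the "byte"/"u16" tags and the make_io name are my own assumptions, not part of the question):
def make_io(kind, value=None):
    # Hypothetical factory mapping a type tag to one of the question's classes
    if kind == "byte":
        return byteIO(value)
    elif kind == "u16":
        return U16IO(value)
    else:
        return None

fileVersion = make_io("u16")  # equivalent to U16IO()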
The way to deal with __int__ and __str__, I think, is that in the classes where you are missing either, you just implement it and return None. So your U16IO should implement a __str__ method that returns None, and your byteIO should implement an __int__ that also returns None.

Why are you even using classes here? It seems overly complicated.
You could just define two functions, read and write:
import struct

def bread(fmt, binaryfile):
    # struct.calcsize gives the byte length described by the format string
    return struct.unpack(fmt, binaryfile.read(struct.calcsize(fmt)))

def bwrite(fmt, binaryfile, *args):
    binaryfile.write(struct.pack(fmt, *args))
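A quick usage sketch, assuming a file opened in binary read/write mode (the file name is hypothetical; "<H" is the little-endian U16 format from the question):
with open("file.bin", "rb+") as f:
    (fileVersion,) = bread("<H", f)  # read one U16
    bwrite("<H", f, 35)              # write 35 back as a U16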


Trickling down attribute value change to other attributes in Python (and fixing the types)

I have the following problem: I'm trying to model a synthesizer, where a component can have subcomponents. I define, for example, a knob at a higher level, and I want its value to 'trickle down' to a lower-level component, so that when Toplevel.knob's value changes, Lowerlevel.value changes automatically.
I have a minimal example below. The first example doesn't work (obviously), and then I have my current implementation, which sort of works but seems ridiculously complicated. Moreover, I get typing issues, because sometimes mypy will detect an attribute as being of the Parameter class when I want to use it as an int (or vice versa).
Some preferred behavior:
Lowerlevel.value and Toplevel.knob can take a regular number (int, float...) or a Parameter.
I want a Parameter to be directly usable as a number (hence the getter), not something like Parameter.value.
Can I somehow make the type of Parameter seem like the type of Parameter.value? Note that Parameter.value in general should be any type of number.
I would like to avoid the detour of defining a par_knob variable in NewToplevel. The reason I have to do it now is that if I directly set NewLowerlevel(value=self.knob) inside Toplevel.__init__, it might call the getter of Toplevel.knob if protect_parameters has already been called (for example if I create a second Toplevel instance), and the getter currently returns the value directly, not a Parameter object.
I hope this makes sense. Please let me know if the explanation is not clear.
I suppose what I would want is something that behaves like an int or a float, but is mutable. I read some things online about creating your own mutable int, but that seems a bit convoluted...
Any suggestions are welcome!
Thanks
from functools import partial
from typing import Any, Union

class Lowerlevel:
    def __init__(self, value):
        self.value = value

class Toplevel:
    def __init__(self, knob):
        self.knob = knob
        self.elements = [Lowerlevel(value=self.knob)]

T1 = Toplevel(knob=1)
# should print "1"
print(T1.elements[0].value)

# now turn the knob
T1.knob = 2
# still prints 1
print(T1.elements[0].value)
## So I define a new class to define a parameter:
class Parameter:
    def __init__(self, value: Any) -> None:
        """A Parameter is any object with a value. By defining a class argument as a
        Parameter, we can use HasParameters to detect them and define getters and setters.
        This way, we can make sure its value is passed on to other class attributes that
        depend on it.

        Args:
            value: The value of the parameter
        """
        self.value = value

    def __repr__(self) -> Any:
        """Redefine the printing action so that it prints the value."""
        return self.value.__repr__()

class HasParameters:
    def __init__(self) -> None:
        """By calling protect_parameters immediately at init (at the end of init in the
        subclasses), we detect the arguments that are Parameter and define their
        getters and setters.
        """
        self.protect_parameters()

    def protect_parameters(self) -> None:
        """Define the getter and setter for each argument that is a Parameter."""
        for par_name, parameter in self.__dict__.items():
            if isinstance(parameter, Parameter):
                def _get_par(self, par_name):
                    return self.__dict__[par_name].value

                def _set_par(self, val, par_name):
                    if par_name in self.__dict__:
                        self.__dict__[par_name].value = val
                    else:
                        if isinstance(val, Parameter):
                            self.__dict__[par_name] = val
                        else:
                            self.__dict__[par_name] = Parameter(val)

                fget = partial(_get_par, par_name=par_name)
                fset = partial(_set_par, par_name=par_name)
                setattr(self.__class__, par_name, property(fget=fget, fset=fset))
# Now try again:
class NewLowerlevel(HasParameters):
    def __init__(self, value: Union[int, Parameter]) -> None:
        self.value = value
        super().__init__()

class NewToplevel(HasParameters):
    def __init__(self, knob: Union[int, Parameter]) -> None:
        if isinstance(knob, Parameter):
            par_knob = knob
        else:
            par_knob = Parameter(knob)
        self.knob = par_knob
        self.elements = [NewLowerlevel(value=par_knob)]
        super().__init__()

T2 = NewToplevel(knob=1)
# should print "1"
print(T2.elements[0].value)

# now turn the knob
T2.knob = 2
# now prints 2
print(T2.elements[0].value)
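For reference, a minimal sketch of the "mutable int" idea mentioned above (the SharedValue name is hypothetical; delegating __int__/__float__ is what lets the box pass as a number wherever one is read):
class SharedValue:
    # A tiny mutable box that can stand in for a number
    def __init__(self, value):
        self.value = value

    def __int__(self):
        return int(self.value)

    def __float__(self):
        return float(self.value)

    def __repr__(self):
        return repr(self.value)

knob = SharedValue(1)
elements = [knob]        # both ends hold the same box
knob.value = 2
print(int(elements[0]))  # prints 2 -- the change trickles down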

Return a custom value when a class method is accessed as an attribute, but still allow for it to perform a computation when called?

Specifically, I would want MyClass.my_method to be used for lookup of a value in the class dictionary, but MyClass.my_method() to be a method that accepts arguments and performs a computation to update an attribute in MyClass and then returns MyClass with all its attributes (including the updated one).
I am thinking that this might be doable with Python's descriptors (maybe overriding __get__ or __call__), but I can't figure out how this would look. I understand that the behavior might be confusing, but I am interested if it is possible (and if there are any other major caveats).
I have seen that you can do something similar for classes and functions by overriding __repr__, but I can't find a similar way for a method within a class. My returned value will also not always be a string, which seems to prohibit the __repr__-based approaches mentioned in these two questions:
Possible to change a function's repr in python?
How to create a custom string representation for a class object?
Thank you Joel for the minimal implementation. I found that the remaining problem is the lack of initialization of the parent; since I did not find a generic way of initializing it, I need to check for attributes in the list/dict cases and add the initialization values to the parent accordingly.
This addition to the code should make it work for lists/dicts:
def classFactory(parent, init_val, target):
    class modifierClass(parent):
        def __init__(self, init_val):
            super().__init__()
            dict_attr = getattr(parent, "update", None)
            list_attr = getattr(parent, "extend", None)
            if callable(dict_attr):    # parent is dict
                self.update(init_val)
            elif callable(list_attr):  # parent is list
                self.extend(init_val)
            self.target = target

        def __call__(self, *args):
            self.target.__init__(*args)

    return modifierClass(init_val)

class myClass:
    def __init__(self, init_val=''):
        self.method = classFactory(init_val.__class__, init_val, self)
Unfortunately, we need to add cases one by one, but this works as intended.
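A quick usage sketch of the list case (the expected outputs in the comments follow from the code above):
m = myClass([1, 2, 3])
print(m.method)   # [1, 2, 3] -- attribute access behaves like the list
m.method([4, 5])  # calling it re-runs myClass.__init__ with the new value
print(m.method)   # [4, 5]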
A slightly less verbose way to write the above is the following:
def classFactory(parent, init_val, target):
    class modifierClass(parent):
        def __init__(self, init_val):
            if isinstance(init_val, list):
                self.extend(init_val)
            elif isinstance(init_val, dict):
                self.update(init_val)
            self.target = target

        def __call__(self, *args):
            self.target.__init__(*args)

    return modifierClass(init_val)

class myClass:
    def __init__(self, init_val=''):
        self.method = classFactory(init_val.__class__, init_val, self)
As jasonharper commented,
MyClass.my_method() works by looking up MyClass.my_method, and then attempting to call that object. So the result of MyClass.my_method cannot be a plain string, int, or other common data type [...]
The trouble comes specifically from reusing the same name for these two behaviors, which is very confusing, just as you said. So, don't do it.
But for the sole interest of it, you could try to proxy the value of the property with an object that returns the original MyClass instance when called, use an actual setter to perform any computation you want, and also forward arbitrary attribute access to the proxied value.
class MyClass:
    _my_method = whatever  # placeholder initial value

    @property
    def my_method(self):
        my_class = self

        class Proxy:
            def __init__(self, value):
                self.__proxied = value

            def __call__(self, value):
                my_class.my_method = value
                return my_class

            def __getattr__(self, name):
                return getattr(self.__proxied, name)

            def __str__(self):
                return str(self.__proxied)

            def __repr__(self):
                return repr(self.__proxied)

        return Proxy(self._my_method)

    @my_method.setter
    def my_method(self, value):
        # your computations
        self._my_method = value

a = MyClass()
b = a.my_method('do not do this at home')
a is b
# True
a.my_method.split(' ')
# ['do', 'not', 'do', 'this', 'at', 'home']
And then duck typing will take its revenge, forcing you to delegate all kinds of magic methods to the proxied value in the proxy class, until the poor codebase where you want to inject this is satisfied with how those values quack.
This is a minimal implementation of Guillherme's answer that updates the method instead of a separate modifiable parameter:
def classFactory(parent, init_val, target):
    class modifierClass(parent):
        def __init__(self, init_val):
            self.target = target

        def __call__(self, *args):
            self.target.__init__(*args)

    return modifierClass(init_val)

class myClass:
    def __init__(self, init_val=''):
        self.method = classFactory(init_val.__class__, init_val, self)
This and the original answer both work well for single values, but it seems like lists and dictionaries are returned as empty instead of with the expected values, and I am not sure why, so help is appreciated here.

Declaring class attributes in __init__ vs with @property

If I'm creating a class that needs to store properties, when is it appropriate to use the @property decorator, and when should I simply define them in __init__?
The reasons I can think of:
Say I have a class like
class Apple:
    def __init__(self):
        self.foodType = "fruit"
        self.edible = True
        self.color = "red"
This works fine. In this case, it's pretty clear to me that I shouldn't write the class as:
class Apple:
    @property
    def foodType(self):
        return "fruit"

    @property
    def edible(self):
        return True

    @property
    def color(self):
        return "red"
But say I have a more complicated class, which has slower methods (say, fetching data over the internet).
I could implement this by assigning attributes in __init__:
import requests

class Apple:
    def __init__(self):
        self.wikipedia_url = "https://en.wikipedia.org/wiki/Apple"
        self.wikipedia_article_content = requests.get(self.wikipedia_url).text
or I could implement this with @property:
class Apple:
    def __init__(self):
        self.wikipedia_url = "https://en.wikipedia.org/wiki/Apple"

    @property
    def wikipedia_article_content(self):
        return requests.get(self.wikipedia_url).text
In this case, the latter is about 50,000 times faster to instantiate. However, I could argue that if I were fetching wikipedia_article_content multiple times, the former is faster:
a = Apple()
a.wikipedia_article_content
a.wikipedia_article_content
a.wikipedia_article_content
In which case, the former is ~3 times faster because it has one third the number of requests.
My question
Is the only difference between assigning properties in these two ways the ones I've thought of? What else does @property allow me to do other than save time (in some cases)? For properties that take some time to assign, is there a "right way" to assign them?
Using a property allows for more complex behavior. Such as fetching the article content only when it has changed and only after a certain time period has passed.
Yes, I would suggest using property for those attributes. If you want to make it lazy or cached, you can subclass property.
The following is an implementation of a lazy property. It does some operations inside the property and returns the result; the result is saved on the instance under another name, and each subsequent access of the property just returns the saved result.
class LazyProperty(property):
    def __init__(self, *args, **kwargs):
        # Let property set everything up
        super(LazyProperty, self).__init__(*args, **kwargs)
        # We need a name to save the cached result. If the property is called
        # "test", save the result as "_test".
        self._key = '_{0}'.format(self.fget.__name__)

    def __get__(self, obj, owner=None):
        # Called on the class, not the instance
        if obj is None:
            return self
        # Value is already fetched, so just return the stored value
        elif self._key in obj.__dict__:
            return obj.__dict__[self._key]
        # Value is not fetched, so fetch, save and return it
        else:
            val = self.fget(obj)
            obj.__dict__[self._key] = val
            return val
This allows you to calculate the value once and then always return it:
class Test:
    def __init__(self):
        pass

    @LazyProperty
    def test(self):
        print('Doing some very slow stuff.')
        return 100
This is how it would work (obviously you need to adapt it for your case):
>>> a = Test()
>>> a._test # The property hasn't been called so there is no result saved yet.
AttributeError: 'Test' object has no attribute '_test'
>>> a.test # First property access will evaluate the code you have in your property
Doing some very slow stuff.
100
>>> a.test # Accessing the property again will give you the saved result
100
>>> a._test # Or access the saved result directly
100
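For what it's worth, on Python 3.8+ the standard library ships functools.cached_property, which gives the same compute-once, cache-on-the-instance behavior without a custom descriptor:
from functools import cached_property

class Test:
    @cached_property
    def test(self):
        print('Doing some very slow stuff.')
        return 100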

OO design: an object that can be exported to a "row", while accessing header names, without repeating myself

Sorry, badly worded title. I hope a simple example will make it clear. Here's the easiest way to do what I want to do:
import csv

class Lemon(object):
    headers = ['ripeness', 'colour', 'juiciness', 'seeds?']

    def to_row(self):
        return [self.ripeness, self.colour, self.juiciness, self.seeds > 0]

def save_lemons(lemonset):
    f = open('lemons.csv', 'w')
    out = csv.writer(f)
    out.writerow(Lemon.headers)
    for lemon in lemonset:
        out.writerow(lemon.to_row())
This works alright for this small example, but I feel like I'm "repeating myself" in the Lemon class. And in the actual code I'm trying to write (where the number of variables I'm exporting is ~50 rather than 4, and where to_row calls a number of private methods that do a bunch of weird calculations), it becomes awkward.
As I write the code to generate a row, I need to constantly refer to the "headers" variable to make sure I'm building my list in the correct order. If I want to change the variables being outputted, I need to make sure to_row and headers are being changed in parallel (exactly the kind of thing that DRY is meant to prevent, right?).
Is there a better way I could design this code? I've been playing with function decorators, but nothing has stuck. Ideally I should still be able to get at the headers without having a particular lemon instance (i.e. it should be a class variable or class method), and I don't want to have a separate method for each variable.
In this case, getattr() is your friend: it allows you to get a variable based on a string name. For example:
def to_row(self):
    return [getattr(self, head) for head in self.headers]
EDIT: to properly use the header seeds?, you would need to set the attribute seeds? on the object: setattr(self, 'seeds?', self.seeds > 0) right above the return statement.
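Putting the two together, a minimal sketch (assuming the remaining headers map directly to attribute names):
class Lemon(object):
    headers = ['ripeness', 'colour', 'juiciness', 'seeds?']

    def to_row(self):
        setattr(self, 'seeds?', self.seeds > 0)
        return [getattr(self, head) for head in self.headers]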
We could use some metaclass shenanigans to do this...
In Python 2, attributes are passed to the metaclass in a dict, without preserving order; we'll also want a base class to work with so we can distinguish class attributes that should be mapped into the row. In Python 3, we could dispense with just about all of this base descriptor class.
import itertools
import functools

@functools.total_ordering
class DryDescriptor(object):
    _order_gen = itertools.count()

    def __init__(self, alias=None):
        self.alias = alias
        self.order = next(self._order_gen)

    def __lt__(self, other):
        return self.order < other.order
We will want a Python descriptor for every attribute we wish to map into the row. Slots are a nice way to get data descriptors without much work. One caveat, though: we'll have to manually remove the helper instance to make the real slot descriptor visible.
class slot(DryDescriptor):
    def annotate(self, attr, attrs):
        del attrs[attr]
        self.attr = attr
        attrs.setdefault('__slots__', []).append(attr)

    def annotate_class(self, cls):
        if self.alias is not None:
            setattr(cls, self.alias, getattr(cls, self.attr))
For computed fields, we can memoize results. Memoizing off of the annotated instance is tricky without a memory leak; we need weakref. Alternatively, we could have arranged for another slot just to store the cached value. This also isn't quite thread safe, but it's pretty close.
import weakref

class memo(DryDescriptor):
    _memo = None

    def __call__(self, method):
        self.getter = method
        return self

    def annotate(self, attr, attrs):
        if self.alias is not None:
            attrs[self.alias] = self

    def annotate_class(self, cls):
        pass

    def __get__(self, instance, owner):
        if instance is None:
            return self
        if self._memo is None:
            self._memo = weakref.WeakKeyDictionary()
        try:
            return self._memo[instance]
        except KeyError:
            return self._memo.setdefault(instance, self.getter(instance))
On the metaclass, all of the descriptors we created above are found, sorted by creation order, and instructed to annotate the newly created class. This does not correctly treat derived classes and could use some other conveniences, like an __init__ for all the slots.
class DryMeta(type):
    def __new__(mcls, name, bases, attrs):
        descriptors = sorted((value, key)
                             for key, value
                             in attrs.iteritems()
                             if isinstance(value, DryDescriptor))
        for descriptor, attr in descriptors:
            descriptor.annotate(attr, attrs)
        cls = type.__new__(mcls, name, bases, attrs)
        for descriptor, attr in descriptors:
            descriptor.annotate_class(cls)
        cls._header_descriptors = [getattr(cls, attr) for descriptor, attr in descriptors]
        return cls
Finally, we want a base class to inherit from so that we can have a to_row method. This just invokes all of the __get__s of the respective descriptors, in order.
class DryBase(object):
    __metaclass__ = DryMeta

    def to_row(self):
        cls = type(self)
        return [desc.__get__(self, cls) for desc in cls._header_descriptors]
Assuming all of that is tucked away, out of sight, the definition of a class that uses this feature is mostly free of repetition. The only shortcoming is that to be practical, every field needs a Python-friendly name; thus we had the alias key to associate 'seeds?' with has_seeds:
class ADryRow(DryBase):
    __slots__ = ['seeds']

    ripeness = slot()
    colour = slot()
    juiciness = slot()

    @memo(alias='seeds?')
    def has_seeds(self):
        print "Expensive!!!"
        return self.seeds > 0
>>> my_row = ADryRow()
>>> my_row.ripeness = "tart"
>>> my_row.colour = "#8C2"
>>> my_row.juiciness = 0.3479
>>> my_row.seeds = 19
>>>
>>> print my_row.to_row()
Expensive!!!
['tart', '#8C2', 0.3479, True]
>>> print my_row.to_row()
['tart', '#8C2', 0.3479, True]

Python "callable" attribute (pseudo-property)

In Python, I can alter the state of an instance by directly assigning to attributes, or by making method calls which alter the state of the attributes:
foo.thing = 'baz'
or:
foo.thing('baz')
Is there a nice way to create a class which would accept both of the above forms and which scales to large numbers of attributes that behave this way? (Shortly, I'll show an example of an implementation that I don't particularly like.) If you're thinking that this is a stupid API, let me know, but perhaps a more concrete example is in order. Say I have a Document class. Document could have an attribute title. However, title may want to have some state as well (font, fontsize, justification, ...), but the average user might be happy enough just setting the title to a string and being done with it ...
One way to accomplish this would be to:
class Title(object):
    def __init__(self, text, font='times', size=12):
        self.text = text
        self.font = font
        self.size = size

    def __call__(self, *text, **kwargs):
        if text:
            self.text = text[0]
        for k, v in kwargs.items():
            setattr(self, k, v)

    def __str__(self):
        return '<title font={font}, size={size}>{text}</title>'.format(text=self.text, size=self.size, font=self.font)

class Document(object):
    _special_attr = set(['title'])

    def __setattr__(self, k, v):
        if k in self._special_attr and hasattr(self, k):
            getattr(self, k)(v)
        else:
            object.__setattr__(self, k, v)

    def __init__(self, text="", title=""):
        self.title = Title(title)
        self.text = text

    def __str__(self):
        return str(self.title) + '<body>' + self.text + '</body>'
Now I can use this as follows:
doc = Document()
doc.title = "Hello World"
print(str(doc))

doc.title("Goodbye World", font="Helvetica")
print(str(doc))
This implementation seems a little messy, though (with _special_attr). Maybe that's because this is a messed-up API; I'm not sure. Is there a better way to do this? Or did I stray a little too far from the beaten path on this one?
I realize I could use @property for this as well, but that wouldn't scale well at all if I had more than just one attribute which is to behave this way -- I'd need to write a getter and setter for each, yuck.
It is a bit harder than the previous answers assume.
Any value stored in the descriptor will be shared between all instances, so it is not the right place to store per-instance data.
Also, obj.attrib(...) is performed in two steps:
tmp = obj.attrib
tmp(...)
Python doesn't know in advance that the second step will follow, so you always have to return something that is callable and has a reference to its parent object.
In the following example that reference is implied in the set argument:
class CallableString(str):
    def __new__(class_, set, value):
        inst = str.__new__(class_, value)
        inst._set = set
        return inst

    def __call__(self, value):
        self._set(value)

class A(object):
    def __init__(self):
        self._attrib = "foo"

    def get_attrib(self):
        return CallableString(self.set_attrib, self._attrib)

    def set_attrib(self, value):
        try:
            value = value._value
        except AttributeError:
            pass
        self._attrib = value

    attrib = property(get_attrib, set_attrib)

a = A()
print(a.attrib)
a.attrib = "bar"
print(a.attrib)
a.attrib("baz")
print(a.attrib)
In short: what you want cannot be done transparently. You'll write better Python code if you don't insist on hacking around this limitation.
You can avoid having to use @property on potentially hundreds of attributes by simply creating a descriptor class that follows the appropriate rules:
# Warning: Untested code ahead
class DocAttribute(object):
    tag_str = "<{tag}{attrs}>{text}</{tag}>"

    def __init__(self, tag_name, default_attrs=None):
        self._tag_name = tag_name
        self._attrs = default_attrs if default_attrs is not None else {}

    def __call__(self, *text, **attrs):
        self._text = "".join(text)
        self._attrs.update(attrs)
        return self

    def __get__(self, instance, cls):
        return self

    def __set__(self, instance, value):
        self._text = value

    def __str__(self):
        # Attrs formatting left as an exercise for the reader
        return self.tag_str.format(tag=self._tag_name, attrs="", text=self._text)
Then you can use Document's __setattr__ method to add a descriptor based on this class if it is in a white list of approved names (or not in a black list of forbidden ones, depending on your domain):
class Document(object):
    # prelude

    def __setattr__(self, name, value):
        if self.is_allowed(name):  # Again, left as an exercise for the reader
            object.__setattr__(self, name, DocAttribute(name)(value))
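A sketch of the two pieces left as exercises, purely as one way they might look (the whitelist mirrors the question's _special_attr; the else branch keeps ordinary attributes working):
class Document(object):
    _special_attr = set(['title'])  # hypothetical whitelist

    def is_allowed(self, name):
        return name in self._special_attr

    def __setattr__(self, name, value):
        if self.is_allowed(name):
            object.__setattr__(self, name, DocAttribute(name)(value))
        else:
            object.__setattr__(self, name, value)

doc = Document()
doc.title = "Hello World"   # assignment wraps the value in a DocAttribute
print(str(doc.title))       # <title>Hello World</title>
doc.title("Goodbye World")  # calling updates the stored text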
