Proper way to use dataclass in another class - python

After asking my last question, it seems I have not really understood classes and dataclasses.
So I would like to learn the correct way of doing the following:
define dataclass
define other class, which will use an instance of dataclass
use a method from the second class to update the values of the dataclass
The way I do it gives me an error saying that my dataframe doesn't exist. I also created a method inside the dataclass, but using that results in an error stating that it is read-only.
@dataclass(slots=True)
def Storage():
    timestamp: float
    value: float
class UDP():
    some attributes
    self.datastorage: Storage = Storage()
    def updatedata(self, time, val):
        self.datastorage.timestamp = time
        self.datastorage.value = val
def main():
    test = UDP()
    test.updatedata(0.01,2)
So my question is how to instantiate a dataclass in another class and be able to manipulate the values in the dataclass?

Your code has several syntactic problems. Once those are fixed, the code works. Storage objects are mutable, and you may freely modify their timestamp and value attributes.
In [7]: @dataclass(slots=True)
   ...: class Storage:
   ...:     timestamp: float
   ...:     value: float
   ...:
   ...:
   ...: class UDP:
   ...:     datastorage: Storage = Storage(0.0, 0.0)
   ...:
   ...:     def updatedata(self, time, val):
   ...:         self.datastorage.timestamp = time
   ...:         self.datastorage.value = val
   ...:
   ...: def main():
   ...:     test = UDP()
   ...:     test.updatedata(0.01,2)
   ...:
In [8]: main()
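One thing worth noting about the session above: datastorage is assigned in the class body, so every UDP instance shares the same Storage object. If each UDP should own its own storage, a minimal sketch (assuming default values of 0.0 are acceptable, and leaving out slots=True, which needs Python 3.10+) could create the instance in __init__ instead:

```python
from dataclasses import dataclass


@dataclass
class Storage:
    # Defaults are an assumption here so that Storage() can be called without arguments.
    timestamp: float = 0.0
    value: float = 0.0


class UDP:
    def __init__(self):
        # Created per instance, so two UDP objects never share one Storage.
        self.datastorage = Storage()

    def updatedata(self, time, val):
        self.datastorage.timestamp = time
        self.datastorage.value = val


first = UDP()
second = UDP()
first.updatedata(0.01, 2)
print(first.datastorage)
print(second.datastorage)
```

Here only first is modified; second keeps its own untouched Storage.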

Related

Weird Issue when using dataclass and property together

I ran into a strange issue while trying to use a dataclass together with a property.
I have it down to a minimum to reproduce it:
import dataclasses
@dataclasses.dataclass
class FileObject:
    _uploaded_by: str = dataclasses.field(default=None, init=False)
    uploaded_by: str = None
    def save(self):
        print(self.uploaded_by)
    @property
    def uploaded_by(self):
        return self._uploaded_by
    @uploaded_by.setter
    def uploaded_by(self, uploaded_by):
        print('Setter Called with Value ', uploaded_by)
        self._uploaded_by = uploaded_by
p = FileObject()
p.save()
This outputs:
Setter Called with Value <property object at 0x7faeb00150b0>
<property object at 0x7faeb00150b0>
I would expect to get None instead of the property object.
Am I doing something wrong here or have I stumbled across a bug?
After reading @juanpa.arrivillaga's answer I thought that making uploaded_by an InitVar might fix the issue, but it still returns a property object. I think it is because of what he said:
the dataclass machinery interprets any assignment to a type-annotated
variable in the class body as the default value to the created
__init__.
The only option I can find that works with the default value is to remove uploaded_by from the dataclass definition and write an actual __init__. That has the unfortunate side effect of requiring you to write an __init__ for the dataclass manually, which negates some of the value of using a dataclass. Here is what I did:
import dataclasses
@dataclasses.dataclass
class FileObject:
    _uploaded_by: str = dataclasses.field(default=None, init=False)
    uploaded_by: dataclasses.InitVar=None
    other_attrs: str = None
    def __init__(self, uploaded_by=None, other_attrs=None):
        self._uploaded_by = uploaded_by
        self.other_attrs = other_attrs
    def save(self):
        print("Uploaded by: ", self.uploaded_by)
        print("Other Attrs: ", self.other_attrs)
    @property
    def uploaded_by(self):
        if not self._uploaded_by:
            print("Doing expensive logic that should not be repeated")
        return self._uploaded_by
p = FileObject(other_attrs="More Data")
p.save()
p2 = FileObject(uploaded_by='Already Computed', other_attrs="More Data")
p2.save()
Which outputs:
Doing expensive logic that should not be repeated
Uploaded by: None
Other Attrs: More Data
Uploaded by: Already Computed
Other Attrs: More Data
The negatives of doing this:
You have to write a boilerplate __init__ (my actual use case has about 20 attrs)
You lose uploaded_by in the __repr__, but it is there in _uploaded_by
Calls to asdict, astuple, dataclasses.replace aren't handled correctly
So it's really not a fix for the issue
I have filed a bug on the Python Bug Tracker:
https://bugs.python.org/issue39247
So, unfortunately, the @property syntax is always interpreted as an assignment to uploaded_by (since, well, it is). The dataclass machinery is interpreting that as a default value, which is why it is passing the property object! It is equivalent to this:
In [11]: import dataclasses
    ...:
    ...: @dataclasses.dataclass
    ...: class FileObject:
    ...:     uploaded_by: str
    ...:     _uploaded_by: str = dataclasses.field(repr=False, init=False)
    ...:     def save(self):
    ...:         print(self.uploaded_by)
    ...:
    ...:     def _get_uploaded_by(self):
    ...:         return self._uploaded_by
    ...:
    ...:     def _set_uploaded_by(self, uploaded_by):
    ...:         print('Setter Called with Value ', uploaded_by)
    ...:         self._uploaded_by = uploaded_by
    ...:     uploaded_by = property(_get_uploaded_by, _set_uploaded_by)
    ...: p = FileObject()
    ...: p.save()
Setter Called with Value <property object at 0x10761e7d0>
<property object at 0x10761e7d0>
Which is essentially acting like this:
In [13]: @dataclasses.dataclass
    ...: class Foo:
    ...:     bar:int = 1
    ...:     bar = 2
    ...:
In [14]: Foo()
Out[14]: Foo(bar=2)
I don't think there is a clean way around this, and perhaps it could be considered a bug, but I'm not sure what the solution should be, because essentially the dataclass machinery interprets any assignment to a type-annotated variable in the class body as the default value to the created __init__. You could perhaps either special-case the @property syntax, or maybe just the property object itself, so at least the behavior for @property and x = property(get_x, set_x) would be consistent...
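To see this for yourself, one option is to inspect the field metadata that the decorator records. The sketch below is only an illustration (the variable name field_info is made up for it); it uses dataclasses.fields to show that the recorded default is the property object rather than None:

```python
import dataclasses


@dataclasses.dataclass
class FileObject:
    uploaded_by: str = None

    @property
    def uploaded_by(self):
        return self._uploaded_by

    @uploaded_by.setter
    def uploaded_by(self, value):
        self._uploaded_by = value


# The class has a single field; look at the default the decorator stored for it.
(field_info,) = dataclasses.fields(FileObject)
print(isinstance(field_info.default, property))

# The generated __init__ therefore passes the property object to the setter.
p = FileObject()
print(isinstance(p.uploaded_by, property))
```

Both checks report True, which matches the output shown in the question.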
To be clear, the following sort of works:
In [22]: import dataclasses
    ...:
    ...: @dataclasses.dataclass
    ...: class FileObject:
    ...:     uploaded_by: str
    ...:     _uploaded_by: str = dataclasses.field(repr=False, init=False)
    ...:     @property
    ...:     def uploaded_by(self):
    ...:         return self._uploaded_by
    ...:     @uploaded_by.setter
    ...:     def uploaded_by(self, uploaded_by):
    ...:         print('Setter Called with Value ', uploaded_by)
    ...:         self._uploaded_by = uploaded_by
    ...:
    ...: p = FileObject(None)
    ...: print(p.uploaded_by)
Setter Called with Value None
None
In [23]: FileObject()
Setter Called with Value <property object at 0x1086debf0>
Out[23]: FileObject(uploaded_by=<property object at 0x1086debf0>)
But notice, you cannot set a useful default value! It will always take the property... Even worse, IMO, if you don't want a default value it will always create one!
EDIT: Found a potential workaround!
This should have been obvious, but you can just set the property object on the class.
import dataclasses
import typing
@dataclasses.dataclass
class FileObject:
    uploaded_by:typing.Optional[str]=None
    def _uploaded_by_getter(self):
        return self._uploaded_by
    def _uploaded_by_setter(self, uploaded_by):
        print('Setter Called with Value ', uploaded_by)
        self._uploaded_by = uploaded_by
FileObject.uploaded_by = property(
    FileObject._uploaded_by_getter,
    FileObject._uploaded_by_setter
)
p = FileObject()
print(p)
print(p.uploaded_by)
An alternative take on @juanpa.arrivillaga's solution of setting properties, which may look a tad more object-oriented, was initially proposed on python-list by Peter Otten:
import dataclasses
from typing import Optional
@dataclasses.dataclass
class FileObject:
    uploaded_by: Optional[str] = None
class FileObjectExpensive(FileObject):
    @property
    def uploaded_by(self):
        return self._uploaded_by
    @uploaded_by.setter
    def uploaded_by(self, uploaded_by):
        print('Setter Called with Value ', uploaded_by)
        self._uploaded_by = uploaded_by
    def save(self):
        print(self.uploaded_by)
p = FileObjectExpensive()
p.save()
p2 = FileObjectExpensive(uploaded_by='Already Computed')
p2.save()
This outputs:
Setter Called with Value None
None
Setter Called with Value Already Computed
Already Computed
To me this approach, while not being perfect in terms of removing boilerplate, has a little more readability and explicitness in the separation of the pure data container and behaviour on that data. And it keeps all variables' and properties' names the same, so readability seems to be the same.
Slightly modified solution from original question using metaclass approach - hope it helps :)
from __future__ import annotations
import dataclasses
from dataclass_wizard import property_wizard
@dataclasses.dataclass
class FileObject(metaclass=property_wizard):
    uploaded_by: str | None
    # uncomment and use for better IDE support
    # _uploaded_by: str | None = dataclasses.field(default=None)
    def save(self):
        print(self.uploaded_by)
    @property
    def uploaded_by(self):
        return self._uploaded_by
    @uploaded_by.setter
    def uploaded_by(self, uploaded_by):
        print('Setter Called with Value ', uploaded_by)
        self._uploaded_by = uploaded_by
p = FileObject()
p.save()
This outputs (as I assume is desired behavior):
Setter Called with Value None
None
Edit (4/1/22): Adding clarification for future viewers. The dataclass-wizard is a library I've created to tackle the issue of field properties with default values in dataclasses, among other things. It can be installed with pip:
$ pip install dataclass-wizard
If you are interested in an optimized approach that relies only on stdlib, I created a simple gist which uses a metaclass approach.
Here's general usage below. This will raise an error as expected when the name field is not passed in to constructor:
@dataclass
class Test(metaclass=field_property_support):
    my_int: int
    name: str
    my_bool: bool = True
    @property
    def name(self):
        return self._name
    @name.setter
    def name(self, val):
        print(f'Setting name to: {val!r}')
        self._name = val
For completeness, and with credit to @juanpa.arrivillaga, here is a proposed answer to the original question which uses decorators.
It works at least with the use cases shown, and I prefer it to the method described here because it lets us assign a default value using the normal dataclass idiom.
The key is to defeat the @dataclass machinery by creating the getter and setter on a 'dummy' property (here '_uploaded_by') and then overwriting the original attribute from outside the class.
Maybe someone more knowledgeable than I can find a way to do the overwrite within __post_init__() ...
import dataclasses
@dataclasses.dataclass
class FileObject:
    uploaded_by: str = None
    def save(self):
        print(self.uploaded_by)
    @property
    def _uploaded_by(self):
        return self._uploaded_by_attr
    @_uploaded_by.setter
    def _uploaded_by(self, uploaded_by):
        # print('Setter Called with Value ', uploaded_by)
        self._uploaded_by_attr = uploaded_by
# --- has to be called at module level ---
FileObject.uploaded_by = FileObject._uploaded_by
def main():
    p = FileObject()
    p.save() # displays 'None'
    p = FileObject()
    p.uploaded_by = 'foo'
    p.save() # displays 'foo'
    p = FileObject(uploaded_by='bar')
    p.save() # displays 'bar'
if __name__ == '__main__':
    main()
Based on the solution of @juanpa.arrivillaga, I wrote the following function that makes it reusable as an additional decorator:
from dataclasses import fields
def dataprops(cls):
    """A decorator that turns dataclass fields into properties.
    Getter and setter method names must start with `get_` and `set_`."""
    for field in fields(cls):
        setattr(cls,
                field.name,
                property(
                    getattr(cls,f'get_{field.name}'),
                    getattr(cls,f'set_{field.name}')
                )
        )
    return cls
Simple usage:
from dataclasses import dataclass
@dataprops
@dataclass
class FileObject:
    uploaded_by: str = "no_one"
    def save(self):
        print(self.uploaded_by)
    def get_uploaded_by(self):
        return self._uploaded_by
    def set_uploaded_by(self, uploaded_by):
        print('Setter Called with Value: ', uploaded_by)
        self._uploaded_by = uploaded_by
Output results:
p = FileObject()
p.save()
# output:
# Setter Called with Value: no_one
# no_one
p = FileObject("myself")
p.save()
# output:
# Setter Called with Value: myself
# myself

How to define enum values that are functions?

I have a situation where I need to enforce and give the user the option of one of a number of select functions, to be passed in as an argument to another function:
I really want to achieve something like the following:
from enum import Enum
#Trivial Function 1
def functionA():
    pass
#Trivial Function 2
def functionB():
    pass
#This is not allowed (as far as i can tell the values should be integers)
#But pseudocode for what I am after
class AvailableFunctions(Enum):
    OptionA = functionA
    OptionB = functionB
So the following can be executed:
def myUserFunction(theFunction = AvailableFunctions.OptionA):
    #Type Check
    assert isinstance(theFunction,AvailableFunctions)
    #Execute the actual function held as value in the enum or equivalent
    return theFunction.value()
Your assumption is wrong. Values can be arbitrary, they are not limited to integers. From the documentation:
The examples above use integers for enumeration values. Using integers
is short and handy (and provided by default by the Functional API),
but not strictly enforced. In the vast majority of use-cases, one
doesn’t care what the actual value of an enumeration is. But if the
value is important, enumerations can have arbitrary values.
However the issue with functions is that they are considered to be method definitions instead of attributes!
In [1]: from enum import Enum
In [2]: def f(self, *args):
   ...:     pass
   ...:
In [3]: class MyEnum(Enum):
   ...:     a = f
   ...:     def b(self, *args):
   ...:         print(self, args)
   ...:
In [4]: list(MyEnum) # it has no values
Out[4]: []
In [5]: MyEnum.a
Out[5]: <function __main__.f>
In [6]: MyEnum.b
Out[6]: <function __main__.MyEnum.b>
You can work around this by using a wrapper class or just functools.partial or (only in Python2) staticmethod:
from functools import partial
class MyEnum(Enum):
    OptionA = partial(functionA)
    OptionB = staticmethod(functionB)
Sample run:
In [7]: from functools import partial
In [8]: class MyEnum2(Enum):
   ...:     a = partial(f)
   ...:     def b(self, *args):
   ...:         print(self, args)
   ...:
In [9]: list(MyEnum2)
Out[9]: [<MyEnum2.a: functools.partial(<function f at 0x7f4130f9aae8>)>]
In [10]: MyEnum2.a
Out[10]: <MyEnum2.a: functools.partial(<function f at 0x7f4130f9aae8>)>
Or using a wrapper class:
In [13]: class Wrapper:
    ...:     def __init__(self, f):
    ...:         self.f = f
    ...:     def __call__(self, *args, **kwargs):
    ...:         return self.f(*args, **kwargs)
    ...:
In [14]: class MyEnum3(Enum):
    ...:     a = Wrapper(f)
    ...:
In [15]: list(MyEnum3)
Out[15]: [<MyEnum3.a: <__main__.Wrapper object at 0x7f413075b358>>]
Also note that if you want you can define the __call__ method in your enumeration class to make the values callable:
In [1]: from enum import Enum
In [2]: def f(*args):
   ...:     print(args)
   ...:
In [3]: class MyEnum(Enum):
   ...:     a = partial(f)
   ...:     def __call__(self, *args):
   ...:         self.value(*args)
   ...:
In [5]: MyEnum.a(1,2,3) # no need for MyEnum.a.value(1,2,3)
(1, 2, 3)
Since Python 3.11 there is a much more concise and understandable way. The member and nonmember functions were added to enum, among other improvements, so you can now do the following:
from enum import Enum, member
def fn(x):
    print(x)
class MyEnum(Enum):
    meth = fn
    mem = member(fn)
    @classmethod
    def this_is_a_method(cls):
        print('No, still not a member')
    def this_is_just_function():
        print('No, not a member')
    @member
    def this_is_a_member(x):
        print('Now a member!', x)
And now
>>> list(MyEnum)
[<MyEnum.mem: <function fn at ...>>, <MyEnum.this_is_a_member: <function MyEnum.this_is_a_member at ...>>]
>>> MyEnum.meth(1)
1
>>> MyEnum.mem(1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'MyEnum' object is not callable
>>> MyEnum.mem.value(1)
1
>>> MyEnum.this_is_a_method()
No, still not a member
>>> MyEnum.this_is_just_function()
No, not a member
>>> MyEnum.this_is_a_member()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'MyEnum' object is not callable
>>> MyEnum.this_is_a_member.value(1)
Now a member! 1
Another less clunky solution is to put the functions in a tuple. As Bakuriu mentioned, you may want to make the enum callable.
from enum import Enum
def functionA():
    pass
def functionB():
    pass
class AvailableFunctions(Enum):
    OptionA = (functionA,)
    OptionB = (functionB,)
    def __call__(self, *args, **kwargs):
        self.value[0](*args, **kwargs)
Now you can use it like this:
AvailableFunctions.OptionA() # calls functionA
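Tying this back to the original question, the tuple approach can be combined with the type check the asker wanted. The following is a rough sketch only: the snake_case names and the return values are made up for illustration, and __call__ returns the result here (unlike the snippet above) so the caller can use it.

```python
from enum import Enum


def function_a():
    return "A was called"


def function_b():
    return "B was called"


class AvailableFunctions(Enum):
    # Wrapping each function in a one-element tuple keeps Enum from
    # treating it as a method definition.
    OPTION_A = (function_a,)
    OPTION_B = (function_b,)

    def __call__(self, *args, **kwargs):
        return self.value[0](*args, **kwargs)


def my_user_function(the_function=AvailableFunctions.OPTION_A):
    # Reject anything that is not one of the allowed options.
    if not isinstance(the_function, AvailableFunctions):
        raise TypeError("the_function must be an AvailableFunctions member")
    return the_function()


print(my_user_function())
print(my_user_function(AvailableFunctions.OPTION_B))
```

Passing a plain function such as function_a directly raises TypeError, which is the enforcement the question asked for.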
In addition to Bakuriu's answer: if you use the wrapper approach like above, you lose information about the original function, such as __name__, __repr__ and so on, after wrapping it. This will cause problems, for example, if you want to use Sphinx to generate source code documentation. Therefore add the following to your wrapper class.
import functools
class wrapper:
    def __init__(self, function):
        self.function = function
        # copy __name__, __doc__ and similar attributes from the wrapped function
        functools.update_wrapper(self, function)
    def __call__(self,*args, **kwargs):
        return self.function(*args, **kwargs)
    def __repr__(self):
        return self.function.__repr__()
Building on top of @bakuriu's approach, I just want to highlight that we can also use dictionaries of multiple functions as values and have a broader polymorphism, similar to enums in Java. Here is a fictitious example to show what I mean:
from enum import Enum, unique
@unique
class MyEnum(Enum):
    test = {'execute': lambda o: o.test()}
    prod = {'execute': lambda o: o.prod()}
    def __getattr__(self, name):
        if name in self.__dict__:
            return self.__dict__[name]
        elif not name.startswith("_"):
            value = self.__dict__['_value_']
            return value[name]
        raise AttributeError(name)
class Executor:
    def __init__(self, mode: MyEnum):
        self.mode = mode
    def test(self):
        print('test run')
    def prod(self):
        print('prod run')
    def execute(self):
        self.mode.execute(self)
Executor(MyEnum.test).execute()
Executor(MyEnum.prod).execute()
Obviously, the dictionary approach provides no additional benefit when there is only a single function, so use this approach when there are multiple functions. Ensure that the keys are uniform across all values as otherwise, the usage won't be polymorphic.
The __getattr__ method is optional; it is only there for syntactic sugar (i.e., without it, mode.execute() would become mode.value['execute']()).
Since dictionaries can't be made readonly, using namedtuple would be better and require only minor changes to the above.
from enum import Enum, unique
from collections import namedtuple
EnumType = namedtuple("EnumType", "execute")
@unique
class MyEnum(Enum):
    test = EnumType(lambda o: o.test())
    prod = EnumType(lambda o: o.prod())
    def __getattr__(self, name):
        if name in self.__dict__:
            return self.__dict__[name]
        elif not name.startswith("_"):
            value = self.__dict__['_value_']
            return getattr(value, name)
        raise AttributeError(name)

Represent a class as a dict or list

I have classes that are used for getting data from one system, making some modifications and then outputting it into another system. This usually means converting the data into a dict or a list after I've made all the necessary conversions.
So far what I've done is that I've made two methods called as_dict() and as_list() and used that whenever I need that representation.
But I'm curious if there's a way to be able to do dict(instance_of_my_class) or list(instance_of_my_class).
I've been reading up on magic methods and it seems as if this is not possible?
And some simple sample code to work with:
class Cost(object):
    @property
    def a_metric(self):
        return self.raw_data.get('a_metric', 0) * 0.8
    [..]
    # Repeat for various kinds of transformations
    def as_dict(self):
        return {
            'a_metric': self.a_metric,
            [...]
        }
Do you mean something like this? If so, you have to define an __iter__ method that yields key-value pairs:
In [1]: class A(object):
   ...:     def __init__(self):
   ...:         self.pairs = ((1,2),(2,3))
   ...:     def __iter__(self):
   ...:         return iter(self.pairs)
   ...:
In [2]: a = A()
In [3]: dict(a)
Out[3]: {1: 2, 2: 3}
Also, it seems that dict tries to call the .keys / __getitem__ methods before __iter__, so you can make list(instance) and dict(instance) return something completely different.
In [4]: class B(object):
   ...:     def __init__(self):
   ...:         self.d = {'key':'value'}
   ...:         self.l = [1,2,3,4]
   ...:     def keys(self):
   ...:         return self.d.keys()
   ...:     def __getitem__(self, item):
   ...:         return self.d[item]
   ...:     def __iter__(self):
   ...:         return iter(self.l)
   ...:
In [5]: b = B()
In [6]: list(b)
Out[6]: [1, 2, 3, 4]
In [7]: dict(b)
Out[7]: {'key': 'value'}
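Applied to the Cost class from the question, the __iter__ idea might look like the sketch below. The constructor taking raw_data is an assumption (the question does not show how raw_data gets set), and only the single a_metric property is included.

```python
class Cost:
    def __init__(self, raw_data):
        # Assumed constructor; the question only shows that self.raw_data exists.
        self.raw_data = raw_data

    @property
    def a_metric(self):
        return self.raw_data.get("a_metric", 0) * 0.8

    def __iter__(self):
        # Yield (key, value) pairs so that dict(instance) can consume them.
        yield "a_metric", self.a_metric


cost = Cost({"a_metric": 10})
print(dict(cost))
print(list(cost))
```

With this single __iter__, list(cost) gives the pairs and dict(cost) gives the mapping; add keys and __getitem__ as shown above if the two should differ.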

__getattr__ on a class and not (or as well as) an instance

I know I can write code like:
class A :
    def __getattr__ (self, name) :
        return name
to trap access to undefined attributes on an instance of a class, so:
A().ATTR == 'ATTR'
is True. But is there any way to do this for the class itself? What I'd like to be able to do is have the following two lines both work, and be distinguishable (i.e., there are different magic methods called, or the magic method can tell how it was called):
a = A().ATTR
b = A.ATTR
I suspect the answer is no but maybe there is some deep python ju-ju?
Edit: The actual problem is extending a custom active record library in a backwards-compatible way. The ORM code supports code along the lines of
ar = AR_people()
ar.find()
name = ar.name
to access tables, where name may get mapped to a column name which is different, e.g. pe_name. We want to be able to write something like
ar.filter(AR_people.age >= 21)
and end up with
pe_age >= 21
(much like other ORM libraries do), so AR_people.age needs to return an instance of a class that implements __ge__ and so forth to do the conversion magic.
You can use a metaclass:
In [1]: class meta(type):
   ...:     def __getattr__(self, name):
   ...:         return name
   ...:
   ...:
In [2]: class A(object):
   ...:     __metaclass__ = meta
   ...:     def __getattr__(self, name):
   ...:         return name
   ...:
   ...:
In [3]: A().attr
Out[3]: 'attr'
In [4]: A.attr
Out[4]: 'attr'
Yes you can. Metaclasses are the answer.
class MyMetaclass(type):
    def __getattr__(cls, name):
        return "cls.%s" % name
class A :
    __metaclass__ = MyMetaclass
    def __getattr__ (self, name) :
        return name
print A().ATTR
print A.ATTR
will output
ATTR
cls.ATTR
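Both answers above use Python 2 syntax. In Python 3 the __metaclass__ attribute is ignored and the metaclass is passed as a keyword in the class statement instead. A minimal Python 3 sketch of the same idea follows; the guard for double-underscore names is an addition of this sketch, not part of the original answers, so that lookups of missing special attributes still fail normally.

```python
class MyMetaclass(type):
    def __getattr__(cls, name):
        # Called only when normal lookup on the class fails.
        if name.startswith("__"):
            raise AttributeError(name)
        return "cls.%s" % name


class A(metaclass=MyMetaclass):
    def __getattr__(self, name):
        # Called only when normal lookup on the instance fails.
        return name


print(A().ATTR)
print(A.ATTR)
```

This prints the same two lines as the Python 2 version.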

Converting an object into a subclass in Python?

Let's say I have a library function that I cannot change that produces an object of class A, and I have created a class B that inherits from A.
What is the most straightforward way of using the library function to produce an object of class B?
edit- I was asked in a comment for more detail, so here goes:
PyTables is a package that handles hierarchical datasets in python. The bit I use most is its ability to manage data that is partially on disk. It provides an 'Array' type which only comes with extended slicing, but I need to select arbitrary rows. Numpy offers this capability - you can select by providing a boolean array of the same length as the array you are selecting from. Therefore, I wanted to subclass Array to add this new functionality.
In a more abstract sense this is a problem I have considered before. The usual solution is as has already been suggested: have a constructor for B that takes an A and additional arguments, and then pulls out the relevant bits of A to insert into B. As it seemed like a fairly basic problem, I asked the question to see if there were any standard solutions I wasn't aware of.
This can be done if the initializer of the subclass can handle it, or you write an explicit upgrader. Here is an example:
class A(object):
    def __init__(self):
        self.x = 1
class B(A):
    def __init__(self):
        super(B, self).__init__()
        self._init_B()
    def _init_B(self):
        self.x += 1
a = A()
b = a
b.__class__ = B
b._init_B()
assert b.x == 2
Since the library function returns an A, you can't make it return a B without changing it.
One thing you can do is write a function to take the fields of the A instance and copy them over into a new B instance:
class A: # defined by the library
    def __init__(self, field):
        self.field = field
class B(A): # your fancy new class
    def __init__(self, field, field2):
        self.field = field
        self.field2 = field2 # B has some fancy extra stuff
def b_from_a(a_instance, field2):
    """Given an instance of A, return a new instance of B."""
    return B(a_instance.field, field2)
a = A("spam") # this could be your A instance from the library
b = b_from_a(a, "ham") # make a new B which has the data from a
print b.field, b.field2 # prints "spam ham"
Edit: depending on your situation, composition instead of inheritance could be a good bet; that is your B class could just contain an instance of A instead of inheriting:
class B2: # doesn't have to inherit from A
    def __init__(self, a, field2):
        self._a = a # using composition instead
        self.field2 = field2
    @property
    def field(self): # pass accesses to a
        return self._a.field
    # could provide setter, deleter, etc
a = A("spam")
b = B2(a, "ham")
print b.field, b.field2 # prints "spam ham"
You can actually change the .__class__ attribute of the object if you know what you're doing:
In [1]: class A(object):
   ...:     def foo(self):
   ...:         return "foo"
   ...:
In [2]: class B(object):
   ...:     def foo(self):
   ...:         return "bar"
   ...:
In [3]: a = A()
In [4]: a.foo()
Out[4]: 'foo'
In [5]: a.__class__
Out[5]: __main__.A
In [6]: a.__class__ = B
In [7]: a.foo()
Out[7]: 'bar'
Monkeypatch the library?
For example,
import other_library
other_library.function_or_class_to_replace = new_function
Poof, it returns whatever you want it to return.
Monkeypatch A.__new__ to return an instance of B?
After you call obj = A(), change the result so obj.__class__ = B?
Depending on use case, you can now hack a dataclass to arguably make the composition solution a little cleaner:
from dataclasses import dataclass, fields
@dataclass
class B:
    field: int # Only adds 1 line per field instead of a whole @property method
    @classmethod
    def from_A(cls, a):
        # iterate over B's own fields; A need not be a dataclass
        return cls(**{
            f.name: getattr(a, f.name)
            for f in fields(cls)
        })
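For a runnable illustration of this pattern, here is a small sketch with a stand-in A (a plain class playing the role of the library type) and a B that carries one extra field. The extra field and the keyword pass-through in from_a are assumptions added for the example.

```python
from dataclasses import dataclass, fields


class A:
    """Stand-in for the library class that cannot be changed."""

    def __init__(self, field):
        self.field = field


@dataclass
class B:
    field: int
    extra: str = "ham"  # hypothetical additional data that A does not have

    @classmethod
    def from_a(cls, a, **extra_fields):
        # Copy every dataclass field that the A instance actually provides.
        copied = {
            f.name: getattr(a, f.name)
            for f in fields(cls)
            if hasattr(a, f.name)
        }
        return cls(**copied, **extra_fields)


b = B.from_a(A(1), extra="eggs")
print(b)
```

Note that this copies values into a new object rather than wrapping the original, so later changes to the A instance are not reflected in b.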
