Weird Issue when using dataclass and property together - python

I ran into a strange issue while trying to use a dataclass together with a property.
I have reduced it to a minimal example that reproduces it:
import dataclasses

@dataclasses.dataclass
class FileObject:
    _uploaded_by: str = dataclasses.field(default=None, init=False)
    uploaded_by: str = None

    def save(self):
        print(self.uploaded_by)

    @property
    def uploaded_by(self):
        return self._uploaded_by

    @uploaded_by.setter
    def uploaded_by(self, uploaded_by):
        print('Setter Called with Value ', uploaded_by)
        self._uploaded_by = uploaded_by

p = FileObject()
p.save()
This outputs:
Setter Called with Value <property object at 0x7faeb00150b0>
<property object at 0x7faeb00150b0>
I would expect to get None instead of the property object.
Am I doing something wrong here or have I stumbled across a bug?
After reading @juanpa.arrivillaga's answer, I thought that making uploaded_by an InitVar might fix the issue, but it still returns a property object. I think that is because of this point from the answer:
the dataclass machinery interprets any assignment to a type-annotated
variable in the class body as the default value to the created
__init__.
The only option I can find that works with the default value is to remove uploaded_by from the dataclass definition and write an actual __init__. That has the unfortunate side effect of requiring a hand-written __init__ for the dataclass, which negates some of the value of using a dataclass. Here is what I did:
import dataclasses

@dataclasses.dataclass
class FileObject:
    _uploaded_by: str = dataclasses.field(default=None, init=False)
    uploaded_by: dataclasses.InitVar = None
    other_attrs: str = None

    def __init__(self, uploaded_by=None, other_attrs=None):
        self._uploaded_by = uploaded_by
        self.other_attrs = other_attrs

    def save(self):
        print("Uploaded by: ", self.uploaded_by)
        print("Other Attrs: ", self.other_attrs)

    @property
    def uploaded_by(self):
        if not self._uploaded_by:
            print("Doing expensive logic that should not be repeated")
        return self._uploaded_by

p = FileObject(other_attrs="More Data")
p.save()
p2 = FileObject(uploaded_by='Already Computed', other_attrs="More Data")
p2.save()
Which outputs:
Doing expensive logic that should not be repeated
Uploaded by: None
Other Attrs: More Data
Uploaded by: Already Computed
Other Attrs: More Data
The negatives of doing this:
- You have to write a boilerplate __init__ (my actual use case has about 20 attrs).
- You lose uploaded_by in the __repr__, but it is there in _uploaded_by.
- Calls to asdict, astuple, and dataclasses.replace aren't handled correctly.
So it's really not a fix for the issue.
I have filed a bug on the Python Bug Tracker:
https://bugs.python.org/issue39247

So, unfortunately, the @property syntax is always interpreted as an assignment to uploaded_by (since, well, it is). The dataclass machinery interprets that as a default value, which is why it passes the property object! It is equivalent to this:
In [11]: import dataclasses
    ...:
    ...: @dataclasses.dataclass
    ...: class FileObject:
    ...:     uploaded_by: str
    ...:     _uploaded_by: str = dataclasses.field(repr=False, init=False)
    ...:     def save(self):
    ...:         print(self.uploaded_by)
    ...:
    ...:     def _get_uploaded_by(self):
    ...:         return self._uploaded_by
    ...:
    ...:     def _set_uploaded_by(self, uploaded_by):
    ...:         print('Setter Called with Value ', uploaded_by)
    ...:         self._uploaded_by = uploaded_by
    ...:     uploaded_by = property(_get_uploaded_by, _set_uploaded_by)
    ...: p = FileObject()
    ...: p.save()
Setter Called with Value <property object at 0x10761e7d0>
<property object at 0x10761e7d0>
Which is essentially acting like this:
In [13]: @dataclasses.dataclass
    ...: class Foo:
    ...:     bar: int = 1
    ...:     bar = 2
    ...:
In [14]: Foo()
Out[14]: Foo(bar=2)
I don't think there is a clean way around this. Perhaps it could be considered a bug, but I'm not sure what the solution should be, because essentially the dataclass machinery interprets any assignment to a type-annotated variable in the class body as the default value to the created __init__. You could perhaps special-case either the @property syntax or just the property object itself, so that at least the behavior for @property and x = property(get_x, set_x) would be consistent...
To be clear, the following sort of works:
In [22]: import dataclasses
    ...:
    ...: @dataclasses.dataclass
    ...: class FileObject:
    ...:     uploaded_by: str
    ...:     _uploaded_by: str = dataclasses.field(repr=False, init=False)
    ...:     @property
    ...:     def uploaded_by(self):
    ...:         return self._uploaded_by
    ...:     @uploaded_by.setter
    ...:     def uploaded_by(self, uploaded_by):
    ...:         print('Setter Called with Value ', uploaded_by)
    ...:         self._uploaded_by = uploaded_by
    ...:
    ...: p = FileObject(None)
    ...: print(p.uploaded_by)
Setter Called with Value None
None
In [23]: FileObject()
Setter Called with Value <property object at 0x1086debf0>
Out[23]: FileObject(uploaded_by=<property object at 0x1086debf0>)
But notice, you cannot set a useful default value! It will always take the property... Even worse, IMO, if you don't want a default value it will always create one!
EDIT: Found a potential workaround!
This should have been obvious, but you can just set the property object on the class.
import dataclasses
import typing

@dataclasses.dataclass
class FileObject:
    uploaded_by: typing.Optional[str] = None

    def _uploaded_by_getter(self):
        return self._uploaded_by

    def _uploaded_by_setter(self, uploaded_by):
        print('Setter Called with Value ', uploaded_by)
        self._uploaded_by = uploaded_by

FileObject.uploaded_by = property(
    FileObject._uploaded_by_getter,
    FileObject._uploaded_by_setter
)
p = FileObject()
print(p)
print(p.uploaded_by)
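As a quick check of this workaround, the sketch below suggests that repr, dataclasses.asdict and dataclasses.replace all go through the property once it is attached after the decorator has run. The accessor names _get_uploaded_by and _set_uploaded_by are only illustrative, and the print in the setter is dropped to keep the output short:

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class FileObject:
    uploaded_by: Optional[str] = None


# Illustrative accessor names; they are not part of the original answer.
def _get_uploaded_by(self):
    return self._uploaded_by


def _set_uploaded_by(self, value):
    self._uploaded_by = value


# Attach the property only after @dataclass has already read the default.
FileObject.uploaded_by = property(_get_uploaded_by, _set_uploaded_by)

p = FileObject()
print(p)                                              # FileObject(uploaded_by=None)
print(dataclasses.asdict(p))                          # {'uploaded_by': None}
print(dataclasses.replace(p, uploaded_by='someone'))  # FileObject(uploaded_by='someone')
```

The value itself lives in the instance's _uploaded_by attribute, while the field seen by the dataclass helpers keeps the public name.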

An alternative take on @juanpa.arrivillaga's solution of setting properties, which may look a tad more object-oriented, was initially proposed on python-list by Peter Otten:
import dataclasses
from typing import Optional

@dataclasses.dataclass
class FileObject:
    uploaded_by: Optional[str] = None

class FileObjectExpensive(FileObject):
    @property
    def uploaded_by(self):
        return self._uploaded_by

    @uploaded_by.setter
    def uploaded_by(self, uploaded_by):
        print('Setter Called with Value ', uploaded_by)
        self._uploaded_by = uploaded_by

    def save(self):
        print(self.uploaded_by)

p = FileObjectExpensive()
p.save()
p2 = FileObjectExpensive(uploaded_by='Already Computed')
p2.save()
This outputs:
Setter Called with Value None
None
Setter Called with Value Already Computed
Already Computed
To me this approach, while not perfect in terms of removing boilerplate, is a little more readable and explicit in separating the pure data container from the behaviour on that data. It also keeps the names of all variables and properties the same.

Slightly modified solution from original question using metaclass approach - hope it helps :)
from __future__ import annotations

import dataclasses
from dataclass_wizard import property_wizard

@dataclasses.dataclass
class FileObject(metaclass=property_wizard):
    uploaded_by: str | None
    # uncomment and use for better IDE support
    # _uploaded_by: str | None = dataclasses.field(default=None)

    def save(self):
        print(self.uploaded_by)

    @property
    def uploaded_by(self):
        return self._uploaded_by

    @uploaded_by.setter
    def uploaded_by(self, uploaded_by):
        print('Setter Called with Value ', uploaded_by)
        self._uploaded_by = uploaded_by

p = FileObject()
p.save()
This outputs (as I assume is desired behavior):
Setter Called with Value None
None
Edit (4/1/22): Adding clarification for future viewers. The dataclass-wizard is a library I've created to tackle the issue of field properties with default values in dataclasses, among other things. It can be installed with pip:
$ pip install dataclass-wizard
If you are interested in an optimized approach that relies only on stdlib, I created a simple gist which uses a metaclass approach.
Here's general usage below. This will raise an error as expected when the name field is not passed in to constructor:
@dataclass
class Test(metaclass=field_property_support):
    my_int: int
    name: str
    my_bool: bool = True

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, val):
        print(f'Setting name to: {val!r}')
        self._name = val

For completeness, and with credit to @juanpa.arrivillaga, here is a proposed answer to the original question which uses decorators.
It works at least with the use cases shown, and I prefer it to the method described here because it lets us assign a default value using the normal dataclass idiom.
The key is to defeat the @dataclass machinery by creating the getter and setter on a 'dummy' property (here '_uploaded_by') and then overwriting the original attribute from outside the class.
Maybe someone more knowledgeable than I can find a way to do the overwrite within __post_init__() ...
import dataclasses

@dataclasses.dataclass
class FileObject:
    uploaded_by: str = None

    def save(self):
        print(self.uploaded_by)

    @property
    def _uploaded_by(self):
        return self._uploaded_by_attr

    @_uploaded_by.setter
    def _uploaded_by(self, uploaded_by):
        # print('Setter Called with Value ', uploaded_by)
        self._uploaded_by_attr = uploaded_by

# --- has to be called at module level ---
FileObject.uploaded_by = FileObject._uploaded_by

def main():
    p = FileObject()
    p.save()  # displays 'None'
    p = FileObject()
    p.uploaded_by = 'foo'
    p.save()  # displays 'foo'
    p = FileObject(uploaded_by='bar')
    p.save()  # displays 'bar'

if __name__ == '__main__':
    main()

Based on the solution of @juanpa.arrivillaga, I wrote the following function that makes it reusable as an additional decorator:
from dataclasses import fields

def dataprops(cls):
    """A decorator that makes dataclass fields act as properties.
    Getter and setter method names must start with `get_` and `set_`."""
    for field in fields(cls):
        setattr(cls,
                field.name,
                property(
                    getattr(cls, f'get_{field.name}'),
                    getattr(cls, f'set_{field.name}')
                )
        )
    return cls
Simple usage:
from dataclasses import dataclass

@dataprops
@dataclass
class FileObject:
    uploaded_by: str = "no_one"

    def save(self):
        print(self.uploaded_by)

    def get_uploaded_by(self):
        return self._uploaded_by

    def set_uploaded_by(self, uploaded_by):
        print('Setter Called with Value: ', uploaded_by)
        self._uploaded_by = uploaded_by
Output results:
p = FileObject()
p.save()
# output:
# Setter Called with Value: no_one
# no_one
p = FileObject("myself")
p.save()
# output:
# Setter Called with Value: myself
# myself
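One limitation of dataprops as written is that getattr raises AttributeError unless every field has both a get_ and a set_ method. A possible variation is sketched here, under the assumption that fields without accessors should simply stay plain attributes. The name dataprops_optional is made up for this example:

```python
from dataclasses import dataclass, fields


def dataprops_optional(cls):
    """Turn a dataclass field into a property only if get_<name> exists."""
    for f in fields(cls):
        getter = getattr(cls, f'get_{f.name}', None)
        setter = getattr(cls, f'set_{f.name}', None)
        if getter is not None:
            # A missing setter yields a read-only property, which the
            # generated __init__ cannot assign to, so define both in practice.
            setattr(cls, f.name, property(getter, setter))
    return cls


@dataprops_optional
@dataclass
class FileObject:
    uploaded_by: str = "no_one"
    size: int = 0  # no accessors: stays a plain attribute

    def get_uploaded_by(self):
        return self._uploaded_by

    def set_uploaded_by(self, uploaded_by):
        self._uploaded_by = uploaded_by.strip()


p = FileObject("  myself  ", size=3)
print(p)  # FileObject(uploaded_by='myself', size=3)
```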

Related

Proper way to use dataclass in another class

After asking my last question, it seems like I have not really understood classes and dataclasses.
So I would like to learn the correct way of doing the following:
- define a dataclass
- define another class, which will use an instance of the dataclass
- use a method from the second class to update the values of the dataclass
The way I do it gives me an error saying that my dataframe doesn't exist. I created a method inside the dataclass, and using that results in an error stating it is read-only.
@dataclass(slots=True)
def Storage():
    timestamp: float
    value: float

class UDP():
    some attributes
    self.datastorage: Storage = Storage()

    def updatedata(self, time, val):
        self.datastorage.timestamp = time
        self.datastorage.value = val

def main():
    test = UDP()
    test.updatedata(0.01,2)
So my question is how to instantiate a dataclass in another class and be able to manipulate the values in the dataclass?
Your code has several syntactic problems. Once those are fixed, the code works. Storage objects are mutable, and you may freely modify their timestamp and value attributes.
In [7]: @dataclass(slots=True)
   ...: class Storage:
   ...:     timestamp: float
   ...:     value: float
   ...:
   ...:
   ...: class UDP:
   ...:     datastorage: Storage = Storage(0.0, 0.0)
   ...:
   ...:     def updatedata(self, time, val):
   ...:         self.datastorage.timestamp = time
   ...:         self.datastorage.value = val
   ...:
   ...: def main():
   ...:     test = UDP()
   ...:     test.updatedata(0.01,2)
   ...:
In [8]: main()
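One thing to keep in mind with the version above: datastorage is assigned in the class body, so it is a class attribute and every UDP instance shares the same Storage object. If each UDP should own its storage, a minimal sketch could look like this. It assumes an __init__ is acceptable here, and it leaves out slots=True, which needs Python 3.10+:

```python
from dataclasses import dataclass


@dataclass
class Storage:
    timestamp: float
    value: float


class UDP:
    def __init__(self):
        # Created per instance, so two UDP objects never share state.
        self.datastorage = Storage(0.0, 0.0)

    def updatedata(self, time, val):
        self.datastorage.timestamp = time
        self.datastorage.value = val


first = UDP()
second = UDP()
first.updatedata(0.01, 2)
print(first.datastorage)   # Storage(timestamp=0.01, value=2)
print(second.datastorage)  # Storage(timestamp=0.0, value=0.0)
```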

Required positional arguments with dataclass properties

It seems there's been a fair bit of discussion about this already. I found this post particularly helpful, and it seems to provide one of the best solutions.
But there is a problem with the recommended solution.
Well, it seems to work great at first. Consider a simple test case without properties:
@dataclass
class Foo:
    x: int
>>> # Instantiate the class
>>> f = Foo(2)
>>> # Nice, it works!
>>> f.x
2
Now try to implement x as a property using the recommended solution:
@dataclass
class Foo:
    x: int
    _x: int = field(init=False, repr=False)

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value
>>> # Instantiate while explicitly passing `x`
>>> f = Foo(2)
>>> # Still appears to work
>>> f.x
2
But wait...
>>> # Instantiate without any arguments
>>> f = Foo()
>>> # Oops...! Property `x` has never been initialized. Now we have a bug :(
>>> f.x
<property object at 0x10d2a8130>
Really the expected behavior here would be:
>>> # Instantiate without any arguments
>>> f = Foo()
TypeError: __init__() missing 1 required positional argument: 'x'
It seems that the dataclass field has been overridden by the property... any thought on how to get around this?
Related:
Dataclasses and property decorator
Using a property in a dataclass that shares the name of an argument of the __init__ method has an interesting side effect. When the class is instantiated with no argument, the property object is passed as the default.
As a workaround, you can check the type of x in __post_init__.
@dataclass
class Foo:
    x: int
    _x: int = field(init=False, repr=False)

    def __post_init__(self):
        if isinstance(self.x, property):
            raise TypeError("__init__() missing 1 required positional argument: 'x'")

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value
Now when instantiating Foo, passing no argument raises the expected exception.
f = Foo()
# raises TypeError
f = Foo(1)
f
# returns
Foo(x=1)
Here is a more generalized solution for when multiple properties are being used. This uses InitVar to pass parameters to the __post_init__ method. It DOES require that the properties are listed first, and that their respective storage attributes have the same name with a leading underscore.
This is pretty hacky, and the properties no longer show up in the repr.
import inspect
from dataclasses import dataclass, field, InitVar

@dataclass
class Foo:
    x: InitVar[int]
    y: InitVar[int]
    _x: int = field(init=False, repr=False, default=None)
    _y: int = field(init=False, repr=False, default=None)

    def __post_init__(self, *args):
        # A property object among the args means no value was passed for it.
        if m := sum(isinstance(arg, property) for arg in args):
            s = 's' if m>1 else ''
            raise TypeError(f'__init__() missing {m} required positional argument{s}.')
        # Skip `self`, then store each value under its underscore name.
        arg_names = inspect.getfullargspec(self.__class__).args[1:]
        for arg_name, val in zip(arg_names, args):
            self.__setattr__('_' + arg_name, val)

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value

    @property
    def y(self):
        return self._y

    @y.setter
    def y(self, value):
        self._y = value
Using properties in dataclasses actually has a curious effect, as @James also pointed out. In actuality, this issue isn't constrained to dataclasses alone; rather, it happens due to the order in which you declare (or re-declare) a variable.
To elaborate, consider what happens when you do something like this, using just a simple class:
class Foo:
    x: int = 2

    @property
    def x(self):
        return self._x
But watch what happens when you now do:
>>> Foo.x
<property object at 0x00000263C50ECC78>
So what happened? Clearly, the property method declaration overwrote the attribute that we declared as x: int = 2.
In fact, at the time that the @dataclass decorator runs (which is once the class definition of Foo is complete), this is actually what it sees as the definition of x:
x: int = <property object at 0x00000263C50ECC78>
Confusing, right? It still sees the class annotations that are present in Foo.__annotations__, but it also sees the property object with a getter that we declared after the dataclass field. It's important to note that this result is not a bug in any way; however, since dataclasses doesn't explicitly check for a property object, it treats the value after the assignment = operator as a default value, and thus we observe a <property object at 0x00000263C50ECC78> passed in as a default value to the constructor when we don't explicitly pass a value for the field property x.
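To see this for yourself, one option is to inspect the class before and after the decorator runs. The sketch below applies dataclass as a plain function call so that both states are visible; it uses only the standard library, and the setter is added just to make the class complete:

```python
from dataclasses import dataclass, fields


class Foo:
    x: int = 2

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value


# The annotation survives, but the class attribute is now the property.
print(Foo.__annotations__)                      # {'x': <class 'int'>}
print(isinstance(Foo.__dict__['x'], property))  # True

# Applying the decorator by hand: the property becomes the field default.
Foo = dataclass(Foo)
default = fields(Foo)[0].default
print(isinstance(default, property))            # True
```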
This is actually quite an interesting consequence to keep in mind. In fact, I also came up with a section on Using Field Properties which actually goes over this same behavior and some unexpected consequences of it.
Properties with Required Values
Here's a generalized metaclass approach that might prove useful for automation purposes, assuming what you want to do is raise a TypeError when values for any field properties are not passed in the constructor. I also created an optimized, modified approach of it in a public gist.
What this metaclass does is generate a __post_init__() for the class, and for each field property declared it checks if a property object has been set as a default in the __init__() method generated by the @dataclass decorator; this indicates no value was passed in to the constructor for the field property, so a properly formatted TypeError is then raised to the caller. I adapted this metaclass approach from @James's answer above.
Note: The following example should work in Python 3.7+. It relies on the private helper dataclasses._create_fn, which is not part of the public API, so its availability and signature may differ between Python versions.
from __future__ import annotations

from collections import deque
# noinspection PyProtectedMember
from dataclasses import _create_fn
from logging import getLogger

log = getLogger(__name__)

def require_field_properties(name, bases=None, cls_dict=None) -> type:
    """
    A metaclass which ensures that values for field properties are passed in
    to the __init__() method.
    Accepts the same arguments as the builtin `type` function::
        type(name, bases, dict) -> a new type
    """
    # annotations can also be forward-declared, i.e. as a string
    cls_annotations: dict[str, type | str] = cls_dict['__annotations__']
    # we're going to be doing a lot of `append`s, so might be better to use a
    # deque here rather than a list.
    body_lines: deque[str] = deque()
    # Loop over and identify all dataclass fields with associated properties.
    # Note that dataclasses._create_fn() uses 2 spaces for the initial indent.
    for field, annotation in cls_annotations.items():
        if field in cls_dict and isinstance(cls_dict[field], property):
            body_lines.append(f'if isinstance(self.{field}, property):')
            body_lines.append(f"  missing_fields.append('{field}')")
    # only add a __post_init__() if there are field properties in the class
    if not body_lines:
        cls = type(name, bases, cls_dict)
        return cls
    body_lines.appendleft('missing_fields = []')
    # to check if there are any missing arguments for field properties
    body_lines.append('if missing_fields:')
    body_lines.append("  s = 's' if len(missing_fields) > 1 else ''")
    body_lines.append("  args = (', and' if len(missing_fields) > 2 else ' and')"
                      ".join(', '.join(map(repr, missing_fields)).rsplit(',', 1))")
    body_lines.append('  raise TypeError('
                      "f'__init__() missing {len(missing_fields)} required "
                      "positional argument{s}: {args}')")
    # does the class define a __post_init__() ?
    if '__post_init__' in cls_dict:
        fn_locals = {'_orig_post_init': cls_dict['__post_init__']}
        body_lines.append('_orig_post_init(self, *args)')
    else:
        fn_locals = None
    # generate a new __post_init__ method
    _post_init_fn = _create_fn('__post_init__',
                               ('self', '*args'),
                               body_lines,
                               globals=cls_dict,
                               locals=fn_locals,
                               return_type=None)
    # Set the __post_init__() attribute on the class
    cls_dict['__post_init__'] = _post_init_fn
    # (Optional) Print the body of the generated method definition
    log.debug('Generated a body definition for %s.__post_init__():',
              name)
    log.debug('%s\n  %s', '-------', '\n  '.join(body_lines))
    log.debug('-------')
    cls = type(name, bases, cls_dict)
    return cls
And a sample usage of the metaclass:
from dataclasses import dataclass, field
from logging import basicConfig

from metaclasses import require_field_properties

basicConfig(level='DEBUG')

@dataclass
class Foo(metaclass=require_field_properties):
    a: str
    x: int
    y: bool
    z: float
    # the following definitions are not needed
    _x: int = field(init=False, repr=False)
    _y: bool = field(init=False, repr=False)
    _z: float = field(init=False, repr=False)

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        print(f'Setting x: {value!r}')
        self._x = value

    @property
    def y(self):
        return self._y

    @y.setter
    def y(self, value):
        print(f'Setting y: {value!r}')
        self._y = value

    @property
    def z(self):
        return self._z

    @z.setter
    def z(self, value):
        print(f'Setting z: {value!r}')
        self._z = value

if __name__ == '__main__':
    foo1 = Foo(a='a value', x=1, y=True, z=2.3)
    print('Foo1:', foo1)
    print()
    foo2 = Foo('hello', 123)
    print('Foo2:', foo2)
Output now appears to be as desired:
DEBUG:metaclasses:Generated a body definition for Foo.__post_init__():
DEBUG:metaclasses:-------
  missing_fields = []
  if isinstance(self.x, property):
    missing_fields.append('x')
  if isinstance(self.y, property):
    missing_fields.append('y')
  if isinstance(self.z, property):
    missing_fields.append('z')
  if missing_fields:
    s = 's' if len(missing_fields) > 1 else ''
    args = (', and' if len(missing_fields) > 2 else ' and').join(', '.join(map(repr, missing_fields)).rsplit(',', 1))
    raise TypeError(f'__init__() missing {len(missing_fields)} required positional argument{s}: {args}')
DEBUG:metaclasses:-------
Setting x: 1
Setting y: True
Setting z: 2.3
Foo1: Foo(a='a value', x=1, y=True, z=2.3)
Setting x: 123
Setting y: <property object at 0x10c2c2350>
Setting z: <property object at 0x10c2c23b0>
Traceback (most recent call last):
...
foo2 = Foo('hello', 123)
File "<string>", line 7, in __init__
File "<string>", line 13, in __post_init__
TypeError: __init__() missing 2 required positional arguments: 'y' and 'z'
So the above solution does work as expected. However, it's a lot of code, so it's worth asking: why not write less code and set the __post_init__ in the class itself, rather than go through a metaclass? The core reason is performance: ideally you want to minimize the overhead of creating a new Foo object in the above case, for example.
So in order to explore that a bit further, I've put together a small test case to compare the performance of a metaclass approach against a __post_init__ approach using the inspect module to retrieve the field properties of the class at runtime. Here is the example code below:
import inspect
from dataclasses import dataclass, InitVar

from metaclasses import require_field_properties

@dataclass
class Foo1(metaclass=require_field_properties):
    a: str
    x: int
    y: bool
    z: float

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value

    @property
    def y(self):
        return self._y

    @y.setter
    def y(self, value):
        self._y = value

    @property
    def z(self):
        return self._z

    @z.setter
    def z(self, value):
        self._z = value

@dataclass
class Foo2:
    a: str
    x: InitVar[int]
    y: InitVar[bool]
    z: InitVar[float]

    # noinspection PyDataclass
    def __post_init__(self, *args):
        if m := sum(isinstance(arg, property) for arg in args):
            s = 's' if m > 1 else ''
            raise TypeError(f'__init__() missing {m} required positional argument{s}.')
        arg_names = inspect.getfullargspec(self.__class__).args[2:]
        for arg_name, val in zip(arg_names, args):
            # setattr calls the property defined for each field
            self.__setattr__(arg_name, val)

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value

    @property
    def y(self):
        return self._y

    @y.setter
    def y(self, value):
        self._y = value

    @property
    def z(self):
        return self._z

    @z.setter
    def z(self, value):
        self._z = value

if __name__ == '__main__':
    from timeit import timeit

    n = 1
    iterations = 1000
    print('Metaclass: ', timeit(f"""
for i in range({iterations}):
    _ = Foo1(a='a value' * i, x=i, y=i % 2 == 0, z=i * 1.5)
""", globals=globals(), number=n))
    print('InitVar: ', timeit(f"""
for i in range({iterations}):
    _ = Foo2(a='a value' * i, x=i, y=i % 2 == 0, z=i * 1.5)
""", globals=globals(), number=n))
And here are the results, when I test in a Python 3.9 environment with N=1000 iterations, with Mac OS X (Big Sur):
Metaclass: 0.0024892739999999997
InitVar: 0.034604513
Not surprisingly, the metaclass approach is more efficient overall when creating multiple Foo objects: roughly an order of magnitude faster (about 14x in the run shown above). The reason is that it only has to determine the field properties defined in a class once, and it then generates a __post_init__ specifically for those fields. The overall result is that it performs better, even though it technically requires more code and setup to get there.
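The "determine the field properties once" idea does not strictly need a metaclass or generated code. As a rough middle ground, here is a sketch of a plain mixin that caches the names on the class the first time an instance is created. RequireFieldProperties and _field_property_names are hypothetical names, the error message is simplified, and this variant has not been benchmarked here:

```python
from dataclasses import dataclass, fields


class RequireFieldProperties:
    """Hypothetical mixin: reject property objects left in field properties."""

    def __post_init__(self):
        cls = type(self)
        # Look the names up once per class and cache them on the class itself.
        names = cls.__dict__.get('_field_property_names')
        if names is None:
            names = tuple(f.name for f in fields(cls)
                          if isinstance(f.default, property))
            cls._field_property_names = names
        # A property object stored as the value means nothing was passed in.
        missing = [n for n in names if isinstance(getattr(self, n), property)]
        if missing:
            raise TypeError(
                '__init__() missing required argument(s): '
                + ', '.join(map(repr, missing)))


@dataclass
class Foo(RequireFieldProperties):
    a: str
    x: int

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value


print(Foo('hello', 1))  # Foo(a='hello', x=1)
try:
    Foo('hello')
except TypeError as exc:
    print(exc)  # __init__() missing required argument(s): 'x'
```

Because the mixin already defines __post_init__ when the decorator runs, the generated __init__ calls it without any further wiring.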
Properties with Default Values
Suppose that you instead don't want to raise an error when x is not explicitly passed in to the constructor; maybe you just want to set a default value, like None or an int value like 3 for example.
I've created a metaclass approach specifically designed to handle this scenario. There's also the original gist you can check out if you want an idea of how it was implemented (or you can also check out the source code directly if you're curious as well). In any case, here's the solution that I've come up with below; note that it involves a third-party library, as unfortunately this behavior is not baked into the dataclasses module at present.
from __future__ import annotations

from dataclasses import dataclass, field
from dataclass_wizard import property_wizard

@dataclass
class Foo(metaclass=property_wizard):
    x: int | None
    _x: int = field(init=False, repr=False)  # technically, not needed

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        print(f'Setting x to: {value!r}')
        self._x = value

if __name__ == '__main__':
    f = Foo(2)
    assert f.x == 2
    f = Foo()
    assert f.x is None
This is the output with the metaclass approach:
Setting x to: 2
Setting x to: None
And the output with the @dataclass decorator alone - also as observed in the question above:
Setting x to: 2
Setting x to: <property object at 0x000002D65A9950E8>
Traceback (most recent call last):
...
assert f.x is None
AssertionError
Specifying a Default Value
Lastly, here's an example of setting an explicit default value for the property, using a property defined with a leading underscore _ to distinguish it from the dataclass field which has a public name.
from dataclasses import dataclass
from dataclass_wizard import property_wizard

@dataclass
class Foo(metaclass=property_wizard):
    x: int = 1

    @property
    def _x(self):
        return self._x

    @_x.setter
    def _x(self, value):
        print(f'Setting x to: {value!r}')
        self._x = value

if __name__ == '__main__':
    f = Foo(2)
    assert f.x == 2
    f = Foo()
    assert f.x == 1
Output:
Setting x to: 2
Setting x to: 1

Dataclasses and property decorator

I've been reading up on Python 3.7's dataclass as an alternative to namedtuples (what I typically use when having to group data in a structure). I was wondering if dataclass is compatible with the property decorator to define getter and setter functions for the data elements of the dataclass. If so, is this described somewhere? Or are there examples available?
It sure does work:
from dataclasses import dataclass

@dataclass
class Test:
    _name: str = "schbell"

    @property
    def name(self) -> str:
        return self._name

    @name.setter
    def name(self, v: str) -> None:
        self._name = v

t = Test()
print(t.name)  # schbell
t.name = "flirp"
print(t.name)  # flirp
print(t)  # Test(_name='flirp')
In fact, why should it not? In the end, what you get is just a good old class, derived from type:
print(type(t)) # <class '__main__.Test'>
print(type(Test)) # <class 'type'>
Maybe that's why properties are nowhere mentioned specifically. However, the PEP-557's Abstract mentions the general usability of well-known Python class features:
Because Data Classes use normal class definition syntax, you are free
to use inheritance, metaclasses, docstrings, user-defined methods,
class factories, and other Python class features.
TWO VERSIONS THAT SUPPORT DEFAULT VALUES
Most published approaches don't provide a readable way to set a default value for the property, which is quite an important part of dataclass. Here are two possible ways to do that.
The first way is based on the approach referenced by @JorenV. It defines the default value in _name = field() and utilises the observation that if no initial value is specified, then the setter is passed the property object itself:
from dataclasses import dataclass, field

@dataclass
class Test:
    name: str
    _name: str = field(init=False, repr=False, default='baz')

    @property
    def name(self) -> str:
        return self._name

    @name.setter
    def name(self, value: str) -> None:
        if type(value) is property:
            # initial value not specified, use default
            value = Test._name
        self._name = value

def main():
    obj = Test(name='foo')
    print(obj)  # displays: Test(name='foo')
    obj = Test()
    obj.name = 'bar'
    print(obj)  # displays: Test(name='bar')
    obj = Test()
    print(obj)  # displays: Test(name='baz')

if __name__ == '__main__':
    main()
The second way is based on the same approach as @Conchylicultor: bypassing the dataclass machinery by overwriting the field outside the class definition.
Personally I think this way is cleaner and more readable than the first because it follows the normal dataclass idiom to define the default value and requires no 'magic' in the setter.
Even so I'd prefer everything to be self-contained... perhaps some clever person can find a way to incorporate the field update in dataclass.__post_init__() or similar?
from dataclasses import dataclass

@dataclass
class Test:
    name: str = 'foo'

    @property
    def _name(self):
        return self._my_str_rev[::-1]

    @_name.setter
    def _name(self, value):
        self._my_str_rev = value[::-1]

# --- has to be called at module level ---
Test.name = Test._name

def main():
    obj = Test()
    print(obj)  # displays: Test(name='foo')
    obj = Test()
    obj.name = 'baz'
    print(obj)  # displays: Test(name='baz')
    obj = Test(name='bar')
    print(obj)  # displays: Test(name='bar')

if __name__ == '__main__':
    main()
A solution with minimal additional code and no hidden variables is to override the __setattr__ method to do any checks on the field:
@dataclass
class Test:
    x: int = 1

    def __setattr__(self, prop, val):
        if prop == "x":
            self._check_x(val)
        super().__setattr__(prop, val)

    @staticmethod
    def _check_x(x):
        # Zero is rejected too, so the message says "greater than zero".
        if x <= 0:
            raise ValueError("x must be greater than zero")
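Because the __init__ generated by the decorator assigns fields with ordinary attribute assignment, the check also runs at construction time. A short usage sketch follows; the class is repeated in compact form so the snippet runs on its own, with the check inlined and an illustrative error message:

```python
from dataclasses import dataclass


@dataclass
class Test:
    x: int = 1

    def __setattr__(self, prop, val):
        # Inlined version of the check; non-positive values are rejected.
        if prop == "x" and val <= 0:
            raise ValueError(f"invalid value for x: {val!r}")
        super().__setattr__(prop, val)


t = Test(5)
t.x = 7   # passes the check
print(t)  # Test(x=7)

# Both construction and later assignment go through __setattr__.
for attempt in (lambda: Test(0), lambda: setattr(t, "x", -1)):
    try:
        attempt()
    except ValueError as exc:
        print("rejected:", exc)
```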
A @property is typically used to store a seemingly public argument (e.g. name) into a private attribute (e.g. _name) through getters and setters, while dataclasses generate the __init__() method for you.
The problem is that this generated __init__() method should interface through the public argument name, while internally setting the private attribute _name.
This is not done automatically by dataclasses.
In order to have the same interface (through name) for setting values and creation of the object, the following strategy can be used (Based on this blogpost, which also provides more explanation):
from dataclasses import dataclass, field

@dataclass
class Test:
    name: str
    _name: str = field(init=False, repr=False)

    @property
    def name(self) -> str:
        return self._name

    @name.setter
    def name(self, name: str) -> None:
        self._name = name
This can now be used as one would expect from a dataclass with a data member name:
my_test = Test(name='foo')
my_test.name = 'bar'
my_test.name = 'foobar'  # assign again; name is a str, so it cannot be called
print(my_test.name)
The above implementation does the following things:
- The name class member will be used as the public interface, but it does not actually store anything.
- The _name class member stores the actual content. The assignment with field(init=False, repr=False) makes sure that the @dataclass decorator ignores it when constructing the __init__() and __repr__() methods.
- The getter/setter for name actually returns/sets the content of _name.
- The initializer generated through @dataclass will use the setter that we just defined. It will not initialize _name explicitly, because we told it not to do so.
Currently, the best way I found was to overwrite the dataclass fields by property in a separate child class.
from dataclasses import dataclass, field

@dataclass
class _A:
    x: int = 0

class A(_A):
    @property
    def x(self) -> int:
        return self._x

    @x.setter
    def x(self, value: int):
        self._x = value
The class behave like a regular dataclass. And will correctly define the __repr__ and __init__ field (A(x=4) instead of A(_x=4). The drawback is that the properties cannot be read-only.
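The read-only limitation follows from how the generated __init__() works: it assigns self.x = x, so a property without a setter makes construction fail. A minimal sketch, with made-up class names:

```python
from dataclasses import dataclass


@dataclass
class _Base:
    x: int = 0


class ReadOnly(_Base):
    @property
    def x(self) -> int:
        return self._x  # no setter is defined on purpose


try:
    ReadOnly(x=4)
    failed = False
except AttributeError:
    # The inherited __init__ runs `self.x = 4`, which the property rejects.
    failed = True

print(failed)  # True
```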
This blog post tries to override the wheels dataclass attribute with a property of the same name.
However, the @property overwrites the default of the field, which leads to unexpected behavior.
from dataclasses import dataclass, field

@dataclass
class A:
    x: int

    # same as: `x = property(x)  # Overwrite any field() info`
    @property
    def x(self) -> int:
        return self._x

    @x.setter
    def x(self, value: int):
        self._x = value

A()  # `A(x=<property object at 0x7f0cf64e5fb0>)` Oops
print(A.__dataclass_fields__)  # {'x': Field(name='x',type=<class 'int'>,default=<property object at 0x>,init=True,repr=True}
One way to solve this while avoiding inheritance is to overwrite the field outside the class definition, after the dataclass decorator has been applied.
@dataclass
class A:
    x: int

def x_getter(self):
    return self._x

def x_setter(self, value):
    self._x = value

A.x = property(x_getter)
A.x = A.x.setter(x_setter)

print(A(x=1))
print(A())  # missing 1 required positional argument: 'x'
It should probably be possible to do this automatically by creating a custom metaclass and setting field(metadata={'setter': _x_setter, 'getter': _x_getter}).
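As a rough sketch of that idea, the metadata can be read back with dataclasses.fields() once the class has been processed. The sketch below uses a plain class decorator rather than a metaclass, and the helper name install_properties is hypothetical, not part of the standard library.

```python
from dataclasses import dataclass, field, fields


def install_properties(cls):
    """Hypothetical helper: turn fields carrying 'getter'/'setter' metadata into properties."""
    for f in fields(cls):
        getter = f.metadata.get("getter")
        setter = f.metadata.get("setter")
        if getter is not None:
            # Runs after @dataclass, so the field info is no longer overwritten.
            setattr(cls, f.name, property(getter, setter))
    return cls


def _x_getter(self):
    return self._x


def _x_setter(self, value):
    self._x = value


@install_properties
@dataclass
class A:
    x: int = field(default=0, metadata={"getter": _x_getter, "setter": _x_setter})


print(A().x)     # 0
print(A(x=3).x)  # 3
```

The default keeps working because the generated __init__() captured it before the class attribute was replaced by the property.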
Here's what I did to define the field as a property in __post_init__. This is a total hack, but it works with dataclasses dict-based initialization and even with marshmallow_dataclasses.
from dataclasses import dataclass, field, asdict

@dataclass
class Test:
    name: str = "schbell"
    _name: str = field(init=False, repr=False)

    def __post_init__(self):
        # Just so that we don't create the property a second time.
        if not isinstance(getattr(Test, "name", False), property):
            self._name = self.name
            Test.name = property(Test._get_name, Test._set_name)

    def _get_name(self):
        return self._name

    def _set_name(self, val):
        self._name = val

if __name__ == "__main__":
    t1 = Test()
    print(t1)
    print(t1.name)
    t1.name = "not-schbell"
    print(asdict(t1))

    t2 = Test("llebhcs")
    print(t2)
    print(t2.name)
    print(asdict(t2))
This would print:
Test(name='schbell')
schbell
{'name': 'not-schbell', '_name': 'not-schbell'}
Test(name='llebhcs')
llebhcs
{'name': 'llebhcs', '_name': 'llebhcs'}
I actually started off from this blog post mentioned elsewhere in this thread, but ran into the issue that the dataclass field was being set to type property because the decorator is applied to the class. That is,
@dataclass
class Test:
    name: str = field(default='something')
    _name: str = field(init=False, repr=False)

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, val):
        self._name = val

would make name be of type property and not str. So the setter will actually receive the property object as the argument instead of the field default.
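A short, self-contained check of that behavior; the received list is only there to record what the setter is given and is not part of the original code.

```python
from dataclasses import dataclass, field, fields

received = []  # records every value handed to the setter


@dataclass
class Test:
    name: str = field(default="something")
    _name: str = field(init=False, repr=False)

    @property
    def name(self) -> str:
        return self._name

    @name.setter
    def name(self, val) -> None:
        received.append(val)
        self._name = val


Test()  # no argument, so __init__ falls back to the "default"
name_field = fields(Test)[0]
print(isinstance(name_field.default, property))  # True
print(isinstance(received[0], property))         # True
```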
Some wrapping could be good:
#            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
#                    Version 2, December 2004
#
# Copyright (C) 2020 Xu Siyuan <inqb@protonmail.com>
#
# Everyone is permitted to copy and distribute verbatim or modified
# copies of this license document, and changing it is allowed as long
# as the name is changed.
#
#            DO WHAT THE FUCK YOU WANT TO PUBLIC LICENSE
#   TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
#
#  0. You just DO WHAT THE FUCK YOU WANT TO.

from dataclasses import dataclass, field

MISSING = object()
__all__ = ['property_field', 'property_dataclass']


class property_field:
    def __init__(self, fget=None, fset=None, fdel=None, doc=None, **kwargs):
        self.field = field(**kwargs)
        self.property = property(fget, fset, fdel, doc)

    def getter(self, fget):
        self.property = self.property.getter(fget)
        return self

    def setter(self, fset):
        self.property = self.property.setter(fset)
        return self

    def deleter(self, fdel):
        self.property = self.property.deleter(fdel)
        return self


def property_dataclass(cls=MISSING, /, **kwargs):
    if cls is MISSING:
        return lambda cls: property_dataclass(cls, **kwargs)
    remembers = {}
    for k in dir(cls):
        if isinstance(getattr(cls, k), property_field):
            remembers[k] = getattr(cls, k).property
            setattr(cls, k, getattr(cls, k).field)
    result = dataclass(**kwargs)(cls)
    for k, p in remembers.items():
        setattr(result, k, p)
    return result
You can use it like this:
@property_dataclass
class B:
    x: int = property_field(default_factory=int)

    @x.getter
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._x = value
Here's another way which allows you to have fields without a leading underscore:
from dataclasses import dataclass

@dataclass
class Person:
    name: str = property

    @name
    def name(self) -> str:
        return self._name

    @name.setter
    def name(self, value) -> None:
        self._name = value

    def __post_init__(self) -> None:
        if isinstance(self.name, property):
            self.name = 'Default'
The result is:
print(Person().name) # Prints: 'Default'
print(Person('Joel').name) # Prints: 'Joel'
print(repr(Person('Jane'))) # Prints: Person(name='Jane')
This method of using properties in dataclasses also works with asdict and is simpler too. Why? Fields that are typed with ClassVar are ignored by the dataclass, but we can still use them in our properties.
from dataclasses import dataclass
from typing import ClassVar

@dataclass
class SomeData:
    uid: str
    _uid: ClassVar[str]

    @property
    def uid(self) -> str:
        return self._uid

    @uid.setter
    def uid(self, uid: str) -> None:
        self._uid = uid
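A quick way to confirm the asdict claim is to inspect the generated fields. This is only a sketch around the class above, and the value "abc" is arbitrary.

```python
from dataclasses import asdict, dataclass, fields
from typing import ClassVar


@dataclass
class SomeData:
    uid: str
    _uid: ClassVar[str]  # ignored by dataclass: no field, no __init__ parameter

    @property
    def uid(self) -> str:
        return self._uid

    @uid.setter
    def uid(self, uid: str) -> None:
        # Stored on the instance, despite the ClassVar annotation.
        self._uid = uid


data = SomeData(uid="abc")
print([f.name for f in fields(SomeData)])  # ['uid']
print(asdict(data))                        # {'uid': 'abc'}
```

As with the other same-name layouts in this thread, calling SomeData() without an argument would still hand the property object to the setter.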
Ok, so this is my first attempt at having everything self-contained within the class.
I tried a couple of different approaches, including having a class decorator right next to @dataclass above the class definition. The issue with the decorator version is that my IDE complains if I decide to use it, and then I lose most of the type hints that the dataclass decorator provides. For example, if I'm trying to pass a field name into the constructor method, it doesn't auto-complete anymore when I add a new class decorator. I suppose that makes sense since the IDE assumes a decorator overwrites the original definition in some important way; however, that succeeded in convincing me not to go with the decorator approach.
I ended up adding a metaclass that updates the properties associated with dataclass fields so that the setter checks whether the value passed to it is a property object, as mentioned by a few other solutions, and that seems to be working well enough now. Either of the two approaches below should work for testing (based on @Martin CR's solution):
from dataclasses import dataclass, field

@dataclass
class Test(metaclass=dataclass_property_support):
    name: str = property
    _name: str = field(default='baz', init=False, repr=False)

    @name
    def name(self) -> str:
        return self._name

    @name.setter
    def name(self, value: str) -> None:
        self._name = value

    # --- other properties like these should not be affected ---
    @property
    def other_prop(self) -> str:
        return self._other_prop

    @other_prop.setter
    def other_prop(self, value):
        self._other_prop = value
And here is an approach which (implicitly) maps the property _name that begins with an underscore to the dataclass field name:
@dataclass
class Test(metaclass=dataclass_property_support):
    name: str = 'baz'

    @property
    def _name(self) -> str:
        return self._name[::-1]

    @_name.setter
    def _name(self, value: str):
        self._name = value[::-1]
I personally prefer the latter approach, because it looks a little cleaner in my opinion and also the field _name doesn't show up when invoking the dataclass helper function asdict for example.
The below should work for testing purposes with either of the approaches above. The best part is my IDE doesn't complain about any of the code either.
def main():
    obj = Test(name='foo')
    print(obj)  # displays: Test(name='foo')

    obj = Test()
    obj.name = 'bar'
    print(obj)  # displays: Test(name='bar')

    obj = Test()
    print(obj)  # displays: Test(name='baz')

if __name__ == '__main__':
    main()
Finally, here is the definition for the metaclass dataclass_property_support that now seems to be working:
from dataclasses import MISSING, Field
from functools import wraps
from typing import Dict, Any, get_type_hints


def dataclass_property_support(*args, **kwargs):
    """Adds support for using properties with default values in dataclasses."""
    cls = type(*args, **kwargs)

    # the args passed in to `type` will be a tuple of (name, bases, dict)
    cls_dict: Dict[str, Any] = args[2]

    # this accesses `__annotations__`, but should also work with sub-classes
    annotations = get_type_hints(cls)

    def get_default_from_annotation(field_: str):
        """Get the default value for the type annotated on a field"""
        default_type = annotations.get(field_)
        try:
            return default_type()
        except TypeError:
            return None

    for f, val in cls_dict.items():

        if isinstance(val, property):
            public_f = f.lstrip('_')

            if val.fset is None:
                # property is read-only, not settable
                continue

            if f not in annotations and public_f not in annotations:
                # adding this to check if it's a regular property (not
                # associated with a dataclass field)
                continue

            try:
                # Get the value of the field named without a leading underscore
                default = getattr(cls, public_f)
            except AttributeError:
                # The public field is probably type-annotated but not defined
                #   i.e. my_var: str
                default = get_default_from_annotation(public_f)
            else:
                if isinstance(default, property):
                    # The public field is a property
                    # Check if the value of underscored field is a dataclass
                    # Field. If so, we can use the `default` if one is set.
                    f_val = getattr(cls, '_' + f, None)
                    if isinstance(f_val, Field) \
                            and f_val.default is not MISSING:
                        default = f_val.default
                    else:
                        default = get_default_from_annotation(public_f)

            def wrapper(fset, initial_val):
                """
                Wraps the property `setter` method to check if we are passed
                in a property object itself, which will be true when no
                initial value is specified (thanks to @Martin CR).
                """
                @wraps(fset)
                def new_fset(self, value):
                    if isinstance(value, property):
                        value = initial_val
                    fset(self, value)
                return new_fset

            # Wraps the `setter` for the property
            val = val.setter(wrapper(val.fset, default))

            # Replace the value of the field without a leading underscore
            setattr(cls, public_f, val)

            # Delete the property if the field name starts with an underscore
            # This is technically not needed, but it supports cases where we
            # define an attribute with the same name as the property, i.e.
            #    @property
            #    def _wheels(self)
            #        return self._wheels
            if f.startswith('_'):
                delattr(cls, f)

    return cls
Update (10/2021):
I've managed to encapsulate the above logic - including support for additional edge cases - into the helper library dataclass-wizard, in case this is of interest to anyone. You can find out more about using field properties in the linked documentation as well. Happy coding!
Update (11/2021):
A more performant approach is to use a metaclass to generate a __post_init__() on the class that only runs once to fix field properties so it works with dataclasses. You can check out the gist here which I added. I was able to test it out and when creating multiple class instances, this approach is optimized as it sets everything up properly the first time __post_init__() is run.
Following a very thorough post about data classes and properties (which can be found here), this is the TL;DR version. It avoids some very ugly cases where you have to call MyClass(_my_var=2) and get strange __repr__ output:
from dataclasses import field, dataclass

@dataclass
class Vehicle:
    wheels: int
    _wheels: int = field(init=False, repr=False)

    def __init__(self, wheels: int):
        self._wheels = wheels

    @property
    def wheels(self) -> int:
        return self._wheels

    @wheels.setter
    def wheels(self, wheels: int):
        self._wheels = wheels
Just put the field definition after the property:
import dataclasses
import typing

@dataclasses.dataclass
class Test:
    @property
    def driver(self):
        print("In driver getter")
        return self._driver

    @driver.setter
    def driver(self, value):
        print("In driver setter")
        self._driver = value

    _driver: typing.Optional[str] =\
        dataclasses.field(init=False, default=None, repr=False)
    driver: typing.Optional[str] =\
        dataclasses.field(init=False, default=driver)
>>> t = Test(1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: __init__() takes 1 positional argument but 2 were given
>>> t = Test()
>>> t._driver is None
True
>>> t.driver is None
In driver getter
True
>>> t.driver = "asdf"
In driver setter
>>> t._driver == "asdf"
True
>>> t
In driver getter
Test(driver='asdf')
I'm surprised this isn't already an answer but I question its wisdom. The only reason for this answer is to include the property in the representation - because the property's backing store (_driver) is already included in comparison tests and equality tests and so on. For example, this is a common idiom:
class Test:
    def __init__(self):
        self._driver = "default"

    @property
    def driver(self):
        if self._driver == "default":
            self._driver = "new"
        return self._driver
>>> t = Test()
>>> t
<__main__.Test object at 0x6fffffec11f0>
>>> t._driver
'default'
>>> t.driver
'new'
Here is the dataclass equivalent - except that it adds the property to the representation. In the standard class, the result of (t._driver,t.driver) is ("default","new"). Notice that the result from the dataclass is instead ("new","new"). This is a very simple example but you must recognize that including properties with possible side effects in special methods may not be the best idea.
@dataclasses.dataclass
class Test:
    @property
    def driver(self):
        print("In driver getter")
        if self._driver == "default":
            self._driver = "new"
        return self._driver

    _driver: typing.Optional[str] =\
        dataclasses.field(init=False, default="default", repr=False)
    driver: typing.Optional[str] =\
        dataclasses.field(init=False, default=driver)
>>> t = Test()
>>> t
In driver getter
Test(driver='new')
>>> t._driver
'new'
>>> t.driver
In driver getter
'new'
So I would recommend just using:
@dataclasses.dataclass
class Test:
    _driver: typing.Optional[str] =\
        dataclasses.field(init=False, default="default", repr=False)

    @property
    def driver(self):
        print("In driver getter")
        if self._driver == "default":
            self._driver = "new"
        return self._driver
>>> t
Test()
>>> t._driver
'default'
>>> t.driver
In driver getter
'new'
And you can sidestep the entire issue, avoiding dataclasses for initialization, by simply using hasattr in the property getter.
@dataclasses.dataclass
class Test:
    @property
    def driver(self):
        print("In driver getter")
        if not hasattr(self, "_driver"):
            self._driver = "new"
        return self._driver

Or by using __post_init__:
@dataclasses.dataclass
class Test:
    def __post_init__(self):
        self._driver = None

    @property
    def driver(self):
        print("In driver getter")
        if self._driver is None:
            self._driver = "new"
        return self._driver
Why do this? Because init=False dataclass defaults are stored only on the class and not the instance.
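That last statement can be checked directly. The following is a minimal sketch, assuming a regular dataclass without slots; it looks at the instance dictionary before and after an assignment.

```python
import dataclasses


@dataclasses.dataclass
class Test:
    _driver: str = dataclasses.field(init=False, default="default", repr=False)


t = Test()
print("_driver" in vars(t))  # False: nothing was stored on the instance
print(Test._driver)          # default: the value lives on the class

t._driver = "new"            # the first assignment creates the instance attribute
print(vars(t))               # {'_driver': 'new'}
print(Test._driver)          # default: the class attribute is unchanged
```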
Building on the ideas above, I created a class decorator function resolve_abc_prop that creates a new class containing the getter and setter functions, as suggested by @shmee.
def resolve_abc_prop(cls):
    def gen_abstract_properties():
        """ search for abstract properties in super classes """
        for class_obj in cls.__mro__:
            for key, value in class_obj.__dict__.items():
                if isinstance(value, property) and value.__isabstractmethod__:
                    yield key, value

    abstract_prop = dict(gen_abstract_properties())

    def gen_get_set_properties():
        """ for each matching data and abstract property pair,
            create a getter and setter method """
        for class_obj in cls.__mro__:
            if '__dataclass_fields__' in class_obj.__dict__:
                for key, value in class_obj.__dict__['__dataclass_fields__'].items():
                    if key in abstract_prop:
                        def get_func(self, key=key):
                            return getattr(self, f'__{key}')

                        def set_func(self, val, key=key):
                            return setattr(self, f'__{key}', val)

                        yield key, property(get_func, set_func)

    get_set_properties = dict(gen_get_set_properties())

    new_cls = type(
        cls.__name__,
        cls.__mro__,
        {**cls.__dict__, **get_set_properties},
    )

    return new_cls
Here we define a data class AData and a mixin AOpMixin implementing operations on the data.
from dataclasses import dataclass, field, replace
from abc import ABC, abstractmethod

class AOpMixin(ABC):
    @property
    @abstractmethod
    def x(self) -> int:
        ...

    def __add__(self, val):
        return replace(self, x=self.x + val)

Finally, the decorator resolve_abc_prop is used to create a new class with the data from AData and the operations from AOpMixin.
@resolve_abc_prop
@dataclass
class A(AOpMixin):
    x: int

A(x=4) + 2  # A(x=6)
EDIT #1: I created a python package that makes it possible to overwrite abstract properties with a dataclass: dataclass-abc
After trying different suggestions from this thread I've come up with a slightly modified version of @Samsara Apathika's answer. In short: I removed the "underscore" field variable from the __init__ (so it is available for internal use, but not seen by asdict() or by __dataclass_fields__).
from dataclasses import dataclass, InitVar, field, asdict

@dataclass
class D:
    a: float = 10.                # Normal attribute with a default value
    b: InitVar[float] = 20.       # init-only attribute with a default value
    c: float = field(init=False)  # an attribute that will be defined in __post_init__

    def __post_init__(self, b):
        if not isinstance(getattr(D, "a", False), property):
            print('setting `a` to property')
            self._a = self.a
            D.a = property(D._get_a, D._set_a)

        print('setting `c`')
        self.c = self.a + b
        self.d = 50.

    def _get_a(self):
        print('in the getter')
        return self._a

    def _set_a(self, val):
        print('in the setter')
        self._a = val

if __name__ == "__main__":
    d1 = D()
    print(asdict(d1))
    print('\n')
    d2 = D()
    print(asdict(d2))
Gives:
setting `a` to property
setting `c`
in the getter
in the getter
{'a': 10.0, 'c': 30.0}
in the setter
setting `c`
in the getter
in the getter
{'a': 10.0, 'c': 30.0}
I use this idiom to get around the default value during __init__ problem. Returning None from __set__ if a property object is passed in (as is the case during __init__) will keep the initial default value untouched. Defining the default value of the private attribute as that of the previously defined public attribute, ensures the private attribute is available. Type hints are shown with the correct default value, and the comments silence the pylint and mypy warnings:
from dataclasses import dataclass, field
from pprint import pprint
from typing import Any

class dataclass_property(property):  # pylint: disable=invalid-name
    def __set__(self, __obj: Any, __value: Any) -> None:
        if isinstance(__value, self.__class__):
            return None
        return super().__set__(__obj, __value)

@dataclass
class Vehicle:
    wheels: int = 1
    _wheels: int = field(default=wheels, init=False, repr=False)

    @dataclass_property  # type: ignore
    def wheels(self) -> int:
        print("Get wheels")
        return self._wheels

    @wheels.setter  # type: ignore
    def wheels(self, val: int):
        print("Set wheels to", val)
        self._wheels = val

if __name__ == "__main__":
    pprint(Vehicle())
    pprint('#####')
    pprint(Vehicle(wheels=4))
Output:
$ python wheels.py
Get wheels
Vehicle(wheels=1)
'#####'
Set wheels to 4
Get wheels
Vehicle(wheels=4)
Type hint: [image: type hint with correct default value]
I went through the previous answers, and although most of them work, they need to tweak the dataclass itself.
I came up with an approach using a decorator, which I think is more concise:
from dataclasses import dataclass
import wrapt

def dataclass_properties(cls, property_starts='_'):
    @wrapt.decorator
    def wrapper(wrapped, instance, args, kwargs):
        properties = [prop for prop in dir(cls) if isinstance(getattr(cls, prop), property)]
        new_kwargs = {f"{property_starts}{k}" if k in properties else k: v for k, v in kwargs.items()}
        return wrapped(*args, **new_kwargs)
    return wrapt.FunctionWrapper(cls, wrapper)()

@dataclass_properties
@dataclass
class State:
    _a: int
    b: int
    _c: int

    @property
    def a(self):
        return self._a

    @a.setter
    def a(self, value):
        self._a = value

if __name__=='__main__':
    s = State(b=1,a=2,_c=1)
    print(s)  # returns: State(_a=2, b=1, _c=1)
    print(s.a)  # returns: 2
It can distinguish between properties and variables that are not properties but start with "_".
It also supports instantiation using the property's true name, in this case "_a".
if __name__=='__main__':
    s = State(b=1,_a=2,_c=1)
    print(s)  # returns: State(_a=2, b=1, _c=1)
It does not solve the problem of the representation, though.
For the use case that brought me to this page, namely to have a dataclass that is immutable, there is a simple option: use @dataclass(frozen=True). This removes the need for all the rather verbose explicit definitions of getters and setters. The option eq=True is helpful too.
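A minimal sketch of what frozen=True gives you; the Vehicle class here is only an illustration and is not taken from the post referenced below.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Vehicle:  # illustrative class
    wheels: int = 4


car = Vehicle(wheels=3)
try:
    car.wheels = 6  # any assignment after construction is rejected
except FrozenInstanceError as exc:
    print(type(exc).__name__)  # FrozenInstanceError

# eq=True is the default; together with frozen=True it makes instances hashable.
print(car == Vehicle(wheels=3))       # True
print(len({car, Vehicle(wheels=3)}))  # 1
```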
Credit: a reply from joshorr to this post, linked in a comment to the accepted answer. Also a bit of a classical case of RTFM.

How to define enum values that are functions?

I have a situation where I need to enforce and give the user the option of one of a number of select functions, to be passed in as an argument to another function:
I really want to achieve something like the following:
from enum import Enum

#Trivial Function 1
def functionA():
    pass

#Trivial Function 2
def functionB():
    pass

#This is not allowed (as far as i can tell the values should be integers)
#But pseudocode for what I am after
class AvailableFunctions(Enum):
    OptionA = functionA
    OptionB = functionB

So the following can be executed:
def myUserFunction(theFunction = AvailableFunctions.OptionA):
    #Type Check
    assert isinstance(theFunction,AvailableFunctions)

    #Execute the actual function held as value in the enum or equivalent
    return theFunction.value()
Your assumption is wrong. Values can be arbitrary, they are not limited to integers. From the documentation:
The examples above use integers for enumeration values. Using integers
is short and handy (and provided by default by the Functional API),
but not strictly enforced. In the vast majority of use-cases, one
doesn’t care what the actual value of an enumeration is. But if the
value is important, enumerations can have arbitrary values.
However the issue with functions is that they are considered to be method definitions instead of attributes!
In [1]: from enum import Enum
In [2]: def f(self, *args):
   ...:     pass
   ...:
In [3]: class MyEnum(Enum):
   ...:     a = f
   ...:     def b(self, *args):
   ...:         print(self, args)
   ...:
In [4]: list(MyEnum)  # it has no values
Out[4]: []
In [5]: MyEnum.a
Out[5]: <function __main__.f>
In [6]: MyEnum.b
Out[6]: <function __main__.MyEnum.b>
You can work around this by using a wrapper class, or just functools.partial, or (only in Python 2) staticmethod:
from functools import partial

class MyEnum(Enum):
    OptionA = partial(functionA)
    OptionB = staticmethod(functionB)
Sample run:
In [7]: from functools import partial
In [8]: class MyEnum2(Enum):
   ...:     a = partial(f)
   ...:     def b(self, *args):
   ...:         print(self, args)
   ...:
In [9]: list(MyEnum2)
Out[9]: [<MyEnum2.a: functools.partial(<function f at 0x7f4130f9aae8>)>]
In [10]: MyEnum2.a
Out[10]: <MyEnum2.a: functools.partial(<function f at 0x7f4130f9aae8>)>
Or using a wrapper class:
In [13]: class Wrapper:
    ...:     def __init__(self, f):
    ...:         self.f = f
    ...:     def __call__(self, *args, **kwargs):
    ...:         return self.f(*args, **kwargs)
    ...:
In [14]: class MyEnum3(Enum):
    ...:     a = Wrapper(f)
    ...:
In [15]: list(MyEnum3)
Out[15]: [<MyEnum3.a: <__main__.Wrapper object at 0x7f413075b358>>]
Also note that if you want you can define the __call__ method in your enumeration class to make the values callable:
In [1]: from enum import Enum
In [2]: def f(*args):
   ...:     print(args)
   ...:
In [3]: class MyEnum(Enum):
   ...:     a = partial(f)
   ...:     def __call__(self, *args):
   ...:         self.value(*args)
   ...:
In [5]: MyEnum.a(1,2,3)  # no need for MyEnum.a.value(1,2,3)
(1, 2, 3)
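Putting the pieces together for the use case in the question, here is a sketch that combines the wrapper class with a callable enum. It uses a plain wrapper class instead of functools.partial so that it does not depend on how partial objects are treated inside an Enum body. The snake_case names and the return values "A" and "B" are made up for the example.

```python
from enum import Enum


class Wrapper:
    """Callable holder; it is not a descriptor, so Enum treats it as a value."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        return self.func(*args, **kwargs)


def function_a():
    return "A"


def function_b():
    return "B"


class AvailableFunctions(Enum):
    OPTION_A = Wrapper(function_a)
    OPTION_B = Wrapper(function_b)

    def __call__(self, *args, **kwargs):
        # Forward the call to the wrapped function held as the member's value.
        return self.value(*args, **kwargs)


def my_user_function(the_function=AvailableFunctions.OPTION_A):
    if not isinstance(the_function, AvailableFunctions):
        raise TypeError("expected an AvailableFunctions member")
    return the_function()


print(my_user_function())                             # A
print(my_user_function(AvailableFunctions.OPTION_B))  # B
print(len(AvailableFunctions))                        # 2
```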
Since Python 3.11 there is a much more concise and understandable way. The member and nonmember functions were added to enum, among other improvements, so you can now do the following:
from enum import Enum, member

def fn(x):
    print(x)

class MyEnum(Enum):
    meth = fn
    mem = member(fn)

    @classmethod
    def this_is_a_method(cls):
        print('No, still not a member')

    def this_is_just_function():
        print('No, not a member')

    @member
    def this_is_a_member(x):
        print('Now a member!', x)
And now
>>> list(MyEnum)
[<MyEnum.mem: <function fn at ...>>, <MyEnum.this_is_a_member: <function MyEnum.this_is_a_member at ...>>]
>>> MyEnum.meth(1)
1
>>> MyEnum.mem(1)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'MyEnum' object is not callable
>>> MyEnum.mem.value(1)
1
>>> MyEnum.this_is_a_method()
No, still not a member
>>> MyEnum.this_is_just_function()
No, not a member
>>> MyEnum.this_is_a_member()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'MyEnum' object is not callable
>>> MyEnum.this_is_a_member.value(1)
Now a member! 1
Another less clunky solution is to put the functions in a tuple. As Bakuriu mentioned, you may want to make the enum callable.
from enum import Enum

def functionA():
    pass

def functionB():
    pass

class AvailableFunctions(Enum):
    OptionA = (functionA,)
    OptionB = (functionB,)

    def __call__(self, *args, **kwargs):
        self.value[0](*args, **kwargs)
Now you can use it like this:
AvailableFunctions.OptionA() # calls functionA
In addition to the answer of Bakuriu: if you use the wrapper approach as above, you lose information about the original function, such as __name__ and __repr__, after wrapping it. This will cause problems if, for example, you want to use Sphinx to generate source code documentation. Therefore add the following to your wrapper class.
import functools

class wrapper:
    def __init__(self, function):
        self.function = function
        # Copy __name__, __doc__ and friends from the wrapped function.
        functools.update_wrapper(self, function)

    def __call__(self, *args, **kwargs):
        return self.function(*args, **kwargs)

    def __repr__(self):
        return self.function.__repr__()
Building on top of @bakuriu's approach, I just want to highlight that we can also use dictionaries of multiple functions as values and have a broader polymorphism, similar to enums in Java. Here is a fictitious example to show what I mean:
from enum import Enum, unique

@unique
class MyEnum(Enum):
    test = {'execute': lambda o: o.test()}
    prod = {'execute': lambda o: o.prod()}

    def __getattr__(self, name):
        if name in self.__dict__:
            return self.__dict__[name]
        elif not name.startswith("_"):
            value = self.__dict__['_value_']
            return value[name]
        raise AttributeError(name)

class Executor:
    def __init__(self, mode: MyEnum):
        self.mode = mode

    def test(self):
        print('test run')

    def prod(self):
        print('prod run')

    def execute(self):
        self.mode.execute(self)

Executor(MyEnum.test).execute()
Executor(MyEnum.prod).execute()
Obviously, the dictionary approach provides no additional benefit when there is only a single function, so use this approach when there are multiple functions. Ensure that the keys are uniform across all values as otherwise, the usage won't be polymorphic.
The __getattr__ method is optional; it is only there for syntactic sugar (i.e., without it, mode.execute() would become mode.value['execute']()).
Since dictionaries can't be made readonly, using namedtuple would be better and require only minor changes to the above.
from enum import Enum, unique
from collections import namedtuple

EnumType = namedtuple("EnumType", "execute")

@unique
class MyEnum(Enum):
    test = EnumType(lambda o: o.test())
    prod = EnumType(lambda o: o.prod())

    def __getattr__(self, name):
        if name in self.__dict__:
            return self.__dict__[name]
        elif not name.startswith("_"):
            value = self.__dict__['_value_']
            return getattr(value, name)
        raise AttributeError(name)

Converting an object into a subclass in Python?

Let's say I have a library function that I cannot change that produces an object of class A, and I have created a class B that inherits from A.
What is the most straightforward way of using the library function to produce an object of class B?
edit- I was asked in a comment for more detail, so here goes:
PyTables is a package that handles hierarchical datasets in python. The bit I use most is its ability to manage data that is partially on disk. It provides an 'Array' type which only comes with extended slicing, but I need to select arbitrary rows. Numpy offers this capability - you can select by providing a boolean array of the same length as the array you are selecting from. Therefore, I wanted to subclass Array to add this new functionality.
In a more abstract sense this is a problem I have considered before. The usual solution is as has already been suggested: have a constructor for B that takes an A and additional arguments, and then pulls out the relevant bits of A to insert into B. As it seemed like a fairly basic problem, I asked the question to see if there were any standard solutions I wasn't aware of.
This can be done if the initializer of the subclass can handle it, or you write an explicit upgrader. Here is an example:
class A(object):
    def __init__(self):
        self.x = 1

class B(A):
    def __init__(self):
        super(B, self).__init__()
        self._init_B()

    def _init_B(self):
        self.x += 1

a = A()
b = a
b.__class__ = B
b._init_B()
assert b.x == 2
Since the library function returns an A, you can't make it return a B without changing it.
One thing you can do is write a function to take the fields of the A instance and copy them over into a new B instance:
class A:  # defined by the library
    def __init__(self, field):
        self.field = field

class B(A):  # your fancy new class
    def __init__(self, field, field2):
        self.field = field
        self.field2 = field2  # B has some fancy extra stuff

def b_from_a(a_instance, field2):
    """Given an instance of A, return a new instance of B."""
    return B(a_instance.field, field2)

a = A("spam")  # this could be your A instance from the library
b = b_from_a(a, "ham")  # make a new B which has the data from a
print(b.field, b.field2)  # prints "spam ham"
Edit: depending on your situation, composition instead of inheritance could be a good bet; that is your B class could just contain an instance of A instead of inheriting:
class B2:  # doesn't have to inherit from A
    def __init__(self, a, field2):
        self._a = a  # using composition instead
        self.field2 = field2

    @property
    def field(self):  # pass accesses to a
        return self._a.field

    # could provide setter, deleter, etc

a = A("spam")
b = B2(a, "ham")
print(b.field, b.field2)  # prints "spam ham"
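When the wrapped class has many attributes, writing one property per attribute gets tedious. A common alternative, shown here only as a sketch, is to forward unknown attribute lookups with __getattr__. The class name B3 and the shout() method are made up for the example.

```python
class A:  # stand-in for the library class
    def __init__(self, field):
        self.field = field

    def shout(self):
        return self.field.upper()


class B3:  # hypothetical name, continuing the B/B2 series
    def __init__(self, a, field2):
        self._a = a
        self.field2 = field2

    def __getattr__(self, name):
        # Only called when normal lookup fails, so field2 and _a never reach
        # this method once they are set. The guard prevents infinite recursion
        # if _a itself is missing (for example during copying or unpickling).
        if name == "_a":
            raise AttributeError(name)
        return getattr(self._a, name)


b = B3(A("spam"), "ham")
print(b.field, b.field2)  # spam ham
print(b.shout())          # SPAM
```

Unlike the subclass, B3 is not an instance of A, so isinstance checks inside the library will not accept it.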
You can actually change the .__class__ attribute of the object if you know what you're doing:
In [1]: class A(object):
   ...:     def foo(self):
   ...:         return "foo"
   ...:
In [2]: class B(object):
   ...:     def foo(self):
   ...:         return "bar"
   ...:
In [3]: a = A()
In [4]: a.foo()
Out[4]: 'foo'
In [5]: a.__class__
Out[5]: __main__.A
In [6]: a.__class__ = B
In [7]: a.foo()
Out[7]: 'bar'
Monkeypatch the library?
For example,
import other_library
other_library.function_or_class_to_replace = new_function
Poof, it returns whatever you want it to return.
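One way this could look in practice is to wrap the library function so that its result is upgraded to B before being returned. This is only a sketch: other_library is simulated with a SimpleNamespace, and make_a, returning_b and describe are hypothetical names.

```python
import functools
import types


class A:  # stand-in for the library class
    def __init__(self, field):
        self.field = field


class B(A):
    def describe(self):
        return f"B({self.field})"


# Stand-in for the third-party module and its factory function.
other_library = types.SimpleNamespace(make_a=lambda: A("spam"))


def returning_b(func):
    """Wrap func so that the A instance it returns is re-classed as B."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        obj = func(*args, **kwargs)
        obj.__class__ = B  # works here because B adds behavior, not new state
        return obj
    return wrapper


other_library.make_a = returning_b(other_library.make_a)

obj = other_library.make_a()
print(type(obj).__name__)  # B
print(obj.describe())      # B(spam)
```

Code that imported the function by name before the patch was applied keeps its reference to the original.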
Monkeypatch A.__new__ to return an instance of B?
After you call obj = A(), change the result so obj.__class__ = B?
Depending on the use case, you can now hack a dataclass to arguably make the composition solution a little cleaner:
from dataclasses import dataclass, fields

@dataclass
class B:
    field: int  # Only adds 1 line per field instead of a whole @property method

    @classmethod
    def from_A(cls, a):
        # Copy every field declared on B from the A instance.
        return cls(**{
            f.name: getattr(a, f.name)
            for f in fields(cls)
        })
