Is it possible to have something like
class MyAbstract {
final int myFieldSomebodyHasToDefine;
}
class MyAbstractImplementation extends MyAbstract {
final int myFieldSomebodyHasToDefine = 5;
}
using dataclasses in python?
Before Python 3.8 there is no straightforward way. Python 3.8 added the final decorator to the typing module; after importing it, you can apply it to methods and classes.
You can also use the Final type annotation for values.
Here is an example:
from typing import final, Final

@final
class Base:
    @final
    def h(self) -> None:
        print("old")

class Child(Base):
    # Bad overriding
    def h(self) -> None:
        print("new")

if __name__ == "__main__":
    b = Base()
    b.h()
    c = Child()
    c.h()
    RATE: Final = 3000
    # Bad value assignment
    RATE = 7
    print(RATE)
Important note: Python does not enforce final or Final at runtime. You can still override methods and reassign values as you wish. They are mostly informative for developers and are checked by static type checkers.
For more information, you may visit: https://peps.python.org/pep-0591/
Update: Here is also an example with a dataclass:
from dataclasses import dataclass
from typing import Final

@dataclass
class Item:
    """Class for keeping track of an item in inventory."""
    price: float
    quantity_on_hand: int = 0
    name: Final[str] = "ItemX"

    def total_cost(self) -> float:
        return self.price * self.quantity_on_hand
As you can see, name is a final field. However, as with any dataclass field that has a default value, it must come after all of the fields without a default.
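A minimal sketch, assuming a dataclass shaped like the Item above, of what "not enforced at runtime" means for a Final field. The instance name `item` and the values are illustrative only. A static type checker is expected to flag the reassignment, but the interpreter accepts it:

```python
from dataclasses import dataclass
from typing import Final


@dataclass
class Item:
    price: float
    quantity_on_hand: int = 0
    name: Final[str] = "ItemX"  # treated as an ordinary field at runtime


item = Item(price=2.5, quantity_on_hand=4)

# A type checker should report this line; plain Python runs it without error.
item.name = "Renamed"
print(item.name)
```

If the value must be protected at runtime as well, a frozen dataclass is one option to consider.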
I am creating a new dataclass in Python.
@dataclass
class User(Mixin):
    id: int = None
    items: List[DefaultItem] = None
Here items is a list of DefaultItem objects, but I need it to accept several possible object types, like:
items: List[DefaultItem OR SomeSpecificItem OR SomeOtherItem] = None
How can I do something like this in Python?
You can use typing.Union for this.
items: List[Union[DefaultItem, SomeSpecificItem, SomeOtherItem]] = None
And if you are on Python 3.10 or later, there is a convenient shorthand notation:
items: list[DefaultItem | SomeSpecificItem | SomeOtherItem] = None
Also just as a note: If items is allowed to be None, you should mark the type as Optional.
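As a rough sketch of the Optional suggestion, assuming three placeholder item classes and leaving out the Mixin base for brevity (the `AnyItem` alias is my own name, not part of the question):

```python
from dataclasses import dataclass
from typing import List, Optional, Union


class DefaultItem: ...
class SomeSpecificItem: ...
class SomeOtherItem: ...


# Hypothetical alias that keeps the field annotation readable
AnyItem = Union[DefaultItem, SomeSpecificItem, SomeOtherItem]


@dataclass
class User:
    # Optional[...] states explicitly that None is an allowed value
    id: Optional[int] = None
    items: Optional[List[AnyItem]] = None


u = User(id=1, items=[DefaultItem(), SomeOtherItem()])
print(User())
```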
Also note that in Python 3.10 you can pass the kw_only parameter to the @dataclass decorator to work around the issue I suspect you are having: every field in a subclass must have a default value when at least one field in the superclass (Mixin in this case) has a default value.
I added an example below to illustrate this a little better:
from dataclasses import dataclass

@dataclass
class Mixin:
    string: str
    integer: int = 222

@dataclass(kw_only=True)
class User(Mixin):
    id: int
    items: list['A | B | C']

class A: ...
class B: ...
class C: ...

u = User(string='abc', id=321, integer=123, items=[])
print(u)
Note that I've also wrapped the Union arguments in a string, so that the expression is forward-declared (i.e. not evaluated yet), since the classes in the Union arguments are defined a bit later.
This code works in 3.10 because kw_only=True makes the fields declared in User keyword-only, which moves them after the inherited fields in the generated __init__. This lets you work around the issue mentioned above, where you would otherwise need to define a default value for all fields in a subclass when there is at least one default field in a parent class.
In Python versions earlier than 3.10, which lack the kw_only argument, you would run into a TypeError like this:
TypeError: non-default argument 'id' follows default argument
The workaround for this in a pre-3.10 scenario is exactly how you had it: define a default value for all fields in the User class as below.
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Mixin:
    string: str
    integer: int = 222

@dataclass
class User(Mixin):
    id: int = None
    items: list[A | B | C] = field(default_factory=list)

class A: ...
class B: ...
class C: ...

u = User('abc', 123, 321)
print(u)
I am learning about Dataclasses but I am confused on the purpose of sort_index and how it actually works.
I can't seem to find any valuable information on it. The official Python documentation doesn't mention it, which is mind boggling.
Here is an example:
@dataclass(order=True)
class Person:
    sort_index: int = field(init=False, repr=False)
    name: str
    age: int
    weight: int = 190

    def __post_init__(self):
        self.sort_index = self.weight
So, what is the purpose of sort_index? What is it used for? When do I use it?
Thanks again for taking the time to answer my question. I am new to Python.
The name sort_index is irrelevant; any identifier works. What matters is that it is the first field declared in the class, and its value is set in the __post_init__ method.
With order=True the comparison methods (__lt__, __gt__, etc.; read about dunder methods if unfamiliar) are generated implicitly. They compare instances field by field in the order the fields are declared, so the first field decides the result and the remaining fields are used only to break ties.
Class constructor
from dataclasses import dataclass, field

@dataclass(order=True)
class Person:
    sort_index: int = field(init=False)
    age: int

    def __post_init__(self):
        self.sort_index = self.age
First example, where the age attribute is equal:
>>> p1 = Person(age=10)
>>> p2 = Person(age=10)
>>> p1 == p2
True
Second example—age is greater:
>>> p1 = Person(age=10)
>>> p2 = Person(age=20)
>>> p2 > p1
True
More complex example:
from dataclasses import dataclass, field

@dataclass(order=True)
class Person:
    foo: int = field(init=False, repr=False)
    bar: int = field(init=False, repr=False)
    name: str
    age: int
    weight: int = 190

    def __post_init__(self):
        self.foo = self.weight
        self.bar = self.age

>>> p1 = Person('p1', 10)
>>> p2 = Person('p1', 11)
>>> p2 > p1
True
Reason
foo (weight) is equal for both instances, so comparison is done on bar (age)
Conclusion
The comparisons can be arbitrarily complex, and identifiers are not important.
I highly recommend this video on dataclasses, by ArjanCodes.
Apart from the video, here's a github link to example dataclass code (from the same video).
Hope this helped—I just learned about dataclasses myself.
Finally I've found the simple truth about that.
First, 'sort_index', or whatever you want to call this attribute, is not useful unless you need to sort instances by a value that is only defined after __init__ has run (that is, in __post_init__).
All the tricky behaviour comes from how @dataclass(order=True) works.
It generates the comparison methods, so direct comparisons like var1 > var2 do work, but its main use is sorting your objects, for example when you store them in a list that you then sort.
The sorting is done like this (the objects must of course be instances of the same class):
compare the first attribute to sort the objects
in case of equality -> compare with the second attribute, etc...
So the order in which the attributes are written matters. That is why one may use a 'sort_index': simply to put this attribute in first place even though its value is not set in __init__, but afterwards.
(I've found a good explanation in this video)
@dataclass(order=True)
class Person:
    sort_index: int = field(init=False)  # <- not defined yet
    age: int
    name: str

    def __post_init__(self):
        self.sort_index = self.age  # <- definition's here

# if you try this:
print(person_1 == person_2)
# and get 'True', it means that all the values of the attributes of person_1
# and person_2 are strictly the same, not only 'sort_index'
In this example the first sorting attribute is sort_index, which is just the age, so it is not a very good example. A better attribute could be an autogenerated id assigned after the object is initialised, but even then it would be easier to do:
@dataclass(order=True)
class Person:
    id: int = field(init=False, default_factory=get_an_id_function)
    age: int
    name: str

# Where get_an_id_function is a function that returns an id
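The field-by-field ordering described above can be checked with a short sketch. The names and ages are made up for illustration; the class follows the sort_index pattern from the question:

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Person:
    # Declared first, so it is compared first; filled in after __init__
    sort_index: int = field(init=False, repr=False)
    name: str
    age: int

    def __post_init__(self):
        self.sort_index = self.age


people = [Person("Cy", 40), Person("Bo", 25), Person("Al", 25)]

# sorted() relies on the generated __lt__: sort_index first, then name, then age
ordered = sorted(people)
print([p.name for p in ordered])  # ['Al', 'Bo', 'Cy']
```

The two 25-year-olds tie on sort_index, so the next declared field (name) decides their order.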
Is there an inverse function for Type[SomeType] so that Instance[Type[SomeType]] == SomeType?
I'm given a class and I'd like to annotate the return value of calling its constructor
class FixedSizeUInt(int):
    size: int = 0

    def __new__(cls, value: int):
        cls_max: int = cls.max_value()
        if not 0 <= value <= cls_max:
            raise ValueError(f"{value} is outside range " +
                             f"[0, {cls_max}]")
        new: Callable[[cls, int], Instance[cls]] = super().__new__  ### HERE
        return new(cls, value)

    @classmethod
    def max_value(cls) -> int:
        return 2**(cls.size) - 1
Edit:
This class is abstract, it needs to be subclassed for it to make sense, as a size of 0 only allows for 0 as its value.
class NodeID(FixedSizeUInt):
    size: int = 40

class NetworkID(FixedSizeUInt):
    size: int = 64
Edit 2: For this specific case, using generics will suffice, as explained in https://stackoverflow.com/a/39205612/5538719 . Still, the question of an inverse of Type remains. Maybe the question then is: Will generics cover every case so that an inverse function is never needed?
I believe you want:
new: Callable[[Type[FixedSizeUInt], int], FixedSizeUInt] = ...
Or a little more dynamically:
from typing import Callable, Type, TypeVar

T = TypeVar('T')
...
    def __new__(cls: Type[T], value: int):
        ...
        new: Callable[[Type[T], int], T] = ...
Still, the question of an inverse of Type remains. Maybe the question then is: Will generics cover every case so that an inverse function is never needed?
It's not about generics, it's about type hints in general. Take int as an example. int is the class. int() creates an instance of the class. In type hints, int means instance of int. Using a class as a type hint always talks about an instance of that type, not the class itself. Because talking about instances-of is the more typical case, talking about the class itself is less common.
So, you need to use a class in a type hint and a class in a type hint means instance of that class. Logically, there's no need for an Instance[int] type hint, since you cannot have a non-instance type hint to begin with. On the contrary, a special type hint Type[int] is needed for the special case that you want to talk about the class.
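To make the distinction concrete, here is a small sketch under the assumption that a bound TypeVar is acceptable for this class. The helper `make` is hypothetical and not part of the question. Type[T] annotates the class object, and plain T annotates an instance of it:

```python
from typing import Type, TypeVar

T = TypeVar("T", bound="FixedSizeUInt")


class FixedSizeUInt(int):
    size: int = 0

    def __new__(cls: Type[T], value: int) -> T:
        cls_max = cls.max_value()
        if not 0 <= value <= cls_max:
            raise ValueError(f"{value} is outside range [0, {cls_max}]")
        return super().__new__(cls, value)

    @classmethod
    def max_value(cls) -> int:
        return 2 ** cls.size - 1


class NodeID(FixedSizeUInt):
    size: int = 40


def make(kind: Type[T], value: int) -> T:
    # kind is the class itself; the return value is an instance of that class
    return kind(value)


node = make(NodeID, 7)
print(type(node).__name__, int(node))
```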
Let's say I want to store some information about a conference schedule with a presentation time and a pause time. I can do this in a NamedTuple.
from typing import NamedTuple

class BlockTime(NamedTuple):
    t_present: float
    t_pause: float
However, if I also want to store how much each block would take such that t_each = t_pause + t_present, I can't just add it as an attribute:
class BlockTime(NamedTuple):
    t_present: float
    t_pause: float
    # this causes an error
    t_each = t_present + t_pause
What is the correct way to do this in Python? I could write an __init__(self) method and store it as an instance variable there, but it would then be mutable.
In case it would be okay that it's not really stored but calculated dynamically you could use a simple property for it.
from typing import NamedTuple

class BlockTime(NamedTuple):
    t_present: float
    t_pause: float

    @property
    def t_each(self):
        return self.t_present + self.t_pause

>>> b = BlockTime(10, 20)
>>> b.t_each  # only available as property, not in the representation nor by indexing or iterating
30
That has the advantage that you can never (not even accidentally) store a wrong value for it. However at the expense of not actually storing it at all. So in order to appear as if it were stored you'd have to at least override __getitem__, __iter__, __repr__ which is likely too much trouble.
For example the NamedTuple approach given by Patrick Haugh has the downside that it's still possible to create inconsistent BlockTimes or lose parts of the namedtuple convenience:
>>> b = BlockTime.factory(1.0, 2.0)
>>> b._replace(t_present=20)
BlockTime(t_present=20, t_pause=2.0, t_each=3.0)
>>> b._make([1, 2])
TypeError: Expected 3 arguments, got 2
The fact that you actually have a "computed" field that has to be in sync with other fields already indicates that you probably shouldn't store it at all to avoid inconsistent state.
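For comparison, a brief sketch of how the property-based version behaves under _replace. The values are illustrative. Because t_each is computed on access, it cannot drift out of sync with the other fields:

```python
from typing import NamedTuple


class BlockTime(NamedTuple):
    t_present: float
    t_pause: float

    @property
    def t_each(self) -> float:
        # Computed on every access, never stored
        return self.t_present + self.t_pause


b = BlockTime(1.0, 2.0)
longer = b._replace(t_present=20.0)

print(b.t_each)       # 3.0
print(longer.t_each)  # 22.0
```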
You can make a classmethod that builds BlockTime objects
from typing import NamedTuple

class BlockTime(NamedTuple):
    t_present: float
    t_pause: float
    t_each: float

    @classmethod
    def factory(cls, present, pause):
        return cls(present, pause, present + pause)

print(BlockTime.factory(1.0, 2.0))
# BlockTime(t_present=1.0, t_pause=2.0, t_each=3.0)
EDIT:
Here's a solution using the new Python 3.7 dataclass
from dataclasses import dataclass, field

@dataclass(frozen=True)
class BlockTime:
    t_present: float
    t_pause: float
    t_each: float = field(init=False)

    def __post_init__(self):
        object.__setattr__(self, 't_each', self.t_present + self.t_pause)
Frozen dataclasses aren't totally immutable but they're pretty close, and this lets you have natural looking instance creation BlockTime(1.0, 2.0)
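A short sketch of what "pretty close to immutable" looks like in practice, assuming the frozen BlockTime shown above. Ordinary assignment is rejected, and dataclasses.replace builds a new instance through __init__, so __post_init__ recomputes t_each:

```python
from dataclasses import FrozenInstanceError, dataclass, field, replace


@dataclass(frozen=True)
class BlockTime:
    t_present: float
    t_pause: float
    t_each: float = field(init=False)

    def __post_init__(self):
        # Bypass the frozen __setattr__ once, during initialisation
        object.__setattr__(self, "t_each", self.t_present + self.t_pause)


b = BlockTime(1.0, 2.0)

try:
    b.t_each = 99.0  # normal assignment is blocked on a frozen dataclass
    blocked = False
except FrozenInstanceError:
    blocked = True

# replace() creates a new object, so the derived field stays consistent
longer = replace(b, t_present=20.0)
print(blocked, longer.t_each)
```

Calling object.__setattr__ directly from outside the class would still get around the freeze, which is the sense in which the immutability is not total.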
You can't override __new__ or __init__ in a class whose direct parent is NamedTuple. But you can override __new__ in a class that inherits from another class whose parent is NamedTuple.
So you can do something like this:
from typing import NamedTuple

class BlockTimeParent(NamedTuple):
    t_present: float
    t_pause: float
    t_each: float

class BlockTime(BlockTimeParent):
    def __new__(cls, t_present, t_pause):
        return super().__new__(cls, t_present, t_pause, t_present + t_pause)

b = BlockTime(1, 2)
print(b)
# BlockTime(t_present=1, t_pause=2, t_each=3)
I have an enum defined like this:
def enum(**enums):
    return type('Enum', (), enums)

Status = enum(
    STATUS_OK=0,
    STATUS_ERR_NULL_POINTER=1,
    STATUS_ERR_INVALID_PARAMETER=2)
I have a function that returns status as Status enum.
How can I get the name of the enum value, and not just value?
>>> cur_status = get_Status()
>>> print(cur_status)
1
I would like to get STATUS_ERR_NULL_POINTER, instead of 1
You'd have to loop through the class attributes to find the matching name:
name = next(name for name, value in vars(Status).items() if value == 1)
The generator expression loops over the attributes and their values (taken from the dictionary produced by the vars() function) then returns the first one that matches the value 1.
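One caveat with the bare next() call is that it raises StopIteration when no attribute matches. A possible variant, sketched here with a hypothetical helper named `status_name`, passes a default to next() and skips the dunder attributes that vars() also returns:

```python
def enum(**enums):
    return type('Enum', (), enums)


Status = enum(
    STATUS_OK=0,
    STATUS_ERR_NULL_POINTER=1,
    STATUS_ERR_INVALID_PARAMETER=2)


def status_name(value, default=None):
    # Hypothetical helper: reverse lookup from value to attribute name
    return next(
        (name for name, member in vars(Status).items()
         if not name.startswith('__') and member == value),
        default,
    )


print(status_name(1))   # STATUS_ERR_NULL_POINTER
print(status_name(99))  # None
```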
Enumerations are better modelled by the enum library, available in Python 3.4 and later or as a backport for earlier versions:
from enum import Enum

class Status(Enum):
    STATUS_OK = 0
    STATUS_ERR_NULL_POINTER = 1
    STATUS_ERR_INVALID_PARAMETER = 2
giving you access to the name and value:
name = Status(1).name # gives 'STATUS_ERR_NULL_POINTER'
value = Status.STATUS_ERR_NULL_POINTER.value # gives 1
2021 update:
These answers are out of date. Using Python's standard Enum class,
cur_status.name
will return the name. (STATUS_ERR_NULL_POINTER)
To look up the enum knowing the name:
s = Status['STATUS_ERR_NULL_POINTER']
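Putting the two lookups together, here is a minimal sketch that also handles a raw value with no matching member. The `describe` helper is my own name for illustration:

```python
from enum import Enum


class Status(Enum):
    STATUS_OK = 0
    STATUS_ERR_NULL_POINTER = 1
    STATUS_ERR_INVALID_PARAMETER = 2


def describe(raw):
    # Status(raw) looks a member up by value and raises ValueError if none matches
    try:
        return Status(raw).name
    except ValueError:
        return None


print(describe(1))                # STATUS_ERR_NULL_POINTER
print(describe(42))               # None
print(Status['STATUS_OK'].value)  # 0
```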
I'm not sure in which Python version it was introduced, but the private attribute _value2member_map_ gives you what you want.
from enum import Enum

class Status(Enum):
    STATUS_OK = 0
    STATUS_ERR_NULL_POINTER = 1
    STATUS_ERR_INVALID_PARAMETER = 2

str(Status._value2member_map_[1])
Out:
'Status.STATUS_ERR_NULL_POINTER'
You don't need to loop through the Enum class but just access _member_map_.
>>> Status._member_map_['STATUS_OK']
<Status.STATUS_OK: 0>
For some reason, most of the methods above did not work for me. All methods return the Enum type as an integer. I'm working with Python 3.7.
In my solution, I defined class function to handle this. It's not purely pythonic, but worked well enough for my case.
from enum import Enum

class Status(Enum):
    STATUS_OK = 0
    STATUS_ERR_NULL_POINTER = 1
    STATUS_ERR_INVALID_PARAMETER = 2

    @classmethod
    def name_of(cls, val):
        # Called name_of so it does not shadow the built-in .name attribute.
        # Accepts a member or its raw value; returns None if nothing matches.
        try:
            return cls(val).name
        except ValueError:
            return None

# test it
stat = Status.STATUS_OK
print(Status.name_of(stat))
Prints: STATUS_OK
It may seem obvious that we asked for the status after giving it the status, but in my case, this is set programmatically elsewhere