I have this dataclass:
@dataclass
class Couso:
    nome: str
    date: str = field(default=datetime.now(), init=False)
    id_: str = field(default=key())
Here key() is a simple function that returns a str of length 32.
When I create multiple instances of it (without specifying the id_, obviously), they all share the same id_.
Why does it work this way? I can't understand it.
Also, would the same thing happen with the date attribute?
key is called before field is called to create the field, so that every instance will have the same default id_ attribute. It's the same as if you had written
x = key()
@dataclass
class Couso:
    ...
    id_: str = field(default=x)
If you want to call key each time you create a new instance, use default_factory instead.
id_: str = field(default_factory=key) # key is not called; it's passed as an object.
The same goes for datetime.now:
date: str = field(default_factory=datetime.now, init = False)
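To make the difference concrete, here is a minimal, self-contained sketch. The key() below is a hypothetical stand-in (it uses uuid4().hex to get a 32-character string); the original key() was not shown, so treat it as an assumption.

```python
from dataclasses import dataclass, field
from datetime import datetime
from uuid import uuid4


def key() -> str:
    """Hypothetical stand-in for the question's key(): a 32-character string."""
    return uuid4().hex


@dataclass
class Couso:
    nome: str
    # default_factory receives the callable itself; it is called once per instance.
    date: datetime = field(default_factory=datetime.now, init=False)
    id_: str = field(default_factory=key)


first = Couso("a")
second = Couso("b")
print(first.id_ != second.id_)  # each instance gets its own id_
```

With default=key() instead, key() would run once while the class body is evaluated, and both instances would share that single value.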
Related
Consider the following python code:
from dataclasses import dataclass
@dataclass
class Registration:
    category: str = 'new'
@dataclass
class Car:
    make: str = None
    category: str = None
    reg: Registration = None
    def __post_init__(self):
        ''' fill in any missing fields from the registration of car '''
        if self.reg:
            for var in vars(self.reg):
                # look the attribute up by name; self.var would look for an attribute literally called "var"
                if not getattr(self, var):
                    setattr(self, var, getattr(self.reg, var))
r = Registration()
a = Car(make='ford', category='used', reg=r)
# it's unknown if b is used/new, so we explicitly pass it None
b = Car(make='ford', category=None, reg=r)
In the above example, __post_init__ is supposed to fill in the fields of Car that were not passed in when the Car object was created. However, if None was explicitly passed in as the field value (here, for category), it should not be overwritten from the Registration object. But the above code does overwrite it. How do I detect which values were explicitly passed in during object creation and which are defaults?
I'd be surprised if there were a way to distinguish between
a None passed explicitly vs one that the object acquired via
its defaults. In situations like yours, one technique is to use
a kind of sentinel value as the default.
@dataclass
class Car:
    NO_ARG = object()
    make: str = None
    category: str = NO_ARG
    reg: Registration = None
    def __post_init__(self):
        if self.reg:
            for var in vars(self.reg):
                if getattr(self, var) is self.NO_ARG:
                    setattr(self, var, getattr(self.reg, var))
However, you might also take the awkward situation you find yourself
in as a signal that perhaps there's a better way to model your
objects. Without knowing more about the
broader context it's difficult to offer definitive advice, but
I would say that your current strategy strikes me as fishy, so I
would encourage you to think some more about your OO plan.
To give one example of an alternative model, rather than using the Registration to
overwrite the attributes of a Car, you could instead build a property
to expose the Registration attribute when the Car attribute
is missing. A user of the class can decide whether they want
the category strictly from the Car or they are happy to take
the fallback value from the Registration, if available. This approach
comes with tradeoffs as well.
@dataclass
class Car:
    make: str = None
    category: str = None
    reg: Registration = None
    @property
    def category_reg(self):
        if self.category is None and self.reg:
            return self.reg.category
        else:
            return self.category
I'm trying to create a model class with the attrs module. In my script, the User class inherits from the CollectionModel class, so logically all the attributes of CollectionModel should be available in User. But when I try to create a new instance of User from a dictionary (which is possible with attrs), it reports that the attributes of CollectionModel are not present in User.
My Script:
from bson import ObjectId
from attrs import asdict, define, field, validators, Factory
import time
@define
class CollectionModel:
    """ Base Class For All Collection Schema"""
    _id: str = field(converter=str, default=Factory(ObjectId))
    _timestamp: float = field(default=Factory(time.time))
    def get_dict(self):
        return asdict(self)
    def update(self):
        self._timestamp = time.time()
@define
class User(CollectionModel):
    username: str = field(factory=str, validator=validators.instance_of(str))
    userType: str = field(factory=str, validator=validators.instance_of(str))
    password: str = field(factory=str, validator=validators.instance_of(str))
user_object = User()
new_user_object = User(**asdict(user_object))
Here, I'm trying to create a new User object from the user_object. It shows the following error.
TypeError: User.__init__() got an unexpected keyword argument '_id'
I guessed that the parent class is not initiated at first. So I tried to initiate it with the super() function and according to attrs documentation it should be done with __attrs_pre_init__. From the documentation:
The sole reason for the existence of __attrs_pre_init__ is to give
users the chance to call super().__init__(), because some
subclassing-based APIs require that.
So the modified child class becomes like this.
@define
class User(CollectionModel):
    username: str = field(factory=str, validator=validators.instance_of(str))
    userType: str = field(factory=str, validator=validators.instance_of(str))
    password: str = field(factory=str, validator=validators.instance_of(str))
    def __attrs_pre_init__(self):
        super().__init__()
But the problem remains. Am I doing OOP the wrong way, or is this a bug in the attrs module?
I've solved the issue; there were two problems. First, attrs treats a leading underscore as marking a private attribute and strips it from the corresponding __init__ argument name. So the generated __init__ expects id, while asdict() still returns the key _id, and when creating a new instance from that dictionary, _id is an unexpected key.
The second problem is that I do need an _id field, since I'm working with MongoDB. In a MongoDB document, _id is reserved for use as the primary key.
The first problem can be solved by renaming _id to id_ (yes, with the underscore on the other side, since id shadows a Python built-in and is a bad choice for a variable name).
For the second problem, I searched a lot for a way to make an alias for the id_ field, but it is not straightforward with the attrs module. So I modified the get_dict() method to produce the right dictionary before inserting my data into a MongoDB collection.
Here is the modified CollectionModel class and User class:
@define
class CollectionModel:
    """ Base Class For All Collection Schema"""
    id_: ObjectId = field(default=Factory(ObjectId))
    timestamp: float = field(default=Factory(time.time))
    def get_dict(self):
        d = asdict(self)
        d["_id"] = d["id_"]
        del d["id_"]
        return d
    def update(self):
        self.timestamp = time.time()
@define
class User(CollectionModel):
    username: str = field(factory=str, validator=validators.instance_of(str))
    userType: str = field(factory=str, validator=validators.instance_of(str))
    password: str = field(factory=str, validator=validators.instance_of(str))
Now printing an instance:
user_object = User()
print(user_object.get_dict())
Output:
{'timestamp': 1645131182.421929, 'username': '', 'userType': '', 'password': '', '_id': ObjectId('620eb5ae10f27b87de5be3a9')}
I have a JSON object that reads:
j = {"id": 1, "label": "x"}
I have two types:
class BaseModel:
    def __init__(self, uuid):
        self.uuid = uuid
class Entity(BaseModel):
    def __init__(self, id, label):
        super().__init__(id)
        self.name = label
Note how id is stored as uuid in the BaseModel.
I can load Entity from the JSON object as:
entity = Entity(**j)
I want to re-write my model leveraging dataclass:
@dataclass
class BaseModel:
    uuid = str
@dataclass
class Entity:
    name = str
Since my JSON object does not have a uuid key, entity = Entity(**j) on the dataclass-based model will throw the following error:
TypeError: __init__() got an unexpected keyword argument 'id'
The "ugly" solutions I can think of:
Rename id to uuid in JSON before initialization:
j["uuid"] = j.pop("id")
Define both id and uuid:
@dataclass
class BaseModel:
    uuid = str
@dataclass
class Entity:
    id = str
    name = str
    # either use:
    uuid = id
    # or use this method
    def __post_init__(self):
        super().uuid = id
Is there any cleaner solution for this kind of object initialization in the dataclass realm?
This might defeat the purpose of removing the original __init__, but how about writing a function to initialize the dataclass?
def init_entity(j):
    j["uuid"] = j.pop("id")
    return Entity(**j)
and in your code: entity = init_entity(j)
I think the answer here might be to define a classmethod that acts as an alternative constructor to the dataclass.
from dataclasses import dataclass
from typing import TypeVar, Any
@dataclass
class BaseModel:
    uuid: str
E = TypeVar('E', bound='Entity')
@dataclass
class Entity(BaseModel):
    name: str
    @classmethod
    def from_json(cls: type[E], **kwargs: Any) -> E:
        # map the JSON keys 'id' and 'label' onto the uuid and name fields
        return cls(kwargs['id'], kwargs['label'])
(For the from_json type annotation, you'll need to use typing.Type[E] instead of type[E] if you're on python <= 3.8.)
Note that you need to use colons for your type-annotations within the main body of a dataclass, rather than the = operator, as you were doing.
Example usage in the interactive REPL:
>>> my_json_dict = {'id': 1, 'label': 'x'}
>>> Entity.from_json(**my_json_dict)
Entity(uuid=1, name='x')
It's again questionable how much boilerplate code this saves, however. If you find yourself doing this much work to replicate the behaviour of a non-dataclass class, it's often better just to use a non-dataclass class. Dataclasses are not the perfect solution to every problem, nor do they try to be.
Simplest solution seems to be to use an efficient JSON serialization library that supports key remappings. There are actually tons of them that support this, but dataclass-wizard is one example of a (newer) library that supports this particular use case.
Here's an approach using an alias to dataclasses.field() which should be IDE friendly enough:
from dataclasses import dataclass
from dataclass_wizard import json_field, fromdict, asdict
@dataclass
class BaseModel:
    uuid: int = json_field('id', all=True)
@dataclass
class Entity(BaseModel):
    name: str = json_field('label', all=True)
j = {"id": 1, "label": "x"}
# De-serialize the dictionary object into an `Entity` instance.
e = fromdict(Entity, j)
repr(e)
# Entity(uuid=1, name='x')
# Assert we get the same object when serializing the instance back to a
# JSON-serializable dict.
assert asdict(e) == j
My intention
I am developing an API package for a service, and I want good type hints for every method in my library.
For example, when the user types get(). the IDE (PyCharm) should show, after the dot, what response this method provides.
e.g:
info = get()
info. # and here IDE help with hints.
Pitfalls
But some methods return different responses depending on their parameters.
e.g.
# this method responses with object, containing fields:
# count - count of items
# items - list of ids of users
info = get()
# but this method will give additional information. It responses with object, containing fields:
# count - count of items
# items - list of objects with users' information. It has fields:
# id - id of user
# firstname - firstname of user
# lastname - lastname of user
# ... and some others
info = get(fields='firstname')
Objects structure
Currently I have this structure (simplified):
from typing import List, Union
from pydantic import BaseModel, Field
class UserInfo(BaseModel):
    id: int = Field(...)
    firstname: str = Field(None)
    lastname: str = Field(None)
    some_other_fields: str = Field(None)
class GetResponseNoFields(BaseModel):
    count: int = Field(...)
    items: List[int] = Field(...)
class GetResponseWithFields(BaseModel):
    count: int = Field(...)
    items: List[UserInfo] = Field(...)
class GetResponseModel(BaseModel):
    response: Union[GetResponseNoFields, GetResponseWithFields] = Field(...)
def get(fields=None) -> GetResponseModel:
    # some code
    pass
The problem
The problem is that when I type get(fields='firstname').response.items[0]. PyCharm shows me type hints only for int. It doesn't consider that items can be a List[UserInfo]; it assumes items can only be a List[int].
I have tried
I've tried to use the typing.overload decorator, but the method get has many parameters, and overload doesn't seem to support default parameter values. Or maybe I didn't do it properly.
Here is what I tried with overload (simplified). It didn't work because of some_other_param, but I leave it here just in case:
from typing import overload
@overload
def get(fields: None) -> GetResponseNoFields: ...
@overload
def get(fields: str) -> GetResponseWithFields: ...
def get(some_other_param=None, fields=None):
    # code here
    pass
When I try to call the method without parameters, PyCharm says that "Some of the parameters is unfilled".
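One possible way to write the overloads so that they accept default values is sketched below. It is only a sketch under assumptions: plain dataclasses stand in for the pydantic models so the snippet is self-contained, the outer GetResponseModel wrapper is left out, the type of some_other_param is guessed, and the body of get returns made-up placeholder data. The key point is that each overload stub repeats every parameter of the implementation and marks defaults with `...`.

```python
from dataclasses import dataclass
from typing import List, Optional, Union, overload


@dataclass
class UserInfo:
    id: int
    firstname: Optional[str] = None


@dataclass
class GetResponseNoFields:
    count: int
    items: List[int]


@dataclass
class GetResponseWithFields:
    count: int
    items: List[UserInfo]


# Overload 1: fields omitted or None -> items is a list of ids.
@overload
def get(some_other_param: Optional[str] = ..., fields: None = ...) -> GetResponseNoFields: ...
# Overload 2: fields given as a keyword string -> items is a list of UserInfo.
@overload
def get(some_other_param: Optional[str] = ..., *, fields: str) -> GetResponseWithFields: ...


def get(
    some_other_param: Optional[str] = None,
    fields: Optional[str] = None,
) -> Union[GetResponseNoFields, GetResponseWithFields]:
    # Placeholder body: a real implementation would call the remote service.
    if fields is None:
        return GetResponseNoFields(count=2, items=[1, 2])
    users = [UserInfo(id=1, firstname="Ann"), UserInfo(id=2, firstname="Bob")]
    return GetResponseWithFields(count=len(users), items=users)


plain = get()
detailed = get(fields="firstname")
print(type(plain).__name__, type(detailed).__name__)
```

In this sketch the second overload makes fields keyword-only, so a positional call such as get(None, "firstname") would need a third overload. Whether a particular IDE picks the right overload for completion is not something the sketch can guarantee.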
There are a few classes that I defined
class Animal:
    def do_parent_method(self):
        pass
class Monkey(Animal):
    pass
class Elephant(Animal):
    pass
@dataclass
class Zoo:
    monkey: Monkey = Monkey()
    elephant: Elephant = Elephant()
    start_time: datetime = None
    name: str = 'Not important at all'
    def data_format(self):
        items = [self.monkey, self.elephant]  # Now I hard code here
        for item in items:
            do_something()
The key question is how to get the attributes of the Zoo class.
Maybe someday we will add another animal to our code:
@dataclass
class Zoo:
    monkey: Monkey = Monkey()
    elephant: Elephant = Elephant()
    start_time: datetime = None
    name: str = 'Not important at all'
    def data_format(self):
        items = [...]  # get the attributes that extend from Animal; how to do this?
        for item in items:
            item.do_parent_method()
For now I just want items to be a list, so that I can loop over it.
If you have another good idea, that is fine with me too.
Note:
All the other attributes in the Zoo class will only be of type str, datetime, or int. Every remaining attribute will be an instance of a subclass of Animal.
Fixed:
Accidentally typed 'zoom' instead of 'zoo'.
The dataclasses.fields function can return field information about a class, including both the name and type of each field. So your list comprehension can be written:
items = [getattr(self, field.name) for field in fields(self) if issubclass(field.type, Animal)]
The flaw here is that it doesn't work for string annotations, which includes all cases where the module uses from __future__ import annotations. You could use the tricks here to resolve to the actual type, or you could just unconditionally get all the fields, then filter them with isinstance checks (that verify the runtime type, not the annotated type that can be blithely ignored at runtime):
items = [attr for attr in (getattr(self, field.name) for field in fields(self)) if isinstance(attr, Animal)]
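A self-contained sketch of the isinstance-based variant follows. The method name animals and the return value of do_parent_method are illustrative choices, and default_factory is used for the animal fields so that each Zoo gets its own instances; neither detail comes from the question.

```python
from dataclasses import dataclass, field, fields
from datetime import datetime
from typing import List, Optional


class Animal:
    def do_parent_method(self) -> str:
        # Illustrative behavior: report the concrete class name.
        return type(self).__name__


class Monkey(Animal):
    pass


class Elephant(Animal):
    pass


@dataclass
class Zoo:
    monkey: Monkey = field(default_factory=Monkey)
    elephant: Elephant = field(default_factory=Elephant)
    start_time: Optional[datetime] = None
    name: str = "Not important at all"

    def animals(self) -> List[Animal]:
        # Check the runtime values, not the annotations, so string
        # annotations do not matter here.
        values = (getattr(self, f.name) for f in fields(self))
        return [v for v in values if isinstance(v, Animal)]


names = [animal.do_parent_method() for animal in Zoo().animals()]
print(names)
```

Adding another Animal-typed field to Zoo would then be picked up by animals() without touching the method.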