I have this dataclass:
from dataclasses import dataclass, field
from typing import List
@dataclass
class Person:
    name: str
    dob: str
    friends: List['Person'] = field(default_factory=list, init=False)
name and dob are immutable and friends is mutable. I want to generate a hash of each Person object. Can I somehow specify which fields should be included in and excluded from the generated __hash__ method? In this case, name and dob should be included in the hash and friends shouldn't. This is my attempt, but it doesn't work:
@dataclass
class Person:
    name: str = field(hash=True)
    dob: str = field(hash=True)
    friends: List['Person'] = field(default_factory=list, init=False, hash=False)
>>> hash(Person("Mike", "01/01/1900"))
Traceback (most recent call last):
  File "<pyshell#43>", line 1, in <module>
    hash(Person("Mike", "01/01/1900"))
TypeError: unhashable type: 'Person'
I also can't find a way to mark name and dob as frozen. And I'd rather refrain from setting unsafe_hash to True, just by the sound of it. Any suggestions?
Also, is what I'm doing considered good practice? If not, can you suggest some alternatives?
Thank you
Edit: This is just a toy example and we can assume that the name and dob fields are unique.
Edit: I gave an example to demonstrate the error.
Just indicate that the friends field should not be taken into account when comparing instances with __eq__, and pass hash=True to the field instances for the desired fields.
Then, pass the unsafe_hash=True argument to the dataclass decorator itself - it will work as you intend (mostly):
In the case of hash, the language restriction is that if one instance compares equal to another (__eq__), the hashes of both must be equal as well. The implication here is that if you have two instances of the "same" person with the same name and dob fields, they will be considered equal, even if they have different friends lists.
Other than that, this should work:
from dataclasses import dataclass, field
from typing import List
@dataclass(unsafe_hash=True)
class Person:
    name: str = field(hash=True)
    dob: str = field(hash=True)
    friends: List['Person'] = field(default_factory=list, init=False, compare=False, hash=False)
Then, remember to behave like a "consenting adult" and not change the "name" and "dob" fields of Person instances in any place, and you are set.
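As a quick sanity check of that behavior, here is a minimal, self-contained sketch (the sample names and dates are made up):

```python
from dataclasses import dataclass, field
from typing import List

@dataclass(unsafe_hash=True)
class Person:
    name: str = field(hash=True)
    dob: str = field(hash=True)
    friends: List['Person'] = field(default_factory=list, init=False,
                                    compare=False, hash=False)

a = Person("Mike", "01/01/1900")
b = Person("Mike", "01/01/1900")
b.friends.append(Person("Anna", "02/02/1950"))

# friends is excluded from both __eq__ and __hash__, so a and b are
# interchangeable as dict keys and set members despite different friends
assert a == b
assert hash(a) == hash(b)
assert len({a, b}) == 1
```

Note the caveat from above: the two instances collapse to a single set entry even though their friends lists differ.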
I am trying to automatically convert a Pydantic model to a DB schema. To do that, I am recursively looping through a Pydantic model's fields to determine the type of field.
As an example, I have this simple model:
from typing import List
from pydantic import BaseModel
class TestModel(BaseModel):
    tags: List[str]
I am recursing through the model using the __fields__ property as described here: https://docs.pydantic.dev/usage/models/#model-properties
If I do type(TestModel).__fields__['tags'] I see:
ModelField(name='tags', type=List[str], required=True)
I want to programmatically check whether the ModelField type has a List origin. I have tried the following, and none of them work:
type(TestModel).__fields__['tags'].type_ is List[str]
type(TestModel).__fields__['tags'].type_ == List[str]
typing.get_origin(type(TestModel).__fields__['tags'].type_) is List
typing.get_origin(type(TestModel).__fields__['tags'].type_) == List
Frustratingly, this does return True:
type(TestModel).__fields__['tags'].type_ is str
What is the correct way for me to confirm a field is a List type?
Pydantic has the concept of the shape of a field. These shapes are encoded as integers and available as constants in the fields module. The more-or-less standard types have been accommodated there already. If a field was annotated with list[T], then the shape attribute of the field will be SHAPE_LIST and the type_ will be T.
The type_ refers to the element type in the context of everything that is not SHAPE_SINGLETON, i.e. with container-like types. This is why you get str in your example.
Thus for something as simple as list, you can simply check the shape against that constant:
from pydantic import BaseModel
from pydantic.fields import SHAPE_LIST
class TestModel(BaseModel):
    tags: list[str]
    other: tuple[str]
tags_field = TestModel.__fields__["tags"]
other_field = TestModel.__fields__["other"]
assert tags_field.shape == SHAPE_LIST
assert other_field.shape != SHAPE_LIST
If you want more insight into the actual annotation of the field, that is stored in the annotation attribute of the field. With that you should be able to do all the typing related analyses like get_origin.
That means another way of accomplishing your check would be this:
from typing import get_origin
from pydantic import BaseModel
class TestModel(BaseModel):
    tags: list[str]
    other: tuple[str]
tags_field = TestModel.__fields__["tags"]
other_field = TestModel.__fields__["other"]
assert get_origin(tags_field.annotation) is list
assert get_origin(other_field.annotation) is tuple
Sadly, neither of those attributes is officially documented anywhere as far as I know, but the beauty of open source is that we can just check ourselves. Neither the attributes nor the shape constants are obfuscated, protected, or made private in any of the usual ways, so I'll assume they are stable (at least until Pydantic v2 drops).
While attempting to name a Pydantic field schema, I received the following error:
NameError: Field name "schema" shadows a BaseModel attribute; use a different field name with "alias='schema'".
Following the documentation, I attempted to use an alias to avoid the clash. See code below:
from pydantic import StrictStr, Field
from pydantic.main import BaseModel
class CreateStreamPayload(BaseModel):
    name: StrictStr
    _schema: dict[str, str] = Field(alias='schema')
Upon trying to instantiate CreateStreamPayload in the following way:
a = CreateStreamPayload(name="joe",
                        _schema={"name": "a name"})
The resulting instance has only a value for name, nothing else.
a.dict()
{'name': 'joe'}
This makes absolutely no sense to me, can someone please explain what is happening?
Many thanks
From the documentation:
Class variables which begin with an underscore and attributes annotated with typing.ClassVar will be automatically excluded from the model.
In general, append a trailing underscore to avoid such conflicts, since leading underscores are seen as either dunder (magic) members or private members: _schema ➡ schema_
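A minimal sketch of that rename combined with the alias (written against Pydantic v1; the payload values mirror the question):

```python
from pydantic import BaseModel, Field, StrictStr

class CreateStreamPayload(BaseModel):
    name: StrictStr
    schema_: dict = Field(alias='schema')

# populate the field through its alias, which is the accepted input name
a = CreateStreamPayload(name="joe", schema={"name": "a name"})

assert a.name == "joe"
assert a.schema_ == {"name": "a name"}
```

Because the field name no longer starts with an underscore, it is a regular model field and round-trips through validation as expected.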
from pydantic import BaseSettings

class Settings(BaseSettings):
    SITE_URL: str

CONFIG = Settings()
>>> CONFIG.SITE_URL
returns a str, and that's expected.
Is it possible somehow to get access to dotted string representation of field?
CONFIG.SITE_URL.__some_magic_attr_ == 'CONFIG.SITE_URL'
Once initialized, the attribute of a Pydantic model is simply of the type that was defined for it. In this case SITE_URL is just a string. Thus, there is no special magic method, to get its field name.
Depending on your actual use case though, the __fields__ attribute of the model might be useful. It is a dictionary mapping field names to the ModelField objects. For example
from pydantic import BaseSettings
class Settings(BaseSettings):
    SITE_URL: str
print(Settings.__fields__['SITE_URL'])
gives
name='SITE_URL' type=str required=True
If you have your settings object, you can for example do this:
for name in CONFIG.__fields__.keys():
    print(f'{CONFIG.__class__.__name__}.{name}')
giving you
Settings.SITE_URL
...
If you want the name of the variable storing your settings object, I assume you can just write it as a string manually, i.e.
for name in CONFIG.__fields__.keys():
    print(f'CONFIG.{name}')
I wanted to know what is the difference between:
from pydantic import BaseModel, Field
class Person(BaseModel):
    name: str = Field(..., min_length=1)
And:
from pydantic import BaseModel, constr
class Person(BaseModel):
    name: constr(min_length=1)
Both seem to perform the same validation (even raise the exact same exception info when name is an empty string). Is it just a matter of code style? Is one of them preferred over the other?
Also, if I wanted to include a list of nonempty strings as an attribute, which of these ways do you think would be better?:
from typing import List
from pydantic import BaseModel, constr
class Person(BaseModel):
    languages: List[constr(min_length=1)]
Or:
from typing import List
from pydantic import BaseModel, validator

class Person(BaseModel):
    languages: List[str]

    @validator('languages', each_item=True)
    def check_nonempty_strings(cls, v):
        if not v:
            raise ValueError('Empty string is not a valid language.')
        return v
EDIT:
FWIW, I am using this for a FastAPI app.
EDIT2:
For my 2nd question, I think the first alternative is better, as it includes the length requirement in the Schema (and so it's in the documentation)
constr and Field don't serve the same purpose.
constr is a constrained type that adds validation rules for that specific type. There are equivalents for all the classic Python types.
The arguments of constr:
strip_whitespace: bool = False: removes leading and trailing whitespace
to_lower: bool = False: turns all characters to lowercase
to_upper: bool = False: turns all characters to uppercase
strict: bool = False: controls type coercion
min_length: int = None: minimum length of the string
max_length: int = None: maximum length of the string
curtail_length: int = None: shrinks the string length to the set value when it is longer than the set value
regex: str = None: regex to validate the string against
As you can see, those arguments let you constrain the str itself, not Pydantic's behaviour with the field.
Field doesn't serve the same purpose: it's a way of customizing fields (all fields, not only str), and it adds 18 customization parameters that you can find here.
Is it just a matter of code style? Is one of them preferred over the other?
For the specific case of str, it is a matter of code style, and what is "preferred" doesn't matter; only your use case does.
In general, it is better not to mix different syntaxes together, and since you often need Field(), you will encounter it often.
A classic use case is an API response that sends JSON objects in camelCase or PascalCase: you would use field aliases to match those objects and work with their attributes in snake_case.
Example:
class Voice(BaseModel):
    name: str = Field(None, alias='ActorName')
    language_code: str = None
    mood: str = None
For your 2nd question you are right: using constr is surely the best approach, since the validation rule will be added to the OpenAPI doc.
If you want to learn more about limitations and field rules enforcement, check this.
This link shows the methods that do and don't work for pydantic and mypy together: https://lyz-code.github.io/blue-book/coding/python/pydantic_types/#using-constrained-strings-in-list-attributes
The best option for my use case was to make a class that inherits from pydantic.ConstrainedStr, like so:
import re
from typing import List

import pydantic
...
class Regex(pydantic.ConstrainedStr):
    regex = re.compile("^[0-9a-z_]*$")

class Data(pydantic.BaseModel):
    regex: List[Regex]
    # regex: list[Regex] if you are on 3.9+
Consider the following
from pydantic import BaseModel, Field
class Model(BaseModel):
    required: str
This will make required a required field for Model, however, in the FastAPI autogenerated Swagger docs it will have an example value of "string".
How can I make a required field with a sensible default? If I make a model like
from pydantic import BaseModel, Field
class Model(BaseModel):
    required: str = 'Sensible default'
Then the field required is no longer required, but it shows up with a sensible default in the docs. Is there an easy workaround for this?
You can use Field() to set up those options and validations.
from pydantic import BaseModel, Field
class Model(BaseModel):
    something: str  # required, shows "string" in the docs
    something2: str = None  # not required, shows "string"
    something3: str = Field(..., example="this is the default display")  # required, shows the example
    something4: str = Field(None, example="Foobar")  # not required, shows the example
There are a multitude of different parameters that Field() can validate against.
I haven't looked into why the (Pydantic) model representation within the OpenAPI version that ships with FastAPI leaves the asterisk out, but the field is definitely still required (try sending a null value, or anything other than a string). This might just be a UI inconsistency.
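You can check that behavior programmatically as well; this sketch (field names are illustrative, written against Pydantic v1's Field(example=...)) confirms the example only affects the schema, not whether the field is required:

```python
from pydantic import BaseModel, Field, ValidationError

class Model(BaseModel):
    required: str = Field(..., example="Sensible default")
    optional: str = Field(None, example="Foobar")

# the example shows up in the docs, but required is still required
try:
    Model()
    raised = False
except ValidationError:
    raised = True

assert raised
assert Model(required="x").optional is None
```

So the ellipsis (or any Field with no default) keeps the field mandatory regardless of the example shown in Swagger.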