I've been working with FastAPI for some time, and it's a great framework.
However, real-life scenarios can be surprising, and sometimes a non-standard approach is necessary. There's one case I'd like to ask for your help with.
There's a strange external requirement that a model response should be formatted as shown in this example:
Desired behavior:
GET /object/1
{status: 'success', data: {object: {id: '1', category: 'test' …}}}
GET /objects
{status: 'success', data: {objects: [...]}}
Current behavior:
GET /object/1 would respond:
{id: 1,field1:"content",... }
GET /objects/ would send a list of Object, e.g.:
{
[
{id: 1,field1:"content",... },
{id: 1,field1:"content",... },
...
]
}
You can substitute any class for 'object'; it's just for illustration.
How can I write a generic response model that satisfies these requirements?
I know I can produce a response model that contains status: str and (depending on the class) a data structure, e.g. ticket: Ticket or tickets: List[Ticket].
The point is that there are a number of classes, so I hope there's a more Pythonic way to do it.
Thanks for help.
Generic model with static field name
A generic model is a model where one field (or multiple) are annotated with a type variable. Thus the type of that field is unspecified by default and must be specified explicitly during subclassing and/or initialization. But that field is still just an attribute and an attribute must have a name. A fixed name.
To go from your example, say that is your model:
{
"status": "...",
"data": {
"object": {...} # type variable
}
}
Then we could define that model as generic in terms of the type of its object attribute.
This can be done using Pydantic's GenericModel like this:
from typing import Generic, TypeVar

from pydantic import BaseModel
from pydantic.generics import GenericModel

M = TypeVar("M", bound=BaseModel)

class GenericSingleObject(GenericModel, Generic[M]):
    object: M

class GenericMultipleObjects(GenericModel, Generic[M]):
    objects: list[M]

class BaseGenericResponse(GenericModel):
    status: str

class GenericSingleResponse(BaseGenericResponse, Generic[M]):
    data: GenericSingleObject[M]

class GenericMultipleResponse(BaseGenericResponse, Generic[M]):
    data: GenericMultipleObjects[M]

class Foo(BaseModel):
    a: str
    b: int

class Bar(BaseModel):
    x: float
As you can see, GenericSingleObject reflects the generic type we want for data, whereas GenericSingleResponse is generic in terms of the type parameter M of GenericSingleObject, which is the type of its data attribute.
If we now want to use one of our generic response models, we would need to specify it with a type argument (a concrete model) first, e.g. GenericSingleResponse[Foo].
FastAPI deals with this just fine and can generate the correct OpenAPI documentation. The JSON schema for GenericSingleResponse[Foo] looks like this:
{
"title": "GenericSingleResponse[Foo]",
"type": "object",
"properties": {
"status": {
"title": "Status",
"type": "string"
},
"data": {
"$ref": "#/definitions/GenericSingleObject_Foo_"
}
},
"required": [
"status",
"data"
],
"definitions": {
"Foo": {
"title": "Foo",
"type": "object",
"properties": {
"a": {
"title": "A",
"type": "string"
},
"b": {
"title": "B",
"type": "integer"
}
},
"required": [
"a",
"b"
]
},
"GenericSingleObject_Foo_": {
"title": "GenericSingleObject[Foo]",
"type": "object",
"properties": {
"object": {
"$ref": "#/definitions/Foo"
}
},
"required": [
"object"
]
}
}
}
To demonstrate it with FastAPI:
from fastapi import FastAPI

app = FastAPI()

@app.get("/foo/", response_model=GenericSingleResponse[Foo])
async def get_one_foo() -> dict[str, object]:
    return {"status": "foo", "data": {"object": {"a": "spam", "b": 123}}}
Sending a request to that route returns the following:
{
"status": "foo",
"data": {
"object": {
"a": "spam",
"b": 123
}
}
}
Dynamically created model
If you actually want the attribute name to also be different every time, that is obviously no longer possible with static type annotations. In that case we would have to resort to actually creating the model type dynamically via pydantic.create_model.
In that case there is really no point in genericity anymore because type safety is out of the window anyway, at least for the data model. We still have the option to define a GenericResponse model, which we can specify via our dynamically generated models, but this will make every static type checker mad, since we'll be using variables for types. Still, it might make for otherwise concise code.
We just need to define an algorithm for deriving the model parameters:
from typing import Any, Generic, Optional, TypeVar

from pydantic import BaseModel, create_model
from pydantic.generics import GenericModel

M = TypeVar("M", bound=BaseModel)

def create_data_model(
    model: type[BaseModel],
    plural: bool = False,
    custom_plural_name: Optional[str] = None,
    **kwargs: Any,
) -> type[BaseModel]:
    data_field_name = model.__name__.lower()
    if plural:
        model_name = f"Multiple{model.__name__}"
        if custom_plural_name:
            data_field_name = custom_plural_name
        else:
            data_field_name += "s"
        kwargs[data_field_name] = (list[model], ...)  # type: ignore[valid-type]
    else:
        model_name = f"Single{model.__name__}"
        kwargs[data_field_name] = (model, ...)
    return create_model(model_name, **kwargs)

class GenericResponse(GenericModel, Generic[M]):
    status: str
    data: M
Using the same Foo and Bar examples as before:
class Foo(BaseModel):
    a: str
    b: int

class Bar(BaseModel):
    x: float

SingleFoo = create_data_model(Foo)
MultipleBar = create_data_model(Bar, plural=True)
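The naming rule inside create_data_model can be looked at in isolation. The following is only a sketch: derive_names is a hypothetical helper that is not part of the answer's code, and it mirrors just the string handling (no pydantic involved), so the resulting model and field names can be checked before any model is created.

```python
from typing import Optional, Tuple


def derive_names(
    class_name: str,
    plural: bool = False,
    custom_plural_name: Optional[str] = None,
) -> Tuple[str, str]:
    """Return (model_name, data_field_name) using the same rule as create_data_model."""
    field_name = class_name.lower()
    if plural:
        model_name = f"Multiple{class_name}"
        # Naive pluralisation: append "s" unless a custom plural is supplied.
        field_name = custom_plural_name or field_name + "s"
    else:
        model_name = f"Single{class_name}"
    return model_name, field_name


single = derive_names("Foo")
multiple = derive_names("Bar", plural=True)
# "Category" is a made-up class name whose plural is irregular.
irregular = derive_names("Category", plural=True, custom_plural_name="categories")
```

The last call shows why custom_plural_name exists: appending "s" to "category" would give the wrong key.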
This also works as expected with FastAPI, including the automatically generated schemas/documentation:
from fastapi import FastAPI

app = FastAPI()

@app.get("/foo/", response_model=GenericResponse[SingleFoo])  # type: ignore[valid-type]
async def get_one_foo() -> dict[str, object]:
    return {"status": "foo", "data": {"foo": {"a": "spam", "b": 123}}}

@app.get("/bars/", response_model=GenericResponse[MultipleBar])  # type: ignore[valid-type]
async def get_multiple_bars() -> dict[str, object]:
    return {"status": "bars", "data": {"bars": [{"x": 3.14}, {"x": 0}]}}
Output is essentially the same as with the first approach.
You'll have to see which one works better for you. I find the second option very strange because of the dynamic key/field name, but maybe that is what you need for some reason.
Related
From a similar question, the goal is to create a model like this Typescript interface:
interface ExpandedModel {
fixed: number;
[key: string]: OtherModel;
}
However, OtherModel needs to be validated, so simply using:
class ExpandedModel(BaseModel):
    fixed: int

    class Config:
        extra = "allow"
won't be enough. I tried __root__ (pydantic docs):
class VariableKeysModel(BaseModel):
    __root__: Dict[str, OtherModel]
But doing something like:
class ExpandedModel(VariableKeysModel):
    fixed: int
is not possible due to:
ValueError: __root__ cannot be mixed with other fields
Would something like @root_validator (example from another answer) be helpful in this case?
Thankfully, Python is not TypeScript. As mentioned in the comments here as well, an object is generally not a dictionary and dynamic attributes are considered bad form in almost all cases.
You can of course still set attributes dynamically, but they will for example never be recognized by a static type checker like Mypy or your IDE. This means you will not get auto-suggestions for those dynamic fields. Only attributes that are statically defined within the namespace of the class are considered members of that class.
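As a plain-Python illustration of that distinction (a sketch only; Record is a hypothetical class, not one of the pydantic models discussed here), a dynamically set attribute exists on the instance at runtime but never appears among the names declared on the class, which is all a static checker gets to see:

```python
class Record:
    b: str  # statically declared attribute

    def __init__(self, b: str) -> None:
        self.b = b


record = Record("xyz")
# Works at runtime, but the name "foo1" is unknown to type checkers and IDEs.
setattr(record, "foo1", {"a": 1})

declared = set(Record.__annotations__)  # names declared on the class
runtime = set(vars(record))             # names actually present on the instance
dynamic_only = runtime - declared
```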
That being said, you can abuse the extra config option to allow arbitrary fields to be dynamically added to the model, while at the same time enforcing all corresponding values to be of a specific type via a root_validator.
from typing import Any

from pydantic import BaseModel, root_validator

class Foo(BaseModel):
    a: int

class Bar(BaseModel):
    b: str

    @root_validator
    def validate_foo(cls, values: dict[str, Any]) -> dict[str, Any]:
        for name, value in values.items():
            if name in cls.__fields__:
                continue  # ignore statically defined fields here
            values[name] = Foo.parse_obj(value)
        return values

    class Config:
        extra = "allow"
Demo:
if __name__ == "__main__":
    from pydantic import ValidationError

    bar = Bar.parse_obj({
        "b": "xyz",
        "foo1": {"a": 1},
        "foo2": Foo(a=2),
    })
    print(bar.json(indent=4))

    try:
        Bar.parse_obj({
            "b": "xyz",
            "foo": {"a": "string"},
        })
    except ValidationError as err:
        print(err.json(indent=4))

    try:
        Bar.parse_obj({
            "b": "xyz",
            "foo": {"not_a_foo_field": 1},
        })
    except ValidationError as err:
        print(err.json(indent=4))
Output:
{
"b": "xyz",
"foo2": {
"a": 2
},
"foo1": {
"a": 1
}
}
[
{
"loc": [
"__root__",
"a"
],
"msg": "value is not a valid integer",
"type": "type_error.integer"
}
]
[
{
"loc": [
"__root__",
"a"
],
"msg": "field required",
"type": "value_error.missing"
}
]
A better approach IMO is to just put the dynamic name-object-pairs into a dictionary. For example, you could define a separate field foos: dict[str, Foo] on the Bar model and get automatic validation out of the box that way.
Or you ditch the outer base model altogether for that specific case and just handle the data as a native dictionary with Foo values and parse them all via the Foo model.
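Both alternatives could look roughly like the sketch below. The field name foos comes from the suggestion above; the sample values and the variable names are assumptions for illustration. Only plain constructor calls are used, so this should not depend on version-specific pydantic helpers.

```python
from typing import Dict

from pydantic import BaseModel, ValidationError


class Foo(BaseModel):
    a: int


class Bar(BaseModel):
    b: str
    foos: Dict[str, Foo]  # dynamic names become dictionary keys


# Option 1: a dedicated dictionary field, validated automatically.
bar = Bar(b="xyz", foos={"foo1": {"a": 1}, "foo2": Foo(a=2)})

try:
    Bar(b="xyz", foos={"foo": {"a": "string"}})
except ValidationError:
    rejected = True
else:
    rejected = False

# Option 2: no outer model, just a native dict parsed value by value.
raw = {"foo1": {"a": 1}, "foo2": {"a": 2}}
parsed = {name: Foo(**value) for name, value in raw.items()}
```

With the dictionary field, the dynamic names stay ordinary data, and a type checker still knows that every value in bar.foos is a Foo.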
Hello I am reading a JSON with the following format:
{
"1": {"id":1, "type": "a"},
2: {"id":2, "type": "b"},
"3": {"id":3, "type": "c"},
"5": {"id":4, "type": "d"}
}
As you can see, the keys are numbers but are not consecutive.
So I have the following BaseModel for the nested dict:
@validate_arguments
class ObjI(BaseModel):
    id: int
    type: str
The question is how can I validate that all items in the dict are ObjI without using:
objIs = json.load(open(path))
assert type(objIs) == dict
for objI in objIs.values():
    assert type(objI) == dict
    ObjI(**objI)
I tried with:
@validate_arguments
class ObjIs(BaseModel):
    ObjIs: Dict[Union[str, int], ObjI]
EDIT
The error validating the previous is:
in pydantic.validators.find_validators TypeError: issubclass() arg 1 must be a class
Is this possible?
Thanks
You could change your model definitions to use a custom root type (no need for the validate_arguments decorators):
from typing import Dict

from pydantic import BaseModel

class ObjI(BaseModel):
    id: int
    type: str

class ObjIs(BaseModel):
    __root__: Dict[int, ObjI]
The model can now be initialised with the JSON data, e.g. like this:
import json

with open("/path/to/data") as file:
    data = json.load(file)

objis = ObjIs.parse_obj(data)
If data contains invalid types (or has missing fields), parse_obj() will raise a ValidationError.
For example, if data looked like this:
data = {
    "1": {"id": "x", "type": "a"},
#               ^
#           wrong type
    2: {"id": 2, "type": "b"},
    "3": {"id": 3, "type": "c"},
    "4": {"id": 4, "type": "d"},
}
objs = ObjIs.parse_obj(data)
it would result in:
pydantic.error_wrappers.ValidationError: 1 validation error for ObjIs
__root__ -> 1 -> id
value is not a valid integer (type=type_error.integer)
which tells us that the id of the object with key 1 has an invalid type.
(You can catch and handle a ValidationError like any other exception in Python.)
(The pydantic docs also recommend implementing custom __iter__ and __getitem__ methods on the model if you want to access the items in the __root__ field directly.)
I created a couple of dataclasses (similar to Go's structs) and I want to put my response/data inside the dataclass. I haven't been able to find out whether or not the json package supports this out of the box.
Dataclasses
from dataclasses import dataclass
from typing import List

@dataclass
class TakerPays:
    currency: str
    issuer: str
    value: str

@dataclass
class Offers:
    account: str
    book_directory: str
    book_node: str
    flags: int
    ledger_entry_type: str
    owner_node: str
    previous_tx_id: str
    previous_tx_lgr_seq: int
    sequence: int
    taker_gets: str
    taker_pays: TakerPays
    index: str
    owner_funds: str
    quality: str

@dataclass
class Warnings:
    id: int
    message: str

@dataclass
class Result:
    ledger_hash: str
    ledger_index: int
    offers: List[Offers]
    validated: bool
    warnings: List[Warnings]

@dataclass
class Response:
    id: int
    result: Result
    status: str
    type: str
Preview of the json that needs to be put into the Response dataclass
{
"id": 4,
"result": {
"ledger_hash": "5848C7DB5024EC3B532AC2F93BA8086A3D6281D3C0746BFE62E7E3CF4853F663",
"ledger_index": 68379996,
"offers": [
{
"Account": "rPbMHxs7vy5t6e19tYfqG7XJ6Fog8EPZLk",
"BookDirectory": "DFA3B6DDAB58C7E8E5D944E736DA4B7046C30E4F460FD9DE4E1D157637A1048F",
"BookNode": "0",
"Flags": 0,
"LedgerEntryType": "Offer",
"OwnerNode": "0",
"PreviousTxnID": "72B8928E31DF89223C7ADE0030685289BAD772C72DF23DDFFB92FF7B48BAC622",
"PreviousTxnLgrSeq": 68379985,
"Sequence": 386826,
"TakerGets": "789784836",
"TakerPays": {
"currency": "USD",
"issuer": "rvYAfWj5gh67oV6fW32ZzP3Aw4Eubs59B",
"value": "646.5472316"
},
"index": "82F565EDEF8661D7D9C92A75E2F0F5DBF2BAAFAE96A5A5A768AD76B933016031",
"owner_funds": "4587408572",
"quality": "0.0000008186371808232591"
}
],
"validated": true,
"warnings": [
{
"id": 1004,
"message": "This is a reporting server. The default behavior of a reporting server is to only return validated data. If you are looking for not yet validated data, include \"ledger_index : current\" in your request, which will cause this server to forward the request to a p2p node. If the forward is successful the response will include \"forwarded\" : \"true\""
}
]
},
"status": "success",
"type": "response"
}
I've not been able to find support for putting the JSON into the dataclass, similar to Go's json.Unmarshal, and I'm curious about the best way to do this.
Unfortunately the builtin modules in Python such as json don't support de-serializing JSON into a nested dataclass model as in this case.
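To make the limitation concrete, here is a minimal standard-library sketch. It uses a trimmed-down, hypothetical Offer dataclass with snake_case keys (not the full model from the question), and offer_from_dict is an illustrative helper name. Unpacking the parsed JSON directly leaves the nested value as a plain dict, so the nested dataclass has to be built by hand.

```python
import json
from dataclasses import dataclass


@dataclass
class TakerPays:
    currency: str
    issuer: str
    value: str


@dataclass
class Offer:  # hypothetical, reduced to two fields for brevity
    account: str
    taker_pays: TakerPays


raw = json.loads(
    '{"account": "rXYZ",'
    ' "taker_pays": {"currency": "USD", "issuer": "rABC", "value": "1.5"}}'
)

# Dataclasses do not convert nested values: taker_pays stays a dict here.
naive = Offer(**raw)


def offer_from_dict(data: dict) -> Offer:
    """Build the nested dataclass explicitly, one level at a time."""
    return Offer(
        account=data["account"],
        taker_pays=TakerPays(**data["taker_pays"]),
    )


offer = offer_from_dict(raw)
```

Writing such a helper for every level, plus the key renaming from "TitleCase", is the repetitive part that a dedicated library takes over.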
If you're on board with using third-party libraries, a solid option is to leverage the dataclass-wizard library for this task, as shown below; one advantage that it offers - which really helps in this particular example - is auto key casing transforms, such as from "camelCase" and "TitleCase".
Note that if you prefer not to subclass from any Mixin class, you can opt to use the helper functions fromdict and asdict to convert data from/to Python dict objects instead.
from dataclasses import dataclass
from typing import List

from dataclass_wizard import JSONWizard

@dataclass
class TakerPays:
    currency: str
    issuer: str
    value: str

@dataclass
class Offers:
    account: str
    book_directory: str
    book_node: str
    flags: int
    ledger_entry_type: str
    owner_node: str
    previous_txn_id: str
    previous_txn_lgr_seq: int
    sequence: int
    taker_gets: str
    taker_pays: TakerPays
    index: str
    owner_funds: str
    quality: str

@dataclass
class Warnings:
    id: int
    message: str

@dataclass
class Result:
    ledger_hash: str
    ledger_index: int
    offers: List[Offers]
    validated: bool
    warnings: List[Warnings]

@dataclass
class Response(JSONWizard):
    id: int
    result: Result
    status: str
    type: str
def main():
    string = r"""
{
"id": 4,
"result": {
"ledger_hash": "5848C7DB5024EC3B532AC2F93BA8086A3D6281D3C0746BFE62E7E3CF4853F663",
"ledger_index": 68379996,
"offers": [
{
"Account": "rPbMHxs7vy5t6e19tYfqG7XJ6Fog8EPZLk",
"BookDirectory": "DFA3B6DDAB58C7E8E5D944E736DA4B7046C30E4F460FD9DE4E1D157637A1048F",
"BookNode": "0",
"Flags": 0,
"LedgerEntryType": "Offer",
"OwnerNode": "0",
"PreviousTxnID": "72B8928E31DF89223C7ADE0030685289BAD772C72DF23DDFFB92FF7B48BAC622",
"PreviousTxnLgrSeq": 68379985,
"Sequence": 386826,
"TakerGets": "789784836",
"TakerPays": {
"currency": "USD",
"issuer": "rvYAfWj5gh67oV6fW32ZzP3Aw4Eubs59B",
"value": "646.5472316"
},
"index": "82F565EDEF8661D7D9C92A75E2F0F5DBF2BAAFAE96A5A5A768AD76B933016031",
"owner_funds": "4587408572",
"quality": "0.0000008186371808232591"
}
],
"validated": true,
"warnings": [
{
"id": 1004,
"message": "This is a reporting server. The default behavior of a reporting server is to only return validated data. If you are looking for not yet validated data, include \"ledger_index : current\" in your request, which will cause this server to forward the request to a p2p node. If the forward is successful the response will include \"forwarded\" : \"true\""
}
]
},
"status": "success",
"type": "response"
}
"""
    r = Response.from_json(string)
    print(repr(r))

if __name__ == '__main__':
    main()
Output:
Response(id=4, result=Result(ledger_hash='5848C7DB5024EC3B532AC2F93BA8086A3D6281D3C0746BFE62E7E3CF4853F663', ledger_index=68379996, offers=[Offers(account='rPbMHxs7vy5t6e19tYfqG7XJ6Fog8EPZLk', book_directory='DFA3B6DDAB58C7E8E5D944E736DA4B7046C30E4F460FD9DE4E1D157637A1048F', book_node='0', flags=0, ledger_entry_type='Offer', owner_node='0', previous_txn_id='72B8928E31DF89223C7ADE0030685289BAD772C72DF23DDFFB92FF7B48BAC622', previous_txn_lgr_seq=68379985, sequence=386826, taker_gets='789784836', taker_pays=TakerPays(currency='USD', issuer='rvYAfWj5gh67oV6fW32ZzP3Aw4Eubs59B', value='646.5472316'), index='82F565EDEF8661D7D9C92A75E2F0F5DBF2BAAFAE96A5A5A768AD76B933016031', owner_funds='4587408572', quality='0.0000008186371808232591')], validated=True, warnings=[Warnings(id=1004, message='This is a reporting server. The default behavior of a reporting server is to only return validated data. If you are looking for not yet validated data, include "ledger_index : current" in your request, which will cause this server to forward the request to a p2p node. If the forward is successful the response will include "forwarded" : "true"')]), status='success', type='response')
NB: I noted that two fields in the Offers dataclass have slightly different names than the fields in the JSON object. For example, the field previous_tx_id is associated with the key PreviousTxnID in the JSON object.
Assuming this was intentional, you could easily work around this by defining a field alias mapping, as shown below:
from dataclass_wizard import json_key
# Note: In Python 3.9+ you can import this from `typing` instead
from typing_extensions import Annotated

@dataclass
class Offers:
    ...
    previous_tx_id: Annotated[str, json_key('PreviousTxnID')]
    previous_tx_lgr_seq: Annotated[int, json_key('PreviousTxnLgrSeq')]
    ...
I would like pydantic to choose the model to use for parsing the input depending on the input value. Is this possible?
MVCE
I have a pydantic model which looks similar to this one:
from typing import List, Literal

from pydantic import BaseModel

class Animal(BaseModel):
    name: str
    type: Literal["mamal", "bird"]

class Bird(Animal):
    max_eggs: int

class Mamal(Animal):
    max_offspring: int

class Config(BaseModel):
    animals: List[Animal]

cfg = Config.parse_obj(
    {
        "animals": [
            {"name": "eagle", "type": "bird", "max_eggs": 3},
            {"name": "Human", "type": "mamal", "max_offspring": 3},
        ]
    }
)
print(cfg.json(indent=4))
gives
{
"animals": [
{
"name": "eagle",
"type": "bird"
<-- missing max_eggs, as "Animal" was used instead of Bird
},
{
"name": "Human",
"type": "mamal"
<-- missing max_offspring, as "Animal" was used instead of Mamal
}
]
}
I know that I could set Config.extra="allow" in Animal, but that is not what I want. I would like pydantic to see that a dictionary with 'type': 'mamal' should use the Mamal model to parse.
Is this possible?
You could add a concrete Literal type to every child class to differentiate them, and list the models in the Union from most to least specific, like so:
from typing import List, Literal, Union

from pydantic import BaseModel

class Animal(BaseModel):
    name: str
    type: str

class Bird(Animal):
    type: Literal["bird"]
    max_eggs: int

class Mamal(Animal):
    type: Literal["mamal"]
    max_offspring: int

class Config(BaseModel):
    animals: List[Union[Bird, Mamal, Animal]]  # From more specific to less
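A quick way to check which model each entry ends up as is sketched below. The container is renamed to Zoo here purely to avoid confusion with pydantic's own Config naming, and the "fish" entry is an assumed extra input that matches neither Literal, so it is expected to fall back to the base Animal model.

```python
from typing import List, Literal, Union

from pydantic import BaseModel


class Animal(BaseModel):
    name: str
    type: str


class Bird(Animal):
    type: Literal["bird"]
    max_eggs: int


class Mamal(Animal):
    type: Literal["mamal"]
    max_offspring: int


class Zoo(BaseModel):  # hypothetical name for the container model
    animals: List[Union[Bird, Mamal, Animal]]  # most specific first


zoo = Zoo(
    animals=[
        {"name": "eagle", "type": "bird", "max_eggs": 3},
        {"name": "Human", "type": "mamal", "max_offspring": 3},
        {"name": "trout", "type": "fish"},  # no dedicated model
    ]
)

# Names of the classes that were actually chosen, in input order.
chosen = [type(animal).__name__ for animal in zoo.animals]
```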
I want to know whether it is possible to parameterize the type in a custom user-defined type.
My working example is:
from pydantic import BaseModel
from typing import Dict, List, Optional, TypeVar, Union
ListSchemaType = TypeVar("ListSchemaType", bound=BaseModel)
GenericPagination = Dict[str, Union[Optional[int], List[ListSchemaType]]]
What I need to do is call GenericPagination with a parameter.
Like this:
pagination: GenericPagination(schemas.Posts)
My data is structured in this form:
{
    "count": 2,
    "limit": 10,
    "page": 1,
    "results": [{
        "name": "Foo",
        "url_base": "https://www.bar.cl",
        "development_stage": "FOO",
        "system_status": "BAR",
        "id": 1
    }]
}
After some research I found the answer here.
My final working case looks like this:
ListSchema = TypeVar("ListSchema", bound=BaseModel)
Pagination = Dict[str, Union[Optional[int], List[ListSchema]]]
Then I can just use it like this:
pagination: Optional[Pagination[schemas.Post]]
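The reason this works is that an alias containing a TypeVar is itself generic, so it is parameterized with square brackets rather than called. A standard-library sketch of that mechanism follows; to stay self-contained it assumes an unbound TypeVar and a placeholder Post class instead of bound=BaseModel and schemas.Post.

```python
from typing import Dict, List, Optional, TypeVar, Union, get_args

ListSchema = TypeVar("ListSchema")  # unbound here; the original binds it to BaseModel
Pagination = Dict[str, Union[Optional[int], List[ListSchema]]]


class Post:  # placeholder standing in for schemas.Post
    pass


# Subscripting the alias substitutes the type variable.
PostPagination = Pagination[Post]

key_type, value_type = get_args(PostPagination)
value_options = get_args(value_type)  # members of the resulting Union
```

Inspecting value_options shows List[Post] in place of List[ListSchema], which is exactly the substitution the call-style syntax was trying to achieve.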