Make Pydantic BaseModel fields optional including sub-models for PATCH - python

As already asked in similar questions, I want to support PATCH operations for a FastAPI application where the caller can specify as many or as few fields as they like of a Pydantic BaseModel with sub-models, so that efficient PATCH operations can be performed without the caller having to supply an entire valid model just to update two or three fields.
I've discovered there are 2 steps in Pydantic PATCH from the tutorial that don't support sub-models. However, Pydantic is far too good for me to criticise it for something that it seems can be built using the tools that Pydantic provides. This question is to request implementation of those 2 things while also supporting sub-models:
generate a new DRY BaseModel with all fields optional
implement deep copy with update of BaseModel
These problems are already recognised by Pydantic.
There is discussion of a class based solution to the optional model
And there are two issues open on the deep copy with update
A similar question has been asked one or two times here on SO and there are some great answers with different approaches to generating an all-fields optional version of the nested BaseModel. After considering them all this particular answer by Ziur Olpa seemed to me to be the best, providing a function that takes the existing model with optional and mandatory fields, and returning a new model with all fields optional: https://stackoverflow.com/a/72365032
The beauty of this approach is that you can hide the (actually quite compact) little function in a library and just use it as a dependency so that it appears in-line in the path operation function and there's no other code or boilerplate.
But the implementation provided in the previous answer did not take the step of dealing with sub-objects in the BaseModel being patched.
This question therefore requests an improved implementation of the all-fields-optional function that also deals with sub-objects, as well as a deep copy with update.
I have a simple example as a demonstration of this use-case, which although aiming to be simple for demonstration purposes, also includes a number of fields to more closely reflect the real world examples we see. Hopefully this example provides a test scenario for implementations, saving work:
import logging
from datetime import datetime, date
from collections import defaultdict
from pydantic import BaseModel
from fastapi import FastAPI, HTTPException, status, Depends
from fastapi.encoders import jsonable_encoder
app = FastAPI(title="PATCH demo")
logging.basicConfig(level=logging.DEBUG)
class Collection:
    collection = defaultdict(dict)
    def __init__(self, this, that):
        logging.debug("-".join((this, that)))
        self.this = this
        self.that = that
    def get_document(self):
        document = self.collection[self.this].get(self.that)
        if not document:
            raise HTTPException(
                status_code=status.HTTP_404_NOT_FOUND,
                detail="Not Found",
            )
        logging.debug(document)
        return document
    def save_document(self, document):
        logging.debug(document)
        self.collection[self.this][self.that] = document
        return document
class SubOne(BaseModel):
    original: date
    verified: str = ""
    source: str = ""
    incurred: str = ""
    reason: str = ""
    attachments: list[str] = []
class SubTwo(BaseModel):
    this: str
    that: str
    amount: float
    plan_code: str = ""
    plan_name: str = ""
    plan_type: str = ""
    meta_a: str = ""
    meta_b: str = ""
    meta_c: str = ""
class Document(BaseModel):
    this: str
    that: str
    created: datetime
    updated: datetime
    sub_one: SubOne
    sub_two: SubTwo
    the_code: str = ""
    the_status: str = ""
    the_type: str = ""
    phase: str = ""
    process: str = ""
    option: str = ""
@app.get("/endpoint/{this}/{that}", response_model=Document)
async def get_submission(this: str, that: str) -> Document:
    collection = Collection(this=this, that=that)
    return collection.get_document()
@app.put("/endpoint/{this}/{that}", response_model=Document)
async def put_submission(this: str, that: str, document: Document) -> Document:
    collection = Collection(this=this, that=that)
    return collection.save_document(jsonable_encoder(document))
@app.patch("/endpoint/{this}/{that}", response_model=Document)
async def patch_submission(
    document: Document,
    # document: optional(Document), # <<< IMPLEMENT optional <<<
    this: str,
    that: str,
) -> Document:
    collection = Collection(this=this, that=that)
    existing = collection.get_document()
    existing = Document(**existing)
    update = document.dict(exclude_unset=True)
    updated = existing.copy(update=update, deep=True) # <<< FIX THIS <<<
    updated = jsonable_encoder(updated)
    collection.save_document(updated)
    return updated
This example is a working FastAPI application, following the tutorial, and can be run with uvicorn example:app --reload. Except it doesn't work, because there's no all-optional fields model, and Pydantic's deep copy with update actually overwrites sub-models rather than updating them.
In order to test it the following Bash script can be used to run curl requests. Again I'm supplying this just to hopefully make it easier to get started with this question.
Just comment out the other commands each time you run it so that the command you want is used.
To demonstrate this initial state of the example app working you would run GET (expect 404), PUT (document stored), GET (expect 200 and same document returned), PATCH (expect 200), GET (expect 200 and updated document returned).
host='http://127.0.0.1:8000'
path="/endpoint/A123/B456"
method='PUT'
data='
{
"this":"A123",
"that":"B456",
"created":"2022-12-01T01:02:03.456",
"updated":"2023-01-01T01:02:03.456",
"sub_one":{"original":"2022-12-12","verified":"Y"},
"sub_two":{"this":"A123","that":"B456","amount":0.88,"plan_code":"HELLO"},
"the_code":"BYE"}
'
# method='PATCH'
# data='{"this":"A123","that":"B456","created":"2022-12-01T01:02:03.456","updated":"2023-01-02T03:04:05.678","sub_one":{"original":"2022-12-12","verified":"N"},"sub_two":{"this":"A123","that":"B456","amount":123.456}}'
method='GET'
data=''
if [[ -n "$data" ]]; then data=" --data '$data'"; fi
curl="curl -K curlrc -X $method '$host$path' $data"
echo $curl >&2
eval $curl
This curlrc will need to be co-located to ensure the content type headers are correct:
--cookie "_cookies"
--cookie-jar "_cookies"
--header "Content-Type: application/json"
--header "Accept: application/json"
--header "Accept-Encoding: compress, gzip"
--header "Cache-Control: no-cache"
So what I'm looking for is the implementation of optional that is commented out in the code, and a fix for existing.copy with the update parameter, that will enable this example to be used with PATCH calls that omit otherwise mandatory fields.
The implementation does not have to conform precisely to the commented out line, I just provided that based on Ziur Olpa's previous answer.
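To make the goal concrete, here is a minimal sketch of the kind of partial PATCH request the finished endpoint should accept. It reuses the host and path from the Bash script above; the helper name build_patch_request is hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request

HOST = "http://127.0.0.1:8000"  # same host as the Bash script
PATH = "/endpoint/A123/B456"    # same path as the Bash script


def build_patch_request(changes: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON PATCH request carrying only the changed fields."""
    body = json.dumps(changes).encode("utf-8")
    return urllib.request.Request(
        url=HOST + PATH,
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json", "Accept": "application/json"},
    )


# Only two fields are supplied; mandatory fields such as "created" are omitted.
request = build_patch_request({"sub_one": {"verified": "N"}, "the_code": "HELLO"})
# urllib.request.urlopen(request) would send it to a running server.
```

With the original Document model as the body type, a request like this would be rejected for missing fields, which is exactly what the all-optional model is meant to avoid.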

When I first posed this question I thought that the only problem was how to turn all fields Optional in a nested BaseModel, but actually that was not difficult to fix.
The real problem with partial updates when implementing a PATCH call is that the Pydantic BaseModel.copy method doesn't attempt to support nested models when applying its update parameter. That's quite an involved task for the generic case, considering you may have fields that are dicts, lists, or sets of another BaseModel, for instance. Instead it just unpacks the dict using **: https://github.com/pydantic/pydantic/blob/main/pydantic/main.py#L353
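The effect of a shallow ** unpack can be seen with plain dicts, leaving Pydantic out of it entirely. This is only an illustration of the behaviour described above, using values from the PUT example in the question; it is not Pydantic's actual code.

```python
existing = {
    "the_code": "BYE",
    "sub_two": {"this": "A123", "that": "B456", "amount": 0.88, "plan_code": "HELLO"},
}
update = {"sub_two": {"amount": 123.456}}

# Shallow update: each top-level key in update replaces the existing value wholesale.
shallow = dict(existing, **update)

print(shallow["sub_two"])  # {'amount': 123.456}
```

The untouched top-level field the_code survives, but this, that and plan_code inside sub_two are lost, because the nested dict was replaced rather than merged.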
I haven't got a proper implementation of that for Pydantic, but since I've got a working example PATCH by cheating, I'm going to post this as an answer and see if anyone can fault it or provide better, possibly even with an implementation of BaseModel.copy that supports updates for nested models.
Rather than post the implementations separately, I am going to update the example given in the question so that it has a working PATCH. Being a full demonstration of PATCH, hopefully this will help others more.
The two additions are partial and merge. partial is what's referred to as optional in the question code.
partial:
This is a function that takes any BaseModel and returns a new BaseModel with all fields Optional, including sub-object fields. That's enough for Pydantic to allow through any sub-set of fields without throwing an error for "missing fields". It's recursive - not really popular - but given these are nested data models the depth is not expected to exceed single digits.
merge:
The BaseModel update-on-copy method operates on an instance of BaseModel, and supporting all the possible type variations when descending through a nested model is the hard part. However, the database data and the incoming update are both easily available as plain Python dicts, so this is the cheat: merge is an implementation of a nested dict update instead, and since the dict data has already been validated at one point or another, it should be fine.
Here's the full example solution:
import logging
from typing import Optional, Type
from datetime import datetime, date
from functools import lru_cache
from pydantic import BaseModel, create_model
from collections import defaultdict
from fastapi import FastAPI, HTTPException, status, Depends, Body
from fastapi.encoders import jsonable_encoder
app = FastAPI(title="Nested model PATCH demo")
logging.basicConfig(level=logging.DEBUG)
class Collection:
    collection = defaultdict(dict)
    def __init__(self, this, that):
        logging.debug("-".join((this, that)))
        self.this = this
        self.that = that
    def get_document(self):
        document = self.collection[self.this].get(self.that)
        if not document:
            raise HTTPException(
                status_code=status.HTTP_404_NOT_FOUND,
                detail="Not Found",
            )
        logging.debug(document)
        return document
    def save_document(self, document):
        logging.debug(document)
        self.collection[self.this][self.that] = document
        return document
class SubOne(BaseModel):
    original: date
    verified: str = ""
    source: str = ""
    incurred: str = ""
    reason: str = ""
    attachments: list[str] = []
class SubTwo(BaseModel):
    this: str
    that: str
    amount: float
    plan_code: str = ""
    plan_name: str = ""
    plan_type: str = ""
    meta_a: str = ""
    meta_b: str = ""
    meta_c: str = ""
class SubThree(BaseModel):
    one: str = ""
    two: str = ""
class Document(BaseModel):
    this: str
    that: str
    created: datetime
    updated: datetime
    sub_one: SubOne
    sub_two: SubTwo
    # sub_three: dict[str, SubThree] = {} # Hah hah not really
    the_code: str = ""
    the_status: str = ""
    the_type: str = ""
    phase: str = ""
    process: str = ""
    option: str = ""
@lru_cache
def partial(baseclass: Type[BaseModel]) -> Type[BaseModel]:
    """Make all fields in supplied Pydantic BaseModel Optional, for use in PATCH calls.
    Iterate over fields of baseclass, descend into sub-classes, convert fields to Optional and return new model.
    Cache newly created model with lru_cache to ensure it's only created once.
    Use with Body to generate the partial model on the fly, in the PATCH path operation function.
    - https://stackoverflow.com/questions/75167317/make-pydantic-basemodel-fields-optional-including-sub-models-for-patch
    - https://stackoverflow.com/questions/67699451/make-every-fields-as-optional-with-pydantic
    - https://github.com/pydantic/pydantic/discussions/3089
    - https://fastapi.tiangolo.com/tutorial/body-updates/#partial-updates-with-patch
    """
    fields = {}
    for name, field in baseclass.__fields__.items():
        type_ = field.type_
        if type_.__base__ is BaseModel:
            fields[name] = (Optional[partial(type_)], {})
        else:
            fields[name] = (Optional[type_], None) if field.required else (type_, field.default)
    # https://docs.pydantic.dev/usage/models/#dynamic-model-creation
    validators = {"__validators__": baseclass.__validators__}
    return create_model(baseclass.__name__ + "Partial", **fields, __validators__=validators)
def merge(original, update):
    """Update original nested dict with values from update retaining original values that are missing in update.
    - https://github.com/pydantic/pydantic/issues/3785
    - https://github.com/pydantic/pydantic/issues/4177
    - https://docs.pydantic.dev/usage/exporting_models/#modelcopy
    - https://github.com/pydantic/pydantic/blob/main/pydantic/main.py#L353
    """
    for key in update:
        if key in original:
            if isinstance(original[key], dict) and isinstance(update[key], dict):
                merge(original[key], update[key])
            elif isinstance(original[key], list) and isinstance(update[key], list):
                original[key].extend(update[key])
            else:
                original[key] = update[key]
        else:
            original[key] = update[key]
    return original
@app.get("/endpoint/{this}/{that}", response_model=Document)
async def get_submission(this: str, that: str) -> Document:
    collection = Collection(this=this, that=that)
    return collection.get_document()
@app.put("/endpoint/{this}/{that}", response_model=Document)
async def put_submission(this: str, that: str, document: Document) -> Document:
    collection = Collection(this=this, that=that)
    return collection.save_document(jsonable_encoder(document))
@app.patch("/endpoint/{this}/{that}", response_model=Document)
async def patch_submission(
    this: str,
    that: str,
    document: partial(Document), # <<< IMPLEMENTED partial TO MAKE ALL FIELDS Optional <<<
) -> Document:
    collection = Collection(this=this, that=that)
    existing_document = collection.get_document()
    incoming_document = document.dict(exclude_unset=True)
    # VVV IMPLEMENTED merge INSTEAD OF USING BROKEN PYDANTIC copy WITH update VVV
    updated_document = jsonable_encoder(merge(existing_document, incoming_document))
    collection.save_document(updated_document)
    return updated_document
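Two properties of merge as written are worth knowing before reusing it: it modifies original in place, and list values are extended rather than replaced, so a PATCH that includes attachments appends to the stored list. A possible variant, sketched here under the hypothetical name merge_copy with a hypothetical replace_lists flag, leaves the original untouched and lets the caller choose the list behaviour. It is one way to do it under those assumptions, not a drop-in from Pydantic.

```python
from copy import deepcopy


def merge_copy(original: dict, update: dict, replace_lists: bool = False) -> dict:
    """Return a new dict: original recursively updated with values from update."""
    result = deepcopy(original)
    for key, value in update.items():
        current = result.get(key)
        if isinstance(current, dict) and isinstance(value, dict):
            # Descend into nested dicts instead of replacing them wholesale.
            result[key] = merge_copy(current, value, replace_lists)
        elif isinstance(current, list) and isinstance(value, list) and not replace_lists:
            # Same behaviour as merge above: incoming list items are appended.
            result[key] = current + deepcopy(value)
        else:
            result[key] = deepcopy(value)
    return result


stored = {"sub_one": {"verified": "Y", "attachments": ["a.pdf"]}, "the_code": "BYE"}
patch = {"sub_one": {"verified": "N", "attachments": ["b.pdf"]}}

appended = merge_copy(stored, patch)
replaced = merge_copy(stored, patch, replace_lists=True)
```

Whether a PATCH should append to or replace a list is an API design decision; the flag just makes that choice explicit.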

Related

list all compute instances in a specific region gcp with python

So, I can list my instances by zones using this API.
GET https://compute.googleapis.com/compute/v1/projects/{project}/zones/{zone}/instances.
I want now to filter my instances by region. Any idea how can I do this (using python)?
You can use aggregated_list() to list all the instances in your project. Filtering by region can then be done in your own code. See the code below, where I used a regex on the zone name to mimic a filter using the region variable.
from typing import Dict, Iterable
from google.cloud import compute_v1
import re
def list_all_instances(
    project_id: str,
    region: str
) -> Dict[str, Iterable[compute_v1.Instance]]:
    instance_client = compute_v1.InstancesClient()
    request = {
        "project" : project_id,
    }
    agg_list = instance_client.aggregated_list(request=request)
    all_instances = {}
    print("Instances found:")
    for zone, response in agg_list:
        if response.instances:
            # Zone keys are expected to look like "zones/us-central1-a", so match
            # the escaped region followed by "-" (a bare f"{region}*" also matches us-central2).
            if re.search(rf"{re.escape(region)}-", zone):
                all_instances[zone] = response.instances
                print(f" {zone}:")
                for instance in response.instances:
                    print(f" - {instance.name} ({instance.machine_type})")
    return all_instances
list_all_instances(project_id="your-project-id",region="us-central1") #used us-central1 for testing
NOTE: Code above is from this code. I just modified it to apply the filtering above.
(Two screenshots followed here: the actual instances in my GCP account, and the result of the code above, in which only zones with the prefix us-central1 were displayed.)
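If you would rather avoid a regex altogether, the region can be derived from the zone key with plain string handling. This is a small sketch that assumes zone keys of the form "zones/<region>-<letter>" (for example "zones/us-central1-a"); the helper names region_of and zones_in_region are hypothetical and not part of the compute_v1 library.

```python
def region_of(zone_key: str) -> str:
    """Derive the region from a zone key such as "zones/us-central1-a"."""
    zone = zone_key.rsplit("/", 1)[-1]  # "us-central1-a"
    return zone.rsplit("-", 1)[0]       # "us-central1"


def zones_in_region(zone_keys, region: str):
    """Keep only the zone keys whose region is exactly the requested one."""
    return [key for key in zone_keys if region_of(key) == region]


keys = ["zones/us-central1-a", "zones/us-central1-f", "zones/us-east1-b", "zones/europe-west1-b"]
print(zones_in_region(keys, "us-central1"))
# ['zones/us-central1-a', 'zones/us-central1-f']
```

Because the comparison is an exact match on the derived region, a region such as us-central1 cannot accidentally pick up zones from a similarly named region.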

How to post an image file with a list of strings using FastAPI?

I have tried a lot of things, but it doesn't seem to work. Here is my code:
@app.post("/my-endpoint")
async def my_func(
    languages: List[str] = ["en", "hi"], image: UploadFile = File(...)
):
    ...
The function works fine when I remove one of the parameters, but with both parameters the retrieved list comes out as ["en,hi"], whereas I want it to be ["en", "hi"].
I am not even sure if my approach is correct, hence the broader question, if this approach is not right then how can I post a list and an image together?
Your function looks just fine. That behaviour though has to do with how FastAPI autodocs (Swagger UI)—I am assuming you are using it for testing, as I did myself and noticed the exact same behaviour—handles the list items. For some reason, the Swagger UI/OpenAPI adds all items as a single item to the list, separated by comma (i.e., ["en, hi, ..."], instead of ["en", "hi", ...]).
Testing the code with Python requests and sending the languages' list in the proper way, it works just fine. To fix, however, the behaviour of Swagger UI, or any other tool that might behave the same, you could perform a check on the length of the list that is received in the function, and if it is equal to 1 (meaning that the list contains a single item), then split this item using comma delimiter to get a new list with all languages included.
Below is a working example:
app.py
from fastapi import File, UploadFile, FastAPI
from typing import List
app = FastAPI()
@app.post("/submit")
def submit(languages: List[str] = ["en", "hi"], image: UploadFile = File(...)):
    # A single item means the list arrived comma-joined, so split it
    if len(languages) == 1:
        languages = [item.strip() for item in languages[0].split(',')]
    return {"Languages ": languages, "Uploaded filename": image.filename}
test.py
import requests
url = 'http://127.0.0.1:8000/submit'
image = {'image': open('sample.png', 'rb')}
#payload ={"languages": ["en", "hi"]} # send languages as separate items
payload ={"languages": "en, hi"} # send languages as a single item
resp = requests.post(url=url, data=payload, files=image)
print(resp.json())
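The comma-splitting workaround can also be pulled out into a small helper so that it can be checked without running the server. This is a sketch of the same idea as in app.py above; the name normalize_languages is hypothetical, and dropping empty items is an extra assumption not present in the original.

```python
from typing import List


def normalize_languages(languages: List[str]) -> List[str]:
    """Split a single comma-joined item into separate languages; leave proper lists alone."""
    if len(languages) == 1:
        # Assumption: empty pieces (e.g. from a trailing comma) are not wanted.
        return [item.strip() for item in languages[0].split(",") if item.strip()]
    return languages


print(normalize_languages(["en, hi"]))    # ['en', 'hi']
print(normalize_languages(["en", "hi"]))  # ['en', 'hi']
```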
I solved this using Query parameters! This might be helpful for someone, though I think Chris' answer makes much more sense -
@app.post("/my-endpoint")
async def my_func(
    languages: List[str] = Query(["en", "hi"]), image: UploadFile = File(...)
):
    ...

Does Pydantic accept the same query with both single and multi values?

My schema:
class ArticleBase(Schema):
    page: Optional[int] = 1
    topic: Optional[List[str]] = Query(None)
Router:
@api.get("/articles/", tags = ['Articles'])
def article_list(request, article: ArticleBase = Query(...)):
    return article
I want Pydantic/FastAPI to accept query with both single and multiple values.
I want both of them to get accepted:
www.example.com/articles/?topic=hello&topic=world
www.example.com/articles/?topic=hello
How can I achieve it?
The error message I am getting:
"msg": "value is not a valid list",
"type": "type_error.list"
By removing Optional from my code I managed to get it to work as I wanted. It accepts queries with both single and multiple values.
My working code:
class ArticleBase(Schema):
    page: Optional[int] = 1
    topic: List[str] = Query(None)
I don't understand why that fixed it. Maybe I'm missing something in the documentation, but I read it carefully and didn't find anything related to it.
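For reference, at the URL level the two forms are the same kind of thing: a repeated key collects into a list, and a single key is simply a list of one. The standard-library sketch below shows only that parsing step; it does not explain why Optional changed the framework's behaviour.

```python
from urllib.parse import parse_qs

print(parse_qs("topic=hello&topic=world"))  # {'topic': ['hello', 'world']}
print(parse_qs("topic=hello"))              # {'topic': ['hello']}
```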

Can FastAPI/Pydantic individually validate input items in a list?

I have a FastAPI post method:
from fastapi import FastAPI
from fastapi.encoders import jsonable_encoder
from pydantic import BaseModel
from typing import List
import pandas as pd
class InputItem(BaseModel):
    Feature: str
class OutputItem(BaseModel):
    Feature: str
    Result: str
app = FastAPI()
@app.post("/", response_model=List[OutputItem])
def my_function(input: List[InputItem]):
    df = pd.DataFrame(jsonable_encoder(input))
    result = df.apply(another_method)  # another_method is defined elsewhere
    return result.to_dict(orient="records")
My question is, if I pass it a list like this:
[
{"NOTFeature":"value"},
{"Feature":"value"},
{"Feature":"value"}
]
or if one of the values is of a different data type, the whole thing currently fails and returns an error. Is there a way to get it to handle the error so that the failing entry is skipped, and the API function is still carried out for items in the list which do pass validation?
Incidentally, if there's a smoother way to handle the dataframe conversion which still uses the dataframe (these are essential for the other data handling done in the functions), this would be very helpful to know as well!
Welcome to Stack Overflow.
The short answer to your question is no. That's because it's not Pydantic's (or FastAPI's) responsibility to handle payload contents or fix malformed payloads.
One way to do that is to make the Feature member Optional and filter out the empty entries when the data gets to your method, something like this:
import fastapi
import typing
import pydantic
class InputItem(pydantic.BaseModel):
    # Named Feature to match the payload key; None when the key is missing
    Feature: typing.Optional[str] = None
class OutputItem(pydantic.BaseModel):
    Feature: str
    Result: str
app = fastapi.FastAPI()
@app.post("/", response_model=typing.List[OutputItem])
def my_function(data: typing.List[InputItem]):
    # Drop entries where Feature was missing from the payload
    data = [i for i in data if i.Feature is not None]
    print(data)
    # ... do what you gotta do
Union with dict as a passthrough seemed to work for me, i.e.:
from typing import List, Union
@app.post("/", response_model=List[OutputItem])
def my_function(input: List[Union[InputItem, dict]]):
    df = pd.DataFrame(jsonable_encoder(input))
    result = df.apply(another_method)
    return result.to_dict(orient="records")
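If the endpoint receives the raw items (for example as a list of plain dicts, which is an assumption about how you declare the parameter), each entry can be validated individually and the failures collected instead of aborting the whole request. The sketch below uses only the standard library; split_valid and require_feature are hypothetical names, and require_feature is a stand-in validator. Since Pydantic's ValidationError derives from ValueError, something like lambda raw: InputItem(**raw) could be passed as validate instead.

```python
from typing import Any, Callable, Iterable, List, Tuple


def split_valid(
    raw_items: Iterable[Any], validate: Callable[[Any], Any]
) -> Tuple[List[Any], List[Tuple[int, str]]]:
    """Validate items one by one; return (valid items, [(index, error message), ...])."""
    valid, rejected = [], []
    for index, raw in enumerate(raw_items):
        try:
            valid.append(validate(raw))
        except (ValueError, TypeError, KeyError) as exc:
            rejected.append((index, str(exc)))
    return valid, rejected


def require_feature(raw: dict) -> dict:
    """Stand-in validator: Feature must be present and must be a string."""
    if not isinstance(raw.get("Feature"), str):
        raise ValueError("Feature is required and must be a string")
    return raw


payload = [{"NOTFeature": "value"}, {"Feature": "value"}, {"Feature": 3}]
valid, rejected = split_valid(payload, require_feature)
```

The valid list can then be passed to pd.DataFrame as before, while rejected keeps the index and reason for each skipped entry in case you want to report them back to the caller.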

How to chain validations with pydantic

Let's say I have a webhook where I receive JSON data. This JSON is recursively converted by pydantic.
@app.route("/", methods=['POST'])
async def telegram_webhook(request):
    update = Update.parse_obj(request.json)
    # do something with update
I check that this JSON is a minimal valid object using the Update model (which internally contains the Message model):
class Update(BaseModel):
    update_id: int
    message: Message
...
class Message(BaseModel):
    message_id: int
    text: Optional[str]
But later in the code I want to extend the validation, to check that message is not only a Message but a TextMessage:
# text field is now required
class TextMessage(Message):
    text: str
    @validator('text')
    def check_text_length(cls, value):
        length = len(value)
        if length > 4096:
            raise ValueError(f'text length {length} is too large')
        return value
So I pass message to a validation function:
def process_text_message(message):
    text_message = TextMessage.parse_obj(message)
But I get an error saying that pydantic requires a dict, not a Message.
How would I do that?
How can I apply additional validation to data that has already been (basically) validated?
The short answer is: use message.dict():
def process_text_message(message):
    text_message = TextMessage.parse_obj(message.dict())
The longer answer is that parse_obj should be fixed to cope with "dict-like" things not just dicts, I'll explain that on the issues you created.
