I have this model:
class Text(BaseModel):
    id: str
    text: str = None

class TextsRequest(BaseModel):
    data: list[Text]
    n_processes: Union[int, None]
So I want to be able to take requests like:
{"data": [{"id": "1", "text": "The text 1"}], "n_processes": 8}
and
{"data": [{"id": "1", "text": "The text 1"}]}
Right now in the second case I get
{'data': [{'id': '1', 'text': 'The text 1'}], 'n_processes': None}
using this code:
app = FastAPI()

@app.post("/make_post/", response_model_exclude_none=True)
async def create_graph(request: TextsRequest):
    input_data = jsonable_encoder(request)
So how can I exclude n_processes here?
You can use exclude_none param of Pydantic's model.dict(...):
class Text(BaseModel):
    id: str
    text: str = None

class TextsRequest(BaseModel):
    data: list[Text]
    n_processes: Optional[int]

request = TextsRequest(**{"data": [{"id": "1", "text": "The text 1"}]})
print(request.dict(exclude_none=True))
Output:
{'data': [{'id': '1', 'text': 'The text 1'}]}
Also, it's more idiomatic to write Optional[int] instead of Union[int, None].
Pydantic provides the following arguments for exporting models using the model.dict(...) method:
exclude_unset: whether fields which were not explicitly set when
creating the model should be excluded from the returned dictionary;
default False
exclude_none: whether fields which are equal to None should be
excluded from the returned dictionary; default False
Since you are referring to excluding optional unset parameters, you can use the first argument (i.e., exclude_unset). This is useful when one would like to exclude a parameter only if it has not been set at all, neither to some value nor to None.
The exclude_none argument, however, ignores the fact that an attribute may have been intentionally set to None, and hence excludes it from the returned dictionary.
Example:
from pydantic import BaseModel
from typing import List, Union

class Text(BaseModel):
    id: str
    text: str = None

class TextsRequest(BaseModel):
    data: List[Text]  # in Python 3.9+ you can use: data: list[Text]
    n_processes: Union[int, None] = None

t = TextsRequest(**{'data': [{'id': '1', 'text': 'The text 1'}], 'n_processes': None})
print(t.dict(exclude_none=True))
#> {'data': [{'id': '1', 'text': 'The text 1'}]}
print(t.dict(exclude_unset=True))
#> {'data': [{'id': '1', 'text': 'The text 1'}], 'n_processes': None}
About Optional Parameters
Using Union[int, None] is the same as using Optional[int]; the two are equivalent. What actually makes a parameter optional, however, is the default value, i.e., the = None part.
As per FastAPI documentation (see admonition Note and Info in the link provided):
Note
FastAPI will know that the value of q is not required because of the
default value = None.
The Union in Union[str, None] will allow your editor to give you
better support and detect errors.
Info
Have in mind that the most important part to make a parameter optional
is the part: = None, as it will use that None as the default value, and that way make the
parameter not required.
The Union[str, None] part allows your editor to provide better
support, but it is not what tells FastAPI that this parameter is
not required.
Hence, regardless of the option you may choose to use, if it is not followed by the = None part, FastAPI won't know that the value of the parameter is optional, and hence, the user will have to provide some value for it. One can also check that through the auto-generated API docs at http://127.0.0.1:8000/docs, where the parameter or request body will appear as a Required field.
For example, any of the below would require the user to pass some body content in their request for the TextsRequest model:
@app.post("/upload")
def upload(t: Union[TextsRequest, None]):
    pass

@app.post("/upload")
def upload(t: Optional[TextsRequest]):
    pass
If, however, the above TextsRequest definitions were followed by = None, for example:
@app.post("/upload")
def upload(t: Union[TextsRequest, None] = None):
    pass

@app.post("/upload")
def upload(t: Optional[TextsRequest] = None):
    pass

@app.post("/upload")
def upload(t: TextsRequest = None):  # this should work as well
    pass
the parameter (or body) would be optional, as = None would tell FastAPI that this parameter is not required.
In Python 3.10+
The good news is that in Python 3.10 and above, you don't have to worry about names like Optional and Union, as you can simply use the vertical bar | (also called bitwise or operator, but that meaning is not relevant here) to define an optional parameter (or simply, unions of types). However, the same rule applies to this option as well, i.e., you would still need to add the = None part, if you would like to make the parameter optional (as demonstrated in the example given below).
Example:
@app.post("/upload")
def upload(t: TextsRequest | None = None):
    pass
Related
I'm trying to enable filtering for a set of JSON results through my endpoints. The filter should be optional and can be added directly through the endpoint URL as (url/?postId=1&...)
I'm using FastAPI in Python for this study. Here's what I've got so far.
Router
@router.get('/comments',
            summary="Fetch all comments",
            status_code=200,
            response_model=List[Comments],
            response_description="Returns comments data."
            )
def fetch_all_comments(
    postId: Optional[str] = None,
    id: Optional[int] = None,
    name: Optional[str] = None,
    email: Optional[str] = None,
    body: Optional[str] = None
):
    # FETCHING DATA FROM JSON PLACEHOLDER
    result = jsonplaceholder.fetch_comments()
    for attr in [x for x in result if result[x] is not None]:
        query = query.filter(getattr(Comments, attr).like(result[attr]))
    return query
Dependencies
def fetch_comments():
    # FIRE REQUEST TO GET ALL COMMENTS
    req = requests.get(
        "https://jsonplaceholder.typicode.com/comments", verify=False)
    # HANDLING ERRORS WITHIN THIRD PARTY REQUEST
    if req.status_code not in [200, 201]:
        raise ThirdPartyException(
            "Error", req.status_code, req.reason)
    # PREPARING THE PAYLOAD FOR THE RESPONSE
    response = req.json()
    return response
Models
class Comments(BaseModel):
    postId: str
    id: int
    name: str
    email: str
    body: str
I think the error is on the router side. Running the application gives me this error:
for attr in [x for x in result if result[x] is not None]:
TypeError: list indices must be integers or slices, not dict
Currently, return result gives me all the comments, without any filter. The end result I'm hoping for is that '/comments?postId=1' returns only the JSON results with 'postId=1', and '/comments?postId=1&id=2' returns the results matching both filters 'postId=1&id=2'.
Can anyone recommend a fix? Thanks for any suggestions and help.
Have you tried filtering like this?
from sqlalchemy import and_, or_, not_
query(Comments).filter(and_(Comments.name.like(name), Comments.email.like(email), ...))
Okay, this is how I solved getting the API to return results filtered on every supplied field.
kwargs = '{}'
j = json.loads(kwargs)
if postId:
    j.update({"postId": postId})
if id:
    j.update({"id": id})
if name:
    j.update({"name": name})
if email:
    j.update({"email": email})
if body:
    j.update({"body": body})
arguments = [x for x in j.keys() if x]
result = jsonplaceholder.fetch_comments()
empty = []
for y in arguments:
    empty = filter_comments(
        {'key': y, 'value': j[y]}, empty if empty else result)
print(j)
return empty
There should be a better way to iterate through all the fields than listing each one in an if statement, but I could not think of one, so I solved it this way. Please edit the answer if you can reformat it in a better way. Thanks.
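One way to avoid the chain of if statements is to collect the query parameters in a dict and keep only those that were actually supplied. The following is a minimal sketch, not a drop-in replacement: filter_comments_by and its signature are hypothetical, and it assumes the comments are plain dicts as returned by req.json(). Values are compared as strings because query parameters arrive as strings while the JSON may hold integers.

```python
from typing import Any, Dict, List, Optional


def filter_comments_by(
    comments: List[Dict[str, Any]],
    **filters: Optional[Any],
) -> List[Dict[str, Any]]:
    """Return the comments that match every filter that is not None.

    NOTE: hypothetical helper, not part of FastAPI or the original code.
    """
    # Drop the filters the caller did not supply.
    active = {key: value for key, value in filters.items() if value is not None}
    return [
        comment
        for comment in comments
        # Compare as strings so that "1" (query string) matches 1 (JSON int).
        if all(str(comment.get(key)) == str(value) for key, value in active.items())
    ]


# Made-up sample data in the shape of the jsonplaceholder comments.
sample = [
    {"postId": 1, "id": 1, "name": "a", "email": "a@example.com", "body": "x"},
    {"postId": 1, "id": 2, "name": "b", "email": "b@example.com", "body": "y"},
    {"postId": 2, "id": 3, "name": "c", "email": "c@example.com", "body": "z"},
]

by_post = filter_comments_by(sample, postId="1", id=None)
by_post_and_id = filter_comments_by(sample, postId="1", id=2)
```

Inside the route this could presumably be called as filter_comments_by(result, postId=postId, id=id, name=name, email=email, body=body), with result being the list returned by fetch_comments().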
I am unclear about how to use a @dataclass to convert a mongo doc into a Python dataclass. My NoSQL documents may or may not contain some of the fields. I only want to output a field (using asdict) from the dataclass if that field was present in the mongo document.
Is there a way to create a field that will be output with dataclasses.asdict only if it exists in the mongo doc?
I have tried using __post_init__ but have not figured out a solution.
import json
from dataclasses import InitVar, asdict, dataclass

# in this example I want to output the 'author' field ONLY if it is present in the mongo document
@dataclass
class StoryTitle:
    _id: str
    title: str
    author: InitVar[str] = None
    dateOfPub: int = None

    def __post_init__(self, author):
        print(f'__post_init__ got called....with {author}')
        if author is not None:
            self.newauthor = author
            print(f'self.author is now {self.newauthor}')

# foo and bar approximate documents in mongodb
foo = dict(_id='b23435xx3e4qq', title='goldielocks and the big bears', author='mary', dateOfPub=220415)
newFoo = StoryTitle(**foo)
json_foo = json.dumps(asdict(newFoo))
print(json_foo)

bar = dict(_id='b23435xx3e4qq', title='War and Peace', dateOfPub=220415)
newBar = StoryTitle(**bar)
json_bar = json.dumps(asdict(newBar))
print(json_bar)
My output json does not (of course) have the 'author' field. Anyone know how to accomplish this? I suppose I could just create my own asdict method ...
The dataclasses.asdict helper function unfortunately doesn't offer a way to exclude fields with default or uninitialized values. The dataclass-wizard library, however, does.
The dataclass-wizard is a (de)serialization library I've created, built on top of the dataclasses module. It adds no extra dependencies outside of the stdlib, only the typing-extensions module for compatibility with earlier Python versions.
To skip dataclass fields with default or uninitialized values during serialization, for example with asdict, the dataclass-wizard provides the skip_defaults option. However, there is also a minor issue I noted with your code above. If we set the default for the author field to None, we won't be able to distinguish between a null value and the case where the author field is not present when de-serializing the JSON data.
So in the example below, I've created a CustomNull object similar to the None singleton in Python. The name and implementation don't matter much; we use it as a sentinel object to determine whether a value for author was passed in. If it is not present in the input data when from_dict is called, then we simply exclude it when serializing data with to_dict or asdict, as shown below.
from __future__ import annotations  # can be removed in Python 3.10+

from dataclasses import dataclass

from dataclass_wizard import JSONWizard


# create our own custom `NoneType` class
class CustomNullType:
    # these methods are not really needed, but useful to have.
    def __repr__(self):
        return '<null>'

    def __bool__(self):
        return False


# this is analogous to the builtin `None = NoneType()`
CustomNull = CustomNullType()


# in this example I want to output the 'author' field ONLY if it is present in the mongo document
@dataclass
class StoryTitle(JSONWizard):

    class _(JSONWizard.Meta):
        # skip default values for dataclass fields when `to_dict` is called
        skip_defaults = True

    _id: str
    title: str
    # note: we could also define it like
    #   author: str | None = None
    # however, using that approach we won't know if the value is
    # populated as a `null` when de-serializing the json data.
    author: str | None = CustomNull
    # by default, the `dataclass-wizard` library uses regex to case transform
    # json fields to snake case, and caches the field name for next time.
    # dateOfPub: int = None
    date_of_pub: int = None


# foo and bar approximate documents in mongodb
foo = dict(_id='b23435xx3e4qq', title='goldielocks and the big bears', author='mary', dateOfPub=220415)
new_foo = StoryTitle.from_dict(foo)
json_foo = new_foo.to_json()
print(json_foo)

bar = dict(_id='b23435xx3e4qq', title='War and Peace', dateOfPub=220415)
new_bar = StoryTitle.from_dict(bar)
json_bar = new_bar.to_json()
print(json_bar)

# lastly, we try de-serializing with `author=null`. the `author` field should still
# be populated when serializing the instance, as it was present in input data.
bar = dict(_id='b23435xx3e4qq', title='War and Peace', dateOfPub=220415, author=None)
new_bar = StoryTitle.from_dict(bar)
json_bar = new_bar.to_json()
print(json_bar)
Output:
{"_id": "b23435xx3e4qq", "title": "goldielocks and the big bears", "author": "mary", "dateOfPub": 220415}
{"_id": "b23435xx3e4qq", "title": "War and Peace", "dateOfPub": 220415}
{"_id": "b23435xx3e4qq", "title": "War and Peace", "author": null, "dateOfPub": 220415}
Note: the dataclass-wizard can be installed with pip:
$ pip install dataclass-wizard
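If an extra dependency is not wanted, the same sentinel idea can probably be approximated with the standard library alone by passing a dict_factory to dataclasses.asdict. This is only a sketch under that assumption: the _Missing class, the MISSING object, and the drop_missing function are hypothetical names, and the field names are kept as in the question.

```python
from dataclasses import asdict, dataclass
from typing import Any, Iterable, Optional, Tuple


class _Missing:
    """Marker type for 'field was absent from the document' (hypothetical)."""

    def __repr__(self) -> str:
        return "<missing>"


MISSING = _Missing()


def drop_missing(pairs: Iterable[Tuple[str, Any]]) -> dict:
    # asdict() deep-copies field values, so test by type rather than identity.
    return {key: value for key, value in pairs if not isinstance(value, _Missing)}


@dataclass
class StoryTitle:
    _id: str
    title: str
    author: Any = MISSING  # stays MISSING unless the document provides it
    dateOfPub: Optional[int] = None


with_author = asdict(
    StoryTitle(_id="1", title="A", author="mary"), dict_factory=drop_missing
)
without_author = asdict(StoryTitle(_id="2", title="B"), dict_factory=drop_missing)
null_author = asdict(
    StoryTitle(_id="3", title="C", author=None), dict_factory=drop_missing
)
```

With this, an explicit author=None in the document is still serialized as null, while a document without the key produces no author entry at all.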
I have a pydantic object definition that includes an optional field. I am looking to be able to configure the field to only be serialised if it is not None.
class MyObject(BaseModel):
    id: str
    msg: Optional[str] = None
    pri: Optional[int] = None

MyObject(id="123").json()  # ideal output: {"id": "123", "pri": null}
MyObject(id="123", msg="hello").json()  # ideal output: {"id": "123", "msg": "hello", "pri": null}
I would like to be able to specify the field precisely, as this object will be nested, and there are other optional fields that should be returned, regardless of whether they are None or not.
The solution to set json option exclude_none to True won't work for this purpose.
You can't do such a thing with pydantic, or even with a more powerful library like attrs. The reason may be that it is not a good way of returning a JSON object: it is really confusing for you, the API client, and your test suite.
You may get some inspiration from elegant-way-to-remove-fields-from-nested-dictionaries.
You would be able to achieve something (not recommended at all) by parsing the JSON form of your object and removing fields according to some logic.
Example of key/value manipulation in a nested dict:
import re


def dict_key_convertor(dictionary):
    """
    Convert a dictionary from CamelCase to snake_case
    :param dictionary: the dictionary given
    :return: return a dict
    """
    if not isinstance(dictionary, (dict, list)):
        return dictionary
    if isinstance(dictionary, list):
        return [dict_key_convertor(elem) for elem in dictionary]
    return {to_snake(key): dict_key_convertor(data) for key, data in dictionary.items()}


def to_snake(word) -> str:
    """
    Convert all word from camel to snake case
    :param word: the word given to be change from camelCase to snake_case
    :return: return word variable in snake_case
    """
    return re.sub(r'([A-Z]{2,}(?=[a-z]))', '\\1_', re.sub(r'([a-z])([A-Z]+)', '\\1_\\2', word)).lower()
With a bit of work you may achieve something like this:
from typing import Any, Set

# keys that are removed from the output when their value is None
SPECIAL_KEYS: Set[str] = {"field_name", "field_name1", "field_name2"}


def dict_key_cleaner(data: Any) -> Any:
    """Recursively drop the special keys whose value is None."""
    if isinstance(data, list):
        return [dict_key_cleaner(elem) for elem in data]
    if not isinstance(data, dict):
        return data
    # build a new dict instead of popping keys while iterating
    return {
        key: dict_key_cleaner(value)
        for key, value in data.items()
        if not (key in SPECIAL_KEYS and value is None)
    }
I want to make a "partial update" endpoint, but I don't want to allow passing null in any field.
Here is the guide from fastapi https://fastapi.tiangolo.com/tutorial/body-updates/#partial-updates-with-patch :
class Item(BaseModel):
    name: Optional[str] = None
    description: Optional[str] = None
    price: Optional[float] = None
    tax: float = 10.5
    tags: List[str] = []

@app.patch("/items/{item_id}", response_model=Item)
async def update_item(item_id: str, item: Item):
    ...
    update_data = item.dict(exclude_unset=True)
    ...
With this approach a user can pass {"name": null} and corrupt the database, because in my case name should always be a string.
So what should I do? The only approach I see so far is playing around with some sentinel objects (using them as "unset" marker instead of None), but this seems hacky and I doubt that pydantic will allow me to do this.
You could use exclude_none in order to exclude values that are equal to None.
Example
item.dict(exclude_none=True)
Source: Pydantic docs
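Note that exclude_none drops explicit nulls silently rather than rejecting them. If the request should fail instead, one option is to check the dict produced by exclude_unset before applying it. Below is a minimal framework-free sketch of that check; apply_patch and the non_nullable set are hypothetical names, and in a real endpoint the ValueError would presumably be turned into an HTTP error response.

```python
from typing import Any, Dict, FrozenSet


def apply_patch(
    stored: Dict[str, Any],
    patch: Dict[str, Any],
    non_nullable: FrozenSet[str] = frozenset(),
) -> Dict[str, Any]:
    """Merge `patch` into `stored`, refusing explicit None for protected keys.

    NOTE: hypothetical helper; `patch` stands for item.dict(exclude_unset=True).
    """
    rejected = sorted(
        key for key, value in patch.items() if value is None and key in non_nullable
    )
    if rejected:
        raise ValueError(f"null is not allowed for: {', '.join(rejected)}")
    # Keys missing from the patch keep their stored value.
    return {**stored, **patch}


stored_item = {"name": "Foo", "description": "old", "price": 1.0}

# A null for a nullable field is accepted.
updated = apply_patch(stored_item, {"description": None}, frozenset({"name"}))

# A null for a protected field is refused.
try:
    apply_patch(stored_item, {"name": None}, frozenset({"name"}))
    rejected_null = False
except ValueError:
    rejected_null = True
```

The stored record is not modified in place, so a rejected patch leaves it untouched.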
I have the following class
@dataclass_json
@dataclass
class Source:
    type: str = None
    label: str = None
    path: str = None

and the two subclasses:
@dataclass_json
@dataclass
class Csv(Source):
    csv_path: str = None
    delimiter: str = ';'

and
@dataclass_json
@dataclass
class Parquet(Source):
    parquet_path: str = None
Given now the dictionaries:
parquet = {'type': 'Parquet', 'label': 'events', 'path': '/.../test.parquet', 'parquet_path': '../../result.parquet'}
csv = {'type': 'Csv', 'label': 'events', 'path': '/.../test.csv', 'csv_path': '../../result.csv', 'delimiter': ','}
Now I would like to do something like
Source().from_dict(csv)
and get an instance of the class Csv or Parquet as output. I understand that if you instantiate the class Source you just "upload" the parameters with the method from_dict, but is there any possibility of doing this through some type of inheritance, without using a "constructor" that does an if / else if / else over all possible types?
Pureconfig, a Scala library, creates different case classes when the attribute 'type' has the name of the desired subclass. Is this possible in Python?
You can build a helper that picks and instantiates the appropriate subclass.
def from_data(data: dict, tp: type):
    """Create the subtype of ``tp`` for the given ``data``"""
    subtype = [
        stp for stp in tp.__subclasses__()  # look through all subclasses...
        if stp.__name__ == data['type']     # ...and select by type name
    ][0]
    return subtype(**data)  # instantiate the subtype
This can be called with your data and the base class from which to select:
>>> from_data(
... {'type': 'Csv', 'label': 'events', 'path': '/.../test.csv', 'csv_path': '../../result.csv', 'delimiter':','},
... Source,
... )
Csv(type='Csv', label='events', path='/.../test.csv', csv_path='../../result.csv', delimiter=',')
If you need to run this often, it is worth building a dict to optimise the subtype lookup. A simple means is to add a method to your base class, and store the lookup there:
@dataclass_json
@dataclass
class Source:
    type: str = None
    label: str = None
    path: str = None

    @classmethod
    def from_data(cls, data: dict):
        if not hasattr(cls, '_lookup'):
            cls._lookup = {stp.__name__: stp for stp in cls.__subclasses__()}
        return cls._lookup[data["type"]](**data)
This can be called directly on the base class:
>>> Source.from_data({'type': 'Csv', 'label': 'events', 'path': '/.../test.csv', 'csv_path': '../../result.csv', 'delimiter':','})
Csv(type='Csv', label='events', path='/.../test.csv', csv_path='../../result.csv', delimiter=',')
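One caveat: __subclasses__() only returns direct subclasses, so a class derived from Csv would not be found by the lookup above. The following is a minimal sketch of a recursive variant, assuming plain dataclasses without dataclass_json; all_subclasses and the TabCsv class are hypothetical names added for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


def all_subclasses(cls: type) -> Dict[str, type]:
    """Map class name -> class for every direct and indirect subclass."""
    found: Dict[str, type] = {}
    for sub in cls.__subclasses__():
        found[sub.__name__] = sub
        found.update(all_subclasses(sub))  # descend into grandchildren
    return found


@dataclass
class Source:
    type: Optional[str] = None
    label: Optional[str] = None
    path: Optional[str] = None

    @classmethod
    def from_data(cls, data: dict) -> "Source":
        # Rebuilt on every call for simplicity; cache it if this is a hot path.
        return all_subclasses(cls)[data["type"]](**data)


@dataclass
class Csv(Source):
    csv_path: Optional[str] = None
    delimiter: str = ";"


@dataclass
class TabCsv(Csv):  # hypothetical grandchild of Source
    delimiter: str = "\t"


source = Source.from_data({"type": "TabCsv", "label": "events", "csv_path": "r.tsv"})
```

The lookup is keyed by class name, so two subclasses with the same name in different modules would collide; this sketch does not handle that case.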
This is a variation on my answer to this question.
@dataclass_json
@dataclass
class Source:
    type: str = None
    label: str = None
    path: str = None

    def __new__(cls, type=None, **kwargs):
        for subclass in cls.__subclasses__():
            if subclass.__name__ == type:
                break
        else:
            subclass = cls
        instance = super(Source, subclass).__new__(subclass)
        return instance

assert type(Source(**csv)) == Csv
assert type(Source(**parquet)) == Parquet
assert Csv(**csv) == Source(**csv)
assert Parquet(**parquet) == Source(**parquet)
You asked and I am happy to oblige. However, I'm questioning whether this is really what you need. I think it might be overkill for your situation. I originally figured this trick out so I could instantiate directly from data when...
- my data was heterogeneous and I didn't know ahead of time which subclass was appropriate for each datum,
- I didn't have control over the data, and
- figuring out which subclass to use required some processing of the data, processing which I felt belonged inside the class (for logical reasons as well as to avoid polluting the scope in which the instantiating took place).
If those conditions apply to your situation, then I think this is a worth-while approach. If not, the added complexity of mucking with __new__ -- a moderately advanced maneuver -- might not outweigh the savings in complexity in the code used to instantiate. There are probably simpler alternatives.
For example, it appears as though you already know which subclass you need; it's one of the fields in the data. If you put it there, presumably whatever logic you wrote to do so could be used to instantiate the appropriate subclass right then and there, bypassing the need for my solution. Alternatively, instead of storing the name of the subclass as a string, store the subclass itself. Then you could do this: data['type'](**data)
It also occurs to me that maybe you don't need inheritance at all. Do Csv and Parquet store the same type of data, differing only in which file format they read it from? Then maybe you just need one class with from_csv and from_parquet methods. Alternatively, if one of the parameters is a filename, it would be easy to figure out which type of file parsing you need based on the filename extension. Normally I'd put this in __init__, but since you're using dataclass, I guess this would happen in __post_init__.
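As a rough illustration of the filename-extension idea, the following sketch picks the class from the suffix of the path. It assumes plain dataclasses and a hypothetical source_from_path helper; the suffix-to-class mapping and the sample paths are assumptions, not something taken from the question.

```python
from dataclasses import dataclass
from pathlib import PurePath
from typing import Optional


@dataclass
class Source:
    label: Optional[str] = None
    path: Optional[str] = None


@dataclass
class Csv(Source):
    delimiter: str = ";"


@dataclass
class Parquet(Source):
    pass


# Assumed mapping from file suffix to class.
BY_SUFFIX = {".csv": Csv, ".parquet": Parquet}


def source_from_path(path: str, **kwargs) -> Source:
    """Instantiate the Source subclass that matches the file extension."""
    suffix = PurePath(path).suffix.lower()
    try:
        cls = BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"unsupported file type: {suffix!r}") from None
    return cls(path=path, **kwargs)


csv_source = source_from_path("/data/test.csv", label="events", delimiter=",")
parquet_source = source_from_path("/data/test.parquet", label="events")
```

With this approach the 'type' key is no longer needed in the data, since the extension carries the same information.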
Do you need this behavior?
from dataclasses import dataclass
from typing import Optional, Union, List

from validated_dc import ValidatedDC


@dataclass
class Source(ValidatedDC):
    label: Optional[str] = None
    path: Optional[str] = None


@dataclass
class Csv(Source):
    csv_path: Optional[str] = None
    delimiter: str = ';'


@dataclass
class Parquet(Source):
    parquet_path: Optional[str] = None


@dataclass
class InputData(ValidatedDC):
    data: List[Union[Parquet, Csv]]


# Let's say you got a json-string and loaded it:
data = [
    {
        'label': 'events', 'path': '/.../test.parquet',
        'parquet_path': '../../result.parquet'
    },
    {
        'label': 'events', 'path': '/.../test.csv',
        'csv_path': '../../result.csv', 'delimiter': ','
    }
]

input_data = InputData(data=data)

for item in input_data.data:
    print(item)

# Parquet(label='events', path='/.../test.parquet', parquet_path='../../result.parquet')
# Csv(label='events', path='/.../test.csv', csv_path='../../result.csv', delimiter=',')
validated_dc: https://github.com/EvgeniyBurdin/validated_dc