Currently, I'm learning Python and FastAPI, and I can't figure out what typing.Optional is for.
class Post(BaseModel):
    # default value
    rating: int = None
    # typing.Optional
    rating: Optional[int] = None
Both work, and I don't understand the difference.
From the documentation (see typing.Optional):
Optional[x] is simply shorthand for Union[x, None]
In Pydantic, this means that passing a value for the field becomes optional.
In other words, it's not necessary to pass in the field and value when initialising the model, and the value will default to None
(this is slightly different from optional arguments in function calls, as described here).
It's also not necessary to explicitly specify None as the default.
It appears to be mostly syntactic sugar in this case, but it helps to make the model more readable.
In more advanced cases, it may be necessary to require a field to be explicitly passed into the model, even though the value could be None,
as suggested in the section on Required Optional Fields,
in which case the distinction becomes necessary.
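For example, a minimal sketch of such a required-but-nullable field (Pydantic v1 syntax, where Field(...) marks the field as required even though its type allows None):

from typing import Optional
from pydantic import BaseModel, Field

class Post(BaseModel):
    # Must be passed explicitly, but None is an accepted value.
    rating: Optional[int] = Field(...)

Post(rating=None)  # OK: the field was passed explicitly
Post()             # raises ValidationError: field required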
It always depends on the use case but it's not unusual to use a default value of the same type,
or make the field required.
Here's a more common scenario for this:
from pydantic import BaseModel
from typing import Optional
class Post(BaseModel):
    # rating is required and must be an integer.
    rating: int
    # counter is not required and will default to 1 if nothing is passed.
    counter: int = 1
    # comment is optional and defaults to None; if passed, it will be coerced to a str.
    comment: Optional[str]
# This will work:
post = Post(rating=10)
repr(post)
# 'Post(rating=10, counter=1, comment=None)'
# This will work as well:
post = Post(rating=10, comment="some text")
repr(post)
# "Post(rating=10, counter=1, comment='some text')"
# But this won't work:
post = Post(comment="some text")
# ...
# ValidationError: 1 validation error for Post
# rating
# field required (type=value_error.missing)
# And this won't work either:
post = Post(rating=10, counter=None)
# ...
# ValidationError: 1 validation error for Post
# counter
# none is not an allowed value (type=type_error.none.not_allowed)
Related
I am trying to automatically convert a Pydantic model to a DB schema. To do that, I am recursively looping through a Pydantic model's fields to determine the type of field.
As an example, I have this simple model:
from typing import List
from pydantic import BaseModel
class TestModel(BaseModel):
    tags: List[str]
I am recursing through the model using the __fields__ property as described here: https://docs.pydantic.dev/usage/models/#model-properties
If I do type(TestModel).__fields__['tags'] I see:
ModelField(name='tags', type=List[str], required=True)
I want to programmatically check if the ModelField type has a List origin. I have tried the following, and none of them work:
type(TestModel).__fields__['tags'].type_ is List[str]
type(TestModel).__fields__['tags'].type_ == List[str]
typing.get_origin(type(TestModel).__fields__['tags'].type_) is List
typing.get_origin(type(TestModel).__fields__['tags'].type_) == List
Frustratingly, this does return True:
type(TestModel).__fields__['tags'].type_ is str
What is the correct way for me to confirm a field is a List type?
Pydantic has the concept of the shape of a field. These shapes are encoded as integers and available as constants in the fields module. The more-or-less standard types have been accommodated there already. If a field was annotated with list[T], then the shape attribute of the field will be SHAPE_LIST and the type_ will be T.
The type_ refers to the element type in the context of everything that is not SHAPE_SINGLETON, i.e. with container-like types. This is why you get str in your example.
Thus for something as simple as list, you can simply check the shape against that constant:
from pydantic import BaseModel
from pydantic.fields import SHAPE_LIST
class TestModel(BaseModel):
    tags: list[str]
    other: tuple[str]
tags_field = TestModel.__fields__["tags"]
other_field = TestModel.__fields__["other"]
assert tags_field.shape == SHAPE_LIST
assert other_field.shape != SHAPE_LIST
If you want more insight into the actual annotation of the field, that is stored in the annotation attribute of the field. With that you should be able to do all the typing related analyses like get_origin.
That means another way of accomplishing your check would be this:
from typing import get_origin
from pydantic import BaseModel
class TestModel(BaseModel):
    tags: list[str]
    other: tuple[str]
tags_field = TestModel.__fields__["tags"]
other_field = TestModel.__fields__["other"]
assert get_origin(tags_field.annotation) is list
assert get_origin(other_field.annotation) is tuple
Sadly, neither of those attributes is officially documented anywhere as far as I know, but the beauty of open source is that we can just check ourselves. Neither the attributes nor the shape constants are obfuscated, protected, or made private in any of the usual ways, so I'll assume they are stable (at least until Pydantic v2 drops).
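Since the original goal was to recurse through the model, typing.get_args can recover the element type from the annotation attribute mentioned above for the recursive step. A minimal sketch (same caveat about these attributes being undocumented):

from typing import get_args, get_origin
from pydantic import BaseModel

class TestModel(BaseModel):
    tags: list[str]

field = TestModel.__fields__["tags"]
if get_origin(field.annotation) is list:
    (element_type,) = get_args(field.annotation)
    print(element_type)  # <class 'str'>; recurse into this type as needed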
We'd like to enforce parameter checking before people can insert into a schema as shown below, but the code below doesn't work.
Is there a way to implement pre-insertion parameter checking?
@schema
class AutomaticCurationParameters(dj.Manual):
    definition = """
    auto_curation_params_name: varchar(200) # name of this parameter set
    ---
    merge_params: blob # dictionary of params to merge units
    label_params: blob # dictionary of params to label units
    """

    def insert1(key, **kwargs):
        # validate the labels and then insert
        # TODO: add validation for merge_params
        for metric in key['label_params']:
            if metric not in _metric_name_to_func:
                raise Exception(f'{metric} not in list of available metrics')
            comparison_list = key['label_params'][metric]
            if comparison_list[0] not in _comparison_to_function:
                raise Exception(f'{metric}: {comparison_list[0]} not in list of available comparisons')
            if type(comparison_list[1]) != int and type(comparison_list[1]) != float:
                raise Exception(f'{metric}: {comparison_list[1]} not a number')
            for label in comparison_list[2]:
                if label not in valid_labels:
                    raise Exception(f'{metric}: {comparison_list[2]} not a valid label: {valid_labels}')
        super().insert1(key, **kwargs)
This is a great question that has come up many times for us.
Most likely the issue is either that you are missing the class's self reference, or that you are missing the case where the key is passed in as a keyword argument (we actually expect it as a row instead).
I'll demonstrate a simple example that hopefully illustrates how to inject your validation code; you can tweak it to do what you intend above.
Suppose we want to track file paths within a dj.Manual table, but we'd like to validate that only file paths with a certain extension are inserted.
As you've already discovered, we can achieve this through overloading, like so:
import datajoint as dj
schema = dj.Schema('rguzman_insert_validation')
@schema
class FilePath(dj.Manual):
    definition = '''
    file_id: int
    ---
    file_path: varchar(100)
    '''

    def insert1(self, *args, **kwargs):  # Notice that we need the self reference
        key = kwargs['row'] if 'row' in kwargs else args[0]  # Handles the key as arg or kwarg
        if '.md' not in key['file_path']:
            raise Exception('Sorry, we only support Markdown files...')
        super().insert1(*args, **kwargs)
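With that override in place, a quick usage sketch (the file names are only illustrative):

FilePath().insert1({'file_id': 1, 'file_path': 'README.md'})   # accepted
FilePath().insert1({'file_id': 2, 'file_path': 'README.txt'})  # raises Exception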
P.S. Though this example is meant to illustrate the concept, there is actually a better way of doing the above if you are using MySQL 8. MySQL provides a CHECK constraint for simple validation, and DataJoint will respect it. If those conditions are met, you can simplify the table to:
import datajoint as dj
schema = dj.Schema('rguzman_insert_validation')
@schema
class FilePath(dj.Manual):
    definition = '''
    file_id: int
    ---
    file_path: varchar(100) CHECK(REGEXP_LIKE(file_path, '^.*\.md$', 'c'))
    '''
I wanted to know what the difference is between:
from pydantic import BaseModel, Field
class Person(BaseModel):
    name: str = Field(..., min_length=1)
And:
from pydantic import BaseModel, constr
class Person(BaseModel):
    name: constr(min_length=1)
Both seem to perform the same validation (even raise the exact same exception info when name is an empty string). Is it just a matter of code style? Is one of them preferred over the other?
Also, if I wanted to include a list of nonempty strings as an attribute, which of these ways do you think would be better?:
from typing import List
from pydantic import BaseModel, constr
class Person(BaseModel):
    languages: List[constr(min_length=1)]
Or:
from typing import List
from pydantic import BaseModel, validator

class Person(BaseModel):
    languages: List[str]

    @validator('languages', each_item=True)
    def check_nonempty_strings(cls, v):
        if not v:
            raise ValueError('Empty string is not a valid language.')
        return v
EDIT:
FWIW, I am using this for a FastAPI app.
EDIT2:
For my 2nd question, I think the first alternative is better, as it includes the length requirement in the schema (and so it appears in the documentation).
constr and Field don't serve the same purpose.
constr is a specific type that applies validation rules to that particular type; there are equivalents for all the classic Python types.
The arguments of constr:
strip_whitespace: bool = False: removes leading and trailing whitespace
to_lower: bool = False: turns all characters to lowercase
to_upper: bool = False: turns all characters to uppercase
strict: bool = False: controls type coercion
min_length: int = None: minimum length of the string
max_length: int = None: maximum length of the string
curtail_length: int = None: shrinks the string length to the set value when it is longer than the set value
regex: str = None: regex to validate the string against
As you can see, those arguments let you constrain the str value itself, not the behaviour of Pydantic with respect to the field.
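A short sketch combining a few of these arguments (the model and values are only illustrative):

from pydantic import BaseModel, constr

class User(BaseModel):
    # stripped of surrounding whitespace, lowercased, then checked against the regex
    username: constr(strip_whitespace=True, to_lower=True, regex='^[a-z0-9_]+$')

print(User(username='  Alice_42  ').username)  # 'alice_42'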
Field serves a different purpose: it is a way of customizing fields, all fields and not only str, and it adds 18 customization parameters that you can find here.
Is it just a matter of code style? Is one of them preferred over the other?
For the specific case of str it is a matter of code style, and which one is preferred doesn't matter; only your use case does.
In general it is better not to mix different syntaxes together, and since you often need Field(), you will encounter it often.
A classic use case is an API response that sends JSON objects in camelCase or PascalCase; you would use field aliases to match those objects and work with their attributes in snake_case.
Example:
class Voice(BaseModel):
    name: str = Field(None, alias='ActorName')
    language_code: str = None
    mood: str = None
For your 2nd question you are right: using constr is surely the best approach, since the validation rule will be added to the OpenAPI doc.
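To illustrate, this is roughly what Pydantic v1 generates for the constr variant (output trimmed):

from typing import List
from pydantic import BaseModel, constr

class Person(BaseModel):
    languages: List[constr(min_length=1)]

print(Person.schema())
# {'title': 'Person', 'type': 'object',
#  'properties': {'languages': {'title': 'Languages', 'type': 'array',
#                               'items': {'type': 'string', 'minLength': 1}}},
#  'required': ['languages']}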
If you want to learn more about limitations and field rule enforcement, check this.
This link shows the methods that do and don't work for pydantic and mypy together: https://lyz-code.github.io/blue-book/coding/python/pydantic_types/#using-constrained-strings-in-list-attributes
The best option for my use case was to make a class that inherits from pydantic.ConstrainedStr, like so:
import re
import pydantic
from typing import List
...

class Regex(pydantic.ConstrainedStr):
    regex = re.compile("^[0-9a-z_]*$")

class Data(pydantic.BaseModel):
    regex: List[Regex]
    # regex: list[Regex] if you are on 3.9+
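A quick usage sketch of the class above:

Data(regex=["abc_123"])    # passes: matches ^[0-9a-z_]*$
Data(regex=["NotValid!"])  # raises pydantic.ValidationError (regex mismatch)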
Consider the following
from pydantic import BaseModel, Field
class Model(BaseModel):
    required: str
This will make required a required field for Model; however, in the FastAPI autogenerated Swagger docs it will have an example value of "string".
How can I make a required field with a sensible default? If I make a model like
from pydantic import BaseModel, Field
class Model(BaseModel):
    required: str = 'Sensible default'
Then the field required is no longer required, but it shows up with a sensible default in the docs. Is there an easy workaround for this?
You can use Field() to set up those options and checks.
from pydantic import BaseModel, Field
class Model(BaseModel):
    something: str  # required, shows "string"
    something: str = None  # not required, shows "string"
    something: str = Field(..., example="this is the default display")  # required, shows the example
    something: str = Field(None, example="Foobar")  # not required, shows the example
There are a multitude of different parameters that Field() can validate against.
I haven't looked into why the (Pydantic) model representation within the OpenAPI version that ships with FastAPI leaves the asterisk out, but the field is definitely still required (try sending a null value, or anything other than a string). This might just be a UI inconsistency.
I'm using the Python jsonschema library (https://python-jsonschema.readthedocs.io/en/latest/),
and I'm trying to find out how to use default values and remove additional fields when they are found.
Does anyone know how I'm supposed to do that?
Or is there another solution for validating against a JSON schema that supports default values and removes any additional fields (like JS's Ajv)?
Hidden in the FAQs you'll find this:
Why doesn't my schema's default property set the default on my instance?
The basic answer is that the specification does not require that default actually do anything.
For an inkling as to why it doesn’t actually do anything, consider
that none of the other validators modify the instance either. More
importantly, having default modify the instance can produce quite
peculiar things. It’s perfectly valid (and perhaps even useful) to
have a default that is not valid under the schema it lives in! So an
instance modified by the default would pass validation the first time,
but fail the second!
Still, filling in defaults is a thing that is useful. jsonschema
allows you to define your own validator classes and callables, so you
can easily create a jsonschema.IValidator that does do default
setting. Here’s some code to get you started. (In this code, we add
the default properties to each object before the properties are
validated, so the default values themselves will need to be valid
under the schema.)
from jsonschema import Draft4Validator, validators
def extend_with_default(validator_class):
    validate_properties = validator_class.VALIDATORS["properties"]

    def set_defaults(validator, properties, instance, schema):
        for property, subschema in properties.items():
            if "default" in subschema:
                instance.setdefault(property, subschema["default"])

        for error in validate_properties(
            validator, properties, instance, schema,
        ):
            yield error

    return validators.extend(
        validator_class, {"properties": set_defaults},
    )
DefaultValidatingDraft4Validator = extend_with_default(Draft4Validator)
# Example usage:
obj = {}
schema = {'properties': {'foo': {'default': 'bar'}}}
# Note jsonschema.validate(obj, schema, cls=DefaultValidatingDraft4Validator)
# will not work because the metaschema contains `default` directives.
DefaultValidatingDraft4Validator(schema).validate(obj)
assert obj == {'foo': 'bar'}
From: https://python-jsonschema.readthedocs.io/en/latest/faq/#why-doesn-t-my-schema-s-default-property-set-the-default-on-my-instance
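For the other half of the question (removing additional fields, like Ajv's removeAdditional), the same extension hook can prune undeclared keys before validating. A minimal sketch, assuming every object schema lists all of its allowed keys under properties:

from jsonschema import Draft4Validator, validators

def extend_with_prune(validator_class):
    validate_properties = validator_class.VALIDATORS["properties"]

    def prune_then_validate(validator, properties, instance, schema):
        if validator.is_type(instance, "object"):
            # Drop any keys that this schema does not declare.
            for key in list(instance):
                if key not in properties:
                    del instance[key]
        for error in validate_properties(validator, properties, instance, schema):
            yield error

    return validators.extend(validator_class, {"properties": prune_then_validate})

PruningDraft4Validator = extend_with_prune(Draft4Validator)

# Example usage:
obj = {'foo': 1, 'extra': 'dropped'}
schema = {'properties': {'foo': {'type': 'integer'}}}
PruningDraft4Validator(schema).validate(obj)
assert obj == {'foo': 1}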