Problem in a nutshell
I am having issues with the hypothesis build strategy and custom pydantic data types (no values are returned when invoking the build strategy on my custom data type).
Problem in more detail
Given the following pydantic custom type, which just validates that a value is a timezone:
import pytz
from pydantic import StrictStr

TIMEZONES = pytz.common_timezones_set

class CountryTimeZone(StrictStr):
    """Validate a country timezone."""

    @classmethod
    def __get_validators__(cls):
        yield from super().__get_validators__()
        yield cls.validate_timezone

    @classmethod
    def validate_timezone(cls, v):
        breakpoint()
        if v not in TIMEZONES:
            raise ValueError(f"{v} is not a valid country timezone")
        return v

    @classmethod
    def __modify_schema__(cls, field_schema):
        field_schema.update(examples=TIMEZONES)
When I attempt to use this in some schema...
from pydantic import BaseModel

class Foo(BaseModel):
    bar: CountryTimeZone
and subsequently try to build an example in a test, using the pydantic hypothesis plugin, like:

from hypothesis import given
from hypothesis import strategies as st

@given(st.builds(Foo))
def test_something_interesting(schema) -> None:
    # Some assertions
    ...
schema.bar is always "".
Questions
Is there something missing from this implementation, meaning that values like "Asia/Krasnoyarsk" aren't being generated? From the documentation, examples like PaymentCardNumber and EmailStr build as expected.
Even when using StrictStr by itself, the resulting value is also an empty string. I tried to inherit from str but still no luck.
Came across the same problem today. Seems like the wording in the hypothesis plugin docs gives the wrong impression: pydantic has written hypothesis integrations for their custom types, not that hypothesis supports custom pydantic types out of the box.
Here is a full example of creating a custom class, assigning it a test strategy and using it in a pydantic model.
import re

from hypothesis import given, strategies as st
from pydantic import BaseModel

CAPITAL_WORD = r"^[A-Z][a-z]+"
CAPITAL_WORD_REG = re.compile(CAPITAL_WORD)

class MustBeCapitalWord(str):
    """Custom class that validates that the string is a single word of only
    letters, starting with a capital letter."""

    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def __modify_schema__(cls, field_schema):
        # optional stuff, updates the schema if you choose to export the
        # pydantic schema
        field_schema.update(
            pattern=CAPITAL_WORD,
            examples=["Hello", "World"],
        )

    @classmethod
    def validate(cls, v):
        if not isinstance(v, str):
            raise TypeError("string required")
        if not v:
            raise ValueError("No capital letter found")
        elif CAPITAL_WORD_REG.match(v) is None:
            raise ValueError("Input is not a valid word starting with a capital letter")
        return cls(v)

    def __repr__(self):
        return f"MustBeCapitalWord({super().__repr__()})"

# register a strategy for our custom type
st.register_type_strategy(
    MustBeCapitalWord,
    st.from_regex(CAPITAL_WORD, fullmatch=True),
)

# use our custom type in a pydantic model
class Model(BaseModel):
    word: MustBeCapitalWord

# test it all
@given(st.builds(Model))
def test_model(instance):
    assert instance.word[0].isupper()
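Applying the same idea to the CountryTimeZone type from the original question, a strategy can be registered for it too. This is a minimal sketch using a simplified stand-in class rather than the question's full StrictStr subclass:

```python
import pytz
from hypothesis import strategies as st

TIMEZONES = pytz.common_timezones_set

class CountryTimeZone(str):
    """Simplified stand-in for the question's custom type."""

# Tell hypothesis how to generate values for the custom type;
# sampled_from draws from the known-valid timezone names.
st.register_type_strategy(
    CountryTimeZone,
    st.sampled_from(sorted(TIMEZONES)).map(CountryTimeZone),
)
```

With the strategy registered, st.builds no longer falls back to generating empty strings for fields of this type.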
Related
I have a frozen dataclass MyData that holds data.
I would like a distinguished subclass MySpecialData that can only hold data of length 1.
Here is a working implementation.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class MyData:
    id: int = field()
    data: list[float] = field()

    def __len__(self) -> int:
        return len(self.data)

@dataclass(frozen=True)
class MySpecialData(MyData):
    def __post_init__(self):
        assert len(self) == 1

# correctly throws exception
special_data = MySpecialData(id=1, data=[2, 3])
I spent some time messing with __new__ and __init__, but couldn't reach a working solution.
The code works, but I am a novice and would like an experienced opinion on whether this is the "right" way to accomplish this.
Any critiques or suggestions on how to do this better or more correctly would be appreciated.
For examples not using dataclasses, I imagine the correct way would be overriding __new__ in the subclass.
I suspect my attempts at overriding __new__ fail here because of the special way dataclasses works.
Would you agree?
Thank you for your opinion.
Don't use assert. Use

if len(self) != 1:
    raise ValueError

assert can be turned off with the -O switch, i.e., if you run your script like

python -O my_script.py

it will no longer raise an error.
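Putting that advice into the original example, a version of MySpecialData that raises regardless of the -O switch might look like this (a sketch based on the question's classes):

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class MyData:
    id: int = field()
    data: list[float] = field()

    def __len__(self) -> int:
        return len(self.data)

@dataclass(frozen=True)
class MySpecialData(MyData):
    def __post_init__(self):
        # Explicit check instead of assert: survives `python -O`
        if len(self) != 1:
            raise ValueError(f"expected exactly one item, got {len(self)}")
```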
Another option is to use a custom list subclass, which checks the length of the list upon instantiation.
from dataclasses import dataclass, field
from typing import Sequence, TypeVar, Generic

T = TypeVar('T')

class ConstrainedList(list, Generic[T]):
    def __init__(self, seq: Sequence[T] = (), desired_len: int = 1):
        super().__init__(seq)
        if len(self) != desired_len:
            raise ValueError(f'expected length {desired_len}, got {len(self)}. items={self}')

@dataclass(frozen=True)
class MyData:
    id: int = field()
    data: ConstrainedList[float] = field(default_factory=ConstrainedList)

@dataclass(frozen=True)
class MySpecialData(MyData):
    ...

# correctly throws exception
special_data = MySpecialData(id=1, data=ConstrainedList([2, 3]))
I want to include a custom class into a route's response. I'm mostly using nested pydantic.BaseModels in my application, so it would be nice to return the whole thing without writing a translation from the internal data representation to what the route returns.
As long as everything inherits from pydantic.BaseModel this is trivial, but I'm using a class Foo in my backend which can't do that, and I can't subclass it for this purpose either. Can I somehow duck type that class's definition in a way that fastapi accepts it? What I have right now is essentially this:
main.py
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Foo:
    """Foo holds data and can't inherit from `pydantic.BaseModel`."""
    def __init__(self, x: int):
        self.x = x

class Response(BaseModel):
    foo: Foo
    # plus some more stuff that doesn't matter right now because it works

@app.get("/", response_model=Response)
def root():
    return Response(foo=Foo(1))

if __name__ == '__main__':
    import uvicorn
    uvicorn.run("main:app")  # RuntimeError
It's not documented, but you can make non-pydantic classes work with FastAPI. What you need to do is:

1. Tell pydantic that using arbitrary classes is fine. It will try to jsonify them using vars(), so only straightforward data containers will work - no using property, __slots__, or stuff like that[1].
2. Create a proxy BaseModel, and tell Foo to offer it if someone asks for its schema - which is what FastAPI's OpenAPI pages do. I'll just assume that you want them to work too, since they're amazing.
main.py
from fastapi import FastAPI
from pydantic import BaseModel, BaseConfig, create_model

app = FastAPI()
BaseConfig.arbitrary_types_allowed = True  # change #1

class Foo:
    """Foo holds data and can't inherit from `pydantic.BaseModel`."""
    def __init__(self, x: int):
        self.x = x

    __pydantic_model__ = create_model("Foo", x=(int, ...))  # change #2

class Response(BaseModel):
    foo: Foo

@app.get("/", response_model=Response)
def root():
    return Response(foo=Foo(1))

if __name__ == '__main__':
    import uvicorn
    uvicorn.run("main:app")  # works
[1] If you want more complex jsonification, you need to provide it to the Response class explicitly via Config.json_encoders.
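A sketch of what that could look like with plain pydantic v1; the encoder function and its output shape are illustrative assumptions, not a FastAPI requirement:

```python
from pydantic import BaseModel

class Foo:
    """Arbitrary class that can't inherit from pydantic.BaseModel."""
    def __init__(self, x: int):
        self.x = x

class Response(BaseModel):
    foo: Foo

    class Config:
        arbitrary_types_allowed = True
        # Custom jsonification for Foo instead of the default vars()
        json_encoders = {Foo: lambda f: {"x": f.x}}
```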
Here is a full implementation using a subclass with validators and extra schema:
from psycopg2.extras import DateTimeTZRange as DateTimeTZRangeBase
from sqlalchemy.dialects.postgresql import TSTZRANGE
from sqlmodel import (
    Column,
    Field,
    Identity,
    SQLModel,
)
from pydantic.json import ENCODERS_BY_TYPE

ENCODERS_BY_TYPE |= {DateTimeTZRangeBase: str}

class DateTimeTZRange(DateTimeTZRangeBase):
    @classmethod
    def __get_validators__(cls):
        yield cls.validate

    @classmethod
    def validate(cls, v):
        if isinstance(v, str):
            lower = v.split(", ")[0][1:].strip()
            upper = v.split(", ")[1][:-1].strip()
            bounds = v[:1] + v[-1:]
            return DateTimeTZRange(lower, upper, bounds)
        elif isinstance(v, DateTimeTZRangeBase):
            return v
        raise TypeError("Type must be string or DateTimeTZRange")

    @classmethod
    def __modify_schema__(cls, field_schema):
        field_schema.update(type="string", example="[2022,01,01, 2022,02,02)")

class EventBase(SQLModel):
    __tablename__ = "event"
    timestamp_range: DateTimeTZRange = Field(
        sa_column=Column(
            TSTZRANGE(),
            nullable=False,
        ),
    )

class Event(EventBase, table=True):
    id: int | None = Field(
        default=None,
        sa_column_args=(Identity(always=True),),
        primary_key=True,
        nullable=False,
    )
As per @Arne's solution, you need to add your own validators and schema if the type you are using has __slots__ and basically no way to get a dict out of it.
Link to Github issue: https://github.com/tiangolo/sqlmodel/issues/235#issuecomment-1162063590
Previously I used the marshmallow library with Flask. Some time ago I tried FastAPI with Pydantic. At first glance pydantic seems similar to marshmallow, but on closer inspection they differ. And for me the main difference between them is the post_load methods from marshmallow. I can't find any analog for them in pydantic.
post_load is decorator for post-processing methods. Using it I can handle return object on my own, can do whatever I want:
from marshmallow import Schema, ValidationError, fields, post_load

class ProductSchema(Schema):
    alias = fields.Str()
    category = fields.Str()
    brand = fields.Str()

    @post_load
    def check_alias(self, params, **kwargs):
        """One of the fields must be filled"""
        if not any([params.get('alias'), params.get('category'), params.get('brand')]):
            raise ValidationError('No alias provided', field='alias')
        return params
Besides, it is used not only for validation. The code example is just for visual understanding; do not analyze it, I have just invented it.
So my question is:
is there any analog for post_load in pydantic?
It is not obvious, but pydantic's validator returns the value of the field. So there are two ways to handle post_load conversions: validator and root_validator.
validator gets the field value as an argument and returns its value.
root_validator does the same but operates on the whole object.
from fastapi import HTTPException
from pydantic import BaseModel, Field, root_validator, validator

class PaymentStatusSchema(BaseModel):
    order_id: str = Field(..., title="Order id in the shop")
    order_number: str = Field(..., title="Order number in chronological order")
    status: int = Field(..., title="Payment status")

    @validator("status")
    def convert_status(cls, status):
        return "active" if status == 1 else "inactive"

    @root_validator
    def check_order_id(cls, values):
        """Check order id"""
        if not values.get('order_number') and not values.get('order_id'):
            raise HTTPException(status_code=400, detail='No order data provided')
        return values
By default pydantic runs validators as post-processing methods. For pre-processing you should use validators with the pre argument:
@root_validator(pre=True)
def check_order_id(cls, values):
    """Check order id"""
    # some code here
    return values
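A runnable sketch contrasting a pre-validator with the default post-validation behaviour (the model and field names are invented for illustration):

```python
from pydantic import BaseModel, validator

class PaymentStatus(BaseModel):
    status: str

    @validator("status", pre=True)
    def coerce_status(cls, v):
        # Runs before pydantic's own type validation, so it can
        # turn the raw integer 1 into a string the field accepts.
        return "active" if v == 1 else v
```

Without pre=True, the validator would only see the value after pydantic had already coerced it to str.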
Yes, you can use Pydantic's @validator decorator to do pre-load, post-load, model validation, etc.
Here is a post-load example:
from pydantic import BaseModel, validator

class Person(BaseModel):
    first_name: str
    second_name: str

    @validator("first_name")
    def make_it_formal(cls, first_name):
        return f"Mr. {first_name.capitalize()}"

p = Person(first_name="egvo", second_name="Example")
p.first_name
Out: Mr. Egvo
Alternatively, you can also override __init__ and post-process the instance there:
from pydantic import BaseModel

class ProductSchema(BaseModel):
    alias: str
    category: str
    brand: str

    def __init__(self, *args, **kwargs):
        # Do Pydantic validation
        super().__init__(*args, **kwargs)
        # Do things after Pydantic validation
        if not any([self.alias, self.category, self.brand]):
            raise ValueError("No alias provided")
Though this happens outside of Pydantic's validation.
Introduction
With Python/MyPy type hints, one can use .pyi stubs to keep annotations in separate files from implementations. I am using this functionality to give basic hinting of SQLAlchemy's ORM (more specifically, the flask_sqlalchemy plugin).
Models are defined like:
class MyModel(db.Model):
    id = db.Column()
    ...
where db.Model is included directly from SQLAlchemy.
They can be queried, for example, by:
MyModel.query.filter({options: options}).one_or_none()
where filter() returns another Query, and one_or_none() returns an instance of MyModel (or None, obviously).
The following .pyi file successfully hints the above construct, though it is incomplete - there is no way to hint the return type of one_or_none().
class _SQLAlchemy(sqlalchemy.orm.session.Session):
    class Model:
        query = ...  # type: _Query

class _Query(sqlalchemy.orm.query.Query):
    def filter(self, *args) -> query.Query: ...
    def one_or_none(self) -> Any: ...

db = ...  # type: _SQLAlchemy
The Question
How can one fully and generically hint the above, and hint the return type of one_or_none()?
My first attempt was to use generics, but it looks like I have no access to the subtype in question (in the given example, MyModel). To illustrate a nonworking approach:
from typing import Generic, TypeVar

_T = TypeVar('_T')

class _SQLAlchemy(sqlalchemy.orm.session.Session):
    class Model:
        def __init__(self, *args, **kwargs):
            self.query = ...  # type: _Query[self.__class__]

class _Query(Generic[_T], sqlalchemy.orm.query.Query):
    def filter(self, *args) -> _Query[_T]: ...
    def one_or_none(self) -> _T: ...

db = ...  # type: _SQLAlchemy
Is there any way to get this working?
Apologies for the specific example, but I tried for a while to write this concisely as a generic example and it was never as clear as it is currently (which is possibly still not much!)
Edit
Another non-working approach (I'm aware this would have the limitation of having to call myModelInstance.query... instead of the static MyModel.query, but even this does not work):
from typing import Generic, TypeVar

_T = TypeVar('_T')

class _SQLAlchemy(sqlalchemy.orm.session.Session):
    class Model:
        @property
        def query(self: _T) -> _Query[_T]: ...

class _Query(Generic[_T], sqlalchemy.orm.query.Query):
    def filter(self, *args) -> _Query[_T]: ...
    def one_or_none(self) -> _T: ...

db = ...  # type: _SQLAlchemy
Type annotation stubs are fortunately now available for that specific issue https://github.com/dropbox/sqlalchemy-stubs
Exact implementation for Query type annotation is available here: https://github.com/dropbox/sqlalchemy-stubs/blob/master/sqlalchemy-stubs/orm/query.pyi (archive)
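For illustration, here is a self-contained toy showing the pattern those stubs use: a query class generic over the model type, so one_or_none() is typed as Optional[_T]. The class below is an invented stand-in, not SQLAlchemy's actual Query:

```python
from typing import Callable, Generic, List, Optional, TypeVar

_T = TypeVar("_T")

class TypedQuery(Generic[_T]):
    """Toy query that carries its model type through chained calls."""

    def __init__(self, items: List[_T]) -> None:
        self._items = items

    def filter(self, predicate: Callable[[_T], bool]) -> "TypedQuery[_T]":
        # Returns another query of the same model type, so chaining
        # preserves _T just like the stubs' Query[_T].
        return TypedQuery([item for item in self._items if predicate(item)])

    def one_or_none(self) -> Optional[_T]:
        # Typed as Optional[_T], which is what the question asks for
        return self._items[0] if len(self._items) == 1 else None
```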
Say I've got this simple little Pony ORM mapping here. The built-in Enum class is new as of Python 3.4, and backported to 2.7.
from enum import Enum
from pony.orm import Database, Required

class State(Enum):
    ready = 0
    running = 1
    errored = 2

if __name__ == '__main__':
    db = Database('sqlite', ':memory:', create_db=True)

    class StateTable(db.Entity):
        state = Required(State)

    db.generate_mapping(create_tables=True)
When I run the program, an error is thrown.
TypeError: No database converter found for type <enum 'State'>
This happens because Pony doesn't support mapping the enum type. Of course, the workaround here is to just store the Enum's value and provide a getter in the StateTable class to convert the value back to the Enum. But this is tedious and error-prone. I can also just use another ORM. Maybe I will if this issue becomes too much of a headache. But I would rather stick with Pony if I can.
I would much rather create a database converter to store the enum, like the error message is hinting at. Does anyone know how to do this?
UPDATE:
Thanks to Ethan's help, I have come up with the following solution.
from enum import Enum
from pony.orm import Database, Required, db_session
from pony.orm.dbapiprovider import StrConverter

class State(Enum):
    ready = 0
    running = 1
    errored = 2

class EnumConverter(StrConverter):
    def validate(self, val):
        if not isinstance(val, Enum):
            raise ValueError('Must be an Enum. Got {}'.format(type(val)))
        return val

    def py2sql(self, val):
        return val.name

    def sql2py(self, value):
        # Any enum type can be used, so py_type ensures the correct one
        # is used to create the enum instance
        return self.py_type[value]

if __name__ == '__main__':
    db = Database('sqlite', ':memory:', create_db=True)

    # Register the type converter with the database
    db.provider.converter_classes.append((Enum, EnumConverter))

    class StateTable(db.Entity):
        state = Required(State)

    db.generate_mapping(create_tables=True)

    with db_session:
        s = StateTable(state=State.ready)
        print('Got {} from db'.format(s.state))
Excerpt from some random mailing list:
2.2. CONVERTER METHODS
Each converter class should define the following methods:
class MySpecificConverter(Converter):
    def init(self, kwargs):
        # Override this method to process additional positional
        # and keyword arguments of the attribute
        if self.attr is not None:
            # self.attr.args can be analyzed here
            self.args = self.attr.args
        self.my_optional_argument = kwargs.pop("kwarg_name")
        # You should take all valid options from this kwargs;
        # what is left is regarded as unrecognized options

    def validate(self, val):
        # convert value to the necessary type (e.g. from string)
        # validate all necessary constraints (e.g. min/max bounds)
        return val

    def py2sql(self, val):
        # prepare the value (if necessary) for storing in the database
        return val

    def sql2py(self, value):
        # convert value (if necessary) after reading from the db
        return value

    def sql_type(self):
        # generate the corresponding SQL type, based on attribute options
        return "SOME_SQL_TYPE_DEFINITION"
You can study the code of the existing converters to see how these methods
are implemented.