I'm using flask-marshmallow (marshmallow=v3.0.0rc1, flask-marshmallow=0.9.0) and flask-sqlalchemy (sqlalchemy=1.2.16, flask-sqlalchemy=2.3.2)
I have this model and schema.
from marshmallow import post_load, fields
from .datastore import sqla_database as db
from .extensions import marshmallow as ma

class Study(db.Model):
    _id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String, nullable=False)
    tests = db.relationship("Test", backref="study", lazy='select', cascade="all, delete-orphan")

    @property
    def test_count(self):
        return len(self.tests)
class StudySchema(ma.ModelSchema):
    test_count = fields.Integer(dump_only=True)

    class Meta:
        model = Study
        sqla_session = db.session

schema = StudySchema()
payload = request.get_json()
schema.load(data=payload, instance=Study.query.get(payload["_id"]))
schema.session.commit()
If I perform a PUT operation with this payload
{'_id': 1, 'name': 'Study1', 'test_count': 0} I get the following exception marshmallow.exceptions.ValidationError: {'test_count': ['Unknown field.']}
If I remove the dump_only=True I get this exception: AttributeError: can't set attribute, which makes sense to me because it's trying to set test_count, which has no setter on the model class.
What I do not understand is why the attribute is not ignored with dump_only. Why is marshmallow still trying to validate this field during load when it's marked as dump_only?
In marshmallow 2, unknown and dump_only fields are ignored on input, unless the user adds their own validation to error on them.
In marshmallow 3, we changed that to offer three possibilities (see docs):
RAISE (default)
EXCLUDE (like marshmallow 2)
INCLUDE (pass data without validation)
There have been discussions about how to deal with dump_only fields, and we came to the conclusion that from the client's perspective they should be treated just like unknown fields (see https://github.com/marshmallow-code/marshmallow/issues/875).
Bottom line: your PUT payload should not include dump_only fields. Alternatively, you could set the EXCLUDE policy on your schema, but I'd favor the former option.
I'm following this tutorial and adapting it to my needs: in this case, building a SQL module that records the data collected by a webhook from the GitLab issues.
For the database module I'm using SQLAlchemy library and PostgreSQL as database engine.
So I would like to resolve some doubts I have regarding the use of the Pydantic library, in particular with this example.
From what I've read, Pydantic is a library used for data validation via classes with attributes.
But I don't quite understand some things... Is the integration of Pydantic strictly necessary? I understand the purpose of Pydantic itself, but I don't understand the integration of Pydantic with SQLAlchemy models.
In the tutorial, models.py has the following content:
from sqlalchemy import Boolean, Column, ForeignKey, Integer, String
from sqlalchemy.orm import relationship

from .database import Base

class User(Base):
    __tablename__ = "users"

    id = Column(Integer, primary_key=True, index=True)
    email = Column(String, unique=True, index=True)
    hashed_password = Column(String)
    is_active = Column(Boolean, default=True)

    items = relationship("Item", back_populates="owner")

class Item(Base):
    __tablename__ = "items"

    id = Column(Integer, primary_key=True, index=True)
    title = Column(String, index=True)
    description = Column(String, index=True)
    owner_id = Column(Integer, ForeignKey("users.id"))

    owner = relationship("User", back_populates="items")
And schemas.py has the following content:
from typing import Optional

from pydantic import BaseModel

class ItemBase(BaseModel):
    title: str
    description: Optional[str] = None

class ItemCreate(ItemBase):
    pass

class Item(ItemBase):
    id: int
    owner_id: int

    class Config:
        orm_mode = True

class UserBase(BaseModel):
    email: str

class UserCreate(UserBase):
    password: str

class User(UserBase):
    id: int
    is_active: bool
    items: list[Item] = []

    class Config:
        orm_mode = True
I know that the primary means of defining objects in Pydantic is via models, and I also know that models are simply classes that inherit from BaseModel.
Why does it create ItemBase, and then ItemCreate and Item that inherit from ItemBase?
In ItemBase, does it declare the fields that are strictly necessary for the Item table, and define their types?
The ItemCreate class, I have seen, is used later in crud.py to create a user; in my case, would I have to do the same with the incidents? I mean, I would have to create a class like this:
class IssueCreate(BaseModel):
    pass
There are my examples trying to follow the same workflow:
models.py
from sqlalchemy import Column, Integer, String, TIMESTAMP

from .database import Base

class Issues(Base):
    __tablename__ = 'issues'

    id = Column(Integer, primary_key=True)
    gl_assignee_id = Column(Integer, nullable=True)
    gl_id_user = Column(Integer, nullable=False)
    current_title = Column(String, nullable=False)
    previous_title = Column(String, nullable=True)
    created_at = Column(TIMESTAMP(timezone=False), nullable=False)
    updated_at = Column(TIMESTAMP(timezone=False), nullable=True)
    closed_at = Column(TIMESTAMP(timezone=False), nullable=True)
    action = Column(String, nullable=False)
And schemas.py
from datetime import datetime
from typing import Optional

from pydantic import BaseModel

class IssueBase(BaseModel):
    updated_at: Optional[datetime] = None
    closed_at: Optional[datetime] = None
    previous_title: Optional[str] = None

class Issue(IssueBase):
    id: int
    gl_task_id: int
    gl_assignee_id: int
    gl_id_user: int
    current_title: str
    action: str

    class Config:
        orm_mode = True
But I don't know if I'm right doing it in this way, any suggestions are welcome.
The tutorial you mentioned is about FastAPI. Pydantic by itself has nothing to do with SQL, SQLAlchemy or relational databases. It is FastAPI that is showing you a way to use a relational database.
is the integration of pydantic strictly necessary [when using FastAPI]?
Yes. Pydantic is a requirement according to the documentation:
Requirements
Python 3.6+
FastAPI stands on the shoulders of giants:
Starlette for the web parts.
Pydantic for the data parts.
Why does it create ItemBase, ItemCreate and Item that inherits from ItemBase?
Pydantic models are the way FastAPI uses to define the schemas of the data that it receives (requests) and returns (responses). ItemCreate represents the data required to create an item. Item represents the data that is returned when items are queried. The fields that are common to ItemCreate and Item are placed in ItemBase to avoid duplication.
In ItemBase it passes the fields that are strictly necessary in Item table? and defines its type?
ItemBase has the fields that are common to ItemCreate and Item. It has nothing to do with a table; it is just a way to avoid duplication. Every field of a pydantic model must have a type; there is nothing unusual there.
in my case I would have to do the same with the incidents?
If you have a similar scenario where the schemas of the data that you receive (request) and the data that you return (response) have common fields (same name and type), you could define a model with those fields and have other models inherit from it to avoid duplication.
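As a sketch of that pattern applied to the issue scenario (the split of fields here is hypothetical and only illustrative):

```python
from typing import Optional
from datetime import datetime

from pydantic import BaseModel

# Hypothetical split for the gitlab-issue case: fields common to the
# request and response schemas live in IssueBase.
class IssueBase(BaseModel):
    gl_id_user: int
    current_title: str
    action: str

class IssueCreate(IssueBase):
    # nothing beyond the common fields is needed to create an issue here
    pass

class Issue(IssueBase):
    # fields that only exist once the row has been persisted
    id: int
    created_at: datetime
    updated_at: Optional[datetime] = None

    class Config:
        orm_mode = True  # lets the schema be built from an ORM object
```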
This could be a (probably simplistic) way of understanding FastAPI and pydantic:
FastAPI transforms requests to pydantic models. Those pydantic models are your input data and are also known as schemas (maybe to avoid confusion with other uses of the word model). You can do whatever you want with those schemas, including using them to create relational database models and persisting them.
Whatever data you want to return as a response needs to be transformed by FastAPI to a pydantic model (schema). It just happens that pydantic supports an orm_mode option that allows it to parse arbitrary objects with attributes instead of dicts. Using that option you can return a relational database model and FastAPI will transform it to the corresponding schema (using pydantic).
FastAPI uses the parsing and validation features of pydantic, but you have to follow a simple rule: the data that you receive must comply with the input schema and the data that you want to return must comply with the output schema. You are in charge of deciding whatever happens in between.
I'm trying to create an instance of this Report model:
class Report(models.Model):
    """
    A model for storing credit reports pulled from Equifax.
    """
    user = models.ForeignKey(to=CustomUserModel, on_delete=models.CASCADE,
                             help_text='User report belongs to.')
    timestamp = models.DateTimeField(default=timezone.now)
    report = JSONField()
However, whenever I try I get this error:
Exception Type: TypeError at /internal/report
Exception Value: 'report' is an invalid keyword argument for this function
This happens whether I instantiate the instance using the Report().save() method, or the Report.object.create() method as follows:
report_obj = Report.objects.create(
    user=user,
    report=report
)
Does anyone have any clue what's going on? There is very clearly a "report" attribute for that class, so why the error?
Thanks!
Based on the error and the comment:
(...) Looks like I imported the form field from DRF instead of the model field of the same name from Django (...)
You did not import a JSONField that is a model field, but something else (for example a form field, or here a DRF field). As a result, Django does not consider report to be a field of your Report model; it sees it as a "vanilla" Python attribute.
You thus should make sure that JSONField refers to the model field class instead. Adding such a field will probably result in another migration to add the column to the database table:
from django.contrib.postgres.fields import JSONField

class Report(models.Model):
    """
    A model for storing credit reports pulled from Equifax.
    """
    user = models.ForeignKey(to=CustomUserModel, on_delete=models.CASCADE,
                             help_text='User report belongs to.')
    timestamp = models.DateTimeField(default=timezone.now)
    report = JSONField()
According to the official Marshmallow docs, it's recommended to declare a Schema and then have a separate class that receives loaded data, like this:
class UserSchema(Schema):
    name = fields.Str()
    email = fields.Email()
    created_at = fields.DateTime()

    @post_load
    def make_user(self, data):
        return User(**data)
However, my User class would look something like this:
class User:
    def __init__(self, name, email, created_at):
        self.name = name
        self.email = email
        self.created_at = created_at
This seems like repeating myself unnecessarily and I really don't like having to write the attribute names three more times. However, I do like IDE autocompletion and static type checking on well-defined structures.
So, is there any best practice for loading serialized data according to a Marshmallow Schema without defining another class?
For vanilla Python classes, there isn't an out-of-the-box way to define the class for the schema without repeating the field names.
If you're using SQLAlchemy for example, you can define the schema directly from the model with marshmallow_sqlalchemy.ModelSchema:
from marshmallow_sqlalchemy import ModelSchema

from my_alchemy_models import User

class UserSchema(ModelSchema):
    class Meta:
        model = User
Same applies to flask-sqlalchemy which uses flask_marshmallow.sqla.ModelSchema.
In the case of vanilla Python classes, you may define the fields once and use it for both schema and model/class:
USER_FIELDS = ('name', 'email', 'created_at')

class User:
    def __init__(self, name, email, created_at):
        for field in USER_FIELDS:
            setattr(self, field, locals()[field])

class UserSchema(Schema):
    class Meta:
        fields = USER_FIELDS

    @post_load
    def make_user(self, data):
        return User(**data)
Unless you need to deserialize as a specific class or you need custom serialization logic, you can simply do this (adapted from https://kimsereylam.com/python/2019/10/25/serialization-with-marshmallow.html):
from datetime import datetime

from marshmallow import Schema, fields

class UserSchema(Schema):
    name = fields.Str(required=True)
    email = fields.Email()
    created_at = fields.DateTime()

schema = UserSchema()
data = {"name": "Some Guy", "email": "sguy@google.com", "created_at": datetime.now().isoformat()}
user = schema.load(data)
You could also create a function in your class that creates a dict with validation rules, though it would still be redundant, it would allow you to keep everything in your model class:
class User:
    def __init__(self, name, email, created_at):
        self.name = name
        self.email = email
        self.created_at = created_at

    @classmethod
    def Schema(cls):
        return {"name": fields.Str(), "email": fields.Email(), "created_at": fields.DateTime()}

UserSchema = Schema.from_dict(User.Schema())
If you need strong typing and full validation functionality, consider flask-pydantic or marshmallow-dataclass.
marshmallow-dataclass offers a lot of the same validation features as marshmallow. It ties your hands a bit, though: it doesn't have built-in support for custom fields/polymorphism (you have to use marshmallow-union instead) and doesn't seem to play well with stack-on packages like flask-marshmallow and marshmallow-sqlalchemy. https://pypi.org/project/marshmallow-dataclass/
from typing import ClassVar, Type
from dataclasses import field

from marshmallow_dataclass import dataclass
from marshmallow import Schema, validate

@dataclass
class Person:
    name: str = field(metadata=dict(load_only=True))
    height: float = field(metadata=dict(validate=validate.Range(min=0)))
    Schema: ClassVar[Type[Schema]] = Schema

Person.Schema().dump(Person('Bob', 2.0))
# => {'height': 2.0}
flask-pydantic is less elegant from a validation standpoint, but it offers many of the same features, and the validation is built into the class. Note that simple validations like min/max are more awkward than in marshmallow. Personally, I prefer to keep view/api logic out of the class, though. https://pypi.org/project/Flask-Pydantic/
from typing import Optional

from flask import Flask, request
from pydantic import BaseModel
from flask_pydantic import validate

app = Flask("flask_pydantic_app")

class QueryModel(BaseModel):
    age: int

class ResponseModel(BaseModel):
    id: int
    age: int
    name: str
    nickname: Optional[str]

# Example 1: query parameters only
@app.route("/", methods=["GET"])
@validate()
def get(query: QueryModel):
    age = query.age
    return ResponseModel(
        age=age,
        id=0, name="abc", nickname="123"
    )
You'll have to create the two classes, but the good news is that you won't have to enter the attribute names multiple times in most cases. One thing I've found, if you are using Flask, SQLAlchemy, and Marshmallow, is that if you define some of the validation attributes in your Column definitions, the Marshmallow schema will automatically pick up on them and the validations they imply. For example:
# import your-database-object-from-flask-init as db
# import your-marshmallow-object-from-flask-init as val

class User(db.Model):
    name = db.Column(db.String(length=40), nullable=False)
    email = db.Column(db.String(length=100))
    created_at = db.Column(db.DateTime)

class UserSchema(val.ModelSchema):
    class Meta:
        model = User
In this example, if you were to take a dictionary of data and put it into UserSchema().load(data), you would see errors if name didn't exist, if name was longer than 40 characters, or if email was longer than 100 characters. Any custom validations beyond that you'd still have to code within your schema.
It also works if you've created the model class as an extension of another model class, carrying over its attributes. For example, if you wanted every class to have created/modified information, you could put those attributes in the parent model class and the child would inherit those along with their validation parameters. Marshmallow doesn't allow your parent model to have a schema, so I don't have information on custom validations there.
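The parent-class idea can be sketched with a plain SQLAlchemy abstract base (class and column names here are illustrative, not from the original post):

```python
from datetime import datetime

from sqlalchemy import Column, DateTime, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()

class TimestampedModel(Base):
    # __abstract__ tells SQLAlchemy not to map this class to a table of
    # its own; subclasses inherit the columns declared here, along with
    # their constraints.
    __abstract__ = True

    created_at = Column(DateTime, default=datetime.utcnow, nullable=False)
    modified_at = Column(DateTime, onupdate=datetime.utcnow)

class User(TimestampedModel):
    __tablename__ = "users"

    id = Column(Integer, primary_key=True)
    name = Column(String(40), nullable=False)
```

A ModelSchema generated for User would then see the inherited columns (and their length/nullability constraints) just like the directly declared ones.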
I know you've probably already completed your project, but I hope this helps for other developers that come across this.
Relevant pip list:
Flask (1.0.2)
flask-marshmallow (0.9.0)
Flask-SQLAlchemy (2.3.2)
marshmallow (2.18.0)
marshmallow-sqlalchemy (0.15.0)
SQLAlchemy (1.2.16)
I'm very new to both Python and Django and I'm having issues with a nullable Foreign Key relation. I found similar issues to this one, but none of them seemed to be covering my use case.
I'm using Django 1.8.17, and DRF 3.1.0
I have the following classes in Django (I've simplified them out to just the relevant fields since I can't easily copy/paste my code here):
class Rationale(models.Model):
    id = models.AutoField(primary_key=True)
    name = models.CharField(max_length=255)

class Alert(models.Model):
    id = models.AutoField(primary_key=True)
    rationale = models.ForeignKey(Rationale, null=True, blank=True)
    priority = models.IntegerField(default=1)

class AlertHistory(models.Model):
    id = models.AutoField(primary_key=True)
    alert = models.ForeignKey(Alert)
    rationale = models.ForeignKey(Rationale, null=True, blank=True)
    priority = models.IntegerField(null=True)
class AlertHistoryListView(generics.ListCreateAPIView):
    queryset = AlertHistory.objects.all()
    serializer_class = AlertHistorySerializer
    pagination_class = DefaultPagination
    filter_backends = (filters.DjangoFilterBackend, filters.OrderingFilter)
    filter_class = AlertHistoryFilterSet
    filter_fields = ['alert']
    ordering_fields = filter_fields

class AlertHistoryFilterSet(django_filters.FilterSet):
    class Meta:
        model = AlertHistory
        fields = ['alert']
The idea here is to capture changes to the alert in the history table. A user can update the priority or the rationale.
The Rationale table is a look-up table that is pre-populated with a JSON fixture. A user can select a rationale to give the reason why the alert is open. The rationale is optional, though, and therefore nullable.
However I get an error when I try to set the Rationale to None:
{'rationale': [u'Incorrect type. Expected pk value, received unicode.']}
So searching around led me to PrimaryKeyRelatedField: http://www.django-rest-framework.org/api-guide/relations/#primarykeyrelatedfield
I then updated my AlertHistory serializer:
class AlertHistorySerializer(serializers.ModelSerializer):
    rationale = serializers.PrimaryKeyRelatedField(read_only=True, allow_null=True)

    class Meta:
        model = AlertHistory
This fixed my first issue, but led to problems in the test where I'm updating the Rationale. By marking it read_only, I unsurprisingly cannot update that field.
The documentation says I need to specify either read_only=True or the queryset. However it doesn't provide an example of how to do that, and I can't figure it out or find any examples anywhere.
I need to cover both of the following cases:
data = {'alert': 1, 'priority': 2, 'rationale': 1} to set the rationale to the foreign key for Rationale 1.
And:
data = {'alert': 1, 'priority': 2, 'rationale': None} if a user wants to set the rationale to null. This use-case is more likely when they are simply updating the priority without selecting a rationale.
So I tried setting my queryset to "all":
rationale = serializers.PrimaryKeyRelatedField(queryset=Rationale.objects.all(), allow_null=True)
but this causes all my tests where rationale is None to give the original exception:
{'rationale': [u'Incorrect type. Expected pk value, received unicode.']}, even though I now have allow_null set to True.
I then tried defining:
rationale = serializers.PrimaryKeyRelatedField(queryset=Rationale.objects.get(pk=rationale), allow_null=True)
but it doesn't know what rationale is.
I also tried:
rationale = serializers.PrimaryKeyRelatedField(source='rationale', allow_null=True)
but that leads me to the error:
AssertionError: Relational field must provide a 'queryset' argument, or set read_only=True.
How do I properly define my queryset?
Thanks.
Use this serializer:
class AlertHistorySerializer(serializers.ModelSerializer):
    class Meta:
        model = AlertHistory
model:
class AlertHistory(models.Model):
    id = models.AutoField(primary_key=True)
    alert = models.ForeignKey(Alert)
    rationale = models.ForeignKey(Rationale, null=True)
    priority = models.IntegerField(null=True)
sample json for creating alert history object:
{
    "alert": 1,
    "rationale": null,
    "priority": 2
}
If you want to set rationale to null, send null in the JSON.
PS: make sure your DB schema is updated and all the migrations have been applied.