Using Marshmallow without repeating myself - python

According to the official Marshmallow docs, it's recommended to declare a Schema and then have a separate class that receives loaded data, like this:
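from marshmallow import Schema, fields, post_load

class UserSchema(Schema):
    name = fields.Str()
    email = fields.Email()
    created_at = fields.DateTime()

    @post_load
    def make_user(self, data, **kwargs):  # marshmallow 3 passes extra kwargs (many, partial) to hooks
        return User(**data)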
However, my User class would look something like this:
class User:
    def __init__(self, name, email, created_at):
        self.name = name
        self.email = email
        self.created_at = created_at
This seems like repeating myself unnecessarily and I really don't like having to write the attribute names three more times. However, I do like IDE autocompletion and static type checking on well-defined structures.
So, is there any best practice for loading serialized data according to a Marshmallow Schema without defining another class?

For vanilla Python classes, there isn't an out-of-the-box way to define the class for the schema without repeating the field names.
If you're using SQLAlchemy for example, you can define the schema directly from the model with marshmallow_sqlalchemy.ModelSchema:
from marshmallow_sqlalchemy import ModelSchema
from my_alchemy_models import User

class UserSchema(ModelSchema):
    class Meta:
        model = User
Same applies to flask-sqlalchemy which uses flask_marshmallow.sqla.ModelSchema.
In the case of vanilla Python classes, you can define the field names once and use them for both the schema and the model class:
from marshmallow import Schema, post_load

USER_FIELDS = ('name', 'email', 'created_at')

class User:
    def __init__(self, name, email, created_at):
        # Copy each constructor argument onto the instance by name.
        for field in USER_FIELDS:
            setattr(self, field, locals()[field])

class UserSchema(Schema):
    class Meta:
        fields = USER_FIELDS

    @post_load
    def make_user(self, data, **kwargs):
        return User(**data)

Unless you need to deserialize as a specific class or you need custom serialization logic, you can simply do this (adapted from https://kimsereylam.com/python/2019/10/25/serialization-with-marshmallow.html):
from datetime import datetime
from marshmallow import Schema, fields

class UserSchema(Schema):
    name = fields.Str(required=True)
    email = fields.Email()
    created_at = fields.DateTime()

schema = UserSchema()
# load() expects serialized input, so the timestamp is passed as an ISO 8601 string.
data = {"name": "Some Guy", "email": "sguy@google.com", "created_at": datetime.now().isoformat()}
user = schema.load(data)  # a plain dict of validated values
You could also add a method to your class that returns a dict of fields with their validation rules. It is still redundant, but it lets you keep everything in your model class:
from marshmallow import Schema, fields

class User:
    def __init__(self, name, email, created_at):
        self.name = name
        self.email = email
        self.created_at = created_at

    @classmethod
    def Schema(cls):
        return {"name": fields.Str(), "email": fields.Email(), "created_at": fields.DateTime()}

# Schema.from_dict (marshmallow 3) takes the dict itself, so the classmethod must be called.
UserSchema = Schema.from_dict(User.Schema())
If you need strong typing and full validation functionality, consider flask-pydantic or marshmallow-dataclass.
marshmallow-dataclass offers many of the same validation features as marshmallow. It kind of ties your hands, though: it doesn't have built-in support for custom fields or polymorphism (you have to use marshmallow-union instead), and it doesn't seem to play well with stack-on packages like flask-marshmallow and marshmallow-sqlalchemy. https://pypi.org/project/marshmallow-dataclass/
from dataclasses import field
from typing import ClassVar, Type
from marshmallow import Schema, validate
from marshmallow_dataclass import dataclass

@dataclass
class Person:
    name: str = field(metadata=dict(load_only=True))
    height: float = field(metadata=dict(validate=validate.Range(min=0)))
    Schema: ClassVar[Type[Schema]] = Schema  # for the type checker

Person.Schema().dump(Person('Bob', 2.0))
# => {'height': 2.0}
flask-pydantic is less elegant from a validation standpoint, but offers many of the same features and the validation is built into the class. Note that simple validations like min/max are more awkward than in marshmallow. Personally, I prefer to keep view/api logic out of the class though. https://pypi.org/project/Flask-Pydantic/
from typing import Optional
from flask import Flask, request
from pydantic import BaseModel
from flask_pydantic import validate

app = Flask("flask_pydantic_app")

class QueryModel(BaseModel):
    age: int

class ResponseModel(BaseModel):
    id: int
    age: int
    name: str
    nickname: Optional[str]

# Example 1: query parameters only
@app.route("/", methods=["GET"])
@validate()
def get(query: QueryModel):
    age = query.age
    return ResponseModel(
        age=age,
        id=0, name="abc", nickname="123"
    )

You'll have to create the two classes, but the good news is you won't have to enter the attribute names multiple times in most cases. One thing I've found, if you are using Flask, SQLAlchemy, and Marshmallow, is that if you define some of the validation attributes in your Column definition, the Marshmallow Schema will automatically pick up on these and the validations supplied in them. For example:
import (your-database-object-from-flask-init) as db
import (your-marshmallow-object-from-flask-init) as val
class User(db.Model):
    name = db.Column(db.String(length=40), nullable=False)
    email = db.Column(db.String(length=100))
    created_at = db.Column(db.DateTime)

class UserSchema(val.ModelSchema):
    class Meta:
        model = User
In this example, if you were to pass a dictionary of data to UserSchema().load(data), you would see errors if name was missing, if name was longer than 40 characters, or if email was longer than 100 characters. Any custom validations beyond that you'd still have to code within your schema.
It also works if you've created the model class as an extension of another model class, carrying over its attributes. For example, if you wanted every class to have created/modified information, you could put those attributes in the parent model class and the child would inherit those along with their validation parameters. Marshmallow doesn't allow your parent model to have a schema, so I don't have information on custom validations there.
I know you've probably already completed your project, but I hope this helps for other developers that come across this.
Relevant pip list:
Flask (1.0.2)
flask-marshmallow (0.9.0)
Flask-SQLAlchemy (2.3.2)
marshmallow (2.18.0)
marshmallow-sqlalchemy (0.15.0)
SQLAlchemy (1.2.16)

Related

Deserialize available fields in python [duplicate]

I'm using flask-marshmallow (marshmallow=v3.0.0rc1, flask-marshmallow=0.9.0) and flask-sqlalchemy (sqlalchemy=1.2.16, flask-sqlalchemy=2.3.2)
I have this model and schema.
from marshmallow import post_load, fields
from .datastore import sqla_database as db
from .extensions import marshmallow as ma

class Study(db.Model):
    _id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String, nullable=False)
    tests = db.relationship("Test", backref="study", lazy='select', cascade="all, delete-orphan")

    @property
    def test_count(self):
        return len(self.tests)

class StudySchema(ma.ModelSchema):
    test_count = fields.Integer(dump_only=True)

    class Meta:
        model = Study
        sqla_session = db.session

schema = StudySchema()
payload = request.get_json()
schema.load(data=payload, instance=Study.query.get(payload["_id"]))
schema.session.commit()
If I perform a PUT operation with this payload
{'_id': 1, 'name': 'Study1', 'test_count': 0} I get the following exception marshmallow.exceptions.ValidationError: {'test_count': ['Unknown field.']}
If I remove the dump_only=True I get this exception AttributeError: can't set attribute which makes sense to me because it's trying to set test_count with no setter method on model class.
What I do not understand is why the attribute is not ignored with dump_only. Why is marshmallow still trying to validate and understand this field during load when it's marked as dump_only?
In marshmallow 2, unknown or dump_only fields in the input are ignored, unless the user adds their own validation to raise an error on them.
In marshmallow 3, we changed that to offer three possibilities (see docs):
RAISE (default)
EXCLUDE (like marshmallow 2)
INCLUDE (pass data without validation)
There have been discussions about how to deal with dump_only fields, and we came to the conclusion that, from the client's perspective, those should be treated just like unknown fields (see https://github.com/marshmallow-code/marshmallow/issues/875).
Bottom line, your PUT payload should not include dump_only fields. Or you could set the EXCLUDE policy to your schema, but I'd favor the former option.

In Django, how can I easily inherit a field and create a new class with this field preconfigured parameters?

I am currently using UUIDs in my PostgreSQL database, so I am also using PrimaryKeyRelatedField() with some parameters in order to avoid problems when encoding the UUID field to JSON.
My serializer field looks like:
id = serializers.PrimaryKeyRelatedField(read_only=True,
allow_null=False,
pk_field=serializers.UUIDField(format='hex_verbose'))
I have to repeat this in every serializer that uses a UUID.
My question is, how can I create a new class based on PrimaryKeyRelatedField so that I don't have to write all those parameters (read_only, allow_null...) ?
I am looking for something like:
id = BaseUUIDField()
Thanks
You can make an abstract model whose id is a UUID field, then inherit from that model in your derived models.
import uuid
from django.db import models

# Abstract model
class AbstractModel(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)

    class Meta:
        abstract = True

# Derived model
class YourDerivedModel(AbstractModel):
    pass  # fields here

Hope this helps your query

Declaring PeeWee models inside class, passing database parameter to BaseModel

I'm not sure if what I'm trying to do is reasonable, but I want to put all classes and methods connected with database management in one single class.
So there will be a DataSaver class representing one instance of the database.
Now, the official PeeWee docs recommend creating a BaseModel and storing the database variable there. Here is how it's shown in the example:
import datetime
from peewee import *

db = SqliteDatabase('my_app.db')

class BaseModel(Model):
    class Meta:
        database = db

class User(BaseModel):
    username = CharField(unique=True)

class Tweet(BaseModel):
    user = ForeignKeyField(User, backref='tweets')
    message = TextField()
    created_date = DateTimeField(default=datetime.datetime.now)
    is_published = BooleanField(default=True)
Now I'm trying to do the same thing, but inside class:
class DataSaver:
    def __init__(self, database_save_path):
        self.database_save_path = database_save_path
        self.db = SqliteDatabase(database_save_path)
        db.connect()
        db.create_tables([User, Chat, Message], True)

    class BaseModel(Model):
        class Meta:
            database = self.db

    class User(BaseModel):
        name = CharField(unique=True)

    class Chat(BaseModel):
        name = CharField(unique=True)
And the problem here is that BaseModel currently doesn't have access to the db variable. At that point self isn't pointing to a DataSaver instance, so it's fairly clear why the interpreter gets confused.
Do you have any idea how to pass the db variable to BaseModel so that the second block of code works similarly to the first one?
I don't think you're correct. self is still the DataSaver instance at the time when you are declaring database = self.db. Is there an actual error you're getting?
I know, this question is old, but since I researched it for myself, here is my solution.
You should use one of the techniques described in the peewee documentation
class DataSaverA(object):
    def __init__(self, database_save_path):
        self.database_save_path = database_save_path
        self.db = SqliteDatabase(database_save_path)
        # Bind the nested models to this database at runtime.
        self.db.bind([DataSaverA.User, DataSaverA.Chat])
        self.db.connect()
        self.db.create_tables([DataSaverA.User, DataSaverA.Chat])

    class BaseModel(Model):
        pass

    class User(BaseModel):
        name = CharField(unique=True)

    class Chat(BaseModel):
        name = CharField(unique=True)
This way you can, for instance, have a class DataSaverB, which needs to declare its own User and Chat inner classes, and create users like this:
u1 = DataSaverA.User.create(name='Uncle Bob')
u2 = DataSaverB.User.create(name='Grandma L.')
However, I think the use of inner classes is bad practice here, since you would have to replicate the definitions of the inner classes for all DataSaver variants.
In my case, I was more interested in grouping functionality and configuration, instead of actually relying on multiple datasavers, so I dropped the idea of using nested classes.

Create resource via POST specifying related field ID

I am using Django Rest Framework, and want to allow API clients to create resources, where one attribute of the created resource is the (required) primary key of a related data structure. For example, given these models:
class Breed(models.Model):
    breed_name = models.CharField(max_length=255)

class Dog(models.Model):
    name = models.CharField(max_length=255)
    breed = models.ForeignKey(Breed)
I want to allow the caller to create a Dog object by specifying a name and a breed_id corresponding to the primary key of an existing Breed.
I'd like to use HyperlinkedModelSerializer in general for my APIs. This complicates things slightly because it (apparently) expects related fields to be specified by URL rather than primary key.
I've come up with the following solution, using PrimaryKeyRelatedField, that behaves as I'd like:
class BreedSerializer(serializers.HyperlinkedModelSerializer):
    class Meta:
        model = Breed

class DogSerializer(serializers.HyperlinkedModelSerializer):
    class Meta:
        model = Dog
        read_only_fields = ('breed', )

    breed_id = serializers.PrimaryKeyRelatedField(queryset=Breed.objects.all())

    def create(self, validated_data):
        validated_data['breed'] = validated_data['breed_id']
        del validated_data['breed_id']
        return Dog.objects.create(**validated_data)
But it seems weird that I would need to do this mucking around with the overloaded create. Is there a cleaner solution to this?
Thanks to dukebody for suggesting implementing a custom related field to allow an attribute to be serialized OUT as a hyperlink, but IN as a primary key:
class HybridPrimaryKeyRelatedField(serializers.HyperlinkedRelatedField):
    """Serializes out as hyperlink, in as primary key"""

    def to_internal_value(self, data):
        return self.get_queryset().get(pk=data)
This lets me do away with the create override, the read_only_fields option, and the weirdness of swapping out the breed and breed_id:
class BreedSerializer(serializers.HyperlinkedModelSerializer):
    class Meta:
        model = Breed

class DogSerializer(serializers.HyperlinkedModelSerializer):
    class Meta:
        model = Dog

    breed = HybridPrimaryKeyRelatedField(queryset=Breed.objects,
                                         view_name='breed-detail')

django abstract models versus regular inheritance

Besides the syntax, what's the difference between using a django abstract model and using plain Python inheritance with django models? Pros and cons?
UPDATE: I think my question was misunderstood and I received responses for the difference between an abstract model and a class that inherits from django.db.models.Model. I actually want to know the difference between a model class that inherits from a django abstract class (Meta: abstract = True) and a plain Python class that inherits from say, 'object' (and not models.Model).
Here is an example:
class User(object):
    first_name = models.CharField(..

    def get_username(self):
        return self.username

class User(models.Model):
    first_name = models.CharField(...

    def get_username(self):
        return self.username

    class Meta:
        abstract = True

class Employee(User):
    title = models.CharField(...
I actually want to know the difference between a model class that
inherits from a django abstract class (Meta: abstract = True) and a
plain Python class that inherits from say, 'object' (and not
models.Model).
Django will only generate tables for subclasses of models.Model, so the former...
class User(models.Model):
    first_name = models.CharField(max_length=255)

    def get_username(self):
        return self.username

    class Meta:
        abstract = True

class Employee(User):
    title = models.CharField(max_length=255)
...will cause a single table to be generated, along the lines of...
CREATE TABLE myapp_employee
(
id INT NOT NULL AUTO_INCREMENT,
first_name VARCHAR(255) NOT NULL,
title VARCHAR(255) NOT NULL,
PRIMARY KEY (id)
);
...whereas the latter...
class User(object):
    first_name = models.CharField(max_length=255)

    def get_username(self):
        return self.username

class Employee(User):
    title = models.CharField(max_length=255)
...won't cause any tables to be generated.
You could use multiple inheritance to do something like this...
class User(object):
    first_name = models.CharField(max_length=255)

    def get_username(self):
        return self.username

class Employee(User, models.Model):
    title = models.CharField(max_length=255)
...which would create a table, but it will ignore the fields defined in the User class, so you'll end up with a table like this...
CREATE TABLE myapp_employee
(
id INT NOT NULL AUTO_INCREMENT,
title VARCHAR(255) NOT NULL,
PRIMARY KEY (id)
);
An abstract model creates a table with the entire set of columns for each child model, whereas "plain" Python inheritance from a concrete (non-abstract) model creates a set of linked tables (aka "multi-table inheritance"). Consider the case in which you have two models:
class Vehicle(models.Model):
    num_wheels = models.PositiveIntegerField()

class Car(Vehicle):
    make = models.CharField(…)
    year = models.PositiveIntegerField()
If Vehicle is an abstract model, you'll have a single table:
app_car:
| id | num_wheels | make | year
However, if you use plain Python inheritance, you'll have two tables:
app_vehicle:
| id | num_wheels
app_car:
| vehicle_ptr_id | make | year
Where vehicle_ptr_id is both the primary key of app_car and a link to the row in app_vehicle that holds the number of wheels for the car.
Now, Django will put this together nicely in object form so you can access num_wheels as an attribute on Car, but the underlying representation in the database will be different.
Update
To address your updated question, the difference between inheriting from a Django abstract class and inheriting from Python's object is that the former is treated as a database object (so tables for it are synced to the database) and it has the behavior of a Model. Inheriting from a plain Python object gives the class (and its subclasses) none of those qualities.
The main difference is how the database tables for the models are created.
If you use inheritance without abstract = True Django will create a separate table for both the parent and the child model which hold the fields defined in each model.
If you use abstract = True for the base class Django will only create a table for the classes that inherit from the base class - no matter if the fields are defined in the base class or the inheriting class.
Pros and cons depend on the architecture of your application.
Given the following example models:
class Publishable(models.Model):
    title = models.CharField(...)
    date = models.DateField(....)

    class Meta:
        # abstract = True

class BlogEntry(Publishable):
    text = models.TextField()

class Image(Publishable):
    image = models.ImageField(...)
If the Publishable class is not abstract, Django will create a table for publishables with the columns title and date, and separate tables for BlogEntry and Image. The advantage of this solution is that you can query across all publishables for fields defined in the base model, no matter whether they are blog entries or images. The trade-off is that Django has to do joins when you query, for example, images.
If you make Publishable abstract (abstract = True), Django will not create a table for Publishable, only for blog entries and images, each containing all fields (including the inherited ones). This is handy because no joins are needed for an operation such as get.
Also see Django's documentation on model inheritance.
Just wanted to add something which I haven't seen in other answers.
Unlike with Python classes, field name hiding is not permitted with model inheritance.
For example, I ran into problems with the following use case:
I had a model inheriting from Django's auth PermissionsMixin:
class PermissionsMixin(models.Model):
    """
    A mixin class that adds the fields and methods necessary to support
    Django's Group and Permission model using the ModelBackend.
    """
    is_superuser = models.BooleanField(_('superuser status'), default=False,
        help_text=_('Designates that this user has all permissions without '
                    'explicitly assigning them.'))
    groups = models.ManyToManyField(Group, verbose_name=_('groups'),
        blank=True, help_text=_('The groups this user belongs to. A user will '
                                'get all permissions granted to each of '
                                'his/her group.'))
    user_permissions = models.ManyToManyField(Permission,
        verbose_name=_('user permissions'), blank=True,
        help_text='Specific permissions for this user.')

    class Meta:
        abstract = True

    # ...
Then I had my own mixin which, among other things, was meant to override the related_name of the groups field. It looked more or less like this:
class WithManagedGroupMixin(object):
    groups = models.ManyToManyField(Group, verbose_name=_('groups'),
        related_name="%(app_label)s_%(class)s",
        blank=True, help_text=_('The groups this user belongs to. A user will '
                                'get all permissions granted to each of '
                                'his/her group.'))
I was using these two mixins as follows:
class Member(PermissionMixin, WithManagedGroupMixin):
    pass
So yeah, I expected this to work but it didn't.
But the issue was more serious because the error I was getting wasn't pointing to the models at all, I had no idea of what was going wrong.
While trying to solve this I randomly decided to change my mixin and convert it to an abstract model mixin. The error changed to this:
django.core.exceptions.FieldError: Local field 'groups' in class 'Member' clashes with field of similar name from base class 'PermissionMixin'
As you can see, this error does explain what is going on.
This was a huge difference, in my opinion :)
The main difference is when you inherit the User class. One version will behave like a simple class, and the other will behave like a Django model.
If you inherit the base "object" version, your Employee class will just be a standard class, and first_name won't become part of a database table. You can't create a form or use any other Django features with it.
If you inherit the models.Model version, your Employee class will have all the methods of a Django Model, and it will inherit the first_name field as a database field that can be used in a form.
According to the documentation, an Abstract Model "provides a way to factor out common information at the Python level, whilst still only creating one database table per child model at the database level."
I prefer the abstract class in most cases because it does not create a separate table and the ORM does not need to create joins in the database. Using an abstract class is also pretty simple in Django:
class Vehicle(models.Model):
    title = models.CharField(...)
    Name = models.CharField(....)

    class Meta:
        abstract = True

class Car(Vehicle):
    color = models.CharField()

class Bike(Vehicle):
    fuel_average = models.IntegerField(...)
