Flask-marshmallow base_fields.Function in base_fields.Nested - python

I am using flask-marshmallow along with marshmallow-sqlalchemy.
I would like my own kind of HATEOAS implementation: for to-many relationships, along with the link, I'd like to have the count of related objects.
For that, I have a regular sqlalchemy model with a many-to-many relationship:
class ParentChild(Model):
    __tablename__ = 'parent_child'
    parent_id = Column(Integer, ForeignKey('parent.id'), primary_key=True)
    child_id = Column(Integer, ForeignKey('child.id'), primary_key=True)

class Parent(Model):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    name = Column(String())
    children = relationship('Child', secondary='parent_child', back_populates='parents')

class Child(Model):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    name = Column(String())
    parents = relationship('Parent', secondary='parent_child', back_populates='children')
Using the following marshmallow schema, I manage to get the data I want:
class ParentSchema(Schema):
    class Meta:
        model = Parent
    children = URLFor('api.parents_children_by_parent_id', parent_id='<id>')
    children_count = base_fields.Function(lambda obj: len(obj.children))
Returns:
{
    "id": 42,
    "name": "Bob",
    "children": "/api/parents/42/children",
    "children_count": 3
}
But I have issues when I want to encapsulate the fields like this:
{
    "id": 42,
    "name": "bob",
    "children": {
        "link": "/api/parents/42/children",
        "count": 3
    }
}
I tried using a base_fields.Dict:
children = base_fields.Dict(
    link=URLFor('api.parents_children_by_parent_id', parent_id='<id>'),
    count=base_fields.Function(lambda obj: len(obj.children))
)
But I get
TypeError: Object of type 'Child' is not JSON serializable
I tried various other solutions, without success: flask-marshmallow's Hyperlinks only accepts dictionaries of Hyperlinks, not Functions.
I think the solution would be to use base_fields.Nested, but that breaks the behaviour of URLFor, which can then no longer resolve the '<id>' placeholder.
I can't find a solution to this in the documentation.
At some point it's hard to think outside the box. Am I missing something? Any help would be appreciated.

So I found a workaround that I'm going to post, but I think it can be improved.
To override the children field with the object I want, I use a base_fields.Method:
class ParentSchema(Schema):
    class Meta:
        model = Parent
    children = base_fields.Method('build_children_obj')

    def build_children_obj(self, obj):
        return {
            "count": len(obj.children),
            "link": URLFor('api.parents_children_by_parent_id', parent_id=obj.id)
        }
At that point, I was getting TypeError: Object of type 'URLFor' is not JSON serializable
So, after checking the source of URLFor's _serialize method, I added a check in my (customized) JSONEncoder:
if isinstance(o, URLFor):
    return str(o._serialize(None, None, o))
And I finally got the payload I wanted, but I don't find it very clean. Any ideas?
EDIT: After testing, I found that using len(obj.children) for the count was very expensive, because it loads the entire list of children. Instead, I do:

db.session.query(func.count(Children.id)).filter(Children.parents.any(id=obj.id)).scalar()

which is much better optimized.


How to deter recursion in Flask JSON output for many-to-many relationships?

The error is clear:
RecursionError: maximum recursion depth exceeded while calling a Python object
A model cycles through its properties, including its relationships, and outputs each one. The relationships have backrefs, so this creates an endless recursion cycle.
Example
Consider an Author describing its Books. During formatting (the default method), the Author model asks, "is the object a Book?" If so, it asks the Book to serialize itself. In other examples, the Author might hardcode the Book's key/value pairs instead of asking the Book to describe itself. I'd like to avoid that, as I want to reduce the amount of awareness one model has of another.
Is there a way to track/pass what level is being called?
What I'd prefer is to track the recursion level, such that
book = Book()
book.to_json
will display something like:
{
    "id": 1,
    "name": "Python on Stack Overflow",
    "authors": [
        {
            "id": 300,
            "name": "Mike",
            "books": [
                { "id": 1, "name": "Python on Stack Overflow", "authors": ["<Author id=300>"] },
                { "id": 2, "name": "The Worst Question Ever Asked", "authors": ["<Author id=100>", "<Author id=200>", "<Author id=300>", "<Author id=400>"] },
                { "id": 3, "name": "The Greatest Question Ever Answered", "authors": ["<Author id=300>", "<Author id=400>"] }
            ]
        },
        ...
    ]
}
Don't ask Book to describe its authors if Book calls Author calling Book (greater than 1 level deep).
Models
Disclaimer: This is a limited example and doesn't include imports or other attributes, methods, mixins, or functions.
Book.py
# models/book.py
def default(object):
    # format dates
    if isinstance(object, (date, datetime)):
        return object.strftime('%Y-%m-%d %H:%M %z')
    # ask 'Author' to serialize itself
    if object.__class__.__name__ == 'Author':  # <-- one place to be call-aware; `and level==1`
        return object.to_json
    # instance display
    return f'<{object.__class__.__name__} id={object.id}>'

class Book(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.Text, index=True, unique=True, nullable=False)
    authors = db.relationship('Author', secondary=Published.__table__, back_populates='books')

    @property
    def to_json(self):
        columns = self.keys()  # keys() comes from a mixin not shown here
        response = {}
        for column in columns:
            response[column] = getattr(self, column)
        return json.loads(json.dumps(response, default=default))
Author.py
# models/author.py
def default(object):
    # format dates
    if isinstance(object, (date, datetime)):
        return object.strftime('%Y-%m-%d %H:%M %z')
    # ask 'Book' to serialize itself
    if object.__class__.__name__ == 'Book':  # <-- one place to be call-aware; `and level==1`
        return object.to_json
    # instance display
    return f'<{object.__class__.__name__} id={object.id}>'

class Author(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.Text, index=True, unique=True, nullable=False)
    books = db.relationship('Book', secondary=Published.__table__, back_populates='authors')

    @property
    def to_json(self):
        columns = self.keys()  # keys() comes from a mixin not shown here
        response = {}
        for column in columns:
            response[column] = getattr(self, column)
        return json.loads(json.dumps(response, default=default))
One potential solution is to use a global tracking variable.
recursion_level = None

def default(object):
    global recursion_level
    # format dates
    if isinstance(object, (date, datetime)):
        return object.strftime('%Y-%m-%d %H:%M %z')
    # ask the object to serialize itself, but only so deep
    max_recursion = 1
    classes = [model.class_.__name__ for model in app.db.Model.registry.mappers]
    if object.__class__.__name__ in classes and recursion_level < max_recursion:
        recursion_level += 1
        json_str = object.to_json
        recursion_level -= 1
        return json_str
    # instance display
    return f'<{object.__class__.__name__} id={object.id}>'

class Author(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.Text, index=True, unique=True, nullable=False)
    books = db.relationship('Book', secondary=Published.__table__, back_populates='authors')

    @property
    def to_json(self):
        columns = self.keys()
        response = {}
        for column in columns:
            response[column] = getattr(self, column)
        global recursion_level  # <-- new block
        if recursion_level is None:
            recursion_level = 1
        # NOTE: I don't know how to pass `recursion_level` to `default`,
        # which is why it's a global variable for now
        return json.loads(json.dumps(response, default=default))
Comments:
to_json and default are actually defined in one place, on a base model class, to keep the code DRY. Try not to be distracted by their placement here.
Even though this answer uses a global variable, that is not my preference. Python is effectively single-threaded under the GIL, so it might be safe enough if not using async, but since I'm new to Python and don't fully understand the call stack or the scoping of globals, I defer to the experts to poke holes in it.
My preference would be to pass the recursion level to default as a parameter, used as the recursion-terminating condition, but I'm not sure how to pass such a value through json.dumps.
object.id is used in default's instance display output, but because the function may handle multiple classes (not just Book), those classes may not all have an id column. A more robust solution is to survey the primary keys and use their values. Something like:

pks = object.__table__.primary_key.columns.values()
pk_pairs = [f'{pk.name}={getattr(object, pk.name)}' for pk in pks]
return f'<{object.__class__.__name__} {" ".join(pk_pairs)}>'

NOTE: this all depends on how much control you have over your models and primary keys. This could be made even safer, but for the purposes of this demo it should suffice.

Filter elements by optional lists of related elements

I have a feeling that I've made things more complex than they need to be - this can't possibly be such a rare case. It seems to me that it should be possible - or perhaps that I'm doing something fundamentally wrong.
The issue at hand is this: I've declared a database element, Element, which consists of about 10 many-to-many relations with other elements, one of which is Tag.
I want to enable the user of my application to filter Element by all of these relations, some of them or none of them. Say the user wants to see only Elements which are related to a certain Tag.
To make things even more difficult, the function that will carry out this objective is called from a GraphQL API, meaning it will receive IDs instead of ORM objects.
I'm trying to build a resolver in my Python Flask project, using SQLAlchemy, which will provide an interface like so:
# graphql request
query getElements {
    getElements(tags: [2, 3], people: [8, 13]) {
        id
    }
}
# graphql response
{
    "data": {
        "getElements": [
            {
                "id": "2"
            },
            {
                "id": "3"
            },
            {
                "id": "8"
            }
        ]
    }
}
I imagine the resolver would look something like this simplified pseudo-code, but I can't for the life of me figure out how to pull it off:
def get_elements(tags=None, people=None):
    args = {'tags': tags, 'people': people}
    if any(args):
        # this is the tricky bit - for each of DataElement's related elements,
        # I want to check if its ID is given in the corresponding argument
        data_elements = DataElement.query.filter_by(this in args)
    else:
        data_elements = DataElement.query.all()
    return data_elements
Here's a peek at the simplified database model, as requested. DataElement holds a lot of relations like this, and it works perfectly:
DataElementTag = db.Table(
    'DataElementTag',
    db.Column('id', db.Integer, primary_key=True),
    db.Column('data_element_id', db.Integer, db.ForeignKey('DataElement.id')),
    db.Column('tag_id', db.Integer, db.ForeignKey('Tag.id'))
)

class DataElement(db.Model):
    __tablename__ = 'DataElement'
    id = db.Column(db.Integer, primary_key=True)
    tags = db.relationship('Tag', secondary=DataElementTag, back_populates='data_elements')

class Tag(db.Model):
    __tablename__ = 'Tag'
    id = db.Column(db.Integer, primary_key=True)
    data_elements = db.relationship('DataElement', secondary=DataElementTag, back_populates='tags')
Please, ORM wizards and python freaks, I call upon thee!
I've solved it in a rather clunky manner. I suppose there must be a more elegant way to pull this off, and am still holding out for better answers.
I ended up looping over all the given arguments and using eval() (not on user input, don't worry) to get the corresponding database model. From there, I was able to grab the DataElement objects through the many-to-many relationship. My final solution looks like this:
args = {
    'status': status,
    'person': people,
    'tag': tags,
    'event': events,
    'location': locations,
    'group': groups,
    'year': year
}  # dictionary of args for easier data handling

if any(args.values()):
    final = []  # will contain elements matching the criteria
    for key, value in args.items():
        if value:
            # get the ORM model from the dictionary key name
            # (eval is used on a hardcoded string, hence safe)
            model = eval(key.capitalize())
            for id in value:
                # get the element in question from the db
                filter_element = model.query.filter_by(id=id).one_or_none()
                if filter_element:
                    # get the data_elements linked to the element in question
                    elements = filter_element.data_elements
                    for element in elements:
                        if element not in final:  # avoid duplicates
                            final.append(element)
return final
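For later readers: the per-ID loop and eval() can usually be collapsed into a single query by chaining the relationships' .any() operator, with getattr looking each relationship up by name. A self-contained sketch with a stripped-down schema (plain SQLAlchemy rather than Flask-SQLAlchemy, and only the tags relation), not tested against the question's real models:

```python
from sqlalchemy import Column, ForeignKey, Integer, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

element_tag = Table(
    'element_tag', Base.metadata,
    Column('element_id', ForeignKey('element.id'), primary_key=True),
    Column('tag_id', ForeignKey('tag.id'), primary_key=True),
)

class Tag(Base):
    __tablename__ = 'tag'
    id = Column(Integer, primary_key=True)

class DataElement(Base):
    __tablename__ = 'element'
    id = Column(Integer, primary_key=True)
    tags = relationship('Tag', secondary=element_tag)

def get_elements(session, **id_filters):
    """id_filters maps relationship names ('tags', ...) to lists of IDs."""
    query = session.query(DataElement)
    for rel_name, ids in id_filters.items():
        if not ids:
            continue  # argument not supplied: no constraint for this relation
        rel = getattr(DataElement, rel_name)   # e.g. DataElement.tags
        related = rel.property.mapper.class_   # e.g. Tag
        # keep elements related to at least one of the given IDs
        query = query.filter(rel.any(related.id.in_(ids)))
    return query.all()
```

This pushes both the filtering and the deduplication into the database, and extending it to the other nine relations is just a matter of passing more keyword arguments.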

How to store a subclass that is not a direct subclass of "Document" in its own collection?

As per the MongoEngine documentation on document inheritance,
I tried to create a base class as below:
import datetime
from mongoengine import *

connect("testdb")

class Base(Document):
    companyId = StringField(required=True)
    creationDate = DateTimeField()
    modifiedDate = DateTimeField()
    meta = {'allow_inheritance': True}

    def save(self, *args, **kwargs):
        if not self.creationDate:
            self.creationDate = datetime.datetime.now()
        self.modifiedDate = datetime.datetime.now()
        return super(Base, self).save(*args, **kwargs)

class Child1(Base):
    # identifier = StringField(required=True, unique=True, primary_key=True)
    createdBy = StringField(required=True)

class Child2(Base):
    memberId = StringField(required=True)

Child1(companyId='ab', createdBy='123').save()
Child2(companyId='ab', memberId='123').save()
My aim is to get two collections named Child1 and Child2 under "testdb", but instead only one collection is created, named 'base', with both documents in it:
{
    "_id" : ObjectId("5656b66381f49543f27af85a"),
    "_cls" : "Base.Child1",
    "companyId" : "ab",
    "creationDate" : ISODate("2015-11-26T13:06:01.689Z"),
    "modifiedDate" : ISODate("2015-11-26T13:06:01.689Z"),
    "createdBy" : "123"
}
{
    "_id" : ObjectId("5656b66381f49543f27af85b"),
    "_cls" : "Base.Child2",
    "companyId" : "ab",
    "creationDate" : ISODate("2015-11-26T13:06:03.621Z"),
    "modifiedDate" : ISODate("2015-11-26T13:06:03.621Z"),
    "memberId" : "123"
}
How can I specify a different collection name for each subclass?
Versions:
python 2.7.10
mongodb 3.0.3
mongoengine 0.8.7
pymongo 2.7.2
This is the expected behaviour according to the documentation you link.
To create a specialised type of a Document you have defined, you may subclass it and add any extra fields or methods you may need. As this new class is not a direct subclass of Document, it will not be stored in its own collection; it will use the same collection as its superclass uses.
That said, to store each subclass in its own collection, the base class must be an abstract class; you can then optionally specify each child class's collection name using its meta attribute.
class Base(Document):
    ...
    meta = {
        'allow_inheritance': True,
        'abstract': True
    }
    ...

class Child1(Base):
    ...
    meta = {'collection': 'child1'}  # optional

How can one customize Django Rest Framework serializers output?

I have a Django model that is like this:
class WindowsMacAddress(models.Model):
    address = models.TextField(unique=True)
    mapping = models.ForeignKey('imaging.WindowsMapping', related_name='macAddresses')
And two serializers, defined as:
class WindowsFlatMacAddressSerializer(serializers.Serializer):
    address = serializers.Field()

class WindowsCompleteMappingSerializer(serializers.Serializer):
    id = serializers.Field()
    macAddresses = WindowsFlatMacAddressSerializer(many=True)
    clientId = serializers.Field()
When accessing the serializer over a view, I get the following output:
[
    {
        "id": 1,
        "macAddresses": [
            {
                "address": "aa:aa:aa:aa:aa:aa"
            },
            {
                "address": "bb:bb:bb:bb:bb:bb"
            }
        ],
        "clientId": null
    }
]
Almost good, except that I'd prefer to have:
[
    {
        "id": 1,
        "macAddresses": [
            "aa:aa:aa:aa:aa:aa",
            "bb:bb:bb:bb:bb:bb"
        ],
        "clientId": null
    }
]
How can I achieve that?
Create a custom serializer field and implement to_native so that it returns the list you want.
If you use the source="*" technique then something like this might work:
class CustomField(Field):
    def to_native(self, obj):
        return obj.macAddresses.all()
I hope that helps.
Update for djangorestframework>=3.9.1
According to the documentation, you now need to override either one or both of the to_representation() and to_internal_value() methods. Example:
class CustomField(Field):
    def to_representation(self, value):
        return {'id': value.id, 'name': value.name}
Carlton's answer will do the job just fine. There are also a couple of other approaches you could take.
You can also use SlugRelatedField, which represents the relationship, using a given field on the target.
So for example...
class WindowsCompleteMappingSerializer(serializers.Serializer):
    id = serializers.Field()
    macAddresses = serializers.SlugRelatedField(slug_field='address', many=True, read_only=True)
    clientId = serializers.Field()
Alternatively, if the __str__ of WindowsMacAddress simply displays the address, then you could just use RelatedField, a basic read-only field that gives you a simple string representation of the relationship target.
# models.py
class WindowsMacAddress(models.Model):
    address = models.TextField(unique=True)
    mapping = models.ForeignKey('imaging.WindowsMapping', related_name='macAddresses')

    def __str__(self):
        return self.address

# serializers.py
class WindowsCompleteMappingSerializer(serializers.Serializer):
    id = serializers.Field()
    macAddresses = serializers.RelatedField(many=True)
    clientId = serializers.Field()
Take a look through the documentation on serializer fields to get a better idea of the various ways you can represent relationships in your API.

Creating a tree from self referential tables in SQLalchemy

I'm building a basic CMS in flask for an iPhone oriented site and I'm having a little trouble with something. I have a very small database with just 1 table (pages). Here's the model:
class Page(db.Model):
    __tablename__ = 'pages'
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String(100), nullable=False)
    content = db.Column(db.Text, nullable=False)
    parent_id = db.Column(db.Integer, db.ForeignKey("pages.id"), nullable=True)
As you can see, sub-pages just reference another page object via the parent_id field. What I'm trying to do in the admin panel is have a nested unordered list with all the pages nested under their parent pages. I have very little idea how to do this. All I can think of is the following (which will only work, maybe, two levels down; I haven't tested it):
pages = Page.query.filter_by(parent_id=None)
for page in pages:
    if Page.query.filter_by(parent_id=page.id):
        page.sub_pages = Page.query.filter_by(parent_id=page.id)
I would then just format it into a list in the template. How would I make this work with potentially over 10 nested pages?
Thanks heaps in advance!
EDIT: I've looked around a bit and found http://www.sqlalchemy.org/docs/orm/relationships.html#adjacency-list-relationships, so I added
children = db.relationship("Page", backref=db.backref("parent", remote_side=id))
to the bottom of my Page model, and I'm looking at recursively going through everything and adding it to a tree of objects. I've probably made no sense, but that's the best way I can describe it.
EDIT 2: I had a go at making a recursive function to run through all the pages and generate a big nested dictionary with all the pages and their children, but it keeps crashing Python, so I think it's just infinite recursion... here's the function:
def get_tree(base_page, dest_dict):
    dest_dict = {'title': base_page.title, 'content': base_page.content}
    children = base_page.children
    if children:
        dest_dict['children'] = {}
        for child in children:
            get_tree(base_page, dest_dict)
    else:
        return
and the page I'm testing it with:
@app.route('/test/')
def test():
    pages = Page.query.filter_by(parent_id=None)
    pages_dict = {}
    for page in pages:
        get_tree(page, pages_dict)
    return str(pages_dict)
Anyone got any ideas?
Look at http://sqlamp.angri.ru/index.html
or http://www.sqlalchemy.org/trac/browser/examples/adjacency_list/adjacency_list.py
UPD: a declarative version of the adjacency_list.py example:
from sqlalchemy import Column, Integer, String, ForeignKey
from sqlalchemy.orm import relationship, backref
from sqlalchemy.orm.collections import attribute_mapped_collection
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base(metadata=metadata)  # `metadata` comes from the example's setup

class TreeNode(Base):
    __tablename__ = 'tree'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('tree.id'))
    name = Column(String(50), nullable=False)
    children = relationship(
        'TreeNode',
        # cascade deletions
        cascade="all",
        # many to one + adjacency list - remote_side
        # is required to reference the 'remote'
        # column in the join condition.
        backref=backref("parent", remote_side='TreeNode.id'),
        # children will be represented as a dictionary
        # keyed on the "name" attribute.
        collection_class=attribute_mapped_collection('name'),
    )

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent

    def append(self, nodename):
        self.children[nodename] = TreeNode(nodename, parent=self)

    def __repr__(self):
        return "TreeNode(name=%r, id=%r, parent_id=%r)" % (
            self.name,
            self.id,
            self.parent_id
        )
Fix the recursion (recurse into the child, and mutate the passed-in dict instead of rebinding it):
def get_tree(base_page, dest_dict):
    dest_dict['title'] = base_page.title
    dest_dict['content'] = base_page.content
    children = base_page.children
    if children:
        dest_dict['children'] = {}
        for child in children:
            child_dict = {}
            get_tree(child, child_dict)
            dest_dict['children'][child.title] = child_dict
Use a query like the one in the example to fetch the data recursively from the db:
# 4 levels deep
node = session.query(TreeNode).\
    options(joinedload_all("children", "children",
                           "children", "children")).\
    filter(TreeNode.name == "rootnode").\
    first()
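If the nesting depth is unbounded, a common alternative to the fixed-depth joinedload_all chain is to fetch every page in one query and stitch the tree together in Python, grouping rows by parent_id. A self-contained sketch, with plain (id, title, parent_id) tuples standing in for Page rows:

```python
from collections import defaultdict

def build_tree(rows):
    """rows: iterable of (id, title, parent_id) tuples; returns root nodes."""
    children_of = defaultdict(list)  # parent_id -> [child ids]
    nodes = {}
    for id_, title, parent_id in rows:
        nodes[id_] = {'id': id_, 'title': title, 'children': []}
        children_of[parent_id].append(id_)

    def attach(node_id):
        node = nodes[node_id]
        node['children'] = [attach(child) for child in children_of[node_id]]
        return node

    # roots are the pages whose parent_id is None
    return [attach(root_id) for root_id in children_of[None]]

tree = build_tree([
    (1, 'Home', None),
    (2, 'About', 1),
    (3, 'Team', 2),
    (4, 'Contact', None),
])
```

This is one database round-trip regardless of depth, and the resulting nested dicts render directly as a nested unordered list in a template.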
