mongoengine dereferencing nested documents - python

I have several complex mongo models. For example, a User that references a Role with some properties. Now when I retrieve users, I want the role property to be populated with the fields of the referenced Role object, not its object id.
from mongoengine import *

connect('test_database')

class Role(Document):
    name = StringField(required=True)
    description = StringField(required=True)

class User(Document):
    role = ReferenceField(Role, reverse_delete_rule=DENY)

r = Role(name='test', description='foo').save()
User(role=r).save()

print(User.objects().select_related()[0].to_mongo().to_dict())
# prints: {'_id': ObjectId('5c769af4e98fc24f4a82fd99'), 'role': ObjectId('5c769af4e98fc24f4a82fd98')}
# want: {'_id': '5c769af4e98fc24f4a82fd99', 'role': {'name': 'test', 'description': 'foo'}}
How do I go about achieving this, for any complex mongoengine object?

MongoEngine does not provide anything out of the box, but you can either define a method (e.g. to_dict(self)) on your Document class or use a serialisation library like marshmallow-mongoengine.
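For the hand-rolled route, a minimal sketch (the to_dict name and the fields it inlines are illustrative, not a MongoEngine API):

class User(Document):
    role = ReferenceField(Role, reverse_delete_rule=DENY)

    def to_dict(self):
        # Start from the raw mongo representation...
        d = self.to_mongo().to_dict()
        # ...then replace the role ObjectId with the dereferenced fields.
        if self.role is not None:
            d['role'] = {'name': self.role.name,
                         'description': self.role.description}
        return d

print(User.objects()[0].to_dict())
# {'_id': ObjectId('...'), 'role': {'name': 'test', 'description': 'foo'}}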

Combine Flask-Marshmallow with marshmallow-jsonapi

Overview
I am using Flask-SqlAlchemy and now I am looking into marshmallow to help me serialize and deserialize request data.
I was able to successfully:
Create my models using Flask-SqlAlchemy
Use Flask-Marshmallow to serialize database objects using the same model, by using the Optional Flask-SqlAlchemy Integration
Use marshmallow-jsonapi to quickly generate JSON API compliant responses. This required me to declare new Schemas to specify which attributes I want to include (duplicating what the Flask-SqlAlchemy models already declare)
Code Samples
Flask-SqlAlchemy Declarative Model
class Space(db.Model):
    __tablename__ = 'spaces'
    id = sql.Column(sql.Integer, primary_key=True)
    name = sql.Column(sql.String)
    version = sql.Column(sql.String)
    active = sql.Column(sql.Boolean)
flask_marshmallow Schema declaration (generated from the SqlAlchemy model)
ma = flask_marshmallow.Marshmallow(app)

class SpaceSchema(ma.ModelSchema):
    class Meta:
        model = Space

# API response
space = Space.query.first()
return SpaceSchema().dump(space).data

# Returns:
{
    'id': 123,
    'version': '0.1.0',
    'name': 'SpaceName',
    'active': True
}
marshmallow-jsonapi: requires a new Schema declaration; each attribute and type must be listed manually
class SpaceJsonSchema(marshmallow_json.Schema):
    id = fields.Str(dump_only=True)
    name = fields.Str()
    version = fields.Str()
    active = fields.Bool()

    class Meta:
        type_ = 'spaces'
        self_url = '/spaces/{id}'
        self_url_kwargs = {'id': '<id>'}
        self_url_many = '/spaces/'
        strict = True
# Returns JSON API compliant:
{
    'data': {
        'id': '1',
        'type': 'spaces',
        'attributes': {
            'name': 'Phonebooth',
            'active': True,
            'version': '0.1.0'
        },
        'links': {'self': '/spaces/1'}
    },
    'links': {'self': '/spaces/1'}
}
Issue
As shown in the code, marshmallow-jsonapi lets me create JSON API compliant responses, but I end up having to maintain both the declarative model and a separate response schema.
flask-marshmallow allows me to create Schema responses from the SqlAlchemy models, so I don't have to maintain a separate set of properties for each model.
Question
Is it at all possible to use flask-marshmallow and marshmallow-jsonapi together, so that I can (1) create a marshmallow Schema from a SqlAlchemy model AND (2) automatically generate JSON API compliant responses?
I tried creating a Schema declaration that inherits from both ma.ModelSchema and marshmallow_json.Schema, in both orders, but it does not work (it raises exceptions about missing methods and properties).
marshmallow-jsonapi
marshmallow-jsonapi provides a simple way to produce JSON API-compliant data in any Python web framework.
flask-marshmallow
Flask-Marshmallow includes useful extras for integrating with Flask-SQLAlchemy and marshmallow-sqlalchemy.
Not a solution to this exact problem, but I ran into similar issues when implementing this library: https://github.com/thomaxxl/safrs (sqlalchemy + flask-restful + JSON API compliant spec).
I don't remember exactly how I got around it, but if you try it and serialization doesn't work, I can help you resolve it if you open an issue on GitHub.

Create ORM object from dict and add to session

Let's say I have a User model with attributes id, name, email and a relationship languages.
Is it possible to create a User instance from existing data that behaves like I would have queried it with dbsession.query(User).get(42)?
What I mean in particular is that I want that an access to user.languages creates a subquery and populates the attribute.
Here a code example:
I have a class User:
class User(Base):
    id = Column(Integer, primary_key=True)
    name = Column(String(64))
    email = Column(String(64))
    languages = relationship('Language', secondary='user_languages')
I already have a lot of users stored in my DB.
And I know that I have, for example, this user in my DB:
user_dict = {
    'id': 23,
    'name': 'foo',
    'email': 'foo@bar',
}
So I have all the attributes but the relations.
Now I want to make a sqlalchemy User instance and kind of register it in sqlalchemy's system, so I can get the languages if needed.
user = User(**user_dict)

# Now I can access the id, name, and email attributes:
assert user.id == 23

# But since sqlalchemy thinks this is a new object, it doesn't
# lazy-load any relationships:
assert len(user.languages) == 0

# I want the languages for the user with id 23 to appear here,
# i.e. `user` should be the same as if I had done:
user_from_db = DBSession.query(User).get(23)
assert user == user_from_db
The use-case is that I have a big model with lots of complex relationships, but 90% of the time I don't need the data from those. So I only want to cache the direct attributes plus whatever else I need, then load those from the cache as above and use the sqlalchemy model as if it had been queried from the db.
From the sqlalchemy mailing list:
# to make it look like it was freshly loaded from the db
from sqlalchemy.orm.session import make_transient_to_detached
make_transient_to_detached(user)
# merge instance in session without emitting sql
user = DBSession.merge(user, load=False)
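Putting it together, a short sketch of the cache-restore flow described in the question (using the question's User model and DBSession):

from sqlalchemy.orm.session import make_transient_to_detached

# Rebuild the instance from the cached attributes.
user = User(**user_dict)

# Mark it as if it had been loaded from the database...
make_transient_to_detached(user)

# ...and attach it to the session without emitting any SQL.
user = DBSession.merge(user, load=False)

# Relationship access now lazy-loads as usual.
print(user.languages)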
This answer was extracted from the question.

GAE NDB with Endpoints Proto Datastore: How to format response fields of reference property?

I have a parent-child relationship in my Datastore model: a Building entity that references an Office entity. I query the Building model and would like to limit the fields of the Office entity in the JSON response.
Here is my code:
@Building.query_method(collection_fields=('id', 'name', 'office'),
                       path='buildings', name='list')
def List(self, query):
    return query
The collection_fields attribute works great for defining the parent entity's fields (Building), but how do I limit the fields of the child entity?
Here is my response message in JSON:
{
    'id': 5,
    'name': 'building name',
    'office': {
        'name': 'office name',
        'field1': 'test',
        'field2': 'test',
        'field3': 'test'
    }
}
I would like to remove some fields from the Office object (i.e. field1, field2, etc.) to reduce the JSON response size.
Defining limited_message_fields_schema on the Office object is not a good solution, because it works globally; I would like to format only this single query.
You can create an EndpointsAliasProperty on the Building model, transform self.office there, and use that property in collection_fields:
@EndpointsAliasProperty
def office_ltd(self):
    limited = doSomethingWith(self.office)
    return limited

@Building.query_method(collection_fields=('id', 'name', 'office_ltd'),
                       path='buildings', name='list')
def List(self, query):
    return query
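For instance, doSomethingWith (a placeholder in the answer above) could be replaced by a body that dereferences the office key and exposes only its name; this sketch assumes office is a KeyProperty and that a plain string alias is enough:

@EndpointsAliasProperty
def office_ltd(self):
    # Hypothetical body: dereference the Office key and expose only
    # its name, dropping field1, field2, etc. from the response.
    office = self.office.get()
    return office.name if office else None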

Is it possible to add fields not present in the structure on the fly?

I was trying out mongokit and I'm having a problem. I thought it would be possible to add fields not present in the schema on the fly, but apparently I can't save the document.
The code is as below:
from mongokit import *

connection = Connection()

@connection.register
class Test(Document):
    structure = {'title': unicode, 'body': unicode}
On the python shell:
test = connection.testdb.testcol.Test()
test['foo'] = u'bar'
test['title'] = u'my title'
test['body'] = u'my body'
test.save()
This gives me a
StructureError: unknown fields ['foo'] in Test
I have an application where, while I have a core of fields that are always present, I can't predict what new fields will be necessary beforehand. Basically, in this case, it's up to the client to insert whatever fields it finds necessary. I'll just receive whatever it sends, do my thing, and store it in mongodb.
But there is still a core of fields that are common to all documents, so it would be nice to type and validate them.
Is there a way to solve this with mongokit?
According to the MongoKit structure documentation you can have optional fields if you use the Schemaless Structure feature.
As of version 0.7, MongoKit allows you to save partially structured documents.
So if you set up your class like this, it should work:
from mongokit import *

class Test(Document):
    use_schemaless = True
    structure = {'title': unicode, 'body': unicode}
    required_fields = ['title', 'body']
That will require title and body but should allow any other fields to be present. According to the docs:
MongoKit will raise an exception only if required fields are missing
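With that in place, the original shell session should go through, extra field and all:

test = connection.testdb.testcol.Test()
test['title'] = u'my title'
test['body'] = u'my body'
test['foo'] = u'bar'  # not in structure; allowed by use_schemaless
test.save()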

Foreign key relationship with peewee and python

I'm trying to set up a database ORM with peewee and am not clear on the use of foreign key relationships.
from peewee import *

db = SqliteDatabase('datab.db')

class datab(Model):
    class Meta:
        database = db

class Collection(datab):
    identifier = CharField()
    title = CharField()

class File(datab):
    identifier = ForeignKeyField(Collection, related_name='files')
    name = CharField()
Later, I do an import of "Collections"
for value in collection:
    Collection(**value).save()
Finally, where I am having trouble is adding the Files to the Collections:
for value in collectionFiles:
    File(**value).save()
Within the value dict there is a key "identifier" whose value should associate with the Collection's identifier field.
However I get an error message:
ValueError: invalid literal for int() with base 10: 'somevalue'
If I change File's identifier field from a ForeignKeyField to a CharField, it saves the data.
I'm realizing I'm doing it wrong. My assumption was that the unique identifier value in each table would drive the foreign key. After reading the documentation, it looks like the foreign key setup is a bit different. Do I need to do something like Collections.File.files(**values).save()? In other words, instead of doing a straight data import, should I load the collection object and then add the file's associated fields through peewee?
Values that make up a File:
{'crc32': '63bee49d',
 'format': 'Metadata',
 'identifier': u'somevalue',
 'md5': '34104ffce9e4084fd3641d0decad910a',
 'mtime': '1368328224',
 'name': 'lupi.jpg_meta.txt',
 'sha1': '1448ed1159a5d770da76067dd1c53e94d5a01535',
 'size': '1244'}
I think the naming of your fields might be part of the confusion. Rather than calling the foreign key from File -> Collection "identifier", you might call it "collection" instead.
class File(datab):
    collection = ForeignKeyField(Collection, related_name='files')
    name = CharField()
Peewee prefers that, when setting the value of a Foreign Key, it be a model instance. For example, rather than doing:
File.create(collection='foobar', name='/secret/password')
It is preferable to do something like this:
collection = Collection.get(Collection.identifier == 'foobar')
File.create(collection=collection, name='/secret/password')
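Applied to the question's import loop, a sketch (assuming the foreign key has been renamed to collection as above, and that each value dict carries the identifier and name keys shown in the question):

for value in collectionFiles:
    # Resolve the string identifier to the related Collection row first.
    collection = Collection.get(Collection.identifier == value['identifier'])
    File.create(collection=collection, name=value['name'])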
As a final note, if the Collection "identifier" is the unique primary key, you can set it up thus:
class Collection(datab):
    identifier = CharField(primary_key=True)
    title = CharField()
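With identifier as the primary key, the foreign-key column stores the string identifier itself, so something like the following should then work without the int() error (assuming peewee's usual behavior of accepting a raw primary-key value for a foreign-key field):

# The FK column is now text, so the raw identifier string resolves cleanly.
File.create(collection='somevalue', name='lupi.jpg_meta.txt')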
(I'm not familiar with peewee, but if it works like Django then the idea is the same.)
class File has a ForeignKeyField and a CharField, so you can't simply save a pair of strings with File(**value). You need to resolve the string to the related row first, like this:

for value in collectionFiles:
    identifier = value['identifier']
    name = value['name']
    collection_entity = Collection.get(Collection.identifier == identifier)
    File(identifier=collection_entity, name=name).save()
