I've been using mongoengine for a while now and have a ton of python data processing code that relies on a common set of Object Document Models.
Now I need to access the same mongodb instances from Flask. I'd like to use the same ODM definitions.
from mongoengine import Document, StringField

class User(Document):
    email = StringField(required=True)
    first_name = StringField(max_length=50)
    last_name = StringField(max_length=50)
The problem is that flask-mongoengine requires you to first set up your Flask context "db" and then build your ODM definitions, inheriting the document class and field types from "db" instead of from the base mongoengine classes.
class User(db.Document):
    email = db.StringField(required=True)
    first_name = db.StringField(max_length=50)
    last_name = db.StringField(max_length=50)
One solution, I suppose, is to make copies of all of the existing ODM definitions, import "db" from my main flask app, and then prepend everything with "db." If I do that, I'll have to maintain two sets of nearly identical ODM definitions.
If I simply change everything to the "db." version, that would probably break all of my legacy code.
So I'm thinking there might be a trick using super() on the document classes that can detect whether I'm importing my ODM into a Flask context or from a standalone data-processing script.
I'm also thinking I shouldn't have to super() every field type for every document; a common helper should be able to take care of that for me.
However, my super() skills are weak, and I'm not even certain that's the best approach. I was hoping someone might be able and willing to share some hints on how to approach this.
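For illustration, here is one shape that shared-base idea could take: a tiny shim module that every ODM file imports from, reusing the Flask-MongoEngine "db" object when it is available and falling back to plain mongoengine otherwise. This is only a sketch under assumptions; the myflaskapp.extensions path is hypothetical, and it presumes db.Document / db.StringField are drop-in compatible with the base mongoengine classes.

# odm_base.py -- sketch only; "myflaskapp.extensions" is a placeholder for
# wherever the Flask-MongoEngine "db" object actually lives.
try:
    from myflaskapp.extensions import db as _db   # running inside the Flask app
    Document = _db.Document
    StringField = _db.StringField
except ImportError:
    from mongoengine import Document, StringField  # standalone processing scripts

# The shared ODM definitions then import from this module instead of choosing a side:
#
#   from odm_base import Document, StringField
#
#   class User(Document):
#       email = StringField(required=True)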
Does Python (Flask) SQLAlchemy use both the DAO and ORM designs, or just ORM?
I am learning design strategies and thought of SQLAlchemy. Is it considered a DAO (it is clearly an ORM) as well?
By default, it does not look like a DAO.
What if I defined a class for an existing model class? For example, given the following class:
class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    email = db.Column(db.String(120), unique=True, nullable=False)
    verified = db.Column(db.String(5), unique=False, nullable=False)
I define another class, UserDao
class UserDao:
    def addNewUser(self, user):
        pass

    def retrieveAllUsers(self):
        users = User.query.limit(5).all()
        return users
And if I instantiate an object of this UserDao class and call the respective methods to do some database operations, does this make it a "DAO pattern"?
Regarding your question "does this make it a DAO pattern?": I would say yes, it is in that direction; however, you need to look at your application as a whole to answer it.
The idea of a DAO is to keep the code of your application that does the business logic separate from the code that handles how you get and store the data. So the easiest way to answer your question is to look at your whole application and ask yourself: "If tomorrow I want to work with MongoDB (for example) instead of MySQL (whether I am using SQLAlchemy or not), what code in my application would I need to change?" If you would need to change business-logic code, then you don't have a DAO, or at least not a strong one. If you would only need to touch the code that handles database operations, then you have a DAO. You can also look at it another way: what if tomorrow you decide you are better off using the Django ORM instead of SQLAlchemy? What code would you need to change? Would you need to touch the code that does the business logic?
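For illustration only, here is a rough sketch of that separation, reusing the User model and db object from the question (the method bodies are assumptions, not a reference implementation):

class UserDao:
    """The only layer that knows about SQLAlchemy / the User model."""

    def add_new_user(self, email):
        user = User(email=email, verified="false")
        db.session.add(user)
        db.session.commit()
        return user

    def retrieve_all_users(self, limit=5):
        return User.query.limit(limit).all()


# Business logic depends only on the DAO's interface, so swapping the storage
# backend later would only require a new DAO implementation:
def signup(dao, email):
    return dao.add_new_user(email)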
SQLAlchemy implements the repository pattern, not the DAO pattern.
You don't have to implement any DAO classes in your code when you use SQLAlchemy.
Using the SQLAlchemy ORM session (or classes based on db.Model) means you are already using a repository.
References
https://techspot.zzzeek.org/2012/02/07/patterns-implemented-by-sqlalchemy/
I'm in the process of writing my first RESTful web service atop GAE and the Python 2.7 runtime; I've started out using Guido's shiny new ndb API.
However, I'm unsure how to solve a particular case without the implicit back-reference feature of the original db API. Suppose the user agent requests a particular resource along with the resources one degree removed from it:
host/api/kind/id?depth=2
What's the best way to discover a related collection of entities from the "one" in a one-to-many relationship, given that the kind of the related entity is unknown at development time?
I'm unable to use a replacement query as described in a previous SO inquiry due to the latter restriction. The fact that my model is definable at runtime (and therefore isn't hardcoded) prevents me from using a query to filter properties for matching keys.
Ancestor and other kindless queries are also out due to the datastore limitation that prevents me from filtering on a property without the kind specified.
Thus far, the only idea I've had (beyond reverting to the db API) is to use a cross-group transaction to write my own reference on the "one": either updating an ndb.StringProperty(repeated=True) containing all the related kinds whenever an entity of a new kind is introduced, or simply maintaining a list of keys in an ndb.KeyProperty(repeated=True) on the "one" every time a related "many" entity is written to the datastore.
I'm hoping someone more experienced than myself can suggest a better approach.
Given jmort253's suggestion, I'll try to augment my question with a concrete example adapted from the docs:
from google.appengine.ext import ndb

class Contact(ndb.Expando):
    """ The One """
    # basic info
    name = ndb.StringProperty()
    birth_day = ndb.DateProperty()

    # If I were using db, a collection called 'phone_numbers' would be implicitly
    # created here. I could use this property to retrieve related phone numbers
    # when this entity was queried. Since NDB lacks this feature, the service
    # will neither have a reference to query nor the means to know the
    # relationship exists in the first place, since it cannot be hard-coded. The
    # data model is extensible and user-defined at runtime; most relationships
    # will be described only in the data, and must be discoverable by the server.
    # In this case, when Contact is queried, I need a way to retrieve the
    # collection of phone numbers.

    # Company info.
    company_title = ndb.StringProperty()
    company_name = ndb.StringProperty()
    company_description = ndb.StringProperty()
    company_address = ndb.PostalAddressProperty()

class PhoneNumber(ndb.Expando):
    """ The Many """
    # no collection_name='phone_numbers' equivalent exists for the key property
    contact = ndb.KeyProperty(kind='Contact')
    number = ndb.PhoneNumberProperty()
Interesting question! So basically you want to look at the Contact class and find out if there is some other model class that has a KeyProperty referencing it; in this example PhoneNumber (but there could be many).
I think the solution is to ask your users to explicitly add this link when the PhoneNumber class is created.
You can make this easy for your users by giving them a subclass of KeyProperty that takes care of this; e.g.
class LinkedKeyProperty(ndb.KeyProperty):
    def _fix_up(self, cls, code_name):
        super(LinkedKeyProperty, self)._fix_up(cls, code_name)
        modelclass = ndb.Model._kind_map[self._kind]
        collection_name = '%s_ref_%s_to_%s' % (cls.__name__,
                                               code_name,
                                               modelclass.__name__)
        setattr(modelclass, collection_name, (cls, self))
Exactly how you pick the name for the collection and the value to store there is up to you; just put something there that makes it easy for you to follow the link back. The example would create a new attribute on Contact:
Contact.PhoneNumber_ref_contact_to_Contact == (PhoneNumber, PhoneNumber.contact)
[Edited to make the code work and to add an example. :-)]
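A hedged usage sketch of the above, assuming PhoneNumber declared its reference as contact = LinkedKeyProperty(kind='Contact') instead of a plain KeyProperty (the entity values are made up):

contact_key = Contact(name='Ada Lovelace').put()

# The tuple stored on Contact says which class points back at it and through
# which property, so the related entities can be discovered and fetched:
related_cls, key_prop = Contact.PhoneNumber_ref_contact_to_Contact
numbers = related_cls.query(key_prop == contact_key).fetch()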
Sounds like a good use case for ndb.StructuredProperty.
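For example (a minimal sketch, not the original poster's model; the property names are illustrative), the phone numbers could be embedded directly in the Contact entity:

from google.appengine.ext import ndb

class PhoneNumber(ndb.Model):
    phone_type = ndb.StringProperty(choices=('home', 'work', 'mobile'))
    number = ndb.StringProperty()

class Contact(ndb.Expando):
    name = ndb.StringProperty()
    phone_numbers = ndb.StructuredProperty(PhoneNumber, repeated=True)

# A single get() then returns the contact together with its numbers, so no
# separate kind or back-reference discovery is needed.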
Is it possible to use Django's user authentication features with more than one profile?
Currently I have a settings.py file that has this in it:
AUTH_PROFILE_MODULE = 'auth.UserProfileA'
and a models.py file that has this in it:
from django.db import models
from django.contrib.auth.models import User

class UserProfileA(models.Model):
    company = models.CharField(max_length=30)
    user = models.ForeignKey(User, unique=True)
that way, if a user logs in, I can easily get the profile because the User has a get_profile() method. However, I would like to add UserProfileB. From looking around a bit, it seems that the starting point is to create a superclass to use as the AUTH_PROFILE_MODULE and have both UserProfileA and UserProfileB inherit from that superclass. The problem is, I don't think the get_profile() method returns the correct profile. It would return an instance of the superclass. I come from a java background (polymorphism) so I'm not sure exactly what I should be doing.
Thanks!
Edit:
Well I found a way to do it via something called an "inheritance hack" that I found at this site http://djangosnippets.org/snippets/1031/
It works really well, however, coming from a java background where this stuff happens automatically, I'm a little unsettled by the fact that someone had to code this up and call it a "hack" to do it in python. Is there a reason why python doesn't enable this?
So the issue you're going to have is that whatever you want for your profile, you need to persist it in a database of some sort. Basically all of the back-ends for Django are relational, and thus every field in a persisted object is present in every row of the table. There are a few ways of getting what you want.
Django provides some support for inheritance. You can use the techniques listed and get reasonable results in a polymorphic way.
The most direct approach is to use multiple table inheritance. Roughly:
class UserProfile(models.Model):
    # set settings.AUTH_PROFILE_MODULE to this class!
    pass

class UserProfileA(UserProfile):
    pass

class UserProfileB(UserProfile):
    pass
To use it:
try:
    profile = user.get_profile().userprofilea
    # user profile is UserProfileA
except UserProfileA.DoesNotExist:
    # user profile wasn't UserProfileA
    pass

try:
    profile = user.get_profile().userprofileb
    # user profile is UserProfileB
except UserProfileB.DoesNotExist:
    # user profile wasn't either A or B...
    pass
Edit: Re: your comment.
The relational model implies a number of things that seem to disagree with object-oriented philosophy. For a relation to be useful, it requires every element in the relation to have the same dimensions, so that relational queries are valid for the whole relation. Since this is known a priori, before encountering any instance of a class stored in the relation, a row cannot be a subclass. Django's ORM overcomes this impedance mismatch by storing the subclass information in a different relation (one specific to the subclass). There are other solutions, but they all obey this basic nature of the relational model.
If it helps you come to terms with this, I'd suggest looking at how persistence on an RDBMS works for applications in the absence of an ORM. In particular, relational databases are more about collections and summaries of many rows than about applying behaviors to data once fetched from the database.
The specific example of using the profile feature of django.contrib.auth is a rather uninteresting one, especially if the only way that model is ever used is to fetch the profile data associated with a particular django.contrib.auth.models.User instance. If there are no other queries, you don't need a django.db.models.Model subclass at all. You can pickle a regular Python class and store it in a blob field of an otherwise featureless model.
On the other hand, if you want to do more interesting things with profiles, like search for users that live in a particular city, then it will be important for all profiles to have an index for their city property. That's got nothing to do with OOP, and everything to do with relational.
The idios app by the Pinax team aimed at solving the multiple-profile problem. You can tweak the model to make the inheritance of the base profile class either abstract or non-abstract.
https://github.com/eldarion/idios
Here is the answer to my question of how to get multiple profiles to work:
from django.contrib.contenttypes.models import ContentType
from django.db import models

class Contact(models.Model):
    content_type = models.ForeignKey(ContentType, editable=False, null=True)

    def save(self, *args, **kwargs):
        if not self.content_type:
            self.content_type = ContentType.objects.get_for_model(self.__class__)
        self.save_base(*args, **kwargs)

    def as_leaf_class(self):
        content_type = self.content_type
        model = content_type.model_class()
        if model == Contact:
            return self
        return model.objects.get(id=self.id)
I don't really understand why it works, or why the developers of Django/Python made inheritance work this way.
If you have app-specific options for each user, I would rather recommend to put them into a separate model.
A simplified example:
class UserSettings(models.Model):
    user = models.ForeignKey(User, primary_key=True)

    # Settings go here
    defaultLocale = models.CharField(max_length=80, default="en_US")
    ...
This would be used like so:
def getUserSettings(request):
    try:
        return UserSettings.objects.get(pk=request.user)
    except UserSettings.DoesNotExist:
        # Use defaults instead; that's why you should define reasonable defaults
        # in the UserSettings model
        return UserSettings()
I have a db.Model which has several properties, as described below:
from google.appengine.ext import db

class Doc(db.Model):
    docTitle = db.StringProperty(required=True)
    docText = db.TextProperty()
    docUser = db.UserProperty(required=True)
    docDate = db.DateTimeProperty(auto_now_add=True)
In the template I just list the names of these documents as links. For that purpose I use the following query:
docList = Doc.gql("WHERE docUser = :1 ORDER BY docDate DESC", user)
As you can see, docList includes all properties (including the TextProperty). However, I just use its docTitle and key() in my view.
Is there an alternative way to retrieve just the requested attributes of the model class?
If not, should I use PolyModel classes to differentiate the listing and actual usage of the Doc model class by creating another model class for the docText property?
EDIT: I am using the webapp framework on Google App Engine...
Entities are stored in the App Engine datastore as serialized protocol buffers, which are returned as a single blob, so it's not possible to just retrieve part of them. In any case, this would only save on RPC overhead between the datastore and your app, so the savings would be minimal.
If the size of each entity is significant, you may want to separate the model out, as you suggest. You don't need to (and probably shouldn't) use PolyModel, though - just use two model classes, a 'summary' and a 'detail' one.
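A rough sketch of that split, adapted from the model in the question (the class layout is an assumption, not a prescription):

from google.appengine.ext import db

class DocSummary(db.Model):
    """Lightweight 'summary' entity, enough for listing pages."""
    docTitle = db.StringProperty(required=True)
    docUser = db.UserProperty(required=True)
    docDate = db.DateTimeProperty(auto_now_add=True)

class DocDetail(db.Model):
    """Heavy 'detail' entity, fetched only when the full text is needed."""
    summary = db.ReferenceProperty(DocSummary, collection_name='details')
    docText = db.TextProperty()

# The listing query then only pulls the small summary entities:
# docList = DocSummary.gql("WHERE docUser = :1 ORDER BY docDate DESC", user)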
I need to store some data in a Django model. These data are not equal to all instances of the model.
At first I thought about subclassing the model, but I’m trying to keep the application flexible. If I use subclasses, I’ll need to create a whole class each time I need a new kind of object, and that’s no good. I’ll also end up with a lot of subclasses only to store a pair of extra fields.
I really feel that a dictionary would be the best approach, but there’s nothing in the Django documentation about storing a dictionary in a Django model (or I can’t find it).
Any clues?
If it's really dictionary-like arbitrary data you're looking for, you can probably use a two-level setup: one model that's a container and another model for key-value pairs. You'd create an instance of the container, create each of the key-value instances, and associate the set of key-value instances with the container instance. Something like:
from django.db import models

class Dicty(models.Model):
    name = models.CharField(max_length=50)

class KeyVal(models.Model):
    container = models.ForeignKey(Dicty, db_index=True)
    key = models.CharField(max_length=240, db_index=True)
    value = models.CharField(max_length=240, db_index=True)
It's not pretty, but it'll let you access/search the innards of the dictionary using the DB whereas a pickle/serialize solution will not.
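A short usage sketch (the values are made up) showing how the pairs stay queryable through the ORM:

d = Dicty.objects.create(name='prefs')
KeyVal.objects.create(container=d, key='theme', value='dark')
KeyVal.objects.create(container=d, key='lang', value='en')

# Because each pair is a real row, the "dictionary" can be searched in the DB:
dark_containers = Dicty.objects.filter(keyval__key='theme', keyval__value='dark')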
Another clean and fast solution can be found here: https://github.com/bradjasper/django-jsonfield
For convenience I copied the simple instructions.
Install
pip install jsonfield
Usage
from django.db import models
from jsonfield import JSONField

class MyModel(models.Model):
    json = JSONField()
If you don't need to query by any of this extra data, then you can store it as a serialized dictionary. Use repr to turn the dictionary into a string, and eval to turn the string back into a dictionary. Take care with eval that there's no user data in the dictionary, or use a safe_eval implementation.
For example, in the create and update methods of your views, you can add:
if not isinstance(request.data, dict):
    req_data = request.data.dict().copy()
else:
    req_data = request.data.copy()

dict_key = 'request_parameter_that_has_a_dict_inside'
if dict_key in req_data and isinstance(req_data[dict_key], dict):
    req_data[dict_key] = repr(req_data[dict_key])
I came to this post via Google's 4th result for "django store object".
A little bit late, but django-picklefield looks like a good solution to me.
Example from doc:
To use, just define a field in your model:
>>> from picklefield.fields import PickledObjectField
>>> class SomeObject(models.Model):
...     args = PickledObjectField()
and assign whatever you like (as long as it's picklable) to the field:
>>> obj = SomeObject()
>>> obj.args = ['fancy', {'objects': 'inside'}]
>>> obj.save()
As Ned answered, you won't be able to query "some data" if you use the dictionary approach.
If you still need to store dictionaries, then the best approach, by far, is the PickleField class documented in Marty Alchin's new book Pro Django. This method uses Python class properties to pickle/unpickle a Python object on demand, storing the result in a model field.
The basics of this approach are to use Django's contribute_to_class method to dynamically add a new field to your model, and to use getattr/setattr to do the serializing on demand.
One of the few online examples I could find that is similar is this definition of a JSONField.
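The book's field isn't reproduced here, but as a rough illustration of the same pickle-on-demand idea, here is a minimal sketch using the modern custom-field API (from_db_value/get_prep_value) rather than the contribute_to_class technique described above; treat it as an assumption-laden sketch, not a reference implementation (and, like eval above, pickle should never be fed untrusted data):

import base64
import pickle

from django.db import models

class PickleField(models.TextField):
    """Stores any picklable Python object as base64 text; unpickles on load."""

    def from_db_value(self, value, expression, connection):
        if value in (None, ''):
            return None
        return pickle.loads(base64.b64decode(value))

    def get_prep_value(self, value):
        if value is None:
            return None
        return base64.b64encode(pickle.dumps(value)).decode('ascii')


class Preference(models.Model):                 # illustrative model name
    data = PickleField(null=True, blank=True)   # e.g. {'theme': 'dark'}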
I'm not sure exactly sure of the nature of the problem you're trying to solve, but it sounds curiously similar to Google App Engine's BigTable Expando.
Expandos allow you to specify and store additional fields on a database-backed object instance at runtime. To quote from the docs:
import datetime
from google.appengine.ext import db

class Song(db.Expando):
    title = db.StringProperty()

crazy = Song(title='Crazy like a diamond',
             author='Lucy Sky',
             publish_date='yesterday',
             rating=5.0)

crazy.last_minute_note = db.Text('Get a train to the station.')
Google App Engine currently supports both Python and the Django framework. Might be worth looking into if this is the best way to express your models.
Traditional relational database models don't have this kind of column-addition flexibility. If your datatypes are simple enough, you could break from traditional RDBMS philosophy and hack values into a single column via serialization, as Ned Batchelder proposes; however, if you have to use an RDBMS, Django model inheritance is probably the way to go. Notably, it will create a one-to-one foreign-key relation for each level of derivation.
This question is old, but I was having the same problem, ended here and the chosen answer couldn't solve my problem anymore.
If you want to store dictionaries in Django or REST Api, either to be used as objects in your front end, or because your data won't necessarily have the same structure, the solution I used can help you.
When saving the data in your API, use json.dumps() so it is stored as a proper JSON string, as described in this question.
If you use this structure, your data will already be in the appropriate JSON format to be read in the front end with JSON.parse() in your AJAX (or whatever) call.
I use a textfield and json.loads()/json.dumps()
models.py
import json

from django.db import models

class Item(models.Model):
    data = models.TextField(blank=True, null=True, default='{}')

    def save(self, *args, **kwargs):
        ## load the current string and
        ## convert the string to a python dictionary
        data_dict = json.loads(self.data)

        ## do something with the dictionary
        ## (somethings/some_function are placeholders for your own logic)
        for something in somethings:
            data_dict[something] = some_function(something)

        ## if it is empty, save it back as a '{}' string;
        ## if it is not empty, convert the dictionary back to a json string
        if not data_dict:
            self.data = '{}'
        else:
            self.data = json.dumps(data_dict)

        super(Item, self).save(*args, **kwargs)
Django-Geo includes a "DictionaryField" you might find helpful:
http://code.google.com/p/django-geo/source/browse/trunk/fields.py?r=13#49
In general, if you don't need to query across the data use a denormalized approach to avoid extra queries. User settings are a pretty good example!
I agree that you should refrain from stuffing otherwise structured data into a single column. But if you must do it, Django has an XMLField built in.
There's also a JSONField at Django snippets.
Being "not equal to all instances of the model" sounds to me like a good match for a "Schema-free database". CouchDB is the poster child for that approach and you might consider that.
In a project I moved several tables which never played very nicely with the Django ORM over to CouchDB, and I'm quite happy with that. I use couchdb-python without any of the Django-specific CouchDB modules. A description of the data model can be found here. Moving from five Django "models" to three Django "models" and one CouchDB "database" actually slightly reduced the total lines of code in my application.
I know this is an old question, but today (2021) the cleanest alternative is to use the native JSONField (available since Django 3.1).
Docs: https://docs.djangoproject.com/en/3.2/ref/models/fields/#django.db.models.JSONField
You just add a models.JSONField to your model class and voilà.
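A minimal sketch (the model and field names are only illustrative):

from django.db import models

class Product(models.Model):
    attributes = models.JSONField(default=dict, blank=True)

# Plain dicts and lists go in and come back out, and the ORM can query into them:
# Product.objects.create(attributes={'color': 'red', 'size': 'M'})
# Product.objects.filter(attributes__color='red')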
Think it over, and find the commonalities of each data set... then define your model. It may or may not require the use of subclasses. Foreign keys representing commonalities aren't to be avoided, but encouraged when they make sense.
Stuffing random data into a SQL table is not smart, unless it's truly non-relational data. If that's the case, define your problem and we may be able to help.
If you are using Postgres, you can use an hstore field: https://docs.djangoproject.com/en/1.10/ref/contrib/postgres/fields/#hstorefield.
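A minimal sketch of that option (the model name is illustrative; it assumes django.contrib.postgres is in INSTALLED_APPS and the hstore extension is enabled in the database):

from django.contrib.postgres.fields import HStoreField
from django.db import models

class Profile(models.Model):
    settings = HStoreField(default=dict)

# hstore stores string-to-string mappings and supports key lookups in queries:
# Profile.objects.filter(settings__theme='dark')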