How to store a dictionary on a Django Model? - python

I need to store some data in a Django model. These data are not equal to all instances of the model.
At first I thought about subclassing the model, but I’m trying to keep the application flexible. If I use subclasses, I’ll need to create a whole class each time I need a new kind of object, and that’s no good. I’ll also end up with a lot of subclasses only to store a pair of extra fields.
I really feel that a dictionary would be the best approach, but there’s nothing in the Django documentation about storing a dictionary in a Django model (or I can’t find it).
Any clues?

If it's really dictionary like arbitrary data you're looking for you can probably use a two-level setup with one model that's a container and another model that's key-value pairs. You'd create an instance of the container, create each of the key-value instances, and associate the set of key-value instances with the container instance. Something like:
class Dicty(models.Model):
name = models.CharField(max_length=50)
class KeyVal(models.Model):
container = models.ForeignKey(Dicty, db_index=True)
key = models.CharField(max_length=240, db_index=True)
value = models.CharField(max_length=240, db_index=True)
It's not pretty, but it'll let you access/search the innards of the dictionary using the DB whereas a pickle/serialize solution will not.

Another clean and fast solution can be found here: https://github.com/bradjasper/django-jsonfield
For convenience I copied the simple instructions.
Install
pip install jsonfield
Usage
from django.db import models
from jsonfield import JSONField
class MyModel(models.Model):
json = JSONField()

If you don't need to query by any of this extra data, then you can store it as a serialized dictionary. Use repr to turn the dictionary into a string, and eval to turn the string back into a dictionary. Take care with eval that there's no user data in the dictionary, or use a safe_eval implementation.
For example, in the create and update methods of your views, you can add:
if isinstance(request.data, dict) == False:
req_data = request.data.dict().copy()
else:
req_data = request.data.copy()
dict_key = 'request_parameter_that_has_a_dict_inside'
if dict_key in req_data.keys() and isinstance(req_data[dict_key], dict):
req_data[dict_key] = repr(req_data[dict_key])

I came to this post by google's 4rth result to "django store object"
A little bit late, but django-picklefield looks like good solution to me.
Example from doc:
To use, just define a field in your model:
>>> from picklefield.fields import PickledObjectField
>>> class SomeObject(models.Model):
>>> args = PickledObjectField()
and assign whatever you like (as long as it's picklable) to the field:
>>> obj = SomeObject()
>>> obj.args = ['fancy', {'objects': 'inside'}]
>>> obj.save()

As Ned answered, you won't be able to query "some data" if you use the dictionary approach.
If you still need to store dictionaries then the best approach, by far, is the PickleField class documented in Marty Alchin's new book Pro Django. This method uses Python class properties to pickle/unpickle a python object, only on demand, that is stored in a model field.
The basics of this approach is to use django's contibute_to_class method to dynamically add a new field to your model and uses getattr/setattr to do the serializing on demand.
One of the few online examples I could find that is similar is this definition of a JSONField.

I'm not sure exactly sure of the nature of the problem you're trying to solve, but it sounds curiously similar to Google App Engine's BigTable Expando.
Expandos allow you to specify and store additional fields on an database-backed object instance at runtime. To quote from the docs:
import datetime
from google.appengine.ext import db
class Song(db.Expando):
title = db.StringProperty()
crazy = Song(title='Crazy like a diamond',
author='Lucy Sky',
publish_date='yesterday',
rating=5.0)
crazy.last_minute_note=db.Text('Get a train to the station.')
Google App Engine currently supports both Python and the Django framework. Might be worth looking into if this is the best way to express your models.
Traditional relational database models don't have this kind of column-addition flexibility. If your datatypes are simple enough you could break from traditional RDBMS philosophy and hack values into a single column via serialization as #Ned Batchelder proposes; however, if you have to use an RDBMS, Django model inheritance is probably the way to go. Notably, it will create a one-to-one foreign key relation for each level of derivation.

This question is old, but I was having the same problem, ended here and the chosen answer couldn't solve my problem anymore.
If you want to store dictionaries in Django or REST Api, either to be used as objects in your front end, or because your data won't necessarily have the same structure, the solution I used can help you.
When saving the data in your API, use json.dump() method to be able to store it in a proper json format, as described in this question.
If you use this structure, your data will already be in the appropriate json format to be called in the front end with JSON.parse() in your ajax (or whatever) call.

I use a textfield and json.loads()/json.dumps()
models.py
import json
from django.db import models
class Item(models.Model):
data = models.TextField(blank=True, null=True, default='{}')
def save(self, *args, **kwargs):
## load the current string and
## convert string to python dictionary
data_dict = json.loads(self.data)
## do something with the dictionary
for something in somethings:
data_dict[something] = some_function(something)
## if it is empty, save it back to a '{}' string,
## if it is not empty, convert the dictionary back to a json string
if not data_dict:
self.data = '{}'
else:
self.data = json.dumps(data_dict)
super(Item, self).save(*args, **kwargs)

Django-Geo includes a "DictionaryField" you might find helpful:
http://code.google.com/p/django-geo/source/browse/trunk/fields.py?r=13#49
In general, if you don't need to query across the data use a denormalized approach to avoid extra queries. User settings are a pretty good example!

I agree that you need to refrain stuffing otherwise structured data into a single column. But if you must do that, Django has an XMLField build-in.
There's also JSONField at Django snipplets.

Being "not equal to all instances of the model" sounds to me like a good match for a "Schema-free database". CouchDB is the poster child for that approach and you might consider that.
In a project I moved several tables which never played very nice with the Django ORM over to CouchDB and I'm quite happy with that. I use couchdb-python without any of the Django-specific CouchDB modules. A description of the data model can be found here. The movement from five "models" in Django to 3 "models" in Django and one CouchDB "database" actually slightly reduced the total lines of code in my application.

I know this is an old question, but today (2021) the cleanest alternative is to use the native JSONfield (since django 3.1)
docs: https://docs.djangoproject.com/en/3.2/ref/models/fields/#django.db.models.JSONField
you just create a field in the model called jsonfield inside the class model and voilá

Think it over, and find the commonalities of each data set... then define your model. It may require the use of subclasses or not. Foreign keys representing commonalities aren't to be avoided, but encouraged when they make sense.
Stuffing random data into a SQL table is not smart, unless it's truly non-relational data. If that's the case, define your problem and we may be able to help.

If you are using Postgres, you can use an hstore field: https://docs.djangoproject.com/en/1.10/ref/contrib/postgres/fields/#hstorefield.

Related

What is the best way to implement persistent data model in Django?

What I need is basically a database model with version control. So that every time a record is modified/deleted, the data isn't lost, and the change can be undone.
I've been trying to implement it myself with something like this:
from django.db import models
class AbstractPersistentModel(models.Model):
time_created = models.DateTimeField(auto_now_add=True)
time_changed = models.DateTimeField(null=True, default=None)
time_deleted = models.DateTimeField(null=True, default=None)
class Meta:
abstract = True
Then every model would inherit from AbstractPersistentModel.
Problem is, if I override save() and delete() to make sure they don't actually touch the existing data, I'll still be left with the original object, and not the new version.
After trying to come up with a clean, safe and easy-to-use solution for some hours, I gave up.
Is there some way to implement this functionality that isn't overwhelming?
It seems common enough problem that I thought it would be built into Django itself, or at least there'd be a well documented package for this, but I couldn't find any.
When I hear version control for models and Django, I immediately think of django-reversion.
Then, if you want to access the versions of an instance, and not the actual instance, simply use the Version model.
from reversion.models import Version
versions = Version.objects.get_for_object(instance)
I feel you can work around your issue not by modifying your models but by modifying the logic that access them.
So, you could have two models for your same object: one that can be your staging area, in which you store values as the ones you mention, such as time_created, time_modified, and modifying_user, or others. From there, in the code for your views you go through that table and select the records you want/need according to your design and store in your definitive table.

Passing JSON (dict) vs. Model Object to Templates in Django

Question
In Django, when using data from an API (that doesn't need to be saved to the database) in a view, is there reason to prefer one of the following:
Convert API data (json) to a json dictionary and pass to the template
Convert API data (json) to the appropriate model object from models.py and then pass that to the template
What I've Considered So Far
Performance: I timed both approaches and averaged them over 25 iterations. Converting the API response to a model object was slower by approximately 50ms (0.4117 vs. 0.4583 seconds, +11%). This did not include timing rendering.
Not saving this data to the database does prevent me from creating many-to-many relationships with the API's data (must save an object before adding M2M relationships), however, I want the API to act as the store for this data, not my app
DRY: If I find myself using this API data in multiple views, I may find convenience in putting all my consumption/cleaning/etc. code in the appropriate object __init__ in models.
Thanks very much in advance.
Converting this to a model objects doesn't require storing it in database.
Also if you are sure you don't want to store it maybe placing it in models.py and making it Django models is wrong idea. Probably it should be just normal Python classes e.g. in resources.py or something like that, not to mistake it with models. I prefer such way because maybe converting is slower (very tiny) but it allows to add not only custom constructor but others methods and properties as well which is very helpful. It also is just convenient and organizes stuff when you use normal classes and objects.
Pass the list of dictionaries directly to the template. When you need to use it with models, use .values() to get a list of dictionaries from it instead.

Following backreferences of unknown kinds in NDB

I'm in the process of writing my first RESTful web service atop GAE and the Python 2.7 runtime; I've started out using Guido's shiny new ndb API.
However, I'm unsure how to solve a particular case without the implicit back-reference feature of the original db API. If the user-agent requests a particular resource and those resources 1 degree removed:
host/api/kind/id?depth=2
What's the best way to discover a related collection of entities from the "one" in a one-to-many relationship, given that the kind of the related entity is unknown at development time?
I'm unable to use a replacement query as described in a previous SO inquiry due to the latter restriction. The fact that my model is definable at runtime (and therefore isn't hardcoded) prevents me from using a query to filter properties for matching keys.
Ancestor and other kindless queries are also out due to the datastore limitation that prevents me from filtering on a property without the kind specified.
Thus far, the only idea I've had (beyond reverting to the db api) is to use a cross-group transaction to write my own reference on the "one", either by updating an ndb.StringProperty(repeat=True) containing all the related kinds when an entity of a new kind is introduced or by simply maintaining a list of keys on the "one" ndb.KeyProperty(repeat=True) every time a related "many" entity is written to the datastore.
I'm hoping someone more experienced than myself can suggest a better approach.
Given jmort253's suggestion, I'll try to augment my question with a concrete example adapted from the docs:
class Contact(ndb.Expando):
""" The One """
# basic info
name = ndb.StringProperty()
birth_day = ndb.DateProperty()
# If I were using db, a collection called 'phone_numbers' would be implicitly
# created here. I could use this property to retrieve related phone numbers
# when this entity was queried. Since NDB lacks this feature, the service
# will neither have a reference to query nor the means to know the
# relationship exists in the first place since it cannot be hard-coded. The
# data model is extensible and user-defined at runtime; most relationships
# will be described only in the data, and must be discoverable by the server.
# In this case, when Contact is queried, I need a way to retrieve the
# collection of phone numbers.
# Company info.
company_title = ndb.StringProperty()
company_name = ndb.StringProperty()
company_description = ndb.StringProperty()
company_address = ndb.PostalAddressProperty()
class PhoneNumber(ndb.Expando):
""" The Many """
# no collection_name='phone_numbers' equivalent exists for the key property
contact = ndb.KeyProperty(kind='Contact')
number = ndb.PhoneNumberProperty()
Interesting question! So basically you want to look at the Contact class and find out if there is some other model class that has a KeyProperty referencing it; in this example PhoneNumber (but there could be many).
I think the solution is to ask your users to explicitly add this link when the PhoneNumber class is created.
You can make this easy for your users by giving them a subclass of KeyProperty that takes care of this; e.g.
class LinkedKeyProperty(ndb.KeyProperty):
def _fix_up(self, cls, code_name):
super(LinkedKeyProperty, self)._fix_up(cls, code_name)
modelclass = ndb.Model._kind_map[self._kind]
collection_name = '%s_ref_%s_to_%s' % (cls.__name__,
code_name,
modelclass.__name__)
setattr(modelclass, collection_name, (cls, self))
Exactly how you pick the name for the collection and the value to store there is up to you; just put something there that makes it easy for you to follow the link back. The example would create a new attribute on Contact:
Contact.PhoneNumber_ref_contact_to_Contact == (PhoneNumber, PhoneNumber.contact)
[edited to make the code working and to add an example. :-) ]
Sound like a good use case for ndb.StructuredProperty.

Querying through several models

I have a django project with 5 different models in it. All of them has date field. Let's say i want to get all entries from all models with today date. Of course, i could just filter every model, and put results in one big list, but i believe it's bad. What would be efficient way to do that?
I don't think that it's a bad idea to query each model separately - indeed, from a database perspective, I can't see how you'd be able to do otherwise, as each model will need a separate SQL query. Even if, as #Nagaraj suggests, you set up a common Date model every other model references, you'd still need to query each model separately. You are probably correct, however, that putting the results into a list is bad practice, unless you actually need to load every object into memory, as explained here:
Be warned, though, that [evaluating a QuerySet as a list] could have a large memory overhead, because Django will load each element of the list into memory. In contrast, iterating over a QuerySet will take advantage of your database to load data and instantiate objects only as you need them.
It's hard to suggest other options without knowing more about your use case. However, I think I'd probably approach this by making a list or dictionary of QuerySets, which I could then use in my view, e.g.:
querysets = [cls.objects.filter(date=now) for cls in [Model1, Model2, Model3]]
Take a look at using Multiple Inheritance (docs here) to define those date fields in a class that you can subclass in the classes you want to return in the query.
For example:
class DateStuff(db.Model):
date = db.DateProperty()
class MyClass1(DateStuff):
...
class MyClass2(DateStuff):
...
I believe Django will let you query over the DateStuff class, and it'll return objects from MyClass1 and MyClass2.
Thank #nrabinowitz for pointing out my previous error.

How do I remove an item from db.ListProperty?

In a Google App Engine solution (Python), I've used the db.ListProperty as a way to describe a many-to-many relation, like so:
class Department(db.Model):
name = db.StringProperty()
#property
def employees(self):
return Employee.all().filter('departments', self.key())
class Employee(db.Model):
name = db.StringProperty()
departments = db.ListProperty(db.Key)
I create many-to-many relations by simply appending the Department key to the db.ListProperty like so:
employee.departments.append(department.key())
The problem is that I don't know how to actually remove this relationship again, when it is no longer needed.
I've tried Googling it, but I can't seem to find any documentation that describes the db.ListProperty in details.
Any ideas or references?
The ListProperty is just a Python list with some helper methods to make it work with GAE, so anything that applies to a list applies to a ListProperty.
employee.departments.remove(department.key())
employee.put()
Keep in mind that the data must be deserialized/reserialized every time a change is made, so if you are looking for speed when adding or removing single values you may want to go with another method of modelling the relationship like the one in the Relationship Model section of this page.
The ListProperty method also has the disadvantage of sometimes producing very large indexes if you want to search through the lists in a datastore request.
This may not not be a problem for you since your Lists should be relatively small, but it's something to keep in mind for future projects.
Found it via trial and error:
employee.departments.remove(department.key())

Categories

Resources