Transparently storing Django model field as JSON data - python

Say I have an object, "Order," a field of which, "items," holds a list of order items. The list of items will never be searched or individually selected in the database so I just want to store it in a DB field as a JSON string.
I'm trying to figure out the best way to embed this functionality so it's fairly transparent to anyone using the model. I think saving the model is pretty easy - just override the save method and serialize the "items" list into an internal "_items" field, and then write that to the db. I'm confused about how to deserialize, though. Having looked into some kind of classmethod for creation, a custom manager, or something to do with signals, I've thoroughly confused myself. I'm sure this has been solved a hundred times over and I'm curious what people consider to be best practice.
Example classes:
class OrderItem():
    def __init__(self, desc="", qty=0):
        self.desc = desc
        self.qty = qty

class Order(Model):
    user = ForeignKey(User)
    _items = TextField()

    def save(self, *args, **kwargs):
        self._items = jsonpickle.encode(self.items)
        super(Order, self).save(*args, **kwargs)
Example usage:
order = Order()
order.items = [OrderItem("widget", 5)]
order.save()
This would create a record in the DB in which
_items = [{"desc":"widget", "qty":5}]
Now I want to be able to later select the object
order = Order.objects.get(id=whatever)
and have order.items be the unpacked array of items, not the stored JSON string.
EDIT:
The solution turned out to be quite simple, and I'm posting here in case it helps any other newbies. Based on Daniel's suggestion, I went with this custom model field:
class JSONField(with_metaclass(SubfieldBase, TextField)):
    def db_type(self, connection):
        return 'JSONField'

    def to_python(self, value):
        if isinstance(value, basestring):
            return jsonpickle.decode(value)
        else:
            return value

    def get_prep_value(self, value):
        return jsonpickle.encode(value)

A much better approach is to subclass TextField and override the relevant methods to do the serialization/deserialization transparently as required. In fact there are a number of implementations of this already: here's one, for example.
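As a rough sketch of the two conversions such a field performs, the helpers below use the standard json module rather than jsonpickle (a substitution made here only to keep the example free of dependencies). The names items_to_json and items_from_json are hypothetical; in a custom field they would play the roles of get_prep_value and to_python.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class OrderItem:
    desc: str = ""
    qty: int = 0


def items_to_json(items):
    """Serialize a list of OrderItem objects to a JSON string (the DB side)."""
    return json.dumps([asdict(item) for item in items])


def items_from_json(value):
    """Rebuild OrderItem objects from a JSON string; pass other values through."""
    if isinstance(value, str):
        return [OrderItem(**data) for data in json.loads(value)]
    return value


stored = items_to_json([OrderItem("widget", 5)])
restored = items_from_json(stored)
```

Passing non-string values through unchanged mirrors the isinstance check in the question's to_python, which may be handed either the raw database string or an already decoded list.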

Related

How to store a complex number in Django model

I need to store a complex number in a Django model. For those who forget, that simply means Z=R+jX where R and X are real numbers representing the real and imaginary components of the complex. There will be individual numbers, as well as lists that need to be stored. My searches so far haven't provided a good solution for lists, so I intend to let the database handle the list as individual records.
I see two options for storing a complex number:
1) create a custom field: class Complex(models.CharField)
This would allow me to customize all aspects of the field, but that is a lot of extra work for validation if it is to be done properly. The major upside is that a single number is represented by a single field in the table.
2) let each complex number be represented by a row, with a float field for the real part, R, and another float field for the imaginary part, X. The downside to this approach is that I would need to write some converters that will create a complex number from the components, and vice versa. The upside is that the database will just see it as another record.
Surely this issue has been resolved in the past, but I can't find any good references, never mind one particular to Django.
This is my first crack at the field, it is based on another example I found that involved a few string manipulations. What isn't clear to me is how and where various validations should be performed (such as coercing a simple float into a complex number by adding +0j). I intend to add form functionality as well, so that the field behaves like a float field, but with additional restrictions or requirements.
I have not tested this code yet, so there may be issues with it. It is based on the code from an answer in this SO question. It appears after running the code that some changes took place in method names.
What is the most efficient way to store a list in the Django models?
class ComplexField(models.CharField):
    description = 'A complex number represented as a string'

    def __init__(self, *args, **kwargs):
        kwargs['verbose_name'] = 'Complex Number'
        kwargs['max_length'] = 64
        kwargs['default'] = '0+0j'
        super().__init__(*args, **kwargs)

    def to_python(self, value):
        if not value:
            return
        if isinstance(value, complex):
            return value
        return complex(value)

    def get_prep_value(self, value):
        # get_db_prep_value also requires a connection argument, so
        # get_prep_value is overridden here instead.
        if not value:
            return
        assert isinstance(value, complex)
        # str() wraps most complex numbers in parentheses; strip them for storage.
        return str(value).strip('()')

    def value_to_string(self, obj):
        value = self.value_from_object(obj)
        return self.get_prep_value(value)
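The coercion the question asks about (for example, turning a plain float into a complex number) could be sketched as a small helper that does not depend on Django. The names to_complex and to_db_string are hypothetical, and the choice to strip spaces before parsing is an assumption about which inputs should be tolerated.

```python
def to_complex(value):
    """Coerce None, numbers, or strings such as '1+2j' into a complex number."""
    if value is None:
        return None
    if isinstance(value, complex):
        return value
    if isinstance(value, (int, float)):
        return complex(value, 0.0)
    # complex() accepts '1+2j' and '(1+2j)' but rejects spaces around the sign.
    return complex(str(value).replace(" ", ""))


def to_db_string(value):
    """Render a complex number without the parentheses str() may add."""
    number = to_complex(value)
    if number is None:
        return None
    return str(number).strip("()")
```

A field's to_python and get_prep_value could delegate to helpers like these, which keeps the parsing rules testable without a database.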
Regarding custom fields, you've probably found the relevant part in the Django documentation already.
Whether a custom field (or a custom database type, see below) is worth the trouble really depends on what you need to do with the stored numbers. For storage and some occasional pushing around, you can go with the easiest sane solution (your number two as enhanced by Tobit).
With PostgreSQL, you have the possibility to implement custom types directly in the database, including operators. Here's the relevant part in the Postgres docs, complete with a complex numbers example, no less.
Of course you then need to expose the new type and the operators to Django. Quite a bit of work, but then you could do arithmetic on individual fields right in the database using the Django ORM.
If your expression always has the form R + jX, you can create the following class:
class ComplexNumber(models.Model):
    real_number = models.FloatField('Real number part')
    img_number = models.FloatField('Img number part')

    def __str__(self):
        # __str__ must return a string, so wrap the complex value in str()
        return str(complex(self.real_number, self.img_number))
and handle the resulting string with Python (see here).
If you have multiple real and imaginary parts, you can handle this with foreign keys or ManyToMany fields, depending on your needs.
To be honest, I'd just split the complex number into two float/decimal fields and add a property for reading and writing as a single complex number.
I came up with this custom field that ends up as a split field on the actual model and injects the aforementioned property too.
contribute_to_class is called deep in the Django model machinery for all the fields that are declared on the model. Generally, they might just add the field itself to the model, and maybe additional methods like get_latest_by_..., but here we're hijacking that mechanism to instead add two fields we construct within, and not the actual "self" field itself at all, as it does not need to exist as a database column. (This might break something, who knows...) Some of this mechanism is explained here in the Django wiki.
The ComplexProperty class is a property descriptor, which lets you customize what happens when the attribute it is attached to is read or written on an instance. (How descriptors work is a little bit beyond the scope of this answer, but there's a how-to guide in the Python docs.)
NB: I did not test this beyond running migrations, so things may be broken in unexpected ways, but at least the theory is sound. :)
from django.db import models

class ComplexField(models.Field):
    def __init__(self, **kwargs):
        self.field_class = kwargs.pop('field_class', models.FloatField)
        self.field_kwargs = kwargs.pop('field_kwargs', {})
        super().__init__(**kwargs)

    def contribute_to_class(self, cls, name, private_only=False):
        for field in (
            self.field_class(name=name + '_real', **self.field_kwargs),
            self.field_class(name=name + '_imag', **self.field_kwargs),
        ):
            field.contribute_to_class(cls, field.name)
        setattr(cls, name, ComplexProperty(name))

class ComplexProperty:
    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner):
        if not instance:
            return self
        real = getattr(instance, self.name + '_real')
        imag = getattr(instance, self.name + '_imag')
        return complex(real, imag)

    def __set__(self, instance, value: complex):
        setattr(instance, self.name + '_real', value.real)
        setattr(instance, self.name + '_imag', value.imag)

class Test(models.Model):
    num1 = ComplexField()
    num2 = ComplexField()
    num3 = ComplexField()
The migration for this looks like
migrations.CreateModel(
    name="Test",
    fields=[
        (
            "id",
            models.AutoField(
                auto_created=True, primary_key=True, serialize=False, verbose_name="ID"
            ),
        ),
        ("num1_real", models.FloatField()),
        ("num1_imag", models.FloatField()),
        ("num2_real", models.FloatField()),
        ("num2_imag", models.FloatField()),
        ("num3_real", models.FloatField()),
        ("num3_imag", models.FloatField()),
    ],
)
so as you can see, the three ComplexFields are broken down into six FloatFields.
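The descriptor part of this answer can be tried without Django at all. The sketch below reuses the ComplexProperty idea on a plain class; Impedance is a hypothetical stand-in for a model, and coercing the assigned value with complex() is an addition made here so that plain ints and floats are accepted too.

```python
class ComplexProperty:
    """Descriptor that presents two float attributes as one complex value."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not on an instance
        return complex(
            getattr(instance, self.name + "_real"),
            getattr(instance, self.name + "_imag"),
        )

    def __set__(self, instance, value):
        value = complex(value)  # also accepts plain ints and floats
        setattr(instance, self.name + "_real", value.real)
        setattr(instance, self.name + "_imag", value.imag)


class Impedance:
    # Hypothetical plain class standing in for a Django model.
    z = ComplexProperty("z")

    def __init__(self, z=0j):
        self.z = z


imp = Impedance(3 + 4j)
```

Reading imp.z rebuilds the complex number from imp.z_real and imp.z_imag, which is what happens with the two generated FloatFields on the real model.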

How are ModelFields assigned in Django Models?

When we define a model in django we write something like..
class Student(models.Model):
    name = models.CharField(max_length=64)
    age = models.IntegerField()
    ...
where name = models.CharField() implies that name would be an object of models.CharField. When we have to make a Student object, we simply do:
my_name = "John Doe"
my_age = 18
s = Student.objects.create(name=my_name, age=my_age)
where my_name and my_age are string and integer data types respectively, and not an object of models.CharField/models.IntegerField. Although while assigning the values the respective validations are performed (like checking on the max_length for CharField)
I'm trying to build similar models for an abstraction of Neo4j over Django but not able to get this workflow. How can I implement this ?
Found a similar question but didn't find it helpful enough.
How things work
The first thing to understand is that each field on your model has its own validation. For CharField this is _check_max_length_attribute, and it also calls super() on the check method of the Field class to validate some basic common things.
With that in mind, we move on to the create method, which is much more complicated and a totally different thing. The basic operations for a specific object are:
Create a Python object
Call save()
save() performs a lot of validation, using many getattr calls
Commit to the DB; if anything goes wrong in the DB, raise it to the user
The third thing to understand is that when you query an object, Django first gets the data from the DB and then (after a long process) sets the data on the object.
Simple Example
class BasicCharField:
    def __init__(self, max_len):
        self.max_len = max_len

    def validate(self, value):
        if value > self.max_len:
            raise ValueError('the value must be lower than {}'.format(self.max_len))

class BasicModel:
    score = BasicCharField(max_len=4)

    @staticmethod
    def create(**kwargs):
        obj = BasicModel()
        obj.score = kwargs['score']
        obj.save()
        return obj

    def save(self):
        # Lots of validations here
        BasicModel.score.validate(self.score)
        # DB commit here

BasicModel.create(score=5)
And, as expected:
>>> ValueError: the value must be lower than 4
Obviously I had to simplify things to fit them into a few lines of code. You can improve this a lot (for example, by iterating over the attributes instead of hardcoding them as in obj.score = ...).
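The improvement suggested above, iterating over the declared fields instead of hardcoding them, might look like the following sketch. BasicIntField, declared_fields, and the second field level are hypothetical names introduced for illustration; this is not how Django itself collects fields.

```python
class BasicIntField:
    """Hypothetical field that only checks an upper bound."""

    def __init__(self, max_value):
        self.max_value = max_value

    def validate(self, name, value):
        if value > self.max_value:
            raise ValueError("{} must not exceed {}".format(name, self.max_value))


class BasicModel:
    score = BasicIntField(max_value=4)
    level = BasicIntField(max_value=10)

    @classmethod
    def declared_fields(cls):
        """Collect every field object declared directly on the class."""
        return {
            name: attr
            for name, attr in vars(cls).items()
            if isinstance(attr, BasicIntField)
        }

    @classmethod
    def create(cls, **kwargs):
        obj = cls()
        for name in cls.declared_fields():
            # The instance attribute shadows the field object on the class.
            setattr(obj, name, kwargs[name])
        obj.save()
        return obj

    def save(self):
        for name, field in self.declared_fields().items():
            field.validate(name, getattr(self, name))
        # A real implementation would commit to the database here.
```

The field objects stay on the class while plain values live on each instance, which is the split the question was asking about.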

Dynamically add properties to a django model

I have a Django model where a lot of fields are choices. So I had to write a lot of "is_something" properties of the class to check whether the instance value is equal to some choice value. Something along the lines of:
class MyModel(models.Model):
    some_choicefield = models.IntegerField(choices=SOME_CHOICES)

    @property
    def is_some_value(self):
        return self.some_choicefield == SOME_CHOICES.SOME_CHOICE_VALUE

    # a lot of these...
In order to automate this and spare me a lot of redundant code, I thought about patching the instance at creation, with a function that adds a bunch of methods that do the checks.
The code became as follows (I'm assuming there's a "normalize" function that makes the label of the choice a usable function name):
def dynamic_add_checks(instance, field):
    if hasattr(field, 'choices'):
        choices = getattr(field, 'choices')
        for (value, label) in choices:
            def fun(instance):
                return getattr(instance, field.name) == value
            normalized_func_name = "is_%s_%s" % (field.name, normalize(label))
            setattr(instance, normalized_func_name, fun(instance))

class MyModel(models.Model):
    def __init__(self, *args, **kwargs):
        super(MyModel, self).__init__(*args, **kwargs)
        dynamic_add_checks(self, self._meta.get_field('some_choicefield'))

    some_choicefield = models.IntegerField(choices=SOME_CHOICES)
Now, this works but I have the feeling there is a better way to do it. Perhaps at class creation time (with metaclasses or in the __new__ method)? Do you have any thoughts/suggestions about that?
Well I am not sure how to do this in your way, but in such cases I think the way to go is to simply create a new model, where you keep your choices, and change the field to ForeignKey. This is simpler to code and manage.
You can find a lot of information at a basic level in Django docs: Models: Relationships. In there, there are many links to follow expanding on various topics. Beyond that, I believe it just needs a bit of imagination, and maybe trial and error in the beginning.
I came across a similar problem where I needed to write large number of properties at runtime to provide backward compatibility while changing model fields. There are 2 standard ways to handle this -
First is to use a custom metaclass in your models, which inherits from the models' default metaclass.
Second is to use class decorators. Class decorators sometimes provide an easy alternative to metaclasses, unless you have to do something before the class is created, in which case you have to go with metaclasses.
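A minimal sketch of the class-decorator route, shown on a plain class so it runs without Django. The decorator name add_choice_checks, the normalize helper, and the Article class are all hypothetical. The properties are attached once to the class rather than to every instance.

```python
import re


def normalize(label):
    """Hypothetical helper: turn a choice label into an identifier fragment."""
    return re.sub(r"\W+", "_", label.strip().lower())


def add_choice_checks(field_name, choices):
    """Class decorator adding an is_<field>_<label> property per choice."""

    def decorator(cls):
        for value, label in choices:
            # Bind value as a default argument so each property keeps its own.
            def check(self, _value=value):
                return getattr(self, field_name) == _value

            prop_name = "is_%s_%s" % (field_name, normalize(label))
            setattr(cls, prop_name, property(check))
        return cls

    return decorator


STATUS_CHOICES = [(1, "Draft"), (2, "In review"), (3, "Published")]


@add_choice_checks("status", STATUS_CHOICES)
class Article:
    # Plain class standing in for a model; status would be an IntegerField.
    def __init__(self, status):
        self.status = status
```

Because the checks are properties evaluated on access, they stay correct when the field value changes later, unlike values computed once in __init__.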
I bet you know Django fields with choices provided will automatically have a display function.
Say you have a field defined like this:
category = models.SmallIntegerField(choices=CHOICES)
You can simply call a function called get_category_display() to access the display value. Here is the Django source code of this feature:
https://github.com/django/django/blob/baff4dd37dabfef1ff939513fa45124382b57bf8/django/db/models/base.py#L962
https://github.com/django/django/blob/baff4dd37dabfef1ff939513fa45124382b57bf8/django/db/models/fields/__init__.py#L704
So we can follow this approach to achieve our dynamically set property goal.
Here is my scenario, a little bit different from yours but down to the end it's the same:
I have two classes, Course and Lesson. Lesson has a ForeignKey field to Course, and I want to add a property named cached_course to Lesson which tries to get the Course from cache first and falls back to the database on a cache miss:
Here is a typical solution:
from django.core.cache import cache
from django.db import models

class Course(models.Model):
    # some fields
    pass

class Lesson(models.Model):
    course = models.ForeignKey(Course)

    @property
    def cached_course(self):
        # key_func() and get_model_from_db() are placeholders
        key = key_func()
        course = cache.get(key)
        if not course:
            course = get_model_from_db()
            cache.set(key, course)
        return course
It turns out I have many ForeignKey fields to cache, so here is code that follows an approach similar to Django's get_FIELD_display feature:
from django.db import models
from django.utils.functional import curry

class CachedForeignKeyField(models.ForeignKey):
    def contribute_to_class(self, cls, name, **kwargs):
        super(CachedForeignKeyField, self).contribute_to_class(cls, name, **kwargs)
        setattr(cls, "cached_%s" % self.name,
                property(curry(cls._cached_FIELD, field=self)))

class BaseModel(models.Model):
    def _cached_FIELD(self, field):
        value = getattr(self, field.attname)
        Model = field.related_model
        return cache.get_model(Model, pk=value)

    class Meta:
        abstract = True

class Course(BaseModel):
    # some fields
    pass

class Lesson(BaseModel):
    course = CachedForeignKeyField(Course)
By customizing CachedForeignKeyField and overriding its contribute_to_class method, together with a BaseModel class that provides a _cached_FIELD method, every CachedForeignKeyField automatically gets a matching cached_FIELD property.
Too good to be true, bravo!
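On Django versions where django.utils.functional.curry is not available, functools.partial from the standard library can bind the extra argument in the same way. The sketch below shows only that binding step on plain classes; Registry, add_cached_property, and the string model name are hypothetical stand-ins for the cache and the field machinery in the answer.

```python
from functools import partial


class Registry:
    """Hypothetical stand-in for a cache that resolves ids to objects."""

    data = {("Course", 1): "Intro to Django"}

    @classmethod
    def get_model(cls, model_name, pk):
        return cls.data.get((model_name, pk))


def add_cached_property(cls, field_name, model_name):
    """Attach cached_<field_name>, mirroring the contribute_to_class step."""
    getter = partial(
        cls._cached_FIELD, field_name=field_name, model_name=model_name
    )
    # property() calls getter(instance); partial supplies the keyword arguments.
    setattr(cls, "cached_%s" % field_name, property(getter))


class Lesson:
    def __init__(self, course_id):
        self.course_id = course_id

    def _cached_FIELD(self, field_name, model_name):
        pk = getattr(self, field_name + "_id")
        return Registry.get_model(model_name, pk=pk)


add_cached_property(Lesson, "course", "Course")
```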

Keeping track of changes since the last save in django models

A couple of times I've run into a situation, when at save time I need to know which model fields are going to be updated and act accordingly.
The most obvious solution to this is to take the primary key field and retrieve a copy of the model from the database:
class MyModel(models.Model):
    def save(self, force_insert=False, force_update=False, using=None):
        if self.id is not None:
            unsaved_copy = MyModel.objects.get(id=self.id)
            # Do your comparisons here
        super(MyModel, self).save(force_insert, force_update, using)
That works perfectly fine, however, it hits the database for every instance of the model you are saving (might be quite inconvenient if you are doing a lot of such saves).
It is obvious, that if one can "remember" the old field values at the start of model instance's lifetime (__init__), there should be no need to retrieve a copy of the model from the database. So I came up with this little hack:
class MyModel(models.Model):
    def __init__(self, *args, **kwargs):
        super(MyModel, self).__init__(*args, **kwargs)
        self.unsaved = {}
        for field in self._meta.fields:
            self.unsaved[field.name] = getattr(self, field.name, None)

    def save(self, force_insert=False, force_update=False, using=None):
        for name, value in self.unsaved.items():
            print("Field:%s Old:%s New:%s" % (name, value, getattr(self, name, None)))
        # old values can be accessed through the self.unsaved member
        super(MyModel, self).save(force_insert, force_update, using)
This seems to work, however it makes use of the non-public interface of django.db.models.Model.
Perhaps someone knows a cleaner way to do it?
I think your solution looks reasonable.
Alternatively you could have a Manager method called get_and_copy() (or something) that hung a copy of the original object off of what is returned. You could then use another Manager method, save_and_check() which took advantage of the copied original.
FWIW: If you are playing with contrib/admin templates there is a context variable called original which is a copy of the original object.
Update: I looked more closely at what admin is doing. In class ModelAdmin (located in django/contrib/admin/options.py) there is a method called construct_change_message(). It is being driven by formset.changed_data and formset.changed_objects, so django/forms/models.py class BaseModelFormSet is where the action is. See the method save_existing_objects(). Also look at the method _existing_object(). It's a little more complicated than what I mentioned before because they are dealing with the possibility of multiple objects, but they are basically caching the results of the query set on first access.
This will not work for fixtures: the loaddata command uses models.Model.save_base. Probably the cleanest method would be to use descriptors for fields, but one has to figure out how to insert them properly.
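The "remember the old values" idea from the question can be sketched without Django as follows. TrackedRecord, tracked_fields, and changed_fields are hypothetical names. One detail the original hack leaves out is refreshing the snapshot after a save, which is included here.

```python
class TrackedRecord:
    """Hypothetical stand-in for a model that remembers its loaded values."""

    tracked_fields = ("name", "email")

    def __init__(self, name, email):
        self.name = name
        self.email = email
        self._remember()

    def _remember(self):
        # Snapshot the current values so later edits can be compared.
        self._original = {f: getattr(self, f) for f in self.tracked_fields}

    def changed_fields(self):
        """Return {field: (old, new)} for each field edited since the snapshot."""
        return {
            f: (old, getattr(self, f))
            for f, old in self._original.items()
            if getattr(self, f) != old
        }

    def save(self):
        changes = self.changed_fields()
        # A real save() would write to the database here.
        self._remember()
        return changes
```

Comparing against an in-memory snapshot avoids the extra query per save, at the cost of missing changes made to the row by other processes in the meantime.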

Passing kwargs from template to view?

As you may be able to tell from my questions, I'm new to both python and django. I would like to allow dynamic filter specifications of query sets from my templates using **kwargs. I'm thinking like a select box of a bunch of kwargs. For example:
<select id="filter">
<option value="physician__isnull=True">Unassigned patients</option>
</select>
Does django provide an elegant solution to this problem that I haven't come across yet?
I'm trying to solve this in a generic manner since I need to pass this filter to other views. For example, I need to pass a filter to a paginated patient list view, so the pagination knows what items it's working with. Another example is this filter would have to be passed to a patient detail page so you can iterate through the filtered list of patients with prev/next links.
Thanks a bunch, Pete
Update:
What I came up with was building a FilterSpecification class:
class FilterSpec(object):
    def __init__(self, name, *args):
        super(FilterSpec, self).__init__()
        self.name = name
        self.filters = []
        for filter in args:
            self.add(filter)

    def pickle(self):
        return encrypt(pickle.dumps(self))

    def add(self, f):
        self.filters.append(f)

    def kwargs(self):
        kwargs = {}
        for f in self.filters:
            kwargs = f.kwarg(**kwargs)
        return kwargs

    def __unicode__(self):
        return self.name

class Filter(object):
    def __init__(self, key, value):
        super(Filter, self).__init__()
        self.filter_key = key
        self.filter_value = value

    def kwarg(self, **kwargs):
        if self.filter_key is not None:
            kwargs[self.filter_key] = self.filter_value
        return kwargs
I then can filter any type of model like this:
filterSpec = FilterSpec('Assigned', Filter('service__isnull', False))
patients = Patient.objects.filter(**filterSpec.kwargs())
I pass these filterSpec objects from the client to server by serializing, compressing, applying some symmetric encryption, and url-safe base-64 encoding. The only downside is that you end up with URLs looking like this:
http://127.0.0.1:8000/hospitalists/assign_test/?filter=eJwBHQHi_iDiTrccFpHA4It7zvtNIW5nUdRAxdiT-cZStYhy0PHezZH2Q7zmJB-NGAdYY4Q60Tr_gT_Jjy_bXfB6iR8inrNOVkXKVvLz3SCVrCktGc4thePSNAKoBtJHkcuoaf9YJA5q9f_1i6uh45-6k7ZyXntRu5CVEsm0n1u5T1vdMwMnaNA8QzYk4ecsxJRSy6SMbUHIGhDiwHHj1UnQaOWtCSJEt2zVxaurMuCRFT2bOKlj5nHfXCBTUCh4u3aqZZjmSd2CGMXZ8Pn3QGBppWhZQZFztP_1qKJaqSVeTNnDWpehbMvqabpivtnFTxwszJQw9BMcCBNTpvJf3jUGarw_dJ89VX12LuxALsketkPbYhXzXNxTK1PiZBYqGfBbioaYkjo%3D
I would love to get some comments on this approach and hear other solutions.
Rather than face the horrible dangers of SQL injection, why not just assign a value to each select option and have your form-handling view run the selected query based on the value?
Passing the parameters for a DB query from page to view is just asking for disaster. Django is built to avoid this sort of thing.
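One way to read this suggestion is as a server-side whitelist: the template sends only a short key, and the view maps it to filter kwargs it already trusts. The dictionary contents, the function name filter_kwargs_for, and the default choice below are assumptions made for illustration.

```python
# Hypothetical whitelist: option value sent by the form -> ORM filter kwargs.
PATIENT_FILTERS = {
    "unassigned": {"physician__isnull": True},
    "assigned": {"physician__isnull": False},
}


def filter_kwargs_for(option, default="unassigned"):
    """Translate an untrusted option value into trusted filter kwargs."""
    key = option if option in PATIENT_FILTERS else default
    # Return a copy so callers cannot mutate the shared whitelist.
    return dict(PATIENT_FILTERS[key])
```

In a view this might be used as Patient.objects.filter(**filter_kwargs_for(request.GET.get("filter"))), and the same short key can be passed along to the paginated list and detail views.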
Concerning your update: FilterSpecs are unfortunately one of those (rare) pieces of Django that lack public documentation. As such, there is no guarantee that they will keep working as they do.
Another approach would be to use Alex Gaynor's django-filter, which looks really well thought out. I'll be using it for my next project.
