Django Blob Model Field - python

How do you store a "blob" of binary data using Django's ORM, with a PostgreSQL backend? Yes, I know Django frowns upon that sort of thing, and yes, I know they prefer you use the ImageField or FileField for that, but suffice it to say, that's impractical for my application.
I've tried hacking it by using a TextField, but I get occasional errors when my binary data doesn't strictly conform to the model's encoding type, which is unicode by default, e.g.:
psycopg2.DataError: invalid byte sequence for encoding "UTF8": 0xe22665
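The failure can be reproduced without a database. A minimal sketch, assuming the three bytes in the message arrived in that order; the helper name `is_valid_utf8` is made up for illustration:

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The byte sequence from the error message: 0xe2 announces a three-byte
# character, but 0x26 and 0x65 are not valid continuation bytes.
bad = bytes([0xE2, 0x26, 0x65])
print(is_valid_utf8(bad))
print(is_valid_utf8("blob".encode("utf-8")))
```

Arbitrary binary data will hit sequences like this sooner or later, which is why a text column only works some of the time.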

If you're using Django >= 1.6, there's a BinaryField.

Is this snippet any good?
http://djangosnippets.org/snippets/1597/
This is possibly the simplest solution for storing binary data in a TextField.
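import base64
from django.db import models
class Foo(models.Model):
    _data = models.TextField(
        db_column='data',
        blank=True)
    def set_data(self, data):
        # base64.encodestring()/decodestring() were removed in Python 3.9.
        # b64encode() returns bytes, so decode to ASCII text for the TextField.
        self._data = base64.b64encode(data).decode('ascii')
    def get_data(self):
        return base64.b64decode(self._data)
    data = property(get_data, set_data)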
There's a couple of other snippets there that might help.
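To see why the base64 detour avoids the encoding error, here is a minimal sketch of the same property pattern without Django. `TextBackedBlob` is a hypothetical stand-in for the model, and its `_data` attribute plays the role of the TextField:

```python
import base64


class TextBackedBlob:
    """Hypothetical stand-in for the model above; `_data` plays the TextField."""

    def __init__(self):
        self._data = ""

    @property
    def data(self) -> bytes:
        # Decode the stored ASCII text back into the original bytes.
        return base64.b64decode(self._data)

    @data.setter
    def data(self, value: bytes) -> None:
        # base64 output uses only ASCII characters, so it is always valid text.
        self._data = base64.b64encode(value).decode("ascii")


blob = TextBackedBlob()
blob.data = bytes([0xE2, 0x26, 0x65])  # the bytes PostgreSQL rejected as UTF-8
```

The trade-off is size: base64 text is roughly a third larger than the binary data it encodes.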

I have been using this simple field with the MySQL backend; you can modify it for other backends:
class BlobField(models.Field):
    description = "Blob"
    def db_type(self, connection):
        # MySQL column type; PostgreSQL's equivalent is 'bytea'
        return 'blob'
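One way to "modify it for other backends" is to pick the column type from the backend name. A rough sketch, assuming the caller passes Django's `connection.vendor` string; the names `BLOB_COLUMN_TYPES` and `blob_db_type` are made up here:

```python
# Column types for raw binary data, keyed by the backend's vendor string.
BLOB_COLUMN_TYPES = {
    "mysql": "blob",
    "postgresql": "bytea",
    "sqlite": "BLOB",
    "oracle": "BLOB",
}


def blob_db_type(vendor: str) -> str:
    """Return the binary column type for a backend, or fail loudly if unknown."""
    try:
        return BLOB_COLUMN_TYPES[vendor]
    except KeyError:
        raise NotImplementedError("no blob column type known for %r" % vendor)
```

Inside the field, `db_type` could then return `blob_db_type(connection.vendor)`. Note that a plain MySQL `blob` column tops out at 65,535 bytes, so larger payloads would need `longblob` instead.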

Also, check out Django Storages' Database Storage.
I haven't used it yet, but it looks awesome and I'm going to start using it as soon as I Post My Answer.

Related

Django: How to decide between class-based and function-based custom validators?

This is a beginner question. I'm working on a website that allows users to upload a video to a project model (via ModelForm) and I want to validate this file correctly. I originally declared the field in this way:
from django.db import models
from django.core.validators import FileExtensionValidator
def user_directory_path(instance, filename):
    """
    Code to return the path
    """
class Project(models.Model):
    """
    """
    # ...Some model fields...
    # Right now I'm only validating and testing with .mp4 files.
    video_file = models.FileField(
        upload_to=user_directory_path,
        validators=[FileExtensionValidator(allowed_extensions=['mp4'])]
    )
But I read in several places that it was better to use libmagic to check the file's magic numbers and make sure its contents match the extension and the MIME type. I'm very new to this, so I might get some things wrong.
I followed the validators reference to write a custom validator that uses magic. The documentation also talks about "a class with a __call__() method," and the most upvoted answer here uses a class-based validator. The documentation says that this can be done "for more complex or configurable validators," but I haven't understood what a specific example of that would be, or whether my function-based validator is enough for what I'm trying to do. I think it is, but I don't have the experience to be sure.
This is what I have.
models.py
from django.db import models
from .validators import validate_media_file
def user_directory_path(instance, filename):
    """
    Code to return the path
    """
class Project(models.Model):
    """
    """
    # ...Some model fields...
    # Right now I'm only validating and testing with .mp4 files.
    video_file = models.FileField(
        upload_to=user_directory_path,
        validators=[validate_media_file]
    )
validators.py (basically taken from the example in the documentation)
import os
import magic
from django.core.exceptions import ValidationError
from django.utils.translation import gettext_lazy as _
def validate_media_file(value):
    """
    """
    # Upper case to then see if it's in magic.from_buffer() output
    file_extension = os.path.splitext(value.name)[1].upper()[1:]
    # A list because later I will validate other formats
    if file_extension not in ['MP4']:
        raise ValidationError(
            _('File %(value)s does not contain a valid extension'),
            params={'value': value},
        )
    elif file_extension not in magic.from_buffer(value.read()):
        raise ValidationError(
            _(<appropriate error message>),
            params={'value': value},
        )
The migration worked with this. I also tested it with a plain text file with a .mp4 extension and then with a different file (and extension) and it works. However, I'd like to know if I'm missing something by using this instead of a class-based validator, and also, as the title says, when should I use one because I might come across another situation in which I'd need to know it.
I know I haven't included the MIME type; I can do it later.
As an additional question, what would be an appropriate error message for when the output from magic.from_buffer() does not match the extension and/or MIME type? I thought about something saying the "file is corrupt," but I'm not sure. Actually, is this the output that's directly based on the magic numbers?
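On that last point: libmagic identifies a file by matching bytes near its start against a database of known signatures, so its output does come from the content rather than the name. A minimal sketch of the idea for MP4 only, assuming the usual layout where bytes 4 to 8 of the file spell `ftyp`; `looks_like_mp4` is a made-up helper, not part of python-magic:

```python
def looks_like_mp4(header: bytes) -> bool:
    """Check for the `ftyp` box type that normally follows the 4-byte box size."""
    return len(header) >= 8 and header[4:8] == b"ftyp"


real_header = b"\x00\x00\x00\x18ftypmp42"  # box size, box type, major brand
renamed_text = b"just some text saved as video.mp4"

print(looks_like_mp4(real_header))
print(looks_like_mp4(renamed_text))
```

Seen this way, a mismatch does not necessarily mean the file is corrupt; it means the content is not the type the extension claims, which may be the more accurate thing to tell the user.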
When to use class based validator?
In your example a function based validator is sufficient. If you ever need the advantages of OOP, classes and objects, then you should switch to a class based validator.
Imagine the following very fictional source code:
from django.core.exceptions import ValidationError
class StartsWithValidator():
    def __init__(self, starts_with):
        self.starts_with = starts_with
    def __call__(self, value):
        if not str(value).startswith(self.starts_with):
            raise ValidationError(
                'Your string does not start with: {}!'.format(self.starts_with),
                params={'value': value}
            )
my_validator = StartsWithValidator('123')
test_string = '123OneTwoThree'
my_validator(test_string) # Will it pass the validator?
You can see here different qualities:
With a class-based validator, you can use objects. Objects share the same functionality but carry different inner state. You can now set up validators that check whether the string starts with 'abc', '123', or whatever else, without writing new code:
starts_with_abc = StartsWithValidator('abc')
starts_with_123 = StartsWithValidator('123')
starts_with_whatever = StartsWithValidator('whatever')
You can use inheritance. If you want to reuse the starts-with validation along with other features, you simply inherit from the `StartsWithValidator` class.
class StartsWithABCValidator(StartsWithValidator):
    def __init__(self):
        super().__init__('ABC')
    def __call__(self, value):
        super().__call__(value)
If your validator does a lot of complex things, a single function can become hard to read. With a class, you can encapsulate the functionality and group it together.
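Applied to the upload case from the question, a "configurable" validator could look something like the sketch below. Everything here is hypothetical: `MediaFileValidator` and `FakeUpload` are made-up names, the local `ValidationError` stands in for Django's so the snippet runs on its own, and the signature table only knows about MP4.

```python
import os


class ValidationError(Exception):
    """Stand-in for django.core.exceptions.ValidationError."""


class FakeUpload:
    """Minimal object with the `name` and `read()` an uploaded file offers."""

    def __init__(self, name, content):
        self.name = name
        self._content = content

    def read(self, size=-1):
        return self._content if size < 0 else self._content[:size]


class MediaFileValidator:
    """One class, configured per field with extension -> (offset, signature)."""

    def __init__(self, signatures):
        self.signatures = signatures

    def __call__(self, value):
        ext = os.path.splitext(value.name)[1][1:].lower()
        if ext not in self.signatures:
            raise ValidationError("extension %r is not allowed" % ext)
        offset, magic_bytes = self.signatures[ext]
        header = value.read(offset + len(magic_bytes))
        if header[offset:] != magic_bytes:
            raise ValidationError("content does not match extension %r" % ext)


# Configure once; support for more formats is added as data, not as new code.
validate_video = MediaFileValidator({"mp4": (4, b"ftyp")})
validate_video(FakeUpload("clip.mp4", b"\x00\x00\x00\x18ftypmp42"))
```

In a real project, Django's validators reference also asks that class-based validators used on model fields be serializable by the migration framework, by adding `deconstruct()` and `__eq__()` methods.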

JSONField getting saved as string django

I have a django model like below:
from jsonfield import JSONField
class SCUser(User):
    address = JSONField(blank=True,null=True)
When I save a json in this address it gets saved as string.
Here is a code snippet:
appuser.address = {"state":""}
appuser.save()
Now if I try to retrieve appuser.address, it gives me:
>>> appuser.address
u'{"state":""}'
>>> appuser.save()
>>> appuser.address
u'"{\\"state\\":\\"\\"}"'
And it gets recursive.
What am I missing here?
Edit:
The AppUser inherits from SCUser model.
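The growing backslashes are what you would see if a value that is already a JSON string gets serialized again on every save. Whether that is exactly what happens inside jsonfield here is an assumption; the sketch below only reproduces the symptom with the standard library, and `encode_times` is a made-up helper:

```python
import json


def encode_times(value, times):
    """Apply json.dumps repeatedly, as if each save() re-serialized the stored text."""
    for _ in range(times):
        value = json.dumps(value)
    return value


address = {"state": ""}
once = encode_times(address, 1)   # a JSON object, as text
twice = encode_times(address, 2)  # a JSON string that contains that text

print(once)
print(twice)
```

Decoding `twice` once gives back a string, not a dict, which matches the behaviour described above.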
I ran into this problem when using a non-AutoField key as the model's primary key, and I found some related issues that are still open on GitHub:
https://github.com/dmkoch/django-jsonfield/issues/92
https://github.com/dmkoch/django-jsonfield/issues/101
I solved this problem by defining a pk property on the model. I don't know whether this solution has any side effects.
class SCUser(User):
    ....
    @property
    def pk(self):
        return self.id # your pk
Please try:
appuser.address = {"state":""}
appuser.save()
appuser.get_data_json()

Cleaner / reusable way to emit specific JSON for django models

I'm rewriting the back end of an app to use Django, and I'd like to keep the front end as untouched as possible. I need to be consistent with the JSON that is sent between projects.
In models.py I have:
class Resource(models.Model):
    # Name chosen for consistency with old app
    _id = models.AutoField(primary_key=True)
    name = models.CharField(max_length=255)
    @property
    def bookingPercentage(self):
        from bookings.models import Booking
        return (Booking.objects.filter(resource=self)
                .aggregate(models.Sum("percent"))["percent__sum"])
And in views.py that gets all resource data as JSON:
import json
from django.http import HttpResponse
def get_resources(request):
    resources = []
    for resource in Resource.objects.all():
        resources.append({
            "_id": resource._id,
            "name": resource.name,  # the model's field is `name`
            "bookingPercentage": resource.bookingPercentage
        })
    return HttpResponse(json.dumps(resources))
This works exactly as I need it to, but it seems somewhat antithetical to Django and/or Python. Using .all().values() will not work because bookingPercentage is a derived property.
Another issue is that there are other similar models that will need JSON representations in pretty much the same way. I would be rewriting similar code and just using different names for the values of the models. In general is there a better way to do this that is more pythonic/djangothonic/does not require manual creation of the JSON?
Here's what I do in this situation:
def get_resources(request):
    resources = list(Resource.objects.all())
    for resource in resources:
        # bookingPercentage is a property, so it is read without parentheses
        resource.booking = resource.bookingPercentage
That is, I create a new attribute for each entity using the derived property. It's only a local attribute (not stored in the database), but it's available for your json.dumps() call.
It sounds like you just want a serialisation of your models, in JSON. You can use the serialisers in core:
from django.core import serializers
data = serializers.serialize('json', Resource.objects.all(), fields=('name','_id', 'bookingPercentage'))
So just pass your model class and the fields you want to serialize into your view:
def get_resources(request, model_cls, fields):
    ...
The documentation is here: https://docs.djangoproject.com/en/dev/topics/serialization/#id2
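One caveat: as far as I know, the core serializers work from model fields, so a derived property such as bookingPercentage may not show up in their output. A small sketch of a reusable alternative that reads any attribute by name; `to_dict`, `dump_all` and `FakeResource` are made-up names, and `FakeResource` only imitates the model so the snippet runs without a database:

```python
import json


def to_dict(obj, fields):
    """Build a plain dict from the named attributes, properties included."""
    return {name: getattr(obj, name) for name in fields}


def dump_all(objects, fields):
    """Serialize any iterable of objects to a JSON array using the same field list."""
    return json.dumps([to_dict(obj, fields) for obj in objects])


class FakeResource:
    """Hypothetical stand-in for the Resource model."""

    def __init__(self, _id, name, percents):
        self._id = _id
        self.name = name
        self._percents = percents

    @property
    def bookingPercentage(self):
        return sum(self._percents)


RESOURCE_FIELDS = ("_id", "name", "bookingPercentage")
print(dump_all([FakeResource(1, "Room A", [25, 50])], RESOURCE_FIELDS))
```

Each similar model then only needs its own tuple of field names, and the view becomes a single call to `dump_all`.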

Django Admin interface with pickled set

I have a model that has a pickled set of strings. (It has to be pickled, because Django has no built in set field, right?)
class Foo(models.Model):
    __bar = models.TextField(default=lambda: cPickle.dumps(set()), primary_key=True)
    def get_bar(self):
        return cPickle.loads(str(self.__bar))
    def set_bar(self, values):
        self.__bar = cPickle.dumps(values)
    bar = property(get_bar, set_bar)
I would like the set to be editable in the admin interface. Obviously the user won't be working with the pickled string directly. Also, the interface would need a widget for adding/removing strings from a set.
What is the best way to go about doing this? I'm not super familiar with Django's admin system. Do I need to build a custom admin widget or something?
Update: If I do need a custom widget, this looks helpful: http://www.fictitiousnonsense.com/archives/22
Update 2: Now I'm looking through different relational models to see if that will work. One idea I'm toying with:
class FooMember(models.Model):
    name = models.CharField(max_length=120)
    foo = models.ForeignKey('Foo')
class Foo(models.Model):
    def get_names(self):
        return FooMember.objects.filter(foo__exact=self)
Disadvantages of this include:
It feels excessive to make an entire model for one data field (name).
I would like the admin interface for Foo to allow the user to enter a list of strings. I'm not sure how to do that with this setup; making a custom form widget seems like less work.
Uhm. Django usually stores its data in an SQL database. Storing a set as a pickled string is definitely not the best way to use an SQL database. It's not immediately obvious which solution is right in your case, since that depends on what is in the set, but this is the wrong solution in any case.
You might want a new table for that set, or at least save it as comma separated values or something.
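If a separate table feels like too much, the "comma separated values or something" option could be as small as the sketch below, which stores the set as a JSON array in the text column. The helper names `dump_set` and `load_set` are made up for illustration:

```python
import json


def dump_set(values):
    """Serialize a set of strings as a sorted JSON array (stable and readable)."""
    return json.dumps(sorted(values))


def load_set(text):
    """Rebuild the set; an empty column is treated as an empty set."""
    return set(json.loads(text)) if text else set()


stored = dump_set({"beta", "alpha"})
print(stored)
print(load_set(stored))
```

Unlike a pickle, this text can be read and edited by hand in the admin, and loading it cannot execute arbitrary code.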

Django ease of building a RESTful interface

I'm looking for an excuse to learn Django for a new project that has come up. Typically I like to build RESTful server-side interfaces where a URL maps to resources that spit out data in some platform-independent format, such as XML or JSON. This is rather straightforward to do without the use of frameworks, but some of them, such as Ruby on Rails, conveniently allow you to spit back XML to a client based on the type of URL you pass it, using your existing model code.
My question is, does something like Django have support for this? I've googled and found some 'RESTful' 3rd party code that can go on top of Django. Not sure if I'm too keen on that.
If not Django, any other Python framework that's already built with this in mind so I do not have to reinvent the wheel as I already have in languages like PHP?
This is probably pretty easy to do.
URL mappings are easy to construct, for example:
urlpatterns = patterns('books.views',
    (r'^books/$', 'index'),
    (r'^books/(\d+)/$', 'get'))
Django supports model serialization, so it's easy to turn models into XML:
from django.core import serializers
from models import Book
data = serializers.serialize("xml", Book.objects.all())
Combine the two with decorators and you can build handlers quickly:
from django.core import serializers
from django.http import HttpResponse
from django.shortcuts import get_object_or_404
from models import Book
def xml_view(func):
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        # content_type= replaces the old mimetype= keyword
        return HttpResponse(serializers.serialize("xml", result),
                            content_type="text/xml")
    return wrapper
@xml_view
def index(request):
    return Book.objects.all()
@xml_view
def get(request, id):
    # serialize() iterates over its argument, so wrap the single object in a list
    return [get_object_or_404(Book, pk=id)]
(I had to edit out the most obvious links.)
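The decorator idea is independent of Django, so here is a stripped-down sketch that runs on its own. It is not the code above: `json_view` is a made-up name, the returned `(body, content_type)` tuple stands in for an HttpResponse, and the view returns hard-coded data instead of a queryset.

```python
import functools
import json


def json_view(func):
    """Wrap a view so its return value is serialized to JSON."""

    @functools.wraps(func)  # keep the view's own name for debugging and URL tools
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        return json.dumps(result), "application/json"

    return wrapper


@json_view
def index(request):
    # Hypothetical data standing in for Book.objects.all()
    return [{"id": 1, "title": "Dune"}]


body, content_type = index(None)
print(content_type, body)
```

The views stay free of serialization details, and switching the output format means changing one decorator rather than every handler.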
+1 for piston - (link above). I had used apibuilder (Washington Times open source) in the past, but Piston works easier for me. The most difficult thing for me is in figuring out my URL structures for the API, and to help with the regular expressions. I've also used surlex which makes that chore much easier.
Example, using this model for Group (from a timetable system we're working on):
class Group(models.Model):
    """
    Tree-like structure that holds groups that may have other groups as leaves.
    For example ``st01gp01`` is part of ``stage1``.
    This allows subgroups to work. The name is ``parents``, i.e.::
        >>> stage1group01 = Group.objects.get(unique_name = 'St 1 Gp01')
        >>> stage1group01
        >>> <Group: St 1 Gp01>
        # get the parents...
        >>> stage1group01.parents.all()
        >>> [<Group: Stage 1>]
    ``symmetrical`` on ``subgroup`` is needed to allow the 'parents' attribute to be 'visible'.
    """
    subgroup = models.ManyToManyField("Group", related_name = "parents", symmetrical= False, blank=True)
    unique_name = models.CharField(max_length=255)
    name = models.CharField(max_length=255)
    academic_year = models.CharField(max_length=255)
    dept_id = models.CharField(max_length=255)
    class Meta:
        db_table = u'timetable_group'
    def __unicode__(self):
        return "%s" % self.name
And this urls.py fragment (note that surlex allows regular expression macros to be set up easily):
from surlex.dj import surl
from surlex import register_macro
from piston.resource import Resource
from api.handlers import GroupHandler
group_handler = Resource(GroupHandler)
# add another macro to our 'surl' function
# this picks up our module definitions
register_macro('t', r'[\w\W ,-]+')
urlpatterns = patterns('',
    # group handler
    # all groups
    url(r'^groups/$', group_handler),
    surl(r'^group/<id:#>/$', group_handler),
    surl(r'^group/<name:t>/$', group_handler),)
Then this handler will look after JSON output (by default) and can also do XML and YAML.
from django.core.exceptions import ObjectDoesNotExist, MultipleObjectsReturned
from piston.handler import BaseHandler
from piston.utils import rc
class GroupHandler(BaseHandler):
    """
    Entry point for Group model
    """
    allowed_methods = ('GET', )
    model = Group
    fields = ('id', 'unique_name', 'name', 'dept_id', 'academic_year', 'subgroup')
    def read(self, request, id=None, name=None):
        base = Group.objects
        if id:
            print self.__class__, 'ID'
            try:
                return base.get(id=id)
            except ObjectDoesNotExist:
                return rc.NOT_FOUND
            except MultipleObjectsReturned: # Should never happen, since we're using a primary key.
                return rc.BAD_REQUEST
        else:
            if name:
                print self.__class__, 'Name'
                return base.filter(unique_name = name).all()
            else:
                print self.__class__, 'NO ID'
                return base.all()
As you can see, most of the handler code is in figuring out what parameters are being passed in urlpatterns.
Some example URLs are api/groups/, api/group/3301/ and api/group/st1gp01/ - all of which will output JSON.
Take a look at Piston, it's a mini-framework for Django for creating RESTful APIs.
A recent blog post by Eric Holscher provides some more insight on the pros of using Piston: Large Problems in Django, Mostly Solved: APIs
It can respond with any kind of data. JSON/XML/PDF/pictures/CSV...
Django itself comes with a set of serializers.
Edit
I just had a look at Piston; it looks promising. Best feature:
Stays out of your way.
:)
Regarding your comment about not liking 3rd party code - that's too bad because the pluggable apps are one of django's greatest features. Like others answered, piston will do most of the work for you.
A little over a year ago, I wrote a REST web service in Django for a large Seattle company that does streaming media on the Internet.
Django was excellent for the purpose. As "a paid nerd" observed, the Django URL config is wonderful: you can set up your URLs just the way you want them, and have it serve up the appropriate objects.
The one thing I didn't like: the Django ORM has absolutely no support for binary BLOBs. If you want to serve up photos or something, you will need to keep them in a file system, and not in a database. Because we were using multiple servers, I had to choose between writing my own BLOB support or finding some replication framework that would keep all the servers up to date with the latest binary data. (I chose to write my own BLOB support. It wasn't very hard, so I was actually annoyed that the Django guys didn't do that work. There should be one, and preferably only one, obvious way to do something.)
I really like the Django ORM. It makes the database part really easy; you don't need to know any SQL. (I don't like SQL and I do like Python, so it's a double win.) The "admin interface", which you get for free, gives you a great way to look through your data, and to poke data in during testing and development.
I recommend Django without reservation.
