Proper/most efficient way to handle Django models and relational data - python

I'm curious about what the best way to handle models in Django is. Let's say you want to make an app that deals with TV Show listings. One way to handle the model would be
    class TVShow(models.Model):
        channel = models.CharField(max_length=100)
        show_name = models.CharField(max_length=100)
        season = models.CharField(max_length=100)
        episode = models.CharField(max_length=100)
This has the advantage of keeping everything neatly in one place. However, if I want to display a list of all the channels, or all the show_names, I would have to go through the TVShow objects and remove duplicates.
On the other hand, one could use:
    class CommonModel(models.Model):
        name = models.CharField(max_length=100)

        class Meta:
            abstract = True

    class Channel(CommonModel):
        # String references allow pointing at models defined further down.
        show_name = models.ManyToManyField('ShowName')

    class ShowName(CommonModel):
        seasons = models.ManyToManyField('Season')

    class Season(CommonModel):
        episodes = models.ManyToManyField('Episode')

    class Episode(CommonModel):
        pass
This would make it easy to show all of the ShowNames or all of the Channels without having to worry about unrelated data. However, it would be much harder to see which Channel a show is on, unless you map back as well.
Is there a "pythonic" or Django preferred way to do this? Are there any advantages in terms of space, speed, etc?
Thanks!

Your initial stab at it looks fine. That is, you could use
    class TVShow(models.Model):
        channel = models.CharField(max_length=100)
        show_name = models.CharField(max_length=100)
        season = models.CharField(max_length=100)
        episode = models.CharField(max_length=100)
And then you could just use the Django ORM to do the queries you were looking for.
That is, if you wanted all the channels with no duplicates, you would say
    TVShow.objects.values_list('channel', flat=True).distinct()
(Passing a field name, as in TVShow.objects.distinct('channel'), is supported on PostgreSQL only.) See the Django documentation for distinct().
As far as performance goes, this is the way to do it, because you are effectively having the database do the work. Databases are designed for this and should be significantly faster than trimming duplicates in Python code.
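To make "having the database do it" concrete, here is a minimal sketch that uses the standard-library sqlite3 module instead of Django. The table name, columns, and rows are made up for illustration; the point is only that the de-duplication happens inside the SQL statement, which is the kind of query the ORM's distinct() produces.

```python
import sqlite3

# In-memory stand-in for the flat TVShow table (hypothetical names and data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tvshow (channel TEXT, show_name TEXT, season TEXT, episode TEXT)"
)
conn.executemany(
    "INSERT INTO tvshow VALUES (?, ?, ?, ?)",
    [
        ("Fox", "Arrested Development", "1", "1"),
        ("Fox", "Arrested Development", "1", "2"),
        ("Discovery", "MythBusters", "3", "5"),
    ],
)

# The database removes the duplicates; Python only receives the unique values.
channels = [
    row[0]
    for row in conn.execute("SELECT DISTINCT channel FROM tvshow ORDER BY channel")
]
print(channels)
```

With three rows but only two distinct channels, the list contains each channel once.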

The preferred way is to use a normalized database structure unless there is a performance reason not to; it makes more complex queries easier to write. Both ForeignKey and ManyToManyField accept a 'related_name' argument, which names the reverse relation used in the queries below.
    class Channel(models.Model):
        name = models.CharField(max_length=100)

    class Show(models.Model):
        name = models.CharField(max_length=100)
        # this means you can have the same show on different channels
        channels = models.ManyToManyField(Channel, related_name='shows')

    class Episode(models.Model):
        name = models.CharField(max_length=100)
        # this means that one episode can be related to only one show
        # (on_delete is required since Django 2.0; CASCADE was the old default)
        show = models.ForeignKey(Show, on_delete=models.CASCADE,
                                 related_name='episodes')

    Channel.objects.filter(shows__name='Arrested Development')
    Channel.objects.get(name='Discovery').shows.all()
    Show.objects.get(name='Arrested Development').episodes.all()

    # 2 db queries, 1 join
    Episode.objects.select_related('show').get(
        name='Arrested Development S01E01').show.channels.all()

    # 1 db query, 3 joins
    Channel.objects.filter(shows__episodes__name='Arrested Development S01E01')
and so on...
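For readers who want to see what the "1 db query, 3 joins" case amounts to, the following is a rough sketch in plain sqlite3 rather than Django. The table and column names (including the show_channels join table standing in for the many-to-many field) are assumptions chosen for illustration, not the names Django would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE channel (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tv_show (id INTEGER PRIMARY KEY, name TEXT);
    -- join table behind the many-to-many relation between shows and channels
    CREATE TABLE show_channels (show_id INTEGER, channel_id INTEGER);
    CREATE TABLE episode (id INTEGER PRIMARY KEY, name TEXT, show_id INTEGER);

    INSERT INTO channel VALUES (1, 'Fox'), (2, 'Netflix'), (3, 'Discovery');
    INSERT INTO tv_show VALUES (1, 'Arrested Development');
    INSERT INTO show_channels VALUES (1, 1), (1, 2);
    INSERT INTO episode VALUES (1, 'Arrested Development S01E01', 1);
""")

# One query, three joins: channel -> join table -> show -> episode.
rows = conn.execute(
    """
    SELECT channel.name
    FROM channel
    JOIN show_channels ON show_channels.channel_id = channel.id
    JOIN tv_show ON tv_show.id = show_channels.show_id
    JOIN episode ON episode.show_id = tv_show.id
    WHERE episode.name = ?
    ORDER BY channel.name
    """,
    ("Arrested Development S01E01",),
).fetchall()
channel_names = [name for (name,) in rows]
print(channel_names)
```

Only the two channels linked to the show through the join table come back; the third channel is untouched by the query.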

Related

Flexible database models for users to define extra columns to database tables in Django

I am trying to build a tool that, at a simple level, tries to analyse how to buy a flat. DB = POSTGRES
So the model basically is:
    class Property(models.Model):
        address = models.CharField(max_length=200)
        price = models.IntegerField()
        # user who entered the property in the database
        user = models.ForeignKey(User, on_delete=models.CASCADE)
        # ...
        # some more fields that are common across all flats

        # However, users might have their own way of analysing.
        # One user might want to add:
        # his own estimate of the price, different from the Zoopla or Rightmove listing price
        estimated_price = models.IntegerField()
        # his own estimate of how long it will take to purchase
        time_to_purchase = models.IntegerField()

        # Another user might want other fields; his purchase process might
        # require sorting or filtering based on these two fields:
        number_of_bedrooms = models.IntegerField()
        previous_owner_name = models.CharField(max_length=200)
How do I give such flexibility to users? They should be able to sort, filter and query their own rows (in the Property table) by these custom fields. The only option I can think of now is the Postgres JSONField.
Any advice? I am surprised this is not readily solved in Django. I am sure lots of other people have come across this problem already.
Thanks
Edit: As the comments point out, a JSON field is a better idea in this case.
Simple. Use relations.
Create a model called Attribute.
It will have a foreign key to a Property, a name field and a value field.
Something like,
    class Attribute(models.Model):
        property = models.ForeignKey(Property, on_delete=models.CASCADE)
        name = models.CharField(max_length=50)
        value = models.CharField(max_length=150)
Create one object for each custom attribute of a property.
When querying, use select_related or prefetch_related for faster responses and fewer db operations.
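A minimal sketch of how filtering and sorting by such a custom attribute could look at the SQL level, using the standard-library sqlite3 module instead of Django and Postgres. The table layout mirrors the Attribute model above, but the table names and sample rows are hypothetical. Note the cast: because every value is stored as text, numeric comparisons and ordering need an explicit conversion, which is one of the costs of this design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE property (id INTEGER PRIMARY KEY, address TEXT, price INTEGER);
    CREATE TABLE attribute (
        property_id INTEGER REFERENCES property(id),
        name TEXT,
        value TEXT
    );
    INSERT INTO property VALUES
        (1, '1 High St', 300000),
        (2, '2 Low Rd', 250000),
        (3, '3 Mid Ln', 275000);
    INSERT INTO attribute VALUES
        (1, 'number_of_bedrooms', '3'),
        (2, 'number_of_bedrooms', '10'),
        (3, 'number_of_bedrooms', '2'),
        (1, 'previous_owner_name', 'Smith');
""")

# Filter and sort properties by one user-defined attribute.
# Without the CAST, '10' would sort before '3' because values are text.
rows = conn.execute(
    """
    SELECT property.address, CAST(attribute.value AS INTEGER) AS bedrooms
    FROM property
    JOIN attribute ON attribute.property_id = property.id
    WHERE attribute.name = 'number_of_bedrooms'
      AND CAST(attribute.value AS INTEGER) >= 3
    ORDER BY bedrooms DESC
    """
).fetchall()
print(rows)
```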

How to have extra django model fields depending on the value of a field?

I have a model in my Django project called Job. Each Job has a category. An example of a category could be tutoring. This can be represented as what my model looks like right now:
    from __future__ import unicode_literals
    from django.db import models

    class Job(models.Model):
        # Abbreviations for possible categories to be stored in the database.
        TUTORING = "TU"
        PETSITTING = "PS"
        BABYSITTING = "BS"
        INTERIOR_DESIGN = "IND"
        SHOPPING = "SH"
        SOFTWARE_DEVELOPMENT = "SD"
        DESIGN = "DE"
        ART = "AR"
        HOUSEKEEPING = "HK"
        OTHER = "OT"
        JOB_CATEGORY_CHOICES = (
            (TUTORING, "Tutoring"),
            (PETSITTING, "Petsitting"),
            (BABYSITTING, "Babysitting"),
            (INTERIOR_DESIGN, "Interior Design"),
            (SHOPPING, "Shopping"),
            (SOFTWARE_DEVELOPMENT, "Software Development"),
            (DESIGN, "Design"),
            (ART, "Art"),
            (HOUSEKEEPING, "Housekeeping"),
            (OTHER, "Other"),
        )
        created_at = models.DateTimeField(auto_now_add=True)
        title = models.CharField(max_length=255)
        description = models.TextField()
        category = models.CharField(max_length=3, choices=JOB_CATEGORY_CHOICES, default=OTHER)

        def __str__(self):
            return self.title
Depending on the category of the Job, different fields are required. For example, if I take tutoring as the category again, then extra fields like address, subject, level of study and others are needed. If the category of the Job is software development however, extra fields like project_size and required_qualifications are needed.
Should I create a separate model for each type of Job, or is there some kind of model inheritance I can use where job types inherit from the main Job model, which holds all the common fields that all Jobs need?
Essentially, what is the best way to have extra fields depending on the Job category?
You have some options:
1. OneToOneField on various category models
   Pro:
   - Allows other models to have an FK to the Job model. E.g. you could retrieve all of a person's jobs via person.jobs.all(), no matter which category.
   Con:
   - Allows instances of different categories to relate to the same Job instance: extra work is needed to maintain data integrity.
   - More tables, more joins, slower queries.
   - Adding a category always entails a migration!
2. Multi-table inheritance
   Uses OneToOneField under the hood.
   Pro:
   - As above, but each instance of a category will auto-create its own Job instance, so there are no collisions between categories.
   Con:
   - More tables, more joins, slower queries. Obscures some of the db stuff that's going on.
   - Adding a category always entails a migration!
3. Job as an abstract base model
   Pro:
   - Single table for each category, faster queries.
   Con:
   - Separate relations need to be maintained for each category; no grouping possible at the db level.
   - Adding a category always entails a migration!
4. Put all the category-specific fields in Job (and make them nullable)
   Pro:
   - One table, easy relations; queries for special categories via a filter on the category field are still possible.
   - You can use specific model managers to handle categories: Job.tutoring.all()
   - Possibly many categories share various subsets of fields.
   - No overengineering, easy maintainability.
   - Adding a new category will only require a migration if it requires a field that is not there yet. You could have a generic CharField used by multiple categories for different semantic purposes and access it via properties with meaningful names. These cannot, however, be used in filters or queryset updates.
   À la:
       class Job(models.Model):
           # ...
           attribute = models.CharField(...)

           def _get_attribute(self):
               return self.attribute

           def _set_attribute(self, value):
               self.attribute = value

           # for shopping
           shop_name = property(_get_attribute, _set_attribute)
           # for babysitting
           family_name = property(_get_attribute, _set_attribute)

       # then you can use
       babysitting_job.family_name = 'Miller'
   Con:
   - Some fields are null for each job.
While options 1-3 may better model the real world and make you feel good about the sophisticated model structure you have cooked up, I would not discard option 4 too quickly.
If the category fields are few and commonly shared between categories, this would be my way to go.
The optimal thing to do would be to use a OneToOneField. Before further explanation, I'll just use this example:
    from django.db import models

    class Menu(models.Model):
        name = models.CharField(max_length=30)

    class Item(models.Model):
        menu = models.OneToOneField(Menu, on_delete=models.CASCADE)
        name = models.CharField(max_length=30)
        description = models.CharField(max_length=100)
Menu here could compare to your Job model. Once an item in the menu is chosen, the Menu model basically extends the chosen Item's fields. Item here can be compared to your Job category.
You can read more on this stuff here.

I want to write a generic function to pair two database fields

Let's say that I have two teams, "red" and "black". And let's say that I have a Story class, which presents similar information in two very different ways, depending on your team:
    class Story(models.Model):
        red_title = models.CharField(max_length=200)
        black_title = models.CharField(max_length=200)
        red_prologue = models.TextField()
        black_prologue = models.TextField()
        # ... and so on ...

        def get_field(self, genericName, team):
            """Return the field with suffix genericName belonging to the given team.

            >>> self.get_field("prologue", "red") is self.red_prologue
            True
            >>> self.get_field("title", "black") is self.black_title
            True
            """
            assert team in ["red", "black"]
            specificName = "{}_{}".format(team, genericName)
            return self.__dict__[specificName]
I'm happy with the getter function, but I feel like I should be able to refactor the code which created the fields in the first place. I'd like a function that looks something like this:
    def make_fields(self, genericName, fieldType, **kwargs):
        """Create two fields with suffix genericName.

        One will be 'red_{genericName}' and one will be 'black_{genericName}'.
        """
        for team in ["red", "black"]:
            specificName = "{}_{}".format(team, genericName)
            self.__dict__[specificName] = fieldType(**kwargs)
But self and __dict__ are meaningless while the class is first defined, and I think Django requires that database fields be class variables rather than instance variables.
So... is there some way to create this make_fields function within Django, or am I out of luck?
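As a side note on the getter, here is a small sketch of the same lookup written with getattr instead of indexing into __dict__. It uses a plain Python class as a stand-in for the model, so it runs without Django; the Story stand-in, the TEAMS tuple, and the sample values are assumptions for illustration. getattr goes through the normal attribute machinery, so it also works for attributes that are not stored directly in the instance dictionary, and raising ValueError keeps the team check active even when assertions are disabled.

```python
class Story:
    """Plain-Python stand-in for the model; attributes mimic loaded field values."""

    TEAMS = ("red", "black")

    def __init__(self, **values):
        for name, value in values.items():
            setattr(self, name, value)

    def get_field(self, generic_name, team):
        """Return the attribute named '<team>_<generic_name>'."""
        if team not in self.TEAMS:
            raise ValueError("unknown team: {!r}".format(team))
        return getattr(self, "{}_{}".format(team, generic_name))


story = Story(red_title="Dawn", black_title="Dusk")
print(story.get_field("title", "red"))
print(story.get_field("title", "black"))
```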
Not sure why you're even doing this. A much more sane model would be:
    TEAMS = (
        ("r", "red"),
        ("b", "black"),
    )

    class Story(models.Model):
        team = models.CharField(max_length=1, choices=TEAMS)
        title = models.CharField(max_length=200)
        prologue = models.TextField()
Your current model creates lots of duplicate columns (for red and black) where a single column identifying the team would do. Using the model above, your queries would look like Story.objects.filter(team="r").
You then wouldn't need your get_field function at all.
No. A Django model shouldn't be treated as something that can be dynamically constructed; it's a Python representation of a database table. For instance, what would be the semantics of changing the format of specificName after you had already run syncdb? There's no definitive, obvious answer, so Django doesn't try to answer it. Your columns are defined at the class level, and that's that.
(At some level, you can always drill into the internal ORM data structures and set up these fields - but all you're doing is opening yourself up to a world of ambiguity and not-well-defined problems. Don't do it.)

How to perform queries in Django following double-join relationships (or: How to get around Django's restrictions on ManyToMany "through" models?)

There must be a way to do this query through the ORM, but I'm not seeing it.
The Setup
Here's what I'm modelling: one Tenant can occupy multiple rooms and one User can own multiple rooms. So Rooms have an FK to Tenant and an FK to User. Rooms are also maintained by a (possibly distinct) User.
That is, I have these (simplified) models:
    class Tenant(models.Model):
        name = models.CharField(max_length=100)

    class Room(models.Model):
        owner = models.ForeignKey(User, on_delete=models.CASCADE)
        maintainer = models.ForeignKey(User, on_delete=models.CASCADE)
        tenant = models.ForeignKey(Tenant, on_delete=models.CASCADE)
The Problem
Given a Tenant, I want the Users owning a room which they occupy.
The relevant SQL query would be:
SELECT auth_user.id, ...
FROM tenants_tenant, tenants_room, auth_user
WHERE tenants_tenant.id = tenants_room.tenant_id
AND tenants_room.owner_id = auth_user.id;
Getting any individual value off the related User objects can be done with, for example, my_tenant.rooms.values_list('owner__email', flat=True), but getting a full queryset of Users is tripping me up.
Normally one way to solve it would be to set up a ManyToMany field on my Tenant model pointing at User, with TenantRoom as the 'through' model. That won't work in this case, though, because the TenantRoom model has a second (unrelated) ForeignKey to User (see "restrictions"). Plus it seems like needless clutter on the Tenant model.
Doing my_tenant.rooms.values_list('user', flat=True) gets me close, but returns a ValuesListQuerySet of user IDs rather than a queryset of the actual User objects.
The Question
So: is there a way to get a queryset of the actual model instances, through the ORM, using just one query?
Edit
If there is, in fact, no way to do this directly in one query through the ORM, what is the best (some combination of most performant, most idiomatic, most readable, etc.) way to accomplish what I'm looking for? Here are the options I see:
Subselect:
    users = User.objects.filter(id__in=my_tenant.rooms.values_list('user'))
Subselect through Python (see Performance considerations for reasoning behind this):
    user_ids = my_tenant.rooms.values_list('user', flat=True)
    users = User.objects.filter(id__in=list(user_ids))
Raw SQL:
    User.objects.raw("""SELECT auth_user.*
        FROM tenants_tenant, tenants_room, auth_user
        WHERE tenants_tenant.id = tenants_room.tenant_id
        AND tenants_room.owner_id = auth_user.id""")
Others...?
The proper way to do this is with related_name:
    class Tenant(models.Model):
        name = models.CharField(max_length=100)

    class Room(models.Model):
        owner = models.ForeignKey(User, on_delete=models.CASCADE, related_name='owns')
        maintainer = models.ForeignKey(User, on_delete=models.CASCADE, related_name='maintains')
        tenant = models.ForeignKey(Tenant, on_delete=models.CASCADE)
Then you can do this:
    jrb = User.objects.create(username='jrb')
    bill = User.objects.create(username='bill')
    bob = models.Tenant.objects.create(name="Bob")
    models.Room.objects.create(owner=jrb, maintainer=bill, tenant=bob)
    # users who own at least one room occupied by bob
    User.objects.filter(owns__tenant=bob)
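One detail worth illustrating: a filter that follows a one-to-many relation joins the tables, so a user who owns several matching rooms can come back once per room, and adding .distinct() to the queryset is the usual remedy. The sketch below shows the effect in plain sqlite3 rather than Django. The table names follow the SQL in the question, while the helper function and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE tenants_tenant (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tenants_room (
        id INTEGER PRIMARY KEY,
        owner_id INTEGER,
        maintainer_id INTEGER,
        tenant_id INTEGER
    );
    INSERT INTO auth_user VALUES (1, 'jrb'), (2, 'bill');
    INSERT INTO tenants_tenant VALUES (1, 'Bob');
    -- jrb owns two rooms occupied by Bob; bill only maintains them
    INSERT INTO tenants_room VALUES (1, 1, 2, 1), (2, 1, 2, 1);
""")


def owners_of_rooms_occupied_by(tenant_id, distinct):
    """Return usernames of users owning a room occupied by the given tenant."""
    sql = """
        SELECT {distinct} auth_user.username
        FROM auth_user
        JOIN tenants_room ON tenants_room.owner_id = auth_user.id
        WHERE tenants_room.tenant_id = ?
    """.format(distinct="DISTINCT" if distinct else "")
    return [name for (name,) in conn.execute(sql, (tenant_id,))]


with_duplicates = owners_of_rooms_occupied_by(1, distinct=False)
unique_owners = owners_of_rooms_occupied_by(1, distinct=True)
print(with_duplicates, unique_owners)
```

The maintainer never appears in either result, because the join goes through owner_id only.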

Creating "classes" with Django

I'm just learning Django so feel free to correct me in any of my assumptions. I probably just need my mindset adjusted.
What I'm trying to do is creating a "class" in an OOP style. For example, let's say we're designing a bunch of Rooms. Each Room has Furniture. And each piece of Furniture has a Type and a Color. What I can see so far is that I can have
    class FurnitureType(models.Model):
        name = models.CharField(max_length=200)

    class FurnitureColor(models.Model):
        name = models.CharField(max_length=50)

    class FurniturePiece(models.Model):
        type = models.ForeignKey(FurnitureType, on_delete=models.CASCADE)
        color = models.ForeignKey(FurnitureColor, on_delete=models.CASCADE)
        sqft = models.IntegerField()
        name = models.CharField(max_length=200)

    class Room(models.Model):
        name = models.CharField(max_length=200)
        furnitures = models.ManyToManyField(FurniturePiece)
The problem is that each FurniturePiece has to have a unique name if I'm picking it out of the Django admin interface. If one person creates "Green Couch" then no one else can have a "Green Couch". What I'm wondering is whether a) I need to learn more about the Django UI and this is the right way to design this in Django, or b) I have a bad design for this domain.
The reason I want Furniture name to be unique is because 10 people could create a "Green Couch" each with a different sqft.
I don't get the problem with unique name. You can just specify it to be unique:
    class FurniturePiece(models.Model):
        type = models.ForeignKey(FurnitureType, on_delete=models.CASCADE)
        color = models.ForeignKey(FurnitureColor, on_delete=models.CASCADE)
        sqft = models.IntegerField()
        name = models.CharField(max_length=200, unique=True)
I don't know whether you have to learn about Django UI or not. I guess you have to learn how to define models. The admin interface is just a generated interface based on your models. You can change the interface in certain aspects without changing the models, but besides that, there is less to learn about the admin interface.
I suggest you follow a tutorial like the djangobook, to get a good start with Django.
I think, the problem that you have is not how to use Django but more that you don't know how to model your application in general.
First you have to think about which entities you have (like Room, Furniture, etc.).
Then think about what relations they have.
Afterwards you can model them in Django. Of course in order to do this you have to know how to model the relations. The syntax might be Django specific but the logical relations are not. E.g. a many-to-many relation is not something Django specific, this is a term used in databases to express a certain relationship.
Django's models are just an abstraction of the underlying database design.
E.g you specified a many-to-many relationship between Room and FurniturePiece.
Now the question: Is this what you want? It means that a piece of furniture can belong to more than one room. This sounds strange. So maybe you want to model it that a piece of furniture only belongs to one room. But a room should still have several pieces of furniture. We therefore define a relationship from FurniturePiece to Room.
In Django, we can express this with:
    class FurniturePiece(models.Model):
        room = models.ForeignKey(Room, on_delete=models.CASCADE)
        type = models.ForeignKey(FurnitureType, on_delete=models.CASCADE)
        color = models.ForeignKey(FurnitureColor, on_delete=models.CASCADE)
        sqft = models.IntegerField()
        name = models.CharField(max_length=200)
Maybe you should first learn about relational databases to get the basics before you model your application with Django.
It might be that this is not necessary in order to create an application in Django. But it will definitely help you to understand what's going on, for every ORM, not just Django's.
Why does each FurniturePiece need to have a unique name? It seems to me that if you remove that constraint everything just works.
(as an aside you seem to have accidentally dropped the models.Model base class for all but the Room model).
This is how I would do it:
    class Room(models.Model):
        name = models.CharField(max_length=255)
        pieces = models.ManyToManyField('FurniturePiece')

    class FurniturePiece(models.Model):
        # This is what I would require to be unique.
        itemid = models.CharField(max_length=20, unique=True)
        name = models.CharField(max_length=255)
        # Note I put 'FurnitureType' in quotes because it hasn't been written yet (coming next).
        type = models.ForeignKey('FurnitureType', on_delete=models.CASCADE)
        # Same here.
        color = models.ForeignKey('FurnitureColor', on_delete=models.CASCADE)
        width_in_inches = models.PositiveIntegerField()
        length_in_inches = models.PositiveIntegerField()

        # Next is the property decorator, which allows a method to be called without using ()
        @property
        def sqft(self):
            return (self.length_in_inches * self.width_in_inches) / 144  # Obviously this is rough.

    class FurnitureType(models.Model):
        name = models.CharField(max_length=255)

    class FurnitureColor(models.Model):
        name = models.CharField(max_length=255)
Envision objects as real life objects, and you'll have a deeper understanding of the code as well. The reason for my sqft method is that data is best when normalized as much as possible. If you have a width and length, then when somebody asks, you have length, width, sqft, and if you add height, volume as well.
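The derived-value idea can be tried out without Django. The sketch below uses a plain Python class as a stand-in for the model above (the constructor and the sample couch are assumptions for illustration): sqft is computed from the stored dimensions each time it is read, so it cannot drift out of sync with them. The trade-off is that a value computed in Python like this is not a database column, so it cannot be used in ORM filters or ordering.

```python
class FurniturePiece:
    """Plain-Python stand-in for the model above (no Django required)."""

    def __init__(self, name, width_in_inches, length_in_inches):
        self.name = name
        self.width_in_inches = width_in_inches
        self.length_in_inches = length_in_inches

    @property
    def sqft(self):
        # There are 144 square inches in a square foot.
        return (self.length_in_inches * self.width_in_inches) / 144


couch = FurniturePiece("Green Couch", width_in_inches=36, length_in_inches=84)
print(couch.sqft)  # read like an attribute, no parentheses

# Changing a dimension changes the derived value on the next read.
couch.length_in_inches = 72
print(couch.sqft)
```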
