How can I join multiple models in Django into a virtual table? - python

If I have 3 models, like:
class Cow(models.Model):
name =
number_of_eyes =
number_of_feet =
color =
class Pig(models.Model):
name =
number_of_eyes =
number_of_feet =
intelligence =
class Horse(models.Model):
name =
number_of_eyes =
number_of_hooves =
weight_capacity =
speed =
And I'm interested in making a single Livestock table in my template that has instances of all 3, but I'm only interested in these columns that all 3 models have:
name
number_of_eyes
number_of_feet (number_of_hooves if Horse)
And we can ignore all other columns.
How can I join them into a single queryset?
The end goal is to get a single virtual table (queryset) that I can do a few other operations on (filter, order_by, slice), and then return the data in just those columns.
Is this possible in the Django ORM?

I think you have two options:
using itertools.chain:
from itertools import chain
cows = Cow.objects.all()
pigs = Pig.objects.all()
horses = Horse.objects.all()
livestock_list = sorted(
chain(cows, pigs, horses),
key=lambda livestock: livestock.created_at, reverse=True)
)
using contenttypes:
from django.contrib.contenttypes.models import ContentType
from django.contrib.contenttypes.fields import GenericForeignKey
class Livestock(models.Model):
content_type = models.ForeignKey(ContentType)
object_id = models.PositiveIntegerField()
content_object = GenericForeignKey('content_type', 'object_id')
created = models.DateTimeField(auto_now_add=True)
class Meta:
ordering = ['-created']
Now you can query Livestock model like any other model in Django, but you can have a foreign key that can refers to n models. that's what contenttypes do.
Livestock.content_object gives you what you want in your case it can be Cow, Pig or Horse.
Just remember to add objects to Livestock model after you create horse, etc instances. you need to add them in 2 models actually. you can do it with signals.
I think the second solution is better.

Apparently this can also be done using a Union, as suggested by Nick ODell:
from django.db.models import F
Cow.objects.filter(...).union(
Pig.objects.filter(...),
Horse.objects.filter(...).annotate(number_of_feet=F("number_of_hooves"))
).values('name', 'number_of_eyes', 'number_of_feet').order_by('name')[:3]
Unfortunately you can't filter on the resulting queryset after the union, so you need to filter each queryset before the union, but other than that, everything seems to work in my quick test.
From what I understand, the difference here from MojixCoder's suggestion of using ContentType is that you don't need to maintain a separate definition of this virtual table in your Django models module. In some cases, that can be an advantage, as you don't need to keep the module updated when you get new models you want to include in your query, but in other cases, it can be a disadvantage, because my way has a lot of typing every time you want to use this query, whereas in MojixCoder's example, you define it once, and your queries would be much shorter.
Edit: Using annotate and union can result in the results being out of order. Special care must be taken to ensure this doesn't happen

Related

Django-native/ORM based approach to self join

Trying to set up a Django-native query that grabs all rows/relationships when it shows up on the other side of many-to-many relationship.
I can explain with an example:
# Django models
class Ingredient:
id = ...
name = ...
...
class Dish:
id = ...
name = ...
...
class IngredientToDish
# this is a many to many relationship
ingredient_id = models.ForeignKey("Ingredient", ...)
dish_id = models.ForeignKey("Dish", ...)
...
I'd like a Django-native way of: "For each dish that uses tomato, find all the ingredients that it uses".
Looking for a list of rows that looks like:
(cheese_id, pizza_id)
(sausage_id, pizza_id)
(tomato_id, pizza_id)
(cucumber_id, salad_id)
(tomato_id, salad_id)
I'd like to keep it in one DB query, for optimization. In SQL this would be a simple JOIN with itself (IngredientToDish table), but couldn't find what the conventional approach with Django would be... Likely uses some form of select_related but haven't been able to make it work; I think part of the reason is that I haven't been able to succinctly express the problem in words to come across the right documentation during research.
You can .filter(…) [Django-doc] with:
Ingredient.objects.filter(
ingredienttodish__dish_id__ingredienttodish__ingredient_id__name='Tomato'
)
You can also add the primary key of the dish for which this holds with:
from django.db.models import F
Ingredient.objects.filter(
ingredienttodish__dish_id__ingredienttodish__ingredient_id__name='Tomato'
).annotate(
dish_id=F('ingredienttodish__dish_id')
)
The Ingredient objects that arise from this QuerySet will have an extra attribute dish_id that contains the primary key of the Dish for which these were used.
Note: Normally one does not add a suffix …_id to a ForeignKey field, since Django
will automatically add a "twin" field with an …_id suffix. Therefore it should
be dish, instead of dish_id.

Django annotate value based on another model field

I have these two models, Cases and Specialties, just like this:
class Case(models.Model):
...
judge = models.CharField()
....
class Specialty(models.Model):
name = models.CharField()
sys_num = models.IntegerField()
I know this sounds like a really weird structure but try to bare with me:
The field judge in the Case model refer to a Specialty instance sys_num value (judge is a charfield but it will always carries an integer) (each Specialty instance has a unique sys_num). So I can get the Specialty name related to a specific Case instance using something like this:
my_pk = #some number here...
my_case_judge = Case.objects.get(pk=my_pk).judge
my_specialty_name = Specialty.objects.get(sys_num=my_case_judge)
I know this sounds really weird but I can't change the underlying schemma of the tables, just work around it with sql and Django's orm.
My problem is: I want to annotate the Specialty names in a queryset of Cases that have already called values().
I only managed to get it working using Case and When but it's not dynamic. If I add more Specialty instances I'll have to manually alter the code.
cases.annotate(
specialty=Case(
When(judge=0, then=Value('name 0 goes here')),
When(judge=1, then=Value('name 1 goes here')),
When(judge=2, then=Value('name 2 goes here')),
When(judge=3, then=Value('name 3 goes here')),
...
Can this be done dynamically? I look trough django's query reference docs but couldn't produce a working solution with the tools specified there.
You can do this with a subquery expression:
from django.db.models import OuterRef, Subquery
Case.objects.annotate(
specialty=Subquery(
Specialty.objects.filter(sys_num=OuterRef('judge')).values('name')[:1]
)
)
For some databases, casting might even be necessary:
from django.db.models import IntegerField, OuterRef, Subquery
from django.db.models.functions import Cast
Case.objects.annotate(
specialty=Subquery(
Specialty.objects.filter(sys_num=Cast(
OuterRef('judge'),
output_field=IntegerField()
)).values('name')[:1]
)
)
But the modeling is very bad. Usually it is better to work with a ForeignKey, this will guarantee that the judge can only point to a valid case (so referential integrity), will create indexes on the fields, and it will also make the Django ORM more effective since it allows more advanced querying with (relativily) small queries.

Get all values from Django QuerySet plus additional fields from a related model

I was wondering if there is a shortcut to getting all fields from a Django model and only defining additional fields that are retrieved through a join (or multiple joins).
Consider models like the following:
class A(models.Model):
text = models.CharField(max_length=10, blank=True)
class B(models.Model):
a = models.ForeignKey(A, null=True, on_delete=models.CASCADE)
y = models.PositiveIntegerField(null=True)
Now I can use the values() function like this
B.objects.values('y', 'a__text')
to get tuples containing the specified values from the B model and the actual field from the A model. If I only use
B.objects.values()
I only get tuples containing fields from the B model (i.e., y and the foreign key id a). Let's assume a scenario where B and A have many fields, and I am interested in all of those belonging to B but only in a single field from A. Manually specifying all the field names in the values() call would be possible, but tedious and error-prone.
So is there a way to specify that I want all local fields, but only a (few) specific joined field(s)?
Note: I'm currently using Django 1.11, but if a solution only works with a more recent version I am interested in that too.
You can use prefetch_related for this. See docs:
You want to use performance optimization techniques like deferred
fields:
queryset = Pizza.objects.only('name')
restaurants = Restaurant.objects.prefetch_related(Prefetch('best_pizza', queryset=queryset))
In your case you can do something like this:
from django.db.models import Prefetch
queryset = A.objects.only('text')
b_list = B.objects.prefetch_related(Prefetch('a', queryset=queryset))
Maybe something like this would work in your case?
B.objects.select_related('a').defer('a__field_to_lazy_load');
This will load all fields from both models except the ones you specify in defer(), where you can use the usual Django double underscore convention to traverse the relationship.
The fields you specify in defer() won't be loaded from the db but they will be if you try to access them later on (e.g. in a template).

Nested chain vs duplicated information

There is a models.py with 4 model.
Its standard record is:
class Main(models.Model):
stuff = models.IntegerField()
class Second(models.Model):
nested = models.ForeignKey(Main)
stuff = models.IntegerField()
class Third(models.Model):
nested = models.ForeignKey(Second)
stuff = models.IntegerField()
class Last(models.Model):
nested = models.ForeignKey(Third)
stuff = models.IntegerField()
and there is another variant of Last model:
class Last(models.Model):
nested1 = models.ForeignKey(Main)
nested2 = models.ForeignKey(Second)
nested = models.ForeignKey(Third)
stuff = models.IntegerField()
Will that way save some database load?
The information in nested1 and nested2 will duplicate fields in Secod and Third and even it may become outdated ( fortunately not in my case, as the data will not be changed, only new is added ).
But from my thoughts it may save database load, when I'm looking all Last records for a certain Main record. Or when I'm looking only for Main.id for specific Last item.
Am I right?
Will it really save the load or there is a better practice?
It all depends how you access the data. By default Django will make another call to the database when you access a foreign key. So if you want to make less calls to the database, you can use select_related to prefetch the models in foreign keys.

How to do this join query in Django

In Django, I have two models:
class Product(models.Model):
name = models.CharField(max_length = 50)
categories = models.ManyToManyField(Category)
class ProductRank(models.Model):
product = models.ForeignKey(Product)
rank = models.IntegerField(default = 0)
I put the rank into a separate table because every view of a page will cause the rank to change and I was worried that all these writes would make my other (mostly read) queries slow down.
I gather a list of Products from a simple query:
cat = Category.objects.get(pk = 1)
products = Product.objects.filter(categories = cat)
I would now like to get all the ranks for these products. I would prefer to do it all in one go (using a SQL join) and was wondering how to express that using Django's query mechanism.
What is the right way to do this in Django?
This can be done in Django, but you will need to restructure your models a little bit differently:
class Product(models.Model):
name = models.CharField(max_length=50)
product_rank = models.OneToOneField('ProductRank')
class ProductRank(models.Model):
rank = models.IntegerField(default=0)
Now, when fetching Product objects, you can following the one-to-one relationship in one query using the select_related() method:
Product.objects.filter([...]).select_related()
This will produce one query that fetches product ranks using a join:
SELECT "example_product"."id", "example_product"."name", "example_product"."product_rank_id", "example_productrank"."id", "example_productrank"."rank" FROM "example_product" INNER JOIN "example_productrank" ON ("example_product"."product_rank_id" = "example_productrank"."id")
I had to move the relationship field between Product and ProductRank to the Product model because it looks like select_related() follows foreign keys in one direction only.
I haven't checked but:
products = Product.objects.filter(categories__pk=1).select_related()
Should grab every instance.
For Django 2.1
From documentation
This example retrieves all Entry objects with a Blog whose name is 'Beatles Blog':
Entry.objects.filter(blog__name='Beatles Blog')
Doc URL
https://docs.djangoproject.com/en/2.1/topics/db/queries/
Add a call to the QuerySet's select_related() method, though I'm not positive that grabs references in both directions, it is the most likely answer.

Categories

Resources