Django aggregation query on related one-to-many objects - python

Here is my simplified model:
class Item(models.Model):
    pass
class TrackingPoint(models.Model):
    item = models.ForeignKey(Item)
    created = models.DateField()
    data = models.IntegerField()
    class Meta:
        unique_together = ('item', 'created')
In many parts of my application I need to retrieve a set of Items and annotate each one with the data field of its latest TrackingPoint, where "latest" is decided by the created field. For example, instance i1 of class Item has 3 TrackingPoints:
tp1 = TrackingPoint(item=i1, created=date(2010,5,15), data=23)
tp2 = TrackingPoint(item=i1, created=date(2010,5,14), data=21)
tp3 = TrackingPoint(item=i1, created=date(2010,5,12), data=120)
I need a query that retrieves the i1 instance annotated with the tp1.data value, since tp1 is the latest tracking point by created. The query should also return Items that have no TrackingPoints at all. If possible, I would prefer not to use QuerySet's extra() method for this.
That's what I tried so far... and failed :(
Item.objects.annotate(max_created=Max('trackingpoint__created'),
                      data=Avg('trackingpoint__data')).filter(trackingpoint__created=F('max_created'))
Any ideas?

Here's a single query that will provide (TrackingPoint, Item)-pairs:
TrackingPoint.objects.annotate(max=Max('item__trackingpoint__created')).filter(max=F('created')).select_related('item').order_by('created')
You would have to query for items without TrackingPoints separately.

This doesn't directly answer your question, but in case you don't need exactly what you described, you might be interested in a greatest-n-per-group solution. Take a look at my answer to a similar question:
Django Query That Get Most Recent Objects From Different Categories
It should apply directly to your case:
items = Item.objects.annotate(tracking_point_created=Max('trackingpoint__created'))
trackingpoints = TrackingPoint.objects.filter(created__in=[b.tracking_point_created for b in items])
Note that the second line can return ambiguous results if the same created date appears on more than one TrackingPoint, because it matches on the date alone rather than on the (item, date) pair.
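The ambiguity can be seen without a database. The sketch below uses plain Python tuples as stand-ins for TrackingPoint rows (item 2 and its values are invented for illustration, they are not in the question) and compares filtering on the date alone, which is what created__in does, with matching on the (item, created) pair:

```python
from datetime import date

# Stand-in rows: (item_id, created, data). Item 2 is hypothetical and
# exists only to make two items share a date.
points = [
    (1, date(2010, 5, 15), 23),
    (1, date(2010, 5, 14), 21),
    (1, date(2010, 5, 12), 120),
    (2, date(2010, 5, 14), 7),
]

# Rough equivalent of annotate(Max('trackingpoint__created')):
# the latest created date per item.
latest = {}
for item_id, created, _ in points:
    if item_id not in latest or created > latest[item_id]:
        latest[item_id] = created

# Rough equivalent of filter(created__in=[...]): matches on the date alone.
latest_dates = set(latest.values())
by_date_only = [p for p in points if p[1] in latest_dates]

# Matching on the (item, created) pair keeps exactly one row per item.
by_pair = [p for p in points if latest[p[0]] == p[1]]
```

Here by_date_only contains three rows, including item 1's older point from 14 May, because that date happens to be item 2's latest. by_pair contains only the two rows that are really the latest for their item.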

Related

Django prefetch_related and N+1 - How is it solved?

I am sitting with a query looking like this:
# Get the amount of kilo attached to products
product_data = {}
for productSpy in ProductSpy.objects.all():
    product_data[productSpy.product.product_id] = productSpy.kilo # RERUN
I do not see how I could use prefetch_related on my last line. The examples in the docs are very simplified and make sense on their own, but I do not understand the concept well enough to apply it here. Could someone please explain what is being done and how? I find this very important to understand, and this is the first N+1 problem I have met.
Thank you in advance for your time.
models.py
class ProductSpy(models.Model):
    created_by = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    product = models.ForeignKey(Product, on_delete=models.CASCADE)
    def __str__(self):
        return self.kilo
class Product(models.Model):
    product_id = models.IntegerField()
    name = models.CharField(max_length=150)
    def __str__(self):
        return self.name
Django fetches related objects lazily, at the moment they are accessed:
each access to productSpy.product fetches a row from the product table using the foreign key stored on the ProductSpy row (productSpy.product_id).
Because every such access is a separate database round trip, this code is inefficient. Using prefetch_related fetches the products for all the ProductSpy objects in one additional query, which performs better.
# Get the amount of kilo attached to products
product_data = {}
# prefetch_related returns a new QuerySet, so the result has to be kept.
# kilo is a plain field, not a relation, so it does not need prefetching.
product_spies = ProductSpy.objects.prefetch_related('product')
for productSpy in product_spies:
    product_data[productSpy.product.product_id] = productSpy.kilo
When you write productSpy.product and the related object has not been fetched yet, Django automatically queries the database for the related Product instance. So if ProductSpy.objects.all() returns N instances, accessing productSpy.product in a loop makes N more queries. This is what we call the N + 1 problem.
Moving further: although you can use prefetch_related (2 queries in your case), it would be better here to use select_related [Django docs], which uses a SQL JOIN and gets the related instances in the same single query:
product_data = {}
queryset = ProductSpy.objects.select_related('product')
for productSpy in queryset:
    product_data[productSpy.product.product_id] = productSpy.kilo # No extra queries as we used `select_related`
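To make the difference in query counts visible, here is a minimal sketch that uses the standard-library sqlite3 module instead of the Django ORM. The schema, the rows and the column name product_fk are assumptions made up for the demo. It only imitates the two query shapes; it is not what Django literally sends.

```python
import sqlite3


class CountingDb:
    """Tiny wrapper that counts how many SELECTs are sent to SQLite."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.count = 0

    def run(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


db = CountingDb()
# Hypothetical schema and rows; product_fk plays the role of the foreign key column.
db.conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY, product_id INTEGER, name TEXT);
    CREATE TABLE productspy (id INTEGER PRIMARY KEY, product_fk INTEGER, kilo INTEGER);
    INSERT INTO product VALUES (1, 100, 'apples'), (2, 200, 'pears'), (3, 300, 'plums');
    INSERT INTO productspy VALUES (1, 1, 5), (2, 2, 8), (3, 3, 2);
""")

# N+1 shape: one query for the spies, then one more per spy for its product.
lazy = {}
for product_fk, kilo in db.run("SELECT product_fk, kilo FROM productspy"):
    rows = db.run("SELECT product_id FROM product WHERE id = ?", (product_fk,))
    lazy[rows[0][0]] = kilo
lazy_queries = db.count

# JOIN shape: the same data in a single query.
db.count = 0
joined = dict(
    db.run(
        "SELECT p.product_id, s.kilo FROM productspy s "
        "JOIN product p ON p.id = s.product_fk"
    )
)
join_queries = db.count
```

With three ProductSpy rows, the lazy loop issues 1 + 3 = 4 queries while the join issues 1, and both build the same dictionary.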
Note: There seems to be some problem with your logic here though, as multiple ProductSpy instances can have the same Product,
hence your loop might overwrite some values.
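A small sketch of that overwrite, using plain (product_id, kilo) pairs with invented values instead of model instances. Whether summing per product is what you actually want is an assumption on my part; it is just one way to avoid losing rows:

```python
from collections import defaultdict

# Stand-in (product_id, kilo) pairs; the values are invented for illustration.
rows = [(100, 5), (200, 8), (100, 3)]

# Plain assignment keeps only the last kilo seen for each product.
last_wins = {}
for product_id, kilo in rows:
    last_wins[product_id] = kilo

# Summing per product keeps every row's contribution.
totals = defaultdict(int)
for product_id, kilo in rows:
    totals[product_id] += kilo
```

Product 100 ends up with 3 in last_wins (the 5 was overwritten) but with 8 in totals.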

Get all values from Django QuerySet plus additional fields from a related model

I was wondering if there is a shortcut to getting all fields from a Django model and only defining additional fields that are retrieved through a join (or multiple joins).
Consider models like the following:
class A(models.Model):
    text = models.CharField(max_length=10, blank=True)
class B(models.Model):
    a = models.ForeignKey(A, null=True, on_delete=models.CASCADE)
    y = models.PositiveIntegerField(null=True)
Now I can use the values() function like this
B.objects.values('y', 'a__text')
to get dictionaries containing the specified field from the B model and the text field from the A model. If I only use
B.objects.values()
I only get dictionaries containing fields from the B model (i.e., y and the foreign key id a_id). Let's assume a scenario where B and A have many fields, and I am interested in all of those belonging to B but only in a single field from A. Manually specifying all the field names in the values() call would be possible, but tedious and error-prone.
So is there a way to specify that I want all local fields, but only a (few) specific joined field(s)?
Note: I'm currently using Django 1.11, but if a solution only works with a more recent version I am interested in that too.
You can use prefetch_related for this. See docs:
You want to use performance optimization techniques like deferred
fields:
queryset = Pizza.objects.only('name')
restaurants = Restaurant.objects.prefetch_related(Prefetch('best_pizza', queryset=queryset))
In your case you can do something like this:
from django.db.models import Prefetch
queryset = A.objects.only('text')
b_list = B.objects.prefetch_related(Prefetch('a', queryset=queryset))
Maybe something like this would work in your case?
B.objects.select_related('a').defer('a__field_to_lazy_load')
This will load all fields from both models except the ones you specify in defer(), where you can use the usual Django double underscore convention to traverse the relationship.
The fields you specify in defer() won't be loaded from the db but they will be if you try to access them later on (e.g. in a template).
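For reference, in SQL terms the question asks for "every column of B plus one column of A". The sketch below shows that target shape with the standard-library sqlite3 module. The table layout mirrors the models in the question, but the extra column other, the alias a_text and the rows are assumptions made up for the demo. It only illustrates the result being asked for, not how any of the ORM suggestions above are implemented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, text TEXT, other TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER, y INTEGER);
    INSERT INTO a VALUES (1, 'hello', 'unused');
    INSERT INTO b VALUES (1, 1, 42), (2, NULL, 7);
""")

# All local columns of b, plus a single joined column from a.
# LEFT JOIN keeps b rows whose foreign key is NULL.
rows = conn.execute("""
    SELECT b.*, a.text AS a_text
    FROM b LEFT JOIN a ON a.id = b.a_id
    ORDER BY b.id
""").fetchall()
result = [dict(row) for row in rows]
```

Each dictionary in result has the keys id, a_id, y and a_text, and the column other from a is never selected.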

Django queryset filter from two models

In my Django app I have two models and I don't know how to write a query that selects the right records. This is the code:
class tab1(models.Model):
    id_tab1 = models.AutoField(primary_key=True)
    name = models.CharField(max_length=50)
class tab2(models.Model):
    id_tab1 = models.ForeignKey(tab1)
    type = models.IntegerField()
I would like to select the tab1 records that have tab2.type equal to some condition. How can I do this in Django?
your_queryset = tab1.objects.filter(tab2__type=value)
See the relevant documentation here
In a few words: you can span relationships either way (i.e. from each end of a foreign key).
The condition is specified in the named argument to filter(). The one I suggested above is the simplest one (i.e. equality), but there are quite a few more (e.g. startswith, contains, etc). Please read here
Suppose the values 1 and 2 are stored in the type field. The following shows one way of getting what you need for type=1. Note that type lives on tab2, so the inner filter has to start from tab2:
filtered_objs = tab2.objects.filter(type=1)
tab1.objects.filter(tab2__in=filtered_objs)

OneToMany queries in Django

I'm having some trouble working out the best way to do queries with one to many relationships in Django. Best explained by an example:
class Item(models.Model):
    name = models.CharField(max_length=30)
class Attribute(models.Model):
    item = models.ForeignKey(Item)
    name = models.CharField(max_length=30)
Items can have multiple attributes. Let's say each attribute is specific to one item, so ManyToMany is not appropriate here. How would I find all items that have an attribute with name=a1 and also have an attribute with name=a2?
Something like this:
a1_objects = Attribute.objects.filter(name="a1").values("item__id")
a2_objects = Attribute.objects.filter(name="a2").values("item__id")
# Take the intersection (does this method of taking an intersection work?)
ids_with_a1_and_a2 = [id for id in a1_objects if id in a2_objects]
# Get item objects with those ids
results = Item.objects.filter(id__in=ids_with_a1_and_a2)
Surely there is a better way than my suggested approach? It doesn't seem efficient to me.
Check this section in the docs: Spanning multi-valued relationships
Unless I'm missing something, filtering Item should work:
Item.objects.filter(attribute__name="a1").filter(attribute__name="a2")
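As the docs section on spanning multi-valued relationships explains, conditions in chained filter() calls may each be satisfied by a different related object, while conditions inside a single filter() call must all be satisfied by the same related object. The sketch below shows the two shapes as hand-written SQL run through the standard-library sqlite3 module. The rows are invented and the SQL is only an approximation of the idea, not the exact SQL Django generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE attribute (id INTEGER PRIMARY KEY, item_id INTEGER, name TEXT);
    INSERT INTO item VALUES (1, 'both'), (2, 'only_a1'), (3, 'only_a2');
    INSERT INTO attribute VALUES
        (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'a1'), (4, 3, 'a2');
""")

# Two separate joins, one per condition: some attribute is 'a1' AND some
# (possibly different) attribute is 'a2'. This is the chained-filter idea.
both = conn.execute("""
    SELECT DISTINCT i.name
    FROM item i
    JOIN attribute x ON x.item_id = i.id AND x.name = 'a1'
    JOIN attribute y ON y.item_id = i.id AND y.name = 'a2'
""").fetchall()

# One join with both conditions on the same row can never match,
# because a single attribute cannot be named 'a1' and 'a2' at once.
same_row = conn.execute("""
    SELECT DISTINCT i.name
    FROM item i
    JOIN attribute x ON x.item_id = i.id
    WHERE x.name = 'a1' AND x.name = 'a2'
""").fetchall()
```

Only the item that owns both attributes comes back from the two-join query, and the single-join query returns nothing.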

How to do this join query in Django

In Django, I have two models:
class Product(models.Model):
    name = models.CharField(max_length = 50)
    categories = models.ManyToManyField(Category)
class ProductRank(models.Model):
    product = models.ForeignKey(Product)
    rank = models.IntegerField(default = 0)
I put the rank into a separate table because every view of a page will cause the rank to change and I was worried that all these writes would make my other (mostly read) queries slow down.
I gather a list of Products from a simple query:
cat = Category.objects.get(pk = 1)
products = Product.objects.filter(categories = cat)
I would now like to get all the ranks for these products. I would prefer to do it all in one go (using a SQL join) and was wondering how to express that using Django's query mechanism.
What is the right way to do this in Django?
This can be done in Django, but you will need to restructure your models a little bit differently:
class Product(models.Model):
    name = models.CharField(max_length=50)
    product_rank = models.OneToOneField('ProductRank')
class ProductRank(models.Model):
    rank = models.IntegerField(default=0)
Now, when fetching Product objects, you can follow the one-to-one relationship in one query using the select_related() method:
Product.objects.filter([...]).select_related()
This will produce one query that fetches product ranks using a join:
SELECT "example_product"."id", "example_product"."name", "example_product"."product_rank_id", "example_productrank"."id", "example_productrank"."rank" FROM "example_product" INNER JOIN "example_productrank" ON ("example_product"."product_rank_id" = "example_productrank"."id")
I had to move the relationship field between Product and ProductRank to the Product model because it looks like select_related() follows foreign keys in one direction only.
I haven't checked but:
products = Product.objects.filter(categories__pk=1).select_related()
Should grab every instance.
For Django 2.1
From documentation
This example retrieves all Entry objects with a Blog whose name is 'Beatles Blog':
Entry.objects.filter(blog__name='Beatles Blog')
Doc URL
https://docs.djangoproject.com/en/2.1/topics/db/queries/
Add a call to the QuerySet's select_related() method. I'm not positive that it follows references in both directions, but it is the most likely answer.
