How to use Django annotate with a foreign key - Python

Consider simple Django models
class Journey(models.Model):
    vrn = models.CharField(max_length=200)  # Vehicle Reg No
    kilo = models.FloatField()
class J_user(models.Model):
    jdi = models.ForeignKey(Journey, related_name="Journey_User", on_delete=models.DO_NOTHING)
    uid = models.IntegerField()
Annotating within a single table is easy. For example, to sum the total kilometers driven by each vehicle (vrn is the vehicle's registration number):
Journey.objects.values('vrn').annotate(Total_kilo=Sum('kilo'))
Now I want a query that returns how many kilometers each user has traveled in each car.
Let Data of Journey table
Data of J_user table
Then the result should be
Thanks for your help.

This is your query:
(
    Journey.objects
    .order_by()  # important: clears any default ordering so sort fields are not added to GROUP BY
    .values('vrn', 'Journey_User__uid')  # the reverse lookup uses related_name="Journey_User"
    .annotate(Total_kilo=Sum('kilo'))
)
Fields passed to values() are included in the GROUP BY clause. Sample:
print(
    Material.objects
    .values("uf_id", "uf__mp__id")
    .annotate(Sum("total_social_per_c"))
    .query
)
Result:
SELECT "material_material"."uf_id",
       "ufs_uf"."mp_id",
       Sum("material_material"."total_social_per_c") AS "total_social_per_c__sum"
FROM "material_material"
INNER JOIN "ufs_uf"
    ON ("material_material"."uf_id" = "ufs_uf"."id")
GROUP BY "material_material"."uf_id",
         "ufs_uf"."mp_id"

According to your models it should be:
J_user.objects.values('uid', vrn=F('jdi__vrn')).annotate(kilo=Sum('jdi__kilo'))
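To see what this kind of grouped query computes, here is a minimal sketch that runs roughly equivalent SQL against an in-memory SQLite database using only the standard library. The table names (`journey`, `j_user`) and the sample rows are assumptions made up for illustration; Django would prefix the tables with the app label, and the asker's actual data is not shown in the question.

```python
import sqlite3

# In-memory database standing in for the two Django tables.
# Table names and rows are hypothetical, chosen only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE journey (id INTEGER PRIMARY KEY, vrn TEXT, kilo REAL);
    CREATE TABLE j_user (
        id INTEGER PRIMARY KEY,
        jdi_id INTEGER REFERENCES journey(id),
        uid INTEGER
    );
    INSERT INTO journey (id, vrn, kilo) VALUES
        (1, 'AB-1', 10.0), (2, 'AB-1', 5.0), (3, 'CD-2', 7.5);
    INSERT INTO j_user (jdi_id, uid) VALUES
        (1, 100), (2, 100), (2, 200), (3, 200);
""")

# Group by user and vehicle, summing the kilometers of the joined journeys.
rows = conn.execute("""
    SELECT j_user.uid, journey.vrn, SUM(journey.kilo) AS kilo
    FROM j_user
    INNER JOIN journey ON j_user.jdi_id = journey.id
    GROUP BY j_user.uid, journey.vrn
    ORDER BY j_user.uid, journey.vrn
""").fetchall()

for uid, vrn, kilo in rows:
    print(uid, vrn, kilo)
```

Each output row is one (user, vehicle) pair with the kilometers summed over that user's journeys in that vehicle, which is the grouping that values() followed by annotate() asks the database for.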

Related

Are nested aggregate queries possible with a Django queryset?

I want to calculate the monthly profit from the following models using Django queryset methods. The tricky part is that the Order table has a FreightSellOverride field, which overrides the sum of FreightSell in the OrderItem table. An order may contain multiple order items, so I have to calculate the profit per order first and then aggregate it per month, because any order-level FreightSellOverride value must be taken into account.
Below is my attempt using annotate, but I could not work out how to reproduce this SQL. Does Django allow this kind of nested aggregate query?
select sales_month
    ,sum(sumSellPrice-sumNetPrice-sumFreighNet+coalesce(FreightSellOverride,sumFreightSell)) as profit
from
(
    select CAST(DATE_FORMAT(b.CreateDate, '%Y-%m-01 00:00:00') AS DATETIME) AS `sales_month`,
        a.order_id,b.FreightSellOverride
        ,sum(SellPrice) as sumSellPrice,sum(NetPrice) as sumNetPrice
        ,sum(FreightNet) as sumFreighNet,sum(FreightSell) as sumFreightSell
    from OrderItem a
    inner join Order b
        on a.order_id=b.id
    group by 1,2,3
) c
group by sales_month
I tried this
result = (
    OrderItem.objects
    .annotate(sales_month=TruncMonth('order__CreateDate'))
    .values('sales_month', 'order', 'order__FreightSellOverride')
    .annotate(
        sumSellPrice=Sum('SellPrice'),
        sumNetPrice=Sum('NetPrice'),
        sumFreighNet=Sum('FreightNet'),
        sumFreightSell=Sum('FreightSell'),
    )
    .values('sales_month')
    .annotate(profit=Sum(F('sumSellPrice') - F('sumNetPrice') - F('sumFreighNet') + Coalesce('order__FreightSellOverride', 'sumFreightSell')))
)
but get this error
Exception Type: FieldError
Exception Value:
Cannot compute Sum('<CombinedExpression: F(sumSellPrice) - F(sumNetPrice) - F(sumFreighNet) + Coalesce(F(ProjectId__FreightSellOverride), F(sumFreightSell))>'): '<CombinedExpression: F(sumSellPrice) - F(sumNetPrice) - F(sumFreighNet) + Coalesce(F(ProjectId__FreightSellOverride), F(sumFreightSell))>' is an aggregate
from django.db import models
from django.db.models import F, Count, Sum
from django.db.models.functions import TruncMonth, Coalesce
class Order(models.Model):
    CreateDate = models.DateTimeField(verbose_name="Create Date")
    FreightSellOverride = models.FloatField()
class OrderItem(models.Model):
    SellPrice = models.DecimalField(max_digits=10, decimal_places=2)
    FreightSell = models.DecimalField(max_digits=10, decimal_places=2)
    NetPrice = models.DecimalField(max_digits=10, decimal_places=2)
    FreightNet = models.DecimalField(max_digits=10, decimal_places=2)
    order = models.ForeignKey(Order, on_delete=models.DO_NOTHING, related_name="Item")
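The two-level aggregation in the SQL above (sum per order first, then sum per month) can be sketched in plain Python to make the intended arithmetic explicit. This is not a Django ORM solution, only an illustration of the calculation; the rows are hypothetical, and `None` is assumed to mean "no order-level override", matching the coalesce in the SQL.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order items, one tuple per row:
# (order_id, order_create_date, freight_sell_override,
#  sell_price, net_price, freight_net, freight_sell)
items = [
    (1, date(2020, 1, 5), None, 100.0, 60.0, 5.0, 8.0),
    (1, date(2020, 1, 5), None, 50.0, 30.0, 2.0, 4.0),
    (2, date(2020, 1, 20), 10.0, 80.0, 50.0, 3.0, 6.0),
    (2, date(2020, 1, 20), 10.0, 20.0, 10.0, 1.0, 2.0),
    (3, date(2020, 2, 2), None, 40.0, 25.0, 2.0, 3.0),
]

# Stage 1 (inner query): sum the item columns per order.
per_order = {}
for order_id, created, override, sell, net, f_net, f_sell in items:
    acc = per_order.setdefault(order_id, {
        "month": created.replace(day=1),  # truncate the date to its month
        "override": override,
        "sell": 0.0, "net": 0.0, "f_net": 0.0, "f_sell": 0.0,
    })
    acc["sell"] += sell
    acc["net"] += net
    acc["f_net"] += f_net
    acc["f_sell"] += f_sell

# Stage 2 (outer query): sum the per-order profit per month,
# using the override when present, else the summed item freight.
profit_by_month = defaultdict(float)
for acc in per_order.values():
    freight = acc["override"] if acc["override"] is not None else acc["f_sell"]
    profit_by_month[acc["month"]] += acc["sell"] - acc["net"] - acc["f_net"] + freight

for month, profit in sorted(profit_by_month.items()):
    print(month, profit)
```

The override is applied once per order in stage 2, which is why the per-order sums have to exist before the monthly sum is taken.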

How to implement cross join in django for a count annotation

I present a simplified version of my problem. I have venues and timeslots and users and bookings, as shown in the model descriptions below. Time slots are universal for all venues, and users can book into a time slot at a venue up until the venue capacity is reached.
class Venue(models.Model):
    name = models.CharField(max_length=200)
    capacity = models.PositiveIntegerField(default=0)
class TimeSlot(models.Model):
    start_time = models.TimeField()
    end_time = models.TimeField()
class Booking(models.Model):
    user = models.ForeignKey(User)
    time_slot = models.ForeignKey(TimeSlot)
    venue = models.ForeignKey(Venue)
Now I would like to get, as efficiently as possible, all possible combinations of Venues and TimeSlots and annotate each combination with the count of bookings made for it, including the case where the number of bookings is 0.
I have managed to achieve this in raw SQL using a cross join on the Venue and TimeSlot tables, something to the effect of the query below. However, despite exhaustive searching, I have not been able to find a Django equivalent.
SELECT venue.name, timeslot.start_time, timeslot.end_time, count(booking.id)
FROM myapp_venue as venue
CROSS JOIN myapp_timeslot as timeslot
LEFT JOIN myapp_booking as booking
    on booking.time_slot_id = timeslot.id
    and booking.venue_id = venue.id
GROUP BY venue.name, timeslot.start_time, timeslot.end_time
I can also annotate a query to retrieve the booking count for every combination that has at least one booking, but the combinations with 0 bookings are excluded. Example:
qs = Booking.objects.all().values(
    venue=F('venue__name'),
    start_time=F('time_slot__start_time'),
    end_time=F('time_slot__end_time')
).annotate(bookings=Count('id')) \
    .order_by('venue', 'start_time', 'end_time')
How can I achieve the effect of the CROSS JOIN query using the django ORM?
I don't believe Django can do cross joins without dropping down to raw SQL. I can give you two ideas that could point you in the right direction, though:
Combination of queries and Python loops.
venues = Venue.objects.all()
time_slots = TimeSlot.objects.all()
qs = ...  # your custom query above
# Loop through both querysets to create a master list.
venue_time_slots = []
for venue in venues:
    for time_slot in time_slots:
        # Use a list (not a tuple) so the count can be updated later.
        venue_time_slots.append([venue.name, time_slot.start_time, time_slot.end_time, 0])
# Loop through the master list and compare it to the custom qs to update the count.
for venue_time in venue_time_slots:
    for vt in qs:
        # Check if venue and time slot match; qs yields dicts because of .values().
        if (venue_time[0] == vt['venue']
                and venue_time[1] == vt['start_time']
                and venue_time[2] == vt['end_time']):
            venue_time[3] += vt['bookings']
            break
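The nested loops above compare every combination with every row of the annotated query. A possible variation, sketched here in plain Python with hypothetical stand-in data rather than real querysets, is to build the combinations with `itertools.product` and look the counts up in a dict keyed by (venue, start_time, end_time), so each combination needs only one lookup.

```python
from itertools import product

# Hypothetical stand-ins for Venue and TimeSlot rows.
venues = ["Hall A", "Hall B"]
time_slots = [("09:00", "10:00"), ("10:00", "11:00")]

# Stand-in for the annotated queryset, which yields dicts because of .values().
booked = [
    {"venue": "Hall A", "start_time": "09:00", "end_time": "10:00", "bookings": 3},
    {"venue": "Hall B", "start_time": "10:00", "end_time": "11:00", "bookings": 1},
]

# Index the existing counts by their (venue, start, end) combination.
counts = {
    (b["venue"], b["start_time"], b["end_time"]): b["bookings"]
    for b in booked
}

# Cross join in Python: every venue paired with every time slot,
# defaulting to 0 where no booking row exists.
grid = [
    {"venue": v, "start_time": s, "end_time": e, "bookings": counts.get((v, s, e), 0)}
    for v, (s, e) in product(venues, time_slots)
]

for row in grid:
    print(row)
```

With real querysets, `venues` and `time_slots` would come from the two `.all()` queries and `booked` from the annotated query in the question.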
The harder approach, for which I don't have a full solution, is to use a combination of filter, exclude, and union. I have only used this with 3 tables (two parents with a child link table), whereas you have 4 including User, so I can only provide the logic and not a working example.
# Get all results that exist in table using .filter().
first_query.filter()
# Get all results that do not exist by using .exclude().
# You can use your results from the first query to exclude also, but
# would need to create an interim list.
exclude_ids = [fq_row.id for fq_row in first_query]
second_query.exclude(id__in=exclude_ids)
# Combine both queries
query = first_query.union(second_query)
return query

Django & Postgres - percentile (median) and group by

I need to calculate period medians per seller ID (see the simplified model below). The problem is that I am unable to construct the ORM query.
Model
class MyModel(models.Model):
    period = models.IntegerField(null=True, default=None)
    seller_ids = ArrayField(models.IntegerField(), default=list)
    aux = JSONField(default=dict)
Query
queryset = (
    MyModel.objects.filter(period=25)
    .annotate(seller_id=Func(F("seller_ids"), function="unnest"))
    .values("seller_id")
    .annotate(
        duration=Cast(KeyTextTransform("duration", "aux"), IntegerField()),
        median=Func(
            F("duration"),
            function="percentile_cont",
            template="%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)",
        ),
    )
    .values("median", "seller_id")
)
ArrayField aggregation (seller_id) source
I think what I need to do is something along the lines below
select t.*, p_25, p_75
from t join
    (select district,
            percentile_cont(0.25) within group (order by sales) as p_25,
            percentile_cont(0.75) within group (order by sales) as p_75
     from t
     group by district
    ) td
    on t.district = td.district
above example source
Python 3.7.5, Django 2.2.8, Postgres 11.1
You can create a Median child class of the Aggregate class as was done by Ryan Murphy (https://gist.github.com/rdmurphy/3f73c7b1826cacee34f6c2a855b12e2e). Median then works just like Avg:
from django.db.models import Aggregate, FloatField
class Median(Aggregate):
    function = 'PERCENTILE_CONT'
    name = 'median'
    output_field = FloatField()
    template = '%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)'
Then to find the median of a field use
my_model_aggregate = MyModel.objects.all().aggregate(Median('period'))
which is then available as my_model_aggregate['period__median'].
Here's what did the trick.
from django.contrib.postgres.fields.jsonb import KeyTextTransform  # import path for Django 2.2
from django.db.models import F, Func, IntegerField
from django.db.models.aggregates import Aggregate
from django.db.models.functions import Cast
queryset = (
    MyModel.objects.filter(period=25)
    .annotate(duration=Cast(KeyTextTransform("duration", "aux"), IntegerField()))
    .filter(duration__isnull=False)
    .annotate(seller_id=Func(F("seller_ids"), function="unnest"))
    .values("seller_id")  # group by
    .annotate(
        median=Aggregate(
            F("duration"),
            function="percentile_cont",
            template="%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)",
        ),
    )
)
Notice the median annotation employs Aggregate and not Func as in the question.
Also, the order of the annotate() and filter() clauses, as well as the order of the annotate() and values() clauses, matters a lot!
BTW, the resulting SQL contains neither a nested select nor a join.
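As a way to sanity-check the numbers such a query returns, the same "unnest, group by seller, take the median" logic can be sketched in plain Python. The rows below are hypothetical. `statistics.median` averages the two middle values for an even number of items, which matches what `percentile_cont(0.5)` produces by linear interpolation.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows: (seller_ids, duration), mirroring the ArrayField
# and the "duration" key extracted from the JSON column.
rows = [
    ([1, 2], 10),
    ([1], 20),
    ([2, 3], 30),
    ([1, 3], 40),
]

# "unnest": emit one (seller_id, duration) pair per array element,
# then collect the durations per seller (the GROUP BY step).
durations_by_seller = defaultdict(list)
for seller_ids, duration in rows:
    for seller_id in seller_ids:
        durations_by_seller[seller_id].append(duration)

# Median per seller.
medians = {
    seller_id: median(values)
    for seller_id, values in sorted(durations_by_seller.items())
}

for seller_id, value in medians.items():
    print(seller_id, value)
```

Comparing a few sellers computed this way against the database result is a cheap check that the grouping happens per unnested seller ID and not per original row.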

How to write ORDER BY and GROUP BY as a queryset in Django?

This is my query:
SELECT kategoriharga,ongkoskirim,diskon,ratingproduk,ratingtoko,label
FROM
    (SELECT *
     FROM pohonkeputusan
     where perdaerah='Kabupaten Toba Samosir'
     order by label desc
    ) AS sub
GROUP BY
    kategoriharga,ongkoskirim,diskon,ratingproduk,ratingtoko
How can I write this as a Django queryset?
I don't understand why you want to group by all fields. Try to use distinct:
Pohonkeputusan.objects.filter(perdaerah='Kabupaten Toba Samosir').order_by('-label').values_list('kategoriharga', 'ongkoskirim', 'diskon', 'ratingproduk', 'ratingtoko').distinct()

Django ORM: sort by aggregate of filter of related table

Here's a subset of my model:
class Case(models.Model):
    ... # primary key is named "id"
class Employee(models.Model):
    ... # primary key is named "id"
class Report(models.Model):
    case = ForeignKey(Case, null=True)
    employee = ForeignKey(Employee)
    date = DateField()
Given a particular employee, I want to produce a list of all cases, ordered by when the employee has most recently reported on it. Those cases for which no report exists should be sorted last. Cases on the same date (including NULL) should be sorted by further criteria.
Can I express this in the Django ORM API? If so, how?
In pseudo-SQL, I think I want
Select Case.*
From Case some-kind-of-join Report
Where report.employee_id = the_given_employee_id
Group by Case.id
Order by Max(Report.date) Desc /* Report-less cases last */, Case.id /* etc. */
Do I need to introduce a many-to-many relation from Case to Employee through Report to do this in Django ORM?
Every relationship in a Django model has a reverse relationship that can be easily queried (including when you are ordering), so you can do something like:
Case.objects.all().order_by('-report__date', 'another_field', 'a third field')
but this won't get you any information about a single particular employee. You could do this:
Case.objects.filter(report__employee__pk=5).order_by('-report__date', 'another_field', 'a third field')
but this won't return any Case objects that aren't edited by your particular employee.
Unfortunately, you can't natively do subqueries, so you will have to write a custom annotation query to perform the subquery (i.e. the last edit date for each object last edited by a particular employee). This is untested, but it's the general idea:
Case \
    .objects \
    .all() \
    .extra(
        select={
            # Correlated subquery: latest report date per case for this employee.
            "employee_last_edit": """
                SELECT MAX(app_report.date)
                FROM app_report
                WHERE app_report.case_id = app_case.id
                  AND app_report.employee_id = %s
            """
        },
        select_params=[employee.id],  # passed as a query parameter instead of string formatting
    ) \
    .order_by('-employee_last_edit', 'something_else')
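The pseudo-SQL from the question can also be tried directly against an in-memory SQLite database, which makes the "report-less cases last" ordering easy to verify. This is a sketch using only the standard library. The table names `app_case` and `app_report` follow the guess made in the answer above, and the rows are hypothetical. The employee condition sits in the LEFT JOIN's ON clause so that cases without a matching report are kept with a NULL date.

```python
import sqlite3

# In-memory tables standing in for the Case and Report models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_case (id INTEGER PRIMARY KEY);
    CREATE TABLE app_report (
        id INTEGER PRIMARY KEY,
        case_id INTEGER REFERENCES app_case(id),
        employee_id INTEGER,
        date TEXT
    );
    INSERT INTO app_case (id) VALUES (1), (2), (3);
    INSERT INTO app_report (case_id, employee_id, date) VALUES
        (1, 5, '2020-01-10'),
        (1, 5, '2020-03-01'),
        (2, 5, '2020-02-15'),
        (3, 7, '2020-04-01');
""")

employee_id = 5  # the given employee (hypothetical id)

# Latest report date per case for this employee; cases the employee
# never reported on get NULL and are explicitly sorted last.
rows = conn.execute("""
    SELECT app_case.id, MAX(app_report.date) AS last_report
    FROM app_case
    LEFT JOIN app_report
        ON app_report.case_id = app_case.id
       AND app_report.employee_id = ?
    GROUP BY app_case.id
    ORDER BY MAX(app_report.date) IS NULL,
             MAX(app_report.date) DESC,
             app_case.id
""", (employee_id,)).fetchall()

for case_id, last_report in rows:
    print(case_id, last_report)
```

The explicit `IS NULL` sort key is used because databases differ in where NULLs land under `DESC`, so relying on the default would not be portable.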
