I want to calculate monthly profit from the following models using Django queryset methods. The tricky part is the freightselloverride field on the order table: when it is set, it overrides the sum of freightsell from the orderItem table. An order may contain multiple orderItems, so I have to calculate the profit per order first, taking any order-level freightselloverride into account, and only then sum it per month.
Below is my attempt using annotate, but I could not work out how to reproduce this SQL. Does Django allow this kind of nested aggregate query?
select sales_month
,sum(sumSellPrice-sumNetPrice-sumFreighNet+coalesce(FreightSellOverride,sumFreightSell)) as profit
from
(
select CAST(DATE_FORMAT(b.CreateDate, '%Y-%m-01 00:00:00') AS DATETIME) AS `sales_month`,
a.order_id,b.FreightSellOverride
,sum(SellPrice) as sumSellPrice,sum(NetPrice) as sumNetPrice
,sum(FreightNet) as sumFreighNet,sum(FreightSell) as sumFreightSell
from OrderItem a
inner join Order b
on a.order_id=b.id
group by 1,2,3
) c
group by sales_month
I tried this
result = (OrderItem.objects
    .annotate(sales_month=TruncMonth('order__CreateDate'))
    .values('sales_month','order','order__FreightSellOverride')
    .annotate(sumSellPrice=Sum('SellPrice'),sumNetPrice=Sum('NetPrice'),sumFreighNet=Sum('FreightNet'),sumFreightSell=Sum('FreightSell'))
    .values('sales_month')
    .annotate(profit=Sum(F('sumSellPrice')-F('sumNetPrice')-F('sumFreighNet')+Coalesce('order__FreightSellOverride','sumFreightSell')))
)
but get this error
Exception Type: FieldError
Exception Value:
Cannot compute Sum('<CombinedExpression: F(sumSellPrice) - F(sumNetPrice) - F(sumFreighNet) + Coalesce(F(ProjectId__FreightSellOverride), F(sumFreightSell))>'): '<CombinedExpression: F(sumSellPrice) - F(sumNetPrice) - F(sumFreighNet) + Coalesce(F(ProjectId__FreightSellOverride), F(sumFreightSell))>' is an aggregate
from django.db import models
from django.db.models import F, Count, Sum
from django.db.models.functions import TruncMonth, Coalesce

class Order(models.Model):
    CreateDate = models.DateTimeField(verbose_name="Create Date")
    FreightSellOverride = models.FloatField()

class OrderItem(models.Model):
    SellPrice = models.DecimalField(max_digits=10,decimal_places=2)
    FreightSell = models.DecimalField(max_digits=10,decimal_places=2)
    NetPrice = models.DecimalField(max_digits=10,decimal_places=2)
    FreightNet = models.DecimalField(max_digits=10,decimal_places=2)
    order = models.ForeignKey(Order,on_delete=models.DO_NOTHING,related_name="Item")
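For reference, the two-step calculation the SQL above performs (per-order totals with the override applied, then a monthly sum) can be sketched in plain Python. This is not an ORM solution, only an illustration of the result the query is expected to give. The sample rows and the monthly_profit helper are made up, and the override is held as a Decimal here rather than the float the model declares, so that it can be added to the Decimal sums.

```python
from collections import defaultdict
from datetime import datetime
from decimal import Decimal

# Made-up sample data.
# orders: order id -> (CreateDate, FreightSellOverride or None)
orders = {
    1: (datetime(2023, 1, 5), None),
    2: (datetime(2023, 1, 20), Decimal("15.00")),
    3: (datetime(2023, 2, 3), None),
}
# items: (order_id, SellPrice, NetPrice, FreightNet, FreightSell)
items = [
    (1, Decimal("100"), Decimal("60"), Decimal("5"), Decimal("8")),
    (1, Decimal("50"), Decimal("30"), Decimal("2"), Decimal("4")),
    (2, Decimal("200"), Decimal("120"), Decimal("10"), Decimal("12")),
    (3, Decimal("80"), Decimal("50"), Decimal("4"), Decimal("6")),
]


def monthly_profit(orders, items):
    # Step 1: per-order sums, like the inner GROUP BY in the SQL.
    per_order = defaultdict(lambda: [Decimal("0")] * 4)
    for order_id, sell, net, freight_net, freight_sell in items:
        sums = per_order[order_id]
        sums[0] += sell
        sums[1] += net
        sums[2] += freight_net
        sums[3] += freight_sell

    # Step 2: apply COALESCE(override, summed freight sell) per order,
    # then add each order's profit to the bucket of its month.
    per_month = defaultdict(lambda: Decimal("0"))
    for order_id, (sell, net, freight_net, freight_sell) in per_order.items():
        created, override = orders[order_id]
        freight = override if override is not None else freight_sell
        month = created.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
        per_month[month] += sell - net - freight_net + freight
    return dict(per_month)


result = monthly_profit(orders, items)
```

Order 2 uses its override (15.00) instead of the summed item freight (12), which is exactly the case that makes a single flat aggregate over OrderItem give the wrong answer.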
Related
I have an annotation like this, which displays the month-wise count of a field:
bar = Foo.objects.annotate(
    item_count=Count('item')
).order_by('-item_month', '-item_year')
and this produces output like this:
[screenshot: html render]
I would like to show the change in item_count when compared with the previous month item_count for each month (except the first month). How could I achieve this using annotations or do I need to use pandas?
Thanks
Edit:
In SQL this is easy with the LAG function, along these lines:
SELECT item_month, item_year, COUNT(item),
LAG(COUNT(item)) OVER (ORDER BY item_month, item_year)
FROM Foo
GROUP BY item_month, item_year
(PS: item_month and item_year are date fields)
Does the Django ORM have something similar to LAG in SQL?
For this type of query you need to use window functions in the Django ORM.
For Lag, see
https://docs.djangoproject.com/en/4.0/ref/models/database-functions/#lag
A working ORM query looks like this:
#models.py
class Review(models.Model):
    user = models.ForeignKey(User, on_delete=models.CASCADE, related_name='review_user', db_index=True)
    review_text = models.TextField(max_length=5000)
    rating = models.SmallIntegerField(
        validators=[
            MaxValueValidator(10),
            MinValueValidator(1),
        ],
    )
    date_added = models.DateTimeField(db_index=True)
    review_id = models.AutoField(primary_key=True, db_index=True)
This is just a dummy table to show the use of Lag and Window functions in Django, because the Django docs do not include an example for the Lag function.
from django.db.models import Count, F, Window
from django.db.models.functions import ExtractYear, Lag

print(Review.objects.filter().annotate(
    num_likes=Count('likereview_review')
).annotate(
    item_count_lag=Window(
        expression=Lag(expression=F('num_likes')),
        order_by=ExtractYear('date_added').asc(),
    )
).order_by('-num_likes').distinct().query)
Query will look like
SELECT DISTINCT `temp_view_review`.`user_id`, `temp_view_review`.`review_text`, `temp_view_review`.`rating`, `temp_view_review`.`date_added`, `temp_view_review`.`review_id`, COUNT(`temp_view_likereview`.`id`) AS `num_likes`, LAG(COUNT(`temp_view_likereview`.`id`), 1) OVER (ORDER BY EXTRACT(YEAR FROM `temp_view_review`.`date_added`) ASC) AS `item_count_lag` FROM `temp_view_review` LEFT OUTER JOIN `temp_view_likereview` ON (`temp_view_review`.`review_id` = `temp_view_likereview`.`review_id`) GROUP BY `temp_view_review`.`review_id` ORDER BY `num_likes` DESC
If you don't want to order by the extracted year of the date, you can use an F expression instead:
print(Review.objects.filter().annotate(
    num_likes=Count('likereview_review')
).annotate(
    item_count_lag=Window(
        expression=Lag(expression=F('num_likes')),
        order_by=[F('date_added')],
    )
).order_by('-num_likes').distinct().query)
The query for this:
SELECT DISTINCT `temp_view_review`.`user_id`, `temp_view_review`.`review_text`, `temp_view_review`.`rating`, `temp_view_review`.`date_added`, `temp_view_review`.`review_id`, COUNT(`temp_view_likereview`.`id`) AS `num_likes`, LAG(COUNT(`temp_view_likereview`.`id`), 1) OVER (ORDER BY `temp_view_review`.`date_added`) AS `item_count_lag` FROM `temp_view_review` LEFT OUTER JOIN `temp_view_likereview` ON (`temp_view_review`.`review_id` = `temp_view_likereview`.`review_id`) GROUP BY `temp_view_review`.`review_id` ORDER BY `num_likes` DESC
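To make the original goal concrete (the change in item_count versus the previous month), here is a small plain-Python sketch of what a LAG column gives you and how the difference would be derived from it. It is only an illustration of the semantics, not ORM code; the function name and the sample rows are made up, and it assumes rows are ordered by year first and then by month.

```python
def with_previous_and_change(rows):
    """rows: iterable of (year, month, item_count) tuples."""
    result = []
    previous = None  # LAG yields NULL for the first row
    for year, month, count in sorted(rows):  # order by year, then month
        change = None if previous is None else count - previous
        result.append({
            "year": year,
            "month": month,
            "item_count": count,
            "previous": previous,
            "change": change,
        })
        previous = count
    return result


# Made-up monthly counts, deliberately out of order.
sample = [(2022, 2, 7), (2022, 1, 4), (2022, 3, 5)]
report = with_previous_and_change(sample)
```

The first month has no previous row, so its change stays None, which matches the "except the first month" requirement in the question.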
Consider the following Models in Django:
class Item(models.Model):
    name = models.CharField(max_length = 100)

class Item_Price(models.Model):
    created_on = models.DateTimeField(default = timezone.now)
    item = models.ForeignKey('Item', related_name = 'prices')
    price = models.DecimalField(decimal_places = 2, max_digits = 15)
The price of an item can vary throughout time so I want to keep a price history.
My goal is to have a single query using the Django ORM to get a list of Items with their latest prices and sort the results by this price in ascending order.
What would be the best way to achieve this?
You can use a Subquery to obtain the latest Item_Price object and sort on these:
from django.db.models import OuterRef, Subquery
last_price = Item_Price.objects.filter(
    item_id=OuterRef('pk')
).order_by('-created_on').values('price')[:1]
Item.objects.annotate(
    last_price=Subquery(last_price)
).order_by('last_price')
For each Item, we thus obtain the latest Item_Price and use this in the annotation.
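The idea behind the annotation is a correlated subquery: for every item row, pick the newest price row and sort by it. The sketch below shows that pattern directly in SQL using Python's built-in sqlite3 module. The table names, sample data and exact SQL are assumptions for illustration; the SQL Django actually generates will differ in detail.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE item_price (
        id INTEGER PRIMARY KEY,
        item_id INTEGER REFERENCES item(id),
        created_on TEXT,
        price REAL
    );
    INSERT INTO item VALUES (1, 'pen'), (2, 'book'), (3, 'lamp');
    INSERT INTO item_price (item_id, created_on, price) VALUES
        (1, '2023-01-01', 3.0),  (1, '2023-03-01', 2.5),
        (2, '2023-02-01', 9.0),  (2, '2023-01-15', 12.0),
        (3, '2023-01-10', 1.0),  (3, '2023-04-01', 20.0);
""")

# For each item, the subquery returns the price of its newest price row;
# the outer query then sorts the items by that value.
rows = conn.execute("""
    SELECT item.name,
           (SELECT p.price
              FROM item_price p
             WHERE p.item_id = item.id
             ORDER BY p.created_on DESC
             LIMIT 1) AS last_price
      FROM item
     ORDER BY last_price ASC
""").fetchall()
conn.close()
```

Note that the lamp sorts last even though it once cost 1.0, because only its most recent price counts.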
That being said, the above modelling is perhaps not ideal, since it will require a lot of complex queries. django-simple-history does this differently by creating an extra model and saving historical records. It also has a manager that allows one to query for historical states. This perhaps makes working with historical data simpler.
You could prefetch them in order to do the nested ordering inline like the following:
from django.db.models import Prefetch
prefetched_prices = Prefetch("prices", queryset=Item_Price.objects.order_by("price"))
for i in Item.objects.prefetch_related(prefetched_prices):
    i.name, i.prices.all()
I need to subtract one aggregate value from another, where the two values come from different queries:
class ModelA(models.Model):
    price = models.IntegerField()
    #others

class ModelB(models.Model):
    cost = models.IntegerField()
    #others
paid_price = ModelA.objects.filter(status=True).annotate(
    total_paid = Sum(F('price')),
    #some more fields
).aggregate(
    paid = Sum(F('total_paid'))
    #some more fields
)
paid_costs = ModelB.objects.filter(status=True).annotate(
    total_cost = Sum(F('cost')),
    #some more fields
).aggregate(
    t_cost= Sum(F('total_cost')),
    #some more fields
)
I need to subtract t_cost from paid. I tried final_result = paid_price.paid - paid_costs.t_cost
but it raised this error:
'dict' object has no attribute 'paid '
Is there a way to achieve this?
Note: the models have no relation to each other.
Aggregate results are dictionaries; you need to get the calculated values by key:
final_result = paid_price['paid'] - paid_costs['t_cost']
Note: an annotation on a field that is part of the model will not do anything unless you have some grouping (which I can't see in your queries).
Remove the Sum if you just want to rename the field in the query:
ModelA.objects.annotate(
    total_paid=F('price')
)
The aggregate() method always returns a dict. From the documentation:
"Returns a dictionary of aggregate values (averages, sums, etc.) calculated over the QuerySet."
So the expression should be:
final_result = paid_price['paid'] - paid_costs['t_cost']
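One more thing to watch for when subtracting the two values: Sum() yields None rather than 0 when the filtered queryset has no rows, so the plain subtraction can fail with a TypeError. A minimal sketch of a guard is shown below; the dicts are stand-ins for what aggregate() returns, the numbers are made up, and the helper name is hypothetical.

```python
def aggregate_difference(paid_price, paid_costs):
    # Look the values up by key (aggregate() returns plain dicts) and
    # treat a None sum, which means no matching rows, as zero.
    paid = paid_price["paid"] or 0
    cost = paid_costs["t_cost"] or 0
    return paid - cost


# Stand-ins for the dicts returned by the two aggregate() calls.
normal = aggregate_difference({"paid": 1200}, {"t_cost": 450})
no_costs = aggregate_difference({"paid": 1200}, {"t_cost": None})
```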
I have a Django App with the following models:
CURRENCY_CHOICES = (('USD', 'US Dollars'), ('EUR', 'Euro'))

class ExchangeRate(models.Model):
    currency = models.CharField(max_length=3, default='USD', choices=CURRENCY_CHOICES)
    rate = models.FloatField()
    exchange_date = models.DateField()

class Donation(models.Model):
    donation_date = models.DateField()
    donor = models.CharField(max_length=250)
    amount = models.FloatField()
    currency = models.CharField(max_length=3, default='USD', choices=CURRENCY_CHOICES)
I also have a form I use to filter donations based on some criteria:
class DonationFilterForm(forms.Form):
    min_amount = forms.FloatField(required=False)
    max_amount = forms.FloatField(required=False)
The min_amount and max_amount fields will always represent values in US Dollars.
I need to be able to filter a queryset based on min_amount and max_amount, but for that all the amounts must be in USD. To convert the donation amount to USD I need to multiply by the ExchangeRate of the donation currency and date.
The only way I found of doing this so far is by iterating the dict(queryset) and adding a new value called usd_amount, but that may offer very poor performance in the future.
Reading Django documentation, it seems the same thing can be done using aggregation, but so far I haven't been able to create the right logic that would give me same result.
I knew I had to use annotate to solve this, but I didn't know exactly how because it involved getting data from an unrelated Model.
Upon further investigation I found what I needed in the Django Documentation. I needed to use the Subquery and the OuterRef expressions to get the values from the outer queryset so I could filter the inner queryset.
The final solution looks like this:
# Prepare the filter with dynamic fields using OuterRef
rates = ExchangeRate.objects.filter(exchange_date=OuterRef('donation_date'), currency='EUR')
# Get the exchange rate for every donation made in Euros
qs = Donation.objects.filter(currency='EUR').annotate(exchange_rate=Subquery(rates.values('rate')[:1]))
# Get the equivalent amount in USD
qs = qs.annotate(usd_amount=F('amount') * F('exchange_rate'))
So, finally, I could filter the resulting queryset like so:
final_qs = qs.filter(usd_amount__gte=min_amount, usd_amount__lte=max_amount)
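As a cross-check for the queryset above, the same logic can be written as a plain-Python reference function: look up the rate for the donation's currency and date, multiply, then keep only amounts inside the USD range. This is a sketch for illustration (for example in a unit test); the data, the dict-based rate lookup and the function name are all made up. Donations with no matching rate are dropped, which mirrors how a NULL usd_amount fails the range filter in SQL.

```python
from datetime import date

# Made-up rates keyed by (currency, exchange_date).
rates = {
    ("EUR", date(2023, 1, 10)): 1.25,
    ("EUR", date(2023, 1, 11)): 1.5,
}
donations = [
    {"donor": "A", "donation_date": date(2023, 1, 10), "amount": 100.0, "currency": "EUR"},
    {"donor": "B", "donation_date": date(2023, 1, 11), "amount": 10.0, "currency": "EUR"},
    # No rate exists for this date, so no USD amount can be computed.
    {"donor": "C", "donation_date": date(2023, 1, 12), "amount": 80.0, "currency": "EUR"},
]


def filter_by_usd(donations, rates, min_amount, max_amount):
    matched = []
    for donation in donations:
        rate = rates.get((donation["currency"], donation["donation_date"]))
        if rate is None:
            continue  # behaves like a NULL usd_amount: never matches
        usd_amount = donation["amount"] * rate
        if min_amount <= usd_amount <= max_amount:
            matched.append({**donation, "usd_amount": usd_amount})
    return matched


in_range = filter_by_usd(donations, rates, 50, 200)
```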
I need to calculate period medians per seller ID (see the simplified model below). The problem is that I am unable to construct the ORM query.
Model
class MyModel(models.Model):
    period = models.IntegerField(null=True, default=None)
    seller_ids = ArrayField(models.IntegerField(), default=list)
    aux = JSONField(default=dict)
Query
queryset = (
    MyModel.objects.filter(period=25)
    .annotate(seller_id=Func(F("seller_ids"), function="unnest"))
    .values("seller_id")
    .annotate(
        duration=Cast(KeyTextTransform("duration", "aux"), IntegerField()),
        median=Func(
            F("duration"),
            function="percentile_cont",
            template="%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)",
        ),
    )
    .values("median", "seller_id")
)
ArrayField aggregation (seller_id) source
I think what I need to do is something along the lines below
select t.*, p_25, p_75
from t join
(select district,
percentile_cont(0.25) within group (order by sales) as p_25,
percentile_cont(0.75) within group (order by sales) as p_75
from t
group by district
) td
on t.district = td.district
above example source
Python 3.7.5, Django 2.2.8, Postgres 11.1
You can create a Median child class of the Aggregate class as was done by Ryan Murphy (https://gist.github.com/rdmurphy/3f73c7b1826cacee34f6c2a855b12e2e). Median then works just like Avg:
from django.db.models import Aggregate, FloatField

class Median(Aggregate):
    function = 'PERCENTILE_CONT'
    name = 'median'
    output_field = FloatField()
    template = '%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)'
Then to find the median of a field use
my_model_aggregate = MyModel.objects.all().aggregate(Median('period'))
which is then available as my_model_aggregate['period__median'].
Here's what did the trick.
# KeyTextTransform import path as of Django 2.2, the version named in the question
from django.contrib.postgres.fields.jsonb import KeyTextTransform
from django.db.models import F, Func, IntegerField
from django.db.models.aggregates import Aggregate
from django.db.models.functions import Cast

queryset = (
    MyModel.objects.filter(period=25)
    .annotate(duration=Cast(KeyTextTransform("duration", "aux"), IntegerField()))
    .filter(duration__isnull=False)
    .annotate(seller_id=Func(F("seller_ids"), function="unnest"))
    .values("seller_id")  # group by
    .annotate(
        median=Aggregate(
            F("duration"),
            function="percentile_cont",
            template="%(function)s(0.5) WITHIN GROUP (ORDER BY %(expressions)s)",
        ),
    )
)
Notice the median annotation employs Aggregate and not Func as in the question.
Also, the order of the annotate() and filter() clauses, as well as the order of the annotate() and values() clauses, matters a lot!
BTW, the resulting SQL has no nested select or join.
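To sanity-check the numbers this query returns, the same steps (filter by period, drop rows without a duration, unnest seller_ids, take the median per seller) can be reproduced in plain Python on a handful of rows. This is only a sketch with made-up data and a hypothetical function name. statistics.median averages the two middle values for an even count, which matches what percentile_cont(0.5) produces by interpolation.

```python
from collections import defaultdict
from statistics import median

# Made-up rows shaped like MyModel instances.
rows = [
    {"period": 25, "seller_ids": [1, 2], "aux": {"duration": "10"}},
    {"period": 25, "seller_ids": [1], "aux": {"duration": "20"}},
    {"period": 25, "seller_ids": [2], "aux": {}},  # no duration: filtered out
    {"period": 25, "seller_ids": [1, 2], "aux": {"duration": "40"}},
    {"period": 24, "seller_ids": [1], "aux": {"duration": "999"}},  # other period
]


def median_duration_per_seller(rows, period):
    durations = defaultdict(list)
    for row in rows:
        if row["period"] != period:
            continue
        raw = row["aux"].get("duration")
        if raw is None:
            continue  # same effect as .filter(duration__isnull=False)
        for seller_id in row["seller_ids"]:  # the unnest step
            durations[seller_id].append(int(raw))
    return {seller: median(values) for seller, values in durations.items()}


medians = median_duration_per_seller(rows, period=25)
```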