My Django-fu isn't quite good enough to translate certain raw SQL into the ORM.
Currently I am executing:
SELECT avg(<value_to_be_averaged>), <id_to group_on>
FROM <table_name>
WHERE start_time >= <timestamp>
GROUP BY <id_to group_on>;
In Django I can do:
Model.objects.filter(start_time__gte=<timestamp>).aggregate(Avg('<value_to_be_averaged>'))
but that is for all objects in the query and doesn't return a query set that is grouped by the id like in the raw SQL above. I've been fiddling with .annotate() but haven't made much progress. Any help would be appreciated!
Related
I am using the Django ORM with Python. I want a query that returns all rows from a table whose mobile number appears fewer than 10 times in the column.
Below is the SQL query, which works fine, but I want the same thing in the ORM.
SELECT *, count() OVER (PARTITION BY mobile_number) AS cnt FROM table_name WHERE cnt <= 10
Below is the ORM query I wrote, but it does not return the correct result:
get_data = ResponseMangement.objects.annotate(count=Count('mobile_number')).filter(count__lt=10).order_by('-id')
I want to display the course name along with the question count in a table. I need help converting the query below to the Django ORM:
SELECT DISTINCT exam_course.course_name,
COUNT(exam_question.question)
FROM exam_course
INNER JOIN exam_question ON exam_question.course_id = exam_course.id
GROUP BY exam_question.course_id
Use annotate() with Count, as suggested in the comments. Use it like this, adapting it to your requirements:
invoices = Invoice.objects.annotate(total_amount=Sum('order__order_items__amount'), number_of_invoices=Count('pk', distinct=True))
What is the best way to make raw SQL queries in django?
I have to search a table for the mode of another table. I could not find a way to solve this in django's ORM so I turned to raw SQL queries.
Yet building all these very long queries in Python is unreadable and does not feel like the proper way to do this. Is there a way to store these queries in a neat format, perhaps in the database?
I have to join three separate tables and compute the mode of a few columns on the last table. The length of the queries is getting very big and the code to make these queries becomes very unreadable. An example query would be
SELECT * FROM "core_assembly" INNER JOIN (SELECT * FROM "core_taxonomy" INNER JOIN(SELECT "core_phenotypic"."taxonomy_id" , \
array_agg("core_phenotypic"."isolation_host_NCBI_tax_id") FILTER (WHERE "core_phenotypic"."isolation_host_NCBI_tax_id" IS NOT NULL) \
AS super_set_isolation_host_NCBI_tax_ids FROM core_phenotypic GROUP BY "core_phenotypic"."taxonomy_id") "mode_table" ON \
"core_taxonomy"."id"="mode_table"."taxonomy_id") "tax_mode" ON "core_assembly"."taxonomy_id"="tax_mode"."id" WHERE ( 404=ANY(super_set_isolation_host_NCBI_tax_ids));
Where I would have a very big parse function to make all the WHERE clauses based on user input.
You can try this:
from django.db import connection
cursor = connection.cursor()
raw_query = "write your query here"
cursor.execute(raw_query)
You can also run raw queries on models, e.g. MyModel.objects.raw('my query').
Read "Performing raw SQL queries" in the Django documentation for more.
Question:
How can I get the last 750 records of a query at the database level?
Here is what I have tried:
# Get last 750 applications
apps = MyModel.active_objects.filter(
    **query_params
).order_by('-created_at').values_list('id', flat=True)[:750]
This query fetches all records that match the query_params filter and only then returns the last 750 records. I want this work to be done at the database level, like MongoDB aggregate queries. Is it possible?
Thanks.
Actually, that's not how Django works. The limit is also applied at the database level.
Django docs - Limiting QuerySets:
Generally, slicing a QuerySet returns a new QuerySet – it doesn’t evaluate the query.
To see what query is actually being run in the database you can simply print the query like this:
apps = MyModel.active_objects.filter(
    **query_params
).order_by('-created_at').values_list('id', flat=True)[:750]
print(apps.query)
The result will be something like this:
SELECT "app_mymodel"."id" FROM "app_mymodel" WHERE <...> ORDER BY "app_mymodel"."created_at" DESC LIMIT 750
I'm having loads of trouble translating some SQL into Django.
Imagine we have some cars, each with a unique VIN, and we record the dates that they are in the shop with some other data. (Please ignore the reason one might structure the data this way. It's specifically for this question. :-) )
class ShopVisit(models.Model):
    vin = models.CharField(...)
    date_in_shop = models.DateField(...)
    mileage = models.DecimalField(...)
    boolfield = models.BooleanField(...)
We want a single query to return a Queryset with the most recent record for each vin and update it!
special_vins = [...]
# Doesn't work
ShopVisit.objects.filter(vin__in=special_vins).annotate(max_date=Max('date_in_shop')).filter(date_in_shop=F('max_date')).update(boolfield=True)
# Distinct doesn't work with update
ShopVisit.objects.filter(vin__in=special_vins).order_by('vin', '-date_in_shop').distinct('vin').update(boolfield=True)
Yes, I could iterate over a queryset. But that's not very efficient and it takes a long time when I'm dealing with around 2M records. The SQL that could do this is below (I think!):
SELECT *
FROM cars
INNER JOIN (
SELECT MAX(dateInShop) as maxtime, vin
FROM cars
GROUP BY vin
) AS latest_record ON (cars.dateInShop= maxtime)
AND (latest_record.vin = cars.vin)
So how can I make this happen with Django?
This is somewhat untested, and relies on Django 1.11 for Subqueries, but perhaps something like:
latest_visits = Subquery(ShopVisit.objects.filter(id=OuterRef('id')).order_by('-date_in_shop').values('id')[:1])
ShopVisit.objects.filter(id__in=latest_visits)
I had a similar model, so went to test it but got an error of:
"This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery"
The SQL it generated looked reasonably like what you want, so I think the idea is sound. If you use PostgreSQL, perhaps it has support for that type of subquery.
Here's the SQL it produced (trimmed up a bit and replaced actual names with fake ones):
SELECT `mymodel_activity`.* FROM `mymodel_activity` WHERE `mymodel_activity`.`id` IN (SELECT U0.`id` FROM `mymodel_activity` U0 WHERE U0.`id` = (`mymodel_activity`.`id`) ORDER BY U0.`date_in_shop` DESC LIMIT 1)
I wonder if you found the solution yourself.
I could only come up with a raw query string (see the Django manual on raw SQL queries):
UPDATE "yourapplabel_shopvisit"
SET boolfield = True WHERE date_in_shop
IN (SELECT MAX(date_in_shop) FROM "yourapplabel_shopvisit" GROUP BY vin);