How to make a QuerySet with custom select in django - python

I'm working on a project in python I'm kind of a beginner. I searched a bit on the Queryset but didn't found out how to do a custom select query. Here's the raw SQL query :
SELECT
DATE_FORMAT(
DATE_ADD(date_and_time, INTERVAL - WEEKDAY(date_and_time) DAY), '%Y-%m-%d'
) as Week,
device_type as 'Type of device',
COUNT(*) as Views FROM manage_history
GROUP BY Week, device_type;
How could I get the values but with a QuerySet ?
Thanks a lot

You can perform a raw sql query. More details at https://docs.djangoproject.com/en/3.0/topics/db/sql/

Django's default manager gives you RawQuerySet:
>>> from api.models import Song
>>> Song.objects.raw('SELECT * FROM api_song')
<RawQuerySet: SELECT * FROM api_song>
In case of that you want exactly QuerySet to be returned, Take a look Here.

Related

How to write subquery in django

Is it possible to make following sql query in django
select * from (
select * from users
) order by id
It is just minimal example. I have a long subquery instead of select * from users. But I can't understand how insert it into subquery.
UPDATED:
Subquery from doc doesn't suits because it build following request
SELECT "post"."id", (
SELECT U0."email"
FROM "comment" U0
WHERE U0."post_id" = ("post"."id")
ORDER BY U0."created_at" DESC LIMIT 1
) AS "newest_commenter_email" FROM "post"
and this subquery can return only one value (.values('email')).
Construction select (subquery) as value from table instead of select value from (subquery)
i would use a python connector to postgreSQL - http://www.postgresqltutorial.com/postgresql-python/query/, that is what i do for the mysql, thought did not try for the postgresql
Making a subquery is essentially setting up two queries and using one query to "feed" another:
from django.db.models import Subquery
all_users = User.objects.all()
User.objects.annotate(the_user=Subquery(all_users.values('email')[:1]))
This is more or less the same as what you provided. You can get about as complicated as you'd like here but the best source to get going with subqueries is the docs

Django ORM: Get latest record for distinct field

I'm having loads of trouble translating some SQL into Django.
Imagine we have some cars, each with a unique VIN, and we record the dates that they are in the shop with some other data. (Please ignore the reason one might structure the data this way. It's specifically for this question. :-) )
class ShopVisit(models.Model):
vin = models.CharField(...)
date_in_shop = models.DateField(...)
mileage = models.DecimalField(...)
boolfield = models.BooleanField(...)
We want a single query to return a Queryset with the most recent record for each vin and update it!
special_vins = [...]
# Doesn't work
ShopVisit.objects.filter(vin__in=special_vins).annotate(max_date=Max('date_in_shop').filter(date_in_shop=F('max_date')).update(boolfield=True)
# Distinct doesn't work with update
ShopVisit.objects.filter(vin__in=special_vins).order_by('vin', '-date_in_shop).distinct('vin').update(boolfield=True)
Yes, I could iterate over a queryset. But that's not very efficient and it takes a long time when I'm dealing with around 2M records. The SQL that could do this is below (I think!):
SELECT *
FROM cars
INNER JOIN (
SELECT MAX(dateInShop) as maxtime, vin
FROM cars
GROUP BY vin
) AS latest_record ON (cars.dateInShop= maxtime)
AND (latest_record.vin = cars.vin)
So how can I make this happen with Django?
This is somewhat untested, and relies on Django 1.11 for Subqueries, but perhaps something like:
latest_visits = Subquery(ShopVisit.objects.filter(id=OuterRef('id')).order_by('-date_in_shop').values('id')[:1])
ShopVisit.objects.filter(id__in=latest_visits)
I had a similar model, so went to test it but got an error of:
"This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery"
The SQL it generated looked reasonably like what you want, so I think the idea is sound. If you use PostGres, perhaps it has support for that type of subquery.
Here's the SQL it produced (trimmed up a bit and replaced actual names with fake ones):
SELECT `mymodel_activity`.* FROM `mymodel_activity` WHERE `mymodel_activity`.`id` IN (SELECT U0.`id` FROM `mymodel_activity` U0 WHERE U0.`id` = (`mymodel_activity`.`id`) ORDER BY U0.`date_in_shop` DESC LIMIT 1)
I wonder if you found the solution yourself.
I could come up with only raw query string. Django Raw SQL query Manual
UPDATE "yourapplabel_shopvisit"
SET boolfield = True WHERE date_in_shop
IN (SELECT MAX(date_in_shop) FROM "yourapplabel_shopvisit" GROUP BY vin);

how to select * group by in Django models

I have a model: Syscheck(PostgreSQL table name: syscheck_syscheck), include some fields
changes, size, uid, gid, filepath, syscheck_path
How can I perform a SQL query like this
SELECT * FROM syscheck_syscheck GROUP BY filepath
I have tried:
Syscheck.objects.values('filepath').annotate(Count('filepath'))
It worked, but only filepath field return, when I want more fields return, I tried this:
Syscheck.objects.values('filepath', 'size', 'uid', 'gui').annotate(Count('filepath'))
It didn't work
Django ORM's group by is a little bit tricky. it only do it if you group by the "base" model.
I managed that situation making that kind of queries like this:
Filepath.objects.all().select_related('syscheck').filter(syscheck__isnull=False).annotate(Count('filepath'))
You can access to the syscheck data but it will depends on your situation.
Hope it helps!

Django query with AVG and GROUP BY

My Django-foo isn't quite up to par to translate certain raw sql into the ORM.
Currently I am executing:
SELECT avg(<value_to_be_averaged>), <id_to group_on>
FROM <table_name>
WHERE start_time >= <timestamp>
GROUP BY <id_to group_on>;
In Django I can do:
Model.objects.filter(start_time__gte=<timestamp>).aggregate(Avg('<value_to_be_averaged>'))
but that is for all objects in the query and doesn't return a query set that is grouped by the id like in the raw SQL above. I've been fiddling with .annotate() but haven't made much progress. Any help would be appreciated!

Django - SQL Query - Timestamp

Can anyone turn me to a tutorial, code or some kind of resource that will help me out with the following problem.
I have a table in a mySQL database. It contains an ID, Timestamp, another ID and a value. I'm passing it the 'main' ID which can uniquely identify a piece of data. However, I want to do a time search on this piece of data(therefore using the timestamp field). Therefore what would be ideal is to say: between the hours of 12 and 1, show me all the values logged for ID = 1987.
How would I go about querying this in Django? I know in mySQL it'd be something like less than/greater than etc... but how would I go about doing this in Django? i've been using Object.Filter for most of database handling so far. Finally, I'd like to stress that I'm new to Django and I'm genuinely stumped!
If the table in question maps to a Django model MyModel, e.g.
class MyModel(models.Model):
...
primaryid = ...
timestamp = ...
secondaryid = ...
valuefield = ...
then you can use
MyModel.objects.filter(
primaryid=1987
).exclude(
timestamp__lt=<min_timestamp>
).exclude(
timestamp__gt=<max_timestamp>
).values_list('valuefield', flat=True)
This selects entries with the primaryid 1987, with timestamp values between <min_timestamp> and <max_timestamp>, and returns the corresponding values in a list.
Update: Corrected bug in query (filter -> exclude).
I don't think Vinay Sajip's answer is correct. The closest correct variant based on his code is:
MyModel.objects.filter(
primaryid=1987
).exclude(
timestamp__lt=min_timestamp
).exclude(
timestamp__gt=max_timestamp
).values_list('valuefield', flat=True)
That's "exclude the ones less than the minimum timestamp and exclude the ones greater than the maximum timestamp." Alternatively, you can do this:
MyModel.objects.filter(
primaryid=1987
).filter(
timestamp__gte=min_timestamp
).exclude(
timestamp__gte=max_timestamp
).values_list('valuefield', flat=True)
exclude() and filter() are opposites: exclude() omits the identified rows and filter() includes them. You can use a combination of them to include/exclude whichever you prefer. In your case, you want to exclude() those below your minimum time stamp and to exclude() those above your maximum time stamp.
Here is the documentation on chaining QuerySet filters.

Categories

Resources