I have a model, Syscheck (PostgreSQL table name: syscheck_syscheck), which includes these fields:
changes, size, uid, gid, filepath, syscheck_path
How can I perform a SQL query like this?
SELECT * FROM syscheck_syscheck GROUP BY filepath
I have tried:
Syscheck.objects.values('filepath').annotate(Count('filepath'))
It worked, but only the filepath field is returned. When I wanted more fields returned, I tried this:
Syscheck.objects.values('filepath', 'size', 'uid', 'gid').annotate(Count('filepath'))
It didn't work.
Django ORM's GROUP BY is a little bit tricky: it only works the way you expect if you group by the "base" model.
I handled that situation by writing the query like this:
Filepath.objects.all().select_related('syscheck').filter(syscheck__isnull=False).annotate(Count('filepath'))
You can still access the syscheck data, but whether this fits depends on your situation.
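To make the grouping problem concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Django. In the ORM, every field listed in values() before annotate() becomes part of the GROUP BY, which is why adding size and uid changes the grouping. The table below is a hypothetical, cut-down stand-in for syscheck_syscheck, and picking the row with the highest id per filepath is an assumption about which row is wanted.

```python
import sqlite3

# Hypothetical stand-in for syscheck_syscheck, reduced to a few columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE syscheck ("
    "id INTEGER PRIMARY KEY, filepath TEXT, size INTEGER, uid INTEGER)"
)
conn.executemany(
    "INSERT INTO syscheck (filepath, size, uid) VALUES (?, ?, ?)",
    [
        ("/etc/passwd", 100, 0),
        ("/etc/passwd", 120, 0),
        ("/etc/hosts", 50, 0),
    ],
)

# Step 1: a plain GROUP BY yields one row per filepath, but only the grouped
# column and the aggregates are well defined.
counts = conn.execute(
    "SELECT filepath, COUNT(*) FROM syscheck GROUP BY filepath ORDER BY filepath"
).fetchall()

# Step 2: to read the other columns, pick one concrete row per group
# (assumption: the one with the highest id) and join back to it.
latest_rows = conn.execute(
    """
    SELECT s.filepath, s.size, s.uid
    FROM syscheck AS s
    JOIN (SELECT MAX(id) AS id FROM syscheck GROUP BY filepath) AS pick
      ON pick.id = s.id
    ORDER BY s.filepath
    """
).fetchall()

print(counts)
print(latest_rows)
```

The second query is the shape to aim for when "more fields" are needed: decide which row represents each group first, then fetch that row in full.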
Hope it helps!
Related
Given these models
from django.db import models

class User(models.Model):
    pass

class Post(models.Model):
    by = models.ForeignKey(User, on_delete=models.CASCADE)
    posted_on = models.DateTimeField(auto_now=True)
I want to get the latest Posts, but not all from the same User. I have something like this:
posts = Post.objects.filter(public=True) \
.order_by('posted_on') \
.distinct("by")
But distinct doesn't work on MySQL. I'm wondering if there is another way to do it?
I have seen some people using values(), but values() doesn't work for me because I need to do more things with the objects themselves.
Since distinct() with field names is not supported on MySQL, a possible workaround is to use a Subquery:
from django.db.models import Subquery, OuterRef
...
# posts of the outer user; the foreign key on Post is named "by"
sub_qs = Post.objects.filter(by_id=OuterRef('id')).order_by('posted_on')
# here you get users annotated with the id of one post
# (a Subquery must return a single column, hence values('id'))
qs = User.objects.annotate(last_post=Subquery(sub_qs.values('id')[:1]))
# next you can limit the number of users
Also note that the ordering on the posted_on field matters: with order_by('posted_on') the subquery returns the oldest post, so you will probably need to change it to -posted_on to get the newest one on top.
The order_by() fields should match the distinct() fields. In your case, you should be doing this:
posts = Post.objects.filter(public=True) \
.order_by('by') \
.distinct('by')
distinct(*fields) only works on PostgreSQL.
For the MySQL engine, this is what the Django documentation says about distinct():
Here's the difference. For a normal distinct() call, the database
compares each field in each row when determining which rows are
distinct. For a distinct() call with specified field names, the
database will only compare the specified field names.
For MySQL, a workaround could be this:
from django.db.models import Subquery, OuterRef
# Assumes Post has a foreign key named "user" with related_name='related_posts';
# with the model in the question, the filter would be by_id=OuterRef('id').
user_post = Post.objects.filter(user_id=OuterRef('id')).order_by('posted_on')
post_ids = User.objects.filter(related_posts__isnull=False).annotate(post=Subquery(user_post.values_list('id', flat=True)[:1])).values_list('post', flat=True)
posts = Post.objects.filter(id__in=post_ids)
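The idea behind these workarounds (pick one post per user with a correlated subquery, then fetch the full rows) can be checked outside Django. This is a minimal sketch using the standard sqlite3 module; the table names app_user and app_post and the sample rows are made up for illustration, and the newest post is chosen by ordering on posted_on descending.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE app_user (id INTEGER PRIMARY KEY);
    CREATE TABLE app_post (
        id INTEGER PRIMARY KEY,
        by_id INTEGER REFERENCES app_user(id),
        posted_on TEXT
    );
    INSERT INTO app_user (id) VALUES (1), (2), (3);
    INSERT INTO app_post (id, by_id, posted_on) VALUES
        (10, 1, '2020-01-01'),
        (11, 1, '2020-03-01'),
        (12, 2, '2020-02-01');
    """
)

# For every post, the correlated subquery finds the newest post of the same
# author; only posts that are their author's newest one are kept.
latest_posts = conn.execute(
    """
    SELECT p.id, p.by_id, p.posted_on
    FROM app_post AS p
    WHERE p.id = (
        SELECT p2.id
        FROM app_post AS p2
        WHERE p2.by_id = p.by_id
        ORDER BY p2.posted_on DESC, p2.id DESC
        LIMIT 1
    )
    ORDER BY p.posted_on DESC
    """
).fetchall()

print(latest_posts)
```

User 3 has no posts and therefore does not appear, and the result consists of full post rows, which corresponds to getting model instances back rather than values().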
I'm working on a project in Python and I'm kind of a beginner. I searched a bit about QuerySets but didn't find out how to do a custom select query. Here's the raw SQL query:
SELECT
DATE_FORMAT(
DATE_ADD(date_and_time, INTERVAL - WEEKDAY(date_and_time) DAY), '%Y-%m-%d'
) as Week,
device_type as 'Type of device',
COUNT(*) as Views FROM manage_history
GROUP BY Week, device_type;
How could I get the same values with a QuerySet?
Thanks a lot
You can perform a raw SQL query. More details at https://docs.djangoproject.com/en/3.0/topics/db/sql/
Django's default manager gives you RawQuerySet:
>>> from api.models import Song
>>> Song.objects.raw('SELECT * FROM api_song')
<RawQuerySet: SELECT * FROM api_song>
If you need an actual QuerySet to be returned, take a look here.
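For reference, the week bucketing in the SQL above (subtracting WEEKDAY(date_and_time) days to land on the Monday of that week) is easy to reproduce in plain Python. The sketch below is only an illustration of the grouping logic, not an ORM query; the history rows are invented sample data.

```python
from collections import Counter
from datetime import date, datetime, timedelta


def week_start(moment: datetime) -> date:
    """Return the Monday of the week containing ``moment``."""
    # datetime.weekday() is 0 for Monday, like MySQL's WEEKDAY().
    return (moment - timedelta(days=moment.weekday())).date()


# Hypothetical (date_and_time, device_type) rows standing in for manage_history.
history = [
    (datetime(2020, 3, 2, 9, 0), "mobile"),     # Monday
    (datetime(2020, 3, 4, 18, 30), "mobile"),   # Wednesday, same week
    (datetime(2020, 3, 8, 23, 59), "desktop"),  # Sunday, same week
    (datetime(2020, 3, 9, 0, 0), "mobile"),     # Monday of the next week
]

# Equivalent of GROUP BY Week, device_type with COUNT(*).
views = Counter((week_start(ts).isoformat(), device) for ts, device in history)

for (week, device), count in sorted(views.items()):
    print(week, device, count)
```

Inside the ORM, the same grouping is usually expressed with TruncWeek from django.db.models.functions (available since Django 2.1), annotating the truncated date and then calling values() on it and on device_type followed by annotate(Count(...)); the exact model name depends on your project.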
I'm having loads of trouble translating some SQL into Django.
Imagine we have some cars, each with a unique VIN, and we record the dates that they are in the shop with some other data. (Please ignore the reason one might structure the data this way. It's specifically for this question. :-) )
class ShopVisit(models.Model):
    vin = models.CharField(...)
    date_in_shop = models.DateField(...)
    mileage = models.DecimalField(...)
    boolfield = models.BooleanField(...)
We want a single query to return a Queryset with the most recent record for each vin and update it!
special_vins = [...]
# Doesn't work
ShopVisit.objects.filter(vin__in=special_vins).annotate(max_date=Max('date_in_shop')).filter(date_in_shop=F('max_date')).update(boolfield=True)
# Distinct doesn't work with update
ShopVisit.objects.filter(vin__in=special_vins).order_by('vin', '-date_in_shop').distinct('vin').update(boolfield=True)
Yes, I could iterate over a queryset. But that's not very efficient and it takes a long time when I'm dealing with around 2M records. The SQL that could do this is below (I think!):
SELECT *
FROM cars
INNER JOIN (
SELECT MAX(dateInShop) as maxtime, vin
FROM cars
GROUP BY vin
) AS latest_record ON (cars.dateInShop= maxtime)
AND (latest_record.vin = cars.vin)
So how can I make this happen with Django?
This is somewhat untested, and relies on Django 1.11 for Subqueries, but perhaps something like:
latest_visits = Subquery(ShopVisit.objects.filter(id=OuterRef('id')).order_by('-date_in_shop').values('id')[:1])
ShopVisit.objects.filter(id__in=latest_visits)
I had a similar model, so went to test it but got an error of:
"This version of MySQL doesn't yet support 'LIMIT & IN/ALL/ANY/SOME subquery"
The SQL it generated looked reasonably like what you want, so I think the idea is sound. If you use PostgreSQL, perhaps it has support for that type of subquery.
Here's the SQL it produced (trimmed up a bit and replaced actual names with fake ones):
SELECT `mymodel_activity`.* FROM `mymodel_activity` WHERE `mymodel_activity`.`id` IN (SELECT U0.`id` FROM `mymodel_activity` U0 WHERE U0.`id` = (`mymodel_activity`.`id`) ORDER BY U0.`date_in_shop` DESC LIMIT 1)
I wonder if you found the solution yourself.
I could only come up with a raw query string (see the Django manual on raw SQL queries).
UPDATE "yourapplabel_shopvisit"
SET boolfield = TRUE
WHERE (vin, date_in_shop) IN (
    SELECT vin, MAX(date_in_shop) FROM "yourapplabel_shopvisit" GROUP BY vin
);
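As a way to sanity-check this kind of single-statement update, here is a small sketch with the standard sqlite3 module. The shopvisit table and its rows are invented, special_vins plays the role of the list from the question, and the "latest" row is found by correlating the subquery on vin, so that a date which happens to be the maximum for another car does not match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE shopvisit (
        id INTEGER PRIMARY KEY,
        vin TEXT,
        date_in_shop TEXT,
        boolfield INTEGER DEFAULT 0
    );
    INSERT INTO shopvisit (vin, date_in_shop) VALUES
        ('A', '2020-01-01'),
        ('A', '2020-02-01'),
        ('B', '2020-01-01'),
        ('C', '2020-05-01');
    """
)

special_vins = ["A", "B"]
# One "?" placeholder per VIN keeps the values out of the SQL string itself.
placeholders = ", ".join("?" for _ in special_vins)

# Flag only the most recent visit of each selected VIN, in one statement.
conn.execute(
    f"""
    UPDATE shopvisit
    SET boolfield = 1
    WHERE vin IN ({placeholders})
      AND date_in_shop = (
          SELECT MAX(v2.date_in_shop)
          FROM shopvisit AS v2
          WHERE v2.vin = shopvisit.vin
      )
    """,
    special_vins,
)

flagged = conn.execute(
    "SELECT vin, date_in_shop FROM shopvisit WHERE boolfield = 1 ORDER BY vin"
).fetchall()
print(flagged)
```

Car A's visit on 2020-01-01 stays unflagged even though that date is car B's latest visit, and car C is skipped because it is not in special_vins.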
In Django, is it possible to order by whether or not a field is None, instead of the value of the field itself?
I know I can send the QuerySet to python sorted() but I want to keep it as a QuerySet for subsequent filtering. So, I'd prefer to order in the QuerySet itself.
For example, I have a termination_date field and I want to first sort the ones without a termination_date, then I want to order by a different field, like last_name, first_name.
Is this possible or am I stuck using sorted() and then having to do an entire new Query with the included ids and run sorted() on the new QuerySet? I can do this, but would prefer not to waste the overhead and use the beauty of QuerySets that they don't run until evaluated.
In other words, how can I get this SQL from Django, assuming my app is employee, my model is Employee, and it has three fields: first_name (varchar), last_name (varchar), and termination_date (date)?
SELECT
"employee_employee"."last_name",
"employee_employee"."first_name",
"employee_employee"."termination_date"
FROM "employee_employee"
ORDER BY
"employee_employee"."termination_date" IS NOT NULL,
"employee_employee"."last_name",
"employee_employee"."first_name"
You should be able to order by query expressions, like this:
from django.db.models import IntegerField, Case, Value, When

MyModel.objects.all().order_by(
    Case(
        When(some_field=None, then=Value(1)),
        default=Value(0),
        output_field=IntegerField(),
    ).asc(),
    'some_other_field'
)
I cannot test here so it might require a bit of fiddling around, but this should put rows that have a NULL some_field after those that have a some_field. And each set of rows should be sorted by some_other_field.
Granted, the CASE/WHEN is a bit more cumbersome than what you put in your question, but I don't know how to get the Django ORM to output that. Maybe someone else will have a better answer.
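To see that the CASE/WHEN form orders rows the same way as the IS NOT NULL expression from the question, here is a quick check with the standard sqlite3 module. The employee rows are invented, and the CASE below maps NULL to 0 so that employees without a termination date come first, as the question asks; swapping the 0 and 1 puts them last instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE employee (
        last_name TEXT,
        first_name TEXT,
        termination_date TEXT
    );
    INSERT INTO employee VALUES
        ('Smith', 'Ann', '2019-06-30'),
        ('Jones', 'Bob', NULL),
        ('Adams', 'Cal', '2018-01-15'),
        ('Brown', 'Dee', NULL);
    """
)

# Ordering expression from the question: 0 (NULL date) sorts before 1.
by_is_not_null = conn.execute(
    """
    SELECT last_name FROM employee
    ORDER BY termination_date IS NOT NULL, last_name, first_name
    """
).fetchall()

# The same ordering written as a CASE expression.
by_case = conn.execute(
    """
    SELECT last_name FROM employee
    ORDER BY CASE WHEN termination_date IS NULL THEN 0 ELSE 1 END,
             last_name, first_name
    """
).fetchall()

print(by_is_not_null)
print(by_case)
```

Both queries list the two employees without a termination date first (Brown, Jones), then the terminated ones (Adams, Smith), each group sorted by name.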
Spectras' answer works fine, but it only orders your records by 'null or not'. There is a shorter way that allows you to put empty dates wherever you want them in your date ordering - Coalesce:
from datetime import datetime

from django.db.models import Value
from django.db.models.functions import Coalesce

wayback = datetime(year=1, month=1, day=1)  # or whatever date you want
(
    MyModel.objects
    .annotate(null_date=Coalesce('date_field', Value(wayback)))
    .order_by('null_date')
)
This essentially sorts by date_field, with every record where date_field is None ordered as if it had the date wayback. This works perfectly with PostgreSQL, but might need some raw SQL casting in MySQL, as described in the documentation.
On search screens, users can sort the results by clicking on a column header. Unfortunately, this doesn't work for all columns. It works fine for regular fields like name and price that are stored on the table itself. It also works for many-to-one fields by joining to the referenced table and using the default sort order for that table.
What doesn't work is most functional fields and related fields. (Related fields are a type of functional field.) When you click on the column, it just ignores you. If you change the field definition to be stored in the database, then you can sort by it, but is that necessary? Is there any way to sort by a functional field without storing its values in the database?
Apparently there has been some discussion of this, and CampToCamp posted a merge proposal with a general solution. There's also some discussion in their blog.
I haven't tried their solution yet, but I did create a specific solution for one field by overriding the _generate_order_by() method. Whenever the user clicks on a column header, _generate_order_by() tries to generate an appropriate ORDER BY clause. I found that you can actually put a SQL subquery in the ORDER BY clause to reproduce the values for a functional field.
As an example, we added a functional field to display the first supplier's name for each product.
def _product_supplier_name(self, cr, uid, ids, name, arg, context=None):
    res = {}
    for product in self.browse(cr, uid, ids, context):
        supplier_name = ""
        if len(product.seller_ids) > 0:
            supplier_name = product.seller_ids[0].name.name
        res[product.id] = supplier_name
    return res
In order to sort by that column, we overrode _generate_order_by() with some pretty funky SQL. For any other column, we delegate to the regular code.
def _generate_order_by(self, order_spec, query):
    """ Calculate the order by clause to use in SQL based on a set of
    model fields. """
    if order_spec != 'default_partner_name':
        return super(product_product, self)._generate_order_by(order_spec,
                                                               query)
    return """
        ORDER BY
        (
        select min(supp.name)
        from product_supplierinfo supinf
        join res_partner supp
        on supinf.name = supp.id
        where supinf.product_id = product_product.id
        ),
        product_product.default_code
        """
The reason for storing the field is that it lets you delegate sorting to SQL, which will certainly perform better than any sorting done afterwards.
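The subquery-in-ORDER-BY trick from the overridden _generate_order_by() can be tried in isolation. The following is a rough sketch with the standard sqlite3 module; the three tables only borrow their names and join columns from the SQL above, and the rows are invented sample data, so this is not the real OpenERP schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE res_partner (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product_product (id INTEGER PRIMARY KEY, default_code TEXT);
    -- As in the SQL above, product_supplierinfo.name holds the partner id.
    CREATE TABLE product_supplierinfo (product_id INTEGER, name INTEGER);

    INSERT INTO res_partner VALUES (1, 'Zeta Corp'), (2, 'Acme'), (3, 'Mid Supply');
    INSERT INTO product_product VALUES (10, 'P-10'), (11, 'P-11'), (12, 'P-12');
    INSERT INTO product_supplierinfo VALUES (10, 1), (11, 2), (11, 1), (12, 3);
    """
)

# Sort products by a value that is not stored on the product table: the
# alphabetically first supplier name, computed per row by a scalar subquery.
ordered_codes = conn.execute(
    """
    SELECT product_product.default_code
    FROM product_product
    ORDER BY (
        SELECT MIN(supp.name)
        FROM product_supplierinfo AS supinf
        JOIN res_partner AS supp ON supinf.name = supp.id
        WHERE supinf.product_id = product_product.id
    ),
    product_product.default_code
    """
).fetchall()

print(ordered_codes)
```

The database still does the sorting, so nothing has to be stored, but the subquery runs per row; whether that is acceptable depends on table sizes and indexes.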