Django postgres order_by distinct on field - python

We have a limitation for order_by/distinct fields.
From the docs: "fields in order_by() must start with the fields in distinct(), in the same order"
Now here is the use case:
class Course(models.Model):
    is_vip = models.BooleanField()
    ...
class CourseEvent(models.Model):
    date = models.DateTimeField()
    course = models.ForeignKey(Course)
The goal is to fetch the courses ordered by nearest date, with VIP courses first.
The solution could look like this:
CourseEvent.objects.order_by('-course__is_vip', '-date',).distinct('course_id',).values_list('course')
But it raises an error because of this limitation.
Yeah I understand why ordering is necessary when using distinct - we get the first row for each value of course_id so if we don't specify an order we would get some arbitrary row.
But what's the purpose of limiting order to the same field that we have distinct on?
If I change order_by to something like ('course_id', '-course__is_vip', 'date',) it would give me one row for course but the order of courses will have nothing in common with the goal.
Is there any way to bypass this limitation besides walking through the entire queryset and filtering it in a loop?

You can use a nested query using id__in. In the inner query you single out the distinct events and in the outer query you custom-order them:
CourseEvent.objects.filter(
    id__in=CourseEvent.objects
        .order_by('course_id', '-date').distinct('course_id')
).order_by('-course__is_vip', '-date')
From the docs on distinct(*fields):
When you specify field names, you must provide an order_by() in the QuerySet, and the fields in order_by() must start with the fields in distinct(), in the same order.
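To see what the two query levels do, here is a minimal plain-Python sketch (no database involved) that mimics them on in-memory rows. The Event tuple and the sample data are made up for illustration. The inner step keeps one event per course (the first one after sorting by course_id, then by date descending, which is how DISTINCT ON picks its row), and the outer step re-sorts the survivors with VIP courses first.

```python
from collections import namedtuple
from datetime import date

# Hypothetical stand-in for a CourseEvent row joined with Course.is_vip.
Event = namedtuple("Event", "id course_id is_vip date")

events = [
    Event(1, 10, False, date(2024, 1, 5)),
    Event(2, 10, False, date(2024, 3, 1)),
    Event(3, 20, True, date(2024, 2, 1)),
    Event(4, 20, True, date(2024, 1, 1)),
    Event(5, 30, False, date(2024, 4, 1)),
]


def distinct_on_course(rows):
    """Mimic order_by('course_id', '-date').distinct('course_id')."""
    ordered = sorted(rows, key=lambda e: e.date, reverse=True)  # '-date'
    ordered.sort(key=lambda e: e.course_id)  # stable sort, so date order survives
    first_per_course = {}
    for e in ordered:
        first_per_course.setdefault(e.course_id, e)  # keep first row of each group
    return list(first_per_course.values())


def reorder(rows):
    """Mimic the outer order_by('-course__is_vip', '-date')."""
    ordered = sorted(rows, key=lambda e: e.date, reverse=True)
    ordered.sort(key=lambda e: e.is_vip, reverse=True)  # VIP rows first
    return ordered


result = [e.id for e in reorder(distinct_on_course(events))]
print(result)
```

The point of the split is that the sort used to choose one row per course and the sort used to present the result are independent, which is exactly what a single order_by() combined with distinct() cannot express.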

Related

How to get all values for a certain field in django ORM?

I have a table called user_info. I want to get names of all the users. So the table has a field called name. So in sql I do something like
SELECT distinct(name) from user_info
But I am not able to figure out how to do the same in django. Usually if I already have certain value known, then I can do something like below.
user_info.objects.filter(name='Alex')
And then get the information for that particular user.
But in this case for the given table, I want to get all the name values using django ORM just like I do in sql.
Here is my django model
class user_info(models.Model):
    name = models.CharField(max_length=255)
    priority = models.CharField(max_length=1)
    org = models.CharField(max_length=20)
How can I do this in django?
You can use values_list.
user_info.objects.values_list('name', flat=True).distinct()
Note, in Python classes are usually defined in InitialCaps: your model should be UserInfo.
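For comparison, values_list('name', flat=True).distinct() corresponds roughly to a SELECT DISTINCT on that one column. Below is a minimal sketch using the standard-library sqlite3 module rather than Django; the table contents are made up, and a real Django table name would normally carry the app label as a prefix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_info (name TEXT, priority TEXT, org TEXT)")
conn.executemany(
    "INSERT INTO user_info VALUES (?, ?, ?)",
    [("Alex", "1", "acme"), ("Bea", "2", "acme"), ("Alex", "3", "other")],
)

# One column, duplicates removed; ORDER BY only makes the output deterministic.
rows = conn.execute("SELECT DISTINCT name FROM user_info ORDER BY name").fetchall()

# fetchall() yields 1-tuples; unpacking them is comparable to flat=True in Django.
names = [name for (name,) in rows]
print(names)
```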
You can use values_list() as given in Daniel's answer, which returns the values of the field. Or you can use values() like this:
user_info.objects.values('name')
which returns a queryset of dictionaries, one per row. values_list() and values() are used to select the columns in a table.
Adding on to the accepted answer: if the field is a foreign key, the (numerical) id values are returned in the queryset. If you are expecting some other field defined on the model the foreign key points to, modify the query like this:
`Post.objects.values_list('author__username')`
Post is a model class with author as a foreign key field, and the related model has a username field.
Here, the "author" field is followed by a double underscore and then the field "username"; otherwise the primary key of the related model is returned in the queryset. I assume this was @Carlo's doubt in the accepted answer.

Django filter 'first' lookup for many-to-many relationships

Let's say we have two models like these:
class Artist(models.Model):
    name = models.CharField(max_length=50)
class Track(models.Model):
    title = models.CharField(max_length=50)
    artist = models.ForeignKey(Artist, related_name='tracks')
How can I filter this relationship to get the first foreign record?
So I've tried something like this, but it didn't work (as expected)
artists = Artist.objects.filter(tracks__first__title=<some-title>)
artists = Artist.objects.filter(tracks[0]__title=<some-title>)
Is there any way to make this work?
Here's a solution not taking performance into consideration.
Artist.objects.filter(tracks__in=[a.tracks.first() for a in Artist.objects.all()], tracks__title=<some_title>)
No list approach, as requested.
Artist.objects.filter(tracks__in=Track.objects.all().distinct('artist').order_by('artist', 'id'), tracks__title=<some_title>)
The order_by 'id' is important to make sure distinct gets the first track based on insertion. The order_by 'artist' is a requirement for sorting distinct queries. Read about it here: https://www.postgresql.org/docs/9.0/static/sql-select.html#SQL-DISTINCT
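The same idea can be tried without PostgreSQL. The sketch below uses the standard-library sqlite3 module, which has no DISTINCT ON, so it swaps in MIN(id) per artist to pick each artist's first track. Table names, column names, and rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artist (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE track (id INTEGER PRIMARY KEY, title TEXT, artist_id INTEGER);
    INSERT INTO artist VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO track VALUES
        (1, 'Intro', 1), (2, 'Hit', 1),
        (3, 'Hit', 2), (4, 'Intro', 2);
""")


def artists_whose_first_track_is(title):
    # MIN(id) per artist stands in for DISTINCT ON (artist) ... ORDER BY artist, id.
    sql = """
        SELECT a.name
        FROM artist AS a
        JOIN track AS t ON t.artist_id = a.id
        WHERE t.title = ?
          AND t.id IN (SELECT MIN(id) FROM track GROUP BY artist_id)
        ORDER BY a.name
    """
    return [name for (name,) in conn.execute(sql, (title,))]


print(artists_whose_first_track_is("Hit"))
```

Both artists have a track called 'Hit', but only the artist whose lowest-id track carries that title is returned, because the title test and the first-track test apply to the same joined row.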

How to compare Specific value of two different tables in django?

I have two tables, 'Contact' and 'Subscriber'. I want to compare the Contact_id of both and show only those Contact_id values that are present in Contact but not in Subscriber. The two tables belong to two different models.
Something like this should work:
Contact.objects.exclude(
    id__in=Subscriber.objects.all()
).values_list('id', flat=True)
Note that Django evaluates a queryset passed to __in as a subselect, so this runs as one SQL statement containing a nested SELECT. I'm sure there are ways to optimize it, but this will usually work fine.
Also, the values_list has nothing to do with selecting the objects, it just modifies "format" of what is returned (list of IDs instead of queryset of objects - but same database records in both cases).
If you are excluding by some field other than Subscriber.id (e.g. Subscriber.quasy_id):
Contact.objects.exclude(
    id__in=Subscriber.objects.all().values_list('quasy_id', flat=True)
).values_list('id', flat=True)
Edit:
This answer assumes you don't have a relationship between your Contact and Subscriber models. If you do, then see @navit's answer; it is a better choice.
Edit 2:
That flat=True inside exclude is actually not needed.
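In SQL terms, the exclusion amounts to a NOT IN test against the ids collected from the other table. Here is a minimal sketch with the standard-library sqlite3 module; the subscriber.contact_id column and the sample rows are assumptions, since the question does not show the actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE subscriber (id INTEGER PRIMARY KEY, contact_id INTEGER NOT NULL);
    INSERT INTO contact VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO subscriber VALUES (1, 1), (2, 3);
""")

# Contact ids that never appear in subscriber.contact_id.
missing = [
    cid
    for (cid,) in conn.execute(
        "SELECT id FROM contact "
        "WHERE id NOT IN (SELECT contact_id FROM subscriber) "
        "ORDER BY id"
    )
]
print(missing)
```

The contact_id column is declared NOT NULL here on purpose: in SQL, a NULL inside the NOT IN list makes the test unknown for every row, so the query would return nothing.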
I assume you have your model like this:
class Subscriber(models.Model):
contact = models.ForeignKey(Contact)
You can do what you want like this:
my_list = Subscriber.objects.filter(contact=None)
This retrieves Subscribers which don't have a Contact. Retrieving a list of Contacts is straightforward.
If you want to compare field values across two tables that are connected by a ForeignKey, you can use an F() expression.
I assume the models look like this:
class Contact(models.Model):
    name = models.TextField()
    family = models.TextField()
class Subscriber(models.Model):
    subscriber_name = models.ForeignKey(Contact, on_delete=models.CASCADE)
    subscriber_family = models.TextField()
This would be the query (F() takes the field path as a string and follows the ForeignKey with a double underscore):
from django.db.models import F
# Subscribers whose subscriber_family equals the family of the related Contact
query = Subscriber.objects.filter(subscriber_family=F('subscriber_name__family'))

Django: Order by evaluation of whether or not a date is empty

In Django, is it possible to order by whether or not a field is None, instead of the value of the field itself?
I know I can send the QuerySet to python sorted() but I want to keep it as a QuerySet for subsequent filtering. So, I'd prefer to order in the QuerySet itself.
For example, I have a termination_date field and I want to first sort the ones without a termination_date, then I want to order by a different field, like last_name, first_name.
Is this possible or am I stuck using sorted() and then having to do an entire new Query with the included ids and run sorted() on the new QuerySet? I can do this, but would prefer not to waste the overhead and use the beauty of QuerySets that they don't run until evaluated.
Translation, how can I get this SQL from Django assuming my app is employee, my model is Employee and it has three fields 'first_name (varchar)', 'last_name (varchar)', and 'termination_date (date)':
SELECT
    "employee_employee"."last_name",
    "employee_employee"."first_name",
    "employee_employee"."termination_date"
FROM "employee_employee"
ORDER BY
    "employee_employee"."termination_date" IS NOT NULL,
    "employee_employee"."last_name",
    "employee_employee"."first_name"
You should be able to order by query expressions, like this:
from django.db.models import IntegerField, Case, Value, When
MyModel.objects.all().order_by(
    Case(
        When(some_field=None, then=Value(1)),
        default=Value(0),
        output_field=IntegerField(),
    ).asc(),
    'some_other_field'
)
I cannot test here so it might require a bit of fiddling around, but this should put rows that have a NULL some_field after those that have a some_field. And each set of rows should be sorted by some_other_field.
Granted, the CASE/WHEN is a bit more cumbersome than what you put in your question, but I don't know how to get the Django ORM to output that. Maybe someone else will have a better answer.
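The effect of a CASE-based sort key can be checked directly in SQL. The sketch below uses the standard-library sqlite3 module with a made-up employee table; it runs a hand-written CASE expression of the same shape as the one above, and also the IS NOT NULL form from the question, to show that they place the NULL rows at opposite ends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (first_name TEXT, last_name TEXT, termination_date TEXT)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        ("Ann", "Zed", None),
        ("Bob", "Young", "2020-01-31"),
        ("Cy", "Abel", None),
        ("Di", "Brown", "2019-06-30"),
    ],
)

# CASE yields 1 for NULL dates and 0 otherwise, so ascending order puts NULLs last.
nulls_last = [
    row[0]
    for row in conn.execute(
        """
        SELECT last_name FROM employee
        ORDER BY CASE WHEN termination_date IS NULL THEN 1 ELSE 0 END ASC, last_name
        """
    )
]

# The form from the question: IS NOT NULL is 0 for NULL dates, so NULLs come first.
nulls_first = [
    row[0]
    for row in conn.execute(
        "SELECT last_name FROM employee ORDER BY termination_date IS NOT NULL, last_name"
    )
]
print(nulls_last, nulls_first)
```

Swapping the 1 and 0 in the CASE expression (or sorting it descending) gives the NULLs-first order the question asks for.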
Spectras' answer works fine, but it only orders your records by 'null or not'. There is a shorter way that allows you to put empty dates wherever you want them in your date ordering - Coalesce:
from datetime import datetime
from django.db.models import Value
from django.db.models.functions import Coalesce
wayback = datetime(year=1, month=1, day=1)  # or whatever date you want
queryset = (
    MyModel.objects
    .annotate(null_date=Coalesce('date_field', Value(wayback)))
    .order_by('null_date')
)
This essentially sorts by 'date_field', with all records where date_field is None placed as if they had the date wayback. This works perfectly with PostgreSQL, but might need some raw SQL casting in MySQL, as described in the documentation.
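What the substitution does to the ordering can be shown in plain Python, with no database. This is only an emulation of the idea: the coalesce() helper and the sample rows are hypothetical, and None plays the role of SQL NULL.

```python
from datetime import datetime

wayback = datetime(year=1, month=1, day=1)  # placeholder used for missing dates

# Hypothetical (label, date_field) rows; None plays the role of SQL NULL.
rows = [
    ("b", datetime(2021, 5, 1)),
    ("a", None),
    ("c", datetime(2020, 1, 1)),
]


def coalesce(value, default):
    """Return default when value is None, like a two-argument SQL COALESCE."""
    return default if value is None else value


# Sort on the substituted value, so rows without a date land where wayback falls.
ordered = sorted(rows, key=lambda row: coalesce(row[1], wayback))
labels = [label for label, _ in ordered]
print(labels)
```

Choosing a placeholder far in the future instead of far in the past would move the rows without a date to the end of the ascending order.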

Django remove duplicates from queryset

I want to remove duplicates in related fields. My queryset example:
example = models.Object.objects.values('name', 'photo__name', 'url', 'photo__url').distinct()
If name == photo__name and url == photo__url, I need to delete one of them. How can I do this with the Django ORM, or do I need to iterate through the queryset?
If you are using PostgreSQL, check out the Django docs on distinct():
On PostgreSQL only, you can pass positional arguments (*fields) in order to specify the names of fields to which the DISTINCT should apply...
When you specify field names, you must provide an order_by() in the QuerySet, and the fields in order_by() must start with the fields in distinct(), in the same order.
Thus, in your example, you can remove duplicates on certain fields by using:
.order_by('photo__name', 'photo__url').distinct('photo__name', 'photo__url')
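One way to picture why the order_by() fields have to start with the distinct() fields: the rows are sorted so that duplicates sit next to each other, and the first row of each run is kept. The plain-Python sketch below emulates that on made-up dictionaries shaped like the values() output; it is an illustration of the idea, not what Django or PostgreSQL executes.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows shaped like the values() output in the question.
rows = [
    {"name": "x", "photo__name": "p1", "photo__url": "/p1"},
    {"name": "y", "photo__name": "p2", "photo__url": "/p2"},
    {"name": "z", "photo__name": "p1", "photo__url": "/p1"},
]

key = itemgetter("photo__name", "photo__url")

# Sort by the distinct fields so duplicates become adjacent,
# then keep the first row of each run.
ordered = sorted(rows, key=key)
unique = [next(group) for _, group in groupby(ordered, key=key)]
print([row["name"] for row in unique])
```

Which duplicate survives depends on the sort: adding further fields after the distinct ones in the sort key controls which row comes first in each run.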
To reference fields of model in filtering you can use Django ORM F function: https://docs.djangoproject.com/en/dev/topics/db/queries/#filters-can-reference-fields-on-the-model
But I guess you cannot delete one of them :) You have to decide which one you want to delete.
UPDATE
When you filter like Object.objects.filter(photo__name='something'), you filter the Object table by the related photo's name, so you are dealing with a join over two tables. If you want to exclude objects whose name equals the related photo's name, you should do something like this:
from django.db.models import F
Object.objects.exclude(name=F('photo__name'))
Is that useful?
