Django remove duplicates from queryset - python

I want to remove duplicates across related fields. My queryset looks like this:
example = models.Object.objects.values('name', 'photo__name', 'url', 'photo__url').distinct()
If name == photo__name and url == photo__url, I need to drop one of them. Can I do this with the Django ORM, or do I need to iterate through the queryset?

If you are using PostgreSQL, check out the Django docs on distinct():
On PostgreSQL only, you can pass positional arguments (*fields) in order to specify the names of fields to which the DISTINCT should apply...
When you specify field names, you must provide an order_by() in the QuerySet, and the fields in order_by() must start with the fields in distinct(), in the same order.
Thus, in your example, you can remove duplicates on certain fields by using:
.order_by('photo__name', 'photo__url').distinct('photo__name', 'photo__url')
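To make the rule from the docs concrete, here is a minimal pure-Python sketch of what ORDER BY plus DISTINCT ON does to a set of rows. It is not Django code: distinct_on is a hypothetical helper, and the sample rows are made up for illustration.

```python
from itertools import groupby


def distinct_on(rows, distinct_fields, order_fields):
    """Emulate ORDER BY order_fields + DISTINCT ON distinct_fields on dict rows."""
    # Same rule as the docs: the ordering must start with the distinct fields.
    if list(order_fields[:len(distinct_fields)]) != list(distinct_fields):
        raise ValueError("order_fields must start with distinct_fields")
    ordered = sorted(rows, key=lambda row: tuple(row[f] for f in order_fields))
    # After sorting, rows that share the distinct fields are adjacent,
    # so keeping the first row of each run gives one row per combination.
    return [
        next(group)
        for _, group in groupby(
            ordered, key=lambda row: tuple(row[f] for f in distinct_fields)
        )
    ]


# Made-up rows shaped like the output of values().
rows = [
    {"name": "b", "photo__name": "cat", "photo__url": "/cat.png"},
    {"name": "a", "photo__name": "cat", "photo__url": "/cat.png"},
    {"name": "c", "photo__name": "dog", "photo__url": "/dog.png"},
]
unique = distinct_on(
    rows,
    distinct_fields=["photo__name", "photo__url"],
    order_fields=["photo__name", "photo__url", "name"],
)
```

The extra ordering field ("name" here) decides which of the duplicate rows survives, which is why the ordering matters at all.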

To reference a model's own fields in a filter, you can use Django's F() expressions: https://docs.djangoproject.com/en/dev/topics/db/queries/#filters-can-reference-fields-on-the-model
But I guess you cannot delete just one of them :) You have to decide which one you want to delete.
UPDATE
When you filter with Object.objects.filter(photo__name='something'), you filter the Object table by the related photo's name, so you are dealing with a join over two tables. If you want to exclude objects whose name equals the related photo's name, you can do something like this:
from django.db.models import F
Object.objects.exclude(name=F('photo__name'))
Is that useful?
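If you do end up iterating, as the question suggests, a rough sketch on plain dicts shaped like the output of values() could look like the following. This is only one reading of the question (drop the rows where both pairs match); drop_self_duplicates is a hypothetical helper and the rows are made up.

```python
def drop_self_duplicates(rows):
    """Keep only rows whose own name/url differ from the related photo's."""
    return [
        row
        for row in rows
        if not (
            row["name"] == row["photo__name"]
            and row["url"] == row["photo__url"]
        )
    ]


# Made-up rows shaped like the output of
# Object.objects.values('name', 'photo__name', 'url', 'photo__url').
rows = [
    {"name": "cat", "url": "/cat", "photo__name": "cat", "photo__url": "/cat"},
    {"name": "cat", "url": "/cat", "photo__name": "dog", "photo__url": "/dog"},
]
kept = drop_self_duplicates(rows)
```

Doing the same comparison in the database with F() avoids pulling the matching rows into Python at all.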

Related

Django - Annotate multiple fields from a Subquery

I'm working on a Django project in which I have a queryset of 'A' objects ( A.objects.all() ), and I need to annotate multiple fields from a Subquery over 'B' objects. The problem is that the annotate method can only deal with one field type per parameter (DecimalField, CharField, etc.), so in order to annotate multiple fields I must use something like:
A.objects.all().annotate(
    b_id=Subquery(B_queryset.values('id')[:1]),
    b_name=Subquery(B_queryset.values('name')[:1]),
    b_other_field=Subquery(B_queryset.values('other_field')[:1]),
    ...
)
This is very inefficient, as it creates a new subquery/subselect in the final SQL for each field I want to annotate. I would like to use the same subselect with multiple fields in its values() params and annotate them all on A's queryset. I'd like to use something like this:
b_subquery = Subquery(B_queryset.values('id', 'name', 'other_field', ...)[:1])
A.objects.all().annotate(b=b_subquery)
But when i try to do that (and access the first element A.objects.all().annotate(b=b_subquery)[0]) it raises an exception:
{FieldError}Expression contains mixed types. You must set output_field.
And if i set Subquery(B_quer...[:1], output_field=ForeignKey(B, models.DO_NOTHING)), i get a DB exception:
{ProgrammingError}subquery must return only one column
In a nutshell, the whole problem is that I have multiple Bs that "belong" to an A, so I need to use Subquery to pick a specific B for every A in A.objects.all() and attach it to that A, using OuterRefs and a few filters (I only want a few fields of B). This seems like a trivial problem to me.
Thanks for any help in advance!
What I do in such situations is use prefetch_related:
a_qs = A.objects.all().prefetch_related(
    models.Prefetch(
        'b_set',
        # NOTE: no need to filter with OuterRef (it won't work anyway);
        # Django automatically filters and matches B objects to A
        queryset=B_queryset,
        to_attr='b_records',
    )
)
Now a.b_records will be a list containing a's related b objects. Depending on how you filter your B_queryset this list may be limited to only 1 object.
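Conceptually, prefetch_related runs a separate query for the related objects and does the matching in Python. A rough stand-in for that matching step on plain dicts might look like this; attach_related and the sample rows are hypothetical, not part of Django.

```python
from collections import defaultdict

# Made-up stand-ins for the rows of two separate queries.
a_rows = [{"id": 1}, {"id": 2}, {"id": 3}]
b_rows = [
    {"id": 10, "a_id": 1, "name": "first"},
    {"id": 11, "a_id": 1, "name": "second"},
    {"id": 12, "a_id": 2, "name": "third"},
]


def attach_related(parents, children, fk="a_id", to_attr="b_records"):
    """Group children by foreign key and attach each group to its parent."""
    by_parent = defaultdict(list)
    for child in children:
        by_parent[child[fk]].append(child)
    # Parents without children get an empty list, like an empty prefetch.
    return [
        {**parent, to_attr: by_parent.get(parent["id"], [])}
        for parent in parents
    ]


result = attach_related(a_rows, b_rows)
```

This also shows why the answer ends up with a list per A rather than extra columns on A: every matching B is kept, and narrowing to one is up to the filter on the B queryset.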

How to get all values for a certain field in django ORM?

I have a table called user_info. I want to get names of all the users. So the table has a field called name. So in sql I do something like
SELECT distinct(name) from user_info
But I am not able to figure out how to do the same in django. Usually if I already have certain value known, then I can do something like below.
user_info.objects.filter(name='Alex')
And then get the information for that particular user.
But in this case for the given table, I want to get all the name values using django ORM just like I do in sql.
Here is my django model
class user_info(models.Model):
    name = models.CharField(max_length=255)
    priority = models.CharField(max_length=1)
    org = models.CharField(max_length=20)
How can I do this in django?
You can use values_list.
user_info.objects.values_list('name', flat=True).distinct()
Note, in Python classes are usually defined in InitialCaps: your model should be UserInfo.
You can use values_list() as given in Daniel's answer, which returns the values of the field. Alternatively, you can use values() like this:
user_info.objects.values('name')
which returns a queryset of dictionaries. Both values_list() and values() are used to select specific columns of a table.
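To compare the result shapes without a Django project, here is a small sketch that runs the SQL from the question against an in-memory SQLite table. The sample rows and the ORDER BY are my own additions, and the comments only describe which ORM call each shape resembles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_info (name TEXT, priority TEXT, org TEXT)")
conn.executemany(
    "INSERT INTO user_info VALUES (?, ?, ?)",
    [("Alex", "1", "acme"), ("Sam", "2", "acme"), ("Alex", "3", "other")],
)

# The query from the question, with an ORDER BY added for a stable result.
rows = conn.execute(
    "SELECT DISTINCT name FROM user_info ORDER BY name"
).fetchall()

as_tuples = rows                                  # like values_list('name')
as_flat = [name for (name,) in rows]              # like values_list('name', flat=True)
as_dicts = [{"name": name} for (name,) in rows]   # like values('name')
```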
Adding to the accepted answer: if the field is a foreign key, the queryset returns the numeric id values. If you expect some other value defined on the related model, follow the relation in the query like this:
Post.objects.values_list('author__username')
Here Post is a model with author as a foreign key field, and the related model has a username field. The "author" field is followed by a double underscore and the field name "username"; otherwise the primary key of the related model is returned in the queryset. I assume this was @Carlo's doubt about the accepted answer.

Django postgres order_by distinct on field

We have a limitation for order_by/distinct fields.
From the docs: "fields in order_by() must start with the fields in distinct(), in the same order"
Now here is the use case:
class Course(models.Model):
    is_vip = models.BooleanField()
    ...
class CourseEvent(models.Model):
    date = models.DateTimeField()
    course = models.ForeignKey(Course)
The goal is to fetch the courses, ordered by nearest date but vip goes first.
The solution could look like this:
CourseEvent.objects.order_by('-course__is_vip', '-date',).distinct('course_id',).values_list('course')
But it raises an error because of this limitation.
I understand why ordering is necessary when using distinct: we get the first row for each value of course_id, so if we don't specify an order we get some arbitrary row.
But what is the purpose of restricting the ordering to start with the same field that we have distinct on?
If I change order_by to something like ('course_id', '-course__is_vip', 'date',), it gives me one row per course, but the order of the courses has nothing to do with the goal.
Is there any way to bypass this limitation besides walking through the entire queryset and filtering it in a loop?
You can use a nested query using id__in. In the inner query you single out the distinct events and in the outer query you custom-order them:
CourseEvent.objects.filter(
    id__in=CourseEvent.objects
    .order_by('course_id', '-date')
    .distinct('course_id')
).order_by('-course__is_vip', '-date')
From the docs on distinct(*fields):
When you specify field names, you must provide an order_by() in the QuerySet, and the fields in order_by() must start with the fields in distinct(), in the same order.
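The two stages of the nested query can be traced on plain data. The sketch below is pure Python with made-up events, and is_vip is copied onto each event only to keep the rows flat; it mirrors the inner "one event per course" step and the outer reordering step.

```python
from itertools import groupby

# Made-up events; ISO date strings sort correctly as plain text.
events = [
    {"id": 1, "course_id": 1, "is_vip": False, "date": "2024-01-10"},
    {"id": 2, "course_id": 1, "is_vip": False, "date": "2024-03-01"},
    {"id": 3, "course_id": 2, "is_vip": True, "date": "2024-02-05"},
    {"id": 4, "course_id": 2, "is_vip": True, "date": "2024-01-20"},
    {"id": 5, "course_id": 3, "is_vip": False, "date": "2024-04-15"},
]

# Inner step: order_by('course_id', '-date').distinct('course_id')
by_course = sorted(events, key=lambda e: e["date"], reverse=True)
by_course.sort(key=lambda e: e["course_id"])  # stable: keeps the date order
picked_ids = {
    next(group)["id"]
    for _, group in groupby(by_course, key=lambda e: e["course_id"])
}

# Outer step: filter(id__in=...).order_by('-course__is_vip', '-date')
picked = [e for e in events if e["id"] in picked_ids]
picked.sort(key=lambda e: e["date"], reverse=True)
picked.sort(key=lambda e: e["is_vip"], reverse=True)  # stable: VIP first

course_order = [e["course_id"] for e in picked]
```

Because the outer query no longer uses distinct(), it is free to order by any fields, which is how the nested form sidesteps the limitation.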

Exclude field from values() or values_list()

Is there an efficient way to exclude fields from values() or values_list()?
e.g.
Videos.objects.filter(id=1).values()
I want to exclude the field duration from this queryset.
I know that I can specify the fields I want in the result, but what if I want every field except one? For example, if I have 20 fields and want all of them but one.
Thanks
You can use defer(). This leaves the named fields out of the SELECT query. Note that defer() returns model instances rather than dictionaries.
Videos.objects.filter(...).defer('duration')
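You can get all field names first, then remove the ones you do not want:
# get_all_field_names() was removed in Django 1.10; build the list from _meta instead
fields = [f.name for f in Videos._meta.concrete_fields]
fields.remove('duration')
Videos.objects.filter(...).values(*fields)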
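The list handling itself can be wrapped in a small helper. This is only a sketch: fields_except is a hypothetical function, and the field names other than duration are made up.

```python
def fields_except(all_fields, *excluded):
    """Return all_fields minus the excluded names, preserving order."""
    unknown = set(excluded) - set(all_fields)
    if unknown:
        # Fail loudly on a typo instead of silently excluding nothing.
        raise ValueError(f"unknown field(s): {sorted(unknown)}")
    return [name for name in all_fields if name not in excluded]


# Made-up field names standing in for the list built from the model.
all_fields = ["id", "title", "duration", "url"]
wanted = fields_except(all_fields, "duration")
# With Django, this list would be passed on as values(*wanted).
```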

Custom SQL in the django ORM: Filtering a list based off another filtered list in one query

What I'm trying to do is retrieve a filtered list of CarModel objects where carfield is in a list of field values from another model, say GasModel. But the set of GasModels must itself be filtered, down to the rows where a field in GasModel equals another (different) field from CarModel.
So I want to filter a list so that one of its fields is contained in a separate list of fields from a different model, and that second list is also filtered, by a different field of the first (car) model. I'd like this to be a single queryset call.
This is what I have so far. I believe the error is in:
WHERE anothergasfield = another_field_from_car_carmodel
Am I missing a FROM keyword or something? And if so where should it go?
CarModel.objects.extra(where=[
    'carfield IN (SELECT gasfield FROM gas_gasmodel '
    'WHERE anothergasfield = another_field_from_car_carmodel)'
]).order_by(...)
Thanks
How about this:
CarModel.objects.extra(where=[
    'carfield IN (SELECT gasfield FROM gas_gasmodel '
    'WHERE anothergasfield = carmodel.another_field_from_car_carmodel)'
]).order_by(...)
Just replace carmodel with the table name for CarModel, which is usually {{ app_name }}_{{ model_name }}.
You have a nested SELECT statement, and unqualified fields inside it refer to the gas_gasmodel table, not to the CarModel table.
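To check that a table-qualified column works inside the nested SELECT, here is a sketch against an in-memory SQLite database. The table name car_carmodel assumes the app is called car, another_field is a shortened stand-in for the real column, and the rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE car_carmodel (
        id INTEGER PRIMARY KEY, carfield TEXT, another_field TEXT);
    CREATE TABLE gas_gasmodel (
        id INTEGER PRIMARY KEY, gasfield TEXT, anothergasfield TEXT);
    INSERT INTO car_carmodel VALUES (1, 'x', 'k1'), (2, 'y', 'k2'), (3, 'z', 'k3');
    INSERT INTO gas_gasmodel VALUES (1, 'x', 'k1'), (2, 'y', 'other'), (3, 'q', 'k3');
""")

# The WHERE fragment in the shape passed to extra(where=[...]); the outer
# column is qualified with its table name so the subquery can refer to it.
where = (
    "carfield IN (SELECT gasfield FROM gas_gasmodel "
    "WHERE anothergasfield = car_carmodel.another_field)"
)
matching_ids = [
    row[0]
    for row in conn.execute(
        f"SELECT id FROM car_carmodel WHERE {where} ORDER BY id"
    )
]
```

Only the first car matches here: its carfield appears among the gas rows that share its other field, while the remaining cars fail one of the two conditions.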
