Cleaner way to query on a dynamic number of columns in Django? - python

In my case, I have a number of column names coming from a form. I want to filter to make sure they're all true. Here's how I currently do it:
for op in self.cleaned_data['options']:
    cars = cars.filter((op, True))
Now it works, but there are possibly ~40 columns to test, so it doesn't appear very efficient to keep querying.
Is there a way I can condense this into one filter query?

Build the query as a dictionary and use the ** operator to unpack the options as keyword arguments to the filter method.
op_kwargs = {}
for op in self.cleaned_data['options']:
    op_kwargs[op] = True
cars = CarModel.objects.filter(**op_kwargs)
This is covered in the django documentation and has been covered on SO as well.
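The same dictionary can also be built in one expression with a dict comprehension. A minimal sketch, assuming the submitted option names are valid boolean field names on the model; `build_filter_kwargs` and the option names are hypothetical and not part of Django:

```python
def build_filter_kwargs(options):
    """Map each selected column name to True, ready to unpack into filter()."""
    return {name: True for name in options}


# Hypothetical option names standing in for self.cleaned_data['options'].
op_kwargs = build_filter_kwargs(["has_sunroof", "has_gps"])
print(op_kwargs)

# With Django available, the dictionary would be unpacked like this:
# cars = CarModel.objects.filter(**op_kwargs)
```

Since the names come from a form, it is probably worth validating them against a known list of fields before unpacking them into a query.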

Django's QuerySets are lazy, so what you're currently doing is actually pretty efficient. The database won't be hit until the QuerySet is evaluated (for example, when you iterate over it)... assuming, that is, that you didn't edit out some code, and it is effectively like this:
cars = CarModel.objects.all()
for op in self.cleaned_data['options']:
    cars = cars.filter((op, True))
More information here.

Related

Different behavior between multiple nested lookups inside .filter and .exclude

What's the difference between having multiple nested lookups inside queryset.filter and queryset.exclude?
For example car ratings. User can create ratings of multiple types for any car.
class Car(Model):
    ...
class Rating(Model):
    type = ForeignKey('RatingType')  # names like engine, design, handling
    user = ...  # user
Let's try to get a list of cars without a rating by user "A" of type "design".
Approach 1
car_ids = Car.objects.filter(
    rating__user="A", rating__type__name="design"
).values_list('id', flat=True)
Car.objects.exclude(id__in=car_ids)
Approach 2
Car.objects.exclude(
    rating__user="A", rating__type__name="design"
)
Approach 1 works well for me, whereas Approach 2 appears to exclude more cars. My suspicion is that the nested lookups inside exclude do not behave like AND (for the rating) but rather like OR.
Is that true? If not, why do these two approaches result in different querysets?
Regarding filter, "multiple parameters are joined via AND in the underlying SQL statement". Your first approach results not in one but in two SQL queries roughly equivalent to:
SELECT ... WHERE rating.user='A' AND rating.type.name='design';
SELECT ... WHERE car.id NOT IN (id1, id2, id3 ...);
Here's the part of the documentation that answers your question very precisely regarding exclude:
https://docs.djangoproject.com/en/stable/ref/models/querysets/#exclude
The evaluated SQL query would look like:
SELECT ... WHERE NOT (rating.user='A' AND rating.type.name='design')
Nested lookups inside filter and exclude behave similarly and use AND conditions. At the end of the day, most of the time, your two approaches are indeed equivalent... except that the Car table might have been updated between the first and the second query of your approach 1.
Are you sure that's not the case? To be sure, you could wrap the two lines of approach 1 in a transaction.atomic block. In any case, your second approach is probably the best (the fewer queries, the better).
If you have any doubt, you can also have a look at the evaluated queries (see here or here).
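One thing to check when comparing the generated queries: rating is a multi-valued relation, and Django's documentation on spanning multi-valued relationships notes that the conditions in a single exclude() call do not necessarily refer to the same related object. The plain-Python sketch below (no Django involved; the sample data and helper names are made up for illustration) shows how "one rating matches both conditions" and "some rating matches each condition" can exclude different cars:

```python
# Hypothetical sample data: car id -> list of (user, rating type) pairs.
ratings = {
    1: [("A", "design")],                    # one rating matches both conditions
    2: [("A", "engine"), ("B", "design")],   # conditions met by different ratings
    3: [("B", "engine")],                    # neither condition met
}


def same_rating_matches(car_ratings):
    """True if a single rating is by user A and of type design."""
    return any(user == "A" and rtype == "design" for user, rtype in car_ratings)


def each_condition_matches(car_ratings):
    """True if some rating is by user A and some rating (maybe another) is design."""
    return (any(user == "A" for user, _ in car_ratings)
            and any(rtype == "design" for _, rtype in car_ratings))


kept_same_rating = [car for car, rs in ratings.items() if not same_rating_matches(rs)]
kept_each_condition = [car for car, rs in ratings.items() if not each_condition_matches(rs)]

print(kept_same_rating)     # [2, 3]
print(kept_each_condition)  # [3]
```

If the second reading is what exclude() produces for your models, car 2 would be dropped by approach 2 but kept by approach 1, which would match the "excluding more cars" observation.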

Index of row looping over django queryset [duplicate]

I have a QuerySet, let's call it qs, which is ordered by some attribute that is irrelevant to this problem. Then I have an object, let's call it obj. Now I'd like to know what index obj has in qs, as efficiently as possible. I know that I could use .index() from Python or possibly loop through qs comparing each object to obj, but what is the best way to go about doing this? I'm looking for high performance and that's my only criterion.
Using Python 2.6.2 with Django 1.0.2 on Windows.
If you're already iterating over the queryset and just want to know the index of the element you're currently on, the compact and probably the most efficient solution is:
for index, item in enumerate(your_queryset):
    ...
However, don't use this if you have a queryset and an object obtained by some unrelated means, and want to learn the position of this object in the queryset (if it's even there).
If you just want to know where your object sits amongst all others (e.g. when determining rank), you can do it quickly by counting the objects before it:
index = MyModel.objects.filter(sortField__lt = myObject.sortField).count()
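To see why counting works, here is a small plain-Python sketch of the same idea, assuming the queryset is ordered ascending by the sort field. The values and the `rank_by_count` helper are made up for illustration:

```python
def rank_by_count(values, target):
    """Number of values strictly smaller than target (what the count() query computes)."""
    return sum(1 for v in values if v < target)


sort_values = [30, 10, 20, 20, 40]   # hypothetical sortField values, in any order
ordered = sorted(sort_values)        # what an ascending order_by would produce

print(rank_by_count(sort_values, 30))  # 3
print(ordered.index(30))               # 3
print(rank_by_count(sort_values, 20))  # 1
```

Note that when several objects share the same sort value, the count gives the position of the first of them, so the result is only an exact index if the sort field is unique.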
Assuming for the purpose of illustration that your models are standard with a primary key id, then evaluating
list(qs.values_list('id', flat=True)).index(obj.id)
will find the index of obj in qs. While the use of list evaluates the queryset, it evaluates not the original queryset but a derived queryset. This evaluation runs a SQL query to get the id fields only, not wasting time fetching other fields.
QuerySets in Django are actually generators, not lists (for further details, see Django documentation on QuerySets).
As such, there is no shortcut to get the index of an element, and I think a plain iteration is the best way to do it.
For starters, I would implement your requirement in the simplest way possible (like iterating); if you really have performance issues, then I would use a different approach, like building a queryset with fewer fields.
In any case, the idea is to leave such tricks as late as possible, until you definitely know you need them.
Update: You may want to use a SQL statement directly to get the row number. However, Django's ORM does not support this natively, so you have to use a raw SQL query (see documentation). I think this could be the best option, but again, only if you really see a performance issue.
A simple Pythonic way to find the index of an element in a queryset:
(*qs,).index(instance)
This unpacks the queryset into a tuple, then uses Python's built-in index method to determine its position.
You can do this using queryset.extra(…) and some raw SQL like so:
from django.db import connection

queryset = queryset.order_by("id")
record500 = queryset[500]
numbered_qs = queryset.extra(select={
    'queryset_row_number': 'ROW_NUMBER() OVER (ORDER BY "id")'
})
cursor = connection.cursor()
cursor.execute(
    "WITH OrderedQueryset AS (" + str(numbered_qs.query) + ") "
    "SELECT queryset_row_number FROM OrderedQueryset WHERE id = %s",
    [record500.id]
)
index = cursor.fetchall()[0][0]
index == 501  # because ROW_NUMBER() is 1-indexed, not 0-indexed
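The ROW_NUMBER() idea can be tried outside Django as well. A minimal sketch using Python's built-in sqlite3 module, assuming the bundled SQLite is version 3.25 or later (needed for window functions); the `item` table and the `row_number_of` helper are made up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO item (id, name) VALUES (?, ?)",
                 [(10, "a"), (20, "b"), (30, "c"), (40, "d")])


def row_number_of(connection, item_id):
    """Return the 1-based position of item_id when rows are ordered by id."""
    row = connection.execute(
        "WITH ordered AS ("
        "  SELECT id, ROW_NUMBER() OVER (ORDER BY id) AS rn FROM item"
        ") SELECT rn FROM ordered WHERE id = ?",
        (item_id,),
    ).fetchone()
    return None if row is None else row[0]


print(row_number_of(conn, 30))  # 3
```

Subtract 1 from the result to get a zero-based index comparable to Python's list positions.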

How to get the equivalent of Python [:-1] in Django ORM?

I am writing a Django application where I want to get all the items but last from a query. My query goes like this:
objects = Model.objects.filter(name='alpha').order_by('rank')[:-1]
but it throws an error:
Assertion Error: Negative indexing not supported.
Any idea where I am going wrong?
Any suggestions will be appreciated.
You can use QuerySet.last() to get the last and use its id for excluding it from results.
objects = Model.objects.filter(name='alpha').order_by('rank')
last = objects.last()
if last is not None:  # last() returns None for an empty queryset
    objects = objects.exclude(pk=last.pk)
A query for excluding from the result all objects ranked with the minimum value found in DB:
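from django.db.models import Min

# aggregate() computes the minimum rank once for the whole table; annotating each
# row with Min('rank') would only compare a row's rank with itself.
min_rank = Model.objects.aggregate(min_rank=Min('rank'))['min_rank']
# Exclude all objects ranked with the minimum value found
objects = Model.objects.exclude(rank=min_rank)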
EDITED
Django does not support negative indexing on QuerySets. Please read https://code.djangoproject.com/ticket/13089 for more information.
The quick and "dirty" way to do it is to convert the QuerySet to a list and then use negative indexing.
objects = list( Model.objects.filter(name='alpha').order_by('rank') )[:-1]
Please do note that the objects variable is no longer a queryset but a list.
However, I would recommend using the .exclude() method instead.
If you wish to use the .exclude() method, please read the solution @RaydelMiranda wrote below.
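Another option that keeps the result a QuerySet is to replace the negative index with a non-negative one computed from the row count. A minimal sketch, where `stop_index` is a hypothetical helper and a plain list stands in for the ordered queryset:

```python
def stop_index(total, drop=1):
    """Slice end that leaves out the last `drop` items, never negative."""
    return max(total - drop, 0)


ranks = [1, 2, 3, 4]                       # stands in for an ordered queryset
print(ranks[:stop_index(len(ranks))])      # [1, 2, 3]
print([][:stop_index(0)])                  # []

# With Django, the count would come from the database instead of len():
# qs = Model.objects.filter(name='alpha').order_by('rank')
# objects = qs[:stop_index(qs.count())]
```

This costs an extra COUNT query, and rows added or removed between the count and the sliced query could change which object ends up being left out.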
Negative indexing is not allowed in Django.
However, you can reverse the ordering in order_by and skip objects from the start instead.
You can do something like this:
objects = Model.objects.filter(name='alpha').order_by('-rank')[n:]
Here n is the number of objects to skip, counted from the end of the original ordering. Note that the results come back in descending rank order. In your case it would be:
objects = Model.objects.filter(name='alpha').order_by('-rank')[1:]
query = model.objects.filter(user=request.user)
if query.exists():
    query = query.last()

Doing "group by" in django but still retaining complete object

I want to do a GROUP BY in Django. I saw answers on Stack Overflow that recommend:
Member.objects.values('designation').annotate(dcount=Count('designation'))
This works, but the problem is you're getting a ValuesQuerySet instead of a QuerySet, so the queryset isn't giving me full objects but only specific fields. I want to get complete objects.
Of course, since we're grouping we need to choose which object to take out of each group; I want a way to specify the object (e.g. take the one with the biggest value in a certain field, etc.)
Does anyone know how I can do that?
If you're willing to make two queries, you could do the following:
dcounts = Member.objects.values('id', 'designation').annotate(dcount=Count('designation')).order_by('-dcount')
member = Member.objects.get(id=dcounts.first()['id'])
If you wanted the top five objects by dcount, you could do the following:
ids = [dcount['id'] for dcount in dcounts[:5]]
members = Member.objects.filter(id__in=ids)
It sounds like you don't necessarily need to GROUP BY, but just want to limit your selection to one item per field (eg, the MAX value of a certain field).
Can you try getting distinct objects by field, such as
On PostgreSQL (the only backend where distinct() accepts field names):
Member.objects.order_by('designation').distinct('designation')
On other databases, distinct() cannot take field names, so this approach is not available there.
https://docs.djangoproject.com/en/dev/ref/models/querysets/#django.db.models.query.QuerySet.distinct
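If the set of objects is small enough to load into memory, picking one complete object per group can also be done in plain Python after a single query. A minimal sketch, assuming a hypothetical numeric `salary` field as the "biggest value" criterion; the namedtuple stands in for the Member model and `best_per_group` is not a Django function:

```python
from collections import namedtuple

Member = namedtuple("Member", ["id", "designation", "salary"])  # stand-in for the model


def best_per_group(members, group_key, rank_key):
    """Return one complete object per group: the one with the largest rank_key."""
    best = {}
    for member in members:
        group = group_key(member)
        if group not in best or rank_key(member) > rank_key(best[group]):
            best[group] = member
    return list(best.values())


members = [
    Member(1, "engineer", 100),
    Member(2, "engineer", 120),
    Member(3, "manager", 90),
]
picked = best_per_group(members, lambda m: m.designation, lambda m: m.salary)
print([m.id for m in picked])  # [2, 3]
```

With real model instances, `members` would be something like `Member.objects.all()`, at the cost of fetching every row.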

Union on ValuesQuerySet in django

I've been searching for a way to take the union of querysets in django. From what I read you can use query1 | query2 to take the union... This doesn't seem to work when using values() though. I'd skip using values until after taking the union but I need to use annotate to take the sum of a field and filter on it and since there's no way to do "group by" I have to use values(). The other suggestions I read were to use Q objects but I can't think of a way that would work.
Do I pretty much need to just use straight SQL or is there a django way of doing this?
What I want is:
q1 = mymodel.objects.filter(date__lt = '2010-06-11').values('field1','field2').annotate(volsum=Sum('volume')).exclude(volsum=0)
q2 = mymodel.objects.values('field1','field2').annotate(volsum=Sum('volume')).exclude(volsum=0)
query = q1|q2
But this doesn't work and as far as I know I need the "values" part because there's no other way for Sum to know how to act since it's a 15 column table.
QuerySet.values() does not return a QuerySet, but rather a ValuesQuerySet, which does not support this operation. Convert them to lists then add them.
query = list(q1) + list(q2)
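Plain concatenation keeps a row twice if it appears in both querysets, which a true union would not. A minimal sketch of a de-duplicating merge over the dictionaries that values() yields; `union_rows` and the sample rows are made up for illustration, and it assumes all values are hashable:

```python
def union_rows(*row_lists):
    """Concatenate lists of dict rows, dropping exact duplicates, keeping order."""
    seen = set()
    merged = []
    for rows in row_lists:
        for row in rows:
            key = tuple(sorted(row.items()))  # hashable fingerprint of the row
            if key not in seen:
                seen.add(key)
                merged.append(row)
    return merged


# Hypothetical rows shaped like the output of values('field1', 'field2').annotate(...).
q1_rows = [{"field1": "x", "field2": "y", "volsum": 5}]
q2_rows = [{"field1": "x", "field2": "y", "volsum": 5},
           {"field1": "x", "field2": "z", "volsum": 7}]

print(union_rows(q1_rows, q2_rows))
```

With real querysets this would be called as `union_rows(q1, q2)`. Django 1.11 and later also provide QuerySet.union(), which may be worth checking first.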
