I read the Querysets Django docs regarding querysets being lazy, but still am a bit confused here.
So in my view, I set a queryset to a variable like so
players = Players.objects.filter(team=team)
Later on, I have a sorting mechanism that I can apply
sort = '-player_last_name'
if pts > 20:
    players = players.filter(pts__gte=pts).order_by(sort)
else:
    players = players.filter(pts__lte=pts).order_by(sort)
if ast < 5:
    players = players.filter(asts__lte=ast).order_by(sort)
else:
    players = players.filter(asts__gte=ast).order_by(sort)
context = {'players': players}
return render(request, '2021-2022/allstars.html', context)
What I want to know is, when is the players queryset evaluated? Is it when the page is rendered, or every time I assign the queryset to a variable? Because if it's the former, then I can just apply the .order_by(sort) chain once at the end, and the previous applications are redundant.
QuerySets are evaluated if you "consume" the queryset. You consume a queryset by enumerating over it, calling .get(…), .exists(…), .aggregate(…) or .count(…), checking its truthiness (for example with if myqueryset: or bool(myqueryset)), calling len(…) on it, etc. As a rule of thumb, it gets evaluated if you perform an action on it such that the result is no longer a QuerySet.
If you enumerate over it, or you call len(…) the result is cached in the QuerySet, so calling it a second time, or enumerating over it after you have called len(…) will not make another trip to the database.
In this specific case, none of the QuerySets are evaluated before you make the call to the render(…) function. If in the template you for example use {% if players %}, {% for player in players %}, {{ players|length }}, or {{ players.exists }}, then rendering the template will trigger the database query.
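To make the rule of thumb concrete, here is a toy model of the behaviour in plain Python. It is only a sketch: LazyQuery, fake_db and the queries_run counter are made up for illustration and are not how Django implements QuerySet.

```python
class LazyQuery:
    """Toy stand-in for a QuerySet: chaining is free, consuming runs the query."""

    queries_run = 0  # counts simulated trips to the database

    def __init__(self, rows, predicates=()):
        self._rows = rows              # pretend this is the database table
        self._predicates = predicates  # filters collected so far
        self._cache = None             # filled on first evaluation

    def filter(self, predicate):
        # Returns a NEW lazy object; nothing is executed here.
        return LazyQuery(self._rows, self._predicates + (predicate,))

    def _fetch(self):
        if self._cache is None:
            LazyQuery.queries_run += 1
            self._cache = [r for r in self._rows
                           if all(p(r) for p in self._predicates)]
        return self._cache

    def __iter__(self):
        return iter(self._fetch())

    def __len__(self):
        return len(self._fetch())


fake_db = [
    {"name": "A", "pts": 25},
    {"name": "B", "pts": 10},
    {"name": "C", "pts": 30},
]

players = LazyQuery(fake_db)
players = players.filter(lambda r: r["pts"] >= 20)    # no query yet
players = players.filter(lambda r: r["name"] != "C")  # still no query
queries_before = LazyQuery.queries_run

names = [r["name"] for r in players]  # iteration evaluates the query
count = len(players)                  # served from the cache
queries_after = LazyQuery.queries_run
```

In this toy version, stacking the two filters costs nothing, and only the first consuming operation touches the data.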
Django queries are designed to be "lazy", that is, they don't run the database query until something requests actual data. Until then, queries can be modified by adding filters and other similar functions.
For example, the following code requests all TeamMember objects when the search string is 'all', but otherwise adds a filter to restrict names to those matching the given search.
squad_list = TeamMember.objects.filter(state__in={"Hired", "Joined", "Staff", "Recruiting"})
if squadname != 'all':
    squad_list = squad_list.filter(squad__icontains=squadname.lower())
When the squad_list query is finally executed it will retrieve only the required records. Does this help?
Related
I want to add two numbers from two different objects.
Here is a simplified version. I have two integers and I multiply them to get multiplied.
models.py:
class ModelA(models.Model):
    number_a = models.IntegerField(default=1, null=True, blank=True)
    number_b = models.IntegerField(default=1, null=True, blank=True)
    def multiplied(self):
        return self.number_a * self.number_b
views.py:
@login_required
def homeview(request):
    numbers = ModelA.objects.all()
    context = {
        'numbers': numbers,
    }
    return TemplateResponse...
What I'm trying to do is basically multiplied + multiplied in the template but I simply can't figure out how to do it since I first have to loop through the objects.
So if I had 2 instances of ModelA and two 'multiplied' values of 100 I want to display 200 in the template. Is this possible?
It is good practice to keep logic out of the template. It would be better to loop in the view and add the calculated value to the context:
def homeview(request):
    queryset = ModelA.objects.all()
    multipliers_addition = 0
    for obj in queryset:
        multipliers_addition += obj.multiplied()
    context = {
        'addition': multipliers_addition,
    }
    return render(request, 'multiply.html', context)
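You can try doing that with aggregate. Note that multiplied is a Python method, not a database field, so Sum('multiplied') would raise a FieldError. Aggregate over an F() expression built from the two fields instead:
from django.db.models import F, Sum
ModelA.objects.aggregate(total=Sum(F('number_a') * F('number_b')))
This returns a dict such as {'total': 200}. The value is None if there are no rows to sum.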
In your template, when you do a for loop over the numbers variable, you can directly access properties, functions and attributes.
So to access the value you want I guess it would look something like this, simplified:
{% for number in numbers %}
{{ number.multiplied }}
{% endfor %}
Hope that makes sense?
However, please take note that this is not the most efficient method.
We can make Django ask the SQL server to do the heavy lifting on the calculation side of things. If you want to optimise your view, you can comment out your multiplied function and compute the value in the query instead. You still access it in the template the same way I described above, but the code in your view needs to change slightly, like so:
from django.db.models import F
numbers = ModelA.objects.annotate(multiplied=F('number_a') * F('number_b'))
If you only need the grand total, aggregate with Sum over the same expression, as described loosely in haduki's answer. This offloads the calculation to the SQL server by crafting an SQL query that does the arithmetic (and, for the total, uses the SUM SQL function), which is usually substantially faster than calling a function on every object.
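For a sense of what offloading the sum to the database looks like, the sketch below runs the calculation directly with Python's built-in sqlite3 module. The table name app_modela and the sample rows are assumptions made for illustration; the SQL Django actually generates will differ in its details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE app_modela ("
    "id INTEGER PRIMARY KEY, number_a INTEGER, number_b INTEGER)"
)
conn.executemany(
    "INSERT INTO app_modela (number_a, number_b) VALUES (?, ?)",
    [(10, 10), (20, 5)],
)

# One query: multiply per row and add up the products inside the database.
(total,) = conn.execute(
    "SELECT SUM(number_a * number_b) FROM app_modela"
).fetchone()

# Same result computed row by row in Python, as a loop in the view would do.
python_total = sum(
    a * b for a, b in conn.execute("SELECT number_a, number_b FROM app_modela")
)
conn.close()
```

Both approaches give the same number here; the difference is that the first one transfers a single value instead of every row.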
I'm trying to add an extra field to the instances of my queryset and then sort the set by the new field.
But I get a field error (Cannot resolve keyword 'foo' into field. Choices are: ...)
My view(abstract):
def view(request):
    instances = Model.objects.all()
    for counter, instance in enumerate(instances):
        instance.foo = 'bar' + str(counter)
    instances.order_by('foo') #this line causes trouble
    return render(request, 'template.html', {'instances': instances})
My template:
{% for instance in instances %}
{{instance.foo}}
{% endfor %}
If I leave out the order_by line the template renders as expected, so the field seems to be there.
So why do I get a field error?
It would be awesome, if somebody could help me to understand what I'm doing wrong.
Thanks in advance.
I found a possible solution to change the template to
{% for instance in instances|dictsort:'foo' %}
and that works fine, but from what I understand there should be as little logic as possible in the template, so I figure sorting should be done in the view.
Or is this actually the right way?
The Django ORM aims to construct database queries. As a result, you can only query on what a database "knows". Methods, properties, or attributes you added yourself are unknown to the database. The .order_by('foo') call thus cannot work: you only "patched" the Python objects in the instances queryset, and the database has no foo column to order by.
If you however call an instances.order_by you construct a new queryset. This queryset takes the context of the parent, and thus represents a (slightly) modified query, but again, a query. Whether the old queryset is already evaluated or patched, is of no importance.
Furthermore, even if there was a column foo, it would not help, since instances.order_by(..) does not order the instances queryset in place. It constructs a new one that looks approximately like the old one, except that the rows are ordered, and that result has to be assigned to a variable to be used.
You thus will have to sort in Python now. You can for example construct a list of ordered elements with sorted(..), like:
from operator import attrgetter
def view(request):
    instances = Model.objects.all()
    for counter, instance in enumerate(instances):
        instance.foo = 'bar' + str(counter)
    mydata = sorted(instances, key=attrgetter('foo'))
    return render(request, 'template.html', {'instances': mydata})
So now mydata is no longer a QuerySet, but a vanilla list. Furthermore note that ordering in a database might be slightly different than ordering in Python. In Python exceptions can occur if not all elements have the same type, and furthermore it is not guaranteed that the semantics behind the order relation is exactly the same.
The new attribute exists only on those Python objects, not in the database. order_by builds a new database query; it does not reorder the list of objects you already hold in memory.
One approach would be to use the builtin python sorting functions in the view like: sorted or even list.sort().
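One thing to watch when sorting in Python on a label like 'bar' + str(counter): strings compare character by character, so 'bar10' sorts before 'bar2'. The sketch below uses SimpleNamespace objects as stand-ins for model instances, and the order attribute is a hypothetical numeric key added only for illustration.

```python
from operator import attrgetter
from types import SimpleNamespace

# Stand-ins for model instances; SimpleNamespace is used only for illustration.
instances = [SimpleNamespace(pk=pk) for pk in range(12)]
for counter, instance in enumerate(instances):
    instance.foo = 'bar' + str(counter)
    instance.order = counter  # hypothetical numeric key kept next to the label

# Sorting on the string label is lexicographic.
by_label = [i.foo for i in sorted(instances, key=attrgetter('foo'))]

# Sorting on the numeric key gives the order most people expect.
by_number = [i.foo for i in sorted(instances, key=attrgetter('order'))]
```

If the label has to stay a string, sorting on a separate numeric attribute (or zero-padding the number) avoids the surprise.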
I'm using django-activity-stream to display a list of recent events. For the sake of example these could be someone commenting or someone editing an article. I.e. the GenericForeignKey action_object could reference a Comment or an Article. I'd like to display a link to whatever the action_object is too:
<a href="{{ action.action_object.get_absolute_url }}">
{{ action.action_object }}
</a>
The problem is this causes queries for every single item, particularly as Comment.get_absolute_url requires the comment's article, which has not been fetched yet, and Article.__unicode__ requires its revision.content, which also hasn't been fetched.
django-activity-stream already calls prefetch_related('action_object') automatically (related discussion).
This appears to be working as testing with {{ action.action_object.id }} results in a single query per action_object_content_type, despite the docs saying:
It also supports prefetching of GenericRelation and GenericForeignKey, however, it must be restricted to a homogeneous set of results. For example, prefetching objects referenced by a GenericForeignKey is only supported if the query is restricted to one ContentType.
And there is more than one content type. However in my use case above I need extra prefetch_related calls, for example:
query = query.prefetch_related('action_object__article', 'action_object__revision')
But this complains because Articles don't have an __article (and would probably complain about Comments not having a __revision too if it got that far). I'm assuming this is what the docs are really referring to. So I thought I'd try this:
comments = query._clone().filter(action_object_content_type=comment_ctype).prefetch_related('action_object__article')
articles = query._clone().filter(action_object_content_type=article_ctype).prefetch_related('action_object__revision')
query = comments | articles
But the results are always empty. I guess querysets only support a single prefetch_related list and can't be joined like that.
I'd like a single queryset to be returned, because further filtering is done later in the code, which this part doesn't know about. Once the queryset is finally evaluated, though, I want Django to fetch everything needed to render the events.
Is there another way?
I had a look at Prefetch objects but I don't think they offer any help in this situation.
A solution can be found in django-notify-x which is derived from django-notifications which, in turn, is derived from django-activity-stream. It makes use of a "django snippet" linked in the copied text below.
https://github.com/v1k45/django-notify-x/pull/19
Using a snippet from https://djangosnippets.org/snippets/2492/,
prefetch generic relations to reduce the number of queries.
Currently, we trigger one additional query for each generic relation
for each record. With this code, we reduce that to one additional query for
each generic relation for each type of generic relation used.
If all your notifications are related to a Badges model, only one
additional query will be triggered.
For Django 1.10 and 1.11, I am using the snippet above modified as below (just in case you are not using django-activity-stream):
from django.contrib.contenttypes.models import ContentType
from django.contrib.contenttypes import fields as generic
def get_field_by_name(meta, fname):
    return [f for f in meta.get_fields() if f.name == fname]
def prefetch_relations(weak_queryset):
    weak_queryset = weak_queryset.select_related()
    # reverse model's generic foreign keys into a dict:
    # { 'field_name': generic.GenericForeignKey instance, ... }
    gfks = {}
    for name, gfk in weak_queryset.model.__dict__.items():
        if not isinstance(gfk, generic.GenericForeignKey):
            continue
        gfks[name] = gfk
    data = {}
    for weak_model in weak_queryset:
        for gfk_name, gfk_field in gfks.items():
            related_content_type_id = getattr(
                weak_model,
                get_field_by_name(gfk_field.model._meta, gfk_field.ct_field)[0].get_attname())
            if not related_content_type_id:
                continue
            related_content_type = ContentType.objects.get_for_id(related_content_type_id)
            related_object_id = int(getattr(weak_model, gfk_field.fk_field))
            if related_content_type not in data.keys():
                data[related_content_type] = []
            data[related_content_type].append(related_object_id)
    for content_type, object_ids in data.items():
        model_class = content_type.model_class()
        models = prefetch_relations(model_class.objects.filter(pk__in=object_ids))
        for model in models:
            for weak_model in weak_queryset:
                for gfk_name, gfk_field in gfks.items():
                    related_content_type_id = getattr(
                        weak_model,
                        get_field_by_name(gfk_field.model._meta, gfk_field.ct_field)[0].get_attname())
                    if not related_content_type_id:
                        continue
                    related_content_type = ContentType.objects.get_for_id(related_content_type_id)
                    related_object_id = int(getattr(weak_model, gfk_field.fk_field))
                    if related_object_id != model.pk:
                        continue
                    if related_content_type != content_type:
                        continue
                    setattr(weak_model, gfk_name, model)
    return weak_queryset
This is giving me the intended results.
EDIT:
To use it, you simply call prefetch_relations, with your QuerySet as the argument.
For example, instead of:
my_objects = MyModel.objects.all()
you can do this:
my_objects = prefetch_relations(MyModel.objects.all())
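The core idea of the snippet (group the wanted ids by content type, run one lookup per type, then attach the results) can be sketched without Django. Everything below is illustrative: the tables dict, the actions list and fetch_many are hypothetical stand-ins for models, rows and a pk__in query.

```python
from collections import defaultdict

# Fake "tables" keyed by content type name, then by primary key.
tables = {
    "comment": {1: "Comment #1", 2: "Comment #2"},
    "article": {7: "Article #7"},
}
# Each action stores (content type, object id), like a generic foreign key.
actions = [
    {"id": 1, "ct": "comment", "object_id": 1},
    {"id": 2, "ct": "article", "object_id": 7},
    {"id": 3, "ct": "comment", "object_id": 2},
]

stats = {"lookups": 0}  # counts simulated queries


def fetch_many(ct, ids):
    """Simulates one pk__in query against the table for content type `ct`."""
    stats["lookups"] += 1
    return {pk: tables[ct][pk] for pk in ids if pk in tables[ct]}


# Pass 1: collect the ids wanted per content type.
wanted = defaultdict(set)
for action in actions:
    wanted[action["ct"]].add(action["object_id"])

# Pass 2: one lookup per content type, then attach results to each action.
fetched = {ct: fetch_many(ct, ids) for ct, ids in wanted.items()}
for action in actions:
    action["action_object"] = fetched[action["ct"]].get(action["object_id"])
```

Indexing the fetched objects by primary key, as done here, is one way to attach them without the nested loops over every object and every row.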
I tried some code like this:
mymodels = MyModel.objects.filter(status=1)
mymodels.update(status=4)
print(mymodels)
And the result is an empty list.
I know that I can use a for loop to replace the update,
but it will make a lot of update queries.
Is there any way to continue manipulating mymodels after the bulk update?
Remember that Django's QuerySets are lazy:
QuerySets are lazy – the act of creating a QuerySet doesn’t involve any database activity. You can stack filters together all day long, and Django won’t actually run the query until the QuerySet is evaluated
but the update() method is actually applied immediately:
The update() method is applied instantly, and the only restriction on the QuerySet that is updated is that it can only update columns in the model’s main table, not on related models.
So although your code creates the filter before calling update(), the filter's query has not run yet at that point. The update() runs immediately and changes the status of the matching objects. When print(mymodels) finally evaluates the (lazy) filter, no records have status=1 any more, and the result is empty.
mymodels = MyModel.objects.filter(status=1)
objs = [obj for obj in mymodels] # save the objects you are about to update
mymodels.update(status=4)
print(objs)
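should work. Note that the objects in objs were loaded before the update, so their in-memory status is still 1.
The explanation why has been given by Timmy O'Mahony.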
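An alternative is to remember the primary keys before the update and select by them afterwards, which returns rows with the new status. In Django this would roughly be ids = list(mymodels.values_list('pk', flat=True)) followed by MyModel.objects.filter(pk__in=ids). The sketch below shows the same idea with the built-in sqlite3 module; the table name mymodel and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (id INTEGER PRIMARY KEY, status INTEGER)")
conn.executemany("INSERT INTO mymodel (status) VALUES (?)", [(1,), (1,), (2,)])

# Remember which rows match BEFORE the update changes them.
ids = [row[0] for row in conn.execute("SELECT id FROM mymodel WHERE status = 1")]

conn.execute("UPDATE mymodel SET status = 4 WHERE status = 1")

# Re-running the original filter now finds nothing...
stale = conn.execute("SELECT id FROM mymodel WHERE status = 1").fetchall()

# ...but selecting by the saved primary keys returns the updated rows.
placeholders = ",".join("?" for _ in ids)
fresh = conn.execute(
    f"SELECT id, status FROM mymodel WHERE id IN ({placeholders}) ORDER BY id",
    ids,
).fetchall()
conn.close()
```

This costs one extra query for the ids but keeps the update itself a single statement.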
Tricky code:
user = User.objects.filter(id=123)
user[0].last_name = 'foo'
user[0].save() # Cannot be saved.
id(user[0]) # 32131
id(user[0]) # 44232 ( different )
user cannot be saved in this way.
Normal code:
user = User.objects.filter(id=123)
if user:
    user[0].last_name = 'foo'
    user[0].save() # Saved successfully.
id(user[0]) # 32131
id(user[0]) # 32131 ( same )
So, what is the problem?
In the first variant your user queryset isn't evaluated yet, so every time you write user[0] the ORM makes an independent query to the DB. In the second variant the queryset is evaluated and acts like a normal Python list.
And BTW if you want just one row, use get:
user = User.objects.get(id=123)
When you index into a queryset, Django fetches the data (or looks in its cache) and creates a model instance for you. As you discovered with id(), each call creates a new instance. So while you can set a property with qs[0].last_name = 'foo', the subsequent call to qs[0].save() creates a new instance (with the original last_name) and saves that.
I'm guessing your particular issue has to do with when Django caches query results. When you are just indexing into the qs, nothing gets cached, but your call to if user causes the entire (original) qs to be evaluated, and thus cached. So in that case each call to [0] retrieves the same model instance.
Saving is possible, but every time you access user[0], you actually get it from the database again, so it's unchanged.
Indeed, when you slice or index an unevaluated Queryset, Django issues a SELECT ... FROM ... LIMIT ... OFFSET ... query to your database.
A Queryset is not a list, so if you want it to behave like a list, you need to evaluate it. To do so, call list() on it.
user = list(User.objects.filter(id=123))
In your second example, calling if user will actually evaluate the queryset (get it from the database into your python program), so you then work with your Queryset's internal cache.
Alternatively, you can use u = user[0], edit that and then save, which will work.
Finally, you should actually be calling Queryset.get, not filter here, since you're using the unique key.
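The difference between the two variants can be reproduced with a toy class. This is only a sketch of the behaviour described above: Row and ToyQuery are made-up names, and the real QuerySet does considerably more.

```python
class Row:
    def __init__(self, last_name):
        self.last_name = last_name


class ToyQuery:
    """Illustrative only: builds a fresh Row per index until results are cached."""

    def __init__(self, stored_names):
        self._stored = stored_names  # pretend database rows
        self._cache = None

    def __getitem__(self, index):
        if self._cache is not None:
            return self._cache[index]    # same object every time
        return Row(self._stored[index])  # new object on every access

    def __bool__(self):
        # Truth-testing evaluates everything once and caches it.
        if self._cache is None:
            self._cache = [Row(name) for name in self._stored]
        return bool(self._cache)


user = ToyQuery(["original"])
user[0].last_name = "foo"   # modifies a throwaway object
lost = user[0].last_name    # a new object, still the stored value
different_objects = user[0] is not user[0]

if user:                    # evaluates and fills the cache
    user[0].last_name = "foo"
kept = user[0].last_name
same_object = user[0] is user[0]
```

Binding the instance to a name first (u = user[0]) sidesteps the issue in both variants, because all later operations then act on that one object.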