Django - Following ForeignKey relationships "backward" for entire QuerySet

Django - Following ForeignKey relationships "backward" for entire QuerySet - python

is it possible to follow ForeignKey relationships backward for entire querySet?
i mean something like this:
x = table1.objects.select_related().filter(name='foo')
x.table2.all()
when table1 hase ForeignKey to table2.
in
https://docs.djangoproject.com/en/1.2/topics/db/queries/#following-relationships-backward
i can see that it works only with get() and not filter()
Thanks

You basically want to get QuerySet of different type from data you start with.
class Kid(models.Model):
mom = models.ForeignKey('Mom')
name = models.CharField…
class Mom(models.Model):
name = models.CharField…
Let's say you want to get all moms having any son named Johnny.
Mom.objects.filter(kid__name='Johnny')
Let's say you want to get all kids of any Lucy.
Kid.objects.filter(mom__name='Lucy')

You should be able to use something like:
for y in x:
y.table2.all()
But you could also use get() for a list of the unique values (which will be id, unless you have a different specified), after finding them using a query.
So,
x = table1.objects.select_related().filter(name='foo')
for y in x:
z=table1.objects.select_related().get(y.id)
z.table2.all()
Should also work.

You can also use values() to fetch specific values of a foreign key reference. With values the select query on the DB will be reduced to fetch only those values and the appropriate joins will be done.
To re-use the example from Krzysztof Szularz:
jonny_moms = Kid.objects.filter(name='Jonny').values('mom__id', 'mom__name').distinct()
This will return a dictionary of Mom attributes by using the Kid QueryManager.

Related

Django - Annotate multiple fields from a Subquery

I'm working on a Django project on which i have a queryset of a 'A' objects ( A.objects.all() ), and i need to annotate multiple fields from a 'B' objects' Subquery. The problem is that the annotate method can only deal with one field type per parameter (DecimalField, CharField, etc.), so, in order to annotate multiple fields, i must use something like:
A.objects.all().annotate(b_id =Subquery(B_queryset.values('id')[:1],
b_name =Subquery(B_queryset.values('name')[:1],
b_other_field =Subquery(B_queryset.values('other_field')[:1],
... )
Which is very inefficient, as it creates a new subquery/subselect on the final SQL for each field i want to annotate. I would like to use the same Subselect with multiple fields on it's values() params, and annotate them all on A's queryset. I'd like to use something like this:
b_subquery = Subquery(B_queryset.values('id', 'name', 'other_field', ...)[:1])
A.objects.all().annotate(b=b_subquery)
But when i try to do that (and access the first element A.objects.all().annotate(b=b_subquery)[0]) it raises an exception:
{FieldError}Expression contains mixed types. You must set output_field.
And if i set Subquery(B_quer...[:1], output_field=ForeignKey(B, models.DO_NOTHING)), i get a DB exception:
{ProgrammingError}subquery must return only one column
In a nutshell, the whole problem is that i have multiple Bs that "belongs" to a A, so i need to use Subquery to, for every A in A.objects.all(), pick a specific B and attach it on that A, using OuterRefs and a few filters (i only want a few fields of B), which seens a trivial problem for me.
Thanks for any help in advance!

What I do in such situations is to use prefetch-related
a_qs = A.objects.all().prefetch_related(
models.Prefetch('b_set',
# NOTE: no need to filter with OuterRef (it wont work anyway)
# Django automatically filter and matches B objects to A
queryset=B_queryset,
to_attr='b_records'
)
)
Now a.b_records will be a list containing a's related b objects. Depending on how you filter your B_queryset this list may be limited to only 1 object.

Django querysets optimization - preventing selection of annotated fields

Let's say I have following models:
class Invoice(models.Model):
...
class Note(models.Model):
invoice = models.ForeignKey(Invoice, related_name='notes', on_delete=models.CASCADE)
text = models.TextField()
and I want to select Invoices that have some notes. I would write it using annotate/Exists like this:
Invoice.objects.annotate(
has_notes=Exists(Note.objects.filter(invoice_id=OuterRef('pk')))
).filter(has_notes=True)
This works well enough, filters only Invoices with notes. However, this method results in the field being present in the query result, which I don't need and means worse performance (SQL has to execute the subquery 2 times).
I realize I could write this using extra(where=) like this:
Invoice.objects.extra(where=['EXISTS(SELECT 1 FROM note WHERE invoice_id=invoice.id)'])
which would result in the ideal SQL, but in general it is discouraged to use extra / raw SQL.
Is there a better way to do this?

You can remove annotations from the SELECT clause using .values() query set method. The trouble with .values() is that you have to enumerate all names you want to keep instead of names you want to skip, and .values() returns dictionaries instead of model instances.
Django internaly keeps the track of removed annotations in
QuerySet.query.annotation_select_mask. So you can use it to tell Django, which annotations to skip even wihout .values():
class YourQuerySet(QuerySet):
def mask_annotations(self, *names):
if self.query.annotation_select_mask is None:
self.query.set_annotation_mask(set(self.query.annotations.keys()) - set(names))
else:
self.query.set_annotation_mask(self.query.annotation_select_mask - set(names))
return self
Then you can write:
invoices = (Invoice.objects
.annotate(has_notes=Exists(Note.objects.filter(invoice_id=OuterRef('pk'))))
.filter(has_notes=True)
.mask_annotations('has_notes')
)
to skip has_notes from the SELECT clause and still geting filtered invoice instances. The resulting SQL query will be something like:
SELECT invoice.id, invoice.foo FROM invoice
WHERE EXISTS(SELECT note.id, note.bar FROM notes WHERE note.invoice_id = invoice.id) = True
Just note that annotation_select_mask is internal Django API that can change in future versions without a warning.

Ok, I've just noticed in Django 3.0 docs, that they've updated how Exists works and can be used directly in filter:
Invoice.objects.filter(Exists(Note.objects.filter(invoice_id=OuterRef('pk'))))
This will ensure that the subquery will not be added to the SELECT columns, which may result in a better performance.
Changed in Django 3.0:
In previous versions of Django, it was necessary to first annotate and then filter against the annotation. This resulted in the annotated value always being present in the query result, and often resulted in a query that took more time to execute.
Still, if someone knows a better way for Django 1.11, I would appreciate it. We really need to upgrade :(

We can filter for Invoices that have, when we perform a LEFT OUTER JOIN, no NULL as Note, and make the query distinct (to avoid returning the same Invoice twice).
Invoice.objects.filter(notes__isnull=False).distinct()

This is best optimize code if you want to get data from another table which primary key reference stored in another table
Invoice.objects.filter(note__invoice_id=OuterRef('pk'),)

We should be able to clear the annotated field using the below method.
Invoice.objects.annotate(
has_notes=Exists(Note.objects.filter(invoice_id=OuterRef('pk')))
).filter(has_notes=True).query.annotations.clear()

Extend queryset in django python

I am looking for way how to add new objects to existing queryset, or how to implement what I want by other way.
contact = watson.filter(contacts, searchline)
This line returns queryset, which I later use to iterate.
Then I want to do this to add more objects, which watson couldn't find
contact_in_iteration = Contact.objects.get(id = fild.f_for)
contact.append(contact_in_iteration)
And sorry for my poor english
Did this
contacts = Contact.objects.filter(crm_id=request.session['crm_id'])
query = Q(contacts,searchline)
contact = watson.filter(query)
and get 'filter() missing 1 required positional argument: 'search_text'' error

You can use | and Q lookups. See the docs.
I'm not sure I've fully understood your initial query, but I think that in your case you would want to do:
query = Q(contacts='Foo', searchline='Bar')
contact = watson.filter(query)
Then later:
contact = watson.filter(query | Q(id=field.f_for))
Strictly speaking it won't append to the queryset, but will return a new queryset. But that's okay, because that's what .filter() does anyway.

You should look at a queryset as a sql query that will be executed later. When constructing a queryset and save the result in a variable, you can later filter it even more, but you can not expand it. If you need a query that has more particular rules (like, you need an OR operation) you should state that when you are constructing the query. One way of doing that is indeed using the Q object.
But it looks like you are confused about what querysets really are and how they are used. First of all:
Contact.objects.get(id = fild.f_for)
will never return a queryset, but an instance, because you use get and thus ask for a single particular record. You need to use filter() if you want to get a quersyet. So if you had an existing queryset say active_contacts and you wanted to filter it down so you only get the contacts that have a first_name of 'John' you would do:
active_contacts = Contact.objects.filter(active=True)
active_contacts_named_John = active_contacts.filter(first_name='John')
Of course you could do this in one line too, but I'm assuming you do the first queryset construction somewhere else in your code.
Second remark:
If in your example watson is a queryset, your user of filter() is unclear. This doesn't really make sense:
contact = watson.filter(contacts, searchline)
As stated earlier, filtering a queryset returns another queryset. So you should use a plurar as your variable name e.g. contacts. Then the correct use of filter would be:
contacts = watson.filter(first_name=searchline)
I'm assuming searchline here is a variable that contains a user inputted search term. So maybe here you should name your variable searchterm or similar. This will return all contacts that are filtered by whatever watson is filtering out already and whose first_name matches searchline exactly. You could also use a more liberate method and filter out results that 'contains' the searching term, like so:
contacts = watson.filter(first_name__contains=searchline)
Hope this helps you get on the right path.

Group by foreign key as boolean in Django

I have a model in which I have a ForeignKey and an IntegerField.
I want to sum the integer field grouped by the foreign key but the foreign key can have many values. I am only interested in knowing if the foreign key has a 'real value' or is none. So the foreign key should be interpreted as a boolean.
I could make two queries:
a = Model.objects.filter(parent=None).aggregate(Sum('amount'))
b = Model.objects.exclude(parent=None).aggregate(Sum('amount'))
but isn't it less memory demanding to make something like
c = Model.objects.values('parent__as_bool').annotate(Sum('amount'))
if it's possible?

You can always use .extra, like this:
.extra(select={'parent_is_null': "parent is NULL"})
see docs for more examples

Your second option isn't valid because you need to filter the queryset before asking for values, and to specify what you're calling your annotation. As far as treating a foreignkey as a boolean is concerned you're looking for isnull. So you end with something like:
c = Model.objects.filter(parent__isnull=False).values().annotate(amount_sum=Sum('amount'))

use IDs instead of instances to create new objects with foreign keys

I want to create a new Django ORM object with three foreign keys. I got the IDs of the foreign rows already, and I mean - that's all I need to fill the foreign key columns in my new row, right? However, I don't seem able to create the new row without hitting the DB three times to instantiate objects out of those IDs.
So what I need to do:
foreign_object = models.ForeignObject.get(pk=foreign_object_id)
a = models.Object1.get_or_create(f = foreign_object)
What I'd like to do:
a = models.Object1.get_or_create(f_id = foreign_object_id)
f_id however is not a field Django recognizes. If I just assign foreign_object_id to f (I think I recall this works in some cases), Django complains that it wants a ForeignObject instance instead of an int.
Any way to do this?

You need to use the double-underscore notation in this case
eg
a = models.Object1.get_or_create(f__pk=foreign_object_id)

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.