How to do a JOIN over multiple Django models - python

The following models are given:
class Copy(CommonLibraryBaseModel):
    lecture = models.ForeignKey('Lecture', ...)
    signature = models.CharField(max_length=100, ...)
class Lecture(CommonLibraryBaseModel):
    category = models.ForeignKey('LectureCategory', ...)
class LectureCategory(CommonLibraryBaseModel):
    parent = models.ForeignKey('self', ...)
    display_name = models.CharField(max_length=100, ...)
I basically want to do the following query:
SELECT signature, display_name FROM lecturecategory as lc, lecture as l, copy as c WHERE lc.id = l.category_id AND c.lecture_id = l.id AND lc.parent_id=2;
How would I do that in Django? I could not figure out how to combine the different models.
Thanks for the help!

SELECT signature, display_name
FROM lecturecategory as lc, lecture as l, copy as c
WHERE lc.id = l.category_id AND c.lecture_id = l.id AND lc.parent_id=2;
becomes:
Copy.objects.filter(lecture__category__parent_id=2).values_list('signature', 'lecture__category__display_name')
If you want a QuerySet of dictionaries as the result, use values instead of values_list; values_list returns tuples.
See the Django documentation on lookups that span relationships.
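To see roughly what the lecture__category__parent_id lookup corresponds to at the SQL level, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names follow the SQL in the question; the sample rows are invented for illustration, and the exact SQL Django generates may differ.

```python
import sqlite3

# In-memory database with the three tables from the question (sample data is made up).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lecturecategory (id INTEGER PRIMARY KEY, parent_id INTEGER, display_name TEXT);
    CREATE TABLE lecture (id INTEGER PRIMARY KEY, category_id INTEGER);
    CREATE TABLE copy (id INTEGER PRIMARY KEY, lecture_id INTEGER, signature TEXT);
    INSERT INTO lecturecategory VALUES (2, NULL, 'Science'), (3, 2, 'Physics'), (4, NULL, 'History');
    INSERT INTO lecture VALUES (10, 3), (11, 4);
    INSERT INTO copy VALUES (100, 10, 'PHY-001'), (101, 10, 'PHY-002'), (102, 11, 'HIS-001');
""")

# Explicit joins: copy -> lecture -> lecturecategory, filtered on the category's parent.
rows = conn.execute("""
    SELECT c.signature, lc.display_name
    FROM copy AS c
    INNER JOIN lecture AS l ON c.lecture_id = l.id
    INNER JOIN lecturecategory AS lc ON l.category_id = lc.id
    WHERE lc.parent_id = 2
    ORDER BY c.signature
""").fetchall()

# Tuples resemble what values_list returns; dictionaries resemble what values returns.
as_dicts = [
    {"signature": sig, "lecture__category__display_name": name}
    for sig, name in rows
]
```

Only the copies whose lecture belongs to a category with parent 2 come back; the 'History' copy is filtered out.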

You could get a queryset of Copy instances with the following filter
copies = Copy.objects.filter(lecture__category__parent_id=2)
See the docs on lookups that span relationships for more info.
You can then loop through the queryset, and access the related lecture and lecture category using the foreign key.
for copy in copies:
    print(copy.signature, copy.lecture.category.display_name)
Finally, you can change the initial query to use select_related, so that Django uses an inner join to fetch the lecture and category rather than separate queries:
copies = Copy.objects.filter(lecture__category__parent_id=2).select_related('lecture', 'lecture__category')

Related

Django - How do I write a queryset that's equivalent to this SQL query? - Managing duplicates with Counting and FIRST_VALUE

I have Model "A" that both relates to another model and acts as a public face to the actual data (Model "B"), users can modify the contents of A but not of B.
For every B there can be many As, and they have a one to many relation.
When I display this model, any time two or more As relate to the same B I see "duplicate" records with (almost always) the same data, which is a bad experience.
I want to return a queryset of A items that relate to the B items, and when there's more than one roll them up to the first entered item.
I also want to count the related model B items and return that count to give me an indication of how much duplication is available.
I wrote the following analogous SQL query which counts the related items and uses first_value to find the first A created partitioned by B.
SELECT *
FROM (
    SELECT
        COUNT(*) OVER (PARTITION BY b_id) AS count_related_items,
        FIRST_VALUE(id) OVER (PARTITION BY b_id ORDER BY created_time ASC) AS first_filter,
        *
    FROM A
) AS A1
WHERE A1.first_filter = A1.id;
As requested, here's a simplified view of the models:
class CoreData(models.Model):
    title = models.CharField(max_length=500)
class UserData(models.Model):
    core = models.ForeignKey("CoreData", on_delete=models.CASCADE)
    user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    title = models.CharField(max_length=500)
When a user creates data it first checks/creates the CoreData, storing things like the title, and then it creates the UserData, with a reference to the CoreData.
The "duplication" is introduced when a second user creates a piece of data that references the same CoreData. That is why you can roll up the UserData (in SQL) to find the count and the "first" entry in the one-to-many relation.
Assuming my understanding is correct: if you are querying from the UserData model, the query would look something like this, taking CoreData.id = 18 as an example:
from django.db.models import Count
user_data = (UserData.objects.filter(core__id=18)
             .annotate(duplicate_count=Count('core__userdata'))
             .order_by('created_time').first())
user_data is the first UserData object created for that CoreData object, and user_data.duplicate_count gives the number of UserData objects related to the same CoreData object.
See the Django documentation on annotate().
Update:
If you need the list of UserData objects for a specific CoreData, drop the first() call:
user_data = (UserData.objects.filter(core__id=18)
             .annotate(duplicate_count=Count('core__userdata'))
             .order_by('created_time'))
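The roll-up the question asks for (one row per CoreData, keeping the earliest UserData and a count of its siblings) can be illustrated in plain Python. This is only a sketch of the intended result, not ORM code: the tuples stand in for UserData rows, and the ids, core ids, and timestamps are invented.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical UserData rows: (id, core_id, created_time)
rows = [
    (1, 18, datetime(2020, 1, 3)),
    (2, 18, datetime(2020, 1, 1)),
    (3, 19, datetime(2020, 1, 2)),
    (4, 18, datetime(2020, 1, 2)),
]

# Partition the rows by core_id, like PARTITION BY b_id in the SQL.
groups = defaultdict(list)
for row in rows:
    groups[row[1]].append(row)

# For each partition keep the earliest row (FIRST_VALUE) and the partition size (COUNT).
rolled_up = []
for core_id, members in sorted(groups.items()):
    first = min(members, key=lambda r: r[2])
    rolled_up.append(
        {"id": first[0], "core_id": core_id, "count_related_items": len(members)}
    )
```

Each entry in rolled_up corresponds to one row that survives the WHERE first_filter = id condition in the SQL above.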

Order by with specific rows first

I have a generic list view in Django 1.11 and I need to return the objects in alphabetical order, except for the first two:
class LanguageListAPIView(generics.ListCreateAPIView):
    queryset = Language.objects.all().order_by("name")
    serializer_class = LanguageSerializer
with the following Language model:
class Language(models.Model):
    name = models.CharField(max_length=50, unique=True)
I'd like to return ENGLISH, FRENCH, then every other language in the database ordered by name.
Is there a way to achieve this with the Django ORM?
Thank you,
Maybe you can use two querysets and combine them to obtain the result:
q1 = Language.objects.filter(Q(name='ENGLISH') | Q(name='FRENCH')).order_by('name')
and
q2 = Language.objects.filter(~(Q(name='ENGLISH') | Q(name='FRENCH'))).order_by('name')
Then join the querysets:
queryset = list(chain(q1, q2))
Import Q from django.db.models and chain from itertools.
Since Django 1.8 you can use conditional expressions:
from django.db.models import Case, When, Value, IntegerField
Language.objects.annotate(
    order=Case(
        When(name="ENGLISH", then=Value(1)),
        When(name="FRENCH", then=Value(2)),
        default=Value(3),
        output_field=IntegerField(),
    )
).order_by('order', 'name')
This annotates a field called order, then sorts the results first by order and then by name. ENGLISH and FRENCH get lower order values; all other languages share the same value, so they are sorted by name only.
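The same ordering idea can be tried out directly in SQL. Below is a minimal sketch with Python's built-in sqlite3 module; the language table and its rows are invented, and the alias sort_rank is used here instead of order because ORDER is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE language (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")
conn.executemany(
    "INSERT INTO language (name) VALUES (?)",
    [("GERMAN",), ("FRENCH",), ("ARABIC",), ("ENGLISH",), ("SPANISH",)],
)

# A CASE expression assigns a rank to the pinned rows; everything else shares rank 3
# and therefore falls back to alphabetical order on name.
result = conn.execute("""
    SELECT name,
           CASE
               WHEN name = 'ENGLISH' THEN 1
               WHEN name = 'FRENCH' THEN 2
               ELSE 3
           END AS sort_rank
    FROM language
    ORDER BY sort_rank, name
""").fetchall()

names = [name for name, _rank in result]
```

Here names starts with ENGLISH and FRENCH, followed by the remaining languages alphabetically.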

How to sort queryset by annotated attr from ManyToMany field

Simplest example:
class User(models.Model):
    name = ...
class Group(models.Model):
    members = models.ManyToManyField(User, through='GroupMembership')
class GroupMembership(models.Model):
    user = ...
    group = ...
I want to get a list of Groups ordered by an annotated field of their members.
I'm using trigram search to filter and annotate the User queryset.
To get the annotated users I have something like this:
User.objects.annotate(...).annotate(similarity=...)
And now I'm trying to sort Groups queryset by Users' "similarity":
ann_users = User.objects.annotate(...).annotate(similarity=...)
qs = Group.objects.prefetch_related(Prefetch('members',
                                             queryset=ann_users))
qs.annotate(similarity=Max('members__similarity')).order_by('similarity')
But it doesn't work, because prefetch_related does the ‘joining’ in Python; so I have the error:
"FieldError: Cannot resolve keyword 'members' into field."
I assume you have a database function for trigram similarity of names together with a Django binding for it, or that you create one:
from django.db.models import Max, Func, Value, Prefetch
class Similarity(Func):
    function = 'SIMILARITY'
    arity = 2
SEARCHED_NAME = 'searched_name'
ann_users = User.objects.annotate(similarity=Similarity('name', Value(SEARCHED_NAME)))
qs = Group.objects.prefetch_related(Prefetch('members', queryset=ann_users))
qs = qs.annotate(
    similarity=Max(Similarity('members__name', Value(SEARCHED_NAME)))
).order_by('similarity')
The main query is compiled to
SELECT app_group.id, MAX(SIMILARITY(app_user.name, %s)) AS similarity
FROM app_group
LEFT OUTER JOIN app_groupmembership ON (app_group.id = app_groupmembership.group_id)
LEFT OUTER JOIN app_user ON (app_groupmembership.user_id = app_user.id)
GROUP BY app_group.id
ORDER BY similarity ASC;
-- params: ['searched_name']
This is not exactly what the title asks for, but the result is the same.
Note: how many times the SIMILARITY function is evaluated depends on the database query optimizer. It would be interesting to compare query plans with EXPLAIN to see whether the original idea, written as a raw query for some simplified case, performs better.

How to chain Django querysets preserving individual order

I'd like to append or chain several Querysets in Django, preserving the order of each one (not the result). I'm using a third-party library to paginate the result, and it only accepts lists or querysets. I've tried these options:
Queryset join: Doesn't preserve ordering in individual querysets, so I can't use this.
result = queryset_1 | queryset_2
Using itertools: Calling list() on the chain object evaluates both querysets, which could cause a lot of overhead, couldn't it?
result = list(itertools.chain(queryset_1, queryset_2))
How do you think I should proceed?
This solution prevents duplicates:
from django.db.models import Case, IntegerField, Q, Value, When
q1 = Q(...)
q2 = Q(...)
q3 = Q(...)
qs = (
    Model.objects
    .filter(q1 | q2 | q3)
    .annotate(
        search_type_ordering=Case(
            When(q1, then=Value(2)),
            When(q2, then=Value(1)),
            When(q3, then=Value(0)),
            default=Value(-1),
            output_field=IntegerField(),
        )
    )
    .order_by('-search_type_ordering', ...)
)
If the querysets are of different models, you have to evaluate them to lists and then you can just append:
result = list(queryset_1) + list(queryset_2)
If they are the same model, you should combine the queries using the Q object and 'order_by("queryset_1 field", "queryset_2 field")'.
The right answer largely depends on why you want to combine these and how you are going to use the results.
So, inspired by Peter's answer this is what I did in my project (Django 2.2):
from django.db import models
from .models import MyModel
# Add an extra field to each query with a constant value
queryset_0 = MyModel.objects.annotate(
    qs_order=models.Value(0, models.IntegerField())
)
# Each constant should basically act as the position where we want the
# queryset to stay
queryset_1 = MyModel.objects.annotate(
    qs_order=models.Value(1, models.IntegerField())
)
[...]
queryset_n = MyModel.objects.annotate(
    qs_order=models.Value(n, models.IntegerField())
)
# Finally, I ordered the union result by that extra field.
union = queryset_0.union(
    queryset_1,
    queryset_2,
    [...],
    queryset_n).order_by('qs_order')
With this, I could order the resulting union as I wanted without changing any private attribute while only evaluating the querysets once.
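The constant-column trick can be checked at the SQL level with Python's built-in sqlite3 module. This is a sketch under assumptions: the item table, its columns, and its rows are hypothetical. It also adds name as a second sort key, since a UNION on its own does not promise to keep the order of rows inside each part.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO item (name, status) VALUES (?, ?)",
    [("b", "done"), ("a", None), ("c", None), ("d", "done")],
)

# Each SELECT carries a constant qs_order marking which part it belongs to;
# the combined result is ordered by that constant first, then by name.
rows = conn.execute("""
    SELECT name, 0 AS qs_order FROM item WHERE status IS NULL
    UNION
    SELECT name, 1 AS qs_order FROM item WHERE status IS NOT NULL
    ORDER BY qs_order, name
""").fetchall()
```

All rows from the first part come before any row from the second, and within each part the rows are sorted by name.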
I'm not 100% sure this solution works in every possible case, but it looks like the result is the union of two QuerySets (on the same model) preserving the order of the first one:
union = qset1.union(qset2)
union.query.extra_order_by = qset1.query.extra_order_by
union.query.order_by = qset1.query.order_by
union.query.default_ordering = qset1.query.default_ordering
union.query.get_meta().ordering = qset1.query.get_meta().ordering
I did not test it extensively, so before you use that code in production, make sure it behaves as expected.
If you need to merge two querysets into a third queryset, here is an example, using _result_cache.
Model:
class ImportMinAttend(models.Model):
    country = models.CharField(max_length=2, blank=False, null=False)
    status = models.CharField(max_length=5, blank=True, null=True, default=None)
From this model, I want to display a list of all the rows such that:
(query 1) rows with an empty status go first, ordered by country
(query 2) rows with a non-empty status go second, ordered by country
I want to merge query 1 and query 2.
# get all the objects
queryset = ImportMinAttend.objects.all()
# get the first queryset
queryset_1 = queryset.filter(status=None).order_by("country")
# len() or anything else that hits the database, so that _result_cache is populated
len(queryset_1)
# get the second queryset
queryset_2 = queryset.exclude(status=None).order_by("country")
# append the second queryset to the first one AND PRESERVE ORDER
for query in queryset_2:
    queryset_1._result_cache.append(query)
# final result
queryset = queryset_1
It might not be very efficient, but it works :).
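For comparison, the ordering described in this answer (empty status first, then by country) can also be written as a single composite sort key. The following is a plain-Python sketch with invented rows standing in for ImportMinAttend objects; it illustrates the target ordering only and does not touch the queryset internals.

```python
# Hypothetical rows with the same two fields as ImportMinAttend.
rows = [
    {"country": "FR", "status": "ok"},
    {"country": "DE", "status": None},
    {"country": "BE", "status": "late"},
    {"country": "AT", "status": None},
]

# False sorts before True, so rows with an empty status come first;
# the country code breaks ties inside each group.
ordered = sorted(rows, key=lambda r: (r["status"] is not None, r["country"]))
countries = [r["country"] for r in ordered]
```

The two groups stay separate and each is sorted by country, which is the result the merged querysets are meant to produce.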
From Django 1.11 (released on April 4, 2017) you can use union() for this; documentation here:
https://docs.djangoproject.com/en/1.11/ref/models/querysets/#django.db.models.query.QuerySet.union
Here is the Version 2.1 link to this:
https://docs.djangoproject.com/en/2.1/ref/models/querysets/#union
Use the union() function to combine multiple querysets, rather than the OR (|) operator. This avoids a very inefficient OUTER JOIN query that reads the entire table.
If two querysets have a common field, you can order the combined queryset by that field. The querysets are not evaluated during this operation.
For example:
class EventsHistory(models.Model):
    id = models.IntegerField(primary_key=True)
    event_time = models.DateTimeField()
    event_id = models.IntegerField()
class EventsOperational(models.Model):
    id = models.IntegerField(primary_key=True)
    event_time = models.DateTimeField()
    event_id = models.IntegerField()
qs1 = EventsHistory.objects.all()
qs2 = EventsOperational.objects.all()
qs_combined = qs2.union(qs1).order_by('event_time')

Django help - how to get data from foreign keys

Given the following model setup (stripped down):
class Object(models.Model):
    name = models.CharField(max_length=255)
class ObjectVersion(models.Model):
    Cabinet = models.ForeignKey(Object)
Say I had a bunch of ObjectVersion entries that I'd filtered called "filteredObjects". How would I get the names of those objects into a vector?
You can use either values_list or values. Both are described in the Django QuerySet API reference.
>>> ObjectVersion.objects.filter(...).values('Cabinet__name')
[{'Cabinet__name':'foo'}, {'Cabinet__name':'foo2'}, ...]
or
>>> ObjectVersion.objects.filter(...).values_list('Cabinet__name')
[('foo',), ('foo2',), ...]
So you are saying that you have a list called filteredObjects?
objNames = [fo.Cabinet.name for fo in filteredObjects]
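Since the name lives on the related Object rather than on ObjectVersion, the comprehension has to go through the Cabinet foreign key. A small plain-Python sketch of that attribute access follows; the dataclasses are stand-ins for the Django models and the sample names are invented.

```python
from dataclasses import dataclass


@dataclass
class Object:
    # Stand-in for the Object model.
    name: str


@dataclass
class ObjectVersion:
    # Stand-in for the ObjectVersion model; Cabinet plays the role of the foreign key.
    Cabinet: Object


filteredObjects = [ObjectVersion(Object("foo")), ObjectVersion(Object("foo2"))]

# Follow the foreign key attribute on each version to reach the related name.
objNames = [fo.Cabinet.name for fo in filteredObjects]
```

With real model instances, each fo.Cabinet access may trigger an extra query unless the queryset was built with select_related('Cabinet').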
result = ObjectVersion.objects.filter(...).values_list('Cabinet__name', flat=True)
Output:
result = ['ver_1', 'ver_2', ...]
