Django 1.11 - nested OuterRef usage

I recently updated Django to the bleeding-edge version 1.11rc1 because of the Subquery feature that was introduced there.
Here is my use case. I have the following models: Users, Groups and Permissions. Users can be put into Groups (e.g. an Administrators group), and Permissions are lists of users and groups that are allowed to do something (e.g. User A, User B and Administrators can create new users). What I want to do now is efficiently display all of the Permissions together with the number of users in each. In other words, I want a QuerySet that returns all the information about the Permissions and also calculates the number of users for each Permission. The first, obvious way to work around this would be to add a get_user_count method to the Permission model that counts the users from my ManyToMany relationships, but that would require at least 1 additional query per Permission, which is unacceptable for me, as I'm planning to have a lot of Permissions. This is where I want to use Subquery.
To clarify, this is models.py:
from django.db import models
class User(models.Model):
    name = models.CharField(max_length=20)
class Group(models.Model):
    users = models.ManyToManyField(User)
class Permission(models.Model):
    users = models.ManyToManyField(User)
    groups = models.ManyToManyField(Group)
And I want to create a queryset that returns all Permissions with the number of users in each. For the sake of example, let's say I only want to include Users that belong to the Permission's groups, so I'd have something like this:
groups = Group.objects.filter(permission=OuterRef('pk'))
users = User.objects.filter(group__in=groups)
queryset = Permission.objects.annotate(
    user_no=Subquery(users.annotate(c=Count('*')).values('c'))
)
The problem here is that my OuterRef cannot be resolved as used in "subquery's filter's filter":
This queryset contains a reference to an outer query and may only be used in a subquery.
Although, when I use another subquery to fetch the groups:
groups = Group.objects.filter(permission=OuterRef(OuterRef('pk')))
users = User.objects.filter(group__in=Subquery(groups))
queryset = Permission.objects.annotate(
    user_no=Subquery(users.annotate(c=Count('*')).values('c'))
)
I get an error right in the first line:
int() argument must be a string, a bytes-like object or a number, not 'OuterRef'
The rest of the lines do not matter and have no influence on the error. The weird thing is, the exact same syntax appears in the documentation: https://docs.djangoproject.com/en/dev/ref/models/expressions/#django.db.models.OuterRef
The question is: what am I doing wrong? Or how can I achieve what I want in another (still efficient) way?

Well, it's a bug in Django: https://github.com/django/django/pull/9529
I worked around it by avoiding the double-depth reference (OuterRef(OuterRef('pk'))) and using annotations instead:
return self.annotate(category=Subquery(
    # this is the "inner" subquery, which is now just an annotated variable `category`
    Category.objects.filter(offer=OuterRef('pk'))[:1].values('pk')
)).annotate(fs=Subquery(
    # this is the "outer" subquery; instead of nesting a subquery, I just use the annotated `category` variable
    Category.objects.filter(pk=OuterRef('category')).values('slug')
))
Hope it helps :)
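For reference, the count the question asks for (distinct users reachable through each Permission's groups) can be sketched in plain SQL using only the standard library. The table names below are hypothetical stand-ins for whatever tables Django would generate for these models, and the data is made up. In ORM terms, an annotation along the lines of Count('groups__users', distinct=True) on Permission should express roughly the same thing without any nested OuterRef, though that is an assumption about this schema rather than something confirmed in the thread.

```python
import sqlite3

# Hypothetical table names standing in for the tables Django would create
# for User, Group, Permission and the two many-to-many relations used here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE app_user (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE grp (id INTEGER PRIMARY KEY);
    CREATE TABLE permission (id INTEGER PRIMARY KEY);
    CREATE TABLE grp_users (grp_id INTEGER, user_id INTEGER);
    CREATE TABLE permission_groups (permission_id INTEGER, grp_id INTEGER);

    INSERT INTO app_user VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO grp VALUES (1), (2);
    INSERT INTO permission VALUES (1), (2), (3);
    -- group 1 holds users A and B, group 2 holds users B and C
    INSERT INTO grp_users VALUES (1, 1), (1, 2), (2, 2), (2, 3);
    -- permission 1 has both groups, permission 2 only group 2, permission 3 none
    INSERT INTO permission_groups VALUES (1, 1), (1, 2), (2, 2);
""")

# One correlated subquery per permission row; DISTINCT keeps user B from
# being counted twice for permission 1, where B is in both groups.
rows = conn.execute("""
    SELECT p.id,
           (SELECT COUNT(DISTINCT gu.user_id)
              FROM grp_users AS gu
              JOIN permission_groups AS pg ON pg.grp_id = gu.grp_id
             WHERE pg.permission_id = p.id) AS user_no
      FROM permission AS p
     ORDER BY p.id
""").fetchall()

for permission_id, user_no in rows:
    print(permission_id, user_no)
```

The whole result comes back in a single query, which is the property the question is after.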

Related

Django .order_by() related field returns too many items

I'm trying to return a list of users that have recently made a post, but the order_by method makes it return too many items.
There are only 2 accounts in total, but when I call
test = Account.objects.all().order_by('-posts__timestamp')
[print(i) for i in test]
it will return the author of every post instance, and its duplicates. Not just the two account instances.
test#test.example
test#test.example
test#test.example
test#test.example
foo#bar.example
Any help?
class Account(AbstractBaseUser):
    ...
class Posts(models.Model):
    author = models.ForeignKey('accounts.Account', on_delete=models.RESTRICT, related_name="posts")
    timestamp = models.DateTimeField(auto_now_add=True)
    title = ...
    content = ...

This is completely normal. It helps to understand how the SQL query is generated.
Yours should look something like this:
select *
from account
left join post on post.author_id = account.id
order by post.timestamp desc
You are effectively selecting every post together with its related user, so it is normal to get duplicated users.
What you could do is ensure that you are selecting distinct users: Account.objects.order_by('-posts__timestamp').distinct('pk'). Note that passing field names to distinct() only works on PostgreSQL, and there the order_by() fields must start with the fields given to distinct(), so this exact call may be rejected; annotating each account with Max('posts__timestamp') and ordering by that annotation avoids the duplicates on any backend.
What I would do is cache this information on the account (directly on the Account model or in another model that has a one-to-one relationship with your users).
Adding a last_post_date field to your Account model would give you a lighter query.
Updating Account.last_post_date every time a Post is created can be a little tedious, but you can abstract this away using Django model signals.
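To make the duplication concrete, here is a small standard-library sketch with hypothetical table names and made-up rows. The first query mirrors the join above and yields one row per (account, post) pair; the second groups by account and orders by the newest post, which is roughly what an ORM call such as Account.objects.annotate(last_post=Max('posts__timestamp')).order_by('-last_post') would be expected to produce.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE post (id INTEGER PRIMARY KEY, author_id INTEGER, timestamp TEXT);

    INSERT INTO account VALUES (1, 'first'), (2, 'second');
    INSERT INTO post VALUES
        (1, 1, '2021-01-01'), (2, 1, '2021-01-03'),
        (3, 2, '2021-01-02'), (4, 1, '2021-01-04');
""")

# Ordering by a column of the joined table: one row per (account, post) pair.
joined = conn.execute("""
    SELECT account.email
      FROM account
      LEFT JOIN post ON post.author_id = account.id
     ORDER BY post.timestamp DESC
""").fetchall()

# Grouping by account and ordering by the newest post: one row per account.
grouped = conn.execute("""
    SELECT account.email, MAX(post.timestamp) AS last_post
      FROM account
      LEFT JOIN post ON post.author_id = account.id
     GROUP BY account.id
     ORDER BY last_post DESC
""").fetchall()

print(len(joined), "rows from the plain join")
print(len(grouped), "rows after grouping")
```

The grouped form keeps the "most recent poster first" ordering without needing a cached column, at the cost of an aggregate on every query.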

filter model using manytomanyfield

I've got these models :
work.py
class Work(models.Model):
    ...
    network = models.ManyToManyField(Network, related_name='network')
network.py
class Network(models.Model):
    ...
    users = models.ManyToManyField(User, related_name="users")
In my views.py I got this class-based generic ListView
class WorkList(PermsMixin, ListContextMixin, ListView):
    model = Work
    # Here I want to filter queryset
What I want to do is to filter the queryset such that the logged in user is in the network users.
I tried many things, for example
queryset = Work.objects.all()
queryset.filter('self.request.user__in=network_users')
But I got this error:
ValueError : too many values to unpack (expected 2)
Can anyone help me please?

This is not the way to write queries. Typically you pass named arguments to .filter(..); these follow a "language" that specifies what you want.
In your case you want, if I understand correctly, all the Works for which there exists a related Network that the User belongs to. You do this with:
Work.objects.filter(network__users=self.request.user).distinct()
Here two consecutive underscores (__) are used to see "through" a relation. In case of a one-to-many relation, or a many-to-many relation, it is sufficient that one path (a path from Work through Network to user) exists.
The .distinct() should be used if you want to only return different Works: if you do not use this, the same Work object can be in the queryset multiple times, since there can be multiple Networks that belong to the given user and are part of that Work.
Extra remark: you define the related_names exactly the same as the field name, for example:
network = models.ManyToManyField(Network,related_name='network')
But the related_name is not the name of the relation you define there; it is the name of the reverse relation, the implicit relation from Network back to Work. So right now you obtain a queryset of Works by writing some_network.network, which does not make sense. You would typically give it a name like:
network = models.ManyToManyField(Network,related_name='work')
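The effect of .distinct() described above can be illustrated with a small standard-library sketch. The table and column names are hypothetical stand-ins for the join tables behind the two many-to-many fields, and the data is invented: user 7 is in two networks that are both attached to work 1, so the join finds two paths to the same Work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE work (id INTEGER PRIMARY KEY);
    CREATE TABLE work_network (work_id INTEGER, network_id INTEGER);
    CREATE TABLE network_users (network_id INTEGER, user_id INTEGER);

    INSERT INTO work VALUES (1), (2);
    -- work 1 belongs to networks 10 and 11, work 2 to network 12
    INSERT INTO work_network VALUES (1, 10), (1, 11), (2, 12);
    -- user 7 is a member of networks 10 and 11, user 8 of network 12
    INSERT INTO network_users VALUES (10, 7), (11, 7), (12, 8);
""")

# The same join, with and without DISTINCT, filtered on one user id.
JOIN_SQL = """
    SELECT {distinct} work.id
      FROM work
      JOIN work_network ON work_network.work_id = work.id
      JOIN network_users ON network_users.network_id = work_network.network_id
     WHERE network_users.user_id = ?
     ORDER BY work.id
"""

without_distinct = conn.execute(JOIN_SQL.format(distinct=""), (7,)).fetchall()
with_distinct = conn.execute(JOIN_SQL.format(distinct="DISTINCT"), (7,)).fetchall()

print(without_distinct)
print(with_distinct)
```

Without DISTINCT, work 1 is returned once per matching network; with it, each Work appears a single time.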

Django 1.7 and smart, deep, filtered aggregations

I'm using Django 1.7 and I'm trying to take advantage of the new features in the ORM.
Assume I have:
from django.db import models
class Player(models.Model):
    name = models.CharField(...)
class Question(models.Model):
    title = models.CharField(...)
    answer1 = models.CharField(...)
    answer2 = models.CharField(...)
    answer3 = models.CharField(...)
    right = models.PositiveSmallIntegerField(...)  # choices=1, 2, or 3
class Session(models.Model):
    player = models.ForeignKey(Player, related_name="games")
class RightAnswerManager(models.Manager):
    def get_queryset(self):
        # Only answers that match the question's correct choice
        return super(RightAnswerManager, self).get_queryset().filter(answer=models.F('question__right'))
class AnsweredQuestion(models.Model):
    session = models.ForeignKey(Session, related_name="questions")
    question = models.ForeignKey(Question, ...)
    answer = models.PositiveSmallIntegerField(...)  # 1, 2, 3, or None if not yet answered
    objects = models.Manager()
    right = RightAnswerManager()
I know I can do:
Session.objects.prefetch_related('questions')
And get the sessions with the questions.
Also I can do:
Session.objects.prefetch_related(models.Prefetch('questions', queryset=AnsweredQuestion.right.all(), to_attr='answered'))
And get the sessions with the list of questions that were actually answered and right.
BUT I cannot do aggregation over those, to get -e.g.- the count of elements instead:
Session.objects.prefetch_related(models.Prefetch('questions', queryset=AnsweredQuestion.right.all(), to_attr='answered')).annotate(total_right=models.Count('answered'))
since answered is not a real field:
FieldError: Cannot resolve keyword 'rightones' into field. Choices are: id, name, sessions
This is only a sample, since there are a lot of fields in my models I never included. However the idea is clear: I cannot aggregate over created attributes.
Is there a way, without falling back to raw SQL, to answer the following question?
Get each user annotated with their "points".
A user may play any number of sessions.
In each session the user gets many questions to answer.
For each right answer, a point is earned.
In RAW SQL it would be something like:
SELECT user.*, COUNT(answeredquestion.id)
FROM user
LEFT OUTER JOIN session ON (session.user_id = user.id)
INNER JOIN answeredquestion ON (answeredquestion.session_id = session.id)
INNER JOIN question ON (answeredquestion.question_id = question.id)
WHERE answeredquestion.answer = question.right
GROUP BY user.id
Or something like that (since there's a functional dependency in the grouping field, I would collect the user data and count the related answeredquestions, assuming the condition passes). So RAW queries are not an option for me.
The idea is to get the users with the total points.
My question can be answered in one of two ways (or both).
Is there a way to perform the same query (I never actually tested this exact query; it's here to present the idea) with the ORM in Django 1.7, somehow using Prefetch or manager selection on related/inverse FK fields? Iteration is not allowed here (I'd have a quadratic version of the N+1 problem!).
Is there any Django package that does this, perhaps a third-party abstraction over raw calls? I ask because I will have many queries like this one.

I don't believe using prefetch gains you anything in this situation. Generally prefetch_related and select_related are used when looping through a queryset and accessing a related object for each item. By doing the annotate in the initial query I believe Django will take care of that optimization for you.
For the question "Get each user annotated with their points", try this query:
Player.objects.annotate(total_right=models.Count('games__questions__right'))
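One caveat worth checking: in the models above, right on AnsweredQuestion is a manager rather than a field, so a lookup path ending in it may not resolve as a countable column. The underlying aggregation is a conditional count, sketched here in plain SQL with the standard library. Table names are hypothetical, the data is invented, and the question's right column is renamed right_answer because RIGHT is an SQL keyword. On newer Django versions the same idea is usually written with conditional expressions (Case/When, available from 1.8) or the filter argument to aggregates (from 2.0); neither exists in 1.7.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE player (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE question (id INTEGER PRIMARY KEY, right_answer INTEGER);
    CREATE TABLE session (id INTEGER PRIMARY KEY, player_id INTEGER);
    CREATE TABLE answeredquestion (
        id INTEGER PRIMARY KEY, session_id INTEGER,
        question_id INTEGER, answer INTEGER
    );

    INSERT INTO player VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO question VALUES (1, 1), (2, 3);
    -- ann played two sessions, bob one, cy none
    INSERT INTO session VALUES (1, 1), (2, 1), (3, 2);
    INSERT INTO answeredquestion VALUES
        (1, 1, 1, 1),     -- ann, right
        (2, 1, 2, 2),     -- ann, wrong
        (3, 2, 2, 3),     -- ann, right
        (4, 3, 1, 2),     -- bob, wrong
        (5, 3, 2, NULL);  -- bob, not answered yet
""")

# LEFT JOINs keep players with no sessions or no right answers in the result;
# COUNT skips NULLs, so only rows where the CASE yields 1 are counted.
points = conn.execute("""
    SELECT player.name,
           COUNT(CASE WHEN aq.answer = question.right_answer THEN 1 END) AS points
      FROM player
      LEFT JOIN session ON session.player_id = player.id
      LEFT JOIN answeredquestion AS aq ON aq.session_id = session.id
      LEFT JOIN question ON question.id = aq.question_id
     GROUP BY player.id
     ORDER BY player.id
""").fetchall()

for name, total in points:
    print(name, total)
```

Unlike a WHERE answer = right filter combined with inner joins, this form still returns players whose score is zero.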

chaining queries together in Django

I have a query that gets me 32 avatar images from my avatar application:
newUserAv = Avatar.objects.filter(valid=True)[:32]
I'd like to combine this with a query to Django's auth User model, so I can get the last 32 people who have avatar images, sorted by the date joined.
What is the best way to chain these two together?
The avatar application was a reusable app, and its model is:
image = models.ImageField(upload_to="avatars/%Y/%b/%d", storage=storage)
user = models.ForeignKey(User)
date = models.DateTimeField(auto_now_add=True)
valid = models.BooleanField()
Note that the date field is the date the avatar was updated, so it is not suitable for my purpose.

Either you put a field in your own User class (you might have to subclass User or compose with django.contrib.auth.models.User) that indicates that the User has an avatar. Then you can write your query easily.
Or do something like this:
from itertools import groupby
avatars = Avatar.objects.select_related("user").filter(valid=True).order_by("-user__date_joined")[:32]
# groupby() only merges consecutive rows, so it relies on the ordering above keeping each user's avatars together
grouped_users = groupby(avatars, lambda x: x.user)
user_list = []
for user, avatar_list in grouped_users:
    user.avatar = list(avatar_list)[0]
    user_list.append(user)
# user_list is now what you asked for in the first place:
# a list of users with their avatars
This assumes that one user has one and only one avatar. Your model allows for more than one avatar per user so you have to watch out not to store more than one.
Explanation of Code Snippet:
The avatars of the 32 most recently joined users are requested together with the related user, so the upcoming code does not need a database query for any of them.
The list of avatars is then grouped with the user as the key. list() consumes all items from the avatar_list generator, and the first item (there should only be one) is assigned to user.avatar.
Note that this is not necessary; you could always do something like:
for avatar in avatars:
    user = avatar.user
But it might feel more natural to access the avatars via user.avatar.
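The grouping step can be tried without a database. The sketch below uses itertools.groupby from the standard library on hypothetical namedtuple stand-ins for the Avatar rows and their users; the names and values are invented for illustration.

```python
from collections import namedtuple
from itertools import groupby

# Hypothetical stand-ins for the Avatar rows and their related users.
User = namedtuple("User", "username date_joined")
Avatar = namedtuple("Avatar", "user image")

alice = User("alice", "2010-03-01")
bob = User("bob", "2010-02-01")

# Already sorted so that each user's avatars sit next to each other,
# newest joiner first, as the order_by() in the answer is meant to do.
avatars = [
    Avatar(alice, "a1.png"),
    Avatar(alice, "a2.png"),
    Avatar(bob, "b1.png"),
]

user_list = []
for user, avatar_group in groupby(avatars, key=lambda avatar: avatar.user):
    # Take the first avatar of each consecutive run and ignore the rest.
    first_avatar = next(avatar_group)
    user_list.append((user.username, first_avatar.image))

print(user_list)
```

Because groupby only merges adjacent items, unsorted input would yield the same user more than once, which is why the ordering matters.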

It's not possible to combine queries on two different base models. Django won't let you do this (it'll throw an error telling you exactly that).
However, if you have a foreignkey from one model to the other, then adding select_related() to your query will fetch the related objects into memory in a single DB query so that you can access them without going back to the DB.

Django: How to filter Users that belong to a specific group

I'm looking to narrow a query set for a form field that has a foreignkey to the User's table down to the group that a user belongs to.
The groups have been previously associated by me. The model might have something like the following:
myuser = models.ForeignKey(User)
And my ModelForm is very bare bones:
class MyForm(ModelForm):
    class Meta:
        model = MyModel
So when I instantiate the form I do something like this in my views.py:
form = MyForm()
Now my question is, how can I take the myuser field, and filter it so only users of group 'foo' show up.. something like:
form.fields["myuser"].queryset = ???
The query in SQL looks like this:
mysql> SELECT * from auth_user INNER JOIN auth_user_groups ON auth_user.id = auth_user_groups.user_id INNER JOIN auth_group ON auth_group.id = auth_user_groups.group_id WHERE auth_group.name = 'client';
I'd like to avoid using raw SQL though. Is it possible to do so?

You'll want to use Django's convention for joining across relationships to join to the group table in your query set.
Firstly, I recommend giving your relationship a related_name. This makes the code more readable than what Django generates by default.
class Group(models.Model):
    myuser = models.ForeignKey(User, related_name='groups')
If you want only a single group, you can join across that relationship and compare the name field using either of these methods:
form.fields['myuser'].queryset = User.objects.filter(
    groups__name='foo')
form.fields['myuser'].queryset = User.objects.filter(
    groups__name__in=['foo'])
If you want to qualify multiple groups, use the in clause:
form.fields['myuser'].queryset = User.objects.filter(
    groups__name__in=['foo', 'bar'])
If you want to quickly see the generated SQL, you can do this:
qs = User.objects.filter(groups__name='foo')
print(qs.query)

This is a really old question, but for those googling the answer to this (like I did), please know that the accepted answer is no longer 100% correct. A user can belong to multiple groups, so to correctly check if a user is in some group, you should do:
qs = User.objects.filter(groups__name__in=['foo'])
Of course, if you want to check for multiple groups, you can add those to the list:
qs = User.objects.filter(groups__name__in=['foo', 'bar'])
