How to change join and group by SQL to ORM in Django - python

I'm new in Django. So, I want to join two models which are company and client and count the number of clients for each of the company. Here the SQL
SELECT Company_company.name, count(Client_client.cid)
FROM Company_company
LEFT JOIN Client_client
ON Company_company.comid = Client_client.comid_id
GROUP BY Company_company.name;
But since in Django, we use ORM. So I'm a little bit confusing since I'm a beginner. I already refer few SQL to ORM converter website such as Django ORM and do some try and error. But, I didn't know where the problem since I want the output from the ORM to be classified into a different array. Here is my code:
labels = []
data = []
queryClientCompany = client.objects.values('comid').annotate(c=Count('cid')).values('comid__name','c')
for comp in queryClientCompany:
labels.append(comp.comid__name)
data.append(comp.c)
Here some of the relevant things in the client and company models:
class client (models.Model):
#client info
cid = models.AutoField(primary_key = True)
comid = models.ForeignKey(company,related_name='companys',
on_delete = models.DO_NOTHING,verbose_name="Company",null = True, blank = True)
class company(models.Model):
comid = models.AutoField(_('Company'),primary_key = True)
#company info
name = models.CharField(_('Company Name'),max_length = 50)
The error stated that the comid__name is not defined. So actually how to append the result? I hope someone can help me. Thank you for helping in advanced.

You should query from the opposite side to perform the LEFT OUTER JOIN between company and client (and not client and company):
from django.db.models import Count
labels = []
data = []
queryClientCompany = company.objects.annotate(
c=Count('companys__cid')
)
for comp in queryClientCompany:
labels.append(comp.name)
data.append(comp.c)
The companys part is due to the related_name='copanys', but it does not make much sense to name this relation that way. The related_name=… parameter [Django-doc] specifies how to access the Clients for a given Company, so clients is a more appropriate value for the related_name:
class client (models.Model):
cid = models.AutoField(primary_key=True)
comid = models.ForeignKey(
company,
related_name='clients',
on_delete = models.DO_NOTHING,
verbose_name="Company",
null = True,
blank = True
)
then the query is:
from django.db.models import Count
labels = []
data = []
queryClientCompany = company.objects.annotate(
c=Count('clients__cid')
)
for comp in queryClientCompany:
labels.append(comp.name)
data.append(comp.c)

Related

Django Left join how

I fairly new to Django and stuck with creating a left join in Django. I tried so many, but none of them seems to be working:
The query I want to translate to Django is:
select ssc.id
,mgz.Title
,tli.id
,tli.Time
from Subscription ssc
join Person prs
on ssc.PersonID = prs.id
and prs.id = 3
join Magazine mgz
on mgz.id = ssc.MagazineID
and mgz.from <= date.today()
and mgz.until > date.today()
left join TimeLogedIn tli
on tli.SubscriptionID = ssc.id
and tli.DateOnline = date.today()
The model I'm using looks like this:
class Magazine(models.Model):
Title = models.CharField(max_length=100L)
from = models.Datefield()
until = models.Datefield()
Persons = models.ManyToManyField(Person, through='Subscription')
class Person(models.Model):
user = models.OneToOneField(User, on_delete=models.CASCADE)
Magazines = models.ManyToManyField(Magazine, through='Subscription')
class Subscription(models.Model):
MagazineID = models.ForeignKey(Magazine,on_delete=models.CASCADE)
PersonID = models.ForeignKey(Person,on_delete=models.CASCADE)
class TimeLogedIn(models.Model):
SubscriptionID = models.ForeignKey('Subscription', on_delete=models.CASCADE)
DateOnline = models.DateField()
Time = models.DecimalField(max_digits=5, decimal_places=2)
Like I said, tried so many but no succes and now I don't know how to do this in Django ORM , is it even possible? I created already a raw-query and this is working ok, but how to create this in Django ORM?
You can use field lookups lte and gt to filter your objects and then values() method.
You can also querying in the opposite direction and use Q objects for null values:
from django.db.models import Q
Subscription.objects.filter(
PersonID_id=3,
MagazineID__from__lte=date.today(),
MagazineID__until__gt=date.today()
).filter(
Q(TimeLogedIn__DateOnline=date.today()) | Q(TimeLogedIn__DateOnline__isnull=True)
).values("id", "MagazineID__Title", "TimeLogedIn__id", "TimeLogedIn__Time")
OR from TimeLogedIn:
TimeLogedIn.objects.filter(DateOnline=date.today()).filter(
SubscriptionID__MagazineID__from__lte=date.today(),
SubscriptionID__MagazineID__util__gt=date.today()
).values(
"SubscriptionID_id", "SubscriptionID__MagazineID__Title", "id", "Time"
)
Querysets also have the query attribute that contains the sql query to be executed, you can see it like following:
print(TimeLogedIn.objects.filter(...).values(...).query)
Note: Behind the scenes, Django appends "_id" to the field name to create its database column name. Therefore it should be
subscription, instead of SubscriptionID.
You can also use prefetch_related() and select_related() to prevent multiple database hits:
SubscriptionID.objects.filter(...).prefetch_related("TimeLogedIn_set")
SubscriptionID.objects.filter(...).select_related("PersonID")

Perform lookup and update within a single Django query

I have two models: MetaModel and RelatedModel. I want to include the result of a RelatedModel lookup within a MetaModel query, and I'd like to do this within a single DB call.
I've tried to define a 'subquery' QuerySet for use in the main query, but that hasn't worked - it's still making two queries to complete the operation.
Note: I can't use a traditional ForeignKey relationship because the profile_id field is not unique. Uniqueness is a combination of profile_id and channel. This is an aggregation table and profile_id is not guaranteed to be unique across multiple third-party channels.
Any suggestions?
Models:
class Channel(models.Model):
id = models.AutoField(primary_key=True)
name = models.CharField(
max_length=25,
)
class MetaModel(models.Model):
profile_id = fields.IntegerField()
channel = fields.ForeignKey(Channel))
metadata = fields.TextField()
class RelatedModel(models.Model):
related_id = fields.IntegerField()
profile_id = fields.IntegerField()
channel = fields.ForeignKey(Channel))
Dummy data
channel = Channel("Web site A")
channel.save()
sample_meta = MetaModel(profile_id=1234, channel=channel)
sample_related = RelatedModel(profile_id=1234, related_id=5678, channel=channel)
Query:
# Create a queryset to filter down to the single record we need the `profile_id` for
# I've limited to the only field we need via a `values` operation
related_qs = RelatedAccount.objects.filter(
related_id=5678,
channel=channel
).values_list("profile_id", flat=True)
# I'm doing an update_or_create as there is other data to store, not included for brevity
obj, created = MetaModel.objects.update_or_create(
profile_id=related_qs.first(), # <<< This var is the dynamic part of the query
channel=channel,
defaults={"metadata": "Metadata is added to a new or existing record."}
)
Regarding your note on uniqueness, you can use unique_together option in Django as described here in the documentation.
class MetaModel(models.Model):
profile_id = fields.ForeignKey(RelatedModel)
channel = fields.ForeignKey(Channel)
metadata = fields.TextField()
class Meta:
unique_together = ('profile_id', 'channel')
Then you can change your query accordingly and should solve your problem.

My django query is very slow in givig me data on terminal

I have a users table which has 3 types of users Student, Faculty and Club and I have a university table.
What I want is how many users are there in the specific university.
I am getting my desired output but the output is very slow.I have 90k users and the output it is generating it takes minutes to produce results.
My user model:-
from __future__ import unicode_literals
from django.db import models
from django.contrib.auth.models import User
from cms.models.masterUserTypes import MasterUserTypes
from cms.models.universities import Universities
from cms.models.departments import MasterDepartments
# WE ARE AT MODELS/APPUSERS
requestChoice = (
('male', 'male'),
('female', 'female'),
)
class Users(models.Model):
id = models.IntegerField(db_column="id", max_length=11, help_text="")
userTypeId = models.ForeignKey(MasterUserTypes, db_column="userTypeId")
universityId = models.ForeignKey(Universities, db_column="universityId")
departmentId = models.ForeignKey(MasterDepartments , db_column="departmentId",help_text="")
name = models.CharField(db_column="name",max_length=255,help_text="")
username = models.CharField(db_column="username",unique=True, max_length=255,help_text="")
email = models.CharField(db_column="email",unique=True, max_length=255,help_text="")
password = models.CharField(db_column="password",max_length=255,help_text="")
bio = models.TextField(db_column="bio",max_length=500,help_text="")
gender = models.CharField(db_column="gender",max_length=6, choices=requestChoice,help_text="")
mobileNo = models.CharField(db_column='mobileNo', max_length=16,help_text="")
dob = models.DateField(db_column="dob",help_text="")
major = models.CharField(db_column="major",max_length=255,help_text="")
graduationYear = models.IntegerField(db_column='graduationYear',max_length=11,help_text="")
canAddNews = models.BooleanField(db_column='canAddNews',default=False,help_text="")
receivePrivateMsgNotification = models.BooleanField(db_column='receivePrivateMsgNotification',default=True ,help_text="")
receivePrivateMsg = models.BooleanField(db_column='receivePrivateMsg',default=True ,help_text="")
receiveCommentNotification = models.BooleanField(db_column='receiveCommentNotification',default=True ,help_text="")
receiveLikeNotification = models.BooleanField(db_column='receiveLikeNotification',default=True ,help_text="")
receiveFavoriteFollowNotification = models.BooleanField(db_column='receiveFavoriteFollowNotification',default=True ,help_text="")
receiveNewPostNotification = models.BooleanField(db_column='receiveNewPostNotification',default=True ,help_text="")
allowInPopularList = models.BooleanField(db_column='allowInPopularList',default=True ,help_text="")
xmppResponse = models.TextField(db_column='xmppResponse',help_text="")
xmppDatetime = models.DateTimeField(db_column='xmppDatetime', help_text="")
status = models.BooleanField(db_column="status", default=False, help_text="")
deactivatedByAdmin = models.BooleanField(db_column="deactivatedByAdmin", default=False, help_text="")
createdAt = models.DateTimeField(db_column='createdAt', auto_now=True, help_text="")
modifiedAt = models.DateTimeField(db_column='modifiedAt', auto_now=True, help_text="")
updatedBy = models.ForeignKey(User,db_column="updatedBy",help_text="Logged in user updated by ......")
lastPasswordReset = models.DateTimeField(db_column='lastPasswordReset',help_text="")
authorities = models.CharField(db_column="departmentId",max_length=255,help_text="")
class Meta:
managed = False
db_table = 'users'
the query i am using which is producing the desired output but too sloq is:-
universities = Universities.objects.using('cms').all()
for item in universities:
studentcount = Users.objects.using('cms').filter(universityId=item.id,userTypeId=2).count()
facultyCount = Users.objects.using('cms').filter(universityId=item.id,userTypeId=1).count()
clubCount = Users.objects.using('cms').filter(universityId=item.id,userTypeId=3).count()
totalcount = Users.objects.using('cms').filter(universityId=item.id).count()
print studentcount,facultyCount,clubCount,totalcount
print item.name
You should use annotate to get the counts for each university and conditional expressions to get the counts based on conditions (docs)
Universities.objects.using('cms').annotate(
studentcount=Sum(Case(When(users_set__userTypeId=2, then=1), output_field=IntegerField())),
facultyCount =Sum(Case(When(users_set__userTypeId=1, then=1), output_field=IntegerField())),
clubCount=Sum(Case(When(users_set__userTypeId=3, then=1), output_field=IntegerField())),
totalcount=Count('users_set'),
)
First, an obvious optimization. In the loop, you're doing essentially the same query four times: thrice filtering for different userTypeId, and once without one. You can do this in a single COUNT(*) ... GROUP BY userTypeId query.
...
# Here, we're building a dict {userTypeId: count}
# by counting PKs over each userTypeId
qs = Users.objects.using('cms').filter(universityId=item.id)
counts = {
x["userTypeId"]: x["cnt"]
for x in qs.values('userTypeId').annotate(cnt=Count('pk'))
}
student_count = counts.get(2, 0)
faculty_count = counts.get(1, 0)
club_count = count.get(3, 0)
total_count = sum(count.values()) # Assuming there may be other userTypeIds
...
However, you're still doing 1+n queries, where n is number of universities you have in the database. This is fine if the number is low, but if it's high you need further aggregation, joining Universities and Users. A first draft I came with is something like this:
# Assuming University.name is unique, otherwise you'll need to use IDs
# to distinguish between different projects, instead of names.
qs = Users.objects.using('cms').values('userTypeId', 'university__name')\
.annotate(cnt=Count('pk').order_by('university__name')
for name, group in itertools.groupby(qs, lambda x: x["university__name"]):
print("University: %s" % name)
cnts = {g["userTypeId"]: g["cnt"] for g in group}
faculty, student, club = cnts.get(1, 0), cnts.get(2, 0), cnts.get(3, 0)
# NOTE: I'm assuming there are only few (if any) userTypeId values
# other than {1,2,3}.
total = sum(cnts.values())
print(" Student: %d, faculty: %d, club: %d, total: %d" % (
student, faculty, club, total))
I might've made a typo there, but hope it's correct. In terms of SQL, it should emit a query like
SELECT uni.name, usr.userTypeId, COUNT(usr.id)
FROM some_app_universities AS uni
LEFT JOUN some_app_users AS usr ON us.universityId = uni.id
GROUP BY uni.name, usr.userTypeId
ORDER BY uni.name
Consider reading documentation on aggregations and annotations. And be sure to check out raw SQL that Django ORM emits (e.g. use Django Debug Toolbar) and analyze how well it works on your database. For example, use EXPLAIN SELECT if you're using PostgreSQL. Depending on your dataset, you may benefit from some indexes there (e.g. on userTypeId column).
Oh, and on a side note... it's off-topic, but in Python it's a custom to have variables and attributes use lowercase_with_underscores. In Django, model class names are usually singular, e.g. User and University.

Django model inner join ORM issues

suppose I have the following three models, see as follow: I wanna to construct a Django model query to archive the same effect as the following SQL statement.
SQL statement
select B.value, C.special
from B inner join C
where B.version = C.version and B.order = C.order;
I got the following three models:
class Process(models.Model):
name = models.CharField(max_length=30)
description = models.CharField(max_length=150)
class ProcessStep(models.Model):
process = models.ForeignKey(Process)
name = models.CharField(max_length=30)
...
order = models.SmallIntegerField(default=1)
version = models.SmallIntegerField(null=True)
class Approve(models.Model):
process = models.ForeignKey(Process)
content = models.CharField(max_length=300)
...
version = models.SmallIntegerField(null=True)
order = models.SmallIntegerField(default=0)
I want to find all Approves that have the same (version, order) tuple matching against the ProcessStep model.
Something like this?
Approve.objects.filter(process__processstep_set__order=<value>)
Right now I can only come out with a solution by using raw SQL, see the following code.
from django.db import connection
cursor = connection.cursor()
cursor.execute("
SELECT A.id, B.id
from approval_approve as A inner join approval_approvalstep as B
where A.process_id = B.process.id
and A.order = B.order
and A.version = B.version;")
# do something with the returned data
If you guys got any better solution, I would like to hear from you, thanks~

Django query: Joining two models with two fields

I have the following models:
class AcademicRecord(models.Model):
record_id = models.PositiveIntegerField(unique=True, primary_key=True)
subjects = models.ManyToManyField(Subject,through='AcademicRecordSubject')
...
class AcademicRecordSubject(models.Model):
academic_record = models.ForeignKey('AcademicRecord')
subject = models.ForeignKey('Subject')
language_group = IntegerCharField(max_length=2)
...
class SubjectTime(models.Model):
time_id = models.CharField(max_length=128, unique=True, primary_key=True)
subject = models.ForeignKey(Subject)
language_group = IntegerCharField(max_length=2)
...
class Subject(models.Model):
subject_id = models.PositiveIntegerField(unique=True,primary_key=True)
...
The academic records have list of subjects each with a language code and the subject times have a subject and language code.
With a given AcademicRecord, how can I get the subject times that matches with the AcademicRecordSubjects that the AcademicRecord has?
This is my approach, but it makes more queries than needed:
# record is the given AcademicRecord
times = []
for record_subject in record.academicrecordsubject_set.all():
matched_times = SubjectTime.objects.filter(subject=record_subject.subject)
current_times = matched_times.filter(language_group=record_subject.language_group)
times.append(current_times)
I want to make the query using django ORM not with raw SQL
SubjectTime language group has to match with Subject's language group aswell
I got it, in part thanks to #Robert Jørgensgaard Eng
My problem was how to do the inner join using more than 1 field, in which the F object came on handly.
The correct query is:
SubjectTime.objects.filter(subject__academicrecordsubject__academic_record=record,
subject__academicrecordsubject__language_group=F('language_group'))
Given an AcademicRecord instance academic_record, it is either
SubjectTime.objects.filter(subject__academicrecordsubject_set__academic_record=academic_record)
or
SubjectTime.objects.filter(subject__academicrecordsubject__academic_record=academic_record)
The results reflect all the rows of the join that these ORM queries become in SQL. To avoid duplicates, just use distinct().
Now this would be much easier, if I had a django shell to test in :)

Categories

Resources