Erroneous group_by query generated in python django

I am using Django==1.8.7 and I have the following models:
# a model in users.py
class User(models.Model):
    id = models.AutoField(primary_key=True)
    username = models.CharField(max_length=100, blank=True)
    displayname = models.CharField(max_length=100, blank=True)
    # other fields deleted
# a model in healthrepo.py
class Report(models.Model):
    id = models.AutoField(primary_key=True)
    uploaded_by = models.ForeignKey(User, related_name='uploads',
                                    db_index=True)
    owner = models.ForeignKey(User, related_name='reports', db_index=True)
    # other fields like dateofreport, deleted
I use the following Django queryset:
Report.objects.filter(owner__id=1).values('uploaded_by__username',
                                          'uploaded_by__displayname').annotate(
    total=Count('uploaded_by__username')
)
I see that this generates the following query:
SELECT T3."username", T3."displayname", COUNT(T3."username") AS "total" FROM "healthrepo_report"
INNER JOIN "users_user" T3 ON ( "healthrepo_report"."uploaded_by_id" = T3."id" )
WHERE "healthrepo_report"."owner_id" = 1
GROUP BY T3."username", T3."displayname", "healthrepo_report"."dateofreport", "healthrepo_report"."owner_id", "healthrepo_report"."uploaded_by_id"
ORDER BY "healthrepo_report"."dateofreport" DESC, "healthrepo_report"."user_id" ASC, "healthrepo_report"."uploaded_by_id" ASC
However, what I really wanted was to group only by the columns selected in values(), not by the additional "healthrepo_report" fields. That is, what I wanted was:
SELECT T3."username", T3."displayname", COUNT(T3."username") AS "total" FROM "healthrepo_report"
INNER JOIN "users_user" T3 ON ( "healthrepo_report"."uploaded_by_id" = T3."id" )
WHERE "healthrepo_report"."owner_id" = 1
GROUP BY T3."username", T3."displayname"
ORDER BY "healthrepo_report"."dateofreport" DESC, "healthrepo_report"."user_id" ASC, "healthrepo_report"."uploaded_by_id" ASC
I am wondering why this is happening and how I can group only on the selected columns.

I just saw this post:
Django annotate and values(): extra field in 'group by' causes unexpected results
Adding an empty order_by() to the query fixes it, because it clears the default ordering, whose fields Django otherwise adds to the GROUP BY clause:
Report.objects.filter(owner__id=1).values('uploaded_by__username',
                                          'uploaded_by__displayname').annotate(
    total=Count('uploaded_by__username')
).order_by()
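To see why the extra GROUP BY columns matter, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Django. The table layout is a simplified, hypothetical stand-in for the models above, and the sample rows are invented for illustration. It compares grouping by the username alone with grouping by the username plus an ordering column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users_user (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE healthrepo_report (
        id INTEGER PRIMARY KEY,
        uploaded_by_id INTEGER REFERENCES users_user(id),
        owner_id INTEGER,
        dateofreport TEXT
    );
    INSERT INTO users_user VALUES (1, 'alice'), (2, 'bob');
    -- three reports owned by user 1, all uploaded by bob on different dates
    INSERT INTO healthrepo_report (uploaded_by_id, owner_id, dateofreport) VALUES
        (2, 1, '2016-01-01'),
        (2, 1, '2016-01-02'),
        (2, 1, '2016-01-03');
""")

base = """
    SELECT u.username, COUNT(u.username) AS total
    FROM healthrepo_report r
    INNER JOIN users_user u ON r.uploaded_by_id = u.id
    WHERE r.owner_id = 1
    GROUP BY {group_by}
"""

# Grouping also by an ordering column splits the count into one row per date.
with_ordering = conn.execute(
    base.format(group_by="u.username, r.dateofreport")
).fetchall()

# Grouping only by the selected column gives one row per uploader.
without_ordering = conn.execute(base.format(group_by="u.username")).fetchall()

print(with_ordering)
print(without_ordering)
```

With the extra column every report lands in its own group, so each count is 1; without it the three reports collapse into a single row for the uploader.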

Related

Django use LEFT JOIN instead of INNER JOIN

I have two models: Comments and CommentFlags
class Comments(models.Model):
    content_type = models.ForeignKey(ContentType,
                                     verbose_name=_('content type'),
                                     related_name="content_type_set_for_%(class)s",
                                     on_delete=models.CASCADE)
    object_pk = models.CharField(_('object ID'), db_index=True, max_length=64)
    content_object = GenericForeignKey(ct_field="content_type", fk_field="object_pk")
    submit_date = models.DateTimeField(_('date/time submitted'), default=None, db_index=True)
    ...
    ...
class CommentFlags(models.Model):
    user = models.ForeignKey(settings.AUTH_USER_MODEL, related_name="comment_flags",
                             on_delete=models.CASCADE)
    comment = models.ForeignKey(Comment, related_name="flags", on_delete=models.CASCADE)
    flag = models.CharField(max_length=30, db_index=True)
    ...
    ...
The CommentFlags flag field can hold values such as like, dislike, etc.
Problem statement: I want to get all Comments sorted by number of likes, in descending order.
Raw query for the above problem statement:
SELECT
    cmnts.*, coalesce(cmnt_flgs.num_like, 0) AS num_like
FROM
    comments cmnts
LEFT JOIN
    (
        SELECT
            comment_id, Count(comment_id) AS num_like
        FROM
            comment_flags
        WHERE
            flag='like'
        GROUP BY comment_id
    ) cmnt_flgs
ON
    cmnt_flgs.comment_id = cmnts.id
ORDER BY
    num_like DESC
I have not been able to convert the above query into a Django ORM queryset.
What I have tried so far:
>>> qs = (Comment.objects.filter(flags__flag='like').values('flags__comment_id')
...       .annotate(num_likes=Count('flags__comment_id')))
This generates a different query:
>>> print(qs.query)
SELECT "comment_flags"."comment_id",
       COUNT("comment_flags"."comment_id") AS "num_likes"
FROM "comments"
INNER JOIN "comment_flags"
    ON ("comments"."id" = "comment_flags"."comment_id")
WHERE "comment_flags"."flag" = 'like'
GROUP BY "comment_flags"."comment_id",
         "comments"."submit_date"
ORDER BY "comments"."submit_date" ASC
LIMIT 21
The problem with the above ORM queryset is that it uses an INNER JOIN, and I also don't understand why it adds submit_date to the GROUP BY clause.
Can you please suggest a way to convert the raw query above into a Django ORM queryset?
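You can try using the filter argument of Count (available since Django 2.0):
from django.db.models import Count, Q
qs = (Comment.objects.all()
      .annotate(num_likes=Count('flags__comment_id', filter=Q(flags__flag='like')))
      .order_by('-num_likes'))
It may produce a slightly different query than the one you're expecting, depending on the database backend, but it should behave equivalently: comments without any likes are kept, with num_likes equal to 0.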
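As a rough check that a conditional count over a LEFT JOIN matches the raw subquery from the question, here is a sketch that runs both forms with Python's built-in sqlite3 module. The tables are simplified, hypothetical versions of the ones above and the rows are invented. The CASE WHEN form is only an approximation of what the ORM may emit, since the exact SQL depends on the backend.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE comment_flags (
        id INTEGER PRIMARY KEY,
        comment_id INTEGER REFERENCES comments(id),
        flag TEXT
    );
    INSERT INTO comments VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO comment_flags (comment_id, flag) VALUES
        (1, 'like'), (2, 'like'), (2, 'like'), (2, 'dislike'), (3, 'dislike');
""")

# Form 1: the raw query from the question (LEFT JOIN on a grouped subquery).
subquery_form = conn.execute("""
    SELECT c.id, COALESCE(f.num_like, 0) AS num_like
    FROM comments c
    LEFT JOIN (
        SELECT comment_id, COUNT(comment_id) AS num_like
        FROM comment_flags
        WHERE flag = 'like'
        GROUP BY comment_id
    ) f ON f.comment_id = c.id
    ORDER BY num_like DESC, c.id
""").fetchall()

# Form 2: a conditional count; COUNT ignores the NULLs produced by the CASE.
conditional_form = conn.execute("""
    SELECT c.id,
           COUNT(CASE WHEN f.flag = 'like' THEN f.comment_id END) AS num_like
    FROM comments c
    LEFT JOIN comment_flags f ON f.comment_id = c.id
    GROUP BY c.id
    ORDER BY num_like DESC, c.id
""").fetchall()

print(subquery_form)
print(conditional_form)
```

Both forms keep the comment that has no likes, which is the behaviour the INNER JOIN attempt in the question loses.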

rawsql equivalent django queryset

I would like to write a Django queryset that is equivalent to the query below and hits the database only once. Right now I am using manager.raw() to execute it.
With annotate, I can generate the inner query, but I can't use it in the filter condition (when I checked queryset.query, it looked like Ex1).
select *
from table1
where (company_id, year) in (select company_id, max(year) year
                             from table1
                             where company_id=3
                             and total_employees is not null
                             group by company_id);
Ex1:
SELECT `table1`.`company_id`, `table1`.`total_employees`
FROM `table1`
WHERE `table1`.`id` = (SELECT U0.`company_id` AS Col1, MAX(U0.`year`) AS `year`
                       FROM `table1` U0
                       WHERE NOT (U0.`total_employees` IS NULL)
                       GROUP BY U0.`company_id`
                       ORDER BY NULL)
Model:
class Table1(models.Model):
    year = models.IntegerField(null=False, validators=[validate_not_null])
    total_employees = models.FloatField(null=True, blank=True)
    company = models.ForeignKey('Company', on_delete=models.CASCADE, related_name='dummy_relation')
    last_modified = models.DateTimeField(auto_now=True)
    updated_by = models.CharField(max_length=100, null=False, default="research")
    class Meta:
        unique_together = ('company', 'year',)
I appreciate your response.
You can use OuterRef and Subquery to achieve it. Try something like this:
from django.db.models import OuterRef, Subquery
newest = Table1.objects.filter(company=OuterRef('pk'), total_employees__isnull=False).order_by('-year')
companies = Company.objects.annotate(total_employees=Subquery(newest.values('total_employees')[:1])).annotate(max_year=Subquery(newest.values('year')[:1]))
# Querysets are lazy: nothing is executed until companies is evaluated, so the DB gets hit once
Show values:
# all values
companies.values('id', 'total_employees', 'max_year')
# company three values
company_three_values = companies.filter(id=3).values('id', 'total_employees', 'max_year')
Filter on Max Year:
companies_max = companies.filter(max_year__gte=2018)
FYI: OuterRef and Subquery are available in Django from version 1.11 onwards.
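The idea behind OuterRef is a correlated subquery, where the inner query refers to the row of the outer query. The sketch below shows that idea in plain SQL through Python's built-in sqlite3 module. The table is a simplified, hypothetical version of Table1 with invented rows, and the SQL is hand-written for illustration, not the ORM's actual output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (
        id INTEGER PRIMARY KEY,
        company_id INTEGER,
        year INTEGER,
        total_employees REAL,
        UNIQUE (company_id, year)
    );
    INSERT INTO table1 (company_id, year, total_employees) VALUES
        (3, 2016, 100.0),
        (3, 2017, 120.0),
        (3, 2018, NULL),   -- newest year, but no employee figure
        (4, 2018, 50.0);
""")

# For company 3, pick the row whose year is the latest one that has a
# non-null total_employees. The inner query is correlated with the outer
# row through u.company_id = t.company_id.
latest_row = conn.execute("""
    SELECT t.company_id, t.year, t.total_employees
    FROM table1 t
    WHERE t.company_id = 3
      AND t.year = (
          SELECT MAX(u.year)
          FROM table1 u
          WHERE u.company_id = t.company_id
            AND u.total_employees IS NOT NULL
      )
""").fetchall()

print(latest_row)
```

The 2018 row is skipped because its total_employees is NULL, so the 2017 row is returned in a single query.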
If your model is named Table1, try this:
Table1.objects.filter(company_id=3, total_employees__isnull=False).order_by('-year').first()
This is one hit on the database: first() returns the newest matching row, or None if nothing matches. (latest('year') returns a model instance rather than a queryset, so it cannot be chained with first(), and it raises DoesNotExist when nothing matches.)
So it is better to check the result before using it:
latest_item = Table1.objects.filter(company_id=3, total_employees__isnull=False).order_by('-year').first()
if latest_item is not None:
    return latest_item

Django - ForeignKey - how to use it correctly?

I have following DB model:
class Table1( models.Model ):
    sctg = models.CharField(max_length=100, verbose_name="Sctg")
    emailAddress = models.CharField(max_length=100, verbose_name="Email Address", default='')
    def __unicode__(self):
        return str( self.sctg )
class Table2( models.Model ):
    sctg = models.ForeignKey( Table1 )
    street = models.CharField(max_length=100, verbose_name="Street")
    zipCode = models.CharField(max_length=100, verbose_name="Zip Code")
    def __unicode__(self):
        return str( self.sctg )
and I would like to execute a select query.
This is what I did:
sctg = Table1.objects.get( sctg = self.sctg )
data = Table2.objects.get( sctg = sctg )
It works, but now I am executing 2 queries. Is there a way to do this in only one? In raw SQL I'd do a JOIN query, but I have no idea how to do this with Django models.
You can use two consecutive underscores to look "through" a ForeignKey reference. So your query is equivalent to:
Table2.objects.get(sctg__sctg=self.sctg)
The first sctg (before the double underscore) looks through the ForeignKey, whereas the second sctg corresponds to the CharField column of Table1.
Note that:
1. it is possible that there is no such Table2 element, or that there are multiple. In both cases this will result in an error (DoesNotExist or MultipleObjectsReturned, respectively). In case you want to retrieve all (possibly none), you can use .filter(..) instead of .get(..);
2. here self.sctg should be a string (or something string-like), since the sctg of Table1 is a CharField.
The above will result in a query along these lines:
SELECT t2.*
FROM table2 AS t2
INNER JOIN table1 AS t1 ON t2.sctg_id = t1.id
WHERE t1.sctg = 'mysctg'
where 'mysctg' is the value stored in your self.sctg.
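For illustration, the single JOIN query can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the table and column names are hypothetical simplifications (Django would prefix the table names with the app label), the foreign key column is assumed to be sctg_id as Django names it by default, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, sctg TEXT, emailAddress TEXT);
    CREATE TABLE table2 (
        id INTEGER PRIMARY KEY,
        sctg_id INTEGER REFERENCES table1(id),
        street TEXT,
        zipCode TEXT
    );
    INSERT INTO table1 VALUES
        (1, 'mysctg', 'a@example.com'),
        (2, 'other', 'b@example.com');
    INSERT INTO table2 VALUES
        (10, 1, 'Main St', '12345'),
        (11, 2, 'Side St', '54321');
""")

# One round trip: filter table2 rows by a column of the related table1 row.
rows = conn.execute("""
    SELECT t2.id, t2.street, t2.zipCode
    FROM table2 AS t2
    INNER JOIN table1 AS t1 ON t2.sctg_id = t1.id
    WHERE t1.sctg = ?
""", ("mysctg",)).fetchall()

print(rows)
```

Only the table2 row whose related table1 row has sctg equal to 'mysctg' comes back, without a separate lookup on table1 first.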

django 1.11.x -> 2.x migration. I got the incorrect 'group by' fields

I just upgraded the Django version. Neither the models nor the code in my app have been modified.
But I now get different QuerySet results.
The fields listed in the GROUP BY clause are different when I print the query.
Model:
class Content(models.Model):
    id_field = models.AutoField(db_column='id_', primary_key=True)
    ...
    collections = models.ManyToManyField('Collection',
                                         through='CollectionMap',
                                         through_fields=('contentid', 'collectionid'))
class CollectionMap(models.Model):
    field_index = models.AutoField(db_column='_index', primary_key=True)
    collectionid = models.ForeignKey(Collection, on_delete=models.CASCADE, db_column='collectionid')
    contentid = models.ForeignKey(Content, on_delete=models.CASCADE, db_column='contentid')
    field_time = models.DateTimeField(db_column='_time')
class Collection(models.Model):
    field_index = models.AutoField(db_column='_index', primary_key=True)
    name = models.CharField(unique=True, max_length=50)
    collectionorder = models.IntegerField(db_column='collectionOrder', blank=True, null=True)
    active = models.IntegerField(blank=True, null=True)
    field_time = models.DateTimeField(db_column='_time')
Code:
Content.objects\
    .annotate(count=Count('id_field'))\
    .values('id_field', 'collections__name')
Query:
1.11.X
SELECT
    Content.id_, Collection.name
FROM Content
LEFT OUTER JOIN Collection_Map ON (Content.id_ = Collection_Map.contentid)
LEFT OUTER JOIN Collection ON (Collection_Map.collectionid = Collection._index)
GROUP BY Content.id_
ORDER BY Content.itemOrder DESC
LIMIT 50;
2.X
SELECT
    Content.id_, Collection.name
FROM Content
LEFT OUTER JOIN Collection_Map ON (Content.id_ = Collection_Map.contentid)
LEFT OUTER JOIN Collection ON (Collection_Map.collectionid = Collection._index)
GROUP BY Content.id_, Collection.name
ORDER BY Content.itemOrder DESC
LIMIT 50;
GROUP BY fields:
Version 1.11.X : GROUP BY Content.id_
Version 2.X : GROUP BY Content.id_, Collection.name
I want this: '... GROUP BY Content.id_ ...'
What should I do?

Django query api: complex subquery

I wasted lots of time trying to compose such a query. Here are my models:
class User(Dealer):
    pass
class Post(models.Model):
    text = models.CharField(max_length=500, default='')
    date = models.DateTimeField(default=timezone.now)
    interactions = models.ManyToManyField(User, through='UserPostInteraction', related_name='post_interaction')
class UserPostInteraction(models.Model):
    post = models.ForeignKey(Post, related_name='pppost')
    user = models.ForeignKey(User, related_name='uuuuser')
    status = models.SmallIntegerField()
    DISCARD = -1
    VIEWED = 0
    LIKED = 1
    DISLIKED = 2
And what I need:
The subquery is: (number of UserPostInteractions with status = LIKED) - (number of UserPostInteractions with status = DISLIKED) for the Post given by OuterRef('pk').
The query is: select all posts ordered by the value of the subquery.
I'm stuck at the error "Subquery returned multiple rows".
Help!
If I understand your needs correctly, you can get what you need with a queryset like this:
from django.db.models import Case, Sum, When, IntegerField
posts = Post.objects.values('id', 'text', 'date').annotate(
    rate=Sum(Case(
        When(pppost__status=1, then=1),
        When(pppost__status=2, then=-1),
        default=0,
        output_field=IntegerField()
    ))
).order_by('rate')
On MySQL this is converted into the following SQL query:
SELECT
`yourapp_post`.`id`,
`yourapp_post`.`text`,
`yourapp_post`.`date`,
SUM(
CASE
WHEN `yourapp_userpostinteraction`.`status` = 1
THEN 1
WHEN `yourapp_userpostinteraction`.`status` = 2
THEN -1
ELSE 0
END) AS `rate`
FROM `yourapp_post`
LEFT OUTER JOIN `yourapp_userpostinteraction` ON (`yourapp_post`.`id` = `yourapp_userpostinteraction`.`post_id`)
GROUP BY `yourapp_post`.`id`
ORDER BY `rate` ASC
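To check how the SUM(CASE ...) rating behaves, including for posts without any interactions, here is a small sketch using Python's built-in sqlite3 module. The table names are shortened, hypothetical versions of the ones in the generated SQL, and the rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, text TEXT);
    CREATE TABLE userpostinteraction (
        id INTEGER PRIMARY KEY,
        post_id INTEGER REFERENCES post(id),
        user_id INTEGER,
        status INTEGER      -- -1 discard, 0 viewed, 1 liked, 2 disliked
    );
    INSERT INTO post VALUES (1, 'a'), (2, 'b'), (3, 'c');
    INSERT INTO userpostinteraction (post_id, user_id, status) VALUES
        (1, 1, 1), (1, 2, 1), (1, 3, 2),   -- post 1: two likes, one dislike
        (2, 1, 2), (2, 2, 0);              -- post 2: one dislike, one view
                                           -- post 3: no interactions at all
""")

# Likes count +1, dislikes count -1, everything else counts 0.
rates = conn.execute("""
    SELECT p.id,
           SUM(CASE WHEN i.status = 1 THEN 1
                    WHEN i.status = 2 THEN -1
                    ELSE 0 END) AS rate
    FROM post p
    LEFT OUTER JOIN userpostinteraction i ON p.id = i.post_id
    GROUP BY p.id
    ORDER BY rate ASC, p.id
""").fetchall()

print(rates)
```

Because of the LEFT OUTER JOIN, the post with no interactions still appears; its single joined row has a NULL status, which falls into the ELSE branch and yields a rate of 0.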
