Sub-query to make use of different distinct & orderby

I need to apply different order_by and distinct values in the same query, and I have attempted this with a subquery.
How can I achieve this?
Could one queryset select the Products I want, and a separate query then select the 15 Variations whose prices should be displayed?
In other words: a queryset randomly selects Product IDs, and Python then asks for a queryset of just those 15 items.
Speeding up the query is also important: it takes ~800 ms when I order_by the pk, or 5.8 seconds when I use order_by('?').
My attempt:
distinct_qs = (
    Product.objects
    .distinct('id')
)
qset = (
    Product.objects
    .filter(pk__in=distinct_qs)
    .order_by('rating', '?')
    .values('name', 'image',)
    .annotate(
        price=F('variation__price__price'),
        id=F('pk'),
        vari=F('variation'),
    )[:15]
)
Sample of output data:
{"name":"Test Item","vari":10, "id":1, "price":"80", "image":"xyz.com/1.jpg"},
{"name":"Test Item","vari":11, "id":1, "price":"80", "image":"xyz.com/1.jpg"},
{"name":"Another one","vari":14, "id":2, "price":"10", "image":"xyz.com/2.jpg"},
{"name":"Another one","vari":15, "id":2, "price":"10", "image":"xyz.com/2.jpg"},
{"name":"And Again","vari":17, "id":3, "price":"12", "image":"xyz.com/3.jpg"},
{"name":"And Again","vari":18, "id":3, "price":"12", "image":"xyz.com/3.jpg"},
Desired output data:
{"name":"Test Item","vari":13, "id":1, "price":"80", "image":"xyz.com/1.jpg"},
{"name":"Another one","vari":14, "id":2, "price":"10", "image":"xyz.com/2.jpg"},
{"name":"And Again","vari":17, "id":3, "price":"12", "image":"xyz.com/3.jpg"},
Sample of models.py:
class Product(models.Model):
    name = models.CharField("Name", max_length=400)
    ...
class Variation(models.Model):
    product = models.ForeignKey(Product, db_index=True, blank=False, null=False)
    ...
class Image(models.Model):
    variation = models.ForeignKey(Variation, blank=False, null=False)
    image = models.URLField(max_length=540, blank=True, null=True)
class Price(models.Model):
    price = models.DecimalField("Price", decimal_places=2, max_digits=10)
    variation = models.ForeignKey(Variation, blank=False, null=False)

I think you should write a custom model manager (see https://docs.djangoproject.com/en/1.9/topics/db/managers/) and add a method to it that returns the variations, instead of using a standard query.
For randomising, you could do this:
Select the last id of Variation (or Product), generate 15 distinct random ids within that range, and then pull just the objects with those ids from the database. I think that should be faster.
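
The random-id idea can be sketched without Django. This is only an illustration: pick_random_ids is a hypothetical helper, and it assumes integer primary keys that start at 1 and are mostly contiguous.

```python
import random


def pick_random_ids(max_id, count=15, rng=random):
    """Return up to `count` distinct ids drawn uniformly from 1..max_id."""
    if max_id < 1:
        return []
    # random.sample raises ValueError when asked for more items than exist.
    count = min(count, max_id)
    return rng.sample(range(1, max_id + 1), count)


ids = pick_random_ids(max_id=1000, count=15)

# Assumed Django usage (not executed here):
#   max_id = Product.objects.order_by('-pk').values_list('pk', flat=True).first()
#   products = Product.objects.filter(pk__in=pick_random_ids(max_id))
```

Deleted rows leave gaps in the id range, so the final filter may return fewer than 15 objects; drawing a few extra ids and slicing the result is one way to compensate.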

Related

Django use LEFT JOIN instead of INNER JOIN

I have two models: Comments and CommentFlags
class Comments(models.Model):
    content_type = models.ForeignKey(ContentType,
                                     verbose_name=_('content type'),
                                     related_name="content_type_set_for_%(class)s",
                                     on_delete=models.CASCADE)
    object_pk = models.CharField(_('object ID'), db_index=True, max_length=64)
    content_object = GenericForeignKey(ct_field="content_type", fk_field="object_pk")
    submit_date = models.DateTimeField(_('date/time submitted'), default=None, db_index=True)
    ...
    ...
class CommentFlags(models.Model):
    user = models.ForeignKey(settings.AUTH_USER_MODEL, related_name="comment_flags",
                             on_delete=models.CASCADE)
    comment = models.ForeignKey(Comment, related_name="flags", on_delete=models.CASCADE)
    flag = models.CharField(max_length=30, db_index=True)
    ...
    ...
The CommentFlags flag field can have values such as like, dislike, etc.
Problem statement: I want to get all Comments sorted by number of likes in descending order.
Raw query for the above problem statement:
SELECT
    cmnts.*, coalesce(cmnt_flgs.num_like, 0) AS num_like
FROM
    comments cmnts
LEFT JOIN
    (
        SELECT
            comment_id, Count(comment_id) AS num_like
        FROM
            comment_flags
        WHERE
            flag='like'
        GROUP BY comment_id
    ) cmnt_flgs
ON
    cmnt_flgs.comment_id = cmnts.id
ORDER BY
    num_like DESC
I have not been able to convert the above query to a Django ORM queryset.
What I have tried so far:
>>> qs = (Comment.objects.filter(flags__flag='like').values('flags__comment_id')
...       .annotate(num_likes=Count('flags__comment_id')))
which generates a different query:
>>> print(qs.query)
SELECT "comment_flags"."comment_id",
       COUNT("comment_flags"."comment_id") AS "num_likes"
FROM "comments"
INNER JOIN "comment_flags"
    ON ("comments"."id" = "comment_flags"."comment_id")
WHERE "comment_flags"."flag" = 'like'
GROUP BY "comment_flags"."comment_id",
         "comments"."submit_date"
ORDER BY "comments"."submit_date" ASC
LIMIT 21
The problem with the above queryset is that it uses an INNER JOIN, and I also don't understand why it adds submit_date to the GROUP BY clause.
Can you please suggest a way to convert the raw query above into a Django ORM queryset?

You can try using the filter argument of Count:
qs = (Comment.objects.all()
      .annotate(num_likes=Count('flags__comment_id', filter=Q(flags__flag='like'))))
It may produce a slightly different query from the one you're expecting, depending on the database backend, but it should behave equivalently.
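
To see why a differently shaped query can still be equivalent, here is a small check using Python's built-in sqlite3 module. The simplified table layout and sample rows are assumptions for illustration, and the second statement is only roughly the shape of SQL a conditional count can compile to, not output captured from Django.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE comment_flags (
        id INTEGER PRIMARY KEY,
        comment_id INTEGER REFERENCES comments(id),
        flag TEXT
    );
    INSERT INTO comments (id, body) VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
    INSERT INTO comment_flags (comment_id, flag) VALUES
        (1, 'like'), (1, 'dislike'),
        (2, 'like'), (2, 'like'),
        (3, 'dislike');
""")

# Shape of the raw query from the question: LEFT JOIN onto a grouped subquery.
subquery_rows = conn.execute("""
    SELECT c.id, COALESCE(f.n, 0) AS num_like
    FROM comments c
    LEFT JOIN (
        SELECT comment_id, COUNT(comment_id) AS n
        FROM comment_flags
        WHERE flag = 'like'
        GROUP BY comment_id
    ) f ON f.comment_id = c.id
    ORDER BY num_like DESC, c.id
""").fetchall()

# Conditional count over a plain LEFT JOIN: COUNT ignores the NULLs that
# the CASE expression yields for rows that are not likes.
conditional_rows = conn.execute("""
    SELECT c.id,
           COUNT(CASE WHEN f.flag = 'like' THEN f.comment_id END) AS num_like
    FROM comments c
    LEFT JOIN comment_flags f ON f.comment_id = c.id
    GROUP BY c.id
    ORDER BY num_like DESC, c.id
""").fetchall()
```

Both statements return every comment, including those with no likes. For the descending sort the question asks for, chaining .order_by('-num_likes') onto the annotated queryset should be enough.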

Django filter() by ForeignKey returns incorrect queryset

I have two models. Product is connected to ProductGroup by a ForeignKey, and Product also has a ForeignKey field named shop:
class ProductGroup(models.Model):
    name = models.CharField(max_length=64)
    vat_rate = models.ForeignKey(VatRate, verbose_name="VAT in percent", related_name='product_group_vat_rates',
                                 on_delete=models.CASCADE)
class Product(models.Model):
    product_id = models.CharField(max_length=128, blank=True, null=True)
    name = models.CharField(max_length=128)
    shop = models.ForeignKey(Shop, on_delete=models.CASCADE, related_name="product_shop")
    product_group = models.ForeignKey(ProductGroup, on_delete=models.CASCADE, related_name="product")
    price = models.DecimalField(max_digits=7, decimal_places=2, blank=True, null=True, default=0)
    cost_price = models.DecimalField(max_digits=7, decimal_places=2, blank=True, null=True, default=0)
    stock_amount = models.IntegerField(default=0, blank=True, null=True,
                                       help_text=_('Product amount in stock'))
    barcode = models.CharField(max_length=64, blank=True)
    is_active = models.BooleanField(default=True)
Let's assume I have two Product instances connected to a particular shop, and one ProductGroup connected to both of these Product instances.
Now I want to get all ProductGroup instances that relate to a particular shop instance.
What I do:
product_group_list = ProductGroup.objects.filter(product__shop=shop_inst)
What I expect to get:
<QuerySet [<ProductGroup: Product Group test product group>]>
But, unfortunately, I get this:
<QuerySet [<ProductGroup: Product Group test product group>, <ProductGroup: Product Group test product group>]>
So it returns a queryset containing the same ProductGroup twice, once for each connected Product instance.
How can I improve my query to get only ONE ProductGroup instance connected to this shop? (There is only one ProductGroup record in the DB.)

It returns the same ProductGroup multiple times because of the JOIN; you can use .distinct():
product_group_list = ProductGroup.objects.filter(product__shop=shop_inst).distinct()

Use distinct() on your query to eliminate duplicates. The Django documentation describes it as follows:
distinct(*fields)
Returns a new QuerySet that uses SELECT DISTINCT in its SQL query. This eliminates duplicate rows from the query results.
By default, a QuerySet will not eliminate duplicate rows. In practice, this is rarely a problem, because simple queries such as Blog.objects.all() don’t introduce the possibility of duplicate result rows. However, if your query spans multiple tables, it’s possible to get duplicate results when a QuerySet is evaluated. That’s when you’d use distinct().
product_group_list = ProductGroup.objects.filter(product__shop=shop_inst).distinct()
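
The duplication comes from the join itself, which a small sqlite3 session can reproduce. The table and column names below are simplified assumptions, not the tables Django would generate for these models.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_group (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product (
        id INTEGER PRIMARY KEY,
        name TEXT,
        shop_id INTEGER,
        product_group_id INTEGER REFERENCES product_group(id)
    );
    INSERT INTO product_group (id, name) VALUES (1, 'test product group');
    INSERT INTO product (name, shop_id, product_group_id) VALUES
        ('first', 7, 1), ('second', 7, 1);
""")

join_sql = """
    SELECT {distinct} g.id, g.name
    FROM product_group g
    INNER JOIN product p ON p.product_group_id = g.id
    WHERE p.shop_id = ?
"""

# One output row per matching product, so the single group appears twice.
plain = conn.execute(join_sql.format(distinct=""), (7,)).fetchall()

# SELECT DISTINCT collapses the identical rows into one.
deduped = conn.execute(join_sql.format(distinct="DISTINCT"), (7,)).fetchall()
```

Here plain holds the same group twice, once per product in the shop, while deduped holds it once.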

Calculate sum of items in a date_range queryset if matched by foreign_key_id and skip the rest

I have about 100k entries per day, which I expose through an API (with a limit and an offset by default). I want to aggregate the values in my queryset when entries share a common owner_id, and leave the rest as they are when an owner has no other entries within the date range.
This is what I am doing now, but it doesn't look correct (some data is calculated correctly, but some values are inflated for some reason when they should not have been):
date_delta = date_to - date_from  # <- integer
TrendData.objects.filter(owner__trend_type__mnemonic='posts').filter(
    date_trend__date__range=[date_from, date_to]).values('owner__name').annotate(
    views=(Sum('views') / date_delta),
    views_u=(Sum('views_u') / date_delta),
    likes=(Sum('likes') / date_delta),
    shares=(Sum('shares') / date_delta),
    interaction_rate=(
        Sum('interaction_rate') / date_delta),
)
My models are:
class Owner(models.Model):
    class Meta:
        verbose_name_plural = 'objects'
    TREND_OWNERS = Choices('group', 'user')
    link = models.CharField(max_length=255)
    name = models.CharField(max_length=255)
    owner_type = models.CharField(choices=TREND_OWNERS, max_length=50)
    trend_type = models.ForeignKey(TrendType, on_delete=models.CASCADE)
    def __str__(self):
        return f'{self.link}[{self.trend_type}]'
class TrendData(models.Model):
    class Meta:
        verbose_name_plural = 'Trends'
    owner = models.ForeignKey(Owner, on_delete=models.CASCADE)
    views = models.IntegerField()
    views_u = models.IntegerField()
    likes = models.IntegerField()
    shares = models.IntegerField()
    interaction_rate = models.DecimalField(max_digits=20, decimal_places=10)
    mean_age = models.IntegerField()
    source = models.ForeignKey(TrendSource, on_delete=models.CASCADE)
    date_trend = models.DateTimeField()
The Source parent model doesn't really matter here: it is the CSV file the data was loaded from, so we never reference it.
What I want to know: is it possible to calculate the sum of views, views_u, likes, shares and interaction_rate only when the same owner appears on more than one day in the range (say 01.01.19 to 10.01.2019)? If an owner has entries on several days, sum them; otherwise skip the summing and leave that entry as it is, rather than summing ALL the values in the queryset.
I can do it in Python, but I think it should be possible with the Django ORM.

The Django ORM provides conditional expressions for this kind of condition-based annotation. You can use Case to annotate the Sum based on the condition you mentioned.
TrendData.objects.filter(owner__trend_type__mnemonic='posts').annotate(
    views=Sum(
        Case(
            When("Your condition here", then=F('views')),
            default=0,
            output_field=IntegerField(),
        )
    )
    ...
)
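
At the SQL level, a Sum over Case/When corresponds to SUM(CASE WHEN ... THEN ... ELSE 0 END). The sqlite3 sketch below shows that shape on a toy table; the columns are trimmed down and the condition likes > 0 is only a stand-in for whatever "Your condition here" turns out to be.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trend_data (owner_id INTEGER, views INTEGER, likes INTEGER);
    INSERT INTO trend_data (owner_id, views, likes) VALUES
        (1, 100, 5), (1, 50, 0),
        (2, 30, 0);
""")

# Rows that fail the condition contribute the default (0) to the sum
# instead of being filtered out of the result.
rows = conn.execute("""
    SELECT owner_id,
           SUM(CASE WHEN likes > 0 THEN views ELSE 0 END) AS views
    FROM trend_data
    GROUP BY owner_id
    ORDER BY owner_id
""").fetchall()
```

Owner 2 still appears in rows with a sum of 0, which is the difference between a conditional aggregate and a plain filter().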

New to python django, how to auto populate a field?

I have two fields in my models.py: one has a multiple-choice drop-down and one is empty. If the user selects "Gas" from the type menu, I would like the amount field to be auto-populated with distance * 2.
Can I do that?
CHOICE = (
    ('Meal', 'Meal'),
    ('Gas', 'Gas'),
)
type = models.CharField(max_length=10, choices=CHOICE)
distance = models.CharField(max_length=100)
amount = models.CharField(max_length=100)
Thanks in advance.

You can use the django-observer app for this. Although there are cleaner JavaScript approaches, this keeps the automation entirely within Django.
First, modify the amount field as:
amount = models.CharField(max_length=100, blank=True, null=True)
since it won't have a value when the model object is initially saved to the database. The rest of the code will then look something like:
from observer.decorators import watch
def compute_amount(sender, obj, attr):
    if obj.type == 'Gas':
        # distance is a CharField, so convert it before multiplying
        obj.amount = str(float(obj.distance) * 2)
        obj.save()
@watch('type', compute_amount, call_on_created=True)
class FuelConsumption(models.Model):
    CHOICE = (
        ('Meal', 'Meal'),
        ('Gas', 'Gas'),
    )
    type = models.CharField(max_length=10, choices=CHOICE)
    distance = models.CharField(max_length=100)
    amount = models.CharField(max_length=100, blank=True, null=True)
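
One detail worth checking independently of any framework: distance and amount are CharFields, so their values arrive as strings, and multiplying a string by 2 repeats it rather than doubling a number. The sketch below uses a hypothetical helper, gas_amount, to do the conversion explicitly; it is not part of django-observer.

```python
from decimal import Decimal, InvalidOperation


def gas_amount(expense_type, distance):
    """Return distance * 2 as a string for 'Gas' rows, otherwise None."""
    if expense_type != 'Gas':
        return None
    try:
        return str(Decimal(distance) * 2)
    except InvalidOperation:
        # Non-numeric text in the CharField: leave the amount unset.
        return None


repeated = '10' * 2              # string repetition, not arithmetic
doubled = gas_amount('Gas', '10')
```

A helper like this could be called from the watcher callback or from an overridden save(); storing distance and amount in numeric fields would avoid the conversion altogether.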

Django filter with annotate

class Review(models.Model):
    slug = models.SlugField(max_length=255, unique=True)
    vendor = models.ForeignKey(Vendor)
    user = models.ForeignKey(User, blank=True, null=True)
    product = models.ForeignKey(Product, blank=True, null=True)
    images = models.ManyToManyField(ReviewImage, blank=True, null=True)
    headline = models.CharField(max_length=100)
    review = models.TextField(blank=True, null=True)
    rating = models.IntegerField()
    active = models.BooleanField(default=1)
    created = models.DateTimeField(auto_now_add=True)
    changed = models.DateTimeField(auto_now=True)
# This is the problem... It works, but no vendor is shown if there is no review.
vendor_list = (Vendor.objects.filter(category=category,
                                     review__product__isnull=True,
                                     active=True)
               .annotate(rating_avg=Avg('review__rating')))
How can I do this with review__product__isnull=True? If there is no review at all, I still want the vendor, but its rating should be 0. What should I do?

Let's see if I understand this. You are trying to list all active vendors in the category, annotated with the average rating of their reviews. The way you determine that a review is a vendor review rather than a product review is that the product field is null. And you want the average rating of vendors with no reviews to be zero.
In SQL your query requires an OUTER JOIN:
SELECT vendor.id, COALESCE(AVG(review.rating), 0.0) AS rating
FROM myapp_vendor AS vendor
LEFT OUTER JOIN myapp_review AS review
    ON review.vendor_id = vendor.id
    AND review.product_id IS NULL
WHERE vendor.category = %s
    AND vendor.active
GROUP BY vendor.id
Sometimes in Django the simplest solution is a raw SQL query: as the developers say, the database API is "a shortcut but not necessarily an end-all-be-all." So that would look like this:
for v in Vendor.objects.raw('SELECT ... ', [category]):  # query as above
    print('Vendor {0} has rating {1}'.format(v.name, v.rating))
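
Placing the product condition in the ON clause rather than in WHERE is what keeps every vendor in the result. The sqlite3 sketch below compares the two placements on a toy schema; the table names, columns and sample rows are assumptions, and the category and active filters are left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE vendor (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE review (
        id INTEGER PRIMARY KEY,
        vendor_id INTEGER REFERENCES vendor(id),
        product_id INTEGER,
        rating INTEGER
    );
    INSERT INTO vendor (id, name) VALUES
        (1, 'reviewed'), (2, 'product reviews only'), (3, 'no reviews');
    INSERT INTO review (vendor_id, product_id, rating) VALUES
        (1, NULL, 4), (1, NULL, 2), (1, 10, 5),
        (2, 11, 1);
""")

# Condition in ON: vendors without a matching review keep a NULL row,
# which COALESCE turns into 0.0.
on_clause = conn.execute("""
    SELECT v.id, COALESCE(AVG(r.rating), 0.0) AS rating
    FROM vendor v
    LEFT OUTER JOIN review r
        ON r.vendor_id = v.id AND r.product_id IS NULL
    GROUP BY v.id
    ORDER BY v.id
""").fetchall()

# Condition in WHERE: it runs after the join, so a vendor whose only
# reviews are product reviews loses all of its rows.
where_clause = conn.execute("""
    SELECT v.id, COALESCE(AVG(r.rating), 0.0) AS rating
    FROM vendor v
    LEFT OUTER JOIN review r ON r.vendor_id = v.id
    WHERE r.product_id IS NULL
    GROUP BY v.id
    ORDER BY v.id
""").fetchall()
```

With this sample data, on_clause lists all three vendors, while where_clause is missing vendor 2.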

OK, I might be wrong here. I did a small test and it gave me the correct result, but I would have to spend more time testing and I don't have that now.
You could try this:
vendor_list = Vendor.objects.filter(category=category, active=True)
vendor_list = vendor_list.filter(Q(review__product__isnull=True)|Q(review__isnull=True)).annotate(rating_avg=Avg('review__rating'))
(The filter has been separated into two lines to make it easier to read, but they could be merged.)
The idea is that you first take all vendors and then keep those that either have a review without a product or have no reviews at all. Then you annotate those.
The rating for vendors without a review would be None, not 0.
