Python Django get distinct queryset by month from a DateField

class MyModel(models.Model):
    TRANSACTION_TYPE_CHOICES = (
        ('p', 'P'),
        ('c', 'C'),
    )
    status = models.CharField(max_length=50, choices=TRANSACTION_TYPE_CHOICES, default='c')
    user = models.ForeignKey(User, db_index=True, on_delete=models.CASCADE, related_name='user_wallet')
    date = models.DateField(auto_now=True)
    amount = models.FloatField(null=True, blank=True)

    def __unicode__(self):
        return str(self.id)
I am new to Python and Django and know a little Django REST Framework.
I have the model above and want to filter on the date field by month and get a distinct queryset per month. Is there a built-in way to do this?
Thanks in advance.

You can use TruncMonth with an annotation:
from django.db.models.functions import TruncMonth
MyModel.objects.annotate(
    month=TruncMonth('date')
).filter(month=YOURVALUE).values('month').distinct()
Or, if you only need to filter the date by month and get distinct rows, use the __month lookup:
MyModel.objects.filter(date__month=YOURVALUE).distinct()
Older Django
You can use extra(); an example for PostgreSQL:
MyModel.objects.extra(
    select={'month': "EXTRACT(month FROM date)"},
    where=["EXTRACT(month FROM date)=%s"],
    params=[5]
    # replace 5 with your month number
).values('month').distinct()

This may help you (distinct() with field names works on PostgreSQL only):
MyModel.objects.values('col1', 'col2', ..., 'date').distinct('date')
Or try this:
from django.db.models import Count
from django.db.models.functions import TruncMonth
(
    MyModel.objects
    .annotate(month=TruncMonth('date'))  # truncate to month and add to the select list
    .values('month')                     # group by month
    .annotate(c=Count('id'))             # count the rows in each group
    .values('month', 'c')                # select month and count (may be redundant, untested)
)
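To make the grouping concrete, here is a plain-Python sketch (no database involved) of what truncating to the month and counting per group computes. The sample dates and the helper name count_by_month are made up for illustration; this is not Django code.

```python
from collections import Counter
from datetime import date


def count_by_month(dates):
    """Truncate each date to the first day of its month, then count per month."""
    truncated = (d.replace(day=1) for d in dates)
    # Sort by month so the result reads chronologically
    return dict(sorted(Counter(truncated).items()))


# Hypothetical sample data
sample = [date(2020, 5, 3), date(2020, 5, 17), date(2020, 6, 1), date(2021, 5, 9)]
months = count_by_month(sample)
```

Note that truncation keeps the year, so May 2020 and May 2021 are separate groups, whereas a date__month=5 filter matches May of every year.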

Related

Django ORM: get users who have NOT updated their salary information in the last year (Simple History)

I want to fetch users who have not updated their salary information in the last year, using the ORM rather than a for loop.
from simple_history.models import HistoricalRecords
class User(AbstractUser):
    ...
    salary_expectation = models.IntegerField()
    history = HistoricalRecords(cascade_delete_history=True)
################################################################
User.objects.filter(...)  # MAGIC: get users who have NOT updated their salary information in the last year
This package has its own documentation on querying history entries:
https://django-simple-history.readthedocs.io/en/latest/querying_history.html
That said, you can also approach it with ordinary Django querying and a little SQL knowledge. I'd expect the history table to have a one-to-many relationship with the users table, so first open the database, find the column that stores the date of the change, note its name, and then write an ORM query like this:
sub_query = ~Q(history__history_date__lte="Replace with end of date", history__history_date__gte="Replace with beginning of date", salary_expectation__isnull=False)
users = User.objects.filter(sub_query)
Don't forget to import Q:
from django.db.models import Q
You do not need to check the HistoricalRecords class for this information.
Add created_at and updated_at (DateTimeField) fields to your User model:
class User(...):
    ...
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)
Queryset code:
from django.db import models
from django.db.models.functions import Now, ExtractDay
from django.contrib.auth import get_user_model
User = get_user_model()
users = User.objects.annotate(
    # Calculate the duration between now and the last saved update
    duration=models.ExpressionWrapper(
        Now() - models.F("updated_at"),
        output_field=models.DurationField()
    ),
    # Extract the number of days in the duration
    days=ExtractDay('duration'),
    # Check whether the number of days exceeds one year (365.25 days)
    last_update_beyond_a_year=models.Case(
        models.When(
            models.Q(days__gte=365.25),
            then=True
        ),
        default=False,
        output_field=models.BooleanField()
    )
# Then filter
).filter(last_update_beyond_a_year=True)
And voilà!
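The annotation chain above roughly amounts to comparing updated_at against a cutoff one year back. A minimal plain-Python sketch of that comparison follows; the (name, updated_at) tuples, the helper name stale_users, and the 365-day default are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def stale_users(rows, now, days=365):
    """Return names whose last update is at least `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    return [name for name, updated_at in rows if updated_at <= cutoff]


# Hypothetical sample data
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
rows = [
    ("alice", datetime(2022, 6, 1, tzinfo=timezone.utc)),
    ("bob", datetime(2023, 6, 1, tzinfo=timezone.utc)),
]
stale = stale_users(rows, now)
```

In the ORM, the same idea can usually be written as a single filter on updated_at with the __lte lookup and a cutoff computed in Python, which avoids the duration annotation entirely.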

first() and last() return items that were excluded

I used last() to get the last item of a queryset after excluding some items, as below:
holidays = HolidayModel.objects.all().values_list('date', flat=True)
result = BorseExchangeLog.objects.exclude(
    hint_time__date__in=holidays
)
# output 1
print(list(result.values_list('hint_time__date', flat=True).distinct('hint_time__date')))
# output 2
print(result.last().hint_time.date())
But output 2 prints a date that does not appear in output 1.
I tested some other code as below:
print(list(logs.values_list('hint_time__date',flat=True).distinct('hint_time__date')))
print(list(logs.values_list('hint_time__date', flat=True).distinct('hint_time__date'))[-1])
print(logs.order_by('hint_time__date').last().hint_time.date())
[..., datetime.date(2020, 10, 21), datetime.date(2020, 10, 26)]
2020-10-26
2020-10-25
My holiday model:
class HolidayModel(models.Model):
    creator = models.ForeignKey('accounts.Account', on_delete=models.PROTECT, verbose_name=_('Creator'))
    reason = models.CharField(default='', max_length=200, verbose_name=_('Reason'))
    date = models.DateField(default=timezone.now, verbose_name=_('Date'))
And the other model is:
class BorseExchangeLog(models.Model):
    create_time = models.DateTimeField(default=timezone.now)
    hint_time = models.DateTimeField(default=timezone.now)
I tried first() as well and the problem was there too.
What is the problem? Is my code wrong, or is this a bug in the Django ORM?
I am using Django 2.2 and PostgreSQL.
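Your datetimes are timezone-aware, but the date() method on a datetime object does not convert to your current time zone: it returns the date in whatever time zone the value carries (typically UTC when read back from the database with USE_TZ enabled). The __date lookup, in contrast, converts to the current time zone first, provided your database supports it. Use the django.utils.timezone.localdate function to get a date that takes the time zone into account: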
from django.utils.timezone import localdate
print(localdate(result.last().hint_time))
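The mismatch can be reproduced without Django. The sketch below uses only the standard library; the fixed +03:30 offset and the sample timestamp are assumptions chosen for illustration, not values taken from the question.

```python
from datetime import datetime, timedelta, timezone

# A timezone-aware value as it might come back from the database (in UTC)
utc_value = datetime(2020, 10, 25, 21, 0, tzinfo=timezone.utc)

# Hypothetical local zone, modelled as a fixed offset for simplicity
local_tz = timezone(timedelta(hours=3, minutes=30))

utc_date = utc_value.date()                          # date as seen in UTC
local_date = utc_value.astimezone(local_tz).date()   # date as seen locally
```

Here utc_date is 2020-10-25 while local_date is 2020-10-26, which is the same kind of one-day gap the question shows between the two outputs.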

How to limit top N of each group in Django ORM by using Postgres Window functions or Lateral Joins?

I have the following Post, Category and PostScore models.
class Post(models.Model):
    category = models.ForeignKey('Category', on_delete=models.SET_NULL, related_name='category_posts', limit_choices_to={'parent_category': None}, blank=True, null=True)
    status = models.CharField(max_length=100, choices=STATUS_CHOICES, default='draft')
    deleted_at = models.DateTimeField(null=True, blank=True)
    ...
    ...
class Category(models.Model):
    title = models.CharField(max_length=100)
    parent_category = models.ForeignKey('self', on_delete=models.SET_NULL,
                                        related_name='sub_categories', null=True, blank=True,
                                        limit_choices_to={'parent_category': None})
    ...
    ...
class PostScore(models.Model):
    post = models.OneToOneField(Post, on_delete=models.CASCADE, related_name='post_score')
    total_score = models.DecimalField(max_digits=8, decimal_places=5, default=0)
    ...
    ...
I want to write a query that returns N posts (Post) for each distinct category (Category), sorted by post score (the total_score column in the PostScore model) in descending order, so that I get at most N records per category, those with the highest post scores.
I can achieve this with the following raw query, which gives me the top 10 highest-scoring posts of each category:
SELECT *
FROM (
    SELECT *,
           RANK() OVER (PARTITION BY "post"."category_id"
                        ORDER BY "postscore"."total_score" DESC) AS "rank"
    FROM "post"
    LEFT OUTER JOIN "postscore"
        ON ("post"."id" = "postscore"."post_id")
    WHERE ("post"."deleted_at" IS NULL AND "post"."status" = 'accepted')
    ORDER BY "postscore"."total_score" DESC
) final_posts
WHERE rank <= 10
What I have achieved so far using the Django ORM:
>>> from django.db.models.expressions import Window
>>> from django.db.models.functions import Rank
>>> from django.db.models import F
>>> posts = Post.objects.annotate(
...     rank=Window(
...         expression=Rank(),
...         order_by=F('post_score__total_score').desc(),
...         partition_by=[F('category_id')],
...     )
... ).filter(
...     status='accepted', deleted_at__isnull=True
... ).order_by('-post_score__total_score')
which roughly evaluates to
>>> print(posts.query)
SELECT *,
       RANK() OVER (PARTITION BY "post"."category_id"
                    ORDER BY "postscore"."total_score" DESC) AS "rank"
FROM "post"
LEFT OUTER JOIN "postscore"
    ON ("post"."id" = "postscore"."post_id")
WHERE ("post"."deleted_at" IS NULL AND "post"."status" = 'accepted')
ORDER BY "postscore"."total_score" DESC
So what is missing is a way to limit each group (i.e. category) using the "rank" alias.
I would love to know how this can be done.
I have seen an answer by Alexandr on this question that uses Subquery and the in operator. It satisfies the condition above and returns the right results, but the query is very slow.
Anyway, this is the query following Alexandr's suggestion:
>>> from django.db.models import OuterRef, Subquery
>>> q = Post.objects.filter(
...     status='accepted', deleted_at__isnull=True,
...     category=OuterRef('category')
... ).order_by('-post_score__total_score')[:10]
>>> posts = Post.objects.filter(id__in=Subquery(q.values('id')))
So I am keener on completing the raw query above (which is almost done and only lacks the limit part) by using a window function in the ORM. I also think this can be achieved with a lateral join, so answers in that direction are welcome too.
I have a workaround using raw(), but it returns a django.db.models.query.RawQuerySet, which does not support methods like filter, exclude, etc.
>>> posts = Post.objects.annotate(
...     rank=Window(
...         expression=Rank(),
...         order_by=F('post_score__total_score').desc(),
...         partition_by=[F('category_id')],
...     )
... ).filter(status='accepted', deleted_at__isnull=True)
>>> sql, params = posts.query.sql_with_params()
>>> posts = Post.objects.raw(
...     'SELECT * FROM ({}) final_posts WHERE rank <= %s'.format(sql),
...     [*params, 10],
... )
I'll wait for an answer that returns a QuerySet object instead; otherwise I will have to do it this way.
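For reference, the outer "rank <= N" step can be sketched in plain Python. The (post_id, category_id, total_score) tuples and the helper name top_n_per_category are hypothetical, and the sketch assumes every post has a non-null score. It mirrors SQL RANK(), so tied posts share a rank and a category can yield more than N rows.

```python
from collections import defaultdict


def top_n_per_category(rows, n):
    """Keep rows whose RANK() within their category is <= n."""
    by_category = defaultdict(list)
    for post_id, category_id, score in rows:
        by_category[category_id].append((post_id, score))

    result = []
    for category_id, posts in by_category.items():
        for post_id, score in posts:
            # RANK() = 1 + number of rows in the partition with a strictly higher score
            rank = 1 + sum(1 for _, other in posts if other > score)
            if rank <= n:
                result.append((category_id, post_id, score, rank))
    # Sort by category, then rank, then post id for readable output
    return sorted(result, key=lambda r: (r[0], r[3], r[1]))


# Hypothetical sample data: posts 2 and 3 tie on score
rows = [(1, 'a', 9.0), (2, 'a', 7.0), (3, 'a', 7.0), (4, 'a', 5.0), (5, 'b', 3.0)]
top = top_n_per_category(rows, 2)
```

With n=2, category 'a' returns three posts because the two tied posts both get rank 2, which is the same behaviour the raw SQL above would show.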

Django get values for Max of grouped data

After much trial and error and checking similar questions, I think it is worth asking with all the details.
Here's a simple model. Let's say we have a Book model and a Reserve model that holds reservation data for each Book.
class Book(models.Model):
    title = models.CharField(
        'Book Title',
        max_length=50
    )
    name = models.CharField(
        max_length=250
    )
class Reserve(models.Model):
    book = models.ForeignKey(
        Book,
        on_delete=models.CASCADE
    )
    reserve_date = models.DateTimeField()
    status = models.CharField(
        'Reservation Status',
        max_length=5,
        choices=[
            ('R', 'Reserved'),
            ('F', 'Free')
        ]
    )
I added a book and two reservation records to the model:
from django.utils import timezone
book_inst = Book(title='Book1')
book_inst.save()
reserve_inst = Reserve(book=book_inst, reserve_date=timezone.now(), status='R')
reserve_inst.save()
reserve_inst = Reserve(book=book_inst, reserve_date=timezone.now(), status='F')
reserve_inst.save()
My goal is to get the data of the last reservation for each book. Based on other questions, I got to this point:
from django.db.models import F, Q, Max
reserve_qs = Reserve.objects.values(
    'book__title'
).annotate(
    last_action=Max('reserve_date')
)
reserve_qs now has the last action for each Book, but when I add .values() it ignores the grouping and returns all the records.
I also tried filtering with F:
Reserve.objects.values(
    'book__title'
).annotate(
    last_action=Max('reserve_date')
).values(
).filter(
    reserve_date=F('last_action')
)
I'm using Django 3 and SQLite.
By using another filter, you will break the GROUP BY mechanism. You can however simply obtain the last reservation with:
from django.db.models import F, Max
Reserve.objects.filter(
    book__title='Book1'
).annotate(
    book_title=F('book__title'),
    last_action=Max('book__reserve__reserve_date')
).filter(
    reserve_date=F('last_action')
)
or for all books:
from django.db.models import F, Max
qs = Reserve.objects.annotate(
    book_title=F('book__title'),
    last_action=Max('book__reserve__reserve_date')
).filter(
    reserve_date=F('last_action')
).select_related('book')
Here we calculate the maximum reserve_date per book. Because the annotation joins back onto the same table through book__reserve, the grouping comes out correctly.
This retrieves the last reservation of every Book that remains after filtering. Normally that is one per Book, but if a Book has several Reserve rows sharing exactly the same latest timestamp, all of them are returned.
So we can, for example, print the reservations with:
for q in qs:
    print(
        'Last reservation for {} is {} with status {}'.format(
            q.book.title,
            q.reserve_date,
            q.status
        )
    )
For a single book, however, it is better to simply fetch the Book object and return the .latest(..) [Django-doc] reservation:
Book.objects.get(title='Book1').reserve_set.latest('reserve_date')
book_obj = Book.objects.get(title='Book1')
reserve_qs = book_obj.reserve_set.all()
This returns all the Reserve objects that refer to this book.
You can get the latest object by ordering them (for example by reserve_date) and calling .first() or .last().
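As a cross-check on what "last reservation per book" means, the selection can be sketched in plain Python. The (title, reserve_date, status) tuples and the helper name latest_per_book are hypothetical. Unlike the ORM query that matches on the maximum timestamp, this sketch keeps only one row per book (the first one seen) when timestamps tie.

```python
from datetime import datetime


def latest_per_book(reservations):
    """Map each book title to the (reserve_date, status) of its latest reservation."""
    latest = {}
    for title, reserve_date, status in reservations:
        # Replace the stored entry only when a strictly later date shows up
        if title not in latest or reserve_date > latest[title][0]:
            latest[title] = (reserve_date, status)
    return latest


# Hypothetical sample data
reservations = [
    ("Book1", datetime(2020, 1, 1), "R"),
    ("Book1", datetime(2020, 1, 2), "F"),
    ("Book2", datetime(2020, 1, 1), "R"),
]
last_actions = latest_per_book(reservations)
```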

How to run a custom aggregation on a queryset?

I have a model called LeaveEntry:
class LeaveEntry(models.Model):
    date = models.DateField(auto_now=False, auto_now_add=False)
    user = models.ForeignKey(
        settings.AUTH_USER_MODEL,
        on_delete=models.PROTECT,
        limit_choices_to={'is_active': True},
        unique_for_date='date'
    )
    half_day = models.BooleanField(default=False)
I get a set of LeaveEntries with the filter:
LeaveEntry.objects.filter(
leave_request=self.unapproved_leave
).count()
I would like to get an aggregation called total days, where a LeaveEntry with half_day=True counts as half a day (0.5) and any other entry counts as a full day (1).
Based on the Django aggregation docs, I was thinking of annotating the days like this:
days = LeaveEntry.objects.annotate(days=<If this half_day is True: 0.5 else 1>)
You can use Django's conditional expressions Case and When (Django 1.8+ only).
Keeping the order of filter() and annotate() in mind, you can total the number of days for unapproved leave like so:
from django.db.models import Case, FloatField, Sum, When
# ...
LeaveEntry.objects.filter(
    leave_request=self.unapproved_leave  # not sure what self relates to
).annotate(
    # 0.5 for a half day, 1 for a full day
    days=Case(
        When(half_day=True, then=0.5),
        When(half_day=False, then=1),
        output_field=FloatField()
    )
).aggregate(
    # Sum (not Count) adds up the per-entry values
    total_days=Sum('days')
)
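As a sanity check on the arithmetic the Case/When expression is meant to perform, here is the same total in plain Python. The helper name total_leave_days and the list of half_day flags are made up for illustration.

```python
def total_leave_days(half_day_flags):
    """Add 0.5 for each half-day entry and 1 for each full-day entry."""
    return sum(0.5 if half_day else 1.0 for half_day in half_day_flags)


# Hypothetical sample: two half days and two full days
total = total_leave_days([True, False, False, True])
```

The result has to come from adding per-entry values, not from counting entries: counting these four entries would give 4, while the intended total is 3.0.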
