Order DateTimeField by rounded time using only the ORM? - python

I came up with a solution, but it uses a model method (which, as far as I understand, cannot be used with X.objects.filter()) followed by Python's built-in sorted. I've read that ordering in the Django ORM is much faster than sorting in Python, so I'm looking for an ORM-only solution. Note that adding fields to my model is not an option, as the database is already well populated.
Basically, I have an Articles model:
class Articles(models.Model):
    title = models.CharField(max_length=200, null=False, blank=False)
    image = models.URLField(null=False, blank=False)
    summary = models.TextField()
    link = models.URLField(null=False, blank=False)
    pub_date = models.DateTimeField(default=timezone.now)
    source = models.ForeignKey(
        Sources, on_delete=models.CASCADE, related_name="souurce")
    category = models.ManyToManyField(Categories)
What I want to do is order the results by approximate publication date (for example, an article published at 5:34 and another published at 5:31 are both considered published at the same time); then I can apply further orderings, such as by category, source, or even title.
Here is my model method to do that (rounding to the closest 10 minutes):
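def get_approximate_date(self):
    pub_date = self.pub_date

    def timeround10(dt):
        """Round to the closest 10 minutes."""
        # Note: round(dt.minute, -1) yields 60 for minutes 55-59; the hour then
        # advances, but at 23:55-23:59 the date does not roll over to the next day.
        a, b = divmod(round(dt.minute, -1), 60)
        h = (dt.hour + a) % 24
        m = b
        new = dt.replace(hour=h, minute=m, second=0, microsecond=0)
        return new

    return timeround10(pub_date)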
Then in my view I can do the following (I chose to order by approximate date, then by summary in reverse alphabetical order):
articles_ = Articles.objects.all()
articles_list = sorted(articles_, key=lambda i: (i.get_approximate_date(), i.summary), reverse=True)
The closest thing I came up with using only the Django ORM is:
Articles.objects.order_by("-pub_date__year","-pub_date__month","-pub_date__day","-summary")
Apart from being ugly, it only truncates the publication date to the hour, so 1:59 PM is treated the same as 1:01 PM.
I'm aware of Trunc (https://docs.djangoproject.com/en/3.1/ref/models/database-functions/#trunc), but it only truncates to fixed units such as hour or minute; it does not round to an arbitrary interval. Maybe I should extend it if that's the only option.
Thanks in advance !
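One way to think about pushing the rounding into the database is integer arithmetic on the timestamp: add half a step, then integer-divide by the step. The sketch below is not Django code; it uses the standard-library sqlite3 module and a hypothetical articles table to show that such an ORDER BY expression gives the same order as sorting in Python with a rounding helper. The table, the helper round_to_step and the sample rows are assumptions made for illustration, and ties at exactly half a step are rounded up here, which may differ from the built-in round used in the question.

```python
import sqlite3
from datetime import datetime, timedelta

STEP_SECONDS = 600  # 10-minute buckets


def round_to_step(dt, step_seconds=STEP_SECONDS):
    """Round a naive datetime to the nearest step, carrying into hour and date."""
    epoch = datetime(1970, 1, 1)
    seconds = int((dt - epoch).total_seconds())
    # Add half a step, then floor to a multiple of the step.
    bucket = (seconds + step_seconds // 2) // step_seconds
    return epoch + timedelta(seconds=bucket * step_seconds)


# Hypothetical table standing in for the Articles model.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (summary TEXT, pub_date TEXT)")
rows = [
    ("b", "2021-01-01 05:31:00"),
    ("a", "2021-01-01 05:34:00"),
    ("c", "2021-01-01 05:36:00"),
    ("d", "2021-01-01 23:57:00"),
]
conn.executemany("INSERT INTO articles VALUES (?, ?)", rows)

# Same rounding done by the database: 300 is half of the 600-second step.
query = """
    SELECT summary
    FROM articles
    ORDER BY (CAST(strftime('%s', pub_date) AS INTEGER) + 300) / 600 DESC,
             summary DESC
"""
db_order = [summary for (summary,) in conn.execute(query)]

# Reference ordering computed in Python, as in the sorted() call above.
python_order = [
    summary
    for summary, pub_date in sorted(
        rows,
        key=lambda r: (round_to_step(datetime.strptime(r[1], "%Y-%m-%d %H:%M:%S")), r[0]),
        reverse=True,
    )
]
```

Because the helper works on a running count of seconds rather than on the hour and minute fields, 23:57 rounds to midnight of the following day instead of wrapping around within the same date.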

Related

How to limit top N of each group in Django ORM by using Postgres Window functions or Lateral Joins?

I have following Post, Category & PostScore Model.
class Post(models.Model):
    category = models.ForeignKey('Category', on_delete=models.SET_NULL, related_name='category_posts', limit_choices_to={'parent_category': None}, blank=True, null=True)
    status = models.CharField(max_length=100, choices=STATUS_CHOICES, default='draft')
    deleted_at = models.DateTimeField(null=True, blank=True)
    ...
    ...
class Category(models.Model):
    title = models.CharField(max_length=100)
    parent_category = models.ForeignKey('self', on_delete=models.SET_NULL,
                                        related_name='sub_categories', null=True, blank=True,
                                        limit_choices_to={'parent_category': None})
    ...
    ...
class PostScore(models.Model):
    post = models.OneToOneField(Post, on_delete=models.CASCADE, related_name='post_score')
    total_score = models.DecimalField(max_digits=8, decimal_places=5, default=0)
    ...
    ...
What I want is a query that returns N posts (Post) for each distinct category (Category), sorted by post score (the total_score column in the PostScore model) in descending order, so that I have at most N records per category, namely those with the highest post scores.
I can achieve this with the following raw query, which gives me the top 10 highest-scoring posts in each category:
SELECT *
FROM (
SELECT *,
RANK() OVER (PARTITION BY "post"."category_id"
ORDER BY "postscore"."total_score" DESC) AS "rank"
FROM
"post"
LEFT OUTER JOIN
"postscore"
ON
("post"."id" = "postscore"."post_id")
WHERE
("post"."deleted_at" IS NULL AND "post"."status" = 'accepted')
ORDER BY
"postscore"."total_score"
DESC
) final_posts
WHERE
rank <= 10
What I have achieved so far using the Django ORM:
>>> from django.db.models.expressions import Window
>>> from django.db.models.functions import Rank
>>> from django.db.models import F
>>> posts = Post.objects.annotate(
        rank=Window(expression=Rank(),
                    order_by=F('post_score__total_score').desc(),
                    partition_by=[F('category_id')]
        )). \
    filter(status='accepted', deleted_at__isnull=True). \
    order_by('-post_score__total_score')
which roughly evaluates to
>>> print(posts.query)
>>> SELECT *,
RANK() OVER (PARTITION BY "post"."category_id"
ORDER BY "postscore"."total_score" DESC) AS "rank"
FROM
"post"
LEFT OUTER JOIN
"postscore"
ON
("post"."id" = "postscore"."post_id")
WHERE
("post"."deleted_at" IS NULL AND "post"."status" = 'accepted')
ORDER BY
"postscore"."total_score"
DESC
So basically what is missing is a way to limit each group's (i.e. each category's) results using the "rank" alias.
I would love to know how this can be done.
I have seen one answer suggested by Alexandr on this question; one way of achieving this is by using Subquery and the in operator. Although it satisfies the above condition and outputs the right results, the query is very slow.
Anyway, this would be the query if I follow Alexandr's suggestion:
>>> from django.db.models import OuterRef, Subquery
>>> q = Post.objects.filter(status='accepted', deleted_at__isnull=True,
category=OuterRef('category')).order_by('-post_score__total_score')[:10]
>>> posts = Post.objects.filter(id__in=Subquery(q.values('id')))
So I am more keen on completing the above raw query (which is almost done and only misses the limit part) by using a window function in the ORM. I also think this can be achieved with a lateral join, so answers in that direction are welcome too.
I have a workaround using RawQuerySet, but the catch is that it returns a django.db.models.query.RawQuerySet, which doesn't support methods like filter, exclude, etc.
>>> posts = Post.objects.annotate(rank=Window(expression=Rank(),
order_by=F('post_score__total_score').desc(),
partition_by=[F('category_id')])).filter(status='accepted',
deleted_at__isnull=True)
>>> sql, params = posts.query.sql_with_params()
>>> posts = Post.objects.raw(""" SELECT * FROM ({}) final_posts WHERE
rank <= %s""".format(sql),[*params, 10],)
I'll wait for an answer that returns a QuerySet object instead; otherwise I'll have to do it this way.
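To make the semantics of the rank filter concrete, here is a plain-Python sketch of what RANK() OVER (PARTITION BY category ORDER BY score DESC) followed by rank <= N computes. It does not touch the ORM; the tuple layout (post_id, category_id, total_score) and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def top_n_per_category(rows, n):
    """Keep rows whose RANK() within their category is <= n.

    rows is an iterable of (post_id, category_id, total_score) tuples,
    a hypothetical flattened form of the Post/PostScore join.
    """
    by_category = defaultdict(list)
    for post_id, category_id, score in rows:
        by_category[category_id].append((post_id, score))

    kept = {}
    for category_id, posts in by_category.items():
        # ORDER BY total_score DESC inside each partition.
        posts.sort(key=lambda p: p[1], reverse=True)
        ranked = []
        rank = 0
        previous_score = object()  # sentinel that equals no real score
        for position, (post_id, score) in enumerate(posts, start=1):
            # RANK() gives tied rows the same rank and then skips ahead.
            if score != previous_score:
                rank = position
                previous_score = score
            if rank > n:
                break
            ranked.append((post_id, rank))
        kept[category_id] = ranked
    return kept


sample = [(1, "x", 9.0), (2, "x", 8.0), (3, "x", 8.0), (4, "x", 7.0), (5, "y", 1.0)]
result = top_n_per_category(sample, 2)
```

Note that with ties at the cut-off, RANK() can keep more than N rows per category (three rows for category "x" in the sample). If exactly N rows are required, ROW_NUMBER() is the usual alternative; Django ships a RowNumber function next to Rank in django.db.models.functions.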

Django Querying the Database

Here's my Answer Model,
class Answer(models.Model):
    likes = models.ManyToManyField(User, related_name='answer_likes')
    timestamp = models.DateTimeField(auto_now=False, auto_now_add=True)
I want to find the answers that received the most likes in the last 24 hours. How can I do that in a view?
Thank you :)
You need Django's aggregation API. Try:
from datetime import datetime, timedelta
from django.db.models import Count
# With USE_TZ = True, use django.utils.timezone.now() instead of datetime.now().
last_24 = datetime.now() - timedelta(hours=24)
ans = Answer.objects.filter(timestamp__gte=last_24).annotate(counted_likes=Count('likes')).order_by('-counted_likes')
Now you can read ans[0].counted_likes to find out how many likes answer ans[0] has, and the order_by clause guarantees that this first element has the largest number of likes.
See the aggregation section of the Django docs for further explanation.

How to sort a Django QuerySet by (field, custom function, field)

I am looking for a way to get a QuerySet that is sorted by field1, a function, and field2.
The model:
class Task(models.Model):
    issue_id = models.CharField(max_length=20, unique=True)
    title = models.CharField(max_length=100)
    priority_id = models.IntegerField(blank=True, null=True)
    created_date = models.DateTimeField(auto_now_add=True)

    def due_date(self):
        ...
        return ageing
I'm looking for something like:
taskList = Task.objects.all().order_by('priority_id', ***duedate***, 'title')
Obviously, you can't sort a queryset by a custom function. Any advice?
Since the actual sorting happens in the database, which does not speak Python, you cannot use a Python function for ordering. You will need to implement your due date logic in an SQL expression, as a QuerySet.extra(select={...}) calculated field, something along the lines of:
due_date_expr = '(implementation of your logic in SQL)'
taskList = Task.objects.all().extra(select={'due_date': due_date_expr}).order_by('priority_id', 'due_date', 'title')
If your logic is too complicated, you might need to implement it as a stored procedure in your database.
Alternatively, if your data set is very small (say, tens to a few hundred records), you can fetch the entire result set into a list and sort it after the fact:
taskList = list(Task.objects.all())
taskList.sort(key=key_function)  # Python 3 removed the cmp argument; wrap a comparison function with functools.cmp_to_key
The answer by @lanzz, even though it seems correct, didn't work for me, but this answer from another thread did the trick:
https://stackoverflow.com/a/37648265/6420686
from django.db.models import Case, When
ids = [list of ids]
preserved = Case(*[When(id=pk, then=pos) for pos, pk in enumerate(ids)])
filtered_users = User.objects \
.filter(id__in=ids) \
.order_by(preserved)
You can use sort in Python if the queryset is not too large:
ordered = sorted(Task.objects.all(), key=lambda o: (o.priority_id, o.due_date(), o.title))
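One caveat with the tuple key above: priority_id is nullable, and in Python 3 comparing None with an integer raises TypeError. A minimal sketch of a key function that places rows without a priority last is shown below; the tuples stand in for Task rows, and the sample data and the sort_key name are assumptions made for illustration.

```python
from datetime import date


def sort_key(task):
    """Order by priority (missing priority last), then due date, then title."""
    priority, due, title = task
    # Comparing None with an int raises TypeError in Python 3, so a boolean
    # flag pushes rows without a priority to the end instead.
    return (priority is None, priority if priority is not None else 0, due, title)


# Hypothetical (priority_id, due_date, title) tuples standing in for Task rows.
tasks = [
    (2, date(2024, 1, 5), "write docs"),
    (None, date(2024, 1, 1), "triage"),
    (1, date(2024, 1, 9), "fix login"),
    (1, date(2024, 1, 3), "add tests"),
]

ordered = sorted(tasks, key=sort_key)
titles = [title for _, _, title in ordered]
```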

Django filter against ForeignKey and by result of manytomany sub query

I've looked at doing a query using extra and/or annotate, but have not been able to get the result I want.
I want to get a list of Products that have active licenses, along with the total number of available licenses. An active license is defined as one that is not obsolete, is in date, and has more licenses than assigned licenses (the latter being a count on the many-to-many field).
The models I have defined are:
class Vendor(models.Model):
    name = models.CharField(max_length=200)
    url = models.URLField(blank=True)
class Product(models.Model):
    name = models.CharField(max_length=200)
    vendor = models.ForeignKey(Vendor)
    product_url = models.URLField(blank=True)
    is_obsolete = models.BooleanField(default=False, help_text="Is this product obsolete?")
class License(models.Model):
    product = models.ForeignKey(Product)
    num_licenses = models.IntegerField(default=1, help_text="The number of assignable licenses.")
    licensee_name = models.CharField(max_length=200, blank=True)
    license_key = models.TextField(blank=True)
    license_startdate = models.DateField(default=date.today())
    license_enddate = models.DateField(null=True, blank=True)
    is_obsolete = models.BooleanField(default=False, help_text="Is this licenses obsolete?")
    licensees = models.ManyToManyField(User, blank=True)
I have tried filtering by the License model, which works, but I don't know how to then collate / GROUP BY / aggregate the returned data into a single queryset.
When trying to filter by product, I can't quite figure out the query I need. I can get bits and pieces, and have tried using an .extra() select= query to return the number of available licenses (which is all I really need at this point), given that there will be multiple licenses associated with a product.
So the ultimate question is: how can I retrieve a list of available products, with the number of available licenses, in Django? I'd rather avoid resorting to raw SQL as much as possible.
Here is an example queryset that gets all the License details I want; I just can't get the product:
License.objects.annotate(
    used_licenses=Count('licensees')
).extra(
    select={
        'avail_licenses': 'licenses_license.num_licenses - (SELECT count(*) FROM licenses_license_licensees WHERE licenses_license_licensees.license_id = licenses_license.id)'
    }
).filter(
    is_obsolete=False,
    num_licenses__gt=F('used_licenses')
).exclude(
    license_enddate__lte=date.today()
)
Thank you in advance.
EDIT (2014-02-11):
I think I've solved it, possibly in an ugly way. I didn't want to make too many DB calls, so I get all the information using a License query, then filter it in Python and return it all from inside a manager class. It may overuse dicts and lists, but it works, and I can expand it with additional info later on without much risk or custom SQL. It also uses some of the parameters that I have defined in the model class.
class LicenseManager(models.Manager):
    def get_available_products(self):
        licenses = self.get_queryset().annotate(
            used_licenses=Count('licensees')
        ).extra(
            select={
                'avail_licenses': 'licenses_license.num_licenses - (SELECT count(*) FROM licenses_license_licensees WHERE licenses_license_licensees.license_id = licenses_license.id)'
            }
        ).filter(
            is_obsolete=False,
            num_licenses__gt=F('used_licenses')
        ).exclude(
            license_enddate__lte=date.today()
        ).prefetch_related('product')
        products = {}
        for lic in licenses:
            if lic.product not in products:
                products[lic.product] = lic.product
                products[lic.product].avail_licenses = lic.avail_licenses
            else:
                products[lic.product].avail_licenses += lic.avail_licenses
        avail_products = []
        for prod in products.values():
            if prod.avail_licenses > 0:
                avail_products.append(prod)
        return avail_products
EDIT (2014-02-12):
Okay, this is the final solution I have decided to go with. It uses Python to filter the results, reduces cache calls, and issues a constant number of SQL queries.
The lesson here is that for something with many levels of filtering, it's best to fetch as much as needed and filter in Python once the data is returned.
class ProductManager(models.Manager):
    def get_all_available(self, curruser):
        """
        Gets all available Products that are available to the current user
        """
        q = self.get_queryset().select_related().prefetch_related('license', 'license__licensees').filter(
            is_obsolete=False,
            license__is_obsolete=False
        ).exclude(
            license__enddate__lte=date.today()
        ).distinct()
        # return a curated list. Need further information first
        products = []
        for x in q:
            x.avail_licenses = 0
            x.user_assigned = False
            # checks licenses. Does this on the model level as it's cached so as to save SQL queries
            for y in x.license.all():
                if not y.is_active:
                    break
                x.avail_licenses += y.available_licenses
                if curruser in y.licensees.all():
                    x.user_assigned = True
            products.append(x)
        return q
One strategy would be to get all the product ids from your License queryset:
productIDList = list(License.objects.filter(...).values_list(
'product_id', flat=True))
and then query the products using that list of ids:
Product.objects.filter(id__in=productIDList)
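The aggregation step that the managers above perform in Python can be sketched without the ORM. The example below sums free seats per product over licenses that are not obsolete, not expired, and not fully assigned, mirroring the filter/exclude conditions in the question. The dictionary keys, the function name and the sample data are hypothetical stand-ins for License rows.

```python
from collections import defaultdict
from datetime import date


def available_by_product(licenses, today):
    """Sum free seats per product over licenses that are still usable.

    Each license is a hypothetical dict with the keys product, num_licenses,
    assigned, is_obsolete and end_date (None meaning no expiry).
    """
    totals = defaultdict(int)
    for lic in licenses:
        if lic["is_obsolete"]:
            continue
        # Mirrors .exclude(license_enddate__lte=today).
        if lic["end_date"] is not None and lic["end_date"] <= today:
            continue
        free = lic["num_licenses"] - lic["assigned"]
        # Mirrors num_licenses__gt=F('used_licenses').
        if free > 0:
            totals[lic["product"]] += free
    return dict(totals)


sample = [
    {"product": "editor", "num_licenses": 5, "assigned": 2, "is_obsolete": False, "end_date": None},
    {"product": "editor", "num_licenses": 3, "assigned": 3, "is_obsolete": False, "end_date": None},
    {"product": "editor", "num_licenses": 4, "assigned": 1, "is_obsolete": False, "end_date": date(2014, 1, 1)},
    {"product": "ide", "num_licenses": 2, "assigned": 0, "is_obsolete": True, "end_date": None},
    {"product": "db", "num_licenses": 1, "assigned": 0, "is_obsolete": False, "end_date": date(2015, 1, 1)},
]
available = available_by_product(sample, today=date(2014, 2, 12))
```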

Django-date incrementation in a list with ManyToManyField

I'm new to Django and programming, so any help is greatly appreciated. I need help moving through a history of doctor appointments, selecting which immunizations were performed at each appointment, and then computing the date when each immunization is next due (based on an immunization information table, which holds the proper interval for each immunization; the due date is the visit date plus that interval).
models.py
class Immunizations(models.Model):
    immunization = models.CharField(max_length=100, null=True)
    interval = models.CharField(max_length=5, null=True)  # This should probably be an integer field, will change later
class Visit(models.Model):
    patient = models.ForeignKey(Patients)
    date_of_visit = models.DateField(null=True)
    weight = models.CharField(max_length=5, null=True)
    immunization = models.ManyToManyField(Immunizations)
    timestamp = models.DateTimeField(auto_now_add=True, default=datetime.datetime.now())
    active = models.BooleanField(default=True)
I have been reading the documentation and questions on SO all weekend, but am still very conflicted about what way to go through this.
What I want is:
Visit.date_of_visit1
Visit.immunization1, Visit.date_of_visit1 + Immunization.interval1
Visit.immunization2, Visit.date_of_visit1 + Immunization.interval2
Visit.date_of_visit2
Visit.immunization1, Visit.date_of_visit2 + Immunization.interval1
ETC
This could go on for years with each visit having different immunizations performed. I want to maintain a record of which immunization was performed and record the due date, even if that due date has passed.
views.py
def visit_profile(request, slug):
    patient = Patients.objects.get(slug=slug)
    try:
        visit = Visit.objects.filter(patient_id=patient.id)
    except:
        return HttpResponseRedirect('/')
    #Immunization Due Dates
    visitdate = Visit.objects.get(patient_id=patient.id, active=1).date_of_visit
    imm = Immunizations.objects.all()
    visitimm = []
    for immunization in imm:
        due = Immunizations.objects.get(id= immunization.pk)
        duedate = visitdate + timedelta(days=int(due.interval))
        visitimm.append((due, duedate))
    return render_to_response('patient.html',locals(), context_instance=RequestContext(request))
I need help with my views.py. The above works, but it only shows the information for the active=1 visit. I can't figure out how to modify or redo it to achieve what I want and be able to access the data in my template file. I've experimented with the __in lookup, itertools, looping, etc. Can anyone point me to the proper method or direction for doing this? I will go back and set up proper error handling once I get the code to work. Thanks!
Yep, make interval an IntegerField, or perhaps a PositiveSmallIntegerField, since it will never hold a negative value or a very large number.
Careful: it's better not to mix plural and singular in model names. They affect the related names used when you traverse foreign keys, which makes debugging a pain; see here. I prefer to use only singular names.
Instead of:
visit = Visit.objects.filter(patient_id=patient.id)
You can simply type:
visit = Visit.objects.filter(patient=patient)
Try something like this:
def visit_profile(request, slug):
    patient = Patients.objects.get(slug=slug)
    visitimm = []
    # Looping over all active visit records of the patient in date order
    for v in patient.visit_set.filter(active=True).order_by('date_of_visit'):
        # Looping over each visit's immunizations (the ManyToManyField on Visit is named "immunization")
        for i in v.immunization.all():
            duedate = v.date_of_visit + timedelta(days=int(i.interval))
            visitimm.append((i, duedate))
    ...
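The nested loop above can be tried out without a database. The sketch below uses plain tuples in place of Visit and Immunizations rows and returns one (visit date, immunization, due date) entry per immunization per visit, in visit-date order. The data layout, the function name and the sample values are assumptions made for illustration.

```python
from datetime import date, timedelta


def due_dates(visits):
    """Return (visit_date, immunization, due_date) for every visit, in date order.

    visits is a hypothetical list of (visit_date, [(name, interval_days), ...]).
    """
    schedule = []
    for visit_date, immunizations in sorted(visits, key=lambda v: v[0]):
        for name, interval_days in immunizations:
            # The interval may arrive as a string, as with the CharField in the question.
            due = visit_date + timedelta(days=int(interval_days))
            schedule.append((visit_date, name, due))
    return schedule


sample_visits = [
    (date(2014, 3, 1), [("MMR", "30")]),
    (date(2014, 1, 1), [("DTaP", 60), ("Hib", "10")]),
]
schedule = due_dates(sample_visits)
```

Keeping the visit date in each entry makes it straightforward to group the rows by visit in a template.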
