Limit prefetch_related to 1 by a certain criteria - python

So I have models like these
class Status(models.Model):
    name = models.CharField(max_length=255, choices=StatusName.choices, unique=True)
class Case(models.Model):
    # has some fields
    ...
class CaseStatus(models.Model):
    case = models.ForeignKey("cases.Case", on_delete=models.CASCADE, related_name="case_statuses")
    status = models.ForeignKey("cases.Status", on_delete=models.CASCADE, related_name="case_statuses")
    created = models.DateTimeField(auto_now_add=True)
I need to filter the cases on the basis of the status of their case-status but the catch is only the latest case-status should be taken into account.
To get Case objects based on all the case-statuses, this query works:
Case.objects.filter(case_statuses__status=status_name)
But I need to get the Case objects such that only their latest case_status object (descending created) is taken into account. Something like this is what I am looking for:
Case.objects.filter(case_statuses__order_by_created_first__status=status_name)
I have tried Prefetch as well, but it doesn't seem to work for my use case:
sub_query = CaseStatus.objects.filter(
    id=CaseStatus.objects.select_related('case').order_by('-created').first().id)
Case.objects.prefetch_related(Prefetch('case_statuses', queryset=sub_query)).filter(
    case_statuses__status=status_name)
This would be easy to solve in raw PostgreSQL using LIMIT 1, but I am not sure how to make it work in the Django ORM.

You can annotate each case with its latest status, and then filter on that annotation to match the status you want.
from django.db.models import OuterRef, Subquery
status_qs = CaseStatus.objects.filter(case=OuterRef('pk')).order_by('-created').values('status__name')[:1]
Case.objects.annotate(last_status=Subquery(status_qs)).filter(last_status=status_name)
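For reference, the annotation above corresponds roughly to a correlated subquery with LIMIT 1. The sketch below reproduces that shape with the standard-library sqlite3 module so it can run without a Django project. The table and column names (cases, case_statuses, status_name) and the helper cases_with_latest_status are simplified stand-ins, not the tables Django would generate, and the status name is stored inline instead of through a separate Status table.

```python
import sqlite3

# Hypothetical, simplified schema: one row per case, many status rows per case.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cases (id INTEGER PRIMARY KEY);
    CREATE TABLE case_statuses (
        id INTEGER PRIMARY KEY,
        case_id INTEGER REFERENCES cases(id),
        status_name TEXT,
        created TEXT
    );
    INSERT INTO cases (id) VALUES (1), (2);
    INSERT INTO case_statuses (case_id, status_name, created) VALUES
        (1, 'open',   '2024-01-01'),
        (1, 'closed', '2024-02-01'),
        (2, 'closed', '2024-01-01'),
        (2, 'open',   '2024-03-01');
""")


def cases_with_latest_status(conn, status_name):
    """Return ids of cases whose most recent status equals status_name."""
    rows = conn.execute(
        """
        SELECT c.id
        FROM cases AS c
        WHERE (
            -- Correlated subquery: newest status row of this case only.
            SELECT cs.status_name
            FROM case_statuses AS cs
            WHERE cs.case_id = c.id
            ORDER BY cs.created DESC
            LIMIT 1
        ) = ?
        ORDER BY c.id
        """,
        (status_name,),
    ).fetchall()
    return [row[0] for row in rows]


print(cases_with_latest_status(conn, "closed"))
```

Case 2 was closed at some point, but its latest status is open, so only case 1 matches 'closed'. A plain join filter on any status row would return both.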

Related

Django models - is it a good practice to create models that only link two tables?

Let's say that I want to develop a simple todo-list app. This is what the models.py might look like.
PRIORITIES = ["LOW", "MEDIUM", "HIGH", "URGENT"]
class User(models.Model):
    username = models.CharField(max_length = 20)
    password = models.CharField(max_length = 100)
    email = models.EmailField()
class Task(models.Model):
    text = models.TextField()
    dueDate = models.DateField()
    priority = models.CharField(choices = PRIORITIES, default = "LOW")
class UserTask(models.Model):
    user = models.ForeignKey(User, on_delete = models.CASCADE)
    task = models.ForeignKey(Task, on_delete = models.CASCADE)
Here, the UserTask model was created only with a view to reducing redundancies in the database.
Is it a good practice? I don't think that this is what models should be used for.
"Here, the UserTask model was created only with a view to reducing redundancies in the database."
If I understand correctly, a Task belongs to a single User, at least based on your comment:
"I read about ManyToMany, the problem is that this relation in question is in fact OneToMany."
In that case, I do not see how you are "reducing" redundancies in the database. In fact you create extra data duplication: you now need to keep the Task and its user = ForeignKey(..) in sync with UserTask. Every create, edit, and removal affects both, and since some operations circumvent Django's signal tooling, keeping them consistent is quite hard, if not impossible.
The fact is that you do not need to construct a table to query the relation in reverse. By default Django puts a database index on a ForeignKey, so the database can efficiently retrieve the Tasks that belong to one or more users, which also makes a JOIN with the user table more efficient.
One typically thus defines a ForeignKey like:
class Task(models.Model):
    text = models.TextField()
    dueDate = models.DateField()
    user = models.ForeignKey(User, on_delete=models.CASCADE)
    priority = models.CharField(choices = PRIORITIES, default = "LOW")
At the Python/Django level, Django also provides convenience to obtain the tasks of a User. If you want to retrieve all the Tasks of some_user, you can query with:
some_user.task_set.all()
or you can filter on the tasks, like:
some_user.task_set.filter(text__icontains='someword')
You can make the name of the relation in reverse more convenient as well, by specifying a related_name in the ForeignKey, like:
class Task(models.Model):
    user = models.ForeignKey(
        User,
        on_delete=models.CASCADE,
        related_name='tasks'
    )
    text = models.TextField()
    dueDate = models.DateField()
    priority = models.CharField(choices = PRIORITIES, default = "LOW")
In which case you thus query the Tasks of a User with:
some_user.tasks.all()
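At the database level, the reverse lookup described above amounts to a filter on the foreign-key column of the task table, so no separate link table is involved. A minimal sketch with the standard-library sqlite3 module follows. The table names (users, tasks), the index name, and the helper tasks_for_user are illustrative assumptions, not what Django would generate.

```python
import sqlite3

# Hypothetical schema: each task row points at exactly one user.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE tasks (
        id INTEGER PRIMARY KEY,
        text TEXT,
        user_id INTEGER NOT NULL REFERENCES users(id)
    );
    -- An index on the foreign-key column keeps the reverse lookup cheap.
    CREATE INDEX tasks_user_id_idx ON tasks (user_id);
    INSERT INTO users (id, username) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO tasks (text, user_id) VALUES
        ('buy milk', 1), ('write report', 1), ('call mom', 2);
""")


def tasks_for_user(conn, user_id):
    """Return the task texts that belong to one user, oldest first."""
    rows = conn.execute(
        "SELECT text FROM tasks WHERE user_id = ? ORDER BY id", (user_id,)
    ).fetchall()
    return [text for (text,) in rows]


print(tasks_for_user(conn, 1))
```

The one-to-many relation lives entirely in tasks.user_id, which is the role the ForeignKey on Task plays in the models above.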
There is, however, one scenario where Django creates an implicit model that links two models together: when one defines a ManyToManyField [Django-doc]. Storing an array of identifiers in a column is hard (or even impossible) in many relational databases, and even where it is possible, one can typically no longer guarantee FOREIGN KEY constraints.
In case the many-to-many relation between two models contains extra data (for example a timestamp when two users became friends), then one even defines an explicit model with two ForeignKey relations (and extra attributes), and uses this as a through [Django-doc] model.

How to combine django "prefetch_related" and "values" methods?

How can prefetch_related and values method be applied in combination?
Previously, I had the following code. Limiting fields in this query is required for performance optimization.
Organizations.objects.values('id','name').order_by('name')
Now, I need to prefetch its association and append it in the serializer using "prefetch_related" method.
Organizations.objects.prefetch_related('locations').order_by('name')
Here, I cannot seem to find a way to limit the fields after using "prefetch_related".
I have tried the following, but on doing so serializer does not see the associated "locations".
Organizations.objects.prefetch_related('locations').values("id", "name").order_by('name')
Model Skeleton:
class Organizations(models.Model):
    name = models.CharField(max_length=40)
class Location(models.Model):
    name = models.CharField(max_length=50)
    organization = models.ForeignKey(Organizations, to_field="name", db_column="organization_name", related_name='locations')
    class Meta:
        db_table = u'locations'
Use only() to limit the number of fields retrieved if you are concerned about your app's performance. See the reference.
In the example above, this would be:
Organizations.objects.prefetch_related('locations').only('id', 'name').order_by('name')
which would result in two queries:
SELECT id, name FROM organizations;
SELECT * from locations WHERE organization_name = <whatever is seen before>;
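The two-query pattern above can be imitated by hand to show what the prefetch does conceptually: one narrow query for the parent rows, one query for all related rows, then a join in Python. This is only a sketch using the standard-library sqlite3 module. The sample data and the helper organizations_with_locations are made up for illustration, and Django's actual SQL and caching differ in detail.

```python
import sqlite3
from collections import defaultdict

# Hypothetical data mirroring the model skeleton: locations reference
# organizations by name through the organization_name column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organizations (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE locations (
        id INTEGER PRIMARY KEY,
        name TEXT,
        organization_name TEXT REFERENCES organizations(name)
    );
    INSERT INTO organizations (name) VALUES ('Acme'), ('Zenith');
    INSERT INTO locations (name, organization_name) VALUES
        ('Berlin', 'Acme'), ('Paris', 'Acme'), ('Oslo', 'Zenith');
""")


def organizations_with_locations(conn):
    # Query 1: only the columns needed from the parent table.
    orgs = conn.execute(
        "SELECT id, name FROM organizations ORDER BY name"
    ).fetchall()

    # Query 2: all related rows at once, keyed by the join column.
    names = [name for _, name in orgs]
    by_org = defaultdict(list)
    if names:
        placeholders = ", ".join("?" for _ in names)
        query = (
            "SELECT name, organization_name FROM locations "
            f"WHERE organization_name IN ({placeholders}) ORDER BY name"
        )
        for loc_name, org_name in conn.execute(query, names):
            by_org[org_name].append(loc_name)

    # Stitch the two result sets together in Python.
    return [
        {"id": org_id, "name": name, "locations": by_org[name]}
        for org_id, name in orgs
    ]


for org in organizations_with_locations(conn):
    print(org)
```

Note that the join column (name here) has to be among the selected parent fields, otherwise the second result set cannot be matched back to its organization.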

Django aggregate filters

I have 3 models similar to the below, and I am trying to get the latest sale date for my items in a single query, which is definitely possible using SQL, but I am trying to use the built in Django functionality:
class Item(models.Model):
    name = models.CharField()
    ...
class InventoryEntry(models.Model):
    delta = models.IntegerField()
    item = models.ForeignKey("Item")
    receipt = models.ForeignKey("Receipt", null=True)
    created = models.DateTimeField(default=timezone.now)
    ...
class Receipt(models.Model):
    amt = models.IntegerField()
    ...
What I am trying to do is query my items and annotate the last time a sale was made on them. The InventoryEntry model can be queried for whether or not an entry was a sale based on the existence of a receipt (inventory can also be adjusted because of an order, or being stolen, etc, and I am only interested in the most recent sale).
My query right now looks something like this, which currently just gets the latest of ANY inventory entry. I want to filter the annotation to only return the max value of created when receipt__isnull=False on the InventoryEntry:
Item.objects.filter(**item_qs_kwargs).annotate(latest_sale_date=Max('inventoryentry_set__created'))
I attempted to use the When query expression but it did not work as intended, so perhaps I misused it. Any insight would be appreciated
A solution with conditional expressions should work like this:
from django.db.models import Max, Case, When, F
sale_date = Case(When(
    inventoryentry__receipt=None,
    then=None
), default=F('inventoryentry__created'))
qs = Item.objects.annotate(latest_sale_date=Max(sale_date))
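The conditional expression above works because MAX skips NULL values: entries without a receipt are mapped to NULL and drop out of the aggregate. The same idea can be checked in plain SQL. The sketch below uses the standard-library sqlite3 module with made-up table names and sample rows, and the helper latest_sale_dates is hypothetical.

```python
import sqlite3

# Hypothetical data: item 1 has two sales and one later non-sale adjustment,
# item 2 has only a non-sale adjustment.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE inventoryentry (
        id INTEGER PRIMARY KEY,
        item_id INTEGER REFERENCES item(id),
        receipt_id INTEGER,
        created TEXT
    );
    INSERT INTO item (id, name) VALUES (1, 'widget'), (2, 'gadget');
    INSERT INTO inventoryentry (item_id, receipt_id, created) VALUES
        (1, 10,   '2024-01-05'),
        (1, NULL, '2024-02-01'),
        (1, 11,   '2024-01-20'),
        (2, NULL, '2024-03-01');
""")


def latest_sale_dates(conn):
    """Map each item name to the newest entry date that has a receipt."""
    rows = conn.execute("""
        SELECT i.name,
               -- Non-sale rows become NULL, which MAX ignores.
               MAX(CASE WHEN e.receipt_id IS NULL THEN NULL
                        ELSE e.created END) AS latest_sale_date
        FROM item AS i
        LEFT JOIN inventoryentry AS e ON e.item_id = i.id
        GROUP BY i.id, i.name
        ORDER BY i.id
    """).fetchall()
    return dict(rows)


print(latest_sale_dates(conn))
```

The widget's newest entry overall is the non-sale adjustment from February, yet the aggregate reports the January sale. The gadget has no sale at all, so its value is None.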
I tried a modified solution. Have a look:
from django.db.models import F, Max
Item.objects\
    .annotate(latest_inventoryentry_id=Max('inventoryentry__created'))\
    .filter(
        inventoryentry__id=F('latest_inventoryentry_id'),
        inventoryentry__receipt=None
    )
I did not test this manually; please check and let me know.
Thanks

Django Two Model Fields, same DB Column

We are trying to work with legacy DB Tables that were generated outside of Django and are not structured in an ideal way. We also can not modify the existing tables.
The DB uses the same user ID (pk) across all the tables, whether or not there is a record for that user ID. It also uses that ID as the PK on the other tables, rather than relying on them to auto-increment their own IDs.
So imagine something like this below:
class Items(models.Model):
    user_id = models.ForeignKey('User', db_column='UserID')
class User(models.Model):
    user_id = models.IntegerField(primary_key=True)
class UserTypeA(models.Model):
    user_id = models.IntegerField(primary_key=True) # Same Value as User
class UserTypeB(models.Model):
    user_id = models.IntegerField(primary_key=True) # Same Value as User
To create a relationship between Items and UserTypeA (as well as UserTypeB), we thought of adding further fields that use the same column as user_id.
class Items(models.Model):
    user_id = models.ForeignKey('User', db_column='UserID')
    user_type_a = models.ForeignKey('UserTypeA', db_column='UserID')
    user_type_b = models.ForeignKey('UserTypeB', db_column='UserID')
This unfortunately returns a "db_column is already used" type error.
Any thoughts on a better approach to what we're trying to do?
A detail to note is that we only ever read from this database (no updates), so a read-only solution is fine.
Thanks,
-RB
I've solved a similar problem with this (this code should be put before the definition of your Model):
from django.db.models.signals import class_prepared
def remove_field(sender, **kwargs):
    if sender.__name__ == "MyModel":
        sender._meta.local_fields.remove(sender.myFKField.field)
class_prepared.connect(remove_field)
(Tested in Django 1.5.11)
Django uses local_fields to make the CREATE TABLE query.
So I connected a handler to the class_prepared signal that checks whether the sender is the class I expect. If so, it removes the field from that list.
After doing that, the CREATE TABLE query no longer included the field with the duplicate db_column, and the error did not occur.
The model still works properly (manager methods still populate the field that was removed from local_fields), but I can't tell what the real impact of this is.

queries in django

How do I query Employee to get all the addresses related to an employee? Employee.Add.all() does not work.
class Employee(models.Model):
    Add = models.ManyToManyField(Address)
    parent = models.ManyToManyField(Parent, blank=True, null=True)
class Address(models.Model):
    address_emp = models.CharField(max_length=512)
    description = models.TextField()
    def __unicode__(self):
        return self.address_emp
Employee.objects.get(pk=1).Add.all()
You need to specify which employee you mean. pk=1 is obviously an example (the employee with primary key equal to 1).
BTW, there is a strong convention to use lowercase letters for field names. Employee.objects.get(pk=1).addresses.all() would look much better.
Employee.Add.all() does not work because you are trying to access a related field on the model class, and this kind of query requires an instance of the model, as in Ludwik's example. To load the employees together with their related addresses, note that select_related only follows ForeignKey and one-to-one relations; for a ManyToManyField such as Add, use prefetch_related instead:
Employee.objects.prefetch_related('Add').all()
That would do the trick.
employee = Employee.objects.prefetch_related('Add')
[emp.Add.all() for emp in employee]
prefetch_related supports many-to-many and reverse foreign-key relationships. It fetches the related objects in a separate query and caches them on the queryset, which reduces the number of database hits and so improves performance.
