Get time based model statistics in django - python

I know this is not a Django question per se, but I am working with Django models and would like a Django-specific solution.
Suppose I have a model like this:
class Foo(models.Model):
    type = models.IntegerField()
    timestamp = models.DateTimeField(auto_now_add=True)
Now, what is the best way to get a count of all objects of a given type (say 1), broken down by date?
For example, get_stat(type=1) should tell me how many objects of type 1 were created on 12/10/2018, on 13/10/2018, on 14/10/2018 and so on.

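I think you need to use GROUP BY. See this answer: How to query as GROUP BY in django?
from django.db.models import Count
from django.db.models.functions import TruncDate
@classmethod
def get_stat(cls, type):
    # Truncate to the calendar day so rows are grouped per date, not per exact timestamp.
    return cls.objects.filter(type=type).annotate(
        day=TruncDate('timestamp')
    ).values('day').annotate(count=Count('id')).order_by('day')
This classmethod on Foo is an example for your case; it yields one row per day with the keys 'day' and 'count'.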
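To see what a per-day grouped count is expected to return, here is a minimal sketch of roughly equivalent SQL, run through Python's sqlite3 module. The table name foo, the sample rows and this standalone get_stat signature are assumptions made for illustration; the SQL that Django actually generates depends on the database backend.

```python
import sqlite3

# Hypothetical in-memory table mirroring Foo(type, timestamp).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE foo (id INTEGER PRIMARY KEY, type INTEGER, timestamp TEXT)"
)
rows = [
    (1, "2018-10-12 09:15:00"),
    (1, "2018-10-12 17:40:00"),
    (1, "2018-10-13 08:05:00"),
    (2, "2018-10-13 11:30:00"),
    (1, "2018-10-14 23:59:00"),
]
conn.executemany("INSERT INTO foo (type, timestamp) VALUES (?, ?)", rows)


def get_stat(conn, type_):
    """Return (day, count) pairs for rows of the given type, one per calendar day."""
    query = """
        SELECT date(timestamp) AS day, COUNT(id) AS count
        FROM foo
        WHERE type = ?
        GROUP BY date(timestamp)
        ORDER BY day
    """
    return conn.execute(query, (type_,)).fetchall()


stats = get_stat(conn, 1)
```

The key point is that the grouping expression is the date part of the timestamp; grouping on the raw timestamp would put almost every row in its own group.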

Related

Django filtering: from a list of IDs

I'm using Django Rest Framework.
The model class is
class MyModel(models.Model):
    id = models.CharField(max_length=200)
    name = models.CharField(max_length=200)
    genre = models.CharField(max_length=200)
What I have set up so far is that, when the user makes a POST request, the backend takes the request data and runs a Python script (which takes some parameters from the request data), which in turn returns a list of IDs corresponding to the "id" field in MyModel. The problem is this: suppose I want to return only the IDs that point to model instances with the genre "forensic". How do I do that?
I don't really have a clue how to do that, apart from querying each ID returned by the Python script and filtering out the ones I want based on the genre returned by each query.
Maybe you can try like this:
MyModel.objects.filter(id__in=IDS, genre='forensic').values_list('id', flat=True) # assuming IDS come from the script
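The single filter above avoids one query per ID. As an illustration of the idea, here is a minimal sketch of a comparable SQL query using Python's sqlite3 module; the table name mymodel, the sample rows and the helper filter_ids_by_genre are hypothetical and are not part of the original code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mymodel (id TEXT PRIMARY KEY, name TEXT, genre TEXT)")
conn.executemany(
    "INSERT INTO mymodel VALUES (?, ?, ?)",
    [
        ("a1", "Case 1", "forensic"),
        ("b2", "Case 2", "drama"),
        ("c3", "Case 3", "forensic"),
    ],
)


def filter_ids_by_genre(conn, ids, genre):
    """Return the subset of ids whose row has the given genre, in one query."""
    if not ids:
        return []  # an empty IN () list is not valid SQL, so short-circuit
    placeholders = ", ".join("?" for _ in ids)
    query = (
        f"SELECT id FROM mymodel WHERE id IN ({placeholders}) "
        "AND genre = ? ORDER BY id"
    )
    return [row[0] for row in conn.execute(query, (*ids, genre))]


ids_from_script = ["a1", "b2", "zz"]  # stand-in for the script's output
forensic_ids = filter_ids_by_genre(conn, ids_from_script, "forensic")
```

IDs that do not exist in the table, such as "zz" here, simply drop out of the result.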

Django aggregate filters

I have 3 models similar to the below, and I am trying to get the latest sale date for my items in a single query, which is definitely possible using SQL, but I am trying to use the built in Django functionality:
class Item(models.Model):
    name = models.CharField()
    ...
class InventoryEntry(models.Model):
    delta = models.IntegerField()
    item = models.ForeignKey("Item")
    receipt = models.ForeignKey("Receipt", null=True)
    created = models.DateTimeField(default=timezone.now)
    ...
class Receipt(models.Model):
    amt = models.IntegerField()
    ...
What I am trying to do is query my items and annotate the last time a sale was made on them. The InventoryEntry model can be queried for whether or not an entry was a sale based on the existence of a receipt (inventory can also be adjusted because of an order, or being stolen, etc, and I am only interested in the most recent sale).
My query right now looks something like this, which currently just gets the latest of ANY inventory entry. I want to filter the annotation to only return the max value of created when receipt__isnull=False on the InventoryEntry:
Item.objects.filter(**item_qs_kwargs).annotate(latest_sale_date=Max('inventoryentry_set__created'))
I attempted to use the When query expression but it did not work as intended, so perhaps I misused it. Any insight would be appreciated
A solution with conditional expressions should work like this:
from django.db.models import Max, Case, When, F
sale_date = Case(
    When(inventoryentry__receipt=None, then=None),
    default=F('inventoryentry__created'),
)
qs = Item.objects.annotate(latest_sale_date=Max(sale_date))
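The conditional expression works because MAX ignores NULL values, so entries without a receipt drop out of the aggregate. Here is a minimal sketch of the corresponding SQL idea, run with Python's sqlite3 module; the table names, column names and sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE inventoryentry (
        id INTEGER PRIMARY KEY,
        item_id INTEGER REFERENCES item(id),
        receipt_id INTEGER,  -- NULL when the entry is not a sale
        created TEXT
    );
    INSERT INTO item (id, name) VALUES (1, 'widget'), (2, 'gadget');
    INSERT INTO inventoryentry (item_id, receipt_id, created) VALUES
        (1, 10,   '2018-01-05'),
        (1, NULL, '2018-02-01'),
        (1, 11,   '2018-01-20'),
        (2, NULL, '2018-03-01');
""")

# Non-sale entries are mapped to NULL, which MAX() skips.
latest_sales = conn.execute("""
    SELECT item.name,
           MAX(CASE WHEN e.receipt_id IS NULL THEN NULL ELSE e.created END)
    FROM item
    LEFT JOIN inventoryentry AS e ON e.item_id = item.id
    GROUP BY item.id
    ORDER BY item.id
""").fetchall()
```

In this sample data the widget's most recent entry is a non-sale adjustment, so its latest sale date is the earlier receipt-backed entry, and the gadget, which has never been sold, gets NULL.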
Here is a modified approach you could try:
from django.db.models import F, Max
# Untested: annotate each item with its latest entry date, then match on that date.
Item.objects \
    .annotate(latest_entry_created=Max('inventoryentry__created')) \
    .filter(
        inventoryentry__created=F('latest_entry_created'),
        inventoryentry__receipt__isnull=False,
    )
I have not tested this, so please check it and let me know.
Thanks

Two way table and django ORM

Consider two django models 'User' and 'BoardGame', the latter has a ManyToMany field 'vote' defined with a custom through table:
class Vote(models.Model):
    user = models.ForeignKey(User)
    boardgame = models.ForeignKey(BoardGame)
    vote = models.IntegerField()
I need to print a two-way table with user names along the top, boardgame names in the left column and votes in the middle.
Is there a way to obtain this using Django? (Remember that a user might not have voted on every single boardgame.)
UPDATE: MORE DETAILS
1) Clearly this can be worked out with a few lines of Python (which would probably result in many queries to the database), but I'm more interested in discovering whether something directly implemented in Django could do the work. After all, a ManyToMany field is nothing but a two-way table (in this case with some data associated).
2) A possible 'solution' would be a FULL OUTER JOIN using a raw query, but, again, I am looking for something built into Django.
3) More specifically, I'm using class-based views and I was wondering whether there is an appropriate query to assign to the queryset attribute of ListView.
Assuming:
class User(models.Model):
    ...
class BoardGame(models.Model):
    users = models.ManyToManyField(User, through='Vote')
    ...
class Vote(models.Model):
    user = models.ForeignKey(User)
    boardgame = models.ForeignKey(BoardGame)
    vote = models.IntegerField()
something like this would work:
from django.db import connection, reset_queries
reset_queries()
users = User.objects.all().prefetch_related('vote_set')
table = []
# Header row: an empty corner cell followed by one column per user.
table.append([''] + list(users))
for board_game in BoardGame.objects.all().prefetch_related('vote_set'):
    row = [board_game]
    for user in users:
        for vote in user.vote_set.all():
            if vote in board_game.vote_set.all():
                row.append(vote)
                break
        else:
            # No vote by this user for this board game.
            row.append('')
    table.append(row)
len(connection.queries)  # Always 4 (queries are only recorded when DEBUG=True)
This is not the solution you wanted, but it shows a way to get the table from the database with only 4 queries no matter how many objects you have.
I don't think there is anything in the Django core or the generic class-based views that will render tables for you, but you could try django-tables2.
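The nested membership test in the loop above can also be replaced by a dictionary keyed on (boardgame, user). Here is a minimal sketch in plain Python, assuming the votes have already been fetched as (boardgame, user, vote) tuples, for instance with values_list; the function name build_vote_table and the sample data are hypothetical.

```python
def build_vote_table(users, boardgames, votes):
    """Build a two-way table: a header row of users, then one row per boardgame.

    votes is an iterable of (boardgame, user, vote) tuples; a missing vote
    is rendered as an empty string.
    """
    lookup = {(boardgame, user): vote for boardgame, user, vote in votes}
    table = [[''] + list(users)]
    for boardgame in boardgames:
        row = [boardgame]
        row.extend(lookup.get((boardgame, user), '') for user in users)
        table.append(row)
    return table


users = ['alice', 'bob']
boardgames = ['Catan', 'Carcassonne']
votes = [('Catan', 'alice', 8), ('Catan', 'bob', 6), ('Carcassonne', 'bob', 9)]
table = build_vote_table(users, boardgames, votes)
```

Each cell then costs one dictionary lookup, and the whole table needs only the three lists, however they are queried.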

django perform logic on fields during aggregation (instead of directly doing aggregate(Sum ('somefield')))

Code:
class MyModel(models.Model):
    start_date = models.DateTimeField()
    end_date = models.DateTimeField()
    name = models.CharField()
    somenum = models.IntegerField()
If I want to calculate the sum of all 'somenum' values (the last field in the model above), I can do this:
queryset.aggregate(Sum('somenum'))
My requirement:
The sum of (end_date - start_date) over all records, expressed as a number of days and excluding Saturdays and Sundays.
I can do this using plain Python logic, but I think aggregate or some other database-level way is preferable.
Ways I know:
1) Loop over every record of the queryset, calculate the number of days (excluding Saturdays and Sundays) for each record, and accumulate the total (sum = 0 before the loop, sum += current_value inside it).
2) Write a signal that performs this calculation when a model is saved and stores the calculated sum on UserProfile. When we want the value, we can read it from UserProfile, but I think this process is error-prone.
3) Add another field, 'num_of_days', to MyModel, and have a signal calculate and save that field whenever a model instance is saved. Then we can use queryset.aggregate(Sum('num_of_days')).
On the first approach (looping in Python): every time you want this data, you have to query all MyModel rows and sum them in Python; this is neither efficient nor scalable.
On the second approach (storing the sum on UserProfile): why would you save it to UserProfile? The number of days seems to belong on MyModel.
On the third approach (a 'num_of_days' field): I think this should be the preferred method. It is simple, elegant, and easy to work with, and it lets the database do the aggregation.
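Whichever approach stores or sums the value, the per-record calculation is the same. Here is a minimal sketch of counting weekdays between two dates in plain Python; the function name count_weekdays is hypothetical, and treating the end date as exclusive is an assumption (the question does not say whether the end day counts). For DateTimeField values you would pass start_date.date() and end_date.date().

```python
from datetime import date, timedelta


def count_weekdays(start, end):
    """Count the days in [start, end) that fall on Monday through Friday."""
    total_days = (end - start).days
    if total_days <= 0:
        return 0
    # Every full 7-day block contains exactly 5 weekdays.
    full_weeks, remainder = divmod(total_days, 7)
    count = full_weeks * 5
    # The leftover days have the same weekdays as the first `remainder` days.
    for offset in range(remainder):
        if (start + timedelta(days=offset)).weekday() < 5:
            count += 1
    return count


# 12 October 2018 was a Friday, so Friday to Monday contains one weekday.
friday_to_monday = count_weekdays(date(2018, 10, 12), date(2018, 10, 15))
```

The result of such a function could be stored in the 'num_of_days' field when the record is saved, so that the final sum is a plain aggregate over that field.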

Django aggregation query on related one-to-many objects

Here is my simplified model:
class Item(models.Model):
    pass
class TrackingPoint(models.Model):
    item = models.ForeignKey(Item)
    created = models.DateField()
    data = models.IntegerField()
    class Meta:
        unique_together = ('item', 'created')
In many parts of my application I need to retrieve a set of Item's and annotate each item with data field from latest TrackingPoint from each item ordered by created field. For example, instance i1 of class Item has 3 TrackingPoint's:
tp1 = TrackingPoint(item=i1, created=date(2010,5,15), data=23)
tp2 = TrackingPoint(item=i1, created=date(2010,5,14), data=21)
tp3 = TrackingPoint(item=i1, created=date(2010,5,12), data=120)
I need a query to retrieve i1 instance annotated with tp1.data field value as tp1 is the latest tracking point ordered by created field. That query should also return Item's that don't have any TrackingPoint's at all. If possible I prefer not to use QuerySet's extra method to do this.
That's what I tried so far... and failed :(
Item.objects.annotate(
    max_created=Max('trackingpoint__created'),
    data=Avg('trackingpoint__data'),
).filter(trackingpoint__created=F('max_created'))
Any ideas?
Here's a single query that will provide (TrackingPoint, Item)-pairs:
TrackingPoint.objects.annotate(max=Max('item__trackingpoint__created')).filter(max=F('created')).select_related('item').order_by('created')
You would have to query for items without TrackingPoints separately.
This doesn't directly answer your question, but in case you don't need exactly what you described, you might be interested in a greatest-n-per-group solution. Take a look at my answer to a similar question:
Django Query That Get Most Recent Objects From Different Categories
It should apply directly to your case:
items = Item.objects.annotate(tracking_point_created=Max('trackingpoint__created'))
trackingpoints = TrackingPoint.objects.filter(created__in=[b.tracking_point_created for b in items])
Note that the second line can produce ambiguous results if created dates repeat in the TrackingPoint model.
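For comparison, here is a minimal sketch of the greatest-n-per-group idea expressed as one SQL query, run with Python's sqlite3 module. It matches on both the item and its latest created date, so repeated dates on other items are not picked up, and items without tracking points are kept. The table names and sample rows are assumptions for illustration; in Django versions that provide Subquery and OuterRef, a correlated subquery of this kind can be written in the ORM as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY);
    CREATE TABLE trackingpoint (
        id INTEGER PRIMARY KEY,
        item_id INTEGER REFERENCES item(id),
        created TEXT,
        data INTEGER,
        UNIQUE (item_id, created)
    );
    INSERT INTO item (id) VALUES (1), (2);
    INSERT INTO trackingpoint (item_id, created, data) VALUES
        (1, '2010-05-15', 23),
        (1, '2010-05-14', 21),
        (1, '2010-05-12', 120);
""")

# LEFT JOIN keeps items that have no tracking points at all.
rows = conn.execute("""
    SELECT item.id, tp.created, tp.data
    FROM item
    LEFT JOIN trackingpoint AS tp
           ON tp.item_id = item.id
          AND tp.created = (
              SELECT MAX(latest.created)
              FROM trackingpoint AS latest
              WHERE latest.item_id = item.id
          )
    ORDER BY item.id
""").fetchall()
```

Item 1 comes back with the data of its most recent tracking point, and item 2, which has none, comes back with NULLs.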
