I am creating a chart for data analytics, so I need to group the count by month for the whole year.
My model:
class Application:
    reference
    created_at
From the above, I need the count of applications for each month of the current year. With my current query I get the data, but months that have no applications are missing from the result:
My query:
queryset = Application.objects.filter(user=user).annotate(month=TruncMonth('created_at')).values('month').annotate(_applications=Count('id')).order_by('month')
For example, if I have data for January and February, the query above returns only those two months, but I need the result to contain 0 for every month without data.
If March has no data, the result for that month should be 0. How can I do this?
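You can build the dataset manually from the query results:
import datetime
from django.db.models import Count
from django.db.models.functions import TruncMonth
year = 2021
# Restrict to one year so the same month from different years is not merged.
queryset = Application.objects.filter(user=user, created_at__year=year).annotate(
    month=TruncMonth('created_at')).values('month').annotate(
    _applications=Count('id')).order_by('month')
# Map month number -> count for the months that have data.
applications_by_month = {
    m['month'].month: m['_applications'] for m in queryset
}
# Emit all twelve months, defaulting to 0 where there is no data.
dataset = []
for month in range(1, 13):
    dataset.append({
        "month": datetime.date(year=year, month=month, day=1),
        "applications": applications_by_month.get(month, 0)
    })
print(dataset)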
Output
[{'month': datetime.date(2021, 1, 1), 'applications': 0},
{'month': datetime.date(2021, 2, 1), 'applications': 0},
{'month': datetime.date(2021, 3, 1), 'applications': 1},
{'month': datetime.date(2021, 4, 1), 'applications': 0},
{'month': datetime.date(2021, 5, 1), 'applications': 0},
{'month': datetime.date(2021, 6, 1), 'applications': 1},
{'month': datetime.date(2021, 7, 1), 'applications': 0},
{'month': datetime.date(2021, 8, 1), 'applications': 0},
{'month': datetime.date(2021, 9, 1), 'applications': 0},
{'month': datetime.date(2021, 10, 1), 'applications': 0},
{'month': datetime.date(2021, 11, 1), 'applications': 0},
{'month': datetime.date(2021, 12, 1), 'applications': 0}]
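The zero-filling step does not depend on Django, so it can be tried in isolation. A minimal sketch, assuming the rows have the same shape as the `.values('month')` result above; `fill_missing_months` and the sample `rows` are hypothetical names used only for illustration:

```python
import datetime


def fill_missing_months(rows, year):
    """Return twelve entries for `year`, using 0 for months absent from `rows`.

    `rows` is assumed to look like the queryset result: dicts with a
    'month' date and an '_applications' count.
    """
    counts = {
        row["month"].month: row["_applications"]
        for row in rows
        if row["month"].year == year  # ignore rows from other years
    }
    return [
        {
            "month": datetime.date(year, month, 1),
            "applications": counts.get(month, 0),
        }
        for month in range(1, 13)
    ]


# Hypothetical sample rows: data only for March and June 2021.
rows = [
    {"month": datetime.date(2021, 3, 1), "_applications": 1},
    {"month": datetime.date(2021, 6, 1), "_applications": 1},
]
dataset = fill_missing_months(rows, 2021)
```

Keeping the year check inside the helper means rows from other years cannot be folded into the wrong month.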
I'm trying to group data by date and, for each day, calculate the count on that day as well as the cumulative count so far.
Sample output I'm getting:
[
    {
        "dates": "2022-11-07",
        "count": 1
    },
    {
        "dates": "2022-11-08",
        "count": 3
    },
    {
        "dates": "2022-11-09",
        "count": 33
    }
]
Sample output I'm trying to achieve:
[
    {
        "dates": "2022-11-07",
        "count": 1,
        "cumulative_count": 1
    },
    {
        "dates": "2022-11-08",
        "count": 3,
        "cumulative_count": 4
    },
    {
        "dates": "2022-11-09",
        "count": 33,
        "cumulative_count": 37
    }
]
Here's my query:
self.serializer_class.Meta.model.objects.all().annotate(dates=TruncDate("date__date")).values("dates").order_by("dates").annotate(count=Count("channel", distinct=True)).values("count", "dates")
How can I extend this query to get a cumulative sum as well?
I tried to solve your problem like this:
models.py
class Demo(models.Model):
    count = models.IntegerField()
    dates = models.DateField()
serializers.py
class DemoSerializer(serializers.ModelSerializer):
    class Meta:
        model = Demo
        fields = "__all__"
views.py
class DemoAPI(APIView):
    def get(self, request, pk=None, format=None):
        # Order by date so the running total accumulates chronologically.
        data = Demo.objects.order_by('dates')
        cumulative_count = 0
        # Normal Django ORM queryset
        print('--------- Default Queryset Response ---------')
        for i in data:
            del i.__dict__['_state']
            print(i.__dict__)
        # Add a cumulative_count key to each object in the queryset
        for i in data:
            cumulative_count += i.__dict__['count']
            i.__dict__['cumulative_count'] = cumulative_count
        # Updated Django ORM queryset with cumulative_count
        print('--------- Updated Queryset Response ---------')
        for i in data:
            # del i.__dict__['_state']
            print(i.__dict__)
Output before deleting the _state key from the queryset objects:
#--------- Default Queryset Response ---------
{'_state': <django.db.models.base.ModelState object at 0x000001A07002A680>, 'id': 1, 'count': 1, 'dates': datetime.date(2022, 11, 7)}
{'_state': <django.db.models.base.ModelState object at 0x000001A07002A5C0>, 'id': 2, 'count': 3, 'dates': datetime.date(2022, 11, 8)}
{'_state': <django.db.models.base.ModelState object at 0x000001A07002A7A0>, 'id': 3, 'count': 33, 'dates': datetime.date(2022, 11, 9)}
#--------- Updated Queryset Response ---------
{'_state': <django.db.models.base.ModelState object at 0x000002DAB66E0AC0>, 'id': 1, 'count': 1, 'dates': datetime.date(2022, 11, 7), 'cumulative_count': 1}
{'_state': <django.db.models.base.ModelState object at 0x000002DAB66E0C10>, 'id': 2, 'count': 3, 'dates': datetime.date(2022, 11, 8), 'cumulative_count': 4}
{'_state': <django.db.models.base.ModelState object at 0x000002DAB66E0D60>, 'id': 3, 'count': 33, 'dates': datetime.date(2022, 11, 9), 'cumulative_count': 37}
Output after deleting the _state key, with the cumulative_count key added:
#--------- Default Queryset Response ---------
{'id': 1, 'count': 1, 'dates': datetime.date(2022, 11, 7)}
{'id': 2, 'count': 3, 'dates': datetime.date(2022, 11, 8)}
{'id': 3, 'count': 33, 'dates': datetime.date(2022, 11, 9)}
#--------- Updated Queryset Response ---------
{'id': 1, 'count': 1, 'dates': datetime.date(2022, 11, 7), 'cumulative_count': 1}
{'id': 2, 'count': 3, 'dates': datetime.date(2022, 11, 8), 'cumulative_count': 4}
{'id': 3, 'count': 33, 'dates': datetime.date(2022, 11, 9), 'cumulative_count': 37}
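If the per-day counts are already available as a list of dicts (for example from a `.values("dates", "count")` query), the running total can also be added with `itertools.accumulate`. A minimal sketch; the `rows` list is sample data copied from the question, and `with_cumulative` is a hypothetical helper name:

```python
from itertools import accumulate


def with_cumulative(rows):
    """Return copies of `rows`, sorted by 'dates', with a 'cumulative_count' key."""
    ordered = sorted(rows, key=lambda r: r["dates"])
    # accumulate() yields the running sum of the per-day counts.
    totals = accumulate(r["count"] for r in ordered)
    return [{**r, "cumulative_count": t} for r, t in zip(ordered, totals)]


rows = [
    {"dates": "2022-11-07", "count": 1},
    {"dates": "2022-11-08", "count": 3},
    {"dates": "2022-11-09", "count": 33},
]
result = with_cumulative(rows)
```

Because the helper builds new dicts, the original rows are left unchanged and no model internals such as `__dict__` need to be touched.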
I have this data:
[
{'name': 'INV/2021/0913', 'invoice_date': datetime.date(2021, 3, 12), 'qty_total': 5.0},
{'name': 'INV/2021/0965', 'invoice_date': datetime.date(2021, 3, 14), 'qty_total': 6.0},
{'name': 'INV/2021/0966', 'invoice_date': datetime.date(2021, 3, 14), 'qty_total': 7.0},
{'name': 'INV/2021/0967', 'invoice_date': datetime.date(2021, 3, 14), 'qty_total': 3.0},
{'name': 'INV/2021/0992', 'invoice_date': datetime.date(2021, 3, 15), 'qty_total': 4.0}
]
As can be seen, the middle three dicts have the same date.
I want to combine the dictionaries that have the same invoice_date and sum up their qty_total.
The name attribute should be set to "" for the combined dictionaries.
The result should look like this:
[
{'name': 'INV/2021/0913', 'invoice_date': datetime.date(2021, 3, 12), 'qty_total': 5.0},
{'name': '', 'invoice_date': datetime.date(2021, 3, 14), 'qty_total': 16.0},
{'name': 'INV/2021/0992', 'invoice_date': datetime.date(2021, 3, 15), 'qty_total': 4.0}
]
Use itertools.groupby:
from datetime import datetime
from itertools import groupby
l = [
    {'name': 'INV/2021/0913', 'invoice_date': datetime(2021, 3, 12).date(), 'qty_total': 5.0},
    {'name': 'INV/2021/0965', 'invoice_date': datetime(2021, 3, 14).date(), 'qty_total': 6.0},
    {'name': 'INV/2021/0966', 'invoice_date': datetime(2021, 3, 14).date(), 'qty_total': 7.0},
    {'name': 'INV/2021/0967', 'invoice_date': datetime(2021, 3, 14).date(), 'qty_total': 3.0},
    {'name': 'INV/2021/0992', 'invoice_date': datetime(2021, 3, 15).date(), 'qty_total': 4.0}
]
res = []
# groupby only merges adjacent items, so sort by invoice_date first.
for k, v in groupby(sorted(l, key=lambda x: x["invoice_date"]), key=lambda x: x["invoice_date"]):
    val = list(v)
    res.append(
        # Use an empty name when several invoices were combined.
        {"name": "" if len(val) > 1 else val[0]["name"], "invoice_date": k, "qty_total": sum(vals["qty_total"] for vals in val)}
    )
print(res)
Output
[{'name': 'INV/2021/0913',
  'invoice_date': datetime.date(2021, 3, 12),
  'qty_total': 5.0},
 {'name': '', 'invoice_date': datetime.date(2021, 3, 14), 'qty_total': 16.0},
 {'name': 'INV/2021/0992',
  'invoice_date': datetime.date(2021, 3, 15),
  'qty_total': 4.0}]
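The sorted() call in the loop matters because groupby only merges consecutive items with equal keys. A small sketch with made-up date strings shows the difference:

```python
from itertools import groupby

# Made-up keys: the two "03-14" entries are not next to each other.
dates = ["03-14", "03-12", "03-14"]

# Without sorting, each run of equal keys becomes its own group.
unsorted_groups = [(key, len(list(group))) for key, group in groupby(dates)]

# After sorting, equal keys are adjacent and collapse into one group.
sorted_groups = [(key, len(list(group))) for key, group in groupby(sorted(dates))]
```

Here `unsorted_groups` contains three groups while `sorted_groups` contains two, which is why the input is sorted by invoice_date before grouping.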
I have a list of dictionaries:
data = [{'average': 2, 'day': '2022-01-01'},
{'average': 3, 'day': '2022-01-02'},
{'average': 5, 'day': '2022-01-03'},
{'sum': 8, 'day': '2022-01-01'},
{'sum': 15, 'day': '2022-01-02'},
{'sum': 9, 'day': '2022-01-03'},
{'total_value': 19, 'day': '2022-01-01'},
{'total_value': 99, 'day': '2022-01-02'},
{'total_value': 15, 'day': '2022-01-03'}]
I want my output as:
output = [{'average': 2, 'sum': 8, 'total_value': 19, 'day': '2022-01-01'},
{'average': 3, 'sum': 15, 'total_value': 99, 'day': '2022-01-02'},
{'average': 5, 'sum': 9, 'total_value': 15, 'day': '2022-01-03'}]
The output combines the values based on their date. My approach so far has been to separate everything into different dictionaries (date_dict, sum_dict, etc.) and then bring them all together, but that doesn't seem to work and is extremely sloppy.
You could iterate over data and create a dictionary using day as key:
data = [{'average': 2, 'day': '2022-01-01'},
{'average': 3, 'day': '2022-01-02'},
{'average': 5, 'day': '2022-01-03'},
{'sum': 8, 'day': '2022-01-01'},
{'sum': 15, 'day': '2022-01-02'},
{'sum': 9, 'day': '2022-01-03'},
{'total_value': 19, 'day': '2022-01-01'},
{'total_value': 99, 'day': '2022-01-02'},
{'total_value': 15, 'day': '2022-01-03'}]
output = {}
for item in data:
    if item['day'] not in output:
        # Copy the dict so the original entries in data are not modified.
        output[item['day']] = dict(item)
    else:
        output[item['day']].update(item)
print(list(output.values()))
Out:
[
{'average': 2, 'day': '2022-01-01', 'sum': 8, 'total_value': 19},
{'average': 3, 'day': '2022-01-02', 'sum': 15, 'total_value': 99},
{'average': 5, 'day': '2022-01-03', 'sum': 9, 'total_value': 15}
]
Had a bit of fun and did it with dict/list comprehensions. Check out that neat | operator in Python 3.9+ :-)
Python <3.9
from collections import ChainMap
data_grouped_by_day = {
    day: dict(ChainMap(*[d for d in data if d["day"] == day]))
    for day in {d["day"] for d in data}
}
for day, group_data in data_grouped_by_day.items():
    group_data.update(day=day)
result = list(data_grouped_by_day.values())
Python 3.9+
from collections import ChainMap
result = [
    dict(ChainMap(*[d for d in data if d["day"] == day])) | {"day": day}
    for day in {d["day"] for d in data}
]
The output in both cases is (keys order may vary)
[{'total_value': 99, 'day': '2022-01-02', 'sum': 15, 'average': 3},
{'total_value': 15, 'day': '2022-01-03', 'sum': 9, 'average': 5},
{'total_value': 19, 'day': '2022-01-01', 'sum': 8, 'average': 2}]
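Both comprehension versions rescan data once per day and iterate over a set, so the order of the days is not guaranteed. A possible single-pass alternative, sketched here with `collections.defaultdict` (the name `merge_by_day` is hypothetical and the sample list is shortened), keeps days in first-seen order and leaves the input dicts untouched:

```python
from collections import defaultdict


def merge_by_day(items):
    """Merge dicts that share the same 'day' value into one dict per day."""
    merged = defaultdict(dict)
    for item in items:
        # update() copies the keys into a fresh dict, so `items` is not mutated.
        merged[item["day"]].update(item)
    return list(merged.values())


data = [
    {'average': 2, 'day': '2022-01-01'},
    {'average': 3, 'day': '2022-01-02'},
    {'sum': 8, 'day': '2022-01-01'},
    {'sum': 15, 'day': '2022-01-02'},
]
result = merge_by_day(data)
```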
I would like to sort this list of dicts by a list key and then by date.
I am trying to sort the dicts by 'label' according to label_order and then by descending 'date'.
label_order = [3, 4, 2, 1]
data = [
{'label': 1, 'data': 5, 'date': datetime(2018, 12, 31)},
{'label': 3, 'data': 2, 'date': datetime(2017, 12, 31)},
{'label': 3, 'data': 1, 'date': datetime(2018, 12, 31)},
{'label': 4, 'data': 3, 'date': datetime(2018, 12, 31)},
{'label': 4, 'data': 4, 'date': datetime(2018, 12, 25)},
]
After sorting would look like this:
data = [
{'label': 3, 'data': 1, 'date': datetime(2018, 12, 31)},
{'label': 3, 'data': 2, 'date': datetime(2017, 12, 31)},
{'label': 4, 'data': 3, 'date': datetime(2018, 12, 31)},
{'label': 4, 'data': 4, 'date': datetime(2018, 12, 25)},
{'label': 1, 'data': 5, 'date': datetime(2018, 12, 31)},
]
I've tried lambda expressions and itemgetter, but I am having difficulty combining the right strategies for the sort key. Maybe it is just trying to do too much at one time.
Any help or direction would be appreciated.
A more efficient approach is to build a dict that maps items in label_order to indices, so that you can use the indices as keys when performing the sort:
keys = {n: i for i, n in enumerate(label_order)}
sorted(data, key=lambda d: (-keys[d['label']], d['date']), reverse=True)
This returns:
[{'label': 3, 'data': 1, 'date': datetime(2018, 12, 31)},
{'label': 3, 'data': 2, 'date': datetime(2017, 12, 31)},
{'label': 4, 'data': 3, 'date': datetime(2018, 12, 31)},
{'label': 4, 'data': 4, 'date': datetime(2018, 12, 25)},
{'label': 1, 'data': 5, 'date': datetime(2018, 12, 31)}]
It's a little tricky to sort dates in reverse order. Instead, let's use the negative of the label's index so they're sorted in descending order. Then we can reverse the sorting and get the results in the order we actually want!
from datetime import datetime
label_order = [3, 4, 2, 1]
data = [
{'label': 1, 'data': 5, 'date': datetime(2018, 12, 31)},
{'label': 3, 'data': 2, 'date': datetime(2017, 12, 31)},
{'label': 3, 'data': 1, 'date': datetime(2018, 12, 31)},
{'label': 4, 'data': 3, 'date': datetime(2018, 12, 31)},
{'label': 4, 'data': 4, 'date': datetime(2018, 12, 25)},
]
def descending_sort_key(item):
    return -label_order.index(item['label']), item['date']
data.sort(key=descending_sort_key, reverse=True)
Voila - no date math or other trickery.
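Another way to mix an ascending key with a descending one, sketched here under the assumption that sorting twice is acceptable, relies on Python's sort being stable: sort by the secondary key first (date, newest first), then by the primary key (position in label_order). The `position` dict is a name introduced for this sketch.

```python
from datetime import datetime

label_order = [3, 4, 2, 1]
data = [
    {'label': 1, 'data': 5, 'date': datetime(2018, 12, 31)},
    {'label': 3, 'data': 2, 'date': datetime(2017, 12, 31)},
    {'label': 3, 'data': 1, 'date': datetime(2018, 12, 31)},
    {'label': 4, 'data': 3, 'date': datetime(2018, 12, 31)},
    {'label': 4, 'data': 4, 'date': datetime(2018, 12, 25)},
]

# Map each label to its position in label_order for constant-time lookups.
position = {label: i for i, label in enumerate(label_order)}

# Pass 1: secondary key, newest date first.
by_date = sorted(data, key=lambda d: d['date'], reverse=True)
# Pass 2: primary key; stability preserves the date order within each label.
result = sorted(by_date, key=lambda d: position[d['label']])
```

This avoids negating any key, at the cost of a second sort pass.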
I have a model called Log which has a datetime field created_at.
What I want to do is to calculate the number of Logs for each date.
I tried to do this:
from django.db.models import Count
from django.db.models.functions import TruncDate
Log.objects.annotate(date=TruncDate('created_at')).values('date').annotate(count=Count('id'))
This is giving me the following:
{'date': datetime.date(2018, 1, 17), 'count': 1}, {'date': datetime.date(2018, 1, 17), 'count': 1}, {'date': datetime.date(2018, 1, 17), 'count': 2}
That is, the date is not unique.
The result I want would be this:
{'date': datetime.date(2018, 1, 17), 'count': 4}, {'date': datetime.date(2018, 1, 18), 'count': 2}
How could I approach this problem?
If your Log model defines a default ordering (Meta.ordering), the ordering field can be added to the GROUP BY clause of the query, which can cause this problem. Try clearing the ordering with an empty order_by():
Log.objects.order_by().annotate(date=TruncDate('created_at')).values('date').annotate(count=Count('id'))
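To see why an extra column in GROUP BY produces repeated dates, the grouping can be imitated in plain Python. This is only an illustration with made-up timestamps, not what Django executes: grouping by (date, created_at) keeps rows apart, while grouping by the date alone merges them.

```python
from collections import Counter
from datetime import datetime

# Made-up timestamps: four logs on 17 Jan (two share a timestamp), two on 18 Jan.
created_at = [
    datetime(2018, 1, 17, 9, 0),
    datetime(2018, 1, 17, 10, 0),
    datetime(2018, 1, 17, 11, 0),
    datetime(2018, 1, 17, 11, 0),
    datetime(2018, 1, 18, 9, 0),
    datetime(2018, 1, 18, 9, 30),
]

# Like GROUP BY date, created_at: the extra column splits each date into several groups.
per_date_and_timestamp = Counter((ts.date(), ts) for ts in created_at)

# Like GROUP BY date only: one group per date.
per_date = Counter(ts.date() for ts in created_at)
```

In this sketch 17 January appears three times in the first grouping (with counts 1, 1 and 2) and once in the second (with count 4), mirroring the symptom in the question.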
You can use the distinct() method to get unique values in Django. Note that this returns only the unique dates, without the counts.
Example:
Log.objects.annotate(date=TruncDate('created_at')).values('date').distinct()