I have a model IncomingCorrespondence with an auto-incrementing field ID. I also have a field number, and I want two things for this field:
1. The field should auto-increment its value, just like ID.
2. Every new year its value should start over from 0 (or 1).
ID   Number  Date
...  ...     ...
285  285     2020-03-12
286  286     2020-04-19
287  1       2021-01-01
class IncomingCorrespondence(models.Model):
    ID = models.AutoField(primary_key=True)
    date = models.DateField(null=True)
    number = models.IntegerField(null=True)
How can I do that the most efficient and reliable way?
You do not need to store the number; you can simply derive it from the number of items that have been stored in the database since the year started:
class IncomingCorrespondence(models.Model):
    date = models.DateField(null=True)
    created = models.DateTimeField(auto_now_add=True)

    @property
    def number(self):
        return IncomingCorrespondence._base_manager.filter(
            created__year=self.created.year,
            created__lt=self.created
        ).count() + 1
We thus have a created timestamp that stores the datetime at which the object was created, and we count the IncomingCorrespondence objects of the same year created before that timestamp, plus one. Since deleting a record would shift the derived numbers, you can work with a package like django-softdelete [GitHub] to keep deleted objects in the database and just filter them out when viewing items.
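The derivation itself is plain counting, independent of the ORM. A minimal plain-Python sketch of the same idea (hypothetical data, not the Django code above): count the items created earlier in the same year, plus one.

```python
from datetime import datetime

def derived_number(created, all_created):
    """Number of items created earlier in the same year, plus one."""
    return sum(
        1 for c in all_created
        if c.year == created.year and c < created
    ) + 1

rows = [
    datetime(2020, 3, 12),
    datetime(2020, 4, 19),
    datetime(2021, 1, 1),
]

print(derived_number(rows[1], rows))  # 2: second item of 2020
print(derived_number(rows[2], rows))  # 1: numbering restarts in 2021
```

Because nothing is stored, the numbering can never drift out of sync with the data, at the cost of one count query per lookup.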
Another way might be to assign the maximum number plus one as the field's default:
from django.db.models import Max
from django.utils.timezone import now

def next_number():
    data = IncomingCorrespondence._base_manager.filter(
        date__year=now().year
    ).aggregate(
        max_number=Max('number')
    )['max_number'] or 0
    return data + 1

class IncomingCorrespondence(models.Model):
    ID = models.AutoField(primary_key=True)
    date = models.DateField(auto_now_add=True)
    number = models.IntegerField(default=next_number, editable=False)
Here, however, Django dispatches numbers through a query: if multiple threads create an IncomingCorrespondence concurrently, two objects can end up with the same number. It also depends on the insertion time, not on the date of the IncomingCorrespondence object.
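To make the race window concrete, here is a plain-Python sketch (not Django code) of a per-year counter: the read-increment step is serialized with a lock so two concurrent callers can never receive the same value, and the counter restarts for each new year. At the database level the equivalent is typically a unique constraint on (year, number) plus a retry on conflict, or locking the relevant rows with select_for_update().

```python
import threading
from datetime import date

_lock = threading.Lock()
_counters = {}  # year -> last number handed out

def next_number(year=None):
    """Hand out the next number for the given year, restarting at 1 each year."""
    year = year or date.today().year
    with _lock:  # serialize read-increment so no two callers see the same maximum
        _counters[year] = _counters.get(year, 0) + 1
        return _counters[year]

print(next_number(2020))  # 1
print(next_number(2020))  # 2
print(next_number(2021))  # 1 -> restarts for the new year
```

A process-local lock only protects a single process, of course; with multiple application servers the uniqueness guarantee has to live in the database itself.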
You should compute number by counting the IncomingCorrespondence objects created this year, as above. It should not be done any other way (a cronjob, for example), as that won't be stable: the crontab may fail and you will end up with anomalies (without even being able to notice them), or an instance may be created right before the crontab resets the sequence.
Related
Suppose you have this model:
from django.db import models
from django.contrib.postgres.indexes import BrinIndex

class MyModel(models.Model):
    device_id = models.IntegerField()
    timestamp = models.DateTimeField(auto_now_add=True)
    my_value = models.FloatField()

    class Meta:
        indexes = (BrinIndex(fields=['timestamp']),)
There is a periodic process that creates an instance of this model every 2 minutes or so. This process is supposed to run for years, with multiple devices, so this table will contain a great number of records.
My goal is, for each day when there are records, to get the first and last records in that day.
So far, what I could come up with is this:
from django.db.models import Min, Max

results = []
device_id = 1  # Could be another device id, of course, but 1 for illustration's sake

# This will get me a list of dictionaries that have first and last fields
# with the desired timestamps, but not the field my_value for them.
first_last = MyModel.objects.filter(device_id=device_id).values('timestamp__date')\
    .annotate(first=Min('timestamp'), last=Max('timestamp'))

# So now I have to iterate over that list to get the instances/values
for f in first_last:
    first = f['first']
    last = f['last']
    first_value = MyModel.objects.get(device_id=device_id, timestamp=first).my_value
    last_value = MyModel.objects.get(device_id=device_id, timestamp=last).my_value
    results.append({
        'first': first,
        'last': last,
        'first_value': first_value,
        'last_value': last_value,
    })

# Do something with results[]
# Do something with results[]
This works, but takes a long time (about 50 seconds on my machine, retrieving first and last values for about 450 days).
I have tried other combinations of annotate(), values(), values_list(), extra() etc, but this is the best I could come up with so far.
Any help or insight is appreciated!
You can take advantage of .distinct(*fields) if you are using PostgreSQL as your DBMS.
first_models = MyModel.objects.order_by('timestamp__date', 'timestamp').distinct('timestamp__date')
last_models = MyModel.objects.order_by('timestamp__date', '-timestamp').distinct('timestamp__date')
first_last = first_models.union(last_models)
# do something with first_last
One more thing needs to be mentioned: first_last may eliminate a duplicate when there is only one record for a date, since union() removes duplicate rows. That should not be a problem for you, but if it is, you can iterate over first_models and last_models separately.
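The same first/last-per-day selection can be sketched in plain Python with itertools.groupby (hypothetical data, already sorted by timestamp), which is handy for sanity-checking what the ORM query should return:

```python
from datetime import datetime
from itertools import groupby

rows = [  # (timestamp, my_value), assumed sorted by timestamp
    (datetime(2021, 5, 9, 8, 0), 1.0),
    (datetime(2021, 5, 9, 18, 0), 2.5),
    (datetime(2021, 5, 10, 9, 0), 3.0),
]

first_last = []
for day, group in groupby(rows, key=lambda r: r[0].date()):
    group = list(group)
    # first and last record of the day; identical when the day has one record
    first_last.append({'day': day, 'first': group[0], 'last': group[-1]})

print(first_last[0]['first'][1], first_last[0]['last'][1])  # 1.0 2.5
print(first_last[1]['first'][1], first_last[1]['last'][1])  # 3.0 3.0
```

Note that a single-record day yields the same row as both first and last, which mirrors the duplicate-elimination caveat above.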
I'm trying to create a simple daily time recording app that updates the table row upon submitting.
Here's what I mean: suppose a staff member timed in in the morning; then my table row would look like this:
id  time_in_am              time_out_am  time_in_pm  time_out_pm  staff_id
1   2021-05-09 08:17:07.27  NULL         NULL        NULL         223-8881
and upon submitting or scanning an ID again, it would update time_out_am, and so on until the end of the day, which is time_out_pm.
My problem starts here: how would I know whether the staff member with ID no. 223-8881 has already clocked in today?
I've tried this:
today_dt = datetime(datetime.today().year, datetime.today().month, datetime.today().day)
# check if staff clocked in today
dtr_log = DailyTimeRecord.query.filter(DailyTimeRecord.time_in_am==today_dt, staff_id=staff.id).first()
# end check
using the above code, I get the error: TypeError: filter() got an unexpected keyword argument 'staff_id'
and if I use filter_by(), I get this: filter_by() takes 1 positional argument but 2 were given
heres my model if it helps:
class DailyTimeRecord(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    time_in_am = db.Column(db.DateTime(timezone=True))
    time_out_am = db.Column(db.DateTime(timezone=True))
    time_in_pm = db.Column(db.DateTime(timezone=True))
    time_out_pm = db.Column(db.DateTime(timezone=True))
    staff_id = db.Column(db.Integer, db.ForeignKey('staff.id'))
When you're using filter you need to specify the model and use 'proper' comparisons, e.g. staff_id=staff.id should be DailyTimeRecord.staff_id==staff.id, so it would look like:
dtr_log = DailyTimeRecord.query.filter(
    DailyTimeRecord.time_in_am==today_dt,
    DailyTimeRecord.staff_id==staff.id
).first()
If you were using the filter_by helper it would look like:
dtr_log = DailyTimeRecord.query.filter_by(
    time_in_am=today_dt,
    staff_id=staff.id
).first()
But you'll also run into a problem with the date comparison. time_in_am is a datetime, and you're building a datetime to compare with, but essentially your today_dt is a datetime at midnight (the hour, min, sec will default to zero because you didn't give them a value).
You really want a date-to-date comparison, so you should cast the database value to a date as well. With filter:
from sqlalchemy import func
from datetime import date

today_dt = date.today()

dtr_log = DailyTimeRecord.query.filter(
    func.date(DailyTimeRecord.time_in_am)==today_dt,
    DailyTimeRecord.staff_id==staff.id
).first()
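A quick plain-Python illustration of why the original comparison fails: a datetime built from only year/month/day defaults to midnight, so equality against a real clock-in time is always False, while comparing the dates works:

```python
from datetime import datetime, date

clock_in = datetime(2021, 5, 9, 8, 17, 7)  # stored time_in_am value
today_dt = datetime(2021, 5, 9)            # hour/min/sec default to 00:00:00

print(clock_in == today_dt)                  # False: midnight != 08:17:07
print(clock_in.date() == date(2021, 5, 9))   # True: date-to-date comparison
```

This is exactly what func.date() does on the database side: it truncates the stored datetime to a date before the comparison.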
I am using Django 3.1 with Postgres, and this is my abridged model:
class PlayerSeasonReport(models.Model):
    player = models.ForeignKey(Player, on_delete=models.CASCADE)
    competition_season = models.ForeignKey(CompetitionSeason, on_delete=models.CASCADE)

class PlayerPrice(models.Model):
    player_season_report = models.ForeignKey(PlayerSeasonReport, on_delete=models.CASCADE)
    price = models.IntegerField()
    date = models.DateTimeField()
    # unique on (price, date)
I'm querying on the PlayerSeasonReport to get aggregate information about all players, in particular I would like the prices for the last n records (so the last price, the 7th-to-last price, etc.)
I currently get the PlayerSeasonReport queryset and annotate it like this:
base_query = PlayerSeasonReport.objects.filter(competition_season_id=id)

# This works fine
last_value = base_query.filter(
    pk=OuterRef('pk'),
).order_by(
    'pk',
    '-player_prices__date'
).distinct('pk').annotate(
    value=F('player_prices__price')
)

# Pull the value from a week ago
# This produces a value but is logically incorrect
# I am interested in the 7th-to-last value, not really from a week ago from day of query
week_ago = datetime.datetime.now() - datetime.timedelta(7)
value_7d_ago = base_query.filter(
    pk=OuterRef('pk'),
    player_prices__date__gte=week_ago,
).order_by(
    'pk',
    'player_prices__date'
).distinct('pk').annotate(
    value=F('player_prices__price')
)

return base_query.annotate(
    value=Subquery(
        last_value.values('value'),
        output_field=FloatField()
    ),
    # Same for value_7d_ago
    # ...
    # Many other annotations
)
Getting the most recent value works fine, but getting the last n values doesn't. I shouldn't be using datetime concepts in my logic, since what I'm really interested in is the n-th-to-last values.
I've tried annotating the max date, then filtering based on this annotation, and also somehow slicing the subquery, but I can't seem to get any of it right.
It's worth noting that a price may not exist (there may be no record n values in the past), in which case the annotation should be null (the datetime-based annotation already behaves this way).
How can I annotate the price values for the last n records?
Sorted:
base_query = PlayerSeasonReport.objects.filter(id=id)
# ...other manipulations on base query

prices = PlayerPrice.objects.filter(
    player_season_report=OuterRef('pk')
).order_by('-date')

return base_query.annotate(
    price=Subquery(
        prices.values('price')[:1],
        output_field=FloatField()
    ),
    prev_day_price=Subquery(
        prices.values('price')[1:2],
        output_field=FloatField()
    ),
    # ...
)
Explanation:
We query on the child model (PlayerPrice) and join on the pk of the PlayerSeasonReport.
prices.values('price')[i:j] where j = i + 1 allows us to get the value we desire without evaluating the QuerySet (which is indispensable in a Subquery).
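The slicing trick can be sanity-checked in plain Python (not ORM code): with prices sorted newest-first, the [n-1:n] slice holds the n-th-to-last value, and an empty slice naturally maps to None when no record exists that far back:

```python
def nth_to_last(prices_desc, n):
    """prices_desc: prices sorted newest-first; n=1 is the latest price."""
    window = prices_desc[n - 1:n]  # same [i:j] slice shape as the Subquery
    return window[0] if window else None  # None mirrors a NULL annotation

prices_desc = [130, 125, 110]  # hypothetical prices, newest first

print(nth_to_last(prices_desc, 1))  # 130 (latest)
print(nth_to_last(prices_desc, 3))  # 110 (3rd-to-last)
print(nth_to_last(prices_desc, 7))  # None (no record that far back)
```

Using a one-element slice rather than indexing is the key point: indexing would raise on a missing record, while the slice degrades gracefully, just as the Subquery yields NULL.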
I'm using Python 3.7 with Postgres 9.5. I would like to write a Django ORM query if possible. I have the below model
class PageStat(models.Model):
    article = models.ForeignKey(Article, on_delete=models.CASCADE)
    elapsed_time = models.IntegerField(default=0)
    score = models.IntegerField(default=0)
What I would like is, for a given article and elapsed time (an integer in seconds), a query that returns the PageStat object with the greatest elapsed_time value that doesn't exceed the argument. So, for example, if my table had these values:
article_id  elapsed_time
==========  ============
1           10
1           20
1           30
2           15
If I did a search with article ID 1 and elapsed time 15, I would like to get back the first row (where article_id = 1 and elapsed_time = 10), since 10 is the greatest value that is still less than or equal to 15. I know how to write a query for just finding the stats for the article,
PageStat.objects.filter(article=article)
but I don't know how to factor in the time value.
You could try:
PageStat.objects.filter(elapsed_time__lte=20)
There are also:
lt - less than
gte - greater than or equal
gt - greater than
etc.
You can order the QuerySet by elapsed_time, and then select the first one, like:
slowest_pagestat = PageStat.objects.filter(
    article=my_article,
    elapsed_time__lte=threshold
).order_by('-elapsed_time').first()
This will return a PageStat object if such a PageStat exists, and None otherwise.
This will make a query that looks like:
SELECT pagestat.*
FROM pagestat
WHERE pagestat.article_id = 1
AND pagestat.elapsed_time <= 15
ORDER BY pagestat.elapsed_time DESC
LIMIT 1
with 1 the primary key of my_article, and 15 the threshold.
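The ORDER BY ... DESC LIMIT 1 pattern is just "largest value not exceeding the threshold". The same selection in plain Python, over hypothetical in-memory rows, shows what the query computes:

```python
rows = [  # (article_id, elapsed_time)
    (1, 10), (1, 20), (1, 30), (2, 15),
]

def slowest_at_most(article_id, threshold):
    """Greatest elapsed_time for the article that does not exceed threshold."""
    candidates = [t for a, t in rows if a == article_id and t <= threshold]
    return max(candidates) if candidates else None  # mirrors .first() -> None

print(slowest_at_most(1, 15))  # 10
print(slowest_at_most(1, 25))  # 20
print(slowest_at_most(2, 10))  # None
```

The database version is preferable in practice, of course, since the ORDER BY/LIMIT form can use an index on elapsed_time instead of scanning every row.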
I have a model which contains a field birth_year, and in another model I have the user registration date.
I have a list of user ids for which I want to query whether their age belongs to a particular age range. User age is calculated as registration date - birth_year.
I was able to calculate it from the current date as:
startAge=25
endAge=50
ageStartRange = (today - relativedelta(years=startAge)).year
ageEndRange = (today - relativedelta(years=endAge)).year
and I made the query as:
query.filter(profile_id__in=communityUsersIds, birth_year__lte=ageStartRange, birth_year__gte=ageEndRange).values('profile_id')
This way I am getting the user ids whose age is between 25 and 50. Instead of today, how can I use the user's registration_date (a field in another model)?
You can use native DB functions. Works like a charm using Postgres.
from django.contrib.auth.models import User
from django.db.models import DurationField, IntegerField, F, Func

class Age(Func):
    function = 'AGE'
    output_field = DurationField()

class AgeYears(Func):
    template = 'EXTRACT (YEAR FROM %(function)s(%(expressions)s))'
    function = 'AGE'
    output_field = IntegerField()

users = User.objects.annotate(age=Age(F("dob")), age_years=AgeYears(F("dob"))).filter(age_years__gte=18)

for user in users:
    print(user.age, user.age_years)
    # which will generate a result like below
    # 10611 days, 0:00:00 29
The "today" version of the query was easy to do, because the "today" date doesn't depend on the individual fields in the row.
F Expressions
You can explore Django's F expressions, as they allow you to reference the fields of the model in your queries (without pulling the values into Python):
https://docs.djangoproject.com/en/1.7/topics/db/queries/#using-f-expressions-in-filters
e.g. for you, the age would be this F expressions:
F('registration_date__year') - F('birth_year')
However, we don't really need to calculate that explicitly; to query for what you want, consider this query:
Model.objects.filter(birth_year__lte=F('registration_date__year') - 25)
From that you can add a:
birth_year__gte=F('registration_date__year') - 50,
or use birth_year__range=(F('registration_date__year') - 50, F('registration_date__year') - 25)
Alternative: precalculate age value
Otherwise you can precalculate that age, since that value is knowable on user registration time
Model.objects.update(age=F('registration_date__year') - F('birth_year'))
Once that is saved, it's as simple as Model.objects.filter(age__range=(25, 50)).
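The bounds are easy to get backwards, so here is a plain-Python check of the arithmetic (hypothetical data): age = registration year - birth year, and an age in [25, 50] corresponds to a birth_year in [reg_year - 50, reg_year - 25]:

```python
reg_year = 2020

users = {  # hypothetical user id -> birth_year
    1: 2000,  # age 20: too young
    2: 1990,  # age 30: in range
    3: 1965,  # age 55: too old
}

in_range = [
    uid for uid, birth_year in users.items()
    if reg_year - 50 <= birth_year <= reg_year - 25  # i.e. 25 <= age <= 50
]
print(in_range)  # [2]
```

Note that the older the user, the smaller the birth_year, which is why the lower age bound becomes the upper birth_year bound and vice versa.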