Django day and month event date query - python

I have a django model that looks like this:
class Event(models.Model):
    name = models.CharField(...etc...)
    date = models.DateField(...etc...)
What I need is a way to get all events that are on a given day and month - much like an "on this day" page.
def on_this_day(self, day, month):
    return Event.objects.filter(????)
I've tried all the regular date query types, but they all seem to require a year, and short of iterating through all years, I can't see how this could be done.

You can make a query like this by specifying the day and the month:
def on_this_day(day, month):
    return Event.objects.filter(date__day=day, date__month=month)
Under the hood this will most likely scan your table using SQL functions such as MONTH(date) and DAY(date), or some equivalent lookup.
You might get better query performance by adding (and indexing) separate day and month fields on Event, since date is stored internally as a single value, which makes it poorly suited to (day, month) queries.
Here's some docs from Django: https://docs.djangoproject.com/en/dev/ref/models/querysets/#month
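To see what those lookups translate to, here's a rough sketch of the equivalent raw SQL using Python's stdlib sqlite3 module (on SQLite, date-part extraction is spelled strftime() rather than MONTH()/DAY(); the table and data are illustrative, not from the question):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE event (name TEXT, date TEXT)")
conn.executemany("INSERT INTO event VALUES (?, ?)", [
    ("a", "2011-07-04"), ("b", "2012-07-04"), ("c", "2012-08-01"),
])
# date__day=4, date__month=7 becomes day/month extraction in the WHERE clause
rows = conn.execute(
    "SELECT name FROM event"
    " WHERE CAST(strftime('%d', date) AS INTEGER) = ?"
    " AND CAST(strftime('%m', date) AS INTEGER) = ?"
    " ORDER BY name",
    (4, 7),
).fetchall()
```

Note how the year never appears in the WHERE clause, which is exactly the "on this day" behavior the question asks for.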

Related

How do I query for objects created before a certain hour in a day in Django?

In Django, I am trying to filter my query down to only the objects that were created before a certain hour of the day. I have a datetime field called 'created_at' that stores the datetime at which the object was created.
What I would like to do is:
query = query.filter(created_at__hour__lte=10)
Which I would expect to get all the objects that were created before 10am. However, when I try that I get a:
FieldError: Join on field 'created_at' not permitted. Did you misspell 'hour' for the lookup type?
I could loop through each day and get that day's objects, but that seems highly inefficient. Is there a way I can do this in a single query? If not, what is the fastest way to run this sort of filter?
__hour on a DateTimeField is a lookup type, so you can't mix it with another lookup type like __lte. You could construct a filter with Q objects, e.g.:
from django.db.models import Q

before_ten = Q(created_at__hour=0)
for hour in range(1, 11):
    before_ten = before_ten | Q(created_at__hour=hour)
query = query.filter(before_ten)
If you can change your data model, it might be more convenient to save a creation time TimeField as well as your existing created_at.
In Django 1.9+, you can chain hour lookups like created_at__hour__lte, so the query from the question will work.
query = query.filter(created_at__hour__lte=10)
Alternatively, filter on a datetime range for the day (note that __range is inclusive at both ends, so this matches up to 10:00:00 exactly, not the whole 10 o'clock hour):
import datetime

start_time = datetime.datetime.now().replace(hour=0, minute=0, second=0, microsecond=0)
certain_hour = 10
end_time = start_time.replace(hour=certain_hour)
query = query.filter(created_at__range=(start_time, end_time))
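The replace() calls above build the day's lower and upper bounds; pinned to a fixed timestamp so the values are deterministic, they look like this:

```python
import datetime

# a fixed "now" so the example is reproducible
now = datetime.datetime(2023, 5, 17, 15, 30, 45)
start_time = now.replace(hour=0, minute=0, second=0, microsecond=0)
end_time = start_time.replace(hour=10)
# start_time is midnight of the same day, end_time is 10:00 that day
```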

How to query based on datetimeproperty in app engine

I have a bunch of posts:
class PostModel(db.Model):
    ...
    created = db.DateTimeProperty(auto_now=True)
and the postmodel.html for rendering each post had
{{p.created.strftime("%Y %m %d")}}
And eventually I want to sort the posts based on the time they were last modified by month. Would the query for, say all the posts created December of 2013, look something like
posts = PostModel.all().filter("created", 2013 12)
?
You can use the order() method to add a sort order to the query; e.g. chain .order('-created') onto your statement above.
The argument to filter() for a DateTimeProperty must be a datetime object, not a bare year/month.
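So "all posts created in December 2013" is a half-open range between two datetimes. A sketch of the boundary computation (the commented query lines are an untested guess at the old App Engine db API):

```python
import datetime

def month_range(year, month):
    """Return (start, end) datetimes covering one calendar month, end exclusive."""
    start = datetime.datetime(year, month, 1)
    if month == 12:
        end = datetime.datetime(year + 1, 1, 1)
    else:
        end = datetime.datetime(year, month + 1, 1)
    return start, end

start, end = month_range(2013, 12)
# With the old db API the query would then be something like:
# posts = PostModel.all().filter("created >=", start).filter("created <", end)
```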

How to aggregate data only by the month part of a datetime field in Django

So my model is like this:
class Blog(models.Model):
    title = models.CharField(max_length=100)
    publication_date = models.DateField()
And now I want to get the count of blog posts for each month. The raw SQL would look like:
SELECT COUNT(*), EXTRACT(MONTH FROM publication_date) AS month FROM blog GROUP BY month;
One solution I found is here; it suggests getting a list of dates first and then looping through it, calling filter() on each iteration. I don't like it; I am looking for a way to get the result without a loop.
You could use something to the effect of Blog.objects.filter(publication_date__range=[start_of_month, end_of_month]) to get all items from between those two dates. See range for details.
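On Django 1.10+ the raw SQL in the question can also be expressed in the ORM with django.db.models.functions.ExtractMonth plus annotate(). The underlying GROUP BY pattern itself can be sketched with the stdlib sqlite3 module (SQLite spells EXTRACT(MONTH ...) as strftime('%m', ...); the table and data here are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE blog (title TEXT, publication_date TEXT)")
conn.executemany("INSERT INTO blog VALUES (?, ?)", [
    ("a", "2013-01-05"), ("b", "2013-01-20"), ("c", "2013-03-02"),
])
# group rows by the month part of the date, counting posts per month
counts = conn.execute(
    "SELECT strftime('%m', publication_date) AS month, COUNT(*)"
    " FROM blog GROUP BY month ORDER BY month"
).fetchall()
```

This is a single query with no Python-side loop over dates, which is what the question asks for.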

Django - get distinct dates from timestamp

I'm trying to filter users by date, but can't until I find the first and last user dates in the db. While I could have my script filter out duplicates later on, I want to do it from the outset using Django's distinct(), since it significantly reduces the number of rows returned. I tried
User.objects.values('install_time').distinct().order_by()
but since install_time is a timestamp, it includes the date AND time (which I don't really care about). As a result, the only ones it filters out are dates where we could retrieve multiple users' install dates but not times.
Any idea how to do this? I'm running this using Django 1.3.1, Postgres 9.0.5, and the latest version of psycopg2.
EDIT: I forgot to add the data type of install_time:
install_time = models.DateTimeField()
EDIT 2: Here's some sample output from the Postgres shell, along with a quick explanation of what I want:
2011-09-19 00:00:00
2011-09-11 00:00:00
2011-09-11 00:00:00 <--filtered out by distinct() (same date and time)
2011-10-13 06:38:37.576
2011-10-13 00:00:00 <--NOT filtered out by distinct() (same date but different time)
I am aware of Manager.raw, but would rather use django.db.connection.cursor to write the query directly, since Manager.raw returns a RawQuerySet which, IMO, is worse than just writing the SQL query manually and iterating.
When doing reports on larger datasets, itertools.groupby might be too slow. In those cases I make Postgres handle the grouping:
from django.db import connection
from django.db.models import Sum

truncate_date = connection.ops.date_trunc_sql('day', 'timestamp')
qs = qs.extra({'date': truncate_date})
return qs.values('date').annotate(Sum('amount')).order_by('date')
I've voted to close this since it's a dup of this question, so here's the answer if you don't want to visit the link, courtesy of nosklo.
Create a small function to extract just the date:
def extract_date(entity):
    'extracts the starting date from an entity'
    return entity.start_time.date()
Then you can use it with itertools.groupby:
from itertools import groupby
entities = Entity.objects.order_by('start_time')
for start_date, group in groupby(entities, key=extract_date):
    do_something_with(start_date, list(group))
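The same pattern, made self-contained using the sample timestamps from the question (note that groupby only merges adjacent items, so the input must be sorted first):

```python
import datetime
from itertools import groupby

# install_time values from the question's Postgres output
timestamps = [
    datetime.datetime(2011, 9, 19, 0, 0),
    datetime.datetime(2011, 9, 11, 0, 0),
    datetime.datetime(2011, 9, 11, 0, 0),
    datetime.datetime(2011, 10, 13, 6, 38, 37, 576000),
    datetime.datetime(2011, 10, 13, 0, 0),
]
timestamps.sort()  # groupby requires sorted input to merge all duplicates
# keep one entry per calendar date, discarding the time-of-day part
distinct_dates = [day for day, _ in groupby(timestamps, key=lambda ts: ts.date())]
```

Both 2011-10-13 rows collapse into one date here, which is exactly the case distinct() on the raw timestamp failed to handle.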

How do I GROUP BY on every given increment of a field value?

I have a Python application. It has an SQLite database, full of data about things that happen, retrieved by a Web scraper from the Web. This data includes time-date groups, as Unix timestamps, in a column reserved for them. I want to retrieve the names of organisations that did things and count how often they did them, but to do this for each week (i.e. 604,800 seconds) I have data for.
Pseudocode:
for each 604800-second increment in time:
    select count(time), org from table group by org
Essentially what I'm trying to do is iterate through the database like a list sorted on the time column, with a step value of 604800. The aim is to analyse how the distribution of different organisations in the total changed over time.
If at all possible, I'd like to avoid pulling all the rows from the db and processing them in Python as this seems a) inefficient and b) probably pointless given that the data is in a database.
I'm not very familiar with SQLite, but this approach should work for most databases, since it computes the week number and subtracts the offset:
SELECT org, ROUND(time/604800) - week_offset, COUNT(*)
FROM table
GROUP BY org, ROUND(time/604800) - week_offset
In Oracle I would use the following if time was a date column:
SELECT org, TO_CHAR(time, 'YYYY-IW'), COUNT(*)
FROM table
GROUP BY org, TO_CHAR(time, 'YYYY-IW')
SQLite probably has similar functionality that allows this kind of SELECT which is easier on the eye.
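In SQLite, integer division on the raw epoch value already buckets rows into weeks, so the whole thing is one GROUP BY. A minimal sketch with Python's stdlib sqlite3 module (the schema and data are illustrative):

```python
import sqlite3

WEEK = 604800  # seconds per week

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (org TEXT, time INTEGER)")
conn.executemany("INSERT INTO events VALUES (?, ?)", [
    ("acme", 0), ("acme", 100), ("acme", WEEK + 10), ("globex", WEEK + 5),
])
# integer division truncates, assigning each epoch timestamp a week number
rows = conn.execute(
    "SELECT time / 604800 AS week, org, COUNT(*)"
    " FROM events GROUP BY week, org ORDER BY week, org"
).fetchall()
```

Everything happens inside the database; Python never iterates over the raw rows, which addresses the efficiency concern in the question.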
Create a table listing all weeks since the epoch, and JOIN it to your table of events.
CREATE TABLE Weeks (
week INTEGER PRIMARY KEY
);
INSERT INTO Weeks (week) VALUES (200919); -- e.g. this week
SELECT w.week, e.org, COUNT(*)
FROM Events e JOIN Weeks w ON (w.week = strftime('%Y%W', e.time))
GROUP BY w.week, e.org;
There are only 52-53 weeks per year. Even if you populate the Weeks table for 100 years, that's still a small table.
To do this in a set-based manner (which is what SQL is good at) you will need a set-based representation of your time increments. That can be a temporary table, a permanent table, or a derived table (i.e. a subquery). I'm not too familiar with SQLite, and it's been a while since I've worked with UNIX. UNIX timestamps are just the number of seconds since some set date/time, right? Using a standard Calendar table (which is useful to have in a database)...
SELECT
C1.start_time,
C2.end_time,
T.org,
COUNT(time)
FROM
Calendar C1
INNER JOIN Calendar C2 ON
C2.start_time = DATEADD(dy, 6, C1.start_time)
INNER JOIN My_Table T ON
T.time BETWEEN C1.start_time AND C2.end_time -- You'll need to convert to timestamp here
WHERE
DATEPART(dw, C1.start_time) = 1 AND -- Basically, only get dates that are a Sunday or whatever other day starts your intervals
C1.start_time BETWEEN #start_range_date AND #end_range_date -- Period for which you're running the report
GROUP BY
C1.start_time,
C2.end_time,
T.org
The Calendar table can take whatever form you want, so you could use UNIX timestamps in it for the start_time and end_time. You just pre-populate it with all of the dates in any conceivable range that you might want to use. Even going from 1900-01-01 to 9999-12-31 won't be a terribly large table. It can come in handy for a lot of reporting type queries.
Finally, this code is T-SQL, so you'll probably need to convert the DATEPART and DATEADD to whatever the equivalent is in SQLite.
