When storing a time in Python (in my case in ZODB, but applies to any DB), what format (epoch, datetime etc) do you use and why?
The datetime module has the standard types for modern Python handling of dates and times, and I use it because I like standards (I also think it's well designed); I typically also have timezone information via pytz.
Most DBs have their own standard way of storing dates and times, of course, but modern Python adapters to/from the DBs typically support datetime (another good reason to use it;-) on the Python side of things -- for example that's what I get with Google App Engine's storage, Python's own embedded SQLite, and so on.
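As a minimal sketch of what "datetime plus timezone information" can look like, here is a round trip through an ISO 8601 string. It assumes the standard library's `timezone.utc` in place of pytz, purely to keep the example dependency-free; the variable names are illustrative.

```python
from datetime import datetime, timezone

# An aware datetime carries its UTC offset along with the wall-clock time.
created = datetime(2009, 1, 12, 15, 30, tzinfo=timezone.utc)

# Round-trip through text, roughly as a text column or serializer might store it.
stored = created.isoformat()
restored = datetime.fromisoformat(stored)

print(stored)
print(restored == created)
```

Because the offset is part of the stored string, the restored value compares equal to the original regardless of the machine's local timezone.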
If the database has a native date-time format, I try to use that even if it involves encoding and decoding. Even where this is not 100% standard, as with SQLite, I would still use the date and time adapters described near the bottom of the sqlite3 documentation page.
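A rough sketch of the adapter approach with Python's sqlite3 module follows. It registers its own adapter and converter explicitly rather than relying on the module's built-in defaults; the table name, column names, and the `timestamp` declared type are assumptions made for the example.

```python
import sqlite3
from datetime import datetime

# Adapter: datetime -> ISO text on write. Converter: stored bytes -> datetime on read.
sqlite3.register_adapter(datetime, lambda value: value.isoformat(" "))
sqlite3.register_converter(
    "timestamp", lambda raw: datetime.fromisoformat(raw.decode())
)

# PARSE_DECLTYPES makes sqlite3 pick the converter from the declared column type.
conn = sqlite3.connect(":memory:", detect_types=sqlite3.PARSE_DECLTYPES)
conn.execute("CREATE TABLE events (name TEXT, happened_at timestamp)")
conn.execute(
    "INSERT INTO events VALUES (?, ?)", ("launch", datetime(2009, 1, 12, 15, 30))
)

happened_at = conn.execute("SELECT happened_at FROM events").fetchone()[0]
print(type(happened_at).__name__, happened_at)
```

The value is stored as readable ISO text, so other tools can still make sense of the column, while Python code only ever sees datetime objects.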
In all other cases I would use ISO 8601 format unless it was a Python object database that stores some kind of binary encoding of the object.
ISO 8601 format is sortable, and that is often required in databases for indexing. It is also unambiguous, so you know that 2009-01-12 was in January, not in December. People who swap the positions of month and day always put the year last, so putting the year first stops readers from automatically assuming an incorrect format.
Of course, you can reformat however you want for display and input in your applications but data in databases is often viewed with other tools, not your application.
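The sortability point is easy to check: sorting ISO 8601 date strings as plain text gives the same order as sorting them as dates. A small sketch with made-up sample values:

```python
from datetime import date

stamps = ["2009-12-01", "2009-01-12", "2008-06-30"]

# Plain string sort versus a sort on the parsed dates.
by_text = sorted(stamps)
by_date = sorted(stamps, key=date.fromisoformat)

print(by_text == by_date)
```

This only holds when every value uses the same zero-padded layout (and, for timestamps, the same UTC offset).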
Seconds since epoch is a very compact and portable format for storing time data. In older MySQL versions, for example, the native DATETIME format takes 8 bytes instead of 4 for TIMESTAMP (seconds since epoch). Storing UTC-based epoch seconds also sidesteps timezone issues if you need to collect times from clients in multiple geographic locations. Logical operations (for sorting, etc.) are also fast on integers.
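For comparison, a short sketch of the epoch approach in Python. It assumes whole-second precision is enough and that the value is interpreted as UTC on both the way in and the way out.

```python
from datetime import datetime, timezone

moment = datetime(2009, 1, 12, 15, 30, tzinfo=timezone.utc)

# Store as an integer number of seconds since 1970-01-01T00:00:00Z.
epoch_seconds = int(moment.timestamp())

# Read it back as an aware UTC datetime.
restored = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)

print(epoch_seconds, restored.isoformat())
```

The integer is compact, but unlike an ISO string it is not readable when someone inspects the table with another tool.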
What would be the best approach to handle the following case with Django?
Django needs access to a database (in MariaDB) in which datetime values are stored in the UTC timezone, except for one table that has all values for all of its datetime columns stored in the local timezone (obviously different from UTC). This particular table is populated by a different system, not Django, and for some reasons we do not have the option to convert the timestamps in that table to UTC or to change that system to start storing the values in UTC. The queries involving that table are read-only, but may join data from other tables. The table itself does not have a foreign key, but there are other tables with a foreign key to that table. The table is very big (millions of rows), and one of its datetime columns is part of more than one index that helps with making optimized queries.
I am asking your opinion for an approach to the above case that would be as seamless as it can be, preferably without doing conversions here and there in various parts of the codebase while accessing and filtering on the datetime fields of this "problematic" table / model. I think an approach at the model layer, which will let Django ORM work as if the values for that table were stored in UTC timezone, would be preferable. Perhaps a solution based on a custom model field that does the conversions from and back to the database "transparently". Am I thinking right? Or perhaps there is a better approach?
It is what it is. If you have different timezones, then you need to convert them to the one you prefer. Plus, there is no such thing as "for some reasons we cannot have the option to convert the timestamps in that table to UTC". Well, too bad for you: someone should have thought about that, and now you need to deal with it (if that is really the case, which it is not; this is "programming", after all, and of course everything can be changed).
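If the conversion has to live in one place, a pair of helpers along these lines is one possible starting point. This is only a sketch: the function names and `LEGACY_TZ` are hypothetical, and a fixed +02:00 offset stands in for the real local zone, which would need a named zone with DST rules (with pytz you would use `localize()` rather than `replace(tzinfo=...)`). A custom Django field could call such helpers from its `from_db_value` and `get_prep_value` methods.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for the legacy table's local zone (fixed offset, no DST).
LEGACY_TZ = timezone(timedelta(hours=2))


def legacy_to_utc(naive_local, tz=LEGACY_TZ):
    """Interpret a naive value read from the legacy table; return aware UTC."""
    return naive_local.replace(tzinfo=tz).astimezone(timezone.utc)


def utc_to_legacy(aware, tz=LEGACY_TZ):
    """Turn an aware datetime into the naive local value the legacy table stores."""
    return aware.astimezone(tz).replace(tzinfo=None)


as_utc = legacy_to_utc(datetime(2020, 1, 1, 12, 0))
as_stored = utc_to_legacy(as_utc)
print(as_utc.isoformat(), as_stored.isoformat())
```

Converting the filter value with `utc_to_legacy` before it reaches the query, rather than wrapping the column in a conversion function, leaves the indexed column untouched in the WHERE clause.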
Say you have a column that contains the values for the year, month and date. Is it possible to get just the year? In particular I have
ALTER TABLE pmk_pp_disturbances.disturbances_natural ADD COLUMN sdate timestamp without time zone;
and want just the 2004 from 2004-08-10 05:00:00. Can this be done with Postgres or must a script parse the string? By the way, any rules as to when to "let the database do the work" vs. let the script running on the local computer do the work? I once heard querying databases is slower than the rest of the program written in C/C++, generally speaking.
You can use extract:
SELECT extract('year' FROM sdate) FROM pmk_pp_disturbances.disturbances_natural;
For many queries it's worth investigating whether the database can perform the data transformations as needed. That being said, it also depends on what your application will do with the data so it's a trade-off as to whether the work should be done by the database or in the application.
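For the "do it in the script" side of that trade-off, here is a minimal Python sketch. It assumes the value arrives as a string in exactly the format shown in the question; many Python database drivers already hand back datetime objects, in which case reading `.year` is all that is needed.

```python
from datetime import datetime

raw = "2004-08-10 05:00:00"

# Parse the timestamp text, then read the year attribute.
year = datetime.strptime(raw, "%Y-%m-%d %H:%M:%S").year

print(year)
```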
SELECT date_part('year', your_column) FROM your_table;
I think not. You're forced to read the entire value of a column. You can split the date into several columns (one for the year, another for the month, etc.), or store the date in an integer format if you want aggressive space optimization. But that will make the database worse in terms of scalability and ease of modification.
Databases are slow, and you have to accept that, but they take care of things that are hard to do yourself in C/C++.
If you are thinking of making a game and storing your save games in SQL, forget it. Use a database if you're building a back-end server, a management application, a tool, etc.
I have a SQLite database from which I am pulling data for a specific range of dates (let's say 01-01-2011 to 01-01-2011). What is the best way to implement this query in SQL? Ideally I would like the following line to run:
SELECT * FROM database where start_date < date_stamp and end_date > date_stamp
This obviously does not work when I store the dates as strings.
My solution (which I think is messy and I am hoping for another one) is to convert the dates into integers in the following format:
YYYYMMDD
Which makes the above line able to run (theoretically). Is there a better method?
Using python sqlite3
Would the answer be any different if I were using SQL, not SQLite?
For SQLite it is the best approach, as comparison with integers is much faster than with strings or any date and time manipulations.
You should store the dates in one of the supported date/time datatypes, then comparisons will work without conversions, and you would be able to use the built-in date/time functions on them.
(Whether you use strings or numbers does not matter for speed; database performance is mostly determined by the amount of I/O needed.)
In other SQL databases that have a built-in date datatype, you could use that.
(However, this is usually not portable.)
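As a sketch of the "supported format" route in SQLite, the dates can be stored as ISO 8601 text (YYYY-MM-DD), which compares correctly with plain `<` and `>`. The table name, column names, and sample rows below are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bookings (label TEXT, start_date TEXT, end_date TEXT)")
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?, ?)",
    [
        ("winter", "2010-12-01", "2011-02-28"),
        ("spring", "2011-03-01", "2011-05-31"),
    ],
)

# ISO 8601 text sorts chronologically, so the range test works on the strings as-is.
date_stamp = "2011-01-01"
rows = conn.execute(
    "SELECT label FROM bookings WHERE start_date < ? AND end_date > ?",
    (date_stamp, date_stamp),
).fetchall()

print(rows)
```

Values in this layout can also be passed to SQLite's built-in date and time functions, which is not the case for YYYYMMDD integers.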
I have a fairly simple scenario, yet I am having a difficult time finding what would seem to be a simple answer! I have an audit log in my application that has the time an action was performed stored via auto_now_add on a timestamp field.
I've been searching around and gae-pytz is thrown around as an answer very commonly, but it hasn't been updated in well over a year:
http://pypi.python.org/pypi/gaepytz
This makes me wonder if there's something incredibly simple I'm overlooking, or new functionality added that I can't find any documentation on.
All I need is to display the datastore's UTC timestamps as one specific timezone, and perhaps to do some basic filtering based on this timezone's point of reference. What is the most efficient way to do this?
How well does Django handle the case of different timezones for each user? Ideally I would like to run the server in the UTC timezone (eg, in settings.py set TIME_ZONE="UTC") so all datetimes were stored in the database as UTC. Stuff like this scares me which is why I prefer UTC everywhere.
But how hard will it be to store a timezone for each user and still use the standard Django datetime formatting and modelform wrappers? Should I anticipate having to write date handling code everywhere to convert dates into the user's timezone and back to UTC again?
I am still going through the django tutorial but I know how much of a pain it can be to deal with user timezones in some other frameworks that assume system timezone everywhere so I thought I'd ask now.
My research at the moment consisted of searching the django documentation and only finding one reference to timezones.
Additional:
There are a few bugs submitted concerning Django and timezone handling.
Babel has some contrib code for django that seems to deal with timezone formatting in locales.
Update, January 2013: Django 1.4 now has time zone support!!
Old answer for historical reasons:
I'm going to be working on this problem myself for my application. My first approach to this problem would be to go with Django core developer Malcolm Tredinnick's advice in this django-users post. You'll probably want to store the user's timezone setting in their user profile.
I would also highly encourage you to look into the pytz module, which makes working with timezones less painful. For the front end, I created a "timezone picker" based on the common timezones in pytz. I have one select box for the area, and another for the location (e.g. US/Central is rendered with two select boxes). It makes picking timezones slightly more convenient than wading through a list of 400+ choices.
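The two-select-box idea can be sketched roughly like this: split each zone name on the first slash and group the locations by area. The short `zone_names` list is a hypothetical sample; a real picker might start from `pytz.common_timezones` instead.

```python
from collections import defaultdict

# Hypothetical sample of zone names for illustration only.
zone_names = ["US/Central", "US/Eastern", "Europe/London", "Europe/Paris", "UTC"]


def group_by_area(names):
    """Map each area (text before the first '/') to its list of locations."""
    grouped = defaultdict(list)
    for name in names:
        area, _, location = name.partition("/")
        # Names without a slash, such as "UTC", act as their own location.
        grouped[area].append(location or area)
    return dict(grouped)


choices = group_by_area(zone_names)
print(choices)
```

The first select box would list the keys, and the second would be filled from the chosen key's list. Names with two slashes keep everything after the first slash as the location.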
It's not that hard to write timezone aware code in django:
I've written a simple Django application which helps handle timezone issues in Django projects: https://github.com/paluh/django-tz. It's based on Brosner's (django-timezone) code but takes a different approach to solving the problem. I think it implements something similar to your and FernandoEscher's proposals.
All datetime values are stored in the database in one timezone (according to the TIME_ZONE setting), and conversions to the appropriate value (i.e. the user's timezone) are done in templates and forms (there is a form widget for datetime fields which contains an additional subwidget with the timezone). Every datetime conversion is explicit, with no magic.
Additionally, there is a per-thread cache which allows you to simplify these datetime conversions (the implementation is based on the Django i18n translation machinery).
When you want to remember the user's timezone, you should add a timezone field to the profile model and write a simple middleware (follow the example from the docs).
Django doesn't handle it at all, largely because Python doesn't either. Python (Guido?) has so far decided not to support timezones, since timezones, although a reality of the world, are "more political than rational, and there is no standard suitable for every application."
The best solution for most is to not worry about it initially and rely on what Django provides by default in the settings.py file TIME_ZONE = 'America/Los_Angeles' to help later on.
Given your situation pytz is the way to go (it's already been mentioned). You can install it with easy_install. I recommend converting times on the server to UTC on the fly when they are asked for by the client, and then converting these UTC times to the user's local timezone on the client (via Javascript in the browser or via the OS with iOS/Android).
The server code to convert times stored in the database with the America/Los_Angeles timezone to UTC looks like this:
>>> import pytz
>>> from django.conf import settings
>>>
>>> # Get a naive datetime from the database somehow and store into "x"
>>> x = ...
>>>
>>> # Create an instance of the Los_Angeles timezone
>>> la_tz = pytz.timezone(settings.TIME_ZONE)
>>>
>>> # Attach timezone information to the datetime from the database
>>> x_localized = la_tz.localize(x)
>>>
>>> # Finally, convert the localized time to UTC
>>> x_utc = x_localized.astimezone(pytz.utc)
If you send x_utc down to a web page, Javascript can convert it to the user's operating system timezone. If you send x_utc down to an iPhone, iOS can do the same thing, etc. I hope that helps.
Not a Django expert here, but afaik Django has no magic, and I can't even imagine any such magic that would work.
For example: you don't always want to save times in UTC. In a calendar application, for example, you want to save the datetime in the local time in which the calendar event happens, which can be different from both the server's and the user's time zone. So having code that automatically converts every selected datetime to the server's time zone would be a Very Bad Thing.
So yes, you will have to handle this yourself. I'd recommend storing the time zone for everything, and of course running the server in UTC, letting all the datetimes generated by the application use UTC, and then converting them to the user's time zone when displaying. It's not difficult, just annoying to remember. When it comes to datetimes that are entered by the user, whether you should convert to UTC or not depends on the application. As a general recommendation, I would not convert to UTC but save in the user's time zone, along with the information of which time zone that is.
Yes, time zones are a big problem. I've written a couple of blog posts on the annoying issue, like here: http://regebro.wordpress.com/2007/12/18/python-and-time-zones-fighting-the-beast/
In the end you will have to take care of time zone issues yourself, because there is no real correct answer to most of the issues.
You could start by taking a look at the django-timezones application. It makes available a number of timezone-based model fields (and their corresponding form fields, and some decorators), which you could use to at least store different timezone values per user (if nothing else).
Looking at the django-timezones application, I found that it doesn't support MySQL, since MySQL doesn't store any timezone reference within datetimes.
Well, I think I managed to work around this by forking the Brosner library and modifying it to work transparently within your models.
There, I'm doing the same thing the Django translation system does, so you get per-user timezone conversion. You should find a Field class and some datetime utils there that always give you datetimes converted to the user's timezone. So every time you make a request and run a query, everything will be in the user's timezone.
Give it a try!