While reviewing my past answers, I noticed I'd proposed code such as this:
import time
def dates_between(start, end):
    # muck around between the 9k+ time representation systems in Python
    # now start and end are seconds since epoch
    # return [start, start + 86400, start + 86400*2, ...]
    return range(start, end + 1, 86400)
When rereading this piece of code, I couldn't help but feel the ghastly touch of Tony the Pony on my spine, gently murmuring "leap seconds" to my ears and other such terrible, terrible things.
When does the "a day is 86,400 seconds long" assumption break, for epoch definitions of 'second', if ever? (I assume functions such as Python's time.mktime already return DST-adjusted values, so the above snippet should also work on DST switching days... I hope?)
Whenever doing calendrical calculations, it is almost always better to use whatever API the platform provides, such as Python's datetime and calendar modules, or a mature high-quality library, than it is to write "simpler" code yourself. Date and calendar APIs are ugly and complicated, but that's because real-world calendars have a lot of weird behavior.
For example, if it is "10:00:00 AM" right now, then the number of seconds to "10:00:00 AM tomorrow" could be a few different things, depending on what timezone(s) you are using, whether DST is starting or ending tonight, and so on.
Any time the constant 86400 appears in your code, there is a good chance you're doing something that's not quite right.
And things get even more complicated when you need to determine the number of seconds in a week, a month, a year, a quarter, and so on. Learn to use those calendar libraries.
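As a rough sketch of that advice (not the only way to do it), the question's dates_between could step over calendar dates instead of epoch seconds. This assumes start and end are datetime.date objects rather than timestamps, which is a change from the original signature:

```python
from datetime import date, timedelta

def dates_between(start, end):
    """Yield every calendar date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        # Adding one calendar day never involves DST or leap seconds
        current += timedelta(days=1)

days = list(dates_between(date(2015, 6, 29), date(2015, 7, 2)))
print(days)
```

Because date objects carry no time of day, no 86400 constant is needed and DST switching days are just ordinary days.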
The number of seconds in a day depends on the time system that you use. For example, in POSIX a day is exactly 86400 seconds by definition:
As represented in seconds since the Epoch, each and every day shall be
accounted for by exactly 86400 seconds.
In UTC, a leap second can be included, i.e., a day can be 86401 SI seconds long (and theoretically 86399 SI seconds). As of Jun 30 2015, this has happened 26 times.
If we measure days by the apparent motion of the Sun, then apparent solar time drifts from mean solar time by up to ~16 minutes over the year.
That in turn differs from UT1, which is also based on the rotation of the Earth (mean solar time). An apparent solar day can be 20 seconds shorter or 30 seconds longer than a mean solar day. UTC is kept within 0.9 seconds of UT1 by the introduction of occasional intercalary leap seconds.
If you define a day by local clock then it may be very chaotic due to bizarre political timezone changes. It is not correct to assume that a day may change only by an hour due to DST.
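The POSIX point can be checked with a small sketch using only the standard library. It compares the epoch values for midnight UTC on either side of the 30 June 2015 leap second; calendar.timegm is used here simply because it converts a UTC time tuple to epoch seconds:

```python
import calendar

# Epoch seconds for midnight UTC on two consecutive days.
# A leap second (23:59:60 UTC) was inserted at the end of 30 June 2015.
june_30 = calendar.timegm((2015, 6, 30, 0, 0, 0))
july_1 = calendar.timegm((2015, 7, 1, 0, 0, 0))

# POSIX time does not count the leap second, so the difference is still 86400
print(july_1 - june_30)
```

So in epoch seconds the day is 86400 long even though 86401 SI seconds elapsed in UTC.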
According to Wikipedia,
UTC days are almost always 86 400 s long, but due to "leap seconds"
are occasionally 86 401 s and could be 86 399 s long (though the
latter option has never been used as of December 2010); this keeps the
days synchronized with the rotation of the Earth (or Universal Time).
I expect that a double leap second could in fact make the day 86402s long, if that were to ever be used.
EDIT again: second guessed myself due to confusing python documentation. time.mktime always returns UTC epoch seconds. There done. : )
In all time zones that "support" daylight savings time, you'll get two days a year that don't have 24h. They'll have 25h or 23h respectively. And don't even think of hardcoding those dates. They change every year, and between time zones.
Oh, and here's a list of 34 other reasons that you hadn't thought about, and why you shouldn't do what you're doing.
Related
I have a dataset that looks as follows:
What I would like to do with this data is calculate how much time was spent in specific states, per day. So say for example I wanted to know how long the unit was running today. I would just like to know the sum of the time the unit spent RUNNING: 45 minutes, NOT_RUNNING: 400 minutes, WARMING_UP: 10 minutes, etc.
I know how to summarize the column data on its own, but I'm looking to use the time stamp I have available to subtract the first time it was on from the last time it was on and get that measure of difference. I haven't had any luck searching for this solution, but there's no way I'm the first to come across this, and I know it can be done somehow; I'm just looking to learn how. Anything helps, thanks!
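Since the dataset itself isn't shown, here is only a minimal sketch under assumptions: the rows are hypothetical (timestamp, state) pairs for a single day, sorted by time, and each state is taken to last until the next row. The name minutes_per_state is made up for illustration:

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows: (timestamp, state), sorted by timestamp
rows = [
    (datetime(2024, 1, 1, 8, 0), "NOT_RUNNING"),
    (datetime(2024, 1, 1, 8, 30), "WARMING_UP"),
    (datetime(2024, 1, 1, 8, 40), "RUNNING"),
    (datetime(2024, 1, 1, 9, 25), "NOT_RUNNING"),
]

def minutes_per_state(rows):
    """Sum the minutes spent in each state; a state lasts until the next row."""
    totals = defaultdict(float)
    # Pair each row with the one after it to get the interval it covers
    for (start, state), (end, _) in zip(rows, rows[1:]):
        totals[state] += (end - start).total_seconds() / 60
    return dict(totals)

print(minutes_per_state(rows))
```

The last row has no following timestamp, so its state is not counted; a real version would need an end-of-day cutoff and would have to split intervals that cross midnight.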
I have a program that outputs flight time but sometimes it's less than an hour and in that case I don't want a 0 hour displayed.
I could use an if statement, with one format string that has hours and one that doesn't, but that doesn't seem efficient. It would also be nice not to display minutes when they are zero.
landed_time_msg = time.strftime("Apx. flt. time %-H Hours : %-M Mins. ",time.gmtime(self.landed_time))
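One possible approach, sketched here with a hypothetical helper named format_flight_time, is to skip strftime and build the message from divmod results, so zero components can simply be left out. It assumes the flight time is a number of seconds, as the call to time.gmtime above suggests:

```python
def format_flight_time(seconds):
    """Build an approximate flight-time message, omitting zero components."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes = remainder // 60
    parts = []
    if hours:
        parts.append(f"{hours} Hours")
    if minutes:
        parts.append(f"{minutes} Mins.")
    # Fall back to an explicit zero so the message is never empty
    if not parts:
        parts.append("0 Mins.")
    return "Apx. flt. time " + " : ".join(parts)

print(format_flight_time(45 * 60))
print(format_flight_time(2 * 3600 + 5 * 60))
```

This also sidesteps the %-H and %-M directives, which are platform-specific, and unlike gmtime the hours do not wrap around after 24.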
I am measuring run times of a program in seconds. Depending on the amount of data I input, that can take milliseconds to days. Is there a Python module that I can use to convert the number of seconds to the most useful unit and display that? Approximations are fine.
For example, 50 should become 50 seconds, 590 should become 10 minutes, 100000 should become 1 day or something like that. I could write the basic thing myself, but I am sure people have thought about this more than I have and have considered many of the edge cases I wouldn't think about in 1000 years :)
Edit: I noticed tqdm must have some logic associated with that, as it selects the length of the ETA string accordingly. Compare
for _ in tqdm.tqdm(range(10)): time.sleep(1)
with
for _ in tqdm.tqdm(range(100000)): time.sleep(1)
Edit: I have also found this Gist, but I would prefer code with at least some maintenance :)
https://gist.github.com/alexwlchan/73933442112f5ae431cc
Close the question if you want, humanize.naturaldelta is the answer:
This modest package contains various common humanization utilities, like turning a number into a fuzzy human readable duration ('3 minutes ago') or into a human readable size or throughput. It works with python 2.7 and 3.3 and is localized to Russian, French, Korean and Slovak.
https://github.com/jmoiron/humanize
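For comparison, when an extra dependency isn't wanted, the basic idea can be sketched with the standard library alone. This is not how humanize works internally; rough_duration is a hypothetical helper that just picks the largest unit that fits and rounds:

```python
def rough_duration(seconds):
    """Return an approximate, human-readable duration for elapsed seconds."""
    units = (("day", 86400), ("hour", 3600), ("minute", 60), ("second", 1))
    if seconds < 1:
        return f"{round(seconds * 1000)} milliseconds"
    # Pick the largest unit that fits, then round to a whole count
    for name, size in units:
        if seconds >= size:
            count = round(seconds / size)
            return f"{count} {name}" + ("" if count == 1 else "s")

for value in (0.25, 50, 590, 100000):
    print(value, "->", rough_duration(value))
```

The 86400 here is fine because it measures elapsed time rather than calendar days. One rough edge: values just under a unit boundary round up within the smaller unit (3599 gives "60 minutes").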
I just found arrow:
Arrow is a Python library that offers a sensible and human-friendly approach to creating, manipulating, formatting and converting dates, times and timestamps.
It has humanize(), too, and seems to be much better maintained:
https://arrow.readthedocs.io/en/latest/#humanize
Did anybody notice that the range of seconds in Python's time module is [00, 61]?
See the table on this page:
https://docs.python.org/3/library/time.html#time.strftime
Why?
The answer is on the same page in footnote (2):
The range really is 0 to 61; value 60 is valid in timestamps representing leap seconds and value 61 is supported for historical reasons.
The "historical reasons" are described in https://bugs.python.org/issue2568.
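A small sketch of what that range means in practice, using only the standard library: time.strptime accepts a seconds field of 60, while datetime objects do not support leap seconds at all.

```python
import calendar
import time
from datetime import datetime

# time.strptime accepts a seconds field of 60 (a leap second)
parsed = time.strptime("2015-06-30 23:59:60", "%Y-%m-%d %H:%M:%S")
print(parsed.tm_sec)

# calendar.timegm does plain arithmetic, so the leap second rolls over
# to the same epoch value as the following midnight
rolled = calendar.timegm(parsed)
print(rolled == calendar.timegm((2015, 7, 1, 0, 0, 0)))

# datetime objects reject second=60
try:
    datetime(2015, 6, 30, 23, 59, 60)
    datetime_accepts_60 = True
except ValueError:
    datetime_accepts_60 = False
print(datetime_accepts_60)
```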
There is no such thing as a double leap second. There cannot be 62 seconds in a minute. 59, yes. 60, yes. 61, yes. 62, no.
http://www.monkey.org/openbsd/archive2/tech/199905/msg00031.html
Probably to account for leap seconds.
When a leap second has to be added, the extra value makes it possible to represent it. You can search the net for "leap second". That is why the seconds range in Python is 0-61.
Leap seconds.
It has been the case that there have been 62 seconds in a minute in the past.
It adjusts for the world spinning slower.
Part of this is down to tides. The energy for the tides comes from the rotation of the earth and moon. The result is that the world slows down.
If global warming takes place the oceans get hotter and expand. That is like a skater throwing their arms out, and the spin slows down. That hasn't taken place. The measurement of ocean levels doesn't agree with the rotation measurements. It's likely to be down to problems with the earth's surface moving, which is far larger than the sea level rise.
I'm wondering if anyone knows of a good date and time library that has correctly-implemented features like the following:
Microsecond resolution
Daylight savings
Example: it knows that 2:30am did not exist in the US on 8 March 2009 for timezones that respect daylight savings.
I should be able to specify a timezone like "US/Eastern" and it should be smart enough to know whether a given timestamp should correspond to EST or EDT.
Custom date ranges
The ability to create specialized business calendars that skip over weekends and holidays.
Custom time ranges
The ability to define business hours so that times requested outside the business hours can be rounded up or down to the next or previous valid hour.
Arithmetic
Be able to add and subtract integer amounts of all units (years, months, weeks, days, hours, minutes, ...). Note that adding something like 0.5 days isn't well-defined because it could mean 12 hours or it could mean half the duration of a day (which isn't 24 hours on daylight savings changes).
Natural boundary alignment
Given a timestamp, I'd like to be able to do things like round down to the nearest decade, year, month, week, ..., quarter hour, hour, etc.
I'm currently using Python, though I'm happy to have a solution in another language like perl, C, or C++.
I've found that the built-in Python libraries lack sophistication with their daylight savings logic and there isn't an obvious way (to me) to set up things like custom time ranges.
Python's standard library's datetime module is deliberately limited to non-controversial aspects that aren't changing all the time by legislative fiat -- that's why it deliberately excludes direct support for timezones, DST, fuzzy parsing and ill-defined arithmetic (such as "one month later"...) and the like. On top of it, dateutil for many kinds of manipulations, and pytz for timezones (including DST issues), add most of what you're asking for, though not extremely explosive things like "holidays" which vary so wildly not just across political jurisdictions but even across employers within a single jurisdiction (e.g. in the US some employers consider "Columbus Day" a holiday, but many don't -- and some, with offices in many locations, have it as a holiday in some locations but not in others; given this utter, total chaos, to expect to find a general-purpose library that somehow magically makes sense of the chaos is pretty weird).
Take a look at the dateutil and possibly mx.DateTime packages.
Saw this the other day, I haven't used it myself... looks promising. http://crsmithdev.com/arrow/
I should be able to specify a timezone
like "US/Eastern" and it should be
smart enough to know whether a given
timestamp should correspond to EST or
EDT.
This part isn't always possible - in the same way that 2:30am doesn't exist for one day of the year (in timezones with daylight saving that switches at 2:00am), 1:30am exists twice on another day - once in EDT and then an hour later in EST. If you pass that date/time to the library, how does it know which of the two times you're talking about?
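Newer Python versions give one way to express the answer to that question. The sketch below assumes Python 3.9+ (for the zoneinfo module) and an installed time zone database; it uses "America/New_York", the canonical name behind "US/Eastern". The fold attribute selects which of the two repeated wall-clock times is meant:

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

eastern = ZoneInfo("America/New_York")

# 1:30am on 1 November 2009 occurred twice; fold picks which one is meant
first = datetime(2009, 11, 1, 1, 30, tzinfo=eastern)           # fold=0: EDT
second = datetime(2009, 11, 1, 1, 30, fold=1, tzinfo=eastern)  # fold=1: EST
print(first.tzname(), second.tzname())

# Convert to UTC before subtracting: the two readings are one hour apart
gap = second.astimezone(timezone.utc) - first.astimezone(timezone.utc)
print(gap)

# 2:30am on 8 March 2009 never appeared on a wall clock; a round trip
# through UTC lands on a different local time, which exposes the gap
missing = datetime(2009, 3, 8, 2, 30, tzinfo=eastern)
round_trip = missing.astimezone(timezone.utc).astimezone(eastern)
print(round_trip)
```

The library still cannot guess which reading you meant; fold defaults to 0, so the caller has to supply that bit of information.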
Although the book is over ten years old, I would strongly recommend reading Standard C Date/Time Library: Programming the World's Calendars and Clocks by Lance Latham. This is one of those books that you will pick back up from time to time in amazement that it got written at all. The author goes into more detail than you want about calendars and timekeeping systems, and along the way develops the source code to a library (written in C) to handle all of the calculations.
Amazingly, it seems to be still in print...
I just released a Python library called Fleming (https://github.com/ambitioninc/fleming), and it appears to solve two of your problems with sophistication in regards to Daylight Savings Time.
Problem 1, Arithmetic - Fleming has an add_timedelta function that takes a timedelta (from Python's datetime module) or a relativedelta from python-dateutil and adds it to a datetime object. The add_timedelta function handles the case when the datetime object crosses a DST boundary. Check out https://github.com/ambitioninc/fleming#add_timedelta for a complete explanation and examples. Here is a short example:
import datetime
import fleming
# dt is assumed to be an existing timezone-aware datetime; its construction is not shown here
dt = fleming.add_timedelta(dt, datetime.timedelta(weeks=2, days=1))
print(dt)
# 2013-03-29 20:00:00-04:00
# Do timedelta arithmetic such that it starts in DST and crosses over into no DST.
# Note that the hours stay intact and the timezone changes
dt = fleming.add_timedelta(dt, datetime.timedelta(weeks=-4))
print(dt)
# 2013-03-01 20:00:00-05:00
Problem 2, Natural Boundary Alignment - Fleming has a floor function that can take an arbitrary alignment. Let's say your time was datetime(2013, 2, 3) and you gave it a floor interval of month=3. This means it will round down to the start of the enclosing trimonth (quarter). You could similarly round down to the decade by using year=10 in the arguments. Check out (https://github.com/ambitioninc/fleming#floordt-within_tznone-yearnone-monthnone-weeknone-daynone-hournone-minutenone-secondnone-microsecondnone) for complete examples and illustrations. Here is a quick one:
import datetime
import fleming
# Get the start of a quarter by using month=3
print(fleming.floor(datetime.datetime(2013, 2, 4), month=3))
# 2013-01-01 00:00:00