Make Datetime Timezone Aware From UTC Offset and DST Bit

Make Datetime Timezone Aware From UTC Offset and DST Bit - python

I am currently battling the cruel beast that is timezone localization in my django application, and having some trouble... I want to make naive datetimes timezone aware, based on a location. I have a database of zip codes that have the UTC offset in hours, as well as a 0 or 1 depending on if the zip codes adhere to DST. How might I use this data to accurately apply a timezone to my datetimes? Ideally the datetime would respond to changes in DST, rather than just always simply following the UTC offset.

With pytz it's not hard to convert the datetimes as you describe; the only complication is getting tzinfo instances corresponding to the time zone descriptions in your database.
The problem is that real timezones are more complicated than just offset + DST. For example, different regions adopted DST at different points in history, and different regions in the world can make the DST switch at different points in the year.
If your usage is only for the US, and only concerns future (not historical) dates, then there are a couple options that should yield accurate results (though note the caveat below):
Just create your own concrete tzinfo subclass that uses the offset and DST flag from your database. For example, the Python documentation gives sample code for "a complete implementation of current DST rules for major US time zones."
Map from the offset / DST to the corresponding pytz tzinfo object. Since there are only a handful of possible combinations in the US, just figure out which timezone name corresponds and use that.
TZ_MAP = {
...
(-5, 1): pytz.timezone('US/Eastern')
...
}
tz = TZ_MAP[(offset, is_dst)]
Once you have the tzinfo instance the conversion is simple, but note that dealing with DST involves inherent ambiguities. For example, when the clock is turned back at 2am, all the times between 1am and 2am occur twice in the local timezone. Assuming you don't know which one you actually mean, you can either pick one arbitrarily, or raise an exception.
# with no is_dst argument, pytz will guess if there is ambiguity
aware_dt = tz.localize(naive_dst)
# with is_dst=None, pytz will raise an exception if there is ambiguity
aware_dt = tz.localize(naive_dst, is_dst=None)

Related

If a timestamp is anchored in UTC, why isn't Python's `fromtimestamp` timezone-aware?

To my knowledge:
Python's datetime can be "naive" (if no timezone-info is available) or "timezone-aware". In contrast, a timestamp is well-defined to be anchored in UTC, i.e. a timestamp 0 corresponds to 1970-01-01 00:00:00+00:00 (no matter of your location).
Question: Why does datetime.fromtimestamp() return a naive datetime object though it has a well-defined input?
MWE
from datetime import datetime, timezone
timestamp = 0
# output: "1970-01-01 00:00:00+00:00", i.e. providing the timezone information,
# the resulting datetime is timezone-aware and accurate
print(datetime.fromtimestamp(timestamp, tz=timezone.utc))
# output: "1970-01-01 01:00:00" (for me running it in CET+0100 timezone), i.e.
# the interpretation is aware of my local time shift, but the resulting datetime
# is naive though it could be timezone-aware and thus not well-defined anymore
#
# I would have wished for/expected: "1970-01-01 01:00:00+01:00"
print(datetime.fromtimestamp(timestamp))
Why do I care?
The point is that we loose information in a dangerous way, i.e. we switch from a well-defined object to an object that is only well-defined if we know the timezone of the PC it has been read in. Though it could do better, imo. The way it is implemented, it is easy to mess things up without recognizing it.
But maybe I got the whole concept wrong :) That is why I am asking...

How to format a datetime string as defined below?

How can I format a datetime string like 2020-04-30T22:30:00-04:00 to something like 2015-03-22T10:00:00+0900 in Python? The formatting, not the actual date.

The 2 examples you have provided don't appear to have different formatting,
they're both ISO8601 Extended format.
If you need to specify your own format you can use the datetime function strftime and a format string, the reverse of this is strptime, which takes an input string and a format string and returns a datetime object
If you need to change timezones then the tzinfo object what your looking for
You might also be interested in the dateutil
Computing of relative deltas (next month, next year, next Monday, last week of month, etc);
Computing of relative deltas between two given date and/or datetime objects;
Computing of dates based on very flexible recurrence rules, using a superset of the iCalendar specification.
Parsing of RFC strings is supported as well. Generic parsing of dates in almost any string format
Timezone (tzinfo) implementations for tzfile(5) format files (/etc/localtime,/usr/share/zoneinfo, etc), TZ environment string (in all known
formats), iCalendar format files, given ranges (with help from relative deltas), local machine timezone, fixed offset timezone, UTC timezone, and Windows registry-based time zones. Internal up-to-date world timezone information based on Olson’s database.
Computing of Easter Sunday dates for any given year, using Western, Orthodox or Julian algorithms

Why does converting timezones (and to unix timestamps) behave inconsistently in Pandas?

I'm parsing and manipulating some dates and times which, for reasons of interoperability with other systems, also need to be stored as UNIX (epoch) timestamps. In doing so, I'm seeing some weird behavior from pandas' Timestamp.tz_convert(), and then in its Timestamp.strftime() behavior in casting to epoch time, that makes me doubt my understanding of what should be going on.
The times I'm working with are in the US/Eastern timezone, but of course, epoch time is UTC, so my approach had been to cast to UTC since most conversions to/from UNIX timestamps assume that a tz-naive DateTime is in UTC. Let's leave aside whether doing that conversion is absolutely necessary to get valid timestamps; here's what I'm seeing that's problematic:
1. Using Timestamp.tz_convert() to change the timezone representation of a timestamp (i.e., a universal point in time) also changes the UNIX timestamp when you convert using Timestamp.strftime().
2. The differences in those timestamps don't even correspond to the proper hour differences between US-Eastern and GMT.
Here's some basic interactive-mode python to illustrate:
>>> import pytz
>>> from pytz import timezone
>>> import pandas as pd
>>> dtest = pd.to_datetime("Sunday, July 28, 2018 10:00 AM", infer_datetime_format=True).replace(tzinfo=timezone('America/New_York')) # okay, this should uniquely represent a point in time
>>> dtest
Timestamp('2018-07-28 10:00:00-0400', tz='America/New_York') # yup, that's the time - 10AM at GMT-0400.
>>> dtest2 = dtest.tz_convert('UTC') # convert to UTC
>>> dtest2
Timestamp('2018-07-28 14:00:00+0000', tz='UTC') # yup, same point in time, just different time zone now
>>> dtest.strftime('%s') # let's convert to unix time - this looks right
'1532786400'
>>> dtest2.strftime('%s') # should be the same, but it's not. WTF?
'1532804400'
The timestamps look like they are describing things equivalently: one is 10 AM at GMT-0400, the other is 2 PM at GMT+0000, a difference of 4 hours of clock time, as expected. They're both, of course, timezone-aware. But then converting them to UNIX timestamps yields
(A) different numbers, and even worse,
(B) numbers that differ by 5 hours (18000 seconds = 5 * 60 * 60) rather than 4, so I can't even assume that strftime() is merely ignoring timezone.
I'm using https://www.epochconverter.com/ to validate any timestamps as I sanity-check this, so that's a possible point of being misled. But according to that site,
1532786400 = 2018-07-28T10:00 -0400, and
1532804400 (that last result) = 2018-07-28T15:00 -0400, or 7pm GMT, a difference of 5 hours.
There are lots of questions on the subject of casting pandas Timestamps FROM a UNIX timestamp, but very little on questions casting TO epoch time. I can think of 2 possible explanations:
(1) tz_convert() is pulling some environment variable on my system that says I'm GMT -0500 and using that in the conversion process, in spite of that being irrelevant to converting between timezone-aware timestamps, and in so doing is actually changing the underlying point in time being represented. Or:
(2) Timestamp.strftime() is bugged and either ignoring the timezone parameter of a tz-aware timestamp or doing something truly bizarre when asked for a '%s' formatting parameter.
All advice greatly appreciated.

Find the daylight savings status of a naive datetime given the timezone [duplicate]

This question already has answers here:
How to make a timezone aware datetime object
(15 answers)
Closed 6 years ago.
I've got a datetime which has no timezone information. I'm now getting the timezone info and would like to add the timezone into the existed datetime instance, how can I do?
d = datetime.datetime.now()
tz = pytz.timezone('Asia/Taipei')
How to add the timezone info tz into datetime a

Use tz.localize(d) to localize the instance. From the documentation:
The first is to use the localize() method provided by the pytz library. This is used to localize a naive datetime (datetime with no timezone information):
>>> loc_dt = eastern.localize(datetime(2002, 10, 27, 6, 0, 0))
>>> print(loc_dt.strftime(fmt))
2002-10-27 06:00:00 EST-0500
If you don't use tz.localize(), but use datetime.replace(), chances are that a historical offset is used instead; tz.localize() will pick the right offset in effect for the given date. The US Eastern timezone DST start and end dates have changed over time, for example.
When you try to localize a datetime value that is ambiguous because it straddles the transition period from summer to winter time or vice-versa, the timezone will be consulted to see if the resulting datetime object should have .dst() return True or False. You can override the default for the timezone with the is_dst keyword argument for .localize():
dt = tz.localize(naive, is_dst=True)
or even switch off the choice altogether by setting is_dst=None. In that case, or in the rare cases there is no default set for a timezone, an ambiguous datetime value would lead to a AmbiguousTimeError exception being raised. The is_dst flag is only consulted for datetime values that are ambiguous and is ignored otherwise.
To go back the other way, turn a timezone-aware object back to a naive object, use .replace(tzinfo=None):
naivedt = awaredt.replace(tzinfo=None)

If you know that your original datetime was "measured" in the time zone you are trying to add to it, you could (but probably shouldn't) use replace rather than localize.
# d = datetime.datetime.now()
# tz = pytz.timezone('Asia/Taipei')
d = d.replace(tzinfo=tz)
I can imagine 2 times when this might make sense (the second one happened to me):
Your server locale is set to the incorrect time zone and you are trying to correct a datetime instance by making it aware of this incorrect timezone (and presumably later localizing it to the "correct" time zone so the values of now() match up to other times you are comparing it to (your watch, perhaps)
You want to "tag" a time instance (NOT a datetime) with a time zone (tzinfo) attribute so that attribute can be used later to form a full datetime instance.

timezone aware vs. timezone naive in python

I am working with datetime objects in python. I have a function that takes a time and finds the different between that time and now.
def function(past_time):
now = datetime.now()
diff = now - past_time
When I initialized past_time before passing it to this function I initialized it as datetime naive. And now is also a datetime naive object. However when I try to call this function I get the error: can't subtract offset-naive and offset-aware datetimes. How come this is the case if they are both theoretically datetime naive objects?
Any help would be appreciated. Thanks!

datetime doesn't do any cross time zone calculations, because it's a complex and involved subject.
I suggest converting dates to UTC universally and performing maths on those.
I recently completed a project using timezones in a large python/Django project and after investigation went with converting everything internally to UTC and converting only on display to the user.
You should look into pytz to do the conversions to/from UTC, and store Olson codes for the timezones you want in your app - perhaps associated with each user, or appropriate to your program.

Use :
now = now.replace(tzinfo=past_time.tzinfo)
before diff = now - past_time.
so that both now and past_time have same tzinfo.
only if now and past_time intended to be in same timezone.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.