Correctly parse date string with timezone information - python

I'm receiving a formatted date string like this via the Pivotal Tracker API: "2012/06/05 17:42:29 CEST"
I want to convert this string to a UTC datetime object, but python-dateutil does not seem to recognize that timezone, and pytz doesn't know it either.
I fear my best bet is to replace CEST in the string with CET, but this feels very wrong. Is there any other way to parse summer time strings to UTC datetime objects I couldn't find?
pytz.timezone('CEST')
# -> pytz.exceptions.UnknownTimeZoneError: 'CEST'
dateutil.parser.parse("2012/06/05 17:42:29 CEST")
# -> datetime.datetime(2012, 6, 5, 17, 42, 29)
Edit: After thinking about it again, subtracting one hour is simply wrong, because the corresponding timezone is also currently in summer time. The parsing issue still stands.

There is no real CEST timezone. Use Europe/Paris, Europe/Berlin or Europe/Prague (or another one) according to your region:
>>> pytz.country_timezones('de')
[u'Europe/Berlin']
>>> pytz.country_timezones('fr')
[u'Europe/Paris']
They are (currently) identical, and all of them use CEST in summer.
>>> dateutil.parser.parse("2012/06/05 17:42:29 CEST").astimezone(pytz.utc)
datetime.datetime(2012, 6, 5, 15, 42, 29, tzinfo=<UTC>)
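If the abbreviation has to be handled directly, one possible approach is to treat it as a fixed UTC offset. The following is a minimal standard-library sketch, not part of the original answer: the `ABBREV_OFFSETS` table and the `parse_to_utc` helper are hypothetical names introduced for illustration, and the table only covers CET and CEST. Abbreviations are not unique in general, so such a table has to be chosen for the data source at hand.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical lookup table: abbreviation -> fixed UTC offset.
# Only the two abbreviations discussed here are listed.
ABBREV_OFFSETS = {
    "CET": timezone(timedelta(hours=1), "CET"),
    "CEST": timezone(timedelta(hours=2), "CEST"),
}


def parse_to_utc(text):
    """Parse 'YYYY/MM/DD HH:MM:SS ABBR' and return an aware UTC datetime."""
    stamp, abbrev = text.rsplit(" ", 1)
    naive = datetime.strptime(stamp, "%Y/%m/%d %H:%M:%S")
    # An unknown abbreviation raises KeyError instead of being guessed.
    aware = naive.replace(tzinfo=ABBREV_OFFSETS[abbrev])
    return aware.astimezone(timezone.utc)


result = parse_to_utc("2012/06/05 17:42:29 CEST")
print(result)  # 2012-06-05 15:42:29+00:00
```

Because the offset comes from the abbreviation itself, no daylight-saving lookup is needed: "CEST" already says that summer time is in effect.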

Related

How can I convert dates from past to UTC time?

I'm trying to convert Bill Clinton's birth time
August 19, 1946, 8:51 AM - Hope, Arkansas
to UTC, but I keep getting discrepancies.
In python when I do:
from datetime import datetime, timezone
from zoneinfo import ZoneInfo
dt = datetime(1946, 8, 19, 8, 51, tzinfo=ZoneInfo('America/Chicago'))
dt.astimezone(timezone.utc)
# This returns
# datetime.datetime(1946, 8, 19, 13, 51, tzinfo=datetime.timezone.utc)
Every library I use, and even online date-to-UTC converters, convert the date to 13:51 UTC.
However, official astrology datasets such as https://www.astro-seek.com/birth-chart/bill-clinton-horoscope and https://www.astro.com/astro-databank/Clinton,_Bill
all compute his natal chart using 14:51 UTC as his birth time, even though they use the same local time.
My question is: What is causing this discrepancy and how do I know which UTC time is right?

Why does Python's datetime strptime() not set timezone when %Z is specified in a string? [duplicate]

I have a CSV dumpfile from a Blackberry IPD backup, created using IPDDump.
The date/time strings in here look something like this
(where EST is an Australian time-zone):
Tue Jun 22 07:46:22 EST 2010
I need to be able to parse this date in Python. At first, I tried to use the strptime() function from datetime.
>>> datetime.datetime.strptime('Tue Jun 22 12:10:20 2010 EST', '%a %b %d %H:%M:%S %Y %Z')
However, for some reason, the datetime object that comes back doesn't seem to have any tzinfo associated with it.
I did read on this page that apparently datetime.strptime silently discards tzinfo, however, I checked the documentation, and I can't find anything to that effect documented here.
Is there any way to get strptime() to play nicely with timezones?
I recommend using python-dateutil. Its parser has been able to parse every date format I've thrown at it so far.
>>> from dateutil import parser
>>> parser.parse("Tue Jun 22 07:46:22 EST 2010")
datetime.datetime(2010, 6, 22, 7, 46, 22, tzinfo=tzlocal())
>>> parser.parse("Fri, 11 Nov 2011 03:18:09 -0400")
datetime.datetime(2011, 11, 11, 3, 18, 9, tzinfo=tzoffset(None, -14400))
>>> parser.parse("Sun")
datetime.datetime(2011, 12, 18, 0, 0)
>>> parser.parse("10-11-08")
datetime.datetime(2008, 10, 11, 0, 0)
and so on. No dealing with strptime() format nonsense... just throw a date at it and it Does The Right Thing.
The datetime module documentation says:
Return a datetime corresponding to date_string, parsed according to format. This is equivalent to datetime(*(time.strptime(date_string, format)[0:6])).
See that [0:6]? That gets you (year, month, day, hour, minute, second). Nothing else. No mention of timezones.
Interestingly, [Win XP SP2, Python 2.6, 2.7] passing your example to time.strptime doesn't work but if you strip off the " %Z" and the " EST" it does work. Also using "UTC" or "GMT" instead of "EST" works. "PST" and "MEZ" don't work. Puzzling.
It's worth noting this has been updated as of version 3.2 and the same documentation now also states the following:
When the %z directive is provided to the strptime() method, an aware datetime object will be produced. The tzinfo of the result will be set to a timezone instance.
Note that this doesn't work with %Z, so the case is important. See the following example:
In [1]: from datetime import datetime
In [2]: start_time = datetime.strptime('2018-04-18-17-04-30-AEST','%Y-%m-%d-%H-%M-%S-%Z')
In [3]: print("TZ NAME: {tz}".format(tz=start_time.tzname()))
TZ NAME: None
In [4]: start_time = datetime.strptime('2018-04-18-17-04-30-+1000','%Y-%m-%d-%H-%M-%S-%z')
In [5]: print("TZ NAME: {tz}".format(tz=start_time.tzname()))
TZ NAME: UTC+10:00
Since strptime returns a datetime object, which has a tzinfo attribute, we can simply replace it with the desired timezone.
>>> import datetime
>>> date_time_str = '2018-06-29 08:15:27.243860'
>>> date_time_obj = datetime.datetime.strptime(date_time_str, '%Y-%m-%d %H:%M:%S.%f').replace(tzinfo=datetime.timezone.utc)
>>> date_time_obj.tzname()
'UTC'
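Note that replace(tzinfo=...) only labels the existing clock reading; it does not shift it. That is what you want when the string is already in the zone you attach, as above. The sketch below illustrates the difference from astimezone(); the fixed +10:00 offset and the variable names are assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone

naive = datetime.strptime("2018-04-18 17:04:30", "%Y-%m-%d %H:%M:%S")
aest = timezone(timedelta(hours=10))  # assumed fixed offset for this example

# replace() attaches the offset without changing the wall-clock fields.
labelled = naive.replace(tzinfo=aest)

# astimezone() converts an aware datetime to the same instant in another zone.
as_utc = labelled.astimezone(timezone.utc)

print(labelled.isoformat())  # 2018-04-18T17:04:30+10:00
print(as_utc.isoformat())    # 2018-04-18T07:04:30+00:00
```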
Your time string is similar to the time format in RFC 2822 (the date format used in email and HTTP headers). You could parse it using only the stdlib:
>>> from email.utils import parsedate_tz
>>> parsedate_tz('Tue Jun 22 07:46:22 EST 2010')
(2010, 6, 22, 7, 46, 22, 0, 1, -1, -18000)
See solutions that yield timezone-aware datetime objects for various Python versions: parsing date with timezone from an email.
In this format, EST is semantically equivalent to -0500. In general, though, a timezone abbreviation is not enough to identify a timezone uniquely.
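As a sketch of how that tuple could be turned into an aware UTC datetime with the standard library only, email.utils.mktime_tz can convert it to a UTC timestamp. The variable names are illustrative. Keep in mind that this reads EST as -0500 (US Eastern), which is not the Australian zone the question mentions.

```python
from datetime import datetime, timezone
from email.utils import mktime_tz, parsedate_tz

fields = parsedate_tz("Tue Jun 22 07:46:22 EST 2010")
if fields is None:
    raise ValueError("unparseable date string")

# fields[9] is the UTC offset in seconds (-18000 for EST in this format).
timestamp = mktime_tz(fields)  # seconds since the epoch, offset already applied

utc_dt = datetime.fromtimestamp(timestamp, tz=timezone.utc)
print(utc_dt)  # 2010-06-22 12:46:22+00:00
```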
Ran into this exact problem.
What I ended up doing:
# starting with date string
sdt = "20190901"
sdt_format = '%Y%m%d'
# create naive datetime object
from datetime import datetime
dt = datetime.strptime(sdt, sdt_format)
# extract the relevant date time items
dt_formatters = ['%Y','%m','%d']
dt_vals = tuple(map(lambda formatter: int(datetime.strftime(dt,formatter)), dt_formatters))
# set timezone
import pendulum
tz = pendulum.timezone('utc')
dt_tz = datetime(*dt_vals,tzinfo=tz)

Convert non-UTC time string with timezone abbreviation into UTC time in python, while accounting for daylight savings

I am having a hard time converting a string representation of non-UTC times to UTC due to the timezone abbreviation.
(Update: it seems that timezone abbreviations may not be unique. If so, perhaps I should also be trying to take this into account.)
I've been trying to look for a way around this using dateutil and pytz, but haven't had any luck.
Suggestions or workaround would be appreciated.
string = "Jun 20, 4:00PM EDT"
I'd like to convert that into UTC time, accounting for daylight savings when appropriate.
UPDATE: Found some references that may help more experienced users answer the Q.
Essentially, I would imagine part of the solution doing the reverse of this.
FINAL UPDATE (IMPORTANT)
Taken from the dateutil docs examples.
Some simple examples based on the date command, using the TZOFFSETS dictionary to provide the BRST timezone offset.
parse("Thu Sep 25 10:36:28 BRST 2003", tzinfos=TZOFFSETS)
datetime.datetime(2003, 9, 25, 10, 36, 28,
tzinfo=tzoffset('BRST', -10800))
parse("2003 10:36:28 BRST 25 Sep Thu", tzinfos=TZOFFSETS)
datetime.datetime(2003, 9, 25, 10, 36, 28,
tzinfo=tzoffset('BRST', -10800))
Combine this with a library such as the one found here, and you will have a solution to this problem.
Using Nas Banov's excellent dictionary mapping timezone abbreviations to UTC offset:
import dateutil.parser  # a bare "import dateutil" does not load the parser submodule
import pytz
# timezone dictionary built here: https://stackoverflow.com/a/4766400/366335
# tzd = {...}
string = 'Jun 20, 4:00PM EDT'
date = dateutil.parser.parse(string, tzinfos=tzd).astimezone(pytz.utc)

Python datetime not including DST when using pytz timezone

If I convert a UTC datetime to Swedish time, summer time is included (CEST). However, when I create a datetime with Sweden as the timezone, it gets CET instead of CEST. Why is this?
>>> # Modified for readability
>>> import pytz
>>> import datetime
>>> sweden = pytz.timezone('Europe/Stockholm')
>>>
>>> datetime.datetime(2010, 4, 20, 16, 20, tzinfo=pytz.utc).astimezone(sweden)
datetime(2010, 4, 20, 18, 20, tzinfo=<... 'Europe/Stockholm' CEST+2:00:00 DST>)
>>>
>>> datetime.datetime(2010, 4, 20, 18, 20, tzinfo=sweden)
datetime(2010, 4, 20, 18, 20, tzinfo=<... 'Europe/Stockholm' CET+1:00:00 STD>)
>>>
The sweden object specifies the CET time zone by default but contains enough information to know when CEST starts and stops.
In the first example, you create a datetime object and convert it to local time. The sweden object knows that the UTC time you passed occurs during daylight saving time and can convert it appropriately.
In the second example, the datetime constructor always interprets your input as not being in daylight saving time and returns an object with the default offset. With pytz, the documented way to attach a zone to a naive wall-clock time is sweden.localize(...), which looks up the correct offset for that date.
If datetime treated your input as wall-clock time and chose the appropriate daylight saving setting for you, there would be an ambiguity at the time of year when clocks are set back: on a wall clock, the same hour occurs twice. Hence, datetime forces you to specify which timezone you're using when you create the datetime object.
Timezone abbreviations are not unique. For example, "IST" could refer to "Irish Standard Time", "Iranian Standard Time", "Indian Standard Time" or "Israeli Standard Time". You shouldn't rely on parsing them; use zoneinfo timezones instead.
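For comparison, here is a minimal sketch of the same two cases with the standard-library zoneinfo module (Python 3.9+, and it assumes an IANA time zone database is available on the system or via the tzdata package). Unlike pytz, a ZoneInfo object can be passed straight to the datetime constructor, and the fold attribute picks between the two occurrences of a repeated wall-clock hour. The variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

sweden = ZoneInfo("Europe/Stockholm")

# The offset is looked up for the given date, so April falls in CEST.
summer = datetime(2010, 4, 20, 18, 20, tzinfo=sweden)
print(summer.tzname(), summer.utcoffset())  # CEST 2:00:00
print(summer.astimezone(timezone.utc))      # 2010-04-20 16:20:00+00:00

# 02:30 on 2010-10-31 occurs twice, because clocks go back from 03:00 to 02:00.
first = datetime(2010, 10, 31, 2, 30, tzinfo=sweden)           # fold=0: still CEST
second = datetime(2010, 10, 31, 2, 30, fold=1, tzinfo=sweden)  # fold=1: already CET
print(first.utcoffset(), second.utcoffset())  # 2:00:00 1:00:00
```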
