Parse timezone from a human time input - python

I'm having trouble parsing out timezone information from a string that looks like:
8pm PST on sunday
So far, parsedatetime and dateutil let me parse the date, but the timezone usually causes issues.
Does anyone know of a library that handles this sort of thing? My fallback is to naively parse out the timezones via a regex or a simple "PST" in datestring check.

The abbreviations you're using are not unique; you will therefore need to interpret the time zones somehow (e.g. assume United States) and specify what each abbreviation means for your application:
from dateutil import parser
# map time zones to seconds from GMT
zones = {
'EST': -5 * 3600,
'PST': -8 * 3600
}
parser.parse('8 PM on Sunday PST', tzinfos=zones)
# datetime.datetime(2016, 4, 24, 20, 0, tzinfo=tzoffset('PST', -28800))
You can install dateutil with pip: pip install python-dateutil.
See this similar question for more information.
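If a dedicated parser is not an option, the regex fallback mentioned in the question can be sketched with the standard library alone. This is only an illustration: `ABBREVIATIONS` and `extract_zone` are hypothetical names, and the offsets assume the United States meanings of the abbreviations.

```python
import re
from datetime import timedelta, timezone

# Assumption: US meanings of these abbreviations; extend for your application.
ABBREVIATIONS = {
    "EST": timezone(timedelta(hours=-5), "EST"),
    "PST": timezone(timedelta(hours=-8), "PST"),
}

# Match any known abbreviation as a whole word, case-insensitively.
_ZONE_RE = re.compile(r"\b(" + "|".join(ABBREVIATIONS) + r")\b", re.IGNORECASE)


def extract_zone(text):
    """Return (text without the abbreviation, tzinfo or None)."""
    match = _ZONE_RE.search(text)
    if match is None:
        return text, None
    tz = ABBREVIATIONS[match.group(1).upper()]
    remainder = text[:match.start()] + text[match.end():]
    remainder = " ".join(remainder.split())  # collapse leftover whitespace
    return remainder, tz


remainder, tz = extract_zone("8pm PST on sunday")
print(remainder)  # 8pm on sunday
print(tz)         # PST
```

The remaining text can then be handed to whichever date parser you already use, and the returned tzinfo attached to its result with `datetime.replace(tzinfo=tz)`.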

Related

Convert an RFC 3339 nano time to Python datetime

Is there an easy way to convert an RFC 3339 nano time into a regular Python timestamp?
For example, time = '2022-07-14T12:01:25.225089838+08:00',
I found a way using datetime
from datetime import datetime
time = '2022-07-14T12:01:25.225089+08:00'
date = datetime.fromisoformat(time) # good
time = '2022-07-14T12:01:25.225089838+08:00'
date = datetime.fromisoformat(time) # error
It works with string like '2022-07-14T12:01:25.225089+08:00', but it doesn't work with the time above.
There are a few ways to do it, depending on the input format and on what you consider easy.
Many existing posts cover similar issues.
I'll list a few at the end for reference; please search before posting next time.
The main limitation of the datetime object is that it only holds 6 digits after the second (microseconds).
You will need a different data structure if you want to preserve all of the digits.
If you are OK with cutting off at 6 digits, FObersteiner's answer is perfect.
Another approach is plain datetime string parsing:
from datetime import datetime
# Strip the UTC offset first (str.removesuffix needs Python 3.9+)
date = '2022-07-14T12:01:25.225089838+08:00'.removesuffix('+08:00')
# Drop the last 3 fractional digits so %f sees at most 6; the result is a naive datetime
x = datetime.strptime(date[:-3], '%Y-%m-%dT%H:%M:%S.%f')
If you would like to preserve all of the digits, you may want to create your own class extending the datetime class, or write a helper function for it.
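One way to keep every digit without subclassing is a small helper that returns the datetime (truncated to microseconds) together with the full nanosecond count. This is a minimal sketch, not an established API: `parse_rfc3339_nanos` is a hypothetical name, the regex only covers the `YYYY-MM-DDTHH:MM:SS[.fraction](Z|±HH:MM)` shape, and it assumes Python 3.7+ for `datetime.fromisoformat`.

```python
import re
from datetime import datetime

# Assumption: input looks like 2022-07-14T12:01:25.225089838+08:00 (or ends in Z).
_RFC3339 = re.compile(
    r"^(?P<base>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})"
    r"(?:\.(?P<frac>\d+))?"
    r"(?P<tz>Z|[+-]\d{2}:\d{2})$"
)


def parse_rfc3339_nanos(text):
    """Return (aware datetime truncated to microseconds, nanoseconds as int)."""
    match = _RFC3339.match(text)
    if match is None:
        raise ValueError("not an RFC 3339 timestamp: %r" % text)
    # Pad or cut the fraction to exactly 9 digits (nanoseconds).
    frac = (match.group("frac") or "").ljust(9, "0")[:9]
    tz = "+00:00" if match.group("tz") == "Z" else match.group("tz")
    # fromisoformat accepts at most 6 fractional digits, so pass only those.
    dt = datetime.fromisoformat(match.group("base") + "." + frac[:6] + tz)
    return dt, int(frac)


dt, nanos = parse_rfc3339_nanos("2022-07-14T12:01:25.225089838+08:00")
print(dt)     # 2022-07-14 12:01:25.225089+08:00
print(nanos)  # 225089838
```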
Convert an RFC 3339 time to a standard Python timestamp
Parsing datetime strings containing nanoseconds
from datetime.fromisoformat docs:
Caution: This does not support parsing arbitrary ISO 8601 strings - it is only intended as the inverse operation of datetime.isoformat(). A more full-featured ISO 8601 parser, dateutil.parser.isoparse is available in the third-party package dateutil.
dateutil's isoparse will do the job:
from dateutil.parser import isoparse
time = '2022-07-14T12:01:25.225089838+08:00'
date = isoparse(time)
print(date)
# 2022-07-14 12:01:25.225089+08:00
print(repr(date))
# datetime.datetime(2022, 7, 14, 12, 1, 25, 225089, tzinfo=tzoffset(None, 28800))
Note: isoparse doesn't round to microseconds; it just slices off the last 3 decimal places. So if you're dealing with a standardized format like RFC 3339, you can do the slicing yourself:
from datetime import datetime
time = '2022-07-14T12:01:25.225089838+08:00'
date = datetime.fromisoformat(time[:-9] + time[-6:])
print(date)
# 2022-07-14 12:01:25.225089+08:00

python strptime vs dateutil - recommended use

I need to convert between strings and datetime objects quite often. Up until now I have always used strptime and strftime.
I started working with the Google Calendar API, where I receive strings like this: '2018-03-17T09:00:00+01:00'
It seems like I need to convert the +01:00 into 0100 for strptime, which is a little annoying.
While I don't have this issue with dateutil, there are a few other inconveniences. Also, I saw that the last update of dateutil was in 2016, which seems odd.
So my question is: which one would you recommend for adding and subtracting dates and datetimes and for switching between strings and datetime objects?
Also, is dateutil still maintained, or is it outdated?
Thanks a lot!
- Sally
It seems like I need to convert the +01:00 into 0100 for strptime
No you don't.
For one thing, the standard format is +0100, not 0100.
For another, strptime handles +01:00 just fine (on Python 3.7 and later, where %z accepts a colon in the offset):
>>> import datetime
>>> datetime.datetime.strptime('2018-08-13T11:18:24+00:00',
... '%Y-%m-%dT%H:%M:%S%z')
datetime.datetime(2018, 8, 13, 11, 18, 24, tzinfo=datetime.timezone.utc)
>>> datetime.datetime.strptime('2018-08-13T11:18:24+01:00',
... '%Y-%m-%dT%H:%M:%S%z')
datetime.datetime(2018, 8, 13, 11, 18, 24, tzinfo=datetime.timezone(datetime.timedelta(seconds=3600)))
So, the problem you're trying to solve doesn't exist in the first place.
While I don't have this issue with dateutil, there are a few other inconveniences. Also, I saw that the last update of dateutil was in 2016, which seems odd.
As of 13 Aug 2018, the last update to dateutil was 2 days ago. And the last official release to PyPI, version 2.7.3, was 3 months ago.
So, your secondary problem doesn't exist either.
So my question is: which one would you recommend for adding and subtracting dates and datetimes and for switching between strings and datetime objects?
Since dateutil just gives you the same datetime objects that datetime gives you, neither one is better for adding and subtracting dates and datetimes.
For converting to and from string format, sometimes dateutil is more convenient, and it also supports a wider range of formats that you don't care about—but for what you're doing, they both work fine, so there's no difference. If you expect to need other formats in the future, it might be worth bringing in dateutil, but if not, you might as well stick with the standard library.
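As a rough illustration of the point that the arithmetic is identical once you have a datetime object, here is a standard-library round trip using the string from the question. It assumes Python 3.7+, where `%z` accepts the colon form of the offset.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%dT%H:%M:%S%z"
raw = "2018-03-17T09:00:00+01:00"

# Parse the string into an aware datetime.
start = datetime.strptime(raw, FMT)

# Arithmetic works the same whichever parser produced the object.
end = start + timedelta(days=1, hours=2)

print(end.isoformat())    # 2018-03-18T11:00:00+01:00
print(end.strftime(FMT))  # 2018-03-18T11:00:00+0100
print(end - start)        # 1 day, 2:00:00
```

Note that `isoformat()` writes the offset with a colon, while `strftime` with `%z` writes it without one.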
The latest dateutil release (2.7.3) came out in May 2018; it just says "copyright 2016" somewhere in the credits. Moreover, the documentation discusses the policy for future versions, so the project seems quite active. I would suggest preferring it over strptime. However, be sure to get the latest version: previous versions had a bug converting ISO dates, which is the format you are working with.

Timezone not available in python, but the system timezone is properly set

As specified in the documentation:
%Z -> Time zone name (no characters if no time zone exists).
According to date, my system has the time zone properly set:
gonvaled@pegasus ~ » date
Sat Sep 28 09:14:29 CEST 2013
But this test:
def test_timezone():
    from datetime import datetime
    dt = datetime.now()  # naive datetime: no tzinfo attached
    print(dt.strftime('%Y-%m-%d %H:%M:%S%Z'))

test_timezone()
Produces:
gonvaled@pegasus ~ » python test_timezone.py
2013-09-28 09:19:10
Without time zone information. Why is that? How can I force python to output time zone info?
I have also tried reconfiguring the time zone with tzselect, but that has not helped.
Standard Python datetime.datetime objects do not have a timezone object attached to them by default, so %Z renders as an empty string. The system time is taken as is.
You'll need to install Python timezone support in the form of the pytz package; timezone definitions change too frequently to be bundled with Python itself.
pytz does not tell you what timezone your machine has been configured with. You can use the python-dateutil module for that; it has a dateutil.tz.gettz() function that returns the timezone currently in use. This is much more reliable than what Python can get from the limited C API:
>>> import datetime
>>> from dateutil.tz import gettz
>>> datetime.datetime.now(gettz())
datetime.datetime(2013, 9, 28, 8, 34, 14, 680998, tzinfo=tzfile('/etc/localtime'))
>>> datetime.datetime.now(gettz()).strftime('%Y-%m-%d %H:%M:%S%Z')
'2013-09-28 08:36:01BST'
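On Python 3.6 and later (an assumption, since the question predates those versions), the standard library can also attach the system's local zone: calling `astimezone()` with no arguments on a naive datetime treats it as local time. A minimal sketch:

```python
from datetime import datetime

naive = datetime.now()               # no tzinfo attached
aware = datetime.now().astimezone()  # local zone attached as a fixed offset

print(repr(naive.strftime("%Z")))    # '' because there is no time zone
print(aware.strftime("%Y-%m-%d %H:%M:%S %Z"))
```

The tzinfo produced this way is a fixed-offset `datetime.timezone`, so it describes the offset in effect at that moment rather than the full rule set of the zone.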

Python strptime or alternative for complex date string parsing

I have been given a large list of date-time representations that need to be read into a database. I am using Python (because it rocks). The strings are in a terrible, terrible format where they are not precise to seconds, no timezone is stated, and the hours do not have a leading 0. So they look more like this:
April 29, 2013, 7:52 p.m.
April 30, 2013, 4 p.m.
You'll notice that if something happens between 4:00 and 4:01 it drops the minutes, too (ugh). Anyway, trying to parse these with time.strptime, but the docs state that hours must be decimal numbers [01:12] (or [01:24]). Since nothing is padded with 0's I'm wondering if there is something else I can pass to strptime to accept hours without leading 0; or if I should try splitting, then padding the strings; or use some other method of constructing the datetime object.
Also, it does not look like strptime accepts AM/PM as "A.M." or "P.M.", so I'll have to correct that as well. . .
Note, I am not able to just handle these strings in a batch. I receive them one-at-a-time from a foreign application which sometimes uses nicely formatted Unix epoch timestamps, but occasionally uses this format. Processing them on the fly is the only option.
I am using Python 2.7 with some Python 3 features imported.
from __future__ import (print_function, unicode_literals)
The most flexible parser is part of the dateutil package; it eats your input for breakfast:
>>> from dateutil import parser
>>> parser.parse('April 29, 2013, 7:52 p.m.')
datetime.datetime(2013, 4, 29, 19, 52)
>>> parser.parse('April 30, 2013, 4 p.m.')
datetime.datetime(2013, 4, 30, 16, 0)
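If adding a dependency is not possible, the same two inputs can be handled with strptime after a little normalization. This is a sketch under assumptions: `parse_loose` is a hypothetical helper, only the two layouts shown in the question are covered, and `%B` and `%p` depend on an English (or C) locale. In practice strptime accepts hours without a leading zero for `%I`.

```python
from datetime import datetime

# Candidate layouts: with minutes, then without.
_FORMATS = ("%B %d, %Y, %I:%M %p", "%B %d, %Y, %I %p")


def parse_loose(text):
    """Parse strings like 'April 30, 2013, 4 p.m.' into a naive datetime."""
    # Turn "a.m." / "p.m." into the "AM" / "PM" that %p expects.
    cleaned = text.strip().replace("a.m.", "AM").replace("p.m.", "PM")
    for fmt in _FORMATS:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue  # try the next layout
    raise ValueError("unrecognised date string: %r" % text)


print(parse_loose("April 29, 2013, 7:52 p.m."))  # 2013-04-29 19:52:00
print(parse_loose("April 30, 2013, 4 p.m."))     # 2013-04-30 16:00:00
```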

Python - Setting a datetime in a specific timezone (without UTC conversions)

Just to be clear, this is python 2.6, I am using pytz.
This is for an application that only deals with US timezones, I need to be able to anchor a date (today), and get a unix timestamp (epoch time) for 8pm and 11pm in PST only.
This is driving me crazy.
> pacific = pytz.timezone("US/Pacific")
> datetime(2011,2,11,20,0,0,0,pacific)
datetime.datetime(2011, 2, 11, 20, 0, tzinfo=<DstTzInfo 'US/Pacific' PST-1 day, 16:00:00 STD>)
> datetime(2011,2,11,20,0,0,0,pacific).strftime("%s")
'1297454400'
zsh> date -d '@1297454400'
Fri Feb 11 12:00:00 PST 2011
So, even though I am setting up a timezone, and creating the datetime with that time zone, it is still creating it as UTC and then converting it. This is more of a problem since UTC will be a day ahead when I am trying to do the calculations.
Is there an easy (or at least sensical) way to generate a timestamp for 8pm PST today?
(to be clear, I do understand the value of using UTC in most situations, like database timestamps, or for general storage. This is not one of those situations, I specifically need a timestamp for evening in PST, and UTC should not have to enter into it.)
There are at least two issues:
1. You shouldn't pass a timezone with a non-fixed UTC offset, such as "US/Pacific", directly as the tzinfo parameter. Use the pytz.timezone("US/Pacific").localize() method instead.
2. .strftime('%s') is not portable, it ignores tzinfo, and it always uses the local timezone. Use datetime.timestamp(), or its analogs on older Python versions, instead.
To make a timezone-aware datetime in the given timezone:
#!/usr/bin/env python
from datetime import datetime
import pytz # $ pip install pytz
tz = pytz.timezone("US/Pacific")
aware = tz.localize(datetime(2011, 2, 11, 20), is_dst=None)
To get POSIX timestamp:
timestamp = (aware - datetime(1970, 1, 1, tzinfo=pytz.utc)).total_seconds()
(On Python 2.6, which lacks the timedelta.total_seconds() method, see the totimestamp() function for how to emulate it.)
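For comparison, if the application truly only needs PST as a fixed UTC-8 offset (an assumption: this ignores daylight saving time, unlike "US/Pacific"), the same timestamps can be sketched with the standard library on Python 3.3+. A fixed-offset `datetime.timezone` is safe to pass straight to the constructor.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-8 offset (PST only, no daylight saving time).
PST = timezone(timedelta(hours=-8), "PST")
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

eight_pm = datetime(2011, 2, 11, 20, tzinfo=PST)
eleven_pm = eight_pm.replace(hour=23)

# Both expressions give the same POSIX timestamp.
print(eight_pm.timestamp())                # 1297483200.0
print((eight_pm - EPOCH).total_seconds())  # 1297483200.0
print(eleven_pm.timestamp())               # 1297494000.0
```

The first value is 8 hours (28800 seconds) later than the 1297454400 shown in the question, which corresponded to 20:00 UTC rather than 20:00 PST.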
Create a tzinfo object utc for the UTC time zone, then try this:
#XXX: WRONG (for any timezone with a non-fixed utc offset), DON'T DO IT
datetime(2011,2,11,20,0,0,0,pacific).astimezone(utc).strftime("%s")
Edit: As pointed out in the comments, putting the timezone into the datetime constructor isn't always robust. The preferred method according to the pytz documentation would be:
pacific.localize(datetime(2011,2,11,20,0,0,0)).astimezone(utc).strftime("%s")
Also note from the comments that strftime("%s") isn't reliable, it ignores the time zone information (even UTC) and assumes the time zone of the system it's running on. It relies on an underlying C library implementation and doesn't work at all on some systems (e.g. Windows).
