Difference between datetime.strptime and parse from dateutil?

I am getting two different results in seconds when I parse the following time string:
Method 1:
from datetime import datetime
int(datetime.strptime('2015-03-25T19:46:23.286966Z', '%Y-%m-%dT%H:%M:%S.%fZ').timestamp())
yields 1427309183
Method 2:
from dateutil.parser import parse
int(parse('2015-03-25T19:46:23.286966Z').timestamp())
yields 1427312783
It seems that method 1 ignores the time zone while method 2 does not (I run this from a UTC+1 time zone).
Question: Why do these two methods yield different second timestamps? Can someone please explain what's going on under the hood and how to best handle such situations.
My goal is to convert the string to seconds in unix epoch time (i.e. utc).

If you take a look at the repr of your intermediate result (the datetime objects), you notice a difference:
from datetime import datetime
from dateutil.parser import parse
print(repr(datetime.strptime('2015-03-25T19:46:23.286966Z', '%Y-%m-%dT%H:%M:%S.%fZ')))
# datetime.datetime(2015, 3, 25, 19, 46, 23, 286966)
print(repr(parse('2015-03-25T19:46:23.286966Z')))
# datetime.datetime(2015, 3, 25, 19, 46, 23, 286966, tzinfo=tzutc())
The first one is naive: no tzinfo is set, because the format string matches the Z as a literal character. The second one is aware: tzinfo is set to UTC, since dateutil's parser recognizes the Z as a UTC designator. That accounts for the difference in the timestamps, since Python treats a naive datetime as local time; hence the difference of 1 hour, which is your local time's UTC offset.
You can parse it correctly like this (from Python 3.11 on, fromisoformat also accepts the trailing Z directly, so the replace is no longer needed):
print(repr(datetime.fromisoformat('2015-03-25T19:46:23.286966Z'.replace('Z', '+00:00'))))
# datetime.datetime(2015, 3, 25, 19, 46, 23, 286966, tzinfo=datetime.timezone.utc)
Or, less conveniently (imho), with strptime, where %z also matches the Z (Python 3.7+):
print(repr(datetime.strptime('2015-03-25T19:46:23.286966Z', '%Y-%m-%dT%H:%M:%S.%f%z')))
# datetime.datetime(2015, 3, 25, 19, 46, 23, 286966, tzinfo=datetime.timezone.utc)
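To reach the stated goal (Unix epoch seconds that do not depend on the machine's local time zone), one option is to make the datetime aware before calling timestamp(). This is a minimal sketch, not the only way to do it; the helper name to_epoch_seconds is made up for illustration.

```python
from datetime import datetime, timezone


def to_epoch_seconds(text):
    """Parse an ISO 8601 string ending in 'Z' and return whole epoch seconds."""
    # Swap the 'Z' suffix for an explicit offset so fromisoformat() also
    # accepts the string on Python versions older than 3.11.
    aware = datetime.fromisoformat(text.replace('Z', '+00:00'))
    return int(aware.timestamp())


# Alternative: keep the literal-Z strptime format, then attach UTC explicitly.
naive = datetime.strptime('2015-03-25T19:46:23.286966Z', '%Y-%m-%dT%H:%M:%S.%fZ')
epoch_from_naive = int(naive.replace(tzinfo=timezone.utc).timestamp())

epoch = to_epoch_seconds('2015-03-25T19:46:23.286966Z')
print(epoch, epoch_from_naive)
```

Both values match the dateutil result from the question, whatever the local time zone is.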

Related

How to handle a timestamp field where the offset is out of the acceptable bounds

I have a field in a data frame which is an ISO time with an offset:
pages[['dimension1', 'dimension3']].head()
dimension1 dimension3
1572461291083.sanyrqy8 2019-10-30T14:45:42.71-04:00
Most of the rows are fine, except that some have an offset outside 24 hours.
x = pd.to_datetime(pages.dimension3)
ValueError: offset must be a timedelta strictly between -timedelta(hours=24) and timedelta(hours=24).
Here's an example of a rogue data point that's causing this error:
2019-11-11T07:08:09.640-31:00
My current task is not to work out why the data exists in this form, but simply to get the raw data into Postgres.
Is there some kind of if/else logic I can apply to this field so that, when the offset is larger than 24 hours, to_datetime() sees it as 24 instead? This would change the rogue example above to 2019-11-11T07:08:09.640-24:00
How could I do that with Pandas?

Use dateutil. It's great for parsing dates that otherwise give errors.
import dateutil.parser
dateutil.parser.parse('2019-10-30T14:45:42.71-04:00')
# datetime.datetime(2019, 10, 30, 14, 45, 42, 710000, tzinfo=tzoffset(None, -14400))
dateutil.parser.parse('2019-11-11T07:08:09.640-31:00')
# datetime.datetime(2019, 11, 11, 7, 8, 9, 640000, tzinfo=tzoffset(None, -111600))
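If the goal is to cap the offset as the question asks, one possible approach is to rewrite the string before parsing. The sketch below assumes that capping at ±23:59 is acceptable, since the error message says the offset must be strictly inside ±24 hours, so -24:00 itself would still be rejected. The function name clamp_offset and the 23:59 limit are illustrative choices, not part of pandas or dateutil.

```python
import re

# Matches a trailing UTC offset such as "-04:00" or "-31:00".
_OFFSET_RE = re.compile(r'([+-])(\d{2}):(\d{2})$')


def clamp_offset(text, limit='23:59'):
    """Return text with an out-of-range trailing offset replaced by +/-limit."""
    def _fix(match):
        sign, hours = match.group(1), int(match.group(2))
        # Offsets of 24 hours or more are rejected by datetime, so cap them.
        if hours >= 24:
            return sign + limit
        # Valid offsets are left exactly as they were.
        return match.group(0)
    return _OFFSET_RE.sub(_fix, text)


print(clamp_offset('2019-11-11T07:08:09.640-31:00'))
print(clamp_offset('2019-10-30T14:45:42.71-04:00'))
```

On the data frame from the question, this could be applied with pages.dimension3.map(clamp_offset) before calling pd.to_datetime.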

Python3.5 Time Zone conversion

I have tried a number of posts/suggestions on here on converting time zone objects and have failed. I hope someone can point me to an easy way to do this.
I have a string/datetime of 2017-05-11T16:24:56-04:00
I can parse it a number of ways, dateutil, etc, into a datetime object.
When printed, I get
datetime.datetime(2017, 5, 11, 16, 24, 56, tzinfo=tzoffset(None, -14400))
so it gets a tzoffset.
Every conversion I try seems to update only the zone information, not the actual time portion.
How do I convert this string to my local time zone (EST, or offset -5 hrs)?
edit: trying astimezone() gets me this:
dt.astimezone()
Out[18]: datetime.datetime(2017, 5, 11, 16, 24, 56,
tzinfo=datetime.timezone(datetime.timedelta(-1, 72000), 'EDT'))
Thanks!

When you only replace the tzinfo (e.g. with tzutc()), you change just the offset suffix of the output string and the clock time stays the same: 2017-05-11 16:24:56+00:00
If you want to print it in your time zone, first create the datetime object using the actual timezone it represents:
from datetime import datetime
from dateutil.tz import tzoffset, tzutc
dt = datetime(2017, 5, 11, 16, 24, 56,
              tzinfo=tzoffset(None, -18000))
# 2017-05-11 16:24:56-05:00
And then convert it to the desired timezone using:
mydt = dt.astimezone(tzutc())
# 2017-05-11 21:24:56+00:00
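The difference between relabelling a datetime and converting it can be sketched with the standard library alone. The fixed -05:00 offset below stands in for EST and is an assumption made for illustration; a real zone database entry would also handle daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Parse the string from the question; the result is aware (-04:00).
dt = datetime.fromisoformat('2017-05-11T16:24:56-04:00')

# Fixed offset standing in for EST (illustrative assumption).
est = timezone(timedelta(hours=-5), 'EST')

converted = dt.astimezone(est)       # same instant, new wall-clock time
relabelled = dt.replace(tzinfo=est)  # same wall-clock time, different instant

print(converted.isoformat())   # 2017-05-11T15:24:56-05:00
print(relabelled.isoformat())  # 2017-05-11T16:24:56-05:00
```

astimezone() is the call that shifts the time portion; replace(tzinfo=...) only swaps the label.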

How to convert a timestamp string with 7 digits on the microseconds part using strptime?

Having a timestamp as string like 2016-09-22T13:57:31.2311892-04:00, how can one get the datetime object?
I've tried using strptime for this, but I got two issues:
I need to remove : from the timezone part, at the end, for %z to work properly.
The microseconds part has 7 digits, but strptime handles only up to 6 digits.
Is there a way to parse timestamps in this format without modifying* the string itself before passing to strptime?
* - by modifying, I think of removing the last microsecond digit, and removing the last :.
Note: This is for inserting a record in MySQL. If that helps.

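How about converting it like this (note that this drops the UTC offset, so the result is naive):
from datetime import datetime
s = '2016-09-22T13:57:31.2311892-04:00'
dt = datetime.strptime(s[:-len('2-04:00')], '%Y-%m-%dT%H:%M:%S.%f')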
# datetime.datetime(2016, 9, 22, 13, 57, 31, 231189)
https://docs.python.org/3/library/datetime.html#strftime-strptime-behavior
I also found a useful function in Django:
from django.utils.dateparse import parse_datetime
dt = parse_datetime('2016-09-22T13:57:31.2311892-04:00')
# datetime.datetime(2016, 9, 22, 13, 57, 31, 231189, tzinfo=<django.utils.timezone.FixedOffset object at 0x7f20184f8390>)
https://docs.djangoproject.com/en/2.0/ref/utils/#module-django.utils.dateparse
Another Pythonic option is maya (https://github.com/kennethreitz/maya):
# pip install maya
import maya
maya.parse('2016-09-22T13:57:31.2311892-04:00').datetime()
# datetime.datetime(2016, 9, 22, 17, 57, 31, 231189, tzinfo=<UTC>)
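If adding a dependency is not an option, a standard-library sketch is to trim the fraction to six digits and let fromisoformat handle the colon in the offset. This does modify the string, which the question hoped to avoid, but %f only accepts up to six digits. The helper name parse_long_fraction is made up for illustration.

```python
import re
from datetime import datetime


def parse_long_fraction(text):
    """Parse an ISO 8601 timestamp whose fractional part has more than 6 digits."""
    # Keep the first six fractional digits and drop the rest
    # (this truncates, it does not round).
    trimmed = re.sub(r'(\.\d{6})\d+', r'\1', text)
    # fromisoformat() (Python 3.7+) accepts the "-04:00" offset as written.
    return datetime.fromisoformat(trimmed)


dt = parse_long_fraction('2016-09-22T13:57:31.2311892-04:00')
print(repr(dt))
```

Unlike the slicing approach, this keeps the UTC offset, so the result is an aware datetime.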

Converting datetime object to timestamp and back gives me a different time

I have encountered this problem today and I don't have an explanation for it.
I have a Python datetime object:
dt = datetime.datetime(2012, 3, 31, 18, 30, 48, tzinfo=<FixedOffset '-04:00'>)
which, to my understanding is 18:30 in a time zone offset from UTC by 4 hours.
I then tried to convert it to timestamp like so:
epo = time.mktime(dt.timetuple()) and get back 1333247448.0.
However, when I try to convert it back to make sure it's correct using
datetime.datetime.fromtimestamp(epo),
I get back
datetime.datetime(2012, 3, 31, 19, 30, 48)
Notice that the hour is 19, not 18.
Can anybody tell me why it's doing that?

Try using
time.localtime(epo)
instead of
datetime.datetime.fromtimestamp(epo)
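A likely cause of the shift is that dt.timetuple() discards the UTC offset, and time.mktime() then reads the tuple as local time. One way to sidestep this, sketched below, is to go through UTC explicitly in both directions. The fixed -04:00 timezone stands in for the FixedOffset object from the question and is an assumption.

```python
import calendar
from datetime import datetime, timedelta, timezone

# Fixed -04:00 offset standing in for the FixedOffset object in the question.
tz = timezone(timedelta(hours=-4))
dt = datetime(2012, 3, 31, 18, 30, 48, tzinfo=tz)

# utctimetuple() applies the offset first; timegm() then reads the tuple as UTC.
epo = calendar.timegm(dt.utctimetuple())

# On Python 3.3+, timestamp() on an aware datetime refers to the same instant.
same_instant = epo == int(dt.timestamp())

# Pass a tzinfo when converting back so the result is not shifted to local time.
back = datetime.fromtimestamp(epo, tz=tz)
print(back, same_instant)
```

With this round trip, the hour stays at 18 regardless of the machine's local time zone.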

Python datetime not including DST when using pytz timezone

If I convert a UTC datetime to Swedish time, summer time is included (CEST). However, when I create a datetime with Sweden as the time zone, it gets CET instead of CEST. Why is this?
>>> # Modified for readability
>>> import pytz
>>> import datetime
>>> sweden = pytz.timezone('Europe/Stockholm')
>>>
>>> datetime.datetime(2010, 4, 20, 16, 20, tzinfo=pytz.utc).astimezone(sweden)
datetime(2010, 4, 20, 18, 20, tzinfo=<... 'Europe/Stockholm' CEST+2:00:00 DST>)
>>>
>>> datetime.datetime(2010, 4, 20, 18, 20, tzinfo=sweden)
datetime(2010, 4, 20, 18, 20, tzinfo=<... 'Europe/Stockholm' CET+1:00:00 STD>)
>>>

The sweden object specifies the CET time zone by default, but contains enough information to know when CEST starts and stops.
In the first example, you create a datetime object and convert it to local time. The sweden object knows that the UTC time you passed occurs during daylight savings time and can convert it appropriately.
In the second example, the datetime constructor always interprets your input as not-daylight-savings-time and returns an appropriate object.
If datetime treated your input as wall-clock time and chose the appropriate daylight-savings setting for you, there would be an ambiguity during the time of year when clocks are set back. On a wall-clock the same hour occurs twice. Hence, datetime forces you to specify which timezone you're using when you create the datetime object.

Timezone abbreviations are not unique. For example, "IST" could refer to "Irish Standard Time", "Iranian Standard Time", "Indian Standard Time" or "Israeli Standard Time". You shouldn't rely on parsing them, and should use zoneinfo timezones instead.
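With pytz itself, the usual route is sweden.localize(naive_datetime) rather than passing tzinfo= to the constructor. With zoneinfo (standard library since Python 3.9), the constructor form from the question picks the right offset, and the fold attribute resolves the repeated hour when clocks are set back. A small sketch, assuming a time zone database is available on the system (or via the tzdata package):

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # standard library since Python 3.9

stockholm = ZoneInfo('Europe/Stockholm')

# The constructor input is treated as wall-clock time in the zone,
# so a date in April gets the summer-time offset.
summer = datetime(2010, 4, 20, 18, 20, tzinfo=stockholm)
print(summer.tzname(), summer.utcoffset())

# 02:30 on 2010-10-31 occurred twice; fold selects which occurrence is meant.
first = datetime(2010, 10, 31, 2, 30, tzinfo=stockholm)           # fold=0, before the change
second = datetime(2010, 10, 31, 2, 30, fold=1, tzinfo=stockholm)  # after the change
print(first.tzname(), second.tzname())
```

Here fold=0 refers to the first pass through the ambiguous hour (still summer time) and fold=1 to the second.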
