Conversion between Pandas Timestamp, DateTime, Unix Timestamp with Timezone / without Timezone Info - python

I created the following test case to understand the conversion from a datetime to a pandas Timestamp, to a Unix timestamp, and back to a datetime, with and without time zone info, using Python 3.6.
from unittest import TestCase
import datetime
from datetime import timezone
from pytz import timezone  # note: shadows datetime.timezone imported above
import time
import pandas as pd
def test_date_with_timestamp_method(self):
    hkzone = timezone('Hongkong')
    dt_with_tz = datetime.datetime(2017, 9, 24, tzinfo=hkzone)
    dt_without_tz = datetime.datetime(2017, 9, 24)
    uts_with = dt_with_tz.timestamp()
    uts_without = dt_without_tz.timestamp()
    self.assertNotEqual(uts_without, uts_with)
    pd_with = pd.Timestamp(dt_with_tz)
    pd_without = pd.Timestamp(dt_without_tz)
    # Timestamp.value is in nanoseconds since the epoch; reduce it to seconds
    pd_unix_with_tz = pd_with.value // 10 ** 9
    pd_unix_without_tz = pd_without.value // 10 ** 9
    self.assertEqual(uts_with, pd_unix_with_tz)
    self.assertEqual(uts_without, pd_unix_without_tz)
I would like to ask why this assertion failed? The result of this is
AssertionError: 1506182400.0 != 1506211200
# convert back to datetime
pd_dt_with_tz = pd_with.to_pydatetime()
pd_dt_without_tz = pd_without.to_pydatetime()
self.assertEqual(pd_dt_with_tz, dt_with_tz)
self.assertEqual(pd_dt_without_tz, dt_without_tz)
And this line
self.assertEqual(pd_dt_without_tz, dt_without_tz)
will result in this error.
AssertionError: datet[16 chars]7, 9, 24, 0, 46, tzinfo=) != datet[16 chars]7, 9, 24, 0, 0, tzinfo=)
So is it best practice to always attach the timezone info to the datetime object before converting it to a Timestamp?
Is it possible to make these two assertions pass without timezone info?
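A likely explanation for the first failure, assuming the test ran on a machine whose local time zone is UTC+8 (as in Hong Kong): datetime.timestamp() interprets a naive datetime as local time, whereas the value of a pandas Timestamp built from a naive datetime counts nanoseconds as if the wall-clock time were UTC. The sketch below reproduces both numbers from the error message with the standard library only. The fixed +08:00 offset is a stand-in for the local zone and is not taken from the original test.

```python
from datetime import datetime, timedelta, timezone

naive = datetime(2017, 9, 24)

# Stand-in for a machine running on Hong Kong time (assumption: UTC+8, no DST).
local_offset = timezone(timedelta(hours=8))

# What naive.timestamp() would return on such a machine: midnight local time.
seconds_if_local = naive.replace(tzinfo=local_offset).timestamp()

# What pandas reports for a naive value: midnight read as UTC.
seconds_if_utc = naive.replace(tzinfo=timezone.utc).timestamp()

print(seconds_if_local)                   # 1506182400.0
print(seconds_if_utc)                     # 1506211200.0
print(seconds_if_utc - seconds_if_local)  # 28800.0, i.e. 8 hours
```

Under that reading, the naive-based assertion can only hold if both sides use the same convention, for example by attaching a UTC tzinfo before calling timestamp().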

Related

Python compare int on different environments (Docker vs Local machine) [duplicate]

What I need to do
I have a timezone-unaware datetime object, to which I need to add a time zone in order to be able to compare it with other timezone-aware datetime objects. I do not want to convert my entire application to timezone unaware for this one legacy case.
What I've Tried
First, to demonstrate the problem:
Python 2.6.1 (r261:67515, Jun 24 2010, 21:47:49)
[GCC 4.2.1 (Apple Inc. build 5646)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import datetime
>>> import pytz
>>> unaware = datetime.datetime(2011,8,15,8,15,12,0)
>>> unaware
datetime.datetime(2011, 8, 15, 8, 15, 12)
>>> aware = datetime.datetime(2011,8,15,8,15,12,0,pytz.UTC)
>>> aware
datetime.datetime(2011, 8, 15, 8, 15, 12, tzinfo=<UTC>)
>>> aware == unaware
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can't compare offset-naive and offset-aware datetimes
First, I tried astimezone:
>>> unaware.astimezone(pytz.UTC)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: astimezone() cannot be applied to a naive datetime
>>>
It's not terribly surprising this failed, since it's actually trying to do a conversion. Replace seemed like a better choice (as per How do I get a value of datetime.today() in Python that is "timezone aware"?):
>>> unaware.replace(tzinfo=pytz.UTC)
datetime.datetime(2011, 8, 15, 8, 15, 12, tzinfo=<UTC>)
>>> unaware == aware
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can't compare offset-naive and offset-aware datetimes
>>>
But as you can see, replace seems to set the tzinfo, but not make the object aware. I'm getting ready to fall back to doctoring the input string to have a timezone before parsing it (I'm using dateutil for parsing, if that matters), but that seems incredibly kludgy.
Also, I've tried this in both Python 2.6 and Python 2.7, with the same results.
Context
I am writing a parser for some data files. There is an old format I need to support where the date string does not have a timezone indicator. I've already fixed the data source, but I still need to support the legacy data format. A one time conversion of the legacy data is not an option for various business BS reasons. While in general, I do not like the idea of hard-coding a default timezone, in this case it seems like the best option. I know with reasonable confidence that all the legacy data in question is in UTC, so I'm prepared to accept the risk of defaulting to that in this case.
In general, to make a naive datetime timezone-aware, use the localize method:
import datetime
import pytz
unaware = datetime.datetime(2011, 8, 15, 8, 15, 12, 0)
aware = datetime.datetime(2011, 8, 15, 8, 15, 12, 0, pytz.UTC)
now_aware = pytz.utc.localize(unaware)
assert aware == now_aware
For the UTC timezone, it is not really necessary to use localize since there is no daylight savings time calculation to handle:
now_aware = unaware.replace(tzinfo=pytz.UTC)
works. (.replace returns a new datetime; it does not modify unaware.)
All of these examples use an external module, but you can achieve the same result using just the datetime module, as also presented in this SO answer:
from datetime import datetime, timezone
dt = datetime.now()
dt = dt.replace(tzinfo=timezone.utc)
print(dt.isoformat())
# '2017-01-12T22:11:31+00:00'
Fewer dependencies and no pytz issues.
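To tie this back to the comparison in the question, here is a minimal standard-library sketch (Python 3 assumed) showing that replace leaves the original object naive and that the object it returns compares equal to a datetime constructed as aware.

```python
from datetime import datetime, timezone

unaware = datetime(2011, 8, 15, 8, 15, 12)
aware = datetime(2011, 8, 15, 8, 15, 12, tzinfo=timezone.utc)

# replace() returns a new object, so the result must be kept.
now_aware = unaware.replace(tzinfo=timezone.utc)

print(unaware.tzinfo)      # None: the original is still naive
print(now_aware == aware)  # True
```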
NOTE: If you wish to use this with python3 and python2, you can use this as well for the timezone import (hardcoded for UTC):
try:
    from datetime import timezone
    utc = timezone.utc
except ImportError:
    # Python 2: datetime has no timezone class, so define UTC by hand
    from datetime import timedelta, tzinfo

    class UTC(tzinfo):
        def utcoffset(self, dt):
            return timedelta(0)

        def tzname(self, dt):
            return "UTC"

        def dst(self, dt):
            return timedelta(0)

    utc = UTC()
I wrote this Python 2 script in 2011, but never checked if it works on Python 3.
I had moved from dt_aware to dt_unaware:
dt_unaware = dt_aware.replace(tzinfo=None)
and dt_unaware to dt_aware:
from pytz import timezone
localtz = timezone('Europe/Lisbon')
dt_aware = localtz.localize(dt_unaware)
I use this statement in Django to convert an unaware time to an aware:
from django.utils import timezone
dt_aware = timezone.make_aware(dt_unaware, timezone.get_current_timezone())
Python 3.9 adds the zoneinfo module so now only the standard library is needed!
from zoneinfo import ZoneInfo
from datetime import datetime
unaware = datetime(2020, 10, 31, 12)
Attach a timezone:
>>> unaware.replace(tzinfo=ZoneInfo('Asia/Tokyo'))
datetime.datetime(2020, 10, 31, 12, 0, tzinfo=zoneinfo.ZoneInfo(key='Asia/Tokyo'))
>>> str(_)
'2020-10-31 12:00:00+09:00'
Attach the system's local timezone:
>>> unaware.replace(tzinfo=ZoneInfo('localtime'))
datetime.datetime(2020, 10, 31, 12, 0, tzinfo=zoneinfo.ZoneInfo(key='localtime'))
>>> str(_)
'2020-10-31 12:00:00+01:00'
Subsequently it is properly converted to other timezones:
>>> unaware.replace(tzinfo=ZoneInfo('localtime')).astimezone(ZoneInfo('Asia/Tokyo'))
datetime.datetime(2020, 10, 31, 20, 0, tzinfo=backports.zoneinfo.ZoneInfo(key='Asia/Tokyo'))
>>> str(_)
'2020-10-31 20:00:00+09:00'
Wikipedia list of available time zones
Windows has no system time zone database, so here an extra package is needed:
pip install tzdata
There is a backport to allow use of zoneinfo in Python 3.6 to 3.8:
pip install backports.zoneinfo
Then:
from backports.zoneinfo import ZoneInfo
I agree with the previous answers, and they are fine if you are OK starting in UTC. But I think it is also common to work with a naive datetime that represents a non-UTC local time and needs to be made timezone-aware.
Going by the name alone, one would probably infer that replace() is applicable and produces the right aware datetime object. This is not the case.
With pytz time zones, replace(tzinfo=...) does not behave as one would expect: it typically attaches the zone's earliest recorded offset (often local mean time) rather than the offset in effect on that date. Do not use it with pytz zones other than UTC!
localize is the correct function to use. Example:
localdatetime_aware = tz.localize(datetime_nonaware)
Or a more complete example:
import pytz
from datetime import datetime
pytz.timezone('Australia/Melbourne').localize(datetime.now())
gives me a timezone aware datetime value of the current local time:
datetime.datetime(2017, 11, 3, 7, 44, 51, 908574, tzinfo=<DstTzInfo 'Australia/Melbourne' AEDT+11:00:00 DST>)
Use dateutil.tz.tzlocal() to get the timezone in your usage of datetime.datetime.now() and datetime.datetime.astimezone():
from datetime import datetime
from dateutil import tz
unlocalisedDatetime = datetime.now()
localisedDatetime1 = datetime.now(tz = tz.tzlocal())
localisedDatetime2 = datetime(2017, 6, 24, 12, 24, 36, tzinfo = tz.tzlocal())
localisedDatetime3 = unlocalisedDatetime.astimezone(tz = tz.tzlocal())
localisedDatetime4 = unlocalisedDatetime.replace(tzinfo = tz.tzlocal())
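Note that, when called on a naive datetime, datetime.astimezone assumes the value is in system local time, converts it to UTC and then into the target timezone. With the local timezone as the target, the result is the same as calling datetime.replace on a datetime whose original timezone information is None.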
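The equivalence described above can be checked with the standard library alone. This sketch assumes Python 3.6 or later, where astimezone() accepts a naive datetime and treats it as system local time. The sample date is arbitrary.

```python
from datetime import datetime

naive = datetime(2017, 6, 24, 12, 24, 36)

# With no argument, astimezone() attaches the system's local zone.
localised = naive.astimezone()

# The wall-clock fields are unchanged; only tzinfo was added.
print(localised.tzinfo is not None)             # True
print(localised.replace(tzinfo=None) == naive)  # True (outside a DST transition gap)
```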
This codifies @Sérgio and @unutbu's answers. It will "just work" with either a pytz.timezone object or an IANA Time Zone string.
import pytz

def make_tz_aware(dt, tz='UTC', is_dst=None):
    """Add timezone information to a datetime object, only if it is naive."""
    tz = dt.tzinfo or tz
    try:
        tz = pytz.timezone(tz)
    except AttributeError:
        # tz is already a timezone object rather than a name
        pass
    return tz.localize(dt, is_dst=is_dst)
This seems like what datetime.localize() (or .inform() or .awarify()) should do, accept both strings and timezone objects for the tz argument and default to UTC if no time zone is specified.
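For comparison, a standard-library variant of the same idea might look like the sketch below. The name ensure_aware and its default are hypothetical, and it takes a tzinfo object rather than a zone name. Because it relies on replace, it is only appropriate for fixed-offset zones such as timezone.utc or for zoneinfo zones, not for pytz zones.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def ensure_aware(dt: datetime, tz: tzinfo = timezone.utc) -> datetime:
    """Return dt unchanged if it is already aware, else attach tz (hypothetical helper)."""
    if dt.tzinfo is not None and dt.utcoffset() is not None:
        return dt
    return dt.replace(tzinfo=tz)


naive = datetime(2011, 8, 15, 8, 15, 12)
plus_two = timezone(timedelta(hours=2))  # arbitrary fixed offset for illustration

print(ensure_aware(naive))            # 2011-08-15 08:15:12+00:00
print(ensure_aware(naive, plus_two))  # 2011-08-15 08:15:12+02:00

# An already aware datetime is returned as-is.
already_aware = ensure_aware(naive, plus_two)
print(ensure_aware(already_aware) is already_aware)  # True
```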
For those who just want to create a timezone-aware datetime:
import datetime
datetime.datetime(2019, 12, 7, tzinfo=datetime.timezone.utc)
For those who want a datetime with a non-UTC timezone, using only the standard library (Python 3.9 and later):
import datetime
from zoneinfo import ZoneInfo
datetime.datetime(2019, 12, 7, tzinfo=ZoneInfo("America/Los_Angeles"))
Yet another way of having a datetime object NOT naive:
>>> from datetime import datetime, timezone
>>> datetime.now(timezone.utc)
datetime.datetime(2021, 5, 1, 22, 51, 16, 219942, tzinfo=datetime.timezone.utc)
I'm quite new to Python and encountered the same issue. I find this solution quite simple, and it works fine for me (Python 3.6):
from dateutil import parser, tz
unaware=parser.parse("2020-05-01 0:00:00")
aware=unaware.replace(tzinfo=tz.tzlocal()).astimezone(tz.tzlocal())
Here is a simple solution to minimize changes to your code:
from datetime import datetime
import pytz
start_utc = datetime.utcnow()
print("Time (UTC): %s" % start_utc.strftime("%d-%m-%Y %H:%M:%S"))
# Time (UTC): 09-01-2021 03:49:03
tz = pytz.timezone('Africa/Cairo')
start_tz = datetime.now().astimezone(tz)
print("Time (RSA): %s" % start_tz.strftime("%d-%m-%Y %H:%M:%S"))
# Time (RSA): 09-01-2021 05:49:03
In the format of unutbu's answer; I made a utility module that handles things like this, with more intuitive syntax. Can be installed with pip.
import datetime
import saturn
unaware = datetime.datetime(2011, 8, 15, 8, 15, 12, 0)
now_aware = saturn.fix_naive(unaware)
now_aware_madrid = saturn.fix_naive(unaware, 'Europe/Madrid')
Changing between timezones
import pytz
from datetime import datetime
other_tz = pytz.timezone('Europe/Madrid')
# From random aware datetime...
aware_datetime = datetime.utcnow().astimezone(other_tz)
>> 2020-05-21 08:28:26.984948+02:00
# 1. Change aware datetime to UTC and remove tzinfo to obtain an unaware datetime
unaware_datetime = aware_datetime.astimezone(pytz.UTC).replace(tzinfo=None)
>> 2020-05-21 06:28:26.984948
# 2. Set tzinfo to UTC directly on an unaware datetime to obtain an utc aware datetime
aware_datetime_utc = unaware_datetime.replace(tzinfo=pytz.UTC)
>> 2020-05-21 06:28:26.984948+00:00
# 3. Convert the aware utc datetime into another timezone
reconverted_aware_datetime = aware_datetime_utc.astimezone(other_tz)
>> 2020-05-21 08:28:26.984948+02:00
# Initial Aware Datetime and Reconverted Aware Datetime are equal
print(aware_datetime == reconverted_aware_datetime)
>> True
In addition to the approaches mentioned above, when the input is a Unix timestamp there is a very simple solution using pandas.
import pandas as pd
unix_timestamp = 1513393355
pst_tz = pd.Timestamp(unix_timestamp, unit='s', tz='US/Pacific')
utc_tz = pd.Timestamp(unix_timestamp, unit='s', tz='UTC')
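If pandas is not available, the standard library gives a comparable result. In this sketch the fixed -08:00 offset is an assumption standing in for US/Pacific in December; a real zone database (for example zoneinfo) would be needed to handle daylight saving time.

```python
from datetime import datetime, timedelta, timezone

unix_timestamp = 1513393355

# Interpret the timestamp as an aware datetime in UTC.
utc_dt = datetime.fromtimestamp(unix_timestamp, tz=timezone.utc)

# Assumption: Pacific Standard Time modelled as a fixed UTC-8 offset.
pst = timezone(timedelta(hours=-8), "PST")
pst_dt = utc_dt.astimezone(pst)

print(utc_dt.isoformat())  # 2017-12-16T03:02:35+00:00
print(pst_dt.isoformat())  # 2017-12-15T19:02:35-08:00
```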

How to render timestamp according to the timezone in Python

I have two datetime objects that represent the same datetime value in different timezones. I would like to convert them to POSIX timestamps. However, calling datetime.timestamp() apparently returns the same value regardless of the timezone.
from datetime import datetime
import pytz
dt = datetime(2020, 7, 26, 6, 0)
utc_dt = pytz.utc.localize(dt) # datetime.datetime(2020, 7, 26, 6, 0, tzinfo=<UTC>)
bp = pytz.timezone("Europe/Budapest")
bp_dt = utc_dt.astimezone(bp) # datetime.datetime(2020, 7, 26, 8, 0, tzinfo=<DstTzInfo 'Europe/Budapest' CEST+2:00:00 DST>)
utc_dt.timestamp() # 1595743200.0
bp_dt.timestamp() # 1595743200.0
The documentation of datetime.timestamp() says the following:
For aware datetime instances, the return value is computed as:
(dt - datetime(1970, 1, 1, tzinfo=timezone.utc)).total_seconds()
Running utc_dt - bp_dt returns datetime.timedelta(0). So it seems it calculates with the UTC value of the datetime objects.
I use Python in a web stack. I want the backend to deal with the timezone handling and the client to receive the precalculated datetime values in the user's timezone in the API responses.
What is the Pythonic way to get timezone aware timestamps?
In short, I would not recommend doing this because you can create a total mess, see my comment.
Technically, you could do it by simply replacing the tzinfo property of the datetime object with UTC. Note that I'm using dateutil.tz here so I can set the initial timezone directly (no localize()).
from datetime import datetime, timezone
from dateutil import tz
dt = datetime(2020, 7, 26, 6, 0, tzinfo=tz.gettz("Europe/Budapest"))
# dt.utcoffset()
# >>> datetime.timedelta(seconds=7200)
# POSIX timestamp that references to 1970-01-01 UTC:
ts_posix = dt.timestamp()
# timestamp that includes the UTC offset:
ts = dt.replace(tzinfo=timezone.utc).timestamp()
# ts-ts_posix
# >>> 7200.0
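As a cross-check of the behaviour discussed above, the sketch below uses only the standard library, with a fixed +02:00 offset standing in for Europe/Budapest in July (an assumption that ignores DST rules). It shows that the POSIX timestamp is identical for both zones, that the replace trick shifts it by the UTC offset, and that an ISO 8601 string is one way to hand the client a value that carries the local wall time and the offset together.

```python
from datetime import datetime, timedelta, timezone

cest = timezone(timedelta(hours=2))  # assumed fixed offset for Budapest in summer

utc_dt = datetime(2020, 7, 26, 6, 0, tzinfo=timezone.utc)
bp_dt = utc_dt.astimezone(cest)

# Same instant, so the POSIX timestamp does not depend on the zone.
print(utc_dt.timestamp() == bp_dt.timestamp())  # True

# "Wall-clock" timestamp: pretend the local fields are UTC.
shifted = bp_dt.replace(tzinfo=timezone.utc).timestamp()
print(shifted - bp_dt.timestamp())  # 7200.0

# ISO 8601 keeps local time and offset in one unambiguous value.
print(bp_dt.isoformat())  # 2020-07-26T08:00:00+02:00
```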

Difference between pandas datetime and datetime datetime

Hi, I have some dates in datetime.datetime format that I use to filter a pandas DataFrame by pandas Timestamp. I just tried the following and got a 2 hour offset:
from datetime import datetime
import pandas as pd
pd.to_datetime(datetime(2020, 5, 11, 0, 0, 0).timestamp()*1e9)
The output is:
->Timestamp('2020-05-10 22:00:00')
Can anybody explain why this gives a 2 hour offset? I am in Denmark, so it corresponds to the offset to GMT. Is this the reason? I can of course just add 2 hours, but I want to understand why, to make the script robust in the future.
Thanks for your help Jesper
pd.to_datetime accepts a datetime object so you could just do (pandas assumes UTC):
pd.to_datetime(datetime(2020, 5, 11))
You are getting a 2 hour offset when converting to a timestamp because, by default, Python's datetime is unaware of timezone and gives you a "naive" datetime object (docs are here: https://docs.python.org/3/library/datetime.html#aware-and-naive-objects). timestamp() interprets a naive datetime as local time, so the resulting POSIX timestamp is 2 hours earlier than the same wall-clock time read as UTC, and pandas then displays that timestamp as UTC; hence the 2 hour offset.
You can pass in a tzinfo parameter to the datetime object specifying that the time should be treated as UTC:
from datetime import datetime
import pandas as pd
import pytz
pd.to_datetime(datetime(2020, 5, 11, 0, 0, 0, tzinfo=pytz.UTC).timestamp()*1e9)
Alternatively, you can generate a UTC timestamp using the calendar module:
from datetime import datetime
import pandas as pd
import calendar
timestamp = calendar.timegm(datetime(2020, 5, 11, 0, 0, 0).utctimetuple())
pd.to_datetime(timestamp*1e9)
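The arithmetic behind the 2 hour shift can be reproduced without pandas. This sketch assumes a machine on Danish summer time, modelled here as a fixed +02:00 offset, and compares it with reading the same wall-clock value as UTC.

```python
import calendar
from datetime import datetime, timedelta, timezone

naive = datetime(2020, 5, 11)

# Assumption: the local zone is UTC+2 (Danish summer time), as a fixed offset.
cest = timezone(timedelta(hours=2))
local_seconds = naive.replace(tzinfo=cest).timestamp()

# Reading the same fields as UTC, which is how pandas treats naive values.
utc_seconds = calendar.timegm(naive.timetuple())

print(local_seconds)  # 1589148000.0
print(utc_seconds)    # 1589155200
print(datetime.fromtimestamp(local_seconds, tz=timezone.utc))  # 2020-05-10 22:00:00+00:00
```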
If your datetime objects actually represent local time (i.e. your OS setting), you can simply use
from datetime import datetime
import pandas as pd
t = pd.to_datetime(datetime(2020, 5, 11).astimezone())
# e.g. I'm on CEST, so t is
# Timestamp('2020-05-11 00:00:00+0200', tz='Mitteleuropäische Sommerzeit')
see: How do I get a value of datetime.today() in Python that is “timezone aware”?
Just keep in mind that pandas will treat naive Python datetime objects as if they were UTC:
from datetime import timezone
t1 = pd.to_datetime(datetime(2020, 5, 11, tzinfo=timezone.utc))
t2 = pd.to_datetime(datetime(2020, 5, 11))
t1.timestamp() == t2.timestamp()
# True
see also: Python datetime and pandas give different timestamps for the same date

How to convert JSON date & time to Python datetime?

I get the date and time as string like 2014-05-18T12:19:24+04:00
I found another question explaining how to handle dates in UTC timezone (2012-05-29T19:30:03.283Z)
What should I do with +04:00 in my case (if I want to store time in UTC timezone in Python)?
Upd. I've tried to parse it like below:
dt = '2014-05-19T14:48:50+04:00'
plus_position = dt.find('+') # remove colon in the timezone part
colon_pos = dt.find(':', plus_position)
dt = dt[:colon_pos] + dt[colon_pos+1:]
dt = datetime.datetime.strptime(dt, '%Y-%m-%dT%H:%M:%S%z') # '2014-05-19T14:48:50+0400'
But it fails - 'z' is a bad directive in format '%Y-%m-%dT%H:%M:%S%z'
Using dateutil:
>>> import dateutil.parser
>>> dateutil.parser.parse('2014-05-18T12:19:24+04:00')
datetime.datetime(2014, 5, 18, 12, 19, 24, tzinfo=tzoffset(None, 14400))
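On Python 3.7 or later the standard library can parse this form directly, which was probably not an option when the question was asked. A minimal sketch, including the conversion to UTC that the question asks about:

```python
from datetime import datetime, timezone

raw = "2014-05-18T12:19:24+04:00"

# fromisoformat() accepts the colon in the offset (Python 3.7+).
parsed = datetime.fromisoformat(raw)

# Convert to UTC before storing.
as_utc = parsed.astimezone(timezone.utc)

print(parsed.utcoffset())  # 4:00:00
print(as_utc.isoformat())  # 2014-05-18T08:19:24+00:00
```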

Extract the time from a UUID v1 in python

I have some UUIDs that are being generated in my program at random, but I want to be able to extract the timestamp of the generated UUID for testing purposes. I noticed that using the fields accessor I can get the various parts of the timestamp but I have no idea on how to combine them.
Looking inside /usr/lib/python2.6/uuid.py you'll see
def uuid1(node=None, clock_seq=None):
    ...
    nanoseconds = int(time.time() * 1e9)
    # 0x01b21dd213814000 is the number of 100-ns intervals between the
    # UUID epoch 1582-10-15 00:00:00 and the Unix epoch 1970-01-01 00:00:00.
    timestamp = int(nanoseconds/100) + 0x01b21dd213814000L
solving the equations for time.time(), you'll get
time.time()-like quantity = ((timestamp - 0x01b21dd213814000L)*100/1e9)
So use:
In [3]: import uuid
In [4]: u = uuid.uuid1()
In [58]: datetime.datetime.fromtimestamp((u.time - 0x01b21dd213814000L)*100/1e9)
Out[58]: datetime.datetime(2010, 9, 25, 17, 43, 6, 298623)
This gives the datetime associated with a UUID generated by uuid.uuid1.
You could use a simple formula that follows directly from the definition:
The timestamp is a 60-bit value. For UUID version 1, this is
represented by Coordinated Universal Time (UTC) as a count of 100-
nanosecond intervals since 00:00:00.00, 15 October 1582 (the date of
Gregorian reform to the Christian calendar).
>>> from uuid import uuid1
>>> from datetime import datetime, timedelta
>>> datetime(1582, 10, 15) + timedelta(microseconds=uuid1().time//10)
datetime.datetime(2015, 11, 13, 6, 59, 12, 109560)
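The first snippet above relies on Python 2 integer literals, and both return a naive datetime. A Python 3 sketch of the same calculation, using integer arithmetic and returning a UTC-aware result, could look like this; the helper name uuid1_to_datetime is hypothetical.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Start of the UUID epoch (Gregorian reform), expressed in UTC.
UUID_EPOCH = datetime(1582, 10, 15, tzinfo=timezone.utc)


def uuid1_to_datetime(u: uuid.UUID) -> datetime:
    """Return the creation time embedded in a version 1 UUID (hypothetical helper)."""
    if u.version != 1:
        raise ValueError("only version 1 UUIDs carry a timestamp")
    # u.time counts 100-nanosecond intervals; 10 of them make one microsecond.
    return UUID_EPOCH + timedelta(microseconds=u.time // 10)


u = uuid.uuid1()
created = uuid1_to_datetime(u)
print(created)  # an aware datetime close to the current UTC time
```

The version check guards against passing a random (version 4) UUID, whose time field does not encode a timestamp.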
Or just use the TimeUUID library, so that you know you didn't get the math wrong
Example
import uuid
import time_uuid
my_uuid = uuid.UUID('{12345678-1234-5678-1234-567812345678}')
ts = time_uuid.TimeUUID(bytes=my_uuid.bytes).get_timestamp()
Since I have Cassandra installed and I am using this with Cassandra I was able to use the datetime_from_uuid1 from cassandra.util
>>> import uuid
>>> from cassandra.util import datetime_from_uuid1
>>> foo = uuid.uuid1()
>>> dt_foo = datetime_from_uuid1(foo)
>>> dt_foo
datetime.datetime(2016, 7, 26, 8, 2, 12, 104560)
