Time Calculation with "numpy.datetime64()" [duplicate] - python
How do I convert a numpy.datetime64 object to a datetime.datetime (or Timestamp)?
In the following code, I create datetime, Timestamp and datetime64 objects.
import datetime
import numpy as np
import pandas as pd
dt = datetime.datetime(2012, 5, 1)
# A strange way to extract a Timestamp object, there's surely a better way?
ts = pd.DatetimeIndex([dt])[0]
dt64 = np.datetime64(dt)
In [7]: dt
Out[7]: datetime.datetime(2012, 5, 1, 0, 0)
In [8]: ts
Out[8]: <Timestamp: 2012-05-01 00:00:00>
In [9]: dt64
Out[9]: numpy.datetime64('2012-05-01T01:00:00.000000+0100')
Note: it's easy to get the datetime from the Timestamp:
In [10]: ts.to_datetime()
Out[10]: datetime.datetime(2012, 5, 1, 0, 0)
But how do we extract the datetime or Timestamp from a numpy.datetime64 (dt64)?
Update: a somewhat nasty example in my dataset (perhaps the motivating example) seems to be:
dt64 = numpy.datetime64('2002-06-28T01:00:00.000000000+0100')
which should be datetime.datetime(2002, 6, 28, 1, 0), and not a long (!) (1025222400000000000L)...
You can just use the pd.Timestamp constructor. The following diagram may be useful for this and related questions.
Welcome to hell.
You can just pass a datetime64 object to pandas.Timestamp:
In [16]: Timestamp(numpy.datetime64('2012-05-01T01:00:00.000000'))
Out[16]: <Timestamp: 2012-05-01 01:00:00>
I noticed that this doesn't work right though in NumPy 1.6.1:
numpy.datetime64('2012-05-01T01:00:00.000000+0100')
Also, pandas.to_datetime can be used (this is off of the dev version, haven't checked v0.9.1):
In [24]: pandas.to_datetime('2012-05-01T01:00:00.000000+0100')
Out[24]: datetime.datetime(2012, 5, 1, 1, 0, tzinfo=tzoffset(None, 3600))
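As a minimal sketch of the pd.Timestamp route described above, assuming a reasonably recent NumPy and pandas (where datetime64 carries no timezone and Timestamp.to_pydatetime() is available), the conversion chain might look like this. The sample value is taken from the answer; the variable names are only illustrative.

```python
import datetime

import numpy as np
import pandas as pd

# Timezone-free ISO 8601 string, so the result does not depend on local settings.
dt64 = np.datetime64("2012-05-01T01:00:00.000000")

ts = pd.Timestamp(dt64)      # numpy.datetime64 -> pandas Timestamp
py_dt = ts.to_pydatetime()   # pandas Timestamp -> plain datetime.datetime

print(ts)
print(py_dt)
```

Going through Timestamp keeps the example independent of the datetime64 unit, at the cost of importing pandas.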
To convert numpy.datetime64 to datetime object that represents time in UTC on numpy-1.8:
>>> from datetime import datetime
>>> import numpy as np
>>> dt = datetime.utcnow()
>>> dt
datetime.datetime(2012, 12, 4, 19, 51, 25, 362455)
>>> dt64 = np.datetime64(dt)
>>> ts = (dt64 - np.datetime64('1970-01-01T00:00:00Z')) / np.timedelta64(1, 's')
>>> ts
1354650685.3624549
>>> datetime.utcfromtimestamp(ts)
datetime.datetime(2012, 12, 4, 19, 51, 25, 362455)
>>> np.__version__
'1.8.0.dev-7b75899'
The above example assumes that a naive datetime object is interpreted by np.datetime64 as time in UTC.
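The same epoch arithmetic can be sketched without going through a float, which avoids rounding in the microsecond digits. This is only an illustration under the same assumption as above (the datetime64 value is interpreted as UTC); dt64_to_utc_datetime and EPOCH are hypothetical names, not part of NumPy or the standard library.

```python
from datetime import datetime, timedelta, timezone

import numpy as np

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def dt64_to_utc_datetime(dt64):
    """Hypothetical helper: treat dt64 as UTC and return an aware datetime."""
    # Difference from the UNIX epoch, expressed in whole microseconds.
    # Any digits finer than a microsecond are discarded by the cast.
    delta = dt64 - np.datetime64(0, "us")
    micros = int(delta.astype("timedelta64[us]").astype("int64"))
    return EPOCH + timedelta(microseconds=micros)


result = dt64_to_utc_datetime(np.datetime64("2012-12-04T19:51:25.362455"))
print(result)
```

Returning an aware datetime makes the UTC assumption explicit; call .replace(tzinfo=None) on the result if a naive object is needed.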
To convert datetime to np.datetime64 and back (numpy-1.6):
>>> np.datetime64(datetime.utcnow()).astype(datetime)
datetime.datetime(2012, 12, 4, 13, 34, 52, 827542)
It works both on a single np.datetime64 object and a numpy array of np.datetime64.
Think of np.datetime64 the same way you would about np.int8, np.int16, etc and apply the same methods to convert between Python objects such as int, datetime and corresponding numpy objects.
Your "nasty example" works correctly:
>>> from datetime import datetime
>>> import numpy
>>> numpy.datetime64('2002-06-28T01:00:00.000000000+0100').astype(datetime)
datetime.datetime(2002, 6, 28, 0, 0)
>>> numpy.__version__
'1.6.2' # current version available via pip install numpy
I can reproduce the long value on numpy-1.8.0 installed as:
pip install git+https://github.com/numpy/numpy.git#egg=numpy-dev
The same example:
>>> from datetime import datetime
>>> import numpy
>>> numpy.datetime64('2002-06-28T01:00:00.000000000+0100').astype(datetime)
1025222400000000000L
>>> numpy.__version__
'1.8.0.dev-7b75899'
It returns long because for numpy.datetime64 type .astype(datetime) is equivalent to .astype(object) that returns Python integer (long) on numpy-1.8.
To get datetime object you could:
>>> dt64.dtype
dtype('<M8[ns]')
>>> ns = 1e-9 # number of seconds in a nanosecond
>>> datetime.utcfromtimestamp(dt64.astype(int) * ns)
datetime.datetime(2002, 6, 28, 0, 0)
To get datetime64 that uses seconds directly:
>>> dt64 = numpy.datetime64('2002-06-28T01:00:00.000000000+0100', 's')
>>> dt64.dtype
dtype('<M8[s]')
>>> datetime.utcfromtimestamp(dt64.astype(int))
datetime.datetime(2002, 6, 28, 0, 0)
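The unit dependence described in this answer can be checked directly. The sketch below assumes a recent NumPy and uses a timezone-free string; it converts the same instant at several units and inspects what .item() hands back.

```python
import datetime

import numpy as np

base = np.datetime64("2002-06-28T01:00:00")

# The Python object returned depends on the unit of the datetime64 value.
as_ns = base.astype("datetime64[ns]").item()   # nanoseconds: plain integer
as_us = base.astype("datetime64[us]").item()   # microseconds: datetime.datetime
as_s = base.astype("datetime64[s]").item()     # seconds: datetime.datetime
as_day = base.astype("datetime64[D]").item()   # days: datetime.date

for value in (as_ns, as_us, as_s, as_day):
    print(type(value).__name__, value)
```

Casting to a microsecond (or coarser) unit before converting is therefore a common way to avoid the integer result.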
The numpy docs say that the datetime API is experimental and may change in future numpy versions.
I think there could be a more consolidated effort in an answer to better explain the relationship between Python's datetime module, numpy's datetime64/timedelta64 and pandas' Timestamp/Timedelta objects.
The datetime standard library of Python
The datetime standard library has four main objects
time - only time, measured in hours, minutes, seconds and microseconds
date - only year, month and day
datetime - All components of time and date
timedelta - An amount of time with maximum unit of days
Create these four objects
>>> import datetime
>>> datetime.time(hour=4, minute=3, second=10, microsecond=7199)
datetime.time(4, 3, 10, 7199)
>>> datetime.date(year=2017, month=10, day=24)
datetime.date(2017, 10, 24)
>>> datetime.datetime(year=2017, month=10, day=24, hour=4, minute=3, second=10, microsecond=7199)
datetime.datetime(2017, 10, 24, 4, 3, 10, 7199)
>>> datetime.timedelta(days=3, minutes = 55)
datetime.timedelta(3, 3300)
>>> # add timedelta to datetime
>>> datetime.timedelta(days=3, minutes = 55) + \
datetime.datetime(year=2017, month=10, day=24, hour=4, minute=3, second=10, microsecond=7199)
datetime.datetime(2017, 10, 27, 4, 58, 10, 7199)
NumPy's datetime64 and timedelta64 objects
NumPy has no separate date and time objects, just a single datetime64 object to represent a single moment in time. The datetime module's datetime object has microsecond precision (one-millionth of a second). NumPy's datetime64 object allows you to set its precision from years all the way down to attoseconds (10^-18 seconds). Its constructor is more flexible and can take a variety of inputs.
Construct NumPy's datetime64 and timedelta64 objects
Pass an integer with a string for the units. See all units here. It gets converted to that many units after the UNIX epoch: Jan 1, 1970
>>> np.datetime64(5, 'ns')
numpy.datetime64('1970-01-01T00:00:00.000000005')
>>> np.datetime64(1508887504, 's')
numpy.datetime64('2017-10-24T23:25:04')
You can also use strings as long as they are in ISO 8601 format.
>>> np.datetime64('2017-10-24')
numpy.datetime64('2017-10-24')
Timedeltas have a single unit
>>> np.timedelta64(5, 'D') # 5 days
>>> np.timedelta64(10, 'h') # 10 hours
Can also create them by subtracting two datetime64 objects
>>> np.datetime64('2017-10-24T05:30:45.67') - np.datetime64('2017-10-22T12:35:40.123')
numpy.timedelta64(147305547,'ms')
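A timedelta64 can be brought back into the standard library in much the same way as a datetime64. The sketch below reuses the millisecond value from the subtraction above and assumes a recent NumPy; py_td and total_seconds are illustrative names.

```python
import datetime

import numpy as np

td64 = np.timedelta64(147305547, "ms")

# For this millisecond unit, .item() returns a datetime.timedelta.
py_td = td64.item()

# Dividing by a one-second timedelta64 gives the duration as a float.
total_seconds = td64 / np.timedelta64(1, "s")

print(py_td)
print(total_seconds)
```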
Pandas Timestamp and Timedelta build much more functionality on top of NumPy
A pandas Timestamp is a moment in time very similar to a datetime but with much more functionality. You can construct them with either pd.Timestamp or pd.to_datetime.
>>> pd.Timestamp(1239.1238934) #defaults to nanoseconds
Timestamp('1970-01-01 00:00:00.000001239')
>>> pd.Timestamp(1239.1238934, unit='D') # change units
Timestamp('1973-05-24 02:58:24.355200')
>>> pd.Timestamp('2017-10-24 05') # partial strings work
Timestamp('2017-10-24 05:00:00')
pd.to_datetime works very similarly (with a few more options) and can convert a list of strings into Timestamps.
>>> pd.to_datetime('2017-10-24 05')
Timestamp('2017-10-24 05:00:00')
>>> pd.to_datetime(['2017-1-1', '2017-1-2'])
DatetimeIndex(['2017-01-01', '2017-01-02'], dtype='datetime64[ns]', freq=None)
Converting Python datetime to datetime64 and Timestamp
>>> dt = datetime.datetime(year=2017, month=10, day=24, hour=4,
minute=3, second=10, microsecond=7199)
>>> np.datetime64(dt)
numpy.datetime64('2017-10-24T04:03:10.007199')
>>> pd.Timestamp(dt) # or pd.to_datetime(dt)
Timestamp('2017-10-24 04:03:10.007199')
Converting numpy datetime64 to datetime and Timestamp
>>> dt64 = np.datetime64('2017-10-24 05:34:20.123456')
>>> unix_epoch = np.datetime64(0, 's')
>>> one_second = np.timedelta64(1, 's')
>>> seconds_since_epoch = (dt64 - unix_epoch) / one_second
>>> seconds_since_epoch
1508823260.123456
>>> datetime.datetime.utcfromtimestamp(seconds_since_epoch)
datetime.datetime(2017, 10, 24, 5, 34, 20, 123456)
Convert to Timestamp
>>> pd.Timestamp(dt64)
Timestamp('2017-10-24 05:34:20.123456')
Convert from Timestamp to datetime and datetime64
This is quite easy as pandas timestamps are very powerful
>>> ts = pd.Timestamp('2017-10-24 04:24:33.654321')
>>> ts.to_pydatetime() # Python's datetime
datetime.datetime(2017, 10, 24, 4, 24, 33, 654321)
>>> ts.to_datetime64()
numpy.datetime64('2017-10-24T04:24:33.654321000')
>>> dt64.tolist()
datetime.datetime(2012, 5, 1, 0, 0)
For DatetimeIndex, the tolist returns a list of datetime objects. For a single datetime64 object it returns a single datetime object.
One option is to use str, and then to_datetime (or similar):
In [11]: str(dt64)
Out[11]: '2012-05-01T01:00:00.000000+0100'
In [12]: pd.to_datetime(str(dt64))
Out[12]: datetime.datetime(2012, 5, 1, 1, 0, tzinfo=tzoffset(None, 3600))
Note: it is not equal to dt because it's become "offset-aware":
In [13]: pd.to_datetime(str(dt64)).replace(tzinfo=None)
Out[13]: datetime.datetime(2012, 5, 1, 1, 0)
This seems inelegant.
Update: this can deal with the "nasty example":
In [21]: dt64 = numpy.datetime64('2002-06-28T01:00:00.000000000+0100')
In [22]: pd.to_datetime(str(dt64)).replace(tzinfo=None)
Out[22]: datetime.datetime(2002, 6, 28, 1, 0)
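A string round trip can also be done without pandas. The sketch below is an assumption-laden variant of the idea above: it uses np.datetime_as_string to cut the value to microseconds and datetime.fromisoformat (Python 3.7+) to parse it, with a timezone-free input because recent NumPy releases no longer attach an offset.

```python
from datetime import datetime

import numpy as np

dt64 = np.datetime64("2002-06-28T01:00:00.000000000")

# Render at microsecond precision so the text fits what datetime can hold.
iso = str(np.datetime_as_string(dt64, unit="us"))

py_dt = datetime.fromisoformat(iso)

print(iso)
print(py_dt)
```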
If you want to convert an entire pandas series of datetimes to regular python datetimes, you can also use .to_pydatetime().
pd.date_range('20110101','20110102',freq='H').to_pydatetime()
> [datetime.datetime(2011, 1, 1, 0, 0) datetime.datetime(2011, 1, 1, 1, 0)
datetime.datetime(2011, 1, 1, 2, 0) datetime.datetime(2011, 1, 1, 3, 0)
....
It also supports timezones:
pd.date_range('20110101','20110102',freq='H').tz_localize('UTC').tz_convert('Australia/Sydney').to_pydatetime()
[ datetime.datetime(2011, 1, 1, 11, 0, tzinfo=<DstTzInfo 'Australia/Sydney' EST+11:00:00 DST>)
datetime.datetime(2011, 1, 1, 12, 0, tzinfo=<DstTzInfo 'Australia/Sydney' EST+11:00:00 DST>)
....
NOTE: If you are operating on a Pandas Series you cannot call to_pydatetime() on the entire series. You will need to call .to_pydatetime() on each individual datetime64 using a list comprehension or something similar:
datetimes = [val.to_pydatetime() for val in df.problem_datetime_column]
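The list-comprehension approach can be tried on a small, self-contained Series. The data below is made up for illustration; iterating a datetime Series yields pandas Timestamp objects, each of which has to_pydatetime().

```python
import datetime

import pandas as pd

# Hypothetical sample data standing in for df.problem_datetime_column.
series = pd.Series(pd.to_datetime(["2011-01-01 00:00", "2011-01-01 01:00"]))

# Convert element by element to plain datetime.datetime objects.
datetimes = [value.to_pydatetime() for value in series]

print(datetimes)
```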
This post has been up for 4 years and I still struggled with this conversion problem - so the issue is still active in 2017 in some sense. I was somewhat shocked that the numpy documentation does not readily offer a simple conversion algorithm but that's another story.
I have come across another way to do the conversion that involves only the numpy and datetime modules. It does not require pandas, which seems to me a lot of code to import for such a simple conversion. I noticed that datetime64.astype(datetime.datetime) returns a datetime.datetime object if the original datetime64 is in microsecond units, while nanosecond units return an integer timestamp. I use the xarray module for data I/O from NetCDF files, which uses datetime64 in nanosecond units, so the conversion fails unless you first convert to microsecond units. Here is the example conversion code:
import numpy as np
import datetime
def convert_datetime64_to_datetime(usert: np.datetime64) -> datetime.datetime:
    # Cast to microsecond units first so astype() yields a datetime object.
    t = np.datetime64(usert, 'us').astype(datetime.datetime)
    return t
It's only tested on my machine, which is Python 3.6 with a recent 2017 Anaconda distribution. I have only looked at scalar conversion and have not checked array-based conversions, although I'm guessing they will work. Nor have I looked at the numpy datetime64 source code to see whether the operation makes sense.
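For the array case that the answer leaves unchecked, a minimal sketch (assuming a recent NumPy, with made-up sample values) is to cast the whole array to microsecond units and then call tolist():

```python
import datetime

import numpy as np

# Nanosecond-resolution array, similar to what xarray or pandas would hand over.
arr = np.array(
    ["2013-06-28T00:00:00", "2013-07-03T12:30:00"], dtype="datetime64[ns]"
)

# Cast to microseconds, then convert every element to datetime.datetime.
as_datetimes = arr.astype("datetime64[us]").tolist()

print(as_datetimes)
```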
import numpy as np
import pandas as pd
def np64toDate(np64):
    # to_pydatetime() replaces the older Timestamp.to_datetime() method.
    return pd.to_datetime(str(np64)).replace(tzinfo=None).to_pydatetime()
Use this function to get Python's native datetime object.
I've come back to this answer more times than I can count, so I decided to throw together a quick little class, which converts a Numpy datetime64 value to Python datetime value. I hope it helps others out there.
from datetime import datetime
import pandas as pd
class NumpyConverter(object):
    @classmethod
    def to_datetime(cls, dt64, tzinfo=None):
        """
        Converts a Numpy datetime64 to a Python datetime.
        :param dt64: A Numpy datetime64 variable
        :type dt64: numpy.datetime64
        :param tzinfo: The timezone the date / time value is in
        :type tzinfo: pytz.timezone
        :return: A Python datetime variable
        :rtype: datetime
        """
        ts = pd.to_datetime(dt64)
        if tzinfo is not None:
            return datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second, tzinfo=tzinfo)
        return datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second)
I'm gonna keep this in my tool bag, something tells me I'll need it again.
I did it like this:
import pandas as pd
# Custom function to convert Pandas Datetime to Timestamp
def toTimestamp(data):
    return data.timestamp()
# Read a csv file
df = pd.read_csv("friends.csv")
# Replace the "birthdate" column by:
# 1. Transform to datetime
# 2. Apply the custom function to the column just converted
df["birthdate"] = pd.to_datetime(df["birthdate"]).apply(toTimestamp)
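A vectorized alternative to apply() is to subtract the epoch and floor-divide by one second. The sketch below uses made-up in-memory data instead of friends.csv and assumes the values are timezone-naive and meant as UTC.

```python
import pandas as pd

# Hypothetical sample values standing in for the "birthdate" column.
birthdates = pd.Series(pd.to_datetime(["2018-03-03 14:30:00", "1970-01-01 00:00:01"]))

# Whole seconds since the UNIX epoch, computed for the entire column at once.
epoch_seconds = (birthdates - pd.Timestamp("1970-01-01")) // pd.Timedelta("1s")

print(epoch_seconds.tolist())
```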
Some solutions work well for me, but numpy will deprecate some parameters.
The solution that works better for me is to read the date as a pandas datetime and explicitly extract the year, month and day of the pandas object.
The following code works for the most common situations.
import datetime
import pandas as pd
def format_dates(dates):
    dt = pd.to_datetime(dates)
    try:
        return [datetime.date(x.year, x.month, x.day) for x in dt]
    except TypeError:
        return datetime.date(dt.year, dt.month, dt.day)
The only way I managed to convert a 'date' column containing time info in a pandas DataFrame to a numpy array was as follows (the DataFrame is read from the CSV file "csvIn.csv"):
import pandas as pd
import numpy as np
df = pd.read_csv("csvIn.csv")
df["date"] = pd.to_datetime(df["date"])
timestamps = np.array([np.datetime64(value) for dummy, value in df["date"].items()])
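For comparison, a datetime column can usually be handed to NumPy in one call with Series.to_numpy(). The sketch below replaces the CSV file with a small made-up DataFrame and assumes timezone-naive values.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the contents of "csvIn.csv".
df = pd.DataFrame({"date": ["2017-01-01 04:03:10", "2017-01-02 05:00:00"]})
df["date"] = pd.to_datetime(df["date"])

# For a timezone-naive datetime column this is a datetime64 array.
timestamps = df["date"].to_numpy()

print(timestamps.dtype, timestamps)
```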
Indeed, all of these datetime types can be difficult and potentially problematic (you must keep careful track of timezone information). Here's what I have done, though I admit that I am concerned that at least part of it is "not by design". Also, this can be made a bit more compact as needed.
starting with a numpy.datetime64 dt_a:
dt_a
numpy.datetime64('2015-04-24T23:11:26.270000-0700')
dt_a1 = dt_a.tolist() # yields a datetime object in UTC, but without tzinfo
dt_a1
datetime.datetime(2015, 4, 25, 6, 11, 26, 270000)
# now, make your "aware" datetime:
dt_a2=datetime.datetime(*list(dt_a1.timetuple()[:6]) + [dt_a1.microsecond], tzinfo=pytz.timezone('UTC'))
... and of course, that can be compressed into one line as needed.
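One compact form, sketched here with only the standard library and NumPy, is to attach UTC with replace(). It assumes the datetime64 value already represents UTC, and it uses datetime.timezone.utc in place of pytz; the sample value is the UTC instant from the answer above.

```python
import datetime

import numpy as np

dt64 = np.datetime64("2015-04-25T06:11:26.270")

naive = dt64.item()  # millisecond unit, so this is a naive datetime.datetime
aware = naive.replace(tzinfo=datetime.timezone.utc)  # label the value as UTC

print(aware)
```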
Related
Strange behavior with pandas timestamp to posix conversion
I do the following operations:
Convert string datetime in pandas dataframe to python datetime via apply(strptime)
Convert datetime to POSIX timestamp via the .timestamp() method
If I revert the POSIX timestamp back to a datetime with .fromtimestamp(), I obtain a different datetime
It differs by 3 hours, which is my timezone offset (I'm at UTC+3 now), so I suppose it is a kind of timezone issue. I also understand that apply implicitly converts to pandas.Timestamp, but I don't understand the difference in this case.
What is the reason for such strange behavior, and what should I do to avoid it? In my project I need to compare these pandas timestamps with correct POSIX timestamps, and currently the comparison is wrong.
Below is a dummy reproducible example:
df = pd.DataFrame(['2018-03-03 14:30:00'], columns=['c'])
df['c'] = df['c'].apply(lambda x: datetime.datetime.strptime(x, '%Y-%m-%d %H:%M:%S'))
dt = df['c'].iloc[0]
dt
>> Timestamp('2018-03-03 14:30:00')
datetime.datetime.fromtimestamp(dt.timestamp())
>> datetime.datetime(2018, 3, 3, 17, 30)
First, I suggest using the np.timedelta64 dtype when working with pandas. In this case it makes the reciprocity simple.
pd.to_datetime('2018-03-03 14:30:00').value
#1520087400000000000
pd.to_datetime(pd.to_datetime('2018-03-03 14:30:00').value)
#Timestamp('2018-03-03 14:30:00')
The issue with the other methods is that POSIX has UTC as the origin, but fromtimestamp returns the local time. If your system isn't UTC compliant, then we get issues. The following methods will work to remedy this:
from datetime import datetime
import pytz
dt
#Timestamp('2018-03-03 14:30:00')
# Seemingly problematic:
datetime.fromtimestamp(dt.timestamp())
#datetime.datetime(2018, 3, 3, 9, 30)
datetime.fromtimestamp(dt.timestamp(), tz=pytz.utc)
#datetime.datetime(2018, 3, 3, 14, 30, tzinfo=<UTC>)
datetime.combine(dt.date(), dt.timetz())
#datetime.datetime(2018, 3, 3, 14, 30)
mytz = pytz.timezone('US/Eastern') # Use your own local timezone
datetime.fromtimestamp(mytz.localize(dt).timestamp())
#datetime.datetime(2018, 3, 3, 14, 30)
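The UTC-based round trip can also be sketched with the standard library's timezone.utc instead of pytz. This assumes the naive Timestamp is meant as UTC, which is how pandas treats it in timestamp(); round_trip and naive_again are illustrative names.

```python
from datetime import datetime, timezone

import pandas as pd

ts = pd.Timestamp("2018-03-03 14:30:00")

posix = ts.timestamp()  # a naive pandas Timestamp is treated as UTC here

# Passing tz=timezone.utc keeps the local timezone out of the conversion.
round_trip = datetime.fromtimestamp(posix, tz=timezone.utc)
naive_again = round_trip.replace(tzinfo=None)

print(posix)
print(round_trip)
```

Because no local-time conversion is involved, the result does not shift by the machine's UTC offset.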
An answer with the to_datetime function:
df = pd.DataFrame(['2018-03-03 14:30:00'], columns=['c'])
df['c'] = pd.to_datetime(df['c'].values, dayfirst=False).tz_localize('Your/Timezone')
When working with dates, you should always attach a timezone; it makes them easier to work with afterwards. This does not explain the difference between datetimes in pandas and standalone datetimes.
How to convert numpy datetime64 [ns] to python datetime?
I need to convert dates from pandas frame values in a separate function:
def myfunc(lat, lon, when):
    ts = (when - np.datetime64('1970-01-01T00:00:00Z','s')) / np.timedelta64(1, 's')
    date = datetime.datetime.utcfromtimestamp(ts)
    print("Numpy date= ", when, " Python date= ", date)
    return float(90) - next_func(lat, lon, date)
Invocation of this function:
new_df['new_column'] = np.vectorize(my_func)(lat, lon, new_df['datetime(LT)'])
But it raises an error:
ufunc subtract cannot use operands with types dtype('int64') and dtype('<M8[s]')
How to convert numpy datetime64 [ns] to python datetime?
I wonder if you need all this conversion work. With the right time units a datetime64 can produce a datetime object directly.
I'm not sure about your when variable, but let's assume it comes from pandas, and is something like a DatetimeIndex:
In [56]: time = pandas.date_range('6/28/2013', periods=5, freq='5D')
In [57]: time
Out[57]:
DatetimeIndex(['2013-06-28', '2013-07-03', '2013-07-08', '2013-07-13',
               '2013-07-18'],
              dtype='datetime64[ns]', freq='5D')
The equivalent numpy array
In [58]: time.values
Out[58]:
array(['2013-06-28T00:00:00.000000000', '2013-07-03T00:00:00.000000000',
       '2013-07-08T00:00:00.000000000', '2013-07-13T00:00:00.000000000',
       '2013-07-18T00:00:00.000000000'], dtype='datetime64[ns]')
In [59]: time.values.tolist()
Out[59]:
[1372377600000000000,
 1372809600000000000,
 1373241600000000000,
 1373673600000000000,
 1374105600000000000]
With [ns] the result is a large integer, a 'timestamp' of some sort. But if I convert the time units to something like seconds, or even microseconds (us):
In [60]: time.values.astype('datetime64[s]')
Out[60]:
array(['2013-06-28T00:00:00', '2013-07-03T00:00:00', '2013-07-08T00:00:00',
       '2013-07-13T00:00:00', '2013-07-18T00:00:00'], dtype='datetime64[s]')
In [61]: time.values.astype('datetime64[s]').tolist()
Out[61]:
[datetime.datetime(2013, 6, 28, 0, 0),
 datetime.datetime(2013, 7, 3, 0, 0),
 datetime.datetime(2013, 7, 8, 0, 0),
 datetime.datetime(2013, 7, 13, 0, 0),
 datetime.datetime(2013, 7, 18, 0, 0)]
the result is a list of datetime objects.
I prefer this workaround because sometimes np.datetime64 has a different resolution:
def ___convert_to_datetime(d):
    return datetime.strptime(np.datetime_as_string(d,unit='s'), '%Y-%m-%dT%H:%M:%S')
For a timestamp:
def ___convert_to_ts(d):
    return datetime.strptime(np.datetime_as_string(d,unit='s'), '%Y-%m-%dT%H:%M:%S').timestamp()
For instance:
import numpy as np
from datetime import datetime
def ___convert_to_datetime(d):
    return datetime.strptime(np.datetime_as_string(d,unit='s'), '%Y-%m-%dT%H:%M:%S')
def ___convert_to_ts(d):
    return datetime.strptime(np.datetime_as_string(d,unit='s'), '%Y-%m-%dT%H:%M:%S').timestamp()
print(___convert_to_datetime(np.datetime64('2005-02-25')))
my_ns_date = np.datetime64('2009') + np.timedelta64(20, 'ns')
print(my_ns_date)
print(___convert_to_datetime(my_ns_date))
The output will be:
2005-02-25 00:00:00
2009-01-01T00:00:00.000000020
2009-01-01 00:00:00
def myfunc(lat, lon, when):
    ts = (when - np.datetime64('1970-01-01T00:00:00Z','s')) / np.timedelta64(1, 's')
    date = datetime.utcfromtimestamp(ts)
    print("Numpy date= ", when, " Python date= ", date)
    return float(90) - next_func(lat, lon, date)
Try this code to convert numpy datetime64[ns] to a Python datetime. The key is the following code segment:
from datetime import datetime
datetime.utcfromtimestamp('your_time_stamp')
Guidelines for using various datetime classes in pandas [duplicate]
How do I convert a numpy.datetime64 object to a datetime.datetime (or Timestamp)? In the following code, I create a datetime, timestamp and datetime64 objects. import datetime import numpy as np import pandas as pd dt = datetime.datetime(2012, 5, 1) # A strange way to extract a Timestamp object, there's surely a better way? ts = pd.DatetimeIndex([dt])[0] dt64 = np.datetime64(dt) In [7]: dt Out[7]: datetime.datetime(2012, 5, 1, 0, 0) In [8]: ts Out[8]: <Timestamp: 2012-05-01 00:00:00> In [9]: dt64 Out[9]: numpy.datetime64('2012-05-01T01:00:00.000000+0100') Note: it's easy to get the datetime from the Timestamp: In [10]: ts.to_datetime() Out[10]: datetime.datetime(2012, 5, 1, 0, 0) But how do we extract the datetime or Timestamp from a numpy.datetime64 (dt64)? . Update: a somewhat nasty example in my dataset (perhaps the motivating example) seems to be: dt64 = numpy.datetime64('2002-06-28T01:00:00.000000000+0100') which should be datetime.datetime(2002, 6, 28, 1, 0), and not a long (!) (1025222400000000000L)...
You can just use the pd.Timestamp constructor. The following diagram may be useful for this and related questions.
Welcome to hell. You can just pass a datetime64 object to pandas.Timestamp: In [16]: Timestamp(numpy.datetime64('2012-05-01T01:00:00.000000')) Out[16]: <Timestamp: 2012-05-01 01:00:00> I noticed that this doesn't work right though in NumPy 1.6.1: numpy.datetime64('2012-05-01T01:00:00.000000+0100') Also, pandas.to_datetime can be used (this is off of the dev version, haven't checked v0.9.1): In [24]: pandas.to_datetime('2012-05-01T01:00:00.000000+0100') Out[24]: datetime.datetime(2012, 5, 1, 1, 0, tzinfo=tzoffset(None, 3600))
To convert numpy.datetime64 to datetime object that represents time in UTC on numpy-1.8: >>> from datetime import datetime >>> import numpy as np >>> dt = datetime.utcnow() >>> dt datetime.datetime(2012, 12, 4, 19, 51, 25, 362455) >>> dt64 = np.datetime64(dt) >>> ts = (dt64 - np.datetime64('1970-01-01T00:00:00Z')) / np.timedelta64(1, 's') >>> ts 1354650685.3624549 >>> datetime.utcfromtimestamp(ts) datetime.datetime(2012, 12, 4, 19, 51, 25, 362455) >>> np.__version__ '1.8.0.dev-7b75899' The above example assumes that a naive datetime object is interpreted by np.datetime64 as time in UTC. To convert datetime to np.datetime64 and back (numpy-1.6): >>> np.datetime64(datetime.utcnow()).astype(datetime) datetime.datetime(2012, 12, 4, 13, 34, 52, 827542) It works both on a single np.datetime64 object and a numpy array of np.datetime64. Think of np.datetime64 the same way you would about np.int8, np.int16, etc and apply the same methods to convert between Python objects such as int, datetime and corresponding numpy objects. Your "nasty example" works correctly: >>> from datetime import datetime >>> import numpy >>> numpy.datetime64('2002-06-28T01:00:00.000000000+0100').astype(datetime) datetime.datetime(2002, 6, 28, 0, 0) >>> numpy.__version__ '1.6.2' # current version available via pip install numpy I can reproduce the long value on numpy-1.8.0 installed as: pip install git+https://github.com/numpy/numpy.git#egg=numpy-dev The same example: >>> from datetime import datetime >>> import numpy >>> numpy.datetime64('2002-06-28T01:00:00.000000000+0100').astype(datetime) 1025222400000000000L >>> numpy.__version__ '1.8.0.dev-7b75899' It returns long because for numpy.datetime64 type .astype(datetime) is equivalent to .astype(object) that returns Python integer (long) on numpy-1.8. 
To get datetime object you could: >>> dt64.dtype dtype('<M8[ns]') >>> ns = 1e-9 # number of seconds in a nanosecond >>> datetime.utcfromtimestamp(dt64.astype(int) * ns) datetime.datetime(2002, 6, 28, 0, 0) To get datetime64 that uses seconds directly: >>> dt64 = numpy.datetime64('2002-06-28T01:00:00.000000000+0100', 's') >>> dt64.dtype dtype('<M8[s]') >>> datetime.utcfromtimestamp(dt64.astype(int)) datetime.datetime(2002, 6, 28, 0, 0) The numpy docs say that the datetime API is experimental and may change in future numpy versions.
I think there could be a more consolidated effort in an answer to better explain the relationship between Python's datetime module, numpy's datetime64/timedelta64 and pandas' Timestamp/Timedelta objects. The datetime standard library of Python The datetime standard library has four main objects time - only time, measured in hours, minutes, seconds and microseconds date - only year, month and day datetime - All components of time and date timedelta - An amount of time with maximum unit of days Create these four objects >>> import datetime >>> datetime.time(hour=4, minute=3, second=10, microsecond=7199) datetime.time(4, 3, 10, 7199) >>> datetime.date(year=2017, month=10, day=24) datetime.date(2017, 10, 24) >>> datetime.datetime(year=2017, month=10, day=24, hour=4, minute=3, second=10, microsecond=7199) datetime.datetime(2017, 10, 24, 4, 3, 10, 7199) >>> datetime.timedelta(days=3, minutes = 55) datetime.timedelta(3, 3300) >>> # add timedelta to datetime >>> datetime.timedelta(days=3, minutes = 55) + \ datetime.datetime(year=2017, month=10, day=24, hour=4, minute=3, second=10, microsecond=7199) datetime.datetime(2017, 10, 27, 4, 58, 10, 7199) NumPy's datetime64 and timedelta64 objects NumPy has no separate date and time objects, just a single datetime64 object to represent a single moment in time. The datetime module's datetime object has microsecond precision (one-millionth of a second). NumPy's datetime64 object allows you to set its precision from hours all the way to attoseconds (10 ^ -18). It's constructor is more flexible and can take a variety of inputs. Construct NumPy's datetime64 and timedelta64 objects Pass an integer with a string for the units. See all units here. It gets converted to that many units after the UNIX epoch: Jan 1, 1970 >>> np.datetime64(5, 'ns') numpy.datetime64('1970-01-01T00:00:00.000000005') >>> np.datetime64(1508887504, 's') numpy.datetime64('2017-10-24T23:25:04') You can also use strings as long as they are in ISO 8601 format. 
>>> np.datetime64('2017-10-24') numpy.datetime64('2017-10-24') Timedeltas have a single unit >>> np.timedelta64(5, 'D') # 5 days >>> np.timedelta64(10, 'h') 10 hours Can also create them by subtracting two datetime64 objects >>> np.datetime64('2017-10-24T05:30:45.67') - np.datetime64('2017-10-22T12:35:40.123') numpy.timedelta64(147305547,'ms') Pandas Timestamp and Timedelta build much more functionality on top of NumPy A pandas Timestamp is a moment in time very similar to a datetime but with much more functionality. You can construct them with either pd.Timestamp or pd.to_datetime. >>> pd.Timestamp(1239.1238934) #defaults to nanoseconds Timestamp('1970-01-01 00:00:00.000001239') >>> pd.Timestamp(1239.1238934, unit='D') # change units Timestamp('1973-05-24 02:58:24.355200') >>> pd.Timestamp('2017-10-24 05') # partial strings work Timestamp('2017-10-24 05:00:00') pd.to_datetime works very similarly (with a few more options) and can convert a list of strings into Timestamps. >>> pd.to_datetime('2017-10-24 05') Timestamp('2017-10-24 05:00:00') >>> pd.to_datetime(['2017-1-1', '2017-1-2']) DatetimeIndex(['2017-01-01', '2017-01-02'], dtype='datetime64[ns]', freq=None) Converting Python datetime to datetime64 and Timestamp >>> dt = datetime.datetime(year=2017, month=10, day=24, hour=4, minute=3, second=10, microsecond=7199) >>> np.datetime64(dt) numpy.datetime64('2017-10-24T04:03:10.007199') >>> pd.Timestamp(dt) # or pd.to_datetime(dt) Timestamp('2017-10-24 04:03:10.007199') Converting numpy datetime64 to datetime and Timestamp >>> dt64 = np.datetime64('2017-10-24 05:34:20.123456') >>> unix_epoch = np.datetime64(0, 's') >>> one_second = np.timedelta64(1, 's') >>> seconds_since_epoch = (dt64 - unix_epoch) / one_second >>> seconds_since_epoch 1508823260.123456 >>> datetime.datetime.utcfromtimestamp(seconds_since_epoch) >>> datetime.datetime(2017, 10, 24, 5, 34, 20, 123456) Convert to Timestamp >>> pd.Timestamp(dt64) Timestamp('2017-10-24 05:34:20.123456') Convert from 
Timestamp to datetime and datetime64 This is quite easy as pandas timestamps are very powerful >>> ts = pd.Timestamp('2017-10-24 04:24:33.654321') >>> ts.to_pydatetime() # Python's datetime datetime.datetime(2017, 10, 24, 4, 24, 33, 654321) >>> ts.to_datetime64() numpy.datetime64('2017-10-24T04:24:33.654321000')
>>> dt64.tolist() datetime.datetime(2012, 5, 1, 0, 0) For DatetimeIndex, the tolist returns a list of datetime objects. For a single datetime64 object it returns a single datetime object.
One option is to use str, and then to_datetime (or similar): In [11]: str(dt64) Out[11]: '2012-05-01T01:00:00.000000+0100' In [12]: pd.to_datetime(str(dt64)) Out[12]: datetime.datetime(2012, 5, 1, 1, 0, tzinfo=tzoffset(None, 3600)) Note: it is not equal to dt because it's become "offset-aware": In [13]: pd.to_datetime(str(dt64)).replace(tzinfo=None) Out[13]: datetime.datetime(2012, 5, 1, 1, 0) This seems inelegant. . Update: this can deal with the "nasty example": In [21]: dt64 = numpy.datetime64('2002-06-28T01:00:00.000000000+0100') In [22]: pd.to_datetime(str(dt64)).replace(tzinfo=None) Out[22]: datetime.datetime(2002, 6, 28, 1, 0)
If you want to convert an entire pandas series of datetimes to regular python datetimes, you can also use .to_pydatetime(). pd.date_range('20110101','20110102',freq='H').to_pydatetime() > [datetime.datetime(2011, 1, 1, 0, 0) datetime.datetime(2011, 1, 1, 1, 0) datetime.datetime(2011, 1, 1, 2, 0) datetime.datetime(2011, 1, 1, 3, 0) .... It also supports timezones: pd.date_range('20110101','20110102',freq='H').tz_localize('UTC').tz_convert('Australia/Sydney').to_pydatetime() [ datetime.datetime(2011, 1, 1, 11, 0, tzinfo=<DstTzInfo 'Australia/Sydney' EST+11:00:00 DST>) datetime.datetime(2011, 1, 1, 12, 0, tzinfo=<DstTzInfo 'Australia/Sydney' EST+11:00:00 DST>) .... NOTE: If you are operating on a Pandas Series you cannot call to_pydatetime() on the entire series. You will need to call .to_pydatetime() on each individual datetime64 using a list comprehension or something similar: datetimes = [val.to_pydatetime() for val in df.problem_datetime_column]
This post has been up for 4 years and I still struggled with this conversion problem - so the issue is still active in 2017 in some sense. I was somewhat shocked that the numpy documentation does not readily offer a simple conversion algorithm but that's another story. I have come across another way to do the conversion that only involves modules numpy and datetime, it does not require pandas to be imported which seems to me to be a lot of code to import for such a simple conversion. I noticed that datetime64.astype(datetime.datetime) will return a datetime.datetime object if the original datetime64 is in micro-second units while other units return an integer timestamp. I use module xarray for data I/O from Netcdf files which uses the datetime64 in nanosecond units making the conversion fail unless you first convert to micro-second units. Here is the example conversion code, import numpy as np import datetime def convert_datetime64_to_datetime( usert: np.datetime64 )->datetime.datetime: t = np.datetime64( usert, 'us').astype(datetime.datetime) return t Its only tested on my machine, which is Python 3.6 with a recent 2017 Anaconda distribution. I have only looked at scalar conversion and have not checked array based conversions although I'm guessing it will be good. Nor have I looked at the numpy datetime64 source code to see if the operation makes sense or not.
import numpy as np
import pandas as pd

def np64toDate(np64):
    # Parse the string form, drop any timezone, and return a native datetime
    return pd.to_datetime(str(np64)).replace(tzinfo=None).to_pydatetime()

Use this function to get Python's native datetime object.
I've come back to this answer more times than I can count, so I decided to throw together a quick little class which converts a Numpy datetime64 value to a Python datetime value. I hope it helps others out there.
from datetime import datetime
import pandas as pd

class NumpyConverter(object):

    @classmethod
    def to_datetime(cls, dt64, tzinfo=None):
        """
        Converts a Numpy datetime64 to a Python datetime.

        :param dt64: A Numpy datetime64 variable
        :type dt64: numpy.datetime64
        :param tzinfo: The timezone the date / time value is in
        :type tzinfo: pytz.timezone
        :return: A Python datetime variable
        :rtype: datetime
        """
        ts = pd.to_datetime(dt64)
        if tzinfo is not None:
            return datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second, tzinfo=tzinfo)
        return datetime(ts.year, ts.month, ts.day, ts.hour, ts.minute, ts.second)

I'm gonna keep this in my tool bag; something tells me I'll need it again.
I did it like this:
import pandas as pd

# Custom function to convert a pandas datetime to a timestamp
def toTimestamp(data):
    return data.timestamp()

# Read a csv file
df = pd.read_csv("friends.csv")

# Replace the "birthdate" column by:
# 1. Transforming it to datetime
# 2. Applying the custom function to the column just converted
df["birthdate"] = pd.to_datetime(df["birthdate"]).apply(toTimestamp)
Some solutions work well for me, but numpy will deprecate some parameters. The solution that works better for me is to read the date as a pandas datetime and explicitly extract the year, month and day from the pandas object. The following code works for the most common situations.
import datetime
import pandas as pd

def format_dates(dates):
    dt = pd.to_datetime(dates)
    try:
        # Iterable input: build one date per element
        return [datetime.date(x.year, x.month, x.day) for x in dt]
    except TypeError:
        # Scalar input: a single Timestamp is not iterable
        return datetime.date(dt.year, dt.month, dt.day)
The only way I managed to convert a 'date' column containing time info in a pandas DataFrame to a numpy array was as follows (the DataFrame is read from the CSV file "csvIn.csv"):
import pandas as pd
import numpy as np

df = pd.read_csv("csvIn.csv")
df["date"] = pd.to_datetime(df["date"])
timestamps = np.array([np.datetime64(value) for dummy, value in df["date"].items()])
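For what it's worth, a shorter route may be available: once the column has been parsed with pd.to_datetime, Series.to_numpy() (added in pandas 0.24) should hand back a datetime64 array directly. This is only a sketch; the in-memory Series below stands in for the CSV column and its values are made up.

```python
import numpy as np
import pandas as pd

# Made-up data standing in for the "date" column read from the CSV file.
dates = pd.Series(["2020-01-01 08:00", "2020-01-02 09:30"])

# Parse the strings, then ask pandas for the underlying numpy array.
parsed = pd.to_datetime(dates)
timestamps = parsed.to_numpy()

print(timestamps.dtype, timestamps)
```

The exact datetime64 unit of the result depends on the pandas version, so it is safer not to rely on it being nanoseconds.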
Indeed, all of these datetime types can be difficult and potentially problematic (you must keep careful track of timezone information). Here's what I have done, though I admit that I am concerned that at least part of it is "not by design". Also, this can be made a bit more compact as needed. Starting with a numpy.datetime64 dt_a:
dt_a
numpy.datetime64('2015-04-24T23:11:26.270000-0700')
dt_a1 = dt_a.tolist() # yields a datetime object in UTC, but without tzinfo
dt_a1
datetime.datetime(2015, 4, 25, 6, 11, 26, 270000)
# now, make your "aware" datetime:
dt_a2 = datetime.datetime(*list(dt_a1.timetuple()[:6]) + [dt_a1.microsecond], tzinfo=pytz.timezone('UTC'))
... and of course, that can be compressed into one line as needed.
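The last step above can probably be shortened with datetime.replace() and the standard library's timezone.utc, which avoids rebuilding the datetime field by field and does not need pytz. A minimal sketch, assuming the datetime64 value really does represent a UTC instant (the sample value is written without an offset, since newer numpy versions complain about offsets in the string):

```python
import datetime
import numpy as np

# Sample value, assumed to be a UTC instant with millisecond precision.
dt64 = np.datetime64("2015-04-25T06:11:26.270")

# tolist() on a scalar gives a naive datetime.datetime (no tzinfo attached).
naive = dt64.tolist()

# Attach UTC without touching the clock fields; microseconds are preserved.
aware = naive.replace(tzinfo=datetime.timezone.utc)

print(aware.isoformat())
```

replace() only labels the value as UTC and performs no conversion, so it is correct only when the naive value was already in UTC.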
Python: Convert UTC time to localtime given UTC offset
Given the following Python datetime object representing a UTC time:
2016-09-15 22:13:03-2:00
I'm trying to obtain the corresponding local time datetime, where the UTC offset is applied:
2016-09-15 20:13:03
I was hoping to find a method in the datetime module that could do this, but I did not succeed. Any help is much appreciated. Regards
I do not know if this is the best answer, but here is what I have for you. Typically I would not do this, since it is better to use the UTC time and convert. Here is an example:
value = datetime.datetime.strptime(str(utc_datetime), '%Y-%m-%d %H:%M:%S').replace(tzinfo=pytz.utc)
value = value.astimezone(pytz.timezone("America/Los_Angeles"))
I was unable to use your datetime as the syntax is a bit off, so I went ahead and used dateutil.parser to convert it to a datetime object:
>>> from dateutil.parser import parse
>>> val = parse('2016-09-15 22:13:03-2:00')
There are other ways to set a datetime object to UTC, but I find pytz to be the easiest:
>>> import pytz
>>> utc_val = val.replace(tzinfo=pytz.utc)
Here is the output of those two values. From here I grab the delta and subtract it:
>>> val, utc_val
(datetime.datetime(2016, 9, 15, 22, 13, 3, tzinfo=tzoffset(None, -7200)), datetime.datetime(2016, 9, 15, 22, 13, 3, tzinfo=<UTC>))
>>>
>>> delta = val - utc_val
I remove the tzinfo since this is a converted datetime value:
>>> local_dt = (val - delta).replace(tzinfo=None)
>>> local_dt
datetime.datetime(2016, 9, 15, 20, 13, 3)
>>> str(local_dt)
'2016-09-15 20:13:03'
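If pulling in dateutil and pytz feels heavy, roughly the same result can be sketched with the standard library alone. This follows the question's reading, where the clock part is the UTC time and the offset still has to be applied. The function name is made up, and the input is written as "-02:00" rather than "-2:00" because strptime needs a two-digit hour in the offset (and accepts the colon only from Python 3.7 on).

```python
from datetime import datetime

def apply_offset_to_utc_clock(text):
    """Treat the clock part of `text` as UTC and apply its offset to it."""
    parsed = datetime.strptime(text, "%Y-%m-%d %H:%M:%S%z")
    offset = parsed.utcoffset()               # timedelta of -2 hours for "-02:00"
    utc_clock = parsed.replace(tzinfo=None)   # keep only the wall-clock fields
    return utc_clock + offset                 # naive local time

local_dt = apply_offset_to_utc_clock("2016-09-15 22:13:03-02:00")
print(local_dt)  # 2016-09-15 20:13:03
```

Under the usual ISO 8601 reading, the same string would mean 22:13:03 local time at an offset of -02:00, so this helper only suits data that follows the question's convention.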
Convert a datetime object to the correct date (MM-DD-YYYY to DD-MM-YYYY)
I have parsed a date and stored it as a datetime object. The date was written in the format MM-DD-YYYY instead of DD-MM-YYYY when it was parsed. What would be the easiest way to convert the object to the correct date?
You can swap out values with the datetime.datetime.replace() method, provided the day value is within the range 1-12, of course:
dt = dt.replace(month=dt.day, day=dt.month)
The method returns a new datetime instance.
Demo:
>>> from datetime import datetime
>>> dt = datetime(2015, 2, 11)
>>> dt
datetime.datetime(2015, 2, 11, 0, 0)
>>> dt.replace(month=dt.day, day=dt.month)
datetime.datetime(2015, 11, 2, 0, 0)
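The caveat about the day having to be in the range 1-12 can be made explicit with a small guard. This is just a sketch; the function name is hypothetical and not part of the datetime module.

```python
from datetime import datetime

def swap_day_and_month(dt):
    """Return a copy of `dt` with day and month exchanged."""
    if dt.day > 12:
        # A day above 12 cannot have come from a mis-parsed month.
        raise ValueError(f"day {dt.day} cannot be used as a month")
    return dt.replace(month=dt.day, day=dt.month)

fixed = swap_day_and_month(datetime(2015, 2, 11))
print(fixed)  # 2015-11-02 00:00:00
```

Without the guard, replace() would still raise a ValueError for an out-of-range month. The explicit check just produces a clearer message.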
Try this out:
>>> import datetime
>>> d = datetime.datetime.strptime('2011-06-09', '%Y-%m-%d')
>>> d.strftime('%d-%m-%Y')
'09-06-2011'
Not working? Let me know :)