How to convert datetime object to milliseconds - python

I am parsing datetime values as follows:
df['actualDateTime'] = pd.to_datetime(df['actualDateTime'])
How can I convert these datetime objects to milliseconds?
I didn't see any mention of milliseconds in the to_datetime docs.
Update (Based on feedback):
This is the current version of the code; it raises TypeError: Cannot convert input to Timestamp. The column Date3 must contain milliseconds (as a numeric equivalent of a datetime object).
import pandas as pd
import time
s1 = {'Date' : ['2015-10-20T07:21:00.000','2015-10-19T07:18:00.000','2015-10-19T07:15:00.000']}
df = pd.DataFrame(s1)
df['Date2'] = pd.to_datetime(df['Date'])
t = pd.Timestamp(df['Date2'])
df['Date3'] = time.mktime(t.timetuple())
print(df)

You can try pd.to_datetime(df['actualDateTime'], unit='ms')
The to_datetime documentation (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.to_datetime.html)
says unit denotes the epoch unit, with variants 's', 'ms', 'ns', etc.
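Note that unit='ms' tells to_datetime how to interpret *numeric* input; it parses values that are already epoch milliseconds into datetimes, rather than extracting milliseconds from parsed datetimes. A small sketch, using epoch values from the thread's own data:

```python
import pandas as pd

# unit='ms' reads the numbers as milliseconds since the epoch
s = pd.Series([1445325660000, 1445239080000])
dt = pd.to_datetime(s, unit='ms')
print(dt)
# 0   2015-10-20 07:21:00
# 1   2015-10-19 07:18:00
```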
Update
If you want an epoch timestamp of the form 14567899..
import pandas as pd
import time
t = pd.Timestamp('2015-10-19 07:22:00')
time.mktime(t.timetuple())
>> 1445219520.0
Latest update
import numpy as np
import pandas as pd
df = pd.DataFrame(s1)
df1 = pd.to_datetime(df['Date'])
pd.DatetimeIndex(df1)
>>> DatetimeIndex(['2015-10-20 07:21:00', '2015-10-19 07:18:00',
                   '2015-10-19 07:15:00'],
                  dtype='datetime64[ns]', freq=None)
df1.astype(np.int64)
>>> 0    1445325660000000000
    1    1445239080000000000
    2    1445238900000000000
df1.astype(np.int64) // 10**9
>>> 0    1445325660
    1    1445239080
    2    1445238900
    Name: Date, dtype: int64

Timestamps in pandas are always in nanoseconds.
This gives you milliseconds since the epoch (1970-01-01):
df['actualDateTime'] = df['actualDateTime'].astype(np.int64) / int(1e6)

This will return milliseconds from epoch
timestamp_object.timestamp() * 1000
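For a single Timestamp this can be sketched as below; note that pandas' Timestamp.timestamp(), unlike datetime.datetime.timestamp(), treats a naive timestamp as UTC:

```python
import pandas as pd

t = pd.Timestamp('2015-10-19 07:18:00')
# .timestamp() returns float seconds since the epoch (naive treated as UTC)
ms = int(t.timestamp() * 1000)
print(ms)
# 1445239080000
```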

pandas.to_datetime converts strings (and a few other dtypes) to pandas datetime64[ns].
In your case, the initial 'actualDateTime' values have no milliseconds. If you parse a column that does contain milliseconds, you will get that data.
for example,
df
Out[60]:
                         a  b
0  2015-11-02 18:04:32.926  0
1  2015-11-02 18:04:32.928  1
2  2015-11-02 18:04:32.927  2
df.a
Out[61]:
0 2015-11-02 18:04:32.926
1 2015-11-02 18:04:32.928
2 2015-11-02 18:04:32.927
Name: a, dtype: object
df.a = pd.to_datetime(df.a)
df.a
Out[63]:
0 2015-11-02 18:04:32.926
1 2015-11-02 18:04:32.928
2 2015-11-02 18:04:32.927
Name: a, dtype: datetime64[ns]
df.a.dt.nanosecond
Out[64]:
0 0
1 0
2 0
dtype: int64
df.a.dt.microsecond
Out[65]:
0 926000
1 928000
2 927000
dtype: int64
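There is no .dt.millisecond accessor, but the millisecond-of-second can be derived from the microseconds shown above; a small sketch:

```python
import pandas as pd

s = pd.to_datetime(pd.Series(['2015-11-02 18:04:32.926',
                              '2015-11-02 18:04:32.928',
                              '2015-11-02 18:04:32.927']))
# integer-divide microseconds by 1000 to get milliseconds
print((s.dt.microsecond // 1000).tolist())
# [926, 928, 927]
```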

For what it's worth, to convert a single Pandas timestamp object to milliseconds, I had to do:
import time
time.mktime(<timestamp_object>.timetuple())*1000

For Python >= 3.8, given e.g.
df = pd.DataFrame({'temp': [1, 2, 3]}, index=[pd.Timestamp.utcnow()]*3)
convert to milliseconds:
times = df.index.view(np.int64) // int(1e6)
print(times[0])
gives:
1666925409051
Note: to convert to seconds, similarly e.g.:
times = df.index.view(np.int64) // int(1e9)
print(times[0])
1666925409

from datetime import datetime
print(datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S.%f')[:-3])
>>>> OUTPUT >>>>
2015-11-02 18:04:32.926

Related

I have a datetime column (hh:mm:ss) format in a dataframe. I want to pivot the dataframe using aggfunc on the date column

I am trying to pivot the dataframe given below. I have a datetime column (hh:mm:ss) in the dataframe, and I want to use aggfunc on that column when pivoting.
import pandas as pd
data = {'Type':['A', 'B', 'C', 'C'],'Name':['ab', 'ef','gh', 'ij'],'Time':['02:00:00', '03:02:00', '04:00:30','01:02:20']}
df = pd.DataFrame(data)
print (df)
pivot = (
df.pivot_table(index=['Type'],values=['Time'], aggfunc='sum')
)
  Type Name      Time
0    A   ab  02:00:00
1    B   ef  03:02:00
2    C   gh  04:00:30
3    C   ij  01:02:20

                  Time
Type
C     04:00:3001:02:20
A             02:00:00
B             03:02:00
I want the C row to be the sum of the two times: 05:02:50.
This looks more like a groupby sum than a pivot_table.
Convert with to_timedelta to get the appropriate dtype for durations (this makes mathematical operations behave as expected).
Then groupby sum on Type to get the total duration per Type.
# Convert to TimeDelta (appropriate dtype)
df['Time'] = pd.to_timedelta(df['Time'])
new_df = df.groupby('Type')['Time'].sum().reset_index()
new_df:
Type Time
0 A 0 days 02:00:00
1 B 0 days 03:02:00
2 C 0 days 05:02:50
Optional convert back to string:
new_df['Time'] = new_df['Time'].dt.to_pytimedelta().astype(str)
new_df:
Type Time
0 A 2:00:00
1 B 3:02:00
2 C 5:02:50
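If the exact 05:02:50 string from the question is wanted (rather than pandas' "0 days 05:02:50" form), the summed Timedelta can be reformatted by hand; a sketch, with the helper variables being mine:

```python
import pandas as pd

# sum the two C durations, then format the total back to HH:MM:SS
td = pd.to_timedelta(['04:00:30', '01:02:20']).sum()

total = int(td.total_seconds())
h, rem = divmod(total, 3600)
m, sec = divmod(rem, 60)
formatted = f'{h:02d}:{m:02d}:{sec:02d}'
print(formatted)
# 05:02:50
```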

convert object (numeric value) to datetime using pandas

I have the data with time column as shown below, how to convert this as date and time using pandas
Consider df:
In [208]: df
Out[208]:
Time
0 2384798300
1 1500353475
2 7006557825
3 1239779541
4 1237529231
Use datetime.fromtimestamp with Series.apply:
In [200]: from datetime import datetime
In [209]: df['Time'] = df['Time'].apply(lambda x: datetime.fromtimestamp(x))
In [210]: df
Out[210]:
Time
0 2045-07-28 01:28:20
1 2017-07-18 10:21:15
2 2192-01-11 15:33:45
3 2009-04-15 12:42:21
4 2009-03-20 11:37:11
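A vectorized alternative (my suggestion, not from the answer) is pd.to_datetime with unit='s'. Note it interprets the values as UTC, whereas datetime.fromtimestamp uses the local timezone, so the wall-clock output may differ:

```python
import pandas as pd

df = pd.DataFrame({'Time': [1500353475, 1237529231]})

# convert the whole column at once; unit='s' reads epoch seconds (UTC-based)
df['Time'] = pd.to_datetime(df['Time'], unit='s')
print(df['Time'].dt.year.tolist())
# [2017, 2009]
```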

Python: Converting datetime to ordinal

I have a list (actually a column in a pandas DataFrame, if that matters) of Timestamps and I'm trying to convert every element of the list to ordinal format. So I run a for loop through the list (is there a faster way?) and use:
import datetime as dt
a = a.toordinal()
or
import datetime as dt
a = dt.datetime.toordinal(a)
however the following happened(for simplicity):
In[1]: a
Out[1]: Timestamp('2019-12-25 00:00:00')
In[2]: b = dt.datetime.toordinal(a)
In[3]:b
Out[3]: 737418
In[4]:a = b
In[5]:a
Out[5]: Timestamp('1970-01-01 00:00:00.000737418')
The result makes absolutely no sense to me. Obviously what I was trying to get is:
In[1]: a
Out[1]: Timestamp('2019-12-25 00:00:00')
In[2]: b = dt.datetime.toordinal(a)
In[3]:b
Out[3]: 737418
In[4]:a = b
In[5]:a
Out[5]: 737418
What went wrong?
console output screenshot
What went wrong?
Your question is a bit misleading, and the screenshot shows what is going on.
Normally, when you write
a = b
in Python, it will bind the name a to the object bound to b. In this case, you will have
id(a) == id(b)
In your case, however, contrary to your question, you're actually doing the assignment
a[0] = b
This will call a method of a, assigning b to its 0 index. The object's class determines what happens in this case. Here, specifically, a is a pandas.Series, and it converts the object in order to conform to its dtype.
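The coercion is easy to reproduce without the assignment: any plain integer read in a datetime64[ns] context is interpreted as nanoseconds since the epoch. A minimal sketch:

```python
import pandas as pd

b = pd.Timestamp('2019-12-25').toordinal()
print(b)
# 737418

# pd.to_datetime on a bare integer defaults to unit='ns', reproducing
# the surprising Timestamp from the question
coerced = pd.to_datetime(b)
print(coerced)
# 1970-01-01 00:00:00.000737418
```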
Please don't loop. It's not necessary.
#!/usr/bin/env python
import pandas as pd
from datetime import datetime
df = pd.DataFrame({'dates': [datetime(1990, 4, 28),
                             datetime(2018, 4, 13),
                             datetime(2017, 11, 4)]})
print(df)
print(df['dates'].dt.day_name())
print(df['dates'].dt.weekday)
print(df['dates'].dt.month)
print(df['dates'].dt.year)
gives the dataframe:
dates
0 1990-04-28
1 2018-04-13
2 2017-11-04
And the printed values
0 Saturday
1 Friday
2 Saturday
Name: dates, dtype: object
0 5
1 4
2 5
Name: dates, dtype: int64
0 4
1 4
2 11
Name: dates, dtype: int64
0 1990
1 2018
2 2017
Name: dates, dtype: int64
For the toordinal, you need to "loop" with apply:
print(df['dates'].apply(lambda x: x.toordinal()))
gives the following pandas series
0 726585
1 736797
2 736637
Name: dates, dtype: int64

Pandas isin with empty dataframe produces epoch value on datetime type instead of boolean

I've noticed that doing an isin on a DataFrame which contains datetime types, where the operand is an empty DataFrame, produces epoch datetime values (i.e. 1970-01-01) instead of False. It seems unlikely that this is correct?
The following code demonstrates this:
(pandas = 0.19.2, numpy = 1.12.0)
import pandas as pd
data = {'date': ['2014-05-01 18:47:05.069722', '2014-05-01 18:47:05.119994', '2014-05-02 18:47:05.178768']}
data2 = {'date': ['2014-05-01 18:47:05.069722', '2014-05-01 18:47:05.119994']}
df = pd.DataFrame(data, columns = ['date'])
df['date'] = pd.to_datetime(df['date'])
df2 = pd.DataFrame(data2, columns = ['date'])
df2['date'] = pd.to_datetime(df2['date'])
df3 = pd.DataFrame([], columns = ['date'])
df4 = pd.DataFrame()
print df.isin(df2)
print df.isin(df3)
print df.isin(df4)
This outputs:
date
0 True
1 True
2 False
date
0 1970-01-01
1 1970-01-01
2 1970-01-01
date
0 1970-01-01
1 1970-01-01
2 1970-01-01
I would normally expect a list of False values instead of '1970-01-01'. I notice that with pandas = 0.16.2 and numpy = 1.9.2, df.isin(df3) produces the more expected:
date
0 False
1 False
2 False
But df.isin(df4) is as previous.
This definitely looks like a bug to me. isin() calls DataFrame.eq as seen in the source code, and the odd behavior is reproducible with DataFrame.eq itself.
>>> df
date
0 2014-05-01 18:47:05.069722
1 2014-05-01 18:47:05.119994
2 2014-05-02 18:47:05.178768
>>> df.eq(pd.DataFrame(dict(date=[np.nan]*3)))
date
0 1970-01-01
1 1970-01-01
2 1970-01-01
I see you've now raised this as an open issue,
Pandas isin with empty dataframe produces epoch value on datetime type instead of boolean #15473
and it should be resolved in an upcoming release.
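Until the fix lands, one defensive workaround is to special-case the empty frame before calling isin (the empty-check and variable names here are my sketch, not from the thread):

```python
import pandas as pd

df = pd.DataFrame({'date': pd.to_datetime(['2014-05-01 18:47:05.069722',
                                           '2014-05-01 18:47:05.119994',
                                           '2014-05-02 18:47:05.178768'])})
other = pd.DataFrame()

# an empty operand can never contain anything, so the answer is all False
if other.empty:
    result = pd.DataFrame(False, index=df.index, columns=df.columns)
else:
    result = df.isin(other)
print(result['date'].tolist())
# [False, False, False]
```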

Need to convert entire column from string format to date format from a Dataframe

I am trying to convert an entire column from a Dataframe from a string to date format. Does anyone have any recommendations on how to do this?
I've successfully been able to change one element using strptime, but I'm not sure of the best way to apply this to the entire column:
Sample of the raw Data:
2016/06/28 02:13:51
In:
Var1 = Dataframename['columnname'][0]
temp = datetime.datetime.strptime(Var1, '%Y/%m/%d %H:%M:%S')
temp
Out:
datetime.datetime(2016, 6, 28, 2, 13, 51)
I think you are looking for this (note the format must match the slash-separated sample data):
import datetime
data['colname'] = data['colname'].apply(lambda x: datetime.datetime.strptime(x, "%Y/%m/%d %H:%M:%S"))
You can use to_datetime:
df = pd.DataFrame({'b':['2016/06/28 02:13:51','2016/06/28 02:13:51','2016/06/28 02:13:51'],
'a':[4,5,6]})
print (df)
a b
0 4 2016/06/28 02:13:51
1 5 2016/06/28 02:13:51
2 6 2016/06/28 02:13:51
df['b'] = pd.to_datetime(df.b)
print (df)
a b
0 4 2016-06-28 02:13:51
1 5 2016-06-28 02:13:51
2 6 2016-06-28 02:13:51
print (df.dtypes)
a int64
b datetime64[ns]
dtype: object
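When the input format is known, passing it explicitly to to_datetime is stricter and typically faster than letting pandas infer it; a sketch using the question's format:

```python
import pandas as pd

df = pd.DataFrame({'b': ['2016/06/28 02:13:51', '2016/06/28 02:13:52']})

# explicit format: fails loudly on malformed rows instead of guessing
df['b'] = pd.to_datetime(df['b'], format='%Y/%m/%d %H:%M:%S')
print(df['b'].iloc[0])
# 2016-06-28 02:13:51
```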
