I have a number of seconds in a dataframe, let's say:
s = 122
I want to convert it to the following format:
00:02:02.0000
To do that I try using to_datetime the following way:
pd.to_datetime(s, format='%H:%M:%S.%f')
However this doesn't work:
ValueError: time data 122 does not match format '%H:%M:%S.%f' (match)
I also tried using unit='ms' instead of format, but then I get the date before the time.
How can I modify my code to get the desired conversion?
It needs to be done in the dataframe using pandas if possible.
EDIT: both jezrael's and MedAli's solutions below are valid; however, jezrael's solution has the advantage of working not only with integers but also with datetime.time as input!
Use to_timedelta after converting the seconds to nanoseconds:
df = pd.DataFrame({'sec':[122,3,5,7,1,0]})
df['t'] = pd.to_timedelta(df['sec'] * 10**9)
print (df)
sec t
0 122 00:02:02
1 3 00:00:03
2 5 00:00:05
3 7 00:00:07
4 1 00:00:01
5 0 00:00:00
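If the exact 00:02:02.0000 layout from the question is needed as text, one possible sketch is to format the total seconds by hand. The helper name format_hms and the column name t_str are made up for this illustration, and unit='s' is used here as an equivalent of multiplying by 10**9:

```python
import pandas as pd

def format_hms(total_seconds):
    # Hypothetical helper: split a number of seconds into HH:MM:SS.ffff
    hours, rem = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{int(hours):02d}:{int(minutes):02d}:{seconds:07.4f}"

df = pd.DataFrame({'sec': [122, 3, 5, 7, 1, 0]})

# unit='s' tells pandas the integers are seconds (the default unit is nanoseconds)
df['t'] = pd.to_timedelta(df['sec'], unit='s')

# Assumed extra column holding the formatted strings
df['t_str'] = df['t'].dt.total_seconds().apply(format_hms)
```

Note that the helper does not wrap at 24 hours, so larger values simply show more hours.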
You can edit your code as follows to get the desired result:
df = pd.DataFrame({'sec':[122,3,5,7,1,0]})
df['time'] = pd.to_datetime(df.sec, unit="s").dt.time
Output:
In [10]: df
Out[10]:
sec time
0 122 00:02:02
1 3 00:00:03
2 5 00:00:05
3 7 00:00:07
4 1 00:00:01
5 0 00:00:00
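One thing to keep in mind with the to_datetime approach: the values are treated as offsets from 1970-01-01, so .dt.time silently wraps around after 24 hours, while a timedelta keeps the days. A small sketch with a made-up value of 90000 seconds (25 hours):

```python
import datetime
import pandas as pd

sec = pd.Series([122, 90000])  # 90000 s = 25 hours (illustrative value)

# Time of day only: the extra day is dropped
as_time = pd.to_datetime(sec, unit='s').dt.time

# Duration: the extra day is preserved
as_delta = pd.to_timedelta(sec, unit='s')
```

So .dt.time is probably fine for durations known to stay under a day, and to_timedelta is the safer choice otherwise.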
I've been trying to convert a milliseconds (0, 5000, 10000) column into a new column with the format: 00:00:00 (00:00:05, 00:00:10 etc)
I tried datetime.datetime.fromtimestamp(5000/1000.0) but it didn't give me the format I wanted.
Any help appreciated!
The best option is probably to convert to Timedelta (using pandas.to_timedelta).
That way you get the timedelta properties and arithmetic for free:
s = pd.Series([0, 5000, 10000])
s2 = pd.to_timedelta(s, unit='ms')
output:
0 0 days 00:00:00
1 0 days 00:00:05
2 0 days 00:00:10
dtype: timedelta64[ns]
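As a rough illustration of what the timedelta dtype gives you, the sketch below uses the .dt accessor on the same series. The variable names total, parts and doubled are only for this example:

```python
import pandas as pd

s = pd.Series([0, 5000, 10000])
s2 = pd.to_timedelta(s, unit='ms')

total = s2.dt.total_seconds()   # durations as float seconds
parts = s2.dt.components        # DataFrame with days, hours, minutes, seconds, ...
doubled = s2 * 2                # arithmetic works directly on the durations
```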
If you really want the '00:00:00' format, use instead pandas.to_datetime:
s2 = pd.to_datetime(s, unit='ms').dt.time
output:
0 00:00:00
1 00:00:05
2 00:00:10
dtype: object
optionally with .astype(str) to have strings
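If strings are the end goal, another option (a sketch, not the only way) is to skip the datetime.time objects and format directly with .dt.strftime. The name as_text is just for illustration:

```python
import pandas as pd

s = pd.Series([0, 5000, 10000])

# Format the time-of-day part as text; like .dt.time, this wraps after 24 hours
as_text = pd.to_datetime(s, unit='ms').dt.strftime('%H:%M:%S')
```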
Convert the values to timedeltas with to_timedelta:
df['col'] = pd.to_timedelta(df['col'], unit='ms')
print (df)
col
0 0 days 00:00:00
1 0 days 00:00:05
2 0 days 00:00:10
I have a dataset that has mixed data types in the Date column.
For example, the column looks like this:
ID Date
1 2019-01-01
2 2019-01-02
3 2019-11-01
4 40993
5 40577
6 39949
When I just try to convert the column using pd.to_datetime, I get an error message "mixed datetimes and integers in passed array".
I would really appreciate it if someone could help me out with this! Ideally, it would be nice to have all rows in 'yyyy-mm-dd' format.
Thank you!
I'm guessing those are Excel serial dates? See "Convert Excel style date with pandas".
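import pandas as pd
import xlrd

def read_date(date):
    # Excel serial numbers parse as integers; anything else is treated as a date string
    try:
        return xlrd.xldate.xldate_as_datetime(int(date), 0)
    except (ValueError, TypeError):
        return pd.to_datetime(date)

df['New Date'] = df['Date'].apply(read_date)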
df
Out[1]:
ID Date New Date
0 1 2019-01-01 2019-01-01
1 2 2019-01-02 2019-01-02
2 3 2019-11-01 2019-11-01
3 4 40993 2012-03-25
4 5 40577 2011-02-03
5 6 39949 2009-05-16
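If you'd rather not depend on xlrd, a vectorized variant can be sketched with pandas alone. This assumes the numbers are serials from Excel's default 1900 date system (and later than February 1900), for which 1899-12-30 is the usual origin. The intermediate names serial, from_serial and from_text are made up for the example:

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4, 5, 6],
    'Date': ['2019-01-01', '2019-01-02', '2019-11-01', 40993, 40577, 39949],
})

# Numbers survive, date strings become NaN
serial = pd.to_numeric(df['Date'], errors='coerce')

# Assumption: serials count days from 1899-12-30 (Excel 1900 date system)
from_serial = pd.to_datetime(serial, unit='D', origin='1899-12-30')

# Parse only the rows that were not numeric
from_text = pd.to_datetime(df['Date'].where(serial.isna()), errors='coerce')

df['New Date'] = from_serial.fillna(from_text)
```

For this sample the result matches the xlrd output above, but it is worth spot-checking a few rows against Excel on real data.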
I have a column in my dataframe which I want to convert to a Timestamp. However, it is in a bit of a strange format that I am struggling to manipulate. The column is in the format HHMMSS, but does not include the leading zeros.
For example for a time that should be '00:03:15' the dataframe has '315'. I want to convert the latter to a Timestamp similar to the former. Here is an illustration of the column:
message_time
25
35
114
1421
...
235347
235959
Thanks
Use Series.str.zfill to add the leading zeros and then to_datetime:
s = df['message_time'].astype(str).str.zfill(6)
df['message_time'] = pd.to_datetime(s, format='%H%M%S')
print (df)
message_time
0 1900-01-01 00:00:25
1 1900-01-01 00:00:35
2 1900-01-01 00:01:14
3 1900-01-01 00:14:21
4 1900-01-01 23:53:47
5 1900-01-01 23:59:59
In my opinion it is better here to create timedeltas with to_timedelta:
s = df['message_time'].astype(str).str.zfill(6)
df['message_time'] = pd.to_timedelta(s.str[:2] + ':' + s.str[2:4] + ':' + s.str[4:])
print (df)
message_time
0 00:00:25
1 00:00:35
2 00:01:14
3 00:14:21
4 23:53:47
5 23:59:59
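If the column is numeric to begin with, the same result can probably be reached without any string handling, by splitting the HHMMSS digits with integer arithmetic. This is only a sketch; the names v, total_seconds and td are for illustration:

```python
import pandas as pd

df = pd.DataFrame({'message_time': [25, 35, 114, 1421, 235347, 235959]})

v = df['message_time'].astype(int)

# HHMMSS -> hours, minutes, seconds via integer division and modulo
total_seconds = (v // 10000) * 3600 + (v // 100 % 100) * 60 + (v % 100)

td = pd.to_timedelta(total_seconds, unit='s')
```

Unlike the format='%H%M%S' route, this does not validate the digits, so a value like 99 would quietly become 99 seconds.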
I am trying to remove rows from a dataframe that have a timedelta value of less than some number of seconds.
My dataframe looks something like this:
Start Elapsed time
0 2018-10-29 07:56:20 0 days 00:15:05
1 2018-10-29 07:56:20 0 days 00:15:05
2 2018-10-29 08:11:25 0 days 00:00:02
3 2018-10-29 08:11:27 0 days 00:00:08
4 2018-10-29 08:11:27 0 days 00:00:08
5 2018-10-29 08:11:35 0 days 00:00:02
6 2018-10-29 08:11:37 0 days 00:00:00
I would like to remove all the rows where Elapsed time is less than some number of seconds - let's say 3 for now. So I'd like a dataframe that looks like this (from the above):
Start Elapsed time
0 2018-10-29 07:56:20 0 days 00:15:05
1 2018-10-29 07:56:20 0 days 00:15:05
3 2018-10-29 08:11:27 0 days 00:00:08
4 2018-10-29 08:11:27 0 days 00:00:08
I've tried a number of different things yielding a number of different error messages - usually incompatible type comparison errors. For example:
df_new = df[df['Elapsed time'] > pd.to_timedelta('3 seconds')]
df_new = df[df['Elapsed time'] > datetime.timedelta(seconds=3)]
I'd like to avoid iterating over all of the rows, but if that's what I have to do then I'll do that.
Your help is very appreciated!
Edit: My real problem is that the dtype of my 'Elapsed time' column is object instead of timedelta. A quick fix would be to cast the dtype using the code below, but a better fix would be to ensure that the dtype is not set to the object type in the first place. Thank you all for your help and comments.
df_new = df[pd.to_timedelta(df['Elapsed time']) > pd.to_timedelta('3 seconds')]
Getting data using pd.read_clipboard(sep=r'\s\s+'):
df = pd.read_clipboard(sep=r'\s\s+')
df['Elapsed time'] = pd.to_timedelta(df['Elapsed time'])
You can use:
df[df['Elapsed time'].dt.total_seconds() > 3]
Output:
Start Elapsed time
0 2018-10-29 07:56:20 00:15:05
1 2018-10-29 07:56:20 00:15:05
3 2018-10-29 08:11:27 00:00:08
4 2018-10-29 08:11:27 00:00:08
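Putting the pieces together without the clipboard, a self-contained sketch could look like this. The frame is rebuilt by hand from the question's sample, with 'Elapsed time' deliberately stored as strings to mimic the object dtype problem, and >= is used so that only rows strictly under 3 seconds are dropped:

```python
import pandas as pd

df = pd.DataFrame({
    'Start': pd.to_datetime([
        '2018-10-29 07:56:20', '2018-10-29 07:56:20', '2018-10-29 08:11:25',
        '2018-10-29 08:11:27', '2018-10-29 08:11:27', '2018-10-29 08:11:35',
        '2018-10-29 08:11:37',
    ]),
    # Strings on purpose: this is the situation that caused the comparison errors
    'Elapsed time': [
        '0 days 00:15:05', '0 days 00:15:05', '0 days 00:00:02',
        '0 days 00:00:08', '0 days 00:00:08', '0 days 00:00:02',
        '0 days 00:00:00',
    ],
})

# Fix the dtype once, then compare against a Timedelta
df['Elapsed time'] = pd.to_timedelta(df['Elapsed time'])
df_new = df[df['Elapsed time'] >= pd.Timedelta(seconds=3)]
```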
I have the following dataframe in pandas:
code time
1 003002
1 053003
1 060002
1 073001
1 073003
I want to generate the following dataframe in pandas:
code time new_time
1 003002 00:30:00
1 053003 05:30:00
1 060002 06:00:00
1 073001 07:30:00
1 073003 07:30:00
I am doing it with the following code:
df['new_time'] = pd.to_datetime(df['time'] ,format='%H%M%S').dt.time
How can I round the times down to the minute in pandas?
Use Series.dt.floor:
df['time'] = pd.to_datetime(df['time'], format='%H%M%S').dt.floor('min').dt.time
Or drop the last two characters by slicing, then change the format to %H%M:
df['time'] = pd.to_datetime(df['time'].str[:-2], format='%H%M').dt.time
print (df)
code time
0 1 00:30:00
1 1 05:30:00
2 1 06:00:00
3 1 07:30:00
4 1 07:30:00
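For reference, here is a self-contained version of the floor approach that keeps the original column and writes to new_time as in the question. It uses the 'min' frequency alias and assumes the time column holds zero-padded strings:

```python
import datetime
import pandas as pd

df = pd.DataFrame({
    'code': [1, 1, 1, 1, 1],
    'time': ['003002', '053003', '060002', '073001', '073003'],
})

# Parse HHMMSS, round down to the minute, keep only the time of day
parsed = pd.to_datetime(df['time'], format='%H%M%S')
df['new_time'] = parsed.dt.floor('min').dt.time
```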
An option using astype:
pd.to_datetime(df_oclh.Time).astype('datetime64[m]').dt.time
'datetime64[m]' is the dtype we convert to: a datetime truncated to minute resolution. Alternatively you could use [s] for seconds (dropping milliseconds) or [h] for hours (dropping minutes, seconds and milliseconds). As far as I know, recent pandas versions (2.0 and later) only accept the s, ms, us and ns units in astype, so there .dt.floor is the safer choice.