Converting Milliseconds into Clocktime Format - python

I've been trying to convert a milliseconds (0, 5000, 10000) column into a new column with the format: 00:00:00 (00:00:05, 00:00:10 etc)
I tried datetime.datetime.fromtimestamp(5000/1000.0) but it didn't give me the format I wanted.
Any help appreciated!

The best option is probably to convert the column to timedeltas with pandas.to_timedelta.
That way you keep the benefits of the timedelta dtype (arithmetic, comparisons, the .dt accessor):
s = pd.Series([0, 5000, 10000])
s2 = pd.to_timedelta(s, unit='ms')
output:
0 0 days 00:00:00
1 0 days 00:00:05
2 0 days 00:00:10
dtype: timedelta64[ns]
If you really want the '00:00:00' format, use pandas.to_datetime instead and keep only the time part:
s2 = pd.to_datetime(s, unit='ms').dt.time
output:
0 00:00:00
1 00:00:05
2 00:00:10
dtype: object
Optionally append .astype(str) to get strings instead of datetime.time objects.
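One caveat with the to_datetime(...).dt.time route is that it wraps around after 24 hours, because it only keeps the time-of-day part of a timestamp. The sketch below illustrates this and shows a small formatter that does not wrap. The helper name ms_to_clock and the extra 25-hour sample value are illustrative assumptions, not part of the original question.

```python
import pandas as pd

# Sample data from the question plus one assumed value of 25 hours in ms.
s = pd.Series([0, 5000, 10000, 90_000_000])

# Time-of-day strings: the 25-hour value wraps around to 01:00:00.
as_time = pd.to_datetime(s, unit="ms").dt.time.astype(str)

def ms_to_clock(ms):
    """Format milliseconds as HH:MM:SS without wrapping at 24 hours (hypothetical helper)."""
    total_seconds = int(ms) // 1000
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

as_clock = s.map(ms_to_clock)

print(as_time.tolist())   # ['00:00:00', '00:00:05', '00:00:10', '01:00:00']
print(as_clock.tolist())  # ['00:00:00', '00:00:05', '00:00:10', '25:00:00']
```

If the durations never reach 24 hours, both approaches give the same strings.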

Convert the values to timedeltas with to_timedelta:
df['col'] = pd.to_timedelta(df['col'], unit='ms')
print (df)
col
0 0 days 00:00:00
1 0 days 00:00:05
2 0 days 00:00:10

Related

percentage difference of datetime object

I want to create a new column which contains the values of column diff(s) but in percentage.
Finish Time diff (s)
0 1900-01-01 00:42:43.500 0 days 00:00:00
1 1900-01-01 00:44:01.200 0 days 00:01:17
2 1900-01-01 00:44:06.500 0 days 00:01:23
3 1900-01-01 00:44:29.500 0 days 00:01:46
4 1900-01-01 00:44:47.500 0 days 00:02:04
to further understand the data:
df["diff(s)"] = df["Finish Time"] - min(df["Finish Time"])
Finish Time datetime64[ns]
diff (s) timedelta64[ns]
dtype: object
df["diff(%)"] = ((df["Finish Time"]/min(df["Finish Time"]))*100)
-> results in this error
TypeError: cannot perform __truediv__ with this index type: DatetimeArray
It depends on how the percentages are defined. If you need each difference as a share of the summed timedeltas:
df["diff(s)"] = df["Finish Time"] - df["Finish Time"].min()
df["diff(%)"] = (df["diff(s)"] / df["diff(s)"].sum()) * 100
print (df)
Finish Time diff(s) diff(%)
0 1900-01-01 00:42:43.500 0 days 00:00:00 0.000000
1 1900-01-01 00:44:01.200 0 days 00:01:17.700000 19.887382
2 1900-01-01 00:44:06.500 0 days 00:01:23 21.243921
3 1900-01-01 00:44:29.500 0 days 00:01:46 27.130791
4 1900-01-01 00:44:47.500 0 days 00:02:04 31.737906
Or using Series.pct_change:
df["diff(%)"] = df["diff(s)"].pct_change() * 100
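The original error arises because two datetimes cannot be divided, whereas dividing one timedelta by another yields a plain float. As a further sketch, if the percentage is meant as "how far behind the fastest finisher", one possible reading is to treat each Finish Time as elapsed time since midnight. That interpretation, and the column name behind(%), are assumptions made for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "Finish Time": pd.to_datetime([
        "1900-01-01 00:42:43.500",
        "1900-01-01 00:44:01.200",
        "1900-01-01 00:44:06.500",
        "1900-01-01 00:44:29.500",
        "1900-01-01 00:44:47.500",
    ])
})

# Assumption: the clock started at midnight, so elapsed time is the
# timestamp minus the start of its day.
elapsed = df["Finish Time"] - df["Finish Time"].dt.normalize()

# timedelta / timedelta gives floats, so the percentage is well defined.
gap = elapsed - elapsed.min()
df["behind(%)"] = gap / elapsed.min() * 100

print(df)
```

Here the first row is 0 by construction, and every other row states the gap to the winner as a percentage of the winner's elapsed time.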

Time calculations, mean , median, mode

Name    Gun_time  Net_time  Pace
John    28:48:00  28:47:00  4:38:00
George  29:11:00  29:10:00  4:42:00
Mike    29:38:00  29:37:00  4:46:00
Sarah   29:46:00  29:46:00  4:48:00
Roy     30:31:00  30:30:00  4:55:00
Q1. How can I add another column stating difference between Gun_time and Net_time?
Q2. How will I calculate the mean for Gun_time and Net_time. Please help!
I have tried doing the following but it doesn't work
df['Difference'] = df['Gun_time'] - df['Net_time']
For the mean value I tried df['Gun_time'].mean
but it doesn't work either, please help!
Q3. What if the times are in 28:48 (minutes and seconds) format rather than 28:48:00? In that case the function raises a ValueError:
ValueError: expected hh:mm:ss format
Convert your columns to timedelta dtype, for example:
for col in ("Gun_time", "Net_time", "Pace"):
    df[col] = pd.to_timedelta(df[col])
Now you can do calculations like
df['Gun_time'].mean()
# Timedelta('1 days 05:34:48')
or
df['Difference'] = df['Gun_time'] - df['Net_time']
#df['Difference']
# 0 0 days 00:01:00
# 1 0 days 00:01:00
# 2 0 days 00:01:00
# 3 0 days 00:00:00
# 4 0 days 00:01:00
# Name: Difference, dtype: timedelta64[ns]
If you need nicer string output, you can use
def timedeltaToString(td):
    # split the total seconds into hours, minutes and seconds
    hours, remainder = divmod(td.total_seconds(), 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{int(hours):02d}:{int(minutes):02d}:{int(seconds):02d}"
df['diffString'] = df['Difference'].apply(timedeltaToString)
# df['diffString']
# 0 00:01:00
# 1 00:01:00
# 2 00:01:00
# 3 00:00:00
# 4 00:01:00
#Name: diffString, dtype: object
See also Format timedelta to string.
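For Q3, a minimal sketch of one way to handle mm:ss strings is to prepend "00:" so that pandas sees a full hh:mm:ss value. It assumes the values really are minutes and seconds; the sample values below are the table's times re-read in that shorter format.

```python
import pandas as pd

# Assumed input: the same runners, with times written as minutes:seconds.
df = pd.DataFrame({
    "Name": ["John", "George", "Mike", "Sarah", "Roy"],
    "Gun_time": ["28:48", "29:11", "29:38", "29:46", "30:31"],
    "Net_time": ["28:47", "29:10", "29:37", "29:46", "30:30"],
})

# Prefix a zero hour field so the strings parse as hh:mm:ss.
for col in ("Gun_time", "Net_time"):
    df[col] = pd.to_timedelta("00:" + df[col])

df["Difference"] = df["Gun_time"] - df["Net_time"]
mean_gun = df["Gun_time"].mean()      # roughly 29 minutes 35 seconds
median_gun = df["Gun_time"].median()  # the middle value, 29 minutes 38 seconds

print(df)
print(mean_gun, median_gun)
```

Note that mean and median are methods, so they need the trailing parentheses; df['Gun_time'].mean without them only returns the bound method.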

Convert date column formatted as xx:xx.x

I have come across a CSV file that contains a date column formatted as xx:xx.x. Here are a few of the values in the column marked as date:
07:33.0
34:53.0
06:30.0
30:09.0
02:18.0
My question is what type of formatting is this? And how can I convert it to a proper date format using Python?
These look like times without the hours part (minutes:seconds.fraction).
You can create timedeltas by prepending a zero hour field and calling to_timedelta:
df['col'] = pd.to_timedelta('00:' + df['col'])
print (df)
col
0 0 days 00:07:33
1 0 days 00:34:53
2 0 days 00:06:30
3 0 days 00:30:09
4 0 days 00:02:18
Or convert to datetimes with to_datetime; note that a default date (1900-01-01) is added:
df['col'] = pd.to_datetime(df['col'], format='%M:%S.%f')
print (df)
col
0 1900-01-01 00:07:33
1 1900-01-01 00:34:53
2 1900-01-01 00:06:30
3 1900-01-01 00:30:09
4 1900-01-01 00:02:18
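If the goal is to do arithmetic on these values, it may be convenient to turn the timedeltas into plain seconds. The following is a small sketch under the assumption that the column holds minutes:seconds.fraction strings; the column name seconds is illustrative.

```python
import pandas as pd

df = pd.DataFrame({"col": ["07:33.0", "34:53.0", "06:30.0", "30:09.0", "02:18.0"]})

# Prefix a zero hour field, parse as timedelta, then express as float seconds.
df["td"] = pd.to_timedelta("00:" + df["col"])
df["seconds"] = df["td"].dt.total_seconds()

print(df["seconds"].tolist())  # [453.0, 2093.0, 390.0, 1809.0, 138.0]
```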

how to convert time in unorthodox format to timestamp in pandas dataframe

I have a column in my dataframe which I want to convert to a Timestamp. However, it is in a bit of a strange format that I am struggling to manipulate. The column is in the format HHMMSS, but does not include the leading zeros.
For example for a time that should be '00:03:15' the dataframe has '315'. I want to convert the latter to a Timestamp similar to the former. Here is an illustration of the column:
message_time
25
35
114
1421
...
235347
235959
Thanks
Use Series.str.zfill to add the leading zeros, then parse with to_datetime:
s = df['message_time'].astype(str).str.zfill(6)
df['message_time'] = pd.to_datetime(s, format='%H%M%S')
print (df)
message_time
0 1900-01-01 00:00:25
1 1900-01-01 00:00:35
2 1900-01-01 00:01:14
3 1900-01-01 00:14:21
4 1900-01-01 23:53:47
5 1900-01-01 23:59:59
In my opinion it is better here to create timedeltas with to_timedelta:
s = df['message_time'].astype(str).str.zfill(6)
df['message_time'] = pd.to_timedelta(s.str[:2] + ':' + s.str[2:4] + ':' + s.str[4:])
print (df)
message_time
0 00:00:25
1 00:00:35
2 00:01:14
3 00:14:21
4 23:53:47
5 23:59:59
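As an alternative sketch that avoids string handling, the HHMMSS digits can be split with integer arithmetic. This assumes the column has an integer dtype and that every value is a valid HHMMSS time; the variable names are illustrative.

```python
import pandas as pd

raw = pd.Series([25, 35, 114, 1421, 235347, 235959], name="message_time")

# Peel off the digit pairs: HH is everything above the last four digits,
# MM is the middle pair, SS is the last pair.
hours = raw // 10000
minutes = raw // 100 % 100
seconds = raw % 100

as_delta = pd.to_timedelta(hours * 3600 + minutes * 60 + seconds, unit="s")
print(as_delta)
```

For example, 114 splits into 0 hours, 1 minute and 14 seconds, and 235959 into 23 hours, 59 minutes and 59 seconds.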

Pandas: converting amount of seconds into timedeltas or times

I have an amount of seconds in a dataframe, let's say:
s = 122
I want to convert it to the following format:
00:02:02.0000
To do that I try using to_datetime the following way:
pd.to_datetime(s, format='%H:%M:%S.%f')
However this doesn't work:
ValueError: time data 122 does not match format '%H:%M:%S.%f' (match)
I also tried using unit='ms' instead of format, but then I get the date before the time.
How can I modify my code to get the desired conversion?
It needs to be done in the dataframe using pandas if possible.
EDIT: both jezrael's and MedAli's solutions below are valid; however, jezrael's solution has the advantage of working not only with integers but also with datetime.time as input!
Use to_timedelta after converting the seconds to nanoseconds:
df = pd.DataFrame({'sec':[122,3,5,7,1,0]})
df['t'] = pd.to_timedelta(df['sec'] * 10**9)
print (df)
sec t
0 122 00:02:02
1 3 00:00:03
2 5 00:00:05
3 7 00:00:07
4 1 00:00:01
5 0 00:00:00
You can edit your code as follows to get the desired result:
df = pd.DataFrame({'sec':[122,3,5,7,1,0]})
df['time'] = pd.to_datetime(df.sec, unit="s").dt.time
Output:
In [10]: df
Out[10]:
sec time
0 122 00:02:02
1 3 00:00:03
2 5 00:00:05
3 7 00:00:07
4 1 00:00:01
5 0 00:00:00
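As a variation on the answers above, to_timedelta also accepts unit="s", which avoids the manual multiplication by 10**9. The sketch below additionally builds an HH:MM:SS string from the .dt.components accessor; the column name clock is an illustrative assumption, and the formatting ignores the days component, so it is only meant for durations under 24 hours.

```python
import pandas as pd

df = pd.DataFrame({"sec": [122, 3, 5, 7, 1, 0]})

# Interpret the integers directly as seconds.
df["t"] = pd.to_timedelta(df["sec"], unit="s")

# .dt.components exposes days, hours, minutes, seconds, ... as integer columns.
parts = df["t"].dt.components
df["clock"] = (
    parts["hours"].astype(str).str.zfill(2) + ":"
    + parts["minutes"].astype(str).str.zfill(2) + ":"
    + parts["seconds"].astype(str).str.zfill(2)
)

print(df["clock"].tolist())
# ['00:02:02', '00:00:03', '00:00:05', '00:00:07', '00:00:01', '00:00:00']
```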
