I'm trying to convert my time data, which is currently in the form 15:41:28:4330 (hh:mm:ss:msmsmsms), to seconds.
I browsed through some of the pandas documentation but can't seem to find this format anywhere.
Would it be possible to simply calculate the seconds from that time format row by row?
You'll want to obtain a timedelta and call its total_seconds method to get the seconds after midnight. To do that, you can parse to datetime first and then subtract the default date (1900-01-01, which is added automatically when the format contains no date). Example:
#1 - via datetime
import pandas as pd
df = pd.DataFrame({'time': ["15:41:28:4330"]})
df['time'] = pd.to_datetime(df['time'], format='%H:%M:%S:%f')
df['sec_after_mdnt'] = (df['time']-df['time'].dt.floor('d')).dt.total_seconds()
df
time sec_after_mdnt
0 1900-01-01 15:41:28.433 56488.433
Alternatively, you can clean your time format and parse directly to timedelta:
#2 - str cleaning & to timedelta
df = pd.DataFrame({'time': ["15:41:28:4330"]})
# last separator must be a dot...
df['time'] = df['time'].str[::-1].str.replace(':', '.', n=1, regex=False).str[::-1]
df['sec_after_mdnt'] = pd.to_timedelta(df['time']).dt.total_seconds()
df
time sec_after_mdnt
0 15:41:28.4330 56488.433
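To address the "row by row" part of the question directly, here is a minimal sketch in plain Python. The helper name hms_to_seconds is hypothetical, and it assumes (as both approaches above do) that the last field is the decimal fraction of the second, so "4330" means 0.4330 s.

```python
def hms_to_seconds(text):
    """Convert 'hh:mm:ss:ffff' to seconds after midnight (hypothetical helper)."""
    hours, minutes, seconds, fraction = text.split(':')
    # Assumption: the last field is a decimal fraction, e.g. "4330" -> 0.4330
    frac = int(fraction) / 10 ** len(fraction)
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + frac


print(hms_to_seconds("15:41:28:4330"))
```

On a DataFrame this could be applied with df['time'].map(hms_to_seconds), although the vectorized pandas approaches above are likely to be faster on large columns.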
Related
I am pulling some financial data from an API that stores the time values as (I think) UTC timestamps in milliseconds, for example 1645804609719.
I cannot seem to convert the entire column into a usable date. I can do it for a single value using the following code, so I know the conversion works, but I have thousands of rows with this problem and thought pandas would offer an easier way to update all the values.
from datetime import datetime
tx = int('1645804609719')/1000
print(datetime.utcfromtimestamp(tx).strftime('%Y-%m-%d %H:%M:%S'))
Any help would be greatly appreciated.
Simply use pandas.DataFrame.apply:
df['date'] = df.date.apply(lambda x: datetime.utcfromtimestamp(int(x)/1000).strftime('%Y-%m-%d %H:%M:%S'))
Another way to do it is by using pd.to_datetime as recommended by Panagiotos in the comments:
df['date'] = pd.to_datetime(df['date'],unit='ms')
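A minimal sketch of the pd.to_datetime route, including the string formatting the question asked for. The sample values and the date_str column name are assumptions for illustration. The strings are passed through pd.to_numeric first so that unit='ms' is applied to numbers, and utc=True makes the UTC interpretation explicit.

```python
import pandas as pd

# Sample epoch-millisecond values stored as strings (illustrative data)
df = pd.DataFrame({'date': ['1584199972000', '1645804609719']})

# Convert the whole column at once; no Python-level loop is needed
df['date'] = pd.to_datetime(pd.to_numeric(df['date']), unit='ms', utc=True)

# Optional: render as text in the format used in the question
df['date_str'] = df['date'].dt.strftime('%Y-%m-%d %H:%M:%S')
print(df)
```

Keeping the datetime column and formatting only for display tends to be more convenient than storing strings, since the datetime dtype still supports sorting and arithmetic.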
You can use to_numeric to convert the column to integers, div to divide it by 1000, and finally a loop over the DataFrame column with datetime to get the format you want.
import pandas as pd
from datetime import datetime
df = pd.DataFrame({'date': ['1584199972000', '1645804609719'], 'values': [30,40]})
# milliseconds -> seconds; object dtype so the formatted strings can be written back row by row
df['date'] = pd.to_numeric(df['date']).div(1000).astype(object)
for i in range(len(df)):
    df.iloc[i,0] = datetime.utcfromtimestamp(df.iloc[i,0]).strftime('%Y-%m-%d %H:%M:%S')
print(df)
Output:
date values
0 2020-03-14 15:32:52 30
1 2022-02-25 15:56:49 40
I am trying to convert a column containing only hours, minutes and seconds into a datetime form using pandas.to_datetime(). However, it adds the year and date automatically. I also tried using
pandas.to_datetime(df["time"], format="%H:%M:%S").dt.time, but then the data type remains object.
Is there any method that converts to a datetime format without the year and date?
Something like this?
df['Time'] = pd.to_datetime(df['Time'], format='%H:%M:%S', errors='ignore')
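Put .dt.time on the end. Note that errors='ignore' is deprecated in recent pandas releases, and .dt.time only works on a column that parsed successfully, so the argument is omitted here:
df['Time'] = pd.to_datetime(df['Time'], format='%H:%M:%S').dt.time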
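The question notes that the dtype stays object after .dt.time. A minimal sketch of why, using made-up sample values and the illustrative column names as_time and as_delta: .dt.time yields plain Python datetime.time objects, which pandas can only hold in an object column, whereas to_timedelta gives a native pandas type that still carries no date.

```python
import datetime
import pandas as pd

df = pd.DataFrame({'Time': ['02:21:18', '15:41:28']})  # illustrative data

parsed = pd.to_datetime(df['Time'], format='%H:%M:%S')

# Python datetime.time objects: no date part, but stored with dtype object
df['as_time'] = parsed.dt.time

# Timedelta since midnight: no date part and a native pandas dtype
df['as_delta'] = pd.to_timedelta(df['Time'])

print(df.dtypes)
print(type(df['as_time'].iloc[0]))
```

Which one fits depends on what happens next: time objects are fine for display or comparison, while timedeltas are the better choice if arithmetic is needed.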
I have a dataframe with a time column stored as strings, and I need to convert it to a timestamp containing only h:m:sec.ms. Here is an example:
import pandas as pd
df=pd.DataFrame({'time': ['02:21:18.110']})
df.time= pd.to_datetime(df.time , format="%H:%M:%S.%f")
df # I get 1900-01-01 02:21:18.110
Without the format flag, I get the current day, 2020-12-16. How can I get the timestamp without the year-month-day, which always seems to be included? Thanks!
If you need to process the values later with datetime-like methods, it is better to convert them to timedeltas with to_timedelta rather than to times:
df['time'] = pd.to_timedelta(df['time'])
print (df)
time
0 0 days 02:21:18.110000
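As a rough illustration of the later processing that timedeltas allow, the sketch below converts to seconds and computes the gap between consecutive rows. The second sample value and the column names seconds and gap are assumptions added for the example.

```python
import pandas as pd

df = pd.DataFrame({'time': ['02:21:18.110', '02:21:19.610']})  # illustrative data
df['time'] = pd.to_timedelta(df['time'])

# Seconds since midnight as a float column
df['seconds'] = df['time'].dt.total_seconds()

# Difference between consecutive rows, still a timedelta (first row is NaT)
df['gap'] = df['time'].diff()

print(df)
```

Neither operation is available on a column of datetime.time objects without converting it first.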
You need this:
df=pd.DataFrame({'time': ['02:21:18.110']})
df['time'] = pd.to_datetime(df['time']).dt.time
In [1023]: df
Out[1023]:
time
0 02:21:18.110000
I'm working with a dataset that lists the amount of time that has elapsed since an entry was made in our ERP. It is given in seconds elapsed (some entries are days old, though). I want to create a new column in my table (in format MM:DD:YY HH:SS) that shows the date/time an entry was made by subtracting the number of seconds that have elapsed from the current time. The data type of the 'Time' column is 'M8[ns]'.
I looked through various solutions for different pieces (such as converting the entry in my dataframe to be read as seconds) but I'm having some difficulty with my code. Here's what I've tried:
import pandas as pd
import time
import datetime
df = pd.read_excel (r'File1.xlsx', sheet_name = 'Sheet1')
df['Time'] = df['Time'].astype('float64')
df['Time'] = pd.to_datetime(df['Time'], unit = 's')
Date Created = (datetime.now () - df['Time'])
df['Date Created'] = Date Created
Any insight would be much appreciated
Update:
I made some progress
import pandas as pd
import time
from datetime import datetime
df = pd.read_excel (r'File1.xlsx', sheet_name = 'Sheet1')
df['Time'] = pd.to_datetime(df['Time'], unit = 's')
df['Date Created'] = datetime.now() - df['Time']
The Time column is now stored as an int64. Using the new code, ['Time'] is showing: 1970-01-01 00:02:57
However, only 2:57 has elapsed since the entry was created. There are some values that were created days ago. I would like to be able to subtract that 2:57 from the current time to find the date-time the entry was made.
Thank you for the support.
Update 2:
One of my engineers and I were able to figure out a way of determining the date an entry was created.
# Interpret the elapsed seconds as a duration rather than a point in time
timer = pd.to_timedelta(df['Time'], unit='s')
now = pd.to_datetime(datetime.now())
# Creation time = current time minus the elapsed duration
date_created = now - timer
print('Date Created:\n', date_created)
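The reason ['Time'] shows 1970-01-01 00:02:57 is that Unix time starts at 1970-01-01 00:00 (the epoch, which corresponds to zero), so to_datetime with unit='s' treats the 177 elapsed seconds as an offset from that date.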
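The same idea as a self-contained sketch: treat the column as a duration and subtract it from the current time. The sample values, the fixed now timestamp (used only so the result is reproducible), the 'Date Created (text)' column name, and the reading of the requested format as month/day/year hour:minute are all assumptions.

```python
import pandas as pd

# Elapsed seconds since each entry was made (illustrative: 2 min 57 s, and 2 days)
df = pd.DataFrame({'Time': [177, 2 * 86400]})

# Fixed reference time for reproducibility; in practice use pd.Timestamp.now()
now = pd.Timestamp('2024-01-01 12:00:00')

# Duration, not a point in time, so nothing is anchored to 1970-01-01
df['Date Created'] = now - pd.to_timedelta(df['Time'], unit='s')

# Assumed display format: month/day/two-digit year, hour:minute
df['Date Created (text)'] = df['Date Created'].dt.strftime('%m/%d/%y %H:%M')
print(df)
```

Using to_timedelta instead of to_datetime is the key choice here, because elapsed seconds describe a length of time rather than a calendar date.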
I believe your approach will work if you convert both times to integer seconds, subtract, and then convert the result back to a time.
elapsed_seconds = df['Time'].astype('int64')  # assumes 'Time' holds the elapsed seconds as numbers
now_seconds = int(datetime.now().timestamp())  # current time as seconds since the Unix epoch
df['Date Created'] = pd.to_datetime(now_seconds - elapsed_seconds, unit='s')  # result is in UTC
Depending on how your spreadsheet stores the values, you may need to adjust how df['Time'] is turned into seconds, but I think the above should work.
I hope this helps!
I have a column of timestamps that I would like to convert to datetime in my pandas dataframe. The format of the dates is %Y-%m-%d-%H-%M-%S which pd.to_datetime does not recognize. I have manually entered the format as below:
df['TIME'] = pd.to_datetime(df['TIME'], format = '%Y-%m-%d-%H-%M-%S')
My problem is some of the times do not have seconds so they are shorter
(format = %Y-%m-%d-%H-%M).
How can I get all of these strings to datetimes?
I was thinking I could add zero seconds (-0) to the end of my shorter dates but I don't know how to do that.
Try normalizing the strings with strftime first. If pandas can't recognize your custom datetime format, you should provide it explicitly:
import pandas as pd
from functools import partial
df1 = pd.DataFrame({'Date': ['2018-07-02-06-05-23','2018-07-02-06-05']})
newdatetime_fmt = partial(pd.to_datetime, format='%Y-%m-%d-%H-%M-%S')
df1['Clean_Date'] = (df1.Date.str.replace('-','').apply(lambda x: pd.to_datetime(x).strftime('%Y-%m-%d-%H-%M-%S'))
.apply(newdatetime_fmt))
print(df1,df1.dtypes)
output:
Date Clean_Date
0 2018-07-02-06-05-23 2018-07-02 06:05:23
1 2018-07-02-06-05 2018-07-02 06:05:00
Date object
Clean_Date datetime64[ns]
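The idea raised in the question, appending zero seconds to the shorter strings, can also be done directly. This is a minimal sketch under the assumption that a complete timestamp always contains exactly five dashes; the names missing_seconds and padded are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'Date': ['2018-07-02-06-05-23', '2018-07-02-06-05']})

# A full timestamp has five dashes; fewer means the seconds field is missing
missing_seconds = df1['Date'].str.count('-') < 5

# Append "-00" only where seconds are missing, leave other rows unchanged
padded = df1['Date'].where(~missing_seconds, df1['Date'] + '-00')

# Every string now matches a single explicit format
df1['Clean_Date'] = pd.to_datetime(padded, format='%Y-%m-%d-%H-%M-%S')
print(df1)
```

This avoids relying on automatic format inference and keeps the parse to one vectorized to_datetime call.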