This question already has answers here:
Pandas - convert strings to time without date
(3 answers)
Closed 1 year ago.
I have a Stop_Time column with values like 05:38 (MM:SS), but it shows up with dtype object. Is there a way to convert it to a time?
I tried:
perf_dfExtended['Stop_Time'] = pd.to_datetime(perf_dfExtended['Stop_Time'], format='%M:%S')
but that adds a date to the output: 1900-01-01 00:05:38
I guess what you're looking for is pd.to_timedelta (https://pandas.pydata.org/docs/reference/api/pandas.to_timedelta.html). pd.to_datetime will of course always try to create a full date.
Keep in mind, though, that pd.to_timedelta could raise a ValueError for your column, because it expects hh:mm:ss format. You can prepend '00:' to each value with apply (I assume the values are strings) and then convert the column to timedelta. It could look something like:
pd.to_timedelta(perf_dfExtended['Stop_Time'].apply(lambda x: f'00:{x}'))
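To make the idea above concrete, here is a minimal sketch. The sample values and the variable names stop_time and durations are made up for illustration; they stand in for the real perf_dfExtended['Stop_Time'] column, which is assumed to hold MM:SS strings.

```python
import pandas as pd

# Hypothetical sample data standing in for perf_dfExtended['Stop_Time']
stop_time = pd.Series(["05:38", "10:17", "23:45"], name="Stop_Time")

# Prepend an hours field so every value has the hh:mm:ss shape,
# then convert the strings to timedeltas
durations = pd.to_timedelta("00:" + stop_time)

print(durations)
# Timedeltas support arithmetic, e.g. the length of each value in seconds
print(durations.dt.total_seconds().tolist())
```

Plain string concatenation does the same job as the apply call here, and the resulting timedeltas can be summed or averaged, which plain time objects cannot.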
This may work for you:
perf_dfExtended['Stop_Time'] = \
pd.to_datetime(perf_dfExtended['Stop_Time'], format='%M:%S').dt.time
Output (with some additional examples)
0 00:05:38
1 00:10:17
2 00:23:45
This question already has an answer here:
Pandas read_excel: parsing Excel datetime field correctly [duplicate]
(1 answer)
Closed 1 year ago.
I'm trying to convert an entire column containing a 5 digit date code (EX: 43390, 43599) to a normal date format. This is just to make data analysis easier, it doesn't matter which way it's formatted. In a series, the DATE column looks like this:
1 43390
2 43599
3 43605
4 43329
5 43330
...
264832 43533
264833 43325
264834 43410
264835 43461
264836 43365
I don't understand previous submissions with this question, and when I tried code such as
date_col = df.iloc[:,0]
print((datetime.utcfromtimestamp(0) + timedelta(date_col)).strftime("%Y-%m-%d"))
I get this error
unsupported type for timedelta days component: Series
Thanks, sorry if this is a basic question.
df.iloc[:, 0] selects the entire first column, so date_col is a Series rather than a single value. If you want the value in the first row of that column, for example, use date_col = df.iloc[0, 0]. This will return a single value.
timedelta takes a single number for its days argument, not a Series.
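To convert the whole column at once instead of one value at a time, something like the sketch below may help. It assumes the 5-digit codes are Excel serial day numbers (as the linked duplicate suggests), which count days from 1899-12-30; that origin is an assumption about where the data came from, and the three sample values are copied from the question.

```python
import pandas as pd

# Sample values from the question; assumed to be Excel serial day numbers
date_col = pd.Series([43390, 43599, 43605], name="DATE")

# Interpret each number as a count of days since the Excel epoch
dates = pd.to_datetime(date_col, unit="D", origin="1899-12-30")

print(dates.dt.strftime("%Y-%m-%d").tolist())
# ['2018-10-17', '2019-05-14', '2019-05-20']
```

If the codes were instead days since the Unix epoch, as the question's own attempt implies, dropping the origin argument would use 1970-01-01, but the resulting dates would fall in the year 2088, so the Excel interpretation looks more plausible.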
This question already has answers here:
Pandas - convert strings to time without date
(3 answers)
Closed 1 year ago.
I have a CSV file with date and job runtime values, and I need to convert the runtime column from object to a time format.
Time value will be as follows:
00:04:23
00:04:25
pd.to_datetime(df['Time'], format='%H:%M:%S')
This returns the value with default date
1900-01-01 00:04:23
1900-01-01 00:04:25
How do I keep only the runtime in the column, as a time data type without the date?
We can use the to_timedelta function in pandas.
df['Time'] = pd.to_timedelta(df['Time'])
This gives the column a dtype of timedelta64[ns].
You can use the pandas.Series.dt.time accessor (a property, so no parentheses) to get the time part.
print(pd.to_datetime(df['Time'], format='%H:%M:%S').dt.time)
'''
0 00:04:23
1 00:04:25
Name: Time, dtype: object
'''
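The two answers above produce different kinds of values, which matters for later calculations. The following is a rough sketch of the difference; the variable names as_delta, as_time and total are illustrative only.

```python
import datetime

import pandas as pd

df = pd.DataFrame({"Time": ["00:04:23", "00:04:25"]})

# Option 1: durations, which support arithmetic such as sums and means
as_delta = pd.to_timedelta(df["Time"])
total = as_delta.sum()

# Option 2: datetime.time objects, stored in an object-dtype column
as_time = pd.to_datetime(df["Time"], format="%H:%M:%S").dt.time

print(total)            # 0 days 00:08:48
print(as_time.iloc[0])  # 00:04:23
print(as_time.dtype)    # object
```

For job runtimes, the timedelta form is probably the more convenient one, since the values are durations rather than times of day.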
This question already has answers here:
Calculate Time Difference Between Two Pandas Columns in Hours and Minutes
(4 answers)
calculate the time difference between two consecutive rows in pandas
(2 answers)
Closed 2 years ago.
I have a dataset like this:
data = pd.DataFrame({'order_date_time':['2017-09-13 08:59:02', '2017-06-28 11:52:20', '2018-05-18 10:25:53', '2017-08-01 18:38:42', '2017-08-10 21:48:40','2017-07-27 15:11:51',
'2018-03-18 21:00:44','2017-08-05 16:59:05', '2017-08-05 16:59:05','2017-06-05 12:22:19'],
'delivery_date_time':['2017-09-20 23:43:48', '2017-07-13 20:39:29','2018-06-04 18:34:26','2017-08-09 21:26:33','2017-08-24 20:04:21','2017-08-31 20:19:52',
'2018-03-28 21:57:44','2017-08-14 18:13:03','2017-08-14 18:13:03','2017-06-26 13:52:03']})
I want to calculate the time difference between these two columns as a number of days and add it to the table as a delivery delay column. The calculation needs to take both the date and the time into account.
For example, if the difference is 7 days 14:44:46, we can round this to 7 days.
from datetime import datetime
datetime.strptime(date_string, format)
You could use this to convert each string to a datetime object, store it in a variable, and then subtract one from the other.
See https://www.journaldev.com/23365/python-string-to-datetime-strptime/
Python's datetime library is good to work with individual timestamps. If you have your data in a pandas DataFrame as in your case, however, you should use pandas datetime functionality.
To convert a column with timestamps from strings to proper datetime format you can use pandas.to_datetime():
data['order_date_time'] = pd.to_datetime(data['order_date_time'], format="%Y-%m-%d %H:%M:%S")
data['delivery_date_time'] = pd.to_datetime(data['delivery_date_time'], format="%Y-%m-%d %H:%M:%S")
The format argument is optional, but I think it is a good idea to always use it to make sure your datetime format is not "interpreted" incorrectly. It also makes the process much faster on large data-sets.
Once you have the columns in a datetime format you can simply calculate the timedelta between them:
data['delay'] = data['delivery_date_time'] - data['order_date_time']
And then finally, if you want to round this timedelta, pandas again has the right method for this:
data['approx_delay'] = data['delay'].dt.round('D')
where the extra dt gives access to datetime-specific methods, and round takes a frequency as its argument, here set to one day with 'D'. Note that round goes to the nearest day, so 7 days 14:44:46 becomes 8 days; to drop the time part and get 7 days, as in the question, use dt.floor('D') instead.
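As a quick check of how rounding behaves on the question's own example, here is a small sketch using only the first row of the data. The names fmt, delay, nearest, truncated and whole_days are illustrative, not part of the original answer.

```python
import pandas as pd

# First row of the question's data
data = pd.DataFrame({
    "order_date_time": ["2017-09-13 08:59:02"],
    "delivery_date_time": ["2017-09-20 23:43:48"],
})

fmt = "%Y-%m-%d %H:%M:%S"
delay = (pd.to_datetime(data["delivery_date_time"], format=fmt)
         - pd.to_datetime(data["order_date_time"], format=fmt))

nearest = delay.dt.round("D")    # nearest whole day
truncated = delay.dt.floor("D")  # time part dropped
whole_days = delay.dt.days       # plain integer day count

print(delay.iloc[0])      # 7 days 14:44:46
print(nearest.iloc[0])    # 8 days 00:00:00
print(truncated.iloc[0])  # 7 days 00:00:00
print(whole_days.iloc[0])  # 7
```

If the delay column is only needed as a number, dt.days is likely the simplest choice, since it yields integers rather than timedeltas.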
This question already has answers here:
computing the mean for python datetime
(5 answers)
Get the average date from multiple dates - pandas
(2 answers)
Closed 2 years ago.
New to python, hence I hope you don't mind my simple questions...
I'm new to datetime functions, and I'm working on a time series data at the moment.
Below is a sample dataframe for my purpose.
My objective is to groupby my messages according to the 'group' column and take a mean of the datetime. This would be helpful for my data visualisation.
df = pd.DataFrame([['2018-04-12 11:20:57','Hello everyone',1],['2018-04-12 11:20:57','Hello everyone',1],
['2018-04-12 11:19:34','second msg',1],['2018-04-13 11:00:57','Random',1],
['2018-04-13 11:49:34','3rd msg',2],
['2018-04-13 11:29:57','Msg',2]],columns=['datetime','msg','group'])
The code below does not work.
chat_1.groupby('group')['datetime'].mean()
DataError: No numeric types to aggregate
Wondering if there's any way to get around this? Thank you.
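You can do it like this:
import numpy as np
df['datetime'] = pd.to_datetime(df['datetime']).values.astype(np.int64)  # datetimes as integer timestamps
df = pd.to_datetime(df.groupby('group')['datetime'].mean()).to_frame()  # average only this column, then convert back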
Output will be:
group datetime
1 2018-04-12 17:15:36.249999872
2 2018-04-13 11:39:45.500000000
Never had to do this myself but I thought it would work out of box. I might be missing a point but here's a workaround (if I understood you correctly):
df["datetime"] = pd.to_datetime(df["datetime"])
out = [
{"group": g, "mean": df.loc[df["group"].eq(g)]["datetime"].mean()}
for g in df["group"].unique()
]
pd.DataFrame(out)
Output
EDIT
If anyone could explain why df["datetime"].mean() works but df.groupby("group")["datetime"].mean() doesn't, that would be interesting to hear because I'm confused.
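One possible explanation, offered tentatively: the column in the question still holds strings, which cannot be averaged at all, and older pandas releases also seem not to have supported datetime columns in a grouped mean. In recent releases (this sketch assumes pandas 2.x), converting the column first appears to be enough. The name mean_by_group is illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    [['2018-04-12 11:20:57', 'Hello everyone', 1],
     ['2018-04-12 11:20:57', 'Hello everyone', 1],
     ['2018-04-12 11:19:34', 'second msg', 1],
     ['2018-04-13 11:00:57', 'Random', 1],
     ['2018-04-13 11:49:34', '3rd msg', 2],
     ['2018-04-13 11:29:57', 'Msg', 2]],
    columns=['datetime', 'msg', 'group'])

# Convert the strings to real datetimes before aggregating
df['datetime'] = pd.to_datetime(df['datetime'])

# Select the datetime column explicitly so the text column is not aggregated
mean_by_group = df.groupby('group')['datetime'].mean()
print(mean_by_group)
```

If this raises on an older installation, the integer round trip or the per-group loop shown in the answers above remain usable fallbacks.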
This question already has answers here:
Convert date to months and year
(4 answers)
Closed 4 years ago.
New to Python, and have already spent 30 minutes reviewing old responses, but still can't figure this out.
'Year' variable is a string. Examples: 1990, 2010.
I need to convert to date format, but just with the 4 year "digits".
Tried the following, but none are working:
date1 = datetime.datetime.date('Year', "%Y")
datetime.datetime.strftime('Year', "%Y")
wcData.astype(str).apply(lambda x: pd.to_datetime('Year', format='%Y'))
df.astype(str).apply(lambda x: pd.to_datetime(x, format='%Y%m%d'))
Please help!
You need datetime.datetime.strptime, not strftime:
import datetime
datetime.datetime.strptime("2010", "%Y")
Here's how to remember which one to use:
strPtime means "string -> parse -> time", that is, parse a string to create some kind of object that represents time;
strFtime means "string <- format <- time" (note the reversed arrows), that is, given an object that represents time, create a string representation of it using some format.
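A short round trip may make the mnemonic easier to remember. This is only a sketch; the names parsed and formatted are illustrative.

```python
import datetime

# strptime: parse a string into a datetime object;
# fields not present in the format default to January 1st, midnight
parsed = datetime.datetime.strptime("2010", "%Y")

# strftime: format a datetime object back into a string
formatted = parsed.strftime("%Y")

print(parsed)     # 2010-01-01 00:00:00
print(formatted)  # 2010
```

For a whole pandas column of year strings, pd.to_datetime(df['Year'], format='%Y') applies the same parsing to every value, assuming the column is named 'Year' as in the question.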