I have a dataset like this:
data = pd.DataFrame({'order_date_time':['2017-09-13 08:59:02', '2017-06-28 11:52:20', '2018-05-18 10:25:53', '2017-08-01 18:38:42', '2017-08-10 21:48:40','2017-07-27 15:11:51',
'2018-03-18 21:00:44','2017-08-05 16:59:05', '2017-08-05 16:59:05','2017-06-05 12:22:19'],
'delivery_date_time':['2017-09-20 23:43:48', '2017-07-13 20:39:29','2018-06-04 18:34:26','2017-08-09 21:26:33','2017-08-24 20:04:21','2017-08-31 20:19:52',
'2018-03-28 21:57:44','2017-08-14 18:13:03','2017-08-14 18:13:03','2017-06-26 13:52:03']})
I want to calculate the time difference between these dates as a number of days and add it to the table as a delivery delay column. The calculation needs to take both the date and the time into account.
For example, if the difference is 7 days 14:44:46, we can round this to 7 days.
from datetime import datetime
datetime.strptime(date_string, format)
You could use this to convert each string to a datetime object, store it in a variable, and then calculate the difference.
Visit https://www.journaldev.com/23365/python-string-to-datetime-strptime/
Python's datetime library is good to work with individual timestamps. If you have your data in a pandas DataFrame as in your case, however, you should use pandas datetime functionality.
To convert a column with timestamps from strings to proper datetime format you can use pandas.to_datetime():
data['order_date_time'] = pd.to_datetime(data['order_date_time'], format="%Y-%m-%d %H:%M:%S")
data['delivery_date_time'] = pd.to_datetime(data['delivery_date_time'], format="%Y-%m-%d %H:%M:%S")
The format argument is optional, but I think it is a good idea to always use it to make sure your datetime format is not "interpreted" incorrectly. It also makes the process much faster on large data-sets.
Once you have the columns in a datetime format you can simply calculate the timedelta between them:
data['delay'] = data['delivery_date_time'] - data['order_date_time']
And finally, if you want to round this timedelta, pandas again has the right method for it:
data['approx_delay'] = data['delay'].dt.round('D')
Here the dt accessor gives access to datetime-specific methods, and round takes a frequency as its argument, in this case one day ('D'). Note that round goes to the nearest day, so 7 days 14:44:46 becomes 8 days; to drop the time part instead, as in the question's example, use data['delay'].dt.floor('D') or data['delay'].dt.days.
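The steps above can be sketched end to end on two rows of the question's data. This is only an illustration: it assumes the order column is named order_date_time (with an underscore), and delay_days is a made-up column name.

```python
import pandas as pd

# Two rows taken from the question's sample data.
data = pd.DataFrame({
    "order_date_time": ["2017-09-13 08:59:02", "2017-06-28 11:52:20"],
    "delivery_date_time": ["2017-09-20 23:43:48", "2017-07-13 20:39:29"],
})

# Parse both string columns with an explicit format.
fmt = "%Y-%m-%d %H:%M:%S"
for col in ["order_date_time", "delivery_date_time"]:
    data[col] = pd.to_datetime(data[col], format=fmt)

# Subtracting two datetime columns yields a timedelta column.
data["delay"] = data["delivery_date_time"] - data["order_date_time"]

# delay_days (hypothetical name): whole days, time part dropped.
data["delay_days"] = data["delay"].dt.days
# approx_delay: rounded to the nearest day instead.
data["approx_delay"] = data["delay"].dt.round("D")

print(data[["delay", "delay_days", "approx_delay"]])
```

For the first row the delay is 7 days 14:44:46, so delay_days is 7 while approx_delay is 8 days.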
I have multiple csv files, I've set DateTime as the index.
df6.set_index("gmtime", inplace=True)
#correct the underscores in old datetime format
df6.index = [" ".join( str(val).split("_")) for val in df6.index]
df6.index = pd.to_datetime(df6.index)
The time was meant to be in GMT, but I think it was saved as BST (British Summer Time) when I set the clock on the Raspberry Pi.
I want to shift the time one hour backwards. When I use
df6.tz_convert(pytz.timezone('utc'))
it gives me the error below, because the index carries no time zone information yet:
Cannot convert tz-naive timestamps, use tz_localize to localize
How can I shift the time by one hour?
Given a column that contains date/time info as string, you would convert to datetime, localize to a time zone (here: Europe/London), then convert to UTC. You can do that before you set as index.
Ex:
import pandas as pd
dti = pd.to_datetime(["2021-09-01"]).tz_localize("Europe/London").tz_convert("UTC")
print(dti) # notice 1 hour shift:
# DatetimeIndex(['2021-08-31 23:00:00+00:00'], dtype='datetime64[ns, UTC]', freq=None)
Note: setting a time zone means that DST is accounted for, i.e. here, during winter you'd have UTC+0 and during summer UTC+1.
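To illustrate the note about DST, here is a small sketch with two made-up wall-clock times, one in winter and one in summer. It also shows a plain one-hour subtraction, which might be the better fit if the logger was always exactly one hour ahead regardless of season; which of the two applies depends on how the clock was actually set.

```python
import pandas as pd

# Naive wall-clock times (made-up values): one in winter, one in summer.
naive = pd.to_datetime(["2021-01-15 12:00", "2021-07-15 12:00"])

# Option 1: treat the values as UK local time and convert to UTC.
# The offset then depends on the date (UTC+0 in winter, UTC+1 in summer).
utc = naive.tz_localize("Europe/London").tz_convert("UTC")
print(utc)

# Option 2: a fixed shift of one hour, independent of the season.
shifted = naive - pd.Timedelta(hours=1)
print(shifted)
```

With option 1 only the summer value moves (12:00 becomes 11:00 UTC); with option 2 both values move back by one hour.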
To add to FObersteiner's response (sorry, new user, can't comment on answers yet):
I've noticed that in all the real-world situations where I've run into this (with full dataframes or pandas Series instead of just a single date), .tz_localize() and .tz_convert() need to be called slightly differently.
What's worked for me is
df['column'] = pd.to_datetime(df['column']).dt.tz_localize('Europe/London').dt.tz_convert('UTC')
Without the .dt, I get "index is not a valid DatetimeIndex or PeriodIndex."
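The difference between the two call styles can be sketched as follows: a column (Series) goes through the .dt accessor, while a DatetimeIndex exposes tz_localize and tz_convert directly. The frame below is made up for illustration; only the gmtime column name comes from the question.

```python
import pandas as pd

# Made-up sample frame; "gmtime" is the column name used in the question.
df = pd.DataFrame({
    "gmtime": ["2021-09-01 10:00:00", "2021-09-01 11:00:00"],
    "value": [1, 2],
})

# Column (Series): the time zone methods live under the .dt accessor.
col_utc = (
    pd.to_datetime(df["gmtime"])
    .dt.tz_localize("Europe/London")
    .dt.tz_convert("UTC")
)

# Index (DatetimeIndex): the same methods are called directly.
indexed = df.set_index(pd.to_datetime(df["gmtime"]))
idx_utc = indexed.index.tz_localize("Europe/London").tz_convert("UTC")

print(col_utc)
print(idx_utc)
```

Both routes give the same instants; 1 September falls in British Summer Time, so 10:00 local becomes 09:00 UTC.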
I have data in a pandas dataframe that is marked by timestamps as datetime objects. I would like to make a graph that treats time as a continuous quantity. My idea was to subtract the first timestamp from the others (shown here for the second entry)
xhertz_df.loc[1]['Dates']-xhertz_df.loc[0]['Dates']
to get the time passed since the first measurement. Which gives 350 days 08:27:51 as a timedelta object. So far so good.
This might be a duplicate, but I have not found the solution here so far. Is there a way to quickly transform this object into a number of, e.g., minutes, seconds or hours? I know I could extract the individual days, hours and minutes and do a tedious calculation to get it. But is there a built-in way to just turn this object into what I want?
Something like
timedelta.tominutes
that gives it back as a float of minutes, would be great.
If all you want is a float representation, it may be as simple as:
float_index = pd.Index(xhertz_df['Dates'].values.astype(float))
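In pandas, Timestamp and Timedelta columns are internally stored as numpy datetime64[ns] and timedelta64[ns] respectively, that is, as an integer number of nanoseconds.
So it is trivial to convert a Timedelta column (assuming nanosecond resolution) to a number of minutes:
(xhertz_df['Dates'] - xhertz_df.loc[0, 'Dates']).astype('int64') / 60000000000.
For a single Timedelta object, such as the difference of two individual entries, use .total_seconds() / 60 instead, because a scalar Timedelta has no astype method.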
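As a sketch of the conversion that does not depend on the internal resolution, total_seconds() can be used both on a single Timedelta and, through the .dt accessor, on a whole column. The two dates below are made up so that their difference matches the 350 days 08:27:51 from the question.

```python
import pandas as pd

# Made-up dates chosen to reproduce the question's 350 days 08:27:51.
xhertz_df = pd.DataFrame({
    "Dates": pd.to_datetime(["2020-01-01 00:00:00", "2020-12-16 08:27:51"]),
})

# Single Timedelta: total_seconds() returns a float number of seconds.
delta = xhertz_df.loc[1, "Dates"] - xhertz_df.loc[0, "Dates"]
minutes = delta.total_seconds() / 60
print(delta, minutes)

# Whole column: minutes elapsed since the first measurement.
elapsed_minutes = (
    xhertz_df["Dates"] - xhertz_df["Dates"].iloc[0]
).dt.total_seconds() / 60
print(elapsed_minutes)
```

Dividing by 3600 instead of 60 gives hours; the elapsed_minutes column can be used directly as the x axis of a plot.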
Here is a way to do it with timestamp():
two examples of converting a datetime to a timestamp, and one taking the difference in seconds.
import datetime as dt
import time
# current date and time
now = dt.datetime.now()
timestamp1 = dt.datetime.timestamp(now)
print("timestamp1 =", timestamp1)
time.sleep(4)
now = dt.datetime.now()
timestamp2 = dt.datetime.timestamp(now)
print("timestamp2 =", timestamp2)
print(timestamp2 - timestamp1)
If I have two dates (e.g. 19960104 and 19960314), what is the best way to get the number of days between these two dates?
Actually, I have to do this for many dates in my dataframe. I use this code:
for j in range(datenumbers[i]):
    date_format = "%Y%m%d"
    a = datetime.strptime(str(df_first_day.date[j]), date_format)
    b = datetime.strptime(str(df_first_day.exdate[j]), date_format)
    delta = (b - a).days
    df_first_day.days_to_expire[j] = delta
I need to put the difference between each pair of dates into a column of my dataframe. Is there a better way to do this without a for loop?
You will find dates much easier to handle if you first convert text to datetime.
Then it becomes trivial to compute a timedelta that answers your question.
import datetime as dt
fmt = '%Y%m%d'
d1 = dt.datetime.strptime('19960104', fmt)
d2 = dt.datetime.strptime('19960314', fmt)
print((d2 - d1).days)
This displays:
70
EDIT
If you choose to define a function:
def num_to_timestamp(ymd):
    return dt.datetime.strptime(str(ymd), '%Y%m%d')
then you can conveniently apply it to a column:
df['date'] = df['date'].apply(num_to_timestamp)
Similarly, the axis=1 option of apply lets you construct a column that is the difference of two existing columns.
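As an alternative to apply, the whole calculation can be vectorized with pandas.to_datetime, which avoids the loop entirely. This is a sketch that swaps in a different technique from the one above; the column names follow the question, and the second row is made up for illustration.

```python
import pandas as pd

# Column names follow the question; the second row is made up.
df_first_day = pd.DataFrame({
    "date": [19960104, 19960201],
    "exdate": [19960314, 19960301],
})

# Convert the integer dates to strings, then parse whole columns at once.
start = pd.to_datetime(df_first_day["date"].astype(str), format="%Y%m%d")
end = pd.to_datetime(df_first_day["exdate"].astype(str), format="%Y%m%d")

# Subtracting the columns gives timedeltas; .dt.days extracts whole days.
df_first_day["days_to_expire"] = (end - start).dt.days
print(df_first_day)
```

The first row reproduces the 70 days computed above.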
I have timestamp strings of the form "091250Z", where the first two numbers are the date and the last four numbers are the hours and minutes. The "Z" indicates UTC. Assuming the timestamp corresponds to the current month and year, how can this string be converted reliably to a datetime object?
I have the parsing to a datetime sorted, but the task quickly becomes nontrivial when going further and I'm not sure how to proceed:
datetime.strptime("091250Z", "%d%H%MZ")
What you need is to replace the year and month of your existing datetime object.
your_datetime_obj = datetime.strptime("091250Z", "%d%H%MZ")
new_datetime_obj = your_datetime_obj.replace(year=datetime.now().year, month=datetime.now().month)
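Since the "Z" marks the value as UTC, the same idea can be sketched with a time zone attached. The helper name parse_day_time and its now parameter are made up for illustration; now only exists so that the example is reproducible.

```python
from datetime import datetime, timezone


def parse_day_time(stamp, now=None):
    """Hypothetical helper: parse 'DDHHMMZ' into an aware UTC datetime.

    The year and month are taken from `now` (current UTC time by default).
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # strptime fills in year 1900 and month 1; both are replaced below.
    parsed = datetime.strptime(stamp, "%d%H%MZ")
    return parsed.replace(year=now.year, month=now.month, tzinfo=timezone.utc)


# Fixed reference date so the result does not depend on today's date.
reference = datetime(2021, 3, 15, tzinfo=timezone.utc)
print(parse_day_time("091250Z", now=reference))  # 2021-03-09 12:50:00+00:00
```

Note that replace raises a ValueError when the day does not exist in the current month (for example day 31 in April), so input near month boundaries may need extra handling.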
Like this? You've basically already done it; you just needed to assign it to a variable:
from datetime import datetime
dt = datetime.strptime('091250Z', '%d%H%MZ')
I generated a date in Python as below:
import time
time_stamp = time.strftime('%Y-%m-%d')
print(time_stamp)
Result:
2012-12-19
If the time_stamp above is today's date, I want to get the date '2012-12-17' by subtracting two days from it.
So how can I subtract two days from the current date in Python?
To perform calculations between some dates in Python use the timedelta class from the datetime module.
To do what you want to achieve, the following code should suffice.
import datetime
time_stamp = datetime.datetime(day=21, month=12, year=2012)
difference = time_stamp - datetime.timedelta(days=2)
print('%s-%s-%s' % (difference.year, difference.month, difference.day))
Explanation of the above:
The second line creates a new datetime object (from the datetime class) with the specified day, month and year.
The third line creates a timedelta object, which is used to perform calculations between datetime objects, and subtracts it from the datetime.
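The same calculation can be sketched with the date class and strftime, which keeps the zero-padded '%Y-%m-%d' format used in the question. The fixed date below is the one from the question; date.today() would give the current date instead.

```python
from datetime import date, timedelta

# Date from the question; use date.today() for the current date.
today = date(2012, 12, 19)

# Subtracting a timedelta from a date gives another date.
two_days_ago = today - timedelta(days=2)

# strftime keeps the zero-padded year-month-day format.
print(two_days_ago.strftime("%Y-%m-%d"))  # 2012-12-17
```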