I have a DataFrame with dates and would like to get the time between the first date and the last date. When I run the code below:
df.sort_values('timestamp', inplace=True)
firstDay = df.iloc[0]['timestamp']
lastDay = df.iloc[len(df)-1]['timestamp']
print(firstDay)
print(lastDay)
it prints the dates in the following format:
2016-09-24 17:42:27.839496
2017-01-18 10:24:08.629327
I'm trying to get the difference between them, but they're in str format, and I've been having trouble converting them to a form where I can compute the difference.
here you go :o)
from datetime import datetime
date_format_str = '%Y-%m-%d %H:%M:%S.%f'
date_1 = '2016-09-24 17:42:27.839496'
date_2 = '2017-01-18 10:24:08.629327'
start = datetime.strptime(date_1, date_format_str)
end = datetime.strptime(date_2, date_format_str)
# Interval between the two timestamps as a timedelta object
diff = end - start
diff_in_hours = diff.total_seconds() / 3600
print(diff_in_hours)
# Difference between the two calendar dates (time of day ignored) as a timedelta object
diff = end.date() - start.date()
print(diff.days)
Pandas
import pandas as pd
date_1 = '2016-09-24 17:42:27.839496'
date_2 = '2017-01-18 10:24:08.629327'
start = pd.to_datetime(date_1, format='%Y-%m-%d %H:%M:%S.%f')
end = pd.to_datetime(date_2, format='%Y-%m-%d %H:%M:%S.%f')
# Difference between the two Timestamps as a Timedelta object; .days counts whole days only
diff = end - start
print(diff.days)
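If the dates live in a DataFrame column, as in the question, it may be simpler to convert the whole column once and subtract the minimum from the maximum. The sketch below assumes the column is called timestamp and holds strings in the format shown above; the two-row DataFrame is a hypothetical stand-in for the real data.

```python
import pandas as pd

# Hypothetical stand-in for the question's DataFrame: timestamps stored as strings
df = pd.DataFrame({'timestamp': ['2017-01-18 10:24:08.629327',
                                 '2016-09-24 17:42:27.839496']})

# Convert the whole column once; afterwards it holds Timestamp values
df['timestamp'] = pd.to_datetime(df['timestamp'], format='%Y-%m-%d %H:%M:%S.%f')

# max() - min() gives a Timedelta without needing to sort first
span = df['timestamp'].max() - df['timestamp'].min()
print(span.days)                    # whole days only
print(span.total_seconds() / 3600)  # full interval in hours
```

Once the column is a datetime column, df.iloc[0]['timestamp'] and df.iloc[len(df)-1]['timestamp'] from the question can also be subtracted directly after sorting.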
Related
I have a DataFrame with timestamps in different formats, one like 05-28-2022 14:05:30 and one like 06-04-2022 03:04:13.002. I want to convert both to ISO format. How can I do that?
Input -> Output
05-28-2022 14:05:30 -> 2022-05-28T14:05:30.000+0000
06-04-2022 03:04:13.002 -> 2022-06-04T03:04:13.002+0000
You can use strptime() + strftime(). Here is an example:
from datetime import datetime
import pytz
# parse str to instance
first = datetime.strptime('05-28-2022 14:05:30', '%m-%d-%Y %H:%M:%S')
first = first.replace(tzinfo=pytz.UTC)
print(first.strftime('%Y-%m-%dT%H:%M:%S.%f%z'))
print(f'{first.isoformat()}')
second = datetime.strptime('06-04-2022 03:04:13.002', '%m-%d-%Y %H:%M:%S.%f')
second = second.replace(tzinfo=pytz.UTC)
print(second.strftime('%Y-%m-%dT%H:%M:%S.%f%z'))
print(second.isoformat())
# 2022-05-28T14:05:30.000000+0000
# 2022-05-28T14:05:30+00:00
# 2022-06-04T03:04:13.002000+0000
# 2022-06-04T03:04:13.002000+00:00
See the datetime docs. You can also use other packages for date processing and formatting:
iso8601
pendulum
dateutil
arrow
Example with dataframe:
import pandas as pd
import pytz
from datetime import datetime
df = pd.DataFrame({'date': ['05-28-2022 14:05:30', '06-04-2022 03:04:13.002']})
def convert_date(x):
    dt_format = '%m-%d-%Y %H:%M:%S.%f' if x.rfind('.', 1) > -1 else '%m-%d-%Y %H:%M:%S'
    dt = datetime.strptime(x, dt_format).replace(tzinfo=pytz.UTC)
    return dt.strftime('%Y-%m-%dT%H:%M:%S.%f%z')
df['new_date'] = df['date'].apply(convert_date)
print(df)
date new_date
0 05-28-2022 14:05:30 2022-05-28T14:05:30.000000+0000
1 06-04-2022 03:04:13.002 2022-06-04T03:04:13.002000+0000
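Note that %f always prints six digits, while the output asked for in the question has three (milliseconds). A minimal sketch of one way to get exactly that shape with only the standard library is below; to_iso_millis is a hypothetical helper name, and it assumes the inputs are already in UTC, as the answer above does.

```python
from datetime import datetime, timezone

def to_iso_millis(text):
    """Parse 'mm-dd-YYYY HH:MM:SS[.fff]' and format it with millisecond precision."""
    fmt = '%m-%d-%Y %H:%M:%S.%f' if '.' in text else '%m-%d-%Y %H:%M:%S'
    # Assumption: the input is UTC, so the UTC tzinfo is attached without shifting the time
    dt = datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
    # strftime's %f yields six digits, so the milliseconds are formatted by hand
    millis = f'{dt.microsecond // 1000:03d}'
    return dt.strftime('%Y-%m-%dT%H:%M:%S.') + millis + dt.strftime('%z')

print(to_iso_millis('05-28-2022 14:05:30'))
print(to_iso_millis('06-04-2022 03:04:13.002'))
```

The same function can be passed to df['date'].apply() in place of convert_date.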
I would like to convert a datetime to UTC. I tried the code below, but the output does not look correct:
import datetime
import pandas as pd
def str2dt(tstr):
    dt = datetime.datetime.strptime(tstr, '%m-%d %H:%M:%S.%f')
    return dt
ts = "04-12 20:43:34.342"
dt = str2dt(ts)
utc_delta = datetime.datetime.utcnow() - datetime.datetime.now()
utc = dt - utc_delta
print(dt,'->',utc)
Current output:
1900-04-12 20:43:34.342000 -> 1900-04-12 15:43:34.342001
The expected output time is 1900-04-12 02:43:34.342001
It looks like you would be better off using isoformat() on your datetime:
utc = dt.isoformat(sep=' ') # The normal date-time separator is 'T', but that isn't very readable
print(f"{dt} -> {utc}")
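Note that this only changes how the value is printed; it does not shift the time to UTC.
If you still need the UTC offset, consider using datetime.datetime.utcoffset(), which returns None for a naive datetime such as this one.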
The delta should be added, not subtracted:
import datetime
import pandas as pd
def str2dt(tstr):
    dt = datetime.datetime.strptime(tstr, '%m-%d %H:%M:%S.%f')
    return dt
ts = "04-12 20:43:34.342"
dt = str2dt(ts)
utc_delta = datetime.datetime.utcnow() - datetime.datetime.now()
utc = dt + utc_delta
print(dt,'->',utc)
Output:
1900-04-12 20:43:34.342000 -> 1900-04-13 01:43:34.341999
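The stray microsecond in the output (.341999 instead of .342000) comes from reading the clock twice, once in utcnow() and once in now(). A sketch that avoids this by attaching an explicit offset and letting astimezone() do the conversion is shown below. The UTC-5 offset is an assumption inferred from the output above; substitute the offset the timestamps were actually recorded in.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the timestamps were recorded at UTC-5; replace with the real offset
local_tz = timezone(timedelta(hours=-5))

ts = "04-12 20:43:34.342"
# The input has no year, so strptime falls back to 1900
dt = datetime.strptime(ts, '%m-%d %H:%M:%S.%f').replace(tzinfo=local_tz)

# Converting an aware datetime is exact; no clock reads are involved
utc = dt.astimezone(timezone.utc)
print(dt, '->', utc)
```

A fixed offset ignores daylight saving changes; if that matters, a named zone from the zoneinfo module is probably the better fit.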
Basically, I have two DataFrame columns containing dates. The initial content is in dd/mm/YYYY format, and I want to subtract one column from the other. I can't subtract strings, so I converted them to datetime, but when I do that the format seems to change to YYYY-dd-mm for some values, so the subtraction gives the wrong result. For example:
Initial content:
a: 05/09/2021
b: 30/09/2021
Expected result: 25 days.
Converted to datetime:
a: 2021-05-09
b: 2021-09-30 (for some reason this date stays the same)
Result: 144 days.
I'm using pandas and datetime in this project.
I'd like to know how to subtract these two columns and get the correct result.
--- Answer
When I used
pd.to_datetime(date, format="%d/%m/%Y")
It worked. Thank you all for your time. This is my first project in pandas. :)
df = pd.DataFrame({'Date1': ['05/09/2021'], 'Date2': ['30/09/2021']})
df = df.apply(lambda x:pd.to_datetime(x,format=r'%d/%m/%Y')).assign(Delta=lambda x: (x.Date2-x.Date1).dt.days)
print(df)
Date1 Date2 Delta
0 2021-09-05 2021-09-30 25
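To see where the 144 days came from, here is a small standard-library sketch using the two values from the example. It parses the ambiguous date both day-first and month-first; the variable names are only for illustration.

```python
from datetime import datetime

a, b = '05/09/2021', '30/09/2021'

# Day-first parsing, matching the dd/mm/YYYY source data
a_ok = datetime.strptime(a, '%d/%m/%Y')
b_ok = datetime.strptime(b, '%d/%m/%Y')
print((b_ok - a_ok).days)       # 25

# Month-first parsing reads 05/09 as 9 May instead of 5 September
a_swapped = datetime.strptime(a, '%m/%d/%Y')
print((b_ok - a_swapped).days)  # 144
```

30/09/2021 cannot be read month-first (there is no month 30), which is why that value came out unchanged while 05/09/2021 was silently swapped. Passing an explicit format removes the guesswork.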
I would like to generate timestamps between each StartDate/StartTime and EndDate/EndTime, for example:
StartDate 2019-01-01, StartTime 01:00:00 (showing only the hour: 2019-01-01, 01)
EndDate 2019-01-11, EndTime 01:00:00 (showing only the hour: 2019-01-11, 01)
with the date and time shown together in one column of the table.
from datetime import datetime, timezone, timedelta
import pandas as pd
def generate_date_list(from_date, to_date):
    list_of_dates = []
    start = datetime.strptime(from_date, '%Y-%m-%d')
    end = datetime.strptime(to_date, '%Y-%m-%d')
    step = timedelta(days=1)
    while start <= end:
        list_of_dates.append(start.date())
        start += step
    dt = [i.strftime("%Y-%m-%d %H:%M:%S") for i in list_of_dates]
    return {"Date_Time": dt}
date_time_list = generate_date_list("2019-01-01", "2019-01-11")
df = pd.DataFrame(date_time_list)
print(df)
# You can use df.to_sql() to insert the rows into a table.
You can try the pandas date_range function:
import pandas as pd
# freq='H' means hourly; recent pandas releases prefer the lowercase alias 'h'
dates = pd.date_range(start='2009-01-01 01:00:00', end='2009-01-11 00:00:00', freq='H')
for date in dates:
    print(date)
This prints every hourly datetime between the start time and the end time.
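To get the date and hour together in one column, as the question asks, the range can be formatted and placed in a DataFrame. This is a sketch using the start and end values from the question; the column name Date_Time and the variable names are assumptions for illustration.

```python
import pandas as pd

# Inputs taken from the question's example
start = pd.Timestamp('2019-01-01 01:00:00')
end = pd.Timestamp('2019-01-11 01:00:00')

# A Timedelta as freq sidesteps the 'H' versus 'h' alias difference between pandas versions
stamps = pd.date_range(start=start, end=end, freq=pd.Timedelta(hours=1))

# Keep only date and hour, e.g. '2019-01-01 01', in a single column
hourly = pd.DataFrame({'Date_Time': stamps.strftime('%Y-%m-%d %H')})
print(hourly.head())
print(len(hourly))
```

Both endpoints are included, so ten days of hourly steps give 241 rows.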
I'm trying to subtract a day from the date 06-30-2019 to make it 06-29-2019, but I can't figure out how to achieve that.
I've tried with:
import datetime
date = "06-30-2019"
date = datetime.datetime.strptime(date,'%m-%d-%Y').strftime('%m-%d-%Y')
print(date)
It surely gives me back the date I used above.
How can I subtract a day from a date in the above format?
Try this:
import datetime
date = "06/30/19"
date = datetime.datetime.strptime(date, "%m/%d/%y")
NewDate = date + datetime.timedelta(days=-1)
print(NewDate) # 2019-06-29 00:00:00
Your code:
date = "06-30-2019"
date = datetime.datetime.strptime(date,'%m-%d-%Y').strftime('%m-%d-%Y')
Check the type of the date variable:
type(date)
Out[]: str
It is a string. To do date arithmetic you must first convert it to a datetime. You can use pd.to_datetime():
# Import packages
import pandas as pd
from datetime import timedelta
# input date
date = "06-30-2019"
# Convert the string to a pandas Timestamp
date = pd.to_datetime(date)
print(date)
# Subtract days
number_of_days = 1
new_date = date - timedelta(number_of_days)
print(new_date)
output:
2019-06-29 00:00:00
If you want to drop the time component, you can use:
str(new_date.date())
Out[]: '2019-06-29'
Use timedelta:
import datetime
date = datetime.datetime.strptime("06/30/19" ,"%m/%d/%y")
print( date - datetime.timedelta(days=1))
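The answers above print a datetime such as 2019-06-29 00:00:00. If the result should come back in the original 06-29-2019 form, one option is to format it after the subtraction. A minimal sketch, assuming the mm-dd-YYYY format from the question (the variable names are illustrative):

```python
from datetime import datetime, timedelta

date_str = "06-30-2019"
fmt = '%m-%d-%Y'

# Parse to a datetime, subtract one day, then format back to the same layout
previous_day = datetime.strptime(date_str, fmt) - timedelta(days=1)
result = previous_day.strftime(fmt)
print(result)  # 06-29-2019
```

The key difference from the code in the question is that the subtraction happens before strftime(), while the value is still a datetime rather than a string.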