Pandas get date difference between 2 ISO dates - python

I have 2 date strings: 2020-02-12T16:02:51Z and 2017-03-08T18:16:02-05:00, and I'd like to get the difference in days that includes partial day difference (for example, a difference of 1 day 12 hours 00 minutes 00 seconds will be 1.5 days).
Here is what I have:
import pandas as pd
date1 = pd.to_datetime("2020-02-12T16:02:51Z", utc=True)
date2 = pd.to_datetime("2017-03-08T18:16:02-05:00", utc=True)
diff = date1 - date2
diff.days
>>> 1070
I expected 1070.<some decimal digits>, because diff is Timedelta('1070 days 16:46:49').
What did I do wrong? I am using Python 3.8.1 and pandas 1.0.1

Timedelta.days holds only the whole number of days in the delta, which is 1070 in your case.
However, you have different options to get the results in fractional form.
>>> diff = date1 - date2
>>> diff.days + diff.seconds/86400
1070.6991782407408
>>> diff.total_seconds()/86400
1070.6991782407408

You are not doing anything wrong. pandas.Timedelta.days returns only the full days in the timedelta object. You can get fractional days using diff.value (which returns nanoseconds), such as:
import pandas as pd
date1 = pd.to_datetime("2020-02-12T16:02:51Z", utc=True)
date2 = pd.to_datetime("2017-03-08T18:16:02-05:00", utc=True)
diff = date1 - date2
# nanoseconds to days (86400 s/day * 1e9 ns/s = 8.64e13)
diff.value / 8.64e+13
>>> 1070.6991782407408
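Another option, sketched here rather than taken from the answers above, is to divide the Timedelta by a one-day Timedelta. The division returns a plain float and avoids the hand-written constant. The names fractional_days and deltas are only illustrative.

```python
import pandas as pd

date1 = pd.to_datetime("2020-02-12T16:02:51Z", utc=True)
date2 = pd.to_datetime("2017-03-08T18:16:02-05:00", utc=True)
diff = date1 - date2

# Timedelta / Timedelta yields a float number of days.
fractional_days = diff / pd.Timedelta(days=1)
print(fractional_days)  # about 1070.699

# The same division works element-wise on a Series of timedeltas.
deltas = pd.Series([diff, pd.Timedelta(hours=36)])
print(deltas / pd.Timedelta(days=1))  # the second entry is 1.5 days
```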

Related

Dataframe datetime switching month into days

I am trying to convert a day/month/Year Hours:Minutes column into just day and month. When I run my code, the conversion switches the months into days and the days into months.
You can find a copy of my dataframe with the one column I want to switch to Day/Month here
https://file.io/JkWl7fsBN0vl
Below is the code I am using to convert:
df =pd.read_csv('Example.csv')
df['DateTime'] = pd.to_datetime(df['DateTime'])
df.to_csv("output.csv", index=False)
Without knowing the exact DateTime format you are using (the link to the dataframe is broken), I'm going to use an example of
day/month/Year Hours:Minutes
05/09/2014 12:30
You can determine the exact format date code using this site
Essentially, to_datetime() has a format argument that lets you pass the exact format when it cannot be inferred reliably. That tells pandas explicitly which field is the day and which is the month, so the two are no longer swapped.
>>> df = pd.DataFrame(['05/09/2014 12:30'],columns=['DateTime'])
DateTime
0 05/09/2014 12:30
>>> df['DateTime'] = pd.to_datetime(df['DateTime'], format='%d/%m/%Y %H:%M')
DateTime
0 2014-09-05 12:30:00
>>> df['day'] = df['DateTime'].dt.day
>>> df['month'] = df['DateTime'].dt.month
DateTime day month
0 2014-09-05 12:30:00 5 9
>>> df['DD/MM'] = df['DateTime'].dt.strftime('%d/%m')
DateTime day month DD/MM
0 2014-09-05 12:30:00 5 9 05/09
I'm unsure about the exact format you want the day and month available in (separate columns, combined), but I provided a few examples, so you can remove the DateTime column when you're done with it and use the one you need.
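As a self-contained sketch of the same idea, the snippet below parses day-first strings two ways: with an explicit format and with dayfirst=True. The sample values and the variable names (raw, parsed, day_month) are made up for illustration, since the original file is not available.

```python
import pandas as pd

# Hypothetical sample data in day/month/Year Hours:Minutes layout.
raw = pd.Series(["05/09/2014 12:30", "06/09/2014 08:15"])

# Explicit format: day first, then month.
parsed = pd.to_datetime(raw, format="%d/%m/%Y %H:%M")

# dayfirst=True is a looser alternative when the exact layout is not fixed.
parsed_dayfirst = pd.to_datetime(raw, dayfirst=True)

# Keep only day and month as text.
day_month = parsed.dt.strftime("%d/%m")
print(day_month.tolist())
```

An explicit format is usually the safer choice, because it raises an error on rows that do not match instead of guessing.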

Convert a number into a special datetime

Date1 :20061201
Date2 :01/12/2006
How can I use pandas in Python to convert Date1 into the Date2 (day/month/year) format? Thanks! Date1 and Date2 are two columns in a CSV file.
Data:
In [151]: df
Out[151]:
Date
0 20061201
1 20170530
Option 1:
In [152]: pd.to_datetime(df.Date, format='%Y%m%d').dt.strftime('%d/%m/%Y')
Out[152]:
0 01/12/2006
1 30/05/2017
Name: Date, dtype: object
Option 2:
In [153]: df.Date.astype(str).str.replace(r'(\d{4})(\d{2})(\d{2})', r'\3/\2/\1', regex=True)
Out[153]:
0 01/12/2006
1 30/05/2017
Name: Date, dtype: object
If you're using pandas and want a timestamp object back
pd.to_datetime('20061201')
Timestamp('2006-12-01 00:00:00')
If you want a string back
str(pd.to_datetime('20061201').date())
'2006-12-01'
Assuming you have a dataframe df
df = pd.DataFrame(dict(Date1=['20161201']))
Then you can use the same techniques in vectorized form.
as timestamps
df.assign(Date2=pd.to_datetime(df.Date1))
Date1 Date2
0 20161201 2016-12-01
as strings
df.assign(Date2=pd.to_datetime(df.Date1).dt.date.astype(str))
Date1 Date2
0 20161201 2016-12-01
import datetime
A=datetime.datetime.strptime('20061201','%Y%m%d')
A.strftime('%d/%m/%Y')
You may use apply and lambda function here.
Suppose you have a dataset named df as below:
id date1
0 20061201
2 20061202
You can use the code like below:
df['date2'] = df['date1'].astype(str).apply(lambda x: x[6:] + '/' + x[4:6] + '/' + x[:4])
The result will be:
id date1 date2
0 20061201 01/12/2006
2 20061202 02/12/2006
The simplest way is probably using the date parsing provided by datetime:
from datetime import datetime
datetime.strptime(str(20061201), "%Y%m%d")
You can apply this transformation to all rows in your pandas dataframe/series using the following:
from datetime import datetime
def convert_date(d):
    return datetime.strptime(str(d), "%Y%m%d")
df['Date2'] = df.Date1.apply(convert_date)
This will add a Date2 column to your dataframe df, which is the datetime representation of the Date1 column.
You can then serialize the date again by using strftime:
def serialize_date(d):
    return d.strftime("%d/%m/%Y")
df['Date2'] = df.Date2.apply(serialize_date)
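Alternatively, you can do it all with integer arithmetic (this assumes Date1 holds integers such as 20061201):
def reformat_date(d):
    year = d // 10000
    month = d % 10000 // 100
    day = d % 100
    # zero-pad day and month so that 1 becomes "01"
    return "{day:02d}/{month:02d}/{year}".format(day=day, month=month, year=year)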
df['Date2'] = df.Date1.apply(reformat_date)
This is quite a bit faster than using the parsing machinery provided by strptime.

Add days to date in pandas

I have a data frame that contains 2 columns, one is Date and other is float number.
I would like to add those 2 to get the following:
Index Date Days NewDate
0 20-04-2016 5 25-04-2016
1 16-03-2015 3.7 20-03-2015
As you can see, a fractional number of days is rounded up to the next whole day, e.g. 3.1 --> 4 (days).
I have some weird questions so I appreciate any help.
Thank you !
First, ensure that the Date column is a datetime object:
df['Date'] = pd.to_datetime(df['Date'], format='%d-%m-%Y')
Then, we can convert the Days column to whole days by taking its ceiling and then converting it to a pandas Timedelta (np is numpy, imported with import numpy as np):
temp = df['Days'].apply(np.ceil).apply(lambda x: pd.Timedelta(x, unit='D'))
Datetime objects and timedeltas can be added:
df['NewDate'] = df['Date'] + temp
You can convert the Days column to timedelta and add it to Date column:
import numpy as np
import pandas as pd
# pd.np was removed from pandas, so call numpy directly
df['NewDate'] = pd.to_datetime(df.Date, format='%d-%m-%Y') + pd.to_timedelta(np.ceil(df.Days), unit="D")
df
Using combine for calculations across two columns and pd.DateOffset for adding days:
df['NewDate'] = df['Date'].combine(df['Days'], lambda x,y: x + pd.DateOffset(days=int(np.ceil(y))))
output:
Date Days NewDate
0 2016-04-20 5.0 2016-04-25
1 2015-03-16 3.7 2015-03-20
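Putting the pieces together, here is a minimal runnable sketch built on the two rows from the question. It assumes the Date column holds day-month-year strings and that the result should be written back in the same layout. The intermediate names dates and whole_days are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Date": ["20-04-2016", "16-03-2015"], "Days": [5, 3.7]})

# Parse the day-first strings explicitly.
dates = pd.to_datetime(df["Date"], format="%d-%m-%Y")

# Round fractional days up to the next whole day (3.7 -> 4).
whole_days = np.ceil(df["Days"])

# Add the days and format the result like the input column.
df["NewDate"] = (dates + pd.to_timedelta(whole_days, unit="D")).dt.strftime("%d-%m-%Y")
print(df)
```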

Python get number of the week by month

Could anyone help me, please? How do I get the number of weeks per month in Python?
from datetime import datetime, date, timedelta
Input:
date1 = "2015-07-09"
date2 = "2016-08-20"
Output:
2015-07 : 4
2015-08 : 5
2015-09 : 4
....
2016-08 : 5
How do I count the number of weeks in each month from date1 to date2?
If you wanted to measure the number of full weeks between two dates, you could accomplish this with datetime.strptime and timedelta like so:
from datetime import datetime, date, timedelta
dateformat = "%Y-%m-%d"
date1 = datetime.strptime("2015-07-09", dateformat)
date2 = datetime.strptime("2016-08-20", dateformat)
weeks = int((date2-date1).days/7)
print(weeks)
This outputs 58. Dividing the day count by 7 converts it to weeks, and int keeps only the whole weeks by discarding the fractional part. If you want partial weeks as well, drop the int call; in Python 3 the / operator already returns a float (in Python 2, divide by 7.0 instead of 7).
Try this:
import datetime
date1 = "2015-07-09"
date2 = "2016-08-20"
d1 = datetime.datetime.strptime(date1, '%Y-%m-%d').date()
d2 = datetime.datetime.strptime(date2, '%Y-%m-%d').date()
diff = d2 - d1
# whole weeks and the leftover days
weeks, days = divmod(diff.days, 7)
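Neither snippet above breaks the count down by month, which is what the question asks for. The question does not define how a week that spans two months should be counted, so the sketch below makes an explicit assumption: a week belongs to the month that contains its Monday. Under that assumption the numbers will not necessarily match the illustrative output in the question. The helper name weeks_per_month is hypothetical, and pandas is used here instead of the plain datetime module.

```python
import pandas as pd

def weeks_per_month(start, end):
    """Count the Mondays in each calendar month between start and end (inclusive)."""
    # Assumption: a week is attributed to the month containing its Monday.
    mondays = pd.date_range(start, end, freq="W-MON")
    return mondays.to_period("M").value_counts().sort_index()

counts = weeks_per_month("2015-07-09", "2016-08-20")
for month, n in counts.items():
    print(f"{month} : {n}")
```

A month at either end of the range that contains no Monday inside the range is simply absent from the result. Changing freq (for example to "W-SUN") switches the weekday that anchors each week.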

Create a series of days between timestamps - DUPLICATE LINK ABOVE DOESN'T SOLVE

I have two date features of type datetime. I'd like to express the difference between them in days cast as type int. How do I accomplish this:
In[]
print lcd.time_to_default
print lcd.issue_date
lcd['time_to_default']=(lcd.last_pymnt_date - lcd.issue_date)
lcd.time_to_default.head()
Out[92]:
datetime64[ns]
datetime64[ns]
0 1127 days
1 487 days
2 913 days
3 1127 days
4 1217 days
Name: time_to_default, dtype: timedelta64[ns]
I want to cast this series as an int, not timedelta64.
Addendum: I can't use ".days" on it, as the supposed duplicate linked above suggests.
In[]
lcd.time_to_default.days
Returns:
Out[]
'Series' object has no attribute 'days'
Just subtract the two datetime variables. That yields timedelta type.
Eg:
In [2]: datetime.datetime.now()
Out[2]: datetime.datetime(2015, 6, 2, 0, 30, 49, 548657)
In [3]: yesterday = datetime.datetime.now() - datetime.timedelta(days=1)
In [4]: datetime.datetime.now() - yesterday
Out[4]: datetime.timedelta(1, 17, 32459)
In [5]: diff = (datetime.datetime.now() - yesterday)
In [6]: diff.days
Out[6]: 1
Try this,
>>> from datetime import datetime
>>> date1 = datetime(2015,6,2)
>>> date2 = datetime(2015,5,2)
>>> diff = date1 - date2
>>> print (diff.days)
31
To get the number of days from a series of timedelta64[ns], you could try the following (not tested); the division yields floats, so cast the result with .astype(int) if you need integers:
result = np.divide(lcd.time_to_default, np.timedelta64(1, 'D'))
See Time difference in seconds from numpy.timedelta64 and Converting between datetime, Timestamp and datetime64.
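The 'Series' object has no attribute 'days' error arises because .days exists on a single timedelta, not on a Series. For a timedelta64[ns] Series, the .dt accessor exposes the same attribute element-wise and returns integers. The sketch below uses made-up dates; only the column names are taken from the question.

```python
import pandas as pd

# Hypothetical sample data with the column names from the question.
lcd = pd.DataFrame({
    "issue_date": pd.to_datetime(["2015-05-02", "2015-06-01"]),
    "last_pymnt_date": pd.to_datetime(["2015-06-02", "2015-06-02"]),
})

# Subtracting two datetime64[ns] columns gives a timedelta64[ns] Series.
delta = lcd["last_pymnt_date"] - lcd["issue_date"]

# .dt.days extracts the whole-day component as an integer Series.
lcd["time_to_default"] = delta.dt.days
print(lcd["time_to_default"])
```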
