Pandas Dataframe Calculate Num Business Days - python

I am working on a project and I am trying to calculate the number of business days within a month. What I currently did was extract all of the unique months from one dataframe into a different dataframe and created a second column with
df2['Signin Date Shifted'] = df2['Signin Date'] + pd.DateOffset(months=1)
Thus the current dataframe looks like:
I know I can do dt.daysinmonth or a timedelta but that gives me all of the days within a month including Sundays/Saturdays (which I don't want).

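You can use np.busday_count from NumPy, which counts the weekdays from the start date up to, but not including, the end date.
Example: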
import pandas as pd
import numpy as np
df = pd.DataFrame({"Signin Date": ["2018-01-01", "2018-02-01"]})
df["Signin Date"] = pd.to_datetime(df["Signin Date"])
df['Signin Date Shifted'] = pd.DatetimeIndex(df['Signin Date']) + pd.DateOffset(months=1)
df["bussDays"] = np.busday_count( df["Signin Date"].values.astype('datetime64[D]'), df['Signin Date Shifted'].values.astype('datetime64[D]'))
print(df)
Output:
Signin Date Signin Date Shifted bussDays
0 2018-01-01 2018-02-01 23
1 2018-02-01 2018-03-01 20
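If the sign-in dates do not fall on the first of the month, the same idea can be applied after truncating each date to its month. The following is a minimal sketch under that assumption; the sample dates and the single holiday are made up for illustration, and the `holidays` argument is only needed if public holidays should be excluded as well.

```python
import numpy as np
import pandas as pd

# Hypothetical sign-in dates that fall in the middle of a month
signin = pd.to_datetime(pd.Series(["2018-01-15", "2018-02-03"]))

# Truncate to the first day of the month, then step to the next month
month_start = signin.values.astype("datetime64[M]")
next_month_start = month_start + np.timedelta64(1, "M")

begin = month_start.astype("datetime64[D]")
end = next_month_start.astype("datetime64[D]")

# Monday-Friday count for the whole month (end date is excluded)
weekdays = np.busday_count(begin, end)

# Same count, additionally skipping an assumed holiday list
holidays = np.array(["2018-01-01"], dtype="datetime64[D]")
working_days = np.busday_count(begin, end, holidays=holidays)

print(weekdays)
print(working_days)
```

Because the end date is excluded, passing the first day of the following month covers the whole month without counting any day twice.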

Related

Pandas - How to Create Month and Year Column from DateTime Column

I have the following data (I purposely created a DateTime column from the string column of dates because that's how I am receiving the data):
import numpy as np
import pandas as pd
data = pd.DataFrame({"String_Date" : ['10/12/2021', '9/21/2021', '2/12/2010', '3/25/2009']})
#Create DateTime columns
data['Date'] = pd.to_datetime(data["String_Date"])
data
String_Date Date
0 10/12/2021 2021-10-12
1 9/21/2021 2021-09-21
2 2/12/2010 2010-02-12
3 3/25/2009 2009-03-25
I want to add the following "Month & Year Date" column with entries that are comparable (i.e. can determine whether Oct-12 < Sept-21):
String_Date Date Month & Year Date
0 10/12/2021 2021-10-12 Oct-12
1 9/21/2021 2021-09-21 Sept-21
2 2/12/2010 2010-02-12 Feb-12
3 3/25/2009 2009-03-25 Mar-25
The "Month & Year Date" column doesn't have to be in the exact format I show above (although bonus points if it does), just as long as it shows both the month (abbreviated name, full name, or month number) and the year in the same column. Most importantly, I want to be able to groupby the entries in the "Month & Year Date" column so that I can aggregate data in my original data set across every month.
You can do:
data["Month & Year Date"] = (
data["Date"].dt.month_name() + "-" + data["Date"].dt.year.astype(str)
)
print(data)
Prints:
String_Date Date Month & Year Date
0 10/12/2021 2021-10-12 October-2021
1 9/21/2021 2021-09-21 September-2021
2 2/12/2010 2010-02-12 February-2010
3 3/25/2009 2009-03-25 March-2009
But if you want to group by month/year it's preferable to use:
data.groupby([data["Date"].dt.month, data["Date"].dt.year])
data['Month & Year Date'] = data['Date'].dt.strftime('%b') + '-' + data['Date'].dt.strftime('%y')
print(data)
Outputs:
String_Date Date Month & Year Date
0 10/12/2021 2021-10-12 Oct-21
1 9/21/2021 2021-09-21 Sep-21
2 2/12/2010 2010-02-12 Feb-10
3 3/25/2009 2009-03-25 Mar-09
You can use the .dt accessor to format your date field however you like. For your example, it'd look like this:
data['Month & Year Date'] = data['Date'].dt.strftime('%b-%y')
Although honestly I don't think that's the best representation for the purpose of sorting or evaluating greater than or less than. If what you want is essentially a truncated date, you could do this instead:
As a string -
data['Month & Year Date'] = data['Date'].dt.strftime('%Y-%m-01')
As a datetime object -
data['Month & Year Date'] = data['Date'].dt.to_period('M').dt.to_timestamp()
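To illustrate why a truncated date is easier to compare and group than a label such as "Oct-21", here is a small sketch that keeps the month as a pandas Period. The `Value` column and the extra fifth row are assumptions added only so there is something to aggregate.

```python
import pandas as pd

data = pd.DataFrame({
    "String_Date": ["10/12/2021", "9/21/2021", "2/12/2010", "3/25/2009", "10/30/2021"],
    "Value": [1, 2, 3, 4, 5],  # hypothetical column to aggregate
})
data["Date"] = pd.to_datetime(data["String_Date"], format="%m/%d/%Y")

# A monthly Period keeps year and month together and orders chronologically
data["Month"] = data["Date"].dt.to_period("M")

# Group by month and aggregate; the result is sorted by month
monthly = data.groupby("Month")["Value"].sum()
print(monthly)

# Periods compare chronologically, unlike strings such as "Sep-21" and "Oct-21"
sept_before_oct = pd.Period("2021-09", "M") < pd.Period("2021-10", "M")
```

If a display label is still wanted, it can be derived from the period afterwards with `strftime('%b-%y')` while sorting and grouping stay on the period itself.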
You can use strftime. The available format codes are listed in the Python datetime documentation.
data['Month Day'] = data['Date'].apply(lambda x:x.strftime('%b-%d'))

Python: Add Weeks to Date from df

How would I add two df columns together (date + weeks):
This works for me:
df['Date'] = pd.to_datetime(startDate, format='%Y-%m-%d') + datetime.timedelta(weeks = 3)
But when I try to add weeks from a column, I get a type error: unsupported type for timedelta weeks component: Series
df['Date'] = pd.to_datetime(startDate, format='%Y-%m-%d') + datetime.timedelta(weeks = df['Duration (weeks)'])
Would appreciate any help thank you!
You can use the pandas to_timedelta function to transform the number of weeks column to a timedelta, like this:
import pandas as pd
import numpy as np
# create a DataFrame with a `date` column
df = pd.DataFrame(
pd.date_range(start='1/1/2018', end='1/08/2018'),
columns=["date"]
)
# add a column `weeks` with a random number of weeks
df['weeks'] = np.random.randint(1, 6, df.shape[0])
# use `pd.to_timedelta` to transform the number of weeks column to a timedelta
# and add it to the `date` column
df["new_date"] = df["date"] + pd.to_timedelta(df["weeks"], unit="W")
df.head()
date weeks new_date
0 2018-01-01 5 2018-02-05
1 2018-01-02 2 2018-01-16
2 2018-01-03 2 2018-01-17
3 2018-01-04 4 2018-02-01
4 2018-01-05 3 2018-01-26
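Applied to the situation in the question, where a single start date is combined with a column of week counts, the same approach might look like the sketch below. The start date and the durations are made-up values; only the column name `Duration (weeks)` comes from the question.

```python
import pandas as pd

# Hypothetical durations; the column name matches the question
df = pd.DataFrame({"Duration (weeks)": [1, 3, 10]})

# A single start date, parsed the same way as in the question
start_date = pd.to_datetime("2021-01-04", format="%Y-%m-%d")

# Convert the whole column to timedeltas at once and add it to the scalar date
df["Date"] = start_date + pd.to_timedelta(df["Duration (weeks)"], unit="W")
print(df)
```

The original error arises because `datetime.timedelta` only accepts scalar numbers, whereas `pd.to_timedelta` converts a whole Series element by element.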

add business days to a df column

I want to add a column called 'Date' which starts from today's date and adds business days as you go down the df, up until a year from now. I am trying the code below, but it repeats days because it is adding a business day to Fridays and Saturdays. The output should have row 1 = 2021-10-07 and end with 2022-10-08, with only business days being shown. Can anyone help please?
import datetime as dt
import pandas as pd
from pandas.tseries.offsets import BDay
from datetime import date
df = pd.DataFrame({'Date': pd.date_range(start=date.today(), end=date.today() + dt.timedelta(days=365))})
df['Date'] = df['Date'] + BDay(1)
It is unclear what your desired output is, but if you want a column 'Date' that only shows the dates for business days, you can use the code below.
import datetime as dt
import pandas as pd
from datetime import date
df = pd.DataFrame({'Date': pd.date_range(start=date.today(), end=date.today() + dt.timedelta(days=365))})
df = df[df.Date.dt.weekday < 5]  # weekday: 0 is Monday, 6 is Sunday
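The duplicates in the question come from the fact that adding BDay(1) to a Friday, a Saturday, or a Sunday lands on the following Monday each time. As an alternative to filtering afterwards, the business days can be generated directly. This is a sketch using `pd.bdate_range`, with a fixed start date taken from the question in place of `date.today()` so the result is reproducible.

```python
import pandas as pd

# Fixed start date from the question (a Thursday) instead of date.today()
start = pd.Timestamp("2021-10-07")
end = start + pd.DateOffset(years=1)

# bdate_range yields Monday-Friday dates only, so nothing is repeated
df = pd.DataFrame({"Date": pd.bdate_range(start=start, end=end)})

print(df.head())
print(df.tail())
```

Note that `bdate_range` only skips weekends by default; public holidays would need a custom business-day calendar.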

Return date 260 working days from reference date

I have a daily data dataframe (df) indexed by date - the head is below:
nominal
date
2016-01-04 114185.481138
2016-01-04 17841.990960
2016-01-04 -8799.514730
2016-01-04 0.000000
2016-01-04 -3028.765682
I can find the max date using
maxDate = df.index.max()
How would I find the date 260 working days (1 working year) before this date? How could I go about retrieving the date 260 days ago from the maxDate?
By using BDay:
from pandas.tseries.offsets import BDay
df.index.max()-BDay(260)
Timestamp('2015-01-05 00:00:00')
If I understand correctly that you want to subtract a number of days from a date, it would look like this (note that timedelta counts calendar days, not working days):
import datetime
dat = datetime.datetime(2016, 1, 4)
dd = datetime.timedelta(days=260)
print(dat - dd)
output: 2015-04-19 00:00:00
import pandas as pd
#Only use the following line if the column type for your 'date' column is
# string
df['date'] = pd.to_datetime(df['date'])
(max(df['date']) - pd.tseries.offsets.BDay(260)).strftime('%Y-%m-%d')
#The line above produces:
# '2015-01-06'
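To make the difference between working days and calendar days concrete, the sketch below computes both for the max date shown in the question's sample (2016-01-04), and cross-checks the pandas result against NumPy's `busday_offset`. Neither approach accounts for public holidays unless a holiday list is supplied.

```python
import numpy as np
import pandas as pd
from pandas.tseries.offsets import BDay

# Max date taken from the head of the sample data in the question
max_date = pd.Timestamp("2016-01-04")

# 260 working days (Monday-Friday) earlier, using the pandas offset
via_pandas = max_date - BDay(260)

# The same calculation with NumPy, as a cross-check
via_numpy = np.busday_offset(np.datetime64(max_date.date()), -260, roll="backward")

# 260 calendar days earlier, for comparison
calendar_days = max_date - pd.Timedelta(days=260)

print(via_pandas, via_numpy, calendar_days)
```

The result can then be used to slice the frame, for example by keeping rows whose index is on or after the computed date.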

Add days to date in pandas

I have a data frame that contains 2 columns, one is Date and other is float number.
I would like to add those 2 to get the following:
Index Date Days NewDate
0 20-04-2016 5 25-04-2016
1 16-03-2015 3.7 20-03-2015
As you can see, a fractional number of days is rounded up to the next whole day, e.g. 3.1 --> 4 (days).
I have some weird questions so I appreciate any help.
Thank you !
First, ensure that the Date column is a datetime object:
df['Date'] = pd.to_datetime(df['Date'])
Then, we can round the Days column up to whole days by taking its ceiling and then converting it to a pandas Timedelta:
temp = df['Days'].apply(np.ceil).apply(lambda x: pd.Timedelta(x, unit='D'))
Datetime objects and timedeltas can be added:
df['NewDate'] = df['Date'] + temp
You can convert the Days column to timedelta and add it to Date column:
import numpy as np
import pandas as pd
# pd.np is no longer available in current pandas, so call NumPy directly
df['NewDate'] = pd.to_datetime(df.Date) + pd.to_timedelta(np.ceil(df.Days), unit="D")
df
Using combine to calculate across the two columns and pd.DateOffset to add the days:
df['NewDate'] = df['Date'].combine(df['Days'], lambda x,y: x + pd.DateOffset(days=int(np.ceil(y))))
output:
Date Days NewDate
0 2016-04-20 5.0 2016-04-25
1 2016-03-16 3.7 2016-03-20
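Since the dates in the question are written day first (20-04-2016), it may be safer to state the format explicitly when parsing, so that an ambiguous value such as 05-04-2016 is not read month first. A self-contained sketch of the vectorized approach under that assumption, using the two rows from the question:

```python
import numpy as np
import pandas as pd

# The two rows from the question, with dates as day-first strings
df = pd.DataFrame({"Date": ["20-04-2016", "16-03-2015"], "Days": [5, 3.7]})

# Parse with an explicit day-month-year format to avoid ambiguity
df["Date"] = pd.to_datetime(df["Date"], format="%d-%m-%Y")

# Round fractional days up, convert to timedeltas, and add to the dates
df["NewDate"] = df["Date"] + pd.to_timedelta(np.ceil(df["Days"]), unit="D")
print(df)
```

To show the result in the original layout, `df["NewDate"].dt.strftime("%d-%m-%Y")` gives the dates back as day-first strings.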
