Python: how to auto-pick the last trading day's closing price

Hi, I have created a dataframe with Actual Close, High, and Low, and now I have to calculate the Day-Change, 3-Day-Change, and 2-Week-Change for each row.
With the code below, the Day-Change field shows a Blank/NaN value (the 10/27/2009 D-Chg field). How can I get Python to auto-pick the last trading date's (10/23/2009) AC price for the calculation when the shifted date doesn't exist?
data["D-Chg"] = stock_store['Adj Close'] - stock_store['Adj Close'].shift(1, freq='B')
Thanks with Regards

Format your first column to datetime:
data['Mycol'] = pd.to_datetime(data['Mycol'], format='%d%b%Y:%H:%M:%S.%f')
Get the max value:
last_date = data['date'].max()
Get the most up-to-date row:
is_last = data['date'] == last_date
data[is_last]
This may be done in one step if you pass your desired column to max().
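As for the original question: since the index contains only trading days, a positional shift(1) (without freq) already picks the previous trading day's close, whatever calendar gap precedes it. A minimal sketch with made-up prices (the real data is the question's stock_store; the weekend 10/24-10/25 is absent from the index):

```python
import pandas as pd

# Hypothetical trading-day index: the weekend 10/24-10/25 is missing
idx = pd.to_datetime(["2009-10-22", "2009-10-23", "2009-10-26", "2009-10-27"])
stock_store = pd.DataFrame({"Adj Close": [10.0, 11.0, 12.5, 12.0]}, index=idx)

# shift(1) moves by *row*, so 10/26 is compared against 10/23,
# the last trading day before it -- no NaN from missing calendar dates
stock_store["D-Chg"] = stock_store["Adj Close"] - stock_store["Adj Close"].shift(1)
```

Only the very first row is NaN, because it has no predecessor at all.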

Related

getting previous week highs and lows in pandas dataframe using 30 min data

I have a few sets of days where the index is based on 30-minute data from Monday to Friday. There might be some missing dates (perhaps because of holidays), but I would like to find the highest value from column high and the lowest from column low for every past week. For example, calculating today, the previous week's high and low are the ones marked in yellow in the attached image.
I tried using rolling and resampling, but somehow it's not working. Can anyone help?
You really should add sample data to your question (by that I mean a piece of code/text that can easily be used to create a dataframe for illustrating how the proposed solution works).
Here's a suggestion. With df your dataframe, and column datetime containing datetimes (and not strings):
df["week"] = (
    df["datetime"].dt.isocalendar().year.astype(str)
    + df["datetime"].dt.isocalendar().week.astype(str)
)
mask = df["high"] == df.groupby("week")["high"].transform("max")
df = df.merge(
    df[mask].rename(columns={"low": "high_low"})
            .groupby("week").agg({"high_low": "min"}).shift(),
    on="week", how="left"
).drop(columns="week")
Add a week column to df (year + week) for grouping by week.
Extract the rows with the weekly maximum highs via mask (there could be more than one per week).
Build a corresponding dataframe with the weekly minimum of the lows among those rows (column named high_low), shift it once to get the value from the previous week, and .merge it onto df.
If column datetime doesn't contain datetimes:
df["datetime"] = pd.to_datetime(df["datetime"])
If I have understood correctly, the solution should be:
1. Get the week number from the date.
2. Group by the week number and fetch the max and min values.
3. Group by the week and fetch the max date, to get the max/last date for each week.
4. Merge all the dataframes into one based on the date key.
Once these steps are done, you can do any formatting as required.
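The steps above can be sketched roughly as follows. This is a minimal sketch on made-up 30-minute bars (column names high/low are assumed from the question); grouping on a weekly period plus shift stands in for the explicit merge:

```python
import pandas as pd

# Hypothetical 30-minute bars spanning two weeks, weekends dropped
rng = pd.date_range("2021-06-07 09:30", "2021-06-18 16:00", freq="30min")
rng = rng[rng.dayofweek < 5]
df = pd.DataFrame({"high": range(100, 100 + len(rng)),
                   "low": range(len(rng))}, index=rng)

wk = df.index.to_period("W")                 # weekly bucket for each row
# weekly extremes, shifted one row so each week sees the *previous* week
weekly = df.groupby(wk).agg({"high": "max", "low": "min"}).shift()
df["prev_week_high"] = weekly["high"].reindex(wk).values  # NaN in week 1
df["prev_week_low"] = weekly["low"].reindex(wk).values
```

Every row of the second week carries the first week's high and low; the first week has NaN because there is no prior week in the sample.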

Python Shift by trading date

Hi, I created a daily price list of a stock and now I need to insert another column for the 14-day return. Currently I have the following:
data["2W-Chg"] = stock_store['Adj Close'] - stock_store['Adj Close'].shift(14)
But when checking the shifting, it has actually shifted by more than 14 days (e.g. today: 6/23/2021, shifted to: 6/2/2021). The correct shifted date I'm looking for is 6/09/2021.
Is there any way to shift based on trading date instead of by row?
Thanks
JC
shift has a parameter called freq that, when set, shifts the index of the data according to the given frequency instead of ignoring the time relation. So you can first set your data's index to the date column and then shift with your desired frequency:
# or you can set it when reading the CSV, as you mentioned in the comments
stock_store = stock_store.set_index("Date")
# using `D` as the daily frequency (as you mentioned in the comments)
data["2W-Chg"] = stock_store["Adj Close"] - stock_store["Adj Close"].shift(14, freq="D")
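A minimal sketch of the difference, with made-up dates and prices: shift(14) moves by rows, while shift(14, freq="D") moves the index by 14 calendar days, so the subtraction aligns each date with the date exactly two weeks earlier. Note the result is NaN wherever no trading day exists exactly 14 days before:

```python
import pandas as pd

# Hypothetical daily closes on a trading-day index
idx = pd.to_datetime(["2021-06-09", "2021-06-10", "2021-06-22", "2021-06-23"])
s = pd.Series([100.0, 101.0, 110.0, 112.0], index=idx, name="Adj Close")

# shift with freq moves the index labels, not the rows:
# 2021-06-23 is matched against 2021-06-09, 14 calendar days earlier
chg = s - s.shift(14, freq="D")
```

Here 2021-06-22 gets NaN, since 2021-06-08 is absent from the sample index.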

dataframe groupby date and resample by seconds for daily change price

I have data like below.
I need to get the percentage change of the current resample's (10s) last price compared to the daily open price (00:00:00), like below.
There is more than one compid.
I did something like below, but df_price_curr_last raises an error.
df_t is the data
group = ['compid', df_t['datetime'].date]
df_price_open = df_t.groupby(group)['price'].first().to_frame()
df_price_open
df_price_curr_last = df_t.groupby(group).resample('10S')['price'].last()
df_price_curr_last/df_price_open
Below is the error msg.
ValueError: Key 2020-11-06 00:00:00 not in level Index([2020-11-06, 2020-11-07], dtype='object')
I think you can group by dates and also by a Grouper with 10S, aggregate last, and then group by the first and second levels (compid and date) with GroupBy.transform to repeat the first value, so you can divide both Series:
grouper = ['compid',
           df_t['datetime'].dt.date.rename('date'),
           pd.Grouper(freq='10S', key='datetime')]
df_price_curr_last = df_t.groupby(grouper)['price'].last()
print (df_price_curr_last)
df_price_open = df_price_curr_last.groupby(level=[0,1]).transform('first')
a = df_price_curr_last/df_price_open
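A self-contained sketch of this approach on made-up tick data (one compid, a few seconds of prices). The "daily open" here is the last price of the first 10-second bin, as produced by transform('first'):

```python
import pandas as pd

# Hypothetical tick data for one compid on one day
df_t = pd.DataFrame({
    "compid": ["A"] * 4,
    "datetime": pd.to_datetime(["2020-11-06 00:00:01",
                                "2020-11-06 00:00:05",
                                "2020-11-06 00:00:12",
                                "2020-11-06 00:00:15"]),
    "price": [100.0, 102.0, 103.0, 104.0],
})

grouper = ["compid",
           df_t["datetime"].dt.date.rename("date"),
           pd.Grouper(freq="10s", key="datetime")]
df_price_curr_last = df_t.groupby(grouper)["price"].last()
# repeat the first bin's value across each (compid, date) group
df_price_open = df_price_curr_last.groupby(level=[0, 1]).transform("first")
pct = df_price_curr_last / df_price_open - 1   # change vs. the daily open
```

The first bin's change is 0 by construction; the second bin compares 104.0 against the opening bin's 102.0.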

Python: Date conversion to year-weeknumber, issue at switch of year

I am trying to convert a dataframe column with a date and timestamp to a year-weeknumber format, i.e., 01-05-2017 03:44 = 2017-1. This is pretty easy, however, I am stuck at dates that are in a new year, yet their weeknumber is still the last week of the previous year. The same thing that happens here.
I did the following:
df['WEEK_NUMBER'] = df.date.dt.year.astype(str).str.cat(df.date.dt.week.astype(str), sep='-')
Where df['date'] is a very large column with date and times, ranging over multiple years.
A date which gives a problem is for example:
Timestamp('2017-01-01 02:11:27')
The output of my code will be 2017-52, while it should be 2016-52. Since the data covers multiple years, and week numbers and their corresponding dates change every year, I cannot simply subtract a few days.
Does anybody have an idea of how to fix this? Thanks!
Replace df.date.dt.year by this:
(df.date.dt.year - ((df.date.dt.week > 50) & (df.date.dt.month == 1)))
Basically, it means that you subtract 1 from the year value if the week number is greater than 50 and the month is January.
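An alternative, assuming pandas >= 1.1 where Series.dt.isocalendar() is available: the ISO calendar already pairs each week with its matching ISO year, so no manual correction is needed:

```python
import pandas as pd

df = pd.DataFrame({"date": pd.to_datetime(["2017-01-01 02:11:27",
                                           "2017-01-05 12:00:00"])})

# dt.isocalendar() returns a frame with columns year, week, day;
# 2017-01-01 correctly falls into ISO year 2016, week 52
iso = df["date"].dt.isocalendar()
df["WEEK_NUMBER"] = iso["year"].astype(str) + "-" + iso["week"].astype(str)
```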

Get last date in each month of a time series pandas

Currently I'm generating a DateTimeIndex using a certain function, zipline.utils.tradingcalendar.get_trading_days. The time series is roughly daily but with some gaps.
My goal is to get the last date in the DateTimeIndex for each month.
.to_period('M') & .to_timestamp('M') don't work since they give the last day of the month rather than the last value of the variable in each month.
As an example, if this is my time series I would want to select '2015-05-29' while the last day of the month is '2015-05-31'.
['2015-05-18', '2015-05-19', '2015-05-20', '2015-05-21',
'2015-05-22', '2015-05-26', '2015-05-27', '2015-05-28',
'2015-05-29', '2015-06-01']
Condla's answer came closest to what I needed except that since my time index stretched for more than a year I needed to groupby by both month and year and then select the maximum date. Below is the code I ended up with.
# tempTradeDays is the initial DatetimeIndex
dateRange = []
tempYear = None
dictYears = tempTradeDays.groupby(tempTradeDays.year)
for yr in dictYears.keys():
    tempYear = pd.DatetimeIndex(dictYears[yr]).groupby(pd.DatetimeIndex(dictYears[yr]).month)
    for m in tempYear.keys():
        dateRange.append(max(tempYear[m]))
dateRange = pd.DatetimeIndex(dateRange).sort_values()  # .order() was removed in newer pandas
Suppose your data frame looks like this (image: original dataframe). Then the following code will give you the last day of each month:
df_monthly = df.reset_index().groupby([df.index.year, df.index.month], as_index=False).last().set_index('index')
(image: transformed dataframe)
This one-line code does its job :)
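Made runnable on made-up data (the dates are taken from the question's example list) to show what the one-liner does:

```python
import pandas as pd

# Hypothetical daily values; 05-30/05-31 are absent, as in the question
idx = pd.to_datetime(["2015-05-28", "2015-05-29", "2015-06-01", "2015-06-02"])
df = pd.DataFrame({"price": [1.0, 2.0, 3.0, 4.0]}, index=idx)

# group by (year, month) and take the last row present in each group
df_monthly = (df.reset_index()
                .groupby([df.index.year, df.index.month], as_index=False)
                .last()
                .set_index("index"))
```

This keeps 2015-05-29 and 2015-06-02, the last dates actually present in each month, rather than the calendar month ends.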
My strategy would be to group by month and then select the "maximum" of each group:
If "dt" is your DatetimeIndex object:
last_dates_of_the_month = []
dt_month_group_dict = dt.groupby(dt.month)
for month in dt_month_group_dict:
    last_date = max(dt_month_group_dict[month])
    last_dates_of_the_month.append(last_date)
The list "last_dates_of_the_month" contains all occurring last dates of each month in your dataset. You can use this list to create a DatetimeIndex in pandas again (or whatever else you want to do with it).
This is an old question, but none of the existing answers here is perfect. This is the solution I came up with (assuming that date is a sorted index); it could even be written in one line, but I split it for readability:
month1 = pd.Series(apple.index.month)
month2 = pd.Series(apple.index.month).shift(-1)
mask = (month1 != month2)
apple[mask.values].head(10)
A few notes here:
Shifting a datetime series requires another pd.Series instance (see here)
Boolean mask indexing requires .values (see here)
By the way, when the dates are the business days, it'd be easier to use resampling: apple.resample('BM')
Maybe the answer is not needed anymore, but while searching for an answer to the same question I found what may be a simpler solution:
import pandas as pd
sample_dates = pd.date_range(start='2010-01-01', periods=100, freq='B')
month_end_dates = sample_dates[sample_dates.is_month_end]
Note that is_month_end flags calendar month ends, so a month whose last calendar day falls on a weekend will have no match in a business-day index.
Try this to create a new diff column, where the value 1 marks the change from one month to the next (numpy imported as np):
df['diff'] = np.where(df['Date'].dt.month.diff() != 0, 1, 0)
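Made self-contained with made-up dates: the 1 lands on the first row of each new month (and on the very first row, since diff() is NaN there), so the row just before each later 1 is a month's last date:

```python
import numpy as np
import pandas as pd

# Hypothetical dates straddling a month boundary
df = pd.DataFrame({"Date": pd.to_datetime(
    ["2015-05-28", "2015-05-29", "2015-06-01", "2015-06-02"])})

# month.diff() is nonzero (or NaN) exactly where the month changes
df["diff"] = np.where(df["Date"].dt.month.diff() != 0, 1, 0)
```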
