Python: How to use ffill() and store calculations in if conditions

My code below checks whether a certain signal shows up in some data.
I am trying to implement the solution from:
https://stackoverflow.com/a/50302942/11739577
especially these lines:
data['cor_price'] = data['close'].where((data['signal'] == 1) & (data['positions'] == 1), np.nan)
data['cor_price'] = data['cor_price'].ffill().astype(data['close'].dtype)
data['diff_perc'] = (data['close'] - data['cor_price']) / data['cor_price']
data['positions2'] = np.where(data['diff_perc'] <= -0.05, 1, 0)
This is what the data looks like:
Index DATE NAME PRICE ORDER MACD MACD_SIGNAL MACD_HIST
33 2020-02-18 Close 39.450001 buy -0.473582 -0.775 0.301418
34 2020-02-19 Close 40.610001 buy -0.314391 -0.682879 0.368487
35 2020-02-20 Close 41.18 buy -0.140616 -0.574426 0.43381
36 2020-02-21 Close 39.959999 buy -0.100187 -0.479578 0.379391
37 2020-02-24 Close 38.669998 buy -0.170276 -0.417718 0.247441
38 2020-02-25 Close 38.41 buy -0.24399 -0.382972 0.138982
39 2020-02-26 Close 37.950001 buy -0.335657 -0.373509 0.037852
40 2020-02-27 Close 36.16 buy -0.546443 -0.408096 -0.138347
41 2020-02-28 Close 34.490002 buy -0.838581 -0.494193 -0.344388
42 2020-03-02 Close 33.98 buy -1.098591 -0.615073 -0.483518
43 2020-03-03 Close 34.169998 buy -1.274626 -0.746983 -0.527643
44 2020-03-04 Close 35.060001 buy -1.327023 -0.862991 -0.464032
45 2020-03-05 Close 34.110001 buy -1.428735 -0.97614 -0.452595
46 2020-03-06 Close 32.82 buy -1.595048 -1.099922 -0.495127
47 2020-03-09 Close 29.040001 buy -2.008712 -1.28168 -0.727032
48 2020-03-10 Close 29.200001 buy -2.297153 -1.484774 -0.812378
49 2020-03-11 Close 29.74 buy -2.453883 -1.678596 -0.775287
To give some context:
When the signal (line 1 and line 2 in the code) is triggered, a buy takes place: buy += 1.
If buy == 1, I add the buying price to cor_price and use ffill() to fill the cor_price "column" so that I can calculate diff_perc. As each day goes by, the price, which is index[column][i], changes; diff_perc is the relative difference between the price we bought at and the price on each day after buying.
When diff_perc is <= -0.05, the stop loss is triggered (stop_loss += 1), which means we sell (sell += 1) and no longer hold a buy (buy -= 1).
How can I implement the stop loss?
I can't seem to attach index[column][i] to cor_price and use ffill():
buy = 0
sell = 0
cor_price = 0
diff_perc = 0
stop_loss = 0
temp = 0
if (df["macd_signal"][i+1] < df["macd"][i+1]) != (df["macd_signal"][i] < df["macd"][i]):
    if ((df["macd_signal"][i+1] < df["macd"][i+1]), (df["macd_signal"][i] < df["macd"][i])) == (True, False):
        buy += 1  # we buy as we have spotted a signal
        if sell == 1:  # If we previously have sold but make a buy order, sells should be 0
            sell -= 1
        if stop_loss == 1:  # If we previously sold on stop loss, but we buy again, stop loss should be 0
            stop_loss -= 1
        if buy == 1:  # If we have bought we check if the price is negative, and hits our stop loss meaning we also sell.
            cor_price += index[column][i]
            cor_price = cor_price.ffill().astype(index[column][i].dtype)
            diff_perc += (index[column][i] - cor_price) / cor_price
            stop_loss += np.where(diff_perc <= -0.05, 1, 0)
            buy -= 1
            sell += 1
    else:
        sell += 1  # we sell as we have observed a sell signal
        buy -= 1  # as we now have sold, we have no buy orders - until we buy again.
# append to a list to create a pd.DataFrame later
BUY_SIGNAL.append(buy)
SELL_SIGNAL.append(sell)
DF_CORR.append(cor_price)
DIFF_PER.append(diff_perc)
STOP_LOSS.append(stop_loss)
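One possible way to approach this, sketched under assumptions: inside a row-by-row loop, cor_price is a plain number, so it has no ffill() method. Keeping the entry price in an ordinary variable until the position is closed has the same effect as forward-filling a column. The sketch below uses the first eight prices from the sample data; the buy_signal column (a single buy on the first row), the STOP constant and the in_position flag are hypothetical names introduced only for illustration, not part of the original code.

```python
import numpy as np
import pandas as pd

# First eight closing prices from the sample data in the question.
df = pd.DataFrame({"price": [39.450001, 40.610001, 41.18, 39.959999,
                             38.669998, 38.41, 37.950001, 36.16]})
# Hypothetical buy signal: assume the MACD cross fired on the first row only.
df["buy_signal"] = [True] + [False] * 7

STOP = -0.05            # stop-loss threshold (-5% relative to the entry price)
in_position = False     # True while we hold a bought position
entry_price = np.nan    # carried forward row to row, like ffill() on a column
rows = []

for price, buy_signal in zip(df["price"], df["buy_signal"]):
    # Open a position on a buy signal if we are not already holding one.
    if not in_position and buy_signal:
        in_position = True
        entry_price = price

    # Relative change against the entry price while the position is open.
    diff_perc = (price - entry_price) / entry_price if in_position else np.nan
    stop_loss = bool(in_position and diff_perc <= STOP)

    rows.append((entry_price, diff_perc, in_position, stop_loss))

    # A triggered stop loss closes the position and clears the entry price.
    if stop_loss:
        in_position = False
        entry_price = np.nan

result = df.join(pd.DataFrame(
    rows, columns=["cor_price", "diff_perc", "in_position", "stop_loss"],
    index=df.index))
print(result)
```

In this run the stop loss fires on the last row, where the price of 36.16 is about 8.3% below the entry price of 39.450001. A sell signal could be handled in the same loop by closing the position in the same way.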

Related

Overwrite a value in a pandas dataframe column based on a calculation function applied to it

From the following DataFrame:
worktime = 1440
person = [11,22,33,44,55]
begin_date = '2019-10-01'
shift= [1,2,3,1,2]
pause = [90,0,85,70,0]
occu = [60,0,40,20,0]
time_u = [50,40,80,20,0]
time_a = [84.5,0.0,10.5,47.7,0.0]
time_p = 0
time_q = [35.9,69.1,0.0,0.0,84.4]
df = pd.DataFrame({'date':pd.date_range(begin_date, periods=len(person)),'person':person,'shift':shift,'worktime':worktime,'pause':pause,'occu':occu, 'time_u':time_u,'time_a':time_a,'time_p ':time_p,'time_q':time_q,})
Output:
date person shift worktime pause occu time_u time_a time_p time_q
0 2019-10-01 11 1 1440 90 60 50 84.5 0 35.9
1 2019-10-02 22 2 1440 0 0 40 0.0 0 69.1
2 2019-10-03 33 3 1440 85 40 80 10.5 0 0.0
3 2019-10-04 44 1 1440 70 20 20 47.7 0 0.0
4 2019-10-05 55 2 1440 0 0 0 0.0 0 84.4
I am looking for a function that takes the existing values of certain columns, uses them in a calculation, and then overwrites them with the result.
This concerns the columns time_u, time_a, time_p and time_q, and the calculation should follow this principle:
time_u = worktime - pause - occu - (existing value of time_u)
time_a = (new value of time_u) - time_a
time_p = (new value of time_a) - time_p
time_q = (new value of time_p) - time_q
Is there a possible function that could be used here?
Using this formula manually, the output would look like this:
date person shift worktime pause occu time_u time_a time_p time_q
0 2019-10-01 11 1 1440 90 60 1240 1155.5 1155.5 1119.6
1 2019-10-02 22 2 1440 0 0 1400 1400 1400 1330.9
2 2019-10-03 33 3 1440 85 40 1235 1224.5 1224.5 1224.5
3 2019-10-04 44 1 1440 70 20 1330 1282.3 1282.3 1282.3
4 2019-10-05 55 2 1440 0 0 1440 1440 1440 1355.6
Unfortunately, this task is way beyond my skill level, so any help in setting up the appropriate function would be greatly appreciated.
Many thanks in advance

You can simply apply the relationships you have supplied sequentially. Or are you looking for something else? By the way, you put an extra space at the end of 'time_p'
df['time_u'] = df['worktime'] - df['pause'] - df['occu'] - df['time_u']
df['time_a'] = df['time_u'] - df['time_a']
df['time_p'] = df['time_a'] - df['time_p']
df['time_q'] = df['time_p'] - df['time_q']
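As a quick check, the sequential assignments can be run against the sample values from the question. This is a minimal sketch that assumes the column is named 'time_p' without the trailing space and leaves out the date, person and shift columns, which the calculation does not use. Each line overwrites a column using the value computed on the line before it.

```python
import pandas as pd

# Sample values from the question; scalars are broadcast to every row.
df = pd.DataFrame({
    "worktime": 1440,
    "pause": [90, 0, 85, 70, 0],
    "occu": [60, 0, 40, 20, 0],
    "time_u": [50, 40, 80, 20, 0],
    "time_a": [84.5, 0.0, 10.5, 47.7, 0.0],
    "time_p": 0,
    "time_q": [35.9, 69.1, 0.0, 0.0, 84.4],
})

# Order matters: every step reads the column overwritten by the previous step.
df["time_u"] = df["worktime"] - df["pause"] - df["occu"] - df["time_u"]
df["time_a"] = df["time_u"] - df["time_a"]
df["time_p"] = df["time_a"] - df["time_p"]
df["time_q"] = df["time_p"] - df["time_q"]

print(df.round(1))
```

The printed frame should agree with the manually calculated table in the question.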

Calculating Time Weighted Rate of Return in Python

I'm trying to calculate daily returns using the time weighted rate of return formula:
(Ending Value-(Beginning Value + Net Additions)) / (Beginning value + Net Additions)
My DF looks like:
Account # Date Balance Net Additions
1 9/1/2022 100 0
1 9/2/2022 115 10
1 9/3/2022 117 0
2 9/1/2022 50 0
2 9/2/2022 52 0
2 9/3/2022 40 -15
It should look like:
Account # Date Balance Net Additions Daily TWRR
1 9/1/2022 100 0
1 9/2/2022 115 10 0.04545
1 9/3/2022 117 0 0.01739
2 9/1/2022 50 0
2 9/2/2022 52 0 0.04
2 9/3/2022 40 -15 0.08108
After calculating the daily returns for each account, I want to link all the returns throughout the month to get the monthly return:
((1 + return) * (1 + return)) - 1
The final result should look like:
Account # Monthly Return
1 0.063636
2 0.12432
Through research (and trial and error), I was able to get the output I am looking for but as a new python user, I'm sure there is an easier/better way to accomplish this.
DF["Numerator"] = DF.groupby("Account #")["Balance"].diff() - DF["Net Additions"]
DF["Denominator"] = ((DF["Numerator"] + DF["Net Additions"] - DF["Balance"]) * -1) + DF["Net Additions"]
DF["Daily Returns"] = (DF["Numerator"] / DF["Denominator"]) + 1
DF = DF.groupby("Account #")["Daily Returns"].prod() - 1
Any help is appreciated!
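One possibly simpler route, sketched under the assumption that the rows are already sorted by date within each account: take the previous day's balance with a grouped shift(), add the net additions to get the denominator, and link the daily returns with a grouped product. The names base and monthly are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({
    "Account #": [1, 1, 1, 2, 2, 2],
    "Date": pd.to_datetime(
        ["9/1/2022", "9/2/2022", "9/3/2022"] * 2, format="%m/%d/%Y"),
    "Balance": [100, 115, 117, 50, 52, 40],
    "Net Additions": [0, 10, 0, 0, 0, -15],
})

# Beginning value + net additions; NaN on each account's first day.
base = df.groupby("Account #")["Balance"].shift() + df["Net Additions"]

# (Ending value - (beginning value + net additions)) / (beginning value + net additions)
df["Daily TWRR"] = (df["Balance"] - base) / base

# Link the daily returns per account; the grouped product skips the NaN rows.
monthly = (1 + df["Daily TWRR"]).groupby(df["Account #"]).prod() - 1
monthly = monthly.rename("Monthly Return")

print(df)
print(monthly)
```

This keeps the daily returns as a regular column, so the intermediate Numerator and Denominator columns are not needed.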

Perform a cross column calculation in Python

Context
I am trying to build a portfolio dashboard following this example, only instead of Excel, I am using Python. I am currently not sure how to reproduce the step from 3:47 onwards, where the values are cross-calculated to arrive at the next period's balance.
Problem
Is there a way to conduct this in Python? I tried a for loop but it returned the same number iterated over the number of forward periods. Below is the example:
date_range = pd.date_range(start=today, periods=period_of_investments, freq=contribution_periods)
returns_port = 12
rs = []
balance_total = []
for one in range(len(date_range)):
    return_loss = (returns_port/period_of_investments)*capital_insert
    rs.append(return_loss)
    period_one_balance = capital_insert+return_loss
    period_two_return_loss = (returns_port/period_of_investments)*(period_one_balance + capital_insert)
    period_two_balance = period_one_balance + capital_insert + period_two_return_loss
    balance_total.append(period_two_balance)

I did not watch the video, but I will explain how to write Python code for the following problem, which is similar to the one in the video.
Suppose you want to calculate the return on investment of a fixed monthly deposit for the next 20 years with a fixed interest rate.
The first step is understanding how pd.date_range() works. If you started at the beginning of this month, the whole period would be pd.date_range(start='4-1-2021', periods=240, freq='1m') (240 comes from 20 years of 12 months each). Basically, we are calculating the return at the end of each month.
import pandas as pd
portfolio = pd.DataFrame(columns=['Date', 'Investment', 'Return/Loss', 'Balance'])
interest_rate = 0.121
monthly_deposit = 500
# Month-end dates; recent pandas versions spell this frequency 'ME' instead of 'm'.
dates = pd.date_range(start="3-31-2021", periods=240, freq='1m')
investment = [monthly_deposit]*len(dates)
return_losses = []
balances = []
current_balance = 500
for date in dates:
    # Return earned this month on the balance carried over from the previous month.
    current_return_loss = (interest_rate/12)*current_balance
    return_losses.append(round(current_return_loss,2))
    balances.append(round(current_balance + current_return_loss))
    # Carry the balance forward: add this month's return and the next deposit.
    current_balance += (current_return_loss + monthly_deposit)
portfolio['Date'] = pd.to_datetime(dates)
portfolio['Investment'] = investment
portfolio['Return/Loss'] = return_losses
portfolio['Balance'] = balances
balance_at_end = balances[-1]
print(portfolio.head(10))
print(balance_at_end)
You will get the following result, which is identical to the video:
Date Investment Return/Loss Balance
0 2021-03-31 500 5.04 505
1 2021-04-30 500 10.13 1015
2 2021-05-31 500 15.28 1530
3 2021-06-30 500 20.47 2051
4 2021-07-31 500 25.72 2577
5 2021-08-31 500 31.02 3108
6 2021-09-30 500 36.38 3644
7 2021-10-31 500 41.79 4186
8 2021-11-30 500 47.25 4733
9 2021-12-31 500 52.77 5286
506397

A value that need to be extended in time

I need to sum values of a data frame across different columns:
OUT is the amount invested,
IN is the amount received,
DRAW is the amount taken out.
So OUT is the total invested. A DRAW means that this value was taken out of the investment.
As an example, -100 (line 1) + 100 (line 2, DRAW) means that part of the investment was taken out. In this case, the value received in column IN gives an income of 110 - 100 (both in line 2: one in column IN, the other in DRAW), for a total income of 10 units (10% of the investment = (110-100)/100 = (IN-DRAW)/OUT).
We could also have a DRAW without a return, as in line 12. In this example, from this line on, the income will be calculated on 2 x -200 (-400) + 20 = -380.
After line 5, we have an investment of 2 times -200, -400 in total, and no DRAWs or OUTs until line 12.
My question is: what is the best way to calculate the % in each month based on the OUTs, INs and DRAWs in the whole table?
LINE DATE OUT IN DRAW
1 2020-01-20 -100 - -
2 2020-02-10 - 110 100
3 2020-02-11 -200 - -
4 2020-02-21 - 20 -
5 2020-02-25 -200 - -
6 2020-02-26 -200 - -
7 2020-02-26 - 20 -
8 2020-03-09 - 40 -
9 2020-04-01 - 10 -
10 2020-04-07 - 20 -
11 2020-04-10 - 10 -
12 2020-05-10 - - 20

I came up with a solution. I still don't know whether it is the best one, but it worked fine.
I made a new column that joins OUT and DRAW together (OUTDRAW).
This column was created from the OUT data, and then the NaN gaps were filled with the DRAW values (this works only because the two values never appear on the same line):
df['OUTDRAW'] = df['OUT'].fillna(df['DRAW'])
After that, I filled the remaining NaN with 0 and applied a cumsum to it:
df['OUTDRAW'] = df['OUTDRAW'].fillna(0)
df['OUTDRAW'] = df['OUTDRAW'].cumsum()
This gave me the column:
LINE DATE OUT DRAW OUTDRAW
1 2020-01-20 -100 - -100
2 2020-02-10 - 100 0
3 2020-02-11 -200 - -200
4 2020-02-21 - 20 -200
5 2020-02-25 -200 - -400
6 2020-02-26 -200 - -600
7 2020-02-26 - 20 -600
8 2020-03-09 - 40 -600
9 2020-04-01 - 10 -600
10 2020-04-07 - 20 -600
11 2020-04-10 - 10 -600
12 2020-05-10 - 20 -580
So now we have a column that carries the invested amount forward in time and can be used to calculate the % over time.
Note: If you want to do this by month, first make a new column with the months (be careful with the years), group by it, and then do the cumsum; otherwise your values will be calculated incorrectly.
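The steps above can be sketched as a single runnable example. It assumes the "-" cells and empty cells in the question's table are missing values (NaN), and it chains the two fillna() calls and the cumsum() into one assignment.

```python
import numpy as np
import pandas as pd

nan = np.nan
df = pd.DataFrame({
    "DATE": pd.to_datetime([
        "2020-01-20", "2020-02-10", "2020-02-11", "2020-02-21",
        "2020-02-25", "2020-02-26", "2020-02-26", "2020-03-09",
        "2020-04-01", "2020-04-07", "2020-04-10", "2020-05-10"]),
    "OUT": [-100, nan, -200, nan, -200, -200, nan, nan, nan, nan, nan, nan],
    "IN": [nan, 110, nan, 20, nan, nan, 20, 40, 10, 20, 10, nan],
    "DRAW": [nan, 100, nan, nan, nan, nan, nan, nan, nan, nan, nan, 20],
})

# Take OUT, fall back to DRAW where OUT is missing, treat the rest as 0,
# then accumulate so every row carries the amount still invested.
df["OUTDRAW"] = df["OUT"].fillna(df["DRAW"]).fillna(0).cumsum()

print(df)
```

Assigning the result back to the column avoids relying on inplace=True on a selected column, which recent pandas versions warn about.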

how to subtract within pandas dataframe

I have a question on arithmetic within a dataframe. Please note that each of the below columns in my dataframe are based on one another except for 'holdings'
Here is a shortened version of my dataframe
'holdings' & 'cash' & 'total'
0.0 10000.0 10000.0
0.0 10000.0 10000.0
1000 9000.0 10000.0
1500 10000.0 11500.0
2000 10000.0 12000.0
initial_cap = 10000.0
But here is my problem... the first time I have holdings, the cash is calculated correctly where cash of 10000.0 - holdings of 1000.0 = 9000.0
I need cash to remain at 9000.0 until my holdings goes back to 0.0 again
Here are my calculations
In other words, how would you calculate cash so that it remains at 9000.0 until holdings goes back to 0.0
Here is how I want it to look like
'holdings' & 'cash' & 'total'
0.0 10000.0 10000.0
0.0 10000.0 10000.0
1000 9000.0 10000.0
1500 9000.0 10500.0
2000 9000.0 11000.0
cash = initial_cap - holdings

Let me try to rephrase: you start with an initial capital of 10 and a given sequence of holdings {0, 0, 1, 1.5, 2}, and you want to create a cash variable that is 10 whenever holdings is 0. As soon as holdings increases in an initial period by x, you want cash to be 10 - x until holdings equals 0 again.
If this is correct, this is what I would do (the logic of total and the rest is still unclear to me, but this is what you added at the end, so I focus on this).
PS. Providing code to create your sample data is considered good practice.
df = pd.DataFrame([0, 1, 2, 2, 0, 2, 3, 3], columns=['holdings'])
x = 10
# triggers are when cash is supposed to be zero
triggers = df['holdings'] == 0
# inits are when holdings change for the first time
inits = df.index[triggers].values + 1
df['cash'] = 0
for i in inits:
    # .loc avoids chained assignment, which may silently fail to modify df
    df.loc[i:, 'cash'] = x - df.loc[i, 'holdings']
df.loc[triggers, 'cash'] = 0
df
Out[339]:
holdings cash
0 0 0
1 1 9
2 2 9
3 2 9
4 0 0
5 2 8
6 3 8
7 3 8
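For comparison, here is a loop-free sketch of the same idea in the question's own units. It assumes that cash should equal initial_cap while holdings is 0 and that total is cash + holdings, which matches the desired output in the question. The helper names in_market, entry and entry_size are hypothetical. The first non-zero holding of each run is carried forward with ffill().

```python
import pandas as pd

initial_cap = 10000.0
df = pd.DataFrame({"holdings": [0.0, 0.0, 1000.0, 1500.0, 2000.0]})

# True on rows where something is held.
in_market = df["holdings"] != 0

# True only on the first row of each run of non-zero holdings.
entry = in_market & ~in_market.shift(fill_value=False)

# Keep the holding at each entry row and carry it forward through the run.
entry_size = df["holdings"].where(entry).ffill()

# Outside a run nothing is deducted, so cash returns to initial_cap.
df["cash"] = initial_cap - entry_size.where(in_market, 0)
df["total"] = df["cash"] + df["holdings"]

print(df)
```

Because each run of holdings gets its own entry row, the same code handles several separate buy periods without an explicit loop.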
