Dataframe
(Disregard the two index columns)
   level_0  index  Year  Month  Day     Open     High      Low    Close       Volume   Length   Polarity  Sentiment_Negative  Sentiment_Neutral  Sentiment_Positive  Target_variable  Predicted
0        0      0  2020      1   19  8941.45  9164.36  8620.08  8706.25  3.42173e+10  937.167  0.0884653                   0                  0                   1                0          0
1        1      1  2020      1   18  8927.21   9012.2  8827.33  8942.81  3.23378e+10   1177.5   0.176394                   0                  0                   1                1          1
2        2      2  2020      1   17  8725.21  8958.12  8677.32  8929.04  3.63721e+10     1580   0.216762                   0                  0                   1                0          0
3        3      3  2020      1   16  8812.48  8846.46   8612.1  8723.79   3.1314e+10  1336.33   0.182707                   0                  0                   1                0          0
Description
The value of the target_variable is 1 if today's closing price is greater than yesterday's closing price.
The value of the target_variable is 0 if today's closing price is less than yesterday's closing price.
The predicted value is the output of my classifier.
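The labelling rule described above can be sketched as follows. This is a minimal example on made-up closing prices, assuming the rows are sorted oldest-first (the sample dataframe above is newest-first, so it would need sorting before applying this). The frame name `prices` and its values are hypothetical.

```python
import pandas as pd

# Hypothetical closing prices, oldest row first.
prices = pd.DataFrame({"Close": [100.0, 102.0, 101.0, 105.0]})

# diff() subtracts the previous row, so a positive value means
# today's close is above yesterday's close. The first row has no
# previous day (NaN), which compares as False and is labelled 0.
prices["Target_variable"] = (prices["Close"].diff() > 0).astype(int)

print(prices)
```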
Problem
I need to run some code that tracks how much money is gained if I invest when the classifier tells me to invest
I have started to code this
credit = 10000
for index, row in df.iterrows():
    if row["Predicted"] == 1:
        #print(row["Percentage_diff"])
        credit = credit - 100
        credit = credit + (100 * row["Percentage_diff"])
print(credit)
The idea is that I start off with a balance of 10,000 and invest 100 every time the classifier signals to. The only problem is that I end up losing 8000 credits. Is the code correct and the classifier simply very poor?
Or have I made an error in the code?
I am not a trading expert, so I assume that every day the classifier tells you to trade, you will buy at the opening price and sell at the closing price.
You can start by calculating the percentage of profit or loss when the classifier tells you to trade. You can do that by subtracting the opening price from the closing price and dividing the result by the opening price.
df["perc_diff"] = (df["Close"] - df["Open"])/df["Open"]
Of course, this will be negative when the classifier is wrong. To compute the cumulative profits/losses, all you need to do is iteratively add/subtract your profit/loss to your capital. This means that on a day with a profit/loss percentage of r, if you invest x dollars, your new credit is (1+r)*x. So a simple for loop can do it like this:
credit = 1  # your capital
for label, row in df.iterrows():
    # the capital only changes on days the classifier says to trade
    credit = (1 + row["Predicted"] * row["perc_diff"]) * credit
print(credit)
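The (1+r)*x compounding rule can also be written without an explicit loop. The following is only a sketch on made-up numbers, assuming the whole capital is invested on every signalled day. The frame `demo` and the name `start_capital` are hypothetical.

```python
import pandas as pd

# Hypothetical data: daily open-to-close return and classifier signal.
demo = pd.DataFrame({
    "perc_diff": [0.10, -0.05, 0.02],
    "Predicted": [1, 0, 1],
})

start_capital = 1.0
# On days without a signal the growth factor is exactly 1 (no trade).
growth = 1 + demo["Predicted"] * demo["perc_diff"]
# cumprod() compounds the daily factors, giving the capital after each day.
demo["credit"] = start_capital * growth.cumprod()

print(demo["credit"].iloc[-1])
```

The last value of the `credit` column is the same number the loop would print, and the intermediate values show how the capital evolves day by day.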
Edit to address your updated problem:
If you want to specify an amount to invest rather than all your capital, then you can use this:
credit = 1  # your capital
to_invest = 0.1  # money to invest
for label, row in df.iterrows():
    # update invest
    invest_update = (1 + row["Predicted"] * row["perc_diff"]) * to_invest
    credit = credit - to_invest + invest_update
print(credit)
The two lines inside the loop can be combined into one line:
credit = credit + row["Predicted"] * row["perc_diff"] * to_invest
I think the code is correct, and if you lose, then it is probably due to poor performance from your classifier, but this should be evident from your evaluation of the model (like accuracy and precision metrics). Also, if it is a classifier that is not made for time series (e.g. logistic regression), then it is very reasonable that it performs poorly.
Solution
df["Percentage_diff"] = (df["Close"] - df["Open"])/df["Open"]
credit = 10000
for index, row in df.iterrows():
    if row["Predicted"] == 1:
        #print(row["Percentage_diff"])
        credit = credit - 100
        credit = credit + ((100 * row["Percentage_diff"]) + 100)
print(credit)
This was the solution, thanks to Ahmed.
Starting from an original balance of 10000, every time the classifier signals to invest, I invest 100 dollars at the open and withdraw at the close. The code above calculates the resulting balance.
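Because the stake is fixed at 100 and is returned in full at the close, the final balance is just the starting balance plus the sum of the daily gains and losses on signalled days. A minimal sketch on made-up prices, where the frame `demo` and the names `start_balance` and `stake` are hypothetical:

```python
import pandas as pd

# Hypothetical prices and classifier signals.
demo = pd.DataFrame({
    "Open": [100.0, 100.0, 200.0],
    "Close": [110.0, 90.0, 210.0],
    "Predicted": [1, 0, 1],
})
demo["Percentage_diff"] = (demo["Close"] - demo["Open"]) / demo["Open"]

start_balance = 10000
stake = 100

# Keep the day's return only where the classifier signalled a trade.
traded_return = demo["Percentage_diff"].where(demo["Predicted"] == 1, 0.0)
final_balance = start_balance + (stake * traded_return).sum()

print(final_balance)
```

Here the second day is a losing day, but it does not affect the balance because the classifier did not signal a trade.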
Related
I'm trying to come up with a formula to calculate the average entry/position price to further update my stop loss and take profit.
For example, we opened a BTC buy position with an amount of 1 when the price was 20000.
Later, when the price dropped to 19000, we made another buy using the same amount of 1, "averaging" the position to the middle, so we end up with a position at 19500 with an amount of 2.
Where I'm struggling is what to do if we want to increase the order size at each price.
Say 1 at 20000, 1.5 at 19500, 2 at 19000 and so on.
Or make new buys of the same amount but with a shorter distance between them.
Initial buy at 20000, then 19000, then 19150.
Or combine these two variants.
I use mainly Python and Pandas. Maybe the latter one has some built-in function which I'm not aware of. I checked the official Pandas docs, but found only regular mean function.
Thanks to Yuri's suggestion to look into VWAP, I came up with the following code, which is more advanced and allows you to use different contract/volume sizes and increase/decrease the "distance" between orders.
As an example, I used a BTC price of 20000 and increased the step distance using a 1.1 multiplier, and also increased the volume. It operates in Binance futures terms, where the minimum you can buy is 1 contract for $10.
The idea is to find the sweet spot for order distance, volume, stop loss and take profit while averaging down.
# initial entry price
initial_price = 20000
# bottom price
bottom_price = 0
# enter on every 5% price drop
step = int(initial_price * 0.05)
# 1.1 to increase distance between orders, 0.9 to decrease
step_multiplier = 1.1
# initial volume size in contracts
initial_volume = 1
# volume multiplier, can't be less than 1; fractional volumes are truncated to whole contracts
volume_multiplier = 1.1
# defining empty lists
prices = []
volumes = []
# checking if we are going to use the simple approach with 1 contract volume and no step or volume multiplier
if step_multiplier == 1 and volume_multiplier == 1:
    prices = list(range(initial_price, bottom_price, -step))
    # one volume entry per price, so the average below can be computed
    volumes = [initial_volume] * len(prices)
else:
    # defining current price and volume vars
    curr_price = initial_price
    curr_volume = initial_volume
    # checking if current price is still bigger than the defined bottom price
    while curr_price > bottom_price:
        # adding current price to the list
        prices.append(curr_price)
        # calculating next order price
        curr_price = curr_price - step * step_multiplier
        # adding current volume to the list (whole contracts only)
        volumes.append(int(curr_volume))
        # calculating next order volume (unchanged when the multiplier is 1)
        curr_volume = curr_volume * volume_multiplier
print("Prices:")
for price in prices:
    print(price)
print("Volumes:")
for volume in volumes:
    print(volume)
print("Prices array length", len(prices))
print("Volumes array length", len(volumes))
a = [item1 * item2 for item1, item2 in zip(prices, volumes)]
b = volumes
print("Average position price when price will reach", prices[-1], "is", sum(a) / sum(b))
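The averaging step at the end of the script is a volume-weighted mean, and it can be isolated into a small helper. This is only a sketch; the function name `average_entry_price` is hypothetical and is not part of pandas or of any exchange API.

```python
def average_entry_price(prices, volumes):
    """Volume-weighted average price of a set of fills."""
    if len(prices) != len(volumes):
        raise ValueError("prices and volumes must have the same length")
    total_volume = sum(volumes)
    if total_volume == 0:
        raise ValueError("total volume must be non-zero")
    # Each fill contributes its price in proportion to its size.
    return sum(p * v for p, v in zip(prices, volumes)) / total_volume


# Two equal buys average to the midpoint, as in the question.
print(average_entry_price([20000, 19000], [1, 1]))
# Increasing order sizes pull the average toward the later, lower prices.
print(average_entry_price([20000, 19500, 19000], [1, 1.5, 2]))
```

With equal sizes this reduces to the ordinary mean, which is why the first example lands exactly on 19500.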
I have a dataframe similar to the one below:
Price  return  indicator
    5    0.05          1
    6    0.20         -1
    5   -0.16          1
The indicator is based on the forecasted return for the following day.
What I would like to achieve is a strategy where, when the indicator is +1, I buy the stock at the price on that date/row, and when the indicator is negative, I sell at that price. I would then like to create a new column that represents the value of the portfolio on each day. Assuming I have $1000 to invest, the value of the portfolio should equal the holdings plus the cash amount. I'm assuming that any fraction of a stock can be purchased.
I'm unsure where to start with this one. I tried calculating the buy-and-hold strategy using:
df['Holding'] = df['return'].add(1).cumprod() * 5000
This worked for a buy-and-hold strategy, but modifying it for the new strategy seems difficult.
I tried:
df['HOLDING'] = (df['return'].add(1).cumprod() * 5000 * df['indicator'])
#to get the value of the buy or the sell
#then using
df['HOLDING'] = np.where(df['HOLDING'] >0, df['HOLDING'] , df['HON HOLDING 2']*-1)
#my logic was: if it's positive, it's the value of the stock holding, and if it's negative, it is a cash inflow, so I made it positive as it would be cash.
The issue is that my logic is massively flawed: if the holding is cash, the return shouldn't apply to it. Further, I don't think using cumprod is correct with this strategy.
Has anyone used this strategy before and can offer tips on how to make it work?
Thank you
I'm not sure the returns and prices are in the correct place: they shouldn't really be in the same row if they represent the buying price (presumably yesterday's close) and the daily return (assuming the position was held for the whole day). But anyway...
import pandas as pd
# the data you provided
df = pd.read_csv("Data.csv", header=0)
# an initial starting row (explanation provided)
starting = pd.DataFrame({'Price': [0], 'return': [0], 'indicator': [0]})
# concatenate so starting is first row
df = pd.concat([starting, df]).reset_index(drop=True)
# setting holding to 0 at start (no shares), and cash at 1000 (therefore portfolio = 1000)
# floats are used so that fractional values can be written into these columns later
df[["Holding", "Cash", "Portfolio"]] = [0.0, 1000.0, 1000.0]
# buy/sell is the difference (explanation provided)
df["BuySell"] = df["indicator"].diff()
# simulating every day
# df.loc[i, col] is used for writes, since df[col].iloc[i] = ... is chained assignment
for i in range(1, len(df)):
    # buying
    if df.loc[i, "BuySell"] > 0:
        df.loc[i, "Holding"] += df.loc[i - 1, "Cash"] / df.loc[i, "Price"]
        df.loc[i, "Cash"] = 0
    # selling
    elif df.loc[i, "BuySell"] < 0:
        df.loc[i, "Cash"] = df.loc[i - 1, "Holding"] * df.loc[i, "Price"]
        df.loc[i, "Holding"] = 0
    # holding position
    else:
        df.loc[i, "Cash"] = df.loc[i - 1, "Cash"]
        df.loc[i, "Holding"] = df.loc[i - 1, "Holding"]
    # multiply holding by return (assuming all-in, so holding=0 not affected)
    df.loc[i, "Holding"] *= (1 + df.loc[i, "return"])
    df.loc[i, "Portfolio"] = df.loc[i, "Holding"] * df.loc[i, "Price"] + df.loc[i, "Cash"]
Explanations:
Starting row:
This is needed so that the loop can refer to the previous holdings and cash (it would be more of an inconvenience to add in an if statement in the loop if i=0).
Buy/Sell:
The difference is necessary here: if the indicator changes from buy to sell, the shares are sold (and vice versa). However, if the indicator is the same as in the previous row, there is no change (diff=0), and no shares are bought or sold.
Portfolio:
This is an "equivalent" amount (the amount you would hold if you converted all shares to cash at the time).
Holding:
This is the number of shares held.
NOTE: from what I understood of your question, this is an all-in strategy - there is no percentage in, which has made this strategy more simplistic, but easier to code.
Output:
#Out:
#   Price  return  indicator  Holding  Cash  Portfolio  BuySell
#0      0    0.00          0     0.00  1000     1000.0      NaN
#1      5    0.05          1   210.00     0     1050.0      1.0
#2      6    0.20         -1     0.00  1260     1260.0     -2.0
#3      5   -0.16          1   211.68     0     1058.4      2.0
Hopefully this will give you a good starting point to create something more to your specification and more advanced, such as with multiple shares, or being a certain percentage exposed, etc.
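As one possible variation, here is a sketch that values the portfolio from prices alone and ignores the `return` column, which sidesteps the question of whether price and return belong in the same row. This is a different accounting from the loop above, not a rewrite of it, and the function name `simulate_all_in` and the frame `sample` are hypothetical.

```python
import pandas as pd


def simulate_all_in(frame, start_cash=1000.0):
    """Daily portfolio value for an all-in long/flat strategy.

    Buys with all cash when indicator == 1 and nothing is held,
    sells everything when indicator == -1 and shares are held.
    """
    cash, shares = start_cash, 0.0
    values = []
    for price, signal in zip(frame["Price"], frame["indicator"]):
        if signal == 1 and shares == 0:
            shares, cash = cash / price, 0.0
        elif signal == -1 and shares > 0:
            cash, shares = shares * price, 0.0
        # Mark the position to market at the row's price.
        values.append(cash + shares * price)
    return pd.Series(values, index=frame.index, name="Portfolio")


sample = pd.DataFrame({"Price": [5, 6, 5], "indicator": [1, -1, 1]})
print(simulate_all_in(sample).tolist())
```

Because gains come only from the price moving between a buy and the following sell, cash is never affected by returns, which was the flaw the question identified in the cumprod approach.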
Write a program that calculates and prints the balance of a loan over time. The program must ask user for the following (in order):
The amount of the loan, i.e. principal (float).
The annual interest rate as a fraction of 1.0 (float).
The monthly payment amount that the user has arbitrarily chosen (float; the amount is not calculated).
The term of the loan in years (int).
The loan is principal-and-interest and the program simulates paying a fixed amount off the loan every month. The interest is calculated monthly before the fixed payment.
Note that the program must calculate principal and interest changes monthly but only prints an update every year. For this question, it's fine for the balance to go negative.
Running the program should look exactly like the following:
Principal? 100000
Interest rate? 0.055
Monthly repayment? 1000.00
Term in years? 5
Year Opening Closing
0 100,000.00 93,333.62
1 93,333.62 86,291.20
2 86,291.20 78,851.53
3 78,851.53 70,992.21
4 70,992.21 62,689.55
The output must be formatted to 2 decimal places and aligned as shown. The numbers are 11 characters wide, with comma separators and two decimal places. I wrote the program below, but I am not getting the required output:
principal = float(input('Principal? '))
interest = float(input('Interest rate? '))
payment = float(input('Monthly repayment? '))
Term = int(input('Term in years? '))
payment = 1000
n = 1
p = 0
opening = principal
m_opening = 0
m_closing = 0
rate = 0
closing = 0
i=0
year = 0
m = 0
print(f'Year Opening Closing')
for i in range (Term):
    if year == 0:
        while n<=12:
            rate = opening*interest/12
            closing = opening - (payment-rate)
            opening = closing
            n = n+1
        m = closing
        opening = principal
        print(f' {year:2} {opening:11,.2f} {closing:11,.2f}')
        year = year+1
    else:
        n=1
        while n<=12:
            rate = opening*interest/12
            closing = opening - (payment-rate)
            opening = closing
            n = n+1
        print(f' {year:2} {opening:11,.2f} {closing:11,.2f}')
        year = year+1
My output is coming as below:
Principal? 100000
Interest rate? 0.055
Monthly repayment? 1000
Term in years? 5
Year Opening Closing
0 100,000.00 93,333.62
1 93,333.62 93,333.62
2 86,291.20 86,291.20
3 78,851.53 78,851.53
4 70,992.21 70,992.21
You need to store the year's opening balance separately after the first year; otherwise you overwrite the opening variable every month. You also don't need m = closing and opening = principal. Find the slightly modified working code below.
principal = float(input('Principal? '))
interest = float(input('Interest rate? '))
payment = float(input('Monthly repayment? '))
Term = int(input('Term in years? '))
n = 1
p = 0
opening = principal
m_opening = 0
m_closing = 0
rate = 0
closing = 0
i = 0
year = 0
m = 0
print(f'Year Opening Closing')
for i in range(Term):
    if year == 0:
        while n <= 12:
            rate = opening * interest / 12
            closing = opening - (payment - rate)
            opening = closing
            n = n + 1
        # m = closing
        # opening = principal
        print(f' {year:2} {principal:11,.2f} {closing:11,.2f}')
        year = year + 1
    else:
        n = 1
        # remember the balance at the start of this year before the monthly loop changes it
        yearOpening = closing
        while n <= 12:
            rate = opening * interest / 12
            closing = opening - (payment - rate)
            opening = closing
            n = n + 1
        print(f' {year:2} {yearOpening:11,.2f} {closing:11,.2f}')
        year = year + 1
Hope that helps!
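The same calculation can be written without the special case for year 0 by recording the opening balance at the top of every year. This is a sketch only: the function name `loan_schedule` is hypothetical, the inputs are hard-coded instead of read with input(), and the column spacing in the print statements is a guess, since the exact layout required by the exercise is not recoverable here.

```python
def loan_schedule(principal, annual_rate, payment, years):
    """Return (year, opening, closing) tuples for a fixed monthly payment."""
    rows = []
    balance = principal
    for year in range(years):
        opening = balance  # balance at the start of this year
        for _ in range(12):
            balance += balance * annual_rate / 12  # interest is added first
            balance -= payment                     # then the fixed payment
        rows.append((year, opening, balance))
    return rows


print("Year     Opening     Closing")
for year, opening, closing in loan_schedule(100000, 0.055, 1000, 5):
    print(f"{year:4} {opening:11,.2f} {closing:11,.2f}")
```

Each year's closing balance becomes the next year's opening balance automatically, because `balance` is never reset between years.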
I was wondering if there is a more efficient/cleaner way of doing the following. Say I have a dataframe that contains 2 columns: the percentage (based on the previous price) and the action, play/buy (1) or not play/sell (-1). It's basically about stocks.
For simplicity, consider the example df:
Percent  Action
   1.25       1
   1.20       1
   0.50      -1
   0.75       1
I would like to generate the following. I only care about the final money amount, I am just showing this table for reference. Say we started with $100 and a state of not playing. Thus we should get the money amount of:
Playing  Percent  Action  Money
No          1.25       1  $100
Yes         1.20       1  $120
Yes         0.50      -1  $60
No          0.75       1  $60
Yes          ...     ...  ...
The amount didn't change in the first row since we weren't playing yet. Since the action is 1, we will play the next one. The percentage went up 20%, thus we get $120. The next action is still a 1, so we'll still be in for the next one. The percentage went down to 50%, so we end up with $60. The next action is -1, thus we will not play. The percentage went down to 75%, but since we weren't playing, our money stayed the same. And so on.
Currently, I have the code below. It works fine, but I am wondering if there is a more efficient way using numpy/pandas functions. Mine basically iterates through each row and calculates the value.
playing = False
money = 10000
for index, row in df.iterrows():
    ## UPDATE MONEY IF PLAYING
    if index > 0 and playing == True:
        money = float(format(money*row['Percent'],'.2f'))
    ## BUY/SELL
    if row['Action'] == 1:
        if playing == False:
            playing = True ## Buy, playing after this
    elif row['Action'] == -1:
        if playing == True:
            playing = False ## Sell, not playing after this
You could try this:
# decide whether to play based on action
df['Playing'] = df.Action.shift().eq(1)
# replace Percent for not playing row with 1 and then calculate the cumulative product
df['Money'] = '$' + df.Percent.where(df.Playing, 1).cumprod().mul(100).astype(str)
df
#   Percent  Action  Playing   Money
#0     1.25       1    False  $100.0
#1     1.20       1     True  $120.0
#2     0.50      -1     True   $60.0
#3     0.75       1    False   $60.0
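If only the final amount matters, as the question says, the running column can be skipped and the factors multiplied directly. A sketch under the same assumption as the answer above, namely that a row is "played" exactly when the previous row's Action was 1; the names `trades`, `start_money` and `final_money` are hypothetical.

```python
import pandas as pd

# The example data from the question.
trades = pd.DataFrame({
    "Percent": [1.25, 1.20, 0.50, 0.75],
    "Action": [1, 1, -1, 1],
})

start_money = 100
# A row counts only if the previous row's action was "play".
playing = trades["Action"].shift().eq(1)
# Rows that are not played contribute a neutral factor of 1.
final_money = start_money * trades["Percent"].where(playing, 1).prod()

print(final_money)
```

Unlike the loop in the question, this does not round to two decimals after every step, so results can differ by small rounding amounts on long series.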
I'm trying to figure out how to get the interest and principal to display correctly over the years. Here is the part of my code I am having trouble with:
print ('Luke\n-----')
print ('Year\tPrincipal\tInterest\t Total')
LU_RATE = .05
YEAR = 1
Principal = 100
for YEAR in range (1,28):
    # Calculating Luke's total using formula for compounding interest
    Lu_Total = (Principal * ((1 + LU_RATE) ** YEAR))
    # I realize it's a logical error occurring somewhere here
    Lu_Interest = #I'm not sure what to code here
    Lu_Principal = #And here
    # Displaying the Principal, Interest, and Total over the 27
    print (YEAR,'\t%.02f\t\t %.02f\t\t %.02f' %(Lu_Principal, Lu_Interest, Lu_Total))
This is what gets displayed (minus the comment symbols of course):
Luke
-----
Year    Principal       Interest        Total
1       #               #               105.00
2       #               #               110.25
3       #               #               115.76
4       #               #               121.55
5       #               #               127.63
6       #               #               134.01
#etc etc....
Every equation I've tried to code had the correct Interest for year one but ends up putting the Principal as the Total. Every year past that calculates out to the wrong numbers.
It should look like:
Luke
-----
Year    Principal       Interest        Total
1       100.00          5.00            105.00
2       105.00          5.25            110.25
3       110.25          5.51            115.76
#etc etc....
I've been working at it on and off throughout the day and just can't seem to figure it out. Thank you in advance for any help or suggestions.
This sounds like homework, so I'll be a little vague:
You have a loop. Your program executes from the top of the loop to the bottom of the loop, and then goes back and starts over at the top of the loop again.
You can change things by setting values in the bottom of the loop that will be used in the top of the loop next time.
For example, you can compute the interest based on this year's principal. You're doing that in the top of the loop.
At the bottom of the loop, after you print everything out for this year, you could change the (next year's) principal by adding (this year's) interest to it. Then 100 would become 105, etc.
And another contestant ;-)
print ('Luke\n-----')
print ('Year\tPrincipal\tInterest\t Total')
rate = .05
principal = 100.
for year in range (1, 28):
    # calculate interest and total
    interest = principal * rate
    total = principal + interest
    # displaying this year's values
    print(year,'\t%.02f\t\t %.02f\t\t %.02f' %(principal, interest, total))
    # next year's principal == this year's total
    principal = total
produces
Luke
-----
Year    Principal       Interest        Total
1       100.00          5.00            105.00
2       105.00          5.25            110.25
3       110.25          5.51            115.76
4       115.76          5.79            121.55
# ... etc ...
Here is what I did:
print ('Luke\n-----')
print ('Year\tPrincipal\tInterest\t Total')
LU_RATE = .05
YEAR = 1
Principal = 100
Prev_Principal = 100 #added to store previous year principal
for YEAR in range (1,28):
    # Calculating Luke's total using formula for compounding interest
    Lu_Total = (Principal * ((1 + LU_RATE) ** YEAR))
    Lu_Interest = Lu_Total - Prev_Principal
    Lu_Principal = Lu_Total - Lu_Interest
    Prev_Principal = Lu_Total
    # Displaying the Principal, Interest, and Total over the 27
    print (YEAR,'\t%.02f\t\t %.02f\t\t %.02f' %(Lu_Principal, Lu_Interest, Lu_Total))
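For comparison, each row can also be computed directly from the compound-interest formula, with no value carried between iterations: the principal at the start of year N is the original principal times (1 + rate) ** (N - 1). A minimal sketch, where the function name `yearly_rows` is hypothetical:

```python
def yearly_rows(principal, rate, years):
    """Return (year, opening principal, interest, total) for each year."""
    rows = []
    for year in range(1, years + 1):
        # Balance at the start of the year, after (year - 1) compoundings.
        opening = principal * (1 + rate) ** (year - 1)
        interest = opening * rate
        rows.append((year, opening, interest, opening + interest))
    return rows


print("Year\tPrincipal\tInterest\tTotal")
for year, opening, interest, total in yearly_rows(100, 0.05, 3):
    print(f"{year}\t{opening:.2f}\t\t{interest:.2f}\t\t{total:.2f}")
```

This gives the same rows as the running-total versions above and can serve as a cross-check for them.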
There may be another way to do this, but I think you have a few issues. One is that you need to base your "total" calculation (where you're multiplying the principal by the 1+rate ** year) on the original principal value, and you need to keep this value separate from the rest of the calculations.
So you can work with two names like p0 and pN, where p0 represents the initial principal at year 0, and pN represents the original principal PLUS accrued interest at year N, then we reassign pN at the end of the loop.
r = .05
p0 = 100
# pN starts out equal to the initial principal; p0 must be defined before it is used
pN = p0
for y in range(1, 5):
    total = p0 * ((1 + r) ** y)
    i = total - pN
    print (y,'\t%.02f\t\t %.02f\t\t %.02f' %(pN, i, total))
    pN = total
The output is as you expect: