Calculate average asset price when using netting instead of hedging - python

I'm trying to come up with a formula to calculate the average entry/position price, so that I can then update my stop loss and take profit.
For example, I opened a BTC buy position with an amount of 1 when the price was 20000.
Later, when the price dropped to 19000, I made another buy with the same amount of 1, "averaging" the position to the middle, so I end up with a position at 19500 and an amount of 2.
Where I'm struggling is what to do if I want to increase the order size at each price level.
Say 1 at 20000, 1.5 at 19500, 2 at 19000, and so on.
Or make new buys of the same amount but with a shorter distance between them.
Initial buy at 20000, then 19000, then 19150.
Or combine these two variants.
I mainly use Python and Pandas. Maybe the latter has some built-in function I'm not aware of. I checked the official Pandas docs, but found only the regular mean function.
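In other words, I think I'm after a volume-weighted average of my entries. A minimal sketch of what I mean, using plain NumPy (np.average accepts weights; the ladder values are just the example numbers above):

import numpy as np

# example ladder: 1 contract at 20000, 1.5 at 19500, 2 at 19000
prices = [20000, 19500, 19000]
volumes = [1, 1.5, 2]

# volume-weighted average entry price: sum(price * volume) / sum(volume)
avg_entry = np.average(prices, weights=volumes)
print(avg_entry)  # ~19388.89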

Thanks to Yuri's suggestion to look into VWAP, I came up with the following code, which is more advanced and lets you use different contract/volume sizes and increase/decrease the "distance" between orders.
As an example I used an average BTC price of 20000 and increased the step distance using a 1.1 multiplier, as well as increasing the volume. I'm working in Binance futures terms, where the minimum you can buy is 1 contract for $10.
The idea is to find the sweet spot for order distance, volume, stop loss and take profit while averaging down.
# initial entry price
initial_price = 20000
# bottom price
bottom_price = 0
# enter on every 5% price drop
step = int(initial_price * 0.05)
# 1.1 to increase distance between orders, 0.9 to decrease
step_multiplier = 1.1
# initial volume size in contracts
initial_volume = 1
# volume multiplier; shouldn't be less than 1, float volumes are truncated to whole contracts
volume_multiplier = 1.1

# defining empty lists
prices = []
volumes = []

# simple approach: constant step and a fixed 1-contract volume, no step or volume multiplier
if step_multiplier == 1 and volume_multiplier == 1:
    prices = list(range(initial_price, bottom_price, -step))
    volumes = [initial_volume] * len(prices)
else:
    # defining current price and volume vars
    curr_price = initial_price
    curr_volume = initial_volume
    # checking if the current price is still above the defined bottom price
    while curr_price > bottom_price:
        # adding the current price and volume to the lists
        prices.append(curr_price)
        volumes.append(int(curr_volume))
        # calculating the next order price
        curr_price = curr_price - step * step_multiplier
        # calculating the next order volume
        curr_volume = curr_volume * volume_multiplier

print("Prices:")
for price in prices:
    print(price)
print("Volumes:")
for volume in volumes:
    print(volume)
print("Prices array length", len(prices))
print("Volumes array length", len(volumes))

# volume-weighted average price of the whole ladder: sum(price * volume) / sum(volume)
a = [price * volume for price, volume in zip(prices, volumes)]
b = volumes
print("Average position price when price reaches", prices[-1], "is", sum(a) / sum(b))

Related

Buy/sell strategy with indicators?

I have a dataframe similar to the one below:

Price  return  indicator
5      0.05    1
6      0.20    -1
5      -0.16   1
Where the indicator is based upon the forecasted return on the following day.
What I would like to achieve is a strategy where, when the indicator is positive 1, I buy the stock at the price on that date/row. Then, if the indicator is negative, we sell at that price. I would then like to create a new column which represents the value of the portfolio on each day. Assuming I have $1000 to invest, the value of the portfolio should equal the holdings plus the cash amount. I'm assuming that any fraction of a stock can be purchased.
I'm unsure where to start with this one. I tried calculating the buy-and-hold strategy using:
df['Holding'] = df['return'].add(1).cumprod() * 5000
This worked for a buy-and-hold strategy, but modifying it for the new strategy seems difficult.
I tried:
df['HOLDING'] = (df['return'].add(1).cumprod() * 5000 * df['indicator'])
# to get the value of the buy or the sell
# then using
df['HOLDING'] = np.where(df['HOLDING'] > 0, df['HOLDING'], df['HON HOLDING 2'] * -1)
# my logic was: if it's positive it's the value of the stock holding, and if it's negative it's a cash inflow, so I made it positive as it would be cash
The issue is that my logic is massively flawed: if the holding is cash, the return shouldn't apply to it. Further, I don't think using cumprod is correct with this strategy.
Has anyone used this strategy before and can offer tips on how to make it work?
Thank you.
I'm not sure the returns and prices are in the correct place (they shouldn't really be in the same row if they represent the buying price, presumably yesterday's close, and the daily return, assuming the position was held for the whole day). But anyway...
import pandas as pd

# the data you provided
df = pd.read_csv("Data.csv", header=0)
# an initial starting row (explanation provided below)
starting = pd.DataFrame({'Price': [0], 'return': [0], 'indicator': [0]})
# concatenate so starting is the first row
df = pd.concat([starting, df]).reset_index(drop=True)
# setting Holding to 0 at the start (no shares) and Cash to 1000 (therefore Portfolio = 1000)
df[["Holding", "Cash", "Portfolio"]] = [0.0, 1000.0, 1000.0]
# buy/sell is the difference of the indicator (explanation provided below)
df["BuySell"] = df["indicator"].diff()
# simulating every day (using .loc rather than chained .iloc assignment, which pandas may silently ignore)
for i in range(1, len(df)):
    # buying
    if df.loc[i, "BuySell"] > 0:
        df.loc[i, "Holding"] += df.loc[i - 1, "Cash"] / df.loc[i, "Price"]
        df.loc[i, "Cash"] = 0
    # selling
    elif df.loc[i, "BuySell"] < 0:
        df.loc[i, "Cash"] = df.loc[i - 1, "Holding"] * df.loc[i, "Price"]
        df.loc[i, "Holding"] = 0
    # holding position
    else:
        df.loc[i, "Cash"] = df.loc[i - 1, "Cash"]
        df.loc[i, "Holding"] = df.loc[i - 1, "Holding"]
    # multiply the holding by the return (assuming all-in, so Holding = 0 is not affected)
    df.loc[i, "Holding"] *= (1 + df.loc[i, "return"])
    df.loc[i, "Portfolio"] = df.loc[i, "Holding"] * df.loc[i, "Price"] + df.loc[i, "Cash"]
Explanations:
Starting row:
This is needed so that the loop can refer to the previous holdings and cash (it would be more of an inconvenience to add an if statement in the loop for i = 0).
Buy/Sell:
The difference of the indicator is what matters here: if the position changes from sell to buy, shares are bought, and if it changes from buy to sell, the shares are sold. If the previous row carries the same signal as the current row, there is no change (diff = 0) and no shares are bought or sold (see the short example after these notes).
Portfolio:
This is an "equivalent" amount (the amount you would hold if you converted all shares to cash at the time).
Holding:
This is the number of shares held.
NOTE: from what I understood of your question, this is an all-in strategy - there is no partial exposure - which makes the strategy simplistic but easier to code.
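For reference, a tiny standalone illustration of how diff turns the indicator into trade signals (the values are chosen to match the sample data):

import pandas as pd

indicator = pd.Series([0, 1, -1, 1])
print(indicator.diff())
# 0    NaN   <- starting row, no signal
# 1    1.0   <- went from 0 to 1: buy
# 2   -2.0   <- went from 1 to -1: sell
# 3    2.0   <- went from -1 to 1: buy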
Output:
#Out:
#    Price  return  indicator  Holding  Cash  Portfolio  BuySell
# 0      0    0.00          0     0.00  1000     1000.0      NaN
# 1      5    0.05          1   210.00     0     1050.0      1.0
# 2      6    0.20         -1     0.00  1260     1260.0     -2.0
# 3      5   -0.16          1   211.68     0     1058.4      2.0
Hopefully this will give you a good starting point to create something more to your specification and more advanced, such as with multiple shares, or being a certain percentage exposed, etc.

Pandas dataframe trading gain/loss

Dataframe
(Disregard the two index columns)
   level_0  index  Year  Month  Day     Open     High      Low    Close       Volume   Length   Polarity  Sentiment_Negative  Sentiment_Neutral  Sentiment_Positive  Target_variable  Predicted
0        0      0  2020      1   19  8941.45  9164.36  8620.08  8706.25  3.42173e+10  937.167  0.0884653                   0                  0                   1                0          0
1        1      1  2020      1   18  8927.21   9012.2  8827.33  8942.81  3.23378e+10   1177.5   0.176394                   0                  0                   1                1          1
2        2      2  2020      1   17  8725.21  8958.12  8677.32  8929.04  3.63721e+10     1580   0.216762                   0                  0                   1                0          0
3        3      3  2020      1   16  8812.48  8846.46   8612.1  8723.79   3.1314e+10  1336.33   0.182707                   0                  0                   1                0          0
Description
The value of the target_variable is 1 if today's closing price is greater than yesterday's closing price.
The value of the target_variable is 0 if today's closing price is less than yesterday's closing price.
The predicted value is the output of my classifier.
Problem
I need to run some code that tracks how much money is gained if I invest whenever the classifier tells me to invest.
I have started to code this:
credit = 10000
for index, row in df.iterrows():
    if row["Predicted"] == 1:
        #print(row["Percentage_diff"])
        credit = credit - 100
        credit = credit + (100 * row["Percentage_diff"])
print(credit)
The idea is that I start off with a balance of 10,000 and invest 100 every time the classifier signals to. The only problem is that I lose 8,000 credits. Is the code correct and the classifier simply very poor?
Or have I made an error in the code?
I am not a trading expert, so I assume that on every day the classifier tells you to trade, you buy at the opening price and sell at the closing price.
You can start by calculating the percentage profit or loss on the days the classifier tells you to trade. You can do that by subtracting the opening price from the closing price and dividing by the opening price:
df["perc_diff"] = (df["Close"] - df["Open"]) / df["Open"]
Of course, this will be negative when the classifier is wrong. To compute the cumulative profits/losses, you iteratively add/subtract your profit/loss to/from your capital. This means that on a day with a profit/loss percentage of r, if you invest x dollars, your money becomes (1 + r) * x. So a simple for loop can do it like this:
credit = 1  # your capital
for label, row in df.iterrows():
    credit = (1 + row["Predicted"] * row["perc_diff"]) * credit
print(credit)
Edit to address your updated problem:
If you want to specify an amount to invest rather than all your capital, then you can use this:
credit = 1       # your capital
to_invest = 0.1  # money to invest
for label, row in df.iterrows():
    # update the invested amount
    invest_update = (1 + row["Predicted"] * row["perc_diff"]) * to_invest
    credit = credit - to_invest + invest_update
print(credit)
The last two lines can be combined into one line:
credit = credit + row["Predicted"] * row["perc_diff"] * to_invest
I think the code is correct, and if you lose, then it is probably due to poor performance from your classifier, but this should be evident from your evaluation of the model (like accuracy and precision metrics). Also, if it is a classifier that is not made for time series (e.g. logistic regression), then it is very reasonable that it performs poorly.
Solution
df["Percentage_diff"] = (df["Close"] - df["Open"])/df["Open"]
credit = 10000
for index, row in df.iterrows():
if row["Predicted"] == 1:
#print(row["Percentage_diff"])
credit = credit - 100
credit = credit + ((100 * row["Percentage_diff"]) + 100)
print(credit)
This was the solution, thanks to Ahmed.
If I start with an original balance of 10000 and, every time the classifier signals to invest, I invest 100 dollars at the open and withdraw it at the close, this calculates the resulting balance.
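The same loop can also be written as a single vectorized Pandas expression (a sketch assuming, as in the sample data, that Predicted only takes the values 0 and 1):

# net effect of each signalled trade is 100 * Percentage_diff, so just sum them
credit = 10000 + (100 * df["Predicted"] * df["Percentage_diff"]).sum()
print(credit)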

zipline: target_order not executed in handle_data

I'm trying to develop a monthly rotational trading strategy with Zipline and data from the Quandl bundle.
The strategy is supposed to hold a number ("topn") of assets with the highest momentum score and keep them until they drop below a certain momentum rank ("keepn").
When I run the following code through zipline, it works for a couple of months, then suddenly starts holding more and more positions, selling the same positions repeatedly without actually removing the positions from the portfolio. This happens with Quandl data as well as with a custom bundle.
I'm guessing there's a fundamental flaw in my strategy, but even after debugging I really can't find it.
Any help is appreciated!
Thank you.
Dirk
from zipline.api import (symbol, schedule_function, date_rules, time_rules,
                         order_target_percent, get_datetime, set_max_leverage)

def initialize(context):
    # List of all assets to choose from
    context.tickers = ["AAPL", "YELP", "YHOO", "MMM",
                       "ABT", "AMD", "AMZN", "GOOG",
                       "AXP", "AMGN", "BBY", "BLK",
                       "CAT"]
    context.universe = [symbol(ticker) for ticker in context.tickers]
    context.momentum_lookback = 256
    # Hold (topn) 3 assets, as long as they are in the (keepn) top 5 momentum rank
    context.topn = 3
    context.keepn = 5
    # Schedule the trading routine for once per month
    schedule_function(handle_data, date_rules.month_start(), time_rules.market_close())
    # Allow no leverage
    set_max_leverage(1.0)

def momentum_score(ts):
    # Simplified momentum score: last price / price 256 days ago
    return ts[-1] / ts[0]

def handle_data(context, data):
    # String with today's date for logging purposes
    today = get_datetime().date().strftime('%d/%m/%Y')
    # Create 256 days (context.momentum_lookback) of history for all equities
    hist = data.history(context.universe,
                        "close",
                        context.momentum_lookback,
                        "1d")
    # How much to hold of each equity
    target_percent = 100 / context.topn
    # Rank assets by momentum score
    ranking_table = hist.apply(momentum_score).sort_values(ascending=False)
    top_assets = ranking_table[:context.topn]
    grace_assets = ranking_table[:context.keepn]
    # List of equities being held in the current portfolio
    kept_positions = list(context.portfolio.positions.keys())
    # Sell logic
    # ==========
    # Sell current holdings no longer in the grace assets
    for holding in context.portfolio.positions:
        if holding not in grace_assets:
            if data.can_trade(holding):
                print(today + " [Sell] " + holding.symbol)
                order_target_percent(holding, 0.0)
                kept_positions.remove(holding)
    # Buy Logic
    # =========
    # Determine how many new assets to buy
    replacements = context.topn - len(kept_positions)
    # Remove currently held positions from the top list
    buy_list = ranking_table.loc[~ranking_table.index.isin(kept_positions)][:replacements]
    # Buy new equities and rebalance "kept_positions" to the desired weights
    new_portfolio = list(buy_list.index) + kept_positions
    # Buy/rebalance assets
    for asset in new_portfolio:
        if data.can_trade(asset):
            print(today + " [BUY] " + asset.symbol)
            order_target_percent(asset, target_percent)
Ok, so I figured out what the problem is. Basic math failure on my end.
This is the troublesome code:
# How much to hold of each equity
target_percent = 100 / context.topn
It should have been target_percent = 1 / context.topn instead - order_target_percent expects a fraction of the portfolio (roughly 0.33 here), not a percentage. facepalm
I'm assuming this leads to situations in which orders aren't filled properly, leading to the described behavior.
Lessons learned:
Check for open orders and cancel them, if needed
Keep an eye on leverage and position sizes and check against restrictions during the algo run
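Not from the original post, but a minimal sketch of the first lesson, assuming zipline's get_open_orders and cancel_order from zipline.api: cancel any still-open orders before placing the monthly rebalance orders so that stale orders cannot stack up.

from zipline.api import get_open_orders, cancel_order

def cancel_stale_orders():
    # get_open_orders() returns a dict mapping each asset to its list of open orders
    for asset, open_orders in get_open_orders().items():
        for order in open_orders:
            print("Cancelling open order for " + asset.symbol)
            cancel_order(order)

Calling this at the top of the scheduled rebalance function, before any order_target_percent calls, would cover the first point above.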

linear programming problem using python scipy minimize

I am trying to optimise, using python scipy.optimize.minimize, the calorie intake of a person using available food items and sticking to a budget.
The problem statement is: there are n food items, each available in a different quantity. Their prices change every day. Each has a different nutrition value, which decreases every day. I need to buy food over a month so that the total nutrition is as close as possible to my target, while spending exactly my monthly budget.
# df_available has one row per item with its available quantity at the beginning of the month
# df_bought has initial guesses for the purchase of each item on each day of the month,
#   based on a prorated allotment of my total budget to each item on each day
# df_price has the price of each item on each day of the month
# df_nutrition['nutrition'] has the initial nutrition value per unit; it decreases by
#   1 unit each year, so by 1/365 each day
# strt is the start date of the month
# tot_nutrition is the total nutrition target for the month
# tot_budget is my monthly budget

def obj_func():
    return (df_bought.sum() * (df_nutrition['nutrition'] - strt / 365).sum())

# constraint 1 makes sure that I spend exactly my budget
def constraint1():
    return ((df_bought * df_price).sum().sum() - tot_budget)

cons1 = {'type': 'eq', 'fun': constraint1}

# constraint 2 makes sure that I don't buy more than the available quantity of any item
def constraint2():
    return df_available - df_bought

cons2 = {'type': 'ineq', 'fun': constraint2}
cons = [cons1, cons2]

# bounds ensure that I don't buy a negative quantity of any item
bnds = (0, None)

res = minimize(obj_func, df_bought_nominal_m, bounds=bnds, constraints=cons)
print(res)
For the output, I would like the df_bought values to be adjusted so that obj_func is minimised.
The code gives the following error:
TypeError: constraint1() takes no arguments (1 given)
Apologies if I am making any rookie mistakes. I am new to Python and couldn't find appropriate help online for this problem.
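The error itself points at the cause: scipy.optimize.minimize calls the objective and every constraint function with the flat decision vector as their first argument, so they must accept it, and bounds must be one (min, max) pair per variable. A minimal sketch with simplified, made-up data (a single day, three items, hypothetical prices and nutrition values rather than the poster's DataFrames):

import numpy as np
from scipy.optimize import minimize

# hypothetical single-day instance: x[i] = quantity of item i to buy
prices = np.array([2.0, 3.5, 1.25])      # price per unit
nutrition = np.array([5.0, 8.0, 2.0])    # nutrition per unit
available = np.array([10.0, 4.0, 20.0])  # available quantity of each item
tot_budget = 30.0
tot_nutrition = 90.0

# objective: squared distance between achieved and target nutrition
def obj_func(x):
    return (nutrition @ x - tot_nutrition) ** 2

# equality constraint: spend exactly the budget
def constraint1(x):
    return prices @ x - tot_budget

# inequality constraint (every element must be >= 0): stay within availability
def constraint2(x):
    return available - x

cons = [{'type': 'eq', 'fun': constraint1},
        {'type': 'ineq', 'fun': constraint2}]
bnds = [(0, None)] * len(prices)  # one (min, max) pair per variable

x0 = np.zeros(len(prices))        # initial guess
res = minimize(obj_func, x0, bounds=bnds, constraints=cons)
print(res.x, res.fun)

To keep the per-item, per-day structure of the original DataFrames, df_bought would be flattened (for example with .to_numpy().ravel()) before being passed as the initial guess, and reshaped back inside the objective and constraint functions.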

Plotting (discrete sum over time period) vs. (time period) yields graph with discontinuities

I have some lists related to buying and selling bitcoin.
One is the price (of a buy or sell) and the other is an associated date.
When I plot the total money made (or lost) from my buying/selling over different lengths of time vs. those lengths of time, the result is 'choppy' - not what I expected - and I think my logic might be wrong.
My raw input lists look like:
dates=['2013-05-12 00:00:00', '2013-05-13 00:00:00', '2013-05-14 00:00:00', ....]
prices=[114.713, 117.18, 114.5, 114.156,...]
#simple moving average of prices calced over a short period
sma_short_list = [None, None, None, None, 115.2098, 116.8872, 118.2272, 119.42739999999999, 121.11219999999999, 122.59219999999998....]
#simple moving average of prices calced over a longer period
sma_long_list = [...None, None, None, None, 115.2098, 116.8872, 118.2272, 119.42739999999999, 121.11219999999999, 122.59219999999998....]
Based on the moving-average crossovers (calculated based on https://stackoverflow.com/a/14884058/2089889), I will either buy or sell the bitcoin at the date/price where the crossover occurred.
I wanted to plot (how much money this approach would have made me as of today) vs. (how many days ago I started the approach).
BUT
I am having trouble in that the resulting graph is really choppy. At first I thought this was because I have one more buy than sell (or vice versa), so I tried to account for that, but it was still choppy. NOTE: the following code is called in a loop, for days_ago in reversed(range(0, approach_started_days_ago)):, so each time it executes it should spit out how much money the approach would have made had I started it days_ago days back (I call this bank), and the choppy plot is days_ago vs. bank.
dates = data_dict[file]['dates']
prices = data_dict[file]['prices']
sma_short_list = data_dict[file]['sma'][str(sma_short)]
sma_long_list = data_dict[file]['sma'][str(sma_long)]
prev_diff = 0
bank = 0.0
buy_amt, sell_amt = 0.0, 0.0
buys, sells, amt, first_tx_amt, last_tx_amt = 0, 0, 0, 0, 0
start, finish = len(dates) - days_ago, len(dates)
for j in range(start, finish):
    diff = sma_short_list[j] - sma_long_list[j]
    amt = prices[j]
    # If a crossover of the moving averages occurred
    if diff * prev_diff < 0:
        if first_tx_amt == 0:
            first_tx_amt = amt
        # BUY
        if diff >= 0 and prev_diff <= 0:
            buys += 1
            bank = bank - amt
            #buy_amt = buy_amt + amt
            #print('BUY ON %s (PRICE %s)' % (dates[j], prices[j]))
        # SELL
        elif diff <= 0 and prev_diff >= 0:
            sells += 1
            bank = bank + amt
            #sell_amt = sell_amt + amt
            #print('SELL ON %s (PRICE %s)' % (dates[j], prices[j]))
    prev_diff = diff
    last_tx_amt = amt
# if there is one extra buy, the position is still open: add the latest price
if buys > sells:
    bank = bank + amt
# if there is one extra sell, subtract it
elif sells > buys:
    bank = bank - amt
#THIS IS RELATED TO SOME OTHER APPROACH I TRIED
#a = (buy_amt) / buys if buys else 0
#b = (sell_amt) / sells if sells else 0
#diff_of_sum_of_avg_tx_amts = a - b
start_date = datetime.now() - timedelta(days=days_ago)
return bank, start_date
I reasoned that the amount in my 'bank' would be the amount I have sold minus the amount I have bought.
But if the first crossover was a sell, I don't want to count that (I am going to assume that the first transaction I make will be a buy).
Then, if the last transaction I make is a buy (negative to my bank), I will count today's price into my 'bank':
if last_tx_type == 'buy':
    # add the current price to the sell amount if the last transaction you made was a buy
    sell_amt = sell_amt + prices[len(prices) - 1]
if sell_first == True:
    # if the first thing you did was sell, don't count it as money made, because it used a priori money
    sell_amt = sell_amt - first_tx_amt
bank = sell_amt - buy_amt
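Putting those rules together, this is a minimal self-contained sketch of what I think the bank calculation should be (crossover_pnl is my own restatement, not code from above): only buy when flat, only sell when a position is open (so an initial sell crossover is ignored), and mark any still-open buy to the latest price.

def crossover_pnl(prices, sma_short_list, sma_long_list, days_ago):
    # P&L of the crossover approach started days_ago days back,
    # with any still-open buy marked to the latest price
    bank = 0.0
    prev_diff = 0.0
    position_open = False
    start = len(prices) - days_ago
    for j in range(start, len(prices)):
        # skip leading rows where the moving averages are not defined yet
        if sma_short_list[j] is None or sma_long_list[j] is None:
            continue
        diff = sma_short_list[j] - sma_long_list[j]
        if diff * prev_diff < 0:
            if diff > 0 and not position_open:
                # upward cross while flat: buy
                bank -= prices[j]
                position_open = True
            elif diff < 0 and position_open:
                # downward cross while holding: sell
                bank += prices[j]
                position_open = False
        prev_diff = diff
    # mark a still-open position to market at the latest price
    if position_open:
        bank += prices[-1]
    return bank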
