Why is my beta different from Yahoo Finance? - python

I have some code that calculates the beta of a stock against the S&P 500, in this case the ticker symbol "FET". However, the result is completely different from what I see on Yahoo Finance. Historically this stock has been very volatile, which would explain the beta of 1.55 shown on Yahoo Finance: http://finance.yahoo.com/q?s=fet. Can someone please advise why I am getting a completely different number (0.0088)? Thanks in advance.
from pandas.io.data import DataReader
from datetime import datetime
from datetime import date
import numpy
import sys
today = date.today()
stock_one = DataReader('FET','yahoo',datetime(2009,1,1), today)
stock_two = DataReader('^GSPC','yahoo',stock_one['Adj Close'].keys()[0], today)
a = stock_one['Adj Close'].pct_change()
b = stock_two['Adj Close'].pct_change()
covariance = numpy.cov(a[1:],b[1:])[0][1]
variance = numpy.var(b[1:])
beta = covariance / variance
print 'beta value ' + str(beta)

Ok, so I played with the code a bit and this is what I have.
from pandas.io.data import DataReader
import pandas.io.data as web
from datetime import datetime
from datetime import date
import numpy
import sys
start = datetime(2009, 1, 1)
today = date.today()
stock1 = 'AAPL'
stock2 = '^GSPC'
stocks = web.DataReader([stock1, stock2],'yahoo', start, today)
# stock_two = DataReader('^GSPC','yahoo', start, today)
a = stocks['Adj Close'].pct_change()
covariance = a.cov() # Cov Matrix
variance = a.var() # Of stock2
var = variance[stock2]
cov = covariance.loc[stock2, stock1]
beta = cov / var
print "The Beta for %s is: " % (stock1), str(beta)
The two price series were not the same length, which was problem #1. Also, your final line computed a beta for every entry of the covariance matrix, which is probably not what you wanted. You don't need the betas based on cov(0,0) and cov(1,1); you only need cov(0,1) or cov(1,0). Those are positions in the matrix, not values.
Anyway, here is the answer I got:
The Beta for AAPL is: 0.885852632799
* Edit *
Made the code easier to run, and changed it so there is only one line for inputting what stocks you want to pull from Yahoo.
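To make the "positions in the matrix" point concrete, here is a minimal sketch on made-up return values (the arrays are hypothetical, not market data). It also shows one pitfall I am fairly sure about: numpy.cov defaults to the sample estimator (ddof=1) while numpy.var defaults to ddof=0, so mixing them skews the ratio by n/(n-1).

```python
import numpy as np

# Hypothetical returns: the stock moves exactly twice as much as the index.
index_returns = np.array([0.01, -0.02, 0.015, 0.003, -0.007])
stock_returns = 2.0 * index_returns

# 2x2 covariance matrix; position [0, 1] is cov(stock, index),
# position [1, 1] is var(index). Both use ddof=1.
cov_matrix = np.cov(stock_returns, index_returns)
beta = cov_matrix[0, 1] / cov_matrix[1, 1]

# Mixing estimators: np.var defaults to ddof=0, so the ratio is
# inflated by n / (n - 1) compared with the consistent version above.
beta_mixed = cov_matrix[0, 1] / np.var(index_returns)

print(beta, beta_mixed)
```

With only a handful of observations the inflation is large; with years of daily data it is small, so it would not by itself explain a gap like 0.0088 vs 1.55.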

You need to convert the closing prices into the correct format for the calculation: both the index prices and the stock prices should be converted into percentage returns.

In order to match Yahoo Finance, you need to use three years of monthly Adjusted Close prices.
https://help.yahoo.com/kb/finance/SLN2347.html?impressions=true
Beta
The Beta used is Beta of Equity. Beta is the monthly price change of a
particular company relative to the monthly price change of the S&P500.
The time period for Beta is 3 years (36 months) when available.
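A rough sketch of that recipe, assuming you already have two daily price Series on a DatetimeIndex. The function name monthly_beta and the synthetic random-walk prices are my own placeholders; with real data you would pass the 'Adj Close' columns instead. I can't promise it reproduces Yahoo's number exactly, since the help text does not spell out every detail.

```python
import numpy as np
import pandas as pd

def monthly_beta(stock_prices, index_prices, months=36):
    """Beta from the last `months` monthly returns (hypothetical helper)."""
    # Align the two series on common dates so the lengths always match.
    prices = pd.concat([stock_prices, index_prices], axis=1,
                       keys=["stock", "index"]).dropna()
    # Keep the last available price of each calendar month.
    month_end = prices.groupby(prices.index.to_period("M")).last()
    returns = month_end.pct_change().dropna().tail(months)
    cov = returns.cov()  # sample covariance matrix (ddof=1 throughout)
    return cov.loc["stock", "index"] / cov.loc["index", "index"]

# Synthetic daily prices standing in for downloaded data.
rng = np.random.default_rng(42)
dates = pd.bdate_range("2012-01-02", periods=900)
index_prices = pd.Series(1500 * np.cumprod(1 + rng.normal(0, 0.01, 900)), index=dates)
stock_prices = pd.Series(20 * np.cumprod(1 + rng.normal(0, 0.02, 900)), index=dates)

beta = monthly_beta(stock_prices, index_prices)
self_beta = monthly_beta(index_prices, index_prices)  # sanity check, should be 1
print(beta, self_beta)
```

Taking the variance from the same covariance matrix keeps numerator and denominator on the same estimator, and the concat/dropna step avoids the mismatched-length problem mentioned in the other answer.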

Related

calculate monthly customer churn with the 1st of each month

I am working with a subscription based data set of which this is an exemplar:
import random
import pandas as pd
import numpy as np
from datetime import timedelta
start_date = pd.date_range(start = "2015-01-09", end = "2022-09-11", freq = "6D")
cancel_date = [start + timedelta(days = np.random.exponential(scale = 100)) for start in start_date]
churned = [bool(random.randint(0, 1)) for i in range(len(start_date))]
df = pd.DataFrame(
    {"start_date":start_date,
     "cancel_date":cancel_date,
     "churned":churned}
)
df["cancel_date"] = df["cancel_date"].dt.date
df["cancel_date"] = df["cancel_date"].astype("datetime64[ns]")
I need a way to calculate monthly customer churn in python using the following steps:
Firstly, I need to obtain the number of subscriptions that started before the 1st of each month that are still active
Secondly, I need to obtain the number of subscriptions that started before the 1st of each month and which were cancelled after the 1st of each month
These two steps constitute the denominator of the monthly calculation
Finally, I need to obtain the number of subscriptions that cancelled in each month
This step produces the numerator of the monthly calculation.
The numerator and the denominator are divided and multiplied by 100 to obtain the percentage of customers that churn each month
I am really lost with this problem. Can someone please point me in the right direction? I have been working on it for a long time.
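One possible way to turn the steps above into code, as a sketch only. The function name monthly_churn and the tiny hand-made table are hypothetical. I am also assuming two things the question does not state: that churned == False means the subscription never cancelled (so its cancel_date is ignored), and that the numerator only counts subscriptions that were already active on the 1st, so the rate cannot exceed 100%.

```python
import pandas as pd

def monthly_churn(df, month_starts):
    """Churn percentage for each month start (hypothetical helper)."""
    rows = []
    for first in month_starts:
        next_first = first + pd.offsets.MonthBegin(1)
        started_before = df["start_date"] < first
        # Step 1: started before the 1st and never cancelled (assumption).
        still_active = started_before & ~df["churned"]
        # Step 2: started before the 1st and cancelled on or after the 1st.
        cancelled_later = started_before & df["churned"] & (df["cancel_date"] >= first)
        denominator = int(still_active.sum() + cancelled_later.sum())
        # Step 3: of those, the ones that cancelled during this month.
        numerator = int((cancelled_later & (df["cancel_date"] < next_first)).sum())
        rate = 100.0 * numerator / denominator if denominator else float("nan")
        rows.append({"month": first, "active_at_start": denominator,
                     "cancelled": numerator, "churn_pct": rate})
    return pd.DataFrame(rows).set_index("month")

# Small deterministic example instead of the random data.
example = pd.DataFrame({
    "start_date": pd.to_datetime(["2019-12-10", "2019-12-20", "2020-01-05", "2019-11-01"]),
    "cancel_date": pd.to_datetime(["2020-01-15", "2020-02-10", "2020-01-20", "2020-03-01"]),
    "churned": [True, True, True, False],
})
result = monthly_churn(example, pd.date_range("2020-01-01", "2020-02-01", freq="MS"))
print(result)
```

If you want the numerator to include subscriptions that both started and cancelled inside the month, drop the started_before condition from step 3.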

How to construct the daily returns of an index

Using the snp500 series, which contains the closing prices of the S&P 500 index for the years 2010-2019, I should construct the daily returns of this index (returns can be defined as a percentage increase in price: $r_1=(P_1-P_0)/P_0$) and convert them to yearly returns, building on the function x = lambda p,r,n,t: "%"+str(round(p*(1+(r/n))**(n*t),2)/100). Pay attention to the units of measurement. I should assume that there are 252 days in a year. Maybe I can use the method .shift() for this assignment.
Firstly, I defined the function $r_1=(P_1-P_0)/P_0$
def percentage_increase_in_price(P_0, P_1):
    r_1 = (P_1 - P_0) / P_0
    return r_1
Secondly, I wrote the code to load the S&P 500 index data from 2010 to 2019
import pandas as pd
import pandas_datareader.data as web
import datetime as dt
start = dt.datetime(2010, 1, 1)
end = dt.datetime(2019, 12, 31)
snp500 = web.DataReader('SP500', 'fred', start, end)
snp500
Then, I have no idea what my next step is.
Could you advise me on how to complete this task?
How about this?
import pandas as pd
import pandas_datareader.data as web
snp500 = web.DataReader('SP500', 'fred', '2010-01-01', '2019-12-31')
# calculate simple returns
snp500["daily_ret"] = snp500["SP500"].pct_change()
snp500.dropna(inplace=True)
# scale daily returns to annual returns and apply rounding
def annualize(r, n, p, t=1):
    return round(p * (1 + r/n)**(n*t),2)/100
snp500["inv"] = snp500["daily_ret"].apply(annualize, p=100, n=252)
Output:
            SP500    daily_ret  inv
DATE
2012-03-27  1412.52  -0.002817  0.9972
2012-03-28  1405.54  -0.004942  0.9951
2012-03-29  1403.28  -0.001608  0.9984
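A different reading of the assignment, offered only as a sketch: instead of plugging each daily return into the compound-interest formula as if it were an annual rate, compound the daily returns themselves over 252 trading days. The synthetic price series below is a placeholder for the SP500 column; TRADING_DAYS and the variable names are my own.

```python
import numpy as np
import pandas as pd

TRADING_DAYS = 252  # assumption stated in the assignment

# Placeholder prices growing 0.05% per day, standing in for snp500["SP500"].
dates = pd.bdate_range("2010-01-04", periods=2 * TRADING_DAYS)
prices = pd.Series(100 * 1.0005 ** np.arange(len(dates)), index=dates, name="SP500")

# Daily simple returns r_t = (P_t - P_{t-1}) / P_{t-1}, written with .shift().
daily_ret = (prices / prices.shift(1) - 1).dropna()

# Average daily return compounded over one year of trading days.
annualized = (1 + daily_ret.mean()) ** TRADING_DAYS - 1

# Realized return per calendar year, compounding the actual daily returns.
yearly = (1 + daily_ret).groupby(daily_ret.index.year).prod() - 1

print(annualized)
print(yearly)
```

Both results are fractions, so multiply by 100 if the expected unit is percent.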

How to calculate stock pullback

I am trying to calculate a stock's pullback (percentage change) from its high. Not necessarily the change from the high to today, but the percentage change from the high to the lowest point after that high.
Where I am drawing a blank is finding the lowest point after the high for each stock. I can find the high for each stock, but how do I trim that column so that it only contains the prices after that high?
import numpy as np
import pandas as pd
import datetime as dt
import pandas.io.data as web
stocks = ['AAPL', 'NFLX', 'MSFT', 'MCD', 'DIS']
start = dt.datetime(2015, 1, 1)
end = dt.datetime.today()
df = web.DataReader(stocks, 'yahoo', start, end)
df = df['Close']
dfMax = df.max()
From here, I have 5 columns, one column for each stock, and the subsequent prices on each day. I am stumped...
First, you need to use the Adj Close price so that you can accurately measure daily returns (i.e. so your results aren't impacted by splits and dividends).
To calculate the forward min (i.e. the lowest point AFTER the most recent max high), perform a cummin on the prices sorted in reverse order, and then reverse again: df[::-1].cummin()[::-1].
The pullback from the cumulative max price is one minus the ratio of this forward min price to the cumulative max price: 1 - df[::-1].cummin()[::-1] / df.cummax()
df = web.DataReader(stocks, 'yahoo', start, end)['Adj Close']
df_pullback = 1 - df[::-1].cummin()[::-1] / df.cummax()
df_pullback.plot()
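To see what the reversed cummin does, here is the same expression on a short made-up price Series (the numbers are purely illustrative):

```python
import pandas as pd

prices = pd.Series([10.0, 12.0, 9.0, 11.0, 8.0, 13.0])

# Lowest price from each row onward: reverse, running min, reverse back.
forward_min = prices[::-1].cummin()[::-1]

# Highest price seen up to and including each row.
running_max = prices.cummax()

# Fractional drop from the running high to the lowest later price.
pullback = 1 - forward_min / running_max

print(pd.DataFrame({"price": prices, "forward_min": forward_min,
                    "running_max": running_max, "pullback": pullback}))
```

The high of 12 is followed by a low of 8, so rows 1 through 4 show a pullback of one third, and the final new high of 13 shows zero because nothing comes after it.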

Rolling Mean of Rolling Correlation dataframe in Python?

Python beginner here.
What I've done so far:
Imported price data from Yahoo Finance from a list of stocks.
Between the stocks (every combination), computed the 20 day rolling correlation into a dataframe.
I would like to:
1) Calculate the 200 day simple moving average for each of the 20 day rolling correlations.
2) Report the 200 day moving average results in a matrix.
How to do this in python/pandas? Thanks, this would help me out a ton!
Here is what I have so far...
import pandas as pd
from pandas import DataFrame
import datetime
import pandas.io.data as web
from pandas.io.data import DataReader
stocks = ['spy', 'gld', 'uso']
start = datetime.datetime(2014,1,1)
end = datetime.datetime(2015,1,1)
f = web.DataReader(stocks, 'yahoo', start, end)
adj_close_df = f['Adj Close']
correls = pd.rolling_corr(adj_close_df, 20)
means = pd.rolling_mean(correls, 200) #<---- I get an error message here!
This is a start which answers questions 1-3 (you should only have one question per post).
import pandas.io.data as web
import datetime as dt
import pandas as pd
end_date = dt.datetime.now().date()
start_date = end_date - pd.DateOffset(years=5)
symbols = ['AAPL', 'IBM', 'GM']
prices = web.get_data_yahoo(symbols=symbols, start=start_date, end=end_date)['Adj Close']
returns = prices.pct_change()
rolling_corr = pd.rolling_corr_pairwise(returns, window=20)
Getting the rolling mean of the rolling correlation is relatively simple for a single stock against all others. For example:
pd.rolling_mean(rolling_corr.major_xs('AAPL').T, 200).tail()
Out[34]:
            AAPL  GM        IBM
Date
2015-05-08  1     0.313391  0.324728
2015-05-11  1     0.315561  0.327537
2015-05-12  1     0.317844  0.330375
2015-05-13  1     0.320137  0.333189
2015-05-14  1     0.322119  0.335659
To view the correlation matrix for the most recent 200 day window:
>>> rolling_corr.iloc[-200:].mean(axis=0)
      AAPL      GM        IBM
AAPL  1.000000  0.322119  0.335659
GM    0.322119  1.000000  0.383672
IBM   0.335659  0.383672  1.000000
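Note that pd.rolling_corr, pd.rolling_corr_pairwise, pd.rolling_mean and the Panel type used above have been removed from recent pandas releases. A rough equivalent with the .rolling() API might look like the sketch below; the random returns are a stand-in for prices.pct_change(), and the variable names are mine.

```python
import numpy as np
import pandas as pd

# Placeholder returns; with real data use prices.pct_change().
rng = np.random.default_rng(0)
dates = pd.bdate_range("2014-01-01", periods=300)
returns = pd.DataFrame(rng.normal(0, 0.01, size=(300, 3)),
                       index=dates, columns=["AAPL", "GM", "IBM"])

# Pairwise 20-day rolling correlation: rows are (date, symbol), columns are symbols.
rolling_corr = returns.rolling(window=20).corr()

# One column per (symbol, symbol) pair, one row per date.
pairs = rolling_corr.unstack()

# 200-day simple moving average of each 20-day rolling correlation.
mean_corr = pairs.rolling(window=200).mean()

# Most recent averages reshaped into a symbol-by-symbol matrix.
latest_matrix = mean_corr.iloc[-1].unstack()
print(latest_matrix)
```

The first 218 rows of mean_corr are NaN, because a full 20-day correlation window and then 200 valid correlations are needed before an average exists.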

Subtract loan payment every month with daily compounding interest

I'm trying to figure out how to subtract a monthly loan payment with daily compounded interest. Right now I think I've got the right code for subtracting the payment amount daily over a 10 year loan:
P = 20000
r = .068
t = 10
n = 365
payment = 200
for payment_number in xrange(1, n*t):
    daily_interest = P * (1+(r/n)) - P
    P = (P + daily_interest) - payment
    print P
If possible, I'd still like to print the daily balances, but subtract the payment every month rather than every day. Initially I thought of using a nested for loop with xrange(1, 30), but I'm not sure that worked correctly. Thanks in advance for the suggestions!
What about inserting an if statement for the purpose?
P = P-200 if payment_number%30 == 0 else P
This subtracts the payment only when payment_number is a multiple of 30.
"Monthly" is a complicated idea. To fully handle the months you will need to use the datetime module.
from datetime import date, timedelta
date_started = date(2000,1,1)
So say you are 123 days from the start date; we need to calculate that date (using a name other than date, so the imported date class is not overwritten):
current_date = date_started + timedelta(days=123)
>>> current_date
datetime.date(2000, 5, 3)
So now we need to figure out how many days are between that date and the first of the month. The dateutil module can help us with that (you will have to install it).
from dateutil.relativedelta import relativedelta
firstofmonth_date = date_started + relativedelta(months=4)
tddays = firstofmonth_date - date_started
days = tddays.days
Then just put "days" into the function you already have and you should be good. The only part I left for you to do is figuring out how many months have passed between your dates.
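Putting the pieces together, one way it could look is sketched below. This is Python 3 (range and print()), unlike the question's Python 2 code. The function name daily_balances, the start date, and the choice to take the payment on the 1st of every month are my assumptions, as is dividing the rate by 365 even in leap years.

```python
from datetime import date, timedelta

def daily_balances(principal, annual_rate, payment, start, days):
    """Return (date, balance) pairs: interest accrues daily,
    the payment is subtracted on the first day of each month."""
    balance = principal
    history = []
    for offset in range(1, days + 1):
        current = start + timedelta(days=offset)
        balance += balance * annual_rate / 365  # one day of interest
        if current.day == 1:                    # assumed payment date
            balance -= payment
        history.append((current, balance))
    return history

history = daily_balances(20000, 0.068, 200, date(2000, 1, 1), 366)
payments_made = sum(1 for day, _ in history if day.day == 1)

for day, balance in history[:3]:
    print(day, round(balance, 2))
print(payments_made, round(history[-1][1], 2))
```

Because the real calendar decides when a month starts, there is no need for the payment_number % 30 approximation, and you still get a balance for every day.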
