ValueError: Endog and Exog are in different size - python

I am getting ValueError: Endog and Exog are in different size.
When I check len(y) or len(y_scaled), it returns 0, but it is supposed to be 5. I would appreciate any help. Thanks in advance.
import datetime
import dateutil
import pandas_datareader.data as wb
import matplotlib.pyplot as plt
import numpy as np
import statsmodels.api as sm
year=5
tickers =["0200.KL"]
ohlc = wb.DataReader(tickers, data_source="yahoo",start=datetime.date.today()-dateutil.relativedelta.relativedelta(years=year),end=datetime.date.today())
n=5 #get 5 consecutive data
df =ohlc.copy()
series=df["Adj Close"]
slopes=[i*0 for i in range(n-1)]
for i in range(n,len(series)+1):
    y=series[i-n:n]
    x=np.array(range(n))
    #normalize x and y variable
    y_scaled=(y-y.min())/(y.max()-y.min())
    X_scaled=(x-x.min())/(x.max()-x.min())
    #add a constant to the equation
    X_scaled=sm.add_constant(X_scaled)
    model=sm.OLS(y_scaled,X_scaled)
    results=model.fit()
    slopes.append(results.params[-1])
#slope coefficient is the theta in radians
slopes_angle=np.rad2deg(np.arctan(np.array(slopes)))
np.array(slopes_angle)

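Solved. Thank you.
The slice should be y=df["Adj Close"][i-n:i] instead of y=series[i-n:n]. With the end of the slice fixed at n, the window shrinks as i grows and eventually becomes empty, so y no longer has the same length as x.
The full code is below: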
import datetime
import dateutil
import pandas_datareader.data as wb
import matplotlib.pyplot as plt
import numpy as np
import statsmodels.api as sm
%matplotlib inline
year=1
tickers ="AAPL"
ohlc = wb.DataReader(tickers, data_source="yahoo",start=datetime.date.today()-dateutil.relativedelta.relativedelta(years=year),end=datetime.date.today())
n=5 #get 5 consecutive data points
df =ohlc.copy()
slopes=[i*0 for i in range(n-1)]
for i in range(n,len(df)+1):
    y=df["Adj Close"][i-n:i]
    x=np.array(range(n))
    #normalize x and y variable
    y_scaled=(y-y.min())/(y.max()-y.min())
    X_scaled=(x-x.min())/(x.max()-x.min())
    #add a constant to the equation
    X_scaled=sm.add_constant(X_scaled)
    model=sm.OLS(y_scaled,X_scaled)
    results=model.fit()
    slopes.append(results.params[-1])
#slope coefficient is the theta in radians
slopes_angle=np.rad2deg(np.arctan(np.array(slopes)))
slopes_angle=np.array(slopes_angle)
plt.plot(slopes_angle)
plt.title("Slope Coefficient of 5 Consecutive Stock Price Data")
plt.ylabel("Slope Coefficient")
plt.xlabel("Period")
plt.show()
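To see why the original slice produced an empty y, here is a minimal sketch that compares the two slices on a made-up array. The values in series are hypothetical stand-ins for the price data; only the slicing pattern is taken from the code above.

```python
import numpy as np

n = 5
series = np.arange(10.0)  # hypothetical stand-in for the "Adj Close" prices

buggy_lengths = []
fixed_lengths = []
for i in range(n, len(series) + 1):
    # End fixed at n: the window shrinks by one element per iteration.
    buggy_lengths.append(len(series[i - n:n]))
    # End moves with i: the window always holds n elements.
    fixed_lengths.append(len(series[i - n:i]))

print(buggy_lengths)  # [5, 4, 3, 2, 1, 0]
print(fixed_lengths)  # [5, 5, 5, 5, 5, 5]
```

Since x always has n elements, any window shorter than n makes the endog and exog sizes disagree.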

Related

Shift phase between two sinusoids in Python

How can I find the phase shift between two sinusoids in Python?
For example, I created two sinusoids with a phase shift of about 180 degrees (3.12 radians), judging visually. Can the phase shift be calculated in a Python script if we know only graph_1 and graph_2?
import matplotlib.pyplot as plt
import numpy as np
data=[]
def sin (f):
    x=np.array(range(1,200))
    y = 10*np.sin((0.1*x)+f)
    return (y)
graph_1 = sin(3.12)
graph_2 = sin(0)
plt.plot(graph_1 ,graph_2)
plt.show()
Please see the image here
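One possible approach, sketched here under the assumption that both signals share a known angular frequency (0.1 radians per sample, taken from the sin helper above), is to fit each signal by least squares to a sine plus a cosine at that frequency and read the phase off the two coefficients. The function names estimate_phase and phase_shift are made up for this illustration. If the frequency is not known, it would have to be estimated first, which this sketch does not cover.

```python
import numpy as np


def estimate_phase(y, x, omega):
    """Least-squares phase of y ~ A*sin(omega*x + phase), with omega assumed known."""
    # A*sin(omega*x + p) = (A*cos p)*sin(omega*x) + (A*sin p)*cos(omega*x)
    design = np.column_stack([np.sin(omega * x), np.cos(omega * x)])
    (a, b), *_ = np.linalg.lstsq(design, y, rcond=None)
    return np.arctan2(b, a)


def phase_shift(y1, y2, x, omega):
    """Phase of y1 minus phase of y2, wrapped into [-pi, pi)."""
    diff = estimate_phase(y1, x, omega) - estimate_phase(y2, x, omega)
    return (diff + np.pi) % (2 * np.pi) - np.pi


x = np.arange(1, 200)
omega = 0.1  # assumption: known frequency, same as in the question's helper
graph_1 = 10 * np.sin(omega * x + 3.12)
graph_2 = 10 * np.sin(omega * x + 0.0)

shift = phase_shift(graph_1, graph_2, x, omega)
print(f"{shift:.2f} rad, {np.degrees(shift):.1f} deg")
```

For noise-free data the fit recovers the phase essentially exactly; with noisy data it gives the least-squares estimate.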

how to plot a graph with mpmath in python?

I need to simulate several models with interval arithmetic, and the most viable package I found was mpmath. However, I am having problems plotting the results. I ran an initial test before applying it to the models. Can anybody help me?
Another problem is that I always need a for loop to create my interval variables, which greatly increases the computational cost. Is there an alternative?
This is my code:
import mpmath as mp
import math as mt
from mpmath import *
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
iv.dps = 10
iv.pretty = True
X = np.linspace(-np.pi, np.pi, 10, endpoint=True)
a=iv.mpf(X[1])
b=[]
for k in range(len(X)):
    b = np.append(b,iv.mpf(X[k]) )
C=[]
for k in range(len(X)):
    C = np.append(C, iv.sin(b[k]))
print(C)
I need to plot the sin, and mp.plot doesn't work.
It is fairly straightforward:
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(0, 5, 0.01)
y = np.sin(x)
plt.plot(x, y)
plt.show()

Frequency vs time plot python

I have the following code:
from scipy import signal
import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
fs, data = wavfile.read("New Recording 2.wav")
f, t , zxx = signal.stft(data, fs)
fa = np.array(f)
ta = np.array(t)
plt.plot(ta, fa, np.abs(zxx))
plt.show()
I want to plot a graph like this: Graph, with time (seconds) on the x-axis and frequency (kHz) on the y-axis.
When I run the above code, I get the following error:
"ValueError: x and y must have same first dimension, but have shapes (3,) and (2,)"
I thought this was a quirk in NumPy, which is why I converted f and t to np.array instead of using the arrays returned by SciPy.
Any help would be appreciated.
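A possible explanation, offered as a sketch rather than a confirmed diagnosis: the shapes (3,) and (2,) are what signal.stft returns when it runs along the last axis of a two-channel array of shape (n_samples, 2), and plt.plot then treats t and f as x and y of a line plot. The example below assumes a stereo recording, keeps one channel, and draws the magnitude with pcolormesh. The helper names stft_magnitude and plot_spectrogram and the synthetic 1 kHz tone are made up for illustration; the tone stands in for the WAV file.

```python
import numpy as np
from scipy import signal


def stft_magnitude(data, fs, nperseg=256):
    """Return (f, t, |Zxx|) for the first channel of a mono or multi-channel signal."""
    data = np.asarray(data, dtype=float)
    if data.ndim == 2:
        # wavfile.read gives (n_samples, n_channels) for stereo files;
        # keep the first channel so the STFT runs along the time axis.
        data = data[:, 0]
    f, t, zxx = signal.stft(data, fs, nperseg=nperseg)
    return f, t, np.abs(zxx)


def plot_spectrogram(f, t, mag):
    """Draw time (s) on the x-axis and frequency (kHz) on the y-axis."""
    # Imported here so the analysis part runs without a plotting backend.
    import matplotlib.pyplot as plt

    plt.pcolormesh(t, f / 1000.0, mag, shading="gouraud")
    plt.xlabel("Time (s)")
    plt.ylabel("Frequency (kHz)")
    plt.show()


# Synthetic stereo stand-in for the recording: a 1 kHz tone, 1 s at 8 kHz.
fs = 8000
time = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * time)
stereo = np.column_stack([tone, tone])

f, t, mag = stft_magnitude(stereo, fs)
print(f.shape, t.shape, mag.shape)
# plot_spectrogram(f, t, mag)  # uncomment to display the figure
```

The magnitude array has one row per frequency and one column per time step, which is the layout pcolormesh expects for pcolormesh(t, f, mag).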

Plot RidgeCV coefficients as a function of the regularization

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import RidgeCV
tips = sns.load_dataset('tips')
X = tips.drop(columns=['tip','sex', 'smoker', 'day', 'time'])
y = tips['tip']
alphas = 10**np.linspace(10,-2,100)*0.5
ridge_clf = RidgeCV(alphas=alphas,scoring='r2').fit(X, y)
ridge_clf.score(X, y)
I want to plot the following graph for RidgeCV, but I don't see any option for that like the one GridSearchCV offers. I would appreciate your suggestions!
There is no indication of what the colors stand for. I assume they stand for features, and that we are investigating the size of each feature weight as a function of alpha. Here is my solution:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import RidgeCV
tips = sns.load_dataset('tips')
X = tips.drop(columns=['tip','sex', 'smoker', 'day', 'time'])
y = tips['tip']
alphas = 10**np.linspace(10,-2,100)*0.5
w = list()
for a in alphas:
    ridge_clf = RidgeCV(alphas=[a],cv=10).fit(X, y)
    w.append(ridge_clf.coef_)
w = np.array(w)
plt.semilogx(alphas,w)
plt.title('Ridge coefficients as function of the regularization')
plt.xlabel('alpha')
plt.ylabel('weights')
plt.legend(X.keys())
Output:
Since you only have two features in X, there are only two lines.
Here is the code for generating the plot that you posted.
First, we need to understand that RidgeCV does not return the coefficients for each alpha value passed in the alphas parameter.
The idea behind RidgeCV is that it tries each of the alpha values given in alphas, scores them by cross-validation, and returns the best alpha along with the fitted model.
Hence, the only way to get the coefficients for each alpha value with cross-validation is to fit a separate RidgeCV for each alpha.
Example:
# Author: Fabian Pedregosa -- <fabian.pedregosa#inria.fr>
# License: BSD 3 clause
print(__doc__)
import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model
# X is the 10x10 Hilbert matrix
X = 1. / (np.arange(1, 11) + np.arange(0, 10)[:, np.newaxis])
y = np.ones(10)
# #############################################################################
# Compute paths
n_alphas = 200
alphas = np.logspace(-10, -2, n_alphas)
coefs = []
for a in alphas:
    ridge = linear_model.RidgeCV(alphas=[a], fit_intercept=False, cv=3)
    ridge.fit(X, y)
    coefs.append(ridge.coef_)
# #############################################################################
# Display results
ax = plt.gca()
ax.plot(alphas, coefs)
ax.set_xscale('log')
ax.set_xlim(ax.get_xlim()[::-1]) # reverse axis
plt.xlabel('alpha')
plt.ylabel('weights')
plt.title('RidgeCV coefficients as a function of the regularization')
plt.axis('tight')
plt.show()

Pandas not plotting code.

I'm new to coding and am trying to understand a lecture on Quantopian by going through the code, but when I run the code in PyCharm, there is no output. Can someone tell me what's going on and how to resolve this?
Below is a piece of my code (Python 2.7.13):
import numpy as np
import pandas as pd
import statsmodels
import statsmodels.api as sm
from statsmodels.tsa.stattools import coint
# just set the seed for the random number generator
np.random.seed(107)
import matplotlib.pyplot as plt
X_returns = np.random.normal(0, 1, 100) # Generate the daily returns
# sum them and shift all the prices up into a reasonable range
X = pd.Series(np.cumsum(X_returns), name='X') + 50
X.plot();
The sole output, when I run this, is: "Process finished with exit code 0"
Just add plt.show() at the end:
import numpy as np
import pandas as pd
import statsmodels
import statsmodels.api as sm
from statsmodels.tsa.stattools import coint
# just set the seed for the random number generator
np.random.seed(107)
import matplotlib.pyplot as plt
X_returns = np.random.normal(0, 1, 100) # Generate the daily returns
# sum them and shift all the prices up into a reasonable range
X = pd.Series(np.cumsum(X_returns), name='X') + 50
X.plot()
plt.show()
