pymc3 - stochastic volatility model with latent AR(1) process - python

I've been trying to implement and estimate, with pymc3, a basic stochastic volatility (SV) model of the following form:
r_t = exp(h_t / 2) * e_t
h_t = rho_0 + rho_1 * h_{t-1} + n_t
where r_t is the return process and h_t is the (latent) log-variance process, which follows an AR(1) process with persistence parameter rho_1. My code (MWE) for this looks as follows:
import numpy as np
import pymc3 as pm

# simulate some random data
np.random.seed(13)
data = np.random.randn(10)

# SV model with AR
with pm.Model() as model:
    nu = 2
    rho = pm.Uniform("rho", -1, 1)
    h = pm.AR("h", rho=rho, sigma=1, shape=len(data))
    volatility_process = pm.Deterministic(
        "volatility_process", pm.math.exp(h / 2) ** 0.5
    )
    r = pm.StudentT("r", nu=nu, sigma=volatility_process, observed=data)
    prior = pm.sample_prior_predictive(10)
    # trace = pm.sample(10)
But running the above results in the following error message:
Traceback (most recent call last):
File "C:\Users\jrilla\AppData\Local\Continuum\anaconda3\lib\site-packages\pymc3\distributions\distribution.py", line 801, in _draw_value
return dist_tmp.random(point=point, size=size)
File "C:\Users\jrilla\AppData\Local\Continuum\anaconda3\lib\site-packages\pymc3\distributions\continuous.py", line 1979, in random
point=point, size=size)
File "C:\Users\jrilla\AppData\Local\Continuum\anaconda3\lib\site-packages\pymc3\distributions\distribution.py", line 638, in draw_values
raise ValueError('Cannot resolve inputs for {}'.format([str(params[j]) for j in to_eval]))
ValueError: Cannot resolve inputs for ['Elemwise{mul,no_inplace}.0']
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 9, in <module>
File "C:\Users\jrilla\AppData\Local\Continuum\anaconda3\lib\site-packages\pymc3\sampling.py", line 1495, in sample_prior_predictive
values = draw_values([model[name] for name in names], size=samples)
File "C:\Users\jrilla\AppData\Local\Continuum\anaconda3\lib\site-packages\pymc3\distributions\distribution.py", line 620, in draw_values
size=size)
File "C:\Users\jrilla\AppData\Local\Continuum\anaconda3\lib\site-packages\pymc3\distributions\distribution.py", line 810, in _draw_value
size=None))
File "C:\Users\jrilla\AppData\Local\Continuum\anaconda3\lib\site-packages\pymc3\distributions\continuous.py", line 1979, in random
point=point, size=size)
File "C:\Users\jrilla\AppData\Local\Continuum\anaconda3\lib\site-packages\pymc3\distributions\distribution.py", line 638, in draw_values
raise ValueError('Cannot resolve inputs for {}'.format([str(params[j]) for j in to_eval]))
ValueError: Cannot resolve inputs for ['Elemwise{mul,no_inplace}.0']
It is exactly the prior = ... line that causes the error. Note that I am using pm.AR() instead of pm.AR1(), but pm.AR1() produces the same error. I don't really understand why this does not work, because I am able to run the (simplified) SV example provided in the pymc3 documentation:
# SV model with GaussianRandomWalk
with pm.Model() as model:
    nu = 2
    sigma = pm.Exponential("sigma", 1.0, testval=1.0)
    s = pm.GaussianRandomWalk("s", sigma=sigma, shape=len(data))
    volatility_process = pm.Deterministic(
        "volatility_process", pm.math.exp(-2 * s) ** 0.5
    )
    r = pm.StudentT("r", nu=nu, sigma=volatility_process, observed=data)
    prior = pm.sample_prior_predictive(10)
    # trace = pm.sample(10)
where they show the example for a Gaussian random walk (GRW) instead of the general AR process I want to use. Since the GRW is just a special case of an AR process and the example works for the GRW, I don't see why it shouldn't also work for the general AR. As can be seen in the code, I basically just replace pm.GaussianRandomWalk(...) with pm.AR(...) (each with their required arguments). I am also able to implement/estimate the AR process on its own:
# Simple AR
with pm.Model() as model:
    rho = pm.Uniform("rho", -1, 1)
    h = pm.AR("h", rho=rho, sigma=1, shape=len(data), observed=data)
    prior = pm.sample_prior_predictive(10)
    # trace = pm.sample(10)
which works fine as well, so I assume I am not making a mistake with defining the AR. It's only when the AR is used as the latent process that the error arises. The pymc3 documentation on both GRW and AR models can be found here
Any idea on what the issue is here or what I'm doing wrong?
Thanks!

I asked the same question on the pymc3 Discourse forum. The developers responded that the reason for the above error is that pm.sample_prior_predictive requires the distribution's random() method, which is implemented for GaussianRandomWalk but not for AR. However, pm.sample works just fine.
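If the prior check is only meant as a sanity check on the latent AR(1), a minimal sketch is to simulate h and the implied returns by hand with numpy and keep pm.sample for the actual fit. The initial condition h_0 = 0 and the per-draw rho sampling below are illustrative assumptions, not pymc3 API:
import numpy as np

rng = np.random.default_rng(13)
n, n_draws = 10, 10
h_prior = np.zeros((n_draws, n))
for d in range(n_draws):
    rho_draw = rng.uniform(-1, 1)            # rho ~ Uniform(-1, 1), as in the model
    for t in range(1, n):
        h_prior[d, t] = rho_draw * h_prior[d, t - 1] + rng.normal(0, 1)
vol_prior = np.exp(h_prior / 2) ** 0.5       # same transform as the Deterministic above
r_prior = vol_prior * rng.standard_t(df=2, size=(n_draws, n))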

Related

KDE - Is there something wrong in scipy or numpy? Or is it something I am doing?

I am simply trying to follow an example: https://medium.com/swlh/how-to-analyze-volume-profiles-with-python-3166bb10ff24
I am only on the second step and I am getting errors. Here is my code:
# imports assumed by this snippet
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from scipy import stats

# Load data
df = botc.ib.data_saver.get_df(SYMBOL.lower())

# Separate for vol prof
volume = np.asarray(df['Volume'])
close = np.asarray(df['Close'])
print("Close:")
print(close)
print("VOLUME:")
print(volume)

# Plot volume profile based on close
px.histogram(df, x="Volume", y="Close", nbins=150, orientation='h').show()

# Kernel Density Estimator
kde_factor = 0.05
num_samples = 500
kde = stats.gaussian_kde(close, weights=volume, bw_method=kde_factor)
xr = np.linspace(close.min(), close.max(), num_samples)
kdy = kde(xr)
ticks_per_sample = (xr.max() - xr.min()) / num_samples

def get_dist_plot(c, v, kx, ky):
    fig = go.Figure()
    fig.add_trace(go.Histogram(name="Vol Profile", x=c, y=v, nbinsx=150,
                               histfunc='sum', histnorm='probability density'))
    fig.add_trace(go.Scatter(name="KDE", x=kx, y=ky, mode='lines'))
    return fig

get_dist_plot(close, volume, xr, kdy).show()
And here are the errors:
Traceback (most recent call last):
File "C:/Users/Jagel/PycharmProjects/VolumeBotv1-1-1/main.py", line 80, in <module>
start_bot()
File "C:/Users/Jagel/PycharmProjects/VolumeBotv1-1-1/main.py", line 64, in start_bot
kde = stats.gaussian_kde(close, weights=volume, bw_method=kde_factor)
File "M:\PROGRAMS\Anacondaa\envs\MLStockBot2\lib\site-packages\scipy\stats\_kde.py", line 207, in __init__
self.set_bandwidth(bw_method=bw_method)
File "M:\PROGRAMS\Anacondaa\envs\MLStockBot2\lib\site-packages\scipy\stats\_kde.py", line 555, in set_bandwidth
self._compute_covariance()
File "M:\PROGRAMS\Anacondaa\envs\MLStockBot2\lib\site-packages\scipy\stats\_kde.py", line 564, in _compute_covariance
self._data_covariance = atleast_2d(cov(self.dataset, rowvar=1,
File "<__array_function__ internals>", line 180, in cov
File "M:\PROGRAMS\Anacondaa\envs\MLStockBot2\lib\site-packages\numpy\lib\function_base.py", line 2680, in cov
avg, w_sum = average(X, axis=1, weights=w, returned=True)
File "<__array_function__ internals>", line 180, in average
File "M:\PROGRAMS\Anacondaa\envs\MLStockBot2\lib\site-packages\numpy\lib\function_base.py", line 550, in average
avg = np.multiply(a, wgt,
TypeError: can't multiply sequence by non-int of type 'float'
I have looked all over the internet for over an hour and haven't been able to solve this. Sorry if it is simple, but I'm starting to get quite angry, so any help is very much appreciated.
Other things I have tried: using different bw_methods and converting to a numpy array first.
I don't know your data, but I can reproduce the error as follows:
>>> [5] * 0.1
TypeError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_18536/2403475853.py in <module>
----> 1 [5] * 0.1
TypeError: can't multiply sequence by non-int of type 'float'
So check your data: most likely one of the rows contains a list/array instead of a single number, so numpy ends up multiplying a sequence by a float when computing the weighted average.
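A minimal way to check for this, assuming df is a pandas DataFrame with 'Close' and 'Volume' columns as in the question (the helper to_float below is a hypothetical name):
import numpy as np
import pandas as pd

# diagnostic: show which Python types actually occur in each column
print(df['Close'].map(type).value_counts())
print(df['Volume'].map(type).value_counts())

# keep only plain numbers; rows with list-like or other odd cells become NaN and are dropped
def to_float(v):
    return float(v) if isinstance(v, (int, float, np.number)) else np.nan

close = df['Close'].map(to_float).to_numpy()
volume = df['Volume'].map(to_float).to_numpy()
mask = ~(np.isnan(close) | np.isnan(volume))
close, volume = close[mask], volume[mask]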

Scipy Kmeans exits with TypeError

When running the code below, I'm getting a TypeError that says:
"File "_vq.pyx", line 342, in scipy.cluster._vq.update_cluster_means
TypeError: type other than float or double not supported"
from PIL import Image
import scipy, scipy.misc, scipy.cluster
NUM_CLUSTERS = 5
im = Image.open('d:/temp/test.jpg')
ar = scipy.misc.fromimage(im)
shape = ar.shape
ar = ar.reshape(scipy.product(shape[:2]), shape[2])
codes, dist = scipy.cluster.vq.kmeans(ar, NUM_CLUSTERS)
vecs, dist = scipy.cluster.vq.vq(ar, codes)
counts, bins = scipy.histogram(vecs, len(codes))
peak = codes[scipy.argmax(counts)]
print 'Most frequent color: %s (#%s)' % (peak, ''.join(chr(c) for c in peak).encode('hex'))
I have no idea how to fix this.
Update:
Full traceback:
Traceback (most recent call last):
File "...\temp.py", line 110, in <module>
codes, dist = scipy.cluster.vq.kmeans2(ar, NUM_CLUSTERS)
File "...\site-packages\scipy\cluster\vq.py", line 642, in kmeans2
new_code_book, has_members = _vq.update_cluster_means(data, label, nc)
File "_vq.pyx", line 342, in scipy.cluster._vq.update_cluster_means
TypeError: type other than float or double not supported
If you do:
ar = ar.reshape(scipy.product(shape[:2]), shape[2])
print(ar.dtype)
you will see that you are calling kmeans with data of type uint8. Since kmeans is, in theory, defined on d-dimensional real vectors, scipy does not accept integer input (as the error says). So just do:
ar = ar.reshape(scipy.product(shape[:2]), shape[2]).astype(float)
Casting like that makes your example run until the print, which also needs to be changed to reflect the resulting float types.
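For completeness, a sketch of the adapted tail of the script (Python 2 syntax, as in the question; rounding the float cluster centres back to 0-255 integers before building the hex string is an assumption about what the original print intended):
ar = ar.reshape(scipy.product(shape[:2]), shape[2]).astype(float)
codes, dist = scipy.cluster.vq.kmeans(ar, NUM_CLUSTERS)
vecs, dist = scipy.cluster.vq.vq(ar, codes)
counts, bins = scipy.histogram(vecs, len(codes))

# codes are now floats, so round back to ints before printing the hex colour
peak = codes[scipy.argmax(counts)]
colour = ''.join(chr(int(round(c))) for c in peak).encode('hex')
print 'Most frequent color: %s (#%s)' % (peak, colour)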

What is the right version of matplotlib for sympy 1.0?

I tried to use the plot module of Sympy (1.0) in PyCharm, but encountered errors like the one below. I guess it is caused by a version incompatibility between matplotlib (2.0.2) and Sympy (1.0). Does anyone have a clue? Thanks in advance!
Traceback (most recent call last):
File "/home/leizh/PycharmProjects/Learn_python/Smoothness_Bilinear_Quadrilateral_Elmt.py", line 49, in <module>
plot_parametric(cos(u),sin(u),(u,-5,5))
File "/home/leizh/.local/lib/python3.5/site-packages/sympy/plotting/plot.py", line 1415, in plot_parametric
plots.show()
File "/home/leizh/.local/lib/python3.5/site-packages/sympy/plotting/plot.py", line 184, in show
self._backend = self.backend(self)
File "/home/leizh/.local/lib/python3.5/site-packages/sympy/plotting/plot.py", line 1056, in __new__
return MatplotlibBackend(parent)
File "/home/leizh/.local/lib/python3.5/site-packages/sympy/plotting/plot.py", line 868, in __init__
self.cm = self.matplotlib.cm
AttributeError: 'NoneType' object has no attribute 'cm'
The code is meant to calculate a mapping for a bilinear quadrilateral element.
from sympy import *
from sympy.plotting import *

xi = Symbol("xi")
eta = Symbol("eta")

# Shape functions in reference element
def Ni(xi, eta, i):
    references_vertices = {1: [-1, -1], 2: [1, -1], 3: [1, 1], 4: [-1, 1]}
    xiv = references_vertices[i][0]
    etav = references_vertices[i][1]
    return Rational(1, 4)*(1 + xiv*xi)*(1 + etav*eta)

# Give a specific element in physical space with an angle >= 180 degree
physical_vertices = {1: [-1, -1], 2: [1, -1], 3: [1, 1], 4: [0, 0]}

# Interpolation for (x,y) in terms of (xi,eta)
def mapping(xi, eta, vertices):
    x = 0
    y = 0
    for i in vertices:
        xv = vertices[i][0]
        yv = vertices[i][1]
        x += Ni(xi, eta, i)*xv
        y += Ni(xi, eta, i)*yv
    return [x, y]

# mapping (xi, eta) -> (x, y)
xy = mapping(xi, eta, physical_vertices)
print("x and y")
print(factor(xy[0]))
print(factor(xy[1]))

# Jacobian
jac = []
jac.append([xy[0].diff(xi), xy[0].diff(eta)])
jac.append([xy[1].diff(xi), xy[1].diff(eta)])
print("Jacobian Matrix")
print(factor(jac))

# The determinant of Jacobian
det_jac = jac[0][0]*jac[1][1] - jac[0][1]*jac[1][0]
print(factor(det_jac))

# Plot
plot3d_parametric_surface(xy[0], xy[1], det_jac, (xi, -1, 1), (eta, -1, 1))
det_jac.subs([(xi, 1), (eta, -1)])

# test
u = symbols('u')
plot(u**2, (u, -1, 1))
plot_parametric(cos(u), sin(u), (u, -5, 5))
I have been able to reproduce your problem with matplotlib 2.0.2, sympy 1.0 and python 3.4.6. However using matplotlib 2.0.2, sympy 1.0 and python 3.5.3 works just fine. Note that I am using different computers, but fresh virtual environments every time. So there should be no other issues here. I suggest upgrading to python 3.5.x.
In the future please provide a "minimal" working example which reproduces your error, for example:
import sympy as sym
u = sym.symbols('u')
sym.plotting.plot(sym.sin(u), (u,-5,5))
EDIT: There was a difference between the 2 computers: one used the qt4agg backend (did not work), the other used tkagg (does work). So there seems to be a problem regarding which backend you use with sympy and matplotlib.
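If the backend is indeed the culprit, a minimal sketch (assuming the TkAgg backend is available on your machine) is to force it before anything creates a figure:
import matplotlib
matplotlib.use('TkAgg')   # must run before pyplot / sympy.plotting opens a figure

import sympy as sym
u = sym.symbols('u')
sym.plotting.plot(sym.sin(u), (u, -5, 5))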

Float error while attempting to use the bisect optimizer within scipy

I’m having trouble using the bisect optimizer within scipy. Here are the relevant portions of my code:
How I’m importing things
import numpy as np
import scipy.optimize as sp
import matplotlib.pyplot as plt
Break in code, section causing errors below
# All variables are previously defined except for h
def BeamHeight(h):
    x = 1000e3*M[i]*h/(fw*h^3-(fw-wt)(h-2*ft)^3) - Max_stress_steel
    return x

for i in range(0, 50):
    h = np.zeros((50))
    h[i] = sp.bisect(BeamHeight, hb, 5, xtol=0.001)
Causing this error:
Traceback (most recent call last):
File "ShearMoment.py", line 63, in <module>
h[i] = sp.bisect(BeamHeight, hb, 5,xtol = 0.001)
File "/usr/lib/python2.7/dist-packages/scipy/optimize/zeros.py", line 248, in bisect
r = _zeros._bisect(f,a,b,xtol,rtol,maxiter,args,full_output,disp)
File "ShearMoment.py", line 58, in BeamHeight
x = 1000e3*M[i]*h/(fw*h^3-(fw-wt)(h-2*ft)^3) - Max_stress_steel
TypeError: 'float' object is not callable
I understand that scipy.optimize expects a function as one of its arguments. Am I doing this incorrectly?
In Python, juxtaposition is not multiplication (so (fw-wt)(h-2*ft) is parsed as a function call), and ^ is bitwise XOR, not exponentiation. Multiplication must be made explicit with *, and exponentiation must be written as **. This part of BeamHeight:
fw*h^3-(fw-wt)(h-2*ft)^3
must be written as
fw*h**3-(fw-wt)*(h-2*ft)**3
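Putting that together, a sketch of the corrected root-finding loop (M, fw, wt, ft, hb and Max_stress_steel are assumed to be defined elsewhere, as in the question; allocating h once before the loop is also an assumption about the intent):
import numpy as np
import scipy.optimize as sp

def BeamHeight(h):
    # explicit * for multiplication, ** for exponentiation
    return 1000e3*M[i]*h / (fw*h**3 - (fw - wt)*(h - 2*ft)**3) - Max_stress_steel

h = np.zeros(50)
for i in range(50):
    h[i] = sp.bisect(BeamHeight, hb, 5, xtol=0.001)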

Classifier.fit for oneclassSVM complaining about float Type. TypeError float is required

I'm trying to fit two One-Class SVMs to small sets of data. These sets of data are called m1 and m2 respectively. m1 and m2 are lists of decimals which are converted to numpy arrays of floats, t1 and t2.
When I attempt to fit the one-class SVMs to these sets of data, I see errors saying that the fit function will only accept a float. Can someone help me fix this problem?
Example Values:
m1 =[0.020000000000000018, 0.22799999999999998, 0.15799999999999992, 0.18999999999999995, 0.264]
m2 = [0.1279999999999999, 0.07400000000000007, 0.75, 1.0, 1.0]
Code below:
classifier1 =sklearn.svm.OneClassSVM(kernel='linear', nu ='0.5',gamma ='auto')
classifier2 = sklearn.svm.OneClassSVM(kernel='linear', nu ='0.5',gamma='auto')
for x in xrange(len(m1)):
print" Iteration "+str(x)
t1.append(float(m1[x]))
t2.append(float(m2[x]))
tx = np.array(t1).astype(float)
ty = np.array(t2).astype(float)
t1 = np.r_[tx+1.0,tx-1.0]
t2 = np.r_[ty+1.0,ty-1.0]
print t1
print t2
clfit1 = classifier1.fit(t1.astype(float))
clfit2 = classifier2.fit(t2.astype(float))
Error on commandline:
/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
DeprecationWarning)
Traceback (most recent call last):
File "normalize_data.py", line 108, in <module>
main()
File "normalize_data.py", line 15, in main
trainSVM(result1[0],yval1,result2[0],yval2,0.04)
File "normalize_data.py", line 99, in trainSVM
clfit1 = classifier1.fit(t1.astype(float))
File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/classes.py", line 1029, in fit
**params)
File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/base.py", line 193, in fit
fit(X, y, sample_weight, solver_type, kernel, random_seed=seed)
File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/base.py", line 251, in _dense_fit
max_iter=self.max_iter, random_seed=random_seed)
File "sklearn/svm/libsvm.pyx", line 59, in sklearn.svm.libsvm.fit (sklearn/svm/libsvm.c:1571)
TypeError: a float is required
I made an error and set nu as a string instead of a float.
Setting nu=0.05 fixes the problem.
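A minimal sketch of the corrected fit (the reshape to a single-feature 2-D array is an extra assumption to address the 1-d array deprecation warning shown above, not part of the original answer):
import numpy as np
import sklearn.svm

# nu must be a float in (0, 1], not a string
classifier1 = sklearn.svm.OneClassSVM(kernel='linear', nu=0.05, gamma='auto')

t1 = np.asarray(m1, dtype=float)
t1 = np.r_[t1 + 1.0, t1 - 1.0]
clfit1 = classifier1.fit(t1.reshape(-1, 1))   # one feature per sample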
