PyMC2 and PyMC3 give different results...?


I'm trying to get a simple PyMC2 model working in PyMC3. I've gotten the model to run but the models give very different MAP estimates for the variables. Here is my PyMC2 model:
import numpy as np
import pymc
theta = pymc.Normal('theta', 0, .88)
X1 = pymc.Bernoulli('X2', p=pymc.Lambda('a', lambda theta=theta:1./(1+np.exp(-(theta-(-0.75))))), value=[1],observed=True)
X2 = pymc.Bernoulli('X3', p=pymc.Lambda('b', lambda theta=theta:1./(1+np.exp(-(theta-0)))), value=[1],observed=True)
model = pymc.Model([theta, X1, X2])
mcmc = pymc.MCMC(model)
mcmc.sample(iter=25000, burn=5000)
trace = (mcmc.trace('theta')[:])
print "\nThe MAP value for theta is", trace.sum()/len(trace)
That seems to work as expected. I had all sorts of trouble figuring out how to use the equivalent of the pymc.Lambda object in PyMC3. I eventually came across the Deterministic object. The following is my code:
import numpy as np
import pymc3
with pymc3.Model() as model:
    theta = pymc3.Normal('theta', 0, 0.88)
    X1 = pymc3.Bernoulli('X1', p=pymc3.Deterministic('b', 1./(1+np.exp(-(theta-(-0.75))))), observed=[1])
    X2 = pymc3.Bernoulli('X2', p=pymc3.Deterministic('c', 1./(1+np.exp(-(theta-(0))))), observed=[1])
    start=pymc3.find_MAP()
    step=pymc3.NUTS(state=start)
    trace = pymc3.sample(20000, step, njobs=1, progressbar=True)
pymc3.traceplot(trace)
The problem I'm having is that my MAP estimate for theta using PyMC2 is ~0.68 (correct), while the estimate PyMC3 gives is ~0.26 (incorrect). I suspect this has something to do with the way I'm defining the deterministic function. PyMC3 won't let me use a lambda function, so I just have to write the expression in-line. When I try to use lambda theta=theta:... I get this error:
AsTensorError: ('Cannot convert <function <lambda> at 0x157323e60> to TensorType', <type 'function'>)
Something to do with Theano?? Any suggestions would be greatly appreciated!

It works when you use a theano tensor instead of a numpy function in your Deterministic.
import numpy as np
import pymc3
import theano.tensor as tt
with pymc3.Model() as model:
    theta = pymc3.Normal('theta', 0, 0.88)
    X1 = pymc3.Bernoulli('X1', p=pymc3.Deterministic('b', 1./(1+tt.exp(-(theta-(-0.75))))), observed=[1])
    X2 = pymc3.Bernoulli('X2', p=pymc3.Deterministic('c', 1./(1+tt.exp(-(theta-(0))))), observed=[1])
    start=pymc3.find_MAP()
    step=pymc3.NUTS(state=start)
    trace = pymc3.sample(20000, step, njobs=1, progressbar=True)
print "\nThe MAP value for theta is", np.median(trace['theta'])
pymc3.traceplot(trace);
Here's the output:

Just in case someone else has the same problem, I think I found an answer. After trying different sampling algorithms I found that:
find_MAP gave the incorrect answer
the NUTS sampler gave the incorrect answer
the Metropolis sampler gave the correct answer, yay!
I read somewhere else that the NUTS sampler doesn't work with Deterministic. I don't know why. Maybe that's the case with find_MAP too? But for now I'll stick with Metropolis.

Also, NUTS doesn't handle discrete variables. If you want to use NUTS, you have to split up the samplers:
step1 = pymc3.NUTS([theta])
step2 = pymc3.BinaryMetropolis([X1,X2])
trace = pymc3.sample(10000, [step1, step2], start)
EDIT:
Missed that 'b' and 'c' were defined inline. Removed them from the NUTS function call

The MAP value is not defined as the mean of a distribution, but as its maximum. With pymc2 you can find it with:
M = pymc.MAP(model)
M.fit()
theta.value
which returns array(0.6253614422469552)
This agrees with the MAP that you find with find_MAP in pymc3, which you call start:
{'theta': array(0.6253614811102668)}
The issue of which is a better sampler is a different one, and does not depend on the calculation of the MAP. The MAP calculation is an optimization.
See: https://pymc-devs.github.io/pymc/modelfitting.html#maximum-a-posteriori-estimates for pymc2.
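To make the distinction concrete, here is a minimal sketch that evaluates the unnormalized posterior of the model above on a grid and compares its maximum with its mean. It assumes that the second positional argument 0.88 is the precision tau (as in PyMC2's Normal). The grid bounds and names such as log_post are illustrative choices, not part of either library.

```python
import numpy as np

TAU = 0.88  # assumption: 0.88 is the precision (tau) of the Normal prior

def log_sigmoid(z):
    # log(1 / (1 + exp(-z))), written in a numerically stable form
    return -np.logaddexp(0.0, -z)

def log_post(theta):
    # Unnormalized log posterior: Normal prior plus two Bernoulli
    # observations equal to 1 with logistic success probabilities.
    log_prior = -0.5 * TAU * theta ** 2
    log_lik = log_sigmoid(theta + 0.75) + log_sigmoid(theta)
    return log_prior + log_lik

theta = np.linspace(-8.0, 8.0, 32001)   # grid spacing 0.0005
weights = np.exp(log_post(theta))
weights /= weights.sum()                # normalize on the grid

post_mode = theta[np.argmax(weights)]   # MAP: location of the maximum
post_mean = np.sum(theta * weights)     # what averaging a trace estimates
print("mode:", round(post_mode, 3), "mean:", round(post_mean, 3))
```

Under that assumption the mode should land near the 0.625 reported by pymc.MAP and find_MAP, while the mean comes out somewhat larger because the posterior is right-skewed. That would be consistent with the ~0.68 obtained by averaging the PyMC2 trace.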

Related

pm.Simulator not accepting single parameter vector function

I'm trying to do some approximate Bayesian computing, and am able to use the pm.Simulator class to estimate functions with 2 or more parameters (where each parameter is actually an array of multiple values). However, when I try to estimate values of a single parameter function, I get an error.
The simplest working example (loosely based on the actual code):
# 2 parameter pm.Simulator snippet that *works*
import pymc3 as pm
import numpy as np
def get_mean_sig2(mu,sigma):
    multi_var = np.random.normal(mu,sigma)
    return multi_var
# create the observed data
obs2 = get_mean_sig2(np.array([10,5,2,1]), np.array([0.5,1,2,1]))
with pm.Model() as m91:
    mu = pm.Uniform('mu', lower=1, upper=15, shape=obs2.shape[0])
    sigma = pm.Uniform('sigma',lower=0.25, upper=3,shape=obs2.shape[0])
    sim = pm.Simulator('sim', get_mean_sig2,params=(mu,sigma),observed=obs2)
    jj = pm.sample_smc(kernel='ABC')
When I remove the 'sigma' parameter, and simplify the problem to only estimating the mean with this code:
# 1 parameter pm.Simulator snippet that doesn't work
def get_only_mean(mu):
    multi_var = np.random.normal(mu,0.2)
    return multi_var
obs = get_only_mean(np.array([10,5,2,1]))
with pm.Model() as m90:
    mu = pm.Uniform('mu', lower=1, upper=15, shape=obs.shape[0])
    sim = pm.Simulator('sim', get_only_mean,params=(mu),observed=obs)
    jj = pm.sample_smc(kernel='ABC')
I get the error message ValueError: Length of mu ~ Uniform cannot be determined. I have tried variations such as shape=(1,obs.shape[0]) or manually setting shape=4 for the 'shape' parameter, but they failed as well.
I'm unable to understand why this problem suddenly appears - any help would be appreciated.
My environment/system config is:
OS: Linux Mint 19.2
Python 3.8.5
numpy 1.19.5
pymc3 3.11.0
theano 1.1.0

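The error disappears when the variables are passed in a list.
For the single-parameter example, using params=[mu] instead of params=(mu) solves the issue. The reason is that (mu) is not a tuple at all: parentheses without a trailing comma only group an expression, so params=(mu) passes the bare variable mu rather than a one-element sequence.
A list is a valid data type for multi-parameter situations too, e.g. params=(mu, sigma) is equivalent to params=[mu, sigma].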
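The difference between the failing and the working call comes down to plain Python syntax, which the following short sketch illustrates. The list bound to mu is only a stand-in for the PyMC3 variable. Whether pm.Simulator also accepts the one-element tuple (mu,) is an assumption that was not tested here.

```python
# Parentheses alone do not create a tuple; the trailing comma does.
mu = [10, 5, 2, 1]      # hypothetical stand-in for the PyMC3 random variable

not_a_tuple = (mu)      # identical to writing plain `mu`
one_tuple = (mu,)       # a tuple holding exactly one element
as_list = [mu]          # a list holding exactly one element

print(not_a_tuple is mu)               # the very same object, not a container
print(type(one_tuple).__name__, len(one_tuple))
print(type(as_list).__name__, len(as_list))
```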

Problems using numpy.piecewise

1. The core problem and question
I will provide an executable example below, but let me first walk you through the problem.
I am using solve_ivp from scipy.integrate to solve an initial value problem (see documentation). In fact I have to call the solver twice, to once integrate forward and once backward in time. (I would have to go unnecessarily deep into my concrete problem to explain why this is necessary, but please trust me here--it is!)
sol0 = solve_ivp(rhs,[0,-1e8],y0,rtol=10e-12,atol=10e-12,dense_output=True)
sol1 = solve_ivp(rhs,[0, 1e8],y0,rtol=10e-12,atol=10e-12,dense_output=True)
Here rhs is the right hand side function of the initial value problem y'(t) = rhs(t,y). In my case, y has six components y[0] to y[5]. y0=y(0) is the initial condition. [0,±1e8] are the respective integration ranges, one forward and the other backward in time. rtol and atol are tolerances.
Importantly, you see that I flagged dense_output=True, which means that the solver does not only return the solutions on the numerical grids, but also as interpolation functions sol0.sol(t) and sol1.sol(t).
My main goal now is to define a piecewise function, say sol(t) which takes the value sol0.sol(t) for t<0 and the value sol1.sol(t) for t>=0. So the main question is: How do I do that?
I thought that numpy.piecewise should be the tool of choice to do this for me. But I am having trouble using it, as you will see below, where I show you what I have tried so far.
2. Example code
The code in the box below solves the initial value problem of my example. Most of the code is the definition of the rhs function, the details of which are not important to the question.
import numpy as np
from scipy.integrate import solve_ivp
# aux definitions and constants
sin=np.sin; cos=np.cos; tan=np.tan; sqrt=np.sqrt; pi=np.pi;
c = 299792458
Gm = 5.655090674872875e26
# define right hand side function of initial value problem, y'(t) = rhs(t,y)
def rhs(t,y):
    p,e,i,Om,om,f = y
    sinf=np.sin(f); cosf=np.cos(f); Q=sqrt(p/Gm); opecf=1+e*cosf;
    R = Gm**2/(c**2*p**3)*opecf**2*(3*(e**2 + 1) + 2*e*cosf - 4*e**2*cosf**2)
    S = Gm**2/(c**2*p**3)*4*opecf**3*e*sinf
    rhs = np.zeros(6)
    rhs[0] = 2*sqrt(p**3/Gm)/opecf*S
    rhs[1] = Q*(sinf*R + (2*cosf + e*(1 + cosf**2))/opecf*S)
    rhs[2] = 0
    rhs[3] = 0
    rhs[4] = Q/e*(-cosf*R + (2 + e*cosf)/opecf*sinf*S)
    rhs[5] = sqrt(Gm/p**3)*opecf**2 + Q/e*(cosf*R - (2 + e*cosf)/opecf*sinf*S)
    return rhs
# define initial values, y0
y0=[3.3578528933149297e13,0.8846,2.34921,3.98284,1.15715,0]
# integrate twice from t = 0, once backward in time (sol0) and once forward in time (sol1)
sol0 = solve_ivp(rhs,[0,-1e8],y0,rtol=10e-12,atol=10e-12,dense_output=True)
sol1 = solve_ivp(rhs,[0, 1e8],y0,rtol=10e-12,atol=10e-12,dense_output=True)
The solution functions can be addressed from here by sol0.sol and sol1.sol respectively. As an example, let's plot the 4th component:
from matplotlib import pyplot as plt
t0 = np.linspace(-1,0,500)*1e8
t1 = np.linspace( 0,1,500)*1e8
plt.plot(t0,sol0.sol(t0)[4])
plt.plot(t1,sol1.sol(t1)[4])
plt.title('plot 1')
plt.show()
3. Failing attempts to build piecewise function
3.1 Build vector valued piecewise function directly out of sol0.sol and sol1.sol
def sol(t): return np.piecewise(t,[t<0,t>=0],[sol0.sol,sol1.sol])
t = np.linspace(-1,1,1000)*1e8
print(sol(t))
This leads to the following error in piecewise in line 628 of .../numpy/lib/function_base.py:
TypeError: NumPy boolean array indexing assignment requires a 0 or 1-dimensional input, input has 2 dimensions
I am not sure, but I do think this is because of the following: In the documentation of piecewise it says about the third argument:
funclist : list of callables, f(x,*args,**kw), or scalars
[...]. It should take a 1d array as input and give an 1d array or a scalar value as output. [...].
I suppose the problem is that the solution in my case has six components. Hence, evaluated on a time grid, the output would be a 2d array. Can someone confirm that this is indeed the problem? I think this really limits the usefulness of piecewise by a lot.
3.2 Try the same, but just for one component (e.g. for the 4th)
def sol4(t): return np.piecewise(t,[t<0,t>=0],[sol0.sol(t)[4],sol1.sol(t)[4]])
t = np.linspace(-1,1,1000)*1e8
print(sol4(t))
This results in this error in line 624 of the same file as above:
ValueError: NumPy boolean array indexing assignment cannot assign 1000 input values to the 500 output values where the mask is true
Contrary to the previous error, unfortunately here I have so far no idea why it is not working.
3.3 Similar attempt, however first defining functions for the 4th components
def sol40(t): return sol0.sol(t)[4]
def sol41(t): return sol1.sol(t)[4]
def sol4(t): return np.piecewise(t,[t<0,t>=0],[sol40,sol41])
t = np.linspace(-1,1,1000)
plt.plot(t,sol4(t))
plt.title('plot 2')
plt.show()
Now this does not result in an error, and I can produce a plot, however this plot doesn't look like it should. It should look like plot 1 above. Also here, I so far have no clue what is going on.
Am thankful for help!

You can take a look at the numpy.piecewise source code. There is nothing special in this function, so I suggest doing everything manually:
def sol(t):
    ans = np.empty((6, len(t)))
    ans[:, t<0] = sol0.sol(t[t<0])
    ans[:, t>=0] = sol1.sol(t[t>=0])
    return ans
Regarding your failed attempts: yes, piecewise expects each function to return a 1d array. Your second attempt failed because, according to the documentation, funclist should be a list of functions or scalars, but you passed a list of arrays. Contrary to the documentation, arrays do work, provided each array has as many elements as there are True entries in the corresponding condition (t < 0 or t >= 0), like:
def sol4(t): return np.piecewise(t,[t<0,t>=0],[sol0.sol(t[t<0])[4],sol1.sol(t[t>=0])[4]])
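The mask-based approach can be tried without running the integrator. The sketch below uses two hypothetical stand-in callables, left and right, in place of sol0.sol and sol1.sol; like the dense-output functions, each maps a 1d time array of length n to an array of shape (components, n). Only two components are used to keep it short.

```python
import numpy as np

# Hypothetical stand-ins for sol0.sol and sol1.sol.
def left(t):
    return np.vstack([np.sin(t), -t])

def right(t):
    return np.vstack([np.cos(t), t ** 2])

def combined(t):
    """Evaluate left() where t < 0 and right() where t >= 0."""
    t = np.asarray(t, dtype=float)
    out = np.empty((2, t.size))
    neg = t < 0
    out[:, neg] = left(t[neg])      # fill the columns belonging to t < 0
    out[:, ~neg] = right(t[~neg])   # fill the remaining columns
    return out

t = np.linspace(-1.0, 1.0, 5)       # -1, -0.5, 0, 0.5, 1
values = combined(t)
print(values.shape)
```

Each callable is only evaluated on its own part of the grid, so the shapes on both sides of every assignment match. Separately, the grid in attempt 3.3 is built without the 1e8 factor used for plot 1, which may explain why plot 2 looks different.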

Acceptance-rate in PyMC3 (Metropolis-Hastings)

Does anyone know how I can see the final acceptance-rate in PyMC3 (Metropolis-Hastings) ? Or in general, how can I see all the information that pymc3.sample() returns ?
Thanks

Given an example, first, set up the model:
import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm3
from scipy.stats import norm
sigma = 3 # Note this is the std of our data
data = norm(10,sigma).rvs(100)
mu_prior = 8
sigma_prior = 1.5 # Note this is our prior on the std of mu
plt.hist(data,bins=20)
plt.show()
basic_model = pm3.Model()
with basic_model:
    # Priors for unknown model parameters
    mu = pm3.Normal('Mean of Data',mu_prior,sigma_prior)
    # Likelihood (sampling distribution) of observations
    data_in = pm3.Normal('Y_obs', mu=mu, sd=sigma, observed=data)
Second, perform the simulation:
chain_length = 10000
with basic_model:
    # obtain starting values via MAP
    startvals = pm3.find_MAP(model=basic_model)
    # instantiate sampler
    step = pm3.Metropolis()
    # draw chain_length posterior samples
    trace = pm3.sample(chain_length, step=step, start=startvals)
Using the above example, the acceptance rate can be calculated this way:
accept = np.sum(trace['Mean of Data'][1:] != trace['Mean of Data'][:-1])
print("Acceptance Rate: ", accept/trace['Mean of Data'].shape[0])
(I found this solution in an online tutorial, but I don't quite understand it.)
Reference: Introduction to PyMC3
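The reasoning behind the trace-based formula is that a rejected Metropolis proposal repeats the previous sample, so consecutive samples differ only when a proposal was accepted. The following is a minimal sketch with a hand-written random-walk Metropolis sampler (not PyMC3 code; the target, step size and names are illustrative assumptions) that counts acceptances directly and compares the count with the trace-based estimate.

```python
import math
import random

random.seed(0)

def log_target(x):
    # Standard normal density up to an additive constant
    return -0.5 * x * x

n_steps = 5000
x = 0.0
trace = []
n_accepted = 0
for _ in range(n_steps):
    proposal = x + random.gauss(0.0, 1.0)          # random-walk proposal
    log_ratio = log_target(proposal) - log_target(x)
    if random.random() < math.exp(min(0.0, log_ratio)):
        x = proposal                               # accept: the chain moves
        n_accepted += 1
    trace.append(x)                                # reject: value is repeated

# Same idea as comparing trace[1:] with trace[:-1] above
changes = sum(a != b for a, b in zip(trace[1:], trace[:-1]))
print("counted :", n_accepted / n_steps)
print("from trace:", changes / len(trace))
```

The two numbers can differ by at most one acceptance, because the trace does not record whether the very first proposal moved away from the starting value.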

I checked for the NUTS algorithm, and found the solution from here pymc3 forum.
trace.mean_tree_accept.mean()

Let step = pymc3.Metropolis() be our sampler, we can get the final acceptance-rate through
"step.accepted"
Just for beginners (pymc3) like myself, after each variable/obj. put a "." and hit the tab key; you will see some interesting suggestions ;)

Defining a custom PyMC distribution

This is perhaps a silly question.
I'm trying to fit data to a very strange PDF using MCMC evaluation in PyMC. For this example I just want to figure out how to fit to a normal distribution where I manually input the normal PDF. My code is:
import random
import numpy as np
import pymc as mc
data = []
for count in range(1000): data.append(random.gauss(-200,15))
mean = mc.Uniform('mean', lower=min(data), upper=max(data))
std_dev = mc.Uniform('std_dev', lower=0, upper=50)
# @mc.potential
# def density(x = data, mu = mean, sigma = std_dev):
#     return (1./(sigma*np.sqrt(2*np.pi))*np.exp(-((x-mu)**2/(2*sigma**2))))
mc.Normal('process', mu=mean, tau=1./std_dev**2, value=data, observed=True)
model = mc.MCMC([mean,std_dev])
model.sample(iter=5000)
print "!"
print(model.stats()['mean']['mean'])
print(model.stats()['std_dev']['mean'])
The examples I've found all use something like mc.Normal, or mc.Poisson or whatnot, but I want to fit to the commented out density function.
Any help would be appreciated.

An easy way is to use the stochastic decorator:
import pymc as mc
import numpy as np
data = np.random.normal(-200,15,size=1000)
mean = mc.Uniform('mean', lower=min(data), upper=max(data))
std_dev = mc.Uniform('std_dev', lower=0, upper=50)
@mc.stochastic(observed=True)
def custom_stochastic(value=data, mean=mean, std_dev=std_dev):
    return np.sum(-np.log(std_dev) - 0.5*np.log(2) -
                  0.5*np.log(np.pi) -
                  (value-mean)**2 / (2*(std_dev**2)))
model = mc.MCMC([mean,std_dev,custom_stochastic])
model.sample(iter=5000)
print "!"
print(model.stats()['mean']['mean'])
print(model.stats()['std_dev']['mean'])
Note that my custom_stochastic function returns the log likelihood, not the likelihood, and that it is the log likelihood for the entire sample.
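As a quick sanity check of that statement, the sketch below compares the closed-form log likelihood used in custom_stochastic with the sum of the logs of the density function from the question. It uses plain NumPy without PyMC; the parameter values -190 and 20 and the seeded generator are arbitrary choices for illustration.

```python
import numpy as np

def density(x, mu, sigma):
    # Normal PDF as written (commented out) in the question
    return 1.0 / (sigma * np.sqrt(2 * np.pi)) * np.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

def log_likelihood(value, mean, std_dev):
    # Same expression as the body of custom_stochastic
    return np.sum(-np.log(std_dev) - 0.5 * np.log(2) -
                  0.5 * np.log(np.pi) -
                  (value - mean) ** 2 / (2 * (std_dev ** 2)))

rng = np.random.default_rng(0)
data = rng.normal(-200, 15, size=1000)

direct = np.sum(np.log(density(data, -190.0, 20.0)))  # log of the product of PDFs
closed = log_likelihood(data, -190.0, 20.0)           # closed-form log likelihood
print(np.isclose(direct, closed))
```

For a density that is awkward to simplify by hand, taking np.log of the density and summing is one way to build the return value, although densities that underflow to zero would need more care.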
There are a few other ways to create custom stochastic nodes. This doc gives more details, and this gist contains an example using pymc.Stochastic to create a node with a kernel density estimator.

How can I obtain segmented linear regressions with a priori breakpoints?

I need to explain this in excruciating detail because I don't have the basics of statistics to explain in a more succinct way. Asking here in SO because I am looking for a python solution, but might go to stats.SE if more appropriate.
I have downhole well data, it might be a bit like this:
Rt T
0.0000 15.0000
4.0054 15.4523
25.1858 16.0761
27.9998 16.2013
35.7259 16.5914
39.0769 16.8777
45.1805 17.3545
45.6717 17.3877
48.3419 17.5307
51.5661 17.7079
64.1578 18.4177
66.8280 18.5750
111.1613 19.8261
114.2518 19.9731
121.8681 20.4074
146.0591 21.2622
148.8134 21.4117
164.6219 22.1776
176.5220 23.4835
177.9578 23.6738
180.8773 23.9973
187.1846 24.4976
210.5131 25.7585
211.4830 26.0231
230.2598 28.5495
262.3549 30.8602
266.2318 31.3067
303.3181 37.3183
329.4067 39.2858
335.0262 39.4731
337.8323 39.6756
343.1142 39.9271
352.2322 40.6634
367.8386 42.3641
380.0900 43.9158
388.5412 44.1891
390.4162 44.3563
395.6409 44.5837
(the Rt variable can be considered a proxy for depth, and T is temperature). I also have 'a priori' data giving me the temperature at Rt=0 and, not shown, some markers that I can use as breakpoints, guides to breakpoints, or at least compare to any discovered breakpoints.
The linear relationship of these two variables is in some depth intervals affected by some processes. A simple linear regression is
q, T0, r_value, p_value, std_err = stats.linregress(Rt, T)
and looks like this, where you can see the deviations clearly, and the poor fit for T0 (which should be 15):
I want to be able to perform a series of linear regressions (joining at ends of each segment), but I want to do it:
(a) by NOT specifying the number or locations of breaks,
(b) by specifying the number and location of breaks, and
(c) calculate the coefficients for each segment
I think I can do (b) and (c) by just splitting the data up and doing each bit separately with a bit of care, but I don't know about (a), and wonder if there's a way someone knows this can be done more simply.
I have seen this: https://stats.stackexchange.com/a/20210/9311, and I think MARS might be a good way to deal with it, but that's just because it looks good; I don't really understand it. I tried it with my data in a blind cut'n'paste way and have the output below, but again, I don't understand it:

The short answer is that I solved my problem using R to create a linear regression model, and then used the segmented package to generate the piecewise linear regression from the linear model. I was able to specify the expected number of breakpoints (or knots) n as shown below using psi=NA and K=n.
The long answer is:
R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)
# example data:
bullard <- structure(list(Rt = c(5.1861, 10.5266, 11.6688, 19.2345, 59.2882,
68.6889, 320.6442, 340.4545, 479.3034, 482.6092, 484.048, 485.7009,
486.4204, 488.1337, 489.5725, 491.2254, 492.3676, 493.2297, 494.3719,
495.2339, 496.3762, 499.6819, 500.253, 501.1151, 504.5417, 505.4038,
507.6278, 508.4899, 509.6321, 522.1321, 524.4165, 527.0027, 529.2871,
531.8733, 533.0155, 544.6534, 547.9592, 551.4075, 553.0604, 556.9397,
558.5926, 561.1788, 562.321, 563.1831, 563.7542, 565.0473, 566.1895,
572.801, 573.9432, 575.6674, 576.2385, 577.1006, 586.2382, 587.5313,
589.2446, 590.1067, 593.4125, 594.5547, 595.8478, 596.99, 598.7141,
599.8563, 600.2873, 603.1429, 604.0049, 604.576, 605.8691, 607.0113,
610.0286, 614.0263, 617.3321, 624.7564, 626.4805, 628.1334, 630.9889,
631.851, 636.4198, 638.0727, 638.5038, 639.646, 644.8184, 647.1028,
647.9649, 649.1071, 649.5381, 650.6803, 651.5424, 652.6846, 654.3375,
656.0508, 658.2059, 659.9193, 661.2124, 662.3546, 664.0787, 664.6498,
665.9429, 682.4782, 731.3561, 734.6619, 778.1154, 787.2919, 803.9261,
814.335, 848.1552, 898.2568, 912.6188, 924.6932, 940.9083), Tem = c(12.7813,
12.9341, 12.9163, 14.6367, 15.6235, 15.9454, 27.7281, 28.4951,
34.7237, 34.8028, 34.8841, 34.9175, 34.9618, 35.087, 35.1581,
35.204, 35.2824, 35.3751, 35.4615, 35.5567, 35.6494, 35.7464,
35.8007, 35.8951, 36.2097, 36.3225, 36.4435, 36.5458, 36.6758,
38.5766, 38.8014, 39.1435, 39.3543, 39.6769, 39.786, 41.0773,
41.155, 41.4648, 41.5047, 41.8333, 41.8819, 42.111, 42.1904,
42.2751, 42.3316, 42.4573, 42.5571, 42.7591, 42.8758, 43.0994,
43.1605, 43.2751, 44.3113, 44.502, 44.704, 44.8372, 44.9648,
45.104, 45.3173, 45.4562, 45.7358, 45.8809, 45.9543, 46.3093,
46.4571, 46.5263, 46.7352, 46.8716, 47.3605, 47.8788, 48.0124,
48.9564, 49.2635, 49.3216, 49.6884, 49.8318, 50.3981, 50.4609,
50.5309, 50.6636, 51.4257, 51.6715, 51.7854, 51.9082, 51.9701,
52.0924, 52.2088, 52.3334, 52.3839, 52.5518, 52.844, 53.0192,
53.1816, 53.2734, 53.5312, 53.5609, 53.6907, 55.2449, 57.8091,
57.8523, 59.6843, 60.0675, 60.8166, 61.3004, 63.2003, 66.456,
67.4, 68.2014, 69.3065)), .Names = c("Rt", "Tem"), class = "data.frame", row.names = c(NA,
-109L))
library(segmented) # Version: segmented_0.2-9.4
# create a linear model
out.lm <- lm(Tem ~ Rt, data = bullard)
# Set X breakpoints: Set psi=NA and K=n:
o <- segmented(out.lm, seg.Z=~Rt, psi=NA, control=seg.control(display=FALSE, K=3))
slope(o) # defaults to confidence level of 0.95 (conf.level=0.95)
# Trickery for placing text labels
r <- o$rangeZ[, 1]
est.psi <- o$psi[, 2]
v <- sort(c(r, est.psi))
xCoord <- rowMeans(cbind(v[-length(v)], v[-1]))
Z <- o$model[, o$nameUV$Z]
id <- sapply(xCoord, function(x) which.min(abs(x - Z)))
yCoord <- broken.line(o)[id]
# create the segmented plot, add linear regression for comparison, and text labels
plot(o, lwd=2, col=2:6, main="Segmented regression", res=TRUE)
abline(out.lm, col="red", lwd=1, lty=2) # dashed line for linear regression
text(xCoord, yCoord,
labels=formatC(slope(o)[[1]][, 1] * 1000, digits=1, format="f"),
pos = 4, cex = 1.3)
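For cases (b) and (c) of the question, where the breakpoints are known in advance, a continuous piecewise-linear fit can also be sketched in Python with ordinary least squares. This is not a port of the segmented package: it is a minimal sketch that assumes fixed breakpoints and uses one hinge term max(0, x - b) per breakpoint. The function name fit_segments and the synthetic data are made up for illustration.

```python
import numpy as np

def fit_segments(x, y, breakpoints):
    """Least-squares fit of a continuous piecewise-linear function.

    Returns the intercept at x = 0 and the slope of each segment.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix: intercept, x, and one hinge column per breakpoint
    columns = [np.ones_like(x), x]
    columns += [np.maximum(0.0, x - b) for b in breakpoints]
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    intercept = coef[0]
    # Each hinge coefficient is the change in slope at its breakpoint
    slopes = np.cumsum(coef[1:])
    return intercept, slopes

# Synthetic, noise-free data with breaks at x = 10 and x = 20
x = np.linspace(0, 30, 31)
y = 15 + 0.1 * x + 0.4 * np.maximum(0, x - 10) - 0.2 * np.maximum(0, x - 20)

intercept, slopes = fit_segments(x, y, [10, 20])
print(intercept, slopes)
```

Because the segments share their end points by construction, the fitted line has no jumps at the breakpoints. If the temperature at Rt=0 is known, one option would be to subtract it from T and drop the intercept column.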

What you want is technically called spline interpolation, particularly order-1 spline interpolation (which would join straight line segments; order-2 joins parabolas, etc).
There is already a question here on Stack Overflow dealing with Spline Interpolation in Python, which will help you on your question. Here's the link. Post back if you have further questions after trying those tips.

A very simple method (not iterative, no initial guess, no bounds to specify) is given on pages 30-31 of the paper: https://fr.scribd.com/document/380941024/Regression-par-morceaux-Piecewise-Regression-pdf . The result is:
NOTE: The method is based on fitting an integral equation. The present example is not a favourable case because the abscissae of the points are far from regularly distributed (there are no points in large ranges), which makes the numerical integration less accurate. Nevertheless, the piecewise fit is surprisingly not bad.
