Converting a mixture of gaussians to PyMC3

Converting a mixture of gaussians to PyMC3 - python

I am trying to learn PyMC3, I want to make a simple mixture of gaussians example. I found this example and want to convert it to pymc3 but I'm currently getting an error when trying to plot the traceplot.
n1 = 500
n2 = 200
n = n1+n2
mean1 = 21.8
mean2 = 42.0
precision = 0.1
sigma = np.sqrt(1 / precision)
# precision = 1/sigma^2
print "sigma1: %s" % sigma1
print "sigma2: %s" % sigma2
data1 = np.random.normal(mean1,sigma,n1)
data2 = np.random.normal(mean2,sigma,n2)
data = np.concatenate([data1 , data2])
#np.random.shuffle(data)
fig = plt.figure(figsize=(7, 7))
ax = fig.add_subplot(111, xlabel='x', ylabel='y', title='mixture of 2 guassians')
ax.plot(range(0,n1+n2), data, 'x', label='data')
plt.legend(loc=0)
with pm.Model() as model:
#priors
p = pm.Uniform( "p", 0 , 1) #this is the fraction that come from mean1 vs mean2
ber = pm.Bernoulli( "ber", p = p) # produces 1 with proportion p.
precision = pm.Gamma('precision', alpha=0.1, beta=0.1)
mean1 = pm.Normal( "mean1", 0, 0.01 ) #better to use normals versus Uniforms (unless you are certain the value is truncated at 0 and 200
mean2 = pm.Normal( "mean2", 0, 0.01 )
mean = pm.Deterministic('mean', ber*mean1 + (1-ber)*mean2)
process = pm.Normal('process', mu=mean, tau=precision, observed=data)
# inference
step = pm.Metropolis()
trace = pm.sample(10000, step)
pm.traceplot(trace)
Error:
sigma1: 3.16227766017
sigma2: 1.69030850946
[-----------------100%-----------------] 10000 of 10000 complete in 4.4 sec
---------------------------------------------------------------------------
LinAlgError Traceback (most recent call last)
<ipython-input-10-eb728824de83> in <module>()
44 step = pm.Metropolis()
45 trace = pm.sample(10000, step)
---> 46 pm.traceplot(trace)
/usr/lib/python2.7/site-packages/pymc-3.0-py2.7.egg/pymc/plots.pyc in traceplot(trace, vars, figsize, lines, combined, grid)
70 ax[i, 0].set_xlim(mind - .5, maxd + .5)
71 else:
---> 72 kdeplot_op(ax[i, 0], d)
73 ax[i, 0].set_title(str(v))
74 ax[i, 0].grid(grid)
/usr/lib/python2.7/site-packages/pymc-3.0-py2.7.egg/pymc/plots.pyc in kdeplot_op(ax, data)
94 for i in range(data.shape[1]):
95 d = data[:, i]
---> 96 density = kde.gaussian_kde(d)
97 l = np.min(d)
98 u = np.max(d)
/usr/lib64/python2.7/site-packages/scipy/stats/kde.pyc in __init__(self, dataset, bw_method)
186
187 self.d, self.n = self.dataset.shape
--> 188 self.set_bandwidth(bw_method=bw_method)
189
190 def evaluate(self, points):
/usr/lib64/python2.7/site-packages/scipy/stats/kde.pyc in set_bandwidth(self, bw_method)
496 raise ValueError(msg)
497
--> 498 self._compute_covariance()
499
500 def _compute_covariance(self):
/usr/lib64/python2.7/site-packages/scipy/stats/kde.pyc in _compute_covariance(self)
507 self._data_covariance = atleast_2d(np.cov(self.dataset, rowvar=1,
508 bias=False))
--> 509 self._data_inv_cov = linalg.inv(self._data_covariance)
510
511 self.covariance = self._data_covariance * self.factor**2
/usr/lib64/python2.7/site-packages/scipy/linalg/basic.pyc in inv(a, overwrite_a, check_finite)
381 inv_a, info = getri(lu, piv, lwork=lwork, overwrite_lu=1)
382 if info > 0:
--> 383 raise LinAlgError("singular matrix")
384 if info < 0:
385 raise ValueError('illegal value in %d-th argument of internal '
LinAlgError: singular matrix

Thanks to Fonnesbeck for answering this on the github issue tracker:
https://github.com/pymc-devs/pymc3/issues/452
here is the updated code:
with pm.Model() as model:
#priors
p = pm.Uniform( "p", 0 , 1) #this is the fraction that come from mean1 vs mean2
ber = pm.Bernoulli( "ber", p = p, shape=len(data)) # produces 1 with proportion p.
sigma = pm.Uniform('sigma', 0, 100)
precision = sigma**-2
mean = pm.Normal( "mean", 0, 0.01, shape=2 )
mu = pm.Deterministic('mu', mean[ber])
process = pm.Normal('process', mu=mu, tau=precision, observed=data)
with model:
step1 = pm.Metropolis([p, sigma, mean])
step2 = pm.BinaryMetropolis([ber])
trace = pm.sample(10000, [step1, step2])
You need to use BinaryMetropolis when inferring a Bernoulli random variable

And an even simpler and quicker version is as follows:
with pm.Model() as model2:
p = pm.Beta( "p", 1., 1.)
means = pm.Uniform('mean', 15, 60, shape=2)
sigma = pm.Uniform('sigma', 0, 20, testval=5)
process = pm.NormalMixture('obs', tt.stack([p, 1-p]), means, sd=sigma, observed=data)
with model2:
step = pm.Metropolis()
trace = pm.sample(10000, step=step)

I know this issue is old, but I am trying differente examples of PyMC3 usages to get used to modeling in PyMC3. The answer as given above does not work in current version 1.0 of PyMC3 (It does not distringuish the two means correctly). The minimum changes I had to do in order to make it work were the following:
1)
# mean = pm.Normal("mean", 0, 0.01, shape=2 )
mean = pm.Uniform('mean', 15, 60, shape=2)
2)
# step2 = pm.BinaryMetropolis([ber])
step2 = pm.ElemwiseCategorical(vars=[ber], values=[0, 1])
Just in case anybody else is having a similar problem.

Related

Find complex roots for every e^(i phi) with 0 <= phi < 4pi

I want to find the complex roots for z1 = -0.9 and z2 = 0.3 but for every phi between 0 and 4pi.
phi = np.linspace(0, 4*np.pi, 400, endpoint=False)
e = np.exp(1j*phi)
z1 = [-0.9, -0.25, -0.99, -0.9405, -0.76, -1.019898, -1.00]
z2 = [0.3, 0.25, 0.11, 0.0495, 0.04, 0.000102, 1.00]
#Coefficients
P = [e, -e*(2*z1[0] + 2*z2[0]), e*(z1[0]**2 + z2[0]**2 + 4*z1[0]*z2[0] - 1), -e*((2*z1[0]**2 * z2[0]) + (2*z1[0]*z2[0]**2)), (z1[0]*z2[0])*(z1[0]*z2[0] + 1)]
#Output
ROOT = np.roots(P)
print(ROOT)
I'm getting the error:
/opt/conda/lib/python3.7/site-packages/numpy/core/shape_base.py:65: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated.
If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
ary = asanyarray(ary)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/tmp/ipykernel_17/4090299359.py in <module>
18 #Outputting the roots
19
---> 20 ROOT = np.roots(P)
21
22 print(ROOT)
<__array_function__ internals> in roots(*args, **kwargs)
/opt/conda/lib/python3.7/site-packages/numpy/lib/polynomial.py in roots(p)
232
233 # find non-zero array entries
--> 234 non_zero = NX.nonzero(NX.ravel(p))[0]
235
236 # Return an empty array if polynomial is all zeros
<__array_function__ internals> in nonzero(*args, **kwargs)
/opt/conda/lib/python3.7/site-packages/numpy/core/fromnumeric.py in nonzero(a)
1919
1920 """
-> 1921 return _wrapfunc(a, 'nonzero')
1922
1923
/opt/conda/lib/python3.7/site-packages/numpy/core/fromnumeric.py in _wrapfunc(obj, method, *args, **kwds)
55
56 try:
---> 57 return bound(*args, **kwds)
58 except TypeError:
59 # A TypeError occurs if the object does have such a method in its
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I also went and tried a.all() on 'phi', I'm not sure how to fix it so I just place it randomly hoping the error would go away:
phi = np.linspace(0, 4*np.pi, 400, endpoint=False).all()
And I only get 4 roots, instead of 4 roots for every phi. How can I fix this? Any actual help would be appreciated, thanks.

import numpy as np
phi = np.linspace(0, 4*np.pi, 400, endpoint=False)
e = np.exp(1j*phi)
z1 = [-0.9, -0.25, -0.99, -0.9405, -0.76, -1.019898, -1.00]
z2 = [0.3, 0.25, 0.11, 0.0495, 0.04, 0.000102, 1.00]
#Coefficients
P = [e,
-e*(2*z1[0] + 2*z2[0]),
e*(z1[0]**2 + z2[0]**2 + 4*z1[0]*z2[0] - 1),
-e*((2*z1[0]**2 * z2[0]) + (2*z1[0]*z2[0]**2)),
(z1[0]*z2[0])*(z1[0]*z2[0] + 1)
]
# the last element of "P" is just a float, but the others are arrays...
# Is this really so different?
#Output
n_phi = len(phi)
ROOT = np.zeros((n_phi, 4), dtype=complex)
# apply "np.roots" for each point, one by one
for i in range(n_phi):
# change new_P if the last element of "P" should be a array
new_P = [P[0][i], P[1][i], P[2][i], P[3][i], P[4]]
ROOT[i, :] = np.roots(new_P)
print(ROOT)
# For a more readable presentation you can use this:
# with np.printoptions(precision=2, linewidth=85):
# print(ROOT)

ValueError: supplied range of [-inf, -inf] is not finite

When trying to plot these graphs, the line:
sub3.hist(x=np.log(df[i]), bins = 100, color="grey")
Gives the error:
ValueError: supplied range of [-inf, -inf] is not finite.
I don't understand this error and can't find any explanations online. Here is the full code. df and df_norm are pandas dataframes with identical data, save for df_norm being minmax normalised.
tb = widgets.TabBar([str(c) for c in range(16)])
k = 0
for c in range(len(df_norm.columns)):
with tb.output_to(c, select=(c < 3)):
colours = ["orange", "green"]
fig = plt.figure(figsize=(20, 5))
plt.subplots_adjust(bottom = 0., left = 0, top = 1., right = 1)
p = 0
g = 1
for i in df_norm.columns[k:k+2]:
sub1 = fig.add_subplot(2,3,g)
sub1.hist(x=df[i], bins = 100, alpha=0.3, color=colours[p])
sub2 = fig.add_subplot(2,3,g+1)
sub2.hist(x=df_norm[i], bins = 100, alpha=0.3, color=colours[p])
sub3 = fig.add_subplot(2,3,g+2)
sub3.hist(x=np.log(df[i]), bins = 100, color="grey")
sub1.set_title(i)
sub2.set_title('title ' + i)
sub3.set_title('title ' + i)
sub1.set_ylabel('label')
p = p + 1
k = k + 1
g = g + 3
Edit, full stack trace:
ValueError Traceback (most recent call last)
<ipython-input-179-d6170fc0d99d> in <module>()
20 sub2.hist(x=df_norm[i], bins = 100, alpha=0.3, color=colours[p])
21 sub3 = fig.add_subplot(2,3,g+2) # two rows, two columns, second cell
---> 22 sub3.hist(x=np.log(df[i]), bins = 100, color="grey")
23 sub1.set_title(i)
24 sub2.set_title('title ' + i)
4 frames
<__array_function__ internals> in histogram(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/numpy/lib/histograms.py in _get_outer_edges(a, range)
314 if not (np.isfinite(first_edge) and np.isfinite(last_edge)):
315 raise ValueError(
--> 316 "supplied range of [{}, {}] is not finite".format(first_edge, last_edge))
317 elif a.size == 0:
318 # handle empty arrays. Can't determine range, so use 0-1.
ValueError: supplied range of [-inf, -inf] is not finite

Index issue whilst running fmin_l_bfgs

I am trying to use the fmin_l_bfgs function in python to maximize the log-likelihood function below:
def loglik(x0):
p = np.zeros((NCS,1)) #vector to hold the probabilities for each observation
data['v'] = (data.iloc[:, [3,4]]).dot(x0) #calculate determinstic utility
for i in range(NCS):
vv = data.v[(data.idcase == i + 1)]
vy = data.v[(data.idcase == i + 1) & (data.depvar == 1)]
p[i][0] = np.maximum(np.exp(vy)/ sum(np.exp(vv)),0.00000001)
#print("p", p)
ll = -sum(np.log(p)) #Negative since neg of ll is minimized
return ll
The input data being used is:
data = pd.read_csv("drive/My Drive/example_data.csv") #read data
data.iloc[:, [3,4]] = data.iloc[:, [3,4]]/100 #scale costs
B = np.zeros((1,2)) #give starting values of beta; 1xK vector; 2alternatives so 1x2 vector
NCS = data['idcase'].nunique() # number of choice situations in the dataset
x0 = B.T
estimation
optim2 = fmin_l_bfgs_b(loglik, x0, fprime=None, args=(), approx_grad=0, bounds=None, m=10, factr=10000000.0, pgtol=1e-05, epsilon=1e-08,iprint=0, maxfun=15000, maxiter=15000, disp=None, callback=None)
However, I keep getting this:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-77-2821f2269a8c> in <module>()
83 print('which is the same as maximizing the log-likelihood.')
84
---> 85 optim2 = fmin_l_bfgs_b(loglik, x0, fprime=None, args=(), approx_grad=0, bounds=None, m=10, factr=10000000.0, pgtol=1e-05, epsilon=1e-08, iprint=0, maxfun=15000, maxiter=15000, disp=None, callback=None)
86
87 print(optim2)
4 frames
/usr/local/lib/python3.6/dist-packages/scipy/optimize/optimize.py in __call__(self, x, *args)
64 self.x = numpy.asarray(x).copy()
65 fg = self.fun(x, *args)
---> 66 self.jac = fg[1]
67 return fg[0]
68
IndexError: index 1 is out of bounds for axis 0 with size 1#
Can someone kindly advise me as to what to do? I am quite new in using numerical optimization methods.
Thanks

Can't differentiate wrt numpy arrays of dtype int64?

I am a newbie to numpy. Today when I use it to work with linear regression, it shows as below:
KeyError Traceback (most recent call
last)
~/anaconda3/lib/python3.6/site-packages/autograd/numpy/numpy_extra.py
in new_array_node(value, tapes)
84 try:
---> 85 return array_dtype_mappings[value.dtype](value, tapes)
86 except KeyError:
KeyError: dtype('int64')
During handling of the above exception, another exception occurred:
TypeError Traceback (most recent call
last)
<ipython-input-4-aebe8f7987b0> in <module>()
24 return cost/float(np.size(y))
25
---> 26 weight_h, cost_h = gradient_descent(least_squares, alpha,
max_its, w)
27
28 # a)
<ipython-input-2-1b74c4f818f4> in gradient_descent(g, alpha, max_its,
w)
12 for k in range(max_its):
13 # evaluate the gradient
---> 14 grad_eval = gradient(w)
15
16 # take gradient descent step
~/anaconda3/lib/python3.6/site-packages/autograd/core.py in
gradfun(*args, **kwargs)
19 #attach_name_and_doc(fun, argnum, 'Gradient')
20 def gradfun(*args,**kwargs):
---> 21 return
backward_pass(*forward_pass(fun,args,kwargs,argnum))
22 return gradfun
23
~/anaconda3/lib/python3.6/site-packages/autograd/core.py in
forward_pass(fun, args, kwargs, argnum)
57 tape = CalculationTape()
58 arg_wrt = args[argnum]
---> 59 start_node = new_node(safe_type(getval(arg_wrt)),
[tape])
60 args = list(args)
61 args[argnum] = merge_tapes(start_node, arg_wrt)
~/anaconda3/lib/python3.6/site-packages/autograd/core.py in
new_node(value, tapes)
185 def new_node(value, tapes=[]):
186 try:
--> 187 return Node.type_mappings[type(value)](value, tapes)
188 except KeyError:
189 return NoDerivativeNode(value, tapes)
~/anaconda3/lib/python3.6/site-packages/autograd/numpy/numpy_extra.py
in new_array_node(value, tapes)
85 return array_dtype_mappings[value.dtype](value, tapes)
86 except KeyError:
---> 87 raise TypeError("Can't differentiate wrt numpy arrays
of dtype {0}".format(value.dtype))
88 Node.type_mappings[anp.ndarray] = new_array_node
89
TypeError: Can't differentiate wrt numpy arrays of dtype int64
I really have no idea about what is happened. I guess it might be related to the structure of array in numpy. Or did I forget to download any packages? Below is my original codes.
# import statements
datapath = 'datasets/'
from autograd import numpy as np
# import automatic differentiator to compute gradient module
from autograd import grad
# gradient descent function
def gradient_descent(g,alpha,max_its,w):
# compute gradient module using autograd
gradient = grad(g)
# run the gradient descent loop
weight_history = [w] # weight history container
cost_history = [g(w)] # cost function history container
for k in range(max_its):
# evaluate the gradient
grad_eval = gradient(w)
# take gradient descent step
w = w - alpha*grad_eval
# record weight and cost
weight_history.append(w)
cost_history.append(g(w))
return weight_history,cost_history
# load in dataset
csvname = datapath + 'kleibers_law_data.csv'
data = np.loadtxt(csvname,delimiter=',')
# get input and output of dataset
x = data[:-1,:]
y = data[-1:,:]
x = np.log(x)
y = np.log(y)
#Data Initiation
alpha = 0.01
max_its = 1000
w = np.array([0,0])
#linear model
def model(x, w):
a = w[0] + np.dot(x.T, w[1:])
return a.T
def least_squares(w):
cost = np.sum((model(x,w)-y)**2)
return cost/float(np.size(y))
weight_h, cost_h = gradient_descent(least_squares, alpha, max_its, w)
# a)
k = np.linspace(-5.5, 7.5, 250)
y = weight_h[max_its][0] + k*weight_h[max_its][1]
plt.figure()
plt.plot(x, y, label='Linear Line', color='g')
plt.xlabel('log of mass')
plt.ylabel('log of metabolic rate')
plt.title("Answer Of a")
plt.legend()
plt.show()
# b)
w0 = weight_h[max_its][0]
w1 = weight_h[max_its][1]
print("Nonlinear relationship between the body mass x and the metabolic
rate y is " /
+ str(w0) + " + " + "log(xp)" + str(w1) + " = " + "log(yp)")
# c)
x2 = np.log(10)
Kj = np.exp(w0 + w1*x2)*1000/4.18
print("It needs " + str(Kj) + " calories")
Could someone help me to figure it out? Thanks a lot.

Here's the important parts of your error:
---> 14 grad_eval = gradient(w)
...
Type Error: Can't differentiate wrt numpy arrays of dtype int64
Your gradient function is saying it doesn't like to differentiate arrays of ints, which makes some sense, since it probably wants more precision than an int can give. You probably need them to be doubles or floats. For a simple solution to this, I believe you can just change your initializer from:
w = np.array([0,0])
which is going to automatically cast those 0s as ints, to:
w = np.array([0.0,0.0])
Those decimals after the 0 will let it know you want floats. There's other ways to go about telling it what kind of array you want (https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.array.html), but this is a simple way.

PyMC3 Multivariate Mixture Model: Constraining components to be non-empty

I am implementing a Personalized Mixture of Multivariate Gaussian Regressions in pymc3 and running into an issue with empty components. After referring to the related PyMC3 mixture model example, I tried implementing the model using univariate normals instead, but I've had some issues there as well.
I've tried several strategies to constrain each component to be non-empty, but each has failed. These are shown in the code below. My specific question is: What is the best way to constrain all components to be non-empty in a mixture of multivariate Gaussians using pymc3?
Note that attempt #1 in the code below comes from the Mixture Model in PyMC3 Example and does not work here.
You can replicate the synthetic data I am using with the function in this gist.
import pymc3 as pm
import numpy as np
import theano
import theano.tensor as T
from scipy import stats
# Extract problem dimensions.
N = X.shape[0] # number of samples
F = X.shape[1] # number of features
pids = I[:, 0].astype(np.int) # primary entity ids
uniq_pids = np.unique(pids) # array of unique primary entity ids
n_pe = len(uniq_pids) # number of primary entities
with pm.Model() as gmreg:
# Init hyperparameters.
a0 = 1
b0 = 1
mu0 = pm.constant(np.zeros(F))
alpha = pm.constant(np.ones(K))
coeff_precisions = pm.constant(1 / X.var(0))
# Init parameters.
# Dirichlet shape parameter, prior on indicators.
pi = pm.Dirichlet(
'pi', a=alpha, shape=K)
# ATTEMPT 1: Make probability of membership for each cluter >= 0.1
# ================================================================
pi_min_potential = pm.Potential(
'pi_min_potential', T.switch(T.min(pi) < .1, -np.inf, 0))
# ================================================================
# The multinomial (and by extension, the Categorical), is a symmetric
# distribution. Using this as a prior for the indicator variables Z
# makes the likelihood invariant under the many possible permutations of
# the indices. This invariance is inherited in posterior inference.
# This invariance model implies unidentifiability and induces label
# switching during inference.
# Resolve by ordering the components to have increasing weights.
# This does not deal with the parameter identifiability issue.
order_pi_potential = pm.Potential(
'order_pi_potential',
T.sum([T.switch(pi[k] - pi[k-1] < 0, -np.inf, 0)
for k in range(1, K)]))
# Indicators, specifying which cluster each primary entity belongs to.
# These are draws from Multinomial with 1 trial.
init_pi = stats.dirichlet.rvs(alpha.eval())[0]
test_Z = np.random.multinomial(n=1, pvals=init_pi, size=n_pe)
as_cat = np.nonzero(test_Z)[1]
Z = pm.Categorical(
'Z', p=pi, shape=n_pe, testval=as_cat)
# ATTEMPT 2: Give infinite negative likelihood to the case
# where any of the clusters have no users assigned.
# ================================================================
# sizes = [T.eq(Z, k).nonzero()[0].shape[0] for k in range(K)]
# nonempty_potential = pm.Potential(
# 'comp_nonempty_potential',
# np.sum([T.switch(sizes[k] < 1, -np.inf, 0) for k in range(K)]))
# ================================================================
# ATTEMPT 3: Add same sample to each cluster, each has at least 1.
# ================================================================
# shared_X = X.mean(0)[None, :]
# shared_y = y.mean().reshape(1)
# X = T.concatenate((shared_X.repeat(K).reshape(K, F), X))
# y = T.concatenate((shared_y.repeat(K), y))
# Add range(K) on to the beginning to include shared instance.
# Z_expanded = Z[pids]
# Z_with_shared = T.concatenate((range(K), Z_expanded))
# pid_idx = pm.Deterministic('pid_idx', Z_with_shared)
# ================================================================
# Expand user cluster indicators to each observation for each user.
pid_idx = pm.Deterministic('pid_idx', Z[pids])
# Construct masks for each component.
masks = [T.eq(pid_idx, k).nonzero() for k in range(K)]
comp_sizes = [masks[k][0].shape[0] for k in range(K)]
# Component regression precision parameters.
beta = pm.Gamma(
'beta', alpha=a0, beta=b0, shape=(K,),
testval=np.random.gamma(a0, b0, size=K))
# Regression coefficient matrix, with coeffs for each component.
W = pm.MvNormal(
'W', mu=mu0, tau=T.diag(coeff_precisions), shape=(K, F),
testval=np.random.randn(K, F) * std)
# The mean of the observations is the result of a regression, with
# coefficients determined by the cluster the sample belongs to.
# Now we have K different multivariate normal distributions.
X = T.cast(X, 'float64')
y = T.cast(y, 'float64')
comps = []
for k in range(K):
mask_k = masks[k]
X_k = X[mask_k]
y_k = y[mask_k]
n_k = comp_sizes[k]
precision_matrix = beta[k] * T.eye(n_k)
comp_k = pm.MvNormal(
'comp_%d' % k,
mu=T.dot(X_k, W[k]), tau=precision_matrix,
observed=y_k)
comps.append(comp_k)
The first two approaches fail to ensure non-empty clusters; attempting to sample results in a LinAlgError:
with gmreg:
step1 = pm.Metropolis(vars=[pi, beta, W])
step2 = pm.ElemwiseCategoricalStep(vars=[Z], values=np.arange(K))
tr = pm.sample(100, step=[step1, step2])
...:
Failed to compute determinant []
---------------------------------------------------------------------------
LinAlgError Traceback (most recent call last)
<ipython-input-2-c7df53f4c6a5> in <module>()
2 step1 = pm.Metropolis(vars=[pi, beta, W])
3 step2 = pm.ElemwiseCategoricalStep(vars=[Z], values=np.arange(K))
----> 4 tr = pm.sample(100, step=[step1, step2])
5
/home/mack/anaconda/lib/python2.7/site-packages/pymc3/sampling.pyc in sample(draws, step, start, trace, chain, njobs, tune, progressbar, model, random_seed)
155 sample_args = [draws, step, start, trace, chain,
156 tune, progressbar, model, random_seed]
--> 157 return sample_func(*sample_args)
158
159
/home/mack/anaconda/lib/python2.7/site-packages/pymc3/sampling.pyc in _sample(draws, step, start, trace, chain, tune, progressbar, model, random_seed)
164 progress = progress_bar(draws)
165 try:
--> 166 for i, strace in enumerate(sampling):
167 if progressbar:
168 progress.update(i)
/home/mack/anaconda/lib/python2.7/site-packages/pymc3/sampling.pyc in _iter_sample(draws, step, start, trace, chain, tune, model, random_seed)
246 if i == tune:
247 step = stop_tuning(step)
--> 248 point = step.step(point)
249 strace.record(point)
250 yield strace
/home/mack/anaconda/lib/python2.7/site-packages/pymc3/step_methods/compound.pyc in step(self, point)
12 def step(self, point):
13 for method in self.methods:
---> 14 point = method.step(point)
15 return point
/home/mack/anaconda/lib/python2.7/site-packages/pymc3/step_methods/arraystep.pyc in step(self, point)
87 inputs += [point]
88
---> 89 apoint = self.astep(bij.map(point), *inputs)
90 return bij.rmap(apoint)
91
/home/mack/anaconda/lib/python2.7/site-packages/pymc3/step_methods/gibbs.pyc in astep(self, q, logp)
38
39 def astep(self, q, logp):
---> 40 p = array([logp(v * self.sh) for v in self.values])
41 return categorical(p, self.var.dshape)
42
/home/mack/anaconda/lib/python2.7/site-packages/pymc3/blocking.pyc in __call__(self, x)
117
118 def __call__(self, x):
--> 119 return self.fa(self.fb(x))
/home/mack/anaconda/lib/python2.7/site-packages/pymc3/model.pyc in __call__(self, *args, **kwargs)
423 def __call__(self, *args, **kwargs):
424 point = Point(model=self.model, *args, **kwargs)
--> 425 return self.f(**point)
426
427 compilef = fastfn
/home/mack/anaconda/lib/python2.7/site-packages/theano/compile/function_module.pyc in __call__(self, *args, **kwargs)
604 self.fn.nodes[self.fn.position_of_error],
605 self.fn.thunks[self.fn.position_of_error],
--> 606 storage_map=self.fn.storage_map)
607 else:
608 # For the c linker We don't have access from
/home/mack/anaconda/lib/python2.7/site-packages/theano/compile/function_module.pyc in __call__(self, *args, **kwargs)
593 t0_fn = time.time()
594 try:
--> 595 outputs = self.fn()
596 except Exception:
597 if hasattr(self.fn, 'position_of_error'):
/home/mack/anaconda/lib/python2.7/site-packages/theano/gof/op.pyc in rval(p, i, o, n)
766 # default arguments are stored in the closure of `rval`
767 def rval(p=p, i=node_input_storage, o=node_output_storage, n=node):
--> 768 r = p(n, [x[0] for x in i], o)
769 for o in node.outputs:
770 compute_map[o][0] = True
/home/mack/anaconda/lib/python2.7/site-packages/theano/tensor/nlinalg.pyc in perform(self, node, (x,), (z,))
267 def perform(self, node, (x,), (z, )):
268 try:
--> 269 z[0] = numpy.asarray(numpy.linalg.det(x), dtype=x.dtype)
270 except Exception:
271 print 'Failed to compute determinant', x
/home/mack/anaconda/lib/python2.7/site-packages/numpy/linalg/linalg.pyc in det(a)
1769 """
1770 a = asarray(a)
-> 1771 _assertNoEmpty2d(a)
1772 _assertRankAtLeast2(a)
1773 _assertNdSquareness(a)
/home/mack/anaconda/lib/python2.7/site-packages/numpy/linalg/linalg.pyc in _assertNoEmpty2d(*arrays)
220 for a in arrays:
221 if a.size == 0 and product(a.shape[-2:]) == 0:
--> 222 raise LinAlgError("Arrays cannot be empty")
223
224
LinAlgError: Arrays cannot be empty
Apply node that caused the error: Det(Elemwise{Mul}[(0, 1)].0)
Inputs types: [TensorType(float64, matrix)]
Inputs shapes: [(0, 0)]
Inputs strides: [(8, 8)]
Inputs values: [array([], shape=(0, 0), dtype=float64)]
Backtrace when the node is created:
File "/home/mack/anaconda/lib/python2.7/site-packages/pymc3/distributions/multivariate.py", line 66, in logp
result = k * T.log(2 * np.pi) + T.log(1./det(tau))
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
...which indicates the component is empty, since the precision matrix has shape (0, 0).
The third method actually resolves the empty component issue but gives very strange inference behavior. I selected a burn-in based on traceplots and thinned to every 10th sample. The samples are still highly autocorrelated but much better than without thinning. At this point, I summed the Z values across the samples, and this is what I get:
In [3]: with gmreg:
step1 = pm.Metropolis(vars=[pi, beta, W])
step2 = pm.ElemwiseCategoricalStep(vars=[Z], values=np.arange(K))
tr = pm.sample(1000, step=[step1, step2])
...:
[-----------------100%-----------------] 1000 of 1000 complete in 258.8 sec
...
In [24]: zvals = tr[300::10]['Z']
In [25]: np.array([np.bincount(zvals[:, n]) for n in range(nusers)])
Out[25]:
array([[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70],
[ 0, 0, 70]])
So for some reason, all of the users are being assigned to the last cluster for every sample.

I have run into a similar problem. Something like this worked for a mixture of multivariate gaussians model. As for whether it's the best, it's certainly the best solution I've found.
pm.Potential('pi_min_potential', T.switch(
T.all(
[pi[i, 0] < 0.1 for i in range(K)]), -np.inf, 0))
The key here is that you need to account for each potential that is below your cutoff. Further, you should adjust the shape of your pi distribution, as mentioned in the comments. This will affect your indexing in the T.switch call (on the pi[i,0]).

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Converting a mixture of gaussians to PyMC3 - python

Related

Find complex roots for every e^(i phi) with 0 <= phi < 4pi

ValueError: supplied range of [-inf, -inf] is not finite

Index issue whilst running fmin_l_bfgs

Can't differentiate wrt numpy arrays of dtype int64?

PyMC3 Multivariate Mixture Model: Constraining components to be non-empty

Categories

Resources