I'm doing a Python project in which I'd like to use the Viterbi Algorithm. Does anyone know of a complete Python implementation of the Viterbi algorithm? The correctness of the one on Wikipedia seems to be in question on the talk page. Does anyone have a pointer?
Here's mine. Its paraphrased directly from the psuedocode implemenation from wikipedia. It uses numpy for conveince of their ndarray but is otherwise a pure python3 implementation.
import numpy as np
def viterbi(y, A, B, Pi=None):
"""
Return the MAP estimate of state trajectory of Hidden Markov Model.
Parameters
----------
y : array (T,)
Observation state sequence. int dtype.
A : array (K, K)
State transition matrix. See HiddenMarkovModel.state_transition for
details.
B : array (K, M)
Emission matrix. See HiddenMarkovModel.emission for details.
Pi: optional, (K,)
Initial state probabilities: Pi[i] is the probability x[0] == i. If
None, uniform initial distribution is assumed (Pi[:] == 1/K).
Returns
-------
x : array (T,)
Maximum a posteriori probability estimate of hidden state trajectory,
conditioned on observation sequence y under the model parameters A, B,
Pi.
T1: array (K, T)
the probability of the most likely path so far
T2: array (K, T)
the x_j-1 of the most likely path so far
"""
# Cardinality of the state space
K = A.shape[0]
# Initialize the priors with default (uniform dist) if not given by caller
Pi = Pi if Pi is not None else np.full(K, 1 / K)
T = len(y)
T1 = np.empty((K, T), 'd')
T2 = np.empty((K, T), 'B')
# Initilaize the tracking tables from first observation
T1[:, 0] = Pi * B[:, y[0]]
T2[:, 0] = 0
# Iterate throught the observations updating the tracking tables
for i in range(1, T):
T1[:, i] = np.max(T1[:, i - 1] * A.T * B[np.newaxis, :, y[i]].T, 1)
T2[:, i] = np.argmax(T1[:, i - 1] * A.T, 1)
# Build the output, optimal model trajectory
x = np.empty(T, 'B')
x[-1] = np.argmax(T1[:, T - 1])
for i in reversed(range(1, T)):
x[i - 1] = T2[x[i], i]
return x, T1, T2
I found the following code in the example repository of Artificial Intelligence: A Modern Approach. Is something like this what you're looking for?
def viterbi_segment(text, P):
"""Find the best segmentation of the string of characters, given the
UnigramTextModel P."""
# best[i] = best probability for text[0:i]
# words[i] = best word ending at position i
n = len(text)
words = [''] + list(text)
best = [1.0] + [0.0] * n
## Fill in the vectors best, words via dynamic programming
for i in range(n+1):
for j in range(0, i):
w = text[j:i]
if P[w] * best[i - len(w)] >= best[i]:
best[i] = P[w] * best[i - len(w)]
words[i] = w
## Now recover the sequence of best words
sequence = []; i = len(words)-1
while i > 0:
sequence[0:0] = [words[i]]
i = i - len(words[i])
## Return sequence of best words and overall probability
return sequence, best[-1]
Hmm I can post mine. Its not pretty though, please let me know if you need clarification. I wrote this relatively recently for specifically part of speech tagging.
class Trellis:
trell = []
def __init__(self, hmm, words):
self.trell = []
temp = {}
for label in hmm.labels:
temp[label] = [0,None]
for word in words:
self.trell.append([word,copy.deepcopy(temp)])
self.fill_in(hmm)
def fill_in(self,hmm):
for i in range(len(self.trell)):
for token in self.trell[i][1]:
word = self.trell[i][0]
if i == 0:
self.trell[i][1][token][0] = hmm.e(token,word)
else:
max = None
guess = None
c = None
for k in self.trell[i-1][1]:
c = self.trell[i-1][1][k][0] + hmm.t(k,token)
if max == None or c > max:
max = c
guess = k
max += hmm.e(token,word)
self.trell[i][1][token][0] = max
self.trell[i][1][token][1] = guess
def return_max(self):
tokens = []
token = None
for i in range(len(self.trell)-1,-1,-1):
if token == None:
max = None
guess = None
for k in self.trell[i][1]:
if max == None or self.trell[i][1][k][0] > max:
max = self.trell[i][1][k][0]
token = self.trell[i][1][k][1]
guess = k
tokens.append(guess)
else:
tokens.append(token)
token = self.trell[i][1][token][1]
tokens.reverse()
return tokens
I have just corrected the pseudo implementation of Viterbi in Wikipedia. From the initial (incorrect) version, it took me a while to figure out where I was going wrong but I finally managed it, thanks partly to Kevin Murphy's implementation of the viterbi_path.m in the MatLab HMM toolbox.
In the context of an HMM object with variables as shown:
hmm = HMM()
hmm.priors = np.array([0.5, 0.5]) # pi = prior probs
hmm.transition = np.array([[0.75, 0.25], # A = transition probs. / 2 states
[0.32, 0.68]])
hmm.emission = np.array([[0.8, 0.1, 0.1], # B = emission (observation) probs. / 3 obs modes
[0.1, 0.2, 0.7]])
The Python function to run Viterbi (best-path) algorithm is below:
def viterbi (self,observations):
"""Return the best path, given an HMM model and a sequence of observations"""
# A - initialise stuff
nSamples = len(observations[0])
nStates = self.transition.shape[0] # number of states
c = np.zeros(nSamples) #scale factors (necessary to prevent underflow)
viterbi = np.zeros((nStates,nSamples)) # initialise viterbi table
psi = np.zeros((nStates,nSamples)) # initialise the best path table
best_path = np.zeros(nSamples); # this will be your output
# B- appoint initial values for viterbi and best path (bp) tables - Eq (32a-32b)
viterbi[:,0] = self.priors.T * self.emission[:,observations(0)]
c[0] = 1.0/np.sum(viterbi[:,0])
viterbi[:,0] = c[0] * viterbi[:,0] # apply the scaling factor
psi[0] = 0;
# C- Do the iterations for viterbi and psi for time>0 until T
for t in range(1,nSamples): # loop through time
for s in range (0,nStates): # loop through the states #(t-1)
trans_p = viterbi[:,t-1] * self.transition[:,s]
psi[s,t], viterbi[s,t] = max(enumerate(trans_p), key=operator.itemgetter(1))
viterbi[s,t] = viterbi[s,t]*self.emission[s,observations(t)]
c[t] = 1.0/np.sum(viterbi[:,t]) # scaling factor
viterbi[:,t] = c[t] * viterbi[:,t]
# D - Back-tracking
best_path[nSamples-1] = viterbi[:,nSamples-1].argmax() # last state
for t in range(nSamples-1,0,-1): # states of (last-1)th to 0th time step
best_path[t-1] = psi[best_path[t],t]
return best_path
This is an old question, but none of the other answers were quite what I needed because my application doesn't have specific observed states.
Taking after #Rhubarb, I've also re-implemented Kevin Murphey's Matlab implementation (see viterbi_path.m), but I've kept it closer to the original. I've included a simple test case as well.
import numpy as np
def viterbi_path(prior, transmat, obslik, scaled=True, ret_loglik=False):
'''Finds the most-probable (Viterbi) path through the HMM state trellis
Notation:
Z[t] := Observation at time t
Q[t] := Hidden state at time t
Inputs:
prior: np.array(num_hid)
prior[i] := Pr(Q[0] == i)
transmat: np.ndarray((num_hid,num_hid))
transmat[i,j] := Pr(Q[t+1] == j | Q[t] == i)
obslik: np.ndarray((num_hid,num_obs))
obslik[i,t] := Pr(Z[t] | Q[t] == i)
scaled: bool
whether or not to normalize the probability trellis along the way
doing so prevents underflow by repeated multiplications of probabilities
ret_loglik: bool
whether or not to return the log-likelihood of the best path
Outputs:
path: np.array(num_obs)
path[t] := Q[t]
'''
num_hid = obslik.shape[0] # number of hidden states
num_obs = obslik.shape[1] # number of observations (not observation *states*)
# trellis_prob[i,t] := Pr((best sequence of length t-1 goes to state i), Z[1:(t+1)])
trellis_prob = np.zeros((num_hid,num_obs))
# trellis_state[i,t] := best predecessor state given that we ended up in state i at t
trellis_state = np.zeros((num_hid,num_obs), dtype=int) # int because its elements will be used as indicies
path = np.zeros(num_obs, dtype=int) # int because its elements will be used as indicies
trellis_prob[:,0] = prior * obslik[:,0] # element-wise mult
if scaled:
scale = np.ones(num_obs) # only instantiated if necessary to save memory
scale[0] = 1.0 / np.sum(trellis_prob[:,0])
trellis_prob[:,0] *= scale[0]
trellis_state[:,0] = 0 # arbitrary value since t == 0 has no predecessor
for t in xrange(1, num_obs):
for j in xrange(num_hid):
trans_probs = trellis_prob[:,t-1] * transmat[:,j] # element-wise mult
trellis_state[j,t] = trans_probs.argmax()
trellis_prob[j,t] = trans_probs[trellis_state[j,t]] # max of trans_probs
trellis_prob[j,t] *= obslik[j,t]
if scaled:
scale[t] = 1.0 / np.sum(trellis_prob[:,t])
trellis_prob[:,t] *= scale[t]
path[-1] = trellis_prob[:,-1].argmax()
for t in range(num_obs-2, -1, -1):
path[t] = trellis_state[(path[t+1]), t+1]
if not ret_loglik:
return path
else:
if scaled:
loglik = -np.sum(np.log(scale))
else:
p = trellis_prob[path[-1],-1]
loglik = np.log(p)
return path, loglik
if __name__=='__main__':
# Assume there are 3 observation states, 2 hidden states, and 5 observations
priors = np.array([0.5, 0.5])
transmat = np.array([
[0.75, 0.25],
[0.32, 0.68]])
emmat = np.array([
[0.8, 0.1, 0.1],
[0.1, 0.2, 0.7]])
observations = np.array([0, 1, 2, 1, 0], dtype=int)
obslik = np.array([emmat[:,z] for z in observations]).T
print viterbi_path(priors, transmat, obslik) #=> [0 1 1 1 0]
print viterbi_path(priors, transmat, obslik, scaled=False) #=> [0 1 1 1 0]
print viterbi_path(priors, transmat, obslik, ret_loglik=True) #=> (array([0, 1, 1, 1, 0]), -7.776472586614755)
print viterbi_path(priors, transmat, obslik, scaled=False, ret_loglik=True) #=> (array([0, 1, 1, 1, 0]), -8.0120386579275227)
Note that this implementation does not use emission probabilities directly but uses a variable obslik. Generally, emissions[i,j] := Pr(observed_state == j | hidden_state == i) for a particular observed state i, making emissions.shape == (num_hidden_states, num_obs_states).
However, given a sequence observations[t] := observation at time t, all the Viterbi Algorithm requires is the likelihood of that observation for each hidden state. Hence, obslik[i,t] := Pr(observations[t] | hidden_state == i). The actual value the of the observed state isn't necessary.
I have modified #Rhubarb's answer for the condition where the marginal probabilities are already known (e.g by computing the Forward Backward algorithm).
def viterbi (transition_probabilities, conditional_probabilities):
# Initialise everything
num_samples = conditional_probabilities.shape[1]
num_states = transition_probabilities.shape[0] # number of states
c = np.zeros(num_samples) #scale factors (necessary to prevent underflow)
viterbi = np.zeros((num_states,num_samples)) # initialise viterbi table
best_path_table = np.zeros((num_states,num_samples)) # initialise the best path table
best_path = np.zeros(num_samples).astype(np.int32) # this will be your output
# B- appoint initial values for viterbi and best path (bp) tables - Eq (32a-32b)
viterbi[:,0] = conditional_probabilities[:,0]
c[0] = 1.0/np.sum(viterbi[:,0])
viterbi[:,0] = c[0] * viterbi[:,0] # apply the scaling factor
# C- Do the iterations for viterbi and psi for time>0 until T
for t in range(1, num_samples): # loop through time
for s in range (0,num_states): # loop through the states #(t-1)
trans_p = viterbi[:, t-1] * transition_probabilities[:,s] # transition probs of each state transitioning
best_path_table[s,t], viterbi[s,t] = max(enumerate(trans_p), key=operator.itemgetter(1))
viterbi[s,t] = viterbi[s,t] * conditional_probabilities[s][t]
c[t] = 1.0/np.sum(viterbi[:,t]) # scaling factor
viterbi[:,t] = c[t] * viterbi[:,t]
## D - Back-tracking
best_path[num_samples-1] = viterbi[:,num_samples-1].argmax() # last state
for t in range(num_samples-1,0,-1): # states of (last-1)th to 0th time step
best_path[t-1] = best_path_table[best_path[t],t]
return best_path
Related
I am trying to make my own CFD solver and one of the most computationally expensive parts is solving for the pressure term. One way to solve Poisson differential equations faster is by using a multigrid method. The basic recursive algorithm for this is:
function phi = V_Cycle(phi,f,h)
% Recursive V-Cycle Multigrid for solving the Poisson equation (\nabla^2 phi = f) on a uniform grid of spacing h
% Pre-Smoothing
phi = smoothing(phi,f,h);
% Compute Residual Errors
r = residual(phi,f,h);
% Restriction
rhs = restriction(r);
eps = zeros(size(rhs));
% stop recursion at smallest grid size, otherwise continue recursion
if smallest_grid_size_is_achieved
eps = smoothing(eps,rhs,2*h);
else
eps = V_Cycle(eps,rhs,2*h);
end
% Prolongation and Correction
phi = phi + prolongation(eps);
% Post-Smoothing
phi = smoothing(phi,f,h);
end
I've attempted to implement this algorithm myself (also at the end of this question) however it is very slow and doesn't give good results so evidently it is doing something wrong. I've been trying to find why for too long and I think it's just worthwhile seeing if anyone can help me.
If I use a grid size of 2^5 by 2^5 points, then it can solve it and give reasonable results. However, as soon as I go above this it takes exponentially longer to solve and basically get stuck at some level of inaccuracy, no matter how many V-Loops are performed. at 2^7 by 2^7 points, the code takes way too long to be useful.
I think my main issue is that my implementation of a jacobian iteration is using linear algebra to calculate the update at each step. This should, in general, be fast however, the update matrix A is an n*m sized matrix, and calculating the dot product of a 2^7 * 2^7 sized matrix is expensive. As most of the cells are just zeros, should I calculate the result using a different method?
if anyone has any experience in multigrid methods, I would appreciate any advice!
Thanks
my code:
# -*- coding: utf-8 -*-
"""
Created on Tue Dec 29 16:24:16 2020
#author: mclea
"""
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import convolve2d
from mpl_toolkits.mplot3d import Axes3D
from scipy.interpolate import griddata
from matplotlib import cm
def restrict(A):
"""
Creates a new grid of points which is half the size of the original
grid in each dimension.
"""
n = A.shape[0]
m = A.shape[1]
new_n = int((n-2)/2+2)
new_m = int((m-2)/2+2)
new_array = np.zeros((new_n, new_m))
for i in range(1, new_n-1):
for j in range(1, new_m-1):
ii = int((i-1)*2)+1
jj = int((j-1)*2)+1
# print(i, j, ii, jj)
new_array[i,j] = np.average(A[ii:ii+2, jj:jj+2])
new_array = set_BC(new_array)
return new_array
def interpolate_array(A):
"""
Creates a grid of points which is double the size of the original
grid in each dimension. Uses linear interpolation between grid points.
"""
n = A.shape[0]
m = A.shape[1]
new_n = int((n-2)*2 + 2)
new_m = int((m-2)*2 + 2)
new_array = np.zeros((new_n, new_m))
i = (np.indices(A.shape)[0]/(A.shape[0]-1)).flatten()
j = (np.indices(A.shape)[1]/(A.shape[1]-1)).flatten()
A = A.flatten()
new_i = np.linspace(0, 1, new_n)
new_j = np.linspace(0, 1, new_m)
new_ii, new_jj = np.meshgrid(new_i, new_j)
new_array = griddata((i, j), A, (new_jj, new_ii), method="linear")
return new_array
def adjacency_matrix(rows, cols):
"""
Creates the adjacency matrix for an n by m shaped grid
"""
n = rows*cols
M = np.zeros((n,n))
for r in range(rows):
for c in range(cols):
i = r*cols + c
# Two inner diagonals
if c > 0: M[i-1,i] = M[i,i-1] = 1
# Two outer diagonals
if r > 0: M[i-cols,i] = M[i,i-cols] = 1
return M
def create_differences_matrix(rows, cols):
"""
Creates the central differences matrix A for an n by m shaped grid
"""
n = rows*cols
M = np.zeros((n,n))
for r in range(rows):
for c in range(cols):
i = r*cols + c
# Two inner diagonals
if c > 0: M[i-1,i] = M[i,i-1] = -1
# Two outer diagonals
if r > 0: M[i-cols,i] = M[i,i-cols] = -1
np.fill_diagonal(M, 4)
return M
def set_BC(A):
"""
Sets the boundary conditions of the field
"""
A[:, 0] = A[:, 1]
A[:, -1] = A[:, -2]
A[0, :] = A[1, :]
A[-1, :] = A[-2, :]
return A
def create_A(n,m):
"""
Creates all the components required for the jacobian update function
for an n by m shaped grid
"""
LaddU = adjacency_matrix(n,m)
A = create_differences_matrix(n,m)
invD = np.zeros((n*m, n*m))
np.fill_diagonal(invD, 1/4)
return A, LaddU, invD
def calc_RJ(rows, cols):
"""
Calculates the jacobian update matrix Rj for an n by m shaped grid
"""
n = int(rows*cols)
M = np.zeros((n,n))
for r in range(rows):
for c in range(cols):
i = r*cols + c
# Two inner diagonals
if c > 0: M[i-1,i] = M[i,i-1] = 0.25
# Two outer diagonals
if r > 0: M[i-cols,i] = M[i,i-cols] = 0.25
return M
def jacobi_update(v, f, nsteps=1, max_err=1e-3):
"""
Uses a jacobian update matrix to solve nabla(v) = f
"""
f_inner = f[1:-1, 1:-1].flatten()
n = v.shape[0]
m = v.shape[1]
A, LaddU, invD = create_A(n-2, m-2)
Rj = calc_RJ(n-2,m-2)
update=True
step = 0
while update:
v_old = v.copy()
step += 1
vt = v_old[1:-1, 1:-1].flatten()
vt = np.dot(Rj, vt) + np.dot(invD, f_inner)
v[1:-1, 1:-1] = vt.reshape((n-2),(m-2))
err = v - v_old
if step == nsteps or np.abs(err).max()<max_err:
update=False
return v, (step, np.abs(err).max())
def MGV(f, v):
"""
Solves for nabla(v) = f using a multigrid method
"""
# global A, r
n = v.shape[0]
m = v.shape[1]
# If on the smallest grid size, compute the exact solution
if n <= 6 or m <=6:
v, info = jacobi_update(v, f, nsteps=1000)
return v
else:
# smoothing
v, info = jacobi_update(v, f, nsteps=10, max_err=1e-1)
A = create_A(n, m)[0]
# calculate residual
r = np.dot(A, v.flatten()) - f.flatten()
r = r.reshape(n,m)
# downsample resitdual error
r = restrict(r)
zero_array = np.zeros(r.shape)
# interploate the correction computed on a corser grid
d = interpolate_array(MGV(r, zero_array))
# Add prolongated corser grid solution onto the finer grid
v = v - d
v, info = jacobi_update(v, f, nsteps=10, max_err=1e-6)
return v
sigma = 0
# Setting up the grid
k = 6
n = 2**k+2
m = 2**(k)+2
hx = 1/n
hy = 1/m
L = 1
H = 1
x = np.linspace(0, L, n)
y = np.linspace(0, H, m)
XX, YY = np.meshgrid(x, y)
# Setting up the initial conditions
f = np.ones((n,m))
v = np.zeros((n,m))
# How many V cyles to perform
err = 1
n_cycles = 10
loop = True
cycle = 0
# Perform V cycles until converged or reached the maximum
# number of cycles
while loop:
cycle += 1
v_new = MGV(f, v)
if np.abs(v - v_new).max() < err:
loop = False
if cycle == n_cycles:
loop = False
v = v_new
print("Number of cycles " + str(cycle))
plt.contourf(v)
I realize that I'm not answering your question directly, but I do note that you have quite a few loops that will contribute some overhead cost. When optimizing code, I have found the following thread useful - particularly the line profiler thread. This way you can focus in on "high time cost" lines and then start to ask more specific questions regarding opportunities to optimize.
How do I get time of a Python program's execution?
I'm implementing a Markov Chain Monte Carlo with both Metropolis and Barker's α's for numerical integration. I've created a class called MCMCIntegrator(). Below the __init__() method, are the g(x) the PDF of the function I'm integrating and alpha method, implementing the Metropolis and Barker α's.
import numpy as np
import scipy.stats as st
class MCMCIntegrator:
def __init__(self):
self.size = 1000
self.std = 0.6
self.real_int = 0.06496359
self.sample = None
#staticmethod
def g(x):
return st.gamma.pdf(x, 1, scale=1.378008857)*np.abs(np.cos(1.10257704))
def alpha(self, a, b, method):
if method:
return min(1, self.g(b) / self.g(a))
else:
return self.g(b) / (self.g(a) + self.g(b))
The size is the size of the sample that the class must generate, std is the standard deviation of the normal kernel, which you will see in a few seconds. The real_int is the value of the integral from 1 to 2 of the function we're integrating. I've generated it with a R script. Now, to the problem.
def _chain(self, method):
"""
Markov chain heat-up with burn-in
:param method: Metropolis or Barker alpha
:return: np.array containing the sample
"""
old = 0
sample = np.zeros(int(self.size * 1.3))
i = 0
while i != len(sample):
new = np.random.normal(loc=old, scale=self.std)
new = abs(new)
al = self.alpha(old, new, method=method)
u = np.random.uniform()
if al > u:
sample[i] = new
i += 1
old = new
return np.array(sample)
Below this method is an integrate() method that calculates the proportion of numbers in the [1, 2] interval:
def integrate(self, method=None):
"""
Integration step
"""
sample = self._chain(method=method)
# discarding 30% of the sample for the burn-in
ind = int(len(sample)*0.3)
sample = sample[ind:]
setattr(self, "sample", sample)
sample = [1 if 1 < v < 2 else 0 for v in sample]
return np.mean(sample)
this is the main function:
def main():
print("-- RESULTS --".center(20), end='\n')
mcmc = MCMCIntegrator()
print(f"\t{mcmc.integrate()}", end='\n')
print(f"\t{np.abs(mcmc.integrate() - mcmc.real_int) / mcmc.real_int}")
if __name__ == "__main__":
main()
I'm stuck in an infinite while loop, with no idea why this is happening.
While I have no prior exposure to Python and no direct explanation for the infinite loop, here are some problematic issues in the code:
The
while i != len(sample):
loop increments the value i only when the Uniform variate is below the acceptance probability if al > u: This is not how Metropolis-Hastings operates. If the Uniform variate is above the acceptance probability, the current value of the chain is duplicated. but this does not explain for the infinite loop since a proposed value should eventually be accepted.
If the target density is
st.gamma.pdf(x, 1, scale=1.378008857)*np.abs(np.cos(1.10257704))
then (i) what is the point of the second and constant term np.abs(np.cos(1.10257704)) and (ii) where is the need for so many digits?
The proposal distribution
new = np.random.normal(loc=old, scale=self.std)
new = abs(new)
is a folded normal, which density is not symmetric. Hence it should appear in the Metropolis-Hastings probability but may have little impact since the scale is small.
Here is my R rendering of the Python code (edited and corrected)
self.size = 1e5
self.std = 0.6
self.real_int = 0.06496359
g <- function(x){dgamma(x, shape=1, scale=1.378)}
alpha <- function(a, b, method=1)ifelse(method,
min(1, r <- g(b) / g(a)), 1 / (1 + 1 / r))
old = 0
smple = rep(0,self.size * 1.3)
for (i in 1:length(smple)){
new = abs(old+self.std*rnorm(1))
al = alpha(old, new, 0)
old=smple[i]=ifelse(al > runif(1), new, old)
}
ind = trunc(length(smple)*0.3)
smple = sample[ind:length(smple)]
hist(smple,pro=TRUE,nclass=10*log2(self.size),col="wheat")
curve(g(x),add=TRUE,lwd=2,col="sienna")
clearly reproducing the Gamma target:
without the correction for the non-symmetric proposal. The correction would be
q <- function(a, b)
dnorm(b-a,sd=self.std)+dnorm(-b-a,sd=self.std)
alpha <- function(a, b, method=1){
return(ifelse(method,
min(1, r <- g(b) * q(b,a) / g(a) / q(a,b)),
1 / (1 + 1/r)))}
old = 0
smple = rep(0,self.size * 1.3)
for (i in 1:length(smple)){
new = abs(old+self.std*rnorm(1))
al = alpha(old, new, 3)
old=smple[i]=ifelse(al > runif(1), new, old)
}
and makes no difference in the above picture. (The acceptance rate for the Metropolis ratio is 85%, while for Baker's, it is 48%.)
I hope everybody is well, safe, and healthy during this time.
I'm currently working on a python assignment. Using the following code, I need to run through each value of beta, and for each value of beta, I need to run through each value of reduction)factor to run through the following steps.
Then for each iteration of reductiom_factor for each value of beta I need to save the data in a file with the titles in listofsolutions. Two things I'm not sure how to do: is how to export data with a file with a name in list of solutions (which are in order for each value of beta and reduction_factor), and how to use np.savez to save the data to the files I have just created.
This is my amended code
b = [1/4, 0]
beta = np.asarray(b)
gamma = 0.5
listofsolutions = ['Q2_AA_0.1','Q2_AA_0.9','Q2_AA_0.99', 'Q2_AA_1', 'Q2_AA_1.1', 'Q2_AA_2','Q2_CD_0.1','Q2_CD_0.9','Q2_CD_0.99', 'Q2_CD_1', 'Q2_CD_1.1', 'Q2_CD_2']
consistent = True # use a consistent mass matrix
for bb in itertools.zip_longest(beta):
c = np.sqrt(E / rho_tilde) # wave speed
T = 0.016 # total time
# compute the critical time-step
# note: uncondionally stable AA scheme will return 1.0
delta_t_crit = fe.get_delta_t_crit(le = le, gamma = gamma, beta = bb, consistent = consistent, c = c)
# actual times-step used is a factor of the critical time-step
reduction_factor = [0.1, 0.9, 0.99, 1, 1.1, 2]
for rf in reduction_factor:
delta_t = rf * delta_t_crit
# selected output data is stored to a file with the name given below
# use this to save the results from the different runs
# change the name to match the data you want to store
for i in b and r in reduction_factor:
outfile[i] = listofsolutions[i]
n_t_steps = int(np.ceil(T / delta_t)); # number of time step
# initialise the time domain, K and M
t = np.linspace(0, T, n_t_steps)
K = np.zeros((n_dof, n_dof))
M = np.zeros((n_dof, n_dof))
# assemble K and M
for ee in range(n_el):
dof_index = fe.get_dof_index(ee)
M[np.ix_(dof_index, dof_index)] += fe.get_Me(le = le, Ae = Ae, rho_tilde_e = rho_tilde, consistent = consistent)
# damping matrix
C = np.zeros((n_dof, n_dof))
# assemble the system matrix A
A_matrix = M + (gamma * delta_t) * C + (beta * delta_t**2)*K
# define the free dofs
free_dof = np.arange(1,n_dof)
# initial conditions
d = np.zeros((n_dof, 1))
v = np.zeros((n_dof, 1))
F = np.zeros((n_dof, 1))
# compute the initial acceleration
a = np.linalg.solve(M, F - C.dot(v) - K.dot(d))
# store the history data
# rows -> each node
# columns -> each time step including initial at 0
d_his = np.zeros((n_dof, n_t_steps))
v_his = np.zeros((n_dof, n_t_steps))
a_his = np.zeros((n_dof, n_t_steps))
d_his[:,0] = d[:,0]
v_his[:,0] = v[:,0]
a_his[:,0] = a[:,0]
# loop over the time domain and solve the problem at each step
for n in range(1,n_t_steps):
# data at beginning of the time-step n
a_n = a
v_n = v
d_n = d
# applied loading
t_current = n * delta_t # current time
if t_current<0.001:
F[-1] = A_bar * Ae * np.sin(1000 * t_current * np.pi)
else:
F[-1]=0.
# define predictors
d_tilde = d_n + delta_t*v_n + ((delta_t**2)/2.) * (1 - 2*beta) * a_n
v_tilde = v_n + (1 - gamma) * delta_t * a_n
# assemble the right-hand side from the known data
R = F - C.dot(v_tilde) - K.dot(d_tilde)
# impose essential boundary condition and solve A a = RHS
A_free = A_matrix[np.ix_(free_dof, free_dof)]
R_free = R[np.ix_(free_dof)]
# solve for the accelerations at the free nodes
a_free = np.linalg.solve(A_free, R_free)
a = np.zeros((n_dof, 1))
a[1:] = a_free
# update displacement and vecloity predictors using the acceleration
d = d_tilde + (beta * delta_t**2) * a
v = v_tilde + (gamma * delta_t) * a
# store solutions
d_his[:,n] = d[:,0]
v_his[:,n] = v[:,0]
a_his[:,n] = a[:,0]
# post-processing
mid_node = int(np.ceil(n_dof / 2)) # mid node
# compute the stress in each element
# assuming constant E
stress = (E / le) * np.diff(d_his, axis=0)
# here we save the stress data for the middle element
np.savez(outfile, t, stress[mid_node,:]
I'm not sure how to specify to the program to save the result for each value of reduction_factor within gamma. In addition, for the last line of the code, I'm not sure how to save each iteration to the list of file names I have created.
I tried to do this using the statement"
for i in b` and r in reduction_factor:
outfile[i] = listofsolutions[i]"`
but I don't think this makes sense.
I am a total newbie at python so I am not familiar how to save files within nested loops.. I apologize if any of my questions are rudimentary.
for i in b and r in reduction_factor:
outfile[i] = listofsolutions[i]
It is not correct. Possible solution:
for i in b: # variable 'i' will take every value of list b
for r in reduction_factor: # 'r' will iterate through reduction_factor
outfile[i] = listofsolutions[i] # outfile must be declared until it
Still, there is no logic. In that way, you can create a dictionary.
If you really want to create the file - read about "with open(filename, 'w'):" construction and nested loops or list comprehension.
This code is taking more than half an hour for a data set of 200000 floats.
import numpy as np
try:
import progressbar
pbar = progressbar.ProgressBar(widgets=[progressbar.Percentage(),
progressbar.Counter('%5d'), progressbar.Bar(), progressbar.ETA()])
except:
pbar = list
block_length = np.loadtxt('bb.txt.gz') # get data file from http://filebin.ca/29LbYfKnsKqJ/bb.txt.gz (2MB, 200000 float numbers)
N = len(block_length) - 1
# arrays to store the best configuration
best = np.zeros(N, dtype=float)
last = np.zeros(N, dtype=int)
log = np.log
# Start with first data cell; add one cell at each iteration
for R in pbar(range(N)):
# Compute fit_vec : fitness of putative last block (end at R)
#fit_vec = fitfunc.fitness(
T_k = block_length[:R + 1] - block_length[R + 1]
#N_k = np.cumsum(x[:R + 1][::-1])[::-1]
N_k = np.arange(R + 1, 0, -1)
fit_vec = N_k * (log(N_k) - log(T_k))
prior = 4 - log(73.53 * 0.05 * ((R+1) ** -0.478))
A_R = fit_vec - prior #fitfunc.prior(R + 1, N)
A_R[1:] += best[:R]
i_max = np.argmax(A_R)
last[R] = i_max
best[R] = A_R[i_max]
# Now find changepoints by iteratively peeling off the last block
change_points = np.zeros(N, dtype=int)
i_cp = N
ind = N
while True:
i_cp -= 1
change_points[i_cp] = ind
if ind == 0:
break
ind = last[ind - 1]
change_points = change_points[i_cp:]
print edges[change_points] # show result
The first loop is very slow because the length of arrays is R at every iteration, i.e. increasing, leading to N^2 complexity.
Is there any way to optimize this code further, e.g. through pre-computation? I am also happy with solutions using other programming languages.
I can replicate A_R (up to the fit-prior step) as a upper triangular NxN matrix with:
def trilog(n):
nn = n[:-1,None]-n[None,1:]
nn[np.tril_indices_from(nn,-1)]=1
return nn
T_k = trilog(block_length)
N_k = trilog(-np.arange(N+1))
fit_vec = N_k * (np.log(N_k) - np.log(T_k))
R = np.arange(N)+1
prior = 4 - log(73.53 * 0.05 * (R ** -0.478))
A_R = fit_vec - prior
A_R = np.triu(A_R,0)
print(A_R)
I haven't worked through the logic of calculation and applying best.
I've only done this with small arrays. For your full problem, the corresponding matrix is too large for my memory.
B=np.ones((200000,200000),float)
So just from memory considerations you might be stuck with the for R in range(N) iteration.
I am trying to make a training set of data points by making a line (perceptron) f and making the points on one side +1 and -1 on the other. Then making a new line g and trying to get it as close to f as possible by updating with w = w+ y(t)x(t) where w is weights and y(t) is +1,-1 and x(t) is coordinates of a missclassified point. after implementing this tho i am not getting a very good fit from g to f. here is my code and some sample outputs.
import random
random.seed()
points = [ [1, random.randint(-25, 25), random.randint(-25,25), 0] for k in range(1000)]
weights = [.1,.1,.1]
misclassified = []
############################################################# Function f
interceptf = (0,random.randint(-5,5))
slopef = (random.randint(-10, 10),random.randint(-10,10))
point1f = ((interceptf[0] + slopef[0]),(interceptf[1] + slopef[1]))
point2f = ((interceptf[0] - slopef[0]),(interceptf[1] - slopef[1]))
############################################################# Function G starting
interceptg = (-weights[0],weights[2])
slopeg = (-weights[1],weights[2])
point1g = ((interceptg[0] + slopeg[0]),(interceptg[1] + slopeg[1]))
point2g = ((interceptg[0] - slopeg[0]),(interceptg[1] - slopeg[1]))
#############################################################
def isLeft(a, b, c):
return ((b[0] - a[0])*(c[1] - a[1]) - (b[1] - a[1])*(c[0] - a[0])) > 0
for i in points:
if isLeft(point1f,point2f,i):
i[3]=1
else:
i[3]=-1
for i in points:
if (isLeft(point1g,point2g,i)) and (i[3] == -1):
misclassified.append(i)
if (not isLeft(point1g,point2g,i)) and (i[3] == 1):
misclassified.append(i)
print len(misclassified)
while misclassified:
first = misclassified[0]
misclassified.pop(0)
a = [first[0],first[1],first[2]]
b = first[3]
a[:] = [x*b for x in a]
weights = [(x + y) for x, y in zip(weights,a)]
interceptg = (-weights[0],weights[2])
slopeg = (-weights[1],weights[2])
point1g = ((interceptg[0] + slopeg[0]),(interceptg[1] + slopeg[1]))
point2g = ((interceptg[0] - slopeg[0]),(interceptg[1] - slopeg[1]))
check = 0
for i in points:
if (isLeft(point1g,point2g,i)) and (i[3] == -1):
check += 1
if (not isLeft(point1g,point2g,i)) and (i[3] == 1):
check += 1
print weights
print check
117 <--- number of original missclassifieds with g
[-116.9, -300.9, 190.1] <--- final weights
617 <--- number of original missclassifieds with g after algorithm
956 <--- number of original missclassifieds with g
[-33.9, -12769.9, -572.9] <--- final weights
461 <--- number of original missclassifieds with g after algorithm
There are at least few problems with your algorithm:
Your "while" conditions is wrong - the perceptron learning is not about iterating once through all misclassified points as you do now. The algorithm should iterate through all the points for as long as any of them is missclassified. In particular - each update can make some correctly classified point as the wrong one, so you have to always iterate through all of them and check if everything is fine.
I am pretty sure that what you actually wanted is update rule in form of (y(i)-p(i))x(i) where p(i) is predicted label and y(i) is a true label (but this obviously degenrates to your method if you only update misclassifieds)