This is my first question on Stack Overflow, so I apologize if I word it poorly. I am writing code to take raw acceleration data from an IMU and then integrate it to update the position of an object. Currently this code takes a new accelerometer reading every millisecond and uses it to update the position. My system has a lot of noise, which results in wild readings due to compounding error, even with the ZUPT scheme I implemented. I know that a Kalman filter is theoretically ideal for this scenario, and I would like to use the pykalman module instead of building one myself.
My first question is: can pykalman be used in real time like this? From the documentation it looks to me like you have to have a record of all measurements and then perform the smooth operation, which would not be practical, as I want to filter recursively every millisecond.
My second question is: for the transition matrix, can I apply pykalman to the acceleration data alone, or can I somehow include the double integration up to position? What would that matrix look like?
If pykalman is not practical for this situation, is there another way I can implement a Kalman Filter? Thank you in advance!
You can use a Kalman filter in this case, but your position estimate will strongly depend on the precision of your acceleration signal. The Kalman filter is most useful for fusing several signals, so that the error of one signal can be compensated by another. Ideally you use sensors based on different physical effects (for example an IMU for acceleration, GPS for position, odometry for velocity).
In this answer I'm going to use readings from two acceleration sensors (both in the X direction). One of them is expensive and precise, the second one is much cheaper, so you will see how the sensor precision influences the position and velocity estimates.
You already mentioned the ZUPT scheme. I just want to add some notes: it is very important to have a good estimate of the pitch angle in order to remove the gravity component from your X-acceleration (a small sketch of this follows). If you use the Y- and Z-accelerations you need both the pitch and roll angles.
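As a minimal sketch of that gravity compensation for the X-axis only (the sign depends on your IMU's axis and angle conventions, so treat it as an assumption to verify against your own data):
import numpy as np

g = 9.81  # m/s^2

def remove_gravity_x(acc_x_raw, pitch):
    # acc_x_raw: raw X-acceleration in the body frame (m/s^2)
    # pitch: pitch angle in radians
    # Subtract the gravity component projected onto the body X-axis;
    # flip the sign if your pitch convention is the opposite.
    return acc_x_raw - g * np.sin(pitch)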
Let's start with modelling. Assume you have only acceleration readings in the X-direction, so your observation will look like this:
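The observation at each time step is just the single measured X-acceleration (matching the one-dimensional observation used in the code below):

z_t = [ AccX_t ]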
Now you need to define the smallest data set which completely describes your system at each point in time. This will be the system state.
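Here that is the position, velocity and acceleration in X, i.e. three values, in the same order as X0 and the matrices in the code below:

x_t = [ PosX_t, VelX_t, AccX_t ]^T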
The mapping between the measurement and state domains is defined by the observation matrix:
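Since only the acceleration (the third state component) is measured, the observation matrix picks out exactly that component, as in the code below:

H = [0, 0, 1]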
Now you need to describe the system dynamics. Based on this model, the filter will predict a new state from the previous one.
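For a constant-acceleration model with time step dt this is the standard kinematic transition matrix (the same transition_matrix used in the code below):

F = [[1, dt, 0.5*dt**2],
     [0,  1,        dt],
     [0,  0,         1]]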
In my case dt=0.01s. Using this matrix the Filter will integrate the acceleration signal to estimate the velocity and position.
The observation covariance R can be described by the variance of your sensor readings. In my case I have only one signal in my observation, so the observation covariance is equal to the variance of the X-acceleration (the value can be taken from your sensor's datasheet).
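If no datasheet value is available, a simple way to estimate it (a sketch, assuming you can record the sensor while it is stationary; acc_x_at_rest is a hypothetical array of such readings) is:

import numpy as np

AccX_Variance = np.var(acc_x_at_rest)  # sample variance of the readings at rest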
Through the transition covariance Q you describe the system noise. The smaller the matrix values, the smaller the assumed system noise: the filter becomes stiffer, the estimate lags behind, and the system's past is weighted more heavily than new measurements. With larger values the filter is more flexible and reacts more strongly to each new measurement.
Now everything is ready to configure pykalman. In order to use it in real time, you have to use the filter_update function.
from pykalman import KalmanFilter
import numpy as np
import matplotlib.pyplot as plt
load_data()
# Data description
# Time
# AccX_HP - high precision acceleration signal
# AccX_LP - low precision acceleration signal
# RefPosX - real position (ground truth)
# RefVelX - real velocity (ground truth)
# switch between two acceleration signals
use_HP_signal = 1
if use_HP_signal:
    AccX_Value = AccX_HP
    AccX_Variance = 0.0007
else:
    AccX_Value = AccX_LP
    AccX_Variance = 0.0020
# time step
dt = 0.01
# transition_matrix
F = [[1, dt, 0.5*dt**2],
[0, 1, dt],
[0, 0, 1]]
# observation_matrix
H = [0, 0, 1]
# transition_covariance
Q = [[0.2, 0, 0],
[ 0, 0.1, 0],
[ 0, 0, 10e-4]]
# observation_covariance
R = AccX_Variance
# initial_state_mean
X0 = [0,
0,
AccX_Value[0, 0]]
# initial_state_covariance
P0 = [[ 0, 0, 0],
[ 0, 0, 0],
[ 0, 0, AccX_Variance]]
n_timesteps = AccX_Value.shape[0]
n_dim_state = 3
filtered_state_means = np.zeros((n_timesteps, n_dim_state))
filtered_state_covariances = np.zeros((n_timesteps, n_dim_state, n_dim_state))
kf = KalmanFilter(transition_matrices = F,
observation_matrices = H,
transition_covariance = Q,
observation_covariance = R,
initial_state_mean = X0,
initial_state_covariance = P0)
# iterative estimation for each new measurement
for t in range(n_timesteps):
    if t == 0:
        filtered_state_means[t] = X0
        filtered_state_covariances[t] = P0
    else:
        filtered_state_means[t], filtered_state_covariances[t] = (
            kf.filter_update(
                filtered_state_means[t-1],
                filtered_state_covariances[t-1],
                AccX_Value[t, 0]
            )
        )
f, axarr = plt.subplots(3, sharex=True)
axarr[0].plot(Time, AccX_Value, label="Input AccX")
axarr[0].plot(Time, filtered_state_means[:, 2], "r-", label="Estimated AccX")
axarr[0].set_title('Acceleration X')
axarr[0].grid()
axarr[0].legend()
axarr[0].set_ylim([-4, 4])
axarr[1].plot(Time, RefVelX, label="Reference VelX")
axarr[1].plot(Time, filtered_state_means[:, 1], "r-", label="Estimated VelX")
axarr[1].set_title('Velocity X')
axarr[1].grid()
axarr[1].legend()
axarr[1].set_ylim([-1, 20])
axarr[2].plot(Time, RefPosX, label="Reference PosX")
axarr[2].plot(Time, filtered_state_means[:, 0], "r-", label="Estimated PosX")
axarr[2].set_title('Position X')
axarr[2].grid()
axarr[2].legend()
axarr[2].set_ylim([-10, 1000])
plt.show()
When using the better IMU sensor, the estimated position is practically identical to the ground truth:
The cheaper sensor gives significantly worse results:
I hope I could help you. If you have some questions, I will try to answer them.
UPDATE
If you want to experiment with different data you can generate them easily (unfortunately I don't have the original data any more).
Here is a simple MATLAB script to generate the reference, good and poor sensor data sets.
clear;
dt = 0.01;
t=0:dt:70;
accX_var_best = 0.0005; % (m/s^2)^2
accX_var_good = 0.0007; % (m/s^2)^2
accX_var_worst = 0.001; % (m/s^2)^2
accX_ref_noise = randn(size(t))*sqrt(accX_var_best);
accX_good_noise = randn(size(t))*sqrt(accX_var_good);
accX_worst_noise = randn(size(t))*sqrt(accX_var_worst);
accX_basesignal = sin(0.3*t) + 0.5*sin(0.04*t);
accX_ref = accX_basesignal + accX_ref_noise;
velX_ref = cumsum(accX_ref)*dt;
distX_ref = cumsum(velX_ref)*dt;
accX_good_offset = 0.001 + 0.0004*sin(0.05*t);
accX_good = accX_basesignal + accX_good_noise + accX_good_offset;
velX_good = cumsum(accX_good)*dt;
distX_good = cumsum(velX_good)*dt;
accX_worst_offset = -0.08 + 0.004*sin(0.07*t);
accX_worst = accX_basesignal + accX_worst_noise + accX_worst_offset;
velX_worst = cumsum(accX_worst)*dt;
distX_worst = cumsum(velX_worst)*dt;
subplot(3,1,1);
plot(t, accX_ref);
hold on;
plot(t, accX_good);
plot(t, accX_worst);
hold off;
grid minor;
legend('ref', 'good', 'worst');
title('AccX');
subplot(3,1,2);
plot(t, velX_ref);
hold on;
plot(t, velX_good);
plot(t, velX_worst);
hold off;
grid minor;
legend('ref', 'good', 'worst');
title('VelX');
subplot(3,1,3);
plot(t, distX_ref);
hold on;
plot(t, distX_good);
plot(t, distX_worst);
hold off;
grid minor;
legend('ref', 'good', 'worst');
title('DistX');
The simulated data look much the same as the data used above.
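For convenience, here is a rough Python equivalent of the same generator (a sketch mirroring the offsets and variances of the MATLAB script; the names Time, AccX_HP, AccX_LP, RefPosX and RefVelX are chosen to match the pykalman example above):

import numpy as np

dt = 0.01
Time = np.arange(0, 70 + dt, dt)

accX_base = np.sin(0.3 * Time) + 0.5 * np.sin(0.04 * Time)

# reference ("ground truth") signals
accX_ref = accX_base + np.random.randn(Time.size) * np.sqrt(0.0005)
RefVelX = np.cumsum(accX_ref) * dt
RefPosX = np.cumsum(RefVelX) * dt

# good (high precision) sensor: small noise plus a small, slowly varying offset
AccX_HP = (accX_base
           + np.random.randn(Time.size) * np.sqrt(0.0007)
           + 0.001 + 0.0004 * np.sin(0.05 * Time)).reshape(-1, 1)

# poor (low precision) sensor: more noise plus a larger offset
AccX_LP = (accX_base
           + np.random.randn(Time.size) * np.sqrt(0.001)
           - 0.08 + 0.004 * np.sin(0.07 * Time)).reshape(-1, 1)

# reshape(-1, 1) is used because the filter loop above indexes AccX_Value[t, 0]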
I'm trying to solve a system of differential equations and find the trajectory of the Sun and Jupiter, but I don't get a nice trajectory, only some points.
Could you help? ("Soleil" means Sun)
Here's my code
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
from mc_deriv import deriv
start = 0
end = 14*365
nbpas = end/10
t = np.linspace(start,end,nbpas)
M = M_Soleil + M_Jupiter
x0 = x_Jupiter - x_Soleil
y0 = y_Jupiter - y_Soleil
vx0 = vx_Jupiter - vx_Soleil
vy0 = vy_Jupiter - vy_Soleil
syst_CI = [x0,y0,vx0,vy0]
Sols=odeint(deriv,syst_CI,t,args=(M,))
x = Sols[:, 0]
y = Sols[:, 1]
vx = Sols[:, 2]
vy = Sols[:, 3]
The initialisation
x_Soleil = -7.139143380212696e-03 # (UA)
y_Soleil = -2.792019770161695e-03 # (UA)
x_Jupiter = +3.996321311604079e+00 # (UA)
y_Jupiter = +2.932561211517850e+00 # (UA)
vx_Soleil = -7.139143380212696e-03 # (UA*j^-1)
vy_Soleil = -2.792019770161695e-03 # (UA*j^-1)
vx_Jupiter = +3.996321311604079e+00 # (UA*j^-1)
vy_Jupiter = +2.932561211517850e+00 # (UA*j^-1)
M_Soleil = 2e30 # masse Soleil (kg)
M_Jupiter = 1.9e27 # masse Jupiter (kg)
r_Soleil = 696e6 # rayon Soleil (m)
And the deriv function (imported from mc_deriv):
def deriv(syst, t, M):
    G = 6.67e-11
    x = syst[0]
    y = syst[1]
    vx = syst[2]
    vy = syst[3]
    dxdt = vx
    dydt = vy
    dvxdt = -(G*M*x)/((x**2+y**2)**(3/2))
    dvydt = -(G*M*y)/((x**2+y**2)**(3/2))
    return dxdt, dydt, dvxdt, dvydt
The plot
plt.figure(figsize=(7, 5))
plt.title("Trajectoires Soleil-Jupiter")
#plt.xlabel("UA)")
#plt.ylabel("UA)")
plt.plot(x, y, '-', color="red")
plt.show()
The result of the plot:
Eureka it works!!!!
Currently I see the following problems in your code that render your observations unreproducible (apart from missing reference values):
In the initial data, the length unit is the astronomical unit and the time unit is one day. The unit of the gravitational constant is m^3 kg^-1 s^-2, so to combine them in one formula you need to convert AU into m: the factor is about 1.5e11 m per AU, whose cube has to be divided out. And one day is 24*3600 seconds, which has to be multiplied in (squared, since G carries s^-2). A short conversion sketch follows this list.
The integration time interval should also be counted in years; at the moment you seem to think in days, a third unit without appropriate conversion factors. [solved: un jour = one day]
From the division in the construction of the time nodes it looks as if you are using Python 2; there the exponent 3/2 evaluates to 1 under integer division. You can use 1.5 in the exponent directly, as it is an exact value in binary floating point. [actually Python 3, in which case the first division should be an explicit integer division]
In copying the initial data you made a copy-paste error: the initial positions and velocities have the same numbers, while the real numbers should form perpendicular vectors. [not solved in the code; the image has the correct velocities] Looking for online data that fits your position, the NASA HORIZONS system gives me, for 2011-Nov-11 04:00, the Jupiter position and velocity as
pos: 3.996662712108880E+00, 2.938301820497121E+00, -1.017177623308866E-01,
vel: -4.560191659347578E-03, 6.440946682361135E-03, 7.529386668190383E-05
The normalization to a center-of-gravity frame needs to apply conservation of momentum; the mass of Jupiter is large enough that just subtracting the velocities might give physically wrong results. [not resolved; the initial data should already be barycentric, so no corrections should be necessary]
The varying accuracy of the physical constants will also introduce errors that lead away from the reference positions. The "dirtiest" constants visible at the moment are the gravitational constant and the masses, followed by the uncertainty in the type of year. Only the first two digits of any (correctly) computed result are reliable.
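A minimal sketch of the unit conversion from the first point (the AU and day values are standard; how you fold this into your own code is up to you):

# Convert G from SI units (m^3 kg^-1 s^-2) to AU^3 kg^-1 day^-2, so that it is
# consistent with positions in AU and velocities in AU/day.
AU = 1.495978707e11    # metres per astronomical unit
day = 86400.0          # seconds per day

G_SI = 6.674e-11                    # m^3 kg^-1 s^-2
G_AU_day = G_SI * day**2 / AU**3    # ~1.49e-34 AU^3 kg^-1 day^-2

# With M in kg, -G_AU_day*M*x/r**3 then yields accelerations in AU/day^2,
# matching initial positions in AU and velocities in AU/day.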
I want to get kernel density estimation for positive data points. Using Python Scipy Stats package, I came up with the following code.
import numpy as np
import scipy.stats as st

def get_pdf(data):
    a = np.array(data)
    ag = st.gaussian_kde(a)
    x = np.linspace(0, max(data), int(max(data)))
    y = ag(x)
    return x, y
This works perfectly for most data sets, but it gives an erroneous result for "all positive" data points. To make sure this works correctly, I use numerical integration to compute the area under this curve.
def trapezoidal_2(ag, a, b, n):
    h = float(b - a) / n
    s = 0.0
    s += ag(a)[0]/2.0
    for i in range(1, n):
        s += ag(a + i*h)[0]
    s += ag(b)[0]/2.0
    return s * h
Since the data is spread in the region (0, int(max(data))), we should get a value close to 1, when executing the following line.
b = 1
data = st.pareto.rvs(b, size=10000)
data = list(data)
a = np.array(data)
ag = st.gaussian_kde(a)
trapezoidal_2(ag, 0, int(max(data)), int(max(data))*2)
But it gives a value close to 0.5 when I test it.
However, when I integrate from -100 to max(data), it gives a value close to 1:
trapezoidal_2(ag, -100, int(max(data)), int(max(data))*2+200)
The reason is that ag (the KDE) is defined for values less than 0, even though the original data set contains only positive values.
So how can I get a kernel density estimation that considers only positive values, such that the area under the curve in the region (0, max(data)) is close to 1?
The choice of bandwidth is quite important when performing kernel density estimation. Scott's rule and Silverman's rule work well for distributions similar to a Gaussian, but they do not work well for the Pareto distribution.
Quote from the docs:
Bandwidth selection strongly influences the estimate obtained from the KDE (much more so than the actual shape of the kernel). Bandwidth selection can be done by a "rule of thumb", by cross-validation, by "plug-in methods" or by other means; see [3], [4] for reviews. gaussian_kde uses a rule of thumb, the default is Scott's Rule.
Try with different bandwidth values, for example:
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
b = 1
sample = stats.pareto.rvs(b, size=3000)
kde_sample_scott = stats.gaussian_kde(sample, bw_method='scott')
kde_sample_scalar = stats.gaussian_kde(sample, bw_method=1e-3)
# Compute the integrals:
print('integrale scott:', kde_sample_scott.integrate_box_1d(0, np.inf))
print('integrale scalar:', kde_sample_scalar.integrate_box_1d(0, np.inf))
# Graph:
x_span = np.logspace(-2, 1, 550)
plt.plot(x_span, stats.pareto.pdf(x_span, b), label='theoretical pdf')
plt.plot(x_span, kde_sample_scott(x_span), label="estimated pdf 'scott'")
plt.plot(x_span, kde_sample_scalar(x_span), label="estimated pdf 'scalar'")
plt.xlabel('X'); plt.legend();
gives:
integrale scott: 0.5572130540733236
integrale scalar: 0.9999999999968957
and:
We see that the KDE using Scott's rule is wrong: over the positive axis it integrates to only about 0.56 instead of 1.
I'm running into a problem when using sklearn's FastICA. I'm trying to predict what the 'measured' variables (X in the code) would be if one of the estimated 'sources' were changed in a given way. I'm modifying this example.
I think the problem is that FastICA only approximates the 'mixing' matrix, and ica.mixing_ is very different from what I used to generate the data. I understand that the mixing matrix is not uniquely defined, since only the product np.dot(S, A.T) matters: changing S to S*a and A to A/a yields the same result for any a != 0.
Any ideas? Thanks for reading and helping
Here is my code.
# this is exactly how the example starts
import numpy as np
from scipy import signal
from sklearn.decomposition import FastICA
import numpy.testing as np_testing

np.random.seed(0)
n_samples = 200
time = np.linspace(0, 8, n_samples)
s1 = np.sin(2 * time) # Signal 1 : sinusoidal signal
s2 = np.sign(np.sin(3 * time)) # Signal 2 : square signal
s3 = signal.sawtooth(2 * np.pi * time) # Signal 3: saw tooth signal
S = np.c_[s1, s2, s3]
S += 0.2 * np.random.normal(size=S.shape) # Add noise
S /= S.std(axis=0) # Standardize data
# Here I'm changing the example. I'm modifying the 'mixing' array
# such that s1 is not mixed with either s2 or s3
A = np.array([[1, 0, 0], [0, 2, 1.0], [0, 1.0, 2.0]]) # Mixing matrix
# Mix data,
X = np.dot(S, A.T) # Generate observations
# Compute ICA
ica = FastICA()
S_ = ica.fit_transform(X) # Reconstruct signals
A_ = ica.mixing_ # Get estimated mixing matrix
# We can `prove` that the ICA model applies by reverting the unmixing.
assert np.allclose(X, np.dot(S_, A_.T) + ica.mean_)
# Here is where my real code starts,
# Now modify source s1
s1 *= 1.1
S = np.c_[s1, s2, s3]
S /= S.std(axis=0) # Standardize data
# regenerate observations.
# Note that original code in the example uses np.dot(S, A.T)
# (that doesn't work either). I'm using ica.inverse_transform
# because it is what is documented, but also because there is a
# FastICA.mean_ that is not documented and I'm hoping
# inverse_transform uses it in the right way.
# modified_X = np.dot(S, A.T) # does not work either
modified_X = ica.inverse_transform(S)
# check that last 2 observations are not changed
# The original 'mixing' array was defined to mix s2 and s3 but not s1
# Next tests fail
np_testing.assert_array_almost_equal(X[:, 1], modified_X[:, 1])
np_testing.assert_array_almost_equal(X[:, 2], modified_X[:, 2])
I'm posting my findings in case they help anyone.
I think there are two problems with the code I posted:
When fitting the ICA, the exact 'mixing' matrix is not found, so the solution leaks source 1 into all measured outputs. The leakage should be small with lots of data, but it is still there. However, I don't see a change in behavior when increasing the amount of simulated data or when changing FastICA's max_iter or tol parameters.
The order of the sources is unpredictable; in the code I was assuming that the recovered S_ was in the same order as S, which is wrong. Looping over all sources after fit_transform and changing one at a time (see the sketch below), I see results that are close to what I expect: two of the sources (1 and 2 for me) mostly affect measured variables 2 and 3, and the third source mostly affects measured variable 1, with only minor impact on variables 2 and 3.
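A minimal sketch of that loop (it assumes the ica, S_ and X objects from the code above are still in scope, and reuses the 1.1 scaling factor from the question):

import numpy as np

for k in range(S_.shape[1]):
    S_mod = S_.copy()
    S_mod[:, k] *= 1.1                       # perturb one recovered source
    X_mod = ica.inverse_transform(S_mod)     # map back to observation space
    impact = np.abs(X_mod - X).mean(axis=0)  # mean absolute change per measured variable
    print("source", k, "impact on X columns:", impact)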
This link provides code for an autocorrelation-based pitch detection algorithm. I am using it to detect pitches in simple guitar melodies.
In general, it produces very good results. For example, for the melody C4, C#4, D4, D#4, E4 it outputs:
262.743653536
272.144441273
290.826273006
310.431336809
327.094621169
which correspond to the correct notes.
However, in some cases like this audio file (E4, F4, F#4, G4, G#4, A4, A#4, B4) it produces errors:
325.861452246
13381.6439242
367.518651703
391.479384923
414.604661221
218.345286173
466.503751322
244.994090035
More specifically, there are three errors here: 13381 Hz is wrongly detected instead of F4 (~350 Hz), which is a strange error, while 218 Hz instead of A4 (440 Hz) and 244 Hz instead of B4 (~494 Hz) are octave errors.
I assume these two kinds of error have different causes? Here is the code:
slices = segment_signal(y, sr)
for segment in slices:
    pitch = freq_from_autocorr(segment, sr)
    print(pitch)

def segment_signal(y, sr, onset_frames=None, offset=0.1):
    if onset_frames is None:
        onset_frames = remove_dense_onsets(librosa.onset.onset_detect(y=y, sr=sr))
    offset_samples = int(librosa.time_to_samples(offset, sr))
    print(onset_frames)
    slices = np.array([y[i : i + offset_samples] for i
                       in librosa.frames_to_samples(onset_frames)])
    return slices
You can see the freq_from_autocorr function in the first link above.
The only thing I have changed is this line:
corr = corr[len(corr)/2:]
Which I have replaced with:
corr = corr[int(len(corr)/2):]
UPDATE:
I noticed that the smaller the offset I use (i.e. the smaller the signal segment used to detect each pitch), the more high-frequency (10000+ Hz) errors I get.
Specifically, I noticed that what differs in those (10000+ Hz) cases is the calculation of the i_peak value. In cases with no error it is in the range 50-150, whereas in the error cases it is 3-5.
The autocorrelation function in the code snippet that you linked is not particularly robust. In order to get the correct result, it needs to locate the first peak on the left hand side of the autocorrelation curve. The method that the other developer used (calling the numpy.argmax() function) does not always find the correct value.
I've implemented a slightly more robust version, using the peakutils package. I don't promise that it's perfectly robust either, but in any case it achieves a better result than the version of the freq_from_autocorr() function that you were previously using.
My example solution is listed below:
import librosa
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import fftconvolve
from pprint import pprint
import peakutils
def freq_from_autocorr(signal, fs):
    # Calculate autocorrelation (same thing as convolution, but with one input
    # reversed in time), and throw away the negative lags
    signal -= np.mean(signal)  # Remove DC offset
    corr = fftconvolve(signal, signal[::-1], mode='full')
    corr = corr[len(corr)//2:]
    # Find the first peak on the left
    i_peak = peakutils.indexes(corr, thres=0.8, min_dist=5)[0]
    i_interp = parabolic(corr, i_peak)[0]
    return fs / i_interp, corr, i_interp

def parabolic(f, x):
    """
    Quadratic interpolation for estimating the true position of an
    inter-sample maximum when nearby samples are known.

    f is a vector and x is an index for that vector.

    Returns (vx, vy), the coordinates of the vertex of a parabola that goes
    through point x and its two neighbors.

    Example:
    Defining a vector f with a local maximum at index 3 (= 6), find local
    maximum if points 2, 3, and 4 actually defined a parabola.

    In [3]: f = [2, 3, 1, 6, 4, 2, 3, 1]
    In [4]: parabolic(f, argmax(f))
    Out[4]: (3.2142857142857144, 6.1607142857142856)
    """
    xv = 1/2. * (f[x-1] - f[x+1]) / (f[x-1] - 2 * f[x] + f[x+1]) + x
    yv = f[x] - 1/4. * (f[x-1] - f[x+1]) * (xv - x)
    return (xv, yv)

# Time window after initial onset (in units of seconds)
window = 0.1

# Open the file and obtain the sampling rate
y, sr = librosa.core.load("./Vocaroo_s1A26VqpKgT0.mp3")
idx = np.arange(len(y))

# Set the window size in terms of number of samples
winsamp = int(window * sr)

# Calculate the onset frames in the usual way
onset_frames = librosa.onset.onset_detect(y=y, sr=sr)
onstm = librosa.frames_to_time(onset_frames, sr=sr)

fqlist = []  # List of estimated frequencies, one per note
crlist = []  # List of autocorrelation arrays, one array per note
iplist = []  # List of interpolated peak indices, one per note
for tm in onstm:
    startidx = int(tm * sr)
    freq, corr, ip = freq_from_autocorr(y[startidx:startidx+winsamp], sr)
    fqlist.append(freq)
    crlist.append(corr)
    iplist.append(ip)

pprint(fqlist)

# Choose which notes to plot (it's set to show all 8 notes in this case)
plidx = [0, 1, 2, 3, 4, 5, 6, 7]

# Plot amplitude curves of all notes in the plidx list
fgwin = plt.figure(figsize=[8, 10])
fgwin.subplots_adjust(bottom=0.0, top=0.98, hspace=0.3)
axwin = []
ii = 1
for tm in onstm[plidx]:
    axwin.append(fgwin.add_subplot(len(plidx)+1, 1, ii))
    startidx = int(tm * sr)
    axwin[-1].plot(np.arange(startidx, startidx+winsamp), y[startidx:startidx+winsamp])
    ii += 1
axwin[-1].set_xlabel('Sample ID Number', fontsize=18)
fgwin.show()

# Plot autocorrelation function of all notes in the plidx list
fgcorr = plt.figure(figsize=[8, 10])
fgcorr.subplots_adjust(bottom=0.0, top=0.98, hspace=0.3)
axcorr = []
ii = 1
for cr, ip in zip([crlist[ij] for ij in plidx], [iplist[ij] for ij in plidx]):
    if ii == 1:
        shax = None
    else:
        shax = axcorr[0]
    axcorr.append(fgcorr.add_subplot(len(plidx)+1, 1, ii, sharex=shax))
    axcorr[-1].plot(np.arange(500), cr[0:500])
    # Plot the location of the leftmost peak
    axcorr[-1].axvline(ip, color='r')
    ii += 1
axcorr[-1].set_xlabel('Time Lag Index (Zoomed)', fontsize=18)
fgcorr.show()
The printed output looks like:
In [1]: %run autocorr.py
[325.81996740236065,
346.43374761017725,
367.12435233192753,
390.17291696559079,
412.9358117076161,
436.04054933498134,
465.38986619237039,
490.34120132405866]
The first figure produced by my code sample depicts the amplitude curves for the next 0.1 seconds following each detected onset time:
The second figure produced by the code shows the autocorrelation curves, as computed inside of the freq_from_autocorr() function. The vertical red lines depict the location of the first peak on the left for each curve, as estimated by the peakutils package. The method used by the other developer was getting incorrect results for some of these red lines; that's why his version of that function was occasionally returning the wrong frequencies.
My suggestion would be to test the revised version of the freq_from_autocorr() function on other recordings, see if you can find more challenging examples where even the improved version still gives incorrect results, and then get creative and try to develop an even more robust peak finding algorithm that never, ever mis-fires.
The autocorrelation method is not always right. You may want to implement a more sophisticated method like YIN:
http://audition.ens.fr/adc/pdf/2002_JASA_YIN.pdf
or MPM:
http://www.cs.otago.ac.nz/tartini/papers/A_Smarter_Way_to_Find_Pitch.pdf
Both of the above papers are good reads.
I'm new to Python, and I'm currently working on a project that computes the joint distribution of a Markov process.
An example of a stochastic kernel is the one used in a recent study by Hamilton (2005), who investigates a nonlinear statistical model of the business cycle based on US unemployment data. As part of his calculation he estimates the kernel
pH := [[0.971, 0.029, 0    ],
       [0.145, 0.778, 0.077],
       [0,     0.508, 0.492]]
Here S = {x1, x2, x3} = {NG, MR, SR}, where NG corresponds to normal growth, MR to mild recession, and SR to severe recession. For example, the probability of transitioning from severe recession to mild recession in one period is 0.508. The length of the period is one month.
The exercise based on the above Markov process is:
With regard to Hamilton's kernel pH, and using the same initial condition ψ = (0.2, 0.2, 0.6), compute the probability that the economy starts and remains in recession through periods 0, 1, 2 (i.e., that x_t ≠ NG for t = 0, 1, 2).
My script is as follows:
import numpy as np
## In this case, X should be a matrix rather than vector
## and we compute w.r.t P rather than merely its element [i][j]
path = []
def path_prob2(p, psi, x2):  # x2: a sequence giving the path
    prob = psi  # initial distribution as a row vector
    for t in range(x2.shape[1] - 1):  # .shape[1] gives the number of columns
        prob = np.dot(prob, p)  # prob: marginal distribution at period t
        ression = np.dot(prob, x2[:, t])
        path.append(ression)
    return path, prob
p = ((0.971, 0.029, 0 ),
(0.145, 0.778, 0.077),
(0 , 0.508, 0.492))
# p must be a 2-D numpy array
p = np.array(p)
psi = (0.2, 0.2, 0.6)
psi = np.array(psi)
x2 = ((0,0,0),
(1,1,1),
(1,1,1))
x2 = np.array(x2)
path_prob2(p,psi,x2)
Two problems arise during execution. The first is that in the first round of the loop I don't need the initial distribution psi to post-multiply the transition matrix p, so the probability of "remaining in recession" should just be 0.2 + 0.6 = 0.8, but I don't know how to write the if-statement for that.
The second is that, as you may note, I use a list named path to collect the probability of "remaining in recession" in each period. At the end I need to multiply every element in the list together, like path[0]*path[1]*path[2], but I haven't found a method for that (np.multiply only takes two arguments as far as I know). Please give me some clues if such a method exists.
Also, please give me any suggestions you think could make the code more efficient. Thank you.
If I understood you correctly, this should work (I'd love to see manual calculations for some of the steps/outcomes to check against). Note that I didn't use an if/else statement, but instead handled the first column separately and started iterating from the second column:
import numpy as np

# In this case, X should be a matrix rather than a vector
# and we compute w.r.t. P rather than merely its element [i][j]
path = []

def path_prob2(p, psi, x2):  # x2: a sequence giving the path
    path.append(np.dot(psi, x2[:, 0]))  # first step, no transition applied yet
    prob = psi  # initial distribution as a row vector
    for t in range(1, x2.shape[1]):  # .shape[1] gives the number of columns
        prob = np.dot(prob, p)  # prob: marginal distribution at period t
        path.append(np.dot(prob, x2[:, t]))  # probability of being in recession at period t
    return path, prob

# p must be a 2-D numpy array
p = np.array([[0.971, 0.029, 0],
              [0.145, 0.778, 0.077],
              [0,     0.508, 0.492]])

psi = np.array([0.2, 0.2, 0.6])

x2 = np.array([[0, 0, 0],
               [1, 1, 1],
               [1, 1, 1]])

print(path_prob2(p, psi, x2))
For your second question, numpy.prod will multiply all the elements of a list/array together.
You can use prod like this:
>>> np.prod([15,20,31])
9300
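Tying the two parts together (a sketch, following the element-by-element multiplication asked for in the question; path is the list returned by path_prob2 above):

joint = np.prod(path)  # equivalent to path[0] * path[1] * path[2]
print(joint)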