Bjontegaard calculation using only one pair of PSNR and BitRate - python

I want to calculate the BD-Rate for two different video encoding settings using the python script below.
Using 4 RD points (R1 and PSNR1 are the reference RD points of Video 1, while R2 and PSNR2 come from the new test with different video settings, Video 2), the script works fine, i.e.
from bjontegaard_metric import *
R1 = np.array([686.76, 309.58, 157.11, 85.95])
PSNR1 = np.array([40.28, 37.18, 34.24, 31.42])
R2 = np.array([893.34, 407.8, 204.93, 112.75])
PSNR2 = np.array([40.39, 37.21, 34.17, 31.24])
print('BD-PSNR: ', BD_PSNR(R1, PSNR1, R2, PSNR2))
print('BD-RATE: ', BD_RATE(R1, PSNR1, R2, PSNR2))
But with just 1 RD point, i.e.
from bjontegaard_metric import *
R1 = np.array([686.76])
PSNR1 = np.array([40.28])
R2 = np.array([893.34])
PSNR2 = np.array([40.39])
print('BD-PSNR: ', BD_PSNR(R1, PSNR1, R2, PSNR2))
print('BD-RATE: ', BD_RATE(R1, PSNR1, R2, PSNR2))
I get a warning: RankWarning: Polyfit may be poorly conditioned. Each video encoder run returns just one pair of PSNR and bitrate as a result, so I want to compare two pairs of PSNR/bitrate (reference video vs. modified video). Is there any way to fix this warning? Are the results I get using only 1 RD point reliable?
import numpy as np
import scipy.interpolate
def BD_PSNR(R1, PSNR1, R2, PSNR2, piecewise=0):
    lR1 = np.log(R1)
    lR2 = np.log(R2)

    PSNR1 = np.array(PSNR1)
    PSNR2 = np.array(PSNR2)

    p1 = np.polyfit(lR1, PSNR1, 3)
    p2 = np.polyfit(lR2, PSNR2, 3)

    # integration interval
    min_int = max(min(lR1), min(lR2))
    max_int = min(max(lR1), max(lR2))

    # find integral
    if piecewise == 0:
        p_int1 = np.polyint(p1)
        p_int2 = np.polyint(p2)
        int1 = np.polyval(p_int1, max_int) - np.polyval(p_int1, min_int)
        int2 = np.polyval(p_int2, max_int) - np.polyval(p_int2, min_int)
    else:
        # See https://chromium.googlesource.com/webm/contributor-guide/+/master/scripts/visual_metrics.py
        lin = np.linspace(min_int, max_int, num=100, retstep=True)
        interval = lin[1]
        samples = lin[0]
        v1 = scipy.interpolate.pchip_interpolate(np.sort(lR1), PSNR1[np.argsort(lR1)], samples)
        v2 = scipy.interpolate.pchip_interpolate(np.sort(lR2), PSNR2[np.argsort(lR2)], samples)
        # Calculate the integral using the trapezoid method on the samples.
        int1 = np.trapz(v1, dx=interval)
        int2 = np.trapz(v2, dx=interval)

    # find avg diff
    avg_diff = (int2 - int1) / (max_int - min_int)
    return avg_diff
def BD_RATE(R1, PSNR1, R2, PSNR2, piecewise=0):
    lR1 = np.log(R1)
    lR2 = np.log(R2)

    # rate method
    p1 = np.polyfit(PSNR1, lR1, 3)
    p2 = np.polyfit(PSNR2, lR2, 3)

    # integration interval
    min_int = max(min(PSNR1), min(PSNR2))
    max_int = min(max(PSNR1), max(PSNR2))

    # find integral
    if piecewise == 0:
        p_int1 = np.polyint(p1)
        p_int2 = np.polyint(p2)
        int1 = np.polyval(p_int1, max_int) - np.polyval(p_int1, min_int)
        int2 = np.polyval(p_int2, max_int) - np.polyval(p_int2, min_int)
    else:
        lin = np.linspace(min_int, max_int, num=100, retstep=True)
        interval = lin[1]
        samples = lin[0]
        v1 = scipy.interpolate.pchip_interpolate(np.sort(PSNR1), lR1[np.argsort(PSNR1)], samples)
        v2 = scipy.interpolate.pchip_interpolate(np.sort(PSNR2), lR2[np.argsort(PSNR2)], samples)
        # Calculate the integral using the trapezoid method on the samples.
        int1 = np.trapz(v1, dx=interval)
        int2 = np.trapz(v2, dx=interval)

    # find avg diff
    avg_exp_diff = (int2 - int1) / (max_int - min_int)
    avg_diff = (np.exp(avg_exp_diff) - 1) * 100
    return avg_diff

According to the IETF draft at https://tools.ietf.org/id/draft-ietf-netvc-testing-06.html#rfc.section.4.2, item 2: "At least four points must be computed. These points should be the same quantizers when comparing two versions of the same codec." So anything fewer than four points will not give reliable results (see the sketch after the quoted procedure below). The full procedure is:
1. Rate/distortion points are calculated for the reference and test codec.
2. At least four points must be computed. These points should be the same quantizers when comparing two versions of the same codec.
3. Additional points outside of the range should be discarded.
4. The rates are converted into log-rates.
5. A piecewise cubic hermite interpolating polynomial is fit to the points for each codec to produce functions of log-rate in terms of distortion.
Metric score ranges are computed:
1. If comparing two versions of the same codec, the overlap is the intersection of the two curves, bound by the chosen quantizer points.
2. If comparing dissimilar codecs, a third anchor codec’s metric scores at fixed quantizers are used directly as the bounds.
3. The log-rate is numerically integrated over the metric range for each curve, using at least 1000 samples and trapezoidal integration.
4. The resulting integrated log-rates are converted back into linear rate, and then the percent difference is calculated from the reference to the test codec.
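This also explains the behaviour above: np.polyfit needs at least degree + 1 points for the cubic fit (hence the RankWarning), and with one RD point per configuration the overlap interval even becomes empty (min_int > max_int for the values in the question), so any number the script returns is meaningless. If each encoder run yields only a single bitrate/PSNR pair, either encode at four or more quantizer settings to build a curve, or fall back to a plain per-point comparison, which is not a Bjontegaard metric. A minimal sketch of that fallback, using the single points from the question:
# one RD point per configuration (values from the question)
R1, PSNR1 = 686.76, 40.28    # reference encode
R2, PSNR2 = 893.34, 40.39    # modified encode

delta_psnr = PSNR2 - PSNR1            # quality difference in dB
delta_rate = (R2 - R1) / R1 * 100.0   # bitrate difference, percent of reference

print('delta PSNR: %.2f dB' % delta_psnr)
print('delta rate: %.2f %%' % delta_rate)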


How to compute Dimitrov spectral fatigue index in Python?

I have an existing Python script to calculate mean power frequency. Additionally, I would like to calculate Dimitrov's spectral fatigue index. Its formula differs slightly from mean power frequency: instead of the ratio of the spectral moments of order 1 and 0, it uses the moments of order -1 and 5. I thought I could simply swap in the moments of interest, but I only get inf values in the renamed FInms5 function.
mean power frequency: MNF = sum(P(f) * f) / sum(P(f)) -- the ratio of the spectral moments of order 1 and 0
spectral fatigue index (FInms5 in the code): sum(P(f) * f**-1) / sum(P(f) * f**5) -- the ratio of the spectral moments of order -1 and 5
This is the working mean freq function:
import numpy as np
from scipy.signal import periodogram

def get_mean_freq(emg_sig, sfreq, epoch_duration=0.5):
    '''
    Parameters
    ----------
    emg_sig : array
        pre-filtered emg data.
    sfreq : int
        emg sampling frequency, in Hz.
    epoch_duration : float
        epoch (time window) duration, in seconds.

    Returns
    -------
    mean_freq: array
        mean frequency at each epoch
    time_points: array
        time point at the center of each evaluated epoch
    samples: array
        sample numbers at the center of each evaluated epoch

    Method according to:
    https://stackoverflow.com/questions/37922928/difference-in-mean-frequency-in-python-and-matlab
    '''
    ons = range(
        0,
        len(emg_sig),
        int(epoch_duration * sfreq)
    )
    mean_freq = np.empty((len(ons),))
    samples = np.empty((len(ons),))
    time_points = np.empty((len(ons),))

    for i, on in enumerate(ons):  # i,on = 0,ons[0]
        off = ons[i + 1] - 1 if i + 1 < len(ons) else len(emg_sig)
        # print([on,off])
        processing_window = emg_sig[on:off]
        mid_point = (on + off) / 2
        samples[i] = mid_point
        # t_epoch[ch] += [mid_point / sfreq]
        time_points[i] = mid_point / sfreq

        # Processing window power spectrum (PSD) generation
        f, Pxx_den = periodogram(np.array(processing_window), fs=float(sfreq))  # plot_freq_spectrum(f, Pxx_den)
        Pxx_den = np.reshape(Pxx_den, (1, -1))
        # bin width replicated across the frequency bins (Pxx_den is 2D, so index 1)
        width = np.tile(f[2] - f[0], (1, Pxx_den.shape[1]))
        f = np.reshape(f, (1, -1))
        P = Pxx_den * width
        pwr = np.sum(P)
        mean_freq[i] = np.dot(P, f.T) / pwr

    return mean_freq, time_points, samples
I changed the last line to:
mean_freq[i] = np.dot(P, f.T**-1) / np.dot(P, f.T**5)
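A likely cause of the inf values (an assumption, but consistent with the symptom): the periodogram's first frequency bin is f = 0, and 0**-1 is infinite, so the order -1 moment blows up. A minimal sketch of the moment ratio that simply skips the DC bin; the function name spectral_fatigue_index is just illustrative:
import numpy as np
from scipy.signal import periodogram

def spectral_fatigue_index(emg_window, sfreq):
    # Power spectrum of the epoch
    f, Pxx = periodogram(np.asarray(emg_window), fs=float(sfreq))
    # Drop the DC bin: f[0] == 0 would make f**-1 infinite
    f, Pxx = f[1:], Pxx[1:]
    m_neg1 = np.sum(Pxx * f**-1.0)   # spectral moment of order -1
    m_5 = np.sum(Pxx * f**5.0)       # spectral moment of order 5
    return m_neg1 / m_5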

Checking if Frequentist approach is correct? Bayesian approach using MCMC for AB test. How to calculate Bayes Factors in Python?

I've been trying to get my head around Frequentist and Bayesian approaches for a toy data AB test problem.
The results don't really make sense to me, and I am struggling to tell whether I have computed them correctly (a mistake on my part is quite likely). Furthermore, after much research, I am still somewhat lost as to how to compute Bayes Factors. I've seen packages in R that make this look fairly easy, but I am not familiar with R and would prefer to solve this problem in Python.
I would greatly appreciate any help and guidance regarding this!
Here is the data:
# imports
import pingouin as pg
import pymc3 as pm
import pandas as pd
import numpy as np
import scipy.stats as scs
import statsmodels.stats.api as sms
import math
import matplotlib.pyplot as plt
# A = control -- B = treatment
a_success = 10730
a_failure = 61988
a_total = a_success + a_failure
a_cr = a_success / a_total
b_success = 10966
b_failure = 60738
b_total = b_success + b_failure
b_cr = b_success / b_total
I started by doing some power analysis to determine the required number of samples, with a power of 0.8, an alpha of 0.05 and a practical significance of 2%. I'm not sure whether the expected conversion rates should be supplied, or the baseline plus some proportion; depending on the effect size, the required number of samples increases significantly.
# determine required sample size
baseline_rate = a_cr
practical_significance = 0.02
alpha = 0.05
power = 0.8
nobs1 = None
# is this how to calculate effect size?
effect_size = sms.proportion_effectsize(baseline_rate, baseline_rate + practical_significance) # 5204
# # or this?
# effect_size = sms.proportion_effectsize(baseline_rate, baseline_rate + baseline_rate * practical_significance) # 228583
sample_size = sms.NormalIndPower().solve_power(effect_size=effect_size,
                                               power=power,
                                               alpha=alpha,
                                               nobs1=nobs1,
                                               ratio=1)
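On the effect-size question: sms.proportion_effectsize(p1, p2) computes Cohen's h from two absolute proportions, so the first variant (baseline vs. baseline + 2 percentage points) corresponds to an absolute minimum detectable effect, while the commented-out relative variant targets a much smaller absolute difference and therefore demands far more samples. A quick check against the textbook formula (the numbers below are taken from the data at the top of the question):
import numpy as np
import statsmodels.stats.api as sms

p_base = 10730 / (10730 + 61988)   # a_cr from the data above (~0.1476)
h = sms.proportion_effectsize(p_base, p_base + 0.02)
# statsmodels returns Cohen's h: 2*arcsin(sqrt(p1)) - 2*arcsin(sqrt(p2))
h_manual = 2 * np.arcsin(np.sqrt(p_base)) - 2 * np.arcsin(np.sqrt(p_base + 0.02))
print(h, h_manual)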
I continued trying to determine if the null hypothesis could be rejected:
# calculate pooled probability
pooled_probability = (a_success + b_success) / (a_total + b_total)

# calculate pooled standard error and margin of error
se_pooled = math.sqrt(pooled_probability * (1 - pooled_probability) * (1 / b_total + 1 / a_total))
z_score = scs.norm.ppf(1 - alpha / 2)
margin_of_error = se_pooled * z_score

# the estimated difference between probability of conversions of both groups
d_hat = (b_success / b_total) - (a_success / a_total)

# test if null hypothesis can be rejected
lower_bound = d_hat - margin_of_error
upper_bound = d_hat + margin_of_error

if practical_significance < lower_bound:
    print("reject null hypothesis -- groups do not have the same conversion rates")
else:
    print("do not reject the null hypothesis -- groups have the same conversion rates")
which evaluates to 'do not reject the null ...' despite group B (treatment) showing a 3.65% relative improvement with regards to conversion rate over group A (control) which seems... odd?
I tried a slightly different approach (I guess a slightly different hypothesis?):
successes = [a_success, b_success]
nobs = [a_total, b_total]
z_stat, p_value = sms.proportions_ztest(successes, nobs=nobs)
(lower_a, lower_b), (upper_a, upper_b) = sms.proportion_confint(successes, nobs=nobs, alpha=alpha)
if p_value < alpha:
    print("reject null hypothesis -- groups do not have the same conversion rates")
else:
    print("do not reject the null hypothesis -- groups have the same conversion rates")
Which evaluates to 'reject null hypothesis ... ' with p-value: 0.004236. This seems highly contradictory, especially since the p-value is < 0.01.
On to Bayes... I created arrays of successes and failures (and only tested on a subset of the observations, due to how long this takes) and ran the following:
# generate arrays of 1s and 0s
obs_a = np.repeat([1, 0], [a_success, a_failure])
obs_b = np.repeat([1, 0], [b_success, b_failure])
for _ in range(10):
    np.random.shuffle(obs_a)
    np.random.shuffle(obs_b)

with pm.Model() as model:
    p_A = pm.Beta("p_A", 1, 1)
    p_B = pm.Beta("p_B", 1, 1)
    delta = pm.Deterministic("delta", p_A - p_B)
    obs_A = pm.Bernoulli("obs_A", p_A, observed=obs_a[:1000])
    obs_B = pm.Bernoulli("obs_B", p_B, observed=obs_b[:1000])
    step = pm.NUTS()
    trace = pm.sample(1000, step=step, chains=2)
Firstly, I understand that you are supposed to burn some proportion of the trace -- how do you determine an appropriate number of indices to burn?
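A side note on the burn-in point, since PyMC3 handles it a little differently from the older book code: the tuning (warm-up) phase plays that role, and pm.sample discards the tuned draws by default, so instead of slicing the trace afterwards you can request enough tuning steps up front. A small sketch reusing the variables above, with arbitrary draw counts:
with pm.Model() as model:
    p_A = pm.Beta("p_A", 1, 1)
    p_B = pm.Beta("p_B", 1, 1)
    delta = pm.Deterministic("delta", p_A - p_B)
    pm.Bernoulli("obs_A", p_A, observed=obs_a[:1000])
    pm.Bernoulli("obs_B", p_B, observed=obs_b[:1000])
    # 1000 warm-up draws per chain are run and dropped automatically
    trace = pm.sample(draws=2000, tune=1000, chains=2)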
In trying to evaluate the posterior probabilities, is the following code the correct way to do this?
b_lift = (trace['p_B'].mean() - trace['p_A'].mean()) / trace['p_A'].mean() * 100
b_prob = np.mean(trace["delta"] > 0)
a_lift = (trace['p_A'].mean() - trace['p_B'].mean()) / trace['p_B'].mean() * 100
a_prob = np.mean(trace["delta"] < 0)
# is the Bayes Factor just the ratio of the posterior probabilities for these two models?
BF = (trace['p_B'] / trace['p_A']).mean()
print(f'There is {b_prob} probability B outperforms A by a magnitude of {round(b_lift, 2)}%')
print(f'There is {a_prob} probability A outperforms B by a magnitude of {round(a_lift, 2)}%')
print('BF:', BF)
-- output:
There is 0.666 probability B outperforms A by a magnitude of 1.29%
There is 0.334 probability A outperforms B by a magnitude of -1.28%
BF: 1.013357654428127
I suspect that this is not the correct way to calculate Bayes Factors. How can the Bayes Factor be calculated?
I really hope you can help me understand all of the above... I realize it's an exceptionally long post. But I've tried every resource I can find and am still stuck!
Kind regards.
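One way to get an actual Bayes factor for this A/B setup (a sketch, not the only definition): with Beta(1, 1) priors on the conversion rates, the marginal likelihoods of "both groups share one rate" (H0) and "the groups have independent rates" (H1) have closed forms, so the Beta-Binomial Bayes factor can be computed analytically with log-Beta functions. The helper name below is just for illustration:
import numpy as np
from scipy.special import betaln

def bf_two_proportions(s_a, f_a, s_b, f_b):
    # H1: independent rates, each with a Beta(1, 1) prior
    log_m1 = betaln(s_a + 1, f_a + 1) + betaln(s_b + 1, f_b + 1)
    # H0: a single shared rate with a Beta(1, 1) prior
    log_m0 = betaln(s_a + s_b + 1, f_a + f_b + 1)
    # The binomial coefficients are identical under both models and cancel
    return np.exp(log_m1 - log_m0)   # BF10: evidence for different rates

print(bf_two_proportions(10730, 61988, 10966, 60738))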

How to align two sets of points (translation+rotation) when those sets contain noise?

Consider the following two sets of points. I would like to find the optimal 2D translation and rotation that aligns the largest number of points between dataset blue and dataset orange, where a point is considered aligned if the distance to its nearest neighbor in the other dataset is smaller than a threshold.
I understand that this is related to "Iterative Closest Point" algorithms, but in this case the situation is a bit harder because not all points from one dataset are in the other, and also because some points may turn out to be "false positives" (noise).
Is there an efficient way of doing this?
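As an aside, the matching criterion in the question (a point counts as aligned if its nearest neighbour in the other set lies within a threshold) is cheap to evaluate with a k-d tree, which makes it easy to score any candidate rotation + translation. A minimal sketch, with hypothetical parameter values:
import numpy as np
from scipy.spatial import cKDTree

def count_aligned(moving, reference, angle, shift, threshold=5.0):
    # Apply a 2D rotation (about the origin) followed by a translation
    c, s = np.cos(angle), np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    moved = moving @ rot.T + shift
    # A point is "aligned" if its nearest reference neighbour is within the threshold
    dists, _ = cKDTree(reference).query(moved)
    return int(np.sum(dists < threshold))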
I came across the same problem when comparing CCD star observation figures. The basic idea is to find the best-matching triangles between the two sets of points.
I then use the astroalign package to calculate the transformation matrix and align all the points. Thank the Lord, it works pretty well.
import itertools
import numpy as np
import matplotlib.pyplot as plt
import astroalign as aa
def getTriangles(set_X, X_combs):
    """
    Inefficient way of obtaining the lengths of each triangle's side.
    Normalized so that the minimum length is 1.
    """
    triang = []
    for p0, p1, p2 in X_combs:
        d1 = np.sqrt((set_X[p0][0] - set_X[p1][0]) ** 2 +
                     (set_X[p0][1] - set_X[p1][1]) ** 2)
        d2 = np.sqrt((set_X[p0][0] - set_X[p2][0]) ** 2 +
                     (set_X[p0][1] - set_X[p2][1]) ** 2)
        d3 = np.sqrt((set_X[p1][0] - set_X[p2][0]) ** 2 +
                     (set_X[p1][1] - set_X[p2][1]) ** 2)
        d_min = min(d1, d2, d3)
        d_unsort = [d1 / d_min, d2 / d_min, d3 / d_min]
        triang.append(sorted(d_unsort))
    return triang

def sumTriangles(ref_triang, in_triang):
    """
    For each normalized triangle in ref, compare with each normalized triangle
    in B. find the differences between their sides, sum their absolute values,
    and select the two triangles with the smallest sum of absolute differences.
    """
    tr_sum, tr_idx = [], []
    for i, ref_tr in enumerate(ref_triang):
        for j, in_tr in enumerate(in_triang):
            # Absolute value of lengths differences.
            tr_diff = abs(np.array(ref_tr) - np.array(in_tr))
            # Sum the differences
            tr_sum.append(sum(tr_diff))
            tr_idx.append([i, j])
    # Index of the triangles in ref and in with the smallest sum of absolute
    # length differences.
    tr_idx_min = tr_idx[tr_sum.index(min(tr_sum))]
    ref_idx, in_idx = tr_idx_min[0], tr_idx_min[1]
    print("Smallest difference: {}".format(min(tr_sum)))
    return ref_idx, in_idx
set_ref = np.array([[2511.268821,44.864124],
[2374.085032,201.922566],
[1619.282942,216.089335],
[1655.866502,221.127787],
[ 804.171659,2133.549517], ])
set_in = np.array([[1992.438563,63.727282],
[2285.793346,255.402548],
[1568.915358, 279.144544],
[1509.720134, 289.434629],
[1914.255205, 349.477788],
[2370.786382, 496.026836],
[ 482.702882, 508.685952],
[2089.691026, 523.18825 ],
[ 216.827439, 561.807396],
[ 614.874621, 2007.304727],
[1286.639124, 2155.264827],
[ 729.566116, 2190.982364]])
# All possible triangles.
ref_combs = list(itertools.combinations(range(len(set_ref)), 3))
in_combs = list(itertools.combinations(range(len(set_in)), 3))
# Obtain normalized triangles.
ref_triang, in_triang = getTriangles(set_ref, ref_combs), getTriangles(set_in, in_combs)
# Index of the ref and in triangles with the smallest difference.
ref_idx, in_idx = sumTriangles(ref_triang, in_triang)
# Indexes of points in ref and in of the best match triangles.
ref_idx_pts, in_idx_pts = ref_combs[ref_idx], in_combs[in_idx]
print ('triangle ref %s matches triangle in %s' % (ref_idx_pts, in_idx_pts))
print ("ref:", [set_ref[_] for _ in ref_idx_pts])
print ("input:", [set_in[_] for _ in in_idx_pts])
ref_pts = np.array([set_ref[_] for _ in ref_idx_pts])
in_pts = np.array([set_in[_] for _ in in_idx_pts])
transf, (in_list,ref_list) = aa.find_transform(in_pts, ref_pts)
transf_in = transf(set_in)
print(f'transformation matrix: {transf}')
plt.scatter(set_ref[:,0],set_ref[:,1], s=100,marker='.', c='r',label='Reference')
plt.scatter(set_in[:,0],set_in[:,1], s=100,marker='.', c='b',label='Input')
plt.scatter(transf_in[:,0],transf_in[:,1], s=100,marker='+', c='b',label='Input Aligned')
plt.plot(ref_pts[:,0],ref_pts[:,1], c='r')
plt.plot(in_pts[:,0],in_pts[:,1], c='b')
plt.legend()
plt.tight_layout()
plt.savefig( 'align_coordinates.png', format = 'png')
plt.show()

Estimating the p value of the difference between two proportions using statsmodels and PyMC3 (MCMC simulation) in Python

In Probabilistic-Programming-and-Bayesian-Methods-for-Hackers, a method is proposed to compute the p value that two proportions are different.
(You can find the jupyter notebook here containing the entire chapter
http://nbviewer.jupyter.org/github/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers/blob/master/Chapter2_MorePyMC/Ch2_MorePyMC_PyMC2.ipynb)
The code is the following:
import pymc3 as pm
figsize(12, 4)
#these two quantities are unknown to us.
true_p_A = 0.05
true_p_B = 0.04
N_A = 1700
N_B = 1700
#generate some observations
observations_A = bernoulli.rvs(true_p_A, size=N_A)
observations_B = bernoulli.rvs(true_p_B, size=N_B)
print(np.mean(observations_A))
print(np.mean(observations_B))
0.04058823529411765
0.03411764705882353
# Set up the pymc3 model. Again assume Uniform priors for p_A and p_B.
with pm.Model() as model:
    p_A = pm.Uniform("p_A", 0, 1)
    p_B = pm.Uniform("p_B", 0, 1)

    # Define the deterministic delta function. This is our unknown of interest.
    delta = pm.Deterministic("delta", p_A - p_B)

    # Set of observations, in this case we have two observation datasets.
    obs_A = pm.Bernoulli("obs_A", p_A, observed=observations_A)
    obs_B = pm.Bernoulli("obs_B", p_B, observed=observations_B)

    # To be explained in chapter 3.
    step = pm.Metropolis()
    trace = pm.sample(20000, step=step)
burned_trace=trace[1000:]
p_A_samples = burned_trace["p_A"]
p_B_samples = burned_trace["p_B"]
delta_samples = burned_trace["delta"]
# Count the number of samples less than 0, i.e. the area under the curve
# before 0, represent the probability that site A is worse than site B.
print("Probability site A is WORSE than site B: %.3f" % \
np.mean(delta_samples < 0))
print("Probability site A is BETTER than site B: %.3f" % \
np.mean(delta_samples > 0))
Probability site A is WORSE than site B: 0.167
Probability site A is BETTER than site B: 0.833
However, if we compute the p value using statsmodels, we get a very different result:
from scipy.stats import norm, chi2_contingency
import statsmodels.api as sm
s1 = int(1700 * 0.04058823529411765)
n1 = 1700
s2 = int(1700 * 0.03411764705882353)
n2 = 1700
p1 = s1/n1
p2 = s2/n2
p = (s1 + s2)/(n1+n2)
z = (p2-p1)/ ((p*(1-p)*((1/n1)+(1/n2)))**0.5)
z1, p_value1 = sm.stats.proportions_ztest([s1, s2], [n1, n2])
print('z1 is {0} and p is {1}'.format(z1, p))
z1 is 0.9948492584166934 and p is 0.03735294117647059
With MCMC, the p value seems to be 0.167, but using statsmodels, we get a p value 0.037.
How can I understand this?
Looks like you printed the wrong value. Try this instead:
print('z1 is {0} and p is {1}'.format(z1, p_value1))
Also, if you want to test the hypothesis p_A > p_B then you should set the alternative parameter in the function call to larger like so:
z1, p_value1 = sm.stats.proportions_ztest([s1, s2], [n1, n2], alternative='larger')
The docs have more examples on how to use it.
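As an aside (not part of the original answer): with flat priors and samples of this size, the one-sided p-value from this test should land close to the MCMC posterior probability that site A is worse than site B, which is a useful sanity check that the two approaches are not actually contradicting each other:
z1, p_one_sided = sm.stats.proportions_ztest([s1, s2], [n1, n2], alternative='larger')
# H1 here is p_A > p_B; the p-value should be roughly comparable to the
# MCMC estimate of P(site A is WORSE than site B), about 0.167 in the book example.
print('one-sided p-value:', p_one_sided)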

Splitting integrated probability density into two spatial regions

I have some probability density function:
import numpy as np

T = 10000
tmin = 0
tmax = 10**20
t = np.linspace(tmin, tmax, T)
time = np.asarray(t)  # this line may be redundant

timedep_PD = np.empty((T, len(x)))  # one spatial profile per time step
for j in range(T):
    timedep_PD[j] = probdensity_func(x, time[j], initial_state)
I want to integrate it over two distinct regions of x. I tried the following to split the timedep_PD array into two spatial regions and then proceeded to integrate:
step = abs(xmin - xmax) / T
l1 = int(np.floor((abs(ab - xmin) * T) / abs(xmin - xmax)))
l2 = int(np.floor((abs(bd - ab) * T) / abs(xmin - xmax)))

# For spatial region 1
Pd1 = np.empty([T, l1])
R1 = x[:l1]
for i in range(T):
    Pd1[i] = Pd[i][:l1]

# For spatial region 2
Pd2 = np.empty([T, l2])
R2 = x[l1:l1 + l2]
for i in range(T):
    Pd2[i] = Pd[i][l1:l1 + l2]

# Integrating over each spatial region
P = np.empty([2, T])
for i in range(T):
    P[0][i] = np.trapz(Pd1[i], R1)
    P[1][i] = np.trapz(Pd2[i], R2)
Is there an easier/more clear way to go about splitting up a probability density function into two spatial regions and then integrating within each spatial region at each time-step?
The loops can be eliminated by using vectorized operations instead. It's not clear whether Pd is a 2D NumPy array; if it's something else (e.g., a list of lists), it should be converted to a 2D NumPy array with np.array(...). After that you can do this:
Pd1 = Pd[:, :l1]
Pd2 = Pd[:, l1:l1+l2]
No need to loop over the time index; the slicing happens for all times at once (having : in place of an index means "all valid indices").
Similarly, np.trapz can integrate all time slices at once:
P1 = np.trapz(Pd1, R1, axis=1)
P2 = np.trapz(Pd2, R2, axis=1)
Each P1 and P2 is now a time series of integrals. The axis parameter determines along which axis Pd1 gets integrated - it's the second axis, i.e., space.
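Putting the two pieces together, a small self-contained sketch (the array names mirror the question; the shapes and the random densities are just placeholders):
import numpy as np

T, nx = 100, 500                 # number of time steps and spatial samples
x = np.linspace(0.0, 1.0, nx)
Pd = np.random.rand(T, nx)       # stand-in for the 2D array of densities
l1, l2 = 200, 300                # sizes of the two spatial regions

R1, R2 = x[:l1], x[l1:l1 + l2]
Pd1, Pd2 = Pd[:, :l1], Pd[:, l1:l1 + l2]

# One integral per time step for each region (integrate along the spatial axis)
P1 = np.trapz(Pd1, R1, axis=1)
P2 = np.trapz(Pd2, R2, axis=1)
print(P1.shape, P2.shape)        # (100,) and (100,)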
