Evaluate parameter in symfit model - python

I am using symfit to fit two NMR data sets simultaneously. I defined a cut-off Gaussian distribution on one of the parameters. I have summarized my problem in the following simplified example:
import symfit as sf
from symfit import parameters, variables, Fit, Model, Ge
from symfit.core.minimizers import BFGS, BasinHopping, NelderMead, DifferentialEvolution
from symfit import GradientModel, CallableModel
xd= [1.1, 3, 5, 7, 9, 11, 14, 19, 25, 32, 44]
yd= [5.5, 8, 11, 14, 18, 22, 28, 35,45, 69, 110]
pi=3.14
x, y = variables('x, y')
a = sf.Parameter('a',value=3)
b = sf.Parameter('b',value=0.7)
sigma= sf.Parameter('sigma',value=0.7)
res = 0
norm = 0
for i in range(1, 5):
    atemp = a + (i - 1) * 3 * sigma / 2
    if atemp < 0:   # atemp is a symbolic expression; this comparison raises the TypeError below
        atemp = 0
    gauss = sf.exp(-(atemp - a)**2 / (2 * sigma**2)) / sf.sqrt(2 * pi * sigma**2)
    res = res + gauss * (atemp * x + b)
    norm = norm + gauss
    if i == 4:
        firstres = res
        firstnorm = norm
        res = 0
        norm = 0
funfit = CallableModel({y: firstres / firstnorm})
fit = Fit(funfit, x=xd, y=yd, minimizer=[NelderMead, BFGS])
fit_result = fit.execute()
print("Best-Fit Parameters: ", fit_result)
I need to force "atemp" to be non-negative because negative values have no physical meaning. I know that "atemp" is a symbolic expression built from parameters, but I need to get its value. I tried atemp.evalf() and it does not work; I get this error:
"TypeError: cannot determine truth value of Relational"

Related

Plotting/calculating prediction intervals for weighted least squares (nonlinear model) so that the intervals narrow with time (Python or equations)

I have the following data:
x = np.array([0, 0, 0, 0, 0, 0, 1, 3, 3, 5, 5, 5, 5, 7, 7, 14, 14, 15, 15, 15, 15, 25, 25, 25, 25, 25, 35, 35, 40, 40, 45, 45, 45, 45, 45, 45])
y = np.array([87.9, 91.3, 94.1, 173.9, 87.7, 117.8, 52.4, 46.5, 73.7, 63.3, 50.6, 56.8, 47.5, 30.3, 59.2, 38.7, 12.2, 25.7, 23.5, 37.3, 16.6, 25, 19.7, 27.2, 27.3, 11.1, 1.1, 0.1, 0.9, 0.1, 0.3, 0.5, 0.4, 1.2, 0.6, 1])
and I would like to perform weighted least squares optimization for the following model (as I have different equations for different data, I cannot simply use a log transformation to convert to linear regression):
# defining a model
def model(x, slope):
    return 100 * np.exp(-slope * x)

# fit the parameters, weighting each data point by its inverse value: 1/y^K (where K = 1.2)
params, pcov = curve_fit(model, x, y, sigma=1/(y**1.2), absolute_sigma=False)
But I have no idea how to get the 95% prediction intervals as in the figure below, i.e. 95% PIs that are wide at the beginning (from 41.5 to 158.6 at x = 0) and get narrower with time (e.g. from -5 to 18 at x = 30):
[Figure: prediction intervals narrowing with time]
I have tried calculating standard errors, MSE and the critical t-value, and using the relationship between confidence intervals and prediction intervals, but it probably doesn't work for a weighted fit:
#find T critical value (two-tailed inverse of the Student's t-distribution)
t_crit = scipy.stats.t.ppf(q=1-.05/2,df=75)
SE_CI = np.sqrt(np.diag(pcov))
MSE = np.mean((y-model(x, *params))**2)
#for some modelled data
x_pred = np.arange(50)
y_pred = 100 * np.exp(-params[0] * x_pred)
y_upper_CI = y_pred+t_crit*SE_CI
y_lower_CI = y_pred-t_crit*SE_CI
y_upper_PI = y_pred + np.sqrt((SE_CI)**2+MSE)*t_crit
y_lower_PI = y_pred - np.sqrt((SE_CI)**2+MSE)*t_crit
I have also found out that I might try:
define G|x, which is the gradient of the parameters at a particular value of X, using all the best-fit values of the parameters. The result is a vector, with one element per parameter. For each parameter, it is defined as dY/dP, where Y is the Y value of the curve given the particular value of X and all the best-fit parameter values, and P is one of the parameters.
Cov is the covariance matrix (inverted Hessian from last iteration). It is a square matrix with the number of rows and columns equal to the number of parameters.
Now compute c = G|x * Cov * G'|x. The result is a single number for any value of X.
The prediction bands extend a further distance above and below the curve, equal to:
sqrt(c+1)*sqrt(SS/DF)*CriticalT(Confidence%, DF)
But I do not know how to implement it in Python (namely how to get G|x, how to compute c = G|x * Cov * G'|x, and where to take the sum of squares SS from)...
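For reference, here is a minimal numerical sketch of that recipe, assuming the model, params and pcov from the curve_fit call above. The forward-difference gradient and the use of the plain (unweighted) residual sum of squares for SS are my own choices, not part of the quoted text; for a weighted fit the SS should be weighted accordingly:

import numpy as np
import scipy.stats

def prediction_band(x_pred, params, pcov, x, y, conf=0.95):
    resid = y - model(x, *params)
    dof = len(y) - len(params)            # DF
    ss = np.sum(resid**2)                 # SS (unweighted; adapt for a weighted fit)
    t_crit = scipy.stats.t.ppf(1 - (1 - conf) / 2, df=dof)

    # G|x: numerical gradient dY/dP at each x_pred (one column per parameter)
    eps = 1e-8
    cols = []
    for j in range(len(params)):
        p_step = np.array(params, dtype=float)
        p_step[j] += eps
        cols.append((model(x_pred, *p_step) - model(x_pred, *params)) / eps)
    G = np.column_stack(cols)

    # c = G|x * Cov * G'|x, one number per value of x_pred
    c = np.einsum('ij,jk,ik->i', G, pcov, G)

    delta = np.sqrt(c + 1) * np.sqrt(ss / dof) * t_crit
    y_hat = model(x_pred, *params)
    return y_hat - delta, y_hat + delta

lower_PI, upper_PI = prediction_band(np.arange(50), params, pcov, x, y)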
Thank you in advance for your help!

Using curve_fit to a function defined by indefinite integral in Python

I'm trying to write code to fit two curves with five parameters to real data. They are shown here:
The first curve only depends on a, b and gamma. So I decided to use curve_fit once for these three (which works) and then use it again on the second curve to adjust the last two, alpha and k_0.
The problem is that the second curve is defined by an indefinite integral and I can't code it properly.
I have tried treating x as a symbol and integrating with sym.integrate, and also integrating numerically with quad. Neither worked. In the second case, I get "ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()" in the "mortes" function.
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import scipy.integrate as integrate
import numpy as np
import sympy as sym
#Experimental x and y data points
#Dados de SP
xData = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34])
ycasos = np.array([2, 13, 65, 459, 1406, 4466, 8419, 13894, 20004, 31174, 44411, 61183, 80558, 107142, 140549, 172875, 215793, 265581, 312530, 366890, 412027, 479481, 552318, 621731, 697530, 749244, 801422, 853085, 890690, 931673, 970888, 1003429, 1034816, 1062634, 1089255])
ymortes = np.array([0, 0, 15, 84, 260, 560, 991, 1667, 2586, 3608, 4688, 6045, 7532, 9058, 10581, 12494, 14263, 15996, 17702, 19647, 21517, 23236, 25016, 26780, 28392, 29944, 31313, 32567, 33927, 35063, 36136, 37223, 37992, 38726, 39311])
#Dados do Brasil
#xData = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45])
#ycasos = np.array([2,9,121,1128,3912,10298,20818,36739,58973,96559,155939,233142, 347398, 498440, 672846, 850514, 1067579, 1313667, 1577004, 1839850, 2074860, 2394513, 2707877, 3012412, 3317096, 3582362, 3846153, 4123000, 4315687, 4528240, 4717991, 4906833, 5082637, 5224362, 5380635, 5535605, 5653561, 5848959, 6052786, 6290272, 6577177, 6880127, 7213155, 7465806, 7716405, 8013708])
#ymortes = np.array([])
u0 = ycasos[0]
v0 = ymortes[0]
#u(t)
def casos(x, a, b, gama):
    return u0 * (a ** (1/gama)) * np.exp(a*x) * ((a + b * (u0 ** gama) * (np.exp(a*gama*x) - 1)) ** (-1/gama))
#Plot experimental data points
plt.plot(xData, ycasos, 'bo', label='reais')
# Initial guess for the parameters
#initialGuess = [3.0,1.5,0.05]
#Primeiro fit
copt, ccov = curve_fit(casos, xData, ycasos,bounds=(0, [1., 1., np.inf]),maxfev=100000)
a_opt = copt[0]
b_opt = copt[1]
gama_opt = copt[2]
print('Primeira etapa \n')
print('Parametros encontrados: a=%.9f, b=%.9f,gama=%.9f \n' % tuple(copt))
def integrand(t, alpha):
    return np.exp((a_opt - alpha)*t) * ((a_opt + b_opt * (u0 ** gama_opt) * (np.exp(a_opt*gama_opt*t) - 1)) ** (-1/gama_opt))
def mortes(x, k0, alpha):
    return u0 * (a_opt ** (1/gama_opt)) * k0 * integrate.quad(integrand, 0, x, args=(alpha,)) + v0
#Segundo fit
mopt, mcov = curve_fit(mortes, xData, ymortes, bounds=(0, [np.inf, a_opt]), maxfev=100000)
print('Segunda etapa \n')
print('Parametros encontrados: k0=%.9f, alpha=%.9f \n' % tuple(mopt))
#x values for the fitted function
xFit = np.arange(0.0, float(len(xData)), 0.01)
#Plot the fitted function
plt.plot(xFit, casos(xFit, *copt), 'r', label='estimados')
plt.xlabel('t')
plt.ylabel('casos')
plt.legend()
plt.show()
The upper bound of the integral (integrate.quad) has to be a float, not an array like the x argument of mortes(). This way it should work:
def mortes(x, k0, alpha):
    integralRes = []
    for upBound in x:
        integralRes.append(integrate.quad(integrand, 0, upBound, args=(alpha,))[0])
    return u0 * (a_opt ** (1/gama_opt)) * k0 * np.array(integralRes) + v0
P.S. Elegant edits to the code style are more than welcome (e.g. a way to pass an array as the upper or lower bound of integrate.quad).
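On that note, one way to accept an array upper bound is to wrap the scalar quad call with np.vectorize. This is a sketch assuming integrand, u0, v0, a_opt and gama_opt from the question above; np.vectorize is a convenience loop, not a speed-up:

import numpy as np
import scipy.integrate as integrate

# vectorize over the upper bound so mortes() can take an array x
quad_on_array = np.vectorize(
    lambda upBound, alpha: integrate.quad(integrand, 0, upBound, args=(alpha,))[0]
)

def mortes(x, k0, alpha):
    return u0 * (a_opt ** (1/gama_opt)) * k0 * quad_on_array(x, alpha) + v0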

Gaussian fit to histogram on python seems off. What could I change to improve the fit?

I have created a Gaussian fit to data plotted as a bar chart. However, the fit does not look right, and I don't know what to change to improve the fit. My code is as follows:
import matplotlib.pyplot as plt
import math
import numpy as np
from collections import Counter
import collections
from scipy.optimize import curve_fit
from scipy.stats import norm
from scipy import stats
import matplotlib.mlab as mlab
k_list = [-40, -32, -30, -28, -26, -24, -22, -20, -18, -16, -14, -12, -10, -8, -6, -4, -3, -2, 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34]
v_list = [1, 2, 11, 18, 65, 122, 291, 584, 1113, 2021, 3335, 5198, 7407, 10043, 12552, 14949, 1, 16599, 16770, 16728, 14772, 12475, 9932, 7186, 4987, 3286, 1950, 1080, 546, 285, 130, 54, 18, 11, 2, 2]
def func(x, A, beta, B, mu, sigma):
    return (A * np.exp(-x/beta) + B * np.exp(-100.0 * (x - mu)**2 / (2 * sigma**2)))  # exponential + normal distribution
popt, pcov = curve_fit(func, xdata=k_list, ydata=v_list, p0=[10000, 5, 10000, 10, 10])
print(popt)
x = np.linspace(-50, 50, 1000)
plt.bar(k_list, v_list, label='myPLOT', color = 'b', width = 0.75)
plt.plot(x, func(x, *popt), color='darkorange', linewidth=2.5, label=r'Fitted function')
plt.xlim((-30, 45))
plt.legend()
plt.show()
The plot I obtain is as follows:
How can I adjust my fit?
You have a significant outlier here, possibly caused by a typo: (k, v) == (-3, 1) at index 16 in the data.
The representation of the data as a bar chart is not optimal here. The issue would be clearly visible if you showed the data in the same format as you show the fit, e.g. as points or a line rather than bars.
The outlier forces the peak down. Here is the fit if we remove the outlier manually:
You can remove the outlier automatically by checking its individual residual against the RMSE of the entire fit:
k_list = np.asarray(k_list)   # boolean masking below requires arrays, not lists
v_list = np.asarray(v_list)
popt, pcov = curve_fit(func, xdata=k_list, ydata=v_list, p0=[10000, 5, 10000, 10, 10])
resid = np.abs(func(k_list, *popt) - v_list)
rmse = np.std(resid)
keep = resid < 3 * rmse
if keep.sum() < keep.size:
    popt, pcov = curve_fit(func, xdata=k_list[keep], ydata=v_list[keep], p0=popt)
Or even a repeated application:
popt = [10000, 5, 10000, 10, 10]
while True:
    popt, pcov = curve_fit(func, xdata=k_list, ydata=v_list, p0=popt)
    resid = np.abs(func(k_list, *popt) - v_list)
    rmse = np.std(resid)
    keep = resid < 5 * rmse
    if keep.sum() == keep.size:
        break
    k_list = k_list[keep]
    v_list = v_list[keep]
A 3-sigma threshold will trim everything off your data after a couple of iterations, so I used 5-sigma. Keep in mind that this is a very quick and dirty way to denoise data. It's basically still manual, since you have to re-check the data to make sure that your choice of factor was correct.
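An alternative to deleting points is a robust loss that merely down-weights them. Here is a sketch using scipy.optimize.least_squares and the func defined in the question; soft_l1 and the f_scale value are my choices, not part of the original answer, and should be tuned to the data:

import numpy as np
from scipy.optimize import least_squares

k_arr = np.asarray(k_list, dtype=float)
v_arr = np.asarray(v_list, dtype=float)

# least_squares takes a residual function; the robust loss tempers large residuals
res = least_squares(lambda p: func(k_arr, *p) - v_arr,
                    x0=[10000, 5, 10000, 10, 10],
                    loss='soft_l1', f_scale=100.0)
popt_robust = res.x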

Piecewise regression in Python

Hi, I'm trying to figure out how to fit these values with a piecewise linear function. I have read this question but I can't get any further (How to apply piecewise linear fit in Python?). That example shows how to implement a piecewise function for the two-segment case, but I need to do it for a three-segment case as in the figure.
I have written this code:
from scipy import optimize
import matplotlib.pyplot as plt
import numpy as np
x1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ,11, 12, 13, 14, 15,16,17,18,19,20,21], dtype=float)
y1 = np.array([5, 7, 9, 11, 13, 15, 28.92, 42.81, 56.7, 70.59, 84.47, 98.36, 112.25, 126.14, 140.03,145,147,149,151,153,155])
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ,11, 12, 13, 14, 15], dtype=float)
y = np.array([5, 7, 9, 11, 13, 15, 28.92, 42.81, 56.7, 70.59, 84.47, 98.36, 112.25, 126.14, 140.03])
def piecewise(x, x0, x1, y0, y1, k0, k1, k2):
    return np.piecewise(x, [x <= x0, x >= x1],
                        [lambda x: k0*x + y0 - k0*x0,
                         lambda x: k1*(x - (x1 + x0)) - y1,
                         lambda x: k2*x + y1 - k2*x1])
p, e = optimize.curve_fit(piecewise, x1, y1)
xd = np.linspace(0, 15, 100)
plt.figure()
plt.plot(x1, y1, "o")
plt.plot(xd, piecewise(xd, *p))
but this is the output
Any suggestions? I believe the problem is in the np.piecewise call, in particular in the second lambda.
EDIT 1:
If I apply the solution provided by A.L. to different data, I don't get good results, with
x=[ 16.01690476, 16.13801587, 14.63628571, 15.32664399,
15.8145 , 15.71507143, 15.56107143, 15.553 ,
15.08734524, 14.97275 , 15.51958333, 16.61981859,
16.36589286, 14.78708333, 14.41565476, 13.47763158,
13.42412281, 12.95551378, 13.66601504, 13.63315789,
13.21463659, 13.53464286, 14.60130952, 14.7774881 ,
13.04319048, 12.53385965, 12.65745614, 13.90535714,
14.82412281, 14.6565 , 15.09541667, 13.41434524,
13.66033333, 14.57964286, 13.55416667, 13.43041667,
13.01137566, 12.76429825, 11.55241667, 11.0634881 ,
10.92729762, 11.21625 , 10.72092857, 11.80380952,
12.55233333, 12.11307143, 11.78892857, 12.45458333,
11.05539286, 10.69214286, 10.32566667, 11.3439881 ,
9.69563492, 10.72535714, 10.26180272, 7.77272727,
6.37704082, 8.49666667, 8.5389881 , 5.68547619,
7.00616667, 8.22015873, 10.20315476, 15.35736842,
12.25158333, 11.09622153, 10.4118254 , 9.8602381 ,
10.16727273, 15.10858333, 13.82215539, 12.44719298,
10.92341667, 11.44565476, 11.43333333, 10.5045 ,
11.14357143, 10.37625 , 8.93421769, 9.48444444,
10.43483333, 10.8659881 , 10.96166667, 10.12872619,
9.64663265, 9.29979762, 9.67173469, 8.978322 ,
9.10419501, 9.45411565, 10.46411565, 7.95739229,
8.72616667, 7.03892857, 7.32547619, 7.56441667,
6.61022676, 9.09014739, 10.78141667, 10.85918367,
11.11665476, 10.141 , 9.17760771, 8.27968254,
11.02625 , 12.34809524, 11.17807018, 11.25416667,
11.29236905, 9.28357143, 9.77033333, 11.52086168,
9.8625 , 12.60281955, 12.42785714, 12.11902256,
13.1 , 13.02791667, 13.87779449, 15.09857143,
13.93935185, 13.69821429, 13.39880952, 12.45692982,
12.76921053, 13.23708333, 13.71666667, 15.39807143,
15.27916667, 14.66464286, 13.38694444, 10.97555556,
10.02191667, 11.99608333, 14.26325 , 15.40991667,
15.12908333, 15.76265476, 12.12763158, 15.01641667,
14.39602381, 12.98532143, 14.98807018, 18.30547619,
16.7564966 , 16.82982143, 19.8487013 , 19.18600907]
and
y=[ 2.36846863, 2.73722628, 2.77177583, 2.63930636, 2.80864749,
2.57066667, 2.65277287, 2.57162347, 2.76295667, 2.79835391,
2.60431154, 2.17326401, 2.67740698, 2.47138153, 2.49882574,
2.60987338, 2.69935565, 2.60755362, 2.77702029, 2.62996942,
2.45959517, 2.52750434, 2.73833005, 2.52009 , 2.80933226,
1.63807085, 2.49230099, 2.55441614, 3.19256506, 2.52609288,
1.02931596, 2.40266963, 2.3306463 , 2.69094276, 2.60779985,
2.48351648, 2.45131766, 2.40526763, 2.03952569, 1.86217009,
1.79971848, 1.91772218, 1.85895421, 2.32725731, 2.28189713,
2.11835833, 2.09636517, 2.2230303 , 1.85863317, 1.77550406,
1.68862391, 1.79187765, 1.70887476, 1.81911193, 1.74802483,
1.65776432, 1.58012849, 1.67781494, 1.62451541, 1.60555884,
1.56172214, 1.60083809, 1.65256994, 2.74794704, 2.27089627,
1.80364982, 1.51412482, 1.77738757, 1.56979564, 2.46538633,
2.37679625, 2.40389294, 2.04165763, 1.82086407, 1.90609219,
1.87480978, 1.8877854 , 1.76080074, 1.68369028, 1.57419297,
1.66470126, 1.74522552, 1.72459756, 1.65510503, 1.72131148,
1.6254417 , 1.57091907, 1.68755268, 1.70307911, 1.59445121,
1.74393783, 1.72913779, 1.66883237, 1.59859545, 1.62335831,
1.73378184, 1.62621588, 1.79532164, 1.78289992, 1.79475101,
1.7826266 , 1.68778918, 1.64484127, 1.62332696, 1.75372393,
1.99038021, 1.87268137, 1.86124502, 1.82435911, 1.62927102,
1.66443723, 1.86743516, 1.62745098, 2.20200312, 2.09641026,
2.26649111, 2.63271605, 2.18050721, 2.57138433, 2.51833359,
2.74684184, 2.57209998, 2.63762019, 2.30027877, 2.28471286,
2.40323668, 2.37103313, 2.16414489, 1.01027109, 2.64181007,
2.45467765, 2.05773672, 1.73624917, 2.05233688, 2.70820669,
2.65594222, 2.67445635, 2.37212985, 2.48221803, 2.77655216,
2.62839879, 2.26481307, 2.58005799, 2.1188172 , 2.14017268,
2.16459571, 1.95083406, 1.46224418]
Fitting a piecewise linear function is a nonlinear optimization problem which may have local optima. The result you see is probably one of the local optima where your optimization algorithm gets stuck.
One way to solve this problem is to repeat your optimization algorithm with different initial values and take the best fit. I used the mean absolute error (MAE) to compare the different fits against each other.
perr = np.sum(np.abs(y1-piecewise(x1, *p)))
I also changed your piecewise function because it was a bit confusing to me, but it is still a piecewise function as before.
Also, I think you forgot to extend the x and xd arrays to the value of 21 (that's why the green line ends early).
from scipy import optimize
import matplotlib.pyplot as plt
import numpy as np
def piecewise(x, x0, x1, y0, y1, k0, k1, k2):
    return np.piecewise(x, [x <= x0, np.logical_and(x0 < x, x <= x1), x > x1],
                        [lambda x: k0*x + y0,
                         lambda x: k1*(x - x0) + y1 + k0*x0,
                         lambda x: k2*(x - x1) + y0 + y1 + k0*x0 + k1*(x1 - x0)])
x1 = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ,11, 12, 13, 14, 15,16,17,18,19,20,21], dtype=float)
y1 = np.array([5, 7, 9, 11, 13, 15, 28.92, 42.81, 56.7, 70.59, 84.47, 98.36, 112.25, 126.14, 140.03,145,147,149,151,153,155])
x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ,11, 12, 13, 14, 15,16,17,18,19,20,21], dtype=float)
y = np.array([5, 7, 9, 11, 13, 15, 28.92, 42.81, 56.7, 70.59, 84.47, 98.36, 112.25, 126.14, 140.03,145,147,149,151,153,155])
perr_min = np.inf
p_best = None
for n in range(100):
    k = np.random.rand(7)*20
    p, e = optimize.curve_fit(piecewise, x1, y1, p0=k)
    perr = np.sum(np.abs(y1 - piecewise(x1, *p)))
    if perr < perr_min:
        perr_min = perr
        p_best = p
xd = np.linspace(0, 21, 100)
plt.figure()
plt.plot(x1, y1, "o")
y_out = piecewise(xd, *p_best)
plt.plot(xd, y_out)
plt.show()
this gives me:
with p = [ 6.34259491 15.00000023 2.97272604 7.05498314 2.00751828 13.88881542 1.99960597]
Edit 1
You edited your question, and this is the answer to the edited one.
(Sorry, I am new to Stack Overflow and not sure if I should have posted another answer instead.)
In your second dataset you added noise to the data. In my opinion there are two kinds of noise: Gaussian noise, which places the points close to the underlying piecewise line, and outlier noise, which places points far away from the original underlying line.
Under the hood, the optimization algorithm you use minimizes the following with respect to p:
E = sum(square(y-piecewise(x,p)))
http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html#scipy.optimize.curve_fit
The Gaussian noise is not very problematic. The optimization you use indirectly assumes this Gaussian noise (by minimizing the least-squares error) and fits the line as well as possible. The real problem comes in with the outliers.
The problem is that outliers are far away from the original function. Even if the optimization tries the optimal parameters, the energy function E will not be minimal there, because the outliers lie far from the original function and this distance is even squared, so it shifts the minimum of E far away from the true parameters of your function.
So what's the solution?
Get rid of the outliers.
An automated approach to do that is RANSAC (https://en.wikipedia.org/wiki/RANSAC).
In brief: you choose a random subset of the original data and hope it contains no outliers. You fit your function to the subset and discard the points that lie far away from the fitted function. If enough points survive this step, you take all the surviving points and repeat the fit. The error on this "inlier" set is a measure of the quality of your fit. Then you repeat the whole process and take the best final fit.
I adjusted my script accordingly:
from scipy import optimize
import matplotlib.pyplot as plt
import numpy as np
def piecewise(x, x0, x1, y0, y1, k0, k1, k2):
    return np.piecewise(x, [x <= x0, np.logical_and(x0 < x, x <= x1), x > x1],
                        [lambda x: k0*x + y0,
                         lambda x: k1*(x - x0) + y1 + k0*x0,
                         lambda x: k2*(x - x1) + y0 + y1 + k0*x0 + k1*(x1 - x0)])

x = np.array(x)
y = np.array(y)
x1 = x
y1 = y

perr_min = np.inf
p_best = None
for n in range(100):
    # random minimal sample; hopefully outlier-free
    idx = np.random.choice(np.arange(len(x)), 10, replace=False)
    x_sample = x[idx]
    y_sample = y[idx]
    k = np.random.rand(7)*20
    try:
        p, e = optimize.curve_fit(piecewise, x_sample, y_sample, p0=k)
        each_error = np.abs(y - piecewise(x, *p))
        x_inlier = x[each_error < 1]
        y_inlier = y[each_error < 1]
        if x_inlier.shape[0] < 0.8 * x.shape[0]:
            continue
        p_inlier, e_inlier = optimize.curve_fit(piecewise, x_inlier, y_inlier, p0=p)
        perr = np.sum(np.abs(y - piecewise(x, *p_inlier)))
        if perr < perr_min:
            perr_min = perr
            p_best = p_inlier
    except RuntimeError:
        pass

xd = np.linspace(0, 21, 100)
plt.figure()
plt.plot(x, y, "o")
y_out = piecewise(xd, *p_best)
plt.plot(xd, y_out)
print(p_best)
plt.show()
With 100 repetitions I get the following result:
The piecewise-regression python library can fit models with different numbers of breakpoints.
First of all, for demonstration purposes generate some data with 2 breakpoints:
import numpy as np
gradients = [2.5,12,2]
constant = 0
breakpoints = [6, 15]
n_points = 100
np.random.seed(1)
xx = np.linspace(0, 25, n_points)
yy = constant + gradients[0]*xx + np.random.normal(size=n_points)*10
for bp_n in range(len(breakpoints)):
    yy += (gradients[bp_n+1] - gradients[bp_n]) * np.maximum(xx - breakpoints[bp_n], 0)
To fit and plot the model:
import piecewise_regression
import matplotlib.pyplot as plt
pw_fit = piecewise_regression.Fit(xx, yy, n_breakpoints=2)
pw_fit.plot()
plt.xlabel("x")
plt.ylabel("y")
plt.show()
It also gives you a statistical analysis:
pw_fit.summary()
It won't work well with the data you provided in your edit, because there are outliers that dominate the error cost function. This will be an issue whichever method you use to fit the data; you need to decide how to handle the outliers in this instance.
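One minimal way to act on that advice, reusing the piecewise function and the best random-restart fit p_best from the earlier answer (a sketch; the 3-sigma threshold is an arbitrary choice, not part of this answer):

import numpy as np
from scipy import optimize

# trim points with unusually large residuals, then refit
resid = np.abs(y - piecewise(x, *p_best))
keep = resid < 3 * np.std(resid)
p_trim, _ = optimize.curve_fit(piecewise, x[keep], y[keep], p0=p_best)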

Pass an array in python odeint

I am quite new to Python, so do excuse me if the following question has a 'duh' answer.
So, I'm trying to solve an ODE using odeint and wish to pass an array. But the TypeError: can't multiply sequence by non-int of type 'float' keeps cropping up in this line:
CA0 = (-kd-kn*Cv)*CAi/(1+(CAi/ks))
So, the code is:
from scipy.integrate import odeint
import numpy as np
Ap_data = [2, 7, 91, 1.6, 0.4, 5]
tdata = [0, 1, 4, 5, 4, 20]
Cv_data = [43, 580, 250, 34, 30, 3]

# Define parameters
kn = 1E-5  # change
ks = 1E+5  # change
kd = 0.058

def deriv(CAi, t, Cv):
    CA0 = (-kd - kn*Cv) * CAi / (1 + (CAi/ks))
    return CA0

# Initial conditions
CA_init = 21.6

# Solve the ODE
CAb_soln = odeint(deriv, CA_init, tdata, (Cv_data,))
print(CAb_soln)
Some help, please?
Your immediate problem is that your deriv function is trying to multiply the ordinary Python list Cv_data (passed in as Cv) by float values. If you want to vectorize this operation, use NumPy arrays:
Ap_data = np.array([2, 7, 91, 1.6, 0.4, 5])
tdata= np.array([0, 1, 4, 5, 4, 20])
Cv_data = np.array([43, 580, 250, 34, 30, 3])
to solve this. You now have the problem that odeint fails for the input you give it...
intdy-- t (=r1) illegal
in above message, r1 = 0.4000000000000D+01
t not in interval tcur - hu (= r1) to tcur (=r2)
in above, r1 = 0.4287484688360D+01 r2 = 0.5551311182627D+01
lsoda-- trouble from intdy. itask = i1, tout = r1ls
in above message, i1 = 1
in above message, r1 = 0.4000000000000D+01
Illegal input detected (internal error).
Run with full_output = 1 to get quantitative information.
[[ 21.6 ]
[ 20.37432613]
[ 17.09897165]
[ 16.12866355]
[ 16.12866355]
[ -0.90614016]]
Perhaps you can give more information about what your equation is and how it relates to Cv_data. In particular, your derivative doesn't depend on t, but you have a range of values for this parameter, Cv.
UPDATE: It fails because of your funny time series. odeint works properly if it is monotonic, for example:
from scipy.integrate import odeint
import numpy as np
Ap_data = [2, 7, 91, 1.6, 0.4, 5]
tdata= np.array([0, 1, 4, 5, 10, 20])
Cv_data = np.array([43, 580, 250, 34, 30, 3])
#Define parameters
kn = 1E-5 #change
ks = 1E+5 #change
kd = 0.058
def deriv(CAi, t, Cv):
    CA0 = (-kd - kn*Cv) * CAi / (1 + (CAi/ks))
    return CA0
#Initial conditions
CA_init = 21.6
#Solve the ODE
CAb_soln = odeint(deriv, CA_init, tdata, (Cv_data,))
print(CAb_soln)
The result:
[[ 21.6 ]
[ 20.37432613]
[ 17.09897165]
[ 16.12866355]
[ 12.04306424]
[ 6.71431758]]
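As an aside, if Cv is actually meant to vary in time (an assumption about the intent; the answer only notes that the derivative doesn't depend on t), a common pattern is to interpolate the tabulated values inside the derivative rather than passing the whole array:

from scipy.integrate import odeint
import numpy as np

tdata = np.array([0, 1, 4, 5, 10, 20], dtype=float)
Cv_data = np.array([43, 580, 250, 34, 30, 3], dtype=float)
kn, ks, kd = 1e-5, 1e5, 0.058

def deriv(CAi, t):
    Cv_t = np.interp(t, tdata, Cv_data)   # piecewise-linear Cv(t)
    return (-kd - kn * Cv_t) * CAi / (1 + CAi / ks)

CAb_soln = odeint(deriv, 21.6, tdata)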
Well, as it turns out I cannot post an image yet (being new to Stack Overflow). So, the code that I used was:
from scipy.integrate import odeint
import numpy as np
Ap_data = np.array([2, 7, 91, 1.6, 0.4, 5])
tdata= [0, 1, 4, 5, 4, 20]
Cv_data = np.array([43, 580, 250, 34, 30, 3])
#Define parameters
kn = 1E-5 #change
ks = 1E+5 #change
kd = 0.058
def deriv(CAi, t, Cv):
    CA0 = (-kd - kn*Cv) * CAi / (1 + (CAi/ks))
    return CA0
#Initial conditions
CA_init = 21.6
#Solve the ODE
CAb_soln = odeint(deriv, CA_init, tdata, (Cv_data,), full_output=True)
print(CAb_soln)
