Using ODE to plot particle-motion with scipy.integrate.solve_ivp - python

My Problem:
A positively charged particle (mass = 2 * 10-27 kg) is moving along the x-axis. It is travelling in a homogenous magnetic field such that the field axis in z-direction. The energy of the particle is 2 MeV and B = 4 T. Use a ODE solver to plot the motion of the particle for 1 microseconds.
My attempt of solving the problem
Note that I have used question marks where I am unsure.
import numpy as np
from scipy.integrate import solve_ivp
initialZ = [?, ?, ?, ?, ?, ?] # = [positionX, positionY, positionZ, velocityX, velocityY, velocityZ]
t0 = 0
tf = 1*(10**-6) # 1 microsecond = 1*10^6 seconds
times = (t0, tf)
def ivf(t, Z):
x, y, z = Z[0], Z[1], Z[2]
u, v, w = Z[3], Z[4], Z[5]
return np.array([u, v, w, ?, ?, ?])
s = solve_ivp(ivf, times, initialZ)
My question
what should the question marks (?) in the code be replaced with?
I have tried to solve the ODE as an initial value problem. I tried to determine inital velocity by equating the lorentz force and centripetal force. I am very new to differential equations and it is therefore difficult to know If I am doing things in the correct way. (note that the first three values of my initialZ vector represents positions x, y and z, and the last three values represent velocity in the x, y and z direction). I am grateful for any help or guidance.

Let us set up the ode and then the initial conditions. From the magnetic force equations and Newton's second law, a = (q B / m) v * z. I'm using * to represent the cross product and x, y, and z are unit vectors. In your question you didn't give a charge q for the particle, I'm assuming you are doing a proton and just rounded the mass of 1.67 to 2. You will see that we just need 4 equations and 4 initial conditions as this problem is 2D in nature.
We have that a = (q B / m) (v_y x - v_x y). Note there is no z term! We need to write these 2 second order odes as four first order odes so ode solver can work it out. We know that a = dv/dt.
dv_x / dt = (q B / m) v_y
dv_y / dt = -(q B / m) v_x
dx/dt = v_x
dy/dt = v_y
The initial condition is that the velocity is v_x0 = Sqrt(2 E / m) where E is 2 MeV, v_y0 = 0, and lets just have the particle start at the origin, so x0 = y0 = 0.
import numpy as np
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
# make sure everything is SI
q = 1.602176634E-19 # charge assuming it is a proton
B = 4 # magnetic field magnitude
m = 2E-27 # mass
E = 2 * 1.60218E-13 # kinetic energy in joules
C = q * B / m # constant for convenience
v0x = np.sqrt(2 * E / m)
initialZ = [0, 0, v0x, 0] # = [positionX, positionY, velocityX, velocityY]
def ivf(t, Z) :
# Z[2] is vx, Z[3] is vy
dxdt = Z[2]
dydt = Z[3]
dvxdt = C * Z[3]
dvydt = -C * Z[2]
return [dxdt, dydt, dvxdt, dvydt]
sol = solve_ivp(ivf, [0, 1E-6], initialZ, method='Radau') # 1 microsecond = 1*10^6 seconds
plt.plot(sol.y[0], sol.y[1])
plt.xlabel('x')
plt.ylabel('y')
plt.title('Displacement on point particle')
plt.show()
What I am expecting is circular motion like in a cyclotron. Here is a rough graph I made,
I was able to get it a bit smooth by playing with the integration method. I'm sure you will be able to tinker and play around with different things until you get a satisfactory result.

Related

How to solve first order ODE equations of motion with given init

I have converted an equation to the first order ODE and now i would like to solve the motion equations for multiple periods with given conditions.
The equations shall be solved with the following 0 values:
x(0), y(0), vx(0), vy(0) = 3, 1, 2, 1.3 × π
This is the first order ODE, derived from motion equation of Newton's gravitational law for cellestial bodies:
How can I solve this in python? Could I use Runge Kutta or Keplers methods? I feel like I'm doing something wrong
import math
import matplotlib.pyplot as plt
import numpy as np
g = 6.674 * 10**(-11)
M = 4*(math.pi**2)/g
x0 = 3 #x-position of the center or h
y0 = 1 #y-position of the center or k
vx0 = 2
vy0 = 1.3* math.pi
#Trying second order
def RK2(y0, f, tlist):
t = [tlist[0]]
tf = tlist[1]
dt = tlist[2]
y = [y0]
while t[-1] < tf:
k1 = dt * f(y[-1],t[-1])
k2 = dt * f(y[-1]+0.5*k1 , t[-1]+0.5*dt) ##Take half step
y.append( y[-1] + k2 )
t.append(t[-1] + dt)
return(np.array(y), np.array(t) )

Python, Four-bar linkage angle-time plot

I'm trying to plot the angle vs. time plot for the output angle of a four-bar linkage (angle fi4 in the image below). This angle is calculated using the solution from the https://scholar.cu.edu.eg/?q=anis/files/week04-mdp206-position_analysis-draft.pdf, page 23.
I'm now trying to plot the fi_4(t) plot and am getting some strange results. The diagram displays the input angle fi2 as blue and output angle fi4 as red. Why is the fi2 fluctuating over time? Shouldn't the fi4 have some sort of sine curve?
Am I missing something here?
Four-bar linkage:
The code:
from __future__ import division
import math
import numpy as np
import matplotlib.pyplot as plt
# Input
#lengths of links (tube testing machine actual lengths)
a = 45.5 #mm
b = 250 #mm
c = 140 #mm
d = 244.244 #mm
# Solution for fi2 being a time function, f(time) = angle
f = 16.7/60 #/s
omega = 2 * np.pi * f #rad/s
t = np.linspace(0, 50, 100)
y = a * np.sin(omega * t)
x = a * np.cos(omega * t)
fi2 = np.arctan(y/x)
# Solution of the vector loop equation
#https://scholar.cu.edu.eg/?q=anis/files/week04-mdp206-position_analysis-draft.pdf
K1 = d/a
K2 = d/c
K3 = (a**2 - b**2 + c**2 + d**2)/(2*a*c)
A = np.cos(fi2) - K1 - K2*np.cos(fi2) + K3
B = -2*np.sin(fi2)
C = K1 - (K2+1)*np.cos(fi2) + K3
fi4_1 = 2*np.arctan((-B+np.sqrt(B**2 - 4*A*C))/(2*A))
fi4_2 = 2*np.arctan((-B-np.sqrt(B**2 - 4*A*C))/(2*A))
# Plot the fi2 time diagram and fi4 time diagram
plt.plot(t, np.degrees(fi2), color = 'blue')
plt.plot(t, np.degrees(fi4_2), color = 'red')
plt.show()
Diagram:
The linespace(0, 50, 100) is too fast. Replacing it with:
t = np.linspace(0, 5, 100)
Second, all the calculations involving the bare np.arctan() are incorrect. You should use np.arctan2(y, x), which determines the correct quadrant (unlike anything based on y/x where the respective signs of x and y are lost). So:
fi2 = np.arctan2(y, x) # not: np.arctan(y/x)
...
fi4_1 = 2 * np.arctan2(-B + np.sqrt(B**2 - 4*A*C), 2*A)
fi4_2 = 2 * np.arctan2(-B - np.sqrt(B**2 - 4*A*C), 2*A)
Putting some labels on your plots and showing both solutions for θ_4:
plt.plot(t, np.degrees(fi2) % 360, color = 'k', label=r'$θ_2$')
plt.plot(t, np.degrees(fi4_1) % 360, color = 'b', label=r'$θ_{4_1}$')
plt.plot(t, np.degrees(fi4_2) % 360, color = 'r', label=r'$θ_{4_2}$')
plt.xlabel('t [s]')
plt.ylabel('degrees')
plt.legend()
plt.show()
With these mods, we get:
BTW, do you want to see an amazingly lazy way of solving problems like these? Much more inefficient than your code, but much easier to derive (e.g. for other structures) without trying to express the closed form of your solution:
from scipy.optimize import fsolve
def polar(r, theta):
return r * np.array((np.cos(theta), np.sin(theta)))
def f(th34, th2):
th3, th4 = th34 # solve simultaneously for theta_3 and theta_4
pb_23 = polar(a, th2) + polar(b, th3) # point B based on links a, b
pb_14 = polar(d, 0) + polar(c, th4) # point B based on links d, c
return pb_23 - pb_14 # error: difference of the two
def solve(th2):
th4_1 = np.array([fsolve(f, [0, -1.5], args=(th2_k,))[1] for th2_k in th2])
th4_2 = np.array([fsolve(f, [0, 1.5], args=(th2_k,))[1] for th2_k in th2])
return th4_1, th4_2
Application:
t = np.linspace(0, 5, 100)
th2 = omega * t
th4_1, th4_2 = solve(th2)
twopi = 2 * np.pi
np.allclose(th4_1 % twopi, fi4_1 % twopi)
# True
np.allclose(th4_2 % twopi, fi4_2 % twopi)
# True
Depending on the structure of your mechanism (e.g. 5 links), you may have more than two solutions, and of course more angles, so you'd have to adapt the code above. But you get the idea.
Be warned: fsolve iterates to find a suitable (close enough) solution, so as I said, it is much slower than your closed form.
Update (some clarification/explanation):
The function f computes the position of the point B in two different ways (via R2-R3 and via R1-R4) and returns the difference (as a vector). We solve for the difference to be zero.
That function takes two arguments: one 2-dimensional variable (th34, which is an array [th3, th4]) and one parameter th2; the parameter is constant during one run of fsolve.
The values [0, -1.5] and [0, 1.5] are initialization values (guesses) for th34 (th3 and th4). We call fsolve twice to get the two possible solutions.
All angles refer to your figure. I use th for θ (theta, not phi), but I kept along the original fi4_1 and fi4_2 for comparison.
Modulo 2*pi, th4_1 should be equal to fi4_1 etc., which is tested by np.allclose to account for numerical rounding errors.

Access current time step in scipy.integrate.odeint within the function

Is there a way to access what the current time step is in scipy.integrate.odeint?
I am trying to solve a system of ODEs where the form of the ode depends on whether or not a population will be depleted. Basically I take from population x provided x doesn't go below a threshold. If the amount I need to take this timestep is greater than that threshold I will take all of x to that point and the rest from z.
I am trying to do this by checking how much I will take this time step, and then allocating between populations x and z in the DEs.
To do this I need to be able to access the step size within the ODE solver to calculate what will be taken this time step. I am using scipy.integrate.odeint - is there a way to access the time step within the function defining the odes?
Alternatively, can you access what the last time was in the solver? I know it won't necessarily be the next time step, but it's likely a good enough approximation for me if that is the best I can do. Or is there another option I've not thought of to do this?
The below MWE is not my system of equations but what I could come up with to try to illustrate what I'm doing. The problem is that on the first time step, if the time step were 1 then the population will go too low, but since the timestep will be small, initially you can take all from x.
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
plt.interactive(False)
tend = 5
tspan = np.linspace(0.0, tend, 1000)
A = 3
B = 4.09
C = 1.96
D = 2.29
def odefunc(P,t):
x = P[0]
y = P[1]
z = P[2]
if A * x - B * x * y < 0.6:
dxdt = A/5 * x
dydt = -C * y + D * x * y
dzdt = - B * z * y
else:
dxdt = A * x - B * x * y
dydt = -C * y + D * x * y
dzdt = 0
dPdt = np.ravel([dxdt, dydt, dzdt])
return dPdt
init = ([0.75,0.95,100])
sol = odeint(odefunc, init, tspan, hmax = 0.01)
x = sol[:, 0]
y = sol[:, 1]
z = sol[:, 2]
plt.figure(1)
plt.plot(tspan,x)
plt.plot(tspan,y)
plt.plot(tspan,z)
Of course you can hack something together that might work.
You could log t but you have to be aware that the values
might not be constantly increasing. This depends on the ODE algorithm and how it works (forward, backward, and central finite differences).
But it will give you an idea where about you are.
logger = [] # visible in odefunc
def odefunc(P,t):
x = P[0]
y = P[1]
z = P[2]
print(t)
logger.append(t)
if logger: # if the list is not empty
if logger[-1] > 2.5: # then read the last value
print('hua!')
if A * x - B * x * y < 0.6:
dxdt = A/5 * x
dydt = -C * y + D * x * y
dzdt = - B * z * y
else:
dxdt = A * x - B * x * y
dydt = -C * y + D * x * y
dzdt = 0
dPdt = np.ravel([dxdt, dydt, dzdt])
return dPdt
print(logger)
As pointed out in the another answer, time may not be strictly increasing at each call to the ODE function in odeint, especially for stiff problems.
The most robust way to handle this kind of discontinuity in the ode function is to use an event to find the location of the zero of (A * x - B * x * y) - 0.6 in your example. For a discontinuous solution, use a terminal event to stop the computation precisely at the zero, and then change the ode function. In solve_ivp you can do this with the events parameter. See the solve ivp documentation and specifically the examples related to the cannonball trajectories. odeint does not support events, and solve_ivp has an LSODA method available that calls the same Fortran library as odeint.
Here is a short example, but you may want to additionally check that sol1 reached the terminal event before solving for sol2.
from scipy.integrate import solve_ivp
tend = 10
def discontinuity_zero(t, y):
return y[0] - 10
discontinuity_zero.terminal = True
def ode_func1(t, y):
return y
def ode_func2 (t, y):
return -y**2
sol1 = solve_ivp(ode_func1, t_span=[0, tend], y0=[1], events=discontinuity_zero, rtol=1e-8)
t1 = sol1.t[-1]
y1 = [sol1.y[0, -1]]
print(f'time={t1} y={y1} discontinuity_zero={discontinuity_zero(t1, y1)}')
sol2 = solve_ivp(ode_func2, t_span=[t1, tend], y0=y1, rtol=1e-8)
plt.plot(sol1.t, sol1.y[0,:])
plt.plot(sol2.t, sol2.y[0,:])
plt.show()
This prints the following, where the time of the discontinuity is accurate to 7 digits.
time=2.302584885712467 y=[10.000000000000002] discontinuity_zero=1.7763568394002505e-15

algebraic constraint to terminate ODE integration with scipy

I'm using Scipy 14.0 to solve a system of ordinary differential equations describing the dynamics of a gas bubble rising vertically (in the z direction) in a standing still fluid because of buoyancy forces. In particular, I have an equation expressing the rising velocity U as a function of bubble radius R, i.e. U=dz/dt=f(R), and one expressing the radius variation as a function of R and U, i.e. dR/dT=f(R,U). All the rest appearing in the code below are material properties.
I'd like to implement something to account for the physical constraint on z which, obviously, is limited by the liquid height H. I consequently implemented a sort of z<=H constraint in order to stop integration in advance if needed: I used set_solout in order to do so. The situation is that the code runs and gives good results, but set_solout is not working at all (it seems like z_constraint is never called actually...). Do you know why?
Is there somebody with a more clever idea, may be also in order to interrupt exactly when z=H (i.e. a final value problem) ? is this the right way/tool or should I reformulate the problem?
thanks in advance
Emi
from scipy.integrate import ode
Db0 = 0.001 # init bubble radius
y0, t0 = [ Db0/2 , 0. ], 0. #init conditions
H = 1
def y_(t,y,g,p0,rho_g,mi_g,sig_g,H):
R = y[0]
z = y[1]
z_ = ( R**2 * g * rho_g ) / ( 3*mi_g ) #velocity
R_ = ( R/3 * g * rho_g * z_ ) / ( p0 + rho_g*g*(H-z) + 4/3*sig_g/R ) #R dynamics
return [R_, z_]
def z_constraint(t,y):
H = 1 #should rather be a variable..
z = y[1]
if z >= H:
flag = -1
else:
flag = 0
return flag
r = ode( y_ )
r.set_integrator('dopri5')
r.set_initial_value(y0, t0)
r.set_f_params(g, 5*1e5, 2000, 40, 0.31, H)
r.set_solout(z_constraint)
t1 = 6
dt = 0.1
while r.successful() and r.t < t1:
r.integrate(r.t+dt)
You're running into this issue. For set_solout to work correctly, it must be called right after set_integrator, before set_initial_value. If you introduce this modification into your code (and set a value for g), integration will terminate when z >= H, as you want.
To find the exact time when the bubble reached the surface, you can make a change of variables after the integration is terminated by solout and integrate back with respect to z (rather than t) to z = H. A paper that describes the technique is M. Henon, Physica 5D, 412 (1982); you may also find this discussion helpful. Here's a very simple example in which the time t such that y(t) = 0.5 is found, given dy/dt = -y:
import numpy as np
from scipy.integrate import ode
def f(t, y):
"""Exponential decay: dy/dt = -y."""
return -y
def solout(t, y):
if y[0] < 0.5:
return -1
else:
return 0
y_initial = 1
t_initial = 0
r = ode(f).set_integrator('dopri5')
r.set_solout(solout)
r.set_initial_value(y_initial, t_initial)
# Integrate until solout constraint violated
r.integrate(2)
# New system with t as independent variable: see Henon's paper for details.
def g(y, t):
return -1.0/y
r2 = ode(g).set_integrator('dopri5')
r2.set_initial_value(r.t, r.y)
r2.integrate(0.5)
y_final = r2.t
t_final = r2.y
# Error: difference between found and analytical solution
print t_final - np.log(2)

Fit a curve for data made up of two distinct regimes

I'm looking for a way to plot a curve through some experimental data. The data shows a small linear regime with a shallow gradient, followed by a steep linear regime after a threshold value.
My data is here: http://pastebin.com/H4NSbxqr
I could fit the data with two lines relatively easily, but I'd like to fit with a continuous line ideally - which should look like two lines with a smooth curve joining them around the threshold (~5000 in the data, shown above).
I attempted this using scipy.optimize curve_fit and trying a function which included the sum of a straight line and an exponential:
y = a*x + b + c*np.exp((x-d)/e)
although despite numerous attempts, it didn't find a solution.
If anyone has any suggestions please, either on the choice of fitting distribution / method or the curve_fit implementation, they would be greatly appreciated.
If you don't have a particular reason to believe that linear + exponential is the true underlying cause of your data, then I think a fit to two lines makes the most sense. You can do this by making your fitting function the maximum of two lines, for example:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def two_lines(x, a, b, c, d):
one = a*x + b
two = c*x + d
return np.maximum(one, two)
Then,
x, y = np.genfromtxt('tmp.txt', unpack=True, delimiter=',')
pw0 = (.02, 30, .2, -2000) # a guess for slope, intercept, slope, intercept
pw, cov = curve_fit(two_lines, x, y, pw0)
crossover = (pw[3] - pw[1]) / (pw[0] - pw[2])
plt.plot(x, y, 'o', x, two_lines(x, *pw), '-')
If you really want a continuous and differentiable solution, it occurred to me that a hyperbola has a sharp bend to it, but it has to be rotated. It was a bit difficult to implement (maybe there's an easier way), but here's a go:
def hyperbola(x, a, b, c, d, e):
""" hyperbola(x) with parameters
a/b = asymptotic slope
c = curvature at vertex
d = offset to vertex
e = vertical offset
"""
return a*np.sqrt((b*c)**2 + (x-d)**2)/b + e
def rot_hyperbola(x, a, b, c, d, e, th):
pars = a, b, c, 0, 0 # do the shifting after rotation
xd = x - d
hsin = hyperbola(xd, *pars)*np.sin(th)
xcos = xd*np.cos(th)
return e + hyperbola(xcos - hsin, *pars)*np.cos(th) + xcos - hsin
Run it as
h0 = 1.1, 1, 0, 5000, 100, .5
h, hcov = curve_fit(rot_hyperbola, x, y, h0)
plt.plot(x, y, 'o', x, two_lines(x, *pw), '-', x, rot_hyperbola(x, *h), '-')
plt.legend(['data', 'piecewise linear', 'rotated hyperbola'], loc='upper left')
plt.show()
I was also able to get the line + exponential to converge, but it looks terrible. This is because it's not a good descriptor of your data, which is linear and an exponential is very far from linear!
def line_exp(x, a, b, c, d, e):
return a*x + b + c*np.exp((x-d)/e)
e0 = .1, 20., .01, 1000., 2000.
e, ecov = curve_fit(line_exp, x, y, e0)
If you want to keep it simple, there's always a polynomial or spline (piecewise polynomials)
from scipy.interpolate import UnivariateSpline
s = UnivariateSpline(x, y, s=x.size) #larger s-value has fewer "knots"
plt.plot(x, s(x))
I researched this a little, Applied Linear Regression by Sanford, and the Correlation and Regression lecture by Steiger had some good info on it. They all however lack the right model, the piecewise function should be
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import lmfit
dfseg = pd.read_csv('segreg.csv')
def err(w):
th0 = w['th0'].value
th1 = w['th1'].value
th2 = w['th2'].value
gamma = w['gamma'].value
fit = th0 + th1*dfseg.Temp + th2*np.maximum(0,dfseg.Temp-gamma)
return fit-dfseg.C
p = lmfit.Parameters()
p.add_many(('th0', 0.), ('th1', 0.0),('th2', 0.0),('gamma', 40.))
mi = lmfit.minimize(err, p)
lmfit.printfuncs.report_fit(mi.params)
b0 = mi.params['th0']; b1=mi.params['th1'];b2=mi.params['th2']
gamma = int(mi.params['gamma'].value)
import statsmodels.formula.api as smf
reslin = smf.ols('C ~ 1 + Temp + I((Temp-%d)*(Temp>%d))' % (gamma,gamma), data=dfseg).fit()
print reslin.summary()
x0 = np.array(range(0,gamma,1))
x1 = np.array(range(0,80-gamma,1))
y0 = b0 + b1*x0
y1 = (b0 + b1 * float(gamma) + (b1 + b2)* x1)
plt.scatter(dfseg.Temp, dfseg.C)
plt.hold(True)
plt.plot(x0,y0)
plt.plot(x1+gamma,y1)
plt.show()
Result
[[Variables]]
th0: 78.6554456 +/- 3.966238 (5.04%) (init= 0)
th1: -0.15728297 +/- 0.148250 (94.26%) (init= 0)
th2: 0.72471237 +/- 0.179052 (24.71%) (init= 0)
gamma: 38.3110177 +/- 4.845767 (12.65%) (init= 40)
The data
"","Temp","C"
"1",8.5536,86.2143
"2",10.6613,72.3871
"3",12.4516,74.0968
"4",16.9032,68.2258
"5",20.5161,72.3548
"6",21.1613,76.4839
"7",24.3929,83.6429
"8",26.4839,74.1935
"9",26.5645,71.2581
"10",27.9828,78.2069
"11",32.6833,79.0667
"12",33.0806,71.0968
"13",33.7097,76.6452
"14",34.2903,74.4516
"15",36,56.9677
"16",37.4167,79.8333
"17",43.9516,79.7097
"18",45.2667,76.9667
"19",47,76
"20",47.1129,78.0323
"21",47.3833,79.8333
"22",48.0968,73.9032
"23",49.05,78.1667
"24",57.5,81.7097
"25",59.2,80.3
"26",61.3226,75
"27",61.9194,87.0323
"28",62.3833,89.8
"29",64.3667,96.4
"30",65.371,88.9677
"31",68.35,91.3333
"32",70.7581,91.8387
"33",71.129,90.9355
"34",72.2419,93.4516
"35",72.85,97.8333
"36",73.9194,92.4839
"37",74.4167,96.1333
"38",76.3871,89.8387
"39",78.0484,89.4516
Graph
I used #user423805 's answer (found via google groups thread: https://groups.google.com/forum/#!topic/lmfit-py/7I2zv2WwFLU ) but noticed it had some limitations when trying to use three or more segments.
Instead of applying np.maximum in the minimizer error function or adding (b1 + b2) in #user423805 's answer, I used the same linear spline calculation for both the minimizer and end-usage:
# least_splines_calc works like this for an example with three segments
# (four threshold params, three gamma params):
#
# for 0 < x < gamma0 : y = th0 + (th1 * x)
# for gamma0 < x < gamma1 : y = th0 + (th1 * x) + (th2 * (x - gamma0))
# for gamma1 < x : y = th0 + (th1 * x) + (th2 * (x - gamma0)) + (th3 * (x - gamma1))
#
def least_splines_calc(x, thresholds, gammas):
if(len(thresholds) < 2):
print("Error: expected at least two thresholds")
return None
applicable_gammas = filter(lambda gamma: x > gamma , gammas)
#base result
y = thresholds[0] + (thresholds[1] * x)
#additional factors calculated depending on x value
for i in range(0, len(applicable_gammas)):
y = y + ( thresholds[i + 2] * ( x - applicable_gammas[i] ) )
return y
def least_splines_calc_array(x_array, thresholds, gammas):
y_array = map(lambda x: least_splines_calc(x, thresholds, gammas), x_array)
return y_array
def err(params, x, data):
th0 = params['th0'].value
th1 = params['th1'].value
th2 = params['th2'].value
th3 = params['th3'].value
gamma1 = params['gamma1'].value
gamma2 = params['gamma2'].value
thresholds = np.array([th0, th1, th2, th3])
gammas = np.array([gamma1, gamma2])
fit = least_splines_calc_array(x, thresholds, gammas)
return np.array(fit)-np.array(data)
p = lmfit.Parameters()
p.add_many(('th0', 0.), ('th1', 0.0),('th2', 0.0),('th3', 0.0),('gamma1', 9.),('gamma2', 9.3)) #NOTE: the 9. / 9.3 were guesses specific to my data, you will need to change these
mi = lmfit.minimize(err_alt, p, args=(np.array(dfseg.Temp), np.array(dfseg.C)))
After minimization, convert the params found by the minimizer into an array of thresholds and gammas to re-use linear_splines_calc to plot the linear splines regression.
Reference: While there's various places that explain least splines (I think #user423805 used http://www.statpower.net/Content/313/Lecture%20Notes/Splines.pdf , which has the (b1 + b2) addition I disagree with in its sample code despite similar equations) , the one that made the most sense to me was this one (by Rob Schapire / Zia Khan at Princeton) : https://www.cs.princeton.edu/courses/archive/spring07/cos424/scribe_notes/0403.pdf - section 2.2 goes into linear splines. Excerpt below:
If you're looking to join what appears to be two straight lines with a hyperbola having a variable radius at/near the intersection of the two lines (which are its asymptotes), I urge you to look hard at Using an Hyperbola as a Transition Model to Fit Two-Regime Straight-Line Data, by Donald G. Watts and David W. Bacon, Technometrics, Vol. 16, No. 3 (Aug., 1974), pp. 369-373.
The formula is drop dead simple, nicely adjustable, and works like a charm. From their paper (in case you can't access it):
As a more useful alternative form we consider an hyperbola for which:
(i) the dependent variable y is a single valued function of the independent variable x,
(ii) the left asymptote has slope theta_1,
(iii) the right asymptote has slope theta_2,
(iv) the asymptotes intersect at the point (x_o, beta_o),
(v) the radius of curvature at x = x_o is proportional to a quantity delta. Such an hyperbola can be written y = beta_o + beta_1*(x - x_o) + beta_2* SQRT[(x - x_o)^2 + delta^2/4], where beta_1 = (theta_1 + theta_2)/2 and beta_2 = (theta_2 - theta_1)/2.
delta is the adjustable parameter that allows you to either closely follow the lines right to the intersection point or smoothly merge from one line to the other.
Just solve for the intersection point (x_o, beta_o), and plug into the formula above.
BTW, in general, if line 1 is y_1 = b_1 + m_1 *x and line 2 is y_2 = b_2 + m_2 * x, then they intersect at x* = (b_2 - b_1) / (m_1 - m_2) and y* = b_1 + m_1 * x*. So, to connect with the formalism above, x_o = x*, beta_o = y* and the two m_*'s are the two thetas.
There is a straightforward method (not iterative, no initial guess) pp.12-13 in https://fr.scribd.com/document/380941024/Regression-par-morceaux-Piecewise-Regression-pdf
The data comes from the scanning of the figure published by IanRoberts in his question. Scanning for the coordinates of the pixels in not accurate. So, don't be surprised by additional deviation.
Note that the abscisses and ordinates scales have been devised by 1000.
The equations of the two segments are
The approximate values of the five parameters are written on the above figure.

Categories

Resources