Velocity-Verlet Double-Well Algorithm Python - python

I am implemententing the Verlet algorithm for a double well potential V(x) = x^4-20x^2, as to create a simple phase portrait.The generated phase portrait has an augmented oval shape and is clearly incorrect. I have a feeling that my problem is occurring in my definition of the of x^3 but I am not sure. I have also included the algorithm for a classical harmonic oscillator to show that my code works correctly.
import numpy as np
import matplotlib.pyplot as plt
###Constants
w = 2
m=1
N=500
dt=0.05
t = np.linspace(0, N*dt, N+1)
np.shape(t)
x = np.zeros(N+1)
p = np.zeros(N+1)
p_0 = 0
x_0 = 1
x[0] = x_0
p[0] = p_0
#Velocity Verlet Tuckerman
#x(dt) = x(0) +p(0)/m*dt + 1/(2m) * F(x(0))
#p(dt = p(0) + dt/2[F(x(0)) + F(x(dt))]
#Harmonic Oscillator F(x) = -kx = -mw^2x
for n in range(N):
x[n+1] = x[n] + (p[n]/m)*dt - (0.5)*w**2*x[n]*dt*dt
p[n+1] = p[n] - m*(0.5)*w**2*x[n]*dt - m*0.5*w**2*x[n+1]*dt
plt.plot(x,p)
#Symmetric Double Well: F(x) = -4x^3 + 40x
#V(x) = x^4 -20x^2
for n in range(N):
x[n+1] = x[n] + (p[n]/m)*dt +1/(2*m)*( -4*(x[n]*x[n]*x[n])*dt*dt +40*x[n]*dt*dt)
p[n+1] = p[n] + (1/2)*(-4*m*(x[n]*x[n]*x[n])*dt +40*m*x[n]*dt - 4*m*(x[n+1]*x[n+1]*x[n+1])*dt +40*m*x[n+1]*dt)
plt.plot(x,p)
Thanks!

To be more precise, V has a minimum at x=+-sqrt(10) with value -100, the local maximum at x=0 gives value 0. The initial position x0=1, v0=0 places the solution in the right valley, oscillating around sqrt(10).
To get a figure-8 shape you need an initial point with V(x0) slightly larger than zero. For instance with x0=5 one gets V=25*(25-20)=125. Or take x0=4.5 ==> x0^2=20.25 ==> V ~ 5.

Related

Simulate a coupled ordinary differential equation

I want to write a program which turns a 2nd order differential equation into two ordinary differential equations but I don't know how I can do that in Python.
I am getting lots of errors, please help in writing the code correctly.
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
import numpy as np
N = 30 # Number of coupled oscillators.
alpha=0.25
A = 1.0
# Initial positions.
y[0] = 0 # Fix the left-hand side at zero.
y[N+1] = 0 # Fix the right-hand side at zero.
# The range(1,N+1) command only prints out [1,2,3, ... N].
for p in range(1, N+1): # p is particle number.
y[p] = A * np.sin(3 * p * np.pi /(N+1.0))
####################################################
# Initial velocities.
####################################################
v[0] = 0 # The left and right boundaries are
v[N+1] = 0 # clamped and don't move.
# This version sets them all the particle velocities to zero.
for p in range(1, N+1):
v[p] = 0
w0 = [v[p], y[p]]
def accel(t,w):
v[p], y[p] = w
global a
a[0] = 0.0
a[N+1] = 0.0
# This version loops explicitly over all the particles.
for p in range(1,N+1):
a[p] = [v[p], y(p+1)+y(p-1)-2*y(p)+ alpha * ((y[p+1] - y[p])**2 - (y[p] - y[p-1])**2)]
return a
duration = 50
t = np.linspace(0, duration, 800)
abserr = 1.0e-8
relerr = 1.0e-6
solution = solve_ivp(accel, [0, duration], w0, method='RK45', t_eval=t,
vectorized=False, dense_output=True, args=(), atol=abserr, rtol=relerr)
Most general-purpose solvers do not do structured state objects. They just work with a flat array as representation of the state space points. From the construction of the initial point you seem to favor the state space ordering
[ v[0], v[1], ... v[N+1], y[0], y[1], ..., y[N+1] ]
This allows to simply split both and to assemble the derivatives vector from the velocity and acceleration arrays.
Let's keep things simple and separate functionality in small functions
a = np.zeros(N+2)
def accel(y):
global a ## initialized to the correct length with zero, avoids repeated allocation
a[1:-1] = y[2:]+y[:-2]-2*y[1:-1] + alpha*((y[2:]-y[1:-1])**2-(y[1:-1]-y[:-2])**2)
return a
def derivs(t,w):
v,y = w[:N+2], w[N+2:]
return np.concatenate([accel(y), v])
or keeping the theme of avoiding allocations
dwdt = np.zeros(2*N+4)
def derivs(t,w):
global dwdt
v,y = w[:N+2], w[N+2:]
dwdt[:N+2] = accel(y)
dwdt[N+2:] = v
return dwdt
Now you only need to set
w0=np.concatenate([v,y])
to rapidly get to a more interesting class of errors.

Minimizing this error function, using NumPy

Background
I've been working for some time on attempting to solve the (notoriously painful) Time Difference of Arrival (TDoA) multi-lateration problem, in 3-dimensions and using 4 nodes. If you're unfamiliar with the problem, it is to determine the coordinates of some signal source (X,Y,Z), given the coordinates of n nodes, the time of arrival of the signal at each node, and the velocity of the signal v.
My solution is as follows:
For each node, we write (X-x_i)**2 + (Y-y_i)**2 + (Z-z_i)**2 = (v(t_i - T)**2
Where (x_i, y_i, z_i) are the coordinates of the ith node, and T is the time of emission.
We have now 4 equations in 4 unknowns. Four nodes are obviously insufficient. We could try to solve this system directly, however that seems next to impossible given the highly nonlinear nature of the problem (and, indeed, I've tried many direct techniques... and failed). Instead, we simplify this to a linear problem by considering all i/j possibilities, subtracting equation i from equation j. We obtain (n(n-1))/2 =6 equations of the form:
2*(x_j - x_i)*X + 2*(y_j - y_i)*Y + 2*(z_j - z_i)*Z + 2 * v**2 * (t_i - t_j) = v**2 ( t_i**2 - t_j**2) + (x_j**2 + y_j**2 + z_j**2) - (x_i**2 + y_i**2 + z_i**2)
Which look like Xv_1 + Y_v2 + Z_v3 + T_v4 = b. We try now to apply standard linear least squares, where the solution is the matrix vector x in A^T Ax = A^T b. Unfortunately, if you were to try feeding this into any standard linear least squares algorithm, it'll choke up. So, what do we do now?
...
The time of arrival of the signal at node i is given (of course) by:
sqrt( (X-x_i)**2 + (Y-y_i)**2 + (Z-z_i)**2 ) / v
This equation implies that the time of arrival, T, is 0. If we have that T = 0, we can drop the T column in matrix A and the problem is greatly simplified. Indeed, NumPy's linalg.lstsq() gives a surprisingly accurate & precise result.
...
So, what I do is normalize the input times by subtracting from each equation the earliest time. All I have to do then is determine the dt that I can add to each time such that the residual of summed squared error for the point found by linear least squares is minimized.
I define the error for some dt to be the squared difference between the arrival time for the point predicted by feeding the input times + dt to the least squares algorithm, minus the input time (normalized), summed over all 4 nodes.
for node, time in nodes, times:
error += ( (sqrt( (X-x_i)**2 + (Y-y_i)**2 + (Z-z_i)**2 ) / v) - time) ** 2
My problem:
I was able to do this somewhat satisfactorily by using brute-force. I started at dt = 0, and moved by some step up to some maximum # of iterations OR until some minimum RSS error is reached, and that was the dt I added to the normalized times to obtain a solution. The resulting solutions were very accurate and precise, but quite slow.
In practice, I'd like to be able to solve this in real time, and therefore a far faster solution will be needed. I began with the assumption that the error function (that is, dt vs error as defined above) would be highly nonlinear-- offhand, this made sense to me.
Since I don't have an actual, mathematical function, I can automatically rule out methods that require differentiation (e.g. Newton-Raphson). The error function will always be positive, so I can rule out bisection, etc. Instead, I try a simple approximation search. Unfortunately, that failed miserably. I then tried Tabu search, followed by a genetic algorithm, and several others. They all failed horribly.
So, I decided to do some investigating. As it turns out the plot of the error function vs dt looks a bit like a square root, only shifted right depending upon the distance from the nodes that the signal source is:
Where dt is on horizontal axis, error on vertical axis
And, in hindsight, of course it does!. I defined the error function to involve square roots so, at least to me, this seems reasonable.
What to do?
So, my issue now is, how do I determine the dt corresponding to the minimum of the error function?
My first (very crude) attempt was to get some points on the error graph (as above), fit it using numpy.polyfit, then feed the results to numpy.root. That root corresponds to the dt. Unfortunately, this failed, too. I tried fitting with various degrees, and also with various points, up to a ridiculous number of points such that I may as well just use brute-force.
How can I determine the dt corresponding to the minimum of this error function?
Since we're dealing with high velocities (radio signals), it's important that the results be precise and accurate, as minor variances in dt can throw off the resulting point.
I'm sure that there's some infinitely simpler approach buried in what I'm doing here however, ignoring everything else, how do I find dt?
My requirements:
Speed is of utmost importance
I have access only to pure Python and NumPy in the environment where this will be run
EDIT:
Here's my code. Admittedly, a bit messy. Here, I'm using the polyfit technique. It will "simulate" a source for you, and compare results:
from numpy import poly1d, linspace, set_printoptions, array, linalg, triu_indices, roots, polyfit
from dataclasses import dataclass
from random import randrange
import math
#dataclass
class Vertexer:
receivers: list
# Defaults
c = 299792
# Receivers:
# [x_1, y_1, z_1]
# [x_2, y_2, z_2]
# [x_3, y_3, z_3]
# Solved:
# [x, y, z]
def error(self, dt, times):
solved = self.linear([time + dt for time in times])
error = 0
for time, receiver in zip(times, self.receivers):
error += ((math.sqrt( (solved[0] - receiver[0])**2 +
(solved[1] - receiver[1])**2 +
(solved[2] - receiver[2])**2 ) / c ) - time)**2
return error
def linear(self, times):
X = array(self.receivers)
t = array(times)
x, y, z = X.T
i, j = triu_indices(len(x), 1)
A = 2 * (X[i] - X[j])
b = self.c**2 * (t[j]**2 - t[i]**2) + (X[i]**2).sum(1) - (X[j]**2).sum(1)
solved, residuals, rank, s = linalg.lstsq(A, b, rcond=None)
return(solved)
def find(self, times):
# Normalize times
times = [time - min(times) for time in times]
# Fit the error function
y = []
x = []
dt = 1E-10
for i in range(50000):
x.append(self.error(dt * i, times))
y.append(dt * i)
p = polyfit(array(x), array(y), 2)
r = roots(p)
return(self.linear([time + r for time in times]))
# SIMPLE CODE FOR SIMULATING A SIGNAL
# Pick nodes to be at random locations
x_1 = randrange(10); y_1 = randrange(10); z_1 = randrange(10)
x_2 = randrange(10); y_2 = randrange(10); z_2 = randrange(10)
x_3 = randrange(10); y_3 = randrange(10); z_3 = randrange(10)
x_4 = randrange(10); y_4 = randrange(10); z_4 = randrange(10)
# Pick source to be at random location
x = randrange(1000); y = randrange(1000); z = randrange(1000)
# Set velocity
c = 299792 # km/ns
# Generate simulated source
t_1 = math.sqrt( (x - x_1)**2 + (y - y_1)**2 + (z - z_1)**2 ) / c
t_2 = math.sqrt( (x - x_2)**2 + (y - y_2)**2 + (z - z_2)**2 ) / c
t_3 = math.sqrt( (x - x_3)**2 + (y - y_3)**2 + (z - z_3)**2 ) / c
t_4 = math.sqrt( (x - x_4)**2 + (y - y_4)**2 + (z - z_4)**2 ) / c
print('Actual:', x, y, z)
myVertexer = Vertexer([[x_1, y_1, z_1],[x_2, y_2, z_2],[x_3, y_3, z_3],[x_4, y_4, z_4]])
solution = myVertexer.find([t_1, t_2, t_3, t_4])
print(solution)
It seems like the Bancroft method applies to this problem? Here's a pure NumPy implementation.
# Implementation of the Bancroft method, following
# https://gssc.esa.int/navipedia/index.php/Bancroft_Method
M = np.diag([1, 1, 1, -1])
def lorentz_inner(v, w):
return np.sum(v * (w # M), axis=-1)
B = np.array(
[
[x_1, y_1, z_1, c * t_1],
[x_2, y_2, z_2, c * t_2],
[x_3, y_3, z_3, c * t_3],
[x_4, y_4, z_4, c * t_4],
]
)
one = np.ones(4)
a = 0.5 * lorentz_inner(B, B)
B_inv_one = np.linalg.solve(B, one)
B_inv_a = np.linalg.solve(B, a)
for Lambda in np.roots(
[
lorentz_inner(B_inv_one, B_inv_one),
2 * (lorentz_inner(B_inv_one, B_inv_a) - 1),
lorentz_inner(B_inv_a, B_inv_a),
]
):
x, y, z, c_t = M # np.linalg.solve(B, Lambda * one + a)
print("Candidate:", x, y, z, c_t / c)
My answer might have mistakes (glaring) as I had not heard the TDOA term before this afternoon. Please double check if the method is right.
I could not find solution to your original problem of finding dt corresponding to the minimum error. My answer also deviates from the requirement that other than numpy no third party library had to be used (I used Sympy and largely used the code from here). However I am still posting this thinking that somebody someday might find it useful if all one is interested in ... is to find X,Y,Z of the source emitter. This method also does not take into account real-life situations where white noise or errors might be present or curvature of the earth and other complications.
Your initial test conditions are as below.
from random import randrange
import math
# SIMPLE CODE FOR SIMULATING A SIGNAL
# Pick nodes to be at random locations
x_1 = randrange(10); y_1 = randrange(10); z_1 = randrange(10)
x_2 = randrange(10); y_2 = randrange(10); z_2 = randrange(10)
x_3 = randrange(10); y_3 = randrange(10); z_3 = randrange(10)
x_4 = randrange(10); y_4 = randrange(10); z_4 = randrange(10)
# Pick source to be at random location
x = randrange(1000); y = randrange(1000); z = randrange(1000)
# Set velocity
c = 299792 # km/ns
# Generate simulated source
t_1 = math.sqrt( (x - x_1)**2 + (y - y_1)**2 + (z - z_1)**2 ) / c
t_2 = math.sqrt( (x - x_2)**2 + (y - y_2)**2 + (z - z_2)**2 ) / c
t_3 = math.sqrt( (x - x_3)**2 + (y - y_3)**2 + (z - z_3)**2 ) / c
t_4 = math.sqrt( (x - x_4)**2 + (y - y_4)**2 + (z - z_4)**2 ) / c
print('Actual:', x, y, z)
My solution is as below.
import sympy as sym
X,Y,Z = sym.symbols('X,Y,Z', real=True)
f = sym.Eq((x_1 - X)**2 +(y_1 - Y)**2 + (z_1 - Z)**2 , (c*t_1)**2)
g = sym.Eq((x_2 - X)**2 +(y_2 - Y)**2 + (z_2 - Z)**2 , (c*t_2)**2)
h = sym.Eq((x_3 - X)**2 +(y_3 - Y)**2 + (z_3 - Z)**2 , (c*t_3)**2)
i = sym.Eq((x_4 - X)**2 +(y_4 - Y)**2 + (z_4 - Z)**2 , (c*t_4)**2)
print("Solved coordinates are ", sym.solve([f,g,h,i],X,Y,Z))
print statement from your initial condition gave.
Actual: 111 553 110
and the solution that almost instantly came out was
Solved coordinates are [(111.000000000000, 553.000000000000, 110.000000000000)]
Sorry again if something is totally amiss.

Smooth a curve in Python while preserving the value and slope at the end points

I have two solutions to this problem actually, they are both applied below to a test case. The thing is that none of them is perfect: first one only take into account the two end points, the other one can't be made "arbitrarily smooth": there is a limit in the amount of smoothness one can achieve (the one I am showing).
I am sure there is a better solution, that kind-of go from the first solution to the other and all the way to no smoothing at all. It may already be implemented somewhere. Maybe solving a minimization problem with an arbitrary number of splines equidistributed?
Thank you very much for your help
Ps: the seed used is a challenging one
import matplotlib.pyplot as plt
from scipy import interpolate
from scipy.signal import savgol_filter
import numpy as np
import random
def scipy_bspline(cv, n=100, degree=3):
""" Calculate n samples on a bspline
cv : Array ov control vertices
n : Number of samples to return
degree: Curve degree
"""
cv = np.asarray(cv)
count = cv.shape[0]
degree = np.clip(degree,1,count-1)
kv = np.clip(np.arange(count+degree+1)-degree,0,count-degree)
# Return samples
max_param = count - (degree * (1-periodic))
spl = interpolate.BSpline(kv, cv, degree)
return spl(np.linspace(0,max_param,n))
def round_up_to_odd(f):
return np.int(np.ceil(f / 2.) * 2 + 1)
def generateRandomSignal(n=1000, seed=None):
"""
Parameters
----------
n : integer, optional
Number of points in the signal. The default is 1000.
Returns
-------
sig : numpy array
"""
np.random.seed(seed)
print("Seed was:", seed)
steps = np.random.choice(a=[-1, 0, 1], size=(n-1))
roughSig = np.concatenate([np.array([0]), steps]).cumsum(0)
sig = savgol_filter(roughSig, round_up_to_odd(n/10), 6)
return sig
# Generate a random signal to illustrate my point
n = 1000
t = np.linspace(0, 10, n)
seed = 45136. # Challenging seed
sig = generateRandomSignal(n=1000, seed=seed)
sigInit = np.copy(sig)
# Add noise to the signal
mean = 0
std = sig.max()/3.0
num_samples = n/5
idxMin = n/2-100
idxMax = idxMin + num_samples
tCut = t[idxMin+1:idxMax]
noise = np.random.normal(mean, std, size=num_samples-1) + 2*std*np.sin(2.0*np.pi*tCut/0.4)
sig[idxMin+1:idxMax] += noise
# Define filtering range enclosing the noisy area of the signal
idxMin -= 20
idxMax += 20
# Extreme filtering solution
# Spline between first and last points, the points in between have no influence
sigTrim = np.delete(sig, np.arange(idxMin,idxMax))
tTrim = np.delete(t, np.arange(idxMin,idxMax))
f = interpolate.interp1d(tTrim, sigTrim, kind='quadratic')
sigSmooth1 = f(t)
# My attempt. Not bad but not perfect because there is a limit in the maximum
# amount of smoothing we can add (degree=len(tSlice) is the maximum)
# If I could do degree=10*len(tSlice) and converging to the first solution
# I would be done!
sigSlice = sig[idxMin:idxMax]
tSlice = t[idxMin:idxMax]
cv = np.stack((tSlice, sigSlice)).T
p = scipy_bspline(cv, n=len(tSlice), degree=len(tSlice))
tSlice = p.T[0]
sigSliceSmooth = p.T[1]
sigSmooth2 = np.copy(sig)
sigSmooth2[idxMin:idxMax] = sigSliceSmooth
# Plot
plt.figure()
plt.plot(t, sig, label="Signal")
plt.plot(t, sigSmooth1, label="Solution 1")
plt.plot(t, sigSmooth2, label="Solution 2")
plt.plot(t[idxMin:idxMax], sigInit[idxMin:idxMax], label="What I'd want (kind of, smoother will be even better actually)")
plt.plot([t[idxMin],t[idxMax]], [sig[idxMin],sig[idxMax]],"o")
plt.legend()
plt.show()
sys.exit()
Yes, a minimization is a good way to approach this smoothing problem.
Least squares problem
Here is a suggestion for a least squares formulation: let s[0], ..., s[N] denote the N+1 samples of the given signal to smooth, and let L and R be the desired slopes to preserve at the left and right endpoints. Find the smoothed signal u[0], ..., u[N] as the minimizer of
min_u (1/2) sum_n (u[n] - s[n])² + (λ/2) sum_n (u[n+1] - 2 u[n] + u[n-1])²
subject to
s[0] = u[0], s[N] = u[N] (value constraints),
L = u[1] - u[0], R = u[N] - u[N-1] (slope constraints),
where in the minimization objective, the sums are over n = 1, ..., N-1 and λ is a positive parameter controlling the smoothing strength. The first term tries to keep the solution close to the original signal, and the second term penalizes u for bending to encourage a smooth solution.
The slope constraints require that
u[1] = L + u[0] = L + s[0] and u[N-1] = u[N] - R = s[N] - R. So we can consider the minimization as over only the interior samples u[2], ..., u[N-2].
Finding the minimizer
The minimizer satisfies the Euler–Lagrange equations
(u[n] - s[n]) / λ + (u[n+2] - 4 u[n+1] + 6 u[n] - 4 u[n-1] + u[n-2]) = 0
for n = 2, ..., N-2.
An easy way to find an approximate solution is by gradient descent: initialize u = np.copy(s), set u[1] = L + s[0] and u[N-1] = s[N] - R, and do 100 iterations or so of
u[2:-2] -= (0.05 / λ) * (u - s)[2:-2] + np.convolve(u, [1, -4, 6, -4, 1])[4:-4]
But with some more work, it is possible to do better than this by solving the E–L equations directly. For each n, move the known quantities to the right-hand side: s[n] and also the endpoints u[0] = s[0], u[1] = L + s[0], u[N-1] = s[N] - R, u[N] = s[N]. The you will have a linear system "A u = b", and matrix A has rows like
0, ..., 0, 1, -4, (6 + 1/λ), -4, 1, 0, ..., 0.
Finally, solve the linear system to find the smoothed signal u. You could use numpy.linalg.solve to do this if N is not too large, or if N is large, try an iterative method like conjugate gradients.
you can apply a simple smoothing method and plot the smooth curves with different smoothness values to see which one works best.
def smoothing(data, smoothness=0.5):
last = data[0]
new_data = [data[0]]
for datum in data[1:]:
new_value = smoothness * last + (1 - smoothness) * datum
new_data.append(new_value)
last = datum
return new_data
You can plot this curve for multiple values of smoothness and pick the curve which suits your needs. You can also apply this method only on a range of values in the actual curve by defining start and end

ValueError: x and y must have the same first dimension

I am trying to implement a finite difference approximation to solve the Heat Equation, u_t = k * u_{xx}, in Python using NumPy.
Here is a copy of the code I am running:
## This program is to implement a Finite Difference method approximation
## to solve the Heat Equation, u_t = k * u_xx,
## in 1D w/out sources & on a finite interval 0 < x < L. The PDE
## is subject to B.C: u(0,t) = u(L,t) = 0,
## and the I.C: u(x,0) = f(x).
import numpy as np
import matplotlib.pyplot as plt
# parameters
L = 1 # legnth of the rod
T = 10 # terminal time
N = 10
M = 100
s = 0.25
# uniform mesh
x_init = 0
x_end = L
dx = float(x_end - x_init) / N
x = np.arange(x_init, x_end, dx)
x[0] = x_init
# time discretization
t_init = 0
t_end = T
dt = float(t_end - t_init) / M
t = np.arange(t_init, t_end, dt)
t[0] = t_init
# Boundary Conditions
for m in xrange(0, M):
t[m] = m * dt
# Initial Conditions
for j in xrange(0, N):
x[j] = j * dx
# definition of solution u(x,t) to u_t = k * u_xx
u = np.zeros((N, M+1)) # array to store values of the solution
# Finite Difference Scheme:
u[:,0] = x**2 #initial condition
for m in xrange(0, M):
for j in xrange(1, N-1):
if j == 1:
u[j-1,m] = 0 # Boundary condition
elif j == N-1:
u[j+1,m] = 0
else:
u[j,m+1] = u[j,m] + s * ( u[j+1,m] -
2 * u[j,m] + u[j-1,m] )
print u, #t, x
plt.plot(u, t)
#plt.show()
I think my code is working properly and it is producing an output. I want to plot the output of the solution u versus t (my time vector). If I can plot the graph then I am able to check if my numerical approximation agrees with the expected phenomena for the Heat Equation. However, I am getting the error that "x and y must have same first dimension". How can I correct this issue?
An additional question: Am I better off attempting to make an animation with matplotlib.animation instead of using matplotlib.plyplot ???
Thanks so much for any and all help! It is very greatly appreciated!
Okay so I had a "brain dump" and tried plotting u vs. t sort of forgetting that u, being the solution to the Heat Equation (u_t = k * u_{xx}), is defined as u(x,t) so it has values for time. I made the following correction to my code:
print u #t, x
plt.plot(u)
plt.show()
And now my programming is finally displaying an image. And here it is:
It is absolutely beautiful, isn't it?

How to perform cubic spline interpolation in python?

I have two lists to describe the function y(x):
x = [0,1,2,3,4,5]
y = [12,14,22,39,58,77]
I would like to perform cubic spline interpolation so that given some value u in the domain of x, e.g.
u = 1.25
I can find y(u).
I found this in SciPy but I am not sure how to use it.
Short answer:
from scipy import interpolate
def f(x):
x_points = [ 0, 1, 2, 3, 4, 5]
y_points = [12,14,22,39,58,77]
tck = interpolate.splrep(x_points, y_points)
return interpolate.splev(x, tck)
print(f(1.25))
Long answer:
scipy separates the steps involved in spline interpolation into two operations, most likely for computational efficiency.
The coefficients describing the spline curve are computed,
using splrep(). splrep returns an array of tuples containing the
coefficients.
These coefficients are passed into splev() to actually
evaluate the spline at the desired point x (in this example 1.25).
x can also be an array. Calling f([1.0, 1.25, 1.5]) returns the
interpolated points at 1, 1.25, and 1,5, respectively.
This approach is admittedly inconvenient for single evaluations, but since the most common use case is to start with a handful of function evaluation points, then to repeatedly use the spline to find interpolated values, it is usually quite useful in practice.
In case, scipy is not installed:
import numpy as np
from math import sqrt
def cubic_interp1d(x0, x, y):
"""
Interpolate a 1-D function using cubic splines.
x0 : a float or an 1d-array
x : (N,) array_like
A 1-D array of real/complex values.
y : (N,) array_like
A 1-D array of real values. The length of y along the
interpolation axis must be equal to the length of x.
Implement a trick to generate at first step the cholesky matrice L of
the tridiagonal matrice A (thus L is a bidiagonal matrice that
can be solved in two distinct loops).
additional ref: www.math.uh.edu/~jingqiu/math4364/spline.pdf
"""
x = np.asfarray(x)
y = np.asfarray(y)
# remove non finite values
# indexes = np.isfinite(x)
# x = x[indexes]
# y = y[indexes]
# check if sorted
if np.any(np.diff(x) < 0):
indexes = np.argsort(x)
x = x[indexes]
y = y[indexes]
size = len(x)
xdiff = np.diff(x)
ydiff = np.diff(y)
# allocate buffer matrices
Li = np.empty(size)
Li_1 = np.empty(size-1)
z = np.empty(size)
# fill diagonals Li and Li-1 and solve [L][y] = [B]
Li[0] = sqrt(2*xdiff[0])
Li_1[0] = 0.0
B0 = 0.0 # natural boundary
z[0] = B0 / Li[0]
for i in range(1, size-1, 1):
Li_1[i] = xdiff[i-1] / Li[i-1]
Li[i] = sqrt(2*(xdiff[i-1]+xdiff[i]) - Li_1[i-1] * Li_1[i-1])
Bi = 6*(ydiff[i]/xdiff[i] - ydiff[i-1]/xdiff[i-1])
z[i] = (Bi - Li_1[i-1]*z[i-1])/Li[i]
i = size - 1
Li_1[i-1] = xdiff[-1] / Li[i-1]
Li[i] = sqrt(2*xdiff[-1] - Li_1[i-1] * Li_1[i-1])
Bi = 0.0 # natural boundary
z[i] = (Bi - Li_1[i-1]*z[i-1])/Li[i]
# solve [L.T][x] = [y]
i = size-1
z[i] = z[i] / Li[i]
for i in range(size-2, -1, -1):
z[i] = (z[i] - Li_1[i-1]*z[i+1])/Li[i]
# find index
index = x.searchsorted(x0)
np.clip(index, 1, size-1, index)
xi1, xi0 = x[index], x[index-1]
yi1, yi0 = y[index], y[index-1]
zi1, zi0 = z[index], z[index-1]
hi1 = xi1 - xi0
# calculate cubic
f0 = zi0/(6*hi1)*(xi1-x0)**3 + \
zi1/(6*hi1)*(x0-xi0)**3 + \
(yi1/hi1 - zi1*hi1/6)*(x0-xi0) + \
(yi0/hi1 - zi0*hi1/6)*(xi1-x0)
return f0
if __name__ == '__main__':
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 11)
y = np.sin(x)
plt.scatter(x, y)
x_new = np.linspace(0, 10, 201)
plt.plot(x_new, cubic_interp1d(x_new, x, y))
plt.show()
If you have scipy version >= 0.18.0 installed you can use CubicSpline function from scipy.interpolate for cubic spline interpolation.
You can check scipy version by running following commands in python:
#!/usr/bin/env python3
import scipy
scipy.version.version
If your scipy version is >= 0.18.0 you can run following example code for cubic spline interpolation:
#!/usr/bin/env python3
import numpy as np
from scipy.interpolate import CubicSpline
# calculate 5 natural cubic spline polynomials for 6 points
# (x,y) = (0,12) (1,14) (2,22) (3,39) (4,58) (5,77)
x = np.array([0, 1, 2, 3, 4, 5])
y = np.array([12,14,22,39,58,77])
# calculate natural cubic spline polynomials
cs = CubicSpline(x,y,bc_type='natural')
# show values of interpolation function at x=1.25
print('S(1.25) = ', cs(1.25))
## Aditional - find polynomial coefficients for different x regions
# if you want to print polynomial coefficients in form
# S0(0<=x<=1) = a0 + b0(x-x0) + c0(x-x0)^2 + d0(x-x0)^3
# S1(1< x<=2) = a1 + b1(x-x1) + c1(x-x1)^2 + d1(x-x1)^3
# ...
# S4(4< x<=5) = a4 + b4(x-x4) + c5(x-x4)^2 + d5(x-x4)^3
# x0 = 0; x1 = 1; x4 = 4; (start of x region interval)
# show values of a0, b0, c0, d0, a1, b1, c1, d1 ...
cs.c
# Polynomial coefficients for 0 <= x <= 1
a0 = cs.c.item(3,0)
b0 = cs.c.item(2,0)
c0 = cs.c.item(1,0)
d0 = cs.c.item(0,0)
# Polynomial coefficients for 1 < x <= 2
a1 = cs.c.item(3,1)
b1 = cs.c.item(2,1)
c1 = cs.c.item(1,1)
d1 = cs.c.item(0,1)
# ...
# Polynomial coefficients for 4 < x <= 5
a4 = cs.c.item(3,4)
b4 = cs.c.item(2,4)
c4 = cs.c.item(1,4)
d4 = cs.c.item(0,4)
# Print polynomial equations for different x regions
print('S0(0<=x<=1) = ', a0, ' + ', b0, '(x-0) + ', c0, '(x-0)^2 + ', d0, '(x-0)^3')
print('S1(1< x<=2) = ', a1, ' + ', b1, '(x-1) + ', c1, '(x-1)^2 + ', d1, '(x-1)^3')
print('...')
print('S5(4< x<=5) = ', a4, ' + ', b4, '(x-4) + ', c4, '(x-4)^2 + ', d4, '(x-4)^3')
# So we can calculate S(1.25) by using equation S1(1< x<=2)
print('S(1.25) = ', a1 + b1*0.25 + c1*(0.25**2) + d1*(0.25**3))
# Cubic spline interpolation calculus example
# https://www.youtube.com/watch?v=gT7F3TWihvk
Just putting this here if you want a dependency-free solution.
Code taken from an answer above: https://stackoverflow.com/a/48085583/36061
def my_cubic_interp1d(x0, x, y):
"""
Interpolate a 1-D function using cubic splines.
x0 : a 1d-array of floats to interpolate at
x : a 1-D array of floats sorted in increasing order
y : A 1-D array of floats. The length of y along the
interpolation axis must be equal to the length of x.
Implement a trick to generate at first step the cholesky matrice L of
the tridiagonal matrice A (thus L is a bidiagonal matrice that
can be solved in two distinct loops).
additional ref: www.math.uh.edu/~jingqiu/math4364/spline.pdf
# original function code at: https://stackoverflow.com/a/48085583/36061
This function is licenced under: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
https://creativecommons.org/licenses/by-sa/3.0/
Original Author raphael valentin
Date 3 Jan 2018
Modifications made to remove numpy dependencies:
-all sub-functions by MR
This function, and all sub-functions, are licenced under: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
Mod author: Matthew Rowles
Date 3 May 2021
"""
def diff(lst):
"""
numpy.diff with default settings
"""
size = len(lst)-1
r = [0]*size
for i in range(size):
r[i] = lst[i+1] - lst[i]
return r
def list_searchsorted(listToInsert, insertInto):
"""
numpy.searchsorted with default settings
"""
def float_searchsorted(floatToInsert, insertInto):
for i in range(len(insertInto)):
if floatToInsert <= insertInto[i]:
return i
return len(insertInto)
return [float_searchsorted(i, insertInto) for i in listToInsert]
def clip(lst, min_val, max_val, inPlace = False):
"""
numpy.clip
"""
if not inPlace:
lst = lst[:]
for i in range(len(lst)):
if lst[i] < min_val:
lst[i] = min_val
elif lst[i] > max_val:
lst[i] = max_val
return lst
def subtract(a,b):
"""
returns a - b
"""
return a - b
size = len(x)
xdiff = diff(x)
ydiff = diff(y)
# allocate buffer matrices
Li = [0]*size
Li_1 = [0]*(size-1)
z = [0]*(size)
# fill diagonals Li and Li-1 and solve [L][y] = [B]
Li[0] = sqrt(2*xdiff[0])
Li_1[0] = 0.0
B0 = 0.0 # natural boundary
z[0] = B0 / Li[0]
for i in range(1, size-1, 1):
Li_1[i] = xdiff[i-1] / Li[i-1]
Li[i] = sqrt(2*(xdiff[i-1]+xdiff[i]) - Li_1[i-1] * Li_1[i-1])
Bi = 6*(ydiff[i]/xdiff[i] - ydiff[i-1]/xdiff[i-1])
z[i] = (Bi - Li_1[i-1]*z[i-1])/Li[i]
i = size - 1
Li_1[i-1] = xdiff[-1] / Li[i-1]
Li[i] = sqrt(2*xdiff[-1] - Li_1[i-1] * Li_1[i-1])
Bi = 0.0 # natural boundary
z[i] = (Bi - Li_1[i-1]*z[i-1])/Li[i]
# solve [L.T][x] = [y]
i = size-1
z[i] = z[i] / Li[i]
for i in range(size-2, -1, -1):
z[i] = (z[i] - Li_1[i-1]*z[i+1])/Li[i]
# find index
index = list_searchsorted(x0,x)
index = clip(index, 1, size-1)
xi1 = [x[num] for num in index]
xi0 = [x[num-1] for num in index]
yi1 = [y[num] for num in index]
yi0 = [y[num-1] for num in index]
zi1 = [z[num] for num in index]
zi0 = [z[num-1] for num in index]
hi1 = list( map(subtract, xi1, xi0) )
# calculate cubic - all element-wise multiplication
f0 = [0]*len(hi1)
for j in range(len(f0)):
f0[j] = zi0[j]/(6*hi1[j])*(xi1[j]-x0[j])**3 + \
zi1[j]/(6*hi1[j])*(x0[j]-xi0[j])**3 + \
(yi1[j]/hi1[j] - zi1[j]*hi1[j]/6)*(x0[j]-xi0[j]) + \
(yi0[j]/hi1[j] - zi0[j]*hi1[j]/6)*(xi1[j]-x0[j])
return f0
Minimal python3 code:
from scipy import interpolate
if __name__ == '__main__':
x = [ 0, 1, 2, 3, 4, 5]
y = [12,14,22,39,58,77]
# tck : tuple (t,c,k) a tuple containing the vector of knots,
# the B-spline coefficients, and the degree of the spline.
tck = interpolate.splrep(x, y)
print(interpolate.splev(1.25, tck)) # Prints 15.203125000000002
print(interpolate.splev(...other_value_here..., tck))
Based on comment of cwhy and answer by youngmit
In my previous post, I wrote a code based on a Cholesky development to solve the matrix generated by the cubic algorithm. Unfortunately, due to the square root function, it may perform badly on some sets of points (typically a non-uniform set of points).
In the same spirit than previously, there is another idea using the Thomas algorithm (TDMA) (see https://en.wikipedia.org/wiki/Tridiagonal_matrix_algorithm) to solve partially the tridiagonal matrix during its definition loop. However, the condition to use TDMA is that it requires at least that the matrix shall be diagonally dominant. However, in our case, it shall be true since |bi| > |ai| + |ci| with ai = h[i], bi = 2*(h[i]+h[i+1]), ci = h[i+1], with h[i] unconditionally positive. (see https://www.cfd-online.com/Wiki/Tridiagonal_matrix_algorithm_-TDMA(Thomas_algorithm)
I refer again to the document from jingqiu (see my previous post, unfortunately the link is broken, but it is still possible to find it in the cache of the web).
An optimized version of the TDMA solver can be described as follows:
def TDMAsolver(a,b,c,d):
""" This function is licenced under: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
https://creativecommons.org/licenses/by-sa/3.0/
Author raphael valentin
Date 25 Mar 2022
ref. https://www.cfd-online.com/Wiki/Tridiagonal_matrix_algorithm_-_TDMA_(Thomas_algorithm)
"""
n = len(d)
w = np.empty(n-1,float)
g = np.empty(n, float)
w[0] = c[0]/b[0]
g[0] = d[0]/b[0]
for i in range(1, n-1):
m = b[i] - a[i-1]*w[i-1]
w[i] = c[i] / m
g[i] = (d[i] - a[i-1]*g[i-1]) / m
g[n-1] = (d[n-1] - a[n-2]*g[n-2]) / (b[n-1] - a[n-2]*w[n-2])
for i in range(n-2, -1, -1):
g[i] = g[i] - w[i]*g[i+1]
return g
When it is possible to get each individual for ai, bi, ci, di, it becomes easy to combine the definitions of the natural cubic spline interpolator function within these 2 single loops.
def cubic_interpolate(x0, x, y):
""" Natural cubic spline interpolate function
This function is licenced under: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
https://creativecommons.org/licenses/by-sa/3.0/
Author raphael valentin
Date 25 Mar 2022
"""
xdiff = np.diff(x)
dydx = np.diff(y)
dydx /= xdiff
n = size = len(x)
w = np.empty(n-1, float)
z = np.empty(n, float)
w[0] = 0.
z[0] = 0.
for i in range(1, n-1):
m = xdiff[i-1] * (2 - w[i-1]) + 2 * xdiff[i]
w[i] = xdiff[i] / m
z[i] = (6*(dydx[i] - dydx[i-1]) - xdiff[i-1]*z[i-1]) / m
z[-1] = 0.
for i in range(n-2, -1, -1):
z[i] = z[i] - w[i]*z[i+1]
# find index (it requires x0 is already sorted)
index = x.searchsorted(x0)
np.clip(index, 1, size-1, index)
xi1, xi0 = x[index], x[index-1]
yi1, yi0 = y[index], y[index-1]
zi1, zi0 = z[index], z[index-1]
hi1 = xi1 - xi0
# calculate cubic
f0 = zi0/(6*hi1)*(xi1-x0)**3 + \
zi1/(6*hi1)*(x0-xi0)**3 + \
(yi1/hi1 - zi1*hi1/6)*(x0-xi0) + \
(yi0/hi1 - zi0*hi1/6)*(xi1-x0)
return f0
This function gives the same results as the function/class CubicSpline from scipy.interpolate, as we can see in the next plot.
It is possible to implement as well the first and second analytical derivatives that can be described such way:
f1p = -zi0/(2*hi1)*(xi1-x0)**2 + zi1/(2*hi1)*(x0-xi0)**2 + (yi1/hi1 - zi1*hi1/6) + (yi0/hi1 - zi0*hi1/6)
f2p = zi0/hi1 * (xi1-x0) + zi1/hi1 * (x0-xi0)
Then, it is easy to verify that f2p[0] and f2p[-1] are equal to 0, then that the interpolator function yields natural splines.
An additional reference concerning natural spline:
https://faculty.ksu.edu.sa/sites/default/files/numerical_analysis_9th.pdf#page=167
An example of use:
import matplotlib.pyplot as plt
import numpy as np
x = [-8,-4.19,-3.54,-3.31,-2.56,-2.31,-1.66,-0.96,-0.22,0.62,1.21,3]
y = [-0.01,0.01,0.03,0.04,0.07,0.09,0.16,0.28,0.45,0.65,0.77,1]
x = np.asfarray(x)
y = np.asfarray(y)
plt.scatter(x, y)
x_new= np.linspace(min(x), max(x), 10000)
y_new = cubic_interpolate(x_new, x, y)
plt.plot(x_new, y_new)
from scipy.interpolate import CubicSpline
f = CubicSpline(x, y, bc_type='natural')
plt.plot(x_new, f(x_new), label='ref')
plt.legend()
plt.show()
In a conclusion, this updated algorithm shall perform interpolation with better stability and faster than the previous code (O(n)). Associated with numba or cython, it shall be even very fast. Finally, it is totally independent of Scipy.
Important, note that as most of algorithms, it is sometimes useful to normalize the data (e.g. against large or small number values) to get the best results. As well, in this code, I do not check nan values or ordered data.
Whatever, this update was a good lesson learning for me and I hope it can help someone. Let me know if you find something strange.
If you want to get the value
from scipy.interpolate import CubicSpline
import numpy as np
x = [-5,-4.19,-3.54,-3.31,-2.56,-2.31,-1.66,-0.96,-0.22,0.62,1.21,3]
y = [-0.01,0.01,0.03,0.04,0.07,0.09,0.16,0.28,0.45,0.65,0.77,1]
value = 2
#ascending order
if np.any(np.diff(x) < 0):
indexes = np.argsort(x).astype(int)
x = np.array(x)[indexes]
y = np.array(y)[indexes]
f = CubicSpline(x, y, bc_type='natural')
specificVal = f(value).item(0) #f(value) is numpy.ndarray!!
print(specificVal)
If you want to plot the interpolated function.
np.linspace third parameter increase the "accuracy".
from scipy.interpolate import CubicSpline
import numpy as np
import matplotlib.pyplot as plt
x = [-5,-4.19,-3.54,-3.31,-2.56,-2.31,-1.66,-0.96,-0.22,0.62,1.21,3]
y = [-0.01,0.01,0.03,0.04,0.07,0.09,0.16,0.28,0.45,0.65,0.77,1]
#ascending order
if np.any(np.diff(x) < 0):
indexes = np.argsort(x).astype(int)
x = np.array(x)[indexes]
y = np.array(y)[indexes]
f = CubicSpline(x, y, bc_type='natural')
x_new = np.linspace(min(x), max(x), 100)
y_new = f(x_new)
plt.plot(x_new, y_new)
plt.scatter(x, y)
plt.title('Cubic Spline Interpolation')
plt.show()
output:
Yes, as others have already noted, it should be as simple as
>>> from scipy.interpolate import CubicSpline
>>> CubicSpline(x,y)(u)
array(15.203125)
(you can, for example, convert it to float to get the value from a 0d NumPy array)
What has not been described yet is boundary conditions: the default ‘not-a-knot’ boundary conditions work best if you have zero knowledge about the data you’re going to interpolate.
If you see the following ‘features’ on the plot, you can fine-tune the boundary conditions to get a better result:
the first derivative vanishes at boundaries => bc_type=‘clamped’
the second derivative vanishes at boundaries => bc_type='natural'
the function is periodic => bc_type='periodic'
See my article for more details and an interactive demo.

Categories

Resources