Cubic Spline interpolation implementation - python

In the following code I am trying to implement the following:
- Write a function naturalSpline that implements cubic spline interpolation with natural boundary conditions.
- Use a tridiagonal solver to solve the arising tridiagonal system for the first derivatives.
- The prototype of the function should read yy=naturalSpline(x,y,xx), where (x,y) are the input points and data, and xx are the points where the data should be interpolated.
I figured first I would start with the second bullet point, creating the tridiagonal solver. So this is just the Thomas algorithm. I spent some time to create this part of the code and I have formatted it below. But now I am trying to finish the first and third bullet points but I am not sure how to use what I have done already to finish those. Looking for some help with this! Thanks in advance.
import numpy as np
def TDMA(a,b,c,d):
    # a: sub-diagonal (n-1), b: diagonal (n), c: super-diagonal (n-1), d: right-hand side (n)
    n = len(d)
    w = np.zeros(n-1,float)
    g = np.zeros(n, float)
    p = np.zeros(n,float)
    w[0] = c[0]/b[0]
    g[0] = d[0]/b[0]
    # forward sweep
    for i in range(1,n-1):
        w[i] = c[i]/(b[i] - a[i-1]*w[i-1])
    for i in range(1,n):
        g[i] = (d[i] - a[i-1]*g[i-1])/(b[i] - a[i-1]*w[i-1])
    # back substitution
    p[n-1] = g[n-1]
    for i in range(n-1,0,-1):
        p[i-1] = g[i-1] - w[i-1]*p[i]
    return p
A = np.array([[10,2,0,0],[3,10,4,0],[0,1,7,5], [0,0,3,4]],dtype=float)
a = np.array([3.,1,3])
b = np.array([10.,10.,7.,4.])
c = np.array([2.,4.,5.])
d = np.array([3,4,5,6.])
print (TDMA(a, b, c, d))
This gives the correct output; I even tested it against np.linalg.solve(A, d) to make sure it was correct:
[ 0.14877589 0.75612053 -1.00188324 2.25141243]

For each interval [x_k, x_(k+1)], you can solve the four equations
p_k(x_k) = f(x_k) = y_k
p_k'(x_k) = f'(x_k) = d_k
p_k(x_(k+1)) = f(x_(k+1)) = y_(k+1)
p_k'(x_(k+1)) = f'(x_(k+1)) = d_(k+1)
(without checking your code, I assume that this is what you did).
From this, you can construct a dict
{'polynomials': [ [a_0, ..., d_0], ..., [a_24, ..., d_24] ],
'knots': [x_0, ..., x_24]}
For each x of your 250 points, you check for which k the point x is in the interval [x_k, x_(k+1)] and evaluate p_k(x).
All of this is straightforward mathematics and Python coding. If something is not clear, you are better off learning more about both fields instead of getting specialized advice on this website.
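To connect the pieces, here is a minimal sketch of how the TDMA routine from the question could feed a naturalSpline function. It assumes the Hermite form described above (unknowns are the first derivatives d_k at the knots) and natural boundary conditions p''(x_0) = p''(x_(n-1)) = 0, which give the rows 2*d_0 + d_1 = 3*delta_0 and d_(n-2) + 2*d_(n-1) = 3*delta_(n-2), with delta_k = (y_(k+1) - y_k)/h_k. The helper names h, delta, c2 and c3 are illustrative choices, not part of the original post.

```python
import numpy as np

def TDMA(a, b, c, d):
    """Thomas algorithm with the question's conventions:
    a sub-diagonal (n-1), b diagonal (n), c super-diagonal (n-1), d rhs (n)."""
    n = len(d)
    w = np.zeros(n - 1)
    g = np.zeros(n)
    p = np.zeros(n)
    w[0] = c[0] / b[0]
    g[0] = d[0] / b[0]
    for i in range(1, n - 1):
        w[i] = c[i] / (b[i] - a[i - 1] * w[i - 1])
    for i in range(1, n):
        g[i] = (d[i] - a[i - 1] * g[i - 1]) / (b[i] - a[i - 1] * w[i - 1])
    p[n - 1] = g[n - 1]
    for i in range(n - 1, 0, -1):
        p[i - 1] = g[i - 1] - w[i - 1] * p[i]
    return p

def naturalSpline(x, y, xx):
    """Natural cubic spline through (x, y), evaluated at xx (x strictly increasing)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xx = np.asarray(xx, dtype=float)
    n = len(x)
    h = np.diff(x)              # interval widths h_k
    delta = np.diff(y) / h      # secant slopes delta_k

    a = np.empty(n - 1)
    b = np.empty(n)
    c = np.empty(n - 1)
    rhs = np.empty(n)
    # natural boundary rows (zero second derivative at both ends)
    b[0], c[0], rhs[0] = 2.0, 1.0, 3.0 * delta[0]
    a[n - 2], b[n - 1], rhs[n - 1] = 1.0, 2.0, 3.0 * delta[n - 2]
    # interior rows: continuity of the second derivative at each interior knot
    a[:n - 2] = 1.0 / h[:-1]
    b[1:n - 1] = 2.0 * (1.0 / h[:-1] + 1.0 / h[1:])
    c[1:] = 1.0 / h[1:]
    rhs[1:n - 1] = 3.0 * (delta[:-1] / h[:-1] + delta[1:] / h[1:])

    d = TDMA(a, b, c, rhs)      # first derivatives at the knots

    # locate the interval of each evaluation point, then evaluate the cubic
    k = np.clip(np.searchsorted(x, xx, side='right') - 1, 0, n - 2)
    s = xx - x[k]
    c2 = (3.0 * delta[k] - 2.0 * d[k] - d[k + 1]) / h[k]
    c3 = (d[k] + d[k + 1] - 2.0 * delta[k]) / h[k] ** 2
    return y[k] + s * (d[k] + s * (c2 + s * c3))
```

Points outside [x_0, x_(n-1)] are extrapolated with the first or last cubic here; that is a design choice of this sketch, and raising an error instead would be equally reasonable.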

Related

Data structures and format in matlab vs python?

I have come to the conclusion that I simply do not understand Python as well as I thought.
After a great many tries in Python, I tried writing the same thing in Matlab and it just worked.
My conclusion is that the way data structures work is a lot different from what I expect, and I cannot see what that difference is.
For example, in Python I could have a structure that looks like [[1], [2], [3]] where in Matlab it would be [1,2,3]. Running a loop over i in Python would yield only [1], while the same loop in Matlab would yield the sequence.
I remedied this by using np.hstack to get [1,2,3], so I fixed that issue, but I suspect that the rest of my problem right now is also structure-based. In the Matlab code, I get a coupling and the numbers converge. However, in my Python code, all of them diverge.
Is there a great resource on how data structures work in Python that isn't the Python docs, maybe something that compares Matlab and Python structures? Or does anyone have an idea of how I should restructure my Python code?
EDIT: the code is an attempt at Euler integration of coupled oscillators, where each oscillator couples to all of its neighbours.
It takes in a frequency, w, since this is the solution for a coupling constant of K = 0.
A loop runs from each oscillator i over each neighbour j.
dTheta[i] is the frequency of the current oscillator in the loop.
k/N indicates the coupling strength based on the number of neighbours, and theta[j,c] and theta[i,c] are respectively the previous angles of the neighbour and the current oscillator.
A new angle is then assigned based on the integration step of the frequency.
in Matlab, I wrote the following
%% Initialize items
k = 1; %coupling factor
N = 20 ; %Number of oscillators
tend = 10;
dt= tend/200;
t = 0:dt:tend;
theta=zeros(N,length(t));
theta(:,1)=abs(2*pi*rand(N,1));
dTheta=zeros(N,1);
w = .1.*ones(N,1); %Set the frequency of the oscillators
y = zeros(size(theta));
%% Calculations
for c=2:length(t)
    dTheta=w;
    for i=1:N
        for j=1:N
            dTheta(i)=dTheta(i)+((k/N)*sin(theta(j,c-1)-theta(i,c-1))); %Generate delta theta.
        end
    end
    theta(:,c)=theta(:,c-1)+(dTheta*dt); %Euler forward step
    c/length(t);
end
for c=1:length(t)
    y(:,c)=sin((5*t(c))+theta(:,c)); %Generate the y.
end
in python I have
import numpy as np
import matplotlib.pyplot as plt
k = 1
N = 10
tend = 20
dt = tend*4
t = np.linspace(0,tend,dt)
theta = np.zeros((N,len(t)))
theta[:,0] = 2*np.pi*abs(np.random.randint(0, high=N ,size=(N)))/10
dTheta = np.hstack(np.zeros((N,1)))
w = np.hstack(20*np.ones((N,1)))
y = np.zeros(theta.shape)
for c in range(1,len(t)):
    dTheta=w
    for i in range(N):
        for j in range(N):
            dTheta[i] = dTheta[i] + ((k/N)*np.sin(theta[j,c-1] - theta[i,c-1]))
    theta[:,c] = theta[:,c-1] + (dTheta*dt)
    c/len(t)
for c in range(len(t)):
    y[:,c] = np.sin((5*t[c]) + theta[:,c])
plt.figure()
for c in range(N):
    plt.plot(t,y[c,:])
plt.figure()
for c in range(N):
    plt.plot(t,theta[c,:])
plt.show()
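Comparing the two listings line by line suggests three differences that would plausibly explain the divergence. First, in NumPy `dTheta=w` does not copy: both names refer to the same array, so the `dTheta[i] = ...` updates also change w and the "constant" frequency drifts at every step, whereas Matlab assignment behaves like a copy. Second, `dt = tend*4` gives 80, which is used both as the number of samples and as the Euler step, while the Matlab version uses a step of tend/200. Third, w is 20 here but 0.1 in Matlab. The following is a sketch of a port that mirrors the Matlab script under those assumptions; the function name simulate, its arguments and the use of np.random.default_rng are illustrative choices.

```python
import numpy as np

def simulate(N=20, k=1.0, tend=10.0, n_steps=200, omega=0.1, seed=0):
    """Forward-Euler integration of N all-to-all coupled oscillators."""
    rng = np.random.default_rng(seed)
    dt = tend / n_steps                        # step size, as in Matlab: tend/200
    t = np.linspace(0.0, tend, n_steps + 1)    # same grid as 0:dt:tend
    theta = np.zeros((N, t.size))
    theta[:, 0] = 2 * np.pi * rng.random(N)    # random initial phases
    w = omega * np.ones(N)                     # natural frequencies (1-D array)
    for c in range(1, t.size):
        dTheta = w.copy()                      # copy: plain `dTheta = w` would alias w
        for i in range(N):
            # sum of the coupling terms over all oscillators j
            dTheta[i] += (k / N) * np.sum(np.sin(theta[:, c - 1] - theta[i, c - 1]))
        theta[:, c] = theta[:, c - 1] + dTheta * dt   # Euler forward step
    return t, theta, w

t, theta, w = simulate()
y = np.sin(5 * t + theta)                      # broadcasting replaces the final loop
```

Creating 1-D arrays directly with np.ones(N) and np.zeros(N) also removes the need for np.hstack.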

Fast way to build array from originating arrays of different sizes

I am building an array of shape (1000,100,100,100) from two arrays of shapes (1000,100,100,100) and (100,100,100). For this, I am using a for-loop over the first index (0 - 1000). However, my code (below) is still pretty slow and, as a beginner, I was wondering whether there is a more efficient way to do it.
n_train = 1000
Nx = 100
Ny = 100
Nt = 100
x = np.linspace(-Nx, Nx, 100)
y = np.linspace(-Ny, Ny, 100)
t = np.linspace(0, Nt-1, 100)
def gw(xx, yy, tt):
    num1 = tt - np.sqrt(xx**2+yy**2)
    denom = (tt**2-xx**2-yy**2)
    if denom < 0:
        denom1 = 0
    else:
        denom1 = np.sqrt(denom)
    kk = np.heaviside(num1,1)/(2*np.pi*denom1+1)
    return (kk)
# Slow FOR-LOOP
for i_train in range (n_train):
    ugreen = np.array([gw(i, j, k) for k in t for j in y for i in x])
    Ugreen = ugreen.reshape(Nt, Ny, Nx)
    prob = randrange(2)
    Utot = UN[i_train,:,:,:] + Ugreen/1.75*prob
    Utot = (Utot - np.min(Utot))/(np.max(Utot)-np.min(Utot))
    Utot_green = 10*Ugreen/1.75*prob
    P[i_train,:,:,:] = Utot
    Pg[i_train,:,:,:]= Utot_green
I think several issues make this difficult. First of all, your code snippet does not run as posted, which IMHO is problematic in any programming language and on the corresponding forums.
Second, you do not explain what you are actually trying to calculate with your code, so I had to translate your code back to mathematics.
This brings me to the numpy-specific problem of this question.
Long story short: if you had given some kind of mathematical equation in terms of vectors and matrices, it would have been much easier to answer.
randrange, UN, P and Pg are not defined in your code.
In each loop iteration you are evaluating Ugreen again, but I don't see that it changes in your loop. Move it outside the loop and save time.
gw() can be written vectorized (and it is now easily seen what you are calculating; this should look very similar to the handwritten equation in your notes):
import numpy as np
from random import randrange
def gw(xx,yy,tt):
    return (0<=tt-np.sqrt(xx**2+yy**2))/(2*np.pi*np.sqrt(np.clip(tt**2-xx**2-yy**2,0,None))+1)
# axes ordered (t, y, x) to match reshape(Nt, Ny, Nx) in the question
Ugreen = gw(x[None,None,:],y[None,:,None],t[:,None,None])
prob = randrange(2)#I assume this is some scalar
Utot = UN + Ugreen[None,:,:,:]/1.75*prob
Utot = (Utot - np.min(Utot,axis=0)[None,...])/(np.max(Utot,axis=0)[None,...]-np.min(Utot,axis=0)[None,...])
Here is a working example of what I think you are trying to calculate:
import numpy as np
def gw(x,y,t):
    return (0<=t-np.sqrt(x**2+y**2))/(2*np.pi*np.sqrt(np.clip(t**2-x**2-y**2,0,None))+1)
n_train = 1000
Nx = 100
Ny = 100
Nt = 100
x = np.linspace(-Nx, Nx, 100)
y = np.linspace(-Ny, Ny, 100)
t = np.arange(Nt)
# axes ordered (t, y, x) to match the shape of Utot below
Ugreen = gw(x[None,None,:],y[None,:,None],t[:,None,None])
prob = 2
Utot = np.random.random((n_train,Nt,Ny,Nx))
Utot += Ugreen[None,:,:,:]/1.75*prob
Utot /= (np.max(Utot,axis=0)[None,...]-np.min(Utot,axis=0)[None,...])
Utot -= np.min(Utot,axis=0)[None,...]/(np.max(Utot,axis=0)[None,...]-np.min(Utot,axis=0)[None,...])
I had to rearrange the Utot evaluation into in-place steps in order not to run into memory issues.
Even so, on my machine it runs within a few seconds.
It is certainly possible to optimize this further, and I hope for some responses from other users to learn new things.
Some general hints on numpy:
IMHO, if you want to see the true practical power of numpy, it's better to forget what you know about numerical programming from other languages than to stick with loop-wise thinking.
My experience is that, yes, numpy is very fast, which is very nice, but it also makes your code extremely short and compact and, best of all for scientific work, extremely close to whatever complicated equation you might want to solve. One should start by implementing equations as close to the paper as possible and optimize only where and when necessary.
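One caveat on the answer's code: it normalizes along axis 0 and uses a single scalar prob, whereas the loop in the question draws prob once per training sample and rescales each sample by its own minimum and maximum. If that per-sample behaviour is what is wanted, a sketch could look as follows. The function name build, the rng argument and the small array sizes are assumptions made so that the example runs quickly; UN stands in for the asker's undefined input array.

```python
import numpy as np

def gw(xx, yy, tt):
    # vectorized form of the question's gw(); the radicand is clipped at 0
    radicand = np.clip(tt**2 - xx**2 - yy**2, 0, None)
    step = np.heaviside(tt - np.sqrt(xx**2 + yy**2), 1)
    return step / (2 * np.pi * np.sqrt(radicand) + 1)

def build(UN, x, y, t, rng):
    """Return (P, Pg) with the per-sample semantics of the original loop."""
    # Ugreen[t, y, x], the same layout as ugreen.reshape(Nt, Ny, Nx)
    Ugreen = gw(x[None, None, :], y[None, :, None], t[:, None, None])
    # one 0/1 draw per training sample, shaped for broadcasting
    prob = rng.integers(0, 2, size=UN.shape[0])[:, None, None, None]
    Utot = UN + Ugreen[None] / 1.75 * prob
    lo = Utot.min(axis=(1, 2, 3), keepdims=True)   # per-sample minimum
    hi = Utot.max(axis=(1, 2, 3), keepdims=True)   # per-sample maximum
    P = (Utot - lo) / (hi - lo)
    Pg = 10 * Ugreen[None] / 1.75 * prob
    return P, Pg

# small illustrative sizes instead of (1000, 100, 100, 100)
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 4)
y = np.linspace(-3, 3, 5)
t = np.arange(6.0)
UN = rng.random((3, t.size, y.size, x.size))
P, Pg = build(UN, x, y, t, rng)
```

At the full size each (1000,100,100,100) float64 array takes about 8 GB, so in practice the same steps may need to be applied in chunks along the first axis.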

How do you feed Scipy's bvp only the BCs you have?

The only example/docs I can find are on the Scipy docs page.
To test, I'm looking at the time-independent Schrödinger equation in a 1D infinite potential well. This has a neat analytic solution, found by solving the DE and inserting the boundary conditions ψ(0) = 0, ψ(L) = 0, plus the condition that the solution is normalized to 1. But this question applies to solving any DE where the BCs we know aren't all initial values.
You can solve it numerically with Scipy's solve_ivp by starting with ψ(0) = 0 and cheating to place ψ'(0) appropriately using the analytic solution. You can then use a shooting method to find an appropriate E value, e.g. from the normalization condition above.
These are two sets of BCs: ψ(0) = 0 for both, normalization for both, plus a second value of ψ for the analytic approach and an initial value of ψ' for the ivp approach. Scipy's solve_bvp seems to offer a numerical solution using the first set of BCs (so without cheating by inserting ψ'), but I can't get it working. This pseudocode describes the problem, and is how I expect the API to behave:
bcs = {0: (0, None), L: (0, None)} # Two BCs on ψ; no BCs on derivative
x_span = (0, L)
sol = solve_bvp(rhs, bcs, x_span)
In reality, the code looks something like this, and I can't get it to work:
def bc(ψ_a, ψ_b):
    return np.array([ψ_a[0], ψ_b[0]])
x_span = (0, L)
x_eval = np.linspace(x_span[0], x_span[1], int(1e5))
x_guess = np.array([0, L])
ψ_guess = np.array([[0, 1], [0, -1]])
res = solve_bvp(rhs_1d, bc, x_guess, ψ_guess)
I've no idea how to build the bc function, and don't know why the guesses are set up the way they are. I'm also unsure how I can give a guess for the value of ψ without also inserting a guess for ψ' (the docs imply you can). Also of note: the docs show an example implying you can use solve_bvp with a normalization BC as well, but I'm not sure how to approach that (the example is too sparse).
The equivalent and working ivp code, for reference (compare to my solve_bvp pseudocode):
ψ_0 = (0, sqrt(2/L) * n*π/L)
x_span = (0, L)
sol = solve_ivp(rhs_1d, x_span, ψ_0)
For the eigenvalue problem
-u''+V(x)u = c*u
with boundary conditions
u(0)=0=u(L)
and normalization
int(u(x)^2, x=0 to L)=1
set up the integral as a third component. With the eigenvalue as a parameter, these are 4 dimensions, allowing for 4 boundary conditions; the additional 2 are that the integral at 0 is zero and that the integral at L has value 1.
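import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import solve_bvp
# some length
L = 10
# some potential function
def V(x): return 1+(2*x-L)**2
# the ODE function; y = (u, u', running integral of u**2), p = [eigenvalue]
def odesys(x,y,p):
    u,v,S = y; c=p[0]
    return [v, (V(x)-c)*u , u**2 ]
# the boundary conditions
def boundary(y0, yL, c):
    return [ y0[0], yL[0], y0[2], yL[2]-1 ]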
With the initial guess you select approximately what eigenfunction/eigenvalue you will get, more or less.
n = 11
w = (np.pi*n)/L
x_init = np.linspace(0,L,4*n+1)
u_init = np.sin(w*x_init)
v_init = np.cos(w*x_init)*w
y_init = [ u_init, v_init, x_init/L ]
There is no need to put too many points into the guess, just enough that the structure of the first component is faithfully represented.
Then call the solver with the prepared data. Note that the default tolerance is 1e-3; if you want better accuracy, you have to allow for a finer subdivision (max_nodes). If everything runs fine, plot the solution.
res = solve_bvp(odesys, boundary, x_init, y_init, p=[w**2], max_nodes=10000, tol=1e-6)
print(res.message)
if res.success:
    x_disp = np.linspace(0,L,3001)
    y_disp = res.sol(x_disp)
    plt.plot(x_disp, y_disp[0])
    plt.title(r"eigenfunction to eigenvalue $\lambda=%.6f$"%res.p[0])
    plt.grid(); plt.show()
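As a sanity check of the same setup on the question's actual test case, the sketch below sets V = 0 (the infinite well) and compares the eigenvalue returned by solve_bvp with the analytic value (n*pi/L)**2. The values L = 2 and n = 1 are arbitrary illustrative choices, and the initial guess is simply the known analytic solution, so this only demonstrates that the four boundary conditions are consistent, not that the solver finds the state from a poor guess.

```python
import numpy as np
from scipy.integrate import solve_bvp

L = 2.0   # well width (illustrative value)
n = 1     # quantum number of the targeted state (illustrative value)

def odesys(x, y, p):
    u, v, S = y                  # u, u', running integral of u**2
    c = p[0]                     # eigenvalue, treated as unknown parameter
    return np.vstack([v, -c * u, u**2])   # V = 0 inside the well

def boundary(y0, yL, p):
    # u(0) = 0, u(L) = 0, integral starts at 0 and ends at 1 (normalization)
    return np.array([y0[0], yL[0], y0[2], yL[2] - 1.0])

w = n * np.pi / L
amp = np.sqrt(2.0 / L)           # amplitude of the normalized analytic solution
x_init = np.linspace(0.0, L, 4 * n + 7)
y_init = np.vstack([
    amp * np.sin(w * x_init),
    amp * w * np.cos(w * x_init),
    x_init / L - np.sin(2 * w * x_init) / (2 * w * L),
])

res = solve_bvp(odesys, boundary, x_init, y_init, p=[w**2], tol=1e-6)
exact = (n * np.pi / L) ** 2
print(res.success, res.p[0], exact)
```

Note that no boundary condition is placed on the derivative: the two extra conditions come entirely from the integral component.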

How to minimize a multivariable function using scipy

So I have the function
f(x) = I_0*(exp(Q*x/(n*K*T)) - 1)
Where Q, K and T are constants, for the sake of clarity I'll add the values
Q = 1.6x10^(-19)
K = 1.38x10^(-23)
T = 77.6
and n and I_0 are the two parameters that I'm trying to fit.
My xdata is a list of 50 datapoints, as is my ydata. So far this is my code:
from __future__ import division
import scipy.optimize as optimize
import numpy
xdata = numpy.array([1.07,1.07994,1.08752,1.09355,
                     1.09929,1.10536,1.10819,1.11321,
                     1.11692,1.12099,1.12435,1.12814,
                     1.13181,1.13594,1.1382,1.14147,
                     1.14443,1.14752,1.15023,1.15231,
                     1.15514,1.15763,1.15985,1.16291,1.16482])
ydata = [0.00205,
         0.004136,0.006252,0.008252,0.010401,
         0.012907,0.014162,0.016498,0.018328,
         0.020426,0.022234,0.024363,0.026509,
         0.029024,0.030457,0.032593,0.034576,
         0.036725,0.038703,0.040223,0.042352,
         0.044289,0.046043,0.048549,0.050146]
# xdata and ydata are experimental data: xdata is voltage and ydata is current
def f(x,I0,N):
    # I0 = 7.85E-07
    # N = 3.185413895
    Q = 1.66E-19
    K = 1.38065E-23
    T = 77.3692
    return I0*(numpy.e**((Q*x)/(N*K*T))-1)
result = optimize.curve_fit(f, xdata,ydata) # trying to fit I0 and N
But the result doesn't give suitably optimized parameters.
Any help would be hugely appreciated. I realize there may be something obvious I am missing; I just can't see what it is!
I have tried this, and for some reason, if you throw out those constants so that the function becomes
def f(x,I0,N):
    return I0*(numpy.exp(x/N)-1)
you get something reasonable:
1.86901114e-13, 4.41838309e-02
It's true that it works better when we get rid of the constants. Define the function as:
def f(x,A,B):
    return A*(np.e**(B*x)-1)
and fit it with curve_fit. You then get A, which is directly I0 (A=I0), and B, from which you can obtain N simply as N=Q/(B*K*T). I managed to get a pretty good fit.
I think that with the constants included, the parameters end up on very different scales, and the algorithm struggles to converge from its default starting point.
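A minimal sketch of that approach, using the data and constants from the question's code. The one addition not in the answers is the starting point p0: it is taken from a straight-line fit to log(y), which assumes exp(B*x) is much larger than 1 over the data range. The names model, slope and intercept are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

# constants exactly as written in the question's code
Q = 1.66e-19
K = 1.38065e-23
T = 77.3692

xdata = np.array([1.07, 1.07994, 1.08752, 1.09355, 1.09929, 1.10536, 1.10819,
                  1.11321, 1.11692, 1.12099, 1.12435, 1.12814, 1.13181, 1.13594,
                  1.1382, 1.14147, 1.14443, 1.14752, 1.15023, 1.15231, 1.15514,
                  1.15763, 1.15985, 1.16291, 1.16482])
ydata = np.array([0.00205, 0.004136, 0.006252, 0.008252, 0.010401, 0.012907,
                  0.014162, 0.016498, 0.018328, 0.020426, 0.022234, 0.024363,
                  0.026509, 0.029024, 0.030457, 0.032593, 0.034576, 0.036725,
                  0.038703, 0.040223, 0.042352, 0.044289, 0.046043, 0.048549,
                  0.050146])

def model(x, A, B):
    # A plays the role of I0, B lumps together Q/(N*K*T)
    return A * (np.exp(B * x) - 1.0)

# rough starting point: log(y) is approximately log(A) + B*x when exp(B*x) >> 1
slope, intercept = np.polyfit(xdata, np.log(ydata), 1)
p0 = (np.exp(intercept), slope)

popt, pcov = curve_fit(model, xdata, ydata, p0=p0, maxfev=10000)
I0, B = popt
N = Q / (B * K * T)       # recover the ideality factor from the lumped parameter
print("I0 =", I0, "N =", N)
```

Without p0, curve_fit starts every parameter at 1, which is many orders of magnitude away from a plausible I0 and is a likely reason the original call did not converge to anything useful.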

PyMC, deterministic nodes in loops

I'm a bit new to Python and PyMC, and making rapid progress. But I'm just confused about the use of setting deterministic values of a 2D matrix. I have a model below, that I cannot get to parse correctly. The problem relates to setting the value theta in the model.
import numpy as np
import pymc
Define known variables:
N = 2
T = 10
tau = 1
Define the model, which I cannot get to parse correctly. It's the allocation of theta that I'm having trouble with. The aim is to get samples of D and x. Theta is just an intermediate variable, but I need to keep it as it's used in more complex variations of the model.
def NAFCgenerator():
    D = np.empty(T, dtype=object)
    theta = np.empty([N,T], dtype=object)
    x = np.empty([N,T], dtype=object)
    # true location of signal
    for t in range(T):
        D[t] = pymc.DiscreteUniform('D_%i' % t, lower=0, upper=N-1)
    for t in range(T):
        for n in range(N):
            @pymc.deterministic(plot=False)
            def temp_theta(dt=D[t], n=n):
                return dt==n
            theta[n,t] = temp_theta
            x[n,t] = pymc.Normal('x_%i,%i' % (n,t),
                                 mu=theta[n,t], tau=tau)
    return locals()
** EDIT **
Explicit indexing is useful for me as I'm learning both PyMC and Python. But it seems that extracting MCMC samples is a bit clunky, e.g.
D0values = pymc_generator.trace('D_0')[:]
I am probably missing something, but I did manage to get a vectorised version working:
# Approach 1b - actually quite promising
def NAFCgenerator():
    # NOTE TO SELF. It's important to declare these as objects
    D = np.empty(T, dtype=object)
    theta = np.empty([N,T], dtype=object)
    x = np.empty([N,T], dtype=object)
    # true location of signal
    D = pymc.Categorical('D', spatial_prior, size=T)
    # displayed stimuli
    @pymc.deterministic(plot=False)
    def theta(D=D):
        theta = np.zeros([N,T])
        theta[0,D==0]=1
        theta[1,D==1]=1
        return theta
    #for n in range(N):
    x = pymc.Normal('x', mu=theta, tau=tau)
    return locals()
This makes it easier to get at the MCMC samples, for example:
Dvalues = pymc_generator.trace('D')[:]
In PyMC2, when creating deterministic nodes with decorators, the default is to take the node name from the function name. The solution is simple: specify the node name as a parameter for the decorator.
@pymc.deterministic(name='temp_theta_%d_%d'%(t,n), plot=False)
def temp_theta(dt=D[t], n=n):
    return dt==n
theta[n,t] = temp_theta
Here is a notebook that puts this in context.
