Runge-Kutta fourth order method. Integrating backwards - python

I am using a Runge-Kutta fourth order method to solve numerically the usual equation of motion of a background scalar field in curved spacetime with a quartic potential:
$\phi^{''}=-3\left(1+\frac{H^{'}}{3H}\right)\phi^{'}-\lambda\phi^3/H^2$,
$'$ denoting the derivative w.r.t. the e-folds number $\textrm{d}N=H\textrm{d}t$ and, from the Friedmann equation:
$H^2=\frac{\lambda \phi^4}{4}\frac{1}{3M_{Pl}^2-(1/2)\phi^{'2}}$;
$H^{'}=-\frac{1}{2M_{Pl}^2}H\phi^{'2}$.
The problem comes when integrating backwards, using as initial conditions the final values obtained from the forward integration. The outcome blows up without matching the values obtained earlier, during the forward integration. I simply do not understand where the problem is, as neither the equation nor the code is anything unusual. First, I integrated from 0 to 64 e-folds; then I simply reversed the integration direction.
I attach the code too:
import copy
import numpy as np

def rk4trial(f, v0, t0, tf, n, V):
    t = np.linspace(t0, tf, n)
    h = t[1] - t[0]
    v = np.array((n + 1) * [v0])
    for j in range(n):
        V.append(v[j])
        V[j] = copy.deepcopy(V[j])
        k1 = f(v[j], t[j]) * h
        k2 = f(v[j] + (1/2)*k1, t[j] + (1/2)*h) * h
        k3 = f(v[j] + (1/2)*k2, t[j] + (1/2)*h) * h
        k4 = f(v[j] + k3, t[j] + h) * h
        v[j+1] = v[j] + (k1 + 2*k2 + 2*k3 + k4) / 6
    return V, t, h
def Fdet(v, t):
    phi, sigma = v  # sigma = phi'
    H = (((lamb/4) * phi**4) / (3 * mpl**2 - (1/2) * sigma**2))**(1/2)
    HH = -((1/2) * sigma**2) * (1/mpl**2)  # HH = H'/H
    return np.array([sigma, -3*(1 + HH/3)*sigma - lamb*phi**3/H**2])
PS: This question has been posted here too: https://scicomp.stackexchange.com/questions/33583/runge-kutta-fourth-order-method-integrating-backwards, where the equations are shown in detail.
EDIT: Unnecessary parts in the code removed.
EDIT: In response to @LutzL, I attach the plots of $\phi/M_{Pl}$ and $\phi'$ after integrating forward (solid lines) and backwards (dashed lines), following that suggestion. As you can see, there is a sudden deviation from the forward-integration results that I cannot explain.

I would reduce the RK4 method to the minimum necessary. It is not necessary to have a v array partially duplicating the contents of the V array, so:
def rk4trial(f, v0, t0, tf, n, V):
    t = np.linspace(t0, tf, n)
    h = t[1] - t[0]
    v = v0
    for j in range(n):
        V.append(v)
        k1 = f(v, t[j]) * h
        k2 = f(v + 0.5*k1, t[j] + 0.5*h) * h
        k3 = f(v + 0.5*k2, t[j] + 0.5*h) * h
        k4 = f(v + k3, t[j] + h) * h
        v = v + (k1 + 2*k2 + 2*k3 + k4) / 6
    return V, t, h
There are no copy issues here, as v is constructed anew in every step, so that the objects appended to the return array are all separate.
The backward integration should be as simple as the forward integration,
V1, t1, h1 = rk4trial(Fdet,v0,t0,tf,n,[])
V2, t2, h2 = rk4trial(Fdet,V1[-1],tf,t0,n,[])
and V2[k] should be the same as V1[-k-1] within the bounds of the method error. Large differences are only to be expected for stiff ODEs.
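After the two calls above, this can be checked directly; a minimal sketch (assuming Fdet and the initial data are defined as in the question):
import numpy as np
# V2 reversed should retrace V1 up to the accumulated O(h^4) method error.
err = np.max(np.abs(np.array(V2)[::-1] - np.array(V1)))
print("max |forward - backward| =", err)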

Related

Performance issue with Scipy's solve_bvp and coupled differential equations

I'm facing a problem while trying to implement the coupled differential equation below (also known as single-mode coupling equation) in Python 3.8.3. As for the solver, I am using Scipy's function scipy.integrate.solve_bvp, whose documentation can be read here. I want to solve the equations in the complex domain, for different values of the propagation axis (z) and different values of beta (beta_analysis).
The problem is that it is extremely slow (not manageable) compared with an equivalent implementation in Matlab using the functions bvp4c, bvpinit and bvpset. Evaluating the first few iterations of both executions, they return the same result, except for the resulting mesh which is a lot greater in the case of Scipy. The mesh sometimes even saturates to the maximum value.
The equation to be solved is shown here below, along with the boundary conditions function.
import h5py
import numpy as np
from scipy import integrate

def coupling_equation(z_mesh, a):
    ka_z = k  # Global
    z_a = z   # Global
    a_p = np.empty_like(a).astype(complex)
    for idx, z_i in enumerate(z_mesh):
        beta_zf_i = np.interp(z_i, z_a, beta_zf)  # Get beta at the desired point of the mesh
        ka_z_i = np.interp(z_i, z_a, ka_z)        # Get ka at the desired point of the mesh
        coupling_matrix = np.empty((2, 2), complex)
        coupling_matrix[0] = [-1j * beta_zf_i, ka_z_i]
        coupling_matrix[1] = [ka_z_i, 1j * beta_zf_i]
        a_p[:, idx] = np.matmul(coupling_matrix, a[:, idx])  # Solve the coupling matrix
    return a_p

def boundary_conditions(a_a, a_b):
    return np.hstack(((a_a[0] - 1), a_b[1]))
Moreover, I couldn't find a way to pass k, z and beta_zf as arguments of the function coupling_equation, given that the fun argument of solve_bvp must be a callable with the parameters (x, y). My approach is to define some global variables, but I would appreciate any help on this too if there is a better solution.
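One alternative to globals that I have been considering is to bind the extra data with a lambda, since solve_bvp only needs a callable of the form fun(x, y). A rough, untested sketch of that idea (coupling_equation_params is a hypothetical variant of my function that takes the data explicitly):
def coupling_equation_params(z_mesh, a, z_arr, k_arr, beta_zf_arr):
    # Same computation as coupling_equation above, but with the data passed in
    # explicitly instead of being read from globals.
    a_p = np.empty_like(a).astype(complex)
    for idx, z_i in enumerate(z_mesh):
        beta_zf_i = np.interp(z_i, z_arr, beta_zf_arr)
        ka_z_i = np.interp(z_i, z_arr, k_arr)
        m = np.array([[-1j * beta_zf_i, ka_z_i],
                      [ka_z_i, 1j * beta_zf_i]], dtype=complex)
        a_p[:, idx] = m @ a[:, idx]
    return a_p

sol = integrate.solve_bvp(lambda x, y: coupling_equation_params(x, y, z, k, beta_zf),
                          boundary_conditions, mesh, a_init,
                          max_nodes=max_mesh, verbose=1)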
The analysis function which I am trying to code is:
def analysis(k, z, beta_analysis, max_mesh):
    s11_analysis = np.empty_like(beta_analysis, dtype=complex)
    s21_analysis = np.empty_like(beta_analysis, dtype=complex)
    initial_mesh = np.linspace(z[0], z[-1], 10)  # Initial mesh of 10 samples along L
    mesh = initial_mesh
    # a_init must be complex in order to solve the problem in a complex domain
    a_init = np.vstack((np.ones(np.size(initial_mesh)).astype(complex),
                        np.zeros(np.size(initial_mesh)).astype(complex)))
    for idx, beta in enumerate(beta_analysis):
        print(f"Iteration {idx}: beta_analysis = {beta}")
        global beta_zf
        beta_zf = beta * np.ones(len(z))  # Global variable so as to use it in coupling_equation(x, y)
        a = integrate.solve_bvp(fun=coupling_equation,
                                bc=boundary_conditions,
                                x=mesh,
                                y=a_init,
                                max_nodes=max_mesh,
                                verbose=1)
        # mesh = a.x  # Mesh for the next iteration
        # a_init = a.y  # Initial guess for the next iteration, corresponding to the current solution
        s11_analysis[idx] = a.y[1][0]
        s21_analysis[idx] = a.y[0][-1]
    return s11_analysis, s21_analysis
I suspect that the problem has something to do with the initial guess that is being passed between the different iterations (see the commented lines inside the loop in the analysis function). I tried to set the solution of one iteration as the initial guess for the following one (which should reduce the time needed by the solver), but it is even slower, which I don't understand. Maybe I missed something, because it is my first time trying to solve differential equations.
The parameters used for the execution are the following:
f2 = h5py.File(r'path/to/file', 'r')
k = np.array(f2['k']).squeeze()
z = np.array(f2['z']).squeeze()
f2.close()
analysis_points = 501
max_mesh = 1e6
beta_0 = 3e2
beta_low = 0  # Lower value of the frequency for the analysis
beta_up = beta_0  # Upper value of the frequency for the analysis
beta_analysis = np.linspace(beta_low, beta_up, analysis_points)
s11_analysis, s21_analysis = analysis(k, z, beta_analysis, max_mesh)
Any ideas on how to improve the performance of these functions? Thank you all in advance, and sorry if the question is not well-formulated, I accept any suggestions about this.
Edit: Added some information about performance and sizing of the problem.
In practice, I can't find a relation that determines the number of times coupling_equation is called; it must be a matter of the internal operation of the solver. I checked the number of calls in one iteration by printing a line, and it happened on 133 occasions (this was one of the fastest). This must be multiplied by the number of iterations over beta. For the analysed one, the solver returned this:
Solved in 11 iterations, number of nodes 529.
Maximum relative residual: 9.99e-04
Maximum boundary residual: 0.00e+00
The shapes of a and z_mesh are correlated, since z_mesh is a vector whose length corresponds with the size of the mesh, recalculated by the solver each time it calls coupling_equation. Given that a contains the amplitudes of the progressive and regressive waves at each point of z_mesh, the shape of a is (2, len(z_mesh)).
In terms of computation times, I only managed to complete 19 iterations in about 2 hours with Python. In this case, the initial iterations were faster, but they start to take more time as their mesh grows, until the point where the mesh saturates at the maximum allowed value. I think this is because of the value of the input coupling coefficients at that point, because it also happens when no loop over beta_analysis is executed (just the solve_bvp call for the intermediate value of beta). Instead, Matlab managed to return a solution for the entire problem in just about 6 minutes. If I pass the result of the last iteration as initial_guess (commented lines in the analysis function), the mesh overflows even faster and it is impossible to get more than a couple of iterations.
Based on semi-random inputs, we can see that max_mesh is sometimes reached. This means that coupling_equation can be called with quite large z_mesh and a arrays. The problem is that coupling_equation contains a slow pure-Python loop iterating over each column of the arrays. You can speed the computation up a lot using Numpy vectorization. Here is an implementation:
def coupling_equation_fast(z_mesh, a):
    ka_z = k  # Global
    z_a = z   # Global
    a_p = np.empty(a.shape, dtype=np.complex128)
    beta_zf_i = np.interp(z_mesh, z_a, beta_zf)  # Get beta at the desired point of the mesh
    ka_z_i = np.interp(z_mesh, z_a, ka_z)        # Get ka at the desired point of the mesh
    # Fast manual matrix multiplication
    a_p[0] = (-1j * beta_zf_i) * a[0] + ka_z_i * a[1]
    a_p[1] = ka_z_i * a[0] + (1j * beta_zf_i) * a[1]
    return a_p
This code provides a similar output with semi-random inputs compared to the original implementation but is roughly 20 times faster on my machine.
Furthermore, I do not know if max_mesh happens to be big with your inputs too and even if this is normal/intended. It may make sense to decrease the value of max_mesh in order to reduce the execution time even more.
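For instance, a quick consistency check along these lines (a sketch; the sizes and the semi-random data here are arbitrary placeholders) can confirm that the two implementations agree:
import numpy as np

rng = np.random.default_rng(0)
z = np.linspace(0.0, 1.0, 1000)   # globals read by both implementations
k = rng.random(1000)
beta_zf = rng.random(1000)
z_mesh = np.sort(rng.random(5000))
a = rng.random((2, 5000)) + 1j * rng.random((2, 5000))

assert np.allclose(coupling_equation(z_mesh, a), coupling_equation_fast(z_mesh, a))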

Matlab's "OutputFcn" implementation in Python for ODE solving

I am trying to solve an ODE using Python's solve_ivp. However, I want to change the right hand side of my ODE dynamically based on a comparison between the current solution and previous solution. The idea behind this is that my right hand side is a vector field, and I want to ensure directionality of the vector field by reversing the right hand side based on the direction of the previous solution.
The implementation for this is as follows: I want to check the dot product in the right hand side function definition between the previous solution and the vector field. If the dot product is negative the right hand side is multiplied by -1.
I therefore need to access the previous state of the ODE solver and use it in comparison with the current iteration. In MATLAB there is the possibility of using "OutputFcn" while solving an ODE. This function is called after every iteration of the integrator. In the function it is therefore possible to simply extract the state as a variable and use it in the next iteration. I have not been able to find something similar for Python.
def RHS(timesnotused, x):
    out = solve_ivp(doubleGyreVar, [0, T/2, T], [x[0], x[1], 1, 0, 0, 1], rtol=1e-10, atol=1e-10)
    output = out.y
    J = output[2:, -1].reshape(2, 2)
    CG = np.matmul(J.T, J)
    lambdas, xis = np.linalg.eig(CG)
    xi_1 = xis[np.argmin(lambdas)]
    xi_2 = xis[np.argmax(lambdas)]
    lambda_1 = np.min(lambdas)
    lambda_2 = np.max(lambdas)
    alpha = ((lambda_2 - lambda_1) / (lambda_2 + lambda_1))**2
    sign = 1
    if np.dot(xlast, xi_1) < 0:
        sign = -1
    return (sign * alpha * xi_1)
As can be seen I want "xlast" to be the previous solution, and check it with xi_1 of the current iteration. Somehow xlast needs to be updated every iteration.
If the differential equation you want to solve is
dx/dt = sign(g(x)) * F(x)
for some sufficiently smooth functions g, F, then you have a discontinuous right side where all advanced numerical solvers will produce nonsense as soon as they approach this jump singularity.
The clearest method to solve such a multi-phase system is to present only continuous right sides to the numerical solver and to handle the phase change via the event mechanism that is also present in scipy.integrate.solve_ivp.
I explored a mechanism to do that with the tools of scipy in a similar problem with a sign function producing a discontinuity, in "odeint returns wrong results for an ODE including discrete function".
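A minimal sketch of that pattern (on a toy hysteresis system, not the poster's vector field): integrate one phase with a continuous right side, stop at the phase boundary with a terminal event, switch the right side, and restart from where the event fired.
import numpy as np
from scipy.integrate import solve_ivp

# Toy multi-phase system (a thermostat with hysteresis): T' = -T + u, where u
# switches between 8 (heating) and 0 (cooling) when T crosses 6 resp. 2.
def run(T0=0.0, t_end=20.0):
    heating, t, T = True, 0.0, np.array([T0])
    ts, Ts = [t], [T0]
    while t < t_end:
        rhs = (lambda t, y: -y + 8.0) if heating else (lambda t, y: -y)
        event = (lambda t, y: y[0] - 6.0) if heating else (lambda t, y: y[0] - 2.0)
        event.terminal = True              # stop the integration at the phase boundary
        sol = solve_ivp(rhs, (t, t_end), T, events=event, max_step=0.05)
        ts.extend(sol.t[1:]); Ts.extend(sol.y[0, 1:])
        t, T = sol.t[-1], sol.y[:, -1]
        if sol.status == 1:                # a terminal event fired: change phase
            heating = not heating
    return np.array(ts), np.array(Ts)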

How may I get more values of a variable than my linspace while solving ODE in Python? (EDIT)

To check some of my results more easily I used an Excel sheet to make a few diagrams. However, I noticed something really awkward.
EDIT :
So let me present the problem in another way; I found something that represents what I don't understand in my code.
import numpy as np
from scipy.integrate import odeint

A = []

def F(y, z):
    global A
    a = y[0]
    b = y[1]
    A.append(a)
    return [a, b]

y0 = [1, 1]
z = np.linspace(0, 1, 101)
y = odeint(F, y0, z)
print(len(z), len(A))
The question is why the lengths of z and A are different (e.g. 101 and 55)?
To me, during the solve, a should be evaluated len(z) times, and so A should have len(z) entries. So it looks like the linspace has no effect on the solving of the equations, or perhaps I haven't understood the usage of linspace in Python.
The solution via odeint uses an implicit linear multi-step method with adaptive internal time stepping. This is implemented via a PECE predictor-corrector scheme; the "E" there stands for "evaluation", which means that in each internal integration step the ODE function is called twice. You might get fewer internal steps than the input time list has entries; the output array is interpolated from the internal time steps, so that you can have multiple output values per internal step. But the other extreme is also possible: to reach the requested tolerances, the internal step size may be so small that one output time step requires multiple internal steps.
If the problem were more stiff, there would be even more calls, periodically for the numerical approximation of the Jacobian, and possibly multiple calls per step of the Newton-like corrector step or just multiple simple correction steps, which is then called PE(CE)d.
For comparison, look at the explicit RK4 method: there you have 4 evaluations of the ODE function per time step. The Dormand-Prince method of ode45 has 6+1 evaluations per time step; however, there the internal time steps need not correspond to the time sample list passed to the method, and the requested output samples are interpolated from the internal steps.
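To make this visible in the example above, one can simply count the evaluations instead of collecting the state (a small sketch, not from the original post):
import numpy as np
from scipy.integrate import odeint

calls = 0

def F(y, z):
    global calls
    calls += 1
    return [y[0], y[1]]

z = np.linspace(0, 1, 101)
y = odeint(F, [1, 1], z)
# len(z) and calls will generally differ: odeint chooses its own internal
# steps and interpolates the solution onto z afterwards.
print(len(z), calls)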

scipy.linalg.sparse.eigsh does not work for generalised eigenvalues

I'm working on a machine learning project which involves doing a Principal Component Analysis on some labeled data and using those labels to extract more valuable information from the data.
To do that, I'm calculating a scatter matrix for each class, and for each pair of classes I need to solve a generalised eigenvalue problem for their scatter matrices, as follows:
S_i * v = w * (S_j + b*I) * v
where b is a multiplier and I is the identity matrix. Now, this is the code in python:
from numpy import mean, eye, shape
from scipy.sparse.linalg import eigsh

jeigenvalues = eigsh(scatter_j, k=10, return_eigenvectors=False, maxiter=100)
print('eigenvalues made')
beta = betaMult * mean(jeigenvalues)
print(beta)
print(scatter_j + beta * eye(shape(x_data)[1]))
w, v = eigsh(scatter_i, M=scatter_j + beta * eye(shape(x_data)[1]), k=int(numberOfEVs/45), maxiter=100)
print(i, j, 'done')
numberOfEVs is 90 in my current code (so that it's divisible by 45).
But the problem is, at the line where I use the eigsh for the aforementioned formula, it never gives me an answer. It keeps eating more and more memory without even completing a single iteration (I set its maxiter input to 1, and it still didn't give an answer). When I don't give the eigsh function the M argument (which is the matrix on the right side of the generalised EV problem and it is assumed to be "I" when not specified), it works correctly. But when M is provided, it becomes unresponsive.
Any ideas?
EDIT: The scatter matrices have rather small entries, mostly around 10^-5. I've also tried multiplying the left hand side by the inverse of the RHS matrix, and again it's having the same issue (goes on for a long time without an answer). Is the smallness of these entries the issue? How can I solve it, then?
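For matrices that fit comfortably in memory, a dense generalised solver can at least serve as a cross-check; a minimal sketch using scipy.linalg.eigh (not the sparse routine from the question), assuming scatter_i, scatter_j and beta are defined as above:
import numpy as np
from scipy.linalg import eigh

n = scatter_j.shape[0]
# Dense generalised symmetric problem: scatter_i @ v = w * (scatter_j + beta*I) @ v
w_all, v_all = eigh(scatter_i, scatter_j + beta * np.eye(n))
w_top, v_top = w_all[-10:], v_all[:, -10:]  # largest few eigenpairs, analogous to eigsh(..., k=10)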

resampling, interpolating matrix

I'm trying to interpolate some data for the purpose of plotting. For instance, given N data points, I'd like to be able to generate a "smooth" plot, made up of 10*N or so interpolated data points.
My approach is to generate an N-by-10*N matrix and compute the inner product of the original vector with the matrix I generated, yielding a 1-by-10*N vector. I've already worked out the math I'd like to use for the interpolation, but my code is pretty slow. I'm pretty new to Python, so I'm hopeful that some of the experts here can give me some ideas of ways I can try to speed up my code.
I think part of the problem is that generating the matrix requires 10*N^2 calls to the following function:
def sinc(x):
    import math
    try:
        return math.sin(math.pi * x) / (math.pi * x)
    except ZeroDivisionError:
        return 1.0
(This comes from sampling theory. Essentially, I'm attempting to recreate a signal from its samples, and upsample it to a higher frequency.)
The matrix is generated by the following:
def resampleMatrix(Tso, Tsf, o, f):
    from numpy import array as npar
    retval = []
    for i in range(f):
        retval.append([sinc((Tsf*i - Tso*j)/Tso) for j in range(o)])
    return npar(retval)
I'm considering breaking up the task into smaller pieces because I don't like the idea of an N^2 matrix sitting in memory. I could probably make 'resampleMatrix' into a generator function and do the inner product row-by-row, but I don't think that will speed up my code much until I start paging stuff in and out of memory.
Thanks in advance for your suggestions!
This is upsampling. See Help with resampling/upsampling for some example solutions.
A fast way to do this (for offline data, like your plotting application) is to use FFTs. This is what SciPy's native resample() function does. It assumes a periodic signal, though, so it's not exactly the same. See this reference:
Here’s the second issue regarding time-domain real signal interpolation, and it’s a big deal indeed. This exact interpolation algorithm provides correct results only if the original x(n) sequence is periodic within its full time interval.
Your function assumes the signal's samples are all 0 outside of the defined range, so the two methods will diverge away from the center point. If you pad the signal with lots of zeros first, it will produce a very close result (there are several more zeros past the edge of the plot, not shown here).
Cubic interpolation won't be correct for resampling purposes. This example is an extreme case (near the sampling frequency), but as you can see, cubic interpolation isn't even close. For lower frequencies it should be pretty accurate.
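A minimal sketch of that FFT-based approach with zero-padding, using scipy.signal.resample (the padding length here is an arbitrary choice):
import numpy as np
from scipy.signal import resample

def upsample_fft(samples, factor=10, pad=100):
    # Zero-pad to reduce periodicity artefacts, resample via FFT, then trim.
    padded = np.concatenate([np.zeros(pad), samples, np.zeros(pad)])
    up = resample(padded, factor * len(padded))
    return up[factor * pad : factor * (pad + len(samples))]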
If you want to interpolate data in a quite general and fast way, splines or polynomials are very useful. Scipy has the scipy.interpolate module, which is very useful. You can find many examples in the official pages.
Your question isn't entirely clear; you're trying to optimize the code you posted, right?
Re-writing sinc like this should speed it up considerably. This implementation avoids checking that the math module is imported on every call, doesn't do attribute access three times, and replaces exception handling with a conditional expression:
from math import sin, pi

def sinc(x):
    return (sin(pi * x) / (pi * x)) if x != 0 else 1.0
You could also try avoiding creating the matrix twice (and holding it twice in parallel in memory) by creating a numpy.array directly (not from a list of lists):
import numpy

def resampleMatrix(Tso, Tsf, o, f):
    retval = numpy.zeros((f, o))
    for i in xrange(f):
        for j in xrange(o):
            retval[i][j] = sinc((Tsf*i - Tso*j)/Tso)
    return retval
(replace xrange with range on Python 3.0 and above)
Finally, you can create rows with numpy.arange as well as calling numpy.sinc on each row or even on the entire matrix:
def resampleMatrix(Tso, Tsf, o, f):
    retval = numpy.zeros((f, o))
    for i in xrange(f):
        retval[i] = numpy.arange(Tsf*i / Tso, Tsf*i / Tso - o, -1.0)
    return numpy.sinc(retval)
This should be significantly faster than your original implementation. Try different combinations of these ideas and test their performance, see which works out the best!
I'm not quite sure what you're trying to do, but there are some speedups you can apply when creating the matrix. Braincore's suggestion to use numpy.sinc is a first step, but the second is to realize that numpy functions want to work on numpy arrays, where they can do loops at C speed, much faster than on individual elements.
import numpy

def resampleMatrix(Tso, Tsf, o, f):
    retval = numpy.sinc((Tsf*numpy.arange(f)[:, numpy.newaxis]
                         - Tso*numpy.arange(o)[numpy.newaxis, :]) / Tso)
    return retval
The trick is that by indexing the aranges with numpy.newaxis, numpy converts the array of shape f to one of shape f x 1, and the array of shape o to shape 1 x o. At the subtraction step, numpy will "broadcast" each input to act as an f x o shaped array and then do the subtraction. ("Broadcast" is numpy's term, reflecting the fact that no additional copy is made to stretch the f x 1 array to f x o.)
Now the numpy.sinc can iterate over all the elements in compiled code, much quicker than any for-loop you could write.
(There's an additional speed-up available if you do the division before the subtraction, since dividing first cancels the multiplication by Tso in the second term.)
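That variant would look roughly like this (a sketch of the same idea):
import numpy

def resampleMatrix(Tso, Tsf, o, f):
    # Divide first: (Tsf*i - Tso*j)/Tso == (Tsf/Tso)*i - j
    return numpy.sinc((Tsf/Tso) * numpy.arange(f)[:, numpy.newaxis]
                      - numpy.arange(o)[numpy.newaxis, :])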
The only drawback is that you now pay for an extra Nx10*N array to hold the difference. This might be a dealbreaker if N is large and memory is an issue.
Otherwise, you should be able to write this using numpy.convolve. From what little I just learned about sinc-interpolation, I'd say you want something like numpy.convolve(orig,numpy.sinc(numpy.arange(j)),mode="same"). But I'm probably wrong about the specifics.
If your only interest is to 'generate a "smooth" plot' I would just go with a simple polynomial spline curve fit:
For any two adjacent data points, the coefficients of a third-degree polynomial can be computed from the coordinates of those data points and the two additional points to their left and right (disregarding boundary points). This will generate points on a nice smooth curve with a continuous first derivative. There's a straightforward formula for converting 4 coordinates to 4 polynomial coefficients, but I don't want to deprive you of the fun of looking it up ;o).
Here's a minimal example of 1d interpolation with scipy -- not as much fun as reinventing, but.
The plot looks like sinc, which is no coincidence: try google spline resample "approximate sinc". (Presumably less local / more taps ⇒ better approximation, but I have no idea how local UnivariateSplines are.)
""" interpolate with scipy.interpolate.UnivariateSpline """
from __future__ import division
import numpy as np
from scipy.interpolate import UnivariateSpline
import pylab as pl
N = 10
H = 8
x = np.arange(N+1)
xup = np.arange( 0, N, 1/H )
y = np.zeros(N+1); y[N//2] = 100
interpolator = UnivariateSpline( x, y, k=3, s=0 ) # s=0 interpolates
yup = interpolator( xup )
np.set_printoptions( 1, threshold=100, suppress=True ) # .1f
print "yup:", yup
pl.plot( x, y, "green", xup, yup, "blue" )
pl.show()
Added feb 2010: see also basic-spline-interpolation-in-a-few-lines-of-numpy
Small improvement. Use the built-in numpy.sinc(x) function which runs in compiled C code.
Possible larger improvement: Can you do the interpolation on the fly (as the plotting occurs)? Or are you tied to a plotting library that only accepts a matrix?
I recommend that you check your algorithm, as it is a non-trivial problem. Specifically, I suggest you gain access to the article "Function Plotting Using Conic Splines" (IEEE Computer Graphics and Applications) by Hu and Pavlidis (1991). Their algorithm implementation allows for adaptive sampling of the function, such that the rendering time is smaller than with regularly spaced approaches.
The abstract follows:
A method is presented whereby, given a mathematical description of a function, a conic spline approximating the plot of the function is produced. Conic arcs were selected as the primitive curves because there are simple incremental plotting algorithms for conics already included in some device drivers, and there are simple algorithms for local approximations by conics. A split-and-merge algorithm for choosing the knots adaptively, according to shape analysis of the original function based on its first-order derivatives, is introduced.
