Performance issue with Scipy's solve_bvp and coupled differential equations

Performance issue with Scipy's solve_bvp and coupled differential equations - python

I'm facing a problem while trying to implement the coupled differential equation below (also known as single-mode coupling equation) in Python 3.8.3. As for the solver, I am using Scipy's function scipy.integrate.solve_bvp, whose documentation can be read here. I want to solve the equations in the complex domain, for different values of the propagation axis (z) and different values of beta (beta_analysis).
The problem is that it is extremely slow (not manageable) compared with an equivalent implementation in Matlab using the functions bvp4c, bvpinit and bvpset. Evaluating the first few iterations of both executions, they return the same result, except for the resulting mesh which is a lot greater in the case of Scipy. The mesh sometimes even saturates to the maximum value.
The equation to be solved is shown here below, along with the boundary conditions function.
import h5py
import numpy as np
from scipy import integrate
def coupling_equation(z_mesh, a):
ka_z = k # Global
z_a = z # Global
a_p = np.empty_like(a).astype(complex)
for idx, z_i in enumerate(z_mesh):
beta_zf_i = np.interp(z_i, z_a, beta_zf) # Get beta at the desired point of the mesh
ka_z_i = np.interp(z_i, z_a, ka_z) # Get ka at the desired point of the mesh
coupling_matrix = np.empty((2, 2), complex)
coupling_matrix[0] = [-1j * beta_zf_i, ka_z_i]
coupling_matrix[1] = [ka_z_i, 1j * beta_zf_i]
a_p[:, idx] = np.matmul(coupling_matrix, a[:, idx]) # Solve the coupling matrix
return a_p
def boundary_conditions(a_a, a_b):
return np.hstack(((a_a[0]-1), a_b[1]))
Moreover, I couldn't find a way to pass k, z and beta_zf as arguments of the function coupling_equation, given that the fun argument of the solve_bpv function must be a callable with the parameters (x, y). My approach is to define some global variables, but I would appreciate any help on this too if there is a better solution.
The analysis function which I am trying to code is:
def analysis(k, z, beta_analysis, max_mesh):
s11_analysis = np.empty_like(beta_analysis, dtype=complex)
s21_analysis = np.empty_like(beta_analysis, dtype=complex)
initial_mesh = np.linspace(z[0], z[-1], 10) # Initial mesh of 10 samples along L
mesh = initial_mesh
# a_init must be complex in order to solve the problem in a complex domain
a_init = np.vstack((np.ones(np.size(initial_mesh)).astype(complex),
np.zeros(np.size(initial_mesh)).astype(complex)))
for idx, beta in enumerate(beta_analysis):
print(f"Iteration {idx}: beta_analysis = {beta}")
global beta_zf
beta_zf = beta * np.ones(len(z)) # Global variable so as to use it in coupling_equation(x, y)
a = integrate.solve_bvp(fun=coupling_equation,
bc=boundary_conditions,
x=mesh,
y=a_init,
max_nodes=max_mesh,
verbose=1)
# mesh = a.x # Mesh for the next iteration
# a_init = a.y # Initial guess for the next iteration, corresponding to the current solution
s11_analysis[idx] = a.y[1][0]
s21_analysis[idx] = a.y[0][-1]
return s11_analysis, s21_analysis
I suspect that the problem has something to do with the initial guess that is being passed to the different iterations (see commented lines inside the loop in the analysis function). I try to set the solution of an iteration as the initial guess for the following (which must reduce the time needed for the solver), but it is even slower, which I don't understand. Maybe I missed something, because it is my first time trying to solve differential equations.
The parameters used for the execution are the following:
f2 = h5py.File(r'path/to/file', 'r')
k = np.array(f2['k']).squeeze()
z = np.array(f2['z']).squeeze()
f2.close()
analysis_points = 501
max_mesh = 1e6
beta_0 = 3e2;
beta_low = 0; # Lower value of the frequency for the analysis
beta_up = beta_0; # Upper value of the frequency for the analysis
beta_analysis = np.linspace(beta_low, beta_up, analysis_points);
s11_analysis, s21_analysis = analysis(k, z, beta_analysis, max_mesh)
Any ideas on how to improve the performance of these functions? Thank you all in advance, and sorry if the question is not well-formulated, I accept any suggestions about this.
Edit: Added some information about performance and sizing of the problem.
In practice, I can't find a relation that determines de number of times coupling_equation is called. It must be a matter of the internal operation of the solver. I checked the number of callings in one iteration by printing a line, and it happened in 133 ocasions (this was one of the fastests). This must be multiplied by the number of iterations of beta. For the analyzed one, the solver returned this:
Solved in 11 iterations, number of nodes 529.
Maximum relative residual: 9.99e-04
Maximum boundary residual: 0.00e+00
The shapes of a and z_mesh are correlated, since z_mesh is a vector whose length corresponds with the size of the mesh, recalculated by the solver each time it calls coupling_equation. Given that a contains the amplitudes of the progressive and regressive waves at each point of z_mesh, the shape of a is (2, len(z_mesh)).
In terms of computation times, I only managed to achieve 19 iterations in about 2 hours with Python. In this case, the initial iterations were faster, but they start to take more time as their mesh grows, until the point that the mesh saturates to the maximum allowed value. I think this is because of the value of the input coupling coefficients in that point, because it also happens when no loop in beta_analysisis executed (just the solve_bvp function for the intermediate value of beta). Instead, Matlab managed to return a solution for the entire problem in just 6 minutes, aproximately. If I pass the result of the last iteration as initial_guess (commented lines in the analysis function, the mesh overflows even faster and it is impossible to get more than a couple iterations.

Based on semi-random inputs, we can see that max_mesh is sometimes reached. This means that coupling_equation can be called with a quite big z_mesh and a arrays. The problem is that coupling_equation contains a slow pure-Python loop iterating on each column of the arrays. You can speed the computation up a lot using Numpy vectorization. Here is an implementation:
def coupling_equation_fast(z_mesh, a):
ka_z = k # Global
z_a = z # Global
a_p = np.empty(a.shape, dtype=np.complex128)
beta_zf_i = np.interp(z_mesh, z_a, beta_zf) # Get beta at the desired point of the mesh
ka_z_i = np.interp(z_mesh, z_a, ka_z) # Get ka at the desired point of the mesh
# Fast manual matrix multiplication
a_p[0] = (-1j * beta_zf_i) * a[0] + ka_z_i * a[1]
a_p[1] = ka_z_i * a[0] + (1j * beta_zf_i) * a[1]
return a_p
This code provides a similar output with semi-random inputs compared to the original implementation but is roughly 20 times faster on my machine.
Furthermore, I do not know if max_mesh happens to be big with your inputs too and even if this is normal/intended. It may make sense to decrease the value of max_mesh in order to reduce the execution time even more.

Related

Python finding local maxima/minima for multiple polynomials efficiently

I am looking for an efficient way to find local mins for multiple (>1 million) but independent 4th order polynomials in given/ specified ranges/ boundaries.
I have two requirements:
R1: efficient even for 1 million different polynomial equations
R2: the local min is accurate up to 0.01 (i.e. 2dp)
Here is some code I have created using scipy. It's okay but I am wondering if there's any other better packages in performing such a task before I go for parallel programming.
To illustrate my problem, let's start with one polynomial first:
Below I am trying to find the local min of 4x^4 + 6x^3 + 3x^2 + x + 5 within the range (-5, 5).
On my laptop, it takes about 2ms to find the local min (which is at ~ -0.72770502).
The time is alright for one polynomial but I would want something faster as I need to perform this operation over 1 million times regularly.
from scipy import optimize
import numpy as np
# Define a objective and gradient function for 4th order polynomial
# x is the value to be evaluated
# par is a numpy array of len 5 that specifies the polynomial coefficients.
def obj_grad_fun_custom(x,par):
obj = (np.array([x**4,x**3,x**2,x**1,1]) * par).sum()
grad = (np.array([4*x**3,3*x**2,2*x,1]) * par[:-1]).sum()
return obj, grad
# Try minimise an example polynomial of 4x^4 + 6x^3 + 3x^2 + x + 5
# with contrainted bound
res = optimize.minimize(
fun = obj_grad_fun_custom,
x0 = 0,
args=(np.array([4,6,3,1,5])), # polynomial coefficients
jac=True ,
bounds=[(-2, 10)],
tol=1e-10)
print(res.x)
# Timing (this takes about 2 ms for me)
%timeit optimize.minimize(fun = obj_grad_fun_custom, x0 = 0, args=(np.array([4,6,3,1,5])), jac=True, bounds=[(-5, 5)], tol=1e-10)
Below is what I am planning to do regular with 1 million different order 4 polynomials I would want to minimise locally. Hopefully, someone could point me to a more suitable package other than scipy. Or any alternative methods? Thanks!
# Multiple polynomials
result = [] # saving the local minima
poly_sim_no = 1000000 #ideally 1 million or even more
np.random.seed(0)
par_set = np.random.choice(np.arange(10), size=(poly_sim_no, 5), replace=True) #generate some order 4 polynomial coefficients
for a in par_set:
res = optimize.minimize(obj_grad_fun_custom, 0,args=(a), jac=True ,bounds=[(-5, 5)], tol=1e-10)
result.append(res.x)
print(result)

Since you're finding the minimum of a polynomial, you can take advantage of the fact that it's easy to take the derivative of a polynomial, and that there are many good algorithms for finding the roots of a polynomial.
Here's how it works:
First, take the derivative. All of the points which are minimums will have a derivative of zero.
Look for those zeros, aka find the roots of the derivative.
Once we have the list of candidates, check that the solutions are real.
Check that the solutions are within the bounds you set. (I don't know if you added bounds because you actually want the bounds, or to make it go faster. If it's the latter, feel free to remove this step.)
Actually evaluate the candidates with the polynomial and find the smallest one.
Here's the code:
import numpy as np
from numpy.polynomial import Polynomial
def find_extrema(poly, bounds):
deriv = poly.deriv()
extrema = deriv.roots()
# Filter out complex roots
extrema = extrema[np.isreal(extrema)]
# Get real part of root
extrema = np.real(extrema)
# Apply bounds check
lb, ub = bounds
extrema = extrema[(lb <= extrema) & (extrema <= ub)]
return extrema
def find_minimum(poly, bounds):
extrema = find_extrema(poly, bounds)
# Note: initially I tried taking the 2nd derivative to filter out local maxima.
# This ended up being slower than just evaluating the function.
# Either bound could end up being the minimum. Check those too.
extrema = np.concatenate((extrema, bounds))
# Check every candidate by evaluating the polynomial at each possible minimum,
# and picking the minimum.
value_at_extrema = poly(extrema)
minimum_index = np.argmin(value_at_extrema)
return extrema[minimum_index]
# Warning: polynomial expects coeffients in the opposite order that you use.
poly = Polynomial([5,1,3,6,4])
print(find_minimum(poly, (-5, 5)))
This takes 162 microseconds on my computer, making it about 6x faster than the scipy.optimize solution. (The solution shown in the question takes 1.12 ms on my computer.)
Edit: A faster alternative
Here's a faster approach. However, it abandons bounds checking, uses a deprecated API, and is generally harder to read.
p = np.poly1d([4,6,3,1,5]) # Note: polynomials are opposite order of before
def find_minimum2(poly):
roots = np.real(np.roots(poly.deriv()))
return roots[np.argmin(poly(roots))]
print(find_minimum2(p))
This clocks in at 110 microseconds, making it roughly 10x faster than the original.

Closed Form Ridge Regression

I am having trouble understanding the output of my function to implement multiple-ridge regression. I am doing this from scratch in Python for the closed form of the method. This closed form is shown below:
I have a training set X that is 100 rows x 10 columns and a vector y that is 100x1.
My attempt is as follows:
def ridgeRegression(xMatrix, yVector, lambdaRange):
wList = []
for i in range(1, lambdaRange+1):
lambVal = i
# compute the inner values (X.T X + lambda I)
xTranspose = np.transpose(x)
xTx = xTranspose # x
lamb_I = lambVal * np.eye(xTx.shape[0])
# invert inner, e.g. (inner)**(-1)
inner_matInv = np.linalg.inv(xTx + lamb_I)
# compute outer (X.T y)
outer_xTy = np.dot(xTranspose, y)
# multiply together
w = inner_matInv # outer_xTy
wList.append(w)
print(wList)
For testing, I am running it with the first 5 lambda values.
wList becomes 5 numpy.arrays each of length 10 (I'm assuming for the 10 coefficients).
Here is the first of those 5 arrays:
array([ 0.29686755, 1.48420319, 0.36388528, 0.70324668, -0.51604451,
2.39045735, 1.45295857, 2.21437745, 0.98222546, 0.86124358])
My question, and clarification:
Shouldn't there be 11 coefficients, (1 for the y-intercept + 10 slopes)?
How do I get the Minimum Square Error from this computation?
What comes next if I wanted to plot this line?
I think I am just really confused as to what I'm looking at, since I'm still working on my linear-algebra.
Thanks!

First, I would modify your ridge regression to look like the following:
import numpy as np
def ridgeRegression(X, y, lambdaRange):
wList = []
# Get normal form of `X`
A = X.T # X
# Get Identity matrix
I = np.eye(A.shape[0])
# Get right hand side
c = X.T # y
for lambVal in range(1, lambdaRange+1):
# Set up equations Bw = c
lamb_I = lambVal * I
B = A + lamb_I
# Solve for w
w = np.linalg.solve(B,c)
wList.append(w)
return wList
Notice that I replaced your inv call to compute the matrix inverse with an implicit solve. This is much more numerically stable, which is an important consideration for these types of problems especially.
I've also taken the A=X.T#X computation, identity matrix I generation, and right hand side vector c=X.T#y computation out of the loop--these don't change within the loop and are relatively expensive to compute.
As was pointed out by #qwr, the number of columns of X will determine the number of coefficients you have. You have not described your model, so it's not clear how the underlying domain, x, is structured into X.
Traditionally, one might use polynomial regression, in which case X is the Vandermonde Matrix. In that case, the first coefficient would be associated with the y-intercept. However, based on the context of your question, you seem to be interested in multivariate linear regression. In any case, the model needs to be clearly defined. Once it is, then the returned weights may be used to further analyze your data.

Typically to make notation more compact, the matrix X contains a column of ones for an intercept, so if you have p predictors, the matrix is dimensions n by p+1. See Wikipedia article on linear regression for an example.
To compute in-sample MSE, use the definition for MSE: the average of squared residuals. To compute generalization error, you need cross-validation.

Also, you shouldn't take lambVal as an integer. It can be small (close to 0) if the aim is just to avoid numerical error when xTx is ill-conditionned.
I would advise you to use a logarithmic range instead of a linear one, starting from 0.001 and going up to 100 or more if you want to. For instance you can change your code to that:
powerMin = -3
powerMax = 3
for i in range(powerMin, powerMax):
lambVal = 10**i
print(lambVal)
And then you can try a smaller range or a linear range once you figure out what is the correct order of lambVal with your data from cross-validation.

Vectorizing an element-wise

Is it possible to vectorize (or otherwise speedup) an element-wise optimization with NumPy (and SciPy)?
In the most abstract sense, I have a function, y, which is parabolically shaped and could be expressed basically as y=x^2+b*x+z, where x is an array of known values, and I want to find a z that makes the minimum value of y exactly zero (said another way, I want to find a value z that makes my parabola only have one zero). For this, I've chosen to implement a simple bisection-like method. The code for this is below:
import numpy as np
def find_single_root():
x = np.arange(-5, 6,0.1) # domain
z = 1 # initial guess
delta = 1 # initial step size
tol = 0.001 # tolerance
while True:
y = x**2-5*x+z
minimum = np.nanmin(y)
# update z
print(delta)
print(z)
if minimum > 0:
if delta > 0:
delta = -1*delta/2
z += delta
else:
if delta < 0:
delta = -1*delta/2
z += delta
# check if step is smaller than tolerance
if np.abs(delta) < tol:
return z
Now lets say x(v,w), and I want to create a 2D array of z values, where each is optimized. What I have right now is below (note, the new function definition and domain are as follows)
def find_single_root(v, w):
x = np.arange(-5*v/w, 6*w,0.1) # domain
... # rest of the function
vs = np.arange(1,5)
ws = np.arange(1,5)
zs = np.zeros((len(vs),len(ws)))
for i, v in enumerate(vs):
for j, w in enumerate(ws):
zs[i][j] = find_single_root(v,w)
Right now I just have these simple nested for loops, but is there a way I can approach this differently or speed it up with NumPy vectorizing?

Vectorization may be applicable when the computations to be performed are precisely known in advance. Like "take two arrays of numbers, and multiply them pairwise".
Vectorization is not applicable when the computations adapt to the given data. Any kind of optimization algorithm is adaptive, because where you look for the minimum depends on what the function returns. If you have a bunch of functions, and need to find the minimum of each, you are going to have to minimize them one at a time, in a loop. If this process is slow, it's because it takes long to minimize a bunch of function, not because there is a for loop in the program.
Concerning your program, I would try using some of SciPy methods for both minimization and root-finding. Have a function min_of_f(z) which finds the minimum for a given value of parameter z, possibly using minimize_scalar. Then feed min_of_f to a root-finding routine. How long these will take can be controlled by their tolerance parameters (xtol and others).
OP edit:
I wanted to give credit for this as a correct answer, but still provide more information.
I ended up using numpy.vectorize to vectorize without restructuring the problem. Although numpy.vectorize is not meant for increasing performance, the performance in my specific use case was a modest factor of two faster. Applying the same approach to the original problem in the question resulted in virtually no speed up with 100x100 vectors so YMMV.
Even though I wasn't able to vectorize this problem from a speed aspect for the reasons given in the above answer, being able to use plain vector syntax instead of nested for loops all over my code was useful.

High frequency noise at solving differential equation

I'm trying to simulate a simple diffusion based on Fick's 2nd law.
from pylab import *
import numpy as np
gridpoints = 128
def profile(x):
range = 2.
straggle = .1576
dose = 1
return dose/(sqrt(2*pi)*straggle)*exp(-(x-range)**2/2/straggle**2)
x = linspace(0,4,gridpoints)
nx = profile(x)
dx = x[1] - x[0] # use np.diff(x) if x is not uniform
dxdx = dx**2
figure(figsize=(12,8))
plot(x,nx)
timestep = 0.5
steps = 21
diffusion_coefficient = 0.002
for i in range(steps):
coefficients = [-1.785714e-3, 2.539683e-2, -0.2e0, 1.6e0,
-2.847222e0,
1.6e0, -0.2e0, 2.539683e-2, -1.785714e-3]
ccf = (np.convolve(nx, coefficients) / dxdx)[4:-4] # second order derivative
nx = timestep*diffusion_coefficient*ccf + nx
plot(x,nx)
for the first few time steps everything looks fine, but then I start to get high frequency noise, do to build-up from numerical errors which are amplified through the second derivative. Since it seems to be hard to increase the float precision I'm hoping that there is something else that I can do to suppress this? I already increased the number of points that are being used to construct the 2nd derivative.

I don't have the time to study your solution in detail, but it seems that you are solving the partial differential equation with a forward Euler scheme. This is pretty easy to implement, as you show, but this can become numerical instable if your timestep is too small. Your only solution is to reduce the timestep or to increase the spatial resolution.
The easiest way to explain this is for the 1-D case: assume your concentration is a function of spatial coordinate x and timestep i. If you do all the math (write down your equations, substitute the partial derivatives with finite differences, should be pretty easy), you will probably get something like this:
C(x, i+1) = [1 - 2 * k] * C(x, i) + k * [C(x - 1, i) + C(x + 1, i)]
so the concentration of a point on the next step depends on its previous value and the ones of its two neighbors. It is not too hard to see that when k = 0.5, every point gets replaced by the average of its two neighbors, so a concentration profile of [...,0,1,0,1,0,...] will become [...,1,0,1,0,1,...] on the next step. If k > 0.5, such a profile will blow up exponentially. You calculate your second order derivative with a longer convolution (I effectively use [1,-2,1]), but I guess that does not change anything for the instability problem.
I don't know about normal diffusion, but based on experience with thermal diffusion, I would guess that k scales with dt * diffusion_coeff / dx^2. You thus have to chose your timestep small enough so that your simulation does not become instable. To make the simulation stable, but still as fast as possible, chose your parameters so that k is a bit smaller than 0.5. Something similar can be derived for 2-D and 3-D cases. The easiest way to achieve this is to increase dx, since your total calculation time will scale with 1/dx^3 for a linear problem, 1/dx^4 for 2-D problems, and even 1/dx^5 for 3-D problems.
There are better methods to solve diffusion equations, I believe that Crank Nicolson is at least standard for solving heat-equations (which is also a diffusion problem). The 'problem' is that this is an implicit method, which means that you have to solve a set of equations to calculate your 'concentration' at the next timestep, which is a bit of a pain to implement. But this method is guaranteed to be numerical stable, even for big timesteps.

How to find all zeros of a function using numpy (and scipy)?

Suppose I have a function f(x) defined between a and b. This function can have many zeros, but also many asymptotes. I need to retrieve all the zeros of this function. What is the best way to do it?
Actually, my strategy is the following:
I evaluate my function on a given number of points
I detect whether there is a change of sign
I find the zero between the points that are changing sign
I verify if the zero found is really a zero, or if this is an asymptote
U = numpy.linspace(a, b, 100) # evaluate function at 100 different points
c = f(U)
s = numpy.sign(c)
for i in range(100-1):
if s[i] + s[i+1] == 0: # oposite signs
u = scipy.optimize.brentq(f, U[i], U[i+1])
z = f(u)
if numpy.isnan(z) or abs(z) > 1e-3:
continue
print('found zero at {}'.format(u))
This algorithm seems to work, except I see two potential problems:
It will not detect a zero that doesn't cross the x axis (for example, in a function like f(x) = x**2) However, I don't think it can occur with the function I'm evaluating.
If the discretization points are too far, there could be more that one zero between them, and the algorithm could fail finding them.
Do you have a better strategy (still efficient) to find all the zeros of a function?
I don't think it's important for the question, but for those who are curious, I'm dealing with characteristic equations of wave propagation in optical fiber. The function looks like (where V and ell are previously defined, and ell is an positive integer):
def f(u):
w = numpy.sqrt(V**2 - u**2)
jl = scipy.special.jn(ell, u)
jl1 = scipy.special.jnjn(ell-1, u)
kl = scipy.special.jnkn(ell, w)
kl1 = scipy.special.jnkn(ell-1, w)
return jl / (u*jl1) + kl / (w*kl1)

Why are you limited to numpy? Scipy has a package that does exactly what you want:
http://docs.scipy.org/doc/scipy/reference/optimize.nonlin.html
One lesson I've learned: numerical programming is hard, so don't do it :)
Anyway, if you're dead set on building the algorithm yourself, the doc page on scipy I linked (takes forever to load, btw) gives you a list of algorithms to start with. One method that I've used before is to discretize the function to the degree that is necessary for your problem. (That is, tune \delta x so that it is much smaller than the characteristic size in your problem.) This lets you look for features of the function (like changes in sign). AND, you can compute the derivative of a line segment (probably since kindergarten) pretty easily, so your discretized function has a well-defined first derivative. Because you've tuned the dx to be smaller than the characteristic size, you're guaranteed not to miss any features of the function that are important for your problem.
If you want to know what "characteristic size" means, look for some parameter of your function with units of length or 1/length. That is, for some function f(x), assume x has units of length and f has no units. Then look for the things that multiply x. For example, if you want to discretize cos(\pi x), the parameter that multiplies x (if x has units of length) must have units of 1/length. So the characteristic size of cos(\pi x) is 1/\pi. If you make your discretization much smaller than this, you won't have any issues. To be sure, this trick won't always work, so you may need to do some tinkering.

I found out it's relatively easy to implement your own root finder using the scipy.optimize.fsolve.
Idea: Find any zeroes from interval (start, stop) and stepsize step by calling the fsolve repeatedly with changing x0. Use relatively small stepsize to find all the roots.
Can only search for zeroes in one dimension (other dimensions must be fixed). If you have other needs, I would recommend using sympy for calculating the analytical solution.
Note: It may not always find all the zeroes, but I saw it giving relatively good results. I put the code also to a gist, which I will update if needed.
import numpy as np
import scipy
from scipy.optimize import fsolve
from matplotlib import pyplot as plt
# Defined below
r = RootFinder(1, 20, 0.01)
args = (90, 5)
roots = r.find(f, *args)
print("Roots: ", roots)
# plot results
u = np.linspace(1, 20, num=600)
fig, ax = plt.subplots()
ax.plot(u, f(u, *args))
ax.scatter(roots, f(np.array(roots), *args), color="r", s=10)
ax.grid(color="grey", ls="--", lw=0.5)
plt.show()
Example output:
Roots: [ 2.84599497 8.82720551 12.38857782 15.74736542 19.02545276]
zoom-in:
RootFinder definition
import numpy as np
import scipy
from scipy.optimize import fsolve
from matplotlib import pyplot as plt
class RootFinder:
def __init__(self, start, stop, step=0.01, root_dtype="float64", xtol=1e-9):
self.start = start
self.stop = stop
self.step = step
self.xtol = xtol
self.roots = np.array([], dtype=root_dtype)
def add_to_roots(self, x):
if (x < self.start) or (x > self.stop):
return # outside range
if any(abs(self.roots - x) < self.xtol):
return # root already found.
self.roots = np.append(self.roots, x)
def find(self, f, *args):
current = self.start
for x0 in np.arange(self.start, self.stop + self.step, self.step):
if x0 < current:
continue
x = self.find_root(f, x0, *args)
if x is None: # no root found.
continue
current = x
self.add_to_roots(x)
return self.roots
def find_root(self, f, x0, *args):
x, _, ier, _ = fsolve(f, x0=x0, args=args, full_output=True, xtol=self.xtol)
if ier == 1:
return x[0]
return None
Test function
The scipy.special.jnjn does not exist anymore, but I created similar test function for the case.
def f(u, V=90, ell=5):
w = np.sqrt(V ** 2 - u ** 2)
jl = scipy.special.jn(ell, u)
jl1 = scipy.special.yn(ell - 1, u)
kl = scipy.special.kn(ell, w)
kl1 = scipy.special.kn(ell - 1, w)
return jl / (u * jl1) + kl / (w * kl1)

The main problem I see with this is if you can actually find all roots --- as have already been mentioned in comments, this is not always possible. If you are sure that your function is not completely pathological (sin(1/x) was already mentioned), the next one is what's your tolerance to missing a root or several of them. Put differently, it's about to what length you are prepared to go to make sure you did not miss any --- to the best of my knowledge, there is no general method to isolate all the roots for you, so you'll have to do it yourself. What you show is a reasonable first step already. A couple of comments:
Brent's method is indeed a good choice here.
First of all, deal with the divergencies. Since in your function you have Bessels in the denominators, you can first solve for their roots -- better look them up in e.g., Abramovitch and Stegun (Mathworld link). This will be a better than using an ad hoc grid you're using.
What you can do, once you've found two roots or divergencies, x_1 and x_2, run the search again in the interval [x_1+epsilon, x_2-epsilon]. Continue until no more roots are found (Brent's method is guaranteed to converge to a root, provided there is one).
If you cannot enumerate all the divergencies, you might want to be a little more careful in verifying a candidate is indeed a divergency: given x don't just check that f(x) is large, check that, e.g. |f(x-epsilon/2)| > |f(x-epsilon)| for several values of epsilon (1e-8, 1e-9, 1e-10, something like that).
If you want to make sure you don't have roots which simply touch zero, look for the extrema of the function, and for each extremum, x_e, check the value of f(x_e).

I've also encountered this problem to solve equations like f(z)=0 where f was an holomorphic function. I wanted to be sure not to miss any zero and finally developed an algorithm which is based on the argument principle.
It helps to find the exact number of zeros lying in a complex domain. Once you know the number of zeros, it is easier to find them. There are however two concerns which must be taken into account :
Take care about multiplicity : when solving (z-1)^2 = 0, you'll get two zeros as z=1 is counting twice
If the function is meromorphic (thus contains poles), each pole reduce the number of zero and break the attempt to count them.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.