Numerical Accuracy with scipy.optimize.curve_fit in Python

Numerical Accuracy with scipy.optimize.curve_fit in Python - python

I am having issues with the numerical accuracy of scipy.optimize.curve_fit function in python. It seems to me that I can only get ~ 8 digits of accuracy when I desire ~ 15 digits. I have some data (at this point artificially created) made from the following data creation:
where term 1 ~ 10^-3, term 2 ~ 10^-6, and term 3 is ~ 10^-11. In the data, I vary A randomly (it is a Gaussian error). I then try to fit this to a model:
where lambda is a constant, and I only fit alpha (it is a parameter in the function). Now what I would expect is to see a linear relationship between alpha and A because terms 1 and 2 in the data creation are also in the model, so they should cancel perfectly;
So;
However, what happens is for small A (~10^-11 and below), alpha does not scale with A, that is to say, as A gets smaller and smaller, alpha levels out and remains constant.
For reference, I call the following:
op, pcov = scipy.optimize.curve_fit(model, xdata, ydata, p0=None, sigma=sig)
My first thought was that I was not using double precision, but I am pretty sure that python automatically creates numbers in double precision. Then I thought it was an issue with the documentation perhaps that cuts off the digits? Anyways, I could put my code in here but it is sort of complicated. Is there a way to ensure that the curve fitting function saves my digits?
Thank you so much for your help!
EDIT: The below is my code:
# Import proper packages
import numpy as np
import numpy.random as npr
import scipy as sp
import scipy.constants as spc
import scipy.optimize as spo
from matplotlib import pyplot as plt
from numpy import ndarray as nda
from decimal import *
# Declare global variables
AU = 149597871000.0
test_lambda = 20*AU
M_Sun = (1.98855*(sp.power(10.0,30.0)))
M_Jupiter = (M_Sun/1047.3486)
test_jupiter_mass = M_Jupiter
test_sun_mass = M_Sun
rad_jup = 5.2*AU
ran = np.linspace(AU, 100*AU, num=100)
delta_a = np.power(10.0, -11.0)
chi_limit = 118.498
# Model acceleration of the spacecraft from the sun (with Yukawa term)
def model1(distance, A):
return (spc.G)*(M_Sun/(distance**2.0))*(1 +A*(np.exp(-distance/test_lambda))) + (spc.G)*(M_Jupiter*distance)/((distance**2.0 + rad_jup**2.0)**(3.0/2.0))
# Function that creates a data point for test 1
def data1(distance, dela):
return (spc.G)*(M_Sun/(distance**2.0) + (M_Jupiter*distance)/((distance**2.0 + rad_jup**2.0)**(3.0/2.0))) + dela
# Generates a list of 100 data sets varying by ~&a for test 1
def generate_data1():
data_list = []
for i in range(100):
acc_lst = []
for dist in ran:
x = data1(dist, npr.normal(0, delta_a))
acc_lst.append(x)
data_list.append(acc_lst)
return data_list
# Generates a list of standard deviations at each distance from the sun. Since &a is constant, the standard deviation of each point is constant
def generate_sig():
sig = []
for i in range(100):
sig.append(delta_a)
return sig
# Finds alpha for test 1, since we vary &a in test 1, we need to generate new data for each time we find alpha
def find_alpha1(data_list, sig):
alphas = []
for data in data_list:
op, pcov = spo.curve_fit(model1, ran, data, p0=None, sigma=sig)
alphas.append(op[0])
return alphas
# Tests the dependence of alpha on &a and plots the dependence
def test1():
global delta_a
global test_lambda
test_lambda = 20*AU
delta_a = 10.0**-20.0
alphas = []
delta_as = []
for i in range(20):
print i
data_list = generate_data1()
print np.array(data_list[0])
sig = generate_sig()
alpha = find_alpha1(data_list, sig)
delas = []
for alp in alpha:
if alp < 0:
x = 0
plt.loglog(delta_a, abs(alp), '.' 'r')
else:
x = 0
plt.loglog(delta_a, alp, '.' 'b')
delta_a *= 10
plt.xlabel('Delta A')
plt.ylabel('Alpha (at Lambda = 5 AU)')
plt.show()
def main():
test1()
if __name__ == '__main__':
main()

I believe this is to do with the minimisation algorithm used here, and the maximum obtainable precision.
I remember reading about it in numerical recipes a few years ago, I'll see if i can dig up a reference for you.
edit:
link to numerical recipes here - skip down to page 394 and then read that chapter. Note the third paragraph on page 404:
"Indulge us a ﬁnal reminder that tol should generally be no smaller
than the square root of your machine’s ﬂoating-point precision."
And mathematica mention that if you want accuracy, then you need to go for a different method, and that they don't infact use LMA unless the problem is recognised as being a sum of squares problem.
Given that you're just doing a one dimensional fit, it might be a good exercise to try just implementing one of the fitting algorithms they mention in that chapter.
What are you actually trying to achieve though? From what i understand about it, you're essentially trying to work out the amount of random noise you've added to the curve. But then that's not really what you're doing - unless i've understood wrong...
Edit2:
So after reading how you generate the data, there's an issue with the data and the model you're applying.
You're essentially fitting the two sides of this:
You're essentially trying to fit the height of a gaussian to random numbers. You're not fitting the gaussian to the frequency of those numbers.
Looking at your code, and judging from what you've said, this isn't you end goal, and you're just wanting to get used to the optimise method?
It would make more sense if you randomly adjusted the distance from the sun, and then fit to the data and see if you can minimise to find the distance which generated the data set?

Related

Understading hyperopt's TPE algorithm

I am illustrating hyperopt's TPE algorithm for my master project and cant seem to get the algorithm to converge. From what i understand from the original paper and youtube lecture the TPE algorithm works in the following steps:
(in the following, x=hyperparameters and y=loss)
Start by creating a search history of [x,y], say 10 points.
Sort the hyperparameters according to their loss and divide them into two sets using some quantile γ (γ = 0.5 means the sets will be equally sized)
Make a kernel density estimation for both the poor hyperparameter group (g(x)) and good hyperparameter group (l(x))
Good estimations will have low probability in g(x) and high probability in l(x), so we propose to evaluate the function at argmin(g(x)/l(x))
Evaluate (x,y) pair at the proposed point and repeat steps 2-5.
I have implemented this in python on the objective function f(x) = x^2, but the algorithm fails to converge to the minimum.
import numpy as np
import scipy as sp
from matplotlib import pyplot as plt
from scipy.stats import gaussian_kde
def objective_func(x):
return x**2
def measure(x):
noise = np.random.randn(len(x))*0
return x**2+noise
def split_meassures(x_obs,y_obs,gamma=1/2):
#split x and y observations into two sets and return a seperation threshold (y_star)
size = int(len(x_obs)//(1/gamma))
l = {'x':x_obs[:size],'y':y_obs[:size]}
g = {'x':x_obs[size:],'y':y_obs[size:]}
y_star = (l['y'][-1]+g['y'][0])/2
return l,g,y_star
#sample objective function values for ilustration
x_obj = np.linspace(-5,5,10000)
y_obj = objective_func(x_obj)
#start by sampling a parameter search history
x_obs = np.linspace(-5,5,10)
y_obs = measure(x_obs)
nr_iterations = 100
for i in range(nr_iterations):
#sort observations according to loss
sort_idx = y_obs.argsort()
x_obs,y_obs = x_obs[sort_idx],y_obs[sort_idx]
#split sorted observations in two groups (l and g)
l,g,y_star = split_meassures(x_obs,y_obs)
#aproximate distributions for both groups using kernel density estimation
kde_l = gaussian_kde(l['x']).evaluate(x_obj)
kde_g = gaussian_kde(g['x']).evaluate(x_obj)
#define our evaluation measure for sampling a new point
eval_measure = kde_g/kde_l
if i%10==0:
plt.figure()
plt.subplot(2,2,1)
plt.plot(x_obj,y_obj,label='Objective')
plt.plot(x_obs,y_obs,'*',label='Observations')
plt.plot([-5,5],[y_star,y_star],'k')
plt.subplot(2,2,2)
plt.plot(x_obj,kde_l)
plt.subplot(2,2,3)
plt.plot(x_obj,kde_g)
plt.subplot(2,2,4)
plt.semilogy(x_obj,eval_measure)
plt.draw()
#find point to evaluate and add the new observation
best_search = x_obj[np.argmin(eval_measure)]
x_obs = np.append(x_obs,[best_search])
y_obs = np.append(y_obs,[measure(np.asarray([best_search]))])
plt.show()
I suspect this happens because we keep sampling where we are most certain, thus making l(x) more and more narrow around this point, which doesn't change where we sample at all. So where is my understanding lacking?

So, I am still learning about TPE as well. But here's are the two problems in this code:
This code will only evaluate a few unique point. Because the best location is calculated based on the best recommended by the kernel density functions but there is no way for the code to do exploration of the search space. For example, what acquisition functions do.
Because this code is simply appending new observations to the list of x and y. It adds a whole lot of duplicates. The duplicates lead to a skewed set of observations and that leads to a very weird split and you can easily see that in the later plots. The eval_measure starts as something similar to the objective function but diverges later on.
If you remove the duplicates in x_obs and y_obs you can remove the problem no. 2. However, the first problem can only be removed through the addition of some way of exploring the search space.

Python: curve_fit not working for function with three fitting parameters, and an improper integral

I am trying to fit a data set to this monster of an equation. I know this has been asked before, but I don't think initial guesses are my problem, nor can I add more terms to my fitting equation.
My fitting equation. Note the "u" in the integral is NOT the same u as defined up top.
My data set is in mA/um, by the way.
I implemented this in a function F, which takes inputs Vd,T,r, and Vt. T,r,and Vt are fitting parameters. T and r range from 0
My first few programs had horrible fits (if it could even accomplish the integral), so I decided to see if the algorithm even works. The function's implementation is as follows:
from scipy import integrate
from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt
#Constants
eSiO2 = 3.9 #Relative dielectric constant of SiO2
tox = 2e-9 #Gate oxide thickness in m
evac = 8.854e-12 #Vacuum permittivity, F/m
em = 0.2*9.11e-31 #Effective electron mass in kg
KT = 4.11e-21 #Thermal energy in joules
Mv = 2.5 #Degeneracy factor
q = 1.6e-19 #Electron charge, coulombs
hbar = 1.054e-34 #Reduced plancks constant
Vg = 1
def F(Vd,T,r,Vt):
#Derived constants required for computation
Ci = (eSiO2*evac)/tox #Oxide capacitance per area
ved = (q*r*Vd)/(KT) #little Vd
I0 = (np.sqrt(2)*q*(KT**1.5)*Mv*np.sqrt(em))/(np.pi*hbar)**2 #Leakage Current
#Rho
rho1 = 2*np.pi*hbar**2*Ci
rho2 = q*KT*em*Mv
rhoV = Vg-Vt
rho = (rho1*rhoV)/rho2
#u
UA = 1 + np.exp(ved)
UB = np.exp(ved)*(np.exp(rho)-1)
Usq = np.sqrt(UA**2+4*UB)
u = np.log(Usq-UA)-np.log(2)
#Integrand of F(u)
def integrand1(A,x):
return (np.sqrt(A))/(1+np.exp(A-x))
#Integrand of F(u-v)
def integrand2(A,x):
return (np.sqrt(A))/(1+np.exp(A-x))
sum1 = 0
sum2 = 0
tempy1=[]
tempy2=[]
tempx2=[]
#Tempx2 = dummy variable domain
tempx2 = np.linspace(0,100,num=10000)
#Fitting parameters are determined
if Ready is True:
#Evaluate the integrands for all the Vd values
tempy1 = integrand1(tempx2,u)
tempy2 = integrand2(tempx2,u-ved)
#Fitting parameters are NOT determined
else:
print ("Calculating")
#Evaluate the integrands for all the Vd values
for i in range (0,len(u)):
tempy1 = integrand1(tempx2,u[i])
tempy2 = integrand2(tempx2,u[i]-ved[i])
#Perform integration over dummy domain
sum1 = integrate.simps(tempy1,tempx2,0.1)
sum2 = integrate.simps(tempy2,tempx2,0.1)
if Ready is False:
print ("u=%s" %u,"ved=%s" %ved)
print ("Sum1 is %s" %sum1)
return I0*T*1e-3*(sum1-sum2)
The function will compute F(x,T,r,Vt) if T,r, and Vt are specified. So I decided to make a "sample" data set to see if it will fit itself nearly perfectly:
#Create domain for reference curve
Ready = True
x = np.linspace(0,1.2,50)
y=[]
#Evaluate the reference curve domain
for j in range (0,50):
y.append(F(x[j],0.2,0.147,0.45))
Now that the reference curve is created, the curve will now attempted to be fit. Note how my p0 values are extremely close to the real values.
#Guesses for the curve fit
initial = [0.21,0.15,0.46]
Ready = False
#Attempt to fit the reference curve
popt, popc = curve_fit(F,x,y,initial,bounds=(0,1))
#Create the fit curve
fitdata=[]
Ready = True
for i in range (0,50):
fitdata.append(F(x[i],popt[0],popt[1],popt[2]))
And then plot both the reference and the fit curve. Yet, the fit curve is poor even though the p0 values are really close to the actual ones. I saw that people had problems with that in previous StackOverflow posts.
plt.plot(x,y,label='Reference')
plt.plot(x,fitdata,label='Fit')
plt.legend()
plt.show()
Here is the plot:
I've found that it's at least useful to picking some parameters to then manually guess and check with for the final fit. It's just so odd that it can't even fit itself even if curve_fit is basically with touching distance of the best fit parameters.
Due to the complexity of this fitting equation, will I have to do that? I did nearly the exact same computation with a quadratic fit (for a different project) on real data, and getting an appropriate curve was trivial.

You are discarding all results for tempy1 and tempy2 except for the last one. I think you wanted to append to the list.
Changing
for i in range (0,len(u)):
tempy1 = integrand1(tempx2,u[i])
tempy2 = integrand2(tempx2,u[i]-ved[i])
to
for i in range (0,len(u)):
tempy1.append(integrand1(tempx2,u[i]))
tempy2.append(integrand2(tempx2,u[i]-ved[i]))
results in the two graphs overlapping.

Curve Fitting For 3 dimensional data in python

I have X and Y data with (7,360,720) dimension (global grid cells with 0.5 resolution) as input data and I want to fit Sigmoid curve with below code and obtaining curve parameters in the same shape as X and Y:
# -*- coding: utf-8 -*-
import os, sys
from collections import OrderedDict as odict
import numpy as np
import pylab as pl
import numpy.ma as ma
from scipy.optimize import curve_fit
f=open('test.csv','w')
def sigmoid(x,a,b, c):
y = a+(b*(1 - np.exp(-c*(x**2))))
return y
for i in range(360):
for j in range(720):
xdata=[0,x[0,i,j],x[1,i,j],x[2,i,j],x[3,i,j],x[4,i,j],x[5,i,j],x[6,i,j]]
ydata=[0,y[0,i,j],y[1,i,j],y[2,i,j],y[3,i,j],y[4,i,j],y[5,i,j],y[6,i,j]]
popt, pcov = curve_fit(sigmoid, xdata, ydata)
print popt
f.write(','.join(map(str,popt)))
f.write("\n")
f.close()
Now this code write and sore fitting result in .csv file with 3 columns(a,b,c), but I want o write and store fitting result in the file with (360,720) shape as grid cells. also this code show me below error:
RuntimeError: Optimal parameters not found: Number of calls to function has reached maxfev = 800.

The dimensionality of your data (referenced in the title to the question) is not the cause of the problem you are seeing. What you are trying to do is run 360*720 (~260,000) separate fits to your sigmoidal function, with inputs derived from your arrays x and y. It should work, but it might be slow simply because you are doing so many fits.
If you haven't already, you should definitely start by fitting a couple arrays to your function -- if you can't get 3 to work, there's no point in trying 260,000, right? So, start with 1, then try 3, then 360, then all of them.
I suspect the problem you are seeing is because curve_fit() stupidly allows you to not explicitly specify starting values for your parameters, and even-more-stupidly assigns unspecified starting values to the arbitrary value of 1. This encourages new users to not think more carefully about the problem they are trying to solve, and then gives cryptic error messages like the one you seeing that do not explicitly say "you need better starting values". The message says the fit took many iterations, which probably means the fit "got lost" trying to find optimal values. That "getting lost" is probably because it started "too far from home".
In general, curve fitting is sensitive to the starting values of the parameters. And you probably do know better starting values than a=1, b=1, c=1. I suspect that you also know that exponentiation can get huge or tiny very quickly. Because of this, and depending on the scale of your x, there are probably ranges of values for c that are not really sensible -- it might be that c should be positive, and smaller than 10, for example. Again, you probably sort of know these ranges.
Let me suggest using lmfit (https://lmfit.github.io/lmfit-py/) for this work. It provides an alternative approach to curve-fitting, with many useful improvements over curve_fit. For your problem, a single fit might look like this:
import numpy as np
from lmfit import Model
def sigmoid(x, offset, scale, decay):
return offset + scale*(1 - np.exp(-decay*(x**2)))
## set up the model and parameters from your model function
# note that parameters will be *named* using the names of the
# arguments to your model function.
model = Model(sigmoid)
# make parameters (OrderedDict-like) with initial values
params = model.make_params(offset=0, scale=1, decay=0.25)
# you may want to set bounds on some of the parameters
params['scale'].min = 0
params['decay'].min = 0
params['decay'].max = 5
# you can also fix some parameters if desired
# params['offset'].vary = False
## set up data
# pick arbitrary data to fit, and make sure data use np arrays.
# but also: (0, 0) isn't in your data -- do you need to assert it?
# won't that drive `offset` to 0?
i, j = 7, 12
xdata = np.array([0] + x[:, i, j])
ydata = np.array([0] + y[:, i, j])
# now fit model to data, get results
result = model.fit(params, ydata, x=xdata)
print(result.fit_report())
This will print out a report with fit statistics, best-fit parameter values, and uncertainties. You can read the docs for all the components of results, but results.params holds best-fit parameters and uncertainties.
For use in a loop, this approach has the convenient feature that result is unique for each data set, while the starting params are not altered by the fit and can be re-used as starting values for all your fits. A test loop might look like
results = []
for i in (50, 150, 250):
for j in (200, 400, 600):
xdata = np.array([0] + x[:, i, j])
ydata = np.array([0] + y[:, i, j])
result = model.fit(params, ydata, x=xdata)
results.append([i, j, result.params, result.chisqr])
It will still be possible that some of the 260,000 fits will not succeed, but I think that lmfit will give you better tools to avoid and identify these cases.

Translating rJAGS censored linear regression model to PyMC3

I'm currently attempting to translate a program my boss originally wrote in R/rJAGS to Python/PyMC3, partially because he wanted to see if it was something python could do, partially because I want to learn how to do this sort of thing, it seems like a good thing to know. I've gotten a linear fit model working in PyMC3, but I'm having difficulty trying to replicate the censoring bit.
The R program reads in a table, each line having three y-values for three specific x-values which are constant across the data set. Each y-value also has some error associated with it. If that were it then I have a PyMC3 model that can do that; here's the toy model I had set up for it:
import numpy as np
import pymc3 as pmc
# set random seed for reproducibility
np.random.seed(12345)
x = np.linspace(0,10,3)
# Make some model data
# Parameters for linear fit
slope_true = -0.2
inter_true = 0.1
#Linear function
linear = lambda x,slope,inter: slope*x+inter
f_true = linear(x=x,slope=slope_true, inter=inter_true )
# add noise to the data points
f = f_true + np.random.normal(size=len(x)) * 0.05
f_error = np.ones_like(f_true)*f.max()*np.random.uniform(0,1,size=len(x))
with pmc.Model() as model3:
slope = pmc.Normal('slope', mu=0, tau=0.4, testval= 0.15)
inter = pmc.Normal('inter', mu=0, tau=40, testval=0.15)
linear = pmc.Deterministic('linear', slope*x+inter)
y = pmc.Normal('y', mu=linear, tau=1.0/f_error**2, observed=f)
start = pmc.find_MAP()
step = pmc.NUTS()
trace = pmc.sample(1000,start=start)
# extract results
slope_fit = np.median(trace.slope)
slope_up = slope_fit - np.percentile(trace.slope, 15.9)
slope_dn = np.percentile(trace.slope, 84.1) - slope_fit
The above model was somewhat hacked together from examples I found online, it generates points on a line, adds a bit of noise and some "error", then performs a fit on the noisy points with error. After that it grabs the a median value for the slope and some errors associated with it.
But now I need to be able to account for these censored points that sometimes pop up. In this instance certain y-values may have been non-detections, so the value for that point is considered a censor limit and the point is then set to NaN, with an error still associated with the point. The R code model (saved as lin_regress_model.bug) which handles this looks like this:
model {
for (i in 1:N) {
isCensored[i] ~ dinterval(rv[i], censorLimitVec[i])
rv[i] ~ dnorm(y[i],rve[i])
y[i] <- a*x[i] + b
}
a ~ dnorm(0, 1e-6)
b ~ dnorm(0, 1e-6)
tau ~ dgamma(0.001, 0.001)
sigma <- 1/sqrt(tau)
}
Here's an example of data it might get fed:
N = 3 # always 3, because 3 points
isCensored = c(False, False, True)
censorLimitVec = c(-6.65, -6.65, -6.65) # was value of 3rd point before NA
rv = c(-3.4, -4.7, NA) # y-values
rve = c(7e3, 7e2, 6.66) # these are Tau I think, like 1/sigma^2
x = c(0.15, 0.68, 0.94) # x-values
So all of those get passed into the jags model, and it's able to fit this censored data, but I can't for the life of me figure out how to translate that bit into PyMC3-speak. It sounds like the dinterval function in this may be similar to Uniform in PyMC3, but I don't really know what to do with that because I can't directly translate the formula lines (the concept of the tilde itself in R is still a bit weird to me).
If anyone out there can help me it would be greatly appreciated. For all I know it might not even be possible with PyMC3, or maybe it's easy and I've just missed something. Regardless, I've been banging my head against the wall for a few days now so I figure it'd be best just to ask for help at this point.

How to find all zeros of a function using numpy (and scipy)?

Suppose I have a function f(x) defined between a and b. This function can have many zeros, but also many asymptotes. I need to retrieve all the zeros of this function. What is the best way to do it?
Actually, my strategy is the following:
I evaluate my function on a given number of points
I detect whether there is a change of sign
I find the zero between the points that are changing sign
I verify if the zero found is really a zero, or if this is an asymptote
U = numpy.linspace(a, b, 100) # evaluate function at 100 different points
c = f(U)
s = numpy.sign(c)
for i in range(100-1):
if s[i] + s[i+1] == 0: # oposite signs
u = scipy.optimize.brentq(f, U[i], U[i+1])
z = f(u)
if numpy.isnan(z) or abs(z) > 1e-3:
continue
print('found zero at {}'.format(u))
This algorithm seems to work, except I see two potential problems:
It will not detect a zero that doesn't cross the x axis (for example, in a function like f(x) = x**2) However, I don't think it can occur with the function I'm evaluating.
If the discretization points are too far, there could be more that one zero between them, and the algorithm could fail finding them.
Do you have a better strategy (still efficient) to find all the zeros of a function?
I don't think it's important for the question, but for those who are curious, I'm dealing with characteristic equations of wave propagation in optical fiber. The function looks like (where V and ell are previously defined, and ell is an positive integer):
def f(u):
w = numpy.sqrt(V**2 - u**2)
jl = scipy.special.jn(ell, u)
jl1 = scipy.special.jnjn(ell-1, u)
kl = scipy.special.jnkn(ell, w)
kl1 = scipy.special.jnkn(ell-1, w)
return jl / (u*jl1) + kl / (w*kl1)

Why are you limited to numpy? Scipy has a package that does exactly what you want:
http://docs.scipy.org/doc/scipy/reference/optimize.nonlin.html
One lesson I've learned: numerical programming is hard, so don't do it :)
Anyway, if you're dead set on building the algorithm yourself, the doc page on scipy I linked (takes forever to load, btw) gives you a list of algorithms to start with. One method that I've used before is to discretize the function to the degree that is necessary for your problem. (That is, tune \delta x so that it is much smaller than the characteristic size in your problem.) This lets you look for features of the function (like changes in sign). AND, you can compute the derivative of a line segment (probably since kindergarten) pretty easily, so your discretized function has a well-defined first derivative. Because you've tuned the dx to be smaller than the characteristic size, you're guaranteed not to miss any features of the function that are important for your problem.
If you want to know what "characteristic size" means, look for some parameter of your function with units of length or 1/length. That is, for some function f(x), assume x has units of length and f has no units. Then look for the things that multiply x. For example, if you want to discretize cos(\pi x), the parameter that multiplies x (if x has units of length) must have units of 1/length. So the characteristic size of cos(\pi x) is 1/\pi. If you make your discretization much smaller than this, you won't have any issues. To be sure, this trick won't always work, so you may need to do some tinkering.

I found out it's relatively easy to implement your own root finder using the scipy.optimize.fsolve.
Idea: Find any zeroes from interval (start, stop) and stepsize step by calling the fsolve repeatedly with changing x0. Use relatively small stepsize to find all the roots.
Can only search for zeroes in one dimension (other dimensions must be fixed). If you have other needs, I would recommend using sympy for calculating the analytical solution.
Note: It may not always find all the zeroes, but I saw it giving relatively good results. I put the code also to a gist, which I will update if needed.
import numpy as np
import scipy
from scipy.optimize import fsolve
from matplotlib import pyplot as plt
# Defined below
r = RootFinder(1, 20, 0.01)
args = (90, 5)
roots = r.find(f, *args)
print("Roots: ", roots)
# plot results
u = np.linspace(1, 20, num=600)
fig, ax = plt.subplots()
ax.plot(u, f(u, *args))
ax.scatter(roots, f(np.array(roots), *args), color="r", s=10)
ax.grid(color="grey", ls="--", lw=0.5)
plt.show()
Example output:
Roots: [ 2.84599497 8.82720551 12.38857782 15.74736542 19.02545276]
zoom-in:
RootFinder definition
import numpy as np
import scipy
from scipy.optimize import fsolve
from matplotlib import pyplot as plt
class RootFinder:
def __init__(self, start, stop, step=0.01, root_dtype="float64", xtol=1e-9):
self.start = start
self.stop = stop
self.step = step
self.xtol = xtol
self.roots = np.array([], dtype=root_dtype)
def add_to_roots(self, x):
if (x < self.start) or (x > self.stop):
return # outside range
if any(abs(self.roots - x) < self.xtol):
return # root already found.
self.roots = np.append(self.roots, x)
def find(self, f, *args):
current = self.start
for x0 in np.arange(self.start, self.stop + self.step, self.step):
if x0 < current:
continue
x = self.find_root(f, x0, *args)
if x is None: # no root found.
continue
current = x
self.add_to_roots(x)
return self.roots
def find_root(self, f, x0, *args):
x, _, ier, _ = fsolve(f, x0=x0, args=args, full_output=True, xtol=self.xtol)
if ier == 1:
return x[0]
return None
Test function
The scipy.special.jnjn does not exist anymore, but I created similar test function for the case.
def f(u, V=90, ell=5):
w = np.sqrt(V ** 2 - u ** 2)
jl = scipy.special.jn(ell, u)
jl1 = scipy.special.yn(ell - 1, u)
kl = scipy.special.kn(ell, w)
kl1 = scipy.special.kn(ell - 1, w)
return jl / (u * jl1) + kl / (w * kl1)

The main problem I see with this is if you can actually find all roots --- as have already been mentioned in comments, this is not always possible. If you are sure that your function is not completely pathological (sin(1/x) was already mentioned), the next one is what's your tolerance to missing a root or several of them. Put differently, it's about to what length you are prepared to go to make sure you did not miss any --- to the best of my knowledge, there is no general method to isolate all the roots for you, so you'll have to do it yourself. What you show is a reasonable first step already. A couple of comments:
Brent's method is indeed a good choice here.
First of all, deal with the divergencies. Since in your function you have Bessels in the denominators, you can first solve for their roots -- better look them up in e.g., Abramovitch and Stegun (Mathworld link). This will be a better than using an ad hoc grid you're using.
What you can do, once you've found two roots or divergencies, x_1 and x_2, run the search again in the interval [x_1+epsilon, x_2-epsilon]. Continue until no more roots are found (Brent's method is guaranteed to converge to a root, provided there is one).
If you cannot enumerate all the divergencies, you might want to be a little more careful in verifying a candidate is indeed a divergency: given x don't just check that f(x) is large, check that, e.g. |f(x-epsilon/2)| > |f(x-epsilon)| for several values of epsilon (1e-8, 1e-9, 1e-10, something like that).
If you want to make sure you don't have roots which simply touch zero, look for the extrema of the function, and for each extremum, x_e, check the value of f(x_e).

I've also encountered this problem to solve equations like f(z)=0 where f was an holomorphic function. I wanted to be sure not to miss any zero and finally developed an algorithm which is based on the argument principle.
It helps to find the exact number of zeros lying in a complex domain. Once you know the number of zeros, it is easier to find them. There are however two concerns which must be taken into account :
Take care about multiplicity : when solving (z-1)^2 = 0, you'll get two zeros as z=1 is counting twice
If the function is meromorphic (thus contains poles), each pole reduce the number of zero and break the attempt to count them.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.