Let us assume I have a smooth, nonlinear function f: R^n -> R with the (known) maximum number of roots N. How can I find the roots efficiently? Right now I have calculated the function on the grid on a preselected area, refined the grid where the function is below a predefined threshold and continued that routine, but this does not seem to be very efficient, though, because I have noticed that it is difficult to select the area correctly before and to define the threshold accordingly.
there are several ways to go about this of course, scipy is known to contain the safest and most efficient method for finding a single root provided you know the interval:
scipy.optimize.brentq
to find more roots using some estimate you can use:
scipy.optimize.fsolve
de Moivre's formulae to use for root finding that is fairly quick in comparison to others (in case you would rather build your own method):
given a complex number
the n roots are given by:
where k varies over the integer values from 0 to n − 1.
You can square the function and use global optimization software to locate all the minima inside a domain and pick those with a zero value.
Stochastic multistart based global optimization methods with clustering are quite proper for this task.
Related
I'm trying to to find a way to perform an incomplete LU factorization of a symbolic matrix in SymPy and cannot find anything helpful on my own. It's an option for solvers to use ilu as a preconditioner, but it seems there's no way to have it on its own.
Am I missing it? Is it even possible/feasible for symbolic matrices of size 20x20 and larger?
The reason I need this is because I need to approximate O(1) terms in the inverse of that symbolic matrix. I had luck with ilu and non-symbolic matrices, so I thought this may be the way. If this is relevant, the symbols are all binary variables and linear in the terms.
Update 1:
I tried to use the LU solver, but the number of variables in the matrix is much lower than the matrix dimension, so it's no option (unless there is an efficient way to compute just the very first component of the solution vector?). I also tried full LU decomposition with the additional simplification function
def simpfunc(E):
E = E.replace(lambda e: e.is_Pow, lambda e: e.args[0])
return E
which I do hope is correctly formulated this way, since there seems to be no example in the documentation. The idea came from the answer to a previous question here. I could additionally provide an iszerofunc because terms with more than n factors would be zero automatically, but I don't know how to check the degree of terms (example: 0.5x_0x_1x_2x_4 would be zero, while 0.8x_0x_2x_4 would not).
I have two related fitting parameters. They have the same fitting range. Let's call them r1 and r2. I know I can limit the fitting range using minuit.limits, but I have an additional criteria that r2 has to be smaller than r1, can I do that in iminuit?
I've found this, I hope this can help you!
Extracted from: https://iminuit.readthedocs.io/en/stable/faq.html
**Can I have parameter limits that depend on each other (e.g. x^2 + y^2 < 3)?**¶
MINUIT was only designed to handle box constrains, meaning that the limits on the parameters are independent of each other and constant during the minimisation. If you want limits that depend on each other, you have three options (all with caveats), which are listed in increasing order of difficulty:
Change the variables so that the limits become independent. For example, transform from cartesian coordinates to polar coordinates for a circle. This is not always possible, of course.
Use another minimiser to locate the minimum which supports complex boundaries. The nlopt library and scipy.optimize have such minimisers. Once the minimum is found and if it is not near the boundary, place box constraints around the minimum and run iminuit to get the uncertainties (make sure that the box constraints are not too tight around the minimum). Neither nlopt nor scipy can give you the uncertainties.
Artificially increase the negative log-likelihood in the forbidden region. This is not as easy as it sounds.
The third method done properly is known as the interior point or barrier method. A glance at the Wikipedia article shows that one has to either run a series of minimisations with iminuit (and find a clever way of knowing when to stop) or implement this properly at the level of a Newton step, which would require changes to the complex and convoluted internals of MINUIT2.
Warning: you cannot just add a large value to the likelihood when the parameter boundary is violated. MIGRAD expects the likelihood function to be differential everywhere, because it uses the gradient of the likelihood to go downhill. The derivative at a discrete step is infinity and zero in the forbidden region. MIGRAD does not like this at all.
I want to find the minimum of a function in python y = f(x)
Problem : the solver tries to compute the gradient with super close x values (delta x around 1e-8), and my function f is not sensitive to such a small step (ie we can see y vary when delta x around 1e-1).
Hence gradient is 0 to the solver, and can not find the proper solution.
I've tried following solvers from scipy, I can't find the option I'm looking for..
scipy.optimize.minimize
scipy.optimize.fmin
In Matlab fmincon , there is an option that does the job 'DiffMinChange' : Minimum change in variables for finite-difference gradients (a positive scalar).
You may want to try and use L-BFGS-B from scipy:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fmin_l_bfgs_b.html
And provide the “epsilon” parameter to be around 0.1/0.05 and see if it makes it better. I am of course assuming that you will let the solver compute the gradient for you by numerical differentiation (I.e., you pass fprime=None and approx_grad=True) to the routine.
I personally despise the “minimize” interface to various solvers so I prefer to deal with the actual solvers themselves.
I want to solve the following optimization problem with Python:
I have a black box function f with multiple variables as input.
The execution of the black box function is quite time consuming, therefore I would like to avoid a brute force approach.
I would like to find the optimum input parameters for that black box function f.
In the following, for simplicity I just write the dependency for one dimension x.
An optimum parameter x is defined as:
the cost function cost(x) is maximized with the sum of
f(x) value
a maximum standard deviation of f(x)
.
cost(x) = A * f(x) + B * max(standardDeviation(f(x)))
The parameters A and B are fix.
E.g., for the picture below, the value of x at the position 'U' would be preferred over the value of x at the positon of 'V'.
My question is:
Is there any easily adaptable framework or process that I could utilize (similar to e. g. simulated annealing or bayesian optimisation)?
As mentioned, I would like to avoid a brute force approach.
I’m still not 100% sure of your approach, but does this formula ring true to you:
A * max(f(x)) + B * max(standardDeviation(f(x)))
?
If it does, then I guess you may want to consider that maximizing f(x) may (or may not) be compatible with maximizing the standard deviation of f(x), which means you may be facing a multi-objective optimization problem.
Again, you haven’t specified what f(x) returns - is it a vector? I hope it is, otherwise I’m unclear on what you can calculate the standard deviation on.
The picture you posted is not so obvious to me. F(x) is the entire black curve, it has a maximum at the point v, but what can you say about the standard deviation? To calculate the standard deviation of you have to take into account the entire f(x) curve (including the point u), not just the neighborhood of u and v. If you only want to get the standard deviation in an interval around a maximum for f(x), then I think you’re out of luck when it comes to frameworks. The best thing that comes to my mind is to use a local (or maybe global, better) optimization algorithm to hunt for the maximum of f(x) - simulated annealing, differential evolution, tunnelling, and so on - and then, when you have found a maximum for f(x), sample a few points on the left and right of your optimum and calculate the standard deviation of these evaluations. Then you’ll have to decide if the combination of the maximum of f(x) and this standard deviation is good enough or not compared to any previous “optimal” point found.
This is all speculation, as I’m unsure that your problem is really an optimization one or simply a “peak finding” exercise, for which there are many different - and more powerful and adequate- methods.
Andrea.
I have attempted an exercise from the computational physics written by Newman and written the following code for an adaptive trapezoidal rule. When the error estimate of each slide is larger than the permitted value, it divides that portion into two halves. I am just wondering what else I can do to make the algorithm more efficient.
xm=[]
def trap_adapt(f,a,b,epsilon=1.0e-8):
def step(x1,x2,f1,f2):
xm = (x1+x2)/2.0
fm = f(xm)
h1 = x2-x1
h2 = h1/2.0
I1 = (f1+f2)*h1/2.0
I2 = (f1+2*fm+f2)*h2/2.0
error = abs((I2-I1)/3.0) # leading term in the error expression
if error <= h2*delta:
points.append(xm) # add the points to the list to check if it is really using more points for more rapid-varying regions
return h2/3*(f1 + 4*fm + f2)
else:
return step(x1,xm,f1,fm)+step(xm,x2,fm,f2)
delta = epsilon/(b-a)
fa, fb = f(a), f(b)
return step(a,b,fa,fb)
Besides, I used a few simple formula to compare this to Romberg integration, and found that for the same accuracy, this adaptive method uses many more point to calculate the integral.
Is it just because of its inherent limitations? Any advantages of using this adaptive algorithm instead of the Romberg method? any ways to make it faster and more accurate?
Your code is refining to meet an error tolerance in each individual subinterval. It's also using a low-order integration rule. Improvements in both of these can significantly reduce the number of function evaluations.
Rather than considering the error in each subinterval separately, more advanced codes compute the total error over all the subintervals and refine until the total error is below the desired threshold. Subintervals are chosen for refinement according to their contribution to the total error, with larger errors being refined first. Typically a priority queue is used to quickly chose the subinterval for refinement.
Higher-order integration rules can integrate more complicated functions exactly. For example, your code is based on Simpson's rule, which is exact for polynomials of degree up to 3. A more advanced code will probably use a rule that's exact for polynomials of much higher degree (say 10-15).
From a practical point of view, the simplest thing is to use a canned routine that implements the above ideas, e.g., scipy.integrate.quad. Unless you have particular knowledge of what you want to integrate, you're unlikely to do better.
Romberg integration requires evaluation at equally-spaced points. If you can evaluate the function at any point, then other methods are generally more accurate for "smooth" (polynomial-like) functions. And if your function is not smooth everywhere, then an adaptive code will do much better because it can focus on beating down the error in the non-smooth regions.