I'm using the scipy.optimize "minimize" function to solve a nonlinear optimization problem.
optimizationOutput = minimize(objective, x0, args=(n,), method='SLSQP', bounds=bnds, constraints=cons, tol=1, options={'disp': True, 'eps': 1e5, 'maxiter': 2000})
My input (x0) is an array containing 20 floats.
The first 19 inputs are initially estimated to be 10000, bounded by (0, None).
The last input is initially estimated to be 65, bounded by (60,70).
My problem has to do with the last input. The optimization algorithm seems to choose almost exclusively its upper or lower bound, occasionally choosing values like 60.00001460018602. I know that this has to do with the 'eps' option passed to minimize(), but is there a way to have a separate step size for this last input? The first 19 inputs rely on the larger step size in order to find a global solution in a timely manner.
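One possible way to get a separate step size per input, sketched under assumptions: instead of relying on the single 'eps' option, pass minimize() a jac callable that builds a forward-difference gradient with one step per variable. The names fd_gradient, steps and toy_objective, as well as the step values, are hypothetical stand-ins and not part of the original problem.

```python
import numpy as np
from scipy.optimize import minimize


def fd_gradient(fun, x, steps, *args):
    """Forward-difference gradient using a separate step for each variable."""
    x = np.asarray(x, dtype=float)
    f0 = fun(x, *args)
    grad = np.empty_like(x)
    for i, h in enumerate(steps):
        x_step = x.copy()
        x_step[i] += h
        grad[i] = (fun(x_step, *args) - f0) / h
    return grad


def toy_objective(x):
    # Hypothetical stand-in: 19 large-scale inputs plus one small-scale input.
    return np.sum((x[:19] / 1e4 - 1.2) ** 2) + (x[19] - 63.0) ** 2


x0 = np.array([10000.0] * 19 + [65.0])
bnds = [(0, None)] * 19 + [(60, 70)]
# Assumed step sizes: coarse for the first 19 inputs, fine for the last one.
steps = np.array([1.0] * 19 + [1e-4])

res = minimize(toy_objective, x0, method='SLSQP', bounds=bnds,
               jac=lambda x: fd_gradient(toy_objective, x, steps))
print(res.x[-1])
```

Note that a forward step taken exactly at an upper bound evaluates the objective slightly outside the bound. Rescaling the inputs so that they all have a similar order of magnitude may be a simpler alternative.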
I'm facing a problem while trying to implement the coupled differential equation below (also known as single-mode coupling equation) in Python 3.8.3. As for the solver, I am using Scipy's function scipy.integrate.solve_bvp, whose documentation can be read here. I want to solve the equations in the complex domain, for different values of the propagation axis (z) and different values of beta (beta_analysis).
The problem is that it is extremely slow (not manageable) compared with an equivalent implementation in Matlab using the functions bvp4c, bvpinit and bvpset. For the first few iterations, both implementations return the same result, except for the resulting mesh, which is much larger in the case of Scipy. The mesh sometimes even saturates at the maximum value.
The equation to be solved is shown here below, along with the boundary conditions function.
import h5py
import numpy as np
from scipy import integrate

def coupling_equation(z_mesh, a):
    ka_z = k  # Global
    z_a = z  # Global
    a_p = np.empty_like(a).astype(complex)
    for idx, z_i in enumerate(z_mesh):
        beta_zf_i = np.interp(z_i, z_a, beta_zf)  # Get beta at the desired point of the mesh
        ka_z_i = np.interp(z_i, z_a, ka_z)  # Get ka at the desired point of the mesh
        coupling_matrix = np.empty((2, 2), complex)
        coupling_matrix[0] = [-1j * beta_zf_i, ka_z_i]
        coupling_matrix[1] = [ka_z_i, 1j * beta_zf_i]
        a_p[:, idx] = np.matmul(coupling_matrix, a[:, idx])  # Apply the coupling matrix
    return a_p

def boundary_conditions(a_a, a_b):
    return np.hstack(((a_a[0] - 1), a_b[1]))
Moreover, I couldn't find a way to pass k, z and beta_zf as arguments of the function coupling_equation, given that the fun argument of the solve_bvp function must be a callable with the parameters (x, y). My approach is to define some global variables, but I would appreciate any help on this too if there is a better solution.
The analysis function which I am trying to code is:
def analysis(k, z, beta_analysis, max_mesh):
    s11_analysis = np.empty_like(beta_analysis, dtype=complex)
    s21_analysis = np.empty_like(beta_analysis, dtype=complex)
    initial_mesh = np.linspace(z[0], z[-1], 10)  # Initial mesh of 10 samples along L
    mesh = initial_mesh
    # a_init must be complex in order to solve the problem in a complex domain
    a_init = np.vstack((np.ones(np.size(initial_mesh)).astype(complex),
                        np.zeros(np.size(initial_mesh)).astype(complex)))
    for idx, beta in enumerate(beta_analysis):
        print(f"Iteration {idx}: beta_analysis = {beta}")
        global beta_zf
        beta_zf = beta * np.ones(len(z))  # Global variable so as to use it in coupling_equation(x, y)
        a = integrate.solve_bvp(fun=coupling_equation,
                                bc=boundary_conditions,
                                x=mesh,
                                y=a_init,
                                max_nodes=max_mesh,
                                verbose=1)
        # mesh = a.x  # Mesh for the next iteration
        # a_init = a.y  # Initial guess for the next iteration, corresponding to the current solution
        s11_analysis[idx] = a.y[1][0]
        s21_analysis[idx] = a.y[0][-1]
    return s11_analysis, s21_analysis
I suspect that the problem has something to do with the initial guess that is being passed to the different iterations (see the commented lines inside the loop in the analysis function). I tried to use the solution of one iteration as the initial guess for the next (which should reduce the time needed by the solver), but it is even slower, which I don't understand. Maybe I missed something, because it is my first time trying to solve differential equations.
The parameters used for the execution are the following:
f2 = h5py.File(r'path/to/file', 'r')
k = np.array(f2['k']).squeeze()
z = np.array(f2['z']).squeeze()
f2.close()
analysis_points = 501
max_mesh = 1e6
beta_0 = 3e2
beta_low = 0  # Lower value of the frequency for the analysis
beta_up = beta_0  # Upper value of the frequency for the analysis
beta_analysis = np.linspace(beta_low, beta_up, analysis_points)
s11_analysis, s21_analysis = analysis(k, z, beta_analysis, max_mesh)
Any ideas on how to improve the performance of these functions? Thank you all in advance, and sorry if the question is not well-formulated, I accept any suggestions about this.
Edit: Added some information about performance and sizing of the problem.
In practice, I can't find a relation that determines the number of times coupling_equation is called. It must be a matter of the internal operation of the solver. I checked the number of calls in one iteration by printing a line, and it was called 133 times (this was one of the fastest iterations). This must be multiplied by the number of iterations over beta. For the analyzed one, the solver returned this:
Solved in 11 iterations, number of nodes 529.
Maximum relative residual: 9.99e-04
Maximum boundary residual: 0.00e+00
The shapes of a and z_mesh are correlated, since z_mesh is a vector whose length corresponds with the size of the mesh, recalculated by the solver each time it calls coupling_equation. Given that a contains the amplitudes of the progressive and regressive waves at each point of z_mesh, the shape of a is (2, len(z_mesh)).
In terms of computation times, I only managed to complete 19 iterations in about 2 hours with Python. In this case, the initial iterations were faster, but they start to take more time as their mesh grows, until the mesh saturates at the maximum allowed value. I think this is because of the value of the input coupling coefficients at that point, because it also happens when no loop over beta_analysis is executed (just the solve_bvp function for the intermediate value of beta). In contrast, Matlab managed to return a solution for the entire problem in just 6 minutes, approximately. If I pass the result of the last iteration as the initial guess (commented lines in the analysis function), the mesh overflows even faster and it is impossible to get more than a couple of iterations.
Based on semi-random inputs, we can see that max_mesh is sometimes reached. This means that coupling_equation can be called with quite big z_mesh and a arrays. The problem is that coupling_equation contains a slow pure-Python loop iterating over each column of the arrays. You can speed the computation up a lot using Numpy vectorization. Here is an implementation:
def coupling_equation_fast(z_mesh, a):
    ka_z = k  # Global
    z_a = z  # Global
    a_p = np.empty(a.shape, dtype=np.complex128)
    beta_zf_i = np.interp(z_mesh, z_a, beta_zf)  # Get beta at the desired points of the mesh
    ka_z_i = np.interp(z_mesh, z_a, ka_z)  # Get ka at the desired points of the mesh
    # Fast manual matrix multiplication
    a_p[0] = (-1j * beta_zf_i) * a[0] + ka_z_i * a[1]
    a_p[1] = ka_z_i * a[0] + (1j * beta_zf_i) * a[1]
    return a_p
This code provides a similar output with semi-random inputs compared to the original implementation but is roughly 20 times faster on my machine.
Furthermore, I do not know whether max_mesh happens to be reached with your inputs too, or whether this is normal/intended. It may make sense to decrease the value of max_mesh in order to reduce the execution time even more.
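Regarding the global variables, one option is a closure: a factory function that receives k, z and beta and returns a callable with the (x, y) signature that solve_bvp expects. The following is only a sketch; make_coupling_equation and the sample data are made-up names and values, and it assumes beta is constant along z, as in the analysis function.

```python
import numpy as np


def make_coupling_equation(k, z, beta):
    """Return a vectorized right-hand side fun(z_mesh, a) bound to k, z and beta."""
    k = np.asarray(k, dtype=float)
    z = np.asarray(z, dtype=float)

    def coupling_equation(z_mesh, a):
        ka = np.interp(z_mesh, z, k)  # coupling coefficient on the current mesh
        a_p = np.empty(a.shape, dtype=np.complex128)
        a_p[0] = -1j * beta * a[0] + ka * a[1]
        a_p[1] = ka * a[0] + 1j * beta * a[1]
        return a_p

    return coupling_equation


# Hypothetical stand-in data; the real k and z come from the HDF5 file.
z = np.linspace(0.0, 1.0, 50)
k = np.sin(2 * np.pi * z)
fun = make_coupling_equation(k, z, beta=3.0)

z_mesh = np.linspace(0.0, 1.0, 7)
a = np.vstack((np.ones(7, dtype=complex), 1j * np.ones(7)))
print(fun(z_mesh, a).shape)  # (2, 7)
```

Inside the loop over beta_analysis, passing fun=make_coupling_equation(k, z, beta) to integrate.solve_bvp would then replace the global beta_zf.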
I know the curve_fit function of scipy and its power for fitting curves. I have read many examples here and in the documentation, but I cannot solve my problem.
For example, I have 10 files (chemical structures, but it does not matter) and ten experimental energy values. I have a function inside a class that calculates the theoretical energy of each structure for some parameters and returns a numpy array with the theoretical energy values.
I want to find the parameters that bring the theoretical values closest to the experimental ones. Here is a minimal example of my code.
This is the class function that reads the experimental energy files, extracts the correct substring and returns the values as a numpy array. self.path is just the directory and self.nPoints = 10. It is not so important, but I include it for the sake of completeness.
def experimentalValues(self):
    os.chdir(self.path)
    energy = np.zeros(self.nPoints)
    # Start at 0 so that p_1.xyz is read too, matching theoreticalEnergy
    for i in range(self.nPoints):
        with open("p_" + str(i + 1) + ".xyz", "r") as f:
            energy[i] = float(f.readlines()[1].split()[1])
    os.chdir('..')
    return energy
I calculate the theoretical value with this class function that takes two numpy arrays as arguments, let's say
sigma = np.full(nSubstrate, 2.)
epsilon = np.full(nSubstrate, 0.15)
where nSubstrate = 9
Here is the class function. It reads the files and uses two nested loops to calculate the theoretical value for each file, returning the values in a numpy array.
def theoreticalEnergy(self, epsilon, sigma):
    os.chdir(self.path)
    cE = np.zeros(self.nPoints)
    for n in range(0, self.nPoints):
        filenameXYZ = "p_" + str(n + 1) + "_extended.xyz"
        allCoordinates = np.loadtxt(filenameXYZ, skiprows = 0, usecols = (1, 2, 3))
        substrate = allCoordinates[0:self.nSubstrate]
        surface = allCoordinates[self.nSubstrate:]
        for i in range(0, substrate.shape[0]):
            positionAtomI = np.array(substrate[i][:])
            for j in range(0, surface.shape[0]):
                positionAtomJ = np.array(surface[j][:])
                distanceIJ = self.distance(positionAtomI, positionAtomJ)
                cE[n] += self.LennardJones(distanceIJ, epsilon[i], sigma[i])
    os.chdir('..')
    return cE
Again, for the sake of completeness the Lennard Jones class function is defined as
def LennardJones(self, distance, epsilon, sigma):
    repulsive = (sigma/distance) ** 12.
    attractive = (sigma/distance) ** 6.
    potential = 4. * epsilon * (repulsive - attractive)
    return potential
where in this case all the arguments, as well as the return value, are scalars.
To conclude the problem presentation I have 3 ingredients:
a numpy array with the experimental data
two numpy arrays with a guess for the parameters sigma and epsilon
a function that takes the last parameters and returns a numpy vector with the values to be fitted.
How can I solve this problem like the approach described in the documentation https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html?
Curve fitting
curve_fit fits a function f(w, x[i]) to points y[i] by finding the w that minimizes sum((f(w, x[i]) - y[i])**2 for i in range(n)). As you will read in the first line after the function definition
[It uses] non-linear least squares to fit a function, f, to data.
It refers to least_squares where it states
Given the residuals f(x) (an m-D real function of n real variables) and the loss function rho(s) (a scalar function), least_squares finds a local minimum of the cost function F(x):
minimize F(x) = 0.5 * sum(rho(f_i(x)**2), i = 0, ..., m - 1)
Curve fitting can be seen as a kind of multi-objective optimization in which every data point contributes its own squared-error cost. Each of these costs is convex in the residual (and convex in w when f is linear in w), and a sum of convex functions is still convex. Notice that the decision variables (the parameters to be optimized) are the same at every point.
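As a rough illustration of how the problem from the question could be handed to least_squares, the two parameter arrays can be packed into a single vector and unpacked again inside a residual function. This is only a sketch under assumptions: theoretical_energy is a vectorized stand-in for the class method, and the distances, sizes and "true" parameters are synthetic.

```python
import numpy as np
from scipy.optimize import least_squares


def lennard_jones(distance, epsilon, sigma):
    ratio6 = (sigma / distance) ** 6
    return 4.0 * epsilon * (ratio6 ** 2 - ratio6)


def theoretical_energy(epsilon, sigma, distances):
    # distances has shape (n_points, n_substrate, n_surface); epsilon and sigma
    # hold one value per substrate atom, as in the question.
    pair = lennard_jones(distances, epsilon[None, :, None], sigma[None, :, None])
    return pair.sum(axis=(1, 2))


def residuals(params, distances, experimental):
    n = params.size // 2
    epsilon, sigma = params[:n], params[n:]  # unpack the single parameter vector
    return theoretical_energy(epsilon, sigma, distances) - experimental


rng = np.random.default_rng(0)
n_points, n_substrate, n_surface = 10, 2, 5  # made-up sizes
distances = rng.uniform(2.5, 5.0, size=(n_points, n_substrate, n_surface))
true_params = np.array([0.15, 0.20, 2.0, 2.2])  # made-up epsilon and sigma values
experimental = residuals(true_params, distances, 0.0)  # synthetic "experimental" data

p0 = np.array([0.10, 0.10, 2.1, 2.1])  # initial guess for [epsilon..., sigma...]
fit = least_squares(residuals, p0, args=(distances, experimental),
                    bounds=(1e-6, np.inf))
print(fit.x, fit.cost)
```

With 9 substrate atoms there would be 18 parameters for only 10 energies, so the real problem is likely underdetermined unless some parameters are shared.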
Your problem
In my understanding, for each energy level you have a different set of parameters. If you write it as a curve fitting problem, the objective function could be expressed as sum((f(w[i], x[i]) - y[i])**2 ...), where y[i] is determined by the energy level. Since each of the terms in the sum is independent of the other terms, this is equivalent to finding each group of parameters w[i] separately, minimizing (f(w[i], x[i]) - y[i])**2.
Convexity
Convexity is a very convenient property for optimization because it ensures that you will have only one minimum in the parameter space. I am not doing a detailed analysis but have reasonable doubts about the convexity of your energy function.
The Lennard-Jones function is the difference between a repulsive and an attractive term, both with a negative even exponent on the distance; this alone is very unlikely to be convex.
The sum of multiple local functions centered at different positions has no defined convexity.
Molecular energy, or crystal energy, or protein folding are well known to be non-convex.
A few days ago (on a bike ride) I was thinking about this, how the molecules will be configured in a global minimum energy, and I was wondering if it finds that configuration so rapidly because of quantum tunneling effects.
Non-convex optimization
Non-convex (global) optimization differs from (non-linear) least squares in that the process does not return as soon as a local minimum is found; it starts new attempts in different regions of the search space. If the function is smooth you can still take advantage of a gradient-based local optimization method, but finding the global minimum remains NP-hard in general.
A classic global optimization method is simulated annealing; if you have a chemical background, I think you will gain some insight from reading about it. Once upon a time, simulated annealing was provided in scipy.optimize.
You will find a few global optimization methods in scipy.optimize. I would encourage you to try Basin hopping, since it was successfully applied to similar problems, as you can read in the references.
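To give an idea of what calling basin hopping looks like, here is a minimal sketch on a toy one-dimensional cost with several local minima. The cost function, starting point, niter and stepsize are illustrative assumptions and have nothing to do with your actual energy misfit.

```python
import numpy as np
from scipy.optimize import basinhopping


def cost(x):
    # Toy cost with several local minima (not the real energy misfit)
    return float(np.cos(14.5 * x[0] - 0.3) + (x[0] + 0.2) * x[0])


x0 = np.array([1.0])
# Each hop perturbs x randomly and then runs a local L-BFGS-B minimization
result = basinhopping(cost, x0, niter=50, stepsize=0.5,
                      minimizer_kwargs={"method": "L-BFGS-B"})
print(result.x, result.fun)
```

In your case the cost would be something like the sum of squared differences between theoretical and experimental energies, with epsilon and sigma packed into x.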
I hope this puts you on the right track to your solution. Be aware, though, that you will probably need to spend some time learning how to use the function and will need to make some decisions. You will need to find a balance between accuracy, simplicity and efficiency.
If you want a better solution, take the time to derive the gradient of the cost function (you can return two values, f and df, where df is the gradient of f with respect to the decision variables).
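A sketch of what returning f and df together can look like, assuming a simplified cost: a squared-error fit of a single-pair Lennard-Jones curve with one epsilon and one sigma. The data are synthetic and cost_and_gradient is a made-up name; with jac=True, minimize expects the objective to return the pair (f, df).

```python
import numpy as np
from scipy.optimize import minimize


def cost_and_gradient(params, r, y):
    """Squared-error cost of a single-pair Lennard-Jones fit and its gradient."""
    epsilon, sigma = params
    s6 = (sigma / r) ** 6
    s12 = s6 ** 2
    res = 4.0 * epsilon * (s12 - s6) - y
    d_eps = 4.0 * (s12 - s6)                                  # d(model)/d(epsilon)
    d_sig = 4.0 * epsilon * (12.0 * s12 - 6.0 * s6) / sigma   # d(model)/d(sigma)
    f = np.sum(res ** 2)
    df = np.array([2.0 * np.sum(res * d_eps), 2.0 * np.sum(res * d_sig)])
    return f, df


r = np.linspace(2.0, 5.0, 20)                         # made-up distances
y = 4.0 * 0.15 * ((2.0 / r) ** 12 - (2.0 / r) ** 6)   # synthetic "experimental" energies

result = minimize(cost_and_gradient, x0=[0.1, 2.1], args=(r, y), jac=True,
                  method='L-BFGS-B', bounds=[(1e-6, None), (1e-6, None)])
print(result.x)
```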
Does the SciPy implementation of the differential evolution algorithm have a maximum number of variables? My code works on a toy version of the problem with 8 variables, but when I try to optimize the actual problem with 4000 variables a value of infinity is consistently returned for the objective function.
code (see GitHub repo for input files)
import numpy as np
from scipy.optimize import differential_evolution as de
from scipy.optimize import NonlinearConstraint as nlc
def kf(x, w, freq):
    kc = x>0
    kw = ~np.any(w[~kc,:], axis=0)
    return -freq[kw].sum()

def cons_fun(x):
    return np.sum(x>0)

def optimize(w, freq):
    cons = nlc(cons_fun, -np.inf, 1000)
    bnds = [np.array([-1,1]),]*w.shape[0]
    res = de(kf, args=(w, freq), maxiter=1000, bounds=bnds, popsize=2, polish=False,
             constraints=cons, disp=True, workers=-1, updating='deferred')
    output = res.x>0
    np.save('output.npy', output)

if __name__ == '__main__':
    # try optimizing toy version of problem
    small_w = np.load('small_w.npy')
    small_freq = np.load('small_freq.npy')
    optimize(small_w, small_freq)
    # try optimizing actual problem
    w = np.load('w.npy')
    freq = np.load('freq.npy')
    optimize(w, freq)
program output for the actual problem
differential_evolution step 1: f(x)= inf
differential_evolution step 2: f(x)= inf
differential_evolution step 3: f(x)= inf
...and so on for hundreds of steps
more information about the optimization problem
I'm trying to determine a set of 1000 Chinese characters that maximizes your ability to write common words. The array w is a sparse boolean matrix with shape 4000 (number of potential characters) by 30000 (number of words). An element of w is true if the character corresponding to that row occurs in the word corresponding to that column. The array freq is a vector of length 30000 that contains word frequency values.
The objective function kf takes 4000-element array x as its argument. The array x contains values between -1 and 1. The trial set of characters is determined by the positive elements in x. A nonlinear constraint restricts the number of positive elements in x to 1000.
There is no limit to the number of variables that can be used in differential_evolution.
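For a constrained minimization with differential_evolution, the objective function is only evaluated if the trial solution is feasible, that is, if it satisfies the constraints. This is so that computational time is not wasted on infeasible trial solutions. If no feasible solution is ever found, the objective function is never evaluated, which would explain why f(x)= inf is reported at every step.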
A trial solution is accepted if:
* it satisfies all constraints and provides a lower or equal objective
function value, while both the compared solutions are feasible
- or -
* it is feasible while the original solution is infeasible,
- or -
* it is infeasible, but provides a lower or equal constraint violation
for all constraint functions.
Have you investigated your constraints function to check that it is possible to create a feasible solution within the bounds?
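A quick way to investigate this is to draw random points within the bounds and count how many satisfy the constraint. The sketch below assumes the candidates are spread roughly uniformly in [-1, 1] (the actual initial population is generated by differential_evolution itself); the sample count and seed are arbitrary.

```python
import numpy as np


def cons_fun(x):
    return np.sum(x > 0)


rng = np.random.default_rng(0)
n_vars, limit, n_samples = 4000, 1000, 200
samples = rng.uniform(-1.0, 1.0, size=(n_samples, n_vars))

# Number of positive entries in each random candidate
counts = np.array([cons_fun(x) for x in samples])
n_feasible = int(np.sum(counts <= limit))
print(counts.min(), counts.max(), n_feasible)
```

With symmetric bounds about half of the 4000 entries are positive, so a random candidate has roughly 2000 positive elements and practically never satisfies the limit of 1000. In the 8-variable toy problem the count can never exceed 8, so every candidate is feasible. One option that might help is to pass a starting population that already satisfies the constraint through the init argument.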
I have a function (compErr) that takes an array of numbers (y) as input and returns an error (err) based on the input. I would like to minimize err. I have written a python script to do that. Here is the basic outline of my code.
from scipy.optimize import minimize
def compErr(y):
    ...error calculation...
    return err
#y0 is the initial value of y
res = minimize(compErr, y0, method='nelder-mead', options={'xtol': 1e-8, 'disp': True})
print(res.x)
This is minimizing the error.
Now suppose y has 7 elements, of which I want the first 3 to vary much less than the last four. Say the rate of change of the first 3 elements of y should be scaled down 100 times. How do I implement that?
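One possible approach, sketched under assumptions: Nelder-Mead accepts an initial_simplex option, so the starting simplex can be built with per-variable step sizes that are 100 times smaller for the first 3 elements. The comp_err function, y0 and the step values below are hypothetical stand-ins; the sketch uses the xatol option name, and it only sets the initial scale of the search rather than enforcing the ratio throughout.

```python
import numpy as np
from scipy.optimize import minimize


def comp_err(y):
    # Hypothetical stand-in for compErr
    target = np.array([1.0, 2.0, 3.0, 40.0, 50.0, 60.0, 70.0])
    return float(np.sum((y - target) ** 2))


y0 = np.array([1.1, 2.1, 3.1, 45.0, 55.0, 65.0, 75.0])
# Assumed step sizes: 100 times smaller for the first 3 elements
steps = np.array([0.01] * 3 + [1.0] * 4)

# Simplex vertices: y0 itself plus one vertex per variable, displaced by that variable's step
initial_simplex = np.vstack([y0, y0 + np.diag(steps)])

res = minimize(comp_err, y0, method='nelder-mead',
               options={'initial_simplex': initial_simplex, 'xatol': 1e-8})
print(res.x)
```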
Background: A ship is berthed to a jetty using 24 mooring lines and 4 fenders. These mooring lines need to be pre-tensioned to a design value by experienced engineers. Pre-tensioning is done by setting the appropriate length of each mooring line. A static simulation is run to obtain the tension in the lines and the compression of the fenders. This is an iterative process, as a small change in mooring line length may cause a significant variation in the tension.
Problem Description:
An objective function is set up to take the mooring line lengths as an input array and return the sum of absolute differences between target and achieved pretension values.
Now, I am using the scipy.optimize.minimize function with following options:
import numpy as np
from scipy.optimize import minimize

target_wire_lenghts = {'Line1': (48.0, 49.0), 'Line2': (48.0, 49.0), 'Line3': (45.0, 46.0),
                       'Line4': (10.0, 11.0), 'Line5': (8.0, 9.0), 'Line6': (7.0, 8.0),
                       'Line7': (46.0, 47.0), 'Line8': (48.0, 49.0), 'Line9': (50.0, 51.0),
                       'Line10': (33.0, 34.0), 'Line11': (31.0, 32.0), 'Line12': (29.0, 30.0),
                       'Line13': (32.0, 33.0), 'Line14': (34.0, 35.0), 'Line15': (36.0, 37.0),
                       'Line16': (48.0, 49.0), 'Line17': (46.0, 47.0), 'Line18': (45.0, 46.0),
                       'Line19': (8.0, 9.0), 'Line20': (8.0, 9.0), 'Line21': (9.0, 10.0),
                       'Line22': (44.0, 45.0), 'Line23': (45.0, 46.0), 'Line24': (46.0, 48.0)}
# Bounds
bounds = list(target_wire_lenghts.values())
# Initial guess: midpoint of each (lower, upper) pair
x0 = [np.mean([lo, hi]) for lo, hi in bounds]
# Options
options = {'ftol': 0.1,
           'xtol': 0.1,
           'gtol': 0.1,
           'maxiter': 100,
           'accuracy': 0.1}
result = minimize(objfn, x0, method='TNC', bounds=bounds, options=options)
print(result)
However, the optimizer is not varying the input array. The results are the same as the initial input array x0 (see the length column below). I tried playing around with the optional tolerance parameters of the 'TNC' solver, but do not see any improvement. Also, notice that even though I have set maxiter = 100, the iteration count went to 130.
Please suggest what mistake I am making when calling the minimize function.
EDIT: I figured out that the optimization was running, but changing the variables by only 0.000001 at a time. With the option eps (step size used for numerical approximation of the jacobian) set to 0.01, the optimization appeared to work. Unfortunately, it still was not able to reach a reasonable solution. I tried an unbounded optimization, with the initial guess x0 very close to the answer (which I found by manually altering each variable), and then the optimizer was able to give a better solution than my manual one.
So the question now is how to do a 24 variable optimization quickly with bad initial guess? Could multi objective optimization be the answer, where reaching each line pre-tension is an objective?
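For reference, the effect of eps described in the EDIT can be reproduced on a toy problem. The sketch below is an assumption-laden stand-in, not the mooring simulation: objfn only resolves length changes of about 1 mm and uses squared differences to keep the toy smooth, and the three bounds and the target are made up.

```python
import numpy as np
from scipy.optimize import minimize

target = np.array([48.2, 10.7, 33.4])  # made-up optimum inside the bounds
bounds = [(48.0, 49.0), (10.0, 11.0), (33.0, 34.0)]
x0 = np.array([np.mean(b) for b in bounds])


def objfn(x):
    # Stand-in for a simulation that only resolves changes of about 0.001
    return float(np.sum((np.round(x, 3) - target) ** 2))


# Default finite-difference step: too small for the objective to notice
res_default = minimize(objfn, x0, method='TNC', bounds=bounds)
# Larger step, as in the EDIT, so that the numerical gradient is informative
res_coarse = minimize(objfn, x0, method='TNC', bounds=bounds,
                      options={'eps': 0.01})
print(res_default.x, res_coarse.x)
```

With the default step the numerical gradient is typically zero here, so the optimizer has no reason to leave x0, whereas the larger step lets it move towards the target.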