I am very new to Pyomo and I am trying to implement the example given here, i.e.,
theta' = omega,
omega' = -b * omega - c * sin(theta)
with b(t) = 0.25, for 0<=t<15, b(t) = 0.025 for t>=15,
c(t) = 5, for 0<=t<7, c(t) = 50, for t>=7,
by modifying the time-varying coefficients in the following way:
b(t) = 0.25, for 0<=t<=14,
b(t) = 0.025 for t>=15, and b(t) = m*t for 14<t<15, a straight line with a slope m joining the two constant segments. In other words, b(t) is not piecewise constant anymore as in the example, but also linear in an interval.
I looked at the varying_inputs option given here but it can only be used for the piecewise constant profiles. The page also says, "time-varying inputs can be specified using a Pyomo Suffix. We currently only support piecewise constant profiles. For more complex inputs defined by a continuous function of time, we recommend adding an algebraic variable and constraint to your model"
How can I implement such a function b(t) in my problem? That is, how to add the linear part of the new b(t) in b_profile as defined in the link above?
(With this, my final goal is to calculate the time evolution of dependent variables and later, solve a parameter estimation problem)
Any comments would be much appreciated.
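As a starting point, the modified coefficient can be written as an ordinary function of time before wiring it into the model. The sketch below is only an illustration under one assumption: the linear piece is taken to join the two constant values, so it runs from 0.25 at t = 14 to 0.025 at t = 15. The names `T_BREAKS`, `B_VALUES` and `b_profile` are hypothetical and not part of Pyomo.

```python
import numpy as np

# Breakpoints of the assumed profile: constant, linear ramp, constant.
T_BREAKS = np.array([0.0, 14.0, 15.0])
B_VALUES = np.array([0.25, 0.25, 0.025])

def b_profile(t):
    """Piecewise-linear b(t); np.interp holds the end values outside the breakpoints."""
    return np.interp(t, T_BREAKS, B_VALUES)

# Slope of the connecting segment implied by the two constant levels.
slope = (B_VALUES[2] - B_VALUES[1]) / (T_BREAKS[2] - T_BREAKS[1])

print(b_profile(np.array([0.0, 14.0, 14.5, 15.0, 20.0])), slope)
```

Following the recommendation quoted above, the same expression would then be imposed on an algebraic variable through a constraint at each time point, instead of being passed through varying_inputs.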
I have this set of experimental data:
x_data = np.array([0, 2, 5, 10, 15, 30, 60, 120])
y_data = np.array([1.00, 0.71, 0.41, 0.31, 0.29, 0.36, 0.26, 0.35])
t = np.linspace(min(x_data), max(x_data), 151)
[Figure: scatter plot of the experimental data]
I want to fit them with a curve that follows an exponential behaviour for t < t_lim and a linear behaviour for t > t_lim, where t_lim is a value that I can set as I want. I want to use curve_fit to find the best fit that meets these two conditions:
The end point of the first behaviour (exponential) must be the starting point of the second behaviour (linear): in other words, I don't want the jump discontinuity in the middle.
I would like the second behaviour (linear) to be descending.
I solved in this way:
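import numpy as np
from scipy.optimize import curve_fit
t_lim = 15
def y(t, k, m, q):
    # exponential branch for t < t_lim, linear branch for t >= t_lim (t must be sorted)
    return np.concatenate((np.exp(-k*t)[t<t_lim], (m*t + q)[t>=t_lim]))
popt, pcov = curve_fit(y, x_data, y_data, p0=[0.5, -0.005, 0.005])
k_opt, m_opt, q_opt = popt
y_model = y(t, k_opt, m_opt, q_opt)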
I obtain this kind of curve:
[Figure: plot of the fitted curve]
I don't know how to tell Python to find the best values of m, k, q that meet the two conditions (no jump discontinuity, and m < 0).
Instead of trying to add these conditions as explicit constraints, I'd go about modifying the form of y so that these conditions are always satisfied.
For example, try replacing m with -m**2. That way, the coefficient in the linear part will always be negative.
For the continuity condition, how about this: for an exponential with a given decay factor and a linear curve with a given slope which are supposed to meet at a given t_lim, there is exactly one value of q that satisfies that condition. You can explicitly compute that value and just plug it in.
Basically, q won't be a fit parameter anymore; instead, inside of y, you'd compute the correct q value based on k, m, t_lim. Setting exp(-k*t_lim) equal to slope*t_lim + q gives q = exp(-k*t_lim) - slope*t_lim, where slope is the coefficient of the linear part.
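A minimal sketch of this reparametrization, using the data from the question. The two-parameter signature `y(t, k, m)`, the starting values in `p0` and the `maxfev` setting are assumptions made for illustration. Since the slope is constrained to be non-positive, the optimizer may push m towards zero, in which case curve_fit can warn that the covariance could not be estimated.

```python
import numpy as np
from scipy.optimize import curve_fit

x_data = np.array([0, 2, 5, 10, 15, 30, 60, 120], dtype=float)
y_data = np.array([1.00, 0.71, 0.41, 0.31, 0.29, 0.36, 0.26, 0.35])
t_lim = 15.0

def y(t, k, m):
    """Exponential for t < t_lim, then a line with slope -m**2 starting at exp(-k*t_lim)."""
    t = np.asarray(t, dtype=float)
    y_lim = np.exp(-k * t_lim)   # value shared by both branches at t_lim
    slope = -m**2                # never positive, whatever m is
    return np.where(t < t_lim, np.exp(-k * t), y_lim + slope * (t - t_lim))

# p0 is an arbitrary starting guess (assumption).
popt, pcov = curve_fit(y, x_data, y_data, p0=[0.5, 0.05], maxfev=5000)
k_opt, m_opt = popt

# Intercept of the linear part, recovered from the continuity condition.
q_opt = np.exp(-k_opt * t_lim) + m_opt**2 * t_lim
print(k_opt, -m_opt**2, q_opt)
```

Using np.where instead of np.concatenate also removes the requirement that t be sorted.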
This post is not a direct answer to the question. This is a preliminary study.
First : Fitting to a simple exponential function with only a constant (without decreasing or increasing linear part) :
The result is not bad considering the wide scatter on the right part.
Second : Fitting to an exponential function with a linear function (without taking account of the expected decreasing on the right).
The slope of the linear part is very low : 0.000361
But the slope is positive which is not as wanted.
Since the scatter is very large, one suspects that the slope of the linear function might be governed mainly by the scatter. To check this hypothesis, one repeats the same fitting calculation without one point. Taking only the first seven points (that is, dropping the eighth point), the result is :
Now the slope is negative as wanted. But this result is not trustworthy.
Of course, if some technical reason implies that the slope is necessarily negative, one could use a piecewise function made of an exponential and a linear function. But what is the credibility of such a model ?
This doesn't answer the question. Nevertheless I hope that this inspection will be of interest.
For information :
The usual nonlinear regression methods often fail to converge in case of large scatter, due to the difficulty of setting initial values of the parameters sufficiently close to the unknown correct values. To avoid this difficulty, the above fittings were made with an unusual method which doesn't require "guessed" initial values. For the principle refer to : https://fr.scribd.com/doc/14674814/Regressions-et-equations-integrales
In the referenced document the case of the exponential plus linear function isn't fully treated. To overcome this deficiency the method is shown below with the numerical calculation (Mathcad).
If more accuracy is needed, use nonlinear regression software with the values of p, a, b, c found above as initial values to start the iterative calculation.
I am using GEKKO for fitting purposes trying to optimise functions which are explicitly defined - so I have a fully functional form and can create equation objects for optimisation purposes.
But now I have a different problem.
I can't create equations because of the complex functional dependence.
But I have a Python function that calculates the output using some inputs: optimisation parameters and some others that can be interpreted as fixed or known.
The key points: I have the experimental data and a complex model that is described in f1(set_of_parameters), a Python function. f1 is nonlinear and nonconvex, and it can't be expressed as one simple equation: it has a lot of conditional parameters, a lot of branches, calls to other Python functions inside, etc.
So actually f1 can't be converted to a Gekko model equation.
And I need to find the parameters, set_of_optimal_parameters, that minimize the distance between f1(set_of_optimal_parameters) and the experimental data I have.
For each parameter of a set, I have initial values and boundaries and even some constraints.
So I need to do something like this:
m = GEKKO()
#parameters I need to find
param_1 = m.FV(value = val_1, lb = lb_1, ub=rb_1)
param_2 = m.FV(value = val_2, lb = lb_2, ub=rb_2)
...
param_n = m.FV(value = val_n, lb = lb_n, ub=rb_n)
#constructing the input for the function f1()
params_dataframe = ....()# some function that collects all the parameters and arranges all of them to a proper form to an input of f1()
#exp data description
x = m.Param(value = xData)
z = m.Param(value = yData)
y = m.Var()
#model description - is it possible to use another function inside an equation? because f1 is very complex with a lot of branches and options.. I don't really want to translate it into equation form..
m.Equation(
    y==f1(params_dataframe)
)
#add some constraints
min = m.Param(value=some_val_min)
m.Equation(min <= (param_1+param_2) / (param_1+param_2)**2)
# trying to solve and minimize the sum of squares
m.Minimize(((y-z))**2)
# Options for solver
param_1.STATUS = 1
param_2.STATUS = 1
...
param_n.STATUS = 1
m.options.IMODE = 2
m.options.SOLVER = 1
m.options.MAX_ITER = 1000
m.solve(disp=1)
Is it possible to use GEKKO this way, or is it not allowed? And why?
Gekko compiles equations into byte-code and requires all equations in Gekko format so that it can overload equation operators to provide exact first and second derivatives in sparse form. Black-box functions do not provide the necessary first and second derivatives, but they can provide function evaluations for finite differences (derivative approximations) or for surrogate functions.
To answer your question directly, you can't use f1(params) in a Gekko problem. If you need an optimizer to evaluate arbitrary black box functions, an optimizer such as scipy.optimize.minimize() is a good choice.
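A minimal sketch of the scipy.optimize.minimize route. The function `f1` below is a hypothetical stand-in with a single branch (the real f1 from the question is not shown), and the data, bounds and starting point are made up for illustration. The optimizer only needs function evaluations; gradients are approximated by finite differences.

```python
import numpy as np
from scipy.optimize import minimize

def f1(params, x):
    """Hypothetical black-box model with a branch; stands in for the asker's f1."""
    a, b = params
    return np.where(x < 2.0, a * x, a * 2.0 + b * (x - 2.0))

# Synthetic "experimental" data generated from known parameters (assumption).
x_data = np.linspace(0.0, 5.0, 21)
true_params = np.array([1.5, -0.5])
y_data = f1(true_params, x_data)

def sum_of_squares(params):
    """Distance between model output and data, as a plain Python callable."""
    return float(np.sum((f1(params, x_data) - y_data) ** 2))

result = minimize(
    sum_of_squares,
    x0=[1.0, 0.0],                       # initial values
    bounds=[(0.0, 5.0), (-2.0, 2.0)],    # lower/upper bounds per parameter
    method="L-BFGS-B",
)
print(result.x, result.fun)
```

Constraints that couple several parameters would need a method that accepts them, such as SLSQP or trust-constr, through the `constraints` argument.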
If you would still like to use Gekko, there are several options to build a surrogate model for f1 that has continuous first and second derivatives. The choice of surrogate model depends on the number of params:
1D: use cspline()
2D: use bspline()
3D+: use Machine learning such as Gaussian Processes, Neural Network, Linear Regression, etc.
Here is an example that creates a surrogate model for y=f(x) where f(x)=3*np.sin(x) - (x-3). This equation could be modeled directly in Gekko, but it serves as an example of creating a cspline() object that approximates the function and finds the minimum.
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
"""
minimize y
s.t. y = f(x)
using cubic spline with random sampling of data
"""
# function to generate data for cspline
def f(x):
    return 3*np.sin(x) - (x-3)
x_data = np.random.rand(50)*10+10
y_data = f(x_data)
c = GEKKO()
x = c.Var(value=np.random.rand(1)*10+10)
y = c.Var()
c.cspline(x,y,x_data,y_data,True)
c.Obj(y)
c.options.IMODE = 3
c.options.CSV_READ = 0
c.options.SOLVER = 3
c.solve(disp=True)
if c.options.SOLVESTATUS == 1:
    plt.figure()
    plt.scatter(x_data,y_data,5,'b')
    plt.scatter(x.value,y.value,200,'r','x')
else:
    print ('Failed!')
    print(x_data,y_data)
    plt.figure()
    plt.scatter(x_data,y_data,5,'b')
plt.show()
I know the library curve_fit of scipy and its power to fitting curves. I have read many examples here and in the documentation, but I cannot solve my problem.
For example, I have 10 files (chemical structures, but it does not matter) and ten experimental energy values. I have a function inside a class that calculates the theoretical energy of each structure for some parameters and returns a numpy array with the theoretical energy values.
I want to find the parameters that bring the theoretical values closest to the experimental ones. Here is a minimal example of my code.
This is the class function that reads the experimental energy files, extracts the correct substring and returns the values as a numpy array. The self.path is just the directory and self.nPoints = 10. It is not so important, but I provide it for the sake of completeness.
def experimentalValues(self):
    os.chdir(self.path)
    energy = np.zeros(self.nPoints)
    for i in range(1, self.nPoints):
        f = open("p_" + str(i + 1) + ".xyz", "r")
        energy[i] = float(f.readlines()[1].split()[1])
        f.close()
    os.chdir('..')
    return energy
I calculate the theoretical value with this class function that takes two numpy arrays as arguments, lets say
sigma = np.full(nSubstrate, 2.)
epsilon = np.full(nSubstrate, 0.15)
where nSubstrate = 9
Here there is the class function. It reads files and does two nested loops to calculate for each file the theoretical value and return it to a numpy array.
def theoreticalEnergy(self, epsilon, sigma):
    os.chdir(self.path)
    cE = np.zeros(self.nPoints)
    for n in range(0, self.nPoints):
        filenameXYZ = "p_" + str(n + 1) + "_extended.xyz"
        allCoordinates = np.loadtxt(filenameXYZ, skiprows = 0, usecols = (1, 2, 3))
        substrate = allCoordinates[0:self.nSubstrate]
        surface = allCoordinates[self.nSubstrate:]
        for i in range(0, substrate.shape[0]):
            positionAtomI = np.array(substrate[i][:])
            for j in range(0, surface.shape[0]):
                positionAtomJ = np.array(surface[j][:])
                distanceIJ = self.distance(positionAtomI, positionAtomJ)
                cE[n] += self.LennardJones(distanceIJ, epsilon[i], sigma[i])
    os.chdir('..')
    return cE
Again, for the sake of completeness the Lennard Jones class function is defined as
def LennardJones(self, distance, epsilon, sigma):
    repulsive = (sigma/distance) ** 12.
    attractive = (sigma/distance) ** 6.
    potential = 4. * epsilon* (repulsive - attractive)
    return potential
where in this case all the arguments are scalar as the return value.
To conclude the problem presentation I have 3 ingredients:
a numpy array with the experimental data
two numpy arrays with a guess for the parameters sigma and epsilon
a function that takes the last parameters and returns a numpy vector with the values to be fitted.
How can I solve this problem like the approach described in the documentation https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html?
Curve fitting
curve_fit fits a function f(w, x[i]) to points y[i] by finding the w that minimizes sum((f(w, x[i]) - y[i])**2 for i in range(n)). As you will read in the first line after the function definition
[It uses] non-linear least squares to fit a function, f, to data.
It refers to least_squares where it states
Given the residuals f(x) (an m-D real function of n real variables) and the loss function rho(s) (a scalar function), least_squares finds a local minimum of the cost function F(x):
F(x) = 0.5 * sum(rho(f_i(x)**2), i = 0, ..., m - 1), subject to lb <= x <= ub
Curve fitting is a kind of convex-cost multi-objective optimization. Since each individual cost is convex, you can add all of them and the result will still be a convex function. Notice that the decision variables (the parameters to be optimized) are the same at every point.
Your problem
In my understanding, for each energy level you have a different set of parameters. If you write it as a curve fitting problem, the objective function could be expressed as sum((f(w[i], x[i]) - y[i])**2 ...), where y[i] is determined by the energy level. Since each of the terms in the sum is independent of the other terms, this is equivalent to finding each group of parameters w[i] separately by minimizing (f(w[i], x[i]) - y[i])**2.
Convexity
Convexity is a very convenient property for optimization because it ensures that you will have only one minimum in the parameter space. I am not doing a detailed analysis but have reasonable doubts about the convexity of your energy function.
The Lennard Jones function is the difference of a repulsive and an attractive term, both with a negative even exponent on the distance; this alone is very unlikely to be convex.
The sum of multiple local functions centered at different positions has no defined convexity.
Molecular energy, or crystal energy, or protein folding are well known to be non-convex.
A few days ago (on a bike ride) I was thinking about this, how the molecules will be configured in a global minimum energy, and I was wondering if it finds that configuration so rapidly because of quantum tunneling effects.
Non-convex optimization
Non-convex (global) optimization is different from (non-linear) least squares in the sense that when a local minimum is found the process doesn't return immediately; it starts making new attempts in different regions of the search space. If the function is smooth you can still take advantage of a gradient-based local optimization method, but the problem is still NP-hard in general.
A classic global optimization method is simulated annealing; if you have a chemical background I think you will gain some insight reading about it. Once upon a time, simulated annealing was provided in scipy.optimize.
You will find a few global optimization methods in scipy.optimize. I would encourage you to try basin hopping, since it was successfully applied to similar problems, as you can read in the references.
I hope this puts you on the right track to your solution. But be aware that you will probably need to spend time learning how to use the function and will need to make some decisions. You will need to find a balance of accuracy, simplicity, and efficiency.
If you want a better solution, take the time to derive the gradient of the cost function (you can return two values, f and df, where df is the gradient of f with respect to the decision variables).
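A rough sketch of that last suggestion, combining basin hopping with an analytic gradient. It is deliberately simplified: a single (epsilon, sigma) pair is fitted to synthetic Lennard-Jones energies at made-up distances, rather than the per-atom arrays and file-based geometry of the question. The names `distances`, `energies` and `cost_and_gradient`, the bounds and the starting point are all assumptions.

```python
import numpy as np
from scipy.optimize import basinhopping

def lennard_jones(d, epsilon, sigma):
    r6 = (sigma / d) ** 6
    return 4.0 * epsilon * (r6 ** 2 - r6)

# Synthetic reference energies at hypothetical distances (assumption).
distances = np.linspace(2.0, 5.0, 10)
energies = lennard_jones(distances, 0.15, 2.0)

def cost_and_gradient(params):
    """Return the sum of squared residuals and its gradient w.r.t. (epsilon, sigma)."""
    epsilon, sigma = params
    r6 = (sigma / distances) ** 6
    r12 = r6 ** 2
    residual = 4.0 * epsilon * (r12 - r6) - energies
    d_eps = 4.0 * (r12 - r6)                                # dV/d(epsilon)
    d_sig = 4.0 * epsilon * (12.0 * r12 - 6.0 * r6) / sigma  # dV/d(sigma)
    cost = np.sum(residual ** 2)
    grad = 2.0 * np.array([np.sum(residual * d_eps), np.sum(residual * d_sig)])
    return cost, grad

x0 = np.array([0.1, 2.2])
result = basinhopping(
    cost_and_gradient,
    x0,
    niter=20,
    # jac=True tells the local minimizer that the callable returns (f, df).
    minimizer_kwargs={
        "method": "L-BFGS-B",
        "jac": True,
        "bounds": [(1e-3, 1.0), (1.0, 3.0)],
    },
)
print(result.x, result.fun)
```

Basin hopping uses random steps, so the result can vary between runs; the bounds keep the local minimizer away from sigma = 0, where the gradient expression is undefined.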
Let's assume a function
f(x,y) = z
Now I want to choose x so that the output of f matches real data, and y decreases in equidistant steps to zero starting from 1. The output is calculated in the function f by a set of differential equations.
How can I select x so that the error to the real outputs is as small as possible. Assuming I know a set of z - values, namely
f(x,1) = z_1
f(x,0.9) = z_2
f(x,0.8) = z_3
Now find x such that the error to the real data z_1, z_2, z_3 is minimal.
How can one do this?
A common method of optimizing is least squares fitting, in which you would basically try to find params such that the sum of squares, sum_i (f(params, xdata_i) - ydata_i)^2, is minimized for given xdata and ydata. In your case, params would be x, the xdata_i would be 1, 0.9 and 0.8, and the ydata_i would be z_1, z_2 and z_3.
You should consider the package scipy.optimize. It's used in finding parameters for a function. I think this page gives quite a good example on how to use it.
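A small sketch of how this could look with scipy.optimize.least_squares. The model `f` here is a hypothetical stand-in: it integrates dz/dt = -x*z from z(0) = y up to t = 1, which is only meant to mimic "the output is calculated by a set of differential equations". The data are synthetic, generated from an assumed true value of x.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.optimize import least_squares

def f(x, y):
    """Hypothetical model: integrate dz/dt = -x*z from z(0) = y to t = 1."""
    sol = solve_ivp(lambda t, z: -x * z, (0.0, 1.0), [y], rtol=1e-10, atol=1e-12)
    return sol.y[0, -1]

y_values = np.array([1.0, 0.9, 0.8])
x_true = 0.7  # assumed value used only to generate synthetic z data
z_data = np.array([f(x_true, y) for y in y_values])

def residuals(params):
    """One residual per known (y, z) pair."""
    x = params[0]
    return np.array([f(x, y) for y in y_values]) - z_data

# diff_step sets a finite-difference step well above the ODE solver's error level.
fit = least_squares(residuals, x0=[0.2], diff_step=1e-4)
x_fit = fit.x[0]
print(x_fit)
```

least_squares squares and sums the residuals internally, so the function passed to it returns the vector of differences, not the summed error.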
My question is how can I put a weighted least squares problem into a python solver. I'm trying to implement the approaches in the paper found here (PDF warning). There is an overview of the problem at the bottom of the post.
Specifically I want to start with the following minimization equation (19 in the paper):
latex formula can be found here:
\min_{\Theta \in M} \sum_{j=1}^{n} \sum_{i=1}^{m} w(i,j) \left| \Psi(i,j) \, \Theta(i,j) - I(i,j) \right|^{2}
It is represented as a weighted least squares problem.
w, psi, and I are my knowns, and I am trying to solve for theta.
I tried at first creating a function that takes a theta and returns the sum of this equation exactly as it's expressed above. Then I passed it to scipy.optimize.least_squares, but the theta values always remained the same after optimization. I tried implementing a jacobian, but the resulting sum explodes to huge negative values. It also takes ages as I'm attempting to run this on images (I is the pixel value for a pixel j with light i).
I then realized I'm almost certainly misunderstanding how to solve this problem and could use some help approaching it. My current code is below:
def theta_solver(self, theta):
    imshape = self.images.shape
    sm = 0
    for j in j_array:
        for i in i_array:
            w = self.get_w(i, j, theta)
            psi = self.non_diff_smoothing(self.get_psi(i, j))
            diff = psi*(theta[i, j]) - self.I[i, j]
            res = w*(diff)
            sm += res
    return sm

def solve_theta(self, theta_guess):
    res = scipy.optimize.least_squares(self.theta_solver, theta_guess)
Something tells me I'm way off base for how I'm approaching this problem, and I could use a finger in the right direction. Thanks for your time.
Problem overview:
This particular vision approach is called photometric stereo. By taking several images of a scene with different light sources, we can create a 3D reconstruction of that scene.
One issue is the 1/r^2 decay in lighting is dependent on distance from the light source, which means this can't be solved by normal linear solutions.
The approach documented in the paper is a nonlinear approach for solving near light photometric stereo. It does two things:
it solves the surface Z, and
the albedos/intensities at each pixel represented by theta, by alternating the solvers.
In this question I'm only trying to solve the theta element of the equation, which can be solved via weighted least squares.
Turns out I was heavily overthinking the problem. This can be decomposed to a simple linear solution of the form Ax = b. When looking at an error equation, in this case:
argmin(THETA) sum(W * ||PSI * THETA - I||^2)
we can just distribute the weight into the terms inside the squared norm. Because the weight multiplies a squared term, each side is scaled by the square root of the weight, and our equation ends up being:
sqrt(W) * PSI * THETA = sqrt(W) * I
Which we can solve in the least-squares sense using your favorite linear solver (e.g. conjugate gradient applied to the normal equations)
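To illustrate the idea on a toy problem, the sketch below solves a generic weighted linear least-squares system in two equivalent ways: by scaling the rows with the square root of the weights, and through the weighted normal equations. The shapes, the random data and the names `psi`, `intensity` and `w` are made up for the example and do not reproduce the image-sized problem from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 50, 3                                  # observations, unknowns (assumed sizes)
psi = rng.normal(size=(m, n))                 # known design matrix
theta_true = np.array([0.5, -1.0, 2.0])
intensity = psi @ theta_true + 0.01 * rng.normal(size=m)  # noisy observations
w = rng.uniform(0.1, 1.0, size=m)             # known positive weights

# Route 1: scale each row by sqrt(w) and solve an ordinary least-squares problem.
sqrt_w = np.sqrt(w)
theta_lstsq, *_ = np.linalg.lstsq(sqrt_w[:, None] * psi, sqrt_w * intensity, rcond=None)

# Route 2: weighted normal equations (PSI^T W PSI) theta = PSI^T W I.
lhs = psi.T @ (w[:, None] * psi)
rhs = psi.T @ (w * intensity)
theta_normal = np.linalg.solve(lhs, rhs)

print(theta_lstsq, theta_normal)
```

Both routes minimize sum(w * (psi @ theta - intensity)**2); for very large systems an iterative solver applied to the normal equations may be the more practical choice.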