scipy.optimize.minimize results differ between Python 2.x and 3.x

Basically, I have a nonlinear constrained problem that I am solving with the SLSQP solver in scipy.optimize.minimize. Unfortunately, the same file with the same code returns different results on different computers (one Windows, one Linux). The scipy version is the same on both (1.2.1). Here is my code:
import numpy as np
from scipy.optimize import minimize
class OptimalAcc():
    def __init__(self, v0, tg, tr, D0, sgr, l, t0, a0, b0,
                 rho_t=0.5, rho_u=0.5, vM=15, vm=2.78, aM=2.5, am=-2.9):
        # Problem constants
        self.v0 = v0
        self.D0 = D0
        self.sgr = sgr
        self.l = l
        self.T = tg + tr
        self.D = tg / self.T
        self.t0 = t0
        self.a0 = a0
        self.b0 = b0
        self.rho_t = rho_t
        self.rho_u = rho_u
        self.vM = vM
        self.vm = vm
        self.aM = aM
        self.am = am

    def cost_fn(self, x):
        # Acceleration profile variables
        t = x[:1]
        a = x[1:2]
        b = x[2:3]

        # Objective function
        f = self.rho_t*x[:1] + self.rho_u*(a**2*t**3/3 +
                                           a*b*t**2 +
                                           b**2*t)
        return f

    def solve(self):
        # Inequality constraints
        ineq = {'type': 'ineq',
                'fun': lambda x: np.array([self.aM - x[2],
                                           x[2] - self.am,
                                           x[0],
                                           self.vM - (self.v0 + x[2]*x[0] + 0.5*x[1]*x[0]**2),
                                           self.v0 + x[2]*x[0] + 0.5*x[1]*x[0]**2 - self.vm,
                                           np.sin(np.pi*self.D - np.pi/2) -
                                           np.sin(2*np.pi*(x[0] - ((self.D0*self.T)/abs(self.sgr - 2)))/self.T + 3*np.pi/2 - np.pi*self.D)])}

        # Equality constraints
        eq = {'type': 'eq',
              'fun': lambda x: np.array([x[1]*x[0] + x[2],
                                         self.v0*x[0] + 0.5*x[2]*x[0]**2 + x[1]*x[0]**3/6 - self.l])}

        # Starting points
        x0 = np.array([self.t0, self.a0, self.b0])

        # Solve optimization problem
        res = minimize(self.cost_fn,
                       x0=x0,
                       constraints=[ineq, eq],
                       options={'disp': True})
        return res

if __name__ == "__main__":
    v0 = 1
    tg = 20
    tr = 20
    D0 = 1
    sgr = 1
    l = 70
    t0 = 10
    a0 = -0.1
    b0 = 1.5

    # Create instance of optimization problem class
    obj = OptimalAcc(v0, tg, tr, D0, sgr, l, t0, a0, b0)

    # Solve problem and return optimal profile
    u_t = obj.solve().x
    print('x_1:', u_t[0])
    print('x_2:', u_t[1])
    print('x_3:', u_t[2])
The Windows machine yields:
Optimization terminated successfully. (Exit mode 0)
Current function value: 8.696191258640086
Iterations: 7
Function evaluations: 35
Gradient evaluations: 7
x_1: 13.508645429307041
x_2: -0.06874922875473621
x_3: 0.9287089606820067
I believe these results are locally optimal and I can verify the same output with fmincon in MATLAB.
However, the Linux machine yields:
Positive directional derivative for linesearch (Exit mode 8)
Current function value: 14.4116342889
Iterations: 17
Function evaluations: 147
Gradient evaluations: 13
x_1: 7.65875894797259
x_2: -0.241800477348664
x_3: 2.5000000000000053
Clearly, the optimizer is getting stuck on the Linux machine. What could be causing this? My only guess is that there is some precision issue within numpy that is throwing off the numbers.

As discussed in the comments, the issue is most likely not related to Windows vs. Linux, but rather to Python 2 vs. Python 3. For example, the term
a**2*t**3/3
can behave differently in Python 2 and Python 3 when only integers are involved (there are more examples like this in your code).
An easy fix might be to include
from __future__ import division
at the top of your script, which would take care of the differences in how division is performed in Python 2 and Python 3.
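One concrete instance from the script above: with the integer inputs tg = 20 and tr = 20, the duty-cycle term self.D = tg / self.T is a pure integer division. A minimal sketch of the difference (the names mirror the constructor; nothing else is assumed):
tg, tr = 20, 20
T = tg + tr
D = tg / T
# Python 2: D == 0    (integer floor division)
# Python 3: D == 0.5  (true division)
# With 'from __future__ import division' (or float operands), Python 2 also gives 0.5:
D = float(tg) / T   # 0.5 on both
Since self.D enters the sine inequality constraint, getting 0 instead of 0.5 there is one place where the two interpreters can steer SLSQP toward different results.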

Related

Using the BFGS method to find roots of few equations

I am trying to use the BFGS method to find the roots of these equations.
ax[0]^2 - bx[1]^2
a = 35; b = 25; d = 15
import numpy as np
from scipy import optimize
def f(x):
    return a*x[0]^2 - b*x[1]^2

optimize.fmin_bfgs(f, [0.55, 0.65])
The output I am getting is,
Warning: Desired error not necessarily achieved due to precision loss.
Current function value: -2791745.308471
Iterations: 3
Function evaluations: 196
Gradient evaluations: 46
array([ 300.41455833, 2439.35586751])
The output is of course not desirable. I want to add two more equations and want the roots x[0], x[1], x[2]. Is it possible in BFGS, if so, how?
The two more equations are like,
b*x[2]^2 - x[1]^2 == 0
d *x[0]x[2](x[2] + x[0]) - x[1]^2 == 0
The BFGS algorithm tries to find a local minimum of the given function, as the method name fmin_bfgs indicates. You can use scipy.optimize.root to find the root of the function F: R^n -> R^n of n variables:
import numpy as np
from scipy.optimize import root

a = 35; b = 25; d = 15

def F(x):
    return np.array([a*x[0]**2 - b*x[1]**2, 0])

# res.x contains your root
res = root(F, x0=np.ones(2))
In order to solve a*x[0]**2 - b*x[1]**2 == 0 we added the equation 0 == 0, since root expects 2 equations for a function of 2 variables. When adding your two other equations, we have a function of three variables, i.e:
def F(x):
    eq1 = a*x[0]**2 - b*x[1]**2
    eq2 = b*x[2]**2 - x[1]**2
    eq3 = d*x[0]*x[2]*(x[2] + x[0]) - x[1]**2
    return np.array([eq1, eq2, eq3])

# res.x contains your root
res = root(F, x0=np.ones(3))
Note also that in Python the ^ operator denotes bitwise XOR; use x[0]**2 to square x[0].
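A quick illustration of that pitfall, plus a sanity check of the returned solution using the standard OptimizeResult fields (nothing here is specific to this problem):
print(3 ^ 2)    # 1  -> bitwise XOR of 0b11 and 0b10
print(3 ** 2)   # 9  -> exponentiation

# After calling root(), res.success and the residual F(res.x) confirm
# whether a root was actually found:
print(res.success)
print(F(res.x))   # should be close to [0, 0, 0]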

Inaccurate scipy minimize with bounds

I am very new to Python, and I am trying to minimize a function, but the result seems inaccurate, or at least the difference from the MATLAB result is too big.
My two questions are:
(1) Am I right to assume the difference in the results comes from an inaccurate Python solution?
(I believe so because for example change(c1)/change(y1) is constant in Matlab as I believe it should be, while it changes quite a bit in Python.)
(2) What can I do to improve the accuracy of the Python result?
I have already tried other methods (TNC, L-BFGS-B), providing an analytical gradient or a more accurate numerical gradient, and other routines (minimize_scalar with method='Bounded'), but they all give pretty much the same result.
Here is my code:
import numpy as np
from scipy.optimize import minimize
from scipy.optimize import Bounds

# need to define nu
def ut_fun(CC):
    if nu != 1:
        UU = (CC ** (1-nu) - 1) / (1-nu)
    else:
        UU = np.log(CC)
    return UU

# need to define RR, A1, y1, y2_L, y2_H, gamma, beta, bb
def obj_2per(c1):
    A2 = RR*A1 + y1 - c1
    c2_L = RR*A2 + y2_L
    U_L = ut_fun(c2_L)
    c2_H = RR*A2 + y2_H
    U_H = ut_fun(c2_H)
    EV2 = gamma*U_L + (1-gamma)*U_H
    mVV = -(ut_fun(c1) + beta*EV2)
    return mVV

nu = 2
A1 = 0
bb = 0
beta = 0.96
RR = 1/beta
y2_L = 0.1
y2_H = 0.2
gamma = 0.5

# Pre-allocation in np arrays
y1_vec = np.linspace(0.04, 0.4, 10)
c1_star = np.zeros(len(y1_vec))

# Actual optimization:
c1_0 = 0.01
for ii in range(len(y1_vec)):
    y1 = y1_vec[ii]
    ub = RR*A1 + y1 - bb
    bnds = [(-np.inf, ub)]
    sol = minimize(obj_2per, c1_0, method='trust-constr', bounds=bnds)
    c1_star[ii] = float(sol.x)
    c1_0 = c1_star[ii]
print(c1_star)
The Python result is:
[0.03999284 0.07995512 0.11997128
0.14458588 0.16599669 0.18724888
0.20837178 0.22939139 0.25032751
0.27119543]
The Matlab result is:
0.0399997050892807 0.0799994508682207 0.119999719341015 0.153878407968280 0.174286891630529 0.194695468467231 0.215103764323911 0.235511996564921 0.255920191410148 0.276328383256344
The difference in results from the fourth entry onwards is too large.
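For reference, the analytical gradient mentioned above can be written out by hand; a sketch, assuming the same global constants as in the script and using ut_fun'(C) = C**(-nu) together with dA2/dc1 = -1:
def obj_2per_grad(c1):
    # d(mVV)/dc1: the chain rule gives dc2_L/dc1 = dc2_H/dc1 = -RR
    A2 = RR*A1 + y1 - c1
    c2_L = RR*A2 + y2_L
    c2_H = RR*A2 + y2_H
    return -c1**(-nu) + beta*RR*(gamma*c2_L**(-nu) + (1-gamma)*c2_H**(-nu))

# It can be passed to minimize via the jac argument, e.g.
# sol = minimize(obj_2per, c1_0, jac=obj_2per_grad,
#                method='trust-constr', bounds=bnds)
Whether or not this changes the answer, it removes one source of numerical error when comparing against the MATLAB result.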

Have I implemented Milstein's method/Euler-Maruyama correctly?

I have a stochastic differential equation (SDE) that I am trying to solve using Milstein's method, but I am getting results that disagree with experiment.
The SDE is
[equation image missing]
which I have broken up into two first-order equations:
eq1: [equation image missing]
eq2: [equation image missing]
Then I have used the Itô form:
[equation image missing]
so that for eq1 and eq2 the drift and diffusion coefficients are the a_p, b_p and a_q functions used in the code below.
My Python code used to attempt to solve this is as follows:
import numpy as np
import scipy.constants

# set constants from real data
Gamma0 = 4000            # defines environmental damping
Omega0 = 75e3*2*np.pi    # defines the angular frequency of the motion
eta = 0                  # eta = 0 => no effect from the non-linear p*q**2 term
T_0 = 300                # temperature of the environment
k_b = scipy.constants.Boltzmann
m = 3.1e-19              # mass of oscillator

# set a and b functions for these 2 equations
def a_p(t, p, q):
    return -(Gamma0 - Omega0*eta*q**2)*p

def b_p(t, p, q):
    return np.sqrt(2*Gamma0*k_b*T_0/m)

def a_q(t, p, q):
    return p

# generate time data
dt = 10e-11
tArray = np.arange(0, 200e-6, dt)

# initialise q and p arrays and set initial conditions to 0, 0
q0 = 0
p0 = 0
q = np.zeros_like(tArray)
p = np.zeros_like(tArray)
q[0] = q0
p[0] = p0

# generate normally distributed random numbers: independent and identically
# distributed normal random variables with expected value 0 and variance dt
dwArray = np.random.normal(0, np.sqrt(dt), len(tArray))

# iterate through, implementing Milstein's method
# (technically Euler-Maruyama, since b' = 0)
for n, t in enumerate(tArray[:-1]):
    dw = dwArray[n]
    p[n+1] = p[n] + a_p(t, p[n], q[n])*dt + b_p(t, p[n], q[n])*dw + 0
    q[n+1] = q[n] + a_q(t, p[n], q[n])*dt + 0
Where in this case p is velocity and q is position.
I then get the following plots of q and p:
I expected the resulting plot of position to look something like the following, which I get from experimental data (from which the constants used in the model are determined):
Have I implemented Milstein's method correctly?
If I have, what else might be wrong with my process of solving the SDE that is causing this disagreement with the experiment?
You missed a term in the drift coefficient; note that to the right of dp there are two dt terms. Thus
def a_p(t, p, q):
    return -(Gamma0 - Omega0*eta*q**2)*p - Omega0**2*q
which is actually the part that makes the oscillator an oscillator. With that corrected, the solution looks like
And no, you did not implement the Milstein method, as there are no derivatives of b_p, which are what distinguish Milstein from Euler-Maruyama; the missing term is +0.5*b'(X)*b(X)*(dW**2 - dt).
There is also a derivative-free version of Milstein's method, as a two-stage kind of Runge-Kutta method, documented on Wikipedia or in the original on arxiv.org (PDF).
The step there is (vector based, duplicate into X=[p,q], K1=[k1_p,k1_q] etc. to be close to your conventions)
S = random_choice_of ([-1,1])
K1 = a(X )*dt + b(X )*(dW - S*sqrt(dt))
Xh = X + K1
K2 = a(Xh)*dt + b(Xh)*(dW + S*sqrt(dt))
X = X + 0.5 * (K1+K2)
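A minimal Python sketch of that two-stage step for this (p, q) system, reusing tArray, dwArray, dt, p and q from the question's script, the corrected a_p together with b_p and a_q from above, and assuming b_q = 0 (no noise acts directly on q):
def b_q(t, p, q):
    return 0.0   # assumed: the noise only enters the p equation

for n, t in enumerate(tArray[:-1]):
    dw = dwArray[n]
    S = np.random.choice([-1.0, 1.0])
    sq = np.sqrt(dt)

    # first stage, evaluated at X = (p[n], q[n])
    k1_p = a_p(t, p[n], q[n])*dt + b_p(t, p[n], q[n])*(dw - S*sq)
    k1_q = a_q(t, p[n], q[n])*dt + b_q(t, p[n], q[n])*(dw - S*sq)

    # second stage, evaluated at Xh = X + K1
    ph, qh = p[n] + k1_p, q[n] + k1_q
    k2_p = a_p(t, ph, qh)*dt + b_p(t, ph, qh)*(dw + S*sq)
    k2_q = a_q(t, ph, qh)*dt + b_q(t, ph, qh)*(dw + S*sq)

    p[n+1] = p[n] + 0.5*(k1_p + k2_p)
    q[n+1] = q[n] + 0.5*(k1_q + k2_q)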

Stiff ODE-solver

I need an ODE-solver for a stiff problem similar to MATLAB ode15s.
For my problem I need to check how many steps (calculations) are needed for different initial values and compare this to my own ODE-solver.
I tried using
solver = scipy.integrate.ode(f)
solver.set_integrator('vode', method='bdf', order=15, nsteps=3000)
solver.set_initial_value(u0, t0)
And then integrating with:
i = 0
while solver.successful() and solver.t < tf:
    solver.integrate(tf, step=True)
    i += 1
print(i)
Where tf is the end of my time interval.
The function used is defined as:
def func(self, t, u):
    u1 = u[1]
    u2 = mu * (1 - numpy.dot(u[0], u[0]))*u[1] - u[0]
    return numpy.array([u1, u2])
Which with the initial value u0 = [ 2, 0] is a stiff problem.
This means that the number of steps should not depend on my constant mu.
But it does.
I think the odeint method can solve this as a stiff problem - but then I have to pass in the whole t-vector, and therefore need to set the number of steps in advance, which ruins the point of my assignment.
Is there any way to use odeint with an adaptive step size between t0 and tf?
Or can you see anything I miss in the use of the vode-integrator?
I'm seeing something similar; with the 'vode' solver, changing methods between 'adams' and 'bdf' doesn't change the number of steps by very much. (By the way, there is no point in using order=15; the maximum order of the 'bdf' method of the 'vode' solver is 5 (and the maximum order of the 'adams' solver is 12). If you leave the argument out, it should use the maximum by default.)
odeint is a wrapper of LSODA. ode also provides a wrapper of LSODA: change 'vode' to 'lsoda'. Unfortunately the 'lsoda' solver ignores the step=True argument of the integrate method.
The 'lsoda' solver does much better than 'vode' with method='bdf'. You can get an upper bound on the number of steps that were used by initializing tvals = [] and, in func, doing tvals.append(t). When the solver completes, set tvals = np.unique(tvals). The length of tvals tells you the number of time values at which your function was evaluated. This is not exactly what you want, but it does show a huge difference between using the 'lsoda' solver and the 'vode' solver with method 'bdf'. The number of steps used by the 'lsoda' solver is on the same order as you quoted for MATLAB in your comment. (I used mu=10000, tf=10.)
Update: It turns out that, at least for a stiff problem, it makes a huge difference for the 'vode' solver if you provide a function to compute the Jacobian matrix.
The script below runs the 'vode' solver with both methods, and it
runs the 'lsoda' solver. In each case, it runs the solver with and without the Jacobian function. Here's the output it generates:
vode adams jac=None len(tvals) = 517992
vode adams jac=jac len(tvals) = 195
vode bdf jac=None len(tvals) = 516284
vode bdf jac=jac len(tvals) = 55
lsoda jac=None len(tvals) = 49
lsoda jac=jac len(tvals) = 49
The script:
from __future__ import print_function

import numpy as np
from scipy.integrate import ode

def func(t, u, mu):
    tvals.append(t)
    u1 = u[1]
    u2 = mu*(1 - u[0]*u[0])*u[1] - u[0]
    return np.array([u1, u2])

def jac(t, u, mu):
    j = np.empty((2, 2))
    j[0, 0] = 0.0
    j[0, 1] = 1.0
    j[1, 0] = -mu*2*u[0]*u[1] - 1
    j[1, 1] = mu*(1 - u[0]*u[0])
    return j

mu = 10000.0
u0 = [2, 0]
t0 = 0.0
tf = 10

for name, kwargs in [('vode', dict(method='adams')),
                     ('vode', dict(method='bdf')),
                     ('lsoda', {})]:
    for j in [None, jac]:
        solver = ode(func, jac=j)
        solver.set_integrator(name, atol=1e-8, rtol=1e-6, **kwargs)
        solver.set_f_params(mu)
        solver.set_jac_params(mu)
        solver.set_initial_value(u0, t0)

        tvals = []
        i = 0
        while solver.successful() and solver.t < tf:
            solver.integrate(tf, step=True)
            i += 1

        print("%-6s %-8s jac=%-5s " %
              (name, kwargs.get('method', ''), j.__name__ if j else None),
              end='')
        tvals = np.unique(tvals)
        print("len(tvals) =", len(tvals))

algebraic constraint to terminate ODE integration with scipy

I'm using Scipy 14.0 to solve a system of ordinary differential equations describing the dynamics of a gas bubble rising vertically (in the z direction) in a quiescent fluid because of buoyancy forces. In particular, I have an equation expressing the rise velocity U as a function of bubble radius R, i.e. U = dz/dt = f(R), and one expressing the radius variation as a function of R and U, i.e. dR/dt = f(R, U). Everything else appearing in the code below is a material property.
I'd like to account for the physical constraint on z, which is obviously limited by the liquid height H. I therefore implemented a sort of z <= H constraint in order to stop integration early if needed, using set_solout. The code runs and gives good results, but set_solout is not working at all (it seems z_constraint is never actually called...). Do you know why?
Does anybody have a cleverer idea, perhaps one that would interrupt integration exactly when z = H (i.e. a final value problem)? Is this the right way/tool, or should I reformulate the problem?
Thanks in advance,
Emi
from scipy.integrate import ode

Db0 = 0.001                      # init bubble radius
y0, t0 = [Db0/2, 0.], 0.         # init conditions
H = 1

def y_(t, y, g, p0, rho_g, mi_g, sig_g, H):
    R = y[0]
    z = y[1]
    z_ = (R**2 * g * rho_g) / (3*mi_g)                                  # velocity
    R_ = (R/3 * g * rho_g * z_) / (p0 + rho_g*g*(H-z) + 4/3*sig_g/R)    # R dynamics
    return [R_, z_]

def z_constraint(t, y):
    H = 1   # should rather be a variable..
    z = y[1]
    if z >= H:
        flag = -1
    else:
        flag = 0
    return flag

r = ode(y_)
r.set_integrator('dopri5')
r.set_initial_value(y0, t0)
r.set_f_params(g, 5*1e5, 2000, 40, 0.31, H)
r.set_solout(z_constraint)

t1 = 6
dt = 0.1
while r.successful() and r.t < t1:
    r.integrate(r.t + dt)
You're running into this issue. For set_solout to work correctly, it must be called right after set_integrator, before set_initial_value. If you introduce this modification into your code (and set a value for g), integration will terminate when z >= H, as you want.
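A minimal sketch of the corrected call order (g is set to a placeholder value here, since the original snippet never defines it):
g = 9.81   # placeholder value; not defined in the original snippet

r = ode(y_)
r.set_integrator('dopri5')
r.set_solout(z_constraint)           # must be called before set_initial_value
r.set_initial_value(y0, t0)
r.set_f_params(g, 5*1e5, 2000, 40, 0.31, H)

t1 = 6
dt = 0.1
while r.successful() and r.t < t1:
    r.integrate(r.t + dt)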
To find the exact time when the bubble reached the surface, you can make a change of variables after the integration is terminated by solout and integrate back with respect to z (rather than t) to z = H. A paper that describes the technique is M. Henon, Physica 5D, 412 (1982); you may also find this discussion helpful. Here's a very simple example in which the time t such that y(t) = 0.5 is found, given dy/dt = -y:
import numpy as np
from scipy.integrate import ode

def f(t, y):
    """Exponential decay: dy/dt = -y."""
    return -y

def solout(t, y):
    if y[0] < 0.5:
        return -1
    else:
        return 0

y_initial = 1
t_initial = 0

r = ode(f).set_integrator('dopri5')
r.set_solout(solout)
r.set_initial_value(y_initial, t_initial)

# Integrate until the solout constraint is violated
r.integrate(2)

# New system with y as the independent variable and t as the dependent one:
# see Henon's paper for details.
def g(y, t):
    return -1.0/y

r2 = ode(g).set_integrator('dopri5')
r2.set_initial_value(r.t, r.y)
r2.integrate(0.5)

y_final = r2.t
t_final = r2.y

# Error: difference between found and analytical solution
print(t_final - np.log(2))
