I'm hoping to just get some assistance conceptually about how to solve a linear system of equations with penalty functions. Example code is at the bottom.
Let's say I'm trying trying to do a fit of this equation:
Ie=IaX+IbY+IcZ
where Ie, Ia, Ib, and Ic are constants, and X,Y,Z are variables
I could easily solve this system of equations using scipy.least_squares, but I want to constrain the system with 2 contraints.
1. X+Y+Z=1
2. X,Y,Z > 0 and X,Y,Z < 1
To do this, I then modified the above function.
Ie-Ic=X(Ia-Ic)+Y(Ib-Ic) where X+Y+Z=1 I solved for Z
therefore
Ax=B where A=[Ia-Ic,Ib-Ic] and B=[Ie-Ic] given bounds (0,1)
This solves the 2nd criteria of X,Y,Z > 0 and X,Y,Z < 1, but it does not solve the 1st.
To resolve the first issue, an additional constraint needs to be made, where X+Y<1, and this I don't quite know how to do.
So I presume least_squares has some built in penalty function for it's bounds. I.E.
chi^2=||A-Bx||^2+P
where P is the conditions, if X,Y,Z>0 then P = 10000
thus, giving high chi squared values and thus setting the constraints
I don't know how I can add a further condition. So if X+Y<1 then P=10000 or something similar along those lines.
In short, least_squares enables you to set bounds on the individual values, but I'd like to set some further constraints, and I don't quite know how to do this with least_squares. I've seen additional inequality constraint options in scipy.minimize, but I don't quite know how to apply that to a linear system of equations with the format Ax=B.
So as an example code, lets say I've already done the calculations and obtained my A matrix of constants and my B vector. I use least squares, get my values for X and Y, and can calculate Z since X+Y+Z=1. The issue here is in my minimization, I did not set a constraint that X+Y<1, so in some cases you can actually get values where X+Y>1. So I'd like to find a method where I can set that additional constraint, in addition to the bounds constraint for the individual variables:
Ax=np.array([[1,2],[2,4],[3,4]])
B=np.array([0,1,2])
solution=lsq_linear(Ax,B,lsq_solver='lsmr',bounds=(0,1))
X=solution.x[0]
Y=solution.x[1]
Z=1-sum(solution.x)
If minimize is the solution here, can you please show me how to set it up given the above matrix of A and array of B?
Any advice, tips, or help to point me in the right direction is greatly appreciated!
Edit:
So I found something similar on here: Minimizing Least Squares with Algebraic Constraints and Bounds
So I thought I'd apply it to my case, but I don't think I've been able to apply it properly.
Ax=np.array([[1,2],[2,4],[3,4]])
B=np.array([0,1,2])
def fun(x,a1,a2,y):
fun_output=x[0]*a1+x[1]*a2
return np.sum((fun_output-y)**2)
cons = [{"type": "eq", "fun": lambda x: x[0] + x[1] - 1}]
bnds = [(0, 1), (0, 1)]
xinit = np.array([1, 1])
solution=minimize(fun,args=(Ax[:,0],Ax[:,1], B), x0=xinit, bounds=bnds, constraints=cons)
solution_2=lsq_linear(Ax,B,bounds=(0,1))
print(solution.x)
print(solution_2.x)
Issue is, the output of this differs from lsq_linear, and I almost always get a very close to zero value for Z regardless of what the input arrays are. I don't know if I'm setting this up/understanding this correctly.
Your initial guess xinit is not feasible and doesn't satisfy your constraint.
IMO, solving the initial problem directly as a constrained nonlinear optimization problem (NLP) instead of rewriting it is the easier approach. Assuming you have all the data points Ia, Ib, Ic and Ie (you didn't provide all of them), you can use the following code snippet which is in the same vein as the linked answer of mine in your question.
from scipy.optimize import minimize
import numpy as np
def fun_to_fit(coeffs, *args):
x, y, z = coeffs
Ia, Ib, Ic = args
return Ia*x + Ib*y + Ic*z
def objective(coeffs, *args):
Ia, Ib, Ic, Ie = args
residual = Ie - fun_to_fit(coeffs, Ia, Ib, Ic)
return np.sum(residual**2)
# Constraint: x + y + z == 1
con = [{'type': 'eq', 'fun': lambda coeffs: np.sum(coeffs) - 1}]
# bounds
bounds = [(0, 1), (0, 1), (0, 1)]
# initial guess (fulfils the constraint and lies within the bounds)
x0 = np.array([0.25, 0.5, 0.25])
# your given data points
#Ia = np.array(...)
#Ib = np.array(...)
#Ic = np.array(...)
#Ie = np.array(...)
# solve the NLP
res = minimize(lambda coeffs: objective(coeffs, Ia, Ib, Ic, Ie), x0=x0, bounds=bounds, constraint=con)
I'm changing my programming language to julia due to it's better performance for numerical calculations and differential equations compared to python (I used mostly scipy solve_ivp and odeint but they turned out being too slow for my problems).
For testing purposes I did a simple translation from my old code in python to a new code in julia and I'm pretty sure both codes should have the same numerical results. The problem seems to appear when I try to use the ODE solver from julia and I can't figure out why (maybe because there is too many equations?)
My physics problem is a simulation of a beam of charged planes. For the integration over the time, I use a 2n-dimensional vector, with the [1:n] coordinates corresponding to the positions of the planes and the [n+1:2n] positions correspondind to the particles speeds. So the derivative function is 2n-dimensional as well and the [1:n] time derivatives corresponds to the [n+1:2n] coordinates of the initial vector and the [n+1:2n] derivatives corresponds to a positional term and an index term (see codes below).
Starting by my working code in python:
def deriv(t,y):
p= len(y)
dydt = np.zeros(int(p))
dydt[0:int(p/2)] = y[int(p/2):int(p)] #derivative of the positions
k = index(y[0:int(p/2)])
signal=abs(y[0:int(p/2)])/y[0:int(p/2)]
dydt[int(p/2):int(p)] = -y[0:int(p/2)] + ((np.ones(int(p/2)) + 2*k )/(p))*signal
#derivative of the speeds
deriv = dydt[0:int(p)]
return deriv
With the following integration
sol = solve_ivp(deriv, (time,time+ delta_t), initial_phase, rtol = 0.00001)
and the resulting graphic of the phase-space looks like this:
Few time expected evolution:
Then we have my code in Julia:
function dydt(du, u,p,t)
n = floor(Int,length(u)/2)
k = sortperm(abs.(u[1:n]))
k = k[k] #k ends up beeing the index equivalent
du[1:n] = u[n+1: 2n] #derivative of the positions
du[n+1: 2n] = (1/2n)*(-ones(n) + 2*k).*(abs.(u[1:n])./u[1:n])- (u[1:n])
#derivative of the speeds
end
With the following integration:
using DifferentialEquations
p = 0 #there is no p parameters in my code but they are defined in the tutorials anyway
H = 1.5*rand(2000)
L = (2/3)*(rand(2000) - 0.5*ones(2000))
D = append!(H,L) #Initial state, similar to the python example
tspan = (0.0,1.0)
prob = ODEProblem(dydt,D,tspan,p,reltol=1e-4,abstol=1e-4)
sol = solve(prob)
And the result is assimetrical as we can see below:
I hope anybody can help. I'll answer any asked major question about the physics envolved if necessary.
The problem at hand is optimization of multivariate function with nonlinear constraints.
There is a differential equation (in its oversimplified form)
dy/dx = y(x)*t(x) + g(x)
I need to minimize the solution of the DE y(x), but by varying the t(x).
Since it is physics under the hood, there are constraints on t(x). I successfully implemented all of them except one:
0 < t(x) < 1 for any x in range [a,b]
For certainty, the t(x) is a general polynomial:
t(x) = a0 + a1*x + a2*x**2 + a3*x**3 + a4*x**4 + a5*x**5
The x is fixed numpy.ndarray of floats and the optimization goes for coefficients a. I use scipy.optimize with trust-constr.
What I have tried so far:
Root finding at each step and determining the minimal/maximal value of the function using optimize.root and checking for sign changes. Return 0.5 if constraints are satisfied and numpy.inf or -1 or whatever not in [0;1] range if constraints are not satisfied. The optimizer stops soon and the function is not minimized properly.
Since x is fixed-length and known, I tried to define a constraint for each point, so I got N constraints where N = len(x). This works (at least look like) but takes forever for not-so large N. Also, since x is discrete and non-uniform, I can't be sure that there are no violated constraints for any x in [a,b].
EDIT #1: the minimal reproducible example
import scipy.optimize as optimize
from scipy.optimize import Bounds
import numpy as np
# some function y(x)
x = np.linspace(-np.pi,np.pi,100)
y = np.sin(x)
# polynomial t(z)
def t(a,z):
v = 0.0;
for ii in range(len(a)):
v += a[ii]*z**ii
return v
# let's minimize the sum
def targetFn(a):
return np.sum(y*t(a,x))
# polynomial order
polyord = 3
# simple bounds to have reliable results,
# otherwise the solution will grow toward +-infinity
bnd = 10.0
bounds = Bounds([-bnd for i in range(polyord+1)],
[bnd for i in range(polyord+1)])
res = optimize.minimize(targetFn, [1.0 for i in range(polyord+1)],
bounds = bounds)
if np.max(t(res.x,x))>200:
print('max constraint violated!')
if np.min(t(res.x,x))<-100:
print('min constraint violated!')
In the reproducible example given above, let the constraints to be that the value of the polynomial t(a,x) is in range [-100;200] for the given x.
So the question is: how does one properly define a constraint to tell the optimizer that the function's values must be constrained for the given range of arguments?
I'm finding CVXPY is randomly failing with the following error:
ArpackError: ARPACK error 3: No shifts could be applied during a cycle of the Implicitly restarted
Arnoldi iteration. One possibility is to increase the size of NCV relative to NEV.
The code below is a minimal example where it is just trying to do mean variance optimisation with no constraints, identity correlation matrix, and normally distributed mean vector. Roughly once in every thousand runs this fails. It doesn't seem to matter which solver I ask it to use, which makes me think it is failing setting up the problem?
import cvxpy as cp
import numpy as np
n = 199
np.random.seed(100)
mu = np.random.normal(size = n)
C = np.eye(n)
for repeat in range(1000):
x = cp.Variable(n)
mean = x.T # mu
variance = cp.quad_form(x, C)
objective = cp.Maximize(mean - variance)
constraints = []
prob = cp.Problem(objective, constraints)
result = prob.solve()
print(repeat, end = " ")
I am trying to solve the TISE for an infinite potential well V=0 on the interval [0,L]. The exercise gives us that the value of the wavefunction and its derivative at 0 is 0,1 respectively. This allows us to using the scipy.integrate.odeint function in order to solve the problem for a given energy value.
The task is to now find the energy eigenvalues given the further boundary condition that the wavefunction at L is 0, using a root finding function on python. I have done some research and could only find something called the 'shooting method' which I cannot figure out how to implement. Also, I have come across the solve BVP scipy function, however I can't seem to understand what exactly goes in the second input for this function (boundary condition residuals)
m_el = 9.1094e-31 # mass of electron in [kg]
hbar = 1.0546e-34 # Planck's constant over 2 pi [Js]
e_el = 1.6022e-19 # electron charge in [C]
L_bohr = 5.2918e-11 # Bohr radius [m]
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
def eqn(y, x, energy): #array of first order ODE's
y0 = y[1]
y1 = -2*m_el*energy*y[0]/hbar**2
return np.array([y0,y1])
def solve(energy, func): #use of odeint
p0 = 0
dp0 = 1
x = np.linspace(0,L_bohr,1000)
init = np.array([p0,dp0])
ysolve = odeint(func, init, x, args=(energy,))
return ysolve[-1,0]
The method here is to input eqn as func in solve(energy,func). L_bohr is the L value in this problem. We are trying to numerically find the energy eigenvalues using some scipy method
For all the other solvers in scipy the argument order x,y, and even in odeint one can use this order by giving the option tfirst=True. Thus change to
def eqn(x, y, energy): #array of first order ODE's
y0, y1 = y
y2 = -2*m_el*energy*y0/hbar**2
return [y1,y2]
For the BVP solver you have to think of the energy parameter as an
extra state component with zero derivative, thus adding a third slot
in the boundary conditions. Scipy's solve_bvp allows to keep it as parameter,
so that you get 3 slots in the boundary conditions, allowing to fix the first derivative at x=0 to select one non-trivial solution from the eigenspace.
def bc(y0, yL, E):
return [ y0[0], y0[1]-1, yL[0] ]
Next construct an initial state that is close to the suspected ground state and call the solver
x0 = np.linspace(0,L_bohr,6);
y0 = [ x0*(1-x0/L_bohr), 1-2*x0/L_bohr ]
E0 = 134*e_el
sol = solve_bvp(eqn, bc, x0, y0, p=[E0])
print(sol.message, " E=", sol.p[0]/e_el," eV")
and then produce the plot
x = np.linspace(0,L_bohr,1000)
plt.plot(x/L_bohr, sol.sol(x)[0]/L_bohr,'-+', ms=1)
plt.grid()
The algorithm converged to the desired accuracy. E= 134.29310361903723 eV