I have a constrained optimization problem where I am trying to minimize an objective function of 100+ variables which is of the form
Min F(x) = f(x1) + f(x2) + ... + f(xn)
Subject to functional constraint
(g(x1) + g(x2) + ... + g(xn))/(f(x1) + f(x2) + ... + f(xn)) - constant >= 0
I also have individual bounds for each variable x1, x2, x3...xn
a <= x1 <= b
c <= x2 <= d
...
For this, I wrote a Python script using scipy.optimize.minimize with constraints and bounds, but the solutions it returns do not satisfy my bounds and constraints. This happens even in cases where the optimizer reports convergence (message: success).
Here is a sample of my code:
df is my pandas dataset
B(x) is LogNorm transform based on x and other constants
Values U, c, lb, ub are pre-calculated constant dictionaries for each index in df
import pandas as pd
import scipy.optimize

df = pd.DataFrame(..)
k = set(df.index.values) ## list of indexes to iterate on
val = 0.25 ## Arbitrary

def obj(x):
    fn = 0
    for n, i in enumerate(k):
        x0 = x[n]
        fn1 = U[i] * B(x0) * x0
        fn += fn1
    return fn

def cons(x):
    c1 = 0
    c2 = 0
    for n, i in enumerate(k):
        x0 = x[n]
        c1 += U[i] * B(x0) * (x0 - c[i])
        c2 += U[i] * B(x0) * x0
    cn = c1 / c2
    return cn - val

const = [{'type': 'ineq', 'fun': cons}]
bnds = tuple((lb[i], ub[i]) for i in k) ## Lower, Upper for each element ((lb1, ub1), (lb2, ub2)...)
x_init = [lb[i] for i in k] ## for eg. starting from lower bound
## Solution
sol = scipy.optimize.minimize(obj, x_init, method='COBYLA', bounds=bnds, constraints=const)
I have more pointed questions if that helps:
Is there a way to construct the same equation concisely/ without the use of loops (given the number of variables could depend on input data and I have no control over it)?
Is there any noticeable issue in my application of bounds? I can't seem to get the final values of all variables to follow their individual bounds.
Similarly, is there a visible flaw in the construction of the constraint equation? My results often DO NOT satisfy the constraint in repeated runs with different inputs.
Any help with either of the questions can help me progress further at work.
I have also looked into a Lagrangian solution of the same but so far I am unable to solve it for undefined number of (n) variables.
Thanks!
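On the first question (avoiding the loops), here is a minimal sketch of how the same objective and constraint could be written with NumPy arrays. Everything numeric in it is made up for illustration: the arrays U, c, lb, ub stand in for the pre-calculated dictionaries, and B is a hypothetical placeholder (a lognormal survival function), not the actual LogNorm transform. SLSQP is used because it accepts both bounds and inequality constraints; whether COBYLA honours the bounds argument depends on the SciPy version, so that is worth checking separately.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import lognorm

# Hypothetical stand-ins for the pre-calculated dictionaries in the question,
# stored as arrays aligned with the order of the variables.
rng = np.random.default_rng(0)
n = 6
U = rng.uniform(1.0, 2.0, n)
lb = np.full(n, 1.0)
ub = np.full(n, 3.0)
c = 0.6 * lb
val = 0.25

def B(x):
    # Placeholder transform (assumption), evaluated element-wise on the whole array.
    return lognorm.sf(x, s=0.5)

def obj(x):
    # Sum of U_i * B(x_i) * x_i without an explicit loop.
    return np.sum(U * B(x) * x)

def cons(x):
    # Ratio constraint in the form "value >= 0" expected by type 'ineq'.
    w = U * B(x)
    return np.sum(w * (x - c)) / np.sum(w * x) - val

res = minimize(
    obj,
    x0=0.5 * (lb + ub),  # start inside the box (an illustrative choice)
    method="SLSQP",
    bounds=list(zip(lb, ub)),
    constraints=[{"type": "ineq", "fun": cons}],
)

# Check feasibility explicitly instead of relying on the status message alone.
within_bounds = bool(np.all((res.x >= lb - 1e-8) & (res.x <= ub + 1e-8)))
print(res.success, within_bounds, cons(res.x))
```

Because the functions work on whole arrays, the number of variables is simply the length of the arrays and can change with the input data.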
I have a mathematical model of differential equations that begins as linear and then uses correctional coefficients after reaching a certain value (1).
Currently, I solve the linear function independently, find out where the array goes from less than 1 to greater than 1, and then use that value from the array as the new initial condition. I also correct the time scale.
import numpy as np
from scipy.integrate import odeint

def vttmodel_linear(m, t, tm, tv, M_max):
    n = 1/(7*tm)
    dMdt = n
    return dMdt

M_0 = 0
M_max = 1 + 7*((RH_crit-RH)/(RH_crit-100)) - 2*np.square((RH_crit-RH)/(RH_crit-100))
print(M_max)
# tm = days
# M = weeks so 7*tm
t = np.arange(0,104+1)
tm = np.exp(-0.68*np.log(T) - 13.9*np.log(RH) + 0.14*W - 0.33*SQ + 66.02)
tv = np.exp(-0.74*np.log(T) - 12.72*np.log(RH) + 0.06*W + 61.50)
m = odeint(vttmodel_linear, M_0, t, args=(tm,tv,M_max))
M_0 = m[(np.where(m>1)[0][0])-1]
t = np.where(m>1)[0]
Then I use the new initial condition, M_0 and the updated time scale to solve the non-linear portion of the model.
def vttmodel(M, t, tm, tv, M_max):
    n = 1/(7*tm)
    k1 = 2/((tv/tm)-1)
    k2 = np.max([1-np.exp(2.3*(M-M_max)), 0])
    dMdt = n*k1*k2
    return dMdt

M = odeint(vttmodel, M_0, t, args=(tm,tv,M_max))
I then splice the arrays m and M at the location I found earlier and graph the result.
I would like to find a simpler way to do this. I have tried using if statements within the odeint function and also a while loop when calling the two functions, but have not had any luck interrupting the odeint function. Suggestions would be helpful. Thank you.
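One way this could be simplified is to put the switch inside a single right-hand-side function, so that one odeint call covers both regimes and no splicing is needed. The sketch below is only an illustration: the name vttmodel_combined and the values of tm, tv and M_max are hypothetical, chosen so the example runs on its own rather than computed from T, RH, W and SQ.

```python
import numpy as np
from scipy.integrate import odeint

def vttmodel_combined(M, t, tm, tv, M_max):
    # odeint passes the state as a length-1 array; work with a plain float.
    M = float(np.atleast_1d(M)[0])
    n = 1.0 / (7.0 * tm)
    if M < 1.0:
        # Linear regime: constant growth rate.
        return n
    # Corrected regime once M has reached 1.
    k1 = 2.0 / ((tv / tm) - 1.0)
    k2 = max(1.0 - np.exp(2.3 * (M - M_max)), 0.0)
    return n * k1 * k2

# Hypothetical parameter values, for illustration only.
tm, tv, M_max = 2.0, 6.0, 4.0
t = np.arange(0, 104 + 1)
M = odeint(vttmodel_combined, 0.0, t, args=(tm, tv, M_max))[:, 0]
print(M[:3], M[-1])
```

The right-hand side is discontinuous at M = 1, which the adaptive integrator has to step across. If the exact switching time matters, scipy.integrate.solve_ivp with its events argument is an alternative worth looking at.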
I have this kind of data :
import random
data=random.sample(range(1, 100), 5)
x= [-1,1,1,-1,1]
def f(x, data):
    prod = [a * b for a, b in zip(data, x)]
    result = abs(sum(prod))
    return result
I would like to find the best x, composed only of -1 and 1, that minimizes the value of f(x).
Maybe we can use scipy.optimize.minimize(), but how can we add the restriction to -1 or 1 as a constraint on the values inside x?
Does somebody have an idea?
You want to solve a mixed-integer linear programming problem (MILP), which wasn't supported by scipy.optimize when this was written (SciPy 1.9 and later provide scipy.optimize.milp).
However, you can use a modelling package like PuLP to formulate your MILP and pass it to a MILP solver. Note that your MIP can be formulated as
(P)
min |f(x)| = |d_0 * x_0 + ... + d_n * x_n|
s.t. x_i ∈ {-1, 1} ∀ i = 0,...,n
which is the same as
(P')
min |f(x)| = |d_0 * (2*x_0 - 1) + ... + d_n * (2*x_n - 1)|
s.t. x_i ∈ {0, 1} ∀ i = 0,...,n
and can be implemented like this
min abs_obj
s.t. f(x) <= abs_obj
f(x) >= -1.0*abs_obj
x_i ∈ {0, 1} ∀ i = 0,...,n
In code:
import pulp
import random
data = random.sample(range(1, 100), 5)
# pulp model
mdl = pulp.LpProblem("our_model", sense=pulp.LpMinimize)
# the binary variables x
x = pulp.LpVariable.dicts("x", range(5), cat="Binary")
# the variable that stores the absolute value of the objective
abs_obj = pulp.LpVariable("abs_obj")
# set the MIP objective
mdl += abs_obj
# Define the objective: |f(x)| = abs_obj
mdl += pulp.lpSum((2 * x[i] - 1) * data[i] for i in range(5)) <= abs_obj
mdl += pulp.lpSum((2 * x[i] - 1) * data[i] for i in range(5)) >= -1.0*abs_obj
# solve the problem
mdl.solve()
# your solution
signs = [1 if var.varValue > 0 else -1 for var in x.values()]
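For inputs as small as the five-element example, the solver result can be cross-checked by plain exhaustive enumeration of all sign vectors. This is a different technique from the MILP formulation (it scales as 2^n and is only practical for small n); the helper name best_signs and the sample data are assumptions made for this sketch.

```python
from itertools import product

def f(x, data):
    # Same objective as in the question: |sum(data_i * x_i)|.
    return abs(sum(a * b for a, b in zip(data, x)))

def best_signs(data):
    # Try every vector in {-1, 1}^n and keep one with the smallest |f|.
    return min(product((-1, 1), repeat=len(data)), key=lambda x: f(x, data))

data = [3, 1, 4, 2]  # hypothetical sample data
signs = best_signs(data)
print(signs, f(signs, data))
```

Comparing f(signs, data) from this enumeration with the value of abs_obj reported by the solver is a quick sanity check on the model.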
Alternatively, if you don't want to use another package, you can use scipy.optimize.minimize and implement a simple penalty method. Thereby you solve the problem (P') by solving the penalty problem
min |f(x)| + Ɛ * (x_0 * (1 - x_0) + ... + x_n * (1 - x_n))
with 0 <= x_i <= 1
where Ɛ is a given penalty parameter. Here, the idea is that the right penalty term equals zero for an integer solution.
Note that you may need to solve a sequence of penalty problems to achieve convergence to an integer solution. Thus, I'd highly recommend sticking to a MILP solver instead of implementing a penalty method on your own.
Yes, you can do it using scipy.optimize.minimize:
from scipy.optimize import minimize
minimize(f, [0] * len(data), args=data, bounds=[(-1, 1)] * len(data))
This call minimizes f, which you defined in the original post.
It passes a zero array as the initial guess for the minimization problem.
The extra argument that f requires, data, is passed through args.
The bounds argument takes a list of (min, max) tuples, one per element of the input data. Note that bounds only keep each x_i inside the interval [-1, 1]; the optimizer treats x as continuous, so the values are not forced to be exactly -1 or 1.
Please, consider the following optimisation problem. Specifically, x and b are (1,n) vectors, C is (n,n) symmetric matrix, k is an arbitrary constant and i is a (1,n) vector of ones.
Please, also consider the following equivalent optimisation problem. In such case, k is determined during the optimisation process so there is no need to scale the values in x to obtain the solution y.
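Written out from the cvxpy code that follows, the two problems appear to be the ones below. This is a reconstruction using the symbols defined above (row vectors x, y, b and i); note that problem_1 in the code fixes k = 0.5.

```latex
\begin{aligned}
\text{Problem 1:}\quad & \min_{x}\; x\,C\,x^{\top}
  \quad \text{s.t.}\quad b\,\log(x)^{\top} \ge k,\;\; x \ge 0,
  \qquad y = \frac{x}{x\,i^{\top}} \\
\text{Problem 2:}\quad & \min_{y,\,k}\; y\,C\,y^{\top}
  \quad \text{s.t.}\quad b\,\log(y)^{\top} \ge k,\;\; y\,i^{\top} = 1,\;\; y \ge 0
\end{aligned}
```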
Please, also consider the following code for solving both the problems with cvxpy.
import cvxpy as cp
import numpy as np

def problem_1(C):
    n, t = np.shape(C)
    x = cp.Variable(n)
    b = np.array([1 / n] * n)
    obj = cp.quad_form(x, C)
    constraints = [b.T @ cp.log(x) >= 0.5, x >= 0]
    cp.Problem(cp.Minimize(obj), constraints).solve()
    return (x.value / (np.ones(n).T @ x.value))

def problem_2(C):
    n, t = np.shape(C)
    y = cp.Variable(n)
    k = cp.Variable()
    b = np.array([1 / n] * n)
    obj = cp.quad_form(y, C)
    constraints = [b.T @ cp.log(y) >= k, np.ones(n) @ y.T == 1, y >= 0]
    cp.Problem(cp.Minimize(obj), constraints).solve()
    return y.value
While the first function does provide the correct solution for a sample set of data I am using, the second does not. Specifically, the values in y differ heavily when I use the second function, with some of them being equal to zero (which cannot be, since all values in b are strictly positive). I am wondering whether or not the second function also minimises k. Its value should not be minimised; on the contrary, it should just be determined during the optimisation as the value that leads to the solution minimising the objective function.
UPDATE_1
I just found that the solution that I obtain with the second formulation of the problem is equal to the one derived with the following equations and function. It appears that the constraint with the logarithmic barrier and the k variable is ignored.
def problem_3(C):
    n, t = np.shape(C)
    y = cp.Variable(n)
    k = cp.Variable()
    b = np.array([1 / n] * n)
    obj = cp.quad_form(y, C)
    constraints = [np.ones(n) @ y.T == 1, y >= 0]
    cp.Problem(cp.Minimize(obj), constraints).solve()
    return y.value
UPDATE_2
Here is the link to a sample input C - https://www.dropbox.com/s/kaa7voufzk5k9qt/matrix_.csv?dl=0. In this case the correct output for both problem_1 and problem_2 is approximately [0.0659 0.068 0.0371 0.1188 0.1647 0.3387 0.1315 0.0311 0.0441], since they are equivalent by definition. I am able to obtain the correct output only by solving problem_1. Solving problem_2 leads to [0.0227 0. 0. 0.3095 0.3392 0.3286 0. 0. 0. ], which is wrong since it happens to be the correct output for problem_3.
UPDATE_3
To be clear, by definition the solution of problem_2 equals the solution of problem_3 when the parameter k goes to minus infinity.
UPDATE_4
Please consider the following code, which solves problem_1 using SciPy Optimize instead of CVXPY. By imposing k=9 the correct optimal solution can still be achieved, which is consistent with problem_1 being independent of the parameter.
import scipy.optimize as opt

def obj(x, C):
    return x.T @ C @ x

def problem_1_1(C):
    n, t = np.shape(C)
    b = np.array([1 / n] * n)
    constraints = [{"type": "eq", "fun": lambda x: (b * np.log(x)).sum() - 9}]
    res = opt.minimize(
        obj,
        x0 = np.array([1 / n] * n),
        args = (C),
        bounds = ((0, None),) * n,
        constraints = constraints
    )
    return (res['x'] / (np.ones(n).T @ res['x']))
UPDATE_5
With the code in UPDATE_4, whenever k is set equal to 10 the correct solution is still achieved, but the following warning appears. I suppose that it is due to rounding errors that might occur during the optimisation process.
Untitled.py:56: RuntimeWarning: divide by zero encountered in log
  {"type": "eq", "fun": lambda x: (b * np.log(x)).sum() - 10}
I am wondering if there is a way to impose strict inequality constraint with CVXPY or apply a condition on the logarithm argument. Please consider the following modified code for problem_1_1.
import scipy.optimize as opt

def obj(x, C):
    return x.T @ C @ x

def problem_1_1(C):
    n, t = np.shape(C)
    b = np.array([1 / n] * n)
    constraints = [{"type": "eq", "fun": lambda x: (b * np.log(x if x.all() > 0 else 1e-100)).sum() - 10}]
    res = opt.minimize(
        obj,
        x0 = np.array([1 / n] * n),
        args = (C),
        bounds = ((0, None),) * n,
        constraints = constraints
    )
    return (res['x'] / (np.ones(n).T @ res['x']))
UPDATE_6
To be thorough, the correct value of the optimal k is approximately -2.4827186402337564.
If you let k be arbitrary, then you are basically saying that b^T log(y) is greater than or equal to some arbitrary number, which is trivially true, so the constraint becomes irrelevant.
I believe you should either fix the value of k or turn this problem into a minimax problem by determining a tradeoff between maximizing k and minimizing the objective y^T C y.
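To spell the first point out in the notation of problem_2 (a short sketch, with y and b as row vectors and i the vector of ones): for any strictly positive y one can choose k equal to b log(y)^T, so the pair (y, k) is feasible. Projecting the feasible set onto y therefore gives

```latex
\{\, y : \exists\, k \in \mathbb{R},\;\; b\,\log(y)^{\top} \ge k,\;\; y\,i^{\top} = 1,\;\; y \ge 0 \,\}
\;=\;
\{\, y : y\,i^{\top} = 1,\;\; y > 0 \,\}
```

The closure of the right-hand set is the feasible set of problem_3, and the objective is continuous, so both problems share the same optimal value. This is consistent with the observation in UPDATE_3 that the solutions coincide as k goes to minus infinity.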
I am currently trying to write some python code to solve an arbitrary system of first order ODEs, using a general explicit Runge-Kutta method defined by the values alpha, gamma (both vectors of dimension m) and beta (lower triangular matrix of dimension m x m) of the Butcher table which are passed in by the user. My code appears to work for single ODEs, having tested it on a few different examples, but I'm struggling to generalise my code to vector valued ODEs (i.e. systems).
In particular, I try to solve a Van der Pol oscillator ODE (reduced to a first order system) using Heun's method defined by the Butcher Tableau values given in my code, but I receive the errors
"RuntimeWarning: overflow encountered in double_scalars f = lambda t,u: np.array(... etc)" and
"RuntimeWarning: invalid value encountered in add kvec[i] = f(t+alpha[i]*h,y+h*sum)"
followed by my solution vector that is clearly blowing up. Note that the commented out code below is one of the examples of single ODEs that I tried and is solved correctly. Could anyone please help? Here is my code:
import numpy as np

def rk(t,y,h,f,alpha,beta,gamma):
    '''Runge-Kutta iteration'''
    return y + h*phi(t,y,h,f,alpha,beta,gamma)

def phi(t,y,h,f,alpha,beta,gamma):
    '''Phi function for the Runge-Kutta iteration'''
    m = len(alpha)
    count = np.zeros(len(f(t,y)))
    kvec = k(t,y,h,f,alpha,beta,gamma)
    for i in range(1,m+1):
        count = count + gamma[i-1]*kvec[i-1]
    return count

def k(t,y,h,f,alpha,beta,gamma):
    '''returning a vector containing each step k_{i} in the m step Runge-Kutta method'''
    m = len(alpha)
    kvec = np.zeros((m,len(f(t,y))))
    kvec[0] = f(t,y)
    for i in range(1,m):
        sum = np.zeros(len(f(t,y)))
        for l in range(1,i+1):
            sum = sum + beta[i][l-1]*kvec[l-1]
        kvec[i] = f(t+alpha[i]*h,y+h*sum)
    return kvec

def timeLoop(y0,N,f,alpha,beta,gamma,h,rk):
    '''function that loops through time using the RK method'''
    t = np.zeros([N+1])
    y = np.zeros([N+1,len(y0)])
    y[0] = y0
    t[0] = 0
    for i in range(1,N+1):
        y[i] = rk(t[i-1],y[i-1], h, f,alpha,beta,gamma)
        t[i] = t[i-1]+h
    return t,y
#################################################################
'''f = lambda t,y: (c-y)**2
Y = lambda t: np.array([(1+t*c*(c-1))/(1+t*(c-1))])
h0 = 1
c = 1.5
T = 10
alpha = np.array([0,1])
gamma = np.array([0.5,0.5])
beta = np.array([[0,0],[1,0]])
eff_rk = compute(h0,Y(0),T,f,alpha,beta,gamma,rk, Y,11)'''
#constants
mu = 100
T = 1000
h = 0.01
N = int(T/h)
#initial conditions
y0 = 0.02
d0 = 0
init = np.array([y0,d0])
#Butcher Tableau for Heun's method
alpha = np.array([0,1])
gamma = np.array([0.5,0.5])
beta = np.array([[0,0],[1,0]])
#rhs of the ode system
f = lambda t,u: np.array([u[1],mu*(1-u[0]**2)*u[1]-u[0]])
#solving the system
time, sol = timeLoop(init,N,f,alpha,beta,gamma,h,rk)
print(sol)
Your step size is not small enough. The Van der Pol oscillator with mu=100 is a fast-slow system with very sharp turns where the modes switch, so it is rather stiff. With explicit methods this requires small step sizes; the smallest sensible step size is 1e-5 to 1e-6. You already get a solution on the limit cycle for h=0.001, with resulting velocities up to 150.
You can reduce some of that stiffness by using a different velocity/impulse variable. In the equation
x'' - mu*(1-x^2)*x' + x = 0
you can combine the first two terms into a derivative,
mu*v = x' - mu*(1-x^2/3)*x
so that
x' = mu*(v+(1-x^2/3)*x)
v' = -x/mu
The second equation is now uniformly slow close to the limit cycle, while the first has long relatively straight jumps when v leaves the cubic v=x^3/3-x.
This integrates nicely with the original h=0.01, keeping the solution inside the box [-3,3]x[-2,2], even if it shows some strange oscillations that are not present for smaller step sizes and the exact solution.
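The transformed system can be tried out with a few lines of plain Python. The sketch below hard-codes Heun's method for the (x, v) system rather than reusing the general Butcher-table code from the question, and it uses h = 0.001 and a final time of 200 as illustrative choices (both are assumptions of this sketch, picked to keep the run short and the explicit method comfortably stable). The function names are hypothetical.

```python
def lienard_rhs(x, v, mu):
    # Transformed Van der Pol system: x' = mu*(v + (1 - x^2/3)*x), v' = -x/mu.
    return mu * (v + (1.0 - x * x / 3.0) * x), -x / mu

def heun_step(x, v, h, mu):
    # One step of Heun's method (explicit trapezoidal rule).
    k1x, k1v = lienard_rhs(x, v, mu)
    k2x, k2v = lienard_rhs(x + h * k1x, v + h * k1v, mu)
    return x + 0.5 * h * (k1x + k2x), v + 0.5 * h * (k1v + k2v)

mu = 100.0
h = 0.001          # illustrative step size (assumption)
steps = 200_000    # integrates up to t = 200

# Initial state from the question: x(0) = 0.02, x'(0) = 0,
# converted with mu*v = x' - mu*(1 - x^2/3)*x.
x, dx = 0.02, 0.0
v = dx / mu - (1.0 - x * x / 3.0) * x

max_x = max_v = 0.0
for _ in range(steps):
    x, v = heun_step(x, v, h, mu)
    max_x = max(max_x, abs(x))
    max_v = max(max_v, abs(v))

print(max_x, max_v)
```

The original velocity can be recovered afterwards from x' = mu*(v + (1 - x^2/3)*x) if it is needed for plotting.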
I'm implementing Bayesian Changepoint Detection in Python/NumPy (if you are interested have a look at the paper). I need to compute likelihoods for data in ranges [a, b], where a and b can have all values from 1 to n. However I can prune the computation at some points, so that I don't have to compute every likelihood. On the other hand some likelihoods are used more than once, so that I can save time by saving the values in a matrix P[a, b]. Right now I check whether the value is already computed, whenever I use it, but I find that a bit of a hassle. It looks like this:
# ...
P = np.ones((n, n)) * np.inf  # a likelihood can't get inf, so I use it
                              # as pseudo value
for a in range(n):
    for b in range(a, n):
        # The following two lines get annoying and error prone if you
        # use P more than once
        if P[a, b] == np.inf:
            P[a, b] = likelihood(data, a, b)
        Q[a] += P[a, b] * g[a] * Q[a - 1]  # some computation using P[a, b]
# ...
I wonder, whether there is a more intuitive and pythonic way to achieve this, without having the if ... statement before every use of a P[a, b]. Something like an automagical function call if some condition is not met. I could of course make the likelihood function aware of the fact that it could save values, but then it needs some kind of state (e.g. becomes an object). I want to avoid that.
The likelihood function
Since it was asked for in a comment, I add the likelihood function. It actually computes the conjugate prior and then the likelihood. And all in log representation... So it is quite complicated.
import numpy as np
from scipy.special import gammaln

def gaussian_obs_log_likelihood(data, t, s):
    n = s - t
    mean = data[t:s].sum() / n
    muT = (n * mean) / (1 + n)
    nuT = 1 + n
    alphaT = 1 + n / 2
    betaT = 1 + 0.5 * ((data[t:s] - mean) ** 2).sum() + ((n)/(1 + n)) * (mean**2 / 2)
    scale = (betaT*(nuT + 1))/(alphaT * nuT)
    # splitting the PDF of the student distribution up is /much/ faster. (~ factor 20)
    prob = 1
    for yi in data[t:s]:
        prob += np.log(1 + (yi - muT)**2/(nuT * scale))
    lgA = gammaln((nuT + 1) / 2) - np.log(np.sqrt(np.pi * nuT * scale)) - gammaln(nuT/2)
    return n * lgA - (nuT + 1)/2 * prob
Although I work with Python 2.7, both answers for 2.7 and 3.x are appreciated.
I would use a sibling of defaultdict for this (you can't use defaultdict directly since it won't tell you the key that is missing):
class Cache(object):
    def __init__(self):
        self.cache = {}

    def get(self, a, b):
        key = (a,b)
        result = self.cache.get(key, None)
        if result is None:
            result = likelihood(data, a, b)
            self.cache[key] = result
        return result
Another approach would be using a cache decorator on likelihood as described here.
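As a rough illustration of the decorator idea on Python 3, functools.lru_cache can memoize on the (a, b) pair while the data array is captured in a closure, since NumPy arrays are not hashable and cannot be cache keys themselves. The names make_cached_likelihood and toy_likelihood are hypothetical, and functools.lru_cache is not available in Python 2.7.

```python
from functools import lru_cache

def make_cached_likelihood(data, likelihood):
    # Wrap likelihood(data, a, b) so each (a, b) pair is computed at most once.
    @lru_cache(maxsize=None)
    def cached(a, b):
        return likelihood(data, a, b)
    return cached

# Toy stand-in for the real likelihood, recording how often it actually runs.
calls = []

def toy_likelihood(data, a, b):
    calls.append((a, b))
    return sum(data[a:b + 1])

P = make_cached_likelihood([1.0, 2.0, 3.0, 4.0], toy_likelihood)
P(0, 2)
P(0, 2)  # served from the cache, toy_likelihood is not called again
P(1, 3)
print(len(calls))  # 2
```

In the loop from the question, every use of P[a, b] would then become a call P(a, b), with no explicit check for missing values.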