How to make linear programming optimization faster - python

I am trying to allocate customers Ci to financial advisers Pj. Each customer has a policy value xi. I'm assuming that the number of customers (n) allocated to each adviser is the same, and that the same customer cannot be assigned to multiple advisers. Therefore each adviser will have an allocation of policy values like so:
P1=[x1,x2,x3] , P2=[x4,x5,x6], P3=[x7,x8,x9]
I am trying to find the optimal allocation to minimise dispersion in fund value between the advisers. I am defining dispersion as the difference between the adviser with the highest fund value (z_max) and the lowest fund value (z_min).
The formulation for this problem is therefore:
where yij=1 if we allocate customer Ci to adviser Pj, 0 otherwise
The first constraint says that zmax has to be greater than or equal to each adviser's total fund value; since the objective function encourages smaller values of zmax, this means that zmax will equal the largest fund value. Similarly, the second constraint sets zmin equal to the smallest fund value. The third constraint says that each customer must be assigned to exactly one adviser. The fourth says that each adviser must have n customers assigned to him/her.
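Written out from the four constraint descriptions above and from the PuLP code that follows, the model is presumably the one below. This is a reconstruction, so the exact notation is an assumption; x_i is the policy value of customer i and y_ij is the assignment variable.

```latex
\begin{aligned}
\min\quad & z_{\max} - z_{\min} \\
\text{s.t.}\quad & \sum_{i} x_i\, y_{ij} \le z_{\max} && \forall j \\
& \sum_{i} x_i\, y_{ij} \ge z_{\min} && \forall j \\
& \sum_{j} y_{ij} = 1 && \forall i \\
& \sum_{i} y_{ij} = n && \forall j \\
& y_{ij} \in \{0, 1\} && \forall i, j
\end{aligned}
```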
I have a working solution using the PuLP optimization package that finds the optimal allocation.
import random
import pulp
import time
# DATA
n = 5 # number of customers for each financial adviser
c = 25 # number of customers
p = 5 # number of financial advisers
policy_values = random.sample(range(1, 1000000), c) # randomly generated policy values
# INDEXES
set_I = range(c)
set_J = range(p)
set_N = range(n)
x = {i: policy_values[i] for i in set_I} # customer policy values
y = {(i,j): random.randint(0, 1) for i in set_I for j in set_J} # allocation dummies
# DECISION VARIABLES
model = pulp.LpProblem("Allocation Model", pulp.LpMinimize)
y_sum = {}
y_vars = pulp.LpVariable.dicts('y_vars',((i,j) for i in set_I for j in set_J), lowBound=0, upBound = 1, cat=pulp.LpInteger)
z_max = pulp.LpVariable("Max Policy Value")
z_min = pulp.LpVariable("Min Policy Value")
for j in set_J:
    y_sum[j] = pulp.lpSum([y_vars[i,j] * x[i] for i in set_I])
# OBJECTIVE FUNCTION
model += z_max - z_min
# CONSTRAINTS
for j in set_J:
    model += pulp.lpSum([y_vars[i,j] for i in set_I]) == n
    model += y_sum[j] <= z_max
    model += y_sum[j] >= z_min
for i in set_I:
    model += pulp.lpSum([y_vars[i,j] for j in set_J]) == 1
# SOLVE MODEL
start = time.perf_counter() # time.clock() was removed in Python 3.8
model.solve()
print('Optimised model status: '+str(pulp.LpStatus[model.status]))
print('Time elapsed: '+str(time.perf_counter() - start))
Note that I have implemented constraints 1 and 2 slightly differently by including an additional variable y_sum to prevent duplicating an expression with a large number of nonzero elements.
The problem
The issue is that for larger values of n,p and c the model takes far too long to optimise. Is it possible to make any changes to how I've implemented the objective function/constraints to make the solution faster?

Try using a commercial solver like Gurobi with PuLP. You should get a substantial decrease in solve time.
Also check your computer's memory: if any solver runs out of memory and starts paging to disk, the solve time will be very long.

You should monitor the time needed for each part of the program (model declaration and solving).
If the solving takes too long, you can use a different solver as suggested above (here are some hints on how to do it: https://coin-or.github.io/pulp/guides/how_to_configure_solvers.html).
If the model declaration takes too long, you may have to optimise your code (for example, try to use the PuLP-provided functions such as pulp.lpSum rather than Python's sum). You can also find some tricks here https://groups.google.com/g/pulp-or-discuss/c/p1N2fkVtYyM and here https://github.com/IBMDecisionOptimization/docplex-examples/blob/master/examples/mp/jupyter/efficient.ipynb
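A minimal sketch of timing the two phases separately, using only the standard library. The functions build_model and solve_model are hypothetical stand-ins for your own model construction and for the call to model.solve(); they are not part of PuLP.

```python
import time
from contextlib import contextmanager

timings = {}

@contextmanager
def timed(label):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start

def build_model():
    # Hypothetical stand-in for declaring variables and constraints.
    return [i * i for i in range(100_000)]

def solve_model(model):
    # Hypothetical stand-in for model.solve().
    return sum(model)

with timed("build"):
    model = build_model()
with timed("solve"):
    result = solve_model(model)

for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f} s")
```

If "build" dominates, the model declaration is the part to optimise; if "solve" dominates, a different solver or solver settings are more likely to help.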

Related

Conditional average of variables in Pulp optimization problem

Suppose I have 2 pulp variables: x1 and x2. These two variables represent water temperatures inside two different water pipes. At a certain point these two pipes merge into one single pipe and the two water flows mix together. The water temperature after the mixing is equal to the average of the two temperatures because the flow rate is the same.
If the flow of one water pipe is zero, there is no mixing and the output temperature is equal to the temperature of the pipe with non-zero flow.
This final water temperature is then used in the objective function of the pulp problem to calculate some cost.
This means that I have to calculate the average of these two variables, but each variable has to be included in the average only if it is greater than 0.
Here is an example you can reproduce to calculate the average without the condition of >0.
from pulp import *
# Define the variables
x1 = LpVariable("x1", 0, None)
x2 = LpVariable("x2", 0, None)
avg = LpVariable("avg",0,None)
# Define the problem
prob = LpProblem("average_problem", LpMinimize)
# Define the objective function
prob += 0, "objective function"
# Calculate avg value
prob += avg==(x1+x2)/2, "average_constraint"
# Set x1 and x2 value just as example
prob += x1==100
prob += x2==50
cost_of_engine = (105-avg)*3/0.2
total_production_cost = lpSum(cost_of_engine+10)
prob.setObjective(total_production_cost)
# Solve the problem
prob.solve()
This example works if x1 and x2 are both higher than zero.
However, if for instance x1=0 and x2=100, then avg=50.
What I need, instead, is to discard the x1 variable from the calculation of the average so that avg=100.
This is clearly a non-linear problem because the denominator of the calculation of the average is dynamic and depends on the value of the variable x1 and x2.
Do you have any idea how to solve this problem? Maybe using the Big M technique?
Several approaches might be reasonable, depending on characteristics of your problem that are not described. As noted, if you are trying to minimize an average in the objective and both the numerator and the denominator are variables, the resulting expression is non-linear, and you'll need to either consider a substitute objective or move outside of pulp and look at non-linear formulations and non-linear solvers.
Idea #1: Use a penalty for the number of items used.
You can introduce a new variable y_i ∈ {0, 1} that is 1 if x_i is used (properly constrained with a big-M constraint), pick some logical weight w, and use an objective like:
obj = ∑ x_i + w * ∑ y_i ; minimized
which might work OK if the x_i are in a range such that a logical w can be generated.
Or...
Idea #2: Use a mini-max or maxi-min constraint
If you are seeking an aggregate total of the x_i used while minimizing the number that are used, and there is some "trade space" in the model, you can set the objective to "maximize the minimum used x_i value", which might work, again depending on the other characteristics of your model. This should have a similar effect by encouraging the model to pick larger x_i to make the target value. In pseudocode:
Introduce y and z ...
y_i ∈ {0, 1}
x_i <= y_i * M
z ∈ Reals
z <= x_i + (1-y_i)*M # constrain z to the lowest x_i used...
obj = max(z)
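The effect of the two big-M constraints can be checked by hand without a solver. The sketch below uses made-up values for x, y and M (all assumptions for illustration) and confirms that the tightest upper bound on z implied by z <= x_i + (1 - y_i) * M is the smallest x_i among the items that are actually used.

```python
def tightest_z_bound(x, y, big_m):
    """Upper bound on z implied by z <= x_i + (1 - y_i) * M over all i."""
    return min(x_i + (1 - y_i) * big_m for x_i, y_i in zip(x, y))

def linking_holds(x, y, big_m):
    """Check the linking constraints x_i <= y_i * M (x_i must be 0 when unused)."""
    return all(x_i <= y_i * big_m for x_i, y_i in zip(x, y))

# Illustrative values: items 0 and 2 are used, item 1 is not.
x = [40.0, 0.0, 75.0]
y = [1, 0, 1]
M = 1000.0  # must exceed any value x_i can take

bound = tightest_z_bound(x, y, M)
smallest_used = min(x_i for x_i, y_i in zip(x, y) if y_i == 1)

print(linking_holds(x, y, M), bound, smallest_used)
```

For the unused item the constraint relaxes to z <= M, so it never binds. That only holds if M is larger than every possible x_i, which is why M should be chosen from the known bounds of the variables rather than set arbitrarily large.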

How to define sets, parameters and constraints in pyomo

I have a list of asset returns.
Re=[0.5346,0.5064,1.0838,0.7665,0.9463,0.7047,0.6735,0.5294,0.7697,0.7299,0.99,1.0856,0.9052,0.3827,0.3804,1.0271,0.9431,0.538,0.9313,0.9423]
I want to maximize the following objective:
$Re(w)=\sum_{i=1}^{n}w_{i}Re_{i}$
and constraints are:
(i) Full utilization of capital:
$\sum_{k=1}^{n}w_{k}v_{k}=1$
where v_{k} is a binary variable, which is 1 if any of asset k is held, and 0 otherwise.
(ii) Cardinality constraint:
$\sum_{k=1}^{n}v_{k}=q$
where q ∈ [q1, q2] is the desired number of assets in the portfolio.
(iii) No short selling constraint:
$w_{k}\ge 0$ for k=1,2,...,n.
(iv) Lower and upper bounds defining the proportion of the
capital that can be invested in a single asset. For avoiding very
small investments in several assets and at the same time to
maintain sufficient diversification of the funds, the bounds of
investment in individual assets are specified as:
$l_{k}v_{k} \le w_{k} \le u_{k}v_{k}$ for k=1,2,...,n.
In this example, it is assumed that the desired number of assets in the portfolio specified by investors is between 7 and 10, i.e.
7 ≤ q ≤ 10. The lower and upper bounds to be invested in
each asset k are set as l_{k} = 0.01 and u_{k} = 0.3, respectively.
I tried to solve this problem as follows:
import pyomo.environ as pyo
from pyomo.opt import SolverFactory
from pyomo.environ import Var, NonNegativeReals
# Defining the model
model=pyo.ConcreteModel()
# set
model.i=pyo.Set(initialize=['a1','a2','a3','a4','a5','a6','a7','a8','a9','a10','a11','a12','a13','a14','a15','a16','a17','a18','a19','a20'])
#parameters
model.Re=pyo.Param(model.i, initialize={'a1':0.5346,'a2':0.5064,'a3':1.0838,'a4':0.7665,'a5':0.9463,'a6':0.7047,'a7':0.6735,'a8':0.5294,'a9':0.7697,'a10':0.7299,'a11':0.99,'a12':1.0856,'a13':0.9052,'a14':0.3827,'a15':0.3804,'a16':1.0271,'a17':0.9431,'a18':0.538,'a19':0.9313,'a20':0.9423})
re=model.Re
# Decision variable
model.w=pyo.Var(model.i, within=NonNegativeReals)
w=model.w
model.v=pyo.Var(model.i, domain=pyo.Binary)
v=model.v
model.q=pyo.Var(model.i, domain=pyo.Integers, bounds=(7,10))
q=model.q
# Objective Function
def Objective_rule(model,i):
    return sum(re[i]*w[i] for i in model.i)
model.Obj=pyo.Objective(rule=Objective_rule, sense=pyo.maximize)
# Constraints
def Constraint1(model,i):
    return sum(w[i]*v[i] for i in model.i)==1
model.Const1=pyo.Constraint(model.i, rule=Constraint1 )
def Constraint2(model,i):
    return sum(v[i] for i in model.i)==7
model.Const2=pyo.Constraint(model.i, rule=Constraint2 )
def Constraint3(model,i):
    return (.01<=w[i]<=.3 for i in model.i)
model.Const3=pyo.Constraint(model.i, rule=Constraint3 )
#results
Solver=SolverFactory('cplex_direct')
results=Solver.solve(model)
print(results)
But this code doesn't work! I have tried hard to solve this problem, but unfortunately I could not. Can anyone help me?
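One observation that may help, offered as a sketch derived from the constraints as stated rather than as a tested fix: constraint (iv) already forces w_k = 0 whenever v_k = 0, so the product w_k v_k equals w_k. The bilinear constraint (i) can then be replaced by a linear one, and the cardinality range can be imposed directly on the sum of the v_k.

```latex
\begin{aligned}
& w_k v_k = w_k \quad \text{since } v_k = 0 \;\Rightarrow\; 0 \le w_k \le u_k \cdot 0 = 0 \\
\text{(i)}\quad & \sum_{k=1}^{n} w_k = 1 \\
\text{(ii)}\quad & 7 \le \sum_{k=1}^{n} v_k \le 10 \\
\text{(iv)}\quad & 0.01\, v_k \le w_k \le 0.3\, v_k, \qquad k = 1, \dots, n
\end{aligned}
```

In this form the model is a mixed-integer linear program. Note that (i) and (ii) are each a single constraint over all assets, whereas (iv) is one pair of bounds per asset.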

Gurobi dynamic problem

I am trying to formulate this kind of problem without adding equality constraints. Avoiding equality constraints would let me get the standard form in matrix shape and apply another algorithm, which would be a simplification.
import gurobipy as gp
def main_building():
    m = gp.Model("building")
    m.reset() # Reset the problem, keep options.
    PelHP = [m.addVar(name = 'PelHP_%d'%t) for t in range(T)] # Electric power
    PthFC = [m.addVar(name = 'PthFC_%d'%t) for t in range(T)] # Thermal power given by the fan coil
    T_ST_BMS= [m.addVar(name = 'T_ST_BMS_%d'%t) for t in range(T)] # Storage temperature
    T_build_BMS= [m.addVar(name = 'T_build_BMS_%d'%t) for t in range(T)] # Building temperature
    T_ST_BMS[0]=T_ST0+273.15
    T_build_BMS[0]=T_BMS0+273.15
    for t in range (T-1):
        T_ST_BMS[t+1]=((COP_HP*PelHP[t]-PthFC[t]/eta_FC)/C_ST)*dt+T_ST_BMS[t]
    for t in range(T-1):
        m.addConstr(T_ST_BMS[t+1]<=273.15+60)
        m.addConstr(-273.15+30 >= -T_ST_BMS[t+1] )
    Objective_b=0
    for t in range(T):
        m.addConstr(PelHP[t]<=PelHP_max)
        m.addConstr(PthFC[t]<=eta_FC*m_air*cp_air*(T_ST_BMS[t]-T_build_BMS[t]))
        m.addConstr(0>=-PelHP[t])
        m.addConstr(0>=-PthFC[t])
        # The original line was missing a closing parenthesis; its placement here is an assumption.
        Objective_b=Objective_b+dt*(C_buy*(-(-PelHP[t]))+(T_build_BMS[t]-T_obj)**2)
    # Set objective:
    m.setObjective(Objective_b, gp.GRB.MINIMIZE)
    m.optimize()
    return m
Gurobi is able to solve it with T = 10 time steps, but when I increase T, Gurobi hangs. Can anyone help me? The point is that the expression for T_ST_BMS grows with t.
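The growth mentioned above can be quantified without Gurobi. The sketch below is a plain-Python model of the recursion, not Gurobi code: it tracks which control variables appear in each substituted storage-temperature expression. The variable names are only labels mirroring the question.

```python
def variables_in_state_expression(T):
    """Count decision variables in each substituted storage-temperature expression.

    Mirrors the recursion T_ST_BMS[t+1] = f(PelHP[t], PthFC[t]) + T_ST_BMS[t],
    where T_ST_BMS[0] is a constant and every later state is an expression
    in all earlier controls.
    """
    expression = set()  # T_ST_BMS[0] is a constant: no variables
    counts = [0]
    for t in range(T - 1):
        # Each step adds the two controls of step t to the running expression.
        expression = expression | {f"PelHP_{t}", f"PthFC_{t}"}
        counts.append(len(expression))
    return counts

for T in (10, 100, 1000):
    counts = variables_in_state_expression(T)
    print(T, counts[-1], sum(counts))
```

The last expression contains 2(T-1) variables and the total over all steps is T(T-1), so the number of terms in the constraints that use these expressions grows roughly quadratically with T. This is one plausible reason why model construction and solving slow down sharply as T increases. Keeping the state as a variable linked by equality constraints keeps every constraint short, at the cost of the equalities the question wants to avoid.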

PV Overproduction within a linear cost factor optimization

So I am currently trying to optimize the costs for energy in a household. The optimization is based on a cost factor function which I am trying to minimize.
model = ConcreteModel()
model.t = RangeSet(0, 8759)
def costs(model):
    return sum(model.cost_factor[t] * model.elec_grid[t] for t in model.t)
model.costs = Objective(rule = costs, sense = minimize)
Because PV overproduction can occur, I try to account for it with these definitions:
model.elec_consumption = Param(model.t, initialize = df['Consumption'])
model.pv = Param(model.t, initialize = df['PV'])
model.excess_pv = Var(model.t, within = NonNegativeReals, initialize = 0)
model.demand = Var(model.t, initialize = 0, within = NonNegativeReals)
def pv_overproduction(model, t):
    return model.excess_pv[t] >= model.pv[t] - model.demand[t]
model.pv_overproduction = Constraint(model.t, rule = pv_overproduction)
def lastdeckung(model, t):
    return (model.pv[t] - model.excess_pv[t]) + model.elec_grid[t] == model.demand[t]
model.lastdeckung = Constraint(model.t, rule = lastdeckung)
The problem is that when the cost factor is negative, the optimizer sets model.excess_pv very high so that it can crank up the model.elec_grid variable in an effort to minimize the cost.
That is obviously not the intention, but so far I wasn't able to find a better way to calculate the excess PV. An easy fix would technically be to have a cost factor that is always positive, but sadly that's not an option.
I'd appreciate it if someone had an idea how to fix this.
The basic idea is that I want to maximize the usage of the PV electricity in order to reduce costs. At some points there is too much PV in the system, so for the optimization to still work I need to get rid of the excess.
def demand_rule(model, t):
    return model.demand[t] == model.elec_consumption[t]
model.demand_rule = Constraint(model.t, rule = demand_rule)
This is the demand. Technically there are more functions, but they are irrelevant for solving this problem. The main problem is that this function doesn't work because the cost factor is sometimes negative:
model.excess_pv[t] >= model.pv[t] - model.demand[t]
model.excess_pv as well as model.demand are variables, whereas model.pv is a parameter.
As far as I got in my search, I need to change my overproduction function so that it uses the value of pv - excess_pv if that value is > 0, and zero if the value is <= 0.
I think the easiest way to do this is to probably just penalize excess production to a greater extent than the maximally negative cost factor.
Why can't you...
# epsilon: small positive constant; assumes model.cost_factor is a plain (non-mutable) Param
excess_penalty = max(-min(model.cost_factor[t] for t in model.t) + epsilon, 0) # the outer max prevents a negative penalty if there is no negative cost
# make obj from components, so we can inspect true cost (w/o penalty) later...
cost = sum(model.cost_factor[t] * model.elec_grid[t] for t in model.t)
overproduction_penalty = sum(excess_penalty * model.excess_pv[t] for t in model.t)
model.obj = Objective(expr= cost + overproduction_penalty, sense = minimize)
and later if you want the cost independently, you can just check the value of cost, which is a legal pyomo expression.
value(cost)
I think you could also add the expression as a model component, if that is important...
model.cost = ...
model.overproduction_penalty = ...
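To see why this choice of penalty removes the incentive, here is a small solver-free check with made-up cost factors (the numbers and the epsilon default are assumptions). In the balance constraint from the question, one extra unit of excess PV must be matched by one extra unit of grid import, so the objective changes by cost_factor[t] + penalty per unit.

```python
def excess_penalty(cost_factors, epsilon=1e-3):
    """Penalty slightly larger than the most negative cost factor, never below 0."""
    return max(-min(cost_factors) + epsilon, 0.0)

# Illustrative cost factors per time step; two of them are negative.
cost_factors = [0.30, -0.05, 0.12, -0.20]
penalty = excess_penalty(cost_factors)

# Objective change per unit of "wasted" PV replaced by grid import, per step.
net_change = [c + penalty for c in cost_factors]

print(penalty)
print(net_change)
```

Every entry of net_change is positive, so throwing away PV in order to import more from the grid always increases the penalized objective. When no cost factor is negative, the function returns 0 and the objective is unchanged.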
So the idea of a piecewise function is definitely an option for the problem mentioned in this post, but it is quite a fancy and complicated solution. The idea of penalties is much easier, and it also exposed a few more flaws in my code. Due to the negative cost factor the optimizer tries to maximize grid input, which is not wrong, but when some variables are not capped the optimizer uses electricity with no efficiency whatsoever. So the easiest way, as mentioned earlier, is to penalize the grid import from the beginning so that there are no negative cost factors during the optimization.

How to find the maximum of a prob in PuLP

I am trying to solve a linear problem in PuLP that minimizes a cost function. The cost function is itself a function of the maximum value of the cost function, e.g., I have a daily cost, and I am trying to minimize the monthly cost, which is the sum of the daily cost plus the maximum daily cost in the month. I don't think I'm capturing the maximum value of the function in the final solution, and I'm not sure how to go about troubleshooting this issue. The basic outline of the code is below:
# Initialize the problem to be solved
prob = LpProblem("monthly_cost", LpMinimize)
# The number of time steps
# price is a pre-existing array of variable prices
tmax = len(price)
# Time range
time = list(range(tmax))
# Price reduction at every time step
d = LpVariable.dict("d", (time), 0, 5)
# Price increase at every time step
c = LpVariable.dict("c", (time), 0, 5)
# Define revenues = price increase - price reduction + initial price
revenue = ([(c[t] - d[t] + price[t]) for t in time])
# Find maximum revenue
max_revenue = max(revenue)
# Initialize the problem
prob += sum([revenue[t]*0.0245 for t in time]) + max_revenue
# Solve the problem
prob.solve()
The variable max_revenue always equals c_0 - d_0 + price[0] even though price[0] is not the maximum of price and c_0 and d_0 both equal 0. Does anyone know how to ensure the dynamic maximum is being inserted into the problem? Thanks!
I don't think you can do the following in PuLP or with any other standard LP solver:
max_revenue = max(revenue)
This is because Python's built-in max is evaluated while the model is being built, before the solver has assigned any values to the variables, so it cannot pick out the revenue expression that ends up largest. A maximum of several expressions is also non-smooth, so it cannot appear directly in a standard LP model.
In such situations, you can easily reformulate the problem as follows:
max_revenue >= c[t] - d[t] + price[t] for every t in time
This works because the objective is minimized: max_revenue is pushed down until it equals the largest revenue term. The reformulation lets you extract a standard LP model from the equations. The original problem formulation is extended with additional inequality constraints (the equality constraints and the objective function stay the same as before). So it could look something like this (word of caution: I have not tested this):
# Define variable
max_revenue = LpVariable("Max Revenue", 0)
# Define other variables, revenues, etc.
# Add the inequality constraints
for item in revenue:
    prob += max_revenue >= item
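In symbols, and assuming the objective from the question is kept unchanged, the reformulation replaces the max with an extra variable z (max_revenue in the code) that is bounded below by every revenue term. The symbols r_t for revenue and p_t for price[t] are notation introduced here.

```latex
\begin{aligned}
\min_{c,\,d,\,z}\quad & 0.0245 \sum_{t} r_t + z \\
\text{s.t.}\quad & r_t = c_t - d_t + p_t && \forall t \\
& z \ge r_t && \forall t \\
& 0 \le c_t \le 5, \quad 0 \le d_t \le 5 && \forall t
\end{aligned}
```

Because z has a positive coefficient in a minimization objective, any optimal solution has z equal to the largest r_t. The same trick would not work if the maximum were being maximized.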
I would also suggest that you have a look at scipy.optimize.linprog. PuLP typically writes the model to an intermediate file and then calls an installed solver to solve it. scipy.optimize.linprog, on the other hand, runs in-process and may be faster for small models. However, if your problem cannot be solved with linprog, or you require other professional solvers (e.g. CPLEX, Gurobi, etc.), then PuLP is a good choice.
Also, see the discussion on Data Fitting (page 19) in Introduction to Linear Optimization by Bertsimas and Tsitsiklis.
Hope this helps. Cheers.
