I am using PuLP to solve a route optimization problem, but it takes around 1 hour to finish. I monitored resource usage and it seems to use only one processor. Is it possible to run the solve with multiple threads or processes? Or is there any other way to improve its efficiency?
Here is some source code.
Variables & Objective function
# DECISION VARIABLE X
x_vars = LpVariable.dicts("route",[(i,j,k) for i in job_id for j in job_id for k in truck_id],lowBound=0,upBound=1,cat=LpBinary)
# DECISION VARIABLE Y
y_vars = LpVariable.dicts("work",[(j,k) for j in job_id for k in truck_id],lowBound=0,upBound=1,cat=LpBinary)
# OBJECTIVE FUNCTION
opt_model += lpSum(x_vars[(i,j,k)]*travel_cost[i+'-'+j+'-'+k] for i in job_id for j in job_id for k in truck_id)
Constraints
#CONSTRAINTS x[i,j,k] = 0 for all i!=k & j!=k
for k in truck_id:
    opt_model += lpSum(x_vars[(i,j,k)] for j in job_id for i in yard_id if i!=truck_yard[k]) == 0
#CONSTRAINTS
#2
t2 = time.time()
print(t2 - t1)
for j in job_id:
    for k in truck_id:
        opt_model += lpSum(x_vars[(i,j,k)] for i in job_id) == y_vars[(j,k)]
Solver
Solver_name = 'PULP_CBC_CMD'
solver = pl.getSolver(Solver_name)
results = opt_model.solve(solver)
At least in PuLP 2.6.0, I can run the CBC solver with multiple threads by simply passing the threads parameter to getSolver.
Solver_name = 'PULP_CBC_CMD'
solver = pl.getSolver(Solver_name, threads=4)
results = opt_model.solve(solver)
https://coin-or.github.io/pulp/technical/solvers.html?highlight=getsolver#pulp.apis.PULP_CBC_CMD
Related
I am running an optimization problem to determine how best to meet a power demand. When I run the code for all 8760 hours of the year, I get the error in the picture. The number of nonzeros leads me to believe my code is written in a suboptimal way.
My current code looks like this (with some variables defined earlier). I know this sort of problem can be solved with more memory, but as I have 32 GB I think that should be sufficient. The code runs fine for smaller numbers of hours, for example 1000. Is there any way to better optimize my code?
model.plants = pyo.Set(initialize=['Nuclear','Gas','Wind','Solar'])
model.periods = pyo.Set(initialize=list_hours)
def production_bounds(model,i,j):
    return(P_min[i],P_max[i][j])
model.pBounds = pyo.Var(model.plants,model.periods,bounds=production_bounds)
def load_balance(model,i,j):
    return (model.pBounds['Nuclear',j] + model.pBounds['Gas',j] + model.pBounds['Wind',j] + model.pBounds['Solar',j] == Load_demand[j])
model.load_constraint = pyo.Constraint(model.plants,model.periods,rule=load_balance)
def renewable_con(model,i,j):
    x = 0
    for j in model.periods:
        x += model.pBounds['Wind',j] + model.pBounds['Solar',j]
    return (x>=renewable_limit)
model.renew_constraint = pyo.Constraint(model.plants,model.periods,rule=renewable_con)
def obj_func(model):
    final_costs = 0
    for i in model.plants:
        for j in model.periods:
            final_costs = final_costs + get_costs_func(i, j)
    return final_costs
model.obj = pyo.Objective(rule = obj_func, sense = pyo.minimize)
model.dual = pyo.Suffix(direction=pyo.Suffix.IMPORT)
opt = SolverFactory('gurobi', solver_io="python")
results = opt.solve(model,tee=True)
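A rough count of constraint-matrix nonzeros may help explain the memory use, assuming the model is exactly as posted. Both constraints are declared over model.plants and model.periods, so each rule is called once per (plant, period) pair, even though load_balance depends only on the period and renewable_con depends on neither index. The sketch below is plain arithmetic with the sizes from the question; the function name is made up for illustration.

```python
def count_nonzeros(n_plants, n_periods):
    """Count constraint-matrix nonzeros for the posted model and for a
    version where each constraint is declared only over the index it needs."""
    # As posted: load_balance is built once per (plant, period) pair and
    # each copy touches one variable per plant.
    load_posted = n_plants * n_periods * n_plants
    # As posted: renewable_con is built once per (plant, period) pair and
    # each copy sums Wind and Solar over every period.
    renew_posted = n_plants * n_periods * 2 * n_periods

    # Declared over periods only: one load-balance row per period.
    load_needed = n_periods * n_plants
    # Declared once: a single row summing Wind and Solar over all periods.
    renew_needed = 2 * n_periods

    return load_posted + renew_posted, load_needed + renew_needed


posted, needed = count_nonzeros(n_plants=4, n_periods=8760)
print(f"as posted:    {posted:,} nonzeros")
print(f"deduplicated: {needed:,} nonzeros")
```

In Pyomo terms the smaller count would correspond to declaring load_constraint as pyo.Constraint(model.periods, rule=...) and renew_constraint as pyo.Constraint(rule=...), with the rule signatures adjusted to match.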
I have two different functions for solving the knapsack problem.
The difference between these functions is that v2 uses less space than v1. From my time complexity analysis, the v2 function should not be faster than v1.
However, after running my test cases several times, I found that v2 is significantly faster than v1, and I cannot understand why.
I am using Python Unittest.
Here are the test times:
v1 execution time:
Ran 1 test in 35.985s
v2 execution time:
Ran 1 test in 25.294s
Here is my v1 function:
def knapsack_bottom_up_v1(self):
    N = len(self.values)
    C = self.capacity
    # table
    dp = [[0 for rc in range(C+1)] for i in range(N)]
    # filling out the table
    for i in range(0, N):
        i_weight = self.weights[i]
        i_val = self.values[i]
        for rc in range(1, C+1):
            # edge case
            if i == 0:
                if i_weight > rc:
                    dp[i][rc] = 0
                else:
                    dp[i][rc] = i_val
            # recurrence relation
            if i_weight > rc:
                dp[i][rc] = dp[i-1][rc]
            else:
                dp[i][rc] = max(dp[i-1][rc], dp[i-1][rc-i_weight] + i_val)
    return dp[N-1][C]
Here is my v2 function:
def knapsack_bottom_up_v2(self):
    N = len(self.values)
    C = self.capacity
    # prev_dp == dp[i-1]
    prev_dp = [0]*(C+1)
    # dp == dp[i]
    dp = [0]*(C+1)
    # filling out the table
    for i in range(0, N):
        i_weight = self.weights[i]
        i_val = self.values[i]
        for rc in range(1, C+1):
            # recurrence relation
            if i_weight > rc:
                dp[rc] = prev_dp[rc]
            else:
                dp[rc] = max(prev_dp[rc], prev_dp[rc-i_weight] + i_val)
        prev_dp, dp = dp, prev_dp
        for i in range(len(dp)):
            dp[i] = 0
    return prev_dp[C]
Here is also the test case I'm using:
values = [825594,1677009,1676628,1523970,943972,97426,69666,1296457,1679693,\
1902996,1844992,1049289,1252836,1319836,953277,2067538,675367,853655,\
1826027,65731,901489,577243,466257,369261]
weights = [382745,799601,909247,729069,467902,44328,34610,698150,823460,903959,\
853665,551830,610856,670702,488960,951111,323046,446298,931161,31385,\
496951,264724,224916,169684]
capacity = 6404180
solution = [1,1,0,1,1,1,0,0,0,1,1,0,1,0,0,1,0,0,0,0,0,1,1,1]
Can anyone help me understand why v2 runs faster than v1? I think the times should be about the same; if anything, v2 should be slightly slower than v1.
Thanks!
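The time difference mainly comes from the two or three extra indexing operations in each iteration of the inner loop: v1 indexes twice per access (dp[i][rc], dp[i-1][rc]), while v2 indexes a flat list once.
I ran a test on my machine with two additional indexing operations in each inner-loop iteration. The difference was about 9 seconds: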
>>> lst = [0]
>>> timeit("""for i in range(C):
...     prev = lst[0]
...     for j in range(N):
...         prev
...         prev
... """, globals=globals(), number=1)
3.6931853000132833
>>> timeit("""for i in range(C):
...     for j in range(N):
...         lst[0]
...         lst[0]
... """, globals=globals(), number=1)
12.408428700000513
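If the extra indexing is the cause, v1 should move closer to v2 once the two row lookups are done once per item rather than once per cell. Here is a minimal sketch of that change, written as a standalone function rather than a method. The function name and the extra leading row of zeros are my own choices, not part of the question's code, and I have not timed it against the original test case.

```python
def knapsack_full_table_hoisted(values, weights, capacity):
    """0/1 knapsack on a full table, with row lookups hoisted out of the inner loop."""
    n = len(values)
    # Row 0 is all zeros and stands for "no items considered yet".
    dp = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(n):
        w, v = weights[i], values[i]
        prev_row = dp[i]      # looked up once per item, not once per cell
        row = dp[i + 1]       # looked up once per item, not once per cell
        for rc in range(1, capacity + 1):
            if w > rc:
                row[rc] = prev_row[rc]
            else:
                row[rc] = max(prev_row[rc], prev_row[rc - w] + v)
    return dp[n][capacity]


best = knapsack_full_table_hoisted([60, 100, 120], [10, 20, 30], 50)
print(best)
```

The table still uses O(N*C) memory like v1; only the per-cell indexing changes, so any remaining gap to v2 would have to come from something else, such as allocating the larger table.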
There are a number of jobs to be assigned to a number of resources, each assignment having a score (performance indicator) and a cost. The objective of the resource assignment problem (RAP) is to maximize the total assignment score subject to the budget. Constraints: each resource can handle at most one job, and each job, if it is filled, should be done by one resource. Also, there is a limited budget to spend.
I have tackled the problem in two ways: with CVXPY using the gurobi solver, and with the gurobi package directly. My challenge is that I can't program it in a memory-efficient way with cvxpy. There are hundreds of constraint list comprehensions! How can I improve the efficiency of my code in cvxpy? For example, is there a better way to define dictionary variables in cvxpy, similar to gurobi?
ms is a dictionary of the form {('firstName lastName', 'job'): score_value}
cst is a dictionary of the form {('firstName lastName', 'job'): cost_value}
job is the set of jobs
res is the set of resources {'firstName lastName'}
G (or g in the gurobi implementation) is a dictionary with jobs as keys and values of 0 or 1 indicating whether that job is left unfilled because of the budget limit (0 if filled and 1 if not)
thanks
github link including codes and memory profiling comparison
gurobi implementation:
m = gp.Model("RAP")
assign = m.addVars(ms.keys(), vtype=GRB.BINARY, name="assign")
g = m.addVars(job, name="gap")
m.addConstrs((assign.sum("*", j) + g[j] == 1 for j in job), name="demand")
m.addConstrs((assign.sum(r, "*") <= 1 for r in res), name="supply")
m.addConstr(assign.prod(cst) <= budget, name="Budget")
job_gap_penalty = 101 # penalty of not filling a job
m.setObjective(assign.prod(ms) -job_gap_penalty*g.sum(), GRB.MAXIMIZE)
m.optimize()
cvxpy implementation:
X = {}
for a in ms.keys():
    X[a] = cp.Variable(boolean=True, name="assign")
G = {}
for g in job:
    G[g] = cp.Variable(boolean=True, name="gap")
constraints = []
for j in job:
    X_r = 0
    for r in res:
        X_r += X[r, j]
    constraints += [
        X_r + G[j] == 1
    ]
for r in res:
    X_j = 0
    for j in job:
        X_j += X[r, j]
    constraints += [
        X_j <= 1
    ]
constraints += [
    np.array(list(cst.values())) @ np.array(list(X.values())) <= budget,
]
obj = cp.Maximize(np.array(list(ms.values())) @ np.array(list(X.values()))
                  - job_gap_penalty * cp.sum(list(G.values())))
prob = cp.Problem(obj, constraints)
prob.solve(solver=cp.GUROBI, verbose=False)
Here is the memory profiling comparison:
memory profiling for cvxpy
memory profiling for gurobi
Previously, I tried to solve it by defining dictionary variables as in gurobi, but since that is not available in cvxpy, the code was not efficient when scaling up. I have now solved it using matrix variables and then converting to dictionary variables, which is super fast!
assign_scores = np.array(list(ms.values())).reshape(len(res), len(job))
assign_cost = np.array(list(cst.values())).reshape(len(res), len(job))
# make a bool matrix variable with the shape of number of resources and jobs
x = cp.Variable(shape=(len(res), len(job)), boolean=True, name="assign")
# make a bool vector variable with the shape of number of jobs
g = cp.Variable(shape=(len(job), ), boolean=True, name="gap")
constraints = []
# each job can be assigned to at most one resource or remains unfilled due to budget cap
constraints += [cp.sum(x[:, j]) + g[j] == 1 for j in range(len(job))]
# each resource can be assigned to at most one job
constraints += [cp.sum(x[r, :]) <= 1 for r in range(len(res))]
# budget cap
constraints += [cp.sum(cp.multiply(assign_cost, x)) <= budget]
# penalty if a job is not filled
job_gap_penalty=101
# objective is to maximize performance score
obj = cp.Maximize(cp.sum(cp.multiply(assign_scores, x)) - job_gap_penalty * cp.sum(g))
prob = cp.Problem(obj, constraints)
prob.solve(solver=cp.GUROBI, verbose=True)
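One caveat with the reshape approach: it assumes the dictionaries hold every (resource, job) pair and that their insertion order is resource-major, matching the row and column order used in the constraints. Since res and job are sets in the question, a safer variant might fix an explicit order first and fill the matrices by lookup. The helper name and the toy data below are hypothetical, and plain nested lists are used so the sketch runs without NumPy; wrapping the result in np.array(...) would give the arrays used above.

```python
def dict_to_matrix(values_by_pair, resources, jobs):
    """Arrange a {(resource, job): value} dict as one row per resource and
    one column per job, using explicit lookups instead of relying on the
    dict's insertion order. Raises KeyError if a pair is missing."""
    return [[values_by_pair[(r, j)] for j in jobs] for r in resources]


# Hypothetical toy data, deliberately inserted in job-major order.
ms = {
    ("Ann Lee", "tester"): 10, ("Bo Chan", "tester"): 40,
    ("Ann Lee", "developer"): 20, ("Bo Chan", "developer"): 50,
    ("Ann Lee", "analyst"): 30, ("Bo Chan", "analyst"): 60,
}
res_order = ["Ann Lee", "Bo Chan"]              # fixed row order
job_order = ["tester", "developer", "analyst"]  # fixed column order

assign_scores = dict_to_matrix(ms, res_order, job_order)
print(assign_scores)
```

With this data, reshaping list(ms.values()) directly would put Bo Chan's tester score into Ann Lee's row, which is the kind of silent misalignment the explicit lookup avoids.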
I have the following code (Python 3) for adding constraints to PuLP (v2.3). It needs to add up to 400000 constraints (100^2 S, 4 A).
def linearProgram(self, error = 1e-12):
    lp_problem = p.LpProblem('best-Vpi', p.LpMinimize)
    #create problem variables
    V = p.LpVariable.dicts("V",range(self.S), cat = "Continuous")
    #objective function
    for i in range(self.S):
        self.v.append(V[i])
    lp_problem += p.lpSum(self.v)
    #constraints
    for s in range(self.S):
        for a in range(self.A):
            pv = p.LpAffineExpression([(V[x],self.T[s][a][x]) for x in range(self.S)])
            constraint = p.lpSum([self.PR[s][a], self.gamma*pv ])
            lp_problem += V[s] >= constraint
    status = lp_problem.solve(p.PULP_CBC_CMD(msg = 0)) #solve
I can't seem to be able to optimise it further..
I even tried multiprocessing, but it gave a lot of errors-
def __addconstraints(self, S0, S1, lp_problem):
    for s in range(S0, S1):
        for a in range(self.A):
            pv= p.lpDot(self.T[s][a],self.v)
            lp_problem += self.v[s] >= p.lpSum([self.PR[s][a], self.gamma*pv])
..................
#in linearProgram
if self.S%4:
    s0, s1 = 0, self.S//3
else:
    s0, s1 = 0, self.S//4
incr = s1
processes = []
for x in range(4):
    proc = multiprocessing.Process(target=self.__addconstraints, args=(s0, s1, lp_problem))
    processes.append(proc)
    proc.start()
    s0 = s1
    s1 = min(s1+incr, self.S)
for proc in processes:
    proc.join()
# hard code for episodic? no need (due to initialization of mdp)
if self.mdptype=="episodic":
    for state in self.end:
        lp_problem += V[state] == 0
I am new to both pulp and multiprocessing, so I don't really have an idea what I'm doing :p
Any kind of help is appreciated.
In your code, you first build a p.LpAffineExpression, then you apply p.lpSum to it, and finally you perform a third operation on the result: V[s] >= constraint. The last two operations may increase the running time because the expression is copied each time.
From my experience, the fastest times I've gotten are doing the following:
# _vars_tup is a list of (key, value) pairs where each key is a variable and each value is a coefficient.
# it's like initializing a dictionary.
# CONSTANT is a python number (not a pulp variable)
model += p.LpAffineExpression(_vars_tup, constant=CONSTANT) >= 0
The idea is to reduce the number of times you do operations with p.LpAffineExpression objects, because a copy is done at each operation. So, build the list of variables and coefficients (_vars_tup) for ALL the variables present in the constraint and then at the last step create the p.LpAffineExpression and compare it with a constant.
An equivalent way would be (although I haven't tried it):
# the constant is carried by rhs here, so it is not also passed to LpAffineExpression
const = p.LpConstraint(e=p.LpAffineExpression(_vars_tup), sense=p.LpConstraintGE, rhs=-CONSTANT)
model.addConstraint(const)
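For the constraints in the question, the list of (variable, coefficient) pairs could be assembled as plain numbers first and turned into a PuLP expression only at the very end. Below is a minimal sketch of that assembly step using state indices instead of PuLP variables, so that it runs on its own. The function name is hypothetical, T and PR are assumed to be nested lists shaped like self.T and self.PR, and skipping zero transition probabilities is my own addition that only helps if T is sparse.

```python
def bellman_constraint_terms(s, a, T, PR, gamma):
    """Return (terms, constant) such that the constraint
        V[s] >= PR[s][a] + gamma * sum_x T[s][a][x] * V[x]
    reads
        sum(coef * V[x] for x, coef in terms) + constant >= 0.
    """
    coefs = {}
    for x, prob in enumerate(T[s][a]):
        if prob != 0:                  # skip zero transitions entirely
            coefs[x] = -gamma * prob
    # V[s] sits on the left-hand side with coefficient 1; merge it with any
    # self-transition term so that each variable appears only once.
    coefs[s] = coefs.get(s, 0.0) + 1.0
    return list(coefs.items()), -PR[s][a]


# Tiny hypothetical MDP: 2 states, 2 actions.
T = [[[0.5, 0.5], [0.0, 1.0]],
     [[1.0, 0.0], [0.0, 1.0]]]
PR = [[1.0, 2.0],
      [0.0, 3.0]]

terms, constant = bellman_constraint_terms(0, 1, T, PR, gamma=0.9)
print(terms, constant)
```

With PuLP, the result would then be used in the pattern described above, for example p.LpAffineExpression([(V[x], c) for x, c in terms], constant=constant) >= 0, so that only one expression object is built per constraint.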
I recently found out that the bottleneck of my code is the following block. N is of order 10,000, and L is of order (10,000)^2. RQ_func is just a function that takes indices (tuples) and returns a float V and a dictionary sp_dist of the form {index: probability}.
Is there a way I can parallelize this code? I have access to cluster computing from which I can use up to 20 cores at a time and would like to use the option.
R = np.empty((L,))
Q = scipy.sparse.lil_matrix((L, N))
traverser = 0 # Populate R and Q by traversing the array
for s_index in state_indices:
    for a_index in action_indices:
        V, sp_dist = RQ_func(s_index, a_index)
        R[traverser] = V
        for sp_index, prob in sp_dist.items():
            Q[traverser, sp_index] = prob
        traverser += 1
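Since each call to RQ_func looks independent of the others, one possible restructuring is to separate the evaluation of the pairs from the assembly of R and Q, so that the evaluation can be handed to any order-preserving map function, including a process pool's. The sketch below uses hypothetical names (build_triplets, _call_pair, toy_rq) and assumes sp_index can be used directly as an integer column. It runs with the built-in map so that it is self-contained.

```python
from functools import partial
from itertools import product


def _call_pair(rq_func, pair):
    """Unpack one (state, action) pair; module-level so it can be pickled."""
    return rq_func(*pair)


def build_triplets(rq_func, state_indices, action_indices, map_fn=map):
    """Evaluate rq_func on every (state, action) pair and return
    (R, rows, cols, vals), where rows/cols/vals are COO-style triplets for Q.

    map_fn must preserve input order, as the built-in map and
    multiprocessing.Pool.map / Pool.imap do.
    """
    pairs = product(state_indices, action_indices)  # same order as the nested loops
    results = map_fn(partial(_call_pair, rq_func), pairs)

    R, rows, cols, vals = [], [], [], []
    for row, (value, sp_dist) in enumerate(results):
        R.append(value)
        for sp_index, prob in sp_dist.items():
            rows.append(row)
            cols.append(sp_index)
            vals.append(prob)
    return R, rows, cols, vals


def toy_rq(s, a):
    """Hypothetical stand-in for RQ_func: a reward and a next-state distribution."""
    return float(s + a), {s: 0.5, (s + 1) % 3: 0.5}


R, rows, cols, vals = build_triplets(toy_rq, range(3), range(2))
print(R)
```

With a pool, the call might look like build_triplets(RQ_func, state_indices, action_indices, map_fn=partial(pool.imap, chunksize=10000)) inside a with multiprocessing.Pool(20) as pool: block, provided RQ_func is a module-level function whose arguments and results can be pickled. The triplets could then be passed to scipy.sparse.coo_matrix((vals, (rows, cols)), shape=(L, N)), which avoids element-by-element assignment into a lil_matrix. For L of order 10^8 the Python lists themselves become large, so the work would probably need to be split into blocks.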