Numerical issue with SCIP: same problem, different optimal values - python

I'm having the following numerical issue while using SCIP solver as a callable library via PySCIPOpt:
two equivalent and almost identical models for the same problem yield different optimal values and optimal solutions, with the optimal values differing by a relative amount on the order of 1e-6.
An independent piece of software verified that the solutions yielded by both models are feasible for the original problem, and that their true objective values agree with the optimal values reported by SCIP for each model. Before this instance appeared, a number of instances of the same problem had been solved, with both models always agreeing on their optimal solutions and values.
Is it possible to modify the numerical precision of SCIP for comparing the values of primal solutions among them and against dual bounds? Which parameters should be modified to overcome this numerical difficulty?
What I tried
I've tried the following things and the problem persisted:
Turning presolving off with model.setPresolve(SCIP_PARAMSETTING.OFF).
Setting model.setParam('numerics/feastol', epsilon) with different values of epsilon.
Since feasible sets and objective functions agree (see description below), I've checked that the actual coefficients of the objective functions agree by calling model.getObjective() and comparing coefficients for equality for each variable appearing in the objective function.
The only thing that seemed to help was to add some noise (multiplying the coefficients by factors of the form 1+eps with small eps) to the objective function of the model yielding the worse solution. This makes SCIP yield the same (and better) solution for both models when eps lies within a certain range.
Just in case, this is what I get with scip -v in the terminal:
SCIP version 6.0.2 [precision: 8 byte] [memory: block] [mode:
optimized] [LP solver: SoPlex 4.0.2] [GitHash: e639a0059d]
Description of the models
Model (I) has approximately 30K binary variables, say X[i] for i in some index set I. It's feasible and not a hard MIP.
Model (II) has the same variables as model (I) plus ~100 continuous variables, say Y[j] for j in some index set J. Model (II) also has some constraints of the form X[i_1] + X[i_2] + ... + X[i_n] <= Y[j].
Both objective functions agree and depend only on X[i] variables, the sense is minimization. Note that variables Y[j] are essentially free in model (II), since they are continuous and they do not appear in the objective function. Obviously, there is no point in including the Y[j] variables, but the optimal values shouldn't be different.
Model (II) is the one yielding the better (i.e. smaller) value.

Sorry for the late answer.
So, in general, it can definitely happen that any MIP solver reports different optimal solutions for formulations that are slightly perturbed (even if they are mathematically the same), due to the use of floating-point arithmetic.
Your problem does seem a bit weird though. You say the Y-variables are free, i.e. they do not have any variable bounds attached to them? If this is the case, I would be very surprised if they don't get presolved away.
If you are still interested in this problem, could you provide your instance files to me and I will look at them?

Related

How to obtain multiple solutions of a binary LP problem using Google's OR-Tools in Python?

I am new to integer optimization. I am trying to solve the following large (although not that large) binary linear optimization problem:
max_{x} x_1+x_2+...+x_n
subject to: A*x <= b ; x_i is binary for all i=1,...,n
As you can see:
- the control variable is a vector x of length, say, n=150; x_i is binary for all i=1,...,n
- I want to maximize the sum of the x_i's
- in the constraint, A is an n-by-n matrix and b is an n-by-1 vector, so I have n=150 linear inequality constraints.
I want to obtain a certain number of solutions, NS. Say, NS=100. (I know there is more than one solution, and there are potentially millions of them.)
I am using Google's OR-Tools for Python. I was able to write the problem and to obtain one solution. I have tried many different ways to obtain more solutions after that, but I just couldn't. For example:
I tried using the SCIP solver, and then I used the value of the objective function at the optimum, call it V, to add another constraint, x_1+x_2+...+x_n >= V, on top of the original "Ax<=b". I then used the CP-SAT solver to find NS feasible vectors (I followed the instructions in this guide). There is no optimization in this second step, just a search for feasible solutions. This didn't work: the solver produced NS replicas of the same vector. Still, when asked for the number of solutions found, it misleadingly reports that solution_printer.solution_count() equals NS. Here's a snippet of the code that I used:
# Define the constraints (A and b are lists)
for j in range(n):
    constraint_expr = [int(A[j][l]) * x[l] for l in range(n)]
    model.Add(sum(constraint_expr) <= int(b[j][0]))

V = 112
constraint_obj_val = [-x[l] for l in range(n)]
model.Add(sum(constraint_obj_val) <= -V)

# Call the solver:
solver = cp_model.CpSolver()
solution_printer = VarArraySolutionPrinterWithLimit(x, NS)
solver.parameters.enumerate_all_solutions = True
status = solver.Solve(model, solution_printer)
I tried using the SCIP solver and then calling solver.NextSolution(), but each call produced a progressively less optimal vector: the first corresponded to a value of, say, V=112 (the optimal one!); the second to 111; the third to 108; the fourth through sixth to 103; etc.
My question is, unfortunately, a bit vague, but here it goes: what's the best way to obtain more than one solution to my optimization problem?
Please let me know if I'm not being clear enough or if you need more/other chunks of the code, etc. This is my first time posting a question here :)
Thanks in advance.
Is your matrix A integral? If not, you are not solving the same problem with SCIP and CP-SAT.
Furthermore, why use SCIP at all? You should solve both parts with the same solver.
Also, I believe the default solution pool implementation in SCIP returns all solutions found, in reverse order, and thus in decreasing quality order.
In Gurobi, you can do something like this to get more than one optimal solution:
solver->SetSolverSpecificParametersAsString("PoolSearchMode=2"); // or-tools [Gurobi]
From the Gurobi Reference [Section 20.1]:
By default, the Gurobi MIP solver will try to find one proven optimal solution to your model. You can use the PoolSearchMode parameter to control the approach used to find solutions. In its default setting (0), the MIP search simply aims to find one optimal solution. Setting the parameter to 1 causes the MIP search to expend additional effort to find more solutions, but in a non-systematic way. You will get more solutions, but not necessarily the best solutions. Setting the parameter to 2 causes the MIP to do a systematic search for the n best solutions. For both non-default settings, the PoolSolutions parameter sets the target for the number of solutions to find.
Another way to find multiple optimal solutions is to first solve the original problem to optimality, then add the objective function as a constraint whose lower and upper bounds are both set to the optimal objective value, and enumerate the feasible solutions of the resulting model.

How to treat float values in a consistent and logical manner? Priority settings producing unusual results (Z3Py)

I have an optimization problem I am attempting to solve with Z3, but it's producing unusual results.
I have two objectives which are isolated from each other; the variables being solved are different for each objective. I would have thought this meant both pareto and lex settings would find the same optimal solution.
However, for different sets of inputs, switching priority stops one of the objectives from being optimal. Pareto sometimes produces sub-optimal results for one objective. Lex sometimes produces sub-optimal results for the other objective.
I suspect this is to do with the types used. Constraints and objectives are formulated using a combination of Real and float types. If I change some float values to RealVals, this also changes the expected outcome.
Is there a way to set the optimizer so that it treats float values in a consistent and logical manner?
Many thanks.
Pareto and lex are different styles. Even if your objectives don’t share variables, they won’t match. Lex will be the right solution, Pareto will explore the entire optimization frontier. So I’m not sure why you expect them to be the same.
Perhaps more importantly for your question, I don't think the optimizer supports floats out of the box. It might be treating the variables as bit-vectors, which would produce misleading results. You can optimize over floats, but you'll have to apply a transformation to them first. The same applies to signed bit-vectors. (I can provide details on how to transform them if that's the root cause of your problem.)
Again, without seeing sample code, all this is guessing what might be going wrong in your setup.

objective function with binary variables

I am working on a complex model in Pyomo. Unfortunately, I have to change the formula of the objective function based on the values its terms take.
In particular, my objective function is composed of two terms, call them A and B, that have different orders of magnitude (A is usually 2 or 3 orders of magnitude larger than B, but this may vary).
In order to guarantee that A and B carry the same weight in the formula, I need to write my objective function as below:
objective = A + B*K
where K is the value that brings the second term to the same scale/magnitude as A.
Example:
A = 4e10
B = 2e3
K = 1e(10-3) = 1e7
The problem is that, in order to know K, I must know the values of A and B, but Pyomo doesn't evaluate them; it just passes an expression to the solver.
I have read that, with a smart use of binary variables, it is possible to overcome this issue. Could anyone suggest a useful methodology?
Kind regards
It seems like you are dealing with a multi-objective optimization problem. Since the values of variables involved in A and B are not known before solving the model, you can't define the value of K based on A and B.
There are different ways to solve multi-objective optimization problems which you can consider for your specific problem (e.g., ε-constraints method). In these problems, usually you are not interested in finding a single solution, but finding a set of Pareto optimal solutions which are not dominated by any other solution in the feasible region.
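As an illustration of the ε-constraint method on a toy bi-objective LP (hypothetical data, and using scipy rather than Pyomo purely to keep the sketch self-contained): minimize one objective while bounding the other by ε, then sweep ε to trace Pareto-optimal points.

```python
import numpy as np
from scipy.optimize import linprog

# Toy bi-objective LP: minimize A(x) = x0 and B(x) = x1
# subject to x0 + x1 >= 1, 0 <= x0, x1 <= 1.
A_ub = np.array([[-1.0, -1.0]])     # -(x0 + x1) <= -1
b_ub = np.array([-1.0])
bounds = [(0, 1), (0, 1)]

def eps_constraint(eps):
    """Minimize A subject to B <= eps (the epsilon-constraint method)."""
    A = np.vstack([A_ub, [[0.0, 1.0]]])      # extra row: x1 <= eps
    b = np.concatenate([b_ub, [eps]])
    return linprog(c=[1.0, 0.0], A_ub=A, b_ub=b, bounds=bounds)

# Sweeping eps traces points on the Pareto front: (B-budget, best A).
front = [(eps, round(eps_constraint(eps).fun, 6)) for eps in (0.25, 0.5, 1.0)]
print(front)
```

Each point on the front trades B against A without ever needing a scaling factor K, which is exactly what makes this approach attractive when the magnitudes of A and B are unknown in advance.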

Fixed-point bounds for optimisation using scipy linprog

I want to run an optimisation problem repeatedly to further refine the end result.
Essentially, the objective is to minimise the maximum of a set of variables (subject to inequality and equality constraints), and then minimise the maximum of the set excluding the maximum, and then minimise the maximum of the set excluding the two largest numbers and so on...
The algorithm I have in mind is:
Run scipy.optimize.linprog(..., bounds=[(-numpy.inf, numpy.inf), (-numpy.inf, numpy.inf), (-numpy.inf, numpy.inf), ...]) with all variables unbounded, to minimise the maximum of the set of numbers.
Assuming the optimisation problem is feasible and successfully solved, fix the maximum to opt_val by setting bounds=[..., (opt_val, opt_val), ...], where all other variables keep the bounds (-numpy.inf, numpy.inf).
Make the inequality constraints corresponding to that variable ineffective by changing the corresponding entries of b_ub to numpy.inf.
Rerun the optimisation with the modified bounds and inequality vector.
This can run without error, but it seems like scipy/numpy explicitly ignore the bounds I place on the variables - I get results for the variables that I have 'fixed' that are not the corresponding opt_val.
Can scipy handle bounds that restrict a variable to a single floating point number?
Is this the best way to be solving my problem?
The code I have developed is really quite extensive, which is why I have not posted it here, so of course I don't expect a code-based solution. What I am looking for here is a yes/no answer as to whether scipy can handle bounding intervals restricted to a single float, and, on a higher level, whether I have the correct approach.
The documentation at https://docs.scipy.org/doc/scipy-0.15.1/reference/generated/scipy.optimize.linprog.html does not explicitly say whether or not it is possible to specify fixed-point bounds.
It turns out it was a problem with relaxing the inequality constraints. I had mistakenly relaxed all constraints involving the fixed variables, when I only needed to relax some of them.
@ErwinKalvelagen's comment is still worth noting, however.
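As a sanity check on the original question: linprog does accept degenerate bounds with lo == hi. A minimal sketch of the min-max step with one variable fixed afterwards (toy data, not the asker's model):

```python
from scipy.optimize import linprog

# Toy min-max LP: minimise max(x0, x1) subject to x0 + x1 >= 3,
# via the usual auxiliary variable t with x_i <= t.
c = [0.0, 0.0, 1.0]                       # variables: x0, x1, t
A_ub = [[-1.0, -1.0, 0.0],                # -(x0 + x1) <= -3
        [1.0, 0.0, -1.0],                 # x0 - t <= 0
        [0.0, 1.0, -1.0]]                 # x1 - t <= 0
b_ub = [-3.0, 0.0, 0.0]
free = (None, None)                       # (None, None) means unbounded
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[free, free, free])
opt_val = res.fun                         # the minimised maximum

# Degenerate bounds (lo == hi) are accepted: fix x0 to opt_val and re-solve.
res2 = linprog(c, A_ub=A_ub, b_ub=b_ub,
               bounds=[(opt_val, opt_val), free, free])
print(res.fun, res2.x[0])
```

If the fixed variable comes back with a different value, the culprit is almost certainly elsewhere in the model (as it was here, in the relaxed constraints), not in the bounds mechanism.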

Constrained least-squares estimation in Python

I'm trying to perform a constrained least-squares estimation using Scipy such that all of the coefficients are in the range (0,1) and sum to 1 (this functionality is implemented in Matlab's LSQLIN function).
Does anybody have tips for setting up this calculation using Python/Scipy? I believe I should be using scipy.optimize.fmin_slsqp(), but I'm not entirely sure what parameters I should be passing to it.[1]
Many thanks for the help,
Nick
[1] The one example in the documentation for fmin_slsqp is a bit difficult for me to parse without the referenced text -- and I'm new to using Scipy.
scipy-optimize-leastsq-with-bound-constraints on SO gives leastsq_bounds, which is leastsq with bound constraints such as 0 <= x_i <= 1. The constraint that they sum to 1 can be added in the same way. (I've found leastsq_bounds / MINPACK to be good on synthetic test functions in 5d, 10d, 20d; how many variables do you have?)
Have a look at this tutorial; it seems pretty clear.
Since MATLAB's lsqlin is a bounded linear least squares solver, you would want to check out scipy.optimize.lsq_linear.
Non-negative least squares optimization using scipy.optimize.nnls is a robust way of doing it. Note that, if the coefficients are constrained to be non-negative and to sum to unity, they are automatically limited to the interval [0,1], so there is no need to additionally constrain them from above.
scipy.optimize.nnls enforces non-negativity via the Lawson-Hanson algorithm, while the sum constraint can be handled as discussed in this thread and this one.
SciPy's nnls uses an old Fortran backend, which is apparently widely used in equivalent implementations of nnls in other software.
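A minimal sketch of the augmented-row trick for the sum-to-one constraint (synthetic data; the weight w is an assumption and may need tuning for real data):

```python
import numpy as np
from scipy.optimize import nnls

# Synthetic data: recover coefficients that are >= 0 and sum to 1.
rng = np.random.default_rng(0)
A = rng.random((20, 3))
x_true = np.array([0.2, 0.3, 0.5])
b = A @ x_true

# Append a heavily weighted row of ones so that sum(x) ~= 1 is enforced
# as a soft equality inside the non-negative least-squares solve.
w = 1e6
A_aug = np.vstack([A, w * np.ones(3)])
b_aug = np.concatenate([b, [w]])
x, residual = nnls(A_aug, b_aug)
print(x, x.sum())
```

The larger w is relative to the entries of A, the more tightly the sum constraint is enforced, at the cost of conditioning; an exact equality would instead require a solver with linear constraints such as fmin_slsqp or lsq_linear plus post-processing.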
