Hello, good day to you all.
I have a small problem in which I assign locations at which demand to return goods to, let's say, a distribution centre occurs. To handle the demand at these locations, we have to install certain technologies (A and B). We can install either exactly as much of B as we have of A, or none of B at all. So if B = 2, then A must be 2 as well; B can never be, say, 1 while A = 2, but B = 0 with A = 2 is allowed.
Now to the problem for which I would like some help. I've coded this in Python and it works for individual locations, but when I try to extend it to cover all locations, I run into errors... Below is the code for the single-location problem:
import gurobipy as gp
from gurobipy import GRB

# Create a new model
m = gp.Model("mip1")
BigM = 100
q = locations[i]  # demand at this location (locations and i are defined elsewhere)
# Create parameters
cost_A = 20000
cost_B = 13500
cost_transport = 0.007
cap_A = 400000
cap_B = 500000
# Create variables
x = m.addVar(lb=0, ub=3, vtype=GRB.INTEGER, name="A")
y = m.addVar(lb=0, ub=3, vtype=GRB.INTEGER, name="B")
z = m.addVar(lb=0, vtype=GRB.INTEGER, name="flow_1")
a = m.addVar(vtype=GRB.BINARY, name="BigM")
# Set objective to minimize cost
m.setObjective(cost_A * x + cost_B * y + cost_transport * z, GRB.MINIMIZE)
# Add constraint: amount of UBC
m.addConstr(z == q, "accept returned demand")
# Add constraint: capacity
m.addConstr(z <= cap_A * x + cap_B * y, "capacity of locations")
# Add constraint: only B if location exists/has A
m.addConstr(y <= x, "B constraint_1")
# Add constraint: only B if location exists/has A
m.addConstr(x - y <= BigM * a, "B constraint_2")
# Add constraint: only B if location exists/has A
m.addConstr(y <= BigM * (1 - a), "B constraint_3")
# Optimize model
m.optimize()
The model above outputs how many A and B to install at a certain location, given the demand at that location. This of course is very simple. However, I now want to extend the problem to multiple locations, and that is where I run into errors... KeyError: (0, 1), KeyError: (1, 1), KeyError: (1) and KeyError: (0) have all appeared already.
I thought it would "simply" be a matter of using addVars and addConstrs, but then you need to assign sets etc. I am actually quite lost in the process. Is there anybody who could please help me out?
Thank you all very much in advance for your time and consideration.
P.S. It does not have to be solved with solvers like Gurobi; if you know anything else, I am happy to hear it!
Yes, you are going to have to tear down most of that and introduce sets... :). My Gurobi syntax is weak, so here are a few things to get you started in pseudocode.
You will need a couple of sets:
T = {A, B}
L = {LA, Chicago, NY} # for example
You need a variable to assign (this is an assignment problem) a quantity of tech T at location L...
X[tech, loc] # domain non-neg integers, right?
Then you can re-construct your constraints & such above using summations over the indices as appropriate. I'm sure there are a bunch of Gurobi examples to help with that. Start VERY small so you can troubleshoot.
You will need a constraint for your conditions on tech B. You didn't say whether that constraint is global or enforced per location. Let's assume it is per location (the harder case). You then need an indicator variable to indicate, per location, whether B equals A there.
Eq[loc] # indicate whether B=A at particular location, binary var
Then you can use that with Big M to control the equality by location, by making 3 constraints per location (a runnable sketch follows the constraints):
X[B, loc] <= X[A, loc]
X[A, loc] - X[B, loc] <= (1 - Eq[loc]) * M
X[B, loc] <= Eq[loc] * M
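To make this concrete, here is a minimal gurobipy sketch of the multi-location model (untested; the location names and demand figures are invented for illustration, while the costs and capacities are taken from the question). The KeyError: (0, 1)-style errors usually mean a variable dictionary is being indexed with a key it was never created with; letting addVars build the full index set avoids that.

import gurobipy as gp
from gurobipy import GRB

# hypothetical demand data; replace with your own locations
demand = {"LA": 350000, "Chicago": 900000, "NY": 1200000}
locs = list(demand.keys())
techs = ["A", "B"]
cost = {"A": 20000, "B": 13500}
cap = {"A": 400000, "B": 500000}
cost_transport = 0.007
M = 100

m = gp.Model("multi_location")
# x[t, l] = number of units of tech t installed at location l
x = m.addVars(techs, locs, lb=0, ub=3, vtype=GRB.INTEGER, name="x")
z = m.addVars(locs, lb=0, vtype=GRB.INTEGER, name="flow")
eq = m.addVars(locs, vtype=GRB.BINARY, name="eq")  # 1 means B = A at location l

m.setObjective(gp.quicksum(cost[t] * x[t, l] for t in techs for l in locs)
               + cost_transport * z.sum(), GRB.MINIMIZE)

m.addConstrs((z[l] == demand[l] for l in locs), name="accept_demand")
m.addConstrs((z[l] <= cap["A"] * x["A", l] + cap["B"] * x["B", l]
              for l in locs), name="capacity")
m.addConstrs((x["B", l] <= x["A", l] for l in locs), name="B_le_A")
m.addConstrs((x["A", l] - x["B", l] <= (1 - eq[l]) * M for l in locs), name="B_eq_A")
m.addConstrs((x["B", l] <= eq[l] * M for l in locs), name="B_or_zero")

m.optimize()
for v in m.getVars():
    if v.X > 0:
        print(v.VarName, v.X)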
This question was asked before me on here. However, the solution there was not satisfactory for me; I am still stuck at a 33% mismatch, so I felt the need to re-open this topic (and the author of that thread didn't add an appropriate answer after solving the issue for themselves).
The code that I have written is here:
import numpy as np

def householder(vec):
    vec = np.asarray(vec, dtype=float)
    if vec.ndim != 1:
        raise ValueError("vec.ndim = %s, expected 1" % vec.ndim)
    n = len(vec)
    I = np.eye(n)
    e1 = np.zeros_like(vec).astype(float)
    e1[0] = 1.0
    V1 = e1 * np.linalg.norm(vec)
    print("V1:", V1)
    u = vec  # note: u is the same array object as vec, so the next line modifies vec in place
    u[0] = -(np.sum(np.square(u[1:]))) / (vec[0] + np.linalg.norm(vec))
    u = u / np.linalg.norm(u)
    H = I - 2 * (np.outer(u, u))
    return V1, H
Here is the test case that this code is supposed to pass:
v = np.array([1, 2, 3])
v1, h = householder(v)
assert_allclose(np.dot(h, v1), v)
assert_allclose(np.dot(h, v), v1)
The first assertion passes successfully; however, the second one gives me a 33% mismatch:
AssertionError:
Not equal to tolerance rtol=1e-07, atol=0
Mismatch: 33.3%
Max absolute difference: 4.4408921e-16
Max relative difference: 1.18687834e-16
x: array([3.741657e+00, 2.220446e-16, 0.000000e+00])
y: array([3.741657, 0. , 0. ])
I have been trying everything for like 5 hours now, and I feel like I'm wasting too much time on this. Any help to make this code pass the test would be much appreciated.
Well, it looks correct to me.
The problem seems to be the parameters of the assert_allclose function. Specifically, it checks whether
absolute(a - b) <= (atol + rtol * absolute(b))
holds for each pair of entries a and b. According to the docs, the absolute tolerance defaults to 1e-8 for the ordinary allclose function; however, assert_allclose's atol parameter defaults to 0.
Since your target b is zero, any value != 0 is not close with respect to this check, even though the two values are certainly reasonably close.
I recommend setting atol to 1e-8, i.e.
assert_allclose(np.dot(h, v), v1, atol=1e-8)
I am not quite sure why the numpy people chose different defaults for the ordinary allclose and assert_allclose, though...
I am not sure whether this is possible using SMT-LIB; if it is not, does an alternative solver exist that can do it?
Consider the equations
a < 10 and a > 5
b < 5 and b > 0
b < c < a
with a, b and c integers
The values of a and b for which the maximum number of models satisfying the equations exists are a = 9 and b = 1.
Does SMT-LIB support the following: for each pair of values of a and b, count the number of models that satisfy the formulas, and return the values of a and b that maximize that count?
I don't think you can do this in general; that is, when you can have arbitrary constraints over arbitrary theories. You are asking a "meta"-question: "Maximize the number of models" is not a question about the problem itself, but rather about the models of the problem; something SMTLib cannot deal with.
Having said that, however, I think it should be possible to code it for specific problems. In the example you gave, the model space is maximized when a - b is the greatest; so you can simply write:
(set-option :produce-models true)
(declare-fun a () Int)
(declare-fun b () Int)
(declare-fun c () Int)
(assert (< 5 a 10))
(assert (< 0 b 5))
(assert (< b c a))
(maximize (- a b))
(check-sat)
(get-value (a b))
To which z3 responds:
sat
((a 9)
(b 1))
as desired. Or, you can use the Python bindings:
from z3 import *

a, b, c = Ints('a b c')
o = Optimize()
o.add(And(5 < a, a < 10, 0 < b, b < 5, b < c, c < a))
o.maximize(a - b)
if o.check() == sat:
    m = o.model()
    print("a = %s, b = %s" % (m[a], m[b]))
else:
    print("unsatisfiable or unknown")
which prints:
a = 9, b = 1
There are also bindings for C/C++/Java/Scala/Haskell etc. that let you do more or less the same from those hosts as well.
But the crucial point here is that we had to manually come up with the goal that maximizing a - b would solve the problem here. That step is something that needs human intervention as it applies to whatever your current problem is. (Imagine you're working with the theory of floats, or arbitrary data-types; coming up with such a measure might be impossible.) I don't think that part can be automated magically using traditional SMT solving. (Unless Patrick comes up with a clever encoding, he's quite clever that way!)
Let's break down your goals:
You want to enumerate all possible ways in which a and b (...and more) can be assigned
For each combination, you want to count the number of satisfiable models
In general, this is not possible, as the domain of some variables in the problem might contain an infinite number of elements.
Even when one can safely assume that the domain of every other variable contains a finite number of elements, it is still highly inefficient.
For instance, if you had only Boolean variables in your problem, you would still have an exponential number of combinations of values --and therefore candidate models-- to consider during the search.
However, it is also possible that your actual application is not that complex in practice, and therefore it can be handled by an SMT Solver.
The general idea could be to use some SMT Solver API and proceed as follows:
assert the whole formula
repeat until all combinations of values have been tried:
    push a back-track point
    assert one specific combination of values, e.g. a = 8 and b = 2
    repeat forever:
        check for a solution
        if UNSAT, exit the inner-most loop
        if SAT, increase the counter of models for the given combination of values of a and b;
            take the model values of the other variables, e.g. c = 5 and d = 6, and
            assert a new constraint requiring that at least one of the "other" variables changes its value, e.g. c != 5 or d != 6
    pop the back-track point
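In z3's Python API, that loop might look something like this (a rough, untested sketch using the toy constraints from this question):

from z3 import Ints, Solver, And, sat

a, b, c = Ints('a b c')
s = Solver()
s.add(And(5 < a, a < 10, 0 < b, b < 5, b < c, c < a))

for av in range(6, 10):              # candidate values for a
    for bv in range(1, 5):           # candidate values for b
        s.push()                     # push a back-track point
        s.add(a == av, b == bv)      # assert one specific combination
        count = 0
        while s.check() == sat:      # repeat until UNSAT
            count += 1
            m = s.model()
            s.add(c != m[c])         # require the "other" variable to change
        print(av, bv, count)
        s.pop()                      # pop the back-track point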
Alternatively, you may enumerate the possible assignments over a and b implicitly rather than explicitly. The idea would be as follows:
assert the whole formula
repeat forever:
    check for a solution
    if UNSAT, exit the loop
    if SAT, take the combination of values of your control variables from the model (e.g. a = 8 and b = 2); look the combination up in an internal map, set its counter to 1 if it is new, otherwise increase it by 1;
        take the model values of the other variables, e.g. c = 5 and d = 6, and
        assert a new constraint requesting a new solution, e.g. a != 8 or b != 2 or c != 5 or d != 6
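The implicit variant is even shorter; a possible z3py sketch (again untested):

from z3 import Ints, Solver, And, Or, sat

a, b, c = Ints('a b c')
s = Solver()
s.add(And(5 < a, a < 10, 0 < b, b < 5, b < c, c < a))

counts = {}                          # (a, b) -> number of models seen
while s.check() == sat:
    m = s.model()
    key = (m[a].as_long(), m[b].as_long())
    counts[key] = counts.get(key, 0) + 1
    # block this exact model so the next check yields a new one
    s.add(Or(a != m[a], b != m[b], c != m[c]))

best = max(counts, key=counts.get)
print(best, counts[best])            # should report (9, 1) with the largest count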
If you are in doubt about which SMT solver to pick, I would advise you to start with pysmt, which lets you switch among several SMT engines with ease.
If an explicit enumeration of models turns out to be too slow to be practical for your application, I would advise you to look at the vast literature on counting solutions of CSPs, where this problem has already been tackled; there seem to be several ways to approximately estimate the number of solutions of a CSP.
To start off, I have already solved this problem, so it's not a big deal; I'm just asking to satisfy my own curiosity. The question is how to solve a series of simultaneous equations given a set of constraints. The equations are:
tau = 62.4*d*0.0007
A = (b + 1.5*d)*d
P = b + 2*d*sqrt(1 + 1.5**2)
R = A/P
Q = (1.486/0.03)*A*(R**(2.0/3.0))*(0.0007**0.5)
and the conditions are:
tau <= 0.29, Q = 10000 +- 3 (say), and minimize b
As I mentioned, I was already able to come up with a solution using a pair of nested loops:
from numpy import linspace, sqrt

b = linspace(320, 330, 1000)
d = linspace(0.1, 6.6392, 1000)
ansQ = []
ansv = []
anstau = []
i_index = []
j_index = []
for i in range(len(b)):
    for j in range(len(d)):
        tau = 62.4*d[j]*0.0007
        A = (b[i] + 1.5*d[j])*d[j]
        P = b[i] + 2*d[j]*sqrt(1 + 1.5**2)
        R = A/P
        Q = (1.486/0.03)*A*(R**(2.0/3.0))*(0.0007**0.5)
        if Q >= 10000 and tau <= 0.29:
            ansQ.append(Q)
            ansv.append(Q/A)
            anstau.append(tau)
            i_index.append(i)
            j_index.append(j)
This takes a while, and something in the back of my head says there must be an easier/more elegant solution to this problem. Thanks! (Linux Mint 13, Python 2.7.x, scipy 0.11.0)
You seem to have only two degrees of freedom here---you can rewrite everything in terms of b and d, or b and tau, or (pick your two favorites). Your constraint on tau directly implies a constraint on d, and you can use your constraint on Q to imply a constraint on b.
And it doesn't look (to me at least; I still haven't finished my coffee) like your code is doing anything other than evaluating some two-dimensional functions over a grid you've defined--NOT solving a system of equations. I normally understand "solving" to involve setting something equal to something else and writing one variable as a function of another.
It does appear you've only posted a snippet, though, so I'll assume you do something else with your data downstream.
OK, I see. I think this isn't really a minimization problem, it's a plotting problem. The first thing I'd do is work out the range your constraint on tau implies for d. Then you can mesh those points with meshgrid (as you mentioned below) and run over all combinations.
Since you're applying the constraint before you build the mesh (as opposed to after, as in your code), you'll only be sampling the part of the parameter space that you're interested in. In your code you generate a bunch of junk you're not interested in and pick out the gems; if you apply your constraints first, you'll only be left with gems!
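Something along these lines, say (a quick untested sketch; the bound on d comes from tau = 62.4*d*0.0007 <= 0.29):

import numpy as np

d_max = 0.29 / (62.4 * 0.0007)       # tau <= 0.29 implies d <= ~6.639
b, d = np.meshgrid(np.linspace(320, 330, 1000),
                   np.linspace(0.1, d_max, 1000))

# every candidate point already satisfies the tau constraint
A = (b + 1.5*d) * d
P = b + 2*d*np.sqrt(1 + 1.5**2)
R = A / P
Q = (1.486/0.03) * A * R**(2.0/3.0) * 0.0007**0.5

mask = Q >= 10000                    # or np.abs(Q - 10000) <= 3 for the +-3 version
print(b[mask], d[mask])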
I'd define my functions like:
P = lambda b, d: b + 2*d*np.sqrt(1 + 1.5**2)
which works like
>>> import numpy as np
>>> P = lambda b, d: b + 2*d*np.sqrt(1 + 1.5**2)
>>> P(1,2)
8.2111025509279791
Then you can write another function to serve up b and d for you, so you can do something like:
def get_func_vals(b, d):
    pvals.append(P(b, d))
or, better yet, pull the (b, d) tuples from a generator--a function that doesn't return but yields:
pvals = [P(b,d) for (b,d) in thing_that_yields_b_and_d_tuples]
I didn't test this last line of code, and I always screw up these parentheses, but I think it's right.
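For completeness, such a generator might look like this (untested; bd_pairs is a made-up name):

import numpy as np

P = lambda b, d: b + 2*d*np.sqrt(1 + 1.5**2)

def bd_pairs(bs, ds):
    # yields (b, d) tuples one at a time instead of building a big list
    for b in bs:
        for d in ds:
            yield (b, d)

pvals = [P(b, d) for (b, d) in bd_pairs(np.linspace(320, 330, 100),
                                        np.linspace(0.1, 6.639, 100))]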
I am new to programming (Python is my first language), but I love designing algorithms. I am currently working on a system of equations (integers), and I cannot find any references to solving my particular problem.
Let me explain.
I have an equation (a test, if you will):
raw_input == [(90*x + a) * y] + z
where a is some constant.
My problem is that the variable z counts in a manner very similar to a Fibonacci sequence, and the variable x is the step of z. What I mean by this is that at the first term of the z sequence x = 0, and at the second term of the z sequence x = 1. I need to solve for y.
The exact process for determining z is as follows, where c and d are constants:
#at x = 0
temp = (c+(90*x)) * (d+(90*x))
temp/90 = z(0)
#at x = 1
new_temp = (c+(90*x)) * (d + (90*x))
new_temp/90 = z(1)
#for all the rest of the values of z (and x), use:
j = z(# x=1) - z(# x=0)
k = j + 180
l = z(# x=1) + k
print "z(# x=1) - z(# x=0) = j"
print "j + 180 = k"
print "k + z(1) = l"
repeat until z > raw_input
this creates the spread of z values by the relation:
j = z(# x=n) - z(# x=n-1)
k = j + 180
l = k + z(# x = n)
I need to scan through (skip) the values of z < x to test for the condition of a whole-number solution for y.
Does this seem possible?
It seems your best approach would be to recast the given equation as a recurrence relation, and then either define a recursive function to compute the values you want or find the closed-form solution of the relation. For more information on recurrence relations, see:
Any decent book on Combinatorics
Wikipedia: Recurrence relation. Particularly, the sections:
2.1: Linear homogeneous recurrence relations with constant coefficients
2.2: Rational generating function
3.1: Solving recurrence relations, General Methods
Though the general methods for solving recurrence relations are reasonably capable, the most powerful technique is the z-transform: 3.3: Solving with z-transforms
3.5: Solving non-homogeneous recurrence relations. The techniques and discussion in the rest of the article are mostly suited to theoretical applications, but may occasionally find practical use as well.
WolframMathWorld: Recurrence equation
Finally, in my experience, such problems are best tackled with mathematical/numerical analysis software such as MATLAB, Octave, or Mathematica. At the very least, these give you a platform for rapid development and testing.
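As a concrete observation on this particular problem (ignoring the integer division in your process): z(x) = (c + 90x)(d + 90x)/90 = 90x^2 + (c + d)x + cd/90 is simply a quadratic in x, so its second difference z(x+1) - 2z(x) + z(x-1) is the constant 2 * 90 = 180. That is exactly the "+ 180" step in your pseudo-code: the j/k/l recurrence is just a way of stepping a quadratic.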
All I've done is translate your pseudo-code into Python. Maybe it can be of some help. Perhaps you should have a look at the Python tutorial if you haven't already.
# python 2.7
# raw_input returns a string - convert to int
upper_bound = int(raw_input('Upper bound: '))

def z(x):
    'A function to calculate z from x.'
    # c and d are constants
    c = 5
    d = 2
    # integer division here
    return (c + 90*x)*(d + 90*x)/90

# the value of z_0
z0 = z_x = z(0)
# a list to hold the z values z_0, z_1, ...
# the list includes z_0 (when x = 0)
zs = [z0]
x = 1
while z_x < upper_bound:
    z_x = z(x)
    zs.append(z_x)
    j = zs[x] - zs[x - 1]
    k = j + 180
    l = zs[x] + k
    print j, k, l
    x += 1
I am faced with the following programming problem. I need to generate n (a, b) tuples for which the sum of all a's is a given A, the sum of all b's is a given B, and for each tuple the ratio a / b is in the range (c_min, c_max). A / B is within the same range too. I am also trying to make sure there is no bias in the result other than what is introduced by the constraints, and that the a / b values are more or less uniformly distributed in the given range.
Some clarifications and meta-constraints:
A, B, c_min, and c_max are given.
The ratio A / B is in the (c_min, c_max) range. This has to be so if the problem is to have a solution given the other constraints.
a and b are > 0 and real-valued (they need not be integers).
I am trying to implement this in Python but ideas in any language (English included) are much appreciated.
We look for tuples a_i and b_i such that
(a_1, ... a_n) and (b_1, ... b_n) have a distribution which is invariant under permutation of indices (what you would call "unbiased")
the ratios a_i / b_i are uniformly distributed on [cmin, cmax]
sum(a_i) = A, sum(b_i) = B
If c_min and c_max are not too ill-conditioned (i.e. they are not very close to one another), and n is not very large, the following works:
Generate the a_i "uniformly" such that sum(a_i) = A:
    Draw n samples aa_i (i = 1..n) from some distribution (e.g. uniform).
    Divide them by their sum and multiply by A: a_i = A * aa_i / sum(aa_i) has the desired properties.
Generate the b_i such that sum(b_i) = B by the same method.
If there exists an i such that a_i / b_i is not in the interval [c_min, c_max], throw away all the a_i and b_i and try again from the beginning.
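A minimal numpy sketch of this rejection scheme might look like (untested):

import numpy as np

def sample_tuples(n, A, B, c_min, c_max):
    rng = np.random.default_rng()
    while True:
        a = rng.random(n)
        a *= A / a.sum()             # scale so that sum(a) == A
        b = rng.random(n)
        b *= B / b.sum()             # scale so that sum(b) == B
        r = a / b
        if ((r > c_min) & (r < c_max)).all():
            return a, b              # accept; otherwise throw away and retry

a, b = sample_tuples(5, 100.0, 200.0, 0.25, 0.75)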
It doesn't scale well with n, because the set of a_i and b_i satisfying the constraints gets narrower and narrower as n increases (so you reject more and more candidates).
To be honest, I don't see any other simple solution. If n gets large and c_min ~ c_max, then you will have to use a sledgehammer (e.g. MCMC) to generate samples from your distribution, unless there is some trick we have not seen.
If you really want to use MCMC algorithms, note that you can change cmin to cmin * B / A (likewise for cmax) and assume A == B == 1. The problem is then to draw uniformly on the product of two unit n-simplices (u_1...u_n, v_1...v_n) such that
u_i / v_i \in [cmin, cmax].
So you have to use an MCMC algorithm (Metropolis-Hastings seems best suited) on the product of two unit n-simplices, with the density
f(u_1, ..., u_n, v_1, ..., v_n) = \prod_i indicator_{u_i/v_i \in [cmin, cmax]}
which is definitely doable (albeit involved).
Start by generating as many identical tuples, n, as you need:
(A/n, B/n)
Now pick two tuples at random. Make a random change to the a value of one, and a compensating change to the a value of the other, keeping everything within the given constraints. Put the two tuples back.
Now pick another random pair. This time, twiddle the b values.
Lather, rinse, repeat.
I think the simplest thing is to
Use your favorite method to throw n-1 values such that \sum_{i=1}^{n-1} a_i < A, and set a_n to get the right total. There are several SO questions about doing that, though I've never seen an answer I'm really happy with yet. Maybe I'll write a paper or something.
Get the n-1 b's by throwing the c_i uniformly on the allowed range and setting b_i = a_i / c_i, then set the final b to get the right total and check the final c (I think it must be OK, but I haven't proven it yet).
Note that since we have 2 hard constraints we should expect to throw 2n-2 random numbers, and this method does exactly that (on the assumption that you can do step 1 with n-1 throws).
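A rough numpy sketch of this idea (untested; as noted above, positivity of the final b and the final ratio still need checking):

import numpy as np

rng = np.random.default_rng()
n, A, B, c_min, c_max = 5, 100.0, 200.0, 0.25, 0.75

# n-1 throws kept small enough that their sum stays below A; the last a fixes the total
a = np.empty(n)
a[:n-1] = rng.uniform(0, A / n, n - 1)
a[n-1] = A - a[:n-1].sum()

# n-1 ratios thrown uniformly on the allowed range; the last b fixes the total
c = rng.uniform(c_min, c_max, n - 1)
b = np.empty(n)
b[:n-1] = a[:n-1] / c                # since c_i = a_i / b_i
b[n-1] = B - b[:n-1].sum()

print(a, b, a / b)                   # the final ratio a[n-1]/b[n-1] is unchecked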
Blocked Gibbs sampling is pretty simple and converges to the right distribution (this is along the lines of what Alexandre is proposing).
For all i, initialize a_i = A / n and b_i = B / n.
Select i ≠ j uniformly at random. With probability 1/2, update a_i and a_j with uniform random values satisfying the constraints. The rest of the time, do the same for b_i and b_j.
Repeat Step 2 as many times as seems to be necessary for your application. I have no idea what the convergence rate is.
Lots of good ideas here. Thanks! rossum's idea seemed the most straightforward implementation-wise, so I went for it. Here is the code for posterity:
import random

c_min = 0.25
c_max = 0.75
a_sum = 100.0
b_sum = 200.0
n = 1000

a = [a_sum / n] * n
b = [b_sum / n] * n

# good_enough(a, b) is discussed below
while not good_enough(a, b):
    # twiddle a random pair of a values, keeping both ratios in range
    i, j = random.sample(range(n), 2)
    li, ui = c_min * b[i] - a[i], c_max * b[i] - a[i]
    lj, uj = a[j] - c_min * b[j], a[j] - c_max * b[j]
    llim = max((li, uj))
    ulim = min((ui, lj))
    q = random.uniform(llim, ulim)
    a[i] += q
    a[j] -= q
    # twiddle a random pair of b values the same way
    i, j = random.sample(range(n), 2)
    li, ui = a[i] / c_max - b[i], a[i] / c_min - b[i]
    lj, uj = b[j] - a[j] / c_max, b[j] - a[j] / c_min
    llim = max((li, uj))
    ulim = min((ui, lj))
    q = random.uniform(llim, ulim)
    b[i] += q
    b[j] -= q
The good_enough(a, b) function can be a lot of things. I tried:
Standard deviation, which is hit or miss, as you don't know what is a good enough value.
Kurtosis, where a large negative value would be nice. However, it is relatively slow to calculate and is undefined for the seed values (a_sum / n, b_sum / n) (though that's trivial to fix).
Skewness, where a value close to 0 is desirable. But it has the same drawbacks as kurtosis.
A number of iterations proportional to n: 2n sometimes wasn't enough, and n^2 is a bit of overkill (quadratic, so slow for large n).
Ideally, a heuristic using a combination of skewness and kurtosis would be best, but I settled for making sure each value had been changed from its initial one (again, as rossum suggested in a comment). Though there is no theoretical guarantee that the loop will terminate, it seemed to work well enough for me.
So here's what I think from a mathematical point of view. We have sequences a_i and b_i such that the sum of the a_i is A and the sum of the b_i is B. Furthermore, A/B is in (x, y), and so is a_i/b_i for each i. Furthermore, you want the a_i/b_i to be uniformly distributed in (x, y).
So do it starting from the end. Choose c_i from (x, y) such that they are uniformly distributed. Then we want the equality a_i/b_i = c_i to hold, so a_i = b_i*c_i.
Therefore we only need to find the b_i. But we have the following system of linear equations:
A = sum(b_i * c_i)
B = sum(b_i)
where the b_i are the variables. Solve it (some fancy linear algebra tricks) and you're done!
Note that for large enough n this system will have lots of solutions. They will be dependent on some parameters which you can choose randomly.
Enough of the theoretical approach; let's see a practical solution.
// EDIT 1: Here's some hard core Python code :D
import random

c_min = 0.0   # renamed from `min` and `max` so the built-ins aren't shadowed
c_max = 10.0
A = 500.0
B = 100.0

def generate(n):
    C = [c_min + i*(c_max - c_min)/(n + 1) for i in range(1, n + 1)]
    Y = [0]
    for i in range(1, n - 1):
        # This line should be changed in order to always get positive numbers.
        # It should be relatively easy to figure out some good random generator.
        Y.append(random.random())
    val = A - C[0]*B
    for i in range(1, n - 1):
        val -= Y[i] * (C[i] - C[0])
    val /= (C[n-1] - C[0])
    Y.append(val)
    val = B
    for i in range(1, n):
        val -= Y[i]
    Y[0] = val
    result = []
    for i in range(0, n):
        result.append([Y[i]*C[i], Y[i]])
    return result
The result is a list of pairs (X, Y) satisfying your conditions, except that some may be negative (see the random generator line in the code); i.e. the first and the last pair may contain negative numbers.
// EDIT 2:
To ensure that they are positive, you may try something like
Y.append(random.random() * B / n)
instead of
Y.append(random.random())
I'm not sure though.
// EDIT 3:
In order to get better results, try something like this:
avrg = B / n
ran = avrg / 20
for i in range(1, n-1):
    Y.append(random.gauss(avrg, ran))
instead of
for i in range(1, n-1):
    Y.append(random.random())
This will make all the b_i lie near B / n. Unfortunately, the last term will still sometimes jump high. I'm sorry, but there is no way to avoid this (it's mathematics): the last and the first terms depend on the others. For small n (~100) it looks good, though. Unfortunately, some negative values may still appear.
The choice of a correct generator is not so simple if you additionally want the b_i to be uniformly distributed.