nonlinear optimization with vectors, scalars and inequality constraints - python

I have a set of equations of the form: Y = aA + bB
where Y is a known vector of floats (only this one is known!); a, b are unknown scalars (floats) and A, B are unknown vectors of floats. Each equation has its own Y, a, b, whereas all equations share the same unknown vectors A and B.
I have a set of such equations, so my problem is to minimize the function:
(Y-aA-bB) + (Y'-a'A-b'B) + ...
I also have many inequality constraints of the type: Ai > Aj (Ai is the i-th element of vector A), Bi >= Bk, Bi > 0, a > a', ...
Is there any software or library (ideally for Python) which can handle this problem?

General remarks
This is a linear problem (at least in the linear least-squares sense, continue reading)!
It's also incompletely specified, as it's not clear whether there should always be a feasible solution in your case or whether you want to minimize some given loss in general. Your text sounds like the latter, but in this case one has to choose the loss (which makes a difference with regard to possible algorithms). Let's take the euclidean norm (probably the best pick here)!
Ignoring constraints for a moment, we can view this problem as basic least-squares solution to a linear matrix equation problem (euclidean-norm vs. squared euclidean-norm does not make a difference!).
min || b - Ax ||^2
Here:
M = number of Y's
N = size of Y
b = (Y0,
     Y1,
     ...)                                 -> shape: M*N (flattened: Y_x = (y_x_0, y_x_1).T)
A = ((a0, 0, 0, ..., b0, 0, 0, ...),
     (0, a0, 0, ..., 0, b0, 0, ...),
     (0, 0, a0, ..., 0, 0, b0, ...),
     ...
     (a1, 0, 0, ..., b1, 0, 0, ...))      -> shape: (M*N, N*2)
x = (A0, A1, A2, ..., B0, B1, B2, ...)    -> shape: N*2 (one block for A, one for B)
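To make that standard form concrete, here is a minimal sketch (my own illustration with made-up data; Ys, a_coefs and b_coefs stand for the known quantities) of building the block matrix and solving the unconstrained problem with numpy's lstsq, the route mentioned in the next section:
import numpy as np

M, N = 4, 3                                    # number of equations, length of each Y
rng = np.random.default_rng(0)
Ys = rng.random((M, N))                        # known Y vectors (one row per equation)
a_coefs = rng.random(M)                        # known scalars a, a', ...
b_coefs = rng.random(M)                        # known scalars b, b', ...

# block row for equation i: [a_i * I_N, b_i * I_N]
A_mat = np.vstack([np.hstack([a_coefs[i] * np.eye(N), b_coefs[i] * np.eye(N)]) for i in range(M)])
b_vec = Ys.ravel()                             # stacked Y's, shape (M*N,)

x, *_ = np.linalg.lstsq(A_mat, b_vec, rcond=None)
A_est, B_est = x[:N], x[N:]                    # unconstrained least-squares estimates of A and B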
What you should do
If unconstrained:
Convert to standard-form and use numpy's lstsq
If constrained:
Either use customized optimization algorithms, or:
Linear-programming (if minimizing absolute-differences / l1-norm)
I'm too lazy to formulate it for scipy's linprog
Not that hard, but l1-norm is non-trivial using scipy's API
Much easier to formulate with cvxpy (obj=cvxpy.norm(X, 1))
Quadratic-programming / Second-order-cone-programming (if minimizing euclidean norm / l2-norm)
Again, too lazy to formulate it; no special solver available in scipy yet
Could be easily formulated with cvxpy (obj=cvxpy.norm(X, 2)); see the sketch after this list
Emergency: use general-purpose constrained nonlinear-optimization algorithms like SLSQP -> see code
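For reference, a minimal cvxpy sketch of the l2-norm (SOCP) formulation; the data (Ys, a, b) is made up and the two constraints are placeholders for yours:
import cvxpy as cp
import numpy as np

M, N = 4, 3
rng = np.random.default_rng(1)
Ys = rng.random((M, N))                        # known Y vectors
a = rng.random(M)                              # known scalars a, a', ...
b = rng.random(M)                              # known scalars b, b', ...

A = cp.Variable(N)
B = cp.Variable(N)

# stack the residuals of all M equations into one long vector
residuals = cp.hstack([Ys[i] - a[i] * A - b[i] * B for i in range(M)])

constraints = [A >= B, B >= 0]                 # example constraints only; replace with your own
prob = cp.Problem(cp.Minimize(cp.norm(residuals, 2)), constraints)
prob.solve()
print(A.value, B.value)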
Some hacky code (not the best approach!)
This code:
Is just a demo!
Uses general nonlinear optimization algorithms from scipy
Therefore:
easier to formulate
Less fast & robust than LP, QP, SOCP
But will achieve approximately the same result as convergence on convex optimization problems is guaranteed
Uses numerical differentiation (finite differences) whenever gradients are needed
(author too lazy to add gradients)
this can really hurt if performance is important
Is really ugly in terms of np.repeat vs. broadcasting!
Code:
import numpy as np
from scipy.optimize import minimize

np.random.seed(1)

""" Fake-problem (usually the job of the question-author!) """
def get_partial(N=10):
    Y = np.random.uniform(size=N)
    a, b = np.random.uniform(size=2)
    return Y, a, b

""" Optimization """
def optimize(list_partials, N, M):
    """ General approach:
            This is a linear system of equations (with constraints)
            Basic (unconstrained) form: min || b - Ax ||^2
    """
    Y_all = np.vstack([part[0] for part in list_partials]).ravel()        # flat 1d
    a_all = np.hstack([np.repeat(part[1], N) for part in list_partials])  # repeat to be of same shape
    b_all = np.hstack([np.repeat(part[2], N) for part in list_partials])  # """

    def func(x):
        A = x[:N]
        B = x[N:]
        return np.linalg.norm(Y_all - a_all * np.repeat(A, M) - b_all * np.repeat(B, M))

    """ Example constraints: A >= B element-wise """
    cons = ({'type': 'ineq',
             'fun': lambda x: x[:N] - x[N:]})

    res = minimize(func, np.zeros(N*2), constraints=cons, method='SLSQP', options={'disp': True})
    print(res)
    print(Y_all - a_all * np.repeat(res.x[:N], M) - b_all * np.repeat(res.x[N:], M))

""" Test """
M = 4
N = 3
list_partials = [get_partial(N) for i in range(M)]
optimize(list_partials, N, M)
Output:
Optimization terminated successfully. (Exit mode 0)
Current function value: 0.9019356096498999
Iterations: 12
Function evaluations: 96
Gradient evaluations: 12
fun: 0.9019356096498999
jac: array([ 1.03786588e-04, 4.84041870e-04, 2.08129734e-01,
1.57609582e-04, 2.87599862e-04, -2.07959406e-01])
message: 'Optimization terminated successfully.'
nfev: 96
nit: 12
njev: 12
status: 0
success: True
x: array([ 1.82177105, 0.62803449, 0.63815278, -1.16960281, 0.03147683,
0.63815278])
[ 3.78873785e-02 3.41189867e-01 -3.79020251e-01 -2.79338679e-04
-7.98836875e-02 7.94168282e-02 -1.33155595e-01 1.32869391e-01
-3.73398306e-01 4.54460178e-01 2.01297470e-01 3.42682496e-01]
I did not check the result! If there is an error, it's an implementation error, not a conceptual one (my opinion)!

I agree with sascha that this is a linear problem. As I do not like constraints very much, I actually prefer to make it nonlinear without constraints. I do so by setting the vector A = (a1**2, a1**2 + a2**2, a1**2 + a2**2 + a3**2, ...); this ensures that all entries are positive and that A_i > A_j for i > j. That makes the errors a bit problematic, as you now have to consider error propagation to get A1, A2, etc., including correlations, but I will make an important point on that at the end. The "simple" solution would look as follows:
import numpy as np
from scipy.optimize import leastsq
from random import random

np.set_printoptions(linewidth=190)

def generate_random_vector(n, sortIt=True):
    out = np.fromiter((random() for x in range(n)), float)
    if sortIt:
        out.sort()
    return out

def residuals(parameters, dataVec, dataLength, vecDims):
    aParams = parameters[:dataLength]
    bParams = parameters[dataLength:2*dataLength]
    AParams = parameters[-2*vecDims:-vecDims]
    BParams = parameters[-vecDims:]
    YList = dataVec
    AVec = [a**2 for a in AParams]   # assures A_i > 0
    BVec = [b**2 for b in BParams]
    AAVec = np.cumsum(AVec)          # assures A_i > A_j for i > j
    BBVec = np.cumsum(BVec)
    dist = [np.array(Y) - a*np.array(AAVec) - b*np.array(BBVec) for Y, a, b in zip(YList, aParams, bParams)]
    dist = np.ravel(dist)
    return dist

if __name__ == "__main__":
    aList = generate_random_vector(20, sortIt=False)
    bList = generate_random_vector(20, sortIt=False)
    AVec = generate_random_vector(5)
    BVec = generate_random_vector(5)
    YList = [a*AVec + b*BVec for a, b in zip(aList, bList)]
    aGuess = 20*[.2]
    bGuess = 20*[.3]
    AGuess = 5*[.4]
    BGuess = 5*[.5]
    bestFitValues, covMX, infoDict, messages, ier = leastsq(residuals, aGuess + bGuess + AGuess + BGuess, args=(YList, 20, 5), full_output=True)
    print("a")
    print(aList)
    besta = bestFitValues[:20]
    print(besta)
    print("b")
    print(bList)
    bestb = bestFitValues[20:40]
    print(bestb)
    print("A")
    print(AVec)
    bestA = bestFitValues[-2*5:-5]
    realBestA = np.cumsum([x**2 for x in bestA])
    print(realBestA)
    print("B")
    print(BVec)
    bestB = bestFitValues[-5:]
    realBestB = np.cumsum([x**2 for x in bestB])
    print(realBestB)
    print(covMX)
The problem with errors and correlations is that the solution is not unique. If Y = aA + bB is a solution and we, e.g., rotate such that A = cE + sF and B = -sE + cF, then Y = (ac - bs)E + (as + bc)F = eE + fF is also a solution. The parameter space is, hence, completely flat at "the solution", resulting in huge errors and apocalyptic correlations.

Related

How to compute the operator Schmidt decomposition of a matrix using python

How to do THIS in python?
Currently, I can transform a 4x4 matrix 'A' into its bipartite shape [2,2,2,2], as described here.
Once I perform the SVD on the latter (the transformed version of 'A') using Python via:
NumPy SVD: U, S, Vh = numpy.linalg.svd(A)
PyTorch SVD: U, S, Vh = torch.linalg.svd(A)
it turns out (in both cases) that:
U.shape = Vh.shape = [2,2,2,2]
S.shape = [2,2,2]
I want to write a function:
def to_schmidt(A):
U, S, Vh = torch.linalg.svd(A)
#
# Do stuff with U, S, and Vh
#
return [[s1, ..., s4], [B1, ..., B4], [C1, ..., C4]]
Which returns a list of the s's, B's, and C's just as described here. Just to clarify (using PyTorch in the example):
should_be_the_original_matrix_A = None
for i in range(4):
    if should_be_the_original_matrix_A is None:
        should_be_the_original_matrix_A = s[i] * torch.kron(B[i], C[i])
    else:
        should_be_the_original_matrix_A += s[i] * torch.kron(B[i], C[i])
#
# The one with A_{i}{j} with i, j = 0, 1, 2, 3
# or alternatively A.shape = [4, 4], i.e.
# before the transformation to the bipartite rep.
#
The answer given there uses Mathematica, but as I'm not familiar with it: how can I complete this function?
Hopefully using PyTorch, NumPy, or anything convertible to them, fairly trivially.
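For what it's worth, here is a hedged sketch of one common way to do this in PyTorch: realign the 4x4 matrix so that the SVD of the realigned matrix yields the Schmidt terms directly. This is my own illustration of the standard realignment trick, not the Mathematica answer referenced above:
import torch

def to_schmidt(A):
    # A is 4x4, viewed as an operator on C^2 (x) C^2 with row index 2*i + k, column index 2*j + l.
    # Realign so that R[2*i + j, 2*k + l] = A[2*i + k, 2*j + l], then take the SVD of R.
    R = A.reshape(2, 2, 2, 2).permute(0, 2, 1, 3).reshape(4, 4)
    U, S, Vh = torch.linalg.svd(R)
    B = [U[:, n].reshape(2, 2) for n in range(4)]
    C = [Vh[n, :].reshape(2, 2) for n in range(4)]
    return S, B, C

# check: A should be recovered as sum_n S[n] * kron(B[n], C[n])
A = torch.randn(4, 4)
S, B, C = to_schmidt(A)
reconstructed = sum(S[n] * torch.kron(B[n], C[n]) for n in range(4))
print(torch.allclose(reconstructed, A, atol=1e-5))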

Use linear algebra methods of SciPy to solve the three simultaneous equations

My question is: how do I write these equations as arrays and solve them?
from scipy import linalg
import numpy as np
import matplotlib.pyplot as plt
x = np.array[-23,1100,2300],[2300,1500,550],[550,1600,]
I tried to write the array above, but I couldn't figure out how to handle 'In' and 'Vs2' from the question. Can you help me solve it?
You want to solve these equations for several voltages, which suggests the use of a for-loop. For clarity, it's usually better to use identifiers for values, thus for instance, R1 rather than 1100. Put the R1 in formulae and let the computer do the simple arithmetic for you.
You may be thinking of using the linalg solve function since you need to solve a square matrix of order three. The unknowns are the currents. Therefore, do the algebra so that you have expressions for the coefficients of the matrix, and for the right side of the equation, in terms of resistances and voltages.
For the matrix (as indicated in the documentation at https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.solve.html#scipy.linalg.solve),
a = np.array([[f1(Rs, Vs), f2(Rs, Vs), f3(Rs, Vs)], [...], [...]])
For the vector on the right side,
b = np.array([f4(Rs, Vs), f5(Rs,Vs), f6(Rs, Vs)])
Then currents = solve(a, b)
Notice that f1, f2, etc are those functions that you have to calculate algebraically.
Now put this code in a loop, more or less like this:
for vs2 in [10,15,20,25]:
    currents = solve(a, b)
Because you've got the resistances and vs2's in your algebraic expressions you'll get the corresponding currents. You'll need to collect the currents corresponding to voltages for plotting.
Addition: Partial result of algebraic manipulation:
More: How I would avoid most of the pesky algebra using the sympy library:
>>> R1, R2, R3, R4, R5, Vs1 = 1100, 2300, 1500, 550, 1600, 23
>>> from sympy import *
>>> var('I1,I2,I3,Vs2')
(I1, I2, I3, Vs2)
>>> eq1 = -Vs1 + R1*I1 + R2 * (I1-I2)
>>> eq1
3400*I1 - 2300*I2 - 23
>>> eq2 = R2*(I2-I1)+R3*I2+R4*(I2-I3)
>>> eq2
-2300*I1 + 4350*I2 - 550*I3
>>> eq3 = R4*(I3-I2)+R5*I3 + Vs2
>>> eq3
-550*I2 + 2150*I3 + Vs2
>>> from scipy import linalg
>>> import numpy as np
>>> for Vs2 in [10,15,20,25]:
... ls = np.array([[3400,-2300,0],[-2300,4350,-550],[0,-550,2150]])
... rs = np.array([23, 0, -Vs2])
... I = linalg.solve(ls, rs)
... Vs2, I
...
(10, array([ 0.01007914, 0.0048996 , -0.00339778]))
(15, array([ 0.00975305, 0.00441755, -0.00584667]))
(20, array([ 0.00942696, 0.0039355 , -0.00829557]))
(25, array([ 0.00910087, 0.00345346, -0.01074446]))
In order to solve a linear system of equations for an unknown vector x = In, which is classically written as Ax = b, you need to pass a coefficient matrix A and a right-hand-side vector b to the linalg.solve function. Based on your question, you just have to rewrite your three equations in matrix form in terms of the unknown currents to get A and b. This was done with sympy above, but it is pretty overkill here imo. Here follows an easier-to-read solution with an analytic A:
from scipy.linalg import lu_factor, lu_solve
import numpy as np
import matplotlib.pyplot as plt

# your data
R1 = 1100
R2 = 2300
R3 = 1500
R4 = 550
R5 = 1600
Vs1 = 23

# Vs2 range of interest as a list
Vs2_range = [10, 15, 20, 25]

# construct A: the coefficient matrix of the left-hand side in terms of In = [I1, I2, I3]
A = np.array([[ R1+R2,       -R2,     0],
              [   -R2,  R2+R3+R4,   -R4],
              [     0,       -R4, R4+R5]])

# pre-compute pivoted LU decomposition of A to solve Ax=b (because only b is changing here)
A_LU = lu_factor(A)

# initialize results
res = np.empty((len(Vs2_range), 3))

# loop over Vs2 values
for i, Vs2 in enumerate(Vs2_range):
    # construct b: the right-hand-side vector for each Vs2
    b = np.array([Vs1, 0, -Vs2])
    # then solve the linear system Ax=b
    In = lu_solve(A_LU, b)
    # stock results as rows of the res array
    res[i, :] = In

# plot each current In (column of res) vs Vs2_range
for i in range(3):
    plt.plot(Vs2_range, res[:, i], '-+', label='I'+str(i+1))
plt.xlabel('Vs2 [V]')
plt.ylabel('I [A]')
plt.legend()
which gives a plot of the three currents I1, I2, I3 against Vs2.
Hope this helps.

Is my problem suited for convex optimization, and if so, how to express it with cvxpy?

I have an array of scalars of m rows and n columns. I have a Variable(m) and a Variable(n) that I would like to find solutions for.
The two variables represent values that need to be broadcast over the columns and rows respectively.
I was naively thinking of writing the variables as Variable((m, 1)) and Variable((1, n)), and adding them together as if they're ndarrays. However, that doesn't work, as broadcasting is not allowed.
import cvxpy as cp
import numpy as np
# Problem data.
m = 3
n = 4
np.random.seed(1)
data = np.random.randn(m, n)
# Construct the problem.
x = cp.Variable((m, 1))
y = cp.Variable((1, n))
objective = cp.Minimize(cp.sum(cp.abs(x + y + data)))
# or:
#objective = cp.Minimize(cp.sum_squares(x + y + data))
prob = cp.Problem(objective)
result = prob.solve()
print(x.value)
print(y.value)
This fails on the x + y expression: ValueError: Cannot broadcast dimensions (3, 1) (1, 4).
Now I'm wondering two things:
Is my problem indeed solvable using convex optimization?
If yes, how can I express it in a way that cvxpy understands?
I'm very new to the concept of convex optimization, as well as cvxpy, and I hope I described my problem well enough.
I offered to show you how to represent this as a linear program, so here it goes. I'm using Pyomo, since I'm more familiar with that, but you could do something similar in PuLP.
To run this, you will need to first install Pyomo and a linear program solver like glpk. glpk should work for reasonable-sized problems, but if you are finding it's taking too long to solve, you could try a (much faster) commercial solver like CPLEX or Gurobi.
You can install Pyomo via pip install pyomo or conda install -c conda-forge pyomo. You can install glpk from https://www.gnu.org/software/glpk/ or via conda install glpk. (I think PuLP comes with a version of glpk built-in, so that might save you a step.)
Here's the script. Note that this calculates absolute error as a linear expression by defining one variable for the positive component of the error and another for the negative part. Then it seeks to minimize the sum of both. In this case, the solver will always set one to zero since that's an easy way to reduce the error, and then the other will be equal to the absolute error.
import random
import pyomo.environ as po

random.seed(1)

# ~50% sparse data set, big enough to populate every row and column
m = 10  # number of rows
n = 10  # number of cols
data = {
    (r, c): random.random()
    for r in range(m)
    for c in range(n)
    if random.random() >= 0.5
}

# define a linear program to find vectors
# x in R^m, y in R^n, such that x[r] + y[c] is close to data[r, c]

# create an optimization model object
model = po.ConcreteModel()

# create indexes for the rows and columns
model.ROWS = po.Set(initialize=range(m))
model.COLS = po.Set(initialize=range(n))

# create indexes for the dataset
model.DATAPOINTS = po.Set(dimen=2, initialize=data.keys())

# data values
model.data = po.Param(model.DATAPOINTS, initialize=data)

# create the x and y vectors
model.X = po.Var(model.ROWS, within=po.NonNegativeReals)
model.Y = po.Var(model.COLS, within=po.NonNegativeReals)

# create dummy variables to represent errors
model.ErrUp = po.Var(model.DATAPOINTS, within=po.NonNegativeReals)
model.ErrDown = po.Var(model.DATAPOINTS, within=po.NonNegativeReals)

# Force the error variables to match the error
def Calculate_Error_rule(model, r, c):
    pred = model.X[r] + model.Y[c]
    err = model.ErrUp[r, c] - model.ErrDown[r, c]
    return (model.data[r, c] + err == pred)
model.Calculate_Error = po.Constraint(
    model.DATAPOINTS, rule=Calculate_Error_rule
)

# Minimize the total error
def ClosestMatch_rule(model):
    return sum(
        model.ErrUp[r, c] + model.ErrDown[r, c]
        for (r, c) in model.DATAPOINTS
    )
model.ClosestMatch = po.Objective(
    rule=ClosestMatch_rule, sense=po.minimize
)

# Solve the model
# get a solver object
opt = po.SolverFactory("glpk")
# solve the model
# turn off "tee" if you want less verbose output
results = opt.solve(model, tee=True)

# show solution status
print(results)

# show verbose description of the model
model.pprint()

# show X and Y values in the solution
for r in model.ROWS:
    print('X[{}]: {}'.format(r, po.value(model.X[r])))
for c in model.COLS:
    print('Y[{}]: {}'.format(c, po.value(model.Y[c])))
Just to complete the story, here's a solution that's closer to your original example. It uses cvxpy, but with the sparse data approach from my solution.
I don't know the "official" way to do elementwise calculations with cvxpy, but it seems to work OK to just use the standard Python sum function with a lot of individual cp.abs(...) calculations.
This gives a solution that is very slightly worse than the linear program, but you may be able to fix that by adjusting the solution tolerance.
import cvxpy as cp
import random

random.seed(1)

# Problem data.
# ~50% sparse data set
m = 10  # number of rows
n = 10  # number of cols
data = {
    (i, j): random.random()
    for i in range(m)
    for j in range(n)
    if random.random() >= 0.5
}

# Construct the problem.
x = cp.Variable(m)
y = cp.Variable(n)
objective = cp.Minimize(
    sum(
        cp.abs(x[i] + y[j] + data[i, j])
        for (i, j) in data.keys()
    )
)
prob = cp.Problem(objective)
result = prob.solve()
print(x.value)
print(y.value)
I did not fully get the idea, so here is just some hacky stuff based on the assumption:
you want some cvxpy equivalent of numpy's broadcasting-rules behaviour on arrays of shape (m, 1) + (1, n)
So numpy-wise:
m = 3
n = 4
np.random.seed(1)
a = np.random.randn(m, 1)
b = np.random.randn(1, n)
a
array([[ 1.62434536],
[-0.61175641],
[-0.52817175]])
b
array([[-1.07296862, 0.86540763, -2.3015387 , 1.74481176]])
a + b
array([[ 0.55137674, 2.48975299, -0.67719333, 3.36915713],
[-1.68472504, 0.25365122, -2.91329511, 1.13305535],
[-1.60114037, 0.33723588, -2.82971045, 1.21664001]])
Let's mimic this with np.kron, which has a cvxpy-equivalent:
aLifted = np.kron(np.ones((1,n)), a)
bLifted = np.kron(np.ones((m,1)), b)
aLifted
array([[ 1.62434536, 1.62434536, 1.62434536, 1.62434536],
[-0.61175641, -0.61175641, -0.61175641, -0.61175641],
[-0.52817175, -0.52817175, -0.52817175, -0.52817175]])
bLifted
array([[-1.07296862, 0.86540763, -2.3015387 , 1.74481176],
[-1.07296862, 0.86540763, -2.3015387 , 1.74481176],
[-1.07296862, 0.86540763, -2.3015387 , 1.74481176]])
aLifted + bLifted
array([[ 0.55137674, 2.48975299, -0.67719333, 3.36915713],
[-1.68472504, 0.25365122, -2.91329511, 1.13305535],
[-1.60114037, 0.33723588, -2.82971045, 1.21664001]])
Let's check cvxpy semi-blindly (we only check dimensions; too lazy to set up a problem and fix variables to check the output :-D):
import cvxpy as cp
x = cp.Variable((m, 1))
y = cp.Variable((1, n))
cp.kron(np.ones((1,n)), x) + cp.kron(np.ones((m, 1)), y)
# Expression(AFFINE, UNKNOWN, (3, 4))
# looks good!
Now some caveats:
I don't know how efficiently cvxpy can reason about this matrix form internally
unclear if it is more efficient than a simple list-comprehension based form using cp.vstack and co (it probably is; see the sketch after this list)
this operation itself kills all sparsity
(if both vectors are dense; your matrix is dense)
cvxpy and more or less all convex-optimization solvers are based on some sparsity assumption
scaling this problem up to machine-learning dimensions will not make you happy
there is probably a much more concise mathematical theory for your problem than using (sparsity-assuming) pretty general convex optimization (DCP, as implemented in cvxpy, is a subset)
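For comparison, a hedged sketch of the list-comprehension-style alternative mentioned above, stacking copies of the variables instead of using kron (my own illustration; I have not benchmarked it against the kron form):
import cvxpy as cp
import numpy as np

m, n = 3, 4
np.random.seed(1)
data = np.random.randn(m, n)

x = cp.Variable((m, 1))
y = cp.Variable((1, n))

# emulate broadcasting by stacking copies of x column-wise and copies of y row-wise
x_bcast = cp.hstack([x] * n)   # shape (m, n)
y_bcast = cp.vstack([y] * m)   # shape (m, n)

objective = cp.Minimize(cp.sum(cp.abs(x_bcast + y_bcast + data)))
prob = cp.Problem(objective)
prob.solve()
print(x.value)
print(y.value)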

Calculate the Euclidean distance for 2 different size arrays [duplicate]

I have two arrays of x-y coordinates, and I would like to find the minimum Euclidean distance between each point in one array with all the points in the other array. The arrays are not necessarily the same size. For example:
xy1 = numpy.array(
    [[ 243, 3173],
     [ 525, 2997]])
xy2 = numpy.array(
    [[ 682, 2644],
     [ 277, 2651],
     [ 396, 2640]])
My current method loops through each coordinate xy in xy1 and calculates the distances between that coordinate and the other coordinates.
mindist = numpy.zeros(len(xy1))
minid = numpy.zeros(len(xy1))
for i, xy in enumerate(xy1):
    dists = numpy.sqrt(numpy.sum((xy - xy2)**2, axis=1))
    mindist[i], minid[i] = dists.min(), dists.argmin()
Is there a way to eliminate the for loop and somehow do element-by-element calculations between the two arrays? I envision generating a distance matrix for which I could find the minimum element in each row or column.
Another way to look at the problem. Say I concatenate xy1 (length m) and xy2 (length p) into xy (length n), and I store the lengths of the original arrays. Theoretically, I should then be able to generate a n x n distance matrix from those coordinates from which I can grab an m x p submatrix. Is there a way to efficiently generate this submatrix?
(Months later)
scipy.spatial.distance.cdist( X, Y ) gives all pairs of distances, for X and Y 2-dim, 3-dim, ...
It also does 22 different norms, detailed here.
# cdist example: (nx,dim) (ny,dim) -> (nx,ny)
import sys
import numpy as np
from scipy.spatial.distance import cdist

#...............................................................................
dim = 10
nx = 1000
ny = 100
metric = "euclidean"
seed = 1

# change these params in sh or ipython: run this.py dim=3 ...
for arg in sys.argv[1:]:
    exec(arg)
np.random.seed(seed)
np.set_printoptions(2, threshold=100, edgeitems=10, suppress=True)
title = "%s dim %d nx %d ny %d metric %s" % (
    __file__, dim, nx, ny, metric)
print("\n", title)

#...............................................................................
X = np.random.uniform(0, 1, size=(nx, dim))
Y = np.random.uniform(0, 1, size=(ny, dim))
dist = cdist(X, Y, metric=metric)  # -> (nx, ny) distances

#...............................................................................
print("scipy.spatial.distance.cdist: X %s Y %s -> %s" % (
    X.shape, Y.shape, dist.shape))
print("dist average %.3g +- %.2g" % (dist.mean(), dist.std()))
print("check: dist[0,3] %.3g == cdist( [X[0]], [Y[3]] ) %.3g" % (
    dist[0, 3], cdist([X[0]], [Y[3]])))

# (trivia: how do pairwise distances between uniform-random points in the unit cube
# depend on the metric ? With the right scaling, not much at all:
# L1 / dim       ~ .33 +- .2/sqrt dim
# L2 / sqrt dim  ~ .4  +- .2/sqrt dim
# Lmax / 2       ~ .4  +- .2/sqrt dim
To compute the m by p matrix of distances, this should work:
>>> def distances(xy1, xy2):
... d0 = numpy.subtract.outer(xy1[:,0], xy2[:,0])
... d1 = numpy.subtract.outer(xy1[:,1], xy2[:,1])
... return numpy.hypot(d0, d1)
The .outer calls make two such matrices (of scalar differences along the two axes), and the .hypot call turns those into a same-shape matrix of Euclidean distances.
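From that m-by-p matrix, the minimum distances and indices asked for in the question follow directly; a small usage sketch (variable names as in the question):
dists = distances(xy1, xy2)      # shape (len(xy1), len(xy2))
mindist = dists.min(axis=1)      # minimum Euclidean distance for each point in xy1
minid = dists.argmin(axis=1)     # index of the closest point in xy2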
The accepted answer does not fully address the question, which requests to find the minimum distance between the two sets of points, not the distance between every point in the two sets.
Although a straightforward solution to the original question indeed consists of computing the distance between every pair and subsequently finding the minimum one, this is not necessary if one is only interested in the minimum distances. A much faster solution exists for the latter problem.
All the proposed solutions have a running time that scales as m*p = len(xy1)*len(xy2). This is OK for small datasets, but an optimal solution can be written that scales as m*log(p), producing huge savings for large xy2 datasets.
This optimal execution time scaling can be achieved using scipy.spatial.KDTree as follows
import numpy as np
from scipy import spatial
xy1 = np.array(
[[243, 3173],
[525, 2997]])
xy2 = np.array(
[[682, 2644],
[277, 2651],
[396, 2640]])
# This solution is optimal when xy2 is very large
tree = spatial.KDTree(xy2)
mindist, minid = tree.query(xy1)
print(mindist)
# This solution by @denis is OK for small xy2
mindist = np.min(spatial.distance.cdist(xy1, xy2), axis=1)
print(mindist)
where mindist is the minimum distance between each point in xy1 and the set of points in xy2
For what you're trying to do:
dists = numpy.sqrt((xy1[:, 0, numpy.newaxis] - xy2[:, 0])**2 + (xy1[:, 1, numpy.newaxis] - xy2[:, 1])**2)
mindist = numpy.min(dists, axis=1)
minid = numpy.argmin(dists, axis=1)
Edit: Instead of calling sqrt, doing squares, etc., you can use numpy.hypot:
dists = numpy.hypot(xy1[:, 0, numpy.newaxis]-xy2[:, 0], xy1[:, 1, numpy.newaxis]-xy2[:, 1])
import numpy as np
# squared-norm expansion: ||u - v||^2 = ||u||^2 + ||v||^2 - 2 u.v
P = np.add.outer(np.sum(xy1**2, axis=1), np.sum(xy2**2, axis=1))  # ||u||^2 + ||v||^2 for every pair
N = np.dot(xy1, xy2.T)                                            # u.v for every pair
dists = np.sqrt(P - 2*N)
I think the following function also works.
import numpy as np
from typing import Optional

def pairwise_dist(X: np.ndarray, Y: Optional[np.ndarray] = None) -> np.ndarray:
    Y = X if Y is None else Y
    xx = (X ** 2).sum(axis=1)[:, None]   # squared norms of the rows of X, as a column
    yy = (Y ** 2).sum(axis=1)[:, None]   # squared norms of the rows of Y, as a column
    return xx + yy.T - 2 * (X @ Y.T)     # ||x_i||^2 + ||y_j||^2 - 2 x_i . y_j
Explanation
Suppose each row of X and Y is the coordinate vector of one point of the two sets.
Let their sizes be m x d and n x d respectively.
The result is a numpy array of size m x n whose (i, j)-th entry is the squared Euclidean distance between the i-th row of X and the j-th row of Y (take the square root for the actual distance).
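A small usage sketch for the original question (taking the square root of clipped values guards against tiny negative round-off in the squared distances); the variable names match the question:
d2 = pairwise_dist(xy1, xy2)                        # squared distances, shape (len(xy1), len(xy2))
mindist = np.sqrt(np.maximum(d2.min(axis=1), 0))    # minimum Euclidean distance per point in xy1
minid = d2.argmin(axis=1)                           # index of the closest point in xy2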
I highly recommend using built-in methods for calculating squares and roots, as they are optimized and much safer against overflow.
The answer by @alex (quoted below) is the safest in terms of overflow and should also be very fast. Also, for single points you can use math.hypot, which now (Python 3.8+) supports more than 2 dimensions.
>>> def distances(xy1, xy2):
... d0 = numpy.subtract.outer(xy1[:,0], xy2[:,0])
... d1 = numpy.subtract.outer(xy1[:,1], xy2[:,1])
... return numpy.hypot(d0, d1)
Safety concerns
import math
import numpy as np

i, j, k = 1e+200, 1e+200, 1e+200
math.hypot(i, j, k)   # np.hypot covers the 2-d case
# 1.7320508075688773e+200
np.sqrt(np.sum(np.array([i, j, k]) ** 2))
# RuntimeWarning: overflow encountered in square
overflow/underflow/speeds
I think that the most straightforward and efficient solution is to do it like this:
distances = np.linalg.norm(xy1[:, None, :] - xy2[None, :, :], axis=-1)  # pairwise Euclidean distances via broadcasting
min_dist = np.min(distances, axis=1)   # minimum distance for each point in xy1
min_id = np.argmin(distances, axis=1)  # index of the closest point in xy2, i.e., the closest match
Although many answers here are great, there is another way which has not been mentioned: using numpy's vectorization / broadcasting properties to compute the distance between each point of two arrays of different lengths (and, if wanted, the closest matches). I publish it here because it can be very handy to master broadcasting, and it also solves this problem elegantly while remaining very efficient.
Assuming you have two arrays like so:
# two arrays of different length, but with the same dimension
a = np.random.randn(6,2)
b = np.random.randn(4,2)
You can't do the operation a-b: numpy complains with operands could not be broadcast together with shapes (6,2) (4,2). The trick to allow broadcasting is to manually add a dimension for numpy to broadcast along to. By leaving the dimension 2 in both reshaped arrays, numpy knows that it must perform the operation over this dimension.
deltas = a.reshape(6, 1, 2) - b.reshape(1, 4, 2)
# contains the squared distance between each pair of points
distance_matrix = (deltas ** 2).sum(axis=2)
The distance_matrix has shape (6, 4): for each point in a, the squared distances to all points in b are computed (the square root is not needed just to find the closest point). Then, if you want the "minimum Euclidean distance between each point in one array with all the points in the other array", you would do:
distance_matrix.argmin(axis=1)
This returns the index of the point in b that is closest to each point of a.

Problems with scipy.optimize using matrix as input, bounds, constraints

I have used Python to perform optimization in the past; however, I am now trying to use a matrix as the input for the objective function as well as set bounds on the individual element values and the sum of the value of each row in the matrix, and I am encountering problems.
Specifically, I would like to pass the objective function ObjFunc three parameters - w, p, ret - and then minimize the value of this function (technically I am trying to maximize the function by minimizing the value of -1*ObjFunc) by adjusting the value of w subject to the bound that all elements of w should fall within the range [0, 1] and the constraint that sum of each row in w should sum to 1.
I have included a simplified piece of example code below to demonstrate the issue I'm encountering. As you can see, I am using the minimize function from scipy.opimize. The problems begin in the first line of objective function x = np.dot(p, w) in which the optimization procedure attempts to flatten the matrix into a one-dimensional vector - a problem that does not occur when the function is called without performing optimization. The bounds = b and constraints = c are both producing errors as well.
I know that I am making an elementary mistake in how I am approaching this optimization and would appreciate any insight that can be offered.
import numpy as np
from scipy.optimize import minimize

def objFunc(w, p, ret):
    x = np.dot(p, w)
    y = np.multiply(x, ret)
    z = np.sum(y, axis=1)
    r = z.mean()
    s = z.std()
    ratio = r/s
    return -1 * ratio

# CREATE MATRICES
# returns, ret, of each of the three assets in the 5 periods
ret = np.matrix([[0.10, 0.05, -0.03], [0.05, 0.05, 0.50], [0.01, 0.05, -0.10], [0.01, 0.05, 0.40], [1.00, 0.05, -0.20]])
# probability, p, of being in each state {X, Y, Z} in each of the 5 periods
p = np.matrix([[0,0.5,0.5], [0,0.6,0.4], [0.2,0.4,0.4], [0.3,0.3,0.4], [1,0,0]])
# initial equal weights, w
w = np.matrix([[0.33333,0.33333,0.33333],[0.33333,0.33333,0.33333],[0.33333,0.33333,0.33333]])

# OPTIMIZATION
b = [(0, 1)]
c = ({'type': 'eq', 'fun': lambda w_: np.sum(w, 1) - 1})
result = minimize(objFunc, w, (p, ret), method = 'SLSQP', bounds = b, constraints = c)
Digging into the code a bit. minimize calls optimize._minimize._minimize_slsqp. One of the first things it does is:
x = asfarray(x0).flatten()
So you need to design your objFunc to work with the flattened version of w. It may be enough to reshape it at the start of that function; a sketch of that rework is given below.
I read the code from an IPython session, but you can also find it in your scipy directory:
/usr/local/lib/python3.5/dist-packages/scipy/optimize/_minimize.py
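Following that advice, here is a hedged sketch of how the example above could be reworked: reshape inside the objective, flatten the initial guess, give one bound per flattened element, and make the row-sum constraint operate on the flattened vector (np.array is used instead of np.matrix). This is only an illustration of the reshape idea, not a vetted solution to the asker's full problem.
import numpy as np
from scipy.optimize import minimize

def objFunc(w_flat, p, ret):
    w = w_flat.reshape(3, 3)          # SLSQP passes a flat vector; restore the 3x3 weight matrix
    x = p @ w
    y = np.multiply(x, ret)
    z = np.sum(y, axis=1)
    return -1 * z.mean() / z.std()

ret = np.array([[0.10, 0.05, -0.03], [0.05, 0.05, 0.50], [0.01, 0.05, -0.10],
                [0.01, 0.05, 0.40], [1.00, 0.05, -0.20]])
p = np.array([[0, 0.5, 0.5], [0, 0.6, 0.4], [0.2, 0.4, 0.4], [0.3, 0.3, 0.4], [1, 0, 0]])
w0 = np.full(9, 1/3)                  # flattened initial guess

b = [(0, 1)] * 9                      # one bound per flattened element
c = {'type': 'eq', 'fun': lambda w_: w_.reshape(3, 3).sum(axis=1) - 1}   # each row sums to 1
result = minimize(objFunc, w0, args=(p, ret), method='SLSQP', bounds=b, constraints=c)
print(result.x.reshape(3, 3))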
