I have the following test program. My question is twofold: (1) somehow the solution comes out as zero, and (2) is it appropriate to use a condition like x2 = np.where(x > y, 1, x) on the variables? Are there any constrained optimization routines in SciPy?
import numpy as np
from scipy import optimize

a = 13.235
b = 70.678

def system(X, a, b):
    x = X[0]
    y = X[1]
    x2 = np.where(x > y, 1, x)
    f = np.zeros(3)
    f[0] = 2*x2 - y - a
    f[1] = 3*x2 + 2*y - b
    return (X)

func = lambda X: system(X, a, b)
guess = [5, 5]
sol = optimize.root(func, guess)
print(sol)
Edit: (2a) With the x2 = np.where(x > y, 1, x) condition, the two equations collapse into one equation.
(2b) In another variation the requirement is x2 = np.where(x > y, x**2, x**3). Please comment on these two cases as well. Thanks!
First up, your system function is an identity, since you return X instead of return f. The return should have the same shape as X, so you should have
f = np.array([2*x2 - y - a, 3*x2 + 2*y - b])
Next, the function as written has a discontinuity where x = y, and this causes a problem for the initial guess of (5, 5). Setting the initial guess to (5, 6) allows the solution [13.87828571, 14.52157143] to be found rapidly.
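For instance, a minimal sketch of the corrected first system (keeping the original np.where(x > y, 1, x) condition and using the (5, 6) guess) might look like this:

import numpy as np
from scipy import optimize

a, b = 13.235, 70.678

def system(X):
    x, y = X
    x2 = np.where(x > y, 1, x)          # piecewise switch from the question
    return np.array([2*x2 - y - a, 3*x2 + 2*y - b])

sol = optimize.root(system, [5, 6])     # start away from the x == y discontinuity
print(sol.x)                            # approximately [13.87828571, 14.52157143]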
With the second example, again using an initial guess of [5, 5] causes problems with the discontinuity; using [5, 6] gives a solution of [2.40313743, 14.52157143].
Here is my code:
import numpy as np
from scipy import optimize
def system(X, a=13.235, b=70.678):
    x = np.where(X[0] > X[1], X[0]**2, X[0]**3)
    y = X[1]
    return np.array([2*x - y - a, 3*x + 2*y - b])
guess = [5,6]
sol = optimize.root(system, guess)
print(sol)
I am trying to fit 3-dimensional data (that is, 2 independent and 1 dependent variable) using multivariate fitting with scipy curve_fit. I wish to do piecewise fitting for the same problem. I have tried to proceed on that basis without any success. The problem is defined below:
import numpy as np
from scipy.optimize import curve_fit
#..........................................................................................................
def F0(X, a, b, c, c0, y0):
    x, y = X
    value = []
    for i in range(0, len(x)):
        if y[i] < y0:
            lnZ = x[i] + c0*y[i]
        else:
            lnZ = x[i] + c*y[i]
        val = a + (b*lnZ)
        value.append(val)
    return value
#..........................................................................................................
def F1(X, a, b, c):
    x, y = X
    lnZ = x + c*y
    value = a + (b*lnZ)
    return value
#..........................................................................................................
x = [-2.302585093]*7 + [0]*7 + [2.302585093]*7
y = [7.55E-04, 7.85E-04, 8.17E-04, 8.52E-04, 8.90E-04, 9.32E-04, 9.77E-04] * 3
z = [4.077424497, 4.358253892, 4.610475878, 4.881769469, 5.153063061, 5.323277142, 5.462023074,
     4.610475878, 4.840765517, 5.04864602, 5.235070966, 5.351407761, 5.440090728, 5.540693448,
     4.960439843, 5.118257381, 5.266539115, 5.370479367, 5.440090728, 5.528296904, 5.5816974]
popt, pcov = curve_fit(F0, (x, y), z, method = 'lm')
print(popt)
popt, pcov = curve_fit(F1, (x, y), z, method = 'lm')
print(popt)
The output is:
[1.34957781e+00 1.05456428e-01 1.00000000e+00 4.14879613e+04
1.00000000e+00]
[1.34957771e+00 1.05456434e-01 4.14879603e+04]
You can see that the values of the parameters in the piecewise fit remain at their initial values. I know I am not doing it the correct way. Please correct me.
The main source of the problem is the insensitivity of this approach to the value of the variable that defines the switch from one function to another (see this response for a similar explanation). Moreover, the choice of starting parameters isn't good.
Since no starting values are provided, curve_fit chooses a value of 1 for all the fitting parameters (see here the default value of p0). Since the fitting algorithm works by making small variations on the parameters, y0 is varied in small steps around 1, which produces no change in the output of the function (all y values are much smaller than 1). Because y[i] < y0 is always True, only the first branch is ever evaluated, and the output of the function does not depend on the value of c. That explains why y0 and c stay at their initial values.
One might expect that setting the initial value of y0 inside the range of values being evaluated (i.e. around 8E-4) would solve the problem. Indeed, since the second branch is then evaluated, the value of c is now optimized. Nevertheless, the value of y0 will stay unchanged. Since the fitting algorithm only tests very small changes to the parameters, those changes are never large enough to move y0 from the interval between two experimental y values into another one. In this particular case, if one chooses 8E-4, the small variations will never be enough to push it above 8.17E-4 or below 7.85E-4, the values bracketing the initial choice of y0.
One can usually circumvent this problem by making the function depend explicitly on the value of y0. A smart choice is to redefine the function so that the value at y0 is the same no matter which branch is taken (i.e. to ensure the function is continuous). The original definition does not ensure this. A reasonable change would be:
def F2(X, a, b, c, c0, y0):
    x, y = X
    value = []
    for i in range(0, len(x)):
        lnZ = x[i] + c0 * y[i]
        if y[i] >= y0:
            lnZ += c * (y[i] - y0)
        val = a + (b*lnZ)
        value.append(val)
    return value
which changes the meaning of the parameter c and limits the results to continuous functions only. In this case, the value of y0 is indeed the turning point of the function. Nevertheless, it yields the desired results:
popt2, pcov = curve_fit(F2, (x, y), z, p0=(1, 1, 1E4, 1E4, 9.1E-4), method = 'lm')
print(popt2)
results in:
[-1.93417968e-01 1.05456433e-01 -3.65740192e+04 5.97890809e+04
8.64354057e-04]
A better (pythonic) definition for the function avoids the for loop:
def F3(X, a, b, c, c0, y0):
    x, y = X
    lnZ = x + c0 * y
    idx = np.where(y >= y0)
    lnZ[idx] += c * (y[idx] - y0)
    rv = a + (b * lnZ)
    return rv
which will probably be much faster for larger datasets.
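For example, assuming the same data and the same starting values used above for F2, F3 can be used as a drop-in replacement, and popt3 should closely match popt2:

popt3, pcov3 = curve_fit(F3, (np.array(x), np.array(y)), z, p0=(1, 1, 1E4, 1E4, 9.1E-4), method='lm')
print(popt3)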
Please consider the following optimisation problem (stated in the code below as problem_1), where x and b are (1, n) vectors, C is an (n, n) symmetric matrix, k is an arbitrary constant and i is a (1, n) vector of ones.
Please also consider the equivalent optimisation problem (problem_2 below). In that case, k is determined during the optimisation process, so there is no need to rescale the values in x to obtain the solution y.
Please also consider the following code for solving both problems with cvxpy.
import cvxpy as cp
import numpy as np
def problem_1(C):
    n, t = np.shape(C)
    x = cp.Variable(n)
    b = np.array([1 / n] * n)
    obj = cp.quad_form(x, C)
    constraints = [b.T @ cp.log(x) >= 0.5, x >= 0]
    cp.Problem(cp.Minimize(obj), constraints).solve()
    return (x.value / (np.ones(n).T @ x.value))

def problem_2(C):
    n, t = np.shape(C)
    y = cp.Variable(n)
    k = cp.Variable()
    b = np.array([1 / n] * n)
    obj = cp.quad_form(y, C)
    constraints = [b.T @ cp.log(y) >= k, np.ones(n) @ y.T == 1, y >= 0]
    cp.Problem(cp.Minimize(obj), constraints).solve()
    return y.value
While the first function does provide the correct solution for the sample data I am using, the second does not. Specifically, the values in y differ heavily when using the second function, with some of them being equal to zero (which cannot be, since all values in b are positive). I am wondering whether the second function also minimises k. Its value should not be minimised; on the contrary, it should simply be determined during the optimisation as the value that leads to the solution minimising the objective function.
UPDATE_1
I just found that the solution I obtain with the second formulation of the problem is equal to the one derived with the following function. It appears that the constraint with the logarithmic barrier and the k variable is ignored.
def problem_3(C):
    n, t = np.shape(C)
    y = cp.Variable(n)
    k = cp.Variable()
    b = np.array([1 / n] * n)
    obj = cp.quad_form(y, C)
    constraints = [np.ones(n) @ y.T == 1, y >= 0]
    cp.Problem(cp.Minimize(obj), constraints).solve()
    return y.value
UPDATE_2
Here is a link to a sample input C: https://www.dropbox.com/s/kaa7voufzk5k9qt/matrix_.csv?dl=0. In this case the correct output for both problem_1 and problem_2 is approximately [0.0659 0.068 0.0371 0.1188 0.1647 0.3387 0.1315 0.0311 0.0441], since the two problems are equivalent by definition. I am able to obtain the correct output only by solving problem_1. Solving problem_2 leads to [0.0227 0. 0. 0.3095 0.3392 0.3286 0. 0. 0.], which is wrong since it happens to be the correct output for problem_3.
UPDATE_3
To be clear, by definition the solution of problem_2 equals the solution of problem_3 when the parameter k goes to minus infinity.
UPDATE_4
Please consider the following code for solving problem_1 using SciPy Optimize instead of CVXPY. By imposing k = 9 the correct optimal solution can still be achieved, which is consistent with problem_1 being independent of that parameter.
import scipy.optimize as opt

def obj(x, C):
    return x.T @ C @ x

def problem_1_1(C):
    n, t = np.shape(C)
    b = np.array([1 / n] * n)
    constraints = [{"type": "eq", "fun": lambda x: (b * np.log(x)).sum() - 9}]
    res = opt.minimize(
        obj,
        x0 = np.array([1 / n] * n),
        args = (C),
        bounds = ((0, None),) * n,
        constraints = constraints
    )
    return (res['x'] / (np.ones(n).T @ res['x']))
UPDATE_5
Considering the code in UPDATE_4, whenever k is set equal to 10 the correct solution is still achieved, but the following warning appears. I suppose it is due to a rounding error that might occur during the optimisation process.
Untitled.py:56: RuntimeWarning: divide by zero encountered in log
  {"type": "eq", "fun": lambda x: (b * np.log(x)).sum() - 10}
I am wondering if there is a way to impose a strict inequality constraint with CVXPY, or to apply a condition on the logarithm argument. Please consider the following modified code for problem_1_1.
import scipy.optimize as opt

def obj(x, C):
    return x.T @ C @ x

def problem_1_1(C):
    n, t = np.shape(C)
    b = np.array([1 / n] * n)
    constraints = [{"type": "eq", "fun": lambda x: (b * np.log(x if x.all() > 0 else 1e-100)).sum() - 10}]
    res = opt.minimize(
        obj,
        x0 = np.array([1 / n] * n),
        args = (C),
        bounds = ((0, None),) * n,
        constraints = constraints
    )
    return (res['x'] / (np.ones(n).T @ res['x']))
UPDATE_6
To be thorough, the correct value of the optimal k is approximately -2.4827186402337564.
If you let k be arbitrary then you are basically saying that b.T @ cp.log(y) is greater than or equal to some arbitrary number, which is trivially true, so the constraint becomes irrelevant.
I believe you should either fix the value of k, or turn this into a minimax problem by determining a tradeoff between maximizing k and minimizing the quadratic objective.
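As a rough sketch of the first suggestion (fixing k instead of declaring it as a variable), problem_2 could be rewritten with k passed in as a constant. The function name problem_2_fixed_k is only illustrative, and the default value below is the optimal k reported in UPDATE_6:

import cvxpy as cp
import numpy as np

def problem_2_fixed_k(C, k=-2.4827186402337564):
    # Same as problem_2, but k is a fixed constant rather than a cp.Variable,
    # so the log-barrier constraint can no longer be satisfied trivially.
    n, t = np.shape(C)
    y = cp.Variable(n)
    b = np.array([1 / n] * n)
    obj = cp.quad_form(y, C)
    constraints = [b.T @ cp.log(y) >= k, np.ones(n) @ y.T == 1, y >= 0]
    cp.Problem(cp.Minimize(obj), constraints).solve()
    return y.value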
My first py file contains the function whose roots I want to find, like this:
def myfun(unknowns, a, b):
    x = unknowns[0]
    y = unknowns[1]
    eq1 = a*y + b
    eq2 = x**b
    z = x*y + y/x
    return eq1, eq2
And my second one finds the values of x and y from a starting point, given the parameter values of a and b:
import scipy.optimize

a = 3
b = 2
x0 = 1
y0 = 1
x, y = scipy.optimize.fsolve(myfun, (x0, y0), args=(a, b))
My question is: I actually need the value of z after plugging in the found x and y, and I don't want to repeat z = x*y + y/x + ... again, since in my real case it is an intermediate variable without an explicit expression.
However, I cannot simply replace the last line of the function with return eq1, eq2, z, since fsolve only finds the roots of eq1 and eq2.
The only solution I have now is to rewrite the function so that it returns z, and then plug in x and y to get z.
Is there a good solution to this problem?
I believe that's the wrong approach. Since you have z as a direct function of x and y, what you need is to retrieve those two values. In the listed case, it's easy enough: given b you can derive x as the inverse of eq2; likewise, given a, you can invert eq1 to get y.
For clarity, I'm changing the names of your return variables:
ret1, ret2 = scipy.optimize.fsolve(myfun, (x0,y0), args= (a,b))
Now, invert the two functions:
# eq2 = x**b
x = ret2**(1/b)
# eq1 = a*y+b
y = (ret1 - b) / a
... and finally ...
z = x*y + y/x
Note that you should remove the z computation from your function, as it serves no purpose.
Definition of the problem
I am trying to calculate the points of intersection of geometrical objects, such as two planes and a sphere, in python.
Let's consider for example these three objects: the planes x = 0 and z = 0, and the unit sphere x**2 + y**2 + z**2 = 1.
This system gives two solutions: (0, 1, 0) and (0, -1, 0).
I would like to know if there is a python library that can help develop a solver to calculate these intersections. I am looking for something that works like Wolfram Alpha, where we can input three equations and it returns all the possible solutions (restricted, for simplicity, to cases where there is a finite number of solutions).
What I tried
I tried with SymPy, but it returns []:
from sympy.solvers import solve
from sympy import Symbol
x = Symbol('x')
y = Symbol('y')
z = Symbol('z')
solve(z, x, x**2 + y**2 + z**2 -1)
I then tried with scipy:
import numpy as np
from scipy.optimize import fsolve

def f(x):
    y = np.zeros(3)
    y[2] = x[2]
    y[0] = x[0]
    y[1] = x[0]**2 + x[1]**2 + x[2]**2 - 1
    return y

x0 = np.array([10, 10, 10])
solution = fsolve(f, x0)
print(solution[0], solution[1], solution[2])
but it only returns one of the two solutions:
6.79746218330325e-28 1.0000000000000002 -2.3528179942097343e-35
I also tried with gekko, and still it only returns one possible solution (which depends on the initial guess):
from gekko import GEKKO
m = GEKKO()
x = m.Var(value = 1)
y = m.Var(value = 1)
z = m.Var(value = 1)
m.Equation(x == 0)
m.Equation(z == 0)
m.Equation(x**2 + y**2+z**2 ==1)
m.solve()
fsolve from scipy, and all other functions that I personally know of that will accept any form of input function, will return one value.
One workaround if you have an idea where the other solution is would be to give an x0 value that is closer to the second solution with a second call to fsolve (see https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fsolve.html).
If you alternatively know what range you want to find solutions in, the easiest way is to build an array of function values over that range and check where the values change sign (this would be doing it from scratch).
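As a hedged one-dimensional illustration of that idea (the function and range below are purely hypothetical), one can evaluate the function on a grid, locate the sign changes, and refine each bracket with a standard root finder:

import numpy as np
from scipy.optimize import brentq

f = lambda t: t**2 - 1                               # toy function with two roots
ts = np.linspace(-2, 2, 400)                         # grid over the range of interest
vals = f(ts)
brackets = np.where(np.diff(np.sign(vals)) != 0)[0]  # indices where the sign flips
roots = [brentq(f, ts[i], ts[i + 1]) for i in brackets]
print(roots)                                         # approximately [-1.0, 1.0]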
I found the solution with sympy. Apparently it is one of the only (if not the only) libraries that allows finding analytical solutions, and it returns more than just one solution. Also, we don't need to pass initial guesses. In my question, there was an error in the example I posted with sympy. This is how I solved the system:
from sympy.solvers import solve
import sympy as sp

x = sp.Symbol('x')
y = sp.Symbol('y')
z = sp.Symbol('z')
sp.solve([z, x, (x**2 + y**2 + z**2) - 1], x, y, z)
Result: [0,-1,0], [0,1,0]
I have written code to plot the average squared error of a linear function over a given dataset, to visualise progress during gradient descent training for the optimum regression line.
The relevant bits are these:
def compute_error(f, X, Y):
    e = lambda x, y: (y - f(x))**2
    return sum(e(x, y) for (x, y) in zip(X, Y)) / len(X)

mn, bn, density = abs(target_slope)*1.5, abs(target_intercept)*1.5, 20
M, B = map(list, zip(*[(m, b) for m in np.linspace(-mn, +mn, density)
                              for b in np.linspace(-bn, +bn, density)]))
E = [compute_error(lambda x: m*x + b, X, Y) for m, b in zip(M, B)]
This works, but is very messy. I suspect there might be a very succinct way to pull off the same thing with numpy. So far I have gotten this:
M, B = map(np.ndarray.flatten, np.mgrid[-mn:+mn:1/density, -bn:+bn:1/density])
I still don't know how to improve the instantiation of E, and for some reason right now it is a lot slower than the messy version.
So, what would be a good way to map over a plane like M×B with numpy?
If you want to run the above code you can build X and Y like so:
import numpy as np
from numpy.random import normal
target_slope = 3
target_intercept = 15
def generate_random_data(slope=1, minx=0, maxx=100, n=200, intercept=0):
    f = lambda x: normal(slope*x, maxx/5) + intercept
    X = np.linspace(minx, maxx, n)
    Y = [f(x) for x in X]
    return X, Y
X, Y = generate_random_data(slope=target_slope, intercept=target_intercept)
def compute_error(f, X, Y):
    return np.mean((Y - f(X))**2)
MB = np.mgrid[-mn:+mn:2*mn/density, -bn:+bn:2*bn/density]
MB = MB.reshape((2, -1)).T
E = [compute_error(lambda x : m*x+b, X, Y) for m, b in MB]
It is possible to write a full numpy solution:
Y = np.array(Y)
M, B = np.mgrid[-mn:+mn:2*mn/density, -bn:+bn:2*bn/density]
mx = M.reshape((-1,1))*X
b = B.reshape((-1,1))*np.ones_like(X)
E = np.mean( (mx+b - Y)**2, axis=1 )
It may also be possible to write a solution without the need to flatten the arrays, obtaining the error directly as a 2D array.
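For instance, a broadcasting-based sketch of that idea (keeping the error surface as a 2D grid, and assuming the same mn, bn, density, X and Y as above) could be:

Y = np.asarray(Y)
M, B = np.mgrid[-mn:+mn:2*mn/density, -bn:+bn:2*bn/density]
# E2d[i, j] is the mean squared error for slope M[i, j] and intercept B[i, j]
E2d = np.mean((M[..., None]*X + B[..., None] - Y)**2, axis=-1)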
I don't fully follow what you're trying to achieve here. However, this may help get you started with a numpy solution:
X, Y = generate_random_data(slope=target_slope, intercept=target_intercept, n=180)
M, B = np.mgrid[-mn:+mn:1/density, -bn:+bn:1/density]
f = M.T*X + B.T
error = np.sum((f-Y)**2)
Note that I've had to alter the default number of X, Y values.