Global minimization of a multivariable function with scipy.optimize.brute - python

I'm trying to minimize the following function (the expression implemented in the code below):
F(H, α) = Σ_{i=1}^{n} 3γ·H_i·(R + Σ_{j=1}^{i} H_j·tan(α_j))²
with respect to the parameters H and alpha using the brute force method, specifically the scipy.optimize.brute algorithm. The problem is that I don't know how to deal with this unknown number of variables; I mean, there are 2n variables and n is an input of the program.
I have the following code, where I'd like the minimization to return arrays for the H and alpha values:
import numpy as np

# Entries:
gamma = 17.0
C = 70.0
T = 1
R = 0.5
n = int(2)

def F(mins, gamma, C, T, R):
    H, alpha = mins
    ret = 0
    for i in range(n):
        inner_sum = 0
        for j in range(i+1):
            inner_sum += H[j]*np.tan(alpha[j])
        ret += 3*gamma*H[i]*(R+inner_sum)**2
    return ret
So I can get the values of H and alpha from their positions in the array. I'm used to multivariable minimization with brute force, but only with a fixed number of variables. In this case, how can I proceed?
P.S.: I know that the minimization of the above expression will lead to 0 for both variables. This is just a small piece of a bigger expression, used here to illustrate the problem, for which a working algorithm would be very helpful. Thanks in advance!
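A minimal sketch of one way to proceed: treat the optimization variable as a single flat vector of length 2n, split it into H and alpha inside the objective, and pass one (low, high) range per entry to brute. The bounds below are placeholders, and note that the grid grows as Ns**(2n).
import numpy as np
from scipy.optimize import brute

gamma, C, T, R = 17.0, 70.0, 1, 0.5
n = 2

def F(params, gamma, C, T, R):
    H, alpha = params[:n], params[n:]  # first n entries are H, last n are alpha
    ret = 0.0
    for i in range(n):
        inner_sum = 0.0
        for j in range(i + 1):
            inner_sum += H[j] * np.tan(alpha[j])
        ret += 3 * gamma * H[i] * (R + inner_sum)**2
    return ret

# One (low, high) pair per variable: n ranges for H, then n for alpha.
# These bounds are placeholders and should be adapted to the physical problem.
ranges = tuple((0.0, 10.0) for _ in range(n)) + tuple((0.0, np.pi/4) for _ in range(n))

x_opt, f_opt, grid, Jout = brute(F, ranges, args=(gamma, C, T, R), Ns=11, full_output=True)
H_opt, alpha_opt = x_opt[:n], x_opt[n:]
print(H_opt, alpha_opt, f_opt)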

Related

How to define the objective function for integer optimization task?

I need to find the k in the range [1, 10] that is the least positive integer such that binomial(k, 2) ≥ m, where m ≥ 3 is an integer. The binomial() function is the binomial coefficient.
My attempt is:
After some algebraic steps, I arrived at the minimization task: minimize k such that k(k-1) - 2m ≥ 0, s.t. m ≥ 3. I have defined the objective function and gradient. In the objective function I fixed m = 3, and my problem is how to define an integer domain for the variable k.
from numpy import arange
from scipy.optimize import line_search

# objective function
def objective(k):
    m = 3
    return k*(k-1) - 2*m

# gradient for the objective function
def gradient(k):
    return 2.0*k - 1

# define range
r_min, r_max = 1, 11
# prepare inputs
inputs = arange(r_min, r_max, 1)
# compute targets
targets = [objective(k) for k in inputs]
# define the starting point
point = 1.0
# define the direction to move
direction = 1.0
# print the initial conditions
print('start=%.1f, direction=%.1f' % (point, direction))
# perform the line search
result = line_search(objective, gradient, point, direction)
print(result)
Running this, I get the following warning:
LineSearchWarning: The line search algorithm did not converge
Question. How to define the objective function in Python?
You are looking to find the minimal k such that k(k-1) - 2m ≥ 0, with additional constraints on k to which we'll come back later. You can solve this inequality explicitly by solving the corresponding equation first, that is, finding the roots of P := X² - X - 2m. The quadratic formula gives the roots (1 ± √(1+8m))/2. Since P(x) → ∞ as x → ±∞, you know that the x satisfying your inequality are the ones above the greater root or below the smaller root. Since you are only interested in positive solutions, and since 1 - √(1+8m) < 0, the set of wanted solutions is [(1+√(1+8m))/2, ∞). Among these solutions, the smallest integer is the ceiling of (1+√(1+8m))/2, which is strictly greater than 1. Let k = ceil((1+sqrt(1+8*m))/2) be that integer. If k ≤ 10, then your problem has a solution, which is k. Otherwise, your problem has no solution. In Python, you get the following:
import math

def solve(m):
    k = math.ceil((1 + math.sqrt(1 + 8*m)) / 2)
    return k if k <= 10 else None
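For example, a few sample checks of this solve function:
print(solve(3))   # 3, since binomial(3, 2) = 3 >= 3
print(solve(45))  # 10, since binomial(10, 2) = 45 >= 45
print(solve(46))  # None, since binomial(10, 2) = 45 < 46, so no k in [1, 10] works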

Performance issue with Scipy's solve_bvp and coupled differential equations

I'm facing a problem while trying to implement the coupled differential equation below (also known as single-mode coupling equation) in Python 3.8.3. As for the solver, I am using Scipy's function scipy.integrate.solve_bvp, whose documentation can be read here. I want to solve the equations in the complex domain, for different values of the propagation axis (z) and different values of beta (beta_analysis).
The problem is that it is extremely slow (not manageable) compared with an equivalent implementation in Matlab using the functions bvp4c, bvpinit and bvpset. Evaluating the first few iterations of both executions, they return the same result, except for the resulting mesh which is a lot greater in the case of Scipy. The mesh sometimes even saturates to the maximum value.
The equation to be solved is shown here below, along with the boundary conditions function.
import h5py
import numpy as np
from scipy import integrate
def coupling_equation(z_mesh, a):
    ka_z = k  # Global
    z_a = z   # Global
    a_p = np.empty_like(a).astype(complex)
    for idx, z_i in enumerate(z_mesh):
        beta_zf_i = np.interp(z_i, z_a, beta_zf)  # Get beta at the desired point of the mesh
        ka_z_i = np.interp(z_i, z_a, ka_z)  # Get ka at the desired point of the mesh
        coupling_matrix = np.empty((2, 2), complex)
        coupling_matrix[0] = [-1j * beta_zf_i, ka_z_i]
        coupling_matrix[1] = [ka_z_i, 1j * beta_zf_i]
        a_p[:, idx] = np.matmul(coupling_matrix, a[:, idx])  # Solve the coupling matrix
    return a_p

def boundary_conditions(a_a, a_b):
    return np.hstack(((a_a[0]-1), a_b[1]))
Moreover, I couldn't find a way to pass k, z and beta_zf as arguments of the function coupling_equation, given that the fun argument of solve_bvp must be a callable with the parameters (x, y). My approach is to define some global variables, but I would appreciate any help on this too if there is a better solution.
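As far as I can tell, solve_bvp (unlike solve_ivp) has no args parameter, so one alternative to globals is to capture the extra data in a closure (or with functools.partial). A minimal sketch along those lines, keeping the same loop-based body:
import numpy as np
from scipy import integrate

def make_coupling_equation(k, z, beta_zf):
    # Returns a function with the (x, y) signature expected by solve_bvp,
    # with k, z and beta_zf captured in the closure instead of module globals.
    def fun(z_mesh, a):
        a_p = np.empty_like(a, dtype=complex)
        for idx, z_i in enumerate(z_mesh):
            beta_zf_i = np.interp(z_i, z, beta_zf)
            ka_z_i = np.interp(z_i, z, k)
            coupling_matrix = np.array([[-1j * beta_zf_i, ka_z_i],
                                        [ka_z_i, 1j * beta_zf_i]])
            a_p[:, idx] = coupling_matrix @ a[:, idx]
        return a_p
    return fun

# Inside analysis(), instead of setting the global beta_zf:
# a = integrate.solve_bvp(fun=make_coupling_equation(k, z, beta * np.ones(len(z))),
#                         bc=boundary_conditions, x=mesh, y=a_init,
#                         max_nodes=max_mesh, verbose=1)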
The analysis function which I am trying to code is:
def analysis(k, z, beta_analysis, max_mesh):
    s11_analysis = np.empty_like(beta_analysis, dtype=complex)
    s21_analysis = np.empty_like(beta_analysis, dtype=complex)
    initial_mesh = np.linspace(z[0], z[-1], 10)  # Initial mesh of 10 samples along L
    mesh = initial_mesh
    # a_init must be complex in order to solve the problem in a complex domain
    a_init = np.vstack((np.ones(np.size(initial_mesh)).astype(complex),
                        np.zeros(np.size(initial_mesh)).astype(complex)))
    for idx, beta in enumerate(beta_analysis):
        print(f"Iteration {idx}: beta_analysis = {beta}")
        global beta_zf
        beta_zf = beta * np.ones(len(z))  # Global variable so as to use it in coupling_equation(x, y)
        a = integrate.solve_bvp(fun=coupling_equation,
                                bc=boundary_conditions,
                                x=mesh,
                                y=a_init,
                                max_nodes=max_mesh,
                                verbose=1)
        # mesh = a.x  # Mesh for the next iteration
        # a_init = a.y  # Initial guess for the next iteration, corresponding to the current solution
        s11_analysis[idx] = a.y[1][0]
        s21_analysis[idx] = a.y[0][-1]
    return s11_analysis, s21_analysis
I suspect that the problem has something to do with the initial guess that is being passed between the different iterations (see the commented lines inside the loop in the analysis function). I tried to set the solution of one iteration as the initial guess for the following one (which should reduce the time needed by the solver), but it is even slower, which I don't understand. Maybe I missed something, because it is my first time trying to solve differential equations.
The parameters used for the execution are the following:
f2 = h5py.File(r'path/to/file', 'r')
k = np.array(f2['k']).squeeze()
z = np.array(f2['z']).squeeze()
f2.close()

analysis_points = 501
max_mesh = 1e6

beta_0 = 3e2
beta_low = 0        # Lower value of the frequency for the analysis
beta_up = beta_0    # Upper value of the frequency for the analysis
beta_analysis = np.linspace(beta_low, beta_up, analysis_points)

s11_analysis, s21_analysis = analysis(k, z, beta_analysis, max_mesh)
Any ideas on how to improve the performance of these functions? Thank you all in advance, and sorry if the question is not well-formulated, I accept any suggestions about this.
Edit: Added some information about performance and sizing of the problem.
In practice, I can't find a relation that determines the number of times coupling_equation is called. It must be a matter of the internal operation of the solver. I checked the number of calls in one iteration by printing a line, and it happened on 133 occasions (this was one of the fastest). This must be multiplied by the number of iterations over beta. For the analyzed one, the solver returned this:
Solved in 11 iterations, number of nodes 529.
Maximum relative residual: 9.99e-04
Maximum boundary residual: 0.00e+00
The shapes of a and z_mesh are correlated, since z_mesh is a vector whose length corresponds with the size of the mesh, recalculated by the solver each time it calls coupling_equation. Given that a contains the amplitudes of the progressive and regressive waves at each point of z_mesh, the shape of a is (2, len(z_mesh)).
In terms of computation times, I only managed to complete 19 iterations in about 2 hours with Python. In this case, the initial iterations were faster, but they start to take more time as their mesh grows, until the point where the mesh saturates at the maximum allowed value. I think this is because of the value of the input coupling coefficients at that point, because it also happens when no loop over beta_analysis is executed (just the solve_bvp function for the intermediate value of beta). Instead, Matlab managed to return a solution for the entire problem in just 6 minutes, approximately. If I pass the result of the last iteration as the initial guess (commented lines in the analysis function), the mesh overflows even faster and it is impossible to get more than a couple of iterations.
Based on semi-random inputs, we can see that max_mesh is sometimes reached. This means that coupling_equation can be called with a quite big z_mesh and a arrays. The problem is that coupling_equation contains a slow pure-Python loop iterating on each column of the arrays. You can speed the computation up a lot using Numpy vectorization. Here is an implementation:
def coupling_equation_fast(z_mesh, a):
    ka_z = k  # Global
    z_a = z   # Global
    a_p = np.empty(a.shape, dtype=np.complex128)
    beta_zf_i = np.interp(z_mesh, z_a, beta_zf)  # Get beta at the desired point of the mesh
    ka_z_i = np.interp(z_mesh, z_a, ka_z)  # Get ka at the desired point of the mesh
    # Fast manual matrix multiplication
    a_p[0] = (-1j * beta_zf_i) * a[0] + ka_z_i * a[1]
    a_p[1] = ka_z_i * a[0] + (1j * beta_zf_i) * a[1]
    return a_p
This code provides a similar output with semi-random inputs compared to the original implementation but is roughly 20 times faster on my machine.
Furthermore, I do not know if max_mesh happens to be big with your inputs too and even if this is normal/intended. It may make sense to decrease the value of max_mesh in order to reduce the execution time even more.
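If useful, the vectorized version should work as a drop-in replacement inside the analysis loop above (it still relies on the same global variables):
a = integrate.solve_bvp(fun=coupling_equation_fast, bc=boundary_conditions, x=mesh, y=a_init, max_nodes=max_mesh, verbose=1)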

Precision for Python root function

I'm trying to approximate Julia sets using roots of polynomials in Python. In particular I want to find the roots of the nth iterate of the polynomial q(z) = z^2-0.5. In other words I want to find the roots of q(q(q(..))) composed n times. In order to get many sample points I would need to compute the roots of polynomials of degree >1000.
I've tried solving this problem using both the built-in polynomial class of numpy, which has a roots function, and the function solver of sympy. In the first case precision is lost when I choose degrees larger than 100. The sympy computation simply takes too long. Here is my code:
import numpy as np
import matplotlib.pyplot as plt
from numpy.polynomial import Polynomial as P

p = P([-0.5, 0, 1])
for k in range(9):
    p = p**2 - 0.5
roots = p.roots()

plt.plot([np.real(r) for r in roots], [np.imag(r) for r in roots], 'x')
plt.show()

abs_vector = [np.abs(p(r)) for r in roots]
max_abs = 0
for a in abs_vector:
    if a > max_abs:
        max_abs = a
print(max_abs)
The max value above gives the largest value of p at a supposed root. However running this code gives me 7.881370400084486e+296 which is very large.
How would one go about computing roots of high degree polynomials with good accuracy in a short amount of time?
For the n-times composition of a polynomial q you can reconstruct the roots iteratively: z is a root of the n-fold composition exactly when q(z) is a root of the (n-1)-fold composition, so each root set is the set of preimages under q of the previous one, starting from {0}.
import numpy as np

q = [1, 0, -0.5]
n = 9

def q_preimage(w):
    c = q.copy()
    c[-1] -= w
    return np.roots(c)

rts = [0]
for k in range(n):
    rts = np.concatenate([q_preimage(w) for w in rts])
which returns
array([ 1.36444432e+00+0.00095319j, -1.36444432e+00-0.00095319j,
1.40104860e-03-0.92828301j, -1.40104860e-03+0.92828301j,
8.82183775e-01-0.52384727j, -8.82183775e-01+0.52384727j,
8.78972436e-01+0.52576116j, -8.78972436e-01-0.52576116j,
1.19545693e+00-0.21647154j, -1.19545693e+00+0.21647154j,
3.61362916e-01+0.71612883j, -3.61362916e-01-0.71612883j,
1.19225541e+00+0.21925381j, -1.19225541e+00-0.21925381j,
3.66786415e-01-0.71269419j, -3.66786415e-01+0.71269419j,
...
or plotted
plt.plot(rts.real, rts.imag,'ob', ms=2); plt.grid(); plt.show()

How can I vectorize a matrix / the input so that scipy.optimize.minimize can work with it?

I ran into an issue while converting an unconstrained problem for scipy.optimize.minimize. I want to use the L-BFGS method.
The base problem looks like this:
min_{X,Y} ‖ A - XY ‖
where A is a given matrix, X ∈ ℝ^{n×l} and Y ∈ ℝ^{l×m}.
Since scipy only accepts vector inputs, I tried to interpret X and Y as one bigger variable Z = (X, Y), where I stack the columns of X and Y under each other.
First I wrote the function so that it converts this input vector back into the matrices. For a small example it worked fine (maybe because the matrix was dense? I don't know).
Here is my code:
import numpy as np
from scipy.optimize import minimize

R = np.array(np.arange(12)).reshape(3, 4)
Z0 = np.array(np.random.random(14))
# X = 3x2 = 6
# Y = 2x4 = 8

def whatineed(Z):
    return np.linalg.norm(R - np.dot(Z[:6].reshape(3, 2), Z[6:].reshape(2, 4)))

A = minimize(fun=whatineed, x0=Z0, method='L-BFGS-B', options={'disp': 1})
# print(A)
#print A
Above is just a (seemingly?) working dummy code. It gave me a result:
x: array([ 1.55308851, -0.50000733, 1.89812395, 1.44382572, 2.24315938, 3.38765876, 0.62668062, 1.23575295, 1.8448253 , 2.45389762, 1.94655245, 1.83844053, 1.73032859, 1.62221667])
If I run it with a big one, it doesn't work at all.
RUNNING THE L-BFGS-B CODE
* * *
Machine precision = 2.220D-16
N = 377400 M = 10
This problem is unconstrained.
At X0 0 variables are exactly at the bounds
and then it does not move any further. In reality, R is a more or less sparse matrix. I really don't know where to begin. Is it my function code? Is it the sparsity of R? Both? What is the workaround?
Update: The solver works with a VERY small dimension. If I go a bit bigger, this error occurs:
ABNORMAL_TERMINATION_IN_LNSRCH
Line search cannot locate an adequate point after 20 function
and gradient evaluations. Previous x, f and g restored.
Possible causes: 1 error in function or gradient evaluation;
2 rounding error dominate computation.
Cauchy time 0.000E+00 seconds.
Subspace minimization time 0.000E+00 seconds.
Line search time 0.000E+00 seconds.
Total User time 0.000E+00 seconds.
As you can see from the user time, the problem is quite small and it already stops working.
Running a handwritten L-BFGS on it results in no steps/descent at all.
If we are trying to solve
min_{B : rank(B) <= k} ‖ A - B ‖_2
the solution is well known to be the rank-k truncated SVD of A.
Instead, we are trying to solve
min_{X,Y} ‖ A - XY ‖_2
where X has shape n × k and Y has shape k × m (I'm using k because it is easier to read than l).
Claim: these are equivalent problems. To see this we need to show that:
XY has rank <= k (with the above shapes).
any matrix of rank <= k can be written as a product XY (with the above shapes).
Proof:
The first follows from the fact that Y has rank <= k and that the null space of XY contains the null space of Y.
The second follows from writing the SVD of a rank <= k matrix B = U D V* and observing that (UD) has shape n × k and V* has shape k × m, where we have dropped all but the first k singular values from the decomposition since the rest are guaranteed to be zero.
Implementation
To solve the problem OP stated we need only compute the SVD of A and truncate it to rank k.
You can use np.linalg.svd or sp.sparse.linalg.svds depending on whether your matrix is sparse. For the numpy version, the rank-k SVD can be computed as:
import numpy as np

m, n = 10, 20
A = np.random.randn(m, n)
k = 6

u, s, vt = np.linalg.svd(A)
X = u[:, :k]*s[:k]
Y = vt[:k]

print(X.shape, Y.shape)
print(np.linalg.norm(A - X@Y, 2))
The syntax of sp.sparse.linalg.svds is almost the same except you are able to specify the rank you want ahead of time.
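For the sparse case, a minimal sketch with made-up sizes and density (svds computes only the k largest singular triplets):
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import svds

A = sp.random(200, 300, density=0.05, format='csr')  # example sparse matrix (placeholder size/density)
k = 6
u, s, vt = svds(A, k=k)
X = u*s   # shape (200, k)
Y = vt    # shape (k, 300)
print(np.linalg.norm(A.toarray() - X@Y, 2))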

Unknown error with self-defined function for approximation of an integral

I've defined the following function as a method of approximating an integral using Boole's Rule:
import numpy as np

def integrate_boole(f, l, r, N):
    h = (r-l)/N
    xN = np.linspace(l, r, N+1)
    fN = f(xN)
    return ((2*h)/45)*(7*fN[0] + 32*np.sum(fN[1:-2:2]) + 12*np.sum(fN[2:-3:4]) + 14*np.sum(fN[4:-5]) + 7*fN[-1])
I used the function to get the value of the integral for sin(x)dx between 0 and pi (where N=8) and assigned it to a variable sine_int.
The answer given was 1.3938101893248442
After doing the original equation (see here) out by hand I realised this answer was quite inaccurate.
The sums of fN are giving incorrect values, but I'm not sure why. For example, np.sum(fN[4:-5]) is going to 0.
Is there a better way of coding the sums involved, or is there an error in my parameters that's causing the calculations to be inaccurate?
Thanks in advance.
EDIT
I should have made it clearer that this is supposed to be a composite version of the rule, i.e. approximating over N points where N is divisible by 4. So the typical 5 points with 4 intervals isn't going to cut it here, unfortunately. I would copy the equation I'm using in here, but I don't have an image of it and LaTeX isn't an option. It should/might be clear from the code I have after return.
From a quick inspection, it looks like the term multiplying f(x_4) should be 32, not 14:
def integrate_boole(f, l, r, N):
    h = (r-l)/N
    xN = np.linspace(l, r, N+1)
    fN = f(xN)
    return ((2*h)/45)*(7*fN[0] + 32*np.sum(fN[1:-2:2]) +
                       12*np.sum(fN[2:-3:4]) + 32*np.sum(fN[4:-5]) + 7*fN[-1])
First, one of your coefficients was wrong, as pointed out by @nixon. Second, I think you do not really understand how Boole's rule works: it approximates the integral of a function using only 5 points of the function. Hence, terms like np.sum(fN[1:-2:2]) make no sense. You only need five points, which you can obtain with xN = np.linspace(l,r,5). Your h is simply the distance between two contiguous points, h = xN[1] - xN[0]. And then, easy peasy:
import numpy as np

def integrate_boole(f, l, r):
    xN = np.linspace(l, r, 5)
    h = xN[1] - xN[0]
    fN = f(xN)
    return ((2*h)/45)*(7*fN[0] + 32*fN[1] + 12*fN[2] + 32*fN[3] + 7*fN[4])

def f(x):
    return np.sin(x)

I = integrate_boole(f, 0, np.pi)
print(I)  # Outputs 1.99857...
I'm not sure what you're hoping your code does w.r.t. Boole's rule. Why are you summing over samples of the function (i.e. np.sum(fN[2:-3:4]))? I think your N parameter is also not well defined and I'm not sure what it's supposed to represent. Maybe you're using another rule I'm not familiar with: I'll let you decide.
Regardless, here's an implementation of Boole's rule as Wikipedia defines it. Variables map to the Wikipedia version you linked:
def integ_boole(func, left, right):
    h = (right - left) / 4
    x1 = left
    x2 = left + h
    x3 = left + 2*h
    x4 = left + 3*h
    x5 = right  # or left + 4*h
    result = (2*h / 45) * (7*func(x1) + 32*func(x2) + 12*func(x3) + 32*func(x4) + 7*func(x5))
    return result
then, to test:
import numpy as np
print(integ_boole(np.sin, 0, np.pi))
outputs 1.9985707318238357, which is extremely close to the correct answer of 2.
HTH.
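The answers above use the basic five-point rule. For the composite variant the edit asks about, a minimal sketch (assuming N is a multiple of 4 and simply summing the five-point weights over each group of four subintervals, so interior panel boundaries get weight 7 + 7 = 14) could look like:
import numpy as np

def integrate_boole_composite(f, l, r, N):
    # Composite Boole's rule over N subintervals; N must be divisible by 4
    if N % 4 != 0:
        raise ValueError("N must be divisible by 4")
    h = (r - l) / N
    xN = np.linspace(l, r, N + 1)
    fN = f(xN)
    return (2*h/45) * (7*(fN[0] + fN[-1])          # endpoints of the whole interval
                       + 32*np.sum(fN[1:-1:2])     # odd indices
                       + 12*np.sum(fN[2:-1:4])     # indices congruent to 2 mod 4
                       + 14*np.sum(fN[4:-1:4]))    # interior panel boundaries (multiples of 4)

print(integrate_boole_composite(np.sin, 0, np.pi, 8))  # close to the exact value 2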
