I'm trying to run a Newton-CG optimization in Python. My function is f(x, y) = (1 - x)^2 + 2(y - x^2)^2. Initial point: x = 3, y = 2. Here is my code:
import numpy as np
from scipy.optimize import minimize

def f(params):  # objective function
    x, y = params  # unpack the two parameters
    return (1 - x) ** 2 + 2 * (y - x ** 2) ** 2

def jacobian(params):  # gradient of f
    x, y = params  # unpack the two parameters
    der = np.zeros_like(x)
    der[0] = -8 * x * (-x ** 2 + y) + 2 * x - 2  # derivative with respect to x
    der[1] = -4 * x ** 2 + 4 * y  # derivative with respect to y
    return der

initial_guess = [3, 2]  # initial point
result = minimize(f, initial_guess, jac=jacobian, method='Newton-CG')
I get the error "IndexError: too many indices for array".
I also tried Nelder-Mead and BFGS, and both work, so the problem must be in the Jacobian. I suspect there is a mistake somewhere in def jacobian.
The error is indeed in the jacobian function: you create der with np.zeros_like(x), so it takes the shape of x, which is a scalar, and indexing der[0] then fails. Use params instead:
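def jacobian(params):  # gradient of f
    x, y = params  # unpack the two parameters
    # dtype=float avoids integer truncation if params is a list of ints
    der = np.zeros_like(params, dtype=float)
    der[0] = -8 * x * (-x ** 2 + y) + 2 * x - 2  # derivative with respect to x
    der[1] = -4 * x ** 2 + 4 * y  # derivative with respect to y
    return der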
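To check the fix end to end, here is a small self-contained sketch. It builds the gradient with np.array (my own variation, equivalent to filling der), compares it against finite differences with scipy.optimize.check_grad, and then runs Newton-CG from the starting point in the question. The minimum of this function is at (1, 1), so the result should land close to that.

```python
import numpy as np
from scipy.optimize import check_grad, minimize

def f(params):
    x, y = params
    return (1 - x) ** 2 + 2 * (y - x ** 2) ** 2

def jacobian(params):
    x, y = params
    return np.array([
        -8 * x * (y - x ** 2) + 2 * x - 2,  # df/dx
        4 * (y - x ** 2),                   # df/dy
    ])

# Norm of the difference between the analytic and finite-difference gradient.
grad_error = check_grad(f, jacobian, np.array([3.0, 2.0]))
print("gradient error:", grad_error)

result = minimize(f, [3, 2], jac=jacobian, method='Newton-CG')
print(result.x)
```

If grad_error is not small, the analytic derivatives are the first thing to recheck.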
I wrote the program below in Python hoping to carry out a Helmholtz decomposition of a vector field V(x,z) = [f(x,z), 0, 0], where f(x,z) is defined in the code. The aim is to get the solenoidal and harmonic parts of V as S(x,z) = [S1(x,z), S2(x,z), S3(x,z)] and H(x,z) = [H1(x,z), H2(x,z), H3(x,z)], with S and H satisfying V = S + H, which translates to S1 + H1 = f, S2 + H2 = 0, S3 + H3 = 0.
Please help, I can't get anywhere with this problem. The output of the code below isn't what I wanted; it is the following:
Solenoidal:
[[-22.6179559436889 + 41.14742726254I, 33.243161684442 - 99.9416505604629I, -22.6179559436889 + 41.14742726254I], [0.000151144774536593 + 0.000222403457962539I, 0, -0.000151144774536593 - 0.000222403457962539I], [22.6210744289585 - 41.1540953247099I, -41.2442631673893 + 88.1909008014634I, 6.6295316668479 - 64.6849359328842I]]
Harmonic:
[[26.6155393446675 - 35.2651619174123I, -33.243161684442 + 99.9416505604629I, 18.6203725427103 - 47.0296926076676I], [-0.000151144774536593 - 0.000222403457962539I, 0, 0.000151144774536593 + 0.000222403457962539I], [-18.6231887384308 + 47.0368054767535I, 41.2442631673893 - 88.1909008014634I, -10.6274173573755 + 58.8022257808406I]]
import math
import numpy as np
from sympy import symbols, simplify, lambdify

# Define x and z as symbolic variables
x, z = symbols('x, z')

# Define the function f
def f(x, z):
    term1 = 171.05 * 10**(-18) * ((1.00 * x**4 + 2.00 * x**2 * z**2 + 1.00 * z**4) * math.atan(z*x) - 1.00 * x**3 * z - 1.00 * x * z**3)
    term2 = -3.17 * 10**6 * x**4 - 6.36 * 10**6 * x**2 * z**2 - 3.19 * 10**6 * z**4 + 1.00 * x**4 * z + 2.00 * x**2 * z**3 + 1.00 * z**5
    term3 = (z - 44.33 * 10**3)
    term4 = ((-2.00 * 10**3) / (576.30 * 10**3 + 13.00 * z))**2.69 * (x**2 + z**2)**7.00 / 2.00 * z
    return term1 * term2 * term3 / (term4 + 1e-15)  # Add a small value to term4 to avoid division by zero

# Define a 2D array with 3 elements
vector = np.array([[f(x, z) for x in range(-1, 2)] for z in range(-1, 2)])

def helmholtz_hodge_decomposition(vector):
    # Compute the gradient of the vector field
    gradient = np.gradient(vector)
    # Compute the curl of the vector field
    curl = np.cross(gradient[0], gradient[1])
    # Compute the divergence of the vector field
    divergence = np.sum(gradient, axis=0)
    # Compute the harmonic part of the vector field
    harmonic = -curl - divergence
    # Compute the solenoidal part of the vector field
    solenoidal = vector - harmonic
    return solenoidal, harmonic

# Print the solenoidal and harmonic parts as functions of x and z
solenoidal, harmonic = helmholtz_hodge_decomposition(vector)
print("Solenoidal:")
print(simplify(solenoidal))
print("Harmonic:")
print(simplify(harmonic))

# Create functions from the solenoidal and harmonic parts
solenoidal_part = lambdify((x, z), simplify(solenoidal), 'numpy')
harmonic_part = lambdify((x, z), simplify(harmonic), 'numpy')
Expected: a Helmholtz decomposition of V(x,z) = [f(x,z), 0, 0] that gives the solenoidal part S(x,z) = [S1(x,z), S2(x,z), S3(x,z)] and the harmonic part H(x,z) = [H1(x,z), H2(x,z), H3(x,z)], with S and H satisfying V = S + H, which translates to S1 + H1 = f, S2 + H2 = 0, S3 + H3 = 0.
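One thing the posted output already shows is that every nonzero entry is complex. A likely cause, assuming the code runs exactly as posted, is term4: its base (-2.00 * 10**3) / (576.30 * 10**3 + 13.00 * z) is negative, and in Python 3 a negative float raised to a non-integer power returns a complex number. Also note that the list comprehension calls f with the integers -1, 0, 1, so vector is a plain numeric 3x3 array and not a function of x and z. The snippet below only reproduces the complex-power behaviour; it is not a Helmholtz decomposition.

```python
# Base of the power in term4 for z = -1, 0, 1 (the values used in the question).
for z in (-1, 0, 1):
    base = (-2.00 * 10**3) / (576.30 * 10**3 + 13.00 * z)
    value = base ** 2.69  # negative float ** non-integer float gives a complex result
    print(z, base, type(value).__name__)

# Taking the absolute value first keeps the result real.
# Whether that matches the intended physics is an assumption to verify.
real_value = abs(base) ** 2.69
print(type(real_value).__name__)
```

To get S and H as functions of x and z, f would have to be built from sympy expressions (for example sympy.atan in place of math.atan) and evaluated on the symbols, not on integers.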
I need an algorithm that solves systems like this:
Example 1:
5x - 6y = 0 <--- line
(10- x)**2 + (10- y)**2 = 2 <--- circle
Solution:
find y:
(10- 6/5*y)**2 + (10- y)**2 = 2
100 - 24y + 1.44y**2 + 100 - 20y + y**2 = 2
2.44y**2 - 44y + 198 = 0
D = b**2 - 4ac
D = 44*44 - 4*2.44*198 = 3.52
y[1,2] = (-b+-sqrt(D))/2a
y[1,2] = (44+-1.8761)/4.88 = 9.4008 , 8.6319
find x:
(10- x)**2 + (10- 5/6*x)**2 = 2
100 - 20x + x**2 + 100 - 5/6*20x + (5/6*x)**2 = 2
1.6944x**2 - 36.6666x + 198 = 0
D = b**2 - 4ac
D = 36.6666*36.6666 - 4*1.6944*198 = 2.4747
x[1,2] = (-b+-sqrt(D))/2a
x[1,2] = (36.6666+-1.5731)/3.3888 = 11.2841 , 10.3557
My skills are not enough to write this algorithm, please help.
I also need another algorithm that solves this system:
5x - 6y = 0 <--- line
|-10 - x| + |-10 - y| = 2 <--- rhombus
As the answer here I need two x values and two y values.
You can use sympy, Python's symbolic math library.
Solutions for fixed parameters
from sympy import symbols, Eq, solve
x, y = symbols('x y', real=True)
eq1 = Eq(5 * x - 6 * y, 0)
eq2 = Eq((10 - x) ** 2 + (10 - y) ** 2, 2)
solutions = solve([eq1, eq2], (x, y))
print(solutions)
for xi, yi in solutions:  # separate names so the symbols x and y are not overwritten
    print(f'{xi.evalf()}, {yi.evalf()}')
This leads to two solutions:
[(660/61 - 6*sqrt(22)/61, 550/61 - 5*sqrt(22)/61),
(6*sqrt(22)/61 + 660/61, 5*sqrt(22)/61 + 550/61)]
10.3583197613288, 8.63193313444070
11.2810245009662, 9.40085375080520
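The other system works very similarly (Abs has to be imported from sympy as well):
from sympy import Abs
eq1 = Eq(5 * x - 6 * y, 0)
eq2 = Eq(Abs(-10 - x) + Abs(-10 - y), 2)
solutions = solve([eq1, eq2], (x, y))
leading to: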
[(-12, -10),
(-108/11, -90/11)]
-12.0000000000000, -10.0000000000000
-9.81818181818182, -8.18181818181818
Dealing with arbitrary parameters
For your new question, how to deal with arbitrary parameters, sympy can help to find formulas, at least when the structure of the equations is fixed:
from sympy import symbols, Eq, Abs, solve
x, y = symbols('x y', real=True)
a, b, xc, yc = symbols('a b xc yc', real=True)
r = symbols('r', real=True, positive=True)
eq1 = Eq(a * x - b * y, 0)
eq2 = Eq((xc - x) ** 2 + (yc - y) ** 2, r ** 2)
solutions = solve([eq1, eq2], (x, y))
Studying the generated solutions, some complicated expressions are repeated. Those can be replaced by auxiliary variables. Note that this step isn't necessary, but it helps a lot in making sense of the solutions. Also note that substitution in sympy often only considers quite literal replacements. That's why the introduction of c below is done in two steps:
c, d = symbols('c d', real=True)
for xi, yi in solutions:
    print(xi.subs(a ** 2 + b ** 2, c)
            .subs(r ** 2 * a ** 2 + r ** 2 * b ** 2, c * r ** 2)
            .subs(-a ** 2 * xc ** 2 + 2 * a * b * xc * yc - b ** 2 * yc ** 2 + c * r ** 2, d)
            .simplify())
    print(yi.subs(a ** 2 + b ** 2, c)
            .subs(r ** 2 * a ** 2 + r ** 2 * b ** 2, c * r ** 2)
            .subs(-a ** 2 * xc ** 2 + 2 * a * b * xc * yc - b ** 2 * yc ** 2 + c * r ** 2, d)
            .simplify())
Which gives the formulas:
x1 = b*(a*yc + b*xc - sqrt(d))/c
y1 = a*(a*yc + b*xc - sqrt(d))/c
x2 = b*(a*yc + b*xc + sqrt(d))/c
y2 = a*(a*yc + b*xc + sqrt(d))/c
These formulas can then be converted to regular Python code without needing sympy. That code only works for a line of the form a*x - b*y == 0 and a circle. Some tests need to be added, such as c == 0 (meaning a and b are both zero, so the equation does not describe a line), and whether d is zero, positive or negative.
The stand-alone code could look like:
import math

def give_solutions(a, b, xc, yc, r):
    # intersection between a line a*x-b*y==0 and a circle with center (xc, yc) and radius r
    c = a ** 2 + b ** 2
    if c == 0:
        print("degenerate line equation given")
    else:
        d = -a**2 * xc**2 + 2*a*b * xc*yc - b**2 * yc**2 + c * r**2
        if d < 0:
            print("no solutions")
        elif d == 0:
            print("1 solution:")
            print(f" x1 = {b*(a*yc + b*xc)/c}")
            print(f" y1 = {a*(a*yc + b*xc)/c}")
        else:  # d > 0
            print("2 solutions:")
            sqrt_d = math.sqrt(d)
            print(f" x1 = {b*(a*yc + b*xc - sqrt_d)/c}")
            print(f" y1 = {a*(a*yc + b*xc - sqrt_d)/c}")
            print(f" x2 = {b*(a*yc + b*xc + sqrt_d)/c}")
            print(f" y2 = {a*(a*yc + b*xc + sqrt_d)/c}")
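For reuse it can be handier to return the points than to print them. Here is a minimal sketch of that variant using the same formulas. The function name line_circle_intersections and the absolute tolerance tol are my own choices for illustration. The call at the end uses the numbers from the question (radius sqrt(2), because the right-hand side 2 is r squared).

```python
import math

def line_circle_intersections(a, b, xc, yc, r, tol=1e-12):
    """Intersections of the line a*x - b*y == 0 with the circle
    centred at (xc, yc) with radius r, as a list of (x, y) tuples."""
    c = a ** 2 + b ** 2
    if c == 0:
        raise ValueError("degenerate line equation: a and b are both zero")
    d = -a**2 * xc**2 + 2*a*b * xc*yc - b**2 * yc**2 + c * r**2
    if d < -tol:
        return []  # the line misses the circle
    s = a * yc + b * xc
    if abs(d) <= tol:
        return [(b * s / c, a * s / c)]  # tangent line: a single point
    sqrt_d = math.sqrt(d)
    return [(b * (s - sqrt_d) / c, a * (s - sqrt_d) / c),
            (b * (s + sqrt_d) / c, a * (s + sqrt_d) / c)]

points = line_circle_intersections(5, 6, 10, 10, math.sqrt(2))
for px, py in points:
    print(px, py)
```

The two points should agree with the sympy result for the fixed-parameter system up to floating-point rounding.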
For the rhombus with arbitrary parameters, sympy doesn't seem to work well with Abs in the equations. However, you could use the equations of the 4 sides and test whether the obtained intersections lie within the rhombus. (The four sides are obtained by replacing each abs with either + or -, giving four combinations.)
Working this out further is far beyond the reach of a typical Stack Overflow answer, especially as you seem to ask for an even more general solution.
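That said, the four-sides idea is short enough to sketch for the specific shape in the question, a line a*x - b*y == 0 and a rhombus |x - xc| + |y - yc| == r. The function name, the tol parameter and the choice to ignore a line that coincides with a side are my own assumptions, not a general solution. Note that |-10 - x| equals |x - (-10)|, so the question's rhombus has xc = yc = -10.

```python
def line_rhombus_intersections(a, b, xc, yc, r, tol=1e-9):
    """Intersections of the line a*x - b*y == 0 with the rhombus
    |x - xc| + |y - yc| == r, as a sorted list of (x, y) tuples."""
    if a == 0 and b == 0:
        raise ValueError("degenerate line equation: a and b are both zero")
    points = []
    # Each sign pair replaces the two abs() calls by + or -, giving one side's line.
    for sx in (1, -1):
        for sy in (1, -1):
            denom = sx * b + sy * a
            if denom == 0:
                continue  # line is parallel to this side (coincident case ignored)
            # Points on the line are (b*t, a*t); solve
            # sx*(x - xc) + sy*(y - yc) == r for t.
            t = (r + sx * xc + sy * yc) / denom
            x, y = b * t, a * t
            # Keep the point only if the assumed signs match the actual ones.
            if sx * (x - xc) < -tol or sy * (y - yc) < -tol:
                continue
            # A corner of the rhombus is found twice; skip duplicates.
            if all(abs(x - px) > tol or abs(y - py) > tol for px, py in points):
                points.append((x, y))
    return sorted(points)

rhombus_points = line_rhombus_intersections(5, 6, -10, -10, 2)
print(rhombus_points)
```

For the question's numbers this should reproduce the two points found by sympy above, (-12, -10) and (-108/11, -90/11).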
I have some example code. When I calculate dloss/dw manually I get 8, but the following code gives me 16. Please tell me why the gradient is 16.
import torch
x = torch.tensor(2.0)
y = torch.tensor(2.0)
w = torch.tensor(3.0, requires_grad=True)
# forward
y_hat = w * x
s = y_hat - y
loss = s**2
#backward
loss.backward()
print(w.grad)
I think you simply miscalculated.
The derivative of loss = (w * x - y)^2 with respect to w is:
dloss/dw = 2 * (w * x - y) * x = 2 * (3 * 2 - 2) * 2 = 16
Keep in mind that back-propagation in neural networks is done by applying the chain rule. I think you forgot the * x at the end of the derivative.
To be specific:
The chain rule says that df(g(w))/dw = f'(g(w)) * g'(w) (differentiated with respect to w).
The whole loss function in your case is built like this:
loss(y_hat) = (y_hat - y)^2
y_hat(w) = w * x
thus: loss(y_hat(w)) = (y_hat(w) - y)^2
By the chain rule, its derivative is:
dloss(y_hat(w))/dw = loss'(y_hat(w)) * dy_hat(w)/dw
For any z:
loss'(z) = 2 * (z - y) * 1, and dy_hat(w)/dw = x
thus: dloss(y_hat(w))/dw = loss'(y_hat(w)) * dy_hat(w)/dw = 2 * (y_hat(w) - y) * x = 2 * (w * x - y) * x = 16
PyTorch knows that each layer in your forward pass applies some function to its input and that the forward pass as a whole is 1 * loss(y_hat(w)); it then keeps applying the chain rule during the backward pass (each layer requires one application of the chain rule).
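If you want to convince yourself without autograd, a central finite difference gives the same number. This is just a plain-Python sanity check with the values from the question; the step size h is an arbitrary small number I picked.

```python
x, y, w = 2.0, 2.0, 3.0

def loss(w_value):
    # Same forward pass as in the question: (w * x - y) ** 2
    return (w_value * x - y) ** 2

analytic = 2 * (w * x - y) * x  # chain-rule result

h = 1e-6
numeric = (loss(w + h) - loss(w - h)) / (2 * h)  # central difference

print(analytic)
print(numeric)
```

Both values come out at 16 (the numeric one up to rounding), matching w.grad.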
I'm trying to get the coefficients of a function p(T,x). I load the data for p, T and x from Excel sheets via pandas. The following code works quite nicely for me:
import pandas as pd
import os
from scipy.optimize import curve_fit
import numpy as np
df = pd.read_excel(os.path.join(os.path.dirname(__file__), "./Data.xlsx"))
T = np.array(df['T'], dtype=float)
x = np.array(df['x'], dtype=float)
p = np.array(df['p'], dtype=float)
p_s = 67.17
def func(X, a, b, c, d, e, f):
    T, x = X
    return x * p_s + x * (1 - x) * (a + b * T + c * T ** 2 + d * x + e * x * T + f * x * T ** 2) * p_s
popt, pcov = curve_fit(func, (T, x), p)
print("a = %s , b = %s, c = %s, d = %s, e = %s, f = %s" % (popt[0], popt[1], popt[2], popt[3], popt[4], popt[5]))
My actual problem is that the fitted function swings a little at the end.
Because of this behaviour I get two x values for one p value, which I don't want.
To avoid this swing I want to impose a condition on the fit, something like dp/dx > 0 (for constant T). By dp/dx I mean the derivative of the function with respect to x.
Is this possible with the normal bounds parameter of curve_fit? How can I do this?
EDIT:
As suggested, I've played around a bit with the least_squares function, but I've reached a point where I don't really understand what I'm doing or what I have to do.
T = np.array(df['T'], dtype=float)
x = np.array(df['x'], dtype=float)
p = np.array(df['p'], dtype=float)
p_s = 67
def f(X, z):
    T, x = X
    return x * p_s + x * (1 - x) * (z[0] + z[1] * T + z[2] * T ** 2 + z[3] * x + z[4] * x * T + z[5] * x * T ** 2) * p_s

def g(X,p,z):
    return p - f(X ,z)
z0 = np.array([0,0,0,0,0,0], dtype=float)
res, flag = least_squares(g,z0, args=(T,x,p))
print(res)
With this code I get the following error:
TypeError: g() takes 3 positional arguments but 4 were given
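The TypeError comes from how least_squares calls the residual function: it passes the parameter vector first and then unpacks args, so with args=(T, x, p) it calls g(z, T, x, p), which is four arguments. least_squares also returns a single result object, not a (res, flag) pair. Below is a sketch that fixes both and, as one possible way to discourage the swing, adds a soft penalty wherever dp/dx is negative on a check grid. This is a penalty, not a hard constraint. The data are synthetic: z_true, the T and x grids and weight are made up for the demo, so substitute your own arrays.

```python
import numpy as np
from scipy.optimize import least_squares

p_s = 67.17

def model(z, T, x):
    # Same functional form as func() in the question, coefficients in z.
    A = z[0] + z[1] * T + z[2] * T ** 2
    B = z[3] + z[4] * T + z[5] * T ** 2
    return x * p_s + x * (1 - x) * (A + B * x) * p_s

def dmodel_dx(z, T, x):
    # Analytic derivative of model() with respect to x at constant T.
    A = z[0] + z[1] * T + z[2] * T ** 2
    B = z[3] + z[4] * T + z[5] * T ** 2
    return p_s * (1 + A + 2 * (B - A) * x - 3 * B * x ** 2)

def residuals(z, T, x, p, T_grid, x_grid, weight):
    # least_squares calls residuals(z, *args): the parameters come first.
    fit = model(z, T, x) - p
    # Soft penalty: nonzero only where dp/dx < 0 on the check grid.
    penalty = weight * np.minimum(dmodel_dx(z, T_grid, x_grid), 0.0)
    return np.concatenate([fit, penalty])

# Synthetic, noise-free demo data (hypothetical values, not the asker's).
Tm, xm = np.meshgrid(np.linspace(0.0, 1.0, 5), np.linspace(0.05, 0.95, 10))
T, x = Tm.ravel(), xm.ravel()
z_true = np.array([0.5, 0.1, 0.0, -0.3, 0.0, 0.0])
p = model(z_true, T, x)

z0 = np.zeros(6)
res = least_squares(residuals, z0, args=(T, x, p, T, x, 10.0))
print(res.x)        # fitted coefficients
print(res.success)  # status lives on the result object
```

Here the check grid is simply the data points; a denser grid over the T and x range you care about would catch a swing between data points. A larger weight pushes harder against negative slopes at the cost of a worse fit.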
Say we have this function,
f = poly(2*x**2 + 3*x - 1,x)
How would one go about dropping terms of degree n or lower?
For instance if n = 1 the result would be 2*x**2.
from sympy import poly
from sympy.abc import x
p = poly(x ** 5 + 2 * x ** 4 - x ** 3 - 2 * x ** 2 + x)
print(p)
n = 2
new_p = poly(sum(c * x ** i[0] for i, c in p.terms() if i[0] > n), x)  # passing x keeps this valid even if no term survives
print(new_p)
Output:
Poly(x**5 + 2*x**4 - x**3 - 2*x**2 + x, x, domain='ZZ')
Poly(x**5 + 2*x**4 - x**3, x, domain='ZZ')
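Wrapped into a small helper, the same idea can be applied to the polynomial from the question. This is a sketch for univariate polynomials only; the name drop_low_terms is my own.

```python
from sympy import Poly, poly
from sympy.abc import x

def drop_low_terms(p, n):
    """Return p without its terms of degree n or lower (univariate Poly)."""
    gen = p.gens[0]
    kept = sum(c * gen ** monom[0] for monom, c in p.terms() if monom[0] > n)
    # Passing gen explicitly keeps this valid even if no term survives.
    return Poly(kept, gen)

f = poly(2 * x ** 2 + 3 * x - 1, x)
print(drop_low_terms(f, 1))
```

With n = 1 this leaves only the 2*x**2 term, as asked; with n = 2 or higher it returns the zero polynomial in x.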