fmin_ncg not returning an optimized result - python

I am trying to minimize my cost function with fmin_ncg, but the result it returns is not minimized: I get the same cost I would get without any optimization, and I know for a fact that it can be reduced further.
PS. I am trying to code assignment 2 of Coursera's ML course.
My cost function:
def costFn(theta, X, y, m, lam):
    h = sigmoid(X.dot(theta))
    theta0 = theta
    J = 1 / m * np.sum((-(y * np.log(h))) - ((1-y) * np.log(1-h))) + (lam/(2*m) * theta0.T.dot(theta0))
    return J.flatten()
X would look something like this:
[[ 1.00000000e+00 5.12670000e-02 6.99560000e-01 ..., 6.29470940e-04
8.58939846e-03 1.17205992e-01]
[ 1.00000000e+00 -9.27420000e-02 6.84940000e-01 ..., 1.89305413e-03
-1.39810280e-02 1.03255971e-01]
[ 1.00000000e+00 -2.13710000e-01 6.92250000e-01 ..., 1.04882142e-02
-3.39734512e-02 1.10046893e-01]
...,
[ 1.00000000e+00 -4.84450000e-01 9.99270000e-01 ..., 2.34007252e-01
-4.82684337e-01 9.95627986e-01]
....
Y is a bunch of 0s and 1s
[[1]
[1]
[1]
[1]
...
[0]
[0]]
X.shape = (118, 28)
y.shape = (118, 1)
My grad function:
def grad(theta, X, y, m, lam):
    h = sigmoid(X.dot(theta))
    theta0 = initial_theta
    gg = 1.0 / m * ((X.T.dot(h-y)) + (lam * theta0))
    return gg.flatten()
Using just my costFn and grad, I get the following:
Cost at initial theta (zeros): 0.69314718056
With fmin_ncg:
xopt = fmin_ncg(costFn, fprime=grad, x0=initial_theta, args=(X, y, m, lam), maxiter=400, disp=True, full_output=True )
I get:
Optimization terminated successfully.
Current function value: 0.693147
Iterations: 1
Function evaluations: 2
Gradient evaluations: 4
Hessian evaluations: 0
Using octave, my J after advanced optimization should be:
0.52900
What am I doing wrong?
EDIT:
I got my optimization to work:
y1 = y.flatten()
Result = op.minimize(fun = costFn,
                     x0 = initial_theta,
                     args = (X, y1, m, lam),
                     method = 'CG',
                     options={'disp': True})
I get the costFn to be 0.52900, which is what I expected.
But the values of 'theta' are slightly off, so much so that the accuracy is only 42%. It is supposed to be 83%.
The values of theta I got:
[ 1.14227089 0.60130664 1.16707559 -1.87187892 -0.91534354 -1.26956697
0.12663015 -0.36875537 -0.34522652 -0.17363325 -1.42401493 -0.04872243
-0.60650726 -0.269242 -1.1631064 -0.24319088 -0.20711764 -0.04333854
-0.28026111 -0.28693582 -0.46918892 -1.03640373 0.02909611 -0.29266766
0.01725324 -0.32899144 -0.13795701 -0.93215664]
The actual values:
[1.273005 0.624876 1.177376 -2.020142 -0.912616 -1.429907 0.125668 -0.368551
-0.360033 -0.171068 -1.460894 -0.052499 -0.618889 -0.273745 -1.192301
-0.240993 -0.207934 -0.047224 -0.278327 -0.296602 -0.453957 -1.045511
0.026463 -0.294330 0.014381 -0.328703 -0.143796 -0.924883]

First of all, your gradient is invalid:
def grad(theta, X, y, m, lam):
    h = sigmoid(X.dot(initial_theta))
    theta0 = initial_theta
    gg = 1 / m * ((X.T.dot(h-y)) + (lam * theta0))
    return gg.flatten()
This function never uses theta; it uses initial_theta everywhere instead, so it returns the same value no matter which point the optimizer asks about. That is incorrect.
There is a similar error in the cost:
def costFn(theta, X, y, m, lam):
    h = sigmoid(X.dot(initial_theta))
    theta0 = theta
    J = 1 / m * np.sum((-(y * np.log(h))) - ((1-y) * np.log(1-h))) + (lam/(2*m) * theta0.T.dot(theta0))
    return J.flatten()
Here you have an odd mix of theta and initial_theta, which does not make sense either: only theta should appear inside. As a side note, there should be no need for flattening. Your cost function should return a scalar, so if you have to flatten, something is wrong in your computations. A likely culprit is y having shape (118, 1): combined with a 1-D h it broadcasts to (118, 118), which would explain why flattening y (as in your edit) helps.
Also worth checking: what is your m? If it is an integer and you are using Python 2.x, then 1 / m equals zero because of integer division. Use 1.0 / m instead (in both functions).
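For reference, here is a minimal sketch of what the corrected pair could look like. It is not the asker's exact code: m is derived from y instead of being passed in, the data is a small synthetic stand-in (not the course data set), and, as in the original post, the whole of theta is regularized. The names cost_fn, grad_error and the choice of method='Newton-CG' are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize, check_grad

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_fn(theta, X, y, lam):
    # Regularized logistic-regression cost; uses only theta and returns a scalar.
    m = y.size
    h = sigmoid(X.dot(theta))
    data_term = -np.mean(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))
    return data_term + lam / (2.0 * m) * theta.dot(theta)

def grad(theta, X, y, lam):
    # Gradient of cost_fn, evaluated at the theta passed in by the optimizer.
    m = y.size
    h = sigmoid(X.dot(theta))
    return (X.T.dot(h - y) + lam * theta) / m

# Synthetic stand-in data (assumption): 60 samples, bias column plus 2 features.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(60), rng.normal(size=(60, 2))])
y = (X[:, 1] + X[:, 2] > 0).astype(float)   # 1-D labels, shape (60,)
lam = 1.0
theta0 = np.zeros(X.shape[1])

# Compare the analytic gradient with a finite-difference estimate.
grad_error = check_grad(cost_fn, grad, theta0, X, y, lam)

result = minimize(cost_fn, theta0, args=(X, y, lam), jac=grad, method='Newton-CG')
print(cost_fn(theta0, X, y, lam), result.fun, grad_error)
```

The finite-difference check is a cheap way to catch exactly the kind of mistake discussed above: a gradient that ignores theta will not agree with the numerical estimate once theta moves away from the starting point.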

Related

Python problems on Machine-Learning

import numpy as np
import pandas as pd
from matplotlib import pyplot as pt
def computeCost(X,y,theta):
    m=len(y)
    predictions= X*theta-y
    sqrerror=np.power(predictions,2)
    return 1/(2*m)*np.sum(sqrerror)
def gradientDescent(X, y, theta, alpha, num_iters):
    m = len(y)
    jhistory = np.zeros((num_iters,1))
    for i in range(num_iters):
        h = X * theta
        s = h - y
        theta = theta - (alpha / m) * (s.T*X).T
        jhistory_iter = computeCost(X, y, theta)
    return theta,jhistory_iter
data = open(r'C:\Users\Coding\Desktop\machine-learning-ex1\ex1\ex1data1.txt')
data1=np.array(pd.read_csv(r'C:\Users\Coding\Desktop\machine-learning-ex1\ex1\ex1data1.txt',header=None))
y =np.array(data1[:,1])
m=len(y)
y=np.asmatrix(y.reshape(m,1))
X = np.array([data1[:,0]]).reshape(m,1)
X = np.asmatrix(np.insert(X,0,1,axis=1))
theta=np.zeros((2,1))
iterations = 1500
alpha = 0.01;
print('Testing the cost function ...')
J = computeCost(X, y, theta)
print('With theta = [0 , 0]\nCost computed = ', J)
print('Expected cost value (approx) 32.07')
theta=np.asmatrix([[-1,0],[1,2]])
J = computeCost(X, y, theta)
print('With theta = [-1 , 2]\nCost computed =', J)
print('Expected cost value (approx) 54.24')
theta,JJ = gradientDescent(X, y, theta, alpha, iterations)
print('Theta found by gradient descent:')
print(theta)
print('Expected theta values (approx)')
print(' -3.6303\n 1.1664\n')
predict1 = [1, 3.5] *theta
print(predict1*10000)
Result:
Testing the cost function ...
With theta = [0 , 0]
Cost computed = 32.072733877455676
Expected cost value (approx) 32.07
With theta = [-1 , 2]
Cost computed = 69.84811062494227
Expected cost value (approx) 54.24
Theta found by gradient descent:
[[-3.70304726 -3.64357517]
[ 1.17367146 1.16769684]]
Expected theta values (approx)
-3.6303
1.1664
[[4048.02858742 4433.63790186]]
There are two problems. The first computed cost is right, but the second one is wrong. Also, gradient descent returns 4 elements for theta (there are supposed to be two).
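You say "With theta = [-1 , 2]" but you enter
theta=np.asmatrix([[-1,0],[1,2]])
which is a 2x2 matrix, not the vector [-1, 2], so I think this is incorrect. Assuming you have a single feature plus the added column of ones, and you are doing simple linear regression, theta must have exactly two entries. As a column that lines up with your (m, 2) matrix X, that would be
np.asmatrix([[-1],[2]])
Gradient descent is then started from that same 2x2 theta, which is why it returns four values instead of two; set theta back to np.zeros((2,1)) before calling it.
Also, where you have
predictions= X*theta-y
it would be clearer to write
np.dot(X,theta)-y
With np.matrix objects * already means matrix multiplication, but with plain NumPy arrays * is element-wise, so in general the two do not do the same thing.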
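To make the shape requirement concrete, here is a small sketch using plain NumPy arrays and a (2, 1) column for theta. The three-point data set is made up for illustration (it is not ex1data1.txt), and compute_cost and gradient_descent are renamed stand-ins for the functions in the question.

```python
import numpy as np

def compute_cost(X, y, theta):
    # X: (m, 2), theta: (2, 1), y: (m, 1) -> residual has shape (m, 1).
    m = len(y)
    residual = X.dot(theta) - y
    return np.sum(residual ** 2) / (2.0 * m)

def gradient_descent(X, y, theta, alpha, num_iters):
    m = len(y)
    history = np.zeros(num_iters)
    for i in range(num_iters):
        residual = X.dot(theta) - y
        theta = theta - (alpha / m) * X.T.dot(residual)
        history[i] = compute_cost(X, y, theta)   # keep the cost of every iteration
    return theta, history

# Made-up data lying exactly on y = x, with a leading column of ones.
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([[1.0], [2.0], [3.0]])

cost_zero = compute_cost(X, y, np.zeros((2, 1)))            # (1 + 4 + 9) / 6
cost_exact = compute_cost(X, y, np.array([[0.0], [1.0]]))   # perfect fit
theta, history = gradient_descent(X, y, np.zeros((2, 1)), 0.1, 2000)
print(cost_zero, cost_exact, theta.ravel())
```

Because theta stays a (2, 1) column throughout, the result of gradient descent has two entries, and the cost is a single number for any theta of that shape.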

How to solve the following question using the provided Runge-Kutta method in python

The question body:
A skydiver of mass m in a vertical free fall experiences an aerodynamic drag force F=cy'² ('c times y prime square') where y is measured downward from the start of the fall, and y is a function of time (y' denotes the derivative of y w.r.t time). The differential equation describing the fall is:
y''=g-(c/m)y'²
(where g = 9.80665 m/s^2; c = 0.2028 kg/m; m = 80 kg). And y(0)=y'(0)=0 as this is a free fall.
Task: The function must return the time of a fall of x meters, where x is the parameter of the function. The values of g, c and m are given below.
The Runge-Kutta function is defined as follows:
from numpy import *
def runge_kutta_4(F, x0, y0, x, h):
    '''
    Return y(x) given the following initial value problem:
    y' = F(x, y)
    y(x0) = y0 # initial conditions
    h is the increment of x used in integration
    F = [y'[0], y'[1], ..., y'[n-1]]
    y = [y[0], y[1], ..., y[n-1]]
    '''
    X = []
    Y = []
    X.append(x0)
    Y.append(y0)
    while x0 < x:
        k0 = F(x0, y0)
        k1 = F(x0 + h / 2.0, y0 + h / 2.0 * k0)
        k2 = F(x0 + h / 2.0, y0 + h / 2 * k1)
        k3 = F(x0 + h, y0 + h * k2)
        y0 = y0 + h / 6.0 * (k0 + 2 * k1 + 2.0 * k2 + k3)
        x0 += h
        X.append(x0)
        Y.append(y0)
    return array(X), array(Y)
And this is what I've done so far:
def prob_1_8(x):
    g = 9.80665 # m/s**2
    c = 0.2028 # kg/m
    m = 80 # kg
    def F(x, y):
        return array([
            y[1],
            g - (c / m) * ((y[1]) ** 2)
        ])
    X, Y = runge_kutta_4(F, 0, array([0, 0]), 5000, 1000)
    for i in range(len(X)):
        if X[i] == 5000:
            return Y[i]
However, when I tried to print prob_1_8(5000), the number looks ridiculous and it displayed:
RuntimeWarning: overflow encountered in double_scalars.
According to the answer provided, I should get a value close to 84.8 when x=5000. Can someone help me with this? I don't know what's the problem and how to fix it.
Please look closely at the call X, Y = runge_kutta_4(F, 0, array([0, 0]), 5000, 1000). You are integrating over a time span of 5000 s (more than 1 hour) in steps of 1000 s (more than 16 minutes). This cannot be accurate, since most of the acceleration happens in the first 10 s. With such a large step the squared-velocity term grows without bound, which is where the overflow warning comes from.
Then the question is what exactly you are trying to pick out with the loop. Is it the speed after this time?
The limit speed is where the right-hand side is zero, vmax = sqrt(g*m/c) = 62.1972 m/s = 223.91 km/h, so the claimed value of 84.8 cannot be reached as a speed when starting from rest. The fall time to a distance x will be a little more than x/vmax, so you could use tmax = 100 + x/vmax in T, Y = runge_kutta_4(F, t0, y0, tmax, 1).
Integrating in 1 s time steps and looking for the first step at which the fall distance exceeds 5000 m gives a time of 85 s, a distance of 5013.33465614 m and a speed of 62.1972 m/s, which is, as expected, close to the limit speed.
You can get a more precise time by using (inverse) linear interpolation: at about 84.786 s you reach a distance of 5000 m with a speed of 62.1972 m/s. This is compatible with the claimed result, which is therefore a time, not a velocity.
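The procedure described above can be sketched as follows. This is not the runge_kutta_4 function from the assignment: rk4_step and fall_time are hypothetical helper names, the step size h = 0.1 s is an arbitrary choice, and the loop stops as soon as the distance passes x instead of integrating to a fixed tmax.

```python
import numpy as np

G = 9.80665   # m/s**2
C = 0.2028    # kg/m
M = 80.0      # kg

def deriv(t, y):
    # y[0] is the distance fallen, y[1] is the speed.
    return np.array([y[1], G - (C / M) * y[1] ** 2])

def rk4_step(f, t, y, h):
    # One classical fourth-order Runge-Kutta step.
    k0 = f(t, y)
    k1 = f(t + h / 2.0, y + h / 2.0 * k0)
    k2 = f(t + h / 2.0, y + h / 2.0 * k1)
    k3 = f(t + h, y + h * k2)
    return y + h / 6.0 * (k0 + 2.0 * k1 + 2.0 * k2 + k3)

def fall_time(x, h=0.1):
    # Step forward until the distance passes x (x > 0), then interpolate
    # linearly inside the last step to estimate the crossing time.
    t, y = 0.0, np.array([0.0, 0.0])
    while True:
        y_next = rk4_step(deriv, t, y, h)
        if y_next[0] >= x:
            frac = (x - y[0]) / (y_next[0] - y[0])
            return t + frac * h
        t, y = t + h, y_next

t_5000 = fall_time(5000.0)
print(t_5000)
```

Stopping on the distance rather than on the time avoids having to guess tmax in advance, and the interpolation is accurate here because the speed is essentially constant near the end of the fall.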

Gradient Descent is not converging for very large values in a small dataset

I am trying to write a program to calculate the slope and the intercept of a linear regression model but when I am running more than 10 iterations, the gradient descent function gives the np.nan value for both intercept as well as slope.
Below is my implementation
def get_gradient_at_b(x, y, b, m):
    N = len(x)
    diff = 0
    for i in range(N):
        x_val = x[i]
        y_val = y[i]
        diff += (y_val - ((m * x_val) + b))
    b_gradient = -(2/N) * diff
    return b_gradient
def get_gradient_at_m(x, y, b, m):
    N = len(x)
    diff = 0
    for i in range(N):
        x_val = x[i]
        y_val = y[i]
        diff += x_val * (y_val - ((m * x_val) + b))
    m_gradient = -(2/N) * diff
    return m_gradient
def step_gradient(b_current, m_current, x, y, learning_rate):
    b_gradient = get_gradient_at_b(x, y, b_current, m_current)
    m_gradient = get_gradient_at_m(x, y, b_current, m_current)
    b = b_current - (learning_rate * b_gradient)
    m = m_current - (learning_rate * m_gradient)
    return [b, m]
def gradient_descent(x, y, learning_rate, num_iterations):
    b = 0
    m = 0
    for i in range(num_iterations):
        b, m = step_gradient(b, m, x, y, learning_rate)
    return [b,m]
I am running it on the following data:
a=[3.87656018e+11, 4.10320300e+11, 4.15730874e+11, 4.52699998e+11,
4.62146799e+11, 4.78965491e+11, 5.08068952e+11, 5.99592902e+11,
6.99688853e+11, 8.08901077e+11, 9.20316530e+11, 1.20111177e+12,
1.18695276e+12, 1.32394030e+12, 1.65661707e+12, 1.82304993e+12,
1.82763786e+12, 1.85672212e+12, 2.03912745e+12, 2.10239081e+12,
2.27422971e+12, 2.60081824e+12]
b=[3.3469950e+10, 3.4784980e+10, 3.3218720e+10, 3.6822490e+10,
4.4560290e+10, 4.3826720e+10, 5.2719430e+10, 6.3842550e+10,
8.3535940e+10, 1.0309053e+11, 1.2641405e+11, 1.6313218e+11,
1.8529536e+11, 1.7875143e+11, 2.4981555e+11, 3.0596392e+11,
3.0040058e+11, 3.1440530e+11, 3.1033848e+11, 2.6229109e+11,
2.7585243e+11, 3.0352616e+11]
print(gradient_descent(a, b, 0.01, 100))
#result --> [nan, nan]
When I run the gradient_descent function on a dataset with smaller values, it gives the correct answers. Also I was able to obtain the intercept and slope for the above data with from sklearn.linear_model import LinearRegression
Any help will be appreciated in figuring out why the result is [nan, nan] instead of giving me the correct intercept and slope.
You need to reduce the learning rate. Since the values in a and b are so large (on the order of 1e10 to 1e12), the learning rate needs to be approximately 1e-25 for gradient descent to work at all. Otherwise every step overshoots because the gradients are huge, and the parameters diverge to inf and then nan.
b, m = gradient_descent(a, b, 5e-25, 100)
print(b, m)
Out: -3.7387067636195266e-13 0.13854551291084335
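A different technique from the tiny learning rate above, offered only as a sketch: standardize both variables first, run gradient descent on the scaled data with an ordinary learning rate, and then map the parameters back to the original units. fit_line_scaled is a hypothetical helper name, and the learning rate 0.1 and 500 iterations are arbitrary choices; the data is the one from the question.

```python
import numpy as np

def fit_line_scaled(x, y, learning_rate=0.1, num_iterations=500):
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, x_std = x.mean(), x.std()
    y_mean, y_std = y.mean(), y.std()
    xs = (x - x_mean) / x_std          # zero mean, unit variance
    ys = (y - y_mean) / y_std

    b, m = 0.0, 0.0
    for _ in range(num_iterations):
        residual = ys - (m * xs + b)
        b -= learning_rate * (-2.0 * residual.mean())
        m -= learning_rate * (-2.0 * (xs * residual).mean())

    # Undo the scaling: ys = m*xs + b  ->  y = intercept + slope*x
    slope = m * y_std / x_std
    intercept = y_mean + b * y_std - slope * x_mean
    return intercept, slope

a = [3.87656018e+11, 4.10320300e+11, 4.15730874e+11, 4.52699998e+11,
     4.62146799e+11, 4.78965491e+11, 5.08068952e+11, 5.99592902e+11,
     6.99688853e+11, 8.08901077e+11, 9.20316530e+11, 1.20111177e+12,
     1.18695276e+12, 1.32394030e+12, 1.65661707e+12, 1.82304993e+12,
     1.82763786e+12, 1.85672212e+12, 2.03912745e+12, 2.10239081e+12,
     2.27422971e+12, 2.60081824e+12]
b = [3.3469950e+10, 3.4784980e+10, 3.3218720e+10, 3.6822490e+10,
     4.4560290e+10, 4.3826720e+10, 5.2719430e+10, 6.3842550e+10,
     8.3535940e+10, 1.0309053e+11, 1.2641405e+11, 1.6313218e+11,
     1.8529536e+11, 1.7875143e+11, 2.4981555e+11, 3.0596392e+11,
     3.0040058e+11, 3.1440530e+11, 3.1033848e+11, 2.6229109e+11,
     2.7585243e+11, 3.0352616e+11]

intercept, slope = fit_line_scaled(a, b)
print(intercept, slope)
```

With a learning rate as small as 5e-25 the intercept barely moves from its starting value of 0 in 100 iterations, so scaling is worth considering if the intercept matters to you.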

Shape mismatch error with scipy.optimize.minimize for logistic regression

I am going through Andrew Ng's ML course, and I am trying to implement the programs in Python. For the second exercise, on logistic regression, I am trying to use scipy.optimize.minimize to optimize the cost function. My code is as follows.
import os
import numpy as np
from scipy.special import expit
from scipy import optimize
datafile1 = os.path.join('data','ex2data1.txt')
data1 = np.loadtxt(datafile1, delimiter=',')
exam_scores, results = data1[:, :2], data1[:, 2]
m, n = exam_scores.shape
exam_scores = np.concatenate([np.ones([m, 1]), exam_scores], axis=1)
def cost_function(x, y, theta):
    m = len(y)
    hypothesis = expit(np.dot(x, theta))
    term1 = -np.dot(y.T, np.log(hypothesis)) / m
    term2 = -np.dot((1 - y).T, np.log(1 - hypothesis)) / m
    cost = term1 + term2
    return cost
def gradient(x, y, theta):
    m = len(y)
    hypothesis = expit(np.dot(x, theta))
    return np.dot(hypothesis - y, x) / m
def minimize_cost(x, y, theta):
    output = optimize.minimize(cost_function, theta, args=(x, y),
                               jac=gradient, options={'maxiter':400})
    return output.fun, output.x
theta = np.zeros(n + 1)
theta, cost = minimize_cost(exam_scores, results, theta)
This gives me
<ipython-input-42-e2ba65cce1d8> in gradient(x, y, theta)
9 def gradient(x, y, theta):
10 m = len(y)
---> 11 hypothesis = expit(np.dot(x, theta))
12 return np.dot(hypothesis - y, x) / m
ValueError: shapes (3,) and (100,) not aligned: 3 (dim 0) != 100 (dim 0).
However the shape of theta and the output of the gradient function is the same, i.e. theta.shape == gradient(exam_scores, results, theta).shape gives me True.
I do not understand why the gradient function raises a ValueError when called from minimize, since it gives the expected output when called on its own.
Any pointers would be appreciated.
P.S. Here is a part of the data.
exam_scores[:5, :]
array([[34.62365962, 78.02469282],
[30.28671077, 43.89499752],
[35.84740877, 72.90219803],
[60.18259939, 86.3085521 ],
[79.03273605, 75.34437644]])
results.reshape(m, 1)[:5, :]
array([[0.],
[0.],
[0.],
[1.],
[1.]])
Edit: Added part of the data.

Numpy/Scipy Solve simultaneous equations with integrals in them

I am trying to use numpy and scipy to solve the following two equations:
P(z) = sgn(-cos(np.pi*D1) + cos(5*z)) * sgn(-cos(np.pi*D2) + cos(6*z))
1. 0 = 2/2pi ∫ P(z,D1,D2) * cos(5z) dz + z/L
2. 0 = 2/2pi ∫ P(z,D1,D2) * cos(6z) dz - z/L
for D1 and D2 (integral limits are 0 -> 2pi).
My code is:
def equations(p, z):
    D1, D2 = p
    period = 2*np.pi
    P1 = lambda zz, D1, D2: \
        np.sign(-np.cos(np.pi*D1) + np.cos(6.*zz)) * \
        np.sign(-np.cos(np.pi*D2) + np.cos(5.*zz)) * \
        np.cos(6.*zz)
    P2 = lambda zz, D1, D2: \
        np.sign(-np.cos(np.pi*D1) + np.cos(6.*zz)) * \
        np.sign(-np.cos(np.pi*D2) + np.cos(5.*zz)) * \
        np.cos(5.*zz)
    eq1 = 2./period * integrate.quad(P1, 0., period, args=(D1,D2), epsabs=0.01)[0] + z
    eq2 = 2./period * integrate.quad(P2, 0., period, args=(D1,D2), epsabs=0.01)[0] - z
    return (eq1, eq2)
z = np.arange(0., 1000., 0.01)
N = int(len(z))
D1 = np.empty([N])
D2 = np.empty([N])
for i in range(N):
    D1[i], D2[i] = fsolve(equations, x0=(0.5, 0.5), args=z[i])
print(D1, D2)
Unfortunately, it does not seem to converge. I don't know much about numerical methods and was hoping someone could give me a hand.
Thank you.
P.S. I'm also trying the following which should be equivalent:
import numpy as np
from scipy.optimize import fsolve
from scipy import integrate
from scipy import signal
def equations(p, z):
    D1, D2 = p
    period = 2.*np.pi
    K12 = 1./L * z
    K32 = -1./L * z + 1.
    P1 = lambda zz, D1, D2: \
        signal.square(6.*zz, duty=D1) * \
        signal.square(5.*zz, duty=D2) * \
        np.cos(6.*zz)
    P2 = lambda zz, D1, D2: \
        signal.square(6.*zz, duty=D1) * \
        signal.square(5.*zz, duty=D2) * \
        np.cos(5.*zz)
    eq1 = 2./period * integrate.quad(P1, 0., period, args=(D1,D2))[0] + K12
    eq2 = 2./period * integrate.quad(P2, 0., period, args=(D1,D2))[0] - K32
    return (eq1, eq2)
h = 0.01
L = 10.
z = np.arange(0., L, h)
N = int(len(z))
D1 = np.empty([N])
D2 = np.empty([N])
for i in range(N):
    D1[i], D2[i] = fsolve(equations, x0=(0.5, 0.5), args=z[i])
    print()
    print(z[i])
    print("%0.8f,%0.8f" % (D1[i], D2[i]))
    print()
PPS:
I implemented what you wrote (I think I understand it!), very nicely done. Thank you. Unfortunately, I don't have much skill in this field and don't really know how to make a suitable guess, so I just guess 0.5 (I also added a small amount of noise to the initial guess to try to improve it). The results I'm getting seem to have numerical errors, and I'm not sure why; I was hoping you could point me in the right direction. Essentially, I did an FFT sweep (an FFT for each duty-cycle variation, looking at the frequency component at 5, which is shown in the graph below) and found that the linear part (z/L) is slightly jagged.
PPPS:
Thank you for that, I've noted some of the techniques you've suggested. I tried to replicate your second graph, as it seems very useful. To do this, I kept D1 (D2) fixed and swept D2 (D1), and I did this for various z values. fmin did not always find the correct minimum (it depended on the initial guess), so I swept the initial guess of fmin until I found the correct answer. I get a similar answer to yours. (I think it's correct?)
Also, you might like to give me your contact details, as this is a step in finding the solution to a problem I have (I'm a student doing research), and I will most certainly acknowledge you in any papers in which this code is used.
#!/usr/bin/env python
import numpy as np
from scipy.optimize import fsolve
from scipy import integrate
from scipy import optimize
from scipy import signal
######################################################
######################################################
altsigns = np.ones(50)
altsigns[1::2] = -1
def get_breaks(x, y, a, b):
    sa = np.arange(0, 2*a, 2)
    sb = np.arange(0, 2*b, 2)
    zx = (( x + sa) % (2*a))*np.pi/a
    zx2 = ((-x + sa) % (2*a))*np.pi/a
    zy = (( y + sb) % (2*b))*np.pi/b
    zy2 = ((-y + sb) % (2*b))*np.pi/b
    zi = np.r_[np.sort(np.hstack((zx, zx2, zy, zy2))), 2*np.pi]
    if zi[0]:
        zi = np.r_[0, zi]
    return zi
def integrals(x, y, a, b):
    zi = get_breaks(x % 1., y % 1., a, b)
    sins = np.vstack((np.sin(b*zi), np.sin(a*zi)))
    return (altsigns[:zi.size-1]*(sins[:,1:] - sins[:,:-1])).sum(1) / np.array((b, a))
def equation1(p, z, d2):
    D2 = d2
    D1 = p
    I1, _ = integrals(D1, D2, deltaK1, deltaK2)
    eq1 = 1. / np.pi * I1 + z
    return abs(eq1)
def equation2(p, z, d1):
    D1 = d1
    D2 = p
    _, I2 = integrals(D1, D2, deltaK1, deltaK2)
    eq2 = 1. / np.pi * I2 - z + 1
    return abs(eq2)
######################################################
######################################################
z = [0.2, 0.4, 0.6, 0.8, 1.0]#np.arange(0., 1., 0.1)
step = 0.05
deltaK1 = 5.
deltaK2 = 6.
f = open('data.dat', 'w')
D = np.arange(0.0, 1.0, step)
D1eq1 = np.empty([len(D)])
D2eq2 = np.empty([len(D)])
D1eq1Err = np.empty([len(D)])
D2eq2Err = np.empty([len(D)])
for n in z:
    for i in range(len(D)):
        # Fix D2 and solve for D1.
        for guessD1 in np.arange(0.,1.,0.1):
            D2 = D
            tempD1 = optimize.fmin(equation1, guessD1, args=(n, D2[i]), disp=False, xtol=1e-8, ftol=1e-8, full_output=True)
            if tempD1[1] < 1.e-6:
                D1eq1Err[i] = tempD1[1]
                D1eq1[i] = tempD1[0][0]
                break
            else:
                D1eq1Err[i] = -1.
                D1eq1[i] = -1.
        # Fix D1 and solve for D2.
        for guessD2 in np.arange(0.,1.,0.1):
            D1 = D
            tempD2 = optimize.fmin(equation2, guessD2, args=(n, D1[i]), disp=False, xtol=1e-8, ftol=1e-8, full_output=True)
            if tempD2[1] < 1.e-6:
                D2eq2Err[i] = tempD2[1]
                D2eq2[i] = tempD2[0][0]
                break
            else:
                D2eq2Err[i] = -2.
                D2eq2[i] = -2.
    for i in range(len(D)):
        f.write('%0.8f,%0.8f,%0.8f,%0.8f,%0.8f\n' %(D[i], D1eq1[i], D2eq2[i], D1eq1Err[i], D2eq2Err[i]))
    f.write('\n\n')
f.close()
This is a very ill-posed problem. Let's recap what you are trying to do:
You want to solve 100000 optimization problems.
Each optimization problem is 2-dimensional, so you need O(10000) function evaluations (estimating O(100) function evaluations for a 1D optimization problem).
Each function evaluation depends on the evaluation of two numerical integrals.
The integrands contain jumps, i.e. they are not continuously differentiable (they are not even continuous).
The integrands are composed of periodic functions, so they have multiple minima and maxima.
So you are in for a very hard time. In addition, even in the most optimistic estimate, in which every factor in the integrand with magnitude < 1 is replaced by 1, the integrals can only take values between -2*pi and 2*pi, and much less than that in reality. So you can already see that you only have a chance of a solution for
I1 - z = 0
I2 + z = 0
for very small values of z. So there is no point in trying up to z = 1000.
I am almost certain that this is not the problem you need to solve. (I cannot imagine a context in which such a problem would appear. It seems like a weird twist on Fourier coefficient computation...) But in case you insist, your best bet is to work on the inner loop first.
As you noted, the numerical evaluation of the integrals is subject to large errors. This is due to the jumps introduced by the sgn() function. Functions such as scipy.integrate.quad() tend to use higher order algorithms which assume that the integrands are smooth. If they are not, they perform very badly. You either need to hand-pick an algorithm that can deal with jumps or, much better in this case, do the integrals by hand:
The following algorithm calculates the jump points of the sgn() function and then evaluates the analytic integrals on all pieces:
altsigns = np.ones(50)
altsigns[1::2] = -1
def get_breaks(x, y, a, b):
    sa = np.arange(0, 2*a, 2)
    sb = np.arange(0, 2*b, 2)
    zx = (( x + sa) % (2*a))*np.pi/a
    zx2 = ((-x + sa) % (2*a))*np.pi/a
    zy = (( y + sb) % (2*b))*np.pi/b
    zy2 = ((-y + sb) % (2*b))*np.pi/b
    zi = np.r_[np.sort(np.hstack((zx, zx2, zy, zy2))), 2*np.pi]
    if zi[0]:
        zi = np.r_[0, zi]
    return zi
def integrals(x, y, a, b):
    zi = get_breaks(x % 1., y % 1., a, b)
    sins = np.vstack((np.sin(b*zi), np.sin(a*zi)))
    return (altsigns[:zi.size-1]*(sins[:,1:] - sins[:,:-1])).sum(1) / np.array((b, a))
This gets rid of the problem of the numerical integration. It is very accurate and fast. However, even the integrals will not be perfectly continuous for all parameters, so in order to solve your optimization problem, you are better off using an algorithm that doesn't rely on the existence of any derivatives. A derivative-free choice in scipy is scipy.optimize.fmin() (the Nelder-Mead simplex method), which you can use like:
from scipy.optimize import fmin
def equations2(p, z):
    x, y = p
    I1, I2 = integrals(x, y, 6., 5.)
    fact = 1. / np.pi
    eq1 = fact * I1 + z
    eq2 = fact * I2 - z
    return eq1, eq2
def norm2(p, z):
    eq1, eq2 = equations2(p, z)
    return eq1**2 + eq2**2 # this has the minimum when eq1 == eq2 == 0
z = 0.25
res = fmin(norm2, (0.25, 0.25), args=(z,), xtol=1e-8, ftol=1e-8)
print(res)
# -> [ 0.3972 0.5988]
print(equations2(res, z))
# -> (-2.7285737558280232e-09, -2.4748670890417657e-09)
You are still left with the problem of finding suitable starting values for all z, which is still a tricky business. Good Luck!
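If you want to convince yourself that the piecewise closed-form integrals are right, one option is to compare them with a brute-force midpoint sum on a very fine grid. The sketch below repeats get_breaks and integrals so that it runs on its own; brute_force is a hypothetical helper, and the test point (0.3, 0.6) and the grid size are arbitrary choices.

```python
import numpy as np

altsigns = np.ones(50)
altsigns[1::2] = -1

def get_breaks(x, y, a, b):
    # Jump points of the two sign factors on [0, 2*pi], sorted.
    sa = np.arange(0, 2*a, 2)
    sb = np.arange(0, 2*b, 2)
    zx = (( x + sa) % (2*a))*np.pi/a
    zx2 = ((-x + sa) % (2*a))*np.pi/a
    zy = (( y + sb) % (2*b))*np.pi/b
    zy2 = ((-y + sb) % (2*b))*np.pi/b
    zi = np.r_[np.sort(np.hstack((zx, zx2, zy, zy2))), 2*np.pi]
    if zi[0]:
        zi = np.r_[0, zi]
    return zi

def integrals(x, y, a, b):
    # Sum the analytic integral of +/- cos over every piece between jumps.
    zi = get_breaks(x % 1., y % 1., a, b)
    sins = np.vstack((np.sin(b*zi), np.sin(a*zi)))
    return (altsigns[:zi.size-1]*(sins[:,1:] - sins[:,:-1])).sum(1) / np.array((b, a))

def brute_force(x, y, a, b, n=400000):
    # Midpoint rule on n equal cells; slow and only roughly accurate,
    # but independent of the break-point bookkeeping above.
    dz = 2*np.pi / n
    z = (np.arange(n) + 0.5) * dz
    P = np.sign((np.cos(a*z) - np.cos(np.pi*x)) * (np.cos(b*z) - np.cos(np.pi*y)))
    return np.array([np.sum(P*np.cos(b*z)), np.sum(P*np.cos(a*z))]) * dz

analytic = integrals(0.3, 0.6, 6., 5.)
numeric = brute_force(0.3, 0.6, 6., 5.)
print(analytic, numeric)
```

The two results should agree to within the resolution of the grid; the brute-force version is only a cross-check and is far too slow to use inside the optimization.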
Edit
To check whether you still have numerical errors, plug the result of the optimization back into the equations and see if they are satisfied to the required accuracy, which is what I did above. Note that I used (0.25, 0.25) as a starting value, since starting at (0.5, 0.5) didn't lead to convergence. This is normal for optimization problems with local minima (such as yours). There is no better way to deal with this than trying multiple starting values and rejecting non-converged results. In the case above, if equations2(res, z) returns anything larger in magnitude than, say, (1e-6, 1e-6), I would reject the result and try again with a different starting value. A very useful technique for successive optimization problems is to use the result of the previous problem as the starting value for the next problem.
Note however that you have no guarantee of a smooth solution for D1(z) and D2(z). Just a tiny change in D1 could push one break point off the integration interval, resulting in a big change of the value of the integral. The algorithm may well adjust by using D2, leading to jumps in D1(z) and D2(z). Note also that you can take any result modulo 1, due to the symmetries of cos(pi*D1).
The bottom line: There shouldn't be any remaining numerical inaccuracies if you use the analytical formula for the integrals. If the residuals are less than the accuracy you specified, this is your solution. If they are not, you need to find better starting values. If you can't, a solution may not exist. If the solutions are not continuous as a function of z, that is also expected, since your integrals are not continuous. Good luck!
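The "try several starting values and reject non-converged results" idea can be sketched generically as below. multistart_fmin and toy_objective are hypothetical names; the toy objective is a made-up 1-D function with one spurious local minimum and one true zero, standing in for norm2, and the tolerance 1e-6 mirrors the rejection threshold mentioned above.

```python
import numpy as np
from scipy.optimize import fmin

def multistart_fmin(objective, starts, tol=1e-6, args=()):
    # Run Nelder-Mead from each start in turn and accept the first result
    # whose objective value is below tol; otherwise report failure.
    for start in starts:
        p, fval = fmin(objective, start, args=args, xtol=1e-8, ftol=1e-8,
                       disp=False, full_output=True)[:2]
        if fval < tol:
            return p
    return None

def toy_objective(p):
    # Zero only at x = 1, with a local minimum near x = -0.95
    # where the value stays well above zero.
    x = p[0]
    return (x**2 - 1.0)**2 + 0.1 * (x - 1.0)**2

# The first start falls into the local minimum and is rejected;
# the second one reaches the true zero.
solution = multistart_fmin(toy_objective, starts=[[-1.0], [0.5]])
print(solution)
```

For a sweep over z, the list of starts could begin with the accepted result of the previous z, followed by a few fixed fallbacks.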
Edit 2
It appears your equations have two solutions in the interval z in [0, ~0.46], and no solutions for z > 0.46, see the first figure below. To prove this, see the good old graphical solution in the second figure below. The contours represent solutions of Eq. 1 (vertical) and Eq. 2 (horizontal), for different z. You can see that the contours cross twice for z < 0.46 (two solutions) and not at all for z > 0.46 (no solution that simultaneously satisfies both equations). If this is not what you expected, you need to write down different equations (which was my suspicion in the first place...)
Here is the final code I was using:
import numpy as np
from numpy import sin, cos, sign, pi, arange, sort, concatenate
from scipy.optimize import fmin
a = 6.0
b = 5.0
def P(z, x, y):
    return sign((cos(a*z) - cos(pi*x)) * (cos(b*z) - cos(pi*y)))
def P1(z, x, y):
    return P(z, x, y) * cos(b*z)
def P2(z, x, y):
    return P(z, x, y) * cos(a*z)
altsigns = np.ones(50)
altsigns[1::2] = -1
twopi = 2*pi
pi_a = pi/a
da = 2*pi_a
pi_b = pi/b
db = 2*pi_b
lim = np.array([0., twopi])
def get_breaks(x, y):
    zx = arange(x*pi_a, twopi, da)
    zx2 = arange((2-x)*pi_a, twopi, da)
    zy = arange(y*pi_b, twopi, db)
    zy2 = arange((2-y)*pi_b, twopi, db)
    zi = sort(concatenate((lim, zx, zx2, zy, zy2)))
    return zi
ba = np.array((b, a))[:,None]
fact = np.array((1. / b, 1. / a))
def integrals(x, y):
    zi = get_breaks(x % 1., y % 1.)
    sins = sin(ba*zi)
    return fact * (altsigns[:zi.size-1]*(sins[:,1:] - sins[:,:-1])).sum(1)
def equations2(p, z):
    x, y = p
    I1, I2 = integrals(x, y)
    fact = 1. / pi
    eq1 = fact * I1 + z
    eq2 = fact * I2 - z
    return eq1, eq2
def norm2(p, z):
    eq1, eq2 = equations2(p, z)
    return eq1**2 + eq2**2
def eval_integrals(Nx=100, Ny=101):
    x = np.arange(Nx) / float(Nx)
    y = np.arange(Ny) / float(Ny)
    I = np.zeros((Nx, Ny, 2))
    for i in range(Nx):
        xi = x[i]
        Ii = I[i]
        for j in range(Ny):
            Ii[j] = integrals(xi, y[j])
    return x, y, I
def solve(z, start=(0.25, 0.25)):
    N = len(z)
    res = np.zeros((N, 2))
    res.fill(np.nan)
    for i in range(N):
        if i < 100:
            prev = start
        prev = fmin(norm2, prev, args=(z[i],), xtol=1e-8, ftol=1e-8)
        if norm2(prev, z[i]) < 1e-7:
            res[i] = prev
        else:
            break
    return res
#x, y, I = eval_integrals(Nx=1000, Ny=1001)
#zlvl = np.arange(0.2, 1.2, 0.2)
#contour(x, y, -I[:,:,0].T/pi, zlvl)
#contour(x, y, I[:,:,1].T/pi, zlvl)
N = 1000
z = np.linspace(0., 1., N)
res = np.zeros((N, 2, 2))
res[:,0,:] = solve(z, (0.25, 0.25))
res[:,1,:] = solve(z, (0.05, 0.95))
