Having trouble using scipy.optimize.leastsq - Python

Everything else works fine, but when I use the leastsq function, the PyDev editor shows an error that says Undefined variable from import: leastsq. What is going on here?
The code is MIT's Python cost model (timing.py), at the URL: http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/readings/python-cost-model/timing.py
and the leastsq part is in this function:
def fit2(A, b):
    """ Relative error minimizer """
    def f(x):
        assert len(x) == len(A[0])
        resids = []
        for i in range(len(A)):
            sum = 0.0
            for j in range(len(A[0])):
                sum += A[i][j]*x[j]
            relative_error = (sum-b[i])/b[i]
            resids.append(relative_error)
        return resids
    ans = scipy.optimize.leastsq(f, [0.0]*len(A[0]))
    # print "ans:", ans
    if len(A[0]) == 1:
        x = [ans[0]]
    else:
        x = ans[0]
    resids = sum([r*r for r in f(x)])
    return (x, resids, 0, 0)

It seems to me that you're giving the leastsq function only two arguments, the residual function and the initial values, but not the actual data values over which the least squares is to be computed; in this code the data is only reachable through the closure over A and b.

Instead of hard-coding the calculation of the residuals, try wrapping the residuals as a function that is the difference between the data values and the function to minimize.
For example, fitting a Gaussian function to some data set:
import numpy as np
from scipy.optimize import leastsq

M = np.array(data)   # your data as an Nx2 matrix of (x, y) data points
initials = [3, 2, 1] # just some initial guess values

def gaussian(x, p):
    return p[0]*np.exp((-(x - p[1])**2.0)/p[2]**2.0)  # definition of the model function

def residuals(p, y, x):
    return y - gaussian(x, p)  # definition of the residual

cnsts = leastsq(residuals, initials, args=(M[:,1], M[:,0]))[0]  # outputs the optimized parameters
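To illustrate (this is not part of the original answer), here is a self-contained run of that approach on synthetic data; the true parameters (3.0, 2.0, 1.5) and the noise level are made up for the example:

import numpy as np
from scipy.optimize import leastsq

# Hypothetical (x, y) data drawn from a known Gaussian plus a little noise
rng = np.random.default_rng(0)
x = np.linspace(-5, 5, 200)
y = 3.0*np.exp(-((x - 2.0)**2.0)/1.5**2.0) + rng.normal(0, 0.05, x.size)
M = np.column_stack([x, y])

def gaussian(x, p):
    return p[0]*np.exp((-(x - p[1])**2.0)/p[2]**2.0)

def residuals(p, y, x):
    return y - gaussian(x, p)

cnsts = leastsq(residuals, [3, 2, 1], args=(M[:,1], M[:,0]))[0]
print(cnsts)  # should come out close to [3.0, 2.0, 1.5]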

Related

Numpy Array Index Issue

I am trying to implement Newton's method for the multivariable case. I am trying to output multiple iterations, but am getting an indexing error.
import symengine
from symengine import var
import numpy as np

vars = var("x y")
sol = []

def jacob():
    f = ['2*x+y**2-8', 'x**2-y**2+x*y-3']   # define the system of functions
    J = symengine.zeros(len(f), len(vars))  # initialise the Jacobian matrix
    # fill the Jacobian matrix with entries
    for i, fi in enumerate(f):
        for j, s in enumerate(vars):
            J[i, j] = symengine.diff(fi, s)
            sol.append(J[i, j])
    a = np.array([sol[0], sol[1], sol[2], sol[3]])
    return a

def eval1(func, val1, val2):
    z = []
    for b in func:
        b = str(b)
        x = val1
        y = val2
        z.append(eval(b))
    return np.array([[z[0], z[1]], [z[2], z[3]]])

def newtons_method(f, df, x0, e):
    q = 10
    while q > e:
        q = q - 1
        x0 = np.absolute(x0 - (np.array([[f(x0[0][0])], [df(x0[1][0])]]))/(eval1(jacob(), x0[0][0], x0[1][0])))
        print('Root is at: ', x0)
        print('f(x) at root is: ', f(x0))

def f(x):
    return 2*x+y**2-8

def df(x):
    return x**2-y**2+x*y-3

x0s = np.array([[1], [1]])
for x0 in x0s:
    newtons_method(f, df, x0, 1)
I am getting the error "invalid index to scalar variable" where I set the value of x0 inside the newtons_method function. Any ideas as to what is going wrong?
When calling the newtons_method function, an element of [[1],[1]], i.e. [1], is being passed as the parameter x0. Inside the function, x0[0][0] is used, which gives the invalid index to scalar variable error.
You should pass x0s as the parameter if you want to use both elements:
x0s = np.array([[1],[1]])
newtons_method(f, df, x0s, 1)
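As a side note beyond the indexing fix (this is a sketch of the standard textbook method, not the poster's code): a multivariate Newton iteration normally solves J(x)*step = -F(x) at each step rather than dividing elementwise. For this particular 2x2 system, that might look like:

import numpy as np

def F(v):
    x, y = v
    return np.array([2*x + y**2 - 8, x**2 - y**2 + x*y - 3])

def J(v):
    # Jacobian of F, computed by hand for this system
    x, y = v
    return np.array([[2.0,     2*y],
                     [2*x + y, x - 2*y]])

v = np.array([1.0, 1.0])
for _ in range(50):
    step = np.linalg.solve(J(v), -F(v))  # solve J*step = -F
    v = v + step
    if np.linalg.norm(step) < 1e-12:
        break
print('Root is at:', v)
print('F at root:', F(v))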

Scipy `fmin_cg` args do not match my function's args

I am trying to build a linear regression model and find optimal values using the fmin_cg optimizer.
I have two functions for this job: first linear_reg_cost, which is the cost function, and second linear_reg_grad, which is the gradient of the cost function. Both functions take the same arguments.
def hypothesis(x, theta):
    return np.dot(x, theta)
Cost function:
def linear_reg_cost(x_flatten, y, theta_flatten, lambda_, num_of_features, num_of_samples):
    x = x_flatten.reshape(num_of_samples, num_of_features)
    m, n = x.shape
    theta = theta_flatten.reshape(n, 1)
    loss = hypothesis(x, theta) - y
    regularizer = lambda_*np.sum(theta[1:, :]**2)/(2*m)
    j = np.sum(loss ** 2)/(2*m) + regularizer
    return j
Gradient function:
def linear_reg_grad(x_flatten, y, theta_flatten, lambda_, num_of_features, num_of_samples):
    x = x_flatten.reshape(num_of_samples, num_of_features)
    m, n = x.shape
    theta = theta_flatten.reshape(n, 1)
    new_theta = np.zeros(shape=(theta.shape))
    loss = hypothesis(x, theta) - y
    gradient = np.dot(x.T, loss)
    new_theta[0:, :] = gradient/m
    new_theta[1:, :] = gradient[1:, :]/m + lambda_*(theta[1:, ]/m)
    return new_theta
and fmin_cg:
from scipy.optimize import fmin_cg

theta = np.ones(n)
new_theta = fmin_cg(f=linear_reg_cost, x0=theta, fprime=linear_reg_grad, args=(x.flatten(), y, lambda_, m, n))
Note: I flatten x as input and retrieve it in the cost and gradient functions as a matrix.
The output error:
<ipython-input-98-b29c1b8f6e58> in linear_reg_grad(x_flatten, y, theta_flatten, lambda_, num_of_features, num_of_samples)
1 def linear_reg_grad(x_flatten, y, theta_flatten, lambda_,num_of_features, num_of_samples):
----> 2 x = x_flatten.reshape(num_of_samples, num_of_features)
3 m,n = x.shape
4 theta = theta_flatten.reshape(n,1)
5 new_theta = np.zeros(shape=(theta.shape))
ValueError: cannot reshape array of size 2 into shape (2,12)
Note: x.shape = (12, 2), y.shape = (12, 1), theta.shape = (2,), so num_of_features = 2 and num_of_samples = 12. But the error shows that my input x is being passed in place of theta. Why is this happening even though I explicitly assigned args in fmin_cg? And how should I solve this problem?
Thanks for any advice.
All of your implementations are correct, but you have a little mistake: be careful to pass the arguments to both of your functions in the right order.
Your problem is the order of num_of_features and num_of_samples. You can swap their positions with each other in linear_reg_grad and linear_reg_cost; of course, you should also change this order in the args argument of scipy.optimize.fmin_cg.
The second important thing is that the first argument of the function minimized by fmin_cg is the variable you want to update at each step and find the optimal value of. So in your solution, the first parameter of your cost and gradient functions must be theta, not x, which is your input.
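To make that concrete, here is a minimal corrected sketch. The synthetic data below is invented to match the shapes in the question (12 samples, 2 features). Note that theta_flatten is now the first parameter of both functions, args supplies the remaining arguments in the same order, and the gradient is returned flattened, since fmin_cg expects a 1-D gradient:

import numpy as np
from scipy.optimize import fmin_cg

def hypothesis(x, theta):
    return np.dot(x, theta)

def linear_reg_cost(theta_flatten, x_flatten, y, lambda_, num_of_features, num_of_samples):
    x = x_flatten.reshape(num_of_samples, num_of_features)
    m, n = x.shape
    theta = theta_flatten.reshape(n, 1)
    loss = hypothesis(x, theta) - y
    regularizer = lambda_*np.sum(theta[1:, :]**2)/(2*m)
    return np.sum(loss**2)/(2*m) + regularizer

def linear_reg_grad(theta_flatten, x_flatten, y, lambda_, num_of_features, num_of_samples):
    x = x_flatten.reshape(num_of_samples, num_of_features)
    m, n = x.shape
    theta = theta_flatten.reshape(n, 1)
    loss = hypothesis(x, theta) - y
    new_theta = np.dot(x.T, loss)/m
    new_theta[1:, :] += lambda_*theta[1:, :]/m
    return new_theta.flatten()  # fmin_cg expects a 1-D gradient

# Invented data with the shapes from the question: x is (12, 2), y is (12, 1)
m_, n_ = 12, 2
x = np.hstack([np.ones((m_, 1)), np.arange(m_, dtype=float).reshape(m_, 1)])
y = 1.0 + 2.0*x[:, 1:2]
new_theta = fmin_cg(f=linear_reg_cost, x0=np.ones(n_), fprime=linear_reg_grad,
                    args=(x.flatten(), y, 0.0, n_, m_))
print(new_theta)  # should come out close to [1, 2]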

Python lambda function with arrays as parameters

I am trying to define a function of n variables to fit to a data set. The function looks like this.
[image in the original post: the Kelly function, a ratio of two even polynomials, (sum_i a_i x^(2i)) / (sum_j b_j x^(2j))]
I then want to find the optimal ai's and bj's to fit my data set using scipy.optimize.leastsq
Here's my code so far.
from scipy.optimize import leastsq
import numpy as np

def kellyFunc(a, b, x): # function to fit
    top = 0
    bot = 0
    a = [a]
    b = [b]
    for i in range(len(a)):
        top = top + a[i]*x**(2*i)
        bot = bot + b[i]*x**(2*i)
    return(top/bot)

def fitKelly(x, y, n):
    line = lambda params, x : kellyFunc(params[0,:], params[1,:], x) # lambda function to minimize
    error = lambda params, x, y : line(params, x) - y # Kelly - dataset
    paramsInit = [[1 for x in range(n)] for y in range(2)] # define all ai and bi = 1 for initial guess
    paramsFin, success = leastsq(error, paramsInit, args = (x, y)) # run leastsq optimization
    # line of best fit
    xx = np.linspace(x.min(), x.max(), 100)
    yy = line(paramsFin, xx)
    return(paramsFin, xx, yy)
At the moment it's giving me the error:
"IndexError: too many indices" because of the way I've defined my initial lambda function with params[0,:] and params[1,:].
There are a few problems with your approach that make me write a full answer.
As for your specific question: leastsq doesn't really expect multidimensional arrays as parameter input. The documentation doesn't make this clear, but parameter inputs are flattened when passed to the objective function. You can verify this by using full functions instead of lambdas:
from scipy.optimize import leastsq
import numpy as np

def kellyFunc(a, b, x): # function to fit
    top = 0
    bot = 0
    for i in range(len(a)):
        top = top + a[i]*x**(2*i)
        bot = bot + b[i]*x**(2*i)
    return(top/bot)

def line(params, x):
    print(repr(params))            # params is 1d!
    params = params.reshape(2, -1) # need to reshape back
    return kellyFunc(params[0,:], params[1,:], x)

def error(params, x, y):
    print(repr(params))        # params is 1d!
    return line(params, x) - y # pass it on, reshape in line()

def fitKelly(x, y, n):
    #paramsInit = [[1 for x in range(n)] for y in range(2)] # define all ai and bi = 1 for initial guess
    paramsInit = np.ones((n, 2)) # better
    paramsFin, success = leastsq(error, paramsInit, args = (x, y)) # run leastsq optimization
    # line of best fit
    xx = np.linspace(x.min(), x.max(), 100)
    yy = line(paramsFin, xx)
    return(paramsFin, xx, yy)
Now, as you see, the shape of the params array is (2*n,) instead of (2, n). By doing the re-reshape ourselves, your code (almost) works. Of course, the print calls are only there to show you this fact; they are not needed for the code to run (and would produce a bunch of needless output in each iteration).
See my other changes, related to other errors: you had a = [a] and b = [b] in your kellyFunc, for no good reason. This turned the input arrays into lists containing arrays, which made the subsequent loop do something very different from what you intended.
Finally, the sneakiest error: you have input variables named x and y in fitKelly, and then you use x and y as loop variables in a list comprehension. Please be aware that this only works as you expect in Python 3; in Python 2, the internal variables of list comprehensions leak into the enclosing scope, overwriting your input variables named x and y.
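To make the fix concrete, a hypothetical end-to-end call could look like the following (the sample data and n = 2 are invented for illustration; drop the print calls first if you want quiet output):

import numpy as np

# Made-up data roughly following a ratio of even polynomials
x = np.linspace(0.1, 2.0, 50)
y = (1 + 0.5*x**2) / (1 + 0.1*x**2)

paramsFin, xx, yy = fitKelly(x, y, 2)
print(paramsFin.reshape(2, -1))  # row 0: the a's, row 1: the b's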

Scipy, differential evolution

The thing is, I'm trying to design a fitting procedure for my purposes and want to use scipy's differential evolution algorithm as a general estimator of initial values, which will then be used in the LM algorithm for better fitting. The function I want to minimize with DE is the least squares between an analytically defined non-linear function and some experimental values. The point at which I'm stuck is the function design. As stated in the scipy reference: "function must be in the form f(x, *args), where x is the argument in the form of a 1-D array and args is a tuple of any additional fixed parameters needed to completely specify the function".
Here is an ugly example of code which I wrote just for illustrative purposes:
from scipy.optimize import differential_evolution

def func(x, *args):
    """args[0] = x
       args[1] = y"""
    result = 0
    for i in range(len(args[0][0])):
        result += (x[0]*(args[0][0][i]**2) + x[1]*(args[0][0][i]) + x[2] - args[0][1][i])**2
    return result**0.5

if __name__ == '__main__':
    bounds = [(1.5, 0.5), (-0.3, 0.3), (0.1, -0.1)]
    x = [0, 1, 2, 3, 4]
    y = [i**2 for i in x]
    args = (x, y)
    result = differential_evolution(func, bounds, args=args)
    print(func(bounds, args))
I wanted to supply the raw data as a tuple into the function, but it seems that this is not how it is supposed to be done, since the interpreter isn't happy with the function. The problem should be easily solvable, but I'm really frustrated, so advice will be much appreciated.

This is a kinda straightforward solution which shows the idea; the code isn't very pythonic, but for simplicity I think it's good enough. OK, as an example, say we want to fit an equation of the kind y = ax^2 + bx + c to data obtained from the equation y = x^2. Obviously the parameter a should equal 1, and b, c should equal 0. Since the differential evolution algorithm finds the minimum of a function, we want to find the minimum of the root mean square deviation (again, for simplicity) between the analytic solution of the general equation (y = ax^2 + bx + c) with the given parameters and the "experimental" data. So, to the code:
from scipy.optimize import differential_evolution

def func(parameters, *data):
    # we have 3 parameters which will be passed as "parameters" and
    # "experimental" x, y which will be passed as "data"
    a, b, c = parameters
    x, y = data
    result = 0
    for i in range(len(x)):
        result += (a*x[i]**2 + b*x[i] + c - y[i])**2
    return result**0.5

if __name__ == '__main__':
    # initial guess for variation of parameters
    #           a             b            c
    bounds = [(1.5, 0.5), (-0.3, 0.3), (0.1, -0.1)]
    # producing "experimental" data
    x = [i for i in range(6)]
    y = [x**2 for x in x]
    # packing "experimental" data into args
    args = (x, y)
    result = differential_evolution(func, bounds, args=args)
    print(result.x)
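Since the original goal was to feed the DE estimate into an LM fit, one way to chain the two is sketched below. It assumes scipy.optimize.least_squares with method='lm', and writes the bounds as (min, max) pairs, which is what differential_evolution expects:

import numpy as np
from scipy.optimize import differential_evolution, least_squares

x = np.arange(6, dtype=float)
y = x**2

def residuals(p, x, y):
    a, b, c = p
    return a*x**2 + b*x + c - y  # residual vector for least_squares

def scalar_cost(p, *data):
    return np.sqrt(np.sum(residuals(p, *data)**2))  # scalar cost for DE

bounds = [(0.5, 1.5), (-0.3, 0.3), (-0.1, 0.1)]
de_result = differential_evolution(scalar_cost, bounds, args=(x, y))
lm_result = least_squares(residuals, de_result.x, args=(x, y), method='lm')
print(lm_result.x)  # should be close to [1, 0, 0]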

How do I put 2 matrices into scipy.optimize.minimize?

I work with the scipy.optimize.minimize function. My purpose is to get w, z which minimize f(w, z).
Both w and z are n-by-m matrices, e.g.:
[[1,1,1,1],
 [2,2,2,2]]
f(w, z) receives the parameters w and z.
I already tried the form given below:
I already tried the form given below:
def f(x):
    w = x[0]
    z = x[1]
    ...

minimize(f, [w, z])
but minimize does not work well.
What is the valid form to pass two n-by-m matrices into scipy.optimize.minimize?
Optimize needs a 1D vector to optimize. You are on the right track: you need to flatten your argument to minimize, and then in f, start with x = np.reshape(x, (2, m, n)), then pull out w and z, and you should be in business.

I've run into this issue before, for example when optimizing parts of vectors in multiple different classes at the same time. I typically wind up with a function that maps things to a 1D vector, and then another function that pulls the data back out into the objects, so I can evaluate the cost function. As in:
import numpy as np
from scipy.optimize import minimize

def toVector(w, z):
    assert w.shape == (2, 4)
    assert z.shape == (2, 4)
    return np.hstack([w.flatten(), z.flatten()])

def toWZ(vec):
    assert vec.shape == (2*2*4,)
    return vec[:2*4].reshape(2, 4), vec[2*4:].reshape(2, 4)

def doOptimization(f_of_w_z, w0, z0):
    def f(x):
        w, z = toWZ(x)
        return f_of_w_z(w, z)
    result = minimize(f, toVector(w0, z0))
    # Different optimize functions return their
    # vector result differently. In this case it's result.x:
    result.x = toWZ(result.x)
    return result
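A hypothetical usage of that pattern, with a made-up cost function (the sum of squared entries of both matrices, which is minimized at w = z = 0):

import numpy as np

def my_cost(w, z):
    return np.sum(w**2) + np.sum(z**2)  # toy cost, minimal at all zeros

w0 = np.ones((2, 4))
z0 = np.ones((2, 4))
result = doOptimization(my_cost, w0, z0)
w_opt, z_opt = result.x  # already unpacked back into the two matrices
print(w_opt)
print(z_opt)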
