I work with the scipy.optimize.minimize function.
My goal is to find the w, z that minimize f(w, z).
Both w and z are n by m matrices:
[[1,1,1,1],
[2,2,2,2]]
f(w, z) takes w and z as parameters.
I already tried the form given below:
def f(x):
    w = x[0]
    z = x[1]
    ...

minimize(f, [w, z])
but minimize does not work this way.
What is the correct way to pass two n-by-m matrices into scipy.optimize.minimize?
minimize needs a 1-D vector to optimize over, but you are on the right track. Flatten your argument before passing it to minimize, then inside f start with x = np.reshape(x, (2, n, m)), pull w and z back out, and you should be in business.
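A minimal sketch of that idea (the quadratic objective below is only a placeholder):

import numpy as np
from scipy.optimize import minimize

n, m = 2, 4
w0 = np.ones((n, m))
z0 = 2 * np.ones((n, m))

def f(x):
    # minimize hands us a flat 1-D array; reshape it back into the two matrices
    w, z = np.reshape(x, (2, n, m))
    return np.sum((w - 3)**2) + np.sum((z - 5)**2)  # placeholder objective

res = minimize(f, np.concatenate([w0.ravel(), z0.ravel()]))
w_opt, z_opt = np.reshape(res.x, (2, n, m))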
I've run into this issue before, for example when optimizing parts of vectors held in multiple different classes at the same time. I typically wind up with one function that maps everything to a 1-D vector and another function that pulls the data back out into the objects so I can evaluate the cost function. As in:
import numpy as np
from scipy.optimize import minimize

def toVector(w, z):
    assert w.shape == (2, 4)
    assert z.shape == (2, 4)
    return np.hstack([w.flatten(), z.flatten()])

def toWZ(vec):
    assert vec.shape == (2*2*4,)
    return vec[:2*4].reshape(2, 4), vec[2*4:].reshape(2, 4)

def doOptimization(f_of_w_z, w0, z0):
    def f(x):
        w, z = toWZ(x)
        return f_of_w_z(w, z)

    result = minimize(f, toVector(w0, z0))
    # Different optimize functions return their
    # vector result differently. In this case it's result.x:
    result.x = toWZ(result.x)
    return result
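A quick usage sketch (the quadratic my_cost below is just a made-up objective for illustration):

w0 = np.zeros((2, 4))
z0 = np.zeros((2, 4))

def my_cost(w, z):
    # Any f(w, z) returning a scalar works here.
    return np.sum((w - 1.0)**2) + np.sum((z + 2.0)**2)

result = doOptimization(my_cost, w0, z0)
w_best, z_best = result.x  # already unpacked back into two (2, 4) matrices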
I'm trying to use scipy's curve_fit to capture the value of the a0 parameter. As of now, it is not changing (it always comes out as 1):
X = [[1,2,3],[4,5,6]]

def func(X, a0, c):
    x1 = X[0]; x2 = X[1]
    a = x1*x2
    result = np.where(a(a<a0), -c*(a + np.sqrt(x2)), -c*x1)
    return result

Popt, Cov = scipy.curve_fit(func, X, y)
a0, c = Popt
Predicted = func(X, a0, c)  # a0 and c are constants
I get the value for c, which is a scalar, without any problem, but I can't explain why a0 (also a scalar) always comes out as 1, and I am not sure how to fix it. I did see elsewhere on SO that np.where can be used the way I have used it here, but apparently not with curve_fit. Maybe I need to use a different method of optimization, and I'd like some pointers to doing this with scipy methods.
Edit: I tried the construct suggested by Brad, but that's not it.
Updated!
This should work. Note that the variable a in this example is a vector of length 3, because it is computed by the element-wise multiplication of the first and second rows of X, which is a 2x3 matrix. Therefore a0 can be either a scalar or a vector of length 3, and c can also be either a scalar or a vector of length 3.
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6]])

a0 = np.array([8, 25, 400])
# a0 = 2  # Uncomment this for a scalar a0

# The code works whether c is a scalar or an array, since it can be
# broadcast against the array a below.
# c = 3  # Uncomment this for a scalar c
c = np.array([8, 12, 2000])  # element-wise

def func(X, a0, c):
    x = X[0]
    y = X[1]
    a = x * y
    print(a.shape)
    result = np.where(a < a0, c * (a + np.sqrt(y)), c * x)
    return result

func(X, a0, c)
This is the minimal amount of code that works. Notice I removed the y > 0 condition and defined a to be the same size as c. Now you get the correct selections, because the first argument of np.where is the same size as the second and third arguments. Before, (x < a) & (y > 0) always evaluated to a single True or False, which is a scalar in this context. If a had been an N-dimensional array, you would have received a ValueError because the operands could not be broadcast together.
import numpy as np

c = np.array([[22, 34], [33, 480]])

def func(X, a):
    x = X[0]; y = X[1]
    return np.where(c[(x < a)], -c*(a + np.sqrt(y)), -c*x)

X = [25, 600]
a = np.array([[2, 14], [33, 22]])
func(X, a)
This also works if c is a constant and a is the array you want manipulated:
import numpy as np

c = 2

def func(X, a):
    x = X[0]; y = X[1]
    return np.where(a[(x < a)], -c*(a + np.sqrt(y)), -c*x)

X = [25, 600]
a = np.array([[2, 14], [33, 22]])
func(X, a)
I am trying to build a linear regression model and find optimal values using the fmin_cg optimizer.
I have two functions for this job: first linear_reg_cost, which is the cost function, and second linear_reg_grad, which is the gradient of the cost function. Both functions take the same arguments.
def hypothesis(x, theta):
    return np.dot(x, theta)
Cost function:
def linear_reg_cost(x_flatten, y, theta_flatten, lambda_, num_of_features, num_of_samples):
    x = x_flatten.reshape(num_of_samples, num_of_features)
    theta = theta_flatten.reshape(n, 1)
    loss = hypothesis(x, theta) - y
    regularizer = lambda_*np.sum(theta[1:, :]**2)/(2*m)
    j = np.sum(loss**2)/(2*m)
    return j
Gradient function:
def linear_reg_grad(x_flatten, y, theta_flatten, lambda_, num_of_features, num_of_samples):
    x = x_flatten.reshape(num_of_samples, num_of_features)
    m, n = x.shape
    theta = theta_flatten.reshape(n, 1)
    new_theta = np.zeros(shape=(theta.shape))
    loss = hypothesis(x, theta) - y
    gradient = np.dot(x.T, loss)
    new_theta[0:, :] = gradient/m
    new_theta[1:, :] = gradient[1:, :]/m + lambda_*(theta[1:, ]/m)
    return new_theta
and fmin_cg:
from scipy.optimize import fmin_cg

theta = np.ones(n)
new_theta = fmin_cg(f=linear_reg_cost, x0=theta, fprime=linear_reg_grad,
                    args=(x.flatten(), y, lambda_, m, n))
Note: I flatten x when passing it in and reshape it back into a matrix inside the cost and gradient functions.
The output error:
<ipython-input-98-b29c1b8f6e58> in linear_reg_grad(x_flatten, y, theta_flatten, lambda_, num_of_features, num_of_samples)
1 def linear_reg_grad(x_flatten, y, theta_flatten, lambda_,num_of_features, num_of_samples):
----> 2 x = x_flatten.reshape(num_of_samples, num_of_features)
3 m,n = x.shape
4 theta = theta_flatten.reshape(n,1)
5 new_theta = np.zeros(shape=(theta.shape))
ValueError: cannot reshape array of size 2 into shape (2,12)
Note: x.shape = (12, 2), y.shape = (12, 1), theta.shape = (2,). So num_of_features = 2 and num_of_samples = 12. But the error shows that my input x is being passed in place of theta. Why is this happening even though I explicitly assigned args in fmin_cg? And how should I solve this problem?
Thanks for any advice
All of your implementations are correct, but you have a small mistake.
Be careful to pass the arguments to both of your functions in the correct order.
Your problem is the order of num_of_features and num_of_samples. You can swap their positions with each other in linear_reg_grad and linear_reg_cost; of course you should then change this order in the args argument of scipy.optimize.fmin_cg as well.
The second important thing is that the first argument fmin_cg passes to your functions is the variable it updates at each iteration and tries to optimize, i.e. x0. So in your solution, the first parameter of your functions must be theta, not your input x.
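A sketch of what the reordered version might look like (this reuses hypothesis and the question's x, y, lambda_, theta, n, m, and adds two extra adjustments of my own: the regularizer is actually added to the cost, and the gradient is returned flattened, which fmin_cg expects):

import numpy as np
from scipy.optimize import fmin_cg

def linear_reg_cost(theta_flatten, x_flatten, y, lambda_, num_of_features, num_of_samples):
    # theta comes first because fmin_cg passes the optimization variable first
    x = x_flatten.reshape(num_of_samples, num_of_features)
    theta = theta_flatten.reshape(num_of_features, 1)
    loss = hypothesis(x, theta) - y
    regularizer = lambda_*np.sum(theta[1:, :]**2)/(2*num_of_samples)
    return np.sum(loss**2)/(2*num_of_samples) + regularizer

def linear_reg_grad(theta_flatten, x_flatten, y, lambda_, num_of_features, num_of_samples):
    x = x_flatten.reshape(num_of_samples, num_of_features)
    theta = theta_flatten.reshape(num_of_features, 1)
    loss = hypothesis(x, theta) - y
    gradient = np.dot(x.T, loss)/num_of_samples
    gradient[1:, :] += lambda_*theta[1:, :]/num_of_samples
    return gradient.flatten()  # fmin_cg expects the gradient as a flat array

# args now lines up with the parameters after theta_flatten:
# (x_flatten, y, lambda_, num_of_features, num_of_samples)
new_theta = fmin_cg(f=linear_reg_cost, x0=theta, fprime=linear_reg_grad,
                    args=(x.flatten(), y, lambda_, n, m))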
Suppose I have two 2D NumPy arrays A and B. I would like to compute the matrix C whose entries are C[i, j] = f(A[i, :], B[:, j]), where f is some function that takes two 1D arrays and returns a number.
For instance, if def f(x, y): return np.sum(x * y) then I would simply have C = np.dot(A, B). However, for a general function f, are there NumPy/SciPy utilities I could exploit that are more efficient than doing a double for-loop?
For example, take def f(x, y): return np.sum(x != y) / len(x), where x and y are not simply 0/1-bit vectors.
Here is a reasonably general approach using broadcasting.
First, reshape your two matrices to be rank-four tensors.
A = A.reshape(A.shape + (1, 1))
B = B.reshape((1, 1) + B.shape)
Second, apply your function element by element without performing any reduction.
C = f(A, B) # e.g. A != B
Having reshaped your matrices allows numpy to broadcast. The resulting tensor C has shape A.shape + B.shape.
Third, apply any desired reduction by, for example, summing over the indices you want to discard:
C = C.sum(axis=(1, 3)) / C.shape[0]
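For the specific f in the question, np.sum(x != y) / len(x), here is a sketch of a leaner variant that adds just one broadcast axis on each side (A's rows and B's columns already share the middle dimension), checked against the naive double loop:

import numpy as np

A = np.random.randint(0, 3, size=(5, 4))   # (n, k)
B = np.random.randint(0, 3, size=(4, 6))   # (k, m)

# C[i, j] = np.sum(A[i, :] != B[:, j]) / A.shape[1]
C = (A[:, :, None] != B[None, :, :]).sum(axis=1) / A.shape[1]

# Check against the double loop.
C_loop = np.array([[np.sum(A[i, :] != B[:, j]) / A.shape[1]
                    for j in range(B.shape[1])]
                   for i in range(A.shape[0])])
assert np.allclose(C, C_loop)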
The thing is, I'm trying to design a fitting procedure for my purposes and want to use scipy's differential evolution algorithm as a general estimator of initial values, which will then be used in the LM algorithm for better fitting. The function I want to minimize with DE is the least squares between an analytically defined non-linear function and some experimental values. The point at which I'm stuck is the function design. As stated in the scipy reference: "function must be in the form f(x, *args), where x is the argument in the form of a 1-D array and args is a tuple of any additional fixed parameters needed to completely specify the function"
Here is an ugly code example I wrote just for illustrative purposes:
def func(x, *args):
    """args[0] = x
       args[1] = y"""
    result = 0
    for i in range(len(args[0][0])):
        result += (x[0]*(args[0][0][i]**2) + x[1]*(args[0][0][i]) + x[2] - args[0][1][i])**2
    return result**0.5

if __name__ == '__main__':
    bounds = [(1.5, 0.5), (-0.3, 0.3), (0.1, -0.1)]
    x = [0, 1, 2, 3, 4]
    y = [i**2 for i in x]
    args = (x, y)
    result = differential_evolution(func, bounds, args=args)
    print(func(bounds, args))
I wanted to supply the raw data as a tuple into the function, but it seems that is not how it is supposed to be done, since the interpreter isn't happy with the function. The problem should be easily solvable, but I'm really frustrated, so advice will be much appreciated.
This is a fairly straightforward solution which shows the idea; the code isn't very Pythonic, but for simplicity I think it's good enough. OK, as an example, we want to fit an equation of the form y = ax^2 + bx + c to data obtained from the equation y = x^2. It is obvious that the parameter a = 1 and that b, c should equal 0. Since the differential evolution algorithm finds the minimum of a function, we want to find the minimum of the root mean square deviation (again, for simplicity) between the analytic solution of the general equation (y = ax^2 + bx + c) with given parameters (providing some initial guess) and the "experimental" data. So, to the code:
from scipy.optimize import differential_evolution

def func(parameters, *data):
    # We have 3 parameters which will be passed as `parameters` and
    # the "experimental" x, y which will be passed as `data`.
    a, b, c = parameters
    x, y = data
    result = 0
    for i in range(len(x)):
        result += (a*x[i]**2 + b*x[i] + c - y[i])**2
    return result**0.5

if __name__ == '__main__':
    # Bounds for the variation of the parameters
    #           a            b            c
    bounds = [(0.5, 1.5), (-0.3, 0.3), (-0.1, 0.1)]
    # Producing "experimental" data
    x = [i for i in range(6)]
    y = [i**2 for i in x]
    # Packing "experimental" data into args
    args = (x, y)
    result = differential_evolution(func, bounds, args=args)
    print(result.x)
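As an aside, the loop inside func can also be written with NumPy array operations; func_vec below is an equivalent sketch (the name is mine) and can be passed to differential_evolution in exactly the same way:

import numpy as np

def func_vec(parameters, *data):
    # Same objective as func above: root of the sum of squared residuals.
    a, b, c = parameters
    x, y = np.asarray(data[0]), np.asarray(data[1])
    return np.sqrt(np.sum((a*x**2 + b*x + c - y)**2))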
Everything else works fine, but when I use the leastsq function the PyDev editor shows an error that says "Undefined variable from import: leastsq". What is going on here?
The code is MIT's Python cost model, timing.py, at this URL: http://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-006-introduction-to-algorithms-fall-2011/readings/python-cost-model/timing.py
and the leastsq part is in the function:
def fit2(A, b):
    """ Relative error minimizer """
    def f(x):
        assert len(x) == len(A[0])
        resids = []
        for i in range(len(A)):
            sum = 0.0
            for j in range(len(A[0])):
                sum += A[i][j]*x[j]
            relative_error = (sum - b[i])/b[i]
            resids.append(relative_error)
        return resids

    ans = scipy.optimize.leastsq(f, [0.0]*len(A[0]))
    # print "ans:", ans
    if len(A[0]) == 1:
        x = [ans[0]]
    else:
        x = ans[0]
    resids = sum([r*r for r in f(x)])
    return (x, resids, 0, 0)
It seems to me that you're giving the leastsq function only two arguments, while it needs three: you're supplying it with the function and the initial values, but not with the actual data values over which the least squares is to be computed.
Instead of hard-coding the calculation of the residuals, try wrapping the residuals in a function that returns the difference between the data values and the function you want to fit.
For example, fitting a Gaussian function to some data set:
import numpy as np
from scipy.optimize import leastsq

M = np.array(data)    # your data as an Nx2 matrix of (x, y) data points
initials = [3, 2, 1]  # just some initial guess values

def gaussian(x, p):
    return p[0]*np.exp((-(x - p[1])**2.0)/p[2]**2.0)  # definition of the function

def residuals(p, y, x):
    return y - gaussian(x, p)  # definition of the residual

cnsts = leastsq(residuals, initials, args=(M[:, 1], M[:, 0]))[0]  # outputs optimized parameters
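Continuing from the definitions above, one way to exercise this end to end is with a synthetic data set drawn from a known Gaussian (the parameter values here are arbitrary):

# Synthetic (x, y) data from a known Gaussian plus a little noise.
x = np.linspace(-5, 5, 200)
y = 3.0*np.exp(-((x - 1.5)**2.0)/2.0**2.0) + 0.05*np.random.randn(x.size)
M = np.column_stack((x, y))

cnsts = leastsq(residuals, [1.0, 0.0, 1.0], args=(M[:, 1], M[:, 0]))[0]
print(cnsts)  # roughly [3.0, 1.5, 2.0]; the width may come back with either sign since it is squared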