Optimizing a function where one of the parameters is an array - python

I want to optimize a function by varying the parameters where two of the parameters are actually arrays. I've tried to do
...
# initial parameters
params0 = np.array([p1, p2, ... , p_array1, p_array2])
p_min = minimize(myfunc, params0, args)
...
where the pj's are scalars and p_array1 and p_array2 are arrays of the same length, but this gave me an error saying
ValueError: setting an array element with a sequence.
I've also tried passing p_array1 and p_array2 into myfunc as scalars and then creating predetermined arrays from those two inside myfunc (e.g. setting p_array1 = p_array1*np.arange(6), and similarly for p_array2). That eliminates the error, but I don't want the arrays to be predetermined; instead I want 'minimize' to figure out what they should be.
Is there any way that I can utilize one of Scipy's optimization functions without getting this error while still keeping p_array1 and p_array2 as arrays and not scalars?
EDIT
Sorry for being very broad but here is my code:
NOTE: 'myfunc' here is actually norm_residual.
import pandas as pd
import numpy as np
from scipy.integrate import odeint  # needed for the odeint call below

def f(yvec, t, a, b, c, d, M, theta):
    # the system of ODEs to be solved
    x, y = yvec
    dydt = [a*x - b*y**2 + 1, -c*x - d*x*y + np.sum(M * np.cos(theta*t))]
    return dydt
ni = 3 # the number of periodic forcing functions to add to the DE system
M = 0.56*np.random.rand(ni) # the initial amplitudes of forcing functions
theta = np.pi/6*np.arange(ni) # the initial coefficients of the forcing functions
# initialize the parameters
params0 = [0.75, 0.23, 1.0, 0.2, M, theta]
# grabbing the data to be used later
data = pd.read_csv('data.csv')
y_data = data['Y']
N = y_data.shape[0] #20
t = np.linspace(0, N, N) # array of t values to integrate over
yvec0 = [0.3, 0.34] # initial conditions for x and y respectively
def norm_residual(params, *args):
    """
    Computes the L^2 norm of the residual of y and the data (y as defined above).
    Input: params = array of parameters (scalars or arrays) for the DE system
           args   = other arguments to pass into the function f or to use
                    to compute the residual.
    Output: err = L^2 error of the solution vector (scalar).
    """
    data, yvec0, t = args
    a, b, c, d, M, theta = params
    sol = odeint(f, yvec0, t, args=(a, b, c, d, M, theta))
    x = sol[:, 0]; y = sol[:, 1]
    res = data - y
    err = np.linalg.norm(res, 2)
    return err
from scipy.optimize import minimize
p_min = minimize(norm_residual, params0, args=(y_data, yvec0, t))
print(p_min)
And the traceback
Traceback (most recent call last):
  File "model_ex_1.py", line 62, in <module>
    p_min = minimize(norm_residual, params0, args=(y_anom, yvec0, t))
  File "/usr/lib/python2.7/dist-packages/scipy/optimize/_minimize.py", line 354, in minimize
    x0 = np.asarray(x0)
  File "/usr/lib/python2.7/dist-packages/numpy/core/numeric.py", line 482, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.

You cannot put a list in a numpy array if the other elements are scalars.
>>> import numpy as np
>>> foo_array = np.array([1,2,3,[5,6,7]])
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    foo_array = np.array([1,2,3,[5,6,7]])
ValueError: setting an array element with a sequence.
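One way around it is to build a single flat 1-D vector instead, e.g. by concatenating the scalars and arrays:
>>> np.concatenate([[1, 2, 3], [5, 6, 7]])
array([1, 2, 3, 5, 6, 7])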

It would be helpful if you posted myfunc, but you can do this:
def foo():
    return [p0, p1, p2, ..., pn]

params0 = np.concatenate([foo(), p_array1, p_array2])  # one flat 1-D parameter vector
p_min = minimize(myfunc, params0, args)
Or, from Multiple variables in SciPy's optimize.minimize:
import scipy.optimize as optimize

def f(params):
    # print(params)  # <-- you'll see that params is a NumPy array
    a, b, c = params  # <-- for readability you may wish to assign names to the component variables
    return a**2 + b**2 + c**2

initial_guess = [1, 1, 1]
result = optimize.minimize(f, initial_guess)
if result.success:
    fitted_params = result.x
    print(fitted_params)
else:
    raise ValueError(result.message)

I figured it out! The solution that I found to work was to change
params0 = [0.75, 0.23, 1.0, 0.2, M, theta]
in line 6 to
params0 = np.array([ 0.75, 0.23, 1.0, 0.2, *M, *theta], dtype=np.float64)
and in my function definition of my system of ODEs to be solved, instead of having
def f(yvec, t, a, b, c, d, M, theta):
    x, y = yvec
    dydt = [a*x - b*y**2 + 1, -c*x - d*x*y + np.sum(M * np.cos(theta*t))]
    return dydt
I now have
def f(yvec, t, myparams):
    x, y = yvec
    a, b, c, d = myparams[:4]
    ni = (myparams[4:].shape[0]) // 2  # halved b/c M and theta are of the same shape
    M = myparams[4:ni+4]
    theta = myparams[ni+4:]
    dydt = [a*x - b*y**2 + 1, -c*x - d*x*y + np.sum(M * np.cos(theta*t))]
    return dydt
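For completeness, the odeint call inside norm_residual then presumably passes the whole flat vector through, since f now does the unpacking itself (a sketch):
def norm_residual(params, *args):
    data, yvec0, t = args
    sol = odeint(f, yvec0, t, args=(params,))  # hand the flat parameter vector to f
    y = sol[:, 1]
    return np.linalg.norm(data - y, 2)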
NOTE: I had to add "dtype=np.float64" for 'params0' because without it I was getting the error
AttributeError: 'numpy.float64' object has no attribute 'cos'
and it appears that np.cos does not know how to handle object-dtype arrays, which is what you can end up with when a parameter list mixes scalars and arrays. The workaround can be found here.
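For illustration, the same error shows up for any object-dtype array, since np.cos then tries to call a .cos() method on each element (the exact message depends on the element type, and on newer NumPy it surfaces as a TypeError with similar wording):
>>> import numpy as np
>>> ragged = np.array([0.5, np.arange(3)], dtype=object)
>>> np.cos(ragged)
AttributeError: 'float' object has no attribute 'cos'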
Thanks everyone for the suggestions!

Related

curve_fit with ODE of unknown coefficients

I'm trying to solve the equation Iy'' + b|y'|y' + ky = 0 and fit the coefficients to data.
This is the code I have so far (ready to run):
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
import pandas as pd
from scipy.optimize import curve_fit

# Define derivatives of function
def f(y, t, I, b, k):
    theta, omega = y
    derivs = [omega, -b / I * omega * abs(omega) - k / I * theta]
    return derivs

# integrate the function
def yint(t, I, b, k, y0, y1):
    y = odeint(f, [y0, y1], t, args=(I, b, k))
    return y.ravel()

# define t and y to fit
y0 = [.5, 0]
t = np.arange(0, 3, .01)
y = np.cos(t)*np.e**(-.01*t)

# fit
vals, cov = curve_fit(yint, t, y, p0=[0.002245, 1e-5, 0.2492, y0[0], y[1]])
However, when I run the function, I get the error:
Traceback (most recent call last):
  File "---", line 24, in <module>
    vals, cov = curve_fit(yint, t, y, p0=[0.002245, 1e-5, 0.2492, y0[0], y[1]])
  File "---.py", line 578, in curve_fit
    res = leastsq(func, p0, args=args, full_output=1, **kw)
  File "---.py", line 371, in leastsq
    shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
  File "---.py", line 20, in _check_func
    res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
  File "---.py", line 447, in _general_function
    return function(xdata, *params) - ydata
ValueError: operands could not be broadcast together with shapes (600,) (300,)
Any thoughts on how to fix this?
The problem is that the function yint returns an array of shape (600,) for an input t of shape (300,). Think again about yint: it solves a second-order differential equation by representing it as a system of two first-order equations. So the result of y = odeint(...) has two columns, one for the solution itself and the second for its derivative; its shape is (300, 2). Mashing the solution and its derivative together with ravel does not make sense. Instead, you should take only the actual solution, since that is the thing you are fitting.
So, replace
return y.ravel()
with
return y[:, 0]
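Putting the fix in context, a minimal corrected version of the script (a sketch; I have also assumed y0[1] was intended where the question's p0 has y[1]):
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import curve_fit

def f(y, t, I, b, k):
    theta, omega = y
    return [omega, -b / I * omega * abs(omega) - k / I * theta]

def yint(t, I, b, k, y0, y1):
    y = odeint(f, [y0, y1], t, args=(I, b, k))
    return y[:, 0]  # only the solution column, shape (300,)

y0 = [.5, 0]
t = np.arange(0, 3, .01)
y = np.cos(t) * np.e**(-.01 * t)
vals, cov = curve_fit(yint, t, y, p0=[0.002245, 1e-5, 0.2492, y0[0], y0[1]])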

fitting an inverse proportional function

I want to fit the function f(x) = b + a / x to my data set. For that, I found scipy.optimize.least_squares suitable.
My code is as follows:
x = np.asarray(range(20, 401, 20))
# y is distances that I calculated; an array of length 20 (just random numbers here as an example)
y = np.random.rand(20)
# initial guesses of the params a and b
params = np.array([1, 1])

# function to minimize
def funcinv(x):
    return params[0]/x + params[1]

res = least_squares(funinv, params, args=(x, y))
Error given:
return np.atleast_1d(fun(x, *args, **kwargs))
TypeError: funinv() takes 1 positional argument but 3 were given
How can I fit my data?
To clear things up a little: there are two related problems:
Minimizing a function
Fitting model to data
Fitting a model to observed data means finding the model parameters that minimize some measure of error between the model output and the observed data.
The least_squares method just minimizes the following function with respect to x (x can be a vector):
F(x) = 0.5 * sum(rho(f_i(x)**2), i = 0, ..., m - 1)
(rho is a loss function; the default is rho(x) = x, so don't mind it for now.)
least_squares(func, x0) expects that a call to func(x) will return a vector [a1, a2, a3, ...] for which the sum of squares will be computed: S = 0.5 * (a1^2 + a2^2 + a3^2 + ...).
least_squares will tweak x0 to minimize S.
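A tiny self-contained example of that contract: the callable returns the residual vector, and least_squares drives its sum of squares toward zero:
import numpy as np
from scipy.optimize import least_squares

def resid(x):
    return np.array([x[0] - 1.0, x[1] - 2.0])  # zero exactly at x = [1, 2]

sol = least_squares(resid, x0=[0.0, 0.0])
print(sol.x)  # approximately [1. 2.]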
Thus, in order to use it to fit a model to data, one must construct a function measuring the error between the model and the actual data (the residuals) and then minimize that residuals function. In your case, you can write it as follows:
import numpy as np
from scipy.optimize import least_squares

x = np.asarray(range(20, 401, 20))
y = np.random.rand(20)
params = np.array([1, 1])

def funcinv(x, a, b):
    return b + a/x

def residuals(params, x, data):
    # evaluates the function given a vector of params [a, b]
    # and returns the residuals: (observed_data - model_data)
    a, b = params
    func_eval = funcinv(x, a, b)
    return data - func_eval

res = least_squares(residuals, params, args=(x, y))
This gives a result:
print(res)
...
    message: '`gtol` termination condition is satisfied.'
       nfev: 4
       njev: 4
 optimality: 5.6774618339971994e-10
     status: 1
    success: True
          x: array([ 6.89518618,  0.37118815])
However, since the residuals function is pretty much the same all the time (res = observed_data - model_data), scipy.optimize offers a shortcut called curve_fit: curve_fit(func, xdata, ydata, p0). curve_fit builds the residuals function automatically, and you can simply write:
import numpy as np
from scipy.optimize import curve_fit

x = np.asarray(range(20, 401, 20))
y = np.random.rand(20)
params = np.array([1, 1])

def funcinv(x, a, b):
    return b + a/x

res = curve_fit(funcinv, x, y, params)
print(res)  # ... array([ 6.89518618, 0.37118815]), ...
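Note that curve_fit returns a (popt, pcov) tuple, so the fitted parameters can be unpacked directly:
popt, pcov = curve_fit(funcinv, x, y, p0=params)
a_fit, b_fit = popt  # fitted values of a and b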

Fitting data to numerical solution of an ode in python

I have a system of two first order ODEs, which are nonlinear, and hence difficult to solve analytically in a closed form. I want to fit the numerical solution to this system of ODEs to a data set. My data set is for only one of the two variables that are part of the ODE system. How do I go about this?
This didn't help because there's only one variable there.
My code which is currently leading to an error is:
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import curve_fit

def f(y, t, a, b, g):
    S, I = y  # S, I are supposed to be my variables
    Sdot = -a * S * I
    Idot = (a - b) * S * I + (b - g - b * I) * I
    dydt = [Sdot, Idot]
    return dydt

def y(t, a, b, g, y0):
    y = odeint(f, y0, t, args=(a, b, g))
    return y.ravel()

I_data = []  # I have data only for I, not for S
file = open('./ratings_showdown.csv')
for e_raw in file.read().split('\r\n'):
    try:
        e = float(e_raw)
        I_data.append(e)
    except ValueError:
        continue

data_t = range(len(I_data))
popt, cov = curve_fit(y, data_t, I_data, [.05, 0.02, 0.01, [0.99, 0.01]])
# want to fit I part of solution to data for variable I
# ERROR here, ValueError: setting an array element with a sequence
a_opt, b_opt, g_opt, y0_opt = popt
print("a = %g" % a_opt)
print("b = %g" % b_opt)
print("g = %g" % g_opt)
print("y0 = %g" % y0_opt)

import matplotlib.pyplot as plt
t = np.linspace(0, len(data_y), 2000)
plt.plot(data_t, data_y, '.',
         t, y(t, a_opt, b_opt, g_opt, y0_opt), '-')
plt.gcf().set_size_inches(6, 4)
# plt.savefig('out.png', dpi=96)  # to save the fit result
plt.show()
This type of ODE fitting becomes a lot easier in symfit, which I wrote specifically as a user-friendly wrapper around scipy. I think it will be very useful for your situation because the decreased amount of boilerplate code simplifies things a lot.
From the docs and applied roughly to your problem:
from symfit import variables, parameters, Fit, D, ODEModel

S, I, t = variables('S, I, t')
a, b, g = parameters('a, b, g')

model_dict = {
    D(S, t): -a * S * I,
    D(I, t): (a - b) * S * I + (b - g - b * I) * I
}

ode_model = ODEModel(model_dict, initial={t: 0.0, S: 0.99, I: 0.01})

fit = Fit(ode_model, t=tdata, I=I_data, S=None)
fit_result = fit.execute()
Check out the docs for more :)
So I figured out the problem.
The curve_fit() function needs its initial-guess p0 to be a flat sequence of scalars. So, instead of passing the initial conditions as a nested list [0.99, 0.01], I passed them separately as 0.99 and 0.01.
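A sketch of what that corrected call looks like (names assumed from the question); returning only the I column also keeps the model output the same shape as I_data:
def y_model(t, a, b, g, S0, I0):
    sol = odeint(f, [S0, I0], t, args=(a, b, g))
    return sol[:, 1]  # only the I component, since the data covers only I

popt, cov = curve_fit(y_model, data_t, I_data, p0=[.05, 0.02, 0.01, 0.99, 0.01])
a_opt, b_opt, g_opt, S0_opt, I0_opt = popt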

CVXOPT QP Solver: TypeError: 'A' must be a 'd' matrix with 1000 columns

I'm trying to use the CVXOPT qp solver to compute the Lagrange Multipliers for a Support Vector Machine
def svm(X, Y, c):
    m = len(X)
    P = matrix(np.dot(Y, Y.T) * np.dot(X, X.T))
    q = matrix(np.ones(m) * -1)
    g1 = np.asarray(np.diag(np.ones(m) * -1))
    g2 = np.asarray(np.diag(np.ones(m)))
    G = matrix(np.append(g1, g2, axis=0))
    h = matrix(np.append(np.zeros(m), (np.ones(m) * c), axis=0))
    A = np.reshape((Y.T), (1, m))
    b = matrix([0])
    print (A).shape
    A = matrix(A)
    sol = solvers.qp(P, q, G, h, A, b)
    print sol
Here X is a 1000 X 2 matrix and Y has the same number of labels. The solver throws the following error:
$ python svm.py
(1, 1000)
Traceback (most recent call last):
  File "svm.py", line 35, in <module>
    svm(X, Y, 50)
  File "svm.py", line 29, in svm
    sol = solvers.qp(P, q, G, h, A, b)
  File "/usr/local/lib/python2.7/site-packages/cvxopt/coneprog.py", line 4468, in qp
    return coneqp(P, q, G, h, None, A, b, initvals, options = options)
  File "/usr/local/lib/python2.7/site-packages/cvxopt/coneprog.py", line 1914, in coneqp
    %q.size[0])
TypeError: 'A' must be a 'd' matrix with 1000 columns
I printed the shape of A and it's a (1,1000) matrix after reshaping from a vector. What exactly is causing this error?
Your matrix elements have to be of the floating-point type as well. So the error is removed by using A = A.astype('float') to cast it.
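In the code above, that cast would go right before the matrix() conversion; a minimal sketch:
A = np.reshape(Y.T, (1, m)).astype('float')  # dtype float64 maps to CVXOPT typecode 'd'
A = matrix(A)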
I tried A = A.astype(double) to solve it, but that is invalid: Python doesn't know what double is, and a CVXOPT matrix has no astype method. Therefore, using
A = matrix(A, (1, m), 'd')
actually solved this problem!
The error "TypeError: 'A' must be a 'd' matrix with 1000 columns" is raised under two conditions, namely:
if the type code of A is not equal to 'd'
if A.size[1] != q.size[0]
Check for both of these conditions.
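A quick sanity check for both (a sketch, assuming A and q are already CVXOPT matrices):
print(A.typecode)              # must be 'd', not 'i'
print(A.size[1] == q.size[0])  # A needs as many columns as q has rows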
To convert CVXOPT matrix items to floats:
A = A * 1.0

Python scipy.optimize.fmin_l_bfgs_b error occurs

My code is to implement an active learning algorithm, using L-BFGS optimization. I want to optimize four parameters: alpha, beta, w and gamma.
However, when I run the code below, I get an error:
optimLogitLBFGS = sp.optimize.fmin_l_bfgs_b(func, x0=x0, args=(X, Y, Z), fprime=func_grad)
  File "C:\Python27\lib\site-packages\scipy\optimize\lbfgsb.py", line 188, in fmin_l_bfgs_b
    **opts)
  File "C:\Python27\lib\site-packages\scipy\optimize\lbfgsb.py", line 311, in _minimize_lbfgsb
    isave, dsave)
_lbfgsb.error: failed in converting 7th argument `g' of _lbfgsb.setulb to C/Fortran array
0-th dimension must be fixed to 22 but got 4
My code is:
# -*- coding: utf-8 -*-
import numpy as np
import scipy as sp
import scipy.stats as sps

num_labeler = 3
num_instance = 5
X = np.array([[1,1,1,1],[2,2,2,2],[3,3,3,3],[4,4,4,4],[5,5,5,5]])
Z = np.array([1,0,1,0,1])
Y = np.array([[1,0,1],[0,1,0],[0,0,0],[1,1,1],[1,0,0]])
W = np.array([[1,1,1,1],[2,2,2,2],[3,3,3,3]])
gamma = np.array([1,1,1,1,1])
alpha = np.array([1,1,1,1])
beta = 1
para = np.array([1,1,1,1,1,1,1,1,1,2,2,2,2,3,3,3,3,1,1,1,1,1])

def get_params(para):
    # extract parameters from the 1D parameter vector
    assert len(para) == 22
    alpha = para[0:4]
    beta = para[4]
    W = para[5:17].reshape(3, 4)
    gamma = para[17:]
    return alpha, beta, gamma, W

def log_p_y_xz(yit, zi, sigmati):  # log P(y_it|x_i,z_i)
    return np.log(sps.norm(zi, sigmati).pdf(yit))  # tested

def log_p_z_x(alpha, beta, xi):  # log P(z_i=1|x_i)
    return -np.log(1+np.exp(-np.dot(alpha, xi)-beta))  # tested

def sigma_eta_ti(xi, w_t, gamma_t):  # (1+exp(-w_t x_i - gamma_t))^-1
    return 1/(1+np.exp(-np.dot(xi, w_t)-gamma_t))  # tested

def df_alpha(X,Y,Z,W,alpha,beta,gamma):  # df/dalpha, tested
    return np.sum((2/(1+np.exp(-np.dot(alpha,X[i])-beta))-1)*np.exp(-np.dot(alpha,X[i])-beta)*X[i]/(1+np.exp(-np.dot(alpha,X[i])-beta))**2 for i in range(num_instance))

def df_beta(X,Y,Z,W,alpha,beta,gamma):  # df/dbeta
    return np.sum((2/(1+np.exp(-np.dot(alpha,X[i])-beta))-1)*np.exp(-np.dot(alpha,X[i])-beta)/(1+np.exp(-np.dot(alpha,X[i])-beta))**2 for i in range(num_instance))

def df_w(X,Y,Z,W,alpha,beta,gamma):  # df/dsigma * dsigma/dw
    return np.sum(np.sum((-3)*(Y[i][t]**2-(-np.log(1+np.exp(-np.dot(alpha,X[i])-beta)))*(2*Y[i][t]-1))*(1/(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))**4)*(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))*(1-(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t]))))*X[i]+(1/(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))**2)*(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))*(1-(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t]))))*X[i] for t in range(num_labeler)) for i in range(num_instance))

def df_gamma(X,Y,Z,W,alpha,beta,gamma):  # df/dsigma * dsigma/dgamma
    return np.sum(np.sum((-3)*(Y[i][t]**2-(-np.log(1+np.exp(-np.dot(alpha,X[i])-beta)))*(2*Y[i][t]-1))*(1/(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))**4)*(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))*(1-(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t]))))+(1/(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))**2)*(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))*(1-(1/(1+np.exp(-np.dot(X[i],W[t])-gamma[t])))) for t in range(num_labeler)) for i in range(num_instance))

def func(para, *args):  # tested
    alpha, beta, gamma, W = get_params(para)
    X, Y, Z = args
    return np.sum(np.sum(log_p_y_xz(Y[i][t], Z[i], sigma_eta_ti(X[i], W[t], gamma[t])) + log_p_z_x(alpha, beta, X[i]) for t in range(num_labeler)) for i in range(num_instance))

def func_grad(para, *args):
    alpha, beta, gamma, W = get_params(para)
    X, Y, Z = args
    # gradients
    d_f_a = df_alpha(X,Y,Z,W,alpha,beta,gamma)
    d_f_b = df_beta(X,Y,Z,W,alpha,beta,gamma)
    d_f_w = df_w(X,Y,Z,W,alpha,beta,gamma)
    d_f_g = df_gamma(X,Y,Z,W,alpha,beta,gamma)
    return np.array([d_f_a, d_f_b, d_f_w, d_f_g])

x0 = np.concatenate([np.ravel(alpha), np.ravel(beta), np.ravel(W), np.ravel(gamma)])
optimLogitLBFGS = sp.optimize.fmin_l_bfgs_b(func, x0=x0, args=(X, Y, Z), fprime=func_grad)
I am not sure what the problem is. Maybe func_grad causes it? Could anyone take a look? Thanks.
You need to be taking the derivative of func with respect to each of the elements in your concatenated array of alpha, beta, w, gamma parameters, so func_grad ought to return a single 1D array of the same length as x0 (i.e. 22). Instead it returns a jumble of two arrays and two scalar floats nested inside an np.object array:
In [1]: func_grad(x0, X, Y, Z)
Out[1]:
array([array([ 0.00681272,  0.00681272,  0.00681272,  0.00681272]),
       0.006684719133999417,
       array([-0.01351227, -0.01351227, -0.01351227, -0.01351227]),
       -0.013639910534587798], dtype=object)
Part of the problem is that np.array([d_f_a, d_f_b,d_f_w,d_f_g]) is not concatenating those objects into a single 1D array since some are numpy arrays and some are Python floats. That part is easily solved by using np.hstack([d_f_a, d_f_b,d_f_w,d_f_g]) instead.
However, the combined size of these objects is still only 10, whereas the output of func_grad needs to be a 22-long vector. You will need to take another look at your df_* functions. In particular, W is a (3, 4) array, but df_w only returns a (4,) vector, and gamma is a (5,) vector whereas df_gamma only returns a scalar.
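In other words, func_grad needs to honor a simple shape contract; a sketch (the fixes to df_w and df_gamma themselves are still up to you):
def func_grad(para, *args):
    alpha, beta, gamma, W = get_params(para)
    X, Y, Z = args
    d_f_a = df_alpha(X, Y, Z, W, alpha, beta, gamma)  # shape (4,)
    d_f_b = df_beta(X, Y, Z, W, alpha, beta, gamma)   # scalar
    d_f_w = df_w(X, Y, Z, W, alpha, beta, gamma)      # should be shape (3, 4)
    d_f_g = df_gamma(X, Y, Z, W, alpha, beta, gamma)  # should be shape (5,)
    grad = np.hstack([d_f_a, d_f_b, np.ravel(d_f_w), d_f_g])
    assert grad.shape == (22,)  # must match len(x0)
    return grad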
