I'm trying to fit a 2D-Gaussian to some greyscale image data, which is given by one 2D array.
The lmfit library implements an easy-to-use Model class that should be capable of doing this.
Unfortunately, the documentation (http://lmfit.github.io/lmfit-py/model.html) only provides examples for 1D fitting. For my case, I simply construct the lmfit Model with two independent variables.
The following code looks valid to me, but causes scipy to throw a "minpack.error: Result from function call is not a proper array of floats."
To sum it up: how do I feed 2D (x1, x2) -> (y) data to an lmfit Model?
Here is my approach:
Everything is packed in a GaussianFit2D class, but here are the important parts:
Here's the Gaussian function. The documentation says about user-defined functions:
Of course, the model function will have to return an array that will be the same size as the data being modeled. Generally this is handled by also specifying one or more independent variables.
I don't really get what this means, since for given values x1, x2 the only reasonable result is a scalar.
def _function(self, x1, x2, amp, wid, cen1, cen2):
    val = (amp/(np.sqrt(2*np.pi)*wid)) * np.exp(-((x1-cen1)**2+(x2-cen2)**2)/(2*wid**2))
    return val
Here the model is generated:
def _buildModel(self, **kwargs):
    model = lmfit.Model(self._function, independent_vars=["x1", "x2"],
                        param_names=["amp", "wid", "cen1", "cen2"])
    return model
Here's the function that takes the data, builds the model and params, and calls lmfit's fit():
def fit(self, data, freeX, **kwargs):
    freeX = np.asarray(freeX, float)
    model = self._buildModel(**kwargs)
    params = self._generateModelParams(model, **kwargs)
    return model.fit(data, x1=freeX[0], x2=freeX[1], params=params)
And finally, here's where this fit function gets called:
data = np.asarray(img, float)
gaussFit = GaussianFit2D()
x1 = np.arange(len(img[0, :]))
x2 = np.arange(len(img[:, 0]))
fit = gaussFit.fit(data, [x1, x2])
OK, I wrote to the devs and got the answer from them (thanks to Matt here).
The basic idea is to flatten all the input to 1D, hiding the multi-dimensional structure from lmfit.
Here's how you do it.
Modify your function:
def function(self, x1, x2):
    return (x1+x2).flatten()
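Applied to the Gaussian from the question, that would look something like this (a sketch combining the original _function with the flattening advice):
def _function(self, x1, x2, amp, wid, cen1, cen2):
    val = (amp/(np.sqrt(2*np.pi)*wid)) * np.exp(-((x1-cen1)**2 + (x2-cen2)**2)/(2*wid**2))
    return val.flatten()  # 1D output, matching the flattened data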
Flatten the 2D input array you want to fit to:
...
data = data.flatten()
...
Expand the two 1D x-variables so that you have every combination of them:
...
x1n = []
x2n = []
for i in x1:
    for j in x2:
        x1n.append(i)
        x2n.append(j)
x1n = np.asarray(x1n)
x2n = np.asarray(x2n)
...
And throw everything into the fitter:
model.fit(data, x1=x1n, x2=x2n, params=params)
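The double loop above builds the Cartesian product of x1 and x2; np.meshgrid can do the same thing in a vectorized way. A minimal sketch (x1 and x2 here are stand-in axis vectors; just make sure data is flattened in the same order):
import numpy as np
x1 = np.arange(5)
x2 = np.arange(4)
# indexing='ij' reproduces the x1-major ordering of the nested loops
X1, X2 = np.meshgrid(x1, x2, indexing='ij')
x1n = X1.ravel()
x2n = X2.ravel()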
Here is an example for your reference; I hope it helps.
import numpy
from lmfit import Model
def gaussian(x, cenu, cenv, wid):
    u = x[:, 0]
    v = x[:, 1]
    return (1/(2*numpy.pi*wid**2)) * numpy.exp(-(u-cenu)**2 / (2*wid**2) - (v-cenv)**2 / (2*wid**2))
data = numpy.empty((25, 3))
x = numpy.arange(-2, 3, 1)
y = numpy.arange(-2, 3, 1)
xx, yy = numpy.meshgrid(x, y)
data[:, 0] = xx.flatten()
data[:, 1] = yy.flatten()
data[:, 2] = gaussian(data[:, 0:2], 0, 0, 0.5)
print('xx\n', xx)
print('yy\n', yy)
print('data to be fit\n', data[:, 2])
cu = 0.9
cv = 0.5
wid = 1
gmod = Model(gaussian)
gmod.set_param_hint('cenu', value=cu, min=cu-2, max=cu+2)
gmod.set_param_hint('cenv', value=cv, min=cv-2, max=cv+2)
gmod.set_param_hint('wid', value=wid, min=0.1, max=5)
params = gmod.make_params()
result = gmod.fit(data[:, 2], x=data[:, 0:2], params=params)
print(result.fit_report(min_correl=0.25))
print(result.best_values)
print(result.best_fit)
I'm trying to implement multiclass logistic regression from scratch. The dataset is MNIST.
I built some functions such as hypothesis, sigmoid, cost function, cost function derivative, and gradient descent. My code is below.
I'm struggling with:
All images are labeled with the digit they represent, so there are a total of 10 classes.
Inside the gradient descent function, I need to loop through each class, but I do not know how to apply it using the One-vs-All method.
In other words, what I need to do is:
Filter each class inside gradient descent.
After that, build a function to predict the test set.
Here is my code.
import numpy as np
import pandas as pd
# Only training data set
# the test data will be loaded later.
url='https://drive.google.com/file/d/1-MO8oCfq4KU361QeeL4DdafVBhZePUNT/view?usp=sharing'
url='https://drive.google.com/uc?id=' + url.split('/')[-2]
df = pd.read_csv(url,header = None)
X = df.values[:, 0:-1]
y = df.values[:, -1]
m = np.size(X, 0)
y = np.array(y).reshape(m, 1)
X = np.c_[ np.ones(m), X ] # Bias
def hypothesis(X, thetas):
    return sigmoid(X.dot(thetas)) #- 0.0000001

def sigmoid(z):
    return 1/(1+np.exp(-z))

def losscost(X, y, m, thetas):
    h = hypothesis(X, thetas)
    return -(1/m) * ( y.dot(np.log(h)) + (1-y).dot(np.log(1-h)) )

def derivativelosscost(X, y, m, thetas):
    h = hypothesis(X, thetas)
    return (h-y).dot(X)/m

def descendinggradient(X, y, m, epoch, alpha, thetas):
    n = np.size(X, 1)
    J_historico = []
    for i in range(epoch):
        for j in range(0, 10):  # 10 classes
            # How to filter each class in here (inside descendinggradient)?
            # The 2 lines below are wrong.
            #thetas = thetas - alpha * derivativelosscost(X, y, m, thetas)
            #J_historico = J_historico + [losscost(X, y, m, thetas)]
            pass  # placeholder so this reduced code parses
    return [thetas, J_historico]
alpha = 0.01
epoch = 50
(thetas, J_historico) = descendinggradient(X, y, m, epoch, alpha)
# After that, how to build a function to predict the test set.
Let me explain this problem step-by-step:
First, since your code doesn't provide the actual data or a working link to it, I've created a random dataset followed by the same commands you used to create X and y:
import numpy as np
import pandas as pd

batch_size = 20
num_classes = 10
rng = np.random.default_rng(seed=42)
df = pd.DataFrame(
    4 * rng.random((batch_size, num_classes + 1)) - 2,  # random values between -2 and 2
    columns=['X0', 'X1', 'X2', 'X3', 'X4', 'X5', 'X6', 'X7', 'X8', 'X9', 'Y']
)
X = df.values[:, 0:-1]
y = df.values[:, -1]
m = np.size(X, 0)
y = np.array(y).reshape(m, 1)
X = np.c_[ np.ones(m), X ] # Bias
Next, let's take a look at your hypothesis function. If we just run hypothesis and look at the first sample, we get a vector of size (10,). I also needed to provide initial thetas for this case:
thetas = rng.random((X.shape[1],num_classes))
h = hypothesis(X, thetas)
print(h[0])
>>>[0.89701729 0.90050806 0.98358408 0.81786334 0.96636732 0.97819512
0.89118488 0.87238045 0.70612173 0.30256924]
Basically the function calculates a "probability"[1] for each class.
At this point we hit the first issue in your code. The sigmoid function returns "probabilities" that are not "connected" to each other. To put those "probabilities" in relation to one another we need another function: softmax. You will find plenty of implementations of it. In short: it calculates the "probabilities" based on the sigmoid outputs, so that the sum over all class "probabilities" equals 1.
So for your second question, "how to implement prediction after training": we only need to find the argmax value to determine the class:
h = hypothesis(X, thetas)
p = softmax(h) # needs to be implemented
prediction = np.argmax(p, axis=1)
print(prediction)
>>>[2 5 5 8 3 5 2 1 3 5 2 3 8 3 3 9 5 1 1 8]
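For reference, a minimal softmax sketch (my own implementation, not from the original code); subtracting the row maximum only guards against overflow in exp:
import numpy as np

def softmax(z):
    # rows are samples, columns are class scores
    z = z - np.max(z, axis=1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=1, keepdims=True)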
Now that we know how to predict a class, we also need to know where to set up the training. We want to do this directly after the softmax function. But instead of using argmax to determine the winning class, we use the cost function and its derivative. The problem in your code: you used the cross-entropy loss for a binary problem. The binary case also doesn't need the softmax function, because the sigmoid already connects the two binary classes. And since we are not interested in the value of the multiclass cross-entropy loss itself, only in its derivative, we can calculate that directly.
The conversion from binary cross-entropy to multiclass is somewhat unintuitive at first glance. I recommend reading a bit about it before implementing. After that you basically use your line:
thetas = thetas - alpha * derivativelosscost(X, y, m, thetas)
for updating the thetas.
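To illustrate, here is a small self-contained sketch of that update with one-hot encoded labels (the random stand-in data and all names are mine, not from the original code):
import numpy as np

rng = np.random.default_rng(0)
m, n_features, num_classes = 20, 11, 10
X = rng.random((m, n_features))                # stand-in design matrix (bias column included)
y = rng.integers(0, num_classes, size=(m, 1))  # integer class labels
thetas = rng.random((n_features, num_classes))
alpha, epoch = 0.01, 50

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# one-hot encode the labels, shape (m, num_classes)
Y = np.zeros((m, num_classes))
Y[np.arange(m), y.ravel()] = 1

for _ in range(epoch):
    h = softmax(X.dot(thetas))   # class "probabilities", (m, num_classes)
    grad = X.T.dot(h - Y) / m    # derivative of the multiclass cross-entropy
    thetas -= alpha * grad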
[1] These are not actual probabilities, but that is a completely different topic.
I'm trying to fit a parameter eta_H in the function TGp_xx to some data (x_data, data_num_xx) using curve_fit. The code below is a reduced version of what I'm using and won't run by itself, but I hope the issue is conceptual enough to be understandable even from this.
import numpy as np
from scipy.optimize import curve_fit
Lx = 150
y_cut = 20
data = np.loadtxt("../dump/results.dat")
ux = data[:,3]
ux = np.reshape(ux , (Ly, Lx))
def Par_x(x, y, vec):
    fdx = vec[(x+1)%Lx, y]
    fsx = vec[(x-1+Lx)%Lx, y]
    return (fdx - fsx) / 2.0

def TGp_xx(x, eta_H):
    return 2*eta_H*Par_x(x, y_cut, ux)
x_data = np.arange(Lx, dtype=int)
data_num_xx = np.empty(Lx, dtype='float64') #this is just a placeholder
popt_xx, pcov_xx = curve_fit(TGp_xx, x_data, data_num_xx)
I get an IndexError raised within Par_x:
fdx = vec[(x+1)%Lx , y]
IndexError: arrays used as indices must be of integer (or boolean) type
I tried something simpler, like calling TGp_xx(x_data, some_constant) outside curve_fit, and it works. I don't really get why inside curve_fit I get the IndexError, as if a float value (or an array of floats) were being passed as x, which can't be used as an index.
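That is in fact what happens: curve_fit converts xdata to a float array internally (at least in recent scipy versions) before calling the model function, so inside the fit x arrives as floats even if you passed integers. One possible workaround, sketched with the question's own names, is to cast back before indexing:
def TGp_xx(x, eta_H):
    xi = np.asarray(x).astype(int)  # curve_fit hands the model floats; cast back for indexing
    return 2*eta_H*Par_x(xi, y_cut, ux)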
Edit: modeling and fitting with this approach work fine; the data here is just not good.
I want to do curve fitting on a complex dataset. After thorough reading and searching, I found that I can use a couple of methods (e.g. lmfit minimize, scipy leastsq).
But none gives me a good fit at all.
here is the fit equation (the Cole-Cole model, as implemented in the code below):
sigma(w) = sig0 * (1 + (m/(1-m)) * (1 - 1/(1 + (1j*w*tau)**c))), where w = 2*pi*fr
here is the data to be fitted (list of y values):
[(0.00011342104914066835+8.448890220616275e-07j),
(0.00011340386404065371+7.379293582429708e-07j),
(0.0001133540327309949+6.389834505824625e-07j),
(0.00011332170913939336+5.244566142401774e-07j),
(0.00011331311156154074+4.3841061618015007e-07j),
(0.00011329383047059048+3.6163513508002877e-07j),
(0.00011328700094846502+3.0542249453666894e-07j),
(0.00011327650033983806+2.548725558622188e-07j),
(0.00011327702539337786+2.2508174567697671e-07j),
(0.00011327342238146558+1.9607648998100523e-07j),
(0.0001132710747364799+1.721721661949941e-07j),
(0.00011326933241850936+1.5246061350710235e-07j),
(0.00011326798040984542+1.3614817802178457e-07j),
(0.00011326752037650585+1.233483784504962e-07j),
(0.00011326758290166552+1.1258801448459512e-07j),
(0.00011326813100914905+1.0284749122099354e-07j),
(0.0001132684076390416+9.45791423595816e-08j),
(0.00011326982474882009+8.733105218572698e-08j),
(0.00011327158639135678+8.212191452217794e-08j),
(0.00011327366823516856+7.747920115589205e-08j),
(0.00011327694366034208+7.227069986108343e-08j),
(0.00011327915327873038+6.819405851172907e-08j),
(0.00011328181165961218+6.468392148750885e-08j),
(0.00011328531688122571+6.151393311227958e-08j),
(0.00011328857849500441+5.811704586613896e-08j),
(0.00011329241716561626+5.596645863242474e-08j),
(0.0001132970129528527+5.4722461511610696e-08j),
(0.0001133002881788021+5.064523218904898e-08j),
(0.00011330507671740223+5.0307457368330284e-08j),
(0.00011331106068787993+4.7703959367963307e-08j),
(0.00011331577350707601+4.634615394867111e-08j),
(0.00011332064001939156+4.6914747648361504e-08j),
(0.00011333034985824086+4.4992151257444304e-08j),
(0.00011334188526870483+4.363662798446445e-08j),
(0.00011335491299924776+4.364164366097129e-08j),
(0.00011337451201475147+4.262881852644385e-08j),
(0.00011339778209066752+4.275096587356569e-08j),
(0.00011342832992628646+4.4463907608604945e-08j),
(0.00011346526768580432+4.35706649329342e-08j),
(0.00011351108008292451+4.4155812379491554e-08j),
(0.00011356967192325835+4.327004709646922e-08j),
(0.00011364164970635006+4.420660396556604e-08j),
(0.00011373150199883139+4.3672898914161596e-08j),
(0.00011384660942003356+4.326171366194325e-08j),
(0.00011399193321804955+4.1493065523925126e-08j),
(0.00011418043916260295+4.0762418512759096e-08j),
(0.00011443271767970721+3.91359909722939e-08j),
(0.00011479600563688605+3.845666332695652e-08j),
(0.0001153652105925112+3.6224677316584614e-08j),
(0.00011638635682516399+3.386843079212692e-08j),
(0.00011836223959714231+3.6692295450490655e-08j)]
here is the list of x values:
[999.9999960000001,
794.328231,
630.957342,
501.18723099999994,
398.107168,
316.22776400000004,
251.188642,
199.52623,
158.489318,
125.89254,
99.999999,
79.432823,
63.095734,
50.118722999999996,
39.810717,
31.622776,
25.118864000000002,
19.952623000000003,
15.848932000000001,
12.589253999999999,
10.0,
7.943282000000001,
6.309573,
5.011872,
3.981072,
3.1622779999999997,
2.511886,
1.9952619999999999,
1.584893,
1.258925,
1.0,
0.7943279999999999,
0.630957,
0.5011869999999999,
0.398107,
0.316228,
0.251189,
0.199526,
0.15848900000000002,
0.125893,
0.1,
0.079433,
0.063096,
0.050119,
0.039811,
0.031623000000000005,
0.025119,
0.019953,
0.015849000000000002,
0.012589,
0.01]
and here is the code which works but not the way I want:
import numpy as np
import matplotlib.pyplot as plt
from lmfit import minimize, Parameters
#%% the equation
def ColeCole(params, fr):  # fr is the array of x values; params are the fitting parameters
    sig0 = params['sig0']
    m = params['m']
    tau = params['tau']
    c = params['c']
    w = fr*2*np.pi
    num = 1
    denom = 1+(1j*w*tau)**c
    sigComplex = sig0*(1.0+(m/(1-m))*(1-num/denom))
    return sigComplex

def res(params, fr, data):  # calculating residuals of the fit
    residual = ColeCole(params, fr) - data
    return residual.view(np.float64)
#%% Adding model parameters and fitting
params = Parameters()
params.add('sig0', value=0.00166)
params.add('m', value=0.19,)
params.add('tau', value=0.05386)
params.add('c', value=0.80)
params['tau'].min = 0 # these conditions must be met but even if I remove them the fit is ugly!!
params['m'].min = 0
out = minimize(res, params, args=(np.array(fr2), np.array(data)))
#%%plotting Imaginary part
fig, ax = plt.subplots()
plotX = fr2
plotY = data.imag
fitplot = ColeCole(out.params, fr2)
ax.semilogx(plotX,plotY,'o',label='imc')
ax.semilogx(plotX,fitplot.imag,label='fit')
#%%plotting real part
fig2, ax2 = plt.subplots()
plotX2 = fr2
plotY2 = data.real
fitplot2 = ColeCole(out.params, fr2)
ax2.semilogx(plotX2,plotY2,'o',label='imc')
ax2.semilogx(plotX2,fitplot2.real,label='fit')
I might be doing it completely wrong; please help me if you know the proper way to do curve fitting on complex data.
I would suggest first converting the complex data to numpy arrays, getting the real and imaginary parts separately, and then using lmfit Model to model that data. Perhaps something like this:
import numpy as np
import matplotlib.pyplot as plt
from lmfit import Model

cdata = np.array((0.00011342104914066835+8.448890220616275e-07j,
0.00011340386404065371+7.379293582429708e-07j,
0.0001133540327309949+6.389834505824625e-07j,
0.00011332170913939336+5.244566142401774e-07j,
0.00011331311156154074+4.3841061618015007e-07j,
0.00011329383047059048+3.6163513508002877e-07j,
0.00011328700094846502+3.0542249453666894e-07j,
0.00011327650033983806+2.548725558622188e-07j,
0.00011327702539337786+2.2508174567697671e-07j,
0.00011327342238146558+1.9607648998100523e-07j,
0.0001132710747364799+1.721721661949941e-07j,
0.00011326933241850936+1.5246061350710235e-07j,
0.00011326798040984542+1.3614817802178457e-07j,
0.00011326752037650585+1.233483784504962e-07j,
0.00011326758290166552+1.1258801448459512e-07j,
0.00011326813100914905+1.0284749122099354e-07j,
0.0001132684076390416+9.45791423595816e-08j,
0.00011326982474882009+8.733105218572698e-08j,
0.00011327158639135678+8.212191452217794e-08j,
0.00011327366823516856+7.747920115589205e-08j,
0.00011327694366034208+7.227069986108343e-08j,
0.00011327915327873038+6.819405851172907e-08j,
0.00011328181165961218+6.468392148750885e-08j,
0.00011328531688122571+6.151393311227958e-08j,
0.00011328857849500441+5.811704586613896e-08j,
0.00011329241716561626+5.596645863242474e-08j,
0.0001132970129528527+5.4722461511610696e-08j,
0.0001133002881788021+5.064523218904898e-08j,
0.00011330507671740223+5.0307457368330284e-08j,
0.00011331106068787993+4.7703959367963307e-08j,
0.00011331577350707601+4.634615394867111e-08j,
0.00011332064001939156+4.6914747648361504e-08j,
0.00011333034985824086+4.4992151257444304e-08j,
0.00011334188526870483+4.363662798446445e-08j,
0.00011335491299924776+4.364164366097129e-08j,
0.00011337451201475147+4.262881852644385e-08j,
0.00011339778209066752+4.275096587356569e-08j,
0.00011342832992628646+4.4463907608604945e-08j,
0.00011346526768580432+4.35706649329342e-08j,
0.00011351108008292451+4.4155812379491554e-08j,
0.00011356967192325835+4.327004709646922e-08j,
0.00011364164970635006+4.420660396556604e-08j,
0.00011373150199883139+4.3672898914161596e-08j,
0.00011384660942003356+4.326171366194325e-08j,
0.00011399193321804955+4.1493065523925126e-08j,
0.00011418043916260295+4.0762418512759096e-08j,
0.00011443271767970721+3.91359909722939e-08j,
0.00011479600563688605+3.845666332695652e-08j,
0.0001153652105925112+3.6224677316584614e-08j,
0.00011638635682516399+3.386843079212692e-08j,
0.00011836223959714231+3.6692295450490655e-08j))
fr = np.array((999.9999960000001, 794.328231, 630.957342,
501.18723099999994, 398.107168, 316.22776400000004,
251.188642, 199.52623, 158.489318, 125.89254, 99.999999,
79.432823, 63.095734, 50.118722999999996, 39.810717,
31.622776, 25.118864000000002, 19.952623000000003,
15.848932000000001, 12.589253999999999, 10.0,
7.943282000000001, 6.309573, 5.011872, 3.981072,
3.1622779999999997, 2.511886, 1.9952619999999999, 1.584893,
1.258925, 1.0, 0.7943279999999999, 0.630957,
0.5011869999999999, 0.398107, 0.316228, 0.251189, 0.199526,
0.15848900000000002, 0.125893, 0.1, 0.079433, 0.063096,
0.050119, 0.039811, 0.031623000000000005, 0.025119, 0.019953,
0.015849000000000002, 0.012589, 0.01))
data = np.concatenate((cdata.real, cdata.imag))
# model function for lmfit
def colecole_function(x, sig0, m, tau, c):
    w = x*2*np.pi
    denom = 1+(1j*w*tau)**c
    sig = sig0*(1.0+(m/(1.0-m))*(1-1.0/denom))
    return np.concatenate((sig.real, sig.imag))
mod = Model(colecole_function)
params = mod.make_params(sig0=0.002, m=-0.19, tau=0.05, c=0.8)
params['tau'].min = 0
result = mod.fit(data, params, x=fr)
print(result.fit_report())
You would then want to plot the results like
nf = len(fr)
plt.plot(fr, data[:nf], label='data(real)')
plt.plot(fr, result.best_fit[:nf], label='fit(real)')
and similarly
plt.plot(fr, data[nf:], label='data(imag)')
plt.plot(fr, result.best_fit[nf:], label='fit(imag)')
Note that I think you're going to want to allow m to be negative (or maybe I misunderstand your model). I did not work carefully on getting a great fit, but this should get you started.
I have installed NumPy and SciPy, but I don't quite understand their documentation about polyfit.
For example, here are my data samples:
[-0.042780748663101636, -0.0040771571786609945, -0.00506567946276074]
[0.042780748663101636, -0.0044771571786609945, -0.10506567946276074]
[0.542780748663101636, -0.005771571786609945, 0.30506567946276074]
[-0.342780748663101636, -0.0304077157178660995, 0.90506567946276074]
The first two columns are sample features, the third column is the output. My goal is to get a function that takes two parameters (the first two columns) and returns its prediction (the output).
Any simple example?
====================== EDIT ======================
Note that I need to fit something like a curve, not only straight lines. The polynomial should be something like this (n = 3):
a*x1^3 + b*x2^2 + c*x3 + d = y
Not:
a*x1 + b*x2 + c*x3 + d = y
x1, x2, x3 are features of one sample; y is the output.
Try something like the code below.
edit: added an example function that uses the results of the linear regression to estimate the output.
import numpy as np

data = np.array(
    [[-0.042780748663101636, -0.0040771571786609945, -0.00506567946276074],
     [0.042780748663101636, -0.0044771571786609945, -0.10506567946276074],
     [0.542780748663101636, -0.005771571786609945, 0.30506567946276074],
     [-0.342780748663101636, -0.0304077157178660995, 0.90506567946276074]])

coefficient = data[:, 0:2]
dependent = data[:, -1]

x, residuals, rank, s = np.linalg.lstsq(coefficient, dependent, rcond=None)

def f(x, u, v):
    return u*x[0] + v*x[1]

for datum in data:
    print(f(x, *datum[0:2]))
Which gives
>>> x
array([ 0.16991146, -30.18923739])
>>> residuals
array([ 0.07941146])
>>> rank
2
>>> s
array([ 0.64490113, 0.02944663])
and the function created with your coefficients gave
0.115817326583
0.142430900298
0.266464019171
0.859743371665
More info can be found at the documentation I posted as a comment.
edit 2: fitting your data to an arbitrary model.
edit 3: made my model a function for ease of understanding.
edit 4: made the code easier to read / changed the model to a quadratic fit, but you should be able to read this code and know how to make it minimize any residual you want now.
contrived example:
import numpy as np
from scipy.optimize import leastsq

data = np.array(
    [[-0.042780748663101636, -0.0040771571786609945, -0.00506567946276074],
     [0.042780748663101636, -0.0044771571786609945, -0.10506567946276074],
     [0.542780748663101636, -0.005771571786609945, 0.30506567946276074],
     [-0.342780748663101636, -0.0304077157178660995, 0.90506567946276074]])

coefficient = data[:, 0:2]
dependent = data[:, -1]

def model(p, x):
    a, b, c = p
    u = x[:, 0]
    v = x[:, 1]
    return a*u**2 + b*v + c

def residuals(p, y, x):
    err = y - model(p, x)
    return err

p0 = np.array([2, 3, 4])  # some initial guess
p = leastsq(residuals, p0, args=(dependent, coefficient))[0]

def f(p, x):
    return p[0]*x[0] + p[1]*x[1] + p[2]

for x in coefficient:
    print(f(p, x))
gives
-0.108798280153
-0.00470479385807
0.570237823475
0.413016072653
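(Note that f here evaluates a linear expression with the fitted parameters, while the residual was minimized for the quadratic model; to evaluate the fitted quadratic itself you could reuse model(p, coefficient).)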
I have a set of data that I am trying to fit to an ODE model using scipy's leastsq function. My ODE has parameters beta and gamma, so that it looks for example like this:
# dS/dt = -beta*S*I
# dI/dt = beta*S*I - gamma*I
# dR/dt = gamma*I
# with y0 = y(t=0) = (S(0), I(0), R(0))
The idea is to find beta and gamma so that the numerical integration of my system of ODEs best approximates the data. I am able to do this just fine using leastsq if I know all the points in my initial condition y0.
Now I am trying to do the same thing, but passing one of the entries of y0 as an extra parameter. Here is where Python and I stop communicating...
I wrote a function so that now the first entry of the parameters I pass to leastsq is the initial condition of my variable R.
I get the following message:
Traceback (most recent call last):
  File "/Users/Laura/Dropbox/SHIV/shivmodels/test.py", line 73, in <module>
    p1,success = optimize.leastsq(errfunc, initguess, args=(simpleSIR,[y0[0]],[Tx],[mydata]))
  File "/Library/Frameworks/Python.framework/Versions/7.2/lib/python2.7/site-packages/scipy/optimize/minpack.py", line 283, in leastsq
    gtol, maxfev, epsfcn, factor, diag)
TypeError: array cannot be safely cast to required type
Here is my code. It is a little more involved than it needs to be for this example because in reality I want to fit another ODE with 7 parameters and fit several data sets at once. But I wanted to post something simpler... Any help will be very much appreciated! Thank you very much!
import numpy as np
from matplotlib import pyplot as plt
from scipy import optimize
from scipy.integrate import odeint
#define the time span for the ODE integration:
Tx = np.arange(0,50,1)
num_points = len(Tx)
#define a simple ODE to fit:
def simpleSIR(y, t, params):
    dydt0 = -params[0]*y[0]*y[1]
    dydt1 = params[0]*y[0]*y[1] - params[1]*y[1]
    dydt2 = params[1]*y[1]
    dydt = [dydt0, dydt1, dydt2]
    return dydt
#generate noisy data:
y0 = [1000.,1.,0.]
beta = 12*0.06/1000.0
gamma = 0.25
myparam = [beta,gamma]
sir = odeint(simpleSIR, y0, Tx, (myparam,))
mydata0 = sir[:,0] + 0.05*(-1)**(np.random.randint(num_points,size=num_points))*sir[:,0]
mydata1 = sir[:,1] + 0.05*(-1)**(np.random.randint(num_points,size=num_points))*sir[:,1]
mydata2 = sir[:,2] + 0.05*(-1)**(np.random.randint(num_points,size=num_points))*sir[:,2]
mydata = np.array([mydata0,mydata1,mydata2]).transpose()
#define a function that will run the ode and fit it, the reason I am doing this
#is because I will use several ODE's to see which one fits the data the best.
def fitfunc(myfun, y0, Tx, params):
    myfit = odeint(myfun, y0, Tx, args=(params,))
    return myfit
#define a function that will measure the error between the fit and the real data:
def errfunc(params, myfun, y0, Tx, y):
    """
    INPUTS:
    params are the parameters for the ODE
    myfun is the function to be integrated by odeint
    y0 is the vector of initial conditions, so that y(t0) = y0
    Tx is the vector over which integration occurs; since I have several data sets and each
    one has its own vector of time points, Tx is a list of arrays.
    y is the data; it is a list of arrays since I want to fit to multiple data sets at once
    """
    res = []
    for i in range(len(y)):
        V0 = params[0][i]
        myparams = params[1:]
        initCond = np.zeros([3,])
        initCond[:2] = y0[i]
        initCond[2] = V0
        myfit = fitfunc(myfun, initCond, Tx[i], myparams)
        res.append(myfit[:,0] - y[i][:,0])
        res.append(myfit[:,1] - y[i][:,1])
        res.append(myfit[1:,2] - y[i][1:,2])
    #end for
    all_residuals = np.hstack(res).ravel()
    return all_residuals
#end errfunc
#example of the problem:
V0 = [0]
params = [V0,beta,gamma]
y0 = [1000,1]
#this is just to test that my errfunc does work well.
errfunc(params,simpleSIR,[y0],[Tx],[mydata])
initguess = [V0,0.5,0.5]
p1,success = optimize.leastsq(errfunc, initguess, args=(simpleSIR,[y0[0]],[Tx],[mydata]))
The problem is with the variable initguess. The function optimize.leastsq has the following call signature:
http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html
Its second argument, x0, has to be an array. Your list
initguess = [V0, 0.5, 0.5]
won't be converted to an array because V0 is a list instead of an int or float. So you get an error when leastsq tries to convert initguess from a list to an array.
I would adjust the variable params from
def errfunc(params, myfun, y0, Tx, y):
so that it is a 1-D array. Make the first few entries the values of V0, then append beta and gamma to that.
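For illustration, a sketch of that flattening (variable names follow the question; the slicing inside errfunc is my suggestion, not tested against the full code):
import numpy as np

V0 = [0.0]                             # one fitted initial condition per data set
initguess = np.array(V0 + [0.5, 0.5])  # flat 1-D array: [V0..., beta, gamma]

# inside errfunc, recover the pieces from the flat parameter array:
# n_sets = len(V0)
# V0_fit = params[:n_sets]     # initial conditions, one per data set
# myparams = params[n_sets:]   # beta, gamma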