Fitting a vector function with curve_fit in Scipy - python

I want to fit a function with vector output using Scipy's curve_fit (or something more appropriate if available). For example, consider the following function:
import numpy as np
def fmodel(x, a, b):
    return np.vstack([a*np.sin(b*x), a*x**2 - b*x, a*np.exp(b/x)])
Each component is a different function but they share the parameters I wish to fit. Ideally, I would do something like this:
x = np.linspace(1, 20, 50)
a = 0.1
b = 0.5
y = fmodel(x, a, b)
y_noisy = y + 0.2 * np.random.normal(size=y.shape)
from scipy.optimize import curve_fit
popt, pcov = curve_fit(f=fmodel, xdata=x, ydata=y_noisy, p0=[0.3, 0.1])
But curve_fit does not work with functions that have vector output, and the error Result from function call is not a proper array of floats. is thrown. What I did instead is to flatten out the output like this:
def fmodel_flat(x, a, b):
    return fmodel(x[0:len(x)//3], a, b).flatten()

popt, pcov = curve_fit(f=fmodel_flat, xdata=np.tile(x, 3),
                       ydata=y_noisy.flatten(), p0=[0.3, 0.1])
and this works. If instead of a vector function I am actually fitting several functions with different inputs as well, but which share model parameters, I can concatenate both the inputs and the outputs in the same way.
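For illustration, a minimal sketch of what I mean, with two made-up component functions that share (a, b) but use different inputs:

import numpy as np
from scipy.optimize import curve_fit

# two made-up component functions sharing the parameters (a, b)
def f1(x, a, b):
    return a * np.sin(b * x)

def f2(x, a, b):
    return a * x**2 - b * x

x1 = np.linspace(1, 20, 50)    # inputs for f1
x2 = np.linspace(0, 5, 30)     # different inputs for f2

def f_concat(x, a, b):
    # x is the concatenation of x1 and x2: split it, evaluate each
    # model on its own grid, and concatenate the outputs again
    return np.concatenate([f1(x[:len(x1)], a, b), f2(x[len(x1):], a, b)])

x_all = np.concatenate([x1, x2])
y_all = f_concat(x_all, 0.1, 0.5) + 0.2 * np.random.normal(size=len(x_all))
popt, pcov = curve_fit(f_concat, x_all, y_all, p0=[0.3, 0.1])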
Is there a more appropriate way to fit a vector function with SciPy, or perhaps some additional module? A main consideration for me is efficiency - the actual functions to fit are much more complex and fitting can take some time, so if this use of curve_fit is mangled and leads to excessive runtime I would like to know what I should do instead.

If I can be so blunt as to recommend my own package symfit, I think it does precisely what you need. An example on fitting with shared parameters can be found in the docs.
Your specific problem stated above would become:
import numpy as np
from symfit import variables, parameters, Model, Fit, sin, exp
x, y_1, y_2, y_3 = variables('x, y_1, y_2, y_3')
a, b = parameters('a, b')
a.value = 0.3
b.value = 0.1
model = Model({
    y_1: a * sin(b * x),
    y_2: a * x**2 - b * x,
    y_3: a * exp(b / x),
})
xdata = np.linspace(1, 20, 50)
ydata = model(x=xdata, a=0.1, b=0.5)
y_noisy = ydata + 0.2 * np.random.normal(size=(len(model), len(xdata)))
fit = Fit(model, x=xdata, y_1=y_noisy[0], y_2=y_noisy[1], y_3=y_noisy[2])
fit_result = fit.execute()
Check out the docs for more!

I think what you're doing is perfectly fine from an efficiency standpoint. I'll try to look at the implementation and come up with something more quantitative, but for the time being here is my reasoning.
What you're doing during curve fitting is optimizing the parameters (a,b) such that
res = sum_i |f(x_i; a,b)-y_i|^2
is minimal. By this I mean that you have data points (x_i,y_i) of arbitrary dimensionality, two parameters (a,b) and a fitting model that approximates the data at query points x_i.
The curve fitting algorithm starts from a starting (a,b) pair, puts this into a black box that computes the above square error, and tries to come up with a new (a',b') pair that produces a smaller error. My point is that the error above is really a black box for the fitting algorithm: the configurational space of the fitting is defined merely by the (a,b) parameters. If you imagine how you'd implement a simple curve fitting function, you could imagine that you try to do, say, a gradient descent, with the square error as cost function.
Now, it should be irrelevant to the fitting procedure how the black box computes the error. It's easy to see that the dimensionality of x_i is really irrelevant for scalar functions, since it doesn't matter if you have 1000 1d query points to fit for, or a 10x10x10 grid in 3d space. What matters is that you have 1000 points x_i for which you need to compute f(x_i) ~ y_i from the model.
The only further subtlety worth noting is that in the case of a vector-valued function, the definition of the error is not unique. In my opinion, it's fine to define the error at each point x_i using the 2-norm of the vector-valued residual. But note that in this case the square error at point x_i is
|f(x_i; a,b)-y_i|^2 == sum_k (f(x_i; a,b)[k]-y_i[k])^2
which implies that the square error of each component is accumulated. This just means that what you're doing right now is just right: by replicating your x_i points and taking each component of the function into account individually, your total square error will contain exactly the squared 2-norm of the error at each point.
So my point is that what you're doing is mathematically correct, and I wouldn't expect the behaviour of the fitting procedure to depend on how the vector-valued function is handled.
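As a quick numerical sanity check of this argument (a small sketch, assuming the fmodel and fmodel_flat definitions from the question are in scope):

import numpy as np

a, b = 0.1, 0.5
x = np.linspace(1, 20, 50)
y_noisy = fmodel(x, a, b) + 0.2 * np.random.normal(size=(3, len(x)))

# residuals of the flattened model vs. the per-component residuals
res_flat = fmodel_flat(np.tile(x, 3), a, b) - y_noisy.flatten()
res_comp = fmodel(x, a, b) - y_noisy

# both give exactly the same total square error
print(np.allclose(np.sum(res_flat**2), np.sum(res_comp**2)))   # True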

Related

How to minimize the error to a given dataset

Let's assume a function
f(x,y) = z
Now I want to choose x so that the output of f matches real data, and y decreases in equidistant steps to zero starting from 1. The output is calculated in the function f by a set of differential equations.
How can I select x so that the error to the real outputs is as small as possible? Assume I know a set of z-values, namely
f(x,1) = z_1
f(x,0.9) = z_2
f(x,0.8) = z_3
Now I want to find x such that the error to the real data z_1, z_2, z_3 is minimal.
How can one do this?
A common method of optimizing is least squares fitting, in which you basically try to find params such that the sum of squares sum_i (f(params, xdata_i) - ydata_i)^2 is minimized for given xdata and ydata. In your case: params would be x, xdata_i would be 1, 0.9 and 0.8, and ydata_i would be z_1, z_2 and z_3.
You should consider the package scipy.optimize. It's used in finding parameters for a function. I think this page gives quite a good example on how to use it.
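As a concrete sketch of that idea (the function f below is just a stand-in for the ODE-based model, and the z values are made up):

import numpy as np
from scipy.optimize import least_squares

# stand-in for the real model f(x, y); in the question this would come
# from solving a set of differential equations
def f(x, y):
    return x * y**2 + np.sin(x) * y

ys = np.array([1.0, 0.9, 0.8])       # the fixed y values
zs = np.array([1.10, 0.95, 0.82])    # measured z_1, z_2, z_3 (made up)

# residual vector f(x, y_i) - z_i; least_squares minimizes its sum of squares
def residuals(p):
    return f(p[0], ys) - zs

result = least_squares(residuals, x0=[0.5])
print(result.x)   # the x that best reproduces z_1, z_2, z_3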

Constraining OLS (or WLS) coefficients using statsmodels

I have a regression of the form model = sm.GLM(y, X, w = weight).
This ends up being a simple weighted OLS. (Note that specifying w as the error weights array actually works in sm.GLM identically to sm.WLS, despite it not being in the documentation.)
I'm using GLM because this allows me to fit with some additional constraints using fit_constrained(). My X consists of 6 independent variables, 2 of which I want to constrain to have positive resulting coefficients. But I cannot seem to figure out the syntax to get fit_constrained() to work. The documentation is extremely bare and I cannot find any good examples anywhere. All I really need is the correct syntax for inputting these constraints. Thanks!
The function you mention is meant for linear constraints, that is, a combination of your coefficients fulfilling some linear equalities; it is not meant for defining bounds.
The closest you can get is to use scipy's least squares and define the bounds. For example, we set up a dataset with 6 independent variables:
from scipy.optimize import least_squares
import numpy as np
np.random.seed(100)
x = np.random.uniform(0,1,(30,6))
y = np.random.normal(0,2,30)
The function basically does the matrix multiplication and returns the error (residuals):
def fun(b, x, y):
    return b[0] + np.matmul(x, b[1:]) - y
The first coefficient is the intercept. Let's say we require the coefficients of the 1st and 6th variables (the 2nd and last entries of b) to be non-negative:
res_lsq = least_squares(fun, [1, 1, 1, 1, 1, 1, 1], args=(x, y),
                        bounds=([-np.inf, 0, -np.inf, -np.inf, -np.inf, -np.inf, 0], np.inf))
And we check the result:
res_lsq.x
array([-1.74342242e-01,  2.09521327e+00, -2.02132481e-01,  2.06247855e+00,
       -3.65963504e+00,  6.52264332e-01,  5.33657765e-20])
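If you then want to use the fitted coefficients, e.g. for predictions, something along these lines should work (a small sketch based on the fun defined above):

intercept, slopes = res_lsq.x[0], res_lsq.x[1:]
y_pred = intercept + np.matmul(x, slopes)   # same linear model as in fun

# the bounded entries respect the constraint: the 2nd entry of res_lsq.x is
# positive on its own, while the last one is pushed to (numerically) zero
print(res_lsq.x[1], res_lsq.x[-1])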

How to prioritise some points over others using curve fit from SciPy

I want to model the following curve:
To do this, I'm using curve_fit from SciPy to fit an exponential function.
def exponenial_func(x, a, b, c):
    return a * b**(c*x)

popt, pcov = curve_fit(exponenial_func, x, y, p0=(1, 2, 2),
                       bounds=((0, 0, 0), (np.inf, np.inf, np.inf)))
When I first do it, I get this:
This minimises the residuals, giving each point the same level of importance.
What I want, is to get a curve that gives more importance to the last values of the curve (from x-axis 30, for example) than to the first values, so it fits better in the end of the curve than in the beginning of it.
I know that from here there are many ways to approach this (first of all, define what is the importance that I want to give to each of the residuals). My question here, is to get some idea of how to approach this.
One idea I had is to use the sigma argument to weight each data point by the inverse of its value.
popt, pcov = curve_fit(exponenial_func, x, y, p0=(1, 2, 2),
                       bounds=((0, 0, 0), (np.inf, np.inf, np.inf)),
                       sigma=1/y)
In this case, I get something like I was looking for:
It doesn't look bad, but I'm looking for another way of doing this, so that I can "control" the weight of each data point, e.g. weighting the residuals linearly, or exponentially, or even choosing the weights manually (rather than weighting all of them by the inverse, as in the previous case).
Thanks in advance
First of all, note that there's no need for three coefficients: since
a * b**(c*x) = a * exp(log(b)*c*x),
we can define k = log(b)*c and fit only a and k.
Here's a suggestion for how you could tackle your problem by hand with scipy.optimize.least_squares and a priority vector:
import numpy as np
from scipy.optimize import least_squares
def exponenial_func2(x, a, k):
    return a * np.exp(k*x)

# returns the vector of residuals
def fitwrapper2(coeffs, *args):
    xdata, ydata, prio = args
    return prio*(exponenial_func2(xdata, *coeffs)-ydata)
# Data
n = 31
xdata = np.arange(n)
ydata = np.array([155.0,229,322,453,655,888,1128,1694,
                  2036,2502,3089,3858,4636,5883,7375,
                  9172,10149,12462,12462,17660,21157,
                  24747,27980,31506,35713,41035,47021,
                  53578,59138,63927,69176])
# The priority vector
prio = np.ones(n)
prio[-1] = 5
res = least_squares(fitwrapper2, x0=[1.0,2.0], bounds=(0,np.inf), args=(xdata,ydata,prio))
With prio[-1] = 5 we give the last point a high priority.
res.x contains your optimal coefficients. Here a, k = res.x.
Note that for prio = np.ones(n) it's a normal least squares fitting (like curve_fit does) where all points have the same priority.
You can control the priority of each point by increasing its value in the prio array. Comparing both results gives me:
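If you prefer to stay with curve_fit, you can get the same kind of control through its sigma argument: the residuals are divided by sigma in the objective, so passing sigma = 1/prio reproduces the priority weighting above as far as the fitted parameters are concerned, and you can build the weights any way you like. A small sketch (the weighting schemes below are just examples), assuming exponenial_func2, xdata and ydata from above are in scope:

import numpy as np
from scipy.optimize import curve_fit

weights = np.linspace(1, 5, len(xdata))             # linearly increasing importance
# weights = np.exp(np.linspace(0, 2, len(xdata)))   # or exponentially increasing
# weights = np.ones(len(xdata)); weights[-1] = 5    # or set single points by hand

popt, pcov = curve_fit(exponenial_func2, xdata, ydata, p0=(150, 0.2),
                       sigma=1/weights, bounds=(0, np.inf))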

SciPy + Numpy: Finding the slope of a sigmoid curve

I have some data that follow a sigmoid distribution as you can see in the following image:
After normalizing and scaling my data, I have fitted the curve at the bottom using scipy.optimize.curve_fit and some initial parameters:
popt, pcov = curve_fit(sigmoid_function, xdata, ydata, p0 = [0.05, 0.05, 0.05])
>>> print popt
[ 2.82019932e+02 -1.90996563e-01 5.00000000e-02]
So popt, according to the documentation, returns "Optimal values for the parameters so that the sum of the squared error of f(xdata, *popt) - ydata is minimized". I understand from this that curve_fit does not calculate the slope, because I do not think the slope of this gentle curve is 282, nor is it negative.
Then I tried with scipy.optimize.leastsq, because the documentation says it returns "The solution (or the result of the last iteration for an unsuccessful call).", so I thought the slope would be returned. Like this:
p, cov, infodict, mesg, ier = leastsq(residuals, p_guess, args = (nxdata, nydata), full_output=True)
>>> print p
Param(x0=281.73193626250207, y0=-0.012731420027056234, c=1.0069006606656596, k=0.18836680131910222)
But again, I did not get what I expected. curve_fit and leastsq returned almost the same values, which is not surprising I guess, as curve_fit uses an implementation of the least squares method internally to find the curve. But no slope back... unless I overlooked something.
So, how to calculate the slope in a point, say, where X = 285 and Y = 0.5?
I am trying to avoid manual methods, like calculating the derivative from, say, (285.5, 0.55) and (284.5, 0.45) by subtracting and dividing the results, and so on. I would like to know if there is a more automatic method for this.
Thank you all!
EDIT #1
This is my "sigmoid_function", used by curve_fit and leastsq methods:
def sigmoid_function(xdata, x0, k, p0): # p0 not used anymore, only its components (x0, k)
    # This function is called by two different methods: curve_fit and leastsq,
    # the latter through the function "residuals". I don't know if it makes sense
    # to use a single function for two (somewhat similar) methods, but there
    # it goes.
    # p0:
    #   + Is the initial parameter for scipy.optimize.curve_fit.
    #   + For the residuals calculation it is left empty.
    #   + It is initialized to [0.05, 0.05, 0.05].
    # x0:
    #   + Is the convergence parameter in the X-axis and also the shift.
    #   + It starts at 0.05 and ends up being around ~282 (days in a year).
    # k:
    #   + Set up either by curve_fit or leastsq.
    #   + In least squares it is initially fixed at 0.5 and in curve_fit
    #     at 0.05. Why? I just did this in two different ways and
    #     it seems to be working.
    #   + But honestly, I have no clue what it represents.
    # xdata:
    #   + Positions on the X-axis. In this case from 240 to 365.
    # Finally I changed those parameters as suggested in the answer.
    # The sigmoid curve has 2 degrees of freedom, therefore the initial
    # guess only needs to be this size. In this case, p0 = [282, 0.5].
    y = np.exp(-k*(xdata-x0)) / (1 + np.exp(-k*(xdata-x0)))
    return y

def residuals(p_guess, xdata, ydata):
    # For the residuals calculation, there is no need to set up the initial parameters.
    # After fixing the initial guess and the sigmoid_function header, remove [].
    # return ydata - sigmoid_function(xdata, p_guess[0], p_guess[1], [])
    return ydata - sigmoid_function(xdata, p_guess[0], p_guess[1], [])
I am sorry if I made mistakes while describing the parameters or confused technical terms. I am very new with numpy and I have not studied maths for years, so I am catching up again.
So, again, what is your advice for calculating the slope at X = 285, Y = 0.5 (more or less the midpoint) for this dataset? Thanks!!
EDIT #2
Thanks to Oliver W., I updated my code as he suggested and understood a bit better the problem.
There is a final detail I do not fully get. Apparently, curve_fit returns a popt array (x0, k) with the optimum parameters for the fitting:
x0 seems to indicate how shifted the curve is, by giving the central point of the curve
the k parameter is related to the slope at y = 0.5, i.e. at the center of the curve (I think!)
If the sigmoid function is a growing one, why is the derivative/slope in popt negative? Does that make sense?
I used sigmoid_derivative to calculate the slope and, yes, I obtained the same results as popt but with a positive sign.
# Year 2003, 2005, 2007. Slope in midpoint.
k = [-0.1910, -0.2545, -0.2259] # Values coming from popt
slope = [0.1910, 0.2545, 0.2259] # Values coming from sigmoid_derivative function
I know this is being a bit picky because I could use either. The relevant data is in there, just with a negative sign, but I was wondering why this is happening.
So the calculation of the derivative function, as you suggested, is only required if I need to know the slope at points other than y = 0.5; for the midpoint I can use popt.
Thanks for your help, it saved me a lot of time. :-)
You're never using the parameter p0 you're passing to your sigmoid function. Hence, curve fitting will not have any good measure to find convergence, because it can take any value for this parameter. You should first rewrite your sigmoid function like this:
def sigmoid_function(xdata, x0, k):
    y = np.exp(-k*(xdata-x0)) / (1 + np.exp(-k*(xdata-x0)))
    return y
This means your model (the sigmoid) has only two degrees of freedom. This will be returned in popt:
initial_guess = [282, 1] # (x0, k): at x0, the sigmoid reaches 50%, k is slope related
popt, pcov = curve_fit(sigmoid_function, xdata, ydata, p0=initial_guess)
Now popt will be a tuple (or array of 2 values), being the best possible x0 and k.
To get the slope of this function at any point, to be honest, I would just calculate the derivative symbolically as the sigmoid is not such a hard function. You will end up with:
def sigmoid_derivative(x, x0, k):
    # derivative of f/(1+f) with f = exp(-k*(x-x0)); equivalently -k*y*(1-y)
    f = np.exp(-k*(x-x0))
    return -k*f / (1 + f)**2
If you have the results from your curve fitting stored in popt, you could pass this easily to this function:
print(sigmoid_derivative(285, *popt))
which will return the derivative at x=285. But because you ask specifically about the midpoint, i.e. where x == x0 and y == 0.5, you'll see from sigmoid_derivative that f == 1 there, so the derivative is just -k/4, which you can read off directly from the curve_fit output you've already obtained. In the output you've shown, |k| is about 0.19, so the magnitude of the midpoint slope is about 0.05.
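As a quick numerical sanity check of the symbolic derivative (a small self-contained sketch; the parameter values below are made up, just of the same order as the fit, not your actual data):

import numpy as np

def sigmoid_function(xdata, x0, k):
    return np.exp(-k*(xdata-x0)) / (1 + np.exp(-k*(xdata-x0)))

def sigmoid_derivative(x, x0, k):
    f = np.exp(-k*(x-x0))
    return -k*f / (1 + f)**2

x0, k = 282.0, 0.19      # made-up values, roughly the order of the fitted ones
x = 285.0
h = 1e-5                 # central finite difference as an independent check
finite_diff = (sigmoid_function(x+h, x0, k) - sigmoid_function(x-h, x0, k)) / (2*h)
print(sigmoid_derivative(x, x0, k), finite_diff)   # the two agree closely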

How to force polyfit with second degree to a y-intercept of 0

I've been using the numpy.polyfit function to do some forecasting. If I put in a degree of 1, it works, but I need to do a second degree polynomial fit. In some cases it works, in other cases the plot of the prediction goes down and then goes up forever. For example:
import matplotlib.pyplot as plt
from numpy import *
x=[1,2,3,4,5,6,7,8,9,10]
y=[100,85,72,66,52,48,39,33,29,32]
fit = polyfit(x, y, 2)   # second degree fit
fitfunction = poly1d(fit)
to_predict=[1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]
plt.plot(to_predict,fitfunction(to_predict))
plt.show()
After I run that, this shows up (I tried putting a picture up but stackoverflow won't let me).
I want to force it to go through zero.
How would I do that?
If you don't need the fit's error to be computed using the original least-squares formula (i.e. minimizing sum_i |y_i - (a*x_i^2 + b*x_i)|^2), you could try to perform a linear fit of y/x instead, because (a*x^2 + b*x)/x = a*x + b.
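A minimal sketch of that y/x approach, reusing the x and y values from the question:

import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], dtype=float)
y = np.array([100, 85, 72, 66, 52, 48, 39, 33, 29, 32], dtype=float)

# fit y/x = a*x + b, i.e. y = a*x**2 + b*x (no constant term)
a, b = np.polyfit(x, y / x, 1)
fitfunction = np.poly1d([a, b, 0])   # a*x**2 + b*x + 0

to_predict = np.arange(1, 21)
print(fitfunction(to_predict))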
If you must use the same error metric, construct the coefficient matrices directly and use numpy.linalg.lstsq:
x = numpy.asarray(x)   # make sure x is an array so that x*x works elementwise
coeff = numpy.transpose([x*x, x])   # columns x^2 and x; no constant column
((a, b), _, _, _) = numpy.linalg.lstsq(coeff, y, rcond=None)
polynomial = numpy.poly1d([a, b, 0])
(Note that your provided data sequence does not look like a parabola having a y-intercept of 0.)
If anyone has to do this under a deadline, a quick solution is to just add a bunch of extra points at (0, 0) to skew the weighting. I did this:
for i in range(0, 100):
    x_vent.insert(i, 0)
    y_vent.insert(i, 0)
slope_vent, intercept_vent = np.polyfit(x_vent, y_vent, 1)
