Fitting a two-layer model to wind profile data in Python

I'm trying to fit a model to my dataset of wind profiles, i.e. wind speed values u(z) at different altitudes z.
The model consists of two parts, which I for now simplified to:
u(z) = ust/k * ln(z/z0) for z < zsl
u(z) = a*z + b for z > zsl
In the logarithmic model, ust and z0 are free parameters while k is fixed. zsl is the height of the surface layer, which is also not known a priori.
I want to fit this model to my data, and I have already tried different approaches. The best result I'm getting so far is with:
def two_layer(z, hsl, ust, z0, a, b):
    return ust/0.4*(np.log(z/z0)) if z < hsl else a*z + b

two_layer = np.vectorize(two_layer)

def two_layer_err(p, z, u):
    return two_layer(z, *p) - u
popt, pcov, infodict, mesg, ier = optimize.leastsq(
    two_layer_err, [150., 0.3, 0.002, 0.1, 10.], (wspd_hgt, prof), full_output=1)
# wspd_hgt are my measurement heights and
# prof are the corresponding wind speed values
This gives me reasonable estimates for all parameters, except for zsl, which is not changed during the fitting procedure. I guess this has to do with the fact that it is used as a threshold rather than as a function parameter. Is there any way I could let zsl vary during the optimization?
I tried something with numpy.piecewise, but that didn't work very well, perhaps because I don't really understand it very well, or I might be completely off here because it's not suitable for my case. A sketch of the piecewise version is shown below.
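For reference, a piecewise version of the model looks something like this (a sketch only):

import numpy as np

def two_layer_pw(z, hsl, ust, z0, a, b):
    # np.piecewise evaluates each function only on the points
    # selected by the matching condition
    return np.piecewise(z, [z < hsl, z >= hsl],
                        [lambda z: ust/0.4*np.log(z/z0),
                         lambda z: a*z + b])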
To give an idea, the wind profile looks like this when the axes are reversed (z plotted versus u): [plot not shown]

I think I finally have a solution for this type of problem, which I came across while answering a similar question.
The solution seems to be to implement a constraint demanding that u1 == u2 at the switch between the two models. Since I cannot try it on your model (no data was posted), I will instead show how it works for another model, and you can adapt it to your situation. I solved this using symfit, a scipy wrapper I wrote to make fitting such problems more pythonic. But you could do the same using the SLSQP algorithm in scipy if you prefer.
from symfit import parameters, variables, Fit, Piecewise, exp, Eq
import numpy as np
import matplotlib.pyplot as plt
t, y = variables('t, y')
m, c, d, k, t0 = parameters('m, c, d, k, t0')
# Help the fit by bounding the switchpoint between the models
t0.min = 0.6
t0.max = 0.9
# Make a piecewise model
y1 = m * t + c
y2 = d * exp(- k * t)
model = {y: Piecewise((y1, t <= t0), (y2, t > t0))}
# As a constraint, we demand equality between the two models at the point t0
# Substitutes t in the components by t0
constraints = [Eq(y1.subs({t: t0}), y2.subs({t: t0}))]
# Read the data
tdata, ydata = np.genfromtxt('Experimental Data.csv', delimiter=',', skip_header=1).T
fit = Fit(model, t=tdata, y=ydata, constraints=constraints)
fit_result = fit.execute()
print(fit_result)
plt.scatter(tdata, ydata)
plt.plot(tdata, fit.model(t=tdata, **fit_result.params).y)
plt.show()
I think you should be able to adapt this example to your situation!
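For instance, a rough (untested) adaptation to the wind profile from the question might look like this, with wspd_hgt and prof as in the question and made-up placeholder bounds on zsl:

from symfit import parameters, variables, Fit, Piecewise, log, Eq

z, u = variables('z, u')
ust, z0, a, b, zsl = parameters('ust, z0, a, b, zsl')
# bound the switchpoint, as above (these numbers are placeholders)
zsl.min, zsl.max = 50., 500.
k = 0.4  # von Karman constant, fixed

u_log = ust / k * log(z / z0)
u_lin = a * z + b
model = {u: Piecewise((u_log, z <= zsl), (u_lin, z > zsl))}
# demand continuity at the surface-layer height
constraints = [Eq(u_log.subs({z: zsl}), u_lin.subs({z: zsl}))]

fit = Fit(model, z=wspd_hgt, u=prof, constraints=constraints)
fit_result = fit.execute()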
Edit: as requested in the comment, it is possible to demand matching derivatives as well. In order to do this, the following additional code is needed for the above example:
from symfit import Derivative
dy1dt = Derivative(y1, t)
dy2dt = Derivative(y2, t)
constraints = [
    Eq(y1.subs({t: t0}), y2.subs({t: t0})),
    Eq(dy1dt.subs({t: t0}), dy2dt.subs({t: t0}))
]
That should do the trick! So from a programming point of view it is very doable, although depending on the model this might not actually have a solution.
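For completeness, the equivalent constrained fit in plain scipy uses minimize with method='SLSQP' and an equality constraint (a rough sketch under the same assumptions, with tdata and ydata as read above; the starting values are guesses):

import numpy as np
from scipy.optimize import minimize

def y1(t, m, c):
    return m*t + c

def y2(t, d, k):
    return d*np.exp(-k*t)

def sse(p, t, y):
    m, c, d, k, t0 = p
    pred = np.where(t <= t0, y1(t, m, c), y2(t, d, k))
    return np.sum((pred - y)**2)

# equality constraint: both pieces must agree at the switchpoint t0
def continuity(p):
    m, c, d, k, t0 = p
    return y1(t0, m, c) - y2(t0, d, k)

res = minimize(sse, x0=[1.0, 0.0, 1.0, 1.0, 0.75], args=(tdata, ydata),
               method='SLSQP',
               bounds=[(None, None)]*4 + [(0.6, 0.9)],
               constraints=[{'type': 'eq', 'fun': continuity}])
m, c, d, k, t0 = res.x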

Related

Finding a better scipy curve_fit! -- exponential function?

I am trying to fit data using scipy's curve_fit. I believe that a negative exponential is probably best, as this works well for some of my other (similarly generated) data - but I am achieving sub-optimal results.
I've normalized the dataset to avoid supplying initial values and am applying an exponential function as follows:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
data = np.array([[0.32,0.38],[0.61,0.32],[0.28,0.50],[0.60,0.32],[0.26,0.45],[0.19,0.57],[0.61,0.32],[0.59,0.29],[0.39,0.42],[0.61,0.32],[0.20,0.46],[0.24,0.45],[0.59,0.29],[0.39,0.42],[0.56,0.39],[0.32,0.43],[0.38,0.44],[0.54,0.34],[0.61,0.32],[0.20,0.46],[0.28,0.51],[0.54,0.34],[0.60,0.32],[0.30,0.42],[0.28,0.43],[0.14,0.57],[0.24,0.54],[0.39,0.42],[0.20,0.56],[0.56,0.39],[0.24,0.54],[0.33,0.37],[0.33,0.51],[0.20,0.46],[0.32,0.39],[0.20,0.56],[0.19,0.57],[0.32,0.39],[0.30,0.42],[0.33,0.50],[0.54,0.34],[0.28,0.50],[0.32,0.39],[0.28,0.43],[0.27,0.42],[0.56,0.39],[0.19,0.57],[0.19,0.57],[0.60,0.32],[0.44,0.41],[0.27,0.42],[0.19,0.57],[0.24,0.38],[0.24,0.54],[0.61,0.32],[0.39,0.40],[0.30,0.41],[0.19,0.57],[0.14,0.57],[0.32,0.43],[0.14,0.57],[0.59,0.29],[0.44,0.41],[0.30,0.41],[0.32,0.38],[0.61,0.32],[0.20,0.46],[0.20,0.56],[0.30,0.41],[0.33,0.36],[0.14,0.57],[0.19,0.57],[0.46,0.38],[0.36,0.44],[0.61,0.32],[0.31,0.48],[0.60,0.32],[0.39,0.40],[0.14,0.57],[0.44,0.41],[0.24,0.49],[0.41,0.40],[0.19,0.57],[0.19,0.57],[0.31,0.49],[0.31,0.43],[0.35,0.35],[0.20,0.46],[0.54,0.34],[0.20,0.56],[0.39,0.44],[0.33,0.36],[0.20,0.56],[0.30,0.41],[0.56,0.39],[0.31,0.48],[0.28,0.51],[0.14,0.57],[0.61,0.32],[0.30,0.50],[0.20,0.56],[0.19,0.57],[0.59,0.31],[0.20,0.56],[0.27,0.42],[0.29,0.48],[0.56,0.39],[0.32,0.39],[0.20,0.56],[0.59,0.29],[0.24,0.49],[0.56,0.39],[0.60,0.32],[0.35,0.35],[0.28,0.50],[0.46,0.38],[0.14,0.57],[0.54,0.34],[0.32,0.38],[0.26,0.45],[0.26,0.45],[0.39,0.42],[0.19,0.57],[0.28,0.51],[0.27,0.42],[0.33,0.50],[0.54,0.34],[0.39,0.40],[0.19,0.57],[0.33,0.36],[0.22,0.44],[0.33,0.51],[0.61,0.32],[0.28,0.51],[0.25,0.50],[0.39,0.40],[0.34,0.35],[0.59,0.31],[0.31,0.49],[0.20,0.46],[0.39,0.46],[0.20,0.50],[0.32,0.39],[0.30,0.41],[0.23,0.44],[0.29,0.53],[0.28,0.50],[0.31,0.48],[0.61,0.32],[0.54,0.34],[0.28,0.53],[0.56,0.39],[0.19,0.57],[0.14,0.57],[0.59,0.29],[0.29,0.48],[0.44,0.41],[0.27,0.51],[0.50,0.29],[0.14,0.57],[0.60,0.32],[0.32,0.39],[0.19,0.57],[0.24,0.38],[0.56,0.39],[0.14,0.57],[0.54,0.34],[0.61,0.38],[0.27,0.53],[0.20,0.46],[0.61,0.32],[0.27,0.42],[0.27,0.42],[0.20,0.56],[0.30,0.41],[0.31,0.51],[0.32,0.39],[0.31,0.51],[0.29,0.48],[0.20,0.46],[0.33,0.51],[0.31,0.43],[0.30,0.41],[0.27,0.44],[0.31,0.51],[0.29,0.48],[0.35,0.35],[0.46,0.38],[0.28,0.51],[0.61,0.38],[0.31,0.49],[0.33,0.51],[0.59,0.29],[0.14,0.57],[0.31,0.51],[0.39,0.40],[0.32,0.39],[0.20,0.56],[0.55,0.31],[0.56,0.39],[0.24,0.49],[0.56,0.39],[0.27,0.50],[0.60,0.32],[0.54,0.34],[0.19,0.57],[0.28,0.51],[0.54,0.34],[0.56,0.39],[0.19,0.57],[0.59,0.31],[0.37,0.45],[0.19,0.57],[0.44,0.41],[0.32,0.43],[0.35,0.48],[0.24,0.49],[0.26,0.45],[0.14,0.57],[0.59,0.30],[0.26,0.45],[0.26,0.45],[0.14,0.57],[0.20,0.50],[0.31,0.45],[0.27,0.51],[0.30,0.41],[0.19,0.57],[0.30,0.41],[0.27,0.50],[0.34,0.35],[0.30,0.42],[0.27,0.42],[0.27,0.42],[0.34,0.35],[0.35,0.35],[0.14,0.57],[0.45,0.36],[0.26,0.45],[0.56,0.39],[0.34,0.35],[0.19,0.57],[0.30,0.41],[0.19,0.57],[0.26,0.45],[0.26,0.45],[0.59,0.29],[0.19,0.57],[0.26,0.45],[0.32,0.39],[0.30,0.50],[0.28,0.50],[0.32,0.39],[0.59,0.29],[0.32,0.51],[0.56,0.39],[0.59,0.29],[0.61,0.38],[0.33,0.51],[0.22,0.44],[0.33,0.36],[0.27,0.42],[0.20,0.56],[0.28,0.51],[0.31,0.48],[0.20,0.56],[0.61,0.32],[0.24,0.54],[0.59,0.29],[0.32,0.43],[0.61,0.32],[0.19,0.57],[0.61,0.38],[0.55,0.31],[0.19,0.57],[0.31,0.46],[0.32,0.52],[0.30,0.41],[0.28,0.51],[0.28,0.50],[0.60,0.32],[0.61,0.32],[0.27,0.50],[0.59,0.29],[0.41,0.47],[0.39,0.42],[0.20,0.46],[0.19,0.57],[0.14,0.57],[0.23,0.47],[0.54,0.34],[0.28,0.51],[0.19,0.57],[0.33,0.37],[0.46,0.38
],[0.27,0.42],[0.20,0.56],[0.39,0.42],[0.30,0.47],[0.26,0.45],[0.61,0.32],[0.61,0.38],[0.35,0.35],[0.14,0.57],[0.35,0.35],[0.28,0.51],[0.61,0.32],[0.24,0.54],[0.54,0.34],[0.28,0.43],[0.24,0.54],[0.30,0.41],[0.56,0.39],[0.23,0.52],[0.14,0.57],[0.26,0.45],[0.30,0.42],[0.32,0.43],[0.19,0.57],[0.45,0.36],[0.27,0.42],[0.29,0.48],[0.28,0.43],[0.27,0.51],[0.39,0.44],[0.32,0.49],[0.24,0.49],[0.56,0.39],[0.20,0.56],[0.30,0.42],[0.24,0.38],[0.46,0.38],[0.28,0.50],[0.26,0.45],[0.27,0.50],[0.23,0.47],[0.39,0.42],[0.28,0.51],[0.24,0.49],[0.27,0.42],[0.26,0.45],[0.60,0.32],[0.32,0.43],[0.39,0.42],[0.28,0.50],[0.28,0.52],[0.61,0.32],[0.32,0.39],[0.24,0.50],[0.39,0.40],[0.33,0.36],[0.24,0.38],[0.54,0.33],[0.19,0.57],[0.61,0.32],[0.33,0.36],[0.19,0.57],[0.30,0.41],[0.19,0.57],[0.34,0.35],[0.24,0.42],[0.27,0.42],[0.54,0.34],[0.54,0.34],[0.24,0.49],[0.27,0.42],[0.56,0.39],[0.19,0.57],[0.20,0.50],[0.14,0.57],[0.30,0.41],[0.30,0.41],[0.33,0.36],[0.26,0.45],[0.26,0.45],[0.23,0.47],[0.32,0.39],[0.27,0.53],[0.30,0.41],[0.20,0.46],[0.34,0.35],[0.34,0.35],[0.14,0.57],[0.46,0.38],[0.27,0.42],[0.36,0.44],[0.17,0.51],[0.60,0.32],[0.27,0.42],[0.20,0.56],[0.24,0.49],[0.41,0.40],[0.61,0.38],[0.19,0.57],[0.28,0.50],[0.23,0.52],[0.61,0.32],[0.39,0.46],[0.33,0.51],[0.19,0.57],[0.39,0.44],[0.56,0.39],[0.35,0.35],[0.28,0.43],[0.54,0.34],[0.36,0.44],[0.14,0.57],[0.61,0.38],[0.46,0.38],[0.61,0.32],[0.19,0.57],[0.54,0.34],[0.27,0.53],[0.33,0.51],[0.31,0.51],[0.59,0.29],[0.24,0.42],[0.28,0.43],[0.56,0.39],[0.28,0.50],[0.61,0.32],[0.29,0.48],[0.20,0.46],[0.50,0.29],[0.56,0.39],[0.20,0.50],[0.24,0.38],[0.32,0.39],[0.32,0.43],[0.28,0.50],[0.22,0.44],[0.20,0.56],[0.27,0.42],[0.61,0.38],[0.31,0.49],[0.20,0.46],[0.27,0.42],[0.24,0.38],[0.61,0.32],[0.26,0.45],[0.23,0.44],[0.59,0.30],[0.56,0.39],[0.33,0.44],[0.27,0.42],[0.31,0.51],[0.27,0.53],[0.32,0.39],[0.28,0.51],[0.30,0.42],[0.46,0.38],[0.27,0.42],[0.30,0.47],[0.39,0.40],[0.28,0.43],[0.30,0.42],[0.32,0.39],[0.59,0.31],[0.36,0.44],[0.54,0.34],[0.34,0.35],[0.30,0.41],[0.32,0.49],[0.32,0.43],[0.31,0.51],[0.32,0.52],[0.60,0.32],[0.19,0.57],[0.41,0.47],[0.32,0.39],[0.28,0.43],[0.28,0.51],[0.32,0.51],[0.56,0.39],[0.24,0.45],[0.55,0.31],[0.24,0.43],[0.61,0.38],[0.33,0.51],[0.30,0.41],[0.32,0.47],[0.32,0.38],[0.33,0.51],[0.39,0.40],[0.19,0.57],[0.27,0.42],[0.54,0.33],[0.59,0.29],[0.28,0.51],[0.61,0.38],[0.19,0.57],[0.30,0.41],[0.14,0.57],[0.32,0.39],[0.34,0.35],[0.54,0.34],[0.24,0.54],[0.56,0.39],[0.24,0.49],[0.61,0.32],[0.61,0.38],[0.61,0.32],[0.19,0.57],[0.14,0.57],[0.54,0.34],[0.59,0.29],[0.28,0.43],[0.19,0.57],[0.61,0.32],[0.32,0.43],[0.29,0.48],[0.56,0.39],[0.19,0.57],[0.56,0.39],[0.59,0.29],[0.59,0.29],[0.59,0.30],[0.14,0.57],[0.23,0.44],[0.28,0.50],[0.29,0.48],[0.31,0.45],[0.27,0.51],[0.24,0.45],[0.61,0.38],[0.24,0.49],[0.14,0.57],[0.61,0.32],[0.39,0.40],[0.33,0.44],[0.54,0.33],[0.33,0.51],[0.20,0.50],[0.19,0.57],[0.25,0.50],[0.28,0.43],[0.17,0.51],[0.19,0.57],[0.27,0.42],[0.20,0.56],[0.24,0.38],[0.19,0.57],[0.28,0.50],[0.28,0.50],[0.27,0.42],[0.26,0.45],[0.39,0.42],[0.23,0.47],[0.28,0.43],[0.32,0.39],[0.32,0.39],[0.24,0.54],[0.33,0.36],[0.29,0.53],[0.27,0.42],[0.44,0.41],[0.27,0.42],[0.33,0.36],[0.24,0.43],[0.61,0.38],[0.20,0.50],[0.55,0.31],[0.31,0.46],[0.60,0.32],[0.30,0.41],[0.41,0.47],[0.39,0.40],[0.27,0.53],[0.61,0.38],[0.46,0.38],[0.28,0.43],[0.44,0.41],[0.35,0.35],[0.24,0.49],[0.31,0.43],[0.27,0.42],[0.61,0.38],[0.29,0.48],[0.54,0.34],[0.61,0.32],[0.20,0.56],[0.24,0.49],[0.39,0.40],[0.27,0.42],[0.59,0.29],[0.59,0.29],[0.19,0.57],[0.24,0.54],[0.59,0.31],[0.24,0.38],[0.33,0.51],[
0.23,0.44],[0.20,0.46],[0.24,0.45],[0.29,0.48],[0.28,0.50],[0.61,0.32],[0.19,0.57],[0.22,0.44],[0.19,0.57],[0.39,0.44],[0.19,0.57],[0.28,0.50],[0.30,0.41],[0.44,0.41],[0.28,0.52],[0.28,0.43],[0.54,0.33],[0.28,0.50],[0.19,0.57],[0.14,0.57],[0.30,0.41],[0.26,0.45],[0.56,0.39],[0.27,0.51],[0.20,0.46],[0.24,0.38],[0.32,0.38],[0.26,0.45],[0.61,0.32],[0.59,0.29],[0.19,0.57],[0.43,0.45],[0.14,0.57],[0.35,0.35],[0.56,0.39],[0.34,0.35],[0.19,0.57],[0.56,0.39],[0.27,0.42],[0.19,0.57],[0.60,0.32],[0.24,0.54],[0.54,0.34],[0.61,0.38],[0.33,0.51],[0.27,0.42],[0.32,0.39],[0.34,0.35],[0.20,0.56],[0.26,0.45],[0.32,0.51],[0.33,0.51],[0.35,0.35],[0.31,0.43],[0.56,0.39],[0.59,0.29],[0.28,0.43],[0.30,0.42],[0.27,0.44],[0.28,0.53],[0.29,0.48],[0.33,0.51],[0.60,0.32],[0.54,0.33],[0.19,0.57],[0.33,0.49],[0.30,0.41],[0.54,0.34],[0.27,0.53],[0.19,0.57],[0.19,0.57],[0.32,0.39],[0.20,0.56],[0.35,0.35],[0.30,0.42],[0.46,0.38],[0.54,0.34],[0.54,0.34],[0.14,0.57],[0.33,0.51],[0.32,0.39],[0.14,0.57],[0.59,0.29],[0.59,0.31],[0.30,0.41],[0.26,0.45],[0.32,0.38],[0.32,0.39],[0.59,0.31],[0.20,0.56],[0.20,0.46],[0.29,0.48],[0.59,0.29],[0.39,0.40],[0.28,0.50],[0.32,0.39],[0.28,0.53],[0.44,0.41],[0.20,0.50],[0.24,0.49],[0.20,0.46],[0.28,0.52],[0.24,0.50],[0.32,0.43],[0.39,0.40],[0.38,0.44],[0.60,0.32],[0.54,0.33],[0.61,0.32],[0.19,0.57],[0.59,0.29],[0.33,0.49],[0.28,0.43],[0.24,0.38],[0.30,0.41],[0.27,0.51],[0.35,0.48],[0.61,0.32],[0.43,0.45],[0.20,0.50],[0.24,0.49],[0.20,0.50],[0.20,0.56],[0.29,0.48],[0.14,0.57],[0.14,0.57],[0.26,0.45],[0.26,0.45],[0.39,0.40],[0.33,0.36],[0.56,0.39],[0.59,0.29],[0.27,0.42],[0.35,0.35],[0.30,0.41],[0.20,0.50],[0.19,0.57],[0.29,0.48],[0.39,0.42],[0.37,0.45],[0.30,0.41],[0.20,0.56],[0.30,0.42],[0.41,0.47],[0.28,0.43],[0.14,0.57],[0.27,0.53],[0.32,0.39],[0.30,0.41],[0.34,0.35],[0.32,0.47],[0.33,0.51],[0.20,0.56],[0.56,0.39],[0.60,0.32],[0.28,0.52],[0.56,0.39],[0.44,0.41],[0.27,0.42],[0.00,1.00],[0.29,0.49],[0.89,0.06],[0.22,0.66],[0.18,0.70],[0.67,0.22],[0.14,0.79],[0.58,0.17],[0.67,0.12],[0.95,0.05],[0.46,0.26],[0.15,0.54],[0.16,0.67],[0.48,0.31],[0.41,0.29],[0.18,0.66],[0.10,0.71],[0.11,0.72],[0.65,0.15],[0.94,0.03],[0.17,0.67],[0.44,0.29],[0.32,0.38],[0.79,0.10],[0.52,0.26],[0.25,0.59],[0.89,0.04],[0.69,0.13],[0.43,0.34],[0.75,0.07],[0.16,0.65],[0.02,0.70],[0.38,0.33],[0.57,0.23],[0.75,0.07],[0.25,0.58],[0.94,0.02],[0.55,0.22],[0.58,0.17],[0.14,0.79],[0.20,0.56],[0.10,0.88],[0.15,0.79],[0.11,0.77],[0.67,0.22],[0.07,0.87],[0.43,0.33],[0.08,0.84],[0.05,0.67],[0.07,0.77],[0.17,0.68],[1.00,0.00],[0.15,0.79],[0.08,0.77],[0.16,0.67],[0.69,0.13],[0.07,0.87],[0.15,0.54],[0.55,0.19],[0.14,0.63],[0.75,0.18],[0.25,0.63],[0.83,0.05],[0.55,0.50],[0.86,0.04],[0.73,0.18],[0.44,0.32],[0.70,0.15],[0.89,0.06],[0.17,0.67],[0.61,0.12],[0.55,0.50],[0.36,0.56],[0.03,0.86],[0.09,0.82],[0.09,0.82],[0.09,0.83],[0.17,0.68],[0.88,0.03],[0.64,0.22],[0.08,0.85],[0.74,0.16],[0.47,0.28],[0.05,0.84],[0.14,0.54],[0.01,0.93],[0.77,0.16],[0.17,0.60],[0.64,0.22],[0.84,0.05],[0.85,0.03],[0.23,0.67],[0.20,0.69],[0.00,0.87],[0.14,0.77],[0.11,0.69],[0.17,0.67],[0.56,0.27],[0.14,0.67],[0.37,0.31],[0.11,0.69],[0.35,0.52],[0.53,0.27],[0.50,0.21],[0.25,0.64],[0.36,0.56],[0.39,0.26],[0.02,0.83],[0.41,0.29],[0.07,0.77],[0.16,0.63],[0.92,0.03],[0.10,0.71],[0.83,0.05],[0.42,0.27],[0.62,0.12],[0.23,0.60],[0.19,0.61],[0.69,0.19],[0.21,0.65],[0.67,0.19],[0.18,0.69],[0.44,0.29],[0.14,0.65],[0.73,0.18],[0.15,0.66],[0.44,0.34],[0.74,0.10],[0.18,0.69],[0.25,0.61],[0.52,0.23],[0.06,0.82],[0.52,0.29],[0.22,0.68],[0.46,0.26],[0.14,0.54],[0.78,0.07],[0.8
0,0.05],[0.15,0.67],[0.10,0.82],[0.56,0.27],[0.64,0.22],[0.87,0.06],[0.14,0.66],[0.10,0.84],[0.88,0.05],[0.02,0.81],[0.62,0.15],[0.13,0.68],[0.50,0.28],[0.11,0.62],[0.46,0.32],[0.56,0.28],[0.43,0.28],[0.12,0.83],[0.11,0.80],[0.10,0.83],[0.90,0.04],[0.17,0.65],[0.15,0.63],[0.72,0.15],[0.64,0.26],[0.84,0.06],[0.09,0.83],[0.16,0.68],[0.09,0.63],[0.43,0.29],[0.88,0.05],[0.20,0.69],[0.73,0.09],[0.61,0.20],[0.67,0.13],[0.08,0.85],[0.73,0.16],[0.89,0.05],[0.41,0.25],[0.61,0.23],[0.58,0.22],[0.03,0.84],[0.58,0.24],[0.48,0.30],[0.25,0.54],[0.23,0.63],[0.41,0.46],[0.84,0.06],[0.45,0.29],[0.09,0.55],[0.54,0.26],[0.11,0.82],[0.69,0.18],[0.43,0.45],[0.43,0.28],[0.45,0.32],[0.07,0.78],[0.26,0.64],[0.92,0.04],[0.12,0.66],[0.32,0.51],[0.28,0.59],[0.70,0.18]])
x = data[:,0]
y = data[:,1]
def func(x, a, b, c):
    return a * np.exp(-b*x) + c
popt, pcov = curve_fit(func, x, y)
a, b, c = popt
x_line = np.arange(min(x), max(x), 0.01)
x_line = np.reshape(x_line, (-1, 1))
y_line = func(x_line, a, b, c)
y_line = np.reshape(y_line, (-1, 1))
plt.scatter(x,y)
plt.plot(x_line,y_line)
plt.show()
As you can see in the example plot, the fit deviates towards high x values. I know there are numerous similar questions out there, and I have read many - but my math skills are not phenomenal, so I am struggling to come up with a better solution for my particular problem.
I'm not tied to the exponential function - can anyone suggest something better?
I need to do this semi-automatically for hundreds of datasets, so ideally I want something as flexible as possible.
Any help greatly appreciated!
p.s. I am sorry about posting such a large sample dataset - but I figured this kind of question necessitates the actual data, and I didn't want to post links to suspicious looking files.. =)
This is not an optimal solution, but it should work for any kind of density distribution in your data. The idea is to resample the data into a given number of bins by computing local averages along the x-axis, so that the resampled points are evenly distributed.
#!/usr/bin/python3.6
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def my_floor(a, precision=0):
    return np.round(a - 0.5 * 10**(-precision), precision)
data = np.array([...])  # identical to the dataset posted in the question, elided here for brevity
x = data[:,0]
y = data[:,1]
#---------------------------ADD THIS---------------------------
# Define how to resample the data
n_bins = 20 # choose how many samples to use
bin_size = (max(x) - min(x))/n_bins
# Prepare empty arrays for resampled x and y
x_res, y_res = [],[]
# Resample the data with consistent density
for i in range(n_bins-1):
    lower = x >= min(x) + i*bin_size
    higher = x < min(x) + (i+1)*bin_size
    x_res.append(np.mean(x[np.where(lower & higher)]))
    y_res.append(np.mean(y[np.where(lower & higher)]))
#------------------------------------------------------
def func(x, a, b, c):
    return a * np.exp(-b*x) + c
popt, pcov = curve_fit(func, x_res, y_res)
a, b, c = popt
x_line = np.arange(min(x), max(x), 0.01)
x_line = np.reshape(x_line,(-1,1))
y_line = func(x_line, a, b, c)
y_line = np.reshape(y_line,(-1,1))
plt.scatter(x,y,alpha=0.5)
plt.scatter(x_res, y_res)
plt.plot(x_line,y_line, c='red')
plt.show()
Which gives the output (plot not shown):
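As a side note, the same local-average resampling can be done in one call with scipy.stats.binned_statistic (a sketch using the same x, y, and n_bins as above):

from scipy.stats import binned_statistic

# mean of y (and of x) within each of n_bins equal-width bins along x
y_res, bin_edges, _ = binned_statistic(x, y, statistic='mean', bins=n_bins)
x_res, _, _ = binned_statistic(x, x, statistic='mean', bins=n_bins)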
curve_fit does not give you a great deal of control over the fit; you may want to look at the much more general, but somewhat more complicated to use, least_squares:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.least_squares.html
where you can control a lot of things. curve_fit does give you a sigma parameter which allows you to weight the 'uncertainty' in your points. The trick here is to assign lower uncertainty to the points around x=1 where the fit is poor. By giving it lower uncertainty the fitter will try harder to fit them.
After some experimenting, replacing your curve_fit line with
uncertainty = np.exp(-5*x*x)
popt, pcov = curve_fit(func, x, y, sigma = uncertainty)
I got the following fit (plot not shown).
You can try to improve it further by playing with the uncertainty vector above.

How to fit any non-linear functions in python?

I have already checked post1, post2, post3 and post4 but didn't help.
I have data about a specific plant, including two variables called "Age" and "Height". The correlation between them is non-linear.
To fit a model, one solution I assume is as follows:
If the non-linear function is Height = a + b*Age + c*Age^2, then we can bring in a new variable K where K = Age^2, so we have changed the first non-linear function into a multilinear regression one. Based on this, I have the following code:
from sklearn.linear_model import LinearRegression

data['K'] = data["Age"].pow(2)
x = data[["Age", "K"]]
y = data["Height"]
model = LinearRegression().fit(x, y)
print(model.score(x, y)) # = 0.9908571840250205
Am I doing this correctly?
How would I do this with cubic and exponential functions?
Thanks.
For cubic polynomials:
data['x2'] = data["Age"].pow(2)
data['x3'] = data["Age"].pow(3)
x = data[["Age", "x2","x3"]]
y = data["Height"]
model = LinearRegression().fit(x, y)
print(model.score(x, y))
You can handle exponential data by fitting log(y), as sketched below, or find some library that can fit polynomials automatically, e.g. https://numpy.org/doc/stable/reference/generated/numpy.polyfit.html
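For the exponential case, the log(y) trick might look like this (a sketch assuming Height = A*exp(B*Age) with strictly positive heights):

import numpy as np
from sklearn.linear_model import LinearRegression

# if Height = A * exp(B * Age), then log(Height) = log(A) + B * Age,
# which is an ordinary linear regression in Age
x = data[["Age"]]
log_y = np.log(data["Height"])

model = LinearRegression().fit(x, log_y)
B = model.coef_[0]
A = np.exp(model.intercept_)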
Hopefully you don't have a religious fervor for using SKLearn here because the answer I'm going to suggest is going to completely ignore it.
If you're interested in doing regression analysis where you get complete autonomy over the fitting function, I'd suggest cutting directly down to the least-squares optimization algorithm that drives a lot of this type of work, which you can do using scipy:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import leastsq
x, y = np.array([0,1,2,3,4,5]), np.array([0,1,4,9,16,25])
# initial_guess[i] maps to p[x] in function_to_fit, must be reasonable
initial_guess = [1, 1, 1]
def function_to_fit(x, p):
    return pow(p[0]*x, 2) + p[1]*x + p[2]

def residuals(p, y, x):
    return y - function_to_fit(x, p)

cnsts = leastsq(
    residuals,
    initial_guess,
    args=(y, x)
)[0]
fig, ax = plt.subplots()
ax.plot(x, y, 'o')
xi = np.arange(0,10,0.1)
ax.plot(xi, [function_to_fit(x, cnsts) for x in xi])
plt.show()
Now this is a numeric approach to the solution, so I would recommend taking a moment to make sure you understand the limitations of such an approach - but for problems like these I've found it more than adequate for functionalizing non-linear data sets without trying to do some hand-waving to make it fit inside a linearizable manifold.

Curve Fitting For 3 dimensional data in python

I have X and Y data with dimension (7, 360, 720) (global grid cells at 0.5 degree resolution) as input, and I want to fit a sigmoid curve with the code below, obtaining the curve parameters in the same shape as X and Y:
# -*- coding: utf-8 -*-
import os, sys
from collections import OrderedDict as odict
import numpy as np
import pylab as pl
import numpy.ma as ma
from scipy.optimize import curve_fit
f=open('test.csv','w')
def sigmoid(x, a, b, c):
    y = a + (b*(1 - np.exp(-c*(x**2))))
    return y

for i in range(360):
    for j in range(720):
        xdata = [0, x[0,i,j], x[1,i,j], x[2,i,j], x[3,i,j], x[4,i,j], x[5,i,j], x[6,i,j]]
        ydata = [0, y[0,i,j], y[1,i,j], y[2,i,j], y[3,i,j], y[4,i,j], y[5,i,j], y[6,i,j]]
        popt, pcov = curve_fit(sigmoid, xdata, ydata)
        print(popt)
        f.write(','.join(map(str, popt)))
        f.write("\n")
f.close()
Right now this code writes and stores the fitting results in a .csv file with 3 columns (a, b, c), but I want to write and store the fitting results in a file with (360, 720) shape, matching the grid cells. This code also shows me the error below:
RuntimeError: Optimal parameters not found: Number of calls to function has reached maxfev = 800.
The dimensionality of your data (referenced in the title to the question) is not the cause of the problem you are seeing. What you are trying to do is run 360*720 (~260,000) separate fits to your sigmoidal function, with inputs derived from your arrays x and y. It should work, but it might be slow simply because you are doing so many fits.
If you haven't already, you should definitely start by fitting a couple arrays to your function -- if you can't get 3 to work, there's no point in trying 260,000, right? So, start with 1, then try 3, then 360, then all of them.
I suspect the problem you are seeing is because curve_fit() stupidly allows you to not explicitly specify starting values for your parameters, and even-more-stupidly assigns unspecified starting values to the arbitrary value of 1. This encourages new users to not think more carefully about the problem they are trying to solve, and then gives cryptic error messages like the one you seeing that do not explicitly say "you need better starting values". The message says the fit took many iterations, which probably means the fit "got lost" trying to find optimal values. That "getting lost" is probably because it started "too far from home".
In general, curve fitting is sensitive to the starting values of the parameters. And you probably do know better starting values than a=1, b=1, c=1. I suspect that you also know that exponentiation can get huge or tiny very quickly. Because of this, and depending on the scale of your x, there are probably ranges of values for c that are not really sensible -- it might be that c should be positive, and smaller than 10, for example. Again, you probably sort of know these ranges.
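Concretely, with plain curve_fit that just means passing p0 (and, if you like, bounds); the numbers below are only illustrative guesses for the sigmoid above:

# illustrative starting values for (a, b, c), with bounds keeping c sensible
p0 = [0.0, 1.0, 0.25]
bounds = ([-np.inf, 0.0, 0.0], [np.inf, np.inf, 5.0])
popt, pcov = curve_fit(sigmoid, xdata, ydata, p0=p0, bounds=bounds)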
Let me suggest using lmfit (https://lmfit.github.io/lmfit-py/) for this work. It provides an alternative approach to curve-fitting, with many useful improvements over curve_fit. For your problem, a single fit might look like this:
import numpy as np
from lmfit import Model
def sigmoid(x, offset, scale, decay):
    return offset + scale*(1 - np.exp(-decay*(x**2)))
## set up the model and parameters from your model function
# note that parameters will be *named* using the names of the
# arguments to your model function.
model = Model(sigmoid)
# make parameters (OrderedDict-like) with initial values
params = model.make_params(offset=0, scale=1, decay=0.25)
# you may want to set bounds on some of the parameters
params['scale'].min = 0
params['decay'].min = 0
params['decay'].max = 5
# you can also fix some parameters if desired
# params['offset'].vary = False
## set up data
# pick arbitrary data to fit, and make sure data use np arrays.
# but also: (0, 0) isn't in your data -- do you need to assert it?
# won't that drive `offset` to 0?
i, j = 7, 12
xdata = np.concatenate([[0], x[:, i, j]])  # prepend the asserted 0 point
ydata = np.concatenate([[0], y[:, i, j]])
# now fit model to data, get results
result = model.fit(ydata, params, x=xdata)
print(result.fit_report())
This will print out a report with fit statistics, best-fit parameter values, and uncertainties. You can read the docs for all the components of results, but results.params holds best-fit parameters and uncertainties.
For use in a loop, this approach has the convenient feature that result is unique for each data set, while the starting params are not altered by the fit and can be re-used as starting values for all your fits. A test loop might look like
results = []
for i in (50, 150, 250):
    for j in (200, 400, 600):
        xdata = np.concatenate([[0], x[:, i, j]])
        ydata = np.concatenate([[0], y[:, i, j]])
        result = model.fit(ydata, params, x=xdata)
        results.append([i, j, result.params, result.chisqr])
It will still be possible that some of the 260,000 fits will not succeed, but I think that lmfit will give you better tools to avoid and identify these cases.
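To get the parameters back in the (360, 720) grid shape asked for, one option is to pre-allocate one map per parameter and fill them inside the loop (a sketch; the parameter names follow the model above, and failed fits are simply left as NaN):

import numpy as np

# pre-allocate one (360, 720) map per parameter; NaN marks failed fits
param_maps = {name: np.full((360, 720), np.nan)
              for name in ('offset', 'scale', 'decay')}

for i in range(360):
    for j in range(720):
        xdata = np.concatenate([[0], x[:, i, j]])
        ydata = np.concatenate([[0], y[:, i, j]])
        try:
            result = model.fit(ydata, params, x=xdata)
        except Exception:
            continue  # skip cells where the fit fails
        for name in param_maps:
            param_maps[name][i, j] = result.params[name].value

# save each map as its own file, preserving the grid shape
np.save('offset_map.npy', param_maps['offset'])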

Fitting a vector function with curve_fit in Scipy

I want to fit a function with vector output using Scipy's curve_fit (or something more appropriate if available). For example, consider the following function:
import numpy as np
def fmodel(x, a, b):
    return np.vstack([a*np.sin(b*x), a*x**2 - b*x, a*np.exp(b/x)])
Each component is a different function but they share the parameters I wish to fit. Ideally, I would do something like this:
x = np.linspace(1, 20, 50)
a = 0.1
b = 0.5
y = fmodel(x, a, b)
y_noisy = y + 0.2 * np.random.normal(size=y.shape)
from scipy.optimize import curve_fit
popt, pcov = curve_fit(f=fmodel, xdata=x, ydata=y_noisy, p0=[0.3, 0.1])
But curve_fit does not work with functions with vector output, and the error "Result from function call is not a proper array of floats." is thrown. What I did instead is to flatten out the output like this:
def fmodel_flat(x, a, b):
    return fmodel(x[0:len(x)//3], a, b).flatten()  # integer division keeps the index an int

popt, pcov = curve_fit(f=fmodel_flat, xdata=np.tile(x, 3),
                       ydata=y_noisy.flatten(), p0=[0.3, 0.1])
and this works. If instead of a vector function I am actually fitting several functions with different inputs as well but which share model parameters, I can concatenate both input and output.
Is there a more appropriate way to fit vector function with Scipy or perhaps some additional module? A main consideration for me is efficiency - the actual functions to fit are much more complex and fitting can take some time, so if this use of curve_fit is mangled and is leading to excessive runtimes I would like to know what I should do instead.
If I can be so blunt as to recommend my own package symfit, I think it does precisely what you need. An example on fitting with shared parameters can be found in the docs.
Your specific problem stated above would become:
import numpy as np
from symfit import variables, parameters, Model, Fit, sin, exp
x, y_1, y_2, y_3 = variables('x, y_1, y_2, y_3')
a, b = parameters('a, b')
a.value = 0.3
b.value = 0.1
model = Model({
    y_1: a * sin(b * x),
    y_2: a * x**2 - b * x,
    y_3: a * exp(b / x),
})
xdata = np.linspace(1, 20, 50)
ydata = model(x=xdata, a=0.1, b=0.5)
y_noisy = ydata + 0.2 * np.random.normal(size=(len(model), len(xdata)))
fit = Fit(model, x=xdata, y_1=y_noisy[0], y_2=y_noisy[1], y_3=y_noisy[2])
fit_result = fit.execute()
Check out the docs for more!
I think what you're doing is perfectly fine from an efficiency stand point. I'll try to look at the implementation and come up with something more quantitative, but for the time being here is my reasoning.
What you're doing during curve fitting is optimizing the parameters (a,b) such that
res = sum_i |f(x_i; a,b)-y_i|^2
is minimal. By this I mean that you have data points (x_i,y_i) of arbitrary dimensionality, two parameters (a,b) and a fitting model that approximates the data at query points x_i.
The curve fitting algorithm starts from a starting (a,b) pair, puts this into a black box that computes the above square error, and tries to come up with a new (a',b') pair that produces a smaller error. My point is that the error above is really a black box for the fitting algorithm: the configurational space of the fitting is defined merely by the (a,b) parameters. If you imagine how you'd implement a simple curve fitting function, you could imagine that you try to do, say, a gradient descent, with the square error as cost function.
Now, it should be irrelevant to the fitting procedure how the black box computes the error. It's easy to see that the dimensionality of x_i is really irrelevant for scalar functions, since it doesn't matter if you have 1000 1d query points to fit for, or a 10x10x10 grid in 3d space. What matters is that you have 1000 points x_i for which you need to compute f(x_i) ~ y_i from the model.
The only subtlety that should further be noted is that in case of a vector-valued function, the calculation of the error is not trivial. In my opinion, it's fine to define the error at each x_i point using the 2-norm of the vector-valued function. But hey: in this case, the square error at point x_i is
|f(x_i; a,b)-y_i|^2 == sum_k (f(x_i; a,b)[k]-y_i[k])^2
which implies that the square error for each component is accumulated. This just means that what you're doing right now is just right: by replicating your x_i points and taking into account each component of the function individually, your square error will contain exactly the 2-norm of the error at each point.
So my point is what you're doing is mathematically correct, and I don't expect any behaviour of the fitting procedure to depend on the way how multivariate/vector-valued functions are handled.
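A quick numerical check of that accumulation, using fmodel and the noisy data from the question (a sketch):

import numpy as np

resid = fmodel(x, a, b) - y_noisy        # shape (3, 50): one row per component
sse_stacked = np.sum(resid**2)           # sum over components and points
sse_flat = np.sum(resid.flatten()**2)    # what the flattened curve_fit minimizes
assert np.isclose(sse_stacked, sse_flat)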

Scipy Fmin Gaussian model to real data

I've been trying to solve this for a bit and really just haven't seen an example or anything that my brain is able to use to move forward.
The goal is to find a model Gaussian curve by minimizing the total chi-squared between the real data and the model resulting from unknown parameters that require sensible estimations (the Gaussian is of unknown position, amplitude and width). scipy.optimize.fmin has come up but I've never used this before and I'm still very new to python...
Ultimately, I'd like to plot the original data along with the model - I have used pyplot before; it's just generating the model and using fmin that has me completely bewildered. Essentially I'm here:
def gaussian(a, b, c, x):
    return a*np.exp(-(x-b)**2/(2*c**2))
I've seen multiple ways to generate a model and this has rendered me confused and thus I have no code! I have imported my data file through np.loadtxt.
Thanks for anyone that can suggest a framework or help at all.
There are basically four (or five) main steps involved in model fitting problems like this:
1. Define your forward model, yhat = F(P, x), that takes a set of parameters P and your independent variable x, and estimates your response variable y
2. Define your loss function, loss = L(P, x, y), that you'd like to minimize over your parameters
3. Optional: define a function that returns the Jacobian matrix, i.e. the partial derivatives of your loss function w.r.t. your model parameters.*
4. Make an initial guess at your model parameters
5. Plug all these into one of the optimizers and get the fitted parameters for your model
Here's a worked example to get you started:
import numpy as np
from scipy.optimize import minimize
from matplotlib import pyplot as pp
# function that defines the model we're fitting
def gaussian(P, x):
    a, b, c = P
    return a*np.exp(-(x-b)**2 / (2*c**2))

# objective function to minimize
def loss(P, x, y):
    yhat = gaussian(P, x)
    return ((y - yhat)**2).sum()
# generate a gaussian distribution with known parameters
amp = 1.3543
pos = 64.546
var = 12.234
P_real = np.array([amp, pos, var])
# we use the vector of real parameters to generate our fake data
x = np.arange(100)
y = gaussian(P_real, x)
# add some gaussian noise to make things harder
y_noisy = y + np.random.randn(y.size)*0.5
# minimize needs an initial guess at the model parameters
P_guess = np.array([1, 50, 25])
# minimize provides a unified interface to all of scipy's solvers. you
# can also access them individually in scipy.optimize, but the
# standalone versions have annoying differences in their syntax. for now
# we'll use the Nelder-Mead solver, which doesn't use the Jacobian. we
# also need to hand it x and y_noisy as additional args to loss()
res = minimize(loss, P_guess, method='Nelder-Mead', args=(x, y_noisy))
# res is a dict containing the results of the optimization. in particular we
# want the optimized model parameters:
P_fit = res['x']
# we can pass these to gaussian() to evaluate our fitted model
y_fit = gaussian(P_fit, x)
# now let's plot the results:
fig, ax = pp.subplots(1, 1)
ax.plot(x, y, '-r', lw=2, label='Real')
ax.plot(x, y_noisy, '-k', alpha=0.5, label='Noisy')
ax.plot(x, y_fit, '--b', lw=5, label='Fit')
ax.legend(loc=0, fancybox=True)
*Some solvers, e.g. conjugate gradient methods, take the Jacobian as an additional argument, and by and large these solvers are faster and more robust, but if you're feeling lazy and performance isn't all that critical then you can usually get away without providing the Jacobian, in which case it will use the finite differences method to estimate the gradients.
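To illustrate, the gradient for the Gaussian loss above can be written out by hand (a sketch, applying the chain rule to the squared-error loss; conjugate gradient is one of the solvers that accepts it):

# analytic gradient of loss() w.r.t. P = (a, b, c)
def loss_grad(P, x, y):
    a, b, c = P
    e = np.exp(-(x - b)**2 / (2*c**2))
    r = y - a*e                      # residuals
    dy_da = e                        # d yhat / d a
    dy_db = a*e*(x - b) / c**2       # d yhat / d b
    dy_dc = a*e*(x - b)**2 / c**3    # d yhat / d c
    return np.array([-2*np.sum(r*dy_da),
                     -2*np.sum(r*dy_db),
                     -2*np.sum(r*dy_dc)])

res = minimize(loss, P_guess, jac=loss_grad, method='CG', args=(x, y_noisy))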
You can read more about the different solvers here
