I have recorded some data about the color of an LED that varies with the 8-bit signal sent to the LED driver. The signal can vary between 0 and 255.
Exponential curve fitting seems to work very well to represent the LED's behavior. I have had good results with the following formula:
x * signal ** ex
y * signal ** ey
z * signal ** ez
In Python, I use the following function:
from scipy.optimize import curve_fit
def fit_func_xae(x, a, e):
    # Curve fitting function
    return a * x**e
# X, Y, Z are real colorimetric values that are measured by a physical instrument
(aX, eX), cov = curve_fit(fit_func_xae, signal, X)
(aY, eY), cov = curve_fit(fit_func_xae, signal, Y)
(aZ, eZ), cov = curve_fit(fit_func_xae, signal, Z)
Note:
In colorimetry, we represent the color of the LED in the CIE XYZ color space, which is a linear space that works much like a linear RGB color space. Even if it is an approximation, you can think of XYZ as a synonym of (linear) RGB.
So a color can be represented as a triplet of linear values X, Y, Z.
Here is the data behind the curves.
For each 8-bit value sent to the LED driver, there are 3 measurements.
Signal
[ 3. 3. 3. 5. 5. 5. 7. 7. 7. 10. 10. 10. 15. 15.
15. 20. 20. 20. 30. 30. 30. 40. 40. 40. 80. 80. 80. 160.
160. 160. 240. 240. 240. 255. 255. 255.]
X, Y, Z
[[9.93295448e-05 8.88955748e-04 6.34978556e-04]
[9.66399391e-05 8.86031926e-04 6.24680520e-04]
[1.06108685e-04 8.99010175e-04 6.41577838e-04]
[1.96407653e-04 1.70210146e-03 1.27178991e-03]
[1.84965943e-04 1.67927596e-03 1.24985475e-03]
[1.83770476e-04 1.67905297e-03 1.24855580e-03]
[3.28537613e-04 2.75382195e-03 2.14639821e-03]
[3.17804246e-04 2.74152647e-03 2.11730825e-03]
[3.19167905e-04 2.74977632e-03 2.11142769e-03]
[5.43770342e-04 4.09314433e-03 3.33793380e-03]
[5.02493149e-04 4.04392581e-03 3.24784452e-03]
[5.00712102e-04 4.03456071e-03 3.26803716e-03]
[1.48001671e-03 1.09367632e-02 9.59283037e-03]
[1.52082180e-03 1.09920985e-02 9.63624777e-03]
[1.50153844e-03 1.09623592e-02 9.61724422e-03]
[3.66206564e-03 2.74730946e-02 2.51982924e-02]
[3.64074861e-03 2.74283157e-02 2.52187294e-02]
[3.68719991e-03 2.75033778e-02 2.51691331e-02]
[1.50905917e-02 1.06056566e-01 1.06534373e-01]
[1.51370269e-02 1.06091182e-01 1.06790424e-01]
[1.51654172e-02 1.06109863e-01 1.06943957e-01]
[3.42912601e-02 2.30854413e-01 2.43427207e-01]
[3.42217124e-02 2.30565972e-01 2.43529454e-01]
[3.41486993e-02 2.30807320e-01 2.43591644e-01]
[1.95905112e-01 1.27409867e+00 1.37490536e+00]
[1.94923951e-01 1.26934278e+00 1.37751808e+00]
[1.95242984e-01 1.26805844e+00 1.37565458e+00]
[1.07931878e+00 6.97822521e+00 7.49602715e+00]
[1.08944832e+00 7.03128378e+00 7.54296884e+00]
[1.07994964e+00 6.96864302e+00 7.44011991e+00]
[2.95296087e+00 1.90746191e+01 1.99164655e+01]
[2.94254973e+00 1.89524517e+01 1.98158118e+01]
[2.95753358e+00 1.90200667e+01 1.98885050e+01]
[3.44049055e+00 2.21221159e+01 2.29667049e+01]
[3.43817829e+00 2.21225393e+01 2.29363833e+01]
[3.43077583e+00 2.21158929e+01 2.29399652e+01]]
The Problem:
Here's a scatter plot of some of my LED's XYZ values, together with a plot of the exponential curve fitting obtained with the code above:
It all seems good... until you zoom a bit:
On this zoom we can also see that the curve is fitted on multiple measurements:
At high values, Z values (blue dots) are always higher than Y values (green dots). But at low values, Y values are higher than Z values.
The meaning of this is that the LED changes in color depending on the PWM applied, for some reason (maybe because the temperature rises when more power is applied).
This behavior cannot be represented mathematically by the formula that I have used for the curve fit, however, the curve fit is so good for high values that I am searching for a way to improve it in a simple and elegant way.
Do you have any idea how this could be done? I have tried unsuccessfully to add more parameters, for example I tried to use:
x1 * signal ** ex + x2 * signal ** fx
instead of:
x * signal ** ex
but that causes scipy to overflow.
My idea was that by adding two such terms I could still have a function that equals 0 when signal = 0, but that increases faster at low values than a simple exponential.
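One way to make the mismatch described above visible is to look at the local slope in a log-log plot: a single term a * signal ** e has the same slope e everywhere, so a slope that changes with the signal cannot be captured by it. The sketch below is only an illustration. The four signal levels and the Y and Z values are means of the three repeated measurements, rounded by hand from the table above, and local_exponents is a hypothetical helper name.

```python
import numpy as np

# Means of the three repeated measurements at four signal levels,
# rounded by hand from the table above (Y and Z columns only).
signal = np.array([3.0, 10.0, 40.0, 255.0])
Y = np.array([8.91e-04, 4.06e-03, 2.307e-01, 2.212e+01])
Z = np.array([6.34e-04, 3.28e-03, 2.435e-01, 2.295e+01])

def local_exponents(x, y):
    """Slope of log(y) versus log(x) between neighbouring points."""
    return np.diff(np.log(y)) / np.diff(np.log(x))

# For a pure power law these would all be equal to the exponent e.
slopes_Y = local_exponents(signal, Y)
slopes_Z = local_exponents(signal, Z)
print("local exponents Y:", slopes_Y)
print("local exponents Z:", slopes_Z)
```

If the printed exponents differ noticeably between the low and the high end, that suggests fitting in log-log space with a model whose slope is allowed to change.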
The data shows two steps in the log-log plot so I used an approach already used here.
Code is as follows:
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
signal = np.array( [
3.0, 3.0, 3.0,
5.0, 5.0, 5.0,
7.0, 7.0, 7.0,
10.0, 10., 10.,
15.0, 15., 15.,
20.0, 20., 20.,
30.0, 30., 30.,
40.0, 40., 40.,
80.0, 80., 80.,
160.0, 160., 160.,
240.0, 240., 240.,
255.0, 255., 255.
] )
data = np.array( [
[9.93295448e-05, 8.88955748e-04, 6.34978556e-04],
[9.66399391e-05, 8.86031926e-04, 6.24680520e-04],
[1.06108685e-04, 8.99010175e-04, 6.41577838e-04],
[1.96407653e-04, 1.70210146e-03, 1.27178991e-03],
[1.84965943e-04, 1.67927596e-03, 1.24985475e-03],
[1.83770476e-04, 1.67905297e-03, 1.24855580e-03],
[3.28537613e-04, 2.75382195e-03, 2.14639821e-03],
[3.17804246e-04, 2.74152647e-03, 2.11730825e-03],
[3.19167905e-04, 2.74977632e-03, 2.11142769e-03],
[5.43770342e-04, 4.09314433e-03, 3.33793380e-03],
[5.02493149e-04, 4.04392581e-03, 3.24784452e-03],
[5.00712102e-04, 4.03456071e-03, 3.26803716e-03],
[1.48001671e-03, 1.09367632e-02, 9.59283037e-03],
[1.52082180e-03, 1.09920985e-02, 9.63624777e-03],
[1.50153844e-03, 1.09623592e-02, 9.61724422e-03],
[3.66206564e-03, 2.74730946e-02, 2.51982924e-02],
[3.64074861e-03, 2.74283157e-02, 2.52187294e-02],
[3.68719991e-03, 2.75033778e-02, 2.51691331e-02],
[1.50905917e-02, 1.06056566e-01, 1.06534373e-01],
[1.51370269e-02, 1.06091182e-01, 1.06790424e-01],
[1.51654172e-02, 1.06109863e-01, 1.06943957e-01],
[3.42912601e-02, 2.30854413e-01, 2.43427207e-01],
[3.42217124e-02, 2.30565972e-01, 2.43529454e-01],
[3.41486993e-02, 2.30807320e-01, 2.43591644e-01],
[1.95905112e-01, 1.27409867e+00, 1.37490536e+00],
[1.94923951e-01, 1.26934278e+00, 1.37751808e+00],
[1.95242984e-01, 1.26805844e+00, 1.37565458e+00],
[1.07931878e+00, 6.97822521e+00, 7.49602715e+00],
[1.08944832e+00, 7.03128378e+00, 7.54296884e+00],
[1.07994964e+00, 6.96864302e+00, 7.44011991e+00],
[2.95296087e+00, 1.90746191e+01, 1.99164655e+01],
[2.94254973e+00, 1.89524517e+01, 1.98158118e+01],
[2.95753358e+00, 1.90200667e+01, 1.98885050e+01],
[3.44049055e+00, 2.21221159e+01, 2.29667049e+01],
[3.43817829e+00, 2.21225393e+01, 2.29363833e+01],
[3.43077583e+00, 2.21158929e+01, 2.29399652e+01]
] )
def determine_start_parameters( x , y, edge=9 ):
    # straight-line fits to the first, middle and last part of the log-log data
    logx = np.log( x )
    logy = np.log( y )
    xx = logx[ :edge ]
    yy = logy[ :edge ]
    (ar1, br1), _ = curve_fit( lambda x, slope, off: slope * x + off, xx , yy )
    xx = logx[ edge : -edge ]
    yy = logy[ edge : -edge]
    (ar2, br2), _ = curve_fit( lambda x, slope, off: slope * x + off, xx , yy )
    xx = logx[ -edge : ]
    yy = logy[ -edge : ]
    (ar3, br3), _ = curve_fit( lambda x, slope, off: slope * x + off, xx , yy )
    # positions where neighbouring lines intersect
    cross1r = ( br2 - br1 ) / ( ar1 - ar2 )
    mr = ar1 * cross1r + br1
    cross2r = ( br3 - br2 ) / ( ar2 - ar3 )
    xx0r = [ mr, ar1, ar2 , ar3, cross1r, cross2r, 1 ]
    return xx0r
def func(
    x, b,
    m1, m2, m3,
    a1, a2,
    p
):
    """
    continuous approximation for a two-step function
    used to fit the log-log data
    p is a sharpness parameter for the transition
    """
    out = b - np.log(
        ( 1 + np.exp( -m1 * ( x - a1 ) )**abs( p ) )
    ) / p + np.log(
        ( 1 + np.exp( m2 * ( x - a1 ) )**abs( p ) )
    ) / p - np.log(
        ( 1 + np.exp( m3 * ( x - a2 ) )**abs( p ) )
    ) / abs( p )
    return out
def expfunc(
    x, b,
    m1, m2, m3,
    a1, a2,
    p
):
    """
    remapping to the original data
    """
    xi = np.log( x )
    eta = func(
        xi, b,
        m1, m2, m3,
        a1, a2,
        p)
    return np.exp(eta)
def expfunc2(
    x, b,
    m1, m2, m3,
    a1, a2,
    p
):
    """
    simplified remapping
    """
    aa1 = np.exp( a1 )
    aa2 = np.exp( a2 )
    return (
        np.exp( b ) * (
            ( 1 + ( x / aa1 )**( m2 * p ) ) /
            ( 1 + ( x / aa2 )**( m3 * p ) ) /
            ( 1 + ( aa1 / x )**( m1 * p ) )
        )**( 1 / p )
    )
logsig = np.log( signal )
logred = np.log( data[:,0] )
loggreen = np.log( data[:,1] )
logblue = np.log( data[:,2] )
### getting initial parameters
### red
xx0r = determine_start_parameters( signal, data[ :, 0 ] )
xx0g = determine_start_parameters( signal, data[ :, 1 ] )
xx0b = determine_start_parameters( signal, data[ :, 2 ] )
print( xx0r )
print( xx0g )
print( xx0b )
xl = np.linspace( 1, 6, 150 )
tl = np.linspace( 1, 260, 150 )
solred = curve_fit( func, logsig, logred, p0=xx0r )[0]
solgreen = curve_fit( func, logsig, loggreen, p0=xx0g )[0]
solblue = curve_fit( func, logsig, logblue, p0=xx0b )[0]
print( solred )
print( solgreen )
print( solblue )
fig = plt.figure()
ax = fig.add_subplot( 2, 1, 1 )
bx = fig.add_subplot( 2, 1, 2 )
ax.scatter( np.log( signal ), np.log( data[:,0] ), color = 'r' )
ax.scatter( np.log( signal ), np.log( data[:,1] ), color = 'g' )
ax.scatter( np.log( signal ), np.log( data[:,2] ), color = 'b' )
ax.plot( xl, func( xl, *solred ), color = 'r' )
ax.plot( xl, func( xl, *solgreen ), color = 'g' )
ax.plot( xl, func( xl, *solblue ), color = 'b' )
bx.scatter( signal, data[:,0], color = 'r' )
bx.scatter( signal, data[:,1], color = 'g' )
bx.scatter( signal, data[:,2], color = 'b' )
bx.plot( tl, expfunc2( tl, *solred), color = 'r' )
bx.plot( tl, expfunc2( tl, *solgreen), color = 'g' )
bx.plot( tl, expfunc2( tl, *solblue), color = 'b' )
plt.show()
Which results in
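The building block of func in the code above is a smoothed hinge: for positive p, log( 1 + exp( m * ( x - a ) )**p ) / p is close to max( 0, m * ( x - a ) ), and p sets how sharp the kink at a is. The sketch below illustrates just that one term, under the assumption p > 0. The name smooth_hinge and the numbers are made up for the illustration, and np.logaddexp is used only to avoid overflow for large arguments.

```python
import numpy as np

def smooth_hinge(x, m, a, p):
    """(1/p) * log(1 + exp(p * m * (x - a))), assuming p > 0.

    np.logaddexp(0, z) evaluates log(1 + exp(z)) without overflow.
    """
    return np.logaddexp(0.0, p * m * (x - a)) / p

x = np.linspace(-5.0, 5.0, 201)
hard = np.maximum(0.0, 2.0 * (x - 1.0))          # sharp kink at x = 1
soft = smooth_hinge(x, m=2.0, a=1.0, p=1.0)      # rounded kink
sharp = smooth_hinge(x, m=2.0, a=1.0, p=20.0)    # nearly the hard hinge

print("max deviation, p = 1 :", np.max(np.abs(soft - hard)))
print("max deviation, p = 20:", np.max(np.abs(sharp - hard)))
```

Adding and subtracting such terms with different slopes and kink positions gives a curve made of nearly straight segments in the log-log plot, which is what the two-step fit relies on.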
I was trying to fit a curve to my designed points using the curve_fit function. However, I got a warning message and no data output from the curve_fit function. Please find the attached screenshots below.
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import numpy as np
#Fitting function
def func(x, a, b):
    print("x: "+str(x))
    print("b: "+str(b))
    #return a*np.exp(b*x)
    #return a*x+b
    return a*x/(b+x)
#Experimental x and y data points
xData = np.array([ 31875.324,31876.35,31877.651,31878.859,31879.771,31880.657,31881.617,31882.343, \
31883.099,31883.758,31884.489,31885.311,31886.084,31886.736,31887.582,31888.262, \
31888.908,31889.627,31890.312,31890.989,31891.534,31892.142,31892.759,31893.323, \
31893.812,31894.397,31894.928,31895.555,31896.211,31896.797,31898.16,31898.761, \
31899.462,31900.099,31900.609,31901.197,31901.815,31902.32,31902.755,31903.235, \
31903.698,31904.232,31904.776,31905.291,31905.806,31906.182,31906.533,31906.843, \
31907.083,31907.175,31822.221,31833.14,31846.066,31860.254,31875.324], dtype=np.longdouble)
yData = np.array([ 7999.026,7999.483,8000.048,8000.559,8000.937,8001.298,8001.683,8001.969,8002.263, \
8002.516,8002.793,8003.101,8003.387,8003.625,8003.931,8004.174,8004.403,8004.655, \
8004.892,8005.125,8005.311,8005.517,8005.725,8005.913,8006.076,8006.269,8006.443, \
8006.648,8006.861,8007.05,8007.486,8007.677,8007.899,8008.1,8008.259,8008.443, \
8008.636,8008.793,8008.929,8009.077,8009.221,8009.386,8009.554,8009.713,8009.871, \
8009.987,8010.095,8010.19,8010.264,8010.293,7956.451,7969.307,7981.115,7991.074,7999.026], dtype=np.longdouble)
#Plot experimental data points
plt.plot(xData, yData, 'bo', label='experimental-data')
# Initial guess for the parameters
initialGuess = [31880.0,8000.0]
#Perform the curve-fit
popt, pcov = curve_fit(func, xData, yData, initialGuess)
print("popt"+str(popt))
#x values for the fitted function
xFit = np.arange(0.0, 50.0, 0.001, dtype=np.longdouble)
print("xFit"+str(xFit))
#Plot the fitted function
plt.plot(xFit, func(xFit, *popt), 'r', label='fit params: a=%5.3f, b=%5.3f' % tuple(popt))
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
The screenshot of the warning message.
The plot for the result.
Try this fit, which works. The original function cannot really fit the data. For large x it converges to a, which is fine. It drops towards smaller x, but only if b is positive. Moreover, it lacks a parameter that scales the rate of the drop. With a < 0 and b < 0 one gets the right shape, but the function then converges to a negative number, i.e. an offset is missing. Hence the use of y = a + b / ( x + c ).
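The "missing parameter" can be made concrete with a small identity: a * x / ( b + x ) equals a - a * b / ( x + b ), so the original function is the three-parameter form with the numerator tied to the other two parameters. The following sketch only checks that identity numerically; the function names and the test values are chosen for illustration.

```python
import numpy as np

def original(x, a, b):
    """Two-parameter function from the question."""
    return a * x / (b + x)

def shifted_hyperbola(x, a, b, c):
    """Three-parameter function proposed in the answer."""
    return a + b / (x + c)

x = np.linspace(1.0, 100.0, 50)
a, b = 3.0, 7.0

lhs = original(x, a, b)
# Same curve, written as offset + numerator / (x + shift),
# with the numerator forced to be -a * b.
rhs = shifted_hyperbola(x, a, -a * b, b)

print(np.allclose(lhs, rhs))
```

Because the numerator is fixed to -a * b in the original form, the level and the steepness of the drop cannot be adjusted independently, which is what the extra parameter in y = a + b / ( x + c ) provides.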
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.integrate import cumulative_trapezoid as cumtrapz  # newer SciPy releases dropped the old cumtrapz name
import numpy as np
#Fitting function
def func( x, a, b, c ):
    return a + b / ( x + c )
#Experimental x and y data points
xData = np.array([
31875.324, 31876.35, 31877.651, 31878.859, 31879.771, 31880.657,
31881.617, 31882.343, 31883.099, 31883.758, 31884.489, 31885.311,
31886.084, 31886.736, 31887.582, 31888.262, 31888.908, 31889.627,
31890.312, 31890.989, 31891.534, 31892.142, 31892.759, 31893.323,
31893.812, 31894.397, 31894.928, 31895.555, 31896.211, 31896.797,
31898.16, 31898.761, 31899.462, 31900.099, 31900.609, 31901.197,
31901.815, 31902.32, 31902.755, 31903.235, 31903.698, 31904.232,
31904.776, 31905.291, 31905.806, 31906.182, 31906.533, 31906.843,
31907.083, 31907.175, 31822.221, 31833.14, 31846.066, 31860.254,
31875.324
], dtype=float )
yData = np.array([
7999.026, 7999.483, 8000.048, 8000.559, 8000.937, 8001.298,
8001.683, 8001.969, 8002.263, 8002.516, 8002.793, 8003.101,
8003.387, 8003.625, 8003.931, 8004.174, 8004.403, 8004.655,
8004.892, 8005.125, 8005.311, 8005.517, 8005.725, 8005.913,
8006.076, 8006.269, 8006.443, 8006.648, 8006.861, 8007.05,
8007.486, 8007.677, 8007.899, 8008.1, 8008.259, 8008.443,
8008.636, 8008.793, 8008.929, 8009.077, 8009.221, 8009.386,
8009.554, 8009.713, 8009.871, 8009.987, 8010.095, 8010.19,
8010.264, 8010.293, 7956.451, 7969.307, 7981.115, 7991.074,
7999.026
], dtype=float )
#x values for the fitted function
xFit = np.linspace( 31820, 31950, 150 )
### sorting and shifting to get reasonable values
### scaling would also be a good idea, but I skip this part here
mx=np.mean( xData )
my=np.mean( yData )
data = np.column_stack( ( xData - mx , yData - my ) )
data = np.sort( data, axis=0 )
### getting good starting values using the fact that
### int x y = ( b + a c) x + a/2 x^2 - c * ( int y ) + const
### With two simple and fast numerical integrals, hence, we get a good
### guess for a, b, c from a simple linear fit.
Sy = cumtrapz( data[:, 1], data[:, 0], initial = 0 )
Sxy = cumtrapz( data[:, 0] * data[:, 1], data[:, 0], initial = 0 )
ST = np.array([
data[:, 0], data[:, 0]**2, Sy, np.ones( len( data ) )
])
S = np.transpose( ST )
A = np.dot( ST, S )
eta = np.dot( ST, Sxy )
sol = np.linalg.solve( A, eta )
a = 2 * sol[1]
c = -sol[2]
b = sol[0] - a * c
print( "a = {}".format( a ) )
print( "b = {}".format( b ) )
print( "c = {}".format( c ) )
sol, cov = curve_fit( func, xData, yData, p0 = (a + my, b, c - mx) )
print( "linear fit vs non-linear fit:" )
print( a + my, b, c - mx )
print( sol )
fig = plt.figure()
ax = fig.add_subplot( 1, 1, 1 )
ax.plot( xData, yData, ls='', marker ='+', label="data")
ax.plot( xFit, func( xFit, a + my, b, c - mx), ls=':', label="linear guess")
ax.plot( xFit, func( xFit, *sol ), label="non-linear fit" )
ax.legend( loc=0 )
plt.show()
Providing
[ 8054 -6613 -31754]
and
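The starting-value trick in the code above can be checked on synthetic data. Multiplying y = a + b / ( x + c ) by ( x + c ) and integrating gives int x y = ( b + a c ) x + a/2 x^2 - c int y + const, which is linear in the unknown coefficients. The sketch below is a NumPy-only illustration on noise-free data with made-up parameters; cumtrapz0 and guess_abc are hypothetical helper names, and np.linalg.lstsq is swapped in for the normal equations used in the answer.

```python
import numpy as np

def cumtrapz0(y, x):
    """Cumulative trapezoidal integral of y over x, starting at 0."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    return np.concatenate(([0.0], np.cumsum(steps)))

def guess_abc(x, y):
    """Estimate a, b, c of y = a + b / (x + c) from a linear least-squares fit."""
    Sy = cumtrapz0(y, x)
    Sxy = cumtrapz0(x * y, x)
    # Model: Sxy = (b + a*c) * x + (a/2) * x**2 - c * Sy + const
    M = np.column_stack((x, x**2, Sy, np.ones_like(x)))
    coef, *_ = np.linalg.lstsq(M, Sxy, rcond=None)
    a = 2.0 * coef[1]
    c = -coef[2]
    b = coef[0] - a * c
    return a, b, c

# Synthetic, noise-free test data with known parameters (illustrative values).
x = np.linspace(0.0, 10.0, 2001)
a_true, b_true, c_true = 5.0, -3.0, 2.0
y = a_true + b_true / (x + c_true)

a, b, c = guess_abc(x, y)
print(a, b, c)
```

With noisy or coarsely sampled data the estimate will be rougher, which is why it is used only as the starting point p0 for the non-linear fit.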
I'm using LMFIT to fit a piecewise polynomial to the first quadrant of a sine wave.
I would like to be able to add a constraint on the polynomial output, as opposed to on its parameters.
For example, I would like to ensure that the output is >= 0 and <= 1.0 (which of course only affects the first and last segment in the code below).
Another use case is if I want the polynomial to pass exactly through some specific (x, y) points.
I understand this might be better done with np.polyfit but eventually I want to add more non-linear constraints and the LMFIT framework is more flexible.
import numpy as np
from lmfit.models import LinearModel
#split sine wave in 4 segments with 1024 points
nseg = 4
frac = 2**10
npoints = nseg*frac
xfrac = np.linspace(0, 1, num=frac, endpoint=False)
x = np.linspace(0, 1, num=npoints, endpoint=False)
y = np.sin(x*np.pi/2)
yseg = np.reshape(y, (nseg, frac))
mod = LinearModel()
coeff = []
bestfit = []
for i in range(nseg):
    pars = mod.guess(yseg[i], x=xfrac)
    out = mod.fit(yseg[i], pars, x=xfrac)
    coeff.append([out.best_values['slope'], out.best_values['intercept']])
    bestfit.append(out.best_fit)
bestfit = np.reshape(bestfit, (1, npoints))[0]
It turns out this is done by adding constraints on the parameters themselves that translate into the right constraint on the model output.
Using a custom model for linear interpolation, it can be done as follows:
from lmfit import Model, Parameters
def func(x, c0, c1):
    return c0 + c1*x
pmodel = Model(func)
params = Parameters()
params.add('c0')
# 'clip' is the model value at x = 1, i.e. c0 + c1, capped at 1.0
params.add('clip', value=0, max=1.0, vary=True)
# c1 is no longer free; it follows from clip and c0
params.add('c1', expr='clip-c0')
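The same reparametrisation idea can be tried without lmfit. The sketch below swaps in scipy.optimize.curve_fit with bounds, which is not what the answer uses: the line on [0, 1] is described by its value at x = 0 (c0) and its value at x = 1 (clip), and only clip gets an upper bound. The function name line_via_endpoint and the synthetic data are assumptions made for the illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def line_via_endpoint(x, c0, clip):
    """Line on [0, 1] parametrised by its values at x = 0 (c0) and x = 1 (clip)."""
    return c0 + (clip - c0) * x

x = np.linspace(0.0, 1.0, 200)
y = 0.2 + 1.0 * x   # an unconstrained fit would reach 1.2 at x = 1

popt, _ = curve_fit(
    line_via_endpoint, x, y,
    p0=[0.0, 0.5],
    bounds=([-np.inf, -np.inf], [np.inf, 1.0]),  # only the end value is capped
)
c0, clip = popt
print("c0 =", c0, " value at x = 1:", clip)
```

Because the fitted quantity is the end value itself, a plain bound on a parameter becomes a bound on the model output at that point.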
One option might be using splines.
A quick and dirty approach, just to present the idea, might look like this:
import matplotlib.pyplot as plt
import numpy as np
## quick and dirty spline function
def l_spline(x, abc ):
    if isinstance( x, ( list, tuple, np.ndarray ) ):
        out = [ l_spline( elem, abc ) for elem in x]
    else:
        a, b, c = abc
        if x < a:
            f = lambda t: 0
        elif x < b:
            f = lambda t: ( t - a ) / ( b - a )
        elif x < c:
            f = lambda t: -( t - c ) / (c - b )
        else:
            f = lambda t: 0
        out = f(x)
    return out
### test data
xl = np.linspace( 0, 4, 150 )
sl = np.fromiter( ( np.sin( elem ) for elem in xl ), float )  # plain float; the np.float alias was removed from NumPy
### test splines with manual double knots on first and last
yl = dict()
yl[0] = l_spline( xl, ( 0, 0, .4 ) )
for i in range(1, 10 ):
    yl[i] = l_spline( xl, ( (i - 1 ) * 0.4 , i * 0.4, (i + 1 ) * 0.4 ) )
yl[10] = l_spline( xl, ( 3.6, 4, 4 ) )
## This is the most simple linear least square for the coefficients
AT = list()
for i in range( 11 ):
    AT.append( yl[i] )
AT = np.array( AT )
A = np.transpose( AT )
U = np.dot( AT, A )
UI = np.linalg.inv( U )
K = np.dot( UI, AT )
v = np.dot( K, sl )
## adding up the weighted sum
out = np.zeros( len( sl ) )
for a, l in zip( v, AT ):
    out += a * l
### plotting
fig = plt.figure()
ax = fig.add_subplot( 1, 1, 1 )
ax.plot( xl, sl, ls=':' )
for i in range( 11 ):
    ax.plot( xl, yl[i] )
ax.plot( xl, out, color='k')
plt.show()
Looks like this:
Instead of the simple linear optimization one could use a more complex one that ensures that no parameter is larger than 1. This automatically ensures that the function does not go beyond 1. A fixed point can be established by setting the corresponding b-spline to a fixed value, i.e. not fitting its parameter.
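A minimal sketch of the bounded variant, assuming that scipy.optimize.lsq_linear is an acceptable replacement for the plain least-squares step: the hat functions are non-negative and sum to 1, so keeping every coefficient in [0, 1] keeps the fitted curve in [0, 1] as well. The knot positions and the deliberately overshooting target are made up for the illustration, and the basis is built with np.interp rather than the l_spline helper above.

```python
import numpy as np
from scipy.optimize import lsq_linear

knots = np.linspace(0.0, np.pi / 2, 5)
x = np.linspace(0.0, np.pi / 2, 400)
y = 1.05 * np.sin(x)   # target overshoots 1 slightly on purpose

# Column i is the hat function that is 1 at knots[i] and 0 at all other knots.
A = np.column_stack([np.interp(x, knots, row) for row in np.eye(len(knots))])

# Bounded linear least squares: every coefficient stays in [0, 1].
res = lsq_linear(A, y, bounds=(0.0, 1.0))
fit = A @ res.x

print("coefficients:", res.x)
print("max of fit:", fit.max())
```

With this basis each coefficient is simply the value of the curve at its knot, so fixing a coefficient instead of fitting it pins the curve to that value at the knot.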
Hello as the title suggests I have been trying to add an exponential and power law fit to my PDF.
As shown in this picture:
The code I am using produces the underlying graph:
The code is this one:
a11=[9.76032106e-02, 6.73754187e-02, 3.20683249e-02, 2.21788509e-02,
2.70850237e-02, 9.90377323e-03, 2.11573411e-02, 8.46232347e-03,
8.49027869e-03, 7.33997745e-03, 5.71819070e-03, 4.62720448e-03,
4.11562884e-03, 3.20064313e-03, 2.66192941e-03, 1.69116510e-03,
1.94355212e-03, 2.55224949e-03, 1.23822395e-03, 5.29618250e-04,
4.03769641e-04, 3.96865740e-04, 3.38530868e-04, 2.04124701e-04,
1.63913557e-04, 2.04486864e-04, 1.82216592e-04, 1.34708400e-04,
9.24289261e-05, 9.55074181e-05, 8.13695322e-05, 5.15610541e-05,
4.15425149e-05, 4.68101099e-05, 3.33696885e-05, 1.61893058e-05,
9.61743970e-06, 1.17314090e-05, 6.65239507e-06]
b11=[3.97213201e+00, 4.77600082e+00, 5.74255432e+00, 6.90471618e+00,
8.30207306e+00, 9.98222306e+00, 1.20023970e+01, 1.44314081e+01,
1.73519956e+01, 2.08636432e+01, 2.50859682e+01, 3.01627952e+01,
3.62670562e+01, 4.36066802e+01, 5.24316764e+01, 6.30426504e+01,
7.58010432e+01, 9.11414433e+01, 1.09586390e+02, 1.31764173e+02,
1.58430233e+02, 1.90492894e+02, 2.29044305e+02, 2.75397642e+02,
3.31131836e+02, 3.98145358e+02, 4.78720886e+02, 5.75603061e+02,
6.92091976e+02, 8.32155588e+02, 1.00056488e+03, 1.20305636e+03,
1.44652749e+03, 1.73927162e+03, 2.09126048e+03, 2.51448384e+03,
3.02335795e+03, 3.63521656e+03, 4.37090138e+03]
plt.plot(b11,a11, 'ro')
plt.yscale("log")
plt.xscale("log")
plt.show()
I would like to add to the underlying graph a power law fit at smaller times and an exponential fit for longer times, based on the chi-square error minimization method.
The data for the x axis saved in csv form:
The data for the x axis:
As mentioned in my comments, I think you can couple the power law and the exponential via a constant term. Alternatively, the data look like they can be fitted by two power laws, although the comments suggest that there is truly an exponential behavior. Anyhow, I show both approaches here. In both cases I try to avoid any type of piecewise definition. This also ensures that the function is $C^\infty$.
In the first approach we have a * x**( -b ) for small x and a1 * exp( -d * x ) for large x. The idea is to choose a c such that the power law is much bigger than c for the required small x but significantly smaller otherwise.
This allows for the function mentioned in my comment, namely ( a * x**( -b ) + c ) * exp( -d * x ). One may consider c as a transition parameter.
In the alternative approach, I take two power laws. There are, hence, two regions: in the first one the first function is smaller, in the second one the second is smaller. As I always want the smaller function, I use an inverse summation, i.e., f = 1 / ( 1 / f1 + 1 / f2 ). As can be seen in the code below, I add an additional parameter k (technically in ] 0, infty [). This parameter controls the smoothness of the transition.
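The inverse summation can be looked at in isolation. The sketch below uses the starting guesses of the second approach (0.1 * x**-0.8 and 1e7 * x**-4) and a hypothetical helper smooth_min; it only shows that the combined curve follows whichever power law is smaller and that a larger k makes the switch sharper.

```python
import numpy as np

def smooth_min(f1, f2, k=1.0):
    """1 / (1/f1**k + 1/f2**k)**(1/k): a smooth stand-in for min(f1, f2)."""
    return (f1 ** (-k) + f2 ** (-k)) ** (-1.0 / k)

x = np.logspace(0, 3, 7)
f1 = 0.1 * x ** -0.8    # shallow power law, the smaller one at small x
f2 = 1e7 * x ** -4.0    # steep power law, the smaller one at large x

gentle = smooth_min(f1, f2, k=1.0)   # wide, smooth transition
sharp = smooth_min(f1, f2, k=8.0)    # close to the hard minimum

print(np.column_stack((x, np.minimum(f1, f2), gentle, sharp)))
```

The combined value never exceeds the smaller of the two inputs, so far from the crossing point it reproduces each power law almost exactly.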
import matplotlib.pyplot as mp
import numpy as np
from scipy.optimize import curve_fit
data = np.loadtxt( "7jyRi.txt", delimiter=',' )
#### p-e: power and exponential coupled via a small constant term
def func_log( x, a, b, c, d ):
    return np.log10( ( a * x**( -b ) + c ) * np.exp( -d * x ) )
guess = [.1, .8, 0.01, .005 ]
testx = np.logspace( 0, 3, 150 )
# plain float is used below; the np.float alias was removed from NumPy
testy = np.fromiter( ( 10**func_log( x, *guess ) for x in testx ), float )
sol, _ = curve_fit( func_log, data[ ::, 0 ], np.log10( data[::,1] ), p0=guess )
fity = np.fromiter( ( 10**func_log( x, *sol ) for x in testx ), float )
#### p-p: alternatively using two power laws
def double_power_log( x, a, b, c, d, k ):
    s1 = ( a * x**( -b ) )**k
    s2 = ( c * x**( -d ) )**k
    out = 1.0 / ( 1.0 / s1 + 1.0 / s2 )**( 1.0 / k )
    return np.log10( out )
aguess = [.1, .8, 1e7, 4, 1 ]
atesty = np.fromiter( ( 10**double_power_log( x, *aguess ) for x in testx ), float )
asol, _ = curve_fit( double_power_log, data[ ::, 0 ], np.log10( data[ ::, 1 ] ), p0=aguess )
afity = np.fromiter( ( 10**double_power_log( x, *asol ) for x in testx ), float )
#### plotting
fig = mp.figure( figsize=( 10, 8 ) )
ax = fig.add_subplot( 1, 1, 1 )
ax.plot( data[::,0], data[::,1] ,ls='', marker='o', label="data" )
ax.plot( testx, testy ,ls=':', label="guess p-e" )
ax.plot( testx, atesty ,ls=':',label="guess p-p" )
ax.plot( testx, fity ,ls='-',label="fit p-e: {}".format( sol ) )
ax.plot( testx, afity ,ls='-', label="fit p-p: {}".format( asol ) )
ax.set_xscale( "log" )
ax.set_yscale( "log" )
ax.set_xlim( [ 5e-1, 2e3 ] )
ax.set_ylim( [ 1e-5, 2e-1 ] )
ax.legend( loc=0 )
mp.show()
The results look like
For completeness I'd like to add a solution with a piece-wise definition. As I want the function continuous and differentiable, the parameters of the exponential law are not completely free. With f = a * x**(-b) and g = alpha * exp( -beta * x ) and a transition at x0 I choose ( a, b, x0 ) as free parameters. From this alpha and beta follow. The equations have no easy solution though, such that this itself requires a minimization.
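As a cross-check on the numerical route, and assuming the two conditions are exactly f( x0 ) = g( x0 ) and f'( x0 ) = g'( x0 ), dividing the second condition by the first appears to give beta = b / x0, and then alpha = a * x0**( -b ) * exp( b ). The helper name matched_exponential and the test values below are made up for illustration; the sketch only verifies that both conditions hold numerically for that choice.

```python
import numpy as np

def matched_exponential(a, b, x0):
    """alpha, beta such that alpha*exp(-beta*x) meets a*x**(-b) at x0 in value and slope."""
    beta = b / x0
    alpha = a * x0 ** (-b) * np.exp(b)
    return alpha, beta

a, b, x0 = 0.1, 0.8, 70.0   # illustrative values, same as the starting guess p0
alpha, beta = matched_exponential(a, b, x0)

f_val = a * x0 ** (-b)                         # power law at x0
g_val = alpha * np.exp(-beta * x0)             # exponential at x0
f_slope = -b * a * x0 ** (-b - 1.0)            # derivative of the power law at x0
g_slope = -beta * alpha * np.exp(-beta * x0)   # derivative of the exponential at x0

print(alpha, beta)
print(np.isclose(f_val, g_val), np.isclose(f_slope, g_slope))
```

If this reading of the conditions is right, the values found by the minimization can be compared against these two numbers as a sanity check.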
import matplotlib.pyplot as mp
import numpy as np
from scipy.optimize import curve_fit
from scipy.optimize import minimize
from scipy.special import lambertw
data = np.loadtxt( "7jyRi.txt", delimiter=',' )
def pwl( x, a, b):
    return a * x**( -b )
def expl( x, a, b ):
    return a * np.exp( -b * x )
def alpha_fun(alpha, a, b, x0):
    out = alpha - pwl( x0, a, b ) * expl(1, 1, lambertw( pwl( x0, -a * b/ alpha, b ) ) )
    return 1e10 * np.abs( out )**2
def p_w( v, a,b, alpha, beta, x0 ):
    if v < x0:
        out = pwl( v, a, b )
    else:
        out = expl( v, alpha, beta )
    return np.log10( out )
def alpha_beta( x, a, b, x0 ):
    """
    alpha and beta are defined such that the function is
    continuous and differentiable;
    the free parameter is the point where the two laws connect
    """
    sol = minimize(alpha_fun, .005, args=( a, b, x0 ) )### attention, strongly depends on starting guess, i.e might be a catastrophic fail
    alpha = sol.x[0]
    # ~print alpha
    beta = np.real( -lambertw( pwl( x0, -a * b/ alpha, b ) )/ x0 )
    ###
    if isinstance( x, ( np.ndarray, list, tuple ) ):
        out = list()
        for v in x:
            out.append( p_w( v, a, b, alpha, beta, x0 ) )
    else:
        # scalar input: evaluate at x itself (v is only defined inside the loop above)
        out = p_w( x, a, b, alpha, beta, x0 )
    return out
sol,_ = curve_fit( alpha_beta, data[ ::, 0 ], np.log10( data[ ::, 1 ] ), p0=[ .1, .8, 70. ] )
alpha0 = minimize(alpha_fun, .005, args=tuple(sol ) ).x[0]
beta0 = np.real( -lambertw( pwl( sol[2], -sol[0] * sol[1]/ alpha0, sol[1] ) )/ sol[2] )
xl = np.logspace(0,3,100)
yl = alpha_beta( xl, *sol )
pl = pwl( xl, sol[0], sol[1] )
el = expl( xl, alpha0, beta0 )
#### plotting
fig = mp.figure( figsize=( 10, 8 ) )
ax = fig.add_subplot( 1, 1, 1 )
ax.plot( data[::,0], data[::,1] ,ls='', marker='o', label="data" )
ax.plot( xl, pl ,ls=':', label="p" )
ax.plot( xl, el ,ls=':', label="{:0.3e} exp(-{:0.3e} x)".format(alpha0, beta0) )
ax.plot( xl, [10**y for y in yl] ,ls='-', label="sol: {}".format(sol) )
ax.axvline(sol[-1], color='k', ls=':')
ax.set_xscale( "log" )
ax.set_yscale( "log" )
ax.set_xlim( [ 5e-1, 2e3 ] )
ax.set_ylim( [ 1e-5, 2e-1 ] )
ax.legend( loc=0 )
mp.show()
Eventually providing
I am trying to make a gaussian fit on a function that is messy. I want to only fit the exterior outer shell (these are not just the max values at each x, because some of the max values will be too low too, because the sample size is low).
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def Gauss(x, a, x0, sigma, offset):
    return a * np.exp(-np.power(x - x0,2) / (2 * np.power(sigma,2))) + offset
def fitNormal(x, y):
    popt, pcov = curve_fit(Gauss, x, y, p0=[np.max(y), np.median(x), np.std(x), np.min(y)])
    return popt
plt.plot(xPlot,yPlot, 'k.')
plt.xlabel('x')
plt.ylabel('y')
plt.title('Y(x)')
x,y = xPlot,yPlot
popt = fitNormal(x, y)
minx, maxx = np.min(x), np.max(x)
xFit = np.arange(start=minx, stop=maxx, step=(maxx-minx)/1000)
yFitTest = Gauss(xPlot, popt[0], popt[1], popt[2], popt[3])
print('max fit test: ',np.max(yFitTest))
print('max y: ',np.max(yPlot))
maxIndex = np.where(yPlot==np.max(yPlot))[0][0]
factor = yPlot[maxIndex]/yFitTest[maxIndex]
yFit = Gauss(xPlot, popt[0], popt[1], popt[2], popt[3]) * factor
plt.plot(xFit,yFit,'r')
This is an iterative approach similar to this post. It is different in the sense that the shape of the graph does not permit the use of a convex hull. So the idea is to create a cost function that tries to minimize the area of the graph while paying a high cost if a point is above the graph. Depending on the type of the graph in the OP, the cost function needs to be adapted. One also has to check whether in the final result all points are really below the graph. Here one can fiddle with the details of the cost function. One may, e.g., include an offset in the tanh, like tanh( slope * ( x - offset ) ), to push the solution farther away from the data.
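The penalty term used for this is val * ( 1 + tanh( slope * d ) ), where d is data minus model. A short sketch of how it behaves, with made-up differences and a hypothetical function name one_sided_penalty: for points below the curve (d < 0) it tends to 0, for points above the curve (d > 0) it tends to 2 * val, and slope sets how abrupt the switch is.

```python
import numpy as np

def one_sided_penalty(diff, slope, val=1.0):
    """val * (1 + tanh(slope * diff)): about 0 for diff << 0, about 2*val for diff >> 0."""
    return val * (1.0 + np.tanh(slope * diff))

# data minus model: negative means the point lies below the curve
diff = np.array([-0.5, -0.01, 0.0, 0.01, 0.5])

soft = one_sided_penalty(diff, slope=10.0)   # early iteration: gentle switch
hard = one_sided_penalty(diff, slope=1e5)    # late iteration: almost a step

print("slope = 10 :", soft)
print("slope = 1e5:", hard)
```

With a small slope, points just below the curve still contribute, which pulls the curve down gradually; raising the slope from one iteration to the next turns the term into a near step that only punishes points above the curve.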
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import leastsq
def g( x, a, s ):
    return a * np.exp(-x**2 / s**2 )
def cost_function( params, xData, yData, slope, val ):
    a,s = params
    area = 0.5 * np.sqrt( np.pi ) * a * s
    # plain float is used below; the np.float alias was removed from NumPy
    diff = np.fromiter ( ( y - g( x, a, s) for x, y in zip( xData, yData ) ), float )
    cDiff = np.fromiter( ( val * ( 1 + np.tanh( slope * d ) ) for d in diff ), float )
    out = np.concatenate( [ [area] , cDiff ] )
    return out
xData = np.linspace( -5, 5, 500 )
yData = np.fromiter( ( g( x, .77, 2 ) * np.sin( 257.7 * x )**2 for x in xData ), float )
sol=[ [ 1, 2.2 ] ]
# increase the slope of the tanh step by a factor of 10 in every iteration
for i in range( 1, 6 ):
    solN, err = leastsq( cost_function, sol[-1] , args=( xData, yData, 10**i, 1 ) )
    sol += [ solN ]
print( sol )
fig = plt.figure()
ax = fig.add_subplot( 1, 1, 1)
ax.scatter( xData, yData, s=1 )
for solN in sol:
    solY = np.fromiter( ( g( x, *solN ) for x in xData ), float )
    ax.plot( xData, solY )
plt.show()
giving
>> [0.8627445 3.55774814]
>> [0.77758636 2.52613376]
>> [0.76712184 2.1181137 ]
>> [0.76874125 2.01910211]
>> [0.7695663 2.00262339]
and
Here is a different approach using scipy's Differential Evolution module combined with a "brick wall", where if any predicted value during the fit is less than the corresponding Y value, the fitting error is made extremely large. I have shamelessly poached code from the answer of @mikuszefski to generate the data used in this example.
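The "brick wall" itself is a small piece of logic and can be sketched on its own. The function name brick_wall_sse and the toy data are made up for this illustration, and the loop of the original is replaced by np.any; the condition is the one from the code below, i.e. the error is inflated as soon as one predicted value lies below its data point.

```python
import numpy as np

def brick_wall_sse(pred, y, wall=1.0e10):
    """Sum of squared errors, blown up if any prediction lies below its data point."""
    multiplier = wall if np.any(pred < y) else 1.0
    return np.sum((multiplier * (y - pred)) ** 2)

y = np.array([0.0, 0.5, 1.0, 0.5, 0.0])

above = brick_wall_sse(y + 0.1, y)   # curve everywhere above the data
below = brick_wall_sse(y - 0.1, y)   # same distance, but underneath

print("above:", above)
print("below:", below)
```

Since the objective jumps by many orders of magnitude at the wall, it is not smooth, which is one reason to pair it with a derivative-free optimizer such as differential evolution.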
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import warnings
from scipy.optimize import differential_evolution
def g( x, a, s ):
    return a * np.exp(-x**2 / s**2 )
xData = np.linspace( -5, 5, 500 )
yData = np.fromiter( ( g( x, .77, 2 )* np.sin( 257.7 * x )**2 for x in xData ), float )  # plain float; np.float was removed from NumPy
def Gauss(x, a, x0, sigma, offset):
    return a * np.exp(-np.power(x - x0,2) / (2 * np.power(sigma,2))) + offset
# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
    warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
    val = Gauss(xData, *parameterTuple)
    multiplier = 1.0
    for i in range(len(val)):
        if val[i] < yData[i]: # ****** brick wall ******
            multiplier = 1.0E10
    return np.sum((multiplier * (yData - val)) ** 2.0)
def generate_Initial_Parameters():
    # min and max used for bounds
    maxX = max(xData)
    minX = min(xData)
    maxY = max(yData)
    minY = min(yData)
    minData = min(minX, minY)
    maxData = max(maxX, maxY)
    parameterBounds = []
    parameterBounds.append([minData, maxData]) # parameter bounds for a
    parameterBounds.append([minData, maxData]) # parameter bounds for x0
    parameterBounds.append([minData, maxData]) # parameter bounds for sigma
    parameterBounds.append([minData, maxData]) # parameter bounds for offset
    # "seed" the numpy random number generator for repeatable results
    result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3, polish=False)
    return result.x
# generate initial parameter values
geneticParameters = generate_Initial_Parameters()
# create values for display of fitted function
y_fit = Gauss(xData, *geneticParameters)
plt.scatter(xData, yData, s=1 ) # plot the raw data
plt.plot(xData, y_fit) # plot the equation using the fitted parameters
plt.show()
print('parameters:', geneticParameters)