I have recorded data on how the color of an LED varies with the 8-bit signal sent to the LED driver; the signal ranges from 0 to 255.
Exponential curve fitting (strictly speaking a power law, since the signal is the base and not the exponent) seems to work very well to represent the LED's behavior. I have had good results with the following formulas:
x * signal ** ex
y * signal ** ey
z * signal ** ez
In Python, I use the following function:
from scipy.optimize import curve_fit
def fit_func_xae(x, a, e):
    # Curve fitting function: a * x**e
    return a * x**e

# X, Y, Z are real colorimetric values that are measured by a physical instrument
(aX, eX), cov = curve_fit(fit_func_xae, signal, X)
(aY, eY), cov = curve_fit(fit_func_xae, signal, Y)
(aZ, eZ), cov = curve_fit(fit_func_xae, signal, Z)
Note:
In colorimetry, we represent the color of the LED in the CIE XYZ color space, which is a linear space that works in a similar way to a linear RGB color space. Even if it is an approximation, you can think of XYZ as a synonym of (linear) RGB.
So a color can be represented as a triplet of linear values X, Y, Z.
Here is the data behind the curves.
For each 8-bit value sent to the LED driver, there are 3 measurements.
Signal
[ 3. 3. 3. 5. 5. 5. 7. 7. 7. 10. 10. 10. 15. 15.
15. 20. 20. 20. 30. 30. 30. 40. 40. 40. 80. 80. 80. 160.
160. 160. 240. 240. 240. 255. 255. 255.]
X, Y, Z
[[9.93295448e-05 8.88955748e-04 6.34978556e-04]
[9.66399391e-05 8.86031926e-04 6.24680520e-04]
[1.06108685e-04 8.99010175e-04 6.41577838e-04]
[1.96407653e-04 1.70210146e-03 1.27178991e-03]
[1.84965943e-04 1.67927596e-03 1.24985475e-03]
[1.83770476e-04 1.67905297e-03 1.24855580e-03]
[3.28537613e-04 2.75382195e-03 2.14639821e-03]
[3.17804246e-04 2.74152647e-03 2.11730825e-03]
[3.19167905e-04 2.74977632e-03 2.11142769e-03]
[5.43770342e-04 4.09314433e-03 3.33793380e-03]
[5.02493149e-04 4.04392581e-03 3.24784452e-03]
[5.00712102e-04 4.03456071e-03 3.26803716e-03]
[1.48001671e-03 1.09367632e-02 9.59283037e-03]
[1.52082180e-03 1.09920985e-02 9.63624777e-03]
[1.50153844e-03 1.09623592e-02 9.61724422e-03]
[3.66206564e-03 2.74730946e-02 2.51982924e-02]
[3.64074861e-03 2.74283157e-02 2.52187294e-02]
[3.68719991e-03 2.75033778e-02 2.51691331e-02]
[1.50905917e-02 1.06056566e-01 1.06534373e-01]
[1.51370269e-02 1.06091182e-01 1.06790424e-01]
[1.51654172e-02 1.06109863e-01 1.06943957e-01]
[3.42912601e-02 2.30854413e-01 2.43427207e-01]
[3.42217124e-02 2.30565972e-01 2.43529454e-01]
[3.41486993e-02 2.30807320e-01 2.43591644e-01]
[1.95905112e-01 1.27409867e+00 1.37490536e+00]
[1.94923951e-01 1.26934278e+00 1.37751808e+00]
[1.95242984e-01 1.26805844e+00 1.37565458e+00]
[1.07931878e+00 6.97822521e+00 7.49602715e+00]
[1.08944832e+00 7.03128378e+00 7.54296884e+00]
[1.07994964e+00 6.96864302e+00 7.44011991e+00]
[2.95296087e+00 1.90746191e+01 1.99164655e+01]
[2.94254973e+00 1.89524517e+01 1.98158118e+01]
[2.95753358e+00 1.90200667e+01 1.98885050e+01]
[3.44049055e+00 2.21221159e+01 2.29667049e+01]
[3.43817829e+00 2.21225393e+01 2.29363833e+01]
[3.43077583e+00 2.21158929e+01 2.29399652e+01]]
The Problem:
Here's a scatter plot of some of my LED's XYZ values, together with a plot of the exponential curve fitting obtained with the code above:
It all seems good... until you zoom a bit:
On this zoom we can also see that the curve is fitted on multiple measurements:
At high values, Z values (blue dots) are always higher than Y values (green dots). But at low values, Y values are higher than Z values.
The meaning of this is that the LED changes in color depending on the PWM applied, for some reason (maybe because the temperature rises when more power is applied).
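The crossover described above can be checked directly from the measurements. The following is a minimal sketch that uses only the first of the three readings per signal level from the table above (the variable names are chosen here for illustration) and prints the Z/Y ratio:

```python
import numpy as np

# First of the three readings per signal level, copied from the table above.
signal = np.array([3, 5, 7, 10, 15, 20, 30, 40, 80, 160, 240, 255], dtype=float)
Y = np.array([8.88955748e-04, 1.70210146e-03, 2.75382195e-03, 4.09314433e-03,
              1.09367632e-02, 2.74730946e-02, 1.06056566e-01, 2.30854413e-01,
              1.27409867e+00, 6.97822521e+00, 1.90746191e+01, 2.21221159e+01])
Z = np.array([6.34978556e-04, 1.27178991e-03, 2.14639821e-03, 3.33793380e-03,
              9.59283037e-03, 2.51982924e-02, 1.06534373e-01, 2.43427207e-01,
              1.37490536e+00, 7.49602715e+00, 1.99164655e+01, 2.29667049e+01])

ratio = Z / Y
for s, r in zip(signal, ratio):
    print(f"signal {s:5.0f}: Z/Y = {r:.3f}")

# Signal levels at which Z exceeds Y.
z_above_y = signal[ratio > 1.0]
print(z_above_y)
```

With these readings the ratio crosses 1 between signal 20 and signal 30, which is the color shift that a single shared power law per channel cannot reproduce.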
This behavior cannot be represented mathematically by the formula that I have used for the curve fit, however, the curve fit is so good for high values that I am searching for a way to improve it in a simple and elegant way.
Do you have any idea how this could be done? I have tried unsuccessfully to add more parameters, for example I tried to use:
x1 * signal ** ex + x2 * signal ** fx
instead of:
x * signal ** ex
but that causes scipy to overflow.
My idea was that by adding two such terms I could still have a function that equals 0 when signal = 0, but that increases faster at low values than a simple exponential.
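One possible way around the overflow (an assumption, not a verified diagnosis of what scipy did here) is to fit the logarithm of the data, so that every intermediate value stays small. For a single power law this reduces to a straight-line fit. A minimal sketch on synthetic data, where the values 1e-5 and 2.3 are made up for illustration:

```python
import numpy as np

# Synthetic power-law data y = a * x**e with illustrative (made-up) parameters.
a_true, e_true = 1e-5, 2.3
x = np.array([3, 5, 7, 10, 15, 20, 30, 40, 80, 160, 240, 255], dtype=float)
y = a_true * x**e_true

# In log space the model is a straight line: log(y) = e * log(x) + log(a).
e_fit, log_a_fit = np.polyfit(np.log(x), np.log(y), 1)
a_fit = np.exp(log_a_fit)
print(a_fit, e_fit)
```

A fit in log space weights relative rather than absolute deviations, so the small values at low signal are no longer swamped by the large values at high signal.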
The data shows two steps in the log-log plot so I used an approach already used here.
Code is as follows:
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
signal = np.array( [
3.0, 3.0, 3.0,
5.0, 5.0, 5.0,
7.0, 7.0, 7.0,
10.0, 10., 10.,
15.0, 15., 15.,
20.0, 20., 20.,
30.0, 30., 30.,
40.0, 40., 40.,
80.0, 80., 80.,
160.0, 160., 160.,
240.0, 240., 240.,
255.0, 255., 255.
] )
data = np.array( [
[9.93295448e-05, 8.88955748e-04, 6.34978556e-04],
[9.66399391e-05, 8.86031926e-04, 6.24680520e-04],
[1.06108685e-04, 8.99010175e-04, 6.41577838e-04],
[1.96407653e-04, 1.70210146e-03, 1.27178991e-03],
[1.84965943e-04, 1.67927596e-03, 1.24985475e-03],
[1.83770476e-04, 1.67905297e-03, 1.24855580e-03],
[3.28537613e-04, 2.75382195e-03, 2.14639821e-03],
[3.17804246e-04, 2.74152647e-03, 2.11730825e-03],
[3.19167905e-04, 2.74977632e-03, 2.11142769e-03],
[5.43770342e-04, 4.09314433e-03, 3.33793380e-03],
[5.02493149e-04, 4.04392581e-03, 3.24784452e-03],
[5.00712102e-04, 4.03456071e-03, 3.26803716e-03],
[1.48001671e-03, 1.09367632e-02, 9.59283037e-03],
[1.52082180e-03, 1.09920985e-02, 9.63624777e-03],
[1.50153844e-03, 1.09623592e-02, 9.61724422e-03],
[3.66206564e-03, 2.74730946e-02, 2.51982924e-02],
[3.64074861e-03, 2.74283157e-02, 2.52187294e-02],
[3.68719991e-03, 2.75033778e-02, 2.51691331e-02],
[1.50905917e-02, 1.06056566e-01, 1.06534373e-01],
[1.51370269e-02, 1.06091182e-01, 1.06790424e-01],
[1.51654172e-02, 1.06109863e-01, 1.06943957e-01],
[3.42912601e-02, 2.30854413e-01, 2.43427207e-01],
[3.42217124e-02, 2.30565972e-01, 2.43529454e-01],
[3.41486993e-02, 2.30807320e-01, 2.43591644e-01],
[1.95905112e-01, 1.27409867e+00, 1.37490536e+00],
[1.94923951e-01, 1.26934278e+00, 1.37751808e+00],
[1.95242984e-01, 1.26805844e+00, 1.37565458e+00],
[1.07931878e+00, 6.97822521e+00, 7.49602715e+00],
[1.08944832e+00, 7.03128378e+00, 7.54296884e+00],
[1.07994964e+00, 6.96864302e+00, 7.44011991e+00],
[2.95296087e+00, 1.90746191e+01, 1.99164655e+01],
[2.94254973e+00, 1.89524517e+01, 1.98158118e+01],
[2.95753358e+00, 1.90200667e+01, 1.98885050e+01],
[3.44049055e+00, 2.21221159e+01, 2.29667049e+01],
[3.43817829e+00, 2.21225393e+01, 2.29363833e+01],
[3.43077583e+00, 2.21158929e+01, 2.29399652e+01]
] )
def determine_start_parameters( x , y, edge=9 ):
    logx = np.log( x )
    logy = np.log( y )
    ### straight-line fit of the first `edge` points in log-log space
    xx = logx[ :edge ]
    yy = logy[ :edge ]
    (ar1, br1), _ = curve_fit( lambda x, slope, off: slope * x + off, xx , yy )
    ### straight-line fit of the middle points
    xx = logx[ edge : -edge ]
    yy = logy[ edge : -edge]
    (ar2, br2), _ = curve_fit( lambda x, slope, off: slope * x + off, xx , yy )
    ### straight-line fit of the last `edge` points
    xx = logx[ -edge : ]
    yy = logy[ -edge : ]
    (ar3, br3), _ = curve_fit( lambda x, slope, off: slope * x + off, xx , yy )
    ### intersections of neighbouring lines give the transition points
    cross1r = ( br2 - br1 ) / ( ar1 - ar2 )
    mr = ar1 * cross1r + br1
    cross2r = ( br3 - br2 ) / ( ar2 - ar3 )
    xx0r = [ mr, ar1, ar2 , ar3, cross1r, cross2r, 1 ]
    return xx0r
def func(
    x, b,
    m1, m2, m3,
    a1, a2,
    p
):
    """
    continuous approximation for a two-step function
    used to fit the log-log data
    p is a sharpness parameter for the transition
    """
    out = b - np.log(
        ( 1 + np.exp( -m1 * ( x - a1 ) )**abs( p ) )
    ) / p + np.log(
        ( 1 + np.exp( m2 * ( x - a1 ) )**abs( p ) )
    ) / p - np.log(
        ( 1 + np.exp( m3 * ( x - a2 ) )**abs( p ) )
    ) / abs( p )
    return out
def expfunc(
    x, b,
    m1, m2, m3,
    a1, a2,
    p
):
    """
    remapping to the original data
    """
    xi = np.log( x )
    eta = func(
        xi, b,
        m1, m2, m3,
        a1, a2,
        p)
    return np.exp(eta)
def expfunc2(
    x, b,
    m1, m2, m3,
    a1, a2,
    p
):
    """
    simplified remapping
    """
    aa1 = np.exp( a1 )
    aa2 = np.exp( a2 )
    return (
        np.exp( b ) * (
            ( 1 + ( x / aa1 )**( m2 * p ) ) /
            ( 1 + ( x / aa2 )**( m3 * p ) ) /
            ( 1 + ( aa1 / x )**( m1 * p ) )
        )**( 1 / p )
    )
logsig = np.log( signal )
logred = np.log( data[:,0] )
loggreen = np.log( data[:,1] )
logblue = np.log( data[:,2] )
### getting initial parameters
### red
xx0r = determine_start_parameters( signal, data[ :, 0 ] )
xx0g = determine_start_parameters( signal, data[ :, 1 ] )
xx0b = determine_start_parameters( signal, data[ :, 2 ] )
print( xx0r )
print( xx0g )
print( xx0b )
xl = np.linspace( 1, 6, 150 )
tl = np.linspace( 1, 260, 150 )
solred = curve_fit( func, logsig, logred, p0=xx0r )[0]
solgreen = curve_fit( func, logsig, loggreen, p0=xx0g )[0]
solblue = curve_fit( func, logsig, logblue, p0=xx0b )[0]
print( solred )
print( solgreen )
print( solblue )
fig = plt.figure()
ax = fig.add_subplot( 2, 1, 1 )
bx = fig.add_subplot( 2, 1, 2 )
ax.scatter( np.log( signal ), np.log( data[:,0] ), color = 'r' )
ax.scatter( np.log( signal ), np.log( data[:,1] ), color = 'g' )
ax.scatter( np.log( signal ), np.log( data[:,2] ), color = 'b' )
ax.plot( xl, func( xl, *solred ), color = 'r' )
ax.plot( xl, func( xl, *solgreen ), color = 'g' )
ax.plot( xl, func( xl, *solblue ), color = 'b' )
bx.scatter( signal, data[:,0], color = 'r' )
bx.scatter( signal, data[:,1], color = 'g' )
bx.scatter( signal, data[:,2], color = 'b' )
bx.plot( tl, expfunc2( tl, *solred), color = 'r' )
bx.plot( tl, expfunc2( tl, *solgreen), color = 'g' )
bx.plot( tl, expfunc2( tl, *solblue), color = 'b' )
plt.show()
Which results in
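The relation between func (log-log coordinates) and expfunc2 (original coordinates) can be checked numerically. The sketch below restates both functions for p > 0 and evaluates them with arbitrary parameter values that were chosen only for illustration; they are not fit results:

```python
import numpy as np

def func(x, b, m1, m2, m3, a1, a2, p):
    # Smooth two-step function in log-log coordinates (restated for p > 0).
    return (b
            - np.log(1 + np.exp(-m1 * (x - a1))**p) / p
            + np.log(1 + np.exp(m2 * (x - a1))**p) / p
            - np.log(1 + np.exp(m3 * (x - a2))**p) / p)

def expfunc2(x, b, m1, m2, m3, a1, a2, p):
    # Same model written directly in the original (linear) coordinates.
    aa1, aa2 = np.exp(a1), np.exp(a2)
    return np.exp(b) * (
        (1 + (x / aa1)**(m2 * p))
        / (1 + (x / aa2)**(m3 * p))
        / (1 + (aa1 / x)**(m1 * p))
    )**(1 / p)

# Arbitrary illustrative parameters: b, m1, m2, m3, a1, a2, p.
params = (-5.0, 1.2, 0.8, 0.5, 2.0, 4.5, 3.0)
x = np.linspace(1, 260, 50)
lhs = np.exp(func(np.log(x), *params))
rhs = expfunc2(x, *params)
print(np.allclose(lhs, rhs))
```

The point of the second form is that the fitted parameters can be applied to raw signal values without taking logarithms first.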
Related
The curve and my attempt at fitting:
I wish to find the coefficients (A, B, C, D, E, F) for my model function: A * x**2 + B * x + C * np.cos(D * x - E) + F that would almost exactly match the blue curve. But because I used SciPy's curve_fit, which finds the curve with the lowest squared difference, the result looks like the red curve in the image, while I want the red curve to match the crests and troughs of the blue curve. Can SciPy do this, and if so, how? If not, is there any other library that can handle this?
This is the method mentioned by JJacquelin to make a double linear fit. It fits the data and can be used to provide initial guesses for the non-linear fit. Note that for this method, it is required to express P sin( w t + p ) as A sin( w t ) + B cos( w t ), but that is easily done.
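The rewriting of P sin( w t + p ) as A sin( w t ) + B cos( w t ) mentioned above follows from the angle-addition formula. A minimal sketch, with helper names and numerical values that are illustrative only:

```python
import numpy as np

def phase_to_quadrature(P, p):
    # P*sin(w*t + p) = A*sin(w*t) + B*cos(w*t) with A = P*cos(p), B = P*sin(p)
    return P * np.cos(p), P * np.sin(p)

def quadrature_to_phase(A, B):
    # Inverse mapping; arctan2 keeps the correct quadrant.
    return np.hypot(A, B), np.arctan2(B, A)

w = 22.0                       # illustrative angular frequency
t = np.linspace(-0.3, 1.6, 50)
P, p = 0.03, 1.4               # illustrative amplitude and phase
A, B = phase_to_quadrature(P, p)
same = np.allclose(P * np.sin(w * t + p),
                   A * np.sin(w * t) + B * np.cos(w * t))
P_back, p_back = quadrature_to_phase(A, B)
print(same, P_back, p_back)
```

The advantage of the A, B form is that both coefficients enter linearly once the frequency is known, which is what makes the linear solve possible.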
import matplotlib.pyplot as plt
import numpy as np
from scipy.integrate import cumulative_trapezoid as cumtrapz ### cumtrapz was renamed in newer SciPy
from scipy.optimize import curve_fit
def signal( x, A, B, C, D, E, F ):
    ### note: C, D, E, F have different meaning here
    r = (
        A * x**2
        + B * x
        + C
        + D * np.sin( F * x )
        + E * np.cos( F * x )
    )
    return r

def signal_p( x, A, B, C, D, E, F ):
    r = (
        A * x**2
        + B * x
        + C * np.sin( D * x - E )
        + F
    )
    return r
testparams = [ -1, 1, 3, 0.005, 0.03, 22 ]
### test data with noise
xl = np.linspace( -0.3, 1.6, 190 )
sl = signal( xl, *testparams )
sl += np.random.normal( size=len( xl ), scale=0.005 )
### numerical integrals
Sl = cumtrapz( sl, x=xl, initial=0 )
SSl = cumtrapz( Sl, x=xl, initial=0 )
### fitting the integro-differential equation to get the frequency
"""
note:
with y = A x**2 +...+ D sin() + E cos()
the double integral int( int(y) ) = a x**4 + ... - y/F**2
"""
VMXT = np.array( [ xl**4, xl**3, xl**2, xl, np.ones( len( xl ) ), sl ] )
VMX = VMXT.transpose()
A = np.dot( VMXT, VMX )
SV = np.dot( VMXT, SSl )
AI = np.linalg.inv( A )
result = np.dot( AI , SV )
print ( "Fit: ",result )
F = np.sqrt( -1 / result[-1] )
print("F = ", F)
### Fitting the linear parameters with the frequency known
VMXT = np.array(
[
xl**2, xl, np.ones( len( xl ) ),
np.sin( F * xl), np.cos( F * xl )
]
)
VMX = VMXT.transpose()
A = np.dot( VMXT, VMX )
SV = np.dot( VMXT, sl )
AI = np.linalg.inv( A )
A, B, C, D, E = np.dot( AI , SV )
print( A, B, C, D, E )
### Non-linear fit with initial guesses
amp = np.sqrt( D**2 + E**2 )
phi = -np.arctan2( E, D ) ### so that D sin( F x ) + E cos( F x ) = amp * sin( F x - phi )
opt, cov = curve_fit( signal_p, xl, sl, p0=( A, B, amp, F, phi, C ) )
print( opt )
### plotting
fig = plt.figure()
ax = fig.add_subplot( 1, 1, 1 )
ax.plot(
xl, sl,
ls='', marker='+', label="data", markersize=5
)
ax.plot(
xl, signal( xl, A, B, C, D, E, F ),
ls="--", label="double linear fit"
)
ax.plot(
xl, signal_p( xl, *opt ),
ls=":", label="non-linear"
)
ax.legend( loc=0 )
ax.grid()
plt.show()
Providing
Fit: [-0.083161 0.1659759 1.49879056 0.848999 0.130222 -0.001990]
F = 22.414133356157887
-0.998516 0.998429 3.000265 0.012701 0.026926
[-0.99856269 0.9973273 0.0305014 21.96402992 -1.4215656 3.00100979]
and
When I run the non-linear fit without initial guesses, I get essentially a parabola. This is understandable: a half-wave of a sine looks very much like a parabola. Hence, the non-linear fit drives the corresponding parameters in that direction, especially since the default initial guesses are all 1, which is far from the small amplitude and the high frequency of the actual signal. The fit only finds a local minimum of the chi-square surface.
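The integro-differential step in the code above can be spelled out as follows. This is a sketch of the reasoning behind the note in the code; the integration constants c_0 and c_1 are notation introduced here:

```latex
\begin{aligned}
y(x) &= A x^2 + B x + C + D\sin(Fx) + E\cos(Fx),\\
\iint y \,\mathrm{d}x\,\mathrm{d}x
  &= \tfrac{A}{12}x^4 + \tfrac{B}{6}x^3 + \tfrac{C}{2}x^2 + c_1 x + c_0
     - \tfrac{1}{F^2}\bigl(D\sin(Fx) + E\cos(Fx)\bigr),\\
D\sin(Fx) + E\cos(Fx) &= y - A x^2 - B x - C,\\
\iint y \,\mathrm{d}x\,\mathrm{d}x
  &= \tfrac{A}{12}x^4 + \tfrac{B}{6}x^3
     + \Bigl(\tfrac{C}{2} + \tfrac{A}{F^2}\Bigr)x^2
     + \Bigl(c_1 + \tfrac{B}{F^2}\Bigr)x
     + \Bigl(c_0 + \tfrac{C}{F^2}\Bigr)
     - \tfrac{1}{F^2}\,y .
\end{aligned}
```

The double integral is therefore linear in the basis x**4, x**3, x**2, x, 1 and y, and the coefficient of y is -1/F**2. This is why the code computes F = sqrt( -1 / result[-1] ); with the printed value -0.001990 this gives F of about 22.4.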
I'm interested in modeling this surface with a simple equation that takes in two parameters (x,y) values and produces a z value. Ideally an equation that has a simple form. I have tried Monkey Saddle, polynomial regression (3rd and 4th order) and also multi-linear and log-linear OLS with some success (R^2 0.99), but none that are perfect especially for the curvy part. It seems like there should be a simple model to predict this surface. Maybe a non-linear regression method. Any suggestions? Thanks!
Using Mikuszefski's suggestion seems to produce a reasonable result for the curvy bit:
While the OP is problematic in the sense that it is not really a programming question, here is my attempt to fit the data with a function that has a reasonably small number of parameters, i.e. 6. The code more or less shows my line of thinking in arriving at this solution. It fits the data very well, but probably has no physical meaning whatsoever.
import matplotlib.pyplot as plt
from mpl_toolkits import mplot3d
import numpy as np
np.set_printoptions( precision=3 )
from scipy.optimize import curve_fit
from scipy.optimize import least_squares
data0 = np.loadtxt( "XYZ.csv", delimiter=",")
data = data0.reshape(11,16,4)
tdata = np.transpose(data, axes=(1,0,2))
### first guess for the shape of the data when plotted against x
def my_p( x, a, p):
    return a * x**p

### turns out that the fit parameters themselves follow the above law
### but with an offset and plotted against z, of course
def my_p2( x, a, p, x0):
    return a * (x - x0)**p

def my_full( x, y, asc, aexp, ash, psc, pexp, psh):
    a = my_p2( y, asc, aexp, ash )
    p = my_p2( y, psc, pexp, psh )
    z = my_p( x, a, p)
    return z
### base lists for plotting
xl = np.linspace( 0, 30, 150 )
yl = np.linspace( 5, 200, 200 )
### setting the plots
fig = plt.figure( figsize=( 16, 12) )
ax = fig.add_subplot( 2, 3, 1 )
bx = fig.add_subplot( 2, 3, 2 )
cx = fig.add_subplot( 2, 3, 3 )
dx = fig.add_subplot( 2, 3, 4 )
ex = fig.add_subplot( 2, 3, 5 )
fx = fig.add_subplot( 2, 3, 6, projection='3d' )
### fitting the data linewise for different z as function of x
### keeping track of the fit parameters
adata = list()
pdata = list()
for subdata in data:
    xdata = subdata[::,1]
    zdata = subdata[::,3 ]
    sol,_ = curve_fit( my_p, xdata, zdata )
    print( sol, subdata[0,2] ) ### fit parameters for different z
    adata.append( [subdata[0,2] , sol[0] ] )
    pdata.append( [subdata[0,2] , sol[1] ] )
    ### plotting the z-cuts
    ax.plot( xdata , zdata , ls='', marker='o')
    ax.plot( xl, my_p( xl, *sol ) )
adata = np.array( adata )
pdata = np.array( pdata )
ax.scatter( [0],[0] )
ax.grid()
### fitting the fit parameters as function of z
sola, _ = curve_fit( my_p2, adata[::,0], adata[::,1], p0= ( 1, -0.05,0 ) )
print( sola )
bx.plot( *(adata.transpose() ) )
bx.plot( yl, my_p2( yl, *sola))
solp, _ = curve_fit( my_p2, pdata[::,0], pdata[::,1], p0= ( 1, -0.05,0 ) )
print( solp )
cx.plot( *(pdata.transpose() ) )
cx.plot( yl, my_p2( yl, *solp))
### plotting the cuts applying the results from the "fit of fits"
for subdata in data:
    xdata = subdata[ ::, 1 ]
    y0 = subdata[ 0, 2 ]
    zdata = subdata[ ::, 3 ]
    dx.plot( xdata , zdata , ls='', marker='o' )
    dx.plot(
        xl,
        my_full(
            xl, y0, 2.12478827, -0.187, -20.84, 0.928, -0.0468, 0.678
        )
    )
### now fitting the entire data with the overall 6 parameter function
def residuals( params, alldata ):
    asc, aexp, ash, psc, pexp, psh = params
    diff = list()
    for data in alldata:
        x = data[1]
        y = data[2]
        z = data[3]
        zth = my_full( x, y, asc, aexp, ash, psc, pexp, psh)
        diff.append( z - zth )
    return diff
## and fitting using the hand-made residual function and least_squares
resultfinal = least_squares(
residuals,
x0 =( 2.12478827, -0.187, -20.84, 0.928, -0.0468, 0.678 ),
args = ( data0, ) )
### least_squares does not provide errors but the approximated jacobian
### so we follow:
### https://stackoverflow.com/q/61459040/803359
### https://stackoverflow.com/q/14854339/803359
print( resultfinal.x)
resi = resultfinal.fun
JMX = resultfinal.jac
HMX = np.dot( JMX.transpose(),JMX )
cov_red = np.linalg.inv( HMX )
s_sq = np.sum( resi**2 ) /( len(data0) - 6 )
cov = cov_red * s_sq
print( cov )
### plotting the cuts with the overall fit
for subdata in data:
    xdata = subdata[::,1]
    y0 = subdata[0,2]
    zdata = subdata[::,3 ]
    ex.plot( xdata , zdata , ls='', marker='o')
    ex.plot( xl, my_full( xl, y0, *resultfinal.x ) )
### and in 3d, which is actually not very helpful partially due to the
### fact that matplotlib has limited 3d capabilities.
XX, YY = np.meshgrid( xl, yl )
ZZ = my_full( XX, YY, *resultfinal.x )
fx.scatter(
data0[::,1], data0[::,2], data0[::,3],
color="#ff0000", alpha=1 )
fx.plot_wireframe( XX, YY, ZZ , cmap='inferno')
plt.show()
Providing
[1.154 0.866] 5.0
[1.126 0.837] 10.0
[1.076 0.802] 20.0
[1.013 0.794] 30.0
[0.975 0.789] 40.0
[0.961 0.771] 50.0
[0.919 0.754] 75.0
[0.86 0.748] 100.0
[0.845 0.738] 125.0
[0.816 0.735] 150.0
[0.774 0.726] 200.0
[ 2.125 -0.186 -20.841]
[ 0.928 -0.047 0.678]
[ 1.874 -0.162 -13.83 0.949 -0.052 -1.228]
[[ 6.851e-03 -7.413e-04 -1.737e-01 -6.914e-04 1.638e-04 5.367e-02]
[-7.413e-04 8.293e-05 1.729e-02 8.103e-05 -2.019e-05 -5.816e-03]
[-1.737e-01 1.729e-02 5.961e+00 1.140e-02 -2.272e-03 -1.423e+00]
[-6.914e-04 8.103e-05 1.140e-02 1.050e-04 -2.672e-05 -6.100e-03]
[ 1.638e-04 -2.019e-05 -2.272e-03 -2.672e-05 7.164e-06 1.455e-03]
[ 5.367e-02 -5.816e-03 -1.423e+00 -6.100e-03 1.455e-03 5.090e-01]]
and
The fit looks good and the covariance matrix seems also ok.
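The covariance estimate built from the Jacobian of least_squares can be sanity-checked on a problem where the answer is known in closed form. The following is a minimal sketch on synthetic straight-line data; the data, seed and names are assumptions made for illustration and are unrelated to the surface fit:

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 40)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)  # synthetic line + noise

def residuals(params):
    slope, offset = params
    return y - (slope * x + offset)

res = least_squares(residuals, x0=(1.0, 0.0))

# Covariance estimate from the Jacobian at the solution.
J = res.jac
dof = x.size - len(res.x)
s_sq = np.sum(res.fun**2) / dof
cov = np.linalg.inv(J.T @ J) * s_sq

# Reference: closed-form covariance of ordinary linear least squares.
X = np.column_stack([x, np.ones_like(x)])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
cov_ref = np.linalg.inv(X.T @ X) * np.sum((y - X @ beta)**2) / dof
print(np.allclose(cov, cov_ref, rtol=1e-3))
```

For a non-linear model such as the six-parameter surface, the same formula is only a local, linearised approximation around the optimum.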
The final function, hence, is:
z = 1.874 / ( y + 13.83 )**0.162 * x**( 0.949 / ( y + 1.228 )**0.052 )
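As a small worked check, the closed-form expression above can be written as a function and compared with the nested power-law structure used during fitting. This sketch uses the rounded parameters quoted above; the function name z_model is introduced here for illustration:

```python
import numpy as np

def z_model(x, y):
    # Final fitted surface, with the rounded parameters quoted above.
    a = 1.874 / (y + 13.83)**0.162
    p = 0.949 / (y + 1.228)**0.052
    return a * x**p

def my_p2(x, a, p, x0):
    return a * (x - x0)**p

def my_full(x, y, asc, aexp, ash, psc, pexp, psh):
    # Same structure as in the fitting code: amplitude and exponent depend on y.
    return my_p2(y, asc, aexp, ash) * x**my_p2(y, psc, pexp, psh)

params = (1.874, -0.162, -13.83, 0.949, -0.052, -1.228)
x, y = np.meshgrid(np.linspace(1, 30, 5), np.array([5.0, 50.0, 200.0]))
print(np.allclose(z_model(x, y), my_full(x, y, *params)))
```

The negative exponents and negative shifts in the parameter vector are what turn into the divisions and the plus signs of the closed form.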
Hello as the title suggests I have been trying to add an exponential and power law fit to my PDF.
As shown in this picture:
The code I am using produces the underlying graph:
The code is this one:
a11=[9.76032106e-02, 6.73754187e-02, 3.20683249e-02, 2.21788509e-02,
2.70850237e-02, 9.90377323e-03, 2.11573411e-02, 8.46232347e-03,
8.49027869e-03, 7.33997745e-03, 5.71819070e-03, 4.62720448e-03,
4.11562884e-03, 3.20064313e-03, 2.66192941e-03, 1.69116510e-03,
1.94355212e-03, 2.55224949e-03, 1.23822395e-03, 5.29618250e-04,
4.03769641e-04, 3.96865740e-04, 3.38530868e-04, 2.04124701e-04,
1.63913557e-04, 2.04486864e-04, 1.82216592e-04, 1.34708400e-04,
9.24289261e-05, 9.55074181e-05, 8.13695322e-05, 5.15610541e-05,
4.15425149e-05, 4.68101099e-05, 3.33696885e-05, 1.61893058e-05,
9.61743970e-06, 1.17314090e-05, 6.65239507e-06]
b11=[3.97213201e+00, 4.77600082e+00, 5.74255432e+00, 6.90471618e+00,
8.30207306e+00, 9.98222306e+00, 1.20023970e+01, 1.44314081e+01,
1.73519956e+01, 2.08636432e+01, 2.50859682e+01, 3.01627952e+01,
3.62670562e+01, 4.36066802e+01, 5.24316764e+01, 6.30426504e+01,
7.58010432e+01, 9.11414433e+01, 1.09586390e+02, 1.31764173e+02,
1.58430233e+02, 1.90492894e+02, 2.29044305e+02, 2.75397642e+02,
3.31131836e+02, 3.98145358e+02, 4.78720886e+02, 5.75603061e+02,
6.92091976e+02, 8.32155588e+02, 1.00056488e+03, 1.20305636e+03,
1.44652749e+03, 1.73927162e+03, 2.09126048e+03, 2.51448384e+03,
3.02335795e+03, 3.63521656e+03, 4.37090138e+03]
import matplotlib.pyplot as plt
plt.plot(b11,a11, 'ro')
plt.yscale("log")
plt.xscale("log")
plt.show()
I would like to add to the underlying graph a power law fit at smaller times and an exponential fit for longer times, based on the chi-square error minimization method.
The data for the x axis saved in csv form:
The data for the x axis:
As mentioned in my comments, I think you can couple the power law and the exponential via a constant term. Alternatively, the data look like they can be fitted by two power laws, although the comments suggest that there is truly exponential behavior. Anyhow, I show both approaches here. In both cases I try to avoid any type of piecewise definition. This also ensures that the function is $C^\infty$.
In the first approach we have a * x**( -b ) for small x and a1 * exp( -d * x ) for large x. The idea is to choose a c such that the power law is much bigger than c for the required small x but significantly smaller otherwise.
This allows for the function mentioned in my comment, namely ( a * x**( -b ) + c ) * exp( -d * x ). One may consider c a transition parameter.
In the alternative approach, I take two power laws. There are, hence, two regions: in the first one, function one is smaller; in the second, function two is smaller. As I always want the smaller function, I use inverse summation, i.e., f = 1 / ( 1 / f1 + 1 / f2 ). As can be seen in the code below, I add an additional parameter k (technically in ] 0, infty [) as an exponent. This parameter controls the smoothness of the transition.
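The inverse summation can be illustrated in isolation. The sketch below uses the same values as the starting guess of the two-power-law fit and shows how the exponent sharpens the transition towards min( f1, f2 ); the helper name smooth_min is an assumption introduced here:

```python
import numpy as np

def smooth_min(f1, f2, k):
    # Inverse summation: close to min(f1, f2) away from the crossover;
    # larger k gives a sharper transition.
    return 1.0 / (1.0 / f1**k + 1.0 / f2**k)**(1.0 / k)

x = np.logspace(0, 3, 7)
f1 = 0.1 * x**(-0.8)   # shallow power law
f2 = 1e7 * x**(-4.0)   # steep power law

errs = []
for k in (1, 4, 16):
    approx = smooth_min(f1, f2, k)
    # Largest relative deviation from the hard minimum.
    errs.append(np.max(np.abs(approx / np.minimum(f1, f2) - 1)))
    print(k, errs[-1])
```

The largest deviation occurs where the two power laws cross, and it shrinks as k grows, so k trades smoothness against how closely the two asymptotes are followed.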
import matplotlib.pyplot as mp
import numpy as np
from scipy.optimize import curve_fit
data = np.loadtxt( "7jyRi.txt", delimiter=',' )
#### p-e: power and exponential coupled via a small constant term
def func_log( x, a, b, c, d ):
    return np.log10( ( a * x**( -b ) + c ) * np.exp( -d * x ) )

guess = [.1, .8, 0.01, .005 ]
testx = np.logspace( 0, 3, 150 )
### note: np.float was removed from NumPy, the builtin float is used instead
testy = np.fromiter( ( 10**func_log( x, *guess ) for x in testx ), float )
sol, _ = curve_fit( func_log, data[ ::, 0 ], np.log10( data[::,1] ), p0=guess )
fity = np.fromiter( ( 10**func_log( x, *sol ) for x in testx ), float )
#### p-p: alternatively using two power laws
def double_power_log( x, a, b, c, d, k ):
    s1 = ( a * x**( -b ) )**k
    s2 = ( c * x**( -d ) )**k
    out = 1.0 / ( 1.0 / s1 + 1.0 / s2 )**( 1.0 / k )
    return np.log10( out )

aguess = [.1, .8, 1e7, 4, 1 ]
atesty = np.fromiter( ( 10**double_power_log( x, *aguess ) for x in testx ), float )
asol, _ = curve_fit( double_power_log, data[ ::, 0 ], np.log10( data[ ::, 1 ] ), p0=aguess )
afity = np.fromiter( ( 10**double_power_log( x, *asol ) for x in testx ), float )
#### plotting
fig = mp.figure( figsize=( 10, 8 ) )
ax = fig.add_subplot( 1, 1, 1 )
ax.plot( data[::,0], data[::,1] ,ls='', marker='o', label="data" )
ax.plot( testx, testy ,ls=':', label="guess p-e" )
ax.plot( testx, atesty ,ls=':',label="guess p-p" )
ax.plot( testx, fity ,ls='-',label="fit p-e: {}".format( sol ) )
ax.plot( testx, afity ,ls='-', label="fit p-p: {}".format( asol ) )
ax.set_xscale( "log" )
ax.set_yscale( "log" )
ax.set_xlim( [ 5e-1, 2e3 ] )
ax.set_ylim( [ 1e-5, 2e-1 ] )
ax.legend( loc=0 )
mp.show()
The results look like
For completeness I'd like to add a solution with a piece-wise definition. As I want the function continuous and differentiable, the parameters of the exponential law are not completely free. With f = a * x**(-b) and g = alpha * exp( -beta * x ) and a transition at x0 I choose ( a, b, x0 ) as free parameters. From this alpha and beta follow. The equations have no easy solution though, such that this itself requires a minimization.
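For the two conditions as stated (equal value and equal slope at x0), dividing the slope condition by the value condition appears to give a closed form, beta = b / x0 and alpha = a * x0**( -b ) * exp( b ). The following is a minimal sketch of that derivation with a numerical check; it is offered as a possible alternative to the inner minimization and has not been compared with the fit results reported here. The function name match_exponential is hypothetical:

```python
import numpy as np

def match_exponential(a, b, x0):
    # Value and slope matching at x0:
    #   a * x0**(-b)           = alpha * exp(-beta * x0)
    #   -a * b * x0**(-b - 1)  = -alpha * beta * exp(-beta * x0)
    # Dividing the second equation by the first gives beta = b / x0.
    beta = b / x0
    alpha = a * x0**(-b) * np.exp(b)
    return alpha, beta

a, b, x0 = 0.1, 0.8, 70.0          # the starting guess used in the fit below
alpha, beta = match_exponential(a, b, x0)

def f(x):
    return a * x**(-b)

def g(x):
    return alpha * np.exp(-beta * x)

h = 1e-4
df = (f(x0 + h) - f(x0 - h)) / (2 * h)   # central differences
dg = (g(x0 + h) - g(x0 - h)) / (2 * h)
print(alpha, beta, f(x0) - g(x0), df - dg)
```

Both printed differences should be at the level of rounding error, meaning the two branches join with matching value and slope at x0.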
import matplotlib.pyplot as mp
import numpy as np
from scipy.optimize import curve_fit
from scipy.optimize import minimize
from scipy.special import lambertw
data = np.loadtxt( "7jyRi.txt", delimiter=',' )
def pwl( x, a, b):
    return a * x**( -b )

def expl( x, a, b ):
    return a * np.exp( -b * x )

def alpha_fun(alpha, a, b, x0):
    out = alpha - pwl( x0, a, b ) * expl(1, 1, lambertw( pwl( x0, -a * b/ alpha, b ) ) )
    return 1e10 * np.abs( out )**2

def p_w( v, a,b, alpha, beta, x0 ):
    ### power law below the transition, exponential above
    if v < x0:
        out = pwl( v, a, b )
    else:
        out = expl( v, alpha, beta )
    return np.log10( out )

def alpha_beta( x, a, b, x0 ):
    """
    continuous and differentiable define alpha and beta
    free parameter is the point where I connect
    """
    sol = minimize(alpha_fun, .005, args=( a, b, x0 ) )### attention, strongly depends on starting guess, i.e might be a catastrophic fail
    alpha = sol.x[0]
    beta = np.real( -lambertw( pwl( x0, -a * b/ alpha, b ) )/ x0 )
    ###
    if isinstance( x, ( np.ndarray, list, tuple ) ):
        out = list()
        for v in x:
            out.append( p_w( v, a, b, alpha, beta, x0 ) )
    else:
        ### scalar input: evaluate at x itself
        out = p_w( x, a, b, alpha, beta, x0 )
    return out
sol,_ = curve_fit( alpha_beta, data[ ::, 0 ], np.log10( data[ ::, 1 ] ), p0=[ .1, .8, 70. ] )
alpha0 = minimize(alpha_fun, .005, args=tuple(sol ) ).x[0]
beta0 = np.real( -lambertw( pwl( sol[2], -sol[0] * sol[1]/ alpha0, sol[1] ) )/ sol[2] )
xl = np.logspace(0,3,100)
yl = alpha_beta( xl, *sol )
pl = pwl( xl, sol[0], sol[1] )
el = expl( xl, alpha0, beta0 )
#### plotting
fig = mp.figure( figsize=( 10, 8 ) )
ax = fig.add_subplot( 1, 1, 1 )
ax.plot( data[::,0], data[::,1] ,ls='', marker='o', label="data" )
ax.plot( xl, pl ,ls=':', label="p" )
ax.plot( xl, el ,ls=':', label="{:0.3e} exp(-{:0.3e} x)".format(alpha0, beta0) )
ax.plot( xl, [10**y for y in yl] ,ls='-', label="sol: {}".format(sol) )
ax.axvline(sol[-1], color='k', ls=':')
ax.set_xscale( "log" )
ax.set_yscale( "log" )
ax.set_xlim( [ 5e-1, 2e3 ] )
ax.set_ylim( [ 1e-5, 2e-1 ] )
ax.legend( loc=0 )
mp.show()
Eventually providing
I'm trying to fit a double broken profile function which consists of the arctangent function. My code doesn't seem to be working:
XX=np.linspace(7.5,9.5,16)
YY=np.asarray([7,7,7,7.1,7.3,7.5,8.4,9,9.3,9.6,10.3,10.2,10.4,10.5,10.5,10.5])
def func_arc(x,a1,a2,a3,b1,b2,b3,H1,H2):
beta=0.001
w1=np.zeros(len(x))
w2=np.zeros(len(x))
for i in np.arange(0,len(x)):
w1[i]=(((math.pi/2)+atan((x[i]-H1)/beta))/math.pi)
w2[i]=(((math.pi/2)+atan((x[i]-H2)/beta))/math.pi)
y=(a1*x[i]+b1)*(1-w1[i])+(a2*x[i]+b2)*w1[i]*(1-w2[i])+(a3*x+b3)*w2[i]
return(y)
Where the a and b terms are slope and zero-point values of the linear regressions.
The w terms are used to switch the domain.
I take into account the following restrictions for continuity (H1 and H2) and restrict parameters:
mask=(XX<=8.2)
mask2=(XX>8.2) & (XX<9)
mask3=(XX>=9)
l1=np.polyfit(XX[mask], YY[mask], 1)
l2=np.polyfit(XX[mask2], YY[mask2], 1)
l3=np.polyfit(XX[mask3], YY[mask3], 1)
H1=(l2[1]-l1[1])/(l1[0]-l2[0])
H2=(l3[1]-l2[1])/(l2[0]-l3[0])
p0=[l1[0],l2[0],l3[0],l1[1],l2[1],l3[1],H1,H2]
popt_arc1, pcov_arc1 =curve_fit(func_arc, XX, YY,p0)
I obtain a single line instead of a broken profile (S-shape).
What I obtain:
Here is my version. Because the linear functions should connect continuously, there are actually fewer free parameters. The offsets b2 and b3, hence, are not fitted, but result from requiring the linear functions to meet at the transitions. Moreover, one might argue that the data do not justify a slope in the beginning or end. This could be justified or checked via the reduced chi-square or other statistical methods.
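The continuity argument can be verified with a few lines. In this sketch the numerical values are illustrative only (flat, rising, flat) and the helper name continuous_offsets is introduced here:

```python
import numpy as np

def continuous_offsets(a1, a2, a3, b1, x0, x1):
    # Offsets that make neighbouring line segments meet at x0 and x1.
    b2 = (a1 - a2) * x0 + b1
    b3 = (a2 - a3) * x1 + b2
    return b2, b3

# Illustrative values only: flat, rising, flat.
a1, a2, a3, b1, x0, x1 = 0.0, 4.0, 0.0, 7.0, 8.1, 9.0
b2, b3 = continuous_offsets(a1, a2, a3, b1, x0, x1)

# Difference between neighbouring segments at each transition point.
gap_at_x0 = (a1 * x0 + b1) - (a2 * x0 + b2)
gap_at_x1 = (a2 * x1 + b2) - (a3 * x1 + b3)
print(b2, b3, gap_at_x0, gap_at_x1)
```

Both gaps vanish up to rounding, so only a1, a2, a3, b1 and the two transition points remain as free parameters.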
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
XX=np.linspace( 7.5, 9.5, 16 )
YY=np.asarray( [
7, 7, 7, 7.1, 7.3, 7.5, 8.4, 9, 9.3, 9.6,
10.3, 10.2, 10.4, 10.5, 10.5, 10.5
])
yy0 = np.median( YY )
xx0 = np.median( XX )
h0 = 0.8 * min( XX ) + 0.2 * max( XX )
h1 = 0.8 * max( XX ) + 0.2 * min( XX )
def transition( x, x0, s ):
    ### smooth step from 0 to 1 around x0, s sets the steepness
    return ( 0.5 * np.pi + np.arctan( ( x - x0 ) * s ) ) / np.pi

def box( x, x0, x1, s ):
    return transition( x, x0, s ) * transition( -x, -x1, s )

def my_piecewise( x, a1, a2, a3, b1, x0, x1 ):
    S = 100
    b2 = ( a1 - a2 ) * x0 + b1 ### due to continuity
    b3 = ( a2 - a3 ) * x1 + b2 ### due to continuity
    out = transition( -x , -x0, S ) * ( a1 * x + b1 )
    out += box( x , x0, x1 , S ) * ( a2 * x + b2 )
    out += transition( x , x1, S ) * ( a3 * x + b3 )
    return out

def parameter_reduced( x, a2, b1, x0, x1 ):
    return my_piecewise(x, 0, a2, 0, b1, x0, x1 )

def alternative( x, x0, a, s, p, y0 ):
    out = np.arctan( np.abs( ( s * ( x - x0 ) ) )**p )**( 1.0 / p )
    out *= a * (x - x0 ) / np.abs( x - x0 )
    out += y0
    return out
xl=np.linspace( 7.2, 10, 150 )
sol, _ = curve_fit(
my_piecewise, XX, YY, p0=[ 0, 1, 0, min( YY ), h0, h1 ]
)
### note: np.float was removed from NumPy, the builtin float is used instead
fl = np.fromiter( ( my_piecewise(x, *sol ) for x in xl ), float )
rcp = np.fromiter( ( y - my_piecewise(x, *sol ) for x,y in zip( XX, YY ) ), float )
rcp = sum( rcp**2 )
solr, _ = curve_fit(
parameter_reduced, XX, YY, p0=[ 1, min(YY), h0, h1 ]
)
rl = np.fromiter( ( parameter_reduced( x, *solr ) for x in xl ), float )
rcpr = np.fromiter( ( y - parameter_reduced(x, *solr ) for x, y in zip( XX, YY ) ), float )
rcpr = sum( rcpr**2 )
guessa = [ xx0, max(YY)-min(YY), 1, 1, yy0 ]
sola, _ = curve_fit( alternative, XX, YY, p0=guessa)
al = np.fromiter( ( alternative( x, *sola ) for x in xl ), float )
rca = np.fromiter( ( y - alternative(x, *sola ) for x, y in zip( XX, YY ) ), float )
rca = sum( rca**2 )
### sum of squared residuals and reduced chi-square for each model
print( rcp, rcp / ( len(XX) - 6 ) )
print( rcpr, rcpr / ( len(XX) - 4 ) )
print( rca, rca / ( len(XX) - 5 ) )
fig = plt.figure( figsize=( 12, 8 ) )
ax = fig.add_subplot( 1, 1, 1 )
ax.plot( XX, YY , ls='', marker='o')
ax.plot( xl, fl )
ax.plot( xl, rl )
ax.plot( xl, al )
plt.show()
The results look okay to me.
I am trying to make a gaussian fit on a function that is messy. I want to only fit the exterior outer shell (these are not just the max values at each x, because some of the max values will be too low too, because the sample size is low).
from scipy.optimize import curve_fit
def Gauss(x, a, x0, sigma, offset):
    return a * np.exp(-np.power(x - x0,2) / (2 * np.power(sigma,2))) + offset

def fitNormal(x, y):
    popt, pcov = curve_fit(Gauss, x, y, p0=[np.max(y), np.median(x), np.std(x), np.min(y)])
    return popt
plt.plot(xPlot,yPlot, 'k.')
plt.xlabel('x')
plt.ylabel('y')
plt.title('Y(x)')
x,y = xPlot,yPlot
popt = fitNormal(x, y)
minx, maxx = np.min(x), np.max(x)
xFit = np.arange(start=minx, stop=maxx, step=(maxx-minx)/1000)
yFitTest = Gauss(xPlot, popt[0], popt[1], popt[2], popt[3])
print('max fit test: ',np.max(yFitTest))
print('max y: ',np.max(yPlot))
maxIndex = np.where(yPlot==np.max(yPlot))[0][0]
factor = yPlot[maxIndex]/yFitTest[maxIndex]
yFit = Gauss(xPlot, popt[0], popt[1], popt[2], popt[3]) * factor
plt.plot(xFit,yFit,'r')
This is an iterative approach similar to this post. It differs in that the shape of the graph does not permit the use of a convex hull. So the idea is to create a cost function that tries to minimize the area under the graph while paying a high cost if a point lies above the graph. Depending on the type of graph in the OP, the cost function needs to be adapted. One also has to check whether, in the final result, all points are really below the graph. Here one can fiddle with the details of the cost function. One may, e.g., include an offset in the tanh, like tanh( slope * ( x - offset ) ), to push the solution farther away from the data.
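The behaviour of the tanh term can be looked at on its own. This is a minimal sketch of the per-point penalty val * ( 1 + tanh( slope * d ) ) with made-up differences d = y_data - y_model; the function name penalty is introduced here:

```python
import numpy as np

def penalty(diff, slope, val=1.0):
    # diff = y_data - y_model; smooth step from 0 (model above data)
    # to 2 * val (model below data). Larger slope means a sharper step.
    return val * (1 + np.tanh(slope * diff))

diff = np.array([-0.1, -0.01, 0.0, 0.01, 0.1])
for slope in (10, 100, 1000):
    print(slope, np.round(penalty(diff, slope), 4))
```

Increasing the slope step by step, as the loop over 10**i in the code does, starts with a soft penalty that is easy to optimize and ends close to a hard constraint.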
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import leastsq
def g( x, a, s ):
    return a * np.exp(-x**2 / s**2 )

def cost_function( params, xData, yData, slope, val ):
    a,s = params
    ### area term keeps the envelope as small as possible
    area = 0.5 * np.sqrt( np.pi ) * a * s
    diff = np.fromiter ( ( y - g( x, a, s) for x, y in zip( xData, yData ) ), float )
    ### smooth penalty for points above the curve
    cDiff = np.fromiter( ( val * ( 1 + np.tanh( slope * d ) ) for d in diff ), float )
    out = np.concatenate( [ [area] , cDiff ] )
    return out

xData = np.linspace( -5, 5, 500 )
yData = np.fromiter( ( g( x, .77, 2 ) * np.sin( 257.7 * x )**2 for x in xData ), float )
sol=[ [ 1, 2.2 ] ]
### sharpen the penalty step by step, reusing the previous solution
for i in range( 1, 6 ):
    solN, err = leastsq( cost_function, sol[-1] , args=( xData, yData, 10**i, 1 ) )
    sol += [ solN ]
print( sol )
fig = plt.figure()
ax = fig.add_subplot( 1, 1, 1)
ax.scatter( xData, yData, s=1 )
for solN in sol:
    solY = np.fromiter( ( g( x, *solN ) for x in xData ), float )
    ax.plot( xData, solY )
plt.show()
giving
>> [0.8627445 3.55774814]
>> [0.77758636 2.52613376]
>> [0.76712184 2.1181137 ]
>> [0.76874125 2.01910211]
>> [0.7695663 2.00262339]
and
Here is a different approach using scipy's Differential Evolution module combined with a "brick wall": if any predicted value during the fit is less than the corresponding Y value, the fitting error is made extremely large. I have shamelessly poached code from the answer of @mikuszefski to generate the data used in this example.
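The brick-wall idea can also be written without an explicit loop. The following is a minimal sketch that uses the same condition as the code that follows (the penalty applies when the model value is smaller than the data value); the function name brick_wall_sse and the tiny test arrays are made up for illustration:

```python
import numpy as np

def brick_wall_sse(y_data, y_model, wall=1.0e10):
    # Sum of squared errors, multiplied up heavily as soon as the model
    # falls below any data point (same condition as val[i] < yData[i]).
    multiplier = wall if np.any(y_model < y_data) else 1.0
    return np.sum((multiplier * (y_data - y_model))**2)

y_data = np.array([0.1, 0.5, 0.2])
above = np.array([0.2, 0.6, 0.3])   # model everywhere above the data
below = np.array([0.2, 0.4, 0.3])   # model dips below one point
print(brick_wall_sse(y_data, above), brick_wall_sse(y_data, below))
```

Because the error jumps discontinuously, gradient-based optimizers tend to struggle with such a cost function, which is presumably why a population-based method like differential evolution is used here.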
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import warnings
from scipy.optimize import differential_evolution
def g( x, a, s ):
    return a * np.exp(-x**2 / s**2 )

xData = np.linspace( -5, 5, 500 )
yData = np.fromiter( ( g( x, .77, 2 )* np.sin( 257.7 * x )**2 for x in xData ), float )

def Gauss(x, a, x0, sigma, offset):
    return a * np.exp(-np.power(x - x0,2) / (2 * np.power(sigma,2))) + offset

# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
    warnings.filterwarnings("ignore") # do not print warnings by genetic algorithm
    val = Gauss(xData, *parameterTuple)
    multiplier = 1.0
    for i in range(len(val)):
        if val[i] < yData[i]: # ****** brick wall ******
            multiplier = 1.0E10
    return np.sum((multiplier * (yData - val)) ** 2.0)

def generate_Initial_Parameters():
    # min and max used for bounds
    maxX = max(xData)
    minX = min(xData)
    maxY = max(yData)
    minY = min(yData)
    minData = min(minX, minY)
    maxData = max(maxX, maxY)
    parameterBounds = []
    parameterBounds.append([minData, maxData]) # parameter bounds for a
    parameterBounds.append([minData, maxData]) # parameter bounds for x0
    parameterBounds.append([minData, maxData]) # parameter bounds for sigma
    parameterBounds.append([minData, maxData]) # parameter bounds for offset
    # "seed" the numpy random number generator for repeatable results
    result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3, polish=False)
    return result.x
# generate initial parameter values
geneticParameters = generate_Initial_Parameters()
# create values for display of fitted function
y_fit = Gauss(xData, *geneticParameters)
plt.scatter(xData, yData, s=1 ) # plot the raw data
plt.plot(xData, y_fit) # plot the equation using the fitted parameters
plt.show()
print('parameters:', geneticParameters)