Least mean square method for multiple functions at once in Python

I have two formulas that describe the behaviour along two perpendicular axes, and I also have data from an FEM simulation. The goal is to use the least squares method to get the parameters Rr, Lr and cm.
I wanted to use scipy.optimize.curve_fit, but unfortunately it accepts only a single function as input. In this case I would need it to accept two functions.
I did a version in Excel where the arguments are entered by hand to check whether the data can be fitted perfectly. It cannot, but I would like to get the "best" fit.
Any idea how this can be solved, besides hard-coding the least squares method by hand to calculate the deviations and find the minimum?
Thank you so much for help.

Unless you use a package like lmfit or similar, fitting curves with shared parameters will always require writing some sort of wrapper. Personally I'd write a residual function and use scipy.optimize.least_squares, but if one insists on using curve_fit, this would be a possible wrapper:
import numpy as np
from scipy.optimize import curve_fit
def f1( x, c, L, R):
    a = c**2 * x / ( R**2 + (x * L )**2 )
    return a * x * L
def f2( x, c, L, R):
    a = c**2 * x / ( R**2 + (x * L )**2 )
    return a * R
def falt( x, c, L, R, n=-1):
    """
    by construction x is the doubled x-list, 0 <= nn / l < 1
    and >= 1/2 is the second part
    """
    if isinstance( x, ( list, tuple, np.ndarray ) ):
        ### curve_fit sends array
        l = len( x )
        out = [ falt( xx, c, L, R, n=( nn / l ) ) for nn, xx in enumerate( x ) ]
    else:
        if n < 0.5:
            out = f1( x, c, L, R)
        else:
            out = f2( x, c, L, R)
    return out
## some data
c0=1.2
L0=0.3
R0= 0.45
size = 99
xl = np.linspace( 0, 10, size )
y1l = f1( xl , c0, L0, R0 ) + ( 2 * np.random.random( size=size ) - 1 ) * 0.1
y2l = f2( xl , c0, L0, R0 ) + ( 2 * np.random.random( size=size ) - 1 ) * 0.1
sol, err = curve_fit(
    falt,
    np.append( xl, xl ),
    np.append( y1l, y2l )
)
print( sol )

You can put the relative importance of the functions in a hyperparameter lambda, then use func1 + lambda * func2.
With code:
importance_of_func1_relative_to_func2 = 1
def objective(args1, args2):
    return func1(args1) * importance_of_func1_relative_to_func2 + func2(args2)
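For a least-squares fit, one possible reading of this suggestion is to weight the squared residuals of the second data set by lambda. The sketch below assumes that reading; the helper name weighted_residuals and the sample numbers are made up for illustration, and the weight is called lam because lambda is a reserved word in Python.

```python
import numpy as np

def weighted_residuals(r1, r2, lam=1.0):
    # Scaling r2 by sqrt(lam) makes the sum of squares equal to
    # sum(r1**2) + lam * sum(r2**2), i.e. a weighted least-squares cost.
    return np.concatenate((r1, np.sqrt(lam) * r2))

# Hypothetical residuals of two curves against their data.
r1 = np.array([1.0, -2.0])
r2 = np.array([0.5, 0.5])

cost = np.sum(weighted_residuals(r1, r2, lam=4.0) ** 2)
print(cost)  # 5.0 from r1 plus 4 * 0.5 from r2
```

A residual vector built this way can be handed to scipy.optimize.least_squares, which minimizes the sum of squares of whatever the residual function returns.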

If I understand the goal (not sure of that!), I think that what you might want to do is have a single function that evaluates your 2 values for Fperp and Fpara and then concatenates them. You wrote those as both being multiplied by |z| (maybe abs(zhat)?) -- I cannot tell if that should be a common scaling factor, a fitting variable, or some other array of values...
Anyway, I might suggest a function like
def f_model(omega, cm, rr, lr, zhat):
    ll = lr * omega
    scale = abs(zhat) * cm**2 / (rr**2 + ll**2)
    fpara = scale * ll * omega
    fperp = scale * rr * omega
    return np.concatenate((fpara, fperp))
Then you would want to arrange the data that you model with this function to also be the concatenation of the data corresponding to fpara and fperp.
That concatenation would effectively fit Fpara and Fperp together, weighting them evenly in the fit.
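The concatenation idea can be sketched with scipy.optimize.least_squares as follows. This is only an illustration under assumptions: the scaling factor abs(zhat) is taken to be 1 and dropped, the "data" are synthetic values generated from the model itself rather than FEM results, and the starting values are arbitrary. One thing worth noting about the model as written in these answers: scaling cm**2, rr and lr by the same factor leaves both curves unchanged, so the data only determine the ratios lr / rr and cm**2 / rr, and the sketch prints those instead of the individual parameters.

```python
import numpy as np
from scipy.optimize import least_squares

def f_model(omega, cm, rr, lr):
    # Both curves share cm, rr and lr; abs(zhat) is assumed to be 1 here.
    ll = lr * omega
    scale = cm**2 / (rr**2 + ll**2)
    fpara = scale * ll * omega
    fperp = scale * rr * omega
    return np.concatenate((fpara, fperp))

def residuals(params, omega, data):
    # data must be ordered like the model output: all fpara, then all fperp.
    return f_model(omega, *params) - data

# Synthetic stand-in for the FEM data (assumed values, small seeded noise).
rng = np.random.default_rng(0)
omega = np.linspace(0.1, 10.0, 99)
true_params = (1.2, 0.45, 0.3)  # cm, rr, lr
data = f_model(omega, *true_params) + rng.normal(scale=0.01, size=2 * omega.size)

res = least_squares(residuals, x0=(1.0, 1.0, 1.0), args=(omega, data))
cm_fit, rr_fit, lr_fit = res.x

# Only these two combinations are fixed by the data.
print(lr_fit / rr_fit, cm_fit**2 / rr_fit)
```

If the individual values of cm, rr and lr are needed, one of them would have to be fixed from other information before fitting.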

Related

Calculating the summation parameters separately

I am trying to use curve_fit for a function of the form below:
Z = Rth * (1 - np.exp(-x / tau))
I want to calculate the first four values of the parameters Rth and tau. At the moment, it works fine if I write out the whole function like this:
Z = (a * (1- np.exp (- x / b))) + (c * (1- np.exp (- x / d)))+ (e * (1- np.exp (- x / f))) + (g * (1- np.exp (- x / h)))
But this is certainly not a nice way to do it, for example if I have a really long function with more than 4 exponential terms and I want to get all the parameters. How can I adjust it so that it returns a specified number of values of Rth and tau after curve fitting?
For example, if I want to get 16 parameters from an 8-term exponential function, I should not have to write out all 8 terms, only a general form that gives the desired output.
Thank you.
Using least_squares it is quite simple to get an arbitrary sum of functions.
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import least_squares
def partition( inList, n ):
    return zip( *[ iter( inList ) ] * n )
def f( x, a, b ):
    return a * ( 1 - np.exp( -b * x ) )
def multi_f( x, params ):
    if len( params) % 2:
        raise TypeError
    subparams = partition( params, 2 )
    out = np.zeros( len(x) )
    for p in subparams:
        out += f( x, *p )
    return out
def residuals( params, xdata, ydata ):
    return multi_f( xdata, params ) - ydata
xl = np.linspace( 0, 8, 150 )
yl = multi_f( xl, ( .21, 5, 0.5, 0.1,2.7, .01 ) )
res = least_squares( residuals, x0=( 1,.9, 1, 1, 1, 1.1 ), args=( xl, yl ) )
print( res.x )
yth = multi_f( xl, res.x )
fig = plt.figure()
ax = fig.add_subplot( 1, 1, 1 )
ax.plot( xl, yl )
ax.plot( xl, yth )
plt.show( )
I managed to solve it in the following way; maybe not the smartest way, but it works for me.
def func(x,*args):
    Z=0
    for i in range(0,len(args)//2):
        Z += (args[i*2] * (1- np.exp (- x / args[2*i+1])))
    return Z
Then, by calling it from a separate function with named parameters, I can adjust the number of parameters.
def func2(x,a,b,c,d,e,f,g,h):
    return func(x,a,b,c,d,e,f,g,h)
popt , pcov = curve_fit(func2,x,y, method = 'trf', maxfev = 100000)
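The named-parameter wrapper can probably be avoided: as far as I know, curve_fit only needs named parameters to count them, and takes the count from p0 when an initial guess is supplied. The sketch below assumes that behaviour; the data are synthetic, and the parameter values and starting guess are made up for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(x, *args):
    # args is read as (Rth1, tau1, Rth2, tau2, ...).
    z = np.zeros_like(x, dtype=float)
    for rth, tau in zip(args[0::2], args[1::2]):
        z += rth * (1.0 - np.exp(-x / tau))
    return z

# Synthetic two-term data (assumed values, no noise).
x = np.linspace(0.0, 20.0, 200)
y = func(x, 1.0, 0.5, 2.0, 5.0)

# The length of p0 decides how many terms are fitted: 4 values -> 2 terms.
p0 = (0.8, 0.3, 1.5, 3.0)
popt, pcov = curve_fit(func, x, y, p0=p0)
print(popt)
```

Changing the number of exponential terms then only means changing the length of p0. Fits with many exponential terms tend to be sensitive to the starting values, so a reasonable p0 matters.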

Partial integral in Python

I want to use the Riemann method to numerically evaluate a partial integral in Python. I would like to integrate with respect to x and obtain a function of t, but I don't know how to do this.
My function: f(x) = cos(2*pi*x*t); its integral over [-1/2, 1/2]: f(t) = sin(pi*t)/(pi*t)
def riemann(a, b, dx):
    if a > b:
        a,b = b,a
    n = int((b - a) / dx)
    s = 0.0
    x = a
    for i in range(n):
        f_i[k] = np.cos(2*np.pi*x)
        s += f_i[k]
        x += dx
    f_i = s * dx
    return f_i,t
There's nothing too horrible about your approach. The result does come out close to the true value:
import numpy as np
def riemann(a, b, dx):
    if a > b:
        a, b = b, a
    n = int((b - a) / dx)
    s = 0.0
    x = a
    for i in range(n):
        s += np.cos(2 * np.pi * x)
        x += dx
    return s * dx
print(riemann(0.0, 0.25, 1.0e-3))
print(1 / (2 * np.pi))
0.15965441949277526
0.15915494309189535
Some remarks:
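Strictly speaking, this is a left Riemann sum: the function is sampled at a, a + dx, ..., b - dx. Sampling at the interval midpoints a + (i + 1/2) * dx instead gives the midpoint method of numerical integration, which is more accurate for the same dx.
Pay a little more attention to the boundaries of your domain. Because n = int((b - a) / dx) truncates, the summed intervals can stop short of b when (b - a) / dx is not an exact integer.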
If you're looking for speed, best collect all your x values (perhaps with linspace), evaluate the function once with all the points, and then np.sum the values up. (Loops in Python are slow.)
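To get a function of t, as the question asks, the integrand can be evaluated on a grid of (t, x) pairs and summed along x only. The following is a minimal sketch using the midpoint rule; the function name integral_over_x, the number of sub-intervals and the chosen t values are assumptions for illustration. The result is compared with the closed form sin(pi*t)/(pi*t), which NumPy provides as np.sinc.

```python
import numpy as np

def integral_over_x(t, a=-0.5, b=0.5, n=1000):
    # Midpoint rule in x for each value of t.
    t = np.atleast_1d(np.asarray(t, dtype=float))
    dx = (b - a) / n
    x = a + (np.arange(n) + 0.5) * dx          # midpoints of the n sub-intervals
    values = np.cos(2 * np.pi * np.outer(t, x))  # rows: t, columns: x
    return values.sum(axis=1) * dx

t = np.linspace(-3.0, 3.0, 13)
numeric = integral_over_x(t)
exact = np.sinc(t)  # sin(pi*t) / (pi*t), with the value 1 at t = 0

print(np.max(np.abs(numeric - exact)))
```

Building x from np.arange avoids accumulating rounding errors in x += dx and guarantees that the sub-intervals cover exactly [a, b].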

Solving a boundary value problem DE in python

I am trying to solve the following set of differential equations:
x' = cos(a)
y' = sin(a)
F' = - b * x * cos(a) + sin(a)
a' = (b * x * sin(a) + cos(a)) / F
with the conditions:
x(0) = y(0) = x(1) = 0
y(1) = 0.6
F(0) = 0.38
a(0) = -0.5
I tried following a similar problem, but I just can't get it to work. Is it possible that my F(0) and a(0) are completely off? I am not even sure about them.
import numpy as np
from scipy.integrate import solve_bvp
import matplotlib.pyplot as plt
beta = 5
def fun(x, y):
    x, dx, y, dy, F, dF, a, da, = y
    dxds=np.cos(a)
    dyds=np.sin(a)
    dFds=-beta * x * np.cos(a) + np.sin(a)
    dads=(beta * x * np.sin(a) + np.cos(a) ) / F
    return dx, dxds, dy, dyds, dF, dFds, da, dads
def bc(ya, yb):
    return ya[0], yb[0], ya[2], yb[2] + 0.6, ya[4] + 1, yb[4] + 1, ya[6], yb[6]
x = np.linspace(0, 0.5, 10)
y = np.zeros((8, x.size))
y[4] = 0.38
y[6] = 2.5
res = solve_bvp(fun, bc, x, y)
print(res.message)
x_plot = np.linspace(0, 0.5, 200)
plt.plot(x_plot, res.sol(x_plot)[0])
I think that you have foremost a physics problem, translating the physical situation into an ODE system.
x(s) and y(s) are the coordinates of the rope where s is the length along the rope. Consequently, (x'(s),y'(s)) is a unit vector that is uniquely characterized by its angle a(s), giving
x'(s) = cos(a(s))
y'(s) = sin(a(s))
To get the shape, one now has to consider the mechanics. The assumption seems to be that the rope rotates without spiraling around the rotation axis, staying in one plane. Additionally, from the equilibrium of forces you also get that the other two equations are indeed first order, not second order equations. So your state only has 4 components and the ODE system function thus has to be
def fun(s, u):
    x, y, F, a = u
    dxds=np.cos(a)
    dyds=np.sin(a)
    dFds=-beta * x * np.cos(a) + np.sin(a)
    dads=(beta * x * np.sin(a) + np.cos(a) ) / F
    return dxds, dyds, dFds, dads
Now there are only 4 boundary condition slots available, which are the coordinates of the start and end of the rope.
def bc(ua, ub):
    return ua[0], ub[0], ua[1], ub[1] - 0.6
Additionally, the interval length for s is also the rope length, so a value of 0.5 is impossible for the given coordinates on the pole; try 1.0. Some experimentation is needed to get an initial guess that does not lead to a singular Jacobian in the BVP solver. In the end I get the solution in the x-y plane, along with plots of the individual components.
[Figures not reproduced: solution curve in the x-y plane; plots of the components.]
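For reference, the pieces above can be assembled into a complete call as in the sketch below. The initial guess (a sine-shaped bulge in x, a straight line in y, a constant F of 1 and the matching angle) is purely an assumption made here for illustration, not the guess used for the plots in the answer, and convergence with it is not guaranteed. The status and message of the result should be checked before using the solution.

```python
import numpy as np
from scipy.integrate import solve_bvp

beta = 5.0  # same value as in the question

def fun(s, u):
    # State u = (x, y, F, a); each row holds one component on the mesh s.
    x, y, F, a = u
    return np.vstack((
        np.cos(a),
        np.sin(a),
        -beta * x * np.cos(a) + np.sin(a),
        (beta * x * np.sin(a) + np.cos(a)) / F,
    ))

def bc(ua, ub):
    # x(0) = 0, x(1) = 0, y(0) = 0, y(1) = 0.6
    return np.array([ua[0], ub[0], ua[1], ub[1] - 0.6])

# Mesh over the rope length (1.0, as suggested in the answer).
s = np.linspace(0.0, 1.0, 21)

# Assumed initial guess: a curve that already satisfies the boundary conditions.
u0 = np.zeros((4, s.size))
u0[0] = 0.3 * np.sin(np.pi * s)                            # x bulges sideways
u0[1] = 0.6 * s                                            # y rises linearly
u0[2] = 1.0                                                # F kept away from zero
u0[3] = np.arctan2(0.6, 0.3 * np.pi * np.cos(np.pi * s))   # angle of the guessed curve

res = solve_bvp(fun, bc, s, u0)
print(res.status, res.message)
```

A status of 0 means the solver reports convergence; a status of 2 corresponds to the singular Jacobian mentioned above, in which case a different guess for F or a is worth trying.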

Is there a faster way of repeating a chunk of code x times and taking an average?

Starting with:
a,b=np.ogrid[0:n+1:1,0:n+1:1]
B=np.exp(1j*(np.pi/3)*np.abs(a-b))
B[z,b] = np.exp(1j * (np.pi/3) * np.abs(z - b +x))
B[a,z] = np.exp(1j * (np.pi/3) * np.abs(a - z +x))
B[diag,diag]=1-1j/np.sqrt(3)
This produces an (n+1) x (n+1) grid that acts as a matrix.
n is just a number chosen to set the index range, i.e. the matrix indices a and b both run from 0 to n.
z is a constant I choose; the row and column at index z are replaced using the B[z,b] and B[a,z] formulas (essentially the same formula, but with a small number x added inside np.abs).
The diagonal of the matrix is set by the last line:
B[diag,diag]=1-1j/np.sqrt(3)
where,
diag=np.arange(n+1)
I would like to repeat this code 50 times, where the only thing that changes is x, so that I end up with 50 versions of the matrix B. x is a randomly generated number between -0.8 and 0.8 each time.
x=np.random.uniform(-0.8,0.8)
I want to generate 50 versions of B with a random value of x each time and take a geometric average of the 50 versions of B using the definition:
def geo_mean(y):
    y = np.asarray(y)
    return np.prod(y ** (1.0 / y.shape[0]), axis=-1)
I have tried to make B a function of some index and then use a for _ in range(): loop, but this doesn't work. Aside from copying and pasting the block 50 times and naming the copies B1, B2, B3 etc., I can't think of another way of working this out.
EDIT:
I'm now using part of a given solution in order to show clearly what I am looking for:
#A matrix with 50 random values between -0.8 and 0.8 to be used in the loop
X=np.random.uniform(-0.8,0.8, (50,1))
#constructing the base array before modification by random x values in position z
a,b = np.ogrid[0:n+1:1,0:n+1:1]
B = np.exp(1j * ( np.pi / 3) * np.abs( a - b ))
B[diag,diag] = 1 - 1j / np.sqrt(3)
#list to store all modified arrays
randomarrays = []
for i in range( 0,50 ):
    #copy array and modify it
    Bnew = np.copy( B )
    Bnew[z, b] = np.exp( 1j * ( np.pi / 3 ) * np.abs(z - b + X[i]))
    Bnew[a, z] = np.exp( 1j * ( np.pi / 3 ) * np.abs(a - z + X[i]))
    randomarrays.append(Bnew)
Bstack = np.dstack(randomarrays)
#calculate the geometric mean value along the axis that was the row in 2D arrays
B0 = geo_mean(Bstack)
In this example, every iteration of i seems to use the same value of X; I can't find a way to get each pass of the loop to use the next value in the matrix X. I know the ++ operator does not exist in Python, but I don't know how to use the Python equivalent. I want one pass of the loop to use one value of X, the next pass to use the next value, and so on, so that I can dstack all the matrices at the end and find the geo_mean for each element in the stacked matrices.
One pedestrian way would be to use a list comprehension or generator expression:
>>> def f(n, z, x):
...     diag = np.arange(n+1)
...     a,b=np.ogrid[0:n+1:1,0:n+1:1]
...     B=np.exp(1j*(np.pi/3)*np.abs(a-b))
...     B[z,b] = np.exp(1j * (np.pi/3) * np.abs(z - b +x))
...     B[a,z] = np.exp(1j * (np.pi/3) * np.abs(a - z +x))
...     B[diag,diag]=1-1j/np.sqrt(3)
...     return B
...
>>> X = np.random.uniform(-0.8, 0.8, (10,))
>>> np.prod((*map(np.power, map(f, 10*(4,), 10*(2,), X), 10 * (1/10,)),), axis=0)
But in your concrete example we can do much better than that:
using the identity exp(a) x exp(b) = exp(a + b), we can convert the geometric mean after exponentiation into an arithmetic mean before exponentiation. A bit of care is required because the complex n-th root that occurs in the geometric mean is multivalued. In the code below we normalize the angles to the range -pi, pi so as to always hit the same branch as the n-th root.
Please also note that the geo_mean function you provide is definitely wrong. It fails the basic sanity check that the average of copies of the same thing should return that same thing. I've provided a better version. It is still not perfect, but I think there actually is no perfect solution, because the complex root is not unique.
Because of this I recommend taking the average before exponentiating. As long as your random spread is less than pi, this allows a well-defined averaging procedure with an average that is actually close to the samples.
import numpy as np
def f(n, z, X, do_it_pps_way=True):
    X = np.asanyarray(X)
    diag = np.arange(n+1)
    a,b=np.ogrid[0:n+1:1,0:n+1:1]
    B=np.exp(1j*(np.pi/3)*np.abs(a-b))
    X = X.reshape(-1,1,1)
    if do_it_pps_way:
        zbx = np.mean(np.abs(z-b+X), axis=0)
        azx = np.mean(np.abs(a-z+X), axis=0)
    else:
        zbx = np.mean((np.abs(z-b+X)+3) % 6 - 3, axis=0)
        azx = np.mean((np.abs(a-z+X)+3) % 6 - 3, axis=0)
    B[z,b] = np.exp(1j * (np.pi/3) * zbx)
    B[a,z] = np.exp(1j * (np.pi/3) * azx)
    B[diag,diag]=1-1j/np.sqrt(3)
    return B
def geo_mean(y):
    y = np.asarray(y)
    dim = len(y.shape)
    y = np.atleast_2d(y)
    v = np.prod(y, axis=0) ** (1.0 / y.shape[0])
    return v[0] if dim == 1 else v
def geo_mean_correct(y):
    y = np.asarray(y)
    return np.prod(y ** (1.0 / y.shape[0]), axis=0)
# demo that orig geo_mean is wrong
B = np.exp(1j * np.random.random((5, 5)))
# the mean of four times the same thing should be the same thing:
if not np.allclose(B, geo_mean([B, B, B, B])):
    print('geo_mean failed')
if np.allclose(B, geo_mean_correct([B, B, B, B])):
    print('but geo_mean_correct works')
n, z, m = 10, 3, 50
X = np.random.uniform(-0.8, 0.8, (m,))
B0 = f(n, z, X, do_it_pps_way=False)
B1 = np.prod((*map(np.power, map(f, m*(n,), m*(z,), X), m * (1/m,)),), axis=0)
B2 = geo_mean_correct([f(n, z, x) for x in X])
# This is the recommended way:
B_recommended = f(n, z, X, do_it_pps_way=True)
print()
print(np.allclose(B1, B0))
print(np.allclose(B2, B1))
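The branch issue described in this answer can be seen on a tiny example. The sketch below uses four identical unit-modulus samples with an assumed angle of 2 radians; the two function names are chosen here for illustration. Taking the root of each sample first agrees with averaging the angles, while multiplying first pushes the total angle past pi, so the principal root lands on a different branch.

```python
import numpy as np

def geo_mean_root_first(y):
    # Principal n-th root of every sample, then the product over samples.
    y = np.asarray(y)
    return np.prod(y ** (1.0 / y.shape[0]), axis=0)

def geo_mean_product_first(y):
    # Product over samples first, then a single principal n-th root.
    y = np.asarray(y)
    return np.prod(y, axis=0) ** (1.0 / y.shape[0])

theta = np.array([2.0, 2.0, 2.0, 2.0])    # assumed angles, each within (-pi, pi]
samples = np.exp(1j * theta)
via_angles = np.exp(1j * theta.mean())    # arithmetic mean taken before exponentiation

print(np.allclose(geo_mean_root_first(samples), via_angles))
print(np.allclose(geo_mean_product_first(samples), via_angles))
```

The first check passes and the second fails: the product has angle 8 radians, which wraps to about 1.72, and a quarter of that is not 2.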
I think you should rely more on numpy functionality, when approaching your problem. Not a numpy expert myself, so there is surely room for improvement:
import numpy as np
from scipy.stats import gmean
n = 2
z = 1
a = np.arange(n + 1).reshape(1, n + 1)
#constructing the base array before modification by random x values in position z
B = np.exp(1j * (np.pi / 3) * np.abs(a - a.T))
B[a, a] = 1 - 1j / np.sqrt(3)
#list to store all modified arrays
random_arrays = []
for _ in range(50):
    #generate random x value
    x=np.random.uniform(-0.8, 0.8)
    #copy array and modify it
    B_new = np.copy(B)
    B_new[z, a] = np.exp(1j * (np.pi / 3) * np.abs(z - a + x))
    B_new[a, z] = np.exp(1j * (np.pi / 3) * np.abs(a - z + x))
    random_arrays.append(B_new)
#store all B arrays as a 3D array
B_stack = np.stack(random_arrays)
#calculate the geometric mean value along the axis that was the row in 2D arrays
geom_mean_for_rows = gmean(B_stack, axis = 2)
It uses the geometric mean function from scipy.stats module to have a vectorised approach for this calculation.

Non-linear least square minimization of 2 variables (different dimension) in python

I have a function of two variables k and T.
I have the value of the function for a number of (k,T) pairs. However, I do not have the same number of values for each variable. For example, I know the values f of the function at 2 values of T and 3 values of k:
F(k1,T1) = f1
F(k1,T2) = f2
F(k2,T1) = f3
F(k2,T2) = f4
F(k3,T1) = f5
F(k3,T2) = f6
I also know the form of the function F:
def func(X, a, b, c, omega):
    T,k = X # The two variables
    n = 1.0 / ( np.exp(omega / T ) - 1.0 )
    return a * k * n + b * k**2 * (n + 1.0)
I would like to find the values of a, b, c and omega that minimize the error.
I tried with curve_fit:
k = [k1,k2,k3]
T = [T1,T2]
F[k1,T1] = f1
F[k1,T2] = f2
F[k2,T1] = f3
F[k2,T2] = f4
F[k3,T1] = f5
F[k3,T2] = f6
popt, pcov = curve_fit(func, (T,k), F )
However I get the following error (in my practical case I have 19 k values and 4 T values):
return a * k * n + b * k**2 * (n + 1.0)
ValueError: operands could not be broadcast together with shapes (19,) (4,)
Now if I create an array of higher dimension:
X = np.zeros((4,19,2))
for ii in np.arange(19):
    X[0,ii,:] = np.array([T[0],k[ii]])
    X[1,ii,:] = np.array([T[1],k[ii]])
    X[2,ii,:] = np.array([T[2],k[ii]])
    X[3,ii,:] = np.array([T[3],k[ii]])
and pass that:
def func(X, a, b, c, omega):
    T = X[:,:,0]
    k = X[:,:,1]
    n = 1.0 / ( np.exp(omega / T ) - 1.0 )
    return a * k * n + b * k**2 * (n + 1.0)
popt, pcov = curve_fit(func, X, F )
then I get the following issue:
minpack.error: Result from function call is not a proper array of floats.
Thank you in advance.
You need an array of paired input values X (your original dataset probably already looks like that) and the corresponding output array F:
X = np.array([[k1,T1],[k1,T2],[k2,T1],[k2,T2],[k3,T1],[k3,T2]])
F = [f1,f2,f3,f4,f5,f6]
Then curve_fit can be called directly. Since func unpacks its first argument as T, k = X, pass the T column first:
popt, pcov = curve_fit(func, (X[:,1],X[:,0]),F)
Alternatively, you can use separate arrays for k and T in place of X[:,0] and X[:,1], but note that they must have the same length, since each element holds the k or T value of one observation/experiment. In other words, the index into the k or T array identifies the corresponding observation.
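When the measurements sit on a full grid of T and k values, as in the question, the paired arrays can be built with np.meshgrid and flattened. The sketch below is an illustration under assumptions: the grid values, the "true" parameters and the starting guess are made up, the data are generated from the model without noise, and the parameter c is dropped because the posted formula never uses it.

```python
import numpy as np
from scipy.optimize import curve_fit

def func(X, a, b, omega):
    T, k = X  # two 1-D arrays of equal length, one entry per observation
    n = 1.0 / (np.exp(omega / T) - 1.0)
    return a * k * n + b * k**2 * (n + 1.0)

# Assumed grid: 4 temperatures and 19 k values, as in the question.
T_values = np.array([100.0, 200.0, 300.0, 400.0])
k_values = np.linspace(0.1, 1.0, 19)

# Expand to one (T, k) pair per observation and flatten to 1-D.
T_grid, k_grid = np.meshgrid(T_values, k_values, indexing="ij")
T_flat, k_flat = T_grid.ravel(), k_grid.ravel()

# Synthetic observations from assumed parameters (a, b, omega).
true_params = (2.0, 0.5, 150.0)
F_flat = func((T_flat, k_flat), *true_params)

popt, pcov = curve_fit(func, (T_flat, k_flat), F_flat, p0=(1.0, 1.0, 120.0))
print(popt)
```

With real data stored as a 2-D array F[i, j] for T_values[i] and k_values[j], F.ravel() gives the matching 1-D ordering for indexing="ij".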
