Shift arrays over non-uniform grids in Python


I would like to know whether NumPy or SciPy offers a function for shifting arrays that are defined on a non-uniform grid. I have written a minimal example to illustrate the procedure, but it does not seem to work:
import numpy as np
import matplotlib.pyplot as pyt
def roll_arrays( a, shift_values,x_grid ):
    #from scipy.interpolate import interp1d
    x_max = np.amax(x_grid)
    total_items = a.shape[0]
    the_ddtype = a.dtype
    result = np.zeros( (a.shape[0], a.shape[1] ), dtype=the_ddtype )
    for k in range( total_items ):
        edge_val_left = a[k,0]
        edge_val_right = a[k,-1]
        #extend grid to edges with boundary values (flat extrapolation)
        extended_boundary = np.abs( shift_values[k] )#positive or negative depending on shift
        if( shift_values[k] != 0.0 ):
            x0_right = np.linspace( x_max +1e-3, x_max + 1e-3 + extended_boundary, 10 )
            x0_left = np.linspace( -x_max - 1e-3 -extended_boundary, -x_max - 1e-3, 10 )
            if( shift_values[k]>0.0 ):
                #we fill left values
                x_dense_grid = np.concatenate( ( x0_left, x_grid + shift_values[k] ) )
                ynew = np.concatenate( ( edge_val_left*np.ones( 10 ), a[k,:] ) )
            elif( shift_values[k]<0.0 ):
                x_dense_grid = np.concatenate( ( x_grid + shift_values[k], x0_right ) )
                ynew = np.concatenate( ( a[k,:], edge_val_right*np.ones( 10 ) ) )
            ###
            #return on the original grid
            f_interp = np.interp( x_grid, x_dense_grid, ynew )
            result[k,:] = f_interp
        else:
            #no shift
            result[k,:] = a[k,:]
    return result
x_geom = np.array( [ 100*( 1.5**(-0.5*k) ) for k in range(1000)] )
x_geom_neg =-( x_geom )
x_geom = np.concatenate( (np.array([0.0]), np.flip(x_geom)) )
x_geom = np.concatenate( (x_geom_neg, x_geom) )
shifts = np.array([-1.0,-2.0,1.0])
f = np.array( [ k**2/( x_geom**2 + k**4 ) for k in range(1,shifts.shape[0]+1) ] )
fs = roll_arrays( f, shifts, x_geom)
pyt.plot( x_geom, f[0,:], marker='.' )
pyt.plot( x_geom, fs[0,:], marker='.' )
print("done")
Note that the points of x_grid are, in this case, logarithmically spaced. Is there a way to do this with SciPy/NumPy, through interpolation or similar?
EDIT: I noticed that removing the if/elif/else statements that extend the boundaries (where the flat extrapolation was done) seems to solve the issue. I still think this is too naive an implementation for something that should already exist in Python, so the question stands.

If I understand the question correctly, np.interp already does what you want (by default it holds the edge values constant outside the grid):
def roll_arrays(a, shift_values, x_grid):
    total_items = a.shape[0]
    result = np.zeros_like(a)
    for k in range(total_items):
        if shift_values[k] != 0.0:
            # shift the x values
            x_grid_shifted = x_grid + shift_values[k]
            # interpolate back to the original grid
            f_interp = np.interp(x_grid, x_grid_shifted, a[k, :])
            result[k, :] = f_interp
        else:
            # no shift
            result[k, :] = a[k, :]
    return result
For the example input from the question, this will give something very close to
fs_expected = np.array([k ** 2 / ((x_geom - shift) ** 2 + k ** 4) for k, shift in enumerate(shifts, start=1)])
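As a self-contained check of this idea, the following sketch shifts a single curve on a small geometric grid and compares the result with the analytically shifted function. The name shift_on_grid, the grid size and the test function are assumptions chosen for illustration; they are not part of the question or the answer.

```python
import numpy as np

def shift_on_grid(values, shift, x_grid):
    """Shift samples by `shift` along x and resample them on x_grid.

    x_grid must be strictly increasing. Outside the shifted grid, np.interp
    holds the first/last sample constant (flat extrapolation).
    """
    return np.interp(x_grid, x_grid + shift, values)

# Geometric (non-uniform) grid, symmetric around zero; the size is illustrative.
pos = 100.0 * 1.5 ** (-0.5 * np.arange(60))
x_grid = np.concatenate((-pos, [0.0], pos[::-1]))  # strictly increasing

def lorentz(x):
    return 1.0 / (1.0 + x ** 2)

shift = 1.0
shifted = shift_on_grid(lorentz(x_grid), shift, x_grid)
expected = lorentz(x_grid - shift)
max_err = np.max(np.abs(shifted - expected))
print(f"largest deviation from the exact shift: {max_err:.2e}")
```

The remaining deviation comes from linear interpolation between grid points, so it should shrink where the grid is dense relative to the curvature of the function.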

Related

fitting data to fourier3 series always produce a straight line

I have data to which I want to fit a Fourier3 series. I looked at a related answer and tried different algorithms from different packages (such as symfit and scipy). But when I plot the result, every package gives me a straight line (plot not shown).
Currently, I'm using the curve_fit function from scipy, and here is my code:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import pandas as pd
def fourier(x, *as_bs):
    sum_a = 0
    sum_b = 0
    j = 1
    w = as_bs[0]
    a0 = as_bs[1]
    for i in range(2, len(as_bs)-1, 2):
        sum_a += as_bs[i] * np.cos(j * w * x)
        sum_b += as_bs[i+1] * np.sin(j * w * x)
        j = j + 1
    return a0 + sum_a + sum_b
T = pd.read_excel('FS_data.xlsx')
A = pd.DataFrame(T)
xdata = np.array(A.iloc[:, 0])
ydata = np.array(A.iloc[:, 1])
# fits
popt, pcov = curve_fit(fourier, xdata, ydata, [np.random.rand(1)] * 8)
print(popt)
data_fit = fourier(ydata, *popt)
print(data_fit)
plt.plot(ydata)
plt.plot(data_fit, label='after fitting')
plt.legend()
plt.show()
So my code basically draws 8 random numbers and uses them as initial guesses for (w, a0, a1, b1, a2, b2, a3, b3), respectively.
I also fitted the data in Matlab to check whether they can be fitted with fourier3, and the results there are great (plot not shown).
I printed the output in both Python and Matlab to compare; here are the results:
Python:
w = 5.66709943e-01
a0 = 3.80499132e+01
a1 = 5.56883486e-04
b1 = -3.88408379e-04
a2 = -3.88408379e-04
b2 = 3.32951592e-04
a3 = 3.15641900e-04
b3 = 1.96414168e-04
Matlab:
a0 = 38.07 (38.07, 38.08)
a1 = 0.5352 (0.4951, 0.5753)
b1 = -0.5788 (-0.5863, -0.5714)
a2 = -0.3728 (-0.413, -0.3326)
b2 = 0.5411 (0.492, 0.5901)
a3 = 0.2357 (0.2226, 0.2488)
b3 = 0.05895 (0.02773, 0.09018)
w = 0.0003088
As noted, only the value of a0 is correct; the others are very far from Matlab's.
Why am I getting this result in Python? What am I doing wrong?
Here is the data for those who like to test it out:
https://docs.google.com/spreadsheets/d/18lL1iMZ3kdaqUUtRDLNRK4A3uCPzOrXt/edit?usp=sharing&ouid=112684448221465330517&rtpof=true&sd=true

I am not familiar with Matlab, so I don't know what additional work its fit does to estimate starting values for a non-linear fit. I can say, though, that curve_fit does none at all, i.e. all values are assumed to be of the order of 1. Hence the OP's problem is, once again, wrong starting values. The easiest fix would have been to rescale the x axis to the range [0, 2 pi]. Rescaling requires, however, knowing that the main wave to be fitted has approximately the width of the data set. Moreover, we need to assume that all other fit parameters are also of the order of 1. Luckily, this is the case, so this would have worked:
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
xdat, ydat = np.loadtxt( "data.tsv", unpack=True, skiprows=1 )
def fourier(x, *as_bs):
    sum_a = 0
    sum_b = 0
    j = 1
    w = as_bs[0]
    a0 = as_bs[1]
    for i in range(2, len( as_bs ) - 1, 2 ):
        sum_a += as_bs[i] * np.cos( j * w * x )
        sum_b += as_bs[i+1] * np.sin( j * w * x )
        j = j + 1
    return a0 + sum_a + sum_b
"""
lets rescale the data to get the base frequency in the range of one
"""
xmin = min( xdat )
xmax = max( xdat )
xdat = ( xdat - xmin ) / (xmax - xmin ) * 2 * np.pi
popt, pcov = curve_fit(
    fourier,
    xdat, ydat,
    p0 = np.ones(8)
)
### here I assume that higher order are similar to lower orders
### but slightly smaller. ... hoping that the fit corrects errors in
### this assumption
print(popt)
### scale back w noting that it scales inverse to x
print( popt[0] * 2 * np.pi / (xmax - xmin ) )
data_fit = fourier( xdat, *popt )
If we cannot make the assumptions above, we may only assume that there is a base frequency with a dominant contribution to the signal (note that this is not always true). In this case we can pre-calculate starting guesses in a non-iterative way.
The solution looks a bit more complicated:
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
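# cumtrapz was removed in recent SciPy releases; cumulative_trapezoid is its replacement
from scipy.integrate import cumulative_trapezoid as cumtrapz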
xdat, ydat = np.loadtxt( "data.tsv", unpack=True, skiprows=1 )
def fourier(x, *as_bs):
    sum_a = 0
    sum_b = 0
    j = 1
    w = as_bs[0]
    a0 = as_bs[1]
    for i in range(2, len( as_bs ) - 1, 2 ):
        sum_a += as_bs[i] * np.cos( j * w * x )
        sum_b += as_bs[i+1] * np.sin( j * w * x )
        j = j + 1
    return a0 + sum_a + sum_b
#### initial guess
"""
This uses the fact that if y = a sin w t + b cos w t + c we have
int int y = -y/w^2 + c/2 t^2 + d t + e
i.e. we can get 1/w^2 as linear fit parameter without the danger of
a non-linear fit iterative process running into a local minimum
for details see:
https://scikit-guess.readthedocs.io/en/sine/_downloads/4b4ed1e691ff195be3ca73879a674234/Regressions-et-equations-integrales.pdf
"""
Sy = cumtrapz( ydat, xdat, initial=0 )
SSy = cumtrapz( Sy, xdat, initial=0 )
ST = np.array( [
ydat, xdat**2, xdat, np.ones( len( xdat ) )
] )
S = np.transpose( ST )
eta = np.dot( ST, SSy )
A = np.dot( ST, S )
sol = np.linalg.solve( A, eta )
wFit = np.sqrt( -1 / sol[0] )
### linear parameters
"""
Once we have a good guess for w we can get starting guesses for
a, b and c from a standard linear fit
"""
ST = np.array( [
np.sin( wFit * xdat ), np.cos( wFit * xdat ), np.ones( len( xdat ) )
])
S = np.transpose( ST )
eta = np.dot( ST, ydat )
A = np.dot( ST, S )
sol = np.linalg.solve( A, eta )
a1 = sol[0]
b1 = sol[1]
a0 = sol[2]
### final non-linear fit
"""
Now we can use the guesses from above as input for the final
non-linear fit. Hopefully, we are now close enough to the global minimum
and have the algorithm converge reasonably
"""
popt, pcov = curve_fit(
    fourier,
    xdat, ydat,
    p0=[
        wFit, a0, a1, b1,
        a1 / 2, b1 / 2,
        a1 / 4, b1 / 4
    ]
)
### here I assume that higher order are similar to lower orders
### but slightly smaller. ... hoping that the fit corrects errors in
### this assumption
print(popt)
data_fit = fourier( xdat, *popt )
plt.plot( xdat, ydat, ls="", marker="o", ms=0.5, label="data" )
plt.plot( xdat, data_fit, label='fitting')
plt.legend()
plt.show()
Both provide basically the same solution, with the latter code being applicable to more cases and requiring fewer assumptions.
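To isolate the non-iterative frequency guess, here is a minimal sketch on synthetic, noise-free data. It uses the identity int int y = -y/w^2 + c/2 t^2 + d t + e quoted in the code comments above. The helper names cumulative_trapz and guess_frequency and the test signal are assumptions made for this illustration, and the cumulative integral is written with plain NumPy so that it does not depend on a particular SciPy version.

```python
import numpy as np

def cumulative_trapz(y, t):
    """Cumulative trapezoidal integral of y(t), starting at 0."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(t)
    return np.concatenate(([0.0], np.cumsum(steps)))

def guess_frequency(t, y):
    """Estimate w for y ~ a sin(w t) + b cos(w t) + c by a linear fit.

    The double integral of y equals -y / w**2 + (c / 2) t**2 + d t + e,
    so the coefficient of y in a linear regression is -1 / w**2.
    """
    ssy = cumulative_trapz(cumulative_trapz(y, t), t)
    design = np.column_stack((y, t ** 2, t, np.ones_like(t)))
    coeffs, *_ = np.linalg.lstsq(design, ssy, rcond=None)
    return np.sqrt(-1.0 / coeffs[0])

# Synthetic test signal (values chosen for illustration only).
t = np.linspace(0.0, 20.0, 2001)
w_true = 1.3
y = 0.7 * np.sin(w_true * t) - 0.4 * np.cos(w_true * t) + 2.0
w_guess = guess_frequency(t, y)
print(w_guess)
```

With noisy data, or when higher harmonics carry much of the signal, the coefficient of y may come out with the wrong sign, so a real implementation would need to check it before taking the square root.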

how to extract center coordinates, height, width and phi of an ellipse from SymPy to plot a fitted ellipse?

I have been working with the lsq-ellipse package, where I get the parameters of the ellipse with the following code:
from ellipse import LsqEllipse
from matplotlib.patches import Ellipse
coords_D0 = np.array(coords_D0)
reg = LsqEllipse().fit(coords_D0)
center_D0, width_D0, height_D0, phi_D0 = reg.as_parameters()
print(f'center: {center_D0[0]:.3f}, {center_D0[1]:.3f}')
print(f'width: {width_D0:.3f}')
print(f'height: {height_D0:.3f}')
print(f'phi: {phi_D0:.3f}')
However, my coords_D0 variable consists of only three points, which caused the following error:
ValueError: Received too few samplesGot 3 features, 5 or more required.
After looking into other packages online, I found that sympy also provides an Ellipse, and I understand that you can extract the centre, vradius and hradius from it. But I would like to know how to get the width, height and phi from sympy, and whether they will be the same as those from the lsq-ellipse package, so that I can use them in matplotlib's Ellipse. I pass the values from lsq-ellipse to matplotlib to draw the ellipse with the following line:
Code:
ellipse_D0 = Ellipse(xy=center_D0, width=2*width_D0, height=2*height_D0, angle=np.rad2deg(phi_D0),edgecolor='b', fc='None', lw=2, label='Fit', zorder=2)
My coordinates are the following:
coords_D0 =
-1.98976 -1.91574
-0.0157721 2.5438
2.00553 -0.628061
# another points
coords_D1 =
-0.195518 0.0273673
-0.655686 -1.45848
-0.447061 -0.168108
# another points
coords_D2 =
-2.28529 0.91896
-2.43207 0.446211
-2.23044 0.200087
Side Question:
Is there a way to fit an ellipse to these coordinates (in general, 3 coordinates or more)?

Assuming that the question is about the Minimum Volume Enclosing Ellipse, I'd suggest the following solution.
#! /usr/bin/python3
# coding=utf-8
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse
import numpy as np
from mymodules3.mvee import mvee
coords= list()
coords.append( np.array([
-1.98976, -1.91574,
-0.0157721, 2.5438,
2.00553, -0.628061
]).reshape(3,-1) )
coords.append( np.array([
-0.195518, 0.0273673,
-0.655686, -1.45848,
-0.447061, -0.168108,
]).reshape(3,-1)
)
coords.append( np.array([
-2.28529, 0.91896,
-2.43207, 0.446211,
-2.23044, 0.200087
]).reshape(3,-1)
)
fig = plt.figure()
ax = fig.add_subplot( 1, 1, 1 )
for i, col in enumerate( ['k', 'g', 'm'] ):
    sol = mvee( coords[i] )
    e =Ellipse(
        sol[0],
        width=2 * sol[1][0],
        height=2 * sol[1][1],
        angle=sol[2] * 180/3.1415926,
        color=col, alpha=0.5,
        zorder = -1000 + i
    )
    ax.scatter( coords[i][:,0], coords[i][:,1], c=col, zorder=10 * i )
    ax.add_artist( e )
plt.show()
This plots each point set together with its enclosing ellipse (image not shown).
The mvee function is based on an SE answer to a similar question.
"""
NMI : 2021-11-11
Minimum Volume Enclosing Ellipsoids, see e.g.
NIMA MOSHTAGH : MINIMUM VOLUME ENCLOSING ELLIPSOIDS
or
Linus Källberg : Minimum_Enclosing_Balls_and_Ellipsoids (Thesis)
"""
from warnings import warn
from numpy import pi
from numpy import sqrt
from numpy import arccos
from numpy import dot, outer
from numpy import diag, transpose
from numpy import append
from numpy import asarray
from numpy import ones
from numpy import argmax
from numpy.linalg import inv
from numpy.linalg import norm
from numpy.linalg import eig
def mvee( data, tolerance=1e-4, maxcnt=1000 ):
    """
    param data: list of xy data points
    param tolerance: termination condition for iterative approximation
    param maxcnt: maximum number of iterations
    type data: iterable of float
    type tolerance: float
    return: (offset, semiaxis, angle)
    return type: ( (float, float), (float, float), float )
    """
    locdata = asarray( data )
    N = len( locdata )
    if not locdata.shape == ( N, 2):
        raise ValueError ( " data must be of shape( n, 2 )" )
    if tolerance >= 1 or tolerance <= 0:
        raise ValueError (" 0 < tolerance < 1 required")
    if not isinstance( maxcnt, int ):
        raise TypeError
    if not maxcnt > 0:
        raise ValueError
    count = 1
    err = 1
    d = 2
    d1 = d + 1
    u = ones( N ) / N
    P = transpose( locdata )
    Q = append( P, ones( N ) ).reshape( 3, -1 )
    while ( err > tolerance):
        X = dot( Q, dot( diag( u ), transpose( Q ) ) )
        M = diag(
            dot(
                transpose( Q ),
                dot(
                    inv( X ),
                    Q
                )
            )
        )
        maximum = max( M )
        j = argmax( M )
        step_size = ( maximum - d1 ) / ( d1 * ( maximum - 1 ) )
        new_u = ( 1 - step_size ) * u
        new_u[ j ] += step_size
        err = norm( new_u - u )
        count = count + 1
        u = new_u
        if count > maxcnt:
            warn(
                "Process did not converge in {} steps".format(
                    count - 1
                ),
                UserWarning
            )
            break
    U = diag( u )
    c = dot( P, u )
    A = inv(
        dot(
            P,
            dot( U, transpose( P ) )
        ) - outer( c, c )
    ) / d
    E, V = eig( A )
    phiopt = arccos( V[ 0, 0 ] )
    if V[ 0, 1 ] < 0:
        phiopt = 2 * pi - phiopt
    ### cw vs ccw and periodicity of pi
    phiopt = -phiopt % pi
    sol = ( c, sqrt( 1.0 / E ), phiopt)
    return sol
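The last step, turning the shape matrix of an ellipse into semi-axes and a rotation angle, can be looked at in isolation. The sketch below is one possible way to do it and is not the code used in the answer: it assumes a symmetric positive definite 2x2 matrix A describing x^T A x = 1, and it uses np.linalg.eigh (which returns sorted eigenvalues) rather than eig. The function name ellipse_parameters and the test values are made up for this illustration.

```python
import numpy as np

def ellipse_parameters(shape_matrix):
    """Semi-axes and rotation of the ellipse x^T A x = 1 (centre at origin).

    A must be a symmetric positive definite 2x2 matrix.
    Returns (semi_major, semi_minor, angle); angle is in radians, measured
    counter-clockwise from the x axis to the major axis, in [0, pi).
    """
    eigenvalues, eigenvectors = np.linalg.eigh(shape_matrix)  # ascending order
    semi_major = 1.0 / np.sqrt(eigenvalues[0])  # smallest eigenvalue -> longest axis
    semi_minor = 1.0 / np.sqrt(eigenvalues[1])
    vx, vy = eigenvectors[:, 0]                 # direction of the major axis
    angle = np.arctan2(vy, vx) % np.pi          # an axis direction is defined modulo pi
    return semi_major, semi_minor, angle

# Build a test matrix from known parameters (illustrative values).
theta = np.pi / 6
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
A = rot @ np.diag([1 / 3.0 ** 2, 1 / 1.0 ** 2]) @ rot.T
semi_major, semi_minor, angle = ellipse_parameters(A)
print(semi_major, semi_minor, np.degrees(angle))
```

Since matplotlib's Ellipse expects full axis lengths and an angle in degrees, these values would be passed as width=2*semi_major, height=2*semi_minor and angle=np.degrees(angle).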

Calculating the summation parameters separately

I am trying to use curve_fit for a function built from terms of the form below:
Z = Rth * (1 - np.exp(-x / tau))
I want to determine the first four values of the parameters Rth and tau. At the moment, it works fine if I write out the whole function like this:
Z = (a * (1- np.exp (- x / b))) + (c * (1- np.exp (- x / d)))+ (e * (1- np.exp (- x / f))) + (g * (1- np.exp (- x / h)))
But this is certainly not a nice way to do it, for example if I have a really long function with more than 4 exponential terms and want to get all the parameters. How can I adjust it so that it returns a specified number of values of Rth and tau after curve fitting?
For example, if I want to get 16 parameters from an 8-term exponential function, I should not have to write out all 8 terms, but only a general form that gives the desired output.
Thank you.

Using least_squares it is quite simple to get an arbitrary sum of functions.
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import least_squares
def partition( inList, n ):
    return zip( *[ iter( inList ) ] * n )
def f( x, a, b ):
    return a * ( 1 - np.exp( -b * x ) )
def multi_f( x, params ):
    if len( params) % 2:
        raise TypeError
    subparams = partition( params, 2 )
    out = np.zeros( len(x) )
    for p in subparams:
        out += f( x, *p )
    return out
def residuals( params, xdata, ydata ):
    return multi_f( xdata, params ) - ydata
xl = np.linspace( 0, 8, 150 )
yl = multi_f( xl, ( .21, 5, 0.5, 0.1,2.7, .01 ) )
res = least_squares( residuals, x0=( 1,.9, 1, 1, 1, 1.1 ), args=( xl, yl ) )
print( res.x )
yth = multi_f( xl, res.x )
fig = plt.figure()
ax = fig.add_subplot( 1, 1, 1 )
ax.plot( xl, yl )
ax.plot( xl, yth )
plt.show( )

I managed to solve it in the following way; maybe not the smartest way, but it works for me.
def func(x,*args):
    Z=0
    for i in range(0,round(len(args)/2)):
        Z += (args[i*2] * (1- np.exp (- x / args[2*i+1])))
    return Z
By calling it through a separate function with explicit parameters, I can adjust the number of parameters.
def func2(x,a,b,c,d,e,f,g,h):
    return func(x,a,b,c,d,e,f,g,h)
popt , pcov = curve_fit(func2,x,y, method = 'trf', maxfev = 100000)
This works fine for me.
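A wrapper with explicit parameter names is not strictly needed: when p0 is passed, curve_fit takes the number of parameters from its length, so a function with a *params signature can be fitted directly. The following is a minimal sketch on synthetic, noise-free data; the name multi_exp, the data and the starting values are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def multi_exp(x, *params):
    """Sum of Rth_i * (1 - exp(-x / tau_i)); params = (Rth_1, tau_1, Rth_2, tau_2, ...)."""
    rth = np.asarray(params[0::2])
    tau = np.asarray(params[1::2])
    # Broadcast x against all terms at once, then sum over the terms.
    return np.sum(rth * (1.0 - np.exp(-x[:, None] / tau)), axis=1)

# Synthetic data with two terms (illustrative values).
x = np.linspace(0.0, 20.0, 200)
true_params = (1.0, 0.5, 2.0, 5.0)
y = multi_exp(x, *true_params)

# The length of p0 tells curve_fit how many parameters to fit.
p0 = (0.8, 0.4, 2.2, 6.0)
popt, pcov = curve_fit(multi_exp, x, y, p0=p0)
print(popt)
```

Fitting eight terms would then only require a p0 with sixteen entries. Sums of exponentials are notoriously ill-conditioned, though, so reasonable starting values matter more as the number of terms grows.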

LMFIT: Constraining the output when using the polynomial model

I'm using LMFIT to fit piecewise polynomials to the first quadrant of a sine wave.
I would like to be able to add a constraint on the polynomial output, as opposed to on its parameters.
For example, I would like to ensure that the output is >= 0 and <= 1.0 (which of course only affects the first and last segments in the code below).
Another use case is forcing the polynomial to pass exactly through some specific (x, y) points.
I understand this might be better done with np.polyfit, but eventually I want to add more non-linear constraints, and the LMFIT framework is more flexible.
import numpy as np
from lmfit.models import LinearModel
#split sine wave in 4 segments with 1024 points
nseg = 4
frac = 2**10
npoints = nseg*frac
xfrac = np.linspace(0, 1, num=frac, endpoint=False)
x = np.linspace(0, 1, num=npoints, endpoint=False)
y = np.sin(x*np.pi/2)
yseg = np.reshape(y, (nseg, frac))
mod = LinearModel()
coeff = []
bestfit = []
for i in range(nseg):
    pars = mod.guess(yseg[i], x=xfrac)
    out = mod.fit(yseg[i], pars, x=xfrac)
    coeff.append([out.best_values['slope'], out.best_values['intercept']])
    bestfit.append(out.best_fit)
bestfit = np.reshape(bestfit, (1, npoints))[0]

It turns out this can be done by adding constraints on the parameters themselves that translate into the desired constraint on the model output.
Using a custom model for linear interpolation, it can be done as follows:
from lmfit import Model, Parameters
def func(x, c0, c1):
    return c0 + c1*x
pmodel = Model(func)
params = Parameters()
params.add('c0')
# clip = c0 + c1 is the model value at x = 1; it is bounded above by 1.0
params.add('clip', value=0, max=1.0, vary=True)
params.add('c1', expr='clip-c0')

One option might be using splines.
A quick and dirty approach, just to present the idea, might look like this:
import matplotlib.pyplot as plt
import numpy as np
## quick and dirty spline function
def l_spline(x, abc ):
    if isinstance( x, ( list, tuple, np.ndarray ) ):
        out = [ l_spline( elem, abc ) for elem in x]
    else:
        a, b, c = abc
        if x < a:
            f = lambda t: 0
        elif x < b:
            f = lambda t: ( t - a ) / ( b - a )
        elif x < c:
            f = lambda t: -( t - c ) / (c - b )
        else:
            f = lambda t: 0
        out = f(x)
    return out
### test data
xl = np.linspace( 0, 4, 150 )
# np.float was removed from NumPy; the builtin float is equivalent here
sl = np.fromiter( ( np.sin( elem ) for elem in xl ), float )
### test splines with manual double knots on first and last
yl = dict()
yl[0] = l_spline( xl, ( 0, 0, .4 ) )
for i in range(1, 10 ):
    yl[i] = l_spline( xl, ( (i - 1 ) * 0.4 , i * 0.4, (i + 1 ) * 0.4 ) )
yl[10] = l_spline( xl, ( 3.6, 4, 4 ) )
## This is the most simple linear least square for the coefficients
AT = list()
for i in range( 11 ):
    AT.append( yl[i] )
AT = np.array( AT )
A = np.transpose( AT )
U = np.dot( AT, A )
UI = np.linalg.inv( U )
K = np.dot( UI, AT )
v = np.dot( K, sl )
## adding up the weighted sum
out = np.zeros( len( sl ) )
for a, l in zip( v, AT ):
    out += a * l
### plotting
fig = plt.figure()
ax = fig.add_subplot( 1, 1, 1 )
ax.plot( xl, sl, ls=':' )
for i in range( 11 ):
    ax.plot( xl, yl[i] )
ax.plot( xl, out, color='k')
plt.show()
The plot shows the sine, the individual basis functions and their weighted sum (image not shown).
Instead of the simple linear optimization, one could use a constrained fit to ensure that no coefficient is larger than 1. This automatically ensures that the function does not go beyond 1. A fixed point can be established by setting the corresponding B-spline to a fixed value, i.e. by not fitting its coefficient.
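One way to realise such a constrained fit, sketched here under assumptions that are not part of the answer, is bounded linear least squares with scipy.optimize.lsq_linear. The example fits the first quadrant of a sine with linear hat functions and keeps every coefficient within [0, 1]. Because the hat functions are non-negative and sum to one, the fitted curve then stays within [0, 1] as well. The knot count and variable names are illustrative.

```python
import numpy as np
from scipy.optimize import lsq_linear

# Target: first quadrant of a sine wave, as in the question.
x = np.linspace(0.0, 1.0, 400)
y = np.sin(x * np.pi / 2)

# Linear "hat" basis functions on equally spaced knots: column i is the
# piecewise-linear function that is 1 at knot i and 0 at all other knots.
knots = np.linspace(0.0, 1.0, 5)
basis = np.column_stack(
    [np.interp(x, knots, unit) for unit in np.eye(len(knots))]
)

# Bounded linear least squares: every coefficient is kept within [0, 1].
result = lsq_linear(basis, y, bounds=(0.0, 1.0))
fit = basis @ result.x
print(result.x)
```

Forcing the curve through a given point at a knot would amount to fixing that coefficient, for example by moving its basis column to the right-hand side before solving.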

How do I create a (1000, 500) array from a (1000, 1) array by indexing through n values?

I need to turn my variables phi, En, and Cn into arrays of the appropriate sizes. I was able to do this successfully in Matlab, but converting from Matlab to Python is difficult. How would I go about this calculation? I essentially need the entire array x to be multiplied by n for n = 1, again for n = 2, ..., up to n = 500, and to get correctly sized arrays for En and Cn as well.
import numpy as np
def Gaussan_wave_packet():
    quantum_number = 500
    x = np.linspace(0,100,1000).astype(complex)
    x0, a, l = 50, 10, 1
    A = (1/(4*a**2))**(1/4.0)
    m = 0.511*10**6 #mass
    hbar = 6.58211951*10**(-16)
    L = x[-1]
    #Gaussian wave packet
    psi_x0 = np.exp((-(x - x0)**2)/(4*a**2))*np.exp(1j*l*x)
    #Normalize wave function
    A = (1/(np.sqrt(np.trapz((np.conj(psi_x0)*psi_x0),x))))
    psi_x0_normalized = np.outer(psi_x0,A) # Makes a (1000,1) array
    phi_result = np.array([])
    En_result = np.array([])
    Cn_result = np.array([])
    for n in range(0,quantum_number):
        phi = ( np.sqrt( 2/L ) * np.sin( ( n * x * np.pi )/L ) ) # Needs to be (1000,500)
        En = ( ( np.power(n,2))*(np.pi**2)*(hbar**2))/(2*m*L**2) # Needs to be (1,500)
        Cn = np.trapz( ( np.conj(phi) * psi_x0_normalized ), x ) # Needs to be (1,500)

You can use element-wise multiplication with np.multiply(a, b).
Reshape x in order to use implicit expansion (broadcasting) and to avoid a for loop:
n = np.arange(quantum_number)
phi = np.sqrt(2/L) * np.sin((np.multiply(n,x.reshape(1000,1)*np.pi)/L ))
You can apply the same logic to En and Cn.
The matlab equivalent would be:
n = 0:(quantum_number-1);
phi = (2/L)^0.5*sin(n.*x.'*pi/L);
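Applying the same broadcasting idea to all three quantities might look like the sketch below. It is an illustration under assumptions rather than the answerer's code: x is kept real, the constants are copied from the question, the normalisation is applied by plain division, and the trapezoid alias only exists to cope with the renaming of np.trapz to np.trapezoid in newer NumPy versions.

```python
import numpy as np

# np.trapz was renamed to np.trapezoid in NumPy 2.0; use whichever exists.
trapezoid = np.trapezoid if hasattr(np, "trapezoid") else np.trapz

quantum_number = 500
x = np.linspace(0, 100, 1000)          # shape (1000,)
x0, a, l = 50, 10, 1
m = 0.511e6                            # mass, as in the question
hbar = 6.58211951e-16
L = x[-1]

# Gaussian wave packet, normalised so that the integral of |psi|^2 is 1.
psi_x0 = np.exp(-(x - x0) ** 2 / (4 * a ** 2)) * np.exp(1j * l * x)
psi_norm = psi_x0 / np.sqrt(trapezoid(np.abs(psi_x0) ** 2, x))

n = np.arange(quantum_number)          # shape (500,), same values as range(0, 500)

# x[:, None] has shape (1000, 1), so the product with n broadcasts to (1000, 500).
phi = np.sqrt(2 / L) * np.sin(n * x[:, None] * np.pi / L)

En = n ** 2 * np.pi ** 2 * hbar ** 2 / (2 * m * L ** 2)                  # shape (500,)
Cn = trapezoid(np.conj(phi) * psi_norm[:, None], x, axis=0)             # shape (500,)

print(phi.shape, En.shape, Cn.shape)
```

If a (1, 500) layout is needed, En.reshape(1, -1) and Cn.reshape(1, -1) give it without copying the data.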
