Finding the full width half maximum of a peak

Finding the full width half maximum of a peak - python

I have been trying to figure out the full width half maximum (FWHM) of the the blue peak (see image). The green peak and the magenta peak combined make up the blue peak. I have been using the following equation to find the FWHM of the green and magenta peaks: fwhm = 2*np.sqrt(2*(math.log(2)))*sd where sd = standard deviation. I created the green and magenta peaks and I know the standard deviation which is why I can use that equation.
I created the green and magenta peaks using the following code:
def make_norm_dist(self, x, mean, sd):
import numpy as np
norm = []
for i in range(x.size):
norm += [1.0/(sd*np.sqrt(2*np.pi))*np.exp(-(x[i] - mean)**2/(2*sd**2))]
return np.array(norm)
If I did not know the blue peak was made up of two peaks and I only had the blue peak in my data, how would I find the FWHM?
I have been using this code to find the peak top:
peak_top = 0.0e-1000
for i in x_axis:
if i > peak_top:
peak_top = i
I could divide the peak_top by 2 to find the half height and then try and find y-values corresponding to the half height, but then I would run into trouble if there are no x-values exactly matching the half height.
I am pretty sure there is a more elegant solution to the one I am trying.

You can use spline to fit the [blue curve - peak/2], and then find it's roots:
import numpy as np
from scipy.interpolate import UnivariateSpline
def make_norm_dist(x, mean, sd):
return 1.0/(sd*np.sqrt(2*np.pi))*np.exp(-(x - mean)**2/(2*sd**2))
x = np.linspace(10, 110, 1000)
green = make_norm_dist(x, 50, 10)
pink = make_norm_dist(x, 60, 10)
blue = green + pink
# create a spline of x and blue-np.max(blue)/2
spline = UnivariateSpline(x, blue-np.max(blue)/2, s=0)
r1, r2 = spline.roots() # find the roots
import pylab as pl
pl.plot(x, blue)
pl.axvspan(r1, r2, facecolor='g', alpha=0.5)
pl.show()
Here is the result:

This worked for me in iPython (quick and dirty, can be reduced to 3 lines):
def FWHM(X,Y):
half_max = max(Y) / 2.
#find when function crosses line half_max (when sign of diff flips)
#take the 'derivative' of signum(half_max - Y[])
d = sign(half_max - array(Y[0:-1])) - sign(half_max - array(Y[1:]))
#plot(X[0:len(d)],d) #if you are interested
#find the left and right most indexes
left_idx = find(d > 0)[0]
right_idx = find(d < 0)[-1]
return X[right_idx] - X[left_idx] #return the difference (full width)
Some additions can be made to make the resolution more accurate, but in the limit that there are many samples along the X axis and the data is not too noisy, this works great.
Even when the data are not Gaussian and a little noisy, it worked for me (I just take the first and last time half max crosses the data).

If your data has noise (and it always does in the real world), a more robust solution would be to fit a Gaussian to the data and extract FWHM from that:
import numpy as np
import scipy.optimize as opt
def gauss(x, p): # p[0]==mean, p[1]==stdev
return 1.0/(p[1]*np.sqrt(2*np.pi))*np.exp(-(x-p[0])**2/(2*p[1]**2))
# Create some sample data
known_param = np.array([2.0, .7])
xmin,xmax = -1.0, 5.0
N = 1000
X = np.linspace(xmin,xmax,N)
Y = gauss(X, known_param)
# Add some noise
Y += .10*np.random.random(N)
# Renormalize to a proper PDF
Y /= ((xmax-xmin)/N)*Y.sum()
# Fit a guassian
p0 = [0,1] # Inital guess is a normal distribution
errfunc = lambda p, x, y: gauss(x, p) - y # Distance to the target function
p1, success = opt.leastsq(errfunc, p0[:], args=(X, Y))
fit_mu, fit_stdev = p1
FWHM = 2*np.sqrt(2*np.log(2))*fit_stdev
print "FWHM", FWHM
The plotted image can be generated by:
from pylab import *
plot(X,Y)
plot(X, gauss(X,p1),lw=3,alpha=.5, color='r')
axvspan(fit_mu-FWHM/2, fit_mu+FWHM/2, facecolor='g', alpha=0.5)
show()
An even better approximation would filter out the noisy data below a given threshold before the fit.

Here is a nice little function using the spline approach.
from scipy.interpolate import splrep, sproot, splev
class MultiplePeaks(Exception): pass
class NoPeaksFound(Exception): pass
def fwhm(x, y, k=10):
"""
Determine full-with-half-maximum of a peaked set of points, x and y.
Assumes that there is only one peak present in the datasset. The function
uses a spline interpolation of order k.
"""
half_max = amax(y)/2.0
s = splrep(x, y - half_max, k=k)
roots = sproot(s)
if len(roots) > 2:
raise MultiplePeaks("The dataset appears to have multiple peaks, and "
"thus the FWHM can't be determined.")
elif len(roots) < 2:
raise NoPeaksFound("No proper peaks were found in the data set; likely "
"the dataset is flat (e.g. all zeros).")
else:
return abs(roots[1] - roots[0])

You should use scipy to solve it: first find_peaks and then peak_widths.
With default value in rel_height(0.5) you're measuring the width at half maximum of the peak.

If you prefer interpolation over fitting:
import numpy as np
def get_full_width(x: np.ndarray, y: np.ndarray, height: float = 0.5) -> float:
height_half_max = np.max(y) * height
index_max = np.argmax(y)
x_low = np.interp(height_half_max, y[:index_max+1], x[:index_max+1])
x_high = np.interp(height_half_max, np.flip(y[index_max:]), np.flip(x[index_max:]))
return x_high - x_low

For monotonic functions with many data points and if there's no need for perfect accuracy, I would use:
def FWHM(X, Y):
deltax = x[1] - x[0]
half_max = max(Y) / 2.
l = np.where(y > half_max, 1, 0)
return np.sum(l) * deltax

I implemented an empirical solution which works for noisy and not-quite-Gaussian data fairly well in haggis.math.full_width_half_max. The usage is extremely straightforward:
fwhm = full_width_half_max(x, y)
The function is robust: it simply finds the maximum of the data and the nearest points crossing the "halfway down" threshold using the requested interpolation scheme.
Here are a couple of examples using data from the other answers.
#HYRY's smooth data
def make_norm_dist(x, mean, sd):
return 1.0/(sd*np.sqrt(2*np.pi))*np.exp(-(x - mean)**2/(2*sd**2))
x = np.linspace(10, 110, 1000)
green = make_norm_dist(x, 50, 10)
pink = make_norm_dist(x, 60, 10)
blue = green + pink
# create a spline of x and blue-np.max(blue)/2
spline = UnivariateSpline(x, blue-np.max(blue)/2, s=0)
r1, r2 = spline.roots() # find the roots
# Compute using my function
fwhm, (x1, y1), (x2, y2) = full_width_half_max(x, blue, return_points=True)
# Print comparison
print('HYRY:', r2 - r1, 'MP:', fwhm)
plt.plot(x, blue)
plt.axvspan(r1, r2, facecolor='g', alpha=0.5)
plt.plot(x1, y1, 'r.')
plt.plot(x2, y2, 'r.')
For smooth data, the results are pretty exact:
HYRY: 26.891157007233254 MP: 26.891193606203814
#Hooked's Noisy Data
def gauss(x, p): # p[0]==mean, p[1]==stdev
return 1.0/(p[1]*np.sqrt(2*np.pi))*np.exp(-(x-p[0])**2/(2*p[1]**2))
# Create some sample data
known_param = np.array([2.0, .7])
xmin,xmax = -1.0, 5.0
N = 1000
X = np.linspace(xmin,xmax,N)
Y = gauss(X, known_param)
# Add some noise
Y += .10*np.random.random(N)
# Renormalize to a proper PDF
Y /= ((xmax-xmin)/N)*Y.sum()
# Fit a guassian
p0 = [0,1] # Inital guess is a normal distribution
errfunc = lambda p, x, y: gauss(x, p) - y # Distance to the target function
p1, success = opt.leastsq(errfunc, p0[:], args=(X, Y))
fit_mu, fit_stdev = p1
FWHM = 2*np.sqrt(2*np.log(2))*fit_stdev
# Compute using my function
fwhm, (x1, y1), (x2, y2) = full_width_half_max(X, Y, return_points=True)
# Print comparison
print('Hooked:', FWHM, 'MP:', fwhm)
plt.plot(X, Y)
plt.plot(X, gauss(X, p1), lw=3, alpha=.5, color='r')
plt.axvspan(fit_mu - FWHM / 2, fit_mu + FWHM / 2, facecolor='g', alpha=0.5)
plt.plot(x1, y1, 'r.')
plt.plot(x2, y2, 'r.')
For noisy data (with a biased baseline), the results are not as consistent.
Hooked: 1.9903193212254346 MP: 1.5039676990530118
On the one hand the Gaussian fit is not very optimal for the data, but on the other hand, the strategy of picking the nearest point that intersects the half-max threshold is likely not optimal either.

Related

Interpolating non-uniformly distributed points on a 3D sphere

I have several points on the unit sphere that are distributed according to the algorithm described in https://www.cmu.edu/biolphys/deserno/pdf/sphere_equi.pdf (and implemented in the code below). On each of these points, I have a value that in my particular case represents 1 minus a small error. The errors are in [0, 0.1] if this is important, so my values are in [0.9, 1].
Sadly, computing the errors is a costly process and I cannot do this for as many points as I want. Still, I want my plots to look like I am plotting something "continuous".
So I want to fit an interpolation function to my data, to be able to sample as many points as I want.
After a little bit of research I found scipy.interpolate.SmoothSphereBivariateSpline which seems to do exactly what I want. But I cannot make it work properly.
Question: what can I use to interpolate (spline, linear interpolation, anything would be fine for the moment) my data on the unit sphere? An answer can be either "you misused scipy.interpolation, here is the correct way to do this" or "this other function is better suited to your problem".
Sample code that should be executable with numpy and scipy installed:
import typing as ty
import numpy
import scipy.interpolate
def get_equidistant_points(N: int) -> ty.List[numpy.ndarray]:
"""Generate approximately n points evenly distributed accros the 3-d sphere.
This function tries to find approximately n points (might be a little less
or more) that are evenly distributed accros the 3-dimensional unit sphere.
The algorithm used is described in
https://www.cmu.edu/biolphys/deserno/pdf/sphere_equi.pdf.
"""
# Unit sphere
r = 1
points: ty.List[numpy.ndarray] = list()
a = 4 * numpy.pi * r ** 2 / N
d = numpy.sqrt(a)
m_v = int(numpy.round(numpy.pi / d))
d_v = numpy.pi / m_v
d_phi = a / d_v
for m in range(m_v):
v = numpy.pi * (m + 0.5) / m_v
m_phi = int(numpy.round(2 * numpy.pi * numpy.sin(v) / d_phi))
for n in range(m_phi):
phi = 2 * numpy.pi * n / m_phi
points.append(
numpy.array(
[
numpy.sin(v) * numpy.cos(phi),
numpy.sin(v) * numpy.sin(phi),
numpy.cos(v),
]
)
)
return points
def cartesian2spherical(x: float, y: float, z: float) -> numpy.ndarray:
r = numpy.linalg.norm([x, y, z])
theta = numpy.arccos(z / r)
phi = numpy.arctan2(y, x)
return numpy.array([r, theta, phi])
n = 100
points = get_equidistant_points(n)
# Random here, but costly in real life.
errors = numpy.random.rand(len(points)) / 10
# Change everything to spherical to use the interpolator from scipy.
ideal_spherical_points = numpy.array([cartesian2spherical(*point) for point in points])
r_interp = 1 - errors
theta_interp = ideal_spherical_points[:, 1]
phi_interp = ideal_spherical_points[:, 2]
# Change phi coordinate from [-pi, pi] to [0, 2pi] to please scipy.
phi_interp[phi_interp < 0] += 2 * numpy.pi
# Create the interpolator.
interpolator = scipy.interpolate.SmoothSphereBivariateSpline(
theta_interp, phi_interp, r_interp
)
# Creating the finer theta and phi values for the final plot
theta = numpy.linspace(0, numpy.pi, 100, endpoint=True)
phi = numpy.linspace(0, numpy.pi * 2, 100, endpoint=True)
# Creating the coordinate grid for the unit sphere.
X = numpy.outer(numpy.sin(theta), numpy.cos(phi))
Y = numpy.outer(numpy.sin(theta), numpy.sin(phi))
Z = numpy.outer(numpy.cos(theta), numpy.ones(100))
thetas, phis = numpy.meshgrid(theta, phi)
heatmap = interpolator(thetas, phis)
Issue with the code above:
With the code as-is, I have a
ValueError: The required storage space exceeds the available storage space: nxest or nyest too small, or s too small. The weighted least-squares spline corresponds to the current set of knots.
that is raised when initialising the interpolator instance.
The issue above seems to say that I should change the value of s that is one on the parameters of scipy.interpolate.SmoothSphereBivariateSpline. I tested different values of s ranging from 0.0001 to 100000, the code above always raise, either the exception described above or:
ValueError: Error code returned by bispev: 10
Edit: I am including my findings here. They can't really be considered as a solution, that is why I am editing and not posting as an answer.
With more research I found this question Using Radial Basis Functions to Interpolate a Function on a Sphere. The author has exactly the same problem as me and use a different interpolator: scipy.interpolate.Rbf. I changed the above code by replacing the interpolator and plotting:
# Create the interpolator.
interpolator = scipy.interpolate.Rbf(theta_interp, phi_interp, r_interp)
# Creating the finer theta and phi values for the final plot
plot_points = 100
theta = numpy.linspace(0, numpy.pi, plot_points, endpoint=True)
phi = numpy.linspace(0, numpy.pi * 2, plot_points, endpoint=True)
# Creating the coordinate grid for the unit sphere.
X = numpy.outer(numpy.sin(theta), numpy.cos(phi))
Y = numpy.outer(numpy.sin(theta), numpy.sin(phi))
Z = numpy.outer(numpy.cos(theta), numpy.ones(plot_points))
thetas, phis = numpy.meshgrid(theta, phi)
heatmap = interpolator(thetas, phis)
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib import cm
colormap = cm.inferno
normaliser = mpl.colors.Normalize(vmin=numpy.min(heatmap), vmax=1)
scalar_mappable = cm.ScalarMappable(cmap=colormap, norm=normaliser)
scalar_mappable.set_array([])
fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.plot_surface(
X,
Y,
Z,
facecolors=colormap(normaliser(heatmap)),
alpha=0.7,
cmap=colormap,
)
plt.colorbar(scalar_mappable)
plt.show()
This code runs smoothly and gives the following result:
The interpolation seems OK except on one line that is discontinuous, just like in the question that led me to this class. One of the answer give the idea of using a different distance, more adapted the the spherical coordinates: the Haversine distance.
def haversine(x1, x2):
theta1, phi1 = x1
theta2, phi2 = x2
return 2 * numpy.arcsin(
numpy.sqrt(
numpy.sin((theta2 - theta1) / 2) ** 2
+ numpy.cos(theta1) * numpy.cos(theta2) * numpy.sin((phi2 - phi1) / 2) ** 2
)
)
# Create the interpolator.
interpolator = scipy.interpolate.Rbf(theta_interp, phi_interp, r_interp, norm=haversine)
which, when executed, gives a warning:
LinAlgWarning: Ill-conditioned matrix (rcond=1.33262e-19): result may not be accurate.
self.nodes = linalg.solve(self.A, self.di)
and a result that is not at all the one expected: the interpolated function have values that may go up to -1 which is clearly wrong.

You can use Cartesian coordinate instead of Spherical coordinate.
The default norm parameter ('euclidean') used by Rbf is sufficient
# interpolation
x, y, z = numpy.array(points).T
interpolator = scipy.interpolate.Rbf(x, y, z, r_interp)
# predict
heatmap = interpolator(X, Y, Z)
Here the result:
ax.plot_surface(
X, Y, Z,
rstride=1, cstride=1,
# or rcount=50, ccount=50,
facecolors=colormap(normaliser(heatmap)),
cmap=colormap,
alpha=0.7, shade=False
)
ax.set_xlabel('x axis')
ax.set_ylabel('y axis')
ax.set_zlabel('z axis')
You can also use a cosine distance if you want (norm parameter):
def cosine(XA, XB):
if XA.ndim == 1:
XA = numpy.expand_dims(XA, axis=0)
if XB.ndim == 1:
XB = numpy.expand_dims(XB, axis=0)
return scipy.spatial.distance.cosine(XA, XB)
In order to better see the differences,
I stacked the two images, substracted them and inverted the layer.

How can I calculate arbitrary values from a spline created with scipy.interpolate.Rbf?

I have several data points in 3 dimensional space (x, y, z) and have interpolated them using scipy.interpolate.Rbf. This gives me a spline nicely representing the surface of my 3D object. I would now like to determine several x and y pairs that have the same, arbitrary z value. I would like to do that in order to compute the cross section of my 3D object at any given value of z. Does someone know how to do that? Maybe there is also a better way to do that instead of using scipy.interpolate.Rbf.
Up to now I have evaluated the cross sections by making a contour plot using matplotlib.pyplot and extracting the displayed segments. 3D points and interpolated spline
segments extracted using a contour plot

I was able to solve the problem. I have calculated the area by triangulating the x-y data and cutting the triangles with the z-plane I wanted to calculate the cross-sectional area of (z=z0). Specifically, I have searched for those triangles whose z-values are both above and below z0. Then I have calculated the x and y values of the sides of these triangles where the sides are equal to z0. Then I use scipy.spatial.ConvexHull to sort the intersected points. Using the shoelace formula I can then determine the area.
I have attached the example code here:
import numpy as np
from scipy import spatial
import matplotlib.pyplot as plt
# Generation of random test data
n = 500
x = np.random.random(n)
y = np.random.random(n)
z = np.exp(-2*(x-.5)**2-4*(y-.5)**2)
z0 = .75
# Triangulation of the test data
triang= spatial.Delaunay(np.array([x, y]).T)
# Determine all triangles where not all points are above or below z0, i.e. the triangles that intersect z0
tri_inter=np.zeros_like(triang.simplices, dtype=np.int) # The triangles which intersect the plane at z0, filled below
i = 0
for tri in triang.simplices:
if ~np.all(z[tri] > z0) and ~np.all(z[tri] < z0):
tri_inter[i,:] = tri
i += 1
tri_inter = tri_inter[~np.all(tri_inter==0, axis=1)] # Remove all rows with only 0
# The number of interpolated values for x and y has twice the length of the triangles
# Because each triangle intersects the plane at z0 twice
x_inter = np.zeros(tri_inter.shape[0]*2)
y_inter = np.zeros(tri_inter.shape[0]*2)
for j, tri in enumerate(tri_inter):
# Determine which of the three points are above and which are below z0
points_above = []
points_below = []
for i in tri:
if z[i] > z0:
points_above.append(i)
else:
points_below.append(i)
# Calculate the intersections and put the values into x_inter and y_inter
t = (z0-z[points_below[0]])/(z[points_above[0]]-z[points_below[0]])
x_new = t * (x[points_above[0]]-x[points_below[0]]) + x[points_below[0]]
y_new = t * (y[points_above[0]]-y[points_below[0]]) + y[points_below[0]]
x_inter[j*2] = x_new
y_inter[j*2] = y_new
if len(points_above) > len(points_below):
t = (z0-z[points_below[0]])/(z[points_above[1]]-z[points_below[0]])
x_new = t * (x[points_above[1]]-x[points_below[0]]) + x[points_below[0]]
y_new = t * (y[points_above[1]]-y[points_below[0]]) + y[points_below[0]]
else:
t = (z0-z[points_below[1]])/(z[points_above[0]]-z[points_below[1]])
x_new = t * (x[points_above[0]]-x[points_below[1]]) + x[points_below[1]]
y_new = t * (y[points_above[0]]-y[points_below[1]]) + y[points_below[1]]
x_inter[j*2+1] = x_new
y_inter[j*2+1] = y_new
# sort points to calculate area
hull = spatial.ConvexHull(np.array([x_inter, y_inter]).T)
x_hull, y_hull = x_inter[hull.vertices], y_inter[hull.vertices]
# Calculation of are using the shoelace formula
area = 0.5*np.abs(np.dot(x_hull,np.roll(y_hull,1))-np.dot(y_hull,np.roll(x_hull,1)))
print('Area:', area)
plt.figure()
plt.plot(x_inter, y_inter, 'ro')
plt.plot(x_hull, y_hull, 'b--')
plt.triplot(x, y, triangles=tri_inter, color='k')
plt.show()

Gaussian fit fails to return the correct parameters for this PSF

To test my 2D Gaussian fitting code for images of bright objects, I'm running it on a Point Spread Function (PSF) that was constructed and fit by the WISE team. On their website, they list the parameters for the central PSF for each band: the FWHM along the major and minor axes, and position angle (this is the angle from the y-axis). All the information is available here: WISE PSF information, but below is an image of those parameters for the central PSF, and the corresponding image of the PSF for band 3.
So, I have downloaded the corresponding FITS image for the central PSF in band 3 (all the images are available to download via the above link), and tried running my code on it. However, my code does not return the parameters I would expect, and the parameters change depending on if I fit to a subimage (and depending on the size of this), or the whole image, which is kind of worrying.
I am wondering if there's a way to make my Gaussian fit code recover the most accurate parameters -- or maybe another fitting method would be more effective. But I'm mostly concerned at the fact that the output parameters of my Gaussian fit can become so obviously wrong. Below is my code.
from scipy import optimize
import numpy as np
from astropy.io import fits
image = 'wise-w3-psf-wpro-09x09-05x05.fits' #WISE central PSF
stacked_image = fits.getdata(image)
# Center of image (the PSF is centered)
x0 = np.shape(stacked_image)[1]//2
y0 = np.shape(stacked_image)[0]//2
# Normalize image so peak = 1
def normalize(image):
image *= 1/np.max(image)
return image
stacked_image = normalize(stacked_image)
def gaussian_func(xy, x0, y0, sigma_x, sigma_y, amp, theta, offset):
x, y = xy
a = (np.cos(theta))**2/(2*sigma_x**2) + (np.sin(theta))**2/(2*sigma_y**2)
b = -np.sin(2*theta)/(4*sigma_x**2) + np.sin(2*theta)/(4*sigma_y**2)
c = (np.sin(theta))**2/(2*sigma_x**2) + (np.cos(theta))**2/(2*sigma_y**2)
inner = a * (x-x0)**2
inner += 2*b*(x-x0)*(y-y0)
inner += c * (y-y0)**2
return (offset + amp * np.exp(-inner)).ravel()
def Sigma2width(sigma):
return 2 * np.sqrt(2*np.log(2)) * sigma
def generate(data_set):
xvec = np.arange(0, np.shape(data_set)[1], 1)
yvec = np.arange(0, np.shape(data_set)[0], 1)
X, Y = np.meshgrid(xvec, yvec)
return X, Y
# METHOD 1: Fit subimage of PSF to Gaussian
# Guesses
theta_guess = np.deg2rad(96) #I believe that the angle in the Gaussian corresponds to CCW from the x-axis (so use WISE position angle + 90 degrees)
sigma_x = 5
sigma_y = 4
amp = 1 #I know this is true since I normalized it
subimage = stacked_image[y0-50:y0+50, x0-50:x0+50]
offset = np.min(subimage)
guesses = [np.shape(subimage)[1]//2, np.shape(subimage)[0]//2, sigma_x, sigma_y, amp, theta_guess, offset]
xx, yy = generate(subimage)
pred_params, uncert_cov = optimize.curve_fit(gaussian_func, (xx.ravel(), yy.ravel()), subimage.ravel(), p0=guesses)
width_x, width_y = Sigma2width(np.abs(pred_params[2]))*0.275, Sigma2width(np.abs(pred_params[3]))*0.275 #multiply by pixel scale factor (available on website) to get FWHMs in arcseconds
x_0, y_0 = pred_params[0]+(x0-50), pred_params[1]+(y0-50) #add back origin
theta_deg = np.rad2deg(pred_params[5])
pred_params[5] = theta_deg
pred_params[0] = x_0
pred_params[1] = y_0
if theta_deg < 90:
pos_angle = theta_deg+90
elif theta_deg >= 90:
pos_angle = theta_deg-90
print('PREDICTED FWHM x, y in arcsecs:', width_x, width_y)
print('FIT PARAMS [x0, y0, sigma_x, sigma_y, amp, theta, offset]:', pred_params)
print('POSITION ANGLE:', pos_angle)
# Output: PREDICTED FWHM x, y in arcsecs: 6.4917 5.4978
# FIT PARAMS [x0, y0, sigma_x, sigma_y, amp, theta, offset]: [3.195e+02 3.189e+02 1.002e+01 8.489e+00 8.695e-01 8.655e+01 2.613e-02]
# POSITION ANGLE: 176.556
# METHOD 2: Fit whole image to Gaussian
# Guesses
theta_guess = np.deg2rad(96)
sigma_x = 5
sigma_y = 4
amp = 1
offset = np.median(stacked_image)
guesses = [x0, y0, sigma_x, sigma_y, amp, theta_guess, offset]
# Sigmas - manual estimation
ylim, xlim = np.shape(stacked_image)
x, y = np.arange(0, xlim, 1), np.arange(0, ylim, 1)
ypix, xpix = np.where(stacked_image==amp)
y_range = np.take(stacked_image, ypix[0], axis=0)
x_range = np.take(stacked_image, xpix[0], axis=1)
xx, yy = generate(stacked_image)
pred_params, uncert_cov = optimize.curve_fit(gaussian_func, (xx.ravel(), yy.ravel()), stacked_image.ravel(), p0=guesses)
width_x, width_y = Sigma2width(np.abs(pred_params[2]))*0.275, Sigma2width(np.abs(pred_params[3]))*0.275 #in arcsecs
theta = pred_params[5]
print('PREDICTED FWHM x, y in arcsecs:', width_x, width_y)
print('FIT PARAMS [x0, y0, sigma_x, sigma_y, amp, theta, offset]:', pred_params)
# Output:
# PREDICTED FWHM x, y in arcsecs: 7.088 6.106
# FIT PARAMS [x0, y0, sigma_x, sigma_y, amp, theta, offset]: [3.195e+02 3.190e+02 1.095e+01 9.429e+00 8.378e-01 1.521e+00 7.998e-04]
if theta < 90:
pos_angle = 90+np.rad2deg(theta)
elif theta >= 90:
pos_angle = 90-np.rad2deg(theta)
print('POSITION ANGLE:', pos_angle)
# POSITION ANGLE: 177.147
You can see in the (rounded) outputs that the amplitudes my Gaussian fits return aren't even 1, and the other parameters (FWHMs and angles) don't match up with the correct parameters shown in the table either.
If I fit a subimage, it seems that the amplitude gets closer and closer to (but never reaches) 1 the smaller I make the subimage, but then the FWHMs might get too small compared to the real values. Why am I not getting back the correct results, and how can I make the fit as accurate as possible?

python optimize.leastsq: fitting a circle to 3d set of points

I am trying to use circle fitting code for 3D data set. I have modified it for 3D points just adding z-coordinate where necessary. My modification works fine for one set of points and works bad for another. Please look at the code, if it has some errors.
import trig_items
import numpy as np
from trig_items import *
from numpy import *
from matplotlib import pyplot as p
from scipy import optimize
# Coordinates of the 3D points
##x = r_[36, 36, 19, 18, 33, 26]
##y = r_[14, 10, 28, 31, 18, 26]
##z = r_[0, 1, 2, 3, 4, 5]
x = r_[ 2144.18908574, 2144.26880854, 2144.05552972, 2143.90303742, 2143.62520676,
2143.43628579, 2143.14005775, 2142.79919654, 2142.51436023, 2142.11240866,
2141.68564346, 2141.29333828, 2140.92596405, 2140.3475612, 2139.90848046,
2139.24661021, 2138.67384709, 2138.03313547, 2137.40301734, 2137.40908256,
2137.06611224, 2136.50943781, 2136.0553113, 2135.50313189, 2135.07049922,
2134.62098139, 2134.10459535, 2133.50838433, 2130.6600465, 2130.03537342,
2130.04047644, 2128.83522468, 2127.79827542, 2126.43513385, 2125.36700593,
2124.00350543, 2122.68564431, 2121.20709478, 2119.79047011, 2118.38417647,
2116.90063343, 2115.52685778, 2113.82246629, 2112.21159431, 2110.63180117,
2109.00713198, 2108.94434529, 2106.82777156, 2100.62343757, 2098.5090226,
2096.28787738, 2093.91550703, 2091.66075061, 2089.15316429, 2086.69753869,
2084.3002414, 2081.87590579, 2079.19141866, 2076.5394574, 2073.89128676,
2071.18786213]
y = r_[ 725.74913818, 724.43874065, 723.15226506, 720.45950581, 717.77827954,
715.07048092, 712.39633862, 709.73267688, 707.06039438, 704.43405908,
701.80074596, 699.15371526, 696.5309022, 693.96109921, 691.35585501,
688.83496327, 686.32148661, 683.80286662, 681.30705568, 681.30530975,
679.66483676, 678.01922321, 676.32721779, 674.6667554, 672.9658024,
671.23686095, 669.52021535, 667.84999077, 659.19757984, 657.46179949,
657.45700508, 654.46901086, 651.38177517, 648.41739432, 645.32356976,
642.39034578, 639.42628453, 636.51107198, 633.57732055, 630.63825133,
627.75308356, 624.80162215, 622.01980232, 619.18814892, 616.37688894,
613.57400131, 613.61535723, 610.4724493, 600.98277781, 597.84782844,
594.75983001, 591.77946964, 588.74874068, 585.84525834, 582.92311166,
579.99564481, 577.06666417, 574.30782762, 571.54115037, 568.79760614,
566.08551098]
z = r_[ 339.77146775, 339.60021095, 339.47645894, 339.47130963, 339.37216218,
339.4126132, 339.67942046, 339.40917728, 339.39500353, 339.15041461,
339.38959195, 339.3358209, 339.47764895, 339.17854867, 339.14624071,
339.16403926, 339.02308811, 339.27011082, 338.97684183, 338.95087698,
338.97321177, 339.02175448, 339.02543922, 338.88725411, 339.06942374,
339.0557553, 339.04414618, 338.89234303, 338.95572249, 339.00880416,
339.00413073, 338.91080374, 338.98214758, 339.01135789, 338.96393537,
338.73446188, 338.62784913, 338.72443217, 338.74880562, 338.69090173,
338.50765186, 338.49056867, 338.57353355, 338.6196255, 338.43754399,
338.27218569, 338.10587265, 338.43880881, 338.28962141, 338.14338705,
338.25784154, 338.49792568, 338.15572139, 338.52967693, 338.4594245,
338.1511823, 338.03711207, 338.19144663, 338.22022045, 338.29032321,
337.8623197 ]
# coordinates of the barycenter
xm = mean(x)
ym = mean(y)
zm = mean(z)
### Basic usage of optimize.leastsq
def calc_R(xc, yc, zc):
""" calculate the distance of each 3D points from the center (xc, yc, zc) """
return sqrt((x - xc) ** 2 + (y - yc) ** 2 + (z - zc) ** 2)
def func(c):
""" calculate the algebraic distance between the 3D points and the mean circle centered at c=(xc, yc, zc) """
Ri = calc_R(*c)
return Ri - Ri.mean()
center_estimate = xm, ym, zm
center, ier = optimize.leastsq(func, center_estimate)
##print center
xc, yc, zc = center
Ri = calc_R(xc, yc, zc)
R = Ri.mean()
residu = sum((Ri - R)**2)
print 'R =', R
So, for the first set of x, y, z (commented in the code) it works well: the output is R = 39.0097846735. If I run the code with the second set of points (uncommented) the resulting radius is R = 108576.859834, which is almost straight line. I plotted the last one.
The blue points is a given data set, the red ones is the arc of the resulting radius R = 108576.859834. It is obvious that the given data set has much smaller radius than the result.
Here is another set of points.
It is clear that the least squares does not work correctly.
Please help me solving this issue.
UPDATE
Here is my solution:
### fit 3D arc into a set of 3D points ###
### output is the centre and the radius of the arc ###
def fitArc3d(arr, eps = 0.0001):
# Coordinates of the 3D points
x = numpy.array([arr[k][0] for k in range(len(arr))])
y = numpy.array([arr[k][4] for k in range(len(arr))])
z = numpy.array([arr[k][5] for k in range(len(arr))])
# coordinates of the barycenter
xm = mean(x)
ym = mean(y)
zm = mean(z)
### gradient descent minimisation method ###
pnts = [[x[k], y[k], z[k]] for k in range(len(x))]
meanP = Point(xm, ym, zm) # mean point
Ri = [Point(*meanP).distance(Point(*pnts[k])) for k in range(len(pnts))] # radii to the points
Rm = math.fsum(Ri) / len(Ri) # mean radius
dR = Rm + 10 # difference between mean radii
alpha = 0.1
c = meanP
cArr = []
while dR > eps:
cArr.append(c)
Jx = math.fsum([2 * (x[k] - c[0]) * (Ri[k] - Rm) / Ri[k] for k in range(len(Ri))])
Jy = math.fsum([2 * (y[k] - c[1]) * (Ri[k] - Rm) / Ri[k] for k in range(len(Ri))])
Jz = math.fsum([2 * (z[k] - c[2]) * (Ri[k] - Rm) / Ri[k] for k in range(len(Ri))])
gradJ = [Jx, Jy, Jz] # find gradient
c = [c[k] + alpha * gradJ[k] for k in range(len(c)) if len(c) == len(gradJ)] # find new centre point
Ri = [Point(*c).distance(Point(*pnts[k])) for k in range(len(pnts))] # calculate new radii
RmOld = Rm
Rm = math.fsum(Ri) / len(Ri) # calculate new mean radius
dR = abs(Rm - RmOld) # new difference between mean radii
return Point(*c), Rm
It is not very optimal code (I do not have time to fine tune it) but it works.

I guess the problem is the data and the corresponding algorithm. The least square method works fine if it produces a local parabolic minimum, such that a simple gradient method goes approximately direction minimum. Unfortunately, this is not necessarily the case for your data. You can check this by keeping some rough estimates for xc and yc fixed and plotting the sum of the squared residuals as a function of zc and R. I get a boomerang shaped minimum. Depending on your starting parameters you might end in one of the branches going away from the real minimum. Once in the valley this can be very flat such that you exceed the number of max iterations or get something that is accepted within the tolerance of the algorithm. As always, thinks are better the better your starting parameters. Unfortunately you have only a small arc of the circle, so that it is difficult to get better. I am not a specialist in Python, but I think that leastsq allows you to play with the Jacobian and Gradient Methods. Try to play with the tolerance as well.
In short: the code looks basically fine to me, but your data is pathological and you have to adapt the code to that kind of data.
There is a non-iterative solution in 2D from Karimäki, maybe you can adapt
this method to 3D. You can also look at this. Sure you will find more literature.
I just checked the data using a Simplex-Algorithm. The minimum is, as I said, not well behaved. See here some cuts of the residual function. Only in the xy-plane you get some reasonable behavior. The properties of the zr- and xr- plane make the finding process very difficult.
So in the beginning the simplex algorithm finds several almost stable solutions. You can see them as flat steps in the graph below (blue x, purple y, yellow z, green R). At the end the algorithm has to walk down the almost flat but very stretched out valley, resulting in the final conversion of z and R. Nevertheless, I expect many regions that look like a solution if the tolerance is insufficient. With the standard tolerance of 10^-5 the algoritm stopped after approx 350 iterations. I had to set it to 10^-10 to get this solution, i.e. [1899.32, 741.874, 298.696, 248.956], which seems quite ok.
Update
As mentioned earlier, the solution depends on the working precision and requested accuracy. So your hand made gradient method works probably better as these values are different compared to the build-in least square fit. Nevertheless, this is my version making a two step fit. First I fit a plane to the data. In a next step I fit a circle within this plane. Both steps use the least square method. This time it works, as each step avoids critically shaped minima. (Naturally, the plane fit runs into problems if the arc segment becomes small and the data lies virtually on a straight line. But this will happen for all algorithms)
from math import *
from matplotlib import pyplot as plt
from scipy import optimize
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import pprint as pp
dataTupel=zip(xs,ys,zs) #your data from above
# Fitting a plane first
# let the affine plane be defined by two vectors,
# the zero point P0 and the plane normal n0
# a point p is member of the plane if (p-p0).n0 = 0
def distanceToPlane(p0,n0,p):
return np.dot(np.array(n0),np.array(p)-np.array(p0))
def residualsPlane(parameters,dataPoint):
px,py,pz,theta,phi = parameters
nx,ny,nz =sin(theta)*cos(phi),sin(theta)*sin(phi),cos(theta)
distances = [distanceToPlane([px,py,pz],[nx,ny,nz],[x,y,z]) for x,y,z in dataPoint]
return distances
estimate = [1900, 700, 335,0,0] # px,py,pz and zeta, phi
#you may automize this by using the center of mass data
# note that the normal vector is given in polar coordinates
bestFitValues, ier = optimize.leastsq(residualsPlane, estimate, args=(dataTupel))
xF,yF,zF,tF,pF = bestFitValues
point = [xF,yF,zF]
normal = [sin(tF)*cos(pF),sin(tF)*sin(pF),cos(tF)]
# Fitting a circle inside the plane
#creating two inplane vectors
sArr=np.cross(np.array([1,0,0]),np.array(normal))#assuming that normal not parallel x!
sArr=sArr/np.linalg.norm(sArr)
rArr=np.cross(sArr,np.array(normal))
rArr=rArr/np.linalg.norm(rArr)#should be normalized already, but anyhow
def residualsCircle(parameters,dataPoint):
r,s,Ri = parameters
planePointArr = s*sArr + r*rArr + np.array(point)
distance = [ np.linalg.norm( planePointArr-np.array([x,y,z])) for x,y,z in dataPoint]
res = [(Ri-dist) for dist in distance]
return res
estimateCircle = [0, 0, 335] # px,py,pz and zeta, phi
bestCircleFitValues, ier = optimize.leastsq(residualsCircle, estimateCircle,args=(dataTupel))
rF,sF,RiF = bestCircleFitValues
print bestCircleFitValues
# Synthetic Data
centerPointArr=sF*sArr + rF*rArr + np.array(point)
synthetic=[list(centerPointArr+ RiF*cos(phi)*rArr+RiF*sin(phi)*sArr) for phi in np.linspace(0, 2*pi,50)]
[cxTupel,cyTupel,czTupel]=[ x for x in zip(*synthetic)]
### Plotting
d = -np.dot(np.array(point),np.array(normal))# dot product
# create x,y mesh
xx, yy = np.meshgrid(np.linspace(2000,2200,10), np.linspace(540,740,10))
# calculate corresponding z
# Note: does not work if normal vector is without z-component
z = (-normal[0]*xx - normal[1]*yy - d)/normal[2]
# plot the surface, data, and synthetic circle
fig = plt.figure()
ax = fig.add_subplot(211, projection='3d')
ax.scatter(xs, ys, zs, c='b', marker='o')
ax.plot_wireframe(xx,yy,z)
ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')
ax.set_zlabel('Z Label')
bx = fig.add_subplot(212, projection='3d')
bx.scatter(xs, ys, zs, c='b', marker='o')
bx.scatter(cxTupel,cyTupel,czTupel, c='r', marker='o')
bx.set_xlabel('X Label')
bx.set_ylabel('Y Label')
bx.set_zlabel('Z Label')
plt.show()
which give a radius of 245. This is close to what the other approach gave (249). So within error margins I get the same.
The plotted result looks reasonable.
Hope this helps.

Feel like you missed some constraints in your 1st version code. The implementation could be explained as fitting a sphere to 3d points. So that's why the 2nd radius for 2nd data list is almost straight line. It's thinking like you are giving it a small circle on a large sphere.

Bézier curve fitting with SciPy

I have a set of points which approximate a 2D curve. I would like to use Python with numpy and scipy to find a cubic Bézier path which approximately fits the points, where I specify the exact coordinates of two endpoints, and it returns the coordinates of the other two control points.
I initially thought scipy.interpolate.splprep() might do what I want, but it seems to force the curve to pass through each one of the data points (as I suppose you would want for interpolation). I'll assume that I was on the wrong track with that.
My question is similar to this one: How can I fit a Bézier curve to a set of data?, except that they said they didn't want to use numpy. My preference would be to find what I need already implemented somewhere in scipy or numpy. Otherwise, I plan to implement the algorithm linked from one of the answers to that question, using numpy: An algorithm for automatically fitting digitized curves (pdf.page 622).
Thank you for any suggestions!
Edit: I understand that a cubic Bézier curve is not guaranteed to pass through all the points; I want one which passes through two given endpoints, and which is as close as possible to the specified interior points.

Here's a way to do Bezier curves with numpy:
import numpy as np
from scipy.special import comb
def bernstein_poly(i, n, t):
"""
The Bernstein polynomial of n, i as a function of t
"""
return comb(n, i) * ( t**(n-i) ) * (1 - t)**i
def bezier_curve(points, nTimes=1000):
"""
Given a set of control points, return the
bezier curve defined by the control points.
points should be a list of lists, or list of tuples
such as [ [1,1],
[2,3],
[4,5], ..[Xn, Yn] ]
nTimes is the number of time steps, defaults to 1000
See http://processingjs.nihongoresources.com/bezierinfo/
"""
nPoints = len(points)
xPoints = np.array([p[0] for p in points])
yPoints = np.array([p[1] for p in points])
t = np.linspace(0.0, 1.0, nTimes)
polynomial_array = np.array([ bernstein_poly(i, nPoints-1, t) for i in range(0, nPoints) ])
xvals = np.dot(xPoints, polynomial_array)
yvals = np.dot(yPoints, polynomial_array)
return xvals, yvals
if __name__ == "__main__":
from matplotlib import pyplot as plt
nPoints = 4
points = np.random.rand(nPoints,2)*200
xpoints = [p[0] for p in points]
ypoints = [p[1] for p in points]
xvals, yvals = bezier_curve(points, nTimes=1000)
plt.plot(xvals, yvals)
plt.plot(xpoints, ypoints, "ro")
for nr in range(len(points)):
plt.text(points[nr][0], points[nr][1], nr)
plt.show()

Here is a piece of python code for fitting points:
'''least square qbezier fit using penrose pseudoinverse
>>> V=array
>>> E, W, N, S = V((1,0)), V((-1,0)), V((0,1)), V((0,-1))
>>> cw = 100
>>> ch = 300
>>> cpb = V((0, 0))
>>> cpe = V((cw, 0))
>>> xys=[cpb,cpb+ch*N+E*cw/8,cpe+ch*N+E*cw/8, cpe]
>>>
>>> ts = V(range(11), dtype='float')/10
>>> M = bezierM (ts)
>>> points = M*xys #produces the points on the bezier curve at t in ts
>>>
>>> control_points=lsqfit(points, M)
>>> linalg.norm(control_points-xys)<10e-5
True
>>> control_points.tolist()[1]
[12.500000000000037, 300.00000000000017]
'''
from numpy import array, linalg, matrix
from scipy.misc import comb as nOk
Mtk = lambda n, t, k: t**(k)*(1-t)**(n-k)*nOk(n,k)
bezierM = lambda ts: matrix([[Mtk(3,t,k) for k in range(4)] for t in ts])
def lsqfit(points,M):
M_ = linalg.pinv(M)
return M_ * points
Generally on bezier curves check out
Animated bezier and
bezierinfo

Resulting Plot
Building upon the answers from #reptilicus and #Guillaume P., here is the complete code to:
Get the Bezier Parameters i.e. the control points from a list of points.
Create the Bezier Curve from the Bezier Parameters i.e. the control points.
Plot the original points, the control points and the resulting Bezier Curve.
Getting the Bezier Parameters i.e. the control points from a set of X,Y points or coordinates. The other parameter needed is the degree for the approximation and the resulting control points will be (degree + 1)
import numpy as np
from scipy.special import comb
def get_bezier_parameters(X, Y, degree=3):
""" Least square qbezier fit using penrose pseudoinverse.
Parameters:
X: array of x data.
Y: array of y data. Y[0] is the y point for X[0].
degree: degree of the Bézier curve. 2 for quadratic, 3 for cubic.
Based on https://stackoverflow.com/questions/12643079/b%C3%A9zier-curve-fitting-with-scipy
and probably on the 1998 thesis by Tim Andrew Pastva, "Bézier Curve Fitting".
"""
if degree < 1:
raise ValueError('degree must be 1 or greater.')
if len(X) != len(Y):
raise ValueError('X and Y must be of the same length.')
if len(X) < degree + 1:
raise ValueError(f'There must be at least {degree + 1} points to '
f'determine the parameters of a degree {degree} curve. '
f'Got only {len(X)} points.')
def bpoly(n, t, k):
""" Bernstein polynomial when a = 0 and b = 1. """
return t ** k * (1 - t) ** (n - k) * comb(n, k)
#return comb(n, i) * ( t**(n-i) ) * (1 - t)**i
def bmatrix(T):
""" Bernstein matrix for Bézier curves. """
return np.matrix([[bpoly(degree, t, k) for k in range(degree + 1)] for t in T])
def least_square_fit(points, M):
M_ = np.linalg.pinv(M)
return M_ * points
T = np.linspace(0, 1, len(X))
M = bmatrix(T)
points = np.array(list(zip(X, Y)))
final = least_square_fit(points, M).tolist()
final[0] = [X[0], Y[0]]
final[len(final)-1] = [X[len(X)-1], Y[len(Y)-1]]
return final
Create the Bezier curve given the Bezier Parameters i.e. control points.
def bernstein_poly(i, n, t):
"""
The Bernstein polynomial of n, i as a function of t
"""
return comb(n, i) * ( t**(n-i) ) * (1 - t)**i
def bezier_curve(points, nTimes=50):
"""
Given a set of control points, return the
bezier curve defined by the control points.
points should be a list of lists, or list of tuples
such as [ [1,1],
[2,3],
[4,5], ..[Xn, Yn] ]
nTimes is the number of time steps, defaults to 1000
See http://processingjs.nihongoresources.com/bezierinfo/
"""
nPoints = len(points)
xPoints = np.array([p[0] for p in points])
yPoints = np.array([p[1] for p in points])
t = np.linspace(0.0, 1.0, nTimes)
polynomial_array = np.array([ bernstein_poly(i, nPoints-1, t) for i in range(0, nPoints) ])
xvals = np.dot(xPoints, polynomial_array)
yvals = np.dot(yPoints, polynomial_array)
return xvals, yvals
Sample data used (can be replaced with any data, this is GPS data).
points = []
xpoints = [19.21270, 19.21269, 19.21268, 19.21266, 19.21264, 19.21263, 19.21261, 19.21261, 19.21264, 19.21268,19.21274, 19.21282, 19.21290, 19.21299, 19.21307, 19.21316, 19.21324, 19.21333, 19.21342]
ypoints = [-100.14895, -100.14885, -100.14875, -100.14865, -100.14855, -100.14847, -100.14840, -100.14832, -100.14827, -100.14823, -100.14818, -100.14818, -100.14818, -100.14818, -100.14819, -100.14819, -100.14819, -100.14820, -100.14820]
for i in range(len(xpoints)):
points.append([xpoints[i],ypoints[i]])
Plot the original points, the control points and the resulting Bezier Curve.
import matplotlib.pyplot as plt
# Plot the original points
plt.plot(xpoints, ypoints, "ro",label='Original Points')
# Get the Bezier parameters based on a degree.
data = get_bezier_parameters(xpoints, ypoints, degree=4)
x_val = [x[0] for x in data]
y_val = [x[1] for x in data]
print(data)
# Plot the control points
plt.plot(x_val,y_val,'k--o', label='Control Points')
# Plot the resulting Bezier curve
xvals, yvals = bezier_curve(data, nTimes=1000)
plt.plot(xvals, yvals, 'b-', label='B Curve')
plt.legend()
plt.show()

#keynesiancross asked for "comments in [Roland's] code as to what the variables are" and others completely missed the stated problem. Roland started with a Bézier curve as input (to get a perfect match), which made it harder to understand both the problem and (at least for me) the solution. The difference from interpolation is easier to see for input that leaves residuals. Here is both paraphrased code and non-Bézier input -- and an unexpected outcome.
import matplotlib.pyplot as plt
import numpy as np
from scipy.special import comb as n_over_k
Mtk = lambda n, t, k: t**k * (1-t)**(n-k) * n_over_k(n,k)
BézierCoeff = lambda ts: [[Mtk(3,t,k) for k in range(4)] for t in ts]
fcn = np.log
tPlot = np.linspace(0. ,1. , 81)
xPlot = np.linspace(0.1,2.5, 81)
tData = tPlot[0:81:10]
xData = xPlot[0:81:10]
data = np.column_stack((xData, fcn(xData))) # shapes (9,2)
Pseudoinverse = np.linalg.pinv(BézierCoeff(tData)) # (9,4) -> (4,9)
control_points = Pseudoinverse.dot(data) # (4,9)*(9,2) -> (4,2)
Bézier = np.array(BézierCoeff(tPlot)).dot(control_points)
residuum = fcn(Bézier[:,0]) - Bézier[:,1]
fig, ax = plt.subplots()
ax.plot(xPlot, fcn(xPlot), 'r-')
ax.plot(xData, data[:,1], 'ro', label='input')
ax.plot(Bézier[:,0],
Bézier[:,1], 'k-', label='fit')
ax.plot(xPlot, 10.*residuum, 'b-', label='10*residuum')
ax.plot(control_points[:,0],
control_points[:,1], 'ko:', fillstyle='none')
ax.legend()
fig.show()
This works well for fcn = np.cos but not for log. I kind of expected that the fit would use the t-component of the control points as additional degrees of freedom, as we would do by dragging the control points:
manual_points = np.array([[0.1,np.log(.1)],[.27,-.6],[.82,.23],[2.5,np.log(2.5)]])
Bézier = np.array(BézierCoeff(tPlot)).dot(manual_points)
residuum = fcn(Bézier[:,0]) - Bézier[:,1]
fig, ax = plt.subplots()
ax.plot(xPlot, fcn(xPlot), 'r-')
ax.plot(xData, data[:,1], 'ro', label='input')
ax.plot(Bézier[:,0],
Bézier[:,1], 'k-', label='fit')
ax.plot(xPlot, 10.*residuum, 'b-', label='10*residuum')
ax.plot(manual_points[:,0],
manual_points[:,1], 'ko:', fillstyle='none')
ax.legend()
fig.show()
The cause of failure, I guess, is that the norm measures the distance between points on the curves instead of the distance between a point on one curve to the nearest point on the other curve.

Short answer: you don't, because that's not how Bezier curves work. Longer answer: have a look at Catmull-Rom splines instead. They're pretty easy to form (the tangent vector at any point P, barring start and end, is parallel to the lines {P-1,P+1}, so they're easy to program, too) and always pass through the points that define them, unlike Bezier curves, which interpolates "somewhere" inside the convex hull set up by all the control points.

A Bezier curve isn't guaranteed to pass through every point you supply it with; control points are arbitrary (in the sense that there is no specific algorithm for finding them, you simply choose them yourself) and only pull the curve in a direction.
If you want a curve which will pass through every point you supply it with, you need something like a natural cubic spline, and due to the limitations of those (you must supply them with increasing x co-ordinates, or it tends to infinity), you'll probably want a parametric natural cubic spline.
There are nice tutorials here:
Cubic Splines
Parametric Cubic Splines

I had the same problem as detailed in the question. I took the code provided Roland Puntaier and was able to make it work. Here:
def get_bezier_parameters(X, Y, degree=2):
""" Least square qbezier fit using penrose pseudoinverse.
Parameters:
X: array of x data.
Y: array of y data. Y[0] is the y point for X[0].
degree: degree of the Bézier curve. 2 for quadratic, 3 for cubic.
Based on https://stackoverflow.com/questions/12643079/b%C3%A9zier-curve-fitting-with-scipy
and probably on the 1998 thesis by Tim Andrew Pastva, "Bézier Curve Fitting".
"""
if degree < 1:
raise ValueError('degree must be 1 or greater.')
if len(X) != len(Y):
raise ValueError('X and Y must be of the same length.')
if len(X) < degree + 1:
raise ValueError(f'There must be at least {degree + 1} points to '
f'determine the parameters of a degree {degree} curve. '
f'Got only {len(X)} points.')
def bpoly(n, t, k):
""" Bernstein polynomial when a = 0 and b = 1. """
return t ** k * (1 - t) ** (n - k) * comb(n, k)
def bmatrix(T):
""" Bernstein matrix for Bézier curves. """
return np.matrix([[bpoly(degree, t, k) for k in range(degree + 1)] for t in T])
def least_square_fit(points, M):
M_ = np.linalg.pinv(M)
return M_ * points
T = np.linspace(0, 1, len(X))
M = bmatrix(T)
points = np.array(list(zip(X, Y)))
return least_square_fit(points, M).tolist()
To fix the end points of the curve, ignore the first and last parameter returned by the function and use your own points.

What Mike Kamermans said is true, but I also wanted to point out that, as far as I know, catmull-rom splines can be defined in terms of cubic beziers. So, if you only have a library that works with cubics, you should still be able to do catmull-rom splines:
http://schepers.cc/getting-to-the-point
https://github.com/DmitryBaranovskiy/raphael/blob/d8fbe4be81d362837f95e33886b80fb41de443b4/dev/raphael.core.js#L1021

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Finding the full width half maximum of a peak - python

You should use scipy to solve it: first find_peaks and then peak_widths. With default value in rel_height(0.5) you're measuring the width at half maximum of the peak.

For monotonic functions with many data points and if there's no need for perfect accuracy, I would use: def FWHM(X, Y): deltax = x[1] - x[0] half_max = max(Y) / 2. l = np.where(y > half_max, 1, 0) return np.sum(l) * deltax

Related

Interpolating non-uniformly distributed points on a 3D sphere

How can I calculate arbitrary values from a spline created with scipy.interpolate.Rbf?

Gaussian fit fails to return the correct parameters for this PSF

python optimize.leastsq: fitting a circle to 3d set of points

Bézier curve fitting with SciPy

Categories

Resources