I have a curve as shown below:
The x coordinates and the y coordinates for this plot are:
path_x= (4.0, 5.638304088577984, 6.785456961280076, 5.638304088577984, 4.0)
path_y =(0.0, 1.147152872702092, 2.7854569612800755, 4.423761049858059, 3.2766081771559668)
And I obtained the above picture by:
import matplotlib.pyplot as plt
x_min =min(path_x)-1
x_max =max(path_x)+1
y_min =min(path_y)-1
y_max =max(path_y)+1
num_pts = len(path_x)
fig = plt.figure(figsize=(8,8))
#fig = plt.figure()
plt.suptitle("Curve and the boundary")
ax = fig.add_subplot(1,1,1)
ax.set_xlim([min(x_min,y_min),max(x_max,y_max)])
ax.set_ylim([min(x_min,y_min),max(x_max,y_max)])
ax.plot(path_x,path_y)
Now I want to draw a smooth curve through these points using cubic splines. However, cubic spline routines appear to require the x coordinates to be in ascending order, whereas in this case neither the x values nor the y values are ascending.
Also, this curve is not a function: a single x value is mapped to more than one y value.
I also went over this post, but I couldn't figure out a proper method to solve my problem.
I would really appreciate your help in this regard.
As suggested in the comments, you can always parameterize any curve/surface with an arbitrary (and linear!) parameter.
For example, define t as a parameter such that you get x=x(t) and y=y(t). Since t is arbitrary, you can define it such that at t=0, you get your first path_x[0],path_y[0], and at t=1, you get your last pair of coordinates, path_x[-1],path_y[-1].
Here is some code using scipy.interpolate:
import numpy
import scipy.interpolate
import matplotlib.pyplot as plt
path_x = numpy.asarray((4.0, 5.638304088577984, 6.785456961280076, 5.638304088577984, 4.0),dtype=float)
path_y = numpy.asarray((0.0, 1.147152872702092, 2.7854569612800755, 4.423761049858059, 3.2766081771559668),dtype=float)
# defining arbitrary parameter to parameterize the curve
path_t = numpy.linspace(0,1,path_x.size)
# this is the position vector with
# x coord (1st row) given by path_x, and
# y coord (2nd row) given by path_y
r = numpy.vstack((path_x.reshape((1,path_x.size)),path_y.reshape((1,path_y.size))))
# creating the spline object
spline = scipy.interpolate.interp1d(path_t,r,kind='cubic')
# defining values of the arbitrary parameter over which
# you want to interpolate x and y
# it MUST be within 0 and 1, since you defined
# the spline between path_t=0 and path_t=1
t = numpy.linspace(numpy.min(path_t),numpy.max(path_t),100)
# interpolating along t
# r[0,:] -> interpolated x coordinates
# r[1,:] -> interpolated y coordinates
r = spline(t)
plt.plot(path_x,path_y,'or')
plt.plot(r[0,:],r[1,:],'-k')
plt.xlabel('x')
plt.ylabel('y')
plt.show()
With output
For non-ascending x, splines can be computed easily if you make both x and y functions of another parameter t: x(t), y(t).
In your case you have 5 points, so t can simply enumerate them, i.e. t = 0, 1, 2, 3, 4.
So if x = [5, 2, 7, 3, 6], then x(0) = 5, x(1) = 2, x(2) = 7, x(3) = 3, x(4) = 6. The same goes for y.
Then compute a spline for each of x(t) and y(t), evaluate both splines at many intermediate values of t, and finally plot the resulting y(t) values against the x(t) values.
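The steps above can also be sketched with a library spline instead of a hand-written one. This is only a minimal sketch, assuming scipy.interpolate.CubicSpline is available; the names spline_x, spline_y and t_fine are illustrative and do not come from the original code, and the "natural" boundary condition is an assumption chosen here, not a requirement of the method.

```python
import numpy as np
from scipy.interpolate import CubicSpline

x0 = np.array([4.0, 5.638304088577984, 6.785456961280076, 5.638304088577984, 4.0])
y0 = np.array([0.0, 1.147152872702092, 2.7854569612800755, 4.423761049858059, 3.2766081771559668])

# The parameter t simply enumerates the points: 0, 1, 2, 3, 4.
# It is strictly increasing even though x0 and y0 are not.
t0 = np.arange(len(x0), dtype=float)

# One spline per coordinate, each a function of t.
# bc_type="natural" (zero second derivative at both ends) is an assumption.
spline_x = CubicSpline(t0, x0, bc_type="natural")
spline_y = CubicSpline(t0, y0, bc_type="natural")

# Evaluate both splines at many intermediate t values.
t_fine = np.linspace(t0[0], t0[-1], 100)
x_fine = spline_x(t_fine)
y_fine = spline_y(t_fine)
```

Plotting y_fine against x_fine (for example with plt.plot(x_fine, y_fine)) then gives the smooth curve through the original points.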
I once implemented cubic spline computation from scratch using NumPy, so I use that code in the example below (it may be useful for learning the spline math); feel free to replace it with your library functions. The code also contains commented-out Numba decorators, which you can enable to speed up the computation.
Look at the main() function at the bottom of the code; it shows how to compute and use x(t) and y(t).
Try it online!
import numpy as np, matplotlib.pyplot as plt
# Solves linear system given by Tridiagonal Matrix
# Helper for calculating cubic splines
#@numba.njit(cache = True, fastmath = True, inline = 'always')
def tri_diag_solve(A, B, C, F):
    n = B.size
    assert A.ndim == B.ndim == C.ndim == F.ndim == 1 and (
        A.size == B.size == C.size == F.size == n
    ) #, (A.shape, B.shape, C.shape, F.shape)
    Bs, Fs = np.zeros_like(B), np.zeros_like(F)
    Bs[0], Fs[0] = B[0], F[0]
    for i in range(1, n):
        Bs[i] = B[i] - A[i] / Bs[i - 1] * C[i - 1]
        Fs[i] = F[i] - A[i] / Bs[i - 1] * Fs[i - 1]
    x = np.zeros_like(B)
    x[-1] = Fs[-1] / Bs[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (Fs[i] - C[i] * x[i + 1]) / Bs[i]
    return x
# Calculate cubic spline params
#@numba.njit(cache = True, fastmath = True, inline = 'always')
def calc_spline_params(x, y):
    a = y
    h = np.diff(x)
    c = np.concatenate((np.zeros((1,), dtype = y.dtype),
        np.append(tri_diag_solve(h[:-1], (h[:-1] + h[1:]) * 2, h[1:],
        ((a[2:] - a[1:-1]) / h[1:] - (a[1:-1] - a[:-2]) / h[:-1]) * 3), 0)))
    d = np.diff(c) / (3 * h)
    b = (a[1:] - a[:-1]) / h + (2 * c[1:] + c[:-1]) / 3 * h
    return a[1:], b, c[1:], d
# Spline value calculating function, given params and "x"
#@numba.njit(cache = True, fastmath = True, inline = 'always')
def func_spline(x, ix, x0, a, b, c, d):
    dx = x - x0[1:][ix]
    return a[ix] + (b[ix] + (c[ix] + d[ix] * dx) * dx) * dx
# Compute piece-wise spline function for "x" out of sorted "x0" points
#@numba.njit([f'f{ii}[:](f{ii}[:], f{ii}[:], f{ii}[:], f{ii}[:], f{ii}[:], f{ii}[:])' for ii in (4, 8)],
#    cache = True, fastmath = True, inline = 'always')
def piece_wise_spline(x, x0, a, b, c, d):
    xsh = x.shape
    x = x.ravel()
    ix = np.searchsorted(x0[1 : -1], x)
    y = func_spline(x, ix, x0, a, b, c, d)
    y = y.reshape(xsh)
    return y
def main():
    x0 = np.array([4.0, 5.638304088577984, 6.785456961280076, 5.638304088577984, 4.0])
    y0 = np.array([0.0, 1.147152872702092, 2.7854569612800755, 4.423761049858059, 3.2766081771559668])
    t0 = np.arange(len(x0)).astype(np.float64)
    plt.plot(x0, y0)
    vs = []
    for e in (x0, y0):
        a, b, c, d = calc_spline_params(t0, e)
        x = np.linspace(0, t0[-1], 100)
        vs.append(piece_wise_spline(x, t0, a, b, c, d))
    plt.plot(vs[0], vs[1])
    plt.show()
if __name__ == '__main__':
    main()
Output:
I am trying to find the control points and handles of a Cubic Bezier curve from a series of points. My current code is below (credit to Zero Zero on the Python Discord). The Cubic Spline is creating the desired fit, but the handles (in orange) are incorrect. How may I find the handles of this curve?
Thank you!
import numpy as np
import scipy as sp
def fit_curve(points):
    # Fit a cubic bezier curve to the points
    curve = sp.interpolate.CubicSpline(points[:, 0], points[:, 1], bc_type=((1, 0.0), (1, 0.0)))
    # Get 4 control points for the curve
    p = np.zeros((4, 2))
    p[0, :] = points[0, :]
    p[3, :] = points[-1, :]
    p[1, :] = points[0, :] + 0.3 * (points[-1, :] - points[0, :])
    p[2, :] = points[-1, :] - 0.3 * (points[-1, :] - points[0, :])
    return p, curve
ypoints = [0.0, 0.03771681353260319, 0.20421680080883106, 0.49896111463402026, 0.7183501026981503, 0.8481517096346528, 0.9256128196832564, 0.9705404287079152, 0.9933297674379904, 1.0]
xpoints = [x for x in range(len(ypoints))]
points = np.array([xpoints, ypoints]).T
from scipy.interpolate import splprep, splev
tck, u = splprep([xpoints, ypoints], s=0)
#print(tck, u)
xnew, ynew = splev(np.linspace(0, 1, 100), tck)
# Plot the original points and the Bézier curve
import matplotlib.pyplot as plt
#plt.plot(xpoints, ypoints, 'x', xnew, ynew, xpoints, ypoints, 'b')
plt.axis([0, 10, -0.05, 1.05])
plt.legend(['Points', 'Bézier curve', 'True curve'])
plt.title('Bézier curve fitting')
# Get the curve
p, curve = fit_curve(points)
# Plot the points and the curve
plt.plot(points[:, 0], points[:, 1], 'o')
plt.plot(p[:, 0], p[:, 1], 'o')
plt.plot(np.linspace(0, 9, 100), curve(np.linspace(0, 9, 100)))
plt.show()
The answer for my case was a Bezier best fit function that accepts an input of point values, fits the points to a Cubic Spline, and outputs the Bézier handles of the curve by finding their coefficients.
Here is one such script, fitCurves, which can be used like so:
import numpy as np
from fitCurve import fitCurve
import matplotlib.pyplot as plt
y = [0.0,
0.03771681353260319,
0.20421680080883106,
0.49896111463402026,
0.7183501026981503,
0.8481517096346528,
0.9256128196832564,
0.9705404287079152,
0.9933297674379904,
1.0]
x = np.linspace(0, 1, len(y))
pts = np.array([x,y]).T
bezier_handles = fitCurve(points=pts , maxError=20)
x_bez = []
y_bez = []
for bez in bezier_handles:
    for pt in bez:
        x_bez.append(pt[0])
        y_bez.append(pt[1])
plt.plot(pts[:,0], pts[:,1], 'bo-', label='Points')
plt.plot(x_bez[:2], y_bez[:2], 'ro--', label='Handle') # handle 1
plt.plot(x_bez[2:4], y_bez[2:4], 'ro--') # handle 2
plt.legend()
plt.show()
fitCurve.py
from numpy import *
""" Python implementation of
Algorithm for Automatically Fitting Digitized Curves
by Philip J. Schneider
"Graphics Gems", Academic Press, 1990
"""
# evaluates cubic bezier at t, return point
def q(ctrlPoly, t):
    return (1.0-t)**3 * ctrlPoly[0] + 3*(1.0-t)**2 * t * ctrlPoly[1] + 3*(1.0-t)* t**2 * ctrlPoly[2] + t**3 * ctrlPoly[3]
# evaluates cubic bezier first derivative at t, return point
def qprime(ctrlPoly, t):
    return 3*(1.0-t)**2 * (ctrlPoly[1]-ctrlPoly[0]) + 6*(1.0-t) * t * (ctrlPoly[2]-ctrlPoly[1]) + 3*t**2 * (ctrlPoly[3]-ctrlPoly[2])
# evaluates cubic bezier second derivative at t, return point
def qprimeprime(ctrlPoly, t):
    return 6*(1.0-t) * (ctrlPoly[2]-2*ctrlPoly[1]+ctrlPoly[0]) + 6*(t) * (ctrlPoly[3]-2*ctrlPoly[2]+ctrlPoly[1])
# Fit one (or more) Bezier curves to a set of points
def fitCurve(points, maxError):
    leftTangent = normalize(points[1] - points[0])
    rightTangent = normalize(points[-2] - points[-1])
    return fitCubic(points, leftTangent, rightTangent, maxError)
def fitCubic(points, leftTangent, rightTangent, error):
    # Use heuristic if region only has two points in it
    if (len(points) == 2):
        dist = linalg.norm(points[0] - points[1]) / 3.0
        bezCurve = [points[0], points[0] + leftTangent * dist, points[1] + rightTangent * dist, points[1]]
        return [bezCurve]
    # Parameterize points, and attempt to fit curve
    u = chordLengthParameterize(points)
    bezCurve = generateBezier(points, u, leftTangent, rightTangent)
    # Find max deviation of points to fitted curve
    maxError, splitPoint = computeMaxError(points, bezCurve, u)
    if maxError < error:
        return [bezCurve]
    # If error not too large, try some reparameterization and iteration
    if maxError < error**2:
        for i in range(20):
            uPrime = reparameterize(bezCurve, points, u)
            bezCurve = generateBezier(points, uPrime, leftTangent, rightTangent)
            maxError, splitPoint = computeMaxError(points, bezCurve, uPrime)
            if maxError < error:
                return [bezCurve]
            u = uPrime
    # Fitting failed -- split at max error point and fit recursively
    beziers = []
    centerTangent = normalize(points[splitPoint-1] - points[splitPoint+1])
    beziers += fitCubic(points[:splitPoint+1], leftTangent, centerTangent, error)
    beziers += fitCubic(points[splitPoint:], -centerTangent, rightTangent, error)
    return beziers
def generateBezier(points, parameters, leftTangent, rightTangent):
    bezCurve = [points[0], None, None, points[-1]]
    # compute the A's
    A = zeros((len(parameters), 2, 2))
    for i, u in enumerate(parameters):
        A[i][0] = leftTangent * 3*(1-u)**2 * u
        A[i][1] = rightTangent * 3*(1-u) * u**2
    # Create the C and X matrices
    C = zeros((2, 2))
    X = zeros(2)
    for i, (point, u) in enumerate(zip(points, parameters)):
        C[0][0] += dot(A[i][0], A[i][0])
        C[0][1] += dot(A[i][0], A[i][1])
        C[1][0] += dot(A[i][0], A[i][1])
        C[1][1] += dot(A[i][1], A[i][1])
        tmp = point - q([points[0], points[0], points[-1], points[-1]], u)
        X[0] += dot(A[i][0], tmp)
        X[1] += dot(A[i][1], tmp)
    # Compute the determinants of C and X
    det_C0_C1 = C[0][0] * C[1][1] - C[1][0] * C[0][1]
    det_C0_X = C[0][0] * X[1] - C[1][0] * X[0]
    det_X_C1 = X[0] * C[1][1] - X[1] * C[0][1]
    # Finally, derive alpha values
    alpha_l = 0.0 if det_C0_C1 == 0 else det_X_C1 / det_C0_C1
    alpha_r = 0.0 if det_C0_C1 == 0 else det_C0_X / det_C0_C1
    # If alpha negative, use the Wu/Barsky heuristic (see text)
    # (if alpha is 0, you get coincident control points that lead to
    # divide by zero in any subsequent newtonRaphsonRootFind() call.)
    segLength = linalg.norm(points[0] - points[-1])
    epsilon = 1.0e-6 * segLength
    if alpha_l < epsilon or alpha_r < epsilon:
        # fall back on standard (probably inaccurate) formula, and subdivide further if needed.
        bezCurve[1] = bezCurve[0] + leftTangent * (segLength / 3.0)
        bezCurve[2] = bezCurve[3] + rightTangent * (segLength / 3.0)
    else:
        # First and last control points of the Bezier curve are
        # positioned exactly at the first and last data points
        # Control points 1 and 2 are positioned an alpha distance out
        # on the tangent vectors, left and right, respectively
        bezCurve[1] = bezCurve[0] + leftTangent * alpha_l
        bezCurve[2] = bezCurve[3] + rightTangent * alpha_r
    return bezCurve
def reparameterize(bezier, points, parameters):
    return [newtonRaphsonRootFind(bezier, point, u) for point, u in zip(points, parameters)]
def newtonRaphsonRootFind(bez, point, u):
    """
    Newton's root finding algorithm calculates f(x)=0 by reiterating
    x_n+1 = x_n - f(x_n)/f'(x_n)
    We are trying to find curve parameter u for some point p that minimizes
    the distance from that point to the curve. Distance point to curve is d=q(u)-p.
    At minimum distance the point is perpendicular to the curve.
    We are solving
    f = q(u)-p * q'(u) = 0
    with
    f' = q'(u) * q'(u) + q(u)-p * q''(u)
    gives
    u_n+1 = u_n - |q(u_n)-p * q'(u_n)| / |q'(u_n)**2 + q(u_n)-p * q''(u_n)|
    """
    d = q(bez, u)-point
    numerator = (d * qprime(bez, u)).sum()
    denominator = (qprime(bez, u)**2 + d * qprimeprime(bez, u)).sum()
    if denominator == 0.0:
        return u
    else:
        return u - numerator/denominator
def chordLengthParameterize(points):
    u = [0.0]
    for i in range(1, len(points)):
        u.append(u[i-1] + linalg.norm(points[i] - points[i-1]))
    for i, _ in enumerate(u):
        u[i] = u[i] / u[-1]
    return u
def computeMaxError(points, bez, parameters):
    maxDist = 0.0
    # integer division so that splitPoint is usable as an index in Python 3
    splitPoint = len(points)//2
    for i, (point, u) in enumerate(zip(points, parameters)):
        dist = linalg.norm(q(bez, u)-point)**2
        if dist > maxDist:
            maxDist = dist
            splitPoint = i
    return maxDist, splitPoint
def normalize(v):
    return v / linalg.norm(v)
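As a complement to the fitting script, the handles of an interpolating cubic spline can also be read off directly. Each cubic piece on an interval [x_i, x_{i+1}] is itself a cubic Bézier whose two inner control points sit one third of the interval along the end tangents. The following is a minimal sketch under that assumption, for a function-type curve y(x) interpolated with scipy.interpolate.CubicSpline; spline_to_bezier is a hypothetical helper name and is not part of fitCurve.py. Note that it yields one Bézier per spline interval, not a single best-fit Bézier for the whole data set.

```python
import numpy as np
from scipy.interpolate import CubicSpline

def spline_to_bezier(cs):
    """Return Bezier control points, shape (n_segments, 4, 2), for a CubicSpline."""
    x = cs.x                # breakpoints of the spline
    y = cs(x)               # spline values at the breakpoints
    dy = cs(x, nu=1)        # first derivatives dy/dx at the breakpoints
    h = np.diff(x)          # interval widths
    p0 = np.column_stack([x[:-1], y[:-1]])
    p3 = np.column_stack([x[1:], y[1:]])
    # Inner control points: one third of the interval along each end tangent.
    p1 = p0 + np.column_stack([h, dy[:-1] * h]) / 3.0
    p2 = p3 - np.column_stack([h, dy[1:] * h]) / 3.0
    return np.stack([p0, p1, p2, p3], axis=1)

ypoints = [0.0, 0.03771681353260319, 0.20421680080883106, 0.49896111463402026,
           0.7183501026981503, 0.8481517096346528, 0.9256128196832564,
           0.9705404287079152, 0.9933297674379904, 1.0]
xpoints = np.arange(len(ypoints), dtype=float)

# Same clamped end conditions (zero slope at both ends) as in the question.
cs = CubicSpline(xpoints, ypoints, bc_type=((1, 0.0), (1, 0.0)))
bezier_segments = spline_to_bezier(cs)
```

Each row bezier_segments[i] holds the end point, the two handles and the next end point of segment i, so bezier_segments[i][1] and bezier_segments[i][2] are the handles to plot.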
I have a set of data and want to put a parabolic fit over it. This already works with the polyfit function from numpy like this:
fit = np.polyfit(X, y, 2)
formula = np.poly1d(fit)
Now I want the parabola to have its peak at a fixed x value, with the fit still carried out as well as possible under that constraint. Is there a way to accomplish that?
From my data I know that the parabola always opens downwards.
I think this is quite a difficult problem since the x coordinate of the peak of a second-order polynomial (ax^2 + bx + c) always lies at x = -b/(2a).
One thing you could do is drop the b term and shift x by the desired peak value when fitting the polynomial, as in the code below. Note that I used scipy.optimize.curve_fit to fit the custom function func.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
# generating a parabola with noise
np.random.seed(42)
x = np.linspace(-10, 10, 100)
y = 10 -(x-2)**2 + np.random.normal(0, 5, x.shape)
# function to fit
def func(x, a, c):
    return a*x**2 + c
# desired x peak value
x_peak = 2
popt, pcov = curve_fit(func, x - x_peak, y)
y_fit = func(x - x_peak, *popt)
# plotting
plt.plot(x, y, 'k.')
plt.plot(x, y_fit)
plt.axvline(x_peak)
plt.show()
Outputs the image:
Fixing a point on your parabola simplifies the problem, since you can rewrite your equation slightly in terms of a constant now:
y = A(x - B)**2 + C
Given the coefficients a, b, c in your original unconstrained fit, you have the relationships
a = A
b = -2AB
c = AB**2 + C
Since B is a known constant and the model has no linear term in (x - B), np.polyfit cannot be used directly and you need to set up the least-squares problem yourself, with (x - B)**2 and a constant column as the two regressors. Given arrays x, y and constant B, the problem looks like this:
m = np.stack(((x - B)**2, np.ones_like(x)), axis=-1)
(A, C), *_ = np.linalg.lstsq(m, y, rcond=None)
You can then extract the usual coefficients from the formulas for a, b, c above.
Here is a complete example, just like the one in the other answer:
B = 2
np.random.seed(42)
x = np.linspace(-10, 10, 100)
y = 10 -(x - B)**2 + np.random.normal(0, 5, x.shape)
m = np.stack(((x - B)**2, np.ones_like(x)), axis=-1)
(A, C), *_ = np.linalg.lstsq(m, y, rcond=None)
a = A
b = -2 * A * B
c = A * B**2 + C
y_fit = a * x**2 + b * x + c
You can drop a, b, c entirely and do
y_fit = A * (x - B)**2 + C
The result will be identical.
plt.plot(x, y, 'k.')
plt.plot(x, y_fit)
Without the condition on the location of the peak, the function to be fitted would be:
y = a x^2 + b x + c
With the peak located at x = p, for a given p:
-b/(2a) = p
b = -2 a p
y = a x^2 - 2 a p x + c
y = a (x^2 - 2 p x) + c
Knowing p, apply a change of variable:
X = x^2 - 2 p x
So, from the data (x, y), first compute the new data (X, y).
Then a and c are obtained by linear regression:
y = a X + c
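The change of variable above can be sketched in a few lines of NumPy. This is only an illustration: the synthetic data mirrors the noisy parabola used in the other answers and is an assumption, as are the names p, X_new and peak_x.

```python
import numpy as np

# Synthetic noisy parabola with its true peak at x = 2 (assumed test data).
rng = np.random.default_rng(42)
p = 2.0                                  # required x location of the peak
x = np.linspace(-10, 10, 100)
y = 10 - (x - p)**2 + rng.normal(0, 5, x.shape)

# Change of variable: X = x^2 - 2 p x turns the model into y = a X + c.
X_new = x**2 - 2 * p * x

# Ordinary linear regression of y on X_new gives a (slope) and c (intercept).
a, c = np.polyfit(X_new, y, 1)

# Recover the remaining coefficient of y = a x^2 + b x + c.
b = -2 * a * p
peak_x = -b / (2 * a)                    # equals p by construction
y_fit = a * x**2 + b * x + c
```

Because b is derived from a and p, the fitted parabola has its vertex at x = p regardless of the noise; only a and c are estimated from the data.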
Is there something like Matlab's procrustes function in NumPy/SciPy or related libraries?
For reference. Procrustes analysis aims to align 2 sets of points (in other words, 2 shapes) to minimize square distance between them by removing scale, translation and rotation warp components.
Example in Matlab:
X = [0 1; 2 3; 4 5; 6 7; 8 9]; % first shape
R = [1 2; 2 1]; % rotation matrix
t = [3 5]; % translation vector
Y = X * R + repmat(t, 5, 1); % warped shape, no scale and no distortion
[d Z] = procrustes(X, Y); % Z is Y aligned back to X
Z
Z =
0.0000 1.0000
2.0000 3.0000
4.0000 5.0000
6.0000 7.0000
8.0000 9.0000
Same task in NumPy:
X = arange(10).reshape((5, 2))
R = array([[1, 2], [2, 1]])
t = array([3, 5])
Y = dot(X, R) + t
Z = ???
Note: I'm only interested in aligned shape, since square error (variable d in Matlab code) is easily computed from 2 shapes.
I'm not aware of any pre-existing implementation in Python, but it's easy to take a look at the MATLAB code using edit procrustes.m and port it to Numpy:
import numpy as np
def procrustes(X, Y, scaling=True, reflection='best'):
    """
    A port of MATLAB's `procrustes` function to Numpy.
    Procrustes analysis determines a linear transformation (translation,
    reflection, orthogonal rotation and scaling) of the points in Y to best
    conform them to the points in matrix X, using the sum of squared errors
    as the goodness of fit criterion.
    d, Z, [tform] = procrustes(X, Y)
    Inputs:
    ------------
    X, Y
        matrices of target and input coordinates. they must have equal
        numbers of points (rows), but Y may have fewer dimensions
        (columns) than X.
    scaling
        if False, the scaling component of the transformation is forced
        to 1
    reflection
        if 'best' (default), the transformation solution may or may not
        include a reflection component, depending on which fits the data
        best. setting reflection to True or False forces a solution with
        reflection or no reflection respectively.
    Outputs
    ------------
    d
        the residual sum of squared errors, normalized according to a
        measure of the scale of X, ((X - X.mean(0))**2).sum()
    Z
        the matrix of transformed Y-values
    tform
        a dict specifying the rotation, translation and scaling that
        maps Y --> X
    """
    n,m = X.shape
    ny,my = Y.shape
    muX = X.mean(0)
    muY = Y.mean(0)
    X0 = X - muX
    Y0 = Y - muY
    ssX = (X0**2.).sum()
    ssY = (Y0**2.).sum()
    # centred Frobenius norm
    normX = np.sqrt(ssX)
    normY = np.sqrt(ssY)
    # scale to equal (unit) norm
    X0 /= normX
    Y0 /= normY
    if my < m:
        # pad Y with zero columns so that it has as many columns as X
        Y0 = np.concatenate((Y0, np.zeros((n, m-my))), 1)
    # optimum rotation matrix of Y
    A = np.dot(X0.T, Y0)
    U,s,Vt = np.linalg.svd(A,full_matrices=False)
    V = Vt.T
    T = np.dot(V, U.T)
    if reflection != 'best':
        # does the current solution use a reflection?
        have_reflection = np.linalg.det(T) < 0
        # if that's not what was specified, force another reflection
        if reflection != have_reflection:
            V[:,-1] *= -1
            s[-1] *= -1
            T = np.dot(V, U.T)
    traceTA = s.sum()
    if scaling:
        # optimum scaling of Y
        b = traceTA * normX / normY
        # standardised distance between X and b*Y*T + c
        d = 1 - traceTA**2
        # transformed coords
        Z = normX*traceTA*np.dot(Y0, T) + muX
    else:
        b = 1
        d = 1 + ssY/ssX - 2 * traceTA * normY / normX
        Z = normY*np.dot(Y0, T) + muX
    # transformation matrix
    if my < m:
        T = T[:my,:]
    c = muX - b*np.dot(muY, T)
    #transformation values
    tform = {'rotation':T, 'scale':b, 'translation':c}
    return d, Z, tform
There is a Scipy function for it: scipy.spatial.procrustes
I'm just posting its example here:
>>> import numpy as np
>>> from scipy.spatial import procrustes
>>> a = np.array([[1, 3], [1, 2], [1, 1], [2, 1]], 'd')
>>> b = np.array([[4, -2], [4, -4], [4, -6], [2, -6]], 'd')
>>> mtx1, mtx2, disparity = procrustes(a, b)
>>> round(disparity)
0.0
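Applied to the example from the question, a sketch could look like the following. One caveat, as far as I understand the SciPy implementation: scipy.spatial.procrustes returns both shapes standardised (centred and scaled to unit Frobenius norm), so the aligned shape has to be scaled back by hand to land in the coordinates of X. The names mu, norm and Z are illustrative, and the rescaling step is my own addition rather than part of the SciPy example.

```python
import numpy as np
from scipy.spatial import procrustes

# Data from the question, as floats.
X = np.arange(10, dtype=float).reshape((5, 2))
R = np.array([[1, 2], [2, 1]])
t = np.array([3, 5])
Y = X @ R + t

# mtx1 is the standardised X, mtx2 is Y aligned to mtx1.
mtx1, mtx2, disparity = procrustes(X, Y)

# Undo the standardisation applied to X so that Z is in X's coordinates.
mu = X.mean(axis=0)
norm = np.linalg.norm(X - mu)
Z = mtx2 * norm + mu
```

For this particular data both shapes are evenly spaced points on a line, so the alignment is essentially exact and Z reproduces X up to rounding.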
You can have both Ordinary Procrustes Analysis and Generalized Procrustes Analysis in python with something like this:
import numpy as np
def opa(a, b):
    aT = a.mean(0)
    bT = b.mean(0)
    A = a - aT
    B = b - bT
    aS = np.sum(A * A)**.5
    bS = np.sum(B * B)**.5
    A /= aS
    B /= bS
    U, _, V = np.linalg.svd(np.dot(B.T, A))
    aR = np.dot(U, V)
    if np.linalg.det(aR) < 0:
        V[1] *= -1
        aR = np.dot(U, V)
    aS = aS / bS
    aT-= (bT.dot(aR) * aS)
    aD = (np.sum((A - B.dot(aR))**2) / len(a))**.5
    return aR, aS, aT, aD
def gpa(v, n=-1):
    if n < 0:
        p = avg(v)
    else:
        p = v[n]
    l = len(v)
    r, s, t, d = np.ndarray((4, l), object)
    for i in range(l):
        r[i], s[i], t[i], d[i] = opa(p, v[i])
    return r, s, t, d
def avg(v):
    v_= np.copy(v)
    l = len(v_)
    R, S, T = [list(np.zeros(l)) for _ in range(3)]
    for i, j in np.ndindex(l, l):
        r, s, t, _ = opa(v_[i], v_[j])
        R[j] += np.arccos(min(1, max(-1, np.trace(r[:1])))) * np.sign(r[1][0])
        S[j] += s
        T[j] += t
    for i in range(l):
        a = R[i] / l
        r = [np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]
        v_[i] = v_[i].dot(r) * (S[i] / l) + (T[i] / l)
    return v_.mean(0)
For testing purposes, the output of each algorithm can be visualized as follows:
import matplotlib.pyplot as p; p.rcParams['toolbar'] = 'None';
def plt(o, e, b):
    p.figure(figsize=(10, 10), dpi=72, facecolor='w').add_axes([0.05, 0.05, 0.9, 0.9], aspect='equal')
    p.plot(0, 0, marker='x', mew=1, ms=10, c='g', zorder=2, clip_on=False)
    # canvas.set_window_title was removed in newer Matplotlib; use the manager
    p.gcf().canvas.manager.set_window_title('%f' % e)
    x = np.ravel(o[0].T[0])
    y = np.ravel(o[0].T[1])
    p.xlim(min(x), max(x))
    p.ylim(min(y), max(y))
    a = []
    for i, j in np.ndindex(len(o), 2):
        a.append(o[i].T[j])
    O = p.plot(*a, marker='x', mew=1, ms=10, lw=.25, c='b', zorder=0, clip_on=False)
    O[0].set(c='r', zorder=1)
    if not b:
        O[2].set_color('b')
        O[2].set_alpha(0.4)
    p.axis('off')
    p.show()
# Fly wings example (Klingenberg, 2015 | https://en.wikipedia.org/wiki/Procrustes_analysis)
arr1 = np.array([[588.0, 443.0], [178.0, 443.0], [56.0, 436.0], [50.0, 376.0], [129.0, 360.0], [15.0, 342.0], [92.0, 293.0], [79.0, 269.0], [276.0, 295.0], [281.0, 331.0], [785.0, 260.0], [754.0, 174.0], [405.0, 233.0], [386.0, 167.0], [466.0, 59.0]])
arr2 = np.array([[477.0, 557.0], [130.129, 374.307], [52.0, 334.0], [67.662, 306.953], [111.916, 323.0], [55.119, 275.854], [107.935, 277.723], [101.899, 259.73], [175.0, 329.0], [171.0, 345.0], [589.0, 527.0], [591.0, 468.0], [299.0, 363.0], [306.0, 317.0], [406.0, 288.0]])
def opa_out(a):
    r, s, t, d = opa(a[0], a[1])
    a[1] = a[1].dot(r) * s + t
    return a, d, False
plt(*opa_out([arr1, arr2, np.matrix.copy(arr2)]))
def gpa_out(a):
    g = gpa(a, -1)
    D = [avg(a)]
    for i in range(len(a)):
        D.append(a[i].dot(g[0][i]) * g[1][i] + g[2][i])
    return D, sum(g[3])/len(a), True
plt(*gpa_out([arr1, arr2]))
Probably you want to try this package with various flavors of different Procrustes methods, https://github.com/theochem/procrustes.
I want to generate x and y having a uniform distribution and limited by [xmin,xmax] and [ymin,ymax]
The points (x,y) should be inside a triangle.
How can I solve such a problem?
Here's some code that generates points uniformly on an arbitrary triangle in the plane.
import random
def point_on_triangle(pt1, pt2, pt3):
    """
    Random point on the triangle with vertices pt1, pt2 and pt3.
    """
    x, y = sorted([random.random(), random.random()])
    s, t, u = x, y - x, 1 - y
    return (s * pt1[0] + t * pt2[0] + u * pt3[0],
            s * pt1[1] + t * pt2[1] + u * pt3[1])
The idea is to compute a weighted average of the three vertices, with the weights given by a random break of the unit interval [0, 1] into three pieces (uniformly over all such breaks). Here x and y represent the places at which we break the unit interval, and s, t and u are the length of the pieces following that break. We then use s, t and u as the barycentric coordinates of the point in the triangle.
Here's a variant of the above that avoids the need to sort, instead making use of an absolute value call:
def point_on_triangle2(pt1, pt2, pt3):
    """
    Random point on the triangle with vertices pt1, pt2 and pt3.
    """
    x, y = random.random(), random.random()
    q = abs(x - y)
    s, t, u = q, 0.5 * (x + y - q), 1 - 0.5 * (q + x + y)
    return (
        s * pt1[0] + t * pt2[0] + u * pt3[0],
        s * pt1[1] + t * pt2[1] + u * pt3[1],
    )
Here's an example usage that generates 10000 points in a triangle:
pt1 = (1, 1)
pt2 = (2, 4)
pt3 = (5, 2)
points = [point_on_triangle(pt1, pt2, pt3) for _ in range(10000)]
And a plot obtained from the above, demonstrating the uniformity. The plot was generated by this code:
import matplotlib.pyplot as plt
x, y = zip(*points)
plt.scatter(x, y, s=0.1)
plt.show()
Here's the image:
And since you tagged the question with the "numpy" tag, here's a NumPy version that generates multiple samples at once. Note that it uses the matrix multiplication operator @, introduced in Python 3.5 and supported in NumPy >= 1.10. You'll need to replace that with a call to np.dot on older Python or NumPy versions.
import numpy as np
def points_on_triangle(v, n):
    """
    Give n random points uniformly on a triangle.
    The vertices of the triangle are given by the shape
    (3, 2) array *v*: one vertex per row.
    """
    x = np.sort(np.random.rand(2, n), axis=0)
    return np.column_stack([x[0], x[1]-x[0], 1.0-x[1]]) @ v
# Example usage
v = np.array([(1, 1), (2, 4), (5, 2)])
points = points_on_triangle(v, 10000)
OK, time to add another version, I guess. There is a known algorithm for sampling uniformly in a triangle; see the paper, chapter 4.2, for details.
Python code:
import math
import random
import matplotlib.pyplot as plt
def trisample(A, B, C):
    """
    Given three vertices A, B, C,
    sample point uniformly in the triangle
    """
    r1 = random.random()
    r2 = random.random()
    s1 = math.sqrt(r1)
    x = A[0] * (1.0 - s1) + B[0] * (1.0 - r2) * s1 + C[0] * r2 * s1
    y = A[1] * (1.0 - s1) + B[1] * (1.0 - r2) * s1 + C[1] * r2 * s1
    return (x, y)
random.seed(312345)
A = (1, 1)
B = (2, 4)
C = (5, 2)
points = [trisample(A, B, C) for _ in range(10000)]
xx, yy = zip(*points)
plt.scatter(xx, yy, s=0.2)
plt.show()
And result looks like
Uniform on the triangle?
import numpy as np
N = 10 # number of points to create in one go
rvs = np.random.random((N, 2)) # uniform on the unit square
# Now use the fact that the unit square is tiled by the two triangles
# 0 <= y <= x <= 1 and 0 <= x < y <= 1
# which are mapped onto each other (except for the diagonal which has
# probability 0) by swapping x and y.
# We use this map to send all points of the square to the same of the
# two triangles. Because the map preserves areas this will yield
# uniformly distributed points.
rvs = np.where(rvs[:, 0, None]>rvs[:, 1, None], rvs, rvs[:, ::-1])
Finally, transform the coordinates
xmin, ymin, xmax, ymax = -0.1, 1.1, 2.0, 3.3
rvs = np.array((ymin, xmin)) + rvs*(ymax-ymin, xmax-xmin)
Uniform marginals? The simplest solution would be to uniformly concentrate the mass on the line (ymin, xmin) - (ymax, xmax)
rvs = np.random.random((N,))
rvs = np.c_[ymin + (ymax-ymin)*rvs, xmin + (xmax-xmin)*rvs]
but that is not very interesting, is it?
I have a set of x, y points and I'd like to find the line of best fit such that the line is below all points using SciPy. I'm trying to use leastsq for this, but I'm unsure how to adjust the line to be below all points instead of the line of best fit. The coefficients for the line of best fit can be produced via:
import numpy as np
from scipy.optimize import leastsq
def linreg(x, y):
    fit = lambda params, x: params[0] * x - params[1]
    err = lambda p, x, y: (y - fit(p, x))**2
    # initial slope/intercept
    init_p = np.array((1, 0))
    p, _ = leastsq(err, init_p.copy(), args=(x, y))
    return p
xs = np.array([1, 2, 3, 4, 5])
ys = np.array([10, 20, 30, 40, 50])
print(linreg(xs, ys))
The output is the coefficients for the line of best fit:
array([ 9.99999997e+00, -1.68071668e-15])
How can I get the coefficients of the line of best fit that is below all points?
A possible algorithm is as follows:
Move the axes to have all the data on the positive half of the x axis.
If the fit is of the form y = a * x + b, then for a given b the best fit for a will be the minimum of the slopes joining the point (0, b) with each of the (x, y) points.
You can then calculate a fit error, which is a function of only b, and use scipy.optimize.minimize to find the best value for b.
All that's left is computing a for that b and calculating b for the original position of the axes.
The following does that most of the time, except when the minimization fails with some mysterious error:
from __future__ import division
import numpy as np
import scipy.optimize
import matplotlib.pyplot as plt
def fit_below(x, y) :
    idx = np.argsort(x)
    x = x[idx]
    y = y[idx]
    x0, y0 = x[0] - 1, y[0]
    x -= x0
    y -= y0
    def error_function_2(b, x, y) :
        a = np.min((y - b) / x)
        return np.sum((y - a * x - b)**2)
    b = scipy.optimize.minimize(error_function_2, [0], args=(x, y)).x[0]
    a = np.min((y - b) / x)
    return a, b - a * x0 + y0
x = np.arange(10).astype(float)
y = x * 2 + 3 + 3 * np.random.rand(len(x))
a, b = fit_below(x, y)
plt.plot(x, y, 'o')
plt.plot(x, a*x + b, '-')
plt.show()
And as TheodrosZelleke wisely predicted, it goes through two points that are part of the convex hull:
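For comparison, the same problem can be stated directly as a constrained least-squares fit and handed to a general-purpose solver. This is a different technique from the one above, given only as a hedged sketch: it assumes scipy.optimize.minimize with method="SLSQP" and inequality constraints, and fit_below_slsqp is a hypothetical name. The random data imitates the example above with a fixed seed.

```python
import numpy as np
from scipy.optimize import minimize

def fit_below_slsqp(x, y):
    """Least-squares line a*x + b constrained to lie on or below every point."""
    def sse(params):
        return np.sum((y - params[0] * x - params[1]) ** 2)

    # SciPy convention for "ineq" constraints: fun(params) >= 0 elementwise,
    # i.e. every residual y_i - (a*x_i + b) must be non-negative.
    constraint = {"type": "ineq", "fun": lambda params: y - params[0] * x - params[1]}

    # Feasible starting point: the ordinary least-squares line shifted down
    # until its lowest residual is exactly zero.
    a0, b0 = np.polyfit(x, y, 1)
    b0 += np.min(y - a0 * x - b0)

    result = minimize(sse, [a0, b0], method="SLSQP", constraints=[constraint])
    return result.x

rng = np.random.default_rng(0)
x = np.arange(10).astype(float)
y = x * 2 + 3 + 3 * rng.random(len(x))
a, b = fit_below_slsqp(x, y)
```

The solver only satisfies the constraints up to its numerical tolerance, so it is worth checking np.min(y - (a * x + b)) afterwards; a tiny negative value means the line grazes a point from just above.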