Find points that lie in a concave hull of a point cloud - python

In this thread a method is suggested for masking out points that lie in a convex hull, for example:
import numpy as np
import matplotlib.pyplot as plt

x = np.array([0,1,2,3,4,4, 4, 6, 6, 5, 5, 1])
y = np.array([0,1,2,3,4,3, 3.5, 3, 2, 0, 3, 0])
xx = np.linspace(np.min(x)-1, np.max(x)+1, 40)
yy = np.linspace(np.min(y)-1, np.max(y)+1, 40)
xx, yy = np.meshgrid(xx, yy)
plt.scatter(x, y, s=50)
plt.scatter(xx, yy, s=10)
def in_hull(p, hull):
    from scipy.spatial import Delaunay
    if not isinstance(hull, Delaunay):
        hull = Delaunay(hull)
    return hull.find_simplex(p) >= 0

hull1 = np.stack((x, y)).T
p1 = np.stack((xx.ravel(), yy.ravel())).T
cond = in_hull(p1, hull1)
p2 = p1[cond, :]
plt.scatter(x, y)
plt.scatter(p2[:, 0], p2[:, 1], s=10)
with which the set of masked points looks like the following. However, I am looking for a way that does so with a concave hull (similar to what the blue points suggest).
I found this thread that suggests some functionality for a concave border, but I am not sure yet if it is applicable in my case. Does anyone have a suggestion?

The method from the first thread you reference can be adapted to the concave case using the alpha-shape (sometimes called the concave hull) concept, which is what the answer from your second reference suggests.
The alpha-shape is a subset of triangles of the Delaunay triangulation, where each triangle satisfies a circumscribing radius condition.
The following code is modified from my previous answer to compute the set of Delaunay triangles in the alpha-shape. Once the Delaunay triangulation and alpha-shape mask are computed, the fast method you reference can be adapted to the alpha-shape, as I'll explain below.
import numpy as np
from scipy.spatial import Delaunay

def circ_radius(p0, p1, p2):
    """
    Vectorized computation of triangle circumscribing radii.
    See for example https://www.cuemath.com/jee/circumcircle-formulae-trigonometry/
    """
    a = p1 - p0
    b = p2 - p0
    norm_a = np.linalg.norm(a, axis=1)
    norm_b = np.linalg.norm(b, axis=1)
    norm_a_b = np.linalg.norm(a - b, axis=1)
    cross_a_b = np.cross(a, b)  # 2 * area of the triangles
    return (norm_a * norm_b * norm_a_b) / np.abs(2.0 * cross_a_b)
def alpha_shape_delaunay_mask(points, alpha):
    """
    Compute the alpha shape (concave hull) of a set of points and return the Delaunay triangulation
    together with a boolean mask saying, for each triangle in the triangulation, whether it belongs
    to the alpha shape.
    :param points: np.array of shape (n, 2) points.
    :param alpha: alpha value.
    :return: Delaunay triangulation dt and boolean array is_in_shape, so that dt.simplices[is_in_shape]
             contains only the triangles that belong to the alpha shape.
    """
    # Modified and vectorized from:
    # https://stackoverflow.com/questions/50549128/boundary-enclosing-a-given-set-of-points/50714300#50714300
    assert points.shape[0] > 3, "Need at least four points"
    dt = Delaunay(points)
    p0 = points[dt.simplices[:, 0], :]
    p1 = points[dt.simplices[:, 1], :]
    p2 = points[dt.simplices[:, 2], :]
    rads = circ_radius(p0, p1, p2)
    is_in_shape = (rads < alpha)
    return dt, is_in_shape
The method from your first reference can then be adjusted to check not only if the point is in one of the Delaunay triangles (in which case it is in the convex hull), but also whether it is in one of the alpha-shape triangles.
The following function does this:
def in_alpha_shape(p, dt, is_in_alpha):
    simplex_ids = dt.find_simplex(p)
    res = np.full(p.shape[0], False)
    res[simplex_ids >= 0] = is_in_alpha[simplex_ids[simplex_ids >= 0]]  # point must be in dt _and_ in the alpha shape
    return res
This method is very fast since it relies on the efficient search implementation of the Delaunay find_simplex() function.
Running it (with alpha=2) on the example data points from your post with the code below gives the results in the following figure, which I believe are not what you wished for...
points = np.vstack([x, y]).T
alpha = 2.
dt, is_in_alpha = alpha_shape_delaunay_mask(points, alpha)
p1 = np.stack((xx.ravel(),yy.ravel())).T
cond = in_alpha_shape(p1, dt, is_in_alpha)
p2 = p1[cond,:]
plt.figure()
plt.scatter(x, y)
plt.scatter(p2[:,0],p2[:,1], s=10)
The reason for the result above is that, since there are large gaps between your input points, the alpha-shape of your data does not follow the polygon from your points. Increasing the alpha parameter won't help either since it will cut concave corners in other places. If you add more dense sample points then this alpha-shape method can be well-suited for your task. If not, then below I propose another solution.
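For completeness, here is a rough sketch of what "adding more dense sample points" could look like: linearly interpolating extra points along each boundary edge before building the alpha-shape. The helper name densify_polygon and the n_per_edge value are my own illustration, not part of the original approach, and this assumes you already know the boundary order (in which case the polygon test below is the simpler route).

# Sketch only: densify the boundary so the alpha-shape can follow it.
def densify_polygon(x, y, n_per_edge=20):
    """Linearly interpolate extra points along each edge of the closed polygon."""
    xs, ys = [], []
    n = len(x)
    for i in range(n):
        j = (i + 1) % n  # wrap around to close the polygon
        xs.append(np.linspace(x[i], x[j], n_per_edge, endpoint=False))
        ys.append(np.linspace(y[i], y[j], n_per_edge, endpoint=False))
    return np.concatenate(xs), np.concatenate(ys)

xd, yd = densify_polygon(x, y)
dense_points = np.vstack([xd, yd]).T
dt, is_in_alpha = alpha_shape_delaunay_mask(dense_points, alpha=2.)
cond = in_alpha_shape(p1, dt, is_in_alpha)
p2 = p1[cond, :]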
Since your original polygon is not suited for the alpha-shape method, you need an implementation of a function that returns whether point(s) are inside a given polygon. The following function implements such an algorithm based on accumulating inner/outer angles (see here for an explanation).
def points_in_polygon(pts, polygon):
    """
    Returns whether the points are inside the given polygon.
    Implemented with angle accumulation.
    See: https://en.wikipedia.org/wiki/Point_in_polygon#Winding_number_algorithm
    :param np.ndarray pts: 2d points
    :param np.ndarray polygon: 2d polygon
    :return: boolean array; array[i] == True means pts[i] is inside the polygon.
    """
    polygon = np.vstack((polygon, polygon[0, :]))  # close the polygon (if already closed this shouldn't hurt)
    sum_angles = np.zeros([len(pts), ])
    for i in range(len(polygon) - 1):
        v1 = polygon[i, :] - pts
        norm_v1 = np.linalg.norm(v1, axis=1, keepdims=True)
        norm_v1[norm_v1 == 0.0] = 1.0  # prevent divide-by-zero nans
        v1 = v1 / norm_v1
        v2 = polygon[i + 1, :] - pts
        norm_v2 = np.linalg.norm(v2, axis=1, keepdims=True)
        norm_v2[norm_v2 == 0.0] = 1.0  # prevent divide-by-zero nans
        v2 = v2 / norm_v2
        dot_prods = np.sum(v1 * v2, axis=1)
        cross_prods = np.cross(v1, v2)
        angs = np.arccos(np.clip(dot_prods, -1, 1))
        angs = np.sign(cross_prods) * angs
        sum_angles += angs
    sum_degrees = np.rad2deg(sum_angles)
    # In most cases abs(sum_degrees) should be close to 360 (inside) or to 0 (outside).
    # However, in edge cases, points that lie on the polygon can accumulate less than 360,
    # so a generous margin is used.
    return abs(sum_degrees) > 90.0
Calling it with the code below results in the following figure, which I believe is what you were looking for.
points = np.vstack([x, y]).T
p1 = np.vstack([xx.ravel(), yy.ravel()]).T
cond = points_in_polygon(p1, points)
p2 = p1[cond,:]
plt.figure()
plt.scatter(x, y)
plt.plot(x, y)
plt.scatter(p2[:,0],p2[:,1], s=10)

Related

How can I calculate arbitrary values from a spline created with scipy.interpolate.Rbf?

I have several data points in 3 dimensional space (x, y, z) and have interpolated them using scipy.interpolate.Rbf. This gives me a spline nicely representing the surface of my 3D object. I would now like to determine several x and y pairs that have the same, arbitrary z value. I would like to do that in order to compute the cross section of my 3D object at any given value of z. Does someone know how to do that? Maybe there is also a better way to do that instead of using scipy.interpolate.Rbf.
Up to now I have evaluated the cross sections by making a contour plot using matplotlib.pyplot and extracting the displayed segments. 3D points and interpolated spline
segments extracted using a contour plot
I was able to solve the problem. I have calculated the area by triangulating the x-y data and cutting the triangles with the z-plane I wanted to calculate the cross-sectional area of (z=z0). Specifically, I have searched for those triangles whose z-values lie both above and below z0. Then I have calculated the x and y values of the points where the sides of these triangles intersect z=z0. Then I use scipy.spatial.ConvexHull to sort the intersected points. Using the shoelace formula I can then determine the area.
I have attached the example code here:
import numpy as np
from scipy import spatial
import matplotlib.pyplot as plt

# Generation of random test data
n = 500
x = np.random.random(n)
y = np.random.random(n)
z = np.exp(-2*(x-.5)**2-4*(y-.5)**2)
z0 = .75

# Triangulation of the test data
triang = spatial.Delaunay(np.array([x, y]).T)

# Determine all triangles where not all points are above or below z0, i.e. the triangles that intersect z0
tri_inter = np.zeros_like(triang.simplices, dtype=int)  # The triangles which intersect the plane at z0, filled below
i = 0
for tri in triang.simplices:
    if ~np.all(z[tri] > z0) and ~np.all(z[tri] < z0):
        tri_inter[i, :] = tri
        i += 1
tri_inter = tri_inter[~np.all(tri_inter == 0, axis=1)]  # Remove all rows with only 0

# The number of interpolated values for x and y has twice the length of the triangles
# because each triangle intersects the plane at z0 twice
x_inter = np.zeros(tri_inter.shape[0]*2)
y_inter = np.zeros(tri_inter.shape[0]*2)

for j, tri in enumerate(tri_inter):
    # Determine which of the three points are above and which are below z0
    points_above = []
    points_below = []
    for i in tri:
        if z[i] > z0:
            points_above.append(i)
        else:
            points_below.append(i)
    # Calculate the intersections and put the values into x_inter and y_inter
    t = (z0-z[points_below[0]])/(z[points_above[0]]-z[points_below[0]])
    x_new = t * (x[points_above[0]]-x[points_below[0]]) + x[points_below[0]]
    y_new = t * (y[points_above[0]]-y[points_below[0]]) + y[points_below[0]]
    x_inter[j*2] = x_new
    y_inter[j*2] = y_new
    if len(points_above) > len(points_below):
        t = (z0-z[points_below[0]])/(z[points_above[1]]-z[points_below[0]])
        x_new = t * (x[points_above[1]]-x[points_below[0]]) + x[points_below[0]]
        y_new = t * (y[points_above[1]]-y[points_below[0]]) + y[points_below[0]]
    else:
        t = (z0-z[points_below[1]])/(z[points_above[0]]-z[points_below[1]])
        x_new = t * (x[points_above[0]]-x[points_below[1]]) + x[points_below[1]]
        y_new = t * (y[points_above[0]]-y[points_below[1]]) + y[points_below[1]]
    x_inter[j*2+1] = x_new
    y_inter[j*2+1] = y_new

# Sort points to calculate area
hull = spatial.ConvexHull(np.array([x_inter, y_inter]).T)
x_hull, y_hull = x_inter[hull.vertices], y_inter[hull.vertices]

# Calculation of area using the shoelace formula
area = 0.5*np.abs(np.dot(x_hull, np.roll(y_hull, 1)) - np.dot(y_hull, np.roll(x_hull, 1)))
print('Area:', area)

plt.figure()
plt.plot(x_inter, y_inter, 'ro')
plt.plot(x_hull, y_hull, 'b--')
plt.triplot(x, y, triangles=tri_inter, color='k')
plt.show()

Apply a rotation matrix to xy coordinates

I have xy coordinates that represent a subject over a given space. They are referenced from another point and are therefore off centre, i.e. the longitudinal axis is not aligned along the x-axis.
The randomly generated ellipse below provides an indication of this:
import numpy as np
from matplotlib.pyplot import scatter
xx = np.array([-0.51, 51.2])
yy = np.array([0.33, 51.6])
means = [xx.mean(), yy.mean()]
stds = [xx.std() / 3, yy.std() / 3]
corr = 0.8 # correlation
covs = [[stds[0]**2 , stds[0]*stds[1]*corr],
[stds[0]*stds[1]*corr, stds[1]**2]]
m = np.random.multivariate_normal(means, covs, 1000).T
scatter(m[0], m[1])
To straighten the coordinates I was thinking of applying a rotation matrix to the vectors.
Would something like this work?
angle = 65.
theta = (angle/180.) * np.pi
rotMatrix = np.array([[np.cos(theta), -np.sin(theta)],
[np.sin(theta), np.cos(theta)]])
This may also seem like a silly question but is there a way to determine if the resulting vector of xy coordinates is perpendicular? Or will you just have to play around with the rotation angle?
You can use sklearn.decomposition.PCA (principal component analysis) with n_components=2 to extract the smallest angle required to rotate the point cloud such that its major axis is horizontal.
Runnable example
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
np.random.seed(1)
xx = np.array([-0.51, 51.2])
yy = np.array([0.33, 51.6])
means = [xx.mean(), yy.mean()]
stds = [xx.std() / 3, yy.std() / 3]
corr = 0.8 # correlation
covs = [[stds[0]**2, stds[0]*stds[1]*corr],
[stds[0]*stds[1]*corr, stds[1]**2]]
m = np.random.multivariate_normal(means, covs, 1000)
pca = PCA(2)
# This was in my first answer attempt: fit_transform works fine, but it randomly
# flips (mirrors) points across one of the principal axes.
# m2 = pca.fit_transform(m)
# Workaround: get the rotation angle from the PCA components and manually
# build the rotation matrix.
# Fit the PCA object, but do not transform the data
pca.fit(m)
# pca.components_ : array, shape (n_components, n_features)
# cos theta
ct = pca.components_[0, 0]
# sin theta
st = pca.components_[0, 1]
# One possible value of theta that lies in [0, pi]
t = np.arccos(ct)
# If t is in quadrant 1, rotate CLOCKwise by t
if ct > 0 and st > 0:
    t *= -1
# If t is in Q2, rotate COUNTERclockwise by the complement of theta
elif ct < 0 and st > 0:
    t = np.pi - t
# If t is in Q3, rotate CLOCKwise by the complement of theta
elif ct < 0 and st < 0:
    t = -(np.pi - t)
# If t is in Q4, rotate COUNTERclockwise by theta, i.e., do nothing
elif ct > 0 and st < 0:
    pass
# Manually build the ccw rotation matrix
rotmat = np.array([[np.cos(t), -np.sin(t)],
                   [np.sin(t),  np.cos(t)]])
# Apply rotation to each row of m
m2 = (rotmat @ m.T).T
# Center the rotated point cloud at (0, 0)
m2 -= m2.mean(axis=0)
fig, ax = plt.subplots()
plot_kws = {'alpha': 0.75,
            'edgecolor': 'white',
            'linewidths': 0.75}
ax.scatter(m[:, 0], m[:, 1], **plot_kws)
ax.scatter(m2[:, 0], m2[:, 1], **plot_kws)
Output
Warning: pca.fit_transform() sometimes flips (mirrors) the point cloud
The principal components can randomly come out as either positive or negative. In some cases, your point cloud may appear flipped upside down or even mirrored across one of its principal axes. (To test this, change the random seed and re-run the code until you observe flipping.) There's an in-depth discussion here (based in R, but the math is relevant). To correct this, you'd have to replace the fit_transform line with manual flipping of one or both components' signs, then multiply the sign-flipped component matrix by the point cloud array.
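As a rough sketch of that sign fix (my own illustration, not code from the linked discussion), one common convention is to flip any component whose leading coordinate is negative before projecting:

# Sketch: deterministic sign convention for the principal components (assumes m from above).
pca = PCA(2).fit(m)
components = pca.components_.copy()
for k in range(components.shape[0]):
    if components[k, 0] < 0:          # arbitrary convention: make the x-part non-negative
        components[k] *= -1
m2 = (m - pca.mean_) @ components.T   # project onto the sign-fixed components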
Indeed, a very useful concept here is a linear transformation of a vector v performed by a matrix A. If you treat your scatter points as the tips of vectors originating from (0,0), then it is very easy to rotate them by any angle theta. A matrix that performs such a rotation by theta would be
A = [[cos(theta) -sin(theta)]
     [sin(theta)  cos(theta)]]
Evidently, when theta is 90 degrees this results in
A = [[0 -1]
     [1  0]]
And to apply the rotation you would only need to perform the matrix multiplication w = A v
With this, the current goal is to perform a matrix multiplication of the vectors stored in m, with x,y tips as m[0],m[1]. The rotated vectors are stored in m2. Below is the relevant code to do so. Note that I have transposed m for an easier computation of the matrix multiplication (performed with @) and that the rotation angle is 90 degrees counterclockwise.
import numpy as np
import matplotlib.pyplot as plt
xx = np.array([-0.51, 51.2])
yy = np.array([0.33, 51.6])
means = [xx.mean(), yy.mean()]
stds = [xx.std() / 3, yy.std() / 3]
corr = 0.8 # correlation
covs = [[stds[0]**2 , stds[0]*stds[1]*corr],
[stds[0]*stds[1]*corr, stds[1]**2]]
m = np.random.multivariate_normal(means, covs, 1000).T
plt.scatter(m[0], m[1])
theta_deg = 90
theta_rad = np.deg2rad(theta_deg)
A = np.array([[np.cos(theta_rad), -np.sin(theta_rad)],
              [np.sin(theta_rad),  np.cos(theta_rad)]])
m2 = np.zeros(m.T.shape)
for i, v in enumerate(m.T):
    w = A @ v.T
    m2[i] = w
m2 = m2.T
plt.scatter(m2[0], m2[1])
This leads to the rotated scatter plot:
You can be sure that the rotated version is exactly 90 degrees counterclockwise with the linear transformation.
Edit
To find the rotation angle you need to apply so that the scatter plot is aligned with the x axis, a good approach is to find the linear approximation of the scattered data with numpy.polyfit. This yields a linear function through its slope and the y-axis intercept b. Then get the rotation angle with the arctan of the slope and compute the transformation matrix as before. You can do this by adding the following part to the code:
slope, b = np.polyfit(m[1], m[0], 1)
x = np.arange(min(m[0]), max(m[0]), 1)
y_line = slope*x + b
plt.plot(x, y_line, color='r')
theta_rad = -np.arctan(slope)
This results in the plot you were seeking.
Edit 2
Because @Peter Leimbigler pointed out that numpy.polyfit does not find the correct global direction of the scattered data, I thought that you can get the average slope by averaging the x and y parts of the data. This gives another slope, called slope2 (depicted in green now), to apply the rotation. So simply,
slope, b = np.polyfit(m[1], m[0], 1)
x = np.arange(min(m[0]), max(m[0]), 1)
y_line = slope*x + b
slope2 = np.mean(m[1])/np.mean(m[0])
y_line2 = slope2*x + b
plt.plot(x, y_line, color='r')
plt.plot(x, y_line2, color='g')
theta_rad = -np.arctan(slope2)
And by applying the linear transformation with the rotation matrix you get
If the slopes of the two lines multiplied together equal -1, then they are perpendicular.
The other case where this is true is when one slope is 0 and the other is undefined (a perfectly horizontal line and a perfectly vertical line).
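A small sketch of that check (the helper name is mine):

import numpy as np

def are_perpendicular(slope1, slope2, tol=1e-9):
    """True if two lines with these slopes are perpendicular."""
    if np.isinf(slope1) or np.isinf(slope2):
        # a vertical line is perpendicular only to a horizontal one
        return np.isclose(slope1, 0.0) or np.isclose(slope2, 0.0)
    return np.isclose(slope1 * slope2, -1.0, atol=tol)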

Locating the centroid (center of mass) of spherical polygons

I'm trying to work out how best to locate the centroid of an arbitrary shape draped over a unit sphere, with the input being ordered (clockwise or anti-cw) vertices for the shape boundary. The density of vertices is irregular along the boundary, so the arc-lengths between them are not generally equal. Because the shapes may be very large (half a hemisphere) it is generally not possible to simply project the vertices to a plane and use planar methods, as detailed on Wikipedia (sorry I'm not allowed more than 2 hyperlinks as a newcomer). A slightly better approach involves the use of planar geometry manipulated in spherical coordinates, but again, with large polygons this method fails, as nicely illustrated here. On that same page, 'Cffk' highlighted this paper which describes a method for calculating the centroid of spherical triangles. I've tried to implement this method, but without success, and I'm hoping someone can spot the problem?
I have kept the variable definitions similar to those in the paper to make it easier to compare. The input (data) is a list of longitude/latitude coordinates, converted to [x,y,z] coordinates by the code. For each of the triangles I have arbitrarily fixed one point to be the +z-pole, the other two vertices being composed of a pair of neighboring points along the polygon boundary. The code steps along the boundary (starting at an arbitrary point), using each boundary segment of the polygon as a triangle side in turn. A sub-centroid is determined for each of these individual spherical triangles and they are weighted according to triangle area and added to calculate the total polygon centroid. I don't get any errors when running the code, but the total centroids returned are clearly wrong (I have run some very basic shapes where the centroid location is unambiguous). I haven't found any sensible pattern in the location of the centroids returned...so at the moment I'm not sure what is going wrong, either in the math or code (although, the suspicion is the math).
The code below should work copy-paste as is if you would like to try it. If you have matplotlib and numpy installed, it will plot the results (it will ignore plotting if you don't). You just have to put the longitude/latitude data below the code into a text file called example.txt.
from math import *
try:
import matplotlib as mpl
import matplotlib.pyplot
from mpl_toolkits.mplot3d import Axes3D
import numpy
plotting_enabled = True
except ImportError:
plotting_enabled = False
def sph_car(point):
if len(point) == 2:
point.append(1.0)
rlon = radians(float(point[0]))
rlat = radians(float(point[1]))
x = cos(rlat) * cos(rlon) * point[2]
y = cos(rlat) * sin(rlon) * point[2]
z = sin(rlat) * point[2]
return [x, y, z]
def xprod(v1, v2):
x = v1[1] * v2[2] - v1[2] * v2[1]
y = v1[2] * v2[0] - v1[0] * v2[2]
z = v1[0] * v2[1] - v1[1] * v2[0]
return [x, y, z]
def dprod(v1, v2):
dot = 0
for i in range(3):
dot += v1[i] * v2[i]
return dot
def plot(poly_xyz, g_xyz):
fig = mpl.pyplot.figure()
ax = fig.add_subplot(111, projection='3d')
# plot the unit sphere
u = numpy.linspace(0, 2 * numpy.pi, 100)
v = numpy.linspace(-1 * numpy.pi / 2, numpy.pi / 2, 100)
x = numpy.outer(numpy.cos(u), numpy.sin(v))
y = numpy.outer(numpy.sin(u), numpy.sin(v))
z = numpy.outer(numpy.ones(numpy.size(u)), numpy.cos(v))
ax.plot_surface(x, y, z, rstride=4, cstride=4, color='w', linewidth=0,
alpha=0.3)
# plot 3d and flattened polygon
x, y, z = zip(*poly_xyz)
ax.plot(x, y, z)
ax.plot(x, y, zs=0)
# plot the alleged 3d and flattened centroid
x, y, z = g_xyz
ax.scatter(x, y, z, c='r')
ax.scatter(x, y, 0, c='r')
# display
ax.set_xlim3d(-1, 1)
ax.set_ylim3d(-1, 1)
ax.set_zlim3d(0, 1)
mpl.pyplot.show()
lons, lats, v = list(), list(), list()
# put the two-column data at the bottom of the question into a file called
# example.txt in the same directory as this script
with open('example.txt') as f:
for line in f.readlines():
sep = line.split()
lons.append(float(sep[0]))
lats.append(float(sep[1]))
# convert spherical coordinates to cartesian
for lon, lat in zip(lons, lats):
v.append(sph_car([lon, lat, 1.0]))
# z unit vector/pole ('north pole'). This is an arbitrary point selected to act as one
#(fixed) vertex of the summed spherical triangles. The other two vertices of any
#triangle are composed of neighboring vertices from the polygon boundary.
np = [0.0, 0.0, 1.0]
# Gx,Gy,Gz are the cartesian coordinates of the calculated centroid
Gx, Gy, Gz = 0.0, 0.0, 0.0
for i in range(-1, len(v) - 1):
# cycle through the boundary vertices of the polygon, from 0 to n
if all((v[i][0] != v[i+1][0],
v[i][1] != v[i+1][1],
v[i][2] != v[i+1][2])):
# this just ignores redundant points which are common in my larger input files
# A,B,C are the internal angles in the triangle: 'np-v[i]-v[i+1]-np'
A = asin(sqrt((dprod(np, xprod(v[i], v[i+1])))**2
/ ((1 - (dprod(v[i+1], np))**2) * (1 - (dprod(np, v[i]))**2))))
B = asin(sqrt((dprod(v[i], xprod(v[i+1], np)))**2
/ ((1 - (dprod(np , v[i]))**2) * (1 - (dprod(v[i], v[i+1]))**2))))
C = asin(sqrt((dprod(v[i + 1], xprod(np, v[i])))**2
/ ((1 - (dprod(v[i], v[i+1]))**2) * (1 - (dprod(v[i+1], np))**2))))
# A/B/Cbar are the vertex angles, such that if 'O' is the sphere center, Abar
# is the angle (v[i]-O-v[i+1])
Abar = acos(dprod(v[i], v[i+1]))
Bbar = acos(dprod(v[i+1], np))
Cbar = acos(dprod(np, v[i]))
# e is the 'spherical excess', as defined on wikipedia
e = A + B + C - pi
# mag1/2/3 are the magnitudes of vectors np,v[i] and v[i+1].
mag1 = 1.0
mag2 = float(sqrt(v[i][0]**2 + v[i][1]**2 + v[i][2]**2))
mag3 = float(sqrt(v[i+1][0]**2 + v[i+1][1]**2 + v[i+1][2]**2))
# vec1/2/3 are cross products, defined here to simplify the equation below.
vec1 = xprod(np, v[i])
vec2 = xprod(v[i], v[i+1])
vec3 = xprod(v[i+1], np)
# multiplying vec1/2/3 by e and respective internal angles, according to the
#posted paper
for x in range(3):
vec1[x] *= Cbar / (2 * e * mag1 * mag2
* sqrt(1 - (dprod(np, v[i])**2)))
vec2[x] *= Abar / (2 * e * mag2 * mag3
* sqrt(1 - (dprod(v[i], v[i+1])**2)))
vec3[x] *= Bbar / (2 * e * mag3 * mag1
* sqrt(1 - (dprod(v[i+1], np)**2)))
Gx += vec1[0] + vec2[0] + vec3[0]
Gy += vec1[1] + vec2[1] + vec3[1]
Gz += vec1[2] + vec2[2] + vec3[2]
approx_expected_Gxyz = (0.78, -0.56, 0.27)
print('Approximate Expected Gxyz: {0}\n'
' Actual Gxyz: {1}'
''.format(approx_expected_Gxyz, (Gx, Gy, Gz)))
if plotting_enabled:
plot(v, (Gx, Gy, Gz))
Thanks in advance for any suggestions or insight.
EDIT: Here is a figure that shows a projection of the unit sphere with a polygon and the resulting centroid I calculate from the code. Clearly, the centroid is wrong, as the polygon is rather small and convex and yet the centroid falls outside its perimeter.
EDIT: Here is a highly-similar set of coordinates to those above, but in the original [lon,lat] format I normally use (which is now converted to [x,y,z] by the updated code).
-39.366295 -1.633460
-47.282630 -0.740433
-53.912136 0.741380
-59.004217 2.759183
-63.489005 5.426812
-68.566001 8.712068
-71.394853 11.659135
-66.629580 15.362600
-67.632276 16.827507
-66.459524 19.069327
-63.819523 21.446736
-61.672712 23.532143
-57.538431 25.947815
-52.519889 28.691766
-48.606227 30.646295
-45.000447 31.089437
-41.549866 32.139873
-36.605156 32.956277
-32.010080 34.156692
-29.730629 33.756566
-26.158767 33.714080
-25.821513 34.179648
-23.614658 36.173719
-20.896869 36.977645
-17.991994 35.600074
-13.375742 32.581447
-9.554027 28.675497
-7.825604 26.535234
-7.825604 26.535234
-9.094304 23.363132
-9.564002 22.527385
-9.713885 22.217165
-9.948596 20.367878
-10.496531 16.486580
-11.151919 12.666850
-12.350144 8.800367
-15.446347 4.993373
-20.366139 1.132118
-24.784805 -0.927448
-31.532135 -1.910227
-39.366295 -1.633460
EDIT: A couple more examples...with 4 vertices defining a perfect square centered at [1,0,0] I get the expected result:
However, from a non-symmetric triangle I get a centroid that is nowhere close...the centroid actually falls on the far side of the sphere (here projected onto the front side as the antipode):
Interestingly, the centroid estimation appears 'stable' in the sense that if I invert the list (go from clockwise to counterclockwise order or vice-versa) the centroid correspondingly inverts exactly.
Anybody finding this, make sure to check Don Hatch's answer which is probably better.
I think this will do it. You should be able to reproduce this result by just copy-pasting the code below.
You will need to have the latitude and longitude data in a file called longitude and latitude.txt. You can copy-paste the original sample data which is included below the code.
If you have matplotlib it will additionally produce the plot below.
For non-obvious calculations, I included a link that explains what is going on
In the graph below, the reference vector is very short (r = 1/10) so that the 3d-centroids are easier to see. You can easily remove the scaling to maximize accuracy.
Note to op: I rewrote almost everything so I'm not sure exactly where the original code was not working. However, at least I think it was not taking into consideration the need to handle clockwise / counterclockwise triangle vertices.
Legend:
(black line) reference vector
(small red dots) spherical triangle 3d-centroids
(large red / blue / green dot) 3d-centroid / projected to the surface / projected to the xy plane
(blue / green lines) the spherical polygon and the projection onto the xy plane
from math import *
try:
import matplotlib as mpl
import matplotlib.pyplot
from mpl_toolkits.mplot3d import Axes3D
import numpy
plotting_enabled = True
except ImportError:
plotting_enabled = False
def main():
# get base polygon data based on unit sphere
r = 1.0
polygon = get_cartesian_polygon_data(r)
point_count = len(polygon)
reference = ok_reference_for_polygon(polygon)
# decompose the polygon into triangles and record each area and 3d centroid
areas, subcentroids = list(), list()
for ia, a in enumerate(polygon):
# build an a-b-c point set
ib = (ia + 1) % point_count
b, c = polygon[ib], reference
if points_are_equivalent(a, b, 0.001):
continue # skip nearly identical points
# store the area and 3d centroid
areas.append(area_of_spherical_triangle(r, a, b, c))
tx, ty, tz = zip(a, b, c)
subcentroids.append((sum(tx)/3.0,
sum(ty)/3.0,
sum(tz)/3.0))
# combine all the centroids, weighted by their areas
total_area = sum(areas)
subxs, subys, subzs = zip(*subcentroids)
_3d_centroid = (sum(a*subx for a, subx in zip(areas, subxs))/total_area,
sum(a*suby for a, suby in zip(areas, subys))/total_area,
sum(a*subz for a, subz in zip(areas, subzs))/total_area)
# shift the final centroid to the surface
surface_centroid = scale_v(1.0 / mag(_3d_centroid), _3d_centroid)
plot(polygon, reference, _3d_centroid, surface_centroid, subcentroids)
def get_cartesian_polygon_data(fixed_radius):
cartesians = list()
with open('longitude and latitude.txt') as f:
for line in f.readlines():
spherical_point = [float(v) for v in line.split()]
if len(spherical_point) == 2:
spherical_point.append(fixed_radius)
cartesians.append(degree_spherical_to_cartesian(spherical_point))
return cartesians
def ok_reference_for_polygon(polygon):
point_count = len(polygon)
# fix the average of all vectors to minimize float skew
polyx, polyy, polyz = zip(*polygon)
# /10 is for visualization. Remove it to maximize accuracy
return (sum(polyx)/(point_count*10.0),
sum(polyy)/(point_count*10.0),
sum(polyz)/(point_count*10.0))
def points_are_equivalent(a, b, vague_tolerance):
# vague tolerance is something like a percentage tolerance (1% = 0.01)
(ax, ay, az), (bx, by, bz) = a, b
return all(((ax-bx)/ax < vague_tolerance,
(ay-by)/ay < vague_tolerance,
(az-bz)/az < vague_tolerance))
def degree_spherical_to_cartesian(point):
rad_lon, rad_lat, r = radians(point[0]), radians(point[1]), point[2]
x = r * cos(rad_lat) * cos(rad_lon)
y = r * cos(rad_lat) * sin(rad_lon)
z = r * sin(rad_lat)
return x, y, z
def area_of_spherical_triangle(r, a, b, c):
# points abc
# build an angle set: A(CAB), B(ABC), C(BCA)
# http://math.stackexchange.com/a/66731/25581
A, B, C = surface_points_to_surface_radians(a, b, c)
E = A + B + C - pi # E is called the spherical excess
area = r**2 * E
# add or subtract area based on clockwise-ness of a-b-c
# http://stackoverflow.com/a/10032657/377366
if clockwise_or_counter(a, b, c) == 'counter':
area *= -1.0
return area
def surface_points_to_surface_radians(a, b, c):
"""build an angle set: A(cab), B(abc), C(bca)"""
points = a, b, c
angles = list()
for i, mid in enumerate(points):
start, end = points[(i - 1) % 3], points[(i + 1) % 3]
x_startmid, x_endmid = xprod(start, mid), xprod(end, mid)
ratio = (dprod(x_startmid, x_endmid)
/ ((mag(x_startmid) * mag(x_endmid))))
angles.append(acos(ratio))
return angles
def clockwise_or_counter(a, b, c):
ab = diff_cartesians(b, a)
bc = diff_cartesians(c, b)
x = xprod(ab, bc)
if x < 0:
return 'clockwise'
elif x > 0:
return 'counter'
else:
raise RuntimeError('The reference point is in the polygon.')
def diff_cartesians(positive, negative):
return tuple(p - n for p, n in zip(positive, negative))
def xprod(v1, v2):
x = v1[1] * v2[2] - v1[2] * v2[1]
y = v1[2] * v2[0] - v1[0] * v2[2]
z = v1[0] * v2[1] - v1[1] * v2[0]
return [x, y, z]
def dprod(v1, v2):
dot = 0
for i in range(3):
dot += v1[i] * v2[i]
return dot
def mag(v1):
return sqrt(v1[0]**2 + v1[1]**2 + v1[2]**2)
def scale_v(scalar, v):
return tuple(scalar * vi for vi in v)
def plot(polygon, reference, _3d_centroid, surface_centroid, subcentroids):
fig = mpl.pyplot.figure()
ax = fig.add_subplot(111, projection='3d')
# plot the unit sphere
u = numpy.linspace(0, 2 * numpy.pi, 100)
v = numpy.linspace(-1 * numpy.pi / 2, numpy.pi / 2, 100)
x = numpy.outer(numpy.cos(u), numpy.sin(v))
y = numpy.outer(numpy.sin(u), numpy.sin(v))
z = numpy.outer(numpy.ones(numpy.size(u)), numpy.cos(v))
ax.plot_surface(x, y, z, rstride=4, cstride=4, color='w', linewidth=0,
alpha=0.3)
# plot 3d and flattened polygon
x, y, z = zip(*polygon)
ax.plot(x, y, z, c='b')
ax.plot(x, y, zs=0, c='g')
# plot the 3d centroid
x, y, z = _3d_centroid
ax.scatter(x, y, z, c='r', s=20)
# plot the spherical surface centroid and flattened centroid
x, y, z = surface_centroid
ax.scatter(x, y, z, c='b', s=20)
ax.scatter(x, y, 0, c='g', s=20)
# plot the full set of triangular centroids
x, y, z = zip(*subcentroids)
ax.scatter(x, y, z, c='r', s=4)
# plot the reference vector used to findsub centroids
x, y, z = reference
ax.plot((0, x), (0, y), (0, z), c='k')
ax.scatter(x, y, z, c='k', marker='^')
# display
ax.set_xlim3d(-1, 1)
ax.set_ylim3d(-1, 1)
ax.set_zlim3d(0, 1)
mpl.pyplot.show()
# run it in a function so the main code can appear at the top
main()
Here is the longitude and latitude data you can paste into longitude and latitude.txt
-39.366295 -1.633460
-47.282630 -0.740433
-53.912136 0.741380
-59.004217 2.759183
-63.489005 5.426812
-68.566001 8.712068
-71.394853 11.659135
-66.629580 15.362600
-67.632276 16.827507
-66.459524 19.069327
-63.819523 21.446736
-61.672712 23.532143
-57.538431 25.947815
-52.519889 28.691766
-48.606227 30.646295
-45.000447 31.089437
-41.549866 32.139873
-36.605156 32.956277
-32.010080 34.156692
-29.730629 33.756566
-26.158767 33.714080
-25.821513 34.179648
-23.614658 36.173719
-20.896869 36.977645
-17.991994 35.600074
-13.375742 32.581447
-9.554027 28.675497
-7.825604 26.535234
-7.825604 26.535234
-9.094304 23.363132
-9.564002 22.527385
-9.713885 22.217165
-9.948596 20.367878
-10.496531 16.486580
-11.151919 12.666850
-12.350144 8.800367
-15.446347 4.993373
-20.366139 1.132118
-24.784805 -0.927448
-31.532135 -1.910227
-39.366295 -1.633460
To clarify: the quantity of interest is the projection of the true 3d centroid
(i.e. 3d center-of-mass, i.e. 3d center-of-area) onto the unit sphere.
Since all you care about is the direction from the origin to the 3d centroid,
you don't need to bother with areas at all;
it's easier to just compute the moment (i.e. 3d centroid times area).
The moment of the region to the left of a closed path on the unit sphere
is half the integral of the leftward unit vector as you walk around the path.
This follows from a non-obvious application of Stokes' theorem; see Frank Jones's vector calculus book, chapter 13 Problem 13-12.
In particular, for a spherical polygon, the moment is half the sum of
(a x b) / ||a x b|| * (angle between a and b) for each pair of consecutive vertices a,b.
(That's for the region to the left of the path;
negate it for the region to the right of the path.)
(And if you really did want the 3d centroid, just compute the area and divide the moment by it. Comparing areas might also be useful in choosing which of the two regions to call "the polygon".)
Here's some code; it's really simple:
#!/usr/bin/python

import math

def plus(a,b): return [x+y for x,y in zip(a,b)]
def minus(a,b): return [x-y for x,y in zip(a,b)]
def cross(a,b): return [a[1]*b[2]-a[2]*b[1], a[2]*b[0]-a[0]*b[2], a[0]*b[1]-a[1]*b[0]]
def dot(a,b): return sum([x*y for x,y in zip(a,b)])
def length(v): return math.sqrt(dot(v,v))
def normalized(v): l = length(v); return [1,0,0] if l==0 else [x/l for x in v]

def addVectorTimesScalar(accumulator, vector, scalar):
    for i in xrange(len(accumulator)): accumulator[i] += vector[i] * scalar

def angleBetweenUnitVectors(a,b):
    # https://www.plunk.org/~hatch/rightway.html
    if dot(a,b) < 0:
        return math.pi - 2*math.asin(length(plus(a,b))/2.)
    else:
        return 2*math.asin(length(minus(a,b))/2.)

def sphericalPolygonMoment(verts):
    moment = [0.,0.,0.]
    for i in xrange(len(verts)):
        a = verts[i]
        b = verts[(i+1)%len(verts)]
        addVectorTimesScalar(moment, normalized(cross(a,b)),
                             angleBetweenUnitVectors(a,b) / 2.)
    return moment

if __name__ == '__main__':
    import sys

    def lonlat_degrees_to_xyz(lon_degrees,lat_degrees):
        lon = lon_degrees*(math.pi/180)
        lat = lat_degrees*(math.pi/180)
        coslat = math.cos(lat)
        return [coslat*math.cos(lon), coslat*math.sin(lon), math.sin(lat)]

    verts = [lonlat_degrees_to_xyz(*[float(v) for v in line.split()])
             for line in sys.stdin.readlines()]
    #print "verts = "+`verts`

    moment = sphericalPolygonMoment(verts)
    print "moment = "+`moment`
    print "centroid unit direction = "+`normalized(moment)`
For the example polygon, this gives the answer (unit vector):
[-0.7644875430808217, 0.579935445918147, -0.2814847687566214]
This is roughly the same as, but more accurate than, the answer computed by @KobeJohn's code, which uses rough tolerances and planar approximations to the sub-centroids:
[0.7628095787179151, -0.5977153368303585, 0.24669398601094406]
The directions of the two answers are roughly opposite (so I guess KobeJohn's code
decided to take the region to the right of the path in this case).
I think a good approximation would be to compute the center of mass using weighted cartesian coordinates and projecting the result onto the sphere (supposing the origin of coordinates is (0, 0, 0)^T).
Let (p[0], p[1], ..., p[n-1]) be the n points of the polygon. The approximate (cartesian) centroid can be computed by:
c = 1 / w * (sum of w[i] * p[i])
where w is the sum of all weights, p[i] is a polygon point, and w[i] is the weight for that point, e.g.
w[i] = |p[i] - p[(i - 1 + n) % n]| / 2 + |p[i] - p[(i + 1) % n]| / 2
where |x| is the length of a vector x.
I.e. a point is weighted with half the length to the previous and half the length to the next polygon point.
This centroid c can now be projected onto the sphere by:
c' = r * c / |c|
where r is the radius of the sphere.
To account for the orientation of the polygon (ccw vs. cw), the result may instead be
c' = - r * c / |c|.
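A minimal sketch of this approximation (the function name is mine; verts is assumed to be an ordered list of [x, y, z] boundary points on the sphere):

import numpy as np

def approx_spherical_centroid(verts, r=1.0):
    p = np.asarray(verts, dtype=float)
    prev_edge = np.linalg.norm(p - np.roll(p, 1, axis=0), axis=1)   # |p[i] - p[i-1]|
    next_edge = np.linalg.norm(p - np.roll(p, -1, axis=0), axis=1)  # |p[i] - p[i+1]|
    w = 0.5 * prev_edge + 0.5 * next_edge                           # weight per vertex
    c = (w[:, None] * p).sum(axis=0) / w.sum()                      # weighted cartesian centroid
    return r * c / np.linalg.norm(c)                                # project onto the sphere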
Sorry I (as a newly registered user) had to write a new post instead of just voting/commenting on the above answer by Don Hatch. Don's answer, I think, is the best and most elegant. It is mathematically rigorous in computing the center of mass (first moment of mass) in a simple way when applying to the spherical polygon.
Kobe John's answer is a good approximation but only satisfactory for smaller areas. I also noticed a few glitches in the code. Firstly, the reference point should be projected to the spherical surface to compute the actual spherical area. Secondly, the function points_are_equivalent() might need to be refined to avoid divide-by-zero.
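For the second glitch, one possible refinement (my sketch, not KobeJohn's code) is to compare absolute distance instead of per-coordinate ratios:

def points_are_equivalent(a, b, tolerance=1e-6):
    # absolute squared-distance check; avoids dividing by a coordinate that may be zero
    return sum((ai - bi)**2 for ai, bi in zip(a, b)) < tolerance**2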
The approximation error in Kobe's method lies in the calculation of the centroid of spherical triangles. The sub-centroid is NOT the center of mass of the spherical triangle but the planar one. This is not an issue if one is to determine that single triangle (sign may flip, see below). It is also not an issue if triangles are small (e.g. a dense triangulation of the polygon).
A few simple tests could illustrate the approximation error. For example if we use just four points:
10 -20
10 20
-10 20
-10 -20
The exact answer is (1,0,0) and both methods are good. But if you throw in a few more points along one edge (e.g. add {10,-15},{10,-10}... to the first edge), you'll see the results from Kobe's method start to shift. Furthermore, if you increase the longitude from [10,-10] to [100,-100], you'll see Kobe's result flip direction. A possible improvement might be to add another level (or levels) of sub-centroid calculation (basically refine/reduce the sizes of the triangles).
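For reference, here is a small self-contained check of that four-point case using the moment idea from Don Hatch's answer (re-implemented with numpy; the function names are mine):

import math
import numpy as np

def lonlat_to_xyz(lon_deg, lat_deg):
    lon, lat = math.radians(lon_deg), math.radians(lat_deg)
    return np.array([math.cos(lat) * math.cos(lon),
                     math.cos(lat) * math.sin(lon),
                     math.sin(lat)])

def spherical_polygon_moment(verts):
    moment = np.zeros(3)
    for a, b in zip(verts, verts[1:] + verts[:1]):
        n = np.cross(a, b)
        n /= np.linalg.norm(n)
        angle = math.acos(np.clip(np.dot(a, b), -1.0, 1.0))
        moment += n * (angle / 2.0)
    return moment

square = [(10, -20), (10, 20), (-10, 20), (-10, -20)]
verts = [lonlat_to_xyz(lon, lat) for lon, lat in square]
m = spherical_polygon_moment(verts)
print(m / np.linalg.norm(m))   # direction should be close to (1, 0, 0)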
For our application, the spherical area boundary is composed of multiple arcs and is thus not a polygon (i.e. the arcs are not parts of great circles). But it is just a little more work to find the n-vector in the curve integration.
EDIT: Replacing the subcentroid calculation with the one given in Brock's paper should fix Kobe's method, but I have not tried it.

python optimize.leastsq: fitting a circle to 3d set of points

I am trying to use circle fitting code for a 3D data set. I have modified it for 3D points, just adding the z-coordinate where necessary. My modification works fine for one set of points and badly for another. Please look at the code to see if it has some errors.
import trig_items
import numpy as np
from trig_items import *
from numpy import *
from matplotlib import pyplot as p
from scipy import optimize
# Coordinates of the 3D points
##x = r_[36, 36, 19, 18, 33, 26]
##y = r_[14, 10, 28, 31, 18, 26]
##z = r_[0, 1, 2, 3, 4, 5]
x = r_[ 2144.18908574, 2144.26880854, 2144.05552972, 2143.90303742, 2143.62520676,
2143.43628579, 2143.14005775, 2142.79919654, 2142.51436023, 2142.11240866,
2141.68564346, 2141.29333828, 2140.92596405, 2140.3475612, 2139.90848046,
2139.24661021, 2138.67384709, 2138.03313547, 2137.40301734, 2137.40908256,
2137.06611224, 2136.50943781, 2136.0553113, 2135.50313189, 2135.07049922,
2134.62098139, 2134.10459535, 2133.50838433, 2130.6600465, 2130.03537342,
2130.04047644, 2128.83522468, 2127.79827542, 2126.43513385, 2125.36700593,
2124.00350543, 2122.68564431, 2121.20709478, 2119.79047011, 2118.38417647,
2116.90063343, 2115.52685778, 2113.82246629, 2112.21159431, 2110.63180117,
2109.00713198, 2108.94434529, 2106.82777156, 2100.62343757, 2098.5090226,
2096.28787738, 2093.91550703, 2091.66075061, 2089.15316429, 2086.69753869,
2084.3002414, 2081.87590579, 2079.19141866, 2076.5394574, 2073.89128676,
2071.18786213]
y = r_[ 725.74913818, 724.43874065, 723.15226506, 720.45950581, 717.77827954,
715.07048092, 712.39633862, 709.73267688, 707.06039438, 704.43405908,
701.80074596, 699.15371526, 696.5309022, 693.96109921, 691.35585501,
688.83496327, 686.32148661, 683.80286662, 681.30705568, 681.30530975,
679.66483676, 678.01922321, 676.32721779, 674.6667554, 672.9658024,
671.23686095, 669.52021535, 667.84999077, 659.19757984, 657.46179949,
657.45700508, 654.46901086, 651.38177517, 648.41739432, 645.32356976,
642.39034578, 639.42628453, 636.51107198, 633.57732055, 630.63825133,
627.75308356, 624.80162215, 622.01980232, 619.18814892, 616.37688894,
613.57400131, 613.61535723, 610.4724493, 600.98277781, 597.84782844,
594.75983001, 591.77946964, 588.74874068, 585.84525834, 582.92311166,
579.99564481, 577.06666417, 574.30782762, 571.54115037, 568.79760614,
566.08551098]
z = r_[ 339.77146775, 339.60021095, 339.47645894, 339.47130963, 339.37216218,
339.4126132, 339.67942046, 339.40917728, 339.39500353, 339.15041461,
339.38959195, 339.3358209, 339.47764895, 339.17854867, 339.14624071,
339.16403926, 339.02308811, 339.27011082, 338.97684183, 338.95087698,
338.97321177, 339.02175448, 339.02543922, 338.88725411, 339.06942374,
339.0557553, 339.04414618, 338.89234303, 338.95572249, 339.00880416,
339.00413073, 338.91080374, 338.98214758, 339.01135789, 338.96393537,
338.73446188, 338.62784913, 338.72443217, 338.74880562, 338.69090173,
338.50765186, 338.49056867, 338.57353355, 338.6196255, 338.43754399,
338.27218569, 338.10587265, 338.43880881, 338.28962141, 338.14338705,
338.25784154, 338.49792568, 338.15572139, 338.52967693, 338.4594245,
338.1511823, 338.03711207, 338.19144663, 338.22022045, 338.29032321,
337.8623197 ]
# coordinates of the barycenter
xm = mean(x)
ym = mean(y)
zm = mean(z)
### Basic usage of optimize.leastsq
def calc_R(xc, yc, zc):
    """ calculate the distance of each 3D point from the center (xc, yc, zc) """
    return sqrt((x - xc) ** 2 + (y - yc) ** 2 + (z - zc) ** 2)

def func(c):
    """ calculate the algebraic distance between the 3D points and the mean circle centered at c=(xc, yc, zc) """
    Ri = calc_R(*c)
    return Ri - Ri.mean()
center_estimate = xm, ym, zm
center, ier = optimize.leastsq(func, center_estimate)
##print center
xc, yc, zc = center
Ri = calc_R(xc, yc, zc)
R = Ri.mean()
residu = sum((Ri - R)**2)
print 'R =', R
So, for the first set of x, y, z (commented in the code) it works well: the output is R = 39.0097846735. If I run the code with the second set of points (uncommented), the resulting radius is R = 108576.859834, which is almost a straight line. I plotted the latter.
The blue points are the given data set, the red ones are the arc with the resulting radius R = 108576.859834. It is obvious that the given data set has a much smaller radius than the result.
Here is another set of points.
It is clear that the least squares does not work correctly.
Please help me solve this issue.
UPDATE
Here is my solution:
### fit 3D arc into a set of 3D points ###
### output is the centre and the radius of the arc ###
def fitArc3d(arr, eps=0.0001):
    # Coordinates of the 3D points
    x = numpy.array([arr[k][0] for k in range(len(arr))])
    y = numpy.array([arr[k][4] for k in range(len(arr))])
    z = numpy.array([arr[k][5] for k in range(len(arr))])
    # coordinates of the barycenter
    xm = mean(x)
    ym = mean(y)
    zm = mean(z)
    ### gradient descent minimisation method ###
    pnts = [[x[k], y[k], z[k]] for k in range(len(x))]
    meanP = Point(xm, ym, zm)  # mean point
    Ri = [Point(*meanP).distance(Point(*pnts[k])) for k in range(len(pnts))]  # radii to the points
    Rm = math.fsum(Ri) / len(Ri)  # mean radius
    dR = Rm + 10  # difference between mean radii
    alpha = 0.1
    c = meanP
    cArr = []
    while dR > eps:
        cArr.append(c)
        Jx = math.fsum([2 * (x[k] - c[0]) * (Ri[k] - Rm) / Ri[k] for k in range(len(Ri))])
        Jy = math.fsum([2 * (y[k] - c[1]) * (Ri[k] - Rm) / Ri[k] for k in range(len(Ri))])
        Jz = math.fsum([2 * (z[k] - c[2]) * (Ri[k] - Rm) / Ri[k] for k in range(len(Ri))])
        gradJ = [Jx, Jy, Jz]  # find gradient
        c = [c[k] + alpha * gradJ[k] for k in range(len(c)) if len(c) == len(gradJ)]  # find new centre point
        Ri = [Point(*c).distance(Point(*pnts[k])) for k in range(len(pnts))]  # calculate new radii
        RmOld = Rm
        Rm = math.fsum(Ri) / len(Ri)  # calculate new mean radius
        dR = abs(Rm - RmOld)  # new difference between mean radii
    return Point(*c), Rm
It is not very optimal code (I do not have time to fine tune it) but it works.
I guess the problem is the data and the corresponding algorithm. The least squares method works fine if it produces a local parabolic minimum, such that a simple gradient method goes approximately in the direction of the minimum. Unfortunately, this is not necessarily the case for your data. You can check this by keeping some rough estimates for xc and yc fixed and plotting the sum of the squared residuals as a function of zc and R. I get a boomerang-shaped minimum. Depending on your starting parameters you might end up in one of the branches going away from the real minimum. Once in the valley this can be very flat, such that you exceed the maximum number of iterations or get something that is accepted within the tolerance of the algorithm. As always, things are better the better your starting parameters are. Unfortunately you have only a small arc of the circle, so it is difficult to do better. I am not a specialist in Python, but I think that leastsq allows you to play with the Jacobian and gradient methods. Try to play with the tolerance as well.
In short: the code looks basically fine to me, but your data is pathological and you have to adapt the code to that kind of data.
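A sketch of that diagnostic, assuming the x, y, z arrays from the question (the fixed xc, yc values and the grid ranges below are rough guesses of mine):

import numpy as np
import matplotlib.pyplot as plt

xc, yc = 1900.0, 700.0                       # rough, fixed centre estimates
ZC, R = np.meshgrid(np.linspace(250, 400, 200), np.linspace(150, 400, 200))

cost = np.zeros_like(ZC)
for xi, yi, zi in zip(x, y, z):
    Ri = np.sqrt((xi - xc)**2 + (yi - yc)**2 + (zi - ZC)**2)
    cost += (Ri - R)**2                      # sum of squared residuals on the (zc, R) grid

plt.contourf(ZC, R, np.log10(cost), levels=40)
plt.xlabel('zc'); plt.ylabel('R')
plt.colorbar(label='log10(sum of squared residuals)')
plt.show()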
There is a non-iterative solution in 2D from Karimäki, maybe you can adapt
this method to 3D. You can also look at this. Sure you will find more literature.
I just checked the data using a Simplex-Algorithm. The minimum is, as I said, not well behaved. See here some cuts of the residual function. Only in the xy-plane you get some reasonable behavior. The properties of the zr- and xr- plane make the finding process very difficult.
So in the beginning the simplex algorithm finds several almost stable solutions. You can see them as flat steps in the graph below (blue x, purple y, yellow z, green R). At the end the algorithm has to walk down an almost flat but very stretched-out valley, resulting in the final convergence of z and R. Nevertheless, I expect many regions that look like a solution if the tolerance is insufficient. With the standard tolerance of 10^-5 the algorithm stopped after approx. 350 iterations. I had to set it to 10^-10 to get this solution, i.e. [1899.32, 741.874, 298.696, 248.956], which seems quite ok.
Update
As mentioned earlier, the solution depends on the working precision and requested accuracy. So your hand-made gradient method probably works better, as these values are different compared to the built-in least squares fit. Nevertheless, this is my version making a two-step fit. First I fit a plane to the data. In a next step I fit a circle within this plane. Both steps use the least squares method. This time it works, as each step avoids critically shaped minima. (Naturally, the plane fit runs into problems if the arc segment becomes small and the data lies virtually on a straight line. But this will happen for all algorithms.)
from math import *
from matplotlib import pyplot as plt
from scipy import optimize
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import pprint as pp
dataTupel=zip(xs,ys,zs) #your data from above
# Fitting a plane first
# let the affine plane be defined by two vectors,
# the zero point P0 and the plane normal n0
# a point p is member of the plane if (p-p0).n0 = 0
def distanceToPlane(p0, n0, p):
    return np.dot(np.array(n0), np.array(p) - np.array(p0))

def residualsPlane(parameters, dataPoint):
    px, py, pz, theta, phi = parameters
    nx, ny, nz = sin(theta)*cos(phi), sin(theta)*sin(phi), cos(theta)
    distances = [distanceToPlane([px, py, pz], [nx, ny, nz], [x, y, z]) for x, y, z in dataPoint]
    return distances
estimate = [1900, 700, 335,0,0] # px,py,pz and zeta, phi
#you may automize this by using the center of mass data
# note that the normal vector is given in polar coordinates
bestFitValues, ier = optimize.leastsq(residualsPlane, estimate, args=(dataTupel))
xF,yF,zF,tF,pF = bestFitValues
point = [xF,yF,zF]
normal = [sin(tF)*cos(pF),sin(tF)*sin(pF),cos(tF)]
# Fitting a circle inside the plane
#creating two inplane vectors
sArr=np.cross(np.array([1,0,0]),np.array(normal))#assuming that normal not parallel x!
sArr=sArr/np.linalg.norm(sArr)
rArr=np.cross(sArr,np.array(normal))
rArr=rArr/np.linalg.norm(rArr)#should be normalized already, but anyhow
def residualsCircle(parameters, dataPoint):
    r, s, Ri = parameters
    planePointArr = s*sArr + r*rArr + np.array(point)
    distance = [np.linalg.norm(planePointArr - np.array([x, y, z])) for x, y, z in dataPoint]
    res = [(Ri - dist) for dist in distance]
    return res
estimateCircle = [0, 0, 335]  # r, s and Ri
bestCircleFitValues, ier = optimize.leastsq(residualsCircle, estimateCircle,args=(dataTupel))
rF,sF,RiF = bestCircleFitValues
print bestCircleFitValues
# Synthetic Data
centerPointArr=sF*sArr + rF*rArr + np.array(point)
synthetic=[list(centerPointArr+ RiF*cos(phi)*rArr+RiF*sin(phi)*sArr) for phi in np.linspace(0, 2*pi,50)]
[cxTupel,cyTupel,czTupel]=[ x for x in zip(*synthetic)]
### Plotting
d = -np.dot(np.array(point),np.array(normal))# dot product
# create x,y mesh
xx, yy = np.meshgrid(np.linspace(2000,2200,10), np.linspace(540,740,10))
# calculate corresponding z
# Note: does not work if normal vector is without z-component
z = (-normal[0]*xx - normal[1]*yy - d)/normal[2]
# plot the surface, data, and synthetic circle
fig = plt.figure()
ax = fig.add_subplot(211, projection='3d')
ax.scatter(xs, ys, zs, c='b', marker='o')
ax.plot_wireframe(xx,yy,z)
ax.set_xlabel('X Label')
ax.set_ylabel('Y Label')
ax.set_zlabel('Z Label')
bx = fig.add_subplot(212, projection='3d')
bx.scatter(xs, ys, zs, c='b', marker='o')
bx.scatter(cxTupel,cyTupel,czTupel, c='r', marker='o')
bx.set_xlabel('X Label')
bx.set_ylabel('Y Label')
bx.set_zlabel('Z Label')
plt.show()
which gives a radius of 245. This is close to what the other approach gave (249). So within error margins I get the same.
The plotted result looks reasonable.
Hope this helps.
It feels like you missed some constraints in your first version of the code. The implementation effectively fits a sphere to the 3D points, which is why the radius for the second data set comes out as an almost straight line: the fit behaves as if you had given it a small arc on a very large sphere.

Bézier curve fitting with SciPy

I have a set of points which approximate a 2D curve. I would like to use Python with numpy and scipy to find a cubic Bézier path which approximately fits the points, where I specify the exact coordinates of two endpoints, and it returns the coordinates of the other two control points.
I initially thought scipy.interpolate.splprep() might do what I want, but it seems to force the curve to pass through each one of the data points (as I suppose you would want for interpolation). I'll assume that I was on the wrong track with that.
My question is similar to this one: How can I fit a Bézier curve to a set of data?, except that they said they didn't want to use numpy. My preference would be to find what I need already implemented somewhere in scipy or numpy. Otherwise, I plan to implement the algorithm linked from one of the answers to that question, using numpy: An algorithm for automatically fitting digitized curves (pdf.page 622).
Thank you for any suggestions!
Edit: I understand that a cubic Bézier curve is not guaranteed to pass through all the points; I want one which passes through two given endpoints, and which is as close as possible to the specified interior points.
Here's a way to do Bezier curves with numpy:
import numpy as np
from scipy.special import comb

def bernstein_poly(i, n, t):
    """
    The Bernstein polynomial of n, i as a function of t
    """
    return comb(n, i) * (t**(n-i)) * (1 - t)**i

def bezier_curve(points, nTimes=1000):
    """
    Given a set of control points, return the
    bezier curve defined by the control points.

    points should be a list of lists, or list of tuples
    such as [ [1,1],
              [2,3],
              [4,5], ..[Xn, Yn] ]
    nTimes is the number of time steps, defaults to 1000

    See http://processingjs.nihongoresources.com/bezierinfo/
    """
    nPoints = len(points)
    xPoints = np.array([p[0] for p in points])
    yPoints = np.array([p[1] for p in points])

    t = np.linspace(0.0, 1.0, nTimes)

    polynomial_array = np.array([bernstein_poly(i, nPoints-1, t) for i in range(0, nPoints)])

    xvals = np.dot(xPoints, polynomial_array)
    yvals = np.dot(yPoints, polynomial_array)

    return xvals, yvals

if __name__ == "__main__":
    from matplotlib import pyplot as plt

    nPoints = 4
    points = np.random.rand(nPoints, 2)*200
    xpoints = [p[0] for p in points]
    ypoints = [p[1] for p in points]

    xvals, yvals = bezier_curve(points, nTimes=1000)
    plt.plot(xvals, yvals)
    plt.plot(xpoints, ypoints, "ro")
    for nr in range(len(points)):
        plt.text(points[nr][0], points[nr][1], nr)

    plt.show()
Here is a piece of python code for fitting points:
'''least square qbezier fit using penrose pseudoinverse
>>> V=array
>>> E, W, N, S = V((1,0)), V((-1,0)), V((0,1)), V((0,-1))
>>> cw = 100
>>> ch = 300
>>> cpb = V((0, 0))
>>> cpe = V((cw, 0))
>>> xys=[cpb,cpb+ch*N+E*cw/8,cpe+ch*N+E*cw/8, cpe]
>>>
>>> ts = V(range(11), dtype='float')/10
>>> M = bezierM (ts)
>>> points = M*xys #produces the points on the bezier curve at t in ts
>>>
>>> control_points=lsqfit(points, M)
>>> linalg.norm(control_points-xys)<10e-5
True
>>> control_points.tolist()[1]
[12.500000000000037, 300.00000000000017]
'''
from numpy import array, linalg, matrix
from scipy.special import comb as nOk

Mtk = lambda n, t, k: t**k * (1-t)**(n-k) * nOk(n, k)
bezierM = lambda ts: matrix([[Mtk(3, t, k) for k in range(4)] for t in ts])

def lsqfit(points, M):
    M_ = linalg.pinv(M)
    return M_ * points
Generally on bezier curves check out
Animated bezier and
bezierinfo
Resulting Plot
Building upon the answers from @reptilicus and @Guillaume P., here is the complete code to:
Get the Bezier Parameters i.e. the control points from a list of points.
Create the Bezier Curve from the Bezier Parameters i.e. the control points.
Plot the original points, the control points and the resulting Bezier Curve.
Getting the Bezier parameters, i.e. the control points, from a set of X,Y points or coordinates. The other parameter needed is the degree of the approximation, and the resulting number of control points will be (degree + 1).
import numpy as np
from scipy.special import comb

def get_bezier_parameters(X, Y, degree=3):
    """ Least square qbezier fit using penrose pseudoinverse.

    Parameters:

    X: array of x data.
    Y: array of y data. Y[0] is the y point for X[0].
    degree: degree of the Bézier curve. 2 for quadratic, 3 for cubic.

    Based on https://stackoverflow.com/questions/12643079/b%C3%A9zier-curve-fitting-with-scipy
    and probably on the 1998 thesis by Tim Andrew Pastva, "Bézier Curve Fitting".
    """
    if degree < 1:
        raise ValueError('degree must be 1 or greater.')

    if len(X) != len(Y):
        raise ValueError('X and Y must be of the same length.')

    if len(X) < degree + 1:
        raise ValueError(f'There must be at least {degree + 1} points to '
                         f'determine the parameters of a degree {degree} curve. '
                         f'Got only {len(X)} points.')

    def bpoly(n, t, k):
        """ Bernstein polynomial when a = 0 and b = 1. """
        return t ** k * (1 - t) ** (n - k) * comb(n, k)
        #return comb(n, i) * ( t**(n-i) ) * (1 - t)**i

    def bmatrix(T):
        """ Bernstein matrix for Bézier curves. """
        return np.matrix([[bpoly(degree, t, k) for k in range(degree + 1)] for t in T])

    def least_square_fit(points, M):
        M_ = np.linalg.pinv(M)
        return M_ * points

    T = np.linspace(0, 1, len(X))
    M = bmatrix(T)
    points = np.array(list(zip(X, Y)))

    final = least_square_fit(points, M).tolist()
    final[0] = [X[0], Y[0]]
    final[len(final)-1] = [X[len(X)-1], Y[len(Y)-1]]
    return final
Create the Bezier curve given the Bezier Parameters i.e. control points.
def bernstein_poly(i, n, t):
    """
    The Bernstein polynomial of n, i as a function of t
    """
    return comb(n, i) * (t**(n-i)) * (1 - t)**i

def bezier_curve(points, nTimes=50):
    """
    Given a set of control points, return the
    bezier curve defined by the control points.

    points should be a list of lists, or list of tuples
    such as [ [1,1],
              [2,3],
              [4,5], ..[Xn, Yn] ]
    nTimes is the number of time steps, defaults to 50

    See http://processingjs.nihongoresources.com/bezierinfo/
    """
    nPoints = len(points)
    xPoints = np.array([p[0] for p in points])
    yPoints = np.array([p[1] for p in points])

    t = np.linspace(0.0, 1.0, nTimes)

    polynomial_array = np.array([bernstein_poly(i, nPoints-1, t) for i in range(0, nPoints)])

    xvals = np.dot(xPoints, polynomial_array)
    yvals = np.dot(yPoints, polynomial_array)

    return xvals, yvals
Sample data used (can be replaced with any data, this is GPS data).
points = []
xpoints = [19.21270, 19.21269, 19.21268, 19.21266, 19.21264, 19.21263, 19.21261, 19.21261, 19.21264, 19.21268,19.21274, 19.21282, 19.21290, 19.21299, 19.21307, 19.21316, 19.21324, 19.21333, 19.21342]
ypoints = [-100.14895, -100.14885, -100.14875, -100.14865, -100.14855, -100.14847, -100.14840, -100.14832, -100.14827, -100.14823, -100.14818, -100.14818, -100.14818, -100.14818, -100.14819, -100.14819, -100.14819, -100.14820, -100.14820]
for i in range(len(xpoints)):
    points.append([xpoints[i], ypoints[i]])
Plot the original points, the control points and the resulting Bezier Curve.
import matplotlib.pyplot as plt
# Plot the original points
plt.plot(xpoints, ypoints, "ro",label='Original Points')
# Get the Bezier parameters based on a degree.
data = get_bezier_parameters(xpoints, ypoints, degree=4)
x_val = [x[0] for x in data]
y_val = [x[1] for x in data]
print(data)
# Plot the control points
plt.plot(x_val,y_val,'k--o', label='Control Points')
# Plot the resulting Bezier curve
xvals, yvals = bezier_curve(data, nTimes=1000)
plt.plot(xvals, yvals, 'b-', label='B Curve')
plt.legend()
plt.show()
@keynesiancross asked for "comments in [Roland's] code as to what the variables are" and others completely missed the stated problem. Roland started with a Bézier curve as input (to get a perfect match), which made it harder to understand both the problem and (at least for me) the solution. The difference from interpolation is easier to see for input that leaves residuals. Here is both paraphrased code and non-Bézier input -- and an unexpected outcome.
import matplotlib.pyplot as plt
import numpy as np
from scipy.special import comb as n_over_k
Mtk = lambda n, t, k: t**k * (1-t)**(n-k) * n_over_k(n,k)
BézierCoeff = lambda ts: [[Mtk(3,t,k) for k in range(4)] for t in ts]
fcn = np.log
tPlot = np.linspace(0. ,1. , 81)
xPlot = np.linspace(0.1,2.5, 81)
tData = tPlot[0:81:10]
xData = xPlot[0:81:10]
data = np.column_stack((xData, fcn(xData))) # shapes (9,2)
Pseudoinverse = np.linalg.pinv(BézierCoeff(tData)) # (9,4) -> (4,9)
control_points = Pseudoinverse.dot(data) # (4,9)*(9,2) -> (4,2)
Bézier = np.array(BézierCoeff(tPlot)).dot(control_points)
residuum = fcn(Bézier[:,0]) - Bézier[:,1]
fig, ax = plt.subplots()
ax.plot(xPlot, fcn(xPlot), 'r-')
ax.plot(xData, data[:,1], 'ro', label='input')
ax.plot(Bézier[:,0],
Bézier[:,1], 'k-', label='fit')
ax.plot(xPlot, 10.*residuum, 'b-', label='10*residuum')
ax.plot(control_points[:,0],
control_points[:,1], 'ko:', fillstyle='none')
ax.legend()
fig.show()
This works well for fcn = np.cos but not for log. I kind of expected that the fit would use the t-component of the control points as additional degrees of freedom, as we would do by dragging the control points:
manual_points = np.array([[0.1,np.log(.1)],[.27,-.6],[.82,.23],[2.5,np.log(2.5)]])
Bézier = np.array(BézierCoeff(tPlot)).dot(manual_points)
residuum = fcn(Bézier[:,0]) - Bézier[:,1]
fig, ax = plt.subplots()
ax.plot(xPlot, fcn(xPlot), 'r-')
ax.plot(xData, data[:,1], 'ro', label='input')
ax.plot(Bézier[:,0],
Bézier[:,1], 'k-', label='fit')
ax.plot(xPlot, 10.*residuum, 'b-', label='10*residuum')
ax.plot(manual_points[:,0],
manual_points[:,1], 'ko:', fillstyle='none')
ax.legend()
fig.show()
The cause of failure, I guess, is that the norm measures the distance between points on the curves instead of the distance between a point on one curve to the nearest point on the other curve.
Short answer: you don't, because that's not how Bezier curves work. Longer answer: have a look at Catmull-Rom splines instead. They're pretty easy to form (the tangent vector at any point P, barring start and end, is parallel to the line {P-1, P+1}, so they're easy to program, too) and always pass through the points that define them, unlike Bezier curves, which interpolate "somewhere" inside the convex hull set up by all the control points.
A Bezier curve isn't guaranteed to pass through every point you supply it with; control points are arbitrary (in the sense that there is no specific algorithm for finding them, you simply choose them yourself) and only pull the curve in a direction.
If you want a curve which will pass through every point you supply it with, you need something like a natural cubic spline, and due to the limitations of those (you must supply them with increasing x co-ordinates, or it tends to infinity), you'll probably want a parametric natural cubic spline.
There are nice tutorials here:
Cubic Splines
Parametric Cubic Splines
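As a rough sketch of that parametric idea with scipy (the sample points and the parametrisation by cumulative chord length are my own choices):

import numpy as np
from scipy.interpolate import CubicSpline

pts = np.array([[0, 0], [1, 2], [3, 3], [4, 1], [6, 2]], dtype=float)  # example data
d = np.r_[0, np.cumsum(np.linalg.norm(np.diff(pts, axis=0), axis=1))]
t = d / d[-1]                     # parameter in [0, 1], proportional to chord length

sx = CubicSpline(t, pts[:, 0], bc_type='natural')
sy = CubicSpline(t, pts[:, 1], bc_type='natural')

tt = np.linspace(0, 1, 200)
curve = np.column_stack([sx(tt), sy(tt)])   # passes through every input point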
I had the same problem as detailed in the question. I took the code provided by Roland Puntaier and was able to make it work. Here:
import numpy as np
from scipy.special import comb

def get_bezier_parameters(X, Y, degree=2):
    """ Least square qbezier fit using penrose pseudoinverse.

    Parameters:

    X: array of x data.
    Y: array of y data. Y[0] is the y point for X[0].
    degree: degree of the Bézier curve. 2 for quadratic, 3 for cubic.

    Based on https://stackoverflow.com/questions/12643079/b%C3%A9zier-curve-fitting-with-scipy
    and probably on the 1998 thesis by Tim Andrew Pastva, "Bézier Curve Fitting".
    """
    if degree < 1:
        raise ValueError('degree must be 1 or greater.')

    if len(X) != len(Y):
        raise ValueError('X and Y must be of the same length.')

    if len(X) < degree + 1:
        raise ValueError(f'There must be at least {degree + 1} points to '
                         f'determine the parameters of a degree {degree} curve. '
                         f'Got only {len(X)} points.')

    def bpoly(n, t, k):
        """ Bernstein polynomial when a = 0 and b = 1. """
        return t ** k * (1 - t) ** (n - k) * comb(n, k)

    def bmatrix(T):
        """ Bernstein matrix for Bézier curves. """
        return np.matrix([[bpoly(degree, t, k) for k in range(degree + 1)] for t in T])

    def least_square_fit(points, M):
        M_ = np.linalg.pinv(M)
        return M_ * points

    T = np.linspace(0, 1, len(X))
    M = bmatrix(T)
    points = np.array(list(zip(X, Y)))
    return least_square_fit(points, M).tolist()
To fix the end points of the curve, ignore the first and last parameter returned by the function and use your own points.
What Mike Kamermans said is true, but I also wanted to point out that, as far as I know, catmull-rom splines can be defined in terms of cubic beziers. So, if you only have a library that works with cubics, you should still be able to do catmull-rom splines:
http://schepers.cc/getting-to-the-point
https://github.com/DmitryBaranovskiy/raphael/blob/d8fbe4be81d362837f95e33886b80fb41de443b4/dev/raphael.core.js#L1021
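For reference, a minimal sketch of the standard (uniform) Catmull-Rom to cubic Bézier conversion for one segment (the function name is mine):

import numpy as np

def catmull_rom_to_bezier(P0, P1, P2, P3):
    """Cubic Bézier control points for the Catmull-Rom segment from P1 to P2."""
    P0, P1, P2, P3 = map(np.asarray, (P0, P1, P2, P3))
    C1 = P1 + (P2 - P0) / 6.0
    C2 = P2 - (P3 - P1) / 6.0
    return P1, C1, C2, P2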
