How do I efficiently find the set of points within a circle of a given radius and centre from a sorted numpy array of equally spaced points?
For example, this is my code and how I currently extract those points within the radius.
import numpy as np
n_points = 10000
x_lim = [0, 100]
y_lim = [0, 100]
x, y = np.meshgrid(np.linspace(*x_lim, n_points), np.linspace(*y_lim, n_points))
xy = np.vstack((x.flatten(), y.flatten())).T
# Current approach
radius = 5
point = np.array([50, 35], dtype=float)
# Indexes of those points within a circle of radius centered at point
idxs = np.linalg.norm(point - xy, axis=-1) < radius
points_within_circle = xy[idxs]
How do I calculate these indexes more efficiently? I imagine that because the array is structured and has a fixed distance between neighbouring points, I should be able to exploit this to eliminate most of the checks.
One of the most important tricks that people forget is that it is a lot faster to compare distance**2 with radius**2 than to compute the distance and compare it with the radius, because this skips the square root. Your centre is not at the origin, so subtract it first: calculate (x - 50)**2 + (y - 35)**2 and compare it to 25.
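To exploit the regular grid as well, one option is to cut out the bounding box of the circle with a binary search on each sorted axis and apply the squared-distance test only inside that box. The following is a minimal sketch under that assumption; the function name `circle_indices_on_grid` is hypothetical, and the flat indices it returns assume `xy` was built exactly as in the question (default `np.meshgrid` indexing, then `flatten`).

```python
import numpy as np

def circle_indices_on_grid(x_lim, y_lim, n_points, centre, radius):
    """Return flat indices into the question's `xy` and the matching points."""
    xs = np.linspace(*x_lim, n_points)
    ys = np.linspace(*y_lim, n_points)
    cx, cy = centre

    # The axes are sorted, so a binary search gives the bounding-box slice.
    ix0, ix1 = np.searchsorted(xs, [cx - radius, cx + radius])
    iy0, iy1 = np.searchsorted(ys, [cy - radius, cy + radius])

    # Only the small sub-grid inside the bounding box is tested.
    sub_x, sub_y = np.meshgrid(xs[ix0:ix1], ys[iy0:iy1])
    mask = (sub_x - cx) ** 2 + (sub_y - cy) ** 2 < radius ** 2

    # Rows of the mask run along y, columns along x (default meshgrid layout).
    iy, ix = np.nonzero(mask)
    flat_idx = (iy + iy0) * n_points + (ix + ix0)
    points = np.column_stack((sub_x[mask], sub_y[mask]))
    return flat_idx, points

# Small grid so the example runs quickly
flat_idx, pts = circle_indices_on_grid([0, 100], [0, 100], 200, (50.0, 35.0), 5.0)
print(len(flat_idx), "points inside the circle")
```

The full `n_points**2` array is never built here, so the cost depends on the number of grid points inside the bounding box rather than on the size of the whole grid.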
I am trying to sample around 1000 points from a 3-D ellipsoid, uniformly. Is there some way to code it such that we can get points starting from the equation of the ellipsoid?
I want points on the surface of the ellipsoid.
Theory
Using this excellent answer to the MSE question How to generate points uniformly distributed on the surface of an ellipsoid? we can
generate a point uniformly on the sphere, apply the mapping f :
(x,y,z) -> (x'=ax,y'=by,z'=cz) and then correct the distortion
created by the map by discarding the point randomly with some
probability p(x,y,z).
Assuming that the 3 axes of the ellipsoid are named such that
0 < a < b < c
We discard a generated point with
p(x,y,z) = 1 - mu(x,y,z)/mu_max
probability, i.e. we keep it with probability mu(x,y,z)/mu_max, where
mu(x,y,z) = ((acy)^2 + (abz)^2 + (bcx)^2)^0.5
and
mu_max = bc
Implementation
import numpy as np
np.random.seed(42)
# Function to generate a random point on a uniform sphere
# (relying on https://stackoverflow.com/a/33977530/8565438)
def randompoint(ndim=3):
    vec = np.random.randn(ndim,1)
    vec /= np.linalg.norm(vec, axis=0)
    return vec
# Give the length of each axis (example values):
a, b, c = 1, 2, 4
# Function to scale up generated points using the function `f` mentioned above:
f = lambda x,y,z : np.multiply(np.array([a,b,c]),np.array([x,y,z]))
# Keep the point with probability `mu(x,y,z)/mu_max`, ie
def keep(x, y, z, a=a, b=b, c=c):
    mu_xyz = ((a * c * y) ** 2 + (a * b * z) ** 2 + (b * c * x) ** 2) ** 0.5
    return mu_xyz / (b * c) > np.random.uniform(low=0.0, high=1.0)
# Generate points until we have, let's say, 1000 points:
n = 1000
points = []
while len(points) < n:
    [x], [y], [z] = randompoint()
    if keep(x, y, z):
        points.append(f(x, y, z))
Checks
Check if all points generated satisfy the ellipsoid condition (ie that x^2/a^2 + y^2/b^2 + z^2/c^2 = 1):
for p in points:
    pscaled = np.multiply(p,np.array([1/a,1/b,1/c]))
    assert np.allclose(np.sum(np.dot(pscaled,pscaled)),1)
Runs without raising any errors. Visualize the points:
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(projection="3d")
points = np.array(points)
ax.scatter(points[:, 0], points[:, 1], points[:, 2])
# set aspect ratio for the axes using https://stackoverflow.com/a/64453375/8565438
ax.set_box_aspect((np.ptp(points[:, 0]), np.ptp(points[:, 1]), np.ptp(points[:, 2])))
plt.show()
These points seem evenly distributed.
Problem with currently accepted answer
Generating a point on a sphere and then just reprojecting it onto an ellipsoid without any further correction will result in a distorted distribution. This is essentially the same as setting this post's p(x,y,z) to 0. Imagine an ellipsoid where one axis is orders of magnitude bigger than another: in that case it is easy to see that naive reprojection is not going to work.
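The distortion can be made visible with a quick numerical comparison. This is only an illustrative sketch: the axis lengths, the sample size and the helper `tip_fraction` are assumptions chosen for the demonstration, and `mu_max` is taken as the largest of ab, ac and bc so that the axes need not be ordered. It counts how many points land in the outer halves of the long axis (|z| > c/2) with and without the rejection step.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, c = 1.0, 1.0, 10.0  # example values: strongly elongated along z

# Uniform points on the unit sphere (normalised Gaussian vectors)
v = rng.standard_normal((20000, 3))
v /= np.linalg.norm(v, axis=1, keepdims=True)
x, y, z = v.T

# Naive approach: scale the sphere points and keep all of them
naive = v * np.array([a, b, c])

# Corrected approach: keep each point with probability mu / mu_max
mu = np.sqrt((a * c * y) ** 2 + (a * b * z) ** 2 + (b * c * x) ** 2)
mu_max = max(a * b, a * c, b * c)
keep = mu / mu_max > rng.random(len(mu))
corrected = naive[keep]

def tip_fraction(points):
    """Fraction of points in the outer halves of the long axis."""
    return np.mean(np.abs(points[:, 2]) > c / 2)

frac_naive = tip_fraction(naive)
frac_corrected = tip_fraction(corrected)
print(f"naive: {frac_naive:.3f}, corrected: {frac_corrected:.3f}")
```

With naive scaling, z is uniform on [-c, c], so about half of the points end up in the two tips even though the tips carry less than half of the surface area. The rejection step lowers that fraction.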
Consider using Monte-Carlo simulation: generate a random 3D point; check if the point is inside the ellipsoid; if it is, keep it. Repeat until you get 1,000 points.
P.S. Since the OP changed their question, this answer is no longer valid.
J.F. Williamson, "Random selection of points distributed on curved surfaces", Physics in Medicine & Biology 32(10), 1987, describes a general method of choosing a uniformly random point on a parametric surface. It is an acceptance/rejection method that accepts or rejects each candidate point depending on its stretch factor (norm-of-gradient). To use this method for a parametric surface, several things have to be known about the surface, namely:
x(u, v), y(u, v) and z(u, v), which are functions that generate 3-dimensional coordinates from two dimensional coordinates u and v,
The ranges of u and v,
g(point), the norm of the gradient ("stretch factor") at each point on the surface, and
gmax, the maximum value of g for the entire surface.
The algorithm is then:
Generate a point on the surface, xyz.
If g(xyz) >= RNDU01()*gmax, where RNDU01() is a uniform random variate in [0, 1), accept the point. Otherwise, repeat this process.
Chen and Glotzer (2007) apply the method to the surface of a prolate spheroid (one form of ellipsoid) in "Simulation studies of a phenomenological model for elongated virus capsid formation", Physical Review E 75(5), 051504 (preprint).
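The acceptance/rejection loop described above might be sketched as follows. Several things here are assumptions made for illustration and are not taken from the paper: candidates are generated by drawing u and v uniformly from their ranges, the function name `sample_parametric_surface` is hypothetical, and a torus is used as the example surface because its stretch factor r*(R + r*cos(v)) has a simple closed form.

```python
import numpy as np

def sample_parametric_surface(xyz, g, gmax, u_range, v_range, n, rng=None):
    """Draw n points, accepting a candidate when g(u, v) >= RNDU01() * gmax."""
    rng = np.random.default_rng() if rng is None else rng
    accepted, count = [], 0
    while count < n:
        m = 2 * (n - count) + 16           # batch of candidates
        u = rng.uniform(*u_range, size=m)  # assumed: uniform in parameter space
        v = rng.uniform(*v_range, size=m)
        ok = g(u, v) >= rng.random(m) * gmax
        pts = xyz(u[ok], v[ok])
        accepted.append(pts)
        count += len(pts)
    return np.concatenate(accepted)[:n]

# Example surface: torus with major radius R and minor radius r
R, r = 3.0, 1.0

def torus_xyz(u, v):
    ring = R + r * np.cos(v)
    return np.column_stack((ring * np.cos(u), ring * np.sin(u), r * np.sin(v)))

def torus_g(u, v):
    return r * (R + r * np.cos(v))  # stretch factor of the parametrisation

torus_points = sample_parametric_surface(
    torus_xyz, torus_g, r * (R + r),
    (0.0, 2 * np.pi), (0.0, 2 * np.pi),
    n=1000, rng=np.random.default_rng(0),
)
```

For an ellipsoid, the same function could be reused by passing the ellipsoid's own parametrisation, stretch factor and maximum.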
Here is a generic function to pick a random point on the surface of a sphere, spheroid or any triaxial ellipsoid with semi-axes a, b and c. Note that drawing both angles uniformly does not give a uniform distribution and over-populates the regions near the poles on the z axis. Instead, phi is obtained as the arccosine of a uniformly generated cos(phi). Be aware that the result is exactly uniform in area only for a sphere (a = b = c); for unequal axes the scaling stretches the density.
import numpy as np
def random_point_ellipsoid(a,b,c):
    u = np.random.rand()
    v = np.random.rand()
    theta = u * 2.0 * np.pi
    phi = np.arccos(2.0 * v - 1.0)
    sinTheta = np.sin(theta)
    cosTheta = np.cos(theta)
    sinPhi = np.sin(phi)
    cosPhi = np.cos(phi)
    rx = a * sinPhi * cosTheta
    ry = b * sinPhi * sinTheta
    rz = c * cosPhi
    return rx, ry, rz
This function is adapted from this post: https://karthikkaranth.me/blog/generating-random-points-in-a-sphere/
One way of doing this which generalises to any shape or surface is to convert the surface to a voxel representation at arbitrarily high resolution (the higher the resolution the better, but also the slower). Then you can easily select the voxels randomly however you want, and then you can select a point on the surface within the voxel using the parametric equation. The voxel selection should be completely unbiased, and the selection of the point within the voxel will suffer the same biases that come from using the parametric equation, but if there are enough voxels then the size of these biases will be very small.
You need high-quality cube intersection code, but with something like an ellipsoid that can be optimised quite easily. I'd suggest stepping through the bounding box subdivided into voxels. A quick distance check will eliminate most cubes, and you can do a proper intersection check for the ones where an intersection is possible. For the point within the cube I'd be tempted to do something simple like a random XYZ offset from the cube's centre and then cast a ray from the centre of the ellipsoid through it; the selected point is where the ray intersects the surface. As I said above, it will be biased, but with small voxels the bias will probably be small enough.
There are libraries that do convex shape intersection very efficiently, and cube/ellipsoid will be one of the options. They will be highly optimised, but I think the distance culling would probably be worth doing by hand regardless. And you will need a library that differentiates between a surface intersection and one object being totally inside the other.
And if you know your ellipsoid is aligned to an axis then you can do the voxel/edge intersection very easily as a stack of 2D square/ellipse intersection problems, with the set of squares to be tested defined as those that are adjacent to those in the layer above. That might be quicker.
One of the things that makes this approach more manageable is that you do not need to write all the code for edge cases (it is a lot of work to get around issues with floating point inaccuracies that can lead to missing or doubled voxels at the intersection). That's because these will be very rare, so they won't affect your sampling.
It might even be quicker to simply find all the voxels inside the ellipsoid and then throw away all the voxels with 6 neighbours... Lots of options. It all depends how important performance is. This will be much slower than the other suggestions, but if you want ~1000 points then ~100,000 voxels feels about the minimum for the surface, so you probably need ~1,000,000 voxels in your bounding box. However, even testing 1,000,000 intersections is pretty fast on modern computers.
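A rough sketch of this idea for an axis-aligned ellipsoid centred at the origin is given below. It makes simplifying assumptions that are not part of the description above: a voxel counts as a surface voxel when its eight corners are neither all inside nor all outside (a cheap stand-in for a proper cube/ellipsoid intersection test, which can miss voxels that the surface only grazes), and the function name `sample_ellipsoid_voxels` and the default resolution are made up for illustration.

```python
import numpy as np

def sample_ellipsoid_voxels(a, b, c, n_samples, n_vox=64, rng=None):
    """Approximate surface sampling via randomly chosen surface voxels."""
    rng = np.random.default_rng() if rng is None else rng
    axes = np.array([a, b, c], dtype=float)

    # Corner coordinates of an n_vox^3 voxel grid covering the bounding box
    edges = [np.linspace(-s, s, n_vox + 1) for s in axes]
    gx, gy, gz = np.meshgrid(*edges, indexing="ij")
    inside = (gx / a) ** 2 + (gy / b) ** 2 + (gz / c) ** 2 < 1.0

    # Count, for every voxel, how many of its 8 corners are inside
    n_inside = np.stack([
        inside[i:i + n_vox, j:j + n_vox, k:k + n_vox]
        for i in (0, 1) for j in (0, 1) for k in (0, 1)
    ]).sum(axis=0)
    surface_voxels = np.argwhere((n_inside > 0) & (n_inside < 8))

    # Pick voxels uniformly, then a random point inside each chosen voxel
    chosen = surface_voxels[rng.integers(0, len(surface_voxels), size=n_samples)]
    voxel_size = 2 * axes / n_vox
    pts = -axes + (chosen + rng.random((n_samples, 3))) * voxel_size

    # Cast a ray from the ellipsoid centre through the point onto the surface
    scale = np.sqrt(((pts / axes) ** 2).sum(axis=1))
    return pts / scale[:, None]

voxel_points = sample_ellipsoid_voxels(1.0, 2.0, 4.0, 1000,
                                       rng=np.random.default_rng(0))
```

Choosing every surface voxel with equal probability ignores that the patch of surface inside each voxel varies in area, so the result is only approximately uniform; raising `n_vox` should reduce the in-voxel bias but not that effect.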
Depending on what "uniformly" refers to, different methods are applicable. In any case, we can use the parametric equations using spherical coordinates (from Wikipedia):
x = a*s*sin(theta)*cos(phi)
y = b*s*sin(theta)*sin(phi)
z = c*s*cos(theta)
where s = 1 refers to the ellipsoid given by the semi-axes a > b > c. From these equations we can derive the relevant volume/area element and generate points such that their probability of being generated is proportional to that volume/area element. This will provide constant volume/area density across the surface of the ellipsoid.
1. Constant volume density
This method generates points on the surface of an ellipsoid such that their volume density across the surface of the ellipsoid is constant. A consequence of this is that the one-dimensional projections (i.e. the x, y, z coordinates) are uniformly distributed; for details see the plot below.
The volume element for a triaxial ellipsoid is given by (see here):
dV = a*b*c*s^2*sin(theta) ds dtheta dphi
and is thus proportional to sin(theta) (for 0 <= theta <= pi). We can use this as the basis for a probability distribution that indicates "how many" points should be generated for a given value of theta: where the volume density is low/high, the probability for generating a corresponding value of theta should be low/high, too.
Hence, we can use the function f(theta) = sin(theta)/2 as our probability distribution on the interval [0, pi]. The corresponding cumulative distribution function is F(theta) = (1 - cos(theta))/2. Now we can use Inverse transform sampling to generate values of theta according to f(theta) from a uniform random distribution. The values of phi can be obtained directly from a uniform distribution on [0, 2*pi].
Example code:
import matplotlib.pyplot as plt
import numpy as np
from numpy import sin, cos, pi
rng = np.random.default_rng(seed=0)
a, b, c = 10, 3, 1
N = 5000
phi = rng.uniform(0, 2*pi, size=N)
theta = np.arccos(1 - 2*rng.random(size=N))
x = a*sin(theta)*cos(phi)
y = b*sin(theta)*sin(phi)
z = c*cos(theta)
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(x, y, z, s=2)
plt.show()
which produces the following plot:
The following plot shows the one-dimensional projections (i.e. density plots of x, y, z):
import seaborn as sns
sns.kdeplot(data=dict(x=x, y=y, z=z))
plt.show()
2. Constant area density
This method generates points on the surface of an ellipsoid such that their area density is constant across the surface of the ellipsoid.
Again, we start by calculating the corresponding area element. For simplicity we can use SymPy:
from sympy import cos, sin, symbols, Matrix
a, b, c, t, p = symbols('a b c t p')
x = a*sin(t)*cos(p)
y = b*sin(t)*sin(p)
z = c*cos(t)
J = Matrix([
    [x.diff(t), x.diff(p)],
    [y.diff(t), y.diff(p)],
    [z.diff(t), z.diff(p)],
])
print((J.T @ J).det().simplify())
This yields
-a**2*b**2*sin(t)**4 + a**2*b**2*sin(t)**2 + a**2*c**2*sin(p)**2*sin(t)**4 - b**2*c**2*sin(p)**2*sin(t)**4 + b**2*c**2*sin(t)**4
and further simplifies to (dividing by (a*b)**2 and taking the sqrt):
sin(t)*np.sqrt(1 + ((c/b)**2*sin(p)**2 + (c/a)**2*cos(p)**2 - 1)*sin(t)**2)
Since for this case the area element is more complex, we can use rejection sampling:
import matplotlib.pyplot as plt
import numpy as np
from numpy import cos, sin
def f_redo(t, p):
    return (
        sin(t)*np.sqrt(1 + ((c/b)**2*sin(p)**2 + (c/a)**2*cos(p)**2 - 1)*sin(t)**2)
        < rng.random(size=t.size)
    )
rng = np.random.default_rng(seed=0)
N = 5000
a, b, c = 10, 3, 1
t = rng.uniform(0, np.pi, size=N)
p = rng.uniform(0, 2*np.pi, size=N)
redo = f_redo(t, p)
while redo.any():
    t[redo] = rng.uniform(0, np.pi, size=redo.sum())
    p[redo] = rng.uniform(0, 2*np.pi, size=redo.sum())
    redo[redo] = f_redo(t[redo], p[redo])
x = a*np.sin(t)*np.cos(p)
y = b*np.sin(t)*np.sin(p)
z = c*np.cos(t)
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.scatter(x, y, z, s=2)
plt.show()
which yields the following distribution:
The following plot shows the corresponding one-dimensional projections (x, y, z):
I have measurements (PPI arc scans) taken with a Doppler wind lidar. The data is stored in a pandas dataframe where rows represent azimuth angle and columns represent radial distance (input shape = 30x197). Link to example scan, (csv). I want to transform this to a Cartesian coordinate system and output a 2D array which is re-gridded into x,y coordinates instead of polar, with the values stored in the appropriate grid cell. Interpolation (nearest neighbor) is OK, and so is zero or NaN padding of areas where no data exists.
Ideally the X and Y grid should correspond to the actual distances between points, but right now I'm just trying to get this working. This shouldn’t be terribly difficult, but I’m having trouble obtaining the result I want.
So far, I have working code which plots on a polar axis beautifully (example image) but this won't work for the next steps of my analysis.
I have tried many different approaches with scipy.interpolate.griddata, scipy.ndimage.geometric_transform, and scipy.ndimage.map_coordinates but haven't gotten the correct output. Here is an example of my recent attempt (df_polar is the csv file linked):
# Generate polar and cartesian meshgrids
r = df_polar.columns
theta = df_polar.index
theta = np.deg2rad(theta)
# Polar meshgrid
rad_c, theta_c = np.meshgrid(r,theta)
# Cartesian meshgrid
X = rad_c * np.cos(theta_c)
Y = rad_c * np.sin(theta_c)
x,y = np.meshgrid(X,Y)
# Interpolate from polar to cartesian grid
new_grid = scipy.interpolate.griddata(
(rad_c.flatten(), theta_c.flatten()),
np.array(df_polar).flatten(), (x,y), method='nearest')
The result is not correct at all, and from reading the documentation and examples I don't understand why. I would greatly appreciate any tips on where I have gone wrong. Thanks a lot!!
I think you might be feeding griddata the wrong points. It wants cartesian points and if you want the values interpolated over a regular x/y grid you need to create one and provide that too.
Try this and let me know if it produces the expected result. It's hard for me to tell if this is what it should produce:
from scipy.interpolate import griddata
import pandas as pd
import numpy as np
df_polar = pd.read_csv('onescan.txt', index_col=0)
# Generate polar and cartesian meshgrids
r = pd.to_numeric(df_polar.columns)
theta = np.deg2rad(df_polar.index)
# Polar meshgrid
rad_c, theta_c = np.meshgrid(r, theta)
# Cartesian equivalents of polar co-ordinates
X = rad_c*np.cos(theta_c)
Y = rad_c*np.sin(theta_c)
# Cartesian (x/y) meshgrid
grid_spacing = 100.0 # You can change this
nx = (X.max() - X.min())/grid_spacing
ny = (Y.max() - Y.min())/grid_spacing
x = np.arange(X.min(), X.max() + grid_spacing, grid_spacing)
y = np.arange(Y.min(), Y.max() + grid_spacing, grid_spacing)
grid_x, grid_y = np.meshgrid(x, y)
# Interpolate from polar to cartesian grid
new_grid = griddata(
(X.flatten(), Y.flatten()),
df_polar.values.flatten(),
(grid_x, grid_y),
method='nearest'
)
The resulting values look something like this (with grid_spacing = 10 and flipping x and y):
import matplotlib.pyplot as plt
plt.imshow(new_grid.T, cmap='hot')
Clearly interpolate "nearest" needs taming...
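One possible way to tame it is to blank out grid cells that are too far from any measurement, so that the nearest-neighbour fill does not smear values across areas with no data. The sketch below is an assumption about what "taming" could look like, not part of the answer above: the function name `regrid_polar_nearest`, the `max_dist` parameter (defaulting to one grid spacing) and the synthetic scan are all made up for illustration.

```python
import numpy as np
from scipy.interpolate import griddata
from scipy.spatial import cKDTree

def regrid_polar_nearest(r, theta_deg, values, grid_spacing, max_dist=None):
    """Nearest-neighbour regridding with NaN where no data point is nearby."""
    rad_c, theta_c = np.meshgrid(r, np.deg2rad(theta_deg))
    X = (rad_c * np.cos(theta_c)).ravel()
    Y = (rad_c * np.sin(theta_c)).ravel()

    x = np.arange(X.min(), X.max() + grid_spacing, grid_spacing)
    y = np.arange(Y.min(), Y.max() + grid_spacing, grid_spacing)
    grid_x, grid_y = np.meshgrid(x, y)

    new_grid = griddata((X, Y), np.asarray(values, dtype=float).ravel(),
                        (grid_x, grid_y), method="nearest")

    # Distance from every grid cell to its nearest measurement
    if max_dist is None:
        max_dist = grid_spacing
    targets = np.column_stack((grid_x.ravel(), grid_y.ravel()))
    dist, _ = cKDTree(np.column_stack((X, Y))).query(targets)
    new_grid[dist.reshape(grid_x.shape) > max_dist] = np.nan
    return grid_x, grid_y, new_grid

# Synthetic quarter-circle scan: the value is simply the radial distance
r_demo = np.arange(100.0, 2000.0, 100.0)
theta_demo = np.arange(0.0, 91.0, 3.0)
values_demo = np.meshgrid(r_demo, theta_demo)[0]
gx_demo, gy_demo, grid_demo = regrid_polar_nearest(
    r_demo, theta_demo, values_demo, grid_spacing=100.0)
```

`plt.imshow` leaves NaN cells blank by default, so the scan keeps its arc shape instead of filling the whole rectangle.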
I will have a 3-d grid of points (defined by Cartesian vectors). For any given coordinate within the grid, I wish to find the 8 grid points making the cuboid which surrounds the given coordinate. I also need the distances between the vertices of the cuboid and the given coordinate. I have found a way of doing this for a meshgrid with regular spacings, but not for irregular spacings. I do not yet have an example of the irregularly spaced grid data, I just know that the algorithm will have to deal with them eventually. My solution for the regularly spaced points is based off of this post, Finding index of nearest point in numpy arrays of x and y coordinates and is as follows:
import scipy as sp
import scipy.spatial  # make sure the spatial submodule is loaded so sp.spatial works
import numpy as np
x, y, z = np.mgrid[0:5, 0:10, 0:20]
# Example 3-d grid of points.
b = np.dstack((x.ravel(), y.ravel(), z.ravel()))[0]
tree = sp.spatial.cKDTree(b)
example_coord = np.array([1.5, 3.5, 5.5])
d, i = tree.query((example_coord), 8)
# i being the indices of the closest grid points, d being their distance from the
# given coordinate, example_coord
b[i[0]], d[0]
# This gives one of the points of the surrounding cuboid and its distance from
# example_coord
I am looking to make this algorithm run as efficiently as possible as it will need to be run a lot. Thanks in advance for your help.
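If the irregular grid is still rectilinear, i.e. it is defined by three sorted 1-D coordinate vectors whose spacing may vary, the surrounding cuboid can be found with one binary search per axis instead of a k-d tree query. The following is a sketch under that assumption only (it does not apply to fully scattered points); the function name `surrounding_cuboid` and the example vectors are hypothetical.

```python
from itertools import product

import numpy as np

def surrounding_cuboid(gx, gy, gz, coord):
    """Indices, coordinates and distances of the 8 corners around coord."""
    axes = (np.asarray(gx), np.asarray(gy), np.asarray(gz))
    coord = np.asarray(coord, dtype=float)

    # Index of the grid line at or below coord on each axis, clipped so that
    # the upper neighbour always exists.
    lower = [
        int(np.clip(np.searchsorted(ax, c, side="right") - 1, 0, len(ax) - 2))
        for ax, c in zip(axes, coord)
    ]

    # The 8 corners are all combinations of lower/upper index per axis.
    corner_idx = np.array([
        [lower[0] + di, lower[1] + dj, lower[2] + dk]
        for di, dj, dk in product((0, 1), repeat=3)
    ])
    corners = np.column_stack([ax[corner_idx[:, d]] for d, ax in enumerate(axes)])
    distances = np.linalg.norm(corners - coord, axis=1)
    return corner_idx, corners, distances

# Example with irregular spacing along x and z
gx = np.array([0.0, 1.0, 2.5, 4.0])
gy = np.arange(10.0)
gz = np.array([0.0, 2.0, 3.0, 7.0, 20.0])
corner_idx, corners, distances = surrounding_cuboid(gx, gy, gz, [1.5, 3.5, 5.5])
```

Each lookup costs three binary searches, and unlike an 8-nearest-neighbour query it returns the enclosing cell even when the spacing is so uneven that the eight closest grid points do not form that cell.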
I want to draw curves between any two points in 3d space. The curve must be, umm, "vertical". I mean, x,y positions of the points of curve must be on the same line, but z values must change as if you sent a projectile from ground, it traveled in air, and hit the ground again. It does not need to be physically correct, an arc is OK.
This is the starting code:
import numpy as np
p1=np.array([1,1,1]) #x,y,z coordinates of the first point
p2=np.array([3,3,3]) #x,y,z coordinates of the second point
xi=np.linspace(p1[0],p2[0],100) #determine 100 x coordinates between two points
yi=np.linspace(p1[1],p2[1],100) #determine 100 y coordinates between two points
zi= ?? #determine 100 z coordinates between two points.
How can I determine those 100 z coordinates (zi)?
After determining zi it is trivial to draw lines between consecutive points(using mayavi or mplot3d) , giving the visual of a curve.
I ended up using scipy.interpolate to get the curve, and adding it to z coordinates of the line between points. As others said, there are more than one way to do this. This will be enough for my purpose.
### objective: draw an arc between points p1 and p2. z coordinates are raised.
import numpy as np
from scipy import interpolate
from mayavi import mlab
###inputs
p1=np.random.uniform(0,20,(3)) #first point
p2=np.random.uniform(0,20,(3)) #second point
npts = 100 # number of points to sample
y=np.array([0,.5,.75,.75,.5,0]) #describe your shape in 1d like this
amp=5 #curve height factor. bigger means higher
#get the adder. This will be used to raise the z coords
x=np.arange(y.size)
xnew = np.linspace(x[0],x[-1] , npts) #sample the x coord
tck = interpolate.splrep(x,y,s=0)
adder = interpolate.splev(xnew,tck,der=0)*amp
adder[0]=adder[-1]=0
adder=adder.reshape((-1,1))
#get a line between points
shape3=np.vstack([np.linspace(p1[dim],p2[dim],npts) for dim in range(3)]).T
#raise the z coordinate
shape3[:,-1]=shape3[:,-1]+adder[:,-1]
#plot
x,y,z=(shape3[:,dim] for dim in range(3))
mlab.points3d(x,y,z,color=(0,0,0))
mlab.plot3d(x,y,z,tube_radius=1)
mlab.outline()
mlab.axes()
mlab.show()
There isn't one right answer to this question because the curvature of the arc isn't constrained. The basis for the math for this problem is projectile motion, which gives you two key equations:
x_2 - x_1 = v_1 cos theta dt
z_2 - z_1 = -1/2 g dt^2 + v_1 sin theta dt
where v_1 is the initial velocity of the projectile, theta is the angle from horizontal that the projectile is shot at, dt is the time it takes for the projectile to go from point 1 to point 2, and g is the gravitational acceleration. This neglects y for now for simplicity. The problem for you is that this gives you two equations, but you have three unknowns, v_1, theta, and dt.
You can add a constraint, for example, that the higher of p1 and p2 is the peak of the trajectory. If p2 is higher, for example,
v_z2 = v_1 sin theta - g dt = 0
Solving those three equations gives you v_1, which gives the z coordinate over time:
z = -1/2 g t^2 + v_1 sin theta t + z_1
t = np.linspace(0, dt, 100) gives you a numpy vector of times, and you can plug that into your formula for z.
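Under the constraint that p2 is the peak of the trajectory, the equations can be solved in closed form: the vertical speed at launch is g*dt and the height gained is g*dt^2/2. The sketch below applies that; the function name `rising_arc` and the default value of `g` are assumptions for illustration, and x and y are simply interpolated linearly in time because the horizontal velocity is constant.

```python
import numpy as np

def rising_arc(p1, p2, npts=100, g=9.81):
    """Ballistic arc from p1 up to p2, with p2 as the peak of the trajectory."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    if p2[2] <= p1[2]:
        raise ValueError("this sketch assumes p2 is higher than p1")

    # Peak condition: vz1 - g*dt = 0, and z2 - z1 = vz1*dt - g*dt**2/2
    dt = np.sqrt(2.0 * (p2[2] - p1[2]) / g)
    vz1 = g * dt  # vertical component of the launch velocity, v_1*sin(theta)

    t = np.linspace(0.0, dt, npts)
    x = p1[0] + (p2[0] - p1[0]) * t / dt  # constant horizontal velocity
    y = p1[1] + (p2[1] - p1[1]) * t / dt
    z = p1[2] + vz1 * t - 0.5 * g * t ** 2
    return np.column_stack((x, y, z))

arc = rising_arc([1, 1, 1], [3, 3, 3])
```

Because the peak is pinned to p2, this only produces the rising half of a trajectory; the value of `g` then changes how long the flight takes, not the shape of the curve.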