I want to calculate the point (vector) with the least Euclidean distance to a given set of N lines (each given, for example, by a point and a direction vector) in a D-dimensional space, for example by least squares.
Since I use Python for my project, I was wondering, whether there are already appropriate implementations for this general problem in some standard library like numpy, but I have not found any.
There are already related questions like:
Finding the centre of multiple lines using least squares approach in Python
nearest intersection point to many lines in python
However, these questions did not consider a dimension larger than 3 and in my case, I would like to adapt the problem to dimensions like 100.
I also found this resource for Matlab, which does not seem to be used that much, but it deals with the same problem:
https://de.mathworks.com/matlabcentral/fileexchange/59805-line-line-intersection-n-lines-d-space?s_tid=FX_rc1_behav
If each line with index i is given by a unit column vector
$$v_i = (v_{1i}, v_{2i}, v_{3i}, \dots, v_{Di})^T$$
pointing along line i, and a point on the line given by the column vector
$$p_i = (p_{1i}, p_{2i}, p_{3i}, \dots, p_{Di})^T,$$
where i = 1, ..., N, then the point x you seek, as a column vector, is given by the equation
$$x = \left( \sum_{i=1}^{N} \left( I - v_i v_i^T \right) \right)^{-1} \sum_{i=1}^{N} \left( I - v_i v_i^T \right) p_i$$
Here I is the D-dimensional identity square matrix.
If each line is given by two points p_i and q_i, then you can calculate
$$v_i = \frac{q_i - p_i}{\sqrt{(q_i - p_i)^T (q_i - p_i)}}$$
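Since the question asks about numpy, here is a minimal sketch of the formula above. The function name nearest_point_to_lines and the (N, D) array layout are my own choices, not something from a library. It uses np.linalg.lstsq rather than an explicit inverse, so it still returns a least-squares solution when the summed matrix is singular (for example when all lines are parallel).

```python
import numpy as np

def nearest_point_to_lines(points, directions):
    """Least-squares point closest to N lines in D dimensions.

    points:     (N, D) array, one point p_i on each line
    directions: (N, D) array, direction v_i of each line (any non-zero length)
    """
    p = np.asarray(points, dtype=float)
    v = np.asarray(directions, dtype=float)
    v = v / np.linalg.norm(v, axis=1, keepdims=True)  # normalise to unit vectors

    n, d = p.shape
    # sum_i (I - v_i v_i^T) = N*I - V^T V
    a = n * np.eye(d) - v.T @ v
    # sum_i (I - v_i v_i^T) p_i = sum_i p_i - sum_i v_i (v_i . p_i)
    b = p.sum(axis=0) - v.T @ np.einsum("ij,ij->i", v, p)
    # Solve a x = b in the least-squares sense instead of inverting a
    x, *_ = np.linalg.lstsq(a, b, rcond=None)
    return x
```

Nothing here depends on D being 3, so D = 100 works the same way; the cost is one D×D solve plus O(N·D²) to build the matrix.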
I am new to ArcGIS and trying to learn how I should make it work with Python.
I have a few points plotted in ArcGIS. Now I want to measure the distances between those points using Python. Say I have 7 points (A, B, C, D, E, F, G). I want to measure the distance between points A and B, A and C, A and D, and so on. I know it is simple to do in ArcGIS itself, but I want to learn how to do it in Python.
If there is a way to do it exactly or if there is any better alternative, any leads would help.
Thanks!
I don't have experience with ArcGIS, and you do not specify whether you need a 2D distance, a 3D distance, or a non-Euclidean distance, so my answer may not be appropriate.
The 2D distance between two points follows from the Pythagorean theorem:
d = sqrt( (y2-y1)**2 + (x2-x1)**2 )
In Python (2 or 3), it would be:
import math
d = math.sqrt((y2-y1)**2 + (x2-x1)**2)
For those who are afraid of typing "sqrt", there is now (since Python 3.8) a math.dist() function:
import math
d = math.dist((x1, y1), (x2, y2))
math.dist takes two points (sequences of coordinates of the same length) and returns a single float, so the same call works in 2D, 3D, or higher dimensions. If your data points are in a list, you call it once per pair of points.
For a 3D distance, we are back to the Pythagorean theorem:
d = math.sqrt( (z2-z1)**2 + (y2-y1)**2 + (x2-x1)**2 )
And if your points are latitude/longitude coordinates on the Earth, you need a great-circle distance instead; you should search for "delft stack haversine". Note that the haversine formula treats the Earth as a perfect sphere, so it is still an approximation.
DuckDuckGo (duckduckgo.com) will be your friend.
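To get every pairwise distance (A-B, A-C, ..., F-G) in one go, something like the sketch below should work. The coordinates are made up for illustration, and reading them out of ArcGIS is not covered here; it needs Python 3.8+ for math.dist.

```python
import math
from itertools import combinations

# Hypothetical planar coordinates for the seven points
points = {
    "A": (0.0, 0.0), "B": (3.0, 4.0), "C": (6.0, 8.0), "D": (1.0, 1.0),
    "E": (2.0, 5.0), "F": (7.0, 1.0), "G": (4.0, 4.0),
}

def pairwise_distances(points):
    """Return {(name1, name2): distance} for every unordered pair of points."""
    return {
        (a, b): math.dist(points[a], points[b])
        for a, b in combinations(points, 2)
    }

distances = pairwise_distances(points)
for (a, b), d in distances.items():
    print(f"{a}-{b}: {d:.3f}")
```

For 7 points this gives 21 pairs. The same function works for 3D points if each tuple has three coordinates.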
I have to work on a task that uses hand keypoints as a pointer (a touchless mouse).
The main problem is that the (deep learning) hand keypoints are not perfect (they sometimes degrade under varied lighting or skin colors), so the chosen keypoint jitters instead of moving smoothly like a real mouse pointer.
How can I smooth the keypoints online (in real time)? I am not looking for a solution that takes an array of 2D points and smooths the whole array. Here the new points arrive one by one and each one has to be corrected immediately, so that the user does not suffer from a jittery pointer.
I'm using OpenCV and Python. Please be nice since I'm new to computer vision.
Thanks
The simplest way is to use a moving average. You can compute, very efficiently, a running average over roughly the last n steps and use that to "smooth" the trajectory:
n = 5  # the average "window size"
counter = 0  # count how many steps so far
avg = 0.  # the average
while True:
    # every time step
    val = get_keypoint_value_for_this_time_step()
    counter += 1
    coeff = 1. / min(counter, n)
    # update using moving average
    avg = coeff * val + (1. - coeff) * avg
    print(f'Current time step={val} smoothed={avg}')
More variants of moving averages can be found here.
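As one such variant, here is a small sketch of a strict sliding-window average for 2D points, where exactly the last n points contribute with equal weight. The class name WindowSmoother is hypothetical; it only uses the standard library and does constant work per new point.

```python
from collections import deque

class WindowSmoother:
    """Average of the last n 2D points, updated one point at a time."""

    def __init__(self, n=5):
        self.window = deque(maxlen=n)
        self.sum_x = 0.0
        self.sum_y = 0.0

    def update(self, point):
        x, y = point
        # When the window is full, the oldest point is about to be dropped,
        # so remove it from the running sums first.
        if len(self.window) == self.window.maxlen:
            old_x, old_y = self.window[0]
            self.sum_x -= old_x
            self.sum_y -= old_y
        self.window.append((x, y))
        self.sum_x += x
        self.sum_y += y
        k = len(self.window)
        return (self.sum_x / k, self.sum_y / k)
```

You would call update() once per frame with the raw keypoint and move the pointer to the returned position. A larger n is smoother but lags more behind the hand.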
Since you want physics-like behavior, you can use a simple physics model. Note that all arrays below describe properties of the current state of your dynamics and therefore have the same shape (1, 2).
Define forces:
a (attraction) = (k - p) * scaler
Velocity:
v (velocity) = v + a
Positions:
p (current position) = p + v
k (new DL keypoint) = whatever your DL model outputs
You output p to the user. Note: if you want more natural motion, you can play with the scaler or add additional forces (like a) to v.
Also, to get all points, concatenate p to ps, where ps.shape = (n, 2).
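The model above can be sketched in a few lines of plain Python. Note one addition that is my assumption and not part of the description: a damping factor on the velocity. Without some friction, a pure spring like this keeps oscillating around the keypoint instead of settling on it. The class name and the default values are made up and will need tuning.

```python
class SpringSmoother:
    """Pointer position p is pulled towards the latest keypoint k like a spring."""

    def __init__(self, start, scaler=0.2, damping=0.6):
        self.p = [float(start[0]), float(start[1])]  # current position
        self.v = [0.0, 0.0]                          # current velocity
        self.scaler = scaler                         # spring strength
        self.damping = damping                       # assumption: friction, 0 < damping < 1

    def update(self, k):
        for axis in range(2):
            a = (k[axis] - self.p[axis]) * self.scaler        # attraction force
            self.v[axis] = self.damping * (self.v[axis] + a)  # v = v + a, then friction
            self.p[axis] += self.v[axis]                      # p = p + v
        return (self.p[0], self.p[1])
```

Call update() with every new keypoint and show the returned p to the user. A smaller scaler gives slower, smoother motion; a damping value closer to 1 makes the pointer overshoot more.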
I have a large quantity of pixel colors (96 thousand different colors):
And I want to get some kind of mathematically defined probability region like in this question:
The main obstacle I see right now is that all the methods I find on Google are mainly about visualisation and about two-dimensional spaces, yet there is no algorithm for finding the coefficients of an equation like:
$a_1 x^2 + b_1 y^2 + c_1 z^2 + a_2 xy + b_2 xz + c_2 yz + a_3 x + b_3 y + c_3 z = 0$
And this paper is too difficult for me to implement in Python. :(
Anyway, all I want is to determine whether some pixel lies more or less within the range I have.
I tried doing it with scikit clustering, but I failed, probably because I have only one set of data. And creating an array of 256^3 elements representing each pixel color seems like the wrong way.
I wonder if there is an easy way to determine the boundaries of this point cluster?
Or maybe I'm just overthinking it and there is something like the OpenCV cv2.inRange() function?
This can be solved by optimization and fitting of the ellipsoid polynomial. However, I would start with a geometrical approach, which is much faster:
find the average point position
that will be the center of your ellipsoid
p0 = sum (p[i]) / n // average
i = { 0,1,2,3,...,n-1 } // of all points
If your point density is not homogeneous, then it is safer to use the bounding box center instead. So find xmin,ymin,zmin,xmax,ymax,zmax, and the middle between them is your center.
find the most distant point from the center
that will give you the main semi-axis
pa = p[j];
|p[j]-p0| >= |p[i]-p0| // max
i = { 0,1,2,3,...,n-1 } // of all points
find the second semi-axis
The vector pa-p0 is normal to the plane in which the other semi-axes should lie. So find the point most distant from p0 within that plane:
pb = p[j];
|p[j]-p0| >= |p[i]-p0| // max
dot(pa-p0,p[j]-p0) == 0 // but only if inside plane
i = { 0,1,2,3,...,n-1 } // from all points
Beware that the result of the dot product may not be precisely zero, so it is better to test against something like this:
|dot(pa-p0,p[j]-p0)| <= 1e-3
You can use any threshold you want (it should be based on the ellipsoid size).
find the last semi-axis
We know that the last semi-axis should be perpendicular to both
(pa-p0) AND (pb-p0)
So find a point such that:
pc = p[j];
|p[j]-p0| >= |p[i]-p0| // max
dot(pa-p0,p[j]-p0) == 0 // but only if inside plane
dot(pb-p0,p[j]-p0) == 0 // and perpendicular also to b semi-axis
i = { 0,1,2,3,...,n-1 } // from all points
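Here is a rough numpy sketch of the steps so far. It is a variation, not a literal transcription: instead of testing dot products against a threshold, it subtracts the component along each axis already found and then looks for the farthest point in what remains, so it never ends up with an empty candidate set. The function name is made up, and it assumes the cloud really extends in all three directions (otherwise an axis of length zero would cause a division by zero).

```python
import numpy as np

def fit_ellipsoid_axes(points):
    """Return (p0, axes): centre and three mutually perpendicular semi-axis vectors."""
    p = np.asarray(points, dtype=float)
    p0 = p.mean(axis=0)          # centre = average point
    d = p - p0                   # offsets from the centre
    axes = []
    for _ in range(3):
        # farthest remaining offset becomes the next semi-axis
        j = np.argmax(np.einsum("ij,ij->i", d, d))
        axis = d[j].copy()
        axes.append(axis)
        # remove the component along this axis from every offset, so the next
        # search only sees directions perpendicular to the axes found so far
        u = axis / np.linalg.norm(axis)
        d = d - np.outer(d @ u, u)
    return p0, np.array(axes)
```

Like the description above, this is O(n) and sensitive to outliers, since a single far-away point defines an axis.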
Ellipsoid
Now you have all the parameters you need to form your ellipsoid. The vectors
(pa-p0),(pb-p0),(pc-p0)
are the basis vectors of your ellipsoid (you can make them exactly perpendicular by using the cross product). Their lengths give you the radii, and p0 is the center. You can also use this parametric equation:
a=pa-p0;
b=pb-p0;
c=pc-p0;
p(u,v) = p0 + a*cos(u)*cos(v)
+ b*cos(u)*sin(v)
+ c*sin(u);
u = < -0.5*PI , +0.5*PI >
v = < 0.0 , 2.0*PI >
This whole process is just O(n), and the results can be used as a starting point for both optimization and fitting, to speed them up without loss of accuracy. If you want to further improve accuracy, see:
How approximation search works
The sub-links show you examples of fitting ...
You can also take a look at this:
Algorithms: Ellipse matching
which is basically similar to your task, but only in 2D; it still may bring you some ideas.
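Coming back to the original question (is a given pixel inside the region?): once you have the center p0 and three perpendicular semi-axis vectors, the test itself is short. A minimal sketch, assuming the axes are mutually perpendicular; the function name and the scale parameter (to grow or shrink the region) are my own additions.

```python
import numpy as np

def inside_ellipsoid(point, p0, axes, scale=1.0):
    """True if point lies inside the ellipsoid with centre p0 and semi-axis vectors axes.

    axes: (3, 3) array, one semi-axis vector per row, assumed mutually perpendicular.
    """
    axes = np.asarray(axes, dtype=float)
    d = np.asarray(point, dtype=float) - np.asarray(p0, dtype=float)
    # coordinate of d along each semi-axis, measured in units of that axis length
    t = (axes @ d) / np.einsum("ij,ij->i", axes, axes)
    # inside when (t_a)^2 + (t_b)^2 + (t_c)^2 <= 1
    return float(t @ t) <= scale ** 2
```

For pixel colors, point would be the (R, G, B) triple of the pixel, and p0 and axes would come from whatever fit you used on the 96 thousand sample colors.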
Here is a non-strict solution using a fast and simple random search approach*. The best part: no heavy linear algebra library is required**. It seems to have worked fine for mesh collision detection.
It assumes that the ellipsoid center matches the cloud center, and then uses a sort of mirrored average to search for the main axis.
The full working code is slightly bigger and is placed on git; the idea of the main axis search is here:
import numpy as np
# pts is an (n, 3) array of points; proj_length is defined in the full code
np.random.shuffle(pts)
pts_len = len(pts)
pt_average = np.sum(pts, axis = 0) / pts_len
vec_major = pt_average * 0
minor_max, major_max = 0, 0
# may be improved with overlapped pass,
for pt_cur in pts:
    vec_cur = pt_cur - pt_average
    proj_len, rej_len = proj_length(vec_cur, vec_major)
    if proj_len < 0:
        vec_cur = -vec_cur
    vec_major += (vec_cur - vec_major) / pts_len
    major_max = max(major_max, abs(proj_len))
    minor_max = max(minor_max, rej_len)
It can be improved/optimized even more at some points. Examples of what it will produce:
And the full experiment code with plots
*i.e. adjusting code lines randomly until they work
**was actually reason to figure out this solution
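The snippet in this answer calls proj_length, which is only defined in the linked full code. The version below is my guess at what such a helper does (signed length of the projection onto the axis, plus the length of the perpendicular remainder), not the author's actual implementation. It treats a zero axis as "no direction yet", which is the state vec_major starts in.

```python
import numpy as np

def proj_length(vec, axis):
    """Return (signed projection length of vec on axis, length of the rejection)."""
    axis_norm = np.linalg.norm(axis)
    if axis_norm == 0.0:
        # no axis direction yet: nothing to project onto
        return 0.0, float(np.linalg.norm(vec))
    unit = axis / axis_norm
    proj_len = float(vec @ unit)                             # component along the axis
    rej_len = float(np.linalg.norm(vec - proj_len * unit))   # component across it
    return proj_len, rej_len
```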
I'm relatively new to Python coding (I'm switching from R, mostly because of running time) and I'm trying to figure out how to code a proximity graph.
That is, suppose I have an array of evenly spaced points in d-dimensional Euclidean space; these will be my nodes. I want to make these into an undirected graph by connecting two points if and only if they lie within distance e of each other. How can I encode this as a function with the parameters:
n: spacing between two points on the same axis
d: dimension of R^d
e: maximum distance allowed for an edge to exist.
The graph-tool library has much of the functionality you need. So you could do something like this, assuming you have numpy and graph-tool:
import numpy
import graph_tool.generation
coords = numpy.meshgrid(*(numpy.linspace(0, (n-1)*delta, n) for i in range(d)))
# coords is a Python list of numpy arrays
coords = [c.flatten() for c in coords]
# now coords is a Python list of 1-d numpy arrays
coords = numpy.array(coords).transpose()
# now coords is a numpy array, one row per point
# geometric_graph returns the graph and a property map of vertex positions
g, pos = graph_tool.generation.geometric_graph(coords, e*(1+1e-9))
The silly e*(1+1e-9) thing is because your criterion is "distance <= e" and geometric_graph's criterion is "distance < e".
I introduced a parameter called delta that you didn't mention, because I think your description of parameter n is doing duty for two parameters (the spacing between points and the number of points): here n is the number of points per axis and delta is the spacing.
This bit of code should work, although it certainly isn't the most efficient. It will go through each node and check its distance to all the other nodes (that haven't already been compared to it). If that distance is less than your value e, then the corresponding value in the connected matrix is set to one. Zero indicates that two nodes are not connected.
In this code I'm assuming that your nodeList is a list of Cartesian coordinates of the form nodeList = [[x1,y1,...],[x2,y2,...],...[xN,yN,...]]. I also assume you have some function called calcDistance which returns the Euclidean distance between two Cartesian coordinates. This is basic enough to implement that I haven't written the code for it, and in any case using a function allows for future generalization and modification.
import numpy as np
numNodes = len(nodeList)
connected = np.zeros([numNodes,numNodes])
for i in range(numNodes):
    # start at i + 1 so each pair is visited once and j is a true index into nodeList
    for j in range(i + 1, numNodes):
        dist = calcDistance(nodeList[i], nodeList[j])
        if dist < e:
            connected[i,j] = 1
            connected[j,i] = 1
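For completeness, here is a self-contained sketch that also builds the evenly spaced grid and fills in calcDistance, using only the standard library (Python 3.8+ for math.dist). The names proximity_edges, num_points and spacing are my own; I split the question's n into "points per axis" and "spacing" as an assumption. It returns an edge list rather than a matrix and uses "<= e" to match "within e apart".

```python
import math
from itertools import combinations, product

def calc_distance(a, b):
    """Euclidean distance between two points given as coordinate tuples."""
    return math.dist(a, b)

def proximity_edges(num_points, spacing, d, e):
    """Grid of num_points**d nodes in R^d; edge (i, j) whenever distance <= e."""
    nodes = [
        tuple(spacing * k for k in idx)
        for idx in product(range(num_points), repeat=d)
    ]
    edges = [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(nodes), 2)
        if calc_distance(a, b) <= e
    ]
    return nodes, edges

nodes, edges = proximity_edges(num_points=3, spacing=1.0, d=2, e=1.0)
print(len(nodes), len(edges))
```

This is still O(N²) in the number of nodes, so it is only meant for small grids; for large ones a spatial index is the usual way out.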
I use the following code to generate the Fibonacci lattice (see page 4 for the unit sphere). I think the code is working correctly. Next, I have a list of points (specified by latitude and longitude in radians, just like the generated Fibonacci lattice points). For each of these points I want to find the index of the closest point on the Fibonacci lattice. That is, I have a latitude and longitude and want to get i. How would I do this?
I specifically don't want to iterate over all the points from the lattice and find the one with minimal distance, as in practice I generate much more than just 50 points and I don't want the runtime to be O(n*m) if O(m) is possible.
FWIW, when talking about distance, I mean haversine distance.
#!/usr/bin/env python2
import math
import sys
n = 50
phi = (math.sqrt(5.0) + 1.0) / 2.0
phi_inv = phi - 1.0
ga = 2.0 * phi_inv * math.pi
for i in xrange(-n, n + 1):
    longitude = ga * i
    longitude = (longitude % phi) - phi if longitude < 0 else longitude % phi
    latitude = math.asin(2.0 * float(i) / (2.0 * n + 1.0))
    print("{}-th point: ".format(i + n + 1))
    print("\tLongitude is {}".format(longitude))
    print("\tLatitude is {}".format(latitude))
# Given latitude and longitude of point A, determine index i of point which is closest to A
# ???
What you are probably looking for is a spatial index: https://en.wikipedia.org/wiki/Spatial_database#Spatial_index. Since you only care about nearest neighbor search, you might want to use something relatively simple like http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.spatial.KDTree.html.
Note that spatial indexes usually consider points on a plane rather than a sphere. To adapt it to your situation, you'll probably want to split up the sphere into several regions that can be approximated by rectangles. You can then find several of the nearest neighbors according to the rectangular approximation and compute their actual haversine distances to identify the true nearest neighbor.
It's somewhat easier to use spherical coordinates here.
Your spherical coordinates are given by lat = arcsin(2 * i / (2 * N + 1)), and lon = 2 * PI * i / the golden ratio.
Reversing this is not a dead end - it's a great way to determine latitude. The issue with the reverse approach is only that it fails to represent longitude.
sin(lat) = 2 * i / (2 * N + 1)
i = (2 * N + 1) * sin(lat) / 2
This i is an exact representation of the index of a point matching the latitude of your input point. The next step is your choice - brute force, or choosing a different spiral.
The Fibonacci spiral is great at covering a sphere, but one of its properties is that it does not preserve locality between consecutive indices. Thus, if you want to find the closest points, you have to search a wide range, and it is difficult to even estimate bounds for this search. Brute force is expensive. However, this is already a significant improvement over the original problem of checking every point: if you like, you can threshold your results and bound your search in any way you like and get approximately accurate results. If you want to accomplish this in a more deterministic way, though, you'll have to dig deeper.
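To make the "invert the latitude, then bound the search" idea concrete, here is a sketch in Python. The band width is a heuristic I picked (twice the typical spacing between lattice points), not a proven bound, and the function names are made up. The lattice follows the formulas above, with the longitude left unwrapped because the haversine distance is periodic in longitude anyway. The returned i runs from -n to n, so the printed index in the question would be i + n + 1.

```python
import math

GOLDEN = (1.0 + math.sqrt(5.0)) / 2.0

def lattice_point(i, n):
    """Latitude and longitude (radians) of lattice point i, for i in -n..n."""
    lat = math.asin(2.0 * i / (2.0 * n + 1.0))
    lon = 2.0 * math.pi * i / GOLDEN
    return lat, lon

def haversine(lat1, lon1, lat2, lon2):
    """Great-circle angle (radians) between two points on the unit sphere."""
    h = (math.sin((lat2 - lat1) / 2.0) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * math.asin(math.sqrt(min(1.0, h)))

def nearest_index(lat, lon, n, band=None):
    """Index i of the lattice point closest to (lat, lon), searching a latitude band."""
    m = 2 * n + 1                                   # number of lattice points
    if band is None:
        band = 2.0 * math.sqrt(4.0 * math.pi / m)   # heuristic: 2x typical spacing
    lo_lat = max(-math.pi / 2.0, lat - band)
    hi_lat = min(math.pi / 2.0, lat + band)
    # i = (2n + 1) * sin(lat) / 2, applied to both edges of the band
    i_lo = max(-n, math.floor(m * math.sin(lo_lat) / 2.0))
    i_hi = min(n, math.ceil(m * math.sin(hi_lat) / 2.0))
    return min(range(i_lo, i_hi + 1),
               key=lambda i: haversine(lat, lon, *lattice_point(i, n)))
```

Because the band shrinks as the lattice gets denser, the number of candidates checked per query grows roughly with the square root of the number of lattice points rather than linearly.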
My solution to this problem looks a bit like this (and apologies, this is written in C# not Python)
// Take a stored index on a spiral on a sphere and convert it to a normal vector
public Vector3 UI2N(uint i)
{
    double h = -1 + 2 * ((double)i / n); // cast so the division is never an integer division
    double phi = math.acos(h);
    double theta = sqnpi*phi;
    return new Vector3((float)(math.cos(theta) * math.sin(phi)), (float)math.cos(phi), (float)(math.sin(theta) * math.sin(phi)));
}
// Take a normalized vector and return the closest matching index on a spiral on a sphere
public uint N2UI(Vector3 v)
{
    double iT = sqnpi * math.acos(v.y); // theta calculated to match latitude
    double vT = math.atan2(v.z, v.x); // theta calculated to match longitude
    double iTR = (iT - vT + math.PI_DBL)%(twoPi); // Remainder from iTR, preserving the coarse number of turns
    double nT = iT - iTR + math.PI_DBL; // new theta, containing info from both
    return (uint)math.round(n * (math.cos(nT / sqnpi) + 1) / 2);
}
Where n is the spiral's resolution, and sqnpi is sqrt(n * PI).
This is not the most efficient possible implementation, nor is it particularly clear. However, it is a middle ground where I can attempt to explain it.
The spiral I am using is one I found here:
https://web.archive.org/web/20121103201321/http://groups.google.com/group/sci.math/browse_thread/thread/983105fb1ced42c/e803d9e3e9ba3d23#e803d9e3e9ba3d23%22%22
(Orion's spiral is basically the one I'm using here)
From this I can reverse the function to get both a coarse and a fine measure of Theta (distance along the spiral), and combine them to find the best-fitting index. The way this works is that iT is cumulative, but vT is periodic. vT is a more correct measure of the longitude, but iT is a more correct measure of latitude.
I strongly encourage that anyone reading this try things other than what I'm doing with my code, as I know that it can be improved from here - that's what I did, and I would do well to do more. Using doubles is absolutely necessary here with the current implementation - otherwise too much information would be lost, particularly with the trig functions and the conversion to uint.