Calculate maximum diameter in 3D binary mask - python

I want to calculate the maximum diameter of a 3D binary mask of a nodule (irregular shape).
I have implemented a function that calculates the distances between all pairs of boundary points. This method is very computationally expensive when dealing with tumors or larger volumes.
So my question is: what are some less computationally expensive methods to calculate the maximum diameter of a 3D binary mask?

Something similar to gradient descent could be implemented.
Start with two points (A and B), located randomly on the surface of the 3D mask.
For point A, find the direction of travel along the mask surface that most increases its distance to point B.
Move point A a small step in that direction.
For point B, find the direction of travel along the mask surface that most increases its distance to point A.
Move point B a small step in that direction.
Repeat until it converges.
This will very likely find a local maximum, so you would probably have to repeat the experiment several times, from different random starting points, to find the true global maximum.
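A minimal sketch of this idea, assuming the mask is a 3D boolean NumPy array, "boundary" means the voxels removed by a one-voxel erosion, and distances are in voxel units (the function name, step radius and number of restarts are just placeholders to tune):

import numpy as np
from scipy import ndimage
from scipy.spatial import cKDTree

def max_diameter_hill_climb(mask, n_restarts=10, step_radius=3.0, seed=None):
    # Boundary voxels: mask voxels that disappear under a one-voxel erosion.
    boundary = mask & ~ndimage.binary_erosion(mask)
    pts = np.argwhere(boundary).astype(float)
    tree = cKDTree(pts)
    rng = np.random.default_rng(seed)

    def step(p, q):
        # Move p to the nearby boundary voxel farthest from q, if that helps.
        cand = pts[tree.query_ball_point(p, step_radius)]
        d = np.linalg.norm(cand - q, axis=1)
        j = int(d.argmax())
        if d[j] > np.linalg.norm(p - q):
            return cand[j], True
        return p, False

    best = 0.0
    for _ in range(n_restarts):
        a, b = pts[rng.choice(len(pts), size=2, replace=False)]
        moved = True
        while moved:
            a, ma = step(a, b)
            b, mb = step(b, a)
            moved = ma or mb
        best = max(best, float(np.linalg.norm(a - b)))
    return best

If your voxel spacing is anisotropic, scale the coordinates by the spacing before measuring distances.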

Related

Are there any python algorithms for conversion of coordinates between vector spaces of different norms?

Suppose I have a k x n array of data whose columns are vectors, and a distance function defined on these vectors. How do I convert the k x n array into another array of the same shape such that the Euclidean norm among the converted vectors is the norm derived from the given distance function? I know you can directly calculate the distance matrix for the data with that distance function, and derive the coordinates in R^k from it. But this method is really expensive, especially when the distance function has a complexity of O(n^2) or more. So I wonder if there is a simpler algorithm to do that.
It sounds like you are describing multidimensional scaling (MDS). One way to do it in Python is with scikit-learn's sklearn.manifold.MDS.
MDS expects the n x n distance (or "dissimilarity") matrix as input, so it doesn't avoid the cost of evaluating the distance function. That matrix is unavoidable for this conversion, so if the distance function itself is expensive, the best you can do is reduce the number of samples or find a fast approximation of the distance to speed things up. Also, beware that MDS is usually only approximate: a numerical optimization looks for the best fit of Euclidean norms to the given distances.
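A minimal sketch of that route, assuming X is the k x n array from the question (one vector per column) and dist is your own distance function; the embed name and the double loop are just for illustration:

import numpy as np
from sklearn.manifold import MDS

def embed(X, dist):
    # X: k x n array, one vector per column; dist is the user-supplied distance.
    vecs = X.T
    n = len(vecs)
    D = np.zeros((n, n))
    for i in range(n):                     # note: still n*(n-1)/2 calls to dist
        for j in range(i + 1, n):
            D[i, j] = D[j, i] = dist(vecs[i], vecs[j])
    # 'precomputed' tells MDS that D already holds the dissimilarities.
    mds = MDS(n_components=X.shape[0], dissimilarity="precomputed")
    return mds.fit_transform(D).T          # k x n; Euclidean distances approximate D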

Speeding up nearest Neighbor with scipy.spatial.cKDTree

I'm trying to optimize my nearest neighbor distance code, which I need to run for many iterations over variations of the same dataset.
I am calculating the nearest neighbor distances from the points in dataset A to the points in dataset B. Both datasets contain ~1000-2000 two-dimensional points. While the points in dataset A stay the same, I have many different realizations of dataset B (~100000 of them): B0, B1, ..., B100000. I wonder if I can somehow speed this up given that A stays the same.
To calculate the nearest neighbor distances I use:
import numpy as np
from scipy import spatial

for i in range(100000):
    # Build a k-d tree on the i-th realization of B, then query it with A
    tree = spatial.cKDTree(B[i])
    mindist1, minid = tree.query(A)
    score[i] = (np.mean(mindist1**4))**0.25
    # And some other calculations
    ...
Is there a smarter way to do this, given that A stays the same throughout the entire loop?

Efficient calculation of euclidean distance

I have an M x N array, where M is the number of observations and N is the dimensionality of each vector. From this array of vectors, I need to calculate the mean and minimum Euclidean distance between the vectors.
In my mind, this requires me to calculate C(M, 2) pairwise distances, i.e. on the order of M^2 distance evaluations. My M is ~10,000 and my N is ~1,000, and this computation takes ~45 seconds.
Is there a more efficient way to compute the mean and min distances? Perhaps a probabilistic method? I don't need it to be exact, just close.
You didn't describe where your vectors come from, nor what use you will put the mean and minimum distances to. Here are some observations about the general case; limited ranges, error tolerance, and discrete values may admit a more efficient approach.
The mean distance between M points sounds quadratic, O(M^2). But M / N is only about 10, fairly small, while N is huge, so the data probably resembles a hairy sphere in 1e3-dimensional space. Computing the centroid of the M points, and then the M distances to that centroid, might turn out to be useful in your problem domain; hard to tell.
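A rough sketch of that centroid summary, assuming X is the M x N observation matrix (centroid_summary is a made-up name; the result is a spread statistic rather than the true mean pairwise distance, although for squared distances there is an exact identity):

import numpy as np

def centroid_summary(X):
    # X: (M, N) observations. O(M*N) work instead of O(M^2 * N).
    c = X.mean(axis=0)
    d = np.linalg.norm(X - c, axis=1)
    # Exact identity for *squared* distances: the mean of ||x_i - x_j||^2 over
    # all ordered pairs equals 2 * mean(||x_i - c||^2).
    return d.mean(), 2.0 * np.mean(d**2)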
The minimum distance among the M points is more interesting. Choose a small number of pairs at random, say 100, compute their distances, and take half the smallest as an estimate of the global minimum distance. (Validate by comparing it to the next few smallest distances, if desired.) Now use a spatial UB-tree to model each point as a positive integer. This involves finding the N per-dimension minima over the M x N values, adding constants so each minimum becomes zero, scaling so the estimated global minimum distance corresponds to at least 1.0, and then truncating to integer.
With these transformed vectors in hand, we're ready to turn them into a UB-tree representation that we can sort, and then run nearest-neighbor spatial queries on the sorted values. For each point, compute an integer key: shift the low-order bit of each dimension's value into the result, then iterate, continuing over all dimensions until every non-zero bit has been consumed and appears in the key, and proceed to the next point. Numerically sort the resulting integer keys, yielding a data structure similar to a PostGIS index.
Now you have a discretized representation that supports reasonably efficient queries for nearest neighbors (though admittedly N = 1e3 is inconveniently large). After finding two or more coarse-grained nearby neighbors, you can go back to the original vector representation to obtain high-resolution distances between them, for finer discrimination. If your data distribution turns out to have a large fraction of points that discretize to being off by a single bit from their nearest neighbor, e.g. locations of oxygen atoms where each has a buddy, then increase the global minimum distance estimate so the low-order bits offer adequate discrimination.
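A hedged sketch of the bit-interleaving step described above, producing one Morton/Z-order style integer key per point; the scaling by the estimated minimum distance and the bit width are placeholders, and sorting the keys stands in for the UB-tree:

import numpy as np

def to_morton_keys(X, est_min_dist, n_bits=16):
    # Shift so each dimension's minimum is zero, scale so the estimated global
    # minimum distance maps to at least 1.0, then truncate to integers.
    Xi = np.floor((X - X.min(axis=0)) / est_min_dist).astype(np.int64)
    n_dims = Xi.shape[1]
    keys = []
    for row in Xi:
        key = 0
        for bit in range(n_bits):          # interleave one bit per dimension
            for d in range(n_dims):
                key |= ((int(row[d]) >> bit) & 1) << (bit * n_dims + d)
        keys.append(key)
    # Points adjacent in this sort order are candidate nearest neighbours.
    order = sorted(range(len(keys)), key=keys.__getitem__)
    return keys, order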
A similar discretization approach would be to appropriately scale, say, 2-dimensional inputs and mark cells of an initially empty grid, then scan the immediate neighborhood of each marked cell. This relies on the global minimum lying within a "small" neighborhood, which appropriate scaling ensures. In your case you would be marking an N-dimensional grid.
You may be able to speed things up with some sort of Space Partitioning.
For the minimum distance calculation, you would only need to consider pairs of points in the same or neighbouring partitions. For an approximate mean, you might be able to come up with some sort of weighted average based on the distances between partitions and the number of points within them.
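A hedged 2-D sketch of that partitioning idea, bucketing points into grid cells with a dict; cell_size is something you would tune, and the result is exact only if cell_size is at least the true minimum distance:

import numpy as np
from collections import defaultdict
from itertools import product

def grid_min_distance(X, cell_size):
    # X: (M, 2) points. Bucket points by grid cell, then compare each point only
    # against points in its own and the 8 neighbouring cells.
    cells = defaultdict(list)
    for p in X:
        cells[tuple((p // cell_size).astype(int))].append(p)
    best = np.inf
    for (cx, cy), pts in cells.items():
        for dx, dy in product((-1, 0, 1), repeat=2):
            for q in cells.get((cx + dx, cy + dy), []):
                for p in pts:
                    d = np.linalg.norm(p - q)
                    if 0.0 < d < best:   # skip self-comparisons (d == 0)
                        best = d
    return best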
I had the same issue before, and it worked for me once I normalized the values. So try to normalize the data before calculating the distance.

Exemplar-Based Inpainting - how to compute the normal to the contour and the isophote

I am using the exemplar-based inpainting algorithm by Criminisi. Section 3 of his paper describes the algorithm. The target region that needs to be inpainted is denoted Ω (omega). The border or contour of Ω, where it meets the rest of the image (denoted Φ (phi)), is δΩ (delta omega).
Now on page four of the paper, it states that n_p (n subscript p) is the normal to the contour δΩ at point p, and ∇I_p^⊥ (the gradient with an orthogonal superscript) is the isophote at point p, which is the gradient turned 90 degrees.
My multivariable calculus is rusty, but how do we go about computing n_p and ∇I_p^⊥ with Python libraries? Also, isn't n_p different for each point p on δΩ?
There are different ways of computing those quantities, all depending on your numeric description of the boundary. n_p is the normal direction of the contour at p.
Generally, if your contour is described by an analytic equation, or if you can write an analytic equation that approximates it (e.g. a spline curve fitted to 5 points, 2 on each side of the point you want), you can differentiate that spline and compute the tangent line at the point you want using its derivative.
Then, get a unit vector along that line and take the vector orthogonal to it. All this is very easy to do (ask if you don't understand).
Then you have the isophote. It is a vector orthogonal to the gradient, with the gradient's modulus. Computing the directional gradient of an image is a very commonly used technique in image processing. You can get the X and Y derivatives of the image easily (hint: numpy.gradient, or search SO for python gradient). The total gradient of the image is then ∇I = (∂I/∂x, ∂I/∂y).
So just create a vector with the x and y gradients (taken from numpy.gradient), then take the orthogonal vector to that one.
NOTE: How to get an orthogonal vector in 2D:
[v2x, v2y] = [v1y, -v1x]
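A hedged sketch of both quantities on a raster image with NumPy/SciPy, no analytic contour needed: the normal from the gradient of a smoothed indicator of Ω (the Gaussian sigma is an arbitrary choice), and the isophote by rotating the image gradient 90 degrees as in the note above:

import numpy as np
from scipy import ndimage

def normal_and_isophote(image, omega_mask, p):
    # image: 2-D grayscale array, omega_mask: boolean target region, p = (row, col).
    # Normal n_p: the gradient of a smoothed indicator of Omega points across
    # the contour; normalize it to unit length.
    smooth = ndimage.gaussian_filter(omega_mask.astype(float), sigma=2.0)
    gy, gx = np.gradient(smooth)
    n = np.array([gy[p], gx[p]])
    n /= (np.linalg.norm(n) + 1e-12)

    # Isophote: image gradient rotated by 90 degrees, [v1y, -v1x] as in the note.
    iy, ix = np.gradient(image.astype(float))
    grad = np.array([iy[p], ix[p]])
    isophote = np.array([grad[1], -grad[0]])   # orthogonal, same modulus
    return n, isophote

And yes, n_p (and the isophote) is different for every point p on δΩ, so you evaluate this at each contour point you care about.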

Circular Dimensionality Reduction?

I want a dimensionality reduction such that the dimensions it returns are circular.
For example, if I reduce 12-dimensional data to 2 dimensions, normalized between 0 and 1, then I want (0, 0) to be as close to (.9, .9) as it is to (.1, .1).
What is my algorithm? (bonus points for python implementation)
PCA gives me 2d plane of data, whereas I want spherical surface of data.
Make sense? Simple? Inherent problems? Thanks.
I think what you are asking for comes down to choosing the right transformation.
Circular
I want (0,0) to be as equally close to (.1,.1) as (.9,.9).
PCA
Taking your normalization approach, what you could do is
map the values in the interval [0.5, 1] back down to [0.5, 0], so that the coordinates wrap around.
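A tiny sketch of that fold for coordinates normalized to [0, 1] (the function name is made up): applied to per-dimension differences it makes a difference of 0.9 count the same as a difference of 0.1.

import numpy as np

def circular_diff(u, v):
    # u, v: arrays of coordinates in [0, 1]; wrap the difference around the circle.
    d = np.abs(u - v)
    return np.minimum(d, 1.0 - d)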
MDS
If you want to use a distance metric, you could first compute the distances and then do the same. For instance, taking the correlation, you could use 1 - abs(corr). Since the correlation lies in [-1, 1], strong positive and negative correlations will give values close to zero, while uncorrelated data will give values close to one. Then, having computed the distances, you use MDS to get your projection.
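A short sketch of that route, assuming data is an (n_samples, 12) array and that you embed into 2-D with scikit-learn's MDS (correlation_mds is a made-up name):

import numpy as np
from sklearn.manifold import MDS

def correlation_mds(data, n_components=2):
    # 1 - |corr| between samples as dissimilarity: strongly (anti-)correlated
    # samples end up close together, uncorrelated ones far apart.
    D = 1.0 - np.abs(np.corrcoef(data))
    mds = MDS(n_components=n_components, dissimilarity="precomputed")
    return mds.fit_transform(D)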
Space
PCA gives me 2d plane of data, whereas I want spherical surface of data.
Since you want a spherical surface, you can directly transform the 2-d plane to a sphere, I think. A spherical coordinate system with a constant radius would do that, wouldn't it?
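A hedged sketch of that mapping, assuming uv is an (n, 2) array of coordinates already normalized to [0, 1] (plane_to_sphere is a made-up name):

import numpy as np

def plane_to_sphere(uv, radius=1.0):
    # Interpret the normalized 2-D coordinates as angles on a sphere of constant
    # radius, so the first coordinate wraps around.
    theta = 2.0 * np.pi * uv[:, 0]   # azimuth, periodic
    phi = np.pi * uv[:, 1]           # polar angle
    return radius * np.column_stack((np.sin(phi) * np.cos(theta),
                                     np.sin(phi) * np.sin(theta),
                                     np.cos(phi)))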
Another question is then: Is all this a reasonable thing to do?
