How to apply clustering algorithms to a volume surface? - python

I’m trying to compare some OpenFOAM CFD simulations that reproduce the flow through an almost spherical object (the object is a reconstruction, with an irregular shape), searching for the one with the smallest cluster of wall shear stress. So, I want to know if there is a way of running a clustering algorithm on this irregular surface, such as K-Means, EM or another unsupervised algorithm. In other words, I would like to numerically compare the colormap plotted on slightly different shapes, using, for example, the area and the mean value of the clusters as the parameters for the comparison. Has anyone handled a similar situation?
I've already tried to project this object onto a plane or a sphere, but the distortion was greater than expected, so this is no longer an option.
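One way to avoid any projection is to cluster the surface faces directly in 3D. The sketch below is only an illustration under several assumptions: the surface is available as a triangle mesh (`vertices`, `faces`), `wss` holds one wall shear stress value per face, and the function names `triangle_areas` and `cluster_wss` as well as the `spatial_weight` parameter are hypothetical. It uses `scipy.cluster.vq.kmeans2` with explicit starting centroids so the result is repeatable, then reports the area and the area-weighted mean of each cluster.

```python
import numpy as np
from scipy.cluster.vq import kmeans2


def triangle_areas(vertices, faces):
    """Area of every triangle; faces is an (n_faces, 3) array of vertex indices."""
    a, b, c = (vertices[faces[:, i]] for i in range(3))
    return 0.5 * np.linalg.norm(np.cross(b - a, c - a), axis=1)


def cluster_wss(vertices, faces, wss, k=3, spatial_weight=1.0):
    """K-means on (face centre, WSS) features; returns labels and per-cluster stats.

    Assumes wss is not constant (its standard deviation is used for scaling).
    """
    centers = vertices[faces].mean(axis=1)  # (n_faces, 3) face centres
    areas = triangle_areas(vertices, faces)

    # Put position and WSS on comparable scales before mixing them.
    pos = (centers - centers.mean(axis=0)) / centers.std()
    val = (wss - wss.mean()) / wss.std()
    features = np.column_stack([spatial_weight * pos, val])

    # Deterministic start: k faces spread evenly over the sorted WSS values.
    order = np.argsort(wss)
    seeds = order[np.linspace(0, len(wss) - 1, k).astype(int)]
    _, labels = kmeans2(features, features[seeds], minit="matrix")

    stats = []
    for j in range(k):
        mask = labels == j
        if mask.any():
            stats.append({
                "label": j,
                "area": areas[mask].sum(),
                "mean_wss": np.average(wss[mask], weights=areas[mask]),
            })
    return labels, stats
```

Lowering `spatial_weight` makes the clusters follow the WSS value more than the position on the surface. The `area` and `mean_wss` entries can then be compared between the different geometries.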

Related

explanation of sklearn optics plot

I am currently learning how to use OPTICS in sklearn. I am inputting a NumPy array of shape (205, 22). I am able to get plots out of it, but I do not understand how I am getting a 2D plot out of multiple dimensions and how I am supposed to read it. I more or less understand the reachability plot, but the rest of it makes no sense to me. Can someone please explain what is happening? Is the function just simplifying the data to two dimensions somehow? Thank you
From the sklearn user guide:
The reachability distances generated by OPTICS allow for variable density extraction of clusters within a single data set. As shown in the above plot, combining reachability distances and data set ordering_ produces a reachability plot, where point density is represented on the Y-axis, and points are ordered such that nearby points are adjacent. ‘Cutting’ the reachability plot at a single value produces DBSCAN like results; all points above the ‘cut’ are classified as noise, and each time that there is a break when reading from left to right signifies a new cluster.
The other three plots are a visual representation of the actual clusters found by three different extractions: OPTICS itself, and DBSCAN-style cuts at eps = 0.5 and eps = 2.0.
As you can see in the OPTICS clustering plot, there are two high-density clusters (blue and cyan). The gray crosses are classified as noise, according to the reachability plot, because of the low xi value.
In the DBSCAN clustering with eps = 0.5, everything is considered noise, since the epsilon value is too low and the algorithm cannot find any dense points.
In the third plot, the algorithm found just a single cluster because of the larger epsilon value; everything above the 2.0 line is considered noise.
Please refer to the user guide for details.
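On the question of how a 2D plot comes out of 22 dimensions: the reachability plot is not a projection of the data. It has one value per sample (the reachability distance), drawn in the order in which OPTICS visited the samples. A minimal sketch with scikit-learn follows; the synthetic array is only a stand-in for the (205, 22) input, and `min_samples`, `xi` and `eps` are illustrative values, not the ones used in the question.

```python
import numpy as np
from sklearn.cluster import OPTICS, cluster_optics_dbscan

# Synthetic stand-in for the (205, 22) array: two well-separated blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(100, 22)),
    rng.normal(5.0, 0.3, size=(105, 22)),
])

optics = OPTICS(min_samples=10, xi=0.05).fit(X)

# Y-axis of the reachability plot: one distance per sample.
# X-axis: position of the sample in the OPTICS ordering.
reachability = optics.reachability_[optics.ordering_]
labels_in_order = optics.labels_[optics.ordering_]

# "Cutting" the plot at a fixed height gives DBSCAN-like labels (-1 is noise).
labels_cut = cluster_optics_dbscan(
    reachability=optics.reachability_,
    core_distances=optics.core_distances_,
    ordering=optics.ordering_,
    eps=3.0,
)
```

Plotting `reachability` against `np.arange(len(X))` gives the reachability plot; valleys are clusters, and the peaks between them separate one cluster from the next. The scatter plots in the user guide example only show two of the input features, coloured by the resulting labels.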

Algorithm to create polygons (no Thiessen/Voronoi)

I have been trying to create custom regions for states. I want to fill the state map using the area of influence of the points.
The image below shows what I have been trying to do. The left image shows the points, and I want to fill all the areas as in the right image. I have used Voronoi/Thiessen, but it leaves some points outside the area, since it takes only the centroid to color the polygon.
Is there any algorithm or process to achieve that? I am currently using Python.
You've identified your basic problem: you used a cluster-unit Voronoi algorithm, which is too simplistic for your application. You need to apply that same algebra to the points themselves, not to the region as a single-statistic entity.
To this end, I strongly recommend a multi-class SVM (Support Vector Machine) algorithm, which will identify the largest gaps between identified regions (classes) of points. Use a Gaussian kernel modification (of a very low degree) to handle non-linear boundaries. You will almost certainly get simple curves instead of lines.
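A minimal sketch of that suggestion with scikit-learn's `SVC` is shown below. The function name `fill_regions` and the values of `gamma`, `C` and `resolution` are assumptions for illustration. Note that the RBF (Gaussian) kernel in `SVC` has no degree parameter; `gamma` controls how curved the boundaries can become, and a small `gamma` is assumed here to play the role of the "very low degree" mentioned above.

```python
import numpy as np
from sklearn.svm import SVC


def fill_regions(points, region_ids, bounds, resolution=200, gamma=0.5, C=10.0):
    """Train a multi-class RBF SVM on labelled points and label a regular grid.

    points     : (n, 2) array of x, y coordinates
    region_ids : (n,) array with the region of each point
    bounds     : (xmin, xmax, ymin, ymax) of the area to fill
    """
    clf = SVC(kernel="rbf", gamma=gamma, C=C)
    clf.fit(points, region_ids)

    xmin, xmax, ymin, ymax = bounds
    gx, gy = np.meshgrid(
        np.linspace(xmin, xmax, resolution),
        np.linspace(ymin, ymax, resolution),
    )
    grid = np.column_stack([gx.ravel(), gy.ravel()])

    # Every grid cell receives the region of the class the SVM predicts there.
    label_grid = clf.predict(grid).reshape(gx.shape)
    return clf, label_grid
```

The resulting `label_grid` can be drawn with `matplotlib.pyplot.contourf` or converted to polygons. Clipping it to the state outline is a separate step that is not covered here.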

Constraining RBF interpolation of 3D surface to keep curvature

I've been tasked to develop an algorithm that, given a set of sparse points representing measurements of an existing surface, would allow us to compute the z coordinate of any point on the surface. The challenge is to find a suitable interpolation method that can recreate the 3D surface given only a few points and extrapolate values also outside of the range containing the initial measurements (a notorious problem for many interpolation methods).
After trying to fit many analytic curves to the points, I decided to use RBF interpolation, as I thought this would better reproduce the surface given that the points should all lie on it (I'm assuming the measurements have a negligible error).
The first results are quite impressive considering the few points that I'm using.
Interpolation results
In the picture that I'm showing the blue points are the ones used for the RBF interpolation which produces the shape represented in gray scale. The red points are instead additional measurements of the same shape that I'm trying to reproduce with my interpolation algorithm.
Unfortunately there are some outliers, especially when I'm trying to extrapolate points outside of the area where the initial measurements were taken (you can see this in the upper right and lower center insets in the picture). This is to be expected, especially in RBF methods, as I'm trying to extract information from an area that initially does not have any.
Apparently the RBF interpolation is trying to flatten out the surface while I would just need to continue with the curvature of the shape. Of course the method does not know anything about that given how it is defined. However this causes a large discrepancy from the measurements that I'm trying to fit.
That's why I'm asking whether there is any way to constrain the interpolation method to keep the curvature, or to use a different radial basis function that doesn't flatten out so quickly at the border of the interpolation range. I've tried different combinations of the epsilon parameter and distance functions without luck. This is what I'm using right now:
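    from scipy import interpolate
    import numpy as np

    # Thin-plate RBF through the measured points (df has columns X, Y, Z)
    spline = interpolate.Rbf(df.X.values, df.Y.values, df.Z.values,
                             function='thin_plate')
    # Regular evaluation grid; precision is the number of samples per axis
    X, Y = np.meshgrid(np.linspace(xmin.round(), xmax.round(), precision),
                       np.linspace(ymin.round(), ymax.round(), precision))
    Z = spline(X, Y)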
I was also thinking of creating some additional dummy points outside of the interpolation range to constrain the model even more, but that would be quite complicated.
I'm also attaching an animation to give a better idea of the surface.
Animation
Just wanted to post my solution in case someone has the same problem. The issue was indeed with the SciPy implementation of RBF interpolation. I switched to a more flexible library instead, https://rbf.readthedocs.io/en/latest/index.html#.
The results are pretty cool! Using the following options
    from rbf.interpolate import RBFInterpolant
    # X_obs: observation coordinates, U_obs: measured values at those points
    spline = RBFInterpolant(X_obs, U_obs, phi='phs5', order=1, sigma=0.0, eps=1.)
I was able to get the right shape even at the edge.
Surface interpolation
I've played around with the different phi functions and here is the boxplot of the spread between the interpolated surface and the points that I'm testing the interpolation against (the red points in the picture).
Boxplot
With phs5 I get the best result, with an average spread of about 0.5 mm on the upper surface and 0.8 mm on the lower surface. Before, I was getting a similar average but with many outliers > 15 mm. Definitely a success :)
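For readers who prefer to stay within SciPy: since version 1.7, `scipy.interpolate.RBFInterpolator` also offers polyharmonic kernels combined with a polynomial term. The sketch below is not the setup used above; it swaps in SciPy's `kernel="quintic"` (a fifth-order polyharmonic spline) with its default polynomial degree, on the assumption that this behaves similarly to `phs5`. The helper names `fit_surface` and `evaluate_on_grid` are hypothetical.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator


def fit_surface(x, y, z, kernel="quintic", smoothing=0.0):
    """Fit z = f(x, y) with a polyharmonic RBF plus a polynomial term.

    With smoothing=0.0 the surface passes through the measurements.
    """
    points = np.column_stack([x, y])
    return RBFInterpolator(points, z, kernel=kernel, smoothing=smoothing)


def evaluate_on_grid(interp, xmin, xmax, ymin, ymax, n=100):
    """Evaluate the fitted surface on an n x n grid (may extend past the data)."""
    X, Y = np.meshgrid(np.linspace(xmin, xmax, n), np.linspace(ymin, ymax, n))
    Z = interp(np.column_stack([X.ravel(), Y.ravel()])).reshape(X.shape)
    return X, Y, Z
```

The polynomial term is what lets the surface keep bending outside the measured area instead of flattening; whether it matches the real shape there still has to be checked against held-out measurements, as was done with the red points.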

Given a contour outlining the edges of an 'S' shape in OpenCV/Python, what methods can be used to trace a curve along the center of the shape?

Given a contour outlining the edge of the letter S (in comic sans for example), how can I get a series of points along the spine of this letter in order to later represent this shape using lines, cubic spline or other curve-representing technique? I want to process and represent the shape using 30-40 points in Python/OpenCV.
Morphological skeletonization could help with this but the operation always seems to produce erroneous branches. Is there a better way to collapse the contour into just the 'S' shape of the letter?
In the example below you can see the erroneous 'serpent's tongue' like branches that are produced by morphological skeletonization. I don't know if it's fair to say they are erroneous if that's what the algorithm is supposed to be doing, but for me I would not like them to be there.
Below is the comic sans alphabet:
Another problem with skeletonization is that it is computationally expensive, but if you know a way of making it robust to forming 'serpent's tongue' like branches then I will give it a try.
Vectorizing fonts is actually not a trivial problem; it is quite tricky. To properly vectorize fonts using Bezier curves you'll need tracing. There are many libraries you can use for tracing images, for example Potrace. I'm not knowledgeable in Python, but I have done a similar project in C++, using the methods described below:
A. Fit the contour using cubic bezier
This method is quite simple although a lot of work should be done. I believe this also works well if you want to fit skeletons obtained from thinning.
1. Find the contour/edge of the object; you can use the OpenCV function findContours().
2. The entire shape can't be represented using a single cubic Bezier, so divide it into several segments using Ramer-Douglas-Peucker (RDP). The important thing in this step: don't delete any points, use RDP only to segment the points. See the colored segments in the image below.
3. For each segment, where S is the sequence of points S = (s0, s1, ..., sn), fit a cubic Bezier using least-squares fitting.
Illustration of least square fitting:
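The least-squares step of method A can be sketched in NumPy as follows. This is an illustration, not the C++ code from the project mentioned above: it assumes the segment's end points are kept fixed as the first and last control points, and it assigns curve parameters by chord length, which is one common choice among several. The function names are hypothetical.

```python
import numpy as np


def fit_cubic_bezier(points):
    """Least-squares cubic Bezier through an ordered (n, 2) point sequence.

    The first and last samples are used as the end control points; only the
    two inner control points are solved for.
    """
    pts = np.asarray(points, dtype=float)

    # Chord-length parameterisation: t runs from 0 to 1 along the polyline.
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)
    t = np.concatenate([[0.0], np.cumsum(seg)]) / seg.sum()

    # Cubic Bernstein basis evaluated at every t.
    b0 = (1 - t) ** 3
    b1 = 3 * (1 - t) ** 2 * t
    b2 = 3 * (1 - t) * t ** 2
    b3 = t ** 3

    p0, p3 = pts[0], pts[-1]
    A = np.column_stack([b1, b2])
    rhs = pts - np.outer(b0, p0) - np.outer(b3, p3)
    inner, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return np.vstack([p0, inner[0], inner[1], p3])


def eval_cubic_bezier(ctrl, t):
    """Evaluate the curve with control points ctrl (4, 2) at parameters t."""
    t = np.asarray(t, dtype=float)[:, None]
    return ((1 - t) ** 3 * ctrl[0] + 3 * (1 - t) ** 2 * t * ctrl[1]
            + 3 * (1 - t) * t ** 2 * ctrl[2] + t ** 3 * ctrl[3])
```

Each RDP segment would be passed to `fit_cubic_bezier` separately, so the whole 'S' ends up as a short chain of cubic curves that share end points.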
B. Resolution Independent Curve Rendering
This method as described in this paper is quite complex but one of the best algorithms available to display vector fonts:
1. Find the contour (the same as in method A).
2. Use RDP, but unlike in method A, use it to remove points so that the contour is simplified.
3. Do a Delaunay triangulation.
4. Draw Bezier curves on the outer edges using the method described in the paper.
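Both methods rely on Ramer-Douglas-Peucker. A compact NumPy sketch is given below; the function name `rdp_keep_mask` and the choice to return a boolean mask are assumptions made so that the same routine serves both uses. For method A the kept indices act as segment break points and no point is discarded; for method B only the kept points are retained. It treats the contour as an open polyline, so a closed contour would first have to be split at some point.

```python
import numpy as np


def rdp_keep_mask(points, epsilon):
    """Ramer-Douglas-Peucker on an ordered (n, 2) polyline.

    Returns a boolean mask that is True for the points RDP keeps.
    """
    pts = np.asarray(points, dtype=float)
    keep = np.zeros(len(pts), dtype=bool)
    keep[0] = keep[-1] = True

    stack = [(0, len(pts) - 1)]  # index ranges still to be examined
    while stack:
        i, j = stack.pop()
        if j <= i + 1:
            continue  # no interior points in this range
        chord = pts[j] - pts[i]
        rel = pts[i + 1:j] - pts[i]
        chord_len = np.linalg.norm(chord)
        if chord_len == 0.0:
            dist = np.linalg.norm(rel, axis=1)
        else:
            # Perpendicular distance of each interior point to the chord line.
            dist = np.abs(chord[0] * rel[:, 1] - chord[1] * rel[:, 0]) / chord_len
        k = int(np.argmax(dist))
        if dist[k] > epsilon:
            m = i + 1 + k
            keep[m] = True
            stack.append((i, m))
            stack.append((m, j))
    return keep
```

For method A, `np.flatnonzero(rdp_keep_mask(contour, eps))` gives the indices at which to split the contour; for method B, `contour[rdp_keep_mask(contour, eps)]` is the simplified contour.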
The following simple idea might be useful.
Calculate the medial axis of the outer contour. This ensures connectivity of the curves.
Find the branch points. Depending on the length of each branch, you can delete it in order to eliminate the "serpent's tongue" problem.
Hope it helps.

Skewed gaussian distribution within an ellipse with python

Okay, so I've been pulling some hairs out over this for the last couple of days and haven't made much progress.
I want to generate a 2-D array (grid) of a Gaussian-like distribution on an elliptical domain. Why do I say Gaussian-like? Well, I want an asymmetric Gaussian, aka a skewed Gaussian, where the peak of the Gaussian-like surface is at some point x0,y0 within the ellipse and the values on the perimeter of the ellipse are zero (or approaching zero...).
The attached picture might describe what I mean a little better.
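One possible construction, offered only as a sketch under its own assumptions, is to measure for every grid point how far it lies along the ray from the peak to the perimeter (0 at the peak, 1 on the ellipse) and to apply a one-dimensional Gaussian profile to that fraction. Because the distance to the perimeter differs with direction, the surface comes out skewed whenever the peak is off-centre. The function name, the axis-aligned ellipse centred at the origin, and the `sigma` parameter are all assumptions.

```python
import numpy as np


def skewed_gaussian_on_ellipse(a, b, x0, y0, sigma=0.4, n=201):
    """Gaussian-like surface on the ellipse (x/a)^2 + (y/b)^2 <= 1.

    The value is 1 at (x0, y0), falls to 0 on the perimeter and is 0 outside.
    sigma is the width of the profile as a fraction of the peak-to-edge distance.
    """
    x = np.linspace(-a, a, n)
    y = np.linspace(-b, b, n)
    X, Y = np.meshgrid(x, y)

    # Rescale so that the ellipse becomes the unit disk.
    u, v = X / a, Y / b
    cu, cv = x0 / a, y0 / b
    c2 = cu ** 2 + cv ** 2
    if c2 >= 1.0:
        raise ValueError("(x0, y0) must lie strictly inside the ellipse")

    # t = fraction of the way from the peak to the perimeter along each ray.
    du, dv = u - cu, v - cv
    cd = cu * du + cv * dv
    d2 = du ** 2 + dv ** 2
    t = (cd + np.sqrt(cd ** 2 + d2 * (1.0 - c2))) / (1.0 - c2)

    # Gaussian profile in t, shifted and rescaled to hit exactly 0 at t = 1.
    g = np.exp(-0.5 * (t / sigma) ** 2)
    g_edge = np.exp(-0.5 / sigma ** 2)
    Z = np.clip((g - g_edge) / (1.0 - g_edge), 0.0, None)  # 0 outside the ellipse
    return X, Y, Z
```

For example, `X, Y, Z = skewed_gaussian_on_ellipse(3.0, 2.0, 1.2, -0.5)` gives a 201 x 201 grid that can be shown with `matplotlib.pyplot.pcolormesh(X, Y, Z)`.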
