Define a 2D Gaussian probability with five peaks - python

I have 2D data that contains five peaks. Could I fit five 2D Gaussian functions to it to obtain the peaks? My peaks do not come from a clustering problem, for which I think EM would be an appropriate answer.
In my case I measure a variable in x-y space and it shows maxima in more than one position. Is fitting a Fourier series or using the Expectation-Maximization method still an applicable solution to my problem?
In order to build my likelihood, do I just need to add up the five 2D Gaussian distributions, with the x and y position and the height of each peak as variables?

If I understand what you're asking, check out Gaussian Mixture Models and Expectation Maximization. I don't know of any pre-implemented versions of these in Python, although I haven't looked too hard.
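Since the question also asks about building the model directly: a minimal sketch (my own illustration, not from the answer above; synthetic placeholder data, isotropic peaks assumed) would sum five 2D Gaussians and least-squares fit each peak's amplitude, center, and width with scipy.optimize.curve_fit:
import numpy as np
from scipy.optimize import curve_fit

def five_gaussians(xy, *params):
    # params holds five blocks of (amplitude, x0, y0, sigma)
    x, y = xy
    z = np.zeros_like(x)
    for i in range(5):
        A, x0, y0, s = params[4 * i:4 * i + 4]
        z += A * np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / (2 * s ** 2))
    return z

# synthetic placeholder surface standing in for the real measurements
X, Y = np.meshgrid(np.linspace(-5, 5, 50), np.linspace(-5, 5, 50))
true = [1, -3, -3, 1,  1, 3, -3, 1,  1, 0, 0, 1,  1, -3, 3, 1,  1, 3, 3, 1]
Z = five_gaussians((X, Y), *true)

# one (amplitude, x0, y0, sigma) guess per peak; good guesses matter a lot
p0 = [1, -2, -2, 1,  1, 2, -2, 1,  1, 0.5, 0.5, 1,  1, -2, 2, 1,  1, 2, 2, 1]
popt, pcov = curve_fit(five_gaussians, (X.ravel(), Y.ravel()), Z.ravel(), p0=p0)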

Related

How would you find the roots of a linear interpolation in Python?

I'm working on calculating the full width at half maximum for light curves with irregular shapes. My approach right now is to:
1. fit a spline (scipy.interpolate.UnivariateSpline) to the data (minus the half maximum, so that y = 0 is the half max),
2. find the roots of the spline (UnivariateSpline.roots()),
3. find the difference between the first and last roots to determine the width of the curve at half max.
The roots method is only available for cubic splines, but I need the spline to be linear, or else I get results like the image below (due to the spacing of the data points). I should note that I'm working with hundreds of datasets, so manually selecting these "roots" is not quite feasible.
Does anyone have any tricks to find the roots of a linear spline (or all of the x-values for a given y value)? Many thanks!
You can use scipy.interpolate.make_interp_spline(..., k=1) to get a BSpline object, convert it to a PPoly via PPoly.from_spline(), and the result has a .roots() method (sketched below).
Alternatively, as other answers suggest, just find the relevant interval and solve for the root of that linear segment.
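A minimal sketch of that first suggestion (the data values here are placeholders, already shifted so that y = 0 is the half maximum):
import numpy as np
from scipy.interpolate import make_interp_spline, PPoly

# placeholder light-curve data, shifted so y = 0 is the half maximum
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([-1.0, 0.5, 1.0, 0.2, -0.8])

bspl = make_interp_spline(x, y, k=1)    # linear spline as a BSpline
ppoly = PPoly.from_spline(bspl.tck)     # piecewise-polynomial form
roots = np.sort(ppoly.roots(extrapolate=False))  # all zero crossings
fwhm = roots[-1] - roots[0]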

Constraining RBF interpolation of 3D surface to keep curvature

I've been tasked to develop an algorithm that, given a set of sparse points representing measurements of an existing surface, would allow us to compute the z coordinate of any point on the surface. The challenge is to find a suitable interpolation method that can recreate the 3D surface given only a few points and extrapolate values also outside of the range containing the initial measurements (a notorious problem for many interpolation methods).
After trying to fit many analytic curves to the points I've decided to use RBF interpolation as I thought this will better reproduce the surface given that the points should all lie on it (I'm assuming the measurements have a negligible error).
The first results are quite impressive considering the few points that I'm using.
Interpolation results
In the picture that I'm showing the blue points are the ones used for the RBF interpolation which produces the shape represented in gray scale. The red points are instead additional measurements of the same shape that I'm trying to reproduce with my interpolation algorithm.
Unfortunately there are some outliers, especially when I'm trying to extrapolate points outside of the area where the initial measurements were taken (you can see this in the upper right and lower center insets in the picture). This is to be expected, especially in RBF methods, as I'm trying to extract information from an area that initially does not have any.
Apparently the RBF interpolation is trying to flatten out the surface, while I would need it to continue following the curvature of the shape. Of course the method does not know anything about that, given how it is defined, but this causes a large discrepancy from the measurements that I'm trying to fit.
That's why I'm asking if there is any way to constrain the interpolation method to keep the curvature, or to use a different radial basis function that doesn't flatten out so quickly at the border of the interpolation range. I've tried different combinations of the epsilon parameter and distance functions without luck. This is what I'm using right now:
from scipy import interpolate
import numpy as np

# thin-plate-spline RBF through the measured points
spline = interpolate.Rbf(df.X.values, df.Y.values, df.Z.values,
                         function='thin_plate')
# regular grid covering the measurement area
X, Y = np.meshgrid(np.linspace(xmin.round(), xmax.round(), precision),
                   np.linspace(ymin.round(), ymax.round(), precision))
Z = spline(X, Y)
I was also thinking of creating some additional dummy points outside of the interpolation range to constrain the model even more, but that would be quite complicated.
I'm also attaching an animation to give a better idea of the surface.
Animation
Just wanted to post my solution in case someone has the same problem. The issue was indeed with the SciPy implementation of the RBF interpolation. I instead adopted a more flexible library: https://rbf.readthedocs.io/en/latest/index.html#.
The results are pretty cool! Using the following options
from rbf.interpolate import RBFInterpolant

# X_obs: the measured (x, y) locations, U_obs: the corresponding z-values
spline = RBFInterpolant(X_obs, U_obs, phi='phs5', order=1, sigma=0.0, eps=1.)
I was able to get the right shape even at the edge.
Surface interpolation
I've played around with the different phi functions and here is the boxplot of the spread between the interpolated surface and the points that I'm testing the interpolation against (the red points in the picture).
Boxplot
With phs5 I get the best result, with an average spread of about 0.5 mm on the upper surface and 0.8 mm on the lower surface. Before, I was getting a similar average but with many outliers > 15 mm. Definitely a success :)

Are there any Python options for 3D linear piecewise/segmented regression

I'm looking for a solution that fits a number of piecewise planes to linearly approximate a surface. Ideally the user could define the number of planes, and the code would determine the "optimal" pieces of the data to fit them to.
There seem to be a number of 2D options discussed, e.g. here: How to apply piecewise linear fit in Python?, but nothing in 3D.
Thanks!
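Not an off-the-shelf answer, but one possible sketch (my own suggestion, not a known library feature): partition the (x, y) domain with k-means and fit a least-squares plane per region. Note the partition is based on location only, not on fit quality, so it won't find the truly "optimal" pieces the question asks for:
import numpy as np
from sklearn.cluster import KMeans

def piecewise_planes(x, y, z, n_planes):
    # partition the (x, y) domain into n_planes regions by location
    xy = np.column_stack([x, y])
    labels = KMeans(n_clusters=n_planes, n_init=10).fit_predict(xy)
    planes = []
    for k in range(n_planes):
        m = labels == k
        # least-squares plane z = a*x + b*y + c on each region
        A = np.column_stack([x[m], y[m], np.ones(m.sum())])
        coef, *_ = np.linalg.lstsq(A, z[m], rcond=None)
        planes.append(coef)
    return labels, planes

# random placeholder surface, just to show the call
rng = np.random.default_rng(0)
x, y = rng.uniform(-1, 1, (2, 500))
z = np.abs(x) + 0.5 * y  # a ridge: two planes fit it exactly
labels, planes = piecewise_planes(x, y, z, n_planes=2)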

How to apply clustering algorithms to a volume surface?

I'm trying to compare some OpenFOAM CFD simulations that reproduce the flow through an almost spherical object (the object is a reconstruction with an irregular shape), searching for the one with the smallest cluster of wall shear stress. I want to know if there is a way of running a clustering algorithm, like k-means, EM, or another unsupervised algorithm, on this irregular surface. In other words, I would like to numerically compare the colormap plotted on slightly different shapes, taking, for example, the area and the mean value of the clusters as the parameters for the comparison. Has anyone ever handled a similar situation?
I've already tried to project this object onto a plane or a sphere, but the distortion was greater than expected, so that is no longer an option.
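One hedged sketch (my own suggestion, assuming the surface can be exported as cell centers with per-cell wall shear stress and area; all arrays below are random placeholders): cluster on position plus stress so clusters stay spatially coherent, then compare shapes by per-cluster area and area-weighted mean stress:
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# random placeholders for exported surface data: cell centers, per-cell
# wall shear stress, and cell areas
rng = np.random.default_rng(0)
centers = rng.normal(size=(1000, 3))
wss = rng.random(1000)
areas = rng.random(1000)

# cluster on 3D position plus the stress value
features = StandardScaler().fit_transform(np.column_stack([centers, wss]))
labels = KMeans(n_clusters=4, n_init=10).fit_predict(features)

# per-cluster area and area-weighted mean stress as comparison parameters
for k in range(4):
    m = labels == k
    print(k, areas[m].sum(), np.average(wss[m], weights=areas[m]))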

How to improve HOG detector with linear SVM performance for car detection?

So, I want to detect cars in video recorded by a driving recorder (dash cam). I've read and researched quite a lot but I'm still not quite getting it. I'm thinking of using a HOG descriptor with a linear SVM, but in what ways can it be improved, to make it easier to implement and more robust? This will be a kind of research project for me.
I'm thinking of combining another technique/algorithm with the HOG, but I'm still kind of lost. I am quite new to this.
Any help is greatly appreciated. I am also open to other, better ideas.
HOG (histogram of oriented gradients) is merely a certain type of feature vector that can be computed from your data. You compute the gradient vector at each pixel in your image and then you divide up the possible angles into a discrete number of bins. Within a given image sub-region, you add the total magnitude of the gradient pointing in a given direction as the entry for the relevant angular bin containing that direction.
This leaves you with a vector that has a length equal to the number of bins you've chosen for dividing up the range of angles and acts as an unnormalized histogram.
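You rarely need to code this by hand; scikit-image, for instance, ships an implementation (a minimal sketch using a stock test image as a placeholder for a real detection window):
from skimage import data
from skimage.feature import hog

image = data.camera()  # stock grayscale test image as a placeholder
hog_vec = hog(image, orientations=9, pixels_per_cell=(8, 8),
              cells_per_block=(2, 2), block_norm='L2-Hys')
print(hog_vec.shape)  # one long feature vector for this window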
If you want to compute other image features for the same sub-region, such as the sum of the pixels, some measurement of sharp angles or lines, aspects of the color distribution, and so forth, you can compute as many or as few as you would like, arrange them into a vector as well, and simply concatenate that feature vector with the HOG vector.
You may also want to repeat the computation of the HOG vector at several different scale levels to help capture some scale variability, concatenating each scale-specific HOG vector onto the overall feature vector. There are other feature concepts, like SIFT, which are designed to account for scale invariance automatically.
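Concretely, the multi-scale concatenation might look like this (scale factors and the extra scalar feature are arbitrary placeholders):
import numpy as np
from skimage import data
from skimage.feature import hog
from skimage.transform import rescale

image = data.camera()  # stock grayscale test image as a placeholder
scales = [1.0, 0.5]    # arbitrary choice of scale levels
parts = [hog(rescale(image, s), orientations=9, pixels_per_cell=(8, 8),
             cells_per_block=(2, 2)) for s in scales]
parts.append(np.array([image.mean()]))  # an extra scalar feature, for example
feature_vec = np.concatenate(parts)     # same fixed ordering for every sample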
You may need to do some normalization or scaling, which you can read about in any standard SVM guide. The standard LIBSVM guide is a great place to start.
You will have to be careful to organize your feature vector correctly since you will likely have a very large number of components to the feature vector, and you have to ensure they are always calculated and placed into the same ordering and undergo exactly the same scaling or normalization treatments.
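One way to guarantee identical scaling at training and prediction time is a scikit-learn pipeline (a sketch with random placeholder data standing in for the real feature matrix and car/not-car labels):
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# random placeholders for the real feature matrix and labels
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))
y = rng.integers(0, 2, size=200)

# the scaler fitted on training data is applied identically at prediction time
clf = make_pipeline(StandardScaler(), LinearSVC(C=1.0))
clf.fit(X, y)
print(clf.predict(X[:5]))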
