I want to approximate scattered 2D data onto a regular grid with slightly bigger dimensions than the surrounding box of scattered data in Python.
My reference is the behavior of the R package mba.
I tried out SmoothBivariateSpline. In the inner area, the results with kx and ky set to 5 and default smoothing came close to the R implementation. However, the approximated values near the border go wild. That is not the case for the R function.
The RBFInterpolator with kernel 'linear' looks comparable too. However, the smoothing values I need to get proper smoothing go up into the 10^4 region.
I would like to understand how that smoothing works and find a good method to set a default for various inputs.
Edit: I tried scaling down the dimensions of x and y. To get the same results I had to scale down the smoothing factor too. My guess now is that the smoothing factor is something like a radius for a smoothing range.
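The scaling observation in the edit can be checked numerically. The following is a minimal sketch on made-up data (the random points, the scale factor 0.01 and the smoothing value 10.0 are assumptions for illustration only). With kernel='linear' the kernel values are proportional to distance, so shrinking x and y by a factor c and multiplying smoothing by the same c should reproduce the same surface.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

# Synthetic scattered data (assumption: stands in for the real measurements).
rng = np.random.default_rng(0)
xy = rng.uniform(0.0, 1000.0, size=(200, 2))
z = np.sin(xy[:, 0] / 150.0) + np.cos(xy[:, 1] / 200.0) + rng.normal(0.0, 0.1, 200)

# Regular grid slightly larger than the bounding box of the data.
gx, gy = np.meshgrid(np.linspace(-50.0, 1050.0, 40), np.linspace(-50.0, 1050.0, 40))
grid = np.column_stack([gx.ravel(), gy.ravel()])

c = 0.01          # coordinate scale factor (illustrative)
smoothing = 10.0  # smoothing in the original coordinates (illustrative)

# Same data, once in original coordinates and once with coordinates and smoothing scaled by c.
z_original = RBFInterpolator(xy, z, kernel="linear", smoothing=smoothing)(grid)
z_rescaled = RBFInterpolator(c * xy, z, kernel="linear", smoothing=c * smoothing)(c * grid)

max_difference = np.max(np.abs(z_original - z_rescaled))
print(max_difference)
```

If the two surfaces agree, that suggests reading the smoothing value as being in the same units as the kernel output (a distance, for 'linear') rather than as a dimensionless weight. That would also explain why coordinates in the hundreds or thousands call for large smoothing values. Normalizing x and y to a fixed range before fitting is one possible way to make a single default behave similarly across inputs.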
I have surface data Z over an [X,Y] mesh. In general Z = 0, but there will be peaks which stick up above this flat background, and these peaks will have roughly elliptical cross sections. These are diffraction intensity peaks, if anyone is curious. I would like to measure the elliptical cross section at about half the peak's maximum value.
So typically with diffraction, if it's a peak y = f(x), we want to look at the Full Width at Half Max (FWHM), which can be done by finding the peak's maximum, then intersecting the peak at that value and measuring the width. No problem.
Here I want to perform the analogous operation, but at higher dimension. If the peak had a circular cross section, then I could just do the FWHM = diameter of cross section. However, these peaks are elliptical, so I want to slice the peak at its half max and then fit an ellipse to the cross section. That way I can get the major and minor axes, inclination angle, and goodness of fit, all of which contain relevant information that a simple FWHM number would not provide.
I can hack together a way to do this, but it's slow and messy, so it feels like there must be a better way. My question really comes down to this: has anyone solved this kind of problem before, and if so, are there any modules I could use to perform the calculation quickly and with simple, clean code?
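One cheap way to get the axes and inclination is sketched below. It is not a fit to the contour: it thresholds Z at half the maximum and uses the second moments of the pixels above the threshold, relying on the fact that a uniformly filled ellipse with semi-axes a and b has variances a²/4 and b²/4 along its principal axes. It assumes a single peak on a flat zero background. The function name half_max_ellipse and the rotated Gaussian test peak are made up for illustration.

```python
import numpy as np

def half_max_ellipse(X, Y, Z):
    """Estimate the half-maximum cross section of a single peak from second moments."""
    mask = Z >= 0.5 * Z.max()                  # pixels at or above half maximum
    pts = np.column_stack([X[mask], Y[mask]])  # (n_pixels, 2) coordinates
    center = pts.mean(axis=0)
    cov = np.cov(pts, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    semi_minor, semi_major = 2.0 * np.sqrt(evals)
    major_dir = evecs[:, 1]                    # direction of the major axis
    angle_deg = np.degrees(np.arctan2(major_dir[1], major_dir[0])) % 180.0
    return center, semi_major, semi_minor, angle_deg

# Synthetic test peak: Gaussian with sigma 2 and 1, rotated by 30 degrees.
axis = np.linspace(-10.0, 10.0, 401)
X, Y = np.meshgrid(axis, axis)
theta = np.radians(30.0)
xr = X * np.cos(theta) + Y * np.sin(theta)
yr = -X * np.sin(theta) + Y * np.cos(theta)
Z = np.exp(-(xr**2 / (2 * 2.0**2) + yr**2 / (2 * 1.0**2)))

center, semi_major, semi_minor, angle_deg = half_max_ellipse(X, Y, Z)
print(center, semi_major, semi_minor, angle_deg)
```

For a Gaussian, the half-maximum semi-axes should come out near sigma * sqrt(2 ln 2). This approach gives no goodness-of-fit number. If that is needed, extracting the contour and fitting an ellipse to it (for example with scikit-image's find_contours and EllipseModel) is probably the better route.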
I need to plot a function whose analytic form I know. For example, y=exp(-x^2/2).
If I plot this function over the interval [-5,5] with 11 points, I get the blue curve below. Notice how the points are connected by straight lines, which misrepresents the function. If I use more points, I get a more accurate plot, like the yellow curve below. On the other hand, if I use a cubic spline, the curve looks more realistic even at a low sampling rate (green curve).
This is not a problem in most cases, but if I want vector output like SVG or PDF, I'd love to keep the number of points as low as possible to minimize the size of the output. So I'd like to know whether matplotlib or any other Python-based plotting library has an option to either connect the dots using a spline, or specify the slope and curvature at each point (similar to Photoshop or MS PowerPoint).
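Since the analytic form is known, the slope at each sample is known too, and a cubic Hermite segment between two samples can be written exactly as a cubic Bezier segment. The sketch below computes those Bezier control points with plain NumPy and compares the error against straight-line interpolation. The helper names hermite_to_bezier and evaluate_bezier are made up for illustration.

```python
import numpy as np

def f(x):
    return np.exp(-x**2 / 2.0)

def dfdx(x):
    return -x * np.exp(-x**2 / 2.0)

def hermite_to_bezier(x, y, slope):
    """Return cubic Bezier control points, shape (n-1, 4, 2), one segment per interval."""
    h = np.diff(x)
    p0 = np.column_stack([x[:-1], y[:-1]])
    p3 = np.column_stack([x[1:], y[1:]])
    p1 = p0 + np.column_stack([h, h * slope[:-1]]) / 3.0  # leave p0 along its tangent
    p2 = p3 - np.column_stack([h, h * slope[1:]]) / 3.0   # arrive at p3 along its tangent
    return np.stack([p0, p1, p2, p3], axis=1)

def evaluate_bezier(segments, t):
    """Evaluate every segment at parameters t; returns shape (n_segments, len(t), 2)."""
    t = t[None, :, None]
    p0, p1, p2, p3 = (segments[:, i, None, :] for i in range(4))
    return (1 - t)**3 * p0 + 3 * (1 - t)**2 * t * p1 + 3 * (1 - t) * t**2 * p2 + t**3 * p3

x_nodes = np.linspace(-5.0, 5.0, 11)
segments = hermite_to_bezier(x_nodes, f(x_nodes), dfdx(x_nodes))
curve = evaluate_bezier(segments, np.linspace(0.0, 1.0, 50)).reshape(-1, 2)

# Compare against the exact function and against straight-line interpolation.
bezier_error = np.max(np.abs(curve[:, 1] - f(curve[:, 0])))
linear_error = np.max(np.abs(np.interp(curve[:, 0], x_nodes, f(x_nodes)) - f(curve[:, 0])))
print(bezier_error, linear_error)
```

In matplotlib, these control points can be handed to matplotlib.path.Path with Path.MOVETO and Path.CURVE4 codes and drawn through a PathPatch, so the vector output should store ten Bezier segments instead of a long polyline.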
I'm working on a heatmap generation program which hopefully will fill in the colors based on value samples provided from a building layout (this is not GPS based).
If I have only a few known data points such as these in a large matrix of unknowns, how do I interpolate the values in between in Python?
0,0,0,0,1,0,0,0,0,0,5,0,0,0,0,9
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,0,0,2,0,0,0,0,0,0,0,0,8,0,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,8,0,0,0,0,0,0,0,6,0,0,0,0,0,0
0,0,0,0,0,3,0,0,0,0,0,0,0,0,7,0
I understand that bilinear won't do it, and Gaussian will bring all the peaks down to low values due to the sheer number of surrounding zeros. This is obviously a matrix handling problem, and I don't need it to be Bezier-curve smooth; anything close enough to work as a graphic representation would be fine. My matrix will end up being about 1500×900 cells in size, with approximately 100 known points.
Once the values are interpolated, I have written code to convert it all to colors, no problem. It's just that right now I'm getting single colored pixels sprinkled over a black background.
Proposing a naive solution:
Step 1: interpolate and extrapolate existing data points onto surroundings.
This can be done using a "wave propagation" type of algorithm.
The known points "spread out" their values onto their surroundings until the whole grid is "flooded" with known values. At the end of this stage you have a number of intersecting "disks", and no zeroes left.
Step 2: smooth the result (using bilinear filtering or some other filter).
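The two steps above can be sketched with SciPy. This is one possible reading of the idea, not the only one: the flood fill is approximated by nearest-neighbour assignment (scipy.interpolate.griddata with method="nearest"), and the smoothing step uses a Gaussian filter. The sigma value is an arbitrary choice for this small example and would need to be much larger on a 1500×900 grid.

```python
import numpy as np
from scipy.interpolate import griddata
from scipy.ndimage import gaussian_filter

# The sample matrix from the question; zeros mark unknown cells.
samples = np.array([
    [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 9],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 8, 0, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 7, 0],
])

# Step 1: every cell takes the value of its nearest known point.
rows, cols = np.nonzero(samples)
known_values = samples[rows, cols].astype(float)
grid_r, grid_c = np.mgrid[0:samples.shape[0], 0:samples.shape[1]]
filled = griddata((rows, cols), known_values, (grid_r, grid_c), method="nearest")

# Step 2: blur the hard boundaries between the regions.
smoothed = gaussian_filter(filled, sigma=1.5, mode="nearest")
print(np.round(smoothed, 1))
```

Because every cell already holds a plausible value before the blur, the peaks are averaged with neighbouring sample values rather than with zeros, which avoids the flattening the question describes.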
If you are able to use SciPy, then interp2d does exactly what you want. A possible problem with it is that it seems not to extrapolate smoothly, according to this issue. This means that all values near the walls are going to be the same as those of their closest neighbouring points. This can be solved by putting thermometers in all 4 corners :)
I'm referencing this question and this documentation in trying to turn a set of points (the purple dots in the image below) into an interpolated grid.
As you can see, the image has missing spots where dots should be. I'd like to figure out where those are.
import numpy as np
from scipy import interpolate

CIRCLES_X = 25  # There should be 25 circles going across
CIRCLES_Y = 10  # There should be 10 circles going down

points = []
values = []
# Points range from 0-800 ish X, 0-300 ish Y
for circle in detected_circles:  # detected_circles: placeholder name for the detected dots
    points.append([circle.x, circle.y])
    values.append(1)  # Not sure what this should be

grid_x, grid_y = np.mgrid[0:CIRCLES_Y, 0:CIRCLES_X]
grid = interpolate.griddata(points, values, (grid_x, grid_y), method='linear')
print(grid)
Whenever I print out the result of the grid, I get nan for all of my values.
Where am I going wrong? Is my problem even a correct use case for interpolate.griddata?
First, your uncertain points are mainly at an edge, so it's actually extrapolation. Second, the interpolation methods built into scipy deal with continuous functions defined on the entire plane and approximate them with polynomials, whereas yours is discrete (1 or 0), somewhat periodic rather than polynomial, and only defined on a discrete "grid" of points.
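The extrapolation point can be illustrated with synthetic data. The lattice below is made up to resemble the dot coordinates described in the question, so it only shows one plausible cause of the all-nan output. With method='linear', griddata returns nan for any query point outside the convex hull of the input points, and a query grid built from small indices (0..9, 0..24) can lie entirely outside a hull that spans pixel coordinates.

```python
import numpy as np
from scipy.interpolate import griddata

# Synthetic dot centres in pixel coordinates (assumption: roughly 0-800 x 0-300).
xs = np.linspace(40.0, 760.0, 25)
ys = np.linspace(30.0, 270.0, 10)
px, py = np.meshgrid(xs, ys)
points = np.column_stack([px.ravel(), py.ravel()])
values = np.ones(len(points))  # the constant value used in the question

# Query at small index values: every query point lies outside the convex hull.
index_x, index_y = np.mgrid[0:10, 0:25]
outside = griddata(points, values, (index_x, index_y), method="linear")

# Query in the same pixel coordinate system, strictly inside the hull.
query_x, query_y = np.meshgrid(np.linspace(50.0, 750.0, 25), np.linspace(40.0, 260.0, 10))
inside = griddata(points, values, (query_x, query_y), method="linear")

print(np.isnan(outside).all(), np.nanmin(inside), np.nanmax(inside))
```

Even once the query grid uses the right coordinates, interpolating a constant value of 1 just gives 1 everywhere inside the hull, so the result carries no information about which dots are missing.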
So you have to invent some algorithm to inter/extrapolate your specific kind of function. Whether you'll be able to reuse an existing one - from scipy or elsewhere - is up to you.
One possible way is to replace it with some function (continuous or not) defined everywhere, then evaluate that approximation at the missing points, either in one step, as the scipy.interpolate non-class functions do, or in two separate steps.
For example, you can use a 3-D parabola with peaks at your dots and troughs exactly between them. Or just use ones at the dots and zeros in the blanks and hope the resulting approximation at the grid's points is good enough to give a meaningful result (random overshoots are likely). Then you can use scipy.interpolate.RegularGridInterpolator for both inter- and extrapolation (extrapolation requires bounds_error=False and fill_value=None).
Alternatively, model it as a harmonic function, in which case what you're looking for is a Fourier transform.
Another possible way is to go straight for a discrete solution rather than trying to shoehorn the methods of continuous mathematical analysis into your case: design a (probably entirely custom) algorithm that tries to figure out the "shape" and "dimensions" of your "grid of dots" and then simply fills in the blanks. I'm not sure whether it is possible to plug it into scipy.interpolate's harness as a selectable algorithm alongside the built-in ones.
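A minimal sketch of such a discrete approach follows. It assumes the grid is axis-aligned, evenly spaced, and that the outermost rows and columns each contain at least one detected dot. The function name find_missing_cells and the synthetic test data are made up for illustration. It estimates the pitch from the extent of the detected points, snaps each point to a lattice cell, and reports the cells that received no point.

```python
import numpy as np

CIRCLES_X = 25  # expected dots across
CIRCLES_Y = 10  # expected dots down

def find_missing_cells(points, n_cols, n_rows):
    """Return (row, col) indices of lattice cells that contain no detected dot."""
    points = np.asarray(points, dtype=float)
    x_min, y_min = points.min(axis=0)
    x_max, y_max = points.max(axis=0)
    pitch_x = (x_max - x_min) / (n_cols - 1)
    pitch_y = (y_max - y_min) / (n_rows - 1)
    # Snap each detected dot to its nearest lattice cell.
    col = np.rint((points[:, 0] - x_min) / pitch_x).astype(int)
    row = np.rint((points[:, 1] - y_min) / pitch_y).astype(int)
    occupied = np.zeros((n_rows, n_cols), dtype=bool)
    occupied[row, col] = True
    return np.argwhere(~occupied)

# Synthetic detections: a jittered 10 x 25 lattice with three dots removed.
rng = np.random.default_rng(0)
removed = {(0, 24), (3, 7), (9, 0)}
detected = []
for r in range(CIRCLES_Y):
    for c in range(CIRCLES_X):
        if (r, c) not in removed:
            jitter = rng.uniform(-3.0, 3.0, size=2)
            detected.append([40.0 + 30.0 * c + jitter[0], 30.0 + 26.0 * r + jitter[1]])

missing = find_missing_cells(detected, CIRCLES_X, CIRCLES_Y)
missing_cells = {(int(r), int(c)) for r, c in missing}
print(sorted(missing_cells))
```

A rotated or perspective-distorted grid would need an extra step to estimate the transform before snapping, which this sketch does not attempt.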
And last but not least: you didn't specify whether the "missing" points are points where the value is unknown, or are actually part of the data, i.e. incorrect data. If it's the latter, simple interpolation is not applicable at all, as it assumes that all the data are strictly correct. Then it's a related but different problem: you can approximate the data, but then you have to somehow "throw away irregularities" (terms of higher order of smallness beyond some point).