I'm looking for a function which mimics MATLAB's cscvn function in their Curve Fitting Toolbox, suitable for points in 3D space. The closest function I've found has been scipy.interpolate.splprep, which is capable of computing 3 dimensions but loses its accuracy with fewer data points. If smoothness is reduced to a point of fitting the points, the curve has kinks.
I have a discrete dataset made up of physical points (elevation data) that I'm looking to model, so the spline must pass through those points. There is a finite number of points at varying chord lengths from one another.
Here's a sample of the quick test function I've written to test Python splines. Unfortunately, I can't share my MATLAB code, but the cscvn function splines smoothly and passes through all data points.
import scipy as sp
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import splprep, splev, interp2d
x = np.linspace(0, 10, num = 20) #list of known x coordinates
y = 2*x #list of known y coordinates
z = x*x #list of known z coordinates
## Note: You must have more points than degree of the spline. if k = 3, must have 4 points min.
print([x,y,z])
tck, u = splprep([x,y,z], s = 26) # Generate function out of provided points, default k = 3
newPoints = splev(u, tck) # Creating spline points
print(newPoints)
ax = plt.axes(projection = "3d")
ax.plot3D(x, y, z, 'go') # Green is the actual 3D function
ax.plot3D(newPoints[:][0], newPoints[:][1], newPoints[:][2], 'r-') # Red is the spline
plt.show()
Here is an example of many points creating a smooth curve (red), but the line doesn't align with the physical data points (green).
Here is an example of kinks in the spline (red) created by too few data points (green). This is more akin to what my dataset looks like.
Change your U for:
unew = np.arange(0, 1.00, 0.005)
I have measurements (PPI arc scans) taken with a doppler wind lidar. The data is stored in a pandas dataframe where rows represent azimuth angle and columns represent radial distance (input shape = 30x197). Link to example scan, (csv). I want to transform this to a cartesian coordinate system, and output a 2d array which is re-gridded into x,y coordinates instead of polar with the values stored in the appropriate grid cell. Interpolation (nearest neighbor) is ok and so is zero or NaN padding of areas where no data exists.
Ideally the X and Y grid should correspond to the actual distances between points, but right now I'm just trying to get this working. This shouldn’t be terribly difficult, but I’m having trouble obtaining the result I want.
So far, I have working code which plots on a polar axis beautifully (example image) but this won't work for the next steps of my analysis.
I have tried many different approaches with scipy.interpolate.griddata, scipy.ndimage.geometric_transform, and scipy.ndimage.map_coordinates but haven't gotten the correct output. Here is an example of my recent attempt (df_polar is the csv file linked):
# Generate polar and cartesian meshgrids
r = df_polar.columns
theta = df_polar.index
theta = np.deg2rad(theta)
# Polar meshgrid
rad_c, theta_c = np.meshgrid(r,theta)
# Cartesian meshgrid
X = rad_c * np.cos(theta_c)
Y = rad_c * np.sin(theta_c)
x,y = np.meshgrid(X,Y)
# Interpolate from polar to cartesian grid
new_grid = scipy.interpolate.griddata(
(rad_c.flatten(), theta_c.flatten()),
np.array(df_polar).flatten(), (x,y), method='nearest')
The result is not correct at all, and from reading the documentation and examples I don't understand why. I would greatly appreciate any tips on where I have gone wrong. Thanks a lot!!
I think you might be feeding griddata the wrong points. It wants cartesian points and if you want the values interpolated over a regular x/y grid you need to create one and provide that too.
Try this and let me know if it produces the expected result. It's hard for me to tell if this is what it should produce:
from scipy.interpolate import griddata
import pandas as pd
import numpy as np
df_polar = pd.read_csv('onescan.txt', index_col=0)
# Generate polar and cartesian meshgrids
r = pd.to_numeric(df_polar.columns)
theta = np.deg2rad(df_polar.index)
# Polar meshgrid
rad_c, theta_c = np.meshgrid(r, theta)
# Cartesian equivalents of polar co-ordinates
X = rad_c*np.cos(theta_c)
Y = rad_c*np.sin(theta_c)
# Cartesian (x/y) meshgrid
grid_spacing = 100.0 # You can change this
nx = (X.max() - X.min())/grid_spacing
ny = (Y.max() - Y.min())/grid_spacing
x = np.arange(X.min(), X.max() + grid_spacing, grid_spacing)
y = np.arange(Y.min(), Y.max() + grid_spacing, grid_spacing)
grid_x, grid_y = np.meshgrid(x, y)
# Interpolate from polar to cartesian grid
new_grid = griddata(
(X.flatten(), Y.flatten()),
df_polar.values.flatten(),
(grid_x, grid_y),
method='nearest'
)
The resulting values look something like this (with grid_spacing = 10 and flipping x and y):
import matplotlib.pyplot as plt
plt.imshow(new_grid.T, cmap='hot')
Clearly interpolate "nearest" needs taming...
I will have a 3-d grid of points (defined by Cartesian vectors). For any given coordinate within the grid, I wish to find the 8 grid points making the cuboid which surrounds the given coordinate. I also need the distances between the vertices of the cuboid and the given coordinate. I have found a way of doing this for a meshgrid with regular spacings, but not for irregular spacings. I do not yet have an example of the irregularly spaced grid data, I just know that the algorithm will have to deal with them eventually. My solution for the regularly spaced points is based off of this post, Finding index of nearest point in numpy arrays of x and y coordinates and is as follows:
import scipy as sp
import numpy as np
x, y, z = np.mgrid[0:5, 0:10, 0:20]
# Example 3-d grid of points.
b = np.dstack((x.ravel(), y.ravel(), z.ravel()))[0]
tree = sp.spatial.cKDTree(b)
example_coord = np.array([1.5, 3.5, 5.5])
d, i = tree.query((example_coord), 8)
# i being the indices of the closest grid points, d being their distance from the
# given coordinate, example_coord
b[i[0]], d[0]
# This gives one of the points of the surrounding cuboid and its distance from
# example_coord
I am looking to make this algorithm run as efficiently as possible as it will need to be run a lot. Thanks in advance for your help.
I've been having invalid input errors when working with scipy interp2d function. It turns out the problem comes from the bisplrep function, as showed here:
import numpy as np
from scipy import interpolate
# Case 1
x = np.linspace(0,1)
y = np.zeros_like(x)
z = np.ones_like(x)
tck = interpolate.bisplrep(x,y,z) # or interp2d
Returns: ValueError: Invalid inputs
It turned out the test data I was giving interp2d contained only one distinct value for the 2nd axis, as in the test sample above. The bisplrep function inside interp2d considers it as an invalid output:
This may be considered as an acceptable behaviour: interp2d & bisplrep expect a 2D grid, and I'm only giving them values along one line.
On a side note, I find the error message quite unclear. One could include a test in interp2d to deal with such cases: something along the lines of
if len(np.unique(x))==1 or len(np.unique(y))==1:
ValueError ("Can't build 2D splines if x or y values are all the same")
may be enough to detect this kind of invalid input, and raise a more explicit error message, or even directly call the more appropriate interp1d function (which works perfectly here)
I thought I had correctly understood the problem. However, consider the following code sample:
# Case 2
x = np.linspace(0,1)
y = x
z = np.ones_like(x)
tck = interpolate.bisplrep(x,y,z)
In that case, y being proportional to x, I'm also feeding bisplrep with data along one line. But, surprisingly, bisplrep is able to compute a 2D spline interpolation in that case. I plotted it:
# Plot
def plot_0to1(tck):
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
X = np.linspace(0,1,10)
Y = np.linspace(0,1,10)
Z = interpolate.bisplev(X,Y,tck)
X,Y = np.meshgrid(X,Y)
fig = plt.figure()
ax = Axes3D(fig)
ax.plot_surface(X, Y, Z,rstride=1, cstride=1, cmap=cm.coolwarm,
linewidth=0, antialiased=False)
plt.show()
plot_0to1(tck)
The result is the following:
where bisplrep seems to fill the gaps with 0's, as better showed when I extend the plot below:
Regarding of whether adding 0 is expected, my real question is: why does bisplrep work in Case 2 but not in Case 1?
Or, in other words: do we want it to return an error when 2D interpolation is fed with input along one direction only (Case 1 & 2 fail), or not? (Case 1 & 2 should return something, even if unpredicted).
I was originally going to show you how much of a difference it makes for 2d interpolation if your input data are oriented along the coordinate axes rather than in some general direction, but it turns out that the result would be even messier than I had anticipated. I tried using a random dataset over an interpolated rectangular mesh, and comparing that to a case where the same x and y coordinates were rotated by 45 degrees for interpolation. The result was abysmal.
I then tried doing a comparison with a smoother dataset: turns out scipy.interpolate.interp2d has quite a few issues. So my bottom line will be "use scipy.interpolate.griddata".
For instructive purposes, here's my (quite messy) code:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.cm as cm
n = 10 # rough number of points
dom = np.linspace(-2,2,n+1) # 1d input grid
x1,y1 = np.meshgrid(dom,dom) # 2d input grid
z = np.random.rand(*x1.shape) # ill-conditioned sample
#z = np.cos(x1)*np.sin(y1) # smooth sample
# first interpolator with interp2d:
fun1 = interp.interp2d(x1,y1,z,kind='linear')
# construct twice finer plotting and interpolating mesh
plotdom = np.linspace(-1,1,2*n+1) # for interpolation and plotting
plotx1,ploty1 = np.meshgrid(plotdom,plotdom)
plotz1 = fun1(plotdom,plotdom) # interpolated points
# construct 45-degree rotated input and interpolating meshes
rotmat = np.array([[1,-1],[1,1]])/np.sqrt(2) # 45-degree rotation
x2,y2 = rotmat.dot(np.vstack([x1.ravel(),y1.ravel()])) # rotate input mesh
plotx2,ploty2 = rotmat.dot(np.vstack([plotx1.ravel(),ploty1.ravel()])) # rotate plotting/interp mesh
# interpolate on rotated mesh with interp2d
# (reverse rotate by using plotx1, ploty1 later!)
fun2 = interp.interp2d(x2,y2,z.ravel(),kind='linear')
# I had to generate the rotated points element-by-element
# since fun2() accepts only rectangular meshes as input
plotz2 = np.array([fun2(xx,yy) for (xx,yy) in zip(plotx2.ravel(),ploty2.ravel())])
# try interpolating with griddata
plotz3 = interp.griddata(np.array([x1.ravel(),y1.ravel()]).T,z.ravel(),np.array([plotx1.ravel(),ploty1.ravel()]).T,method='linear')
plotz4 = interp.griddata(np.array([x2,y2]).T,z.ravel(),np.array([plotx2,ploty2]).T,method='linear')
# function to plot a surface
def myplot(X,Y,Z):
fig = plt.figure()
ax = Axes3D(fig)
ax.plot_surface(X, Y, Z,rstride=1, cstride=1,
linewidth=0, antialiased=False,cmap=cm.coolwarm)
plt.show()
# plot interp2d versions
myplot(plotx1,ploty1,plotz1) # Cartesian meshes
myplot(plotx1,ploty1,plotz2.reshape(2*n+1,-1)) # rotated meshes
# plot griddata versions
myplot(plotx1,ploty1,plotz3.reshape(2*n+1,-1)) # Cartesian meshes
myplot(plotx1,ploty1,plotz4.reshape(2*n+1,-1)) # rotated meshes
So here's a gallery of the results. Using random input z data, and interp2d, Cartesian (left) vs rotated interpolation (right):
Note the horrible scale on the right side, noting that the input points are between 0 and 1. Even its mother wouldn't recognize the data set. Note that there are runtime warnings during the evaluation of the rotated data set, so we're being warned that it's all crap.
Now let's do the same with griddata:
We should note that these figures are much closer to each other, and they seem to make way more sense than the output of interp2d. For instance, note the overshoot in the scale of the very first figure.
These artifacts always arise between input data points. Since it's still interpolation, the input points have to be reproduced by the interpolating function, but it's pretty weird that a linear interpolating function overshoots between data points. It's clear that griddata doesn't suffer from this issue.
Consider an even more clear case: the other set of z values, which are smooth and deterministic. The surfaces with interp2d:
HELP! Call the interpolation police! Already the Cartesian input case has inexplicable (well, at least by me) spurious features in it, and the rotated input case poses the threat of s͔̖̰͕̞͖͇ͣ́̈̒ͦ̀̀ü͇̹̞̳ͭ̊̓̎̈m̥̠͈̣̆̐ͦ̚m̻͑͒̔̓ͦ̇oͣ̐ͣṉ̟͖͙̆͋i͉̓̓ͭ̒͛n̹̙̥̩̥̯̭ͤͤͤ̄g͈͇̼͖͖̭̙ ̐z̻̉ͬͪ̑ͭͨ͊ä̼̣̬̗̖́̄ͥl̫̣͔͓̟͛͊̏ͨ͗̎g̻͇͈͚̟̻͛ͫ͛̅͋͒o͈͓̱̥̙̫͚̾͂.
So let's do the same with griddata:
The day is saved, thanks to The Powerpuff Girls scipy.interpolate.griddata. Homework: check the same with cubic interpolation.
By the way, a very short answer to your original question is in help(interp.interp2d):
| Notes
| -----
| The minimum number of data points required along the interpolation
| axis is ``(k+1)**2``, with k=1 for linear, k=3 for cubic and k=5 for
| quintic interpolation.
For linear interpolation you need at least 4 points along the interpolation axis, i.e. at least 4 unique x and y values have to be present to get a meaningful result. Check these:
nvals = 3 # -> RuntimeWarning
x = np.linspace(0,1,10)
y = np.random.randint(low=0,high=nvals,size=x.shape)
z = x
interp.interp2d(x,y,z)
nvals = 4 # -> no problem here
x = np.linspace(0,1,10)
y = np.random.randint(low=0,high=nvals,size=x.shape)
z = x
interp.interp2d(x,y,z)
And of course this all ties in to you question like this: it makes a huge difference if your geometrically 1d data set is along one of the Cartesian axes, or if it's in a general way such that the coordinate values assume various different values. It's probably meaningless (or at least very ill-defined) to try 2d interpolation from a geometrically 1d data set, but at least the algorithm shouldn't break if your data are along a general direction of the x,y plane.
I checked the available interpolation method in scipy, but could not get the proper solution for my case.
assume i have 100 points whose coordinates are random,
e.g., their x and y positions are:
x=np.random.rand(100)*100
y=np.random.rand(100)*100
z = f(x,y) #the point value calculated by certain function
now i want to get the point value z of a new evenly sampled coordinates (xnew and y new)
xnew = range(100)
ynew = range(100)
how should i do this using bilinear sampling?
i know it is possible to do it point by point, e.g., find the 4 nearest random points, and do the interpolation, but there got to be some easier existing functions to do this
thanks alot!
Use scipy.interpolate.griddata. It does the exact thing you need
# griddata expects an ndarray for the interpolant coordinates
interpolants = numpy.array([xnew, ynew])
# defaults to linear interpolation
znew = scipy.interpolate.griddata((x, y), z, interpolants)
http://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.griddata.html#scipy.interpolate.griddata