I have an array of data Y such that Y is a function of an independent variable X (another array).
The values in X vary from 0 to 360, with wraparound.
The values in Y vary from -180 to 180, also with wraparound.
(That is, these values are angles in degrees around a circle.)
Does anyone know of any function in Python (in numpy, scipy, etc.) capable of low-pass filtering my Y values as a function of X?
In case this is at all confusing, here's a plot of example data.
Say you start with
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 360, 360)
y = 5 * np.sin(x / 90. * 3.14) + np.random.randn(360)
plt.plot(x, y, '+')
To perform a circular convolution, you can do the following:
yy = np.concatenate((y, y))
smoothed = np.convolve(np.ones(5) / 5, yy)[5: len(x) + 5]
This takes, at each point, the average over a 5-point window; duplicating the array is what provides the cyclic wraparound. Of course, there are other ways of doing so.
plt.plot(x, smoothed)
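If you prefer not to concatenate by hand, scipy.ndimage can handle the circular boundary for you; a minimal sketch with the same 5-point window (note this window is centered rather than trailing):
from scipy import ndimage

# mode='wrap' treats the array as circular, so no duplication is needed
smoothed = ndimage.uniform_filter1d(y, size=5, mode='wrap')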
Here is a solution using pandas to do a moving average. First, unwrap the data (converting to radians and back) so there are no discontinuities (e.g., a jump from 180 to -179). Then compute the moving average, and finally convert back to wrapped data if desired. Also check out this numpy cookbook recipe using np.convolve().
import numpy as np
import pandas as pd
# generate random data
X = pd.Series([(x + 5*np.random.random())%360 for x in range(-100, 600, 15)])
Y = pd.Series([(y + 5*np.random.random())%360 - 180 for y in range(-200, 500, 15)])
# 'unwrap' the angles so there is no wrap around
X1 = pd.Series(np.rad2deg(np.unwrap(np.deg2rad(X))))
Y1 = pd.Series(np.rad2deg(np.unwrap(np.deg2rad(Y))))
# smooth the data with a moving average
# note: pd.rolling_mean is the pandas 0.17.1 API; it was removed in 0.18 (modern equivalent below)
X2 = pd.rolling_mean(X1, window=3)
Y2 = pd.rolling_mean(Y1, window=3)
# convert back to wrapped data if desired
X3 = X2 % 360
Y3 = (Y2 + 180)%360 - 180
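For pandas 0.18 and later, where rolling_mean was removed, the equivalent rolling mean is a method on the Series:
X2 = X1.rolling(window=3).mean()
Y2 = Y1.rolling(window=3).mean()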
You can use convolve2d from scipy.signal. Here is a function which applies smoothing to a numpy array a. If a has more than one dimension, smoothing is applied to the innermost (fastest) dimension.
import numpy as np
from scipy import signal
def cyclic_moving_av(a, n=3, win_type='boxcar'):
    window = signal.get_window(win_type, n, fftbins=False).reshape((1, n))
    shp_a = a.shape
    b = signal.convolve2d(a.reshape((np.prod(shp_a[:-1], dtype=int), shp_a[-1])),
                          window, boundary='wrap', mode='same')
    return (b / np.sum(window)).reshape(shp_a)
For instance, it can be used like this:
import matplotlib.pyplot as plt
x = np.linspace(0, 360, 360)
y1 = 5 * np.sin(x / 90. * 3.14) + 0.5 * np.random.randn(360)
y2 = 5 * np.cos(0.8 * x / 90. * 3.14) + 0.5 * np.random.randn(360)
y_av = cyclic_moving_av(np.stack((y1, y2)), n=10)  # 1
plt.plot(x, y1, '+')
plt.plot(x, y2, '+')
plt.plot(x, y_av[0])
plt.plot(x, y_av[1])
plt.show()
This results in smoothed curves overlaid on the noisy points.
Line #1 is equivalent to
y_av[0] = cyclic_moving_av(y1, n=10)
y_av[1] = cyclic_moving_av(y2, n=10)
win_type='boxcar' results in averaging over neighbors with equal weights. See signal.get_window for other options.
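For example, a window with a smoother taper can be selected by name (here a Hann window of the same length as above):
y_av = cyclic_moving_av(np.stack((y1, y2)), n=10, win_type='hann')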
I have several points on the unit sphere that are distributed according to the algorithm described in https://www.cmu.edu/biolphys/deserno/pdf/sphere_equi.pdf (and implemented in the code below). On each of these points, I have a value that in my particular case represents 1 minus a small error. The errors are in [0, 0.1] if this is important, so my values are in [0.9, 1].
Sadly, computing the errors is a costly process and I cannot do this for as many points as I want. Still, I want my plots to look like I am plotting something "continuous".
So I want to fit an interpolation function to my data, to be able to sample as many points as I want.
After a little bit of research I found scipy.interpolate.SmoothSphereBivariateSpline which seems to do exactly what I want. But I cannot make it work properly.
Question: what can I use to interpolate (spline, linear interpolation, anything would be fine for the moment) my data on the unit sphere? An answer can be either "you misused scipy.interpolate, here is the correct way to do this" or "this other function is better suited to your problem".
Sample code that should be executable with numpy and scipy installed:
import typing as ty
import numpy
import scipy.interpolate
def get_equidistant_points(N: int) -> ty.List[numpy.ndarray]:
    """Generate approximately N points evenly distributed across the 3-d sphere.

    This function tries to find approximately N points (might be a little less
    or more) that are evenly distributed across the 3-dimensional unit sphere.
    The algorithm used is described in
    https://www.cmu.edu/biolphys/deserno/pdf/sphere_equi.pdf.
    """
    # Unit sphere
    r = 1
    points: ty.List[numpy.ndarray] = list()
    a = 4 * numpy.pi * r ** 2 / N
    d = numpy.sqrt(a)
    m_v = int(numpy.round(numpy.pi / d))
    d_v = numpy.pi / m_v
    d_phi = a / d_v
    for m in range(m_v):
        v = numpy.pi * (m + 0.5) / m_v
        m_phi = int(numpy.round(2 * numpy.pi * numpy.sin(v) / d_phi))
        for n in range(m_phi):
            phi = 2 * numpy.pi * n / m_phi
            points.append(
                numpy.array(
                    [
                        numpy.sin(v) * numpy.cos(phi),
                        numpy.sin(v) * numpy.sin(phi),
                        numpy.cos(v),
                    ]
                )
            )
    return points
def cartesian2spherical(x: float, y: float, z: float) -> numpy.ndarray:
    r = numpy.linalg.norm([x, y, z])
    theta = numpy.arccos(z / r)
    phi = numpy.arctan2(y, x)
    return numpy.array([r, theta, phi])
n = 100
points = get_equidistant_points(n)
# Random here, but costly in real life.
errors = numpy.random.rand(len(points)) / 10
# Change everything to spherical to use the interpolator from scipy.
ideal_spherical_points = numpy.array([cartesian2spherical(*point) for point in points])
r_interp = 1 - errors
theta_interp = ideal_spherical_points[:, 1]
phi_interp = ideal_spherical_points[:, 2]
# Change phi coordinate from [-pi, pi] to [0, 2pi] to please scipy.
phi_interp[phi_interp < 0] += 2 * numpy.pi
# Create the interpolator.
interpolator = scipy.interpolate.SmoothSphereBivariateSpline(
    theta_interp, phi_interp, r_interp
)
# Creating the finer theta and phi values for the final plot
theta = numpy.linspace(0, numpy.pi, 100, endpoint=True)
phi = numpy.linspace(0, numpy.pi * 2, 100, endpoint=True)
# Creating the coordinate grid for the unit sphere.
X = numpy.outer(numpy.sin(theta), numpy.cos(phi))
Y = numpy.outer(numpy.sin(theta), numpy.sin(phi))
Z = numpy.outer(numpy.cos(theta), numpy.ones(100))
thetas, phis = numpy.meshgrid(theta, phi)
heatmap = interpolator(thetas, phis)
Issue with the code above:
With the code as-is, I have a
ValueError: The required storage space exceeds the available storage space: nxest or nyest too small, or s too small. The weighted least-squares spline corresponds to the current set of knots.
that is raised when initialising the interpolator instance.
The error message seems to say that I should change the value of s, one of the parameters of scipy.interpolate.SmoothSphereBivariateSpline. I tested values of s ranging from 0.0001 to 100000; the code above always raises either the exception described above or:
ValueError: Error code returned by bispev: 10
Edit: I am including my findings here. They can't really be considered a solution, which is why I am editing the question rather than posting an answer.
With more research I found this question, Using Radial Basis Functions to Interpolate a Function on a Sphere. The author has exactly the same problem as me and uses a different interpolator: scipy.interpolate.Rbf. I changed the above code by replacing the interpolator and plotting:
# Create the interpolator.
interpolator = scipy.interpolate.Rbf(theta_interp, phi_interp, r_interp)
# Creating the finer theta and phi values for the final plot
plot_points = 100
theta = numpy.linspace(0, numpy.pi, plot_points, endpoint=True)
phi = numpy.linspace(0, numpy.pi * 2, plot_points, endpoint=True)
# Creating the coordinate grid for the unit sphere.
X = numpy.outer(numpy.sin(theta), numpy.cos(phi))
Y = numpy.outer(numpy.sin(theta), numpy.sin(phi))
Z = numpy.outer(numpy.cos(theta), numpy.ones(plot_points))
thetas, phis = numpy.meshgrid(theta, phi)
heatmap = interpolator(thetas, phis)
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib import cm
colormap = cm.inferno
normaliser = mpl.colors.Normalize(vmin=numpy.min(heatmap), vmax=1)
scalar_mappable = cm.ScalarMappable(cmap=colormap, norm=normaliser)
scalar_mappable.set_array([])
fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.plot_surface(
    X,
    Y,
    Z,
    facecolors=colormap(normaliser(heatmap)),
    alpha=0.7,
    cmap=colormap,
)
plt.colorbar(scalar_mappable)
plt.show()
This code runs smoothly and gives the following result:
The interpolation seems OK except along one line where it is discontinuous, just like in the question that led me to this class. One of the answers suggests using a different distance, better adapted to spherical coordinates: the haversine distance.
def haversine(x1, x2):
    theta1, phi1 = x1
    theta2, phi2 = x2
    return 2 * numpy.arcsin(
        numpy.sqrt(
            numpy.sin((theta2 - theta1) / 2) ** 2
            + numpy.cos(theta1) * numpy.cos(theta2) * numpy.sin((phi2 - phi1) / 2) ** 2
        )
    )
# Create the interpolator.
interpolator = scipy.interpolate.Rbf(theta_interp, phi_interp, r_interp, norm=haversine)
which, when executed, gives a warning:
LinAlgWarning: Ill-conditioned matrix (rcond=1.33262e-19): result may not be accurate.
self.nodes = linalg.solve(self.A, self.di)
and a result that is not at all the one expected: the interpolated function takes values as low as -1, which is clearly wrong.
You can use Cartesian coordinates instead of spherical coordinates.
The default norm parameter ('euclidean') used by Rbf is sufficient:
# interpolation
x, y, z = numpy.array(points).T
interpolator = scipy.interpolate.Rbf(x, y, z, r_interp)
# predict
heatmap = interpolator(X, Y, Z)
Here is the result:
ax.plot_surface(
    X, Y, Z,
    rstride=1, cstride=1,
    # or rcount=50, ccount=50,
    facecolors=colormap(normaliser(heatmap)),
    cmap=colormap,
    alpha=0.7, shade=False
)
ax.set_xlabel('x axis')
ax.set_ylabel('y axis')
ax.set_zlabel('z axis')
You can also use a cosine distance if you want (norm parameter):
import scipy.spatial

def cosine(XA, XB):
    if XA.ndim == 1:
        XA = numpy.expand_dims(XA, axis=0)
    if XB.ndim == 1:
        XB = numpy.expand_dims(XB, axis=0)
    return scipy.spatial.distance.cosine(XA, XB)
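Assuming your scipy version accepts a callable norm here (the norm argument's accepted forms have varied across scipy releases), the wrapper above would be passed the same way as haversine earlier:
interpolator = scipy.interpolate.Rbf(x, y, z, r_interp, norm=cosine)
heatmap = interpolator(X, Y, Z)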
In order to better see the differences, I stacked the two images, subtracted them, and inverted the layer.
I am working on a visualization and trying to create a 2D array that is the product of a normalized Gaussian function on the X axis and a normalized exponential function on the Y axis (using Python).
I would use NumPy for this. You can use np.meshgrid to create the (X, Y) axes and use NumPy's vectorized functions to create the function on these coordinates. The array f below is your two-dimensional array, here containing the product of exp(-X/4) and exp(-((Y-2)/1.5)**2). (Substitute your own normalized functions here.)
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0,10,100)
y = np.linspace(0,5,100)
X, Y = np.meshgrid(x, y)
f = np.exp(-X/4.) * np.exp(-((Y-2)/1.5)**2)
fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(f)
plt.show()
If you can't or don't want to use NumPy, you'll have to loop by hand and use conventional math functions:
import math

dx, dy = 0.1, 0.05
nx, ny = 101, 101
xmin, ymin = 0.0, 0.0  # grid origin, matching the numpy example above
f = [[None]*nx for i in range(ny)]
for ix in range(nx):
    x = xmin + dx*ix
    for iy in range(ny):
        y = ymin + dy*iy
        f[iy][ix] = math.exp(-x/4.) * math.exp(-((y-2)/1.5)**2)
I would use numpy for this, because numpy makes it very simple to do what you want. If you can't use it, then something like the following should work:
import math
def gauss(x, mu=0.0, sigma=1.0):
    return 1.0 / math.sqrt(2.0*math.pi*sigma**2) * math.exp(-0.5*(x-mu)**2/sigma**2)

def exponential(x, lam=1.0):
    return lam * math.exp(-lam * x)
# X values from -10 to 10 with 0.01 step size
xvals = [x * 0.01 for x in range(-1000, 1001)]
# Y values from 0 to 10 with 0.01 step size
yvals = [y * 0.01 for y in range(0, 1001)]
# Calculate your function at the grid points
f = [[gauss(x)*exponential(y) for x in xvals] for y in yvals]
I have a trajectory formed by a sequence of (x,y) pairs. I would like to interpolate points on this trajectory using splines.
How do I do this? Using scipy.interpolate.UnivariateSpline doesn't work because neither x nor y is monotonic. I could introduce a parametrization (e.g. length d along the trajectory), but then I have two dependent variables x(d) and y(d).
Example:
import numpy as np
import matplotlib.pyplot as plt
import math
error = 0.1
x0 = 1
y0 = 1
r0 = 0.5
alpha = np.linspace(0, 2*math.pi, 40, endpoint=False)
r = r0 + error * np.random.random(len(alpha))
x = x0 + r * np.cos(alpha)
y = y0 + r * np.sin(alpha)
plt.scatter(x, y, color='blue', label='given')
# For this special case, the following code produces the
# desired results. However, I need something that depends
# only on x and y:
from scipy.interpolate import interp1d
alpha_i = np.linspace(alpha[0], alpha[-1], 100)
r_i = interp1d(alpha, r, kind=3)(alpha_i)
x_i = x0 + r_i * np.cos(alpha_i)
y_i = y0 + r_i * np.sin(alpha_i)
plt.plot(x_i, y_i, color='green', label='desired')
plt.legend()
plt.show()
Using splprep you can interpolate over curves of any geometry.
from scipy import interpolate
tck, u = interpolate.splprep([x, y], s=0.0)
x_i, y_i = interpolate.splev(np.linspace(0, 1, 100), tck)
This produces a plot like the one given, but using only the x and y points, not the alpha and r parameters.
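Since the example trajectory is a closed loop, it may also help to pass splprep's per argument, which makes the fitted spline periodic so the two ends join smoothly:
# per=1 treats the points as one period of a closed curve
tck, u = interpolate.splprep([x, y], s=0.0, per=1)
x_i, y_i = interpolate.splev(np.linspace(0, 1, 100), tck)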
Sorry about my original answer, I misread the question.
I can generate Gaussian data with random.gauss(mu, sigma) function, but how can I generate 2D gaussian? Is there any function like that?
If you can use numpy, there is numpy.random.multivariate_normal(mean, cov[, size]).
For example, to get 10,000 2D samples:
np.random.multivariate_normal(mean, cov, 10000)
where mean.shape==(2,) and cov.shape==(2,2).
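For example, a minimal runnable sketch (the mean and covariance values here are arbitrary):
import numpy as np

mean = np.array([0.0, 0.0])
cov = np.array([[1.0, 0.5],
                [0.5, 2.0]])  # must be symmetric and positive semi-definite
samples = np.random.multivariate_normal(mean, cov, 10000)
print(samples.shape)  # (10000, 2)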
I'd like to add an approximation using exponential functions. This directly generates a 2d matrix which contains a movable, symmetric 2d gaussian.
I should note that I found this code on the scipy mailing list archives and modified it a little.
import numpy as np
def makeGaussian(size, fwhm=3, center=None):
    """ Make a square gaussian kernel.

    size is the length of a side of the square
    fwhm is full-width-half-maximum, which
    can be thought of as an effective radius.
    """
    x = np.arange(0, size, 1, float)
    y = x[:, np.newaxis]
    if center is None:
        x0 = y0 = size // 2
    else:
        x0 = center[0]
        y0 = center[1]
    return np.exp(-4*np.log(2) * ((x-x0)**2 + (y-y0)**2) / fwhm**2)
For reference and enhancements, it is hosted as a gist here. Pull requests welcome!
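A quick usage sketch, e.g. a 7x7 kernel centered in the frame:
import matplotlib.pyplot as plt

kernel = makeGaussian(7, fwhm=3)
plt.imshow(kernel, interpolation='nearest')
plt.show()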
Since the standard 2D Gaussian distribution is just the product of two 1D Gaussian distributions, if there is no correlation between the two axes (i.e. the covariance matrix is diagonal), just call random.gauss twice.
import random

def gauss_2d(mu, sigma):
    x = random.gauss(mu, sigma)
    y = random.gauss(mu, sigma)
    return (x, y)
import numpy as np
# define normalized 2D gaussian
def gaus2d(x=0, y=0, mx=0, my=0, sx=1, sy=1):
    return 1. / (2. * np.pi * sx * sy) * np.exp(-((x - mx)**2. / (2. * sx**2.) + (y - my)**2. / (2. * sy**2.)))
x = np.linspace(-5, 5)
y = np.linspace(-5, 5)
x, y = np.meshgrid(x, y) # get 2D variables instead of 1D
z = gaus2d(x, y)
Straightforward implementation and example of the 2D Gaussian function. Here sx and sy are the spreads in the x and y directions, and mx and my are the center coordinates.
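As a quick sanity check of the normalization (not part of the original example), the discrete sum times the grid-cell area should be close to 1 on a sufficiently wide grid:
dx = x[0, 1] - x[0, 0]  # grid spacing in x (uniform)
dy = y[1, 0] - y[0, 0]  # grid spacing in y
print(z.sum() * dx * dy)  # approximately 1.0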
Numpy has a function to do this. It is documented here. In addition to the method proposed above, it allows drawing samples with arbitrary covariance.
Here is a small example, assuming ipython -pylab is started:
samples = multivariate_normal([-0.5, -0.5], [[1, 0],[0, 1]], 1000)
plot(samples[:, 0], samples[:, 1], '.')
samples = multivariate_normal([0.5, 0.5], [[0.1, 0.5],[0.5, 0.6]], 1000)
plot(samples[:, 0], samples[:, 1], '.')
In case someone finds this thread and is looking for something a little more versatile (like I did), I have modified the code from #giessel. The code below will allow for asymmetry and rotation.
import numpy as np
def makeGaussian2(x_center=0, y_center=0, theta=0, sigma_x=10, sigma_y=10, x_size=640, y_size=480):
    # x_center and y_center are the center of the gaussian, theta is the rotation angle
    # sigma_x and sigma_y are the stdevs in the x and y axis before rotation
    # x_size and y_size give the size of the frame
    theta = 2 * np.pi * theta / 360
    x = np.arange(0, x_size, 1, float)
    y = np.arange(0, y_size, 1, float)
    y = y[:, np.newaxis]
    sx = sigma_x
    sy = sigma_y
    x0 = x_center
    y0 = y_center
    # rotation
    a = np.cos(theta) * x - np.sin(theta) * y
    b = np.sin(theta) * x + np.cos(theta) * y
    a0 = np.cos(theta) * x0 - np.sin(theta) * y0
    b0 = np.sin(theta) * x0 + np.cos(theta) * y0
    return np.exp(-(((a - a0)**2) / (2 * (sx**2)) + ((b - b0)**2) / (2 * (sy**2))))
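For example, an elongated Gaussian rotated by 30 degrees, filling the default 640x480 frame:
import matplotlib.pyplot as plt

frame = makeGaussian2(x_center=320, y_center=240, theta=30,
                      sigma_x=60, sigma_y=20)
plt.imshow(frame, origin='lower')
plt.show()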
We can try just using the numpy method np.random.normal to generate samples from a 2D gaussian distribution.
The sample call is np.random.normal(mean, sigma, (num_samples, 2)).
A sample run with mean = 0 and sigma = 20 is shown below:
np.random.normal(0, 20, (10,2))
>>array([[ 11.62158316, 3.30702215],
[-18.49936277, -11.23592946],
[ -7.54555371, 14.42238838],
[-14.61531423, -9.2881661 ],
[-30.36890026, -6.2562164 ],
[-27.77763286, -23.56723819],
[-18.18876597, 41.83504042],
[-23.62068377, 21.10615509],
[ 15.48830184, -15.42140269],
[ 19.91510876, 26.88563983]])
Hence we get 10 samples in a 2D array with mean = 0 and sigma = 20.
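With a larger draw you can confirm the sample moments (values vary per run):
samples = np.random.normal(0, 20, (100000, 2))
print(samples.mean(axis=0))  # close to [0, 0]
print(samples.std(axis=0))   # close to [20, 20]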
I'm trying to port a program which uses a hand-rolled interpolator (developed by a mathematician colleague) over to use the interpolators provided by scipy. I'd like to use or wrap the scipy interpolator so that it has as close as possible behavior to the old interpolator.
A key difference between the two is that our original interpolator will extrapolate the result if the input value is above or below the input range, while the scipy interpolator raises a ValueError. Consider this program as an example:
import numpy as np
from scipy import interpolate
x = np.arange(0,10)
y = np.exp(-x/3.0)
f = interpolate.interp1d(x, y)
print(f(9))
print(f(11))  # Causes ValueError, because it's greater than max(x)
Is there a sensible way to make it so that, instead of crashing, the final line simply does a linear extrapolation, continuing the gradients defined by the first and last two points to infinity?
Note, that in the real software I'm not actually using the exp function - that's here for illustration only!
As of SciPy version 0.17.0, there is a new option for scipy.interpolate.interp1d that allows extrapolation. Simply set fill_value='extrapolate' in the call. Modifying your code in this way gives:
import numpy as np
from scipy import interpolate
x = np.arange(0,10)
y = np.exp(-x/3.0)
f = interpolate.interp1d(x, y, fill_value='extrapolate')
print(f(9))
print(f(11))
and the output is:
0.0497870683679
0.010394302658
You can take a look at InterpolatedUnivariateSpline.
Here is an example using it:
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline
# given values
xi = np.array([0.2, 0.5, 0.7, 0.9])
yi = np.array([0.3, -0.1, 0.2, 0.1])
# positions to inter/extrapolate
x = np.linspace(0, 1, 50)
# spline order: 1 linear, 2 quadratic, 3 cubic ...
order = 1
# do inter/extrapolation
s = InterpolatedUnivariateSpline(xi, yi, k=order)
y = s(x)
# example showing the interpolation for linear, quadratic and cubic interpolation
plt.figure()
plt.plot(xi, yi)
for order in range(1, 4):
    s = InterpolatedUnivariateSpline(xi, yi, k=order)
    y = s(x)
    plt.plot(x, y)
plt.show()
1. Constant extrapolation
You can use the interp function (numpy.interp; older scipy versions re-exported it as scipy.interp), which holds the left and right values constant beyond the range:
>>> from numpy import interp, arange, exp
>>> x = arange(0,10)
>>> y = exp(-x/3.0)
>>> interp([9,10], x, y)
array([ 0.04978707, 0.04978707])
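np.interp also accepts left and right arguments if you want different constant fill values on each side:
>>> interp([9, 10], x, y, left=1.0, right=0.0)
array([ 0.04978707,  0.        ])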
2. Linear (or other custom) extrapolation
You can write a wrapper around an interpolation function which takes care of linear extrapolation. For example:
from scipy.interpolate import interp1d
from numpy import arange, array, exp
def extrap1d(interpolator):
    xs = interpolator.x
    ys = interpolator.y

    def pointwise(x):
        if x < xs[0]:
            return ys[0] + (x - xs[0]) * (ys[1] - ys[0]) / (xs[1] - xs[0])
        elif x > xs[-1]:
            return ys[-1] + (x - xs[-1]) * (ys[-1] - ys[-2]) / (xs[-1] - xs[-2])
        else:
            return interpolator(x)

    def ufunclike(xs):
        return array(list(map(pointwise, array(xs))))

    return ufunclike
extrap1d takes an interpolation function and returns a function which can also extrapolate. And you can use it like this:
x = arange(0,10)
y = exp(-x/3.0)
f_i = interp1d(x, y)
f_x = extrap1d(f_i)
print(f_x([9, 10]))
Output:
[ 0.04978707 0.03009069]
What about scipy.interpolate.splrep (with degree 1 and no smoothing):
>>> tck = scipy.interpolate.splrep([1, 2, 3, 4, 5], [1, 4, 9, 16, 25], k=1, s=0)
>>> scipy.interpolate.splev(6, tck)
34.0
It seems to do what you want, since 34 = 25 + (25 - 16).
Here's an alternative method that uses only the numpy package. It takes advantage of numpy's array functions, so may be faster when interpolating/extrapolating large arrays:
import numpy as np
def extrap(x, xp, yp):
    """np.interp function with linear extrapolation"""
    y = np.interp(x, xp, yp)
    y = np.where(x < xp[0], yp[0] + (x - xp[0]) * (yp[0] - yp[1]) / (xp[0] - xp[1]), y)
    y = np.where(x > xp[-1], yp[-1] + (x - xp[-1]) * (yp[-1] - yp[-2]) / (xp[-1] - xp[-2]), y)
    return y
x = np.arange(0,10)
y = np.exp(-x/3.0)
xtest = np.array((8.5,9.5))
print(np.exp(-xtest/3.0))
print(np.interp(xtest, x, y))
print(extrap(xtest, x, y))
Edit: Mark Mikofski's suggested modification of the "extrap" function:
def extrap(x, xp, yp):
    """np.interp function with linear extrapolation"""
    y = np.interp(x, xp, yp)
    y[x < xp[0]] = yp[0] + (x[x < xp[0]] - xp[0]) * (yp[0] - yp[1]) / (xp[0] - xp[1])
    y[x > xp[-1]] = yp[-1] + (x[x > xp[-1]] - xp[-1]) * (yp[-1] - yp[-2]) / (xp[-1] - xp[-2])
    return y
It may be faster to use boolean indexing with large datasets: the pointwise algorithm checks whether every single point lies outside the interval, whereas boolean indexing performs the comparison on whole arrays at once.
For example:
# Necessary modules
import numpy as np
from scipy.interpolate import interp1d
# Original data
x = np.arange(0,10)
y = np.exp(-x/3.0)
# Interpolator class
f = interp1d(x, y)
# Output range (quite large)
xo = np.arange(0, 10, 0.001)
# Boolean indexing approach
# Generate an empty output array for "y" values
yo = np.empty_like(xo)
# Values lower than the minimum "x" are extrapolated at the same time
low = xo < f.x[0]
yo[low] = f.y[0] + (xo[low]-f.x[0])*(f.y[1]-f.y[0])/(f.x[1]-f.x[0])
# Values higher than the maximum "x" are extrapolated at same time
high = xo > f.x[-1]
yo[high] = f.y[-1] + (xo[high]-f.x[-1])*(f.y[-1]-f.y[-2])/(f.x[-1]-f.x[-2])
# Values inside the interpolation range are interpolated directly
inside = np.logical_and(xo >= f.x[0], xo <= f.x[-1])
yo[inside] = f(xo[inside])
In my case, with a data set of 300000 points, this means a speed-up from 25.8 to 0.094 seconds, which is more than 250 times faster.
I did it by adding a point to my initial arrays. In this way I avoid defining self-made functions, and the linear extrapolation (in the example below: right extrapolation) looks OK.
import numpy as np
from numpy import interp as itp  # older scipy re-exported this as scipy.interp

# xold and yold are the original data arrays
xnew = np.linspace(0, 1, 51)
x1 = xold[-2]
x2 = xold[-1]
y1 = yold[-2]
y2 = yold[-1]
right_val = y1 + (xnew[-1] - x1) * (y2 - y1) / (x2 - x1)
x = np.append(xold, xnew[-1])
y = np.append(yold, right_val)
f = itp(xnew, x, y)
I don't have enough reputation to comment, but in case somebody is looking for an extrapolation wrapper for a linear 2d-interpolation with scipy, I have adapted the answer that was given here for the 1d interpolation.
def extrap2d(interpolator):
    xs = interpolator.x
    ys = interpolator.y
    zs = interpolator.z
    zs = np.reshape(zs, (-1, len(xs)))

    def pointwise(x, y):
        if x < xs[0] or y < ys[0]:
            # below the grid: bilinear formula anchored at the nearest grid cell
            x1_index = np.argmin(np.abs(xs - x))
            x2_index = x1_index + 1
            y1_index = np.argmin(np.abs(ys - y))
            y2_index = y1_index + 1
            x1 = xs[x1_index]
            x2 = xs[x2_index]
            y1 = ys[y1_index]
            y2 = ys[y2_index]
            z11 = zs[x1_index, y1_index]
            z12 = zs[x1_index, y2_index]
            z21 = zs[x2_index, y1_index]
            z22 = zs[x2_index, y2_index]
            return (z11 * (x2 - x) * (y2 - y) +
                    z21 * (x - x1) * (y2 - y) +
                    z12 * (x2 - x) * (y - y1) +
                    z22 * (x - x1) * (y - y1)
                    ) / ((x2 - x1) * (y2 - y1) + 0.0)
        elif x > xs[-1] or y > ys[-1]:
            # above the grid: same formula, anchored at the last grid cell
            x1_index = np.argmin(np.abs(xs - x))
            x2_index = x1_index - 1
            y1_index = np.argmin(np.abs(ys - y))
            y2_index = y1_index - 1
            x1 = xs[x1_index]
            x2 = xs[x2_index]
            y1 = ys[y1_index]
            y2 = ys[y2_index]
            z11 = zs[x1_index, y1_index]
            z12 = zs[x1_index, y2_index]
            z21 = zs[x2_index, y1_index]
            z22 = zs[x2_index, y2_index]
            return (z11 * (x2 - x) * (y2 - y) +
                    z21 * (x - x1) * (y2 - y) +
                    z12 * (x2 - x) * (y - y1) +
                    z22 * (x - x1) * (y - y1)
                    ) / ((x2 - x1) * (y2 - y1) + 0.0)
        else:
            return interpolator(x, y)

    def ufunclike(xs, ys):
        if isinstance(xs, (int, np.int32)) or isinstance(ys, (int, np.int32)):
            res_array = pointwise(xs, ys)
        else:
            res_array = np.zeros((len(xs), len(ys)))
            for x_c in range(len(xs)):
                res_array[x_c, :] = np.array([pointwise(xs[x_c], ys[y_c]) for y_c in range(len(ys))]).T
        return res_array

    return ufunclike
I haven't commented much, and I am aware that the code isn't super clean. If anybody sees any errors, please let me know. In my current use case it is working without a problem :)
I'm afraid there is no easy way to do this in scipy, to my knowledge. You can, as I'm fairly sure you are aware, turn off the bounds errors and fill all function values beyond the range with a constant, but that doesn't really help. See this question on the mailing list for some more ideas. Maybe you could use some kind of piecewise function, but that seems like a major pain.
The code below gives you a simple extrapolation module. k is the x value at which the data set y is to be extrapolated, based on the data set x. The numpy module is required.
def extrapol(k, x, y):
    # least-squares fit of a straight line through all (x, y) points,
    # evaluated at k
    xm = np.mean(x)
    ym = np.mean(y)
    sumnr = 0
    sumdr = 0
    length = len(x)
    for i in range(0, length):
        sumnr = sumnr + ((x[i] - xm) * (y[i] - ym))
        sumdr = sumdr + ((x[i] - xm) * (x[i] - xm))
    m = sumnr / sumdr
    c = ym - (m * xm)
    return (m * k) + c
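For example, with the arrays from the question (note this fits a line through all the points, so it behaves differently from edge-gradient extrapolation):
import numpy as np

x = np.arange(0, 10)
y = np.exp(-x / 3.0)
print(extrapol(11, x, y))  # the fitted line evaluated at x = 11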
Standard interpolate + linear extrapolate:
from scipy.interpolate import interp1d

def interpola(v, x, y):
    if v <= x[0]:
        return y[0] + (y[1] - y[0]) / (x[1] - x[0]) * (v - x[0])
    elif v >= x[-1]:
        return y[-2] + (y[-1] - y[-2]) / (x[-1] - x[-2]) * (v - x[-2])
    else:
        f = interp1d(x, y, kind='cubic')
        return f(v)