Compute divergence of vector field using python - python

Is there a function for computing the divergence of a vector field, like the one in MATLAB? I would expect it to exist in numpy/scipy, but I cannot find it using Google.
I need to calculate div[A * grad(F)], where
F = np.array([[1,2,3,4],[5,6,7,8]]) # (2D numpy ndarray)
A = np.array([[1,2,3,4],[1,2,3,4]]) # (2D numpy ndarray)
so grad(F) is a list of 2D ndarrays
I know I can calculate divergence like this but do not want to reinvent the wheel. (I would also expect something more optimized) Does anyone have suggestions?

Just a hint for everybody reading this:
The functions above do not compute the divergence of a vector field. They sum the derivatives of a scalar field A:
result = dA/dx + dA/dy
In contrast, the divergence of a vector field (three-dimensional example) is:
result = sum dAi/dxi = dAx/dx + dAy/dy + dAz/dz
Downvotes for all of them! They are simply mathematically wrong.
Cheers!

import numpy as np
def divergence(field):
    "return the divergence of a n-D field"
    return np.sum(np.gradient(field),axis=0)

Based on Juh_'s answer, but modified for the correct divergence of a vector field formula
def divergence(f):
    """
    Computes the divergence of the vector field f, corresponding to dFx/dx + dFy/dy + ...
    :param f: List of ndarrays, where every item of the list is one dimension of the vector field
    :return: Single ndarray of the same shape as each of the items in f, which corresponds to a scalar field
    """
    num_dims = len(f)
    return np.ufunc.reduce(np.add, [np.gradient(f[i], axis=i) for i in range(num_dims)])
Matlab's documentation uses this exact formula (scroll down to Divergence of a Vector Field)
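As a quick sanity check of the function above, here is a minimal sketch. It assumes a unit-spaced grid built with indexing='ij' (so that axis 0 is x and axis 1 is y) and uses the field F = (x, y), whose divergence is exactly 2. The name divergence_ij is only illustrative.

```python
import numpy as np

def divergence_ij(f):
    """Sum of d f[i] / d axis_i, assuming component i varies along array axis i."""
    return np.add.reduce([np.gradient(f[i], axis=i) for i in range(len(f))])

# Unit-spaced grid; indexing='ij' makes x vary along axis 0 and y along axis 1.
x, y = np.meshgrid(np.arange(5.0), np.arange(4.0), indexing='ij')
div = divergence_ij([x, y])   # F = (x, y), so dFx/dx + dFy/dy = 1 + 1
print(div)
```

Because the field is linear, the finite differences are exact here, including at the borders.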

The answer of @user2818943 is good, but it can be optimized a little:
from functools import reduce  # built-in in Python 2, must be imported in Python 3
def divergence(F):
    """ compute the divergence of n-D scalar field `F` """
    return reduce(np.add,np.gradient(F))
Timeit:
F = np.random.rand(100,100)
timeit reduce(np.add,np.gradient(F))
# 1000 loops, best of 3: 318 us per loop
timeit np.sum(np.gradient(F),axis=0)
# 100 loops, best of 3: 2.27 ms per loop
That is about 7 times faster:
np.sum implicitly constructs a 3D array from the list of gradient fields returned by np.gradient. Using reduce avoids this.
Now, in your question, what do you mean by div[A * grad(F)]?
1. About A * grad(F): A is a 2D array, and grad(F) is a list of 2D arrays. So I assume it means multiplying each gradient field by A.
2. About applying the divergence to the gradient field (scaled by A): this is unclear. By definition, div(F) = d(F)/dx + d(F)/dy + .... I guess this is just an error of formulation.
For 1, multiplying the summed elements Bi by the same factor A can be factorized:
Sum(A*Bi) = A*Sum(Bi)
Thus, you can get this weighted gradient simply with: A*divergence(F)
If A is instead a list of factors, one for each dimension, then the solution would be:
def weighted_divergence(W,F):
    """
    Return the divergence of n-D array `F` with gradient weighted by `W`
    `W` is a list of factors for each dimension of F: the gradient of `F` over
    the `i`th dimension is multiplied by `W[i]`. Each `W[i]` can be a scalar
    or an array with same (or broadcastable) shape as `F`.
    """
    wGrad = map(np.multiply, W, np.gradient(F))
    return reduce(np.add,wGrad)
result = weighted_divergence(A,F)
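If div[A * grad(F)] is instead read literally, as the divergence of the vector field A*grad(F), then A generally cannot be pulled out of the derivative when it varies in space. A minimal sketch of that reading, assuming unit grid spacing and that array axis i is the i-th coordinate direction (div_a_grad is a hypothetical helper name, not a NumPy function):

```python
import numpy as np

def div_a_grad(A, F):
    """Finite-difference estimate of div(A * grad(F)) on a unit-spaced grid."""
    grads = np.gradient(F)                 # one array per axis of F
    flux = [A * g for g in grads]          # components of the vector field A * grad(F)
    # Differentiate component i along axis i, then sum the results.
    return np.add.reduce([np.gradient(c, axis=i) for i, c in enumerate(flux)])

F = np.array([[1, 2, 3, 4], [5, 6, 7, 8]], dtype=float)
A = np.array([[1, 2, 3, 4], [1, 2, 3, 4]], dtype=float)
result = div_a_grad(A, F)
print(result)
```

For these arrays the result differs from A multiplied by the summed gradients of F, which is why the intended meaning of the expression matters.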

What Daniel modified is the right answer. Let me explain the self-defined function divergence in more detail:
np.gradient() is defined as: np.gradient(f) = df/dx, df/dy, df/dz, ...
but we need to define divergence as: divergence(f) = dfx/dx + dfy/dy + dfz/dz + ... = np.gradient(fx, axis=0) + np.gradient(fy, axis=1) + np.gradient(fz, axis=2) + ...
Let's test it and compare with the divergence example in MATLAB:
import numpy as np
import matplotlib.pyplot as plt
NY = 50
ymin = -2.
ymax = 2.
dy = (ymax -ymin )/(NY-1.)
NX = NY
xmin = -2.
xmax = 2.
dx = (xmax -xmin)/(NX-1.)
def divergence(f):
    num_dims = len(f)
    return np.ufunc.reduce(np.add, [np.gradient(f[i], axis=i) for i in range(num_dims)])
y = np.array([ ymin + float(i)*dy for i in range(NY)])
x = np.array([ xmin + float(i)*dx for i in range(NX)])
x, y = np.meshgrid( x, y, indexing = 'ij', sparse = False)
Fx = np.cos(x + 2*y)
Fy = np.sin(x - 2*y)
F = [Fx, Fy]
g = divergence(F)
plt.pcolormesh(x, y, g)
plt.colorbar()
plt.savefig( 'Div' + str(NY) +'.png', format = 'png')
plt.show()
---------- UPDATED VERSION: Include the differential Steps----------------
Thanks to the comment from @henry: np.gradient uses a default step of 1, so the results may not match. We can provide our own differential steps.
#https://stackoverflow.com/a/47905007/5845212
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import make_axes_locatable
NY = 50
ymin = -2.
ymax = 2.
dy = (ymax -ymin )/(NY-1.)
NX = NY
xmin = -2.
xmax = 2.
dx = (xmax -xmin)/(NX-1.)
def divergence(f,h):
    """
    div(F) = dFx/dx + dFy/dy + ...
    g = np.gradient(Fx,dx, axis=1)+ np.gradient(Fy,dy, axis=0) #2D
    g = np.gradient(Fx,dx, axis=2)+ np.gradient(Fy,dy, axis=1) +np.gradient(Fz,dz,axis=0) #3D
    """
    num_dims = len(f)
    return np.ufunc.reduce(np.add, [np.gradient(f[i], h[i], axis=i) for i in range(num_dims)])
y = np.array([ ymin + float(i)*dy for i in range(NY)])
x = np.array([ xmin + float(i)*dx for i in range(NX)])
x, y = np.meshgrid( x, y, indexing = 'ij', sparse = False)
Fx = np.cos(x + 2*y)
Fy = np.sin(x - 2*y)
F = [Fx, Fy]
h = [dx, dy]
print('plotting')
rows = 1
cols = 2
#plt.clf()
plt.figure(figsize=(cols*3.5,rows*3.5))
plt.minorticks_on()
#g = np.gradient(Fx,dx, axis=1)+np.gradient(Fy,dy, axis=0) # equivalent to our func
g = divergence(F,h)
ax = plt.subplot(rows,cols,1,aspect='equal',title='div numerical')
#im=plt.pcolormesh(x, y, g)
im = plt.pcolormesh(x, y, g, shading='nearest', cmap='coolwarm')
plt.quiver(x,y,Fx,Fy)
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
cbar = plt.colorbar(im, cax = cax,format='%.1f')
g = -np.sin(x+2*y) -2*np.cos(x-2*y)
ax = plt.subplot(rows,cols,2,aspect='equal',title='div analytical')
#im=plt.pcolormesh(x, y, g)
im = plt.pcolormesh(x, y, g, shading='nearest', cmap='coolwarm')
plt.quiver(x,y,Fx,Fy)
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
cbar = plt.colorbar(im, cax = cax,format='%.1f')
plt.tight_layout()
plt.savefig( 'divergence.png', format = 'png')
plt.show()

Based on @paul_chen's answer, with some additions for Matplotlib 3.3.0 (a shading parameter needs to be passed, and I guess the default colormap has changed):
import numpy as np
import matplotlib.pyplot as plt
NY = 20; ymin = -2.; ymax = 2.
dy = (ymax -ymin )/(NY-1.)
NX = NY
xmin = -2.; xmax = 2.
dx = (xmax -xmin)/(NX-1.)
def divergence(f):
    num_dims = len(f)
    return np.ufunc.reduce(np.add, [np.gradient(f[i], axis=i) for i in range(num_dims)])
y = np.array([ ymin + float(i)*dy for i in range(NY)])
x = np.array([ xmin + float(i)*dx for i in range(NX)])
x, y = np.meshgrid( x, y, indexing = 'ij', sparse = False)
Fx = np.cos(x + 2*y)
Fy = np.sin(x - 2*y)
F = [Fx, Fy]
g = divergence(F)
plt.pcolormesh(x, y, g, shading='nearest', cmap='coolwarm')
plt.colorbar()
plt.quiver(x,y,Fx,Fy)
plt.savefig( 'Div.png', format = 'png')

The divergence as a built-in function is included in matlab, but not numpy. This is the sort of thing that it may perhaps be worthwhile to contribute to pylab, an effort to create a viable open-source alternative to matlab.
http://wiki.scipy.org/PyLab
Edit: Now called http://www.scipy.org/stackspec.html

As far as I can tell, the answer is that there is no native divergence function in numpy. Therefore, the best method for calculating divergence is to sum the components of the gradient vector i.e. calculate the divergence.

I don't think the answer by @Daniel is correct, especially when the input is in the order [Fx, Fy, Fz, ...].
A simple test case
See the MATLAB code:
a = [1 2 3;1 2 3; 1 2 3];
b = [[7 8 9] ;[1 5 8] ;[2 4 7]];
divergence(a,b)
which gives the result:
ans =
-5.0000 -2.0000 0
-1.5000 -1.0000 0
2.0000 0 0
and Daniel's solution:
import numpy as np
def divergence(f):
    """
    Daniel's solution
    Computes the divergence of the vector field f, corresponding to dFx/dx + dFy/dy + ...
    :param f: List of ndarrays, where every item of the list is one dimension of the vector field
    :return: Single ndarray of the same shape as each of the items in f, which corresponds to a scalar field
    """
    num_dims = len(f)
    return np.ufunc.reduce(np.add, [np.gradient(f[i], axis=i) for i in range(num_dims)])
if __name__ == '__main__':
    a = np.array([[1, 2, 3]] * 3)
    b = np.array([[7, 8, 9], [1, 5, 8], [2, 4, 7]])
    div = divergence([a, b])
    print(div)
which gives:
[[1. 1. 1. ]
[4. 3.5 3. ]
[2. 2.5 3. ]]
Explanation
The mistake in Daniel's solution is that, in NumPy, the x axis is the last axis rather than the first. For a 2D array, np.gradient(x, axis=0) actually gives the gradient in the y direction.
My solution
Here is my solution, based on Daniel's answer.
def divergence(f):
    """
    Computes the divergence of the vector field f, corresponding to dFx/dx + dFy/dy + ...
    :param f: List of ndarrays, where every item of the list is one dimension of the vector field
    :return: Single ndarray of the same shape as each of the items in f, which corresponds to a scalar field
    """
    num_dims = len(f)
    return np.ufunc.reduce(np.add, [np.gradient(f[num_dims - i - 1], axis=i) for i in range(num_dims)])
which gives the same result as MATLAB divergence in my test case.
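To check this against the MATLAB output quoted above, here is a small self-contained sketch. It assumes the MATLAB-style layout, where x varies along the last axis (columns) and y along the first (rows); divergence_xy is an illustrative name.

```python
import numpy as np

def divergence_xy(f):
    """Divergence for components ordered [Fx, Fy, ...] on arrays whose
    last axis is x, second-to-last is y, and so on (MATLAB-style layout)."""
    n = len(f)
    return np.add.reduce([np.gradient(f[n - i - 1], axis=i) for i in range(n)])

a = np.array([[1, 2, 3]] * 3, dtype=float)
b = np.array([[7, 8, 9], [1, 5, 8], [2, 4, 7]], dtype=float)
div = divergence_xy([a, b])
print(div)
```

The same arrays passed in the same order to a version that differentiates f[i] along axis i give a different matrix, which is the discrepancy described above.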

Somehow the previous attempts to compute the divergence are wrong! Let me show you:
We have the following vector field F:
F(x) = cos(x+2y)
F(y) = sin(x-2y)
If we compute the divergence (using Mathematica):
Div[{Cos[x + 2*y], Sin[x - 2*y]}, {x, y}]
we get:
-2 Cos[x - 2 y] - Sin[x + 2 y]
which has a maximum value in the range of y [-2,2] and x [-2,2]:
N[Max[Table[-2 Cos[x - 2 y] - Sin[x + 2 y], {x, -2, 2 }, {y, -2, 2}]]] = 2.938
Using the divergence equation given here:
def divergence(f):
    num_dims = len(f)
    return np.ufunc.reduce(np.add, [np.gradient(f[i], axis=i) for i in range(num_dims)])
we get a maximum value of about 0.625
Correct divergence function: Compute divergence with python

Related

Why is my real-valued spherical harmonic representation inaccurate?

I wrote a simple class in Python to compute spherical harmonic basis functions and corresponding coefficients for a function defined on a sphere. See below.
import numpy as np
from scipy.special import sph_harm
import scipy.integrate as integrate
class SphHarmBasis():
    def __init__(self, n_coeffs=16):
        self._n_coeffs = n_coeffs
        self.basis = self.sph_harm_basis()
    ''' Compute real spherical harmonic basis function '''
    def get_sph_harm_function(self, l, m):
        def basis_function(theta, phi):
            Y = sph_harm(abs(m), l, phi, theta)
            if m < 0:
                Y = np.sqrt(2) * (-1)**m * Y.imag
            elif m > 0:
                Y = np.sqrt(2) * (-1)**m * Y.real
            return Y.real
        return basis_function
    ''' Get a specified number of basis functions '''
    def sph_harm_basis(self):
        basis_functions = []
        dimension = 0
        l, m = 0, 0
        while dimension < self._n_coeffs:
            while m <= l:
                basis_functions.append(self.get_sph_harm_function(l, m))
                m += 1
                dimension += 1
            l += 1
            m = -l
        return basis_functions
    ''' Compute spherical harmonic coefficients '''
    def sph_harm_coeff(self, Y, f):
        def integrand(phi, theta):
            return f(theta, phi) * Y(theta, phi) * np.sin(theta)
        return integrate.dblquad(integrand, 0., np.pi, lambda x : 0., lambda x : 2*np.pi)[0]
    ''' Get spherical harmonic coefficients for a function in a basis '''
    def sph_harm_transform(self, f, basis=None):
        if basis is None:
            basis = self.basis
        coeffs = []
        for Y in basis:
            coeffs.append(self.sph_harm_coeff(Y, f))
        return coeffs
    ''' Reconstruct a function from basis and corresponding coefficients'''
    def sph_harm_reconstruct(self, coeffs, basis=None):
        if basis is None:
            basis = self.basis
        return lambda theta, phi : np.dot(coeffs, [f(theta, phi) for f in basis])
And you can use it like this:
def my_sphere_function(theta, phi):
return np.sin(theta+phi)
my_basis = SphHarmBasis(n_coeffs=25)
# encode your function in the desired basis and record the coefficients
my_coeffs = my_basis.sph_harm_transform(my_sphere_function)
# reconstruct your function at a point
point = (np.pi, np.pi/8)
my_basis.sph_harm_reconstruct(my_coeffs)(*point), my_sphere_function(*point)
My problem is that it doesn't seem to be very accurate.
For example, running the following test code reports a mean absolute error of 0.4933463836715332.
import matplotlib.pyplot as plt
my_reconstr_function = my_basis.sph_harm_reconstruct(my_coeffs)
pts = np.linspace(0, 2*np.pi)
pts_2d = np.reshape(np.stack(np.meshgrid(pts,pts), axis=-1), (-1, 2))
actual = []
approx = []
pts_total = 0
for n, pt in enumerate(pts_2d):
    f_actual = my_sphere_function(*pt)
    f_approx = my_reconstr_function(*pt)
    actual.append(f_actual)
    approx.append(f_approx)
    pts_total += abs(f_approx - f_actual)
print(pts_total / len(pts_2d))
plt.subplot(121)
plt.imshow(np.reshape(np.repeat(actual, 3), (50, 50, 3)))
plt.subplot(122)
plt.imshow(np.reshape(np.repeat(approx, 3), (50, 50, 3)))
Actual vs. reconstructed images
On the left is an image representation of the actual function, and on the right is the reconstructed function.
Where is my error? Changing the number of coefficients doesn't seem to affect much.
My problem was that I was testing samples out of the domain of the spherical harmonic basis functions. I specified the correct integration bounds, as the polar angle must be in the interval [0, pi] while the azimuthal belongs to [0, 2pi]. But for my test example, pts_2d was in the interval [0, 2pi]^2. All is well!
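A sketch of test points that stay inside the domain, assuming the same convention as the code above (theta is the polar angle in [0, pi], phi is the azimuth in [0, 2*pi]); the variable names are illustrative:

```python
import numpy as np

theta = np.linspace(0, np.pi, 50)        # polar angle
phi = np.linspace(0, 2 * np.pi, 50)      # azimuthal angle

# Each row of pts_2d is a (theta, phi) pair, matching the f(theta, phi) signature.
tt, pp = np.meshgrid(theta, phi, indexing='ij')
pts_2d = np.stack([tt.ravel(), pp.ravel()], axis=-1)
print(pts_2d.shape)
```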
Reconstruction comparison after fixing the test code

Compute divergence with python

From this answer, the divergence of a numeric vector field can be computed as such:
def divergence(f):
    num_dims = len(f)
    return np.ufunc.reduce(np.add, [np.gradient(f[i], axis=i) for i in range(num_dims)])
However, I have noticed that the output seems to depend a lot on the grid resolution, so there seems to be something wrong!
If I look at an example:
We have the following vector field F:
F(x) = cos(x+2y)
F(y) = sin(x-2y)
If we compute the divergence (using Mathematica):
Div[{Cos[x + 2*y], Sin[x - 2*y]}, {x, y}]
we get:
-2 Cos[x - 2 y] - Sin[x + 2 y]
which has a maximum value in the range of y [-2,2] and x [-2,2]:
N[Max[Table[-2 Cos[x - 2 y] - Sin[x + 2 y], {x, -2, 2 }, {y, -2, 2}]]] = 2.938
Using the divergence equation given here, we get the following plot, for max value vs. resolution (NxN: number of values in x and y-direction). None of these are even close to 3.
Here is the code:
import numpy as np
import matplotlib.pyplot as plt
# Boundaries
ymin = -2.; ymax = 2.
xmin = -2.; xmax = 2.
# Number of points (NxN)
N = 20
# Divergence function
def divergence(f):
    num_dims = len(f)
    return np.ufunc.reduce(np.add, [np.gradient(f[i], axis=i) for i in range(num_dims)])
# Create Meshgrid
x = np.linspace(xmin,xmax, N)
y = np.linspace(ymin,ymax, N)
xx, yy = np.meshgrid(x, y)
# Define 2D Vector Field
Fx = np.cos(xx + 2*yy)
Fy = np.sin(xx - 2*yy)
F = np.array([Fx, Fy])
# Compute Divergence
g = divergence(F)
print("Max: ", np.max(g.flatten()))
plt.imshow(g)
plt.colorbar()
Edit:
To create the plot:
# %%
a = []
for N in range(20,100):
    # Number of points (NxN)
    # N = 20
    # Boundaries
    ymin = -2.; ymax = 2.
    xmin = -2.; xmax = 2.
    # Divergence function
    def divergence(f):
        num_dims = len(f)
        return np.ufunc.reduce(np.add, [np.gradient(f[i], axis=i) for i in range(num_dims)])
    # Create Meshgrid
    x = np.linspace(xmin,xmax, N)
    y = np.linspace(ymin,ymax, N)
    xx, yy = np.meshgrid(x, y)
    # Define 2D Vector Field
    Fx = np.cos(xx + 2*yy)
    Fy = np.sin(xx - 2*yy)
    F = np.array([Fx, Fy])
    # Compute Divergence
    g = divergence(F)
    print("Max: ", np.max(g.flatten()))
    a.append(np.max(g.flatten()))
plt.plot(a)
I realized what the issue was with the help of this answer. The default spacing assumed between two consecutive values in numpy.gradient() is 1. It needs to be changed if there is a different grid.
Hence the divergence function needs to be adapted as such:
Divergence function
def divergence(f,sp):
    """ Computes divergence of vector field
    f: array -> vector field components [Fx,Fy,Fz,...]
    sp: array -> spacing between points in respective directions [spx, spy,spz,...]
    """
    num_dims = len(f)
    return np.ufunc.reduce(np.add, [np.gradient(f[i], sp[i], axis=i) for i in range(num_dims)])
Example
a = []
for N in range(20,100):
    # Number of points (NxN)
    # N = 20
    # Boundaries
    ymin = -2.; ymax = 2.
    xmin = -2.; xmax = 2.
    # Divergence function
    def divergence(f,sp):
        num_dims = len(f)
        return np.ufunc.reduce(np.add, [np.gradient(f[i], sp[i], axis=i) for i in range(num_dims)])
    # Create Meshgrid
    x = np.linspace(xmin,xmax, N)
    y = np.linspace(ymin,ymax, N)
    xx, yy = np.meshgrid(x, y)
    # Define 2D Vector Field
    Fx = np.cos(xx + 2*yy)
    Fy = np.sin(xx - 2*yy)
    F = np.array([Fx, Fy])
    # Compute Divergence
    sp_x = np.diff(x)[0]
    sp_y = np.diff(y)[0]
    sp = [sp_x, sp_y]
    g = divergence(F, sp)
    print("Max: ", np.max(g.flatten()))
    a.append(np.max(g.flatten()))
plt.plot(a)
We can see that with increasing resolution, the maximum of the divergence really tends to 3.
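One caveat, offered as a sketch and not as a definitive correction: np.meshgrid defaults to indexing='xy', where x varies along axis 1, while the function differentiates f[i] along axis i. On a square grid with equal spacing the maximum can still come out close to 3, but the field values themselves differ from the analytical divergence. Assuming indexing='ij' instead, the numerical result can be compared with the analytical expression point by point (divergence_spaced is an illustrative name):

```python
import numpy as np

def divergence_spaced(f, sp):
    """Divergence of [Fx, Fy, ...] with grid spacing sp[i] along array axis i."""
    return np.add.reduce([np.gradient(f[i], sp[i], axis=i) for i in range(len(f))])

N = 200
x = np.linspace(-2.0, 2.0, N)
y = np.linspace(-2.0, 2.0, N)
xx, yy = np.meshgrid(x, y, indexing='ij')   # axis 0 is x, axis 1 is y

Fx = np.cos(xx + 2 * yy)
Fy = np.sin(xx - 2 * yy)
numeric = divergence_spaced([Fx, Fy], [x[1] - x[0], y[1] - y[0]])
exact = -np.sin(xx + 2 * yy) - 2 * np.cos(xx - 2 * yy)

# Compare away from the border, where np.gradient uses one-sided differences.
err = np.max(np.abs(numeric - exact)[1:-1, 1:-1])
print("max interior error:", err)
```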

How to perform cubic spline interpolation in python?

I have two lists to describe the function y(x):
x = [0,1,2,3,4,5]
y = [12,14,22,39,58,77]
I would like to perform cubic spline interpolation so that given some value u in the domain of x, e.g.
u = 1.25
I can find y(u).
I found this in SciPy but I am not sure how to use it.
Short answer:
from scipy import interpolate
def f(x):
    x_points = [ 0, 1, 2, 3, 4, 5]
    y_points = [12,14,22,39,58,77]
    tck = interpolate.splrep(x_points, y_points)
    return interpolate.splev(x, tck)
print(f(1.25))
Long answer:
scipy separates the steps involved in spline interpolation into two operations, most likely for computational efficiency.
1. The coefficients describing the spline curve are computed using splrep(). splrep returns a tuple (t, c, k) containing the knots, the B-spline coefficients, and the degree of the spline.
2. These coefficients are passed into splev() to actually evaluate the spline at the desired point x (in this example 1.25). x can also be an array. Calling f([1.0, 1.25, 1.5]) returns the interpolated points at 1, 1.25, and 1.5, respectively.
This approach is admittedly inconvenient for single evaluations, but since the most common use case is to start with a handful of function evaluation points, then to repeatedly use the spline to find interpolated values, it is usually quite useful in practice.
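The two-step workflow described above can be sketched as follows: fit once with splrep, then reuse the returned tck for as many evaluations as needed. The data are the ones from the question; the variable names single and several are illustrative.

```python
import numpy as np
from scipy import interpolate

x_points = [0, 1, 2, 3, 4, 5]
y_points = [12, 14, 22, 39, 58, 77]

# Step 1: compute the spline representation once (knots, coefficients, degree).
tck = interpolate.splrep(x_points, y_points)

# Step 2: evaluate it repeatedly, for scalars or arrays, without refitting.
single = interpolate.splev(1.25, tck)
several = interpolate.splev([1.0, 1.25, 1.5], tck)
print(single, several)
```

Unlike the wrapper f(x) in the short answer, this does not recompute the coefficients on every call.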
In case scipy is not installed:
import numpy as np
from math import sqrt
def cubic_interp1d(x0, x, y):
"""
Interpolate a 1-D function using cubic splines.
x0 : a float or an 1d-array
x : (N,) array_like
A 1-D array of real/complex values.
y : (N,) array_like
A 1-D array of real values. The length of y along the
interpolation axis must be equal to the length of x.
Implement a trick to generate at first step the cholesky matrice L of
the tridiagonal matrice A (thus L is a bidiagonal matrice that
can be solved in two distinct loops).
additional ref: www.math.uh.edu/~jingqiu/math4364/spline.pdf
"""
x = np.asarray(x, dtype=float)  # np.asfarray was removed in NumPy 2.0
y = np.asarray(y, dtype=float)
# remove non finite values
# indexes = np.isfinite(x)
# x = x[indexes]
# y = y[indexes]
# check if sorted
if np.any(np.diff(x) < 0):
indexes = np.argsort(x)
x = x[indexes]
y = y[indexes]
size = len(x)
xdiff = np.diff(x)
ydiff = np.diff(y)
# allocate buffer matrices
Li = np.empty(size)
Li_1 = np.empty(size-1)
z = np.empty(size)
# fill diagonals Li and Li-1 and solve [L][y] = [B]
Li[0] = sqrt(2*xdiff[0])
Li_1[0] = 0.0
B0 = 0.0 # natural boundary
z[0] = B0 / Li[0]
for i in range(1, size-1, 1):
Li_1[i] = xdiff[i-1] / Li[i-1]
Li[i] = sqrt(2*(xdiff[i-1]+xdiff[i]) - Li_1[i-1] * Li_1[i-1])
Bi = 6*(ydiff[i]/xdiff[i] - ydiff[i-1]/xdiff[i-1])
z[i] = (Bi - Li_1[i-1]*z[i-1])/Li[i]
i = size - 1
Li_1[i-1] = xdiff[-1] / Li[i-1]
Li[i] = sqrt(2*xdiff[-1] - Li_1[i-1] * Li_1[i-1])
Bi = 0.0 # natural boundary
z[i] = (Bi - Li_1[i-1]*z[i-1])/Li[i]
# solve [L.T][x] = [y]
i = size-1
z[i] = z[i] / Li[i]
for i in range(size-2, -1, -1):
z[i] = (z[i] - Li_1[i-1]*z[i+1])/Li[i]
# find index
index = x.searchsorted(x0)
np.clip(index, 1, size-1, index)
xi1, xi0 = x[index], x[index-1]
yi1, yi0 = y[index], y[index-1]
zi1, zi0 = z[index], z[index-1]
hi1 = xi1 - xi0
# calculate cubic
f0 = zi0/(6*hi1)*(xi1-x0)**3 + \
zi1/(6*hi1)*(x0-xi0)**3 + \
(yi1/hi1 - zi1*hi1/6)*(x0-xi0) + \
(yi0/hi1 - zi0*hi1/6)*(xi1-x0)
return f0
if __name__ == '__main__':
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 11)
y = np.sin(x)
plt.scatter(x, y)
x_new = np.linspace(0, 10, 201)
plt.plot(x_new, cubic_interp1d(x_new, x, y))
plt.show()
If you have scipy version >= 0.18.0 installed you can use CubicSpline function from scipy.interpolate for cubic spline interpolation.
You can check scipy version by running following commands in python:
#!/usr/bin/env python3
import scipy
scipy.version.version
If your scipy version is >= 0.18.0 you can run following example code for cubic spline interpolation:
#!/usr/bin/env python3
import numpy as np
from scipy.interpolate import CubicSpline
# calculate 5 natural cubic spline polynomials for 6 points
# (x,y) = (0,12) (1,14) (2,22) (3,39) (4,58) (5,77)
x = np.array([0, 1, 2, 3, 4, 5])
y = np.array([12,14,22,39,58,77])
# calculate natural cubic spline polynomials
cs = CubicSpline(x,y,bc_type='natural')
# show values of interpolation function at x=1.25
print('S(1.25) = ', cs(1.25))
## Additional - find polynomial coefficients for different x regions
# if you want to print polynomial coefficients in form
# S0(0<=x<=1) = a0 + b0(x-x0) + c0(x-x0)^2 + d0(x-x0)^3
# S1(1< x<=2) = a1 + b1(x-x1) + c1(x-x1)^2 + d1(x-x1)^3
# ...
# S4(4< x<=5) = a4 + b4(x-x4) + c4(x-x4)^2 + d4(x-x4)^3
# x0 = 0; x1 = 1; x4 = 4; (start of x region interval)
# show values of a0, b0, c0, d0, a1, b1, c1, d1 ...
cs.c
# Polynomial coefficients for 0 <= x <= 1
a0 = cs.c.item(3,0)
b0 = cs.c.item(2,0)
c0 = cs.c.item(1,0)
d0 = cs.c.item(0,0)
# Polynomial coefficients for 1 < x <= 2
a1 = cs.c.item(3,1)
b1 = cs.c.item(2,1)
c1 = cs.c.item(1,1)
d1 = cs.c.item(0,1)
# ...
# Polynomial coefficients for 4 < x <= 5
a4 = cs.c.item(3,4)
b4 = cs.c.item(2,4)
c4 = cs.c.item(1,4)
d4 = cs.c.item(0,4)
# Print polynomial equations for different x regions
print('S0(0<=x<=1) = ', a0, ' + ', b0, '(x-0) + ', c0, '(x-0)^2 + ', d0, '(x-0)^3')
print('S1(1< x<=2) = ', a1, ' + ', b1, '(x-1) + ', c1, '(x-1)^2 + ', d1, '(x-1)^3')
print('...')
print('S4(4< x<=5) = ', a4, ' + ', b4, '(x-4) + ', c4, '(x-4)^2 + ', d4, '(x-4)^3')
# So we can calculate S(1.25) by using equation S1(1< x<=2)
print('S(1.25) = ', a1 + b1*0.25 + c1*(0.25**2) + d1*(0.25**3))
# Cubic spline interpolation calculus example
# https://www.youtube.com/watch?v=gT7F3TWihvk
Just putting this here if you want a dependency-free solution.
Code taken from an answer above: https://stackoverflow.com/a/48085583/36061
from math import sqrt  # sqrt is used below; math is part of the standard library
def my_cubic_interp1d(x0, x, y):
"""
Interpolate a 1-D function using cubic splines.
x0 : a 1d-array of floats to interpolate at
x : a 1-D array of floats sorted in increasing order
y : A 1-D array of floats. The length of y along the
interpolation axis must be equal to the length of x.
Implement a trick to generate at first step the cholesky matrice L of
the tridiagonal matrice A (thus L is a bidiagonal matrice that
can be solved in two distinct loops).
additional ref: www.math.uh.edu/~jingqiu/math4364/spline.pdf
# original function code at: https://stackoverflow.com/a/48085583/36061
This function is licenced under: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
https://creativecommons.org/licenses/by-sa/3.0/
Original Author raphael valentin
Date 3 Jan 2018
Modifications made to remove numpy dependencies:
-all sub-functions by MR
This function, and all sub-functions, are licenced under: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
Mod author: Matthew Rowles
Date 3 May 2021
"""
def diff(lst):
"""
numpy.diff with default settings
"""
size = len(lst)-1
r = [0]*size
for i in range(size):
r[i] = lst[i+1] - lst[i]
return r
def list_searchsorted(listToInsert, insertInto):
"""
numpy.searchsorted with default settings
"""
def float_searchsorted(floatToInsert, insertInto):
for i in range(len(insertInto)):
if floatToInsert <= insertInto[i]:
return i
return len(insertInto)
return [float_searchsorted(i, insertInto) for i in listToInsert]
def clip(lst, min_val, max_val, inPlace = False):
"""
numpy.clip
"""
if not inPlace:
lst = lst[:]
for i in range(len(lst)):
if lst[i] < min_val:
lst[i] = min_val
elif lst[i] > max_val:
lst[i] = max_val
return lst
def subtract(a,b):
"""
returns a - b
"""
return a - b
size = len(x)
xdiff = diff(x)
ydiff = diff(y)
# allocate buffer matrices
Li = [0]*size
Li_1 = [0]*(size-1)
z = [0]*(size)
# fill diagonals Li and Li-1 and solve [L][y] = [B]
Li[0] = sqrt(2*xdiff[0])
Li_1[0] = 0.0
B0 = 0.0 # natural boundary
z[0] = B0 / Li[0]
for i in range(1, size-1, 1):
Li_1[i] = xdiff[i-1] / Li[i-1]
Li[i] = sqrt(2*(xdiff[i-1]+xdiff[i]) - Li_1[i-1] * Li_1[i-1])
Bi = 6*(ydiff[i]/xdiff[i] - ydiff[i-1]/xdiff[i-1])
z[i] = (Bi - Li_1[i-1]*z[i-1])/Li[i]
i = size - 1
Li_1[i-1] = xdiff[-1] / Li[i-1]
Li[i] = sqrt(2*xdiff[-1] - Li_1[i-1] * Li_1[i-1])
Bi = 0.0 # natural boundary
z[i] = (Bi - Li_1[i-1]*z[i-1])/Li[i]
# solve [L.T][x] = [y]
i = size-1
z[i] = z[i] / Li[i]
for i in range(size-2, -1, -1):
z[i] = (z[i] - Li_1[i-1]*z[i+1])/Li[i]
# find index
index = list_searchsorted(x0,x)
index = clip(index, 1, size-1)
xi1 = [x[num] for num in index]
xi0 = [x[num-1] for num in index]
yi1 = [y[num] for num in index]
yi0 = [y[num-1] for num in index]
zi1 = [z[num] for num in index]
zi0 = [z[num-1] for num in index]
hi1 = list( map(subtract, xi1, xi0) )
# calculate cubic - all element-wise multiplication
f0 = [0]*len(hi1)
for j in range(len(f0)):
f0[j] = zi0[j]/(6*hi1[j])*(xi1[j]-x0[j])**3 + \
zi1[j]/(6*hi1[j])*(x0[j]-xi0[j])**3 + \
(yi1[j]/hi1[j] - zi1[j]*hi1[j]/6)*(x0[j]-xi0[j]) + \
(yi0[j]/hi1[j] - zi0[j]*hi1[j]/6)*(xi1[j]-x0[j])
return f0
Minimal python3 code:
from scipy import interpolate
if __name__ == '__main__':
x = [ 0, 1, 2, 3, 4, 5]
y = [12,14,22,39,58,77]
# tck : tuple (t,c,k) a tuple containing the vector of knots,
# the B-spline coefficients, and the degree of the spline.
tck = interpolate.splrep(x, y)
print(interpolate.splev(1.25, tck)) # Prints 15.203125000000002
print(interpolate.splev(...other_value_here..., tck))
Based on comment of cwhy and answer by youngmit
In my previous post, I wrote code based on a Cholesky decomposition to solve the matrix generated by the cubic spline algorithm. Unfortunately, due to the square root function, it may perform badly on some sets of points (typically a non-uniform set of points).
In the same spirit as before, there is another idea: use the Thomas algorithm (TDMA) (see https://en.wikipedia.org/wiki/Tridiagonal_matrix_algorithm) to partially solve the tridiagonal matrix during its definition loop. TDMA requires the matrix to be diagonally dominant. In our case this holds, since |bi| > |ai| + |ci| with ai = h[i], bi = 2*(h[i]+h[i+1]), ci = h[i+1], and h[i] strictly positive (see https://www.cfd-online.com/Wiki/Tridiagonal_matrix_algorithm_-_TDMA_(Thomas_algorithm)).
I refer again to the document from jingqiu (see my previous post; unfortunately the link is broken, but it can still be found in web caches).
An optimized version of the TDMA solver can be described as follows:
def TDMAsolver(a,b,c,d):
    """ This function is licenced under: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
    https://creativecommons.org/licenses/by-sa/3.0/
    Author raphael valentin
    Date 25 Mar 2022
    ref. https://www.cfd-online.com/Wiki/Tridiagonal_matrix_algorithm_-_TDMA_(Thomas_algorithm)
    """
    n = len(d)
    w = np.empty(n-1,float)
    g = np.empty(n, float)
    w[0] = c[0]/b[0]
    g[0] = d[0]/b[0]
    for i in range(1, n-1):
        m = b[i] - a[i-1]*w[i-1]
        w[i] = c[i] / m
        g[i] = (d[i] - a[i-1]*g[i-1]) / m
    g[n-1] = (d[n-1] - a[n-2]*g[n-2]) / (b[n-1] - a[n-2]*w[n-2])
    for i in range(n-2, -1, -1):
        g[i] = g[i] - w[i]*g[i+1]
    return g
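One way to check a tridiagonal solver like this is to compare it with a dense solve on a small diagonally dominant system. The sketch below uses its own compact Thomas implementation (thomas_solve is an illustrative name, not the function above) and assumes a and c hold the sub- and super-diagonals with n-1 entries each:

```python
import numpy as np

def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system: a = sub-diagonal, b = diagonal, c = super-diagonal."""
    n = len(d)
    w = np.empty(n - 1)
    g = np.empty(n)
    w[0] = c[0] / b[0]
    g[0] = d[0] / b[0]
    # Forward elimination.
    for i in range(1, n):
        m = b[i] - a[i - 1] * w[i - 1]
        if i < n - 1:
            w[i] = c[i] / m
        g[i] = (d[i] - a[i - 1] * g[i - 1]) / m
    # Back substitution.
    for i in range(n - 2, -1, -1):
        g[i] -= w[i] * g[i + 1]
    return g

n = 5
a = np.ones(n - 1)           # sub-diagonal
b = 4.0 * np.ones(n)         # main diagonal (dominant)
c = np.ones(n - 1)           # super-diagonal
d = np.arange(1.0, n + 1)    # right-hand side

dense = np.diag(b) + np.diag(a, k=-1) + np.diag(c, k=1)
x_thomas = thomas_solve(a, b, c, d)
x_dense = np.linalg.solve(dense, d)
print(np.max(np.abs(x_thomas - x_dense)))
```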
Once the individual coefficients ai, bi, ci, di are available, it becomes easy to fold the definition of the natural cubic spline interpolator into these two loops.
def cubic_interpolate(x0, x, y):
    """ Natural cubic spline interpolate function
    This function is licenced under: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
    https://creativecommons.org/licenses/by-sa/3.0/
    Author raphael valentin
    Date 25 Mar 2022
    """
    xdiff = np.diff(x)
    dydx = np.diff(y)
    dydx /= xdiff
    n = size = len(x)
    w = np.empty(n-1, float)
    z = np.empty(n, float)
    w[0] = 0.
    z[0] = 0.
    for i in range(1, n-1):
        m = xdiff[i-1] * (2 - w[i-1]) + 2 * xdiff[i]
        w[i] = xdiff[i] / m
        z[i] = (6*(dydx[i] - dydx[i-1]) - xdiff[i-1]*z[i-1]) / m
    z[-1] = 0.
    for i in range(n-2, -1, -1):
        z[i] = z[i] - w[i]*z[i+1]
    # find index (it requires x0 is already sorted)
    index = x.searchsorted(x0)
    np.clip(index, 1, size-1, index)
    xi1, xi0 = x[index], x[index-1]
    yi1, yi0 = y[index], y[index-1]
    zi1, zi0 = z[index], z[index-1]
    hi1 = xi1 - xi0
    # calculate cubic
    f0 = zi0/(6*hi1)*(xi1-x0)**3 + \
        zi1/(6*hi1)*(x0-xi0)**3 + \
        (yi1/hi1 - zi1*hi1/6)*(x0-xi0) + \
        (yi0/hi1 - zi0*hi1/6)*(xi1-x0)
    return f0
This function gives the same results as the CubicSpline class from scipy.interpolate, as we can see in the next plot.
The first and second analytical derivatives can be implemented as well:
f1p = -zi0/(2*hi1)*(xi1-x0)**2 + zi1/(2*hi1)*(x0-xi0)**2 + (yi1/hi1 - zi1*hi1/6) - (yi0/hi1 - zi0*hi1/6)
f2p = zi0/hi1 * (xi1-x0) + zi1/hi1 * (x0-xi0)
It is then easy to verify that f2p[0] and f2p[-1] are equal to 0, i.e. that the interpolator yields natural splines.
An additional reference concerning natural spline:
https://faculty.ksu.edu.sa/sites/default/files/numerical_analysis_9th.pdf#page=167
An example of use:
import matplotlib.pyplot as plt
import numpy as np
x = [-8,-4.19,-3.54,-3.31,-2.56,-2.31,-1.66,-0.96,-0.22,0.62,1.21,3]
y = [-0.01,0.01,0.03,0.04,0.07,0.09,0.16,0.28,0.45,0.65,0.77,1]
x = np.asarray(x, dtype=float)  # np.asfarray was removed in NumPy 2.0
y = np.asarray(y, dtype=float)
plt.scatter(x, y)
x_new= np.linspace(min(x), max(x), 10000)
y_new = cubic_interpolate(x_new, x, y)
plt.plot(x_new, y_new)
from scipy.interpolate import CubicSpline
f = CubicSpline(x, y, bc_type='natural')
plt.plot(x_new, f(x_new), label='ref')
plt.legend()
plt.show()
In conclusion, this updated algorithm should perform the interpolation with better stability and faster than the previous code (O(n)). Combined with numba or cython, it should be very fast. Finally, it is totally independent of SciPy.
Note that, as with most algorithms, it is sometimes useful to normalize the data (e.g. against very large or very small values) to get the best results. Also, this code does not check for NaN values or unordered data.
In any case, this update was a good learning experience for me, and I hope it can help someone. Let me know if you find something strange.
If you want to get the value at a specific point:
from scipy.interpolate import CubicSpline
import numpy as np
x = [-5,-4.19,-3.54,-3.31,-2.56,-2.31,-1.66,-0.96,-0.22,0.62,1.21,3]
y = [-0.01,0.01,0.03,0.04,0.07,0.09,0.16,0.28,0.45,0.65,0.77,1]
value = 2
#ascending order
if np.any(np.diff(x) < 0):
    indexes = np.argsort(x).astype(int)
    x = np.array(x)[indexes]
    y = np.array(y)[indexes]
f = CubicSpline(x, y, bc_type='natural')
specificVal = f(value).item(0) #f(value) is numpy.ndarray!!
print(specificVal)
If you want to plot the interpolated function.
The third argument of np.linspace sets the number of sample points; a larger value gives a smoother curve.
from scipy.interpolate import CubicSpline
import numpy as np
import matplotlib.pyplot as plt
x = [-5,-4.19,-3.54,-3.31,-2.56,-2.31,-1.66,-0.96,-0.22,0.62,1.21,3]
y = [-0.01,0.01,0.03,0.04,0.07,0.09,0.16,0.28,0.45,0.65,0.77,1]
#ascending order
if np.any(np.diff(x) < 0):
    indexes = np.argsort(x).astype(int)
    x = np.array(x)[indexes]
    y = np.array(y)[indexes]
f = CubicSpline(x, y, bc_type='natural')
x_new = np.linspace(min(x), max(x), 100)
y_new = f(x_new)
plt.plot(x_new, y_new)
plt.scatter(x, y)
plt.title('Cubic Spline Interpolation')
plt.show()
Yes, as others have already noted, it should be as simple as
>>> from scipy.interpolate import CubicSpline
>>> CubicSpline(x,y)(u)
array(15.203125)
(you can, for example, convert it to float to get the value from a 0d NumPy array)
What has not been described yet is boundary conditions: the default ‘not-a-knot’ boundary conditions work best if you have zero knowledge about the data you’re going to interpolate.
If you see the following ‘features’ on the plot, you can fine-tune the boundary conditions to get a better result:
the first derivative vanishes at boundaries => bc_type='clamped'
the second derivative vanishes at boundaries => bc_type='natural'
the function is periodic => bc_type='periodic'
See my article for more details and an interactive demo.
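To make the effect of the three boundary conditions concrete, here is a small sketch that builds one spline per `bc_type` and evaluates the relevant derivative at both ends. The sample data (a coarsely sampled sine) is only an illustrative assumption.

```python
import numpy as np
from scipy.interpolate import CubicSpline

x = np.linspace(0.0, 2.0 * np.pi, 9)
y = np.sin(x)
y[-1] = y[0]  # 'periodic' requires the first and last values to match

clamped = CubicSpline(x, y, bc_type='clamped')
natural = CubicSpline(x, y, bc_type='natural')
periodic = CubicSpline(x, y, bc_type='periodic')

ends = x[[0, -1]]
print(clamped(ends, 1))   # first derivative at both ends (forced to zero)
print(natural(ends, 2))   # second derivative at both ends (forced to zero)
print(periodic(ends, 1))  # first derivative agrees at both ends
```

The second argument of the call is the derivative order, so `clamped(ends, 1)` evaluates the first derivative of the spline.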

Best way to interpolate a numpy.ndarray along an axis

I have 4-dimensional data, say for the temperature, in a numpy.ndarray.
The shape of the array is (ntime, nheight_in, nlat, nlon).
I have corresponding 1D arrays for each of the dimensions that tell me which time, height, latitude, and longitude a certain value corresponds to. For this example I need height_in, which gives the height in metres.
Now I need to bring it onto a different height dimension, height_out, with a different length.
The following seems to do what I want:
ntime, nheight_in, nlat, nlon = t_in.shape
nheight_out = len(height_out)
t_out = np.empty((ntime, nheight_out, nlat, nlon))
for time in range(ntime):
    for lat in range(nlat):
        for lon in range(nlon):
            # interpolate one vertical column at a time
            t_out[time, :, lat, lon] = np.interp(
                height_out, height_in, t_in[time, :, lat, lon]
            )
But with 3 nested loops, and lots of switching between python and numpy, I don't think this is the best way to do it.
Any suggestions on how to improve this? Thanks
scipy's interp1d can help:
import numpy as np
from scipy.interpolate import interp1d
ntime, nheight_in, nlat, nlon = (10, 20, 30, 40)
heights = np.linspace(0, 1, nheight_in)
t_in = np.random.normal(size=(ntime, nheight_in, nlat, nlon))
f_out = interp1d(heights, t_in, axis=1)
nheight_out = 50
new_heights = np.linspace(0, 1, nheight_out)
t_out = f_out(new_heights)
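As a sanity check, the sketch below compares the single interp1d call against the explicit triple loop over np.interp from the question. The array sizes and the names `t_vec` and `t_loop` are illustrative assumptions; the check relies on both routines doing linear interpolation by default.

```python
import numpy as np
from scipy.interpolate import interp1d

rng = np.random.default_rng(0)
ntime, nheight_in, nlat, nlon = 2, 5, 3, 4
height_in = np.linspace(0.0, 1.0, nheight_in)
height_out = np.linspace(0.0, 1.0, 8)
t_in = rng.normal(size=(ntime, nheight_in, nlat, nlon))

# Vectorized: one call interpolates every (time, lat, lon) column along axis 1
t_vec = interp1d(height_in, t_in, axis=1)(height_out)

# Reference: explicit loops, one np.interp call per column
t_loop = np.empty((ntime, len(height_out), nlat, nlon))
for it in range(ntime):
    for ilat in range(nlat):
        for ilon in range(nlon):
            t_loop[it, :, ilat, ilon] = np.interp(
                height_out, height_in, t_in[it, :, ilat, ilon]
            )

print(t_vec.shape, np.allclose(t_vec, t_loop))
```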
I was looking for a similar function that works with irregularly spaced coordinates, and ended up writing my own function. As far as I see, the interpolation is handled nicely and the performance in terms of memory and speed is also quite good. I thought I'd share it here in case anyone else comes across this question looking for a similar function:
import numpy as np
import warnings
def interp_along_axis(y, x, newx, axis, inverse=False, method='linear'):
    """ Interpolate vertical profiles, e.g. of atmospheric variables
    using vectorized numpy operations
    This function assumes that the x-coordinate increases monotonically
    ps:
    * Updated to work with irregularly spaced x-coordinate.
    * Updated to work with irregularly spaced newx-coordinate
    * Updated to easily inverse the direction of the x-coordinate
    * Updated to fill with nans outside extrapolation range
    * Updated to include a linear interpolation method as well
      (it was initially written for a cubic function)
    Peter Kalverla
    March 2018
    --------------------
    More info:
    Algorithm from: http://www.paulinternet.nl/?page=bicubic
    It approximates y = f(x) = ax^3 + bx^2 + cx + d
    where y may be an ndarray input vector
    Returns f(newx)
    The algorithm uses the derivative f'(x) = 3ax^2 + 2bx + c
    and uses the fact that:
    f(0) = d
    f(1) = a + b + c + d
    f'(0) = c
    f'(1) = 3a + 2b + c
    Rewriting this yields expressions for a, b, c, d:
    a = 2f(0) - 2f(1) + f'(0) + f'(1)
    b = -3f(0) + 3f(1) - 2f'(0) - f'(1)
    c = f'(0)
    d = f(0)
    These can be evaluated at two neighbouring points in x and
    as such constitute the piecewise cubic interpolator.
    """
    # View of x and y with axis as first dimension
    if inverse:
        _x = np.moveaxis(x, axis, 0)[::-1, ...]
        _y = np.moveaxis(y, axis, 0)[::-1, ...]
        _newx = np.moveaxis(newx, axis, 0)[::-1, ...]
    else:
        _y = np.moveaxis(y, axis, 0)
        _x = np.moveaxis(x, axis, 0)
        _newx = np.moveaxis(newx, axis, 0)
    # Sanity checks
    if np.any(_newx[0] < _x[0]) or np.any(_newx[-1] > _x[-1]):
        # raise ValueError('This function cannot extrapolate')
        warnings.warn("Some values are outside the interpolation range. "
                      "These will be filled with NaN")
    if np.any(np.diff(_x, axis=0) < 0):
        raise ValueError('x should increase monotonically')
    if np.any(np.diff(_newx, axis=0) < 0):
        raise ValueError('newx should increase monotonically')
    # Cubic interpolation needs the gradient of y in addition to its values
    if method == 'cubic':
        # For now, simply use a numpy function to get the derivatives
        # This produces the largest memory overhead of the function and
        # could alternatively be done in passing.
        ydx = np.gradient(_y, axis=0, edge_order=2)
    # This will later be concatenated with a dynamic '0th' index
    ind = [i for i in np.indices(_y.shape[1:])]
    # Allocate the output array
    original_dims = _y.shape
    newdims = list(original_dims)
    newdims[0] = len(_newx)
    newy = np.zeros(newdims)
    # set initial bounds
    i_lower = np.zeros(_x.shape[1:], dtype=int)
    i_upper = np.ones(_x.shape[1:], dtype=int)
    x_lower = _x[0, ...]
    x_upper = _x[1, ...]
    for i, xi in enumerate(_newx):
        # Start at the 'bottom' of the array and work upwards
        # This only works if x and newx increase monotonically
        # Update bounds where necessary and possible
        needs_update = (xi > x_upper) & (i_upper+1 < len(_x))
        while np.any(needs_update):
            i_lower = np.where(needs_update, i_lower+1, i_lower)
            i_upper = i_lower + 1
            # Recent NumPy versions require a tuple (not a list) of index arrays
            x_lower = _x[tuple([i_lower]+ind)]
            x_upper = _x[tuple([i_upper]+ind)]
            # Check again
            needs_update = (xi > x_upper) & (i_upper+1 < len(_x))
        # Express the position of xi relative to its neighbours
        xj = (xi-x_lower)/(x_upper - x_lower)
        # Determine where there is a valid interpolation range
        within_bounds = (_x[0, ...] < xi) & (xi < _x[-1, ...])
        if method == 'linear':
            f0, f1 = _y[tuple([i_lower]+ind)], _y[tuple([i_upper]+ind)]
            a = f1 - f0
            b = f0
            newy[i, ...] = np.where(within_bounds, a*xj+b, np.nan)
        elif method == 'cubic':
            f0, f1 = _y[tuple([i_lower]+ind)], _y[tuple([i_upper]+ind)]
            df0, df1 = ydx[tuple([i_lower]+ind)], ydx[tuple([i_upper]+ind)]
            a = 2*f0 - 2*f1 + df0 + df1
            b = -3*f0 + 3*f1 - 2*df0 - df1
            c = df0
            d = f0
            newy[i, ...] = np.where(within_bounds, a*xj**3 + b*xj**2 + c*xj + d, np.nan)
        else:
            raise ValueError("invalid interpolation method "
                             "(choose 'linear' or 'cubic')")
    if inverse:
        newy = newy[::-1, ...]
    return np.moveaxis(newy, 0, axis)
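The coefficient formulas quoted in the docstring can be checked in isolation. The sketch below works on the unit interval with arbitrary example values; the helper name `hermite_coeffs` is hypothetical and is only used for this check.

```python
import numpy as np

def hermite_coeffs(f0, f1, df0, df1):
    """Coefficients of a*t^3 + b*t^2 + c*t + d on the unit interval."""
    a = 2*f0 - 2*f1 + df0 + df1
    b = -3*f0 + 3*f1 - 2*df0 - df1
    c = df0
    d = f0
    return a, b, c, d

# arbitrary end-point values and end-point derivatives
f0, f1, df0, df1 = 1.0, 3.0, 0.5, -2.0
a, b, c, d = hermite_coeffs(f0, f1, df0, df1)

def f(t):
    return a*t**3 + b*t**2 + c*t + d

def df(t):
    return 3*a*t**2 + 2*b*t + c

# the cubic should reproduce the four prescribed values
print(f(0.0), f(1.0), df(0.0), df(1.0))
```

The printed values should match `f0`, `f1`, `df0` and `df1`, which is exactly the set of four conditions listed in the docstring.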
And this is a small example to test it:
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d as scipy1d
# toy coordinates and data
nx, ny, nz = 25, 30, 10
x = np.arange(nx)
y = np.arange(ny)
z = np.tile(np.arange(nz), (nx,ny,1)) + np.random.randn(nx, ny, nz)*.1
testdata = np.random.randn(nx,ny,nz) # x,y,z
# Desired z-coordinates (must be between bounds of z)
znew = np.tile(np.linspace(2,nz-2,50), (nx,ny,1)) + np.random.randn(nx, ny, 50)*0.01
# Inverse the coordinates for testing
z = z[..., ::-1]
znew = znew[..., ::-1]
# Now use own routine
ynew = interp_along_axis(testdata, z, znew, axis=2, inverse=True)
# Check some random profiles
for i in range(5):
    randx = np.random.randint(nx)
    randy = np.random.randint(ny)
    checkfunc = scipy1d(z[randx, randy], testdata[randx,randy], kind='cubic')
    checkdata = checkfunc(znew)
    fig, ax = plt.subplots()
    ax.plot(testdata[randx, randy], z[randx, randy], 'x', label='original data')
    ax.plot(checkdata[randx, randy], znew[randx, randy], label='scipy')
    ax.plot(ynew[randx, randy], znew[randx, randy], '--', label='Peter')
    ax.legend()
    plt.show()
Following the convention of numpy.interp, one can assign the left/right boundary values to points outside the range by adding these lines after within_bounds = ...
out_lbound = (xi <= _x[0,...])
out_rbound = (_x[-1,...] <= xi)
and
newy[i, out_lbound] = _y[0, out_lbound]
newy[i, out_rbound] = _y[-1, out_rbound]
after newy[i, ...] = ....
If I have understood the strategy used by @Peter9192 correctly, these changes follow the same approach. I have tested this a little, but some edge cases may not work properly.
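For reference, this is the numpy.interp behaviour the change is modelled on: by default, queries outside the data range take the first or last data value, and the `left`/`right` arguments override that. The sample arrays are illustrative assumptions.

```python
import numpy as np

xp = np.array([0.0, 1.0, 2.0])
fp = np.array([10.0, 20.0, 30.0])
x_query = np.array([-1.0, 0.5, 3.0])

# default: out-of-range points are clamped to fp[0] and fp[-1]
clamped = np.interp(x_query, xp, fp)
# explicit fill values: NaN outside the data range
nan_filled = np.interp(x_query, xp, fp, left=np.nan, right=np.nan)

print(clamped)
print(nan_filled)
```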

Specify the shift for numpy.correlate

I wonder whether it is possible to specify the shift, expressed by the variable k, for the cross-correlation of two 1D arrays. With numpy.correlate and mode='full' I get cross-correlation coefficients for every shift k over the whole length of the arrays (assuming both arrays are the same size). The example below shows what I mean:
import numpy as np
# Define signal 1.
signal_1 = np.array([1, 2 ,3])
# Define signal 2.
signal_2 = np.array([1, 2, 3])
# Other definitions.
Xi = signal_1
Yi = signal_2
N = np.size(Xi)
k = 3
Xs = np.average(Xi)
Ys = np.average(Yi)
# Cross-covariance coefficient function.
def crossCovariance(Xi, Yi, N, k, Xs, Ys, forCorrelation = False):
    autoCov = 0
    for i in np.arange(0, N-k):
        autoCov += ((Xi[i+k])-Xs)*(Yi[i]-Ys)
    if forCorrelation == True:
        return autoCov/N
    else:
        return (1/(N-1))*autoCov
# Expected value function.
def E(X, P):
    expectedValue = 0
    for i in np.arange(0, np.size(X)):
        expectedValue += X[i] * (P[i] / np.size(X))
    return expectedValue
# Cross-correlation coefficient function.
def crossCorrelation(Xi, Yi, k):
    # Calculate the covariance coefficient.
    cov = crossCovariance(Xi, Yi, N, k, Xs, Ys, forCorrelation = True)
    # Calculate standard deviations.
    EX = E(Xi, np.ones(np.size(Xi)))
    SDX = (E((Xi - EX) ** 2, np.ones(np.size(Xi)))) ** (1/2)
    EY = E(Yi, np.ones(np.size(Yi)))
    SDY = (E((Yi - EY) ** 2, np.ones(np.size(Yi)))) ** (1/2)
    # Calculate correlation coefficient.
    return cov / (SDX * SDY)
# Express cross-covariance or cross-correlation function in a form of a 1D vector.
def array(k, norm = True):
    # If norm = True, return array of autocorrelation coefficients.
    # If norm = False, return array of autocovariance coefficients.
    vector = np.array([])
    shifts = np.abs(np.arange(-k, k+1, 1))
    for i in shifts:
        if norm == True:
            vector = np.append(crossCorrelation(Xi, Yi, i), vector)
        else:
            vector = np.append(crossCovariance(Xi, Yi, N, i, Xs, Ys), vector)
    return vector
In my example, calling array(k, norm = True) for different values of k gives the results shown below:
k = 3, [ 0. -0.5 0. 1. 0. -0.5 0. ]
k = 2, [-0.5 0. 1. 0. -0.5]
k = 1, [ 0. 1. 0.]
k = 0, [ 1.]
My approach is fine for learning purposes, but I need to move to native numpy functions to speed up my analysis. How can one specify the shift k when using the native numpy.correlate function? PS: the k parameter specifies the "time" shift between the two arrays. Thank you in advance.
Whilst I'm not aware of any built-in function for computing the cross-correlation for a particular range of signal lags, you can speed your version up a lot by vectorization, i.e. performing operations on arrays rather than single elements in an array.
This version uses only a single Python loop over the lags:
import numpy as np
def xcorr(x, y, k, normalize=True):
    n = x.shape[0]
    # initialize the output array
    out = np.empty((2 * k) + 1, dtype=np.double)
    lags = np.arange(-k, k + 1)
    # pre-compute E(x), E(y)
    mu_x = x.mean()
    mu_y = y.mean()
    # loop over lags
    for ii, lag in enumerate(lags):
        # use slice indexing to get 'shifted' views of the two input signals,
        # pairing x[i + lag] with y[i] (same convention as the question's code)
        if lag < 0:
            xi = x[:lag]
            yi = y[-lag:]
        elif lag > 0:
            # as posted, this branch selected the same pairs as lag < 0,
            # which made the output symmetric in the lag for any input
            xi = x[lag:]
            yi = y[:-lag]
        else:
            xi = x
            yi = y
        # x - mu_x; y - mu_y
        xdiff = xi - mu_x
        ydiff = yi - mu_y
        # E[(x - mu_x) * (y - mu_y)]
        out[ii] = xdiff.dot(ydiff) / n
        # NB: xdiff.dot(ydiff) == (xdiff * ydiff).sum()
    if normalize:
        # E[(x - mu_x) * (y - mu_y)] / (sigma_x * sigma_y)
        out /= np.std(x) * np.std(y)
    return lags, out
Some more general points of advice:
As I mentioned in the comments, you should try to give your functions names that are informative, and that aren't likely to conflict with other things in your namespace (e.g. array vs np.array).
It's much better to make your functions self-contained. In your version, N, k, Xs and Ys are defined outside the main function. In this situation you might accidentally modify or overwrite one of these variables, and it can get tricky to debug errors caused by this sort of thing.
Appending to numpy arrays (e.g. using np.append or np.concatenate) is slow, so avoid it whenever you can. If, as in this case, you know the size of the output ahead of time, it's much faster to pre-allocate the output array (e.g. using np.empty or np.zeros), then fill in the elements. If you absolutely have to do concatenation, it's often faster to append to a normal Python list, then convert it to a numpy array at the end.
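If staying with np.correlate is preferred, one option is to compute the 'full' output and slice out the lags of interest. The sketch below does that; the helper name `xcorr_np` is hypothetical, and it assumes equal-length inputs with 0 <= k < n. Note that it still computes every lag internally, so it saves code rather than work.

```python
import numpy as np

def xcorr_np(x, y, k, normalize=True):
    """Hypothetical helper: cross-correlation for lags -k..k via np.correlate."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    if y.shape[0] != n or not 0 <= k < n:
        raise ValueError("x and y must have equal length n, and 0 <= k < n")
    # 'full' mode covers lags -(n-1)..(n-1); the zero lag sits at index n-1
    full = np.correlate(x - x.mean(), y - y.mean(), mode='full') / n
    out = full[n - 1 - k : n + k]
    if normalize:
        out = out / (np.std(x) * np.std(y))
    return np.arange(-k, k + 1), out

lags, coeffs = xcorr_np([1, 2, 3], [1, 2, 3], k=2)
print(lags, coeffs)
```

For the signals in the question this should reproduce the k = 2 row listed there.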
It's available by specifying maxlags:
import matplotlib.pyplot as plt
xcorr = plt.xcorr(signal_1, signal_2, maxlags=1)
Documentation can be found here. This implementation is based on np.correlate.
