Specify the shift for numpy.correlate

Specify the shift for numpy.correlate - python

I wonder if there is a possibility to specify the shift expressed by k variable for the cross-correlation of two 1D arrays. Because with the numpy.correlate function and its mode parameter set to 'full' I will get cross-correlate coefficients for each k shift for whole length of the taken array (assuming that both arrays are the same size). Let me show you what I mean exactly on below example:
import numpy as np
# Define signal 1.
signal_1 = np.array([1, 2 ,3])
# Define signal 2.
signal_2 = np.array([1, 2, 3])
# Other definitions.
Xi = signal_1
Yi = signal_2
N = np.size(Xi)
k = 3
Xs = np.average(Xi)
Ys = np.average(Yi)
# Cross-covariance coefficient function.
def crossCovariance(Xi, Yi, N, k, Xs, Ys, forCorrelation = False):
autoCov = 0
for i in np.arange(0, N-k):
autoCov += ((Xi[i+k])-Xs)*(Yi[i]-Ys)
if forCorrelation == True:
return autoCov/N
else:
return (1/(N-1))*autoCov
# Expected value function.
def E(X, P):
expectedValue = 0
for i in np.arange(0, np.size(X)):
expectedValue += X[i] * (P[i] / np.size(X))
return expectedValue
# Cross-correlation coefficient function.
def crossCorrelation(Xi, Yi, k):
# Calculate the covariance coefficient.
cov = crossCovariance(Xi, Yi, N, k, Xs, Ys, forCorrelation = True)
# Calculate standard deviations.
EX = E(Xi, np.ones(np.size(Xi)))
SDX = (E((Xi - EX) ** 2, np.ones(np.size(Xi)))) ** (1/2)
EY = E(Yi, np.ones(np.size(Yi)))
SDY = (E((Yi - EY) ** 2, np.ones(np.size(Yi)))) ** (1/2)
# Calculate correlation coefficient.
return cov / (SDX * SDY)
# Express cross-covariance or cross-correlation function in a form of a 1D vector.
def array(k, norm = True):
# If norm = True, return array of autocorrelation coefficients.
# If norm = False, return array of autocovariance coefficients.
vector = np.array([])
shifts = np.abs(np.arange(-k, k+1, 1))
for i in shifts:
if norm == True:
vector = np.append(crossCorrelation(Xi, Yi, i), vector)
else:
vector = np.append(crossCovariance(Xi, Yi, N, i, Xs, Ys), vector)
return vector
In my example, calling the method array(k, norm = True) for different values of k will give resuslt as I shown below:
k = 3, [ 0. -0.5 0. 1. 0. -0.5 0. ]
k = 2, [-0.5 0. 1. 0. -0.5]
k = 1, [ 0. 1. 0.]
k = 0, [ 1.]
My approach is good for the learning purposes but I need to move to the native numpy functions in order to speed up my analysis. How one could specify the k shift value while using the native numpy.correlate function? PS k parameter specify the "time" shift between two arrays. Thank you in advance.

Whilst I'm not aware of any built-in function for computing the cross-correlation for a particular range of signal lags, you can speed your version up a lot by vectorization, i.e. performing operations on arrays rather than single elements in an array.
This version uses only a single Python loop over the lags:
import numpy as np
def xcorr(x, y, k, normalize=True):
n = x.shape[0]
# initialize the output array
out = np.empty((2 * k) + 1, dtype=np.double)
lags = np.arange(-k, k + 1)
# pre-compute E(x), E(y)
mu_x = x.mean()
mu_y = y.mean()
# loop over lags
for ii, lag in enumerate(lags):
# use slice indexing to get 'shifted' views of the two input signals
if lag < 0:
xi = x[:lag]
yi = y[-lag:]
elif lag > 0:
xi = x[:-lag]
yi = y[lag:]
else:
xi = x
yi = y
# x - mu_x; y - mu_y
xdiff = xi - mu_x
ydiff = yi - mu_y
# E[(x - mu_x) * (y - mu_y)]
out[ii] = xdiff.dot(ydiff) / n
# NB: xdiff.dot(ydiff) == (xdiff * ydiff).sum()
if normalize:
# E[(x - mu_x) * (y - mu_y)] / (sigma_x * sigma_y)
out /= np.std(x) * np.std(y)
return lags, out
Some more general points of advice:
As I mentioned in the comments, you should try to give your functions names that are informative, and that aren't likely to conflict with other things in your namespace (e.g. array vs np.array).
It's much better to make your functions self-contained. In your version, N, k, Xs and Ys are defined outside the main function. In this situation you might accidentally modify or overwrite one of these variables, and it can get tricky to debug errors caused by this sort of thing.
Appending to numpy arrays (e.g. using np.append or np.concatenate) is slow, so avoid it whenever you can. If, as in this case, you know the size of the output ahead of time, it's much faster to pre-allocate the output array (e.g. using np.empty or np.zeros), then fill in the elements. If you absolutely have to do concatenation, it's often faster to append to a normal Python list, then convert it to a numpy array at the end.

It's available by specifying maxlags:
import matplotlib.pyplot as plt
xcorr = plt.xcorr(signal_1, signal_2, maxlags=1)
Documentation can be found here. This implementation is based on np.correlate.

Related

Simulate a coupled ordinary differential equation

I want to write a program which turns a 2nd order differential equation into two ordinary differential equations but I don't know how I can do that in Python.
I am getting lots of errors, please help in writing the code correctly.
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
import numpy as np
N = 30 # Number of coupled oscillators.
alpha=0.25
A = 1.0
# Initial positions.
y[0] = 0 # Fix the left-hand side at zero.
y[N+1] = 0 # Fix the right-hand side at zero.
# The range(1,N+1) command only prints out [1,2,3, ... N].
for p in range(1, N+1): # p is particle number.
y[p] = A * np.sin(3 * p * np.pi /(N+1.0))
####################################################
# Initial velocities.
####################################################
v[0] = 0 # The left and right boundaries are
v[N+1] = 0 # clamped and don't move.
# This version sets them all the particle velocities to zero.
for p in range(1, N+1):
v[p] = 0
w0 = [v[p], y[p]]
def accel(t,w):
v[p], y[p] = w
global a
a[0] = 0.0
a[N+1] = 0.0
# This version loops explicitly over all the particles.
for p in range(1,N+1):
a[p] = [v[p], y(p+1)+y(p-1)-2*y(p)+ alpha * ((y[p+1] - y[p])**2 - (y[p] - y[p-1])**2)]
return a
duration = 50
t = np.linspace(0, duration, 800)
abserr = 1.0e-8
relerr = 1.0e-6
solution = solve_ivp(accel, [0, duration], w0, method='RK45', t_eval=t,
vectorized=False, dense_output=True, args=(), atol=abserr, rtol=relerr)

Most general-purpose solvers do not do structured state objects. They just work with a flat array as representation of the state space points. From the construction of the initial point you seem to favor the state space ordering
[ v[0], v[1], ... v[N+1], y[0], y[1], ..., y[N+1] ]
This allows to simply split both and to assemble the derivatives vector from the velocity and acceleration arrays.
Let's keep things simple and separate functionality in small functions
a = np.zeros(N+2)
def accel(y):
global a ## initialized to the correct length with zero, avoids repeated allocation
a[1:-1] = y[2:]+y[:-2]-2*y[1:-1] + alpha*((y[2:]-y[1:-1])**2-(y[1:-1]-y[:-2])**2)
return a
def derivs(t,w):
v,y = w[:N+2], w[N+2:]
return np.concatenate([accel(y), v])
or keeping the theme of avoiding allocations
dwdt = np.zeros(2*N+4)
def derivs(t,w):
global dwdt
v,y = w[:N+2], w[N+2:]
dwdt[:N+2] = accel(y)
dwdt[N+2:] = v
return dwdt
Now you only need to set
w0=np.concatenate([v,y])
to rapidly get to a more interesting class of errors.

Optimize a double loop with mesh grids involved

I am doing a double loop to sum a function that has mesh grids as an input. The problem is that it runs very slow... I want to optimize the code with an alternative procedure, maybe using vectorize function of numpy, but I don't see how can be implemented. I show you the code that I have:
import numpy as np
import time
Lxx = 2.
Lyy = 1.0
dxx = dyy = 0.01
nxx = 100
nyy = 100
XX, YY = np.meshgrid(np.arange(0, Lxx+dxx, dxx), np.arange(0, Lyy+dyy, dyy)) #mesh grid
def solution(xx,yy,nnmax,mmmax):
sol = 0.
for m in range(nnmax):
for n in range(mmmax):
sol = sol+np.sin(XX*0.356*n)+np.cos(YY*2.3*m)
return sol
start = time.time()
solution(XX,YY,nxx,nyy)
end = time.time()
print ("TIME", end-start)
What I want is to make the sum for large values in nxx, nyy. But of course then it takes a lot of time...This is the reason why I want optimize the code.

If you notice, the terms of the sum are completely separable: they don't share any loop variables. You can therefore create independent (smaller) arrays for the sum over XX, n and YY, m, and take the trig functions and sum of those. The final grid can be accumulated by broadcasting.
To begin, don't bother making the grid:
x = np.arange(0, Lxx+dxx, dxx)
y = np.arange(0, Lyy+dyy, dyy)
Compute a single sum using broadcasting:
n = np.arange(nyy)[:, None]
m = np.arange(nxx)[:, None]
sumx = np.sin(x + 0.356 * n).sum(0)
sumy = np.cos(y + 2.3 * m).sum(0)
You can use the same broadcasting trick to get the final sum in a grid:
result = sumx[:, None] + sumy

Convolve array with kernel of variable standard deviation

Good day to you fellow programmer !
Today I would like to do something that I believe is tricky. I have a very large 2D array called tac that basically contains time curve values and a file containing a tuple of coordinates called coor which contains information on where to place these curves in a 3D array. What this set of variables represents is actually a 4D array: the first 3 dimensions represent space dimensions and the fourth is time. The whole thing is stored as is to avoid storing an immense amount of zeros.
I would like to apply, for each time (in other words, each values in the 4th dimension), a gaussian kernel to this set of data. I was able to generate this kernel and to perform the convolution quite easily for a fixed standard deviation for the whole array using scipy.ndimage.convolve. The kernel was created using scipy.signal.gaussian. Here is a brief example of the principle where tac_4d contains the 4D array (stores a lot of data I know... but one problem at the time):
def gaussian_kernel_3d(radius, sigma):
num = 2 * radius + 1
kernel_1d = signal.gaussian(num, std=sigma).reshape(num, 1)
kernel_2d = np.outer(kernel_1d, kernel_1d)
kernel_3d = np.outer(kernel_1d, kernel_2d).reshape(num, num, num)
kernel_3d = np.expand_dims(kernel_3d, -1)
return kernel_3d
g = gaussian_kernel_3d(1, .5)
cag = nd.convolve(tac_4d, g, mode='constant', cval=0.0)
The trick is now to convolve the array with a kernel which standard deviation is different for each SPACE coordinate. In other words, I would have a 3D array std containing standard deviations for each coordinate of the array.
It seems https://github.com/sheliak/varconvolve is the code needed to take care of this problem. However I don't really understand how to use it and quite frankly, I would prefer to come up with a genuine solution. Do you guys see a way to solve this problem?
Thanks in advance !
EDIT
Here is what I hope can be considered MCVE
import numpy as np
from scipy import signal
from scipy import ndimage as nd
def gaussian_kernel_2d(radius, sigma):
num = 2 * radius + 1
kernel_1d = signal.gaussian(num, std=sigma).reshape(num, 1)
kernel_2d = np.outer(kernel_1d, kernel_1d)
return kernel_2d
def gaussian_kernel_3d(radius, sigma):
num = 2 * radius + 1
kernel_1d = signal.gaussian(num, std=sigma).reshape(num, 1)
kernel_2d = np.outer(kernel_1d, kernel_1d)
kernel_3d = np.outer(kernel_1d, kernel_2d).reshape(num, num, num)
kernel_3d = np.expand_dims(kernel_3d, -1)
return kernel_3d
np.random.seed(0)
number_of_tac = 150
time_samples = 915
z, y, x = 100, 150, 100
voxel_number = x * y * z
# TACs in the right order
tac = np.random.uniform(0, 4, time_samples * number_of_tac).reshape(number_of_tac, time_samples)
arr = np.array([0] * (voxel_number - number_of_tac) + [1] * number_of_tac)
np.random.shuffle(arr)
arr = arr.reshape(z, y, x)
coor = np.where(arr != 0) # non-empty voxel
# Algorithm to replace TAC in 3D space
nnz = np.zeros(arr.shape)
nnz[coor] = 1
tac_4d = np.zeros((x, y, z, time_samples))
tac_4d[np.where(nnz == 1)] = tac
# 3D convolution for all time
# TODO: find a way to make standard deviation change for each voxel
g = gaussian_kernel_3d(1, 1) # 3D kernel of std = 1
v = np.random.uniform(0, 1, x * y * z).reshape(z, y, x) # 3D array of std
cag = nd.convolve(tac_4d, g, mode='constant', cval=0.0) # convolution

Essentially, you have a 4D dataset, shape (nx, ny, nz, nt) that is sparse in (nx, ny, nz) and dense in the nt axis. If (i, j, k) are coordinates of nonzero points in the sparse dimensions, you want to convolve with a Gaussian 3D kernel that has a sigma that depends on (i, j, k).
For example, if there are nonzero points at [1, 2, 5] and [1, 4, 5] with corresponding sigmas 0.1 and 1.0, then the output at coordinates [1, 3, 5] is affected mostly by the [1, 4, 5] point because that one has the largest point spread.
Your question is ambiguous; it could also mean that point [1, 3, 5] has a its own associated sigma, for example 0.5, and pulls data from the two adjacent points with equal weight. I will assume the first definition (sigma values associated with input points, not with output points).
Because the operation is not a true convolution, there is no fast FFT-based method to do the entire operation in one operation. Instead, you have to loop over the sigma values. Fortunately, your example has only 150 nonzero points, so the loop is not too expensive.
Here is an implementation. I keep the data in sparse representation as long as possible.
import scipy.signal
import numpy as np
def kernel3d(mm, sigma):
"""Return (mm, mm, mm) shaped, normalized kernel."""
g1 = scipy.signal.gaussian(mm, std=sigma)
g3 = g1.reshape(mm, 1, 1) * g1.reshape(1, mm, 1) * g1.reshape(1, 1, mm)
return g3 * (1/g3.sum())
np.random.seed(1)
s = 2 # scaling factor (original problem: s=10)
nx, ny, nz, nt, nnz = 10*s, 11*s, 12*s, 91*s, 15*s
# select nnz random voxels to fill with time series data
randint = np.random.randint
tseries = {} # key: (i, j, k) tuple; value: time series data, shape (nt,)
for _ in range(nnz):
while True:
ijk = (randint(nx), randint(ny), randint(nz))
if ijk not in tseries:
tseries[ijk] = np.random.uniform(0, 1, size=nt)
break
ijks = np.array(list(tseries.keys())) # shape (nnz, 3)
# sigmas: key: (i, j, k) tuple; value: standard deviation
sigmas = { k: np.random.uniform(0, 2) for k in tseries.keys() }
# output will be stored as dense array, padded to avoid edge issues
# with convolution.
m = 5 # padding size
cag_4dp = np.zeros((nx+2*m, ny+2*m, nz+2*m, nt))
mm = 2*m + 1 # kernel width
for (i, j, k), tdata in tseries.items():
kernel = kernel3d(mm, sigmas[(i, j, k)]).reshape(mm, mm, mm, 1)
# convolution of one voxel by kernel is trivial.
# slice4d_c has shape (mm, mm, mm, nt).
slice4d_c = kernel * tdata
cag_4dp[i:i+mm, j:j+mm, k:k+mm, :] += slice4d_c
cag_4d = cag_4dp[m:-m, m:-m, m:-m, :]
#%%
import matplotlib.pyplot as plt
fig, axs = plt.subplots(2, 2, tight_layout=True)
plt.close('all')
# find a few planes
#ks = np.where(np.any(cag_4d != 0, axis=(0, 1,3)))[0]
ks = ijks[:4, 2]
for ax, k in zip(axs.ravel(), ks):
ax.imshow(cag_4d[:, :, k, nt//2].T)
ax.set_title(f'Voxel [:, :, {k}] at time {nt//2}')
fig.show()
for ijk, sigma in sigmas.items():
print(f'{ijk}: sigma={sigma:.2f}')

How to perform cubic spline interpolation in python?

I have two lists to describe the function y(x):
x = [0,1,2,3,4,5]
y = [12,14,22,39,58,77]
I would like to perform cubic spline interpolation so that given some value u in the domain of x, e.g.
u = 1.25
I can find y(u).
I found this in SciPy but I am not sure how to use it.

Short answer:
from scipy import interpolate
def f(x):
x_points = [ 0, 1, 2, 3, 4, 5]
y_points = [12,14,22,39,58,77]
tck = interpolate.splrep(x_points, y_points)
return interpolate.splev(x, tck)
print(f(1.25))
Long answer:
scipy separates the steps involved in spline interpolation into two operations, most likely for computational efficiency.
The coefficients describing the spline curve are computed,
using splrep(). splrep returns an array of tuples containing the
coefficients.
These coefficients are passed into splev() to actually
evaluate the spline at the desired point x (in this example 1.25).
x can also be an array. Calling f([1.0, 1.25, 1.5]) returns the
interpolated points at 1, 1.25, and 1,5, respectively.
This approach is admittedly inconvenient for single evaluations, but since the most common use case is to start with a handful of function evaluation points, then to repeatedly use the spline to find interpolated values, it is usually quite useful in practice.

In case, scipy is not installed:
import numpy as np
from math import sqrt
def cubic_interp1d(x0, x, y):
"""
Interpolate a 1-D function using cubic splines.
x0 : a float or an 1d-array
x : (N,) array_like
A 1-D array of real/complex values.
y : (N,) array_like
A 1-D array of real values. The length of y along the
interpolation axis must be equal to the length of x.
Implement a trick to generate at first step the cholesky matrice L of
the tridiagonal matrice A (thus L is a bidiagonal matrice that
can be solved in two distinct loops).
additional ref: www.math.uh.edu/~jingqiu/math4364/spline.pdf
"""
x = np.asfarray(x)
y = np.asfarray(y)
# remove non finite values
# indexes = np.isfinite(x)
# x = x[indexes]
# y = y[indexes]
# check if sorted
if np.any(np.diff(x) < 0):
indexes = np.argsort(x)
x = x[indexes]
y = y[indexes]
size = len(x)
xdiff = np.diff(x)
ydiff = np.diff(y)
# allocate buffer matrices
Li = np.empty(size)
Li_1 = np.empty(size-1)
z = np.empty(size)
# fill diagonals Li and Li-1 and solve [L][y] = [B]
Li[0] = sqrt(2*xdiff[0])
Li_1[0] = 0.0
B0 = 0.0 # natural boundary
z[0] = B0 / Li[0]
for i in range(1, size-1, 1):
Li_1[i] = xdiff[i-1] / Li[i-1]
Li[i] = sqrt(2*(xdiff[i-1]+xdiff[i]) - Li_1[i-1] * Li_1[i-1])
Bi = 6*(ydiff[i]/xdiff[i] - ydiff[i-1]/xdiff[i-1])
z[i] = (Bi - Li_1[i-1]*z[i-1])/Li[i]
i = size - 1
Li_1[i-1] = xdiff[-1] / Li[i-1]
Li[i] = sqrt(2*xdiff[-1] - Li_1[i-1] * Li_1[i-1])
Bi = 0.0 # natural boundary
z[i] = (Bi - Li_1[i-1]*z[i-1])/Li[i]
# solve [L.T][x] = [y]
i = size-1
z[i] = z[i] / Li[i]
for i in range(size-2, -1, -1):
z[i] = (z[i] - Li_1[i-1]*z[i+1])/Li[i]
# find index
index = x.searchsorted(x0)
np.clip(index, 1, size-1, index)
xi1, xi0 = x[index], x[index-1]
yi1, yi0 = y[index], y[index-1]
zi1, zi0 = z[index], z[index-1]
hi1 = xi1 - xi0
# calculate cubic
f0 = zi0/(6*hi1)*(xi1-x0)**3 + \
zi1/(6*hi1)*(x0-xi0)**3 + \
(yi1/hi1 - zi1*hi1/6)*(x0-xi0) + \
(yi0/hi1 - zi0*hi1/6)*(xi1-x0)
return f0
if __name__ == '__main__':
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 11)
y = np.sin(x)
plt.scatter(x, y)
x_new = np.linspace(0, 10, 201)
plt.plot(x_new, cubic_interp1d(x_new, x, y))
plt.show()

If you have scipy version >= 0.18.0 installed you can use CubicSpline function from scipy.interpolate for cubic spline interpolation.
You can check scipy version by running following commands in python:
#!/usr/bin/env python3
import scipy
scipy.version.version
If your scipy version is >= 0.18.0 you can run following example code for cubic spline interpolation:
#!/usr/bin/env python3
import numpy as np
from scipy.interpolate import CubicSpline
# calculate 5 natural cubic spline polynomials for 6 points
# (x,y) = (0,12) (1,14) (2,22) (3,39) (4,58) (5,77)
x = np.array([0, 1, 2, 3, 4, 5])
y = np.array([12,14,22,39,58,77])
# calculate natural cubic spline polynomials
cs = CubicSpline(x,y,bc_type='natural')
# show values of interpolation function at x=1.25
print('S(1.25) = ', cs(1.25))
## Aditional - find polynomial coefficients for different x regions
# if you want to print polynomial coefficients in form
# S0(0<=x<=1) = a0 + b0(x-x0) + c0(x-x0)^2 + d0(x-x0)^3
# S1(1< x<=2) = a1 + b1(x-x1) + c1(x-x1)^2 + d1(x-x1)^3
# ...
# S4(4< x<=5) = a4 + b4(x-x4) + c5(x-x4)^2 + d5(x-x4)^3
# x0 = 0; x1 = 1; x4 = 4; (start of x region interval)
# show values of a0, b0, c0, d0, a1, b1, c1, d1 ...
cs.c
# Polynomial coefficients for 0 <= x <= 1
a0 = cs.c.item(3,0)
b0 = cs.c.item(2,0)
c0 = cs.c.item(1,0)
d0 = cs.c.item(0,0)
# Polynomial coefficients for 1 < x <= 2
a1 = cs.c.item(3,1)
b1 = cs.c.item(2,1)
c1 = cs.c.item(1,1)
d1 = cs.c.item(0,1)
# ...
# Polynomial coefficients for 4 < x <= 5
a4 = cs.c.item(3,4)
b4 = cs.c.item(2,4)
c4 = cs.c.item(1,4)
d4 = cs.c.item(0,4)
# Print polynomial equations for different x regions
print('S0(0<=x<=1) = ', a0, ' + ', b0, '(x-0) + ', c0, '(x-0)^2 + ', d0, '(x-0)^3')
print('S1(1< x<=2) = ', a1, ' + ', b1, '(x-1) + ', c1, '(x-1)^2 + ', d1, '(x-1)^3')
print('...')
print('S5(4< x<=5) = ', a4, ' + ', b4, '(x-4) + ', c4, '(x-4)^2 + ', d4, '(x-4)^3')
# So we can calculate S(1.25) by using equation S1(1< x<=2)
print('S(1.25) = ', a1 + b1*0.25 + c1*(0.25**2) + d1*(0.25**3))
# Cubic spline interpolation calculus example
# https://www.youtube.com/watch?v=gT7F3TWihvk

Just putting this here if you want a dependency-free solution.
Code taken from an answer above: https://stackoverflow.com/a/48085583/36061
def my_cubic_interp1d(x0, x, y):
"""
Interpolate a 1-D function using cubic splines.
x0 : a 1d-array of floats to interpolate at
x : a 1-D array of floats sorted in increasing order
y : A 1-D array of floats. The length of y along the
interpolation axis must be equal to the length of x.
Implement a trick to generate at first step the cholesky matrice L of
the tridiagonal matrice A (thus L is a bidiagonal matrice that
can be solved in two distinct loops).
additional ref: www.math.uh.edu/~jingqiu/math4364/spline.pdf
# original function code at: https://stackoverflow.com/a/48085583/36061
This function is licenced under: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
https://creativecommons.org/licenses/by-sa/3.0/
Original Author raphael valentin
Date 3 Jan 2018
Modifications made to remove numpy dependencies:
-all sub-functions by MR
This function, and all sub-functions, are licenced under: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
Mod author: Matthew Rowles
Date 3 May 2021
"""
def diff(lst):
"""
numpy.diff with default settings
"""
size = len(lst)-1
r = [0]*size
for i in range(size):
r[i] = lst[i+1] - lst[i]
return r
def list_searchsorted(listToInsert, insertInto):
"""
numpy.searchsorted with default settings
"""
def float_searchsorted(floatToInsert, insertInto):
for i in range(len(insertInto)):
if floatToInsert <= insertInto[i]:
return i
return len(insertInto)
return [float_searchsorted(i, insertInto) for i in listToInsert]
def clip(lst, min_val, max_val, inPlace = False):
"""
numpy.clip
"""
if not inPlace:
lst = lst[:]
for i in range(len(lst)):
if lst[i] < min_val:
lst[i] = min_val
elif lst[i] > max_val:
lst[i] = max_val
return lst
def subtract(a,b):
"""
returns a - b
"""
return a - b
size = len(x)
xdiff = diff(x)
ydiff = diff(y)
# allocate buffer matrices
Li = [0]*size
Li_1 = [0]*(size-1)
z = [0]*(size)
# fill diagonals Li and Li-1 and solve [L][y] = [B]
Li[0] = sqrt(2*xdiff[0])
Li_1[0] = 0.0
B0 = 0.0 # natural boundary
z[0] = B0 / Li[0]
for i in range(1, size-1, 1):
Li_1[i] = xdiff[i-1] / Li[i-1]
Li[i] = sqrt(2*(xdiff[i-1]+xdiff[i]) - Li_1[i-1] * Li_1[i-1])
Bi = 6*(ydiff[i]/xdiff[i] - ydiff[i-1]/xdiff[i-1])
z[i] = (Bi - Li_1[i-1]*z[i-1])/Li[i]
i = size - 1
Li_1[i-1] = xdiff[-1] / Li[i-1]
Li[i] = sqrt(2*xdiff[-1] - Li_1[i-1] * Li_1[i-1])
Bi = 0.0 # natural boundary
z[i] = (Bi - Li_1[i-1]*z[i-1])/Li[i]
# solve [L.T][x] = [y]
i = size-1
z[i] = z[i] / Li[i]
for i in range(size-2, -1, -1):
z[i] = (z[i] - Li_1[i-1]*z[i+1])/Li[i]
# find index
index = list_searchsorted(x0,x)
index = clip(index, 1, size-1)
xi1 = [x[num] for num in index]
xi0 = [x[num-1] for num in index]
yi1 = [y[num] for num in index]
yi0 = [y[num-1] for num in index]
zi1 = [z[num] for num in index]
zi0 = [z[num-1] for num in index]
hi1 = list( map(subtract, xi1, xi0) )
# calculate cubic - all element-wise multiplication
f0 = [0]*len(hi1)
for j in range(len(f0)):
f0[j] = zi0[j]/(6*hi1[j])*(xi1[j]-x0[j])**3 + \
zi1[j]/(6*hi1[j])*(x0[j]-xi0[j])**3 + \
(yi1[j]/hi1[j] - zi1[j]*hi1[j]/6)*(x0[j]-xi0[j]) + \
(yi0[j]/hi1[j] - zi0[j]*hi1[j]/6)*(xi1[j]-x0[j])
return f0

Minimal python3 code:
from scipy import interpolate
if __name__ == '__main__':
x = [ 0, 1, 2, 3, 4, 5]
y = [12,14,22,39,58,77]
# tck : tuple (t,c,k) a tuple containing the vector of knots,
# the B-spline coefficients, and the degree of the spline.
tck = interpolate.splrep(x, y)
print(interpolate.splev(1.25, tck)) # Prints 15.203125000000002
print(interpolate.splev(...other_value_here..., tck))
Based on comment of cwhy and answer by youngmit

In my previous post, I wrote a code based on a Cholesky development to solve the matrix generated by the cubic algorithm. Unfortunately, due to the square root function, it may perform badly on some sets of points (typically a non-uniform set of points).
In the same spirit than previously, there is another idea using the Thomas algorithm (TDMA) (see https://en.wikipedia.org/wiki/Tridiagonal_matrix_algorithm) to solve partially the tridiagonal matrix during its definition loop. However, the condition to use TDMA is that it requires at least that the matrix shall be diagonally dominant. However, in our case, it shall be true since |bi| > |ai| + |ci| with ai = h[i], bi = 2*(h[i]+h[i+1]), ci = h[i+1], with h[i] unconditionally positive. (see https://www.cfd-online.com/Wiki/Tridiagonal_matrix_algorithm_-TDMA(Thomas_algorithm)
I refer again to the document from jingqiu (see my previous post, unfortunately the link is broken, but it is still possible to find it in the cache of the web).
An optimized version of the TDMA solver can be described as follows:
def TDMAsolver(a,b,c,d):
""" This function is licenced under: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
https://creativecommons.org/licenses/by-sa/3.0/
Author raphael valentin
Date 25 Mar 2022
ref. https://www.cfd-online.com/Wiki/Tridiagonal_matrix_algorithm_-_TDMA_(Thomas_algorithm)
"""
n = len(d)
w = np.empty(n-1,float)
g = np.empty(n, float)
w[0] = c[0]/b[0]
g[0] = d[0]/b[0]
for i in range(1, n-1):
m = b[i] - a[i-1]*w[i-1]
w[i] = c[i] / m
g[i] = (d[i] - a[i-1]*g[i-1]) / m
g[n-1] = (d[n-1] - a[n-2]*g[n-2]) / (b[n-1] - a[n-2]*w[n-2])
for i in range(n-2, -1, -1):
g[i] = g[i] - w[i]*g[i+1]
return g
When it is possible to get each individual for ai, bi, ci, di, it becomes easy to combine the definitions of the natural cubic spline interpolator function within these 2 single loops.
def cubic_interpolate(x0, x, y):
""" Natural cubic spline interpolate function
This function is licenced under: Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
https://creativecommons.org/licenses/by-sa/3.0/
Author raphael valentin
Date 25 Mar 2022
"""
xdiff = np.diff(x)
dydx = np.diff(y)
dydx /= xdiff
n = size = len(x)
w = np.empty(n-1, float)
z = np.empty(n, float)
w[0] = 0.
z[0] = 0.
for i in range(1, n-1):
m = xdiff[i-1] * (2 - w[i-1]) + 2 * xdiff[i]
w[i] = xdiff[i] / m
z[i] = (6*(dydx[i] - dydx[i-1]) - xdiff[i-1]*z[i-1]) / m
z[-1] = 0.
for i in range(n-2, -1, -1):
z[i] = z[i] - w[i]*z[i+1]
# find index (it requires x0 is already sorted)
index = x.searchsorted(x0)
np.clip(index, 1, size-1, index)
xi1, xi0 = x[index], x[index-1]
yi1, yi0 = y[index], y[index-1]
zi1, zi0 = z[index], z[index-1]
hi1 = xi1 - xi0
# calculate cubic
f0 = zi0/(6*hi1)*(xi1-x0)**3 + \
zi1/(6*hi1)*(x0-xi0)**3 + \
(yi1/hi1 - zi1*hi1/6)*(x0-xi0) + \
(yi0/hi1 - zi0*hi1/6)*(xi1-x0)
return f0
This function gives the same results as the function/class CubicSpline from scipy.interpolate, as we can see in the next plot.
It is possible to implement as well the first and second analytical derivatives that can be described such way:
f1p = -zi0/(2*hi1)*(xi1-x0)**2 + zi1/(2*hi1)*(x0-xi0)**2 + (yi1/hi1 - zi1*hi1/6) + (yi0/hi1 - zi0*hi1/6)
f2p = zi0/hi1 * (xi1-x0) + zi1/hi1 * (x0-xi0)
Then, it is easy to verify that f2p[0] and f2p[-1] are equal to 0, then that the interpolator function yields natural splines.
An additional reference concerning natural spline:
https://faculty.ksu.edu.sa/sites/default/files/numerical_analysis_9th.pdf#page=167
An example of use:
import matplotlib.pyplot as plt
import numpy as np
x = [-8,-4.19,-3.54,-3.31,-2.56,-2.31,-1.66,-0.96,-0.22,0.62,1.21,3]
y = [-0.01,0.01,0.03,0.04,0.07,0.09,0.16,0.28,0.45,0.65,0.77,1]
x = np.asfarray(x)
y = np.asfarray(y)
plt.scatter(x, y)
x_new= np.linspace(min(x), max(x), 10000)
y_new = cubic_interpolate(x_new, x, y)
plt.plot(x_new, y_new)
from scipy.interpolate import CubicSpline
f = CubicSpline(x, y, bc_type='natural')
plt.plot(x_new, f(x_new), label='ref')
plt.legend()
plt.show()
In a conclusion, this updated algorithm shall perform interpolation with better stability and faster than the previous code (O(n)). Associated with numba or cython, it shall be even very fast. Finally, it is totally independent of Scipy.
Important, note that as most of algorithms, it is sometimes useful to normalize the data (e.g. against large or small number values) to get the best results. As well, in this code, I do not check nan values or ordered data.
Whatever, this update was a good lesson learning for me and I hope it can help someone. Let me know if you find something strange.

If you want to get the value
from scipy.interpolate import CubicSpline
import numpy as np
x = [-5,-4.19,-3.54,-3.31,-2.56,-2.31,-1.66,-0.96,-0.22,0.62,1.21,3]
y = [-0.01,0.01,0.03,0.04,0.07,0.09,0.16,0.28,0.45,0.65,0.77,1]
value = 2
#ascending order
if np.any(np.diff(x) < 0):
indexes = np.argsort(x).astype(int)
x = np.array(x)[indexes]
y = np.array(y)[indexes]
f = CubicSpline(x, y, bc_type='natural')
specificVal = f(value).item(0) #f(value) is numpy.ndarray!!
print(specificVal)
If you want to plot the interpolated function.
np.linspace third parameter increase the "accuracy".
from scipy.interpolate import CubicSpline
import numpy as np
import matplotlib.pyplot as plt
x = [-5,-4.19,-3.54,-3.31,-2.56,-2.31,-1.66,-0.96,-0.22,0.62,1.21,3]
y = [-0.01,0.01,0.03,0.04,0.07,0.09,0.16,0.28,0.45,0.65,0.77,1]
#ascending order
if np.any(np.diff(x) < 0):
indexes = np.argsort(x).astype(int)
x = np.array(x)[indexes]
y = np.array(y)[indexes]
f = CubicSpline(x, y, bc_type='natural')
x_new = np.linspace(min(x), max(x), 100)
y_new = f(x_new)
plt.plot(x_new, y_new)
plt.scatter(x, y)
plt.title('Cubic Spline Interpolation')
plt.show()
output:

Yes, as others have already noted, it should be as simple as
>>> from scipy.interpolate import CubicSpline
>>> CubicSpline(x,y)(u)
array(15.203125)
(you can, for example, convert it to float to get the value from a 0d NumPy array)
What has not been described yet is boundary conditions: the default ‘not-a-knot’ boundary conditions work best if you have zero knowledge about the data you’re going to interpolate.
If you see the following ‘features’ on the plot, you can fine-tune the boundary conditions to get a better result:
the first derivative vanishes at boundaries => bc_type=‘clamped’
the second derivative vanishes at boundaries => bc_type='natural'
the function is periodic => bc_type='periodic'
See my article for more details and an interactive demo.

Compute divergence of vector field using python

Is there a function that could be used for calculation of the divergence of the vectorial field? (in matlab) I would expect it exists in numpy/scipy but I can not find it using Google.
I need to calculate div[A * grad(F)], where
F = np.array([[1,2,3,4],[5,6,7,8]]) # (2D numpy ndarray)
A = np.array([[1,2,3,4],[1,2,3,4]]) # (2D numpy ndarray)
so grad(F) is a list of 2D ndarrays
I know I can calculate divergence like this but do not want to reinvent the wheel. (I would also expect something more optimized) Does anyone have suggestions?

Just a hint for everybody reading that:
the functions above do not compute the divergence of a vector field. they sum the derivatives of a scalar field A:
result = dA/dx + dA/dy
in contrast to a vector field (with three dimensional example):
result = sum dAi/dxi = dAx/dx + dAy/dy + dAz/dz
Vote down for all! It is mathematically simply wrong.
Cheers!

import numpy as np
def divergence(field):
"return the divergence of a n-D field"
return np.sum(np.gradient(field),axis=0)

Based on Juh_'s answer, but modified for the correct divergence of a vector field formula
def divergence(f):
"""
Computes the divergence of the vector field f, corresponding to dFx/dx + dFy/dy + ...
:param f: List of ndarrays, where every item of the list is one dimension of the vector field
:return: Single ndarray of the same shape as each of the items in f, which corresponds to a scalar field
"""
num_dims = len(f)
return np.ufunc.reduce(np.add, [np.gradient(f[i], axis=i) for i in range(num_dims)])
Matlab's documentation uses this exact formula (scroll down to Divergence of a Vector Field)

The answer of #user2818943 is good, but it can be optimized a little:
def divergence(F):
""" compute the divergence of n-D scalar field `F` """
return reduce(np.add,np.gradient(F))
Timeit:
F = np.random.rand(100,100)
timeit reduce(np.add,np.gradient(F))
# 1000 loops, best of 3: 318 us per loop
timeit np.sum(np.gradient(F),axis=0)
# 100 loops, best of 3: 2.27 ms per loop
About 7 times faster:
sum implicitely construct a 3d array from the list of gradient fields which are returned by np.gradient. This is avoided using reduce
Now, in your question what do you mean by div[A * grad(F)]?
about A * grad(F): A is a 2d array, and grad(f) is a list of 2d arrays. So I considered it means to multiply each gradient field by A.
about applying divergence to the (scaled by A) gradient field is unclear. By definition, div(F) = d(F)/dx + d(F)/dy + .... I guess this is just an error of formulation.
For 1, multiplying summed elements Bi by a same factor A can be factorized:
Sum(A*Bi) = A*Sum(Bi)
Thus, you can get this weighted gradient simply with: A*divergence(F)
If ̀A is instead a list of factor, one for each dimension, then the solution would be:
def weighted_divergence(W,F):
"""
Return the divergence of n-D array `F` with gradient weighted by `W`
̀`W` is a list of factors for each dimension of F: the gradient of `F` over
the `i`th dimension is multiplied by `W[i]`. Each `W[i]` can be a scalar
or an array with same (or broadcastable) shape as `F`.
"""
wGrad = return map(np.multiply, W, np.gradient(F))
return reduce(np.add,wGrad)
result = weighted_divergence(A,F)

What Daniel had modified is the right answer, let me explain self defined func divergence further in more detail :
Function np.gradient() defined as : np.gradient(f) = df/dx, df/dy, df/dz +...
but we need define func divergence as : divergence ( f) = dfx/dx + dfy/dy + dfz/dz +... = np.gradient( fx) + np.gradient(fy) + np.gradient(fz) + ...
Let's test, compare with example of divergence in matlab
import numpy as np
import matplotlib.pyplot as plt
NY = 50
ymin = -2.
ymax = 2.
dy = (ymax -ymin )/(NY-1.)
NX = NY
xmin = -2.
xmax = 2.
dx = (xmax -xmin)/(NX-1.)
def divergence(f):
num_dims = len(f)
return np.ufunc.reduce(np.add, [np.gradient(f[i], axis=i) for i in range(num_dims)])
y = np.array([ ymin + float(i)*dy for i in range(NY)])
x = np.array([ xmin + float(i)*dx for i in range(NX)])
x, y = np.meshgrid( x, y, indexing = 'ij', sparse = False)
Fx = np.cos(x + 2*y)
Fy = np.sin(x - 2*y)
F = [Fx, Fy]
g = divergence(F)
plt.pcolormesh(x, y, g)
plt.colorbar()
plt.savefig( 'Div' + str(NY) +'.png', format = 'png')
plt.show()
---------- UPDATED VERSION: Include the differential Steps----------------
Thank the comment from #henry, the np.gradient take the default step as 1, so the results may have some mismatch. We can provide our own differential steps.
#https://stackoverflow.com/a/47905007/5845212
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import make_axes_locatable
NY = 50
ymin = -2.
ymax = 2.
dy = (ymax -ymin )/(NY-1.)
NX = NY
xmin = -2.
xmax = 2.
dx = (xmax -xmin)/(NX-1.)
def divergence(f,h):
"""
div(F) = dFx/dx + dFy/dy + ...
g = np.gradient(Fx,dx, axis=1)+ np.gradient(Fy,dy, axis=0) #2D
g = np.gradient(Fx,dx, axis=2)+ np.gradient(Fy,dy, axis=1) +np.gradient(Fz,dz,axis=0) #3D
"""
num_dims = len(f)
return np.ufunc.reduce(np.add, [np.gradient(f[i], h[i], axis=i) for i in range(num_dims)])
y = np.array([ ymin + float(i)*dy for i in range(NY)])
x = np.array([ xmin + float(i)*dx for i in range(NX)])
x, y = np.meshgrid( x, y, indexing = 'ij', sparse = False)
Fx = np.cos(x + 2*y)
Fy = np.sin(x - 2*y)
F = [Fx, Fy]
h = [dx, dy]
print('plotting')
rows = 1
cols = 2
#plt.clf()
plt.figure(figsize=(cols*3.5,rows*3.5))
plt.minorticks_on()
#g = np.gradient(Fx,dx, axis=1)+np.gradient(Fy,dy, axis=0) # equivalent to our func
g = divergence(F,h)
ax = plt.subplot(rows,cols,1,aspect='equal',title='div numerical')
#im=plt.pcolormesh(x, y, g)
im = plt.pcolormesh(x, y, g, shading='nearest', cmap=plt.cm.get_cmap('coolwarm'))
plt.quiver(x,y,Fx,Fy)
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
cbar = plt.colorbar(im, cax = cax,format='%.1f')
g = -np.sin(x+2*y) -2*np.cos(x-2*y)
ax = plt.subplot(rows,cols,2,aspect='equal',title='div analytical')
im=plt.pcolormesh(x, y, g)
im = plt.pcolormesh(x, y, g, shading='nearest', cmap=plt.cm.get_cmap('coolwarm'))
plt.quiver(x,y,Fx,Fy)
divider = make_axes_locatable(ax)
cax = divider.append_axes("right", size="5%", pad=0.05)
cbar = plt.colorbar(im, cax = cax,format='%.1f')
plt.tight_layout()
plt.savefig( 'divergence.png', format = 'png')
plt.show()

Based on #paul_chen answer, and with some additions for Matplotlib 3.3.0 (a shading param needs to be passed, and default colormap I guess has changed)
import numpy as np
import matplotlib.pyplot as plt
NY = 20; ymin = -2.; ymax = 2.
dy = (ymax -ymin )/(NY-1.)
NX = NY
xmin = -2.; xmax = 2.
dx = (xmax -xmin)/(NX-1.)
def divergence(f):
num_dims = len(f)
return np.ufunc.reduce(np.add, [np.gradient(f[i], axis=i) for i in range(num_dims)])
y = np.array([ ymin + float(i)*dy for i in range(NY)])
x = np.array([ xmin + float(i)*dx for i in range(NX)])
x, y = np.meshgrid( x, y, indexing = 'ij', sparse = False)
Fx = np.cos(x + 2*y)
Fy = np.sin(x - 2*y)
F = [Fx, Fy]
g = divergence(F)
plt.pcolormesh(x, y, g, shading='nearest', cmap=plt.cm.get_cmap('coolwarm'))
plt.colorbar()
plt.quiver(x,y,Fx,Fy)
plt.savefig( 'Div.png', format = 'png')

The divergence as a built-in function is included in matlab, but not numpy. This is the sort of thing that it may perhaps be worthwhile to contribute to pylab, an effort to create a viable open-source alternative to matlab.
http://wiki.scipy.org/PyLab
Edit: Now called http://www.scipy.org/stackspec.html

As far as I can tell, the answer is that there is no native divergence function in numpy. Therefore, the best method for calculating divergence is to sum the components of the gradient vector i.e. calculate the divergence.

I don't think the answer by #Daniel is correct, especially when the input is in order [Fx, Fy, Fz, ...].
A simple test case
See the MATLAB code:
a = [1 2 3;1 2 3; 1 2 3];
b = [[7 8 9] ;[1 5 8] ;[2 4 7]];
divergence(a,b)
which gives the result:
ans =
-5.0000 -2.0000 0
-1.5000 -1.0000 0
2.0000 0 0
and Daniel's solution:
def divergence(f):
"""
Daniel's solution
Computes the divergence of the vector field f, corresponding to dFx/dx + dFy/dy + ...
:param f: List of ndarrays, where every item of the list is one dimension of the vector field
:return: Single ndarray of the same shape as each of the items in f, which corresponds to a scalar field
"""
num_dims = len(f)
return np.ufunc.reduce(np.add, [np.gradient(f[i], axis=i) for i in range(num_dims)])
if __name__ == '__main__':
a = np.array([[1, 2, 3]] * 3)
b = np.array([[7, 8, 9], [1, 5, 8], [2, 4, 7]])
div = divergence([a, b])
print(div)
pass
which gives:
[[1. 1. 1. ]
[4. 3.5 3. ]
[2. 2.5 3. ]]
Explanation
The mistake of Daniel's solution is, in Numpy, the x axis is the last axis instead of the first axis. When using np.gradient(x, axis=0), Numpy actually gives the gradient of y direction (when x is a 2d array).
My solution
There is my solution based on Daniel's answer.
def divergence(f):
"""
Computes the divergence of the vector field f, corresponding to dFx/dx + dFy/dy + ...
:param f: List of ndarrays, where every item of the list is one dimension of the vector field
:return: Single ndarray of the same shape as each of the items in f, which corresponds to a scalar field
"""
num_dims = len(f)
return np.ufunc.reduce(np.add, [np.gradient(f[num_dims - i - 1], axis=i) for i in range(num_dims)])
which gives the same result as MATLAB divergence in my test case.

Somehow the previous attempts to compute the divergence are wrong! Let me show you:
We have the following vector field F:
F(x) = cos(x+2y)
F(y) = sin(x-2y)
If we compute the divergence (using Mathematica):
Div[{Cos[x + 2*y], Sin[x - 2*y]}, {x, y}]
we get:
-2 Cos[x - 2 y] - Sin[x + 2 y]
which has a maximum value in the range of y [-1,2] and x [-2,2]:
N[Max[Table[-2 Cos[x - 2 y] - Sin[x + 2 y], {x, -2, 2 }, {y, -2, 2}]]] = 2.938
Using the divergence equation given here:
def divergence(f):
num_dims = len(f)
return np.ufunc.reduce(np.add, [np.gradient(f[i], axis=i) for i in range(num_dims)])
we get a maximum value of about 0.625
Correct divergence function: Compute divergence with python

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Specify the shift for numpy.correlate - python

It's available by specifying maxlags: import matplotlib.pyplot as plt xcorr = plt.xcorr(signal_1, signal_2, maxlags=1) Documentation can be found here. This implementation is based on np.correlate.

Related

Simulate a coupled ordinary differential equation

Optimize a double loop with mesh grids involved

Convolve array with kernel of variable standard deviation

How to perform cubic spline interpolation in python?

Compute divergence of vector field using python

Categories

Resources