Spline interpolation - python

I'm having difficulty performing a spline interpolation on the data set below:
import numpy
SOURCE = numpy.array([[1,2,3],[3,4,5], [9,10,11]])
from scipy.interpolate import griddata
from scipy.interpolate import interp1d
input = [0.5,2,3,6,9,15]
The linear interpolation works fine, but when I replace linear with cubic, I get an error:
f = interp1d(SOURCE[:,0], SOURCE[:,1:], kind="linear", axis=0, bounds_error=False)
f(input)
f = interp1d(SOURCE[:,0], SOURCE[:,1:], kind="cubic", axis=0, bounds_error=False)
ValueError: The number of derivatives at boundaries does not match: expected 1, got 0+0
How can I perform this cubic interpolation?

Your SOURCE data is too short. With kind="cubic", interp1d needs at least four points to interpolate from, but you only provide three. If you add one more value to SOURCE, it should work more or less as expected:
>>> SOURCE = numpy.array([[1,2,3],[3,4,5], [9,10,11], [12,13,14]]) # added an extra value
>>> f = interp1d(SOURCE[:,0], SOURCE[:,1:], kind="cubic", axis=0, bounds_error=False)
>>> f(input)
array([[nan, nan],
       [ 3.,  4.],
       [ 4.,  5.],
       [ 7.,  8.],
       [10., 11.],
       [nan, nan]])
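If adding a fourth point is not an option, one possible alternative (my assumption, not something the answer above proposes) is scipy.interpolate.CubicSpline. As far as I know it accepts three points and, with its default boundary conditions, fits a single parabola through them. The name query is hypothetical; it replaces input only to avoid shadowing the built-in.

```python
import numpy as np
from scipy.interpolate import CubicSpline

SOURCE = np.array([[1, 2, 3], [3, 4, 5], [9, 10, 11]])
query = [0.5, 2, 3, 6, 9, 15]  # hypothetical name for the question's `input`

# Column 0 holds x, the remaining columns hold the values to interpolate.
# extrapolate=False yields nan for query points outside [1, 9],
# similar to bounds_error=False in interp1d.
spline = CubicSpline(SOURCE[:, 0], SOURCE[:, 1:], axis=0, extrapolate=False)
result = spline(query)
print(result)
```

Because the sample data lies on straight lines, the interior values should agree with the linear interpolation; with curved data the two would differ.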

Related

Creating a NumPy array out of another array with shifted indices

I would like to produce a 4D array from a 2D one by periodic shifts, in a way that can be summarized by the following:
uuvv[kx,ky,qx,qy] = uu[kx+qx,ky+qy]
This is easiest to illustrate with a "2D from 1D" MWE:
import numpy as np
Npts = 4
def pc(idx):
    # periodicity condition: wrap a non-negative index back into [0, Npts)
    return idx - Npts*int(idx/Npts)
uu = np.square(np.arange(Npts))
uv = np.zeros((Npts,Npts))
for kx in np.arange(Npts):
    for qx in np.arange(Npts):
        uv[kx,qx] = uu[pc(kx+qx)]
Here, the periodicity condition pc just brings the index back into the allowed range. The output for Npts=4 is:
array([[0., 1., 4., 9.],
       [1., 4., 9., 0.],
       [4., 9., 0., 1.],
       [9., 0., 1., 4.]])
Each row is the previous row shifted cyclically by one position. For the "4D from 2D" case, I could obviously use:
def pbc(idx):
    return idx - Npts*int(idx/Npts)
uv = np.zeros((Npts,Npts,Npts,Npts))
for kx in np.arange(Npts):
    for ky in np.arange(Npts):
        for qx in np.arange(Npts):
            for qy in np.arange(Npts):
                uv[kx,ky,qx,qy] = uu[pbc(kx+qx),pbc(ky+qy)]
However, using four loops is going to be slow, as I will be doing this multiple times for much larger arrays. How can I do this more efficiently?
Please note that, although the MWE could be reproduced by applying the square function to a 2D array, that would not be a helpful solution. Using the MWE to illustrate, the goal is to apply the function as few times as possible (i.e. only on the 1D array) and then to create the 2D array without for loops. Ultimately, I need to generate a 4D array from a 2D array in the same way.
You can replicate the 2D array and then extract the shifted 2D sub-arrays (avoiding modulus and conditionals). Here is how to do that:
uuRep = np.tile(uu, (2,2))
uv = np.zeros((Npts,Npts,Npts,Npts))
for kx in np.arange(Npts):
    for ky in np.arange(Npts):
        uv[kx,ky,:,:] = uuRep[kx:kx+Npts,ky:ky+Npts]
With Npts=64, this solution is about 1000 times faster.
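The remaining two loops can probably be dropped as well by using integer-array indexing with broadcasting. This is a minimal sketch, not the answerer's method; the 2D uu below is a stand-in input and the names idx and uuvv are chosen for illustration.

```python
import numpy as np

Npts = 4
# Stand-in 2D input; any (Npts, Npts) array would do.
uu = np.square(np.arange(Npts * Npts)).reshape(Npts, Npts)

# idx[k, q] = (k + q) mod Npts, i.e. the periodic index for every (k, q) pair.
idx = (np.arange(Npts)[:, None] + np.arange(Npts)[None, :]) % Npts

# The two index arrays broadcast to shape (Npts, Npts, Npts, Npts), giving
# uuvv[kx, ky, qx, qy] = uu[(kx + qx) % Npts, (ky + qy) % Npts].
uuvv = uu[idx[:, None, :, None], idx[None, :, None, :]]

# The "2D from 1D" analogue from the question uses the same index table.
uv_1d = np.square(np.arange(Npts))[idx]
```

Note that the result has Npts**4 elements either way, so memory rather than loop overhead may become the limit for large Npts.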

What does the command "preprocessing.scale" do in terms of math?

I have read the manual on the scikit-learn website and I still don't know what mathematical formula is behind this command.
>>> from sklearn import preprocessing
>>> import numpy as np
>>> X = np.array([[ 1., -1.,  2.],
...               [ 2.,  0.,  0.],
...               [ 0.,  1., -1.]])
>>> X_scaled = preprocessing.scale(X)
>>> X_scaled
array([[ 0.  ..., -1.22...,  1.33...],
       [ 1.22...,  0.  ..., -0.26...],
       [-1.22...,  1.22..., -1.06...]])
Center to the mean and component wise scale to unit variance.
This means that the mean along the axis is subtracted from X and the result is divided by the standard deviation along the axis.
Andrey's formula in the comments is correct - I'd just add that numpy and scikit-learn use the population formula for calculating the standard deviation, not the sample standard deviation, which is the default in other languages like R. So numpy and scikit-learn divide the sum of squares by n, instead of n-1.
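The description above can be sketched in plain NumPy. This is not scikit-learn's actual implementation, only the arithmetic as described, assuming the default axis=0 (column-wise) behaviour; the names mean, std and std_sample are chosen for illustration.

```python
import numpy as np

X = np.array([[1., -1., 2.],
              [2., 0., 0.],
              [0., 1., -1.]])

mean = X.mean(axis=0)        # per-column mean
std = X.std(axis=0)          # population std (ddof=0): divides by n
X_scaled = (X - mean) / std  # zero mean, unit variance per column

# For comparison, the sample standard deviation divides by n - 1.
std_sample = X.std(axis=0, ddof=1)

print(X_scaled)
```

Each column of X_scaled should then have mean 0 and population standard deviation 1, and the printed values should agree with the preprocessing.scale output in the question.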

Python indexing for central differencing

I have a question about python indexing: I am trying to use central differencing to estimate 'dU' from an array 'U' and I'm doing this by initialising 'dU' with an array of 'nan' of length(U) and then applying central differencing such that dU(i) = (U(i+1) - U(i-1))/2 to the central elements. The output 'dU' array is currently giving me two 'nan' entries at the end of the vector. Can anyone explain why the second to last element isn't being updated?
import numpy as np
U= np.array([1,2,3,4,5,6])
dU = np.zeros(len(U))
dU[:] = np.nan
dU[1:-2] = (U[2:-1]-U[0:-3])/2
>>> dU
array([ nan, 1., 1., 1., nan, nan])
Slice stop indices are exclusive, so dU[1:-2] ends just before the second-to-last element. To include it, you would need:
dU[1:-1] = (U[2:]-U[0:-2])/2
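As a quick check, here is the corrected slicing as a complete, minimal sketch. np.full is used to build the nan-filled array in one step, which is my substitution for the two-step initialisation in the question.

```python
import numpy as np

U = np.array([1, 2, 3, 4, 5, 6])

# Start with nan everywhere; only the interior elements get a value.
dU = np.full(len(U), np.nan)

# Element i pairs U[i+1] (slice 2:) with U[i-1] (slice :-2).
# All three slices have length len(U) - 2.
dU[1:-1] = (U[2:] - U[:-2]) / 2

print(dU)
```

Only the first and last entries should remain nan, since they have no neighbour on one side.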
This doesn't answer your question, but as a helpful tip, you can just use numpy.gradient, which applies central differences in the interior and one-sided differences at the two ends:
>>> np.gradient(np.array([1,2,3,4,5,6]))
array([1., 1., 1., 1., 1., 1.])

Finding upper/lower triangular form of arbitrary matrix n*n - python

Every matrix can be written in upper or lower triangular form simply by rotating the basis. Is there a simple routine in python (numpy) to do it? I was unable to find one, and I can't believe that there is no such thing. To illustrate it:
matrix = numpy.array([[a,b,c],
                      [d,e,f],
                      [g,h,i]])
to
matrix2 = numpy.array([[z,0,0],
                       [y,x,0],
                       [v,u,t]])
The letters are floats. How can I make this change, not simply by zeroing the numbers b, c and f, but by a correct rotation of the basis, in the simplest way?
Thank you!
You are looking for the Schur decomposition, which decomposes a matrix A as A = Q U Q^H, where U is an upper triangular matrix, Q is a unitary matrix (which effects the basis rotation) and Q^H is the Hermitian adjoint of Q.
import numpy as np
from scipy.linalg import schur
a = np.array([[ 1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])
u, q = schur(a) # q is the unitary matrix, u is upper triangular
repr(u)
# array([[ 1.61168440e+01, 4.89897949e+00, 1.58820582e-15],
# [ 0.00000000e+00, -1.11684397e+00, -1.11643184e-15],
# [ 0.00000000e+00, 0.00000000e+00, -1.30367773e-15]])
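The factorization can be checked numerically, and a lower triangular form (which the question asked for) follows by applying the same routine to the Hermitian adjoint of the matrix: if A^H = Q U Q^H, then A = Q U^H Q^H with U^H lower triangular. A minimal sketch, assuming output="complex" so that U is strictly triangular even when a real matrix has complex eigenvalues; the names q_low, u_low and lower are illustrative.

```python
import numpy as np
from scipy.linalg import schur

a = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])

# Upper triangular form: a = q @ u @ q^H
u, q = schur(a, output="complex")
reconstructed = q @ u @ q.conj().T

# Lower triangular form: decompose a^H, then take the adjoint of the result.
u_low, q_low = schur(a.conj().T, output="complex")
lower = u_low.conj().T                      # lower triangular
reconstructed_low = q_low @ lower @ q_low.conj().T

print(np.allclose(reconstructed, a), np.allclose(reconstructed_low, a))
```

The diagonal of either triangular factor holds the eigenvalues of the matrix, up to ordering and, for the lower form, complex conjugation being undone by the adjoint.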

I don't understand the k-means scipy algorithm

I'm trying to use the scipy kmeans algorithm.
So I have this really simple example:
from numpy import array
from scipy.cluster.vq import vq, kmeans, whiten
features = array([[3,4],[3,5],[4,2],[4,2]])
book = array((features[0],features[2]))
final = kmeans(features,book)
and the result is
final
(array([[3, 4],
       [4, 2]]), 0.25)
What I don't understand is that, to me, the centroid coordinates should be the barycentre of all the points belonging to the cluster, so in this example
[3,9/2] and [4,2]
Can anyone explain the result the scipy algorithm is giving?
It looks like it is preserving the data type that you are giving it (int). Try:
features = array([[3., 4.], [3., 5.], [4., 2.], [4., 2.]])
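The effect of the integer dtype can be illustrated with NumPy alone. This is not SciPy's internal code, just a sketch of what happens when a floating-point mean is stored in an integer array; the names cluster_mean, int_book and float_book are hypothetical.

```python
import numpy as np

features = np.array([[3, 4], [3, 5], [4, 2], [4, 2]])

# Barycentre of the first cluster, i.e. the points [3, 4] and [3, 5].
cluster_mean = features[:2].mean(axis=0)

# Storing it in an integer code book silently drops the fractional part.
int_book = np.array([[3, 4], [4, 2]])
int_book[0] = cluster_mean

# A float code book keeps the expected centroid.
float_book = np.array([[3., 4.], [4., 2.]])
float_book[0] = cluster_mean

print(int_book[0], float_book[0])
```

With the truncated centroid [3, 4], the point [3, 5] sits at distance 1 and the other three points at distance 0, which is consistent with the reported mean distortion of 0.25.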
