I want to calculate and plot a gradient of any scalar function of two variables. If you really want a concrete example, lets say f=x^2+y^2 where x goes from -10 to 10 and same for y. How do I calculate and plot grad(f)? The solution should be vector and I should see vector lines. I am new to python so please use simple words.
EDIT:
#Andras Deak: thank you for your post, i tried what you suggested and instead of your test function (fun=3*x^2-5*y^2) I used function that i defined as V(x,y); this is how the code looks like but it reports an error
import numpy as np
import math
import sympy
import matplotlib.pyplot as plt
def V(x,y):
t=[]
for k in range (1,3):
for l in range (1,3):
t.append(0.000001*np.sin(2*math.pi*k*0.5)/((4*(math.pi)**2)* (k**2+l**2)))
term = t* np.sin(2 * math.pi * k * x/0.004) * np.cos(2 * math.pi * l * y/0.004)
return term
return term.sum()
x,y=sympy.symbols('x y')
fun=V(x,y)
gradfun=[sympy.diff(fun,var) for var in (x,y)]
numgradfun=sympy.lambdify([x,y],gradfun)
X,Y=np.meshgrid(np.arange(-10,11),np.arange(-10,11))
graddat=numgradfun(X,Y)
plt.figure()
plt.quiver(X,Y,graddat[0],graddat[1])
plt.show()
AttributeError: 'Mul' object has no attribute 'sin'
And lets say I remove sin, I get another error:
TypeError: can't multiply sequence by non-int of type 'Mul'
I read tutorial for sympy and it says "The real power of a symbolic computation system such as SymPy is the ability to do all sorts of computations symbolically". I get this, I just dont get why I cannot multiply x and y symbols with float numbers.
What is the way around this? :( Help please!
UPDATE
#Andras Deak: I wanted to make things shorter so I removed many constants from the original formulas for V(x,y) and Cn*Dm. As you pointed out, that caused the sin function to always return 0 (i just noticed). Apologies for that. I will update the post later today when i read your comment in details. Big thanks!
UPDATE 2
I changed coefficients in my expression for voltage and this is the result:
It looks good except that the arrows point in the opposite direction (they are supposed to go out of the reddish dot and into the blue one). Do you know how I could change that? And if possible, could you please tell me the way to increase the size of the arrows? I tried what was suggested in another topic (Computing and drawing vector fields):
skip = (slice(None, None, 3), slice(None, None, 3))
This plots only every third arrow and matplotlib does the autoscale but it doesnt work for me (nothing happens when i add this, for any number that i enter)
You were already of huge help , i cannot thank you enough!
Here's a solution using sympy and numpy. This is the first time I use sympy, so others will/could probably come up with much better and more elegant solutions.
import sympy
#define symbolic vars, function
x,y=sympy.symbols('x y')
fun=3*x**2-5*y**2
#take the gradient symbolically
gradfun=[sympy.diff(fun,var) for var in (x,y)]
#turn into a bivariate lambda for numpy
numgradfun=sympy.lambdify([x,y],gradfun)
now you can use numgradfun(1,3) to compute the gradient at (x,y)==(1,3). This function can then be used for plotting, which you said you can do.
For plotting, you can use, for instance, matplotlib's quiver, like so:
import numpy as np
import matplotlib.pyplot as plt
X,Y=np.meshgrid(np.arange(-10,11),np.arange(-10,11))
graddat=numgradfun(X,Y)
plt.figure()
plt.quiver(X,Y,graddat[0],graddat[1])
plt.show()
UPDATE
You added a specification for your function to be computed. It contains the product of terms depending on x and y, which seems to break my above solution. I managed to come up with a new one to suit your needs. However, your function seems to make little sense. From your edited question:
t.append(0.000001*np.sin(2*math.pi*k*0.5)/((4*(math.pi)**2)* (k**2+l**2)))
term = t* np.sin(2 * math.pi * k * x/0.004) * np.cos(2 * math.pi * l * y/0.004)
On the other hand, from your corresponding comment to this answer:
V(x,y) = Sum over n and m of [Cn * Dm * sin(2pinx) * cos(2pimy)]; sum goes from -10 to 10; Cn and Dm are coefficients, and i calculated
that CkDl = sin(2pik)/(k^2 +l^2) (i used here k and l as one of the
indices from the sum over n and m).
I have several problems with this: both sin(2*pi*k) and sin(2*pi*k/2) (the two competing versions in the prefactor are always zero for integer k, giving you a constant zero V at every (x,y). Furthermore, in your code you have magical frequency factors in the trigonometric functions, which are missing from the comment. If you multiply your x by 4e-3, you drastically change the spatial dependence of your function (by changing the wavelength by roughly a factor of a thousand). So you should really decide what your function is.
So here's a solution, where I assumed
V(x,y)=sum_{k,l = 1 to 10} C_{k,l} * sin(2*pi*k*x)*cos(2*pi*l*y), with
C_{k,l}=sin(2*pi*k/4)/((4*pi^2)*(k^2+l^2))*1e-6
This is a combination of your various versions of the function, with the modification of sin(2*pi*k/4) in the prefactor in order to have a non-zero function. I expect you to be able to fix the numerical factors to your actual needs, after you figure out the proper mathematical model.
So here's the full code:
import sympy as sp
import numpy as np
import matplotlib.pyplot as plt
def CD(k,l):
#return sp.sin(2*sp.pi*k/2)/((4*sp.pi**2)*(k**2+l**2))*1e-6
return sp.sin(2*sp.pi*k/4)/((4*sp.pi**2)*(k**2+l**2))*1e-6
def Vkl(x,y,k,l):
return CD(k,l)*sp.sin(2*sp.pi*k*x)*sp.cos(2*sp.pi*l*y)
def V(x,y,kmax,lmax):
k,l=sp.symbols('k l',integers=True)
return sp.summation(Vkl(x,y,k,l),(k,1,kmax),(l,1,lmax))
#define symbolic vars, function
kmax=10
lmax=10
x,y=sp.symbols('x y')
fun=V(x,y,kmax,lmax)
#take the gradient symbolically
gradfun=[sp.diff(fun,var) for var in (x,y)]
#turn into bivariate lambda for numpy
numgradfun=sp.lambdify([x,y],gradfun,'numpy')
numfun=sp.lambdify([x,y],fun,'numpy')
#plot
X,Y=np.meshgrid(np.linspace(-10,10,51),np.linspace(-10,10,51))
graddat=numgradfun(X,Y)
fundat=numfun(X,Y)
hf=plt.figure()
hc=plt.contourf(X,Y,fundat,np.linspace(fundat.min(),fundat.max(),25))
plt.quiver(X,Y,graddat[0],graddat[1])
plt.colorbar(hc)
plt.show()
I defined your V(x,y) function using some auxiliary functions for transparence. I left the summation cut-offs as literal parameters, kmax and lmax: in your code these were 3, in your comment they were said to be 10, and anyway they should be infinity.
The gradient is taken the same way as before, but when converting to a numpy function using lambdify you have to set an additional string parameter, 'numpy'. This will alow the resulting numpy lambda to accept array input (essentially it will use np.sin instead of math.sin and the same for cos).
I also changed the definition of the grid from array to np.linspace: this is usually more convenient. Since your function is almost constant at integer grid points, I created a denser mesh for plotting (51 points while keeping your original limits of (-10,10) fixed).
For clarity I included a few more plots: a contourf to show the value of the function (contour lines should always be orthogonal to the gradient vectors), and a colorbar to indicate the value of the function. Here's the result:
The composition is obviously not the best, but I didn't want to stray too much from your specifications. The arrows in this figure are actually hardly visible, but as you can see (and also evident from the definition of V) your function is periodic, so if you plot the same thing with smaller limits and less grid points, you'll see more features and larger arrows.
Related
Recently I wanted to demonstrate generating a continuous random variable using the universality of the Uniform. For that, I wanted to use the combination of numpy and matplotlib. However, the generated random variable seems a little bit off to me - and I don't know whether it is caused by the way in which NumPy's random uniform and vectorized works or if I am doing something fundamentally wrong here.
Let U ~ Unif(0, 1) and X = F^-1(U). Then X is a real variable with a CDF F (please note that the F^-1 here denotes the quantile function, I also omit the second part of the universality because it will not be necessary).
Let's assume that the CDF of interest to me is:
then:
According to the universality of the uniform, to generate a real variable, it is enough to plug U ~ Unif(0, 1) in the F-1. Therefore, I've written a very simple code snippet for that:
U = np.random.uniform(0, 1, 1000000)
def logistic(u):
x = np.log(u / (1 - u))
return x
logistic_transform = np.vectorize(logistic)
X = logistic_transform(U)
However, the result seems a little bit off to me - although the histogram of a generated real variable X resembles a logistic distribution (which simplified CDF I've used) - the r.v. seems to be distributed in a very unequal way - and I can't wrap my head around exactly why it is so. I would be grateful for any suggestions on that. Below are the histograms of U and X.
You have a large sample size, so you can increase the number of bins in your histogram and still get a good number samples per bin. If you are using matplotlib's hist function, try (for exampe) bins=400. I get this plot, which has the symmetry that I think you expected:
Also--and this is not relevant to the question--your function logistic will handle a NumPy array without wrapping it with vectorize, so you can save a few CPU cycles by writing X = logistic(U). And you can save a few lines of code by using scipy.special.logit instead of implementing it yourself.
I'm trying to simulate the 2D Schrödinger equation using the explicit algorithm proposed by Askar and Cakmak (1977). I define a 100x100 grid with a complex function u+iv, null at the boundaries. The problem is, after just a few iterations the absolute value of the complex function explodes near the boundaries.
I post here the code so, if interested, you can check it:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
#Initialization+meshgrid
Ntsteps=30
dx=0.1
dt=0.005
alpha=dt/(2*dx**2)
x=np.arange(0,10,dx)
y=np.arange(0,10,dx)
X,Y=np.meshgrid(x,y)
#Initial Gaussian wavepacket centered in (5,5)
vargaussx=1.
vargaussy=1.
kx=10
ky=10
upre=np.zeros((100,100))
ucopy=np.zeros((100,100))
u=(np.exp(-(X-5)**2/(2*vargaussx**2)-(Y-5)**2/(2*vargaussy**2))/(2*np.pi*(vargaussx*vargaussy)**2))*np.cos(kx*X+ky*Y)
vpre=np.zeros((100,100))
vcopy=np.zeros((100,100))
v=(np.exp(-(X-5)**2/(2*vargaussx**2)-(Y-5)**2/(2*vargaussy**2))/(2*np.pi*(vargaussx*vargaussy)**2))*np.sin(kx*X+ky*Y)
#For the simple scenario, null potential
V=np.zeros((100,100))
#Boundary conditions
u[0,:]=0
u[:,0]=0
u[99,:]=0
u[:,99]=0
v[0,:]=0
v[:,0]=0
v[99,:]=0
v[:,99]=0
#Evolution with Askar-Cakmak algorithm
for n in range(1,Ntsteps):
upre=np.copy(ucopy)
vpre=np.copy(vcopy)
ucopy=np.copy(u)
vcopy=np.copy(v)
#For the first iteration, simple Euler method: without this I cannot have the two steps backwards wavefunction at the second iteration
#I use ucopy to make sure that for example u[i,j] is calculated not using the already modified version of u[i-1,j] and u[i,j-1]
if(n==1):
upre=np.copy(ucopy)
vpre=np.copy(vcopy)
for i in range(1,len(x)-1):
for j in range(1,len(y)-1):
u[i,j]=upre[i,j]+2*((4*alpha+V[i,j]*dt)*vcopy[i,j]-alpha*(vcopy[i+1,j]+vcopy[i-1,j]+vcopy[i,j+1]+vcopy[i,j-1]))
v[i,j]=vpre[i,j]-2*((4*alpha+V[i,j]*dt)*ucopy[i,j]-alpha*(ucopy[i+1,j]+ucopy[i-1,j]+ucopy[i,j+1]+ucopy[i,j-1]))
#Calculate absolute value and plot
abspsi=np.sqrt(np.square(u)+np.square(v))
fig=plt.figure()
ax=fig.add_subplot(projection='3d')
surf=ax.plot_surface(X,Y,abspsi)
plt.show()
As you can see the code is extremely simple: I cannot see where this error is coming from (I don't think is a stability problem because alpha<1/2). Have you ever encountered anything similar in your past simulations?
I'd try setting your dt to a smaller value (e.g. 0.001) and increase the number of integration steps (e.g fivefold).
The wavefunction looks in shape also at Ntsteps=150 and well beyond when trying out your code with dt=0.001.
Checking integrals of the motion (e.g. kinetic energy here?) should also confirm that things are going OK (or not) for different choices of dt.
I need to non-linearly expand on each pixel value from 1 dim pixel vector with taylor series expansion of specific non-linear function (e^x or log(x) or log(1+e^x)), but my current implementation is not right to me at least based on taylor series concepts. The basic intuition behind is taking pixel array as input neurons for a CNN model where each pixel should be non-linearly expanded with taylor series expansion of non-linear function.
new update 1:
From my understanding from taylor series, taylor series is written for a function F of a variable x in terms of the value of the function F and it's derivatives in for another value of variable x0. In my problem, F is function of non-linear transformation of features (a.k.a, pixels), x is each pixel value, x0 is maclaurin series approximation at 0.
new update 2
if we use taylor series of log(1+e^x) with approximation order of 2, each pixel value will yield two new pixel by taking first and second expansion terms of taylor series.
graphic illustration
Here is the graphical illustration of the above formulation:
Where X is pixel array, p is approximation order of taylor series, and α is the taylor expansion coefficient.
I wanted to non-linearly expand pixel vectors with taylor series expansion of non-linear function like above illustration demonstrated.
My current attempt
This is my current attempt which is not working correctly for pixel arrays. I was thinking about how to make the same idea applicable to pixel arrays.
def taylor_func(x, approx_order=2):
x_ = x[..., None]
x_ = tf.tile(x_, multiples=[1, 1, approx_order+ 1])
pows = tf.range(0, approx_order + 1, dtype=tf.float32)
x_p = tf.pow(x_, pows)
x_p_ = x_p[..., None]
return x_p_
x = Input(shape=(4,4,3))
x_new = Lambda(lambda x: taylor_func(x, max_pow))(x)
my new updated attempt:
x_input= Input(shape=(32, 32,3))
def maclurin_exp(x, powers=2):
out= 0
for k in range(powers):
out+= ((-1)**k) * (x ** (2*k)) / (math.factorial(2 * k))
return res
x_input_new = Lambda(lambda x: maclurin_exp(x, max_pow))(x_input)
This attempt doesn't yield what the above mathematical formulation describes. I bet I missed something while doing the expansion. Can anyone point me on how to make this correct? Any better idea?
goal
I wanted to take pixel vector and make non-linearly distributed or expanded with taylor series expansion of certain non-linear function. Is there any possible way to do this? any thoughts? thanks
This is a really interesting question but I can't say that I'm clear on it as of yet. So, while I have some thoughts, I might be missing the thrust of what you're looking to do.
It seems like you want to develop your own activation function instead of using something RELU or softmax. Certainly no harm there. And you gave three candidates: e^x, log(x), and log(1+e^x).
Notice log(x) asymptotically approaches negative infinity x --> 0. So, log(x) is right out. If that was intended as a check on the answers you get or was something jotted down as you were falling asleep, no worries. But if it wasn't, you should spend some time and make sure you understand the underpinnings of what you doing because the consequences can be quite high.
You indicated you were looking for a canonical answer and you get a two for one here. You get both a canonical answer and highly performant code.
Considering you're not likely to able to write faster, more streamlined code than the folks of SciPy, Numpy, or Pandas. Or, PyPy. Or Cython for that matter. Their stuff is the standard. So don't try to compete against them by writing your own, less performant (and possibly bugged) version which you will then have to maintain as time passes. Instead, maximize your development and run times by using them.
Let's take a look at the implementation e^x in SciPy and give you some code to work with. I know you don't need a graph for what you're at this stage but they're pretty and can help you understand how they Taylor (or Maclaurin, aka Euler-Maclaurin) will work as the order of the approximation changes. It just so happens that SciPy has Taylor approximation built-in.
import scipy
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import approximate_taylor_polynomial
x = np.linspace(-10.0, 10.0, num=100)
plt.plot(x, np.exp(x), label="e^x", color = 'black')
for degree in np.arange(1, 4, step=1):
e_to_the_x_taylor = approximate_taylor_polynomial(np.exp, 0, degree, 1, order=degree + 2)
plt.plot(x, e_to_the_x_taylor(x), label=f"degree={degree}")
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left', borderaxespad=0.0, shadow=True)
plt.tight_layout()
plt.axis([-10, 10, -10, 10])
plt.show()
That produces this:
But let's say if you're good with 'the maths', so to speak, and are willing to go with something slightly slower if it's more 'mathy' as in it handles symbolic notation well. For that, let me suggest SymPy.
And with that in mind here is a bit of SymPy code with a graph because, well, it looks good AND because we need to go back and hit another point again.
from sympy import series, Symbol, log, E
from sympy.functions import exp
from sympy.plotting import plot
import matplotlib.pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = 13,10
plt.rcParams['lines.linewidth'] = 2
x = Symbol('x')
def taylor(function, x0, n):
""" Defines Taylor approximation of a given function
function -- is our function which we want to approximate
x0 -- point where to approximate
n -- order of approximation
"""
return function.series(x,x0,n).removeO()
# I get eyestain; feel free to get rid of this
plt.rcParams['figure.figsize'] = 10, 8
plt.rcParams['lines.linewidth'] = 1
c = log(1 + pow(E, x))
plt = plot(c, taylor(c,0,1), taylor(c,0,2), taylor(c,0,3), taylor(c,0,4), (x,-5,5),legend=True, show=False)
plt[0].line_color = 'black'
plt[1].line_color = 'red'
plt[2].line_color = 'orange'
plt[3].line_color = 'green'
plt[4].line_color = 'blue'
plt.title = 'Taylor Series Expansion for log(1 +e^x)'
plt.show()
I think either option will get you where you need go.
Ok, now for the other point. You clearly stated after a bit of revision that log(1 +e^x) was your first choice. But the others don't pass the sniff test. e^x vacillates wildly as the degree of the polynomial changes. Because of the opaqueness of algorithms and how few people can conceptually understand this stuff, Data Scientists can screw things up to a degree people can't even imagine. So make sure you're very solid on theory for this.
One last thing, consider looking at the CDF of the Erlang Distribution as an activation function (assuming I'm right and you're looking to roll your own activation function as an area of research). I don't think anyone has looked at that but it strikes as promising. I think you could break out each channel of the RGB as one of the two parameters, with the other being the physical coordinate.
You can use tf.tile and tf.math.pow to generate the elements of the series expansion. Then you can use tf.math.cumsum to compute the partial sums s_i. Eventually you can multiply with the weights w_i and compute the final sum.
Here is a code sample:
import math
import tensorflow as tf
x = tf.keras.Input(shape=(32, 32, 3)) # 3-channel RGB.
# The following is determined by your series expansion and its order.
# For example: log(1 + exp(x)) to 3rd order.
# https://www.wolframalpha.com/input/?i=taylor+series+log%281+%2B+e%5Ex%29
order = 3
alpha = tf.constant([1/2, 1/8, -1/192]) # Series coefficients.
power = tf.constant([1.0, 2.0, 4.0])
offset = math.log(2)
# These are the weights of the network; using a constant for simplicity here.
# The shape must coincide with the above order of series expansion.
w_i = tf.constant([1.0, 1.0, 1.0])
elements = offset + alpha * tf.math.pow(
tf.tile(x[..., None], [1, 1, 1, 1, order]),
power
)
s_i = tf.math.cumsum(elements, axis=-1)
y = tf.math.reduce_sum(w_i * s_i, axis=-1)
I analyzed the sunspots.dat data (below) using fft which is a classic example in this area. I obtained results from fft in real and imaginery parts. Then I tried to use these coefficients (first 20) to recreate the data following the formula for Fourier transform. Thinking real parts correspond to a_n and imaginery to b_n, I have
import numpy as np
from scipy import *
from matplotlib import pyplot as gplt
from scipy import fftpack
def f(Y,x):
total = 0
for i in range(20):
total += Y.real[i]*np.cos(i*x) + Y.imag[i]*np.sin(i*x)
return total
tempdata = np.loadtxt("sunspots.dat")
year=tempdata[:,0]
wolfer=tempdata[:,1]
Y=fft(wolfer)
n=len(Y)
print n
xs = linspace(0, 2*pi,1000)
gplt.plot(xs, [f(Y, x) for x in xs], '.')
gplt.show()
For some reason however, my plot does not mirror the one generated by ifft (I use the same number of coefficients on both sides). What could be wrong ?
Data:
http://linuxgazette.net/115/misc/andreasen/sunspots.dat
When you called fft(wolfer), you told the transform to assume a fundamental period equal to the length of the data. To reconstruct the data, you have to use basis functions of the same fundamental period = 2*pi/N. By the same token, your time index xs has to range over the time samples of the original signal.
Another mistake was in forgetting to do to the full complex multiplication. It's easier to think of this as Y[omega]*exp(1j*n*omega/N).
Here's the fixed code. Note I renamed i to ctr to avoid confusion with sqrt(-1), and n to N to follow the usual signal processing convention of using the lower case for a sample, and the upper case for total sample length. I also imported __future__ division to avoid confusion about integer division.
forgot to add earlier: Note that SciPy's fft doesn't divide by N after accumulating. I didn't divide this out before using Y[n]; you should if you want to get back the same numbers, rather than just seeing the same shape.
And finally, note that I am summing over the full range of frequency coefficients. When I plotted np.abs(Y), it looked like there were significant values in the upper frequencies, at least until sample 70 or so. I figured it would be easier to understand the result by summing over the full range, seeing the correct result, then paring back coefficients and seeing what happens.
from __future__ import division
import numpy as np
from scipy import *
from matplotlib import pyplot as gplt
from scipy import fftpack
def f(Y,x, N):
total = 0
for ctr in range(len(Y)):
total += Y[ctr] * (np.cos(x*ctr*2*np.pi/N) + 1j*np.sin(x*ctr*2*np.pi/N))
return real(total)
tempdata = np.loadtxt("sunspots.dat")
year=tempdata[:,0]
wolfer=tempdata[:,1]
Y=fft(wolfer)
N=len(Y)
print(N)
xs = range(N)
gplt.plot(xs, [f(Y, x, N) for x in xs])
gplt.show()
The answer from mtrw was extremely helpful and helped me answer the same question as the OP, but my head almost exploded trying to understand the nested loop.
Here's the last part but with numpy broadcasting (not sure if this even existed when the question was asked) rather than calling the f function:
xs = np.arange(N)
omega = 2*np.pi/N
phase = omega * xs[:,None] * xs[None,:]
reconstruct = Y[None,:] * (np.cos(phase) + 1j*np.sin(phase))
reconstruct = (reconstruct).sum(axis=1).real / N
# same output
plt.plot(reconstruct)
plt.plot(wolfer)
I'm trying to interpolate some data for the purpose of plotting. For instance, given N data points, I'd like to be able to generate a "smooth" plot, made up of 10*N or so interpolated data points.
My approach is to generate an N-by-10*N matrix and compute the inner product the original vector and the matrix I generated, yielding a 1-by-10*N vector. I've already worked out the math I'd like to use for the interpolation, but my code is pretty slow. I'm pretty new to Python, so I'm hopeful that some of the experts here can give me some ideas of ways I can try to speed up my code.
I think part of the problem is that generating the matrix requires 10*N^2 calls to the following function:
def sinc(x):
import math
try:
return math.sin(math.pi * x) / (math.pi * x)
except ZeroDivisionError:
return 1.0
(This comes from sampling theory. Essentially, I'm attempting to recreate a signal from its samples, and upsample it to a higher frequency.)
The matrix is generated by the following:
def resampleMatrix(Tso, Tsf, o, f):
from numpy import array as npar
retval = []
for i in range(f):
retval.append([sinc((Tsf*i - Tso*j)/Tso) for j in range(o)])
return npar(retval)
I'm considering breaking up the task into smaller pieces because I don't like the idea of an N^2 matrix sitting in memory. I could probably make 'resampleMatrix' into a generator function and do the inner product row-by-row, but I don't think that will speed up my code much until I start paging stuff in and out of memory.
Thanks in advance for your suggestions!
This is upsampling. See Help with resampling/upsampling for some example solutions.
A fast way to do this (for offline data, like your plotting application) is to use FFTs. This is what SciPy's native resample() function does. It assumes a periodic signal, though, so it's not exactly the same. See this reference:
Here’s the second issue regarding time-domain real signal interpolation, and it’s a big deal indeed. This exact interpolation algorithm provides correct results only if the original x(n) sequence is periodic within its full time interval.
Your function assumes the signal's samples are all 0 outside of the defined range, so the two methods will diverge away from the center point. If you pad the signal with lots of zeros first, it will produce a very close result. There are several more zeros past the edge of the plot not shown here:
Cubic interpolation won't be correct for resampling purposes. This example is an extreme case (near the sampling frequency), but as you can see, cubic interpolation isn't even close. For lower frequencies it should be pretty accurate.
If you want to interpolate data in a quite general and fast way, splines or polynomials are very useful. Scipy has the scipy.interpolate module, which is very useful. You can find many examples in the official pages.
Your question isn't entirely clear; you're trying to optimize the code you posted, right?
Re-writing sinc like this should speed it up considerably. This implementation avoids checking that the math module is imported on every call, doesn't do attribute access three times, and replaces exception handling with a conditional expression:
from math import sin, pi
def sinc(x):
return (sin(pi * x) / (pi * x)) if x != 0 else 1.0
You could also try avoiding creating the matrix twice (and holding it twice in parallel in memory) by creating a numpy.array directly (not from a list of lists):
def resampleMatrix(Tso, Tsf, o, f):
retval = numpy.zeros((f, o))
for i in xrange(f):
for j in xrange(o):
retval[i][j] = sinc((Tsf*i - Tso*j)/Tso)
return retval
(replace xrange with range on Python 3.0 and above)
Finally, you can create rows with numpy.arange as well as calling numpy.sinc on each row or even on the entire matrix:
def resampleMatrix(Tso, Tsf, o, f):
retval = numpy.zeros((f, o))
for i in xrange(f):
retval[i] = numpy.arange(Tsf*i / Tso, Tsf*i / Tso - o, -1.0)
return numpy.sinc(retval)
This should be significantly faster than your original implementation. Try different combinations of these ideas and test their performance, see which works out the best!
I'm not quite sure what you're trying to do, but there are some speedups you can do to create the matrix. Braincore's suggestion to use numpy.sinc is a first step, but the second is to realize that numpy functions want to work on numpy arrays, where they can do loops at C speen, and can do it faster than on individual elements.
def resampleMatrix(Tso, Tsf, o, f):
retval = numpy.sinc((Tsi*numpy.arange(i)[:,numpy.newaxis]
-Tso*numpy.arange(j)[numpy.newaxis,:])/Tso)
return retval
The trick is that by indexing the aranges with the numpy.newaxis, numpy converts the array with shape i to one with shape i x 1, and the array with shape j, to shape 1 x j. At the subtraction step, numpy will "broadcast" the each input to act as a i x j shaped array and the do the subtraction. ("Broadcast" is numpy's term, reflecting the fact no additional copy is made to stretch the i x 1 to i x j.)
Now the numpy.sinc can iterate over all the elements in compiled code, much quicker than any for-loop you could write.
(There's an additional speed-up available if you do the division before the subtraction, especially since inthe latter the division cancels the multiplication.)
The only drawback is that you now pay for an extra Nx10*N array to hold the difference. This might be a dealbreaker if N is large and memory is an issue.
Otherwise, you should be able to write this using numpy.convolve. From what little I just learned about sinc-interpolation, I'd say you want something like numpy.convolve(orig,numpy.sinc(numpy.arange(j)),mode="same"). But I'm probably wrong about the specifics.
If your only interest is to 'generate a "smooth" plot' I would just go with a simple polynomial spline curve fit:
For any two adjacent data points the coefficients of a third degree polynomial function can be computed from the coordinates of those data points and the two additional points to their left and right (disregarding boundary points.) This will generate points on a nice smooth curve with a continuous first dirivitive. There's a straight forward formula for converting 4 coordinates to 4 polynomial coefficients but I don't want to deprive you of the fun of looking it up ;o).
Here's a minimal example of 1d interpolation with scipy -- not as much fun as reinventing, but.
The plot looks like sinc, which is no coincidence:
try google spline resample "approximate sinc".
(Presumably less local / more taps ⇒ better approximation,
but I have no idea how local UnivariateSplines are.)
""" interpolate with scipy.interpolate.UnivariateSpline """
from __future__ import division
import numpy as np
from scipy.interpolate import UnivariateSpline
import pylab as pl
N = 10
H = 8
x = np.arange(N+1)
xup = np.arange( 0, N, 1/H )
y = np.zeros(N+1); y[N//2] = 100
interpolator = UnivariateSpline( x, y, k=3, s=0 ) # s=0 interpolates
yup = interpolator( xup )
np.set_printoptions( 1, threshold=100, suppress=True ) # .1f
print "yup:", yup
pl.plot( x, y, "green", xup, yup, "blue" )
pl.show()
Added feb 2010: see also basic-spline-interpolation-in-a-few-lines-of-numpy
Small improvement. Use the built-in numpy.sinc(x) function which runs in compiled C code.
Possible larger improvement: Can you do the interpolation on the fly (as the plotting occurs)? Or are you tied to a plotting library that only accepts a matrix?
I recommend that you check your algorithm, as it is a non-trivial problem. Specifically, I suggest you gain access to the article "Function Plotting Using Conic Splines" (IEEE Computer Graphics and Applications) by Hu and Pavlidis (1991). Their algorithm implementation allows for adaptive sampling of the function, such that the rendering time is smaller than with regularly spaced approaches.
The abstract follows:
A method is presented whereby, given a
mathematical description of a
function, a conic spline approximating
the plot of the function is produced.
Conic arcs were selected as the
primitive curves because there are
simple incremental plotting algorithms
for conics already included in some
device drivers, and there are simple
algorithms for local approximations by
conics. A split-and-merge algorithm
for choosing the knots adaptively,
according to shape analysis of the
original function based on its
first-order derivatives, is
introduced.