Function diverging at boundaries: Schrödinger 2D, explicit method - python

I'm trying to simulate the 2D Schrödinger equation using the explicit algorithm proposed by Askar and Cakmak (1977). I define a 100x100 grid with a complex function u+iv, null at the boundaries. The problem is, after just a few iterations the absolute value of the complex function explodes near the boundaries.
I post here the code so, if interested, you can check it:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from mpl_toolkits.mplot3d import Axes3D
#Initialization+meshgrid
Ntsteps=30
dx=0.1
dt=0.005
alpha=dt/(2*dx**2)
x=np.arange(0,10,dx)
y=np.arange(0,10,dx)
X,Y=np.meshgrid(x,y)
#Initial Gaussian wavepacket centered in (5,5)
vargaussx=1.
vargaussy=1.
kx=10
ky=10
upre=np.zeros((100,100))
ucopy=np.zeros((100,100))
u=(np.exp(-(X-5)**2/(2*vargaussx**2)-(Y-5)**2/(2*vargaussy**2))/(2*np.pi*(vargaussx*vargaussy)**2))*np.cos(kx*X+ky*Y)
vpre=np.zeros((100,100))
vcopy=np.zeros((100,100))
v=(np.exp(-(X-5)**2/(2*vargaussx**2)-(Y-5)**2/(2*vargaussy**2))/(2*np.pi*(vargaussx*vargaussy)**2))*np.sin(kx*X+ky*Y)
#For the simple scenario, null potential
V=np.zeros((100,100))
#Boundary conditions
u[0,:]=0
u[:,0]=0
u[99,:]=0
u[:,99]=0
v[0,:]=0
v[:,0]=0
v[99,:]=0
v[:,99]=0
#Evolution with Askar-Cakmak algorithm
for n in range(1,Ntsteps):
upre=np.copy(ucopy)
vpre=np.copy(vcopy)
ucopy=np.copy(u)
vcopy=np.copy(v)
#For the first iteration, simple Euler method: without this I cannot have the two steps backwards wavefunction at the second iteration
#I use ucopy to make sure that for example u[i,j] is calculated not using the already modified version of u[i-1,j] and u[i,j-1]
if(n==1):
upre=np.copy(ucopy)
vpre=np.copy(vcopy)
for i in range(1,len(x)-1):
for j in range(1,len(y)-1):
u[i,j]=upre[i,j]+2*((4*alpha+V[i,j]*dt)*vcopy[i,j]-alpha*(vcopy[i+1,j]+vcopy[i-1,j]+vcopy[i,j+1]+vcopy[i,j-1]))
v[i,j]=vpre[i,j]-2*((4*alpha+V[i,j]*dt)*ucopy[i,j]-alpha*(ucopy[i+1,j]+ucopy[i-1,j]+ucopy[i,j+1]+ucopy[i,j-1]))
#Calculate absolute value and plot
abspsi=np.sqrt(np.square(u)+np.square(v))
fig=plt.figure()
ax=fig.add_subplot(projection='3d')
surf=ax.plot_surface(X,Y,abspsi)
plt.show()
As you can see the code is extremely simple: I cannot see where this error is coming from (I don't think is a stability problem because alpha<1/2). Have you ever encountered anything similar in your past simulations?

I'd try setting your dt to a smaller value (e.g. 0.001) and increase the number of integration steps (e.g fivefold).
The wavefunction looks in shape also at Ntsteps=150 and well beyond when trying out your code with dt=0.001.
Checking integrals of the motion (e.g. kinetic energy here?) should also confirm that things are going OK (or not) for different choices of dt.

Related

Edge of a curve based out of numpy array

I am looking for some mathematical guidance, to help me find the index locations (red circles) of a curve as shown in the image below. The curve is just 1D numpy array. I tried scipy - gaussianfilter1d. I also tried np.gradient and I am not anywhere close to what I want to do. The gradient is abruptly changing, so a first order gradient should give what I am looking for. Then I realized the data is not smooth, and I tried smoothing by 'gaussianfilter1d'. Even then, I am unable to pick up where it changes. I have various types of these numpy arrays (same size, values ranging from 0 - 1), so the solution has to be applicable, and not dependent on the given data set. So I could not hardcode. Any thoughts would be much appreciated.
CSV file
First you get a smooth function from your data using scipy's UnivariateSpline. Then you plot the area where the absolute slope is say at least 1/4 of it's maximum.
from scipy.interpolate import UnivariateSpline
f= UnivariateSpline(np.arange(5500), y, k=3, s=0.3)
df = f.derivative()
plt.plot(x,f(x))
cond = np.abs(df(x)) > 0.25*np.max(np.abs(df(x)))
plt.scatter(x[cond],f(x[cond]), c='r')
Looks like what you are looking for is the first and last point of the marked ones. So you do
(x[cond].min(),f(x[cond].min()).item()), (x[cond].max(), f(x[cond].max()).item())
And your points are:
((1455, 0.20595740349084446), (4230, 0.1722999962943679))

creating a function that changes equations at certain slope, usable in curve_fit

I am currently working with fitting decline curves to real-world production data. I have had good luck with creating a hyperbolic and using curve_fit from scipy.optimize. The current function I use:
def hyp_func(x,qi,b,di):
return qi*(1.0-b*di*x)**(-1.0/b)
What I would like to do now, is at a certain rate of decline, transition to an exponential function. How would i go about this and still be able to use in curve_fit (I think below works)? I am trying the code below, is this the way to do it? or is there a better way?
def hyp_func2(x,qi,b,di):
dlim = -0.003
hy = qi*(1.0-b*di*x)**(-1.0/b)
hdy = di/(1.0-b*di*x)
ex = x[hdy>dlim]
qlim = qi*(dlim/di)**(1/b)
xlim = ((qi/qlim)**b-1)/(b*-di)
ey = qlim*np.exp(dlim*(ex-xlim))
y = np.concatenate((hy[hdy<dlim],ey))
return y
hy is the hyperbolic equation
hdy is the hy derivative
ex is the part of x after derivative hits dlim
ey is the exponential equation
I am still working out the equations, I am not getting a continuous function.
edit: data here, and updated equations
Sorry to be the bearer of bad news, but if I understand what you are trying to do, I think it is very difficult to have scipy.optimize.curve_fit, or any of the other methods from scipy.optimize do what you are hoping to do.
Most fitting algorithms are designed to work with continuous variables, and usually (and curve_fit for sure) start off by making very small changes in parameter values to find the right direction and step size to take to improve the result.
But what you're looking for is a discrete variable as the breakpoint between one functional form (roughly, "power law") to another ("exponential") The algorithm won't normally make a large enough change in your di parameter to make a difference for which value is used as that breakpoint, and may decide that di does not affect the fit (your model used di in other ways too, so you might get lucky and di might have an affect on the fit.
Assuming that qi>0 the slope is actually positive, so I do not get the choice of -0.003. Moreover I think the derivative is wrong.
You can calculate exactly the value where the lope reaches a critical value.
Now, from my experience you have two choices. If you define a piecewise function yourself, you usually run into trouble with function calls using numpy arrays. I typically use scipy.optimize.leastsq with a self-defined residual function. A second option is a continuous transition between the two functions. You can make that as sharp as you want, as value and slope already fit, by definition.
The two solutions look as follows
import matplotlib.pyplot as plt
import numpy as np
def hy(x,b,qi,di):
return qi*(1.0-b*di*x)**(-1.0/b)
def abshy(x,b,qi,di):#same as hy but defined for all x
return qi*abs(1.0-b*di*x)**(-1.0/b)
def dhy(x,b,qi,di):#derivative of hy
return qi*di*(1.0-b*di*x)**(-(b+1.0)/b)
def get_x_from_slope(s,b,qi,di):#self explaining
return (1.0-(s/(qi*di))**(-b/(b+1.0)))/(b*di)
def exh(x,xlim,qlim,dlim):#exponential part (actually no free parameters)
return qlim*np.exp(dlim*(x-xlim))
def trans(x,b,qi,di, s0):#piecewise function
x0=get_x_from_slope(s0,b,qi,di)
if x<x0:
out= hy(x,b,qi,di)
else:
H0=hy(x0,b,qi,di)
out=exh(x,x0,H0,s0/H0)
return out
def no_if_trans(x,b,qi,di, s0,sharpness=10):#continuous transition between the two functions
x0=get_x_from_slope(s0,b,qi,di)
H0=hy(x0,b,qi,di)
weight=0.5*(1+np.tanh(sharpness*(x-x0)))
return weight*exh(x,x0,H0,s0/H0)+(1.0-weight)*abshy(x,b,qi,di)
xList=np.linspace(0,5.5,90)
hyList=np.fromiter(( hy(x,2.2,1.2,.1) for x in xList ) ,np.float)
t1List=np.fromiter(( trans(x,2.2,1.2,.1,3.59) for x in xList ) ,np.float)
nt1List=np.fromiter(( no_if_trans(x,2.2,1.2,.1,3.59) for x in xList ) ,np.float)
fig1=plt.figure(1)
ax=fig1.add_subplot(1,1,1)
ax.plot(xList,hyList)
ax.plot(xList,t1List,linestyle='--')
ax.plot(xList,nt1List,linestyle=':')
ax.set_ylim([1,10])
ax.set_yscale('log')
plt.show()
There is almost no differences in the two solutions, but your options for using scipy fitting functions are slightly different. The second solution should easily work with curve_fit

applying "tighter" bounds in scipy.optimize.curve_fit

I have a dataset that I am trying to fit with parameters (a,b,c,d) that are within +/- 5% of the true fitting parameters. However, when I do this with scipy.optimize.curve_fit I get the error "ValueError: 'x0' is infeasible." inside the least squares package.
If I relax my bounds, then optimize.curve_fit seems to work as desired. I have also noticed that my parameters that are larger seem to be more flexible in applying bounds (i.e. I can get these to work with tighter constraints, but never below 20%). The following code is an MWE and has two sets of bounds (variable B), one that works and one that returns an error.
# %% import modules
import IPython as IP
IP.get_ipython().magic('reset -sf')
import matplotlib.pyplot as plt
import os as os
import numpy as np
import scipy as sp
import scipy.io as sio
plt.close('all')
#%% Load the data
capacity = np.array([1.0,9.896265560165975472e-01,9.854771784232364551e-01,9.823651452282157193e-01,9.797717842323651061e-01,9.776970954356846155e-01,9.751037344398340023e-01,9.735477178423235234e-01,9.714730290456431439e-01,9.699170124481327759e-01,9.683609958506222970e-01,9.668049792531120401e-01,9.652489626556015612e-01,9.636929460580913043e-01,9.621369294605808253e-01,9.610995850622406911e-01,9.595435684647302121e-01,9.585062240663900779e-01,9.574688796680497216e-01,9.559128630705394647e-01,9.548755186721991084e-01,9.538381742738588631e-01,9.528008298755185068e-01,9.517634854771783726e-01,9.507261410788381273e-01,9.496887966804978820e-01,9.486514522821576367e-01,9.476141078838172804e-01,9.460580912863070235e-01,9.450207468879666672e-01,9.439834024896265330e-01,9.429460580912862877e-01,9.419087136929459314e-01,9.408713692946057972e-01,9.393153526970953182e-01,9.382780082987551840e-01,9.372406639004148277e-01,9.356846473029045708e-01,9.346473029045642145e-01,9.330912863070539576e-01,9.320539419087136013e-01,9.304979253112033444e-01,9.289419087136928654e-01,9.273858921161826085e-01,9.258298755186721296e-01,9.242738589211617617e-01,9.227178423236513938e-01,9.211618257261410259e-01,9.196058091286306579e-01,9.180497925311202900e-01,9.159751037344397995e-01,9.144190871369294316e-01,9.123443983402489410e-01,9.107883817427384621e-01,9.087136929460579715e-01,9.071576763485477146e-01,9.050829875518671130e-01,9.030082987551866225e-01,9.009336099585061319e-01,8.988589211618257524e-01,8.967842323651451508e-01,8.947095435684646603e-01,8.926348547717841697e-01,8.905601659751035681e-01,8.884854771784231886e-01,8.864107883817426980e-01,8.843360995850622075e-01,8.817427385892115943e-01,8.796680497925309927e-01,8.775933609958505022e-01,8.749999999999998890e-01,8.729253112033195094e-01,8.708506224066390189e-01,8.682572614107884057e-01,8.661825726141078041e-01,8.635892116182571909e-01,8.615145228215767004e-01,8.589211618257260872e-01,8.563278008298754740e-01,8.542531120331948724e-01,8.516597510373442592e-01,8.490663900414936460e-01,8.469917012448132665e-01,8.443983402489626533e-01,8.418049792531120401e-01,8.397302904564315496e-01,8.371369294605809364e-01,8.345435684647303232e-01,8.324688796680497216e-01,8.298755186721991084e-01,8.272821576763484952e-01,8.246887966804978820e-01,8.226141078838173915e-01,8.200207468879667783e-01,8.174273858921160540e-01,8.153526970954355635e-01,8.127593360995849503e-01,8.101659751037343371e-01,8.075726141078837239e-01,8.054979253112033444e-01,8.029045643153527312e-01,8.003112033195021180e-01,7.977178423236515048e-01,7.956431535269707922e-01,7.930497925311201790e-01,7.904564315352695658e-01,7.883817427385891863e-01,7.857883817427385731e-01,7.831950207468879599e-01,7.811203319502073583e-01,7.785269709543567451e-01,7.759336099585061319e-01,7.738589211618256414e-01,7.712655601659750282e-01,7.686721991701244150e-01,7.665975103734440355e-01,7.640041493775934223e-01,7.619294605809127097e-01,7.593360995850620965e-01,7.567427385892114833e-01,7.546680497925311037e-01,7.520746887966804906e-01,7.499999999999998890e-01,7.474066390041492758e-01,7.453319502074687852e-01,7.427385892116181720e-01,7.406639004149377925e-01,7.380705394190871793e-01,7.359958506224064667e-01,7.339211618257260872e-01,7.313278008298754740e-01,7.292531120331949834e-01,7.266597510373443702e-01,7.245850622406637687e-01,7.225103734439833891e-01,7.199170124481327759e-01,7.178423236514521744e-01,7.157676348547717948e-01,7.136929460580911933e-01,7.110995850622405801e-01,7.090248962655600895e-01,7.069502074688797100e-01,7.048755186721989974e-01,7.022821576763483842e-01,7.002074688796680046e-01,6.981327800829875141e-01,6.960580912863069125e-01,6.939834024896265330e-01,6.919087136929459314e-01,6.898340248962655519e-01,6.877593360995849503e-01])
cycles = np.arange(0,151)
#%% fit the capacity data
# define the empicrial model to be fitted
def He_model(k,a,b,c,d):
return a*np.exp(b*k)+c*np.exp(d*k)
# Fit the entire data set with the function
P0 = [-40,0.005,40,-0.005]
fit, tmp = sp.optimize.curve_fit(He_model, cycles,capacity, p0=P0,maxfev=10000000)
capacity_fit = He_model(cycles, fit[0], fit[1],fit[2], fit[3])
# track all four data points with a 5% bound from the best fit
b_lim = np.zeros((4,2))
for i in range(4):
b_lim[i,0] = fit[i]-np.abs(0.2*fit[i]) # these should be 0.05
b_lim[i,1] = fit[i]+np.abs(0.2*fit[i])
# bounds that work
B = ([b_lim[0,0],-np.inf,b_lim[2,0],-np.inf],[b_lim[0,1], np.inf, b_lim[2,1], np.inf])
# bounds that do not work, but are closer to what I want.
#B = ([b_lim[0,0],b_lim[1,0],b_lim[2,0],b_lim[3,0]],[b_lim[0,1], b_lim[1,1], b_lim[2,1], b_lim[3,1]])
fit_4_5per, tmp = sp.optimize.curve_fit(He_model, cycles,capacity , p0=P0,bounds=B)
capacity_4_5per = He_model(cycles, fit_4_5per[0], fit_4_5per[1], fit_4_5per[2], fit_4_5per[3])
#%% plot the results
plt.figure()
plt.plot(cycles,capacity,'o',fillstyle='none',label='data',markersize=4)
plt.plot(cycles,capacity_fit,'--',label='best fit')
plt.plot(cycles,capacity_4_5per,'-.',label='5 percent bounds')
plt.ylabel('capacity')
plt.xlabel('charge cycles')
plt.legend()
plt.ylim([0.70,1])
plt.tight_layout()
I understand that optimize.curve_fit may need some space to explore the data set to find the optimum spot, however, I feel that 5% should be enough for it. Maybe I am wrong? Is there maybe a better way to apply bounds to a parameter?
The ValueError of "x0 is infeasible" is coming because your initial values violate the bounds. Printing out the parameters values and bounds will show this.
Basically, you're setting the bounds too cleverly, based on the first refined values. But the refined values are different enough from your starting values that the bounds for the second call to curve_fit mean the initial values fall outside the bounds.
More importantly, what leads you to "feel that 5% should be enough"? Primarily, you should apply bounds to make sure the model makes sense, and secondarily to help the fit avoid false solutions. You're calculating the bounds based on an initial fit, so I doubt there's a strong physical justification for those bounds. Why not let the fit do its job?

How to plot grad(f(x,y))?

I want to calculate and plot a gradient of any scalar function of two variables. If you really want a concrete example, lets say f=x^2+y^2 where x goes from -10 to 10 and same for y. How do I calculate and plot grad(f)? The solution should be vector and I should see vector lines. I am new to python so please use simple words.
EDIT:
#Andras Deak: thank you for your post, i tried what you suggested and instead of your test function (fun=3*x^2-5*y^2) I used function that i defined as V(x,y); this is how the code looks like but it reports an error
import numpy as np
import math
import sympy
import matplotlib.pyplot as plt
def V(x,y):
t=[]
for k in range (1,3):
for l in range (1,3):
t.append(0.000001*np.sin(2*math.pi*k*0.5)/((4*(math.pi)**2)* (k**2+l**2)))
term = t* np.sin(2 * math.pi * k * x/0.004) * np.cos(2 * math.pi * l * y/0.004)
return term
return term.sum()
x,y=sympy.symbols('x y')
fun=V(x,y)
gradfun=[sympy.diff(fun,var) for var in (x,y)]
numgradfun=sympy.lambdify([x,y],gradfun)
X,Y=np.meshgrid(np.arange(-10,11),np.arange(-10,11))
graddat=numgradfun(X,Y)
plt.figure()
plt.quiver(X,Y,graddat[0],graddat[1])
plt.show()
AttributeError: 'Mul' object has no attribute 'sin'
And lets say I remove sin, I get another error:
TypeError: can't multiply sequence by non-int of type 'Mul'
I read tutorial for sympy and it says "The real power of a symbolic computation system such as SymPy is the ability to do all sorts of computations symbolically". I get this, I just dont get why I cannot multiply x and y symbols with float numbers.
What is the way around this? :( Help please!
UPDATE
#Andras Deak: I wanted to make things shorter so I removed many constants from the original formulas for V(x,y) and Cn*Dm. As you pointed out, that caused the sin function to always return 0 (i just noticed). Apologies for that. I will update the post later today when i read your comment in details. Big thanks!
UPDATE 2
I changed coefficients in my expression for voltage and this is the result:
It looks good except that the arrows point in the opposite direction (they are supposed to go out of the reddish dot and into the blue one). Do you know how I could change that? And if possible, could you please tell me the way to increase the size of the arrows? I tried what was suggested in another topic (Computing and drawing vector fields):
skip = (slice(None, None, 3), slice(None, None, 3))
This plots only every third arrow and matplotlib does the autoscale but it doesnt work for me (nothing happens when i add this, for any number that i enter)
You were already of huge help , i cannot thank you enough!
Here's a solution using sympy and numpy. This is the first time I use sympy, so others will/could probably come up with much better and more elegant solutions.
import sympy
#define symbolic vars, function
x,y=sympy.symbols('x y')
fun=3*x**2-5*y**2
#take the gradient symbolically
gradfun=[sympy.diff(fun,var) for var in (x,y)]
#turn into a bivariate lambda for numpy
numgradfun=sympy.lambdify([x,y],gradfun)
now you can use numgradfun(1,3) to compute the gradient at (x,y)==(1,3). This function can then be used for plotting, which you said you can do.
For plotting, you can use, for instance, matplotlib's quiver, like so:
import numpy as np
import matplotlib.pyplot as plt
X,Y=np.meshgrid(np.arange(-10,11),np.arange(-10,11))
graddat=numgradfun(X,Y)
plt.figure()
plt.quiver(X,Y,graddat[0],graddat[1])
plt.show()
UPDATE
You added a specification for your function to be computed. It contains the product of terms depending on x and y, which seems to break my above solution. I managed to come up with a new one to suit your needs. However, your function seems to make little sense. From your edited question:
t.append(0.000001*np.sin(2*math.pi*k*0.5)/((4*(math.pi)**2)* (k**2+l**2)))
term = t* np.sin(2 * math.pi * k * x/0.004) * np.cos(2 * math.pi * l * y/0.004)
On the other hand, from your corresponding comment to this answer:
V(x,y) = Sum over n and m of [Cn * Dm * sin(2pinx) * cos(2pimy)]; sum goes from -10 to 10; Cn and Dm are coefficients, and i calculated
that CkDl = sin(2pik)/(k^2 +l^2) (i used here k and l as one of the
indices from the sum over n and m).
I have several problems with this: both sin(2*pi*k) and sin(2*pi*k/2) (the two competing versions in the prefactor are always zero for integer k, giving you a constant zero V at every (x,y). Furthermore, in your code you have magical frequency factors in the trigonometric functions, which are missing from the comment. If you multiply your x by 4e-3, you drastically change the spatial dependence of your function (by changing the wavelength by roughly a factor of a thousand). So you should really decide what your function is.
So here's a solution, where I assumed
V(x,y)=sum_{k,l = 1 to 10} C_{k,l} * sin(2*pi*k*x)*cos(2*pi*l*y), with
C_{k,l}=sin(2*pi*k/4)/((4*pi^2)*(k^2+l^2))*1e-6
This is a combination of your various versions of the function, with the modification of sin(2*pi*k/4) in the prefactor in order to have a non-zero function. I expect you to be able to fix the numerical factors to your actual needs, after you figure out the proper mathematical model.
So here's the full code:
import sympy as sp
import numpy as np
import matplotlib.pyplot as plt
def CD(k,l):
#return sp.sin(2*sp.pi*k/2)/((4*sp.pi**2)*(k**2+l**2))*1e-6
return sp.sin(2*sp.pi*k/4)/((4*sp.pi**2)*(k**2+l**2))*1e-6
def Vkl(x,y,k,l):
return CD(k,l)*sp.sin(2*sp.pi*k*x)*sp.cos(2*sp.pi*l*y)
def V(x,y,kmax,lmax):
k,l=sp.symbols('k l',integers=True)
return sp.summation(Vkl(x,y,k,l),(k,1,kmax),(l,1,lmax))
#define symbolic vars, function
kmax=10
lmax=10
x,y=sp.symbols('x y')
fun=V(x,y,kmax,lmax)
#take the gradient symbolically
gradfun=[sp.diff(fun,var) for var in (x,y)]
#turn into bivariate lambda for numpy
numgradfun=sp.lambdify([x,y],gradfun,'numpy')
numfun=sp.lambdify([x,y],fun,'numpy')
#plot
X,Y=np.meshgrid(np.linspace(-10,10,51),np.linspace(-10,10,51))
graddat=numgradfun(X,Y)
fundat=numfun(X,Y)
hf=plt.figure()
hc=plt.contourf(X,Y,fundat,np.linspace(fundat.min(),fundat.max(),25))
plt.quiver(X,Y,graddat[0],graddat[1])
plt.colorbar(hc)
plt.show()
I defined your V(x,y) function using some auxiliary functions for transparence. I left the summation cut-offs as literal parameters, kmax and lmax: in your code these were 3, in your comment they were said to be 10, and anyway they should be infinity.
The gradient is taken the same way as before, but when converting to a numpy function using lambdify you have to set an additional string parameter, 'numpy'. This will alow the resulting numpy lambda to accept array input (essentially it will use np.sin instead of math.sin and the same for cos).
I also changed the definition of the grid from array to np.linspace: this is usually more convenient. Since your function is almost constant at integer grid points, I created a denser mesh for plotting (51 points while keeping your original limits of (-10,10) fixed).
For clarity I included a few more plots: a contourf to show the value of the function (contour lines should always be orthogonal to the gradient vectors), and a colorbar to indicate the value of the function. Here's the result:
The composition is obviously not the best, but I didn't want to stray too much from your specifications. The arrows in this figure are actually hardly visible, but as you can see (and also evident from the definition of V) your function is periodic, so if you plot the same thing with smaller limits and less grid points, you'll see more features and larger arrows.

resampling, interpolating matrix

I'm trying to interpolate some data for the purpose of plotting. For instance, given N data points, I'd like to be able to generate a "smooth" plot, made up of 10*N or so interpolated data points.
My approach is to generate an N-by-10*N matrix and compute the inner product the original vector and the matrix I generated, yielding a 1-by-10*N vector. I've already worked out the math I'd like to use for the interpolation, but my code is pretty slow. I'm pretty new to Python, so I'm hopeful that some of the experts here can give me some ideas of ways I can try to speed up my code.
I think part of the problem is that generating the matrix requires 10*N^2 calls to the following function:
def sinc(x):
import math
try:
return math.sin(math.pi * x) / (math.pi * x)
except ZeroDivisionError:
return 1.0
(This comes from sampling theory. Essentially, I'm attempting to recreate a signal from its samples, and upsample it to a higher frequency.)
The matrix is generated by the following:
def resampleMatrix(Tso, Tsf, o, f):
from numpy import array as npar
retval = []
for i in range(f):
retval.append([sinc((Tsf*i - Tso*j)/Tso) for j in range(o)])
return npar(retval)
I'm considering breaking up the task into smaller pieces because I don't like the idea of an N^2 matrix sitting in memory. I could probably make 'resampleMatrix' into a generator function and do the inner product row-by-row, but I don't think that will speed up my code much until I start paging stuff in and out of memory.
Thanks in advance for your suggestions!
This is upsampling. See Help with resampling/upsampling for some example solutions.
A fast way to do this (for offline data, like your plotting application) is to use FFTs. This is what SciPy's native resample() function does. It assumes a periodic signal, though, so it's not exactly the same. See this reference:
Here’s the second issue regarding time-domain real signal interpolation, and it’s a big deal indeed. This exact interpolation algorithm provides correct results only if the original x(n) sequence is periodic within its full time inter­val.
Your function assumes the signal's samples are all 0 outside of the defined range, so the two methods will diverge away from the center point. If you pad the signal with lots of zeros first, it will produce a very close result. There are several more zeros past the edge of the plot not shown here:
Cubic interpolation won't be correct for resampling purposes. This example is an extreme case (near the sampling frequency), but as you can see, cubic interpolation isn't even close. For lower frequencies it should be pretty accurate.
If you want to interpolate data in a quite general and fast way, splines or polynomials are very useful. Scipy has the scipy.interpolate module, which is very useful. You can find many examples in the official pages.
Your question isn't entirely clear; you're trying to optimize the code you posted, right?
Re-writing sinc like this should speed it up considerably. This implementation avoids checking that the math module is imported on every call, doesn't do attribute access three times, and replaces exception handling with a conditional expression:
from math import sin, pi
def sinc(x):
return (sin(pi * x) / (pi * x)) if x != 0 else 1.0
You could also try avoiding creating the matrix twice (and holding it twice in parallel in memory) by creating a numpy.array directly (not from a list of lists):
def resampleMatrix(Tso, Tsf, o, f):
retval = numpy.zeros((f, o))
for i in xrange(f):
for j in xrange(o):
retval[i][j] = sinc((Tsf*i - Tso*j)/Tso)
return retval
(replace xrange with range on Python 3.0 and above)
Finally, you can create rows with numpy.arange as well as calling numpy.sinc on each row or even on the entire matrix:
def resampleMatrix(Tso, Tsf, o, f):
retval = numpy.zeros((f, o))
for i in xrange(f):
retval[i] = numpy.arange(Tsf*i / Tso, Tsf*i / Tso - o, -1.0)
return numpy.sinc(retval)
This should be significantly faster than your original implementation. Try different combinations of these ideas and test their performance, see which works out the best!
I'm not quite sure what you're trying to do, but there are some speedups you can do to create the matrix. Braincore's suggestion to use numpy.sinc is a first step, but the second is to realize that numpy functions want to work on numpy arrays, where they can do loops at C speen, and can do it faster than on individual elements.
def resampleMatrix(Tso, Tsf, o, f):
retval = numpy.sinc((Tsi*numpy.arange(i)[:,numpy.newaxis]
-Tso*numpy.arange(j)[numpy.newaxis,:])/Tso)
return retval
The trick is that by indexing the aranges with the numpy.newaxis, numpy converts the array with shape i to one with shape i x 1, and the array with shape j, to shape 1 x j. At the subtraction step, numpy will "broadcast" the each input to act as a i x j shaped array and the do the subtraction. ("Broadcast" is numpy's term, reflecting the fact no additional copy is made to stretch the i x 1 to i x j.)
Now the numpy.sinc can iterate over all the elements in compiled code, much quicker than any for-loop you could write.
(There's an additional speed-up available if you do the division before the subtraction, especially since inthe latter the division cancels the multiplication.)
The only drawback is that you now pay for an extra Nx10*N array to hold the difference. This might be a dealbreaker if N is large and memory is an issue.
Otherwise, you should be able to write this using numpy.convolve. From what little I just learned about sinc-interpolation, I'd say you want something like numpy.convolve(orig,numpy.sinc(numpy.arange(j)),mode="same"). But I'm probably wrong about the specifics.
If your only interest is to 'generate a "smooth" plot' I would just go with a simple polynomial spline curve fit:
For any two adjacent data points the coefficients of a third degree polynomial function can be computed from the coordinates of those data points and the two additional points to their left and right (disregarding boundary points.) This will generate points on a nice smooth curve with a continuous first dirivitive. There's a straight forward formula for converting 4 coordinates to 4 polynomial coefficients but I don't want to deprive you of the fun of looking it up ;o).
Here's a minimal example of 1d interpolation with scipy -- not as much fun as reinventing, but.
The plot looks like sinc, which is no coincidence:
try google spline resample "approximate sinc".
(Presumably less local / more taps ⇒ better approximation,
but I have no idea how local UnivariateSplines are.)
""" interpolate with scipy.interpolate.UnivariateSpline """
from __future__ import division
import numpy as np
from scipy.interpolate import UnivariateSpline
import pylab as pl
N = 10
H = 8
x = np.arange(N+1)
xup = np.arange( 0, N, 1/H )
y = np.zeros(N+1); y[N//2] = 100
interpolator = UnivariateSpline( x, y, k=3, s=0 ) # s=0 interpolates
yup = interpolator( xup )
np.set_printoptions( 1, threshold=100, suppress=True ) # .1f
print "yup:", yup
pl.plot( x, y, "green", xup, yup, "blue" )
pl.show()
Added feb 2010: see also basic-spline-interpolation-in-a-few-lines-of-numpy
Small improvement. Use the built-in numpy.sinc(x) function which runs in compiled C code.
Possible larger improvement: Can you do the interpolation on the fly (as the plotting occurs)? Or are you tied to a plotting library that only accepts a matrix?
I recommend that you check your algorithm, as it is a non-trivial problem. Specifically, I suggest you gain access to the article "Function Plotting Using Conic Splines" (IEEE Computer Graphics and Applications) by Hu and Pavlidis (1991). Their algorithm implementation allows for adaptive sampling of the function, such that the rendering time is smaller than with regularly spaced approaches.
The abstract follows:
A method is presented whereby, given a
mathematical description of a
function, a conic spline approximating
the plot of the function is produced.
Conic arcs were selected as the
primitive curves because there are
simple incremental plotting algorithms
for conics already included in some
device drivers, and there are simple
algorithms for local approximations by
conics. A split-and-merge algorithm
for choosing the knots adaptively,
according to shape analysis of the
original function based on its
first-order derivatives, is
introduced.

Categories

Resources