Difference in x,y parameters for scipy interpolate RectBivariateSpline and interp2d - python

If I want to interpolate the data below:
from scipy.interpolate import RectBivariateSpline, interp2d
import numpy as np
x1 = np.linspace(0,5,10)
y1 = np.linspace(0,20,20)
xx, yy = np.meshgrid(x1, y1)
z = np.sin(xx**2+yy**2)
With interp2d this works:
f = interp2d(x1, y1, z, kind='cubic')
However, if I use RectBivariateSpline with the same x1, y1 parameters:
f = RectBivariateSpline(x1, y1, z)
I get this error:
TypeError Traceback (most recent call last)
<ipython-input-9-3da046e1ebe0> in <module>()
----> 1 f = RectBivariateSpline(x, y, z)
C:\...\Local\Continuum\Anaconda\lib\site-packages\scipy\interpolate\fitpack2.pyc in __init__(self, x, y, z, bbox, kx, ky, s)
958 raise TypeError('y must be strictly ascending')
959 if not x.size == z.shape[0]:
--> 960 raise TypeError('x dimension of z must have same number of '
961 'elements as x')
962 if not y.size == z.shape[1]:
TypeError: x dimension of z must have same number of elements as x
I'd have to switch the sizes of x, y like this to have it work:
x2 = np.linspace(0,5,20)
y2 = np.linspace(0,20,10)
f = RectBivariateSpline(x2, y2, z)
Is there a reason for this behavior - or something I am not understanding?

Well, the reason is that the parameters to the two functions are, as you have noted, different. Yes, this makes it really hard to just switch out one for the other, as I well know.
Why? In general it was a clear design decision to break backward compatibility with the new object-oriented spline functions, or at least not to worry about it. Certainly, for large grid sizes there are significant space savings from not having to pass x and y as 2D objects. Frankly, I have found in my code that once this initial barrier is overcome, I'm much happier using the spline objects. For example, with the UnivariateSpline object, getting the derivative(s) is easy, as is the integral.
It would appear that, going forward, the SciPy folks will focus on the new objects, so you might contemplate just moving to them now. They are the same base functionality, and have additional methods that provide nice benefits.
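As a rough illustration of the derivative and integral methods mentioned above, here is a minimal sketch using UnivariateSpline on samples of sin(x). The sample data and the choice s=0 (pure interpolation, no smoothing) are assumptions made for this example only.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Sample data (an assumption for illustration): sin(x) on [0, pi]
x = np.linspace(0, np.pi, 50)
y = np.sin(x)

# s=0 makes the spline pass through every data point
spl = UnivariateSpline(x, y, s=0)

dspl = spl.derivative()          # a new spline object representing d/dx
slope = dspl(x)                  # should be close to cos(x)
area = spl.integral(0, np.pi)    # should be close to 2
```

The derivative comes back as another spline object, so it can be evaluated anywhere in the data range without refitting.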
EDIT - clarify what 'broke' between the two.
From the SciPy manual on interp2d you get the code snippet:
from scipy import interpolate
x = np.arange(-5.01, 5.01, 0.25)
y = np.arange(-5.01, 5.01, 0.25)
xx, yy = np.meshgrid(x, y)
z = np.sin(xx**2+yy**2)
f = interpolate.interp2d(x, y, z, kind='cubic')
Unfortunately, this is potentially misleading: both x and y have the same length, so z is a square matrix and the axis order is hidden. So, let's play with this a bit:
x = np.linspace(0,5,11)
y = np.linspace(0,20,21) # note different lengths
z = x[None,:].T + y*y # need broadcasting
xx,yy = np.meshgrid(x,y) # this is from the interp2d example to compare
zz = xx + yy*yy
These now have different shapes: shape(z) is (11,21) and shape(zz) is (21,11). In fact, they are the transpose of each other, z == zz.T. Once you realize this, it all becomes clearer - going from interp2d to RectBivariateSpline swapped the expected axes. Pick one instantiation of the splines (I've opted for the newer ones), and you have picked a particular set of axes to keep clear in your head. To mix them together, a simple transpose will work as well, but can get to be a headache when you go back through your code a month or more from now.
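Applied to the data from the question, the transpose fix might look like the sketch below. It assumes the default meshgrid indexing ('xy'), which is what produces the (len(y), len(x)) layout. Passing indexing='ij' is an alternative that builds z directly in the layout RectBivariateSpline expects.

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

x1 = np.linspace(0, 5, 10)
y1 = np.linspace(0, 20, 20)

# Default indexing='xy': arrays have shape (len(y1), len(x1))
xx, yy = np.meshgrid(x1, y1)
z = np.sin(xx**2 + yy**2)                 # shape (20, 10)

# RectBivariateSpline wants z.shape == (len(x), len(y)), so transpose
f = RectBivariateSpline(x1, y1, z.T)

# Alternative: build the grid in 'ij' order and skip the transpose
xi, yi = np.meshgrid(x1, y1, indexing='ij')
z_ij = np.sin(xi**2 + yi**2)              # shape (10, 20)
f_ij = RectBivariateSpline(x1, y1, z_ij)
```

Either way, calling f(x1, y1) returns values indexed as [i_x, i_y], the transpose of what the meshgrid-based z holds.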

Related

Most efficient way to save 2D array and its axes values

Suppose I have some large arrays as a result of a 2-variable function:
import numpy as np
lenx = 100
leny = 37
x = np.linspace(0, 10, lenx)
y = np.linspace(0, 20, leny)
z = np.zeros((lenx, leny))
fx = x + 1 # Might need this later
for i in range(lenx):
    for k in range(leny):
        z[i, k] = x[i] + y[k]
I can plot this using plt.imshow (for example) after I am done calculating z, but sometimes I want to save this data (x, y, z, and maybe fx too) and replot it later. What would be the best way to go about it?
I could expand this into a lenx*leny by 3 array:
[[x0, y0, z00],
[x0, y1, z01],
...
[xn, ym, znm]]
But that seems very inefficient (x and y values would be repeated many times unnecessarily) and would take a long time to load every time.
I also thought of a json file but my understanding is that they are text-based, so in case of very large arrays (think 200 x 250000) it would take up a lot of memory.
What is the best option in this case?

Differentiate a 2d cubic spline in python

I'm using interpolate.interp2d() to fit a 2-D spline over a function. How can I get the first derivative of the spline w.r.t. each of the independent variables? Here is my code so far; Z holds the discrete points on a mesh grid that I have:
from scipy import interpolate
YY, XX = np.meshgrid(Y, X)
f = interpolate.interp2d(YY, XX, Z, kind='cubic')
So, I need df/dx and df/dy. Note also that my Y-grid is not evenly spaced. I guess I can numerically differentiate Z and then fit a new spline, but it seemed like too much hassle. Is there an easier way?
You can differentiate the output of interp2d by using the function bisplev on the tck property of the interpolant with the optional arguments dx and dy.
If you've got some meshed data which you've interpolated:
import numpy as np
import scipy as sp
import scipy.interpolate
X = np.arange(5.)
Y = np.arange(6., 11)
Y[0] = 4 # Demonstrate an irregular mesh
YY, XX = np.meshgrid(Y, X)
Z = np.sin(XX*2*np.pi/5 + YY*YY*2*np.pi/11)
f = sp.interpolate.interp2d(XX, YY, Z, kind='cubic')
xt = np.linspace(X.min(), X.max())
yt = np.linspace(Y.min(), Y.max())
then you can access the appropriate structure for bisplev as f.tck: the partial derivative of f with respect to x can be evaluated as
Z_x = sp.interpolate.bisplev(xt, yt, f.tck, dx=1, dy=0)
Edit: From this answer, it looks like the result of interp2d can itself take the optional arguments of dx and dy:
Z_x = f(xt, yt, dx=1, dy=0)
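For data on a rectangular grid, the same partial derivatives can also be taken from a RectBivariateSpline object, which accepts dx and dy when called. The sketch below is not the original poster's code: the test surface x^2 * y^3 is a hypothetical choice, picked only because its derivatives are known exactly, and the grid is built with indexing='ij' so that Z[i, j] corresponds to (X[i], Y[j]).

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

X = np.arange(5.)
Y = np.arange(6., 11)
Y[0] = 4                                   # irregular, but still strictly ascending
XX, YY = np.meshgrid(X, Y, indexing='ij')  # Z[i, j] = f(X[i], Y[j])
Z = XX**2 * YY**3                          # hypothetical surface with known derivatives

f = RectBivariateSpline(X, Y, Z)           # bicubic by default (kx=ky=3)

xt = np.linspace(X.min(), X.max())
yt = np.linspace(Y.min(), Y.max())
Z_x = f(xt, yt, dx=1)                      # df/dx evaluated on the xt-by-yt grid
Z_y = f(xt, yt, dy=1)                      # df/dy evaluated on the xt-by-yt grid
```

The results have shape (len(xt), len(yt)), following the same x-first axis order as the input Z.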

Get derivative of data in python

I wrote a program to compute a derivative. InterpolatedUnivariateSpline is used to evaluate f(x+h). The red line is the computed derivative of cosine, the green line is cosine, and the blue line is the -sine function. The red and blue lines match, so it works well in the following case.
from scipy.interpolate import InterpolatedUnivariateSpline
import numpy as np
import matplotlib.pyplot as plt
pi = np.pi
x = np.arange(0,5*pi,0.2*pi)
y = np.cos(x)
f2 = InterpolatedUnivariateSpline(x, y)
# Get derivative by central differences on the spline
der = []
for i in range(len(y)):
    h = 1e-4
    der.append( ( f2(x[i]+h)-f2(x[i]-h) )/(2*h) )
der = np.array(der)
plt.plot(x, der, 'r', x, y, 'g', x, -np.sin(x),'b')
plt.show()
But I ran into a problem. In my project, the variable x (frequency) ranges from 10^7 to 2.2812375*10^9 with a step of 22487500, so I changed the array x accordingly.
With that change, I get the following result.
The derivative is a flat line very close to 0, not the -sine function. How do I solve this?
You have a loss-of-significance problem. When a large floating-point number is added to a small one, part of the small one's precision is lost, because a numpy double can only hold 64 bits of information.
To solve this issue you have to make sure that the scale of the numbers you add/multiply/divide is not too different. One simple solution is dividing x by 1e9 or multiplying h by 1e9. If you do this you get essentially the same precision as in your example.
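A minimal sketch of the rescaling idea, using the frequency range from the question. The signal y = cos(x / 1e9) is a hypothetical stand-in, since the actual data is not shown; the point is only that the spline is fitted on the rescaled variable u = x / 1e9, where h = 1e-4 is a sensible step, and the chain rule converts back.

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

# Frequency grid from the question
x = np.arange(1e7, 2.2812375e9, 22487500.0)

scale = 1e9
u = x / scale                  # rescaled variable, roughly 0.01 .. 2.3
y = np.cos(u)                  # hypothetical signal (the real data is not given)

f2 = InterpolatedUnivariateSpline(u, y)

h = 1e-4                       # now comparable in scale to the spacing of u
der_u = (f2(u + h) - f2(u - h)) / (2 * h)   # dy/du, should be close to -sin(u)
der_x = der_u / scale                       # dy/dx via the chain rule
```

Because the spline accepts arrays, the explicit loop over points is not needed here.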
Also, if x has high enough resolution, a simpler way to numerically differentiate the function would be der = np.diff(y) / np.diff(x). This way you do not have to worry about setting h. However, in this case note that der is one element shorter than y, and der[i] is actually an approximation of the derivative at (x[i] + x[i+1]) / 2. So to plot it you would do:
der = np.diff(y) / np.diff(x)
x2 = (x[:-1] + x[1:]) / 2
plt.plot(x2, der, 'r', x, y, 'g', x, -np.sin(x),'b')
Here are my plots using np.gradient():
import numpy as np
import matplotlib.pyplot as plt
pi = np.pi
x = np.arange(0,5*pi,0.2*pi)
y = np.cos(x)
der = np.gradient(y,x)
plt.plot(x, der, 'r', x, y, 'g', x, -np.sin(x),'b')
plt.show()
I can also smooth the plots using a spline. If you need that, I will post it.

Using meshgrid to convert X,Y,Z triplet to three 2D arrays for surface plot in matplotlib

I'm new to Python so please be patient. I appreciate any help!
What I have: three 1D lists (xr, yr, zr), one containing x-values, the other two y- and z-values
What I want to do: create a 3D contour plot in matplotlib
I realized that I need to convert the three 1D lists into three 2D lists, by using the meshgrid function.
Here's what I have so far:
xr = np.asarray(xr)
yr = np.asarray(yr)
zr = np.asarray(zr)
X, Y = np.meshgrid(xr,yr)
znew = np.array([zr for x,y in zip(np.ravel(X), np.ravel(Y))])
Z = znew.reshape(X.shape)
Running this gives me the following error (for the last line I entered above):
total size of new array must be unchanged
I went digging around stackoverflow, and tried using suggestions from people having similar problems. Here are the errors I get from each of those suggestions:
Changing the last line to:
Z = znew.reshape(X.shape[0])
Gives the same error.
Changing the last line to:
Z = znew.reshape(X.shape[0], len(znew))
Gives the error:
Shape of x does not match that of z: found (294, 294) instead of (294, 86436).
Changing it to:
Z = znew.reshape(X.shape, len(znew))
Gives the error:
an integer is required
Any ideas?
Well, the sample code below works for me:
import numpy as np
import matplotlib.pyplot as plt
xr = np.linspace(-20, 20, 100)
yr = np.linspace(-25, 25, 110)
X, Y = np.meshgrid(xr, yr)
#Z = 4*X**2 + Y**2
zr = []
for i in range(0, 110):
    y = -25.0 + (50./110.)*float(i)
    for k in range(0, 100):
        x = -20.0 + (40./100.)*float(k)
        v = 4.0*x*x + y*y
        zr.append(v)
Z = np.reshape(zr, X.shape)
print(X.shape)
print(Y.shape)
print(Z.shape)
plt.contour(X, Y, Z)
plt.show()
TL;DR
import matplotlib.pyplot as plt
import numpy as np
def get_data_for_mpl(X, Y, Z):
    result_x = np.unique(X)
    result_y = np.unique(Y)
    result_z = np.zeros((len(result_x), len(result_y)))
    # result_z[:] = np.nan
    for x, y, z in zip(X, Y, Z):
        i = np.searchsorted(result_x, x)
        j = np.searchsorted(result_y, y)
        result_z[i, j] = z
    return result_x, result_y, result_z
xr, yr, zr = np.genfromtxt('data.txt', unpack=True)
plt.contourf(*get_data_for_mpl(xr, yr, zr), 100)
plt.show()
Detailed answer
At the beginning, you need to find out for which values of x and y the graph is being plotted. This can be done using the numpy.unique function:
result_x = numpy.unique(X)
result_y = numpy.unique(Y)
Next, you need to create a numpy.ndarray with function values for each point (x, y) from zip(X, Y):
result_z = numpy.zeros((len(result_x), len(result_y)))
for x, y, z in zip(X, Y, Z):
    i = search(result_x, x)
    j = search(result_y, y)
    result_z[i, j] = z
If the array is sorted, the search can be performed in logarithmic rather than linear time, so it is enough to use the numpy.searchsorted function. To use it, the arrays result_x and result_y must be sorted. Fortunately, numpy.unique returns its result sorted, so no additional work is needed. It is enough to replace the search function (which is not implemented anywhere and is given simply as an intermediate step) with np.searchsorted.
Finally, to get the desired image, it is enough to call the matplotlib.pyplot.contour or matplotlib.pyplot.contourf method.
If the function value does not exist for (x, y) for all x from result_x and all y from result_y, and you just want to not draw anything, then it is enough to replace the missing values with NaN. Or, more simply, create result_z as a numpy.ndarray filled with NaN and then fill it in:
result_z = numpy.zeros((len(result_x), len(result_y)))
result_z[:] = numpy.nan
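One thing to watch when passing the result to matplotlib: with 1-D x and y, contour and contourf expect Z to have len(y) rows and len(x) columns. The sketch below is a hedged variant of the approach above, not the answerer's code. The helper name triplets_to_grid is hypothetical; it uses the return_inverse option of numpy.unique in place of the explicit search loop and stores Z with rows following y.

```python
import numpy as np

def triplets_to_grid(x, y, z):
    """Hypothetical helper: scatter (x, y, z) triplets onto a rectangular grid."""
    gx, ix = np.unique(x, return_inverse=True)   # sorted unique x, index of each sample
    gy, iy = np.unique(y, return_inverse=True)   # sorted unique y, index of each sample
    grid = np.full((len(gy), len(gx)), np.nan)   # rows follow y, columns follow x
    grid[iy, ix] = z                             # missing (x, y) pairs stay NaN
    return gx, gy, grid

# Small non-square example: 3 x-values, 2 y-values
xs = np.array([0.0, 1.0, 2.0, 0.0, 1.0, 2.0])
ys = np.array([10.0, 10.0, 10.0, 20.0, 20.0, 20.0])
zs = xs + ys
gx, gy, Z = triplets_to_grid(xs, ys, zs)
```

With this layout, plt.contourf(gx, gy, Z) can be called directly, including for grids where the number of distinct x and y values differs.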

How to solve differential equation using Python builtin function odeint?

I want to solve this differential equation with the given initial conditions:
(3x-1)y''-(3x+2)y'+(6x-8)y=0, y(0)=2, y'(0)=3
The answer should be
y=2*exp(2*x)-x*exp(-x)
Here is my code:
def g(y,x):
    y0 = y[0]
    y1 = y[1]
    y2 = (6*x-8)*y0/(3*x-1)+(3*x+2)*y1/(3*x-1)
    return [y1,y2]
init = [2.0, 3.0]
x=np.linspace(-2,2,100)
sol=spi.odeint(g,init,x)
plt.plot(x,sol[:,0])
plt.show()
But what I get is different from the answer.
What have I done wrong?
There are several things wrong here. Firstly, your equation is apparently
(3x-1)y''-(3x+2)y'-(6x-8)y=0; y(0)=2, y'(0)=3
(note the sign of the term in y). For this equation, your analytical solution and definition of y2 are correct.
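The sign claim can be checked numerically by substituting the analytical solution and its hand-computed derivatives into both versions of the equation. This is only a sanity-check sketch; the variable names res_minus and res_plus are introduced here for illustration.

```python
import numpy as np

x = np.linspace(-2, 2, 9)

# Analytical solution and its derivatives, worked out by hand
y = 2*np.exp(2*x) - x*np.exp(-x)
dy = 4*np.exp(2*x) + (x - 1)*np.exp(-x)
d2y = 8*np.exp(2*x) + (2 - x)*np.exp(-x)

# Residual with -(6x-8)y: vanishes for the given solution
res_minus = (3*x - 1)*d2y - (3*x + 2)*dy - (6*x - 8)*y

# Residual with +(6x-8)y, as originally posted: does not vanish
res_plus = (3*x - 1)*d2y - (3*x + 2)*dy + (6*x - 8)*y
```

Only the version with the minus sign gives a residual that is zero up to rounding error, and the initial values y(0)=2 and y'(0)=3 follow from the same expressions.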
Secondly, as @Warren Weckesser says, you must pass 2 parameters as y to g: y[0] (y), y[1] (y') and return their derivatives, y' and y''.
Thirdly, your initial conditions are given for x=0, but the x-grid you integrate on starts at -2. The docs for odeint describe this parameter, called t in their call signature:
odeint(func, y0, t, args=(),...):
    t : array
        A sequence of time points for which to solve for y. The initial
        value point should be the first element of this sequence.
So you must integrate starting at 0 or provide initial conditions starting at -2.
Finally, your range of integration covers a singularity at x=1/3. odeint may have a bad time here (but apparently doesn't).
Here's one approach that seems to work:
import numpy as np
import scipy as sp
from scipy.integrate import odeint
import matplotlib.pyplot as plt
def g(y, x):
    y0 = y[0]
    y1 = y[1]
    y2 = ((3*x+2)*y1 + (6*x-8)*y0)/(3*x-1)
    return y1, y2
# Initial conditions on y, y' at x=0
init = 2.0, 3.0
# First integrate from 0 to 2
x = np.linspace(0,2,100)
sol=odeint(g, init, x)
plt.plot(x, sol[:,0], color='b')
# Then integrate from 0 to -2
x = np.linspace(0,-2,100)
sol=odeint(g, init, x)
plt.plot(x, sol[:,0], color='b')
# The analytical answer in red dots
exact_x = np.linspace(-2,2,10)
exact_y = 2*np.exp(2*exact_x)-exact_x*np.exp(-exact_x)
plt.plot(exact_x,exact_y, 'o', color='r', label='exact')
plt.legend()
plt.show()
