Curve fitting and extrapolation for a 3D plot in Python

I want to extrapolate a 3D plot in Python using numpy/scipy. The extrapolation should be done with curve fitting. Consider the following data, in which x and y have different sizes.
x = np.array([740,760,780,800,820,840,860,880,900,920,940,960]) # Pressure in mBar
y = np.array([1500,1800,2100,2400,2700,3000,3300,3600,3900]) # Rpm
# Fuel Amount in micro seconds
z = np.array([[1820,1820,1820,1820,2350,2820,3200,3440,3520,3600,3600,3600],
[1930,1930,1930,2170,2700,2880,3240,3580,3990,3990,3990,3990],
[1900,1900,2370,2680,2730,3050,3450,3760,3970,3970,3970,3970],
[2090,2090,2240,2410,2875,3180,3410,3935,4270,4270,4270,4270],
[1600,2180,2400,2700,2950,3290,3780,4180,4470,4470,4470,4470],
[2100,2280,2600,2880,3320,3640,4150,4550,4550,4550,4550,4550],
[2300,2460,2810,3170,3400,3900,4280,4760,4760,4760,4760,4760],
[2170,2740,3030,3250,3600,4100,4370,4370,4370,4370,4370,4370],
[2240,2580,2870,3275,3640,4050,4260,4260,4260,4260,4260,4260]])
SciPy has the scipy.interpolate.interp2d class, but it seems to interpolate only if x and y are of the same size.
http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.interpolate.interp2d.html
I want to extrapolate the surface at the y-axis points 900 and 1200 and at the x-axis point 720, i.e.
xNew = (720,x)
yNew = (900,1200,y)
I don't have the function in the form z = f(x,y). How can curve fitting be done in Python for this case, so that I get the values at the required points?

You need to provide the grid on which your z values are constructed, i.e. something like
x=[[740,760,...,960],
.....
[740,760,...,960]]
and similarly for y. This can be achieved using numpy.meshgrid:
xx,yy=np.meshgrid(x,y)
test_function=interp2d(xx,yy,z)
Using your data, I can execute test_function(720,900) and get a value of 1820, which is nearest neighbour extrapolation. If you need "better" extrapolation (whatever that means), you need to develop some kind of model function for your data and use the fitting methods inside scipy.
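As a rough illustration of the "model function" route, here is a minimal sketch that fits a quadratic surface to the data by linear least squares and evaluates it on the extended grid. The quadratic form, the helper names scale and design, and the rescaling of the inputs are assumptions made for this example, not something the data dictates; a physically motivated model may extrapolate much better.

```python
import numpy as np

x = np.array([740, 760, 780, 800, 820, 840, 860, 880, 900, 920, 940, 960], dtype=float)
y = np.array([1500, 1800, 2100, 2400, 2700, 3000, 3300, 3600, 3900], dtype=float)
z = np.array([[1820, 1820, 1820, 1820, 2350, 2820, 3200, 3440, 3520, 3600, 3600, 3600],
              [1930, 1930, 1930, 2170, 2700, 2880, 3240, 3580, 3990, 3990, 3990, 3990],
              [1900, 1900, 2370, 2680, 2730, 3050, 3450, 3760, 3970, 3970, 3970, 3970],
              [2090, 2090, 2240, 2410, 2875, 3180, 3410, 3935, 4270, 4270, 4270, 4270],
              [1600, 2180, 2400, 2700, 2950, 3290, 3780, 4180, 4470, 4470, 4470, 4470],
              [2100, 2280, 2600, 2880, 3320, 3640, 4150, 4550, 4550, 4550, 4550, 4550],
              [2300, 2460, 2810, 3170, 3400, 3900, 4280, 4760, 4760, 4760, 4760, 4760],
              [2170, 2740, 3030, 3250, 3600, 4100, 4370, 4370, 4370, 4370, 4370, 4370],
              [2240, 2580, 2870, 3275, 3640, 4050, 4260, 4260, 4260, 4260, 4260, 4260]],
             dtype=float)

def scale(v, ref):
    # Hypothetical helper: centre and normalise to keep the fit well conditioned
    return (v - ref.mean()) / ref.std()

def design(xv, yv):
    # Hypothetical helper: columns of the assumed model
    # z = c0 + c1*x + c2*y + c3*x**2 + c4*x*y + c5*y**2 (in scaled coordinates)
    xs = scale(np.ravel(xv), x)
    ys = scale(np.ravel(yv), y)
    return np.column_stack([np.ones_like(xs), xs, ys, xs**2, xs * ys, ys**2])

# Fit the six coefficients to all 9 x 12 grid values
xx, yy = np.meshgrid(x, y)
coeffs, *_ = np.linalg.lstsq(design(xx, yy), z.ravel(), rcond=None)

# Evaluate the fitted surface on the extended axes (720 added to x; 900 and 1200 added to y)
x_new = np.concatenate(([720.0], x))
y_new = np.concatenate(([900.0, 1200.0], y))
xx_new, yy_new = np.meshgrid(x_new, y_new)
z_new = (design(xx_new, yy_new) @ coeffs).reshape(xx_new.shape)
```

The first column of z_new and its first two rows hold the extrapolated values. The original grid values are smoothed by the fit rather than reproduced exactly.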

Related

Higher order local interpolation of implicit curves in Python

Given a set of points describing some trajectory in the 2D plane, I would like to provide a smooth representation of this trajectory with local high order interpolation.
For instance, say we define a circle in 2D with 11 points, as in the figure below. I would like to add points between each consecutive pair of points in order to produce a smooth trace. Adding points on every segment is easy enough, but it produces the slope discontinuities typical of a "local linear interpolation". Of course it is not an interpolation in the classical sense, because
the function can have multiple y values for a given x
simply adding more points on the trajectory would be fine (no continuous representation is needed).
so I'm not sure what would be the proper vocabulary for this.
The code to produce this figure can be found below. The linear interpolation is performed with the lin_refine_implicit function. I'm looking for a higher order solution to produce a smooth trace and I was wondering if there is a way of achieving it with classical functions in Scipy? I have tried to use various 1D interpolations from scipy.interpolate without much success (again because of multiple y values for a given x).
The end goal is to use this method to provide a smooth GPS trajectory from discrete measurements, so I would think this should have a classical solution somewhere.
import numpy as np
import matplotlib.pyplot as plt

def lin_refine_implicit(x, n):
    """
    Given a 2D ndarray (npt, m) of npt coordinates in m dimensions,
    insert 2**(n-1) - 1 additional points on each trajectory segment.
    Returns an ((npt-1)*2**(n-1) + 1, m) ndarray.
    """
    if n > 1:
        # midpoints of all consecutive pairs of points
        m = 0.5*(x[:-1] + x[1:])
        if x.ndim == 2:
            msize = (x.shape[0] + m.shape[0], x.shape[1])
        else:
            raise NotImplementedError
        # interleave the original points and the midpoints
        x_new = np.empty(msize, dtype=x.dtype)
        x_new[0::2] = x
        x_new[1::2] = m
        return lin_refine_implicit(x_new, n-1)
    elif n == 1:
        return x
    else:
        raise ValueError

n = 11
r = np.arange(0, 2*np.pi, 2*np.pi/n)
x = 0.9*np.cos(r)
y = 0.9*np.sin(r)
xy = np.vstack((x, y)).T
xy_highres_lin = lin_refine_implicit(xy, n=3)
plt.plot(xy[:,0], xy[:,1], 'ob', ms=15.0, label='original data')
plt.plot(xy_highres_lin[:,0], xy_highres_lin[:,1], 'dr', ms=10.0, label='linear local interpolation')
plt.legend(loc='best')
plt.plot(x, y, '--k')
plt.xlabel('X')
plt.ylabel('Y')
plt.title('GPS trajectory')
plt.show()
This is called parametric interpolation.
scipy.interpolate.splprep provides spline approximations for such curves. This assumes you know the order in which the points are on the curve.
If you don't know which point comes after which on the curve, the problem becomes more difficult. I think in this case, the problem is called manifold learning, and some of the algorithms in scikit-learn may be helpful in that.
I would suggest you try to transform your Cartesian coordinates into polar coordinates. That should allow you to use the standard scipy.interpolate functions without issues, as you won't have the ambiguity of the x->y mapping anymore.
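A minimal sketch of the splprep approach on the 11-point circle from the question might look as follows. It assumes the points are ordered and that the curve is closed, which is why the first point is repeated at the end and per=True is passed. The choice s=0 (pure interpolation, no smoothing) is also an assumption; noisy GPS data would probably call for s > 0.

```python
import numpy as np
from scipy.interpolate import splprep, splev

# Same 11 points on a circle of radius 0.9 as in the question
n = 11
r = np.arange(0, 2*np.pi, 2*np.pi/n)
x = 0.9*np.cos(r)
y = 0.9*np.sin(r)

# Close the loop explicitly so the periodic spline joins smoothly
x_closed = np.append(x, x[0])
y_closed = np.append(y, y[0])

# Parametric cubic spline through the points; u is the curve parameter in [0, 1]
tck, u = splprep([x_closed, y_closed], s=0, per=True)

# Evaluate the spline at many more parameter values to get a smooth trace
u_fine = np.linspace(0, 1, 200)
x_smooth, y_smooth = splev(u_fine, tck)

# Distance of the refined points from the origin, as a quick sanity check
radius = np.hypot(x_smooth, y_smooth)
```

For an open trajectory, the same calls should work without the repeated point and without per=True.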

Cubic Fit to Graph. How Do I create a better fit to my data and retrieve values from the fit?

I have plotted some experimental data in Python and need to find a cubic fit to the data. The reason I need to do this is because the cubic fit will be used to remove background (in this case resistance in a diode) and you will be left with the evident features. Here is the code I am currently using to make the cubic fit in the first place, where Vnew and yone represent arrays of the experimental data.
answer1 = input('Cubic Plot attempt?\n ')
if answer1 in ['y', 'Y', 'Yes']:
    def cubic(x, A):
        return A*x**3
    cubic_guess = array([40])
    popt, pcov = curve_fit(cubic, Vnew, yone, cubic_guess)
    plot(Vnew, cubic(Vnew, *popt), 'r-', label='Cubic Fit: curve_fit')
    #ylim(-0.05,0.05)
    legend(loc='best')
    print('Cubic plotted')
else:
    print('No Cubic Removal done')
I have knowledge of curve smoothing but only in theory. I do not know how to implement it. I would really appreciate any assistance.
Here is the graph generated so far:
To make the fitted curve "wider", you're looking for extrapolation. Although in this case, you could just make Vnew cover a larger interval, in which case you'd put this before your plot command:
Vnew = numpy.linspace(-1,1, 256) # min and max are merely an example, based on your graph
plot(Vnew,cubic(Vnew,*popt),'r-',label='Cubic Fit: curve_fit')
"Blanking out" the feature you see, can be done with numpy's masked arrays but also just by removing those elements you don't want from both your original Vnew (which I'll call xone) and yone:
mask = (xone > 0.1) & (xone < 0.35) # values between these voltages (?) need to be removed
xone = xone[numpy.logical_not(mask)]
yone = yone[numpy.logical_not(mask)]
Then redo the curve fitting:
popt,_ = curve_fit(cubic, xone, yone, cubic_guess)
This will have fitted only to the data that was actually there (which aren't that many points in your dataset, from the looks of it, so beware!).
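To make the masking idea concrete, here is a small self-contained sketch on synthetic data. The voltage range, the Gaussian bump standing in for the feature, and the noise level are all invented for illustration; only cubic, the mask bounds and the curve_fit call follow the snippets above.

```python
import numpy as np
from scipy.optimize import curve_fit

def cubic(x, A):
    return A * x**3

# Synthetic stand-in for the measurement: cubic background + localized feature + noise
rng = np.random.default_rng(0)
xone = np.linspace(-0.5, 0.5, 101)
background = 40.0 * xone**3
feature = 0.5 * np.exp(-((xone - 0.22) / 0.04)**2)
yone = background + feature + rng.normal(0.0, 0.01, xone.size)

# Exclude the region containing the feature from the fit
mask = (xone > 0.1) & (xone < 0.35)
popt, pcov = curve_fit(cubic, xone[~mask], yone[~mask], p0=[40.0])

# Subtract the fitted background from ALL points to expose the feature
residual = yone - cubic(xone, *popt)
```

Here residual should be close to zero outside the masked window and show the bump inside it. With real data the window has to be chosen by eye or from prior knowledge.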

Extrapolation from curved datapoints

I can't quite wrap my head around how to extrapolate from a dataset where the points are not ordered by x (x also decreases along the curve), like so:
http://www.pic-host.org/images/2014/07/21/0b5ad6a11266f549.png
I understand that I need to treat the x and y values separately. This is the code that does that (the points are ordered along the curve):
x = bananax
y = bananay
t = np.arange(x.shape[0], dtype=float)
t /= t[-1]
nt = np.linspace(0, 1, 100)
x1 = scipy.interpolate.spline(t, x, nt)
y1 = scipy.interpolate.spline(t, y, nt)
plt.plot(nt, x1, label='data x')
plt.plot(nt, y1, label='data y')
Now I have the interpolated splines. I guess I have to do the extrapolation for f(nt)=x1 and f(nt)=y1 respectively. I get how to interpolate from the data with a simple linear regression, but I'm missing how to extrapolate a more complex spline(?) from it.
The aim is to let the extrapolated function follow the curvature of the datapoints. (At one end at least)
Cheers, and thanks!
I believe that you're on the right track in that you're creating a parametric curve (creating x(t) and y(t)) because the points are ordered. Part of the issue seems to be that the spline function is giving you back discrete values rather than the form and parameters of the spline. scipy.optimize has some nice tools that will help you find functions rather than just calculating points.
If you've got any insight into the underlying process generating the data I suggest that you use that to help select a functional form for fitting. These more free-form methods will give you a degree of flexibility to do so.
Fit x(t) and y(t) and hold onto the resulting fitting functions. They'll be generated with data from t=0 to t=1 but nothing* will stop you from evaluating them outside that range.
I can recommend the following links for guidance on curve fitting procedure:
short: http://glowingpython.blogspot.com/2011/05/curve-fitting-using-fmin.html
long: http://nbviewer.ipython.org/gist/keflavich/4042018
*almost nothing
Thanks this got me on the right track. What worked for me was:
x = bananax
y = bananay
#------ fit a spline to the coordinates; x and y are each interpolated as functions of t
t = np.arange(x.shape[0], dtype=float) # t is the index of each point
t /= t[-1] # t is now normalised to run from 0 to 1
nt = np.linspace(0, 1, 100) # nt is an array of 100 evenly spaced values from 0 to 1
x1 = scipy.interpolate.spline(t, x, nt) # spline estimate of x at the parameter values nt
y1 = scipy.interpolate.spline(t, y, nt) # spline estimate of y at the parameter values nt
#------ create a new linear space nnt on which the interpolated spline will be extrapolated
nnt = np.linspace(-1, 1, 100) # values < 0 are extrapolated (the interpolation started at the tip, t = 0)
x1fit = np.polyfit(nt,x1,3) # fits a 3rd-order polynomial to the spline values; the output is the polynomial coefficients
y1fit = np.polyfit(nt,y1,3)
xpoly = np.poly1d(x1fit) # generates the polynomial function from the coefficients obtained by polyfit
ypoly = np.poly1d(y1fit)
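scipy.interpolate.spline is no longer available in recent SciPy releases, so the snippet above may not run as written. The following sketch shows the same idea end to end, swapping in InterpolatedUnivariateSpline for the spline step. The quarter-circle arrays standing in for bananax and bananay, and the extrapolation range of nnt, are assumptions chosen only to make the example runnable.

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

# Hypothetical ordered data points along a curved path (stand-in for bananax/bananay)
theta = np.linspace(0.0, np.pi / 2, 20)
bananax = np.cos(theta)
bananay = np.sin(theta)

# Curve parameter t running from 0 to 1 along the ordered points
t = np.arange(bananax.shape[0], dtype=float)
t /= t[-1]

# Cubic interpolating splines x(t) and y(t), sampled on a finer parameter grid
nt = np.linspace(0, 1, 100)
x1 = InterpolatedUnivariateSpline(t, bananax, k=3)(nt)
y1 = InterpolatedUnivariateSpline(t, bananay, k=3)(nt)

# Fit 3rd-order polynomials to the sampled splines and wrap them as callables
xpoly = np.poly1d(np.polyfit(nt, x1, 3))
ypoly = np.poly1d(np.polyfit(nt, y1, 3))

# Evaluate beyond the data: t < 0 extrapolates past the first point
nnt = np.linspace(-0.2, 1.0, 120)
x_ext = xpoly(nnt)
y_ext = ypoly(nnt)
```

Plotting x_ext against y_ext gives the extended curve. Polynomial extrapolation tends to diverge quickly, so it is probably only trustworthy close to the end of the data.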

Gradient calculation with python

I would like to know how numpy.gradient works.
I used gradient to try to calculate the group velocity (the group velocity of a wave packet is the derivative of the frequency with respect to the wavenumber, not a group of velocities). I fed it a 3-column array: the first 2 columns are the x and y coordinates, and the third column is the frequency at that point (x,y). I need to calculate the gradient, and I expected a 2D vector, since the gradient is defined as
df/dx*i+df/dy*j+df/dz*k
and, my function being a function of x and y only, I expected something like
df/dx*i+df/dy*j
But I got 2 arrays with 3 columns each, i.e. two 3D vectors. At first I thought that the sum of the two would give me the vector I was looking for, but the z component doesn't vanish. I hope I've been sufficiently clear in my explanation. I would like to know how numpy.gradient works and whether it's the right choice for my problem. Otherwise, I would like to know if there's any other Python function I can use.
What I mean is: I want to calculate the gradient of an array of values:
data=[[x1,x2,x3]...[x1,x2,x3]]
where x1,x2 are the point coordinates on a uniform grid (my points in the Brillouin zone) and x3 is the value of the frequency at that point. I also give as input the differentiation steps for the 2 directions:
stepx=abs(max(unique(data[:,0]))-min(unique(data[:,0])))/(len(unique(data[:,0]))-1)
and the same for the y direction.
I didn't build my data on a grid; I already have a grid, and this is why the examples kindly given here in the answers do not help me.
A more fitting example would have a grid of points and values like the one I have:
data=[]
for i in range(10):
    for j in range(10):
        data.append([i,j,i**2+j**2])
data=array(data,dtype=float)
gx,gy=gradient(data)
Another thing I can add is that my grid is not square but has the shape of a polygon, being the Brillouin zone of a 2D crystal.
I've understood that numpy.gradient works properly only on a square grid of values, which is not what I'm looking for. Even if I turned my data into a grid, it would have lots of zeroes outside the polygon of my original data, and that would add very large vectors to my gradient, hurting the precision of the calculation. This module seems to me more a toy than a tool; it has severe limitations, IMHO.
Problem solved using dictionaries.
You need to give gradient a matrix that describes your angular frequency values for your (x,y) points. e.g.
def f(x,y):
    return np.sin((x + y))
x = y = np.arange(-5, 5, 0.05)
X, Y = np.meshgrid(x, y)
zs = np.array([f(x,y) for x,y in zip(np.ravel(X), np.ravel(Y))])
Z = zs.reshape(X.shape)
gy,gx = np.gradient(Z,0.05,0.05) # axis 0 (rows) varies with y, axis 1 (columns) with x
Plotting Z as a surface shows the function [figure omitted].
Here is how to interpret your gradient:
gx is a matrix that gives the change dz/dx at all points, e.g. gx[0][0] is dz/dx at (x0,y0). Visualizing gx helps in understanding [figure omitted].
Since my data was generated from f(x,y) = sin(x+y), gy looks the same.
Here is a more obvious example using f(x,y) = sin(x) [figures omitted: f(x,y) and the gradients].
Update: let's take a look at the xy pairs.
This is the code I used:
def f(x,y):
    return np.sin(x)
x = y = np.arange(-3,3,.05)
X, Y = np.meshgrid(x, y)
zs = np.array([f(x,y) for x,y in zip(np.ravel(X), np.ravel(Y))])
xy_pairs = np.array([str(x)+','+str(y) for x,y in zip(np.ravel(X), np.ravel(Y))])
Z = zs.reshape(X.shape)
xy_pairs = xy_pairs.reshape(X.shape)
gy,gx = np.gradient(Z,.05,.05)
Now we can look and see exactly what is happening. Say we wanted to know which point is associated with the value at Z[20][30]. Then...
>>> Z[20][30]
-0.99749498660405478
And the point is
>>> xy_pairs[20][30]
'-1.5,-2.0'
Is that right? Let's check.
>>> np.sin(-1.5)
-0.99749498660405445
Yes.
And what are our gradient components at that point?
>>> gy[20][30]
0.0
>>> gx[20][30]
0.070707731517679617
Do those check out?
dz/dy always 0 check.
dz/dx = cos(x) and...
>>> np.cos(-1.5)
0.070737201667702906
Looks good.
You'll notice they aren't exactly equal. That is because my Z data isn't continuous: there is a step size of 0.05, and gradient can only approximate the rate of change with finite differences.
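For data that is already stored as rows of [x, y, value], as in the question, one possible route is to scatter the values back onto a 2D array before calling np.gradient. The sketch below assumes the points lie on a uniform rectangular lattice. Marking lattice sites that have no data with NaN instead of zero is an assumption made here, so that points outside a polygonal region would not create artificially large gradients.

```python
import numpy as np

# Rows of [x, y, f(x, y)], built like the example in the question
data = np.array([[i, j, i**2 + j**2] for i in range(10) for j in range(10)],
                dtype=float)

# Sorted unique coordinates, plus the grid index of every row along each axis
xs, ix = np.unique(data[:, 0], return_inverse=True)
ys, iy = np.unique(data[:, 1], return_inverse=True)

# Scatter the values onto a (ny, nx) grid; sites without data stay NaN
Z = np.full((ys.size, xs.size), np.nan)
Z[iy, ix] = data[:, 2]

# Uniform grid spacing in each direction
stepx = (xs[-1] - xs[0]) / (xs.size - 1)
stepy = (ys[-1] - ys[0]) / (ys.size - 1)

# Axis 0 of Z runs along y and axis 1 along x, so the y-derivative comes first
gy, gx = np.gradient(Z, stepy, stepx)
```

At interior points gx[j, i] and gy[j, i] should equal 2*xs[i] and 2*ys[j] for this quadratic test function. Any gradient entry whose stencil touches a NaN site becomes NaN and can be masked out afterwards.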

retrieving data from a plot in python?

suppose I have
t= [0,7,10,17,23,29,31]
f_t= [4,3,11,19,12,9,17]
and I have plotted f_t vs t.
Now from plotting these 7 data points, I want to retrieve 100 data points and save them in a text file. What do I have to do?
Note that I am not asking about the fitting of the plot; I know between two points the plot is linear.
What I am asking is: if I create an array like t=np.arange(0,31,.1), what is the corresponding array of f_t that agrees with the previous plot? That is, for any t between t=0 and t=7, f_t should be determined by the straight line connecting (0,4) and (7,3), and so on.
You should use a linear regression, which gives you a straight-line formula from which you can evaluate as many points as you want.
If the line is more of a curve, then you should try a polynomial regression of higher degree.
ie:
import pylab
import numpy
py_x = [0,7,10,17,23,29,31]
py_y = [4,3,11,19,12,9,17]
x = numpy.asarray(py_x)
y = numpy.asarray(py_y)
poly = numpy.polyfit(x,y,1) # 1 is the degree here. If you want curves, put 2, 3 or 5...
poly now holds the coefficients of the polynomial, which you can use to calculate other points.
for z in range(100):
    print(numpy.polyval(poly, z)) # this returns the fitted value f(z)
The function np.interp will do linear interpolation between your data points:
f2 = np.interp(np.arange(0,31,.1), t, f_t)
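Putting the np.interp approach together with the request for 100 points in a text file, a minimal sketch could look like this. The output filename interpolated.txt and the two-column layout are assumptions for the example.

```python
import numpy as np

t = [0, 7, 10, 17, 23, 29, 31]
f_t = [4, 3, 11, 19, 12, 9, 17]

# 100 evenly spaced sample positions covering the original range of t
t_new = np.linspace(t[0], t[-1], 100)

# Piecewise-linear interpolation: each value lies on the straight line
# joining the two neighbouring data points
f_new = np.interp(t_new, t, f_t)

# Save as two columns (t, f_t); "interpolated.txt" is a hypothetical filename
np.savetxt("interpolated.txt", np.column_stack((t_new, f_new)), header="t f_t")
```

np.loadtxt("interpolated.txt") reads the two columns back; the header line is written as a comment and is skipped on loading.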
