Retrieving data from a plot in Python?

suppose I have
t= [0,7,10,17,23,29,31]
f_t= [4,3,11,19,12,9,17]
and I have plotted f_t vs t.
Now from plotting these 7 data points, I want to retrieve 100 data points and save them in a text file. What do I have to do?
Note that I am not asking about the fitting of the plot; I know between two points the plot is linear.
What I am asking is this: if I create an array like t=np.arange(0,31,.1), what is the corresponding array of f_t that agrees with the previous plot? That is, for any t between t=0 and t=7, f_t should be determined by the straight line connecting (0,4) and (7,3), and so on.

You should use a linear regression, which gives you a straight-line formula from which you can compute as many points as you want.
If the line is more of a curve, try a polynomial regression of higher degree.
For example:
import pylab
import numpy
py_x = [0,7,10,17,23,29,31]
py_y = [4,3,11,19,12,9,17]
x = numpy.asarray(py_x)
y = numpy.asarray(py_y)
poly = numpy.polyfit(x,y,1) # 1 is the degree here. If you want curves, put 2, 3 or 5...
poly now holds the polynomial coefficients you can use to calculate other points.
for z in range(100):
    print(numpy.polyval(poly, z)) # this returns the fitted f(z)

The function np.interp will do linear interpolation between your data points:
f2 = np.interp(np.arange(0,31,.1), t, f_t)
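A minimal sketch of the full round trip asked about in the question (interpolate to 100 points, then save them to a text file). The output file name and the use of the system temp directory are assumptions made for illustration; np.interp and np.savetxt are the only library calls involved.

```python
import os
import tempfile
import numpy as np

t = [0, 7, 10, 17, 23, 29, 31]
f_t = [4, 3, 11, 19, 12, 9, 17]

# 100 evenly spaced sample positions covering the plotted range
t_new = np.linspace(t[0], t[-1], 100)

# Piecewise-linear interpolation between the original data points
f_new = np.interp(t_new, t, f_t)

# Hypothetical output location; any writable path works
out_path = os.path.join(tempfile.gettempdir(), "f_t_interpolated.txt")

# Two columns per row: t and the interpolated f_t
np.savetxt(out_path, np.column_stack((t_new, f_new)), header="t f_t")
```

np.linspace is used here instead of np.arange so that the number of points is exactly 100 and the last point t=31 is included.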

Related

Find two points/derivatives on curves between which the line is straight/constant

I'm plotting x and y points. This results in a curved line: it first bends, then after some point it is straight, and after a while it bends again. I want to retrieve those two points. x is linear and y is plotted against x, but y does not depend linearly on x.
I tried matplotlib for plotting and numpy polynomial functions, and am currently looking into splines, but it seems that for these y needs to be directly dependent on x.
Your data is noisy, so you can't use a simple numerical derivative. Instead, as you may have found already, you should fit it with a spline and then check the curvature of the spline.
Keying off this answer, you can fit a spline and calculate the second derivative (curvature) like this:
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import UnivariateSpline
x = file['n']
y = file['Ds/2']
y_spline = UnivariateSpline(x, y)
x_range = np.linspace(x[0], x[-1], 1000) # or could use x_range = x
y_spline_deriv = y_spline.derivative(n=2)
curvature = y_spline_deriv(x_range)
Then you can find the start and end of the straight region like this:
straight_points = np.where(np.abs(curvature) <= 0.1)[0] # pick your threshold
start_idx = straight_points[0]
end_idx = straight_points[-1]
start_x = x_range[start_idx]
end_x = x_range[end_idx]
Alternatively, if you're mainly interested in finding the flattest part of the curve (as shown in your graphic), you could try calculating the first derivative and then finding regions where the slope is within some small amount of the minimum slope anywhere in the data. In that case, just substitute y_spline_deriv = y_spline.derivative(n=1) in the code above.
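Since the original data file is not available, here is a self-contained sketch of the same idea on synthetic data. The test curve y = x**3 (whose second derivative is 6*x), the threshold of 0.6, and s=0 (pure interpolation, which only makes sense because this synthetic data has no noise) are all assumptions chosen for illustration; for noisy measurements you would let the spline smooth instead.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Synthetic stand-in for the measured data (an assumption, not the asker's data):
# y = x**3 has second derivative 6*x, so it is "straight" only near x = 0.
x = np.linspace(-2.0, 2.0, 41)
y = x**3

# s=0 forces interpolation; for noisy data keep the default smoothing or tune s
y_spline = UnivariateSpline(x, y, k=3, s=0)

# Evaluate the second derivative of the spline on a fine grid
x_range = np.linspace(x[0], x[-1], 1001)
curvature = y_spline.derivative(n=2)(x_range)

# Indices where the second derivative is small enough to call the curve straight
threshold = 0.6  # hypothetical threshold; pick one that suits your data
straight_points = np.where(np.abs(curvature) <= threshold)[0]
start_x = x_range[straight_points[0]]
end_x = x_range[straight_points[-1]]
```

With this threshold the straight region should come out close to the interval from -0.1 to 0.1, since that is where |6*x| <= 0.6.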

How to randomly generate continuous functions

My objective is to randomly generate good-looking continuous functions, where "good-looking" means functions that can be recovered from their plots.
Essentially I want to generate random time series data for 1 second with 1024 samples per second. If I randomly choose 1024 values, the plot looks very noisy and nothing meaningful can be extracted from it. At the end I have attached plots of two sinusoids, one with a frequency of 3 Hz and another with a frequency of 100 Hz. I consider the 3 Hz cosine a good function because I can extract the time series back by looking at the plot. But the 100 Hz sinusoid is bad for me, as I can't recover the time series from the plot. So, with goodness of a time series defined this way, I want to randomly generate good-looking continuous functions/time series.
The method I am thinking of using is as follows (python language):
(1) Choose 32 points in x-axis between 0 to 1 using x=linspace(0,1,32).
(2) For each of these 32 points choose a random value using y=np.random.rand(32).
(3) Then I need an interpolation or curve-fitting method that takes (x,y) as input and outputs a continuous function, something like func=curve_fit(x,y).
(4) I can then obtain the time series by sampling from func.
Following are the questions that I have:
1) What is the best curve-fitting or interpolation method that I can use? It should also be available in Python.
2) Is there a better method to generate good-looking functions without using curve fitting or interpolation?
Edit
Here is the code I am currently using to generate random time series of length 1024. In my case I need to scale the function between 0 and 1 on the y-axis, hence l=0 and h=1. If that scaling is not needed, just uncomment one line in each function to randomize the scaling.
import numpy as np
from scipy import interpolate
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt

## Curve fitting technique
def random_poly_fit():
    l=0
    h=1
    degree = np.random.randint(2,11)
    c_points = np.random.randint(2,32)
    cx = np.linspace(0,1,c_points)
    cy = np.random.rand(c_points)
    z = np.polyfit(cx, cy, degree)
    f = np.poly1d(z)
    y = f(x)
    # l,h=np.sort(np.random.rand(2))
    y = MinMaxScaler(feature_range=(l,h)).fit_transform(y.reshape(-1, 1)).reshape(-1)
    return y

## Cubic Spline Interpolation technique
def random_cubic_spline():
    l=0
    h=1
    c_points = np.random.randint(4,32)
    cx = np.linspace(0,1,c_points)
    cy = np.random.rand(c_points)
    z = interpolate.CubicSpline(cx, cy)
    y = z(x)
    # l,h=np.sort(np.random.rand(2))
    y = MinMaxScaler(feature_range=(l,h)).fit_transform(y.reshape(-1, 1)).reshape(-1)
    return y

func_families = [random_poly_fit, random_cubic_spline]
func = np.random.choice(func_families)
x = np.linspace(0,1,1024)
y = func()
plt.plot(x,y)
plt.show()
Add sin and cosine signals
import numpy as np
import matplotlib.pyplot as plt
from numpy.random import randint
from sklearn.preprocessing import MinMaxScaler
x= np.linspace(0,1,1000)
for i in range(10):
    y = randint(0,100)*np.sin(randint(0,100)*x)+randint(0,100)*np.cos(randint(0,100)*x)
    y = MinMaxScaler(feature_range=(-1,1)).fit_transform(y.reshape(-1, 1)).reshape(-1)
    plt.plot(x,y)
plt.show()
Convolve sin and cosine signals
for i in range(10):
    y = np.convolve(randint(0,100)*np.sin(randint(0,100)*x), randint(0,100)*np.cos(randint(0,100)*x), 'same')
    y = MinMaxScaler(feature_range=(-1,1)).fit_transform(y.reshape(-1, 1)).reshape(-1)
    plt.plot(x,y)
plt.show()
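Another way to keep the result readable, in the sense of the 3 Hz versus 100 Hz example from the question, is to cap the highest frequency explicitly. The following is only a sketch under assumptions: the function name random_smooth_series, the default cutoff of 5 Hz, and the 1/frequency amplitude damping are all hypothetical choices, not something from the question or the answers.

```python
import numpy as np

def random_smooth_series(n_samples=1024, max_freq_hz=5, rng=None):
    """Random sum of low-frequency cosines over 1 second, scaled to [0, 1].

    Hypothetical helper: the cutoff and amplitude law are illustrative choices.
    """
    rng = np.random.default_rng() if rng is None else rng
    t = np.linspace(0.0, 1.0, n_samples, endpoint=False)

    # Integer frequencies 1..max_freq_hz, so nothing faster than the cutoff appears
    freqs = np.arange(1, max_freq_hz + 1)

    # Random amplitudes, damped with frequency so slow components dominate
    amps = rng.normal(size=freqs.size) / freqs
    phases = rng.uniform(0.0, 2.0 * np.pi, size=freqs.size)

    # Sum the components: one row per frequency, then add the rows up
    components = amps[:, None] * np.cos(2.0 * np.pi * freqs[:, None] * t + phases[:, None])
    y = components.sum(axis=0)

    # Min-max scale to [0, 1], as in the question's code
    y = (y - y.min()) / (y.max() - y.min())
    return t, y

t, y = random_smooth_series(rng=np.random.default_rng(0))
```

Raising max_freq_hz gives wigglier curves, and lowering it gives smoother ones. The result can be plotted with plt.plot(t, y) as in the code above.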

Curve fitting and Extrapolation for 3d plot in python

I want to extrapolate a 3D plot in Python using numpy/scipy. The extrapolation is to be done with curve fitting. Refer to the following data, which has different x and y sizes.
x = np.array([740,760,780,800,820,840,860,880,900,920,940,960]) # Pressure in mBar
y = np.array([1500,1800,2100,2400,2700,3000,3300,3600,3900]) # Rpm
# Fuel Amount in micro seconds
z = np.array([[1820,1820,1820,1820,2350,2820,3200,3440,3520,3600,3600,3600],
[1930,1930,1930,2170,2700,2880,3240,3580,3990,3990,3990,3990],
[1900,1900,2370,2680,2730,3050,3450,3760,3970,3970,3970,3970],
[2090,2090,2240,2410,2875,3180,3410,3935,4270,4270,4270,4270],
[1600,2180,2400,2700,2950,3290,3780,4180,4470,4470,4470,4470],
[2100,2280,2600,2880,3320,3640,4150,4550,4550,4550,4550,4550],
[2300,2460,2810,3170,3400,3900,4280,4760,4760,4760,4760,4760],
[2170,2740,3030,3250,3600,4100,4370,4370,4370,4370,4370,4370],
[2240,2580,2870,3275,3640,4050,4260,4260,4260,4260,4260,4260]])
SciPy has the scipy.interpolate.interp2d class, but it only interpolates if x and y are of the same size.
http://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.interpolate.interp2d.html
I want to extrapolate the curve at y axis points 900 & 1200 & at x axis point 720.
i.e.
xNew = (720,x)
yNew = (900,1200,y)
Since I don't have the function in the form z = f(x,y), how can curve fitting be done in Python for the above case to get the curve values at the required points?
You need to provide the grid on which your z values are constructed, i.e. something like
x=[[740,760,...,960],
.....
[740,760,...,960]]
and similarly for y. This can be achieved using numpy.meshgrid:
from scipy.interpolate import interp2d # needs an older SciPy; interp2d was removed in SciPy 1.14
xx,yy=np.meshgrid(x,y)
test_function=interp2d(xx,yy,z)
Using your data, I can execute test_function(720,900) and get a value of 1820, which is nearest neighbour extrapolation. If you need "better" extrapolation (whatever that means), you need to develop some kind of model function for your data and use the fitting methods inside scipy.
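On SciPy versions where interp2d is no longer available, a comparable result can be sketched with scipy.interpolate.RegularGridInterpolator. This swaps in a different technique from the answer above: with bounds_error=False and fill_value=None, points outside the grid are linearly extrapolated from the nearest grid cell rather than taking the nearest-neighbour value. Whether linear extrapolation is physically sensible for this fuel map is an assumption you would need to check.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

x = np.array([740,760,780,800,820,840,860,880,900,920,940,960])  # Pressure in mBar
y = np.array([1500,1800,2100,2400,2700,3000,3300,3600,3900])     # Rpm
z = np.array([[1820,1820,1820,1820,2350,2820,3200,3440,3520,3600,3600,3600],
              [1930,1930,1930,2170,2700,2880,3240,3580,3990,3990,3990,3990],
              [1900,1900,2370,2680,2730,3050,3450,3760,3970,3970,3970,3970],
              [2090,2090,2240,2410,2875,3180,3410,3935,4270,4270,4270,4270],
              [1600,2180,2400,2700,2950,3290,3780,4180,4470,4470,4470,4470],
              [2100,2280,2600,2880,3320,3640,4150,4550,4550,4550,4550,4550],
              [2300,2460,2810,3170,3400,3900,4280,4760,4760,4760,4760,4760],
              [2170,2740,3030,3250,3600,4100,4370,4370,4370,4370,4370,4370],
              [2240,2580,2870,3275,3640,4050,4260,4260,4260,4260,4260,4260]])

# z has shape (len(y), len(x)), so the grid axes are passed in (y, x) order.
# fill_value=None together with bounds_error=False enables extrapolation.
interp = RegularGridInterpolator((y, x), z, method="linear",
                                 bounds_error=False, fill_value=None)

# Extended axes from the question: x gains 720, y gains 900 and 1200
x_new = np.concatenate(([720], x))
y_new = np.concatenate(([900, 1200], y))

# Evaluate on the full extended grid; the result has shape (len(y_new), len(x_new))
yy, xx = np.meshgrid(y_new, x_new, indexing="ij")
z_new = interp(np.stack([yy, xx], axis=-1))
```

The original table is reproduced unchanged in z_new[2:, 1:], and the new first column and first two rows hold the extrapolated values.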

Higher order local interpolation of implicit curves in Python

Given a set of points describing some trajectory in the 2D plane, I would like to provide a smooth representation of this trajectory with local high order interpolation.
For instance, say we define a circle in 2D with 11 points, as in the figure below. I would like to add points between each consecutive pair of points in order to produce a smooth trace. Adding points on every segment is easy enough, but it produces the slope discontinuities typical of a "local linear interpolation". Of course it is not an interpolation in the classical sense, because (1) the function can have multiple y values for a given x, and (2) simply adding more points on the trajectory would be fine (no continuous representation is needed), so I'm not sure what the proper vocabulary for this would be.
The code to produce this figure can be found below. The linear interpolation is performed with the lin_refine_implicit function. I'm looking for a higher order solution to produce a smooth trace and I was wondering if there is a way of achieving it with classical functions in Scipy? I have tried to use various 1D interpolations from scipy.interpolate without much success (again because of multiple y values for a given x).
The end goal is to use this method to provide a smooth GPS trajectory from discrete measurements, so I would think this should have a classical solution somewhere.
import numpy as np
import matplotlib.pyplot as plt
def lin_refine_implicit(x, n):
    """
    Given a 2D ndarray (npt, m) of npt coordinates in m dimensions, insert 2**(n-1) - 1 additional points on each trajectory segment
    Returns an ((npt-1)*2**(n-1) + 1, m) ndarray
    """
    if n > 1:
        m = 0.5*(x[:-1] + x[1:])
        if x.ndim == 2:
            msize = (x.shape[0] + m.shape[0], x.shape[1])
        else:
            raise NotImplementedError
        x_new = np.empty(msize, dtype=x.dtype)
        x_new[0::2] = x
        x_new[1::2] = m
        return lin_refine_implicit(x_new, n-1)
    elif n == 1:
        return x
    else:
        raise ValueError
n = 11
r = np.arange(0, 2*np.pi, 2*np.pi/n)
x = 0.9*np.cos(r)
y = 0.9*np.sin(r)
xy = np.vstack((x, y)).T
xy_highres_lin = lin_refine_implicit(xy, n=3)
plt.plot(xy[:,0], xy[:,1], 'ob', ms=15.0, label='original data')
plt.plot(xy_highres_lin[:,0], xy_highres_lin[:,1], 'dr', ms=10.0, label='linear local interpolation')
plt.legend(loc='best')
plt.plot(x, y, '--k')
plt.xlabel('X')
plt.ylabel('Y')
plt.title('GPS trajectory')
plt.show()
This is called parametric interpolation.
scipy.interpolate.splprep provides spline approximations for such curves. This assumes you know the order in which the points are on the curve.
If you don't know which point comes after which on the curve, the problem becomes more difficult. I think in this case, the problem is called manifold learning, and some of the algorithms in scikit-learn may be helpful in that.
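The splprep suggestion can be sketched as follows for the 11-point circle from the question. Closing the curve by repeating the first point and requesting a periodic spline with per=True are assumptions that fit a closed loop; for an open GPS track you would drop both. np.linspace with endpoint=False is used instead of np.arange to guarantee exactly 11 distinct points.

```python
import numpy as np
from scipy.interpolate import splprep, splev

# The 11 points on a circle of radius 0.9, in order along the curve
n = 11
r = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
x = 0.9 * np.cos(r)
y = 0.9 * np.sin(r)

# Close the loop by repeating the first point (needed for a periodic spline)
x_closed = np.append(x, x[0])
y_closed = np.append(y, y[0])

# Parametric cubic spline through the points: s=0 means interpolate exactly,
# per=True makes the curve join smoothly where it closes
tck, u = splprep([x_closed, y_closed], s=0, per=True)

# Evaluate the curve at 200 parameter values between 0 and 1
u_fine = np.linspace(0.0, 1.0, 200)
x_fine, y_fine = splev(u_fine, tck)

# Distance of each refined point from the origin; should stay near 0.9
radius = np.hypot(x_fine, y_fine)
```

Because x and y are both treated as functions of the curve parameter u, it does not matter that the curve has several y values for a given x.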
I would suggest transforming your Cartesian coordinates into polar coordinates. That should let you use the standard scipy.interpolate routines without issues, since the x->y mapping is no longer ambiguous.

How do I limit the interpolation region in the InterpolatedUnivariateSpline in Python when given non-uniform samples?

I'm trying to get a nice upsampler using Python when I have non-uniform spaced inputs. Any suggestions would be helpful. I've tried a number of interp functions. Here's an example:
from scipy.interpolate import InterpolatedUnivariateSpline
from numpy import linspace, arange, append
from matplotlib.pyplot import plot, show
F=[0, 1000,1500,2000,2500,3000,3500,4000,4500,5000,5500,22050]
M=[0.,2.85,2.49,1.65,1.55,1.81,1.35,1.00,1.13,1.58,1.21,0.]
ff=linspace(F[0],F[1],10)
for i in arange(2, len(F)):
    ff=append(ff,linspace(F[i-1],F[i], 10))
aa=InterpolatedUnivariateSpline(x=F,y=M,k=2)
mm=aa(ff)
plot(F,M,'r-o'); plot(ff,mm,'bo'); show()
This is the plot I get:
I need to get interpolated values that don't go below 0. Note that the blue dots go below zero. The red line represents the original F vs. M data. If I use k=1 (piece-wise linear interp) then I get good values as shown here:
aa=InterpolatedUnivariateSpline(x=F,y=M,k=1)
mm=aa(ff); plot(F,M,'r-o');plot(ff,mm,'bo'); show()
The problem is that I need a "smooth" interpolation and not the piecewise-linear values. Does anyone know if the bbox argument in InterpolatedUnivariateSpline helps to fix that? I can't find any documentation on what bbox does. Is there another, easier way to accomplish this?
Thanks in advance for any help.
Positivity-preserving interpolation is hard (if it wasn't, there wouldn't be a bunch of papers written about it). The splines of low degree (2, 3) usually do pretty well in this regard, but your data has that large gap in it, and it happens to be at the end of data range, making things worse.
One solution is to do interpolation in two steps: first upsample the data by piecewise linear interpolation, then interpolate new data with a smooth spline (I'll use cubic spline below, though quadratic also works).
The gap_size array records how large each gap is, relative to the smallest one. In the subsequent loop, uniformly spaced points are placed in large gaps (those that are at least twice the size of the smallest one). The result is F_new, a better, nearly uniform grid that still includes the original points. The corresponding M values for it are generated by a piecewise linear spline.
Subsequent cubic interpolation produces a smooth curve that stays positive.
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import InterpolatedUnivariateSpline
F = [0, 1000,1500,2000,2500,3000,3500,4000,4500,5000,5500,22050]
M = [0.,2.85,2.49,1.65,1.55,1.81,1.35,1.00,1.13,1.58,1.21,0.]
gap_size = np.diff(F) // np.diff(F).min()
F_new = []
for i in range(len(F)-1):
    F_new.extend(np.linspace(F[i], F[i+1], gap_size[i], endpoint=False))
F_new.append(F[-1])
pl_spline = InterpolatedUnivariateSpline(F, M, k=1)
M_new = pl_spline(F_new)
smooth_spline = InterpolatedUnivariateSpline(F_new, M_new, k=3)
ff = np.linspace(F[0], F[-1], 100)
plt.plot(F, M, 'ro')
plt.plot(ff, smooth_spline(ff), 'b')
plt.show()
Of course, no tricks can hide the truth that we don't know what happens between 5500 and 22050 (Hz, I presume), the nearly-linear part is just a placeholder.
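A different technique that may also be worth trying here, not the two-step method described above, is shape-preserving cubic interpolation with scipy.interpolate.PchipInterpolator. PCHIP does not overshoot between neighbouring data points, so non-negative data gives a non-negative curve. The trade-off is that the result has a continuous first derivative only, so it is somewhat less smooth than a cubic spline. This is a sketch on the same F and M data:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

F = [0, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, 5500, 22050]
M = [0., 2.85, 2.49, 1.65, 1.55, 1.81, 1.35, 1.00, 1.13, 1.58, 1.21, 0.]

# Monotone piecewise-cubic interpolant through the original points
pchip = PchipInterpolator(F, M)

# Upsample on a uniform grid over the whole frequency range
ff = np.linspace(F[0], F[-1], 500)
mm = pchip(ff)
```

The values in mm can be plotted against ff exactly as in the code above. The long stretch between 5500 and 22050 is still only a placeholder, as with the two-step approach.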
