I have two arrays of the same length, say array x and array y. I want to find the value of y corresponding to x=0.56. This is not a value present in array x.
I would like Python to find, by itself, the closest value larger than 0.56 (and its corresponding y value) and the closest value smaller than 0.56 (and its corresponding y value), then simply interpolate between them to find the value of y when x = 0.56.
This is easy when I find the indices of the two bracketing x values (and their corresponding y values) by hand and hard-code them (see the following bit of code).
But is there any way for Python to find the indices by itself?
# interpolation: interpolate h (the y data) at g = 0.56, where g is the x data
def effective_height(h1, h2, g1, g2):
    return h1 + ((0.56 - g1) / (g2 - g1)) * (h2 - h1)

# x[12] < 0.56 < x[13], so the y values go in as h and the x values as g
eff_alt1 = effective_height(y[12], y[13], x[12], x[13])
In this bit of code, I had to find the indices [12] and [13] myself, corresponding to the closest value smaller than 0.56 and the closest value larger than 0.56.
Now I am looking for a similar technique where I would just tell Python to interpolate at x = 0.56 and print the corresponding value of y.
I have looked at scipy's interpolate but don't think it would help in this case, although further clarification on how I can use it in my case would be helpful too.
Does NumPy's interp do what you want?
import numpy as np
x = [0,1,2]
y = [2,3,4]
np.interp(0.56, x, y)   # 2.56
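If you specifically want Python to find the bracketing indices by itself, np.searchsorted can do that; here is a minimal sketch, assuming x is sorted in ascending order:
import numpy as np
x = [0, 1, 2]
y = [2, 3, 4]
# index of the closest value larger than 0.56 (x must be sorted ascending)
i = np.searchsorted(x, 0.56)   # here i = 1, so x[0] < 0.56 <= x[1]
# manual linear interpolation between the two bracketing points
y_interp = y[i-1] + (0.56 - x[i-1]) / (x[i] - x[i-1]) * (y[i] - y[i-1])
print(y_interp)   # 2.56, matching np.interp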
Given your two arrays, x and y, you can do something like the following using SciPy.
from scipy.interpolate import InterpolatedUnivariateSpline
spline = InterpolatedUnivariateSpline(x, y, k=5)
spline(0.56)
The keyword k must be between 1 and 5 and controls the degree of the spline; note that the data must contain more than k points, so k=5 needs at least six samples.
Example:
>>> x = range(10)
>>> y = range(0, 100, 10)
>>> spline = InterpolatedUnivariateSpline(x, y, k=5)
>>> spline(0.56)
array(5.6000000000000017)
Related
I am given selected percentile values (5th, 10th, 25th, 50th, and so on) and need to find what percentile a given value falls at. I have tried scipy and numpy, but have run into a problem: it is not uncommon for multiple percentiles to share the same value (for example, a value of 0 all the way up to the 50th percentile). When I interpolate, it always returns the highest percentile, which introduces a skew into my bulk stats. A quick example is below: x holds the percentile values, y holds the corresponding percentiles, and 0.0 is the value I would be interpolating at. The interpolation functions seem fairly limited here since I have repeated x values.
import numpy as np
from scipy.interpolate import interp1d

x = [0.0, 0.0, 0.0, 0.0, 0.05, 0.2, 0.5]
y = [5, 10, 25, 50, 75, 90, 95]
interp = interp1d(x, y, kind='slinear', fill_value='extrapolate')
z2 = np.interp(0.0, x, y, left=0, right=100).round(1)
z = interp(0.0)
print(z)
print(z2)
In this case, both z and z2 return 50.0, when I expect/want 0.0 or 5.0 (depending on extrapolation). Is there any way to force these to return the minimum possible value, the middle possible value, or some other way to accomplish this?
Both np.interp() and scipy.interpolate.interp1d() require the x values to be strictly increasing (i.e. x[i+1] > x[i]), and may return nonsense if they are not. If you want some specific behavior, you need to preprocess your data to get rid of any repeated x values. For example:
import numpy as np

# assuming x and y are already sorted by x
x_fixed, indices = np.unique(x, return_index=True)   # indices = first occurrence of each distinct x
y_fixed = [np.min(vals) for vals in np.split(np.asarray(y), indices[1:])]   # keep the smallest y per x
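For example, with the data from the question this keeps the smallest y for each repeated x, after which np.interp behaves as expected:
import numpy as np
x = [0.0, 0.0, 0.0, 0.0, 0.05, 0.2, 0.5]
y = [5, 10, 25, 50, 75, 90, 95]
x_fixed, indices = np.unique(x, return_index=True)
y_fixed = [np.min(vals) for vals in np.split(np.asarray(y), indices[1:])]
print(np.interp(0.0, x_fixed, y_fixed))   # 5.0, the smallest y at x = 0.0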
Initially, I have two arrays that correspond to the values of x and y of some function; I don't know that function, only that the values of y depend on x. From these I calculate a new function that depends on both arrays.
I need to calculate, in Python, the integral of that last function to obtain the total area under the curve between the first value of x and the last. Any idea how to do that?
import math

x = [...]   # array of x values
y = [...]   # corresponding array of y values
a = 2.839 * 10**25
b = 4 * math.pi
alpha = 0.5
z = 0.003642

def L(x, y, a, b, alpha, z):
    return x * ((y * b * a) / (1 + z)**(1 + alpha))
Your function is a function of x (in that given a value of x it spits out a value), so first you should repackage it as such: introduce a function yy which, given x, produces the corresponding y (e.g. by interpolating between the sampled values), then write LL(x) = L(x, yy(x)), then use scipy.integrate to integrate it.
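A minimal sketch of that idea, assuming x is sorted ascending, with interp1d standing in for yy and scipy.integrate.quad doing the integration:
import math
from scipy.integrate import quad
from scipy.interpolate import interp1d

a = 2.839 * 10**25
b = 4 * math.pi
alpha = 0.5
z = 0.003642

def L(x, y, a, b, alpha, z):
    return x * ((y * b * a) / (1 + z)**(1 + alpha))

yy = interp1d(x, y)   # given a value of x, produce the corresponding y

def LL(xx):
    return L(xx, yy(xx), a, b, alpha, z)

area, err = quad(LL, x[0], x[-1])   # integrate from the first x to the last
If the sampled points are all you need, np.trapz(L(np.asarray(x), np.asarray(y), a, b, alpha, z), x) is an even simpler alternative that skips the interpolation step.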
I'm a MATLAB user and I'm trying to translate some code into Python as an assignment. Since I noticed some differences between the two languages in the 3D interpolation results from my original code, I am trying to pin down the issue by analysing a simple example.
I set up a 2x2x2 matrix (named blocc below) with some values, and its coordinates in three vectors (X, Y, Z). Given a query point, I use 3D linear interpolation to find the interpolated value. Again, I get different results in MATLAB and Python (code below).
Python
import numpy as np
import scipy.interpolate as si
X, Y, Z = np.array([1, 2]), np.array([1, 2]), np.array([1, 2])
a = np.ones((2, 2, 1))
b = np.ones((2, 2, 1)) * 2
blocc = np.concatenate((a, b), axis=2)  # matrix with values
blocc[1, 0, 0] = 7
blocc[0, 1, 1] = 7
qp = np.array([2, 1.5, 1.5])  # my query point
value = si.interpn((X, Y, Z), blocc, qp, method='linear')
print(value)
Here I get value=3
MATLAB
blocc = zeros(2,2,2);
blocc(:,:,1) = ones(2,2);
blocc(:,:,2) = ones(2,2)*2;
blocc(2,1,1)=7;
blocc(1,2,2)=7;
X=[1,2];
Y=[1,2];
Z=[1,2];
qp = [2 1.5 1.5];
value=interp3(X,Y,Z,blocc,qp(1),qp(2),qp(3),'linear')
And here value=2.75
I can't understand why: I think there is something I don't get about how interpolation and/or matrix indexing work in Python. Can you please make it clear for me? Thanks!
Apparently, when X, Y and Z are vectors, MATLAB considers the order of the dimensions in the values array to be (Y, X, Z). From the documentation:
V — Sample values (array)
Sample values, specified as a real or complex array. The size requirements for V depend on the size of X, Y, and Z:
If X, Y, and Z are arrays representing a full grid (in meshgrid format), then the size of V matches the size of X, Y, or Z .
If X, Y, and Z are grid vectors, then size(V) = [length(Y) length(X) length(Z)].
This means that, to get the same result in Python, you just need to swap the first and second values in the query:
qp = np.array([1.5, 2, 1.5])
f = si.interpn((X, Y, Z), blocc, qp, method='linear')
print(f)
# [2.75]
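Equivalently, you could leave the query point as it was and instead swap the first two axes of the values array so its layout matches MATLAB's (Y, X, Z) convention; a minimal sketch:
import numpy as np
import scipy.interpolate as si
# lay the value grid out as (Y, X, Z), as MATLAB's interp3 expects for grid vectors
blocc_yxz = np.swapaxes(blocc, 0, 1)
qp = np.array([2, 1.5, 1.5])   # the original query point, untouched
f = si.interpn((X, Y, Z), blocc_yxz, qp, method='linear')
print(f)
# [2.75]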
I have 3 arrays. One array contains x-values, the second array contains y-values, and the third array contains values for sigma (errors).
How can I use the numpy.polyfit function to fit for x, y, and sigma? I have figured out how to fit the x and y values but not sigma.
import numpy as np

p = np.polyfit(x, y, 2)
xp = np.linspace(0.4, 1, 40)
yp = np.polyval(p, xp)   # evaluate the fit (renamed so it doesn't overwrite y)
Use the w parameter, as described in the numpy.polyfit documentation:
p = np.polyfit(x,y,2,w=1/sigma)
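A minimal end-to-end sketch with made-up data (the values of x, y, and sigma below are purely illustrative):
import numpy as np
# made-up data for illustration
x = np.linspace(0.4, 1.0, 10)
y = 2.0 * x**2 - x + 0.5 + np.random.normal(0, 0.05, x.size)
sigma = np.full(x.size, 0.05)     # one-sigma error on each y value
# weight each point by 1/sigma, as recommended for Gaussian uncertainties
p = np.polyfit(x, y, 2, w=1/sigma)
xp = np.linspace(0.4, 1.0, 40)
yp = np.polyval(p, xp)            # evaluate the weighted fit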
I have some x and y data, where for every entry in the x vector there's a corresponding entry in the y vector. Furthermore, the x data are not evenly spaced.
I'd like to interpolate between the x samples to obtain an even spacing in the x dimension, and to approximate the corresponding y values. SciPy's interp1d seems like a natural solution, but my problem has a caveat: the x values are not monotonically increasing (because both x and y are functions of time). The interp1d function, and the other functions in the interpolate module, thus give weird results at those points where x reverses direction.
What I'd really like to do is simply fit a straight line between every set of two adjacent x points and then interpolate based on this very local approximation. Is there a function to do this in numpy or do I have to rig something up myself?
Could you sort your xy pairs and then use interp1d? Something like this?
xy = sorted(zip(x, y), key=lambda pair: pair[0])   # sort the pairs by their x value
x = [p[0] for p in xy]
y = [p[1] for p in xy]
Now your x's are monotonically increasing and the relationships have been preserved.
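If x and y are NumPy arrays, the same idea with np.argsort avoids the zip/unzip dance:
import numpy as np
order = np.argsort(x)             # indices that put x in ascending order
x_sorted = np.asarray(x)[order]
y_sorted = np.asarray(y)[order]   # reorder y the same way so the pairs stay matched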