How to get piecewise linear function in Python - python

I would like to get a piecewise linear function from a set of points. Here is a visual example:
import matplotlib.pyplot as plt
x = [1,2,7,9,11]
y = [2,5,9,1,11]
plt.plot(x, y)
plt.show()
So I need a function that takes two lists and returns a piecewise linear function. I do not need regression or any kind of least-squares fit.
I can try to write it myself, but I wonder if something is already written. So far, I have only found code that returns a regression.

Try np.interp. It linearly interpolates between the given sample points.
Here is a small example.
>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> x = [1,2,7,9,11]
>>> y = [2,5,9,1,11]
>>> np.interp([1.5, 3], x, y)
array([ 3.5, 5.8])
One caution: make sure the x-coordinates of the sample points are increasing; np.interp does not check this.
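If you want an actual callable back, as the question asks, np.interp can be wrapped in a closure. This is only a minimal sketch: the name piecewise_linear is made up for illustration, and sorting the points by x is an assumption about what you would want for unordered input.

```python
import numpy as np

def piecewise_linear(x, y):
    """Return a callable that linearly interpolates the points (x, y).

    Hypothetical helper, not part of NumPy.
    """
    xs = np.asarray(x, dtype=float)
    ys = np.asarray(y, dtype=float)
    # np.interp needs increasing x, so sort the points once up front
    order = np.argsort(xs)
    xs, ys = xs[order], ys[order]

    def f(t):
        # Outside [xs[0], xs[-1]] np.interp returns the end values
        return np.interp(t, xs, ys)

    return f

f = piecewise_linear([1, 2, 7, 9, 11], [2, 5, 9, 1, 11])
values = f([1.5, 3])
```

The returned f accepts scalars or arrays, just like np.interp does.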

Related

Python generate random right skewed gaussian with constraints

I need to generate a unit curve that is going to look like a right skewed gaussian and I have the following constraints:
The X axis is Days (variable but usually 45+)
All values on the Y axis sum to 1
The peak will always occur around day 4 or 5
Example:
Is there a way to do this programmatically in python?
As noted by @Severin, a gamma distribution looks to be a reasonable fit, e.g.:
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats as sps
x = np.arange(75)  # day indices 0..74
plt.plot(x, sps.gamma.pdf(x, 4), '.-')
plt.show()
If the values really need to sum to 1, rather than integrate to 1, I'd evaluate the cdf and then use np.diff on the result.
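The cdf-and-diff idea could look roughly like this. It is a sketch under assumptions: n_days = 45 and the shape parameter 4 are just illustrative choices, and the final renormalization is there because the cdf has not quite reached 1 at the last day.

```python
import numpy as np
import scipy.stats as sps

n_days = 45                       # assumed number of days
edges = np.arange(n_days + 1)     # bin edges 0, 1, ..., n_days
cdf = sps.gamma.cdf(edges, 4)     # gamma cdf with shape parameter 4

# Probability mass falling into each one-day bin
weights = np.diff(cdf)

# Renormalize so the truncated tail does not leave the sum below 1
weights /= weights.sum()
```

Each entry of weights is the mass of one day, so plotting weights against np.arange(1, n_days + 1) gives the unit curve.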

Scipy: efficiently generate a series of integration (integral function)

I have a function, I want to get its integral function, something like this:
That is, instead of getting a single integration value at point x, I need to get values at multiple points.
For example:
Let's say I want the range at (-20,20)
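import numpy as np
import matplotlib.pyplot as plt
from scipy import integrate
def f(x):
    return x**2
x_vals = np.arange(-20, 21, 1)
# nquad returns (value, error estimate); keep only the value
y_vals = [integrate.nquad(f, [[0, x_val]])[0] for x_val in x_vals]
plt.plot(x_vals, y_vals, '-', color='r')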
The problem
In the example code above, the integration is done from scratch for each point. In my real code, f(x) is pretty complex and it is a multiple integral, so the running time is simply too slow (see "Scipy: speed up integration when doing it for the whole surface?").
I'm wondering if there is an efficient way of generating Phi(x) over a given range.
My thoughts:
The integration value Phi(20) is calculated from Phi(19), and Phi(19) from Phi(18), and so on. So when we get Phi(20), in reality we also get the whole series (-20,-19,-18,-17 ... 18,19,20), except that we didn't save the values.
So I'm thinking: is it possible to create save points for an integration function, so that when it passes a save point the value gets saved and it continues to the next point? That way, a single pass toward 20 would also give the values at (-20,-19,-18,-17 ... 18,19,20).
One could implement the strategy you outlined by integrating only over the short intervals (between consecutive x-values) and then taking the cumulative sum of the results. Like this:
import numpy as np
import scipy.integrate as si
def f(x):
    return x**2
x_vals = np.arange(-20, 21, 1)
pieces = [si.quad(f, x_vals[i], x_vals[i+1])[0] for i in range(len(x_vals)-1)]
y_vals = np.cumsum([0] + pieces)
Here pieces are the integrals over short intervals, which get summed to produce y-values. As written, this code outputs a function that is 0 at the beginning of the range of integration which is -20. One can, of course, subtract the y-value that corresponds to x=0 in order to have the same normalization as on your plot.
That said, the split-and-sum process is unnecessary. When you find an indefinite integral of f, you are really solving the differential equation F' = f. And SciPy has a built-in method for that, odeint. Just use it:
import numpy as np
import scipy.integrate as si
def f(x):
    return x**2
x_vals = np.arange(-20, 21, 1)
y_vals = si.odeint(lambda y,x: f(x), 0, x_vals)
The output is essentially identical to the first version (within tiny computational errors), with less code. The reason for using lambda y,x: f(x) is that the first argument of odeint must be a function taking two arguments, the right-hand side of the equation y' = f(y, x).
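As a sanity check, both approaches can be compared against the known antiderivative of x**2. This is just a sketch for this particular f; the names y_sum, y_ode and exact are introduced here for illustration.

```python
import numpy as np
import scipy.integrate as si

def f(x):
    return x**2

x_vals = np.arange(-20, 21, 1)

# Approach 1: integrate over each short interval, then accumulate
pieces = [si.quad(f, a, b)[0] for a, b in zip(x_vals[:-1], x_vals[1:])]
y_sum = np.cumsum([0.0] + pieces)

# Approach 2: solve F' = f with F(-20) = 0; odeint returns a column, so flatten it
y_ode = si.odeint(lambda y, x: f(x), 0.0, x_vals).ravel()

# Exact integral from x_vals[0] to x for f(x) = x**2
exact = (x_vals**3 - x_vals[0]**3) / 3.0

max_err_sum = np.max(np.abs(y_sum - exact))
max_err_ode = np.max(np.abs(y_ode - exact))
```

Printing max_err_sum and max_err_ode shows how far each method is from the analytic result on this grid.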
For the equivalent version of user3717023's answer using scipy's solve_ivp you need to keep in mind the different ordering of x and y in the function f (different from the odeint version).
Further, keep in mind that you can only compute the solution up to a constant. So you might want to shift the result according to some given condition. In the example here (with the function f(x)=x^2 as given by the OP), I shifted the numeric solution such that it goes through the origin, matching the simplest analytic solution F(x)=x^3/3.
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import solve_ivp
def f(x):
    return x**2
xs = np.linspace(-20, 20, 1001)
# This is the integration step:
sol = solve_ivp(lambda x, y: f(x), t_span=(xs[0], xs[-1]), y0=[0], t_eval=xs)
plt.plot(sol.t, sol.t**3/3, ls='-', c='C0', label="analytic: $F(x)=x^3/3$")
plt.plot(sol.t, sol.y[0], ls='--', c='C1', label="numeric solution")
plt.plot(sol.t, sol.y[0] - sol.y[0][sol.t.size//2], ls='-.', c='C3', label="shifted solution going through origin")
plt.legend()
In case you don't have an analytical version of the function f, but only xs and ys as data points, then you can use scipy's interp1d function to interpolate between the data points and pass on that interpolating function the same way as before:
from scipy.interpolate import interp1d
f = interp1d(xs, ys)
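If you only have sampled data and only need the running integral at those same sample points, a cumulative trapezoid rule may already be enough. A sketch, assuming a SciPy version that provides scipy.integrate.cumulative_trapezoid (older releases called it cumtrapz), with xs**2 standing in for real data:

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

xs = np.linspace(-20, 20, 1001)
ys = xs**2                      # stand-in for measured data

# Running integral starting at xs[0]; initial=0 keeps the same length as xs
F = cumulative_trapezoid(ys, xs, initial=0)

# Shift so that the curve passes through the origin (xs[500] == 0 here)
F = F - F[xs.size // 2]
```

This swaps the ODE solver for a plain trapezoid rule, so its accuracy depends on how finely the data are sampled.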

scipy.interp2d warning and large errors off the grid

I am trying to interpolate a 2-dimensional function and I am running into what I consider weird behavior by scipy.interpolate.interp2d. I don't understand what the problem is, and I'd be happy for any help or hints.
import numpy as np
from scipy.interpolate import interp2d
x = np.arange(10)
y = np.arange(20)
xx, yy = np.meshgrid(x, y, indexing = 'ij')
val = xx + yy
f = interp2d(xx, yy, val, kind = 'linear')
When I run this code, I get the following Warning:
scipy/interpolate/fitpack.py:981: RuntimeWarning: No more knots can be
added because the number of B-spline coefficients already exceeds the
number of data points m. Probable causes: either s or m too small.
(fp>s) kx,ky=1,1 nx,ny=18,15 m=200 fp=0.000000 s=0.000000
warnings.warn(RuntimeWarning(_iermess2[ierm][0] + _mess))
I don't understand why interp2d would use any splines when I tell it it should do linear interpolation. When I continue and evaluate f on the grid everything is good:
>>> f(1,1)
array([ 2.])
When I evaluate it off the grid, I get large errors, even though the function is clearly linear.
>>> f(1.1,1)
array([ 2.44361975])
I am a bit confused and I am not sure what the problem is. Did anybody run into similar problems? I used to work with matlab and this is almost 1:1 how I would do it there, but maybe I did something wrong.
When I use a square grid (i.e. y = np.arange(10)) everything works fine by the way, but that isn't what I need. When I use cubic instead of linear interpolation, the error gets smaller (that doesn't make much sense either since the function is linear) but is still unacceptably large.
I tried a couple of things and managed to get (kind of) what I want using scipy.interpolate.LinearNDInterpolator. However, I have to convert the grid to lists of points and values. Since the rest of my program stores coordinates and values in grid format, that is kind of annoying, so if possible I'd still like to get the original code to work properly.
import numpy as np
import itertools
from scipy.interpolate import LinearNDInterpolator
x = np.arange(10)
y = np.arange(20)
coords = list(itertools.product(x,y))
val = [sum(c) for c in coords]
f = LinearNDInterpolator(coords, val)
>>> f(1,1)
array(2.0)
>>> f(1.1,1)
array(2.1)
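For data that is already stored on a regular grid, scipy.interpolate.RegularGridInterpolator might avoid the conversion to point lists. This is a sketch of an alternative, not a fix for interp2d itself; it assumes the values array was built with indexing='ij', so that val[i, j] belongs to (x[i], y[j]).

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

x = np.arange(10, dtype=float)
y = np.arange(20, dtype=float)
xx, yy = np.meshgrid(x, y, indexing='ij')
val = xx + yy                     # shape (10, 20), matches (len(x), len(y))

# Linear interpolation is the default method
f = RegularGridInterpolator((x, y), val)

# Query points are given as rows of (x, y) coordinates
on_grid = f([[1.0, 1.0]])[0]
off_grid = f([[1.1, 1.0]])[0]
```

The grid axes are passed as a tuple of 1-D arrays, so the gridded val array can be used as it is.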

Using brentq with two lists of data

I am trying to find the root(s) of a line which is defined by data like:
x = [1,2,3,4,5]
y = [-2,4,6,8,4]
I have started by using interpolation but I have been told I can then use the brentq function. How can I use brentq from two lists? I thought continuous functions are needed for it.
As the documentation of brentq says, the first argument must be a continuous function. Therefore, you must first generate, from your data, a function that will return a value for each parameter passed to it. You can do that with interp1d:
import numpy as np
from scipy.interpolate import interp1d
from scipy.optimize import brentq
x, y = np.array([1,2,3,4,5]), np.array([-2,4,6,8,4])
f = interp1d(x,y, kind='linear') # change kind to something different if you want e.g. smoother interpolation
brentq(f, x.min(), x.max()) # returns: 1.33333
You could also use splines to generate the continuous function needed for brentq.
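A spline version might look like the sketch below. It assumes an interpolating cubic spline (s=0) is what you want, and it brackets the root between the first two samples because that is where the data change sign.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline
from scipy.optimize import brentq

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([-2, 4, 6, 8, 4], dtype=float)

# s=0 makes the spline pass exactly through the data points
spline = UnivariateSpline(x, y, k=3, s=0)

# y changes sign between x[0] and x[1], so that interval brackets a root;
# float(...) turns the 0-d array returned by the spline into a plain number
root = brentq(lambda t: float(spline(t)), x[0], x[1])
```

The root will generally differ a little from the linear-interpolation result, since the curve between the samples is no longer a straight line.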

Python: Integration of Interpolation

I have a problem I can't solve:
#! /usr/bin/env python
import numpy as np
from scipy.interpolate import UnivariateSpline
from scipy.integrate import quad
import pylab as pl
x = ([0,10,20,30,40,50,60,70,...,4550,4560])
y = ([0,0,0,0,0,0,0,3,2,3,2,1,2,1,2,...,8,6,5,7,11,6,7,10,6,5,8,13,6,8,8,3])
s = UnivariateSpline(x, y, k=5, s=5)
xs = np.linspace(0, 4560, 4560)
ys = s(xs)
This is my code for interpolating some data.
In addition, I plotted this function.
But now I want to integrate it (from zero to infinity).
I tried
results = integrate.quad(ys, 0, 99999)
but it didn't work.
Can you give me some hints (or solutions), please? Thanks.
As Pierre GM said, you have to give a function for quad (I think also you can use np.inf for the upper bound, though here it doesn't matter as the splines go to 0 quickly anyways). However, what you want is:
s.integral(0, np.inf)
Since this is a spline, the UnivariateSpline object already implements an integral that should be better and faster.
According to the documentation of quad, you need to give a function as the first argument, followed by the lower and upper bounds of the integration range, and some extra arguments for your function (type help(quad) in your shell for more info).
You passed an array as first argument (ys), which is why it doesn't work. You may want to try something like:
results = quad(s, xs[0], xs[-1])
or
results = quad(s, 0, 9999)
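To see the two suggestions side by side, here is a small sketch on made-up data. The sample values x**2 on [0, 10] are an assumption for illustration only (the question's real data are elided), and k=3, s=0 are chosen so the result is easy to check.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline
from scipy.integrate import quad

# Made-up sample data standing in for the elided lists in the question
x = np.linspace(0.0, 10.0, 50)
y = x**2

s = UnivariateSpline(x, y, k=3, s=0)

# Option 1: the spline's own integral method
area_spline = s.integral(x[0], x[-1])

# Option 2: quad on the spline object (a callable), not on the array ys
area_quad, err_estimate = quad(lambda t: float(s(t)), x[0], x[-1])
```

Both numbers should agree closely; the difference from the question's attempt is that quad receives something it can call rather than an array of precomputed values.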
