Gradient function evaluation

Gradient function evaluation - python

Below is some code I wrote to evaluate the position of a point moving towards the minimum of a gradient/3d function (defined at the beginning as "eq"). roll.roll() does this by repeatedly evaluating the equation at point (x,y), moving it in the direction of the gradient, then repeating with the new point.
It is very very slow to run though. I think this is because either calculate() is inefficient, or sympy's symbolic equation manipulation in roll.roll is really slow. Does anyone have any ideas on how to speed this up? Is there another libray other than SymPy that is faster?
import sympy as smp
x, y = smp.symbols('x y')
eq = 1*smp.exp(-((x-5)/5)**2 - ((y-1)/2)**2) + \
2*smp.exp(-((x+3)/2)**2 - ((y-3)/2)**2) + \
3*smp.exp(-((x-4)/2)**2 - ((y-7)/2)**2)
# Evaluates the 2 input sympy symbolic function "expression" at points (x1,y1)
def calculate(expression,x1,y1):
EQ = smp.lambdify((x,y), expression, 'numpy')
return EQ(x1,y1)
class roll:
xDiff = smp.diff(eq,x)
yDiff = smp.diff(eq,y)
normalize = eq/smp.sqrt(xDiff**2 + yDiff**2)
def roll(x,y,duration):
(x,y) = (x,y)
for i in range(0,duration):
(x,y) = (
x-calculate((roll.normalize*roll.xDiff),x,y),
y-calculate((roll.normalize*roll.yDiff),x,y)
)
return (x,y)
print(roll.roll(1,2,10))
Here is a visual to help see what this program is doing; the bigger the colored dots are, the greater the function f(x) is evaluated at that point. The draggable point represents what the program is attempting to find. https://www.desmos.com/calculator/c8mq2rijqn
I've tried to figure out if it's possible to pre-calculate normalize*xDiff not inside of roll.roll, but idk if thats possible.
Also, I believe that it is actually pretty easy to do this if the step size isn't dependent on the value of the function at the current point. I do need it to move faster when it's at a high point on the graph though (not just a point with a steep slope) so that has really been hard to figure out too.

The question is you want to do gradient-descent.
If you define a function about grad value, it's the most efficient way.
def eq(x,y):
return y*x**2+2*x*y
def grad_x(x,y):
return 2*x*y+2*y
def grad_y(x,y):
return x**2+2*x
But when you confused about how to calculate the grad of your function, you can use other package which can autograd (i.e. numpy, pytorch).
Here is an example of pytorch:
import torch
def eq(x, y):
return (
1 * torch.exp(-(((x - 5) / 5) ** 2) - ((y - 1) / 2) ** 2)
+ 2 * torch.exp(-(((x + 3) / 2) ** 2) - ((y - 3) / 2) ** 2)
+ 3 * torch.exp(-(((x - 4) / 2) ** 2) - ((y - 7) / 2) ** 2)
)
def roll(x, y, duration):
x, y = (
torch.tensor(x).requires_grad_(True),
torch.tensor(y).requires_grad_(True),
)
for _ in range(0, duration):
func_value = eq(x, y)
xDiff = torch.autograd.grad(func_value, x, retain_graph=True)[0]
yDiff = torch.autograd.grad(func_value, y)[0]
normalize = func_value / torch.sqrt(xDiff ** 2 + yDiff ** 2)
x = x - normalize * xDiff
y = y - normalize * yDiff
return (x, y)
print(roll(1.0, 2.0, 10))

You are calling lambdify inside the loop. The point of lambdify is that it returns a fast function but lambdify itself is a lot slower than the function that it returns. You should call lambdify once and then in the loop repeatedly use the function that was returned by it.
This code is equivalent to yours and returns the exact same result but the loop is 500x times faster:
import sympy as smp
x, y = smp.symbols('x y')
eq = 1*smp.exp(-((x-5)/5)**2 - ((y-1)/2)**2) + \
2*smp.exp(-((x+3)/2)**2 - ((y-3)/2)**2) + \
3*smp.exp(-((x-4)/2)**2 - ((y-7)/2)**2)
# Evaluates the 2 input sympy symbolic function "expression" at points (x1,y1)
def calculate(expression,x1,y1):
EQ = smp.lambdify((x,y), expression, 'numpy')
return EQ(x1,y1)
class roll:
xDiff = smp.diff(eq,x)
yDiff = smp.diff(eq,y)
normalize = eq/smp.sqrt(xDiff**2 + yDiff**2)
# call lambdify once
fxy = smp.lambdify((x, y), (x - normalize*xDiff, y - normalize*yDiff))
def roll(x,y,duration):
(x,y) = (x,y)
for i in range(0,duration):
# in the loop call the function that was returned by lambdify
x, y = roll.fxy(x, y)
return (x,y)
print(roll.roll(1,2,10))

Related

Evaluating a function with a well-defined value at x,y=0

I am trying to write a program that uses an array in further calculations. I initialize a grid of equally spaced points with NumPy and assign a value at each point as per the code snippet provided below. The function I am trying to describe with this array gives me a division by 0 error at x=y and it generally blows up around it. I know that the real part of said function is bounded by band_D/(2*math.pi)
at x=y and I tried manually assigning this value on the diagonal, but it seems that points around it are still ill-behaved and so I am not getting any right values. Is there a way to remedy this? This is how the function looks like with matplotlib
gamma=5
band_D=100
Dt=1e-3
x = np.arange(0,1/gamma,Dt)
y = np.arange(0,1/gamma,Dt)
xx,yy= np.meshgrid(x,y)
N=x.shape[0]
di = np.diag_indices(N)
time_fourier=(1j/2*math.pi)*(1-np.exp(1j*band_D*(xx-yy)))/(xx-yy)
time_fourier[di]=band_D/(2*math.pi)

You have a classic 0 / 0 problem. It's not really Numpy's job to figure out to apply De L'Hospital and solve this for you... I see, as other have commented, that you had the right idea with trying to set the limit value at the diagonal (where x approx y), but by the time you'd hit that line, the warning had already been emitted (just a warning, BTW, not an exception).
For a quick fix (but a bit of a fudge), in this case, you can try to add a small value to the difference:
xy = xx - yy + 1e-100
num = (1j / 2*np.pi) * (1 - np.exp(1j * band_D * xy))
time_fourier = num / xy
This also reveals that there is something wrong with your limit calculation... (time_fourier[0,0] approx 157.0796..., not 15.91549...).
and not band_D / (2*math.pi).
For a correct calculation:
def f(xy):
mask = xy != 0
limit = band_D * np.pi/2
return np.where(mask, np.divide((1j/2 * np.pi) * (1 - np.exp(1j * band_D * xy)), xy, where=mask), limit)
time_fourier = f(xx - yy)

You are dividing by x-y, that will definitely throw an error when x = y. The function being well behaved here means that the Taylor series doesn't diverge. But python doesn't know or care about that, it just calculates one step at a time until it reaches division by 0.
You had the right idea by defining a different function when x = y (ie, the mathematically true answer) but your way of applying it doesn't work because the correction is AFTER the division by 0, so it never gets read. This, however, should work
def make_time_fourier(x, y):
if np.isclose(x, y):
return band_D/(2*math.pi)
else:
return (1j/2*math.pi)*(1-np.exp(1j*band_D*(x-y)))/(x-y)
time_fourier = np.vectorize(make_time_fourier)(xx, yy)
print(time_fourier)

You can use np.divide with where option.
import math
gamma=5
band_D=100
Dt=1e-3
x = np.arange(0,1/gamma,Dt)
y = np.arange(0,1/gamma,Dt)
xx,yy = np.meshgrid(x,y)
N = x.shape[0]
di = np.diag_indices(N)
time_fourier = (1j / 2 * np.pi) * (1 - np.exp(1j * band_D * (xx - yy)))
time_fourier = np.divide(time_fourier,
(xx - yy),
where=(xx - yy) != 0)
time_fourier[di] = band_D / (2 * np.pi)

You can reformulate your function so that the division is inside the (numpy) sinc function, which handles it correctly.
To save typing I'll use D for band_D and use a variable
z = D*(xx-yy)/2
Then
T = (1j/2*pi)*(1-np.exp(1j*band_D*(xx-yy)))/(xx-yy)
= (2/D)*(1j/2*pi)*( 1 - cos( 2*z) - 1j*sin( 2*z))/z
= (1j/D*pi)* (2*sin(z)*sin(z) - 2j*sin(z)*cos(z))/z
= (2j/D*pi) * sin(z)/z * (sin(z) - 1j*cos(z))
= (2j/D*pi) * sinc( z/pi) * (sin(z) - 1j*cos(z))
numpy defines
sinc(x) to be sin(pi*x)/(pi*x)
I can't run python do you should chrck my calculations
The steps are
Substitute the definition of z and expand the complex exp
Apply the double angle formulae for sin and cos
Factor out sin(z)
Substitute the definition of sinc

Minimizing this error function, using NumPy

Background
I've been working for some time on attempting to solve the (notoriously painful) Time Difference of Arrival (TDoA) multi-lateration problem, in 3-dimensions and using 4 nodes. If you're unfamiliar with the problem, it is to determine the coordinates of some signal source (X,Y,Z), given the coordinates of n nodes, the time of arrival of the signal at each node, and the velocity of the signal v.
My solution is as follows:
For each node, we write (X-x_i)**2 + (Y-y_i)**2 + (Z-z_i)**2 = (v(t_i - T)**2
Where (x_i, y_i, z_i) are the coordinates of the ith node, and T is the time of emission.
We have now 4 equations in 4 unknowns. Four nodes are obviously insufficient. We could try to solve this system directly, however that seems next to impossible given the highly nonlinear nature of the problem (and, indeed, I've tried many direct techniques... and failed). Instead, we simplify this to a linear problem by considering all i/j possibilities, subtracting equation i from equation j. We obtain (n(n-1))/2 =6 equations of the form:
2*(x_j - x_i)*X + 2*(y_j - y_i)*Y + 2*(z_j - z_i)*Z + 2 * v**2 * (t_i - t_j) = v**2 ( t_i**2 - t_j**2) + (x_j**2 + y_j**2 + z_j**2) - (x_i**2 + y_i**2 + z_i**2)
Which look like Xv_1 + Y_v2 + Z_v3 + T_v4 = b. We try now to apply standard linear least squares, where the solution is the matrix vector x in A^T Ax = A^T b. Unfortunately, if you were to try feeding this into any standard linear least squares algorithm, it'll choke up. So, what do we do now?
...
The time of arrival of the signal at node i is given (of course) by:
sqrt( (X-x_i)**2 + (Y-y_i)**2 + (Z-z_i)**2 ) / v
This equation implies that the time of arrival, T, is 0. If we have that T = 0, we can drop the T column in matrix A and the problem is greatly simplified. Indeed, NumPy's linalg.lstsq() gives a surprisingly accurate & precise result.
...
So, what I do is normalize the input times by subtracting from each equation the earliest time. All I have to do then is determine the dt that I can add to each time such that the residual of summed squared error for the point found by linear least squares is minimized.
I define the error for some dt to be the squared difference between the arrival time for the point predicted by feeding the input times + dt to the least squares algorithm, minus the input time (normalized), summed over all 4 nodes.
for node, time in nodes, times:
error += ( (sqrt( (X-x_i)**2 + (Y-y_i)**2 + (Z-z_i)**2 ) / v) - time) ** 2
My problem:
I was able to do this somewhat satisfactorily by using brute-force. I started at dt = 0, and moved by some step up to some maximum # of iterations OR until some minimum RSS error is reached, and that was the dt I added to the normalized times to obtain a solution. The resulting solutions were very accurate and precise, but quite slow.
In practice, I'd like to be able to solve this in real time, and therefore a far faster solution will be needed. I began with the assumption that the error function (that is, dt vs error as defined above) would be highly nonlinear-- offhand, this made sense to me.
Since I don't have an actual, mathematical function, I can automatically rule out methods that require differentiation (e.g. Newton-Raphson). The error function will always be positive, so I can rule out bisection, etc. Instead, I try a simple approximation search. Unfortunately, that failed miserably. I then tried Tabu search, followed by a genetic algorithm, and several others. They all failed horribly.
So, I decided to do some investigating. As it turns out the plot of the error function vs dt looks a bit like a square root, only shifted right depending upon the distance from the nodes that the signal source is:
Where dt is on horizontal axis, error on vertical axis
And, in hindsight, of course it does!. I defined the error function to involve square roots so, at least to me, this seems reasonable.
What to do?
So, my issue now is, how do I determine the dt corresponding to the minimum of the error function?
My first (very crude) attempt was to get some points on the error graph (as above), fit it using numpy.polyfit, then feed the results to numpy.root. That root corresponds to the dt. Unfortunately, this failed, too. I tried fitting with various degrees, and also with various points, up to a ridiculous number of points such that I may as well just use brute-force.
How can I determine the dt corresponding to the minimum of this error function?
Since we're dealing with high velocities (radio signals), it's important that the results be precise and accurate, as minor variances in dt can throw off the resulting point.
I'm sure that there's some infinitely simpler approach buried in what I'm doing here however, ignoring everything else, how do I find dt?
My requirements:
Speed is of utmost importance
I have access only to pure Python and NumPy in the environment where this will be run
EDIT:
Here's my code. Admittedly, a bit messy. Here, I'm using the polyfit technique. It will "simulate" a source for you, and compare results:
from numpy import poly1d, linspace, set_printoptions, array, linalg, triu_indices, roots, polyfit
from dataclasses import dataclass
from random import randrange
import math
#dataclass
class Vertexer:
receivers: list
# Defaults
c = 299792
# Receivers:
# [x_1, y_1, z_1]
# [x_2, y_2, z_2]
# [x_3, y_3, z_3]
# Solved:
# [x, y, z]
def error(self, dt, times):
solved = self.linear([time + dt for time in times])
error = 0
for time, receiver in zip(times, self.receivers):
error += ((math.sqrt( (solved[0] - receiver[0])**2 +
(solved[1] - receiver[1])**2 +
(solved[2] - receiver[2])**2 ) / c ) - time)**2
return error
def linear(self, times):
X = array(self.receivers)
t = array(times)
x, y, z = X.T
i, j = triu_indices(len(x), 1)
A = 2 * (X[i] - X[j])
b = self.c**2 * (t[j]**2 - t[i]**2) + (X[i]**2).sum(1) - (X[j]**2).sum(1)
solved, residuals, rank, s = linalg.lstsq(A, b, rcond=None)
return(solved)
def find(self, times):
# Normalize times
times = [time - min(times) for time in times]
# Fit the error function
y = []
x = []
dt = 1E-10
for i in range(50000):
x.append(self.error(dt * i, times))
y.append(dt * i)
p = polyfit(array(x), array(y), 2)
r = roots(p)
return(self.linear([time + r for time in times]))
# SIMPLE CODE FOR SIMULATING A SIGNAL
# Pick nodes to be at random locations
x_1 = randrange(10); y_1 = randrange(10); z_1 = randrange(10)
x_2 = randrange(10); y_2 = randrange(10); z_2 = randrange(10)
x_3 = randrange(10); y_3 = randrange(10); z_3 = randrange(10)
x_4 = randrange(10); y_4 = randrange(10); z_4 = randrange(10)
# Pick source to be at random location
x = randrange(1000); y = randrange(1000); z = randrange(1000)
# Set velocity
c = 299792 # km/ns
# Generate simulated source
t_1 = math.sqrt( (x - x_1)**2 + (y - y_1)**2 + (z - z_1)**2 ) / c
t_2 = math.sqrt( (x - x_2)**2 + (y - y_2)**2 + (z - z_2)**2 ) / c
t_3 = math.sqrt( (x - x_3)**2 + (y - y_3)**2 + (z - z_3)**2 ) / c
t_4 = math.sqrt( (x - x_4)**2 + (y - y_4)**2 + (z - z_4)**2 ) / c
print('Actual:', x, y, z)
myVertexer = Vertexer([[x_1, y_1, z_1],[x_2, y_2, z_2],[x_3, y_3, z_3],[x_4, y_4, z_4]])
solution = myVertexer.find([t_1, t_2, t_3, t_4])
print(solution)

It seems like the Bancroft method applies to this problem? Here's a pure NumPy implementation.
# Implementation of the Bancroft method, following
# https://gssc.esa.int/navipedia/index.php/Bancroft_Method
M = np.diag([1, 1, 1, -1])
def lorentz_inner(v, w):
return np.sum(v * (w # M), axis=-1)
B = np.array(
[
[x_1, y_1, z_1, c * t_1],
[x_2, y_2, z_2, c * t_2],
[x_3, y_3, z_3, c * t_3],
[x_4, y_4, z_4, c * t_4],
]
)
one = np.ones(4)
a = 0.5 * lorentz_inner(B, B)
B_inv_one = np.linalg.solve(B, one)
B_inv_a = np.linalg.solve(B, a)
for Lambda in np.roots(
[
lorentz_inner(B_inv_one, B_inv_one),
2 * (lorentz_inner(B_inv_one, B_inv_a) - 1),
lorentz_inner(B_inv_a, B_inv_a),
]
):
x, y, z, c_t = M # np.linalg.solve(B, Lambda * one + a)
print("Candidate:", x, y, z, c_t / c)

My answer might have mistakes (glaring) as I had not heard the TDOA term before this afternoon. Please double check if the method is right.
I could not find solution to your original problem of finding dt corresponding to the minimum error. My answer also deviates from the requirement that other than numpy no third party library had to be used (I used Sympy and largely used the code from here). However I am still posting this thinking that somebody someday might find it useful if all one is interested in ... is to find X,Y,Z of the source emitter. This method also does not take into account real-life situations where white noise or errors might be present or curvature of the earth and other complications.
Your initial test conditions are as below.
from random import randrange
import math
# SIMPLE CODE FOR SIMULATING A SIGNAL
# Pick nodes to be at random locations
x_1 = randrange(10); y_1 = randrange(10); z_1 = randrange(10)
x_2 = randrange(10); y_2 = randrange(10); z_2 = randrange(10)
x_3 = randrange(10); y_3 = randrange(10); z_3 = randrange(10)
x_4 = randrange(10); y_4 = randrange(10); z_4 = randrange(10)
# Pick source to be at random location
x = randrange(1000); y = randrange(1000); z = randrange(1000)
# Set velocity
c = 299792 # km/ns
# Generate simulated source
t_1 = math.sqrt( (x - x_1)**2 + (y - y_1)**2 + (z - z_1)**2 ) / c
t_2 = math.sqrt( (x - x_2)**2 + (y - y_2)**2 + (z - z_2)**2 ) / c
t_3 = math.sqrt( (x - x_3)**2 + (y - y_3)**2 + (z - z_3)**2 ) / c
t_4 = math.sqrt( (x - x_4)**2 + (y - y_4)**2 + (z - z_4)**2 ) / c
print('Actual:', x, y, z)
My solution is as below.
import sympy as sym
X,Y,Z = sym.symbols('X,Y,Z', real=True)
f = sym.Eq((x_1 - X)**2 +(y_1 - Y)**2 + (z_1 - Z)**2 , (c*t_1)**2)
g = sym.Eq((x_2 - X)**2 +(y_2 - Y)**2 + (z_2 - Z)**2 , (c*t_2)**2)
h = sym.Eq((x_3 - X)**2 +(y_3 - Y)**2 + (z_3 - Z)**2 , (c*t_3)**2)
i = sym.Eq((x_4 - X)**2 +(y_4 - Y)**2 + (z_4 - Z)**2 , (c*t_4)**2)
print("Solved coordinates are ", sym.solve([f,g,h,i],X,Y,Z))
print statement from your initial condition gave.
Actual: 111 553 110
and the solution that almost instantly came out was
Solved coordinates are [(111.000000000000, 553.000000000000, 110.000000000000)]
Sorry again if something is totally amiss.

python divide by zero encountered in log - logistic regression

I'm trying to implement a multiclass logistic regression classifier that distinguishes between k different classes.
This is my code.
import numpy as np
from scipy.special import expit
def cost(X,y,theta,regTerm):
(m,n) = X.shape
J = (np.dot(-(y.T),np.log(expit(np.dot(X,theta))))-np.dot((np.ones((m,1))-y).T,np.log(np.ones((m,1)) - (expit(np.dot(X,theta))).reshape((m,1))))) / m + (regTerm / (2 * m)) * np.linalg.norm(theta[1:])
return J
def gradient(X,y,theta,regTerm):
(m,n) = X.shape
grad = np.dot(((expit(np.dot(X,theta))).reshape(m,1) - y).T,X)/m + (np.concatenate(([0],theta[1:].T),axis=0)).reshape(1,n)
return np.asarray(grad)
def train(X,y,regTerm,learnRate,epsilon,k):
(m,n) = X.shape
theta = np.zeros((k,n))
for i in range(0,k):
previousCost = 0;
currentCost = cost(X,y,theta[i,:],regTerm)
while(np.abs(currentCost-previousCost) > epsilon):
print(theta[i,:])
theta[i,:] = theta[i,:] - learnRate*gradient(X,y,theta[i,:],regTerm)
print(theta[i,:])
previousCost = currentCost
currentCost = cost(X,y,theta[i,:],regTerm)
return theta
trX = np.load('trX.npy')
trY = np.load('trY.npy')
theta = train(trX,trY,2,0.1,0.1,4)
I can verify that cost and gradient are returning values that are in the right dimension (cost returns a scalar, and gradient returns a 1 by n row vector), but i get the error
RuntimeWarning: divide by zero encountered in log
J = (np.dot(-(y.T),np.log(expit(np.dot(X,theta))))-np.dot((np.ones((m,1))-y).T,np.log(np.ones((m,1)) - (expit(np.dot(X,theta))).reshape((m,1))))) / m + (regTerm / (2 * m)) * np.linalg.norm(theta[1:])
why is this happening and how can i avoid this?

The proper solution here is to add some small epsilon to the argument of log function. What worked for me was
epsilon = 1e-5
def cost(X, y, theta):
m = X.shape[0]
yp = expit(X # theta)
cost = - np.average(y * np.log(yp + epsilon) + (1 - y) * np.log(1 - yp + epsilon))
return cost

You can clean up the formula by appropriately using broadcasting, the operator * for dot products of vectors, and the operator # for matrix multiplication — and breaking it up as suggested in the comments.
Here is your cost function:
def cost(X, y, theta, regTerm):
m = X.shape[0] # or y.shape, or even p.shape after the next line, number of training set
p = expit(X # theta)
log_loss = -np.average(y*np.log(p) + (1-y)*np.log(1-p))
J = log_loss + regTerm * np.linalg.norm(theta[1:]) / (2*m)
return J
You can clean up your gradient function along the same lines.
By the way, are you sure you want np.linalg.norm(theta[1:]). If you're trying to do L2-regularization, the term should be np.linalg.norm(theta[1:]) ** 2.

Cause:
This is happening because in some cases, whenever y[i] is equal to 1, the value of the Sigmoid function (theta) also becomes equal to 1.
Cost function:
J = (np.dot(-(y.T),np.log(expit(np.dot(X,theta))))-np.dot((np.ones((m,1))-y).T,np.log(np.ones((m,1)) - (expit(np.dot(X,theta))).reshape((m,1))))) / m + (regTerm / (2 * m)) * np.linalg.norm(theta[1:])
Now, consider the following part in the above code snippet:
np.log(np.ones((m,1)) - (expit(np.dot(X,theta))).reshape((m,1)))
Here, you are performing (1 - theta) when the value of theta is 1. So, that will effectively become log (1 - 1) = log (0) which is undefined.

I'm guessing your data has negative values in it. You can't log a negative.
import numpy as np
np.log(2)
> 0.69314718055994529
np.log(-2)
> nan
There are a lot of different ways to transform your data that should help, if this is the case.

def cost(X, y, theta):
yp = expit(X # theta)
cost = - np.average(y * np.log(yp) + (1 - y) * np.log(1 - yp))
return cost
The warning originates from np.log(yp) when yp==0 and in np.log(1 - yp) when yp==1. One option is to filter out these values, and not to pass them into np.log. The other option is to add a small constant to prevent the value from being exactly 0 (as suggested in one of the comments above)

Add epsilon value[which is a miniature value] to the log value so that it won't be a problem at all.
But i am not sure if it will give accurate results or not .

How to make scipy.interpolate give an extrapolated result beyond the input range?

I'm trying to port a program which uses a hand-rolled interpolator (developed by a mathematician colleage) over to use the interpolators provided by scipy. I'd like to use or wrap the scipy interpolator so that it has as close as possible behavior to the old interpolator.
A key difference between the two functions is that in our original interpolator - if the input value is above or below the input range, our original interpolator will extrapolate the result. If you try this with the scipy interpolator it raises a ValueError. Consider this program as an example:
import numpy as np
from scipy import interpolate
x = np.arange(0,10)
y = np.exp(-x/3.0)
f = interpolate.interp1d(x, y)
print f(9)
print f(11) # Causes ValueError, because it's greater than max(x)
Is there a sensible way to make it so that instead of crashing, the final line will simply do a linear extrapolate, continuing the gradients defined by the first and last two points to infinity.
Note, that in the real software I'm not actually using the exp function - that's here for illustration only!

As of SciPy version 0.17.0, there is a new option for scipy.interpolate.interp1d that allows extrapolation. Simply set fill_value='extrapolate' in the call. Modifying your code in this way gives:
import numpy as np
from scipy import interpolate
x = np.arange(0,10)
y = np.exp(-x/3.0)
f = interpolate.interp1d(x, y, fill_value='extrapolate')
print f(9)
print f(11)
and the output is:
0.0497870683679
0.010394302658

You can take a look at InterpolatedUnivariateSpline
Here an example using it:
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline
# given values
xi = np.array([0.2, 0.5, 0.7, 0.9])
yi = np.array([0.3, -0.1, 0.2, 0.1])
# positions to inter/extrapolate
x = np.linspace(0, 1, 50)
# spline order: 1 linear, 2 quadratic, 3 cubic ...
order = 1
# do inter/extrapolation
s = InterpolatedUnivariateSpline(xi, yi, k=order)
y = s(x)
# example showing the interpolation for linear, quadratic and cubic interpolation
plt.figure()
plt.plot(xi, yi)
for order in range(1, 4):
s = InterpolatedUnivariateSpline(xi, yi, k=order)
y = s(x)
plt.plot(x, y)
plt.show()

1. Constant extrapolation
You can use interp function from scipy, it extrapolates left and right values as constant beyond the range:
>>> from scipy import interp, arange, exp
>>> x = arange(0,10)
>>> y = exp(-x/3.0)
>>> interp([9,10], x, y)
array([ 0.04978707, 0.04978707])
2. Linear (or other custom) extrapolation
You can write a wrapper around an interpolation function which takes care of linear extrapolation. For example:
from scipy.interpolate import interp1d
from scipy import arange, array, exp
def extrap1d(interpolator):
xs = interpolator.x
ys = interpolator.y
def pointwise(x):
if x < xs[0]:
return ys[0]+(x-xs[0])*(ys[1]-ys[0])/(xs[1]-xs[0])
elif x > xs[-1]:
return ys[-1]+(x-xs[-1])*(ys[-1]-ys[-2])/(xs[-1]-xs[-2])
else:
return interpolator(x)
def ufunclike(xs):
return array(list(map(pointwise, array(xs))))
return ufunclike
extrap1d takes an interpolation function and returns a function which can also extrapolate. And you can use it like this:
x = arange(0,10)
y = exp(-x/3.0)
f_i = interp1d(x, y)
f_x = extrap1d(f_i)
print f_x([9,10])
Output:
[ 0.04978707 0.03009069]

What about scipy.interpolate.splrep (with degree 1 and no smoothing):
>> tck = scipy.interpolate.splrep([1, 2, 3, 4, 5], [1, 4, 9, 16, 25], k=1, s=0)
>> scipy.interpolate.splev(6, tck)
34.0
It seems to do what you want, since 34 = 25 + (25 - 16).

Here's an alternative method that uses only the numpy package. It takes advantage of numpy's array functions, so may be faster when interpolating/extrapolating large arrays:
import numpy as np
def extrap(x, xp, yp):
"""np.interp function with linear extrapolation"""
y = np.interp(x, xp, yp)
y = np.where(x<xp[0], yp[0]+(x-xp[0])*(yp[0]-yp[1])/(xp[0]-xp[1]), y)
y = np.where(x>xp[-1], yp[-1]+(x-xp[-1])*(yp[-1]-yp[-2])/(xp[-1]-xp[-2]), y)
return y
x = np.arange(0,10)
y = np.exp(-x/3.0)
xtest = np.array((8.5,9.5))
print np.exp(-xtest/3.0)
print np.interp(xtest, x, y)
print extrap(xtest, x, y)
Edit: Mark Mikofski's suggested modification of the "extrap" function:
def extrap(x, xp, yp):
"""np.interp function with linear extrapolation"""
y = np.interp(x, xp, yp)
y[x < xp[0]] = yp[0] + (x[x<xp[0]]-xp[0]) * (yp[0]-yp[1]) / (xp[0]-xp[1])
y[x > xp[-1]]= yp[-1] + (x[x>xp[-1]]-xp[-1])*(yp[-1]-yp[-2])/(xp[-1]-xp[-2])
return y

It may be faster to use boolean indexing with large datasets, since the algorithm checks if every point is in outside the interval, whereas boolean indexing allows an easier and faster comparison.
For example:
# Necessary modules
import numpy as np
from scipy.interpolate import interp1d
# Original data
x = np.arange(0,10)
y = np.exp(-x/3.0)
# Interpolator class
f = interp1d(x, y)
# Output range (quite large)
xo = np.arange(0, 10, 0.001)
# Boolean indexing approach
# Generate an empty output array for "y" values
yo = np.empty_like(xo)
# Values lower than the minimum "x" are extrapolated at the same time
low = xo < f.x[0]
yo[low] = f.y[0] + (xo[low]-f.x[0])*(f.y[1]-f.y[0])/(f.x[1]-f.x[0])
# Values higher than the maximum "x" are extrapolated at same time
high = xo > f.x[-1]
yo[high] = f.y[-1] + (xo[high]-f.x[-1])*(f.y[-1]-f.y[-2])/(f.x[-1]-f.x[-2])
# Values inside the interpolation range are interpolated directly
inside = np.logical_and(xo >= f.x[0], xo <= f.x[-1])
yo[inside] = f(xo[inside])
In my case, with a data set of 300000 points, this means an speed up from 25.8 to 0.094 seconds, this is more than 250 times faster.

I did it by adding a point to my initial arrays. In this way I avoid defining self-made functions, and the linear extrapolation (in the example below: right extrapolation) looks ok.
import numpy as np
from scipy import interp as itp
xnew = np.linspace(0,1,51)
x1=xold[-2]
x2=xold[-1]
y1=yold[-2]
y2=yold[-1]
right_val=y1+(xnew[-1]-x1)*(y2-y1)/(x2-x1)
x=np.append(xold,xnew[-1])
y=np.append(yold,right_val)
f = itp(xnew,x,y)

I don't have enough reputation to comment, but in case somebody is looking for an extrapolation wrapper for a linear 2d-interpolation with scipy, I have adapted the answer that was given here for the 1d interpolation.
def extrap2d(interpolator):
xs = interpolator.x
ys = interpolator.y
zs = interpolator.z
zs = np.reshape(zs, (-1, len(xs)))
def pointwise(x, y):
if x < xs[0] or y < ys[0]:
x1_index = np.argmin(np.abs(xs - x))
x2_index = x1_index + 1
y1_index = np.argmin(np.abs(ys - y))
y2_index = y1_index + 1
x1 = xs[x1_index]
x2 = xs[x2_index]
y1 = ys[y1_index]
y2 = ys[y2_index]
z11 = zs[x1_index, y1_index]
z12 = zs[x1_index, y2_index]
z21 = zs[x2_index, y1_index]
z22 = zs[x2_index, y2_index]
return (z11 * (x2 - x) * (y2 - y) +
z21 * (x - x1) * (y2 - y) +
z12 * (x2 - x) * (y - y1) +
z22 * (x - x1) * (y - y1)
) / ((x2 - x1) * (y2 - y1) + 0.0)
elif x > xs[-1] or y > ys[-1]:
x1_index = np.argmin(np.abs(xs - x))
x2_index = x1_index - 1
y1_index = np.argmin(np.abs(ys - y))
y2_index = y1_index - 1
x1 = xs[x1_index]
x2 = xs[x2_index]
y1 = ys[y1_index]
y2 = ys[y2_index]
z11 = zs[x1_index, y1_index]
z12 = zs[x1_index, y2_index]
z21 = zs[x2_index, y1_index]
z22 = zs[x2_index, y2_index]#
return (z11 * (x2 - x) * (y2 - y) +
z21 * (x - x1) * (y2 - y) +
z12 * (x2 - x) * (y - y1) +
z22 * (x - x1) * (y - y1)
) / ((x2 - x1) * (y2 - y1) + 0.0)
else:
return interpolator(x, y)
def ufunclike(xs, ys):
if isinstance(xs, int) or isinstance(ys, int) or isinstance(xs, np.int32) or isinstance(ys, np.int32):
res_array = pointwise(xs, ys)
else:
res_array = np.zeros((len(xs), len(ys)))
for x_c in range(len(xs)):
res_array[x_c, :] = np.array([pointwise(xs[x_c], ys[y_c]) for y_c in range(len(ys))]).T
return res_array
return ufunclike
I haven't commented a lot and I am aware, that the code isn't super clean. If anybody sees any errors, please let me know. In my current use-case it is working without a problem :)

I'm afraid that there is no easy to do this in Scipy to my knowledge. You can, as I'm fairly sure that you are aware, turn off the bounds errors and fill all function values beyond the range with a constant, but that doesn't really help. See this question on the mailing list for some more ideas. Maybe you could use some kind of piecewise function, but that seems like a major pain.

The below code gives you the simple extrapolation module. k is the value to which the data set y has to be extrapolated based on the data set x. The numpy module is required.
def extrapol(k,x,y):
xm=np.mean(x);
ym=np.mean(y);
sumnr=0;
sumdr=0;
length=len(x);
for i in range(0,length):
sumnr=sumnr+((x[i]-xm)*(y[i]-ym));
sumdr=sumdr+((x[i]-xm)*(x[i]-xm));
m=sumnr/sumdr;
c=ym-(m*xm);
return((m*k)+c)

Standard interpolate + linear extrapolate:
def interpola(v, x, y):
if v <= x[0]:
return y[0]+(y[1]-y[0])/(x[1]-x[0])*(v-x[0])
elif v >= x[-1]:
return y[-2]+(y[-1]-y[-2])/(x[-1]-x[-2])*(v-x[-2])
else:
f = interp1d(x, y, kind='cubic')
return f(v)

Euler method (explicit and implicit)

I'd like to implement Euler's method (the explicit and the implicit one)
(https://en.wikipedia.org/wiki/Euler_method) for the following model:
x(t)' = q(x_M -x(t))x(t)
x(0) = x_0
where q, x_M and x_0 are real numbers.
I know already the (theoretical) implementation of the method. But I couldn't figure out where I can insert / change the model.
Could anybody help?
EDIT: You were right. I didn't understand correctly the method. Now, after a few hours, I think that I really got it! With the explicit method, I'm pretty sure (nevertheless: could anybody please have a look at my code? )
With the implicit implementation, I'm not very sure if it's correct. Could please anyone have a look at the implementation of the implicit method and give me a feedback what's correct / not good?
def explizit_euler():
''' x(t)' = q(xM -x(t))x(t)
x(0) = x0'''
q = 2.
xM = 2
x0 = 0.5
T = 5
dt = 0.01
N = T / dt
x = x0
t = 0.
for i in range (0 , int(N)):
t = t + dt
x = x + dt * (q * (xM - x) * x)
print '%6.3f %6.3f' % (t, x)
def implizit_euler():
''' x(t)' = q(xM -x(t))x(t)
x(0) = x0'''
q = 2.
xM = 2
x0 = 0.5
T = 5
dt = 0.01
N = T / dt
x = x0
t = 0.
for i in range (0 , int(N)):
t = t + dt
x = (1.0 / (1.0 - q *(xM + x) * x))
print '%6.3f %6.3f' % (t, x)

Pre-emptive note: Although the general idea should be correct, I did all the algebra in place in the editor box so there might be mistakes there. Please, check it yourself before using for anything really important.
I'm not sure how you come to the "implicit" formula
x = (1.0 / (1.0 - q *(xM + x) * x))
but this is wrong and you can check it by comparing your "explicit" and "implicit" results: they should slightly diverge but with this formula they will diverge drastically.
To understand the implicit Euler method, you should first get the idea behind the explicit one. And the idea is really simple and is explained at the Derivation section in the wiki: since derivative y'(x) is a limit of (y(x+h) - y(x))/h, you can approximate y(x+h) as y(x) + h*y'(x) for small h, assuming our original differential equation is
y'(x) = F(x, y(x))
Note that the reason this is only an approximation rather than exact value is that even over small range [x, x+h] the derivative y'(x) changes slightly. It means that if you want to get a better approximation of y(x+h), you need a better approximation of "average" derivative y'(x) over the range [x, x+h]. Let's call that approximation just y'. One idea of such improvement is to find both y' and y(x+h) at the same time by saying that we want to find such y' and y(x+h) that y' would be actually y'(x+h) (i.e. the derivative at the end). This results in the following system of equations:
y'(x+h) = F(x+h, y(x+h))
y(x+h) = y(x) + h*y'(x+h)
which is equivalent to a single "implicit" equation:
y(x+h) - y(x) = h * F(x+h, y(x+h))
It is called "implicit" because here the target y(x+h) is also a part of F. And note that quite similar equation is mentioned in the Modifications and extensions section of the wiki article.
So now going to your case that equation becomes
x(t+dt) - x(t) = dt*q*(xM -x(t+dt))*x(t+dt)
or equivalently
dt*q*x(t+dt)^2 + (1 - dt*q*xM)*x(t+dt) - x(t) = 0
This is a quadratic equation with two solutions:
x(t+dt) = [(dt*q*xM - 1) ± sqrt((dt*q*xM - 1)^2 + 4*dt*q*x(t))]/(2*dt*q)
Obviously we want the solution that is "close" to the x(t) which is the + solution. So the code should be something like:
b = (q * xM * dt - 1)
x(t+h) = (b + (b ** 2 + 4 * q * x(t) * dt) ** 0.5) / 2 / q / dt
(editor note:) Applying the binomial complement, this formula has the numerically more stable form for small dt, where then b < 0,
x(t+h) = (2 * x(t)) / ((b ** 2 + 4 * q * x(t) * dt) ** 0.5 - b)

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Gradient function evaluation - python

Related

Evaluating a function with a well-defined value at x,y=0

Minimizing this error function, using NumPy

python divide by zero encountered in log - logistic regression

How to make scipy.interpolate give an extrapolated result beyond the input range?

Euler method (explicit and implicit)

Categories

Resources