Below is some code I wrote to track the position of a point moving towards the minimum of a gradient/3D function (defined at the beginning as "eq"). roll.roll() does this by repeatedly evaluating the equation at the point (x,y), moving the point in the direction of the gradient, then repeating with the new point.
It is very slow to run, though. I think this is because either calculate() is inefficient, or SymPy's symbolic equation manipulation in roll.roll is really slow. Does anyone have any ideas on how to speed this up? Is there another library other than SymPy that is faster?
import sympy as smp
x, y = smp.symbols('x y')
eq = 1*smp.exp(-((x-5)/5)**2 - ((y-1)/2)**2) + \
2*smp.exp(-((x+3)/2)**2 - ((y-3)/2)**2) + \
3*smp.exp(-((x-4)/2)**2 - ((y-7)/2)**2)
# Evaluates the 2 input sympy symbolic function "expression" at points (x1,y1)
def calculate(expression,x1,y1):
    EQ = smp.lambdify((x,y), expression, 'numpy')
    return EQ(x1,y1)
class roll:
    xDiff = smp.diff(eq,x)
    yDiff = smp.diff(eq,y)
    normalize = eq/smp.sqrt(xDiff**2 + yDiff**2)
    def roll(x,y,duration):
        (x,y) = (x,y)
        for i in range(0,duration):
            (x,y) = (
                x-calculate((roll.normalize*roll.xDiff),x,y),
                y-calculate((roll.normalize*roll.yDiff),x,y)
            )
        return (x,y)
print(roll.roll(1,2,10))
Here is a visual to help see what this program is doing; the bigger the colored dots are, the greater the function f(x) is evaluated at that point. The draggable point represents what the program is attempting to find. https://www.desmos.com/calculator/c8mq2rijqn
I've tried to figure out whether it's possible to pre-calculate normalize*xDiff outside of roll.roll, but I don't know if that's possible.
Also, I believe that it is actually pretty easy to do this if the step size isn't dependent on the value of the function at the current point. I do need it to move faster when it's at a high point on the graph though (not just a point with a steep slope) so that has really been hard to figure out too.
What you want to do is essentially gradient descent.
If you can write the gradient by hand as plain functions, that is the most efficient way.
def eq(x,y):
    return y*x**2+2*x*y
def grad_x(x,y):
    return 2*x*y+2*y
def grad_y(x,y):
    return x**2+2*x
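As a rough sketch of how hand-written gradient functions plug into the update loop, here is the same idea applied to the sum of three Gaussians from the question, using only the standard library. The names BUMPS and value_and_grad are made up for this illustration, and the gradient was derived by hand, so please double-check it before relying on it.

```python
import math

# Hypothetical table describing the question's eq:
# (amplitude, x-centre, x-width, y-centre, y-width) for each Gaussian bump.
BUMPS = [(1, 5, 5, 1, 2), (2, -3, 2, 3, 2), (3, 4, 2, 7, 2)]

def value_and_grad(x, y):
    """Return f(x, y) and its partial derivatives, all computed in one pass."""
    f = gx = gy = 0.0
    for amp, cx, sx, cy, sy in BUMPS:
        term = amp * math.exp(-((x - cx) / sx) ** 2 - ((y - cy) / sy) ** 2)
        f += term
        gx += term * (-2.0 * (x - cx) / sx ** 2)  # d(term)/dx
        gy += term * (-2.0 * (y - cy) / sy ** 2)  # d(term)/dy
    return f, gx, gy

def roll(x, y, duration):
    """Same update rule as the question: step = f / |grad f| * grad f."""
    for _ in range(duration):
        f, gx, gy = value_and_grad(x, y)
        # Note: this divides by zero if the gradient vanishes exactly.
        normalize = f / math.hypot(gx, gy)
        x, y = x - normalize * gx, y - normalize * gy
    return x, y

print(roll(1.0, 2.0, 10))
```

Because nothing symbolic happens inside the loop, each iteration costs only a handful of exp calls.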
But if you are unsure how to compute the gradient of your function by hand, you can use a package that provides automatic differentiation (e.g. PyTorch, or the autograd package for NumPy code).
Here is an example of pytorch:
import torch
def eq(x, y):
    return (
        1 * torch.exp(-(((x - 5) / 5) ** 2) - ((y - 1) / 2) ** 2)
        + 2 * torch.exp(-(((x + 3) / 2) ** 2) - ((y - 3) / 2) ** 2)
        + 3 * torch.exp(-(((x - 4) / 2) ** 2) - ((y - 7) / 2) ** 2)
    )
def roll(x, y, duration):
    x, y = (
        torch.tensor(x).requires_grad_(True),
        torch.tensor(y).requires_grad_(True),
    )
    for _ in range(0, duration):
        func_value = eq(x, y)
        xDiff = torch.autograd.grad(func_value, x, retain_graph=True)[0]
        yDiff = torch.autograd.grad(func_value, y)[0]
        normalize = func_value / torch.sqrt(xDiff ** 2 + yDiff ** 2)
        x = x - normalize * xDiff
        y = y - normalize * yDiff
    return (x, y)
print(roll(1.0, 2.0, 10))
You are calling lambdify inside the loop. The point of lambdify is that it returns a fast function but lambdify itself is a lot slower than the function that it returns. You should call lambdify once and then in the loop repeatedly use the function that was returned by it.
This code is equivalent to yours and returns exactly the same result, but the loop is about 500x faster:
import sympy as smp
x, y = smp.symbols('x y')
eq = 1*smp.exp(-((x-5)/5)**2 - ((y-1)/2)**2) + \
2*smp.exp(-((x+3)/2)**2 - ((y-3)/2)**2) + \
3*smp.exp(-((x-4)/2)**2 - ((y-7)/2)**2)
# Evaluates the 2 input sympy symbolic function "expression" at points (x1,y1)
def calculate(expression,x1,y1):
    EQ = smp.lambdify((x,y), expression, 'numpy')
    return EQ(x1,y1)
class roll:
    xDiff = smp.diff(eq,x)
    yDiff = smp.diff(eq,y)
    normalize = eq/smp.sqrt(xDiff**2 + yDiff**2)
    # call lambdify once
    fxy = smp.lambdify((x, y), (x - normalize*xDiff, y - normalize*yDiff))
    def roll(x,y,duration):
        (x,y) = (x,y)
        for i in range(0,duration):
            # in the loop call the function that was returned by lambdify
            x, y = roll.fxy(x, y)
        return (x,y)
print(roll.roll(1,2,10))
Background
I've been working for some time on attempting to solve the (notoriously painful) Time Difference of Arrival (TDoA) multi-lateration problem, in 3-dimensions and using 4 nodes. If you're unfamiliar with the problem, it is to determine the coordinates of some signal source (X,Y,Z), given the coordinates of n nodes, the time of arrival of the signal at each node, and the velocity of the signal v.
My solution is as follows:
For each node, we write (X-x_i)**2 + (Y-y_i)**2 + (Z-z_i)**2 = (v*(t_i - T))**2
Where (x_i, y_i, z_i) are the coordinates of the ith node, and T is the time of emission.
We have now 4 equations in 4 unknowns. Four nodes are obviously insufficient. We could try to solve this system directly, however that seems next to impossible given the highly nonlinear nature of the problem (and, indeed, I've tried many direct techniques... and failed). Instead, we simplify this to a linear problem by considering all i/j possibilities, subtracting equation i from equation j. We obtain (n(n-1))/2 =6 equations of the form:
2*(x_j - x_i)*X + 2*(y_j - y_i)*Y + 2*(z_j - z_i)*Z + 2 * v**2 * (t_i - t_j) * T = v**2 * (t_i**2 - t_j**2) + (x_j**2 + y_j**2 + z_j**2) - (x_i**2 + y_i**2 + z_i**2)
These look like X*v_1 + Y*v_2 + Z*v_3 + T*v_4 = b. We now try to apply standard linear least squares, where the solution is the vector x satisfying A^T A x = A^T b. Unfortunately, if you feed this into any standard linear least squares algorithm, it chokes. So, what do we do now?
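For illustration, the pairwise equations can be assembled with NumPy roughly as follows. This is only a sketch: the function name pairwise_system, the node positions, the source, the emission time and the unit velocity are all made up for the example. It also hints at one plausible reason a generic solver struggles: every row is the difference of two per-node rows, so A can have rank at most n - 1 = 3, one short of the 4 unknowns.

```python
import numpy as np

def pairwise_system(nodes, times, v):
    """Build A and b for the unknowns (X, Y, Z, T) from all node pairs i < j."""
    P = np.asarray(nodes, dtype=float)
    t = np.asarray(times, dtype=float)
    i, j = np.triu_indices(len(P), 1)
    # Columns: coefficients of X, Y, Z, then the coefficient of T.
    A = np.column_stack([2 * (P[j] - P[i]), 2 * v**2 * (t[i] - t[j])])
    b = v**2 * (t[i]**2 - t[j]**2) + (P[j]**2).sum(axis=1) - (P[i]**2).sum(axis=1)
    return A, b

# Made-up test data: four nodes, a source, an emission time and unit velocity.
nodes = np.array([[0, 0, 0], [10, 0, 0], [0, 10, 0], [0, 0, 10]], dtype=float)
source = np.array([3.0, 4.0, 5.0])
T_emit, v = 0.2, 1.0
times = T_emit + np.linalg.norm(nodes - source, axis=1) / v

A, b = pairwise_system(nodes, times, v)
truth = np.append(source, T_emit)
print("equations hold for the true source:", np.allclose(A @ truth, b))
print("rank of A:", np.linalg.matrix_rank(A, tol=1e-9), "for", A.shape[1], "unknowns")
```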
...
The time of arrival of the signal at node i is given (of course) by:
sqrt( (X-x_i)**2 + (Y-y_i)**2 + (Z-z_i)**2 ) / v
This equation implies that the time of emission, T, is 0. If we have that T = 0, we can drop the T column in matrix A and the problem is greatly simplified. Indeed, NumPy's linalg.lstsq() gives a surprisingly accurate and precise result.
...
So, what I do is normalize the input times by subtracting from each equation the earliest time. All I have to do then is determine the dt that I can add to each time such that the residual of summed squared error for the point found by linear least squares is minimized.
I define the error for some dt to be the squared difference between the arrival time for the point predicted by feeding the input times + dt to the least squares algorithm, minus the input time (normalized), summed over all 4 nodes.
for (x_i, y_i, z_i), time in zip(nodes, times):
    error += ( (sqrt( (X-x_i)**2 + (Y-y_i)**2 + (Z-z_i)**2 ) / v) - time) ** 2
My problem:
I was able to do this somewhat satisfactorily by using brute-force. I started at dt = 0, and moved by some step up to some maximum # of iterations OR until some minimum RSS error is reached, and that was the dt I added to the normalized times to obtain a solution. The resulting solutions were very accurate and precise, but quite slow.
In practice, I'd like to be able to solve this in real time, and therefore a far faster solution will be needed. I began with the assumption that the error function (that is, dt vs error as defined above) would be highly nonlinear-- offhand, this made sense to me.
Since I don't have an actual, mathematical function, I can automatically rule out methods that require differentiation (e.g. Newton-Raphson). The error function will always be positive, so I can rule out bisection, etc. Instead, I try a simple approximation search. Unfortunately, that failed miserably. I then tried Tabu search, followed by a genetic algorithm, and several others. They all failed horribly.
So, I decided to do some investigating. As it turns out, the plot of the error function vs. dt looks a bit like a square root, only shifted right depending on how far the signal source is from the nodes:
[Plot: dt on the horizontal axis, error on the vertical axis]
And, in hindsight, of course it does! I defined the error function to involve square roots so, at least to me, this seems reasonable.
What to do?
So, my issue now is, how do I determine the dt corresponding to the minimum of the error function?
My first (very crude) attempt was to get some points on the error graph (as above), fit it using numpy.polyfit, then feed the results to numpy.roots. That root corresponds to the dt. Unfortunately, this failed, too. I tried fitting with various degrees, and also with various points, up to a ridiculous number of points such that I may as well just use brute-force.
How can I determine the dt corresponding to the minimum of this error function?
Since we're dealing with high velocities (radio signals), it's important that the results be precise and accurate, as minor variances in dt can throw off the resulting point.
I'm sure that there's some infinitely simpler approach buried in what I'm doing here; however, ignoring everything else, how do I find dt?
My requirements:
Speed is of utmost importance
I have access only to pure Python and NumPy in the environment where this will be run
EDIT:
Here's my code. Admittedly, a bit messy. Here, I'm using the polyfit technique. It will "simulate" a source for you, and compare results:
from numpy import poly1d, linspace, set_printoptions, array, linalg, triu_indices, roots, polyfit
from dataclasses import dataclass
from random import randrange
import math
@dataclass
class Vertexer:
    receivers: list
    # Defaults
    c = 299792
    # Receivers:
    # [x_1, y_1, z_1]
    # [x_2, y_2, z_2]
    # [x_3, y_3, z_3]
    # Solved:
    # [x, y, z]
    def error(self, dt, times):
        solved = self.linear([time + dt for time in times])
        error = 0
        for time, receiver in zip(times, self.receivers):
            error += ((math.sqrt( (solved[0] - receiver[0])**2 +
                (solved[1] - receiver[1])**2 +
                (solved[2] - receiver[2])**2 ) / c ) - time)**2
        return error
    def linear(self, times):
        X = array(self.receivers)
        t = array(times)
        x, y, z = X.T
        i, j = triu_indices(len(x), 1)
        A = 2 * (X[i] - X[j])
        b = self.c**2 * (t[j]**2 - t[i]**2) + (X[i]**2).sum(1) - (X[j]**2).sum(1)
        solved, residuals, rank, s = linalg.lstsq(A, b, rcond=None)
        return(solved)
    def find(self, times):
        # Normalize times
        times = [time - min(times) for time in times]
        # Fit the error function
        y = []
        x = []
        dt = 1E-10
        for i in range(50000):
            x.append(self.error(dt * i, times))
            y.append(dt * i)
        p = polyfit(array(x), array(y), 2)
        r = roots(p)
        return(self.linear([time + r for time in times]))
# SIMPLE CODE FOR SIMULATING A SIGNAL
# Pick nodes to be at random locations
x_1 = randrange(10); y_1 = randrange(10); z_1 = randrange(10)
x_2 = randrange(10); y_2 = randrange(10); z_2 = randrange(10)
x_3 = randrange(10); y_3 = randrange(10); z_3 = randrange(10)
x_4 = randrange(10); y_4 = randrange(10); z_4 = randrange(10)
# Pick source to be at random location
x = randrange(1000); y = randrange(1000); z = randrange(1000)
# Set velocity
c = 299792 # km/s
# Generate simulated source
t_1 = math.sqrt( (x - x_1)**2 + (y - y_1)**2 + (z - z_1)**2 ) / c
t_2 = math.sqrt( (x - x_2)**2 + (y - y_2)**2 + (z - z_2)**2 ) / c
t_3 = math.sqrt( (x - x_3)**2 + (y - y_3)**2 + (z - z_3)**2 ) / c
t_4 = math.sqrt( (x - x_4)**2 + (y - y_4)**2 + (z - z_4)**2 ) / c
print('Actual:', x, y, z)
myVertexer = Vertexer([[x_1, y_1, z_1],[x_2, y_2, z_2],[x_3, y_3, z_3],[x_4, y_4, z_4]])
solution = myVertexer.find([t_1, t_2, t_3, t_4])
print(solution)
It seems like the Bancroft method applies to this problem? Here's a pure NumPy implementation.
# Implementation of the Bancroft method, following
# https://gssc.esa.int/navipedia/index.php/Bancroft_Method
import numpy as np
M = np.diag([1, 1, 1, -1])
def lorentz_inner(v, w):
    return np.sum(v * (w @ M), axis=-1)
B = np.array(
    [
        [x_1, y_1, z_1, c * t_1],
        [x_2, y_2, z_2, c * t_2],
        [x_3, y_3, z_3, c * t_3],
        [x_4, y_4, z_4, c * t_4],
    ]
)
one = np.ones(4)
a = 0.5 * lorentz_inner(B, B)
B_inv_one = np.linalg.solve(B, one)
B_inv_a = np.linalg.solve(B, a)
for Lambda in np.roots(
    [
        lorentz_inner(B_inv_one, B_inv_one),
        2 * (lorentz_inner(B_inv_one, B_inv_a) - 1),
        lorentz_inner(B_inv_a, B_inv_a),
    ]
):
    x, y, z, c_t = M @ np.linalg.solve(B, Lambda * one + a)
    print("Candidate:", x, y, z, c_t / c)
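To try this without the simulation code from the question, here is a self-contained sketch that wraps the same steps in a function and checks them against a made-up source. The function name bancroft_candidates and the receiver and source coordinates are assumptions for the example. It needs exactly four receivers because B must be square, and the quadratic gives two candidates, only one of which is the true source.

```python
import numpy as np

M = np.diag([1.0, 1.0, 1.0, -1.0])

def lorentz_inner(v, w):
    """Lorentz inner product: x, y, z terms minus the time-like term."""
    return np.sum(v * (w @ M), axis=-1)

def bancroft_candidates(receivers, times, c):
    """Return the candidate (x, y, z, emission_time) tuples for 4 receivers."""
    receivers = np.asarray(receivers, dtype=float)
    times = np.asarray(times, dtype=float)
    B = np.column_stack([receivers, c * times])
    one = np.ones(len(B))
    a = 0.5 * lorentz_inner(B, B)
    u = np.linalg.solve(B, one)  # B^-1 * 1
    w = np.linalg.solve(B, a)    # B^-1 * a
    lambdas = np.roots(
        [lorentz_inner(u, u), 2 * (lorentz_inner(u, w) - 1), lorentz_inner(w, w)]
    )
    candidates = []
    for lam in lambdas:
        x, y, z, c_t = M @ (lam * u + w)
        candidates.append((x, y, z, c_t / c))
    return candidates

# Made-up geometry: four receivers and a source that emits at time 0.
receivers = np.array([[0, 0, 0], [10, 0, 0], [0, 10, 0], [0, 0, 10]], dtype=float)
source = np.array([20.0, 30.0, 40.0])
c = 299792.0  # km/s
times = np.linalg.norm(receivers - source, axis=1) / c

for candidate in bancroft_candidates(receivers, times, c):
    print("Candidate:", candidate)
```

With noisy measurements the roots can become complex, so a real application would need to handle that case and pick between the two candidates, for example by checking which one reproduces the measured arrival times.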
My answer might have mistakes (glaring) as I had not heard the TDOA term before this afternoon. Please double check if the method is right.
I could not find a solution to your original problem of finding the dt corresponding to the minimum error. My answer also deviates from the requirement that no third-party library other than NumPy be used (I used SymPy and largely used the code from here). However, I am still posting this thinking that somebody someday might find it useful if all one is interested in ... is to find X,Y,Z of the source emitter. This method also does not take into account real-life situations where white noise or errors might be present, or the curvature of the earth and other complications.
Your initial test conditions are as below.
from random import randrange
import math
# SIMPLE CODE FOR SIMULATING A SIGNAL
# Pick nodes to be at random locations
x_1 = randrange(10); y_1 = randrange(10); z_1 = randrange(10)
x_2 = randrange(10); y_2 = randrange(10); z_2 = randrange(10)
x_3 = randrange(10); y_3 = randrange(10); z_3 = randrange(10)
x_4 = randrange(10); y_4 = randrange(10); z_4 = randrange(10)
# Pick source to be at random location
x = randrange(1000); y = randrange(1000); z = randrange(1000)
# Set velocity
c = 299792 # km/s
# Generate simulated source
t_1 = math.sqrt( (x - x_1)**2 + (y - y_1)**2 + (z - z_1)**2 ) / c
t_2 = math.sqrt( (x - x_2)**2 + (y - y_2)**2 + (z - z_2)**2 ) / c
t_3 = math.sqrt( (x - x_3)**2 + (y - y_3)**2 + (z - z_3)**2 ) / c
t_4 = math.sqrt( (x - x_4)**2 + (y - y_4)**2 + (z - z_4)**2 ) / c
print('Actual:', x, y, z)
My solution is as below.
import sympy as sym
X,Y,Z = sym.symbols('X,Y,Z', real=True)
f = sym.Eq((x_1 - X)**2 +(y_1 - Y)**2 + (z_1 - Z)**2 , (c*t_1)**2)
g = sym.Eq((x_2 - X)**2 +(y_2 - Y)**2 + (z_2 - Z)**2 , (c*t_2)**2)
h = sym.Eq((x_3 - X)**2 +(y_3 - Y)**2 + (z_3 - Z)**2 , (c*t_3)**2)
i = sym.Eq((x_4 - X)**2 +(y_4 - Y)**2 + (z_4 - Z)**2 , (c*t_4)**2)
print("Solved coordinates are ", sym.solve([f,g,h,i],X,Y,Z))
The print statement from your initial conditions gave
Actual: 111 553 110
and the solution that almost instantly came out was
Solved coordinates are [(111.000000000000, 553.000000000000, 110.000000000000)]
Sorry again if something is totally amiss.
I have been at this for more than 2 hours. Each individual line prints out the result I am looking for. However, when I run all the lines in the program, Python prints the values of Ixx, Iyy and Ixy. Why is this?
import numpy as np
Ixx = 14600000
Iyy = 14600000
Ixy = 7080*(47.2-12.5)**2
alpha = 45
x = 0.5*(Ixx+Iyy)+0.5*(Ixx-Iyy)*np.cos(2*alpha/180*np.pi)+Ixy*np.sin(2*alpha/180*np.pi)
y = 0.5*(Ixx+Iyy)-0.5*(Ixx-Iyy)*np.cos(2*alpha/180*np.pi)-Ixy*np.sin(2*alpha/180*np.pi)
z = 0.5*(Ixx-Iyy)*np.sin(2*alpha/180*np.pi)+Ixy*np.cos(2*alpha/180*np.pi)
print x,y,z
If you're running this under Python 2.x, then you are losing information with the expression 2*alpha/180*np.pi (Python 3.x should work though).
The operations are evaluated in order of precedence (left to right in this case), which gives
((2 * alpha) / 180) * np.pi
=> (90 / 180) * np.pi # integer division truncates this to 0
=> 0 * np.pi
You need to manually convert to a float, either:
np.sin(2.0*alpha/180*np.pi) # the floating point 2.0 will promote alpha to float for the multiply
Or
np.sin(2*float(alpha)/180*np.pi) # explicit, very clear
Or
alpha = 45.0 # this is a little dangerous as you might change the angle in the future and forget to make it a float again
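Another option, shown here only as a sketch, is to let NumPy do the degree-to-radian conversion with np.deg2rad, which returns a float even when alpha is an int. The variable name two_alpha is introduced just for this example; the other names and values come from the question.

```python
import numpy as np

Ixx = 14600000
Iyy = 14600000
Ixy = 7080 * (47.2 - 12.5) ** 2
alpha = 45  # degrees; an int is fine here

# np.deg2rad always returns a float, so no integer division can occur.
two_alpha = np.deg2rad(2 * alpha)

Ixx_a = 0.5 * (Ixx + Iyy) + 0.5 * (Ixx - Iyy) * np.cos(two_alpha) + Ixy * np.sin(two_alpha)
Iyy_a = 0.5 * (Ixx + Iyy) - 0.5 * (Ixx - Iyy) * np.cos(two_alpha) - Ixy * np.sin(two_alpha)
Ixy_a = 0.5 * (Ixx - Iyy) * np.sin(two_alpha) + Ixy * np.cos(two_alpha)

print(Ixx_a, Iyy_a, Ixy_a)
```

Computing the angle once also avoids repeating the conversion six times.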
I can't say I understand why, but changing the interior of my sine and cosine terms solved the problem. I changed
x = 0.5*(Ixx+Iyy)+0.5*(Ixx-Iyy)*np.cos(2*alpha/180*np.pi)+Ixy*np.sin(2*alpha/180*np.pi)
y = 0.5*(Ixx+Iyy)-0.5*(Ixx-Iyy)*np.cos(2*alpha/180*np.pi)-Ixy*np.sin(2*alpha/180*np.pi)
z = 0.5*(Ixx-Iyy)*np.sin(2*alpha/180*np.pi)+Ixy*np.cos(2*alpha/180*np.pi)
to
Ixx_a = 0.5*(Ixx+Iyy)+0.5*(Ixx-Iyy)*np.cos(np.pi*alpha/90) + Ixy*np.sin(np.pi*alpha/90)
Iyy_a = 0.5*(Ixx+Iyy)-0.5*(Ixx-Iyy)*np.cos(np.pi*alpha/90) - Ixy*np.sin(np.pi*alpha/90)
Ixy_a = 0.5*(Ixx-Iyy)*np.sin(np.pi*alpha/90)+Ixy*np.cos(np.pi*alpha/90)
which is mathematically equivalent. Presumably it works because np.pi*alpha is already a float, so the following division by 90 is no longer an integer division.
I'd like to implement Euler's method (the explicit and the implicit one)
(https://en.wikipedia.org/wiki/Euler_method) for the following model:
x(t)' = q(x_M -x(t))x(t)
x(0) = x_0
where q, x_M and x_0 are real numbers.
I already know the (theoretical) implementation of the method, but I couldn't figure out where to insert / change the model.
Could anybody help?
EDIT: You were right. I didn't understand correctly the method. Now, after a few hours, I think that I really got it! With the explicit method, I'm pretty sure (nevertheless: could anybody please have a look at my code? )
With the implicit implementation, I'm not sure it's correct. Could anyone please have a look at the implementation of the implicit method and give me feedback on what is correct and what is not?
def explizit_euler():
    ''' x(t)' = q(xM -x(t))x(t)
    x(0) = x0'''
    q = 2.
    xM = 2
    x0 = 0.5
    T = 5
    dt = 0.01
    N = T / dt
    x = x0
    t = 0.
    for i in range (0 , int(N)):
        t = t + dt
        x = x + dt * (q * (xM - x) * x)
        print('%6.3f %6.3f' % (t, x))
def implizit_euler():
    ''' x(t)' = q(xM -x(t))x(t)
    x(0) = x0'''
    q = 2.
    xM = 2
    x0 = 0.5
    T = 5
    dt = 0.01
    N = T / dt
    x = x0
    t = 0.
    for i in range (0 , int(N)):
        t = t + dt
        x = (1.0 / (1.0 - q *(xM + x) * x))
        print('%6.3f %6.3f' % (t, x))
Pre-emptive note: Although the general idea should be correct, I did all the algebra in place in the editor box so there might be mistakes there. Please, check it yourself before using for anything really important.
I'm not sure how you came to the "implicit" formula
x = (1.0 / (1.0 - q *(xM + x) * x))
but it is wrong, and you can check this by comparing your "explicit" and "implicit" results: they should diverge only slightly, but with this formula they diverge drastically.
To understand the implicit Euler method, you should first get the idea behind the explicit one. And the idea is really simple and is explained at the Derivation section in the wiki: since derivative y'(x) is a limit of (y(x+h) - y(x))/h, you can approximate y(x+h) as y(x) + h*y'(x) for small h, assuming our original differential equation is
y'(x) = F(x, y(x))
Note that the reason this is only an approximation rather than exact value is that even over small range [x, x+h] the derivative y'(x) changes slightly. It means that if you want to get a better approximation of y(x+h), you need a better approximation of "average" derivative y'(x) over the range [x, x+h]. Let's call that approximation just y'. One idea of such improvement is to find both y' and y(x+h) at the same time by saying that we want to find such y' and y(x+h) that y' would be actually y'(x+h) (i.e. the derivative at the end). This results in the following system of equations:
y'(x+h) = F(x+h, y(x+h))
y(x+h) = y(x) + h*y'(x+h)
which is equivalent to a single "implicit" equation:
y(x+h) - y(x) = h * F(x+h, y(x+h))
It is called "implicit" because here the target y(x+h) is also a part of F. Note that a very similar equation is mentioned in the Modifications and extensions section of the wiki article.
So now going to your case that equation becomes
x(t+dt) - x(t) = dt*q*(xM -x(t+dt))*x(t+dt)
or equivalently
dt*q*x(t+dt)^2 + (1 - dt*q*xM)*x(t+dt) - x(t) = 0
This is a quadratic equation with two solutions:
x(t+dt) = [(dt*q*xM - 1) ± sqrt((dt*q*xM - 1)^2 + 4*dt*q*x(t))]/(2*dt*q)
Obviously we want the solution that is "close" to x(t), which is the + solution. So the code for one step should be something like:
b = q * xM * dt - 1
x_new = (b + (b ** 2 + 4 * q * x * dt) ** 0.5) / (2 * q * dt)
(editor note:) Applying the binomial complement (multiplying numerator and denominator by the conjugate), this formula has a numerically more stable form for small dt, where b < 0:
x_new = (2 * x) / ((b ** 2 + 4 * q * x * dt) ** 0.5 - b)
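Putting the pieces together, here is a minimal sketch that runs both methods on this model and compares them with the closed-form logistic solution x(t) = xM / (1 + (xM/x0 - 1) * exp(-q*xM*t)). The function names are made up for this example, the implicit step uses the numerically stable form from the editor note, and the parameter values are the ones from the question.

```python
import math

def explicit_step(x, dt, q, xM):
    """Explicit Euler: evaluate the right-hand side at the current x."""
    return x + dt * q * (xM - x) * x

def implicit_step(x, dt, q, xM):
    """Implicit Euler: the '+' root of the quadratic, in its stable form."""
    b = q * xM * dt - 1.0
    return 2.0 * x / (math.sqrt(b * b + 4.0 * q * x * dt) - b)

def exact(t, x0, q, xM):
    """Closed-form solution of x' = q*(xM - x)*x with x(0) = x0."""
    return xM / (1.0 + (xM / x0 - 1.0) * math.exp(-q * xM * t))

def integrate(step, x0, T, dt, q, xM):
    """Apply the given step function from t = 0 up to t = T."""
    x = x0
    for _ in range(int(round(T / dt))):
        x = step(x, dt, q, xM)
    return x

q, xM, x0, dt = 2.0, 2.0, 0.5, 0.01
for T in (0.5, 5.0):
    print('%4.1f %8.5f %8.5f %8.5f' % (
        T,
        integrate(explicit_step, x0, T, dt, q, xM),
        integrate(implicit_step, x0, T, dt, q, xM),
        exact(T, x0, q, xM),
    ))
```

Both numerical columns should stay close to the exact one. If the implicit column drifts far away, the step formula is the first thing to check.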