Compare 2 lines direction - python

I have points of line as
line1 = (13.010815620422363, 6.765378475189209), (-9.916780471801758, 12.464008331298828)
line2 = (-28.914321899414062, 2.4057865142822266),(13.973191261291504, -8.306382179260254)
Is there a way to get the line direction from some formula or code (Python)?

First calculate the vectors of the two lines. Then you can calculate the cosine of the angle between the two vectors using the dot product. If the result is close to 1, both lines point in the same direction. If the result is close to -1, the second line points in the opposite direction.
import math
line1 = (13.010815620422363, 6.765378475189209), (-9.916780471801758, 12.464008331298828)
line2 = (-28.914321899414062, 2.4057865142822266),(13.973191261291504, -8.306382179260254)
vec1 = (line1[1][0] - line1[0][0], line1[1][1] - line1[0][1])
vec2 = (line2[1][0] - line2[0][0], line2[1][1] - line2[0][1])
cos_angle = (vec1[0] * vec2[0] + vec1[1] * vec2[1]) / math.sqrt((vec1[0]**2 + vec1[1]**2) * (vec2[0]**2 + vec2[1]**2))
In this case the result is -0.9999993352122917

Each of your two "lines" has a start point and an end point. This defines a vector, whose coordinates you can get by subtracting the coordinates of the start point from the coordinates of the end point.
To figure out whether two vectors are going in the same direction, you can look at the oriented angle between them; or better yet, at the cosine of that angle. The cosine will be +1 if they point in exactly the same direction, 0 if they are exactly orthogonal, and -1 if they point in exactly opposite directions. Otherwise it takes some intermediate value between -1 and +1.
See also:
Wikipedia on cosine similarity
StackOverflow: Python, Cosine similarity between two number lists?
With all that in mind:
def vector_of_segment(start, end):
    a, b = start
    c, d = end
    return (c - a, d - b)
def scalar_product(u, v):
    a, b = u
    c, d = v
    return a * c + b * d
import math
def norm(u):
    return math.sqrt(scalar_product(u,u))
# python>=3.8: use math.hypot instead of defining your own norm
def cosine_similarity(u,v):
    return scalar_product(u,v) / (norm(u) * norm(v))
def cosine_similarity_of_roads(line1, line2):
    u = vector_of_segment(*line1)
    v = vector_of_segment(*line2)
    return cosine_similarity(u, v)
If you have one of the two awesome libraries numpy and scipy installed, you can also use already-implemented versions of the cosine similarity from these libraries, rather than implementing your own. Refer to the answers to the question I linked above.
Test:
>>> line1 = (13.010815620422363, 6.765378475189209), (-9.916780471801758, 12.464008331298828)
>>> line2 = (-28.914321899414062, 2.4057865142822266),(13.973191261291504, -8.306382179260254)
>>> cosine_similarity_of_roads(line1, line2)
-0.9999993352122917
The result is very close to -1, so your lines point in almost exactly opposite directions.
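Since the cosine will rarely be exactly +1 or -1 with floating-point data, a small tolerance helps when turning the number into a decision. The following is only a sketch: the function name `direction_relation` and the default tolerance `tol=1e-3` are assumptions for illustration, not part of the answers above.

```python
import math

def direction_relation(line1, line2, tol=1e-3):
    """Classify two segments as 'same', 'opposite' or 'neither' by cosine."""
    (x1, y1), (x2, y2) = line1
    (x3, y3), (x4, y4) = line2
    u = (x2 - x1, y2 - y1)          # vector of the first segment
    v = (x4 - x3, y4 - y3)          # vector of the second segment
    cos = (u[0] * v[0] + u[1] * v[1]) / (math.hypot(*u) * math.hypot(*v))
    if cos >= 1 - tol:
        return "same"
    if cos <= -1 + tol:
        return "opposite"
    return "neither"

line1 = (13.010815620422363, 6.765378475189209), (-9.916780471801758, 12.464008331298828)
line2 = (-28.914321899414062, 2.4057865142822266), (13.973191261291504, -8.306382179260254)
print(direction_relation(line1, line2))  # opposite
```

The tolerance should be chosen to suit how noisy your coordinates are; a zero-length segment would still raise a ZeroDivisionError here.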

Related

Function to calculate the average distance from a set of tuples (Python)

I need to implement a function that, given a set of points, each specified by a pair of integers, returns the average distance between the points. If there are fewer than 2 points in the set, it should raise a ValueError.
The distance is computed using the formula:
d=sqrt ((x1−x2)**2+(y1−y2)**2)
I'm struggling to get the loop to work: it gives me an error saying that an object of type types.GenericAlias has no len(). I realised that this has something to do with the input being a set, but I don't know how to resolve it:
def average_distance(points: set[tuple[int,int]]) -> float:
    from math import sqrt
    points = list[input()]
    list_dist =[]
    for index in range(0, len(points)):
        coordinate = points[index] # tuple in the set points
        x1 = coordinate[0] # first el in the pair
        y1 = coordinate[1] # second el in the pair
        next_coordinate = points[index +1]
        x2 = next_coordinate[0]
        y2 = next_coordinate[1]
        distance = math.sqrt(((x1-x2)**2)+((y1-y2)**2))
        list_dist.append(distance)
    total_dist = 0
    for dist in distance:
        total_dist += dist
    avg_dist = total_dist//(len(distance))
    return avg_dist
So
print (average_distance({(1,2), (3,4), (5,6)}))
Expected output:
3.7712
Would be grateful for your advice on this.
Many thanks
A shorter solution that makes more use of the standard library (math.dist requires Python 3.8 or later):
from statistics import mean
from math import dist
from itertools import combinations, starmap
def average_distance(points):
    return mean(starmap(dist, combinations(points, 2)))
print(average_distance({(1,2), (3,4), (5,6)}))
Output:
3.771236166328254
Here is my implementation for both the average distance between every group of two points given sequentially and all combinations of points. Take a look.
from itertools import combinations
from math import sqrt
from typing import List, NamedTuple
class Point(NamedTuple):
    x: float
    y: float
def distance(p1: Point, p2: Point) -> float:
    return sqrt((p2.x - p1.x) ** 2 + (p2.y - p1.y) ** 2)
def avg_dist_between_all_points(points: List[Point]) -> float:
    c = list(combinations(points, 2))
    return sum(distance(*pair) for pair in c) / len(c)
def avg_dist_between_seq_points(points: List[Point]) -> float:
    c = [points[i : i + 2] for i in range(len(points) - 1)]
    return sum(distance(*pair) for pair in c) / len(c)
if __name__ == "__main__":
    input_str = input("points (ex: 1,2 3,4 5,6): ")
    point_strs = input_str.split(" ")
    points: List[Point] = []
    for s in point_strs:
        x, y = s.split(",")
        points.append(Point(float(x), float(y)))
    print(avg_dist_between_all_points(points))
    print(avg_dist_between_seq_points(points))
This yields:
➜ ./avgdist.py
points (ex: 1,2 3,4 5,6): 1,2 3,4 5,6
3.771236166328254
2.8284271247461903
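The question also asks for a ValueError when fewer than two points are given, which neither snippet above checks explicitly. A minimal sketch of how that check might be added to the all-pairs version, assuming Python 3.8+ for math.dist (the error message text is my own choice):

```python
from itertools import combinations
from math import dist
from statistics import mean

def average_distance(points):
    """Mean Euclidean distance over all unordered pairs of points."""
    if len(points) < 2:
        # The question requires a ValueError for fewer than 2 points.
        raise ValueError("need at least two points")
    return mean(dist(p, q) for p, q in combinations(points, 2))

print(average_distance({(1, 2), (3, 4), (5, 6)}))
```

Because combinations accepts any iterable, the set does not have to be converted to a list first, which avoids the indexing problem from the question.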

Method of generating a string with results from a curve_fit

I have created a class which takes a distribution, and fits it. The method has the option for choosing between a few predefined functions.
As part of printing the class, I print the result of the fit in the form of an equation, where the fit results and their errors are displayed over the figure.
My question is: is there a tidy way to handle a negative number, such that the string for printing is formed as "y = mx - c", and not "y = mx + -c"?
I developed this with a linear fit, where I simply assess the sign of the constant, and form the string in one of two ways:
def fit_result_string(self, results, errors):
    if self.fit_model is utl.linear:
        if results[1] > 0:
            fit_str = r"y = {:.3}($\pm${:.3})x + {:.3}($\pm${:.3})".format(
                results[0],
                errors[0],
                results[1],
                errors[1])
        else:
            fit_str = r"y = {:.3}($\pm${:.3})x - {:.3}($\pm${:.3})".format(
                results[0],
                errors[0],
                abs(results[1]),
                errors[1])
    return fit_str
I now want to build this up to also be able to form a string containing the results if the fit model is changed to a 2nd, 3rd, or 4th degree polynomial, while handling the sign of each coefficient.
Is there a better way to do this than using a whole bunch of if-else statements?
Thanks in advance!
Define a function which returns '+' or '-' according to the given number, and call it inside an f-string.
def plus_minus_string(n):
    return '+' if n >= 0 else '-'
print(f"y = {m}x {plus_minus_string(c)} {abs(c)}")
Examples:
>>> m = 2
>>> c = 5
>>> print(f"y = {m}x {plus_minus_string(c)} {abs(c)}")
y = 2x + 5
>>> c = -4
>>> print(f"y = {m}x {plus_minus_string(c)} {abs(c)}")
y = 2x - 4
You will need to adapt it a bit to fit your code, but I hope it's quite straightforward.
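The same sign-handling idea can be extended to a polynomial of any degree by looping over the coefficients. This is only a sketch under assumptions: the function name `format_polynomial`, the highest-degree-first coefficient order (as numpy.polyfit returns them), the `.3g` number format and the `x^n` notation are my choices, not something taken from the question's class.

```python
def format_polynomial(coeffs, errors, pm=r"$\pm$"):
    """Format 'y = ...' from coefficients given highest degree first."""
    degree = len(coeffs) - 1
    parts = []
    for i, (c, e) in enumerate(zip(coeffs, errors)):
        power = degree - i
        if power == 0:
            var = ""
        elif power == 1:
            var = "x"
        else:
            var = f"x^{power}"
        term = f"{abs(c):.3g}({pm}{e:.3g}){var}"
        if i == 0:
            # Leading term: only show a sign when it is negative.
            parts.append(("-" if c < 0 else "") + term)
        else:
            # Later terms: the sign becomes the joining operator.
            parts.append(("-" if c < 0 else "+") + " " + term)
    return "y = " + " ".join(parts)

print(format_polynomial([2.0, -4.0], [0.1, 0.2]))
print(format_polynomial([-1.5, 0.25, 3.0], [0.01, 0.02, 0.3]))
```

With this approach no per-degree if-else branches are needed; the degree is inferred from the length of the coefficient list.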

evalf and subs in sympy on single variable expression returns expression instead of expected float value

I'm new to sympy and I'm trying to use it to get the values of higher order Greeks of options (basically higher order derivatives). My goal is to do a Taylor series expansion. The function in question is the first derivative.
f(x) = N(d1)
N(d1) is the P(X <= d1) of a standard normal distribution. d1 in turn is another function of x (x in this case is the price of the stock to anybody who's interested).
d1 = (np.log(x/100) + (0.01 + 0.5*0.11**2)*0.5)/(0.11*np.sqrt(0.5))
As you can see, d1 is a function of only x. This is what I have tried so far.
import sympy as sp
from math import pi
from sympy.stats import Normal,P
x = sp.symbols('x')
u = (sp.log(x/100) + (0.01 + 0.5*0.11**2)*0.5)/(0.11*np.sqrt(0.5))
N = Normal('N',0,1)
f = sp.simplify(P(N <= u))
print(f.evalf(subs={x:100})) # This should be 0.5155
f1 = sp.simplify(sp.diff(f,x))
f1.evalf(subs={x:100}) # This should also return a float value
The last line of code however returns an expression, not a float value as I expected like in the case with f. I feel like I'm making a very simple mistake but I can't find out why. I'd appreciate any help.
Thanks.
If you define x with positive=True (which is implied by the log in the definition of u, assuming u is real, which in turn is implied by the definition of f), it looks like you get almost the expected result. Using f1.subs({x:100}) in the version without the positive assumption on x shows that the trouble is with unevaluated polar_lift(0) terms:
import sympy as sp
from sympy.stats import Normal, P
x = sp.symbols('x', positive=True)
u = (sp.log(x/100) + (0.01 + 0.5*0.11**2)*0.5)/(0.11*sp.sqrt(0.5)) # changed np to sp
N = Normal('N',0,1)
f = sp.simplify(P(N <= u))
print(f.evalf(subs={x:100})) # 0.541087287864516
f1 = sp.simplify(sp.diff(f,x))
print(f1.evalf(subs={x:100})) # 0.0510177033783834
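As a cross-check that does not rely on sympy, the same two numbers can be computed by hand with the standard library: the normal CDF via math.erf, and the derivative via the chain rule, f'(x) = pdf(d1) * d(d1)/dx with d(d1)/dx = 1/(x * 0.11 * sqrt(0.5)). This is a sketch; the names `DENOM`, `d1`, `f` and `f1` below are just labels chosen to mirror the question.

```python
import math

DENOM = 0.11 * math.sqrt(0.5)   # denominator of d1 from the question

def d1(x):
    return (math.log(x / 100) + (0.01 + 0.5 * 0.11**2) * 0.5) / DENOM

def f(x):
    """Standard normal CDF evaluated at d1(x)."""
    return 0.5 * (1.0 + math.erf(d1(x) / math.sqrt(2.0)))

def f1(x):
    """Derivative of f by the chain rule: pdf(d1(x)) / (x * DENOM)."""
    pdf = math.exp(-0.5 * d1(x) ** 2) / math.sqrt(2.0 * math.pi)
    return pdf / (x * DENOM)

print(f(100))
print(f1(100))
```

Both printed values should agree with the sympy results quoted above to several decimal places.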

Correct sequence of commands for symbolic equation using sympy

Note: I am brand new to sympy and trying to figure it out how it works.
What I have now:
I do get the correct solutions but it takes 35 - 50 seconds.
Goal:
To speed up the calculations by defining the symbolic equation once and then reusing it with different variables.
Set up:
I need to solve a polynomial G(t) (6 roots in t) on every iteration of the loop (220 iterations in total).
G(t) has 6 other variables, which are calculated and known on every iteration.
These variables are different on every iteration.
First try (slow):
I simply put everything into one Python function, where I defined Gt symbolically and solved for t.
It was running around 35 - 40 seconds. function_G is called on every iteration.
def function_G(f1, f2, a, b, c, d):
    t = sp.symbols('t')
    left = t * ((a * t + b)**2 + f2**2 * (c*t+d)**2)**2
    right = (a*d-b*c) * (1+ f1**2 * t**2)**2 * (a*t+b) * (c*t+d)
    eq = sp.expand(left - right)
    roots = sp.solveset(Gt, t)
    return roots
Then a person gave me a hint that:
You should only need to (symbolically) solve for the coefficients of the polynomial once, as a preprocessing step. After that, when processing each iterations, you simply calculate the polynomial coefficients, then solve for the roots.
When I asked for clarification, the person added:
So I defined the function g(t) and then used sympy.expand to work out all parentheses/exponents, and then sympy.collect to collect terms by powers of t. Finally I used .coeff on the output of collect to get the coefficients to feed into numpy.roots.
Second try:
To follow the advice, I first defined G(t) symbolically and passed it, along with its symbolic parameters, to the function that runs the loop. The function constructGt() is thus called only once.
def constructGt():
    t, a, b, c, d, f1, f2 = sp.symbols('t a b c d f1 f2')
    left = t * ((a * t + b)**2 + f2**2 * (c*t+d)**2)**2
    right = (a*d-b*c) * (1+ f1**2 * t**2)**2 * (a*t+b) * (c*t+d)
    gt = sp.Eq(left - right, 0)
    expanded = sp.expand(gt)
    expanded = sp.collect(expanded, t)
    g_vars = {
        "a": a,
        "b": b,
        "c": c,
        "d": d,
        "f1": f1,
        "f2": f2
    }
    return expanded, g_vars
Then, on every iteration, I passed the expression and its parameters to this function to get the roots:
#Variables values:
#a = 0.00011713490404073987
#b = 0.00020253296124588926
#c = 4.235688216068313e-07
#d = 0.012262546040805029
#f1= -0.012553203944721956
#f2 = 0.018529776776949003
def function_G(f1_, f2_, a_, b_, c_, d_, Gt, v):
    Gt = Gt.subs([(v['a'], a_), (v['b'], b_),
                  (v['c'], c_), (v['d'], d_),
                  (v['f1'], f1_), (v['f2'], f2_)])
    roots = sp.solveset(Gt, t)
    return roots
But it got even slower: around 56 seconds.
Question:
I do not understand what I am doing wrong. I also do not understand how this person is using .coeff() and then np.roots on the results.
Even if your f1 and f2 are linear in a variable, you are working with a quartic polynomial and the roots of that are very long. If this is univariate then it would be better to just use the expression, solve it at some value of constants where the solution is known and then use that value and new constants that are relatively close to the old ones and use nsolve to get the next root. If you are interested in more than one solution you may have to "follow" each root separately with nsolve...but I think you will be much happier with the overall performance. Using real_roots is another option, especially if the expression is simply a polynomial in some variable.
Given that you are working with a quartic you should keep this in mind: the general solution is so long and complicated (except for very special cases) that it is not efficient to work with the general solution and substitute in values as they are known. It is very easy to solve for numerical values, however, and it is much faster:
First create the symbolic expression into which values will be substituted; assumptionless "vanilla" symbols are used:
from sympy import symbols, real_roots, roots, re
t, a, b, c, d, f1, f2 = symbols('t a b c d f1 f2')
left = t * ((a * t + b)**2 + f2**2 * (c*t+d)**2)**2
right = (a*d-b*c) * (1+ f1**2 * t**2)**2 * (a*t+b) * (c*t+d)
eq = left - right
Next, define a dictionary of replacements to substitute into the expression, noting that dict(x=1) creates {'x': 1} and that, when this is used with subs, a vanilla Symbol will be created for "x":
reps = dict(
a = 0.00011713490404073987 ,
b = 0.00020253296124588926 ,
c = 4.235688216068313e-07 ,
d = 0.012262546040805029 ,
f1= -0.012553203944721956 ,
f2 = 0.018529776776949003)
Evaluate the real roots of the expression:
from time import time
t=time();[i.n(3) for i in real_roots(eq.subs(reps))];'%s sec' % round(time()-t)
[-11.5, -1.73, 8.86, 1.06e+8]
'3 sec'
Find all 6 roots of the expression but take only the real parts:
>>> roots(eq.subs(reps))
{-11.4594523988215: 1, -1.73129179415963: 1, 8.85927293271708: 1, 106354884.436542: 1, -1.29328524826433 - 10.3034942999005*I: 1, -1.29328524826433 + 10.3034942999005*I: 1}
>>> [re(i).n(3) for i in _]
[-11.5, -1.73, 8.86, 1.06e+8, -1.29, -1.29]
Change one or more values and do it again:
reps.update(dict(a=2))
[i.n(3) for i in real_roots(eq.subs(reps))]
[-0.0784, -0.000101, 0.0782, 3.10e+16]
Update values in a loop:
>>> a = 1
>>> for i in range(3):
... a += 1
... reps.update(dict(a=a))
... a, real_roots(eq.subs(reps))[0].n(3)
...
(2, -0.0784)
(3, -0.0640)
(4, -0.0554)
Note: when using roots, the real roots will come first in sorted order and then imaginary roots will come in conjugate pairs (but otherwise not in any given order).
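For completeness, the hint quoted in the question (extract the coefficients symbolically once, then solve numerically on each iteration) could look roughly like the sketch below. It uses sympy.Poly(...).all_coeffs() and sympy.lambdify in place of the expand/collect/.coeff steps the hint mentions, and the names `coeff_func` and `roots_numeric` are hypothetical. Be aware that with coefficients spanning many orders of magnitude, as with the values in the question, numpy.roots may lose accuracy compared with real_roots.

```python
import numpy as np
import sympy as sp

t, a, b, c, d, f1, f2 = sp.symbols('t a b c d f1 f2')
left = t * ((a*t + b)**2 + f2**2 * (c*t + d)**2)**2
right = (a*d - b*c) * (1 + f1**2 * t**2)**2 * (a*t + b) * (c*t + d)

# Preprocessing, done once: coefficients of t, highest power first,
# each one a symbolic expression in a, b, c, d, f1, f2.
coeff_exprs = sp.Poly(left - right, t).all_coeffs()

# Turn the symbolic coefficients into a fast numeric function.
coeff_func = sp.lambdify((a, b, c, d, f1, f2), coeff_exprs, 'numpy')

def roots_numeric(f1_, f2_, a_, b_, c_, d_):
    """Per-iteration work: evaluate the coefficients, then find the roots."""
    return np.roots(coeff_func(a_, b_, c_, d_, f1_, f2_))

print(roots_numeric(-0.012553203944721956, 0.018529776776949003,
                    0.00011713490404073987, 0.00020253296124588926,
                    4.235688216068313e-07, 0.012262546040805029))
```

The symbolic work happens only at import time, so each of the 220 iterations reduces to evaluating seven numbers and one call to numpy.roots.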

Numpy: different values when calculating a sum of a sequence

I'm using scipy.integrate's odeint function to find solutions to the equation
$$ \dot x = -\frac{f(x)}{g(x)}, $$
where $f$ and $g$ are both functions of $x$. $f,g$ are given by series of the form
$$ f(x) = x(1 + \sum_k b_k x^{k/2}) $$
$$ g(x) = 1 + \sum_k a_k (1 + k/2) x^{k/2}. $$
All positive initial values for $x$ should result in the solution blowing up in time, but they aren't...well, not always.
The coefficients $a_n, b_n$ are long polynomials, where $b_n$ is dependent on $x$ in a certain way, and $a_n$ is dependent on several terms being held constant.
Depending on the way I compute $g(x)$, I get very different behavior.
The first way I tried is as follows. 'a' and 'b' are 1x8 and 1x9 numpy arrays. Note that in the function g(x, a), a is multiplied by gterms in line 3, and does not appear in line 2.
def g(x, a):
    gterms = [(0.5*k + 1.) * x**(0.5*k) for k in range( len(a) )]
    return 1. + np.sum(a*gterms)
def rhs(u,t):
    x = u
    a, b = An(), Bn(x) #An() and Bn(x) are functions that return an array of coefficients
    return -f(x, b)/g(x, a)
t = np.linspace(.,.,.)
solution = odeint(rhs, <some initial value>, t)
The second way was this:
def g(x, a):
    gterms = [(0.5*k + 1.) * a[k] * x**(0.5*k) for k in range( len(a) )]
    return 1. + np.sum(gterms)
def rhs(u,t):
    x = u
    a, b = An(), Bn(x) #An() and Bn(x) are functions that return an array of coefficients
    return -f(x, b)/g(x, a)
t = np.linspace(.,.,.)
solution = odeint(rhs, <some initial value>, t)
Note the difference: using the first method, I stuck the array 'a' into the sum in line 3, whereas using the second method, I stuck the values of 'a' into the list 'gterms' in line 2 instead.
The first method gives the expected behavior: solutions blow up for positive x. However, the second method does not do this. The second method gives a bifurcation at some x0 > 0 that acts as a source. For initial conditions greater than x0, solutions blow up as expected, but for initial conditions less than x0 the solutions tend to 0 very slowly.
Something else of note: in the rhs function, if I change it from
def rhs(u,t):
    x = u
    ...
    return .
to
def rhs(u,t):
    x = u[0]
    ...
    return .
the same exact change occurs.
So my question is: what is the difference between the two different methods I used? I can't tell for the life of me what is actually going on here. Sorry for being so verbose.
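One possible explanation, offered as a sketch rather than a confirmed diagnosis: odeint passes the state to rhs as a 1-D array, so with x = u every entry of gterms is a length-1 array. In the first method, a * gterms then multiplies a shape-(n,) array by a shape-(n, 1) array, which NumPy broadcasts to an (n, n) matrix, and np.sum adds up all n*n products instead of the n intended ones. With x = u[0] (a scalar) the product stays elementwise, which would match the observation that this change switches the behavior. The coefficients below are made up for illustration, and `g_first` and `g_second` are hypothetical names for the two versions of g.

```python
import numpy as np

def g_first(x, a):
    gterms = [(0.5*k + 1.) * x**(0.5*k) for k in range(len(a))]
    return 1. + np.sum(a * gterms)    # broadcasts if x is a length-1 array

def g_second(x, a):
    gterms = [(0.5*k + 1.) * a[k] * x**(0.5*k) for k in range(len(a))]
    return 1. + np.sum(gterms)        # always sums exactly len(a) terms

a = np.array([1.0, 2.0, 3.0])         # hypothetical coefficients
x_array = np.array([4.0])             # what rhs receives from odeint
x_scalar = 4.0                        # what x = u[0] gives

print(np.shape(a * [x_array**(0.5*k) for k in range(len(a))]))  # (3, 3)
print(g_first(x_array, a))    # 73.0: sum(a) * sum(gterms) + 1
print(g_second(x_array, a))   # 32.0: the intended series
print(g_first(x_scalar, a))   # 32.0: agrees once x is a scalar
```

If this is what is happening, the second method (and the first method with x = u[0]) computes the series as written, so the unexpected dynamics would be coming from the equations or coefficients rather than from the summation.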
