Distance in arrays

So I have to code a function that calculates the distance between two points, p1 and p2, from an array with several points. Each point represents a square with 20 m on each side.
The distance should be something like:
d = sqrt((w*(r1 - r2))^2 + (w*(c1 - c2))^2 + (a1 - a2)^2)
where w is 20, the side of the square; r1 and r2 are the row indices; c1 and c2 are the column indices; and a1 and a2 are the values of each point.
The array is:
test = [ [206,205,204,190,208], [190,194,206,197,203], [196,196,205,201,193], [194,199,199,206,205], [192,196,195,201,193], [194,199,200,200,205], [196,196,195,200,193] ]
Can someone help on this easy one?

First you need to import sqrt from the math library.
Squares can be calculated either by multiplying the value by itself, (r2-r1)*(r2-r1), or by using pow, also from the math library. (r2-r1)^2 does not work, because ^ is the bitwise XOR operator in Python, not exponentiation.
Strictly speaking, A is not an array but a list (rows) of lists (columns).
But you can think of it as a kind of array anyway. You get values from it by using two indices: A[row_index][column_index]
from math import sqrt, pow
A = [[206,205,204,190,208],
     [190,194,206,197,203],
     [196,196,205,201,193],
     [194,199,199,206,205],
     [192,196,195,201,193],
     [194,199,200,200,205],
     [196,196,195,200,193]]
W = 20
def distance(r1, c1, r2, c2):
    # get a values for point 1 and 2
    a1 = A[r1][c1]
    a2 = A[r2][c2]
    # calculate the distance
    d = sqrt(pow(W*(r2-r1), 2) + pow(W*(c2-c1), 2) + pow(a2-a1, 2))
    return d
print(distance(0,0,4,4))
>>> 113.88
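A variant that takes the grid as a parameter instead of reading a global could look like the sketch below. The name grid_distance, the (row, column) tuple arguments, and the cell_size default are illustrative choices, not part of the original answer.

```python
from math import sqrt

# Grid of values from the question; each cell is a square with 20 m sides.
test = [
    [206, 205, 204, 190, 208],
    [190, 194, 206, 197, 203],
    [196, 196, 205, 201, 193],
    [194, 199, 199, 206, 205],
    [192, 196, 195, 201, 193],
    [194, 199, 200, 200, 205],
    [196, 196, 195, 200, 193],
]

def grid_distance(grid, p1, p2, cell_size=20):
    """Distance between two cells given as (row, column) tuples (hypothetical helper)."""
    (r1, c1), (r2, c2) = p1, p2
    dr = cell_size * (r1 - r2)         # offset along the rows
    dc = cell_size * (c1 - c2)         # offset along the columns
    da = grid[r1][c1] - grid[r2][c2]   # difference of the stored values
    # ** is exponentiation in Python; ^ would be bitwise XOR.
    return sqrt(dr ** 2 + dc ** 2 + da ** 2)

d = grid_distance(test, (0, 0), (4, 4))
print(round(d, 2))  # 113.88
```

Passing the grid explicitly makes the function usable with any list of lists of the same layout.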

Related

Adding 2 vectors and comparing the angle of the result to both vectors does not give same result. Why?

I want to calculate the angle between vectors. I thought the sum of 2 vectors should lie in the middle of the two. But calculating the angle with my method gives different results. I guess it has to do with rounding, but the results differ too much for that. I tried 2 different approaches. Can you explain why? Or is my understanding of the math wrong?
from numpy import (array, dot, arccos, clip, sum)
from numpy.linalg import norm
import spectral
import numpy as np
def calculateAngle(u, v):
    c = dot(u, v) / norm(u) / norm(v) # -> cosine of the angle
    angle = arccos(clip(c, -1, 1)) # if you really want the angle
    return c, angle
def calc_with_numpy():
    print("Method 2:")
    v = (u1_norm + u2_norm)
    c1, angle1 = calculateAngle(u1, v)
    c2, angle2 = calculateAngle(u2, v)
    print("angle1:", angle1)
    print("angle2:", angle2)
def calc_with_spectral():
    print("Method 1:")
    v = (u1_norm + u2_norm)
    img=np.array([v]).reshape((1,1,v.size))
    means = np.array([u1, u2])
    angles = spectral.spectral_angles(img, means)
    print("angle1:", angles[0,0,0])
    print("angle2:", angles[0, 0, 1])
u1 = array([1.0,2.0], dtype="float64")
u1_norm = u1 / sum(u1)
u2 = array([3.0,2.0], dtype="float64")
u2_norm = u2 / sum(u2)
calc_with_spectral()
calc_with_numpy()
My results:
Method 1:
angle1: 0.25518239062081866
angle2: 0.2639637236257044
Method 2:
angle1: 0.2551823906208191
angle2: 0.2639637236257044

You are wrong here:
u1_norm = u1 / sum(u1)
u2_norm = u2 / sum(u2)
To get a normalized (unit-length) vector, you need to divide its components by the vector's length, not by the sum of its components (as you correctly do inside calculateAngle):
u1_norm = u1 / np.linalg.norm(u1)

You've normalised wrong. Instead, do
u1_norm = u1 / np.sqrt(np.sum(u1**2))
u2_norm = u2 / np.sqrt(np.sum(u2**2))
I now get
>>> calc_with_numpy()
angle1: 0.2595730571232615
angle2: 0.2595730571232615
>>> norm(u1) == np.sqrt(np.sum(u1**2))
True
>>> norm(u2) == np.sqrt(np.sum(u2**2))
True
I don't know what spectral is, my python distribution doesn't have it as a module.
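The difference between the two normalisations can be checked side by side. The sketch below reuses the vectors from the question; the helper name angle_between and the variable names bad and good are made up for illustration.

```python
import numpy as np

def angle_between(u, v):
    """Angle in radians between two vectors (illustrative helper)."""
    c = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.arccos(np.clip(c, -1.0, 1.0))

u1 = np.array([1.0, 2.0])
u2 = np.array([3.0, 2.0])

# Dividing by the component sum leaves the two vectors with different lengths,
# so their sum leans towards the longer one.
bad = u1 / u1.sum() + u2 / u2.sum()

# Dividing by the Euclidean norm gives unit vectors, whose sum bisects the angle.
good = u1 / np.linalg.norm(u1) + u2 / np.linalg.norm(u2)

print(angle_between(u1, bad), angle_between(u2, bad))
print(angle_between(u1, good), angle_between(u2, good))
```

Only the second pair of angles comes out equal, which matches the results quoted in the question and in the answer above.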

Just replace the following code
u1_norm = u1 / sum(u1)
u2_norm = u2 / sum(u2)
with
u1_norm = u1 / len(u1)
u2_norm = u2 / len(u2)

This function can be used to compute the angle (in degrees) with the x axis of a 2d vector [x, y]:
from math import atan2, pi
def angle(vec):
    return atan2(*reversed(vec)) * 180 / pi
Now if you want to compute the angle between two vectors, you can use this function:
def angle_between_vecs(vec1, vec2):
    return abs(angle(vec1) - angle(vec2))
You may also want to compute the sum of two vectors:
def sum_vecs(vec1, vec2):
    return [vec1[0] + vec2[0], vec1[1] + vec2[1]]
Now notice that angle(sum_vecs(vec1, vec2)) is not necessarily equal to angle_between_vecs(vec1, vec2). Look at this graphical example that follows (vec3 is the sum of vec1 and vec2). As you can see, the sum is not exactly cutting the angle in two parts.
This code can be clearly optimised by using for example NumPy, but this is just an example to show you that your assumption that the angle of the sum should be between the two vectors is wrong!
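A small numeric case makes the same point. The two vectors below are made up for illustration; they are perpendicular but have different lengths, so their sum leans towards the longer one.

```python
from math import atan2, degrees

def angle(vec):
    """Angle in degrees between a 2D vector [x, y] and the x axis."""
    x, y = vec
    return degrees(atan2(y, x))

vec1 = [1.0, 0.0]   # length 1, along the x axis
vec2 = [0.0, 3.0]   # length 3, along the y axis
vec3 = [vec1[0] + vec2[0], vec1[1] + vec2[1]]   # their sum

to_vec1 = abs(angle(vec3) - angle(vec1))
to_vec2 = abs(angle(vec3) - angle(vec2))
print(to_vec1, to_vec2)   # the two angles add up to 90 degrees but are not equal
```

The sum only bisects the angle when both vectors have the same length.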

How to write Z[i,k] = sqrt(sum_j((X[i,j] - Y[k,j])**2)) in einsum notation? [duplicate]

I have 2 lists of points as numpy.ndarray, each row is the coordinate of a point, like:
a = np.array([[1,0,0],[0,1,0],[0,0,1]])
b = np.array([[1,1,0],[0,1,1],[1,0,1]])
Here I want to calculate the Euclidean distance between all pairs of points in the 2 lists: for each point p_a in a, I want the distance between it and every point p_b in b. So the result is
d = np.array([[1,sqrt(3),1],[1,1,sqrt(3)],[sqrt(3),1,1]])
How can I use matrix multiplication in numpy to compute the distance matrix?

Using direct numpy broadcasting, you can do this:
dist = np.sqrt(((a[:, None, :] - b[None, :, :]) ** 2).sum(-1))
Alternatively, scipy has a routine that will compute this slightly more efficiently (particularly for large matrices):
from scipy.spatial.distance import cdist
dist = cdist(a, b)
I would avoid solutions that depend on factoring-out matrix products (of the form A^2 + B^2 - 2AB), because they can be numerically unstable due to floating point roundoff errors.
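The shapes involved in the broadcasting approach are easier to follow when written out. The sketch below wraps it in a function; the name pairwise_dist is an assumption for illustration, and the inputs are the arrays from the question.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distance between every row of a and every row of b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # a[:, None, :] has shape (n, 1, d); b[None, :, :] has shape (1, m, d).
    # Broadcasting gives diff[i, j] = a[i] - b[j], with shape (n, m, d).
    diff = a[:, None, :] - b[None, :, :]
    # Sum the squared differences over the coordinate axis.
    return np.sqrt((diff ** 2).sum(axis=-1))

a = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
b = np.array([[1, 1, 0], [0, 1, 1], [1, 0, 1]])
d = pairwise_dist(a, b)
print(d.shape)                         # (3, 3)
print(pairwise_dist(a, b[:2]).shape)   # (3, 2)
```

Because the intermediate array has n*m*d elements, this form is best suited to moderately sized inputs.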

To compute the squared Euclidean distance for each pair of rows of X and Y, we need to find:
(Xik-Yjk)**2 = Xik**2 + Yjk**2 - 2*Xik*Yjk
and then sum along k to get the squared distance for the corresponding pair, dist(Xi,Yj).
Distributing the sum over the terms, it reduces to:
dist(Xi,Yj) = sum_k(Xik**2) + sum_k(Yjk**2) - 2*sum_k(Xik*Yjk)
Bringing in matrix multiplication for the last part, we get all the squared distances from three pieces:
dist = sum_rows(X^2), sum_rows(Y^2), -2*matrix_multiplication(X, Y.T)
Hence, in NumPy terms, the Euclidean distances for our case, with a and b as the inputs, are:
np.sqrt((a**2).sum(1)[:,None] + (b**2).sum(1) - 2*a.dot(b.T))
Leveraging np.einsum, we could replace the first two summation-reductions with:
np.einsum('ij,ij->i',a,a)[:,None] + np.einsum('ij,ij->i',b,b)
More info could be found on eucl_dist package's wiki page (disclaimer: I am its author).
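Put together, the expansion above could be sketched as a single function. The name pairwise_dist_expanded is hypothetical, and the clamp to zero is an added safeguard (not part of the answer) against tiny negative values produced by roundoff before the square root.

```python
import numpy as np

def pairwise_dist_expanded(x, y):
    """Distances via |x|^2 + |y|^2 - 2 x.y (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_sq = np.einsum('ij,ij->i', x, x)   # row-wise sum of squares of x
    y_sq = np.einsum('ij,ij->i', y, y)   # row-wise sum of squares of y
    cross = x @ y.T                      # all pairwise dot products
    sq = x_sq[:, None] + y_sq[None, :] - 2.0 * cross
    # Roundoff can leave values slightly below zero; clamp before sqrt.
    np.maximum(sq, 0.0, out=sq)
    return np.sqrt(sq)

a = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
b = np.array([[1, 1, 0], [0, 1, 1], [1, 0, 1]])
print(pairwise_dist_expanded(a, b))
```

The matrix product does most of the work here, which is why this form tends to be fast for large inputs, at the cost of the roundoff sensitivity mentioned in the other answer.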

If you have two 1-dimensional arrays, x and y, you can convert them into matrices with repeating columns, transpose, and apply the distance formula. This assumes that x and y hold the coordinates of the same points, so that (x[i], y[i]) is one point. The result is a symmetric distance matrix.
import numpy as np
x = [1, 2, 3]
y = [4, 5, 6]
xx = np.repeat(x,3,axis = 0).reshape(3,3)
yy = np.repeat(y,3,axis = 0).reshape(3,3)
dist = np.sqrt((xx-xx.T)**2 + (yy-yy.T)**2)
dist
Out[135]:
array([[0.        , 1.41421356, 2.82842712],
       [1.41421356, 0.        , 1.41421356],
       [2.82842712, 1.41421356, 0.        ]])
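The same matrix can be obtained without np.repeat by letting broadcasting build the differences. This is a sketch of an alternative, not the answer's own code; np.hypot is used here to compute sqrt(dx**2 + dy**2) elementwise.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# dx[i, j] = x[i] - x[j] and dy[i, j] = y[i] - y[j], both of shape (3, 3).
dx = x[:, None] - x[None, :]
dy = y[:, None] - y[None, :]

dist = np.hypot(dx, dy)
print(dist)
```

This avoids hard-coding the array length in the repeat and reshape calls.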

L2 distance = (a^2 + b^2 - 2ab)^0.5
a = np.random.randn(5, 3)
b = np.random.randn(2, 3)
a2 = np.sum(np.square(a), axis = 1)[..., None]
b2 = np.sum(np.square(b), axis = 1)[None, ...]
ab = -2*np.dot(a, b.T)
dist = np.sqrt(a2 + b2 + ab)

Python: Speeding up large double sum with elements precalculated

I need to calculate a double sum of the form:
wignersum{ell} = sum_{ell1} sum_{ell2} (2*ell1+1)(2*ell2+1) * W{ell,ell1,ell2}^2 * C1(ell1) * C2(ell2)
where wignersum is an array indexed by ell, and ell, ell1, and ell2 all run from 0 to ellmax. The W{ell,ell1,ell2}^2 are a set of known coefficients that I've already calculated (called w3j), stored in an array of shape (ellmax, ellmax, ellmax) as a global variable to be called by this function. (These coefficients are time intensive to calculate and I've found it faster to load them from a numpy file). The C1 and C2 are arrays of coefficients of shape (ellmax).
I have successfully calculated this sum by making use of a double for loop, grabbing the appropriate elements from each preexisting array and updating the wignersum array in each iteration. I assume there is a better way to vectorize this problem to speed up the calculation. I thought about making the C1 and C2 arrays into arrays of the same shape as the w3j array, then multiplying these arrays elementwise before using np.sum on the ell1 and ell2 axes. I'm unsure whether this is in fact a good method of vectorizing and, if it is, how to actually do this.
The code as it stands is something like
import numpy as np
ell_max = 400
w3j = np.ones((ell_max, ell_max, ell_max))
C1 = np.arange(ell_max)
C2 = np.arange(ell_max)
def function(ell_max):
    ells = np.arange(ell_max)
    wignersum = np.zeros(ell_max)
    factor = np.array([2*i+1 for i in range(ell_max)])
    for ell1 in ells:
        A = factor[ell1]
        B = C1[ell1]
        for ell2 in ells:
            D = factor[ell2] * C2[ell2] * w3j[:,ell1,ell2]
            wignersum += A * B * D
    return wignersum
(Note that in actuality C1 and C2 are not global variables but local variables that must be calculated from a set of parameters fed to function. This is not the limiting factor in the code speed, however.)
With the double for loop this takes ~1.5 seconds to run for ell_max~400 which is too long for the purposes I'm using it for. I'd like to vectorize this as much as possible to improve speed.

You can use either einsum or matrix multiplication for a ~20x speedup:
import numpy as np
ell_max = 400
w3j = np.random.randint(1,10,(ell_max, ell_max, ell_max))
C1 = np.random.randint(1,10,ell_max)
C2 = np.random.randint(1,10,ell_max)
def function(ell_max):
    ells = np.arange(ell_max)
    wignersum = np.zeros(ell_max)
    factor = np.array([2*i+1 for i in range(ell_max)])
    for ell1 in ells:
        A = factor[ell1]
        B = C1[ell1]
        for ell2 in ells:
            D = factor[ell2] * C2[ell2] * w3j[:,ell1,ell2]
            wignersum += A * B * D
    return wignersum
def pp_es(l_mx):
    l = np.arange(l_mx)
    f = 2*l+1
    return np.einsum("i,i,j,j,kij",f,C1,f,C2,w3j,optimize=True)
def pp_mm(l_mx):
    l = np.arange(l_mx)
    f = 2*l+1
    return w3j.reshape(l_mx,-1)@np.outer(f*C1,f*C2).ravel()
from timeit import timeit
print(timeit(lambda:pp_es(400),number=10))
print(timeit(lambda:pp_mm(400),number=10))
print(timeit(lambda:function(400),number=10))
print((pp_mm(400)==pp_es(400)).all())
print((function(400)==pp_mm(400)).all())
Sample run:
0.6061844169162214 # einsum
0.6111843499820679 # matrix x vector
12.233918005018495 # OP
True # einsum == matrix x vector
True # OP == matrix x vector
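With floating-point coefficients, exact equality may no longer hold between the variants, so a tolerance-based comparison is safer. The sketch below checks the vectorised forms against the double loop on a deliberately small problem; the size n = 6, the random test data, and the third variant (two successive matrix-vector products) are assumptions added for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6                               # small stand-in for ell_max
w3j = rng.random((n, n, n))         # stand-in for the precomputed coefficients
C1 = rng.random(n)
C2 = rng.random(n)
f = 2 * np.arange(n) + 1            # the (2*ell + 1) factors

# Reference: the explicit double loop over ell1 and ell2.
ref = np.zeros(n)
for l1 in range(n):
    for l2 in range(n):
        ref += f[l1] * C1[l1] * f[l2] * C2[l2] * w3j[:, l1, l2]

# Fold the factors into the coefficient vectors, then contract.
u = f * C1                          # indexed by ell1
v = f * C2                          # indexed by ell2
via_einsum = np.einsum('i,j,kij->k', u, v, w3j)
via_matmul = w3j.reshape(n, -1) @ np.outer(u, v).ravel()
via_two_matvecs = (w3j @ v) @ u     # contract ell2 first, then ell1

print(np.allclose(ref, via_einsum),
      np.allclose(ref, via_matmul),
      np.allclose(ref, via_two_matvecs))
```

None of these variants needs C1 and C2 expanded to the full shape of w3j; the contraction handles the broadcasting implicitly.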

Recover angles after transformation

This seems like an easy enough task but I've failed to find a solution and I've run out of ideas.
I have two angles which I employ to define some transformation coefficients. Now, I don't actually have the values for those angles in my real data, I have the coefficients and I need to recover the angles.
I thought the arctan2 function would take care of this, but there are cases where it fails to recover the proper a1 angle and instead returns its 180 complement, which later affects the recovery of the a2 angle.
What am I doing wrong and how can I recover the a1, a2 angles properly?
import numpy as np
# Repeat 100 times
for _ in range(100):
    # Define two random angles in the range [-180, 180] degrees. I do not have
    # these angles in my actual data, I have the A,B,C coefficients shown below.
    a1, a2 = np.random.uniform(-180., 180., (2,))
    # Transformation coefficients using the above angles.
    # This is the data I actually have.
    a1_rad, a2_rad = np.deg2rad(a1), np.deg2rad(a2) # to radians
    A = - np.sin(a1_rad) * np.sin(a2_rad)
    B = np.cos(a1_rad) * np.sin(a2_rad)
    C = np.cos(a2_rad)
    # Recover a1 using 'arctan2' (returns angle in the range [-pi, pi])
    a1_recover = np.arctan2(-A / B, 1.)
    # Now obtain sin(a2), used below to obtain 'a2'
    sin_a2 = -A / np.sin(a1_recover)
    # Recover a2 using 'arctan2', where: C = cos(a2)
    a2_recover = np.arctan2(sin_a2, C)
    # Print differences.
    a1_recover = np.rad2deg(a1_recover)
    print("a1: {:.2f} = {} - {}".format(a1 - a1_recover, a1, a1_recover))
    a2_recover = np.rad2deg(a2_recover)
    print("a2: {:.2f} = {} - {}\n".format(a2 - a2_recover, a2, a2_recover))

When a2_rad equals 0, (A, B, C) equals (0, 0, 1) no matter what a1_rad equals. So the transformation is not 1-to-1. Therefore there is no well-defined inverse.
def ABC(a1, a2):
    a1_rad, a2_rad = np.deg2rad(a1), np.deg2rad(a2) # to radians
    A = - np.sin(a1_rad) * np.sin(a2_rad)
    B = np.cos(a1_rad) * np.sin(a2_rad)
    C = np.cos(a2_rad)
    return A, B, C
print(ABC(0, 0))
# (-0.0, 0.0, 1.0)
print(ABC(90, 0))
# (-0.0, 0.0, 1.0)
print(ABC(-90, 0))
# (-0.0, 0.0, 1.0)
A similar problem happens at the opposite (South) pole. Within the limits of floating point accuracy, all these values (of the form ABC(a1, 180)) are essentially equal too:
ABC(1, 180)
# (-2.1373033680837913e-18, 1.2244602795081332e-16, -1.0)
ABC(0, 180)
# (-0.0, 1.2246467991473532e-16, -1.0)
ABC(90, 180)
# (-1.2246467991473532e-16, 7.498798913309288e-33, -1.0)
You can think of a1, a2 as coordinates on a unit sphere, where a1 represents the angle away from the x-axis (more often called theta) and a2 represents the angle away from the z-axis (often called phi).
A, B, C represent the same point on the unit sphere in Cartesian coordinates.
Usually spherical coordinates restrict a1 to the range [0, 2*pi) and a2 to the range [0, pi].
Even with this restriction, the North and South poles have more than one (in fact infinitely many) valid representation.

You cannot restore the angle sign information because it was lost when A and B were calculated.
The 8 possible combinations of sin/cos signs give only 4 combinations of A/B signs (and the sign of cos(a2) cannot help here).
Note that for spherical coordinates the inclination range is only 0..Pi.

You should use np.arctan2(-A, B) instead of np.arctan2(-A / B, 1.). With the latter you are losing information: A = -1 and B = 1 will give the same result as A = 1 and B = -1, hence the occasional 180 degree mismatch.
If you restrict a2 to be in (0,180) then you can recover the angles. Note that with this restriction a2 can be recovered as acos(C). (I've tried this but since my program is in C it might not be helpful)
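Under that restriction the recovery could be sketched as follows. The function names to_coeffs and recover_angles are hypothetical, and the code assumes 0 < a2 < 180 degrees, so that sin(a2) is positive and the signs of -A and B are those of sin(a1) and cos(a1).

```python
from math import sin, cos, atan2, acos, radians, degrees

def to_coeffs(a1_deg, a2_deg):
    """Forward transformation from the question: angles in degrees to (A, B, C)."""
    a1, a2 = radians(a1_deg), radians(a2_deg)
    return -sin(a1) * sin(a2), cos(a1) * sin(a2), cos(a2)

def recover_angles(A, B, C):
    """Recover (a1, a2) in degrees, assuming 0 < a2 < 180 (illustrative sketch)."""
    # Passing -A and B separately keeps both signs, so atan2 picks the right quadrant.
    a1 = atan2(-A, B)
    # Clamp C to [-1, 1] to guard against roundoff before taking acos.
    a2 = acos(max(-1.0, min(1.0, C)))
    return degrees(a1), degrees(a2)

a1_rec, a2_rec = recover_angles(*to_coeffs(-135.0, 40.0))
print(a1_rec, a2_rec)
```

At a2 = 0 or a2 = 180 both A and B vanish, so a1 cannot be recovered there, which matches the pole degeneracy described in the first answer.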
