Choosing correct distance from a list - python

This question is somewhat similar to this. I've gone a bit farther than the OP, though, and I'm in Python 2 (not sure what he was using).
I have a Python function that can determine the distance from a point inside a convex polygon to regularly-defined intervals along the polygon's perimeter. The problem is that it returns "extra" distances that I need to eliminate. (Please note--I suspect this will not work for rectangles yet. I'm not finished with it.)
First, the code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
#
# t1.py
#
# Copyright 2015 FRED <fred#matthew24-25>
#
# THIS IS TESTING CODE ONLY. IT WILL BE MOVED INTO THE CORRECT MODULE
# UPON COMPLETION.
#
from __future__ import division
import math
import matplotlib.pyplot as plt

def Dist(center_point, Pairs, deg_Increment):
    # I want to set empty lists to store the values of m_lnsgmnt and b_lnsgmnts
    # for every iteration of the for loop.
    m_linesegments = []
    b_linesegments = []
    # Scream and die if Pairs[0] is the same as the last element of Pairs--i.e.
    # it has already been run once.
    #if Pairs[0] == Pairs[len(Pairs)-1]:
        ##print "The vertices contain duplicate points!"
    ## Creates a new list containing the original list plus the first element. I did this because, due
    ## to the way the for loop is set up, the last iteration of the loop subtracts the value of the
    ## last value of Pairs from the first value. I therefore duplicated the first value.
    #elif:
    new_Pairs = Pairs + [Pairs[0]]
    # This will calculate the slopes and y-intercepts of the line segments of the polygon.
    for a in range(len(Pairs)):
        # This calculates the slope of each line segment and appends it to m_linesegments.
        m_lnsgmnt = (new_Pairs[a+1][1] - new_Pairs[a][1]) / (new_Pairs[a+1][0] - new_Pairs[a][0])
        m_linesegments.append(m_lnsgmnt)
        # This calculates the y-intercept of each line segment and appends it to b_linesegments.
        b_lnsgmnt = (Pairs[a][1]) - (m_lnsgmnt * Pairs[a][0])
        b_linesegments.append(b_lnsgmnt)
    # These are temporary testing codes.
    print "m_linesegments =", m_linesegments
    print "b_linesegments =", b_linesegments
    # I want to set empty lists to store the value of m_rys and b_rys for every
    # iteration of the for loop.
    m_rays = []
    b_rays = []
    # I need to set a range of degrees the intercepts will be calculated for.
    theta = range(0, 360, deg_Increment)
    # Temporary testing line.
    print "theta =", theta
    # Calculate the slope and y-intercepts of the rays radiating from the center_point.
    for b in range(len(theta)):
        m_rys = math.tan(math.radians(theta[b]))
        m_rays.append(m_rys)
        b_rys = center_point[1] - (m_rys * center_point[0])
        b_rays.append(b_rys)
    # Temporary testing lines.
    print "m_rays =", m_rays
    print "b_rays =", b_rays
    # Set empty matrix for Intercepts.
    Intercepts = []
    angle = []
    # Calculate the intersections of the rays with the line segments.
    for c in range((360//deg_Increment)):
        for d in range(len(Pairs)):
            # Calculate the x-coordinates and the y-coordinates of each
            # intersection
            x_Int = (b_rays[c] - b_linesegments[d]) / (m_linesegments[d] - m_rays[c])
            y_Int = ((m_linesegments[d] * x_Int) + b_linesegments[d])
            Intercepts.append((x_Int, y_Int))
            # Calculates the angle of the ray. Rounding is necessary to
            # compensate for binary-decimal errors.
            a_ngle = round(math.degrees(math.atan2((y_Int - center_point[1]), (x_Int - center_point[0]))))
            # Substitutes positive equivalent for every negative angle,
            # i.e. -270 degrees equals 90 degrees.
            if a_ngle < 0:
                a_ngle = a_ngle + 360
            # Selects the angles that correspond to theta
            if a_ngle == theta[c]:
                angle.append(a_ngle)
    print "INT1=", Intercepts
    print "angle=", angle
    dist = []
    # Calculates distance.
    for e in range(len(Intercepts) - 1):
        distA = math.sqrt(((Intercepts[e][0] - center_point[0])**2) + ((Intercepts[e][1] - center_point[1])**2))
        dist.append(distA)
    print "dist=", dist

if __name__ == "__main__":
    main()
Now, as to how it works:
The code takes 3 inputs: center_point (a point contained in the polygon, given in (x,y) coordinates), Pairs (the vertices of the polygon, also given in (x,y) coordinates), and deg_Increment (which defines how often to calculate distance).
Let's assume that center_point = (4,5), Pairs = [(1, 4), (3, 8), (7, 2)], and deg_Increment = 20. This means that a polygon is created (sort of) whose vertices are Pairs, and center_point is a point contained inside the polygon.
Now rays are set to radiate from center_point every 20 degrees (which is deg_Increment). The intersection points of the rays with the perimeter of the polygon are determined, and the distance is calculated using the distance formula.
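For reference, with the example values above the function would be called like this:
Dist((4, 5), [(1, 4), (3, 8), (7, 2)], 20)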
The only problem is that I'm getting too many distances. :( In my example above, the correct distances are
1.00000 0.85638 0.83712 0.92820 1.20455 2.07086 2.67949 2.29898 2.25083 2.50000 3.05227 2.22683 1.93669 1.91811 2.15767 2.85976 2.96279 1.40513
But my code is returning
dist= [2.5, 1.0, 6.000000000000001, 3.2523178818773006, 0.8563799085248148, 3.0522653889161626, 5.622391569468206, 0.8371216462519347, 2.226834844885431, 37.320508075688686, 0.9282032302755089, 1.9366857335569072, 7.8429970322236064, 1.2045483557883576, 1.9181147622136665, 3.753460385470896, 2.070863609380179, 2.157671808913309, 2.6794919243112276, 12.92820323027545, 2.85976265663383, 2.298981118867903, 2.962792920643178, 5.162096782237789, 2.250827351906659, 1.4051274947736863, 69.47032761621092, 2.4999999999999996, 1.0, 6.000000000000004, 3.2523178818773006, 0.8563799085248148, 3.0522653889161626, 5.622391569468206, 0.8371216462519347, 2.226834844885431, 37.32050807568848, 0.9282032302755087, 1.9366857335569074, 7.842997032223602, 1.2045483557883576, 1.9181147622136665, 3.7534603854708997, 2.0708636093801767, 2.1576718089133085, 2.679491924311227, 12.928203230275532, 2.85976265663383, 2.298981118867903, 2.9627929206431776, 5.162096782237789, 2.250827351906659, 1.4051274947736847]
If anyone can help me get only the correct distances, I'd greatly appreciate it.
Thanks!
And just for reference, here's what my example looks like with the correct distances only:

You're getting too many values in Intercepts because it's being appended to inside the second for-loop [for d in range(len(Pairs))].
You only want one value in Intercepts per step through the outer for-loop [for c in range((360//deg_Increment))], so the append to Intercepts needs to happen in this loop.
I'm not sure what you're doing with the inner loop, but you seem to be calculating a separate intercept for each of the lines that make up the polygon sides. But you only want the one that you're going to hit "first" when going in that direction.
You'll have to add some code to figure out which of the 3 (in this case) sides of the polygon you're actually going to encounter first.
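A rough sketch of that restructuring, reusing the names from the question (it keeps the OP's angle test to reject intersections lying behind the ray, and it still does not check that the hit lies between the segment's endpoints, so treat it as a starting point only):
Intercepts = []
for c in range(360 // deg_Increment):
    candidates = []
    for d in range(len(Pairs)):
        x_Int = (b_rays[c] - b_linesegments[d]) / (m_linesegments[d] - m_rays[c])
        y_Int = m_linesegments[d] * x_Int + b_linesegments[d]
        a_ngle = round(math.degrees(math.atan2(y_Int - center_point[1], x_Int - center_point[0])))
        if a_ngle < 0:
            a_ngle += 360
        # Keep only intersections lying in the ray's own direction
        if a_ngle == theta[c]:
            candidates.append((x_Int, y_Int))
    # The side hit "first" is the candidate closest to center_point
    if candidates:
        Intercepts.append(min(candidates, key=lambda p: math.hypot(p[0] - center_point[0], p[1] - center_point[1])))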

Related

How to select numeric samples based on their distance relative to samples already selected (Python)

I have some random test data in a 2D array of shape (500,2) as such:
xy = np.random.randint(low=0.1, high=1000, size=[500, 2])
From this array, I first select 10 random samples. To select the 11th sample, I would like to pick the sample that is furthest away from the original 10 selected samples collectively; I am using the Euclidean distance to do this. I need to keep doing this until a certain number have been picked. Here is my attempt at doing this.
# Function to get the distance between samples
def get_dist(a, b):
    return np.sqrt(np.sum(np.square(a - b)))

# Set up variables and empty lists for the selected sample and starting samples
n_xy_to_select = 120
selected_xy = []
starting = []

# This selects 10 random samples and appends them to selected_xy
for i in range(10):
    idx = np.random.randint(len(xy))
    starting_10 = xy[idx, :]
    selected_xy.append(starting_10)
    starting.append(starting_10)
    xy = np.delete(xy, idx, axis = 0)
starting = np.asarray(starting)
# This performs the selection based on the distances
for i in range(n_xy_to_select - 1):
    # Set up an empty array dists
    dists = np.zeros(len(xy))
    for selected_xy_ in selected_xy:
        # Get the distance between each already selected sample, and every other unselected sample
        dists_ = np.array([get_dist(selected_xy_, xy_) for xy_ in xy])
        # Apply some kind of penalty function - this is the key
        dists_[dists_ < 90] -= 25000
        # Sum dists_ onto dists
        dists += dists_
    # Select the largest one
    dist_max_idx = np.argmax(dists)
    selected_xy.append(xy[dist_max_idx])
    xy = np.delete(xy, dist_max_idx, axis = 0)
The key to this is this line - the penalty function
dists_[dists_ < 90] -= 25000
This penalty function exists to prevent the code from just picking a ring of samples at the edge of the space, by artificially shortening values that are close together.
However, this eventually breaks down, and the selection starts clustering, as shown in the image. You can clearly see that there are much better selections that the code can make before any kind of clustering is necessary. I feel that a kind of decaying exponential function would be best for this, but I do not know how to implement it.
So my question is; how would I change the current penalty function to get what I'm looking for?
From your question, I understand that what you are looking for are Periodic Boundary Conditions (PBC): a point at the left edge of your space is right next to a point at the right edge. Thus, the maximal distance you can get along one axis is half of the box (i.e. the distance between the edge and the center).
To take the PBC into account, you compute the difference along each axis and correct it whenever it exceeds half of the box.
For example, if you have a point with x1 = 100 and a second one with x2 = 900, using the PBC they are 200 units apart, since you can wrap around the boundary. In your case (box size 1000, half box 500) this simplifies to:
delta_x[delta_x > 500] = delta_x[delta_x > 500] - 500
To wrap it up, I rewrote your code using a new distance function (note that I removed some unnecessary for loops):
import numpy as np

def distance(p, arr, half_box=500):
    delta_x = np.abs(p[0] - arr[:, 0])
    delta_y = np.abs(p[1] - arr[:, 1])
    delta_x[delta_x > half_box] = delta_x[delta_x > half_box] - half_box
    delta_y[delta_y > half_box] = delta_y[delta_y > half_box] - half_box
    return np.sqrt(delta_x**2 + delta_y**2)
xy = np.random.randint(low=0.1, high=1000, size=[500, 2])
idx = np.random.randint(500, size=10)
selected_xy = list(xy[idx])
_initial_selected = xy[idx]
xy = np.delete(xy, idx, axis = 0)
n_xy_to_select = 120

for i in range(n_xy_to_select - 1):
    # Set up an empty array dists
    dists = np.zeros(len(xy))
    for selected_xy_ in selected_xy:
        # Compute the distance taking into account the PBC
        dists_ = distance(selected_xy_, xy)
        dists += dists_
    # Select the largest one
    dist_max_idx = np.argmax(dists)
    selected_xy.append(xy[dist_max_idx])
    xy = np.delete(xy, dist_max_idx, axis = 0)
And indeed it creates clusters; this is normal, as you will tend to create clusters of points that are at the maximal distance from each other. More than that, due to the boundary conditions, the maximal distance between two points along one axis is 500. The maximal distance between two clusters is thus also 500, and as you can see in the image, that is the case.
Moreover, picking more points will start to draw lines connecting the different clusters, starting from the central one, as you can see here:
What I was looking for is called 'Furthest Point Sampling'. I have done some more research into the solution, and the Python code used to perform this can be found here: https://minibatchai.com/ai/2021/08/07/FPS.html
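For readers who do not want to follow the link, here is a minimal greedy farthest-point-sampling sketch (an illustration of the idea only, not the code from the linked post; plain Euclidean distances, no periodic wrap):
import numpy as np

def farthest_point_sampling(points, n_samples, seed=0):
    rng = np.random.default_rng(seed)
    points = np.asarray(points, dtype=float)
    # Start from one random point
    selected = [rng.integers(len(points))]
    # Track, for every point, its distance to the nearest selected point
    min_dists = np.linalg.norm(points - points[selected[0]], axis=1)
    for _ in range(n_samples - 1):
        # Pick the point farthest from everything selected so far
        next_idx = int(np.argmax(min_dists))
        selected.append(next_idx)
        new_dists = np.linalg.norm(points - points[next_idx], axis=1)
        min_dists = np.minimum(min_dists, new_dists)
    return points[selected]

xy = np.random.randint(low=0, high=1000, size=[500, 2])
samples = farthest_point_sampling(xy, 120)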

How to implement in Python a function to compute the Euclidean distance between two arbitrary points on a torus

Given a 10x10 grid (2D array) filled randomly with the numbers 0, 1, or 2, how can I find the Euclidean distance (the l2-norm of the distance vector) between two given points, considering periodic boundaries?
Let us consider an arbitrary grid point called centre. Now, I want to find the nearest grid point containing the same value as centre. I need to take periodic boundaries into account, such that the matrix/grid can be seen rather as a torus instead of a flat plane. In that case, say the centre = matrix[0,2], and we find that there is the same number in matrix[9,2], which would be at the southern boundary of the matrix. The Euclidean distance computed with my code would be for this example np.sqrt(0**2 + 9**2) = 9.0. However, because of periodic boundaries, the distance should actually be 1, because matrix[9,2] is the northern neighbour of matrix[0,2]. Hence, if periodic boundary values are implemented correctly, distances of magnitude above 8 should not exist.
So, I would be interested on how to implement in Python a function to compute the Euclidean distance between two arbitrary points on a torus by applying a wrap-around for the boundaries.
import numpy as np

matrix = np.random.randint(0, 3, (10, 10))
centre = matrix[0, 2]
#rewrite the centre to be the number 5 (to exclude itself as shortest distance)
matrix[0, 2] = 5
#find the points where entries are same as centre
same = np.where((matrix == centre) == True)
idx_row, idx_col = same
#find distances from centre to all values which are of same value
dist = np.zeros(len(same[0]))
for i in range(0, len(same[0])):
    delta_row = same[0][i] - 0 #row coord of centre
    delta_col = same[1][i] - 2 #col coord of centre
    dist[i] = np.sqrt(delta_row**2 + delta_col**2)
#retrieve the index of the smallest distance
idx = dist.argmin()
print('Centre value: %i. The nearest cell with same value is at (%i,%i)'
      % (centre, same[0][idx], same[1][idx]))
For each axis, you can check whether the distance is shorter when you wrap around or when you don't. Consider the row axis, with rows i and j.
When not wrapping around, the difference is abs(i - j).
When wrapping around, the difference is "flipped", as in 10 - abs(i - j). In your example with i == 0 and j == 9 you can check that this correctly produces a distance of 1.
Then simply take whichever is smaller:
delta_row = abs(same[0][i] - 0)  #row coord of centre
delta_row = min(delta_row, 10 - delta_row)
And similarly for delta_col (note the abs(), so the comparison also works when the raw difference is negative).
The final dist[i] calculation needs no changes.
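Putting the two axes together, a small helper could look like this (a sketch; 10 is the grid size from the question and np is the numpy import already used there):
def torus_distance(row1, col1, row2, col2, size=10):
    # Along each axis, take the shorter of the direct and the wrapped difference
    delta_row = abs(row1 - row2)
    delta_row = min(delta_row, size - delta_row)
    delta_col = abs(col1 - col2)
    delta_col = min(delta_col, size - delta_col)
    return np.sqrt(delta_row**2 + delta_col**2)

# e.g. inside the question's loop, with the centre at row 0, column 2:
# dist[i] = torus_distance(same[0][i], same[1][i], 0, 2)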
I have a working 'sketch' of how this could work. In short, I calculate the distance 9 times: once for the normal distance, and 8 more times with shifted copies, to possibly find a closer 'torus' distance.
As n gets larger, the calculation cost can go sky high. But the torus effect is probably not needed, as there is usually a point nearby without any wrap-around.
You can easily test this, because for a grid of size 1, if a point is found at distance 1/2 or closer, you know there is no closer torus point (right?)
import numpy as np
# BallTree comes from scikit-learn (assumed import; the original post does not show it)
from sklearn.neighbors import BallTree

n = 10000
np.random.seed(1)
A = np.random.randint(low=0, high=10, size=(n, n))
I create a 10000x10000 grid of points and store the matching locations in ONES (note that, despite the name, the code below selects the cells equal to 0).
ONES = np.argwhere(A == 0)
Now I define my torus distance, which checks which of the 9 mirrors gives the closest point.
def distance_on_torus( point=[500,500] ):
    index_diff = [[1],[1],[0],[0],[0,1],[0,1],[0,1],[0,1]]
    coord_diff = [[-1],[1],[-1],[1],[-1,-1],[-1,1],[1,-1],[1,1]]
    tree = BallTree( ONES, leaf_size=5*n, metric='euclidean')
    dist, indi = tree.query([point], k=1, return_distance=True )
    distances = [dist[0]]
    for indici_to_shift, coord_direction in zip(index_diff, coord_diff):
        MIRROR = ONES.copy()
        for i, shift in zip(indici_to_shift, coord_direction):
            MIRROR[:,i] = MIRROR[:,i] + (shift * n)
        tree = BallTree( MIRROR, leaf_size=5*n, metric='euclidean')
        dist, indi = tree.query([point], k=1, return_distance=True )
        distances.append(dist[0])
    return np.min(distances)
%%time
distance_on_torus([2,3])
It is slow; the above takes 15 minutes. For n = 1000 it takes less than a second.
An optimisation would be to first consider the non-torus distance and, only if that minimum might not be the smallest, calculate the minimum set of extra 'mirror' blocks around it. This would greatly increase the speed.
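A sketch of that optimisation, assuming the same scikit-learn BallTree as above: it builds the tree once, shifts the query point instead of the data for the mirrors, and skips the mirrors entirely whenever the plain nearest neighbour is already closer than the nearest boundary:
def torus_nearest(point, tree, n):
    # Plain (non-torus) nearest-neighbour distance first
    dist, _ = tree.query([point], k=1)
    best = dist[0][0]
    x, y = point
    # A wrapped image can only be closer if the point lies within `best`
    # of one of the four boundaries
    if min(x, n - x, y, n - y) >= best:
        return best
    # Otherwise also query the eight shifted copies of the query point
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx, dy) == (0, 0):
                continue
            d, _ = tree.query([(x + dx * n, y + dy * n)], k=1)
            best = min(best, d[0][0])
    return best

tree = BallTree(ONES, metric='euclidean')
print(torus_nearest([2, 3], tree, n))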

Mean square displacement of a 1d random walk in python

I'd like to calculate the mean square displacement at all times of a 1d trajectory. I'm a novice programmer, so I've attempted it from a simulated random walk.
import numpy as np
import matplotlib.pyplot as plt

# generate random walk
path = np.zeros(60)
position = 0
for i in range(1, path.size):
    move = np.random.randint(0, 2)
    if move == 0: position += -1
    else: position += 1
    path[i] = position

# returns a vector of MSDs from a given path
def calcMSD(data):
    msd = np.zeros(data.size - 1)
    for i in range(1, data.size):
        squareddisplacements = (data[i:] - data[:data.size - i])**2
        msd[i - 1] = squareddisplacements.mean()
    return msd

msd = calcMSD(path)
plt.plot(msd)
plt.show()
Have I implemented this correctly? Any and all advice is appreciated, thanks.
Mean Squared Displacement
Definition of mean squared displacement
To make sure we are agreeing on the definition, the mean squared displacement (MSD) is the mean of the squared distance from origin of a collection of particles (in our case, walkers) at a specific step.
Since we are in 1D, the distance is simply the absolute position : distance at time t = | path[t] |
To get the MSD at step t, we must take all the squared positions at step t, then take the mean of the resulting array.
That means: MSD at step t == (path[:, t]**2).mean() (the element-wise squaring makes taking the absolute value unnecessary)
path[:, t] means here : the positions of each walker at step t.
Generating paths
We must edit your code a bit to have several independent paths.
walkers = 50 # Example for 50 walkers
path = []
steps = 60
for walker in range(walkers):
    walker_path = [0]
    position = 0
    for i in range(1, steps):
        move = np.random.randint(0, 2)
        if move == 0: position += -1
        else: position += 1
        walker_path.append(position)
    path.append(walker_path)
path = np.array(path)
As a sidenote, you could also use random.choice to randomly pick -1 or 1 directly, like so:
for i in range(1, steps):
    position += np.random.choice([-1, 1])
    walker_path.append(position)
From now, path is a 2D array : a row per walker path.
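As a further aside, the whole 2D path array can also be generated without explicit loops, using a cumulative sum over random steps (a sketch):
steps_per_walker = np.random.choice([-1, 1], size=(walkers, steps - 1))
# First column of zeros = starting positions, then the cumulative sum of the steps
path = np.concatenate([np.zeros((walkers, 1), dtype=int), steps_per_walker.cumsum(axis=1)], axis=1)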
Getting MSD at each step
To get the MSD at every step, we vary t from 1 to the number of steps.
You can do it with a list comprehension in Python (and get rid of your calcMSD function):
msd = [(path[:, i]**2).mean() for i in range(1, steps)]
# you could also put range(1, len(path[0]))
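Plotting works just as in the question (reusing its matplotlib import):
plt.plot(msd)
plt.xlabel("step")
plt.ylabel("MSD")
plt.show()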
Result
Your function was not adapted to plot several walkers.
The proposed solution will give you something like this :
What you want
I hope I understood what you truly wanted. I don't know what this quantity is called, but it can also be calculated with a list comprehension. Considering the path of a single walker:
not_msd = [(np.array([path[start:start+sep+1] for start in range(steps-sep)])**2).mean() for sep in range(1, steps)]
We go from start to start + sep to get the difference between two positions that are sep steps apart. We repeat this operation from 0 up to the end of the list, then square the obtained list and take the mean.
We do that for every separation size, from 1 to the whole length.
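For comparison, a lag-based version computed from position differences, in the spirit of the question's calcMSD (a sketch for a single walker):
walker_path = np.asarray(path[0])  # first walker's trajectory
lag_msd = [((walker_path[sep:] - walker_path[:-sep])**2).mean() for sep in range(1, steps)]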
Result :

Creating a spatial index for QGIS 2 spatial join (PyQGIS)

I've written a bit of code to do a simple spatial join in QGIS 2 and 2.2 (points that lie within a buffer to take attribute of the buffer). However, I'd like to employ a QgsSpatialIndex in order to speed things up a bit. Where can I go from here:
pointProvider = self.pointLayer.dataProvider()
rotateProvider = self.rotateBUFF.dataProvider()

all_point = pointProvider.getFeatures()
point_spIndex = QgsSpatialIndex()
for feat in all_point:
    point_spIndex.insertFeature(feat)

all_line = rotateProvider.getFeatures()
line_spIndex = QgsSpatialIndex()
for feat in all_line:
    line_spIndex.insertFeature(feat)

rotate_IDX = self.rotateBUFF.fieldNameIndex('bearing')
point_IDX = self.pointLayer.fieldNameIndex('bearing')

self.pointLayer.startEditing()
for rotatefeat in self.rotateBUFF.getFeatures():
    for pointfeat in self.pointLayer.getFeatures():
        if pointfeat.geometry().intersects(rotatefeat.geometry()) == True:
            pointID = pointfeat.id()
            bearing = rotatefeat.attributes()[rotate_IDX]
            self.pointLayer.changeAttributeValue(pointID, point_IDX, bearing)
self.pointLayer.commitChanges()
To do this kind of spatial join, you can use the QgsSpatialIndex (http://www.qgis.org/api/classQgsSpatialIndex.html) intersects(QgsRectangle) function to get a list of candidate feature IDs, or the nearestNeighbor(QgsPoint, n) function to get the n nearest neighbours as feature IDs.
Since you only want the points that lie within the buffer, the intersects function seems most suitable. I have not tested if a degenerate bbox (point) can be used. If not, just make a very small bounding box around your point.
The intersects function returns all features that have a bounding box that intersects the given rectangle, so you will have to test these candidate features for a true intersection.
Your outer loop should be on the points (you want to add attribute values to each point from its containing buffer).
# If degenerate rectangles are allowed, delta could be 0,
# if not, choose a suitable, small value
delta = 0.1

# Loop through the points
for point in all_point:
    # Create a search rectangle
    # Assuming that all_point consist of QgsPoint
    searchRectangle = QgsRectangle(point.x() - delta, point.y() - delta, point.x() + delta, point.y() + delta)
    # Use the search rectangle to get candidate buffers from the buffer index
    candidateIDs = line_spIndex.intersects(searchRectangle)
    # Loop through the candidate buffers to find the first one that contains the point
    for candidateID in candidateIDs:
        candFeature = rotateProvider.getFeatures(QgsFeatureRequest(candidateID)).next()
        if candFeature.geometry().contains(point):
            # Do something useful with the point - buffer pair
            # No need to look further, so break
            break
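For the "do something useful" part, one option is a rough, untested sketch that merges this pattern with the attribute indices and edit session from the question; it iterates features rather than bare points so the feature id is available for the write-back:
self.pointLayer.startEditing()
for pointfeat in self.pointLayer.getFeatures():
    geom = pointfeat.geometry()
    p = geom.asPoint()
    searchRectangle = QgsRectangle(p.x() - delta, p.y() - delta, p.x() + delta, p.y() + delta)
    for candidateID in line_spIndex.intersects(searchRectangle):
        candFeature = rotateProvider.getFeatures(QgsFeatureRequest(candidateID)).next()
        if candFeature.geometry().contains(p):
            # Copy the bearing from the containing buffer onto the point
            bearing = candFeature.attributes()[rotate_IDX]
            self.pointLayer.changeAttributeValue(pointfeat.id(), point_IDX, bearing)
            break
self.pointLayer.commitChanges()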

Find the largest angle made by different points at the center

Below is an example image where the 'center point' is (x0,y0) (the center of the wheel). The other points are the outer ends of the spokes. The distance between the center point and the other end of a spoke may differ (spokes of different lengths). All these points are in the Cartesian coordinate system.
I need to find the largest angle made by any two consecutive spokes. In this figure all the angles are the same, but assume that one of the spokes is missing; then that gap would be the largest angle at the origin.
My take:
I am calculating the angle created by each edge with respect to the x axis, one at a time, and subtracting the previous one (which gives the angle between two spokes). I keep track of the largest angle, updating it every time I encounter an angle larger than the previous one. My method works, but I am wondering whether a more efficient method is available.
Assuming you want the angle between two spokes, I suggest you convert the data points to polar/complex co-ordinates, this is made easy in the cmath module, and allows you to do something like this (phase takes out just the angle about centre):
import cmath

def largest_spoke_angle(centre, peripheral):
    per_from_centre = [complex(z[0] - centre[0], z[1] - centre[1]) for z in peripheral]
    per_angles = [cmath.phase(z) for z in per_from_centre]
    per_angles.sort()
    differences = [per_angles[n+1] - per_angles[n] for n in range(len(per_angles) - 1)] \
                  + [per_angles[0] + 2*cmath.pi - per_angles[-1]]
    return max(differences)  # in radians

centre = (0., 0.)
peripheral = [(1., 2.), (3., 4.), (3., 5.)]
print largest_spoke_angle(centre, peripheral)
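For the example points above, this prints roughly 6.10 radians (about 350 degrees): the three spokes are bunched together, so the largest gap is the reflex angle between the first and last of them.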
I think I would do something like this:
angles = [get_angle_from_xaxis(origin,point) for point in points]
#make sure the angles are in order
angles.sort()
#need to compare last one with first one
angles.insert(0,angles[-1]-360.0) #360 if degrees, otherwise 2*math.pi.
#Now calculate the difference between adjacent angles and take the maximum
maxangle = max( angles[i] - angle for i,angle in enumerate(angles[:-1],1) )
This is basically the solution you describe. The only thing I've added is a check between the last and first and a sort to make sure we have the angles in the right order.
The answer of #user1597034 is correct, but it does not tell you which spokes produced the largest angle.
The code below finds the indices of the two vectors of largest angle:
import cmath
import numpy as np

center = (0., 0.)
peripheral = np.array([(-1., -1.), (0., 1.), (1., -0.55), (0, -1), (-1, 1)])
per_from_centre = [complex(z[0] - center[0], z[1] - center[1]) for z in peripheral]
per_angles = [cmath.phase(z) for z in per_from_centre]
id_ord = np.argsort(per_angles, axis=-1)  # order index
per_angles.sort()
differences = [per_angles[n+1] - per_angles[n] for n in range(len(per_angles) - 1)] \
              + [per_angles[0] + 2*cmath.pi - per_angles[-1]]
# ----- so far, same code in relation to #user1597034 -----
# find index of adjacent angles of greater angle
max_value = max(differences)  # maximum value
for i in range(len(differences)):
    if max_value == differences[i]:
        if i == (len(differences) - 1):
            pairs = [id_ord[0], id_ord[-1]]
        else:
            pairs = [id_ord[i]] + [id_ord[i+1]]
print('pair index of largest angle:', pairs)
pair index of largest angle: [2, 1]
