closest point function modification

I currently have a function that, given a list of (x, y) pairs, calculates the minimum distance between any two points in the list and returns that distance. I would like to modify the code below so that, instead of returning the distance itself, it returns the two points that yielded that minimum distance, in the same order as they appear in the input list. For example, given the input list [(1, 2), (4, 5), (5, 5), (4, 1)], the output would be ((4, 5), (5, 5)).
import math
#distance function
def distance(p1, p2):
    return math.sqrt((p1[0] - p2[0])**2+(p1[0] - p2[0])**2)
def closest_neighbor(point_list):
    if(len(point_list) < 2):
        return None
    else:
        dist = []
        for i in range(len(point_list) - 1):
            for j in range(i +1, len(point_list)):
                x = point_list[i]
                y = point_list[j]
                dist += [distance(point_list[i], point_list[j])]
        return dist

Assuming you fix your currently incorrect distance function (it uses p1[0] - p2[0] twice; the second term should be p1[1] - p2[1]), it is as simple as:
>>> import itertools
>>> min( (distance(x, y), (x, y)) for x, y in itertools.combinations(point_list, 2))[1]
((4, 5), (5, 5))
Here we generate all combinations of two points from the list, calculate their distance, and take advantage of the fact that tuples compare element-wise, so min picks the entry with the smallest distance.
(You will need to add a check for the length of point_list if that is important to you.)
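As a minimal sketch of the same idea wrapped in a function, the version below adds the length check and uses a key function so that only distances are compared; on a tie, min keeps the first pair in input order. The name closest_pair is illustrative and not from the original post.

```python
import math
from itertools import combinations

def closest_pair(point_list):
    """Return the two points with the smallest distance, or None if there are fewer than two."""
    if len(point_list) < 2:
        return None
    # combinations() yields pairs in input order; min() keeps the first minimum it sees
    return min(combinations(point_list, 2),
               key=lambda pair: math.hypot(pair[0][0] - pair[1][0],
                                           pair[0][1] - pair[1][1]))

print(closest_pair([(1, 2), (4, 5), (5, 5), (4, 1)]))  # ((4, 5), (5, 5))
```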

import math
#distance function
def distance(p1, p2):
    return math.sqrt((p1[0] - p2[0])**2+(p1[1] - p2[1])**2) # the original function was wrong
def closest_neighbor(point_list):
    if(len(point_list) < 2):
        return None
    else:
        min_dist = -1 # sentinel: no distance computed yet (named min_dist to avoid shadowing the built-in min)
        pair = []
        for i in range(len(point_list) - 1):
            for j in range(i +1, len(point_list)):
                x = point_list[i]
                y = point_list[j]
                dist = distance(x, y)
                if dist < min_dist or min_dist == -1:
                    pair = [x,y] # store the pair with min distance
                    min_dist = dist # store the min distance for later comparison
        return pair
print(closest_neighbor( [(1, 2), (4, 5), (5, 5), (4, 1)]))

Related

Scaling tuples using different scales in Python

I am trying to scale three vectors, each by a different scale, and then take the linear combination of the scaled vectors to obtain a single point.
The vectors are the three corners of an equilateral triangle, and the scales are randomly selected.
w_list = np.zeros(3)
sum = 0
for i in range(3):
    w_list[i] = np.random.random_sample()
    sum += w_list[i]
w = w_list/sum # scaling weights
corners = [(0, 0), (1, 0), (0.5, np.sqrt(3/4))]
x = 0, 0
for i, e in zip(w, corners): # w contains the three scales
    x += (i * e[0], i * e[1])
However, when I print x I get a tuple of 7 numbers when it should just be one point.
(0, 0, 0.0, 0.0, 0.4682062316923167, 0.0, 0.15138058449552882, 0.26219886362572936)

While not strictly necessary to solve your problem, it might be useful to get accustomed to using numpy to its full potential. It is very powerful, and vectorized operations are often much faster than equivalent Python loops.
import numpy as np
corners = np.array([(0, 0), (1, 0), (0.5, np.sqrt(3/4))])
weights = np.random.random(3)
weights /= np.sum(weights) # normalization
x = np.dot(weights, corners)

Try the below (using a list rather than a tuple for x):
import numpy as np
corners = [(0, 0), (1, 0), (0.5, np.sqrt(3/4))]
w = [1,2,3]
x = [0, 0]
for i, e in zip(w, corners): # w contains the three scales
    x[0]= x[0] + i * e[0]
    x[1]= x[1] + i * e[1]
print(x)
Output:
[3.5, 2.598076211353316]

You are currently doing tuple concatenation rather than vector addition (note that the final tuple in your question has 8 rather than 7 numbers). Instead, make your x a numpy array:
import numpy as np
w_list = np.zeros(3)
sum = 0
for i in range(3):
    w_list[i] = np.random.random_sample()
    sum += w_list[i]
w = w_list/sum # scaling weights
corners = [(0, 0), (1, 0), (0.5, np.sqrt(3/4))]
x = np.array([0.0, 0.0]) #don't want an int array
for i, e in zip(w, corners): # w contains the three scales
    x += (i * e[0], i * e[1])
print(x) #[0.25691217 0.4105851 ]
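To see the difference between tuple concatenation and vector addition without numpy, here is a small sketch in plain Python. The helper weighted_point and the fixed weights are assumptions made for illustration; the original code draws the weights at random.

```python
import math

# '+' on tuples concatenates instead of adding element-wise
print((0, 0) + (1.0, 2.0))  # (0, 0, 1.0, 2.0)

def weighted_point(weights, corners):
    """Return the weighted sum of 2-D corner points as a single (x, y) tuple."""
    x = sum(w * cx for w, (cx, _) in zip(weights, corners))
    y = sum(w * cy for w, (_, cy) in zip(weights, corners))
    return (x, y)

corners = [(0, 0), (1, 0), (0.5, math.sqrt(3 / 4))]
weights = [0.5, 0.25, 0.25]  # example weights that already sum to 1
print(weighted_point(weights, corners))
```

Because the weights sum to 1 and are non-negative, the resulting point lies inside the triangle.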

Seeking better dynamic programming solution to find algorithm with less complexity

I came across this question in a coding interview.
You are in an infinite 2D grid where you can move in any of the 8 directions:
(x,y) to
(x+1, y),
(x - 1, y),
(x, y+1),
(x, y-1),
(x-1, y-1),
(x+1,y+1),
(x-1,y+1),
(x+1,y-1)
You are given a sequence of points you need to cover. Give the minimum number of steps in which you can achieve it. You start from the first point.
Example :
Input : [(0, 0), (1, 1), (1, 2)]
Output : 2
It takes 1 step to move from (0, 0) to (1, 1). It takes one more step to move from (1, 1) to (1, 2).
I was able to come up with a recursive solution using memoization (a DP technique) that keeps a list of visited points, but it still doesn't seem optimal. I am still thinking about a better solution even after the interview. Can anyone come up with a better solution than mine? I need help!
import sys  # needed for sys.maxsize below

# @param X : list of integers
# @param Y : list of integers
# Points are represented by (X[i], Y[i])
# @return an integer
def coverPoints(self, X, Y):
    if len(X) == 1:return 0
    def odist(A, B): #to calculate shortest distance between a pair of points
        min_d = 0 if abs(A[1]-B[1]) > abs(A[0]-B[0]) else 1
        return abs(A[min_d]-B[min_d]) + (abs(A[1-min_d]-B[1-min_d])- abs(A[min_d]-B[min_d]))
    D = {}
    def rec(curL, last, dist):
        if D.get((tuple(curL), dist), False) != False:return D[(tuple(curL),dist)]
        if len(curL) == 0:return dist
        else:
            s = sys.maxsize
            for id, i in enumerate(curL):
                newL = curL[:id] + curL[id+1:]
                s = min(s, rec(newL, id, odist( (X[last], Y[last]), (X[curL[id]], Y[curL[id]]) )))
            D[(tuple(curL),dist)] = dist + s
            return dist + s
    s = rec([i for i in range(len(X))], 0, 0)
    return s
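If the points must be visited in the given order (an assumption based on the wording "sequence of points" and the example), no search is needed. A diagonal move changes both coordinates at once, so moving between two points takes max(|dx|, |dy|) steps, which is the value odist computes. A sketch that sums this over consecutive points, under the hypothetical name cover_points, runs in O(n):

```python
def cover_points(X, Y):
    """Minimum steps to visit (X[i], Y[i]) in order with 8-directional moves."""
    steps = 0
    for i in range(1, len(X)):
        dx = abs(X[i] - X[i - 1])
        dy = abs(Y[i] - Y[i - 1])
        # diagonal moves cover min(dx, dy); straight moves cover the rest
        steps += max(dx, dy)
    return steps

print(cover_points([0, 1, 1], [0, 1, 2]))  # 2
```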

Python: average distance between a bunch of points in the (x,y) plane

The formula for computing the distance between two points in the (x, y) plane is well known and straightforward.
However, what is the best way to approach a problem with n points, for which you want to compute the average distance?
Example:
import matplotlib.pyplot as plt
x=[89.86, 23.0, 9.29, 55.47, 4.5, 59.0, 1.65, 56.2, 18.53, 40.0]
y=[78.65, 28.0, 63.43, 66.47, 68.0, 69.5, 86.26, 84.2, 88.0, 111.0]
plt.scatter(x, y,color='k')
plt.show()
The distance is simply rendered as:
import math
dist=math.sqrt((x2-x1)**2+(y2-y1)**2)
But this is a problem of combinations in which repetitions are not allowed. How should I approach it?

itertools.combinations gives combinations without repeats:
>>> import itertools
>>> for combo in itertools.combinations([(1,1), (2,2), (3,3), (4,4)], 2):
...     print(combo)
...
((1, 1), (2, 2))
((1, 1), (3, 3))
((1, 1), (4, 4))
((2, 2), (3, 3))
((2, 2), (4, 4))
((3, 3), (4, 4))
Code for your problem:
import math
from itertools import combinations
def dist(p1, p2):
    (x1, y1), (x2, y2) = p1, p2
    return math.sqrt((x2 - x1)**2 + (y2 - y1)**2)
x = [89.86, 23.0, 9.29, 55.47, 4.5, 59.0, 1.65, 56.2, 18.53, 40.0]
y = [78.65, 28.0, 63.43, 66.47, 68.0, 69.5, 86.26, 84.2, 88.0, 111.0]
points = list(zip(x,y))
distances = [dist(p1, p2) for p1, p2 in combinations(points, 2)]
avg_distance = sum(distances) / len(distances)

In that case you need to loop over the sequence of points:
from math import sqrt
def avg_distance(x,y):
    n = len(x)
    dist = 0
    for i in range(n):
        xi = x[i]
        yi = y[i]
        for j in range(i+1,n):
            dx = x[j]-xi
            dy = y[j]-yi
            dist += sqrt(dx*dx+dy*dy)
    return 2.0*dist/(n*(n-1))
In the last step, we divide the total distance by n×(n-1)/2 which is the result of:
$$\sum_{i=1}^{n-1} i = \frac{n(n-1)}{2}$$
which is thus the total amount of distances we have calculated.
Here we do not measure the distance between a point and itself (which is of course always 0). Note that this affects the average, since those zero distances are not counted.
Given n points, this algorithm runs in O(n^2) time.
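As a quick sanity check of the pair count and the averaging, here is a small sketch on a 3-4-5 right triangle, where the three pairwise distances are known. It restates the loop so that the block runs on its own, and it assumes at least two points (with fewer, the division would fail).

```python
from itertools import combinations
from math import sqrt

def avg_distance(x, y):
    """Average distance over all unordered pairs of points (x[i], y[i])."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += sqrt((x[j] - x[i]) ** 2 + (y[j] - y[i]) ** 2)
    return 2.0 * total / (n * (n - 1))

# Corners of a 3-4-5 right triangle: the distances are 3, 4 and 5
print(avg_distance([0, 3, 0], [0, 0, 4]))  # 4.0

# The number of unordered pairs of n points is n*(n-1)/2
n = 10
print(len(list(combinations(range(n), 2))), n * (n - 1) // 2)  # 45 45
```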

You can solve this problem (probably more efficiently) by using the pdist function from the SciPy library. This function computes the pairwise distances between observations in n-dimensional space.
To solve the problem, you can use the following function:
from scipy.spatial.distance import pdist
import numpy as np
def compute_average_distance(X):
    """
    Computes the average distance among a set of n points in the d-dimensional space.
    Arguments:
        X {numpy array} - the query points in an array of shape (n,d),
                          where n is the number of points and d is the dimension.
    Returns:
        {float} - the average distance among the points
    """
    return np.mean(pdist(X))

Fastest method for comparing ranges of multidimensional lists

Context: I am trying to write an A* search in an unknown environment. To do this, I am maintaining a representation of the environment in a 2-D or 3-D list (depending on the environment), and another n-D list that represents the agent's knowledge of the environment.
When the agent moves, I check the area around them with the actual environment. If there is a discrepancy, their map gets updated, and I run A* again.
Problem: What is the fastest method to check if there is a difference between the ranges of these two lists?
Naive Solution:
from itertools import product
from random import randint
width, height = 10, 10
known_environment = [[0 for x in range(width)] for y in range(height)]
actual_environment = [[0 for x in range(width)] for y in range(height)]
# Populate with obstacles
for i in range(10):  # range works in both Python 2 and 3
    x = randint(0, len(actual_environment) - 1)
    y = randint(0, len(actual_environment[x]) - 1)
    actual_environment[x][y] += 1
# Run A* and get a path
path = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4),
        (5, 5), (6, 6), (7, 7), (8, 8), (9, 9)] # dummy path
# Traverse path, checking for "new" obstacles
for step in path:
    x, y = step[0], step[1]
    # check area around agent
    for (i, j) in product([-1, 0, 1], [-1, 0, 1]):
        # don't bother checking out-of-bounds
        if not 0 <= x + i < width:
            continue
        if not 0 <= y + j < height:
            continue
        # internal map doesn't match real world, update
        if not known_environment[x + i][ y + j] == actual_environment[x + i][ y + j]:
            known_environment[x + i][ y + j] = actual_environment[x + i][ y + j]
            # Re-run A*
This works, but it feels inefficient. I'm thinking I could replace the loop with something like set(known_environment).intersection(actual_environment) to check if there is a discrepancy, and then update if needed; but this can probably be improved upon as well.
Thoughts?
Edit: I've switched over to numpy slicing, and use array_equal instead of sets.
# check area around agent
left = x - sight if x - sight >= 0 else 0
right = x + sight if x + sight < width else width - 1
top = y - sight if y - sight >= 0 else 0
bottom = y + sight if y + sight < height else height - 1
known_section = known_environment[left:right + 1, top:bottom + 1]
actual_section = actual_environment[left:right + 1, top:bottom + 1]
if not np.array_equal(known_section, actual_section):
    known_environment[left:right + 1, top:bottom + 1] = actual_section

This should already be a tad faster if you use the approach from the link given in my comment on the question.
I modified the given code a bit and tried:
#! /usr/bin/env python
from __future__ import print_function
from itertools import product
from random import randint
width, height = 10, 10
known_env = [[0 for x in range(width)] for y in range(height)]
actual_env = [[0 for x in range(width)] for y in range(height)]
# Populate with obstacles
for i in range(10):  # range works in both Python 2 and 3
    x = randint(0, len(actual_env) - 1)
    y = randint(0, len(actual_env[x]) - 1)
    actual_env[x][y] += 1
# Run A* and get a path
path = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4),
        (5, 5), (6, 6), (7, 7), (8, 8), (9, 9)] # dummy path
def effective_slices(i_w, j_h):
    """Note: Depends on globals width and height."""
    i_w_p, j_h_p = max(0, i_w - 1), max(0, j_h - 1)
    # slice stops are exclusive, so go one past the last neighbour
    i_w_s, j_h_s = min(width, i_w + 2), min(height, j_h + 2)
    return slice(i_w_p, i_w_s), slice(j_h_p, j_h_s)
# Traverse path, checking for "new" obstacles
for step in path:
    x, y = step[0], step[1]
    # check area around agent
    dim_w, dim_h = effective_slices(x, y)
    # nested lists need the second slice applied to each row
    actual_set = set(map(tuple, (row[dim_h] for row in actual_env[dim_w])))
    known_set = set(map(tuple, (row[dim_h] for row in known_env[dim_w])))
    # caveat: sets of row tuples ignore row order; a direct list comparison is stricter
    sym_diff = actual_set.symmetric_difference(known_set)
    if sym_diff: # internal map doesn't match real world, update
        for (i, j) in product(range(dim_w.start, dim_w.stop),
                              range(dim_h.start, dim_h.stop)):
            if known_env[i][j] != actual_env[i][j]:
                known_env[i][j] = actual_env[i][j]
        # Re-run A*
(Edited): Added some reuse of the slice bounds above in the eager update loop.
2nd edit, to address the updated question (cf. comment below):
Looking at the amended question, i.e. the snippet now based on numpy, I'd suggest two changes that would make the code clearer, to me at least:
To avoid the literal + 1 clutter, address the issue of slices excluding the stop by introducing supremum (exclusive upper bound) values for right and bottom.
Define the section's bounding box with min and max to make the relation clear.
Like so:
# ... 8< - -
# check area around agent (note: uses numpy)
sight = 1
left, right_sup = max(0, x - sight), min(x + sight + 1, width)
top, bottom_sup = max(0, y - sight), min(y + sight + 1, height)
known_section = known_environment[left:right_sup, top:bottom_sup]
actual_section = actual_environment[left:right_sup, top:bottom_sup]
if not np.array_equal(known_section, actual_section):
    known_environment[left:right_sup, top:bottom_sup] = actual_section
# - - >8 ...
... or getting rid of colon-itis (sorry):
# ... 8< - -
h_slice = slice(max(0, x - sight), min(x + sight + 1, width))
v_slice = slice(max(0, y - sight), min(y + sight + 1, height))
known_section = known_environment[h_slice, v_slice]
actual_section = actual_environment[h_slice, v_slice]
if not np.array_equal(known_section, actual_section):
    known_environment[h_slice, v_slice] = actual_section
# - - >8 ...
This should be concise to read, easy on the run time, and a nice playground.
... but image processing (e.g. with fixed masks) and staggered-grid processing algorithms should offer plenty of ready-made solutions.
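For readers without numpy, the clamped-window comparison can be sketched on plain nested lists. The function name sync_window and its boolean return value are assumptions for illustration, not part of the code above.

```python
def sync_window(known, actual, x, y, sight=1):
    """Copy the window around (x, y) from actual into known.

    Returns True if any cell differed, i.e. the path should be recomputed.
    """
    rows = slice(max(0, x - sight), min(x + sight + 1, len(actual)))
    changed = False
    for known_row, actual_row in zip(known[rows], actual[rows]):
        cols = slice(max(0, y - sight), min(y + sight + 1, len(actual_row)))
        if known_row[cols] != actual_row[cols]:
            # same-length slice assignment updates the row in place
            known_row[cols] = actual_row[cols]
            changed = True
    return changed

known = [[0] * 4 for _ in range(4)]
actual = [[0] * 4 for _ in range(4)]
actual[1][2] = 1
print(sync_window(known, actual, 0, 0))  # False: (1, 2) is outside the window
print(sync_window(known, actual, 1, 1))  # True: the obstacle at (1, 2) is now seen
```

Comparing row slices directly keeps cell positions, which a set-based comparison does not.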

Python: Indices of identical tuple elements are also identical?

I'm just starting out, so I apologize if this is a stupid question. I'm using Python 2.7, if it's important. I'm writing a program that evaluates a polynomial at some x, where the coefficients are the elements of a tuple and the power of x for each coefficient is its index. It runs fine when all the coefficients are different; the issue I'm having is when any of the coefficients are the same. The code is below:
def evaluate_poly(poly, x):
    """polynomial coefficients represented by elements of tuple.
    each coefficient evaluated at x ** index of coefficient"""
    poly_sum = 0.0
    for coefficient in poly:
        val = coefficient * (x ** poly.index(coefficient))
        poly_sum += val
    return poly_sum
poly = (1, 2, 3)
x = 5
print evaluate_poly(poly, x)
##for coefficient in poly:
##    print poly.index(coefficient)
Which returns 86 as you would expect.
The commented-out print statement will print the index of each element in poly. When they're all different (1, 2, 3) it prints what you would expect (0, 1, 2), but if any of the elements are the same (1, 1, 2) their indices will also be the same (0, 0, 2), so I'm really only able to evaluate polynomials where all the coefficients are different. What am I doing wrong here? I figure it has something to do with -
poly.index(coefficient)
but I can't figure out why exactly. Thanks in advance

Use enumerate. index returns the index of the first occurrence, so it fails for repeated elements: in your code, poly.index(1) with (1, 1, 2) returns 0 each time.
Using enumerate gives you the actual index of every element, and does so more efficiently:
def evaluate_poly(poly, x):
    """polynomial coefficients represented by elements of tuple.
    each coefficient evaluated at x ** index of coefficient"""
    poly_sum = 0.0
    # ind is each index, coefficient is each element
    for ind, coefficient in enumerate(poly):
        # no need for val just += coefficient * (x ** ind)
        poly_sum += coefficient * (x ** ind)
    return poly_sum
If you print(list(enumerate(poly))) with poly = (1, 1, 3), you will see each index paired with its element:
[(0, 1), (1, 1), (2, 3)]
So ind in each iteration of the loop is the index of the current coefficient in your poly tuple.
You can also just return the sum of a generator expression:
def evaluate_poly(poly, x):
    """polynomial coefficients represented by elements of tuple.
    each coefficient evaluated at x ** index of coefficient"""
    return sum((coefficient * (x ** ind) for ind, coefficient in enumerate(poly)),0.0)
Using 0.0 as the start value means a float is returned rather than an int. You could also cast with float(sum(...)), but I think it is simpler just to pass the start value as a float.
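A short sketch in Python 3 syntax (the question itself uses Python 2.7) showing why index misbehaves on repeated coefficients, and that the enumerate version handles them:

```python
def evaluate_poly(poly, x):
    """Evaluate the polynomial whose coefficient of x**i is poly[i]."""
    return sum((coefficient * (x ** ind) for ind, coefficient in enumerate(poly)), 0.0)

poly = (1, 1, 2)
# index() always reports the first match, so both 1s map to position 0
print([poly.index(c) for c in poly])  # [0, 0, 2]
print(evaluate_poly(poly, 5))         # 56.0  (1 + 1*5 + 2*25)
print(evaluate_poly((1, 2, 3), 5))    # 86.0
```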

Try this one:
def evaluate_poly(poly, x):
    '''
    polynomial coefficients represented by elements of tuple.
    each coefficient evaluated at x ** index of coefficient
    '''
    poly_sum = 0.0
    for ind, coefficient in enumerate(poly):
        print ind
        val = coefficient * (x ** ind)
        poly_sum += val
    return poly_sum
poly = (1, 2, 3)
x = 5
print evaluate_poly(poly, x)
It works with poly = (1, 1, 3) too!
