Find shortest distance between points of two arrays with scipy - python

I have two arrays, centroids and nodes.
I need to find the shortest distance from each point in centroids to any point in nodes.
The output for centroids is the following:
array([[12.52512263, 55.78940022],
[12.52027731, 55.7893347 ],
[12.51987146, 55.78855611]])
The output for nodes is the following:
array([[12.5217378, 55.7799275],
[12.5122589, 55.7811443],
[12.5241664, 55.7843297],
[12.5189395, 55.7802709]])
I use the following code to get the shortest distance
shortdist_from_centroid_to_node = np.min(cdist(centroids,nodes))
However, this is the output I get (I should get 3 lines of output)
Out[131]: 3.0575613850140956e-05
Can anyone specify what the problem is here? Thanks.

When you call np.min on the result, it returns the minimal value of the whole 2D array.
You want the minimum value for each centroid.
shortest_distance_to_centroid = [min(x) for x in cdist(centroids,nodes)]
To get the associated index, one way is to look up the index of the minimum value. Another is to write a custom min() function that also returns the index (so you traverse the list only once).
[(list(x).index(min(x)), min(x)) for x in cdist(centroids,nodes)] # the list() cast is needed because NumPy arrays don't have an index method
Solution with a custom function:
def my_min(x):
    current_min = x[0]
    current_index = 0  # index of the first element, not the list [1]
    for i, v in enumerate(x[1:]):
        if v < current_min:
            current_min = v
            current_index = i + 1  # +1 because enumeration starts at x[1]
    return (current_index, current_min)
[my_min(x) for x in cdist(centroids,nodes)]

I guess what you need is just to add an argument called axis, like this:
shortdist_from_centroid_to_node = np.min(cdist(centroids,nodes), axis=1)
As for the meaning of the axis argument, you can refer to the numpy.min documentation. In short, you need the minimum of each row rather than of the whole matrix.
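A small sketch of what axis=1 changes, using the arrays from the question. The pairwise distance matrix is built here with plain NumPy broadcasting (an assumption made only to keep the example free of the SciPy import); cdist(centroids, nodes) should give the same 3x4 matrix. The np.argmin call is an addition that shows which node is the closest one.

```python
import numpy as np

centroids = np.array([[12.52512263, 55.78940022],
                      [12.52027731, 55.7893347],
                      [12.51987146, 55.78855611]])
nodes = np.array([[12.5217378, 55.7799275],
                  [12.5122589, 55.7811443],
                  [12.5241664, 55.7843297],
                  [12.5189395, 55.7802709]])

# Pairwise Euclidean distances, shape (3, 4):
# rows correspond to centroids, columns to nodes.
diff = centroids[:, None, :] - nodes[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=2))

overall_min = np.min(dist)                # a single scalar for the whole matrix
per_centroid_min = np.min(dist, axis=1)   # one minimum per row (per centroid)
nearest_node = np.argmin(dist, axis=1)    # column index of each row minimum

print(per_centroid_min.shape)
print(nearest_node)
```

Without axis, the reduction runs over every element, which is why the original call returned one number instead of three.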

If I am not wrong, your code asks for the min value, hence you are getting a single value. Remove np.min() and try:
shortdist_from_centroid_to_node = cdist(centroids,nodes)

Related

Finding the minimum result of a list of results

Is there a code to define / sort through lists?
Advanced solution
You can use the min function of python with the key argument like this:
def find_closest(start_point, remaining_points):
    return min(remaining_points, key=lambda a: distance(start_point, a))
Basic solution
Because of your specific needs (only loops and if statements), here is another solution. For other people who are not restricted, I recommend using the above solution.
def find_closest(start_point, remaining_points):
    closest = None
    closest_distance = 1000000
    # Better but more advanced initialisation
    # closest_distance = float("inf")
    for element in list(remaining_points):
        dist = distance(start_point, element)
        if dist < closest_distance:
            closest = element
            closest_distance = dist
    return closest
Explanation
Before going through all points, we initialise the closest point to None (it is not found yet) and closest_distance to a very high value (to be sure that the first evaluated point will be closer).
Then, for each point in remaining_points, we calculate its distance from start_point and store it in dist.
If this distance dist is less than closest_distance, then the current point is closer than the currently stored one, so we update the stored closest point closest with the current point and we update closest_distance with the current distance dist.
When all points have been evaluated, we return the closest point closest.
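The snippets above call a distance function that is never shown. A minimal runnable sketch, assuming 2-D points and Euclidean distance (the distance helper and the sample points are hypothetical, added only for illustration):

```python
import math

def distance(p, q):
    # Hypothetical helper: Euclidean distance between two 2-D points.
    return math.hypot(p[0] - q[0], p[1] - q[1])

def find_closest(start_point, remaining_points):
    closest = None
    closest_distance = float("inf")
    for element in remaining_points:
        dist = distance(start_point, element)
        if dist < closest_distance:
            closest = element
            closest_distance = dist
    return closest

points = [(5, 5), (1, 2), (-3, 0)]
print(find_closest((0, 0), points))  # (1, 2)
```

The one-line min(..., key=...) version should return the same point for the same input.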
Links for more information
min function: https://www.programiz.com/python-programming/methods/built-in/min
lambda function: https://www.w3schools.com/python/python_lambda.asp
A quick and straightforward solution would be to create a list of results and then corroborate the index with your list of remaining points (because they are inherently lined up). Below is a step-by-step process to achieving this, and at the very bottom is a cleaned-up version of the code.
def find_closest(start_point, remaining_points):
    results = []  # list of results for later use
    for element in list(remaining_points):
        result = distance(start_point, element)
        results.append(result)  # append the result to the list
    # After iteration is finished, find the lowest result
    lowest_result = min(results)
    # Find the index of the lowest result
    lowest_result_index = results.index(lowest_result)
    # Corroborate this with the inputs
    closest_point = remaining_points[lowest_result_index]
    # Return the closest point
    return closest_point
Or to simplify the code:
def find_closest(start_point, remaining_points):
    results = []
    for element in remaining_points:
        results.append(distance(start_point, element))
    return remaining_points[results.index(min(results))]
Edit: you commented saying you can't use Python's built-in min() function. A solution would be to write your own minimum_value() function.
def minimum_value(lst: list):
    min_val = lst[0]
    for item in lst:
        if item < min_val:
            min_val = item
    return min_val
def find_closest(start_point, remaining_points):
    results = []
    for element in remaining_points:
        results.append(distance(start_point, element))
    return remaining_points[results.index(minimum_value(results))]

Find minimum return value of function with two parameters

I have an error function, and sum of all errors on self.array:
#'array' looks something like this [[x1,y1],[x2,y2],[x3,y3],...,[xn,yn]]
#'distances' is an array with same length as array with different int values in it
def calcError(self,n,X,Y): #calculate distance of nth member of array from given point
    X,Y = float(X),float(Y)
    arrX = float(self.array[n][0])
    arrY = float(self.array[n][1])
    e = 2.71828
    eToThePower = e**(-1*self.distances[n])
    distanceFromPoint=math.sqrt((arrX-X)**2+(arrY-Y)**2)
    return float(eToThePower*(distanceFromPoint-self.distances[n])**2)
def sumFunction(self,X,Y):
    res = 0.0
    for i in range(len(self.array)):
        res += self.calcError(i,X,Y)
    return res
I have been looking for a way to find the coordinates for which the return value of sumFunction is minimal. I have heard about scipy, yet I am looking for a way to build that manually. Gradient descent doesn't seem to work either, since it is very hard to derive this sum function.
Thank you!
Did you try creating a dictionary and adding an entry for every iteration, like {self.calcError(i,X,Y): (i,X,Y)}? If you then take the minimum of the dictionary's keys, you can look up the coordinates stored under that key.
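One way to read this suggestion is as a brute-force grid search: evaluate the function on a grid of candidate coordinates and keep the best one. The sketch below is only an illustration under stated assumptions: grid_search and toy_error are hypothetical names, toy_error stands in for sumFunction, and the best point is tracked directly rather than stored in a dictionary keyed by error (equal errors would overwrite each other as keys).

```python
def grid_search(func, x_values, y_values):
    # Evaluate func at every (x, y) pair and remember the smallest value seen.
    best_point = None
    best_value = float("inf")
    for x in x_values:
        for y in y_values:
            value = func(x, y)
            if value < best_value:
                best_value = value
                best_point = (x, y)
    return best_point, best_value

def toy_error(x, y):
    # Stand-in for sumFunction: a bowl with its minimum at (1, 2).
    return (x - 1) ** 2 + (y - 2) ** 2

# Candidate coordinates from -5.0 to 5.0 in steps of 0.5.
steps = [i * 0.5 for i in range(-10, 11)]
best_point, best_value = grid_search(toy_error, steps, steps)
print(best_point, best_value)
```

The answer is only as precise as the grid spacing; a common refinement is to repeat the search on a finer grid around the best point found.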

Trying to add specific values from a matrix without using numpy

I am supposed to create a function that adds the absolute value of numbers that are larger than a certain number without using any python modules to do it. I wrote this:
def MatrixSumLarge(vals,large):
    sum=0
    sum1=0
    sum2=0
    i=0
    #looking at the rows of the matrix
    for i in range(len(vals[i])):
        if abs(vals)>=large:
            sum1=sum1+i
    #looking at the columns of the matrix
    for j in range(0,len(vals[j])):
        if abs(vals[j])>=large:
            sum2=sum2+vals[j]
    sum=sum1+sum2
    return(sum)
vals=[[1,-2.5,7,4],[-8,9,2,-1.5],[-12,7.5,4.2,11]]
# setting value of large
large = 7.1
#to print the final answer
print('b) Sum of Large Values = ',MatrixSumLarge(vals,large))
And got the error:
TypeError: bad operand type for abs(): 'list'
Your vals is a list of lists, so vals[i] is a list. However, abs() cannot be applied to a list; it must be applied to a number (float, integer, or complex number), as described in the official documentation for abs(). You can, for example, add an extra loop to add up the elements in each row, like sum(abs(x) for x in vals[i] if x >= 2)
You can change 2 to the number you need.
Overall, your function could look something like this
def f(mat, v):
    row_sums = [sum(abs(x) for x in row if x >= v) for row in mat]
    return sum(row_sums)
From your question, it's not very clear to me whether you want to
1) compare abs(x) to v or x to v
2) sum up abs(x) or x
But you should be able to adjust the code above to suit your need.
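To make the difference between the two comparisons concrete, here is a small sketch using the matrix from the question. The sum_large function and its compare_abs flag are hypothetical names introduced for illustration; both variants add up abs(x) and differ only in what is compared against the threshold.

```python
def sum_large(mat, v, compare_abs=True):
    # Add up abs(x) for every element that passes the threshold test.
    total = 0
    for row in mat:
        for x in row:
            key = abs(x) if compare_abs else x
            if key >= v:
                total += abs(x)
    return total

vals = [[1, -2.5, 7, 4], [-8, 9, 2, -1.5], [-12, 7.5, 4.2, 11]]
print(sum_large(vals, 7.1, compare_abs=True))   # 47.5 (includes -8 and -12)
print(sum_large(vals, 7.1, compare_abs=False))  # 27.5 (only 9, 7.5 and 11)
```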
There is also a Python module that helps you flatten a list of lists. The function is called chain.from_iterable. So the above code could also be written like this:
from itertools import chain
def f(mat, v):
    return sum(abs(x) for x in chain.from_iterable(mat) if x >= v)

recursive function in python but with strange return

I am trying to solve a linear equation with several variables, for example 11x+7y+3z=20, where only non-negative integer solutions are wanted.
I use the code below in Python 3.5.1, but the result contains something like [...]. I wonder what it is?
The code tests every variable from 0 to its max (the total value divided by the corresponding coefficient). Because the number of variables may be large, I want to use recursion to solve it.
def equation (a,b,relist):
    global total
    if len(a)>1:
        for i in range(b//a[0]+1):
            corelist=relist.copy()
            corelist+=[i]
            testrest=equation(a[1:],b-a[0]*i,corelist)
            if testrest:
                total+=[testrest]
        return total
    else:
        if b%a[0]==0:
            relist+=[b//a[0]]
            return relist
        else:
            return False
total=[]
re=equation([11,7,3],20,[])
print(re)
the result is
[[0, 2, 2], [...], [1, 0, 3], [...]]
Changing to the new version below gives a clean result, but I still need a global variable:
def equation (a,b,relist):
    global total
    if len(a)>1:
        for i in range(b//a[0]+1):
            corelist=relist.copy()
            corelist+=[i]
            equation(a[1:],b-a[0]*i,corelist)
        return total
    else:
        if b%a[0]==0:
            relist+=[b//a[0]]
            total+=[relist]
            return
        else:
            return
total=[]
print(equation([11,7,3],20,[]))
I see three layers of problems here.
1) There seems to be a misunderstanding about recursion.
2) There seems to be an underestimation of the complexity of the problem you are trying to solve (a modeling issue)
3) Your main question exposes some lacking skills in python itself.
I will address the questions in backward order given that your actual question is "the result contains something like [...]. I wonder what is it?"
"[]" in python designates a list.
For example:
var = [ 1, 2 ,3 ,4 ]
Creates a reference "var" to a list containing 4 integers of values 1, 2, 3 and 4 respectively.
var2 = [ "hello", ["foo", "bar"], "world" ]
var2 on the other hand is a reference to a composite list of 3 elements, a string, another list and a string. The 2nd element is a list of 2 strings.
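So your result is a list of lists of integers. The entries printed as "[...]" are not lists of integers, though: that is how Python displays a list that contains a reference to itself, which happens here because the function returns the global total and the caller then appends it to total again. If all the sublists are the same size, you could also think of the result as a matrix. And the way the function is written, you could end up with a composite list of lists of integers, the value "False" (or the value "None" in the newest version).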
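A minimal sketch of how a "[...]" entry can arise, using a two-line reproduction rather than the original function (the values are chosen to mirror the first solution in the question's output):

```python
total = [[0, 2, 2]]
total += [total]          # the list now holds a reference to itself

print(total)              # [[0, 2, 2], [...]]
print(total[1] is total)  # True
```

Python prints "[...]" instead of recursing forever when a list appears inside itself. In the first version of equation, the recursive call returns the global total, and the statement total+=[testrest] then appends total to itself in the same way.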
Now to the modeling problem. The equation 11x + 7y + 3z = 20 is one equation with 3 unknowns. It is not clear at all to me what you want to achieve with this program, but unless you solve the equation by selecting 2 independent variables, you won't achieve much. Nor is it clear to me what the relation is between the program and the equation, save for the list you provided as argument with the values 11, 7 and 3.
What I would do (assuming you are looking for triplets of values that solve the equation) is go for the equation: f(x,y) = (20/3) - (11/3)x - (7/3)y. Then the code I would rather write is:
def func_f(x, y):
    return 20.0/3.0 - (11.0/3.0) * x - (7.0/3.0) * y
list_of_list_of_triplets = []
for (x, y) in zip(range(100),range(100)):
    list_of_triplet = [x, y, func_f(x,y)]
    list_of_list_of_triplets += [list_of_triplet] # or .append(list_of_triplet)
Be mindful that the number of real-valued solutions to this equation is infinite. You could think of it as a plane cutting through a rectangular prism if you bound the variables. If you wanted to represent the same relation in an arbitrary number of dimensions, you could rewrite the above as:
def func_multi_f(nthc, const, coeffs, vars):
    # solve for the Nth variable: (const - sum of the other terms) / nthc
    return (const - sum(a * b for a, b in zip(coeffs, vars))) / nthc
Where nthc is the coefficient of the Nth variable, const is the constant on the right-hand side of the equation, coeffs is a list of the other coefficients and vars the values of the N-1 other variables. For example, we could re-write func_f as:
def func_f(x,y):
    return func_multi_f(3.0, 20.0, [11.0, 7.0], [x,y])
Now about recursion. A recursion is a formulation over a reducible input that calls itself repeatedly until it reaches a final result. In pseudocode, a recursive algorithm can be formulated as:
input = a reduced value or input items
if input has reached final state: return final value
operation = perform something on input and reduce it, combine with return value of this algorithm with reduced input.
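For example, the Fibonacci sequence:
def fibonacci(val):
    # the first two terms are both 1
    if val <= 2:
        return 1
    # every later term is the sum of the two before it
    return fibonacci(val - 1) + fibonacci(val - 2)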
If you wanted to recursively add elements from a list:
def sum_recursive(values):
    if len(values) == 1:
        return values[0]
    return sum_recursive(values[:-1]) + values[-1]
Hope it helps.
UPDATE
From comments and original question edits, it appears that we are rather looking for non-negative INTEGER solutions to the equation. That is quite different.
1) Step one, find bounds: use the inequality ax + by + cz <= 20 with a,b,c > 0 and x,y,z >= 0
2) Step two, simply do [(x, y, z) for x, y, z in itertools.product(bounds_x, bounds_y, bounds_z) if x*11 + y*7 + z*3 - 20 == 0] and you will have a list of valid triplets. (Note that zip would only pair the bounds up element by element; every combination has to be tested.)
In code:
def bounds(coeff, const):
    # const + 1 so that val == const is still considered when coeff == 1
    return [val for val in range(const + 1) if coeff * val <= const]
def combine_bounds(bounds_list):
    # here you have to write your recursive function to build
    # all possible combinations assuming N dimensions
    raise NotImplementedError
def sols(coeffs, const):
    bounds_lists = [bounds(a, const) for a in coeffs]
    return [vals for vals in combine_bounds(bounds_lists) if sum(a*b for a,b in zip(coeffs, vals)) - const == 0]
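The combine_bounds step is left open above. One possible way to fill it in, as a sketch rather than the answerer's own implementation, is a recursion that takes each value of the first bounds list and prepends it to every combination of the remaining lists. The bodies below are assumptions written for illustration; itertools.product would produce the same combinations without recursion.

```python
def bounds(coeff, const):
    # All values a variable can take on its own without exceeding const.
    return list(range(const // coeff + 1))

def combine_bounds(bounds_lists):
    # Base case: no variables left, so there is one empty combination.
    if not bounds_lists:
        return [[]]
    # Recursive case: prepend each value of the first list
    # to every combination of the remaining lists.
    rest = combine_bounds(bounds_lists[1:])
    return [[value] + tail for value in bounds_lists[0] for tail in rest]

def sols(coeffs, const):
    bounds_lists = [bounds(a, const) for a in coeffs]
    return [vals for vals in combine_bounds(bounds_lists)
            if sum(a * b for a, b in zip(coeffs, vals)) == const]

print(sols([11, 7, 3], 20))  # [[0, 2, 2], [1, 0, 3]]
```

This enumerates every combination before filtering, so it grows quickly with the number of variables; pruning on the remaining total during the recursion scales better.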
Here is a solution built from your second one, but without the global variable. Instead, each call passes back a list of solutions; the parent call appends each solution to the current element, making a new list to return.
def equation (a, b):
    result = []
    if len(a) > 1:
        # For each valid value of the current coefficient,
        # recur on the remainder of the list.
        for i in range(b // a[0]+1):
            soln = equation(a[1:], b-a[0]*i)
            # prepend the current coefficient
            # to each solution of the recursive call.
            for item in soln:
                result.append([i] + item)
    else:
        # Only one item left: is it a solution?
        if b%a[0] == 0:
            # Success: return a list of the one element
            result = [[b // a[0]]]
        else:
            # Failure: return empty list
            result = []
    return result
print(equation([11, 7, 3], 20))

Return elements in a location corresponding to the minimum values of another array

I have two arrays with the same shape in the first two dimensions and I'm looking to record the minimum value in each row of the first array. However I would also like to record the elements in the corresponding position in the third dimension of the second array. I can do it like this:
A = np.random.random((5000, 100))
B = np.random.random((5000, 100, 3))
A_mins = np.ndarray((5000, 4))
for i, row in enumerate(A):
    current_min = min(row)
    A_mins[i, 0] = current_min
    A_mins[i, 1:] = B[i, row == current_min]
I'm new to programming (so correct me if I'm wrong) but I understand that with Numpy doing calculations on whole arrays is faster than iterating over them. With this in mind is there a faster way of doing this? I can't see a way to get rid of the row == current_min bit even though the location of the minimum point must have been 'known' to the computer when it was calculating the min().
Any tips/suggestions appreciated! Thanks.
Something along the lines of what @lib talked about:
index = np.argmin(A, axis=1)
A_mins[:,0] = A[np.arange(len(A)), index]
A_mins[:,1:] = B[np.arange(len(A)), index]
It is much faster than using a for loop.
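A small check that the argmin version produces the same rows as the loop from the question, sketched on smaller arrays than the original (the shapes, the seed, and the names vectorised and looped are arbitrary choices made for this example):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((6, 4))
B = rng.random((6, 4, 3))

# Vectorised: column index of each row minimum, then paired (row, column) indexing.
index = np.argmin(A, axis=1)
rows = np.arange(len(A))
vectorised = np.empty((len(A), 4))
vectorised[:, 0] = A[rows, index]
vectorised[:, 1:] = B[rows, index]

# Loop from the question, kept for comparison.
looped = np.empty((len(A), 4))
for i, row in enumerate(A):
    current_min = min(row)
    looped[i, 0] = current_min
    looped[i, 1:] = B[i, row == current_min]

print(np.allclose(vectorised, looped))
```

Indexing with two integer arrays of equal length picks one element per row, which is what replaces the row == current_min mask.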
To get the index of the minimum value, use argmin instead of min + comparison.
The amin function (and many other functions in numpy, argmin included) also takes the argument axis, which you can use to get the minimum of each row or each column.
See http://docs.scipy.org/doc/numpy/reference/generated/numpy.amin.html
