I have the following situation:
(1) I have a large grid. By some conditions I want to further observe specific points/cells in this grid. Each cell has an ID and separate X, Y coordinates. So in this case let's observe one cell only - marked C on the image, which is located on the edge of the grid. By some formula I can get all the neighbouring cells of the first order (marked 1 on the image) and the second order (marked 2 on the image).
(2) With a further condition I identify some cells among the neighbouring cells; they are marked in orange on the second image. What I want to do is connect all orange cells with each other by optimizing the distances, taking into account only min() distances. My first attempt was to observe cells only by calculating the distances to cells of the lower order. So when looking at cells in neighbour order 2, I'm looking at the cells in order 1 only. The resulting connections are presented on image 2, but they are not optimal, since the ideal solution would compare the distances of all cells and not only the cells of the lower neighbour order. By doing this, I get the situation presented on image 3. And the problem is that the cells are, of course, not connected to the centre. What to do?
The current code is:
CO - list of centre points.
data - DataFrame of all IDs with X, Y values.
CO_list = CO['ID'].tolist()

neighbor100 = []
for p in CO_list:  # was IskanjeCO_list; CO_list is the list defined above
    d = get_neighbors100k2(p, len(data))  # finds the IDs of first-order neighbours
    neighbor100.append(d)

neighbor200 = []
for p in CO_list:
    d = get_neighbors200k2(p, len(data))  # finds the IDs of second-order neighbours
    neighbor200.append(d)

# flatten the nested neighbour lists
flat100 = [j for i in neighbor100 for j in i]
flat200 = [j for i in neighbor200 for j in i]

data_sosedi100 = data.iloc[flat100].reset_index(drop=True)
data_sosedi200 = data.iloc[flat200].reset_index(drop=True)

# for every second-order cell, the distance to the nearest first-order cell
dist200 = []
for b in flat200:
    d = ((data_sosedi100['X'] - data.iloc[b]['X'])**2
         + (data_sosedi100['Y'] - data.iloc[b]['Y'])**2)**0.5  # the stray '*' before the subtraction is removed
    dist200.append(d.min())

data_sosedi200['dist'] = dist200
data_sosedi200['id'] = None
for e in CO_list:
    data_sosedi200.loc[data_sosedi200['FID_2'].isin(get_neighbors200k2(e, len(data))), 'id'] = e
Do you have any suggestion how to optimize this a bit further? I hope I presented the whole picture. If needed, I'll clarify further. If you see a part of the code where I'd be able to further optimize this loop, I'd be very grateful!
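For illustration, the distance loop above could in principle be vectorised with NumPy broadcasting; this is only a sketch, assuming data_sosedi100 and data_sosedi200 as built above:

import numpy as np

xy100 = data_sosedi100[['X', 'Y']].to_numpy()  # shape (n100, 2)
xy200 = data_sosedi200[['X', 'Y']].to_numpy()  # shape (n200, 2)

# pairwise Euclidean distances, shape (n200, n100)
d = np.sqrt(((xy200[:, None, :] - xy100[None, :, :]) ** 2).sum(axis=-1))

# distance from each second-order cell to its nearest first-order cell
data_sosedi200['dist'] = d.min(axis=1)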
I defined the points manually to work with:
import numpy as np

nodes = [[-2,1], [-2,0], [-1,0], [0,0], [1,1], [2,1], [2,0], [1,2], [2,2]]
center = [0,0]

def find_neighbor(node):
    """Return the 8-connected neighbours of node that exist in nodes."""
    n = []
    for i in range(-1, 2):
        for j in range(-1, 2):
            if not (i == 0 and j == 0):
                n.append([node[0] + i, node[1] + j])
    return [N for N in n if N in nodes]

def distance_to_center(node):
    return np.sqrt(node[0]**2 + node[1]**2)

def distance_between_two_nodes(node1, node2):
    return np.sqrt((node1[0] - node2[0])**2 + (node1[1] - node2[1])**2)

def next_node_closest_to_center(node):
    min_dist = distance_to_center(node)  # renamed from min to avoid shadowing the builtin
    next_node = node
    for n in find_neighbor(node):
        if distance_to_center(n) < min_dist:
            min_dist = distance_to_center(n)
            next_node = n
    return next_node, min_dist

def get_path_to_center(node):
    node_path = [node]
    distance = 0.
    while node != center:
        new_node = next_node_closest_to_center(node)[0]
        distance += distance_between_two_nodes(node, new_node)
        node_path.append(new_node)
        node = new_node
    return node_path, distance

def furthest_nodes_from_center(nodes):
    max_dist = 0.
    furthest_nodes_pathwise = []  # initialised up front so the elif branch can't hit an unbound name
    for n in nodes:
        if get_path_to_center(n)[1] > max_dist:
            furthest_nodes_pathwise = []
            max_dist = get_path_to_center(n)[1]
            furthest_nodes_pathwise.append(n)
        elif get_path_to_center(n)[1] == max_dist:
            furthest_nodes_pathwise.append(n)
    return furthest_nodes_pathwise

def farthest_node_from_center(nodes):
    max_dist = 0.
    farthest_node = center
    for n in nodes:
        if distance_to_center(n) > max_dist:
            max_dist = distance_to_center(n)
            farthest_node = n
    return farthest_node

def closest_node_to_center(nodes):
    closest_node = farthest_node_from_center(nodes)  # safe fallback if all distances tie
    min_dist = distance_to_center(closest_node)
    for n in nodes:
        if distance_to_center(n) < min_dist:
            min_dist = distance_to_center(n)
            closest_node = n
    return closest_node

def closest_node_center_with_furthest_distance(node_selection):
    if len(node_selection) == 1:
        return node_selection[0]
    else:
        return closest_node_to_center(node_selection)

print(closest_node_center_with_furthest_distance(furthest_nodes_from_center(nodes)))
Output:
[2, 0]
[Finished in 0.266s]
By running it on all nodes I can now determine that the furthest node path-wise that is still closest to the center distance-wise is [2,0] and not [2,2]. So we start from there. To find the one on the other side, just split the data like I said into negative and positive x values. If you run it over a list of only the negative-x cells you will get [-2,1].
Now that you have your 2 starting cells, [2,0] and [-2,1], I will leave you to figure out the algorithm to navigate to the center passing by all cells, using the steps in my comments (you can now skip step 1 because this is the answer posted). A sketch of the split itself follows below.
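A minimal sketch of that split, reusing the functions above (note that find_neighbor still consults the full global nodes list, so paths may pass through cells on either side):

negative_x = [n for n in nodes if n[0] < 0]
positive_x = [n for n in nodes if n[0] >= 0]

print(closest_node_center_with_furthest_distance(furthest_nodes_from_center(negative_x)))
# -> [-2, 1]
print(closest_node_center_with_furthest_distance(furthest_nodes_from_center(positive_x)))
# -> [2, 0]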
The code below should be completely reproducible. I tried my best to make the question clear; if not, please ask for clarification.
What I need to do:
"From set R, find the two closest nodes to the nodes in set P Call the closest node i and the next closest node j" - quoting page 157 end of paragraph that starts step 2, in this paper.
The list R is the ordered set of nodes in a graph and P contains sublists of nodes assigned to a particular vehicle. For example:
R = [1,5,6,9] # Size of R might change for new k
P = [[4],[7,3,8],[2]] # Sizes of sublists might change for new k
So vehicle k=0 gets node P[0] = [4], vehicle k=1 gets nodes P[1] = [7,3,8], and vehicle k=2 gets node P[2] = [2]. For each sublist in P, I want to find the two closest nodes in R to P[k].
The distances are stored in a dict:
dist = {
    (1, 4): 52.35456045083369,
    (5, 4): 37.48332962798263,
    (6, 4): 52.92447448959697,
    (9, 4): 76.83749084919418,
    (1, 7): 94.89467845985885,
    (1, 3): 58.9406481131655,
    (1, 8): 11.180339887498949,
    (5, 7): 54.817880294662984,
    (5, 3): 51.478150704935004,
    (5, 8): 45.044422518220834,
    (6, 7): 27.80287754891569,
    (6, 3): 60.74537019394976,
    (6, 8): 72.3671196055225,
    (9, 7): 99.68951800465283,
    (9, 3): 44.68780594300866,
    (9, 8): 15.811388300841896,
    (1, 2): 102.44998779892558,
    (5, 2): 65.60487786742691,
    (6, 2): 42.37924020083418,
    (9, 2): 102.55242561733974,
}
where the first element of the tuple key is the R-node and the second element is the P-node.
So first, P[0] = [4]: the two closest nodes to it in R are i = 5, since the distance from 5 to 4 is 37.48, and j = 1, since the distance from 1 to 4 is 52.35.
Now we proceed to P[1] = [7,3,8]. Here is where I run into trouble. I interpret the paper as asking "which two nodes in R are closest to the entire group [7,3,8]?" My first instinct was to calculate the average distance from each R-node to the nodes in P[1]; the smallest value is then the closest.
I've made an attempt, but it only works if len(P[k]) == 1. The function I need takes in R and P and spits out i and j for each k. Here is my code:
for k in range(len(P)):  # was range(2); iterate over every vehicle
    all_nodes_dict = {}
    for i in range(len(R)):
        # only looks at the first node of P[k] - hence the len(P[k]) == 1 limitation
        all_nodes_dict[(R[i], P[k][0])] = dist[(R[i], P[k][0])]
    min_list = sorted(list(all_nodes_dict.values()), key=lambda x: float(x))
    min_vals = min_list[:2]
    two_closest_nodes = []
    for i in range(len(min_vals)):
        two_closest_nodes += [return_key(min_vals[i], all_nodes_dict)[0]]
    i = two_closest_nodes[0]
    j = two_closest_nodes[1]
    # do something with i and j before resetting them for the next iteration
Here is my code for the function return_key().
# function to return key given value and dict
def return_key(val, my_dict):
    for key, value in my_dict.items():
        if val == value:
            return key
    return "key doesn't exist"
Here is the code to generate all distances, or the dist dictionary in my code:
import math
import random

n = 10
random.seed(1)
# Create n random points
points = [(0, 0)]
points += [(random.randint(0, 100), random.randint(0, 100)) for i in range(n - 1)]
# Dictionary of distances between each pair of points
dist = {
    (i, j): math.sqrt(sum((points[i][p] - points[j][p]) ** 2 for p in range(2)))
    for i in range(n)
    for j in range(n)
    if i != j
}
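Under the average-distance interpretation, a minimal sketch of the kind of function I'm after might be (this interpretation is my own assumption, not necessarily what the paper intends):

def two_closest(R, P, dist):
    """For each vehicle k, return (i, j): the two R-nodes with the
    smallest mean distance to the whole group P[k]."""
    result = {}
    for k, group in enumerate(P):
        # mean distance from each R-node to every node in the group
        avg = {r: sum(dist[(r, p)] for p in group) / len(group) for r in R}
        i, j = sorted(avg, key=avg.get)[:2]
        result[k] = (i, j)
    return result

# two_closest([1,5,6,9], [[4],[7,3,8],[2]], dist)
# -> {0: (5, 1), 1: (5, 9), 2: (6, 5)}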
There are two rasters as below. One consists of only four values [1,2,3,4]. The other consists of values between 800 and 2500. The problem is to go through all of the raster-1 regions and find the maximum values of raster-2 which are located inside each region or segment.
In theory, it seems simple, but I can't find a way to implement it. I'm reading the scikit-image documentation and I'm getting more confused. In pseudocode, it would be:
for i in raster1rows:
    for j in i:
        # where j is part of a closed patch, iterate through the identical
        # elements of raster-2 and find the maximum value.
There is another problem inherent to this question which I can't post as a different topic. As you can see, there are a lot of isolated pixels on raster-1, which could be interpreted as regions and produce a lot of additional maximums. To prevent this I used:
raster1 = raster1.astype(int)
raster1 = skimage.morphology.remove_small_objects(raster1 , min_size=20, connectivity=2, in_place=True)
But it seems to have no effect on raster-1.
To remove the small objects I applied a median filter instead (sp here is assumed to be scipy.ndimage):
array_aspect = sp.median_filter(array_aspect, size=10)
And it gave me good results.
To find the maximum elevation inside each closed part I did:
# %%% flood-fill closed boundaries on the classified raster
p = 5
ind = 1
for i in rangerow:
    for j in rangecol:
        if array_aspect[i][j] in [0, 1, 2, 3, 4]:
            print("{}. row: {} col: {} is {} is floodfilled with {}, {} meters".format(
                ind, i, j, array_aspect[i][j], p, array_dem[i][j]))
            array_aspect = sk.flood_fill(array_aspect, (i, j), p, in_place=True, connectivity=2)
            p = p + 1
        else:
            pass
        ind = ind + 1

# %%% Find the max elevation inside each fill; maxdems maps fill value -> [row, col, elevation (m)]
p = 5
maxdems = {}
for i in rangerow:
    for j in rangecol:
        try:
            if bool(maxdems[array_aspect[i][j]]) == False or maxdems[array_aspect[i][j]][-1] < array_dem[i][j]:
                maxdems[array_aspect[i][j]] = [i, j, array_dem[i][j]]
            else:
                pass
        except KeyError:  # first time we see this fill value (was a bare except)
            maxdems[array_aspect[i][j]] = [i, j, array_dem[i][j]]
print(maxdems)
I've got my desired results.
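For comparison, per-region maxima can also be computed compactly with scipy.ndimage. The sketch below assumes raster1 holds the class values and raster2 the elevations, with random stand-in data:

import numpy as np
from scipy import ndimage

raster1 = np.random.randint(1, 5, size=(100, 100))       # stand-in classes 1..4
raster2 = np.random.uniform(800, 2500, size=(100, 100))  # stand-in elevations

region_maxima = {}
for cls in np.unique(raster1):
    # label 8-connected regions of this class (mirrors connectivity=2)
    labels, n = ndimage.label(raster1 == cls, structure=np.ones((3, 3)))
    vals = ndimage.maximum(raster2, labels=labels, index=range(1, n + 1))
    pos = ndimage.maximum_position(raster2, labels=labels, index=range(1, n + 1))
    region_maxima[cls] = list(zip(pos, vals))  # [((row, col), max_elev), ...]
print(region_maxima)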
Aim: to find out how many cameras can communicate and are therefore in the same mesh. The tuples are camera coordinates and r is the maximum distance they can be apart. n = number of cameras.
My code takes a list (in this case numl) and arranges it into pairs of coordinates based on whether the value of p <= r^2, resulting in a list like lis = [[(3,4),(5,6)], [(2,4),(5,9)], [(1,4),(5,6)]].
I attempt to compare the values in the area I highlighted as my problem area and join any lists which have one tuple in common. In the example above, for instance, if any of the elements in lis[0] match any of the elements in the others, then I would merge those, minus the duplicate, and end up with [[(3,4),(1,4),(5,6)], [(2,4),(5,9)]].
Can anyone help me to figure this out?
import itertools

def smesh():
    fin = []
    numl = [9, 8, 3, 2, 1, 2, 3, 4, 3, 2]
    r = numl.pop(0)
    r = r**2  # squared radius; the text compares p <= r^2 (the original r**11 looks like a typo)
    n = numl.pop(0)
    flis = numl[::2]  # x coordinates
    numl.pop(0)
    slis = numl[::2]  # y coordinates
    newl = list(zip(flis, slis))
    for x, y in itertools.combinations(newl, 2):
        p = ((x[0] - y[0])**2) + ((x[1] - y[1])**2)  # squared distance between the pair
        if p <= r:
            fin.append([x, y])
    # <--- below here is my problem area --->
    for x, y in itertools.combinations(fin, 2):
        if x[0] in y:
            fin.append(x + y)
        elif x[1] in y:
            fin.append(x + y)
    print(fin)
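One standard way to merge pairs that share a camera is a union-find (disjoint-set) pass over the pairs. Below is a minimal sketch of that idea, assuming fin holds the coordinate pairs built above:

def merge_pairs(pairs):
    """Merge pairs sharing a tuple into connected groups ("meshes")."""
    parent = {}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path compression
            a = parent[a]
        return a

    def union(a, b):
        parent.setdefault(a, a)
        parent.setdefault(b, b)
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for a, b in pairs:
        union(a, b)

    # group every camera under its root representative
    groups = {}
    for node in parent:
        groups.setdefault(find(node), []).append(node)
    return list(groups.values())

# merge_pairs([[(3,4),(5,6)], [(2,4),(5,9)], [(1,4),(5,6)]])
# -> [[(3,4), (5,6), (1,4)], [(2,4), (5,9)]]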
As above, I am trying to implement a Graham scan convex hull algorithm, but I am having trouble with the stack accumulating too many vertices. The points are read from a .dat file here.
For reading points my function is as follows:
def readDataPts(filename, N):
    """Reads the first N lines of data from the input file
    and returns a list of N tuples
    [(x0,y0), (x1, y1), ...]
    """
    count = 0
    points = []
    with open(filename, "r") as listPts:
        lines = listPts.readlines()
    for line in lines:
        if count < N:
            point_list = line.split()
            count += 1
            for i in range(0, len(point_list) - 1):
                points.append((float(point_list[i]), float(point_list[i + 1])))
    return points
My Graham scan is as follows:
def theta(pointA, pointB):
    dx = pointB[0] - pointA[0]
    dy = pointB[1] - pointA[1]
    if abs(dx) < 0.1**5 and abs(dy) < 0.1**5:
        t = 0
    else:
        t = dy / (abs(dx) + abs(dy))
    if dx < 0:
        t = 2 - t
    elif dy < 0:
        t = 4 + t
    return t * 90

def grahamscan(listPts):
    """Returns the convex hull vertices computed using the
    Graham-scan algorithm as a list of 'h' tuples
    [(u0,v0), (u1,v1), ...]
    """
    listPts.sort(key=lambda x: x[1])
    p0 = listPts[0]
    angles = []
    for each in listPts:
        angle = theta(p0, each)
        angles.append((angle, each))
    angles.sort(key=lambda angle: angle[0])
    stack = []
    for i in range(0, 3):
        stack.append(angles[i][1])
    for i in range(3, len(angles)):
        while not isCCW(stack[-2], stack[-1], angles[i][1]):
            stack.pop()
        stack.append(angles[i][1])
    merge_error = stack[-1]
    #stack.remove(merge_error)
    #stack.insert(0,merge_error)
    return stack  # stack becomes the track of the convex hull

def lineFn(ptA, ptB, ptC):
    """Given three points, finds the value used to determine on which side of the line the third point lies"""
    val1 = (ptB[0] - ptA[0]) * (ptC[1] - ptA[1])
    val2 = (ptB[1] - ptA[1]) * (ptC[0] - ptA[0])
    ans = val1 - val2
    return ans

def isCCW(ptA, ptB, ptC):
    """Return True if the third point is on the left side of the line from ptA to ptB and False otherwise"""
    ans = lineFn(ptA, ptB, ptC) > 0
    return ans
When I run it on the data set, taking the first 50 lines as input, it produces the stack:
[(599.4, 400.8), (599.0, 514.4), (594.5, 583.9), (550.1, 598.5), (463.3, 597.2), (409.2, 572.5), (406.0, 425.9), (407.3, 410.2), (416.3, 405.3), (485.2, 400.9)]
but it should produce (in this order):
[(599.4, 400.8), (594.5, 583.9), (550.1, 598.5), (472.6, 596.1), (454.2, 589.4), (410.8, 564.2), (416.3, 405.3), (487.7, 401.5)]
Any ideas?
Angle sorting should be done against an extremal reference point (for example, the bottom-most, then left-most point), which is guaranteed to be on the convex hull. But your implementation uses the first point of the list as the reference.
Wiki excerpt:
swap points[1] with the point with the lowest y-coordinate
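Following that excerpt, a one-line sketch of picking the reference point with a proper tie-break (lowest y, then lowest x) instead of relying on the y-sort alone:

# lowest y wins; among equal y values, the lowest x wins
p0 = min(listPts, key=lambda p: (p[1], p[0]))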
This is my pathfinding function:
def get_distance(x1, y1, x2, y2):
    neighbors = [(-1,0), (1,0), (0,-1), (0,1)]
    old_nodes = [(square_pos[x1, y1], 0)]
    new_nodes = []
    for i in range(50):
        for node in old_nodes:
            if node[0].x == x2 and node[0].y == y2:
                return node[1]
            for neighbor in neighbors:
                try:
                    square = square_pos[node[0].x + neighbor[0], node[0].y + neighbor[1]]
                    if square.lightcycle == None:
                        new_nodes.append((square, node[1]))
                except KeyError:
                    pass
        old_nodes = list(new_nodes)
        new_nodes = []
    return 50
The problem is that the AI takes too long to respond (the response time needs to be <= 100 ms).
This is just a Python way of doing https://en.wikipedia.org/wiki/Pathfinding#Sample_algorithm
You should replace your algorithm with A* search, using the Manhattan distance as a heuristic; a minimal sketch follows.
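Here is a minimal A* sketch on a 4-connected grid with the Manhattan distance as the heuristic. The passable(x, y) predicate is a hypothetical stand-in for your own collision check (square_pos / lightcycle), not part of your code:

import heapq

def astar(start, goal, passable):
    """Length of the shortest 4-connected path from start to goal, or None."""
    def h(p):  # Manhattan-distance heuristic
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    best = {start: 0}                   # best known cost g to reach each cell
    open_heap = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    while open_heap:
        f, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            return g
        for dx, dy in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nxt = (cell[0] + dx, cell[1] + dy)
            if passable(*nxt) and g + 1 < best.get(nxt, float('inf')):
                best[nxt] = g + 1
                heapq.heappush(open_heap, (g + 1 + h(nxt), g + 1, nxt))
    return None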
One reasonably fast solution is to implement Dijkstra's algorithm (which I have already implemented in that question):
Build the original map. It's a masked array where the walker cannot walk on masked elements:
%pylab inline
map_size = (20,20)
MAP = np.ma.masked_array(np.zeros(map_size), np.random.choice([0,1], size=map_size))
matshow(MAP)
Below is the Dijkstra algorithm:
def dijkstra(V):
    mask = V.mask
    visit_mask = mask.copy()  # mask visited cells
    m = numpy.ones_like(V) * numpy.inf
    connectivity = [(i, j) for i in [-1, 0, 1] for j in [-1, 0, 1] if not (i == j == 0)]
    cc = unravel_index(V.argmin(), m.shape)  # current cell
    m[cc] = 0
    P = {}  # dictionary of predecessors
    #while (~visit_mask).sum() > 0:
    for _ in range(V.size):
        # in-bounds, unvisited neighbours of the current cell
        # (the bounds check was e[0] > 0 / e[1] > 0, which skipped row and column 0)
        neighbors = [tuple(e) for e in asarray(cc) - connectivity
                     if e[0] >= 0 and e[1] >= 0 and e[0] < V.shape[0] and e[1] < V.shape[1]]
        neighbors = [e for e in neighbors if not visit_mask[e]]
        tentative_distance = [(V[e] - V[cc])**2 for e in neighbors]
        for i, e in enumerate(neighbors):
            d = tentative_distance[i] + m[cc]
            if d < m[e]:
                m[e] = d
                P[e] = cc
        visit_mask[cc] = True
        m_mask = ma.masked_array(m, visit_mask)
        cc = unravel_index(m_mask.argmin(), m.shape)
    return m, P
def shortestPath(start, end, P):
    Path = []
    step = end
    while 1:
        Path.append(step)
        if step == start:
            break
        if step in P:  # dict.has_key() no longer exists in Python 3
            step = P[step]
        else:
            break
    Path.reverse()
    return asarray(Path)
And the result:
start = (2,8)
stop = (17,19)
D, P = dijkstra(MAP)
path = shortestPath(start, stop, P)
imshow(MAP, interpolation='nearest')
plot(path[:,1], path[:,0], 'ro-', linewidth=2.5)
Below are some timing statistics:
%timeit dijkstra(MAP)
#10 loops, best of 3: 32.6 ms per loop
The biggest issue with your code is that you don't do anything to avoid the same coordinates being visited multiple times. This means that the number of nodes you visit is guaranteed to grow exponentially, since it can keep going back and forth over the first few nodes many times.
The best way to avoid duplication is to maintain a set of the coordinates we've added to the queue (though if your node values are hashable, you might be able to add them directly to the set instead of coordinate tuples). Since we're doing a breadth-first search, we'll always reach a given coordinate by (one of) the shortest path(s), so we never need to worry about finding a better route later on.
Try something like this:
def get_distance(x1, y1, x2, y2):
    neighbors = [(-1,0), (1,0), (0,-1), (0,1)]
    nodes = [(square_pos[x1, y1], 0)]
    seen = set([(x1, y1)])
    for node, path_length in nodes:
        if path_length == 50:
            break
        if node.x == x2 and node.y == y2:
            return path_length
        for nx, ny in neighbors:
            try:
                square = square_pos[node.x + nx, node.y + ny]
                if square.lightcycle == None and (square.x, square.y) not in seen:
                    nodes.append((square, path_length + 1))
                    seen.add((square.x, square.y))
            except KeyError:
                pass
    return 50
I've also simplified the loop a bit. Rather than switching out the list after each depth, you can just use one loop and add to its end as you're iterating over the earlier values. I still abort if a path hasn't been found with fewer than 50 steps (using the distance stored in the 2-tuple, rather than the number of passes of the outer loop). A further improvement might be to use a collections.deque for the queue, since you could efficiently pop from one end while appending to the other end (a sketch of that variant follows below). It probably won't make a huge difference, but might avoid a little bit of memory usage.
I also avoided most of the indexing by one and zero in favor of unpacking into separate variable names in the for loops. I think this is much easier to read, and it avoids confusion, since the two different kinds of 2-tuples had different meanings (one is a (node, distance) tuple, the other is (x, y)).
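A sketch of that deque variant, with a hypothetical grid dict mapping (x, y) to a passable flag standing in for square_pos:

from collections import deque

def bfs_distance(grid, start, goal, limit=50):
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (x, y), dist = queue.popleft()  # O(1) pop from the front
        if (x, y) == goal:
            return dist
        if dist == limit:
            continue  # don't expand beyond the step limit
        for dx, dy in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nxt = (x + dx, y + dy)
            if grid.get(nxt) and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return limit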
I also avoided most of the indexing by one and zero in favor of unpacking into separate variable names in the for loops. I think this is much easier to read, and it avoids confusion since the two different kinds of 2-tuples had had different meanings (one is a node, distance tuple, the other is x, y).