Is the given graph a tree? Faster approach than the one below (Python)

I was given this question during an interview, and although my answer was accepted in the end, they wanted a faster approach and I went blank.
Question:
Given an undirected graph, determine whether it is a tree: return True if it is, and False otherwise.
A tree:

A - B
|
C - D

Not a tree:

  A
 / \
B - C
   /
  D
You'll be given two parameters: n, the number of nodes, and a multidimensional array of edges such as [[1, 2], [2, 3]], where each pair represents the vertices connected by an edge.
Note: expected space complexity is O(|V|).
The array edges can be empty.
Here is my code (105 ms):
def is_graph_tree(n, edges):
    nodes = [None] * (n + 1)
    for i in range(1, n + 1):
        nodes[i] = i
    for i in range(len(edges)):
        start_edge = edges[i][0]
        dest_edge = edges[i][1]
        if nodes[start_edge] != start_edge:
            start_edge = nodes[start_edge]
        if nodes[dest_edge] != dest_edge:
            dest_edge = nodes[dest_edge]
        if start_edge == dest_edge:
            return False
        nodes[start_edge] = dest_edge
    return len(edges) <= n - 1

Here's one approach using a disjoint-set-union / union-find data structure:
def is_graph_tree(n, edges):
    parent = list(range(n + 1))
    size = [1] * (n + 1)
    for x, y in edges:
        # find x (path splitting)
        while parent[x] != x:
            x, parent[x] = parent[x], parent[parent[x]]
        # find y
        while parent[y] != y:
            y, parent[y] = parent[y], parent[parent[y]]
        if x == y:
            # x and y are already connected, so this edge closes a cycle
            return False
        # union (by size)
        if size[x] < size[y]:
            x, y = y, x
        parent[y] = x
        size[x] += size[y]
    # No cycle was found, but a tree must also be connected:
    # an acyclic graph is connected iff it has exactly n - 1 edges.
    return len(edges) == n - 1

assert not is_graph_tree(4, [(1, 2), (2, 3), (3, 4), (4, 2)])
assert is_graph_tree(6, [(1, 2), (2, 3), (3, 4), (3, 5), (1, 6)])
The runtime is O(V + E*α(V)), where α is the inverse Ackermann function. That is better than O(V + E*log(log V)), so it's essentially O(V + E).

Tim Roberts has posted a candidate solution below, but this one also works in the case of disconnected subtrees:
import queue

def is_graph_tree(n, edges):
    # A tree with n nodes has n - 1 edges.
    if len(edges) != n - 1:
        return False
    # Construct the graph (vertices are assumed to be numbered 1..n,
    # as in the examples above).
    graph = [[] for _ in range(n + 1)]
    for first_vertex, second_vertex in edges:
        graph[first_vertex].append(second_vertex)
        graph[second_vertex].append(first_vertex)
    # BFS to find edges that create cycles.
    # The graph is undirected, so we can root the tree wherever we want.
    visited = set()
    q = queue.Queue()
    q.put((1, None))
    while not q.empty():
        current_node, previous_node = q.get()
        if current_node in visited:
            return False
        visited.add(current_node)
        for neighbor in graph[current_node]:
            if neighbor != previous_node:
                q.put((neighbor, current_node))
    # Only return True if the graph has a single connected component.
    return len(visited) == n
This runs in O(n + len(edges)) time.
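A quick check with the same cases as the union-find answer above (assuming 1-indexed vertices):

assert not is_graph_tree(4, [(1, 2), (2, 3), (3, 4), (4, 2)])
assert is_graph_tree(6, [(1, 2), (2, 3), (3, 4), (3, 5), (1, 6)])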

You could approach this from the perspective of tree leaves. Every leaf node in a tree has exactly one edge connected to it. So, if you count the number of edges for each node, you can get the list of leaves (i.e. the nodes with only one edge).
Then, take the nodes linked to these leaves and reduce their edge count by one (as if you were removing all the leaves from the tree). That will give you a new set of leaves corresponding to the parents of the original leaves. Repeat the process until you have no more leaves.
[EDIT] Checking that the number of edges is N-1 eliminates the need for a multi-root check, because if there are multiple 'roots' or a disconnected subtree there will be another discrepancy in the graph (e.g. a double link or a missing node).
If the graph is a tree, this process should eliminate all nodes from the node counts (i.e. they will all be flagged as leaves at some point).
Using the Counter class (from collections) will make this relatively easy to implement:
from collections import Counter

def isTree(N, E):
    if N == 1 and not E: return True                       # a root-only graph is a tree
    if len(E) != N - 1: return False                       # a tree has N-1 edges
    counts = Counter(n for ab in E for n in ab)            # edge counts per node
    if len(counts) != N: return False                      # unlinked nodes
    while True:
        leaves = {n for n, c in counts.items() if c == 1}  # new leaves
        if not leaves: break
        for a, b in E:                                     # subtract leaf counts
            if counts[a] > 1 and b in leaves: counts[a] -= 1
            if counts[b] > 1 and a in leaves: counts[b] -= 1
        for n in leaves: counts[n] = -1                    # flag leaves in counts
    return all(c == -1 for c in counts.values())           # all nodes must become leaves
output:
G = [[1,2],[1,3],[4,5],[4,6]]
print(isTree(6,G)) # False (disconnected sub-tree)
G = [[1,2],[1,3],[1,4],[2,3],[5,6]]
print(isTree(6,G)) # False (doubly linked node 3)
G = [[1,2],[2,6],[3,4],[5,1],[2,3]]
print(isTree(6,G)) # True
G = [[1,2],[2,3]]
print(isTree(3,G)) # True
G = [[1,2],[2,3],[3,4]]
print(isTree(4,G)) # True
G = [[1,2],[1,3],[2,5],[2,4]]
print(isTree(6,G)) # False (missing node)
Space complexity is O(N) because the counts dictionary has one entry per node (vertex) with an integer as value. Time complexity is O(E*L), where E is the number of edges and L is the number of levels in the tree. The worst-case time is O(E^2) for a tree where every parent has only one child node. However, since the initial check forces E to be N-1 (less than N), the worst case is actually O(N^2).
Note that this algorithm makes no assumptions about edge order or numerical relationships between node numbers. The root (the last node to be made a leaf) found by this algorithm is not necessarily the only possible root: unless the nodes have an implicit cardinality relationship (or the edges have an order), there can be ambiguous scenarios:
[1,2],[2,3],[2,4] could be:

1            2            3
|_2    OR    |_1    OR    |_2
  |_3        |_3            |_1
  |_4        |_4          |_4
If a cardinality relationship between node numbers or an order of edges can be relied upon, the algorithm could potentially be made more time efficient (because we could easily determine which node is the root and start from there).
[EDIT2] Alternative method using groups.
When the number of edges is N-1, if the graph is a tree, all nodes should be reachable from any other node. This means that, if we form groups of reachable nodes for each node and merge them together based on the edges, we should end up with a single group after going through all the edges.
Here is the modified function based on that approach:
def isTree(N, E):
    if N == 1 and not E: return True                # a root-only graph is a tree
    if len(E) != N - 1: return False                # a tree has N-1 edges
    groups = {n: [n] for ab in E for n in ab}       # each node starts in its own group
    if len(groups) != N: return False               # unlinked nodes
    for a, b in E:
        groups[a].extend(groups[b])                 # merge the two groups
        for n in groups[b]: groups[n] = groups[a]   # point merged nodes to the new group
    return len(set(map(id, groups.values()))) == 1  # only one group when done
Given that we start out with fewer edges than nodes and that group merging will consume at most 2x a group's size (so also < N), the space complexity remains O(V). The time complexity is also O(V^2) in the worst-case scenarios.
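As a quick check, the spot tests from above behave the same with this variant, e.g.:

print(isTree(6, [[1,2],[1,3],[4,5],[4,6]]))        # False (disconnected sub-tree)
print(isTree(6, [[1,2],[2,6],[3,4],[5,1],[2,3]]))  # True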

You don't even need to know how many edges there are:
def is_graph_tree(n, edges):
    seen = set()
    for a, b in edges:
        b = max(a, b)
        if b in seen:
            return False
        seen.add(b)
    return True

a = [[1, 2], [2, 3], [3, 4]]
print(is_graph_tree(0, a))
b = [[1, 2], [1, 3], [2, 3], [2, 4]]
print(is_graph_tree(0, b))
Now, this WON'T catch the case of disconnected subtrees, but that wasn't in the problem description...
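For instance, here is a quick illustration of that limitation (a disconnected forest that the check accepts):

c = [[1, 2], [3, 4]]
print(is_graph_tree(4, c))  # True, although the 4-node graph is not connected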

Related

Given a list of words, determine whether the words can be chained to form a circle

Given a list of words, determine whether the words can be chained to form a circle. A word X
can be placed in front of another word Y in a circle if the last character of X is the same as
the first character of Y.
For example, the words ['chair', 'height', 'racket', 'touch', 'tunic'] can form the following circle:
chair --> racket --> touch --> height --> tunic --> chair
The output has to be a txt file with one word per line, e.g.:
chair
racket
touch
height
tunic
I searched for a solution, but I only managed to find a partial one, which answers whether or not the words can form a circle.
# Python program to check if a given directed graph is Eulerian or not
CHARS = 26

# A class that represents a directed graph
class Graph(object):
    def __init__(self, V):
        self.V = V                          # No. of vertices
        self.adj = [[] for x in range(V)]   # adjacency lists
        self.inp = [0] * V                  # in-degree of each vertex

    # function to add an edge to the graph
    def addEdge(self, v, w):
        self.adj[v].append(w)
        self.inp[w] += 1

    # Method to check if this graph is strongly connected
    def isSC(self):
        # Mark all the vertices as not visited (for the first DFS)
        visited = [False] * self.V
        # Find the first vertex with non-zero degree
        n = 0
        for n in range(self.V):
            if len(self.adj[n]) > 0:
                break
        # Do a DFS traversal starting from the first non-zero degree vertex.
        self.DFSUtil(n, visited)
        # If the DFS traversal doesn't visit all vertices, return False.
        for i in range(self.V):
            if len(self.adj[i]) > 0 and visited[i] == False:
                return False
        # Create a reversed graph
        gr = self.getTranspose()
        # Mark all the vertices as not visited (for the second DFS)
        for i in range(self.V):
            visited[i] = False
        # Do a DFS on the reversed graph, starting from the same vertex
        # as the first DFS.
        gr.DFSUtil(n, visited)
        # If all vertices are not visited in the second DFS, return False.
        for i in range(self.V):
            if len(self.adj[i]) > 0 and visited[i] == False:
                return False
        return True

    # This function returns True if the directed graph has an Eulerian
    # cycle, otherwise it returns False
    def isEulerianCycle(self):
        # Check if all non-zero degree vertices are connected
        if self.isSC() == False:
            return False
        # Check that the in-degree and out-degree of every vertex are the same
        for i in range(self.V):
            if len(self.adj[i]) != self.inp[i]:
                return False
        return True

    # A recursive function to do a DFS starting from v
    def DFSUtil(self, v, visited):
        # Mark the current node as visited
        visited[v] = True
        # Recur for all the vertices adjacent to this vertex
        for i in range(len(self.adj[v])):
            if not visited[self.adj[v][i]]:
                self.DFSUtil(self.adj[v][i], visited)

    # Function that returns the reverse (or transpose) of this graph;
    # it is needed in isSC()
    def getTranspose(self):
        g = Graph(self.V)
        for v in range(self.V):
            for i in range(len(self.adj[v])):
                g.adj[self.adj[v][i]].append(v)
                g.inp[v] += 1
        return g

# This function takes an array of strings and returns True
# if the given array of strings can be chained to form a cycle
def canBeChained(arr, n):
    # Create a graph with one vertex per letter
    g = Graph(CHARS)
    # Create an edge from the first character to the last character
    # of every string
    for i in range(n):
        s = arr[i]
        g.addEdge(ord(s[0]) - ord('a'), ord(s[len(s) - 1]) - ord('a'))
    # The given array of strings can be chained iff there
    # is an Eulerian cycle in the created graph
    return g.isEulerianCycle()

# Driver program
arr1 = ["for", "geek", "rig", "kaf"]
n1 = len(arr1)
if canBeChained(arr1, n1):
    print("Can be chained")
else:
    print("Can't be chained")

arr2 = ["aab", "abb"]
n2 = len(arr2)
if canBeChained(arr2, n2):
    print("Can be chained")
else:
    print("Can't be chained")
Source: https://www.geeksforgeeks.org/given-array-strings-find-strings-can-chained-form-circle/
This solution only returns a Boolean: if there is a circle, it outputs True. My goal is to extend this solution to output the words in order. I will give another example:
Input:
{"for", "geek", "rig", "kaf"}
Output:
for
rig
geek
kaf
for
The problem you're describing is the Eulerian circuit problem.
There is an algorithm implemented in module networkx:
networkx.algorithms.euler.eulerian_circuit
from networkx import DiGraph, eulerian_circuit

words = ['chair', 'height', 'racket', 'touch', 'tunic']

G = DiGraph()
G.add_weighted_edges_from(((w[0], w[-1], w) for w in words), weight='word')

result = [G[a][b]['word'] for a, b in eulerian_circuit(G)]
print(result)
# ['chair', 'racket', 'touch', 'height', 'tunic']
This seems like a lot of effort to solve this problem. Consider a simple solution like:
from collections import defaultdict

words = ['chair', 'height', 'racket', 'touch', 'tunic']

def findChains(words):
    dictionary = defaultdict(list)
    for word in words:
        dictionary[word[0]].append(word)
    chains = [[words[0]]]  # start with an arbitrary word
    while True:
        new_chains = []
        for chain in chains:
            for follower in dictionary[chain[-1][-1]]:
                if follower in chain:
                    continue
                new_chains.append([*chain, follower])
        if new_chains:
            chains = new_chains
        else:
            break
    return [chain for chain in chains
            if len(chain) == len(words) and chain[-1][-1] == chain[0][0]]

print(findChains(words))
OUTPUT
% python3 test.py
[['chair', 'racket', 'touch', 'height', 'tunic']]
%
Is the issue that a simple algorithm like the above becomes unworkable as the list of words gets longer? You also seem to assume a single solution, but with enough start- and end-letter redundancy there could be multiple solutions. You need to code for multiple solutions even if, in the end, you just pick one.
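For instance, with a made-up list of two-letter words that has plenty of start- and end-letter redundancy, findChains returns several complete circles:

words2 = ['ab', 'bc', 'ca', 'ac', 'cb', 'ba']
for chain in findChains(words2):
    print(chain)
# prints multiple valid orderings, e.g. ['ab', 'bc', 'ca', 'ac', 'cb', 'ba']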

Faster way to add dummy nodes in networkx to limit degree

I am wondering if I can speed up my operation of limiting node degree using an inbuilt function.
A submodule of my task requires me to limit the in-degree to 2, so the solution I proposed was to introduce sequential dummy nodes that absorb the extra edges. Finally, the last dummy gets connected to the children of the original node. To be specific: if an original node 2 is split into 3 nodes (the original node 2 and two dummy nodes), ALL the properties of the graph should be maintained if we analyse the graph by packaging 2 and its dummies into one hypothetical node 2'. The function I wrote is shown below:
import networkx as nx

def split_merging(G, dummy_counter):
    """
    Args:
        G: as the name suggests
        dummy_counter: as the name suggests
    Returns:
        G with each merging node with > 2 incoming edges split into several
        consecutive nodes, and the updated dummy_counter
    """
    # we need two copies: one to ensure the sanctity of the input G,
    # and a second to ensure that while we change the graph in the loop,
    # the loop doesn't go crazy due to changing bounds
    G_copy = nx.DiGraph(G)
    G_copy_2 = nx.DiGraph(G)
    for node in G_copy.nodes:
        in_deg = G_copy.in_degree[node]
        if in_deg > 2:  # node must be split for incoming
            new_nodes = ["dummy" + str(i) for i in range(dummy_counter, dummy_counter + in_deg - 2)]
            dummy_counter = dummy_counter + in_deg - 2
            upstreams = [i for i in G_copy_2.predecessors(node)]
            downstreams = [i for i in G_copy_2.successors(node)]
            for up in upstreams:
                G_copy_2.remove_edge(up, node)
            for down in downstreams:
                G_copy_2.remove_edge(node, down)
            prev_node = node
            G_copy_2.add_edge(upstreams[0], prev_node)
            G_copy_2.add_edge(upstreams[1], prev_node)
            for i in range(2, len(upstreams)):
                G_copy_2.add_edge(prev_node, new_nodes[i - 2])
                G_copy_2.add_edge(upstreams[i], new_nodes[i - 2])
                prev_node = new_nodes[i - 2]
            for down in downstreams:
                G_copy_2.add_edge(prev_node, down)
    return G_copy_2, dummy_counter
For clarification: the input and output graphs were shown as figures in the original post.
It works as expected. But the problem is that this is very slow for larger graphs. Is there a way to speed this up using some inbuilt function from networkx or any other library?
Sure; the idea is similar to balancing a B-tree. If a node has too many in-neighbors, create two new children, and split up all your in-neighbors among those children. The children have out-degree 1 and point to your original node; you may need to recursively split them as well.
This is as balanced as possible: node n becomes a complete binary tree rooted at node n, with external in-neighbors at the leaves only, and external out-neighbors at the root.
def recursive_split_node(G: 'nx.DiGraph', node, max_in_degree: int = 2):
    """Given a possibly overfull node, create a minimal complete
    binary tree rooted at that node with no overfull nodes.
    Return the new graph."""
    global dummy_counter
    current_in_degree = G.in_degree[node]
    if current_in_degree <= max_in_degree:
        return G
    # Complete binary tree, so left gets 1 more descendant if tied
    left_child_in_degree = (current_in_degree + 1) // 2
    left_child = "dummy" + str(dummy_counter)
    right_child = "dummy" + str(dummy_counter + 1)
    dummy_counter += 2
    G.add_node(left_child)
    G.add_node(right_child)
    old_predecessors = list(G.predecessors(node))
    # Give all predecessors to the left and right children
    G.add_edges_from([(y, left_child)
                      for y in old_predecessors[:left_child_in_degree]])
    G.add_edges_from([(y, right_child)
                      for y in old_predecessors[left_child_in_degree:]])
    # Remove all incoming edges
    G.remove_edges_from([(y, node) for y in old_predecessors])
    # Connect the children to me
    G.add_edge(left_child, node)
    G.add_edge(right_child, node)
    # Split the children if they are still overfull
    G = recursive_split_node(G, left_child, max_in_degree)
    G = recursive_split_node(G, right_child, max_in_degree)
    return G

def clean_graph(G: 'nx.DiGraph', max_in_degree: int = 2) -> 'nx.DiGraph':
    """Return a copy of our original graph, with nodes added to ensure
    the max in-degree does not exceed our limit."""
    G_copy = nx.DiGraph(G)
    for node in G.nodes:
        if G_copy.in_degree[node] > max_in_degree:
            G_copy = recursive_split_node(G_copy, node, max_in_degree)
    return G_copy
This code for recursively splitting nodes is quite handy and easily generalized, and intentionally left unoptimized.
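A quick usage sketch (the graph and node labels here are made up for illustration; dummy_counter is the module-level counter the function expects):

import networkx as nx

dummy_counter = 0
G = nx.DiGraph([(1, 5), (2, 5), (3, 5), (4, 5), (5, 6)])  # node 5 has in-degree 4
H = clean_graph(G)
print(H.in_degree[5])  # 2: the extra in-edges now pass through dummy nodes
print(sorted(n for n in H.nodes if str(n).startswith("dummy")))  # ['dummy0', 'dummy1']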
To solve your exact use case, you could go with an iterative solution: build a full, complete binary tree (with the same structure as a heap) implicitly as an array. This is, I believe, the theoretically optimal solution to the problem, in terms of minimizing the number of graph operations (new nodes, new edges, deleting edges) to achieve the constraint, and gives the same graph as the recursive solution.
def clean_graph(G):
    """Return a copy of our original graph, with nodes added to ensure
    the max in-degree does not exceed 2."""
    global dummy_counter
    G_copy = nx.DiGraph(G)
    for node in G.nodes:
        if G_copy.in_degree[node] > 2:
            predecessors_list = list(G_copy.predecessors(node))
            G_copy.remove_edges_from((y, node) for y in predecessors_list)
            N = len(predecessors_list)
            leaf_count = (N + 1) // 2
            # A full binary tree with leaf_count leaves has leaf_count - 1
            # internal nodes; the original node serves as the root.
            internal_count = leaf_count - 1
            total_nodes = leaf_count + internal_count
            node_names = [node]
            node_names.extend("dummy" + str(dummy_counter + i) for i in range(total_nodes - 1))
            dummy_counter += total_nodes - 1
            # Wire up the tree in implicit heap order: node i's children
            # sit at indices 2*i + 1 and 2*i + 2.
            for i in range(internal_count):
                G_copy.add_edges_from(((node_names[2 * i + 1], node_names[i]),
                                       (node_names[2 * i + 2], node_names[i])))
            # Attach up to two of the original predecessors to each leaf.
            for leaf in range(internal_count, internal_count + leaf_count):
                G_copy.add_edge(predecessors_list.pop(), node_names[leaf])
                if not predecessors_list:
                    break
                G_copy.add_edge(predecessors_list.pop(), node_names[leaf])
                if not predecessors_list:
                    break
    return G_copy
From my testing, comparing performance on very dense graphs generated with nx.fast_gnp_random_graph(500, 0.3, directed=True), this is 2.75x faster than the recursive solution, and 1.75x faster than the original posted solution. The bottleneck for further optimizations is networkx and Python, or changing the input graphs to be less dense.
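For reference, a rough sketch of the kind of timing harness described (graph size and repeat count are arbitrary):

import timeit
import networkx as nx

dummy_counter = 0
G = nx.fast_gnp_random_graph(500, 0.3, directed=True)
print(timeit.timeit(lambda: clean_graph(G), number=3))  # seconds for 3 runs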

Find connected branches from list of line segments

Problem
I have a list of line segments:
exampleLineSegments = [(1,2),(2,3),(3,4),(4,5),(5,6),(4,7),(8,7)]
These segments include the indices of the corresponding point in a separate array.
From this sublist, one can see that there is a branching point (4). So three different branches are emerging from this branching point.
(In other, more specific problems, there might be / are multiple branching points for n branches.)
Target
My target is to get a dictionary including information about the existing branches, so e.g.:
result = {branch_1: [1, 2, 3, 4],
          branch_2: [4, 5, 6],
          branch_3: [4, 7, 8]}
Current state of work/problems
Currently, I am identifying the branch points first by setting up a dictionary for each point and checking each entry for more than 2 neighbor points. If there are more, the point is a branching point.
Afterwards I am crawling through all points emerging from these branch points, checking for successors etc.
These functions involve some for loops and generally a lot of intensive "crawling". This is not the cleanest solution, and if the number of points increases, the performance is not good either.
Question
What is the best / fastest / most performant way to achieve the target in this case?
I think you can achieve it with the following steps:
use a neighbors dict to store the graph
find all branch points, i.e. the points whose neighbor count is > 2
start from every branch point and use DFS to find all the paths
from collections import defaultdict

def find_branch_paths(exampleLineSegments):
    # use a dict to store the graph
    neighbors = defaultdict(list)
    for p1, p2 in exampleLineSegments:
        neighbors[p1].append(p2)
        neighbors[p2].append(p1)
    # find all branch points
    branch_points = [k for k, v in neighbors.items() if len(v) > 2]
    res = []

    def dfs(cur, prev, path):
        # reached a leaf
        if len(neighbors[cur]) == 1:
            res.append(path)
            return
        for neighbor in neighbors[cur]:
            if neighbor != prev:
                dfs(neighbor, cur, path + [neighbor])

    # start from all the branch points
    for branch_point in branch_points:
        dfs(branch_point, None, [branch_point])
    return res
Update: here is an iterative version for big data, since the recursive version may hit the recursion depth limit:
def find_branch_paths(exampleLineSegments):
    # use a dict to store the graph
    neighbors = defaultdict(list)
    for p1, p2 in exampleLineSegments:
        neighbors[p1].append(p2)
        neighbors[p2].append(p1)
    # find all branch points
    branch_points = [k for k, v in neighbors.items() if len(v) > 2]
    res = []
    # iterative DFS
    stack = [(bp, None, [bp]) for bp in branch_points]
    while stack:
        cur, prev, path = stack.pop()
        if len(neighbors[cur]) == 1 or (prev and cur in branch_points):
            res.append(path)
            continue
        for neighbor in neighbors[cur]:
            if neighbor != prev:
                stack.append((neighbor, cur, path + [neighbor]))
    return res
test and output:
print(find_branch_paths([(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (4, 7), (8, 7)]))
# output:
# [[4, 3, 2, 1], [4, 5, 6], [4, 7, 8]]
Hope that helps you, and comment if you have further questions. : )
UPDATE: if there are many branch points, the number of paths can grow exponentially. So if you only want distinct segments, you can end a path when it encounters another branch point.
change this line
if len(neighbors[cur]) == 1:
to
if len(neighbors[cur]) == 1 or (prev and cur in branch_points):

Calculating the number of graphs created and the number of vertices in each graph from a list of edges

Given a list of edges such as edges = [[1,2],[2,3],[3,1],[4,5]],
I need to find how many graphs are created; by this I mean how many groups of components are created by these edges. Then get the number of vertices in each group of components.
However, I am required to handle 10^5 edges, and I am currently having trouble completing the task for a large number of edges.
My algorithm currently takes the list of edges = [[1,2],[2,3],[3,1],[4,5]] and merges the lists as sets if they have an intersection. This outputs a new list that contains the groups of components, such as graphs = [[1,2,3],[4,5]].
There are two connected components: [1,2,3] are connected, and [4,5] are connected as well.
I would like to know if there is a much better way of doing this task.
def mergeList(edges):
    sets = [set(x) for x in edges if x]
    m = 1
    while m:
        m = 0
        res = []
        while sets:
            common, r = sets[0], sets[1:]
            sets = []
            for x in r:
                if x.isdisjoint(common):
                    sets.append(x)
                else:
                    m = 1
                    common |= x
            res.append(common)
        sets = res
    return sets
I would like to try doing this with a dictionary or something more efficient, because this is too slow.
A basic iterative graph traversal in Python isn't too bad.
import collections

def connected_components(edges):
    # build the graph
    neighbors = collections.defaultdict(set)
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    # traverse the graph
    sizes = []
    visited = set()
    for u in neighbors.keys():
        if u in visited:
            continue
        # visit the component that includes u
        size = 0
        agenda = {u}
        while agenda:
            v = agenda.pop()
            visited.add(v)
            size += 1
            agenda.update(neighbors[v] - visited)
        sizes.append(size)
    return sizes
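For example, with the edge list from the question:

print(connected_components([[1, 2], [2, 3], [3, 1], [4, 5]]))
# [3, 2] -> two components, with 3 and 2 vertices respectively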
Do you need to write your own algorithm? networkx already has algorithms for this.
To get the size of each component, try:

import networkx as nx

G = nx.Graph()
G.add_edges_from([[1, 2], [2, 3], [3, 1], [4, 5]])
components = []
for graph in nx.connected_components(G):
    components.append([graph, len(graph)])

components
# [[{1, 2, 3}, 3], [{4, 5}, 2]]
You could use a Disjoint-set data structure:

edges = [[1, 2], [2, 3], [3, 1], [4, 5]]

parents = {}
size = {}

def get_ancestor(parents, item):
    # Returns the ancestor for a given item and compresses the path
    # Recursion would be easier but might blow the stack
    stack = []
    while True:
        parent = parents.setdefault(item, item)
        if parent == item:
            break
        stack.append(item)
        item = parent
    for item in stack:
        parents[item] = parent
    return parent

for x, y in edges:
    x = get_ancestor(parents, x)
    y = get_ancestor(parents, y)
    size_x = size.setdefault(x, 1)
    size_y = size.setdefault(y, 1)
    if size_x < size_y:
        parents[x] = y
        size[y] += size_x
    else:
        parents[y] = x
        size[x] += size_y

print(sum(1 for k, v in parents.items() if k == v))  # 2
In the above, parents is a dict where vertices are keys and ancestors are values. If a given vertex doesn't have a parent, the value is the vertex itself. For every edge in the list, the ancestor of both vertices is set to the same value. Note that when the current ancestor is queried, the path is compressed, so subsequent queries run in near-constant time. This allows the whole algorithm to run in roughly O(n) time.
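For instance, a small made-up chain shows the compression:

parents_demo = {1: 2, 2: 3, 3: 3}
print(get_ancestor(parents_demo, 1))  # 3
print(parents_demo)  # {1: 3, 2: 3, 3: 3} -- the path is now flat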
Update
In case the components themselves are required instead of just their number, the resulting dict can be iterated to produce them:
from collections import defaultdict

components = defaultdict(list)
for k, v in parents.items():
    components[v].append(k)

print(components)

Output:
defaultdict(<class 'list'>, {1: [1, 2, 3], 4: [4, 5]})

Python - speed up pathfinding

This is my pathfinding function:
def get_distance(x1, y1, x2, y2):
    neighbors = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    old_nodes = [(square_pos[x1, y1], 0)]
    new_nodes = []
    for i in range(50):
        for node in old_nodes:
            if node[0].x == x2 and node[0].y == y2:
                return node[1]
            for neighbor in neighbors:
                try:
                    square = square_pos[node[0].x + neighbor[0], node[0].y + neighbor[1]]
                    if square.lightcycle == None:
                        new_nodes.append((square, node[1]))
                except KeyError:
                    pass
        old_nodes = []
        old_nodes = list(new_nodes)
        new_nodes = []
    nodes = []
    return 50
The problem is that the AI takes too long to respond (the response time needs to be <= 100 ms).
This is just a Python implementation of https://en.wikipedia.org/wiki/Pathfinding#Sample_algorithm
You should replace your algorithm with A*-search with the Manhattan distance as a heuristic.
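A minimal sketch of that idea, assuming a walkable(x, y) predicate in place of the square_pos / lightcycle lookup used in the question:

import heapq

def astar_distance(x1, y1, x2, y2, walkable, limit=50):
    # Manhattan distance: an admissible heuristic on a 4-connected grid
    def h(x, y):
        return abs(x - x2) + abs(y - y2)
    # heap entries are (f, g, (x, y)), so the heap orders by f = g + h
    open_heap = [(h(x1, y1), 0, (x1, y1))]
    best_g = {(x1, y1): 0}
    while open_heap:
        f, g, (x, y) = heapq.heappop(open_heap)
        if (x, y) == (x2, y2):
            return g
        if g > best_g.get((x, y), float('inf')) or g >= limit:
            continue  # stale entry, or path already at the step limit
        for dx, dy in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nxt = (x + dx, y + dy)
            ng = g + 1
            if walkable(*nxt) and ng < best_g.get(nxt, float('inf')):
                best_g[nxt] = ng
                heapq.heappush(open_heap, (ng + h(*nxt), ng, nxt))
    return limit  # no path found within the step limit

Here walkable could be something like lambda x, y: (x, y) in square_pos and square_pos[x, y].lightcycle is None, adapted to your board representation.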
One reasonably fast solution is to implement the Dijkstra algorithm (which I have already implemented in that question):
Build the original map. It's a masked array where the walker cannot walk on masked elements:

%pylab inline
map_size = (20, 20)
MAP = np.ma.masked_array(np.zeros(map_size), np.random.choice([0, 1], size=map_size))
matshow(MAP)
Below is the Dijkstra algorithm:

def dijkstra(V):
    mask = V.mask
    visit_mask = mask.copy()  # mask visited cells
    m = numpy.ones_like(V) * numpy.inf
    connectivity = [(i, j) for i in [-1, 0, 1] for j in [-1, 0, 1] if (not (i == j == 0))]
    cc = unravel_index(V.argmin(), m.shape)  # current cell
    m[cc] = 0
    P = {}  # dictionary of predecessors
    # while (~visit_mask).sum() > 0:
    for _ in range(V.size):
        # print(cc)
        neighbors = [tuple(e) for e in asarray(cc) - connectivity
                     if e[0] > 0 and e[1] > 0 and e[0] < V.shape[0] and e[1] < V.shape[1]]
        neighbors = [e for e in neighbors if not visit_mask[e]]
        tentative_distance = [(V[e] - V[cc])**2 for e in neighbors]
        for i, e in enumerate(neighbors):
            d = tentative_distance[i] + m[cc]
            if d < m[e]:
                m[e] = d
                P[e] = cc
        visit_mask[cc] = True
        m_mask = ma.masked_array(m, visit_mask)
        cc = unravel_index(m_mask.argmin(), m.shape)
    return m, P
def shortestPath(start, end, P):
    Path = []
    step = end
    while 1:
        Path.append(step)
        if step == start:
            break
        if step in P:
            step = P[step]
        else:
            break
    Path.reverse()
    return asarray(Path)
And the result:

start = (2, 8)
stop = (17, 19)
D, P = dijkstra(MAP)
path = shortestPath(start, stop, P)
imshow(MAP, interpolation='nearest')
plot(path[:, 1], path[:, 0], 'ro-', linewidth=2.5)

Below are some timing statistics:

%timeit dijkstra(MAP)
# 10 loops, best of 3: 32.6 ms per loop
The biggest issue with your code is that you don't do anything to avoid the same coordinates being visited multiple times. This means that the number of nodes you visit is guaranteed to grow exponentially, since it can keep going back and forth over the first few nodes many times.
The best way to avoid duplication is to maintain a set of the coordinates we've added to the queue (though if your node values are hashable, you might be able to add them directly to the set instead of coordinate tuples). Since we're doing a breadth-first search, we'll always reach a given coordinate by (one of) the shortest path(s), so we never need to worry about finding a better route later on.
Try something like this:
def get_distance(x1, y1, x2, y2):
    neighbors = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    nodes = [(square_pos[x1, y1], 0)]
    seen = set([(x1, y1)])
    for node, path_length in nodes:
        if path_length == 50:
            break
        if node.x == x2 and node.y == y2:
            return path_length
        for nx, ny in neighbors:
            try:
                square = square_pos[node.x + nx, node.y + ny]
                if square.lightcycle == None and (square.x, square.y) not in seen:
                    nodes.append((square, path_length + 1))
                    seen.add((square.x, square.y))
            except KeyError:
                pass
    return 50
I've also simplified the loop a bit. Rather than switching out the list after each depth, you can just use one loop and add to its end as you're iterating over the earlier values. I still abort if a path hasn't been found within 50 steps (using the distance stored in the 2-tuple, rather than the number of passes of the outer loop). A further improvement might be to use a collections.deque for the queue, since you could efficiently pop from one end while appending to the other end. It probably won't make a huge difference, but it might save a little bit of memory.
I also avoided most of the indexing by one and zero in favor of unpacking into separate variable names in the for loops. I think this is much easier to read, and it avoids confusion, since the two different kinds of 2-tuples had different meanings (one is a (node, distance) tuple, the other is (x, y)).
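For what it's worth, the deque-based pattern mentioned above might look like this in isolation (neighbors_of is a hypothetical callback standing in for the square_pos neighbor lookup):

from collections import deque

def bfs_distance(start, goal, neighbors_of, limit=50):
    # Popping from the left keeps only the current frontier in memory
    # instead of the entire visit history.
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        if dist >= limit:
            continue
        for nxt in neighbors_of(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return limit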
