Graph recursive DFS when cycles exist? - python

I'm interested in handling DFS in undirected (or directed) graphs where cycles exist, such that the risk of entering an infinite-loop is non-trivial.
Note: This question is not about the cycle-detection problem(s) on LeetCode. Below is an iterative approach:
g = {'a': ['b', 'c'],
     'b': ['a', 'f'],
     'c': ['a', 'f', 'd'],
     'd': ['c', 'e'],
     'e': ['d'],
     'f': ['c', 'b'],
     'g': ['h'],
     'h': ['g']
}
def dfs(graph, node, destination):
    stack = [node]
    visited = []
    while stack:
        current = stack.pop()
        if current == destination:
            return True
        visited.append(current)
        next_nodes = list(filter(lambda x: x not in visited + stack, graph[current]))
        stack.extend(next_nodes)
    return False
dfs(g, 'h', 'g')
>>> True
dfs(g, 'a', 'g')
>>> False
My question is, does such a recursive approach exist? And if so, how can it be defined in python?

If you're not interested in detecting if there are any loops or not and just interested in avoiding infinite loops (if any), then something like the following recursive implementation would work for you:
def dfs(graph, node, destination, visited=None):
    if visited is None:
        visited = set()
    if node == destination:
        return True
    visited.add(node)
    return any(
        dfs(graph, neighbor, destination, visited=visited)
        for neighbor in graph[node]
        if neighbor not in visited
    )
Note that a generator expression is used inside any, so it's evaluated in a lazy manner (one by one), and the whole any(...) expression returns True early as soon as a solution (i.e. a path to the destination) is found without checking the other neighbors and paths, so no extra recursive calls are made.
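If the path itself is wanted rather than just True/False, the same visited-set idea can be adapted. The following is only a sketch: the name dfs_path is hypothetical, and it returns the first path that DFS happens to find, which is not necessarily the shortest one.

```python
def dfs_path(graph, node, destination, visited=None):
    """Return a list of nodes from node to destination, or None if unreachable."""
    if visited is None:
        visited = set()
    if node == destination:
        return [node]
    visited.add(node)
    for neighbor in graph[node]:
        if neighbor not in visited:
            rest = dfs_path(graph, neighbor, destination, visited)
            if rest is not None:
                # Stop at the first successful branch, like any() does above.
                return [node] + rest
    return None


# Same graph as in the question.
g = {
    'a': ['b', 'c'],
    'b': ['a', 'f'],
    'c': ['a', 'f', 'd'],
    'd': ['c', 'e'],
    'e': ['d'],
    'f': ['c', 'b'],
    'g': ['h'],
    'h': ['g'],
}

print(dfs_path(g, 'a', 'e'))  # ['a', 'b', 'f', 'c', 'd', 'e']
print(dfs_path(g, 'a', 'g'))  # None
```

Because visited is shared across all recursive calls, each node is expanded at most once, so the cycles a-b-f-c-a and g-h-g cannot cause infinite recursion.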

Related

Leetcode 261 Graph Valid Tree: second test fails

I am trying to write a solution for Leet Code problem 261. Graph Valid Tree:
Given n nodes labeled from 0 to n-1 and a list of undirected edges (each edge is a pair of nodes), write a function to check whether these edges make up a valid tree.
Example 1:
Input: n = 5, and edges = [[0,1], [0,2], [0,3], [1,4]]
Output: true
Example 2:
Input: n = 5, and edges = [[0,1], [1,2], [2,3], [1,3], [1,4]]
Output: false
Here is my solution thus far. I believe that the goal is to detect cycles in the tree. I use dfs to do this.
class Node:
    def __init__(self, val):
        self.val = val
        self.outgoing = []
class Solution:
    def validTree(self, n: int, edges: List[List[int]]) -> bool:
        visited = {}
        for pre, end in edges:
            if pre not in visited:
                "we add a new node to the visited set"
                visited[pre] = Node(pre)
            if end not in visited:
                visited[end] = Node(end)
            "We append it to the list"
            visited[pre].outgoing.append(visited[end])
        def dfs(current, dvisit = set()):
            if current.val in dvisit:
                print("is the condition happening here")
                return True
            dvisit.add(current.val)
            for nodes in current.outgoing:
                dfs(nodes, dvisit)
            return False
        mdict = set()
        for key in visited.keys():
            mdict.clear()
            if dfs(visited[key], mdict) == True:
                return False
        return True
It fails this test n = 5, edges = [[0,1],[1,2],[2,3],[1,3],[1,4]]
It is supposed to return false but it returns true.
I placed some print statements in my dfs helper function and it does seem to be hitting the case where dfs is supposed to return true. However for some reason, the case in my for loop does not hit in the end, which causes the entire problem to return true for some reason. Can I receive some guidance on how I can modify this?
A few issues:
The given graph is undirected, so edges should be added in both directions when the tree data structure is built. Without doing this, you might miss cycles.
Once edges are made undirected, the algorithm should not travel back along the edge it just came from. For this purpose keep track of the parent node that the traversal just came from.
In dfs the returned value from the recursive call is ignored. It should not: when the returned value indicates there is a cycle, the loop should be exited and the same indication should be returned to the caller.
The main loop should not clear mdict. In fact, if after the first call to dfs, that loop finds another node that has not been visited, then this means the graph is not a tree: in a tree every pair of nodes is connected. No second call of dfs needs to be made from the main code, which means the main code does not need a loop. It can just call dfs on any node and then check that all nodes were visited by that call.
The function could do a preliminary "sanity" check, since a tree always has one less edge than it has vertices. If that is not true, then there is no need to continue: it is not a tree.
One boundary check could be made: when n is 1, and thus there are no edges, then there is nothing to call dfs on. In that case we can just return True, as this is a valid boundary case.
So a correction could look like this:
class Solution:
    def validTree(self, n: int, edges: List[List[int]]) -> bool:
        if n != len(edges) + 1:  # Quick sanity check
            return False
        if n == 1:  # Boundary case
            return True
        visited = {}
        for pre, end in edges:
            if pre not in visited:
                visited[pre] = Node(pre)
            if end not in visited:
                visited[end] = Node(end)
            visited[pre].outgoing.append(visited[end])
            visited[end].outgoing.append(visited[pre])  # It's undirected
        def dfs(parent, current, dvisit):
            if current.val in dvisit:
                return True  # Cycle detected!
            dvisit.add(current.val)
            for node in current.outgoing:
                # Avoid going back along the same edge, and
                # check the returned boolean!
                if node is not parent and dfs(current, node, dvisit):
                    return True  # Quit as soon as cycle is found
            return False
        mdict = set()
        # Start in any node:
        if dfs(None, visited[pre], mdict):
            return False  # Cycle detected!
        # After one DFS traversal there should not be any node that has not been visited
        return len(mdict) == n
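A tree is a special undirected graph. It satisfies two properties:
It is connected.
It has no cycle.
For a connected graph, having no cycle is equivalent to NumberOfNodes == NumberOfEdges + 1. (The edge count alone is not enough: a graph with a cycle plus an isolated node can have the same count, which is why connectivity is checked as well.)
Based on this, given the edges:
1- Create the graph
2- Traverse the graph and store the visited nodes in a set
3- Finally, check whether the two conditions above are met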
class Solution:
    def validTree(self, n: int, edges: List[List[int]]) -> bool:
        from collections import defaultdict
        graph = defaultdict(list)
        for src, dest in edges:
            graph[src].append(dest)
            graph[dest].append(src)
        visited = set()
        def dfs(root):
            visited.add(root)
            for node in graph[root]:
                if node in visited:
                    # already visited, so dfs has already run from this node; do not run it again
                    continue
                dfs(node)
        dfs(0)
        # together these show the graph is connected and has no cycle
        return len(visited) == n and len(edges)+1 == n
This question is locked in leetcode but you can test it here for now:
https://www.lintcode.com/problem/178/description
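To try the two examples from the question outside an online judge, the same idea can be written as a standalone function. This is a sketch rather than either answer's exact code: the name valid_tree is hypothetical, and it swaps the recursive traversal for an explicit stack so that a long chain of nodes does not hit Python's recursion limit.

```python
from collections import defaultdict


def valid_tree(n, edges):
    """Return True if the undirected graph on nodes 0..n-1 is a tree."""
    # A tree on n nodes has exactly n - 1 edges.
    if len(edges) != n - 1:
        return False
    graph = defaultdict(list)
    for src, dest in edges:
        graph[src].append(dest)
        graph[dest].append(src)
    # Iterative DFS from node 0 to collect everything reachable.
    visited = {0}
    stack = [0]
    while stack:
        node = stack.pop()
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                stack.append(neighbor)
    # With n - 1 edges, being connected also implies there is no cycle.
    return len(visited) == n


print(valid_tree(5, [[0, 1], [0, 2], [0, 3], [1, 4]]))          # True
print(valid_tree(5, [[0, 1], [1, 2], [2, 3], [1, 3], [1, 4]]))  # False
```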

Determine if subtree t is inside tree s

I'm trying leetcode problem 572.
Given two non-empty binary trees s and t, check whether tree t has exactly the same structure and node values with a subtree of s. A subtree of s is a tree consists of a node in s and all of this node's descendants. The tree s could also be considered as a subtree of itself.
Since trees are great for recursion, I thought about splitting up the cases.
a) If the current tree s is not the subtree t, then recurse on the left and right parts of s if possible
b) If tree s is subtree t, then return True
c) If s is empty, then we've exhausted all the subtrees in s and should return False
def isSubtree(self, s: TreeNode, t: TreeNode) -> bool:
    if not s:
        return False
    if s == t:
        return True
    else:
        if s.left and s.right:
            return any([self.isSubtree(s.left, t), self.isSubtree(s.right, t)])
        elif s.left:
            return self.isSubtree(s.left, t)
        elif s.right:
            return self.isSubtree(s.right, t)
        else:
            return False
However, this for some reason returns False even for the cases where they are obviously True
Ex:
My code here returns False, but it should be True. Any pointers on what to do?
This'll simply get through:
class Solution:
    def isSubtree(self, a, b):
        def sub(node):
            return f'A{node.val}#{sub(node.left)}{sub(node.right)}' if node else 'Z'
        return sub(b) in sub(a)
References
For additional details, you can see the Discussion Board. There are plenty of accepted solutions with a variety of languages and explanations, efficient algorithms, as well as asymptotic time/space complexity analysis in there.
You need to change the second if statement. Comparing s == t only tests whether the two variables refer to the same node object (unless TreeNode defines __eq__), not whether the two trees have the same structure and values.
In order to check if tree t is a subtree of tree s, each time a node in s matches the root of t, call a check method which determines whether the two subtrees are identical.
if s.val == t.val and self.check(s, t):
    return True
A check method looks like this:
def check(self, s, t):
    if s is None and t is None:
        return True
    if s is None or t is None or s.val != t.val:
        return False
    return self.check(s.left, t.left) and self.check(s.right, t.right)
While the rest of your code works, the else branch can be much simpler, as follows. You don't need to check whether the left and right nodes are None, because the first if statement already handles that.
else:
    return self.isSubtree(s.left, t) or self.isSubtree(s.right, t)
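Putting those pieces together, a complete version might look like the sketch below. The TreeNode class is a minimal stand-in for the one LeetCode provides, and the free functions same_tree and is_subtree are hypothetical names used here instead of methods on Solution.

```python
class TreeNode:
    """Minimal stand-in for LeetCode's TreeNode."""
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def same_tree(s, t):
    """True if the trees rooted at s and t have identical shape and values."""
    if s is None and t is None:
        return True
    if s is None or t is None or s.val != t.val:
        return False
    return same_tree(s.left, t.left) and same_tree(s.right, t.right)


def is_subtree(s, t):
    """True if some node of s roots a tree identical to t."""
    if s is None:
        return False
    if same_tree(s, t):
        return True
    return is_subtree(s.left, t) or is_subtree(s.right, t)


t = TreeNode(4, TreeNode(1), TreeNode(2))
s = TreeNode(3, TreeNode(4, TreeNode(1), TreeNode(2)), TreeNode(5))
print(is_subtree(s, t))   # True

# Same s, but node 2 gets an extra child, so no subtree matches t exactly.
s2 = TreeNode(3, TreeNode(4, TreeNode(1), TreeNode(2, TreeNode(0))), TreeNode(5))
print(is_subtree(s2, t))  # False
```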

How to determine last stack space in recursion

I'm implementing the well-known depth-first search recursively. I wonder whether there is a way to know that the code is running in the last stack frame. The reason is that I don't want to print the -> characters at the end of the output; if possible, I'd print just '\n' in the last step.
def DFS(self, vertex=None, visited=None):
    if vertex is None:
        vertex = self.root
    if visited is None:
        visited = []
    print(f"{vertex} -> ", end='')
    visited.append(vertex)
    for neighbor in self.getNeighbors(vertex):
        if neighbor not in visited:
            visited.append(neighbor)
            print(f"{neighbor} -> ", end='')
            self.DFS(neighbor, visited)
For example, it yields 1 -> 2 -> 4 -> 5 ->
Is there any way to do this within the same method? Alternatively, I could write a helper function that removes the trailing -> characters.
Edit: What I've done, following @Carcigenicate's comment:
    return visited  # last line in DFS method
-- in main --
dfs = graph.DFS()
path = " -> ".join(str(vertex) for vertex in dfs)
print(path)
Rather than trying to special-case the last vertex, special-case the first. That is, don't try to figure out when not to append the "->", just don't do it for the first vertex:
def DFS(self, vertex=None, visited=None):
    if vertex is None:
        vertex = self.root
    else:
        # Not the first vertex, so need to add the separator.
        print(" -> ", end='')
    if visited is None:
        visited = []
    print(f"{vertex}", end='')
    visited.append(vertex)
    for neighbor in self.getNeighbors(vertex):
        if neighbor not in visited:
            # no need to append here, because it will be done in the recursive call.
            # and the vertex will be printed in the recursive call, too.
            # visited.append(neighbor)
            # print(f"{neighbor} -> ", end='')
            self.DFS(neighbor, visited)
This assumes that the initial call always leaves vertex as None (for example graph.DFS()), which I think is a reasonable assumption.
On second thought, perhaps using the visited parameter as the condition is a better idea:
    if vertex is None:
        vertex = self.root
    if visited is None:
        visited = []
    else:
        # Not the first vertex, so need to add the separator.
        print(" -> ", end='')
    print(f"{vertex}", end='')
The whole point is that it's easier to special-case the first item than the last.
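For comparison, the join-based approach from the question's edit avoids special-casing either end. The sketch below assumes a hypothetical Graph class backed by an adjacency dict; only the method names DFS and getNeighbors and the attribute root come from the question.

```python
class Graph:
    """Hypothetical minimal graph: adjacency dict plus a root vertex."""
    def __init__(self, adjacency, root):
        self.adjacency = adjacency
        self.root = root

    def getNeighbors(self, vertex):
        return self.adjacency.get(vertex, [])

    def DFS(self, vertex=None, visited=None):
        """Return the vertices in depth-first visiting order (no printing)."""
        if vertex is None:
            vertex = self.root
        if visited is None:
            visited = []
        visited.append(vertex)
        for neighbor in self.getNeighbors(vertex):
            if neighbor not in visited:
                self.DFS(neighbor, visited)
        return visited


graph = Graph({1: [2, 5], 2: [1, 4], 4: [2], 5: [1]}, root=1)
order = graph.DFS()
# str.join only puts the separator between items, never after the last one.
print(" -> ".join(str(vertex) for vertex in order))  # 1 -> 2 -> 4 -> 5
```

Separating the traversal from the formatting also makes the visiting order reusable for purposes other than printing.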

Size of subtree in Python

Almost every online tutorial I see on the subject when it comes to finding the size of a subtree involves calling a recursive function on each child's subtree.
The problem with this in Python is that it overflows if you recurse past a few hundred levels, so if I theoretically had a long, linear tree, it would fail.
Is there a better way to handle this? Do I need to use a stack instead?
Do I need to use a stack instead?
Sure, that's one way of doing it.
def iter_tree(root):
    to_explore = [root]
    while to_explore:
        node = to_explore.pop()
        yield node
        for child in node.children:
            to_explore.append(child)
def size(root):
    count = 0
    for node in iter_tree(root):
        count += 1
    return count
The stack would be the easiest non-recursive way of getting the size of the subtree (the count of nodes under the given node, including the current node):
class Node():
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
def subtree_size(root):
    visited = 0
    if not root: return visited
    stack = [root]
    while stack:
        node = stack.pop()
        visited += 1
        if node.left: stack.append(node.left)
        if node.right: stack.append(node.right)
    return visited
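To see the difference on the "long, linear tree" from the question, here is a small experiment. It is a sketch: recursive_size is a hypothetical name for the textbook recursive version, and the chain length of 50,000 is an arbitrary choice that is well beyond CPython's default recursion limit of 1000.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def recursive_size(root):
    """Textbook recursive version, limited by the interpreter's recursion depth."""
    if root is None:
        return 0
    return 1 + recursive_size(root.left) + recursive_size(root.right)


def subtree_size(root):
    """Stack-based version from the answer above."""
    visited = 0
    if not root:
        return visited
    stack = [root]
    while stack:
        node = stack.pop()
        visited += 1
        if node.left:
            stack.append(node.left)
        if node.right:
            stack.append(node.right)
    return visited


# Build a degenerate tree: every node has only a left child.
root = Node(0)
node = root
for i in range(1, 50_000):
    node.left = Node(i)
    node = node.left

print(subtree_size(root))
try:
    print(recursive_size(root))
except RecursionError:
    print("recursive version exceeded the recursion limit")
```

Raising the limit with sys.setrecursionlimit is possible, but it only moves the ceiling; the explicit stack is bounded by available memory instead.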
You can mirror the recursive algorithm using a stack:
numNodes = 0
nodeStack = [(root,0)]  # (Node,0 means explore left 1 means explore right)
while nodeStack:
    nextNode, leftOrRight = nodeStack.pop()
    if not nextNode:  # nextNode is empty
        continue
    if leftOrRight == 0:
        numNodes += 1
        nodeStack.append((nextNode,1))
        nodeStack.append((nextNode.leftChild,0))
    else:
        nodeStack.append((nextNode.rightChild,0))
print(numNodes)
Some things to notice: this is still a depth-first search! That is, we still fully explore one subtree before starting to explore the other. What this means for you is that the amount of additional memory required is proportional to the height of the tree and not the width of the tree. For a balanced tree, the width is about 2^h, where h is the height of the tree. For a totally unbalanced tree, the height is the number of nodes in the tree, whereas the width is one! So it all depends on what you need :)
It is also worth mentioning that you can make a potential optimization by checking whether one of the subtrees is empty. We can change the body of if leftOrRight == 0: to:
    numNodes += 1
    if nextNode.rightChild:  # Has a right child to explore
        nodeStack.append((nextNode,1))
    nodeStack.append((nextNode.leftChild,0))
This potentially cuts down on memory usage :)

Returning value when condition met in recursion

I am trying to find tours in a graph. I have written the following code, which seems to print tours correctly. I want it to stop once it has found the first tour and return the tour as a list. However, the recursion seems to run to completion and I am not getting the desired result. How can I return a value and fully stop the recursion when I find the first tour, i.e. when my condition is met? Thanks.
def get_tour(start, graph, path):
    if path==[]:
        from_node=start
    else:
        from_node=path[-1][1]
    if graph==[]:
        if start in path[-1]:
            print "Tour Found"
            return path
    else:
        edges=[node for node in graph if from_node in node]
        for edge in edges:
            to_node=[i for i in edge if i<> from_node][0]
            p=list(path)
            p.append((from_node,to_node))
            g=list(graph)
            g.remove(edge)
            get_tour(start, g,p)
g=[(1,2), (1,3), (2,3)]
get_tour(1, graph=g, path=[])
When using recursion you need to pass back the return value up to the whole call stack. Normally this isn't the best way to use recursion.
Without going in the details of your code, here is a quick suggestion:
def get_tour(start, graph, path):
    ret_val = None
    # Some code..
    if graph==[]:
        # Some code..
    else:
        edges=[node for node in graph if from_node in node]
        for edge in edges:
            # Some more code..
            ret_val = get_tour(start, g,p)
            if ret_val:
                break
    return ret_val
The reason the code continues to execute after finding the tour and returning the path is because it returns it to the call that was made within the iteration through the edges. If there is no break or return condition there then the iterations continue (and more recursive calls are followed).
Here is an amended version of your code that returns to the original call (as well as the recursive call) as soon as the conditions are satisfied. I have added some debug information to try to make the process clearer:
#!/usr/bin/python
# globals
verbose = True
def get_tour(start, graph, path):
    if path==[]:
        from_node=start
    else:
        from_node=path[-1][1]
    if verbose:
        print '\nfrom_node:\t', from_node
        print 'start:\t', start
        print 'graph:\t', graph
        print 'path:\t', path
    if graph==[]:
        if start in path[-1]:
            print "Tour Found"
            return path
    else:
        edges=[node for node in graph if from_node in node]
        for edge in edges:
            to_node=[i for i in edge if i <> from_node][0]
            p=list(path)
            p.append((from_node,to_node))
            g=list(graph)
            g.remove(edge)
            path = get_tour(start, g, p)
            if path:
                return path
g=[(1,2), (1,3), (2,3)]
get_tour(1, graph=g, path=[])
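The code above uses Python 2 syntax (print statements and the <> operator). As a rough Python 3 sketch of the same early-return idea, with the debug output dropped and path made optional (both are choices made here, not part of the original answer):

```python
def get_tour(start, graph, path=None):
    """Return the first tour found that uses every edge once, or None."""
    if path is None:
        path = []
    from_node = path[-1][1] if path else start
    if not graph:
        # All edges consumed: same check as the original (start in last edge).
        return path if path and start in path[-1] else None
    for edge in graph:
        if from_node not in edge:
            continue
        # The other endpoint of the edge (handles either orientation).
        to_node = edge[1] if edge[0] == from_node else edge[0]
        remaining = list(graph)
        remaining.remove(edge)
        found = get_tour(start, remaining, path + [(from_node, to_node)])
        if found:
            # Propagate the result upward immediately; this is what stops
            # the remaining iterations and recursive calls.
            return found
    return None


g = [(1, 2), (1, 3), (2, 3)]
print(get_tour(1, g))  # [(1, 2), (2, 3), (3, 1)]
```

Each level of the recursion returns as soon as a deeper call reports a tour, so the result travels back up the call stack to the original caller.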
