Implementing a (modified) DFS in a Graph - python

I have implemented a simple graph data structure in Python with the following structure below. The code is here just to clarify what the functions/variables mean, but they are pretty self-explanatory so you can skip reading it.
class Node:
    def __init__(self, label):
        self.out_edges = []
        self.label = label
        self.is_goal = False
        self.is_visited = False
    def add_edge(self, node, weight):
        self.out_edges.append(Edge(node, weight))
    def visit(self):
        self.is_visited = True

class Edge:
    def __init__(self, node, weight):
        self.node = node
        self.weight = weight
    def to(self):
        return self.node

class Graph:
    def __init__(self):
        self.nodes = []
    def add_node(self, label):
        self.nodes.append(Node(label))
    def visit_nodes(self):
        for node in self.nodes:
            node.is_visited = True
Now I am trying to implement a depth-first search which starts from a given node v, and returns a path (in list form) to a goal node. By goal node, I mean a node with the attribute is_goal set to true. If a path exists, and a goal node is found, the string ':-)' is added to the list. Otherwise, the function just performs a DFS and goes as far as it can go. (I do this here just to easily check whether a path exists or not).
This is my implementation:
def dfs(G, v):
    path = [] # path is empty so far
    v.visit() # mark the node as visited
    path.append(v.label) # add to path
    if v.is_goal: # if v is a goal node
        path.append(':-)') # indicate a path is reached
        G.visit_nodes() # set all remaining nodes to visited
    else:
        for edge in v.out_edges: # for each out_edge of the starting node
            if not edge.to().is_visited: # if the node pointed to is not visited
                path += dfs(G, edge.to()) # return the path + dfs starting from that node
    return path
Now the problem is, I have to set all the nodes to visited (the G.visit_nodes() call) for the algorithm to end once a goal node is reached. In effect, this sort of breaks out of the awaiting recursive calls, since it ensures no other nodes are added to the path. My question is:
Is there a cleaner/better way to do this?
The solution seems a bit kludgy. I'd appreciate any help.

It would be better not to clutter the graph structure with visited information, as that really is context-sensitive information linked to a search algorithm, not with the graph itself. You can use a separate set instead.
Secondly, you have a bug in the code, as you keep adding to the path variable, even if your recursive call did not find the target node. So your path will even have nodes in sequence that have no edge between them, but are (close or remote) siblings/cousins.
Instead you should only return a path when you found the target node, and then after making the recursive call you should test that condition to determine whether to prefix that path with the current edge node you are trying with.
There is in fact no need to keep a path variable, as per recursion level you are only looking for one node to be added to a path you get from the recursive call. It is not necessary to store that one node in a list. Just a simple variable will do.
Here is the suggested code (not tested):
def dfs(G, v):
    visited = set() # keep visited information away from graph
    def _dfs(v):
        visited.add(v) # mark the node as visited
        if v.is_goal:
            return [':-)'] # return end point of path
        for edge in v.out_edges:
            neighbor = edge.to() # to avoid calling to() several times
            if neighbor not in visited:
                result = _dfs(neighbor)
                if result: # only when successful
                    # we only need 1 success: add current neighbor and exit
                    return [neighbor.label] + result
                # otherwise, nothing should change to any path: continue
        # don't return anything in case of failure
    # call nested function: the visited and Graph variables are shared
    return _dfs(v)
Remark
For the same reason as for visited, it is maybe better to remove the is_goal marking from the graph as well, and pass that target node as an additional argument to the dfs function.
It would also be nice to give a default value for the weight argument, so that you can use this code for unweighted graphs as well.
See how it runs on a sample graph with 5 nodes on repl.it.
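The two remarks above (keep the visited set and the goal out of the graph, give weight a default) can be combined into a minimal sketch. It is not the answerer's code: the names dfs_path and the (neighbor, weight) tuple layout are assumptions made for illustration, and unlike the snippet above this variant also puts the starting node's label at the front of the returned path.

```python
class Node:
    def __init__(self, label):
        self.label = label
        self.out_edges = []  # list of (neighbor, weight) pairs (assumed layout)

    def add_edge(self, node, weight=1):  # default weight for unweighted graphs
        self.out_edges.append((node, weight))


def dfs_path(start, goal):
    """Return the labels on one path from start to goal, or None if unreachable."""
    visited = set()  # search state lives here, not on the nodes

    def _dfs(node):
        visited.add(node)
        if node is goal:
            return [node.label]
        for neighbor, _weight in node.out_edges:
            if neighbor not in visited:
                rest = _dfs(neighbor)
                if rest is not None:
                    # first success wins: prefix the current node and stop
                    return [node.label] + rest
        return None  # dead end: contributes nothing to any path

    return _dfs(start)


# Small example graph: A -> B -> D and A -> C
a, b, c, d = (Node(x) for x in "ABCD")
a.add_edge(b)
a.add_edge(c)
b.add_edge(d)

print(dfs_path(a, c))  # the dead end through B and D is not part of the result
print(dfs_path(d, a))  # no path exists
```

Because a failed branch returns None instead of a partial list, the caller can tell success from failure without a ':-)' marker or a call that marks every node as visited.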

Related

Python Temperamentally Accessing Global Object from within a Function (which references that object from within another)

(Edit: Fundamentally my problem is that Python is sometimes creating new instances of an object x, which I accessed through another object y, itself accessed through z, instead of editing the original x directly. x and y both belong to lists of global variables, but when I access y via z to access x via y on a subsequent iteration of my recursive algorithm, its information isn't always correctly updated.)
Introduction
I'm writing a recursive function to emulate this version of Dijkstra's algorithm (Problem 2) with input from a CSV file. I have two globals Branches [] and Nodes [] to store all branch and node python objects. (I'll put how everything is initialized below).
When I attempt to change the set of the destination node of my branch from within my function, python generates a new object. It's not that it cannot access global Branches, because it does function correctly when the start node of the branch I am accessing is the same as the origin.
Update: I tried searching through the global list for a node with the label matching the one I wanted to change, but while this correctly changed the set of the global none of the branches referencing that node recognized the change
def dijkstra_algorithm(node_being_investigated, origin_node, destination_node):
    ...
    for branch in shortest_route.requisite_branches:
        # finding the new branch added to the route and adding it to set I and its destination to set A
        if branch not in branches_in_set_I:
            print(f"adding the new branch {branch.info()} to set I")
            print(f"Adding the new node {branch.destination.info()} to set A")
            print(f"The New Node: {branch.destination} ")
            branch.set = "I"
            branch.destination.set = "A"
    # a redundancy I added just in case the for loop was somehow messing with things
    shortest_route.requisite_branches[0].destination.set="A"
    if destination in obtain_nodes_of_set("A"):
        return shortest_route
    else:
        return dijkstra_algorithm(shortest_route.requisite_branches[0].destination, origin, destination)
NB: There is one other class Routes which stores a list of branches and their cumulative time. I suspect that since this is the only place some operation is performed on a node or branch, the problem is linked to this somehow. However, since I'm not sure how, here's a small table of contents of all the code snippets I've attached. I can send more (or even the whole file) if need be.
The Code Below
the way I define the Node and Branch classes
the initialization of everything from the CSVs
the way I define the Route class
the search_to_origin function which finds the overall length of various Route options
the piece of the dijkstra_algorithm function which finds the shortest_route inputted to this piece of the algorithm
Defining the Node and Branch classes
class Node:
    def __init__(self, label, set):
        self.label = label
        self.set = set and set or "C"
        print(f"\n\n!!!creating new node object with label {self.label}!!!\n\n")
    def info(self):
        return f"Set {self.set}: [[{self.label}]]"
and:
class Branch:
    def __init__(self, origin, destination, duration, set):
        self.origin = origin
        self.destination = destination
        self.duration = duration
        self.set = set and set or "III"
    def info(self):
        return f"Set {self.set}:{self.origin.info()} -> {self.destination.info()} ({self.duration})"
Initializing the global Lists from the CSVs
From what I've checked by printing out the object IDs this seems to be working correctly, but since I'm not actually sure where the problem lies, I'm including it just in case
with open('nodes.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        Nodes.append(Node(row["LABEL"], 99999, "C"))
# here all the roads are read into the global list of branches
with open('roads.csv') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        branch_info = {}
        # getting the actual node objects to attach
        for node in Nodes:
            if node.label == row["ORIGIN"]:
                branch_info['origin'] = node
            if node.label == row["DESTINATION"]:
                branch_info['destination'] = node
        Branches.append(Branch(branch_info['origin'], branch_info['destination'], int(row["DURATION"]), "III"))
The Route Class and the Main Function Operating on It
There is a sub function called by the overall algorithm which traces the route from a list of branches back to a specified origin node object. Afterwards it returns a list of possible_routes to the main dijkstra function, for that function to decide which is quickest.
Route Class Initialization
class Route:
    def __init__(self, branch_list, duration):
        self.requisite_branches = branch_list
        self.duration = duration
    def extend_route(self, branch):
        self.requisite_branches.append(branch)
        self.duration += branch.duration
The function determining routes to the origin
Note that this calls a utility function find_attached_branches, which searches a specific set for branches that have a given node as their origin or destination.
def search_to_origin(origin, branches, route):
    print("\n Searching to origin...")
    possible_routes = []
    # for each of the branches being investigated
    for branch in branches:
        # if this branch has not already been investigated in this route
        if branch not in route.requisite_branches:
            # this is a new individual route being investigated
            new_route = copy.deepcopy(route)
            new_route.extend_route(branch)
            # if the start node has been found this is a possible route
            if branch.origin == origin:
                print("This branch leads to the origin")
                possible_routes.append(new_route)
            # if the start node has not been found search for branches preceding this one
            else:
                branches_to_this_node = find_attached_branches(branch.origin, 'destination', "II")
                branches_to_this_node.extend(find_attached_branches(branch.origin, 'destination', "I"))
                if len(branches_to_this_node) != 0:
                    print("this branch does not lead to the origin")
                    route_to_start = search_to_origin(origin, branches_to_this_node, new_route)
                    possible_routes.extend(route_to_start)
    # return the lengths and requisite branches found
    return possible_routes
Finding the Quickest Route to the Origin From a Given Node in B
This portion of the algorithm is done just before the one you see in the introduction section. The way it functions should be independent of whether the shortest route only consists of one branch, but this doesn't seem to be the case. It goes through the following steps:
Looks for all the nodes now in set B.
For each of these nodes it finds all the branches attached to each node in set B.
For all of those branches it looks for routes back to the origin.
From all of the resulting routes it determines the shortest_route
global Nodes
# get all the nodes in set B
nodes_in_set_B = obtain_nodes_of_set("B")
# get all branches in sets I or II
global Branches
branches_in_I_or_II = []
for branch in Branches:
    if branch.set != "III": branches_in_I_or_II.append(branch)
# the shortest route found from one of the nodes of set B. Initialized as a large empty route
shortest_route = Route([], 99999)
if nodes_in_set_B is not None:
    for node in nodes_in_set_B:
        # the branches under consideration are only those to this node in set B
        branches_under_consideration = []
        for branch in branches_in_I_or_II:
            if branch.destination == node: branches_under_consideration.append(branch)
        possible_routes = search_to_origin(origin, branches_under_consideration, Route([], 0))
        # finding the possible route of minimum length
        for route in possible_routes:
            if route.duration < shortest_route.duration:
                shortest_route = route
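No answer is recorded for this question. One plausible cause, offered here as an assumption rather than a confirmed diagnosis, is the copy.deepcopy(route) call in search_to_origin: a deep copy duplicates every branch and node reachable from the route, so later writes such as branch.destination.set = "A" land on the copies instead of the objects in the global lists. The sketch below uses stripped-down stand-ins for Node, Branch and Route (not the poster's full classes) to show the difference between a deep copy and copying only the list of branches.

```python
import copy


class Node:
    def __init__(self, label):
        self.label = label
        self.set = "C"


class Branch:
    def __init__(self, destination):
        self.destination = destination


class Route:
    def __init__(self, branches):
        self.requisite_branches = branches


node = Node("X")
branch = Branch(node)
route = Route([branch])

# Deep copy: the route, its branches and their nodes are all duplicated.
deep = copy.deepcopy(route)
deep.requisite_branches[0].destination.set = "A"
after_deep = node.set  # the original node is untouched

# Copying only the list: a new Route, but the same Branch and Node objects.
shallow = Route(list(route.requisite_branches))
shallow.requisite_branches[0].destination.set = "A"
after_shallow = node.set  # the original node has been updated

print(after_deep, after_shallow)
print(deep.requisite_branches[0] is branch)     # a different Branch object
print(shallow.requisite_branches[0] is branch)  # the very same Branch object
```

If this is indeed the cause, building each new route from a copied list of the existing branches would keep the routes independent while leaving the shared Node and Branch objects in place. Whether that fits the rest of the algorithm cannot be judged from the excerpts shown.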

Remove router function in python

Hi just wondering if anyone can tell me why this code isn't working
It's giving an internal server error
Thanks
@app.post("/removerouter/")
def removerouter(name: str):
    Graph.remove_node(name)
    return "success"
And this is the function inside Graph
class Graph:
    def __init__(self):
        self.nodes = []
        self.edges = []
    def remove_node(self, name):
        self.nodes.remove(name)
        for Edges in self.edges:
            if name in Edges:
                self.edges.remove(Edges)
Based on the code that you posted, I would say that the issue is somehow related to how you remove your edge(s) in the for loop.
When you delete a list element with remove() while iterating over that same list, the remaining elements shift down by one index, so the loop skips the element that follows each removal.
For more details and alternatives, see this SO question.
I also don't understand why you are using a loop variable called Edges in your for loop. By convention (PEP 8), Python variable names start with a lowercase letter so that they are not mistaken for class names.
I would rather do something like this:
class Graph:
    def __init__(self):
        self.nodes = []
        self.edges = []
    def remove_node(self, name):
        self.nodes.remove(name)
        self.edges = [edge for edge in self.edges if name not in edge]
Note that I'm using a list comprehension here to assign a new list to self.edges.
If you want to avoid list comprehension, you could also keep your for loop and first store the indexes of the edges that need to be removed. Then delete them from the highest index to the lowest with del self.edges[index], so that each deletion does not shift the indexes still waiting to be deleted.
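To make the skipping effect concrete, here is a small sketch on a plain list of edge tuples. The sample data and the variable names buggy and fixed are made up for illustration; they are not taken from the question.

```python
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("a", "d")]

# Removing from the list being iterated: after each removal the next
# element slides into the current position and is never examined.
buggy = list(edges)
for edge in buggy:
    if "a" in edge:
        buggy.remove(edge)

# Index-based alternative: collect the indexes first, then delete from
# the highest index down so earlier deletions do not shift later ones.
fixed = list(edges)
to_delete = [i for i, edge in enumerate(fixed) if "a" in edge]
for i in reversed(to_delete):
    del fixed[i]

print(buggy)  # ("a", "c") survives even though it contains "a"
print(fixed)  # only ("b", "c") is left
```

Note that the buggy loop does not raise an exception by itself; it silently leaves edges behind.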

Implementation of Shortest Path Graph in Python Class

Hi! I am new to Python and I am struggling for many hours so far with some problem regarding Shortest Path Algorithm implementation in Python.
I am expected to solve some task about finding shortest paths (graph problem) among given persons, and at the end find a common person who connects all of them.
I've made something like this so far:
import itertools
class centralperson():
def initialization(self,N,data,a,b,c):
    self.data = data
    self.N = N
    self.a = a
    self.b = b
    self.c = c
    self.list_of_values = [self.a,self.b,self.c]
    self.list_of_paths = []
    self.common_person = []
def makeGraph(self):
    # Create dict with empty list for each key
    graph = {key: [] for key in range(self.N)}
    self.graph = graph
    for key in self.graph:
        for x in self.data:
            if key in x:
                x = x.copy()
                x.remove(key)
                self.graph[key].extend(x)
def find_path(self,start, end):
    path = []
    path = path + [start]
    if start == end:
        return path
    if start not in self.graph.keys():
        raise ValueError('No such key in graph!')
    for node in self.graph[start]:
        if node not in path:
            newpath = self.find_path(self.graph, node, end, path)
            if newpath: return newpath
    return self.list_of_paths.append(path)
def findPaths(self):
    for pair in itertools.combinations(self.list_of_values, r=5):
        self.find_path(*pair)
def commonperson(self):
    list_of_lens = {}
    commonalities = set(self.list_of_paths[0]) & set(self.list_of_paths[1]) & set(self.list_of_paths[2])
    for i in self.list_of_values:
        list_of_lens[i] = (len(self.graph[i]))
    if len(commonalities)>1 or len(commonalities)<1:
        for k,v in list_of_lens.items():
            if v==1 and self.graph[k][0] in commonalities:
                output = self.graph[k]
                self.common_person.append(output)
    else:
        output = list(commonalities)
        self.common_person.append(output)
    return
def printo(self):
    #return(self.common_person[0])
    return(self.list_of_paths,self.list_of_values)
Description of each function and inputs:
N -> number of unique nodes
a,b,c -> some arbitrarily chosen nodes to find common one among them
initialization -> just initialize our global variables used in other methods, and store the list of outputs
makeGraph -> makes an Adjacency List out of an input.
find_path -> find path between two given nodes (backtracking recursion)
findPaths -> it was expected to call find_path here for every combination of A,B,C i.e A->B, A->C, B->C
commonperson -> expected to find common person from the output of list_of_paths list
printo -> print this common person
Generally it works (I think) when I'm running each function separately. But when I try to make one big class out of it, it doesn't work :(
I think the problem is with the recursive function find_path. It is supposed to find a path between two given persons and append the resulting path to the list of all paths. Yet I have 3 different persons, and find_path is a recursive function with only 2 parameters.
Hence, I need to find all paths that connect them (3 paths) and append them to a bigger list, list_of_paths. I've created findPaths to make use of itertools.combinations and, in a for loop, call find_path for every combination of start and end arguments, but it doesn't seem to work...
Can you help me with this? Also, I don't know how to run all the def functions at once, because honestly I wouldn't like to run all instances of the class separately... My desired version is to:
Provide input to the class, i.e. N, data, a, b, c, where N is the number of unique nodes, data is a list of lists with the networks assigned, and a, b, c are my persons.
Get output: a common person for all 3 persons (I planned to store it in the common_person list).
The code inside your class should be indented, i.e.:
class centralperson:
    def __init__(self, ...):
        ...
    def makeGraph(self, ...):
        ...
instead of
class centralperson:
def __init__(self, ...):
    ...
def makeGraph(self, ...):
    ...
Try googling for 'python class examples'. I hope this helps!
It might also be useful to experiment with simpler classes before working on this problem.
itertools.combinations(self.list_of_values, r=5)
produces no combinations at all (an empty iterator), since self.list_of_values only has 3 elements, from which you cannot pick 5.
Perhaps you meant:
itertools.combinations(self.list_of_values, r=2)
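As a rough sketch of how the pieces could fit together, the snippet below pairs up three persons with r=2 and runs a recursive path search for each pair. The standalone find_path function with an explicit path parameter, the sample graph and the people list are all assumptions made for this example; the question's class and data are not reproduced here.

```python
import itertools


def find_path(graph, start, end, path=None):
    """Return one path from start to end as a list of nodes, or None."""
    path = (path or []) + [start]  # carry the path so far into each call
    if start == end:
        return path
    for node in graph.get(start, []):
        if node not in path:  # avoid walking in circles
            found = find_path(graph, node, end, path)
            if found:
                return found
    return None


# Hypothetical adjacency list: person 1 knows everybody else.
graph = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
people = [0, 2, 3]

pairs = list(itertools.combinations(people, r=2))      # 3 pairs
paths = [find_path(graph, s, e) for s, e in pairs]
common = set.intersection(*(set(p) for p in paths))    # nodes on every path
too_many = list(itertools.combinations(people, r=5))   # nothing to pick

print(pairs)
print(paths)
print(common)
print(too_many)
```

Returning the path from find_path and collecting the results in the caller keeps the recursion free of side effects, which is a design choice of this sketch, not something the answers above prescribe.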

Implementing Graph for Bayes Net in FSharp

I'm trying to translate a graph formulation from Python to F#
The python "Node" class:
class Node:
    """ A Node is the basic element of a graph. In its most basic form a graph is just a list of nodes. A Node is really just a list of neighbors.
    """
    def __init__(self, id, index=-1, name="anonymous"):
        # This defines a list of edges to other nodes in the graph.
        self.neighbors = set()
        self.visited = False
        self.id = id
        # The index of this node within the list of nodes in the overall graph.
        self.index = index
        # Optional name, most useful for debugging purposes.
        self.name = name
    def __lt__(self, other):
        # Defines a < operator for this class, which allows for easily sorting a list of nodes.
        return self.index < other.index
    def __hash__(self):
        return hash(self.id)
    def __eq__(self, right):
        return self.id == right.id
    def add_neighbor(self, node):
        """ Make node a neighbor if it is not already. This is a hack, we should be allowing self to be a neighbor of self in some graphs. This should be enforced at the level of a graph, because that is where the type of the graph would disallow it.
        """
        if (not node in self.neighbors) and (not self == node):
            self.neighbors.add(node)
    def remove_neighbor(self, node):
        # Remove the node from the list of neighbors, effectively deleting that edge from
        # the graph.
        self.neighbors.remove(node)
    def is_neighbor(self, node):
        # Check if node is a member of neighbors.
        return node in self.neighbors
My F# class so far:
type Node<'T>= string*'T
type Edge<'T,'G> = Node<'T>*Node<'T>*'G
type Graph =
    | Undirected of seq(Node*list Edge)
    | Directed of seq(Node*list Edge *list Edge)
Yes, this does have to do with immutability. F#'s Set is an immutable collection; it is based on a binary tree that supports Add, Remove and lookup in O(log n) time.
However, because the collection is immutable, the add operation returns a new Set.
let originalSet = set [1; 2; 7]
let newSet = originalSet.Add(5)
The most purely functional solution is probably to reconstruct your problem to remove the mutability entirely. This approach would probably see you reconstruct your node class as an immutable data container (with no methods) and define the functions that act on that data container in a separate module.
module Nodes =
    /// creates a new node from an old node with a supplied neighbour node added.
    let addNeighbour neighbourNode node =
        Node <| Set.add neighbourNode (node.Neighbours)
        //Note: you'll need to replace the backwards pipe with brackets for pre-F# 4.0
See the immutable collections in the FSharp Core library such as List, Map, etc. for more examples.
If you prefer the mutable approach, you could just make your neighbours mutable so that it can be updated when the map changes or just use a mutable collection such as a System.Collections.Generic.HashSet<'T>.
When it comes to the hashcode, Set<'T> actually doesn't make use of that. It requires that objects that can be contained within it implement the IComparable interface. This is used to generate the ordering required for the binary tree. It looks like your object already has a concept of ordering built-in which would be appropriate to provide this behaviour.

How to delete all nodes of a Binary Search Tree

I am trying to write a code to delete all nodes of a BST (each node has only three attributes, left, right and data, there are no parent pointers). The following code is what I have come up with, it deletes only the right half of the tree, keeping the left half intact. How do I modify it so that the left half is deleted as well (so that ultimately I am left with only the root node which has neither left or right subtrees)?
def delete(root):
    global last
    if root:
        delete(root.left)
        delete(root.right)
        if not (root.left or root.right):
            last = root
        elif root.left == last:
            root.left = None
        else:
            root.right = None
And secondly, can anybody suggest an iterative approach as well, using stack or other related data structure?
Blckknght is right about garbage collection, but in case you want to do some more complex cleanup than your example suggests, or want to understand why your code didn't work, I'll provide an additional answer:
Your problem seems to be the elif root.left == last check.
I'm not sure what your last variable is used for or what the logic is behind it.
But the problem is that root.left is almost never equal to last: you only assign a node to last if both of its children are already None, which is not the case for any of the interesting nodes (those that have children).
If you look at your code, you'll see that whenever root.left isn't equal to last, only the right child gets set to None, and thus only the right part of the subtree is deleted.
I don't know python, but this should work:
def delete(node):
    if node:
        # recurse: visit all nodes in the two subtrees
        delete(node.left)
        delete(node.right)
        # after both subtrees have been visited, set pointers of this node to None
        node.left = None
        node.right = None
(I took the liberty of renaming your root parameter to node, since the node given to the function doesn't have to be the root-node of the tree.)
If you want to delete both subtrees, there's no need to recurse. Just set root.left and root.right to None and let the garbage collector take care of them. Indeed, rather than making a delete function in the first place, you could just set root = None and be done with it!
Edit: If you need to run cleanup code on the data values, you might want to recurse through the tree to get to all of them if the GC doesn't do enough. Tearing down the links in the tree shouldn't really be necessary, but I'll do that too for good measure:
def delete(node):
    if node:
        node.data.cleanup() # run data value cleanup code
        delete(node.left) # recurse
        delete(node.right)
        node.data = None # clear pointers (not really necessary)
        node.left = None
        node.right = None
You had also asked about an iterative approach to traversing the tree, which is a little more complicated. Here's a way to do a traversal using a deque (as a stack) to keep track of the ancestors:
from collections import deque
def delete_iterative(node):
    stack = deque()
    last = None
    # start up by pushing nodes to the stack until reaching leftmost node
    while node:
        stack.append(node)
        node = node.left
    # the main loop
    while stack:
        node = stack.pop()
        # should we expand the right subtree?
        if node.right and node.right != last: # yes
            stack.append(node)
            node = node.right
            while node: # expand to find leftmost node in right subtree
                stack.append(node)
                node = node.left
        else: # no, we just came from there (or it doesn't exist)
            # delete node's contents
            node.data.cleanup()
            node.data = None # clear pointers (not really necessary)
            node.left = None
            node.right = None
            # let our parent know that it was us it just visited
            last = node
An iterative post-order traversal using a stack could look like this:
def is_first_visit(cur, prev):
    return prev is None or prev.left is cur or prev.right is cur
def visit_tree(root):
    if root:
        todo = [root]
        previous = None
        while len(todo):
            node = todo[-1]
            if is_first_visit(node, previous):
                # add one of our children to the stack
                if node.left:
                    todo.append(node.left)
                elif node.right:
                    todo.append(node.right)
                # now set previous to ourself and continue
            elif previous is node.left:
                # we've done the left subtree, do right subtree if any
                if node.right:
                    todo.append(node.right)
            else:
                # previous is either node.right (we've visited both sub-trees)
                # or ourself (we don't have a right subtree)
                do_something(node)
                todo.pop()
            previous = node
do_something does whatever you want to call "actually deleting this node".
You can do it a bit more simply by setting an attribute on each node to say whether it has had do_something called on it yet, but obviously that doesn't work so well if your nodes have __slots__ or whatever, and you don't want to modify the node type to allow for the flag.
I'm not sure what you're doing with those conditions after the recursive calls, but I think this should be enough:
def delete(root):
    if root:
        delete(root.left)
        delete(root.right)
        root = None
As pointed out in comments, Python does not pass parameters by reference. In that case you can make this work in Python like this:
def delete(root):
    if root:
        delete(root.left)
        delete(root.right)
        root.left = None
        root.right = None
Usage:
delete(root)
root = None
As for an iterative approach, you can try this. It's pseudocode, I don't know python. Basically we do a BF search.
delete(root):
    make an empty queue Q
    Q.push(root)
    while not Q.empty:
        c = Q.popFront()
        Q.push(c.left, c.right)
        c = None
Again, this won't modify the root by default if you use it as a function, but it will delete all other nodes. You could just set the root to None after the function call, or remove the parameter and work on a global root variable.
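For readers who want to run the breadth-first idea, here is one possible Python rendering of the pseudocode. The TreeNode class and the name delete_bfs are assumptions made for this sketch. Two details differ from the pseudocode: missing children are not queued, and each node's child links are cleared explicitly, since rebinding the local variable c to None would not change the tree.

```python
from collections import deque


class TreeNode:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def delete_bfs(root):
    """Unlink every node below root, level by level; return how many nodes were visited."""
    if root is None:
        return 0
    queue = deque([root])
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        # queue the children before the links to them are cleared
        for child in (node.left, node.right):
            if child is not None:
                queue.append(child)
        node.left = None
        node.right = None
    return visited


root = TreeNode(2, TreeNode(1), TreeNode(4, TreeNode(3), TreeNode(5)))
count = delete_bfs(root)
print(count, root.left, root.right)
```

After the call the root object still exists but has no subtrees, which matches the goal stated in the question; setting root = None afterwards drops the last reference.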
