How can I use dfs to produce topological sort? - python

How can I use the depth first search function (dfs) to produce a topological sort?
This is my code:
class Vertex:
    def __init__(self,key):
        self.id = key
        self.connectedTo = {}
    def addNeighbor(self,nbr,weight=0):
        self.connectedTo[nbr] = weight
    def __str__(self):
        return str(self.id) + ' connectedTo: ' + str([x.id for x in self.connectedTo])
    def getConnections(self):
        return self.connectedTo.keys()
    def getId(self):
        return self.id
    def getWeight(self,nbr):
        return self.connectedTo[nbr]
class Graph:
    def __init__(self):
        self.vertList = {}
        self.numVertices = 0
    def addVertex(self,key):
        self.numVertices = self.numVertices + 1
        newVertex = Vertex(key)
        self.vertList[key] = newVertex
        return newVertex
    def getVertex(self,n):
        if n in self.vertList:
            return self.vertList[n]
        else:
            return None
    def __contains__(self,n):
        return n in self.vertList
    def addEdge(self,f,t,cost=0):
        if f not in self.vertList:
            nv = self.addVertex(f)
        if t not in self.vertList:
            nv = self.addVertex(t)
        self.vertList[f].addNeighbor(self.vertList[t], cost)
    def getVertices(self):
        return self.vertList.keys()
    def __iter__(self):
        return iter(self.vertList.values())
class DFSGraph(Graph):
    def __init__(self):
        super().__init__()
        self.time = 0
    def dfs(self):
        for aVertex in self:
            aVertex.setColor('white')
            aVertex.setPred(-1)
        for aVertex in self:
            if aVertex.getColor() == 'white':
                self.dfsvisit(aVertex)

There are several ways to obtain a topological sort of a graph; one of them is DFS with some bookkeeping. For a topological ordering to exist in the first place, the graph must be a directed acyclic graph (DAG). Every DAG has at least one vertex with no incoming edges, but finding such a vertex does not by itself prove the graph is a DAG; the reliable check is that DFS finds no back edges.
A helpful visual from https://www.techiedelight.com/topological-sorting-dag/ shows that a back edge creates a cycle in the graph. In a topological sort we order the vertices from largest departure (finishing) time to smallest.
The reason DFS works is that when it is run on a DAG, the search never finds a back edge, because by definition there are no cycles. That means there is no edge (U,V) where the finishing time of V is greater than the finishing time of U: for every edge (U,V), V finishes before U, so U must come before V in the ordering. Therefore, if we run DFS, push each node onto a stack when it finishes, and then pop the items and print them, we get a topological ordering.
Since this is simply DFS with an extra data structure, the runtime is O(V + E).
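A minimal sketch of the idea, assuming the graph is given as a plain dict that maps each vertex to a list of its successors (not the Vertex/DFSGraph classes from the question). The names topological_sort and visit and the sample data are illustrative only. Each vertex is appended to a list when it finishes and the list is reversed at the end; reaching a gray vertex again means a back edge, i.e. a cycle.

```python
def topological_sort(graph):
    """Return the vertices of a DAG in topological order using DFS.

    `graph` is assumed to be a dict: vertex -> iterable of successors.
    Raises ValueError if a back edge (cycle) is found.
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {}
    for u, succs in graph.items():
        color[u] = WHITE
        for v in succs:
            color.setdefault(v, WHITE)  # vertices that only appear as targets

    finished = []  # vertices in order of increasing finishing time

    def visit(u):
        color[u] = GRAY  # u is on the current DFS path
        for v in graph.get(u, ()):
            if color[v] == GRAY:
                raise ValueError("graph has a cycle, no topological order exists")
            if color[v] == WHITE:
                visit(v)
        color[u] = BLACK
        finished.append(u)

    for u in list(color):
        if color[u] == WHITE:
            visit(u)
    # Largest finishing time first.
    return finished[::-1]


# Hypothetical example data.
dag = {
    "shirt": ["tie"],
    "tie": ["jacket"],
    "trousers": ["shoes", "jacket"],
    "shoes": [],
    "jacket": [],
}
order = topological_sort(dag)
```

The recursion keeps the sketch short; for very deep graphs an explicit stack would avoid Python's recursion limit.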

Related

python class for graph data structure contain add edge method i find it complex to understand

I'm learning Python and this is an exercise: I'm trying to build a graph data structure.
class Vertex:
    def __init__(self, value):
        self.value = value
        self.edges = {}
    def add_edge(self, vertex, weight = 0):
        self.edges[vertex] = weight
    def get_edges(self):
        return list(self.edges.keys())
class Graph:
    def __init__(self, directed = False):
        self.graph_dict = {}
        self.directed = directed
    def add_vertex(self, vertex):
        self.graph_dict[vertex.value] = vertex
    def add_edge(self, from_vertex, to_vertex, weight = 0):
        self.graph_dict[from_vertex.value].add_edge(to_vertex.value, weight)
        if not self.directed:
            self.graph_dict[to_vertex.value].add_edge(from_vertex.value, weight)
In the add_edge method of the Graph class, I find the following line complex and I don't understand how it works.
self.graph_dict[from_vertex.value].add_edge(to_vertex.value, weight)
I'm not sure which part is giving you trouble, but you can write it in two lines:
# get the Vertex object stored in the dictionary under this value
some_vertex = self.graph_dict[from_vertex.value]
# and add a new edge to this vertex
some_vertex.add_edge(to_vertex.value, weight)

TypeError between two instances of Vertices (nodes) in Dijkstra's SPF algorithm

I am currently working on solving a train schedule optimization problem as part of my studies. In this problem, a utility function has to be maximized which increases in the number of (critical) stations visited, and decreases in the amount of trains used and in the total amount of minutes the trains are running.
The problem consists of stations (nodes) and connections (edges). Data on both is first loaded from two CSV files. Then a class instance is created for each station (containing the name and whether or not it is critical) and for each connection (containing the two stations in the connection and the travel time between them). The stations and connections are both stored in dictionaries.
As a first step, my groupmates and I decided we first wanted to implement a version of Dijkstra's Pathfinding Algorithm in order to find the quickest route between two stations. BogoToBogo has a very detailed guide on how to implement a version of Dijkstra's algorithm. We decided first to try and implement their code to see what the results would be. However, a TypeError keeps popping up:
TypeError: '<' not supported between instances of 'Vertex' and 'Vertex'
If anyone has an idea what is causing this error, any help would be greatly appreciated!
#Makes the shortest path from v.previous
def shortest(v, path):
    if v.previous:
        path.append(v.previous.get_id())
        shortest(v.previous, path)
    return
def dijkstra(aGraph, start, target):
    print('Dijkstras shortest path')
    # Set the distance for the start node to zero
    start.set_distance(0)
    # Put tuple pair into the priority queue
    unvisited_queue = [(v.get_distance(),v) for v in aGraph]
    heapq.heapify(unvisited_queue)
    while len(unvisited_queue):
        # Pops a vertex with the smallest distance
        uv = heapq.heappop(unvisited_queue)
        current = uv[1]
        current.set_visited()
        #for next in v.adjacent:
        for next in current.adjacent:
            # if visited, skip
            if next.visited:
                continue
            new_dist = current.get_distance() + current.get_weight(next)
            if new_dist < next.get_distance():
                next.set_distance(new_dist)
                next.set_previous(current)
                print('updated : current = ' + current.get_id() + ' next = ' + next.get_id() + ' new_dist = ' + next.get_distance())
            else:
                print('not updated : current = ' + current.get_id() + ' next = ' + next.get_id() + ' new_dist = ' + next.get_distance())
        # Rebuild heap
        # 1. Pop every item
        while len(unvisited_queue):
            heapq.heappop(unvisited_queue)
        # 2. Put all vertices not visited into the queue
        unvisited_queue = [(v.get_distance(),v) for v in aGraph if not v.visited]
        heapq.heapify(unvisited_queue)
if __name__ == "__main__":
    # Calling the CSV loading functions in mainActivity
    # These functions will also instantiate station and connections objects
    load_stations(INPUT_STATIONS)
    load_connections(INPUT_CONNECTIONS)
    g = Graph()
    for index in stations:
        g.add_vertex(stations[index].name)
    for counter in connections:
        g.add_edge(connections[counter].stat1, connections[counter].stat2, int(connections[counter].time))
    for v in g:
        for w in v.get_connections():
            vid = v.get_id()
            wid = w.get_id()
            print( vid, wid, v.get_weight(w))
    dijkstra(g, g.get_vertex('Alkmaar'), g.get_vertex('Zaandam'))
    target = g.get_vertex('Zaandam')
    path = [target.get_id()]
    shortest(target, path)
    print('The shortest path :' + (path[::-1]))
In this case, the function dijkstra is called with the parameters g (which is an instance of the Graph class), Alkmaar, and Zaandam.
# Represents a grid of nodes/stations composed of nodes and edges
class Graph:
    def __init__(self):
        self.vert_dict = {}
        self.num_vertices = 0
    def __iter__(self):
        return iter(self.vert_dict.values())
    def add_vertex(self, node):
        self.num_vertices = self.num_vertices + 1
        new_vertex = Vertex(node)
        self.vert_dict[node] = new_vertex
        return new_vertex
    def get_vertex(self, n):
        if n in self.vert_dict:
            return self.vert_dict[n]
        else:
            return None
    def add_edge(self, frm, to, cost = 0):
        if frm not in self.vert_dict:
            self.add_vertex(frm)
        if to not in self.vert_dict:
            self.add_vertex(to)
        self.vert_dict[frm].add_neighbor(self.vert_dict[to], cost)
        self.vert_dict[to].add_neighbor(self.vert_dict[frm], cost)
    def get_vertices(self):
        return self.vert_dict.keys()
    def set_previous(self, current):
        self.previous = current
    def get_previous(self, current):
        return self.previous
The Graph class.
# Represents a node (station)
class Vertex:
    def __init__(self, node):
        self.id = node
        self.adjacent = {}
        # Set distance to infinity for all nodes
        self.distance = sys.maxsize
        # Mark all nodes unvisited
        self.visited = False
        # Predecessor
        self.previous = None
    def add_neighbor(self, neighbor, weight=0):
        self.adjacent[neighbor] = weight
    def get_connections(self):
        return self.adjacent.keys()
    def get_id(self):
        return self.id
    def get_weight(self, neighbor):
        return self.adjacent[neighbor]
    def set_distance(self, dist):
        self.distance = dist
    def get_distance(self):
        return self.distance
    def set_previous(self, prev):
        self.previous = prev
    def set_visited(self):
        self.visited = True
    def __str__(self):
        return str(self.id) + ' adjacent: ' + str([x.id for x in self.adjacent])
The Vertex class.
Thanks for your time!
I think this might help. By the way, when posting to Stack Overflow, post a minimal but complete example.
# Put tuple pair into the priority queue
unvisited_queue = [(v.get_distance(),v) for v in aGraph]
heapq.heapify(unvisited_queue)
This code converts the list to a heap, which requires < comparison between the items you give it. The items are (distance, vertex) tuples, so whenever two distances are equal Python falls back to comparing the Vertex objects, and Vertex defines no ordering. Define a __lt__() method in the Vertex class (heapq compares with <; a __gt__() method also works, because Python then uses the reflected comparison). That method determines what gets popped first among ties, so write it as you see fit, and I think the error will go away. :-)
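A small self-contained illustration of the tie case and one way around it. The Vertex class here is a stripped-down stand-in, not the full class from the question, and the station "Hoorn" is made up for the example. Two vertices still have the default distance, so the tuples tie on their first element and heapq has to compare the Vertex objects themselves.

```python
import heapq
import sys


class Vertex:
    """Stripped-down stand-in for the question's Vertex class."""

    def __init__(self, node):
        self.id = node
        self.distance = sys.maxsize

    def __lt__(self, other):
        # Only reached when two distances tie inside a (distance, vertex) tuple.
        return self.distance < other.distance


vertices = [Vertex("Alkmaar"), Vertex("Zaandam"), Vertex("Hoorn")]
vertices[0].distance = 0  # the start node

# Zaandam and Hoorn tie on distance, so the tuple comparison falls through
# to the Vertex objects; without __lt__ this would raise the TypeError.
queue = [(v.distance, v) for v in vertices]
heapq.heapify(queue)
first = heapq.heappop(queue)[1]
```

An alternative that needs no change to the class is to put a unique tie-breaker, such as a running counter, between the distance and the vertex in each tuple, so the vertices are never compared.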

Make undirected graph from adjacency list

I'm trying to make an undirected graph from an adjacency list to practice Karger's Min Cut algorithm. The following is my code:
class Vertex(object):
    '''Represents a vertex, with the indices of edges
    incident on it'''
    def __init__(self,name,edgeIndices=[]):
        self.name = name
        self.edgeIndices = edgeIndices
    def getName(self):
        return self.name
    def addEdge(self,ind):
        self.edgeIndices.append(ind)
    def getEdges(self):
        return self.edgeIndices
    def __eq__(self,other):
        return self.name == other.name
class Edge(object):
    '''Represents an edge with the indices of its endpoints'''
    def __init__(self,ends):
        self.ends = ends
    def getEnds(self):
        return self.ends
    def __eq__(self,other):
        return (self.ends == other.ends)\
               or ((self.ends[1],self.ends[0]) == other.ends)
class Graph(object):
    def __init__(self,vertices,edges):
        self.edges = edges
        self.vertices = vertices
def createGraph(filename):
    '''Input: Adjacency list
    Output: Graph object'''
    vertices = []
    edges = []
    with open(filename) as f:
        for line in f:
            elements = line.split()
            newVert = Vertex(elements[0])
            if newVert not in vertices:
                vertices.append(newVert)
            for verts in elements[1:]:
                otherVert = Vertex(verts)
                if otherVert not in vertices:
                    vertices.append(otherVert)
                end1 = vertices.index(newVert)
                end2 = vertices.index(otherVert)
                newEdge = Edge((end1,end2))
                if newEdge not in edges:
                    edges.append(newEdge)
                newVert.addEdge(edges.index(newEdge))
    return Graph(vertices,edges)
Suppose the adjacency list is as follows with vertices represented by integers
1 -> 2,3,4
2 -> 1,3
3 -> 1,2,4
4 -> 1,3
In total, this graph has five edges, so the list of edge indices associated with a vertex can't be more than 5 long.
For instance, I expect the vertex '2' to have the indices of just two edges, i.e. the edges to vertices 1 and 3. Instead, what I get is [0, 1, 2, 3, 0, 2, 1, 3].
I need help figuring out what is going wrong.
The first error comes from Vertex.__init__. When a list is used as a default argument, Python creates it once, when the function is defined, and shares that one list among all Vertex instances that rely on the default.
Pass None instead, and create a new list if no list is given.
class Vertex(object):
    def __init__(self,name,edgeIndices=None):
        self.name = name
        self.edgeIndices = edgeIndices if edgeIndices else []
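To see the shared-default behaviour in isolation, here is a short sketch with two throwaway classes (SharedDefault and FreshDefault are hypothetical names, not part of the question's code). The first reproduces the problem, the second applies the None pattern.

```python
class SharedDefault:
    def __init__(self, name, edgeIndices=[]):  # one list, created once at definition time
        self.name = name
        self.edgeIndices = edgeIndices


class FreshDefault:
    def __init__(self, name, edgeIndices=None):
        self.name = name
        # A new list is created per instance when none is supplied.
        self.edgeIndices = edgeIndices if edgeIndices is not None else []


a, b = SharedDefault("1"), SharedDefault("2")
a.edgeIndices.append(0)
shared = b.edgeIndices  # b sees the index that was appended through a

c, d = FreshDefault("1"), FreshDefault("2")
c.edgeIndices.append(0)
fresh = d.edgeIndices  # d still has its own empty list
```

This is why every vertex in the question ends up reporting the same long list of edge indices.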
In the createGraph function, when the vertex already exists in the graph you need to reuse it. See the added else: newVert = ...
You also seem to have an issue with the line splitting. See the iteration over elements[2].split(','), which assumes each line looks like 1 -> 2,3,4.
def createGraph(filename):
    '''Input: Adjacency list
    Output: Graph object'''
    vertices = []
    edges = []
    with open(filename) as f:
        for line in f:
            elements = line.split()
            newVert = Vertex(elements[0])
            if newVert not in vertices:
                vertices.append(newVert)
            else:
                newVert = vertices[vertices.index(newVert)]
            for verts in elements[2].split(','):
                otherVert = Vertex(verts)
                if otherVert not in vertices:
                    vertices.append(otherVert)
                end1 = vertices.index(newVert)
                end2 = vertices.index(otherVert)
                newEdge = Edge((end1,end2))
                if newEdge not in edges:
                    edges.append(newEdge)
                newVert.addEdge(edges.index(newEdge))
    return Graph(vertices,edges)
As a side note, I would use a dict to store the vertices (and edges) and do the lookups. list.index is called a lot, and you may create a lot of objects for nothing.
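A rough sketch of what a dict-based version could look like, assuming the same 1 -> 2,3,4 line format. The function name build_graph and the choice to store each undirected edge as a frozenset are assumptions made for this example; it takes an iterable of lines so it can be tried without a file.

```python
def build_graph(lines):
    """Build an undirected graph from lines like '1 -> 2,3,4'.

    Returns (adjacency, edges): adjacency maps a vertex name to the set of
    its neighbours, edges is a set of frozenset({u, v}) pairs.
    """
    adjacency = {}
    edges = set()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        name, _, rest = line.partition("->")
        u = name.strip()
        adjacency.setdefault(u, set())
        for v in rest.split(","):
            v = v.strip()
            if not v:
                continue
            adjacency.setdefault(v, set()).add(u)
            adjacency[u].add(v)
            edges.add(frozenset((u, v)))  # (1, 2) and (2, 1) are the same edge
    return adjacency, edges


adjacency, edges = build_graph(["1 -> 2,3,4", "2 -> 1,3", "3 -> 1,2,4", "4 -> 1,3"])
```

Lookups by name are dictionary accesses here, and duplicate edges collapse automatically because sets ignore repeats.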
I would recommend taking a look at dict, OrderedDict, or linked-list based graph implementations. They are far more efficient than ones based on lists and indexes.
To make your code work you can do the following:
Change Vertex to avoid the issue described in the previous answer:
class Vertex(object):
    def __init__(self,name, edgeIndices=None):
        self.name = name
        self.edgeIndices = edgeIndices or []
Let the graph do some work:
class Graph(object):
    def __init__(self):
        self.edges = []
        self.vertices = []
    def add_vertex(self, name):
        vertex = Vertex(name)
        if vertex not in self.vertices:
            self.vertices.append(vertex)
    def add_edge(self, *names):
        self._add_vertices(names)
        edge = self._add_edge(names)
        self._update_vertices_links(edge, names)
    def get_vertex_index(self, name):
        vertex = Vertex(name)
        return self.vertices.index(vertex)
    def get_vertex(self, name):
        return self.vertices[self.get_vertex_index(name)]
    def _update_vertices_links(self, edge, names):
        for name in names:
            vertex = self.get_vertex(name)
            vertex.addEdge(self.edges.index(edge))
    def _add_edge(self, names):
        edge = Edge((self.get_vertex_index(names[0]), self.get_vertex_index(names[1])))
        if edge not in self.edges:
            self.edges.append(edge)
        return edge
    def _add_vertices(self, names):
        for name in names:
            self.add_vertex(name)
    def __repr__(self):
        return "Vertices: %s\nEdges: %s" % (self.vertices, self.edges)
Create Graph:
def createGraph(filename):
    with open(filename) as f:
        graph = Graph()
        for line in f:
            elements = line.strip().split()
            graph.add_vertex(elements[0])
            for element in elements[2].split(","):
                graph.add_edge(elements[0], element)
        return graph
Run it:
graph = createGraph('input.txt')
print graph
Output for your input:
Vertices: [<Name:1 Edges:[0, 1, 2]>, <Name:2 Edges:[0, 3]>, <Name:3 Edges:[1, 3, 4]>, <Name:4 Edges:[2, 4]>]
Edges: [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]

Dijkstra's Algorithm - wrong order of nodes in shortest path

I've been working on a school assignment where I need to implement Dijkstra's algorithm. That wouldn't be too hard by itself, but unfortunately the automatic checking script disagrees with all of my implementations (I actually made about 8 different versions). All the initial data checks pass; the results differ only when the script generates random data. My path and the script's path have the same distance but different vertexes on the path. For example:
Teachers path: City2, City15, City16, City6,
Students path: City2, City15, City18, City0, City6,
I even contacted the teacher who just responded with "You have to use priority queue :-)" despite me using one (in fact, several implementations of one, from my own to heapq). Am I doing something wrong or is it the teacher script that's incorrect? I hope the code is self-commenting enough to be understandable. Thank you for any advice you can give me.
The algorithm is called on a source vertex and computes the shortest distance and path to every other connected node. If a vertex has the same minDistance (i.e. priority) as one that is already in the queue, it should go in front of it, not after it.
class Node:
    """Basic node of the priority queue"""
    def __init__(self, data, priority):
        self.data = data
        self.nextNode = None
        self.priority = priority
        self.id = data.id
class PriorityQueue:
    """Basic priority queue with add, remove and update methods"""
    def __init__(self):
        self.head = None
        self.count = 0
    def add(self, data, priority):
        """Adds data with priority in the proper place"""
        node = Node(data, priority)
        if not self.head:
            self.head = node
        elif node.priority <= self.head.priority:
            node.nextNode = self.head
            self.head = node
        else:
            checker = self.head
            while True:
                if not checker.nextNode or node.priority >= checker.nextNode.priority:
                    break
                checker = checker.nextNode
            node.nextNode = checker.nextNode
            checker.nextNode = node
        return 0
    def remove(self, data):
        """Removes specified node and reconnects the remaining nodes, does nothing if node not found"""
        checker = self.head
        if not self.head:
            return 0
        if checker.id == data.id:
            self.head = checker.nextNode
        while True:
            checker = checker.nextNode
            if not checker or not checker.nextNode:
                return 0
            if checker.nextNode.id == data.id:
                checker.nextNode = checker.nextNode.nextNode
                break
        return 0
    def update(self, data):
        """Updates priority of existing node via removing and re-adding it"""
        self.remove(data)
        self.add(data, data.minDistance)
        return 0
    def getMin(self):
        """Returns the minimum priority data"""
        min = self.head
        return min.data
class Edge:
    """Edge of the graph, contains source, target and weight of line"""
    def __init__(self, source, target, weight):
        self.source = source
        self.target = target
        self.weight = weight
class Vertex:
    """Vertex of the graph, everything except id and name is filled later"""
    def __init__(self, id, name):
        self.id = id
        self.name = name
        self.minDistance = float('inf')
        self.previousVertex = None
        self.edges = []
        self.visited = False
class Dijkstra:
    """Dijkstra's algorithm implementation"""
    def __init__(self):
        self.vertexes = []
        self.nodes = {}
        self.unvisited = PriorityQueue()
    def createGraph(self, vertexes, edgesToVertexes):
        """Connects edges to appropriate vertexes, adds vertexes to node dictionary"""
        self.vertexes = vertexes
        for vertex in self.vertexes:
            for edge in edgesToVertexes:
                if vertex.id == edge.source:
                    vertex.edges.append(edge)
                    edgesToVertexes.remove(edge)
            self.nodes[vertex.id] = vertex
        return 0
    def getVertexes(self):
        """Returns vertexes in graph, should be called after creating it just to check"""
        return self.vertexes
    def computePath(self, sourceId):
        """Fills in minDistance and previousVertex of all nodes from source"""
        mainNode = self.nodes[sourceId]
        mainNode.minDistance = 0
        self.unvisited.add(mainNode, 0)
        while self.unvisited.head:
            mainNode = self.unvisited.getMin()
            mainNode.visited=True
            for edge in mainNode.edges:
                tempDistance = mainNode.minDistance + edge.weight
                targetNode = self.nodes[edge.target]
                self.unvisited.remove(mainNode)
                if tempDistance < targetNode.minDistance:
                    targetNode.minDistance = tempDistance
                    targetNode.previousVertex = mainNode
                    self.unvisited.update(targetNode)
        return 0
    def getShortestPathTo(self, targetId):
        """Returns list of shortest path to targetId from source. Call only after doing ComputePath"""
        path = []
        mainNode = self.nodes[targetId]
        while True:
            path.append(mainNode)
            mainNode = mainNode.previousVertex
            if not mainNode:
                break
        return list(reversed(path))
    def resetDijkstra(self):
        """Resets ComputePath but leaves graph untouched"""
        for vertex in self.vertexes:
            vertex.minDistance = float('inf')
            vertex.previousVertex = None
        return 0
def createGraph(self, vertexes, edgesToVertexes):
    """Connects edges to appropriate vertexes, adds vertexes to node dictionary"""
    self.vertexes = vertexes
    for vertex in self.vertexes:
        for edge in edgesToVertexes:
            if vertex.id == edge.source:
                vertex.edges.append(edge)
                edgesToVertexes.remove(edge)
        self.nodes[vertex.id] = vertex
    return 0
I believe this line was wrong: edgesToVertexes.remove(edge)
I had a similar homework assignment and used some of your code, and I think this one line was incorrect. Removing an item from a list while iterating over that same list makes the loop skip the next item, so on every pass some edges were never attached to their vertex.
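The effect is easy to reproduce on a plain list. This is a toy sketch with made-up data, not the question's Edge objects: removing inside the loop makes the iterator skip every other item, while iterating over a copy visits all of them.

```python
edges = ["a", "b", "c", "d"]
seen = []
for edge in edges:
    seen.append(edge)
    edges.remove(edge)  # shifts the remaining items left by one position

# Iterating over a copy leaves the loop unaffected by the removals.
edges_copy = ["a", "b", "c", "d"]
seen_all = []
for edge in list(edges_copy):
    seen_all.append(edge)
    edges_copy.remove(edge)
```

In createGraph the simplest fix is probably to drop the remove call entirely, since each edge is matched by its source id anyway.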

In order BST traversal: find

I am trying to find the kth smallest element of a binary search tree and I have problems using recursion. I understand how to print the tree inorder/postorder etc., but I fail to return the rank of the element. Can someone point out where I am making a mistake? In general, I am having a hard time understanding recursion in trees.
Edit: this is an exercise, so I am not looking for using built-in functions. I have another solution where I keep track of number of left and right children as I insert nodes and that code is working fine. I am wondering if it is possible to do this using inorder traversal because it seems to be a simpler solution.
class BinaryTreeNode:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right
def traverseInOrder(root,order):
    if root == None:
        return
    traverseInOrder(root.left,order+1)
    print root.data,
    print order
    traverseInOrder(root.right,order)
"""
            a
          /   \
        b       c
       / \     / \
      d   e   f   g
     / \
    h   i
"""
h = BinaryTreeNode("h")
i = BinaryTreeNode("i")
d = BinaryTreeNode("d", h, i)
e = BinaryTreeNode("e")
f = BinaryTreeNode("f")
g = BinaryTreeNode("g")
b = BinaryTreeNode("b", d, e)
c = BinaryTreeNode("c", f, g)
a = BinaryTreeNode("a", b, c)
print traverseInOrder(a,0)
If this is an academic exercise, make traverseInOrder (or similar method tailored to the purpose) return the number of children it visited. From there things get simpler.
If this isn't academic, have a look at http://stromberg.dnsalias.org/~dstromberg/datastructures/ - the dictionary-like objects are all trees, and support iterators - so finding the nth is a matter of zip(tree, range(n)).
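For the academic route, here is one possible sketch of an in-order traversal that returns how many nodes it visited, so the caller can work out the rank. The helper name walk, the function kth_smallest, and the small sample tree are illustrative assumptions; no library functions are used.

```python
class BinaryTreeNode:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def kth_smallest(root, k):
    """Return the k-th smallest value (1-based), or None if the tree has fewer than k nodes."""

    def walk(node, k):
        # Returns (number of nodes visited in this subtree, matching node or None).
        if node is None:
            return 0, None
        left_count, found = walk(node.left, k)
        if found is not None:
            return left_count, found
        if left_count + 1 == k:  # this node is the k-th in in-order position
            return left_count + 1, node
        # Skip the nodes already counted and keep looking on the right.
        right_count, found = walk(node.right, k - left_count - 1)
        return left_count + 1 + right_count, found

    _, node = walk(root, k)
    return None if node is None else node.data


# A small BST:      4
#                  / \
#                 2   6
#                / \
#               1   3
root = BinaryTreeNode(4, BinaryTreeNode(2, BinaryTreeNode(1), BinaryTreeNode(3)), BinaryTreeNode(6))
```

The key difference from the question's code is that the count travels back up through the return value instead of being passed down as an argument that the caller never sees updated.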
You could find the smallest element in the binary search tree first, then from that element call a method that gives you the next element, k - 1 times.
For the find_smallest_node method, note that you could run a full in-order traversal and take the first element, but a full traversal takes O(n) time.
However, you do not need recursion to find the smallest node, because in a BST the smallest node is simply the leftmost node. You can walk left until you reach a node that has no left child, which takes O(h) time for a tree of height h (O(log n) when the tree is balanced):
class BST(object):
    def find_smallest_node(self):
        if self.root == None:
            return
        walking_node = self.root
        smallest_node = self.root
        while walking_node != None:
            if walking_node.data <= smallest_node.data:
                smallest_node = walking_node
            if walking_node.left != None:
                walking_node = walking_node.left
            elif walking_node.left == None:
                walking_node = None
        return smallest_node
    def find_k_smallest(self, k):
        k_smallest_node = self.find_smallest_node()
        if k_smallest_node == None:
            return
        else:
            k_smallest_data = k_smallest_node.data
        count = 1
        while count < k:
            k_smallest_data = self.get_next(k_smallest_data)
            count += 1
        return k_smallest_data
    def get_next (self, key):
        ...
This only requires storing each node's parent when inserting it into the tree.
class Node(object):
    def __init__(self, data, left=None, right=None, parent=None):
        self.data = data
        self.right = right
        self.left = left
        self.parent = parent
An implementation of the bst class with the above methods and also def get_next (self, key) function is here. The upper folder contains the test cases for it and it worked.
