Dear experienced friends, I am looking for an algorithm (Python) that outputs the width of a tree at each level. Here are the input and expected outputs.
(I have updated the problem with a more complex edge list. The original question, with a sorted edge list, can be elegantly solved by @Samwise's answer.)
Input (Edge List: source-->target)
[[11,1],[11,2],
[10,11],[10,22],[10,33],
[33,3],[33,4],[33,5],[33,6]]
The tree graph looks like this:
          10
       /   |   \
     11    22    33
    /  \       / | \ \
   1    2     3  4  5  6
Expected Output (Width of each level/height)
[1,3,6] # according to the width of level 0,1,2
I have looked through the web. It seems this topic is related to BFS and level-order traversal. However, most solutions are based on binary trees. How can I solve the problem when the tree is not binary (e.g. the above case)?
(I'm new to algorithms, and any references would be really appreciated. Thank you!)
Build a dictionary of the "level" of each node, and then count the number of nodes at each level:
>>> from collections import Counter
>>> def tree_width(edges):
...     levels = {}  # {node: level}
...     for [p, c] in edges:
...         levels[c] = levels.setdefault(p, 0) + 1
...     widths = Counter(levels.values())  # {level: width}
...     return [widths[level] for level in sorted(widths)]
...
>>> tree_width([[0,1],[0,2],[0,3],
...             [1,4],[1,5],
...             [3,6],[3,7],[3,8],[3,9]])
[1, 3, 6]
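The loop above assigns a child's level from whatever level its parent has at that moment, so it relies on each parent appearing in the edge list before its children. For the unsorted edge list in the updated question, one possible order-independent variant is sketched below. It is only a sketch: the name `tree_width_unsorted` is made up for illustration, and it assumes the edges really do describe a tree (a cycle would make the climb loop forever).

```python
from collections import Counter

def tree_width_unsorted(edges):
    parent = {child: par for par, child in edges}  # {node: its parent}
    levels = {}                                    # {node: level}
    for node in set(parent) | set(parent.values()):
        chain = []
        # Climb until we reach a node whose level is known, or the root.
        while node not in levels and node in parent:
            chain.append(node)
            node = parent[node]
        levels.setdefault(node, 0)  # only the root gets here without a level
        depth = levels[node]
        # Walk back down the chain, assigning levels from the top.
        for n in reversed(chain):
            depth += 1
            levels[n] = depth
    widths = Counter(levels.values())              # {level: width}
    return [widths[level] for level in sorted(widths)]

edges = [[11, 1], [11, 2],
         [10, 11], [10, 22], [10, 33],
         [33, 3], [33, 4], [33, 5], [33, 6]]
print(tree_width_unsorted(edges))  # [1, 3, 6]
```

Each node's level is computed once and then reused, so the work stays proportional to the number of edges.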
This might not be the most efficient, but it requires only two passes over the edges (one to build the tree, one for the BFS), so it's optimal up to a constant factor. It places no requirement on the order of the edges in the edge list, but does insist that each edge be (source, dest). It also doesn't check that the edge list describes a connected tree (or a tree at all; if the edge list is cyclic, the program will never terminate).
from collections import defaultdict

# Turn the edge list into a (non-binary) tree, represented as a
# dictionary whose keys are the source nodes with the list of children
# as its value.
def edge_list_to_tree(edges):
    '''Given a list of (source, dest) pairs, constructs a tree.
    Returns a tuple (tree, root) where root is the root node
    and tree is a dict which maps each node to a list of its children.
    (Leaves are not present as keys in the dictionary.)
    '''
    tree = defaultdict(list)
    sources = set()  # nodes used as sources
    dests = set()    # nodes used as destinations
    for source, dest in edges:
        tree[source].append(dest)
        sources.add(source)
        dests.add(dest)
    roots = sources - dests      # Source nodes which are not destinations
    assert len(roots) == 1       # There is only one in a tree
    tree.default_factory = None  # Defang the defaultdict
    return tree, roots.pop()

# A simple breadth-first-search, keeping the count of nodes at each level.
def level_widths(tree, root):
    '''Does a BFS of tree starting at root counting nodes at each level.
    Returns a list of counts.
    '''
    widths = []      # Widths of the levels
    fringe = [root]  # List of nodes at current level
    while fringe:
        widths.append(len(fringe))
        kids = []    # List of nodes at next level
        for parent in fringe:
            if parent in tree:
                for kid in tree[parent]:
                    kids.append(kid)
        fringe = kids  # For next iteration, use this level's kids
    return widths

# Put the two pieces together.
def tree_width(edges):
    return level_widths(*edge_list_to_tree(edges))
A possible solution based on breadth-first traversal.
In a plain breadth-first traversal we add each node to the array; in this solution we wrap the node in an object together with its level, and add that object to the array instead.
function levelWidth(root) {
  const counter = [];
  const traverseBF = fn => {
    const arr = [{n: root, l: 0}];
    const pushToArr = l => n => arr.push({n, l});
    while (arr.length) {
      const node = arr.shift();
      node.n.children.forEach(pushToArr(node.l + 1));
      fn(node);
    }
  };
  traverseBF(node => {
    counter[node.l] = (+counter[node.l] || 0) + 1;
  });
  return counter;
}
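Since the question asks for Python, the same idea (queue entries that carry a node together with its level) might be sketched as follows. The `Node` class and the name `level_width` are assumptions made for this example; any object with a `children` list would do.

```python
from collections import deque

class Node:
    """Hypothetical node type: a value plus a list of child nodes."""
    def __init__(self, value, children=()):
        self.value = value
        self.children = list(children)

def level_width(root):
    counter = []                   # counter[level] == number of nodes on that level
    queue = deque([(root, 0)])     # each entry is a (node, level) pair
    while queue:
        node, level = queue.popleft()
        if level == len(counter):  # first node seen on this level
            counter.append(0)
        counter[level] += 1
        queue.extend((child, level + 1) for child in node.children)
    return counter

# The tree from the question.
root = Node(10, [
    Node(11, [Node(1), Node(2)]),
    Node(22),
    Node(33, [Node(3), Node(4), Node(5), Node(6)]),
])
widths = level_width(root)
print(widths)  # [1, 3, 6]
```

`deque.popleft()` runs in constant time, whereas removing from the front of a plain list (like `arr.shift()` above) takes time proportional to the queue length.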
I have a tree like this:
         A
      /  |  \
     B   C   D
    / \  |  / | \
   E   F G H  I  J
and I am trying to append the nodes of the tree to an empty list
such that the list looks like:
[[A], [B, C, D], [E, F, G, H, I, J]]
Suppose that I have a root_node A and I don't know how deep my tree is.
How can I append nodes from the tree to an empty list in the above-mentioned format?
I tried breadth-first search, but my list length is way longer than the
depth of the tree.
Append each new depth of nodes as a new list.
Start with an empty list: tree = []
Create a new inner list for the current depth
Append each element at the current depth in the list: tree[depth].append(element)
Go to the next depth and repeat
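The steps above can be sketched as a small recursive function. This is only an illustration: the `Node` class (with `name` and `children` attributes) and the function name `collect_levels` are assumptions, not something given in the question.

```python
class Node:
    """Hypothetical node type: a name plus a list of child nodes."""
    def __init__(self, name):
        self.name = name
        self.children = []

def collect_levels(node, depth=0, tree=None):
    if tree is None:
        tree = []            # start with an empty list
    if depth == len(tree):
        tree.append([])      # new inner list the first time a depth is reached
    tree[depth].append(node.name)
    for child in node.children:
        collect_levels(child, depth + 1, tree)
    return tree

# Build the tree from the question.
nodes = {letter: Node(letter) for letter in "ABCDEFGHIJ"}
for parent, kids in {"A": "BCD", "B": "EF", "C": "G", "D": "HIJ"}.items():
    nodes[parent].children = [nodes[k] for k in kids]

levels = collect_levels(nodes["A"])
print(levels)  # [['A'], ['B', 'C', 'D'], ['E', 'F', 'G', 'H', 'I', 'J']]
```

The traversal is depth-first, yet the result is still grouped by level, because each node is appended to the inner list for its own depth.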
Given a Node implementation like:
class Node:
    def __init__(self, name):
        self.name = name
        self.children = []

    def __repr__(self):
        return f'Node({self.name})'
You can create your nodes and arrange a graph with:
nodes = {letter: Node(letter) for letter in 'ABCDEFGHIJ'}
nodes['A'].children.extend([nodes['B'], nodes['C'], nodes['D']])
nodes['B'].children.extend([nodes['E'], nodes['F']])
nodes['C'].children.extend([nodes['G']])
nodes['D'].children.extend([nodes['H'], nodes['I'], nodes['J']])
Now you can start with the root node, and continually make a new list of nodes until you run out with a simple generator:
def make_lists(root):
    current = [root]
    while current:
        yield current
        current = [c for n in current for c in n.children]

list(make_lists(nodes['A']))
The while loop will end when there are no more children, resulting in:
[[Node(A)],
[Node(B), Node(C), Node(D)],
[Node(E), Node(F), Node(G), Node(H), Node(I), Node(J)]]
I'm relatively new to python, and I was trying out some questions when I encountered this problem. A tree is defined in a text file in the following manner,
d:
e:
b: d e
c:
a: b c
So, I want to write a simple python script that finds the depth of this. I'm not able to figure out a strategy to work this out. Is there any algorithm or technique for this?
My strategy would be as follows:
Find elements with no children.
For each of these, find the parent. Determine if any elements have this parent as a child - if not, your length is two (2).
If so, find the parent of the parent. Repeat step 2, incrementing your length counter. Continue the process updating a counter with each step.
For your case:
d -> b -> a (len 3)
e -> b -> a (len 3)
c -> a (len 2)
This could be described as a 'bottom up' tree construction method/algorithm.
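A minimal sketch of this bottom-up strategy, assuming the file contents have already been read into a string. The helper names `parse_tree` and `depth_bottom_up` are made up for illustration.

```python
def parse_tree(text):
    """Map each node name to the list of its children."""
    children = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        children[name.strip()] = rest.split()
    return children

def depth_bottom_up(children):
    # Invert the mapping so we can walk from a child to its parent.
    parent = {c: p for p, kids in children.items() for c in kids}
    leaves = [n for n, kids in children.items() if not kids]
    longest = 0
    for leaf in leaves:
        length = 1
        node = leaf
        while node in parent:  # climb until we reach the root
            node = parent[node]
            length += 1
        longest = max(longest, length)
    return longest

text = "d:\ne:\nb: d e\nc:\na: b c"
print(depth_bottom_up(parse_tree(text)))  # 3
```

Each leaf walks all the way up to the root, so on a deep tree with many leaves this repeats work. It is simple, but not the fastest option.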
The tree format you've given has a nice property: if x is the child of y, then x is given before y in the file. So you can simply loop through the file once and read the depth into a dictionary. For example:
depth = {}
for line in f:  # f is the open tree file
    parent, children = read_node(line)
    if children:
        # A child with no entry in depth is a leaf, which has depth 1.
        depth[parent] = max(depth.get(child, 1) for child in children) + 1
Then just print depth['a'], as a is the root. Here read_node is a quick function to parse the parent and children from a line of the file:
def read_node(line):
    parent, children = line.split(":")
    return parent, children.split()
I'm not sure what you mean by depth. If it's how many steps it takes to visit every node, you could use a depth-first search to visit every node in the graph and count them.
Here's a simple implementation:
text_tree = """d:
e:
b: d e
c:
a: b c"""
tree = {}
for line in text_tree.splitlines():
    node, childs = line.split(":")
    tree[node] = set(childs.split())

def dfs(graph, start):
    visited, stack = [], [start]
    while stack:
        vertex = stack.pop()
        if vertex not in visited:
            visited.append(vertex)
            stack.extend(graph[vertex])
    return visited

result = dfs(tree, "a")
print("It took %d steps, to visit every node in tree, the path took was %s" % (len(result), result))
Which outputs:
It took 5 steps, to visit every node in tree, the path took was ['a', 'b', 'd', 'e', 'c']
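If "depth" instead means the number of levels on the longest path from the root down to a leaf, a recursive sketch over the same kind of dictionary could look like this. The function name `tree_depth` and the inline `tree` literal are illustrative and not part of the answer above.

```python
def tree_depth(tree, node):
    """Number of nodes on the longest path from node down to a leaf."""
    children = tree.get(node, ())
    # A leaf has no children, so max() falls back to 0 and the depth is 1.
    return 1 + max((tree_depth(tree, child) for child in children), default=0)

tree = {"d": set(), "e": set(), "b": {"d", "e"}, "c": set(), "a": {"b", "c"}}
print(tree_depth(tree, "a"))  # 3
```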
After thorough research, and based on this, this and a lot more, I was advised to implement a k shortest paths algorithm in order to find the first, second, third ... k-th shortest path in a large undirected, cyclic, weighted graph with about 2000 nodes.
The pseudocode on Wikipedia is this:
function YenKSP(Graph, source, sink, K):
    // Determine the shortest path from the source to the sink.
    A[0] = Dijkstra(Graph, source, sink);
    // Initialize the heap to store the potential kth shortest path.
    B = [];
    for k from 1 to K:
        // The spur node ranges from the first node to the next to last node in the previous k-shortest path.
        for i from 0 to size(A[k − 1]) − 1:
            // Spur node is retrieved from the previous k-shortest path, k − 1.
            spurNode = A[k-1].node(i);
            // The sequence of nodes from the source to the spur node of the previous k-shortest path.
            rootPath = A[k-1].nodes(0, i);
            for each path p in A:
                if rootPath == p.nodes(0, i):
                    // Remove the links that are part of the previous shortest paths which share the same root path.
                    remove p.edge(i, i + 1) from Graph;
            // Calculate the spur path from the spur node to the sink.
            spurPath = Dijkstra(Graph, spurNode, sink);
            // Entire path is made up of the root path and spur path.
            totalPath = rootPath + spurPath;
            // Add the potential k-shortest path to the heap.
            B.append(totalPath);
            // Add back the edges that were removed from the graph.
            restore edges to Graph;
        // Sort the potential k-shortest paths by cost.
        B.sort();
        // The lowest cost path becomes the k-shortest path.
        A[k] = B[0];
    return A;
The main problem is that I haven't yet managed to write a correct Python script for this (one that deletes the edges and puts them back correctly), so I've only got this far, relying on igraph as usual:
def yenksp(graph,source,sink, k):
    global distance
    """Determine the shortest path from the source to the sink."""
    a = graph.get_shortest_paths(source, sink, weights=distance, mode=ALL, output="vpath")[0]
    b = [] #Initialize the heap to store the potential kth shortest path
    #for xk in range(1,k):
    for xk in range(1,k+1):
        #for i in range(0,len(a)-1):
        for i in range(0,len(a)):
            if i != len(a[:-1])-1:
                spurnode = a[i]
                rootpath = a[0:i]
                #I should remove edges part of the previous shortest paths, but...:
                for p in a:
                    if rootpath == p:
                        graph.delete_edges(i)
            spurpath = graph.get_shortest_paths(spurnode, sink, weights=distance, mode=ALL, output="vpath")[0]
            totalpath = rootpath + spurpath
            b.append(totalpath)
            # should restore the edges
            # graph.add_edges([(0,i)]) <- this is definitely not correct.
            graph.add_edges(i)
        b.sort()
        a[k] = b[0]
    return a
It's a really poor attempt, and it returns only a list within a list.
I'm no longer sure what I'm doing, and I'm getting desperate with this issue; over the last few days my view of it has turned 180 degrees more than once.
I'm just a noob doing my best. Please help. A NetworkX implementation can also be suggested.
P.S. It's likely that there is no other workable approach, because we have already researched it here. I've already received lots of suggestions and I owe the community a lot. DFS or BFS won't work. The graph is huge.
Edit: I keep correcting the python script. In a nutshell the aim of this question is the correct script.
There is a Python implementation of Yen's KSP on GitHub, YenKSP. Giving full credit to the author, the heart of the algorithm is given here:
from operator import itemgetter

# dijkstra() and path() are helper functions that are not shown here.
def ksp_yen(graph, node_start, node_end, max_k=2):
    distances, previous = dijkstra(graph, node_start)
    A = [{'cost': distances[node_end],
          'path': path(previous, node_start, node_end)}]
    B = []
    if not A[0]['path']: return A
    for k in range(1, max_k):
        for i in range(0, len(A[-1]['path']) - 1):
            node_spur = A[-1]['path'][i]
            path_root = A[-1]['path'][:i+1]
            edges_removed = []
            for path_k in A:
                curr_path = path_k['path']
                if len(curr_path) > i and path_root == curr_path[:i+1]:
                    cost = graph.remove_edge(curr_path[i], curr_path[i+1])
                    if cost == -1:
                        continue
                    edges_removed.append([curr_path[i], curr_path[i+1], cost])
            path_spur = dijkstra(graph, node_spur, node_end)
            if path_spur['path']:
                path_total = path_root[:-1] + path_spur['path']
                dist_total = distances[node_spur] + path_spur['cost']
                potential_k = {'cost': dist_total, 'path': path_total}
                if not (potential_k in B):
                    B.append(potential_k)
            for edge in edges_removed:
                graph.add_edge(edge[0], edge[1], edge[2])
        if len(B):
            B = sorted(B, key=itemgetter('cost'))
            A.append(B[0])
            B.pop(0)
        else:
            break
    return A
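For readers who want something runnable without the helper functions or the graph class used above, here is a self-contained sketch of the same idea. It is not the YenKSP repository's code and not igraph's API. The dict-of-dicts graph format and the names `dijkstra`, `path_cost` and `yen_ksp` are assumptions made for this example, and node names are assumed to be mutually comparable (e.g. all strings). Instead of deleting and restoring edges, it passes the "removed" edges and nodes to Dijkstra as sets to skip, which sidesteps the restore step the question struggles with. For an undirected graph, store every edge in both directions.

```python
import heapq

def dijkstra(graph, source, sink, removed_edges=frozenset(), removed_nodes=frozenset()):
    """Return (cost, path) of the cheapest source->sink path, or (inf, []) if none."""
    heap = [(0, source, [source])]
    seen = set()
    while heap:
        cost, node, path = heapq.heappop(heap)
        if node == sink:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nxt, weight in graph.get(node, {}).items():
            if nxt in seen or nxt in removed_nodes or (node, nxt) in removed_edges:
                continue
            heapq.heappush(heap, (cost + weight, nxt, path + [nxt]))
    return float("inf"), []

def path_cost(graph, path):
    return sum(graph[u][v] for u, v in zip(path, path[1:]))

def yen_ksp(graph, source, sink, k):
    cost, path = dijkstra(graph, source, sink)
    if not path:
        return []
    accepted = [(cost, path)]  # "A" in the pseudocode
    candidates = []            # "B" in the pseudocode, kept as a heap
    while len(accepted) < k:
        prev_path = accepted[-1][1]
        for i in range(len(prev_path) - 1):
            spur_node = prev_path[i]
            root_path = prev_path[:i + 1]
            # Skip edges that accepted paths with the same root take out of the spur node.
            removed_edges = {(p[i], p[i + 1]) for _, p in accepted
                             if len(p) > i + 1 and p[:i + 1] == root_path}
            # Skip root-path nodes (except the spur node) so paths stay loop-free.
            removed_nodes = set(root_path[:-1])
            _, spur_path = dijkstra(graph, spur_node, sink, removed_edges, removed_nodes)
            if spur_path:
                total = root_path[:-1] + spur_path
                candidate = (path_cost(graph, total), total)
                if candidate not in candidates and candidate not in accepted:
                    heapq.heappush(candidates, candidate)
        if not candidates:
            break              # fewer than k loop-free paths exist
        accepted.append(heapq.heappop(candidates))
    return accepted

# Small directed example graph: graph[u][v] is the weight of edge u -> v.
graph = {
    "C": {"D": 3, "E": 2},
    "D": {"F": 4},
    "E": {"D": 1, "F": 2, "G": 3},
    "F": {"G": 2, "H": 1},
    "G": {"H": 2},
    "H": {},
}
paths = yen_ksp(graph, "C", "H", 3)
for cost, path in paths:
    print(cost, path)
```

On this example the three cheapest paths cost 5, 7 and 8. Carrying whole paths on the Dijkstra heap keeps the sketch short but uses extra memory; on a graph with thousands of nodes a predecessor map would be the more usual choice.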
I had the same problem as you so I ported Wikipedia's pseudocode for Yen's algorithm for use in Python with the igraph library.
You can find it here: https://gist.github.com/ALenfant/5491853
I'm using recursion to find the path from some point A to some point D.
I'm traversing a graph to find the pathways.
Let's say:
Graph = {'A':['route1','route2'],'B':['route1','route2','route3','route4'], 'C':['route3','route4'], 'D':['route4'] }
Accessible through:
A -> route1, route2
B -> route2, route3, route4
C -> route3, route4
There are two solutions in this path from A -> D:
route1 -> route2 -> route4
route1 -> route2 -> route3 -> route4
Since points A and B both have route1 and route2, there is an infinite loop, so I added a check (a 0 or 1 value) for whenever I visit a node.
However, with the check I only get one solution back: route1 -> route2 -> route4, and not the other possible solution.
Here is the actual coding: Routes will be substituted by Reactions.
def find_all_paths(graph,start, end, addReaction, passed = {}, reaction = [] ,path=[]):
    passOver = passed
    path = path + [start]
    reaction = reaction + [addReaction]
    if start == end:
        return [reaction]
    if not graph.has_key(start):
        return []
    paths=[]
    reactions=[]
    for x in range (len(graph[start])):
        for y in range (len(graph)):
            for z in range (len(graph.values()[y])):
                if (graph[start][x] == graph.values()[y][z]):
                    if passOver.values()[y][z] < 161 :
                        passOver.values()[y][z] = passOver.values()[y][z] + 1
                        if (graph.keys()[y] not in path):
                            newpaths = find_all_paths(graph, (graph.keys()[y]), end, graph.values()[y][z], passOver , reaction, path)
                            for newpath in newpaths:
                                reactions.append(newpath)
    return reactions
Here is the method call: dic_passOver is a dictionary keeping track if the nodes are visited
solution = (find_all_paths( graph, 'M_glc_DASH_D_c', 'M_pyr_c', 'begin', dic_passOver ))
My problem seems to be that once a route is visited, it can no longer be accessed, so other possible solutions are never found. I worked around this by allowing each route to be visited up to 161 times, which is the point where all the possible routes are found for my specific data.
if passOver.values()[y][z] < 161 :
    passOver.values()[y][z] = passOver.values()[y][z] + 1
However, this seems highly inefficient, and most of my data will be graphs with thousands of entries. In addition, I won't know in advance how many node visits to allow in order to find all routes. The number 161 was worked out by hand.
Well, I can't understand your representation of the graph. But this is a generic algorithm you can use for finding all paths which avoids infinite loops.
First you need to represent your graph as a dictionary which maps nodes to a set of nodes they are connected to. Example:
graph = {'A':{'B','C'}, 'B':{'D'}, 'C':{'D'}}
That means that from A you can go to B and C. From B you can go to D and from C you can go to D. We're assuming the links are one-way. If you want them to be two way just add links for going both ways.
If you represent your graph in that way, you can use the below function to find all paths:
def find_all_paths(start, end, graph, visited=None):
    if visited is None:
        visited = set()
    # Build a new set instead of mutating the caller's one, so that
    # sibling branches do not see each other's visited nodes.
    visited = visited | {start}
    for node in graph[start]:
        if node in visited:
            continue
        if node == end:
            yield [start, end]
        else:
            for path in find_all_paths(node, end, graph, visited):
                yield [start] + path
Example usage:
>>> graph = {'A':{'B','C'}, 'B':{'D'}, 'C':{'D'}}
>>> for path in find_all_paths('A','D', graph):
...     print(path)
...
['A', 'C', 'D']
['A', 'B', 'D']
>>>
Edit to take into account comments clarifying graph representation
Below is a function to transform your graph representation(assuming I understood it correctly and that routes are bi-directional) to the one used in the algorithm above
def change_graph_representation(graph):
    reverse_graph = {}
    for node, links in graph.items():
        for link in links:
            if link not in reverse_graph:
                reverse_graph[link] = set()
            reverse_graph[link].add(node)
    result = {}
    for node, links in graph.items():
        adj = set()
        for link in links:
            adj |= reverse_graph[link]
        adj -= {node}
        result[node] = adj
    return result
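As a rough check, here is a self-contained sketch that applies the conversion to the `Graph` dictionary from the question and then searches from A to D. Both functions are restated in compact form so the block runs on its own. In this sketch the visited set is copied on each call, so that sibling branches do not share state. The variable names `routes`, `adjacency` and `paths` are made up for the example.

```python
def change_graph_representation(graph):
    # Map each route (link) to the set of nodes that list it.
    reverse_graph = {}
    for node, links in graph.items():
        for link in links:
            reverse_graph.setdefault(link, set()).add(node)
    # Two nodes are adjacent when they share at least one route.
    result = {}
    for node, links in graph.items():
        adj = set()
        for link in links:
            adj |= reverse_graph[link]
        adj -= {node}
        result[node] = adj
    return result

def find_all_paths(start, end, graph, visited=None):
    visited = (visited or set()) | {start}  # fresh set for this branch
    for node in graph[start]:
        if node in visited:
            continue
        if node == end:
            yield [start, end]
        else:
            for path in find_all_paths(node, end, graph, visited):
                yield [start] + path

routes = {'A': ['route1', 'route2'],
          'B': ['route1', 'route2', 'route3', 'route4'],
          'C': ['route3', 'route4'],
          'D': ['route4']}
adjacency = change_graph_representation(routes)
paths = sorted(find_all_paths('A', 'D', adjacency))
print(paths)  # [['A', 'B', 'C', 'D'], ['A', 'B', 'D']]
```

Both expected solutions are found without any visit counter, because each branch of the recursion only avoids the nodes already on its own path.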
If it is important that you find the path in terms of the links, not the nodes traversed you can preserve this information like so:
def change_graph_representation(graph):
    reverse_graph = {}
    for node, links in graph.items():
        for link in links:
            if link not in reverse_graph:
                reverse_graph[link] = set()
            reverse_graph[link].add(node)
    result = {}
    for node, links in graph.items():
        adj = {}
        for link in links:
            for n in reverse_graph[link]:
                adj[n] = link
        del(adj[node])
        result[node] = adj
    return result
And use this modified search:
def find_all_paths(start, end, graph, visited=None):
    if visited is None:
        visited = set()
    # Build a new set instead of mutating the caller's one, so that
    # sibling branches do not see each other's visited nodes.
    visited = visited | {start}
    for node, link in graph[start].items():
        if node in visited:
            continue
        if node == end:
            yield [link]
        else:
            for path in find_all_paths(node, end, graph, visited):
                yield [link] + path
That will give you paths in terms of links to follow instead of nodes to traverse. Hope this helps :)