I have this Binary Tree Structure:
# A Node is an object
# - value : Number
# - children : List of Nodes
class Node:
def __init__(self, value, children):
self.value = value
self.children = children
I can easily sum the Nodes, recursively:
def sumNodesRec(root):
sumOfNodes = 0
for child in root.children:
sumOfNodes += sumNodesRec(child)
return root.value + sumOfNodes
Example Tree:
exampleTree = Node(1,[Node(2,[]),Node(3,[Node(4,[Node(5,[]),Node(6,[Node(7,[])])])])])
sumNodesRec(exampleTree)
> 28
However, I'm having difficulty figuring out how to sum all the nodes iteratively. Normally, with a binary tree that has 'left' and 'right' in the definition, I can find the sum. But, this definition is tripping me up a bit when thinking about it iteratively.
Any help or explanation would be great. I'm trying to make sure I'm not always doing things recursively, so I'm trying to practice creating normally recursive functions as iterative types, instead.
If we're talking iteration, this is a good use case for a queue.
total = 0
queue = [exampleTree]
while queue:
v = queue.pop(0)
queue.extend(v.children)
total += v.value
print(total)
28
This is a common idiom. Iterative graph traversal algorithms also work in this manner.
You can simulate stacks/queues using python's vanilla lists. Other (better) alternatives would be the collections.deque structure in the standard library. I should explicitly mention that its enque/deque operations are more efficient than what you'd expect from a vanilla list.
Iteratively you can create a list, stack, queue, or other structure that can hold the items you run through. Put the root into it. Start going through the list, take an element and add its children into the list also. Add the value to the sum. Take next element and repeat. This way there’s no recursion but performance and memory usage may be worse.
In response to the first answer:
def sumNodes(root):
current = [root]
nodeList = []
while current:
next_level = []
for n in current:
nodeList.append(n.value)
next_level.extend(n.children)
current = next_level
return sum(nodeList)
Thank you! That explanation helped me think through it more clearly.
Related
I am trying to get a list of values of the tree using DFS but I only get root's value. :(
def depthFirstSearch(root):
output = []
if root:
output.append(root.value)
depthFirstSearch(root.left)
depthFirstSearch(root.right)
return output
A bit late perhaps, but if you look at what you do it makes sense that only your root is returned.
Specifically, you append the root.value in the first iteration, then you run depthFirstSearch for both children (in a binary tree, presumably), but here's the thing: You just discard the result. It will probably work if you concatenate the results from both recursive calls to the output before returning.
That would get you something like:
def depthFirstSearch(root):
output = []
if root:
output.append(root.value)
# use the += operator to concatenate the current output list and the new one from the subtree
output += depthFirstSearch(root.left)
output += depthFirstSearch(root.right)
return output
I solved an exercise where I had to apply a recursive algorithm to a tree that's so defined:
class GenericTree:
""" A tree in which each node can have any number of children.
Each node is linked to its parent and to its immediate sibling on the right
"""
def __init__(self, data):
self._data = data
self._child = None
self._sibling = None
self._parent = None
I had to concatenate the data of the leaves with the data of the parents and so on until we arrive to the root that will have the sum of all the leaves data. I solved it in this way and it works but it seems very tortuous and mechanic:
def marvelous(self):
""" MODIFIES each node data replacing it with the concatenation
of its leaves data
- MUST USE a recursive solution
- assume node data is always a string
"""
if not self._child: #If there isn't any child
self._data=self._data #the value remains the same
if self._child: #If there are children
if self._child._child: #if there are niece
self._child.marvelous() #reapply the function to them
else: #if not nieces
self._data=self._child._data #initializing the name of our root node with the name of its 1st son
#if there are other sons, we'll add them to the root name
if self._child._sibling: #check
current=self._child._sibling #iterating through the sons-siblings line
while current:
current.marvelous() #we reapplying the function to them to replacing them with their concatenation (bottom-up process)
self._data+=current._data #we sum the sibling content to the node data
current=current._sibling #next for the iteration
#To add the new names to the new root node name:
self._data="" #initializing the root str value
current=self._child #having the child that through recursion have the correct str values, i can sum all them to the root node
while current:
self._data+=current._data
current=current._sibling
if self._sibling: #if there are siblings, they need to go through the function themselves
self._sibling.marvelous()
Basically I check if the node tree passed has children: if not, it remains with the same data.
If there are children, I check if there are nieces: in this case I restart the algorithm until I can some the leaves to the pre-terminal nodes, and I sum the leaves values to put that sum to their parents'data.
Then, I act on the root node with the code after the first while loop, so to put its name as the sum of all the leaves.
The final piece of code serves as to make the code ok for the siblings in each step.
How can I improve it?
It seems to me that your method performs a lot of redundant recursive calls.
For example this loop in your code:
while current:
current.marvelous()
self._data += current._data
current = current._sibling
is useless because the recursive call will be anyway performed by the last
instruction in your method (self._sibling.marvelous()). Besides,
you update self._data and then right after the loop you reset
self._data to "".
I tried to simplify it and came up with this solution that seems to
work.
def marvelous(self):
if self.child:
self.child.marvelous()
# at that point we know that the data for all the tree
# rooted in self have been computed. we collect these
self.data = ""
current = self.child
while current:
self.data += current.data
current = current.sibling
if self.sibling:
self.sibling.marvelous()
And here is a simpler solution:
def marvelous2(self):
if not self.child:
result = self.data
else:
result = self.child.marvelous2()
self.data = result
if self.sibling:
result += self.sibling.marvelous2()
return result
marvelous2 returns the data computed for a node and all its siblings. This avoids performing the while loop of the previous solution.
I have a tree of Python objects. The tree is defined intrinsically: each object has a list (potentially empty) of children.
I would like to be able to print a list of all paths from the root to each leaf.
In the case of the tree above, this would mean:
result = [
[Node_0001,Node_0002,Node_0004],
[Node_0001,Node_0002,Node_0005,Node_0007],
[Node_0001,Node_0003,Node_0006],
]
The nodes must be treated as objects and not as integers (only their integer ID is displayed).
I don't care about the order of branches in the result. Each node has an arbitrary number of children, and the level of recursion is not fixed either.
I am trying a recursive approach:
def get_all_paths(node):
if len(node.children)==0:
return [[node]]
else:
return [[node] + get_all_paths(child) for child in node.children]
but I end-up with nested lists, which is not what I want:
[[Node_0001,
[Node_0002, [Node_0004]],
[Node_0002, [Node_0005, [Node_0007]]]],
[Node_0001, [Node_0003, [Node_0006]]]]
Any help would be gladly welcomed, this problem is driving me crazy :p
Thanks
I think this is what you are trying:
def get_all_paths(node):
if len(node.children) == 0:
return [[node]]
return [
[node] + path for child in node.children for path in get_all_paths(child)
]
For each child of a node, you should take all paths of the child and prepend the node itself to each path. You prepended the node to the list of paths, not every path individually.
Say I'm given a tuple of strings, representing relationships between objects, for example:
connections = ("dr101-mr99", "mr99-out00", "dr101-out00", "scout1-scout2","scout3-scout1", "scout1-scout4", "scout4-sscout", "sscout-super")
each dash "-" shows a relationship between the two items in the string. Then I'm given two items:
first = "scout2"
second = "scout3"
How might I go about finding if first and second are interrelated, meaning I could find a path that connects them, not necessarily if they are just in a string group.
You can try concatenating the strings and using the in operator to check if it is an element of the tuple connections:
if first + "-" + second in connections:
# ...
Edit:
You can also use the join() function:
if "-".join((first, second)) in connections:
# ...
If you plan on doing this any number of times, I'd consider frozensets...
connections_set = set(frozenset(c.split('-')) for c in connections)
Now you can do something like:
if frozenset((first, second)) in connections_set:
...
and you have an O(1) solution (plus the O(N) upfront investment). Note that I'm assuming the order of the pairs is irrelevant. If it's relevant, just use a tuple instead of frozenset and you're good to go.
If you actually need to walk through a graph, an adjacency list implementation might be a little better.
from collections import defaultdict
adjacency_dict = defaultdict(list)
for c in connections:
left, right = c.split('-')
adjacency_dict[left].append(right)
# if undirected: adjacency_dict[right].append(left)
class DFS(object):
def __init__(self, graph):
self.graph = graph
def is_connected(self, node1, node2):
self._seen = set()
self._walk_connections(node1)
output = node2 in self._seen
del self._seen
return output
def _walk_connections(self, node):
if node in self._seen:
return
self._seen.add(node)
for subnode in self.graph[node]:
self._walk_connections(subnode)
print DFS(adjacency_dict).is_connected()
Note that this implementation is definitely suboptimal (I don't stop when I found the node I'm looking for for example) -- and I don't check for an optimal path from node1 to node2. For that, you'd want something like Dijkstra's algorithm
You could use a set of pairs (tuples):
connections = {("dr101", "mr99"), ("mr99", "out00"), ("dr101", "out00")} # ...
if ("scout2", "scout3") in connections:
print "scout2-scout3 in connections"
This only works if the 2 elements are already in the right order, though, because ("scout3", "scout2") != ("scout2", "scout3"), but maybe this is what you want.
If the order of the items in the connection is not significant, you can use a set of frozensets instead (see mgilson's answer). Then you can look up pairs of item regardless of which order they appear in, but the order of the original pairs in connections is lost.
I have two trees in python. I need to compare them in a customized way according to the following specifications. Suppose I have a tree for entity E1 and a tree for entity E2. I need to traverse both the trees starting from E1 and E2 and moving upwards till I get to a common root. (Please note that I have to start the traversal from node E1 on the first tree and node E2 on the second tree.) Then I need to compare the count of the lengths of both their paths.
Can someone provide me an insight as to how to do this in Python? Can the classical tree traversal algorithms be useful here?
This is not a "traveral" (which visits each node in a tree); what you describe is merely following the parents.
The algorithm would look as follows. One of the tricks is to interleave calculations, so that the algorithm is guaranteed to terminate quickly. You also have to consider cases like node1==node2. This is also an O(A+B=N) rather than O(A*B=N^2) algorithm, where we consider the distance between the nodes and their youngest common ancestor.
def findYoungestCommonAncestor(node1, node2):
visited = set()
if node1==node2:
return node1
while True:
if node1 in visited:
return node1
if node2 in visited:
return node2
if not node1.parent and not node2.parent:
return None
if node1.parent:
visited.add(node1)
node1 = node1.parent
if node2.parent:
visited.add(node2)
node2 = node2.parent
The nodes node1 and node2 may be part of a forest (a set of interlinked trees) and this should still work.
More elegant would be something like:
def ancestors(node):
"""
an iterator of node, node.parent, node.parent.parent...
"""
yield node
while node.parent:
yield node
node = node.parent
def interleave(*iters):
"""
interleave(range(3), range(10,16)) -> 0,10,1,11,2,12,13,14,15
"""
ignore = object()
for tuple in zip_longest(*iters, fillvalue=ignore):
for x in tuple:
if not x is ignore:
yield x
def findYoungestCommonAncestor(node1, node2):
# implementation: find first repeated value in interleaved ancestors
visited = set()
for node in interleave(ancestors(node1), ancestors(node2)):
if node in visited:
return node
else:
visited.add(node)
That they are trees is not even relevant to the solution. You're looking for how long it takes for two single-linked (the parent link) lists to converge into the same list.
Simply follow the links, but keep a length count for each visited node. Once you reach an already visited node, sum the previously found count and the new one. This won't work if either list ends up circular, but if they do it's not a proper tree anyway. A way to fix that case is to track separate visited dictionaries for either branch; if you reach a node visited in its own branch, you can stop traversing that branch as there's no point recounting the loop.
This all naturally assumes you can find the parent of any node. The simplest tree structures don't actually have that link.
def closest_common_ancestor(ds1, ds2):
while ds1 != None:
dd = ds2
while dd != None:
if ds1 == dd:
return dd
dd = dd.parent
ds1 = ds1.parent
return None