Almost every online tutorial I've seen on finding the size of a subtree involves calling a recursive function on each child's subtree.
The problem with this in Python is that recursion hits the interpreter's recursion limit (about 1000 frames by default) and raises a RecursionError, so a long, linear tree would make it fail.
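For reference, the recursive approach those tutorials use looks roughly like this (a sketch, assuming each node exposes a children list):

def size_recursive(node):
    # one stack frame per level of depth: a long, linear tree
    # exceeds Python's default recursion limit (about 1000 frames)
    if node is None:
        return 0
    return 1 + sum(size_recursive(child) for child in node.children)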
Is there a better way to handle this? Do I need to use a stack instead?
Sure, that's one way of doing it.
def iter_tree(root):
    to_explore = [root]
    while to_explore:
        node = to_explore.pop()
        yield node
        for child in node.children:
            to_explore.append(child)

def size(root):
    count = 0
    for node in iter_tree(root):
        count += 1
    return count
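A quick usage sketch, with a hypothetical minimal TreeNode class just for illustration:

class TreeNode:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

# a root with two children; one of those children has a child of its own
root = TreeNode("a", [TreeNode("b", [TreeNode("d")]), TreeNode("c")])
print(size(root))  # prints 4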
A stack is the easiest non-recursive way of getting the size of a subtree (the count of nodes under the given node, including that node itself):
class Node():
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def subtree_size(root):
    visited = 0
    if not root: return visited
    stack = [root]
    while stack:
        node = stack.pop()
        visited += 1
        if node.left: stack.append(node.left)
        if node.right: stack.append(node.right)
    return visited
You can mirror the recursive algorithm using a stack:
numNodes = 0
nodeStack = [(root, 0)]  # (node, 0 means explore left, 1 means explore right)
while nodeStack:
    nextNode, leftOrRight = nodeStack.pop()
    if not nextNode:  # nextNode is empty
        continue
    if leftOrRight == 0:
        numNodes += 1
        nodeStack.append((nextNode, 1))
        nodeStack.append((nextNode.leftChild, 0))
    else:
        nodeStack.append((nextNode.rightChild, 0))
print(numNodes)
Some things to notice: this is still a depth-first search! That is, we still fully explore one subtree before starting on the other. What this means for you is that the amount of additional memory required is proportional to the height of the tree, not its width. For a balanced tree the width is roughly 2^h, where h is the height of the tree. For a totally unbalanced tree the height equals the number of nodes in the tree, whereas the width is one! So it all depends on what you need :)
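For contrast, here is a hedged sketch of a breadth-first version using collections.deque (same leftChild/rightChild naming as above); its extra memory grows with the tree's width rather than its height:

from collections import deque

def subtree_size_bfs(root):
    # level-order traversal: the queue can hold an entire level at once,
    # so the extra memory is proportional to the tree's width, not its height
    if root is None:
        return 0
    count = 0
    queue = deque([root])
    while queue:
        node = queue.popleft()
        count += 1
        if node.leftChild:
            queue.append(node.leftChild)
        if node.rightChild:
            queue.append(node.rightChild)
    return count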
Now it is worth mentioning that you can make a potential optimization by checking whether one of the subtrees is empty. We can change the body of if leftOrRight == 0: to:
numNodes += 1
if nextNode.rightChild:  # has a right child to explore
    nodeStack.append((nextNode, 1))
nodeStack.append((nextNode.leftChild, 0))
Which potentially cuts down on memory usage :)
I wrote this algorithm for a coding challenge on HackerRank to determine whether a given binary tree is a BST. Yet in some cases when the tree is not a BST, my algorithm returns True anyway. I couldn't find what was wrong; everything seems OK, right? Or is there something I don't know about BSTs?
Node is defined as:
class node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None
My algorithm is
def checkBST(root):
    if not root:
        return True
    else:
        if root.left and root.left.data >= root.data:
            return False
        if root.right and root.right.data <= root.data:
            return False
        return checkBST(root.left) and checkBST(root.right)
In a binary search tree, every node in the left subtree is less than the parent node, and every node in the right subtree is greater than it, not just the immediate children.
So your code fails on a case like this:
    5
   / \
  4   7
 / \
2   6
It's not a valid BST because if you searched for 6, you'd follow the right branch from the root, and subsequently fail to find it.
Your code gives an incorrect answer for trees like the example above. Where you are going wrong:
1. Your check misses the case where the left subtree contains a node with a value greater than the root.
2. It also misses the case where the right subtree contains a node with a value smaller than the root.
You should try this approach:
def checkBST(root):
    if not root:
        return True
    else:
        # maximumOfSubtree/minimumOfSubtree return the largest/smallest
        # value anywhere in the subtree (sketched below)
        if root.left and maximumOfSubtree(root.left) >= root.data:
            return False
        if root.right and minimumOfSubtree(root.right) <= root.data:
            return False
        return checkBST(root.left) and checkBST(root.right)
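maximumOfSubtree and minimumOfSubtree are not defined above; one possible sketch of these helpers (recursive, returning the extreme value anywhere in the subtree):

def maximumOfSubtree(node):
    # largest value anywhere in the subtree rooted at node
    best = node.data
    if node.left:
        best = max(best, maximumOfSubtree(node.left))
    if node.right:
        best = max(best, maximumOfSubtree(node.right))
    return best

def minimumOfSubtree(node):
    # smallest value anywhere in the subtree rooted at node
    best = node.data
    if node.left:
        best = min(best, minimumOfSubtree(node.left))
    if node.right:
        best = min(best, minimumOfSubtree(node.right))
    return best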
So the problem is to determine whether the given tree is a BST or not.
A simple way to find out is through an in-order traversal: do an in-order traversal of the given tree and store the result in a temporary list, then check whether that list is sorted in strictly ascending order (no duplicates). If it is, the tree is a BST.
This can be the approach:
def check_binary_search_tree_(root):
    visited = []

    def traverse(node):
        # in-order traversal: left subtree, node, right subtree
        if node.left: traverse(node.left)
        visited.append(node.data)
        if node.right: traverse(node.right)

    traverse(root)
    # reject duplicate values
    fc = {}
    for i in visited:
        if i in fc:
            return False
        else:
            fc[i] = 1
    # the in-order sequence of a BST must be sorted
    m = sorted(visited)
    if visited == m:
        return True
    return False
Refer to this article for other methods: https://www.geeksforgeeks.org/a-program-to-check-if-a-binary-tree-is-bst-or-not/
Methods 1 and 2 there are similar to your approach, so it will help you understand that too.
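Another common approach, sketched here independently of that article's numbering: pass the allowed range of values down the recursion, so every node is checked against all of its ancestors at once (checkBST_bounds is an illustrative name):

def checkBST_bounds(root, lower=float("-inf"), upper=float("inf")):
    # every node's value must lie strictly between the bounds
    # inherited from its ancestors
    if not root:
        return True
    if not (lower < root.data < upper):
        return False
    return (checkBST_bounds(root.left, lower, root.data) and
            checkBST_bounds(root.right, root.data, upper))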
Most of the questions I've searched for regarding binary trees show implementations of binary search trees, but not plain binary trees. The properties of a complete binary tree are:
Either an empty tree or it has 1 node with 2 children, where each child is another binary tree.
All levels are full (except for possibly the last level).
All leaves on the bottom-most level are as far left as possible.
I've come up with a concept, but it doesn't seem to run through the recursion properly. Does anyone know what I'm doing wrong?
class Node():
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

    def add(self, key):
        if self.key:
            if self.left is None:
                self.left = Node(key)
            else:
                self.left.add(key)
            if self.right is None:
                self.right = Node(key)
            else:
                self.right.add(key)
        else:
            self.key = key
        return (self.key)
The problem in your code is that you are adding the same value multiple times. You add the node, and then still recurse deeper, where you do the same.
The deeper problem is that you don't really know where to insert the node before you have reached the bottom level of the tree and detected where that level is incomplete. Finding the correct insertion point may require a traversal through the whole tree, which defeats the speed gain you would expect from using binary trees in the first place.
I provide here three solutions, starting with the most efficient:
1. Using a list as tree implementation
For complete trees there is a special consideration to make: if you number the nodes by level, starting with 0 for the root, and within each level from left to right, you notice that the number of a node's parent is (k-1)//2 (integer division) when its own number is k. In the other direction: if a node with number k has children, then its left child has number k*2+1, and the right child has a number that is one greater.
Because the tree is complete, there will never be gaps in this numbering, and so you could store the nodes in a list, and use the indexes of that list for the node numbering. Adding a node to the tree now simply means you append it to that list. Instead of a Node object, you just have the tree list, and the index in that list is your node reference.
Here is an implementation:
class CompleteTree(list):
    def add(self, key):
        self.append(key)
        return len(self) - 1

    def left(self, i):
        return i * 2 + 1 if i * 2 + 1 < len(self) else -1

    def right(self, i):
        return i * 2 + 2 if i * 2 + 2 < len(self) else -1

    @staticmethod
    def parent(i):
        return (i - 1) // 2

    def swapwithparent(self, i):
        if i > 0:
            p = self.parent(i)
            self[p], self[i] = self[i], self[p]

    def inorder(self, i=0):
        left = self.left(i)
        right = self.right(i)
        if left >= 0:
            yield from self.inorder(left)
        yield i
        if right >= 0:
            yield from self.inorder(right)

    @staticmethod
    def depth(i):
        return (i + 1).bit_length() - 1
Here is a demo that creates your example tree, and then prints the keys visited in an in-order traversal, indented by their depth in the tree:
tree = CompleteTree()
tree.add(1)
tree.add(2)
tree.add(3)
tree.add(4)
tree.add(5)
for node in tree.inorder():
    print(" " * tree.depth(node), tree[node])
Of course, this means you have to reference nodes a bit differently than you would with a real Node class, but the efficiency gain pays off.
2. Using an extra property
If you know how many nodes there are in a (sub)tree, then from the bit representation of that number, you can know where exactly the next node should be added.
For instance, in your example tree you have 5 nodes. Imagine you want to add a 6 to that tree. The root node would tell you that you currently have 5 and so you need to update it to 6. In binary that is 110. Ignoring the left-most 1-bit, the rest of the bits tell you whether to go left or right. In this case, you should go right (1) and then finally left (0), creating the node in that direction. You can do this iteratively or recursively.
Here is an implementation with recursion:
class Node():
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.count = 1

    def add(self, key):
        self.count += 1
        if self.left is None:
            self.left = Node(key)
        elif self.right is None:
            self.right = Node(key)
        # extract from the count the second-most significant bit:
        elif self.count & (1 << (self.count.bit_length() - 2)):
            self.right.add(key)
        else:
            self.left.add(key)

    def inorder(self):
        if self.left:
            yield from self.left.inorder()
        yield self
        if self.right:
            yield from self.right.inorder()

tree = Node(1)
tree.add(2)
tree.add(3)
tree.add(4)
tree.add(5)
for node in tree.inorder():
    print(node.key)
3. Without extra property
If no property can be added to Node objects, then a more extensive search is needed to find the right insertion point:
class Node():
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

    def newparent(self):
        # Finds the node that should serve as parent for a new node.
        # It returns a pair:
        #   if parent found: [-1, parent for new node]
        #   if not found: [height, left-most leaf]
        # In the latter case, the subtree is perfect, and its left-most
        # leaf is the node to be used, unless self is a right child
        # and its sibling has the insertion point.
        if self.right:
            right = self.right.newparent()
            if right[0] == -1:  # found imbalance
                return right
            left = self.left.newparent()
            if left[0] == -1:  # found imbalance
                return left
            if left[0] != right[0]:
                return [-1, right[1]]  # found imbalance
            # temporary result in perfect subtree
            return [left[0]+1, left[1]]
        elif self.left:
            return [-1, self]  # found imbalance
        # temporary result for leaf
        return [0, self]

    def add(self, key):
        _, parent = self.newparent()
        if not parent.left:
            parent.left = Node(key)
        else:
            parent.right = Node(key)

    def __repr__(self):
        s = ""
        if self.left:
            s += str(self.left).replace("\n", "\n ")
        s += "\n" + str(self.key)
        if self.right:
            s += str(self.right).replace("\n", "\n ")
        return s
tree = Node(1)
tree.add(2)
tree.add(3)
tree.add(4)
tree.add(5)
print(tree)
This recursively searches the tree from right to left to find the candidate parent of the node to be added.
For large trees this can be improved a bit by doing a binary search among the paths from root to leaf, based on the lengths of those paths, but it will still not be as efficient as the first two solutions.
You can also use scikit-learn's decision trees, as they can be set up as binary decision trees as well; see the scikit-learn documentation for details.
You really need to augment your tree in some way. Since this is not a binary search tree, the only real information you have about each node is whether or not it has a left and right child. Unfortunately, this isn't helpful in navigating a complete binary tree. Imagine a complete binary tree with 10 levels. Until the 9th level, every single node has both a left child and a right child, so you have no way of knowing which path to take down to the leaves. So the question is, what information do you add to each node? I would add the count of nodes in that tree.
Maintaining the count is easy, since every time you descend down a subtree you know to add one to the count at that node. What you want to recognize is the leftmost imperfect subtree. Every perfect binary tree has n = 2^k - 1, where k is the number of levels and n is the number of nodes. There are quick and easy ways to check if a number is 1 less than a power of two (see the first answer to this question), and in fact in a complete binary tree every node has at most one child that isn't the root of a perfect binary tree. Follow a simple rule to add nodes:
If the left child is None, set root.left = Node(key) and return
Else if the right child is None, set root.right = Node(key) and return
If one of the children of the current node is the root of an imperfect subtree, make that child the current node (descend into that subtree)
Else if the subtree sizes are unequal, make the child with the smaller subtree the current node.
Else, make the left child the current node.
By augmenting each node with the size of the subtree rooted there, you have all the information you need at every node to build a recursive solution.
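A minimal sketch of that idea, assuming each node is augmented with the size of its subtree (CountedNode and is_perfect are illustrative names, not from the question):

class CountedNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.count = 1  # number of nodes in the subtree rooted here

def is_perfect(node):
    # a perfect subtree holds 2^k - 1 nodes, i.e. count + 1 is a power of two
    return (node.count & (node.count + 1)) == 0

def add(root, key):
    node = root
    while True:
        node.count += 1
        if node.left is None:
            node.left = CountedNode(key)
            return
        if node.right is None:
            node.right = CountedNode(key)
            return
        # descend into the imperfect child if there is one; otherwise
        # into the smaller subtree, and to the left when both are equal
        if not is_perfect(node.left):
            node = node.left
        elif not is_perfect(node.right):
            node = node.right
        elif node.left.count > node.right.count:
            node = node.right
        else:
            node = node.left

With this in place, each insertion only touches one root-to-leaf path, so it stays proportional to the height of the tree.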
I have to check whether a binary tree is balanced, and I am pretty sure my solution should work.
import sys

class Solution:
    def isBalanced(self, root: TreeNode) -> bool:
        cache = {
            max: -sys.maxsize,  # min possible number
            min: sys.maxsize    # max possible number
        }
        self.checkBalanced(root, cache, 0)
        return cache[max] - cache[min] <= 1

    def checkBalanced(self, node, cache, depth):
        if node is None:
            if depth < cache[min]:
                cache[min] = depth
            if depth > cache[max]:
                cache[max] = depth
        else:
            self.checkBalanced(node.left, cache, depth+1)
            self.checkBalanced(node.right, cache, depth+1)
But for one of the test inputs my code gives the wrong answer.
Here is a link to the question on LeetCode: https://leetcode.com/problems/balanced-binary-tree
There is a definition from the link that you provided:
For this problem, a height-balanced binary tree is defined as: a binary tree in which the depth of the two subtrees of every node never differ by more than 1.
For the failing input, your code calculates cache[max] = 5 and cache[min] = 3, so it returns False. However, if we consider the root, its left subtree has depth 4 and its right subtree has depth 3, so the tree satisfies the definition.
You should compare the depths of the left and right subtrees at every node; your code instead calculates the overall depth of the tree (your cache[max]) and the length of the shortest path to any leaf (your cache[min]).
I hope you will easily fix your code after these clarifications.
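For reference, a minimal sketch of that idea: compute each subtree's height bottom-up and short-circuit with a -1 sentinel as soon as any node is unbalanced (one possible fix, not the only one):

class Solution:
    def isBalanced(self, root: TreeNode) -> bool:
        def height(node):
            # height of the subtree, or -1 if it is already unbalanced
            if node is None:
                return 0
            left = height(node.left)
            right = height(node.right)
            if left == -1 or right == -1 or abs(left - right) > 1:
                return -1
            return 1 + max(left, right)
        return height(root) != -1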
I have been studying up on algorithms and data structures and I wrote a post-order traversal for a binary tree without using recursion and using only one stack.
Here is the code:
def postorder_iterative(self):
    current = self
    s = []
    current1 = None
    done = 0

    def peek(s):
        return s[-1]

    while(not done):
        if current and (current != current1):
            s.append(current)
            current = current.leftChild
        elif(len(s) > 0):
            if peek(s).rightChild and peek(s).rightChild != current:
                current = peek(s).rightChild
            else:
                current = current1 = s.pop()
                print(current.key)
        else:
            done = 1
This code actually works but it took me forever to come up with it.
Can someone explain what is the intuitive way of thinking about this problem?
I'd like to be able to reproduce it using logic and not spend as much time as I did on it.
Post-order traversal requires that you only print the current node value after traversing both the left and right subtrees. You are using the stack to traverse the left tree only, and use the current1 variable (the last node printed) to know that you are now backing out of a right-hand side tree so you can print the current node.
I'd rename current to node, current1 to last (for last printed), remove the peek() function to just reference stack[-1] directly as tos (top of stack), and simplify your approach to:
def postorder_iterative(self):
    node, last = self, None
    stack = []
    while True:
        if node and node is not last:
            # build up the stack from the left tree
            stack.append(node)
            node = node.leftChild
        elif stack:
            # no more left-hand tree to process, is there a right-hand tree?
            tos = stack[-1]
            if tos.rightChild and tos.rightChild is not node:
                node = tos.rightChild
            else:
                # both left and right have been printed
                node = last = stack.pop()
                print(last.key)
        else:
            break
It is still hard to follow what is going on however, as the connection between last and the point where the left and right subtrees have been processed isn't all that clear.
I'd use a single stack with a state flag to track where in the process you are:
def postorder_iterative(self):
    new, left_done, right_done = range(3)  # status of node
    stack = [[self, new]]  # node, status
    while stack:
        node, status = stack[-1]
        if status == right_done:
            stack.pop()
            print(node.key)
        else:
            stack[-1][1] += 1  # from new -> left_done and left_done -> right_done
            # new -> add left child, left_done -> add right child
            next_node = [node.leftChild, node.rightChild][status]
            if next_node is not None:
                stack.append([next_node, new])  # a list, so the status can be updated in place
Nodes go through three states, simply by incrementing the state flag. They start as new, then progress to left_done and right_done, and when the top of the stack is in that last state we remove it from the stack and print its key.
While still in the new or left_done state, we add the left or right child, if present, to the stack as a new node.
Another approach pushes the right-hand subtree onto the stack before the current node. Later, when you come back to the current node by popping it from the stack, you can detect that its right-hand side still needs processing because the top of the stack holds the right-hand child. In that case you swap the top of the stack with the current node and continue from there; you'll later return to the same place, no longer have that right-hand child on the top of the stack, and so you can print:
def postorder_iterative(self):
    stack = []
    node = self
    while node or stack:
        while node:
            # traverse to the left, but add the right to the stack first
            if node.rightChild is not None:
                stack.append(node.rightChild)
            stack.append(node)
            node = node.leftChild
        # left-hand tree traversed, time to process right or print
        node = stack.pop()
        if stack and node.rightChild is stack[-1]:
            # right-hand tree present and not yet done, swap tos and node
            node, stack[-1] = stack[-1], node
        else:
            print(node.key)
            node = None
I was asked the following question in a job interview:
Given a root node (to a well formed binary tree) and two other nodes (which are guaranteed to be in the tree, and are also distinct), return the lowest common ancestor of the two nodes.
I didn't know any least common ancestor algorithms, so I tried to make one on the spot. I produced the following code:
def least_common_ancestor(root, a, b):
    lca = [None]
    def check_subtree(subtree, lca=lca):
        if lca[0] is not None or subtree is None:
            return 0
        if subtree is a or subtree is b:
            return 1
        else:
            ans = sum(check_subtree(n) for n in (subtree.left, subtree.right))
            if ans == 2:
                lca[0] = subtree
                return 0
            return ans
    check_subtree(root)
    return lca[0]

class Node:
    def __init__(self, left, right):
        self.left = left
        self.right = right
I tried the following test cases and got the answer that I expected:
a = Node(None, None)
b = Node(None, None)
tree = Node(Node(Node(None, a), b), None)
tree2 = Node(a, Node(Node(None, None), b))
tree3 = Node(a, b)
but my interviewer told me that "there is a class of trees for which your algorithm returns None." I couldn't figure out what it was and I flubbed the interview. I can't think of a case where the algorithm would make it to the bottom of the tree without ans ever becoming 2 -- what am I missing?
You forgot to account for the case where a is an ancestor of b, or vice versa. You stop searching as soon as you find either node and return 1, so you never look inside that node's subtree for the other one.
You were given a well-formed binary search tree; one of the properties of such a tree is that you can easily find elements based on their relative size to the current node: smaller elements go into the left sub-tree, greater ones into the right. As such, if you know that both elements are in the tree, you only need to compare keys; as soon as you find a node that is in between the two target nodes, or equal to one of them, you have found the lowest common ancestor.
Your sample nodes never included the keys stored in the tree, so you cannot make use of this property, but if you did, you'd use:
def lca(tree, a, b):
    # assumes a.key <= b.key; swap the two arguments first if needed
    if a.key <= tree.key <= b.key:
        return tree
    if a.key < tree.key and b.key < tree.key:
        return lca(tree.left, a, b)
    return lca(tree.right, a, b)
If the tree is merely a 'regular' binary tree, and not a search tree, your only option is to find the paths for both elements and find the point at which these paths diverge.
If your binary tree maintains parent references and depth, this can be done efficiently; simply walk up the deeper of the two nodes until you are at the same depth, then continue upwards from both nodes until you have found a common node; that is the least-common-ancestor.
If you don't have those two elements, you'll have to find the path to both nodes with separate searches, starting from the root, then find the last common node in those two paths.
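A sketch of that path-based approach for a plain binary tree with no parent pointers (helper names are mine, node attributes follow the question's Node class):

def path_to(root, target):
    # list of nodes from root down to target, or None if target is absent
    if root is None:
        return None
    if root is target:
        return [root]
    for child in (root.left, root.right):
        path = path_to(child, target)
        if path is not None:
            return [root] + path
    return None

def lca_general(root, a, b):
    path_a = path_to(root, a)
    path_b = path_to(root, b)
    lca = None
    for x, y in zip(path_a, path_b):
        if x is y:
            lca = x  # deepest common node seen so far
        else:
            break
    return lca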
You are missing the case where a is an ancestor of b.
Look at the simple counter example:
    a
   / \
  b  None
a is also given as root, and when invoking the function, you invoke check_subtree(root), which is a. You then find out that this is what you are looking for (in the stop clause that returns 1), and return 1 immediately without setting lca as it should have been.
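One possible way to patch the original check_subtree for this case (a sketch; the key change is to count matches in a node's children before deciding whether the node itself is a or b):

def least_common_ancestor(root, a, b):
    lca = [None]
    def check_subtree(subtree):
        if lca[0] is not None or subtree is None:
            return 0
        # count matches in the children first, then add this node itself
        found = sum(check_subtree(n) for n in (subtree.left, subtree.right))
        if subtree is a or subtree is b:
            found += 1
        if found == 2 and lca[0] is None:
            lca[0] = subtree
            return 0
        return found
    check_subtree(root)
    return lca[0]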