Is this the right approach to checking whether a binary tree is balanced? - Python

I have to check whether a binary tree is balanced, and I am pretty sure my solution should work.
    import sys

    class Solution:
        def isBalanced(self, root: TreeNode) -> bool:
            cache = {
                max: -sys.maxsize,  # min possible number
                min: sys.maxsize    # max possible number
            }
            self.checkBalanced(root, cache, 0)
            return cache[max] - cache[min] <= 1

        def checkBalanced(self, node, cache, depth):
            if node is None:
                if depth < cache[min]:
                    cache[min] = depth
                if depth > cache[max]:
                    cache[max] = depth
            else:
                self.checkBalanced(node.left, cache, depth + 1)
                self.checkBalanced(node.right, cache, depth + 1)
But for one of the test inputs my solution returns the wrong answer.
Here is the link to the question on LeetCode: https://leetcode.com/problems/balanced-binary-tree

Here is the definition from the link you provided:
For this problem, a height-balanced binary tree is defined as: a binary tree in which the depth of the two subtrees of every node never differ by more than 1.
For the given "bad" input your code calculates cache[max] = 5 and cache[min] = 3, so it returns False. However, if we consider the root, its left subtree has depth 4 and its right subtree has depth 3, which satisfies the definition.
You should find the depths of the left and right subtrees for every node; your code instead calculates the depth of the whole tree (your cache[max]) and the length of the shortest path to any leaf (your cache[min]).
I hope you will easily fix your code after these clarifications.
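Following the clarification above, one common way to repair this (a sketch, not the asker's original code) is to compute each subtree's height bottom-up and check the balance condition at every node, using -1 as an "unbalanced" sentinel; the TreeNode class here is defined only so the sketch is self-contained:

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def is_balanced(root):
    def height(node):
        # Returns the subtree's height, or -1 if it is already unbalanced.
        if node is None:
            return 0
        left = height(node.left)
        right = height(node.right)
        if left == -1 or right == -1 or abs(left - right) > 1:
            return -1
        return 1 + max(left, right)
    return height(root) != -1

# A left-skewed chain of three nodes is not height-balanced.
skewed = TreeNode(1, TreeNode(2, TreeNode(3)))
```

This checks every node in a single O(n) pass instead of comparing only the global deepest and shallowest leaf depths.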

How can I optimise the solution to avoid a memory-limit-exceeded error, and what might be causing the error?

I came across the following problem.
You are given the root of a binary tree with n nodes.
Each node is uniquely assigned a value from 1 to n.
You are also given an integer startValue representing
the value of the start node s,
and a different integer destValue representing
the value of the destination node t.
Find the shortest path starting from node s and ending at node t.
Generate step-by-step directions of such path as a string consisting of only the
uppercase letters 'L', 'R', and 'U'. Each letter indicates a specific direction:
'L' means to go from a node to its left child node.
'R' means to go from a node to its right child node.
'U' means to go from a node to its parent node.
Return the step-by-step directions of the shortest path from node s to node t
Example 1:
Input: root = [5,1,2,3,null,6,4], startValue = 3, destValue = 6
Output: "UURL"
Explanation: The shortest path is: 3 → 1 → 5 → 2 → 6.
Example 2:
Input: root = [2,1], startValue = 2, destValue = 1
Output: "L"
Explanation: The shortest path is: 2 → 1.
I created a solution by finding the lowest common ancestor and then doing a depth-first search to find the elements, like this:
    # Definition for a binary tree node.
    # class TreeNode(object):
    #     def __init__(self, val=0, left=None, right=None):
    #         self.val = val
    #         self.left = left
    #         self.right = right
    class Solution(object):
        def getDirections(self, root, startValue, destValue):
            """
            :type root: Optional[TreeNode]
            :type startValue: int
            :type destValue: int
            :rtype: str
            """
            def lca(root):
                if root is None or root.val == startValue or root.val == destValue:
                    return root
                left = lca(root.left)
                right = lca(root.right)
                if left and right:
                    return root
                return left or right

            def dfs(root, value, path):
                if root is None:
                    return ""
                if root.val == value:
                    return path
                return dfs(root.left, value, path + "L") + dfs(root.right, value, path + "R")

            root = lca(root)
            return "U" * len(dfs(root, startValue, "")) + dfs(root, destValue, "")
The solution runs fine, however for a very large input it throws a "Memory Limit Exceeded" error. Can anyone tell me how I can optimise the solution, or what I might be doing that gets me into it?
The reason you're getting a memory-limit-exceeded error is the arguments to the dfs function. Your path variable is a string that can be as long as the height of the tree (which can be the size of the whole tree if it's unbalanced).
Normally that wouldn't be a problem, but path + "L" creates a new string for every recursive call of the function. Besides being very slow, this means that your memory usage is O(n^2), where n is the number of nodes in the tree.
For example, if your final path is "L" * 1000, your call stack for dfs will look like this:
Depth 0: dfs(root, path = "")
Depth 1: dfs(root.left, path = "L")
Depth 2: dfs(root.left.left, path = "LL")
...
Depth 999: path = "L"*999
Depth 1000: path = "L"*1000
Despite all those variables being called path, they are all completely different strings, for a total memory usage of ~(1000*1000)/2 = 500,000 characters at one time. With one million nodes, this is half a trillion characters.
Now, this doesn't happen just because strings are immutable; in fact, even if you were using lists (which are mutable), you'd still have this problem, as path + ["L"] would still be forced to create a copy of path.
To solve this, you need to have exactly one variable for the path stored outside of the dfs function, and only append to it from the recursive dfs function. This will ensure you only ever use O(n) space.
    def dfs(root, value, path):
        if root is None:
            return False
        if root.val == value:
            return True
        if dfs(root.left, value, path):
            path.append("L")
            return True
        elif dfs(root.right, value, path):
            path.append("R")
            return True
        return False

    root = lca(root)
    start_to_root = []
    dfs(root, startValue, start_to_root)
    dest_to_root = []
    dfs(root, destValue, dest_to_root)
    return "U" * len(start_to_root) + ''.join(reversed(dest_to_root))
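Putting the append-on-unwind dfs together with the asker's lca helper gives a self-contained sketch; the TreeNode class, the free-function form, and the tree construction below are for illustration only:

```python
class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val, self.left, self.right = val, left, right

def get_directions(root, startValue, destValue):
    def lca(node):
        # Same idea as the asker's lca helper.
        if node is None or node.val in (startValue, destValue):
            return node
        left, right = lca(node.left), lca(node.right)
        return node if left and right else left or right

    def dfs(node, value, path):
        # Letters are appended while unwinding, so `path` runs node -> ancestor.
        if node is None:
            return False
        if node.val == value:
            return True
        if dfs(node.left, value, path):
            path.append("L")
            return True
        if dfs(node.right, value, path):
            path.append("R")
            return True
        return False

    anchor = lca(root)
    up, down = [], []
    dfs(anchor, startValue, up)
    dfs(anchor, destValue, down)
    return "U" * len(up) + "".join(reversed(down))

# Example 1 from the question: root = [5,1,2,3,null,6,4]
root = TreeNode(5, TreeNode(1, TreeNode(3)), TreeNode(2, TreeNode(6), TreeNode(4)))
```

Since each recursive call shares the same list object, total extra memory is O(h) for the call stack plus O(h) for the path, instead of O(n^2) for the string copies.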

Maximum depth of a binary tree - do we need a 'holder' to keep track of the current maximum depth?

I am writing code to solve the following LeetCode problem: https://leetcode.com/problems/maximum-depth-of-binary-tree/
Here is the iterative solution that passes all the tests:
    def maxDepth(root):
        stack = []
        if not root:
            return 0
        if root:
            stack.append((1, root))
        depth = 0
        while stack:
            current_depth, root = stack.pop()
            depth = max(current_depth, depth)
            if root.left:
                stack.append((current_depth + 1, root.left))
            if root.right:
                stack.append((current_depth + 1, root.right))
        return depth
I do understand on the whole what is happening, but my question is with depth = max(current_depth,depth). Am I right in understanding that the only purpose of 'depth' is to act as a holder to hold the current maximum depth as we traverse the tree?
Because when reading the code initially, the first thing that struck me is why not ONLY have current_depth? But then it hit me that we need to store the current_depth somewhere and only keep the largest. Am I right on this point?
my question is with depth = max(current_depth,depth). Am I right in understanding that the only purpose of 'depth' is to act as a holder to hold the current maximum depth as we traverse the tree?
Yes, that is correct. Maybe it helps to clarify this point if you replace that line with this equivalent code:
    if current_depth > depth:
        depth = current_depth
we need to store the current_depth somewhere and only keep the largest. Am I right on this point?
Yes, that is correct. During the execution of the algorithm, current_depth fluctuates up and down as you move through the tree. You might hope to drop the current_depth variable and rely on len(stack) instead, but be careful: with this traversal the stack also holds pending siblings from shallower levels, so its length is not in general the depth of the node being popped. For the idea to work, the stack must contain exactly the path from the root down to the current node, so that the outcome of the algorithm really is the maximum size the stack reaches during the whole execution. One way to arrange that is to keep one iterator of children per level of the current path:

    def maxDepth(root):
        if not root:
            return 0
        stack = [iter([root])]  # one child-iterator per level of the current path
        depth = 0
        while stack:
            node = next(stack[-1], None)
            if node is None:
                stack.pop()  # this level is exhausted; go back up
                continue
            depth = max(len(stack), depth)
            children = [c for c in (node.left, node.right) if c]
            if children:
                stack.append(iter(children))
        return depth
Recursive versions
The original code you presented really is an almost literal conversion of a recursive function to an iterative function, introducing an explicit stack variable instead of the call stack frames you would produce in a recursive version.
It may also help to see the recursive implementation that this code mimics:
    def maxDepth(root):
        if not root:
            return 0
        depth = 0
        def dfs(current_depth, root):  # <-- these variables live on THE stack
            nonlocal depth
            depth = max(current_depth, depth)
            if root.left:
                dfs(current_depth + 1, root.left)
            if root.right:
                dfs(current_depth + 1, root.right)
        dfs(1, root)
        return depth
And moving the three similar if statements one level deeper in the recursion tree, so that there is only one if, we get:
    def maxDepth(root):
        depth = 0
        def dfs(current_depth, root):
            nonlocal depth
            if root:
                depth = max(current_depth, depth)
                dfs(current_depth + 1, root.left)
                dfs(current_depth + 1, root.right)
        dfs(1, root)
        return depth
It is essentially the same algorithm, but it may help clarify what is happening.
We can turn this into a more functional version, which makes dfs return the depth value: that way you can avoid the nonlocal trick to mutate the depth value from inside that function:
    def maxDepth(root):
        def dfs(current_depth, root):
            return max(current_depth,
                       dfs(current_depth + 1, root.left),
                       dfs(current_depth + 1, root.right)
                       ) if root else current_depth
        return dfs(0, root)
And now we can even merge that inner function with the outside function, by providing it an optional argument (current_depth) -- it should not be provided in the main call of maxDepth:
    def maxDepth(root, current_depth=0):
        return max(current_depth,
                   maxDepth(root.left, current_depth + 1),
                   maxDepth(root.right, current_depth + 1)
                   ) if root else current_depth
And finally, the most elegant solution is to make maxDepth return the depth of the subtree that it is given, so without any context of the larger tree. In that case it is no longer necessary to pass a current_depth argument. The 1 is added after the recursive call is made, to account for the parent node:
    def maxDepth(root):
        return 1 + max(
            maxDepth(root.left), maxDepth(root.right)
        ) if root else 0
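As a quick sanity check, all of the versions above should agree on a small tree; the minimal Node class here is assumed for illustration only:

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def maxDepth(root):
    # The final recursive form: a subtree's depth is one more than the
    # deeper of its children's depths; an empty subtree has depth 0.
    return 1 + max(
        maxDepth(root.left), maxDepth(root.right)
    ) if root else 0

tree = Node(Node(Node()), Node())  # longest root-to-leaf path has 3 nodes
```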

Binary Search Trees Max Height

    def getHeight(self, root):
        if not root:
            return -1
        else:
            return max(self.getHeight(root.right), self.getHeight(root.left)) + 1
I was doing a Python problem related to binary search trees, and the instructor wrote a line of code like this:
return (max(self.getHeight(root.right), self.getHeight(root.left)) + 1)
I am not sure how this works. Could someone explain to me why it is like this?
This is a recursive method to find the height of a BST.

    if not root:
        return -1

The above is the base condition: when the recursion steps past a leaf node into an empty (None) child, the function returns -1.
The return statement can be written more explicitly as:

    left_height = self.getHeight(root.left)
    right_height = self.getHeight(root.right)
    return 1 + max(right_height, left_height)

You get the heights of the left and right subtrees recursively, then add 1 to the larger of the two to account for the current node. The empty tree is considered to have height -1, which is why a single leaf node ends up with height 0.
You can watch this simple tutorial.
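To see why the -1 base case works, it helps to trace the recursion on a tiny tree; the Node class and free-function form below are assumed for illustration:

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def get_height(root):
    # Same logic as the instructor's getHeight, written as a free function.
    if not root:
        return -1  # an empty subtree contributes height -1
    return max(get_height(root.right), get_height(root.left)) + 1

leaf = Node()                    # max(-1, -1) + 1 == 0
root = Node(Node(leaf), Node())  # left chain is deeper than the right
```

A lone leaf gets max(-1, -1) + 1 = 0, and each level above it adds one, so the root of the three-level chain reports height 2.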

Size of subtree in Python

Almost every online tutorial I see on finding the size of a subtree calls a recursive function on each child's subtree.
The problem with this in Python is that the recursion limit is hit after roughly a thousand levels, so if I theoretically had a long, linear tree, it would fail.
Is there a better way to handle this? Do I need to use a stack instead?
Do I need to use a stack instead?
Sure, that's one way of doing it.
    def iter_tree(root):
        to_explore = [root]
        while to_explore:
            node = to_explore.pop()
            yield node
            for child in node.children:
                to_explore.append(child)

    def size(root):
        count = 0
        for node in iter_tree(root):
            count += 1
        return count
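Since that snippet assumes a node type with a children list, here is a self-contained version with a minimal (assumed) Node class and a quick size check:

```python
class Node:
    def __init__(self, children=None):
        self.children = children or []

def iter_tree(root):
    # Iterative DFS: the explicit list replaces the call stack, so deep
    # trees cannot hit Python's recursion limit.
    to_explore = [root]
    while to_explore:
        node = to_explore.pop()
        yield node
        for child in node.children:
            to_explore.append(child)

def size(root):
    return sum(1 for _ in iter_tree(root))

tree = Node([Node([Node()]), Node()])  # 4 nodes in total
```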
The stack would be the easiest non-recursive way of getting the size of the subtree (the count of nodes under the given node, including the node itself):
    class Node():
        def __init__(self, value):
            self.value = value
            self.left = None
            self.right = None

    def subtree_size(root):
        visited = 0
        if not root:
            return visited
        stack = [root]
        while stack:
            node = stack.pop()
            visited += 1
            if node.left:
                stack.append(node.left)
            if node.right:
                stack.append(node.right)
        return visited
You can mirror the recursive algorithm using a stack:

    numNodes = 0
    nodeStack = [(root, 0)]  # (node, 0 = explore left next, 1 = explore right next)
    while nodeStack:
        nextNode, leftOrRight = nodeStack.pop()
        if not nextNode:  # nextNode is empty
            continue
        if leftOrRight == 0:
            numNodes += 1
            nodeStack.append((nextNode, 1))
            nodeStack.append((nextNode.leftChild, 0))
        else:
            nodeStack.append((nextNode.rightChild, 0))
    print(numNodes)
Some things to notice: this is still a depth-first search! That is, we still fully explore one subtree before starting to explore the other. What this means for you is that the amount of additional memory required is proportional to the height of the tree, not to its width. For a balanced tree the width is about 2^h, where h is the height; for a totally unbalanced tree the height equals the number of nodes while the width is one. So it all depends on what you need :)
Now it is worth mentioning that you can make a potential optimization by checking whether one of the subtrees is empty. We can change the body of if leftOrRight == 0: to:
    numNodes += 1
    if nextNode.rightChild:  # has a right child to explore
        nodeStack.append((nextNode, 1))
    nodeStack.append((nextNode.leftChild, 0))
Which potentially cuts down on memory usage :)
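A runnable version of this two-phase stack walk, with the empty-subtree optimization applied, might look like the following (leftChild/rightChild are spelled left/right here, and the Node class is assumed for illustration):

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def subtree_size(root):
    num_nodes = 0
    node_stack = [(root, 0)]  # (node, 0 = explore left next, 1 = explore right next)
    while node_stack:
        next_node, left_or_right = node_stack.pop()
        if not next_node:
            continue
        if left_or_right == 0:
            num_nodes += 1
            if next_node.right:  # only revisit this node if a right child exists
                node_stack.append((next_node, 1))
            node_stack.append((next_node.left, 0))
        else:
            node_stack.append((next_node.right, 0))
    return num_nodes

tree = Node(Node(Node()), Node())  # 4 nodes in total
```

Because each node re-enters the stack at most once (and only when it has a right child), the stack stays proportional to the height of the tree.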

What's wrong with this least common ancestor algorithm?

I was asked the following question in a job interview:
Given a root node (to a well formed binary tree) and two other nodes (which are guaranteed to be in the tree, and are also distinct), return the lowest common ancestor of the two nodes.
I didn't know any least common ancestor algorithms, so I tried to make one on the spot. I produced the following code:
    def least_common_ancestor(root, a, b):
        lca = [None]
        def check_subtree(subtree, lca=lca):
            if lca[0] is not None or subtree is None:
                return 0
            if subtree is a or subtree is b:
                return 1
            else:
                ans = sum(check_subtree(n) for n in (subtree.left, subtree.right))
                if ans == 2:
                    lca[0] = subtree
                    return 0
                return ans
        check_subtree(root)
        return lca[0]

    class Node:
        def __init__(self, left, right):
            self.left = left
            self.right = right
I tried the following test cases and got the answer that I expected:
    a = Node(None, None)
    b = Node(None, None)
    tree = Node(Node(Node(None, a), b), None)
    tree2 = Node(a, Node(Node(None, None), b))
    tree3 = Node(a, b)
but my interviewer told me that "there is a class of trees for which your algorithm returns None." I couldn't figure out what it was and I flubbed the interview. I can't think of a case where the algorithm would make it to the bottom of the tree without ans ever becoming 2 -- what am I missing?
You forgot to account for the case where a is a direct ancestor of b, or vice versa. You stop searching as soon as you find either node and return 1, so you'll never find the other node in that case.
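One common repair (a sketch, not the asker's exact code) is to return the matching node itself and let a node that sees matches in both subtrees become the answer; if a is an ancestor of b, the search then stops at a and correctly reports it. The Node class below mirrors the one in the question:

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def least_common_ancestor(root, a, b):
    # A node that finds a match in both subtrees is the LCA; if a is an
    # ancestor of b, the recursion returns a before ever reaching b.
    if root is None or root is a or root is b:
        return root
    left = least_common_ancestor(root.left, a, b)
    right = least_common_ancestor(root.right, a, b)
    if left and right:
        return root
    return left or right
```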
You were given a well-formed binary search tree; one of the properties of such a tree is that you can easily locate elements based on how their keys compare to the current node: smaller elements go into the left subtree, greater ones into the right. As such, if you know that both elements are in the tree, you only need to compare keys; as soon as you find a node whose key lies between the two target keys, or is equal to one of them, you have found the lowest common ancestor.
Your sample nodes never included the keys stored in the tree, so you cannot make use of this property here, but if you did, you'd use:

    def lca(tree, a, b):
        if a.key > b.key:
            a, b = b, a  # ensure a.key <= b.key
        if a.key <= tree.key <= b.key:
            return tree
        if a.key < tree.key and b.key < tree.key:
            return lca(tree.left, a, b)
        return lca(tree.right, a, b)
If the tree is merely a 'regular' binary tree, and not a search tree, your only option is to find the paths for both elements and find the point at which these paths diverge.
If your binary tree maintains parent references and node depths, this can be done efficiently: simply walk the deeper of the two nodes upwards until both are at the same depth, then continue upwards from both nodes in lockstep until you reach a common node; that is the lowest common ancestor.
If you don't have parent references, you'll have to find the paths to both nodes with separate searches starting from the root, then find the last common node in those two paths.
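The path-based approach for a plain binary tree can be sketched like this (the Node class and identity-based comparison are assumed for illustration):

```python
class Node:
    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right

def path_to(root, target, path):
    # Record the root-to-target node path in `path`; returns True if found.
    if root is None:
        return False
    path.append(root)
    if (root is target
            or path_to(root.left, target, path)
            or path_to(root.right, target, path)):
        return True
    path.pop()  # target is not in this subtree; backtrack
    return False

def lca_by_paths(root, a, b):
    pa, pb = [], []
    if not (path_to(root, a, pa) and path_to(root, b, pb)):
        return None
    lca = None
    for x, y in zip(pa, pb):
        if x is not y:
            break
        lca = x  # last node common to both root-to-node paths
    return lca
```

Each search is O(n), so this works without parent pointers at the cost of two full traversals in the worst case.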
You are missing the case where a is an ancestor of b.
Look at this simple counter-example:

        a
       / \
      b  None

Here a is also the root, so invoking check_subtree(root) starts at a itself. The stop clause that returns 1 fires immediately, and the function returns 1 without ever setting lca, as it should have.
