Maximum Binary Tree (Leetcode) - Optimal Solution Explanation? - python

I was going through the Maximum Binary Tree leetcode problem. The TL;DR is that you have an array, such as this one:
[3,2,1,6,0,5]
You're supposed to take the maximum element and make that the root of your tree. Then split the array into the part to the left of that element and the part to its right, and these are used to recursively create the left and right subtrees in the same way, respectively.
LeetCode claims that the optimal solution (shown in the "Solution" tab) uses a linear search for the maximum value of the sub-array in each recursive step. This is O(n^2) in the worst case. This is the solution I came up with, and it's simple enough.
However, I was looking through other submissions and found a linear time solution, but I've struggled to understand how it works! It looks something like this:
def constructMaximumBinaryTree(nums):
    nodes = []
    for num in nums:
        node = TreeNode(num)
        while nodes and num > nodes[-1].val:
            node.left = nodes.pop()
        if nodes:
            nodes[-1].right = node
        nodes.append(node)
    return nodes[0]
I've analysed this function and in aggregate, this appears to be linear time (O(n)), since each unique node is added to and popped from the nodes array at most once. I've tried running it with different example inputs, but I'm struggling to connect the dots and wrap my head around how this works. Can someone please explain it to me?

One way to understand the algorithm is to consider the loop invariants. In this case, the array of nodes always satisfies the condition that before and after each execution of the for-loop, either:
nodes is empty and a max binary tree does not exist (for example, if the input nums was empty)
the first item in nodes is the max binary tree based on the data processed so far from the input nums
The while-loop ensures that the current max binary tree is the first item in the nodes array, since otherwise, it would have been popped and added as a left subtree.
During each iteration of the for-loop, the check:
    if nodes:
        nodes[-1].right = node
adds the current node as a right subtree to the last item in the nodes array. And when this happens, the current node is less than the last node in the nodes array (since each input integer is defined to be unique). And since the current node is less than the last node in the array, the last node acts as a partition point whose value is greater than the current item, which is why the current node is added as a right subtree.
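When there are multiple items in the nodes array, each item is the root of the right subtree of the item to its left, so the array always holds the right spine of the tree built so far, with values in decreasing order.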
Running Time
For the running time, let n be the length of the input nums. There are n executions of the for-loop. If the input data were sorted in descending order, but with the max input value at the end of the input (such as: 4, 3, 2, 1, 5), then the inner while-loop would be skipped during each iteration until the last for-loop iteration. During the last for-loop iteration, the while-loop would run n - 1 times, for a total running time of n + (n - 1) => 2n - 1 => O(n). The same bound holds for any input: each node is appended to nodes exactly once and popped at most once, so the while-loop body runs at most n times in total across all for-loop iterations.
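To make the invariant concrete, here is a small sketch that runs the same algorithm on the example input and records the contents of nodes after each for-loop iteration. The TreeNode class and the name construct_with_trace are assumptions added for illustration (LeetCode supplies its own TreeNode), and the empty-input guard on the return is an addition too.

```python
class TreeNode:
    """Minimal stand-in for LeetCode's TreeNode (assumed shape)."""
    def __init__(self, val):
        self.val = val
        self.left = None
        self.right = None


def construct_with_trace(nums):
    """Same stack algorithm as above, plus a log of the stack after each step."""
    nodes = []
    trace = []
    for num in nums:
        node = TreeNode(num)
        # Everything smaller than num belongs in num's left subtree.
        while nodes and num > nodes[-1].val:
            node.left = nodes.pop()
        # Whatever remains on top is larger, so num goes to its right.
        if nodes:
            nodes[-1].right = node
        nodes.append(node)
        trace.append([n.val for n in nodes])
    return (nodes[0] if nodes else None), trace


root, trace = construct_with_trace([3, 2, 1, 6, 0, 5])
for step in trace:
    print(step)
# [3]
# [3, 2]
# [3, 2, 1]
# [6]
# [6, 0]
# [6, 5]
```

The stack is always decreasing, and its first item is the root of the tree for the prefix processed so far. When 6 arrives it pops 1, 2 and 3 in turn, and the last node popped (3) ends up as its left child.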

Related

How to sort N elements given a list of tuples stating their known order?

Sorry for the complicated / confusing title.
Basically I'm trying to implement a system that helps with the sorting of documents with no known date of writing.
It consists of two inputs:
Input 1: A tuple, where the first element is the number of documents, N. The second element is the number of pairs of documents with a known writing order, L. (First document was written before the second document).
Input 2: A list of L tuples. Each tuple contains two elements (documents). The first document was written before the second document. Eg: (1 2), (3 4) means that document1 was written before document2 and document3 was written before document4.
Next, the software must determine whether there is a way of sorting all documents chronologically. There are three possible outputs:
Inconclusive - Means that the temporal organization of the documents is inconsistent and there is no way of sorting them.
Incomplete - Means that information is lacking and the system can't find a way of sorting.
In case the information is enough, the system should output the order in which the documents have been written.
So far, I have managed to take both inputs, but I do not know where to start in terms of sorting the documents. Any suggestions?
Here's my code so far (Python3):
LN = tuple(int(x.strip()) for x in input("Number of docs. / Number of known pairs").split(' '))
print(LN)
if (LN[1]) > (LN[0]**2):
    print("Invalid number of pairs")
else:
    # One slot per known pair (L = LN[1]), not per document (N = LN[0]).
    A = ['none'] * LN[1]
    for i in range(LN[1]):
        t = tuple(int(x.strip()) for x in input("Pair:").split(' '))
        A[i] = t
    print(A)
I appreciate all suggestions :)
Build a directed graph. The inputs are the edges. Check for cycles which would indicate inconsistent input. Find the "leftmost" node, that is the node that doesn't have any edge to it, meaning nothing to its left. Multiple that are leftmost? Incomplete. Then, for each node in the graph, assign the index that equals the length of the longest path from the leftmost node. As there are no (directed) cycles, you could probably just do BFS starting at the leftmost node and at each step assign to the node the maximum of its current value and its value given from its parent. Then iterate through all nodes, and put the numbers in their corresponding indices. Two nodes have the same index assigned? Incomplete.
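A minimal sketch of the graph idea, with one substitution stated plainly: instead of the BFS longest-path numbering described above, it uses Kahn's topological sort, which detects cycles and ambiguity in the same pass. The function name order_documents and the string return values are assumptions made for illustration, and documents are assumed to be numbered 1 to N.

```python
from collections import deque


def order_documents(n, pairs):
    """Return the unique writing order, or 'Inconclusive' / 'Incomplete'."""
    successors = {d: [] for d in range(1, n + 1)}
    indegree = {d: 0 for d in range(1, n + 1)}
    for before, after in pairs:
        successors[before].append(after)
        indegree[after] += 1

    # Documents with no known predecessor can come next.
    ready = deque(d for d in indegree if indegree[d] == 0)
    order = []
    ambiguous = False
    while ready:
        if len(ready) > 1:
            # Two candidates for the same position: the order is not unique.
            ambiguous = True
        doc = ready.popleft()
        order.append(doc)
        for nxt in successors[doc]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) < n:
        return "Inconclusive"  # a cycle kept some documents from being placed
    if ambiguous:
        return "Incomplete"
    return order


print(order_documents(3, [(1, 2), (2, 3)]))  # [1, 2, 3]
print(order_documents(4, [(1, 2), (3, 4)]))  # Incomplete
print(order_documents(2, [(1, 2), (2, 1)]))  # Inconclusive
```

The order is unique exactly when there is a single candidate at every step, which is why checking the size of ready is enough to detect the "Incomplete" case.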

How do I compute the time and the space complexity of a recursive function?

I am currently practicing an interview question. The question is:
Given an integer array with no duplicates. A maximum tree building on this array is defined as follow:
1. The root is the maximum number in the array.
2. The left subtree is the maximum tree constructed from left part subarray divided by the maximum number.
3. The right subtree is the maximum tree constructed from right part subarray divided by the maximum number.
Construct the maximum tree by the given array and output the root node of this tree.
My solution to this question is:
def constructMaximumBinaryTree(nums):
    """
    :type nums: List[int]
    :rtype: TreeNode
    """
    if nums:
        maxIdx = nums.index(max(nums))
        root = TreeNode(nums[maxIdx])
        left = constructMaximumBinaryTree(nums[:maxIdx])
        right = constructMaximumBinaryTree(nums[maxIdx + 1:])
        root.left = left
        root.right = right
        return root
I get how it works, but I am not sure how to compute the time and space complexity. If I try to draw the solution out, the input array gets split into two, for each node until it gets empty. So, I guessed it would be something like O(log n), but I am not sure about the exact reasoning. Same with the space complexity. Any tips?
No, it's not necessarily O(n log n).
First, consider the recursion process itself: what is the worst-case (the default interpretation of "complexity") position of the splitting decision? If the given array is sorted, then the maximum element is always at the end, and your recursion degenerates into a process of removing one element on each iteration.
Second, consider the complexity of one pass through the function, recursion aside. What is the complexity of each operation in your sequence?
find max of list
find element in list
construct node
slice list
function call
slice list
function call
assignment
assignment
return root node
Many of those are O(1) operations, but several are O(n) -- where n is the length of the current list, not the original.
This results in a worst-case O(n^2). Best-case is O(n log n), as you intuited, given a perfectly balanced tree as input. The average case ... you probably don't need that any more, but it's O(n log n) with less favorable constants.
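The two cases can be written as recurrences. Here T(n) stands for the running time on a list of length n and c for a constant covering the linear work in one call (max, index and the two slices); both symbols are introduced only for this illustration.

```latex
% Worst case (sorted input): one side of the split is always empty.
T(n) = T(n-1) + cn
\;\Rightarrow\;
T(n) = T(0) + c\sum_{k=1}^{n} k = O(n^2)

% Best case (the maximum always falls in the middle): two equal halves.
T(n) = 2\,T(n/2) + cn
\;\Rightarrow\;
T(n) = O(n \log n)
```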

Counting all nodes of all paths from root to leaves

Given a tree whose nodes hold the integers 1 ~ 10, with a branching factor of 3 for all nodes, how can I write a function that traverses the tree and counts the nodes from the root to the leaves for EVERY path?
So for this example, let's say it needs to return this:
{1: 1, 2: 5}
I've tried this helper function:
def tree_lengths(t):
    temp = []
    for i in t.children:
        temp.append(1)
        temp += [e + 1 for e in tree_lengths(i)]
    return temp
There are too many errors with this code. For one, it leaves behind imprints of every visited node in the traversal in the returned list, so it's difficult to figure out which ones are the values that I need from that list. For another, if the tree is large, it does not leave behind imprints of the root and earlier nodes in the path prior to reaching the line "for i in t.children". It needs to first: duplicate all paths from the root to the leaves; second: return a list exclusively for the final number of each path count.
Please help! This is so difficult.
I'm not sure exactly what you are trying to do, but you'll likely need to define a recursive function that takes a node (the head of a tree or subtree) and an integer (the number of children you've traversed so far), and maybe a list of each visited node so far. If the node has no children, you've reached a leaf and you can print out whatever info you need. Otherwise, for each child, call this recursive function again with new parameters (+1 to count, the child node as head node, etc).
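A minimal sketch of that recursive function, assuming the wanted result maps a path length (in edges from the root) to the number of root-to-leaf paths of that length, which is one possible reading of {1: 1, 2: 5}. The Node class and the name path_length_counts are hypothetical; the only thing taken from the question is that a node exposes a children list.

```python
from collections import Counter


class Node:
    """Hypothetical tree node with a list of children."""
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []


def path_length_counts(node, depth=0, counts=None):
    """Count root-to-leaf paths, grouped by their length in edges."""
    if counts is None:
        counts = Counter()
    if not node.children:
        # Reached a leaf: one complete path of this depth.
        counts[depth] += 1
    else:
        for child in node.children:
            path_length_counts(child, depth + 1, counts)
    return counts


root = Node(1, [
    Node(2),
    Node(3, [Node(5), Node(6), Node(7)]),
    Node(4, [Node(8), Node(9)]),
])
print(dict(path_length_counts(root)))  # {1: 1, 2: 5}
```

Passing the depth down as an argument means only leaves contribute to the result, which avoids the problem of every visited node leaving an entry in the list.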

Given a list L labeled 1 to N, and a process that "removes" a random element from consideration, how can one efficiently keep track of min(L)?

The question is pretty much in the title, but say I have a list L
L = [1,2,3,4,5]
min(L) = 1 here. Now I remove 4. The min is still 1. Then I remove 2. The min is still 1. Then I remove 1. The min is now 3. Then I remove 3. The min is now 5, and so on.
I am wondering if there is a good way to keep track of the min of the list at all times without needing to do min(L) or scanning through the entire list, etc.
There is an efficiency cost to actually removing the items from the list because it has to move everything else over. Re-sorting the list each time is expensive, too. Is there a way around this?
To remove a random element you need to know what elements have not been removed yet.
To know the minimum element, you need to sort or scan the items.
A min heap implemented as an array neatly solves both problems. The cost to remove an item is O(log N) and the cost to find the min is O(1). The items are stored contiguously in an array, so choosing one at random is very easy, O(1).
The min heap is described on this Wikipedia page
BTW, if the data are large, you can leave them in place and store pointers or indexes in the min heap and adjust the comparison operator accordingly.
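A rough sketch of an array-backed min heap that supports removing the item at an arbitrary position. The class name IndexedMinHeap and its methods are assumptions for illustration; Python's heapq module does not offer removal by position, so the sifting is written out by hand.

```python
import random


class IndexedMinHeap:
    """Array-based min heap with removal at an arbitrary array position."""

    def __init__(self, items):
        self.heap = list(items)
        # Bottom-up heapify.
        for i in reversed(range(len(self.heap) // 2)):
            self._sift_down(i)

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.heap[child] < self.heap[smallest]:
                    smallest = child
            if smallest == i:
                return
            self.heap[i], self.heap[smallest] = self.heap[smallest], self.heap[i]
            i = smallest

    def _sift_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.heap[i] >= self.heap[parent]:
                return
            self.heap[i], self.heap[parent] = self.heap[parent], self.heap[i]
            i = parent

    def peek_min(self):
        return self.heap[0]  # O(1)

    def remove_at(self, i):
        """Remove and return the item stored at array position i, O(log N)."""
        removed = self.heap[i]
        last = self.heap.pop()
        if i < len(self.heap):
            # Fill the hole with the last item, then restore heap order.
            self.heap[i] = last
            self._sift_up(i)
            self._sift_down(i)
        return removed

    def remove_random(self):
        return self.remove_at(random.randrange(len(self.heap)))


heap = IndexedMinHeap([5, 3, 8, 1, 9, 2])
heap.remove_random()
print(heap.peek_min())
```

Because the items sit contiguously in self.heap, a uniformly random victim is just a random array index.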
Google for self-balancing binary search trees. Building one from the initial list takes O(n lg n) time, and finding and removing an arbitrary item will take O(lg n) (instead of O(n) for finding/removing from a simple list). The smallest item is always the leftmost node of the tree, so it can be found in O(lg n) by following left children from the root.
This question may be useful. It provides links to several implementation of various balanced binary search trees. The advice to use a hash table does not apply well to your case, since it does not address maintaining a minimum item.
Here's a solution that needs O(N lg N) preprocessing time, O(lg N) time per delete, and O(lg N * lg N) time per findMin.
Preprocessing:
step 1: sort the L
step 2: for each item L[i], map L[i]->i
step 3: Build a Binary Indexed Tree or segment tree where for every 1<=i<=length of L, BIT[i]=1 and keep the sum of the ranges.
Query type delete:
Step 1: when an item x is removed, find its index with a binary search on L (which is sorted) or from the mapping. Set BIT[index[x]] = 0 and update all the ranges. Runtime: O(lg N)
Query type findMin:
Step 1: do a binary search over the indices of the sorted array L. For every mid, compute the sum on the BIT over 1..mid. If that sum is > 0, then some item at index <= mid is still alive, so we record mid and set hi = mid - 1. Otherwise we set low = mid + 1. Runtime: O(lg^2 N)
Same can be done with Segment tree.
Edit: If I'm not wrong, each query can be processed in O(1) with a linked list.
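The Binary Indexed Tree steps above could look roughly like this. It is a sketch under the assumption that all values are distinct; the class name MinTracker and its method names are made up for illustration.

```python
class MinTracker:
    """Minimum of a fixed set of distinct values under deletions (Fenwick tree)."""

    def __init__(self, values):
        self.sorted_values = sorted(values)
        # Map each value to its 1-based position in sorted order.
        self.index_of = {v: i + 1 for i, v in enumerate(self.sorted_values)}
        self.n = len(self.sorted_values)
        self.tree = [0] * (self.n + 1)
        for i in range(1, self.n + 1):
            self._add(i, 1)  # every item starts out alive

    def _add(self, i, delta):
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i

    def _prefix_sum(self, i):
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i
        return total

    def remove(self, value):
        """Mark value as removed: O(lg N)."""
        self._add(self.index_of.pop(value), -1)

    def find_min(self):
        """Smallest value still alive, or None: O(lg^2 N)."""
        lo, hi = 1, self.n
        answer = None
        while lo <= hi:
            mid = (lo + hi) // 2
            if self._prefix_sum(mid) > 0:
                answer = mid      # something alive at index <= mid
                hi = mid - 1
            else:
                lo = mid + 1
        return None if answer is None else self.sorted_values[answer - 1]


tracker = MinTracker([1, 2, 3, 4, 5])
for removed in (4, 2, 1, 3):
    tracker.remove(removed)
    print(removed, tracker.find_min())
# 4 1
# 2 1
# 1 3
# 3 5
```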
If sorting isn't in your best interest, I would suggest only do comparisons where you need to do them. If you remove elements that are not the old minimum, and you aren't inserting any new elements, there isn't a re-scan necessary for a minimum value.
Can you give us some more information about the processing going on that you are trying to do?
Comment answer: You don't have to compute min(L). Just keep track of its index and then only re-run the scan for min(L) when you remove at(or below) the old index (and make sure you track it accordingly).
Your current approach of rescanning when the minimum is removed is O(1)-time in expectation for each removal (assuming every item is equally likely to be removed).
Given a list of n items, a rescan is necessary with probability 1/n, so the expected work at each step is n * 1/n = O(1).
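For comparison, the rescan-only-when-needed idea fits in a few lines. This sketch keeps the remaining items in a set, which is an assumption made here so that removal is O(1) on average; the class name LazyMin is hypothetical.

```python
class LazyMin:
    """Tracks the minimum and rescans only when the minimum itself is removed."""

    def __init__(self, items):
        self.alive = set(items)
        self.current_min = min(self.alive) if self.alive else None

    def remove(self, item):
        self.alive.remove(item)
        if item == self.current_min:
            # The only case that needs a full scan of what is left.
            self.current_min = min(self.alive) if self.alive else None


tracker = LazyMin([1, 2, 3, 4, 5])
for removed in (4, 2, 1, 3):
    tracker.remove(removed)
    print(removed, tracker.current_min)
# 4 1
# 2 1
# 1 3
# 3 5
```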

List implemented using an inorder binary tree

For the new computer science assignment we are to implement a list/array using an inorder binary tree. I would just like a suggestion rather than a solution.
The idea is having a binary tree that has its nodes accessible via indexes, e.g.
t = ListTree()
t.insert(2,0) # 1st argument is the value, 2nd the index to insert at
t.get(0) # returns 2
The Node class that the values are stored in is not modifiable but has a property size which contains the total number of children below, along with left, right and value that point to children and store the value accordingly.
My chief problem at the moment is keeping track of the index. As we're not allowed to store the index of the node in the node itself, I must rely on traversing to track it. As I always start with the left node when traversing, I haven't yet thought of a way to recursively figure out what index we are currently at.
Any suggestions would be welcome.
Thanks!
You really wouldn't want to store it on the node itself, because then the index would have to be updated on inserts for all nodes with index less than insert index. I think the real question is how to do an in-order traversal. Try having your recursive function return the number of nodes to its left.
I don't think you want to store the index, rather just the size of each subtree. For instance, if you wanted to look up the 10th element in the list, and the left and right subtrees had 7 elements each, you would know that the root is the eighth element (since it's in-order binary), and the first element of the right subtree is the 9th. Armed with this knowledge, you would recurse into the right subtree, looking for the 2nd element.
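A sketch of that lookup, using 0-based indexes as in the question's t.get(0). The Node class here is a stand-in, not the assignment's class: its size is assumed to count every node in the subtree including the node itself, which may differ from how the provided Node defines size.

```python
def subtree_size(node):
    return node.size if node is not None else 0


class Node:
    """Stand-in node; size counts this node plus everything below it (assumption)."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right
        self.size = 1 + subtree_size(left) + subtree_size(right)


def get(node, index):
    """Return the value at the given in-order position (0-based)."""
    while node is not None:
        left_size = subtree_size(node.left)
        if index < left_size:
            node = node.left              # target is in the left subtree
        elif index == left_size:
            return node.value             # this node is the target
        else:
            index -= left_size + 1        # skip the left subtree and this node
            node = node.right
    raise IndexError("index out of range")


# In-order traversal of this tree reads: a b c d e
root = Node("c",
            left=Node("b", left=Node("a")),
            right=Node("d", right=Node("e")))
print([get(root, i) for i in range(5)])  # ['a', 'b', 'c', 'd', 'e']
```

No index is stored anywhere; the position of each node falls out of the sizes of the subtrees skipped on the way down.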
HTH
Well, a node in a binary tree cannot have a value and an index. They can have multiple pieces of data but the tree can only be keyed/built on one.
Maybe your assignment wants you to use the index value as the key to the tree and attach the value to the node for quick retrieval of the value given an index.
Does the tree have to be balanced? Does the algorithm need to be efficient?
If not, then the simplest thing to do is make a tree in which all the left children are null, i.e., a tree that devolves to a linked list.
To insert, recursively go to the right child, and then update the size of the node on the way back out. Something like (pseudocode)
function insert(node, value, index, depth)
    if depth < index
        create the rightchild if necessary
        insert(rightchild, value, index, depth + 1)
    if depth == size
        node.value = value
    node.size = rightchild.size + 1
After you have this working, you can modify it to be more balanced. When increasing the length of the list, add nodes to the left or right child nodes depending on which currently has the least, and update the size on the way out of the recursion.
To generalize to be more efficient, you need to work on the index in terms of its binary representation.
For example, an empty list has one node, without children, with value null and size 0.
Say you want to insert "hello" at index 1034. Then you want to end up with a tree whose root has two children, with sizes 1024 and 10. The left child has no actual children, but the right node has a right child of its own of size 2. (The left of size 8 is implied.) That node in turn, has one right child of size 0, with the value "hello". (This list has a 1-based index, but a 0-based index is similar.)
So you need to recursively break down the index into its binary parts, and add nodes as necessary. When searching the list, you need to take care when traversing a node with null children.
A very easy solution is to do GetFirst() to get the first node of the tree (this is simply finding the leftmost node of the tree). If your index N is 0, return the first node. Otherwise, call GetNodeNext() N times to get the appropriate node. This isn't super efficient though since accessing a node by index takes O(N Lg N) time.
Node *Tree::GetFirstNode()
{
    Node *pN, *child;
    pN = GetRootNode();
    while (NOTNIL(child = GetNodeLeft(pN)))
    {
        pN = child;
    }
    return (pN);
}
Node *Tree::GetNodeNext(Node *pNode)
{
    Node *temp;
    temp = GetNodeRight(pNode);
    if (NOTNIL(temp))
    {
        pNode = temp;
        temp = GetNodeLeft(pNode);
        while (NOTNIL(temp))
        {
            pNode = temp;
            temp = GetNodeLeft(pNode);
        }
        return (pNode);
    }
    else
    {
        temp = GetNodeParent(pNode);
        while ((NOTNIL(temp)) && (GetNodeRight(temp) == pNode))
        {
            pNode = temp;
            temp = GetNodeParent(pNode);
        }
        return (temp);
    }
}
