So I guess you are all familiar with the binary heap data structure; if not, Brilliant.org says:
i.e. a binary tree which obeys the property that the root of any tree is greater than or equal to (or smaller than or equal to) all its children (heap property). The primary use of such a data structure is to implement a priority queue.
Well, one of the properties of a binary heap is that it must be filled from top to bottom (starting at the root) and from left to right.
I coded this algorithm to find the next available spot to insert the next number I add (I hard-coded the first few nodes so I could test further down the tree).
This search method is inspired by the BFS (breadth-first search) algorithm.
Note that in this code I only care about finding the next empty node, without needing to maintain the heap property.
I tested the code, but I don't think I tested it enough, so if you spot problems or bugs, or want to suggest any ideas, every comment is welcome.
def insert(self, data):
    if self.root.data is None:
        self.root.data = data
        print('root', self.root.data)
    else:
        self.search()

def search(self):
    print('search...')
    queue = [self.root]            # classic BFS: visit the tree level by level
    while queue:
        curr = queue.pop(0)
        print(curr.data)
        if curr.right_child is None:
            print('made it')
            return curr            # first node in level order that is missing a right child
        else:
            queue.append(curr.left_child)
            queue.append(curr.right_child)
h = Min_heap(10)
h.insert(2)
h.root.left_child = Node(3)
h.root.right_child = Node(5)
h.root.left_child.left_child = Node(8)
h.root.left_child.right_child = Node(7)
h.root.right_child.left_child = Node(9)
# The tree I am building...
#        __2__
#       /     \
#      3       5
#     / \     / \
#    8   7   9   ⨂
#                ↑
#            what I'm
#           looking for
h.search()
There is another way of figuring this out, which is basically translating the tree into an array/list using special formulas (sketched below for reference): you assume the next piece of data you want to insert goes at the end of that array, and then work back through the same formulas. But I already know that algorithm, and I thought, why not try to solve it as a graph? Soooo...
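The formulas I mean are the usual index arithmetic for a list-backed heap (a quick sketch, with illustrative names, not part of my class):

def parent_index(i):
    # index of the parent of the node stored at index i
    return (i - 1) // 2

def left_index(i):
    # index of the left child of the node stored at index i
    return 2 * i + 1

def right_index(i):
    # index of the right child of the node stored at index i
    return 2 * i + 2

# with a list-backed heap, the next free spot is simply index len(heap_list)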
You would be better off implementing a binary heap as a list (array). But if you want to do it with node objects that have left/right attributes, then the position for the next node can be derived from the size of the tree.
So if you enrich your heap class instances with a size attribute, and maintain that attribute so it always reflects the current number of nodes in the tree, then the following method will tell you where the next insertion point is, in O(log n) time:
Take the binary representation of the current size plus 1. So if the tree currently has 4 nodes, take the binary representation of 5, i.e. 101. Then drop the leftmost (most significant) bit. The bits that then remain are an encoding of the path towards the new spot: 0 means "left", 1 means "right".
Here is an implementation of a method that will return the parent node of where the new insertion spot is, and whether it would become the "left" or the "right" child of it:
def next_spot(self):
if not self.root:
raise ValueError("empty tree")
node = self.root
path = self.size + 1
sides = bin(path)[3:-1] # skip "0b1" and final bit
for side in sides:
if side == "0":
node = node.left
else:
node = node.right
# use final bit for saying "left" or "right"
return node, ("left", "right")[path % 2]
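Hypothetical usage, assuming a heap object that maintains size and node objects with left/right attributes (the names here are illustrative, not taken from the question's class):

parent, side = heap.next_spot()
new_node = Node(42)
if side == "left":
    parent.left = new_node
else:
    parent.right = new_node
heap.size += 1
# (to restore the heap property you would still sift the new value up)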
If you want to guarantee balance, just store on each node how many items are at or below it. Maintain that along with the heap, and when placing an element, always descend to where there are the fewest items.
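A minimal sketch of that idea, assuming every node carries a count attribute holding the number of nodes in its subtree (the attribute names are illustrative, not from the question):

def find_insert_parent(root):
    # Walk down towards the smaller subtree until a free child slot appears.
    node = root
    while node.left_child is not None and node.right_child is not None:
        if node.left_child.count <= node.right_child.count:
            node = node.left_child
        else:
            node = node.right_child
    return node  # has at least one empty child slot

# remember to increment count on every node along the chosen path when inserting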
If you just want a simple way to place elements, just place them randomly. You don't have to be perfect: on average you will still be O(log n) levels deep, just with a worse constant.
(Of course your constants are better with the array approach, but you say you know that one and are deliberately not implementing it.)
Not sure how this works...
class TreeNode:
    treeDiameter = 0  # running maximum, updated as calculate_height recurses

    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left, self.right = left, right
def find_diameter(self, root):
self.calculate_height(root)
return self.treeDiameter
def calculate_height(self, currentNode):
if currentNode is None:
return 0
leftTreeDiameter = self.calculate_height(currentNode.left)
rightTreeDiameter = self.calculate_height(currentNode.right)
diameter = leftTreeDiameter + rightTreeDiameter + 1
self.treeDiameter = max(self.treeDiameter, diameter)
return max(leftTreeDiameter, rightTreeDiameter) + 1
The above code works to get the maximum diameter of a binary tree, but I don't understand the last line in calculate_height. Why do we need to return max(leftTreeDiameter, rightTreeDiameter) + 1?
I obviously don't understand it, but what I do know is that for each currentNode we keep going down the left side of the tree and then similarly do the same for the right. If we end up with no node (meaning the previous call was at a leaf node) then we return 0, as we don't want to add 1 for a node that does not exist.
The only place that seems to be adding anything besides 0 is the last line of code in calculate_height, because although we are adding leftTreeDiameter + rightTreeDiameter + 1 to get the total diameter, this is only possible because of the return 0 and the return max(leftTreeDiameter, rightTreeDiameter) + 1, correct?
Also, I am confused as to why leftTreeDiameter can be assigned self.calculate_height(currentNode.left). What I mean is that I thought I would need something like...
def calculate_left_height(self, currentNode, height=0):
if currentNode is None:
return 0
self.calculate_height(currentNode.left, height + 1)
return height
where we just add 1 to the height each time. In this case, instead of doing something like leftTreeDiameter += self.calculate_height(currentNode.left), I just pass in height + 1 as an argument each time we see a node.
But if I do this I would need a separate method just to calculate the right height as well, and in my find_diameter method I would need to call find_diameter recursively with both root.left and root.right.
Where is my logic wrong, and how does calculate_height actually work? I guess I am having trouble figuring out how to keep track of the stack.
The names used in this code are confusing: leftTreeDiameter and rightTreeDiameter are not diameters, but heights.
Secondly, the function calculate_height has side effects, which is not very nice: on the one hand it returns a height, and at the same time it assigns a diameter. Many Python coders would prefer a function to be pure and just return something without altering anything else, or alternatively to only alter some state and not return anything; doing both can be confusing.
Also, it is confusing that although the class is called TreeNode, its find_diameter method still requires a node as an argument. This is counter-intuitive: we would expect the method to act on self as the node, rather than on an argument.
But let's just rename the variables and add some comments:
leftHeight = self.calculate_height(currentNode.left)
rightHeight = self.calculate_height(currentNode.right)
# What is the size of the longest path from leaf-to-leaf
# whose top node is the current node?
diameter = leftHeight + rightHeight + 1
# Is this path longer than the longest path that we
# had found so far? If so, take this one.
self.treeDiameter = max(self.treeDiameter, diameter)
# The height of the tree rooted at the current node
# is the height of the taller child subtree (either left or right),
# with one added to account for the current node
return max(leftHeight, rightHeight) + 1
It should be clear, but do realise that self in this process is always the instance on which find_diameter was called; it does not really play a role as an actual node, since the root is passed as an argument. So the repeated assignment to self.treeDiameter is always to the same single property. This property is not created on every node, just on the node on which you invoke find_diameter.
I hope the inserted comments have clarified how this algorithm works.
NB: your own idea of creating calculate_left_height is not going to work: it never alters the value of height that it receives as an argument, and ends up returning it, so it returns the same value it was given. That is obviously not going to do much...
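For what it's worth, here is a minimal sketch of a pure variant of the same algorithm, returning both values instead of stashing the diameter on self (an alternative illustration, not the code from the question):

def height_and_diameter(node):
    # Returns (height, diameter) of the subtree rooted at node,
    # both counted in nodes, matching the convention used above.
    if node is None:
        return 0, 0
    left_height, left_diameter = height_and_diameter(node.left)
    right_height, right_diameter = height_and_diameter(node.right)
    height = max(left_height, right_height) + 1
    # Best leaf-to-leaf path through this node, or the best path that
    # lies entirely inside one of the two subtrees.
    diameter = max(left_height + right_height + 1, left_diameter, right_diameter)
    return height, diameter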
I know this might be trivial, but I just want to make sure. I believe its runtime will be at most O(n). My reasoning is that every node will return a height value exactly once throughout this recursive method; in other words, we visit every node in the tree once.
def height(self):
if self.is_empty():
return 0
else:
left_max = self._left.height()
right_max = self._right.height()
return max(left_max, right_max) + 1
You are performing a DFS traversal of the tree, so every node is visited exactly once.
Therefore it takes O(n) time.
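One way to see it as a recurrence: if the left subtree holds k of the n nodes, the cost satisfies T(n) = T(k) + T(n - 1 - k) + c with T(0) = c, i.e. one constant-cost call per node plus one per empty subtree, for a total of (2n + 1)·c = O(n).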
I need to figure out the number of left children in a binary tree. There are a lot of ways to do it, but I would like to know why the code below does not work.
def leftChildren(self):
leftChildren = []
if self != None:
leftChildren.append(self.v)
if self.l:
leftChildren = leftChildren + self.l.leftChildren()
if self.r:
self.r.leftChildren()
return leftChildren
What is wrong, and how can I improve it?
I suspect:
self.r returns the right child
self.l returns the left child
all children are either nodes or subtrees
self.v returns the value of the node/leaf
Therefore, when looking at your code, I can see that:
the right side of the tree isn't accumulated as it should be: the list returned by self.r.leftChildren() is thrown away;
you should probably store the collected values so they are only assembled once at the top (personal advice, not strictly needed);
the local variable shares its name with the function, which can be confusing;
I would advise renaming the function and the variable to make it easier to see what is left and what is right;
I don't think you need the self != None check if your binary tree is constructed correctly.
You really need to provide the full code, or at least a minimal reproducible example.
Here is a quick draft:
def getremainingChildren(self):
self.remainingChildren = []
self.remainingChildren.append(self.v)
if self.l:
self.remainingChildren += self.l.getremainingChildren()
if self.r:
self.remainingChildren += self.r.getremainingChildren()
return self.remainingChildren
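For completeness, the smallest change to the code in the question is to keep the list that comes back from the right subtree instead of discarding it. Here is a sketch using the question's self.l / self.r / self.v names; like the draft above, it collects the value of every node it reaches:

def leftChildren(self):
    values = []              # renamed so it no longer shadows the method name
    values.append(self.v)
    if self.l:
        values += self.l.leftChildren()
    if self.r:
        values += self.r.leftChildren()   # this result was previously thrown away
    return values

If the goal really is to count only the nodes that are left children, you would append self.l.v inside the if self.l branch instead of unconditionally appending self.v; it is hard to tell from the snippet which of the two is intended.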
I was looking at code that checks whether a binary tree is a BST, and I was confused about how it does the comparison.
def is_bst(cur_node, prev_node_list):
if (not cur_node):
return True
#Check if the left sub-tree is a BST
if (not TreeNode.is_bst(cur_node.left, prev_node_list)):
return False
#If data in cur_node is <= previous node then it is not a BST
prev_node = prev_node_list[0]
if (prev_node and cur_node.data <= prev_node.data):
return False
#Update previous node to current node
prev_node_list[0] = cur_node
#Check if the right sub-tree is a BST
return TreeNode.is_bst(cur_node.right, prev_node_list)
I was wondering what
if (prev_node and cur_node.data <= prev_node.data):
return False
is doing. If the code is constantly checking the left subtrees, shouldn't the next value be less than the previous node?
The code visits all elements in sorted order: first the left subtree, then the current node, then the right subtree.
If you replace the check against the previous node with a print statement, you get the elements from smallest to biggest (if the tree is valid).
So it is sufficient to check whether these visited elements are sorted.
To answer your question: the current node is checked after its entire left subtree. The code first descends to the leftmost leaf node.
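For illustration, the print variant mentioned above could look like this rough sketch (same traversal shape as is_bst, with the comparison replaced by a print):

def inorder_print(cur_node):
    if not cur_node:
        return
    inorder_print(cur_node.left)     # first everything smaller
    print(cur_node.data)             # then the current node
    inorder_print(cur_node.right)    # then everything bigger

On a valid BST this prints the values in ascending order, which is exactly the condition the prev_node check enforces one step at a time.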
I'm trying to recursively build a binary decision tree for diagnosing diseases in Python 3.
The builder takes a list of records (each is an illness and a list of its symptoms) and a list of symptoms, shown below:
class Node:
def __init__(self, data = "", pos = None, neg = None):
self.data = data
self.positive_child = pos
self.negative_child = neg
class Record:
def __init__(self, illness, symptoms):
self.illness = illness
self.symptoms = symptoms
records= [Record('A',['1','3']),
Record('B',['1','2']),
Record('C',['2','3']),
]
symptoms = ['1','2','3']
And it builds a binary tree: each level checks whether a symptom is present or not, with a child node for each outcome. The right child always means the symptom is not present, and the left one that it is present. For the example data the tree should look like this:
1
2 2
3 3 3 3
None B A None C None None Healthy
For example, the leaf A is reached by asking:
1 : True
2 : False
3 : True
and its path is [1,3] (the trues)
Here is the code I'm using, but it isn't working:
def builder(records, symptoms, path):
    # Check if we are in a leaf node that matches an illness
for record in records:
if path == record.symptoms:
return Node(record.illness,None,None)
#No more symptoms means an empty leaf node
if len(symptoms) == 0:
return Node(None,None,None)
#create subtree
else:
symptom = symptoms.pop(0)
right_child = builder(records,symptoms,path)
path.append(symptom)
left_child = builder(records,symptoms,path)
return Node(symptom,right_child,left_child)
I tried a dry run, and on paper it worked fine. I'm not sure what I'm missing, but the resulting tree has a lot of empty nodes and not one with an illness. Maybe I'm messing up the path handling, but I'm not sure how to fix it right now.
Your symptoms.pop(0) is affecting the one symptoms list shared by all calls to builder. This is fine on the way down, since you want to consider only the subsequent symptoms. But when a recursive call returns, your list is missing elements. (If it returns without finding a match, it’s empty!) Similarly, the shared path keeps growing forever.
The simple, if inefficient, answer is to make new lists when recursing:
symptom = symptoms[0]
symptoms = symptoms[1:]
path = path + [symptom]  # not +=
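Put together, a sketch of the builder with those non-mutating lists might look like this; it only addresses the shared-list problem described above, and it passes the positive child first to match the Node constructor's (data, pos, neg) order:

def builder(records, symptoms, path):
    # Leaf that matches an illness: the collected "present" symptoms
    # equal some record's symptom list.
    for record in records:
        if path == record.symptoms:
            return Node(record.illness, None, None)
    # No more symptoms to ask about: empty leaf node.
    if not symptoms:
        return Node(None, None, None)
    symptom = symptoms[0]
    rest = symptoms[1:]                                    # new list each call
    negative = builder(records, rest, path)                # symptom absent
    positive = builder(records, rest, path + [symptom])    # symptom present
    return Node(symptom, positive, negative)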