Bad Tree design, Data Structure

I tried building a tree as part of my Data Structures course. The code works but is extremely slow, taking almost double the time accepted for the course. I have no experience with data structures and algorithms, but I need to optimize the program. If anyone has any tips, advice, or criticism, I would greatly appreciate it.
The tree is not necessarily a binary tree.
Here is the code:
import sys
import threading

class Node:
    def __init__(self,value):
        self.value = value
        self.children = []
        self.parent = None

    def add_child(self,child):
        child.parent = self
        self.children.append(child)

def compute_height(n, parents):
    found = False
    indices = []
    for i in range(n):
        indices.append(i)
    for i in range(len(parents)):
        currentItem = parents[i]
        if currentItem == -1:
            root = Node(parents[i])
            startingIndex = i
            found = True
            break
    if found == False:
        root = Node(parents[0])
        startingIndex = 0
    return recursion(startingIndex,root,indices,parents)

def recursion(index,toWhomAdd,indexes,values):
    children = []
    for i in range(len(values)):
        if index == values[i]:
            children.append(indexes[i])
            newNode = Node(indexes[i])
            toWhomAdd.add_child(newNode)
            recursion(i, newNode, indexes, values)
    return toWhomAdd

def checkHeight(node):
    if node == '' or node == None or node == []:
        return 0
    counter = []
    for i in node.children:
        counter.append(checkHeight(i))
    if node.children != []:
        mostChildren = max(counter)
    else:
        mostChildren = 0
    return(1 + mostChildren)

def main():
    n = int(int(input()))
    parents = list(map(int, input().split()))
    root = compute_height(n, parents)
    print(checkHeight(root))

sys.setrecursionlimit(10**7) # max depth of recursion
threading.stack_size(2**27) # new thread will get stack of such size
threading.Thread(target=main).start()
Edit:
For this input (the first number is the number of nodes and the other numbers are the nodes' values):
5
4 -1 4 1 1
We expect this output (the height of the tree):
3
Another example:
Input:
5
-1 0 4 0 3
Output:
4

It looks like the value that is given for a node, is a reference by index of another node (its parent). This is nowhere stated in the question, but if that assumption is right, you don't really need to create the tree with Node instances. Just read the input into a list (which you already do), and you actually have the tree encoded in it.
So for example, the list [4, -1, 4, 1, 1] represents this tree, where the labels are the indices in this list:
    1
   / \
  4   3
 / \
0   2
The height of this tree — according to the definition given in Wikipedia — would be 2. But apparently the expected result is 3, which is the number of nodes (not edges) on the longest path from the root to a leaf, or — otherwise put — the number of levels in the tree.
The idea to use recursion is correct, but you can do it bottom-up (starting at any node): get the result for the parent recursively and add 1 to it. Apply the principle of dynamic programming by storing the result for each node in a separate list, which I called levels:
def get_num_levels(parents):
    levels = [0] * len(parents)

    def recur(node):
        if levels[node] == 0: # this node's level hasn't been determined yet
            parent = parents[node]
            levels[node] = 1 if parent == -1 else recur(parent) + 1
        return levels[node]

    for node in range(len(parents)):
        recur(node)
    return max(levels)
And the main code could be as you had it:
def main():
    n = int(int(input()))
    parents = list(map(int, input().split()))
    print(get_num_levels(parents))
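For very deep trees the recursive helper depends on the raised recursion limit and thread stack size. As a hedged alternative, here is a minimal sketch of the same memoization idea without recursion. The name get_num_levels_iterative is hypothetical and not part of the answer above, and the sketch assumes the input is a valid parent-index list with -1 marking the root.

```python
def get_num_levels_iterative(parents):
    """Return the number of levels of a tree given as a parent-index list."""
    levels = [0] * len(parents)  # 0 means "not computed yet"
    for start in range(len(parents)):
        # Walk up until we reach a node with a known level or pass the root.
        path = []
        node = start
        while node != -1 and levels[node] == 0:
            path.append(node)
            node = parents[node]
        base = 0 if node == -1 else levels[node]
        # Unwind: the last node on the path is the one closest to the root.
        for depth, visited in enumerate(reversed(path), start=1):
            levels[visited] = base + depth
    return max(levels, default=0)


print(get_num_levels_iterative([4, -1, 4, 1, 1]))   # 3
print(get_num_levels_iterative([-1, 0, 4, 0, 3]))   # 4
```

Each node is appended to a path at most once over the whole run, so the work stays linear in the number of nodes.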

Related

Count Number of Good Nodes

problem statement
I am having trouble understanding what is wrong with my code and understanding the constraint below.
My pseudocode:
Traverse the tree Level Order and construct the array representation (input is actually given as a single root, but they use array representation to show the full tree)
iterate over this array representation, skipping null nodes
for each node, let's call it X, iterate upwards until we reach the root checking to see if at any point in the path, parentNode > nodeX, meaning, nodeX is not a good node.
increment counter if the node is good
Constraints:
The number of nodes in the binary tree is in the range [1, 10^5].
Each node's value is between [-10^4, 10^4]
First of all:
My confusion about the constraint is that the automated tests give input such as [2,4,4,4,null,1,3,null,null,5,null,null,null,null,5,4,4], and if we follow the rule that the children are at c1 = 2k+1 and c2 = 2k+2 and the parent is at (k-1)//2, then this means that there are nodes with the value null.
Secondly:
For the input above, my code outputs 8, the expected value is 6, but when I draw the tree from the array, I also think the answer should be 8!
tree of input
# Definition for a binary tree node.
# class TreeNode:
#     def __init__(self, val=0, left=None, right=None):
#         self.val = val
#         self.left = left
#         self.right = right
class Solution:
    def goodNodes(self, root: TreeNode) -> int:
        arrRepresentation = []
        queue = []
        queue.append(root)
        # while queue not empty
        while queue:
            # remove node
            node = queue.pop(0)
            if node is None:
                arrRepresentation.append(None)
            else:
                arrRepresentation.append(node.val)
            if node is not None:
                # add left to queue
                queue.append(node.left)
                # add right to queue
                queue.append(node.right)
        print(arrRepresentation)
        goodNodeCounter = 1
        # iterate over array representation of binary tree
        for k in range(len(arrRepresentation)-1, 0, -1):
            child = arrRepresentation[k]
            if child is None:
                continue
            isGoodNode = self._isGoodNode(k, arrRepresentation)
            print('is good: ' + str(isGoodNode))
            if isGoodNode:
                goodNodeCounter += 1
        return goodNodeCounter

    def _isGoodNode(self, k, arrRepresentation):
        child = arrRepresentation[k]
        print('child: '+str(child))
        # calculate index of parent
        parentIndex = (k-1)//2
        isGood = True
        # if we have not reached root node
        while parentIndex >= 0:
            parent = arrRepresentation[parentIndex]
            print('parent: '+ str(parent))
            # calculate index of parent
            parentIndex = (parentIndex-1)//2
            if parent is None:
                continue
            if parent > child:
                isGood = False
                break
        return isGood

Recursion might be easier:
class Node:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def good_nodes(root, maximum=float('-inf')):
    if not root: # null-root
        return 0
    is_this_good = maximum <= root.val # is this root a good node?
    maximum = max(maximum, root.val) # update max
    good_from_left = good_nodes(root.left, maximum) if root.left else 0
    good_from_right = good_nodes(root.right, maximum) if root.right else 0
    return is_this_good + good_from_left + good_from_right

tree = Node(2, Node(4, Node(4)), Node(4, Node(1, Node(5, None, Node(5, Node(4), Node(4)))), Node(3)))
print(good_nodes(tree)) # 6
Basically, the recursion traverses the tree while tracking the maximum value seen so far on the path from the root. At each call, the node's value is compared with that maximum, and the count is incremented if the node is good.

Since you wanted to solve with breadth first search:
from collections import deque

class Solution:
    def goodNodes(self, root: TreeNode) -> int:
        if not root:
            return 0
        queue = deque()
        # run BFS, tracking the max value on the path down to each node's parent
        queue.append((root, float('-inf')))
        res = 0
        while queue:
            current, max_val = queue.popleft()
            if current.val >= max_val:
                res += 1
            if current.left:
                queue.append((current.left, max(max_val, current.val)))
            if current.right:
                queue.append((current.right, max(max_val, current.val)))
        return res
I store each node in the queue together with the maximum value seen on the path down to its parent. I could not use a single global max_val, because look at this tree:
For the first 3 nodes you would have [3,1,4], and if you kept max_val globally, it would be 4.
The next node would be 3, the leaf on the left. Since the global max_val is 4, the check 3 < 4 would wrongly reject it, so 3 would not be counted as a good node. So instead, I track max_val per node, along the path down to its parent.
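To make the per-path maximum concrete, here is a small self-contained sketch. The tree picture is missing from this page, so the shape below (root 3, children 1 and 4, a leaf 3 under the 1, leaves 1 and 5 under the 4) is an assumption chosen to match the description; count_good_nodes is a hypothetical standalone name, not the Solution method above.

```python
from collections import deque


class TreeNode:
    def __init__(self, val=0, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def count_good_nodes(root):
    """Count nodes with no strictly greater value on the path from the root."""
    if root is None:
        return 0
    # Each queue entry carries the maximum over the node's ancestors only.
    queue = deque([(root, float("-inf"))])
    good = 0
    while queue:
        node, path_max = queue.popleft()
        if node.val >= path_max:
            good += 1
        new_max = max(path_max, node.val)
        for child in (node.left, node.right):
            if child is not None:
                queue.append((child, new_max))
    return good


# Assumed example tree: the leaf 3 is good because its path is 3 -> 1 -> 3.
tree = TreeNode(3, TreeNode(1, TreeNode(3)), TreeNode(4, TreeNode(1), TreeNode(5)))
print(count_good_nodes(tree))  # 4
```

With a single global maximum, the 4 in the right subtree would already have been seen when the leaf 3 is processed, and the leaf would be rejected by mistake.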

The binary heap you provided corresponds to the following hierarchy:
tree = [2,4,4,4,None,1,3,None,None,5,None,None,None,None,5,4,4]
printHeapTree(tree)
      2
     / \
    4   4
   /   / \
  4   1   3
           \
            5
In that tree, only item value 1 has an ancestor that is greater than itself. The 6 other nodes are good, because they have no ancestor that are greater than themselves (counting the root as good).
Note that there are values in the list that are unreachable because their parent is null (None) so they are not part of the tree (this could be a copy/paste mistake though). If we replace these None values by something else to make them part of the tree, we can see where the unreachable nodes are located in the hierarchy:
t = [2,4,4,4,'*', 1,3,'*',None, 5,None, None,None,None,5,4,4]
printHeapTree(t)
            2
         __/ \_
        4      4
       / \    / \
      4   *  1   3
     /   /        \
    *   5          5
   / \
  4   4
This is likely where the difference between a result of 8 (not counting root as good) vs 6 (counting root as good) comes from.
You can find the printHeapTree() function here.

An algorithm that finds an edge that can be removed to create a tree

Read the graph from the input. This graph started as a tree (i.e. an undirected graph that contains no cycles) with n vertices numbered 1 to n, to which one edge that it did not already contain was added. The input graph is given as a list of its edges, where a line a_i b_i says that the graph has an edge between the vertices a_i and b_i.
The program should print which edge can be removed from the graph to obtain a tree with n vertices. If more than one answer is possible, print the one that appears last in the input.
For example, for the input:
1 2
1 3
2 3
the program will answer 2 3
For input:
1 2
2 3
3 4
1 4
1 5
Answer 1 4
I have code that can determine whether a graph is a tree, but I don't know how to make it read the edges from the input, or how to make it find the edge that should be removed:
from collections import defaultdict

class Graph():
    def __init__(self, V):
        self.V = V
        self.graph = defaultdict(list)

    def addEdge(self, v, w):
        self.graph[v].append(w)
        self.graph[w].append(v)

    def isCyclicUtil(self, v, visited, parent):
        visited[v] = True
        for i in self.graph[v]:
            if visited[i] == False:
                if self.isCyclicUtil(i, visited, v) == True:
                    return True
            elif i != parent:
                return True
        return False

    def isTree(self):
        visited = [False] * self.V
        if self.isCyclicUtil(0, visited, -1) == True:
            return False
        for i in range(self.V):
            if visited[i] == False:
                return False
        return True

g1 = Graph(5)
g1.addEdge(1, 0)
g1.addEdge(0, 2)
g1.addEdge(0, 3)
g1.addEdge(2, 3)
if g1.isTree() == True:
    print("Tree")
else:
    print("Not Tree")

You can read input from standard input via the input function.
Your code currently creates the graph structure, and then determines whether two nodes are connected by walking through the graph with a depth-first traversal.
I would however suggest a slightly more efficient algorithm: instead of creating a graph, create a disjoint set. There are several libraries out there that offer this data structure, but I'll throw my own in the below code.
This structure keeps track of which nodes belong to the same connected group(s).
Then the algorithm becomes simple: for each edge you read from the input, indicate (using the disjoint set interface) that the two involved nodes belong to the same set. Before doing that however, check whether they already are in the same set. If this happens, stop the algorithm and output this edge.
Here is the generic DisjointSet class I will be using:
class DisjointSet:
    class Element:
        def __init__(self):
            self.parent = self
            self.rank = 0

    def __init__(self):
        self.elements = {}

    def add(self, key):
        el = self.Element()
        self.elements[key] = el
        return el

    def find(self, key, add_if_not_exists=False):
        el = self.elements.get(key, None)
        if not el:
            if add_if_not_exists:
                el = self.add(key)
            return el
        # Path splitting algorithm
        while el.parent != el:
            el, el.parent = el.parent, el.parent.parent
        return el

    def union(self, key=None, *otherkeys):
        if key is not None:
            root = self.find(key, True)
            for otherkey in otherkeys:
                el = self.find(otherkey, True)
                if el != root:
                    # Union by rank
                    if root.rank < el.rank:
                        root, el = el, root
                    el.parent = root
                    if root.rank == el.rank:
                        root.rank += 1
And here is the code specific for the problem:
def solve():
    visited = DisjointSet()
    while True:
        edge = input().split()
        a = visited.find(edge[0])
        b = visited.find(edge[1])
        if a and a is b: # This edge creates a cycle
            print(" ".join(edge))
            break # Stop reading more input
        visited.union(*edge)
As you can see, this code assumes that the input will have an edge that creates a cycle (like is stated in the problem description). So it only has a way to stop the process when such an offending edge is found.
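For experimenting without standard input, the same idea can be sketched with a plain dictionary as the disjoint set. This is a simplified illustration, not the DisjointSet class above: find_redundant_edge is a hypothetical name, the edges are passed in as a list of pairs, and union by rank is omitted for brevity.

```python
def find_redundant_edge(edges):
    """Return the first edge (in input order) whose ends are already connected."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)  # unseen vertices start as their own set
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return (a, b)  # this edge closes a cycle
        parent[root_a] = root_b  # merge the two sets
    return None  # no cycle found


print(find_redundant_edge([(1, 2), (1, 3), (2, 3)]))                  # (2, 3)
print(find_redundant_edge([(1, 2), (2, 3), (3, 4), (1, 4), (1, 5)]))  # (1, 4)
```

Because exactly one edge was added to a tree, there is a single cycle, and the edge that closes it while reading in order is the cycle edge that appears last in the input, which is the one the task asks for.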

Generating, traversing and printing binary tree

I generated a perfectly balanced binary tree and I want to print it. In the output there are only 0s instead of the data I generated. I think it's because of the line in the function printtree that says print(tree.elem), because in the class self.elem = 0.
How can I connect these two functions generate and printtree?
class BinTree:
    def __init__(self):
        self.elem = 0
        self.left = None
        self.right = None

def generate(pbt, N):
    if N == 0:
        pbt = None
    else:
        pbt = BinTree()
        x = input()
        pbt.elem = int(x)
        generate(pbt.left, N // 2)
        generate(pbt.right, N - N // 2 - 1)

def printtree(tree, h):
    if tree is not None:
        tree = BinTree()
        printtree(tree.right, h+1)
        for i in range(1, h):
            print(end = "......")
        print(tree.elem)
        printtree(tree.left, h+1)
Hope somebody can help me. I am a beginner in coding.
For example:
N=6, pbt=pbt, tree=pbt, h=0
input:
1
2
3
4
5
6
and the output:
......5
............6
1
............4
......2
............3

I'd suggest reading up on: https://www.geeksforgeeks.org/tree-traversals-inorder-preorder-and-postorder/
Basically, there are three ways to traverse a binary tree: in-order, post-order and pre-order.
The issue with your print function is that you're reassigning the tree that is being passed in to a new, empty tree.
if tree is not None:
    tree = BinTree()
Right? If tree is not None and has something in it, this replaces it with an empty tree.
Traversing a tree is actually a lot simpler than you'd imagine. I think the complexity comes from trying to picture in your head how it all works out, but the truth is that traversing a tree can be done in 3 - 4 lines.
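As a hedged sketch of how the two functions could be connected: generate returns the node it builds (assigning to the pbt parameter only rebinds a local name, so the caller never sees the new node), and the printing function uses the tree it receives instead of replacing it. Reading values from an iterator instead of input(), the render helper, and the `"......" * h` indentation are assumptions made here so the example is self-contained and reproduces the output shown in the question.

```python
class BinTree:
    def __init__(self, elem=0):
        self.elem = elem
        self.left = None
        self.right = None


def generate(values, n):
    """Build a balanced tree from the next n items of the iterator `values`."""
    if n == 0:
        return None
    node = BinTree(next(values))
    node.left = generate(values, n // 2)
    node.right = generate(values, n - n // 2 - 1)
    return node  # hand the built subtree back to the caller


def render(tree, h=0, lines=None):
    """Collect the sideways picture of the tree (right subtree first)."""
    if lines is None:
        lines = []
    if tree is not None:
        render(tree.right, h + 1, lines)
        lines.append("......" * h + str(tree.elem))
        render(tree.left, h + 1, lines)
    return lines


root = generate(iter([1, 2, 3, 4, 5, 6]), 6)
print("\n".join(render(root)))
```

The recursive part of render is the three-line reverse in-order traversal: right subtree, current node, left subtree.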

Non-binary Tree Height (Optimization)

Introduction
So I'm doing a course on edX and have been working on this practice assignment for the better part of 3 hours, yet I still can't find a way to implement this method without it taking too long and timing out the automatic grader.
I've tried 3 different methods, all of which did the same thing, including 2 recursive approaches and 1 non-recursive approach (my latest).
The problem I think I'm having with my code is that the method to find children takes way too long because it has to iterate over the entire list of nodes.
Input and output format
Input includes N on the first line which is the size of the list which is given on line 2.
Example:
5
-1 0 4 0 3
To build a tree from this:
Each value in the array is a pointer to another index in the array, such that in the example above 0 is a child node of -1 (index 0). Since -1 points to no other index, it is the root node.
The tree in the example has the root node -1, which has two children 0 (index 1) and 0 (index 3). The 0 with index 1 has no children and the 0 with index 3 has 1 child: 3 (index 4) which in turn has only one child which is 4 (index 2).
The output resulting from the above input is 4. This is because the max height of the branch which included -1 (the root node), 0, 3, and 4 was of height 4 compared to the height of the other branch (-1, and 0) which was height 2.
If you need more elaborate explanation then I can give another example in the comments!
The output is the max height of the tree. The input size goes up to 100,000, which is where I was having trouble, as the program has to finish in 3 seconds or less.
My code
Here's my latest non-recursive method which I think is the fastest I've made (still not fast enough). I used the starter from the website which I will also include beneath my code. Anyways, thanks for the help!
My code:
# python3
import sys, threading
sys.setrecursionlimit(10**7) # max depth of recursion
threading.stack_size(2**27) # new thread will get stack of such size

def height(node, parent_list):
    h = 0
    while not node == -1:
        h = h + 1
        node = parent_list[node]
    return h + 1

def search_bottom_nodes(parent_list):
    bottom_nodes = []
    for index, value in enumerate(parent_list):
        children = [i for i, x in enumerate(parent_list) if x == index]
        if len(children) == 0:
            bottom_nodes.append(value)
    return bottom_nodes

class TreeHeight:
    def read(self):
        self.n = int(sys.stdin.readline())
        self.parent = list(map(int, sys.stdin.readline().split()))

    def compute_height(self):
        # Replace this code with a faster implementation
        bottom_nodes = search_bottom_nodes(self.parent)
        h = 0
        for index, value in enumerate(bottom_nodes):
            h = max(height(value, self.parent), h)
        return h

def main():
    tree = TreeHeight()
    tree.read()
    print(tree.compute_height())

threading.Thread(target=main).start()
edX starter:
# python3
import sys, threading
sys.setrecursionlimit(10**7) # max depth of recursion
threading.stack_size(2**27) # new thread will get stack of such size

class TreeHeight:
    def read(self):
        self.n = int(sys.stdin.readline())
        self.parent = list(map(int, sys.stdin.readline().split()))

    def compute_height(self):
        # Replace this code with a faster implementation
        maxHeight = 0
        for vertex in range(self.n):
            height = 0
            i = vertex
            while i != -1:
                height += 1
                i = self.parent[i]
            maxHeight = max(maxHeight, height);
        return maxHeight;

def main():
    tree = TreeHeight()
    tree.read()
    print(tree.compute_height())

threading.Thread(target=main).start()

Simply cache the previously computed heights of the nodes you've traversed through in a dict and reuse them when they are referenced as parents.
import sys, threading
sys.setrecursionlimit(10**7) # max depth of recursion
threading.stack_size(2**27) # new thread will get stack of such size

class TreeHeight:
    def height(self, node):
        if node == -1:
            return 0
        if self.parent[node] in self.heights:
            self.heights[node] = self.heights[self.parent[node]] + 1
        else:
            self.heights[node] = self.height(self.parent[node]) + 1
        return self.heights[node]

    def read(self):
        self.n = int(sys.stdin.readline())
        self.parent = list(map(int, sys.stdin.readline().split()))
        self.heights = {}

    def compute_height(self):
        maxHeight = 0
        for vertex in range(self.n):
            maxHeight = max(maxHeight, self.height(vertex))
        return maxHeight;

def main():
    tree = TreeHeight()
    tree.read()
    print(tree.compute_height())

threading.Thread(target=main).start()
Given the same input from your question, this (and your original code) outputs:
4
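Since the question identifies the repeated search for children as the bottleneck, another option is to build all the child lists once and then walk the tree level by level. The following is a minimal sketch of that alternative, not the caching approach above; tree_height_bfs is a hypothetical name, and the sketch assumes a valid parent list with -1 marking the root.

```python
from collections import deque


def tree_height_bfs(parents):
    """Height in nodes of a tree given as a parent-index list (-1 marks the root)."""
    # Build every child list in a single pass over the input.
    children = [[] for _ in parents]
    roots = []
    for node, parent in enumerate(parents):
        if parent == -1:
            roots.append(node)
        else:
            children[parent].append(node)

    # Breadth-first walk from the root, carrying each node's depth.
    height = 0
    queue = deque((root, 1) for root in roots)
    while queue:
        node, depth = queue.popleft()
        height = max(height, depth)
        for child in children[node]:
            queue.append((child, depth + 1))
    return height


print(tree_height_bfs([-1, 0, 4, 0, 3]))  # 4
```

Both passes touch each node once, and no recursion is involved, so the recursion limit and thread stack size settings are not needed for this variant.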

How do I test a sum tree?

I have 2 lists. One contains values, the other contains the levels those values hold in a sum tree. (the lists have same length)
For example:
[40,20,5,15,10,10] and [0,1,2,2,1,1]
Those lists correctly correspond because
- 40
- - 20
- - - 5
- - - 15
- - 10
- - 10
(20+10+10) == 40 and (5+15) == 20
I need to check whether a given list of values and a list of their levels correspond correctly. So far I have put together this function, but for some reason it does not return True for correct array and numbers lists. The input numbers here would be [40,20,5,15,10,10] and array would be [0,1,2,2,1,1].
def testsum(array, numbers):
    k = len(array)
    target = [0]*k
    subsum = [0]*k
    for x in range(0, k):
        if target[array[x]]!=subsum[array[x]]:
            return False
        target[array[x]]=numbers[x]
        subsum[array[x]]=0
        if array[x]>0:
            subsum[array[x]-1]+=numbers[x]
    for x in range(0, k):
        if(target[x]!=subsum[x]):
            print(x, target[x],subsum[x])
            return False
    return True

I got this running using itertools.takewhile to grab the subtree under each level. Toss that into a recursive function and assert that all recursions pass.
I've slightly improved my initial implementation by grabbing a next_v and next_l and testing early to see if the current node is a parent node and only building subtree if there's something to build. That inequality check is much cheaper than iterating through the whole vs_ls zip.
import itertools

def testtree(values, levels):
    if len(values) == 1:
        # Last element, always true!
        return True
    vs_ls = zip(values, levels)
    test_v, test_l = next(vs_ls)
    next_v, next_l = next(vs_ls)
    if next_l > test_l:
        subtree = [v for v,l in itertools.takewhile(
            lambda v_l: v_l[1] > test_l,
            itertools.chain([(next_v, next_l)], vs_ls))
            if l == test_l+1]
        if sum(subtree) != test_v and subtree:
            #TODO test if you can remove the "and subtree" check now!
            print("{} != {}".format(subtree, test_v))
            return False
    return testtree(values[1:], levels[1:])

if __name__ == "__main__":
    vs = [40, 20, 15, 5, 10, 10]
    ls = [0, 1, 2, 2, 1, 1]
    assert testtree(vs, ls) == True
It unfortunately adds a lot of complexity to the code since it pulls out the first value that we need, which necessitates an extra itertools.chain call. That's not ideal. Unless you're expecting to get very large lists for values and levels, it might be worthwhile to do vs_ls = list(zip(values, levels)) and approach this list-wise rather than iterator-wise. e.g...
    ...
    vs_ls = list(zip(values, levels))
    test_v, test_l = vs_ls[0]
    next_v, next_l = vs_ls[1]
    ...
    subtree = [v for v,l in itertools.takewhile(
        lambda v_l: v_l[1] > test_l,
        vs_ls[1:]) if l == test_l+1]
I still think the fastest way is probably to iterate once with an approach almost like a state machine and grab all the possible subtrees, then check them all individually. Something like:
from collections import namedtuple

Tree = namedtuple("Tree", ["level_num", "parent", "children"])
# equivalent to
# # class Tree:
# #     def __init__(self, level_num: int,
# #                  parent: int,
# #                  children: list):
# #         self.level_num = level_num
# #         self.parent = parent
# #         self.children = children

def build_trees(values, levels):
    trees = [] # list of Trees
    pending_trees = []
    vs_ls = zip(values, levels)
    last_v, last_l = next(vs_ls)
    test_l = last_l + 1
    for v, l in zip(values, levels):
        if l > last_l:
            # we've found a new tree
            if l != last_l + 1:
                # What do you do if you get levels like [0, 1, 3]??
                raise ValueError("Improper leveling: {}".format(levels))
            test_l = l
            # Stash the old tree and start a new one.
            pending_trees.append(cur_tree)
            cur_tree = Tree(level_num=last_l, parent=last_v, children=[])
        elif l < test_l:
            # tree is finished
            # Store the finished tree and grab the last one we stashed.
            trees.append(cur_tree)
            try:
                cur_tree = pending_trees.pop()
            except IndexError:
                # No trees pending?? That's weird....
                # I can't think of any case that this should happen, so maybe
                # we should be raising ValueError here, but I'm not sure either
                cur_tree = Tree(level_num=-1, parent=-1, children=[])
        elif l == test_l:
            # This is a child value in our current tree
            cur_tree.children.append(v)
    # Close the pending trees
    trees.extend(pending_trees)
    return trees
This should give you a list of Tree objects, each of which having the following attributes
level_num := level number of parent (as found in levels)
parent := number representing the expected sum of the tree
children := list containing all the children in that level
After you do that, you should be able to simply check
all([sum(t.children) == t.parent for t in trees])
But note that I haven't been able to test this second approach.
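For comparison, here is a single-pass sketch that keeps a stack of the currently open ancestors. It is a different technique from the two approaches above, offered under stated assumptions: is_sum_tree is a hypothetical name, a node without children always passes, and a jump of more than one level (such as [0, 1, 3]) is treated as invalid input and returns False.

```python
def is_sum_tree(values, levels):
    """Check that every node with children equals the sum of its direct children."""
    stack = []  # open ancestors as [value, level, child_sum, child_count]

    def closes_ok(node):
        value, _level, child_sum, child_count = node
        return child_count == 0 or child_sum == value

    for value, level in zip(values, levels):
        # Nodes at the same or a deeper level can take no more children: close them.
        while stack and stack[-1][1] >= level:
            if not closes_ok(stack.pop()):
                return False
        if stack:
            if level != stack[-1][1] + 1:
                return False  # a level was skipped
            stack[-1][2] += value  # add to the parent's running sum
            stack[-1][3] += 1
        stack.append([value, level, 0, 0])

    # Close whatever is still open at the end of the input.
    return all(closes_ok(node) for node in stack)


print(is_sum_tree([40, 20, 5, 15, 10, 10], [0, 1, 2, 2, 1, 1]))  # True
print(is_sum_tree([40, 20, 5, 10, 10, 10], [0, 1, 2, 2, 1, 1]))  # False
```

Each value is pushed and popped at most once, so the check runs in time linear in the length of the lists.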
