Binary Tree: How Do Class Instances Link? - python

I am trying to understand binary trees, but doing so has left me confused about how class instances interact: how does each instance link to another?
My Implementation:
class Node(object):
    def __init__(self, key):
        self.key = key
        self.L = None
        self.R = None
class BinaryTree(object):
    def __init__(self):
        self.root = None
    def get_root(self):
        return self.root
    def insert(self, key):
        if self.get_root()==None:
            self.root = Node(key)
        else:
            self._insert(key, self.root)
    def _insert(self, key, node):
        if key < node.key:
            if node.L == None:
                node.L = key
            else:
                self._insert(key, Node(node.L))
        if key > node.key:
            if node.R == None:
                node.R = key
            else:
                self._insert(key, Node(node.R))
myTree = BinaryTree()
A Scenario
So let's say I want to insert 10: I call myTree.insert(10), which creates a new instance of Node(). This is clear to me.
Now I want to add 11. I would expect this to become the right node of the root node, i.e. it will be stored in the attribute R of the root Node().
Now here comes the part I don't understand. When I add 12, it should become the child of the root node's right child. In my code this creates a new instance of Node() where 11 should be the key and 12 should be R.
So my question is two-fold: what happens to the last instance of Node()? Is it deleted? If not, how do I access it?
Or is the structure of a binary tree too abstract to think of each Node() as connected together like in a graph?
NB: this implementation is heavily derived from djra's implementation from this question How to Implement a Binary Tree?

Make L and R Nodes instead of ints. You can do this by changing the parts of your _insert function from this:
if node.L == None:
    node.L = key
to this:
if node.L == None:
    node.L = Node(key)
There is also a problem with this line:
self._insert(key, Node(node.L))
The way you're doing it right now, there is no way to access that last instance of Node(), because your _insert function inserted it under an anonymously constructed node that has no parent and is therefore not part of your tree. The node being passed in to _insert is not the L or R of any other node in the tree, so you're not actually adding anything to the tree with this.
Now that we have changed the Ls and Rs to be Nodes, you have a way to pass a node that's part of the tree into the insert function:
self._insert(key, node.L)
Now you're passing the node's left child into the recursive insert, which by the looks of things is what you were originally trying to do.
Once you make these changes in your code for both the L and R insert cases, you can get to the last instance of Node() in your example tree
10
  \
   11
     \
      12
via myTree.root.R.R. You can get its key via myTree.root.R.R.key, which equals 12.
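Putting both changes together, here is a minimal runnable sketch of the corrected classes. It keeps the names from the question; the `is None` comparisons, the `elif`, and the choice to silently ignore duplicate keys are my assumptions, not something the question specifies.

```python
class Node(object):
    """One tree node; L and R hold other Node instances (or None)."""

    def __init__(self, key):
        self.key = key
        self.L = None
        self.R = None


class BinaryTree(object):
    def __init__(self):
        self.root = None

    def insert(self, key):
        if self.root is None:
            self.root = Node(key)
        else:
            self._insert(key, self.root)

    def _insert(self, key, node):
        if key < node.key:
            if node.L is None:
                node.L = Node(key)          # link a new Node into the tree
            else:
                self._insert(key, node.L)   # recurse into the existing child
        elif key > node.key:
            if node.R is None:
                node.R = Node(key)
            else:
                self._insert(key, node.R)
        # Assumption: a key equal to node.key is ignored.


myTree = BinaryTree()
for key in (10, 11, 12):
    myTree.insert(key)

print(myTree.root.R.R.key)  # 12
```

Each Node stays alive because its parent holds a reference to it in L or R, which is all that "linking" means here.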

Most of your questions come from not finishing the program. In your current code, after myTree.insert(11), your tree sets R equal to an int rather than another Node.
If the child slot is empty, create the new node at that point. Otherwise pass the next node into the recursive function to keep moving further down the tree.
def _insert(self, key, node):
    if key < node.key:
        if node.L == None:
            node.L = Node(key)
        else:
            self._insert(key, node.L)
    if key > node.key:
        if node.R == None:
            node.R = Node(key)
        else:
            self._insert(key, node.R)
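P.S. This isn't quite finished: a key equal to an existing Node.key matches neither branch and is silently dropped, so you may want to handle duplicates explicitly. A key that is bigger than the current Node.key but smaller than the next Node's key needs no extra logic, because the recursion compares against each node on the way down.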

Related

Linked lists in Python

I have a Linked Lists assignment for school although I am just getting the hang of class constructors. I am trying to simply get the basics of the linked list data structure down, and I understand the basic concept. I have watched lots of Youtube tutorials and the like, but where I am failing to understand is how to print out the cargo or data in my nodes using a loop.
I have written something along these lines:
class Node:
    def __init__(self, value, pointer):
        self.value = value
        self.pointer = pointer
node4 = Node(31, None)
node3 = Node(37, None)
node2 = Node(62, None)
node1 = Node(23, None)
Now...I understand that each node declaration is a call to the class constructor of Node and that the list is linked because each node contains a pointer to the next node, but I simply don't understand how to print them out using a loop. I've seen examples using global variables for the "head" and I've seen subclasses created to accomplish the task. I'm old and dumb. I was wondering if someone could take it slow and explain it to me like I'm 5. If anyone out there has the compassion and willingness to hold my hand through the explanation, I would be greatly obliged. Thank you in advance, kind sirs.
First of all, your nodes should be created something like this, starting from the tail so that every node you point to already exists:
node1 = Node(23, None)
node2 = Node(62, node1)
node3 = Node(37, node2)
node4 = Node(31, node3)
Now, I am sure you can see that the last node in the list points to None. Therefore, you can loop through the list until you encounter None. Something like this should work:
printhead = node4
while True:
    print(printhead.value)
    if printhead.pointer is None:
        break
    else:
        printhead = printhead.pointer
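The same walk can be wrapped in a small generator so it can be reused. This is only a sketch: `iter_values` is a hypothetical helper name, and testing the node itself against None (rather than its pointer) is my choice so that an empty list, i.e. a head of None, also works.

```python
class Node:
    def __init__(self, value, pointer):
        self.value = value
        self.pointer = pointer


def iter_values(head):
    """Yield each node's value, following pointers until None is reached."""
    node = head
    while node is not None:
        yield node.value
        node = node.pointer


# Build tail-first so every pointer target already exists.
node1 = Node(23, None)
node2 = Node(62, node1)
node3 = Node(37, node2)
node4 = Node(31, node3)

for value in iter_values(node4):
    print(value)  # 31, 37, 62, 23 on separate lines
```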
This is a very basic linked list implementation for educational purposes only.
from __future__ import print_function
"""The above is needed for Python 2.x unless you change
`print(node.value)` into `print node.value`"""
class Node(object):
    """This class represents list item (node)"""
    def __init__(self, value, next_node):
        """Store item value and pointer to the next node"""
        self.value = value
        self.next_node = next_node
class LinkedList(object):
    """This class represents linked list"""
    def __init__(self, *values):
        """Create nodes and store reference to the first node"""
        node = None
        # Create nodes in reversed order as each node needs to store reference to next node
        for value in reversed(values):
            node = Node(value, node)
        self.first_node = node
        # Initialize current_node for iterator
        self.current_node = self.first_node
    def __iter__(self):
        """Tell Python that this class is iterable"""
        return self
    def __next__(self):
        """Return next node from the linked list"""
        # If previous call marked iteration as done, let's really finish it
        if isinstance(self.current_node, StopIteration):
            stop_iteration = self.current_node
            # Reset current_node back to reference first_node
            self.current_node = self.first_node
            # Raise StopIteration to exit for loop
            raise stop_iteration
        # Take the current_node into local variable
        node = self.current_node
        # If next_node is None, then the current_node is the last one, let's mark this with StopIteration instance
        if node.next_node is None:
            self.current_node = StopIteration()
        else:
            # Put next_node reference into current_node
            self.current_node = self.current_node.next_node
        return node
linked_list = LinkedList(31, 37, 62, 23)
for node in linked_list:
    print(node.value)
This doesn't handle many cases properly (including a break statement in the loop body), but the goal is to show the minimum requirements for a linked list implementation in Python.
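One way to sidestep the break problem is to write __iter__ as a generator, so every for loop gets its own cursor instead of sharing current_node (in the version above, a loop that exits early leaves current_node in the middle of the list, and the next loop resumes from there). The following is a sketch of that variation rather than a drop-in replacement; the default `next_node=None` is my addition.

```python
class Node(object):
    """A list item holding a value and a reference to the next node."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next_node = next_node


class LinkedList(object):
    def __init__(self, *values):
        # Build back to front so each node can reference its successor.
        node = None
        for value in reversed(values):
            node = Node(value, node)
        self.first_node = node

    def __iter__(self):
        # Generator function: each call starts a fresh walk from first_node.
        node = self.first_node
        while node is not None:
            yield node
            node = node.next_node


linked_list = LinkedList(31, 37, 62, 23)

for node in linked_list:
    if node.value == 62:
        break  # leaving early does not disturb later loops

print([node.value for node in linked_list])  # [31, 37, 62, 23]
```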

How to replace a subtree in python

I have my tree data structure as below:
class Node(object):
    def __init__(self, data):
        self.data = data
        self.children = []
    def add_child(self, obj):
        self.children.append(obj)
Then I created a function to replace a subtree (here, the node whose data is 1) with a new node:
def replace(node, newNode):
    if node.data == 1:
        node = newNode
        return
    else:
        for i in xrange(0, len(node.children)):
            replace(node.children[i], newNode)
It is called like this:
replace(mytree,newNode)
Since it is a recursive call, I think the object gets destroyed and the assignment does not happen.
I tried it manually as:
mytree.children[0].children[0] = newNode
then the tree is correctly updated. How can I achieve it using my method above?
The assignment node = newNode doesn't do what you want. It doesn't replace the object you know as node with newNode everywhere. It just rebinds the local variable name node to point to the same object as the other local name newNode. Other references to the first node (such as in its parent's children list) will be unchanged.
To actually do what you want requires more subtlety. Often the best approach is not to replace the node at all, but rather to replace its contents. That is, set node.data and node.children to be equal to newNode.data and newNode.children and leave node in place. This only fails to work properly if there are other references to node or newNode and you want them to work properly after the replacement.
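A minimal sketch of that in-place approach, assuming the Node class from the question. `replace_contents` and its `target` parameter are hypothetical names introduced for illustration, and copying the children list (instead of sharing it) is my choice.

```python
class Node(object):
    def __init__(self, data):
        self.data = data
        self.children = []

    def add_child(self, obj):
        self.children.append(obj)


def replace_contents(node, new_node, target=1):
    """Overwrite the first node whose data equals target with new_node's contents.

    Returns True if a node was overwritten, False otherwise.
    """
    if node.data == target:
        node.data = new_node.data
        # Copy the list so the two nodes do not share one children list.
        node.children = list(new_node.children)
        return True
    for child in node.children:
        if replace_contents(child, new_node, target):
            return True
    return False


root = Node(0)
middle = Node(5)
old = Node(1)
root.add_child(middle)
middle.add_child(old)

new_node = Node(42)
new_node.add_child(Node(43))

print(replace_contents(root, new_node))      # True
print(root.children[0].children[0].data)     # 42
```

Because the existing object is mutated rather than swapped, the parent's children list needs no change, and this also works when the match is the root.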
The alternative is to do the replacement in the parent of the node you're looking for. This won't work at the top of your tree, so you'll need special logic to handle that situation.
def replace(node, newNode):
    if node.data == 1:
        raise ValueError("can't replace the current node this way")
    for index, child in enumerate(node.children):
        if child.data == 1:
            node.children[index] = newNode
            return True
        if replace(child, newNode):
            return True
    return False
I've also added some extra logic to stop the recursive processing of the tree when the appropriate node has been found. The function will return True if a replacement has been made, or False if the right data value was not found.

Python tree operations

I need to implement (or just use) a tree data structure on which I can perform:
1. Child additions at any specified position. The new child can itself be a big tree (need not be a singleton)
2. Subtree deletions and moving (to another node in the same tree)
3. Common traversal operations.
4. Access parent from child node.
First, is there any module I can use for this?
Second, if I were to implement this by myself, I've this concern:
When I do tree manipulations like moving subtrees, removing subtrees or adding new subtrees, I only wish to move the "references" to these tree nodes. For example, in C/C++ these operations can be performed by pointer manipulations and I can be assured that only the references are being moved.
Similarly, when I do tree "movements" I need to move only the reference - aka, a new copy of the tree should not be created at the destination.
I'm still in a "pointers" frame of thinking, hence the question. Maybe I don't need to do all this?
You can easily make your own tree with operator overloading. For example, here is a basic class with __add__ implemented :
class Node(object):
    def __init__(self, value):
        self.value = value
        self.child = []
    def add_child(self, child):
        self.child.append(child)
    def __add__(self, element):
        if type(element) != Node:
            raise NotImplementedError("Addition is possible only between 2 nodes")
        self.value += element.value  # it was unclear to me whether the children should be added too
        return self  # return the Node object
So to answer your second question, there is a Python trick here. In __add__ you return self. Then this prints True:
a = Node(1)
b = Node(2)
print(a is a + b)
If you use a + b, this will modify the value of a. a and b are, in fact, references (much like pointers). So if you pass them as arguments to a function and modify them inside it, the a and b instances themselves will be modified. There are two different ways to avoid this (maybe more, but these are the two I use):
The first one is to directly modify the definition of __add__:
def __add__(self, element):
    # .../...
    value = self.value + element.value
    return Node(value)  # you may add lines here to copy the children
The second one is to add a copy method:
def copy(self):
    # .../...
    n = Node(self.value)
    n.child = self.child[:]  # Copy the list, in order to have 2 different instances of this list.
    return n
This will allow you to do something like c = a.copy() + b, and the assertion c is a will be false.
Hope I answered your question.
This is an example for you:
class BinaryTree:
    def __init__(self,rootObj):
        self.key = rootObj
        self.leftChild = None
        self.rightChild = None
    def insertLeft(self,newNode):
        if self.leftChild == None:
            self.leftChild = BinaryTree(newNode)
        else:
            t = BinaryTree(newNode)
            t.leftChild = self.leftChild
            self.leftChild = t
    def insertRight(self,newNode):
        if self.rightChild == None:
            self.rightChild = BinaryTree(newNode)
        else:
            t = BinaryTree(newNode)
            t.rightChild = self.rightChild
            self.rightChild = t
    def getRightChild(self):
        return self.rightChild
    def getLeftChild(self):
        return self.leftChild
    def setRootVal(self,obj):
        self.key = obj
    def getRootVal(self):
        return self.key
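On the original concern about copies: assigning or appending an object in Python stores a reference to it, so moving a subtree never duplicates it. The sketch below illustrates this with a hypothetical TreeNode class (not from any library) that also keeps a parent link; it does not guard against moving a node underneath its own descendant.

```python
class TreeNode(object):
    """Hypothetical general tree node with a parent link."""

    def __init__(self, value):
        self.value = value
        self.parent = None
        self.children = []

    def detach(self):
        """Unlink this subtree from its parent; the subtree itself is untouched."""
        if self.parent is not None:
            # Remove by identity so nodes with equal values are not confused.
            self.parent.children = [c for c in self.parent.children if c is not self]
            self.parent = None

    def add_child(self, child, index=None):
        """Attach child (possibly the root of a big subtree) at index, or at the end."""
        child.detach()
        child.parent = self
        if index is None:
            self.children.append(child)
        else:
            self.children.insert(index, child)


root = TreeNode("root")
a = TreeNode("a")
b = TreeNode("b")
leaf = TreeNode("leaf")
root.add_child(a)
root.add_child(b)
a.add_child(leaf)

b.add_child(a)  # move the whole subtree rooted at a under b

print(b.children[0] is a)     # True: same object, no copy was made
print(a.children[0] is leaf)  # True: the subtree came along unchanged
```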

Sum of length of the branches in a tree

For example, a tree like this:
    5
   / \
  3   6
 / \
7   2
print(tree.branchLenSum())
will be 1+1+2+2=6
Tree class:
class BinaryTree:
    # Constructor, takes in new key value
    def __init__(self, myKey):
        self.key = myKey
        self.leftChild = None
        self.rightChild = None
    # Returns root key value
    def getRootValue(self):
        return self.key
    # Changes root key value
    def setRootValue(self, newKey):
        self.key = newKey
    # Returns reference to left child
    def getLeftChild(self):
        value=None
        if self.leftChild!=None:
            value=self.leftChild
        return value
    # Returns reference to right child
    def getRightChild(self):
        value=None
        if self.rightChild!=None:
            value = self.rightChild
        return value
    def insertLeftChild(self, childKey):
        newNode = BinaryTree(childKey)
        newNode.leftChild = self.leftChild
        self.leftChild = newNode
    # Inserts key as right child. Existing right child becomes new right child
    # of new key
    def insertRightChild(self, childKey):
        newNode = BinaryTree(childKey)
        newNode.rightChild = self.rightChild
        self.rightChild = newNode
The tree I have built for the example:
tree=BinaryTree(5)
tree.insertLeftChild(3)
tree.insertRightChild(6)
nodeA=tree.getLeftChild()
nodeA.insertLeftChild(7)
nodeA.insertRightChild(2)
What I have so far:
def branchLenSum(self):
    rounds=0
    if self.getLeftChild() ==None and self.getRightChild()==None:
        return rounds+rounds+1
    else:
        rounds+=rounds+1
        if self.getLeftChild()!=None:
            rounds+=self.getLeftChild().branchLenSum()
        if self.getRightChild()!=None:
            rounds+=self.getRightChild().branchLenSum()
        return rounds
My idea is that every time I travel to the next node, the counter adds 1 plus the counter itself. I think this will give the sum of all the lengths.
Okay, so the reason why you only get a result of 5 is rather simple: what you are doing is counting the nodes. So in your case, you have 5 nodes, so the result is 5.
If you want to get the internal path length, then I believe you will have to keep track of the current depth while navigating through the tree. You can do this simply by using an optional parameter.
def branchLenSum(self, depth = 0):
    rounds = depth
    if self.leftChild:
        rounds += self.leftChild.branchLenSum(depth + 1)
    if self.rightChild:
        rounds += self.rightChild.branchLenSum(depth + 1)
    return rounds
In this case, whenever we navigate down to a child, we increase the current depth by one. And when counting the branch length of a node, we start at the depth.
Btw. note that officially, the internal path length is defined as the length for only the internal nodes, i.e. not leaves. The method above counts every node including leaves. If you want to follow the official definition, you will have to add a leaf-check at the beginning and return 0 for leaves.
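For comparison, here is a sketch of the leaf-check variant just described. The method name internalPathLength is hypothetical, and the BinaryTree class is trimmed to the attributes the method needs.

```python
class BinaryTree:
    def __init__(self, key):
        self.key = key
        self.leftChild = None
        self.rightChild = None

    def internalPathLength(self, depth=0):
        """Sum the depths of internal (non-leaf) nodes only."""
        if self.leftChild is None and self.rightChild is None:
            return 0  # leaf-check: leaves contribute nothing
        total = depth
        if self.leftChild is not None:
            total += self.leftChild.internalPathLength(depth + 1)
        if self.rightChild is not None:
            total += self.rightChild.internalPathLength(depth + 1)
        return total


# Same shape as the example tree: 5 -> (3 -> (7, 2), 6)
tree = BinaryTree(5)
tree.leftChild = BinaryTree(3)
tree.rightChild = BinaryTree(6)
tree.leftChild.leftChild = BinaryTree(7)
tree.leftChild.rightChild = BinaryTree(2)

print(tree.internalPathLength())  # 1: only nodes 5 (depth 0) and 3 (depth 1) are internal
```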
Some other things:
The methods getLeftChild and getRightChild do effectively nothing. You assign None to the return value, then check if the left/right child is None and if that’s not the case you assign the child to the return value and return it.
So essentially, you are returning self.leftChild/self.rightChild; there’s no need to actually look at the value and check for None.
In Python, you usually don't use accessor or mutator methods (getters/setters); you just access the underlying attribute itself. This makes the methods getLeftChild, getRightChild, getRootValue and setRootValue redundant.
Comparing against None with != None or == None is an antipattern. If you want to check that, for example, a child is not None, just do if child (or, more explicitly, if child is not None). And if you want to check that it is not set (i.e. is None), just do if not child (or if child is None).

How to use a generator to iterate over a tree's leafs

The problem:
I have a trie and I want to return the information stored in it. Some leaves have information (set as value > 0) and some leaves do not. I would like to return only those leaves that have a value.
As in all tries, the number of leaves under each node is variable, and the key to each value is actually made up of the path necessary to reach each leaf.
I am trying to use a generator to traverse the tree postorder, but I cannot get it to work. What am I doing wrong?
My module:
class Node():
    '''Each leaf in the trie is a Node() class'''
    def __init__(self):
        self.children = {}
        self.value = 0
class Trie():
    '''The Trie() holds all nodes and can return a list of their values'''
    def __init__(self):
        self.root = Node()
    def add(self, key, value):
        '''Store a "value" in a position "key"'''
        node = self.root
        for digit in key:
            number = digit
            if number not in node.children:
                node.children[number] = Node()
            node = node.children[number]
        node.value = value
    def __iter__(self):
        return self.postorder(self.root)
    def postorder(self, node):
        if node:
            for child in node.children.values():
                self.postorder(child)
            # Do my printing / job related stuff here
            if node.value > 0:
                yield node.value
Example use:
>>> trie = Trie()
>>> trie.add('foo', 3)
>>> trie.add('foobar', 5)
>>> trie.add('fobaz', 23)
>>> for key in trie:
...     print key
...
3
5
23
I know that the example given is simple and can be solved using any other data structure. However, it is important for this program to use a trie as it is very beneficial for the data access patterns.
Thanks for the help!
Note: I have omitted newlines in the code block to be able to copy-paste with greater ease.
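Changing
self.postorder(child)
to
for n in self.postorder(child):
    yield n
seems to make it work. Calling self.postorder(child) on its own only creates a generator object; nothing it yields reaches the caller unless you iterate over it and re-yield each item.
P.S. It was very helpful of you to leave out the blank lines for ease of cut & paste :)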
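On Python 3.3 and later, the same delegation can be written with `yield from`. The sketch below assumes Python 3 (the question's code is Python 2, where the explicit loop above is needed) and trims the classes to the parts involved.

```python
class Node(object):
    def __init__(self):
        self.children = {}
        self.value = 0


class Trie(object):
    def __init__(self):
        self.root = Node()

    def add(self, key, value):
        """Store value at the node reached by following key's characters."""
        node = self.root
        for char in key:
            if char not in node.children:
                node.children[char] = Node()
            node = node.children[char]
        node.value = value

    def __iter__(self):
        return self.postorder(self.root)

    def postorder(self, node):
        for child in node.children.values():
            # Delegate to the child's generator and re-yield everything it produces.
            yield from self.postorder(child)
        if node.value > 0:
            yield node.value


trie = Trie()
trie.add('foo', 3)
trie.add('foobar', 5)
trie.add('fobaz', 23)

print(sorted(trie))  # [3, 5, 23]
```

Note that a postorder walk yields deeper nodes before their ancestors, so without the sort the 5 stored under 'foobar' comes out before the 3 stored under 'foo'.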
