Python BST, change data of leaf nodes

I'm trying to create a function that changes the data in the leaf nodes (the ones with no child nodes) of a binary search tree to "Leif". Currently I have this code for my BST:
def add(tree, value, name):
    if tree == None:
        return {'data':value, 'data1':name, 'left':None, 'right':None}
    elif value < tree['data']:
        tree['left'] = add(tree['left'],value,name)
        return tree
    elif value > tree['data']:
        tree['right'] = add(tree['right'],value,name)
        return tree
    else: # value == tree['data']
        return tree # ignore duplicate
Essentially, I want to write a function that changes the name in data1 to "Leif" whenever a node has no children. What is the best way to achieve this? Thanks in advance.

Split the problem into smaller problems which can be solved with simple functions.
def is_leaf(tree):
    return tree['left'] is None and tree['right'] is None
def traverse(tree):
    # Yield every node of the tree, parents before children
    if tree is not None:
        yield tree
        for side in ['left', 'right']:
            for child in traverse(tree[side]):
                yield child
def set_data1_in_leafes_to_leif(tree):
    # Python 3: the built-in filter replaces itertools.ifilter
    for leaf in filter(is_leaf, traverse(tree)):
        leaf['data1'] = 'Leif'
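For comparison, here is a minimal recursive sketch of the same idea that does not need the generator. It assumes the dict-based tree from the question; the name rename_leaves and the sample values are made up for illustration.

```python
def add(tree, value, name):
    # Same insertion logic as in the question, slightly condensed
    if tree is None:
        return {'data': value, 'data1': name, 'left': None, 'right': None}
    if value < tree['data']:
        tree['left'] = add(tree['left'], value, name)
    elif value > tree['data']:
        tree['right'] = add(tree['right'], value, name)
    return tree  # duplicates are ignored


def rename_leaves(tree, new_name='Leif'):
    # Hypothetical helper: overwrite data1 in every node without children
    if tree is None:
        return
    if tree['left'] is None and tree['right'] is None:
        tree['data1'] = new_name
        return
    rename_leaves(tree['left'], new_name)
    rename_leaves(tree['right'], new_name)


root = None
for value, name in [(5, 'Ann'), (3, 'Bob'), (8, 'Cid'), (1, 'Dee')]:
    root = add(root, value, name)
rename_leaves(root)
```

In this sample tree only the nodes holding 1 and 8 are leaves, so only their data1 changes.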

Related

Python For loop display previous value if correct one match with XML

Right now I have this XML.
I need to get one value ("enabled") from the node just before the one I match.
This is the code I'm using:
def checktap(text):
    nodes = minidom.parse("file.xml").getElementsByTagName("node")
    for node in nodes:
        if node.getAttribute("text") == text:
            if node.getAttribute("enabled") == True:
                return ("Yes")
            else:
                return ("No")
value = checktap("EU 39")
print(value)
With this code I get the exact node I'm searching for, whose value is enabled=True, but I need the node just before it (android.widget.LinearLayout), whose value is enabled=False.

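You can use itertools.pairwise (Python 3.10+) to walk over each node together with the one before it, and read "enabled" from the earlier node:
from itertools import pairwise
from xml.dom import minidom
def checktap(text):
    nodes = minidom.parse("file.xml").getElementsByTagName("node")
    for previous, current in pairwise(nodes):
        if current.getAttribute("text") == text:
            # getAttribute returns a string, so compare it explicitly
            return previous.getAttribute("enabled").lower() == "true"
    return False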
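On Python versions older than 3.10, the same pairing can be sketched with zip. The XML below is an invented sample (the original file is not shown), and it assumes the attribute values are the strings "true" and "false"; previous_enabled is a hypothetical name.

```python
from xml.dom import minidom

# Invented sample; the real file.xml from the question is not available
SAMPLE = """<hierarchy>
  <node class="android.widget.LinearLayout" text="" enabled="false"/>
  <node class="android.widget.TextView" text="EU 39" enabled="true"/>
</hierarchy>"""


def previous_enabled(xml_text, text):
    # Nodes come back in document order; copy them into a plain list
    nodes = list(minidom.parseString(xml_text).getElementsByTagName("node"))
    # zip(nodes, nodes[1:]) pairs every node with its successor
    for previous, current in zip(nodes, nodes[1:]):
        if current.getAttribute("text") == text:
            return previous.getAttribute("enabled")
    return None  # no match, or the match was the very first node


print(previous_enabled(SAMPLE, "EU 39"))
```

Note that getElementsByTagName also returns nested nodes, so "the node before" here means the previous node in document order, which may be the parent.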

Creating Binary Tree

Most of the questions I've found about binary trees show implementations of binary search trees, not plain binary trees. The terms of a complete binary tree are:
Either an empty tree or it has 1 node with 2 children, where each child is another Binary Tree.
All levels are full (except for possibly the last level).
All leaves on the bottom-most level are as far left as possible.
I've come up with a concept, but it doesn't seem to be running through the recursion properly. Does anyone know what I'm doing wrong?
class Node():
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
    def add(self, key):
        if self.key:
            if self.left is None:
                self.left = Node(key)
            else:
                self.left.add(key)
            if self.right is None:
                self.right = Node(key)
            else:
                self.right.add(key)
        else:
            self.key = key
        return (self.key)

The problem in your code is that you are adding the same value multiple times. You add the node, and then still recurse deeper, where you do the same.
The deeper problem is that you don't really know where to insert the node before you have reached the bottom level of the tree, and have detected where that level is incomplete. Finding the correct insertion point may need a traversal through the whole tree... which is defeating the speed gain you would expect to get from using binary trees in the first place.
I provide here three solutions, starting with the most efficient:
1. Using a list as tree implementation
For complete trees there is a special consideration to make: if you number the nodes by level, starting with 0 for the root, and within each level from left to right, you notice that the number of a node's parent is (k-1)//2 (integer division) when its own number is k. In the other direction: if a node with number k has children, then its left child has number k*2+1, and the right child has a number that is one greater.
Because the tree is complete, there will never be gaps in this numbering, and so you could store the nodes in a list, and use the indexes of that list for the node numbering. Adding a node to the tree now simply means you append it to that list. Instead of a Node object, you just have the tree list, and the index in that list is your node reference.
Here is an implementation:
class CompleteTree(list):
    def add(self, key):
        self.append(key)
        return len(self) - 1
    def left(self, i):
        # index of the left child, or -1 if there is none
        return i * 2 + 1 if i * 2 + 1 < len(self) else -1
    def right(self, i):
        # index of the right child, or -1 if there is none
        return i * 2 + 2 if i * 2 + 2 < len(self) else -1
    @staticmethod
    def parent(i):
        return (i - 1) // 2
    def swapwithparent(self, i):
        if i > 0:
            p = self.parent(i)
            self[p], self[i] = self[i], self[p]
    def inorder(self, i=0):
        left = self.left(i)
        right = self.right(i)
        if left >= 0:
            yield from self.inorder(left)
        yield i
        if right >= 0:
            yield from self.inorder(right)
    @staticmethod
    def depth(i):
        return (i + 1).bit_length() - 1
Here is a demo that creates your example tree, and then prints the keys visited in an in-order traversal, indented by their depth in the tree:
tree = CompleteTree()
tree.add(1)
tree.add(2)
tree.add(3)
tree.add(4)
tree.add(5)
for node in tree.inorder():
    print(" " * tree.depth(node), tree[node])
Of course, this means you have to reference nodes a bit differently than you would with a real Node class, but the efficiency gain pays off.
2. Using an extra property
If you know how many nodes there are in a (sub)tree, then from the bit representation of that number, you can know where exactly the next node should be added.
For instance, in your example tree you have 5 nodes. Imagine you want to add a 6 to that tree. The root node would tell you that you currently have 5 and so you need to update it to 6. In binary that is 110. Ignoring the left-most 1-bit, the rest of the bits tell you whether to go left or right. In this case, you should go right (1) and then finally left (0), creating the node in that direction. You can do this iteratively or recursively.
Here is an implementation with recursion:
class Node():
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.count = 1
    def add(self, key):
        self.count += 1
        if self.left is None:
            self.left = Node(key)
        elif self.right is None:
            self.right = Node(key)
        # extract from the count the second-most significant bit:
        elif self.count & (1 << (self.count.bit_length() - 2)):
            self.right.add(key)
        else:
            self.left.add(key)
    def inorder(self):
        if self.left:
            yield from self.left.inorder()
        yield self
        if self.right:
            yield from self.right.inorder()
tree = Node(1)
tree.add(2)
tree.add(3)
tree.add(4)
tree.add(5)
for node in tree.inorder():
    print(node.key)
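The iterative variant mentioned above could look roughly like this. It is only a sketch: it keeps a single count on a wrapper object instead of on every node, and the names CountedTree and level_order are invented for the example.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


class CountedTree:
    """Hypothetical wrapper that stores only the total node count."""

    def __init__(self):
        self.root = None
        self.count = 0

    def add(self, key):
        self.count += 1
        node = Node(key)
        if self.root is None:
            self.root = node
            return
        # Binary digits of the new count without the leading 1:
        # '0' means go left, '1' means go right
        bits = bin(self.count)[3:]
        current = self.root
        for bit in bits[:-1]:
            current = current.right if bit == "1" else current.left
        # The last digit says on which side the new node is attached
        if bits[-1] == "1":
            current.right = node
        else:
            current.left = node


def level_order(root):
    # Breadth-first list of keys, used to check the shape of the tree
    keys, queue = [], [root]
    while queue:
        node = queue.pop(0)
        if node is not None:
            keys.append(node.key)
            queue += [node.left, node.right]
    return keys


tree = CountedTree()
for key in range(1, 7):
    tree.add(key)
print(level_order(tree.root))
```

Because the tree stays complete, a breadth-first walk returns the keys in insertion order.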
3. Without extra property
If no property can be added to Node objects, then a more extensive search is needed to find the right insertion point:
class Node():
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
    def newparent(self):
        # Finds the node that should serve as parent for a new node
        # It returns a two-element list:
        # if parent found: [-1, parent for new node]
        # if not found: [height, left-most leaf]
        # In the latter case, the subtree is perfect, and its left-most
        # leaf is the node to be used, unless self is a right child
        # and its sibling has the insertion point.
        if self.right:
            right = self.right.newparent()
            if right[0] == -1: # found imbalance
                return right
            left = self.left.newparent()
            if left[0] == -1: # found imbalance
                return left
            if left[0] != right[0]:
                return [-1, right[1]] # found imbalance
            # temporary result in perfect subtree
            return [left[0]+1, left[1]]
        elif self.left:
            return [-1, self] # found imbalance
        # temporary result for leaf
        return [0, self]
    def add(self, key):
        _, parent = self.newparent()
        if not parent.left:
            parent.left = Node(key)
        else:
            parent.right = Node(key)
    def __repr__(self):
        s = ""
        if self.left:
            s += str(self.left).replace("\n", "\n ")
        s += "\n" + str(self.key)
        if self.right:
            s += str(self.right).replace("\n", "\n ")
        return s
tree = Node(1)
tree.add(2)
tree.add(3)
tree.add(4)
tree.add(5)
print(tree)
This recursively searches the tree from right to left to find the candidate parent of the node to be added.
For large trees, this can be improved a bit, by doing a binary-search among paths from root to leaf, based on the length of those paths. But it will still not be as efficient as the first two solutions.

You can use scikit-learn's decision trees, since they can also be set up as binary decision trees. See the scikit-learn documentation.

You really need to augment your tree in some way. Since this is not a binary search tree, the only real information you have about each node is whether or not it has a left and right child. Unfortunately, this isn't helpful in navigating a complete binary tree. Imagine a complete binary tree with 10 levels. Until the 9th level, every single node has both a left child and a right child, so you have no way of knowing which path to take down to the leaves. So the question is, what information do you add to each node? I would add the count of nodes in that tree.
Maintaining the count is easy, since every time you descend down a subtree you know to add one to the count at that node. What you want to recognize is the leftmost imperfect subtree. Every perfect binary tree has n = 2^k - 1, where k is the number of levels and n is the number of nodes. There are quick and easy ways to check if a number is 1 less than a power of two (see the first answer to this question), and in fact in a complete binary tree every node has at most one child that isn't the root of a perfect binary tree. Follow a simple rule to add nodes:
If the left child is None, set root.left = Node(key) and return
Else if the right child is None, set root.right = Node(key) and return
If one of the children of the current node is the root of an imperfect subtree, make that node the current node (descend down that subtree)
Else if the sizes are unequal, make the node with the smaller subtree the current node.
Else, make the left child the current node.
By augmenting each node with the size of the subtree rooted there, you have all the information you need at every node to build a recursive solution.
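Here is one way the rule above could be written down iteratively. Treat it as a sketch rather than a reference implementation: SizedNode, is_perfect and level_order are names chosen for this example, and the power-of-two test uses the usual bit trick n & (n + 1) == 0.

```python
class SizedNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1  # number of nodes in the subtree rooted here


def is_perfect(node):
    # A perfect tree has 2**k - 1 nodes, so size + 1 is a power of two
    return (node.size & (node.size + 1)) == 0


def add(root, key):
    current = root
    while True:
        current.size += 1  # the new node ends up somewhere below
        if current.left is None:
            current.left = SizedNode(key)
            return
        if current.right is None:
            current.right = SizedNode(key)
            return
        left, right = current.left, current.right
        if not is_perfect(left):
            current = left
        elif not is_perfect(right):
            current = right
        elif left.size != right.size:
            current = left if left.size < right.size else right
        else:
            current = left


def level_order(root):
    # Breadth-first list of keys, used to check the shape of the tree
    keys, queue = [], [root]
    while queue:
        node = queue.pop(0)
        if node is not None:
            keys.append(node.key)
            queue += [node.left, node.right]
    return keys


root = SizedNode(1)
for key in range(2, 11):
    add(root, key)
print(level_order(root))
```

Each insertion only follows one path from the root, so it takes time proportional to the height of the tree.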

Retrieve Graph Lowest Height Node with Filter

Given a tree T, which may or may not be binary, I need to retrieve the lowest node in each branch that matches a criterion.
So, I need to retrieve a list (array) of those red-marked nodes, whose label is equal to "NP": node.label() == 'NP'.
I'm actually using the NLTK Tree (nltk.tree.Tree) data structure, but you can post just the pseudocode and I can implement it.
Here is the code that I've tried:
def traverseTree(tree):
    if not isinstance(tree, nltk.Tree): return []
    h = []
    for subtree in tree:
        if type(subtree) == nltk.tree.Tree:
            t = traverseTree(subtree)
            if subtree.label() == 'NP' and len(t) == 0: h.append(subtree)
    return h

Your conditional appends subtree when there are no better candidates below it, but what if len(t) > 0? In that case you want to keep the nodes found in the sub-calls:
def traverseTree(tree):
    if not isinstance(tree, nltk.Tree): return []
    h = []
    for subtree in tree:
        if type(subtree) == nltk.tree.Tree:
            t = traverseTree(subtree)
            #RIGHT HERE!! need to extend by t or the other found nodes are thrown out
            h.extend(t)
            if subtree.label() == 'NP' and len(t) == 0:
                h.append(subtree)
    return h
Keep in mind that if t is always empty you would append all the valid nodes one level below, but any end-of-branch "NP" nodes will be found and returned in t so you want to pass them up a level in the recursion.
Edit: the only case where this would fail is if the top level node is "NP" and there are no sub-nodes of "NP" in which case tree should be added to h:
    #after for loop has finished
    if len(h) == 0 and tree.label() == "NP":
        h.append(tree)
    return h
Edit 2: if you add tree to h, then the check on the subtrees will never actually come true, since both checks test the same node with the same condition, just at different levels of recursion. So you can simply write the function like this:
def traverseTree(tree):
    if not isinstance(tree, nltk.Tree): return []
    h = []
    for subtree in tree:
        #no need to check here as well as right inside the call
        h.extend(traverseTree(subtree))
    if tree.label() == 'NP' and len(h) == 0:
        h.append(tree)
    return h
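To try the final version without installing NLTK, here is a small self-contained sketch. The Tree class below is only a stand-in that mimics the two things the function relies on (iteration over children and a label() method); it is not the real nltk.Tree, and the sample sentence is invented.

```python
class Tree(list):
    """Hypothetical stand-in for nltk.Tree: a labelled list of children."""

    def __init__(self, label, children=()):
        super().__init__(children)
        self._label = label

    def label(self):
        return self._label


def lowest_matching(tree, label="NP"):
    if not isinstance(tree, Tree):
        return []  # a leaf token (plain string)
    found = []
    for child in tree:
        found.extend(lowest_matching(child, label))
    # Only report this node if nothing deeper already matched
    if tree.label() == label and not found:
        found.append(tree)
    return found


sentence = Tree("S", [
    Tree("NP", [
        Tree("NP", [Tree("DT", ["the"]), Tree("NN", ["cat"])]),
        Tree("PP", [Tree("IN", ["on"]), Tree("NP", [Tree("NN", ["mats"])])]),
    ]),
    Tree("VP", [Tree("VBZ", ["sleeps"])]),
])
result = lowest_matching(sentence)
print([node.label() for node in result])
```

The outer NP is skipped because two deeper NP nodes were already found below it.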

How to use recursion on a list of lists that doesn't follow a predictable pattern

I am trying to create a function that takes in a list of lists and returns a binary tree built from the elements in the list of lists. The list of lists follows this format: ["root", "left subtree", "right subtree"]
How can I use recursion to parse a list of lists and create a binary tree?
The main problem is that the list of lists does not necessarily follow the same pattern.
So, for example :
lst = ['root', ['left_child', 'leaf' , 'leaf'], ['right_child', 'leaf', 'leaf']]
but it can also be:
lst = ['root', [], ['right_child', ['leaf', 'value', 'value'] , 'leaf']]
The lists can vary in many ways, so the main problem boils down to how to use recursion to go through a list of lists that does not follow a predictable pattern other than ["root", "left subtree", "right subtree"]. What kind of conditional statements would help me avoid index errors?
Note:
The class BinaryTree has setLeft and setRight built in already and asks for a root upon initialization
Thank you for your insight.

This is a predictable pattern; it is just that a tree can also be an empty list or a string (that is, a leaf). So that is your pattern: any time you can have a tree, it can take one of three forms:
an empty list, which likely corresponds to None in your BinaryTree, but I can't say for sure
a string, a leaf, which I also don't know how your BinaryTree handles
a full binary tree node, which you seem to understand already.
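The three cases translate fairly directly into a recursive function. The sketch below makes two assumptions that the question does not confirm: that an empty list should become None, and that a leaf string becomes a node without children. The BinaryTree class here is a stand-in written for this example; the asker's real class may differ.

```python
class BinaryTree:
    """Hypothetical stand-in for the asker's class."""

    def __init__(self, root):
        self.root = root
        self.left = None
        self.right = None

    def setLeft(self, subtree):
        self.left = subtree

    def setRight(self, subtree):
        self.right = subtree


def build(item):
    if item == []:
        return None              # case 1: empty subtree
    if isinstance(item, str):
        return BinaryTree(item)  # case 2: leaf
    root, left, right = item     # case 3: [root, left, right]
    tree = BinaryTree(root)
    tree.setLeft(build(left))
    tree.setRight(build(right))
    return tree


lst = ['root', [], ['right_child', ['leaf', 'value', 'value'], 'leaf']]
tree = build(lst)
```

Because every element is checked before it is unpacked, the function never indexes into an empty list or a string.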

Here is an implementation I spun up in a few minutes. It builds a binary tree from a list and then does an in-order traversal to verify that the tree was built correctly. I'm using isinstance to check whether the node is a leaf (i.e. a string) or not (i.e. a list), but this is a bit of a hack.
class Tree:
    def __init__(self, val=None):
        self.root = val
        self.left = None
        self.right = None
    def in_order_print(self):
        if self.root is None:
            return
        if isinstance(self.left, Tree):
            self.left.in_order_print()
        print(self.root)
        if isinstance(self.right, Tree):
            self.right.in_order_print()
def grab(alist):
    return alist[0], alist[1], alist[2]
def recurse_it(atree, alist):
    # Python 3: str replaces the Python 2 basestring
    if alist == [] or isinstance(alist, str):
        return
    root, left, right = grab(alist)
    atree.root = root
    atree.left = Tree(left)
    atree.right = Tree(right)
    recurse_it(atree.left, left)
    recurse_it(atree.right, right)
    return atree
def list_to_tree(alist):
    return recurse_it(Tree(), alist)
def main():
    lst1 = ['root', ['left_child', 'leaf', 'leaf'],
            ['right_child', 'leaf', 'leaf']]
    lst2 = ['root', [], ['right_child', ['leaf', 'value', 'value'], 'leaf']]
    thetree1 = list_to_tree(lst1)
    thetree1.in_order_print()
    print('___________________________')
    thetree2 = list_to_tree(lst2)
    thetree2.in_order_print()
if __name__ == '__main__':
    main()
"""
Tree1:
[ root ]
[left] [right]
[leaf][leaf] [leaf][leaf]
Tree2:
[ root ]
[] [right]
[leaf] [leaf]
[value][value]
"""

BeautifulSoup lowest common ancestor

Does the BeautifulSoup library for Python have any function that can take a list of nodes and return the lowest common ancestor?
If not, has any of you ever implemented such a function and care to share it?

I think this is what you want, with link1 being one element and link2 being another:
link_1_parents = list(link1.parents)[::-1]
link_2_parents = list(link2.parents)[::-1]
common_parent = [x for x,y in zip(link_1_parents, link_2_parents) if x is y][-1]
print(common_parent)
print(common_parent.name)
It'll basically walk both elements' parents from root down, and return the last common one.
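The same root-down walk can be sketched for any number of nodes. The version below only relies on each node having a .parent attribute that is None at the top, which BeautifulSoup tags provide; the small N class exists only so the example runs without BeautifulSoup. Unlike the snippet above, it counts a node as its own ancestor, which is a design choice of this sketch.

```python
def ancestors_from_root(node):
    # Chain from the top-most ancestor down to the node itself
    chain = []
    while node is not None:
        chain.append(node)
        node = node.parent
    return chain[::-1]


def lowest_common_ancestor(*nodes):
    chains = [ancestors_from_root(node) for node in nodes]
    common = None
    # Walk all chains in lockstep and stop at the first divergence
    for level in zip(*chains):
        if all(item is level[0] for item in level):
            common = level[0]
        else:
            break
    return common


class N:
    """Hypothetical stand-in node with only a parent pointer."""

    def __init__(self, parent=None):
        self.parent = parent


root = N()
a = N(root)
b = N(a)
c = N(a)
d = N(c)
lca = lowest_common_ancestor(b, d)
```

Identity (is) is used instead of == because two different tags with the same content would otherwise be treated as the same node.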

The accepted answer does not work if the distance from a tag in the input list to the lowest common ancestor is not exactly the same for every node in the input.
It also uses every ancestor of each node, which is unnecessary and could be very expensive in some cases.
import collections
def lowest_common_ancestor(parents=None, *args):
    if parents is None:
        parents = collections.defaultdict(int)
    for tag in args:
        if not tag:
            continue
        parents[tag] += 1
        if parents[tag] == len(args):
            return tag
    return lowest_common_ancestor(parents, *[tag.parent if tag else None for tag in args])

Since Arthur's answer is not correct in some cases, I modified it and give my own version here. I have tested the code for LCA with two nodes as input.
import collections
def lowest_common_ancestor(parents=None, *args):
    if parents is None:
        parents = collections.defaultdict(int)
    for tag in args:
        parents[tag] += 1
        if parents[tag] == NUM_OF_NODES:
            return tag
    next_arg_list = [tag.parent for tag in args if tag.parent is not None]
    return lowest_common_ancestor(parents, *next_arg_list)
Call the function like:
list_of_tag = [tag_a, tag_b]
NUM_OF_NODES = len(list_of_tag)
lca = lowest_common_ancestor(None, *list_of_tag)
print(lca)

You could also compute XPaths of all elements and then use os.path.commonprefix. I am not familiar with BeautifulSoup, but in lxml, I have done this:
import os
import lxml.etree
import lxml.html
def lowest_common_ancestor(nodes: list[lxml.html.HtmlElement]):
    if len(set(nodes)) == 1: # all nodes are the same
        return nodes[0]
    tree: lxml.etree._ElementTree = nodes[0].getroottree()
    xpaths = [tree.getpath(node) for node in nodes]
    lca_xpath = os.path.commonprefix(xpaths)
    lca_xpath = lca_xpath.rsplit('/', 1)[0] # strip partially matching tag names
    return tree.xpath(lca_xpath)[0]
