Store dictionary key value pair to tree structure - python

I have generated key:value pairs from Excel data, and now I want to store them in a tree structure. Since the dictionary loses its order, I have stored all the keys in a separate data frame to preserve the order of tree generation.
Here is the example data:
key_value_dict = {(key3:value),(key2:value),(key4:value),(key1:value),(key5:value),..}
df_all_keys_inOrder = [key1,key2,key3,key1,key4,key5,key1,key2,key6,..]
for i in df_all_keys_inOrder.index:
    for key, value in key_value_dict.iteritems():
        if df_all_keys_inOrder[i] == key:
            if i == 0:
                root = Node((key,value))
                leaf = root
            else:
                leaf = Node((key,value), parent= leaf)
The problem with this code is that when it comes to the root node (key1) again, instead of creating children of the root node, it creates a new node with key1.
The resultant tree should look like : https://i.stack.imgur.com/tN7em.png
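One possible fix, sketched under assumptions: the Node class below is a minimal hypothetical stand-in for whatever tree library the question uses, and build_tree is an illustrative name. It assumes that every reappearance of the root key means "start a new branch from the root", which is only a guess at what the linked image shows. It also looks each key up directly in the dictionary instead of looping over all items.

```python
class Node:
    """Minimal stand-in for a tree node (hypothetical, not a library class)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def build_tree(keys_in_order, key_value_dict):
    """Build a tree from ordered keys; the first key becomes the root."""
    root = None
    leaf = None
    for key in keys_in_order:
        value = key_value_dict[key]  # direct lookup, no inner loop needed
        if root is None:
            # First key: create the root
            root = leaf = Node((key, value))
        elif key == root.name[0]:
            # Root key seen again (assumption): restart from the root
            # instead of creating a second node for it
            leaf = root
        else:
            # Any other key: hang it below the current position
            leaf = Node((key, value), parent=leaf)
    return root


keys = ["key1", "key2", "key3", "key1", "key4", "key5", "key1", "key2", "key6"]
values = {k: k.upper() for k in set(keys)}  # placeholder values
root = build_tree(keys, values)
print([child.name[0] for child in root.children])
```

With this rule each run of keys between two occurrences of the root key becomes its own chain below the root. If repeated keys such as key2 should share one node instead, the else branch would need to look for an existing child first.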

Related

Concatenating tree data - How to simplify my code?

I solved an exercise where I had to apply a recursive algorithm to a tree that's so defined:
class GenericTree:
    """ A tree in which each node can have any number of children.
        Each node is linked to its parent and to its immediate sibling on the right
    """
    def __init__(self, data):
        self._data = data
        self._child = None
        self._sibling = None
        self._parent = None
I had to concatenate the data of the leaves into their parents, and so on up to the root, which ends up holding the concatenation of all the leaves' data. I solved it this way, and it works, but it seems very convoluted and mechanical:
def marvelous(self):
    """ MODIFIES each node data replacing it with the concatenation
        of its leaves data
        - MUST USE a recursive solution
        - assume node data is always a string
    """
    if not self._child: #If there isn't any child
        self._data=self._data #the value remains the same
    if self._child: #If there are children
        if self._child._child: #if there are niece
            self._child.marvelous() #reapply the function to them
        else: #if not nieces
            self._data=self._child._data #initializing the name of our root node with the name of its 1st son
            #if there are other sons, we'll add them to the root name
            if self._child._sibling: #check
                current=self._child._sibling #iterating through the sons-siblings line
                while current:
                    current.marvelous() #we reapplying the function to them to replacing them with their concatenation (bottom-up process)
                    self._data+=current._data #we sum the sibling content to the node data
                    current=current._sibling #next for the iteration
        #To add the new names to the new root node name:
        self._data="" #initializing the root str value
        current=self._child #having the child that through recursion have the correct str values, i can sum all them to the root node
        while current:
            self._data+=current._data
            current=current._sibling
    if self._sibling: #if there are siblings, they need to go through the function themselves
        self._sibling.marvelous()
Basically, I check whether the node passed in has children: if not, it keeps the same data.
If there are children, I check whether there are nieces: in that case I restart the algorithm until I can sum the leaves into the pre-terminal nodes, and I put that sum into their parents' data.
Then I act on the root node with the code after the first while loop, so as to set its name to the sum of all the leaves.
The final piece of code makes this work for the siblings at each step.
How can I improve it?
It seems to me that your method performs a lot of redundant recursive calls.
For example this loop in your code:
while current:
    current.marvelous()
    self._data += current._data
    current = current._sibling
is useless because the recursive call will be performed anyway by the last instruction in your method (self._sibling.marvelous()). Besides, you update self._data and then, right after the loop, reset self._data to "".
I tried to simplify it and came up with this solution, which seems to work.
def marvelous(self):
    if self._child:
        self._child.marvelous()
        # at this point the data of every subtree below self has
        # been computed; collect the children's data
        self._data = ""
        current = self._child
        while current:
            self._data += current._data
            current = current._sibling
    if self._sibling:
        self._sibling.marvelous()
And here is a simpler solution:
def marvelous2(self):
    if not self._child:
        result = self._data
    else:
        result = self._child.marvelous2()
        self._data = result
    if self._sibling:
        result += self._sibling.marvelous2()
    return result
marvelous2 returns the data computed for a node and all its siblings. This avoids performing the while loop of the previous solution.
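To see the return-value approach run end to end, here is a minimal self-contained sketch. It reuses the field names from the question's class; the add_child helper is hypothetical and exists only to build a small test tree.

```python
class GenericTree:
    def __init__(self, data):
        self._data = data
        self._child = None
        self._sibling = None
        self._parent = None

    def add_child(self, node):
        """Hypothetical helper: append node as the last child of self."""
        node._parent = self
        if self._child is None:
            self._child = node
        else:
            current = self._child
            while current._sibling:
                current = current._sibling
            current._sibling = node
        return node

    def marvelous2(self):
        """Return the data computed for this node and all its right siblings."""
        if not self._child:
            result = self._data
        else:
            result = self._child.marvelous2()
            self._data = result
        if self._sibling:
            result += self._sibling.marvelous2()
        return result


# Tree:  a -> (b -> (d, e), c)
a = GenericTree("a")
b = a.add_child(GenericTree("b"))
c = a.add_child(GenericTree("c"))
b.add_child(GenericTree("d"))
b.add_child(GenericTree("e"))

a.marvelous2()
print(a._data, b._data, c._data)
```

Each node is visited exactly once, because a child's call already covers the rest of its sibling chain.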

Build tree-hierarchy from two-dimensional list

I have a 2D-list that looks something like this:
[
["elem1","elem2"],
["elem1","elem3"],
["elem4","elem7"],
...
]
And I want to create a nested dictionary that then looks something like this:
[{"elem1":["elem2","elem3"]},{"elem4":"elem7"}]
So the higher the index in one of the initial sublists, the higher the hierarchical position in the generated tree will be. How would you go about this in Python? What do you call that kind of "treeification"? I feel like there has to be a package out there that does exactly that.
I don't imagine there is something in a library for this considering it is fairly simple and not that useful for most people. It is better to write the code manually.
First of all, the output format in the question cannot fully represent a tree: for example the data
[
["elem1", "elem2"],
["elem1", "elem3"],
["elem4", "elem7"],
["elem3", "elem5"],
]
...would need to be similar to [{"elem1":["elem2","elem3"]},{"elem4":"elem7"}] but with elem5 added as a child of elem3. However, elem3 is a string, with no place for children to be stored. Thus, I suggest the following output format:
{'elem4': {'elem7': {}}, 'elem1': {'elem2': {}, 'elem3': {'elem5': {}}}}
Here every node is represented as a dictionary from child node names to child node values, so a tree containing only a root node looks like {}, and a tree with 3 nodes (the root + 2 children) looks like {'child1': {}, 'child2': {}}.
To turn a list of parent-child associations into such a tree, you can use this code:
def treeify(data):
    # result dictionary
    map_list = {}
    # initially all nodes with a child, will have items removed later
    root_nodes = {parent for parent, child in data}
    for parent, child in data:
        # get the dictionary that this node maps to (empty dictionary by default)
        children = map_list.setdefault(parent, {})
        # add this connection
        children[child] = map_list.setdefault(child, {})
        # remove node with a parent from the set of root_nodes
        if child in root_nodes:
            root_nodes.remove(child)
    # return the dictionary with only root nodes at the root
    return dict((root_node, map_list[root_node]) for root_node in root_nodes)
print(treeify([
    ["elem1", "elem2"],
    ["elem1", "elem3"],
    ["elem4", "elem7"],
    ["elem3", "elem5"],
]))
Here is code that can help you get your required output:
data = [
    ["elem1","elem2"],
    ["elem1","elem3"],
    ["elem4","elem7"],
]
maplist = {}
for a in data:
    if a[0] in maplist:
        maplist[a[0]].append(a[1])
    else:
        maplist[a[0]] = [a[1]]
print(maplist)
To sort the entries by the number of children each parent has, you can use the code below:
sorted_items = sorted(maplist.items(), key = lambda item : len(item[1]), reverse=True)
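The grouping loop in this answer can also be sketched with collections.defaultdict, which removes the explicit membership check. The function name group_children is an assumption for illustration only:

```python
from collections import defaultdict


def group_children(pairs):
    """Map each parent to the list of its direct children, in input order."""
    grouped = defaultdict(list)
    for parent, child in pairs:
        grouped[parent].append(child)  # missing keys start as an empty list
    return dict(grouped)


data = [
    ["elem1", "elem2"],
    ["elem1", "elem3"],
    ["elem4", "elem7"],
]
maplist = group_children(data)
print(maplist)

# Parents with the most children first
sorted_items = sorted(maplist.items(), key=lambda item: len(item[1]), reverse=True)
print(sorted_items)
```

Like the loop it replaces, this only records one level of parent-child links; it does not nest grandchildren the way treeify does.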

Append new values to the similar keys python dictionary

I have a tree that I'm traversing using a map (i.e. nodes with the same key must be grouped together). The root node is marked 0, a step to the left subtracts 1, and a step to the right adds 1. The traversal visits the complete tree and uses the HD (horizontal distance from the root) of each node as the key and the node's data as the value. Now I want to collect all values with the same key in one place in a dictionary, like {0: ['10','13'], 1: ['12'], -2: ['12'], -1: ['11']} for the tree created below.
Code
class node:
    dict1={}
    def __init__(self,data):
        self.data=data
        self.left=None
        self.right=None
    def check_if_exists(self,hd,root):
        if not self.dict1:
            self.dict1[hd] = [root.data]
        else:
            if hd in self.dict1.keys(): ###Checking if key already exists for some node
                self.dict1[hd].append(root.data)
            else:
                self.dict1[hd] = [root.data]
    def vertical_order_tree(self,root,hd):
        if root:
            self.check_if_exists(hd,root)
            self.vertical_order_tree(root.left,hd-1)
            self.vertical_order_tree(root.right,hd+1)
root=node("10")
root.left=node("11")
root.left.left=node("12")
root.right=node("12")
#root.right.left=node("13")
root.vertical_order_tree(root,0)
print(root.dict1)
Output:
self.dict1[hd].append(root.data)
AttributeError: 'str' object has no attribute 'append'
Appending the values with the same key is causing the issue. Can anyone spot the bug here? I'm not good at handling dictionaries.
The error says self.dict1[hd] is a string. Try using self.dict1[hd] += root.data instead of self.dict1[hd].append(root.data). Of course, this only works if root.data is another string.
If you want the value at that key to be a list, then I'd suggest first creating an empty list at that key and then appending to it.
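A minimal sketch of that last suggestion, assuming a plain binary-tree node with data, left and right as in the question: collections.defaultdict(list) creates the empty list the first time a horizontal distance is seen, so every value can simply be appended. The names Node and vertical_order are illustrative and not from the original code.

```python
from collections import defaultdict


class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None


def vertical_order(root):
    """Group node data by horizontal distance from the root (pre-order walk)."""
    columns = defaultdict(list)

    def visit(node, hd):
        if node is None:
            return
        columns[hd].append(node.data)  # the list is created on first access
        visit(node.left, hd - 1)
        visit(node.right, hd + 1)

    visit(root, 0)
    return dict(columns)


root = Node("10")
root.left = Node("11")
root.left.left = Node("12")
root.right = Node("12")
root.right.left = Node("13")
result = vertical_order(root)
print(result)
```

Keeping the dictionary local to the function also avoids the shared class attribute dict1, which would otherwise accumulate values across every tree in the program.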

Binary search tree insertion Python

What is wrong with my insert function? I'm passing along the tr and the element el that I wish to insert, but I keep getting errors...
def insert( tr,el ):
    """ Inserts an element into a BST -- returns an updated tree """
    if tr == None:
        return createEyecuBST( el,None )
    else:
        if el > tr.value:
            tr.left = createEyecuBST( el,tr )
        else:
            tr.right = createEyecuBST( el,tr )
        return EyecuBST( tr.left,tr.right,tr)
Thanks in advance.
ERROR:
ValueError: Not expected BST with 2 elements
It's a test function that basically tells me whether or not what I'm putting in is what I want out.
So, the way insertion in a binary tree usually works is that you start at the root node, and then decide which side, i.e. which subtree, you want to insert your element. Once you have made that decision, you are recursively inserting the element into that subtree, treating its root node as the new root node.
However, what you are doing in your function is that instead of going down towards the tree’s leaves, you are just creating a new subtree with the new value immediately (which generally messes up the existing tree).
Ideally, a binary tree insert should look like this:
def insert (tree, value):
    if not tree:
        # The subtree we entered doesn’t actually exist. So create a
        # new tree with no left or right child.
        return Node(value, None, None)
    # Otherwise, the subtree does exist, so let’s see where we have
    # to insert the value
    if value < tree.value:
        # Insert the value in the left subtree
        tree.left = insert(tree.left, value)
    else:
        # Insert the value in the right subtree
        tree.right = insert(tree.right, value)
    # Since you want to return the changed tree, and since we expect
    # that in our recursive calls, return this subtree (where the
    # insertion has happened by now!).
    return tree
Note that this modifies the existing tree. It’s also possible to treat a tree as immutable state, where inserting an element creates a completely new tree without touching the old one. Since you are using createEyecuBST all the time, it is possible that this was your original intention.
To do that, you want to always return a newly created subtree representing the changed state of that subtree. It looks like this:
def insert (tree, value):
    if tree is None:
        # As before, if the subtree does not exist, create a new one
        return Node(value, None, None)
    if value < tree.value:
        # Insert in the left subtree, so re-build the left subtree and
        # return the new subtree at this level
        return Node(tree.value, insert(tree.left, value), tree.right)
    elif value > tree.value:
        # Insert in the right subtree and rebuild it
        return Node(tree.value, tree.left, insert(tree.right, value))
    # Final case is that `tree.value == value`; in that case, we don’t
    # need to change anything
    return tree
Note: Since I didn’t know what the difference between your createEyecuBST function and the EyecuBST type is, I’m just using a type Node here whose constructor accepts the value as the first parameter, and then the left and right subtrees as the second and third.
Since a plain binary search tree doesn't need to be rebalanced, you can keep the logic as simple as possible while traversing, step by step:
--> Compare with the root value.
--> If it is less than the root, go to the left node.
--> Otherwise, go to the right node.
--> Does the node exist? Make it the new root and repeat; otherwise add the new node with the value there.
def insert(self, val):
    treeNode = Node(val)
    placed = 0
    tmp = self.root
    if not self.root:
        self.root = treeNode
    else:
        while(not placed):
            if val<tmp.info:
                if not tmp.left:
                    tmp.left = treeNode
                    placed = 1
                else:
                    tmp = tmp.left
            else:
                if not tmp.right:
                    tmp.right = treeNode
                    placed = 1
                else:
                    tmp = tmp.right
    return
You can also make the function recursive, but it shouldn't return anything; it will just attach the node in the innermost call.
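A rough sketch of that recursive variant, assuming a Node with info, left and right attributes as used in the loop version. The BST wrapper class and the names _insert_below and in_order are illustrative assumptions, not part of the original answer.

```python
class Node:
    def __init__(self, info):
        self.info = info
        self.left = None
        self.right = None


class BST:
    def __init__(self):
        self.root = None

    def insert(self, val):
        """Attach val to the tree in place; returns nothing."""
        if self.root is None:
            self.root = Node(val)
        else:
            self._insert_below(self.root, val)

    def _insert_below(self, node, val):
        # Pick the side, then either attach here or recurse one level down
        if val < node.info:
            if node.left is None:
                node.left = Node(val)
            else:
                self._insert_below(node.left, val)
        else:
            if node.right is None:
                node.right = Node(val)
            else:
                self._insert_below(node.right, val)


def in_order(node):
    """Hypothetical helper: list the values in sorted order."""
    if node is None:
        return []
    return in_order(node.left) + [node.info] + in_order(node.right)


tree = BST()
for v in [5, 3, 8, 1, 4]:
    tree.insert(v)
print(in_order(tree.root))
```

As in the loop version, values equal to an existing node go to the right subtree, so duplicates are kept rather than ignored.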

Adding new nodes to a Tree by dendroPy

I would like to create a tree by dynamically adding nodes to an already existing tree in DendroPy. Here is how I am proceeding:
>>> t1 = dendropy.Tree(stream=StringIO("(8,3)"),schema="newick")
That creates a small tree with two children having taxon labels 8 and 3. Now I want to add a new leaf to the node with taxon label 3. In order to do that, I need the node object.
>>> cp = t1.find_node_with_taxon_label('3')
I want to use the add_child function at that point, which is an attribute of a node.
>>> n = dendropy.Node(taxon='5',label='5')
>>> cp.add_child(n)
But even after adding the node, when I print all the node objects in t1, it returns only the children 8 and 3 that it was initialized with. If we print t1 we see the tree, but I cannot find the objects that were added. For example, if we do
>>> cp1 = t1.find_node_with_taxon_label('5')
it does not return the object related to 5.
Please help me understand how to add nodes to an existing tree in DendroPy.
To add a taxon you have to explicitly create and add it to the tree:
t1 = dendropy.Tree(stream=StringIO("(8,3)"),schema="newick")
# Explicitly create and add the taxon to the taxon set
taxon_1 = dendropy.Taxon(label="5")
t1.taxon_set.add_taxon(taxon_1)
# Create a new node and assign a taxon OBJECT to it (not a label)
n = dendropy.Node(taxon=taxon_1, label='5')
# Attach the node to the tree, as in the question
cp = t1.find_node_with_taxon_label('3')
cp.add_child(n)
# Now this works
print(t1.find_node_with_taxon_label("5"))
The key is that find_node_with_taxon_label searches the taxa in the t1.taxon_set list.
