I successfully solved an algorithm question that asks to serialize and deserialize a binary tree.
class Codec:
    def __init__(self):
        self.i=0
    def serialize(self, root):
        store=[]
        def preorder(node):
            if not node:
                store.append("N")
                return
            store.append(str(node.val))
            preorder(node.left)
            preorder(node.right)
        preorder(root)
        return ",".join(store)
    # serialized data is passed here as "data" argument
    def deserialize(self, data):
        values=data.split(",")
        def helper():
            if values[self.i]=="N":
                self.i+=1
                return
            root=TreeNode(int(values[self.i]))
            self.i+=1
            root.left=helper()
            root.right=helper()
            return root
        return helper()
To solve the deserialize function, I created an instance-level state variable, self.i. Instead, I want to pass i to the helper function, but I cannot figure out how. I tried to code it like this with a local variable:
def deserialize(self, data):
    values=data.split(",")
    def helper(i):
        if values[i]=="N":
            i+=1
            return
        root=TreeNode(int(values[i]))
        i+=1
        root.left=helper(i)
        # i think issue is here.
        # Because i is modified inside root.left=helper(i)
        # so somehow I need to keep track of this modification
        root.right=helper(i)
        return root
    return helper(0)
Instead of using an instance attribute (self.i), you could use a local variable i like you tried, not passing it as an argument to helper but referencing it as a nonlocal name. But I would not advise that. Instead, create an iterator over the given values. Then you can call next on it to get the next value.
Once you get rid of the ugly instance attribute, you no longer need instances at all, and I wonder why you would need a class Codec at all. If you really want to keep it, then define those two functions as static methods, as it never makes sense to create an instance of Codec.
Here is complete code with a run on a sample tree:
class TreeNode:
    def __init__(self, val):
        self.val = val
        self.left = self.right = None
    def add(self, val):
        if val < self.val:
            if self.left:
                self.left.add(val)
            else:
                self.left = TreeNode(val)
        else:
            if self.right:
                self.right.add(val)
            else:
                self.right = TreeNode(val)
    def print(self, tab=""):
        if self.right:
            self.right.print(tab + " ")
        print(tab, self.val)
        if self.left:
            self.left.print(tab + " ")
    @staticmethod
    def of(*values):
        if values:
            root = TreeNode(values[0])
            for val in values[1:]:
                root.add(val)
            return root
class Codec:
    @staticmethod
    def serialize(root):
        store=[]
        def preorder(node):
            if not node:
                store.append("N")
                return
            store.append(str(node.val))
            preorder(node.left)
            preorder(node.right)
        preorder(root)
        return ",".join(store)
    @staticmethod
    def deserialize(data):
        values = iter(data.split(","))
        def helper():
            val = next(values)
            if val=="N":
                return
            root = TreeNode(int(val))
            root.left = helper()
            root.right = helper()
            return root
        return helper()
tree = TreeNode.of(4,2,6,1,3,5,7)
tree.print()
s = Codec.serialize(tree)
print(s)
root = Codec.deserialize(s)
root.print()
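If you would still rather pass the index explicitly, as in the attempt in the question, one option is to have the helper return the updated index together with the node, so the caller knows where the left subtree stopped. The following is only a sketch of that idea: the standalone function name deserialize_with_index and the minimal TreeNode are assumptions made for illustration.

```python
class TreeNode:
    # Minimal stand-in for the node class (assumption for this sketch).
    def __init__(self, val):
        self.val = val
        self.left = self.right = None


def deserialize_with_index(data):
    values = data.split(",")

    def helper(i):
        # Returns (subtree, index of the first token not yet consumed).
        if values[i] == "N":
            return None, i + 1
        node = TreeNode(int(values[i]))
        node.left, i = helper(i + 1)   # left subtree reports where it ended
        node.right, i = helper(i)      # right subtree starts from there
        return node, i

    root, _ = helper(0)
    return root


root = deserialize_with_index("4,2,N,N,6,N,N")
print(root.val, root.left.val, root.right.val)
```

This keeps all state local to one call, at the cost of returning a tuple from every recursive call; the iterator version avoids that bookkeeping.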
I was reading the following link. I want to create a method that instead of printing the elements, it returns the associated list. This is what I did:
class Node(object):
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
class BinaryTree(object):
    def __init__(self, root):
        self.root = Node(root)
    def preorder_sequence(self, current_node):
        """Helper method - use this to create a
        recursive print solution."""
        s = []
        if current_node is not None:
            s.append(current_node.value)
            self.preorder_sequence(current_node.left)
            self.preorder_sequence(current_node.right)
        return s
This method is just returning the root element.
The main issue is that although your method returns a list, the recursive calls ignore that returned list, so each call can only return a list with at most one element in it. Remember that s is a local name, and each execution context of your function has its own version of it. Your code needs to capture the returned list, as that is the only access the caller has to the list that the recursive execution created.
You could correct your code like this:
def preorder_sequence(self, current_node):
    if current_node is not None:
        return ([current_node.value]
                + self.preorder_sequence(current_node.left)
                + self.preorder_sequence(current_node.right))
    else:
        return []
Here you see how the lists that are created by the recursive calls are used to build a larger list (using +).
Now I should add that it is more pythonic to create a generator for this purpose: this means the function does not create a list, but just "spits out" the values in the requested order, and it becomes the responsibility of the caller to do something with them (like putting them in a list or printing them):
def preorder_sequence(self, current_node):
    if current_node is not None:
        yield current_node.value
        yield from self.preorder_sequence(current_node.left)
        yield from self.preorder_sequence(current_node.right)
I would even move the recursive part of this method to the Node class:
class Node:
    # ...
    def preorder_sequence(self):
        yield self.value
        if self.left:
            yield from self.left.preorder_sequence()
        if self.right:
            yield from self.right.preorder_sequence()
And then in the BinaryTree class it would become:
class BinaryTree:
    # ...
    def preorder_sequence(self): # No more extra argument here
        if self.root:
            return self.root.preorder_sequence()
The caller can for instance do these things now:
tree = BST()
# ... populate the tree
# ...
# And finally create the list
lst = list(tree.preorder_sequence())
Or, just print using the * operator:
tree = BST()
# ... populate the tree
# ...
print(*tree.preorder_sequence())
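For reference, here is a small self-contained sketch that puts the generator version together with the Node and BinaryTree classes from the question. The hand-built five-node tree and the empty-iterator fallback for a missing root are assumptions added for illustration.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

    def preorder_sequence(self):
        yield self.value
        if self.left:
            yield from self.left.preorder_sequence()
        if self.right:
            yield from self.right.preorder_sequence()


class BinaryTree:
    def __init__(self, root):
        self.root = Node(root)

    def preorder_sequence(self):
        if self.root:
            return self.root.preorder_sequence()
        return iter(())  # assumption: a tree without a root yields nothing


# Build a small tree by hand:   1
#                              / \
#                             2   3
#                            / \
#                           4   5
tree = BinaryTree(1)
tree.root.left = Node(2)
tree.root.right = Node(3)
tree.root.left.left = Node(4)
tree.root.left.right = Node(5)

lst = list(tree.preorder_sequence())
print(lst)  # [1, 2, 4, 5, 3]
```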
I'm trying to simplify one of my homework problems and make the code a little better. What I'm working with is a binary search tree. Right now I have a function in my Tree() class that finds all the elements and puts them into a list.
tree = Tree()
#insert a bunch of items into tree
Then I use my makeList() function to take all the nodes from the tree and put them in a list.
To call the makeList() function, I do tree.makeList(tree.root). To me this seems a little repetitive. I'm already calling the method on the tree object with tree., so passing tree.root is just a waste of typing.
Right now the makeList function is:
def makeList(self, aNode):
    if aNode is None:
        return []
    return [aNode.data] + self.makeList(aNode.lChild) + self.makeList(aNode.rChild)
I would like to give the aNode parameter a default value such as aNode = self.root (which does not work), so that I could call the function as tree.makeList().
First question is, why doesn't that work?
Second question is, is there a way that it can work? As you can see the makeList() function is recursive so I cannot define anything at the beginning of the function or I get an infinite loop.
EDIT
Here is all the code as requested:
class Node(object):
    def __init__(self, data):
        self.data = data
        self.lChild = None
        self.rChild = None
class Tree(object):
    def __init__(self):
        self.root = None
    def __str__(self):
        current = self.root
    def isEmpty(self):
        if self.root == None:
            return True
        else:
            return False
    def insert (self, item):
        newNode = Node (item)
        current = self.root
        parent = self.root
        if self.root == None:
            self.root = newNode
        else:
            while current != None:
                parent = current
                if item < current.data:
                    current = current.lChild
                else:
                    current = current.rChild
            if item < parent.data:
                parent.lChild = newNode
            else:
                parent.rChild = newNode
    def inOrder(self, aNode):
        if aNode != None:
            self.inOrder(aNode.lChild)
            print aNode.data
            self.inOrder(aNode.rChild)
    def makeList(self, aNode):
        if aNode is None:
            return []
        return [aNode.data] + self.makeList(aNode.lChild) + self.makeList(aNode.rChild)
    def isSimilar(self, n, m):
        nList = self.makeList(n.root)
        mList = self.makeList(m.root)
        print mList == nList
larsmans answered your first question.
For your second question, can you simply look before you leap, so that you never recurse into a missing child?
def makeList(self, aNode=None):
    if aNode is None:
        aNode = self.root
    treeaslist = [aNode.data]
    if aNode.lChild:
        treeaslist.extend(self.makeList(aNode.lChild))
    if aNode.rChild:
        treeaslist.extend(self.makeList(aNode.rChild))
    return treeaslist
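One caveat with the version above: on an empty tree self.root is None, so aNode.data raises an AttributeError. A possible variation, sketched here with a hypothetical private helper named _makeList, keeps the original recursion in the helper and lets the public method supply the root.

```python
class Node(object):
    def __init__(self, data):
        self.data = data
        self.lChild = None
        self.rChild = None


class Tree(object):
    def __init__(self):
        self.root = None

    def makeList(self):
        # Public entry point: no argument needed, always starts at the root.
        return self._makeList(self.root)

    def _makeList(self, aNode):
        # Hypothetical helper: same recursion as the original makeList.
        if aNode is None:
            return []
        return ([aNode.data]
                + self._makeList(aNode.lChild)
                + self._makeList(aNode.rChild))


tree = Tree()
empty_result = tree.makeList()   # [] for an empty tree
tree.root = Node(2)
tree.root.lChild = Node(1)
tree.root.rChild = Node(3)
full_result = tree.makeList()    # [2, 1, 3]
print(empty_result, full_result)
```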
It doesn't work because default arguments are evaluated at function definition time, not at call time:
def f(lst = []):
    lst.append(1)
    return lst
print(f()) # prints [1]
print(f()) # prints [1, 1]
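The same rule explains why aNode=self.root fails outright: the default expression runs while the class body is being executed, when no instance exists and the name self is not bound. The sketch below uses a stripped-down Tree class, an assumption for illustration, to show the resulting error.

```python
try:
    class Tree(object):
        def __init__(self):
            self.root = None

        # The default expression is evaluated once, while the class body
        # runs; there is no instance yet, so `self` is an unknown name.
        def makeList(self, aNode=self.root):
            return []
except NameError as exc:
    error = exc  # keep a reference; `exc` is cleared after the block
    print("NameError:", exc)
```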
The common strategy is to use a None default parameter. If None is a valid value, use a singleton sentinel:
NOTHING = object()
def f(arg = NOTHING):
    if arg is NOTHING:
        # no argument
    # etc.
If you want to treat None as a valid argument, you could use a **kwargs parameter.
def function(arg1, arg2, **kwargs):
    kwargs.setdefault('arg3', default)
    arg3 = kwargs['arg3']
    # Continue with function
function("amazing", "fantastic") # uses default
function("foo", "bar", arg3=None) # Not default, but None
function("hello", "world", arg3="!!!")
I have also seen ... (the Ellipsis literal) or some other singleton used like this.
def function(arg1, arg2=...):
    if arg2 is ...:
        arg2 = default
I have an application that deals with ~1-2 megabyte XML files. Doesn't sound like much, but I've run into a performance problem nonetheless.
Since I have some compute-bound tasks that I'd like to speed up, I've tried using multiprocessing's Pool.imap to do that, which requires pickling this XML data. Pickling the data structures containing references into this DOM turns out to be slower than those compute-bound processes, and the culprit seems to be recursion: I had to set the recursion limit to 10,000 in order to get pickle to work in the first place :-S.
Anyway, my question is:
If I wanted to attack this problem from the referential performance angle, what should I replace minidom with? The criteria are both pickling performance and ease of transition.
To give you an idea of what kind of methods are needed, I have pasted a wrapper class (written some time earlier in order to speed up getElementsByTagName calls). It would be acceptable to replace all minidom nodes with nodes that adhere to the same interface as this class, i.e. I don't need all the methods from minidom. Getting rid of the parentNode method would also be acceptable (and probably a good idea in order to improve pickling performance).
And yes, if I'd be designing this nowadays I wouldn't go for XML node references in the first place, but it would be a lot of work to rip all of this out now, so I hope this can be patched instead.
Should I just write the damn thing myself using python built-ins or the collections library?
class ImmutableDOMNode(object):
    def __init__(self, node):
        self.node = node
        self.cachedElementsByTagName = {}
    @property
    def nodeType(self):
        return self.node.nodeType
    @property
    def tagName(self):
        return self.node.tagName
    @property
    def ownerDocument(self):
        return self.node.ownerDocument
    @property
    def nodeName(self):
        return self.node.nodeName
    @property
    def nodeValue(self):
        return self.node.nodeValue
    @property
    def attributes(self):
        return self.node.attributes
    @property
    def parentNode(self):
        return ImmutableDOMNode(self.node.parentNode)
    @property
    def firstChild(self):
        return ImmutableDOMNode(self.node.firstChild)
    @property
    def childNodes(self):
        return [ImmutableDOMNode(node) for node in self.node.childNodes]
    def getElementsByTagName(self, name):
        result = self.cachedElementsByTagName.get(name)
        if result != None:
            return result
        uncachedResult = self.node.getElementsByTagName(name)
        cachedResult = [ImmutableDOMNode(node) for node in uncachedResult]
        self.cachedElementsByTagName[name] = cachedResult
        return cachedResult
    def getAttribute(self, qName):
        return self.node.getAttribute(qName)
    def toxml(self, encoding=None):
        return self.node.toxml(encoding)
    def toprettyxml(self, indent="", newl="", encoding=None):
        return self.node.toprettyxml(indent, newl, encoding)
    def appendChild(self, node):
        raise Exception("cannot append child to immutable node")
    def removeChild(self, node):
        raise Exception("cannot remove child from immutable node")
    def cloneNode(self, deep):
        raise Exception("clone node not implemented")
    def createElement(self, tagName):
        raise Exception("cannot create element for immutable node")
    def createTextNode(self, tagName):
        raise Exception("cannot create text node for immutable node")
    def createAttribute(self, qName):
        raise Exception("cannot create attribute for immutable node")
So I decided to just write my own DOM implementation that meets my requirements. I've pasted it below in case it helps someone. It depends on lru_cache from a memoization library for Python 2.7 and on @Raymond Hettinger's immutable dict from the question "Immutable dictionary, only use as a key for another dictionary". However, these dependencies are easy to remove if you don't mind less safety/performance.
import copy
import xml.dom.minidom
from collections import defaultdict
# lru_cache and ImmutableDict come from the two dependencies mentioned above.
class CycleFreeDOMNode(object):
    def __init__(self, minidomNode=None):
        if minidomNode is None:
            return
        if not isinstance(minidomNode, xml.dom.minidom.Node):
            raise ValueError("%s needs to be instantiated with a minidom.Node" %(
                type(self).__name__
            ))
        if minidomNode.nodeValue and minidomNode.childNodes:
            raise ValueError(
                "both nodeValue and childNodes in same node are not supported"
            )
        self._tagName = minidomNode.tagName \
            if hasattr(minidomNode, "tagName") else None
        self._nodeType = minidomNode.nodeType
        self._nodeName = minidomNode.nodeName
        self._nodeValue = minidomNode.nodeValue
        self._attributes = dict(
            item
            for item in minidomNode.attributes.items()
        ) if minidomNode.attributes else {}
        self._childNodes = tuple(
            CycleFreeDOMNode(cn)
            for cn in minidomNode.childNodes
        )
        childNodesByTagName = defaultdict(list)
        for cn in self._childNodes:
            childNodesByTagName[cn.tagName].append(cn)
        self._childNodesByTagName = ImmutableDict(childNodesByTagName)
    @property
    def nodeType(self):
        return self._nodeType
    @property
    def tagName(self):
        return self._tagName
    @property
    def nodeName(self):
        return self._nodeName
    @property
    def nodeValue(self):
        return self._nodeValue
    @property
    def attributes(self):
        return self._attributes
    @property
    def firstChild(self):
        return self._childNodes[0] if self._childNodes else None
    @property
    def childNodes(self):
        return self._childNodes
    @lru_cache(maxsize = 100)
    def getElementsByTagName(self, name):
        # Copy the per-tag list so that += below does not mutate the index
        result = list(self._childNodesByTagName.get(name, []))
        for cn in self.childNodes:
            result += cn.getElementsByTagName(name)
        return result
    def cloneNode(self, deep=False):
        clone = CycleFreeDOMNode()
        clone._tagName = self._tagName
        clone._nodeType = self._nodeType
        clone._nodeName = self._nodeName
        clone._nodeValue = self._nodeValue
        clone._attributes = copy.copy(self._attributes)
        if deep:
            clone._childNodes = tuple(
                cn.cloneNode(deep)
                for cn in self.childNodes
            )
            childNodesByTagName = defaultdict(list)
            for cn in clone._childNodes:
                childNodesByTagName[cn.tagName].append(cn)
            clone._childNodesByTagName = ImmutableDict(childNodesByTagName)
        else:
            clone._childNodes = tuple(cn for cn in self.childNodes)
            clone._childNodesByTagName = self._childNodesByTagName
        return clone
    def toxml(self):
        def makeXMLForContent():
            return self.nodeValue or "".join([
                cn.toxml() for cn in self.childNodes
            ])
        if not self.tagName:
            return makeXMLForContent()
        return "<%s%s>%s</%s>" %(
            self.tagName,
            # XML attributes are separated by whitespace, not commas
            " " + " ".join([
                "%s=\"%s\"" %(k,v)
                for k,v in self.attributes.items()
            ]) if self.attributes else "",
            makeXMLForContent(),
            self.tagName
        )
    def getAttribute(self, name):
        return self._attributes.get(name, "")
    def setAttribute(self, name, value):
        self._attributes[name] = value
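To illustrate why dropping the parent and owner-document references helps, here is a deliberately reduced sketch that is not the class above: a hypothetical helper to_plain converts a minidom node into nested tuples, which pickle can then handle as an ordinary tree. The sample XML string and the tuple layout are assumptions chosen for the example.

```python
import pickle
import xml.dom.minidom


def to_plain(node):
    """Convert a minidom node into nested tuples with no parent links.

    Hypothetical helper for illustration; returns
    (nodeName, nodeValue, attributes, children).
    """
    attrs = dict(node.attributes.items()) if node.attributes else {}
    children = tuple(to_plain(child) for child in node.childNodes)
    return (node.nodeName, node.nodeValue, attrs, children)


doc = xml.dom.minidom.parseString('<a x="1"><b>hi</b><b/></a>')
plain = to_plain(doc.documentElement)

# With no parentNode/ownerDocument back-references, the structure is a
# plain tree of tuples and dicts that round-trips through pickle.
restored = pickle.loads(pickle.dumps(plain, pickle.HIGHEST_PROTOCOL))
print(restored == plain)
```

Whether this is faster than pickling minidom nodes for a given document would still need to be measured.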
I am trying to solve this problem:
Imagine a (literal) stack of plates. If the stack gets too high, it
might topple. Therefore, in real life, we would likely start a new
stack when the previous stack exceeds some threshold. Implement a data
structure SetOfStacks that mimics this. SetOfStacks should be
composed of several stacks, and should create a new stack once the
previous one exceeds capacity. SetOfStacks.push() and
SetOfStacks.pop() should behave identically to a single stack (that
is, pop() should return the same values as it would if there were just
a single stack). Bonus: Implement a function popAt(int index) which
performs a pop operation on a specific sub-stack.
So I wrote the code:
#!/bin/env python
from types import *
class Stack:
    def __init__(self):
        self.items = []
        self.capacity = 3
        self.stackscount = 0
    def create(self):
        id = self.stackscount + 1
        id = str(id) + "_stack"
        # How to create a new instance of Stack class at runtime ?
        # the __init__ must be run too.
    def push(self, item):
        if self.size() <= self.capacity:
            self.items.append(item)
        else:
            self.create()
    def pop(self):
        return self.items.pop()
    def popAt(self):
        pass
    def peek(self):
        return self.items[len(self.items)-1]
    def size(self):
        return len(self.items)
s = Stack()
s.push(10)
How do I create a new object of the same type as s dynamically at runtime? I searched on the internet and found that using new.instance or new.classobj is the solution, but when I did so my new object did not seem to have items from the __init__ function. In Python 3, type() seems to be the answer, but the docs don't have any examples.
You've confused yourself by referring to a "type object". In Python that means the class itself, not its instances.
To create new Stack objects, simply do what you're already doing: call the Stack class. You can append them to a list:
stacks = [Stack() for _ in range(5)]
However, as jon points out, that won't solve your problem since you haven't defined the SetOfStacks class.
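As a rough sketch of what such a class could look like, the version below keeps the sub-stacks as plain lists inside one SetOfStacks object. The method name pop_at, the capacity parameter, and the choice to drop empty sub-stacks are assumptions made for this illustration, not something the exercise prescribes.

```python
class SetOfStacks(object):
    def __init__(self, capacity=3):
        self.capacity = capacity
        self.stacks = []  # list of sub-stacks, each a plain list

    def push(self, item):
        # Start a new sub-stack when there is none or the last one is full.
        if not self.stacks or len(self.stacks[-1]) >= self.capacity:
            self.stacks.append([])
        self.stacks[-1].append(item)

    def pop(self):
        # Behaves like a single stack: pop from the last sub-stack.
        return self.pop_at(len(self.stacks) - 1)

    def pop_at(self, index):
        # Pop from one specific sub-stack; IndexError if it does not exist.
        if not 0 <= index < len(self.stacks):
            raise IndexError("no such sub-stack")
        item = self.stacks[index].pop()
        if not self.stacks[index]:
            del self.stacks[index]  # drop sub-stacks that became empty
        return item


s = SetOfStacks(capacity=3)
for n in range(7):
    s.push(n)
print(len(s.stacks))  # 3 sub-stacks: [0, 1, 2], [3, 4, 5], [6]
print(s.pop())        # 6
print(s.pop_at(0))    # 2
```

Note that this sketch does not shift items between sub-stacks after pop_at, so an earlier sub-stack can stay below capacity.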
You could simply use a parent-child relation: when a Stack is full, it creates a child and delegates subsequent pushes to it. That could look like this:
class Stack:
    def __init__(self, parent = None, id=None):
        self.stackscount = 0
        self.capacity = 3
        self.items = []
        self.parent = parent
        self.id = id
        self.child = None
    def create(self):
        id = self.stackscount + 1
        id = str(id) + "_stack"
        return Stack(self, id)
    def push(self, item):
        if self.size() <= self.capacity:
            self.items.append(item)
        else:
            if self.child is None:
                self.child = self.create()
            self.child.push(item)
    def pop(self):
        if self.child is not None:
            item = self.child.pop()
            if len(self.child.items) == 0:
                self.child = None
        else:
            item = self.items.pop()
        return item
    def popAt(self):
        pass
    def peek(self):
        if self.child is not None:
            item = self.child.peek()
        else:
            item = self.items[len(self.items)-1]
        return item
    def size(self):
        l = len(self.items)
        if self.child is not None:
            l += self.child.size()
        return l
s = Stack()
s.push(10)
popAt is still to be implemented, but I tested the rest: it correctly creates new stacks when pushing, and empties and removes them when popping.
Implementing popAt requires some changes to the current pop implementation, to allow removing an intermediate stack:
def pop(self):
    if self.child is not None:
        item = self.child.pop()
        if len(self.child.items) == 0:
            self.child = self.child.child
            if self.child is not None:
                self.child.parent = self
    else:
        item = self.items.pop()
    return item
def popAt(self, stacknumber):
    s = self
    for i in range(stacknumber):
        s = s.child
        if s is None:
            return None
    if len(s.items) == 0:
        return None
    item = s.items.pop()
    if len(s.items) == 0 and s.parent is not None:
        s.parent.child = s.child
        if s.child is not None:
            s.child.parent = s.parent
    return item
The type() function is indeed what you are looking for. Documentation can be found here: https://docs.python.org/2/library/functions.html#type
You can call it like this:
# Bases is a tuple of parent classes to inherit
bases = Stack,
# Dict contains extra properties for the class, for example if you want to add a class variable or function
dict_ = {}
# Construct the class
YourClass = type('YourClass', bases, dict_)
# Create an instance of the class
your_instance = YourClass()
It looks like you are just looking at instance creation though:
class Stack(object):
    def create(self):
        id = self.stackscount + 1
        id = str(id) + "_stack"
        # How to create a new instance of Stack class at runtime ?
        # the __init__ must be run too.
        stack = Stack()
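To make the difference between the two concrete, here is a small sketch with a trimmed-down Stack; the class name LimitedStack and the extra class attribute are hypothetical. Calling a class creates an instance and runs __init__, whereas three-argument type() creates a new class whose instances still run the inherited __init__.

```python
class Stack(object):
    def __init__(self):
        self.items = []
        self.capacity = 3


# Plain instance creation: calling the class runs __init__ each time,
# so every instance gets its own items list.
first = Stack()
second = Stack()
first.items.append(10)

# type(name, bases, dict) builds a new *class* at runtime; its instances
# still go through the inherited __init__.
LimitedStack = type('LimitedStack', (Stack,), {'note': 'created at runtime'})
third = LimitedStack()

print(first.items, second.items, third.items, third.note)
```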
I am trying to implement Binary Search Tree operations in python. As of now, I have written some code to add nodes to this search tree (sorted).
Here's what I've in my code:
class TreeNode:
    def __init__(self, data):
        self.data = data
        self.lLink = None
        self.rLink = None
class BinaryTree:
    def __init__(self):
        self.root = None
    def AddNode(self, data):
        if self.root is None:
            self.root = TreeNode(data)
        else:
            if data < self.root.data:
                if self.root.lLink is None:
                    self.root.lLink = TreeNode(data)
                else:
                    AddNode(self.root.lLink, data)
            else:
                if self.root.rLink is None:
                    self.root.rLink = TreeNode(data)
                else:
                    AddNode(self.root.rLink, data)
    def InOrder(self, head):
        if self.root.lLink is not None:
            InOrder(self.root.lLink)
        print self.root.data,
        if self.root.rLink is not None:
            InOrder(self.root.rLink)
myTree = BinaryTree()
myTree.AddNode(15)
myTree.AddNode(18)
myTree.AddNode(14)
How do I test if my AddNode() method is correct? I know the algorithm but just to be sure.
What I was thinking of is to create an InOrder() method and try to print elements through this InOrder traversal. As a result, my data added to the tree should be displayed in sorted order. If it is displayed in sorted order, I'll be sure that both my AddNode() and InOrder() methods are correct.
Your BinaryTree class is faulty: changing the order of insertions to
myTree.AddNode(14)
myTree.AddNode(18)
myTree.AddNode(15)
raises an error - NameError: global name 'AddNode' is not defined.
This is because the lines AddNode(self.root.rLink, data) and AddNode(self.root.lLink, data) call AddNode as a bare function, but it is only defined as a method of BinaryTree, so the name cannot be found. They also pass a TreeNode where the tree itself is expected. I fixed these errors in your code and it should work now.
class TreeNode:
    def __init__(self, data):
        self.data = data
        self.lLink = None
        self.rLink = None
class BinaryTree:
    def __init__(self):
        self.root = None
    def AddNode(self, data):
        if self.root is None:
            self.root = TreeNode(data)
        else:
            self.AddHelper(data, self.root)
    def AddHelper(self, data, startingPoint):
        if data < startingPoint.data:
            if startingPoint.lLink is None:
                startingPoint.lLink = TreeNode(data)
            else:
                self.AddHelper(data, startingPoint.lLink)
        else:
            if startingPoint.rLink is None:
                startingPoint.rLink = TreeNode(data)
            else:
                self.AddHelper(data, startingPoint.rLink)
    def InOrder(self):
        self.InOrderHelper(self.root)
    def InOrderHelper(self, startingPoint):
        if startingPoint is None:
            return
        self.InOrderHelper(startingPoint.lLink)
        print startingPoint.data,
        self.InOrderHelper(startingPoint.rLink)
Output Test :
>>> myTree = BinaryTree()
>>> myTree.AddNode(14)
>>> myTree.AddNode(18)
>>> myTree.AddNode(15)
>>> myTree.InOrder()
14 15 18
Inserting can be a little tricky, especially because the function is a part of the tree itself. So, you call the insert function on the tree, but specify a starting point. This defaults to the root, so you can leave the argument out when you call the function.
Also, I think you are a little unclear about how self works in a method. It is passed automatically when you call a method on an instance; you do not supply it yourself, which is what it seems you have tried to do.
class TreeNode:
    def __init__(self, data):
        self.data = data
        self.rLink = None
        self.lLink = None
class BinaryTree:
    def __init__(self):
        self.root = None
    def AddNode(self, data, node=None):
        if not node :
            node = self.root
        if self.root is None:
            self.root = TreeNode(data)
        else:
            if data < node.data:
                if node.lLink is None:
                    node.lLink = TreeNode(data)
                else:
                    # recurse from the current node, not from the root
                    self.AddNode(data, node.lLink)
            else:
                if node.rLink is None:
                    node.rLink = TreeNode(data)
                else:
                    # recurse from the current node, not from the root
                    self.AddNode(data, node.rLink)
    def InOrder(self, head):
        if head.lLink is not None:
            self.InOrder(head.lLink)
        print head.data,
        if head.rLink is not None:
            self.InOrder(head.rLink)
myTree = BinaryTree()
myTree.AddNode(14)
myTree.AddNode(15)
myTree.AddNode(18)
myTree.InOrder(myTree.root)
Testing the insert function with an in-order traversal is the best approach.
This should work. You are not going down the tree if you use self.root.lLink every time.
Optionally, you could write one more line of code to check if the output is indeed in ascending order.
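A minimal sketch of that check follows. It assumes a hypothetical helper in_order_values that collects the values into a list instead of printing them, and it uses an iterative AddNode in place of the recursive versions above, purely to keep the example short.

```python
class TreeNode:
    def __init__(self, data):
        self.data = data
        self.lLink = None
        self.rLink = None


class BinaryTree:
    def __init__(self):
        self.root = None

    def AddNode(self, data):
        # Iterative insertion (a swapped-in variant for brevity).
        if self.root is None:
            self.root = TreeNode(data)
            return
        node = self.root
        while True:
            if data < node.data:
                if node.lLink is None:
                    node.lLink = TreeNode(data)
                    return
                node = node.lLink
            else:
                if node.rLink is None:
                    node.rLink = TreeNode(data)
                    return
                node = node.rLink


def in_order_values(node):
    # Hypothetical helper: returns the in-order values as a list.
    if node is None:
        return []
    return in_order_values(node.lLink) + [node.data] + in_order_values(node.rLink)


myTree = BinaryTree()
for value in (14, 18, 15, 3, 27):
    myTree.AddNode(value)

values = in_order_values(myTree.root)
# The one-line check: an in-order walk of a BST must already be sorted.
assert values == sorted(values)
print(values)
```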