Circular, doubly linked lists with hash table space - python

I'm currently working on implementing a Fibonacci heap in Python for my own personal development. While writing up the object class for a circular, doubly linked-list, I ran into an issue that I wasn't sure of.
For fast membership testing of the linked-list (in order to perform operations like 'remove' and 'merge' faster), I was thinking of adding a hash-table (a python 'set' object) to my linked-list class. See my admittedly very imperfect code below for how I did this:
class Node:
    def __init__(self,value):
        self.value = value
        self.degree = 0
        self.p = None
        self.child = None
        self.mark = False
        self.next = self
        self.prev = self
    def __lt__(self,other):
        return self.value < other.value
class Linked_list:
    def __init__(self):
        self.root = None
        self.nNodes = 0
        self.members = set()
    def add_node(self,node):
        if self.root == None:
            self.root = node
        else:
            self.root.next.prev = node
            node.next = self.root.next
            self.root.next = node
            node.prev = self.root
            if node < self.root:
                self.root = node
        self.members.add(node)
        self.nNodes = len(self.members)
    def find_min():
        min = None
        for element in self.members:
            if min == None or element<min:
                min = element
        return min
    def remove_node(self,node):
        if node not in self.members:
            raise ValueError('node not in Linked List')
        node.prev.next, node.next.prev = node.next, node.prev
        self.members.remove(node)
        if self.root not in self.members:
            self.root = self.find_min()
        self.nNodes -=1
    def merge_linked_list(self,LL2):
        for element in self.members&LL2.members:
            self.remove_node(element)
        self.root.prev.next = LL2.root
        LL2.root.prev.next = self.root
        self.root.prev, LL2.root.prev = LL2.root.prev, self.root.prev
        if LL2.root < self.root:
            self.root = LL2.root
        self.members = self.members|LL2.members
        self.nNodes = len(self.members)
    def print_values(self):
        print self.root.value
        j = self.root.next
        while j is not self.root:
            print j.value
            j = j.next
My question is, does the hash table take up double the amount of space that just implementing the linked list without the hash table? When I look at the Node objects in the hash table, they seem to be in the exact same memory location that they are when just independent node objects. For example, if I create a node:
In: n1 = Node(5)
In: print n1
Out: <__main__.Node instance at 0x1041aa320>
and then put this node in a set:
In: s1 = set()
In: s1.add(n1)
In: print s1
Out: <__main__.Node instance at 0x1041aa320>
which is the same memory location. So it seems like the set doesn't copy the node.
My question is, what is the space complexity for a linked list of size n with a hash-table that keeps track of elements. Is it n or 2n? Is there anything elementary wrong about using a hash table to keep track of elements.
I hope this isn't a duplicate. I tried searching for a post that answered this question, but didn't find anything satisfactory.

See "In-memory size of a Python structure" and "How do I determine the size of an object in Python?" for complete answers on determining the size of objects.
Here are some quick results from a 64-bit machine with Python 3:
>>> import sys
>>> sys.getsizeof (1)
28
>>> sys.getsizeof (set())
224
>>> sys.getsizeof (set(range(100)))
8416
The results are in bytes. This can give you a hint about how big sets are (they are pretty big).
"My question is, what is the space complexity for a linked list of size n with a hash-table that keeps track of elements. Is it n or 2n? Is there anything elementary wrong about using a hash table to keep track of elements."
Complexity analysis never distinguishes between n and 2n: both are O(n). Optimisation does. It is commonly said that "premature optimisation is the root of all evil", a warning about potential optimisation pitfalls. So do whatever you think is best for the operations you need to support.
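To check that a set stores references to the nodes rather than copies, something like the following can be run. It is only a minimal sketch in Python 3: the `Node` class is a stripped-down stand-in for the one in the question, and the byte count printed at the end depends on the interpreter version and platform.

```python
import sys

class Node:
    """Stripped-down stand-in for the question's Node class (assumption)."""
    def __init__(self, value):
        self.value = value

nodes = [Node(i) for i in range(100)]
members = set(nodes)

# The set holds references to the very same objects, so the identities match.
same_objects = {id(m) for m in members} == {id(n) for n in nodes}
print(same_objects)

# getsizeof reports only the set's own table, not the nodes it refers to.
print(sys.getsizeof(members), "bytes for the set's own table")
```

So the extra cost of the set is its internal table of references, on top of the nodes themselves; the nodes are not duplicated.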

Related

Calling a linked list

I'm new to programming, so excuse the possibly stupid question.
I'm doing leetcodes and I got to the linked lists. I think I understand them okay, it's just that I don't know how to test my code/call my function(?)
Problem I'm working on
Here's my code, I know it works since I uploaded it onto leetcode, but I would still like to be able to run it on my machine.
class Solution:
    def middleNode(self, head: Optional[ListNode]) -> Optional[ListNode]:
        slow = fast = head
        while fast and fast.next:
            slow = slow.next
            fast = fast.next.next
        return slow
I guess I have two different problems:
the "Optional[ListNode]) -> Optional[ListNode]:" part
and the actual calling of the function
Some questions before used the "typing" module functions like "List", so I would simply import them and they wouldn't be a problem. But I'm not really sure what to do here
To check my solutions, I write a short piece of code that I can put example inputs into
Solution = Solution()
print(Solution.middleNode(head = [1,2,3,4,5,6]))
But isn't the "head" there, just a normal list? Do I have to create an extra function separately to create the "links". I've seen the creation of a linked list done by calling a function every time you want to add a new node. So would I use a for loop to add my example case?
Well, if you are looking for the internal boilerplate, the code below provides it.
You need to create classes for the nodes, the linked list, and the solution.
Then, from the given numbers, you create a LinkedList object.
LeetCode does this part itself and passes the object to the middleNode method of the Solution class, where the OP's code runs and returns a result. Once the output is produced, it is compared with the expected solution.
# Node class for individual nodes
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next
# linked list class, to create a linked list of the list nodes
class LinkedList:
    def __init__(self):
        self.head = None
    # adding a node element to linked list
    def add(self, node):
        if self.head is None:
            self.head = node
        else:
            curr = self.head
            while curr.next:
                curr = curr.next
            curr.next = node
            curr = node
    # printing element of existing linked list
    def print_ll(self):
        curr = self.head
        while curr:
            print(curr.val)
            curr = curr.next
# leetcode solution class
class Solution:
    def middleNode(self, head):
        slow = fast = head
        while fast and fast.next:
            slow = slow.next
            fast = fast.next.next
        return slow
# value with which we create list nodes
values = [1,2,3,4,5,6,7,8,9,10]
ll = LinkedList() # create a linked list class object
for i in values:
    node = ListNode(i) # creating a list node
    ll.add(node) # adding list node to linked list
#ll.print_ll() # printing linked list
x = Solution().middleNode(ll.head) # passing linked list object to leetcode solution method and getting result
while x: # printing result
    print(x.val)
    x = x.next
I think the problem on LeetCode is poorly worded. head = [1,2,3,4,5] is not really a head as it should only refer to the first item in the list - there seems to be a bit of hidden boilerplate code that creates a linked list from input list and an output list from output node.
Here's some example code that works similarly to the LeetCode task.
from typing import Optional
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next
class Solution:
    def middleNode(self, head: Optional[ListNode]) -> Optional[ListNode]:
        slow = fast = head
        while fast and fast.next:
            slow = slow.next
            fast = fast.next.next
        return slow
inp = [1,2,3,4,5,6]
next = None
for i in reversed(inp):
    next = ListNode(i, next) # next points to head at the end of the loop
res = Solution().middleNode(next)
out = []
while res:
    out.append(res)
    res = res.next
print([o.val for o in out])
To make your code work on your machine, you have to implement a couple of things.
First, for your first problem, you have to implement the ListNode class given at the top of the LeetCode page:
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next
Then you import "Optional" from typing:
from typing import Optional
Then you have the prerequisites for your code.
You have to instantiate the class, as you mentioned. The only problem here is that your variable has the same name as your class, which could cause trouble later.
To finish, you have to call your function as you already did, with one small difference: the function has to be called with "head" as a variable of type ListNode, not list, and it gives you back a variable of type ListNode.
In a nutshell, this would be my solution (of course you can add as many ListNodes as you want):
from typing import Optional
class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next
class Solution:
    def middleNode(self, head: Optional[ListNode]) -> Optional[ListNode]:
        slow = fast = head
        while fast and fast.next:
            slow = slow.next
            fast = fast.next.next
        return slow
# Initialise class
s = Solution()
# Defining nodes of the list, assigning always the next node.
seven = ListNode(7)
six = ListNode(6, next=seven)
five = ListNode(5, next=six)
four = ListNode(4, next=five)
three = ListNode(3, next=four)
two = ListNode(2, next=three)
one = ListNode(1, next=two)
# Calling your function (with "one" as your head node of the list)
# NOTE: As this gives you back an attribute of type ListNode, you have to access the "val" attribute of it to print out the value.
print(s.middleNode(one).val)
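Chaining the nodes by hand gets tedious for longer test cases. One possible way to wrap the conversion in both directions is sketched below. The helper names `from_list`, `to_list` and `middle_node` are made up for this illustration and are not part of LeetCode's hidden boilerplate, which is not public.

```python
from typing import List, Optional

class ListNode:
    def __init__(self, val=0, next=None):
        self.val = val
        self.next = next

def from_list(values: List[int]) -> Optional[ListNode]:
    """Hypothetical helper: build a linked list and return its head node."""
    head = None
    for v in reversed(values):   # build back to front so each node can point at the next
        head = ListNode(v, head)
    return head

def to_list(node: Optional[ListNode]) -> List[int]:
    """Hypothetical helper: collect the values from `node` to the end of the chain."""
    out = []
    while node:
        out.append(node.val)
        node = node.next
    return out

def middle_node(head: Optional[ListNode]) -> Optional[ListNode]:
    # Same slow/fast pointer logic as the question's middleNode method.
    slow = fast = head
    while fast and fast.next:
        slow = slow.next
        fast = fast.next.next
    return slow

print(to_list(middle_node(from_list([1, 2, 3, 4, 5, 6]))))  # [4, 5, 6]
```

With helpers like these, each example case from the problem statement becomes a one-line call on a plain Python list.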

Binary Tree in Python, control over the height

I want to implement a binary tree in Python. I submit my code below. What I would like to do is to set the height of the binary tree with the variable L. But when I run the code, it seems to create a binary tree that is taller than I expected.
I arrived at this conclusion because when I set the height to 1 and do print(node.right.right.right), I still get 1.
class Tree:
    def __init__(self,x,left=None,right=None):
        self.x=x
        self.left=left
        self.right=right
    def one_tree(self,node):
        node=Tree(1)
        node.right=Tree(1)
        node.left=Tree(1)
        return node
node=Tree(1)
node=node.one_tree(node)
L=1
while L>0:
    node=node.one_tree(node)
    node.left=node
    node.right=node
    L=L-1
print(node.right.right.right.right)
I found a problem with your code. one_tree method overwrites the argument node itself.
class Tree:
    def __init__(self, x, left=None, right=None):
        self.x = x
        self.left = left
        self.right = right
    def one_tree(self, node):
        node = Tree(1) # This assignment statement overwrites the argument 'node'.
        node.right = Tree(1)
        node.left = Tree(1)
        return node
one_tree method gets an argument node but the first line of this method overwrites it like this node = Tree(1). Whatever the method gets as an argument, the method always has a new instance of Tree as node variable.
Several issues:
one_tree doesn't use the node that you pass as argument, so whatever you pass to it, the returned tree will always have 3 nodes (a root with 2 children).
one_tree is a method that doesn't use self, so it really should be a class method, not an instance method
If the intended algorithm was to add a level above the already created tree, and have the new root's children reference the already existing tree, then you would need to only create one node, not three, and let the two children be the given node.
Not a problem, but your loop is not really the "python way" to loop L times. Use range instead.
This means your code could be this:
class Tree:
    def __init__(self, x, left=None, right=None):
        self.x = x
        self.left = left
        self.right = right
node = Tree(1)
L = 1
for _ in range(L):
    node = Tree(1, node, node)
Now you should still be careful with this tree, as it only has L+1 unique node instances, where all "nodes" on the same level are actually the same node instance. All the rest are references that give the "impression" of having a tree with many more nodes. Whenever you start mutating nodes in that tree, you'll see undesired effects.
If you really need separate node instances for all nodes, then the algorithm will need to be adapted. You could then use recursion:
def create(height):
    if height == 0: # base case
        return Tree(1)
    return Tree(1, create(height-1), create(height-1))
L = 1
node = create(L)
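The difference between the two constructions can be checked by counting distinct node objects. The sketch below assumes the same `Tree` class; `create_shared` and `count_unique` are names introduced here purely for illustration.

```python
class Tree:
    def __init__(self, x, left=None, right=None):
        self.x = x
        self.left = left
        self.right = right

def create_shared(height):
    """Loop version: both children of each new root are the same object."""
    node = Tree(1)
    for _ in range(height):
        node = Tree(1, node, node)
    return node

def create(height):
    """Recursive version: every node is a separate object."""
    if height == 0:
        return Tree(1)
    return Tree(1, create(height - 1), create(height - 1))

def count_unique(node, seen=None):
    """Count distinct node objects reachable from `node` (hypothetical helper)."""
    seen = set() if seen is None else seen
    if node is not None and id(node) not in seen:
        seen.add(id(node))
        count_unique(node.left, seen)
        count_unique(node.right, seen)
    return len(seen)

print(count_unique(create_shared(3)))  # 4  (height + 1 objects)
print(count_unique(create(3)))         # 15 (2**(height + 1) - 1 objects)

# Mutating one "child" of the shared tree also changes its sibling.
shared = create_shared(1)
shared.left.x = 99
print(shared.right.x)  # 99
```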

A linked list program using one class

Is it possible to write this singly linked list program using only the Node class, by moving all the methods of LinkedList into it and eliminating the LinkedList class? If it is possible, why do we prefer writing it with two classes rather than a single one?
Some say we use it to keep track of the head node, but I don't get it. We could simply use a variable named head to store the head node and then use it in the other operations. So why have a separate class for the head node?
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None
class LinkedList:
    def __init__(self):
        self.head = None
    def push(self, new_data):
        new_node = Node(new_data)
        new_node.next = self.head
        self.head = new_node
    def deleteNode(self, key):
        temp = self.head
        if (temp is not None):
            if (temp.data == key):
                self.head = temp.next
                temp = None
                return
        while(temp is not None):
            if temp.data == key:
                break
            prev = temp
            temp = temp.next
        if(temp == None):
            return
        prev.next = temp.next
        temp = None
    def printList(self):
        temp = self.head
        while(temp):
            print (" %d" %(temp.data)),
            temp = temp.next
There are lots and lots of ways to implement the same interface as this LinkedList class.
One of those ways would be to give the LinkedList class data and next fields, where next points to a linked list... but why do you think that is better? The Node class in your example is only used inside a LinkedList. It has no external purpose at all. The author of this LinkedList class made two classes so he could separate the operations applied to nodes from the operations applied to the list as a whole, because that's how he liked to think about it. Whether you think your way is better or worse is a matter of choice...
But here's the real reason why it's better to keep LinkedList separate from nodes:
In real life programming, you will encounter many singly-linked lists. Probably none of these will be instances of any kind of List class. You will just have a bunch of objects that are linked together by some kind of next pointer. You will essentially have only nodes. You will want to perform list operations on those linked nodes without adding list methods to them, because those node objects will have their own distinct purposes -- they are not meant to be lists, and you will not want to make them lists.
So this LinkedList class you have above is teaching you how to perform those list operations on linked objects. You will never write this class in real life, but you will use these techniques many many times.
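As a rough illustration of that point, here is a sketch in which the linked objects have their own purpose and the list operations live in plain functions. The `Task` class and the helpers `iter_chain` and `remove_first` are invented for this example.

```python
class Task:
    """Hypothetical domain object that happens to be chained via `next`."""
    def __init__(self, name, next=None):
        self.name = name
        self.next = next

def iter_chain(head):
    """Yield every object in the chain starting at `head`."""
    node = head
    while node is not None:
        yield node
        node = node.next

def remove_first(head, predicate):
    """Unlink the first object matching `predicate`; return the (possibly new) head."""
    if head is None:
        return None
    if predicate(head):
        return head.next
    prev = head
    while prev.next is not None:
        if predicate(prev.next):
            prev.next = prev.next.next   # bypass the matching object
            break
        prev = prev.next
    return head

chain = Task("build", Task("test", Task("deploy")))
chain = remove_first(chain, lambda t: t.name == "test")
print([t.name for t in iter_chain(chain)])  # ['build', 'deploy']
```

The deletion logic is the same as in deleteNode above, but nothing here needs a dedicated list class; the caller simply keeps the head in a variable.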

Comparing regexes with recursion

So I'm stuck here trying to compare regexes recursively. The user will create an object with two parameters, each a string of length one. These strings can only be "0", "1" or "2". But I want to recursively check if these strings point to another string as well. Like:
*
/ \
1 2
/ \
2 1
I can't figure out how to recursively point to a new object:
This is what I have so far:
class DotNode(object):
    def __init__(self, _cargo, _left=None, _right=None):
        self._cargo = _cargo
        self._left = _left
        self._right = _right
    def __eq__(self, _other):
        base = ['0','1','2']
        if self._left in base and self._right in base:
            return self._left == _other._left and self._right == _other._right
        else:
            while self._left not in base or self._right not in base:
                new = self._left
                new2 = self._right
                new3 = _other._left
                new4 = _other._right
                return new._left == new3._left and new2._right == new4._right
You seem to already know how to do this: recursion. You want to call the __eq__ function recursively here. I would also advise you to check whether the given cargo is one of the possible values in the constructor or, even better, every time the value is set.
class DotNode(object):
    @property
    def _cargo(self):
        return self._vcargo
    @_cargo.setter
    def _cargo(self, val):
        if val not in ['0', '1', '2']:
            raise ValueError("{} is not a possible value for _cargo.".format(val))
        self._vcargo = val
    def __eq__(self, other):
        return isinstance(other, DotNode) and self._cargo == other._cargo and self._left == other._left and self._right == other._right
Breaking it down
Of course you want to keep your constructor here; I just wrote down the changed parts. As you may have noticed, you don't even need regexes here; standard string comparison works just fine.
The _cargo property
I changed _cargo from a simple attribute to a property here. What does that mean? You get getters and setters à la Java that allow better control over the possible values. The actual data is stored in _vcargo. Of course someone could write to that attribute directly, but that would be downright stupid, and you are certainly not responsible if someone uses your code in a way it was not intended. Should you try to set a value different from the possible values, a ValueError will be raised.
The __eq__ function
As you can see this function is actually very simple. Everything it does is compute whether the cargo of the node itself and the other node is equal. Now, if both subtrees are also equal the whole tree is equal. At the deepest level it will compare None with None if both trees are equal, because there will be no more subtrees.
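Putting the changed parts together with a constructor gives something like the sketch below. The constructor and the `ALLOWED` class attribute are assumptions added to make the example self-contained; the property and `__eq__` follow the answer.

```python
class DotNode(object):
    ALLOWED = ('0', '1', '2')   # assumed home for the permitted cargo values

    def __init__(self, cargo, left=None, right=None):
        self._cargo = cargo     # assignment goes through the property setter below
        self._left = left
        self._right = right

    @property
    def _cargo(self):
        return self._vcargo

    @_cargo.setter
    def _cargo(self, val):
        if val not in self.ALLOWED:
            raise ValueError("{} is not a possible value for _cargo.".format(val))
        self._vcargo = val

    def __eq__(self, other):
        # Comparing the subtrees with == calls __eq__ again, which is the recursion.
        return (isinstance(other, DotNode)
                and self._cargo == other._cargo
                and self._left == other._left
                and self._right == other._right)

a = DotNode('1', DotNode('2'), DotNode('0'))
b = DotNode('1', DotNode('2'), DotNode('0'))
c = DotNode('1', DotNode('2'), DotNode('1'))
print(a == b)  # True
print(a == c)  # False
```

Note that in Python 3 a class that defines `__eq__` without `__hash__` becomes unhashable, which only matters if the nodes are ever put into a set or used as dict keys.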

Huffman encoding issue

As an exercise I'm trying to encode some symbols using Huffman trees, but using my own class instead of the built in data types with Python.
Here is my node class:
class Node(object):
    left = None
    right = None
    weight = None
    data = None
    code = ''
    length = len(code)
    def __init__(self, d, w, c):
        self.data = d
        self.weight = w
        self.code = c
    def set_children(self, ln, rn):
        self.left = ln
        self.right = rn
    def __repr__(self):
        return "[%s,%s,(%s),(%s)]" %(self.data,self.code,self.left,self.right)
    def __cmp__(self, a):
        return cmp(self.code, a.code)
    def __getitem__(self):
        return self.code
and here is the encoding function:
def encode(symbfreq):
    tree = [Node(sym,wt,'') for sym, wt in symbfreq]
    heapify(tree)
    while len(tree)>1:
        lo, hi = sorted([heappop(tree), heappop(tree)])
        lo.code = '0'+lo.code
        hi.code = '1'+hi.code
        n = Node(lo.data+hi.data,lo.weight+hi.weight,lo.code+hi.code)
        n.set_children(lo, hi)
        heappush(tree, n)
    return tree[0]
(Note, that the data field will eventually contain a set() of all the items in the children of a node. It just contains a sum for the moment whilst I get the encoding correct).
Here is the previous function I had for encoding the tree:
def encode(symbfreq):
    tree = [[wt, [sym, ""]] for sym, wt in symbfreq]
    heapq.heapify(tree)
    while len(tree)>1:
        lo, hi = sorted([heapq.heappop(tree), heapq.heappop(tree)], key=len)
        for pair in lo[1:]:
            pair[1] = '0' + pair[1]
        for pair in hi[1:]:
            pair[1] = '1' + pair[1]
        heapq.heappush(tree, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    return sorted(heapq.heappop(tree)[1:], key=lambda p: (len(p[-1]), p))
However I've noticed that my new procedure is incorrect: it gives the top nodes the longest codewords instead of the final leaves, and doesn't produce the same tree for permutations of input symbols i.e. the following don't produce the same tree (when run with new encoding function):
input1 = [(1,0.25),(0,0.25),(0,0.25),(0,0.125),(0,0.125)]
input2 = [(0,0.25),(0,0.25),(0,0.25),(1,0.125),(0,0.125)]
I'm finding I'm really bad at avoiding this kind of off-by-one/ordering bugs - how might I go about sorting this out in the future?
There's more than one oddity ;-) in this code, but I think your primary problem is this:
def __cmp__(self, a):
    return cmp(self.code, a.code)
Heap operations use the comparison method to order the heap, but for some reason you're telling it to order Nodes by their current code strings. You almost certainly want the heap to order them by their weights instead, right? That's how Huffman encoding works.
def __cmp__(self, a):
    return cmp(self.weight, a.weight)
For the rest, it's difficult to follow because 4 of your 5 symbols are the same (four 0 and one 1). How can you possibly tell whether it's working or not?
Inside the loop, this is strained:
lo, hi = sorted([heappop(tree), heappop(tree)])
Given the repair to __cmp__, that's easier as:
lo = heappop(tree)
hi = heappop(tree)
Sorting is pointless - the currently smallest element is always popped. So pop twice, and lo <= hi must be true.
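A quick way to convince yourself of this, using plain numbers in place of Node objects (a minimal sketch, with made-up weights):

```python
import heapq

weights = [5, 1, 4, 2, 3]   # made-up weights standing in for Node objects
heapq.heapify(weights)

lo = heapq.heappop(weights)  # smallest remaining element
hi = heapq.heappop(weights)  # next smallest
print(lo, hi)  # 1 2
```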
I'd say more ;-), but at this point I'm confused about what you're trying to accomplish in the end. If you agree __cmp__ should be repaired, make that change and edit the question to give both some inputs and the exact output you're hoping to get.
More
About:
it gives the top nodes the longest codewords instead of the final leaves,
This isn't an "off by 1" thing, it's more of a "backwards" thing ;-) Huffman coding looks at nodes with the smallest weights first. The later a node is popped from the heap, the higher the weight, and the shorter its code should be. But you're making codes longer & longer as the process goes on. They should be getting shorter & shorter as the process goes on.
You can't do this while building the tree. In fact the codes aren't knowable until the tree-building process has finished.
So, rather than guess at intents, etc, I'll give some working code you can modify to taste. And I'll include a sample input and its output:
from heapq import heappush, heappop, heapify
class Node(object):
    def __init__(self, weight, left, right):
        self.weight = weight
        self.left = left
        self.right = right
        self.code = None
    def __cmp__(self, a):
        return cmp(self.weight, a.weight)
class Symbol(object):
    def __init__(self, name, weight):
        self.name = name
        self.weight = weight
        self.code = None
    def __cmp__(self, a):
        return cmp(self.weight, a.weight)
def encode(symbfreq):
    # return pair (sym2object, tree), where
    # sym2object is a dict mapping a symbol name to its Symbol object,
    # and tree is the root of the Huffman tree
    sym2object = {sym: Symbol(sym, w) for sym, w in symbfreq}
    tree = sym2object.values()
    heapify(tree)
    while len(tree) > 1:
        lo = heappop(tree)
        hi = heappop(tree)
        heappush(tree, Node(lo.weight + hi.weight, lo, hi))
    tree = tree[0]
    def assigncode(node, code):
        node.code = code
        if isinstance(node, Node):
            assigncode(node.left, code + "0")
            assigncode(node.right, code + "1")
    assigncode(tree, "")
    return sym2object, tree
i = [('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', 5)]
s2o, t = encode(i)
for v in s2o.values():
    print v.name, v.code
That prints:
a 010
c 00
b 011
e 11
d 10
So, as hoped, the symbols with the highest weights have the shortest codes.
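The code above is written for Python 2 (`cmp`, `__cmp__` and the `print` statement no longer exist in Python 3). For readers on Python 3, the same idea can be sketched with tuples on the heap instead of comparable classes. This is an adaptation, not the answer's code: the function name `huffman_codes`, the tuple layout and the tie-breaking counter are all choices made for this illustration.

```python
import heapq
from itertools import count

def huffman_codes(symbfreq):
    """Return {symbol: code} for an iterable of (symbol, weight) pairs."""
    tiebreak = count()  # unique second field, so trees are never compared on ties
    # Heap entries are (weight, tiebreak, tree); a tree is ("leaf", sym) or ("node", left, right).
    heap = [(w, next(tiebreak), ("leaf", sym)) for sym, w in symbfreq]
    heapq.heapify(heap)
    while len(heap) > 1:
        w_lo, _, lo = heapq.heappop(heap)   # two lightest trees
        w_hi, _, hi = heapq.heappop(heap)
        heapq.heappush(heap, (w_lo + w_hi, next(tiebreak), ("node", lo, hi)))

    codes = {}
    def assign(tree, code):
        # Codes are assigned only after the tree is complete, walking down from the root.
        if tree[0] == "leaf":
            codes[tree[1]] = code or "0"    # a lone symbol still gets a one-bit code
        else:
            assign(tree[1], code + "0")
            assign(tree[2], code + "1")
    if heap:
        assign(heap[0][2], "")
    return codes

codes = huffman_codes([('a', 1), ('b', 2), ('c', 3), ('d', 4), ('e', 5)])
for sym in sorted(codes):
    print(sym, codes[sym])
```

For this input the sketch yields the same codes as the output listed above, since ties on weight are broken by insertion order.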
I suspect the problem is in this segment:
lo.code = '0'+lo.code
hi.code = '1'+hi.code
Both hi and lo can be intermediate nodes, whereas you need to update the codes at the leaves, where the actual symbols are. I think you should not maintain any codes while constructing the Huffman tree, but instead get the codes of the individual symbols after construction, just by traversing the tree.
Here's my pseudocode for encode:
tree = ConstructHuffman()  # same as your current code, but without the code field
def encode(tree, symbol):
    if tree.data == symbol:
        return ''  # reached the symbol's leaf: nothing left to append
    elif symbol in tree.left.data:
        return '0' + encode(tree.left, symbol)
    else:
        return '1' + encode(tree.right, symbol)
Calculate the codes for all symbols using the encode method above, and then you can use them for encoding.
Note: change the comparison function from comparing codes to comparing weights.
