Optimizing element searching in a Python heap

I'm looking for an object in a Python heap. Technically, I'm looking for the absence of it, but I assume the logic works similarly.
import heapq

heap = []
heapq.heappush(heap, (10, object))      # push a (priority, item) pair
if object not in [k for v, k in heap]:  # linear scan over every entry in the heap
    pass  # code goes here
However, this check is the most processor-intensive part of my program when the heap holds a large number of elements.
Can this search be optimized? And if so, how?

heapq is a binary heap implementation of a priority queue. A binary heap makes a pretty efficient priority queue but, as you've discovered, finding an item requires a sequential search.
If all you need to know is whether an item is in the queue, then your best bet is probably to maintain a dictionary along with the queue. So when you add something to the queue, your code is something like:
"""
I'm not really a python guy, so the code probably has syntax errors.
But I think you get the idea.
"""
theQueue = [];
queueIndex = {};
queueInsert(item)
if (item.key in queueIndex)
// item already in queue. Exit.
heapq.heappush(theQueue, item);
queueIndex[item.key] = 1
queuePop()
result = heapq.heappop();
del queueIndex[result.key];
return result;
Note that if the item you're putting on the heap is a primitive like a number or a string, then you'd replace item.key with item.
Also, note that this will not work correctly if you can place duplicates in the queue. You could modify it to allow that, though: maintain a count per key, and only remove a key from the index when its count goes to 0, as sketched below.
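A minimal sketch of that duplicate-tolerant variant, assuming the same item.key convention as above (the counter bookkeeping is mine, not from the original answer):

import heapq

the_queue = []
queue_index = {}  # key -> number of copies currently in the queue

def queue_insert(item):
    heapq.heappush(the_queue, item)
    queue_index[item.key] = queue_index.get(item.key, 0) + 1

def queue_pop():
    result = heapq.heappop(the_queue)
    queue_index[result.key] -= 1
    if queue_index[result.key] == 0:
        del queue_index[result.key]  # last copy gone; key is no longer "in" the queue
    return result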

You can't do it with heapq, but here's a compatible implementation that works as long as the heap will not contain multiple copies of the same element.
https://github.com/elplatt/python-priorityq

Related

Appropriate to use repeated function calls to loop through something (i.e. a list) in Python?

Let's say I have the following Python script:
def pop_and_loop():
    my_list.pop(0)
    my_func()

def my_func():
    # do something with my_list[0]
    if my_list[0] == finished_with:  # placeholder test for "this item is finished"
        pop_and_loop()
    # continued actions if not finished with
    if my_list[0] == finished_with:
        pop_and_loop()

my_list = [...]  # list containing 100 items
my_func()
Is this an appropriate setup? It feels like I'm leaving each function call open: the interpreter has to hold a marker at the position where I left one function to go to another, so in theory it is waiting for me to come back, but I never come back to that one. Does this create problems, and is there a different way you're meant to do this?
EDIT: My actual script is more complicated than this, with loads of different functions that I need to call while processing each item in the main list. Essentially my question is whether I need to convert this setup into an actual loop, bearing in mind that I will need to refresh the main list to refill it and then loop through it again. So how would I keep looping? Should I instead have:
import time

my_list = []

def my_func(item):
    # do something with the list item
    if item == finished_with:   # placeholder condition
        return output
    elif item_finished_now:     # placeholder condition
        return output

while not len(my_list):
    while items_to_fill():  # placeholder: there are items to fill the list with
        pass                # fill list
for x in my_list:
    output = my_func(x)
    # deal with output and list popping here
# sleep loop, waiting for there to be things to put into the list again
time.sleep(60)
Yours is simply an example of recursion.
Both the question and answer are borderline opinion-based, but in most cases you would prefer an iterative solution (loops) over recursion unless the recursive solution has a clear benefit of either being simpler or being easier to comprehend in code and in reasoning.
For various reasons, Python does not have recursion optimizations such as tail-call elimination, and it creates a new stack frame for each call. That, and more, is why an iterative solution is generally faster and why the overhead of extra recursive calls in Python is rather large: it takes more memory for the stack and spends more time creating those frames. On top of all that, there is a limit to the recursion depth, and most recursive algorithms can easily be converted to an iterative solution.
Your specific example is simple enough to convert like so:
while my_list:
    while my_list[0] != "finished":
        pass  # do stuff that eventually marks my_list[0] as "finished"
    my_list.pop(0)
On a side note, please don't pop(0) on a list; use a collections.deque instead, since deque.popleft() is O(1) where list.pop(0) is O(n).
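For illustration, a minimal version of that swap (the items are made up):

from collections import deque

my_list = deque(["task1", "task2", "finished"])
while my_list:
    item = my_list.popleft()  # O(1); list.pop(0) would shift every remaining element
    # do stuff with item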

Counting Total Number of Elements in a List using Recursion

The Problem:
Count the number of elements in a List using recursion.
I wrote the following function:
def count_rec(arr, i):
    """
    Takes a list (arr) and an index number, then returns the count
    of the number of elements in it using recursion.
    """
    try:
        temp = arr[i]  # if an element exists at i, continue
        return 1 + count_rec(arr, i+1)
    except IndexError:
        # an IndexError means i == length of the list
        return 0
I noticed some problems with it:
RecursionError (when the number of elements is more than 990)
Using a temp element (wasting memory..?)
Exception Handling (I feel like we shouldn't use it unless necessary)
If anyone can suggest how to improve the above solution or come up with an alternative one, it would be really helpful.
What you have is probably as efficient as you are going to get for this thought experiment. (Obviously, Python already calculates and stores the length of list objects, which can be retrieved with the len() built-in, so the function is completely unnecessary.)
You could get shorter code if you want:
def count(L):
    return count(L[:-1]) + 1 if L else 0
But you still need to raise Python's recursion limit:
import sys; sys.setrecursionlimit(100000)
However, note that a try/except block costs very little to set up when no exception is raised, so it can beat an explicit if/else check; if you are after performance, try/except is a reasonable choice here. Of course, it's odd to talk about performance at all, because recursion typically doesn't perform well in Python, due to how Python manages namespaces and stack frames. Recursion is typically frowned upon, unnecessary, and slow, so trying to optimize recursion performance is a little strange.
A last point: you mention temp = arr[i] taking up memory. Yes, possibly a few bytes, but any expression you evaluate to determine whether arr has an element at i costs those same few bytes, even a bare arr[i] without assignment. Moreover, those bytes are freed the moment the temp variable falls out of scope, gets re-used, or the function exits. So unless you are planning on launching 10,000,000,000 sub-processes, rest assured there is no performance degradation in using a temp variable like that.
You are probably looking for something like this:
def count_rec(arr):
    if arr == []:
        return 0
    return count_rec(arr[1:]) + 1
You can use pop() to do it:
def count_r(l):
    if l == []:
        return 0
    else:
        l.pop()  # note: this empties the list as a side effect
        return count_r(l) + 1

python find out which lists have been added to diagnose memory leak

I'm trying to diagnose a memory leak. By using tools like pympler and objgraph, I can see that a lot of large lists are added after each iteration of the main loop in my program code. This is unexpected behavior - the number of lists should stay constant after the program starts, not grow in the loop.
I would like to look at the lists that are added after each iteration. I've tried to do this via something similar to the following (very simplified) code:
def my_func():
    import objgraph
    import gc
    existing = objgraph.by_type("list")
    for item in to_do():
        gc.collect()
        new = objgraph.by_type("list")
        diff = [item for item in new if item not in existing]
        existing = new
        do_something(item)
However, I get the following error when I attempt this:
RuntimeError: maximum recursion depth exceeded in comparison
I understand why this is happening, but I still need a way to investigate the new lists. How can I get access to just these new lists so that I can investigate the memory leak?
As you already know, using item not in existing blows up because it checks for equality, like a == b, which requires traversal of nested structures. However, equality is actually not what we are interested in the first place. Instead, we want to compare identity, i.e. a is b. This is a lot cheaper, as it is independent of the object content.
As a list comprehension, we could say
diff = [n for n in new if not any(n is e for e in existing)]
That works, but each membership test still scans all of existing linearly. Spelling the check out as a helper function makes the identity comparison explicit:
def is_in(item, collection):
    for c in collection:
        if item is c:
            return True
    return False
diff = [item for item in new if not is_in(item, existing)]
The fastest solution is to collect the identities in existing once and put them into a set for O(1) lookups:
existing_ids = {id(item) for item in existing}
diff = [item for item in new if id(item) not in existing_ids]
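Folded back into the question's loop, the whole thing might look like this (a sketch reusing the question's to_do() and do_something() placeholders):

import gc
import objgraph

def my_func():
    existing_ids = {id(x) for x in objgraph.by_type("list")}
    for item in to_do():
        gc.collect()
        new = objgraph.by_type("list")
        # identity-based diff: only the lists created since the last iteration
        diff = [x for x in new if id(x) not in existing_ids]
        existing_ids = {id(x) for x in new}
        do_something(item)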
Use a debugger, e.g. the one that comes with PyCharm.
Set a breakpoint at a sensible point within your iteration and go through the execution step by step.

LFU cache implementation in python

I have implemented LFU cache in python with the help of Priority Queue Implementation given at
https://docs.python.org/2/library/heapq.html#priority-queue-implementation-notes
I have given the code at the end of the post. But I feel that the code has some serious problems:
1. To give a scenario: suppose only one page is continuously getting visited (say, 50 times). This code will always mark the already-added node as "removed" and push a fresh node onto the heap, so it ends up with 50 different nodes for the same page, increasing the heap size enormously.
2. This question is almost the same as Q1 of the telephonic interview at
http://www.geeksforgeeks.org/flipkart-interview-set-2-sde-2/
and the person mentioned that a doubly linked list can give better efficiency than a heap. Can anyone explain to me how?
from heapq import heappush, heappop

class LFUCache:
    REMOVED = "<removed-task>"

    def __init__(self, cache_size):
        self.cache_size = cache_size
        self.heap = []       # [frequency, page_no, content] entries
        self.cache_map = {}  # page_no -> its live heap entry

    def get_page_content(self, page_no):
        if page_no in self.cache_map:
            self.update_frequency_of_page_in_cache(page_no)
        else:
            self.add_page_in_cache(page_no)
        return self.cache_map[page_no][2]

    def add_page_in_cache(self, page_no):
        if len(self.cache_map) == self.cache_size:
            self.delete_page_from_cache()
        heap_node = [1, page_no, "content of page " + str(page_no)]
        heappush(self.heap, heap_node)
        self.cache_map[page_no] = heap_node

    def delete_page_from_cache(self):
        # Pop until we find an entry that is not a stale tombstone.
        while self.heap:
            count, page_no, page_content = heappop(self.heap)
            if page_content is not self.REMOVED:
                del self.cache_map[page_no]
                return

    def update_frequency_of_page_in_cache(self, page_no):
        heap_node = self.cache_map[page_no]
        heap_node[2] = self.REMOVED  # tombstone the old entry instead of re-heapifying
        count = heap_node[0]
        heap_node = [count + 1, page_no, "content of page " + str(page_no)]
        heappush(self.heap, heap_node)
        self.cache_map[page_no] = heap_node

def main():
    cache_size = int(input("Enter cache size "))
    cache = LFUCache(cache_size)
    while True:
        page_no = int(input("Enter page no needed "))
        print(cache.get_page_content(page_no))
        print(cache.heap, cache.cache_map, "\n")

if __name__ == "__main__":
    main()
Efficiency is a tricky thing. In real-world applications, it's often a good idea to use the simplest and easiest algorithm, and only start to optimize when that's measurably slow. And then you optimize by doing profiling to figure out where the code is slow.
If you are using CPython, it gets especially tricky, as even an inefficient algorithm implemented in C can beat an efficient algorithm implemented in Python due to the large constant factors; e.g. a double-linked list implemented in Python tends to be a lot slower than simply using the normal Python list, even for cases where in theory it should be faster.
Simple algorithm:
For an LFU, the simplest algorithm is to use a dictionary that maps keys to (item, frequency) objects, and update the frequency on each access. This makes access very fast (O(1)), but pruning the cache is slower as you need to sort by frequency to cut off the least-used elements. For certain usage characteristics, this is actually faster than other "smarter" solutions, though.
You can optimize for this pattern by not simply pruning your LFU cache to the maximum length, but to prune it to, say, 50% of the maximum length when it grows too large. That means your prune operation is called infrequently, so it can be inefficient compared to the read operation.
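A minimal sketch of that simple design with the prune-to-half policy (class and method names are mine, not from the answer):

class SimpleLFU:
    def __init__(self, max_size):
        self.max_size = max_size
        self.data = {}  # key -> [value, access_count]

    def get(self, key):
        entry = self.data.get(key)
        if entry is None:
            return None
        entry[1] += 1  # the O(1) fast path
        return entry[0]

    def put(self, key, value):
        self.data[key] = [value, 1]
        if len(self.data) > self.max_size:
            self._prune()

    def _prune(self):
        # Infrequent O(n log n) step: sort by frequency and keep the
        # most-used half, so the cost is amortized over many reads.
        by_freq = sorted(self.data.items(), key=lambda kv: kv[1][1], reverse=True)
        self.data = dict(by_freq[: self.max_size // 2])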
Using a heap:
In (1), you used a heap because that's an efficient way of storing a priority queue. But you are not implementing a priority queue. The resulting algorithm is optimized for pruning, but not access: You can easily find the n smallest elements, but it's not quite as obvious how to update the priority of an existing element. In theory, you'd have to rebalance the heap after every access, which is highly inefficient.
To avoid that, you added a trick: keeping tombstoned elements around even after they are deleted. But this trades space for time.
If you don't want to trade space, you could update the frequencies in place and simply rebalance the heap before pruning the cache. You regain fast access times at the expense of slower pruning, like the simple algorithm above. (I doubt there is any speed difference between the two, but I have not measured this.)
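Sketched against the question's own heap and cache_map structures (target_size is a placeholder of mine), that rebalance-then-prune step might look like:

import heapq

def prune(heap, cache_map, target_size):
    # Frequencies were incremented in place, so the heap invariant may
    # be broken; restore it once, then pop the least-frequently-used pages.
    heapq.heapify(heap)
    while len(cache_map) > target_size:
        count, page_no, content = heapq.heappop(heap)
        del cache_map[page_no]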
Using a double-linked list:
The double-linked list mentioned in (2) takes advantage of the nature of the possible changes here: an element is either added with the lowest priority (0 accesses), or an existing element's priority is incremented by exactly 1. You can use these properties to your advantage if you design your data structures like this:
You have a double-linked list of elements which is ordered by the frequency of the elements. In addition, you have a dictionary that maps items to elements within that list.
Accessing an element then means:
Either it's not in the dictionary, that is, it's a new item, in which case you can simply append it to the end of the double-linked list (O(1))
or it's in the dictionary, in which case you increment the frequency in the element and move it leftwards through the double-linked list until the list is ordered again (O(n) worst-case, but usually closer to O(1)).
To prune the cache, you simply cut off n elements from the end of the list (O(n)).
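A common way to realize this idea in Python is with per-frequency buckets, where each bucket is a collections.OrderedDict (itself a hash table threaded through a doubly linked list, which supplies the O(1) link operations a hand-rolled list would). This is a sketch of that variant rather than the exact move-leftwards list described above; all names are mine, and it assumes capacity >= 1:

from collections import defaultdict, OrderedDict

class BucketLFU:
    """LFU cache: key -> (value, freq) dict plus freq -> keys buckets."""
    def __init__(self, capacity):
        assert capacity >= 1
        self.capacity = capacity
        self.entries = {}                        # key -> (value, freq)
        self.buckets = defaultdict(OrderedDict)  # freq -> keys, oldest first
        self.min_freq = 0

    def _touch(self, key):
        # Move key from its current frequency bucket to the next one: O(1).
        value, freq = self.entries[key]
        del self.buckets[freq][key]
        if not self.buckets[freq]:
            del self.buckets[freq]
            if self.min_freq == freq:
                self.min_freq = freq + 1
        self.buckets[freq + 1][key] = None
        self.entries[key] = (value, freq + 1)

    def get(self, key):
        if key not in self.entries:
            return None
        self._touch(key)
        return self.entries[key][0]

    def put(self, key, value):
        if key in self.entries:
            self.entries[key] = (value, self.entries[key][1])
            self._touch(key)
            return
        if len(self.entries) >= self.capacity:
            # Evict the least-frequently, then least-recently, used key.
            evicted, _ = self.buckets[self.min_freq].popitem(last=False)
            del self.entries[evicted]
            if not self.buckets[self.min_freq]:
                del self.buckets[self.min_freq]
        self.entries[key] = (value, 1)
        self.buckets[1][key] = None
        self.min_freq = 1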

O(1) indexable deque of integers in Python

What are my options there? I need to call a lot of appends (to the right end) and poplefts (from the left end, naturally), but also to read from the middle of the storage, which will steadily grow by the nature of the algorithm. I would like all of these operations to be O(1).
I could implement it in C easy enough on circularly-addressed array (what's the word?) which would grow automatically when it's full; but what about Python? Pointers to other languages are appreciated too (I realize the "collections" tag is more Java etc. oriented and would appreciate the comparison, but as a secondary goal).
I come from a Lisp background and was amazed to learn that in Python removing a head element from a list is an O(n) operation. A deque could be an answer except the documentation says access is O(n) in the middle. Is there anything else, pre-built?
You can get an amortized O(1) data structure by using two Python lists, one holding the left half of the deque and the other holding the right half. The left half is stored reversed, so the left end of the deque is at the back of that list. Something like this:
class mydeque(object):
    def __init__(self):
        self.left = []   # left half, stored reversed: leftmost element last
        self.right = []  # right half, stored in natural order

    def pushleft(self, v):
        self.left.append(v)

    def pushright(self, v):
        self.right.append(v)

    def popleft(self):
        if not self.left:
            self.__fill_left()
        return self.left.pop()

    def popright(self):
        if not self.right:
            self.__fill_right()
        return self.right.pop()

    def __len__(self):
        return len(self.left) + len(self.right)

    def __getitem__(self, i):
        if i >= len(self.left):
            return self.right[i - len(self.left)]
        else:
            return self.left[-(i + 1)]

    def __fill_right(self):
        # Move the inner half of the left list to the (empty) right list.
        # Round up so a single remaining element still gets moved.
        x = (len(self.left) + 1) // 2
        self.right.extend(self.left[0:x])
        self.right.reverse()
        del self.left[0:x]

    def __fill_left(self):
        x = (len(self.right) + 1) // 2
        self.left.extend(self.right[0:x])
        self.left.reverse()
        del self.right[0:x]
I'm not 100% sure if the interaction between this code and the amortized performance of python's lists actually result in O(1) for each operation, but my gut says so.
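A quick sanity check of the two-list deque above:

d = mydeque()
for ch in "abc":
    d.pushright(ch)
d.pushleft("z")        # deque is now z, a, b, c
print(d[0], d[3])      # z c   (O(1) reads anywhere)
print(d.popleft())     # z
print(len(d))          # 3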
Accessing the middle of a lisp list is also O(n).
Python lists are array lists, which is why popping the head is expensive (popping the tail is constant time).
What you are looking for is an array with (amortised) constant-time deletions at the head; that basically means you are going to have to build a data structure on top of list that uses lazy deletion and is able to recycle lazily-deleted slots when the queue is empty.
Alternatively, use a hashtable and a couple of integers to keep track of the current contiguous range of keys, as sketched below.
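A minimal sketch of that hashtable-plus-counters idea (class and method names are mine): the dict keys form a contiguous range [head, tail), so append, popleft, and indexed reads are each a single dict operation.

class IntDeque:
    def __init__(self):
        self.items = {}  # maps a running integer index -> element
        self.head = 0    # index of the current leftmost element
        self.tail = 0    # one past the current rightmost element

    def append(self, v):       # O(1) push on the right end
        self.items[self.tail] = v
        self.tail += 1

    def popleft(self):         # O(1) pop from the left end
        v = self.items.pop(self.head)
        self.head += 1
        return v

    def __len__(self):
        return self.tail - self.head

    def __getitem__(self, i):  # O(1) read anywhere (non-negative i)
        return self.items[self.head + i]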
Python's Queue module is built for passing messages between threads; it doesn't support indexed access at all, so it won't give you O(1) reads from the middle.
