Complexity does not match actual growth in running time? (Python)

I ran two pieces of code in Python and measured how long each took to complete. The code is quite simple, just recursive maximums. Here it is:
1.
def max22(L, left, right):
    if left >= right:
        return L[int(left)]
    k = max22(L, left, (left + right - 1) // 2)
    p = max22(L, (right + left + 1) // 2, right)
    return max(k, p)

def max_list22(L):
    return max22(L, 0, len(L) - 1)

2.

def max2(L):
    if len(L) == 1:
        return L[0]
    l = max2(L[:len(L) // 2])
    r = max2(L[len(L) // 2:])
    return max(l, r)
The first one should run (imo) in O(log n), and the second one in O(n log n).
However, I measured the running time for n=1000, n=2000 and n=4000, and somehow the growth for both algorithms seems to be linear! How is this possible? Did I get the complexity wrong, or is it okay?
Thanks.

The first algorithm is not O(log n), because it examines the value of every element; it can be shown that it is O(n).
As for the second, you possibly just couldn't notice the difference between n and n log n at such small scales.

Just because a function is splitting the search space by 2 and then recursively looking at each half does not mean that it has a log(n) factor in the complexity.
In your first solution, you are splitting the search space by 2, but then ultimately inspecting every element in each half. Unlike binary search which discards one half of the search space, you are inspecting both halves. This means nothing is discarded from the search and you ultimately end up looking at every element, making your complexity O(n). The same holds true for your second implementation.
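To spell out the counting (my own summary of the argument above): each call does O(1) work of its own plus two recursive calls on halves, so the recurrence is T(n) = 2T(n/2) + O(1), which solves to T(n) = O(n). A log n bound would need T(n) = T(n/2) + O(1), i.e. one half discarded at every step, which is what binary search does.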

Your first algorithm is O(n) on a normal machine, so it is not surprising that your testing indicated this. Your second algorithm is O(n log n), because each level of recursion copies its halves of the list through slicing; it would be O(n) if you passed index ranges instead of slices. Since Python's built-in list operations are pretty fast, you may not have hit the logarithmic slowdown yet; try it with values more like n=4000000 and see what you get.
Note that, if you could run both recursive calls in parallel (with O(1) slicing), both algorithms could run in O(log n) time. Of course, you would need O(n) processors to do this, but if you were designing a chip, instead of writing a program, that kind of scaling would be straightforward...
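If you want to see the difference empirically, a rough Python 3 sketch like the following (assuming max_list22 and max2 from the question are already defined) doubles n a few times and prints both timings; the first column of times should roughly double each step, while the second should grow a bit faster than that:

    import random
    import time

    for n in (1_000_000, 2_000_000, 4_000_000):
        L = [random.random() for _ in range(n)]
        t0 = time.perf_counter()
        max_list22(L)                 # index-based version
        t1 = time.perf_counter()
        max2(L)                       # slicing version
        t2 = time.perf_counter()
        print(n, round(t1 - t0, 3), round(t2 - t1, 3))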

Related

out-of-core/external-memory combinatorics in python

I am iterating the search space of valid Python3 ASTs. With max recursion depth = 3, my laptop runs out of memory. My implementation makes heavy use of generators, specifically 'yield' and itertools.product().
Ideally, I'd replace product() and the max recursion depth with some sort of iterative deepening, but first things first:
Are there any libraries or useful SO posts for out-of-core/external-memory combinatorics?
If not... I am considering the feasibility of using either dask or joblib's delayed()... or perhaps wendelin-core's ZBigArray, though I don't like the looks of its interface:
root = dbopen('test.fs')
root['A'] = A = ZBigArray((10,), np.int)
transaction.commit()
Based on this example, I think that my solution would involve an annotation/wrapper function that eagerly converts the generators to ZBigArrays, replacing root['A'] with something like root[get_key(function_name, *function_args)]. It's not pretty, since my generators are not entirely pure -- the output is shuffled. In my current version this shouldn't be a big deal, but the previous and next versions involve using various NNs and RL rather than mere shuffling.
First things first: the reason you're getting the out-of-memory error is that itertools.product() caches intermediate values. It has no idea whether the function that gave you your generator is idempotent, and even if it did, it wouldn't be able to infer how to call it again given just the generator. This means itertools.product must cache the values of each iterable it's passed.
The solution here is to bite the small performance bullet and either write explicit for loops, or write your own cartesian product function, which takes functions that would produce each generator. For instance:
def product(*funcs, repeat=None):
    if not funcs:
        yield ()
        return
    if repeat is not None:
        funcs *= repeat
    func, *rest = funcs
    for val in func():
        for res in product(*rest):
            yield (val,) + res
from functools import partial
values = product(partial(gen1, arg1, arg2), partial(gen2, arg1))
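A tiny runnable demo of this lazy product, with made-up generator functions, might look like:

    def digits():
        yield from range(3)

    def letters():
        yield from "ab"

    print(list(product(digits, letters)))
    # [(0, 'a'), (0, 'b'), (1, 'a'), (1, 'b'), (2, 'a'), (2, 'b')]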
The bonus from rolling your own here is that you can also change how it traverses the A-by-B-by-C... dimensional search space, so you could do, say, a breadth-first search instead of an iteratively deepening DFS. Or you could pick a space-filling curve, such as the Hilbert curve, which would visit the indices/depths of each dimension of your product() in a locality-preserving fashion.
Apart from that, I have one more thing to point out- you can also implement BFS lazily (using generators) to avoid building a queue that could bloat memory usage as well. See this SO answer, copied below for convenience:
def breadth_first(self):
    yield self
    for c in self.breadth_first():
        if not c.children:
            return  # stop the recursion as soon as we hit a leaf
        yield from c.children
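For a self-contained picture of how that generator behaves, here is a minimal, hypothetical Node class wired up to it (the class and example tree are mine, not from the linked answer); as the comment says, the traversal stops at the first childless node it reaches, so it works best when leaves sit at the bottom level:

    class Node:
        def __init__(self, name, children=()):
            self.name = name
            self.children = list(children)

        def breadth_first(self):
            yield self
            for c in self.breadth_first():
                if not c.children:
                    return
                yield from c.children

    root = Node("root", [Node("a", [Node("a1"), Node("a2")]), Node("b")])
    print([n.name for n in root.breadth_first()])
    # ['root', 'a', 'b', 'a1', 'a2']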
Overall, you will take a performance hit from using semi-coroutines, with zero caching, all in python-land (in comparison to the baked in and heavily optimized C of CPython). However, it should still be doable- algorithmic optimizations (avoiding generating semantically nonsensical ASTs, prioritizing ASTs that suit your goal, etc.) will have a larger impact than the constant-factor performance hit.

Unsure about the time complexity in the code

I have a question about time complexity
import math

def power_iter(x, n):
    for i in range(math.floor(math.log2(n))):
        x = x * x
        print(x)
    return math.pow(2, (n - math.pow(2, math.floor(math.log2(n))))) * x

print(power_iter(2, 10))
Q1. Is the time complexity of math.floor(math.log2(n)) and n - math.pow(2, math.floor(math.log2(n))) each O(1)?
Q2. I think that this code's time complexity is O(log2(n)). Is this right?
Q1. Are the "math.floor(math.log2(n))" and "n-math.pow(2,math.floor(math.log2(n))))" time complexity is O(1) each other? or not include in time complexity
Correct, these operations are ultimately irrelevant in the simplified time complexity. Big O notation describes the rate of increase, in this case with respect to n. The iteration over the range object is what you're after here; you can effectively treat the individual math calls within each iteration as you would basic operators on integers with O(1) time.
Q2. I think that this code's time complexity is O(log2(n)). Is this right?
Yes.
Answer to Q1: It depends on the scale of n, but normally you can assume that the time complexity of the floor and log functions is in Theta(1).
Answer to Q2: As you found, there is only one loop, and it runs about log(n) times. So, if we assume the answer to the first question is right, you can say the time complexity is in O(log(n)), and also Theta(log(n)).
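As a quick sanity check (my own sketch, not from the answers): the loop in power_iter runs floor(log2(n)) times, so doubling n adds only one more iteration.

    import math

    for n in (10, 100, 1000, 10**6):
        print(n, math.floor(math.log2(n)))  # number of loop iterations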

BIG O time complexity of TSP algorithms

I've written two nearest-neighbour algorithms in Python and I have to analyze the runtime complexity in terms of O(n) and Θ(n).
I've tried several samples and I don't understand why one of my algorithms is faster than the other one.
So here is my Code for the repeated nearest neighbor (RNN) algorithm:
def repeated_nn_tsp(cities):
    return shortest_tour(nn_tsp(cities, start) for start in cities)

def shortest_tour(tours):
    return min(tours, key=tour_length)
nn_tsp has a runtime complexity of O(n^2) and every startpoint will create a new NN Tour. Through all NN tours I have to find the best tour.
That's why I think the time complexity of the RNN has to be T(n)=O(n^3) and T(n)=Θ(n^3).
So here is my Code for the altered nearest neighbor (ANN) algorithm:
def alter_tour(tour):
    original_length = tour_length(tour)
    for (start, end) in all_segments(len(tour)):
        reverse_segment_if_better(tour, start, end)
    if tour_length(tour) < original_length:
        return alter_tour(tour)
    return tour

def all_segments(N):
    return [(start, start + length)
            for length in range(N, 2 - 1, -1)
            for start in range(N - length + 1)]

def reverse_segment_if_better(tour, i, j):
    A, B, C, D = tour[i - 1], tour[i], tour[j - 1], tour[j % len(tour)]
    if distance(A, B) + distance(C, D) > distance(A, C) + distance(B, D):
        tour[i:j] = reversed(tour[i:j])
The time complexity of all_segments should be T(n) = 1/2 * n^2 - 1/2 * n -> O(n^2), and it creates on the order of n^2 elements.
Inside the loop over all_segments (so over ~n^2 elements) I call reverse_segment_if_better, which uses Python's reversed() and slice assignment and therefore costs up to O(n).
That's why I think the time complexity of the loop has to be O(n^3). When there's a better tour, the function calls itself recursively, so I think the altered NN ends up with a time complexity of O(n^4). Is that right?
But here we come to my problem: my evaluation, which runs the code 100 times over 100 cities, shows that ANN is faster than RNN on average, which is the opposite of what the runtime complexity led me to expect. (RNN needs 4.829 s and ANN only 0.877 s for a single 100-city run.)
So where did I make a mistake?
Thanks in advance!
First, I must say that time complexity and big-O notation are not always the whole story: one algorithm may have a 'better' running-time function yet still run slower than expected, or slower than another algorithm with a worse running-time function. In your case it is very hard to determine what the worst-case input for each algorithm is, and we cannot be sure you have fed it one! Maybe the inputs were 'pleasant' for the ANN algorithm while the other one got stuck somewhere. This is why it is not always correct to rely only on the running-time function we calculate.
What I am trying to say is that you most probably did not make a deliberate mistake in your calculations; these are hard functions to analyze on the fly, as is deciding what kind of input would be the worst.
As for the 'why?':
When talking about actual measured running time (such as your 0.877 seconds), a lot comes down to the machine itself: every computer has its own hardware behind the curtains, and not all computers are born the same.
Secondly, when we talk about running-time complexity we drop the lower-order terms, as you did with the all_segments function; you even dropped a negative term, which in theory would reduce the number of 'operations'.
There are also many cases where a not-so-efficient piece of code is only executed when a specific criterion is met, which reduces the running time.
Last and most importantly, when we classify algorithms into sets such as O(n) or O(n log n) we are talking about asymptotic behaviour: we need to look at the bigger picture and see what happens when we feed the algorithm a very large amount of data, which I assume you didn't check, since, as you wrote, you ran only 100 cities. The picture may change if we look at, say, millions and millions of cities.
Looking at your code, I can see multiple parts that could reasonably cause this 'weird' difference in running time. The first is that in the ANN code, more specifically in the reverse_segment_if_better function, the list is not always reversed, only when a certain condition evaluates to true. We cannot be sure what input you gave the algorithm, so I can only imagine it happened to favour this algorithm.
Moreover, I may be missing something (we cannot see the functions tour_length or distance), but I don't see how you came up with O(n^4) at the end; it looks more like O(n^3):
all_segments: this one is O(n^2), returning ~n^2/2 (start, end) pairs.
The tricky part is analyzing reverse_segment_if_better and alter_tour: the reversal only touches the slice i:j, so it is not strictly correct to say it costs O(n), since we do not reverse the whole tour (at least, not for every value of start, end).
It is safe to say that either you did not test asymptotically with very large inputs, the input you gave happened to be kind to this specific algorithm, or the final form of T(n) was not tight enough.
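As a rough, self-contained sanity check (my own sketch, reusing the all_segments from the question): if every segment were reversed, the total number of elements touched in one alter_tour pass would be the sum of the segment lengths, which grows like n^3/6, while the number of segments grows like n^2/2; since only some segments are actually reversed, a single pass sits somewhere between O(n^2) and O(n^3).

    def all_segments(N):
        return [(start, start + length)
                for length in range(N, 2 - 1, -1)
                for start in range(N - length + 1)]

    for N in (50, 100, 200):
        segs = all_segments(N)
        total = sum(end - start for start, end in segs)
        print(N, len(segs), total)   # ~N^2/2 segments, ~N^3/6 total length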

Calculating Time and Space Complexity of xrange(), random.randint() and sort() function

What is the time and space complexity of xrange(), random.randint(1,100) and the sort() function in Python?
import random
a = [random.randint(1,100) for i in xrange(1000000)]
print a
a.sort()
print a
Without further information on the problem, your actual task, and what you have tried so far, an answer can only be approximate... but I will try to at least give you some input.
a = [random.randint(1,100) for i in xrange(1000000)]
The assignment a = ... itself is normally considered O(1) in terms of time complexity. Space complexity depends on how closely you want to analyze the problem: for a fixed size, a list of 1,000,000 random ints takes a constant amount of space, but expressed as a function of the input length n (1,000,000, 2,000,000, ...) it is O(n).
The comprehension [random.randint(1,100) for i in xrange(1000000)] is a loop with 1,000,000 iterations, each generating a random integer. Assuming randint itself is O(1), this is also O(n).
a.sort() depends on the sorting algorithm used. CPython uses Timsort (a merge-sort variant), which is O(n log n) in the worst case.
I got the answer on Facebook. Thanks to Shashank Gupta.
I'm assuming you know the basics of asymptotic notation and stuff.
Now, forget the a.sort() function for a moment and concentrate on your list comprehension:
a = [random.randint(1,100) for i in xrange(1000000)]
1000000 is pretty big so let's reduce it to 10 for now.
a = [random.randint(1,100) for i in xrange(10)]
You're building a new list here with 10 elements. Each element is generated via the randint function. Let's assume the time complexity of this function is O(1). For 10 elements, this function will be called 10 times, right?
Now, let's generalize this. For integer 'n'
a = [random.randint(1,100) for i in xrange(n)]
You will be calling the randint function 'n' times.
All of this can also be written as:
for i in xrange(n):
    a.append(randint(1, 100))
This is O(n).
Following the code, you have a simple print statement. This is O(n) again (internally, the Python interpreter iterates over the complete list). Now comes the sorting part. You've used the sort function. How much time does it take? There are many sorting algorithms out there, and without going into the exact algorithm used, I can safely assume the time complexity will be O(n log n).
Hence, the actual time complexity of your code is T(n) = O(n log n) + O(n) which is O(n log n) (the lower term is ignored for large n)
What about space? Your code initialized a new list of size 'n'. Hence space complexity is O(n).
There you go.
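If you want to check these bounds empirically, here is a small Python 3 sketch of my own (using range instead of xrange) that times the list build and the sort separately as n grows:

    import random
    import time

    for n in (10**5, 2 * 10**5, 4 * 10**5):
        t0 = time.perf_counter()
        a = [random.randint(1, 100) for _ in range(n)]
        t1 = time.perf_counter()
        a.sort()
        t2 = time.perf_counter()
        print(n, round(t1 - t0, 4), round(t2 - t1, 4))  # build time, sort time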

LFU cache implementation in python

I have implemented an LFU cache in Python with the help of the priority queue implementation given at
https://docs.python.org/2/library/heapq.html#priority-queue-implementation-notes
I have given the code at the end of the post.
But I feel the code has some serious problems:
1. To give a scenario, suppose only one page is continuously getting visited (say 50 times). This code will always mark the already-added node as "removed" and push it onto the heap again, so it basically ends up with 50 different nodes for the same page, increasing the heap size enormously.
2. This question is almost the same as Q1 of the telephonic interview at
http://www.geeksforgeeks.org/flipkart-interview-set-2-sde-2/
and the person mentioned that a doubly linked list can give better efficiency than a heap. Can anyone explain to me how?
from llist import dllist
import sys
from heapq import heappush, heappop


class LFUCache:
    heap = []
    cache_map = {}
    REMOVED = "<removed-task>"

    def __init__(self, cache_size):
        self.cache_size = cache_size

    def get_page_content(self, page_no):
        if self.cache_map.has_key(page_no):
            self.update_frequency_of_page_in_cache(page_no)
        else:
            self.add_page_in_cache(page_no)
        return self.cache_map[page_no][2]

    def add_page_in_cache(self, page_no):
        if len(self.cache_map) == self.cache_size:
            self.delete_page_from_cache()
        heap_node = [1, page_no, "content of page " + str(page_no)]
        heappush(self.heap, heap_node)
        self.cache_map[page_no] = heap_node

    def delete_page_from_cache(self):
        while self.heap:
            count, page_no, page_content = heappop(self.heap)
            if page_content is not self.REMOVED:
                del self.cache_map[page_no]
                return

    def update_frequency_of_page_in_cache(self, page_no):
        heap_node = self.cache_map[page_no]
        heap_node[2] = self.REMOVED
        count = heap_node[0]
        heap_node = [count + 1, page_no, "content of page " + str(page_no)]
        heappush(self.heap, heap_node)
        self.cache_map[page_no] = heap_node


def main():
    cache_size = int(raw_input("Enter cache size "))
    cache = LFUCache(cache_size)
    while 1:
        page_no = int(raw_input("Enter page no needed "))
        print cache.get_page_content(page_no)
        print cache.heap, cache.cache_map, "\n"


if __name__ == "__main__":
    main()
Efficiency is a tricky thing. In real-world applications, it's often a good idea to use the simplest and easiest algorithm, and only start to optimize when that's measurably slow. And then you optimize by doing profiling to figure out where the code is slow.
If you are using CPython, it gets especially tricky, as even an inefficient algorithm implemented in C can beat an efficient algorithm implemented in Python due to the large constant factors; e.g. a double-linked list implemented in Python tends to be a lot slower than simply using the normal Python list, even for cases where in theory it should be faster.
Simple algorithm:
For an LFU, the simplest algorithm is to use a dictionary that maps keys to (item, frequency) objects, and update the frequency on each access. This makes access very fast (O(1)), but pruning the cache is slower as you need to sort by frequency to cut off the least-used elements. For certain usage characteristics, this is actually faster than other "smarter" solutions, though.
You can optimize for this pattern by not simply pruning your LFU cache to the maximum length, but to prune it to, say, 50% of the maximum length when it grows too large. That means your prune operation is called infrequently, so it can be inefficient compared to the read operation.
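A minimal sketch of that simple approach (all names here are mine, not from the question's code), pruning down to half the limit so the expensive sort runs rarely:

    class SimpleLFU:
        def __init__(self, max_size):
            self.max_size = max_size
            self.items = {}                       # key -> [value, frequency]

        def get(self, key):
            entry = self.items.get(key)
            if entry is None:
                return None
            entry[1] += 1                         # O(1) access + frequency bump
            return entry[0]

        def put(self, key, value):
            if key in self.items:
                self.items[key][0] = value
                return
            self.items[key] = [value, 0]
            if len(self.items) > self.max_size:
                self.prune()

        def prune(self):
            # O(n log n), but called infrequently: keep the most-used half.
            by_freq = sorted(self.items.items(),
                             key=lambda kv: kv[1][1], reverse=True)
            self.items = dict(by_freq[: self.max_size // 2])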
Using a heap:
In (1), you used a heap because that's an efficient way of storing a priority queue. But you are not implementing a priority queue. The resulting algorithm is optimized for pruning, but not access: You can easily find the n smallest elements, but it's not quite as obvious how to update the priority of an existing element. In theory, you'd have to rebalance the heap after every access, which is highly inefficient.
To avoid that, you added a trick by keeping elements around even after they are deleted. But this trades space for time.
If you don't want to make that trade, you could update the frequencies in-place and simply rebalance the heap before pruning the cache. You regain fast access times at the expense of slower pruning, like the simple algorithm above. (I doubt there is any speed difference between the two, but I have not measured this.)
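That variant could look roughly like this (a sketch assuming heap entries are mutable [frequency, key] lists whose frequencies are bumped in place on access; heapq.heapify restores heap order in O(n) right before eviction):

    import heapq

    def prune(heap, cache_map, n_to_evict):
        heapq.heapify(heap)              # frequencies were changed in place
        for _ in range(n_to_evict):
            freq, key = heapq.heappop(heap)
            del cache_map[key]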
Using a double-linked list:
The double-linked list mentioned in (2) takes advantage of the nature of the possible changes here: An element is either added as the lowest priority (0 accesses), or an existing element's priority is incremented exactly by 1. You can use these attributes to your advantage if you design your data structures like this:
You have a double-linked list of elements which is ordered by the frequency of the elements. In addition, you have a dictionary that maps items to elements within that list.
Accessing an element then means:
Either it's not in the dictionary, that is, it's a new item, in which case you can simply append it to the end of the double-linked list (O(1))
or it's in the dictionary, in which case you increment the frequency in the element and move it leftwards through the double-linked list until the list is ordered again (O(n) worst-case, but usually closer to O(1)).
To prune the cache, you simply cut off n elements from the end of the list (O(n)).
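To make the structure concrete, here is a minimal sketch of that design (class and method names are mine, not a reference implementation): a dictionary from key to node, plus a doubly-linked list kept sorted by frequency, with the most frequent items towards the head and new items appended at the tail.

    class _Node:
        __slots__ = ("key", "value", "freq", "prev", "next")

        def __init__(self, key, value):
            self.key, self.value, self.freq = key, value, 0
            self.prev = self.next = None


    class DllLFUCache:
        def __init__(self, capacity):
            self.capacity = capacity
            self.nodes = {}                              # key -> _Node
            self.head = _Node(None, None)                # most frequent side
            self.tail = _Node(None, None)                # least frequent side
            self.head.next, self.tail.prev = self.tail, self.head

        def _unlink(self, node):
            node.prev.next, node.next.prev = node.next, node.prev

        def _insert_after(self, node, anchor):
            node.prev, node.next = anchor, anchor.next
            anchor.next.prev = node
            anchor.next = node

        def _promote(self, node):
            # Move the node towards the head until the list is ordered again;
            # since frequencies change by exactly 1, this is usually one hop.
            anchor = node.prev
            self._unlink(node)
            while anchor is not self.head and anchor.freq < node.freq:
                anchor = anchor.prev
            self._insert_after(node, anchor)

        def get(self, key):
            node = self.nodes.get(key)
            if node is None:
                return None
            node.freq += 1
            self._promote(node)
            return node.value

        def put(self, key, value):
            if key in self.nodes:
                self.nodes[key].value = value
                return
            if len(self.nodes) >= self.capacity:
                victim = self.tail.prev                  # least frequently used
                self._unlink(victim)
                del self.nodes[victim.key]
            node = _Node(key, value)
            self.nodes[key] = node
            self._insert_after(node, self.tail.prev)     # new items go at the end

    # Example: with capacity 2, accessing 'a' protects it, so adding 'c' evicts 'b'.
    # c = DllLFUCache(2); c.put("a", 1); c.put("b", 2); c.get("a"); c.put("c", 3)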

Categories

Resources