Suppose I have the code below and my task is to find the recurrence T(n) and its worst-case runtime. Here n is the length of the list.
In this case, we have 3 recursive calls: mystery(mylist[:len(mylist)-t-1]), mystery(mylist[t:len(mylist)-1]), and mystery(mylist[:len(mylist)-t-1]).
def mystery(mylist):
    if len(mylist) <= 1:
        return
    if len(mylist) >= 3:
        t = len(mylist) // 3
        mystery(mylist[:len(mylist)-t-1])
        mystery(mylist[t:len(mylist)-1])
        mystery(mylist[:len(mylist)-t-1])
For the recursive case, my observation is that because the recursive calls appear together, the recurrence is:
T(n) = T(floor(2n/3)) + T(floor(n/3)) + T(floor(2n/3)) = 2T(floor(2n/3)) + T(floor(n/3))
Now here is the hard part: figuring out f(n). I expanded the recurrence T(n) and just got more and more T terms. How would I be able to figure out f(n)?
For the base cases, T(0) and T(1) are 1 because of the first if-statement, and T(2) = 0 because neither if-branch applies when n = 2.
Are my assessments correct?
Thank you!
You are right about the base cases. You could even group T(2) in with them: it's still O(1), since you at least evaluate the two conditional statements, and practically speaking there are no O(0) function calls.
The f(n) term in your recurrence is just an expression of all the work you do in the recursive case outside of generating recursive calls. Here you have the O(1) t = len(mylist) // 3 statement and the O(1) cost of evaluating the two conditional statements: O(1) work in total. However, you also have the O(n) cost of slicing your list into three parts to pass into the recursive calls. This gives f(n) = O(n) + O(1) = O(n). From this, we can express the overall recurrence as:
T(n) = 2T(2n/3) + T(n/3) + O(n)  if n >= 3
T(n) = 1                         otherwise
However, the Master Theorem doesn't apply to this case because you have recursive calls which work on different sub-problem sizes: you can't isolate a single a or b value to apply the Master Theorem. For a recurrence like this, you could apply the generalization of the Master Theorem known as the Akra-Bazzi Method, with the parameters being:
a1=2, a2=1
b1=2/3, b2=1/3
g(n) = n
h1(n) = h2(n) = 0
Following the method, solve 2(2/3)^p + (1/3)^p = 1 for p (here p = 2 works exactly, since 2(4/9) + 1/9 = 1), then evaluate the integral

T(n) = Θ( n^p * (1 + integral from 1 to n of g(u)/u^(p+1) du) )

with g(u) = u (as g(n) = n) to determine the complexity class.
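As a quick sanity check (a sketch in plain Python using bisection, not part of the original method), you can solve the Akra-Bazzi equation for p numerically; it converges to p = 2:

```python
def akra_bazzi_p(f, lo=0.0, hi=10.0, tol=1e-12):
    """Bisection solve for the root of f, assuming f is strictly decreasing."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# f(p) = 2*(2/3)^p + (1/3)^p - 1 is strictly decreasing in p
p = akra_bazzi_p(lambda p: 2 * (2/3)**p + (1/3)**p - 1)
print(round(p, 6))  # -> 2.0
```

With p = 2, the integrand g(u)/u^(p+1) = 1/u^2 integrates to 1 - 1/n = Θ(1) over [1, n], so the method yields T(n) = Θ(n^2).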
If you don't need the exact complexity class, but only want to derive a simpler upper bound with the Master Theorem, you could upper-bound your recurrence relation for the run-time using the fact that 3T(2n/3) >= 2T(2n/3) + T(n/3):
T(n) <= 3T(2n/3) + O(n)  if n >= 3
T(n) = 1                 otherwise
Then, you can solve this upper bound on the time-complexity with the Master Theorem, with a=3, b=3/2, and f(n)= n^c = n^1 to derive a Big-O (rather than Big-Theta) complexity class.
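For concreteness (a quick check, not part of the original answer), the critical exponent of that upper bound works out to log(3)/log(3/2) ≈ 2.71, so case 1 of the Master Theorem (c = 1 < 2.71) gives T(n) = O(n^2.71) — a looser bound than the Θ(n^2) from Akra-Bazzi:

```python
import math

# Upper-bound recurrence T(n) <= 3T(2n/3) + O(n): a = 3, b = 3/2
a, b = 3, 3 / 2
critical_exponent = math.log(a) / math.log(b)
print(round(critical_exponent, 4))  # -> 2.7095
```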
Related
Can you tell me the time complexity of this code? I am using the divide-and-conquer technique.
def max_of_list(l):
    if len(l) == 1:
        return l[0]
    else:
        left_max = max_of_list(l[:len(l)//2])
        right_max = max_of_list(l[len(l)//2:])
        return max(left_max, right_max)
You'll need to use the master theorem since this is a recursive algorithm:
T(n) = a T(n/b) + f(n)
a: number of subproblems
b: size reduction of subproblems
f(n): complexity of split/join of subproblems process
This algorithm is recursion-heavy, since f(n), the cost of the split/join process, is O(1) - at least if you ignore the O(n) cost of Python's list slicing, which would otherwise make f(n) = O(n) and the overall result O(n log n). Under that assumption, the complexity of the algorithm is O(n^c), where c is the critical exponent, given by:
c = log(a) / log(b)
In this particular case:
c = log(2)/log(2) = 1
Thus, the complexity of the algorithm is linear, i.e. O(n).
You can read more about the Master theorem
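Worth noting: the slices l[:len(l)//2] and l[len(l)//2:] each copy their half of the list. If you want the split to genuinely cost O(1), one common trick (a sketch, not from the original answer) is to pass index bounds instead of slices:

```python
def max_of_list(l, lo=0, hi=None):
    """Divide-and-conquer max without slicing: O(1) split, O(n) total."""
    if hi is None:
        hi = len(l)
    if hi - lo == 1:
        return l[lo]
    mid = (lo + hi) // 2
    # Recurse on [lo, mid) and [mid, hi) without copying the list
    return max(max_of_list(l, lo, mid), max_of_list(l, mid, hi))

print(max_of_list([3, 1, 4, 1, 5, 9, 2, 6]))  # -> 9
```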
I understand O(log n) in the sense that it increases quickly at first, but with larger inputs the rate of increase slows down.
I am not able to completely understand O(n log n), or the difference between an algorithm with complexity n log n and one with complexity n + log n.
A modification of the phone book example and/or some basic Python code would help me understand the two.
How do you think of O(n ^ 2)?
Personally, I like to think of it as doing O(n) work O(n) times.
A contrived O(n ^ 2) algorithm would be to iterate through all pairs of numbers in 0, 1, ..., n - 1
def print_pairs(n):
    for i in range(n):
        for j in range(i + 1, n):
            print('({},{})'.format(i, j))
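To make the quadratic cost concrete (a small check, with the pair count verified against itertools), the inner loop body runs exactly n(n-1)/2 times in total:

```python
from itertools import combinations

def count_pairs(n):
    """Count iterations of the nested loop in print_pairs."""
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            count += 1
    return count

n = 10
print(count_pairs(n))                        # -> 45
print(n * (n - 1) // 2)                      # -> 45
print(len(list(combinations(range(n), 2))))  # -> 45
```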
Using similar logic as above, you could do O(log n) work O(n) times and have a time complexity of O(n log n).
As an example, we are going to use binary search to find all indices of elements in an array.
Yes, I understand this is a dumb example but here I don't want to focus on the usefulness of the algorithm but rather the complexity. For the sake of the correctness of our algorithm let us assume that the input array is sorted. Otherwise, our binary search does not work as intended and could possibly run indefinitely.
def find_indices(arr):
    indices = []
    for num in arr:
        index = binary_search(arr, 0, len(arr) - 1, num)
        indices.append(index)
    return indices
def binary_search(arr, l, r, x):
    # Check base case
    if r >= l:
        mid = l + (r - l) // 2
        # If element is present at the middle itself
        if arr[mid] == x:
            return mid
        # If element is smaller than mid, then it
        # can only be present in left subarray
        elif arr[mid] > x:
            return binary_search(arr, l, mid - 1, x)
        # Else the element can only be present
        # in right subarray
        else:
            return binary_search(arr, mid + 1, r, x)
    else:
        # Element is not present in the array
        return -1
As for your second question: surely log n << n as n tends to infinity, so
O(n + log n) = O(n)
In theory, the log n is dwarfed by the n as we get arbitrarily large so we don't include it in our Big O analysis.
In practice, by contrast, you might want to consider this extra log n work if your algorithm is suffering performance and/or scaling issues.
log n is a much slower growing function than n. When computer scientists speak of big-O, they are interested in the growth of the function for extremely large input values. What the function does near some small number or inflection point is immaterial.
Many common algorithms have time complexity of n log n. For example, merge sort requires n steps to be taken log_2(n) times as the input data is split in half. After studying the algorithm, the fact that its complexity is n log n may come to you by intuition, but you could arrive at the same conclusion by studying the recurrence relation that describes the (recursive) algorithm--in this case T(n) = 2 * T(n / 2) + n. More generally but perhaps least intuitively, the master theorem can be applied to arrive at this n log n expression. In short, don't feel intimidated if it isn't immediately obvious why certain algorithms have certain running times--there are many ways you can take to approach the analysis.
Regarding "complexity n + log n", this isn't how big-O notation tends to get used. You may have an algorithm that does n + log n work, but instead of calling that O(n + log n), we'd call that O(n) because n grows so much faster than log n that the log n term is negligible. The point of big-O is to state only the growth rate of the fastest growing term.
Compared with n log n, a log n algorithm is less complex. If log n is the time complexity of inserting an item into a self-balancing search tree, n log n would be the complexity of inserting n items into such a structure.
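Python's standard library doesn't ship a self-balancing search tree, but heapq shows the same n × log n shape (an analogy, not the original tree example): n pushes into a binary heap, each costing O(log n), is O(n log n) overall — which is essentially heapsort:

```python
import heapq

def heap_sort(items):
    """n pushes + n pops, each O(log n): O(n log n) overall."""
    heap = []
    for x in items:              # n iterations...
        heapq.heappush(heap, x)  # ...each O(log n)
    return [heapq.heappop(heap) for _ in range(len(heap))]

print(heap_sort([5, 1, 4, 2, 3]))  # -> [1, 2, 3, 4, 5]
```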
The book Grokking Algorithms is awesome and explains algorithm complexity analysis (among other things) exhaustively and in very simple language.
Technically, algorithms with complexity O(n + log n) and complexity O(n) are the same, as the log n term becomes negligible when n grows.
O(n) grows linearly. The slope is constant.
O(n log n) grows super-linearly. The slope increases (slowly).
Does the following algorithm have a complexity of O(nlogn)?
The thing that confuses me is that this algorithm divides twice, not once as a regular O(nlogn) algorithm, and each time it does O(n) work.
def equivalent(a, b):
    if isEqual(a, b):
        return True
    half = int(len(a) / 2)
    if 2 * half != len(a):
        return False
    if equivalent(a[:half], b[:half]) and equivalent(a[half:], b[half:]):
        return True
    if equivalent(a[:half], b[half:]) and equivalent(a[half:], b[:half]):
        return True
    return False
Each of the 4 recursive calls to equivalent reduces the amount of input data by a factor of 2. Thus, assuming that a and b have the same length and that isEqual has linear time complexity, we can construct the recurrence relation for the overall complexity:

T(n) = 4T(n/2) + Cn

where C is some constant. We can solve this relation by repeatedly substituting and spotting a pattern:

T(n) = 4T(n/2) + Cn
     = 16T(n/4) + Cn(1 + 2)
     = 64T(n/8) + Cn(1 + 2 + 4)
     = ...
     = 4^m T(n/2^m) + Cn(2^0 + 2^1 + ... + 2^(m-1))
     = 4^m T(n/2^m) + Cn(2^m - 1)

What is the upper limit of the summation, m? The stopping condition occurs when len(a) is odd. That may be anywhere between N and 1, depending on the prime decomposition of N. In the worst-case scenario, N is a power of 2, so the function recurses until len(a) = 1, i.e. m = log2(N), which gives

T(N) = 4^(log2 N) T(1) + CN(N - 1) = O(N^2)
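You can sanity-check that this recurrence really is quadratic numerically (a sketch taking C = 1 and T(1) = 1 for concreteness; these constants are my assumption, not from the original answer):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """Recurrence T(n) = 4T(n/2) + n with T(1) = 1, for n a power of 2."""
    if n == 1:
        return 1
    return 4 * T(n // 2) + n

# For n = 2^m the recurrence unrolls to exactly n^2 + n(n - 1)
for n in [2, 64, 1024]:
    assert T(n) == n**2 + n * (n - 1)
print(T(1024))  # -> 2096128
```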
To enhance the above answer: there is a direct way to calculate this with the Master Method. The master method works only for the following type of recurrence:
T(n) = aT(n/b) + f(n) where a >= 1 and b > 1
We have three cases based on f(n), with the corresponding solutions:
If f(n) = Θ(n^c) where c < log_b(a), then T(n) = Θ(n^(log_b a))
If f(n) = Θ(n^c) where c = log_b(a), then T(n) = Θ(n^c * log n)
If f(n) = Θ(n^c) where c > log_b(a), then T(n) = Θ(f(n)) = Θ(n^c)
In your case,
we have a = 4, b = 2, c = 1, and c < log_b(a),
i.e. 1 < log_2(4) = 2.
Hence case 1 applies. Therefore:
T(n) = Θ(n^(log_b a)) = Θ(n^(log_2 4)) = Θ(n^2)
More details with examples can be found on Wikipedia.
Hope it helps!
def func(n):
    if n == 1:
        return 1
    return func(n-1) + n*(n-1)

print(func(5))
Getting confused. Not sure what exactly it is. Is it O(n)?
Calculating n*(n-1) is a fixed-time operation. The interesting part of the function is calling func(n-1) until n is 1. The function will make n such calls, so its complexity is O(n).
If we assume that arithmetic operations are constant time operations (and they really are when numbers are relatively small) then time complexity is O(n):
T(n) = T(n-1) + C = T(n-2) + C + C = ... = n * C = O(n)
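As a concrete check (a sketch; the closed form below is my derivation from summing k(k-1) for k = 2..n, not from the original answer), the recursion unrolls to 1 + n(n^2 - 1)/3, and an explicit O(n) loop reproduces it:

```python
def func(n):
    if n == 1:
        return 1
    return func(n - 1) + n * (n - 1)

def func_iterative(n):
    """Same computation as func, written as an explicit O(n) loop."""
    total = 1
    for k in range(2, n + 1):
        total += k * (k - 1)
    return total

# Closed form: 1 + sum of k(k-1) for k = 2..n = 1 + n(n^2 - 1)/3
for n in [1, 5, 20]:
    assert func(n) == func_iterative(n) == 1 + n * (n * n - 1) // 3
print(func(5))  # -> 41
```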
But in practice the complexity of multiplication depends on the underlying type (and we are talking about Python, where the type depends on the value), and for big integers it grows with n as n approaches infinity. Thus, strictly speaking, the complexity is:
T(n) = O(n * multComplexity(n))
And this multComplexity(n) depends on a specific algorithm that is used for multiplication of huge numbers.
As described in other answers, the answer is close to O(n) for practical purposes. For a more precise analysis, if you don't want to make the approximation that multiplication is constant-time:
Calculating n*(n-1) takes O(log n * log n) (or O((log n)^1.585), depending on the algorithm Python uses, which depends on the size of the integer). See here - note that we need to take the log because the complexity is relative to the number of digits.
Adding the two terms takes O(log n), so we can ignore that.
The multiplication gets done O(n) times, so the total is O(n * log n * log n). (It might be possible to get this bound tighter, but it's certainly larger than O(n) - see the WolframAlpha plot).
In practice, the log terms won't really matter unless n gets very large.
I've had a look through previous posts and I'm still struggling to find the T(n) and big O of these two recursive algorithms. Each one takes a sequence of numbers as its argument, sums all numbers in the list (except for the last item), then adds that sum to the last item. Could anyone please shed some light?
def sum(numberSequence):
    assert (len(numberSequence) > 0)
    if (len(numberSequence) == 1):
        return numberSequence[0]
    else:
        return sum(numberSequence[-1]) + numberSequence[:-1]
(I believe the big O is O(n) since, in the worst case, the function is called n-1 times, but I'm not sure what happens when it is only summing part of the list. I have T(n) = n * (n-1) + n = O(n); it just doesn't seem right.)
def binarySum(numberSequence):
    assert (len(numberSequence) > 0)
    breakPoint = int(len(numberSequence)/2)
    if (len(numberSequence) == 1):
        return numberSequence[0]
    else:
        return binarySum(numberSequence[:breakPoint]) + binarySum(numberSequence[breakPoint:])
I'm more lost on this one. I think the big O is O(log2 n), since it divides like binary search, but the whole list isn't being discarded at each step, only part of it.
Any help would be appreciated.
You're summing a list of N numbers of any size, in any order.
You aren't going to find a clever way to do that faster without some constraints.
It's Ω(N) always (the lower bound is N addition operations - you won't get any better than that).
As a commenter below noted, your algorithm may in fact be worse - it just can't be better.
Edited: corrections made based on comments regarding O(n) performance of [::].
TL;DR: It could be O(n), but your version is O(n²).
Remember that all of the big-O notations assume "times a constant". That is, O(n) really means O(k * n), and O(log n) really means O(k * log n).
Let's look at your first example:
def sum(numberSequence):
    assert (len(numberSequence) > 0)
    if (len(numberSequence) == 1):
        return numberSequence[0]
    else:
        return sum(numberSequence[-1]) + numberSequence[:-1]
The first line is assert plus compare plus len. The len operation is a constant time for lists and tuples (But it might not be with some other data structure! Beware!), compare is a constant time, and the assert is effectively a constant time, because if it ever fails the whole thing blows up and we stop computing. So let's just call assert a function call plus a comparison plus a return.
Now, how many times does this function get called? Well, the termination condition obviously represents one time, and every other time it's recursing on a list that is one shorter than the previous list. So the function will be called len(numberSequence) times, which is n for our purposes.
So we have
1 * call (for the user calling us)
+ n * assert
+ n * len
+ n * compare
Next, we have the if statement that marks the termination condition for your recursion. Obviously, this statement will only be successful once (it's the termination condition, right? Only happens at the end...) so that's a comparison each time, and once per sum it's a return of a constant index.
n * compare
+ 1 * constant index
+ 1 * return
Finally, there is the else: branch. I'm pretty sure you have a bug, and it should really be this (note position of colon):
return sum(numberSequence[:-1]) + numberSequence[-1]
In that case you return the sum of a constant negative index lookup and a recursive function call of a slice. You only do this when it's NOT the end of the recursion, so n-1 times.
(n - 1) * constant negative index lookup
+ (n - 1) * slice
+ (n - 1) * recursive call
+ (n - 1) * return
But wait! If you look around for people asking about how to make a copy of a list, you'll find that one common Python idiom is copy = orig[:]. The reason for this is that a slice operation makes a copy of the subrange of the list it is slicing. So when you say numberSequence[:-1] what you're really saying is copy = [orig[i] for i in range(0, len(orig)-1)].
This means that the slice operation is O(n), but on the plus side it's written in C. So the constant is a much smaller one.
Let's add those up:
1 * call
+ n * assert
+ n * len
+ n * compare
+ n * compare
+ 1 * constant index
+ 1 * return
+ (n - 1) * constant negative index lookup
+ (n - 1) * (c * n) slice
+ (n - 1) * recursive call
+ (n - 1) * return
If we assume that constant index and constant negative index take the same time, we can merge them. We can obviously merge the returns and the calls. Which leaves us with:
n * call
+ n * assert
+ n * len
+ n * compare
+ n * compare
+ n * constant (maybe negative) index
+ n * return
+ (n - 1) * (c * n) slice
Now according to "the rules," this is O(n²). Which means that all the details of O(n) behavior fall by the wayside in favor of that big, fat O(n²).
However:
If the len operation were not O(1) - that is, constant time - then the function might well become O(n²) because of that.
If the index operations were not O(1), because of underlying implementation details, the function might become O(n²) or O(n log n) because of that.
So you have implemented an algorithm that could be O(n) using a Python operator that is inherently O(n) itself. Your implementation is "inherently" O(n²). But it can be fixed. Even if fixed, things outside of your control could make your code slower. (But, that's outside your control, so ... ignore it!)
How can we fix your code to make it O(n)? By getting rid of the slice! You don't need that anyway, right? You just need to track the range.
def sum(numberSequence, start=0, end=None):
    assert (len(numberSequence) > 0)
    if end is None:
        end = len(numberSequence) - 1
    if end == start:
        return numberSequence[start]
    else:
        return sum(numberSequence, start, end-1) + numberSequence[end]
In this code, I'm doing pretty much the same thing that you did, with two differences. First, I've added a special case to handle being called by an end user with only the sequence as an argument. And second, of course, there is no slice. With that out of the way, the code is no longer inherently O(n²).
You can do the same math, and make the same changes, to your other example, but it's more complex. However, I will remind you that the sum of 2^i for i = 0..n-1 is 2^n - 1. As @lollercoaster points out, there ain't no such thing as a free lunch: you have to add up all the numbers.
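For the binarySum version, the same slice-free trick looks roughly like this (a sketch under the assumption that you just want the sum of the whole sequence; the index-bound parameters are my addition):

```python
def binarySum(numberSequence, start=0, end=None):
    """Divide-and-conquer sum using index bounds instead of slices."""
    if end is None:
        end = len(numberSequence)
    assert end > start
    if end - start == 1:
        return numberSequence[start]
    mid = (start + end) // 2
    # Recurse on [start, mid) and [mid, end) without copying the list
    return binarySum(numberSequence, start, mid) + binarySum(numberSequence, mid, end)

print(binarySum([1, 2, 3, 4, 5]))  # -> 15
```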
Technically I think the actual runtimes of your algorithms may both be worse than O(n). The slicing operation is O(length_of_slice), since it copies the relevant portion of the list. That said, since that happens in C under the hood, you may not notice the performance.
I'm torn on whether to count that fact in the runtime of your own algorithm, since if you implemented this e.g. in C with pointer arithmetic rather than Python with slicing, these would both be O(n).
Two side notes:
In your sum function, you slice the wrong sequence (should be return sum(numberSequence[:-1]) + numberSequence[-1]).
In practice, you should just use the sum builtin rather than rolling your own like this.