What's the time complexity of the following Python function?

def func(n):
    if n == 1:
        return 1
    return func(n-1) + n*(n-1)
print(func(5))
Getting confused. Not sure what exactly it is. Is it O(n)?

Calculating n*(n-1) is a fixed-time operation. The interesting part of the function is calling func(n-1) until n is 1. The function will make n such calls, so its complexity is O(n).
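One way to check the call count empirically is to instrument the function. The sketch below is written for Python 3; `func_counting`, `count_calls` and the counter argument are hypothetical names added for illustration, not part of the original code.

```python
def func_counting(n, counter):
    """Same recursion as func, but tallies each call in counter[0]."""
    counter[0] += 1
    if n == 1:
        return 1
    return func_counting(n - 1, counter) + n * (n - 1)


def count_calls(n):
    """Return (func(n), number of calls made while computing it)."""
    counter = [0]
    value = func_counting(n, counter)
    return value, counter[0]


print(count_calls(5))    # (41, 5)
print(count_calls(100))  # second element is 100: one call per value of n
```

The call count grows in step with n, which is what the O(n) estimate predicts when each call does a constant amount of work.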

If we assume that arithmetic operations are constant time operations (and they really are when numbers are relatively small) then time complexity is O(n):
T(n) = T(n-1) + C = T(n-2) + C + C = ... = n * C = O(n)
But the multiplication complexity in practice depends on the underlying type (and we are talking about Python, where the type depends on the value). It grows with n as n approaches infinity. Thus, strictly speaking, the complexity is equal to:
T(n) = O(n * multComplexity(n))
And this multComplexity(n) depends on a specific algorithm that is used for multiplication of huge numbers.

As described in other answers, the answer is close to O(n) for practical purposes. For a more precise analysis, if you don't want to make the approximation that multiplication is constant-time:
Calculating n*(n-1) takes O(log n * log n) (or O((log n)^1.58), depending on the algorithm Python uses, which depends on the size of the integer). Note that we need to take the log because the complexity is relative to the number of digits.
Adding the two terms takes O(log n), so we can ignore that.
The multiplication gets done O(n) times, so the total is O(n * log n * log n). (It might be possible to get this bound tighter, but it's certainly larger than O(n) - see the WolframAlpha plot).
In practice, the log terms won't really matter unless n gets very large.
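To see why the cost is measured in digits rather than in n itself, one can look at the bit lengths of the operands. This is a small Python 3 sketch; `operand_bits` is a hypothetical helper, and it only reports sizes, it does not time the multiplication.

```python
def operand_bits(n):
    """Return the bit lengths of n and of the product n * (n - 1)."""
    return n.bit_length(), (n * (n - 1)).bit_length()


for n in (10, 10**6, 2**100):
    n_bits, product_bits = operand_bits(n)
    print(n_bits, product_bits)
```

The product needs roughly twice as many bits as n, so the work per multiplication depends on log n, not on n.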

Related

Time complexity of python function

I am trying to solve the time complexity of this function (I'm still new to solving complexity problems) and was wondering what the time complexity of this function would be:
def mystery(lis):
    n = len(lis)
    for index in range(n):
        x = 2*index % n
        lis[index],lis[x] = lis[x],lis[index]
    print(lis)
I believe the answer is O(n) but I am not 100% sure as the line: x = 2*index % n is making me wonder if it is maybe O(n log n).
Multiplying two operands with * is usually considered a constant-time operation in time complexity analysis. The same goes for %.
The fact that n is one of the operands doesn't make the operation O(n), because n is a single number. To make it O(n) you need to perform an operation n times.
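A quick way to convince yourself is to count how often the loop body runs. The following Python 3 sketch adds a step counter to the function from the question; `mystery_with_count` is a hypothetical name, and the counter treats each iteration as one constant-time step.

```python
def mystery_with_count(lis):
    """Same swaps as mystery, but returns how many times the body ran."""
    n = len(lis)
    steps = 0
    for index in range(n):
        x = 2 * index % n          # one multiplication and one modulo
        lis[index], lis[x] = lis[x], lis[index]
        steps += 1
    return steps


for size in (10, 100, 1000):
    print(size, mystery_with_count(list(range(size))))
```

The step count equals the length of the list, so under the constant-time assumption the function is O(n).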

How do I identify O(nlogn) exactly?

I understand O(log n) in the sense that it increases quickly at first, but with larger inputs the rate of increase slows.
I am not able to completely understand:
1. O(n log n)
2. the difference between an algorithm with complexity n log n and one with complexity n + log n.
I could use a modification of the phone book example and/or some basic Python code to understand these two points.
How do you think of O(n ^ 2)?
Personally, I like to think of it as doing O(n) work O(n) times.
A contrived O(n ^ 2) algorithm would be to iterate through all pairs of numbers in 0, 1, ..., n - 1
def print_pairs(n):
    for i in range(n):
        for j in range(i + 1, n):
            print('({},{})'.format(i, j))
Using similar logic as above, you could do O(log n) work O(n) times and have a time complexity of O(n log n).
As an example, we are going to use binary search to find all indices of elements in an array.
Yes, I understand this is a dumb example but here I don't want to focus on the usefulness of the algorithm but rather the complexity. For the sake of the correctness of our algorithm let us assume that the input array is sorted. Otherwise, our binary search does not work as intended and could possibly run indefinitely.
def find_indices(arr):
    indices = []
    for num in arr:
        # r is the last valid index, so the search never reads past the end
        index = binary_search(arr, 0, len(arr) - 1, num)
        indices.append(index)
    return indices

def binary_search(arr, l, r, x):
    # Check base case
    if r >= l:
        # Integer division keeps mid usable as a list index in Python 3
        mid = l + (r - l) // 2
        # If element is present at the middle itself
        if arr[mid] == x:
            return mid
        # If element is smaller than mid, then it
        # can only be present in left subarray
        elif arr[mid] > x:
            return binary_search(arr, l, mid-1, x)
        # Else the element can only be present
        # in right subarray
        else:
            return binary_search(arr, mid + 1, r, x)
    else:
        # Element is not present in the array
        return -1
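To make the "O(log n) work, O(n) times" claim concrete, here is a sketch that counts how many elements each search inspects. It is an iterative variant written for Python 3, not the recursive code above; `binary_search_probes` is a hypothetical name.

```python
def binary_search_probes(arr, x):
    """Iterative binary search on a sorted list.

    Returns (index, probes), where index is -1 if x is absent and
    probes is the number of elements inspected.
    """
    lo, hi = 0, len(arr) - 1
    probes = 0
    while lo <= hi:
        mid = lo + (hi - lo) // 2
        probes += 1
        if arr[mid] == x:
            return mid, probes
        if arr[mid] > x:
            hi = mid - 1
        else:
            lo = mid + 1
    return -1, probes


arr = list(range(1024))
worst = max(binary_search_probes(arr, x)[1] for x in arr)
total = sum(binary_search_probes(arr, x)[1] for x in arr)
print(worst, total)
```

With 1024 elements, no single search should need more than about log2(1024) + 1 probes, so the total over all n searches is bounded by roughly n log n.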
As for your second question,
surely, log n << n as n tends to infinity so
O(n + log n) = O(n)
In theory, the log n is dwarfed by the n as we get arbitrarily large so we don't include it in our Big O analysis.
In practice, by contrast, you might want to consider this extra log n work if your algorithm is suffering from performance and/or scaling issues.
log n is a much slower growing function than n. When computer scientists speak of big-O, they are interested in the growth of the function for extremely large input values. What the function does near some small number or inflection point is immaterial.
Many common algorithms have time complexity of n log n. For example, merge sort requires n steps to be taken log_2(n) times as the input data is split in half. After studying the algorithm, the fact that its complexity is n log n may come to you by intuition, but you could arrive at the same conclusion by studying the recurrence relation that describes the (recursive) algorithm--in this case T(n) = 2 * T(n / 2) + n. More generally but perhaps least intuitively, the master theorem can be applied to arrive at this n log n expression. In short, don't feel intimidated if it isn't immediately obvious why certain algorithms have certain running times--there are many ways you can take to approach the analysis.
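The recurrence can also be evaluated directly to see the n log n pattern. This Python 3 sketch assumes T(1) = 0 and restricts n to powers of two so that n / 2 is always exact; `merge_sort_cost` is a hypothetical name and models only the recurrence, not an actual sort.

```python
import math


def merge_sort_cost(n):
    """Evaluate T(n) = 2 * T(n / 2) + n with T(1) = 0, for n a power of two."""
    if n == 1:
        return 0
    return 2 * merge_sort_cost(n // 2) + n


for n in (8, 1024, 2**20):
    # Print T(n) next to n * log2(n) for comparison
    print(n, merge_sort_cost(n), n * int(math.log2(n)))
```

Under these assumptions the two printed columns agree, which matches the closed form T(n) = n log2(n).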
Regarding "complexity n + log n", this isn't how big-O notation tends to get used. You may have an algorithm that does n + log n work, but instead of calling that O(n + log n), we'd call that O(n) because n grows so much faster than log n that the log n term is negligible. The point of big-O is to state only the growth rate of the fastest growing term.
Compared with n log n, a log n algorithm is less complex. If log n is the time complexity of inserting an item into a self-balancing search tree, n log n would be the complexity of inserting n items into such a structure.
The book Grokking Algorithms is an excellent resource that explains how to determine algorithm complexity (among other things) thoroughly and in very simple language.
Technically, algorithms with complexity O(n + log n) and complexity O(n) are the same, as the log n term becomes negligible when n grows.
O(n) grows linearly. The slope is constant.
O(n log n) grows super-linearly. The slope increases (slowly).
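A few numbers help to show the difference between n + log n and n log n. The sketch below is Python 3 and uses base-2 logarithms as an arbitrary choice; `growth` is a hypothetical helper.

```python
import math


def growth(n):
    """Return (n + log2(n), n * log2(n)) for a positive n."""
    log_n = math.log2(n)
    return n + log_n, n * log_n


for n in (16, 1024, 2**20):
    linear_plus_log, n_log_n = growth(n)
    # Dividing by n shows how each expression compares with plain n
    print(n, linear_plus_log / n, n_log_n / n)
```

The first ratio approaches 1 as n grows (the log n term becomes negligible), while the second ratio keeps growing like log n.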

Time Complexity - Codility - Ladder - Python

The question is available here. My Python code is
def solution(A, B):
    if len(A) == 1:
        return [1]
    ways = [0] * (len(A) + 1)
    ways[1], ways[2] = 1, 2
    for i in xrange(3, len(ways)):
        ways[i] = ways[i-1] + ways[i-2]
    result = [1] * len(A)
    for i in xrange(len(A)):
        result[i] = ways[A[i]] & ((1<<B[i]) - 1)
    return result
The detected time complexity by the system is O(L^2) and I can't see why. Thank you in advance.
First, let's show that the runtime genuinely is O(L^2). I copied a section of your code, and ran it with increasing values of L:
import time
import matplotlib.pyplot as plt

def solution(L):
    if L == 0:
        return
    ways = [0] * (L+5)
    ways[1], ways[2] = 1, 2
    for i in xrange(3, len(ways)):
        ways[i] = ways[i-1] + ways[i-2]

points = []
for L in xrange(0, 100001, 10000):
    start = time.time()
    solution(L)
    points.append(time.time() - start)
plt.plot(points)
plt.show()
The result graph is this:
To understand why this is O(L^2) when the obvious "time complexity" calculation suggests O(L), note that "time complexity" is not a well-defined concept on its own since it depends on which basic operations you're counting. Normally the basic operations are taken for granted, but in some cases you need to be more careful. Here, if you count additions as a basic operation, then the code is O(L). However, if you count bit (or byte) operations then the code is O(L^2). Here's the reason:
You're building an array of the first L Fibonacci numbers. The length (in digits) of the i'th Fibonacci number is Theta(i). So ways[i] = ways[i-1] + ways[i-2] adds two numbers with approximately i digits, which takes O(i) time if you count bit or byte operations.
This observation gives you an O(L^2) bit operation count for this loop:
for i in xrange(3, len(ways)):
    ways[i] = ways[i-1] + ways[i-2]
In the case of this program, it's quite reasonable to count bit operations: your numbers are unboundedly huge as L increases and addition of huge numbers is linear in clock time rather than O(1).
You can fix the complexity of your code by computing the Fibonacci numbers mod 2^32 -- since 2^32 is a multiple of 2^B[i]. That will keep a finite bound on the numbers you're dealing with:
for i in xrange(3, len(ways)):
    ways[i] = (ways[i-1] + ways[i-2]) & ((1<<32) - 1)
There are some other issues with the code, but this will fix the slowness.
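The effect of the mask can be checked by looking at how many bits each stored number needs. This is a Python 3 sketch under the assumption that L is at least 2; `ways_bit_lengths` is a hypothetical helper that mirrors the loop above and reports sizes only.

```python
def ways_bit_lengths(L, mask=None):
    """Build the ways table up to index L and return each entry's bit length.

    If mask is given, every entry is reduced with `& mask` as it is stored.
    Assumes L >= 2.
    """
    ways = [0] * (L + 1)
    ways[1], ways[2] = 1, 2
    for i in range(3, L + 1):
        total = ways[i - 1] + ways[i - 2]
        ways[i] = total if mask is None else total & mask
    return [w.bit_length() for w in ways]


unbounded = ways_bit_lengths(1000)
bounded = ways_bit_lengths(1000, mask=(1 << 32) - 1)
print(unbounded[10], unbounded[100], unbounded[1000])
print(max(bounded))
```

Without the mask the bit length grows roughly linearly with the index, so each addition gets more expensive; with the mask no entry exceeds 32 bits.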
I've taken the relevant parts of the function:
def solution(A, B):
    for i in xrange(3, len(A) + 1): # replaced ways for clarity
        # ...
    for i in xrange(len(A)):
        # ...
    return result
Observations:
1. A is an iterable object (e.g. a list).
2. You're iterating over the elements of A in sequence.
3. The behavior of your function depends on the number of elements in A, making it O(A).
4. You're iterating over A twice, meaning 2 O(A) -> O(A).
On point 4, since 2 is a constant factor, 2 O(A) is still in O(A).
I think the page is not correct in its measurement. Had the loops been nested, then it would've been O(A²), but the loops are not nested.
This short sample is O(N²):
def process_list(my_list):
    for i in range(0, len(my_list)):
        for j in range(0, len(my_list)):
            # do something with my_list[i] and my_list[j]
            pass
I've not seen the code the page is using to 'detect' the time complexity of the code, but my guess is that the page is counting the number of loops you're using without understanding much of the actual structure of the code.
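The difference between sequential and nested loops can be shown by counting iterations. This Python 3 sketch counts each loop iteration as one step, which is the uniform cost assumption; `sequential_steps` and `nested_steps` are hypothetical names.

```python
def sequential_steps(n):
    """Two loops one after the other: 2 * n iterations in total."""
    steps = 0
    for _ in range(n):
        steps += 1
    for _ in range(n):
        steps += 1
    return steps


def nested_steps(n):
    """One loop inside another: n * n iterations in total."""
    steps = 0
    for _ in range(n):
        for _ in range(n):
            steps += 1
    return steps


for n in (10, 100):
    print(n, sequential_steps(n), nested_steps(n))
```

This only counts iterations. It says nothing about how expensive each iteration is, which is where the size of the numbers comes into play.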
EDIT1:
Note that, based on this answer, the time complexity of the len function is actually O(1), not O(N), so the page is not incorrectly trying to count its use for the time-complexity. If it were doing that, it would've incorrectly claimed a larger order of growth because it's used 4 separate times.
EDIT2:
As @PaulHankin notes, asymptotic analysis also depends on what's considered a "basic operation". In my analysis, I've counted additions and assignments as "basic operations" by using the uniform cost method, not the logarithmic cost method, which I did not mention at first.
Most of the time, simple arithmetic operations are treated as basic operations. This is what I see most commonly being done, unless the algorithm being analysed is for a basic operation itself (e.g. time complexity of a multiplication function), which is not the case here.
The only reason why we have different results appears to be this distinction. I think we're both correct.
EDIT3:
While an algorithm in O(N) is also in O(N²), I think it's reasonable to state that the code is still in O(N) because, at the level of abstraction we're using, the computational steps that seem more relevant (i.e. are more influential) are in the loop as a function of the size of the input iterable A, not the number of bits being used to represent each value.
Consider the following algorithm to compute a^n:
def function(a, n):
    r = 1
    for i in range(0, n):
        r *= a
    return r
Under the uniform cost method, this is in O(N), because the loop is executed n times, but under logarithmic cost method, the algorithm above turns out to be in O(N²) instead due to the time complexity of the multiplication at line r *= a being in O(N), since the number of bits to represent each number is dependent on the size of the number itself.
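The growth in operand size is easy to observe. The following Python 3 sketch records the bit length of r after each multiplication; `power_bit_lengths` is a hypothetical helper based on the loop above.

```python
def power_bit_lengths(a, n):
    """Multiply r by a n times and record r's bit length after each step."""
    r = 1
    lengths = []
    for _ in range(n):
        r *= a
        lengths.append(r.bit_length())
    return lengths


print(power_bit_lengths(2, 5))   # [2, 3, 4, 5, 6]
print(power_bit_lengths(3, 3))   # [2, 4, 5]
```

The number of bits in r increases with every iteration, so under the logarithmic cost method the later multiplications cost more than the earlier ones.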
The Codility Ladder problem is super tricky.
We first compute the Fibonacci sequence for the first L+2 numbers. The first two numbers are used only as fillers, so we have to index the sequence as A[idx]+1 instead of A[idx]-1. The second step is to replace the modulo operation by removing all but the n lowest bits.
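A minimal sketch of that idea in Python 3 follows. It is not the solution referred to above, which is not reproduced here. It assumes that A and B have equal length, and it uses ways[0] = ways[1] = 1 as the starting values instead of the filler offset; `ladder` is a hypothetical name.

```python
def ladder(A, B):
    """For each i, return the number of ways to climb A[i] rungs, mod 2**B[i].

    Every stored value is masked to max(B) bits, so the numbers stay bounded.
    """
    if not A:
        return []
    limit = max(A)
    mask = (1 << max(B)) - 1
    ways = [1] * (limit + 2)             # ways[0] = ways[1] = 1
    for i in range(2, limit + 1):
        ways[i] = (ways[i - 1] + ways[i - 2]) & mask
    # Keep only the B[i] lowest bits instead of using the % operator
    return [ways[a] & ((1 << b) - 1) for a, b in zip(A, B)]


print(ladder([4, 4, 5, 5, 1], [3, 2, 4, 3, 1]))  # [5, 1, 8, 0, 1]
```

Masking to max(B) bits is valid because 2^max(B) is a multiple of every 2^B[i], so the lower bits are not affected.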

time complexity of summing algorithms

I've had a look through previous posts and I'm still struggling to find the T(n) and big O of these two recursive algorithms, each one takes a sequence of numbers as its argument and sums all numbers in the list (except for last item) then adds the sum to the last item. could anyone please shed some light.
def sum(numberSequence):
    assert (len(numberSequence) > 0)
    if (len(numberSequence) == 1):
        return numberSequence[0]
    else:
        return sum(numberSequence[-1]) + numberSequence[:-1]
(I believe the bigO is O(n) as in worst case, the function is called n-1 times, but not sure what happens when it is only summing part of the list. I have T(n) = n x n-1 + n = O(n) it just doesn't seem right).
def binarySum(numberSequence):
    assert (len(numberSequence) > 0)
    breakPoint = int(len(numberSequence)/2)
    if (len(numberSequence) == 1):
        return numberSequence[0]
    else:
        return binarySum(numberSequence[:breakPoint]) + binarySum(numberSequence[breakPoint:])
I'm more lost on this one, I think the big O is O(log2 n) as it is binary search but the whole list isn't being divided in half, only most of the list.
Any help would be appreciated.
You're summing a list of N numbers of any size, in any order.
You aren't going to find a clever way to do that faster without some constraints.
It's Ω(N) always (lower bound is N addition operations - you won't get any better than that).
As a commenter below noted your algorithm may in fact be worse - it just can't be better.
Edited: corrections made based on comments regarding O(n) performance of [::].
TL;DR: It could be O(n), but your version is O(n²).
Remember that all of the big-O notations assume "times a constant". That is, O(n) really means O(k * n), and O(log n) really means O(k * log n).
Let's look at your first example:
def sum(numberSequence):
    assert (len(numberSequence) > 0)
    if (len(numberSequence) == 1):
        return numberSequence[0]
    else:
        return sum(numberSequence[-1]) + numberSequence[:-1]
The first line is assert plus compare plus len. The len operation is a constant time for lists and tuples (But it might not be with some other data structure! Beware!), compare is a constant time, and the assert is effectively a constant time, because if it ever fails the whole thing blows up and we stop computing. So let's just call assert a function call plus a comparison plus a return.
Now, how many times does this function get called? Well, the termination condition obviously represents one time, and every other time it's recursing on a list that is one shorter than the previous list. So the function will be called len(numberSequence) times, which is n for our purposes.
So we have
1 * call (for the user calling us)
+ n * assert
+ n * len
+ n * compare
Next, we have the if statement that marks the termination condition for your recursion. Obviously, this statement will only be successful once (it's the termination condition, right? Only happens at the end...) so that's a comparison each time, and once per sum it's a return of a constant index.
n * compare
+ 1 * constant index
+ 1 * return
Finally, there is the else: branch. I'm pretty sure you have a bug, and it should really be this (note position of colon):
return sum(numberSequence[:-1]) + numberSequence[-1]
In that case you return the sum of a constant negative index lookup and a recursive function call of a slice. You only do this when it's NOT the end of the recursion, so n-1 times.
(n - 1) * constant negative index lookup
+ (n - 1) * slice
+ (n - 1) * recursive call
+ (n - 1) * return
But wait! If you look around for people asking about how to make a copy of a list, you'll find that one common Python idiom is copy = orig[:]. The reason for this is that a slice operation makes a copy of the subrange of the list it is slicing. So when you say numberSequence[:-1] what you're really saying is copy = [orig[i] for i in range(0, len(orig)-1)].
This means that the slice operation is O(n), but on the plus side it's written in C. So the constant is a much smaller one.
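The copying behaviour can be confirmed with a short experiment in Python 3. The variable names are only illustrative.

```python
orig = [1, 2, 3, 4]
copy = orig[:-1]        # new list holding every element except the last

copy[0] = 99            # changing the copy leaves the original untouched
print(orig, copy)       # [1, 2, 3, 4] [99, 2, 3]
print(copy is orig)     # False
```

Because each slice builds a new list, its cost is proportional to the number of elements it contains.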
Let's add those up:
1 * call
+ n * assert
+ n * len
+ n * compare
+ n * compare
+ 1 * constant index
+ 1 * return
+ (n - 1) * constant negative index lookup
+ (n - 1) * (c * n) slice
+ (n - 1) * recursive call
+ (n - 1) * return
If we assume that constant index and constant negative index take the same time, we can merge them. We can obviously merge the returns and the calls. Which leaves us with:
n * call
+ n * assert
+ n * len
+ n * compare
+ n * compare
+ n * constant (maybe negative) index
+ n * return
+ (n - 1) * (c * n) slice
Now according to "the rules," this is O(n²). Which means that all the details of O(n) behavior fall by the wayside in favor of that big, fat O(n²).
However:
If the len operation were not O(1) - that is, constant time - then the function might well become O(n²) because of that.
If the index operations were not O(1), because of underlying implementation details, the function might become O(n²) or O(n log n) because of that.
So you have implemented an algorithm that could be O(n) using a Python operator that is inherently O(n) itself. Your implementation is "inherently" O(n²). But it can be fixed. Even if fixed, things outside of your control could make your code slower. (But, that's outside your control, so ... ignore it!)
How can we fix your code to make it O(n)? By getting rid of the slice! You don't need that anyway, right? You just need to track the range.
def sum(numberSequence, start=0, end=None):
    assert (len(numberSequence) > 0)
    if end is None:
        end = len(numberSequence) - 1
    if end == start:
        return numberSequence[start]
    else:
        return sum(numberSequence, start, end-1) + numberSequence[end]
In this code, I'm doing pretty much the same thing that you did, with two differences. First, I've added a special case to handle being called by an end user with only the sequence as an argument. And second, of course, there is no slice. With that out of the way, the code is no longer inherently O(n²).
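To see the quadratic cost of the slicing version, one can count how many elements the slices copy. This Python 3 sketch uses the corrected slice order; `sum_sliced` and `copied_elements` are hypothetical names, and the counter is a rough measure of work, not a timing.

```python
def sum_sliced(seq, copied):
    """Recursive sum using slices; copied[0] tallies elements copied."""
    if len(seq) == 1:
        return seq[0]
    head = seq[:-1]                 # copies len(seq) - 1 elements
    copied[0] += len(head)
    return sum_sliced(head, copied) + seq[-1]


def copied_elements(n):
    """Sum 1..n with slices and return (total, elements copied)."""
    copied = [0]
    total = sum_sliced(list(range(1, n + 1)), copied)
    return total, copied[0]


print(copied_elements(5))     # (15, 10)
print(copied_elements(100))   # (5050, 4950)
```

The copy count is n * (n - 1) / 2, which grows quadratically, while the index-based version copies nothing.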
You can do the same math, and make the same changes, to your other example, but it's more complex. However, I will remind you that the sum of 2^i for i = 0..n-1 is 2^n - 1. As @lollercoaster points out, there ain't no such thing as a free lunch: you have to add up all the numbers.
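Here is one way the second example might look without slices, as a Python 3 sketch. It assumes a non-empty sequence; `binary_sum_range` and `binary_sum` are hypothetical names, and the call counter is added only to show how much work is done.

```python
def binary_sum_range(seq, start, end, calls):
    """Sum seq[start:end] (end exclusive) by halving the index range."""
    calls[0] += 1
    if end - start == 1:
        return seq[start]
    mid = start + (end - start) // 2
    return (binary_sum_range(seq, start, mid, calls)
            + binary_sum_range(seq, mid, end, calls))


def binary_sum(seq):
    """Return (sum of seq, number of recursive calls made)."""
    calls = [0]
    total = binary_sum_range(seq, 0, len(seq), calls)
    return total, calls[0]


print(binary_sum(list(range(1, 9))))   # (36, 15)
print(binary_sum([1, 2, 3, 4, 5]))     # (15, 9)
```

The number of calls doubles at each level of the recursion, and those counts add up to 2n - 1 for a list of n items, so this version is O(n) and not O(log n).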
Technically I think the actual runtimes of your algorithms may both be worse than O(n). The slicing operation is O(length_of_slice), since it copies the relevant portion of the list. That said, since that happens in C under the hood, you may not notice the performance.
I'm torn on whether to count that fact in the runtime of your own algorithm, since if you implemented this e.g. in C with pointer arithmetic rather than Python with slicing, these would both be O(n).
Two side notes:
In your sum function, you slice the wrong sequence (should be return sum(numberSequence[:-1]) + numberSequence[-1]).
In practice, you should just use the sum builtin rather than rolling your own like this.

Factorial running time

When analyzing some code I've written, I've come up with the following recursive equation for its running time -
T(n) = n*T(n-1) + n! + O(n^2).
Initially, I assumed that O((n+1)!) = O(n!), and therefore I solved the equation like this -
T(n) = n! + O(n!) + O(n^3) = O(n!)
Reasoning that even had every recursion yielded another n! (instead of (n-1)!, (n-2)! etc.), it would still only come up to n*n! = (n+1)! = O(n!). The last argument is due to sum of squares.
But, after thinking about it some more, I'm not sure my assumption that O((n+1)!) = O(n!) is correct, in fact, I'm pretty sure it isn't.
If I am right in thinking I made a wrong assumption, I'm not really sure how to actually solve the above recursive equation, since there is no formula for the sum of factorials...
Any guidance would be much appreciated.
Thank you!!!
Since you're looking at run-time, I assume O(n^2) is meant to be the number of operations on that term. Under that assumption, n! can be computed in O(n) time (1*2*3*...*n). So, it can be dropped in comparison to the O(n^2) term. T(n-1) is then computed in approximately O((n-1)^2) time which is roughly O(n^2). Putting it all together you have something which runs in
O(n^2) + O(n) + O(n^2)
resulting in an O(n^2) algorithm.
I figured it out.
Dropping the O(n^2) term, which is dominated by n!:
T(n) = n*T(n-1) + n! = n*( (n-1)T(n-2) + (n-1)! ) + n! = n(n-1)T(n-2) + 2n! = ... = n!*T(0) + n*n! = O(n*n!)
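The result can be sanity-checked numerically. This Python 3 sketch assumes that the O(n^2) term is exactly n^2 and that T(0) = 1. Both are assumptions made for illustration, and `T` is a hypothetical helper.

```python
import math


def T(n):
    """T(n) = n*T(n-1) + n! + n^2, with T(0) = 1 assumed as the base case."""
    value = 1
    for k in range(1, n + 1):
        value = k * value + math.factorial(k) + k * k
    return value


for n in (5, 10, 20, 50):
    # Ratio of T(n) to n * n!; it should settle towards a constant
    print(n, T(n) / (n * math.factorial(n)))
```

Under these assumptions the ratio decreases towards 1 as n grows, which is consistent with T(n) = O(n*n!).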
The problem with:
T(n) = n*T(n-1) + n! + O(n^2)
Is that you're mixing two different types of terms. Everything left of the final + refers to a number; to the right of that plus is O(n^2) which denotes the class of all functions which grow asymptotically no faster than n^2.
Assuming you mean:
T(n) = n*T(n-1) + n! + n^2
Then T(n) in O(n!) because n! is the fastest growing term in the sum. (Actually, I'm not sure that n*T(n-1) isn't faster growing - my combinatorics isn't that strong).
Expanding out the recursive term, the recursive "call" to n*T(n-1) reduces to some function which is O(n!), and so the function as a whole is O(n!).
Fully expanding out the recursive term, it will be the fastest growing term. See the comments for various suggestions for the correct expansion.
From what I understand from the source code:
https://github.com/python/cpython/blob/main/Modules/mathmodule.c#L1982-L2032
it must be at most O(n) if not faster.
