How to find the time complexity of this function? - python

def f1(n):
    cnt = 0
    for i in range(n):
        for j in range(n):
            k = 1
            while k < i*j:
                k *= 2
                cnt += 1
    return cnt
I am trying to analyze the time complexity of this function f1, but I'm having some trouble dealing with the k < i*j condition in the loop.
My point of view:
I'm trying to find the number of iterations of the inner loop, so what I'm basically trying to find is when 2^k >= i*j, but I'm having trouble working out how to compute i*j each time and find the overall time complexity. I know that at the end I will have 2^k >= n^2, which gives me k >= 2*log(n), but I must be missing all the iterations before this and I would be happy to know how to calculate them. Any help is really appreciated. Thanks in advance!
EDIT:
With Prune's help I reached this: we're trying to calculate how many times the inner loop iterates, which is log(i*j).
Taking i=2 we get log(2) + log(j) (n times), which is n + log(1)+log(2)+...+log(n).
So we have n+log(n!) for each i=0,1,...,n, basically n(n+log(n!)), which is either O(n^2) or O(nlog(n!)). As it's my first time meeting log(n!), I'm not sure which is considered to be the time complexity.

For convenience, let m = i*j.
As you've noted, you execute the while loop log(m) times.
The complexity figure you're missing is the summation:
sum([log(i*j)
     for j in range(n)
     for i in range(n)])
Would it help to attack this with actual numbers? For instance, try one iteration of the outer loop, with i=2. Also, we'll simplify the log expression:
sum([log(2) + log(j) for j in range(n)])
Using base 2 logs for convenience, and separating this, we have
n*1 + sum([log(j) for j in range(n)])
That's your start. Now, you need to find a closed form for sum(log(j)), and then sum that for i = 0,n
Can you take it from there?
After OP update
The desired closed form for sum(log(j)) is, indeed, log(n!)
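This isn't simply "go n times": it's the sum of n*log(i) + log(n!) over the range of i. That sum comes to about 2n*log(n!), and since log(n!) is Θ(n log n), the total is Θ(n^2 log n).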
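If it helps to see the numbers, here is a rough check that is not part of the original answer. The helper name `predicted` is made up for illustration. It assumes the while loop runs ceil(log2(i*j)) times when i*j >= 1 and zero times otherwise, then compares the measured count against n^2 * log2(n), which is the growth rate you get once log(n!) is treated as Θ(n log n).

```python
import math

def f1(n):
    cnt = 0
    for i in range(n):
        for j in range(n):
            k = 1
            while k < i * j:
                k *= 2
                cnt += 1
    return cnt

def predicted(n):
    # Hypothetical helper: (m - 1).bit_length() equals ceil(log2(m)) for m >= 1,
    # which is how many doublings the while loop needs to reach m = i*j.
    return sum((i * j - 1).bit_length()
               for i in range(1, n) for j in range(1, n))

for n in (8, 64, 256):
    exact = f1(n)
    assert exact == predicted(n)
    # ratio of the measured count to n^2 * log2(n)
    print(n, exact, round(exact / (n * n * math.log2(n)), 3))
```

The printed ratio should stay bounded as n grows, which is what Θ(n^2 log n) predicts. An O(n^2) function would make the ratio shrink toward zero instead.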

Related

Calculate number of function calls for any size N

I'm trying to understand a way to write how many times the print statement for fun1 will be called for any size N. Written in summation form. This is more of an analysis question. I know I could just setup a count variable and print the result. S is an array of N items. N is the size.
def myAlg(S, n):
    count = 0
    for i in range(1, n+1):
        for j in range(1, i+1):
            for k in range(1, j+1):
                if j > k:
                    count += 1
                    print('fun1 called, and count is now', count)
                else:
                    print('fun2 called')
Im honestly a little lost on how to approach this. Any explanation would be greatly appreciated.
For the first two loops we have the sum of the arithmetic progression 1+2+3+...+n, and the result is
T(n) = n*(n+1)/2
known as the triangular numbers (1,3,6,10,15,21...)
So the loop for k is started T(n) times, and its inner part is executed
Q(n) = sum(T(i),i=1..n) = n*(n+1)*(n+2)/6
times; this sequence is known as the tetrahedral numbers (1,4,10,20,35,56...)
But we have to subtract T(n) to exclude the fun2 calls (one per run of the k loop, when k = j)
Result = n*(n+1)*(n+2)/6 - n*(n+1)/2 = (n-1)*n*(n+1)/6
This is the same Q sequence without the last term, so
Result(n) = Q(n-1) = (n-1)*n*(n+1)/6
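A quick way to convince yourself of the formula is to count the fun1 branch directly and compare. This is only a sanity check I'm adding; the names `count_fun1` and `closed_form` are made up for the sketch, and the print statements from the question are replaced by a counter.

```python
def count_fun1(n):
    # Same three loops as myAlg, but only counting the fun1 branch.
    count = 0
    for i in range(1, n + 1):
        for j in range(1, i + 1):
            for k in range(1, j + 1):
                if j > k:
                    count += 1
    return count

def closed_form(n):
    # (n-1)*n*(n+1) is a product of three consecutive integers,
    # so it is always divisible by 6 and integer division is exact.
    return (n - 1) * n * (n + 1) // 6

for n in range(1, 30):
    assert count_fun1(n) == closed_form(n)

print(count_fun1(4), closed_form(4))  # prints 10 10
```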

How to find the recurrence relation, and calculate Master Theorem of a Merge Sort Code?

I'm trying to find the Master Theorem of this Merge Sort Code, but first I need to find its recurrence relation, but I'm struggling to do and understand both. I already saw some similar questions here, but couldn't understand the explanations, like, first I need to find how many operations the code has? Could someone help me with that?
def mergeSort(alist):
    print("Splitting ",alist)
    if len(alist)>1:
        mid = len(alist)//2
        lefthalf = alist[:mid]
        righthalf = alist[mid:]
        mergeSort(lefthalf)
        mergeSort(righthalf)
        i=0
        j=0
        k=0
        while i < len(lefthalf) and j < len(righthalf):
            if lefthalf[i] < righthalf[j]:
                alist[k]=lefthalf[i]
                i=i+1
            else:
                alist[k]=righthalf[j]
                j=j+1
            k=k+1
        while i < len(lefthalf):
            alist[k]=lefthalf[i]
            i=i+1
            k=k+1
        while j < len(righthalf):
            alist[k]=righthalf[j]
            j=j+1
            k=k+1
    print("Merging ",alist)
alist = [54,26,93,17,77,31,44,55,20]
mergeSort(alist)
print(alist)
To determine the run-time of a divide-and-conquer algorithm using the Master Theorem, you need to express the algorithm's run-time as a recursive function of input size, in the form:
T(n) = aT(n/b) + f(n)
T(n) is how we're expressing the total runtime of the algorithm on an input size n.
a stands for the number of recursive calls the algorithm makes.
T(n/b) represents the recursive calls: The n/b signifies that the input size to the recursive calls is some particular fraction of original input size (the divide part of divide-and-conquer).
f(n) represents the amount of work you need to do in the main body of the algorithm, generally just to combine solutions from recursive calls into an overall solution (you could say this is the conquer part).
Here's a slightly re-factored definition of mergeSort:
def mergeSort(arr):
    if len(arr) <= 1: return # array size 1 or 0 is already sorted
    # split the array in half
    mid = len(arr)//2
    L = arr[:mid]
    R = arr[mid:]
    mergeSort(L) # sort left half
    mergeSort(R) # sort right half
    merge(L, R, arr) # merge sorted halves
We need to determine a, n/b and f(n).
Because each call of mergeSort makes two recursive calls: mergeSort(L) and mergeSort(R), a=2:
T(n) = 2T(n/b) + f(n)
n/b represents the fraction of the current input that recursive calls are made with. Because we are finding the midpoint and splitting the input in half, passing one half the current array to each recursive call, n/b = n/2 and b=2. (if each recursive call instead got 1/4 of the original array b would be 4)
T(n) = 2T(n/2) + f(n)
f(n) represents all the work the algorithm does besides making recursive calls. Every time we call mergeSort, we calculate the midpoint in O(1) time.
We also split the array into L and R, and technically creating these two sub-array copies is O(n). Then, presuming mergeSort(L) sorted the left half of the array and mergeSort(R) sorted the right half, we still have to merge the sorted sub-arrays together to sort the entire array with the merge function. Together, this makes f(n) = O(1) + O(n) + complexity of merge. Now let's take a look at merge:
def merge(L, R, arr):
    i = j = k = 0 # 3 assignments
    while i < len(L) and j < len(R): # 2 comparisons
        if L[i] < R[j]: # 1 comparison, 2 array idx
            arr[k] = L[i] # 1 assignment, 2 array idx
            i += 1 # 1 assignment
        else:
            arr[k] = R[j] # 1 assignment, 2 array idx
            j += 1 # 1 assignment
        k += 1 # 1 assignment
    while i < len(L): # 1 comparison
        arr[k] = L[i] # 1 assignment, 2 array idx
        i += 1 # 1 assignment
        k += 1 # 1 assignment
    while j < len(R): # 1 comparison
        arr[k] = R[j] # 1 assignment, 2 array idx
        j += 1 # 1 assignment
        k += 1 # 1 assignment
This function has more going on, but we just need to get its overall complexity class to be able to apply the Master Theorem accurately. We can count every single operation, that is, every comparison, array index, and assignment, or just reason about it more generally. Generally speaking, you can say that across the three while loops we are going to iterate through every member of L and R and assign them in order to the output array, arr, doing a constant amount of work for each element. Noting that we are processing every element of L and R (n total elements) and doing a constant amount of work for each element would be enough to say that merge is in O(n).
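To make the "constant work per element" point concrete, here is a small instrumented variant of merge. It is only a sketch for illustration: the name `merge_counting` and the `writes` counter are my additions, not part of the code being analyzed. Each loop iteration writes exactly one element into arr, so the total number of iterations across the three loops equals len(L) + len(R).

```python
def merge_counting(L, R, arr):
    """Merge sorted L and R into arr; return how many elements were written."""
    i = j = k = 0
    writes = 0  # added counter: one write per loop iteration
    while i < len(L) and j < len(R):
        if L[i] < R[j]:
            arr[k] = L[i]
            i += 1
        else:
            arr[k] = R[j]
            j += 1
        k += 1
        writes += 1
    while i < len(L):  # leftovers from L, if any
        arr[k] = L[i]
        i += 1
        k += 1
        writes += 1
    while j < len(R):  # leftovers from R, if any
        arr[k] = R[j]
        j += 1
        k += 1
        writes += 1
    return writes

L, R = [1, 4, 9], [2, 3, 10, 11]
arr = [0] * (len(L) + len(R))
print(merge_counting(L, R, arr), arr)  # prints 7 [1, 2, 3, 4, 9, 10, 11]
```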
But, you can get more particular with counting operations if you want. For the first while loop, every iteration we make 3 comparisons, 4 array indexes, and 3 assignments (constant numbers), and the loop runs until one of L and R is fully processed. Then, one of the next two while loops may run to process any leftover elements from the other array, performing 1 comparison, 2 array indexes, and 3 variable assignments for each of those elements (constant work). Therefore, because each of the n total elements of L and R cause at most a constant number of operations to be performed across the while loops (either 10 or 6, by my count, so at most 10), and the i=j=k=0 statement is only 3 constant assignments, merge is in O(3 + 10*n) = O(n). Returning to the overall problem, this means:
f(n) = O(1) + O(n) + complexity of merge
= O(1) + O(n) + O(n)
= O(2n + 1)
= O(n)
T(n) = 2T(n/2) + n
One final step before we apply the Master Theorem: we want f(n) written as n^c. For f(n) = n = n^1, c=1. (Note: things change very slightly if f(n) = n^c*log^k(n) rather than simply n^c, but we don't need to worry about that here)
You can now apply the Master Theorem, which in its most basic form says to compare a (how quickly the number of recursive calls grows) to b^c (how quickly the amount of work per recursive call shrinks). There are 3 possible cases, the logic of which I try to explain, but you can ignore the parenthetical explanations if they aren't helpful:
1. a > b^c, T(n) = O(n^log_b(a)). (The total number of recursive calls is growing faster than the work per call is shrinking, so the total work is determined by the number of calls at the bottom level of the recursion tree. The number of calls starts at 1 and is multiplied by a log_b(n) times because log_b(n) is the depth of the recursion tree. Therefore, total work = a^log_b(n) = n^log_b(a))
2. a = b^c, T(n) = O(f(n)*log(n)). (The growth in number of calls is balanced by the decrease in work per call. The work at each level of the recursion tree is therefore the same, so total work is just f(n)*(depth of tree) = f(n)*log_b(n) = O(f(n)*log(n)))
3. a < b^c, T(n) = O(f(n)). (The work per call shrinks faster than the number of calls increases. Total work is therefore dominated by the work at the top level of the recursion tree, which is just f(n))
For the case of mergeSort, we've seen that a = 2, b = 2, and c = 1. As a = b^c, we apply the 2nd case:
T(n) = O(f(n)*log(n)) = O(n*log(n))
And you're done. This may seem like a lot of work, but coming up with a recurrence for T(n) gets easier the more you do it, and once you have a recurrence it's very quick to check which case it falls under, making the Master Theorem quite a useful tool for solving more complicated divide/conquer recurrences.
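If you want to double-check the result, the case comparison and the recurrence itself are both easy to put into code. The sketch below is mine and makes two assumptions: `master_case` only handles the simple form f(n) = n^c, and `T` uses a base case of T(1) = 0 and only makes sense for n that is a power of two. Both names are made up for illustration.

```python
def master_case(a, b, c):
    """Return which of the three cases applies to T(n) = a*T(n/b) + n**c."""
    critical = b ** c
    if a > critical:
        return 1  # leaves dominate: O(n^log_b(a))
    if a == critical:
        return 2  # balanced: O(n^c * log n)
    return 3      # root dominates: O(n^c)

def T(n):
    # Assumed base case T(1) = 0; n is expected to be a power of two.
    return 0 if n <= 1 else 2 * T(n // 2) + n

print(master_case(2, 2, 1))  # merge sort: a = b^c, so case 2

for n in (2, 16, 1024):
    log2_n = n.bit_length() - 1  # exact log2 for powers of two
    assert T(n) == n * log2_n    # the recurrence unrolls to n * log2(n)
    print(n, T(n))
```

With that base case the recurrence comes out to exactly n*log2(n) on powers of two, which agrees with the O(n*log(n)) bound from case 2.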

How to calculate Time Complexity of this algorithm

I am new to the concept of asymptotic analysis. I am reading "Data Structures and Algorithms in Python" by Goodrich. In that book it has an implementation as follows:
def prefix_average2(S):
    """Return list such that, for all j, A[j] equals average of S[0], ..., S[j]."""
    n = len(S)
    A = [0] * n # create new list of n zeros
    for j in range(n):
        A[j] = sum(S[0:j+1]) / (j+1) # record the average
    return A
The book says that this code runs in O(n^2) but I don't see how. S[0:j+1] runs in O(j+1) time but how do we know what time the 'sum()' runs in and how do we get the running time to be O(n^2)?
You iterate n times in the loop. In the first iteration, you sum 1 number (1 time step), then 2 (2 time steps), and so on, until you reach n (n time steps in this iteration, you have to visit each element once). Therefore, you have 1+2+...+(n-1)+n=(n*(n+1))/2 time steps. This is equal to (n^2+n)/2, or n^2+n after eliminating constants. The order of this term is 2, therefore your running time is O(n^2) (always take the highest power).
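You can also measure this rather than take it on faith. The following is a sketch I'm adding, with a made-up name `prefix_average2_steps`: it runs the same loop as the book's function but also counts how many elements each slice (and therefore each sum) has to touch.

```python
def prefix_average2_steps(S):
    """Same result as prefix_average2, plus a count of elements visited."""
    n = len(S)
    A = [0] * n
    steps = 0
    for j in range(n):
        chunk = S[0:j + 1]      # copying the slice touches j+1 elements
        steps += len(chunk)     # sum() then visits the same j+1 elements
        A[j] = sum(chunk) / (j + 1)
    return A, steps

A, steps = prefix_average2_steps([2, 4, 6, 8])
print(A, steps)  # prints [2.0, 3.0, 4.0, 5.0] 10

n = 1000
_, steps = prefix_average2_steps(list(range(n)))
assert steps == n * (n + 1) // 2
```

The counter grows as n*(n+1)/2, so doubling n roughly quadruples the work.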
for j in range(n): # This loop runs n times.
    A[j] = sum(S[0:j+1]) # now let's expand this sum function's implementation.
I'm not sure about the implementation of the sum(iterable) function but it must be something like this.
def sum(iterable):
    result=0
    for item in iterable: # worst-case time complexity: n
        result+=item
    return result
So, finally, your prefix_average2 function will take at most n*n = n^2 steps in the worst case (each of the n iterations sums at most n items, and the maximum is reached when j+1 = n).
First of all, I am not an expert on this topic, but I would like to share my opinion with you.
If the code were similar to the below:
for j in range(n):
    A[j] += 5
Then we could say the complexity is O(n).
You may ask why we skipped n = len(S) and A = [0] * n.
Because n = len(S) takes O(1) time and A = [0] * n takes O(n) time, so neither adds more than the loop already costs.
If we return to our case:
for j in range(n):
    A[j] = sum(S[0:j+1]) ....
Here, sum(S[0:j+1]) hides another loop that performs the summation.
You can think of it as:
for q in S:
    S[q] += q # This is partially right
The important thing is that the code is effectively running two nested for loops.
for j in range(n):
    for q in range(j + 1):
        A[j] = ....
Therefore, the complexity is O(n^2)
The For Loop (for j in range(n)) has n iterations:
Iteration(Operation)
1st iteration( 1 operation for summing first 1 element)
2nd iteration( 2 operations for summing first 2 elements)
3rd iteration( 3 operations for summing first 3 elements)
.
.
.
(n-1)th iteration( n-1 operations for summing first n-1 elements)
nth iteration( n operations for summing first n elements)
So, the total number of operations is the sum 1 + 2 + 3 + ... + (n-1) + n,
which is n*(n+1)//2.
So the time complexity is O(n^2), as we have to perform n*(n+1)//2 operations.

Guidance on removing a nested for loop from function

I'm trying to write the fastest algorithm possible to return the number of "magic triples" (i.e. x, y, z where z is a multiple of y and y is a multiple of x) in a list of 3-2000 integers.
(Note: I believe the list was expected to be sorted and unique but one of the test examples given was [1,1,1] with the expected result of 1 - that is a mistake in the challenge itself though because the definition of a magic triple was explicitly noted as x < y < z, which [1,1,1] isn't. In any case, I was trying to optimise an algorithm for sorted lists of unique integers.)
I haven't been able to work out a solution that doesn't include having three nested loops and therefore being O(n^3). I've seen one online that is O(n^2) but I can't get my head around what it's doing, so it doesn't feel right to submit it.
My code is:
def solution(l):
    if len(l) < 3:
        return 0
    elif l == [1,1,1]:
        return 1
    else:
        halfway = int(l[-1]/2)
        quarterway = int(halfway/2)
        quarterIndex = 0
        halfIndex = 0
        for i in range(len(l)):
            if l[i] >= quarterway:
                quarterIndex = i
                break
        for i in range(len(l)):
            if l[i] >= halfway:
                halfIndex = i
                break
        triples = 0
        for i in l[:quarterIndex+1]:
            for j in l[:halfIndex+1]:
                if j != i and j % i == 0:
                    multiple = 2
                    while (j * multiple) <= l[-1]:
                        if j * multiple in l:
                            triples += 1
                        multiple += 1
        return triples
I've spent quite a lot of time going through examples manually and removing loops through unnecessary sections of the lists but this still completes a list of 2,000 integers in about a second where the O(n^2) solution I found completes the same list in 0.6 seconds - it seems like such a small difference but obviously it means mine takes 60% longer.
Am I missing a really obvious way of removing one of the loops?
Also, I saw mention of making a directed graph and I see the promise in that. I can make the list of first nodes from the original list with a built-in function, so in principle I presume that means I can make the overall graph with two for loops and then return the length of the third node list, but I hit a wall with that too. I just can't seem to make progress without that third loop!!
from array import array
def num_triples(l):
    n = len(l)
    pairs = set()
    lower_counts = array("I", (0 for _ in range(n)))
    upper_counts = lower_counts[:]
    for i in range(n - 1):
        lower = l[i]
        for j in range(i + 1, n):
            upper = l[j]
            if upper % lower == 0:
                lower_counts[i] += 1
                upper_counts[j] += 1
    return sum(nx * nz for nz, nx in zip(lower_counts, upper_counts))
Here, lower_counts[i] is the number of pairs of which the ith number is the y, and z is the other number in the pair (i.e. the number of different z values for this y).
Similarly, upper_counts[i] is the number of pairs of which the ith number is the y, and x is the other number in the pair (i.e. the number of different x values for this y).
So the number of triples in which the ith number is the y value is just the product of those two numbers.
The use of an array here for storing the counts is for scalability of access time. Tests show that up to n=2000 it makes negligible difference in practice, and even up to n=20000 it only made about a 1% difference to the run time (compared to using a list), but it could in principle be the fastest growing term for very large n.
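If the counting trick still feels like magic, one way to build trust in it is to compare it against the obvious O(n^3) check on small inputs. This is a sketch I'm adding, not the answerer's code: `num_triples_counts` is the same idea rewritten with plain lists (an assumption made for brevity), and `num_triples_brute` is a hypothetical reference that tests every index triple directly. Both assume the list contains no zeros.

```python
from itertools import combinations

def num_triples_counts(l):
    # O(n^2): for each element, count multiples after it and divisors before it.
    n = len(l)
    lower_counts = [0] * n  # times l[i] divides a later element
    upper_counts = [0] * n  # times l[j] is divisible by an earlier element
    for i in range(n - 1):
        for j in range(i + 1, n):
            if l[j] % l[i] == 0:
                lower_counts[i] += 1
                upper_counts[j] += 1
    # each middle element y contributes (#x choices) * (#z choices)
    return sum(lo * up for lo, up in zip(lower_counts, upper_counts))

def num_triples_brute(l):
    # O(n^3) reference: combinations() yields elements in index order i < j < k.
    return sum(1 for x, y, z in combinations(l, 3)
               if y % x == 0 and z % y == 0)

for data in ([1, 2, 3, 4, 5, 6], [1, 2, 4, 8, 16], [1, 1, 1], [3, 5, 7]):
    assert num_triples_counts(data) == num_triples_brute(data)

print(num_triples_counts([1, 2, 3, 4, 5, 6]))  # prints 3
```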
How about using itertools.combinations instead of nested for loops? Combined with a list comprehension, it's cleaner, though it still checks every combination of three indices, so it remains O(n^3). Let's say l = [your list of integers] and let's assume it's already sorted.
from itertools import combinations
def div(i,j,k): # this function has the logic
    return l[k]%l[j]==l[j]%l[i]==0
r = sum([div(i,j,k) for i,j,k in combinations(range(len(l)),3) if i<j<k])
@alaniwi provided a very smart iterative solution.
Here is a recursive solution.
def find_magicals(lst, nplet):
    """Find the number of magical n-plets in a given lst"""
    res = 0
    for i, base in enumerate(lst):
        # find all the multiples of current base
        multiples = [num for num in lst[i + 1:] if not num % base]
        res += len(multiples) if nplet <= 2 else find_magicals(multiples, nplet - 1)
    return res
def solution(lst):
    return find_magicals(lst, 3)
The problem can be divided into selecting any number in the original list as the base (i.e x), how many du-plets we can find among the numbers bigger than the base. Since the method to find all du-plets is the same as finding tri-plets, we can solve the problem recursively.
From my testing, this recursive solution is comparable to, if not more performant than, the iterative solution.
This answer was the first suggestion by @alaniwi and is the one I've found to be the fastest (at 0.59 seconds for a 2,000 integer list).
def solution(l):
    n = len(l)
    lower_counts = dict((val, 0) for val in l)
    upper_counts = lower_counts.copy()
    for i in range(n - 1):
        lower = l[i]
        for j in range(i + 1, n):
            upper = l[j]
            if upper % lower == 0:
                lower_counts[lower] += 1
                upper_counts[upper] += 1
    return sum((lower_counts[y] * upper_counts[y] for y in l))
I think I've managed to get my head around it. What it is essentially doing is comparing each number in the list with every number after it to see if the larger is divisible by the smaller, and it builds two dictionaries:
One with the number of larger numbers that each number divides,
One with the number of smaller numbers that divide it.
You compare the two dictionaries and multiply the values for each key because the key having a 0 in either essentially means it is not the second number in a triple.
Example:
l = [1,2,3,4,5,6]
lower_counts = {1:5, 2:2, 3:1, 4:0, 5:0, 6:0}
upper_counts = {1:0, 2:1, 3:1, 4:2, 5:1, 6:3}
triple_tuple = ([1,2,4], [1,2,6], [1,3,6])

Floyd's Algorithm in Python

I am not sure how to implement Floyd's algorithm in the following program. It must print a 5x5 array that represents this graph on page 466 and include a counter which is used to print the total number of comparisons when the algorithm is executed - each execution of the "if" structure counts as one comparison.
Does anyone know how to even start this program? I am not sure how to begin.
The following is purely a transcription of the pseudocode you linked. I changed almost nothing.
for k in range(n):
    for i in range(n):
        for j in range(n):
            if A[i][k]+A[k][j]<A[i][j]:
                A[i][j]=A[i][k]+A[k][j]
Translated from the page you linked to,
k=0
while (k <= n-1):
    i=0
    while (i<=n-1):
        j=0
        while(j<=n-1):
            if(A[i,k] + A[k,j] < A[i,j]):
                A[i,j] = A[i,k] + A[k,j]
            j += 1
        i += 1
    k += 1
NB This is the exact translation to Python.
Better, more Pythonic code is also possible - see, e.g. 5xum's answer
which uses the range function instead of manually incrementing the loop counters.
Also A here would be a 2d matrix (e.g. a numpy ndarray).
See more information about numpy here
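Neither snippet above includes the comparison counter the question asks for, so here is one possible way to wire it in. Treat it as a sketch under assumptions: the 5x5 weight matrix below is made up for illustration (it is NOT the graph from page 466, which isn't shown in this thread), missing edges are represented with float("inf"), and plain nested lists are used instead of numpy.

```python
INF = float("inf")  # assumed marker for "no edge"

def floyd(A):
    """Run Floyd's algorithm in place on A; return the number of comparisons."""
    n = len(A)
    comparisons = 0
    for k in range(n):
        for i in range(n):
            for j in range(n):
                comparisons += 1  # each execution of the `if` counts once
                if A[i][k] + A[k][j] < A[i][j]:
                    A[i][j] = A[i][k] + A[k][j]
    return comparisons

# Hypothetical 5x5 weight matrix, for illustration only.
A = [
    [0,   3,   8,   INF, 4],
    [INF, 0,   INF, 1,   7],
    [INF, 4,   0,   INF, INF],
    [2,   INF, 5,   0,   INF],
    [INF, INF, INF, 6,   0],
]

count = floyd(A)
for row in A:
    print(row)
print("comparisons:", count)
```

Because the `if` sits inside three loops of n iterations each, the counter always ends at n^3 regardless of the weights, which is 125 for a 5x5 matrix.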
