This question already has answers here:
How can I find the time complexity of an algorithm?
(10 answers)
Closed 5 years ago.
I need help in counting the number of steps regarding the time complexity of code fragments.
total = 0
i = 0
while i<3:
    j=0
    while j<3:
        total = total + 1
        j = j+1
    i = i+1
return total
I have the solution stating: 2+3*(2+3*3+2)+2 = 43
I understand the first two lines, total = 0 and i = 0: each is 1 time step, so together they give 2. For the outer while statement, I'm not sure how its count is obtained; since the condition is i<3, is it 3 time steps? And then j = 0 is 1 time step.
Here's where I don't quite get it: when there are nested i and j loops, how do I determine the time complexity? I notice the solution uses * (multiplication), and I would appreciate it if anyone could break it down in simpler terms for me.
Time complexity takes an argument. For example, O(n^2).
As it's written, I don't know what part of your function would change, so it's just constant, O(1).
Let's say the thing that i is compared to, 3 in this case, is what can change. Like your function is "do a j-thing three times for each i." In that case, you'll see that if you increase that variable, you'll add three more steps to the loop. That means the complexity would look like O(3n). Since we can remove constant multiples, it's just O(n).
What I just wrote is hypothetical, though. It depends on what varies in your function.
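To see where the 43 in the question's solution comes from, the fragment can be instrumented so that every assignment, every condition check, and the final return counts as one step. This is only a sketch under that one-step-per-operation assumption; count_steps and its parameter n are hypothetical names, with n standing in for the hard-coded 3 in both loops.

```python
def count_steps(n=3):
    """Count steps of the nested-loop fragment, one step per operation."""
    steps = 0
    total = 0
    steps += 1                  # total = 0
    i = 0
    steps += 1                  # i = 0
    while True:
        steps += 1              # outer check: i < n (also counted when it fails)
        if not i < n:
            break
        j = 0
        steps += 1              # j = 0
        while True:
            steps += 1          # inner check: j < n (also counted when it fails)
            if not j < n:
                break
            total = total + 1
            steps += 1          # total = total + 1
            j = j + 1
            steps += 1          # j = j + 1
        i = i + 1
        steps += 1              # i = i + 1
    steps += 1                  # return total
    return steps


print(count_steps(3))  # 43
```

Under this accounting the count is 2 + n*(2 + 3*n + 2) + 2, which matches 2+3*(2+3*3+2)+2 = 43 for n = 3. If both loop bounds grew with n, the n*3*n term would dominate.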
This is for leetcode problem: https://leetcode.com/problems/majority-element
There is something wrong with the way I write solutions, and I'm not sure how to stop doing it. Basically, the problem is that I always create a count variable; here it is called greatest_count. The conditional in the if statement seems fine, but I feel like I don't need the additional greatest_count variable, and I'm not sure of a better way to write it. I always seem to think I need to count each value and check it against the previous counts. Do I need this? How can I write this without the count variable, or without greatest_unique? Any ways to optimize this would be great to know.
Problem area:
if unique_count > greatest_count:
    greatest_count = unique_count
    greatest_unique = i
Here is the full code:
class Solution:
    def majorityElement(self, nums):
        unique_nums = set(nums)
        greatest_unique = 0
        greatest_count = 0
        for i in unique_nums:
            unique_count = nums.count(i)
            if unique_count > greatest_count:
                greatest_count = unique_count
                greatest_unique = i
        return greatest_unique
Thank you
In order to get this to work in O(n) time and O(1) space, you would need a different approach. For example, find the majority bit for each of the 32 bits of the numbers and build the answer from the collected bits that are present in more than half the numbers:
def majorityElement(nums):
    m = 0
    for b in range(32):                       # go through all 32 bits
        c = sum(n&(2**b)!=0 for n in nums)    # count numbers with bit b set
        if c>len(nums)//2: m |= 2**b          # more than half, keep that bit
    return m if m<2**31 else m-2**32          # turn Python's int to 32 bit signed
majorityElement([3,2,3]) # 3
majorityElement([-3,-3,1,1,1,-3,-3]) # -3
This is O(n) (linear) time because it runs through the list a fixed number of times. It is O(1) space because it does not use memory proportionally to the size of the list.
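Another well-known way to get O(n) time and O(1) space is the Boyer-Moore majority vote. It is a different technique from the bit counting above, and it still keeps a counter, though a single running one rather than one count per distinct value. The sketch below assumes, as the LeetCode problem states, that a majority element always exists; majority_vote is a hypothetical name.

```python
def majority_vote(nums):
    """Return the majority element, assuming one exists (Boyer-Moore vote)."""
    candidate = None
    votes = 0
    for value in nums:
        if votes == 0:
            # No surviving candidate: adopt the current value.
            candidate = value
        # A matching value adds a vote, any other value cancels one.
        votes += 1 if value == candidate else -1
    return candidate


print(majority_vote([3, 2, 3]))                   # 3
print(majority_vote([-3, -3, 1, 1, 1, -3, -3]))   # -3
```

If the input might not contain a majority element, a second pass that counts the returned candidate would be needed to confirm it.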
This question already has answers here:
Understanding change-making algorithm
(4 answers)
Closed 3 years ago.
In England we have 1p, 2p, 5p, 10p, 20p and 50p coins and a pound (100p) coin. Using these coins, I would like to work out all the possible combinations of coins that add up to £2.50. The way I approached this was to make a list of all the possible combinations of all of the coins. To do this, I did the following:
ps = [1, 2, 5, 10, 20, 50, 100]
list_of_combos = [[]]
for i in range(7):
    for j in range(7):
        for k in range(7):
            for l in range(7):
                for m in range(7):
                    for n in range(7):
                        for o in range(7):
                            print("processing..")
                            all_combos = (ps[i], ps[j], ps[k], ps[l], ps[m], ps[n], ps[o])
                            list_of_combos.append(all_combos)
Then from all the possible combos, i tried picking the only ones that actually add up to 250 by doing this.
for i in list_of_combos:
    if sum(i) == 250:
        print(i)
The problem I am having is that the first nested loop takes forever to complete, which basically makes the program useless. Is there anything I can do to make this loop finish quicker? Thanks.
I can give you an idea that might help. One way to replace the loop, which I honestly am not sure is more efficient but expect to be better than the above, is adapted from this:
How to get all possible combinations of a list’s elements?
Using the same function as the top answer:
list(itertools.combinations(iterable, r))
However, keep in mind that since you want combinations containing more than one coin of the same value, you might need to create a new list with repeated items.
HOWEVER, this is a very, very inefficient approach, and one that will not get you a complete result, because it limits the combinations you can form: under this scheme the answer can never be 250 1p coins, for example.
A better approach is to go the other way around. Starting from the biggest coins, you can work from:
100 - 100 - 50
and go down, dividing each coin in each different way. The advantage is that every operation you perform produces a wanted result, so you are not wasting any loops (of which the original approach has a lot), and you do not need any further checks to make sure the sum equals the wanted total (e.g. 100 100 50 is a result; dividing it gives, for example, 50 50 50 50 50, which is ALSO a result).
You might want to limit each check to maybe 2-3 coin sizes down to improve performance, and just loop over each result and keep dividing further to get every possible outcome.
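One way to sketch the "start from the biggest coins and work down" idea is a recursive generator that only ever extends a partial combination with a coin that still fits, so every combination it yields already sums to the target. This is an illustrative sketch rather than the exact procedure described above; coin_combinations and its helper extend are hypothetical names.

```python
def coin_combinations(amount, coins):
    """Yield each combination of coins (largest first) that sums to amount."""
    coins = sorted(coins, reverse=True)

    def extend(remaining, start, chosen):
        if remaining == 0:
            yield tuple(chosen)
            return
        # Only coins at or after `start` are allowed, so each combination is
        # produced once, in non-increasing order.
        for idx in range(start, len(coins)):
            coin = coins[idx]
            if coin <= remaining:
                chosen.append(coin)
                yield from extend(remaining - coin, idx, chosen)
                chosen.pop()

    yield from extend(amount, 0, [])


ps = [1, 2, 5, 10, 20, 50, 100]
print(next(coin_combinations(250, ps)))      # (100, 100, 50)
print(len(list(coin_combinations(5, ps))))   # 4
```

Because it is a generator, combinations can be consumed one at a time instead of first building a list of every 7-coin tuple and filtering it afterwards.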
I just took a Codility demo test. The question and my answer can be seen here, but I'll paste my answer here as well. My response:
def solution(A):
    # write your code in Python 2.7
    retresult = 1  # the smallest integer we can return, if it is not in the array
    A.sort()
    for i in A:
        if i > 0:
            if i==retresult: retresult += 1  # increment the result since the current result exists in the array
            elif i>retresult: break  # we can go out of the loop since we found a bigger number than our current positive integer result
    return retresult
My question is about time complexity, which I hope to understand better from your response. The question states that the expected worst-case time complexity is O(N).
Does my function have O(N) time complexity? Does the fact that I sort the array increase the complexity, and if so how?
Codility reports (for my answer)
Detected time complexity:
O(N) or O(N * log(N))
So, what is the complexity for my function? And if it is O(N*log(N)), what can I do to decrease the complexity to O(N) as the problem states?
Thanks very much!
p.s. my background reading on time complexity comes from this great post.
EDIT
Following the reply below, and the answers described here for this problem, I would like to expand on this with my take on the solutions:
basicSolution has an expensive time complexity and so is not the right answer for this Codility test:
def basicSolution(A):
    # O(N*log(N)) time complexity
    retresult = 1  # the smallest integer we can return, if it is not in the array
    A.sort()
    for i in A:
        if i > 0:
            if i==retresult: retresult += 1  # increment the result since the current result exists in the array
            elif i>retresult: break  # we can go out of the loop since we found a bigger number than our current positive integer result
        else:
            continue  # negative numbers and 0 don't need any work
    return retresult
hashSolution is my take on what is described in the above article, in the "use hashing" paragraph. As I am new to Python, please let me know if you have any improvements to this code (it does work though against my test cases), and what time complexity this has?
def hashSolution(A):
    # O(N) time complexity, I think? but requires O(N) extra space (the requirement states O(N) space)
    table = {}
    for i in A:
        if i > 0:
            table[i] = True  # collision/duplicate will just overwrite
    for i in range(1,100000+1):  # the problem says that the array has a maximum of 100,000 integers
        if not(table.get(i)): return i
    return 1  # default
Finally, the actual O(N) solution (O(n) time and O(1) extra space solution) I am having trouble understanding. I understand that negative/0 values are pushed to the back of the array, and then we have an array of just positive values. But I do not understand the findMissingPositive function - could anyone please describe this with Python code/comments? With an example perhaps? I've been trying to work through it in Python and just cannot figure it out :(
It does not, because you sort A.
The Python list.sort() function uses Timsort (named after Tim Peters), and has a worst-case time complexity of O(NlogN).
Rather than sort your input, you'll have to iterate over it and determine if any integers are missing by some other means. I'd use a set of a range() object:
def solution(A):
    expected = set(range(1, len(A) + 1))
    for i in A:
        expected.discard(i)
    if not expected:
        # all consecutive digits for len(A) were present, so next is missing
        return len(A) + 1
    return min(expected)
This is O(N); we create a set of len(A) (O(N) time), then we loop over A, removing elements from expected (again O(N) time, removing elements from a set is O(1)), then test for expected being empty (O(1) time), and finally get the smallest element in expected (at most O(N) time).
So we make at most 3 O(N) time steps in the above function, making it a O(N) solution.
This also fits the storage requirement; all we use is a set of size N. Sets carry some overhead per element, but the storage remains O(N).
The hash solution you found is based on the same principle, except that it uses a dictionary instead of a set. Note that the dictionary values are never actually used, they are either set to True or absent. I'd rewrite that as:
def hashSolution(A):
    seen = {i for i in A if i > 0}
    if not seen:
        # there were no positive values, so 1 is the first missing.
        return 1
    for i in range(1, 10**5 + 2):
        if i not in seen:
            return i
    # we can never get here because A holds at most 100,000 integers, so
    # `seen` has at most 100,000 positive values and at least one number in
    # 1..100,001 is always missing.
The above returns immediately, without entering the loop, if there were no positive integers in A.
The difference between mine and theirs is that mine starts with the set of expected numbers, while they start with the set of positive values from A, inverting the storage and test.
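For the O(1) extra space variant asked about in the question's EDIT, one common in-place technique is to move each value v in 1..N to index v-1 and then scan for the first slot that does not hold its own index plus one. The sketch below shows that idea; it is not necessarily the same as the findMissingPositive function in the linked article, and first_missing_positive is a hypothetical name. Note that it reorders the list it is given.

```python
def first_missing_positive(a):
    """Return the smallest positive integer not in a (reorders a in place)."""
    n = len(a)
    for i in range(n):
        # Keep swapping a[i] into its "home" slot a[i] - 1 until slot i holds
        # a value that is out of range or already sitting at its home.
        while 1 <= a[i] <= n and a[a[i] - 1] != a[i]:
            home = a[i] - 1
            a[i], a[home] = a[home], a[i]
    for i in range(n):
        if a[i] != i + 1:
            # Slot i should hold i + 1; if it does not, i + 1 is missing.
            return i + 1
    return n + 1


print(first_missing_positive([1, 3, 6, 4, 1, 2]))  # 5
print(first_missing_positive([-1, -3]))            # 1
```

Each swap puts at least one value into its final slot, so the total number of swaps is at most N and the whole function stays O(N) in time despite the nested while loop.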
I made an algorithm in Python for counting the number of ways of getting an amount of money with different coin denominations:
#measure
def countChange(n, coin_list):
    maxIndex = len(coin_list)
    def count(n, current_index):
        if n>0 and maxIndex>current_index:
            c = 0
            current = coin_list[current_index]
            max_coeff = int(n/current)
            for coeff in range(max_coeff+1):
                c+=count(n-coeff*current, current_index+1)
        elif n==0: return 1
        else: return 0
        return c
    return count(n, 0)
My algorithm uses an index to get a coin denomination and, as you can see, my index is increasing in each stack frame I get in. I realized that the algorithm could be written in this way also:
#measure
def countChange2(n, coin_list):
    maxIndex = len(coin_list)
    def count(n, current_index):
        if n>0 and 0<=current_index:
            c = 0
            current = coin_list[current_index]
            max_coeff = int(n/current)
            for coeff in range(max_coeff+1):
                c+=count(n-coeff*current, current_index-1)
        elif n==0: return 1
        else: return 0
        return c
    return count(n, maxIndex-1)
This time, the index is decreasing each stack frame I get in. I compared the execution time of the functions and I got a very noteworthy difference:
print(countChange(30, range(1, 31)))
print(countChange2(30, range(1, 31)))
>> Call to countChange took 0.9956174254208345 secods.
>> Call to countChange2 took 0.037631815734429974 secods.
Why is there a great difference in the execution times of the algorithms if I'm not even caching the results? Why does the increasing order of the index affect this execution time?
This doesn't really have anything to do with dynamic programming, as I understand it. Just reversing the indices shouldn't make something "dynamic".
What's happening is that the algorithm is input sensitive. Try feeding the input in reversed order. For example,
print(countChange(30, list(reversed(range(1, 31)))))
print(countChange2(30, list(reversed(range(1, 31)))))
Just as some sorting algorithms are extremely fast with already sorted data and very slow with reversed data, you've got that kind of algorithm here.
In the case where the input is increasing, countChange needs a lot more iterations to arrive at its final answer, and thus seems a lot slower. However, when the input is decreasing, the performance characteristics are reversed.
The number of combinations is not huge.
The reason is that going forward you have to explore every possibility, whereas going backwards you can eliminate large chunks of invalid solutions without actually having to calculate them.
Going forward, you call count 500k times.
Going backwards, your code only makes 30k calls to count.
You can make both of these faster by memoizing the calls (or by changing your algorithm so it does not make duplicate calls).
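A minimal sketch of the memoization idea, using functools.lru_cache from the standard library. It restructures the recursion slightly (use the current coin once more, or move on to the next coin) instead of looping over coefficients, so it is a variant of the question's functions rather than a drop-in patch; count_change_memo is a hypothetical name.

```python
from functools import lru_cache


def count_change_memo(n, coin_list):
    """Count the ways to make n from coin_list, caching repeated subproblems."""
    coins = tuple(coin_list)

    @lru_cache(maxsize=None)
    def count(remaining, index):
        if remaining == 0:
            return 1                      # exact change reached
        if remaining < 0 or index == len(coins):
            return 0                      # overshot, or no coins left to try
        # Either use coins[index] once more, or skip to the next coin.
        return count(remaining - coins[index], index) + count(remaining, index + 1)

    return count(n, 0)


print(count_change_memo(30, range(1, 31)))  # 5604
```

With the cache, each (remaining, index) pair is computed once, so the order of the denominations no longer matters much for the running time.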
Closed. This question is opinion-based. It is not currently accepting answers.
Closed 8 years ago.
Question: Given a list of unordered timestamps, find the largest span of time that overlaps
For example: [1,3],[10,15],[2,7],[11,13],[12,16],[5,8] => [1,8] and [10,16]
I was asked to solve the above question.
My initial approach was the following:
times = [[1,3],[10,15],[2,7],[11,13],[12,16],[5,8]]
import itertools
def flatten(listOfLists):
    return itertools.chain.from_iterable(listOfLists)
start = [i[0] for i in times]
end = [i[1] for i in times]
times = sorted(list(flatten(times)))
# 1=s, 2=s, 3=e, 5=s, 7=e, 8=e, 10=s, 11=s, 12=s, 13=e, 15=e, 16=e
num_of_e = 0
num_of_s = 0
first_s = 0
for time in times:
    if first_s == 0:
        first_s = time
    if time not in end:
        num_of_s += 1
    if time in end:
        num_of_e += 1
    if num_of_e == num_of_s:
        num_of_e = 0
        num_of_s = 0
        print([first_s, time])
        first_s = 0
Then the questioner insisted that I should solve it by ordering the times first because "it's better", so I did the following:
times = [[1,3],[10,15],[2,7],[11,13],[12,16],[5,8]]
def merge(a,b):
    return [min(a[0],b[0]), max(a[1],b[1])]
times.sort()
# [1,3] [2,7] [5,8] [10,15] [11,13] [12,16]
cur = []
for time in times:
    if not cur:
        cur = time
        continue
    if time[0] > cur[0] and time[0] < cur[1]:
        cur = merge(time,cur)
    else:
        print(cur)
        cur = time
print(cur)
Is there such thing as a "better" approach (or maybe another approach that could be better)? I know I could time it and see which one is faster or just evaluate based on big O notation (both O(N) for the actual work part).
Just wanted to see if you guys have any opinions on this?
Which one would you prefer and why?
Or maybe other ways to do it?
Here is a suggestion that avoids the cost of the time in end computation and the issues it has with specific cases:
times = [[1,3],[10,15],[2,7],[11,13],[12,16],[5,8]]
start = [(i[0], 0) for i in times]
end = [(i[1], 1) for i in times]
# Using 0 for start and 1 for end ensures that starts are resolved before ends
times = sorted(start + end)
span_count = 0
first_s = 0
for time, is_start in times:
    if first_s == 0:
        first_s = time
    if is_start == 0:
        span_count += 1
    else:
        span_count -= 1
    if span_count == 0:
        print([first_s, time])
        first_s = 0
Also, it has an easily computable complexity of O(n) (actual work) + O(n*log(n)) (sort) = O(n*log(n))
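The same event-counting idea can be wrapped in a function that returns the spans instead of printing them, and that uses None rather than 0 as the "no span open" marker so that a timestamp of 0 is handled too. This is a sketch of that variant; overlapping_spans is a hypothetical name.

```python
def overlapping_spans(intervals):
    """Return the merged spans covered by overlapping [start, end] intervals."""
    # Tag starts with 0 and ends with 1 so that, at equal times, starts sort
    # (and are processed) before ends.
    events = sorted(
        [(start, 0) for start, _ in intervals]
        + [(end, 1) for _, end in intervals]
    )
    spans = []
    open_count = 0       # number of intervals currently open
    span_start = None    # start of the span being built, if any
    for time, is_end in events:
        if is_end == 0:
            if open_count == 0:
                span_start = time
            open_count += 1
        else:
            open_count -= 1
            if open_count == 0:
                spans.append([span_start, time])
    return spans


times = [[1, 3], [10, 15], [2, 7], [11, 13], [12, 16], [5, 8]]
print(overlapping_spans(times))  # [[1, 8], [10, 16]]
```

The loop is linear in the number of events, so the sort remains the dominant O(n*log(n)) cost.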
Speed is often the most important consideration when evaluating an algorithm, but it may not be the only one. But let's look at speed first.
In this case, there are two kinds of speed to consider: asymptotic (which is what big Ω-Θ-O notation characterizes), and non-asymptotic. Even if two algorithms have the same asymptotic behavior, one may still perform considerably better than the other because of other costs in the algorithm that will be significant at smaller data sizes.
In your first algorithm you iterate through the list two times before sorting it, and then iterate through the list a third time after sorting it. In the second algorithm you only iterate through the list once. I would expect the second to be faster, but in Python, performance can sometimes be surprising, so it's good to measure if you need the speed.
You may also evaluate an algorithm's use of memory. Your first algorithm creates two temporary lists of start and end times, and a third temporary list holding the sorted time spans. Those could be expensive if the data set is large! The second algorithm avoids much of this, but creates a new list of length 2 each time merge is called. That could still be a significant amount of memory being allocated, and might be something to look at optimizing further. There may also be some memory use hidden behind the scenes: your use of sort, for example, may not in fact use much less memory than sorted does when you look at how it's implemented.
A final consideration when evaluating an algorithm is your audience. If you are in an interview, for example, speed and memory may not be as critical for your first attempt at implementing an algorithm as clarity and style.
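For comparison, here is a compact sketch of the sort-then-merge approach written as a function that returns its result. It assumes that intervals which merely touch (one ends exactly where the next starts) should be merged, which the question does not specify; merge_intervals is a hypothetical name.

```python
def merge_intervals(intervals):
    """Merge overlapping [start, end] intervals after sorting them by start."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the span being built: extend its end.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            # Gap before this interval: begin a new span.
            merged.append([start, end])
    return merged


times = [[1, 3], [10, 15], [2, 7], [11, 13], [12, 16], [5, 8]]
print(merge_intervals(times))  # [[1, 8], [10, 16]]
```

It extends the last span in place rather than building a new two-element list on every merge, which addresses the allocation concern raised above.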