def myFunction(mylist):
n = len(mylist)
p = []
sum = 0
for x in mylist:
if n > 100:
sum = sum + x
else:
for y in mylist:
p.append(y)
My thought process was that if the else statement were to be executed, the operations within are O(n) because the number of times through depends on the length of the list. Similarly, I understood the first loop to be O(n) as well thus making the entire worst-case complexity O(n^2).
Apparently the correct answer is O(n). Any explanation would be greatly appreciated :)
Just to add a bit, we typically think of Big-O complexity being in the case where n gets large. Thus, as n gets large, we won't execute the second statement. Thus it would just be O(n)
Related
I'm calculating time complexities of algorithms and I assumed both code below to have time complexities of O(n^2)
However my books says first code is O(n^2) and second one is O(n). I don't understand why. Both are using min/max, so whats the difference?
Code 1:
def sum(l, n):
for i in range(1, n - 1):
x = min(l[0:i])
y = min(l[i:num])
return x+y
Code 2:
def sum(a, n):
r = [0] * n
l = [0] * n
min_el = a[0]
for i in range(n):
min_el = min(min_el, a[i])
l[i] = min_el
print(min_el)
In the first block of code the block of code runs the min function over the whole array, which takes O(n) time. Now considering it is in a loop of length n then the total time is O(n^2)
Looking at the 2nd block of code. Note that the min function is only comparing 2 values, which is arguably O(1). Now considering that it is in a loop of length n. The total time is simply the summation of O(n+n+n), which is equal to O(n)
In the first code, it gives an array to the min() function, and this O(n) time complexity because it checks all elements in the array, in the second code, min() functions only compare two values and it takes O(1)
I have an n-element array. All elements except 4√n of them are sorted. We do not know the positions of these misplaced elements. What is the most efficient way of sorting this list?
Is there an O(n) way to do this?
Update 1:
time complexity of an insertion sort is O(n) for almost sorted data (is it true in worst case?)?
There is a fast general method for sorting almost sorted arrays:
Scan through the original array from start to end. If you find two items that are not ordered correctly, move them to a second array and remove them from the first array. Be careful; for example if you remove x2 and x3, then you need to check again that x1 ≤ x2. This is done in O(n) time. In your case, the new array is at most 8sqrt(n) in size.
Sort the second array, then merge both arrays. With the small number of items in the second array, any reasonable sorting algorithm will sort the small second array in O(n), and the merge takes O(n) again, so the total time is O(n).
If you use a O(n log n) algorithm to sort the second array, then sorting is O(n) as long as the number of items in the wrong position is at most O (n / log n).
No, insertion sort isn't O(n) on that. Worst case is when it's the last 4√n elements that are misplaced, and they're so small that they belong at the front of the array. It'll take insertion sort Θ(n √n) to move them there.
Here's a Python implementation of gnasher729's answer that's O(n) time and O(n) space on such near-sorted inputs. We can't naively "remove" pairs from the array, though, that would be inefficient. Instead, I move correctly sorted values into a good list and the misordered pairs into a bad list. So as long as the numbers are increasing, they're just added to good. But if the next number x is smaller than the last good number good[-1], then they're both moved to bad. When I'm done, I concatenate good and bad and let Python's Timsort do the rest. It detects the already sorted run good in O(n - √n) time, then sorts the bad part in O(√n log √n) time, and finally merges the two sorted parts in O(n) time.
def sort1(a):
good, bad = [], []
for x in a:
if good and x < good[-1]:
bad += x, good.pop()
else:
good += x,
a[:] = sorted(good + bad)
Next is a space-improved version that takes O(n) time and only O(√n) space. Instead of storing the good part in an extra list, I store it in a[:good]:
def sort2(a):
good, bad = 0, []
for x in a:
if good and x < a[good-1]:
bad += x, a[good-1]
good -= 1
else:
a[good] = x
good += 1
a[good:] = bad
a.sort()
And here's another O(n) time and O(√n) space variation where I let Python sort bad for me, but then merge the good part with the bad part myself, from right to left. So this doesn't rely on Timsort's sorted-run detection and is thus easily ported to other languages:
def sort3(a):
good, bad = 0, []
for x in a:
if good and x < a[good-1]:
bad += x, a[good-1]
good -= 1
else:
a[good] = x
good += 1
bad.sort()
i = len(a)
while bad:
i -= 1
if good and a[good-1] > bad[-1]:
good -= 1
a[i] = a[good]
else:
a[i] = bad.pop()
Finally, some test code:
from random import random, sample
from math import isqrt
def sort1(a):
...
def sort2(a):
...
def sort3(a):
...
def fake(a):
"""Intentionally do nothing, to show that the test works."""
def main():
n = 10**6
a = [random() for _ in range(n)]
a.sort()
for i in sample(range(n), 4 * isqrt(n)):
a[i] = random()
for sort in sort1, sort2, sort3, fake:
copy = a.copy()
sort(copy)
print(sort.__name__, copy == sorted(a))
if __name__ == '__main__':
main()
Output, shows that both solutions passed the test (and that the test works, detecting fake as incorrect):
sort1 True
sort2 True
sort3 True
fake False
Fun fact: For Timsort alone (i.e., not used as part of the above algorithms), the worst case I mentioned above is rather a best case: It would sort that in O(n) time. Just like in my first version's sorted(good + bad), it'd recognize the prefix of n-√n sorted elements in O(n - √n) time, sort the √n last elements in O(√n log √n) time, and then merge the two sorted parts in O(n) time.
So can we just let Timsort do the whole thing? Is it O(n) on all such near-sorted inputs? No, it's not. If the 4√n misplaced elements are evenly spread over the array, then we have up to 4√n sorted runs and Timsort will take O(n log(4√n)) = O(n log n) time to merge them.
from linkedlist import LinkedList
def find_max(linked_list): # Complexity: O(N)
current = linked_list.get_head_node()
maximum = current.get_value()
while current.get_next_node():
current = current.get_next_node()
val = current.get_value()
if val > maximum:
maximum = val
return maximum
def sort_linked_list(linked_list): # <----- WHAT IS THE COMPLEXITY OF THIS FUNCTION?
print("\n---------------------------")
print("The original linked list is:\n{0}".format(linked_list.stringify_list()))
new_linked_list = LinkedList()
while linked_list.head_node:
max_value = find_max(linked_list)
print(max_value)
new_linked_list.insert_beginning(max_value)
linked_list.remove_node(max_value)
return new_linked_list
Since we loop through the while loop N times, the runtime is at least N. For each loop we call find_max, HOWEVER, for each call to find_max, the linked_list we are parsing to the find_max is reduced by one element. Based on that, isn't the runtime N log N?
Or is it N^2?
It's still O(n²); the reduction in size by 1 each time just makes the effective work n * n / 2 (because on average, you have to deal with half the original length on each pass, and you're still doing n passes). But since constant factors aren't included in big-O notation, that simplifies to just O(n²).
For it to be O(n log n), each step would have to halve the size of the list to scan, not simply reduce it by one.
It's n + n-1 + n-2 + ... + 1 which is arithmetic sequence so it is n(n+1)/2. So in big O notation it is O(n^2).
Don't forget, O-notation deals in terms of worst-case complexity, and describes an entire class of functions. As far as O-notation goes, the following two functions are the same complexity:
64x^2 + 128x + 256 --> O(n^2)
x^2 - 2x + 1 --> O(n^2)
In your case (and your algorithm what's called a selection sort, picking the best element in the list and putting it in the new list; other O(n^2) sorts include insertion sort and bubble sort), you have the following complexities:
0th iteration: n
1st iteration: n-1
2nd iteration: n-2
...
nth iteration: 1
So the entire complexity would be
n + (n-1) + (n-2) + ... + 1 = n(n+1)/2 = 1/2n^2 + 1/2n
which is still O(n^2), though it'd be on the low side of that class.
I'm reordering the entries of an array so that the even ones (divisible by 2) appear first. The code snippet is as follows:
def even_odd(A):
next_even , next_odd = 0, len(A) - 1
while next_even < next_odd:
if A[next_even] % 2 == 0:
next_even += 1
else:
A[next_even], A[next_odd] = A[next_odd], A[next_even]
next_odd -= 1
The time complexity is given as O(N) which I'm guessing is because I'm going through the whole array? But how's the space complexity O(1)?
You use a fixed amount of space to reorder the list. That, by definition, is O(1). The fact that you're dealing with the length of A does not count against your function's space usage: A was already allocated by the problem definition. All you have added to that is two integers, next_even and next_odd: they are your O(1).
UPDATE per OP comment
The size of A does not "count against" your space complexity, as your algorithm uses the space already provided by the calling program. You haven't added any.
Sorry; I didn't realize you had an open question about the time complexity. Yes, your guess is correct: you go through the while loop N-1 times (N = len(A) ); each iteration takes no more than a constant amount of time. Therefore, the time complexity is bounded by N.
I guess the meaning here that the "additional" memory required for the reordering is o(1), excluded the original array.
I have this, and it works:
# E. Given two lists sorted in increasing order, create and return a merged
# list of all the elements in sorted order. You may modify the passed in lists.
# Ideally, the solution should work in "linear" time, making a single
# pass of both lists.
def linear_merge(list1, list2):
finalList = []
for item in list1:
finalList.append(item)
for item in list2:
finalList.append(item)
finalList.sort()
return finalList
# +++your code here+++
return
But, I'd really like to learn this stuff well. :) What does 'linear' time mean?
Linear means O(n) in Big O notation, while your code uses a sort() which is most likely O(nlogn).
The question is asking for the standard merge algorithm. A simple Python implementation would be:
def merge(l, m):
result = []
i = j = 0
total = len(l) + len(m)
while len(result) != total:
if len(l) == i:
result += m[j:]
break
elif len(m) == j:
result += l[i:]
break
elif l[i] < m[j]:
result.append(l[i])
i += 1
else:
result.append(m[j])
j += 1
return result
>>> merge([1,2,6,7], [1,3,5,9])
[1, 1, 2, 3, 5, 6, 7, 9]
Linear time means that the time taken is bounded by some undefined constant times (in this context) the number of items in the two lists you want to merge. Your approach doesn't achieve this - it takes O(n log n) time.
When specifying how long an algorithm takes in terms of the problem size, we ignore details like how fast the machine is, which basically means we ignore all the constant terms. We use "asymptotic notation" for that. These basically describe the shape of the curve you would plot in a graph of problem size in x against time taken in y. The logic is that a bad curve (one that gets steeper quickly) will always lead to a slower execution time if the problem is big enough. It may be faster on a very small problem (depending on the constants, which probably depends on the machine) but for small problems the execution time isn't generally a big issue anyway.
The "big O" specifies an upper bound on execution time. There are related notations for average execution time and lower bounds, but "big O" is the one that gets all the attention.
O(1) is constant time - the problem size doesn't matter.
O(log n) is a quite shallow curve - the time increases a bit as the problem gets bigger.
O(n) is linear time - each unit increase means it takes a roughly constant amount of extra time. The graph is (roughly) a straight line.
O(n log n) curves upwards more steeply as the problem gets more complex, but not by very much. This is the best that a general-purpose sorting algorithm can do.
O(n squared) curves upwards a lot more steeply as the problem gets more complex. This is typical for slower sorting algorithms like bubble sort.
The nastiest algorithms are classified as "np-hard" or "np-complete" where the "np" means "non-polynomial" - the curve gets steeper quicker than any polynomial. Exponential time is bad, but some are even worse. These kinds of things are still done, but only for very small problems.
EDIT the last paragraph is wrong, as indicated by the comment. I do have some holes in my algorithm theory, and clearly it's time I checked the things I thought I had figured out. In the mean time, I'm not quite sure how to correct that paragraph, so just be warned.
For your merging problem, consider that your two input lists are already sorted. The smallest item from your output must be the smallest item from one of your inputs. Get the first item from both and compare the two, and put the smallest in your output. Put the largest back where it came from. You have done a constant amount of work and you have handled one item. Repeat until both lists are exhausted.
Some details... First, putting the item back in the list just to pull it back out again is obviously silly, but it makes the explanation easier. Next - one input list will be exhausted before the other, so you need to cope with that (basically just empty out the rest of the other list and add it to the output). Finally - you don't actually have to remove items from the input lists - again, that's just the explanation. You can just step through them.
Linear time means that the runtime of the program is proportional to the length of the input. In this case the input consists of two lists. If the lists are twice as long, then the program will run approximately twice as long. Technically, we say that the algorithm should be O(n), where n is the size of the input (in this case the length of the two input lists combined).
This appears to be homework, so I will no supply you with an answer. Even though this is not homework, I am of the opinion that you will be best served by taking a pen and a piece of paper, construct two smallish example lists which are sorted, and figure out how you would merge those two lists, by hand. Once you figured that out, implementing the algorithm is a piece of cake.
(If all goes well, you will notice that you need to iterate over each list only once, in a single direction. That means that the algorithm is indeed linear. Good luck!)
If you build the result in reverse sorted order, you can use pop() and still be O(N)
pop() from the right end of the list does not require shifting the elements, so is O(1)
Reversing the list before we return it is O(N)
>>> def merge(l, r):
... result = []
... while l and r:
... if l[-1] > r[-1]:
... result.append(l.pop())
... else:
... result.append(r.pop())
... result+=(l+r)[::-1]
... result.reverse()
... return result
...
>>> merge([1,2,6,7], [1,3,5,9])
[1, 1, 2, 3, 5, 6, 7, 9]
This thread contains various implementations of a linear-time merge algorithm. Note that for practical purposes, you would use heapq.merge.
Linear time means O(n) complexity. You can read something about algorithmn comlexity and big-O notation here: http://en.wikipedia.org/wiki/Big_O_notation .
You should try to combine those lists not after getting them in the finalList, try to merge them gradually - adding an element, assuring the result is sorted, then add next element... this should give you some ideas.
A simpler version which will require equal sized lists:
def merge_sort(L1, L2):
res = []
for i in range(len(L1)):
if(L1[i]<L2[i]):
first = L1[i]
secound = L2[i]
else:
first = L2[i]
secound = L1[i]
res.extend([first,secound])
return res
itertoolz provides an efficient implementation to merge two sorted lists
https://toolz.readthedocs.io/en/latest/_modules/toolz/itertoolz.html#merge_sorted
'Linear time' means that time is an O(n) function, where n - the number of items input (items in the lists).
f(n) = O(n) means that that there exist constants x and y such that x * n <= f(n) <= y * n.
def linear_merge(list1, list2):
finalList = []
i = 0
j = 0
while i < len(list1):
if j < len(list2):
if list1[i] < list2[j]:
finalList.append(list1[i])
i += 1
else:
finalList.append(list2[j])
j += 1
else:
finalList.append(list1[i])
i += 1
while j < len(list2):
finalList.append(list2[j])
j += 1
return finalList