Related
Could anyone explain exactly what's happening under the hood to make the recursive approach in the following problem much faster and efficient in terms of time complexity?
The problem: Write a program that would take an array of integers as input and return the largest three numbers sorted in an array, without sorting the original (input) array.
For example:
Input: [22, 5, 3, 1, 8, 2]
Output: [5, 8, 22]
Even though we can simply sort the original array and return the last three elements, that would take at least O(nlog(n)) time as the fastest sorting algorithm would do just that. So the challenge is to perform better and complete the task in O(n) time.
So I was able to come up with a recursive solution:
def findThreeLargestNumbers(array, largest=[]):
if len(largest) == 3:
return largest
max = array[0]
for i in array:
if i > max:
max = i
array.remove(max)
largest.insert(0, max)
return findThreeLargestNumbers(array, largest)
In which I kept finding the largest number, removing it from the original array, appending it to my empty array, and recursively calling the function again until there are three elements in my array.
However, when I looked at the suggested iterative method, I composed this code:
def findThreeLargestNumbers(array):
sortedLargest = [None, None, None]
for num in array:
check(num, sortedLargest)
return sortedLargest
def check(num, sortedLargest):
for i in reversed(range(len(sortedLargest))):
if sortedLargest[i] is None:
sortedLargest[i] = num
return
if num > sortedLargest[i]:
shift(sortedLargest, i, num)
return
def shift(array, idx, element):
if idx == 0:
array[0] = element
return array
array[0] = array[1]
array[idx-1] = array[idx]
array[idx] = element
return array
Both codes passed successfully all the tests and I was convinced that the iterative approach is faster (even though not as clean..). However, I imported the time module and put the codes to the test by providing an array of one million random integers and calculating how long each solution would take to return back the sorted array of the largest three numbers.
The recursive approach was way much faster (about 9 times faster) than the iterative approach!
Why is that? Even though the recursive approach is traversing the huge array three times and, on top of that, every time it removes an element (which takes O(n) time as all other 999 elements would need to be shifted in the memory), whereas the iterative approach is traversing the input array only once and yes making some operations at every iteration but with a very negligible array of size 3 that wouldn't even take time at all!
I really want to be able to judge and pick the most efficient algorithm for any given problem so any explanation would tremendously help.
Advice for optimization.
Avoid function calls. Avoid creating temporary garbage. Avoid extra comparisons. Have logic that looks at elements as little as possible. Walk through how your code works by hand and look at how many steps it takes.
Your recursive code makes only 3 function calls, and as pointed out elsewhere does an average of 1.5 comparisons per call. (1 while looking for the min, 0.5 while figuring out where to remove the element.)
Your iterative code makes lots of comparisons per element, calls excess functions, and makes calls to things like sorted that create/destroy junk.
Now compare with this iterative solution:
def find_largest(array, limit=3):
if len(array) <= limit:
# Special logic not needed.
return sorted(array)
else:
# Initialize the answer to values that will be replaced.
min_val = min(array[0:limit])
answer = [min_val for _ in range(limit)]
# Now scan for smallest.
for i in array:
if answer[0] < i:
# Sift elements down until we find the right spot.
j = 1
while j < limit and answer[j] < i:
answer[j-1] = answer[j]
j = j+1
# Now insert.
answer[j-1] = i
return answer
There are no function calls. It is possible that you can make up to 6 comparisons per element (verify that answer[0] < i, verify that (j=1) < 3, verify that answer[1] < i, verify that (j=2) < 3, verify that answer[2] < i, then find that (j=3) < 3 is not true). You will hit that worst case if array is sorted. But most of the time you only do the first comparison then move to the next element. No muss, no fuss.
How does it benchmark?
Note that if you wanted the smallest 100 elements, then you'd find it worthwhile to use a smarter data structure such as a heap to avoid the bubble sort.
I am not really confortable with python, but I have a different approach to the problem for what it's worth.
As far as I saw, all solutions posted are O(NM) where N is the length of the array and M the length of the largest elements array.
Because of your specific situation whereN >> M you could say it's O(N), but the longest the inputs the more it will be O(NM)
I agree with #zvone that it seems you have more steps in the iterative solution, which sounds like an valid explanation to your different computing speeds.
Back to my proposal, implements binary search O(N*logM) with recursion:
import math
def binarySearch(arr, target, origin = 0):
"""
Recursive binary search
Args:
arr (list): List of numbers to search in
target (int): Number to search with
Returns:
int: index + 1 from inmmediate lower element to target in arr or -1 if already present or lower than the lowest in arr
"""
half = math.floor((len(arr) - 1) / 2);
if target > arr[-1]:
return origin + len(arr)
if len(arr) == 1 or target < arr[0]:
return -1
if arr[half] < target and arr[half+1] > target:
return origin + half + 1
if arr[half] == target or arr[half+1] == target:
return -1
if arr[half] < target:
return binarySearch(arr[half:], target, origin + half)
if arr[half] > target:
return binarySearch(arr[:half + 1], target, origin)
def findLargestNumbers(array, limit = 3, result = []):
"""
Recursive linear search of the largest values in an array
Args:
array (list): Array of numbers to search in
limit (int): Length of array returned. Default: 3
Returns:
list: Array of max values with length as limit
"""
if len(result) == 0:
result = [float('-inf')] * limit
if len(array) < 1:
return result
val = array[-1]
foundIndex = binarySearch(result, val)
if foundIndex != -1:
result.insert(foundIndex, val)
return findLargestNumbers(array[:-1],limit, result[1:])
return findLargestNumbers(array[:-1], limit,result)
It is quite flexible and might be inspiration for a more elaborated answer.
The recursive solution
The recursive function goes through the list 3 times to fins the largest number and removes the largest number from the list 3 times.
for i in array:
if i > max:
...
and
array.remove(max)
So, you have 3×N comparisons, plus 3x removal. I guess the removal is optimized in C, but there is again about 3×(N/2) comparisons to find the item to be removed.
So, a total of approximately 4.5 × N comparisons.
The other solution
The other solution goes through the list only once, but each time it compares to the three elements in sortedLargest:
for i in reversed(range(len(sortedLargest))):
...
and almost each time it sorts the sortedLargest with these three assignments:
array[0] = array[1]
array[idx-1] = array[idx]
array[idx] = element
So, you are N times:
calling check
creating and reversing a range(3)
accessing sortedLargest[i]
comparing num > sortedLargest[i]
calling shift
comparing idx == 0
and about 2×N/3 times doing:
array[0] = array[1]
array[idx-1] = array[idx]
array[idx] = element
and N/3 times array[0] = element
It is difficult to count, but that is much more than 4.5×N comparisons.
Here is my code - a bubble sort algorithm for sorting list elements in asc order:
foo = [7, 0, 3, 4, -1]
cnt = 0
for i in foo:
for i in range(len(foo)-1):
if foo[cnt] > foo[cnt + 1]:
temp = foo[cnt]
c[cnt] = c[cnt + 1]
c[cnt + 1] = temp
cnt = cnt + 1
cnt = 0
I've been revising my code, but it is still too inefficient for an online judge. Some help would be greatly appreciated!
Early Exit BubbleSort
The first loop has no bearing on what happens inside
The second loop does all the heavy lifting. You can get rid of count by using enumerate
To swap elements, use the pythonic swap - a, b = b, a.
As per this comment, make use of an early exit. If there are no swaps to be made at any point in the inner loop, that means the list is sorted, and no further iteration is necessary. This is the intuition behind changed.
By definition, after the ith iteration of the outer loop, the last i elements will have been sorted, so you can further reduce the constant factor associated with the algorithm.
foo = [7, 0, 3, 4, -1]
for i in range(len(foo)):
changed = False
for j, x in enumerate(foo[:-i-1]):
if x > foo[j + 1]:
foo[j], foo[j + 1] = foo[j + 1], foo[j]
changed = True
if not changed:
break
print(foo)
[-1, 0, 3, 4, 7]
Note that none of these optimisations change the asymptotic (Big-O) complexity of BubbleSort (which remains O(N ** 2)), instead, only reduces the constant factors associated.
One easy optimization is to start second loop from i+1 index:
for i in range(0, len(foo)):
for j in range(i+1, len(foo)):
if (foo[i] > foo[j]):
temp = foo[i]
foo[i] = foo[j]
foo[j] = temp
Since you already sorted everything up to index i there is no need to iterate over it again. This can save you more than 50% of comparisons - in this case it's 10 versus 25 in your original algorithm.
You need to understand the big Oh notation in order to understand how efficient your algorithm is in terms of usage of computational resources independent of computer architecture or clock rate. It basically helps you analyze the worst case running time or memory usage of your algorithm as the size of the input increases.
In summary, the running time of your algorithm will fall into one of these categories (from fastest to slowest);
O(1): Constant time. Pronounced (Oh of 1). The fastest time.
O(lg n): Logarithmic time. Pronounced (Oh of log n). Faster than linear time.
Traditionally, it is the fastest time bound for search.
O(n): Linear time. Pronounced (Oh of n, n is the size of your input e.g size of
an array). Usually something when you need to examine every single bit of
your input.
O(nlgn): The fastest time we can currently achieve when performing a sort on a
list of elements.
O(n**2): Oh of n squared. Quadratic time. Often this is the bound when we have
nested loops.
O(2**n): Really, REALLY big! A number raised to the power of n is slower than
n raised to any power.
In your case, you are using nested loops which is O(n2). The code i have written uses a single while loop and has a growth complexity of O(n) which is faster than O(n2). I haven't really tried it on a very large array but in your case it seems to work. Try it and let me know if it works as expected.
k = [7, 0, 3, 4, -1]
n = len(k)
i = 0
count = 0
while count < n**2: # assuming we wouldn't go through the loop more than n squared times
if i == n - 1:
i = 0
count += 1
swapped = False
elif k[i] > k[i+1]:
temp = k[i]
k[i] = k[i+1]
k[i+1] = temp
i+=1
swapped = True
elif swapped == False:
i += 1
elif swapped == True and i < n - 1:
i += 1
Note: In the example list (k), we only need to loop through the list three times in order for it to be sorted in ascending order. So if you change the while loop to this line of code while count < 4:, it would still work.
I know merge sort is the best way to sort a list of arbitrary length, but I am wondering how to optimize my current method.
def sortList(l):
'''
Recursively sorts an arbitrary list, l, to increasing order.
'''
#base case.
if len(l) == 0 or len(l) == 1:
return l
oldNum = l[0]
newL = sortList(l[1:]) #recursive call.
#if oldNum is the smallest number, add it to the beginning.
if oldNum <= newL[0]:
return [oldNum] + newL
#find where oldNum goes.
for n in xrange(len(newL)):
if oldNum >= newL[n]:
try:
if oldNum <= newL[n+1]:
return newL[:n+1] + [oldNum] + newL[n+1:]
#if index n+1 is non-existant, oldNum must be the largest number.
except IndexError:
return newL + [oldNum]
What is the complexity of this function? I was thinking O(n^2) but I wasn't sure. Also, is there anyway to further optimize this procedure? (besides ditching it and going for merge sort!).
There's a few places I'd optimize your code.
You do a lot of list copies: each time you slice, you create a new copy of the list. That can be avoided by adding an index to the function declaration that indicates where in the array to start sorting from.
You should follow PEP 8 for naming: sort_list rather than sortList.
The code that does the insertion is a bit weird; intentionally raising an out-of-bounds index exception isn't normal programming practice. Instead, just percolate the value up the array until it's in the right place.
Applying these changes gives this code:
def sort_list(l, i=0):
if i == len(l): return
sort_list(l, i+1)
for j in xrange(i+1, len(l)):
if l[j-1] <= l[j]: return
l[j-1], l[j] = l[j], l[j-1]
This now sorts the array in-place, so there's no return value.
Here's some simple tests:
cases = [
[1, 2, 0, 3, 4, 5],
[0, 1, 2, 3, 4, 5],
[5, 4, 3, 2, 1, 1]
]
for c in cases:
got = c[:]
sort_list(got)
if sorted(c) != got:
print "sort_list(%s) = %s, want %s" % (c, got, sorted(c))
The time complexity is, as you suggest, O(n^2) where n is the length of the list. My version uses O(n) additional memory, whereas yours, because of the way the list gets copied at each stage, uses O(n^2).
One more step, which further improves the memory usage is to eliminate the recursion. Here's a version that does that:
def sort_list(l):
for i in xrange(len(l)-2, -1, -1):
for j in xrange(i+1, len(l)):
if l[j-1] <= l[j]: break
l[j-1], l[j] = l[j], l[j-1]
This works just the same as the recursive version, but does it iteratively; first sorting the last two elements in the array, then the last three, then the last four, and so on until the whole array is sorted.
This still has runtime complexity O(n^2), but now uses O(1) additional memory. Also, avoiding recursion means you can sort longer lists without hitting the notoriously low recursion limit in Python. And another benefit is that this code is now O(n) in the best case (when the array is already sorted).
A young Euler came up with a formula that seems appropriate here. The story goes that in grade school his teacher was very tired and to keep the class busy for a while they were told to add up all the numbers zero to one hundred. Young Euler came back with this:
This is applicable here because your run-time is going to be proportional to the sum of all the numbers up to the length of your list because in the worst case your function will be sorting an already sorted list and will go through the entire length newL each time to find the position of the next element at the end of the list.
I'm trying to solve this programming riddle and although the solution (see code below) works correctly, it is too slow for succesful submission.
Any pointers as how to make this run
faster (removal of every n-th element from a list)?
Or suggestions for a better algorithm to calculate the same; seems I can't think of anything
else than brute-force for now...
Basically, the task at hand is:
GIVEN:
L = [2,3,4,5,6,7,8,9,10,11,........]
1. Take the first remaining item in list L (in the general case 'n'). Move it to
the 'lucky number list'. Then drop every 'n-th' item from the list.
2. Repeat 1
TASK:
Calculate the n-th number from the 'lucky number list' ( 1 <= n <= 3000)
My original code (it calculated the 3000 first lucky numbers in about a second on my machine - unfortunately too slow):
"""
SPOJ Problem Set (classical) 1798. Assistance Required
URL: http://www.spoj.pl/problems/ASSIST/
"""
sieve = range(3, 33900, 2)
luckynumbers = [2]
while True:
wanted_n = input()
if wanted_n == 0:
break
while len(luckynumbers) < wanted_n:
item = sieve[0]
luckynumbers.append(item)
items_to_delete = set(sieve[::item])
sieve = filter(lambda x: x not in items_to_delete, sieve)
print luckynumbers[wanted_n-1]
EDIT: thanks to the terrific contributions of Mark Dickinson, Steve Jessop and gnibbler, I got at the following, which is quite a whole lot faster than my original code (and succesfully got submitted at http://www.spoj.pl with 0.58 seconds!)...
sieve = range(3, 33810, 2)
luckynumbers = [2]
while len(luckynumbers) < 3000:
if len(sieve) < sieve[0]:
luckynumbers.extend(sieve)
break
luckynumbers.append(sieve[0])
del sieve[::sieve[0]]
while True:
wanted_n = input()
if wanted_n == 0:
break
else:
print luckynumbers[wanted_n-1]
This series is called ludic numbers
__delslice__ should be faster than __setslice__+filter
>>> L=[2,3,4,5,6,7,8,9,10,11,12]
>>> lucky=[]
>>> lucky.append(L[0])
>>> del L[::L[0]]
>>> L
[3, 5, 7, 9, 11]
>>> lucky.append(L[0])
>>> del L[::L[0]]
>>> L
[5, 7, 11]
So the loop becomes.
while len(luckynumbers) < 3000:
item = sieve[0]
luckynumbers.append(item)
del sieve[::item]
Which runs in less than 0.1 second
Try using these two lines for the deletion and filtering, instead of what you have; filter(None, ...) runs considerably faster than the filter(lambda ...).
sieve[::item] = [0]*-(-len(sieve)//item)
sieve = filter(None, sieve)
Edit: much better to simply use del sieve[::item]; see gnibbler's solution.
You might also be able to find a better termination condition for the while loop: for example, if the first remaining item in the sieve is i then the first i elements of the sieve will become the next i lucky numbers; so if len(luckynumbers) + sieve[0] >= wanted_n you should already have computed the number you need---you just need to figure out where in sieve it is so that you can extract it.
On my machine, the following version of your inner loop runs around 15 times faster than your original for finding the 3000th lucky number:
while len(luckynumbers) + sieve[0] < wanted_n:
item = sieve[0]
luckynumbers.append(item)
sieve[::item] = [0]*-(-len(sieve)//item)
sieve = filter(None, sieve)
print (luckynumbers + sieve)[wanted_n-1]
An explanation on how to solve this problem can be found here. (The problem I linked to asks for more, but the main step in that problem is the same as the one you're trying to solve.) The site I linked to also contains a sample solution in C++.
The set of numbers can be represented in a binary tree, which supports the following operations:
Return the nth element
Erase the nth element
These operations can be implemented to run in O(log n) time, where n is the number of nodes in the tree.
To build the tree, you can either make a custom routine that builds the tree from a given array of elements, or implement an insert operation (make sure to keep the tree balanced).
Each node in the tree need the following information:
Pointers to the left and right children
How many items there are in the left and right subtrees
With such a structure in place, solving the rest of the problem should be fairly straightforward.
I also recommend calculating the answers for all possible input values before reading any input, instead of calculating the answer for each input line.
A Java implementation of the above algorithm gets accepted in 0.68 seconds at the website you linked.
(Sorry for not providing any Python-specific help, but hopefully the algorithm outlined above will be fast enough.)
You're better off using an array and zeroing out every Nth item using that strategy; after you do this a few times in a row, the updates start getting tricky so you'd want to re-form the array. This should improve the speed by at least a factor of 10. Do you need vastly better than that?
Why not just create a new list?
L = [x for (i, x) in enumerate(L) if i % n]
I have this, and it works:
# E. Given two lists sorted in increasing order, create and return a merged
# list of all the elements in sorted order. You may modify the passed in lists.
# Ideally, the solution should work in "linear" time, making a single
# pass of both lists.
def linear_merge(list1, list2):
finalList = []
for item in list1:
finalList.append(item)
for item in list2:
finalList.append(item)
finalList.sort()
return finalList
# +++your code here+++
return
But, I'd really like to learn this stuff well. :) What does 'linear' time mean?
Linear means O(n) in Big O notation, while your code uses a sort() which is most likely O(nlogn).
The question is asking for the standard merge algorithm. A simple Python implementation would be:
def merge(l, m):
result = []
i = j = 0
total = len(l) + len(m)
while len(result) != total:
if len(l) == i:
result += m[j:]
break
elif len(m) == j:
result += l[i:]
break
elif l[i] < m[j]:
result.append(l[i])
i += 1
else:
result.append(m[j])
j += 1
return result
>>> merge([1,2,6,7], [1,3,5,9])
[1, 1, 2, 3, 5, 6, 7, 9]
Linear time means that the time taken is bounded by some undefined constant times (in this context) the number of items in the two lists you want to merge. Your approach doesn't achieve this - it takes O(n log n) time.
When specifying how long an algorithm takes in terms of the problem size, we ignore details like how fast the machine is, which basically means we ignore all the constant terms. We use "asymptotic notation" for that. These basically describe the shape of the curve you would plot in a graph of problem size in x against time taken in y. The logic is that a bad curve (one that gets steeper quickly) will always lead to a slower execution time if the problem is big enough. It may be faster on a very small problem (depending on the constants, which probably depends on the machine) but for small problems the execution time isn't generally a big issue anyway.
The "big O" specifies an upper bound on execution time. There are related notations for average execution time and lower bounds, but "big O" is the one that gets all the attention.
O(1) is constant time - the problem size doesn't matter.
O(log n) is a quite shallow curve - the time increases a bit as the problem gets bigger.
O(n) is linear time - each unit increase means it takes a roughly constant amount of extra time. The graph is (roughly) a straight line.
O(n log n) curves upwards more steeply as the problem gets more complex, but not by very much. This is the best that a general-purpose sorting algorithm can do.
O(n squared) curves upwards a lot more steeply as the problem gets more complex. This is typical for slower sorting algorithms like bubble sort.
The nastiest algorithms are classified as "np-hard" or "np-complete" where the "np" means "non-polynomial" - the curve gets steeper quicker than any polynomial. Exponential time is bad, but some are even worse. These kinds of things are still done, but only for very small problems.
EDIT the last paragraph is wrong, as indicated by the comment. I do have some holes in my algorithm theory, and clearly it's time I checked the things I thought I had figured out. In the mean time, I'm not quite sure how to correct that paragraph, so just be warned.
For your merging problem, consider that your two input lists are already sorted. The smallest item from your output must be the smallest item from one of your inputs. Get the first item from both and compare the two, and put the smallest in your output. Put the largest back where it came from. You have done a constant amount of work and you have handled one item. Repeat until both lists are exhausted.
Some details... First, putting the item back in the list just to pull it back out again is obviously silly, but it makes the explanation easier. Next - one input list will be exhausted before the other, so you need to cope with that (basically just empty out the rest of the other list and add it to the output). Finally - you don't actually have to remove items from the input lists - again, that's just the explanation. You can just step through them.
Linear time means that the runtime of the program is proportional to the length of the input. In this case the input consists of two lists. If the lists are twice as long, then the program will run approximately twice as long. Technically, we say that the algorithm should be O(n), where n is the size of the input (in this case the length of the two input lists combined).
This appears to be homework, so I will no supply you with an answer. Even though this is not homework, I am of the opinion that you will be best served by taking a pen and a piece of paper, construct two smallish example lists which are sorted, and figure out how you would merge those two lists, by hand. Once you figured that out, implementing the algorithm is a piece of cake.
(If all goes well, you will notice that you need to iterate over each list only once, in a single direction. That means that the algorithm is indeed linear. Good luck!)
If you build the result in reverse sorted order, you can use pop() and still be O(N)
pop() from the right end of the list does not require shifting the elements, so is O(1)
Reversing the list before we return it is O(N)
>>> def merge(l, r):
... result = []
... while l and r:
... if l[-1] > r[-1]:
... result.append(l.pop())
... else:
... result.append(r.pop())
... result+=(l+r)[::-1]
... result.reverse()
... return result
...
>>> merge([1,2,6,7], [1,3,5,9])
[1, 1, 2, 3, 5, 6, 7, 9]
This thread contains various implementations of a linear-time merge algorithm. Note that for practical purposes, you would use heapq.merge.
Linear time means O(n) complexity. You can read something about algorithmn comlexity and big-O notation here: http://en.wikipedia.org/wiki/Big_O_notation .
You should try to combine those lists not after getting them in the finalList, try to merge them gradually - adding an element, assuring the result is sorted, then add next element... this should give you some ideas.
A simpler version which will require equal sized lists:
def merge_sort(L1, L2):
res = []
for i in range(len(L1)):
if(L1[i]<L2[i]):
first = L1[i]
secound = L2[i]
else:
first = L2[i]
secound = L1[i]
res.extend([first,secound])
return res
itertoolz provides an efficient implementation to merge two sorted lists
https://toolz.readthedocs.io/en/latest/_modules/toolz/itertoolz.html#merge_sorted
'Linear time' means that time is an O(n) function, where n - the number of items input (items in the lists).
f(n) = O(n) means that that there exist constants x and y such that x * n <= f(n) <= y * n.
def linear_merge(list1, list2):
finalList = []
i = 0
j = 0
while i < len(list1):
if j < len(list2):
if list1[i] < list2[j]:
finalList.append(list1[i])
i += 1
else:
finalList.append(list2[j])
j += 1
else:
finalList.append(list1[i])
i += 1
while j < len(list2):
finalList.append(list2[j])
j += 1
return finalList