How to solve R select problem, using random pivot - python

I need to find the kth number in the input list. Please tell me what's wrong
import random
def partition(arr, start, end, pivot):
    pivot_locate = arr.index(pivot)
    arr[pivot_locate], arr[start] = arr[start], arr[pivot_locate]
    L = start; R = end
    i = L+1; j = L+1
    for k in range(j, R+1): # k = 1~R
        if arr[k] < pivot:
            arr[i], arr[k] = arr[k], arr[i]
            i += 1
        j = k
    arr[L], arr[i-1] = arr[i-1], arr[L]
    return arr
def RSelect(arr, start, end, i):
    if start == end : return arr[start]
    if start < end :
        pivot = random.choice(arr)
        while arr.index(pivot) < start or arr.index(pivot) > end :
            pivot = random.choice(arr)
        arr_new = partition(arr, start, end, pivot)
        pLoc = arr_new.index(pivot)
        if pLoc == i : return pivot
        elif pLoc > i : return RSelect(arr_new, start, pLoc-1, i)
        else : return RSelect(arr_new, pLoc+1, end, i)
T = int(input())
for j in range(T):
    N, k = map(int, input().split())
    my_list = list(map(int, input().split()))
    k = len(my_list)-k
    anw = RSelect(my_list, 0, len(my_list)-1, k)
    print(anw)
Some of the test code works fine, but some outputs incorrect answers. I don't know what's the problem. I am taking a course on the probabilistic selection algorithm.

There are some issues when the input list contains a lot of duplicate values. These issues all relate to how you use index(). You cannot assume that index() will return an index within the range [start, end]: even if you know the value occurs in that range, it could also occur outside of it. And as index() returns the index of the first occurrence, you get undesired results:
In partition, arr.index(pivot) could return an index that is less than start, which obviously will lead to an undesired swap of the value at start to outside the range.
while arr.index(pivot) < start could be true, even if the value is also present in the subrange that is under consideration. In case the range consists of only repetitions of this value and no other, this will make that while loop an infinite loop.
A similar problem occurs with arr_new.index(pivot). This can lead to a recursive call where the range is greater than the current range, leading to a potential stack overflow.
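The index() pitfall described above can be reproduced on a small list. This is only an illustrative sketch: the list contents and the window bounds are made up for the example.

```python
# Hypothetical data: the value 7 occurs both outside and inside the window [2, 4].
arr = [7, 3, 7, 5, 7]
start, end = 2, 4

pivot = arr[start]  # a value that certainly lies inside the window

# index() scans from the beginning of the list, so it reports the first
# occurrence, which here lies outside the window.
first_anywhere = arr.index(pivot)

# list.index accepts optional start/stop bounds that restrict the search.
first_in_window = arr.index(pivot, start, end + 1)

print(first_anywhere)   # 0
print(first_in_window)  # 2
```

Bounding the search gives an index inside the window, but it is still a linear scan, so passing indexes around directly remains the cheaper option.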
Some other remarks:
arr_new = partition() is a bit misleading, as it gives the impression you get a new list, but actually arr has been mutated and it is the list that partition also returns. So to avoid misinterpretation, it is better to just continue with arr and not introduce a new variable for the same list. Instead of returning arr, it would be more useful if partition would return the index of where the pivot value ended up. This way you don't have to perform an index call any more.
partition has to search the given pivot value. It can be relieved from that scan by passing it the index of the pivot. You should really aim to avoid using index at all as it leads to a worse average time complexity.
Here is the proposed code with those points taken into account:
import random
def partition(arr, start, end, pivot_locate):
    # The function receives the index and looks up the pivot value itself
    pivot = arr[pivot_locate]
    arr[pivot_locate], arr[start] = arr[start], arr[pivot_locate]
    L = start; R = end
    i = L+1; j = L+1
    for k in range(j, R+1):
        if arr[k] < pivot:
            arr[i], arr[k] = arr[k], arr[i]
            i += 1
        j = k
    arr[L], arr[i-1] = arr[i-1], arr[L]
    return i-1 # return the new index of the pivot
def RSelect(arr, start, end, i):
    if start >= end:
        return arr[start]
    if start < end:
        # Select a random index within the range
        pLoc = random.randint(start, end)
        # call partition with that index and get a new index back
        pLoc = partition(arr, start, end, pLoc)
        if pLoc == i:
            return arr[pLoc]
        elif pLoc > i:
            return RSelect(arr, start, pLoc-1, i)
        else:
            return RSelect(arr, pLoc+1, end, i)
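To check the index-based approach on inputs with many duplicates, a self-contained variant can be compared against sorted(). The sketch below follows the same idea but uses a loop instead of recursion. The lowercase name rselect and the variable boundary are choices made for this example, and it assumes a non-empty list with 0 <= i < len(arr).

```python
import random

def partition(arr, start, end, pivot_index):
    """Partition arr[start..end] around arr[pivot_index]; return the pivot's final index."""
    pivot = arr[pivot_index]
    arr[pivot_index], arr[start] = arr[start], arr[pivot_index]
    boundary = start + 1  # arr[start+1 .. boundary-1] holds the values < pivot
    for k in range(start + 1, end + 1):
        if arr[k] < pivot:
            arr[boundary], arr[k] = arr[k], arr[boundary]
            boundary += 1
    arr[start], arr[boundary - 1] = arr[boundary - 1], arr[start]
    return boundary - 1

def rselect(arr, i):
    """Return the value that would sit at index i if arr were sorted (arr is mutated)."""
    start, end = 0, len(arr) - 1
    while start < end:
        p = partition(arr, start, end, random.randint(start, end))
        if p == i:
            return arr[p]
        if p > i:
            end = p - 1    # the target lies left of the pivot
        else:
            start = p + 1  # the target lies right of the pivot
    return arr[start]

# Quick self-check on a list consisting mostly of duplicates.
data = [4, 4, 1, 4, 2, 4, 4]
for i in range(len(data)):
    assert rselect(list(data), i) == sorted(data)[i]
```

With the question's input convention, the call would be rselect(my_list, len(my_list) - k).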

Related

Hackerrank Minimum swaps 2 - loop goes to infinity from a small change in code

This challenge asks that you find the minimum number of swaps to sort an array of jumbled consecutive digits to ascending order. This is the correct function for the question:
def minimumSwaps(arr):
    n = 0
    i = 0
    while i < len(arr):
        index = arr[i]-1
        if arr[i] != arr[index]:
            arr[i], arr[index] = arr[index], arr[i]
            n += 1
            print(arr)
        else:
            i += 1
    return n
However if I get rid of the index = arr[i]-1 and replace index with arr[i]-1 everywhere like this:
def minimumSwaps(arr):
    n = 0
    i = 0
    while i < len(arr):
        if arr[i] != arr[arr[i]-1]:
            arr[i], arr[arr[i]-1] = arr[arr[i]-1], arr[i]
            n += 1
            print(arr)
        else:
            i += 1
    return n
The loop goes to infinity and I can't figure out why this is the case.
Any help will be appreciated.
The statement arr[i], arr[arr[i]-1] = arr[arr[i]-1], arr[i] first evaluates the right-hand side terms arr[arr[i]-1] and arr[i]; this part is unchanged compared to using index. Think of the two results as being stored in temporary variables a and b. Python then assigns to the left-hand side targets one by one, from left to right: first a to arr[i], and only afterwards b to arr[arr[i]-1]. At that stage arr[i] has already changed, so the second target is no longer the same element as arr[index], and the statement is no longer equivalent to the version using index.
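The difference can be seen on a three-element list. The function names below are made up for this illustration. The third variant, which assigns the dependent target first, is one possible workaround and not something taken from the answer.

```python
def swap_with_index(arr, i):
    # The target index is computed once, before anything is assigned.
    index = arr[i] - 1
    arr[i], arr[index] = arr[index], arr[i]
    return arr

def swap_inline(arr, i):
    # arr[i] is overwritten first, then arr[i] - 1 is re-evaluated
    # for the second target, so a different element gets written.
    arr[i], arr[arr[i] - 1] = arr[arr[i] - 1], arr[i]
    return arr

def swap_inline_reordered(arr, i):
    # Assigning to arr[arr[i] - 1] first keeps arr[i] intact
    # while the index of the dependent target is evaluated.
    arr[arr[i] - 1], arr[i] = arr[i], arr[arr[i] - 1]
    return arr

print(swap_with_index([2, 3, 1], 0))        # [3, 2, 1]
print(swap_inline([2, 3, 1], 0))            # [3, 3, 2]
print(swap_inline_reordered([2, 3, 1], 0))  # [3, 2, 1]
```

In the inline version the value 2 is written to position 2 instead of position 1, so the list stops being a permutation and the condition arr[i] != arr[arr[i]-1] can stay true forever.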

MergeSort without slicing list

I am stuck on my algorithm, I am not sure why the output is not as expected. I want to design a merge sort algorithm without slicing the list. My idea is to use start and end index to simulate slicing the list. Can I get some help about the bug?
Thanks a lot!
def mergeSort(alist, start, end):
    print("Splitting ",alist[start:end])
    length = end - start
    if length >1:
        mid = length//2
        lefthalf = alist[:mid]
        righthalf = alist[mid:]
        mergeSort(alist, start, start + mid)
        mergeSort(alist, start + mid, end)
        i=0
        j=0
        k=0
        while i < mid and j < mid:
            if alist[start + i] <= alist[start + mid + j]:
                alist[k]=alist[start + i]
                i=i+1
            else:
                alist[k]=alist[start + mid + j]
                j=j+1
            k=k+1
        while i < mid:
            alist[k]=alist[start + i]
            i=i+1
            k=k+1
        while j < mid:
            alist[k]=alist[start + mid + j]
            j=j+1
            k=k+1
    print("Merging ",alist[start:end])
When I try:
alist = [54,26,93,17,77,31,44,55,20]
mergeSort(alist,0, len(alist))
I got
[44, 55, 77, 31, 77, 31, 44, 55, 20]
Some issues:
First of all, the reference alist[k] does not make sense. It would make more sense if it were alist[start + k], since you really don't want to change anything outside the window start:end.
But still then, you will be overwriting values in alist that you still need. Take for instance the first iteration of the first while loop (both i and j are zero). Assume that alist[start + i] > alist[start + mid + j], so the else block kicks in. There you would do:
alist[start + k] = alist[start + mid + j]
Now realise that both alist[start + k] and alist[start + i] reference the same value, so this assignment destroyed the value that is needed in the next iteration of the while loop. It is for ever lost.
You really need extra storage to manage this with lists.
One way is to use a temporary list for gathering the merged values. You would not need the k index anymore, as you would just append values to this new list. Once it is populated, you can inject its values back into alist.
Finally, the while condition assumes that both halves to be merged have the same size, but this is not true. end - start could be odd, and then the second half has one more element. So doing while i < mid and j < mid will miss one final iteration in that case.
For this problem I would suggest to not use variables for relative sizes or relative offsets like your current mid, but to only use absolute offsets.
Here is the relevant code that deals with the above issues:
mid = (start + end) // 2 # use absolute offsets only
mergeSort(alist, start, mid)
mergeSort(alist, mid, end)
i = start # use absolute offsets only
j = mid # use absolute offsets only
merged = [] # temporary list
while i < mid and j < end: # now condition is correct
    if alist[i] <= alist[j]:
        merged.append(alist[i]) # use append
        i = i + 1
    else:
        merged.append(alist[j])
        j = j + 1
# add the remainder from the left side
merged.extend(alist[i:mid])
# inject back into main list
alist[start:start+len(merged)] = merged
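Put together as a complete function, the fragment above might look like the following sketch. The name merge_sort and the guard end - start > 1 are choices made here for the example, and the print calls from the question are left out.

```python
def merge_sort(alist, start, end):
    """Sort alist[start:end] in place, using a temporary list for the merge."""
    if end - start > 1:
        mid = (start + end) // 2  # absolute offset of the split point
        merge_sort(alist, start, mid)
        merge_sort(alist, mid, end)
        i, j = start, mid
        merged = []
        while i < mid and j < end:
            if alist[i] <= alist[j]:
                merged.append(alist[i])
                i += 1
            else:
                merged.append(alist[j])
                j += 1
        # Leftover values from the left half; leftovers from the right half
        # already sit at the end of the window, so they need no copying.
        merged.extend(alist[i:mid])
        alist[start:start + len(merged)] = merged

alist = [54, 26, 93, 17, 77, 31, 44, 55, 20]
merge_sort(alist, 0, len(alist))
print(alist)  # [17, 20, 26, 31, 44, 54, 55, 77, 93]
```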

Partition Array

Given an array nums of integers and an int k, partition the array (i.e. move the elements in nums) such that all elements < k are moved to the left and all elements >= k are moved to the right.
Return the partitioning index, i.e. the first index i such that nums[i] >= k.
class Solution:
    def partitionArray(self, nums, k):
        # write your code here
        if nums == []:
            return 0
        left = 0
        i = 0
        while i <= len(nums):
            if nums[i] < k:
                i += 1
                left += 1
            else:
                r = nums[i]
                del nums[i]
                nums.append(r)
                i += 1
        return left
My idea is to go through the list one element at a time. Any nums[i] that is greater than or equal to k is removed and appended to the end of nums, while any element smaller than k stays where it is. Once the whole list has been traversed, all the smaller numbers are at the front, and left is the counter I return. But I cannot fix the problem with nums[i]: after each modification of the list, the counter i no longer points at the correct item.
How can I write the code based on this idea?
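What you are really looking for is the index k would have in the sorted list. This seems like a homework assignment, so you may be limited in which built-in functionality you can use. However, a Pythonic approach (which assumes that k actually occurs in nums, since index raises ValueError otherwise) is
def solution(nums, k):
    return sorted(nums).index(k)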
You are doing several things I would recommend avoiding.
Concurrent modification: you should not add to or delete from a list while looping over it.
You cannot loop up to i == len(nums): list indexes start at 0, so the last valid index is len(nums) - 1.
Since you are really just looking for index(k) you need only keep track of numbers less than k and not concern yourself with re-organizing the list.
class Solution:
    def partitionArray(self, nums, k):
        # write your code here
        if nums == []:
            return 0
        left = 0
        i = 0
        while i < len(nums):
            if nums[i] < k:
                left += 1
            i += 1
        return left
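If the list also has to be rearranged, as the problem statement asks, a common single-pass alternative is to swap each small element into the next free slot at the front instead of deleting and appending. This is a sketch of that swap-based technique and not the asker's delete-and-append idea. The function name partition_array is made up for the example.

```python
def partition_array(nums, k):
    """Move all elements < k to the front of nums in place; return the partition index."""
    left = 0  # nums[:left] always holds elements that are < k
    for i in range(len(nums)):
        if nums[i] < k:
            # Swap the small element into the next free slot at the front.
            nums[left], nums[i] = nums[i], nums[left]
            left += 1
    return left

nums = [3, 2, 2, 1]
print(partition_array(nums, 2))  # 1
print(nums)                      # [1, 2, 2, 3]
```

Because elements are only swapped and never inserted or removed, the loop index always points at a not-yet-visited element, which avoids the problem described in the question.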

Why can't I implement merge sort this way

I understand mergesort works by divide and conquer: you keep halving until you reach a point where you can sort in constant time or the list is just one element, and then you merge the lists.
def mergesort(l):
    if len(l)<=1:
        return l
    l1 = l[0:len(l)//2+1]
    l2 = l[len(l)//2:]
    l1 = mergesort(l1)
    l2 = mergesort(l2)
    return merge(l1,l2)
I have a working merge implementation and I checked it works fine but the merge sort implementation does not work it just returns half of the elements of the list.
I see on the internet mergesort is implemented using l & r and m = (l + r)/2. What is wrong with my implementation? I am recursively subdividing the list and merging too.
The problem is the +1 in your code. Split the list like this instead:
l1 = l[0:len(l)//2]
l2 = l[len(l)//2:]
Replace your two slicing lines with these and you'll be fine.
The code you have listed doesn't appear to do any sorting. I can't know for certain because you haven't listed the merge() function's code, but the only thing that the above function will do is recursively divide the list into halves. Here is a working implementation of a merge sort:
def mergeSort(L):
    # lists with only one value already sorted
    if len(L) > 1:
        # determine halves of list
        mid = len(L) // 2
        left = L[:mid]
        right = L[mid:]
        # recursive function calls
        mergeSort(left)
        mergeSort(right)
        # keeps track of current index in left half
        i = 0
        # keeps track of current index in right half
        j = 0
        # keeps track of current index in new merged list
        k = 0
        while i < len(left) and j < len(right):
            # lower values appended to merged list first
            if left[i] < right[j]:
                L[k] = left[i]
                i += 1
            else:
                L[k] = right[j]
                j += 1
            k += 1
        # catch remaining values in left and right
        while i < len(left):
            L[k] = left[i]
            i += 1
            k += 1
        while j < len(right):
            L[k] = right[j]
            j += 1
            k += 1
    return L
Your function makes no comparisons of values in the original list. Also, when you are splitting the list into halves in:
l1 = l[0:len(l)//2 + 1]
the '+ 1' is unnecessary (and can actually cause incorrect solutions). You can simply use:
l1 = l[:len(l)//2]
If the length is even (e.g. 12), it will divide the list into the two halves [0:6] and [6:12]. If it is odd, it will still automatically divide correctly (e.g. length = 13 would give [0:6] and [6:13]). I hope this helps!
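For the list-returning style used in the question, a sketch with the corrected split could look like this. The merge function shown here is a hypothetical stand-in, since the asker's own merge was not posted.

```python
def merge(left, right):
    """Merge two sorted lists into a new sorted list (hypothetical helper)."""
    result = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    # At most one of these two slices is non-empty.
    result.extend(left[i:])
    result.extend(right[j:])
    return result

def mergesort(l):
    if len(l) <= 1:
        return l
    mid = len(l) // 2
    # l[:mid] and l[mid:] share no element and together cover the whole list.
    return merge(mergesort(l[:mid]), mergesort(l[mid:]))

print(mergesort([5, 3, 1, 5, 2, 6]))  # [1, 2, 3, 5, 5, 6]
```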

3 Quicksorts Functions - Why is the lambda version slower? Code provided

I was testing quicksort runtimes and I noticed the lambda version of quicksort was slower.
Why is the lambda version noticeably slower? I've tried swapping the orders that I call each and it seems to stay constantly slower. Is it because I am redeclaring left, equal, and right for each time since filter has to be assigned (versus appending/in place)?
import timeit
def qsort(list):
    if len(list) > 1:
        pivot = list[0]
        left = filter(lambda x: x < pivot, list)
        equal = filter(lambda x: x == pivot, list)
        right = filter(lambda x: x > pivot, list)
        return qsort(left) + equal + qsort(right)
    else:
        return list
def sort(array):
    less = []
    equal = []
    greater = []
    if len(array) > 1:
        pivot = array[0]
        for x in array:
            if x < pivot:
                less.append(x)
            elif x == pivot:
                equal.append(x)
            elif x > pivot:
                greater.append(x)
        return sort(less)+equal+sort(greater)
    else:
        return array
def partition(array, begin, end):
    pivot = begin
    for i in xrange(begin+1, end+1):
        if array[i] <= array[begin]:
            pivot += 1
            array[i], array[pivot] = array[pivot], array[i]
    array[pivot], array[begin] = array[begin], array[pivot]
    return pivot
def quicksort(array, begin=0, end=None):
    if end is None:
        end = len(array) - 1
    if begin >= end:
        return
    pivot = partition(array, begin, end)
    quicksort(array, begin, pivot-1)
    quicksort(array, pivot+1, end)
    return array
print qsort([5,3,1,5,2,6])
print sort([5,3,1,5,2,6])
print quicksort([5,3,1,5,2,6])
print (timeit.timeit(lambda: qsort([5,3,1,5,2,6]), number=1000))
print (timeit.timeit(lambda: sort([5,3,1,5,2,6]), number=1000))
print (timeit.timeit(lambda: quicksort([5,3,1,5,2,6]), number=1000))
As mentioned in a comment, the lambdas require an extra function call, which is expensive. This function could be rewritten using list comprehensions to provide a time that is very close to (and sometimes smaller than) that of the other functions.
def qsort(ls):
    if len(ls) > 1:
        pivot = ls[0]
        left = [e for e in ls if e < pivot]
        equal = [e for e in ls if e == pivot]
        right = [e for e in ls if e > pivot]
        return qsort(left) + equal + qsort(right)
    else:
        return ls
Also, I changed list to ls so that the built-in list isn't shadowed. The really interesting part is that, despite the repeated iteration over the input, and the creation of new intermediate arrays, this is still comparable in time to the in-place sorting function. Unfortunately, I don't know enough to explain why. For larger lists, it seems like qsort performs the best and the in-place method starts slowing down significantly.
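To repeat the comparison on a current interpreter, the two variants can be timed side by side. The question's code targets Python 2, so this sketch makes a few assumptions. It is written for Python 3, where filter returns a lazy iterator and therefore has to be wrapped in list(). The names qsort_filter, qsort_comprehension and compare are made up for the example. No particular timing result is implied, since the numbers depend on the machine and the interpreter version.

```python
import timeit

def qsort_filter(ls):
    # filter + lambda variant: one lambda call per element and per filter
    if len(ls) <= 1:
        return ls
    pivot = ls[0]
    left = list(filter(lambda x: x < pivot, ls))
    equal = list(filter(lambda x: x == pivot, ls))
    right = list(filter(lambda x: x > pivot, ls))
    return qsort_filter(left) + equal + qsort_filter(right)

def qsort_comprehension(ls):
    # list comprehension variant: the comparison is evaluated inline
    if len(ls) <= 1:
        return ls
    pivot = ls[0]
    left = [e for e in ls if e < pivot]
    equal = [e for e in ls if e == pivot]
    right = [e for e in ls if e > pivot]
    return qsort_comprehension(left) + equal + qsort_comprehension(right)

def compare(data, number=1000):
    """Return (filter_seconds, comprehension_seconds) for `number` runs each."""
    t_filter = timeit.timeit(lambda: qsort_filter(data), number=number)
    t_comp = timeit.timeit(lambda: qsort_comprehension(data), number=number)
    return t_filter, t_comp

if __name__ == "__main__":
    print(compare([5, 3, 1, 5, 2, 6]))
```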
