MergeSort without slicing list - python

I am stuck on my algorithm and am not sure why the output is not as expected. I want to write a merge sort that does not slice the list. My idea is to pass start and end indexes to simulate slicing. Can I get some help finding the bug?
Thanks a lot!
def mergeSort(alist, start, end):
    print("Splitting ",alist[start:end])
    length = end - start
    if length >1:
        mid = length//2
        lefthalf = alist[:mid]
        righthalf = alist[mid:]
        mergeSort(alist, start, start + mid)
        mergeSort(alist, start + mid, end)
        i=0
        j=0
        k=0
        while i < mid and j < mid:
            if alist[start + i] <= alist[start + mid + j]:
                alist[k]=alist[start + i]
                i=i+1
            else:
                alist[k]=alist[start + mid + j]
                j=j+1
            k=k+1
        while i < mid:
            alist[k]=alist[start + i]
            i=i+1
            k=k+1
        while j < mid:
            alist[k]=alist[start + mid + j]
            j=j+1
            k=k+1
    print("Merging ",alist[start:end])
When I try:
alist = [54,26,93,17,77,31,44,55,20]
mergeSort(alist,0, len(alist))
I got
[44, 55, 77, 31, 77, 31, 44, 55, 20]

Some issues:
First of all, the reference alist[k] does not make sense. It would make more sense if it were alist[start + k], since you really don't want to change anything outside the window start:end.
But even then, you will be overwriting values in alist that you still need. Take for instance the first iteration of the first while loop (both i and j are zero). Assume that alist[start + i] > alist[start + mid + j], so the else block kicks in. There you would do:
alist[start + k] = alist[start + mid + j]
Now realise that alist[start + k] and alist[start + i] refer to the same slot (k and i are both 0), so this assignment destroys the value that is needed in the next iteration of the while loop. It is lost forever.
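To make the overwriting concrete, here is a minimal sketch on a hypothetical two-element window (the values 5 and 1 are made up for illustration). It performs only the single assignment described above, with the relative mid used in the question:

```python
# Hypothetical window: the left half is [5], the right half is [1].
alist = [5, 1]
start, mid = 0, 1          # relative mid, as in the question's code
i = j = k = 0

# alist[start + i] > alist[start + mid + j], so the else branch would run:
alist[start + k] = alist[start + mid + j]

print(alist)  # [1, 1]: the 5 from the left half has been overwritten
```

After this one step the left-half value no longer exists anywhere in the list, so no later loop iteration can recover it.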
You really need extra storage to manage this with lists.
One way is to use a temporary list for gathering the merged values. You would not need the k index anymore, as you would just append values to this new list. Once it is populated, you can inject its values back into alist.
Finally, the while condition assumes that both halves to be merged have the same size, but this is not always true. end - start could be odd, and then the second half has one more element. So while i < mid and j < mid, and likewise the trailing while j < mid, will never reach the last element of the second half in that case.
For this problem I would suggest not using variables for relative sizes or relative offsets, like your current mid, and using absolute offsets only.
Here is the relevant code that deals with the above issues:
mid = (start + end) // 2  # use absolute offsets only
mergeSort(alist, start, mid)
mergeSort(alist, mid, end)
i = start  # use absolute offsets only
j = mid  # use absolute offsets only
merged = []  # temporary list
while i < mid and j < end:  # now condition is correct
    if alist[i] <= alist[j]:
        merged.append(alist[i])  # use append
        i = i + 1
    else:
        merged.append(alist[j])
        j = j + 1
# add the remainder from the left side
merged.extend(alist[i:mid])
# inject back into main list
alist[start:start+len(merged)] = merged
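For completeness, the fragment above can be placed in a full function roughly as follows. This is a sketch, not the only way to arrange it: the name merge_sort_range and the early return for windows of size 0 or 1 are my own choices, and the debug prints are left out.

```python
def merge_sort_range(alist, start, end):
    """Sort alist[start:end] in place; merged values go through a temporary list."""
    if end - start <= 1:
        return  # a window of 0 or 1 elements is already sorted
    mid = (start + end) // 2  # absolute offset
    merge_sort_range(alist, start, mid)
    merge_sort_range(alist, mid, end)
    i, j = start, mid
    merged = []
    while i < mid and j < end:
        if alist[i] <= alist[j]:
            merged.append(alist[i])
            i += 1
        else:
            merged.append(alist[j])
            j += 1
    # Left-over values of the left half; left-over values of the right half
    # are already in their final positions, so they are not copied.
    merged.extend(alist[i:mid])
    alist[start:start + len(merged)] = merged


data = [54, 26, 93, 17, 77, 31, 44, 55, 20]
merge_sort_range(data, 0, len(data))
print(data)  # [17, 20, 26, 31, 44, 54, 55, 77, 93]
```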

Related

3sum algorithm. I am not getting results for numbers less than the target

How can I get this to print all triplets that have a sum less than or equal to a target? Currently it returns only the triplets whose sum equals the target. I've tried changing the comparisons but can't figure it out.
def triplets(nums):
    # Sort array first
    nums.sort()
    output = []
    # We use -2 because at this point the left and right pointers will be at same index
    # For example [1,2,3,4,5] current index is 4 and left and right pointer will be at 5, so we know we cant have a triplet
    # _ LR
    for i in range(len(nums) - 2):
        # check if current index and index -1 are same if same continue because we need distinct results
        if i > 0 and nums[i] == nums[i - 1]:
            continue
        left = i + 1
        right = len(nums) - 1
        while left < right:
            currentSum = nums[i] + nums[left] + nums[right]
            if currentSum <= 8:
                output.append([nums[i], nums[left], nums[right]])
                # below checks again to make sure index isnt same with adjacent index
                while left < right and nums[left] == nums[left + 1]:
                    left += 1
                while left < right and nums[right] == nums[right - 1]:
                    right -= 1
                # In this case we have to change both pointers since we found a solution
                left += 1
                right -= 1
            elif currentSum > 8:
                left += 1
            else:
                right -= 1
    return output
For example, for the input array [1,2,3,4,5] the result should be (1,2,3), (1,2,4), (1,2,5), (1,3,4), because these have a sum less than or equal to the target of 8.
The main barrier to solving the new problem with small changes to your code is that your original goal, outputting all distinct triplets with sum == target, can be solved in O(n^2) time using two loops, as in your algorithm. The output itself can have size proportional to n^2, so this is optimal in a certain sense.
The problem of outputting all distinct triplets with sum <= target cannot always be solved in O(n^2) time, since the output can have size proportional to n^3; for example, with an array nums = [1,2,...,n] and target = n^2 + 1, the answer is all possible triples of elements. So your algorithm has to change in a way equivalent to adding a third loop.
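The size argument can be checked numerically. The sketch below counts the qualifying triples by brute force for a small, arbitrarily chosen n = 6 and compares the count with the binomial coefficient C(n, 3); the variable names are mine.

```python
from itertools import combinations
from math import comb

n = 6                                # small value chosen for illustration
nums = list(range(1, n + 1))         # [1, 2, ..., n]
target = n * n + 1

# Count the triples whose sum does not exceed the target.
count = sum(1 for t in combinations(nums, 3) if sum(t) <= target)

print(count, comb(n, 3))  # 20 20: every possible triple qualifies
```

Since C(n, 3) grows like n^3 / 6, merely writing out the answer already takes cubic time in this case.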
One O(n^3) solution is shown below. By being a bit more clever about filtering duplicate elements (for example, using a hashmap and working with frequencies), this should be improvable to O(max(n^2, H)), where H is the size of your output.
def triplets(nums, target=8):
    nums.sort()
    output = set()
    for i, first in enumerate(nums[:-2]):
        if first * 3 > target:
            break
        # Filter some distinct results
        if i + 3 < len(nums) and first == nums[i + 3]:
            continue
        for j, second in enumerate(nums[i + 1:], i + 1):
            if first + 2 * second > target:
                break
            if j + 2 < len(nums) and second == nums[j + 2]:
                continue
            for k, third in enumerate(nums[j + 1:], j + 1):
                if first + second + third > target:
                    break
                if k + 1 < len(nums) and third == nums[k + 1]:
                    continue
                output.add((first, second, third))
    return list(map(list, output))
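When experimenting with pruning rules like the ones above, a brute-force reference is handy for cross-checking results on small inputs. The following is a sketch of such a reference, not part of the solution itself; the name triplets_bruteforce is hypothetical.

```python
from itertools import combinations


def triplets_bruteforce(nums, target=8):
    """Return the distinct value-triplets with sum <= target, in sorted order."""
    found = {tuple(sorted(t)) for t in combinations(nums, 3) if sum(t) <= target}
    return sorted(found)


print(triplets_bruteforce([1, 2, 3, 4, 5]))
# [(1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 4)]
```

Comparing its output (as a set of tuples) with the output of the faster function on random small lists is a quick way to catch a wrong break or continue condition.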

How to solve R select problem, using random pivot

I need to find the kth number in the input list. Please tell me what's wrong
def partition(arr, start, end, pivot):
    pivot_locate = arr.index(pivot)
    arr[pivot_locate], arr[start] = arr[start], arr[pivot_locate]
    L = start; R = end
    i = L+1; j = L+1
    for k in range(j, R+1): #k = 1~R
        if arr[k] < pivot:
            arr[i], arr[k] = arr[k], arr[i]
            i += 1
        j = k
    arr[L], arr[i-1] = arr[i-1], arr[L]
    return arr
def RSelect(arr, start, end, i):
    if start == end : return arr[start]
    if start < end :
        pivot = random.choice(arr)
        while arr.index(pivot) < start or arr.index(pivot) > end :
            pivot = random.choice(arr)
        arr_new = partition(arr, start, end, pivot)
        pLoc = arr_new.index(pivot)
        if pLoc == i : return pivot
        elif pLoc > i : return RSelect(arr_new, start, pLoc-1, i)
        else : return RSelect(arr_new, pLoc+1, end, i)
T =int(input())
for j in range(T):
    N, k = map(int, input().split())
    my_list = list(map(int,input().split()))
    k = len(my_list)-k
    anw = RSelect(my_list, 0, len(my_list)-1, k)
    print(anw)
Some of the test cases pass, but others produce incorrect answers. I don't know what the problem is. I am taking a course on the probabilistic selection algorithm.
There are some issues when the input list contains a lot of duplicate values. These issues all relate to how you use index(). You cannot assume that index() will return an index within the range [start, end]. Even if you know the value occurs within that range, it may also occur outside of it, and as index() returns the index of the first occurrence, you get undesired results:
In partition, arr.index(pivot) could return an index that is less than start, which obviously will lead to an undesired swap of the value at start to outside the range.
while arr.index(pivot) < start could be true, even if the value is also present in the subrange that is under consideration. In case the range consists of only repetitions of this value and no other, this will make that while loop an infinite loop.
A similar problem occurs with arr_new.index(pivot). This can lead to a recursive call where the range is greater than the current range, leading to a potential stack overflow.
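The behaviour of index() with duplicates is easy to reproduce. The list below is a made-up example in which the pivot value lies inside the sub-range but also occurs before it:

```python
arr = [7, 3, 7, 5]
start, end = 2, 3            # sub-range under consideration: [7, 5]
pivot = arr[2]               # the pivot value 7 was taken from inside the range

print(arr.index(pivot))      # 0: the first occurrence, which is outside the range

# list.index() accepts optional bounds, which restricts the search:
print(arr.index(pivot, start, end + 1))  # 2
```

Passing bounds fixes the wrong result, but every call is still a linear scan, which is one more reason to track the pivot's index directly.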
Some other remarks:
arr_new = partition() is a bit misleading, as it gives the impression you get a new list, but actually arr has been mutated and it is the list that partition also returns. So to avoid misinterpretation, it is better to just continue with arr and not introduce a new variable for the same list. Instead of returning arr, it would be more useful if partition would return the index of where the pivot value ended up. This way you don't have to perform an index call any more.
partition has to search the given pivot value. It can be relieved from that scan by passing it the index of the pivot. You should really aim to avoid using index at all as it leads to a worse average time complexity.
Here is the proposed code with those points taken into account:
import random

def partition(arr, start, end, pivot_locate):
    # The function now receives the pivot's index and looks up the pivot value
    pivot = arr[pivot_locate]
    arr[pivot_locate], arr[start] = arr[start], arr[pivot_locate]
    L = start; R = end
    i = L+1; j = L+1
    for k in range(j, R+1):
        if arr[k] < pivot:
            arr[i], arr[k] = arr[k], arr[i]
            i += 1
        j = k
    arr[L], arr[i-1] = arr[i-1], arr[L]
    return i-1  # return the new index of the pivot
def RSelect(arr, start, end, i):
    if start >= end:
        return arr[start]
    if start < end:
        # Select a random index within the range
        pLoc = random.randint(start, end)
        # call partition with that index and get a new index back
        pLoc = partition(arr, start, end, pLoc)
        if pLoc == i:
            return arr[pLoc]
        elif pLoc > i:
            return RSelect(arr, start, pLoc-1, i)
        else:
            return RSelect(arr, pLoc+1, end, i)
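To check the index-based approach on inputs with many duplicates, a self-contained sketch like the following can be compared against sorted(). Note that it is an iterative variant (a loop that narrows the range instead of recursing), which is my substitution rather than the code above; the names partition_at and rselect_iter are hypothetical, and a non-empty list is assumed.

```python
import random


def partition_at(arr, start, end, pivot_index):
    """Partition arr[start..end] around arr[pivot_index]; return the pivot's final index."""
    pivot = arr[pivot_index]
    arr[pivot_index], arr[start] = arr[start], arr[pivot_index]
    i = start + 1
    for k in range(start + 1, end + 1):
        if arr[k] < pivot:
            arr[i], arr[k] = arr[k], arr[i]
            i += 1
    arr[start], arr[i - 1] = arr[i - 1], arr[start]
    return i - 1


def rselect_iter(arr, i):
    """Return the value that would sit at index i if arr were sorted (arr is mutated)."""
    start, end = 0, len(arr) - 1
    while start < end:
        p = partition_at(arr, start, end, random.randint(start, end))
        if p == i:
            return arr[p]
        if p > i:
            end = p - 1
        else:
            start = p + 1
    return arr[start]


data = [4, 4, 4, 1, 4, 2, 4]  # many duplicates on purpose
print([rselect_iter(list(data), i) for i in range(len(data))])
# [1, 2, 4, 4, 4, 4, 4]
```

The loop keeps the invariant start <= i <= end, so it can only stop with start == end == i.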

Trouble tracing my Merge sort algorithm (Python 3)

I wrote a short Merge sort algorithm in Python 3. I have trouble understanding how it manages to achieve the correct result, as when I try to trace its logical steps I end up with an out of order list. The annotated code can be seen below.
What I'm specifically referring to is the merging part of the code. The three 'while' loops.
Let me use an example to demonstrate what confuses me. I explain the details in the annotations.
Thank you in advance for your time and help.
Let's assume we want to merge two arrays.
left = [2,6]
right = [4,8]
def merge_sort(array):
    if len(array) > 1:
        middle = len(array)//2
        left = array[:middle]
        right = array[middle:]
        merge_sort(left)
        merge_sort(right)
        i = j = k = 0
        while i < len(left) and j < len(right):
            # i.e. if 2 < 4
            if left[i] < right[j]:
                # The first index of the array is assigned the value 2
                array[k] = left[i]
                # i = 1 now
                i += 1
            # The 'else' statement is not executed, because the 'if' statement was.
            else:
                array[k] = right[j]
                j += 1
            # k = 1 now
            k += 1
        # The 'while' loop below assigns '6' as the value for index 1 of the array and terminates.
        # k = 2 now
        while i < len(left):
            array[k] = left[i]
            i += 1
            k += 1
        # The last 'while' loop assigns '4' and '8' to indexes 2 and 3 of the array, respectively.
        while j < len(right):
            array[k] = right[j]
            j += 1
            k += 1
# The algorithm terminates and from what I can see I should end up with the array of [2,6,4,8].
# I do not, however. It is sorted in ascending order and I cannot see where I'm making a mistake.
First, be careful with your wording: merge sort does not start from two distinct arrays. It splits a single unsorted array into sub-arrays (in our case left and right), sorts each of them recursively, and then merges them back into a single sorted array. In other words, you pass this function a single unsorted array and it returns a single sorted array. If you need to combine two arrays, you would do so before calling it.
Merge Sort
"Merge sort is a recursive algorithm that continually splits a list in half. If the list is empty or has one item, it is sorted by definition (the base case). If the list has more than one item, we split the list and recursively invoke a merge sort on both halves. Once the two halves are sorted, the fundamental operation, called a merge, is performed. Merging is the process of taking two smaller sorted lists and combining them together into a single, sorted, new list."
Debug/Analyze Code
To help with understanding how it works (and to debug), add print statements to show what is going on in more detail. I have taken what you wrote, added print statements, and passed the function a string that identifies which array (left or right) it is sorting. You can see the splitting, sorting, and merging as it accomplishes the sort by splitting the array down to size one and merging the sorted sub-arrays back together.
def merge_sort(array,type):
    print('merge_sort =>' + type)
    if len(array) < 2:
        print('Array < 2 nothing changed')
        return array
    middle = len(array) // 2
    left = array[:middle]
    right = array[middle:]
    print('splitting : ' + str(array))
    merge_sort(left,'left')
    merge_sort(right,'right')
    i = j = k = 0
    print('sorting.. Left/Right:' + str(left) + str(right))
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            print(' - left[i] < right[j] ('+ str(left[i]) + ' < ' + str(right[j]) + ') set array[' + str(k) + '] = ' + str(left[i]) + '')
            array[k] = left[i]
            i += 1
        else:
            print(' - else left[i] >= right[j] ('+ str(left[i]) + ' >= ' + str(right[j]) + ') set array[' + str(k) + '] = ' + str(right[j]) + '')
            array[k] = right[j]
            j += 1
        k += 1
    while i < len(left):
        print(' - WHILE i < len(left), ('+str(i) +' < '+str(len(left))+'), set array[' + str(k) + '] = ' + str(left[i]) + '')
        array[k] = left[i]
        i += 1
        k += 1
    while j < len(right):
        print(' - while j < len(right) ('+str(j) +' < ' + str(len(right)) + '), set array[' + str(k) + '] = ' + str(right[j]) + '')
        array[k] = right[j]
        j += 1
        k += 1
    print("returning.." + str(array))
    return array
arr = [2,6,4,8]
result = merge_sort(arr,'full')
print(result)
Which provides the following output:
merge_sort =>full
splitting : [2, 6, 4, 8]
merge_sort =>left
splitting : [2, 6]
merge_sort =>left
Array < 2 nothing changed
merge_sort =>right
Array < 2 nothing changed
sorting.. Left/Right:[2][6]
- left[i] < right[j] (2 < 6) set array[0] = 2
- while j < len(right) (0 < 1), set array[1] = 6
returning..[2, 6]
merge_sort =>right
splitting : [4, 8]
merge_sort =>left
Array < 2 nothing changed
merge_sort =>right
Array < 2 nothing changed
sorting.. Left/Right:[4][8]
- left[i] < right[j] (4 < 8) set array[0] = 4
- while j < len(right) (0 < 1), set array[1] = 8
returning..[4, 8]
sorting.. Left/Right:[2, 6][4, 8]
- left[i] < right[j] (2 < 4) set array[0] = 2
- else left[i] >= right[j] (6 >= 4) set array[1] = 4
- left[i] < right[j] (6 < 8) set array[2] = 6
- while j < len(right) (1 < 2), set array[3] = 8
returning..[2, 4, 6, 8]
References:
How do I merge arrays in python?
https://runestone.academy/runestone/books/published/pythonds/SortSearch/TheMergeSort.html
It seems that in your annotations you exit the first while loop prematurely: you stop after one pass, when the code actually makes three. Here is what actually happens:
You run through it once; then k=1, i=1 and j=0.
So you go through the loop again. This time the else branch is executed and assigns 4 to index 1 of the array; now k=2, i=1 and j=1.
So you run through the loop a third time, with the if branch executed and 6 assigned to index 2; finally k=3, i=2 and j=1, so you leave the first while loop. The last while loop then assigns 8 to index 3, giving [2, 4, 6, 8].
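The three passes can be recorded directly. The sketch below runs only the merge step on the example halves and stores i, j, k and the array after each pass of the first loop; the steps list is an addition of mine for tracing.

```python
left, right = [2, 6], [4, 8]
array = [2, 6, 4, 8]
i = j = k = 0
steps = []  # (i, j, k, snapshot of array) after each pass of the first loop

while i < len(left) and j < len(right):
    if left[i] < right[j]:
        array[k] = left[i]
        i += 1
    else:
        array[k] = right[j]
        j += 1
    k += 1
    steps.append((i, j, k, list(array)))

# The left half is exhausted here, so only the right-hand clean-up loop does any work.
while j < len(right):
    array[k] = right[j]
    j += 1
    k += 1

for step in steps:
    print(step)
# (1, 0, 1, [2, 6, 4, 8])
# (1, 1, 2, [2, 4, 4, 8])
# (2, 1, 3, [2, 4, 6, 8])
print(array)  # [2, 4, 6, 8]
```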

Why can't I implement merge sort this way

I understand that mergesort works by divide and conquer: you keep halving until you reach a point where you can sort in constant time, or the list has just one element, and then you merge the lists.
def mergesort(l):
    if len(l)<=1:
        return l
    l1 = l[0:len(l)//2+1]
    l2 = l[len(l)//2:]
    l1 = mergesort(l1)
    l2 = mergesort(l2)
    return merge(l1,l2)
I have a working merge implementation, and I checked that it works fine, but the merge sort implementation does not work: it just returns half of the elements of the list.
I see on the internet mergesort is implemented using l & r and m = (l + r)/2. What is wrong with my implementation? I am recursively subdividing the list and merging too.
The problem is the +1 in your code. Use these two lines instead:
l1 = l[0:len(l)//2]
l2 = l[len(l)//2:]
Replace the corresponding lines in your code with these and you'll be fine.
The code you have listed doesn't appear to do any sorting. I can't know for certain because you haven't listed the merge() function's code, but the only thing that the above function will do is recursively divide the list into halves. Here is a working implementation of a merge sort:
def mergeSort(L):
    # lists with only one value already sorted
    if len(L) > 1:
        # determine halves of list
        mid = len(L) // 2
        left = L[:mid]
        right = L[mid:]
        # recursive function calls
        mergeSort(left)
        mergeSort(right)
        # keeps track of current index in left half
        i = 0
        # keeps track of current index in right half
        j = 0
        # keeps track of current index in new merged list
        k = 0
        while i < len(left) and j < len(right):
            # lower values appended to merged list first
            if left[i] < right[j]:
                L[k] = left[i]
                i += 1
            else:
                L[k] = right[j]
                j += 1
            k += 1
        # catch remaining values in left and right
        while i < len(left):
            L[k] = left[i]
            i += 1
            k += 1
        while j < len(right):
            L[k] = right[j]
            j += 1
            k += 1
    return L
Your function makes no comparisons of values in the original list. Also, when you are splitting the list into halves in:
l1 = l[0:len(l)//2 + 1]
the '+ 1' is unnecessary (and can actually cause incorrect solutions). You can simply use:
l1 = l[:len(l)//2]
If the length is even (e.g. 12), it will divide the list into [0:6] and [6:12]. If it is odd, it will still divide correctly (e.g. length = 13 gives [0:6] and [6:13]). I hope this helps!
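The effect of the + 1 is visible on a two-element list without running any recursion. In this small sketch (the list contents are arbitrary), the slice with + 1 is the whole list again, so a recursive call on it would receive the same input it started with:

```python
l = [3, 1]
half = len(l) // 2

with_plus_one = l[0:half + 1]
print(with_plus_one)        # [3, 1]: identical to l, so the sub-problem never shrinks
print(l[:half], l[half:])   # [3] [1]: the two halves without the + 1

# Without the + 1 the halves never overlap and always add up to the original list.
for n in (12, 13):
    items = list(range(n))
    assert items[:n // 2] + items[n // 2:] == items
```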

Merge sort in python doesnt update the array after sorting

ab = [5, 89, 23, 9]
def mergsort(array):
    mid = len(array) / 2
    if mid > 0:
        print (array)
        mergsort(array[:mid])
        mergsort(array[mid:])
        print(array)
        merg(array)
    return array
def merg(array):
    print (array)
    mid = len(array)//2
    left = array[:mid]
    right = array[mid:]
    i = j = k = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            array[k] = left[i]
            i+=1
        else:
            array[k] = right[j]
            j+=1
        k+=1
    while i < len(left):
        array[k]=left[i]
        i+=1
        k+=1
    while j < len(right):
        array[k] = right[j]
        j+=1
        k+=1
    print (array)
mergsort(ab)
print (ab)
The merg function sorts the array it is given, and that array is updated. But in the next level of recursion the array going into merg is not the mutated array.
In the example, first sorting happens and [5,89] and [23,9] are sorted as [5,89] and [9,23] but the merged input in the next recursion is [5,89,23,9] instead of [5,89,9,23].
I am unable to find any reason as mutating the array should affect the parent array.
One problem is with the recursive calls:
mergsort(array[:mid])
mergsort(array[mid:])
The results of these calls are not recorded. Slicing creates a new list, so each recursive call sorts a copy, and when we continue it is with the same original, unsorted array.
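That a slice is an independent copy can be seen in isolation. This sketch sorts a slice of the question's list and shows that the original is unaffected; the variable name right_half is mine.

```python
array = [5, 89, 23, 9]

right_half = array[2:]   # slicing builds a new list
right_half.sort()        # mutates only the copy

print(right_half)        # [9, 23]
print(array)             # [5, 89, 23, 9]: the parent list is unchanged
```

This is why the sorted halves have to be returned (or written back) explicitly.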
The fix:
def mergsort(array):
    if len(array) <= 1:  # also covers the empty list
        return array
    mid = len(array) // 2  # integer division, so mid can be used in a slice
    left = mergsort(array[:mid])  # save into a variable
    right = mergsort(array[mid:])  # save into a variable
    return merge(left, right)  # use the previous two
The second issue is actually the same kind of issue only with:
def merg(array)
The merge operation combines two arrays, so two distinct arrays should be passed to this function. Otherwise merg() has no record of the mid used in mergsort(), and recomputing mid as length/2 splits the whole array rather than the two specific parts that we intend to merge. The logic inside the function is correct, but it should operate, as mentioned, on two distinct arrays.
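The last problem is the in-place write, for example in:
array[k]=right[j]
By doing so, we overwrite the element at array[k]. That is only safe while left and right are separate copies of the data; collecting the result in a new list avoids the question entirely.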
The fix:
def merge(left, right):
    if len(left+right) < 2:
        return left+right
    res = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] < right[j]:
            res.append(left[i])
            i += 1
        else:
            res.append(right[j])
            j += 1
    while i < len(left):
        res.append(left[i])
        i += 1
    while j < len(right):
        res.append(right[j])
        j += 1
    return res
After applying both fixes and running:
print(mergsort(ab))
The output is:
[5, 9, 23, 89]
as required.
