Why is this quicksort not sorting properly?

I have been trying to implement the quicksort function using python for the last three weeks but I always get to this point and it sorts mostly, but there are a few items out of place.
I don't think I understand the quicksort properly, so if you can see why from my code please explain.
I have chosen the pivot to be the first object in the list, "low", and then comparing it to the rest of the list. If the object at list index "low" is greater than list index "i" (in my for loop), then I switch "i" with "E" (initially indexed to the item "low + 1"), if it does switch, "E" increments. Even if it doesn't switch, "i" increments (because of the for loop).
Once my loop is finished, I decrement "E" (to index it to the highest number in the list lower than my pivot) then switch it with "low" (index of pivot)
I then quicksort the left and right halves of the list using "E" to determine where the list splits. - this seems to be the point where the code fails to sort.
I believe this is how the quicksort works, but I haven't been able to make it work. If you know what I'm missing or if it's just one of my lines, please let me know. Any help with this problem would be greatly appreciated.
(PS. The "main" function just passes a list of 20 random integers into my quicksort and into Python's built-in sort.)
import random
def quick(A, low, high):
    if high <= low:
        return
    elif high > low:
        E = low+1
        for i in range(E, high):
            if A[low] > A[i]:
                A[i], A[E] = A[E], A[i]
                E +=1
        E -= 1
        A[low], A[E] = A[E], A[low]
        quick(A, low, E-1)
        quick(A, E+1, high)
def main():
    listA = []
    listB = []
    for i in range(20):
        int = random.randrange(0, 19)
        listA.append(int)
    for i in range(len(listA)):
        listB.append(listA[i])
    print("List A (before sort)" + str(listA))
    print("List B (before sort)" + str(listB))
    quick(listA, 0, len(listA)-1)
    print("\nList A (after sort)" + str(listA))
    print("List B (before sort)" + str(listB))
    listB.sort()
    print("\nList A (after sort)" + str(listA))
    print("List B (after sort)" + str(listB))
main()

Your problem is that you're ignoring one number with each split. range(min, max) includes min but not max, so it ends on max-1. That means "high" has to be one past the last index you want sorted.
quick(listA, 0, len(listA)-1)
should be
quick(listA, 0, len(listA))
and
quick(A, low, E-1)
should be
quick(A, low, E)
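Putting those two changes together, here is a minimal sketch of the asker's function with "high" treated as exclusive, the same way range() treats its end. The early-return test on the slice length is my own small simplification, not something from the question.

```python
def quick(A, low, high):
    """Sort A[low:high] in place. high is exclusive, like the end of range()."""
    if high - low <= 1:
        return  # zero or one element: nothing to sort
    E = low + 1  # next slot for a value smaller than the pivot A[low]
    for i in range(low + 1, high):
        if A[low] > A[i]:
            A[i], A[E] = A[E], A[i]
            E += 1
    E -= 1  # index of the last value smaller than the pivot
    A[low], A[E] = A[E], A[low]  # the pivot lands in its final position E
    quick(A, low, E)       # left part: A[low:E]
    quick(A, E + 1, high)  # right part: A[E+1:high]


data = [5, 3, 8, 1, 9, 2, 7, 3, 0, 18, 4]
quick(data, 0, len(data))
print(data)
```

The first call passes len(data) rather than len(data)-1, and the left recursive call passes E rather than E-1, because in both cases the upper bound itself is never visited.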

Related

Binary search in Python results in an infinite loop

list = [27 , 39 , 56, 73, 3, 43, 15, 98, 21 , 84]
found = False
searchFailed = False
first = 0
last = len(list) - 1
searchValue = int(input("Which number are you looking for? "))
while not found and not searchFailed:
    mid = (first + last) // 2
    if list[mid] == searchValue:
        found = True
    else:
        if first >= last :
            searchFailed = True
        else:
            if list[mid] > searchValue:
                last = mid - 1
            else:
                last = mid + 1
if found:
    print("Your number was found at location", mid)
else:
    print("The number does not exist within the list")
The code runs properly when I execute it while searching for 27 (the first number), but any other number just results in an infinite loop.
I believe the loop runs smoothly on the first iteration since if I change the value of first to 1, the code correctly finds the position of 39 but repeats the infinite loop error with all the other numbers after that (while 27 "does not exist within the loop" which makes sense). So I suppose the value of mid is not getting updated properly.
Several points to cover here. First, a binary search needs sorted data in order to work. As your list is not sorted, weirdness and hilarity may ensue :-)
Consider, for example, the unsorted [27, 39, 56, 73, 3, 43, 15, 98, 21] when you're looking for 39.
The first midpoint is at value 3, so a binary search will discard the left half entirely (including the 3) since it expects 39 to be to the right of that 3. Hence it will never find 39, despite the fact it's in the list.
If your list is unsorted, you're basically stuck with a sequential search.
Second, you should be changing first or last depending on the comparison. You change last in both cases, which won't end well.
Third, it's not usually a good idea to use standard data type names or functions as variable names. Because Python treats classes and functions as first-class objects, you can get into a situation where your bindings break things:
>>> a_tuple = (1, 2) ; a_tuple
(1, 2)
>>> list(a_tuple) # Works.
[1, 2]
>>> list = list(a_tuple) ; list # Works, unintended consequences.
[1, 2]
>>> another_list = list(a_tuple) # No longer works.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'list' object is not callable
Covering those issues, your code would look something like this (slightly reorganised in the process):
my_list = [3, 15, 21, 27, 39, 43, 56, 73, 84, 98]
found = False
first, last = 0, len(my_list) - 1
searchValue = int(input("Which number are you looking for? "))
while not found:
    if first > last:
        break
    mid = (first + last) // 2
    if my_list[mid] == searchValue:
        found = True
    else:
        if my_list[mid] > searchValue:
            last = mid - 1
        else:
            first = mid + 1
if found:
    print("Your number was found at location", mid)
else:
    print("The number does not exist within the list")
That works, according to the following transcript:
pax> for i in {1..6}; do echo; python prog.py; done
Which number are you looking for? 3
Your number was found at location 0
Which number are you looking for? 39
Your number was found at location 4
Which number are you looking for? 98
Your number was found at location 9
Which number are you looking for? 1
The number does not exist within the list
Which number are you looking for? 40
The number does not exist within the list
Which number are you looking for? 99
The number does not exist within the list
First of all, do not use a built-in name (here list) to name your variables. Secondly, you have a logical error in the following lines:
if list[mid] > searchValue:
    last = mid - 1
else:
    last = mid + 1
In the last line of the above snippet, it should be first = mid + 1
There are already very good answers to this question, but you can also consider this simpler version adapted to your case:
my_list = [3, 15, 21, 27, 39, 43, 56, 73, 84, 98] # sorted!
left, right = 0, len(my_list) # [left, right)
search_value = int(input("Which number are you looking for? "))
while left + 1 < right:
    mid = (left + right) // 2
    if my_list[mid] <= search_value:
        left = mid
    else:
        right = mid
if my_list[left] == search_value: # found!
    print("Your number was found at location", left)
else:
    print("The number does not exist within the list")
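If you'd rather not hand-roll the loop at all, the standard library's bisect module does the halving for you. This is just a sketch: find_index is a name I made up, and it assumes the list is already sorted.

```python
from bisect import bisect_left


def find_index(sorted_values, target):
    """Return the index of target in sorted_values, or None if it is absent."""
    # bisect_left returns the leftmost position where target could be inserted
    # while keeping the list sorted.
    i = bisect_left(sorted_values, target)
    if i < len(sorted_values) and sorted_values[i] == target:
        return i
    return None


my_list = [3, 15, 21, 27, 39, 43, 56, 73, 84, 98]  # must be sorted
print(find_index(my_list, 39))  # 4
print(find_index(my_list, 40))  # None
```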
The problem with your code is that in binary search the list needs to be SORTED; that is one of the most important principles of binary search. Here is a recursive function that works on a sorted list:
# low is the first index and high is the last index (both inclusive), val is the value to find, list_ is the sorted list; you can leave low at its default
def binary_search(list_: list, val, high: int, low: int = 0):
    if low > high:
        return None  # the range is empty, so val is not in the list
    mid = (low+high)//2
    if list_[mid] == val:
        return mid
    elif list_[mid] < val:
        return binary_search(list_, val, high, mid+1)
    else:
        return binary_search(list_, val, mid-1, low)
and now here's the output
>>> list_ = [3, 15, 21, 27, 39, 43, 56, 73, 84, 98]
>>> binary_search(list_, 21, len(list_)-1)
2
What happens here is that first it calculates the middle index of the current range, which for your ten-item list is 4. Then it checks whether the middle value is equal to the value being searched for, and if so it returns the mid index. If the middle value is smaller than the value, then, because the list is sorted, the value can only be to the right of mid, so the function calls itself again with low set to mid+1. If the middle value is greater than the value, the value can only be to the left of mid, so the function calls itself again with high set to mid-1. Each call halves the range, and this continues until the middle value is equal to the value, or until low passes high, which means the value is not in the list and the function returns None.

Optimized way to find in which range does a number lie

I have multiple ranges, let's say 1-1000, 1000-2000, 2000-3000, 3000-4000, 4000-5000. I get a number from the user and now I need to find which range it lies in. One way to do this would be to create multiple if statements and check from there, like so:
if num>=1 and num < 1000:
    print "1"
elif num >=1000 and num < 2000:
    print "2"
....
This method would create a lot of branches.
Is there an optimized way to do this without so many branches and with the least complexity?
PS: I just wrote the code in Python since it's shorter to write, but this could be the case in any language. Also the ranges and output can be very different.
The ranges and output are examples and can be anything, like 1-100, 100-1000, 1000-1500 etc. with output like "Very Low, low, medium".
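Store the start of each range in a list, then sort that list together with the number; the position the number ends up at tells you its range.
import numpy as np
num = 3565  # example input
start = [1,1000,2000,3000,4000]
# Note: a num equal to a start value (e.g. 1000) is reported as the range below it.
print(list(np.sort(start+[num])).index(num))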
If your ranges don't follow any particular logic, there's not much you can do except test them one by one, but you can still simplify the code by using a loop:
ranges = [[0,1000],[1500,1600],[1200,1220]]
def find_range(num, ranges):
    for low, high in ranges:
        if low <= num < high:
            return low, high # or any other formatting, using a dict for example
Of course you can optimize a bit by sorting your ranges and then doing a binary search instead of linear...
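As a rough sketch of that binary-search variant (my own illustration, assuming the ranges don't overlap and the upper bound of each range is exclusive), you can sort the ranges once, keep their lower bounds in a separate list, and let the standard bisect module find the candidate. The names find_range_sorted and lows are made up for this example.

```python
from bisect import bisect_right

# Sort once up front; lookups afterwards are O(log n).
ranges = sorted([(0, 1000), (1500, 1600), (1200, 1220)])
lows = [low for low, _ in ranges]


def find_range_sorted(num):
    """Return the (low, high) pair containing num, or None if no range does."""
    # Index of the last range whose lower bound is <= num.
    i = bisect_right(lows, num) - 1
    if i >= 0 and ranges[i][0] <= num < ranges[i][1]:
        return ranges[i]
    return None  # num is below every range or falls in a gap


print(find_range_sorted(1210))  # (1200, 1220)
print(find_range_sorted(1100))  # None
```

The final bounds check matters because the ranges may have gaps between them; bisect only tells you which range starts at or before the number.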
my_range=(1,1000), (1000,2000), (2000,3000), (3000,4000), (4000,5000)
my_output='Very Low, Low, Medium, High, Very High'.split(', ')
num=3565
for k,i in enumerate(my_range):
    if i[0]<=num<i[1]:
        print(my_output[k])
        break
else:
    print('Out of range')
How about something like this:
ranges = {
    0: 'undefined range',
    1000: '1',
    1500: '2',
    2500: '3'
}
num = 500
print(ranges[max(ranges, key=lambda x: num < x)])
Output: 1
Suppose that you have many breaks and require an optimized search; you can go with bisection on an ordered list of breakpoints, which needs only a logarithmic number of recursive steps:
import random
import time
def split_pivot(intervals, number):
    """Divide and conquer recursively."""
    if len(intervals) == 1:
        return intervals[0]
    if len(intervals) == 2:
        if number >= intervals[1][1][0]:
            return intervals[1]
        elif number < intervals[0][1][1]:
            return intervals[0]
        else:
            raise ValueError('number falls in a gap between intervals')
    pivot = len(intervals) // 2
    if number < intervals[pivot][1][1]:
        return split_pivot(intervals[:pivot + 1], number)
    elif number >= intervals[pivot + 1][1][0]:
        return split_pivot(intervals[pivot + 1:], number)
    else:
        raise ValueError('number falls in a gap between intervals')
if __name__ == '__main__':
    UPPER_BOUND = 10000000
    newbreak = 0
    manybreaks = []
    while newbreak < UPPER_BOUND:
        step = int(random.random() * 10) + 1
        manybreaks.append(newbreak + step)
        newbreak = manybreaks[-1]
    print('Breaks: {:d}'.format(len(manybreaks)))
    intervals = [
        (idx, (manybreaks[idx], manybreaks[idx + 1]))
        for idx in range(len(manybreaks) - 1)
    ]
    print('Intervals: {:d}'.format(len(intervals)))
    print(
        ' Example: idx {tpl[0]:d}, lower {tpl[1][0]:d}, upper {tpl[1][1]:d}'
        .format(tpl=random.choice(intervals)))
    thenumber = int(random.random() * UPPER_BOUND)
    print('Number: {:d}'.format(thenumber))
    t0 = time.time()
    result = split_pivot(intervals, thenumber)
    t1 = time.time()
    print('Result: {e[0]:d} ({e[1][0]:d}, {e[1][1]:d})'.format(e=result))
    print(' Done in {:.4f}s'.format(t1 - t0))
The result of the search itself is (on my machine) below 0.05 seconds. The generation of breakpoints and corresponding intervals runs for roughly 4.5 seconds:
Breaks: 1818199
Intervals: 1818198
Example: idx 605849, lower 3330441, upper 3330446
Number: 6951844
Result: 1263944 (6951843, 6951847)
Done in 0.0436s
Maybe just divide by 1000 and take the integer part.
Here is an example in Python:
>>> x=3608
>>> int(x/1000+1)
4
Following the comment/edit in your post, if you need a different output (a string for example) you can (in Python) use a dict:
>>> Output={'1': 'very low', '2': 'low', '3': 'medium','4':'high' }
>>> x=2954
>>> Output[str(int(x/1000+1))]
'medium'

Python 3: Optimised Bubble Sort

Please help. I need to optimize my bubble sort algorithm so that it makes fewer total comparisons than the non-optimised bubbleSort. I managed to create just the 'normal' bubble sort (which only travels from left to right):
def bubbleSort(values):
    n = len(values) - 1
    swap = True
    ncomp = 0 # My total comparisons counter
    while swap:
        swap = False
        for i in range(n): # i = 0, 1, 2, ..., n-1
            ncomp += 1
            if values[i] > values[i+1]:
                temp = values[i]
                values[i] = values[i+1]
                values[i+1] = temp
                swap = True
    return values, ncomp
So basically I don't know how to create an 'optimised bubbleSort', a bubbleSortPlus function in which bubbles travel in both directions: from left to right, immediately followed by a pass from right to left. In theory, each pass should be shorter than the last (by saving the position of the last swap in a pass in a variable, and making the next pass start at that position). I tried so hard but I'm just a Python newbie, please help.
Here's some skeleton code that shows how to scan the array forwards and backwards, while shrinking the list on each iteration.
values = 100,101,102,103,104,105
start = 0
stop = len(values)-1
while stop > start:
    for i in range(start, stop):
        print(i, "compare", values[i], "with", values[i+1])
    print("moved a large value to index", stop)
    print()
    stop = stop - 1
    if stop == start:
        break
    for i in range(stop, start, -1):
        print(i, "compare", values[i], "with", values[i-1])
    print("moved a small value to index", start)
    print()
    start = start + 1
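To fill in that skeleton, here is one possible sketch of the bidirectional version the question describes (often called cocktail shaker sort). It is an illustration rather than the one true answer: bubble_sort_plus is my snake_case take on the asker's bubbleSortPlus, it sorts a copy instead of the original list, and it shrinks each pass to the position of the last swap rather than by one.

```python
def bubble_sort_plus(values):
    """Bidirectional bubble sort. Returns (sorted copy, number of comparisons)."""
    values = list(values)  # work on a copy so the caller's list is untouched
    start, stop = 0, len(values) - 1
    ncomp = 0
    while start < stop:
        # Left-to-right pass: large values bubble towards the end.
        last_swap = start
        for i in range(start, stop):
            ncomp += 1
            if values[i] > values[i + 1]:
                values[i], values[i + 1] = values[i + 1], values[i]
                last_swap = i
        stop = last_swap  # everything after the last swap is already in place
        if start >= stop:
            break
        # Right-to-left pass: small values bubble towards the front.
        last_swap = stop
        for i in range(stop, start, -1):
            ncomp += 1
            if values[i - 1] > values[i]:
                values[i - 1], values[i] = values[i], values[i - 1]
                last_swap = i
        start = last_swap  # everything before the last swap is already in place
    return values, ncomp


print(bubble_sort_plus([5, 1, 4, 2, 8, 0]))
```

If a pass makes no swaps at all, last_swap never moves, start and stop meet, and the loop ends; that takes over the job of the swap flag in the original function.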
I'd say it has already been optimized...
The naive bubble sort does not include the swap flag, so it does not return until it has finished all O(n^2) comparisons, whatever the input. With the swap flag, the number of comparisons is almost linear if the input sequence is already "almost sorted".

Assign new values to an array based on a function ran on other 3 dimensional array

I have a multiband raster where I want to apply a function to the values that each pixel has across all the bands. Depending on the result, a new value is assigned, and a new single-band raster is generated from these new values. For example, if a pixel has increasing values across the bands, the value "1" will be assigned to that pixel in the resulting raster. I am doing some tests on a three-dimensional array using numpy but I am not able to resolve the last part, where the new values are assigned.
The function to be applied to the three-dimensional array is Trend(List). I have defined it at the beginning. To make it easier to iterate through the array values on the z (or 0) axis I have used np.swapaxes (thank you @Fabricator for this). The problem comes now when assigning new values to the new_band[i,j] array so that the result of Trend(List) over the list:
[myArraySw[0,0]] will be assigned to new_band[0,0]
[myArraySw[0,1]] will be assigned to new_band[0,1]
[myArraySw[0,2]] will be assigned to new_band[0,2]
[myArraySw[0,3]] will be assigned to new_band[0,3]
................................................
[myArraySw[3,3]] will be assigned to new_band[3,3]
Some values are assigned, but some not. For example new_band[0,1] should be "2" but is "0". The same with new_band[3,0], new_band[3,1], new_band[3,2], new_band[3,3] that should be "5" but they are "0". Other values look alright. Where could be the problem?
Thank you
Here is the code:
import os
import numpy as np
def decrease(A):
    return all(A[i] >= A[i+1] for i in range(len(A)-1))
def increase(A):
    return all(A[i] <= A[i+1] for i in range(len(A)-1))
def Trend(List):
    if all(List[i] == List[i+1] for i in range(len(List)-1))==1:
        return "Trend: Stable"
    else:
        a=[value for value in List if value !=0]
        MaxList = a.index(max(a)) #max of the list
        MinList=a.index(min(a)) #min of the list
        SliceInc=a[:MaxList] #slice until max
        SliceDec=a[MaxList:] #slice after max value
        SliceDec2=a[:MinList] #slice until min value
        SliceInc2=a[MinList:] #slice after min value
        if all(a[i] <= a[i+1] for i in range(len(a)-1))==1:
            return "Trend: increasing"
        elif all(a[i] >= a[i+1] for i in range(len(a)-1))==1:
            print "Trend: decreasing"
        elif increase(SliceInc)==1 and decrease(SliceDec)==1:
            return "Trend: Increasing and then Decreasing"
        elif decrease(SliceDec2)==1 and increase(SliceInc2)==1:
            return "Trend: Decreasing and then Increasing"
        else:
            return "Trend: mixed"
myArray = np.zeros((4,4,4)) # generating an example array to try the above functions on
myArray[1,0,0] = 2
myArray[3,0,0] = 4
myArray[1,0,1] = 10
myArray[3,0,1] = 8
myArray[0,1,2] = 5
myArray[1,1,2] = 7
myArray[2,1,2] = 4
print "\n"
print "This is the original: "
print "\n"
print myArray
print "\n"
print "\n"
myArraySw = np.swapaxes(np.swapaxes(myArray,0,2),0,1) # swapping axes so that I can iterate through the lists
print "\n"
print "This is the swapped: "
print "\n"
print myArraySw
print "\n"
new_band = np.zeros_like(myArray[0]) # create a new array to store the results of the functions
for j in range(3):
    for i in range(3):
        if Trend(myArraySw[i,j]) == "Trend: increasing":
            new_band[i,j] = 1
        elif Trend(myArraySw[i,j]) == "Trend: decreasing":
            new_band[i,j] = 2
        elif Trend(myArraySw[i,j]) == "Trend: Increasing and then Decreasing":
            new_band[i,j] = 3
        elif Trend(myArraySw[i,j]) == "Trend: Decreasing and then Increasing":
            new_band[i,j] = 4
        elif Trend(myArraySw[i,j]) == "Trend: Stable":
            new_band[i,j] = 5
        elif Trend(myArraySw[i,j]) == "Trend: mixed":
            new_band[i,j] = 6
print "\n"
print "The new array is: "
print "\n"
print new_band
At least part of the problem is that when you typed:
        elif all(a[i] >= a[i+1] for i in range(len(a)-1))==1:
            print "Trend: decreasing"
you probably meant to type this:
        elif all(a[i] >= a[i+1] for i in range(len(a)-1))==1:
            return "Trend: decreasing"
            ^^^^^^
Also, if you don't mind a little unsolicited advice, the code you've posted has a pretty strong "code smell" - you're doing a lot of things in unnecessarily complicated ways. Good on you for getting it done anyways, but I think you'll find this sort of thing easier if you work through some python tutorial problem sets, and read over the given solutions to see how more experienced programmers handle common tasks. You'll discover easier ways to implement many of the elements of this and future projects.
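To give a flavour of what a more compact version could look like, here is a hedged sketch, not the asker's code. trend_code is a name I made up, the numeric codes 1 to 6 follow the question, and it leans on np.diff and np.apply_along_axis. Because it covers the whole array, it also sidesteps the question's range(3) loops, which only visit indices 0 to 2 and so leave row 3 and column 3 of a 4x4 band at 0.

```python
import numpy as np


def trend_code(series):
    """Classify one pixel's values across the bands as a code from 1 to 6."""
    series = np.asarray(series)
    if np.all(series == series[0]):
        return 5  # stable
    nonzero = series[series != 0]  # like the question, ignore zero values
    steps = np.diff(nonzero)       # differences between consecutive values
    if np.all(steps >= 0):
        return 1  # increasing
    if np.all(steps <= 0):
        return 2  # decreasing
    peak = int(np.argmax(nonzero))
    if np.all(steps[:peak] >= 0) and np.all(steps[peak:] <= 0):
        return 3  # increasing and then decreasing
    trough = int(np.argmin(nonzero))
    if np.all(steps[:trough] <= 0) and np.all(steps[trough:] >= 0):
        return 4  # decreasing and then increasing
    return 6  # mixed


# Same example array as in the question; axis 0 is the band axis.
my_array = np.zeros((4, 4, 4))
my_array[1, 0, 0] = 2
my_array[3, 0, 0] = 4
my_array[1, 0, 1] = 10
my_array[3, 0, 1] = 8
my_array[0, 1, 2] = 5
my_array[1, 1, 2] = 7
my_array[2, 1, 2] = 4

# Run the classifier on every pixel's values along the band axis.
new_band = np.apply_along_axis(trend_code, 0, my_array)
print(new_band)
```

Returning integer codes directly removes the long chain of string comparisons, and applying the function along axis 0 makes the axis swapping unnecessary.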

Using Range Function

My goal is to make a program that takes an input (Battery_Capacity) and ultimately spits out a list of the (New_Battery_Capacity) and the Number of (Cycle) it takes for it ultimately to reach maximum capacity of 80.
Cycle = range (160)
Charger_Rate = 0.5 * Cycle
Battery_Capacity = float(raw_input("Enter Current Capacity:"))
New_Battery_Capacity = Battery_Capacity + Charger_Rate
if Battery_Capacity < 0:
    print 'Battery Reading Malfunction (Negative Reading)'
elif Battery_Capacity > 80:
    print 'Battery Reading Malfunction (Overcharged)'
elif float(Battery_Capacity) % 0.5 !=0:
    print 'Battery Malfunction (Charges Only 0.5 Interval)'
while Battery_Capacity >= 0 and Battery_Capacity < 80:
    print New_Battery_Capacity
I was wondering why my Cycle = range(160) isn't working in my program?
Your first problem is that you have the first two lines in the wrong order. You need a "Cycle" variable to exist before you can use it.
You'll still get an error when you swap them, though. You can't multiply a list by a float. A list comprehension is more what you want:
Charger_Rate = [i * .5 for i in Cycle]
As far as I can tell, the range(160) part is fine.
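For the overall goal (listing each cycle and the new capacity until the battery reaches 80), a loop that updates the capacity is probably closer to what you want than precomputing a range. Here's a minimal sketch in Python 3; charge_schedule and its parameters are names I made up, and it assumes the charger adds 0.5 per cycle as in your code.

```python
def charge_schedule(capacity, rate=0.5, maximum=80.0):
    """Return a list of (cycle, new_capacity) pairs until maximum is reached."""
    schedule = []
    cycle = 0
    while capacity < maximum:
        cycle += 1
        capacity = min(capacity + rate, maximum)  # never overshoot the maximum
        schedule.append((cycle, capacity))
    return schedule


for cycle, new_capacity in charge_schedule(78.5):
    print(cycle, new_capacity)
```

Note that the while loop in the question never changes Battery_Capacity, so it would print forever; here the capacity is updated on every pass, which is what lets the loop finish.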
