I'm learning Python from a web source which implemented the Insertion Sort algorithm using a combination of a for loop and a while loop. I thought of practicing the code by myself and I coded an algorithm using only for loops. I need some feedback on whether my code is correct, and whether it's valid.
def insertionSort(lst):
    for i in range(1,len(lst)):
        temp = lst[i]
        for j in range(0,i):
            if lst[j] > temp:
                lst[i], lst[j] = lst[j], lst[i]
    return lst
lst = [8, 6, 4, 20, 24, 2, 10, 12]
print(insertionSort(lst))
The output is: [2, 4, 6, 8, 10, 12, 20, 24]
Your algorithm could be called insertion sort in a broad sense, but it is different from what is commonly understood by insertion sort, as it compares the temp value with all previous values, while standard insertion sort only compares temp with the greater values among the previous values, and with one additional value that will become temp's predecessor (if there is one).
This means your implementation will have a best case time complexity that is O(𝑛²), while the best case time complexity of the standard algorithm is O(𝑛). That best case occurs when the input is already sorted.
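To make the best-case difference concrete, here is a small sketch that counts comparisons on an already sorted list. The helper names count_scan_all and count_standard are made up for this illustration, and the second function is only one common way to write the standard algorithm:

```python
def count_scan_all(lst):
    """Comparisons made by the for-loops-only variant, which scans every earlier item."""
    lst = list(lst)  # work on a copy
    count = 0
    for i in range(1, len(lst)):
        temp = lst[i]
        for j in range(i):
            count += 1
            if lst[j] > temp:
                lst[i], lst[j] = lst[j], lst[i]
    return count


def count_standard(lst):
    """Comparisons made by a standard insertion sort, which scans backwards and stops early."""
    lst = list(lst)  # work on a copy
    count = 0
    for i in range(1, len(lst)):
        temp = lst[i]
        j = i - 1
        while j >= 0:
            count += 1
            if lst[j] <= temp:  # insertion point found, stop scanning
                break
            lst[j + 1] = lst[j]  # shift the larger value one place right
            j -= 1
        lst[j + 1] = temp
    return count


already_sorted = list(range(100))
print(count_scan_all(already_sorted))  # prints 4950
print(count_standard(already_sorted))  # prints 99
```

For 100 sorted items the first variant still performs 99 + 98 + ... + 1 = 4950 comparisons, while the standard one stops after a single comparison per item.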
The typical insertion sort algorithm will have the inner loop going backwards, visiting values in descending order, and stopping when it finds a value that is less than (or equal to) the value to move (temp). During this loop the swaps are done with 2 consecutive values, and this can be improved by delaying the insertion of temp so that values only have to be copied one place to the right until the insertion point is found.
An implementation of that in Python could look like this:
def insertionSort(lst):
    for i, temp in enumerate(lst):
        for j in range(i - 1, -1, -1):
            if lst[j] <= temp:  # Found insertion point?
                lst[j + 1] = temp
                break
            lst[j + 1] = lst[j]  # Make room for temp
        else:  # temp is the least value so far: insert at the start
            lst[0] = temp
    return lst
Correctness testing
To test yourself whether your implementation correctly sorts a list, you can bombard your function with random input. For instance like this:
import random
for size in range(100):
    lst = list(range(size))
    random.shuffle(lst)
    finallist = lst[:]
    insertionSort(finallist)
    if finallist != sorted(finallist):
        print("Not sorted correctly:", lst, "to", finallist)
        break
else:
    print("All tests passed successfully")
Related
I followed an algorithm with a while loop, but one of the parameters of the question was that I use nested for loops, and I'm not sure how to do that.
This is the while loop:
i = len(lst)
while i > 0:
    big = lst.index(max(lst[0:i]))
    lst[big], lst[i-1] = lst[i-1], lst[big]
    i = i - 1
return lst
This is the question it's answering:
Input: [5,1,7,3]
First, find the largest number, which is 7.
Swap it and the number currently at the end of the list, which is 3. Now we have: [5,1,3,7]
Now, find the largest number, not including the 7, which is 5.
Swap it and the second to last number, which is 3. Now we have: [3,1,5,7].
Now, find the third largest number (excluding the first two), which is 3.
Swap it and the third to last number, which is 1.
Output: [1, 3, 5, 7]
What you're seeing in the algorithm is a selection sort. And here's the second solution you asked for (nested for loops):
def insertion_sort(arr):  # despite the name, this is a selection sort
    l = len(arr)
    for i in range(l-1, -1, -1):
        m = float("-inf")  # lower than any value in arr
        idx = -1
        for key, val in enumerate(arr[:i+1]):
            if m < val:
                m = val
                idx = key
        if idx != -1:
            arr[i], arr[idx] = arr[idx], arr[i]
    return arr
And a quick test:
arr = list(range(10))[::-1]
print(arr)
# prints [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
result = insertion_sort(arr)
print(result)
# prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
This looks like a (rather slow) sorting algorithm, namely selection sort (it is often mistaken for bubble sort). It iterates from the end of the list lst: on each pass it searches for the maximum value among the first i elements and swaps it with the last of them. A version that searches only the elements before that last one would fail when the maximum is already at the end, because it would then swap a smaller value into that position, so you'll need a check for this.
So from a first look, I'm not sure if i is defined before, but let's assume it's defined as the length of the list lst, as it seems to be. Let's start with the outer loop: we have a while loop that looks like it's counting down from i to 0. This is the opposite of an increasing for loop, so we can create a reversed range:
rev_range = list(range(0,len(lst)))  # list() so that .reverse() also works on Python 3
rev_range.reverse()
for j in rev_range:
    # perform the sort
We now have the outer loop for the counting-down while loop. The sort itself iterates forward until it finds the maximum. This is a forward for loop.
# sorting
max_val_so_far_index = j
# range(j) gives the indices of the first j elements of the list
for k in range(j):
    if lst[k] > lst[max_val_so_far_index]:
        max_val_so_far_index = k
# now we have the index of the maximum value
# swap
temp = lst[j]
lst[j] = lst[max_val_so_far_index]
lst[max_val_so_far_index] = temp
Let's put the two components together to get:
rev_range = list(range(0,len(lst)))  # list() so that .reverse() also works on Python 3
rev_range.reverse()
for j in rev_range:
    # perform the sort
    # sorting
    #print j
    max_val_so_far_index = j
    # get the first j items
    for k in range(j):
        if lst[k] > lst[max_val_so_far_index]:
            max_val_so_far_index = k
    # now we have the index of the maximum value
    # swap
    temp = lst[j]
    lst[j] = lst[max_val_so_far_index]
    lst[max_val_so_far_index] = temp
At the end lst is sorted.
The algorithm in the question is just another form of a bubble sort. The original algorithm uses two nested for loops. You can find a good explanation here.
I want to delete the duplicated elements in a list using this structure, but the list remains unchanged when I use this code. Could anyone help me please?
e.g.,item=[1,2,3,4,5,6,7,8,9,1,2,6,7]
def duplicated(item):
    i=0
    j=0
    while i<len(item):
        while j<len(item):
            if item[j]==item[i] and i!=j:
                del item[j]
            j+=1
        i+=1
    return item
Your code isn't working because you initialise j only once, at the start of the function. After the first iteration of the i loop, j equals len(item) and never gets reset, so the inner loop never runs again.
This means that you miss many duplicates.
So we need to initialise j on each pass of the outer loop, but to what value? If we initialise it as 0, we check for duplicates before index i in the list, which is a waste of time. To improve efficiency, we should initialise it as i+1 so that we only check the elements after i: we already know that there are no duplicates of the value at i before index i. This also simplifies the if, as we no longer need to check i != j.
One final thing: when you delete item[j] with del, all the indexes after it shift down by one. So we also need to subtract 1 from j, so that the next check looks at the element straight after the one we just deleted, which now sits at the same index j.
So, in code:
def duplicated(item):
    i = 0
    while i < len(item):
        j = i + 1
        while j < len(item):
            if item[j] == item[i]:
                del item[j]
                j -= 1
            j += 1
        i += 1
    return item
and it works with the examples people have given:
>>> duplicated([1,2,3,4,5,6,7,8,9,1,2,6,7])
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> duplicated([1,2,3,4,5,6,7,8,9,1,1,2,6,7])
[1, 2, 3, 4, 5, 6, 7, 8, 9]
However, this solution is currently of complexity O(n^2): because we are using nested loops, the time taken is proportional to the square of the size of the input list.
But if we modify the algorithm to use only one loop, we can cut the number of comparisons to O(n). This can be done by using a set for lookup. Adding an element to a set and checking whether an element is in it are both O(1) (average case), so we can use a set to speed things up. (Each del on a list is itself O(n), so the overall running time only approaches O(n) when few elements are deleted.)
So if we have a set which contains the elements we have already seen, then for each future element, if it is already in the seen set, we delete it, otherwise we add it to this set.
So, in code:
def duplicated(item):
    seen = set()
    i = 0
    while i < len(item):
        if item[i] in seen:
            del item[i]
        else:
            seen.add(item[i])
            i += 1
    return item
And I can confirm that this passes the test cases above as well.
One last thing I should point out: when deleting the element here, I did not subtract 1 from the index. Before, we subtracted because we knew the index would be incremented straight afterwards and we wanted it to stay the same. Here, it is only incremented in the else block, so after a deletion it stays the same without any extra work.
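Since each del on a list shifts every later element, a variant of the same set idea that builds a new list avoids that cost. This is only a sketch; the name deduplicated is made up here, and unlike the versions above it returns a new list instead of modifying item in place:

```python
def deduplicated(item):
    """Return a new list with duplicates removed, keeping first occurrences in order."""
    seen = set()
    result = []
    for value in item:
        if value not in seen:  # O(1) average-case membership test
            seen.add(value)
            result.append(value)  # appending is O(1) amortised, no shifting
    return result


print(deduplicated([1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 6, 7]))
# prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Like the set-based version above, it assumes the elements are hashable.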
Actually, the variable j should start at the item right after the one i points to.
def duplicated(item):
    i=0
    while i<len(item):
        j = i+1
        while j<len(item):
            if item[j]==item[i] and i!=j:
                del item[j]
                j -= 1
            j+=1
        i+=1
    return item
You should reinitialize your j var at each turn inside of the i loop, otherwise, j is always going to be equal to len(item) after the first iteration.
def duplicated(item):
    i=0
    while i<len(item):
        j=0
        while j<len(item):
            if item[j]==item[i] and i!=j:
                print(i,j )
                del item[j]
            j+=1
        i+=1
    return item
However, if you don't care about the order of your list, the best way to do what you want is probably to convert your list to a set and then back to a list, as a set can only contain distinct elements.
def duplicated(item):
    return list(set(item))
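If the order does matter, a comparable one-liner is to go through a dict instead of a set, since dict keys are unique and keep their insertion order (guaranteed from Python 3.7 on). A minimal sketch, with the function name duplicated_keep_order chosen just for this example:

```python
def duplicated_keep_order(item):
    # dict.fromkeys keeps one key per distinct value, in first-seen order
    return list(dict.fromkeys(item))


print(duplicated_keep_order([1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 6, 7]))
# prints [1, 2, 3, 4, 5, 6, 7, 8, 9]
```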
You need to reinitialize j at the start of the nested while loop each time:
def duplicated(item):
    i=0
    j=0
    while i<len(item)-1:
        j=i+1
        while j<len(item):
            if item[j]==item[i]:
                del item[j]
                j -= 1
            j+=1
        i+=1
    return item
OUT
[1, 2, 3, 4, 5, 6, 7, 8, 9]
However, you can try the simpler code below; it preserves the insertion order of the list:
def duplicated(item):
    unique_list=[]
    for i in item:
        if i not in unique_list:
            unique_list.append(i)
    return unique_list
I know merge sort is the best way to sort a list of arbitrary length, but I am wondering how to optimize my current method.
def sortList(l):
    '''
    Recursively sorts an arbitrary list, l, to increasing order.
    '''
    #base case.
    if len(l) == 0 or len(l) == 1:
        return l
    oldNum = l[0]
    newL = sortList(l[1:]) #recursive call.
    #if oldNum is the smallest number, add it to the beginning.
    if oldNum <= newL[0]:
        return [oldNum] + newL
    #find where oldNum goes.
    for n in xrange(len(newL)):
        if oldNum >= newL[n]:
            try:
                if oldNum <= newL[n+1]:
                    return newL[:n+1] + [oldNum] + newL[n+1:]
            #if index n+1 is non-existent, oldNum must be the largest number.
            except IndexError:
                return newL + [oldNum]
What is the complexity of this function? I was thinking O(n^2) but I wasn't sure. Also, is there any way to further optimize this procedure? (besides ditching it and going for merge sort!).
There's a few places I'd optimize your code.
You do a lot of list copies: each time you slice, you create a new copy of the list. That can be avoided by adding an index to the function declaration that indicates where in the array to start sorting from.
You should follow PEP 8 for naming: sort_list rather than sortList.
The code that does the insertion is a bit weird; intentionally raising an out-of-bounds index exception isn't normal programming practice. Instead, just percolate the value up the array until it's in the right place.
Applying these changes gives this code:
def sort_list(l, i=0):
    if i == len(l): return
    sort_list(l, i+1)
    for j in xrange(i+1, len(l)):
        if l[j-1] <= l[j]: return
        l[j-1], l[j] = l[j], l[j-1]
This now sorts the array in-place, so there's no return value.
Here's some simple tests:
cases = [
    [1, 2, 0, 3, 4, 5],
    [0, 1, 2, 3, 4, 5],
    [5, 4, 3, 2, 1, 1]
]
for c in cases:
    got = c[:]
    sort_list(got)
    if sorted(c) != got:
        print "sort_list(%s) = %s, want %s" % (c, got, sorted(c))
The time complexity is, as you suggest, O(n^2) where n is the length of the list. My version uses O(n) additional memory, whereas yours, because of the way the list gets copied at each stage, uses O(n^2).
One more step, which further improves the memory usage is to eliminate the recursion. Here's a version that does that:
def sort_list(l):
    for i in xrange(len(l)-2, -1, -1):
        for j in xrange(i+1, len(l)):
            if l[j-1] <= l[j]: break
            l[j-1], l[j] = l[j], l[j-1]
This works just the same as the recursive version, but does it iteratively; first sorting the last two elements in the array, then the last three, then the last four, and so on until the whole array is sorted.
This still has runtime complexity O(n^2), but now uses O(1) additional memory. Also, avoiding recursion means you can sort longer lists without hitting the notoriously low recursion limit in Python. And another benefit is that this code is now O(n) in the best case (when the array is already sorted).
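The code in this answer is written for Python 2 (xrange, print statement). For readers on Python 3, the iterative version might be sketched as below; the name sort_list_py3 and the random self-check are additions for this example, not part of the original answer:

```python
import random


def sort_list_py3(l):
    """In-place iterative sort: grows a sorted suffix from the end of the list."""
    for i in range(len(l) - 2, -1, -1):
        # l[i+1:] is already sorted; sink l[i] to the right until it fits
        for j in range(i + 1, len(l)):
            if l[j - 1] <= l[j]:
                break
            l[j - 1], l[j] = l[j], l[j - 1]


# Compare against the built-in sorted() on random inputs of various sizes
for size in range(20):
    data = [random.randint(-50, 50) for _ in range(size)]
    expected = sorted(data)
    sort_list_py3(data)
    assert data == expected
print("all sizes sorted correctly")
```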
A young Gauss came up with a formula that seems appropriate here. The story goes that in grade school his teacher was very tired, and to keep the class busy for a while they were told to add up all the numbers from one to one hundred. Young Gauss came back with this:
\sum_{k=1}^{n} k = \frac{n(n+1)}{2}
This is applicable here because your run-time is proportional to the sum of all the numbers up to the length of your list: in the worst case (a list sorted in descending order), your function goes through the entire length of newL each time to find the position of the next element, which belongs at the end of the list.
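As a quick numeric check of that sum, here is a small sketch. It uses a simplified cost model that is an assumption of this example: inserting into an already placed list of k items costs k steps in the worst case, so n items cost 0 + 1 + ... + (n-1) steps. Both function names are made up for the illustration:

```python
def triangular(n):
    """Closed form for 1 + 2 + ... + n."""
    return n * (n + 1) // 2


def worst_case_scan_steps(n):
    """Total steps if the k-th insertion has to walk past all k items placed so far."""
    return sum(range(n))  # 0 + 1 + ... + (n - 1)


print(triangular(100))             # prints 5050
print(worst_case_scan_steps(100))  # prints 4950
print(triangular(99))              # prints 4950, the same total in closed form
```

The total grows with the square of n, which is where the O(n^2) estimate comes from.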
I'm fairly new to programming; I've only been studying Python for a few weeks. I've been given an exercise recently that asks me to generate a list of integers, and then manually sort the numbers from lowest to highest in a separate list.
import random
unordered = list(range(10))
ordered = []
lowest = 0
i = 0
random.shuffle(unordered)
lowest = unordered[0]
while i in unordered:
    if unordered[i] < lowest:
        lowest = unordered[i]
    i += 1
    if i >= len(unordered):
        i = 0
    ordered.append(lowest)
    unordered.remove(lowest)
    lowest = unordered[i]
print(ordered)
This is what I have so far, and to be quite frank, it doesn't work at all. The pseudocode I have been given is this:
Create an empty list to hold the ordered elements
While there are still elements in the unordered list
    Set a variable, lowest, to the first element in the unordered list
    For each element in the unordered list
        If the element is lower than lowest
            Assign the value of that element to lowest
    Append lowest to the ordered list
    Remove lowest from the unordered list
Print out the ordered list
The biggest issue I'm having so far is that my counter doesn't reliably give me a way to pick out the lowest number from my list unordered. And then I'm having issues with indexing my list i.e. the index being out of range. Can anyone give me a bit of feedback on where I'm going wrong?
Also, I was given this info which I'm not really sure about:
You can use an established method to sort the list called the Selection Sort.
I'm not supposed to be using Python's built in sort methods this time around. It's all supposed to be done manually. Thanks for any help!
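The pseudocode in the question maps almost line for line onto Python. The following is a minimal sketch of that mapping; the function name manual_sort is made up for this example, and it works on a copy so the caller's list is left untouched (the pseudocode itself empties the unordered list):

```python
import random


def manual_sort(unordered):
    remaining = list(unordered)  # copy, so the input list is not emptied
    ordered = []                 # empty list to hold the ordered elements
    while remaining:             # while there are still unordered elements
        lowest = remaining[0]    # start with the first remaining element
        for element in remaining:
            if element < lowest:
                lowest = element
        ordered.append(lowest)
        remaining.remove(lowest)  # removes one occurrence, so duplicates are kept
    return ordered


numbers = list(range(10))
random.shuffle(numbers)
print(manual_sort(numbers))  # prints [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

No index counter is needed here: the for loop visits every remaining element, and the while loop ends as soon as the list is empty.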
You can do this without having to create another list.
x = [5, 4, 3, 2, 5, 1]
n = len(x)
# Traverse through all list elements
for i in range(n):
    # Traverse the list from 0 to n-i-1
    # (The last element will already be in place after first pass, so no need to re-check)
    for j in range(0, n-i-1):
        # Swap if current element is greater than next
        if x[j] > x[j+1]:
            x[j], x[j+1] = x[j+1], x[j]
print(x)
This works with duplicates and descending lists. It also includes a minor optimization to avoid an unnecessary comparison on the last element.
Note: this answer and all the others use bubble sort, which is simple but inefficient. If you're looking for performance, you're much better off with another sorting algorithm. See which is best sorting algorithm and why?
You've just got some of the order wrong: you need to append to your ordered list each time around
import random
unordered = list(range(10))
ordered = []
i = 0
random.shuffle(unordered)
print(unordered)
lowest = unordered[0]
while len(unordered) > 0:
    if unordered[i] < lowest:
        lowest = unordered[i]
    i += 1
    if i == len(unordered):
        ordered.append(lowest)
        unordered.remove(lowest)
        if unordered:
            lowest = unordered[0]
        i = 0
print(ordered)
You're not supposed to create a new algorithm for sorting a list, just implement this one:
http://en.wikipedia.org/wiki/Bubble_sort
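I found this works well for any number of inputs, as long as the pass is repeated until the whole list is in order:
x = [3, 4, 100, 34, 45]
for _ in range(len(x) - 1):  # a single pass is not enough in general
    for i in range(len(x) - 1):
        if x[i] > x[i + 1]:
            x[i], x[i + 1] = x[i + 1], x[i]
print(x)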
The above code won't work if you have repeated elements.
ordered=[]
i=0
j=0
x = [100, 3, 4, 100, 34, 45]
lowest=x[0]
while len(x)>0:
    for i in range(0,len(x)):
        if x[i]<=lowest:
            lowest=x[i]
    ordered.append(lowest)
    x.remove(lowest)
    if len(x)>0:  # reset lowest whenever anything is left, including the last element
        lowest=x[0]
print(ordered)
def sort(x):
    l=len(x)
    for i in range(l):
        for j in range((i+1),l):
            if x[i]>x[j]:
                l1=x[i]
                x[i]=x[j]
                x[j]=l1
    print(x)
l=[8,4,2,6,5,1,12,18,78,45]
sort(l)
From the first number in the list, run a loop to find the lowest value. After that, swap it with the first number in the list. Repeat this loop for the remaining numbers in the list.
nlist=[int(a) for a in input('Please insert your list of numbers ').split()]
for loop1 in range (0,len(nlist)-1): # outer loop
    min=nlist[loop1]
    for loop2 in range (loop1 + 1,len(nlist)): # inner loop to compare
        if min > nlist[loop2]:
            min=nlist[loop2]
            index=loop2
    if nlist[loop1] != min:
        swap=nlist[loop1]
        nlist[loop1]=min
        nlist[index]=swap
print('Your hand-sorted list is',nlist)
I have a list of numbers (example: [-1, 1, -4, 5]) and I have to remove numbers from the list without changing the total sum of the list. I want to remove the numbers with biggest absolute value possible, without changing the total, in the example removing [-1, -4, 5] will leave [1] so the sum doesn't change.
I wrote the naive approach, which is finding all possible combinations that don't change the total and seeing which one removes the biggest absolute value. But that is really slow, since the actual list will be a lot bigger than that.
Here's my combinations code:
from itertools import chain, combinations
def remove(items):
    all_comb = chain.from_iterable(combinations(items, n+1)
                                   for n in xrange(len(items)))
    biggest = None
    biggest_sum = 0
    for comb in all_comb:
        if sum(comb) != 0:
            continue # this comb would change total, skip
        abs_sum = sum(abs(item) for item in comb)
        if abs_sum > biggest_sum:
            biggest = comb
            biggest_sum = abs_sum
    return biggest
print remove([-1, 1, -4, 5])
It correctly prints (-1, -4, 5). However, I am looking for some clever, more efficient solution than looping over all possible item combinations.
Any ideas?
If you redefine the problem as finding a subset whose sum equals the sum of the complete set, you will realize that this is an NP-hard problem (subset sum),
so there is no known polynomial-time solution for it.
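When the numbers are integers of moderate size, subset-sum problems are often attacked with dynamic programming over the reachable sums, which is pseudo-polynomial rather than polynomial. The following is only a sketch of that idea applied to this question, not a method taken from the thread; the function name best_zero_sum_removal is hypothetical, and it assumes integer inputs:

```python
def best_zero_sum_removal(items):
    """Return a zero-sum selection of items with the largest total absolute value.

    best maps each reachable sum to (total absolute value, chosen items).
    The number of entries is bounded by the range of possible sums.
    """
    best = {0: (0, ())}  # the empty selection sums to 0
    for value in items:
        updated = dict(best)  # selections that skip this value
        for s, (abs_total, chosen) in best.items():
            key = s + value
            candidate = (abs_total + abs(value), chosen + (value,))
            if key not in updated or candidate[0] > updated[key][0]:
                updated[key] = candidate
        best = updated
    return best[0][1]


print(best_zero_sum_removal([-1, 1, -4, 5]))  # prints (-1, -4, 5)
```

Removing a selection that sums to zero leaves the total unchanged, so the entry stored under sum 0 is the answer. Keeping the chosen tuples in the table is wasteful for long lists; a more careful version would store back-pointers instead.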
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Copyright © 2009 Clóvis Fabrício Costa
# Licensed under GPL version 3.0 or higher
from itertools import chain, combinations
def posneg_calcsums(subset):
    sums = {}
    for group in chain.from_iterable(combinations(subset, n+1)
                                     for n in xrange(len(subset))):
        sums[sum(group)] = group
    return sums
def posneg(items):
    positive = posneg_calcsums([item for item in items if item > 0])
    negative = posneg_calcsums([item for item in items if item < 0])
    for n in sorted(positive, reverse=True):
        if -n in negative:
            return positive[n] + negative[-n]
    else:
        return None
print posneg([-1, 1, -4, 5])
print posneg([6, 44, 1, -7, -6, 19])
It works fine, and is a lot faster than my first approach. Thanks to Alon for the wikipedia link and ivazquez|laptop on #python irc channel for a good hint that led me into the solution.
I think it can be further optimized - I want a way to stop calculating the expensive part once the solution was found. I will keep trying.
Your requirements don't say if the function is allowed to change the list order or not. Here's a possibility:
def remove(items):
    items.sort()
    running = original = sum(items)
    try:
        items.index(original) # we just want the exception
        return [original]
    except ValueError:
        pass
    if abs(items[0]) > items[-1]:
        running -= items.pop(0)
    else:
        running -= items.pop()
    while running != original:
        try:
            running -= items.pop(items.index(original - running))
        except ValueError:
            if running > original:
                running -= items.pop()
            elif running < original:
                running -= items.pop(0)
    return items
This sorts the list (big items will be at the end, smaller ones will be at the beginning) and calculates the sum, and removes an item from the list. It then continues removing items until the new total equals the original total. An alternative version that preserves order can be written as a wrapper:
from copy import copy
def remove_preserve_order(items):
    a = remove(copy(items))
    return [x for x in items if x in a]
Though you should probably rewrite this with collections.deque if you really want to preserve order. If you can guarantee uniqueness in your list, you can get a big win by using a set instead.
We could probably write a better version that traverses the list to find the two numbers closest to the running total each time and remove the closer of the two, but then we'd probably end up with O(N^2) performance. I believe this code's performance will be O(N*log(N)) as it just has to sort the list (I hope Python's list sorting isn't O(N^2)) and then get the sum.
I do not program in Python so my apologies for not offering code. But I think I can help with the algorithm:
Find the sum
Add numbers with the lowest value until you get to the same sum
Everything else can be deleted
I hope this helps
This can be solved using integer programming. You can define a binary variable s_i for each of your list elements x_i and minimize \sum_i s_i, limited by the constraint that \sum_i (x_i*s_i) is equal to the original sum of your list.
Here's an implementation using the lpSolve package in R:
library(lpSolve)
get.subset <- function(lst) {
  res <- lp("min", rep(1, length(lst)), matrix(lst, nrow=1), "=", sum(lst),
            binary.vec=seq_along(lst))
  lst[res$solution > 0.999]
}
Now, we can test it with a few examples:
get.subset(c(1, -1, -4, 5))
# [1] 1
get.subset(c(6, 44, 1, -7, -6, 19))
# [1] 44 -6 19
get.subset(c(1, 2, 3, 4))
# [1] 1 2 3 4