Explain the bubble sort algorithm? - python

I'm trying to learn more about algorithms and i'm looking into the bubble sort algorithm. I found the script for it on github but I cant really understand it. I'm sorta new to python so can someone explain to me what's going on in this script.
from __future__ import print_function
def bubble_sort(arr):
n = len(arr)
# Traverse through all array elements
for i in range(n):
# Last i elements are already in place
for j in range(0, n-i-1):
# traverse the array from 0 to n-i-1
# Swap if the element found is greater
# than the next element
if arr[j] > arr[j+1] :
arr[j], arr[j+1] = arr[j+1], arr[j]
return arr
if __name__ == '__main__':
try:
raw_input # Python 2
except NameError:
raw_input = input # Python 3
user_input = raw_input('Enter numbers separated by a comma:').strip()
unsorted = [int(item) for item in user_input.split(',')]
print(*bubble_sort(unsorted), sep=',')

Visualize the array as a vertical list of numbers, with the first element (index 0) on the bottom, and the last element (index n-1) at the top. The idea of bubble sort is that numbers "bubble up" to the top, into the place where they belong.
For example, [2,3,1] would first look at 2 and 3, and not do anything because they're already in order. Then it would look at 3 and 1, swapping them since 3>1 and getting [2,1,3]. Then we repeat by looking at 2 and 1, swapping them since 2>1 to get [1,2,3], which is in order.
The idea is that "3" and then "2" bubbled up to the correct position.
Note that after the 3 bubbled up, we don't have to compare 2 and 3, because we know the last element is already higher than everything before it. In general, after i iterations of bubble sort, there's no need to compare the last i elements.

from __future__ import print_function Here we are essentially bringing in code that was written by somebody else, so that we may use it.
def bubble_sort(arr): This is is a function definition. A function definition is preceded by the keyword def. Following that is the function's name. In this case it is called bubble_sort. What we have in the parenthesis are called parameters. A parameter is something we give to a function, so that the function may use it, e.g., multiply the parameter by a number, sort the list, or send some information to a server.
Since we are on the topic of functions, I would suggest looking up process abstraction.
arr Here I am referring to arr within the function's definition. It is short for array, which is a list type. In python we could define an array like so fruits = ["banana", "apple", "orange"]. Arrays are useful for grouping together like pieces of information, and in python I believe this are actually known as a list type. So, conceptually, it may be easier to imagine a list rather than the more esoteric array.
n = len(arr) We are literally assigning the length of the array into the variable n. This is probably shorthand for number of elements. len(arr) is a function that takes an array/list, and returns its length. Similarly, one could call print len(arr) or simply len(arr).
for j in range(0, n-i-1): This is a bit more complicated since it requires an understanding of the algorithm in play, i.e., bubblesort. I won't explain how bubblesort works since there is probably a ton of videos online, but I will explain the bit within the parenthesis.
(0, n-i-1) We want to make comparisons between our current element and the ones preceding it. The ones preceding our current element are greater than the current element. This means if we are at element i, then we have no need to compare elements from i to n, inclusive. We subtract i from n, which leaves us with elements 0 through i. We don't need to compare i to itself, so we subtract an additional 1. This is due to j cycling through the array, and being potentially the same as i.
if arr[j] > arr[j+1] : This is a conditional statement, also known as a branching statement, or an if-statement. The condition, arr[j] > arr[j+1], is true with the element at position j is greater than the one at j+1.
arr[j], arr[j+1] = arr[j+1], arr[j] I think this is shorthand for swapping. A simple swap is shown below.
temp = arr[j]
arr[j] = arr[j+1]
arr[j+1] = temp
return arr Returns the sorted array.
The last bit I am not familiar with since I don't use python much. Perhaps this could be some research for you.
Hopefully this helps.

Related

Execution Timed Out on codewars when finding the least used element in an array. Needs to be optimized but dont know how

Instructions on codewars:
There is an array with some numbers. All numbers are equal except for one. Try to find it!
find_uniq([ 1, 1, 1, 2, 1, 1 ]) == 2
find_uniq([ 0, 0, 0.55, 0, 0 ]) == 0.55
It’s guaranteed that array contains at least 3 numbers.
The tests contain some very huge arrays, so think about performance.
This is the code I wrote:
def find_uniq(arr):
for n in arr:
if arr.count(n) == 1:
return n
exit()
It works as follows:
For every character in the array, if that character appears only once, it returns said character and exits the code. If the character appears more than once, it does nothing
When attempting the code on codewars, I get the following error:
STDERR
Execution Timed Out (12000 ms)
I am a beginner so I have no idea how to further optimize the code in order for it to not time out
The first version of my code looked like this:
def find_uniq(arr):
arr.sort()
rep = str(arr)
for character in arr:
cantidad = arr.count(character)
if cantidad > 1:
rep = rep.replace(str(character), "")
rep = rep.replace("[", "")
rep = rep.replace("]", "")
rep = rep.replace(",", "")
rep = rep.replace(" ", "")
rep = float(rep)
n = rep
return n
After getting timed out, I assumed it was due to the repetitive replace functions and the fact that the code had to go through every element even if it had already found the correct one, since the code was deleting the incorrect ones, instead of just returning the correct one
After some iterations that I didn't save we got to the current code, which checks if the character is only once in the array, returns that and exits
def find_uniq(arr):
for n in arr:
if arr.count(n) == 1:
return n
exit()
I have no clue how to further optimize this
.count() iterates over the entire array every time that you call it. If your array has n elements, it will iterate over the array n times, which is quite slow.
You can use collections.Counter as Unmitigated suggests, but if you're not familiar with the module, it might seem overkill for this problem. Since in this case you know that there's only two unique elements in the array, you can get all of the unique elements using set(), and then check the frequency of each unique element:
def find_uniq(arr):
for n in set(arr):
if arr.count(n) == 1:
return n
You can use a dict or collections.Counter to get the frequency of each element with linear time complexity. Then return the element with a frequency of one.
def find_uniq(l):
from collections import Counter
return Counter(l).most_common()[-1][0]
Compare the first two numbers. If they match, find the one in the array that doesn't match (longest solution). Otherwise, return the one that doesn't match the third. Coded:
def find_uniq(arr):
if arr[0]==arr[1]:
target=arr[0]
for i in range(2,len(arr)):
if arr[i] != target:
return arr[i]
else:
if arr[0]==arr[2]:
return arr[1]
else:
return arr[0]
In your original code:
def find_uniq(arr):
for n in arr:
if arr.count(n) == 1:
return n
exit() # note: this line does nothing because you already returned
you're calling arr.count once for each element in the array (assuming the worst case scenario where the unique element is at the very end). Each call to arr.count(n) scans through the entire array counting up n -- so you're iterating over the entire array of N elements N times, which makes this O(N^2) -- very slow if N is big!
The second version of your code has the same problem, but it adds a huge amount of extra complexity by turning the list into a string and then trying to parse the string -- don't do that!
The way to make this fast is to iterate over the entire list once and keep track of the count of each item as you go. This is easiest to do with the built in collections.Counter class:
from collections import Counter
def find_uniq(arr):
return next(i for i, c in Counter(arr).items() if c == 1)
Given the constraint that there are only two different values in the array and exactly one of them is unique, you can make this more efficient (such that you don't even need to iterate over the entire array in all cases) by breaking it into two possibilities: either the first two items are identical and you just need to look for the item that's not equal to those, or they're different and you just need to return the one that's not equal to the third.
def find_uniq(arr):
if arr[0] == arr[1]:
# First two items are the same, iterate through
# the rest of the array to find the unique one.
return next(i for i in arr if i != arr[0])
# Otherwise, either arr[0] or arr[1] is unique.
return arr[0] if arr[1] == arr[2] else arr[1]
In this approach, you only ever need to iterate through the array as far as the unique item (or exactly one item past it in the case where it's one of the first two items). In the specific case where the unique item is toward the start of a very long array, this will be much faster than an approach that iterates over the entire array no matter what. In the worst case scenario, you will still have only a single iteration.

can't quite figure out what's not working (checkio excersise "Even the Last")

You are given an array of integers. You should find the sum of the integers with even indexes (0th, 2nd, 4th...). Then multiply this summed number and the final element of the array together. Don't forget that the first element has an index of 0.
For an empty array, the result will always be 0 (zero).
Input: A list of integers.
Output: The number as an integer.
Precondition: 0 ≤ len(array) ≤ 20
all(isinstance(x, int) for x in array)
all(-100 < x < 100 for x in array
result = 0
if array:
for element in array:
i = array.index(element)
if i%2 == 0:
result += element
else:
pass
else:
return 0
return result
Last_digit = array[-1]
final_result = result*Last_digit
return final_result
print(final_result)```
I've figured out the problem, that you've shared the array you're having problem with. Since you have this array :
[-37,-36,-19,-99,29,20,3,-7,-64,84,36,62,26,-76,55,-24,84,49,-65,41]
If you notice here, 84 appears twice, first at index 9 and then 16. The method you're using to get index of elements, .index returns the index of the first instance the element is found in the list.Therefore for the value of 84, the index is taken as 9 and not 16 which is an odd value, this does not add 84 to your sum. You should rather use enumerate for your code:
for idx, element in enumerate(array):
if idx %2 == 0:
result += element
First, I recommend reading the stackexchange guides on posting a well-formed question. You need to state what your goal is, what you've tried, what errors get thrown, and what the output should look like -- along with code examples and a minimal reproducible example as needed.
However, I'll help you out anyway.
You have a dangling return at line 11:
else:
return 0
return result
This makes no sense, as you've already returned 0. This is also apparently a snippet from a function, no? Post the whole function. But based on the instructions, you could try this:
import random
array = random.sample(range(-100, 100), 20)
def etl_func(arr):
arrsum = 0
for i, val in enumerate(arr):
if i%2 == 0: arrsum += val
return (arrsum * arr[-1])
answer = etl_func(array)
print(answer)
Note that importing random and using array = random.sample(range(-100, 100), 20) are not necessary if you're already GIVEN an array to work with. They're included here just as an example.
Also note that it's unnecessary to use an else: pass. If the condition evaluates to true (i.e. i%2 == 0), the if block will be executed. If i%2 != 0, the loop will short circuit automatically and move to the next iteration. Adding else: pass is like telling someone sitting in your chair to sit in your chair. You're telling the program to do what it's already going to do anyway. There's nothing necessarily wrong with including the else: pass, if it really want to... but it's just adding lines of meaningless code, which nobody wants to deal with.
EDIT: I don't know whether you were supposed to write a function or just some code (back to the "ask a well-formed question" issue...), so I went with a function. It should be trivial to turn the function into just plain code -- but you want to get into the habit of writing good functions for reusability and modularity. Makes everything run more smoothly and elegantly, and makes troubleshooting much easier.
This function also works for the array mentioned in the comments to your original post.
In addition, if you need a direct replacement for your code (rather than a function... I'm not familiar with checkio or how your answers are supposed to be formatted), and you already have the array of integers stored in the variable array, try this:
arrsum = 0
for i, val in enumerate(array):
if i%2 == 0: arrsum += val
print(arrsum * array[-1])
Since your question didn't say anything about using or defining functions, return statements shouldn't appear anywhere. There's nothing to return unless you're writing a function.

Why does the range(first_num, second_num) not include second_num? [duplicate]

This question already has answers here:
Why are slice and range upper-bound exclusive?
(6 answers)
Closed last month.
>>> range(1,11)
gives you
[1,2,3,4,5,6,7,8,9,10]
Why not 1-11?
Did they just decide to do it like that at random or does it have some value I am not seeing?
Because it's more common to call range(0, 10) which returns [0,1,2,3,4,5,6,7,8,9] which contains 10 elements which equals len(range(0, 10)). Remember that programmers prefer 0-based indexing.
Also, consider the following common code snippet:
for i in range(len(li)):
pass
Could you see that if range() went up to exactly len(li) that this would be problematic? The programmer would need to explicitly subtract 1. This also follows the common trend of programmers preferring for(int i = 0; i < 10; i++) over for(int i = 0; i <= 9; i++).
If you are calling range with a start of 1 frequently, you might want to define your own function:
>>> def range1(start, end):
... return range(start, end+1)
...
>>> range1(1, 10)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Although there are some useful algorithmic explanations here, I think it may help to add some simple 'real life' reasoning as to why it works this way, which I have found useful when introducing the subject to young newcomers:
With something like 'range(1,10)' confusion can arise from thinking that pair of parameters represents the "start and end".
It is actually start and "stop".
Now, if it were the "end" value then, yes, you might expect that number would be included as the final entry in the sequence. But it is not the "end".
Others mistakenly call that parameter "count" because if you only ever use 'range(n)' then it does, of course, iterate 'n' times. This logic breaks down when you add the start parameter.
So the key point is to remember its name: "stop".
That means it is the point at which, when reached, iteration will stop immediately. Not after that point.
So, while "start" does indeed represent the first value to be included, on reaching the "stop" value it 'breaks' rather than continuing to process 'that one as well' before stopping.
One analogy that I have used in explaining this to kids is that, ironically, it is better behaved than kids! It doesn't stop after it supposed to - it stops immediately without finishing what it was doing. (They get this ;) )
Another analogy - when you drive a car you don't pass a stop/yield/'give way' sign and end up with it sitting somewhere next to, or behind, your car. Technically you still haven't reached it when you do stop. It is not included in the 'things you passed on your journey'.
I hope some of that helps in explaining to Pythonitos/Pythonitas!
Exclusive ranges do have some benefits:
For one thing each item in range(0,n) is a valid index for lists of length n.
Also range(0,n) has a length of n, not n+1 which an inclusive range would.
It works well in combination with zero-based indexing and len(). For example, if you have 10 items in a list x, they are numbered 0-9. range(len(x)) gives you 0-9.
Of course, people will tell you it's more Pythonic to do for item in x or for index, item in enumerate(x) rather than for i in range(len(x)).
Slicing works that way too: foo[1:4] is items 1-3 of foo (keeping in mind that item 1 is actually the second item due to the zero-based indexing). For consistency, they should both work the same way.
I think of it as: "the first number you want, followed by the first number you don't want." If you want 1-10, the first number you don't want is 11, so it's range(1, 11).
If it becomes cumbersome in a particular application, it's easy enough to write a little helper function that adds 1 to the ending index and calls range().
It's also useful for splitting ranges; range(a,b) can be split into range(a, x) and range(x, b), whereas with inclusive range you would write either x-1 or x+1. While you rarely need to split ranges, you do tend to split lists quite often, which is one of the reasons slicing a list l[a:b] includes the a-th element but not the b-th. Then range having the same property makes it nicely consistent.
The length of the range is the top value minus the bottom value.
It's very similar to something like:
for (var i = 1; i < 11; i++) {
//i goes from 1 to 10 in here
}
in a C-style language.
Also like Ruby's range:
1...11 #this is a range from 1 to 10
However, Ruby recognises that many times you'll want to include the terminal value and offers the alternative syntax:
1..10 #this is also a range from 1 to 10
Consider the code
for i in range(10):
print "You'll see this 10 times", i
The idea is that you get a list of length y-x, which you can (as you see above) iterate over.
Read up on the python docs for range - they consider for-loop iteration the primary usecase.
Basically in python range(n) iterates n times, which is of exclusive nature that is why it does not give last value when it is being printed, we can create a function which gives
inclusive value it means it will also print last value mentioned in range.
def main():
for i in inclusive_range(25):
print(i, sep=" ")
def inclusive_range(*args):
numargs = len(args)
if numargs == 0:
raise TypeError("you need to write at least a value")
elif numargs == 1:
stop = args[0]
start = 0
step = 1
elif numargs == 2:
(start, stop) = args
step = 1
elif numargs == 3:
(start, stop, step) = args
else:
raise TypeError("Inclusive range was expected at most 3 arguments,got {}".format(numargs))
i = start
while i <= stop:
yield i
i += step
if __name__ == "__main__":
main()
The range(n) in python returns from 0 to n-1. Respectively, the range(1,n) from 1 to n-1.
So, if you want to omit the first value and get also the last value (n) you can do it very simply using the following code.
for i in range(1, n + 1):
print(i) #prints from 1 to n
It's just more convenient to reason about in many cases.
Basically, we could think of a range as an interval between start and end. If start <= end, the length of the interval between them is end - start. If len was actually defined as the length, you'd have:
len(range(start, end)) == start - end
However, we count the integers included in the range instead of measuring the length of the interval. To keep the above property true, we should include one of the endpoints and exclude the other.
Adding the step parameter is like introducing a unit of length. In that case, you'd expect
len(range(start, end, step)) == (start - end) / step
for length. To get the count, you just use integer division.
Two major uses of ranges in python. All things tend to fall in one or the other
integer. Use built-in: range(start, stop, step). To have stop included would mean that the end step would be assymetric for the general case. Consider range(0,5,3). If default behaviour would output 5 at the end, it would be broken.
floating pont. This is for numerical uses (where sometimes it happens to be integers too). Then use numpy.linspace.

sub-sum from a list without loops

So i'm studying recursion and have to write some codes using no loops
For a part of my code I want to check if I can sum up a subset of a list to a specific number, and if so return the indexes of those numbers on the list.
For example, if the list is [5,40,20,20,20] and i send it with the number 60, i want my output to be [1,2] since 40+20=60.
In case I can't get to the number, the output should be an empty list.
I started with
def find_sum(num,lst,i,sub_lst_sum,index_lst):
if num == sub_lst_sum:
return index_lst
if i == len(sum): ## finished going over the list without getting to the sum
return []
if sub_lst_sum+lst[i] > num:
return find_sum(num,lst,i+1,sub_lst_sum,index_lst)
return ?..
index_lst = find_sum(60,[5,40,20,20,20],0,0,[])
num is the number i want to sum up to,
lst is the list of numbers
the last return should go over both the option that I count the current number in the list and not counting it.. (otherwise in the example it will take the five and there will be no solution).
I'm not sure how to do this..
Here's a hint. Perhaps the simplest way to go about it is to consider the following inductive reasoning to guide your recursion.
If
index_list = find_sum(num,lst,i+1)
Then
index_list = find_sum(num,lst,i)
That is, if a list of indices can be use to construct a sum num using elements from position i+1 onwards, then it is also a solution when using elements from position i onwards. That much should be clear. The second piece of inductive reasoning is,
If
index_list = find_sum(num-lst[i],lst,i+1)
Then
[i]+index_list = find_sum(num,lst,i)
That is, if a list of indices can be used to return a sum num-lst[i] using elements from position i+1 onwards, then you can use it to build a list of indices whose respective elements sum is num by appending i.
These two bits of inductive reasoning can be translated into two recursive calls to solve the problem. Also the first one I wrote should be used for the second recursive call and not the first (question: why?).
Also you might want to rethink using empty list for the base case where there is no solution. That can work, but your returning as a solution a list that is not a solution. In python I think None would be a the standard idiomatic choice (but you might want to double check that with someone more well-versed in python than me).
Fill in the blanks
def find_sum(num,lst,i):
if num == 0 :
return []
elif i == len(lst) :
return None
else :
ixs = find_sum(???,lst,i+1)
if ixs != None :
return ???
else :
return find_sum(???,lst,i+1)

Why doesnt this sort function work for Python

Please tell me why this sort function for Python isnt working :)
def sort(list):
if len(list)==0:
return list
elif len(list)==1:
return list
else:
for b in range(1,len(list)):
if list[b-1]>list[b]:
print (list[b-1])
hold = list[b-1]
list[b-1]=list[b]
list[b] = hold
a = [1,2,13,131,1,3,4]
print (sort(a))
It looks like you're attempting to implement a neighbor-sort algorithm. You need to repeat the loop N times. Since you only loop through the array once, you end up with the largest element being in its place (i.e., in the last index), but the rest is left unsorted.
You could debug your algorithm on your own, using pdb.
Or, you could use python's built-in sorting.
Lets take a look at you code. Sort is a built in Python function (at least I believe it is the same for both 2.7 and 3.X) So when you are making your own functions try to stay away from name that function with inbuilt functions unless you are going to override them (Which is a whole different topic.) This idea also applies to the parameter that you used. list is a type in the python language AKA you will not be able to use that variable name. Now for some work on your code after you change all the variables and etc...
When you are going through your function you only will swap is the 2 selected elements are next to each other when needed. This will not work with all list combinations. You have to be able to check that the current i that you are at is in the correct place. So if the end element is the lowest in the List then you have to have it swap all the way to the front of the list. There are many ways of sorting (ie. Quick sort, MergeSort,Bubble Sort) and this isnt the best way... :) Here is some help:
def sortThis(L):
if (len(L) == 0 or len(L) == 1):
return list
else:
for i in range(len(L)):
value = L[i]
j = i - 1
while (j >= 0) and (L[j] > value):
L[j+1] = L[j]
j -= 1
L[j+1] = value
a = [1,2,13,131,1,3,4]
sortThis(a)
print a
Take a look at this for more sorting Fun: QuickSort MergeSort
If it works, it would be the best sorting algotithm in the world (O(n)). Your algorithm only puts the greatest element at the end of the list. you have to apply recursively your function to list[:-1].
You should not use python reserved words

Categories

Resources