creating a binary heap implementation giving wrong result - python

I am implementing the binary heap following an online course and I have done the following:
from __future__ import division
class BinaryHeap(object):
    def __init__(self, arr=None):
        self.heap = []
    def insert(self, item):
        self.heap.append(item)
        self.__swim(len(self.heap) - 1)
    def __swim(self, index):
        parent_index = (index - 1) // 2
        while index > 0 and self.heap[parent_index] < self.heap[index]:
            self.heap[parent_index], self.heap[index] = self.heap[index], self.heap[parent_index]
            index = parent_index
Now, I use it as:
s = 'SORTEXAMPLE'
a = BinaryHeap()
for c in s:
    a.insert(c)
Now, after this the heap is ordered as:
['S', 'T', 'X', 'P', 'L', 'R', 'A', 'M', 'O', 'E', 'E']
rather than
['X', 'T', 'S', 'P', 'L', 'R', 'A', 'M', 'O', 'E', 'E']
So, it seems one of the last exchanges did not happen and I thought I might have messed up the indexing but I could not find any obvious issues.

OK, I figured it out. Of course, I cannot cache parent_index outside the loop: it has to be recomputed from the current index on every iteration.
The code should be:
def __swim(self, index):
    while index > 0 and self.heap[(index - 1) // 2] < self.heap[index]:
        self.heap[(index - 1) // 2], self.heap[index] = self.heap[index], self.heap[(index - 1) // 2]
        index = (index - 1) // 2
I am surprised this did not go in an infinite loop before....
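For reference, here is a minimal sketch of the corrected max-heap. The names MaxHeap, _swim and is_max_heap are illustrative choices, not the course's code. It also suggests why the original version never looped forever: after the first swap, index and the stale parent_index are equal, so the loop compares an element with itself and stops.

```python
class MaxHeap:
    """Array-backed max-heap (illustrative sketch, not the course's code)."""

    def __init__(self):
        self.heap = []

    def insert(self, item):
        self.heap.append(item)
        self._swim(len(self.heap) - 1)

    def _swim(self, index):
        # Recompute the parent on every pass so it tracks the moving index.
        while index > 0:
            parent = (index - 1) // 2
            if self.heap[parent] >= self.heap[index]:
                break
            self.heap[parent], self.heap[index] = self.heap[index], self.heap[parent]
            index = parent


def is_max_heap(items):
    """Return True if every parent is >= each of its children."""
    return all(items[(i - 1) // 2] >= items[i] for i in range(1, len(items)))


h = MaxHeap()
for c in "SORTEXAMPLE":
    h.insert(c)
print(h.heap[0])            # the largest item sits at the root
print(is_max_heap(h.heap))  # heap property holds for every parent/child pair
```

A valid heap is not unique: the exact layout depends on the order of insertions, so the array need not match a textbook figure element for element as long as the heap property holds.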

Related

Python recursion systematic ordering

I wrote my code and it's working perfectly, but the output doesn't really look good. I want it to look more presentable/systematic. How do I do that? This is the kind of result I'm currently getting:
and this is the type of result I want:
This code is basically to find permutations of whatever is inputted.
def permutations(aSet):
    if len(aSet) <= 1: return aSet
    all_perms = []
    first_element = aSet[0:1]
    subset = aSet[1:]
    partial = permutations(subset)
    for permutation in partial:
        for index in range(len(aSet)):
            new_perm = list(permutation[:index])
            new_perm.extend(first_element)
            new_perm.extend(permutation[index:])
            all_perms.append(new_perm)
    return all_perms
I can't figure out what to try.
You can sort the output array with a custom key function. Here keyFunc converts a permutation (a list of characters) into a single string, so the list is sorted lexicographically.
from pprint import pprint
# insert your function here
def keyFunc(char_list):
    return ''.join(char_list)
chars = list('dog')
permutation = permutations(chars)
permutation.sort(key=keyFunc)
pprint(permutation)
Output:
[['d', 'g', 'o'],
['d', 'o', 'g'],
['g', 'd', 'o'],
['g', 'o', 'd'],
['o', 'd', 'g'],
['o', 'g', 'd']]
Here's a way to order the permutations differently: for each item in the input array, take it out of the array, find all permutations of the remaining subarray, then prepend this item to each permutation of this subarray. This has the effect of placing permutations with similar prefixes together.
from pprint import pprint
def permutations2(chars):
    if len(chars) <= 1: return [chars]
    all_perms = []
    for idx, char in enumerate(chars):
        subarr = chars[:idx] + chars[idx+1:]
        subperms = permutations2(subarr)
        for subperm in subperms:
            new_perm = [char] + subperm
            all_perms.append(new_perm)
    return all_perms
chars = list('dog')
pprint(permutations2(chars))
Result:
[['d', 'o', 'g'],
['d', 'g', 'o'],
['o', 'd', 'g'],
['o', 'g', 'd'],
['g', 'd', 'o'],
['g', 'o', 'd']]
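If the standard library is an option, itertools.permutations gives the same ordering with less code. The sketch below is an alternative to the hand-written recursion rather than a fix for it, and the helper name sorted_permutations is made up for illustration. It relies on the documented behaviour that permutations are emitted in lexicographic order with respect to the input order, so sorting the input first is enough.

```python
from itertools import permutations as iter_permutations


def sorted_permutations(items):
    """Return all permutations of items as lists, in lexicographic order."""
    # Sorting the input makes itertools emit the tuples in sorted order.
    return [list(p) for p in iter_permutations(sorted(items))]


for perm in sorted_permutations("dog"):
    print("".join(perm))
```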

Why are these loop outputs different?

I am working on a task where I need to sort a string and remove its duplicate letters. I ended up with a function that does what I wanted, but out of sheer luck. I don't know why these lines of code produce different outputs. Could someone help me understand?
def format_string(string1):
    sorted1 = sorted(string1)
    print(sorted1)
    i = 0
    while i < len(sorted1) - 1:
        if sorted1[i] == sorted1[i + 1]:
            del sorted1[i + 1]
        else:
            i += 1
    return sorted1
print(format_string("aretheyhere"))
['a', 'e', 'e', 'e', 'e', 'h', 'h', 'r', 'r', 't', 'y']
['a', 'e', 'h', 'r', 't', 'y']
# This did what I wanted, but these seemingly similar lines don't.
def format_string(string1):
    sorted1 = sorted(string1)
    print(sorted1)
    i = 0
    j = i + 1
    while i < len(sorted1) - 1:
        if sorted1[i] == sorted1[j]:
            del sorted1[j]
        else:
            i += 1
    return sorted1
print(format_string("aretheyhere"))
['a', 'e', 'e', 'e', 'e', 'h', 'h', 'r', 'r', 't', 'y']
['a', 'y']
def format_string(string1):
    sorted1 = sorted(string1)
    print(sorted1)
    i = 0
    while i < len(sorted1) - 1:
        if sorted1[i] == sorted1[i + 1]:
            del sorted1[i + 1]
        i += 1
    return sorted1
print(format_string("aretheyhere"))
['a', 'e', 'e', 'e', 'e', 'h', 'h', 'r', 'r', 't', 'y']
['a', 'e', 'e', 'h', 'r', 't', 'y']
What are the crucial differences here that change the output?
The variable j doesn't increment because it's not updated inside your while loop, i.e. changing the value of i after setting the value of j to i+1 does not change the value of j. For example, this function would give the same result as the first one because the value of j is updated inside the while loop:
def format_string(string1):
    sorted1 = sorted(string1)
    print(sorted1)
    i = 0
    while i < len(sorted1) - 1:
        j = i + 1
        if sorted1[i] == sorted1[j]:
            del sorted1[j]
        else:
            i += 1
    return sorted1
print(format_string("aretheyhere"))

How do I alphabetically sort an array of strings without any sort functions? Python

When solving the following problem:
"Assuming you have a random list of strings (for example: a, b, c, d, e, f, g), write a program that will sort the strings in alphabetical order.
You may not use the sort command."
I ran into a problem: when I run strings through the following code, I sometimes get duplicated strings in the final list.
I am fairly new to Python and our class just started to look at NumPy and the functions in that module, and I'm not sure whether any of them are supposed to be used in the code (except for any sort function).
import numpy as np
list=[]
list=str(input("Enter list of string(s): "))
list=list.split()
print() # for format space purposes
listPop=list
min=listPop[0]
newFinalList=[]
if(len(list2)!=1):
    while(len(listPop)>=1):
        for i in range(len(listPop)):
            #setting min=first element of list
            min=listPop[0]
            if(listPop[i]<=min):
                min=listPop[i]
                print(min)
        listPop.pop(i)
        newFinalList.append(min)
    print(newFinalList)
else:
    print("Only one string inputted, so already alphabatized:",list2)
Expected result of ["a","y","z"]
["a","y","z"]
Actual result...
Enter list of string(s): a y z
a
a
a
['a', 'a', 'a']
Enter list of string(s): d e c
d
c
d
d
['c', 'd', 'd']
Selection sort: for each index i of the list, select the smallest item at or after i and swap it into the ith position. Here's an implementation in three lines:
# For each index i...
for i in range(len(list)):
    # Find the position of the smallest item after (or including) i.
    j = list[i:].index(min(list[i:])) + i
    # Swap it into the i-th place (this is a no-op if i == j).
    list[i], list[j] = list[j], list[i]
list[i:] is a slice (subset) of list starting at the ith element.
min(list) gives you the smallest element in list.
list.index(element) gives you the (first) index of element in list.
a, b = b, a swaps the values of a and b in a single statement (the right-hand side is evaluated before either name is reassigned).
The trickiest part of this implementation is that when you're using index to find the index of the smallest element, you need to find the index within the same list[i:] slice that you found the element in, otherwise you might select a duplicate element in an earlier part of the list. Since you're finding the index relative to list[i:], you then need to add i back to it to get the index within the entire list.
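If min and index count as forbidden helpers for the assignment, the same selection sort can be written with explicit loops. This is a sketch under that assumption; the function name selection_sort is hypothetical, and it returns a sorted copy instead of sorting in place.

```python
def selection_sort(items):
    """Return a new list with the items in ascending order."""
    result = list(items)  # copy so the caller's list is left untouched
    for i in range(len(result)):
        smallest = i
        # Scan the unsorted tail for the position of the smallest item.
        for j in range(i + 1, len(result)):
            if result[j] < result[smallest]:
                smallest = j
        # Swap it into place (a no-op when smallest == i).
        result[i], result[smallest] = result[smallest], result[i]
    return result


print(selection_sort(["d", "e", "c"]))
print(selection_sort(["a", "y", "z"]))
```

Tracking the position of the smallest item, rather than its value, sidesteps the duplicate problem described above, because the swap always targets the exact element that was found.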
You can also implement quicksort for the same task:
def partition(arr,low,high):
    # i marks the end of the region holding items <= pivot
    i = ( low-1 )
    pivot = arr[high]
    for j in range(low , high):
        if arr[j] <= pivot:
            i = i+1
            arr[i],arr[j] = arr[j],arr[i]
    # place the pivot just after the last smaller-or-equal item
    arr[i+1],arr[high] = arr[high],arr[i+1]
    return ( i+1 )
def quickSort(arr,low,high):
    if low < high:
        pi = partition(arr,low,high)
        quickSort(arr, low, pi-1)
        quickSort(arr, pi+1, high)
arr = ['a', 'x', 'p', 'o', 'm', 'w']
n = len(arr)
quickSort(arr,0,n-1)
print ("Sorted list is:")
for i in range(n):
    print(arr[i], end=" ")
Output:
Sorted list is:
a m o p w x
Mergesort:
from heapq import merge
from itertools import islice
def _ms(a, n):
    return islice(a,n) if n<2 else merge(_ms(a,n//2),_ms(a,n-n//2))
def mergesort(a):
    return type(a)(_ms(iter(a),len(a)))
# example
import string
import random
L = list(string.ascii_lowercase)
random.shuffle(L)
print(L)
print(mergesort(L))
Sample run:
['h', 'g', 's', 'l', 'a', 'f', 'b', 'z', 'x', 'c', 'r', 'j', 'q', 'p', 'm', 'd', 'k', 'w', 'u', 'v', 'y', 'o', 'i', 'n', 't', 'e']
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

Solving a "colored Quxes" coding challenge with recursion

I am trying to solve some of the coding challenges that I find online. However I was stopped by the below problem. I tried to solve it using recursion but I feel I am missing a very important concept in recursion. My code works for all of the below examples except the last one it will break down.
Can someone point to me the mistake that I made in this recursion code? Or maybe guide me through solving the issue?
I know why my code breaks, but I don't know how to get around the "pass by object reference" behavior in Python, which I think is creating the bigger problem for me.
The coding question is:
On a mysterious island there are creatures known as Quxes which come in three colors: red, green, and blue. One power of the Qux is that if two of them are standing next to each other, they can transform into a single creature of the third color.
Given N Quxes standing in a line, determine the smallest number of them remaining after any possible sequence of such transformations.
For example, given the input ['R', 'G', 'B', 'G', 'B'], it is possible to end up with a single Qux through the following steps:
Arrangement | Change
----------------------------------------
['R', 'G', 'B', 'G', 'B'] | (R, G) -> B
['B', 'B', 'G', 'B'] | (B, G) -> R
['B', 'R', 'B'] | (R, B) -> G
['B', 'G'] | (B, G) -> R
['R'] |
________________________________________
My code is:
class fusionCreatures(object):
    """Regular Numbers Gen.
    """
    def __init__(self , value=[]):
        self.value = value
        self.ans = len(self.value)
    def fusion(self, fus_arr, i):
        color = ['R','G','B']
        color.remove(fus_arr[i])
        color.remove(fus_arr[i+1])
        fus_arr.pop(i)
        fus_arr.pop(i)
        fus_arr.insert(i, color[0])
        return fus_arr
    def fusionCreatures1(self, arr=None):
        # this method is to find the smallest number of creature in a row after fusion
        if arr == None:
            arr = self.value
        for i in range (0,len(arr)-1):
            #print(arr)
            if len(arr) == 2 and i >= 1 or len(arr)<2:
                break
            if arr[i] != arr[i+ 1]:
                arr1 = self.fusion(arr, i)
                testlen = self.fusionCreatures1(arr)
        if len(arr) < self.ans:
            self.ans = len(arr)
        return self.ans
Testing array (all of them work except the last one):
t1 = fusionCreatures(['R','G','B','G','B'])
t2 = fusionCreatures(['R','G','B','R','G','B'])
t3 = fusionCreatures(['R','R','G','B','G','B'])
t4 = fusionCreatures(['G','R','B','R','G'])
t5 = fusionCreatures(['G','R','B','R','G','R','G'])
t6 = fusionCreatures(['R','R','R','R','R'])
t7 = fusionCreatures(['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B'])
print(t1.fusionCreatures1())
print(t2.fusionCreatures1())
print(t3.fusionCreatures1())
print(t4.fusionCreatures1())
print(t5.fusionCreatures1())
print(t6.fusionCreatures1())
print(t7.fusionCreatures1())
I'll start by mentioning that there is a deductive approach that works in O(n) and is detailed in this blog post. It boils down to checking the parity of the counts of the three types of elements in the list to determine which of a few fixed outcomes occurs.
You mention that you'd prefer to use a recursive approach, which is O(n!). This is a good start because it can be used as a tool for helping arrive at the O(n) solution and is a common recursive pattern to be familiar with.
Because we can't know whether a given fusion between two Quxes will ultimately lead to an optimal global solution we're forced to try every possibility. We do this by walking over the list and looking for potential fusions. When we find one, perform the transformation in a new list and call fuse_quxes on it. Along the way, we keep track of the smallest length achieved.
Here's one approach:
def fuse_quxes(quxes, choices="RGB"):
fusion = {x[:-1]: [x[-1]] for x in permutations(choices)}
def walk(quxes):
best = len(quxes)
for i in range(1, len(quxes)):
if quxes[i-1] != quxes[i]:
sub = quxes[:i-1] + fusion[quxes[i-1], quxes[i]] + quxes[i+1:]
best = min(walk(sub), best)
return best
return walk(quxes)
This is pretty much the direction your provided code is moving towards, but the implementation seems unclear. Unfortunately, I don't see any single or quick fix. Here are a few general issues:
Putting the fusionCreatures1 function into a class allows it to mutate external state, namely self.value and self.ans. self.value in particular is poorly named and difficult to keep track of. It seems like the intent is to use it as a reference copy to reset arr to its default value, but arr = self.value means that when fus_arr is mutated in fusion(), self.value is as well. Everything is pretty much a reference to one underlying list.
Adding slices to these copies at least makes the program easier to reason about, for example, arr = self.value[:] and fus_arr = fus_arr[:] in the fusion() function. In short, try to write pure functions.
self.ans is also unclear and unnecessary; better to keep the result value relegated to a local variable within the recursive function.
It seems unnecessary to put a stateless function into a class unless it's a purely static method and the class is acting as a namespace.
Another cause of cognitive overload is branching statements like if and break. We want to minimize the frequency and nesting of these. Here is fusionCreatures1 in pseudocode, with annotations for mutations and complex interactions:
def fusionCreatures1():
    if ...
        read mutated global state
    for i in len(arr):
        if complex length and index checks:
            break
        if arr[i] != arr[i+ 1]:
            impure_func_that_changes_arr_length(arr)
            recurse()
    if new best compared to global state:
        mutate global state
You'll probably agree that it's pretty difficult to mentally step through a run of this function.
In fusionCreatures1(), two variables are unused:
arr1 = self.fusion(arr, i)
testlen = self.fusionCreatures1(arr)
The assignment arr1 = self.fusion(arr, i) (along with the return fus_arr) seems to indicate a lack of understanding that self.fusion is really an in-place function that mutates its argument array. So calling it means arr1 is arr and we have another aliased variable to reason about.
Beyond this, neither arr1 nor testlen is used in the program, so the intent is unclear.
A good linter will pick up these unused variables and identify most of the other complexity issues I've mentioned.
Mutating a list while looping over it is usually disastrous. self.fusion(arr, i) mutates arr inside a loop, making it very difficult to reason about its length and causing an index error when the range(len(arr)) no longer matches the actual len(arr) in the function body (or at least necessitating an in-body precondition). Making self.fusion(arr, i) pure using a slice, as mentioned above, fixes this problem but reveals that there is no recursive base case, resulting in a stack overflow error.
Avoid variable names like arr, arr1, value unless the context is obvious. Again, these obfuscate intent and make the program difficult to understand.
Some minor style suggestions:
Use snake_case per PEP-8. Class names should be TitleCased to differentiate them from functions. No need to inherit from object--that's implicit.
Use consistent spacing around functions and operators: range (0,len(arr)-1): is clearer as range(len(arr) - 1):, for example. Use vertical whitespace around blocks.
Use lists instead of typing out t1, t2, ... t7.
Function names should be verbs, not nouns. A class like fusionCreatures with a method called fusionCreatures1 is unclear. Something like QuxesSolver.minimize(creatures) makes the intent a bit more obvious.
As for the solution I provided above, there are other tricks worth considering to speed it up. One is memoization, which can help avoid duplicate work (any given list will always produce the same minimized length, so we just store this computation in a dict and spit it back out if we ever see it again). If we hit a length of 1, that's the best we can do globally, so we can skip the rest of the search.
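The memoization and early-exit ideas can be sketched as follows. This is one possible arrangement, not the answer's own code: the name fuse_quxes_memo is hypothetical, and functools.lru_cache stands in for the hand-rolled dict mentioned above, which means each list has to be converted to a tuple so it can be hashed.

```python
from functools import lru_cache
from itertools import permutations


def fuse_quxes_memo(quxes, choices="RGB"):
    """Smallest achievable line length, with caching and early exit."""
    # Map each ordered pair of distinct colors to the third color.
    fusion = {x[:-1]: x[-1] for x in permutations(choices)}

    @lru_cache(maxsize=None)
    def walk(state):
        best = len(state)
        for i in range(1, len(state)):
            if state[i - 1] != state[i]:
                merged = fusion[state[i - 1], state[i]]
                sub = state[:i - 1] + (merged,) + state[i + 1:]
                best = min(best, walk(sub))
                if best == 1:
                    break  # a single Qux cannot be improved on
        return best

    return walk(tuple(quxes))


print(fuse_quxes_memo(["R", "G", "B", "G", "B"]))
print(fuse_quxes_memo(["R", "R", "R", "G", "G", "G", "B", "B", "B"]))
```

The cache is created per call, so repeated calls with different choices do not share stale entries.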
Here's a full runner, including the linear solution translated to Python (again, defer to the blog post to read about how it works):
from collections import defaultdict
from itertools import permutations
from random import choice, randint
def fuse_quxes_linear(quxes, choices="RGB"):
counts = defaultdict(int)
for e in quxes:
counts[e] += 1
if not quxes or any(x == len(quxes) for x in counts.values()):
return len(quxes)
elif len(set(counts[x] % 2 for x in choices)) == 1:
return 2
return 1
def fuse_quxes(quxes, choices="RGB"):
fusion = {x[:-1]: [x[-1]] for x in permutations(choices)}
def walk(quxes):
best = len(quxes)
for i in range(1, len(quxes)):
if quxes[i-1] != quxes[i]:
sub = quxes[:i-1] + fusion[quxes[i-1], quxes[i]] + quxes[i+1:]
best = min(walk(sub), best)
return best
return walk(quxes)
if __name__ == "__main__":
tests = [
['R','G','B','G','B'],
['R','G','B','R','G','B'],
['R','R','G','B','G','B'],
['G','R','B','R','G'],
['G','R','B','R','G','R','G'],
['R','R','R','R','R'],
['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B']
]
for test in tests:
print(test, "=>", fuse_quxes(test))
assert fuse_quxes_linear(test) == fuse_quxes(test)
for i in range(100):
test = [choice("RGB") for x in range(randint(0, 10))]
assert fuse_quxes_linear(test) == fuse_quxes(test)
Output:
['R', 'G', 'B', 'G', 'B'] => 1
['R', 'G', 'B', 'R', 'G', 'B'] => 2
['R', 'R', 'G', 'B', 'G', 'B'] => 2
['G', 'R', 'B', 'R', 'G'] => 1
['G', 'R', 'B', 'R', 'G', 'R', 'G'] => 2
['R', 'R', 'R', 'R', 'R'] => 5
['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B'] => 2
Here is my suggestion.
First, instead of "R", "G" and "B" I use integer values 0, 1, and 2. This allows nice and easy fusion between a and b, as long as they are different, by simply doing 3 - a - b.
Then my recursion code is:
def fuse_quxes(l):
    n = len(l)
    for i in range(n - 1):
        if l[i] == l[i + 1]:
            continue
        else:
            newn = fuse_quxes(l[:i] + [3 - l[i] - l[i + 1]] + l[i+2:])
            if newn < n:
                n = newn
    return n
Run this with
IN[5]: fuse_quxes([0, 0, 0, 1, 1, 1, 2, 2, 2])
Out[5]: 2
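To run the integer version on the letter lists from the question, the colors have to be encoded first. The sketch below assumes the mapping R=0, G=1, B=2; any assignment of 0, 1 and 2 to the three colors works, because 3 - a - b always yields the remaining value. The names COLOR_TO_INT, fuse_ints and fuse_letters are made up for illustration.

```python
# Assumed encoding; any one-to-one mapping onto 0, 1, 2 would do.
COLOR_TO_INT = {"R": 0, "G": 1, "B": 2}


def fuse_ints(l):
    """Brute-force search over every possible fusion order."""
    n = len(l)
    for i in range(n - 1):
        if l[i] != l[i + 1]:
            # 3 - a - b is the one value in {0, 1, 2} that is neither a nor b.
            fused = l[:i] + [3 - l[i] - l[i + 1]] + l[i + 2:]
            n = min(n, fuse_ints(fused))
    return n


def fuse_letters(quxes):
    """Encode the letters as integers, then reuse fuse_ints."""
    return fuse_ints([COLOR_TO_INT[q] for q in quxes])


print(fuse_letters(["R", "G", "B", "G", "B"]))
```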
Here is my attempt at the problem. The explanation is in the comments.
inputs = [['R','G','B','G','B'],
          ['R','G','B','R','G','B'],
          ['R','R','G','B','G','B'],
          ['G','R','B','R','G'],
          ['G','R','B','R','G','R','G'],
          ['R','R','R','R','R'],
          ['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B'],]
def fuse_quxes(inp):
    RGB_set = {"R", "G", "B"}
    merge_index = -1
    ## pair each qux with the next one in line and loop through all pairs
    for i, (q1, q2) in enumerate(zip(inp[:-1], inp[1:])):
        merged = RGB_set-{q1,q2}
        ## if more than one item remains in merged after removing q1 and q2, the pair can't fuse
        if(len(merged))==1:
            merged = merged.pop()
            merge_index=i
            merged_color = merged
            ## keep looping until the result of the fusion differs from the qux on either
            ## its right or its left side
            if (i>0 and merged!=inp[i-1]) or ((i+2)<len(inp) and merged!=inp[i+2]):
                break
    print(inp)
    ## merge the two quxes whose result differs from its right or left neighbour; otherwise
    ## do any possible merge
    if merge_index>=0:
        del inp[merge_index]
        inp[merge_index] = merged_color
        return fuse_quxes(inp)
    else:
        ## if no merge can be made, stop the recursion
        print("Result", len(inp))
        print("_______________________")
        return len(inp)
[fuse_quxes(inp) for inp in inputs]
Output:
['R', 'G', 'B', 'G', 'B']
['R', 'R', 'G', 'B']
['R', 'B', 'B']
['G', 'B']
['R']
Result 1
_______________________
['R', 'G', 'B', 'R', 'G', 'B']
['R', 'G', 'B', 'R', 'R']
['R', 'G', 'G', 'R']
['B', 'G', 'R']
['B', 'B']
Result 2
_______________________
['R', 'R', 'G', 'B', 'G', 'B']
['R', 'B', 'B', 'G', 'B']
['G', 'B', 'G', 'B']
['R', 'G', 'B']
['R', 'R']
Result 2
_______________________
['G', 'R', 'B', 'R', 'G']
['G', 'G', 'R', 'G']
['G', 'B', 'G']
['R', 'G']
['B']
Result 1
_______________________
['G', 'R', 'B', 'R', 'G', 'R', 'G']
['G', 'G', 'R', 'G', 'R', 'G']
['G', 'B', 'G', 'R', 'G']
['R', 'G', 'R', 'G']
['B', 'R', 'G']
['B', 'B']
Result 2
_______________________
['R', 'R', 'R', 'R', 'R']
Result 5
_______________________
['R', 'R', 'R', 'G', 'G', 'G', 'B', 'B', 'B']
['R', 'R', 'B', 'G', 'G', 'B', 'B', 'B']
['R', 'G', 'G', 'G', 'B', 'B', 'B']
['B', 'G', 'G', 'B', 'B', 'B']
['R', 'G', 'B', 'B', 'B']
['R', 'R', 'B', 'B']
['R', 'G', 'B']
['R', 'R']
Result 2
_______________________
[1, 2, 2, 1, 2, 5, 2]

What's happening in this code? Anagram as a substring in a string

I found this question posted on here, but couldn't comment or ask a question, so I am creating a new question.
The original post stated the following:
t = "abd"
s = "abdc"
s trivially contains t. However, when you sort them, you get the strings abd and abcd, and the in comparison fails. The sorting gets other letters in the way.
Instead, you need to step through s in chunks the size of t.
t_len = len(t)
s_len = len(s)
t_sort = sorted(t)
for start in range(s_len - t_len + 1):
    chunk = s[start:start+t_len]
    if t_sort == sorted(chunk):
        # SUCCESS!!
In the for loop, why are they taking s_len and then subtracting t_len? Why are they adding 1 at the end?
alvits and d_void already explained the value of start; I won't repeat that.
I strongly recommend that you learn some basic trace debugging. Insert some useful print statements to follow the execution. For instance:
Code:
t = "goal"
s = "catalogue"
t_len = len(t)
s_len = len(s)
t_sort = sorted(t)
print "lengths & sorted", t_len, s_len, t_sort
for start in range(s_len - t_len + 1):
chunk = s[start:start+t_len]
print "LOOP start=", start, "\tchunk=", chunk, sorted(chunk)
if t_sort == sorted(chunk):
print "success"
Output:
lengths & sorted 4 9 ['a', 'g', 'l', 'o']
LOOP start= 0 chunk= cata ['a', 'a', 'c', 't']
LOOP start= 1 chunk= atal ['a', 'a', 'l', 't']
LOOP start= 2 chunk= talo ['a', 'l', 'o', 't']
LOOP start= 3 chunk= alog ['a', 'g', 'l', 'o']
success
LOOP start= 4 chunk= logu ['g', 'l', 'o', 'u']
LOOP start= 5 chunk= ogue ['e', 'g', 'o', 'u']
Does that help illustrate what's happening in the loop?
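The loop bounds can also be read off a small function. The sketch below wraps the same sliding-window idea in a hypothetical helper, find_anagram, which is not from the original post. The last chunk of length len(t) that still fits inside s starts at len(s) - len(t), and range() excludes its end value, which is where the + 1 comes from.

```python
def find_anagram(s, t):
    """Return the first index where an anagram of t starts in s, or -1."""
    t_sorted = sorted(t)
    # The last window that still fits starts at len(s) - len(t);
    # range() stops one short of its argument, hence the + 1.
    for start in range(len(s) - len(t) + 1):
        if sorted(s[start:start + len(t)]) == t_sorted:
            return start
    return -1


print(find_anagram("catalogue", "goal"))
print(find_anagram("abdc", "abd"))
print(find_anagram("abc", "abcd"))
```

When t is longer than s the range is empty, so the function falls through to -1 without any special-casing.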
