Shouldn't this recursive code loop infinitely? - Python

So, I am trying to learn Python and I came across this code demonstrating recursion. I know C++, and I thought this code should create an infinite loop, but it doesn't. Any help would be greatly appreciated. This is a program that sorts a list by insertion sort.
def InsertionSort(seq):
    isort(seq, len(seq))

def isort(seq, k):  # Sort slice seq[0:k]
    if k > 1:
        isort(seq, k - 1)   #1
        insert(seq, k - 1)  #2

def insert(seq, k):  # Insert seq[k] into sorted seq[0:k-1]
    pos = k
    while pos > 0 and seq[pos] < seq[pos - 1]:
        (seq[pos], seq[pos - 1]) = (seq[pos - 1], seq[pos])
        pos = pos - 1
Shouldn't the interpreter get to #1, call isort again, and thus loop forever, never reaching #2?
Thank you for your help.

This code will terminate because each recursive call is made with k - 1, so the condition k > 1 will eventually evaluate to False.
Imagine recursion as a continually growing tree, with a squirrel acting as the interpreter. When isort() is called, the tree branches off and the squirrel runs to the end of that branch. However, the tree has a finite supply of nutrients (k), and a bit is used up (k - 1) each time it grows a new branch. The tree stops branching when it runs out of nutrients (the k > 1 condition). When the tree stops growing, the squirrel reaches the end of the last branch and gets the nut (the return value / next line(s) of code). The squirrel then runs back to the roots (the code, if any, after the call to the recursive function) by going back through the branches (leaving each recursion depth). When the squirrel arrives back at the roots, the program is finished.
(Hope this analogy helps :) )
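To see the termination for yourself, here is a sketch of the same isort with a trace print added (the print and the depth argument are my additions for illustration, not part of the original code). Each call receives a smaller k, and the innermost call with k == 1 simply returns without recursing; only then does line #2 run, from the deepest call outward:

def insert(seq, k):  # insert seq[k] into the already-sorted slice seq[0:k]
    pos = k
    while pos > 0 and seq[pos] < seq[pos - 1]:
        seq[pos], seq[pos - 1] = seq[pos - 1], seq[pos]
        pos -= 1

def isort_traced(seq, k, depth=0):
    print("  " * depth + "isort(seq, %d)" % k)
    if k > 1:
        isort_traced(seq, k - 1, depth + 1)  # 1: recurse until k == 1
        insert(seq, k - 1)                   # 2: runs only AFTER the recursion above returns

seq = [3, 1, 2]
isort_traced(seq, len(seq))
print(seq)  # [1, 2, 3]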

Related

Saving valid moves in Negamax makes little to no difference in speed

I have a normal Negamax algorithm with alpha-beta pruning, driven by iterative deepening (ID). I thought that to really benefit from ID, I would save the valid moves calculated at depth 1 in a table, so that when I search to depth 2 and the same position arises, I can just grab the valid moves from the table instead of recalculating them. However, I find that this idea doesn't really save any time at all, which makes me think:
I have never seen anyone do this, is it not worth it for some reason?
My implementation of this is wrong?
I am confused by how Negamax works and maybe this is impossible to do in the first place?
Here is the original iterative call, along with a snippet of the Negamax function itself:
self.valid_moves_history = {}  # dict, so positions can be looked up by Zobrist key

for depth in range(1, s.max_search_depth):
    move, evaluation = self.negamax(gamestate, depth, -math.inf, math.inf, s.start_color)

# ----------------------------------------------------------------------------

def negamax(self, gamestate, depth, alpha, beta, color):
    if self.zobrist_key in self.valid_moves_history:
        children = self.valid_moves_history[self.zobrist_key]
    else:
        children = gamestate.get_valid_moves()
        self.valid_moves_history[self.zobrist_key] = children

    if depth == 0 or gamestate.is_check_mate or gamestate.is_stale_mate:
        return None, e.evaluate(gamestate, depth) * color

    # Negamax loop
    max_eval = -math.inf
    for child in reversed(children):
        gamestate.make_move(child[0], child[1])
        score = -self.negamax(gamestate, depth - 1, -beta, -alpha, -color)[1]
        gamestate.unmake_move()
        if score > max_eval:
            max_eval = score
            best_move = child
        alpha = max(alpha, max_eval)
        if beta <= alpha:
            break
The most time-consuming tasks of my complete program break down roughly like this (% of total runtime for a game):
Calculate valid moves: 60%
Evaluation function (medium complexity at the moment): 25%
Negamax itself with lookups, table saves etc: 5%
Make/unmake moves: 4%
Is it normal/reasonable for move generation to take this much of the time? This is the main reason why I thought to save the valid moves in a table in the first place.
Or can someone please explain why this is a good/bad idea and what I should do instead? Thank you for any input.
I know this thread is quite old at this point, but I think this could still be useful to some people. What you are describing is called a transposition table in Minimax, and you can find many resources on the topic. Negamax is the same as Minimax except that you do not have separate functions for the Max and Min players; instead you call a single max function and negate its result. It is probably more useful for you to implement move ordering first, as it can double the speed of your program. You can also look for a more efficient way to generate valid moves to speed up the program.
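To make the idea concrete, here is a minimal sketch of a transposition table that caches evaluated scores rather than just move lists. All the names here (tt, Entry, evaluate, the gamestate methods) are illustrative assumptions, not the asker's actual API, and a production table would also record whether a score is exact or just a bound, a detail omitted for brevity:

import math
from collections import namedtuple

# Hypothetical cache entry: the depth a position was searched to, and its score.
Entry = namedtuple("Entry", ["depth", "score"])

tt = {}  # transposition table: zobrist_key -> Entry

def negamax(gamestate, depth, alpha, beta, color):
    key = gamestate.zobrist_key           # assumes the state exposes its Zobrist hash
    cached = tt.get(key)
    if cached is not None and cached.depth >= depth:
        return cached.score               # reuse a result searched at least as deep

    if depth == 0:
        score = color * evaluate(gamestate)   # evaluate() is assumed to exist
    else:
        score = -math.inf
        for move in gamestate.get_valid_moves():
            gamestate.make_move(*move)
            score = max(score, -negamax(gamestate, depth - 1, -beta, -alpha, -color))
            gamestate.unmake_move()
            alpha = max(alpha, score)
            if alpha >= beta:             # alpha-beta cutoff
                break

    tt[key] = Entry(depth, score)
    return score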

Python: Counting executions of a recursive call

I am using the Project Euler problems to test my understanding as I learn Python 3.x. After I cobble together a working solution to each problem, I find the posted solutions very illuminating, and I can "absorb" new ideas after I have struggled myself. I am working on Euler 024 and trying a recursive approach. In no way do I believe my approach is the most efficient or most elegant; however, I successfully generate a full set of permutations, increasing in value (because I start with a sorted tuple), which is one of the outputs I want.

In addition, in order to find the millionth permutation in the list (the other output I want, but can't yet get), I am trying to count the permutations as I create them, and that's where I get stuck. In other words, I want to count the number of times the recursion reaches the base case, i.e. a completed permutation, not the total number of recursive calls. I have found some very clear examples on Stack Overflow of counting executions of recursive calls, but I am having no luck applying the idea to my code.

Essentially, my problem so far is "passing back" the count of completed permutations through return statements, which I think I need to do because of the way my for loop creates the "stem" and "tail" tuples. At a high level, either I can't get the counter to increment (so it always comes out as "1" or "5"), or the nested return terminates the code after the first permutation is found, depending on where I place the return. Can anyone help me insert the counting into my code?
First the "counting" code I found in SO that I am trying to use:
def recur(n, count=0):
    if n == 0:
        return "Finished count %s" % count
    return recur(n - 1, count + 1)

print(recur(15))
Next is my permutation code with no counting in it. I have tried lots of approaches, but none of them work, so the following has no counting at all, just a comment marking the point where I believe the counter needs to be incremented.
#
# euler 024 : Lexicographic permutations
#
import time
startTime = time.time()
#
def splitList(listStem, listTail):
    for idx in range(0, len(listTail)):
        tempStem = ((listStem) + (listTail[idx],))
        tempTail = ((listTail[:idx]) + (listTail[1+idx:]))
        splitList(tempStem, tempTail)
    if len(listTail) == 0:
        #
        # I want to increment counter only when I am here
        #
        print("listStem=", listStem, "listTail=", listTail)
#
inStem = ()
#inTail = ("0","1","2","3","4","5","6","7","8","9")
inTail = ("0", "1", "2", "3")
testStem = ("0", "1")
testTail = ("2", "3", "4", "5")
splitList(inStem, inTail)
#
print('Code execution duration : ', time.time() - startTime, ' seconds')
Thanks in advance,
Clive
Since it seems you've understood the basic problem but just want to see how the counting can work, all you need to do is pass a mutable counter down through the recursion. You can add a third argument to your function, pass it along unchanged with each recursive call, and increment its single element whenever you reach the base case:
def splitList(listStem, listTail, count):
    for idx in range(0, len(listTail)):
        ...
        splitList(tempStem, tempTail, count)
    if len(listTail) == 0:
        count[0] += 1
        print('Count:', count)
        ...
Now, call this function like this (the same as before, except for the extra one-element list argument):
splitList(inStem, inTail, [0])
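Putting that together with the code from the question, a complete runnable sketch looks like this. The one-element list works because lists are mutable: every depth of the recursion sees the same underlying counter object:

def splitList(listStem, listTail, count):
    for idx in range(len(listTail)):
        tempStem = listStem + (listTail[idx],)
        tempTail = listTail[:idx] + listTail[idx + 1:]
        splitList(tempStem, tempTail, count)
    if len(listTail) == 0:
        count[0] += 1  # base case reached: one more finished permutation
        print(count[0], listStem)

counter = [0]
splitList((), ("0", "1", "2", "3"), counter)
print("total permutations:", counter[0])  # 24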
Why don't you write a generator for this?
Then you can just stop on the nth item ("drop while i < n").
My solution uses itertools, but you can use your own permutations generator: just yield the next sequence member instead of printing it.
from itertools import permutations as perm, dropwhile as dw

print(''.join(next(dw(
    lambda x: x[0] < 1000000,
    enumerate(perm('0123456789'), 1)
))[1]))
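If you'd rather keep your own recursion than switch to itertools, here is a sketch of the question's splitList rewritten as a generator; since the input tuple starts sorted, the permutations come out in lexicographic order, and islice picks out the millionth without materializing the rest:

from itertools import islice

def permutations_gen(stem, tail):
    if not tail:
        yield stem  # base case: a finished permutation
        return
    for idx in range(len(tail)):
        yield from permutations_gen(stem + (tail[idx],),
                                    tail[:idx] + tail[idx + 1:])

# The millionth permutation (1-indexed), as Euler 024 asks:
nth = next(islice(permutations_gen((), tuple("0123456789")), 999999, None))
print("".join(nth))  # 2783915460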

Understanding Recursion on Binary Trees

Hello, I am new to programming, and at school we are learning recursion and binary trees. For me, recursion is something that really melts my mind.
So I was trying out a recursive function on a binary tree, trying to output the tree in preorder:
def preorder(self, preorder_list):
    """ (BinaryTree) -> list
    Return a list of the items in this tree using a *preorder* traversal.
    """
    if self.root is not EmptyValue:
        if self.left and self.right != None:
            preorder_list.append(self.left.preorder(preorder_list))
            preorder_list.append(self.right.preorder(preorder_list))

def helper_preorder(self):
    preorder_list = []
    self.preorder(preorder_list)
    return preorder_list
When I run this code, the output I get is, for example:
[None, None, None, None]
Now I suspect this has to be a problem with my recursive reasoning, so I would like to know what the problem is and how I can get better at recursion.
Thanks for your time.
Your problem is that you're never returning anything from preorder, so Python implicitly returns None. You append the return value of the function to your list, but that value is None. Your function should look like this (in pseudocode, not valid Python)[1]:
preorder(node, list):
    if node is Empty:
        return list
    else:
        list.append(node.data)
        preorder(node.left, list)
        preorder(node.right, list)
        return list
Note that I am duplicating the return statement, so you could optimize this; I set it up this way because it's easiest to understand recursion if you think of each call as either a base case or a recursive call.
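For reference, a runnable Python translation of that pseudocode could look like the following; the minimal Node class is my own stand-in for illustration, not the asker's BinaryTree:

class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right

def preorder(node, items):
    if node is None:               # base case: empty subtree, nothing to add
        return items
    items.append(node.data)        # visit the root first...
    preorder(node.left, items)     # ...then the whole left subtree...
    preorder(node.right, items)    # ...then the whole right subtree
    return items

tree = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(preorder(tree, []))  # [1, 2, 4, 5, 3]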
To understand recursion[2], think about breaking a hard problem into smaller, easier problems. I'll take a simple example, one of the classic recursion examples: factorial(n).
What is factorial(n)? That's a really hard question to answer, so let's find something simpler.
From math class, we know that n! = n*(n-1)*(n-2)*...*(2)*(1), so let's start there. We can immediately see that n! = n * (n-1)!. That helps; now we can write a starter of our factorial function:
factorial(n):
    return n * factorial(n-1)
But it's not done yet; if we run that, it'll recurse forever and never stop[3]. So we need what's called a base case: a stopping point for the recursion, where the problem is now so simple that we know the answer.
Fortunately, we have a natural base case: 1! = 1, which is true by definition. So we add that to our function:
factorial(n):
    if n == 1: return 1
    else: return n * factorial(n-1)
Just to make it clear how that works, let's do a trace for something small, say n=4.
So we call factorial(4). Obviously 4 != 1, so factorial(4) = 4*factorial(3). Now we repeat the process, which is the beautiful thing about recursion: applying the same algorithm to smaller subsets of the original problem, and building up a solution from those parts.
Our trace looks something like this:
factorial(4) = 4*factorial(3)
factorial(4) = 4*3*factorial(2)
factorial(4) = 4*3*2*factorial(1)
Now, we know that factorial(1) is: it's just 1, our base case. So finally we have
factorial(4) = 4*3*2*1
factorial(4) = 24
[1] This is only one possible way; there are others, but this is what I came up with on the spot.
[2] You must first understand recursion. (Sorry; I couldn't resist.)
[3] In the real world, it will eventually stop: because each recursive call uses some memory (to keep track of function parameters, local variables, and the like), code that recurses infinitely will eventually exceed the memory allocated to it and crash. This is one of the most common causes of a stack overflow error.

Why is this implementation of a binary heap slower than Python's stdlib?

I have been implementing my own heap module to help me understand the heap data structure. I understand how heaps work and are managed, but my implementation is significantly slower than the standard Python heapq module when performing a heap sort: for a list of size 100,000, heapq takes 0.6s while my code takes 2s (originally 2.6s; I cut it down to 2s by taking the len() calls out of percDown and passing the length through so it doesn't have to be recalculated every time the method calls itself). Here is my implementation:
def percDown(lst, start, end, node):
    # Moves the given node down the heap, starting at index start, until the
    # heap property is satisfied (all children must be larger than their parent)
    iChild = 2 * start + 1
    i = start
    # if the node has reached the end of the heap (i.e. no children left),
    # return its index (we are done)
    if iChild > end - 1:
        return start
    # if the second child exists and is smaller than the first child,
    # use that child index for comparing later
    if iChild + 1 < end and lst[iChild + 1] < lst[iChild]:
        iChild += 1
    # if the smallest child is less than the node, it is the new parent
    if lst[iChild] < node:
        # move the child to the parent position
        lst[start] = lst[iChild]
        # continue recursively going through the child nodes of the
        # new parent node to find where node is meant to go
        i = percDown(lst, iChild, end, node)
    return i
popMin: pops the minimum value (lst[0]) and reorders the heap
def popMin(lst):
    length = len(lst)
    if (length > 1):
        min = lst[0]
        ele = lst.pop()
        i = percDown(lst, 0, length - 1, ele)
        lst[i] = ele
        return min
    else:
        return lst.pop()
heapify: turns a list into a heap in-place
def heapify(lst):
    iLastParent = math.floor((len(lst) - 1) / 2)
    length = len(lst)
    while iLastParent >= 0:
        ele = lst[iLastParent]
        i = percDown(lst, iLastParent, length, lst[iLastParent])
        lst[i] = ele
        iLastParent -= 1
sort: heapsorts the given list using the methods above (not in-place)
def sort(lst):
    result = []
    heap.heapify(lst)
    length = len(lst)
    for z in range(0, length):
        result.append(heap.popMin(lst))
    return result
Did I mistakenly add complexity to the algorithm/heap creation, or is the Python heapq module just heavily optimized? I have a feeling it is the former, as 0.6s vs 2s is a huge difference.
The Python heapq module uses a C extension. You cannot beat C code.
From the heapq module source code:
# If available, use C implementation
try:
from _heapq import *
except ImportError:
pass
Also see the _heapqmodule.c C source.
0.6s vs. 2s is a bit less than a 4x difference. Is that "too big"?
That's not enough information to answer. A 4x difference certainly could be caused by getting the algorithm wrong… but there's really no way to tell without testing at different sizes.
If you get, say, only a 1.2x difference at 1,000, a 4x difference at 100,000, and a 12x difference at 1,000,000, then your algorithmic complexity is most likely worse, which means you probably did get something wrong, and that's something you need to fix.
On the other hand, if it's about a 4x difference at all three sizes, then there's just a bigger constant multiplier in your overhead. Most likely that's because you've got an inner loop running in Python, while the (CPython) stdlib version uses the _heapq accelerator module that runs the same loop in C, as explained in Martijn Pieters' answer. So, you didn't get anything wrong. You can probably micro-optimize a bit, but ultimately you're going to have to either rewrite the core of your code in C or run it in a JIT-optimized interpreter to get anywhere near as good as the stdlib. And really, if you're just writing this to understand the algorithm, you don't need to do that.
As a side note, you might want to try running the comparison in PyPy. Most of its stdlib is written in pure Python with no special optimizations, but its optimizing JIT compiler makes it almost as fast as the native C code in CPython. And that same JIT applies to your code, meaning your unoptimized code will often be nearly as fast as the native C code in CPython too. Of course there's no guarantee of that, and it doesn't change the fact that you always need to test at different sizes if you're trying to assess algorithmic complexity.
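A minimal sketch of that size comparison, assuming your own functions are importable as a module named heap (a hypothetical name; adjust the import to your layout):

import random
import time
import heapq
import heap  # hypothetical: the module containing your heapify/popMin/sort

def heapq_sort(lst):
    heapq.heapify(lst)
    return [heapq.heappop(lst) for _ in range(len(lst))]

for n in (1000, 100000, 1000000):
    data = [random.random() for _ in range(n)]
    for name, fn in (("mine", heap.sort), ("heapq", heapq_sort)):
        start = time.perf_counter()
        fn(list(data))  # sort a copy so both implementations see the same input
        print(name, n, round(time.perf_counter() - start, 3), "s")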

Recursion not breaking

I am trying to solve Euler problem 18, where I am required to find the maximum total from top to bottom of a triangle of numbers. I am trying to use recursion but am stuck with this.
I guess I didn't state my problem clearly earlier. What I am trying to achieve with recursion is to find the sum along the maximum-value path. I start from the top of the triangle, then check whether 7 + findsum() or 4 + findsum() is bigger, where findsum() is supposed to find the sum of the numbers beneath a position. I am storing the sum in the variable result.
The problem is that I don't know the base case for this recursive function. I know it should stop when it reaches the bottom row, but I don't know how to write that logic in the program.
pyramid = [[0, 0, 0, 3, 0, 0, 0],
           [0, 0, 7, 0, 4, 0, 0],
           [0, 2, 0, 4, 0, 6, 0],
           [8, 0, 5, 0, 9, 0, 3]]
pos = [0, 3]

def downleft(pyramid, pos):  # returns down-left child
    try:
        return pyramid[pos[0] + 1][pos[1] - 1]
    except:
        return 0

def downright(pyramid, pos):  # returns down-right child
    try:
        return pyramid[pos[0] + 1][pos[1] + 1]
    except:
        return 0

result = 0

def find_max(pyramid, pos):
    global result
    if downleft(pyramid, pos) + find_max(pyramid, [pos[0] + 1, pos[1] - 1]) > downright(pyramid, pos) + find_max(pyramid, [pos[0] + 1, pos[1] + 1]):
        new_pos = [pos[0] + 1, pos[1] - 1]
        result += downleft(pyramid, pos) + find_max(pyramid, [pos[0] + 1, pos[1] - 1])
    elif downright(pyramid, pos) + find_max(pyramid, [pos[0] + 1, pos[1] + 1]) > downleft(pyramid, pos) + find_max(pyramid, [pos[0] + 1, pos[1] - 1]):
        new_pos = [pos[0] + 1, pos[1] + 1]
        result += downright(pyramid, pos) + find_max(pyramid, [pos[0] + 1, pos[1] + 1])
    else:
        return result

find_max(pyramid, pos)
A big part of your problem is that you're recursing a lot more than you need to. You should really only ever call find_max twice recursively, and you need some base-case logic to stop after the last row.
Try this code:
def find_max(pyramid, x, y):
    if y >= len(pyramid):  # base case, we're off the bottom of the pyramid
        return 0           # so, return 0 immediately, without recursing
    left_value = find_max(pyramid, x - 1, y + 1)   # first recursive call
    right_value = find_max(pyramid, x + 1, y + 1)  # second recursive call
    if left_value > right_value:
        return left_value + pyramid[y][x]
    else:
        return right_value + pyramid[y][x]
I changed the call signature to have separate values for the coordinates rather than using a tuple, as this made the indexing much easier to write. Call it with find_max(pyramid, 3, 0), and get rid of the global pos list. I also got rid of the result global (the function returns the result).
This algorithm could benefit greatly from memoization, as on bigger pyramids you'll calculate the values of the lower-middle areas many times. Without memoization, the code may be impractically slow for large pyramid sizes.
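A sketch of that memoization, using functools.lru_cache and assuming the zero-padded pyramid representation from the question; the pyramid is captured from the enclosing scope so the cached arguments stay hashable:

from functools import lru_cache

def max_path(pyramid, start_x):
    @lru_cache(maxsize=None)
    def find_max(x, y):
        if y >= len(pyramid):  # off the bottom of the pyramid
            return 0
        # each (x, y) position is computed once and cached afterwards
        best_child = max(find_max(x - 1, y + 1), find_max(x + 1, y + 1))
        return best_child + pyramid[y][x]
    return find_max(start_x, 0)

pyramid = [[0, 0, 0, 3, 0, 0, 0],
           [0, 0, 7, 0, 4, 0, 0],
           [0, 2, 0, 4, 0, 6, 0],
           [8, 0, 5, 0, 9, 0, 3]]
print(max_path(pyramid, 3))  # 23 (3 + 7 + 4 + 9)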
Edit: I see that you are having trouble with the logic of the code. So let's have a look at that.
At each position in the tree you want to choose the path from that point on that has the highest value, so you calculate the score of the left path and the score of the right path. I see this is something you try in your current code, only there are some inefficiencies: you calculate everything twice (first in the if, then in the elif), which is very expensive. You should calculate the values of the children only once.
You ask for the stopping condition. Well, if you reach the bottom of the tree, what is the score of the path starting at this point? It's just the value in the tree. And that is what you should return at that point.
So the structure should look something like this:
function getScoreAt(x, y):
    if at the end: return valueInTree(x, y)
    valueLeft = getScoreAt(x - 1, y + 1)
    valueRight = getScoreAt(x + 1, y + 1)
    valueHere = max(valueLeft, valueRight) + valueInTree(x, y)
    return valueHere
Extra hint:
Are you aware that in Python, negative indices wrap around to the back of the array? So if you do pyramid[pos[0]+1][pos[1]-1], you may actually get an element like pyramid[1][-1], which is on the other side of that row of the pyramid. You probably expect this to raise an error, but it does not.
To fix your problem, you should add explicit bounds checks rather than relying on try blocks (a bare try/except for this is also not a nice programming style).
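For instance, the two helper functions could be rewritten with explicit checks, roughly like this (keeping the question's convention of returning 0 for a missing child):

def downleft(pyramid, pos):  # down-left child, or 0 when the index would leave the pyramid
    row, col = pos[0] + 1, pos[1] - 1
    if row < len(pyramid) and col >= 0:
        return pyramid[row][col]
    return 0

def downright(pyramid, pos):  # down-right child, or 0 when the index would leave the pyramid
    row, col = pos[0] + 1, pos[1] + 1
    if row < len(pyramid) and col < len(pyramid[row]):
        return pyramid[row][col]
    return 0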
