Understanding Recursion on Binary Trees

Hello, I am new to programming. At school we are learning recursion and binary trees, but recursion is something that really melts my mind.
I was trying out a recursive function on a binary tree, trying to output its items in preorder:
def preorder(self, preorder_list):
    """ (BinaryTree) -> list
    Return a list of the items in this tree using a *preorder* traversal.
    """
    if self.root is not EmptyValue:
        if self.left and self.right != None:
            preorder_list.append(self.left.preorder(preorder_list))
            preorder_list.append(self.right.preorder(preorder_list))

def helper_preorder(self):
    preorder_list = []
    self.preorder(preorder_list)
    return preorder_list
When I run this code, the output I get is, for example:
[None, None, None, None]
I suspect this is a problem with my recursive reasoning. So I would like to know what is wrong with my reasoning and how I can get better at recursion.
Thanks for your time.

Your problem is that you never return anything from preorder, so Python implicitly returns None. You append the function's return value to your list, but that value is None. Your function should look like this (in pseudocode, not valid Python) [1]:
preorder(node, list):
    if node is Empty:
        return list
    else:
        list.append(node.data)
        preorder(node.left, list)
        preorder(node.right, list)
        return list
Note that I am duplicating the return statement, so you could optimize this, but I set it up this way because it's easiest to understand recursion if you think of it as a base case or a recursive call.
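As a rough illustration, the pseudocode above could be written in Python along these lines. The Node class here is a hypothetical stand-in with data, left and right attributes, not the asker's BinaryTree class, and an empty subtree is assumed to be represented by None.

```python
class Node:
    """Hypothetical minimal tree node, used only for this sketch."""
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def preorder(node, items):
    # Base case: an empty subtree adds nothing to the list.
    if node is None:
        return items
    # Recursive case: record this node, then traverse left and right.
    items.append(node.data)
    preorder(node.left, items)
    preorder(node.right, items)
    return items


tree = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(preorder(tree, []))  # [1, 2, 4, 5, 3]
```

Note that the recursive calls are not appended to the list; they add to it themselves, which is what avoids the None entries.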
To understand recursion [2], think about breaking a hard problem into smaller, easier problems. I'll take a simple example, one of the classic recursion examples: factorial(n).
What is factorial(n)? That's a really hard question to answer, so let's find something simpler.
From math class, we know that n! = n*(n-1)*(n-2)*...*(2)*(1), so let's start there. We can immediately see that n! = n * (n-1)!. That helps; now we can write a starter of our factorial function:
factorial(n):
    return n * factorial(n-1)
But it's not done yet; if we run that, it'll go forever and not stop [3]. So we need what's called a base case: a stopping point for the recursion, where the problem is now so simple that we know the answer.
Fortunately, we have a natural base case: 1! = 1, which is true by definition. So we add that to our function:
factorial(n):
    if n == 1: return 1
    else return n * factorial(n-1)
Just to make it clear how that works, let's do a trace for something small, say n=4.
So we call factorial(4). Obviously 4 != 1, so factorial(4) = 4*factorial(3). Now we repeat the process, which is the beautiful thing about recursion: applying the same algorithm to smaller subsets of the original problem, and building up a solution from those parts.
Our trace looks something like this:
factorial(4) = 4*factorial(3)
factorial(4) = 4*3*factorial(2)
factorial(4) = 4*3*2*factorial(1)
Now, we know what factorial(1) is: it's just 1, our base case. So finally we have
factorial(4) = 4*3*2*1
factorial(4) = 24
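For reference, a minimal runnable Python version of the factorial function traced above might look like this. It assumes n is a positive integer, as in the explanation; anything below 1 would never reach the base case.

```python
def factorial(n):
    # Base case: 1! is 1 by definition.
    if n == 1:
        return 1
    # Recursive case: n! = n * (n-1)!
    return n * factorial(n - 1)


print(factorial(4))  # 24
```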
[1] This is only one possible way; there are others, but this is what I came up with on the spot.
[2] You must first understand recursion (sorry, I couldn't resist).
[3] In the real world, it will eventually stop: because each recursive call uses some memory (to keep track of function parameters, local variables and the like), code that recurses infinitely will eventually exceed the memory allocated to it and crash. This is one of the most common causes of a stack overflow error.

Related

What are the base cases in this recursive BST algorithm?

I have not often written recursive functions/methods. I successfully understand the "base case" in simpler functions such as the following:
def countdown(num):
    if num == 0:  # Base case
        return
    print(num)  # Action
    countdown(num-1)  # Reduction and recurse

def factorial(num):
    if num == 0 or num == 1:  # Base cases
        return 1
    return factorial(num-1) * num  # reduction + recurse + action
However, I have programmed an algorithm that finds the closest value to a given value in a Binary Search Tree (BST), as well as insertion, print all nodes, and other such algorithms. I've found these algorithms to be much harder to understand especially with regard to identify the base cases. This question specifically pertains to the following implementation of findClosest and _findClosest. Consider:
class BSTNode:
    def __init__(self,value=None):
        self.value = value
        self.leftChild = None
        self.rightChild = None

class BST:
    def __init__(self):
        self.root = None

    def findClosest(self, value):
        if self.root is None:  # Base case 0
            print("There is no tree, so can't find closest!")
            return None
        else:
            return self._findClosest(value, self.root, self.root.value)

    def _findClosest(self, searchValue, currentNode, closest):
        if currentNode is None:  # Base case #1
            return closest
        if abs(currentNode.value - closest) < abs(closest - searchValue):  # ACTION
            closest = currentNode.value
        if searchValue > currentNode.value:  # Reduce and recurse
            return self._findClosest(searchValue,currentNode.rightChild, closest)
        elif searchValue < currentNode.value:  # Reduce and recurse
            return self._findClosest(searchValue,currentNode.leftChild,closest)
        else:
            return currentNode.value  # Base case #2
You can see that I attempted to identify the base cases. As I am using a sort of "driver, pseudo-public" method which handles the first base case of no root separately, I labeled that "base case 0".
In any event, my point of confusion is that there are 4 separate return statements in _findClosest(), and my current knowledge of recursion would have me surmising that the two I've labeled Base case #1 and Base case #2 with comments are in fact the only two base cases in that method. However, in order for this method to work correctly, I've also had to return the results of _findClosest() when called on the leftChild and rightChild, so would these also be base cases?
I am having a difficult time determining what is a base case vs. what is the reduction/recursive step. The fact that there are multiple base cases and separate "paths of recursion", if you will, is far more difficult for me to understand than those simple recursive functions that I mentioned earlier. Finally, the base cases in the _findClosest method are also spread out, with the recursive calls sandwiched in the middle, further adding to my confusion.
Please note that the code provided runs fine on Python 3.7.9; however, I purposely did not include the rest of the BST methods and driver, as I was concerned about including too much code. I am also not sure that this question requires those remaining items. If a suitable answer does, I will edit and add them.

For a recursive function, any state that doesn't require recursive computation can be a base case. Even if calculating the base value requires some computation (that isn't recursive), it qualifies as a base case, since it ends the recursion there.
Similarly, any statement that does or does not do any computation but makes a recursive call to some sub-state can be qualified as a recursive step (reduction depends on the computation; if the search space isn't reduced, you'll technically be stuck in infinite recursion).
As for your implementation, you've correctly classified the statements in _findClosest(). However, the classification for findClosest() wouldn't make much sense, since it isn't a recursive function anyways.

From https://www.sparknotes.com/cs/recursion/whatisrecursion/section1/page/3/:
The base case, or halting case, of a function is the problem that we know the answer to, that can be solved without any more recursive calls. The base case is what stops the recursion from continuing on forever. Every recursive function must have at least one base case (many functions have more than one).
When you call your recursive function _findClosest with leftChild or rightChild, you are not yet sure that you have found the number closest to searchValue. You can only be sure once one of the base cases is reached (currentNode is None or currentNode.value == searchValue).
By the way, you could easily implement _findClosest as an iterative function: each call makes at most one recursive call and returns its result directly, so nothing is left pending on the stack.
I think you made a mistake in the comparison that checks whether currentNode.value is closer to searchValue than closest is:
if abs(currentNode.value - closest) < abs(closest - searchValue):
should be:
if abs(currentNode.value - searchValue) < abs(closest - searchValue):
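To make the two suggestions in this answer concrete, here is a rough sketch of an iterative search that uses the corrected comparison. The function name find_closest_iterative is made up for this example; BSTNode is copied from the question so the block runs on its own.

```python
class BSTNode:
    def __init__(self, value=None):
        self.value = value
        self.leftChild = None
        self.rightChild = None


def find_closest_iterative(root, search_value):
    """Hypothetical helper: closest stored value, or None for an empty tree."""
    if root is None:
        return None
    closest = root.value
    current = root
    # The loop condition plays the role of base case #1 (ran off the tree).
    while current is not None:
        # Corrected comparison: measure both distances from search_value.
        if abs(current.value - search_value) < abs(closest - search_value):
            closest = current.value
        if search_value > current.value:
            current = current.rightChild
        elif search_value < current.value:
            current = current.leftChild
        else:
            return current.value  # exact match, base case #2
    return closest


root = BSTNode(10)
root.leftChild = BSTNode(5)
root.rightChild = BSTNode(15)
root.leftChild.rightChild = BSTNode(7)
print(find_closest_iterative(root, 8))  # 7
```

Each step of the loop corresponds to one recursive call in the original, which is why the conversion is mechanical here.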

Python : Counting execution of recursive call

I am using Euler problems to test my understanding as I learn Python 3.x. After I cobble together a working solution to each problem, I find the posted solutions very illuminating, and I can "absorb" new ideas after I have struggled myself. I am working on Euler 024 and I am trying a recursive approach. I by no means believe my approach is the most efficient or most elegant; however, I successfully generate a full set of permutations, increasing in value (because I start with a sorted tuple), which is one of the outputs I want.
To find the millionth permutation in the list (the other output I want, but can't yet get), I am trying to count the permutations as I create them, and that's where I get stuck. In other words, I want to count how many times I reach the base case, i.e. a completed permutation, not the total number of recursive calls. I have found some very clear examples on StackOverflow of counting the executions of a recursive call, but I am having no luck applying the idea to my code.
Essentially, my problems so far are about "passing back" the count of completed permutations using a return statement. I think I need to do that because of the way my for loop creates the "stem" and "tail" tuples. At a high level, either I can't get the counter to increment (so it always comes out as "1" or "5"), or the "nested return" just terminates the code after the first permutation is found, depending on where I place the return. Can anyone help insert the counting into my code?
First the "counting" code I found in SO that I am trying to use:
def recur(n, count=0):
    if n == 0:
        return "Finished count %s" % count
    return recur(n-1, count+1)

print(recur(15))
Next is my permutation code with no counting in it. I have tried lots of approaches, but none of them work. So the following has no "counting" in it, just a comment at which point in the code I believe the counter needs to be incremented.
#
# euler 024 : Lexicographic permutations
#
import time
startTime= time.time()
#
def splitList(listStem,listTail):
    for idx in range(0,len(listTail)):
        tempStem =((listStem) + (listTail[idx],))
        tempTail = ((listTail[:idx]) + (listTail[1+idx:]))
        splitList(tempStem,tempTail)
    if len(listTail) ==0:
        #
        # I want to increment counter only when I am here
        #
        print("listStem=",listStem,"listTail=",listTail)
#
inStem = ()
#inTail = ("0","1","2","3","4","5","6","7","8","9")
inTail = ("0","1","2","3")
testStem = ("0","1")
testTail = ("2","3","4","5")
splitList(inStem,inTail)
#
print('Code execution duration : ',time.time() - startTime,' seconds')
Thanks in advance,
Clive

Since it seems you've understood the basic problem and just want to count the completed permutations, all you need to do is pass along a variable that every call can update. You can add a third argument to your function, a one-element list used as a mutable counter, and increment it each time the base case is reached:
def splitList(listStem, listTail, count):
    for idx in range(0,len(listTail)):
        ...
        splitList(tempStem, tempTail, count)
    if len(listTail) == 0:
        count[0] += 1
        print('Count:', count)
        ...
Now, call this function as before, passing [0] as the initial counter:
splitList(inStem, inTail, [0])
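Filling in the elided parts, a self-contained sketch of this counting approach could look as follows. The snake_case names are chosen for this example only; the logic follows the question's splitList, with the base case moved to the top for readability.

```python
def split_list(stem, tail, count):
    # Base case: nothing left to place, so stem is one complete permutation.
    if len(tail) == 0:
        count[0] += 1  # the list is shared by every call, so this persists
        return
    # Recursive case: move each remaining item onto the stem in turn.
    for idx in range(len(tail)):
        split_list(stem + (tail[idx],), tail[:idx] + tail[idx + 1:], count)


counter = [0]
split_list((), ("0", "1", "2", "3"), counter)
print(counter[0])  # 24
```

A plain integer argument would not work here, because rebinding it inside one call is invisible to the caller; mutating a shared list is one simple way around that.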

Why don't you write a generator for this?
Then you can just stop at the nth item ("drop while i < n").
My solution uses itertools, but you can use your own permutations generator. Just yield the next sequence member instead of printing it.
from itertools import permutations as perm, dropwhile as dw
print(''.join(next(dw(
    lambda x: x[0] < 1000000,
    enumerate(perm('0123456789'), 1)
))[1]))
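If you would rather keep your own recursion, the "yield instead of print" idea might be sketched like this. The names permutations_in_order and nth_permutation are invented for the example, and itertools.islice is used here in place of dropwhile to skip ahead to the nth item.

```python
from itertools import islice


def permutations_in_order(stem, tail):
    # Base case: a finished permutation is yielded instead of printed.
    if not tail:
        yield stem
        return
    # Recursive case: delegate to the generator for each smaller problem.
    for idx in range(len(tail)):
        yield from permutations_in_order(
            stem + (tail[idx],), tail[:idx] + tail[idx + 1:]
        )


def nth_permutation(items, n):
    """Return the n-th permutation (1-based) of a sorted tuple."""
    return next(islice(permutations_in_order((), items), n - 1, None))


print("".join(nth_permutation(("0", "1", "2"), 4)))  # 120
```

Because generators are lazy, the recursion stops as soon as the requested permutation has been produced; no counter has to be passed back through return statements.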

How to multiply without the * sign using recursion?

As homework for a programming class on Python, we're supposed to multiply two integers (n, m) with each other WITHOUT using the * sign (or another form of multiplication). We're supposed to use recursion to solve this problem, so I tried just adding n to itself, m times. I think my problem is with using recursion itself. I have searched the internet for recursion usage, with no results. Here is my code. Could someone point me in the right direction?
def mult(n,m):
    """ mult outputs the product of two integers n and m
    input: any numbers
    """
    if m > 0:
        return n + n
        return m - 1
    else:
        return 1

I don't want to give you the answer to your homework here so instead hopefully I can provide an example of recursion that may help you along :-).
# Here we define a normal function in python
def count_down(val):
    # Next we do some logic, in this case print the value
    print(val)
    # Now we check for some kind of "exit" condition. In our
    # case we want the value to be greater than 1. If our value
    # is 1 or less we do nothing, otherwise we call ourself
    # with a new, different value.
    if val > 1:
        count_down(val-1)

count_down(5)
How can you apply this to what you're currently working on? Maybe, instead of printing something you could have it return something instead...

Thanks guys, I figured it out!!!
I had to return 0 instead of 1, otherwise the answer would always be one higher than what we wanted.
And I now understand that the function has to call itself, which is the main thing I missed.
Here's what I did:
def mult(n,m):
    """ mult outputs the product of two integers n and m
    input: any numbers
    """
    if m == 0:
        return 0
    else:
        return n + mult(n, m - 1)

You have the right mechanics, but you haven't internalized the basics you found in your searches. A recursive function usually breaks down to two cases:
Base Case --
How do you know when you're done? What do you want to do at that point?
Here, you've figured out that your base case is when the multiplier is 0. What do you want to return at this point? Remember, you're doing this as an additive process: I believe you want the additive identity element 0, not the multiplicative 1.
Recursion Case --
Do something trivial to simplify the problem, then recur with this simplified version.
Here, you've figured out that you want to enhance the running sum and reduce the multiplier by 1. However, you haven't called your function again. You haven't properly enhanced any sort of accumulative sum; you've doubled the multiplicand. Also, you're getting confused about recursion: return is to go back to whatever called this function. For recursion, you'll want something like
mult(n, m-1)
Now remember that this is a function: it returns a value. What do you need to do with this value? For instance, if you're trying to compute 4*3, the statement above will give you the value of 4*2. What do you do with that, so that you can return the correct value of 4*3 to whatever called this instance? You'll want something like
result = mult(n, m-1)
return [...] result
... where you have to fill in that [...] spot. If you want, you can combine these into a single line of code; I'm just trying to make it easier for you.

Project Euler #2 Python 3.5 Help on Latency

I'm new to coding and trying to do the project euler exercises to improve my knowledge on coding. I have come across several solutions with regards to Project Euler #2.
However, I would want to know why my code takes so much longer to compute as compared to a solution I found.
I would appreciate if anyone can guide me as to the differences between the two.
My code:
def fib(n):
    if n==0:
        return 0
    elif n == 1:
        return 1
    else:
        f=fib(n-1)+fib(n-2)
        return f

i=0
store=[]
while fib(i)<=4000000:
    i += 1
    if fib(i)%2 == 0:
        store.append(fib(i))
print('The total is: '+str(sum(store)))
Online Solution I found:
a = 1
b = 2
s = 0
while b <= 4000000:
    if not b % 2:
        s += b
    a, b = b, a + b
print(s)

To calculate fib(10) with your implementation:
fib(10) = fib(9) + fib(8)
in which fib(9) is calculated recursively:
fib(9) = fib(8) + fib(7)
See the problem? The result of fib(8) has to be calculated twice! As the expression is expanded further (e.g. to get the result of fib(8)), the amount of redundant calculation becomes huge for large numbers.
Recursion itself isn't the problem, but you have to store the results for smaller Fibonacci numbers rather than calculating the same expression over and over. One possible solution is to use a dictionary to store the intermediate results.
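One way the dictionary idea could look in practice is sketched below. The module-level name _fib_cache is an arbitrary choice for this example; it is pre-seeded with the two base cases from the question's fib().

```python
# Hypothetical cache shared by all calls, seeded with the base cases.
_fib_cache = {0: 0, 1: 1}


def fib(n):
    # Only compute a value the first time it is requested.
    if n not in _fib_cache:
        _fib_cache[n] = fib(n - 1) + fib(n - 2)
    return _fib_cache[n]


print(fib(10))  # 55
```

With the cache in place, each Fibonacci number is computed once, so repeated calls such as fib(i) inside a loop become dictionary lookups.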

You are using recursive calls to a function where the other solution uses a plain iterative loop.
Making a function call is bound to some overhead for calling and returning from it. For bigger numbers of n you will have a lot of those function calls.
Appending to a list over and over and summing it up is probably also slower than doing this via an accumulator.

Your solution calls a recursive function (with 2 recursions) each time it goes in your while loop. Then in the loop you run that same function again.
The other solution only adds numbers and then swaps the two variables.
I guess you didn't really need the fibonacci function, but if you insist on using it, run it only once per value and save the result instead of re-running it.
Plus, you store all your results and sum them at the end. That takes a bit of time (and memory) too; maybe you didn't need to store the intermediate results.

As several other answers pointed out, the recursion causes your fib() function to be called very often, 111 561 532 times in fact. This is easily seen by adding a counter:
count = 0
def fib(n):
    global count
    count += 1
    if n==0:
        # the rest of your program
print(count)
There are two ways to fix this; rewrite your program to be iterative rather than recursive (like the other solution you posted), or cache intermediate results from fib().
See, you call fib(8), which in turn has to call fib(7) and fib(6), etc, etc. Just calculating fib(8) takes 67 calls to fib()!
But later, when you call fib(9), that also calls fib(8), which has to do all the work over again (67 more calls to fib()). This gets out of hand quickly. It would be better, if fib() could remember that it already calculated fib(8) and remember the result. This is known as caching or memoization.
Luckily, Python's standard library has a decorator just for that purpose, functools.lru_cache:
from functools import lru_cache

@lru_cache()
def fib(n):
    if n==0:
        ...
On my computer, your program execution goes from 111 561 532 invocations of fib() in 27 seconds to 35 invocations in 0.028 seconds.
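Putting the counter and the decorator together, the whole program from the question might look like the sketch below. Using maxsize=None (an unbounded cache) is a choice made for this example; the counter only increases on cache misses, because cached calls never reach the function body.

```python
from functools import lru_cache

count = 0


@lru_cache(maxsize=None)
def fib(n):
    global count
    count += 1  # runs only when the result is not already cached
    if n == 0:
        return 0
    elif n == 1:
        return 1
    return fib(n - 1) + fib(n - 2)


i = 0
store = []
while fib(i) <= 4000000:
    i += 1
    if fib(i) % 2 == 0:
        store.append(fib(i))

print('The total is: ' + str(sum(store)))
print(count)  # 35: one body execution each for fib(0) through fib(34)
```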

Why am I exceeding max recursion depth?

Let me stop you right there, I already know you can adjust the maximum allowed depth.
But I would think this function, designed to calculate the nth Fibonacci number, would not exceed it, owing to the attempted memoization.
What am I missing here?
def fib(x, cache={1:0,2:1}):
    if x is not 1 and x is not 2 and x not in cache: cache[x] = fib(x-1) + fib(x-2)
    return cache[x]

The problem here is the one that tdelaney pointed out in a comment.
You are filling the cache in backward, from x down to 2.
That is sufficient to ensure that you only perform a linear number of recursive calls. The first call to fib(4000) only makes 3998 recursive calls.
But 3998 > sys.getrecursionlimit(), so that doesn't help.

Your code works, just set the recursion limit (default is 1000):
>>> def fib(x, cache={1:0,2:1}):
...     if x is not 1 and x is not 2 and x not in cache: cache[x] = fib(x-1) + fib(x-2)
...     return cache[x]
...
>>> from sys import setrecursionlimit
>>> setrecursionlimit(4001)
>>> fib(4000)
24665411055943750739295700920408683043621329657331084855778701271654158540392715
48090034103786310930146677221724629877922534738171673991711165681180811514457211
13771400656054018493704811431159158792987298892998378107544456316501964164304630
21568595514449785504918067352892206292173283858530346012173429628868997174476215
95754737778371797011268738657294932351901755682732067943003555687894170965511472
22394287423465133129791428666544293424932758353804445807459873383767095726534051
03186366562265469193320676382408395686924657068094675464095820220760924728356005
27753139995364477320639625889904027436038223654786222515006804845418392308019640
53848249082837958012652040193422565794818023898141209364892225521425081077545093
40549694342959926058170589410813569880167004050051440392247460055993434072332526
101572422443738016276258104875526626L
>>>
The reason is, if you imagine a large tree, your root node is 4000, which connects to 3999 and 3998. You go all the way down one branch of the tree until you hit a base case. Then you come back up, building the cache from the bottom. So the tree is over 1000 deep, which is why you hit the limit.

To add to the discussion question comments, wanted to summarize:
You're adding to the cache after the recursive step -- thus your cache isn't doing much.
You're also referring to the same cache value in all the calls. Not sure if that's what you want, but that's the behavior.
This style of recursion isn't idiomatic Python. However, what is idiomatic Python is to use something like a memoization decorator. For an example, look here: https://wiki.python.org/moin/PythonDecoratorLibrary#Memoize (With your exact example)

Maybe this helps to visualise, what is going wrong:
def fib(x, cache={0:..., 1:0, 2:1}):
    if x not in cache: cache[x] = fib(x-1) + fib(x-2)
    return cache[x]

for n in range(4000): fib(n)
print(fib(4000))
This works perfectly because you explicitly build the cache bottom-up. (It is a good thing that default arguments are evaluated only once, when the function is defined, so the cache persists between calls.)
Btw: your initial dictionary is wrong. fib(1) is 1 and not 0. I kept this numbering offset in my approach, though.

The trick to making memoization work well for a problem like this is to start at the first value you don't yet know and work up towards the value you need to return. This means avoiding top-down recursion. It's easy to iteratively compute Fibonacci values. Here's a really compact version with a memo list:
def fib(n, memo=[0,1]):
    while len(memo) < n+1:
        memo.append(memo[-2]+memo[-1])
    return memo[n]
Here's a quick demo run (which goes very fast):
>>> for i in range(90, 101):
...     print(fib(i))
...
2880067194370816120
4660046610375530309
7540113804746346429
12200160415121876738
19740274219868223167
31940434634990099905
51680708854858323072
83621143489848422977
135301852344706746049
218922995834555169026
354224848179261915075
>>> fib(4000)
39909473435004422792081248094960912600792570982820257852628876326523051818641373433549136769424132442293969306537520118273879628025443235370362250955435654171592897966790864814458223141914272590897468472180370639695334449662650312874735560926298246249404168309064214351044459077749425236777660809226095151852052781352975449482565838369809183771787439660825140502824343131911711296392457138867486593923544177893735428602238212249156564631452507658603400012003685322984838488962351492632577755354452904049241294565662519417235020049873873878602731379207893212335423484873469083054556329894167262818692599815209582517277965059068235543139459375028276851221435815957374273143824422909416395375178739268544368126894240979135322176080374780998010657710775625856041594078495411724236560242597759185543824798332467919613598667003025993715274875
