Found this solution to make a factorial() function in python, but I am having trouble with understanding 'why' it works.
The function is:
def factorial(x):
    if x <= 1:
        return 1
    else:
        return x * factorial(x-1)
I'm having trouble understanding where the actual multiplication happens?
It would seem to me, that the function would keep calling itself until it gets to 1, and returns 1. Can someone enlighten me? I'm sure I'm missing something easy.
Consider a few easy examples:
calling factorial(1)
this will immediately return 1
calling factorial(2)
x is 2 in our first scope.
we will enter the else block
we multiply x, which is 2, with factorial(x-1).
x-1 is 1.
We call factorial(1) which is our first case. The result is 1.
we return 2
This is basically it, for any higher number we get more scopes, always calling factorial with one less, eventually reaching 1 where we terminate and start returning back values.
Where the multiplication happens:
#        +-------------------------- HERE .MUL A, B happens
#        |                                     |  |
#        |                                     v  v
#        |                                     x ( !(x-1) )
#        v
return x * factorial(x-1)
#          ---------( -1)
#              |
#          THIS sends recursive call
#          to get B-value "down the line"
#          while A is waiting
#          to receive B value
#          needed for .MUL A, B
#
#          B waits for return value
#          from "recursive" call
#          to the same function,
#          but with off by one
#          smaller number
#
#          UNTIL A == 2            | more exactly A { 2 | 1 |  0 }
#                B == 1            |              B { 1 | 0 | -1 }
#          for which case          | for which case
#          factorial( 1 ) RETs 1   | factorial( B ) RETs 1
#
#          as a terminal state
#          without any .MUL
#          without any deeper level recursion call
#          thus "terminal" state
#
# and since this moment, all "waiting" .MUL A, B-s start to get processed
# from back to front
# one step after another
#          1 * 2 * 3 * .. * A
# which is the definition of !A
# Q.E.D.
This is why it works
A general tip for programming is to insert print statements to help you see what's happening as the code runs. This is especially useful when you have broken code that you are trying to fix but is also good for understanding new code. In this case try running the following:
def factorial(x):
    print("x", x)
    if x <= 1:
        print("base case, returning 1")
        return 1
    else:
        sub_fact = factorial(x-1)
        print("factorial(x-1)", sub_fact)
        result = x * sub_fact
        print("return", result)
        return result
factorial(4)
Let's say you call factorial(3), here's how it would look.
factorial(3):
    return 3 * factorial(2) = ?
    can't return yet since it doesn't have a value, call factorial(2)
    factorial(2):
        return 2 * factorial(1) = ?
        can't return yet since it doesn't have a value, call factorial(1)
        factorial(1):
            return 1
        now bubbling up, since factorial(1) returns 1
    factorial(2) = 2 * 1
    returns 2
factorial(3) = 3 * 2
returns 6
end of operation
Basically, a stack frame gets created for every call to factorial(x), and a hierarchy of stack frames is formed. Every call waits for the answer from the next call, and so on. Finally, when the answer is received by the main call, it returns the answer.
A stack frame is a part of the call stack, and a new stack frame is created every time a subroutine is called. So, in our recursive factorial function above, a new stack frame is created every time the function is called. The stack frame is used to store all of the variables for one invocation of a routine. So, remember that a call stack is basically a stack of stack frames.
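To make the stack-frame picture concrete, here is a minimal sketch that manages the "frames" by hand in a Python list instead of relying on the real call stack. The name factorial_with_explicit_stack is made up for this illustration, and a real stack frame stores more than just x.

```python
def factorial_with_explicit_stack(x):
    """Compute x! while keeping the waiting 'frames' in a plain list."""
    stack = []  # each entry stands in for one frame that is waiting on a deeper call

    # "Call" phase: every call with x > 1 has to wait, so push its x.
    while x > 1:
        stack.append(x)
        x -= 1

    # Base case reached: the innermost call returns 1 without multiplying.
    result = 1

    # "Return" phase: frames finish in reverse order, each doing its pending multiply.
    while stack:
        result = stack.pop() * result

    return result


print(factorial_with_explicit_stack(4))  # 4 * (3 * (2 * 1))
```

The multiplications only happen in the second loop, which mirrors how the recursive version multiplies while the calls are returning.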
Yes, the factorial of 0 and of 1 is 1, and in those cases there is neither a multiplication nor a further recursive call.
Otherwise, note that before we multiply by the current x, we must get the result of the next factorial call.
So, the recursion "enters" itself down to the stop condition (when x==1) and then the result rises back up, like this:
factorial(5):
(5 *(4 *(3 *(2 *(1*1)))))
- Read it from right to left, which is the order of recursive execution
- Note: (1*1) would not actually be executed because at x==1 recursion stops.
Because multiplication is commutative (a*b = b*a), the direction is irrelevant to the result (top to bottom or bottom to top).
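As a rough illustration, the nested product can be generated by a function with the same shape as factorial. The helper name expand is hypothetical, and it returns text rather than a number so the pending multiplications stay visible.

```python
def expand(x):
    """Return the nested product that factorial(x) builds up, as a string."""
    if x <= 1:
        # Base case: no multiplication and no further call.
        return "1"
    # Recursive case: this level's multiplication wraps the deeper result.
    return "(" + str(x) + " * " + expand(x - 1) + ")"


print(expand(5))  # (5 * (4 * (3 * (2 * 1))))
```

The innermost parentheses are produced first and the outermost last, which is the order in which the real calls return.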
I found a basic code in python to find the numbers of paths you can take in a (m,n) grid if you can only go either down or right.
def gridtraveller(m,n):
    if m == 0 or n == 0:
        return 0
    elif m == 1 or n == 1:
        return 1
    return gridtraveller(m-1,n) + gridtraveller(m,n-1)
But I don't understand why this works, for two reasons:
What does def something(m,n) do?
And why do we return the definition here? (I do understand why we use
m-1 and n-1, but I don't understand the concept of a def returning a def.)
Thanks, and sorry, English is not my first language.
In Python, the def keyword simply defines a function; here it defines gridtraveller(m,n). The last return statement is not returning a definition: it returns the value produced by calling a function. In this case it returns the sum of the values of two further calls to gridtraveller with different argument values; this is called recursion. An important part of recursion is having appropriate base cases, that is, return values that do not lead to another recursive call (i.e. the return 0 or return 1 you see).
It can be easier to understand by stepping through a few of the recursive calls. If your first function call starts with m = 2 and n = 2, neither base case applies, so the call ends with return gridtraveller(1,2) + gridtraveller(2,1). The first call in that statement returns 1 since m is 1, and the second returns 1 since n is 1, giving a total result of 2. Larger values of m and n generally give a higher number, since more calls to gridtraveller(m,n) happen and more base-case 1s are added together.
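The recursion can be checked on a few small grids. The sketch below reuses gridtraveller from the question and compares it against the closed form C(m+n-2, m-1), a standard combinatorial count of right/down paths that is not part of the original post and is used here only as a cross-check. It assumes Python 3.8+ for math.comb.

```python
from math import comb  # available in Python 3.8 and later


def gridtraveller(m, n):
    if m == 0 or n == 0:
        return 0  # an empty grid has no paths
    elif m == 1 or n == 1:
        return 1  # a single row or column has exactly one path
    # Every path starts with a step down or a step right.
    return gridtraveller(m - 1, n) + gridtraveller(m, n - 1)


for m, n in [(1, 1), (2, 2), (2, 3), (3, 3)]:
    # Both columns should agree for every grid with m, n >= 1.
    print((m, n), gridtraveller(m, n), comb(m + n - 2, m - 1))
```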
I've been doing Python puzzles, and one of them uses a recursive function to compute the Kempner function in Python.
The Kempner function, applied to a composite number, finds the smallest integer greater than zero whose factorial is exactly divisible by that number.
For example:
kempner(6) ➞ 3
1! = 1 % 6 > 0
2! = 2 % 6 > 0
3! = 6 % 6 === 0
kempner(10) ➞ 5
1! = 1 % 10 > 0
2! = 2 % 10 > 0
3! = 6 % 10 > 0
4! = 24 % 10 > 0
5! = 120 % 10 === 0
There are various ways of doing this, and one of the solutions I have seen is this:
def kempner(n, i=1, total=1):
    if total % n == 0:
        return max(1, i-1)
    else:
        return kempner(n, i+1, total*i)
I understand the gist of what this is doing, however when I run it through debug mode and see what the variables are doing I can see that when the base condition is reached (if total % n ==0) and return max(1, i-1) is returned then everything in the else clause will continue to run until the function returns to its starting condition (e.g. for kempner(10) then n = 10, i = 1, total = 1). Why does it do that? Surely it should stop its recurrence if the base condition has been reached?
This is a fairly abstract issue and is obviously a blind spot in my knowledge. If anyone has any insight I would be grateful.
Recursive calls are just like any other function call: when they return, they return control back to whatever called them.
Say you have a series of numbered recursive calls:
1 -> 2 -> 3 -> 4
Base Case Reached
If recursive call 3 called recursive call 4, and recursive call 4 ended at the base case, returning from recursive call 4 will take you back to recursive call 3, because 3 called 4. This is just like any other function call:
def second_func():
    print("Inner")
    return 5
def first_func():
    return second_func()
When you return from second_func, you return control back to first_func, since first_func called second_func. You don't immediately exit from second_func back to main or something else. It's the same with recursive calls. The only difference when dealing with recursion is first_func and second_func are the same function, but that doesn't affect the mechanics of returning.
There is no way (other than using something like exceptions) to exit from the entire call chain at once.
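The unwinding seen in the debugger can be recorded explicitly. The following is a minimal sketch of the kempner function from the question with two extra parameters, depth and log, which are assumptions added purely for tracing; the name kempner_traced is likewise made up.

```python
def kempner_traced(n, i=1, total=1, depth=0, log=None):
    """Same logic as kempner, but records each return on the way back out."""
    if log is None:
        log = []
    if total % n == 0:
        value = max(1, i - 1)
        log.append((depth, "base case", value))
        return value, log
    # The deeper call runs to completion first...
    value, _ = kempner_traced(n, i + 1, total * i, depth + 1, log)
    # ...and only then does this frame get to return, handing the value upward.
    log.append((depth, "passing along", value))
    return value, log


value, log = kempner_traced(10)
print(value)
for entry in log:
    print(entry)
```

Each frame appears in the log exactly once, from the deepest back to depth 0, which is the "else clause continuing to run" that the debugger shows: every frame is simply finishing its own return statement.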
For context, this code is written on a video within the channel Numberphile by Matt Parker talking about multiplication persistence. The code is written in Python 2, and my question is about the line return "DONE".
Evidently, this prevents an infinite loop from being generated, as it is clear running an example (below) with and without that line:
def per(n, steps = 0):
    if len(str(n)) == 1:
        print "TOTAL STEPS " + str(steps)
        return "DONE"
    steps += 1
    digits = [int(i) for i in str(n)]
    result = 1
    for j in digits:
        result *= j
    print result
    per(result, steps)
per(27)
# 14
# 4
# TOTAL STEPS 2
Now, the same code without the line return "DONE" would not end the loop, and yield:
14
4
TOTAL STEPS 2
4
TOTAL STEPS 3
4
TOTAL STEPS 4
4
TOTAL STEPS 5
4
TOTAL STEPS 6
4
TOTAL STEPS 7
4
TOTAL STEPS 8
4
TOTAL STEPS 9
4
TOTAL STEPS 10
4
TOTAL STEPS 11
4
TOTAL STEPS 12
4
TOTAL STEPS 13
4
TOTAL STEPS 14
4
...
My question is about the meaning of 'return "HOME"'. Does it simply mean STOP. Is there any meaning of "HOME" in there?
If you were to run the function in a variable assignment you would literally get "DONE". I'm assuming you mean "DONE" instead of "HOME" since there is no return "HOME" in your example.
For instance:
x = per(7)
print(x)
This would print the usual output and also print DONE when you call print(x), because 7 is a single digit, so the condition if len(str(n)) == 1 is met on the first run:
TOTAL STEPS 0
DONE
This function calls itself though, which makes it a confusing choice to learn about this stuff since that's fairly unusual. If we run the script with a number that has more than one digit, we'll get None instead. The outermost call never reaches a return statement, so its return value is Python's default, None (whose type is NoneType):
x = per(27)
print(x)
Result:
14
4
TOTAL STEPS 2
None
Because 27 doesn't meet the condition of having a single digit, the function calls itself with the results it prints out, until it finally reaches 4, which meets the condition and prints the TOTAL STEPS line.
To see this value we have to modify the function:
def per(n, steps = 0):
    if len(str(n)) == 1:
        print("TOTAL STEPS " + str(steps))
        return "DONE"
    steps += 1
    digits = [int(i) for i in str(n)]
    result = 1
    for j in digits:
        result *= j
    print(result)
    x = per(result, steps)
    print("Inner return", x)
x = per(27)
print("Outer return", x)
Result:
14
4
TOTAL STEPS 2
Inner return DONE
Inner return None
Outer return None
Thankfully you don't have to worry too much about return values here: since you aren't capturing the return value in a variable in your example, just using return by itself would have the same result, which is to stop running the function at that point and move on to the line after per(27), if there is one. It can be a useful way to see what's happening in this function though.
Using variable assignment with = is not the only way to use a return value; it is just the easiest way to demonstrate what's happening.
For instance, just running print(per(27)) would have the same result as my example.
The return inside the if statement is just a termination for your recursive function - have you noticed that you are calling per(x) inside itself?
If we take out the return, the function will not terminate, and Python will eventually raise a maximum-recursion-depth error (RecursionError in Python 3).
Now your function does this: if n has a single digit, terminate; otherwise, call per again until some call meets the termination condition.
But your per does not actually return anything when len(str(n)) != 1. By default, a function implicitly returns None. So even though the innermost call returns "DONE", that value is not propagated back through the calls that led to it.
To make it clearer, what you might need is the following, which guarantees that the return value is consistent across different inputs:
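def per(n, steps=0):
    if len(str(n)) == 1:
        return "DONE"
    result = 1
    for digit in str(n):
        result *= int(digit)
    # Return the recursive call's value so it propagates to the original caller.
    return per(result, steps + 1)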
And another definition is also consistent:
def per(n, steps=0):
    if len(str(n)) == 1:
        return  # None
    result = 1
    for digit in str(n):
        result *= int(digit)
    per(result, steps + 1)
    # implicitly return None
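A third consistent option is to return something useful from every branch. The sketch below is a hypothetical variant named persistence, not the code from the video, that returns the step count instead of printing it; it assumes n is a non-negative integer.

```python
def persistence(n, steps=0):
    """Return how many digit-multiplication steps it takes to reach one digit."""
    if n < 10:
        # Base case: a single digit, so hand the accumulated count back.
        return steps
    product = 1
    for digit in str(n):
        product *= int(digit)
    # 'return' here is what carries the base-case value up through every call.
    return persistence(product, steps + 1)


print(persistence(27))  # 27 -> 14 -> 4, so 2 steps
```

Dropping the final return keyword would make every multi-digit input produce None, which is the behaviour described for the original per.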
I'm having a bit of trouble understanding a couple of things regarding recursive functions in Python (well, I guess in general). I tried looking for the answer but wasn't sure how to look for it either.
Taking the recursive function example most common when I search:
def sfactorr(j):
    if j == 1:
        return 1
    else:
        return j * sfactorr(j-1)
Each time j is greater than 1, the function calls itself again until j == 1. But when it hits 1, shouldn't the return value be 1? Obviously when you run it you get the result of the whole function, but I don't seem to fully understand why that is.
In other words, how come it returns the correct value and not the one the base condition is returning?
Thanks
You have to work through the steps.
Say we pass 3 to the function.
When the function first runs, j > 1. So instead of returning 1, it returns j * (the result of the function called with one less than j). But it can't actually return that until the function it called has returned.
It keeps on doing that until j is 1. That's the base condition. When j is 1, 1 gets returned to the function above it.
So. Starting with j == 3:
(1) j == 3, returning j * the result of (2)
(2) j == 2, returning j * the result of (3)
(3) j == 1, returning 1
So the functions are called in order (1), (2), (3), but return in order (3), (2), (1).
(3) returns 1
(2) takes the 1 from (3) and multiplies it by its own j, which is 2, resulting in 2. Then it returns that.
(1) takes the 2 from (2) and multiplies it by 3, resulting in 6. It returns that as a final value.
Recursive statements mean that the functions all collapse back into themselves, not that only the last statement runs.
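The call order and the return order can be recorded directly. This is a small sketch wrapping sfactorr in a hypothetical helper, traced_sfactorr, that collects two lists; the lists are an addition for illustration only.

```python
def traced_sfactorr(start):
    """Run sfactorr(start) and report the order of calls and of returns."""
    calls, returns = [], []

    def sfactorr(j):
        calls.append(j)  # recorded on the way down
        if j == 1:
            result = 1
        else:
            result = j * sfactorr(j - 1)
        returns.append((j, result))  # recorded on the way back up
        return result

    return sfactorr(start), calls, returns


value, calls, returns = traced_sfactorr(3)
print(value)    # 6
print(calls)    # [3, 2, 1]
print(returns)  # [(1, 1), (2, 2), (3, 6)]
```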
So I have a fairly decent understanding of the concept of recursion, but some implementations really trip me up. Take for instance this simple fibonacci function:
def fib(x):
    if x == 0 or x == 1:
        return 1
    else:
        return fib(x-1) + fib(x-2)
I get that this breaks up the fibonacci calculation into smaller more manageable chunks. But how exactly does it come to the end result? What exactly is return returning during the recursive cases? It seems like it is just returning a call to a function that will continue to call the function until it returns 1 -- but it never seems to do any really calculations/operations. Contrast this with the classic factorial function:
def factorial(n):
    if n == 1:
        return 1
    else:
        return n * factorial(n - 1)
Here, the function is clearly operating on n, a defined integer, each time, whereas the fibonacci function only ever operates on the function itself until 1 is returned.
Finally, things get even weirder when we bring something like the Merge Sort algorithm into play; namely this chunk of code:
middle = int(len(L)/2)
left = sort(L[:middle], lt)
right = sort(L[middle:], lt)
print(left, right)
return merge(left, right, lt)
left and right seem to be recursively calling sort, yet the print statements seem to indicate that merge is working on every recursive call. So is each recursive call somehow "saved" and then operated on when merge is finally invoked on the return? I'm confusing myself more and more by the second.... I feel like I'm on the verge of a strong understanding of recursion, but my understanding of what exactly return does for recursive calls is standing in my way.
Not understanding how recursive functions work is quite common, but it really indicates that you just don't understand how functions and returning work, because recursive functions work exactly the same as ordinary functions.
print 4
This works because the print statement knows how to print values. It is given the value 4, and prints it.
print 3 + 1
The print statement doesn't understand how to print 3 + 1. 3 + 1 is not a value, it's an expression. Fortunately print doesn't need to know how to print an expression, because it never sees it. Python passes values to things, not expressions. So what Python does is evaluate the expression when the code is executed. In this case, that results in the value 4 being produced. Then the value 4 is given to the print statement, which happily prints it.
def f(x):
    return x + 1
print f(3)
This is very similar to the above. f(3) is an expression, not a value. print can't do anything with it. Python has to evaluate the expression to produce a value to give to print. It does that by going and looking up the name f, which fortunately finds the function object created by the def statement, and calling the function with the argument 3.
This results in the function's body being executed, with x bound to 3. As in the case with print, the return statement can't do anything with the expression x + 1, so Python evaluates that expression to try to find a value. x + 1 with x bound to 3 produces the value 4, which is then returned.
Returning a value from a function makes the evaluation of the function-call expression become that value. So, back out in print f(3), Python has successfully evaluated the expression f(3) to the value 4. Which print can then print.
def f(x):
    return x + 2
def g(y):
    return f(y * 2)
print g(1)
Here again, g(1) is an expression not a value, so it needs to be evaluated. Evaluating g(1) leads us to f(y * 2) with y bound to 1. y * 2 isn't a value, so we can't call f on it; we'll have to evaluate that first, which produces the value 2. We can then call f on 2, which returns x + 2 with x bound to 2. x + 2 evaluates to the value 4, which is returned from f and becomes the value of the expression f(y * 2) inside g. This finally gives a value for g to return, so the expression g(1) is evaluated to the value 4, which is then printed.
Note that when drilling down to evaluate f(2) Python still "remembered" that it was already in the middle of evaluating g(1), and it comes back to the right place once it knows what f(2) evaluates to.
That's it. That's all there is. You don't need to understand anything special about recursive functions. return makes the expression that called this particular invocation of the function become the value that was given to return. The immediate expression, not some higher-level expression that called a function that called a function that called a function. The innermost one. It doesn't matter whether the intermediate function-calls happen to be to the same function as this one or not. There's no way for return to even know whether this function was invoked recursively or not, let alone behave differently in the two cases. return always always always returns its value to the direct caller of this function, whatever it is. It never never never "skips" any of those steps and returns the value to a caller further out (such as the outermost caller of a recursive function).
But to help you see that this works, let's trace through the evaluation of fib(3) in more detail.
fib(3):
    3 is not equal to 0 or equal to 1
    need to evaluate fib(3 - 1) + fib(3 - 2)
        3 - 1 is 2
        fib(2):
            2 is not equal to 0 or equal to 1
            need to evaluate fib(2 - 1) + fib(2 - 2)
                2 - 1 is 1
                fib(1):
                    1 is equal to 0 or equal to 1
                    return 1
                fib(1) is 1
                2 - 2 is 0
                fib(0):
                    0 is equal to 0 or equal to 1
                    return 1
                fib(0) is 1
            so fib(2 - 1) + fib(2 - 2) is 1 + 1
        fib(2) is 2
        3 - 2 is 1
        fib(1):
            1 is equal to 0 or equal to 1
            return 1
        fib(1) is 1
    so fib(3 - 1) + fib(3 - 2) is 2 + 1
fib(3) is 3
More succinctly, fib(3) returns fib(2) + fib(1). fib(1) returns 1, but fib(3) returns that plus the result of fib(2). fib(2) returns fib(1) + fib(0); both of those return 1, so adding them together gives fib(2) the result of 2. Coming back to fib(3), which was fib(2) + fib(1), we're now in a position to say that that is 2 + 1 which is 3.
The key point you were missing was that while fib(0) or fib(1) returns 1, those 1s form part of the expressions that higher level calls are adding up.
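A trace like this can also be produced by the program itself. The sketch below is a hypothetical fib_traced, using the same base cases as the question's fib (both fib(0) and fib(1) return 1); the depth and lines parameters are assumptions added only to build the indented output.

```python
def fib_traced(x, depth=0, lines=None):
    """Same arithmetic as fib, but records one indented line per finished call."""
    if lines is None:
        lines = []
    indent = "    " * depth
    if x == 0 or x == 1:
        lines.append(f"{indent}fib({x}) -> 1")
        return 1, lines
    # Both sub-calls must return before the addition can happen.
    left, _ = fib_traced(x - 1, depth + 1, lines)
    right, _ = fib_traced(x - 2, depth + 1, lines)
    lines.append(f"{indent}fib({x}) -> {left} + {right} = {left + right}")
    return left + right, lines


value, lines = fib_traced(3)
print("\n".join(lines))
```

The line for fib(3) is appended last, after all four inner calls have reported, which shows that the addition happens in the caller once both values have been handed back.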
Try this exercise:
What's the value of fib(0)? What's the value of fib(1)? Let's write those down.
fib(0) == 1
fib(1) == 1
We know this because these are "base cases": they match the first case in the fib definition.
Ok, let's bump it up. What's the value of fib(2)? We can look at the definition of the function, and it's going to be:
fib(2) == fib(1) + fib(0)
We know what the value of fib(1) and fib(0) will be: both of those will do a little work, and then give us an answer. So we know fib(2) will eventually give us a value.
Ok, bump it up. What's the value of fib(3)? We can look at the definition, and it's going to be:
fib(3) == fib(2) + fib(1)
and we already know that fib(2) and fib(1) will eventually compute numbers for us. fib(2) will do a little more work than fib(1), but they'll both eventually bottom out to give us numbers that we can add.
Go for small cases first, and see that when you bump up the size of the problem that the subproblems are things that we'll know how to handle.
If you've gone through a standard high-school math class, you will have seen something similar to this already: mathematicians use what's called "mathematical induction", which is the same idea as the recursion we programmers use as a tool.
You need to understand mathematical induction to really grasp the concept. Once that is understood, recursion is straightforward. Consider a simple function:
def fun(a):
    if a == 0: return a
    else: return a + 10
What does the return statement do here? For a nonzero a, it simply returns a + 10. Why is this easy to understand? Of course, one reason is that it doesn't have recursion. ;) The return statement is easy to understand because both a and 10 are available when it executes.
Now, consider a simple program that sums the first n numbers using recursion. One important thing before coding a recursion is that you must understand how it is supposed to work mathematically. For the sum of n numbers, we know that if the sum of the first n-1 numbers is known, we can return that sum + n. Now, what if we do not know that sum? Well, we find the sum of the first n-2 terms and add n-1 to it.
So, sumofN(n) = n + sumofN(n-1).
Now comes the terminating part. We know that this can't go on indefinitely; it stops because sumofN(0) = 0.
so,
sumofN(n) = 0,                if n = 0,
            n + sumofN(n-1),  otherwise
In code this would mean,
def sumofN(n):
    if n == 0: return 0
    return n + sumofN(n-1)
Here suppose we call sumofN(10). It returns 10 + sumofN(9). We have 10 with us. What about the other term. It is the return value of some other function. So what we do is we wait till that function returns. Here, since the function being called is nothing but itself, it waits till sumofN(9) returns. And when we reach 9 + sumofN(8) it waits till sumofN(8) returns.
What actually happens is
10 + sumofN(9) , which is
10 + 9 + sumofN(8), which is
10 + 9 + 8 + sumofN(7) .....
and finally when sumofN(0) returns we have,
10 + 9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1 + 0 = 55
This concept is all that is needed to understand recursion. :).
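The expansion can be reproduced in code. In this minimal sketch, sumofN is the function from the answer, while expansion is a hypothetical helper with the same recursive shape that returns the pending additions as text; the closed form n(n+1)/2 is a well-known identity used only as a sanity check.

```python
def sumofN(n):
    if n == 0:
        return 0
    return n + sumofN(n - 1)


def expansion(n):
    """Return the chain of additions that sumofN(n) is waiting on, as a string."""
    if n == 0:
        return "0"
    return str(n) + " + " + expansion(n - 1)


print(expansion(10), "=", sumofN(10))
# The recursive result should match the closed form n * (n + 1) / 2.
print(sumofN(10) == 10 * 11 // 2)
```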
Now, what about mergesort?
mergesort(someArray) = { l = mergesort the left part of array,
r = mergesort the right part of the array,
merge(l and r)
}
Until the left part is available to be returned, it goes on calling mergesort on the leftmost arrays. Once we have that, we sort the right array, which in turn works through its own leftmost arrays first. Once we have a left and a right, we merge them.
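The pseudocode can be turned into a short runnable sketch. The names merge_sort and merge are chosen here for illustration; this version compares with <= and takes no comparison function, so it is not the sort(L, lt) from the question, only the same recursive structure.

```python
def merge(left, right):
    """Combine two already-sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One side is exhausted; whatever remains on the other is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Return a new sorted list; the input is left unchanged."""
    if len(items) <= 1:
        return list(items)  # base case: nothing to split
    middle = len(items) // 2
    left = merge_sort(items[:middle])   # fully sorted before the next line runs
    right = merge_sort(items[middle:])  # then the right half is sorted
    return merge(left, right)           # merge happens once per call, on the way back


print(merge_sort([5, 2, 4, 1, 3]))  # [1, 2, 3, 4, 5]
```

Because merge is called at the end of every non-trivial call, it runs many times on small lists before it runs on the two full halves.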
One thing about recursion is that it is so damn easy once you look at it from the right perspective, and that right perspective is called mathematical induction.