Suppose we want to compute some Fibonacci numbers, modulo 997.
For n=500 in C++ we can run
#include <iostream>
#include <array>
std::array<int, 2> fib(unsigned n) {
if (!n)
return {1, 1};
auto x = fib(n - 1);
return {(x[0] + x[1]) % 997, (x[0] + 2 * x[1]) % 997};
}
int main() {
std::cout << fib(500)[0];
}
and in Python
def fib(n):
if n==1:
return (1, 2)
x=fib(n-1)
return ((x[0]+x[1]) % 997, (x[0]+2*x[1]) % 997)
if __name__=='__main__':
print(fib(500)[0])
Both will find the answer 996 without issues. We are taking modulos to keep the output size reasonable and using pairs to avoid exponential branching.
For n=5000, the C++ code outputs 783, but Python will complain
RecursionError: maximum recursion depth exceeded in comparison
If we add a couple of lines
import sys
def fib(n):
if n==1:
return (1, 2)
x=fib(n-1)
return ((x[0]+x[1]) % 997, (x[0]+2*x[1]) % 997)
if __name__=='__main__':
sys.setrecursionlimit(5000)
print(fib(5000)[0])
then Python too will give the right answer.
For n=50000 C++ finds the answer 151 within milliseconds while Python crashes (at least on my machine).
Why are recursive calls so much cheaper in C++? Can we somehow modify the Python compiler to make it more receptive to recursion?
Of course, one solution is to replace recursion with iteration. For Fibonacci numbers, this is easy to do. However, this will swap the initial and the terminal conditions, and the latter is tricky for many problems (e.g. alpha–beta pruning). So generally, this will require a lot of hard work on the part of the programmer.
The issue is that Python has an internal limit on number of recursive function calls.
That limit is configurable as shown in Quentin Coumes' answer. However, too deep a function chain will result in a stack overflow. This underlying limitation¹ applies to both C++ and Python. This limitation also applies to all function calls, not just recursive ones.
In general: You should not write² algorithms that have recursion depth growth with linear complexity or worse. Logarithmically growing recursion is typically fine. Tail-recursive functions are trivial to re-write as iterations. Other recursions may be converted to iteration using external data structures (usually, a dynamic stack).
A related rule of thumb is that you shouldn't have large objects with automatic storage. This is C++-specific since Python doesn't have the concept of automatic storage.
¹ The underlying limitation is the execution stack size. The default size differs between systems, and different function calls consume different amounts of memory, so the limit isn't specified as a number of calls but in bytes instead. This too is configurable on some systems. I wouldn't typically recommend touching that limit due to portability concerns.
² Exceptions to this rule of thumb are certain functional languages that guarantee tail-recursion elimination - such as Haskell - where that rule can be relaxed in case of recursions that are guaranteed to be eliminated. Python is not such a language, and the function in question isn't tail-recursive. While C++ compilers can perform the elimination as an optimization, it isn't guaranteed, and is typically not optimized in debug builds. Hence, the exception doesn't generally apply to C++ either.
Disclaimer: The following is my hypothesis; I don't actually know their rationale: The Python limit is probably a feature that detects potentially infinite recursions, preventing likely unsafe stack overflow crashes and substituting a more controlled RecursionError.
Why are recursive calls so much cheaper in C++?
C++ is a compiled language. Python is interpreted. (Nearly) everything is cheaper in C++, except the translation from source code to an executable program.
Let me first answer your direct questions:
Why are recursive calls so much cheaper in C++?
Because C++ has no limitation on recursive call depth, except the size of the stack. And being a fully compiled language, loops (including recursion) are much faster in C++ than in Python (the reason why special Python modules like numpy/scipy directly use C routines). Additionally, most C++ implementations use a special feature called tail recursion elimination (see later in this post) and optimize some recursive code into iterative equivalents. This is nice here but not guaranteed by the standard, so other compilations could result in a program crashing miserably - but tail recursion is probably not involved here.
If the recursion is too deep and exhausts the available stack, you will invoke the well-known Undefined Behaviour where anything can happen, from an immediate crash to a program giving wrong results (IMHO the latter is much worse and cannot be detected...)
Can we somehow modify the Python compiler to make it more receptive to recursion?
No. Python implementation explicitly never uses tail recursion elimination. You could increase the recursion limit, but this is almost always a bad idea (see later in this post why).
Now for the true explanation of the underlying rationale.
Deep recursion is evil, full stop. You should never use it. Recursion is a handy tool when you can make sure that the depth will stay in sane limits. Python uses a soft limit to warn the programmer that something is going wrong before crashing the system. On the other hand, optimizing C and C++ compilers often internally change tail recursion into an iterative loop. But relying on it is highly dangerous because a slight change could prevent that optimization and cause an application crash.
As found in this other SO post, common Python implementations do not implement that tail recursion elimination. So you should not use recursion at a 5000 depth but instead use an iterative algorithm.
As your underlying computation will need all Fibonacci numbers up to the specified one, it is not hard to iteratively compute them. Furthermore, it will be much more efficient!
A solution is a trampoline: the recursive function, instead of calling another function, returns a function that makes that call with the appropriate arguments. There's a loop one level higher that calls all those functions in a loop until we have the final result. I'm probably not explaining it very well; you can find resources online that do a better job.
The point is that this converts recursion to iteration. I don't think this is faster, maybe it's even slower, but the recursion depth stays low.
An implementation could look like below. I split the pair x into a and b for clarity. I then converted the recursive function to a version that keeps track of a and b as arguments, making it tail recursive.
def fib_acc(n, a, b):
if n == 1:
return (a, b)
return lambda: fib_acc(n - 1, (a+b) % 997, (a+2*b) % 997)
def fib(n):
x = fib_acc(n, 1, 2)
while callable(x):
x = x()
return x
if __name__=='__main__':
print(fib(50000)[0])
"Both will find the answer 996 without issues"
I do see at least one issue : the answer should be 836, not 996.
It seems that both your functions calculate Fibonacci(2*n) % p, and not Fibonacci(n) % p.
996 is the result of Fibonacci(1000) % 997.
Pick a more efficient algorithm
An inefficient algorithm stays an inefficient algorithm, regardless if it's written in C++ or Python.
In order to compute large Fibonacci numbers, there are much faster methods than simple recursion with O(n) calls (see related article).
For large n, this recursive O(log n) Python function should run in circles around your above C++ code:
from functools import lru_cache
#lru_cache(maxsize=None)
def fibonacci(n, p):
"Calculate Fibonacci(n) modulo p"
if n < 3:
return [0, 1, 1][n]
if n % 2 == 0:
m = n // 2
v1 = fibonacci(m - 1, p)
v2 = fibonacci(m, p)
return (2*v1 + v2) * v2 % p
else:
m = (n + 1) // 2
v1 = fibonacci(m, p) ** 2
v2 = fibonacci(m - 1, p) ** 2
return (v1 + v2) % p
print(fibonacci(500, 997))
#=> 836
print(fibonacci(1000, 997))
#=> 996
Try it online!
It will happily calculate fibonacci(10_000_000_000_000_000, 997).
It's possible to add recursion level as parameter, in order to see how deep the recursion needs to go, and display it with indentation. Here's an example for n=500:
# Recursion tree:
500
249
124
61
30
14
6
2
3
1
2
7
4
15
8
31
16
62
125
63
32
250
Try it online!
Your examples would simply look like very long diagonals:
500
499
498
...
...
1
For Windows executables, the stack size is specified in the header of the executable. For the Windows version of Python 3.7 x64, that size is 0x1E8480 or exactly 2.000.000 bytes.
That version crashes with
Process finished with exit code -1073741571 (0xC00000FD)
and if we look that up we find that it's a Stack Overflow.
What we can see on the (native) stack with a native debugger like WinDbg (enable child process debugging) is
[...]
fa 000000e9`6da1b680 00007fff`fb698a6e python37!PyArg_UnpackStack+0x371
fb 000000e9`6da1b740 00007fff`fb68b841 python37!PyEval_EvalFrameDefault+0x73e
fc 000000e9`6da1b980 00007fff`fb698a6e python37!PyArg_UnpackStack+0x371
fd 000000e9`6da1ba40 00007fff`fb68b841 python37!PyEval_EvalFrameDefault+0x73e
fe 000000e9`6da1bc80 00007fff`fb698a6e python37!PyArg_UnpackStack+0x371
ff 000000e9`6da1bd40 00007fff`fb68b841 python37!PyEval_EvalFrameDefault+0x73e
2:011> ? 000000e9`6da1bd40 - 000000e9`6da1ba40
Evaluate expression: 768 = 00000000`00000300
So Python will use 2 Stack frames per method call and there's an enormous 768 bytes difference in the stack positions.
If you modify that value inside the EXE (make a backup copy) with a hex editor to, let's say 256 MB
you can run the following code
[...]
if __name__=='__main__':
sys.setrecursionlimit(60000)
print(fib(50000)[0])
and it will give 151 as the answer.
In C++, we can also force a Stack Overflow, e.g. by passing 500.000 as the parameter. While debugging, we get
0:000> .exr -1
ExceptionAddress: 00961015 (RecursionCpp!fib+0x00000015)
ExceptionCode: c00000fd (Stack overflow)
[...]
0:000> k
[...]
fc 00604f90 00961045 RecursionCpp!fib+0x45 [C:\...\RecursionCpp.cpp # 7]
fd 00604fb0 00961045 RecursionCpp!fib+0x45 [C:\...\RecursionCpp.cpp # 7]
fe 00604fd0 00961045 RecursionCpp!fib+0x45 [C:\...\RecursionCpp.cpp # 7]
ff 00604ff0 00961045 RecursionCpp!fib+0x45 [C:\...\RecursionCpp.cpp # 7]
0:000> ? 00604ff0 - 00604fd0
Evaluate expression: 32 = 00000020
which is just 1 stack frame per method call and only 32 bytes difference on the stack. Compared to Python, C++ can do 768/32 = 24x more recursions for the same stack size.
My Microsoft compiler created the executable with the default stack size of 1 MB (Release build, 32 Bit):
The 64 bit version has a stack difference of 64 bit (also Release build).
Tools used:
Microsoft WinDbg Preview (free)
Sweetscape 010 Editor (commercial) with the template for PE files
You can increase the recursion limit using:
import sys
sys.setrecursionlimit(new_limit)
But note that this limit exists for a reason and that pure Python is not optimized for recursion (and compute-intensive tasks in general).
An alternative to a trampoline is to use reduce.
if you can change the recursive function to be tail-recursive, you can implement it with reduce, here's a possible implementation.
reduce is internally implemented iteratively, so you get to use your recursive function without blowing up the stack.
def inner_fib(acc, num):
# acc is a list of two values
return [(acc[0]+acc[1]) % 997, (acc[0]+2*acc[1]) % 997]
def fib(n):
return reduce(inner_fib,
range(2, n+1), # start from 2 since n=1 is covered in the base case
[1,2]) # [1,2] is the base case
I tried to find sum recursively but getting 'shell returned -1073741571' when I put large values for n. It is working for small values of n say 100. I feel this error is related to stack but not sure about its reason. I've increased recursion limit so that recursion limit is not hit. Here is my code:
import sys
sys.setrecursionlimit(1000000)
print(sys.getrecursionlimit())
def get_sum(num):
if num == 1:
return 1
else:
return num + get_sum(num-1)
print(get_sum(10000))
Edit:
Specs: Windows 10, Python 3.7, Editor: Vim
I've also tried on PyCharm with Python version 3.6. Still same issue.
setrecursionlimit only sets the limits within the python interpreter, on linux/unixish os'es check ulimit output for stack size limit.
When I run your code on my editor, I get this error:
File "J:/maskros/Pithon/test.py", line 8, in get_sum
return num + get_sum(num-1)
[Previous line repeated 997 more times]
MemoryError: Stack overflow
So it is because it hits the recursion limit, but when I set my recursion limit to a lower value, then it worked, If you reduce the setrecursion, to something like 10000, then it might work.
In comments it was noted that trying a lower value doesn't work.
The value mentioned did not work for me but this value did.
import sys
sys.setrecursionlimit(4999)
print(sys.getrecursionlimit())
def get_sum(num):
if num == 1:
return 1
else:
return num + get_sum(num-1)
print(get_sum(4998))
So I would guess and check till I get a good value.
I was intending to use recursion instead of for loop to print out some numbers (for no reason except playing around) so I came to this code and when I implement it, it works perfectly for all values of i's up to 980 where it implements the code correctly but after the end of the implementation it gives
Traceback
(lots of stuff)
File "C:\Python34\lib\idlelib\PyShell.py", line 1342, in write
return self.shell.write(s, self.tags)
RuntimeError: maximum recursion depth exceeded while calling a Python object
Additionally, for values of i greater than 980, it does not print all of the values, It just prints the first 980 loops and then crashes giving the same message.
I suspect that this is related to space and time required to do the implementation and has to do with the structure of how python works rather than an error of the code but I'm no expert in neither python nor programming so I'd like to understand why this happens?
def cout(i):
if i==0:
print(0)
else:
print(i)
i-=1
cout(i)
This is not a stack overflow as #Some_programmer_dude mentions but a security to avoid stack overflow. You can change the recursion limit with sys.setrecursionlimit, but doing so is dangerous -- the standard limit is a little conservative, but Python stackframes can be quite big.
You should rewrite your algorithm without recursion. As you may know, every recursive algorithm can be transformed into an equivalent iterative one.
I'm fairly new to python and understand that recursion is an important concept to grasp. I've been dabbling with various scripts to exercise my knowledge and have come up with the following script to simulate a lottery draw, where you simply draw six from 49 numbers and compare them with another six to see if you've won. I'm struggling though with the recursive function taking the value of another function.
I'm sure it's going to be straightforwardish, but cannot fathom it myself.
Here's my code so far:
from random import randint
def drawSix():
six = []
while len(six) < 6:
a = randint(1,49)
if a not in six:
six.append(a)
return sorted(six)
def lottery(draw,ticket):
if draw == ticket:
return 'win'
return lottery(drawSix(),drawSix())
I call the function with lottery(drawSix(),drawSix())
and get the following recursively.
Traceback (most recent call last):
File "<pyshell#0>", line 1, in <module>
lottery(drawSix(),drawSix())
File "/Users/johnhopkins/Desktop/lottery.py", line 14, in lottery
return lottery(drawSix(),drawSix())
def lottery(draw,ticket):
if draw == ticket:
return 'win'
return lottery(drawSix(),drawSix())
The odds of you actually generating two identical tickets are quite large, well over 1000 which is the maximum stack size of Python.
You need to either do this iteratively to avoid blowing your stack.
def lottery(draw,ticket):
while draw != ticket:
draw, ticket = drawSix(), drawSix()
return "win"
Note this has a very ugly O(n) of O(inf) you could end up running this forever if you were unlucky and still not finding a winning pair
Well, your question has has been answered, but I would suggest changing your drawSix function. As it is now, it could technically run forever. random has a sample method to generate unique numbers.
def drawSix():
return sorted(random.sample(range(1, 50), 6))
Yes - the lottery function will keep on calling itself, each time putting a new version of itself onto the call stack, going deeper and deeper into itself until there are two matching numbers.
This can take a long time, and Python will eventually say "oi! stop it!" and crash.
Some programming languages have a feature called 'tail call optimisation', which means if you try to return the result of the same function, instead of making a new call to the function inside the current one, it simply replaces itself in the stack.
Python doesn't do that.
def lottery():
while (drawSix() != drawSix()):
continue
return 'win!'
will have the same effect as your recursive version, but won't die with recursion errors.
You have not made any programming mistakes. However, the probility of winning the lottery is very small, so you need to generate a lot. Easy recursion add something to the stack.
Number of lotto tickets can be found by the formula for combinations with repetition:
(49+6-1)! / (6! * (49-1)!) = 25827165
This is a lot... Decrease the number 6 or 49, add some debugging lines and you'll see that the code works fine!