Is it always possible to convert a recursion into a tail recursive one?
I am having a hard time converting the following Python function into a tail-recursive one.
def BreakWords(glob):
    """Break a string of characters, glob, into a list of words.

    Args:
        glob: A string of characters to be broken into words if possible.
    Returns:
        List of words if glob can be broken down. List can be empty if glob is ''.
        None if no such break is possible.
    """
    # Base case.
    if len(glob) == 0:
        return []

    # Find a partition.
    for i in xrange(1, len(glob) + 1):
        left = glob[:i]
        if IsWord(left):
            right = glob[i:]
            remaining_words = BreakWords(right)
            if remaining_words is not None:
                return [left] + remaining_words
    return None
I'm not sure if it is always the case, but most recursive functions can be implemented as tail-recursive ones. Also, tail recursion is not the same thing as tail call optimization.
Differences between Tail Recursion and "Regular" Recursion
There are two elements that must be present in a recursive function:
The recursive call
A place to keep track of the return values.
A "regular" recursive function keeps (2) in the stack frame.
The return values of a regular recursive function are composed of two types of values:
Other return values
The result of the function's own computation
Let's see an example:
def factorial(n):
    if n == 1: return 1
    return n * factorial(n-1)
The frame f(5) "stores" the result of its own computation (5) and the value of f(4), for example. So if I call factorial(5), just before the stack calls begin to collapse, I have:
[Stack_f(5): return 5 * [Stack_f(4): return 4 * [Stack_f(3): return 3 * ... [Stack_f(1): return 1]]]]
Notice that each stack frame stores, besides the values I mentioned, the whole scope of the function. So the memory usage of a recursive function f is O(x), where x is the number of recursive calls I have to make. If I need 1 kB of RAM to calculate factorial(1) or factorial(2), I need ~100 kB to calculate factorial(100), and so on.
A Tail Recursive function puts (2) in its arguments.
In a tail recursion, I pass the result of the partial calculations in each recursive frame to the next one using parameters. Let's see our factorial example, tail recursive:
def factorial(n):
    def tail_helper(n, acc):
        if n == 1: return acc
        return tail_helper(n-1, acc * n)
    return tail_helper(n, 1)
Let's look at its frames in factorial(5):
[Stack f(5, 1): [Stack f(4, 5): [Stack f(3, 20): [Stack f(2, 60): [Stack f(1, 120): 120]]]]]
See the difference? In "regular" recursive calls, the return values recursively compose the final value. In tail recursion they only reference the base case (the last one evaluated). We call the argument that keeps track of the older values the accumulator.
Recursion Templates
The regular recursive function goes as follows:
def regular(n):
    base_case
    computation
    return (result of computation) combined with (regular(n towards base case))
To transform it into a tail recursion we:
Introduce a helper function that carries the accumulator.
Run the helper function inside the main function, with the accumulator set to the base case.
Look:
def tail(n):
    def helper(n, accumulator):
        if n == base case:
            return accumulator
        computation
        accumulator = computation combined with accumulator
        return helper(n towards base case, accumulator)
    return helper(n, base case)
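For instance, here is the template applied to summing a list (a minimal sketch of my own, not part of the question):
def tail_sum(ns):
    def helper(ns, accumulator):
        if not ns:                      # base case
            return accumulator
        # fold the current element into the accumulator, then recurse
        return helper(ns[1:], accumulator + ns[0])
    return helper(ns, 0)                # accumulator starts at the base value

print(tail_sum([1, 2, 3, 4]))           # 10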
Your example:
I did something like this:
def BreakWords(glob):
    def helper(glob, acc_1, acc_2):
        # acc_1 accumulates the words found so far.
        # acc_2 is the length of the prefix of glob currently being tried.
        if len(glob) == 0:
            return acc_1
        if acc_2 > len(glob):
            # No prefix of the remaining glob is a word: no break found.
            return None
        if IsWord(glob[:acc_2]):
            # Commit to this word and keep scanning the rest of glob.
            return helper(glob[acc_2:], acc_1 + [glob[:acc_2]], 1)
        # Try a longer prefix.
        return helper(glob, acc_1, acc_2 + 1)
    return helper(glob, [], 1)
In order to eliminate the for statement you used, I wrote my recursive helper function with 2 accumulators: one to store the words found so far, and one to store the length of the prefix I'm currently trying. Note that this version commits greedily to the first word it finds and never backtracks, so it can miss breaks that your original function would find; a fully equivalent tail-recursive version would also have to carry the pending choice points in an accumulator.
Tail Call optimization
Since no state needs to be kept in the non-base-case frames of a tail call, those frames aren't important. Some languages/interpreters therefore replace the old stack frame with the new one. So, with no stack frames constraining the number of calls, tail calls behave just like a for-loop.
But unfortunately for you, Python isn't one of these cases. You'll get a RuntimeError when the stack gets deeper than 1000 frames. Mr. Guido thinks that the clarity lost for debugging purposes with Tail Call Optimization (because the intermediate frames are thrown away) outweighs the feature. That's a shame. Python has so many cool functional features, and tail recursion would be great on top of them :/
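Since Python won't do this substitution for you, the usual workaround is to perform the rewrite by hand: because a tail call carries all of its state in its arguments, it translates mechanically into a loop. A minimal sketch for the factorial helper above (my own illustration):
def factorial(n):
    acc = 1
    while n > 1:
        n, acc = n - 1, acc * n   # same state transition as tail_helper(n-1, acc * n)
    return acc

print(factorial(100))  # no RuntimeError, however deep the "recursion" would have been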
Related
This is the original code for the Fibonacci sequence using recursion:
def rec(n):
    if n <= 1:
        return n
    else:
        return rec(n-1) + rec(n-2)

n = int(input())
The above code gets very slow around the 50th term.
The following code I have written is also basically recursion:
n = int(input())
n1, n2, count = 0, 1, 0

def rec(n, n1, n2, count):
    if count < n:
        print(n1)
        nth = n1 + n2
        n1 = n2
        n2 = nth
        count += 1
        rec(n, n1, n2, count)

rec(n, n1, n2, count)
My question is: do both of these approaches count as recursion (like real recursion)?
Both functions are recursive, but as the last function has the call to itself as the last action in the function, it is also described as tail recursive.
A tail recursive function can easily be converted into a loop:
def rec(n, n1=0, n2=1, count=0):
    while count < n:
        print(n1)
        n1, n2, count = n2, n1 + n2, count + 1
Well, they are both recursion, because you are calling the function rec() inside itself. So, the main difference between your two functions is that there are two recursive calls in the first one and just one in the second one.
So now, on another matter:
The above code gets very slow around the 50th term.
Well, you can do something interesting to make it faster. See, because of the recursive definition of the Fibonacci sequence, you are doing the same calculations more than once.
For instance: imagine you are to compute fibonacci(5), so you need to compute fibonacci(4) and fibonacci(3). But for fibonacci(4) you need to compute fibonacci(3) and fibonacci(2), and so on. But wait: when you finish computing the fibonacci(4) branch you have already computed all Fibonacci values for 3 and 2, so when you go back to the other branch (fibonacci(3)) from the first recursive call, you have already computed it. So, what if there were a way to store those calculations so you could access them faster? You can do it with a decorator that creates a memoize class (some sort of memory to avoid repeated calculations).
This way we store each computation of fibonacci(k) in a dict, and before each call we check whether it already exists in the dictionary: return it if so, or else compute and store it. This way it is much faster:
class memoize:
    def __init__(self, function):
        self.f = function
        self.memory = {}

    def __call__(self, *args):
        if args in self.memory:
            return self.memory[args]
        else:
            value = self.f(*args)
            self.memory[args] = value
            return value

@memoize
def fib(n):
    if n <= 1:
        return n
    else:
        return fib(n-1) + fib(n-2)

r = fib(50)
print(r)
You can see that without memoization it took too long, and with memoization it only took 0.263614 seconds.
Hope this helps
If a function calls itself, it is considered recursive.
The difference between your two implementations is that the first one calls itself approximately 2**n times, the second calls itself about n times.
For n = 50, 2**50 is 1125899906842624. That's a lot of calls! No wonder it takes a long time. (Example: think of the number of times that rec(10) is called when calculating rec(50). Many, many, many times.)
While both of your functions are recursive, I'd say the latter is a "forward iteration", in that you are essentially moving FORWARD through the fibonacci sequence; for your second rec(50), that function only recurses about 50 times.
One technique to make recursive calls faster is called memoization. See "Memoization" on Wikipedia. It works by immediately returning if the answer has previously been calculated... thereby not "re-recursing".
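For reference, the standard library already packages this technique: functools.lru_cache caches results per argument, so the naive recursive Fibonacci stops re-recursing into subproblems it has already solved. A minimal sketch, equivalent in spirit to the memoize class shown above:
from functools import lru_cache

@lru_cache(maxsize=None)   # cache every distinct argument
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(50))  # 12586269025, returned almost instantly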
I have a recursive algorithm in which I calculate some probability values. The input is a list of integers and a single integer value, which represents a constant value.
For instance, p([12,19,13], 2) makes three recursive calls, which are
p([12,19],0) and p([13], 2)
p([12,19],1) and p([13], 1)
p([12,19],2) and p([13], 0)
since 2 can be decomposed as 0+2, 1+1 or 2+0. Then each call follows a similar approach and makes several other recursive calls.
The recursive algorithm I have
limit = 20

def p(listvals, cval):
    # base case
    if len(listvals) == 0:
        return 0
    if len(listvals) == 1:
        if cval == 0:
            return listvals[0]/limit
        elif listvals[0] + cval > limit:
            return 0
        else:
            return 1/limit
    result = 0
    for c in range(0, cval+1):
        c1 = c
        c2 = cval-c
        listvals1 = listvals[:-1]
        listvals2 = [listvals[-1]]
        if listvals[-1] + c2 <= limit:
            r = p(listvals1, c1) * p(listvals2, c2)
            result = result+r
    return result
I have been trying to convert this into bottom-up DP code, but could not figure out how to structure the iteration.
I wrote down all the intermediate steps that are needed to be calculated for the final result, and it is apparent that there are lots of repetitions at the bottom of the recursive calls.
I tried creating a dictionary of pre-calculated values as given below
m[single_value]=[list of calculated values]
and used those values instead of making the second recursive call p(listvals2, c2), but it did not help much as far as the running time is concerned.
How can I improve the running time by using a proper bottom-up approach?
Not sure that I understand what your program wants to compute, so I can't help with that; maybe explain a bit more?
Regarding performance, you are caching only the leaf nodes of the computations that are repeated in recursive calls. A better way would be to have the first parameter of your function p be a tuple instead of a list, and then use the tuple of both arguments to p as the caching key in a dictionary.
Python's standard library functools provides a simple way to build this fairly common piece.
from functools import wraps

def cached(func):
    cache = {}
    @wraps(func)
    def wrapped(listvals, cval):
        key = (listvals, cval)
        if key not in cache:
            cache[key] = func(listvals, cval)
        return cache[key]
    return wrapped
Use this decorator to cache all calls to the function:
@cached
def p(listvals, cval):
Now have your p take a tuple instead of a list:
p((12,19,13), 2)
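One detail worth spelling out: the recursive calls inside p must also pass tuples, otherwise the cache key (listvals, cval) ends up containing a list and is not hashable. A sketch of the adapted function, assuming the cached decorator above and limit = 20 are in scope; the only behavioral change from the question's code is that listvals2 is built as a tuple:
limit = 20

@cached
def p(listvals, cval):
    if len(listvals) == 0:
        return 0
    if len(listvals) == 1:
        if cval == 0:
            return listvals[0] / limit
        elif listvals[0] + cval > limit:
            return 0
        else:
            return 1 / limit
    result = 0
    for c in range(0, cval + 1):
        c1, c2 = c, cval - c
        listvals1 = listvals[:-1]     # slicing a tuple already yields a tuple
        listvals2 = (listvals[-1],)   # changed: a tuple, not a list, so it stays hashable
        if listvals[-1] + c2 <= limit:
            result += p(listvals1, c1) * p(listvals2, c2)
    return result

print(p((12, 19, 13), 2))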
Is there a way to recursively find both the minimum and maximum in a list efficiently? I wrote this in Python, but it's hugely inefficient, as it calls the function on the same list both for max and for min each time.
def f(l):
    if len(l) == 1: return [l[0], l[0]]
    return [max(l[0], f(l[1:])[0]), min(l[0], f(l[1:])[1])]

l = [1, 3, 9, -3, -30, 10, 100]
print(f(l))
output: [100, -30]
Do you have any idea how to improve it? Is it possible to do it without passing any other variables to the function?
In Python, a recursive implementation will in any case be much slower than an iterative one because of:
call overhead
object creation, incl. partial list construction
not using some of Python's efficient constructs like for .. in loop
You cannot eliminate the call overhead if you're specifically required to write a recursive algorithm, but you can cut down on object construction. The list construction is especially taxing since all the elements are copied each time:
instead of constructing a new list each iteration, pass the same list and the current index in it
and in your function, you are constructing a new list not once but twice!
You're also making two recursive calls each iteration. Each of them will also make two calls, etc., resulting in a whopping total of 1+2+4+...+2**(N-1) = 2**N-1 calls! To add insult to injury, the two calls are completely redundant since they both produce the same result.
Since the current list element is used multiple times, a few microseconds can also be shaved off by caching it in a variable instead of retrieving it each time.
def rminmax(l, i=0, cmin=float('inf'), cmax=float('-inf')):
    e = l[i]
    if e < cmin: cmin = e
    if e > cmax: cmax = e
    if i == len(l) - 1:
        return (cmin, cmax)
    return rminmax(l, i+1, cmin, cmax)
Also note that due to CPython's stack size limit, you won't be able to process lists longer than a number slightly lower than sys.getrecursionlimit() (slightly lower because the interactive loop machinery also takes up some call stack frames). This limitation may not apply in other Python implementations.
Here's some performance comparison on my machine on sample data:
In [18]: l=[random.randint(0,900) for _ in range(900)]
In [29]: timeit rminmax(l)
1000 loops, best of 3: 395 µs per loop
# for comparison:
In [21]: timeit f(l) #your function
# I couldn't wait for completion; definitely >20min for 3 runs
In [23]: timeit f(l) #sjf's function
100 loops, best of 3: 2.59 ms per loop
I am not sure why you want to use recursion to find the min and max as you can simply pass a list to min and max.
def f(l):
    return min(l), max(l)
If you are trying to do this as an exercise in recursion, I don't see a way to solve it without passing the min and max down the recursive call.
def f(l, min_=None, max_=None):
    if not l:
        return min_, max_
    min_ = l[0] if min_ is None else min(l[0], min_)
    max_ = l[0] if max_ is None else max(l[0], max_)
    return f(l[1:], min_, max_)
There is a way to do this (and recursion in Python really is very slow; see the other answers if you want a robust implementation). Think about your recursive formulation from left to right: at each level of recursion, take the min/max of the current item in the list and the result returned from the next level of recursion.
Then (for python>= 2.5, we can use the ternary operator):
def find_min(ls, idx):
    return ls[idx] if idx == len(ls) - 1 else min(ls[idx], find_min(ls, idx+1))
find_max is analogous; you can just replace min with max.
If you want a simpler definition, you can wrap a function that only accepts ls around find_min/find_max and make that function call find_min(ls, 0) or find_max(ls, 0).
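For instance, spelling out find_max and a pair of hypothetical wrappers (the wrapper names are mine, and find_min is the function defined just above):
# find_max is find_min with min replaced by max, as described above.
def find_max(ls, idx):
    return ls[idx] if idx == len(ls) - 1 else max(ls[idx], find_max(ls, idx + 1))

# Hypothetical convenience wrappers so callers don't have to pass the starting index.
def minimum(ls):
    return find_min(ls, 0)

def maximum(ls):
    return find_max(ls, 0)

print(minimum([1, 3, 9, -3, -30, 10, 100]), maximum([1, 3, 9, -3, -30, 10, 100]))  # -30 100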
Why recursively?
This would work fine and is about 10 times faster than the best recursive algorithm:
def minMax(array): return min(array),max(array)
To avoid having each recursion call itself twice, you could write the function like this:
def minMax(array):
    first, *rest = array  # first,rest = array[0],array[1:]
    if not rest: return first, first
    subMin, subMax = minMax(rest)
    return min(first, subMin), max(first, subMax)
If you want to avoid the maximum recursion limit (i.e. on large lists), you could use a binary approach, splitting the array into left and right parts. This will only use log(n) levels of recursion (and also reduce some of the processing overhead):
def minMax(array):
    size = len(array)
    if size == 1: return array[0], array[0]
    midPoint = size // 2
    leftMin, leftMax = minMax(array[:midPoint])
    rightMin, rightMax = minMax(array[midPoint:])
    return min(leftMin, rightMin), max(leftMax, rightMax)
If you want to reduce the overhead of array creation and function calls, you could pass down the index and avoid min(), max() and len() (but then you're using recursion as a for loop, which pretty much defeats the purpose):
def minMax(array, index=None):
    index = (index or len(array)) - 1
    item = array[index]
    if index == 0: return item, item
    subMin, subMax = minMax(array, index)
    if item < subMin: return item, subMax
    if item > subMax: return subMin, item
    return subMin, subMax
You can combine the previous two to reduce overhead and avoid the recursion limit, but it is going to lose a bit of performance:
def minMax(array, start=0, end=None):
    if end is None: end = len(array) - 1
    if start >= end - 1:
        left, right = array[start], array[end]
        return (left, right) if left < right else (right, left)
    middle = (start + end) >> 1
    leftMin, leftMax = minMax(array, start, middle)
    rightMin, rightMax = minMax(array, middle+1, end)
    return (leftMin if leftMin < rightMin else rightMin), \
           (leftMax if leftMax > rightMax else rightMax)
I am working on writing a Python function nest(a, b) that takes a value a and a number b. The value a is put inside a list, which is put inside another list, and so on, up to b levels of nesting.
For example:
nest("foo", 3)
should return:
[[[["foo"]]]]
You can try this:
def nest(obj, depth):
    ret = obj
    for _ in range(depth):
        ret = [ret]
    return ret
You may write a recursive function to achieve this as:
def my_func(s, n):
    return [s] if n == 0 else [my_func(s, n-1)]
Sample Run:
>>> my_func('foo', 3)
[[[['foo']]]]
Note: There is a maximum recursion depth allowed for the Python interpreter stack. This limit prevents infinite recursion from causing an overflow of the C stack and crashing Python.
You can check this value using the sys.getrecursionlimit() function, which as per the doc:
Return the current value of the recursion limit, the maximum depth of
the Python interpreter stack. This limit prevents infinite recursion
from causing an overflow of the C stack and crashing Python. It can be
set by setrecursionlimit().
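For instance (the default depends on the interpreter, but it is commonly 1000 in CPython):
import sys

print(sys.getrecursionlimit())   # commonly 1000
sys.setrecursionlimit(5000)      # raise the ceiling if deeper nesting is really needed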
This is a recursive solution; it pretty much comes naturally from the problem definition:
def nest(val, n):
    if n <= 0:
        return val
    else:
        return [nest(val, n - 1)]
As an exercise, I implemented the map function using recursion in Python as follows:
# map function that applies the function f on every element of list l and returns the new list
def map(l, f):
    if l == []:
        return []
    else:
        return [f(l[0])] + map(l[1:], f)
I am aware of the fact that Python does not support tail recursion optimization, but how would I go about writing the same function in a tail-recursive manner?
Please Help
Thank You
Tail recursion means you must be directly returning the result of a recursive call, with no further manipulation.
The obvious recursion in map is to compute the function on one element of the list, then use a recursive call to process the rest of the list. However, you need to combine the result of processing one element with the result of processing the rest of the list, which requires an operation after the recursive call.
A very common pattern for avoiding that is to move the combination inside the recursive call; you pass in the processed element as an argument, and make it part of map's responsibility to do the combining as well.
def map(l, f):
    if l == []:
        return []
    else:
        return map(l[1:], f, f(l[0]))
Now it's tail recursive! But it's also obviously wrong. In the tail recursive call, we're passing 3 arguments, but map only takes two arguments. And then there's the question of what we do with the 3rd value. In the base case (when the list is empty), it's obvious: return a list containing the information passed in. In the recursive case, we're computing a new value, and we have this extra parameter passed in from the top, and we have the recursive call. The new value and the extra parameter need to be rolled up together to be passed into the recursive call, so that the recursive call can be tail recursive. All of which suggests the following:
def map(l, f):
    return map_acc(l, f, [])

def map_acc(l, f, a):
    if l == []:
        return a
    else:
        b = a + [f(l[0])]
        return map_acc(l[1:], f, b)
Which can be expressed more concisely and Pythonically as other answers have shown, without resorting to a separate helper function. But this shows a general way of turning non-tail-recursive functions into tail recursive functions.
In the above, a is called an accumulator. The general idea is to move the operations you normally do after a recursive call into the next recursive call, by wrapping up the work outer calls have done "so far" and passing that on in an accumulator.
If map can be thought of as meaning "call f on every element of l, and return a list of the results", map_acc can be thought of as meaning "call f on every element of l, returning a list of the results combined with a, a list of results already produced".
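For instance, using the map/map_acc definitions above, the accumulator grows by one result per call:
# map([1, 2, 3], str)
#   -> map_acc([1, 2, 3], str, [])
#   -> map_acc([2, 3], str, ['1'])
#   -> map_acc([3], str, ['1', '2'])
#   -> map_acc([], str, ['1', '2', '3'])
#   -> ['1', '2', '3']
print(map([1, 2, 3], str))  # ['1', '2', '3']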
Here is an example implementation of the built-in function map using tail recursion:
def map(func, ls, res=None):
    if res is None:
        res = []
    if not ls:
        return res
    res.append(func(ls[0]))
    return map(func, ls[1:], res)
But it will not solve the problem of Python not having support for TRE (tail recursion elimination), which means that a stack frame for each call is still kept alive the whole time.
This seems to be tail recursive:
def map(l, f, x=[]):
    if l == []:
        return x
    else:
        return map(l[1:], f, x + [f(l[0])])
Or in more compact form:
def map(l, f, x=[]):
    return l and map(l[1:], f, x + [f(l[0])]) or x
Not really an answer, sorry, but another way to implement map is to write it in terms of a fold. If you try, you'll find that it only comes out "right" with foldr; using foldl gives you a reversed list. Unfortunately, foldr isn't tail recursive, while foldl is. This suggests that there's something more "natural" about rev_map (a map that returns a reversed list). Unfortunately I am not well enough educated to take this any further (I suspect you might be able to generalise this to say that there is no solution that doesn't use an accumulator, but I personally don't see how to make the argument).
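To make that concrete, here is a rough sketch (my own code, not part of the original answer) of a tail-recursive foldl and a rev_map built on top of it; reversing at the end recovers the usual map:
# Tail-recursive left fold: the only thing done after the recursive call is returning it.
def foldl(f, acc, ls):
    if not ls:
        return acc
    return foldl(f, f(acc, ls[0]), ls[1:])

# Building map on foldl by consing each result onto the front yields a reversed list...
def rev_map(f, ls):
    return foldl(lambda acc, x: [f(x)] + acc, [], ls)

# ...so the in-order version needs a final reverse (or a non-tail-recursive foldr).
def map_via_fold(f, ls):
    return list(reversed(rev_map(f, ls)))

print(rev_map(str, [1, 2, 3]))       # ['3', '2', '1']
print(map_via_fold(str, [1, 2, 3]))  # ['1', '2', '3']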