This is the original code for the Fibonacci sequence, using recursion:
def rec(n):
    if n <= 1:
        return n
    else:
        return rec(n-1) + rec(n-2)
n = int(input())
The above code gets very slow around the 50th term.
The following code, which I have written, is also basically recursion.
n = int(input())
n1, n2, count = 0, 1, 0
def rec(n, n1, n2, count):
    if count < n:
        print(n1)
        nth = n1 + n2
        n1 = n2
        n2 = nth
        count += 1
        rec(n, n1, n2, count)
rec(n, n1, n2, count)
My question is: do both of these approaches count as recursion (real recursion)?
Both functions are recursive, but as the last function has the call to itself as the last action in the function, it is also described as tail recursive.
Tail-recursive functions can easily be converted into loops:
def rec(n, n1=0, n2=1, count=0):
    while count < n:
        print(n1)
        n1, n2, count = n2, n1 + n2, count + 1
Well, they are both recursion, because you are calling the function rec() inside itself. The main difference between your two functions is that there are two recursive calls in the first one and just one in the second one.
So now, on another matter:
Above code gets very slow in around 50th term.
Well, you can do something interesting to make it faster. Because of the recursive definition of the Fibonacci sequence, you are doing the same calculations more than once.
For instance: imagine you are to compute fibonacci(5), so you need to compute fibonacci(4) and fibonacci(3). But for fibonacci(4) you need to compute fibonacci(3) and fibonacci(2), and so on. By the time you finish the fibonacci(4) branch you have already computed fibonacci(3) and fibonacci(2), so when you go back to the other branch (fibonacci(3)) from the first recursive call, its result is already known. So, what if there were a way to store those calculations so you can access them faster? You can use a decorator to create a memoize class (some sort of memory to avoid repeated calculations):
This way we store each computation of fibonacci(k) in a dict, and before each call we check whether the result already exists in the dictionary: if it does, we return it; otherwise we compute it. This is much faster.
class memoize:
    def __init__(self, function):
        self.f = function
        self.memory = {}
    def __call__(self, *args):
        if args in self.memory:
            return self.memory[args]
        else:
            value = self.f(*args)
            self.memory[args] = value
            return value
@memoize
def fib(n):
    if n <= 1:
        return n
    else:
        return fib(n-1) + fib(n-2)
r = fib(50)
print(r)
You can see that without memoization it took too long and with memoization it only took 0.263614.
Hope this helps
If a function calls itself, it is considered recursive.
The difference between your two implementations is that the first one calls itself approximately 2**n times, the second calls itself about n times.
For n = 50, 2**50 is 1125899906842624. That's a lot of calls! No wonder it takes a long time. (Example: think of the number of times that rec(10) is called when calculating rec(50). Many, many, many times.)
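To make the growth concrete, here is a minimal sketch that counts the calls made by the doubly recursive version for a few small inputs. The helper name count_calls_naive is made up for this illustration. The exact count grows a little slower than 2**n (roughly like 1.6**n), but it is still exponential.

```python
def count_calls_naive(n):
    """Return (fib(n), number of calls the doubly recursive version makes)."""
    if n <= 1:
        return n, 1
    a, calls_a = count_calls_naive(n - 1)
    b, calls_b = count_calls_naive(n - 2)
    # One call for this frame plus everything both branches needed.
    return a + b, calls_a + calls_b + 1


for n in (10, 20, 25):
    value, calls = count_calls_naive(n)
    print(n, value, calls)
# 10 55 177
# 20 6765 21891
# 25 75025 242785
```

Going from n = 10 to n = 20 multiplies the number of calls by more than a hundred, which is why the 50th term never seems to finish.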
While both of your functions are recursive, I'd say the latter is a "forward iteration", in that you are essentially moving FORWARD through the fibonacci sequence; for your second rec(50), that function only recurses about 50 times.
One technique to make recursive calls faster is called memoization. See "Memoization" on Wikipedia. It works by immediately returning if the answer has previously been calculated... thereby not "re-recursing".
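As a sketch of memoization using only the standard library, functools.lru_cache can wrap the recursive function so that each fib(k) is computed once. The function name fib is just the one used elsewhere in this thread.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # unbounded cache: every fib(k) is stored after its first computation
def fib(n):
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)


print(fib(50))  # 12586269025
```

With the cache in place the recursion only goes about n levels deep, so fib(50) returns immediately.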
Is there a way to recursively find both the minimum and the maximum in a list efficiently? I wrote this in Python, but it's hugely inefficient, as it calls the function on the same list twice each time, once for the max and once for the min.
def f(l):
    if len(l)==1 : return [l[0],l[0]]
    return [max(l[0],f(l[1:])[0]),min(l[0],f(l[1:])[1])]
l=[1,3,9,-3,-30,10,100]
print(f(l))
output: [100, -30]
Have you any idea on how to improve it? Is it possible to do it even without passing any other variable to the function?
In Python, a recursive implementation will in any case be much slower than an iterative one because of:
call overhead
object creation, incl. partial list construction
not using some of Python's efficient constructs like for .. in loop
You cannot eliminate the first of these if you're specifically required to write a recursive algorithm, but you can cut down on object construction. The list construction is especially taxing since all the elements are copied each time.
Instead of constructing a new list on each call, pass the same list and the current index in it.
In your function, you are constructing a new list not once but twice!
You're also making two recursive calls on each call. Each of them will also make two calls, etc., resulting in a total number of calls of a whopping 1+2+4+...+2**(N-1) = 2**N-1! To add insult to injury, the two calls are completely redundant since they both produce the same result.
Since the current list element is used multiple times, a few microseconds can also be cut off by caching it in a variable instead of retrieving it each time.
def rminmax(l,i=0,cmin=float('inf'),cmax=float('-inf')):
    e=l[i]
    if e<cmin: cmin=e
    if e>cmax: cmax=e
    if i==len(l)-1:
        return (cmin,cmax)
    return rminmax(l,i+1,cmin,cmax)
Also note that due to CPython's stack size limit, you won't be able to process lists longer than a number slightly lower than sys.getrecursionlimit() (slightly lower because the interactive loop machinery also takes up some call stack frames). This limitation may not apply in other Python implementations.
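One way to live with that limit, sketched here under the assumption that falling back to the built-ins is acceptable, is to catch RecursionError. The wrapper name safe_minmax is made up for this example; rminmax is repeated from above so the snippet runs on its own.

```python
import sys


def rminmax(l, i=0, cmin=float('inf'), cmax=float('-inf')):
    e = l[i]
    if e < cmin:
        cmin = e
    if e > cmax:
        cmax = e
    if i == len(l) - 1:
        return (cmin, cmax)
    return rminmax(l, i + 1, cmin, cmax)


def safe_minmax(values):
    """Hypothetical wrapper: use the recursive version, fall back to built-ins on deep input."""
    try:
        return rminmax(values)
    except RecursionError:
        # The list was longer than the interpreter's recursion limit allows.
        return min(values), max(values)


print(sys.getrecursionlimit())  # commonly 1000 by default in CPython
print(safe_minmax(list(range(sys.getrecursionlimit() * 2))))
```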
Here's some performance comparison on my machine on sample data:
In [18]: l=[random.randint(0,900) for _ in range(900)]
In [29]: timeit rminmax(l)
1000 loops, best of 3: 395 µs per loop
# for comparison:
In [21]: timeit f(l) #your function
# I couldn't wait for completion; definitely >20min for 3 runs
In [23]: timeit f(l) #sjf's function
100 loops, best of 3: 2.59 ms per loop
I am not sure why you want to use recursion to find the min and max as you can simply pass a list to min and max.
def f(l):
    return min(l), max(l)
If you are trying to do this as an exercise in recursion, I don't see a way to solve it without passing the min and max down the recursive call.
def f(l, min_=None, max_=None):
    if not l:
        return min_,max_
    min_ = l[0] if min_ is None else min(l[0], min_)
    max_ = l[0] if max_ is None else max(l[0], max_)
    return f(l[1:], min_, max_)
There is a way to do this (And recursion in python really is very slow; see the other answers if you want a robust implementation). Think about your recursive formulation from left to right: at each level of recursion, take the min/max of the current item in the list and the result returned from the next level of recursion.
Then (for python>= 2.5, we can use the ternary operator):
def find_min(ls, idx):
    return ls[idx] if idx == len(ls) - 1 else min(ls[idx], find_min(ls, idx+1))
find_max is analogous; you can just replace min with max.
If you want a simpler definition, you can wrap a function that only accepts ls around find_min/find_max and make that function call find_min(ls, 0) or find_max(ls, 0).
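A minimal sketch of that wrapper could look like the following. The name min_max is an assumption for illustration; find_max is the analogous function described above.

```python
def find_min(ls, idx):
    return ls[idx] if idx == len(ls) - 1 else min(ls[idx], find_min(ls, idx + 1))


def find_max(ls, idx):
    return ls[idx] if idx == len(ls) - 1 else max(ls[idx], find_max(ls, idx + 1))


def min_max(ls):
    """Hypothetical wrapper that hides the index argument from the caller."""
    return find_min(ls, 0), find_max(ls, 0)


print(min_max([1, 3, 9, -3, -30, 10, 100]))  # (-30, 100)
```

This still walks the list twice, once per function, but each walk makes only one recursive call per element and never copies the list.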
Why recursively ?
This would work fine and is about 10 times faster than the best recursive algorithm:
def minMax(array): return min(array),max(array)
To avoid having each recursion call itself twice, you could write the function like this:
def minMax(array):
    first,*rest = array # first,rest = array[0],array[1:]
    if not rest : return first,first
    subMin,subMax = minMax(rest)
    return min(first,subMin), max(first,subMax)
If you want to avoid hitting the maximum recursion limit (i.e. on large lists), you could use a binary approach, splitting the array into left and right parts. This will only use log(n) levels of recursion (and also reduce some of the processing overhead):
def minMax(array):
    size = len(array)
    if size == 1 : return array[0],array[0]
    midPoint = size // 2
    leftMin,leftMax = minMax(array[:midPoint])
    rightMin,rightMax = minMax(array[midPoint:])
    # the maximum must combine the two maxima, not the two minima
    return min(leftMin,rightMin), max(leftMax,rightMax)
If you want to reduce the overhead of array creation and function calls, you could pass down the index and avoid min(),max() and len() (but then you're using recursion as a for loop which pretty much defeats the purpose):
def minMax(array, index=None):
    index = (index or len(array)) - 1
    item = array[index]
    if index == 0 : return item,item
    subMin,subMax = minMax(array,index)
    if item < subMin: return item,subMax
    if item > subMax: return subMin,item
    return subMin,subMax
You can combine the previous two to reduce overhead and avoid the recursion limit, but it is going to lose a bit of performance:
def minMax(array, start=0, end=None):
    if end is None : end = len(array)-1
    if start >= end - 1:
        left,right = array[start],array[end]
        return (left,right) if left < right else (right,left)
    middle = (start + end) >> 1
    leftMin,leftMax = minMax(array, start,middle)
    rightMin,rightMax = minMax(array, middle+1,end)
    return ( leftMin if leftMin < rightMin else rightMin ), \
           ( leftMax if leftMax > rightMax else rightMax )
Does anyone understand the following iterative algorithm for producing all permutations of a list of numbers?
I do not understand the logic within the while len(stack) loop. Can someone please explain how it works?
# Non-Recursion
# @param nums: A list of Integers.
# @return: A list of permutations.
def permute(self, nums):
    if nums is None:
        return []
    nums = sorted(nums)
    permutation = []
    stack = [-1]
    permutations = []
    while len(stack):
        index = stack.pop()
        index += 1
        while index < len(nums):
            if nums[index] not in permutation:
                break
            index += 1
        else:
            if len(permutation):
                permutation.pop()
            continue
        stack.append(index)
        stack.append(-1)
        permutation.append(nums[index])
        if len(permutation) == len(nums):
            permutations.append(list(permutation))
    return permutations
I'm just trying to understand the code above.
As mentioned in the comments section to your question, debugging may provide a helpful way to understand what the code does. However, let me provide a high-level perspective of what your code does.
First of all, although there are no recursive calls to the function permute, the code you provided is effectively recursive: all it does is keep its own stack instead of using the call stack that the runtime provides. Specifically, the variable stack keeps the recursive state, so to speak, that is passed from one recursive call to another. You could, and perhaps should, consider each iteration of the outer while loop in the permute function as a recursive call. If you do so, you will see that the outer while loop helps 'recursively' traverse each permutation of nums in a depth-first manner.
Noticing this, it's fairly easy to figure out what each 'recursive call' does. The variable permutation keeps the current permutation of nums that is being formed as the while loop progresses. The variable permutations stores all the permutations of nums that have been found. As you may observe, permutations is updated only when len(permutation) is equal to len(nums), which can be considered the base case of the recurrence relation that is being implemented using a custom stack. Finally, the inner while loop picks which element of nums to add to the current permutation (i.e. the one stored in the variable permutation) being formed.
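For comparison, here is a sketch of what the same depth-first search might look like when written with real recursive calls. The name permute_recursive and the inner helper extend are assumptions for illustration, not part of the original code; like the original, it assumes the values in nums are distinct.

```python
def permute_recursive(nums):
    """Depth-first generation of permutations using the call stack instead of a list."""
    nums = sorted(nums)
    permutations = []
    permutation = []  # the partial permutation currently being built

    def extend():
        # Base case: the partial permutation is complete, so record a copy of it.
        if len(permutation) == len(nums):
            permutations.append(list(permutation))
            return
        # Try every value not used yet, in sorted order.
        for value in nums:
            if value not in permutation:
                permutation.append(value)
                extend()
                permutation.pop()  # backtrack, like permutation.pop() in the loop version

    extend()
    return permutations


print(permute_recursive([3, 1, 2]))
# [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
```

Each index pushed on stack in the iterative code corresponds to the position of the for loop in one of these nested extend() calls.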
So that is about it, really. You can figure out exactly what is being done on the lines relevant to the maintenance of stack using a debugger, as suggested. As a final note, let me repeat that I, personally, would not consider this implementation to be non-recursive. It just so happens that, instead of using the call stack provided by the runtime, this recursive solution keeps its own stack. To give a better idea of what a proper non-recursive solution looks like, compare the recursive and iterative solutions to the problem of finding the nth Fibonacci number provided below. As you can see, the non-recursive solution keeps no stack, and instead of dividing the problem into smaller instances of itself (recursion) it builds up the solution from smaller solutions (dynamic programming).
def recursive_fib(n):
    if n == 0:
        return 0
    elif n == 1:
        return 1
    return recursive_fib(n-1) + recursive_fib(n-2)
def iterative_fib(n):
    if n == 0:
        return 0
    f_0 = 0
    f_1 = 1
    # after the step for i, f_1 holds fib(i)
    for i in range(2, n + 1):
        f_2 = f_1 + f_0
        f_0 = f_1
        f_1 = f_2
    return f_1
The answer from @ilim is correct and should be the accepted answer, but I just wanted to add another point that wouldn't fit in a comment. Whilst I imagine you are studying this algorithm as an exercise, it should be pointed out that a better way to proceed, depending on the size of the list, may be to use the permutations() function from itertools:
import itertools
print([x for x in itertools.permutations([1, 2, 3])])
Testing on my machine with a list of 11 items (39m permutations) took 1.7secs with itertools.permutations(x) but took 76secs using the custom solution above. Note however that with 12 items (479m permutations) the itertools solution blows up with a memory error. If you need to generate permutations of such size efficiently you may be better dropping to native code.
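One thing worth keeping in mind is that itertools.permutations() returns a lazy iterator, so memory only becomes a problem when all results are collected into a list. A minimal sketch, with the helper name count_permutations made up for the example, that consumes the permutations one at a time:

```python
import itertools


def count_permutations(items):
    """Walk through every permutation without ever storing them all at once."""
    return sum(1 for _ in itertools.permutations(items))


print(count_permutations(range(8)))  # 40320

# Taking only the first few results is cheap even for a long input.
first_three = list(itertools.islice(itertools.permutations(range(12)), 3))
print(first_three)
```

The run time still grows with the number of permutations, but memory use stays flat.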
I have been attending a couple of hackathons. I am beginning to understand that writing code is not enough. The code has to be optimized. That brings me to my question. Here are two questions that I faced.
def pairsum(numbers, k):
    """Write a function that returns two values in numbers whose sum is K"""
    for i, j in numbers:
        if i != j:
            if i+j == k:
                return i, j
I wrote this function. And I was kind of stuck with optimization.
Next problem.
string = "ksjdkajsdkajksjdalsdjaksda"
def dedup(string):
    """ write a function to remove duplicates in the variable string"""
    output = []
    for i in string:
        if i not in output:
            output.append(i)
These are two very simple programs that I wrote. But I am stuck with optimization after this. More on this, when we optimize code, how does the complexity reduce? Any pointers will help. Thanks in advance.
Knowing the most efficient Python idioms, and designing code that reduces iterations and bails out early with an answer, is a major part of optimization. Here are a few examples:
List comprehensions and generators are usually fastest:
With a straightforward nested approach, a generator is faster than a for loop:
def pairsum(numbers, k):
    """Returns two unique values in numbers whose sum is k"""
    return next((i, j) for i in numbers for j in numbers if i+j == k and i != j)
This is probably faster on average since it only goes through numbers once at most and does not check whether a possible result is in numbers unless k-i != i:
def pairsum(numbers, k):
    """Returns two unique values in numbers whose sum is k"""
    return next((k-i, i) for i in numbers if k-i != i and k-i in numbers)
Output:
>>> pairsum([1,2,3,4,5,6], 8)
(6, 2)
Note: I assumed numbers was a flat list since the doc string did not mention tuples and it makes the problem more difficult which is what I would expect in a competition.
For the second problem, if you are to create your own function as opposed to just using ''.join(set(s)) you were close:
def dedup(s):
    """Returns a string with duplicate characters removed from string s"""
    output = ''
    for c in s:
        if c not in output:
            output += c
    return output
Tip: Do not use string as a name
You can also do:
def dedup(s):
    for c in s:
        s = c + s.replace(c, '')
    return s
or a much faster recursive version:
def dedup(s, out=''):
    s0, s = s[0], s.replace(s[0], '')
    return dedup(s, out + s0) if s else out + s0
but not as fast as set for strings without lots of duplicates:
def dedup(s):
    return ''.join(set(s))
Note: set() will not preserve the order of the remaining characters while the other approaches will preserve the order based on first occurrence.
Your first program is a little vague. I assume numbers is a list of tuples or something? Like [(1,2), (3,4), (5,6)]? If so, your program is pretty good, from a complexity standpoint - it's O(n). Perhaps you want a little more Pythonic solution? The neatest way to clean this up would be to join your conditions:
if i != j and i + j == k:
But this simply increases readability. I think it may also add an additional boolean operation, so it might not be an optimization.
I am not sure if you intended for your program to return the first pair of numbers which sum to k, but if you wanted all pairs which meet this requirement, you could write a comprehension:
def pairsum(numbers, k):
    return list(((i, j) for i, j in numbers if i != j and i + j == k))
In that example, I used a generator comprehension instead of a list comprehension so as to conserve resources - generators are functions which act like iterators, meaning that they can save memory by only giving you data when you need it. This is called lazy iteration.
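To illustrate what lazy iteration means in practice, here is a small sketch. The function name matching_pairs and the sample data are assumptions made up for this example. The generator does no work until a value is requested with next() or by a loop.

```python
def matching_pairs(pairs, k):
    """Return a generator; nothing is computed until the caller asks for a value."""
    return ((i, j) for i, j in pairs if i != j and i + j == k)


pairs = [(1, 2), (3, 5), (4, 4), (6, 2)]
gen = matching_pairs(pairs, 8)

first = next(gen)  # scans only as far as the first match
print(first)       # (3, 5)

rest = list(gen)   # consumes whatever is left
print(rest)        # [(6, 2)]
```

If only the first pair is needed, calling next() once avoids examining the rest of the input at all.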
You can also use a filter, which is a function which returns only the elements from a set for which a predicate returns True. (That is, the elements which meet a certain requirement.)
def pairsum(numbers, k):
    # Python 3's built-in filter is lazy; Python 2 needed itertools.ifilter for that.
    return list(filter(lambda t: t[0] != t[1] and t[0] + t[1] == k, ((i, j) for i, j in numbers)))
But this is less readable in my opinion.
Your second program can be optimized using a set. If you recall from any discrete mathematics you may have learned in grade school or university, a set is a collection of unique elements - in other words, a set has no duplicate elements.
def dedup(mystring):
    return set(mystring)
The algorithm to find the unique elements of a collection is generally going to be O(n^2) in time if it is O(1) in space. If you allow yourself to allocate more memory, you can use a Binary Search Tree to reduce the time complexity to O(n log n), or a hash table to reach O(n) on average, which is how Python sets are implemented.
Your solution took O(n^2) time but also O(n) space, because you created a new list which could, if the input was already a string with only unique elements, take up the same amount of space - and, for every character in the string, you iterated over the output. That's essentially O(n^2) (although I think it's actually O(n*m), but whatever). I hope you see why this is. Read the Binary Search Tree article to see how it improves your code. I don't want to re-implement one again... freshman year was so grueling!
The key to optimization is basically to figure out a way to make the code do less work, in terms of the total number of primitive steps that need to be performed. Code that employs control structures like nested loops quickly adds to the number of primitive steps needed. Optimization is therefore often about replacing loops that iterate over a full list with something more clever.
I had to change the unoptimized pairsum() method slightly to make it usable:
def pairsum(numbers, k):
    """
    Write a function that returns two values in numbers whose sum is K
    """
    for i in numbers:
        for j in numbers:
            if i != j:
                if i+j == k:
                    return i,j
Here we see two loops, one nested inside the other. When describing the time complexity of a method like this, we often say that it is O(n²): as the length n of the numbers array passed in grows, the number of primitive steps grows proportionally to n². Specifically, the i+j == k conditional is evaluated up to len(numbers)**2 times.
The clever thing we can do here is to presort the array at the cost of O(n log(n)), which allows us to home in on the right answer by evaluating each element of the sorted array at most one time.
def fast_pairsum(numbers, k):
    sortedints = sorted(numbers)
    low = 0
    high = len(numbers) - 1
    i = sortedints[0]
    j = sortedints[-1]
    while low < high:
        diff = i + j - k
        if diff > 0:
            # Too high, let's lower
            high -= 1
            j = sortedints[high]
        elif diff < 0:
            # Too low, let's increase.
            low += 1
            i = sortedints[low]
        else:
            # Just right
            return i, j
    raise Exception('No solution')
These kinds of optimization only begin to really matter when the size of the problem becomes large. On my machine the break-even point between pairsum() and fast_pairsum() is with a numbers array containing 13 integers. For smaller arrays pairsum() is faster, and for larger arrays fast_pairsum() is faster. As the size grows fast_pairsum() becomes drastically faster than the unoptimized pairsum().
The clever thing to do for dedup() is to avoid having to linearly scan through the output list to find out if you've already seen a character. This can be done by storing information about which characters you've seen in a set, which has O(1) average look-up cost (Python sets are hash tables), rather than the O(n) look-up cost of a regular list.
With the outer loop, the total cost becomes O(n) on average rather than O(n²).
def fast_dedup(string):
    # if we didn't care about the order of the characters in the
    # returned string we could simply do
    # return set(string)
    seen = set()
    output = []
    seen_add = seen.add
    output_append = output.append
    for i in string:
        if i not in seen:
            seen_add(i)
            output_append(i)
    return output
On my machine the break-even point between dedup() and fast_dedup() is with a string of length 30.
The fast_dedup() method also shows another simple optimization trick: moving as much of the code out of the loop bodies as possible. Since looking up the add() and append() members in the seen and output objects takes time, it is cheaper to do it once outside the loop bodies and store references to the members in variables that are used repeatedly inside the loop bodies.
To properly optimize Python, one needs to find a good algorithm for the problem and a Python idiom close to that algorithm. Your pairsum example is a good case. First, your implementation appears wrong — numbers is most likely a sequence of numbers, not a sequence of pairs of numbers. Thus a naive implementation would look like this:
def pairsum(numbers, k):
    """Write a function that returns two values in numbers whose sum is K"""
    for i in numbers:
        for j in numbers:
            if i != j and i + j == k:
                return i, j
This will perform n^2 iterations, n being the length of numbers. For small ns this is not a problem, but once n gets into hundreds, the nested loops will become visibly slow, and once n gets into thousands, they will become unusable.
An optimization would be to recognize the difference between the inner and the outer loops: the outer loop traverses over numbers exactly once, and is unavoidable. The inner loop, however, is only used to verify that the other number (which has to be k - i) is actually present. This is a mere lookup, which can be made extremely fast by using a dict, or even better, a set:
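def pairsum(numbers, k):
    """Write a function that returns two values in numbers whose sum is K"""
    numset = set(numbers)
    for i in numbers:
        # skip k - i == i so that, like the naive version, two distinct values are returned
        if k - i != i and k - i in numset:
            return i, k - i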
This is not only faster by a constant factor because we're using a built-in operation (set lookup) instead of a Python-coded loop. It actually does less work, because set has a smarter lookup algorithm that runs in constant time.
Optimizing dedup in the analogous fashion is left as an exercise for the reader.
Your string problem, in its order-preserving form, is most easily (and fairly efficiently) written as:
from collections import OrderedDict
new_string = ''.join(OrderedDict.fromkeys(old_string))
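On Python 3.7 and later, plain dicts also preserve insertion order, so the same idea can be sketched without the import. The function name dedup_ordered is an assumption for this example.

```python
def dedup_ordered(s):
    """Remove duplicate characters, keeping the first occurrence of each (Python 3.7+)."""
    # dict.fromkeys builds a dict whose keys are the characters in first-seen order.
    return ''.join(dict.fromkeys(s))


print(dedup_ordered("ksjdkajsdkajksjdalsdjaksda"))  # ksjdal
```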
Is it always possible to convert a recursion into a tail recursive one?
I am having a hard time converting the following Python function into a tail-recursive one.
def BreakWords(glob):
    """Break a string of characters, glob, into a list of words.
    Args:
      glob: A string of characters to be broken into words if possible.
    Returns:
      List of words if glob can be broken down. List can be empty if glob is ''.
      None if no such break is possible.
    """
    # Base case.
    if len(glob) == 0:
        return []
    # Find a partition.
    for i in range(1, len(glob) + 1):
        left = glob[:i]
        if IsWord(left):
            right = glob[i:]
            remaining_words = BreakWords(right)
            if remaining_words is not None:
                return [left] + remaining_words
    return None
I'm not sure if it is always the case, but most recursive functions can be implemented as tail-recursive ones. Also note that tail recursion is different from tail call optimization.
Differences between tail recursion and "regular" recursion
There are two elements that must be present in a recursive function:
1. The recursive call
2. A place to keep track of the return values.
A "regular" recursive function keeps (2) in the stack frame.
The return values in a regular recursive function are composed of two types of values:
Other return values
The result of the function's own computation
Let's see an example:
def factorial(n):
    if n == 1: return 1
    return n * factorial(n-1)
The frame f(5) "stores" the result of its own computation (5) and the value of f(4), for example. If I call factorial(5), just before the stack calls begin to collapse, I have:
[Stack_f(5): 5 * [Stack_f(4): 4 * [Stack_f(3): 3 * [Stack_f(2): 2 * [Stack_f(1): 1]]]]]
Notice that each stack frame stores, besides the values I mentioned, the whole scope of the function. So the memory usage for a recursive function f is O(x), where x is the number of recursive calls I have to make. So, if I need 1 KB of RAM to calculate factorial(1) or factorial(2), I need ~100 KB to calculate factorial(100), and so on.
A tail-recursive function puts (2) in its arguments.
In a tail recursion, I pass the result of the partial calculations in each recursive frame to the next one using parameters. Let's see our factorial example, made tail-recursive:
def factorial(n):
    def tail_helper(n, acc):
        if n <= 1: return acc
        # multiply (not add) into the accumulator, which starts at 1
        return tail_helper(n-1, acc * n)
    return tail_helper(n, 1)
Let's look at its frames in factorial(5):
[Stack f(5, 1): [Stack f(4, 5): [Stack f(3, 20): [Stack f(2, 60): [Stack f(1, 120): 120]]]]]
See the differences? In "regular" recursive calls the returning functions recursively compose the final value. In tail recursion they only pass along the value from the base case (the last one evaluated). We call the argument that keeps track of the older values the accumulator.
Recursion Templates
The regular recursive function goes as follows:
def regular(n):
    base_case
    computation
    return (result of computation) combined with (regular(n towards base case))
To transform it into a tail recursion we:
Introduce a helper function that carries the accumulator
Run the helper function inside the main function, with the accumulator set to the base case.
Look:
def tail(n):
    def helper(n, accumulator):
        if n == base case:
            return accumulator
        computation
        accumulator = computation combined with accumulator
        return helper(n towards base case, accumulator)
    return helper(n, base case)
Your example:
I did something like this:
def BreakWords(glob):
    def helper(word, glob, acc_1, acc_2):
        if len(word) == 0 and len(glob) == 0:
            if not acc_1:
                return None
            return acc_1
        if len(word) == 0:
            word = glob.pop(0)
            acc_2 = 0
        if IsWord(word[:acc_2]):
            acc_1.append(word[:acc_2])
            return helper(word[acc_2 + 1:], glob, acc_1, acc_2 + 1)
        return helper(word[acc_2 + 1:], glob, acc_1, acc_2 + 1)
    return helper("", glob, [], 0)
In order to eliminate the for statement you used, I wrote my recursive helper function with 2 accumulators: one to store the results, and one to store the position I'm currently trying.
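Because the original function backtracks, one way to get every recursive call into tail position is to carry the untried alternatives along as an accumulator. The following is only a sketch of that idea, not the code above: the names break_words, helper and pending are made up, and is_word is passed in as a parameter as a stand-in for the IsWord function from the question.

```python
def break_words(glob, is_word):
    """Tail-recursive sketch: pending is a stack of (start, words_so_far) states to try."""
    def helper(pending):
        if not pending:
            return None                     # every alternative has been exhausted
        start, words = pending[-1]
        rest = pending[:-1]
        if start == len(glob):
            return words                    # consumed the whole string: success
        # Push candidate splits longest-first so the shortest prefix ends up
        # on top of the stack and is tried first, like the original for loop.
        candidates = [(i, words + [glob[start:i]])
                      for i in range(len(glob), start, -1)
                      if is_word(glob[start:i])]
        return helper(rest + candidates)    # the only recursive call, in tail position
    return helper([(0, [])])


vocabulary = {"a", "ab", "abc", "cd", "d"}
print(break_words("abcd", vocabulary.__contains__))  # ['ab', 'cd']
```

The accumulator plays the role that the call stack and the for loop position play in the original, which is why nothing remains to be done after the recursive call returns. Python still uses one stack frame per call here, so this does not lift the recursion limit.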
Tail Call optimization
Since no state is being stored on the Non-Border-Cases of the Tail Call stacks, they aren't so important. Some languages/interpreters then substitute the old stack with the new one. So, with no stack frames constraining the number of calls, the Tail Calls behave just like a for-loop.
But unfortunately for you, Python isn't one of these cases. You'll get a RuntimeError (RecursionError in recent versions) when the stack gets deeper than the recursion limit, which is 1000 by default. Guido van Rossum thinks that the clarity lost for debugging purposes due to tail call optimization (caused by the frames thrown away) is more important than the feature. That's a shame. Python has so much cool functional stuff, and tail recursion would be great on top of it :/
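Since the interpreter will not do this for you, the usual workaround is to do the conversion by hand: the parameters of the tail call become loop variables. A minimal sketch for the factorial example, with the name factorial_loop assumed for illustration:

```python
def factorial_loop(n):
    """Hand-converted tail recursion: the accumulator becomes a local variable."""
    acc = 1
    while n > 1:
        # Equivalent to the tail call tail_helper(n - 1, acc * n).
        n, acc = n - 1, acc * n
    return acc


print(factorial_loop(5))                # 120
print(len(str(factorial_loop(5000))))   # works far past the default recursion limit
```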