Counting Total Number of Elements in a List using Recursion - python

The Problem:
Count the number of elements in a List using recursion.
I wrote the following function:
def count_rec(arr, i):
    """
    This function takes List (arr) and Index Number
    then returns the count of number of elements in it
    using Recursion.
    """
    try:
        temp = arr[i]  # if element exists at i, continue
        return 1 + count_rec(arr, i+1)
    except IndexError:
        # if index error, that means i == length of list
        return 0
I noticed some problems with it:
- RecursionError (when the number of elements is more than 990)
- Using a temp element (wasting memory..?)
- Exception handling (I feel like we shouldn't use it unless necessary)
If anyone can suggest how to improve the above solution or come up with an alternative one, it would be really helpful.

What you have is probably as efficient as you are going to get for this thought experiment (obviously, Python already calculates and stores the length of list objects, which can be retrieved with the len() built-in, so this function is completely unnecessary).
You could get shorter code if you want:
def count(L):
    return count(L[:-1]) + 1 if L else 0
But you still need to change Python's recursion limit:
import sys; sys.setrecursionlimit(100000)
However, we should note that "try except" is usually faster than "if else" in Python when the exception is rarely raised (the EAFP idiom), though it is slower when the exception path is hit often. Hence, "try except" is going to be better here if you are after performance. Of course, it's a bit odd to talk about performance, because recursion typically doesn't perform very well in Python, due to how Python manages namespaces and stack frames. Recursion in Python is typically frowned upon, often unnecessary, and slow, so trying to optimize recursion performance is a little strange.
A last point to note: you mention temp = arr[i] taking up memory. Yes, possibly a few bytes. Of course, any check you do to determine whether arr has an element at i is going to take a few bytes of memory, even simply evaluating arr[i] without assignment. In addition, those bytes are freed the second the temp variable falls out of scope, gets re-used, or the function exits. Hence, unless you are planning on launching 10,000,000,000 sub-processes, rest assured there is no performance degradation in using a temp variable like that.
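If you want to check the "try except" versus "if else" claim on your own machine, here is a minimal timing sketch (the eafp/lbyl functions are stand-ins invented for illustration, not the original code):
import timeit

setup = """
arr = list(range(1000))

def eafp(arr, i):
    try:
        arr[i]              # EAFP: just try the access
        return True
    except IndexError:
        return False

def lbyl(arr, i):
    if i < len(arr):        # LBYL: check before acting
        return True
    return False
"""

# When the exception is rarely raised, EAFP tends to win;
# when it is raised on most calls, the exception machinery dominates.
print(timeit.timeit("eafp(arr, 500)", setup=setup, number=1000000))
print(timeit.timeit("lbyl(arr, 500)", setup=setup, number=1000000))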

You are probably looking for something like this:
def count_rec(arr):
    if arr == []:
        return 0
    return count_rec(arr[1:]) + 1

You can use pop() to do it (note that this empties the caller's list as a side effect, since pop() mutates it in place):
def count_r(l):
    if l == []:
        return 0
    else:
        l.pop()
        return count_r(l) + 1

Related

Appropriate to use repeated function calls to loop through something (i.e. a list) in Python?

Let's say I have the following Python script:
def pop_and_loop():
    my_list.pop(0)
    my_func()

def my_func():
    # do something with list item [0]
    if finished_with(my_list[0]):   # pseudocode condition
        pop_and_loop()
    # continued actions if not finished with
    if finished_with(my_list[0]):   # pseudocode condition
        pop_and_loop()

my_list = [...]  # list containing 100 items
my_func()
Is this an appropriate setup? Because am I not leaving each function call open, in a way? It's having to hold a marker at the position where I left the function to go to another, so theoretically it is waiting for me to come back, but I'm never coming back to that one. Does this create problems, and is there a different way you're meant to do this?
EDIT: My actual script is more complicated than this, with loads of different functions that I need to call whilst processing each item in the main list. Essentially my question is whether I need to convert this setup into an actual loop. Bear in mind that I will need to refresh the main list to refill it again and then loop through it again. So how would I keep looping that? Should I instead have:
my_list = []

def my_func(item):
    # do something with list item
    if finished_with(item):       # pseudocode condition
        return output
    elif item_finished_now:       # pseudocode condition
        return output

while not len(my_list):
    while ...:  # there are items to fill the list with
        ...     # fill list
    for x in my_list:
        output = my_func(x)
        # deal with output and list popping here
    # sleep loop waiting for there to be things to put into the list again
    time.sleep(60)
Yours is simply an example of recursion.
Both the question and answer are borderline opinion-based, but in most cases you would prefer an iterative solution (loops) over recursion unless the recursive solution has a clear benefit of either being simpler or being easier to comprehend in code and in reasoning.
For various reasons, Python does not have any recursion optimizations such as tail calls and creates a new stack frame for each new level (or function call). That, and more, are reasons an iterative solution would generally be faster, and why the overhead of extra recursive calls in Python is rather large: it takes more memory for the stack and spends more time creating those frames. On top of all that, there is a limit to the recursion depth, and most recursive algorithms can be converted to an iterative solution in an easy fashion.
Your specific example is simple enough to convert like so:
while my_list:
    while my_list[0] != "finished":
        ...  # do stuff
    my_list.pop(0)
On a side note, please don't pop(0) on a list; use a collections.deque instead, as its popleft() is O(1) while list.pop(0) is O(N).
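For illustration, a minimal sketch of the deque-based version (the item values here are made up):
from collections import deque

# popleft() is O(1); list.pop(0) has to shift every remaining element, O(N).
my_queue = deque(["task1", "task2", "finished"])
while my_queue:
    item = my_queue.popleft()
    if item == "finished":
        break
    # do stuff with item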

Confused on why generators are useful [duplicate]

I'm starting to learn Python and I've come across generator functions, those that have a yield statement in them. I want to know what types of problems that these functions are really good at solving.
Generators give you lazy evaluation. You use them by iterating over them, either explicitly with 'for' or implicitly by passing it to any function or construct that iterates. You can think of generators as returning multiple items, as if they return a list, but instead of returning them all at once they return them one-by-one, and the generator function is paused until the next item is requested.
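A minimal sketch of that pausing behaviour (the print calls are only there to show when the function body actually runs):
def one_two():
    print("computing 1")
    yield 1
    print("computing 2")
    yield 2

gen = one_two()
print(next(gen))  # runs the body up to the first yield, then pauses
print(next(gen))  # resumes right after the first yield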
Generators are good for calculating large sets of results (in particular calculations involving loops themselves) where you don't know if you are going to need all results, or where you don't want to allocate the memory for all results at the same time. Or for situations where the generator uses another generator, or consumes some other resource, and it's more convenient if that happened as late as possible.
Another use for generators (that is really the same) is to replace callbacks with iteration. In some situations you want a function to do a lot of work and occasionally report back to the caller. Traditionally you'd use a callback function for this. You pass this callback to the work-function and it would periodically call this callback. The generator approach is that the work-function (now a generator) knows nothing about the callback, and merely yields whenever it wants to report something. The caller, instead of writing a separate callback and passing that to the work-function, does all the reporting work in a little 'for' loop around the generator.
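Before the filesystem example below, here is a hedged toy sketch of the two styles; the names (do_work_with_callback, do_work, report) are invented for illustration:
# Callback style: the worker must be handed a reporting function.
def do_work_with_callback(items, report):
    for item in items:
        report(item * 2)        # stand-in for real work

# Generator style: the worker knows nothing about the caller
# and just yields; the caller reports in a small 'for' loop.
def do_work(items):
    for item in items:
        yield item * 2          # stand-in for real work

def report(result):
    print(result)

do_work_with_callback([1, 2, 3], report)
for result in do_work([1, 2, 3]):
    report(result)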
For example, say you wrote a 'filesystem search' program. You could perform the search in its entirety, collect the results and then display them one at a time. All of the results would have to be collected before you showed the first, and all of the results would be in memory at the same time. Or you could display the results while you find them, which would be more memory efficient and much friendlier towards the user. The latter could be done by passing the result-printing function to the filesystem-search function, or it could be done by just making the search function a generator and iterating over the result.
If you want to see an example of the latter two approaches, see os.path.walk() (the old filesystem-walking function with callback) and os.walk() (the new filesystem-walking generator.) Of course, if you really wanted to collect all results in a list, the generator approach is trivial to convert to the big-list approach:
big_list = list(the_generator)
One reason to use generators is to make the solution clearer for some kinds of problems.
The other is to treat results one at a time, avoiding building huge lists of results that you would process separately anyway.
If you have a fibonacci-up-to-n function like this:
# function version
def fibon(n):
    a = b = 1
    result = []
    for i in xrange(n):
        result.append(a)
        a, b = b, a + b
    return result
You can write the function more easily like this:
# generator version
def fibon(n):
    a = b = 1
    for i in xrange(n):
        yield a
        a, b = b, a + b
The function is clearer. And if you use the function like this:
for x in fibon(1000000):
    print x,
In this example, using the generator version, the whole 1000000-item list is never created at all; values are produced one at a time. That is not the case with the list version, where the full list would be built first.
Real World Example
Let's say you have 100 million domains in your MySQL table, and you would like to update Alexa rank for each domain.
First thing you need is to select your domain names from the database.
Let's say your table name is domains and column name is domain.
If you use SELECT domain FROM domains it's going to return 100 million rows, which is going to consume a lot of memory. So your server might crash.
So you decided to run the program in batches. Let's say our batch size is 1000.
In our first batch we will query the first 1000 rows, check Alexa rank for each domain and update the database row.
In our second batch we will work on the next 1000 rows. In our third batch it will be from 2001 to 3000 and so on.
Now we need a generator function which generates our batches.
Here is our generator function:
def ResultGenerator(cursor, batchsize=1000):
    while True:
        results = cursor.fetchmany(batchsize)
        if not results:
            break
        for result in results:
            yield result
As you can see, our function keeps yielding results. If you used the keyword return instead of yield, the whole function would end once it reached return.
return - returns only once
yield - returns multiple times
If a function uses the keyword yield then it's a generator.
Now you can iterate like this:
db = MySQLdb.connect(host="localhost", user="root", passwd="root", db="domains")
cursor = db.cursor()
cursor.execute("SELECT domain FROM domains")
for result in ResultGenerator(cursor):
    doSomethingWith(result)
db.close()
I found this explanation, which cleared up my doubt, because a person who doesn't know generators may not know about yield either.
Return
The return statement is where all the local variables are destroyed and the resulting value is given back (returned) to the caller. Should the same function be called some time later, the function will get a fresh new set of variables.
Yield
But what if the local variables aren't thrown away when we exit a function? That implies that we can resume the function where we left off. This is where the concept of generators is introduced, and the yield statement resumes where the function left off.
def generate_integers(N):
    for i in xrange(N):
        yield i

In [1]: gen = generate_integers(3)
In [2]: gen
<generator object at 0x8117f90>
In [3]: gen.next()
0
In [4]: gen.next()
1
In [5]: gen.next()
2
So that's the difference between return and yield statements in Python.
The yield statement is what makes a function a generator function.
So generators are a simple and powerful tool for creating iterators. They are written like regular functions, but they use the yield statement whenever they want to return data. Each time next() is called, the generator resumes where it left off (it remembers all the data values and which statement was last executed).
See the "Motivation" section in PEP 255.
A non-obvious use of generators is creating interruptible functions, which lets you do things like update UI or run several jobs "simultaneously" (interleaved, actually) while not using threads.
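As a hedged sketch of that idea: two generator "jobs" can be interleaved by a tiny scheduler, with no threads involved (all names here are made up):
def make_job(name, steps):
    for i in range(steps):
        print("%s: step %d" % (name, i))
        yield                       # hand control back to the scheduler

def run_round_robin(jobs):
    while jobs:
        job = jobs.pop(0)
        try:
            next(job)
            jobs.append(job)        # not finished yet: reschedule it
        except StopIteration:
            pass                    # job finished: drop it

run_round_robin([make_job("A", 3), make_job("B", 2)])  # A and B interleave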
Buffering. When it is efficient to fetch data in large chunks but process it in small chunks, a generator might help:
def bufferedFetch():
    while True:
        buffer = getBigChunkOfData()
        # insert some code to break on 'end of data'
        for i in buffer:
            yield i
The above lets you easily separate buffering from processing. The consumer function can now just get the values one by one without worrying about buffering.
I have found that generators are very helpful in cleaning up your code and giving you a unique way to encapsulate and modularize it. In a situation where you need something to constantly spit out values based on its own internal processing, and when that something needs to be called from anywhere in your code (not just within a loop or a block, for example), generators are the feature to use.
An abstract example would be a Fibonacci number generator that does not live within a loop and when it is called from anywhere will always return the next number in the sequence:
def fib():
    first = 0
    second = 1
    yield first
    yield second
    while 1:
        next = first + second
        yield next
        first = second
        second = next

fibgen1 = fib()
fibgen2 = fib()
Now you have two Fibonacci number generator objects which you can call from anywhere in your code and they will always return ever larger Fibonacci numbers in sequence as follows:
>>> fibgen1.next(); fibgen1.next(); fibgen1.next(); fibgen1.next()
0
1
1
2
>>> fibgen2.next(); fibgen2.next()
0
1
>>> fibgen1.next(); fibgen1.next()
3
5
The lovely thing about generators is that they encapsulate state without having to go through the hoops of creating objects. One way of thinking about them is as "functions" which remember their internal state.
I got the Fibonacci example from Python Generators - What are they? and with a little imagination, you can come up with a lot of other situations where generators make for a great alternative to for loops and other traditional iteration constructs.
The simple explanation:
Consider a for statement
for item in iterable:
    do_stuff()
A lot of the time, all the items in iterable don't need to be there from the start, but can be generated on the fly as they're required. This can be a lot more efficient in both
space (you never need to store all the items simultaneously) and
time (the iteration may finish before all the items are needed).
Other times, you don't even know all the items ahead of time. For example:
for command in user_input():
    do_stuff_with(command)
You have no way of knowing all the user's commands beforehand, but you can use a nice loop like this if you have a generator handing you commands:
def user_input():
    while True:
        wait_for_command()
        cmd = get_command()
        yield cmd
With generators you can also have iteration over infinite sequences, which is of course not possible when iterating over containers.
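For example, a sketch of an infinite sequence of squares; the consuming loop decides when to stop, and values past that point are never computed:
import itertools

def squares():
    for n in itertools.count(1):
        yield n * n             # 1, 4, 9, ... with no upper bound

for sq in squares():
    if sq > 50:
        break
    print(sq)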
My favorite uses are "filter" and "reduce" operations.
Let's say we're reading a file, and only want the lines which begin with "##".
def filter2sharps(aSequence):
    for l in aSequence:
        if l.startswith("##"):
            yield l
We can then use the generator function in a proper loop
source = file( ... )
for line in filter2sharps(source.readlines()):
    print line
source.close()
The reduce example is similar. Let's say we have a file where we need to locate blocks of <Location>...</Location> lines. [Not HTML tags, but lines that happen to look tag-like.]
def reduceLocation(aSequence):
    keep = False
    block = None
    for line in aSequence:
        if line.startswith("</Location"):
            block.append(line)
            yield block
            block = None
            keep = False
        elif line.startswith("<Location"):
            block = [line]
            keep = True
        elif keep:
            block.append(line)
        else:
            pass
    if block is not None:
        yield block  # a partial block, icky
Again, we can use this generator in a proper for loop.
source = file( ... )
for b in reduceLocation(source.readlines()):
    print b
source.close()
The idea is that a generator function allows us to filter or reduce a sequence, producing another sequence one value at a time.
A practical example where you could make use of a generator is if you have some kind of shape and you want to iterate over its corners, edges or whatever. For my own project (source code here) I had a rectangle:
class Rect():
    def __init__(self, x, y, width, height):
        self.l_top = (x, y)
        self.r_top = (x+width, y)
        self.r_bot = (x+width, y+height)
        self.l_bot = (x, y+height)

    def __iter__(self):
        yield self.l_top
        yield self.r_top
        yield self.r_bot
        yield self.l_bot
Now I can create a rectangle and loop over its corners:
myrect = Rect(50, 50, 100, 100)
for corner in myrect:
    print(corner)
Instead of __iter__ you could have a method iter_corners and call that with for corner in myrect.iter_corners(). It's just more elegant to use __iter__ since then we can use the class instance name directly in the for expression.
Basically, generators let you avoid callback functions when iterating over input while maintaining state.
Since the send method of a generator has not been mentioned, here is an example:
def test():
    for i in xrange(5):
        val = yield
        print(val)

t = test()
# Proceed to 'yield' statement
next(t)
# Send value to yield
t.send(1)
t.send('2')
t.send([3])
It shows the possibility of sending a value to a running generator. There is a more advanced course on generators in the video below (including an explanation of yield from, generators for parallel processing, escaping the recursion limit, etc.):
David Beazley on generators at PyCon 2014
Some good answers here, however, I'd also recommend a complete read of the Python Functional Programming tutorial which helps explain some of the more potent use-cases of generators.
Particularly interesting is that it is now possible to update the yield variable from outside the generator function, hence making it possible to create dynamic and interwoven coroutines with relatively little effort.
Also see PEP 342: Coroutines via Enhanced Generators for more information.
I use generators when our web server is acting as a proxy:
The client requests a proxied url from the server
The server begins to load the target url
The server yields to return the results to the client as soon as it gets them
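A hedged sketch of that pattern as a WSGI application (WSGI allows an app to return any iterable, including a generator; the upstream host and chunk size are made up):
import urllib2  # urllib.request in Python 3

def proxy_app(environ, start_response):
    upstream = urllib2.urlopen("http://upstream.example" + environ["PATH_INFO"])
    start_response("200 OK", [("Content-Type", "application/octet-stream")])
    def stream():
        while True:
            chunk = upstream.read(8192)
            if not chunk:
                break
            yield chunk         # forwarded as soon as it is read
    return stream()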
Piles of stuff. Any time you want to generate a sequence of items, but don't want to have to 'materialize' them all into a list at once. For example, you could have a simple generator that returns prime numbers:
import itertools

def primes():
    primes_found = set()
    primes_found.add(2)
    yield 2
    for i in itertools.count(1):
        candidate = i * 2 + 1
        if all(candidate % prime for prime in primes_found):
            primes_found.add(candidate)
            yield candidate
You could then use that to generate the products of subsequent primes:
def prime_products():
    primeiter = primes()
    prev = primeiter.next()
    for prime in primeiter:
        yield prime * prev
        prev = prime
These are fairly trivial examples, but you can see how it can be useful for processing large (potentially infinite!) datasets without generating them in advance, which is only one of the more obvious uses.
Also good for printing the prime numbers up to n:
def genprime(n=10):
    # note: starts at 3, so the prime 2 is not yielded
    for num in range(3, n+1):
        for factor in range(2, num):
            if num % factor == 0:
                break
        else:
            yield num

for prime_num in genprime(100):
    print(prime_num)

Replace a simple for-loop with a recursive function?

I was watching Eric Meijer's lectures on functional programming and I found this example to be really nice and intriguing:
If I had to sum a list of numbers in an imperative way, I would do something like:
total = 0
for each in range(0, 10):
    total = total + each
where I am explaining how to do this, instead of just specifying what I want.
This expression in Python does the same thing:
sum(range(1,10))
and it is the same as my original problem statement, which is to "sum a list of numbers". This is a nice high-level programming language construct to have, since it is both readable and declarative.
range(1,10) captures the fact that this is a list of items.
sum captures the computation to be done.
So, my first thought was that functions which return values are more useful than for-loops, at least in some scenarios. On further reading I also found that for-loops are just syntactic sugar for a jump operation, which can also be replaced with a recursive function call with the proper base condition. Is that a correct statement?
So generalizing this, I just wrote a simple reduce function which looks like:
def reduce(operation, start, array):
    # I think I could make this an expression too.
    if len(array) == 1:
        return operation(start, array[0])
    return reduce(operation, operation(start, array[0]), array[1:])
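For reference, a quick usage sketch of the reduce above (operator is from the standard library):
import operator

print(reduce(operator.add, 0, [1, 2, 3, 4]))  # 10, a "sum" partial
print(reduce(operator.mul, 1, [1, 2, 3, 4]))  # 24, a "product" partial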
I just wanted to know if this is a good way to start thinking functionally, i.e. in terms of inputs and outputs as much as possible?
The advantage I can think of is:
- We can create any number of partials like sum, product, etc. But I think it can be implemented using loops as well.
The disadvantages are:
1. I am duplicating the array again and again. Space complexity is O(n^2). I could use indexes to avoid that problem, but the code would look messy.
2. Since there is no tail recursion in Python, it might create a huge stack. But that is an implementation detail to be aware of.
Re: Disadvantage #2
Yes, it does create a huge stack. I have experienced this specifically with IronPython on Windows. If an error is thrown deep in the recursion (unlikely for a simple sum, but when dealing with external APIs, it can happen) you will get a stack trace back with an error thrown in every frame since the original call. This can make it very difficult to debug.
This code:
class ConnectedItem():
    def __init__(self, name):
        self.name = name
        self.connected_to = None

    def connect(self, item):
        self.connected_to = item

    def __repr__(self):
        return "<item: {} connected to {}>".format(self.name, self.connected_to)

items = []
for l in ["a", "b", "c", "d"]:
    items += [ConnectedItem(l)]
for n, i in enumerate(items):
    if n < 3:
        i.connect(items[n + 1])

def recursively_access(item):
    # print(item.name)
    return recursively_access(item.connected_to)

recursively_access(items[0])
Produces a traceback with an entry for every recursive call, ending in AttributeError: 'NoneType' object has no attribute 'connected_to' (the chain ends at item d, whose connected_to is None).
This may not be exactly Python, but recursion is a methodology that can be used in any language. Here is a pseudocode example that can hopefully get you thinking along the right track.
recursiveMethod(param1, param2)
    if (param1 > 10)
        return from method
    recursiveMethod(param1++, param2 += whateverOperation)
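A hedged Python rendering of that pseudocode; the accumulating operation is a placeholder:
def recursive_method(param1, param2):
    if param1 > 10:
        return param2                   # base case stops the recursion
    return recursive_method(param1 + 1, param2 + param1)

print(recursive_method(1, 0))  # sums 1..10 -> 55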

Unexpected behaviour of a recursive insertion sort function in Python

I was writing a recursive insertion sort function in Python 2.7 and came across two things I can't understand.
The first was the error TypeError: can only assign an iterable, which I guessed had to do with the recursion of the function, but I don't understand what in particular the problem with my code is:
def recursiveInsertionSort(v):
    if len(v) != 2:
        v[0:len(v)-1] = recursiveInsertionSort(v[0:len(v)-1])
    i = len(v) - 1
    while v[i-1] > v[i]:
        v[i-1], v[i] = v[i], v[i-1]
        i -= 1
    if i == 0: return v
The second problem is probably connected.
In this case I didn't even get an error (if you know why please tell me) but the function just didn't work.
def recursiveInsertionSort(v):
    if len(v) != 2:
        recursiveInsertionSort(v[0:len(v)-1])
    i = len(v) - 1
    while v[i-1] > v[i] and i > 0:
        v[i-1], v[i] = v[i], v[i-1]
        i -= 1
As I guessed the problem was with the recursive use of the function I corrected my mistake:
def recursiveInsertionSort(v):
    if len(v) != 2:
        temp = v[0:len(v)-1]
        recursiveInsertionSort(temp)
        v[0:len(v)-1] = temp
    i = len(v) - 1
    while v[i-1] > v[i] and i > 0:
        v[i-1], v[i] = v[i], v[i-1]
        i -= 1
But I really would like to understand the causes of these two behaviors, can you help me?
EDIT: I also ask if there's a nicer way of doing:
temp = v[0:len(v)-1]
recursiveInsertionSort(temp)
v[0:len(v)-1] = temp
The problem with your first implementation was with the if statement inside the loop. If the loop exited because v[i-1] was less than or equal to v[i], it would not return anything (which in Python is the same as doing return None). The TypeError you're getting is from the recursive call that gets the None value returned to it, since you can't assign None to a slice.
You can make the first version of your code work if you combine the two conditions in the while statement and return v unconditionally after the loop ends:
while i > 0 and v[i-1] > v[i]:
    v[i-1], v[i] = v[i], v[i-1]
    i -= 1
return v
You probably should put the conditions in this order in your other version too.
As for why your second version didn't work, the issue was that you were slicing the original list, getting a new list with a partial copy of its contents. Then your recursive sort was modifying that copy in place. But then the outer function didn't have any way to see the changes, since it didn't keep a reference to the sliced copy. Your third version fixes this by explicitly saving the slice as a new variable.
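A tiny demonstration of that point: sorting a slice in place does not affect the original list, because the slice is a copy:
v = [3, 1, 2]
w = v[0:2]      # w is a new list copied from v[0:2]
w.sort()        # sorts the copy in place
print(v)        # [3, 1, 2] -- the original is unchanged
print(w)        # [1, 3]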
I don't think there's any better way to do the exact three lines of your last code block. However, if you changed the function to accept an optional parameter for the highest index to sort, you could do without slicing at all:
def recursiveInsertionSort(v, i=None):
    if i is None:
        i = len(v) - 1
    if i > 1:
        recursiveInsertionSort(v, i-1)
    while i > 0 and v[i-1] > v[i]:
        v[i-1], v[i] = v[i], v[i-1]
        i -= 1
This is going to be a little bit more efficient than your current code, as it doesn't copy the list for each recursive call. Both versions are still O(N**2) though, so don't expect it to compete with quicksort or other more efficient sorts on large data sets.

Get length of list in Python using recursion

I am trying to calculate the length of a list. When I run it on cmd, I get:
RuntimeError: maximum recursion depth exceeded in comparison
I don't think there's anything wrong with my code:
def len_recursive(list):
    if list == []:
        return 0
    else:
        return 1 + len_recursive(list[1:])
Don't use recursion unless you can predict that it is not too deep. Python has quite a small limit on recursion depth.
If you insist on recursion, the efficient way is:
def len_recursive(lst):
    if not lst:
        return 0
    return 1 + len_recursive(lst[1::2]) + len_recursive(lst[2::2])
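Because each call splits the list roughly in half, the recursion depth is O(log n) rather than O(n), so even long lists stay inside the default limit; a quick usage sketch:
print(len_recursive(list(range(100000))))  # 100000, no RecursionError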
The recursion depth in Python is limited, but can be increased as shown in this post. If Python had support for the tail call optimization, this solution would work for arbitrary-length lists:
def len_recursive(lst):
    def loop(lst, acc):
        if not lst:
            return acc
        return loop(lst[1:], acc + 1)
    return loop(lst, 0)
But as it is, you will have to use shorter lists and/or increase the maximum recursion depth allowed.
Of course, no one would use this implementation in real life (they'd use the len() built-in function instead). I'm guessing this is an academic example of recursion, but even so, the best approach here would be to use iteration, as shown in poke's answer.
As others have explained, there are two problems with your function:
1. It's not tail-recursive, so it can only handle lists shorter than sys.getrecursionlimit().
2. Even if it were tail-recursive, Python doesn't do tail recursion optimization.
The first is easy to solve. For example, see Óscar López's answer.
The second is hard to solve, but not impossible. One approach is to use coroutines (built on generators) instead of subroutines. Another is to not actually call the function recursively, but instead return a function with the recursive result, and use a driver that applies the results. See Tail Recursion in Python by Paul Butler for an example of how to implement the latter, but here's what it would look like in your case.
Start with Paul Butler's tail_rec function:
def tail_rec(fun):
    def tail(fun):
        a = fun
        while callable(a):
            a = a()
        return a
    return (lambda x: tail(fun(x)))
This doesn't work as a decorator for his case, because he has two mutually-recursive functions. But in your case, that's not an issue. So, using Óscar López's version:
@tail_rec
def tail_len(lst):
    def loop(lst, acc):
        if not lst:
            return acc
        return lambda: loop(lst[1:], acc + 1)
    return lambda: loop(lst, 0)
And now:
>>> print tail_len(range(10000))
10000
Tada.
If you actually wanted to use this, you might want to make tail_rec into a nicer decorator:
import functools

def tail_rec(fun):
    def tail(fun):
        a = fun
        while callable(a):
            a = a()
        return a
    return functools.update_wrapper(lambda x: tail(fun(x)), fun)
Imagine you're running this using a stack of paper. You want to count how many sheets you have. If someone gives you 10 sheets, you take the first sheet, put it down on the table, and grab the next sheet, placing it next to the first. You do this 10 times and your desk is pretty full, but you've set out each sheet. You then start to count every page, recycling it as you count it up, 0 + 1 + 1 + ... => 10. This isn't the best way to count pages, but it mirrors the recursive approach and Python's implementation.
This works for small numbers of pages. Now imagine someone gives you 10000 sheets. Pretty soon there is no room on your desk to set out each page. This is essentially what the error message is telling you.
The maximum recursion depth is how many sheets the table can hold. On each recursive call, Python needs to keep the "1 + result of recursive call" frame around so that when all the pages have been laid out it can come back and count them up. Unfortunately, you run out of space before the final counting-up occurs.
If you want to do this recursively to learn (since you'd use len() in any reasonable situation), just use small lists; 25 should be fine.
Some systems could handle this for large lists if they support tail calls.
Your exception message means that your method is called recursively too often, so it's likely that your list is just too long to count the elements recursively like that. You could do it simply using an iterative solution though:
def len_iterative(lst):
    length = 0
    while lst:
        length += 1
        lst = lst[1:]
    return length
Note that this will very likely still be a terrible solution, as lst[1:] keeps creating copies of the list. You will end up with len(lst) + 1 list instances (with lengths 0 to len(lst)). It is probably best to just use the built-in len directly, but I guess it was an assignment.
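If you do want an iterative version without the copying, here is a hedged sketch that counts by iterating rather than slicing:
def len_iterative_no_copy(lst):
    length = 0
    for _ in lst:       # visits each element without copying the list
        length += 1
    return length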
Python doesn't optimise tail recursion calls, so using recursive algorithms like this isn't a good idea.
You can tweak the recursion limit with sys.setrecursionlimit(), but it's still not a good idea.
