I'm asked to make a program that calculates the sum of two polynomials of degrees n and m. I made two dictionaries (one for each polynomial), with the degrees as keys and the coefficients as values, so that I can check whether the keys from both dictionaries match and, if so, sum their values. But I don't know why I always get an error. My code so far is:
class poly:
    def __init__(self, L=[], D=[]):
        self.coef=L
        self.deg=D
    def __add__(self,L2):
        if len(self.coef)>len(self.deg):
            dec=dict(zip(self.deg,self.coef))
            dec[0]=self.coef[-1]
        else:
            dec=dict(zip(self.deg,self.coef))
        Dec1=dec
        if len(L2.coef)>len(L2.deg):
            dec=dict(zip(L2.deg,L2.coef))
            dec[0]=L2.coef[-1]
        else:
            dec=dict(zip(L2.deg,L2.coef))
        Dec2=dec
        p=[]
        if len(Dec2)>len(Dec1):
            for i in Dec2:
                if i in Dec1:
                    s=Dec1[i]+Dec2[i]
                    p=p+[s]
                else:
                    p=p+p[Dec2[i]]
            for x in Dec1:
                if x in Dec2:
                    p=p
                else:
                    p=p+[dec1[x]]
            return(poly(p))
        if len(Dec2)<len(Dec1):
            for x in Dec1:
                if x in Dec2:
                    g=Dec1[x]
                    p=p+[g]
                else:
                    p=p+[Dec1[x]]
            for m in Dec2:
                if m in Dec1:
                    p=p
                else:
                    p=p+[Dec2[m]]
            return (poly(p))
This code doesn't work for all of my examples, such as:
>>> p=poly([2,4,7,34],[6,4,2])
>>> p1=poly([6,3,7,2,8],[8,4,2,1])
>>> p2=p+p1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
p2=p+p1
File "poly.py", line 31, in __add__
p=p+p[Dec2[i]]
IndexError: list index out of range
>>> #The numbers in the first list are the coefficients and the second list is the degrees
This doesn't work! But it worked when I did the addition without using a class method. I'm a beginner and I did my best to fix the problem.
Another question is how to write __str__ for my class. I really don't have any idea what I should write at the beginning. I'm sorry guys, but I'm new to programming and I need easy code such as mine.
A few observations:
1. By common convention, class names should be capitalized (i.e. Poly).
2. You have __add__ doing a lot of stuff that has nothing to do with adding. This should be a warning sign.
3. A lot of __add__'s work is mucking about with the data storage format. Maybe you should use a better storage format, one which won't need so much reshuffling?
4. You have a lot of repetitive chunks of code in __add__; this is usually an indicator that the code should be factored into a subroutine.
5. You have this object (self) making changes to the internal details of another object (L2) - another bad smell.
6. If you move the normalization code for self (if len(self.coef) > len(self.deg) ...) from __add__ into __init__, this will solve #2, #3, half of #4, and #5 all in one go (you no longer have to "do to" L2, it will "do to" itself).
7. If you realize that it's pretty much irrelevant whether len(Dec1) > len(Dec2) or not, you can get rid of another block of redundant code. This fixes the other half of #4. Suddenly __add__ shrinks from 48 lines of code to about 12, and becomes much easier to understand and debug.
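For instance, once the normalization happens in __init__, a rough sketch of the slimmed-down class might look like this (keeping the question's degree-to-coefficient dict idea; the names and details here are illustrative only):
import random

class Poly:
    def __init__(self, coef=None, deg=None):
        coef = list(coef or [])
        deg = list(deg or [])
        # Normalize here, once: a trailing extra coefficient is the
        # constant term, exactly like the dec[0] special case above.
        if len(coef) > len(deg):
            deg.append(0)
        self.d = dict(zip(deg, coef))

    def __add__(self, other):
        total = dict(self.d)
        for power, c in other.d.items():
            total[power] = total.get(power, 0) + c
        return Poly(total.values(), total.keys())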
For the sake of comparison:
from itertools import izip_longest, chain, product
from collections import defaultdict

class Poly(object):
    def __init__(self, coeff=None, power=None):
        if coeff is None: coeff = []
        if power is None: power = []
        self.d = defaultdict(int)
        for c,p in izip_longest(coeff, power, fillvalue=0):
            if c != 0:
                self.d[p] += c

    @classmethod
    def fromDict(cls, d):
        return cls(d.itervalues(), d.iterkeys())

    @property
    def degree(self):
        return max(p for p,c in self.d.iteritems() if c != 0)

    def __add__(self, poly):
        return Poly(
            chain(self.d.itervalues(), poly.d.itervalues()),
            chain(self.d.iterkeys(), poly.d.iterkeys())
        )

    def __mul__(self, poly):
        return Poly(
            (cs*cp for cs,cp in product(self.d.itervalues(), poly.d.itervalues())),
            (ps+pp for ps,pp in product(self.d.iterkeys(), poly.d.iterkeys()))
        )

    def __call__(self, x):
        return sum(c*x**p for p,c in self.d.iteritems())

    def __str__(self):
        clauses = sorted(((p,c) for p,c in self.d.iteritems() if c != 0), reverse=True)
        return " + ".join("{}x^{}".format(c,p) for p,c in clauses) or "0"
Note that:
- Each method is short and does only things relevant to what it is supposed to accomplish.
- I purposefully wrote __init__ to be very fault-tolerant; it will cheerfully accept multiple coefficients of a given power and sum them. This allowed me to greatly simplify __add__ and __mul__, basically just throwing all the resulting clauses at a new Poly and letting it clean them up again.
- I have included a minimal implementation of __str__, which will result in moderately ugly output like 5x^2 + -2x^1 + -5x^0. You may wish to add special handling for negative coefficients and powers of 1 or 0, to make it produce 5x^2 - 2x - 5 instead (a sketch of that follows below).
- This is for the purpose of understanding, not plagiarism; do not submit it to your teacher as is, he will never in a million years believe you actually wrote it ;-)
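For example, a fancier __str__ along the lines of that note might look something like this (a rough sketch, not rigorously tested; adapt to taste):
def __str__(self):
    clauses = sorted(((p, c) for p, c in self.d.iteritems() if c != 0), reverse=True)
    if not clauses:
        return "0"
    parts = []
    for p, c in clauses:
        sign = "-" if c < 0 else "+"
        mag = abs(c)
        if p == 0:
            term = "{}".format(mag)
        elif mag == 1:
            term = "x" if p == 1 else "x^{}".format(p)
        else:
            term = "{}x".format(mag) if p == 1 else "{}x^{}".format(mag, p)
        parts.append((sign, term))
    # the leading term keeps a "-" but drops a "+"
    first_sign, first_term = parts[0]
    pieces = [first_term if first_sign == "+" else "-" + first_term]
    for sign, term in parts[1:]:
        pieces.append("{} {}".format(sign, term))
    return " ".join(pieces)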
I have written the following recursive function, but am incurring a runtime error because the maximum recursion depth is exceeded. I was wondering: is it possible to write an iterative function to overcome this?
def finaldistance(n):
    if n%2 == 0:
        return 1 + finaldistance(n//2)
    elif n != 1:
        a = finaldistance(n-1)+1
        b = distance(n)
        return min(a,b)
    else:
        return 0
What I have tried is this, but it does not seem to be working:
def finaldistance(n, acc):
    while n > 1:
        if n%2 == 0:
            (n, acc) = (n//2, acc+1)
        else:
            a = finaldistance(n-1, acc) + 1
            b = distance(n)
            if a < b:
                (n, acc) = (n-1, acc+1)
            else:
                (n, acc) = (1, acc + distance(n))
    return acc
Johnbot's solution shows you how to solve your specific problem. How in general can we remove this recursion? Let me show you how, by making a series of small, clearly correct, clearly safe refactorings.
First, here's a slightly rewritten version of your function. I hope you agree it is the same:
def f(n):
    if n % 2 == 0:
        return 1 + f(n // 2)
    elif n != 1:
        a = f(n - 1) + 1
        b = d(n)
        return min(a, b)
    else:
        return 0
I want the base case to be first. This function is logically the same:
def f(n):
    if n == 1:
        return 0
    if n % 2 == 0:
        return 1 + f(n // 2)
    a = f(n - 1) + 1
    b = d(n)
    return min(a, b)
I want the code that comes after each recursive call to be a method call and nothing else. These functions are logically the same:
def add_one(n, x):
    return 1 + x

def min_distance(n, x):
    a = x + 1
    b = d(n)
    return min(a, b)

def f(n):
    if n == 1:
        return 0
    if n % 2 == 0:
        return add_one(n, f(n // 2))
    return min_distance(n, f(n - 1))
Similarly, we add helper functions that compute the recursive argument:
def half(n):
    return n // 2

def less_one(n):
    return n - 1

def f(n):
    if n == 1:
        return 0
    if n % 2 == 0:
        return add_one(n, f(half(n)))
    return min_distance(n, f(less_one(n)))
Again, make sure you agree that this program is logically the same. Now I'm going to simplify the computation of the argument:
def get_argument(n):
    return half if n % 2 == 0 else less_one

def f(n):
    if n == 1:
        return 0
    argument = get_argument(n)  # argument is a function!
    if n % 2 == 0:
        return add_one(n, f(argument(n)))
    return min_distance(n, f(argument(n)))
Now I'm going to do the same thing to the code after the recursion, and we'll get down to a single recursion:
def get_after(n):
    return add_one if n % 2 == 0 else min_distance

def f(n):
    if n == 1:
        return 0
    argument = get_argument(n)
    after = get_after(n)  # this is also a function!
    return after(n, f(argument(n)))
Now I'm noticing that we're passing n to get_after, and then passing it right along to "after" again. I'm going to curry these functions to eliminate that problem. This step is tricky. Make sure you understand it!
def add_one(n):
    return lambda x: x + 1

def min_distance(n):
    def nested(x):
        a = x + 1
        b = d(n)
        return min(a, b)
    return nested
These functions did take two arguments. Now they take one argument, and return a function that takes one argument! So we refactor the use site:
def get_after(n):
    return add_one(n) if n % 2 == 0 else min_distance(n)
and here:
def f(n):
    if n == 1:
        return 0
    argument = get_argument(n)
    after = get_after(n)  # now this is a function of one argument, not two
    return after(f(argument(n)))
Similarly we notice that we are calling get_argument(n)(n) to get the argument. Let's simplify that:
def get_argument(n):
    return half(n) if n % 2 == 0 else less_one(n)
And let's make it just slightly more general:
base_case_value = 0

def is_base_case(n):
    return n == 1

def f(n):
    if is_base_case(n):
        return base_case_value
    argument = get_argument(n)
    after = get_after(n)
    return after(f(argument))
OK, we now have our program in an extremely compact form. The logic has been spread out into multiple functions, and some of them are curried, to be sure. But now that the function is in this form, we can easily remove the recursion. The really tricky bit is turning the whole thing into an explicit stack:
def f(n):
    # Let's make a stack of afters.
    afters = []
    while not is_base_case(n):
        argument = get_argument(n)
        after = get_after(n)
        afters.append(after)
        n = argument
    # Now we have a stack of afters:
    x = base_case_value
    while len(afters) != 0:
        after = afters.pop()
        x = after(x)
    return x
Study this implementation very carefully. You will learn a lot from it. Remember, when you do a recursive call:
after(f(something))
you are saying that after is the continuation -- the thing that comes next -- of the call to f. We typically implement continuations by putting information about the location in the caller's code onto the "call stack". What we're doing in this removal of recursion is simply moving continuation information off of the call stack and onto a stack data structure. But the information is exactly the same.
The important thing to realize here is that we typically think of the call stack as "what is the thing that happened in the past that got me here?". That is exactly backwards. The call stack tells you what you have to do after this call is finished! So that's the information that we encode in the explicit stack. Nowhere do we encode what we did before each step as we "unwind the stack", because we don't need that information.
As I said in my initial comment: there is always a way to turn a recursive algorithm into an iterative one but it is not always easy. I've shown you here how to do it: carefully refactor the recursive method until it is extremely simple. Get it down to a single recursion by refactoring it. Then, and only then, apply this transformation to get it into an explicit stack form. Practice that until you are comfortable with this program transformation. You can then move on to more advanced techniques for removing recursions.
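If it helps, here is the whole chain of refactorings assembled into one runnable sketch. Note that d below is a stand-in, since the question's distance() function was never shown:
def d(n):
    # stand-in for the distance() function from the question
    return n

base_case_value = 0

def is_base_case(n):
    return n == 1

def get_argument(n):
    return n // 2 if n % 2 == 0 else n - 1

def add_one(n):
    return lambda x: x + 1

def min_distance(n):
    return lambda x: min(x + 1, d(n))

def get_after(n):
    return add_one(n) if n % 2 == 0 else min_distance(n)

def f(n):
    afters = []
    while not is_base_case(n):
        afters.append(get_after(n))
        n = get_argument(n)
    x = base_case_value
    while afters:
        x = afters.pop()(x)
    return x

print(f(10))  # same result as the recursive version, no deep call stack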
Note that of course this is almost certainly not the "pythonic" way to solve this problem; you could likely build a much more compact, understandable method using lazily evaluated list comprehensions. This answer was intended to answer the specific question that was asked: how in general do we turn recursive methods into iterative methods?
I mentioned in a comment that a standard technique for removing a recursion is to build an explicit list as a stack. This shows that technique. There are other techniques: tail recursion, continuation passing style and trampolines. This answer is already too long, so I'll cover those in a follow-up answer.
Read this answer after you read my first answer.
Again, we are answering the question in general of "how do you turn a recursive algorithm into an iterative algorithm", in this case in Python. As noted previously, this is about exploring the general idea of transforming a program; this is not the "pythonic" way to solve the specific problem.
In my first answer I started by rewriting the program into this form:
def f(n):
    if is_base_case(n):
        return base_case_value
    argument = get_argument(n)
    after = get_after(n)
    return after(f(argument))
And then transformed it into this form:
def f(n):
    # Let's make a stack of afters.
    afters = []
    while not is_base_case(n):
        argument = get_argument(n)
        after = get_after(n)
        afters.append(after)
        n = argument
    # Now we have a stack of afters:
    x = base_case_value
    while len(afters) != 0:
        after = afters.pop()
        x = after(x)
    return x
The technique here is to construct an explicit stack of "after" calls for a particular input, and then once we have it, run down the whole stack. We are essentially simulating what the runtime already does: constructs a stack of "continuations" that say what to do next.
A different technique is to let the function itself decide what to do with its continuation; this is called "continuation passing style". Let's explore it.
This time, we're going to add a parameter c to the recursive method f. c is a function that takes what would normally be the return value of f, and does whatever was supposed to happen after the call to f. That is, it is explicitly the continuation of f. The method f then becomes "void returning".
The base case is easy. What do we do if we're in the base case? We call the continuation with the value we would have returned:
def f(n, c):
    if is_base_case(n):
        c(base_case_value)
        return
Easy peasy. What about the non-base case? Well, what were we going to do in the original program? We were going to (1) get the arguments, (2) get the "after" -- the continuation of the recursive call, (3) do the recursive call, (4) call "after", its continuation, and (5) return the computed value to whatever the continuation of f is.
We're going to do all the same things, except that when we do step (3) now we need to pass in a continuation that does steps 4 and 5:
argument = get_argument(n)
after = get_after(n)
f(argument, lambda x: c(after(x)))
Hey, that is so easy! What do we do after the recursive call? Well, we call after with the value returned by the recursive call. But now that value is going to be passed to the recursive call's continuation function, so it just goes into x. What happens after that? Well, whatever was going to happen next, and that's in c, so it needs to be called, and we're done.
Let's try it out. Previously we would have said
print(f(100))
but now we have to pass in what happens after f(100). Well, what happens is, the value gets printed!
f(100, print)
and we're done.
So... big deal. The function is still recursive. Why is this interesting? Because the function is now tail recursive! That is, the last thing it does in the non-base case is call itself. Consider a silly case:
def tailcall(x, sum):
    if x <= 0:
        return sum
    return tailcall(x - 1, sum + x)
If we call tailcall(10, 0) it calls tailcall(9, 10), which calls tailcall(8, 19), and so on. But we can rewrite any tail-recursive method into a loop very, very easily:
def tailcall(x, sum):
    while True:
        if x <= 0:
            return sum
        x, sum = x - 1, sum + x
So can we do the same thing with our general case?
# This is wrong!
def f(n, c):
    while True:
        if is_base_case(n):
            c(base_case_value)
            return
        argument = get_argument(n)
        after = get_after(n)
        n = argument
        c = lambda x: c(after(x))
Do you see what is wrong? The lambda is closed over c and after, which means that every lambda will use the current value of c and after, not the value it had when the lambda was created. So this is broken, but we can fix it easily by creating a scope which introduces new variables every time it is invoked:
def continuation_factory(c, after):
    return lambda x: c(after(x))

def f(n, c):
    while True:
        if is_base_case(n):
            c(base_case_value)
            return
        argument = get_argument(n)
        after = get_after(n)
        n = argument
        c = continuation_factory(c, after)
And we're done! We've turned this recursive algorithm into an iterative algorithm.
Or... have we?
Think about this really carefully before you read on. Your spider sense should be telling you that something is wrong here.
The problem we started with was that a recursive algorithm is blowing the stack. We've turned this into an iterative algorithm -- there's no recursive call at all here! We just sit in a loop updating local variables.
The question though is -- what happens when the final continuation is called, in the base case? What does that continuation do? Well, it calls its after, and then it calls its continuation. What does that continuation do? Same thing.
All we've done here is moved the recursive control flow into a collection of function objects that we've built up iteratively, and calling that thing is still going to blow the stack. So we haven't actually solved the problem.
Or... have we?
What we can do here is add one more level of indirection, and that will solve the problem. (This solves every problem in computer programming except one problem; do you know what that problem is?)
What we'll do is we'll change the contract of f so that it is no longer "I am void-returning and will call my continuation when I'm done". We will change it to "I will return a function that, when it is called, calls my continuation. And furthermore, my continuation will do the same."
That sounds a little tricky but really it's not. Again, let's reason it through. What does the base case have to do? It has to return a function which, when called, calls my continuation. But my continuation already meets that requirement:
def f(n, c):
    if is_base_case(n):
        return c(base_case_value)
What about the recursive case? We need to return a function, which when called, executes the recursion. The continuation of that call needs to be a function that takes a value and returns a function that when called executes the continuation on that value. We know how to do that:
argument = get_argument(n)
after = get_after(n)
return lambda : f(argument, lambda x: lambda: c(after(x)))
OK, so how does this help? We can now move the loop into a helper function:
def trampoline(f, n, c):
    t = f(n, c)
    while t != None:
        t = t()
And call it:
trampoline(f, 3, print)
And holy goodness it works.
Follow along what happens here. Here's the call sequence with indentation showing stack depth:
trampoline(f, 3, print)
    f(3, print)
What does this call return? It effectively returns lambda : f(2, lambda x: lambda : print(min_distance(x))), so that's the new value of t.
That's not None, so we call t(), which calls:
f(2, lambda x: lambda : print(min_distance(x)))
What does that thing do? It immediately returns
lambda : f(1,
    lambda x:
        lambda:
            (lambda x: lambda : print(min_distance(x)))(add_one(x)))
So that's the new value of t. It's not None, so we invoke it. That calls:
f(1,
    lambda x:
        lambda:
            (lambda x: lambda : print(min_distance(x)))(add_one(x)))
Now we're in the base case, so we call the continuation, substituting 0 for x. It returns:
lambda: (lambda x: lambda : print(min_distance(x)))(add_one(0))
So that's the new value of t. It's not None, so we invoke it.
That calls add_one(0) and gets 1. It then passes 1 for x in the middle lambda. That thing returns:
lambda : print(min_distance(1))
So that's the new value of t. It's not None, so we invoke it. And that calls
print(min_distance(1))
Which prints out the correct answer, print returns None, and the loop stops.
Notice what happened there. The stack never got more than two deep because every call returned a function that said what to do next to the loop, rather than calling the function.
If this sounds familiar, it should. Basically what we're doing here is making a very simple work queue. Every time we "enqueue" a job, it is immediately dequeued, and the only thing the job does is enqueues the next job by returning a lambda to the trampoline, which sticks it in its "queue", the variable t.
We break the problem up into little pieces, and make each piece responsible for saying what the next piece is.
Now, you'll notice that we end up with arbitrarily deep nested lambdas, just as we ended up in the previous technique with an arbitrarily deep queue. Essentially what we've done here is moved the workflow description from an explicit list into a network of nested lambdas, but unlike before, this time we've done a little trick to avoid those lambdas ever calling each other in a manner that increases the stack depth.
Once you see this pattern of "break it up into pieces and describe a workflow that coordinates execution of the pieces", you start to see it everywhere. This is how Windows works; each window has a queue of messages, and messages can represent portions of a workflow. When a portion of a workflow wishes to say what the next portion is, it posts a message to the queue, and it runs later. This is how async await works -- again, we break up the workflow into pieces, and each await is the boundary of a piece. It's how generators work, where each yield is the boundary, and so on. Of course they don't actually use trampolines like this, but they could.
The key thing to understand here is the notion of continuation. Once you realize that you can treat continuations as objects that can be manipulated by the program, any control flow becomes possible. Want to implement your own try-catch? try-catch is just a workflow where every step has two continuations: the normal continuation and the exceptional continuation. When there's an exception, you branch to the exceptional continuation instead of the regular continuation. And so on.
The question here was again, how do we eliminate an out-of-stack caused by a deep recursion in general. I've shown that any recursive method of the form
def f(n):
    if is_base_case(n):
        return base_case_value
    argument = get_argument(n)
    after = get_after(n)
    return after(f(argument))
...
print(f(10))
can be rewritten as:
def f(n, c):
    if is_base_case(n):
        return c(base_case_value)
    argument = get_argument(n)
    after = get_after(n)
    return lambda : f(argument, lambda x: lambda: c(after(x)))
...
trampoline(f, 10, print)
and that the "recursive" method will now use only a very small, fixed amount of stack.
First you need to find all the values of n; luckily, your sequence is strictly descending and only depends on the next distance:
values = []
while n > 1:
    values.append(n)
    n = n // 2 if n % 2 == 0 else n - 1
Next you need to calculate the distance at each value. To do that, we need to start from the bottom:
values.reverse()
And now we can easily keep track of the previous distance if we need it to calculate the next distance.
distance_so_far = 0
for v in values:
    if v % 2 == 0:
        distance_so_far += 1
    else:
        distance_so_far = min(distance(v), distance_so_far + 1)
return distance_so_far
Stick it all together:
def finaldistance(n):
    values = []
    while n > 1:
        values.append(n)
        n = n // 2 if n % 2 == 0 else n - 1
    values.reverse()
    distance_so_far = 0
    for v in values:
        if v % 2 == 0:
            distance_so_far += 1
        else:
            distance_so_far = min(distance(v), distance_so_far + 1)
    return distance_so_far
And now you're using memory instead of stack.
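A quick way to try it out; distance() was never shown in the question, so the stub below is just a stand-in:
def distance(n):
    # stand-in for the real distance function
    return n

print(finaldistance(20))  # 20 -> 10 -> 5 -> 4 -> 2 -> 1, so this prints 5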
(I don't program in Python, so this is probably not idiomatic Python.)
I have tried to write a merge sort function, as you can see below. But when I try to test it I receive an error:
the name mergesort is not defined
Can anyone point out the cause of this error?
def merge(self,a,b):
    sorted_list=[]
    while len(a)!=0 and len(b)!=0:
        if a[0].get_type()<b[0].get_type():
            sorted_list.append(a[0])
            a.remove(a[0])
        else:
            sorted_list.append(b[0])
            b.remove(b[0])
    if len(a)==0:
        sorted_list+=b
    else:
        sorted_list+=a
    return sorted_list

def mergesort(self,lis):
    if len(lis) == 0 or len(lis) == 1:
        return lis
    else:
        middle = len(lis)// 2
        a = mergesort(lis[middle:])  #in pycharm the next 3 lines are red underlined
        b = mergesort(lis[middle:])
        return merge(a,b)
The fact that self is one of the parameters to these methods means that they're most likely part of a class (one you've omitted from your post).
If that's correct, you need to invoke using self.mergesort(l), where l is a list.
As a pre-emptive measure against the next error you'll discover, you need to replace return merge(a, b) with return self.merge(a, b) for similar reasons.
Finally, I have to ask why you're defining all of these functions as methods of a class. They don't seem to rely on any shared data. Are you confident that they wouldn't be more appropriately declared at the module scope?
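In that case, the module-scope version might look roughly like this. One more thing to watch: both recursive calls in the posted code slice lis[middle:], which drops the first half of the list, so this sketch splits into lis[:middle] and lis[middle:]. It also assumes, as in the post, that the elements have a get_type() method:
def merge(a, b):
    sorted_list = []
    while a and b:
        if a[0].get_type() < b[0].get_type():
            sorted_list.append(a.pop(0))
        else:
            sorted_list.append(b.pop(0))
    # one of the two is empty now; append whatever is left
    return sorted_list + a + b

def mergesort(lis):
    if len(lis) <= 1:
        return lis
    middle = len(lis) // 2
    a = mergesort(lis[:middle])
    b = mergesort(lis[middle:])
    return merge(a, b)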
While I understand that tail recursion optimization is non-Pythonic, I came up with a quick hack for a question on here that was deleted as soon as I was ready to post.
With a 1000-frame stack limit, deep recursion algorithms are not usable in Python. But sometimes recursion is great for initial thoughts while working through a solution. Since functions are first class in Python, I played with returning a valid function and the next value, then calling the process in a loop until done with single calls. I'm sure this isn't new.
What I found interesting is that I expected the extra overhead of passing the function back and forth to make this slower than normal recursion. During my crude testing I found it to take 30-50% of the time of normal recursion. (With an added bonus of allowing LONG recursions.)
Here is the code I'm running:
from contextlib import contextmanager
import time

# Timing code from StackOverflow most likely.
@contextmanager
def time_block(label):
    start = time.clock()
    try:
        yield
    finally:
        end = time.clock()
        print ('{} : {}'.format(label, end - start))

# Purely Recursive Function
def find_zero(num):
    if num == 0:
        return num
    return find_zero(num - 1)

# Function that returns tuple of [method], [call value]
def find_zero_tail(num):
    if num == 0:
        return None, num
    return find_zero_tail, num - 1

# Iterative recurser
def tail_optimize(method, val):
    while method:
        method, val = method(val)
    return val

with time_block('Pure recursion: 998'):
    find_zero(998)

with time_block('Tail Optimize Hack: 998'):
    tail_optimize(find_zero_tail, 998)

with time_block('Tail Optimize Hack: 1000000'):
    tail_optimize(find_zero_tail, 10000000)

# One Run Result:
# Pure recursion: 998 : 0.000372791020758
# Tail Optimize Hack: 998 : 0.000163852100569
# Tail Optimize Hack: 1000000 : 1.51006975627
Why is the second style faster?
My guess is the overhead with creating entries on the stack, but I'm not sure how to find out.
Edit:
In playing with call counts, I made a loop to try both at various num values. Recursion was much closer to parity when I was looping and calling multiple times.
So, I added this before the timing; it is find_zero under a new name:
def unrelated_recursion(num):
    if num == 0:
        return num
    return unrelated_recursion(num - 1)

unrelated_recursion(998)
Now the tail optimized call is 85% of the time of the full recursion.
So my theory is that the 15% penalty is the overhead of the larger stack, versus a single stack frame.
The reason I saw such a huge disparity in execution time when only running each once was the penalty for allocation of the stack memory and structure. Once that is allocated, the cost of using them is drastically lowered.
Because my algorithm is dead simple, the memory structure allocation is a large portion of the execution time.
When I cut my stack priming call to unrelated_recursion(499), I get about halfway between the fully primed and unprimed stack in find_zero(998) execution time. This is consistent with the theory.
As a comment helpfully reminded me, I was not really answering the question, so here is my sentiment:
In your optimization, you're allocating, unpacking and deallocating tuples, so I tried it without them:
# Function that returns only the next call value
def find_zero_tail(num):
    if num == 0:
        return None
    return num - 1

# Iterative recurser
def tail_optimize(method, val):
    while val:
        val = method(val)
    return val
For 1000 tries, each starting with value = 998:
- this version took 0.16s
- your "optimized" version took 0.22s
- the "unoptimized" one took 0.29s
(Note that for me, your optimized version is faster than the un-optimized one ... but we don't do the exact same test.)
But I don't think getting those stats is useful: the cost is more on the side of Python (method calls, tuple allocations, ...) than of your code doing real things. In a real application you'll not end up measuring the cost of 1000 tuples, but the cost of your actual implementation.
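If you do want numbers like these, the timeit module gives steadier measurements than a single clocked run; something along these lines, using the function names from above:
from timeit import timeit

# time 1000 runs of the hack, importing the functions defined above
print(timeit("tail_optimize(find_zero_tail, 998)",
             setup="from __main__ import tail_optimize, find_zero_tail",
             number=1000))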
But simply don't do this: it's just hard to read for almost nothing; you're writing for the reader, not for the machine:
# Function that returns tuple of [method], [call value]
def find_zero_tail(num):
    if num == 0:
        return None, num
    return find_zero_tail, num - 1

# Iterative recurser
def tail_optimize(method, val):
    while method:
        method, val = method(val)
    return val
I won't try to implement a more readable version of it because I'll probably end up with:
def find_zero(val):
    return 0
But I think in real cases there are some nice ways to deal with recursion limits (on both the memory-size and the depth side):
To help with memory (not depth), an lru_cache from functools may typically help a lot:
>>> from functools import lru_cache
>>> @lru_cache()
... def fib(x):
...     return fib(x - 1) + fib(x - 2) if x > 2 else 1
...
>>> fib(100)
354224848179261915075
And for stack size, you may use a list or a deque, depending on your context and usage, instead of using the language stack. Depending on the exact implementation (when you're in fact storing simple sub-computations in your stack to re-use them), it's called dynamic programming:
>>> def fib(x):
...     stack = [1, 1]
...     while len(stack) < x:
...         stack.append(stack[-1] + stack[-2])
...     return stack[-1]
...
>>> fib(100)
354224848179261915075
But, and here comes the nice part of using your own structure instead of the call stack: you don't always need to keep the whole stack around to continue computations:
>>> def fib(x):
...     stack = (1, 1)
...     for _ in range(x - 2):
...         stack = stack[1], stack[0] + stack[1]
...     return stack[1]
...
>>> fib(100)
354224848179261915075
But to conclude with a nice touch of "know the problem before trying to implement it" (unreadable, hard to debug, hard to visually prove; it's bad code, but it's fun):
>>> def fib(n):
...     return (4 << n*(3+n)) // ((4 << 2*n) - (2 << n) - 1) & ((2 << n) - 1)
...
>>> fib(99)
354224848179261915075
If you ask me, the best implementation is the most readable one. For the Fibonacci example that is probably the one with an LRU cache (after changing the ... if ... else ... into a more readable if statement); for another example a deque may be more readable, and for other examples dynamic programming may be better.
"You're writing for the human reading your code, not for the machine".
I am trying to create a set of functions in python that will all do a similar operation on a set of inputs. All of the functions have one input parameter fixed and half of them also need a second parameter. For the sake of simplicity, below is a toy example with only two functions.
Now, I want, in my script, to run the appropriate function, depending on what the user input as a number. Here, the user is the random function (so the minimum example works). What I want to do is something like this:
import random

def function_1(*args):
    return args[0]

def function_2(*args):
    return args[0] * args[1]

x = 10
y = 20
i = random.randint(1,2)
f = function_1 if i==1 else function_2
return_value = f(x,y)
And it works, but it seems messy to me. I would rather have function_1 defined as
def function_1(x):
    return x
Another way would be to define
def function_1(x,y):
    return x
But that leaves me with a dangling y parameter, and it will not work as easily. Is my way the "proper" way of solving my problem, or does a better way exist?
There are a couple of approaches here, all of them adding more boilerplate code.
There is also this PEP which may be interesting to you.
But the 'pythonic' way of doing it is not as elegant as usual function overloading, due to the fact that functions are just class attributes.
So you can either go with a function like this:
def foo(*args):
and then count how many args you've got, which will be very broad but very flexible as well.
Another approach is default arguments:
def foo(first, second=None, third=None)
less flexible but easier to predict; and then lastly you can also use:
def foo(anything)
and detect the type of anything in your function, acting accordingly.
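For example, a quick sketch of that last, type-detecting flavour:
def foo(anything):
    # act according to the argument's type
    if isinstance(anything, (list, tuple)):
        return sum(anything)  # e.g. treat sequences one way
    return anything           # and scalars another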
Your monkey-patching example can work too, but it becomes more complex if you use it with class methods, and does make introspection tricky.
EDIT: Also, for your case you may want to keep the functions separate and write single 'dispatcher' function that will call appropriate function for you depending on the arguments, which is probably best solution considering above.
EDIT2: Based on your comments, I believe that the following approach may work for you:
def weigh_dispatcher(*args, **kwargs):
    # decide which function to call based on args
    if 'somethingspecial' in kwargs:
        return weight2(*args, **kwargs)

def weight_prep(arg):
    pass  # common part here

def weight1(arg1, arg2):
    weight_prep(arg1)
    # rest of the func

def weight2(arg1, arg2, arg3):
    weight_prep(arg1)
    # rest of the func
Alternatively, you can move the common part into the dispatcher, as in the sketch below.
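A sketch of that variant, reusing the names above (illustrative only):
def weigh_dispatcher(*args, **kwargs):
    weight_prep(args[0])  # the common part now runs once, here
    if 'somethingspecial' in kwargs:
        return weight2(*args, **kwargs)
    return weight1(*args)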
You may also have a function with an optional second argument:
def function_1(x, y=None):
    if y is not None:
        return x + y
    else:
        return x
Here's the sample run:
>>> function_1(3)
3
>>> function_1(3, 4)
7
Or even optional multiple arguments! Check this out:
def function_2(x, *args):
    return x + sum(args)
And the sample run:
>>> function_2(3)
3
>>> function_2(3, 4)
7
>>> function_2(3, 4, 5, 6, 7)
25
Here you may refer to args as a list:
def function_3(x, *args):
    if len(args) < 1:
        return x
    else:
        return x + sum(args)
And the sample run:
>>> function_3(1,2,3,4,5)
15
I'm trying to write some Python code that includes union/intersection of sets that potentially can be very large. Much of the time, these sets will be essentially set(xrange(1<<32)) or something of the kind, but often there will be ranges of values that do not belong in the set (say, 'bit 5 cannot be clear'), or extra values thrown in. For the most part, the set contents can be expressed algorithmically.
I can go in and do the dirty work to subclass set and create something, but I feel like this must be something that's been done before, and I don't want to spend days on wheel reinvention.
Oh, and just to make it harder, once I've created the set, I need to be able to iterate over it in random order. Quickly. Even if the set has a billion entries. (And that billion-entry set had better not actually take up gigabytes, because I'm going to have a lot of them.)
Is there code out there? Anyone have neat tricks? Am I asking for the moon?
You say:
For the most part, the set contents can be expressed algorithmically.
How about writing a class which presents the entire set API but determines set inclusion algorithmically? Then add a number of classes which wrap around other sets to perform the union and intersection algorithmically.
For example, if you had a set a and set b which are instances of these pseudo sets:
>>> u = Union(a, b)
And then you use u with the full set API, which will turn around and query a and b using the correct logic. All the set methods could be designed to return these pseudo unions/intersections automatically so the whole process is transparent.
Edit: Quick example with a very limited API:
class Base(object):
    def union(self, other):
        return Union(self, other)
    def intersection(self, other):
        return Intersection(self, other)

class RangeSet(Base):
    def __init__(self, low, high):
        self.low = low
        self.high = high
    def __contains__(self, value):
        return value >= self.low and value < self.high

class Union(Base):
    def __init__(self, *sets):
        self.sets = sets
    def __contains__(self, value):
        return any(value in x for x in self.sets)

class Intersection(Base):
    def __init__(self, *sets):
        self.sets = sets
    def __contains__(self, value):
        return all(value in x for x in self.sets)

a = RangeSet(0, 10)
b = RangeSet(5, 15)
u = a.union(b)
i = a.intersection(b)

print 3 in u
print 7 in u
print 12 in u
print 3 in i
print 7 in i
print 12 in i
Running gives you:
True
True
True
False
True
False
You are trying to make a set containing all the integer values from 0 to 4,294,967,295. A byte is 8 bits, which gets you to 255. 99.9999940628% of your values are over one byte in size. A crude minimum size for your set, even if you are able to overcome the syntactic issues, is 4 billion bytes, or 4 GB.
You are never going to be able to hold an instance of that set in less than a GB of memory. Even with compression, it's likely to be a tough squeeze. You are going to have to get much more clever with your math. You may be able to take advantage of some properties of the set. After all, it's a very special set. What are you trying to do?
If you are using Python 3.0, you can subclass collections.Set.
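A minimal sketch of that approach: the Set ABC only asks you for __contains__, __iter__ and __len__, and then mixes in the comparison operations for you (in later Python versions the ABC lives at collections.abc.Set; the class below is purely illustrative):
import collections  # in later versions: from collections import abc

class EvenNumbers(collections.Set):
    # all even integers in [0, limit)
    def __init__(self, limit):
        self.limit = limit
    def __contains__(self, value):
        return 0 <= value < self.limit and value % 2 == 0
    def __iter__(self):
        return iter(range(0, self.limit, 2))
    def __len__(self):
        return (self.limit + 1) // 2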
This sounds like it might overlap with linear programming. In linear programming you are trying to find some optimal case, where you add constraints to a set of values (typically integers) which initially can be very large. There are various libraries listed at http://wiki.python.org/moin/NumericAndScientific/Libraries that mention integer and linear programming, but nothing jumps out as being obviously what you want.
I would avoid subclassing set, since clearly you can usefully reuse no part of set's implementation. I would even avoid subclassing collections.Set, since the latter requires you to supply a __len__ -- functionality which you appear not to need otherwise, and which just can't be done effectively in the general case (it's going to be O(N), which, with the kind of size you're talking about, is far too slow). You're unlikely to find some existing implementation that matches your use case well enough to be worth reusing, because your requirements are very specific and even peculiar -- the concept of "random iteration and an occasional duplicate is OK", for example, is a really unusual one.
If your specs are complete (you only need union, intersection, and random iteration, plus occasional additions and removals of single items), implementing a special purpose class that fills those specs is not a crazy undertaking. If you have more specs that you have not explicitly mentioned, it will be trickier, but it's hard to guess without hearing all the specs. So for example, something like:
import random

class AbSet(object):
    def __init__(self, predicate, maxitem=1<<32):
        # set of all ints, >=0 and <maxitem, satisfying the predicate
        self.maxitem = maxitem
        self.predicate = predicate
        self.added = set()
        self.removed = set()

    def copy(self):
        x = type(self)(self.predicate, self.maxitem)
        x.added = set(self.added)
        x.removed = set(self.removed)
        return x

    def __contains__(self, item):
        if item in self.removed: return False
        if item in self.added: return True
        return (0 <= item < self.maxitem) and self.predicate(item)

    def __iter__(self):
        # random endless iteration
        while True:
            x = random.randrange(self.maxitem)
            if x in self: yield x

    def add(self, item):
        if item < 0 or item >= self.maxitem: raise ValueError
        if item not in self:
            self.removed.discard(item)
            self.added.add(item)

    def discard(self, item):
        if item < 0 or item >= self.maxitem: raise ValueError
        if item in self:
            self.removed.add(item)
            self.added.discard(item)

    def union(self, o):
        pred = lambda v: self.predicate(v) or o.predicate(v)
        x = type(self)(pred, max(self.maxitem, o.maxitem))
        toadd = [v for v in (self.added | o.added) if not pred(v)]
        torem = [v for v in (self.removed | o.removed) if pred(v)]
        x.added = set(toadd)
        x.removed = set(torem)
        return x

    def intersection(self, o):
        pred = lambda v: self.predicate(v) and o.predicate(v)
        x = type(self)(pred, min(self.maxitem, o.maxitem))
        toadd = [v for v in (self.added & o.added) if not pred(v)]
        torem = [v for v in (self.removed & o.removed) if pred(v)]
        x.added = set(toadd)
        x.removed = set(torem)
        return x
I'm not entirely certain about the logic determining added and removed upon union and intersection, but I hope this is a good base for you to work from.
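For instance, using the question's "bit 5 cannot be clear" example with the AbSet class above:
from itertools import islice

must_have_bit_5 = AbSet(lambda v: bool(v & (1 << 5)))

print(42 in must_have_bit_5)   # True: bit 5 of 42 is set
print(7 in must_have_bit_5)    # False: bit 5 of 7 is clear
for x in islice(must_have_bit_5, 3):
    print(x)                   # three random members, each with bit 5 set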