How to apply a recursive function to land subdivision? - python

I've written a subdivision function that divides a polygon using a bounding-box method. subdivision(coordinates) returns subblockL and subblockR (the left and right halves). If I want to repeat this subdivision until each block's area is less than 200, I need recursion.
For example:
B = subdivision(A)[0], C = subdivision(B)[0], D = subdivision(C)[0] ... until the area drops below 200 (in other words,
subdivision(subdivision(subdivision(A)[0])[0])[0] ...).
How can I simplify this repeated subdivision? And how can I apply subdivision to every block instead of a single block?
while area(subdivision(A)[0]) < 200:
    for i in range(A):
        subdivision(i)[0]

def sd_recursion(x):
    if x == subdivision(A):
        return subdivision(A)
    else:
        return
I'm not sure what function to put in

"What function to put in" is the function itself; that's the definition of recursion.
def sd_recursive(coordinates):
    if area(coordinates) < 200:
        return [coordinates]
    else:
        a, b = subdivision(coordinates)
        return sd_recursive(a) + sd_recursive(b)  # list concatenation, not arithmetic addition
To paraphrase, if the area is less than 200, simply return the polygon itself. Otherwise, divide the polygon into two parts, and return ... the result of applying the same logic to each part in turn.
Recursive functions are challenging because recursive functions are challenging. Until you have wrapped your head around this apparently circular argument, things will be hard to understand. The crucial design point is to have a "base case" which does not recurse, which in other words escapes the otherwise infinite loop of the function calling itself under some well-defined condition. (There's also indirect recursion, where X calls Y which calls X which calls Y ...)
If you are still having trouble, look at one of the many questions about debugging recursive functions. For example, Understanding recursion in Python
I assumed the function should return a list in every case, but there are multiple ways to arrange this, just so long as all parts of the code obey the same convention. Which way to prefer also depends on how the coordinates are represented and what's convenient for your intended caller.
(In Python, ['a'] + ['b'] returns ['a', 'b'] so this is not arithmetic addition of two lists, it's just a convenient way to return a single list from combining two other lists one after the other.)
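To watch the pattern run without the real geometry code, here is a sketch with stand-ins: both area and subdivision below are hypothetical stubs (a "polygon" is represented by just its area, and subdivision splits it into two equal halves), not the asker's actual functions.

```python
def area(polygon):
    return polygon                    # stand-in: a "polygon" is just its area

def subdivision(polygon):
    return polygon / 2, polygon / 2   # stand-in: split into two equal halves

def sd_recursive(polygon):
    if area(polygon) < 200:
        return [polygon]              # base case: small enough, keep as-is
    a, b = subdivision(polygon)
    return sd_recursive(a) + sd_recursive(b)

sd_recursive(1000)  # -> eight blocks of 125.0 each
```

With these stubs, a block of 1000 is halved three times (1000, 500, 250, 125) before every piece is under 200.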
Recursion can always be unrolled; the above can be refactored to
def sd_unrolled(coordinates):
    result = []
    while coordinates:
        if area(coordinates[0]) < 200:
            result.append(coordinates[0])  # small enough: keep this block
            coordinates = coordinates[1:]
            continue
        a, b = subdivision(coordinates[0])
        coordinates = [a, b] + coordinates[1:]
    return result
This is tricky in its own right (but could perhaps be simplified by introducing a few temporary variables) and pretty inefficient or at least inelegant as we keep on copying slices of the coordinates list to maintain the tail while we keep manipulating the head (the first element of the list) by splitting it until each piece is small enough.
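One way to tidy it is an explicit worklist with a temporary variable for the head. The sketch below uses hypothetical stand-ins (a "polygon" is just its area, and subdivision halves it) rather than the asker's real geometry code:

```python
def area(polygon):
    return polygon                    # stand-in: a "polygon" is just its area

def subdivision(polygon):
    return polygon / 2, polygon / 2   # stand-in: split into two equal halves

def sd_worklist(polygons):
    result = []
    while polygons:
        head, polygons = polygons[0], polygons[1:]  # pop the first block
        if area(head) < 200:
            result.append(head)                     # small enough: keep it
        else:
            polygons = list(subdivision(head)) + polygons  # split and retry
    return result

sd_worklist([1000])  # -> eight blocks of 125.0 each
```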

Recursion not breaking

I am trying to solve Euler problem 18, which asks for the maximum total from top to bottom of a triangle of numbers. I am trying to use recursion, but am stuck.
I guess I didn't state my problem clearly earlier. What I am trying to achieve with recursion is to find the sum of the maximum-number path. I start from the top of the triangle, and then check which is bigger: 7 + findsum() or 4 + findsum(), where findsum() is supposed to find the sum of the numbers beneath it. I am storing the sum in the variable result.
The problem is that I don't know the base case for this recursion. I know it should stop when it reaches the bottom row, but I don't know how to write that logic in the program.
pyramid = [[0,0,0,3,0,0,0],
           [0,0,7,0,4,0,0],
           [0,2,0,4,0,6,0],
           [8,0,5,0,9,0,3]]
pos = [0,3]

def downleft(pyramid, pos):   # returns down-left child
    try:
        return pyramid[pos[0]+1][pos[1]-1]
    except:
        return 0

def downright(pyramid, pos):  # returns down-right child
    try:
        return pyramid[pos[0]+1][pos[1]+1]
    except:
        return 0
result = 0

def find_max(pyramid, pos):
    global result
    if downleft(pyramid, pos) + find_max(pyramid, [pos[0]+1, pos[1]-1]) > downright(pyramid, pos) + find_max(pyramid, [pos[0]+1, pos[1]+1]):
        new_pos = [pos[0]+1, pos[1]-1]
        result += downleft(pyramid, pos) + find_max(pyramid, [pos[0]+1, pos[1]-1])
    elif downright(pyramid, pos) + find_max(pyramid, [pos[0]+1, pos[1]+1]) > downleft(pyramid, pos) + find_max(pyramid, [pos[0]+1, pos[1]-1]):
        new_pos = [pos[0]+1, pos[1]+1]
        result += downright(pyramid, pos) + find_max(pyramid, [pos[0]+1, pos[1]+1])
    else:
        return result

find_max(pyramid, pos)
A big part of your problem is that you're recursing a lot more than you need to. You should really only ever call find_max twice recursively, and you need some base-case logic to stop after the last row.
Try this code:
def find_max(pyramid, x, y):
    if y >= len(pyramid):  # base case: we're off the bottom of the pyramid
        return 0           # so return 0 immediately, without recursing
    left_value = find_max(pyramid, x - 1, y + 1)   # first recursive call
    right_value = find_max(pyramid, x + 1, y + 1)  # second recursive call
    if left_value > right_value:
        return left_value + pyramid[y][x]
    else:
        return right_value + pyramid[y][x]
I changed the call signature to have separate values for the coordinates rather than using a tuple, as this made the indexing much easier to write. Call it with find_max(pyramid, 3, 0), and get rid of the global pos list. I also got rid of the result global (the function returns the result).
This algorithm could benefit greatly from memoization, as on bigger pyramids you'll calculate the values of the lower-middle areas many times. Without memoization, the code may be impractically slow for large pyramid sizes.
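As a sketch of that memoization, the standard library's functools.lru_cache can cache the recursive calls directly; the pyramid here is the zero-padded example from the question, stored as tuples so the cached function's environment is immutable:

```python
from functools import lru_cache

# Zero-padded pyramid from the question, as tuples
pyramid = ((0, 0, 0, 3, 0, 0, 0),
           (0, 0, 7, 0, 4, 0, 0),
           (0, 2, 0, 4, 0, 6, 0),
           (8, 0, 5, 0, 9, 0, 3))

@lru_cache(maxsize=None)
def find_max(x, y):
    if y >= len(pyramid):          # base case: off the bottom
        return 0
    left = find_max(x - 1, y + 1)
    right = find_max(x + 1, y + 1)
    return max(left, right) + pyramid[y][x]

find_max(3, 0)  # -> 23, the path 3 + 7 + 4 + 9
```

Each (x, y) pair is now computed once, so the running time is proportional to the number of cells instead of growing exponentially with the height.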
Edit: I see that you are having trouble with the logic of the code. So let's have a look at that.
At each position in the tree you want to make a choice of selecting
the path from this point on that has the highest value. So what
you do is, you calculate the score of the left path and the score of
the right path. I see this is something you try in your current code,
only there are some inefficiencies. You calculate everything
twice (first in the if, then in the elif), which is very expensive. You should only calculate the values of the children once.
You ask for the stopping condition. Well, if you reach the bottom of the tree, what is the score of the path starting at this point? It's just the value in the tree. And that is what you should return at that point.
So the structure should look something like this:
function getScoreAt(x, y):
    if at the bottom row: return valueInTree(x, y)
    valueLeft = getScoreAt(x - 1, y + 1)
    valueRight = getScoreAt(x + 1, y + 1)
    valueHere = max(valueLeft, valueRight) + valueInTree(x, y)
    return valueHere
Extra hint:
Are you aware that in Python negative indices wrap around to the back of the array? So if you do pyramid[pos[0]+1][pos[1]-1] you may actually get to elements like pyramid[1][-1], which is at the other side of the row of the pyramid. What you probably expect is that this raises an error, but it does not.
To fix your problem, you should add explicit bound checks and not rely on try blocks (using try blocks for this is not good style, either).
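The wraparound described in the hint is easy to demonstrate, and a bound-checked accessor is short to write (child below is a hypothetical helper, not part of the question's code):

```python
row = [8, 0, 5, 0, 9, 0, 3]
print(row[-1])  # 3 -- counts from the end of the list instead of raising IndexError

# An explicit bound check avoids relying on exceptions or negative indexing:
def child(pyramid, r, c):
    if 0 <= r < len(pyramid) and 0 <= c < len(pyramid[r]):
        return pyramid[r][c]
    return 0
```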

Memoized to DP solution - Making Change

Recently I read a problem to practice DP. I wasn't able to come up with one, so I tried a recursive solution which I later modified to use memoization. The problem statement is as follows :-
Making Change. You are given n types of coin denominations of values
v(1) < v(2) < ... < v(n) (all integers). Assume v(1) = 1, so you can
always make change for any amount of money C. Give an algorithm which
makes change for an amount of money C with as few coins as possible.
[on problem set 4]
I got the question from here
My solution was as follows :-
def memoized_make_change(L, index, cost, d):
    if index == 0:
        return cost
    if (index, cost) in d:
        return d[(index, cost)]
    count = cost // L[index]  # integer division; plain / would produce a float in Python 3
    val1 = memoized_make_change(L, index-1, cost % L[index], d) + count
    val2 = memoized_make_change(L, index-1, cost, d)
    x = min(val1, val2)
    d[(index, cost)] = x
    return x
This is how I've understood my solution to the problem. Assume that the denominations are stored in L in ascending order. As I iterate from the end to the beginning, I have a choice to either choose a denomination or not choose it. If I choose it, I then recurse to satisfy the remaining amount with lower denominations. If I do not choose it, I recurse to satisfy the current amount with lower denominations.
Either way, at a given function call, I find the best(lowest count) to satisfy a given amount.
Could I have some help in bridging the thought process from here onward to reach a DP solution? I'm not doing this as any HW, this is just for fun and practice. I don't really need any code either, just some help in explaining the thought process would be perfect.
[EDIT]
I recall reading that function calls are expensive and is the reason why bottom up(based on iteration) might be preferred. Is that possible for this problem?
Here is a general approach for converting memoized recursive solutions to "traditional" bottom-up DP ones, in cases where this is possible.
First, let's express our general "memoized recursive solution". Here, x represents all the parameters that change on each recursive call. We want this to be a tuple of positive integers - in your case, (index, cost). I omit anything that's constant across the recursion (in your case, L), and I suppose that I have a global cache. (But FWIW, in Python you should just use the lru_cache decorator from the standard library functools module rather than managing the cache yourself.)
To solve for(x):
    If x in cache: return cache[x]
    Handle base cases, i.e. where one or more components of x is zero
    Otherwise:
        Make one or more recursive calls
        Combine those results into `result`
        cache[x] = result
        return result
The basic idea in dynamic programming is simply to evaluate the base cases first and work upward:
To solve for(x):
    For y starting at (0, 0, ...) and increasing towards x:
        Do all the stuff from above
However, two neat things happen when we arrange the code this way:
As long as the order of y values is chosen properly (this is trivial when there's only one vector component, of course), we can arrange that the results for the recursive call are always in cache (i.e. we already calculated them earlier, because y had that value on a previous iteration of the loop). So instead of actually making the recursive call, we replace it directly with a cache lookup.
Since every component of y will use consecutively increasing values, and will be placed in the cache in order, we can use a multidimensional array (nested lists, or else a Numpy array) to store the values instead of a dictionary.
So we get something like:
To solve for(x):
    cache = multidimensional array sized according to x
    for i in range(first component of x):
        for j in ...:
            (as many loops as needed; better yet, use `itertools.product`)
            If this is a base case, write the appropriate value to cache
            Otherwise, compute "recursive" index values to use, look up
            the values, perform the computation and store the result
    return the appropriate ("last") value from cache
I suggest considering the relationship between the value you are constructing and the values you need for it.
In this case you are constructing a value for index, cost based on:
index-1 and cost
index-1 and cost%L[index]
What you are searching for is a way of iterating over the choices such that you will always have precalculated everything you need.
In this case you can simply change the code to the iterative approach:
for each choice of index, from 0 upwards:
    for each choice of cost:
        compute the value corresponding to (index, cost)
In practice, I find that the iterative approach can be significantly faster (perhaps 4x) for simple problems, as it avoids the overhead of function calls and of checking the cache for preexisting values.
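As a concrete illustration of the iterative approach, here is one common bottom-up formulation. It is a sketch indexed by amount alone rather than by (index, cost), so it is not a line-for-line translation of the memoized version, but it solves the same problem:

```python
def make_change(L, C):
    # L: coin denominations with L[0] == 1, C: the amount to make change for
    INF = float("inf")
    best = [0] + [INF] * C            # best[c] = fewest coins summing to c
    for c in range(1, C + 1):         # base case best[0] = 0 is filled already
        for v in L:
            if v <= c and best[c - v] + 1 < best[c]:
                best[c] = best[c - v] + 1
    return best[C]

make_change([1, 5, 10, 25], 30)  # -> 2 (a quarter and a nickel)
```

Every entry is written once, in increasing order of amount, so each "recursive" value is already in the table when it is needed.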

How's this for collision detection?

I have to write a collide method inside a Rectangle class that takes another Rectangle object as a parameter and returns True if it collides with the rectangle performing the method and False if it doesn't. My solution was to use a for loop that iterates through every value of x and y in one rectangle to see if it falls within the other, but I suspect there might be more efficient or elegant ways to do it. This is the method (I think all the names are pretty self explanatory, just ask if anything isn't clear):
def collide(self, target):
    result = False
    for x in range(self.x, self.x + self.width):
        if x in range(target.get_x(), target.get_x() + target.get_width()):
            result = True
    for y in range(self.y, self.y + self.height):
        if y in range(target.get_y(), target.get_y() + target.get_height()):
            result = True
    return result
Thanks in advance!
The problem of collision detection is a well-known one, so I thought rather than speculate I might search for a working algorithm using a well-known search engine. It turns out that good literature on rectangle overlap is less easy to come by than you might think. Before we move on to that, perhaps I can comment on your use of constructs like
if x in range(target.get_x(),target.get_x()+target.get_width()):
It is to Python's credit that such an obvious expression of your idea actually succeeds as intended. What you may not realize is that, in Python 2, each use of range() creates a list, and the in test then scans that list element by element. (Python 3's range() is a lazy object whose membership test for integers is fast, but the chained comparison below still states the intent more directly.) What I suspect you may have meant is
if target.get_x() <= x < target.get_x()+target.get_width():
(I am using half-open interval testing to reflect your use of range().) This has the merit of replacing up to N membership comparisons with two chained comparisons. By a relatively simple mathematical operation (subtracting target.get_x() from each term in the comparison) we transform this into
if 0 <= x-target.get_x() < target.get_width():
Do not overlook the value of eliminating such redundant method calls, though it's often simpler to save evaluated expressions by assignment for future reference.
Of course, after that scrutiny we have to look with renewed vigor at
for x in range(self.x,self.x+self.width):
This sets a lower and an upper bound on x, and the inner membership test has to be false for every one of those values before the loop concludes there is no collision. It's worth delving beyond the code into the purpose of the algorithm, because any list creation the inner test performs is now repeated many times over (once per unit of the object's width, to be precise). I take the liberty of paraphrasing
for x in range(self.x,self.x+self.width):
if x in range(target.get_x(),target.get_x()+target.get_width()):
result = True
into pseudocode: "if any x between self.x and self.x+self.width lies between the target's x and the target's x+width, then the objects are colliding". In other words, whether two ranges overlap. But you sure are doing a lot of work to find that out.
Also, just because two objects collide in the x dimension doesn't mean they collide in space. In fact, if they do not also collide in the y dimension then the objects are disjoint, otherwise you would assess these rectangles as colliding:
+----+
| |
| |
+----+
+----+
| |
| |
+----+
So you want to know if they collide in BOTH dimensions, not just one. Ideally one would define a one-dimensional collision detection (which by now we just about have ...) and then apply it in both dimensions. I also hope that those accessor functions can be replaced by simple attribute access, and my code is from now on going to assume that's the case.
Having gone this far, it's probably time to take a quick look at the principles in this YouTube video, which makes the geometry relatively clear but doesn't express the formula at all well. It explains the principles quite well as long as you are using the same coordinate system. Basically, two objects A and B overlap horizontally if A's left side falls between B's left and right sides; they also overlap if B's left falls between A's left and right. Both conditions might be true, but since Python's or operator short-circuits, the second comparison runs only when it is needed.
So let's define a one-dimensional overlap function:
def oned_ol(aleft, aright, bleft, bright):
    # half-open spans [aleft, aright) and [bleft, bright) overlap iff
    # either span's left edge lies inside the other span
    return (aleft <= bleft < aright) or (bleft <= aleft < bright)
I'm going to cheat and use this for both dimensions, since the inside of my function doesn't know which dimension's data I am calling it with. If I am correct, the following formulation should do:
def rect_overlap(self, target):
    return oned_ol(self.x, self.x + self.width, target.x, target.x + target.width) \
       and oned_ol(self.y, self.y + self.height, target.y, target.y + target.height)
If you insist on using those accessor methods you will have to re-cast the code to include them. I've done sketchy testing on the 1-D overlap function, and none at all on rect_overlap, so please let me know - caveat lector. Two things emerge.
A superficial examination of code can lead to "optimization" of a hopelessly inefficient algorithm, so sometimes it's better to return to first principles and look more carefully at your algorithm.
If you use expressions as arguments to a function they are available by name inside the function body without the need to make an explicit assignment.
def collide(self, target):
    # self left of target?
    if self.x + self.width < target.x:
        return False
    # self right of target?
    if self.x > target.x + target.width:
        return False
    # self above target?
    if self.y + self.height < target.y:
        return False
    # self below target?
    if self.y > target.y + target.height:
        return False
    return True
Something like that (depends on your coord system, i.e. y positive up or down)
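Putting those separating checks into a minimal, self-contained class makes them easy to exercise; this is a sketch that assumes the question's attribute names (x, y, width, height):

```python
class Rect:
    def __init__(self, x, y, width, height):
        self.x, self.y, self.width, self.height = x, y, width, height

    def collide(self, target):
        # The rectangles are disjoint iff one lies entirely past the
        # other on some axis; otherwise they collide.
        return not (self.x + self.width < target.x or
                    target.x + target.width < self.x or
                    self.y + self.height < target.y or
                    target.y + target.height < self.y)

Rect(0, 0, 4, 4).collide(Rect(2, 2, 4, 4))  # -> True  (they overlap)
Rect(0, 0, 2, 2).collide(Rect(5, 5, 1, 1))  # -> False (far apart)
```

Note the third kind of case from the ASCII diagram: two rectangles stacked with a gap overlap on one axis only, and this method correctly reports no collision for them.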

Empty zeroth element in array/list to eliminate repeated decrementing. Does this improve performance?

I am using Python to solve Project Euler problems. Many require caching the results of past calculations to improve performance, leading to code like this:
pastResults = [None] * 1000000

def someCalculation(integerArgument):
    # return the result of a calculation performed on integerArgument,
    # for example summing the factorials or squares of its digits
    ...

for eachNumber in range(1, 1000001):
    if pastResults[eachNumber - 1] is None:
        pastResults[eachNumber - 1] = someCalculation(eachNumber)
    # perform additional actions with pastResults[eachNumber - 1]
Would the repeated decrementing have an adverse impact on program performance? Would having an empty or dummy zeroth element (so the zero-based array emulates a one-based array) improve performance by eliminating the repeated decrementing?
pastResults = [None] * 1000001

def someCalculation(integerArgument):
    # return the result of a calculation performed on integerArgument,
    # for example summing the factorials or squares of its digits
    ...

for eachNumber in range(1, 1000001):
    if pastResults[eachNumber] is None:
        pastResults[eachNumber] = someCalculation(eachNumber)
    # perform additional actions with pastResults[eachNumber]
I also feel that emulating a one-based array would make the code easier to follow. That is why I do not make the range zero-based with for eachNumber in range(1000000) as someCalculation(eachNumber + 1) would not be logical.
How significant is the additional memory from the empty zeroth element? What other factors should I consider? I would prefer answers that are not confined to Python and Project Euler.
EDIT: Should be is None instead of is not None.
Not really an answer to the question regarding the performance, but rather a general tip about caching previously calculated values. The usual way to do this is to use a map (a Python dict), as this allows you to use more complex keys instead of just integers: floating-point numbers, strings, or even tuples. Also, you won't run into problems if your keys are sparse.
pastResults = {}

def someCalculation(integerArgument):
    if integerArgument not in pastResults:
        pastResults[integerArgument] = ...  # calculation performed on integerArgument
    return pastResults[integerArgument]
Also, there is no need to perform the calculations "in order" using a loop. Just call the function for the value you are interested in, and the if statement will take care that, when invoked recursively, the function is called only once for each argument.
Ultimately, if you are using this a lot (as clearly the case for Project Euler) you can define yourself a function decorator, like this one:
def memo(f):
    f.cache = {}
    def _f(*args, **kwargs):
        if args not in f.cache:
            f.cache[args] = f(*args, **kwargs)
        return f.cache[args]
    return _f
What this does is: it takes a function and defines another function that first checks whether the given parameters can be found in the cache, and otherwise calculates the result of the original function and puts it into the cache. Just add the @memo decorator to your function definitions and it will take care of caching for you.
@memo
def someCalculation(integerArgument):
    # function body
This is syntactic sugar for someCalculation = memo(someCalculation). Note, however, that this will not always work out well. First, the parameters have to be hashable (no lists or other mutable types); second, if you pass parameters that are not relevant for the result (e.g., debugging flags), your cache can grow unnecessarily large, since all the parameters are used as the key.
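The standard library already ships this pattern as functools.lru_cache, so in modern Python you rarely need to write memo yourself. A quick sketch (digit_square_sum is just an invented example calculation):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def digit_square_sum(n):
    # example calculation: sum of the squares of n's digits
    return sum(int(d) ** 2 for d in str(n))

digit_square_sum(123)  # -> 14 (1 + 4 + 9); repeat calls are served from the cache
```

The same caveats apply: arguments must be hashable, and every argument participates in the cache key.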

Tips on improving this function?

This may be quite a green question, but I hope you understand – just started on python and trying to improve. Anyways, wrote a little function to do the "Shoelace Method" of finding the area of a polygon in a Cartesian plane (see this for a refresher).
I want to know how can I improve my method, so I can try out fancy new ways of doing the same old things.
def shoelace(list):
    r_p = 0  # Positive Values
    r_n = 0  # Negative Values
    x, y = [i[0] for i in list], [i[1] for i in list]
    x.append(x[0]), y.append(y[0])
    print(x, y)
    for i in range(len(x)):
        if (i+1) < len(x):
            r_p += (x[i] * y[i+1])
            r_n += (x[i+1] * y[i])
        else:
            break
    return ((abs(r_p - r_n))/2)
Don't use short variable names that need to be commented; use names that indicate the function.
list is the name of the built-in list type, so while Python will let you replace that name, it's a bad idea stylistically.
, should not be used to separate what are supposed to be statements. You can use ;, but it's generally better to just put things on separate lines. In your case, it happens to work because you are using .append for the side effect, but basically what you are doing is constructing the 2-tuple (None, None) (the return values from .append) and throwing it away.
Use built-in functions where possible for standard list transformations. See the documentation for zip, for example. Except you don't really need to perform this transformation; you want to consider pairs of adjacent points, so do that - and take apart their coordinates inside the loop.
However, you can use zip to transform the list of points into a list of pairs-of-adjacent-points :) which lets you write a much cleaner loop. The idea is simple: first, we make a list of all the "next" points relative to the originals, and then we zip the two point-lists together.
return is not a function, so the thing you're returning does not need surrounding parentheses.
Instead of tallying up separate positive and negative values, perform signed arithmetic on a single value.
def shoelace(points):
    signed_double_area = 0
    next_points = points[1:] + points[:1]
    for begin, end in zip(points, next_points):
        begin_x, begin_y = begin
        end_x, end_y = end
        signed_double_area += begin_x * end_y
        signed_double_area -= end_x * begin_y
    return abs(signed_double_area) / 2
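A quick sanity check of the refactored version against shapes whose area is known; the body below condenses the same loop into one statement so the snippet runs standalone:

```python
def shoelace(points):
    signed_double_area = 0
    next_points = points[1:] + points[:1]
    for (begin_x, begin_y), (end_x, end_y) in zip(points, next_points):
        signed_double_area += begin_x * end_y - end_x * begin_y
    return abs(signed_double_area) / 2

shoelace([(0, 0), (4, 0), (0, 3)])          # -> 6.0 (right triangle, legs 4 and 3)
shoelace([(0, 0), (1, 0), (1, 1), (0, 1)])  # -> 1.0 (unit square)
```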
Functionally, your program is quite good. One minor remark: in Python 2, replace range(len(x)) with xrange(len(x)); it makes the program slightly more efficient. Generally, you should use range only in cases where you actually need the full list of values it creates; if all you need is to loop over those values, use xrange. (In Python 3, range is already lazy and xrange is gone, so this advice applies to Python 2 only.)
Also, you don't need the parenthesis in the return statement, nor in the r_p += and r_n += statements.
Regarding style, in Python variable assignments shouldn't be done like you did, but rather with a single space on each side of the = symbol:
r_p = 0
r_n = 0
