Pythonic way to implement three similar integer range operators? - python

I am working on a circular problem. In this problem, we have objects that are put on a ring of size MAX and are assigned IDs from 0 to MAX-1.
I have three simple functions to test for range inclusion. inRange(i,j,k) tests whether i is in the circular interval [j,k[ (mnemonic: i inRange(j,k)). I have the same for the ranges ]j,k[ and ]j,k].
The code in those three methods looks duplicated from one method to the next:
def inRange(i, j, k):
    """
    Returns True if i in [j, k[
    * 0 <= i, j, k < MAX
    * no order is assumed between j and k: we can have k < j
    """
    if j <= k:
        return j <= i < k
    # j > k :
    return j <= i or i < k

def inStrictRange(i, j, k):
    """
    Returns True if i in ]j, k[
    * 0 <= i, j, k < MAX
    * no order is assumed between j and k: we can have k < j
    """
    if j <= k:
        return j < i < k
    # j > k :
    return j < i or i < k

def inRange2(i, j, k):
    """
    Returns True if i in ]j, k]
    * 0 <= i, j, k < MAX
    * no order is assumed between j and k: we can have k < j
    """
    if j <= k:
        return j < i <= k
    # j > k :
    return j < i or i <= k
Do you know any cleaner way to implement those three methods? After all, only the operators are changing?!
After thinking of a better solution, I came up with:
from operator import lt, le

def _compare(i, j, k, op1, op2):
    if j <= k:
        return op1(j, i) and op2(i, k)
    return op1(j, i) or op2(i, k)

def inRange(i, j, k):
    return _compare(i, j, k, le, lt)

def inStrictRange(i, j, k):
    return _compare(i, j, k, lt, lt)

def inRange2(i, j, k):
    return _compare(i, j, k, lt, le)
Is it any better? Can you come up with something more intuitive?
In short, what would be the Pythonic way to write these three operators?
Also, I hate the inRange, inStrictRange, inRange2 names, but I can't think of crystal-clear names. Any ideas?
Thanks.

Two Zen of Python principles leap to mind:
Simple is better than complex.
There should be one—and preferably only one—obvious way to do it.
range
The Python built-in function range(start, end) generates a list from start to end.1 The first element of that list is start, and the last element is end - 1.
There is no range_strict function or inclusive_range function. This was very awkward to me when I started in Python. ("I just want a list from a to b inclusive! How hard is that, Guido?") However, the convention used in calling the range function was simple and easy to remember, and the lack of multiple functions made it easy to remember exactly how to generate a range every time.
Recommendation
As you've probably guessed, my recommendation is to only create a function to test whether i is in the range [j, k). In fact, my recommendation is to keep only your existing inRange function.
(Since your question specifically mentions Pythonicity, I would recommend you name the function as in_range to better fit with the Python Style Guide.)
Justification
Why is this a good idea?
The single function is easy to understand. It is very easy to learn how to use it.
Of course, the same could be said for each of your three starting functions. So far so good.
There is only one function to learn. There are not three functions with necessarily similar names.
Given the similar names and behaviours of your three functions, it is somewhat possible that you will, at some point, use the wrong function. This is compounded by the fact that the functions return the same value except for edge cases, which could lead to a hard-to-find off-by-one bug. By only making one function available, you know you will not make such a mistake.
The function is easy to edit.
It is unlikely that you'll need to ever debug or edit such an easy piece of code. However, should you need to do so, you need only edit this one function. With your original three functions, you have to make the same edit in three places. With your revised code in your self-answer, the code is made slightly less intuitive by the operator obfuscation.
The "size" of the range is obvious.
For a given ring where you would use inRange(i, j, k), it is obvious how many elements would be covered by the range [j, k). Here it is in code.
if j <= k:
    size = k - j
if j > k:
    size = k - j + MAX
So therefore
size = (k - j) % MAX
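For example, here is a quick spot check of that formula (my example, assuming MAX = 10, and using the question's own inRange):

MAX = 10                       # assumed example ring size; the question leaves MAX abstract
j, k = 8, 3                    # the wrapped range [8, 3) covers 8, 9, 0, 1, 2
size = (k - j) % MAX           # (3 - 8) % 10 == 5
print(size)                    # 5
print([x for x in range(MAX) if inRange(x, j, k)])  # [0, 1, 2, 8, 9]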
Caveats
I'm approaching this problem from a completely generic point of view, such as that of a person writing a function for a publicly-released library. Since I don't know your problem domain, I can't say whether this is a practical solution.
Using this solution may mean a fair bit of refactoring of the code that calls these functions. Look through this code to see if editing it is prohibitively difficult or tedious.
1: Actually, it is range([start], end, [step]). I trust you get what I mean though.

The Pythonic way to do it is to choose readability, and therefore keep the 3 methods as they were at the beginning.
It's not as if they are HUGE methods, or there are thousands of them, or you would have to generate them dynamically.

No higher-order functions, but it's less code, even with the extraneous else.
def exclusive(i, j, k):
    if j <= k:
        return j < i < k
    else:
        return j < i or i < k

def inclusive_left(i, j, k):
    return i == j or exclusive(i, j, k)

def inclusive_right(i, j, k):
    return i == k or exclusive(i, j, k)
I actually tried switching the identifiers to n, a, b, but the code began to look less cohesive. (My point: perfecting this code may not be a productive use of time.)

Now I am thinking of something such as:
def comparator(lop, rop):
    def comp(i, j, k):
        if j <= k:
            return lop(j, i) and rop(i, k)
        return lop(j, i) or rop(i, k)
    return comp

from operator import le, lt

inRange = comparator(le, lt)
inStrictRange = comparator(lt, lt)
inRange2 = comparator(lt, le)
Which looks better indeed.
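A few quick spot checks of the generated functions (my examples, not from the post), using the circular semantics from the question, where the wrapped interval [8, 3) contains 8, 9, 0, 1, 2:

print(inRange(9, 8, 3))        # True: 9 is in [8, 3)
print(inRange(5, 8, 3))        # False: 5 lies outside the wrapped interval
print(inStrictRange(8, 8, 3))  # False: ]8, 3[ excludes the left endpoint
print(inRange2(3, 8, 3))       # True: ]8, 3] includes the right endpoint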

I certainly agree that you need only one function, and that the function should use a (Pythonic) half-open range.
Two suggestions:
1. Use meaningful names for the args: in_range(x, lo, hi) is a big improvement relative to the 2-keystroke cost.
2. Document the fact that the constraint hi < MAX means that it is not possible to express a range that includes all MAX elements. As Wesley remarked, size = (k - j) % MAX, i.e. size = (hi - lo) % MAX, and thus 0 <= size < MAX (both suggestions are sketched below).
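A minimal sketch combining both suggestions (MAX is the ring size from the question; the docstring wording is mine, not the answerer's):

def in_range(x, lo, hi):
    """True if x lies in the half-open circular interval [lo, hi).

    0 <= x, lo, hi < MAX. Because hi < MAX, a range covering all MAX
    elements cannot be expressed: the size is (hi - lo) % MAX, so
    0 <= size < MAX.
    """
    if lo <= hi:
        return lo <= x < hi
    return lo <= x or x < hi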

To make it more familiar to your users, I would have one main in_range function with the same bounds as range(). This makes it much easier to remember, and has other nice properties as Wesley mentioned.
def in_range(i, j, k):
    return (j <= i < k) if j <= k else (j <= i or i < k)
You can certainly use this one alone for all your use cases by adding 1 to j and/or k. If you find that you're using a specific form frequently, then you can define it in terms of the main one:
def exclusive(i, j, k):
    """Excludes both endpoints."""
    return in_range(i, j + 1, k)

def inclusive(i, j, k):
    """Includes both endpoints."""
    return in_range(i, j, k + 1)

def weird(i, j, k):
    """Excludes the left endpoint but includes the right endpoint."""
    return in_range(i, j + 1, k + 1)
This is shorter than mucking around with operators, and is also much less confusing to understand. Also, note that you should use underscores instead of camelCase for function names in Python.

I'd go one step further than Wesley in aping the normal Python 'in range' idiom; I'd write a cyclic_range class:
import itertools

MAX = 10  # or whatever

class cyclic_range(object):
    def __init__(self, start, stop):
        # mod so you can be a bit sloppy with indices, plus -1 means the last element, as with list indices
        self.start = start % MAX
        self.stop = stop % MAX
    def __len__(self):
        return (self.stop - self.start) % MAX
    def __getitem__(self, i):
        return (self.start + i) % MAX
    def __contains__(self, x):
        if (self.start < self.stop):
            return (x >= self.start) and (x < self.stop)
        else:
            return (x >= self.start) or (x < self.stop)
    def __iter__(self):
        for i in xrange(len(self)):
            yield self[i]
    def __eq__(self, other):
        if (len(self) != len(other)): return False
        for a, b in itertools.izip(self, other):
            if (a != b): return False
        return True
    def __hash__(self):
        return (self.start << 1) + self.stop
    def __str__(self):
        return str(list(self))
    def __repr__(self):
        return "cyclic_range(" + str(self.start) + ", " + str(self.stop) + ")"
    # and whatever other list-like methods you fancy
You can then write code like:
if (myIndex in cyclic_range(firstNode, stopNode)):
    blah
to do the equivalent of inRange. To do inStrictRange, write:
if (myIndex in cyclic_range(firstNode + 1, stopNode)):
And to do inRange2:
if (myIndex in cyclic_range(firstNode + 1, stopNode + 1)):
If you don't like doing the additions by hand, how about adding these methods:
def strict(self):
    return cyclic_range(self.start + 1, self.stop)

def right_closed(self):
    return cyclic_range(self.start + 1, self.stop + 1)
And then doing:
if (myIndex in cyclic_range(firstNode, stopNode).strict()):        # inStrictRange
if (myIndex in cyclic_range(firstNode, stopNode).right_closed()):  # inRange2
Whilst this approach is, IMHO, more readable, it does involve doing an allocation, rather than just a function call, which is more expensive - although still O(1). But then if you really cared about performance, you wouldn't be using python!

Related

Top-down approach poorly implemented

I have been practicing some recursive / dynamic programming exercises, but I have been struggling to convert this recursive problem into a top-down one: for example, for f(4,4) the recursive version yields 5, whereas my top-down version yields 2.
I can see the base cases being the same, so I might be messing up the logic somehow, but I haven't found where.
The code is as written:
def funct(n, m):
    if n == 0:
        return 1
    if m == 0 or n < 0:
        return 0
    return funct(n-m, m) + funct(n, m-1)
and the top-down approach is:
def funct_top_down(n, m):
    if n == 0:
        return 1
    if m == 0 or n < 0:
        return 0
    memo = [[-1 for x in range(m+1)] for y in range(n+1)]
    return funct_top_down_imp(n, m, memo)

def funct_top_down_imp(n, m, memo):
    if memo[n][m] == -1:
        if n == 0:
            memo[n][m] = 1
        elif m == 0 or n < 0:
            memo[n][m] = 0
        else:
            memo[n][m] = funct_top_down_imp(n-m, m, memo) + funct_top_down_imp(n, m-1, memo)
    return memo[n][m]
Thanks!
The basic idea is to compute all possible values of funct instead of just one. That sounds like a lot, but since funct needs all previous values anyway, it is actually the same amount of work.
We place the computed values in an (N+1) x (M+1) matrix for the given N, M, so that it holds that
matrix[n][m] == funct(n, m)
Observe that the first row of the matrix will be all 1, since f(0,m) returns 1 for every m. Then, starting from the second row, compute the values left to right using your recursive dependency:
matrix[n][m] = matrix[n-m][m] + matrix[n][m-1]
(don't forget to check the bounds!)
Upon completion, your answer will be in the bottom right corner of the matrix. Good luck!
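A minimal sketch of that bottom-up fill (my code, not the answerer's; the bounds check on n-m is one way to satisfy the "don't forget to check the bounds" remark):

def funct_bottom_up(N, M):
    # matrix[n][m] == funct(n, m); row 0 is all 1 because funct(0, m) == 1
    matrix = [[0] * (M + 1) for _ in range(N + 1)]
    for m in range(M + 1):
        matrix[0][m] = 1
    for n in range(1, N + 1):
        for m in range(1, M + 1):
            left = matrix[n - m][m] if n - m >= 0 else 0  # funct(n-m, m); 0 when n-m < 0
            matrix[n][m] = left + matrix[n][m - 1]        # + funct(n, m-1)
    return matrix[N][M]

print(funct_bottom_up(4, 4))  # 5, matching the recursive version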

Maximum Subarray Recursion

The 'Maximum Subarray' question:
Given an integer array nums, find the contiguous subarray (containing
at least one number) which has the largest sum and return its sum.
Example:
Input: nums = [-2,1,-3,4,-1,2,1,-5,4]
Output: 6
This should be an easy problem, but for some reason, my code runs endlessly when given a big array as input.
I hope someone can help:
This is my code:
def maxSubCaller(nums):
    return maxSub(nums, 0, len(nums)-1)

def maxSub(nums, i, j):
    if i == j:
        return nums[i]
    if j < i or i > j:
        return min(nums)
    sum_res = sum(nums[i:j + 1])
    left_sub = maxSub(nums, i, j-1)
    right_sub = maxSub(nums, i+1, j)
    return max(sum_res, left_sub, right_sub)
This may be because your implementation has a very high time complexity. Every call makes two overlapping recursive calls (on the ranges [i, j-1] and [i+1, j]) and also sums a slice with the sum function, so the amount of work grows exponentially with the length of the input list.
This problem can be solved in linear time proportional to the length of the input list as follows:
def max_sub_arr(nums):
    max_sum = nums[0]
    local_sum = nums[0]
    for i in range(1, len(nums)):
        if nums[i] > local_sum + nums[i]:
            local_sum = nums[i]
        else:
            local_sum += nums[i]
        max_sum = max(local_sum, max_sum)
    return max_sum

if __name__ == "__main__":
    print(max_sub_arr([-2,1,-3,4,-1,2,1,-5,4]))
This is a different algorithm from yours; it loops through the array only once (and doesn't use the sum function), so it should execute much faster.
There are probably further optimisations to this code, and there is no error handling (e.g. when nums is empty), but the above should be fast enough for your problem.
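If you want the empty-list case handled explicitly, one possible choice (an assumption on my part; raising is just one option) is to fail fast before running the loop:

def max_sub_arr_safe(nums):
    # Guard for the empty-input case mentioned above.
    if not nums:
        raise ValueError("nums must contain at least one number")
    return max_sub_arr(nums)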

Converting a function with two recursive calls into an interative function

I've got a function that has two recursive calls and I'm trying to convert it to an iterative function. I've got it figured out where I can do it with one call fairly easily, but I can't figure out how to incorporate the other call.
The function:
def specialMultiplication(n):
    if n < 2:
        return 1
    return n * specialMultiplication(n-1) * specialMultiplication(n-2)
If I just had one of them, it would be really easy:
def specialMult(n, mult=1):
    while n > 1:
        (n, mult) = (n-1, n * mult)  # Or n-2 for the second one
    return mult
I just can't figure out how to add the second call in to get the right answer overall. Thanks!
If you don't mind changing the structure of your algorithm a bit more, you can calculate the values in a bottom-up fashion, starting with the smallest values.
def specialMultiplication(max_n):
    a = b = 1
    for n in range(1, max_n+1):
        a, b = b, a*b*n
    return b
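A quick way to convince yourself the loop matches the original recursion (my check, with the recursive version renamed so the two definitions can coexist):

def special_mult_recursive(n):
    if n < 2:
        return 1
    return n * special_mult_recursive(n-1) * special_mult_recursive(n-2)

print(all(specialMultiplication(n) == special_mult_recursive(n) for n in range(15)))  # True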
Convert the recursion to an iterative function using an auxiliary "todo list":
def specialMultiplication(n):
    to_process = []
    result = 1
    if n >= 2:
        to_process.append(n)
    while to_process:  # while list is not empty
        n = to_process.pop()
        result *= n
        if n >= 3:
            to_process.append(n-1)
        if n >= 4:
            to_process.append(n-2)
    return result
create a work list (to_process)
if n >= 2, add n to the list
while to_process is not empty, pop item from list, multiply to result
if n-1 < 2, don't perform "left" operation (don't append to work list)
if n-2 < 2, don't perform "right" operation (don't append to work list)
This method has the advantage of consuming less stack. I've checked the results against the recursive version for values from 1 to 25 and they were equal.
Note that it's still slow, since the complexity is O(2^n), so it really starts to slow down from about n=30 (the time doubles when n increases by 1). n=28 is computed in 12 seconds on my laptop.
I've successfully used this method to fix a stack overflow problem when performing a flood fill algorithm: Fatal Python error: Cannot recover from stack overflow. During Flood Fill. But here Blckknght's answer is better adapted because it rethinks the way of computing the value from the start.
The OP's function has the same recursive structure as the Fibonacci and Lucas functions, just with different values for f0, f1, and g:
f(0) = f0
f(1) = f1
f(n) = g(f(n-2), f(n-1), n)
This is an example of a recurrence relation. Here is an iterative version of the general solution that calculates f(n) in n steps. It corresponds to a bottom-up tail recursion.
def f(n):
    if not isinstance(n, int):  # Can be loosened a bit
        raise TypeError('Input must be an int')  # Can be more informative
    if n < 0:
        raise ValueError('Input must be non-negative')
    if n == 0:
        return f0
    i, fi_1, fi = 1, f0, f1  # invariant: fi_1, fi = f(i-1), f(i)
    while i < n:
        i += 1
        fi_1, fi = fi, g(fi_1, fi, i)  # restore invariant for new i
    return fi
Blckknght's answer is a simplified version of this.
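As a concrete illustration (my example, not part of the answer), the question's function fits this template with f0 = f1 = 1 and g(a, b, n) = n * a * b:

f0, f1 = 1, 1

def g(f_n_minus_2, f_n_minus_1, n):
    return n * f_n_minus_2 * f_n_minus_1

print(f(5))  # 1440, the same value the recursive specialMultiplication(5) produces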

Index out of range in implementation of a variation of mergesort algorithm in python?

I have written a variation of the merge sort algorithm in Python, based on what I've learnt from the CLRS book, and compared it with the implementation in the introductory computer science book from MIT. I cannot find the problem in my algorithm, and IDLE gives me an index out of range error although everything looks fine to me. I'm unsure if this is due to some confusion in borrowing ideas from the MIT algorithm (see below).
lista = [1,2,3,1,1,1,1,6,7,12,2,7,7,67,4,7,9,6,6,3,1,14,4]

def merge(A, p, q, r):
    q = (p+r)/2
    L = A[p:q+1]
    R = A[q+1:r]
    i = 0
    j = 0
    for k in range(len(A)):
        # if the list R runs out of space and L[i] has nothing to compare
        if i+1 > len(R):
            A[k] = L[i]
            i += 1
        elif j+1 > len(L):
            A[k] = R[j]
            j += 1
        elif L[i] <= R[j]:
            A[k] = L[i]
            i += 1
        elif R[j] <= L[i]:
            A[k] = R[j]
            j += 1
        # when both the sub arrays have run out and all the ifs and elifs are done,
        # the for loop has effectively ended
    return A

def mergesort(A, p, r):
    """A is the list, p is the first index and r is the last index for which
    the portion of the list is to be sorted."""
    q = (p+r)/2
    if p < r:
        mergesort(A, p, q)
        mergesort(A, q+1, r)
        merge(A, p, q, r)
    return A

print mergesort(lista, 0, len(lista)-1)
I have followed the pseudocode in CLRS as closely as I could, just without using the "infinity value" at the end of L and R, which would continue to compare (is this less efficient?). I tried to incorporate ideas from the MIT book, such as simply copying the remaining L or R list down into A, to mutate A and return a sorted list. However, I can't seem to find what has gone wrong with it. Also, I don't get why the pseudocode requires a 'q' as an input, given that q would be calculated as (p+r)/2 for the middle index anyway. And why is there a need to put p
On the other hand, from the MIT book, we have something that looks really elegant.
def merge(left, right, compare):
    """Assumes left and right are sorted lists and
    compare defines an ordering on the elements.
    Returns a new sorted (by compare) list containing the
    same elements as (left + right) would contain.
    """
    result = []
    i, j = 0, 0
    while i < len(left) and j < len(right):
        if compare(left[i], right[j]):
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    while i < len(left):
        result.append(left[i])
        i += 1
    while j < len(right):
        result.append(right[j])
        j += 1
    return result

import operator

def mergeSort(L, compare=operator.lt):
    """Assumes L is a list, compare defines an ordering
    on elements of L.
    Returns a new sorted list containing the same elements as L"""
    if len(L) < 2:
        return L[:]
    else:
        middle = len(L) // 2
        left = mergeSort(L[:middle], compare)
        right = mergeSort(L[middle:], compare)
        return merge(left, right, compare)
Where could I have gone wrong?
Also, I think the key difference in the MIT implementation is that it creates a new list instead of mutating the original one. This makes it harder for me to understand, because I found the CLRS explanation quite clear: I understood it in terms of different layers of recursion sorting the most minute components of the original list (the lists of length 1 that need no sorting), thus "storing" the results of the recursion within the old list itself.
However, thinking about it again, is it right to say that each recursion in the MIT algorithm returns a new "result" list, which is in turn combined with the others?
Thank you!
The fundamental difference between your code and the MIT version is the conditional statement in the mergesort function. Where your if statement is:
if p < r:
theirs is:
if len(L) < 2:
This means that if you were to have, at any point in the recursive call tree, a list of len(A) == 1, it would still call merge on a size 1 or even size 0 list. You can see that this causes problems in the merge function, because then your L, your R, or both sub-lists can end up being of size 0, which then causes an out of bounds index error.
Your problem could then be easily fixed by changing your if statement to something akin to theirs, like len(A) < 2 or r - p < 2.

Subsequent Application of Python First Class Functions: Any Way Cleaner Than Nesting?

I'm working an example to help me learn how to use first-class functions in Python. In general, I'm satisfied with the solution I came up with, except for one line of code that screams "un-Pythonic" to me.
So the problem I'm working with is defined here. The puzzle seeks the single permutation (out of 720 possible) of six simple functions involving "2" that ultimately returns -3.
Here's my solution, which simply dumps every possible six-function permutation and its result.
def perform(fun, arg):
    return fun(arg)

def a(n):
    return n + 2

def d(n):
    return n / 2.

def m(n):
    return n * 2

def p(n):
    return n ** 2

def r(n):
    return n ** 0.5

def s(n):
    return n - 2

if __name__ == "__main__":
    from itertools import permutations
    for i, perm in enumerate(permutations([a, d, m, p, r, s])):
        try:
            k = perform(perm[5], perform(perm[4], perform(perm[3], perform(perm[2], perform(perm[1], perform(perm[0], 0))))))
        except ValueError:
            k = float('nan')
        print "%03d. %s: %8.8f" % (i + 1, ''.join([x.__name__ for x in perm]), k)
The line that doesn't seem right to me is the one with the nested perform calls: k = perform(... perform(...)). What I need to do is apply the first function in the permutation tuple to 0, then apply the second function in the tuple to that result, and so on through the permutation tuple until I arrive at the ultimate result of applying the component functions.
Is there a cleaner way to successively apply the functions in perm to the corresponding results, starting with 0 as an argument? I've toyed with map and recursion, but I haven't been able to hit upon a solution any more elegant than the one above.
Why not simply:
x = init_value
for f in funclist:
    x = f(x)
or, in a slightly fancier way:
value = reduce(lambda x, f: f(x), funclist, init_value)
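In the question's loop, that reduce form would collapse the nested perform calls into a single line; a sketch (perm is the tuple from itertools.permutations, exactly as in the question's code):

try:
    # fold the permutation's functions over the starting value 0
    k = reduce(lambda x, f: f(x), perm, 0)
except ValueError:
    k = float('nan')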
