Theta(n**2) and Theta(n*lgn) algorithms perform improperly - python

I'm reading Introduction to Algorithms and trying to finish the exercises in the book.
Exercise 4.1-3 reads:
Implement both the brute-force and recursive algorithms for the maximum-subarray problem on your own computer. What problem size n0 gives the crossover
point at which the recursive algorithm beats the brute-force algorithm? Then,
change the base case of the recursive algorithm to use the brute-force algorithm
whenever the problem size is less than n0. Does that change the crossover point?
I wrote the two algorithms according to the book's pseudocode. However, there must be something wrong with my code, because the second one, which is designed to be Theta(n*lgn) and is supposed to run faster, always runs slower than the first, Theta(n**2), one. My code is shown below.
def find_maximum_subarray_bf(a): #bf for brute force
    p1 = 0
    l = 0 # l for left
    r = 0 # r for right
    max_sum = 0
    for p1 in range(len(a)-1):
        sub_sum = 0
        for p2 in range(p1, len(a)):
            sub_sum += a[p2]
            if sub_sum > max_sum:
                max_sum = sub_sum
                l = p1
                r = p2
    return l, r, max_sum
def find_maximum_subarray_dc(a): #dc for divide and conquer
    # subfunction
    # given an array and three indices which can split the array into a[l:m]
    # and a[m+1:r], find out a subarray a[i:j] where l <= i <= m < j <= r.
    # according to the definition above, the target subarray must
    # be combined by two subarrays, a[i:m] and a[m+1:j]
    # Growing Rate: theta(n)
    def find_crossing_max(a, l, r, m):
        # left side
        # ls_r and ls_l indicate the right and left bound of the left subarray.
        # l_max_sum indicates the max sum of the left subarray
        # sub_sum indicates the sum of the current computing subarray
        ls_l = 0
        ls_r = m-1
        l_max_sum = None
        sub_sum = 0
        for j in range(m+1)[::-1]: # adding elements from right to left
            sub_sum += a[j]
            if sub_sum > l_max_sum:
                l_max_sum = sub_sum
                ls_l = j
        # right side
        # rs_r and rs_l indicate the right and left bound of the right subarray.
        # r_max_sum indicates the max sum of the right subarray
        # sub_sum indicates the sum of the current computing subarray
        rs_l = m+1
        rs_r = 0
        r_max_sum = None
        sub_sum = 0
        for j in range(m+1,len(a)):
            sub_sum += a[j]
            if sub_sum > r_max_sum:
                r_max_sum = sub_sum
                rs_r = j
        #combine
        return (ls_l, rs_r, l_max_sum+r_max_sum)
    # subfunction
    # Growing Rate: should be theta(nlgn), but there is something wrong
    def recursion(a,l,r): # T(n)
        if r == l:
            return (l,r,a[l])
        else:
            m = (l+r)//2 # theta(1)
            left = recursion(a,l,m) # T(n/2)
            right = recursion(a,m+1,r) # T(n/2)
            crossing = find_crossing_max(a,l,r,m) # theta(n)
            if left[2]>=right[2] and left[2]>=crossing[2]:
                return left
            elif right[2]>=left[2] and right[2]>=crossing[2]:
                return right
            else:
                return crossing
    #back to master function
    l = 0
    r = len(a)-1
    return recursion(a,l,r)
if __name__ == "__main__":
    from time import time
    a = [100,-10,1,2,-1,4,-6,2,5]
    a *= 2**10
    time0 = time()
    find_maximum_subarray_bf(a)
    time1 = time()
    find_maximum_subarray_dc(a)
    time2 = time()
    print "function 1:", time1-time0
    print "function 2:", time2-time1
    print "ratio:", (time1-time0)/(time2-time1)

First, a mistake in the brute-force version:
    for p1 in range(len(a)-1):
That should be range(len(a)) (or xrange). As is, it wouldn't find the maximum subarray of [-12,10].
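A small check makes the off-by-one visible. This is only a sketch, written for Python 3; the `outer_stop` parameter is hypothetical (it is not in the original code) and exists only to switch between the two loop bounds.

```python
def max_subarray_bf(a, outer_stop):
    """Brute force as in the question; outer_stop bounds the start index.

    outer_stop is a hypothetical parameter added only for this comparison.
    """
    l = r = 0
    max_sum = 0
    for p1 in range(outer_stop):
        sub_sum = 0
        for p2 in range(p1, len(a)):
            sub_sum += a[p2]
            if sub_sum > max_sum:
                max_sum, l, r = sub_sum, p1, p2
    return l, r, max_sum


a = [-12, 10]
buggy = max_subarray_bf(a, len(a) - 1)  # start index 1 is never tried
fixed = max_subarray_bf(a, len(a))
print(buggy)  # (0, 0, 0)
print(fixed)  # (1, 1, 10)
```

With the shorter range, the last element can never be the start of a subarray, so the one-element subarray [10] is missed.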
Now, the recursion:
    def find_crossing_max(a, l, r, m):
        # left side
        # ls_r and ls_l indicate the right and left bound of the left subarray.
        # l_max_sum indicates the max sum of the left subarray
        # sub_sum indicates the sum of the current computing subarray
        ls_l = 0
        ls_r = m-1
        l_max_sum = None
        sub_sum = 0
        for j in range(m+1)[::-1]: # adding elements from right to left
You are checking all the indices down to 0, but you should only check the indices down to l. Instead of constructing the range list and reversing it, use xrange(m,l-1,-1).
            sub_sum += a[j]
            if sub_sum > l_max_sum:
                l_max_sum = sub_sum
                ls_l = j
For the sum to the right, the same holds: you should only check indices up to r, so use xrange(m+1,r+1).
Further, your initial values for the sums and for the indices of the maximum subarray are dubious for the left part, and wrong for the right.
For the left part, we start with an empty sum but must include a[m]. That can be done by setting l_max_sum = None initially, or by setting l_max_sum = a[m] and letting j omit the index m. Either way, the initial value for ls_l should not be 0, and for ls_r it shouldn't be m-1. ls_r must be m, and ls_l should start as m+1 if the initial value for l_max_sum is None, and as m if l_max_sum starts as a[m].
For the right part, r_max_sum must start as 0, and rs_r should preferably start as m (though that isn't really important, it would only give you the wrong indices). If none of the sums on the right is ever non-negative, the right sum should be 0 and not the largest of the negative sums.
In recursion, we have a bit of duplication in
left = recursion(a,l,m) # T(n/2)
the sums including a[m] have already been treated or majorised in find_crossing_max, so that could be
left = recursion(a,l,m-1)
But then one would have to also treat the possibility r < l in recursion, and the repetition is small, so I'll let that stand.
Since you always traverse the entire list in find_crossing_max, and that is called O(n) times, your divide-and-conquer implementation is actually O(n²) too.
If the range checked in find_crossing_max is restricted to [l,r], as it should be, you have (approximately) 2^k calls on ranges of length n/2^k, 0 <= k <= log_2 n, for a total cost of O(n*log n).
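The counting argument can be checked directly. The sketch below (Python 3) does not compute any subarray; it only counts how many array elements all the find_crossing_max calls touch, once for the unrestricted scan over the whole list and once for the scan restricted to [l,r]. The function name `count_visits` is made up for this illustration.

```python
def count_visits(n, restrict):
    """Count elements touched by all find_crossing_max calls for length n."""
    def rec(l, r):
        if l == r:
            return 0
        m = (l + r) // 2
        # The restricted scan touches a[l..r]; the unrestricted scan in the
        # question touches a[0..m] and a[m+1..n-1], i.e. all n elements.
        scanned = (r - l + 1) if restrict else n
        return rec(l, m) + rec(m + 1, r) + scanned
    return rec(0, n - 1)


n = 1024
full = count_visits(n, restrict=False)
restricted = count_visits(n, restrict=True)
print(full)        # 1047552 = n * (n - 1)
print(restricted)  # 10240 = n * log2(n)
```

For n = 1024 the unrestricted version touches about a hundred times as many elements, which is consistent with it behaving like a quadratic algorithm.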
With these changes (and some random array generation),
def find_maximum_subarray_bf(a): #bf for brute force
    p1 = 0
    l = 0 # l for left
    r = 0 # r for right
    max_sum = 0
    for p1 in xrange(len(a)):
        sub_sum = 0
        for p2 in xrange(p1, len(a)):
            sub_sum += a[p2]
            if sub_sum > max_sum:
                max_sum = sub_sum
                l = p1
                r = p2
    return l, r, max_sum
def find_maximum_subarray_dc(a): #dc for divide and conquer
    # subfunction
    # given an array and three indices which can split the array into a[l:m]
    # and a[m+1:r], find out a subarray a[i:j] where l <= i <= m < j <= r.
    # according to the definition above, the target subarray must
    # be combined by two subarrays, a[i:m] and a[m+1:j]
    # Growing Rate: theta(n)
    def find_crossing_max(a, l, r, m):
        # left side
        # ls_r and ls_l indicate the right and left bound of the left subarray.
        # l_max_sum indicates the max sum of the left subarray
        # sub_sum indicates the sum of the current computing subarray
        ls_l = m+1
        ls_r = m
        l_max_sum = None
        sub_sum = 0
        for j in xrange(m,l-1,-1): # adding elements from right to left
            sub_sum += a[j]
            if sub_sum > l_max_sum:
                l_max_sum = sub_sum
                ls_l = j
        # right side
        # rs_r and rs_l indicate the right and left bound of the right subarray.
        # r_max_sum indicates the max sum of the right subarray
        # sub_sum indicates the sum of the current computing subarray
        rs_l = m+1
        rs_r = m
        r_max_sum = 0
        sub_sum = 0
        for j in range(m+1,r+1):
            sub_sum += a[j]
            if sub_sum > r_max_sum:
                r_max_sum = sub_sum
                rs_r = j
        #combine
        return (ls_l, rs_r, l_max_sum+r_max_sum)
    # subfunction
    # Growing Rate: theta(nlgn)
    def recursion(a,l,r): # T(n)
        if r == l:
            return (l,r,a[l])
        else:
            m = (l+r)//2 # theta(1)
            left = recursion(a,l,m) # T(n/2)
            right = recursion(a,m+1,r) # T(n/2)
            crossing = find_crossing_max(a,l,r,m) # theta(r-l+1)
            if left[2]>=right[2] and left[2]>=crossing[2]:
                return left
            elif right[2]>=left[2] and right[2]>=crossing[2]:
                return right
            else:
                return crossing
    #back to master function
    l = 0
    r = len(a)-1
    return recursion(a,l,r)
if __name__ == "__main__":
    from time import time
    from sys import argv
    from random import randint
    alen = 100
    if len(argv) > 1:
        alen = int(argv[1])
    a = [randint(-100,100) for i in xrange(alen)]
    time0 = time()
    print find_maximum_subarray_bf(a)
    time1 = time()
    print find_maximum_subarray_dc(a)
    time2 = time()
    print "function 1:", time1-time0
    print "function 2:", time2-time1
    print "ratio:", (time1-time0)/(time2-time1)
we get roughly what we should expect:
$ python subarrays.py 50
(3, 48, 1131)
(3, 48, 1131)
function 1: 0.000184059143066
function 2: 0.00020382
ratio: 0.902923976608
$ python subarrays.py 100
(29, 61, 429)
(29, 61, 429)
function 1: 0.000745058059692
function 2: 0.000561952590942
ratio: 1.32583792957
$ python subarrays.py 500
(35, 350, 3049)
(35, 350, 3049)
function 1: 0.0115859508514
function 2: 0.00170588493347
ratio: 6.79175401817
$ python subarrays.py 1000
(313, 572, 3585)
(313, 572, 3585)
function 1: 0.0537149906158
function 2: 0.00334000587463
ratio: 16.082304233
$ python osubarrays.py 10000
(901, 2055, 4441)
(901, 2055, 4441)
function 1: 4.20316505432
function 2: 0.0381460189819
ratio: 110.186204655
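The code above targets Python 2 (print statements, xrange, and comparing an int with None). In Python 3 that comparison raises TypeError, so the None sentinel cannot be carried over unchanged. What follows is a minimal Python 3 sketch of the same divide-and-conquer idea, not a port of the exact code above: it assumes a non-empty list, follows the book's convention that both halves of a crossing subarray are non-empty, and seeds each running maximum with the first element it must include. All function names are chosen for this illustration.

```python
import random


def max_crossing(a, lo, mid, hi):
    """Best a[i..j] with lo <= i <= mid < j <= hi, in Theta(hi - lo + 1)."""
    best_left, left_idx, running = a[mid], mid, a[mid]
    for i in range(mid - 1, lo - 1, -1):  # stop at lo, not at 0
        running += a[i]
        if running > best_left:
            best_left, left_idx = running, i
    best_right, right_idx, running = a[mid + 1], mid + 1, a[mid + 1]
    for j in range(mid + 2, hi + 1):  # stop at hi, not at len(a) - 1
        running += a[j]
        if running > best_right:
            best_right, right_idx = running, j
    return left_idx, right_idx, best_left + best_right


def max_subarray_dc(a, lo=0, hi=None):
    """Divide and conquer over a[lo..hi]; returns (left, right, sum)."""
    if hi is None:
        hi = len(a) - 1
    if lo == hi:
        return lo, hi, a[lo]
    mid = (lo + hi) // 2
    candidates = (
        max_subarray_dc(a, lo, mid),
        max_subarray_dc(a, mid + 1, hi),
        max_crossing(a, lo, mid, hi),
    )
    return max(candidates, key=lambda t: t[2])


def max_subarray_bf(a):
    """Brute force over all non-empty subarrays, used as a reference."""
    best = (0, 0, a[0])
    for i in range(len(a)):
        running = 0
        for j in range(i, len(a)):
            running += a[j]
            if running > best[2]:
                best = (i, j, running)
    return best


# Cross-check the two implementations on random inputs (sums only, since
# several subarrays can share the maximum sum).
rng = random.Random(0)
for _ in range(200):
    a = [rng.randint(-100, 100) for _ in range(rng.randint(1, 60))]
    lo, hi, total = max_subarray_dc(a)
    assert total == max_subarray_bf(a)[2]
    assert sum(a[lo:hi + 1]) == total
```

Unlike the brute-force function above, whose max_sum starts at 0, this reference version only considers non-empty subarrays, so the two agree on all-negative inputs as well.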

Related

How to efficiently find minimum distance away from an obstacle given a start and end point in a grid

Given a grid ('.' represents positions that we can move to, '*' represents obstacles, 'S' and 'E' represent the start and end points), I need to find the minimum distance that I will be away from an obstacle on the way from the start to the end. An optimal path is one that maximizes the distance from any obstacle. If there is no path from the start to the end point, 0 is returned. Otherwise, if the closest I ever get to an obstacle along an optimal path is, for example, a Manhattan distance of 2, I return 2. I have tried to use DP to first calculate the distance to the closest obstacle for each point on the grid, and then run a best-first search with a priority queue from the start to the end point. Yet I am still getting Time Limit Exceeded for this problem. How else can I optimize it?
Example Input:
5
..*
.S.
..*
...
..E
Answer: 2
The closest we ever get to an obstacle is 2. The ideal path to take is (1,1) -> (1,0) -> (2,0) -> (3,0) -> (4,0) -> (4,1) -> (4,2)
My Code:
from collections import deque
from queue import PriorityQueue
import math

def findMaximumDistance(grid):
    # Write your code here
    matrix = []
    new_matrix = []
    for row in grid:
        matrix.append(list(row))
        row = [1 if ele != '*' else 0 for ele in row]
        new_matrix.append(list(row))
    # Mark starting point
    m, n = len(matrix), len(matrix[0])
    for i in range(m):
        for j in range(n):
            if matrix[i][j] == 'S':
                start = (i,j)
            elif matrix[i][j]== 'E':
                end = (i,j)
    def updateMatrix(mat):
        m, n = len(mat), len(mat[0])
        for r in range(m):
            for c in range(n):
                if mat[r][c] > 0:
                    top = mat[r - 1][c] if r > 0 else math.inf
                    left = mat[r][c - 1] if c > 0 else math.inf
                    mat[r][c] = min(top, left) + 1
        for r in range(m - 1, -1, -1):
            for c in range(n - 1, -1, -1):
                if mat[r][c] > 0:
                    bottom = mat[r + 1][c] if r < m - 1 else math.inf
                    right = mat[r][c + 1] if c < n - 1 else math.inf
                    mat[r][c] = min(mat[r][c], bottom + 1, right + 1)
        return mat
    new_matrix = updateMatrix(new_matrix)
    def best_first_search(start, end):
        nonlocal min_dist, visited
        DIR = [0, 1, 0, -1, 0]
        pq = PriorityQueue()
        pq.put((-new_matrix[start[0]][start[1]], start))
        while pq:
            dist, pos = pq.get()
            dist = -dist
            # update min_dist
            r, c = pos
            min_dist = min(min_dist, dist)
            visited.append((r,c))
            if pos == end:
                break
            for i in range(4):
                nr, nc = r + DIR[i], c + DIR[i + 1]
                if nr < 0 or nr == m or nc < 0 or nc == n or (nr,nc) in visited: continue
                pq.put((-new_matrix[nr][nc],(nr, nc)))
    visited = []
    min_dist = float('inf')
    best_first_search(start, end)
    return min_dist
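A few things in the code above are likely to cost time: `visited` is a list, so every `(nr,nc) in visited` test is a linear scan; cells are marked visited only when popped, so the same cell can be queued many times; and queue.PriorityQueue adds locking that a single-threaded search does not need. The sketch below is one possible restructuring under stated assumptions, not a verified solution to the original judge: it assumes the grid is a list of equal-length strings with exactly one 'S' and one 'E', that obstacles cannot be entered, and that distance means Manhattan distance to the nearest '*'. The function name is made up for the example.

```python
import heapq
from collections import deque


def find_maximum_distance(grid):
    rows, cols = len(grid), len(grid[0])
    inf = float("inf")
    dist = [[inf] * cols for _ in range(rows)]
    queue = deque()
    start = end = None
    for r in range(rows):
        for c in range(cols):
            ch = grid[r][c]
            if ch == "*":
                dist[r][c] = 0
                queue.append((r, c))
            elif ch == "S":
                start = (r, c)
            elif ch == "E":
                end = (r, c)
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    # Multi-source BFS: distance from every cell to its nearest obstacle.
    while queue:
        r, c = queue.popleft()
        for dr, dc in steps:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] == inf:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    # Best-first search on the bottleneck (smallest distance seen on the path).
    # Keys are negated because heapq is a min-heap.
    seen = [[False] * cols for _ in range(rows)]
    seen[start[0]][start[1]] = True
    heap = [(-dist[start[0]][start[1]], start)]
    while heap:
        neg_bottleneck, (r, c) = heapq.heappop(heap)
        if (r, c) == end:
            return -neg_bottleneck
        for dr, dc in steps:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and not seen[nr][nc] and grid[nr][nc] != "*"):
                seen[nr][nc] = True  # each cell is queued at most once
                key = max(neg_bottleneck, -dist[nr][nc])
                heapq.heappush(heap, (key, (nr, nc)))
    return 0  # end is unreachable


print(find_maximum_distance(["..*", ".S.", "..*", "...", "..E"]))  # 2
```

Marking a cell as seen when it is pushed is safe here because popped keys never improve over time, so the first cell to discover a neighbour already offers it the best available bottleneck. If the grid has no obstacles at all, every distance stays infinite and the function returns infinity.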

Grab 'n' numbers from a given list of numbers with minimum difference between them

I put up a similar question a few hours ago, albeit with a few mistakes and, admittedly, a poor understanding of the problem.
So the question is: from a given list of numbers of arbitrary length, I'm supposed to take an input from the user, say 3, and grab 3 numbers such that the numbers have the least difference between them.
def findMinDiff(arr):
    # Initialize difference as infinite
    diff = 10**20
    n = len(arr)
    # Find the min diff by comparing difference
    # of all possible pairs in given array
    for i in range(n-1):
        for j in range(i+1,n):
            if abs(arr[i]-arr[j]) < diff:
                diff = abs(arr[i] - arr[j])
    # Return min diff
    return diff
def findDiffArray(arr):
    diff = 10**20
    arr_diff = []
    n = len(arr)
    for i in range(n-1):
        arr_diff.append(abs(arr[i]-arr[i+1]))
    return arr_diff
def choosingElements(arr, arr_diff):
arr_choose = []
least_index = 0
least = arr_diff[0]
least_index_array = []
flag = 0
flag2 = 0
for z in range(0,3):
for i in range(0,len(arr_diff)-1):
if arr_diff[i] < least:
if flag > 0:
if i == least_index:
continue
least = arr_diff[i]
least_index = i
least_index_array.append(i)
arr_choose.append(arr[i])
flag += 1
arr_choose.append(arr[i+1])
flag += 1
print("least index is", least_index)
return arr_choose
# Driver code
arr = [1, 5, 3, 19, 18, 25]
arr_diff = findDiffArray(arr)
arr_diff2 = arr_diff.copy()
item_number = int(input("Enter the number of gifts"))
arr_choose = choosingElements(arr, arr_diff2)
print("Minimum difference is " + str(findMinDiff(arr)))
print("Difference array")
print(*arr_diff, sep = "\n")
print("Numbers with least difference for specified items are", arr_choose)
This is what I've tried so far. My idea was to find the differences between the numbers and keep picking the ones with the least difference between them, but I've realised that my approach is probably wrong.
Can anybody kindly help me out? Thanks!
Now, I'm sure the time complexity on this isn't great, and it might be hard to understand, but how about this:
arr = [1, 18, 5, 19, 25, 3]
# calculates length of the overall path
def calc_path_difference(arr, i1, i2, i3):
    return abs(arr[i1] - arr[i2]) + abs(arr[i2] - arr[i3])
# returns dictionary with differences to other numbers in arr from each number
def differences_dict(arr):
    return {
        current: [
            abs(number - current) if abs(number - current) != 0 else float("inf")
            for number in arr
        ]
        for current in arr
    }
differences = differences_dict(arr)
# Just to give some starting point, take the first three elements of arr
current_path = [calc_path_difference(arr, 0, 1, 2), 0, 1, 2]
# Loop 1
for i, num in enumerate(arr):
    # Save some time by skipping numbers whose path
    # already exceeds the min path we currently have
    if not min(differences[num]) < current_path[0]:
        continue
    # Loop 2
    for j, num2 in enumerate(arr):
        # So you can't get 2 of the same index
        if j == i:
            continue
        # some code for making indices i and j of differences
        # infinite so they can't be the smallest, but not sure if
        # this is needed without more tests
        # diff_arr_copy = differences[num2].copy()
        # diff_arr_copy[i], diff_arr_copy[j] = float("inf"), float("inf")
        # Get index of number in arr with smallest difference to num2
        min_index = differences[num2].index(min(differences[num2]))
        # So you can't get 2 of the same index again
        if min_index == i or min_index == j:
            continue
        # Total of current path
        path_total = calc_path_difference(arr, i, j, min_index)
        # Change current path if this one is shorter
        if path_total < current_path[0]:
            current_path = [path_total, i, j, min_index]
Does this work for you? I played around with the order of the elements in the array and it seemed to give the correct output each time but I would have liked to have another example to test it on.
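For comparison, here is a different and simpler technique than the nested loops above: sort the list and slide a window of the requested size over it. It rests on an assumption the question does not confirm, namely that "least difference" means the smallest spread (max minus min) within the chosen group. The function name `closest_group` is made up for this sketch.

```python
def closest_group(numbers, k):
    """Return k values whose spread (max - min) is smallest.

    Assumption: "least difference" means the smallest max - min in the group.
    After sorting, such a group is always a run of k neighbouring values.
    """
    if not 1 <= k <= len(numbers):
        raise ValueError("k must be between 1 and len(numbers)")
    ordered = sorted(numbers)
    best_start = min(
        range(len(ordered) - k + 1),
        key=lambda s: ordered[s + k - 1] - ordered[s],
    )
    return ordered[best_start:best_start + k]


print(closest_group([1, 5, 3, 19, 18, 25], 3))  # [1, 3, 5]
print(closest_group([1, 5, 3, 19, 18, 25], 2))  # [18, 19]
```

Sorting dominates the cost, so this runs in O(n log n) for a list of n numbers and works for any group size, not just 3.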

How to sum elements of the rows of a lattice periodically

Suppose I have a lattice
a = np.array([[1, 1, 1, 1],
[2, 2, 2, 2],
[3, 3, 3, 3],
[4, 4, 4, 4]])
I'd like to make a function func(lattice, start, end) that takes in 3 inputs, where start and end are the index of rows for which the function would sum the elements. For example, for func(a,1,3) it'll sum all the elements of those rows such that func(a,1,3) = 2+2+2+2+3+3+3+3+4+4+4+4 = 36.
Now I know this can be done easily with slicing and np.sum() whatever. But crucially what I want func to do is to also have the ability to wrap around. Namely func(a,2,4) should return 3+3+3+3+4+4+4+4+1+1+1+1.
Couple more examples would be
func(a,3,4) = 4+4+4+4+1+1+1+1
func(a,3,5) = 4+4+4+4+1+1+1+1+2+2+2+2
func(a,0,1) = 1+1+1+1+2+2+2+2
In my situation I'm never gonna get to a point where it'll sum the whole thing again i.e.
func(a,3,6) = sum of all elements
Update:
For my algorithm
for i in range(MC_STEPS_NODE):
    sweep(lattice, prob, start_index_master, end_index_master,
          rows_for_master)
    # calculate the energy
    Ene = subhamiltonian(lattice, start_index_master, end_index_master)
    # calculate the magnetisation
    Mag = mag(lattice, start_index_master, end_index_master)
    E1 += Ene
    M1 += Mag
    E2 += Ene*Ene
    M2 += Mag*Mag
    if i % sites_for_master == 0:
        comm.Send([lattice[start_index_master:start_index_master+1], L, MPI.INT],
                  dest=(rank-1)%size, tag=4)
        comm.Recv([lattice[end_index_master:end_index_master+1], L, MPI.INT],
                  source = (rank+1)%size, tag=4)
        start_index_master = (start_index_master + 1)
        end_index_master = (end_index_master + 1)
        if start_index_master > 100:
            start_index_master = start_index_master % L
        if end_index_master > 100:
            end_index_master = end_index_master % L
The function I want is the mag() function, which calculates the magnetisation of a sublattice, which is just the sum of all its elements. Imagine an LxL lattice split up into two sublattices, one belonging to the master and the other to the worker. Each sweep sweeps the corresponding sublattice of lattice, with start_index_master and end_index_master determining the start and end rows of the sublattice. Whenever i%sites_for_master == 0, the indices move down by 1, and are eventually taken mod 100 to prevent memory overflow in mpi4py. So if the sublattice is at the centre of the main lattice, then start_index_master < end_index_master. Eventually the sublattice will keep moving down to the point where start_index_master < end_index_master but end_index_master >= L; in this case, if end_index_master = 10 for a lattice with L=10, the bottom row of the sublattice is the first row ([0]) of the main lattice.
Energy function:
def subhamiltonian(lattice: np.ndarray, col_len_start: int,
                   col_len_end: int) -> float:
    energy = 0
    for i in range(col_len_start, col_len_end+1):
        for j in range(len(lattice)):
            spin = lattice[i%L, j]
            nb_sum = lattice[(i%L+1) % L, j] + lattice[i%L, (j+1) % L] + \
                     lattice[(i%L-1) % L, j] + lattice[i%L, (j-1) % L]
            energy += -nb_sum*spin
    return energy/4.
This is my function for computing the energy of the sublattice.
You could use np.arange to create the indexes to be summed.
>>> def func(lattice, start, end):
...     rows = lattice.shape[0]
...     return lattice[np.arange(start, end+1) % rows].sum()
...
>>> func(a,3,4)
20
>>> func(a,3,5)
28
>>> func(a,0,1)
12
You can check if the stop index wraps-around and if it does add the sum from the beginning of the array to the result. This is efficient because it relies on slice indexing and only does extra work if necessary.
def func(a, start, stop):
    stop += 1
    result = np.sum(a[start:stop])
    if stop > len(a):
        result += np.sum(a[:stop % len(a)])
    return result
The above version works for stop - start < len(a), i.e. no more than one full wrap-around. For an arbitrary number of wrap-arounds (i.e. arbitrary values for start and stop) the following version can be used:
def multi_wraps(a, start, stop):
    result = 0
    # Adjust both indices in case the start index wrapped around.
    stop -= (start // len(a)) * len(a)
    start %= len(a)
    stop += 1 # Include the element pointed to by the stop index.
    n_wraps = (stop - start) // len(a)
    if n_wraps > 0:
        result += n_wraps * a.sum()
    stop = start + (stop - start) % len(a)
    result += np.sum(a[start:stop])
    if stop > len(a):
        result += np.sum(a[:stop % len(a)])
    return result
In case n_wraps > 0 some parts of the array will be summed twice which is unnecessarily inefficient, so we can compute the sum of the various array parts as necessary. The following version sums every array element at most once:
def multi_wraps_efficient(a, start, stop):
    # Adjust both indices in case the start index wrapped around.
    stop -= (start // len(a)) * len(a)
    start %= len(a)
    stop += 1 # Include the element pointed to by the stop index.
    n_wraps = (stop - start) // len(a)
    stop = start + (stop - start) % len(a) # Eliminate the wraps since they will be accounted for separately.
    tail_sum = a[start:stop].sum()
    if stop > len(a):
        head_sum = a[:stop % len(a)].sum()
        if n_wraps > 0:
            remaining_sum = a[stop % len(a):start].sum()
    elif n_wraps > 0:
        head_sum = a[:start].sum()
        remaining_sum = a[stop:].sum()
    result = tail_sum
    if stop > len(a):
        result += head_sum
    if n_wraps > 0:
        result += n_wraps * (head_sum + tail_sum + remaining_sum)
    return result
The following plot shows a performance comparison between using index arrays and the two multi-wrap methods presented above. The tests are run on a (1_000, 1_000) lattice. One can observe that for the multi_wraps method there is an increase in runtime when going from 1 to 2 wrap-arounds, since it unnecessarily sums the array twice. The multi_wraps_efficient method has the same performance regardless of the number of wrap-arounds, since it sums every array element no more than once.
The performance plot was generated using the perfplot package:
perfplot.show(
    setup=lambda n: (np.ones(shape=(1_000, 1_000), dtype=int), 400, n*1_000 + 200),
    kernels=[
        lambda x: index_arrays(*x),
        lambda x: multi_wraps(*x),
        lambda x: multi_wraps_efficient(*x),
    ],
    labels=['index_arrays', 'multi_wraps', 'multi_wraps_efficient'],
    n_range=range(1, 11),
    xlabel="Number of wrap-around",
    equality_check=lambda x, y: x == y,
)

Incorrect indexing for max subarray in Python

I wrote both a brute-force and a divide-and-conquer implementation of the Max Subarray problem in Python. Tests are run by drawing a random sample of integers.
When the length of the input array is large, the assert in __main__ fails because the recursive algorithm does not return the correct answer. However, the two algorithms DO agree when the array is less than 10 elements long (this is approximate, and the actual size of the failed input varies on each execution). The issue does not seem to be related to even or odd array lengths, but it does appear to be related to how the array is indexed.
Sorry if I'm missing something stupid, but why does the recursive algorithm stop returning the correct output when the input array starts getting larger?
# Subarray solutions are represented by an array in the form
# [lower_bound, higher_bound, sum]
from sys import maxsize
import random
import time
# Brute force implementation (THETA(n^2))
def bf_max_subarray(A):
    biggest = -maxsize - 1
    left = 0
    right = 0
    for i in range(0, len(A)):
        sum = 0
        for j in range(i, len(A)):
            sum += A[j]
            if sum > biggest:
                biggest = sum
                left = i
                right = j
    return [left, right, biggest]
# Part of divide-and-conquer solution
def cross_subarray(A, l, m, r):
    lsum = -maxsize - 1
    rsum = -maxsize - 1
    lbound = 0
    rbound = 0
    tempsum = 0
    for i in range(m, l-1, -1):
        tempsum += A[i]
        if tempsum > lsum:
            lsum = tempsum
            lbound = i
    tempsum = 0
    for j in range(m+1, r+1):
        tempsum += A[j]
        if tempsum > rsum:
            rsum = tempsum
            rbound = j
    return [lbound, rbound, lsum + rsum]
# Recursive solution
def rec_max_subarray(A, l, r):
    # Base case: array of one element
    if (l == r):
        return [l, r, A[l]]
    else:
        m = (l+r)//2
        left = rec_max_subarray(A, l, m)
        right = rec_max_subarray(A, m+1, r)
        cross = cross_subarray(A, l, m, r)
        # Returns the array representing the subarray with the maximum sum.
        return max([left, right, cross], key=lambda i:i[2])
if __name__ == "__main__":
    for i in range(1, 101):
        A = random.sample(range(-i*2, i), i)
        start = time.clock()
        bf = bf_max_subarray(A)
        bf_time = time.clock() - start
        start = time.clock()
        dc = rec_max_subarray(A, 0, len(A)-1)
        dc_time = time.clock() - start
        assert dc == bf # Make sure the algorithms agree.
The subarray with the maximum sum is represented by an array of the form [left_bound, right_bound, sum].
But thanks to return max([left, right, cross], key=lambda i:i[2]), rec_max_subarray returns the correct maximum sum for A, yet risks returning indices that do not match the indices returned by bf_max_subarray. My error was assuming that the boundaries of a subarray with the maximum sum would be unique.
The solution is either to fix the criterion that selects a subarray, or just to assert the equality of the sums using assert dc[2] == bf[2].
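A concrete input shows the tie. The sketch below restates the three functions in compact form (using float("-inf") instead of -maxsize - 1, which is an illustrative substitution and not the original code) and runs them on a four-element array with distinct values, chosen by hand so that two different subarrays share the maximum sum.

```python
def bf_max_subarray(A):
    best = [0, 0, float("-inf")]
    for i in range(len(A)):
        total = 0
        for j in range(i, len(A)):
            total += A[j]
            if total > best[2]:  # strict: keeps the first maximum found
                best = [i, j, total]
    return best


def cross_subarray(A, l, m, r):
    lsum = rsum = float("-inf")
    lbound = rbound = 0
    total = 0
    for i in range(m, l - 1, -1):
        total += A[i]
        if total > lsum:
            lsum, lbound = total, i
    total = 0
    for j in range(m + 1, r + 1):
        total += A[j]
        if total > rsum:
            rsum, rbound = total, j
    return [lbound, rbound, lsum + rsum]


def rec_max_subarray(A, l, r):
    if l == r:
        return [l, r, A[l]]
    m = (l + r) // 2
    left = rec_max_subarray(A, l, m)
    right = rec_max_subarray(A, m + 1, r)
    cross = cross_subarray(A, l, m, r)
    # max() returns the first of several equal candidates: left, right, cross.
    return max([left, right, cross], key=lambda t: t[2])


A = [-5, 0, 3, -1]
bf = bf_max_subarray(A)
dc = rec_max_subarray(A, 0, len(A) - 1)
print(bf)  # [1, 2, 3]  -> subarray [0, 3]
print(dc)  # [2, 2, 3]  -> subarray [3]
assert bf[2] == dc[2]
```

Both answers are valid maximum subarrays with sum 3; only the tie-breaking differs, which is why comparing the sums is the robust check.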

Python Monte Carlo Simulation Loop

I am working on a simple Monte Carlo simulation script, which I will later extend for a larger project. The script is a basic crawler trying to get from point A to point B in a grid. The coordinates of point A are (1,1) (the top left corner), and the coordinates of point B are (n,n) (the bottom right corner, where n is the size of the grid).
Once the crawler starts moving, there are four options: it can go left, right, up or down (no diagonal movement allowed). An option is valid if it satisfies both of the following:
The new point should still be within the boundaries of the n x n grid
The new point should not have been visited previously
The new point will be selected randomly among the valid options (as far as I know, Python uses the Mersenne Twister algorithm for picking random numbers).
I would like to run the simulation 1,000,000 times (the code below runs only 100), and each iteration should terminate when either:
The crawler gets stuck (no valid options for movement)
The crawler gets to the final destination (n,n) on the grid.
I thought I implemented the algorithm correctly, but obviously something is wrong. No matter how many times I run the simulation (100 or 1,000,000), I only get 1 successful event where the crawler manages to get to the end, and the rest of the attempts (99 or 999,999) are unsuccessful.
I bet there is something simple I am missing out, but cannot see it for some reason. Any ideas?
Thanks a bunch!
EDIT: Some typos in the text were corrected.
import random
i = 1 # initial coordinate top left corner
j = 1 # initial coordinate top left corner
k = 0 # counter for number of simulations
n = 3 # Grid size
foundRoute = 0 # counter for number of cases where the final point is reached
gotStuck = 0 # counter for number of cases where no valid options found
coordList = [[i, j]]
while k < 100:
    while True:
        validOptions = []
        opt1 = [i - 1, j]
        opt2 = [i, j + 1]
        opt3 = [i + 1, j]
        opt4 = [i, j - 1]
        # Check 4 possible options out of bound and re-visited coordinates are
        # discarded:
        if opt1[0] != 0 and opt1[0] <= n and opt1[1] != 0 and opt1[1] <= n:
            if not opt1 in coordList:
                validOptions.append(opt1)
        if opt2[0] != 0 and opt2[0] <= n and opt2[1] != 0 and opt2[1] <= n:
            if not opt2 in coordList:
                validOptions.append(opt2)
        if opt3[0] != 0 and opt3[0] <= n and opt3[1] != 0 and opt3[1] <= n:
            if not opt3 in coordList:
                validOptions.append(opt3)
        if opt4[0] != 0 and opt4[0] <= n and opt4[1] != 0 and opt4[1] <= n:
            if not opt4 in coordList:
                validOptions.append(opt4)
        # Break loop if there are no valid options
        if len(validOptions) == 0:
            gotStuck = gotStuck + 1
            break
        # Get random coordinate among current valid options
        newCoord = random.choice(validOptions)
        # Append new coordinate to the list of grid points visited (to be used
        # for checks)
        coordList.append(newCoord)
        # Break loop if lower right corner of the grid is reached
        if newCoord == [n, n]:
            foundRoute = foundRoute + 1
            break
        # If the loop is not broken, assign new coordinates
        i = newCoord[0]
        j = newCoord[1]
    k = k + 1
print 'Route found %i times' % foundRoute
print 'Route not found %i times' % gotStuck
Your problem is that you're never clearing out your visited locations. Change the block that breaks out of the inner while loop to look something like this:
        if len(validOptions) == 0:
            gotStuck = gotStuck + 1
            coordList = [[1,1]]
            i,j = (1,1)
            break
You'll also need to change the block where you succeed:
        if newCoord == [n, n]:
            foundRoute = foundRoute + 1
            coordList = [[1,1]]
            i,j = (1,1)
            break
Alternatively, you could simply place this code right before your inner while loop. The start of your code would then look like:
k = 0 # counter for number of simulations
n = 3 # Grid size
foundRoute = 0 # counter for number of cases where the final point is reached
gotStuck = 0 # counter for number of cases where no valid options found
while k < 100:
    i,j = (1,1)
    coordList = [[i,j]]
    while True:
        #Everything else
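One way to make this kind of bug harder to reintroduce is to put a single walk in its own function, so that the visited cells are local to each run. The following is a minimal sketch in Python 3 under a few assumptions: n is at least 2, the function names `run_walk` and `simulate` are invented here, and a set of tuples replaces the list of lists for faster membership tests.

```python
import random


def run_walk(n, rng):
    """One self-avoiding walk from (1, 1); True if it reaches (n, n)."""
    pos = (1, 1)
    visited = {pos}  # created fresh for every walk, so nothing leaks between runs
    while True:
        i, j = pos
        options = [
            (r, c)
            for r, c in ((i - 1, j), (i, j + 1), (i + 1, j), (i, j - 1))
            if 1 <= r <= n and 1 <= c <= n and (r, c) not in visited
        ]
        if not options:
            return False  # stuck: no valid move left
        pos = rng.choice(options)
        visited.add(pos)
        if pos == (n, n):
            return True


def simulate(trials, n, seed=None):
    """Run several independent walks; returns (found, stuck) counts."""
    rng = random.Random(seed)
    found = sum(run_walk(n, rng) for _ in range(trials))
    return found, trials - found


found, stuck = simulate(trials=100, n=3, seed=1)
print("Route found %i times" % found)
print("Route not found %i times" % stuck)
```

Passing a seed makes a run reproducible, which helps when comparing results before and after a change.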
