Need help understanding this Python depth-first search code

I'm quite new to Python and have difficulty understanding the code below. It is a graph-theory exercise that uses DFS to find the largest area among all islands in a grid, where 1 represents an island cell and 0 represents water.
def maxAreaOfIsland(grid):
    row, col = len(grid), len(grid[0])
    def dfs(i, j):
        if 0 <= i <= row - 1 and 0 <= j <= col - 1 and grid[i][j]:
            grid[i][j] = 0
            # scans through all rows & cols and
            # turns the number in the grid into 0 if all conditions are true?
            return 1 + dfs(i - 1, j) + dfs(i + 1, j) + dfs(i, j - 1) + dfs(i, j + 1)
        return 0
        # recursive function that checks up, down, left, right in the grid.
        # when does it return 1?
    return max(dfs(i, j) for i in range(row) for j in range(col))

maxAreaOfIsland([[1,0,1,1,1],
                 [0,0,0,1,1],
                 [1,1,1,0,1]])
Out: 6
I have included comments, which reflect my understanding so far, but I'm not sure whether it's correct. I'm quite confused from line 4 onwards, particularly by the recursive part.
Could someone explain it in detail? Typically this kind of code tends to have a queue/deque to record whether an island cell has been visited, but I don't think this code has that?

I guess the question is really about understanding the algorithm, not Python; the provided Python code is pretty simple.
The code contains the function maxAreaOfIsland, which in turn contains the recursive function dfs. These two functions form two layers of computation. Let's look at those layers separately.
# outer layer
def maxAreaOfIsland(grid):
    row, col = len(grid), len(grid[0])
    # function dfs() definition
    return max(dfs(i, j) for i in range(row) for j in range(col))
So the outer layer is very simple: compute dfs(i, j) for all possible i and j, then choose the maximum computed value.
# inner layer - slightly modified
def dfs(i, j):
    # recursive case
    if (0 <= i <= row - 1 and 0 <= j <= col - 1) and grid[i][j] == 1:
        grid[i][j] = 0  # this is how we remember visited cells, since we don't count zeros
        # optional prints to look at the grid during computation
        # print(i, j)
        # print(*grid, sep='\n', end='\n\n')
        count_current = 1
        count_neighbors = dfs(i - 1, j) + dfs(i + 1, j) + dfs(i, j - 1) + dfs(i, j + 1)
        return count_current + count_neighbors
    # trivial case and out-of-borders case
    else:
        return 0
The inner layer is a little more complicated. What does it do? (1) It gets i and j. (2) If the cell is outside the grid or contains 0, it's the trivial case (water or out of bounds), so just return 0. (3) If the cell contains 1, it's the recursive case (land): the function counts all the 1s connected to the given cell, turning every counted 1 into a 0 to avoid double counting.
Your sample grid has 3 rows (0, 1, 2) and 5 columns (0, 1, 2, 3, 4). Suppose we are at i = 0, j = 2. It is 1. We count it (current result is 1), turn it into 0 and look at its neighbors one by one: the upper neighbor is out of the grid, the bottom neighbor is 0, the left neighbor is 0, the right neighbor is 1. We don't return the current result but proceed to the right neighbor i = 0, j = 3. We count it (current result is 2), turn it into 0 and look at its neighbors: the upper neighbor is out of the grid, the bottom neighbor is 1. We stop here; we don't return the current result, we remember there are 2 more neighbors to check, and we proceed to the bottom neighbor i = 1, j = 3. We count it (current result is 3), turn it into 0 and look at its neighbors: the upper neighbor i = 0, j = 3 is already 0 (we just visited it), the bottom and left neighbors are 0, and the right neighbor i = 1, j = 4 is 1, so we proceed there. And so on.
My advice is to draw a simple sample grid (with a pen on a piece of paper) and manually apply the dfs algorithm to it.
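A side note on the visited-queue question: overwriting grid[i][j] with 0 is exactly what plays the role of a visited structure here. For completeness, below is a minimal sketch (not part of the answer above; the name max_area_visited is made up) that keeps the grid intact by tracking visited cells in an explicit set instead:

def max_area_visited(grid):
    rows, cols = len(grid), len(grid[0])
    visited = set()
    def dfs(i, j):
        if 0 <= i < rows and 0 <= j < cols and grid[i][j] and (i, j) not in visited:
            visited.add((i, j))  # remember the cell instead of zeroing it
            return 1 + dfs(i - 1, j) + dfs(i + 1, j) + dfs(i, j - 1) + dfs(i, j + 1)
        return 0
    return max(dfs(i, j) for i in range(rows) for j in range(cols))

print(max_area_visited([[1, 0, 1, 1, 1],
                        [0, 0, 0, 1, 1],
                        [1, 1, 1, 0, 1]]))  # 6, and the grid is left unchanged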

Related

Python Algorithm Problem | How to find a path of desired weight in a connected graph?

Problem
So imagine you are given an m * n matrix like this:
Grid = 4 * 4
1 1 1 1
2 2 2 2
3 3 3 3
4 4 4 4
You want to go from the top left corner to the bottom right corner, and you can only move right or down. In addition, you want to find the moves needed to reach a desired sum. For example, this script (found in a textbook) is not clear:
def move(m, n, sum):
    count = m + n - 1
    max_val = m
    s = sum
    r = []
    while max_val > 0:
        if count <= 0:
            return False, r
        least = max_val * ((max_val - 1) / 2)
        r.append(max_val)
        s -= max_val
        count -= 1
        while ((count > 0) and (s > least + (count - (max_val - 1)) * (max_val - 1))):
            r.append(max_val)
            s -= max_val
            count -= 1
        if s < least:
            return False, r
        max_val -= 1
    return True, r

def combine(m, n, sum):
    result, new_res = move(m, n, sum)
    new_res.reverse()
    for i in range(1, len(new_res)):
        if new_res[i] == new_res[i - 1]:
            print("R")
        else:
            print("D")

combine(4, 4, 16)
I don't quite understand the solution.
Can someone explain the algorithm?
Especially the function move, where it does this check in the while loop:
while ((count > 0) and (s > least + (count - (max_val - 1)) * (max_val - 1))):
Questions
What is the name of this algorithm?
How does this script work?
What's the run time (time complexity)?
Thank you!
The script isn't well written, nor does it follow best practices. Having said that, I readjusted it, and hopefully it is clearer now.
Source Code
NOTE: I've added this to my GitHub Algorithm-Complete-Guide, which is under construction; feel free to use it.
def move(m, n, weight_limit):
    # ==== < CONFIG VARIABLES > ==== #
    step_counter = m + n - 1
    max_val = m  # NOTE: it starts from the last value (4) and goes backwards
    path = []    # NOTE: stores the path (needs to be reversed afterwards)
    while max_val:
        if step_counter <= 0:  # NOTE: back at the starting node, so just break; no need to return
            break
        least = max_val * ((max_val - 1) / 2)
        path.append(max_val)
        weight_limit -= max_val
        step_counter -= 1
        if weight_limit < least:  # NOTE: moved up here because it makes more sense to check this first and break
            break
        # ==== < ROW CHECK | CAN IT GO TO THE LEFT? > ==== #
        if step_counter:  # NOTE: moved here to make it clearer
            check_row = least + (step_counter - (max_val - 1)) * (max_val - 1)
            while step_counter > 0 and weight_limit > check_row:  # FAQ: 1 Footnotes
                path.append(max_val)
                weight_limit -= max_val
                step_counter -= 1
                # the threshold depends on step_counter, so it must be
                # recomputed on every pass (as in the original textbook code)
                check_row = least + (step_counter - (max_val - 1)) * (max_val - 1)
        max_val -= 1
    return path

def combine(m, n, sum):
    path = move(m, n, sum)
    path.reverse()  # NOTE: reverse the resulting path
    result = []
    for i in range(1, len(path)):
        if path[i] == path[i - 1]:  # NOTE: if the next value is the same, it moved Right
            result.append((path[i], 'Right'))
        else:
            result.append((path[i], 'Down'))  # NOTE: otherwise it moved Down a row
    return result

def prettify_result(res):
    for value in res:
        print(f'V={value[0]}) {value[1]} |-> ', end='')

if __name__ == '__main__':
    path = combine(4, 4, 16)
    prettify_result(path)
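A quick way to sanity-check move() above: the returned path should contain m + n - 1 cell values that add up to the requested weight. A small assertion sketch:

path = move(4, 4, 16)
assert len(path) == 4 + 4 - 1  # one cell value per step across the grid
assert sum(path) == 16         # the cell values reach the desired sum
print(path)  # [4, 3, 2, 2, 2, 2, 1]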
Explanation
I first thought it was a Rat in a Maze problem solved with the backtracking technique of depth-first search, which runs at time complexity O(2^(n^2)), but after a review I doubt it; it seems more like a kind of Dijkstra's algorithm, though I may be wrong. I don't think it is backtracking, simply because it is not recursive (and never backtracks); but since it checks the nodes' weights, it looks like Dijkstra's with a given max weight.
An important note: the maze is solved upside down, from bottom to top! So it starts at value 4 and runs backwards. Hence in reality it is checking:
Direction UP with higher priority.
Direction LEFT at every step (I left a big comment in the script); if it can't go left (because it would cost too much), then it goes up (going up costs one less each time, because the values go 4, 3, 2, 1).
As for "function move, what is least + (count - (max_val - 1)) * (max_val - 1)": I had a hard time understanding this as well. Basically it is just a math trick; I put it in a variable called check_row to make it more explicit, but what it does is check whether the path can afford to go left or not.
At the end, the algorithm reverses the list so that it looks like it went from top to bottom.
Consideration
The function move() always returned 2 values, the first of which is True/False; it was stored in the variable result but never used. That is pretty weird (not good programming practice), so I removed it and replaced the returns with break statements, because it makes more sense to break out of a while loop rather than return from it; the function then takes care of returning path at the end.
I also removed checks like while max_val > 0 and similar boolean comparisons against 0, because they are redundant and not Pythonic; see the guide at realpython.com/python-conditional-statements/.
Online Documentation
Rat in a Maze | Backtracking-2 | geeksforgeeks.org
Backtracking Algorithms | geeksforgeeks.org
Depth-first search | algorithms/dfs | algorithm-visualizer.org
Dijkstra's algorithm | algorithm-visualizer.org
Algorithm-Complete-Guide (GitHub)

Find Triplets smaller than a given number

I am trying to solve a problem where:
Given an array of n integers nums and a target, find the number of
index triplets i, j, k with 0 <= i < j < k < n that satisfy the
condition nums[i] + nums[j] + nums[k] < target.
For example, given nums = [-2, 0, 1, 3], and target = 2.
Return 2. Because there are two triplets which sums are less than 2:
[-2, 0, 1] [-2, 0, 3]
My algorithm: remove a single element number_1 from the list, set t = target - number_1, then search for pairs number_2, number_3 such that number_2 + number_3 < t. Problem solved.
The problem link is https://leetcode.com/problems/3sum-smaller/description/ .
My solution is:
def threeSumSmaller(nums, target):
    """
    :type nums: List[int]
    :type target: int
    :rtype: int
    """
    nums = sorted(nums)
    smaller = 0
    for i in range(len(nums)):
        # Create temp array excluding a number
        if i != len(nums) - 1:
            temp = nums[:i] + nums[i+1:]
        else:
            temp = nums[:len(nums)-1]
        # Sort the temp array and set new target to target - the excluded number
        l, r = 0, len(temp) - 1
        t = target - nums[i]
        while l < r:
            if temp[l] + temp[r] >= t:
                r = r - 1
            else:
                smaller += 1
                l = l + 1
    return smaller
My solution fails:
Input:
[1,1,-2]
1
Output:
3
Expected:
1
I am not getting why the error is there, as my solution passes more than 30 test cases.
Thanks for your help.
One main point is that when you sort the elements in the first line, you lose the original indexes. This means that, despite having found a triplet, you'll never be sure whether your (i, j, k) satisfies the index condition, because those (i, j, k) come from the sorted list, not from the original one.
Additionally: every time you pluck an element from the middle of the array, the remaining part of the array is iterated again from its start (although in an irregular way, it still starts from the first of the remaining elements in temp). This should not be the case! Expanding on the details:
The example iterates 3 times over the list (which is, again, sorted, so you lose the true i, j, and k indexes):
First iteration (i = 0, temp = [1, -2], t = 0): when you sum temp[l] + temp[r] (l, r are 0, 1) it will be -1, which satisfies being lower than t, so smaller will increase.
The second iteration will be like the first, but with i = 1; again it will increase.
The third one will increase as well, because t = 3 and the sum will be 2 now.
So you count the value three times (even though only one triplet can be formed in index order), because you are iterating through the permutations of the indexes instead of the combinations of them. These are the two things you did not take care of:
Preserving the indexes while sorting.
Ensuring you iterate the indexes in a forward fashion only.
Try it like this instead:
def find(elements, upper_bound):
    result = 0
    for i in range(0, len(elements) - 2):
        upper_bound2 = upper_bound - elements[i]
        for j in range(i + 1, len(elements) - 1):
            upper_bound3 = upper_bound2 - elements[j]
            for k in range(j + 1, len(elements)):
                upper_bound4 = upper_bound3 - elements[k]
                if upper_bound4 > 0:
                    result += 1
    return result
Seems like you're counting the same triplet more than once...
In the first iteration of the loop, you omit the first 1 in the list, and then increase smaller by 1. Then you omit the second 1 in the list and increase smaller again by 1. And finally you omit the third element in the list, -2, and of course increase smaller by 1, because in all these three cases you were in fact considering the same triplet {1,1,-2}.
p.s. It seems like you care more about correctness than performance. In that case, consider maintaining a set of the solution triplets, to ensure you're not counting the same triplet twice.
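For reference, the usual O(n^2) approach to this problem avoids rebuilding temp arrays altogether: sort once, then for each i count every valid pair in one step with two pointers, since if nums[l] + nums[r] is small enough, so is nums[l] plus any element between them. A sketch (not from the answers above; three_sum_smaller is a made-up name):

def three_sum_smaller(nums, target):
    nums = sorted(nums)
    count = 0
    for i in range(len(nums) - 2):
        l, r = i + 1, len(nums) - 1
        t = target - nums[i]
        while l < r:
            if nums[l] + nums[r] < t:
                count += r - l  # the pairs (l, l+1), ..., (l, r) all qualify
                l += 1
            else:
                r -= 1
    return count

print(three_sum_smaller([-2, 0, 1, 3], 2))  # 2
print(three_sum_smaller([1, 1, -2], 1))     # 1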
There are already good answers. Apart from that, if you want to check your algorithm's result, you can take the help of this built-in function:
import itertools

def find_(vector_, target):
    result = []
    for i in itertools.combinations(vector_, r=3):
        if sum(i) < target:
            result.append(i)
    return result
print(find_([-2, 0, 1, 3], 2))
Output:
[(-2, 0, 1), (-2, 0, 3)]
If you want only the count, then:
print(len(find_([-2, 0, 1, 3], 2)))
Output:
2

Shuffling a list with maximum distance travelled [duplicate]

I have tried to ask this question before, but have never been able to word it correctly. I hope I have it right this time:
I have a list of unique elements. I want to shuffle this list to produce a new list. However, I would like to constrain the shuffle, such that each element's new position is at most d away from its original position in the list.
So for example:
L = [1,2,3,4]
d = 2
answer = magicFunction(L, d)
Now, one possible outcome could be:
>>> print(answer)
[3,1,2,4]
Notice that 3 has moved two indices, 1 and 2 have moved one index, and 4 has not moved at all. Thus, this is a valid shuffle, per my previous definition. The following snippet of code can be used to validate this:
old = {e:i for i,e in enumerate(L)}
new = {e:i for i,e in enumerate(answer)}
valid = all(abs(i-new[e])<=d for e,i in old.items())
Now, I could easily just generate all possible permutations of L, filter for the valid ones, and pick one at random. But that doesn't seem very elegant. Does anyone have any other ideas about how to accomplish this?
This is going to be long and dry.
I have a solution that produces a uniform distribution. It requires O(len(L) * d**d) time and space for precomputation, then performs shuffles in O(len(L)*d) time¹. If a uniform distribution is not required, the precomputation is unnecessary, and the shuffle time can be reduced to O(len(L)) due to faster random choices; I have not implemented the non-uniform distribution. Both steps of this algorithm are substantially faster than brute force, but they're still not as good as I'd like them to be. Also, while the concept should work, I have not tested my implementation as thoroughly as I'd like.
Suppose we iterate over L from the front, choosing a position for each element as we come to it. Define the lag as the distance between the next element to place and the first unfilled position. Every time we place an element, the lag grows by at most one, since the index of the next element is now one higher, but the index of the first unfilled position cannot become lower.
Whenever the lag is d, we are forced to place the next element in the first unfilled position, even though there may be other empty spots within a distance of d. If we do so, the lag cannot grow beyond d, we will always have a spot to put each element, and we will generate a valid shuffle of the list. Thus, we have a general idea of how to generate shuffles; however, if we make our choices uniformly at random, the overall distribution will not be uniform. For example, with len(L) == 3 and d == 1, there are 3 possible shuffles (one for each position of the middle element), but if we choose the position of the first element uniformly, one shuffle becomes twice as likely as either of the others.
If we want a uniform distribution over valid shuffles, we need to make a weighted random choice for the position of each element, where the weight of a position is based on the number of possible shuffles if we choose that position. Done naively, this would require us to generate all possible shuffles to count them, which would take O(d**len(L)) time. However, the number of possible shuffles remaining after any step of the algorithm depends only on which spots we've filled, not what order they were filled in. For any pattern of filled or unfilled spots, the number of possible shuffles is the sum of the number of possible shuffles for each possible placement of the next element. At any step, there are at most d possible positions to place the next element, and there are O(d**d) possible patterns of unfilled spots (since any spot further than d behind the current element must be full, and any spot d or further ahead must be empty). We can use this to generate a Markov chain of size O(len(L) * d**d), taking O(len(L) * d**d) time to do so, and then use this Markov chain to perform shuffles in O(len(L)*d) time.
Example code (currently not quite O(len(L)*d) due to inefficient Markov chain representation):
import random

# states are (k, filled_spots) tuples, where k is the index of the next
# element to place, and filled_spots is a tuple of booleans
# of length 2*d, representing whether each index from k-d to
# k+d-1 has an element in it. We pretend indices outside the array are
# full, for ease of representation.

def _successors(n, d, state):
    '''Yield all legal next filled_spots and the move that takes you there.

    Doesn't handle k=n.'''
    k, filled_spots = state
    next_k = k + 1
    # If k+d is a valid index, this represents the empty spot there.
    possible_next_spot = (False,) if k + d < n else (True,)
    if not filled_spots[0]:
        # Must use that position.
        yield k - d, filled_spots[1:] + possible_next_spot
    else:
        # Can fill any empty spot within a distance d.
        shifted_filled_spots = list(filled_spots[1:] + possible_next_spot)
        for i, filled in enumerate(shifted_filled_spots):
            if not filled:
                successor_state = shifted_filled_spots[:]
                successor_state[i] = True
                yield next_k - d + i, tuple(successor_state)
                # next_k instead of k in that index computation, because
                # i is indexing relative to shifted_filled_spots instead
                # of filled_spots

def _markov_chain(n, d):
    '''Precompute a table of weights for generating shuffles.

    _markov_chain(n, d) produces a table that can be fed to
    _distance_limited_shuffle to permute lists of length n in such a way that
    no list element moves a distance of more than d from its initial spot,
    and all permutations satisfying this condition are equally likely.

    This is expensive.
    '''
    if d >= n - 1:
        # We don't need the table, and generating a table for d >= n
        # complicates the indexing a bit. It's too complicated already.
        return None
    table = {}
    termination_state = (n, (d * 2 * (True,)))
    table[termination_state] = 1
    def possible_shuffles(state):
        try:
            return table[state]
        except KeyError:
            k, _ = state
            count = table[state] = sum(
                possible_shuffles((k + 1, next_filled_spots))
                for (_, next_filled_spots) in _successors(n, d, state)
            )
            return count
    initial_state = (0, (d * (True,) + d * (False,)))
    possible_shuffles(initial_state)
    return table

def _distance_limited_shuffle(l, d, table):
    # Generate an index into the set of all permutations, then use the
    # markov chain to efficiently find which permutation we picked.
    n = len(l)
    if d >= n - 1:
        random.shuffle(l)
        return
    permutation = [None] * n
    state = (0, (d * (True,) + d * (False,)))
    permutations_to_skip = random.randrange(table[state])
    for i, item in enumerate(l):
        for placement_index, new_filled_spots in _successors(n, d, state):
            new_state = (i + 1, new_filled_spots)
            if table[new_state] <= permutations_to_skip:
                permutations_to_skip -= table[new_state]
            else:
                state = new_state
                permutation[placement_index] = item
                break
    return permutation

class Shuffler(object):
    def __init__(self, n, d):
        self.n = n
        self.d = d
        self.table = _markov_chain(n, d)

    def shuffled(self, l):
        if len(l) != self.n:
            raise ValueError('Wrong input size')
        return _distance_limited_shuffle(l, self.d, self.table)

    __call__ = shuffled
¹ We could use a tree-based weighted random choice algorithm to improve the shuffle time to O(len(L)*log(d)), but since the table becomes so huge for even moderately large d, this doesn't seem worthwhile. Also, the factors of d**d in the bounds are overestimates; the actual factors are still at least exponential in d.
In short: the list that should be shuffled gets sorted by the sum of each element's index and a random offset.
import random

xs = range(20)  # list that should be shuffled
d = 5           # distance
[x for i, x in sorted(enumerate(xs), key=lambda (i, x): i + (d + 1) * random.random())]
Out:
[1, 4, 3, 0, 2, 6, 7, 5, 8, 9, 10, 11, 12, 14, 13, 15, 19, 16, 18, 17]
That's basically it. But this looks a little bit overwhelming, therefore...
The algorithm in more detail
To understand this better, consider this alternative implementation of an ordinary, random shuffle:
import random
sorted(range(10), key = lambda x: random.random())
Out:
[2, 6, 5, 0, 9, 1, 3, 8, 7, 4]
In order to constrain the distance, we have to implement an alternative sort-key function that depends on the index of an element. The function sort_criterion is responsible for that.
import random

def exclusive_uniform(a, b):
    "returns a random value in the interval [a, b)"
    return a + (b - a) * random.random()

def distance_constrained_shuffle(sequence, distance,
                                 randmoveforward=exclusive_uniform):
    def sort_criterion(enumerate_tuple):
        """
        returns the index plus a random offset,
        such that the result can overtake at most 'distance' elements
        """
        indx, value = enumerate_tuple
        return indx + randmoveforward(0, distance + 1)
    # get enumerated, shuffled list
    enumerated_result = sorted(enumerate(sequence), key=sort_criterion)
    # remove enumeration
    result = [x for i, x in enumerated_result]
    return result
With the argument randmoveforward you can pass a random number generator with a different probability density function (pdf) to modify the distance distribution.
The remainder is testing and evaluation of the distance distribution.
Test function
Here is an implementation of the test function. The validate function is actually taken from the OP, but I removed the creation of one of the dictionaries for performance reasons.
def test(num_cases=10, distance=3, sequence=range(1000)):
    def validate(d, lst, answer):
        #old = {e: i for i, e in enumerate(lst)}
        new = {e: i for i, e in enumerate(answer)}
        return all(abs(i - new[e]) <= d for i, e in enumerate(lst))
        #return all(abs(i - new[e]) <= d for e, i in old.iteritems())
    for _ in range(num_cases):
        result = distance_constrained_shuffle(sequence, distance)
        if not validate(distance, sequence, result):
            print "Constraint violated. ", result
            break
    else:
        print "No constraint violations"

test()
Out:
No constraint violations
Distance distribution
I am not sure whether there is a way to make the distances uniformly distributed, but here is a function to inspect the distribution.
def distance_distribution(maxdistance=3, sequence=range(3000)):
    from collections import Counter
    def count_distances(lst, answer):
        new = {e: i for i, e in enumerate(answer)}
        return Counter(i - new[e] for i, e in enumerate(lst))
    answer = distance_constrained_shuffle(sequence, maxdistance)
    counter = count_distances(sequence, answer)
    sequence_length = float(len(sequence))
    distances = range(-maxdistance, maxdistance + 1)
    return distances, [counter[d] / sequence_length for d in distances]

distance_distribution()
Out:
([-3, -2, -1, 0, 1, 2, 3],
 [0.01, 0.076, 0.22166666666666668, 0.379,
  0.22933333333333333, 0.07766666666666666, 0.006333333333333333])
Or for a case with a greater maximum distance:
distance_distribution(maxdistance=9, sequence=range(100*1000))
This is a very difficult problem, but it turns out there is a solution in the academic literature, in an influential paper by Mark Jerrum, Alistair Sinclair, and Eric Vigoda, A Polynomial-Time Approximation Algorithm for the Permanent of a Matrix with Nonnegative Entries, Journal of the ACM, Vol. 51, No. 4, July 2004, pp. 671–697. http://www.cc.gatech.edu/~vigoda/Permanent.pdf.
Here is the general idea: first write down two copies of the numbers in the array that you want to permute. Say
1 1
2 2
3 3
4 4
Now connect a node on the left to a node on the right if mapping from the number on the left to the position on the right is allowed by the restrictions in place. So if d=1 then 1 on the left connects to 1 and 2 on the right, 2 on the left connects to 1, 2, 3 on the right, 3 on the left connects to 2, 3, 4 on the right, and 4 on the left connects to 3, 4 on the right.
1 - 1
X
2 - 2
X
3 - 3
X
4 - 4
The resulting graph is bipartite. A valid permutation corresponds to a perfect matching in the bipartite graph. A perfect matching, if it exists, can be found in O(VE) time (or somewhat better, with more advanced algorithms).
Now the problem becomes one of generating a uniformly distributed random perfect matching. I believe that can be done, approximately anyway. Uniformity of the distribution is the really hard part.
What does this have to do with permanents? Consider a matrix representation of our bipartite graph, where a 1 means an edge and a 0 means no edge:
1 1 0 0
1 1 1 0
0 1 1 1
0 0 1 1
The permanent of the matrix is like the determinant, except there are no negative signs in the definition. So we take exactly one element from each row and column, multiply them together, and add up over all choices of row and column. The terms of the permanent correspond to permutations; the term is 0 if any factor is 0, in other words if the permutation is not valid according to the matrix/bipartite graph representation; the term is 1 if all factors are 1, in other words if the permutation is valid according to the restrictions. In summary, the permanent of the matrix counts all permutations satisfying the restriction represented by the matrix/bipartite graph.
It turns out that unlike calculating determinants, which can be accomplished in O(n^3) time, calculating permanents is #P-complete, so finding an exact answer is not feasible in general. However, if we can estimate the number of valid permutations, we can estimate the permanent. Jerrum et al. approached the problem of counting valid permutations by generating valid permutations uniformly (within a certain error, which can be controlled); an estimate of the value of the permanent can be obtained by a fairly elaborate procedure (section 5 of the referenced paper), but we don't need that to answer the question at hand.
The running time of Jerrum's algorithm to calculate the permanent is O(n^11) (ignoring logarithmic factors). I can't immediately tell from the paper the running time of the part of the algorithm that uniformly generates bipartite matchings, but it appears to be over O(n^9). However, another paper reduces the running time for the permanent to O(n^7): http://www.cc.gatech.edu/fac/vigoda/FasterPermanent_SODA.pdf; in that paper they claim that it is now possible to get a good estimate of a permanent of a 100x100 0-1 matrix. So it should be possible to (almost) uniformly generate restricted permutations for lists of 100 elements.
There may be further improvements, but I got tired of looking.
If you want an implementation, I would start with the O(n^11) version in Jerrum's paper, and then take a look at the improvements if the original algorithm is not fast enough.
There is pseudo-code in Jerrum's paper, but I haven't tried it so I can't say how far the pseudo-code is from an actual implementation. My feeling is it isn't too far. Maybe I'll give it a try if there's interest.
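To make the permanent connection concrete, here is a tiny brute-force sketch (not from the answer; it needs Python 3.8+ for math.prod and is only feasible for very small n) checking that the number of valid d-limited permutations equals the permanent of the banded 0-1 matrix described above:

from itertools import permutations
from math import prod

def count_valid(n, d):
    # count permutations of 0..n-1 where every element moves at most d
    return sum(all(abs(i - p[i]) <= d for i in range(n))
               for p in permutations(range(n)))

def permanent(M):
    # brute-force permanent: sum over all ways to pick one entry per row/column
    n = len(M)
    return sum(prod(M[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

n, d = 4, 1
M = [[1 if abs(i - j) <= d else 0 for j in range(n)] for i in range(n)]
print(count_valid(n, d), permanent(M))  # both print 5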
I am not sure how good it is, but maybe something like:
create a list of the same length as the initial list L; each element of this list should be a list of the initial indices allowed to move here; for instance [[0,1,2],[0,1,2,3],[0,1,2,3],[1,2,3]] if I understand your example correctly;
take the smallest sublist (or any of the smallest sublists if several share the same length);
pick a random element in it with random.choice; this element is the index of the element in the initial list to be mapped to the current location (use another list for building your new list);
remove the randomly chosen element from all sublists.
For instance:
L = [ "A", "B", "C", "D" ]
i = [[0,1,2],[0,1,2,3],[0,1,2,3],[1,2,3]]
# I take [0,1,2] and pick randomly 1 inside
# I remove the value '1' from all sublists and since
# the first sublist has already been handled I set it to None
# (and my result will look as [ "B", None, None, None ]
i = [None,[0,2,3],[0,2,3],[2,3]]
# I take the last sublist and pick randomly 3 inside
# result will be ["B", None, None, "D" ]
i = [None,[0,2], [0,2], None]
etc.
I haven't tried it however. Regards.
My idea is to generate permutations by moving at most d steps by generating d random permutations which move at most 1 step and chaining them together.
We can generate permutations which move at most 1 step quickly by the following recursive procedure: consider a permutation of {1,2,3,...,n}. The last item, n, can move either 0 or 1 place. If it moves 0 places, n is fixed, and we have reduced the problem to generating a permutation of {1,2,...,n-1} in which every item moves at most one place.
On the other hand, if n moves 1 place, it must occupy position n-1. Then n-1 must occupy position n (if any smaller number occupies position n, it will have moved by more than 1 place). In other words, we must have a swap of n and n-1, and after swapping we have reduced the problem to finding such a permutation of the remainder of the array {1,...,n-2}.
Such permutations can be constructed in O(n) time, clearly.
Those two choices should be selected with weighted probabilities. Since I don't know the weights (though I have a theory, see below) maybe the choice should be 50-50 ... but see below.
A more accurate estimate of the weights might be as follows: note that the number of such permutations follows a recursion that is the same as the Fibonacci sequence: f(n) = f(n-1) + f(n-2). We have f(1) = 1 and f(2) = 2 ({1,2} goes to {1,2} or {2,1}), so the numbers really are the Fibonacci numbers. So my guess for the probability of choosing n fixed vs. swapping n and n-1 would be f(n-1)/f(n) vs. f(n-2)/f(n). Since the ratio of consecutive Fibonacci numbers quickly approaches the Golden Ratio, a reasonable approximation to the probabilities is to leave n fixed 61% of the time and swap n and n-1 39% of the time.
To construct permutations where items move at most d places, we just repeat the process d times. The running time is O(nd).
Here is an outline of an algorithm.
arr = {1,2,...,n};
for (i = 0; i < d; i++) {
j = n-1;
while (j > 0) {
u = random uniform in interval (0,1)
if (u < 0.61) { // related to golden ratio phi; more decimals may help
j -= 1;
} else {
swap items at positions j and j-1 of arr // 0-based indexing
j -= 2;
}
}
}
Since each pass moves items at most 1 place from their start, d passes will move items at most d places. The only question is the uniform distribution of the permutations. It would probably be a long proof, if it's even true, so I suggest assembling empirical evidence for various n's and d's. To prove the statement, we would probably have to switch from the golden-ratio approximation to the exact probability f(n-1)/f(n) in place of 0.61.
There might even be some weird reason why some permutations might be missed by this procedure, but I'm pretty sure that doesn't happen. Just in case, though, it would be helpful to have a complete inventory of such permutations for some values of n and d to check the correctness of my proposed algorithm.
Update
I found an off-by-one error in my "pseudocode" and corrected it. Then I implemented it in Java to get a sense of the distribution. The code is below. The distribution is far from uniform, I think because there are many ways of getting restricted permutations with short max distances (move forward, move back vs. move back, move forward, for example) but few ways of getting long distances (move forward, move forward). I can't think of a way to fix the uniformity issue with this method.
import java.util.Random;
import java.util.Map;
import java.util.TreeMap;

class RestrictedPermutations {
    private static Random rng = new Random();

    public static void rPermute(Integer[] a, int d) {
        for (int i = 0; i < d; i++) {
            int j = a.length - 1;
            while (j > 0) {
                double u = rng.nextDouble();
                if (u < 0.61) { // related to golden ratio phi; more decimals may help
                    j -= 1;
                } else {
                    int t = a[j];
                    a[j] = a[j-1];
                    a[j-1] = t;
                    j -= 2;
                }
            }
        }
    }

    public static void main(String[] args) {
        int numTests = Integer.parseInt(args[0]);
        int d = 2;
        Map<String,Integer> count = new TreeMap<String,Integer>();
        for (int t = 0; t < numTests; t++) {
            Integer[] a = {1,2,3,4,5};
            rPermute(a, d);
            // convert a to String for storage in Map
            String s = "(";
            for (int i = 0; i < a.length - 1; i++) {
                s += a[i] + ",";
            }
            s += a[a.length-1] + ")";
            int c = count.containsKey(s) ? count.get(s) : 0;
            count.put(s, c + 1);
        }
        for (String k : count.keySet()) {
            System.out.println(k + ": " + count.get(k));
        }
    }
}
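For comparison, here is a short Python sketch (not part of the answer) of a single pass that uses the exact Fibonacci weights f(n-1)/f(n) suggested above instead of the 0.61 approximation. Each pass is then uniform over the one-step permutations, although chaining d passes still need not be uniform overall, as the update notes:

import random

def fib_pass(arr):
    # f[m] counts the one-step permutations of m items: f[0] = f[1] = 1, f[2] = 2, ...
    f = [1, 1]
    while len(f) <= len(arr):
        f.append(f[-1] + f[-2])
    j = len(arr) - 1
    while j > 0:
        # positions 0..j hold j + 1 items; P(leave position j fixed) = f(j) / f(j + 1)
        if random.random() < f[j] / f[j + 1]:
            j -= 1
        else:
            arr[j], arr[j - 1] = arr[j - 1], arr[j]
            j -= 2
    return arr

print(fib_pass(list(range(10))))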
Here are two sketches in Python: one swap-based, the other non-swap-based. In the first, the idea is to keep track of where the indexes have moved and test whether the next swap would be valid. An additional parameter is added for the number of swaps to make.
from random import randint

def swap(a, b, L):
    L[a], L[b] = L[b], L[a]

def magicFunction(L, d, numSwaps):
    n = len(L)
    new = list(range(0, n))
    for i in xrange(0, numSwaps):
        x = randint(0, n - 1)
        y = randint(max(0, x - d), min(n - 1, x + d))
        while abs(new[x] - y) > d or abs(new[y] - x) > d:
            y = randint(max(0, x - d), min(n - 1, x + d))
        swap(x, y, new)
        swap(x, y, L)
    return L

print(magicFunction([1,2,3,4], 2, 3))            # [2, 1, 4, 3]
print(magicFunction([1,2,3,4,5,6,7,8,9], 2, 4))  # [2, 3, 1, 5, 4, 6, 8, 7, 9]
Using print(collections.Counter(tuple(magicFunction([0, 1, 2], 1, 1)) for i in xrange(1000))) we find that the identity permutation comes up heavy with this code (the reason why is left as an exercise for the reader).
Alternatively, we can think of it as looking for a permutation matrix with interval restrictions, where M(i, j) can equal 1 only if abs(i - j) <= d. We can construct a one-off random path by picking a random j for each row from those still available. The x's in the example below represent matrix cells that would invalidate the solution (the northwest-to-southeast diagonal would represent the identity permutation), and restrictions records how many i's are still available for each j. (Adapted from my previous version to choose both the next i and the next j randomly, inspired by user2357112's answer.)
n = 5, d = 2
Start:
0 0 0 x x
0 0 0 0 x
0 0 0 0 0
x 0 0 0 0
x x 0 0 0
restrictions = [3,4,5,4,3] # how many i's are still available for each j
1.
0 0 1 x x # random choice
0 0 0 0 x
0 0 0 0 0
x 0 0 0 0
x x 0 0 0
restrictions = [2,3,0,4,3] # update restrictions in the neighborhood of (i ± d)
2.
0 0 1 x x
0 0 0 0 x
0 0 0 0 0
x 0 0 0 0
x x 0 1 0 # random choice
restrictions = [2,3,0,0,2] # update restrictions in the neighborhood of (i ± d)
3.
0 0 1 x x
0 0 0 0 x
0 1 0 0 0 # random choice
x 0 0 0 0
x x 0 1 0
restrictions = [1,0,0,0,2] # update restrictions in the neighborhood of (i ± d)
only one choice for j = 0 so it must be chosen
4.
0 0 1 x x
1 0 0 0 x # dictated choice
0 1 0 0 0
x 0 0 0 0
x x 0 1 0
restrictions = [0,0,0,0,2] # update restrictions in the neighborhood of (i ± d)
Solution:
0 0 1 x x
1 0 0 0 x
0 1 0 0 0
x 0 0 0 1 # dictated choice
x x 0 1 0
[2,0,1,4,3]
Python code (adapted from my previous version to choose both the next i and the next j randomly, inspired by user2357112's answer):
from random import randint, choice
import collections

def magicFunction(L, d):
    n = len(L)
    restrictions = [None] * n
    restrict = -1
    solution = [None] * n
    for i in xrange(0, n):
        restrictions[i] = abs(max(0, i - d) - min(n - 1, i + d)) + 1
    while True:
        availableIs = (filter(lambda x: solution[x] == None, [i for i in xrange(n)])
                       if restrict == -1 else
                       filter(lambda x: solution[x] == None,
                              [j for j in xrange(max(0, restrict - d), min(n, restrict + d + 1))]))
        if not availableIs:
            L = [L[i] for i in solution]
            return L
        i = choice(availableIs)
        availableJs = filter(lambda x: restrictions[x] != 0,
                             [j for j in xrange(max(0, i - d), min(n, i + d + 1))])
        nextJ = restrict if restrict != -1 else choice(availableJs)
        restrict = -1
        solution[i] = nextJ
        restrictions[nextJ] = 0
        for j in xrange(max(0, i - d), min(n, i + d + 1)):
            if j == nextJ or restrictions[j] == 0:
                continue
            restrictions[j] = restrictions[j] - 1
            if restrictions[j] == 1:
                restrict = j

print(collections.Counter(tuple(magicFunction([0, 1, 2], 1)) for i in xrange(1000)))
Using print(collections.Counter(tuple(magicFunction([0, 1, 2], 1)) for i in xrange(1000))) we find that the identity permutation comes up light with this code (why is left as an exercise for the reader).
Here's an adaptation of #גלעד ברקן's code that takes only one pass through the list (in random order) and swaps only once (using a random choice of possible positions):
from random import choice, shuffle

def magicFunction(L, d):
    n = len(L)
    swapped = [0] * n  # 0: position not swapped, 1: position was swapped
    positions = list(xrange(0, n))  # list of positions: 0..n-1
    shuffle(positions)  # randomize positions
    for x in positions:
        if swapped[x]:  # only swap an item once
            continue
        # find all possible positions to swap
        possible = [i for i in xrange(max(0, x - d), min(n, x + d)) if not swapped[i]]
        if not possible:
            continue
        y = choice(possible)  # choose another possible position at random
        if x != y:
            L[y], L[x] = L[x], L[y]  # swap with that position
            swapped[x] = swapped[y] = 1  # mark both positions as swapped
    return L
Here is a refinement of the above code that simply finds all possible adjacent positions and chooses one:
from random import choice

def magicFunction(L, d):
    n = len(L)
    positions = list(xrange(0, n))  # list of positions: 0..n-1
    for x in xrange(0, n):
        # find all possible positions to swap
        possible = [i for i in xrange(max(0, x - d), min(n, x + d)) if abs(positions[i] - x) <= d]
        if not possible:
            continue
        y = choice(possible)  # choose another possible position at random
        if x != y:
            L[y], L[x] = L[x], L[y]  # swap with that position
            positions[x] = y
            positions[y] = x
    return L

How to change this recursive function (return a list of paths for a 3X3 matrix) into iterative function in Python 2.7?

The recursive function below finds all the paths through a 3x3 matrix, from top left to bottom right, moving only down or right. But I want to change it into an iterative function, so that I can then edit it to find just one specific complete path (from top left to bottom right, moving right or down) whose values sum to a desired number, e.g. 12. This is especially important for a bigger matrix, e.g. a 9 x 1000 one. How do I do it?
Note for Danoran:
The values are always positive. If you look at my 3x3 matrix a, you see values of 1s, 2s and 3s. So, for example, moving from 1 to 1 to 1 to 2 to 3 (the goal) is a complete path, and the sum is 8.
The code below finds all the paths only.
a = []
for i in range(3):
    r = []
    for j in range(3):
        r.append(i + 1)
    a.append(r)
# a is now the matrix:
# 1 1 1
# 2 2 2
# 3 3 3
all_paths = []

def printall(currentRow, currentColumn, nums):
    if currentRow == len(a) - 1:
        for i in range(currentColumn, len(a[0])):
            nums.append(a[currentRow][i])
        all_paths.append(nums)
        return all_paths
    if currentColumn == len(a[0]) - 1:
        for i in range(currentRow, len(a)):
            nums.append(a[i][currentColumn])
        all_paths.append(nums)
        return all_paths
    nums.append(a[currentRow][currentColumn])
    printall(currentRow + 1, currentColumn, nums[:])
    printall(currentRow, currentColumn + 1, nums[:])

printall(0, 0, [])
print all_paths
If there are R rows and C columns, you have to make R-1 down-jumps and C-1 right-jumps. That's invariant. The only variation is in the order of the jumps. If we say dj = R-1 and rj = C-1, then the total number of paths is (dj+rj)!/(dj!rj!).

So, we can simply iterate through all the unique permutations. Note that itertools.permutations() will generate all permutations, not just the unique ones, so we have to filter out the repeats. Of course, this also means the run time will be proportional to (dj+rj)!, the number of non-unique permutations. I won't go into how to efficiently generate unique permutations; see, for example, Question 22431637.

In the code below, I've increased the number of rows to 4, to help distinguish rows from columns.
from itertools import permutations

a = []
for i in range(4):
    r = []
    for j in range(3):
        r.append(i + 1)
    a.append(r)
#print a  # Uncomment to trace execution

all_paths = []

def gen_all_paths(matrix):
    rows = len(matrix)
    cols = len(matrix[0])
    dj = rows - 1  # down-jumps
    rj = cols - 1  # right-jumps
    pathmix = 'd' * dj + 'r' * rj
    prev = ()
    for path in permutations(pathmix):
        if path <= prev:  # filter out repeats
            continue
        prev = path
        r, c = 0, 0
        cells = [matrix[0][0]]
        for i in path:
            if i == 'd':
                r += 1
            else:
                c += 1
            cells.append(matrix[r][c])
        #print ''.join(path), cells  # Uncomment to trace execution
        all_paths.append(cells)

gen_all_paths(a)
print all_paths
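To address the conversion question directly, below is a sketch (not part of the answer above; find_path_with_sum is a made-up name) of an explicit-stack depth-first search that returns the first complete path whose cell values add up to a desired total. The stack replaces the recursion, so a large matrix will not hit Python's recursion limit:

def find_path_with_sum(matrix, target):
    rows, cols = len(matrix), len(matrix[0])
    # each stack entry is (row, col, running_sum, path_so_far)
    stack = [(0, 0, matrix[0][0], [matrix[0][0]])]
    while stack:
        r, c, s, path = stack.pop()
        if r == rows - 1 and c == cols - 1:
            if s == target:
                return path
            continue
        if r + 1 < rows:  # move down
            stack.append((r + 1, c, s + matrix[r + 1][c], path + [matrix[r + 1][c]]))
        if c + 1 < cols:  # move right
            stack.append((r, c + 1, s + matrix[r][c + 1], path + [matrix[r][c + 1]]))
    return None

a = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
print(find_path_with_sum(a, 8))  # [1, 1, 1, 2, 3]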

Recursive generator on list of lists

I have a 2-dimensional list that looks like this:
[[1,2,3],
 [4,5,6],
 [7,8,9]]
I'm trying to write a generator that yields the sum of a 'path'.
A 'path' starts from the top-left corner and moves only in x+1 and y+1 steps until it gets to its last element (the bottom right).
For example, a valid path is 1 => 2 => 5 => 6 => 9 (sum = 23).
A non-valid path could be 1 => 2 => 5 => 4 => ...
So far I have this code:
my_list = [[0, 2, 5], [1, 1, 3], [2, 1, 1]]

def gen(x, y, _sum):
    if x + 1 <= len(my_list):
        for i1 in gen(x + 1, y, _sum + my_list[y][x]):
            yield _sum
    if y + 1 <= len(my_list):
        for i2 in gen(x, y + 1, _sum + my_list[y][x]):
            yield _sum
    yield _sum + my_list[y][x]

g = gen(0, 0, 0)
total = 0
for elm in g:
    total = + elm
print total
I get the error:
for i2 in gen(x, y+1, _sum+my_list[y][x]):
IndexError: list index out of range
The reason for this error is a simple off-by-one error.*
I think what you wanted here is x < len(my_list) - 1 or, equivalently, x + 1 < len(my_list); you've doubled up the +1-ness, causing you to run past the end of the list.
Consider a concrete case:
len(my_list) is 3 and x is 2. So x + 1 <= len(my_list) is 3 <= 3, which is true, and you call yourself recursively with gen(3, …).
In that recursive call, 4 <= 3 is false, so, depending on the value of y, you evaluate either:
gen(x, y + 1, _sum + my_list[y][3]), or
_sum + my_list[y][3]
… either of which will raise an IndexError.
Obviously you need to fix the same problem with y as with x.
With those bounds fixed, it runs without errors.
Of course it doesn't actually print out the right result, because there are other problems in your code. Off the top of my head:
total = + elm replaces whatever's in total with the value of elm. You probably wanted +=, not = + here.
Yielding _sum over and over and ignoring the values yielded by the recursive generators can't possibly be doing any good. Maybe you wanted to yield i1 and i2 instead?
I can't guarantee that those are the only problems in your code, just that they are problems.
* I'm assuming here that this is a silly bug, not a fundamental error—you clearly know that indexes are 0-based, since you called the function with gen(0, 0, 0) rather than gen(1, 1, 0).
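Putting those fixes together, here is a corrected sketch of the generator (not part of the answer above): the bounds are tightened, the recursive values are propagated, and each complete path yields exactly one total, so summing the generator adds up all the path sums:

my_list = [[0, 2, 5], [1, 1, 3], [2, 1, 1]]

def gen(x, y, _sum):
    # at the bottom-right cell the path is complete
    if x == len(my_list[0]) - 1 and y == len(my_list) - 1:
        yield _sum + my_list[y][x]
        return
    if x < len(my_list[0]) - 1:
        for s in gen(x + 1, y, _sum + my_list[y][x]):
            yield s
    if y < len(my_list) - 1:
        for s in gen(x, y + 1, _sum + my_list[y][x]):
            yield s

print(sum(gen(0, 0, 0)))  # 38: the six path sums 11 + 7 + 5 + 6 + 4 + 5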
If you really wanted to brute-force all permissible paths through an N x M matrix, then simply generate all permutations of N - 1 moves down plus M - 1 moves to the right, then use those moves to sum the values along each path:
from itertools import permutations

def gen_path_sum(matrix):
    N, M = len(matrix), len(matrix[0])
    for path in permutations([(1, 0)] * (N - 1) + [(0, 1)] * (M - 1)):
        sum = matrix[0][0]
        x = y = 0
        for dx, dy in path:
            x += dx; y += dy
            sum += matrix[x][y]
        yield sum
This'll produce (N + M - 2)! paths (including duplicates); there are 24 such paths for a 3 by 3 matrix.
However, if you are trying to find the maximum path through the matrix, you are going about it the inefficient way.
You can instead calculate the maximum path value for any cell in the matrix; it is simply the greater of the maximum path values of the cell above and the cell to the left, plus the value of the current cell. So for the cell in the top left (with no cells above or to the left), the maximum path value is the value of the cell itself.
You can calculate all those values with an N x M loop:
def max_path_value(matrix):
    totals = [row[:] for row in matrix]
    for x, row in enumerate(totals):
        for y, cell in enumerate(row):
            totals[x][y] += max(
                totals[x - 1][y] if x else 0,
                totals[x][y - 1] if y else 0
            )
    return totals[-1][-1]
This only takes N x M steps, or 9 steps in total for your 3 by 3 matrix, versus generating 24 permutations for the brute-force approach.
The contrast only increases as your matrix sizes grow; a 10x10 matrix, brute forced, requires examining 18! == 6402373705728000 permutations, or you can just calculate the maximum path with 100 steps instead.
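For instance, running it on the matrix from the question:

my_list = [[0, 2, 5], [1, 1, 3], [2, 1, 1]]
print(max_path_value(my_list))  # 11, via the path 0 -> 2 -> 5 -> 3 -> 1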
