Variation of max path sum problem using more directions - python

I don't know how to approach this question.
We're given an N*N grid listing the costs to get from location a to location b.
Each row in the grid tells us the cost of getting from location to location (each location corresponds to a row in the costs array). (We say that location a is bigger than location b if row a appears after row b in the costs array. The index of every row is a location). We may choose to start from any given location, and visit every location exactly once. At every location p that we visit, we must have already visited all locations less than p, or no locations less than p.
costs[a][b] gives us the cost to move from location a to location b.
costs[a][b] is not necessarily the same as costs[b][a].
costs[a][a] = 0 for every index a (diagonals in the costs array are always 0).
Our task is to find the maximum-sum cost of a valid path.
If the costs array is:
[[0, 9, 1],
[5, 0, 2],
[4, 6, 0]]
The max cost will consequently be 13, as the most expensive valid path is location 2 -> location 0 -> location 1.
The first row tells us how much it will cost to get from location 0 to location 0 (remain in the same location, costs us 0), 0 to location 1 (costs us 9) and 0 to location 2 (costs us 1). The second and third rows follow the same pattern.

The requirements on which locations you can visit mean that after you start at some location i, you're forced to move to a lower location repeatedly until you're at location 0. At that point, you have to ascend consecutively through all the locations that are unvisited. The dynamic programming solution is not obvious, but with a fairly complex implementation you can get an O(n^3) DP algorithm with standard techniques.
It turns out there's an O(n^2) solution as well, which is optimal. It also uses O(n) extra space, which is maybe also optimal. The solution comes from thinking about the structure of our visits: there's a downward sequence of indices (possibly with gaps) ending at 0, and then an upward sequence starting at 0 that contains all other indices. There are 2^n possible subsequences though, so we'll have to think more to speed this up.
Two Sequences
Suppose we have i locations, 0, 1, ... i-1, and we've partitioned these into two ordered subsequences (except 0, which is at the start of both). We'll call these two sequences U and D, for up and down. Exactly one of them has to end on i-1. Without loss of generality, assume U ends with i-1 and D ends with j >= 0.
What happens when we add a location i? We either add it to the end of U so our sequences end on i and j, or we add it to the end of D so our sequences end on i-1 and i. If we add it to U, the path-sum of U (which we define as the sum of cost[u][v] for all adjacent indices u,v in U) increases by cost[i-1][i]. If we add the location to the end of D, the path-sum of D increases by cost[i][j] (since it's a downward sequence, we've flipped the indices relative to U).
It turns out that we only need to track the endpoints of our subsequences as we grow them, as well as the maximum combined path-sum for any pair of subsequences with those endpoints. If we let (i, j) denote the state where U ends with i and D ends with j, we can think about how we could have arrived here.
For example, at (8,5), our previous state must have had a subsequence containing 7, so our previous state must have been (7,5). Therefore max-value(8,5) = max-value(7,5) + cost[7][8]. We always have exactly one predecessor state when the two endpoints differ by more than one.
Now consider the state (8,7). We can't have come from (7,7), since the only number allowed to be in both sequences is 0. So we could have come from any of (0,7), (1,7), ... (6,7): we can choose whichever will maximize our path sum.
from typing import List

def solve(costs: List[List[int]]) -> int:
    n = len(costs)
    # Deal with edge cases
    if n == 1:
        return 0
    if n == 2:
        return max(costs[0][1], costs[1][0])
    ups = [costs[0][1]]
    downs = [costs[1][0]]
    # After iteration i, ups[j] denotes the max-value of state (i, j)
    # and downs[j] denotes the max-value of state (j, i)
    for i in range(2, n):
        ups.append(max(downs[j] + costs[j][i] for j in range(i - 1)))
        downs.append(max(ups[j] + costs[i][j] for j in range(i - 1)))
        up_gain = costs[i - 1][i]
        down_gain = costs[i][i - 1]
        for j in range(i - 1):
            ups[j] += up_gain
            downs[j] += down_gain
    return max(max(ups), max(downs))
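As a sanity check on the DP above, a small brute force can enumerate every visiting order and keep only the valid ones. This is exponential, so it's only for tiny inputs, and the function name is mine rather than from the original:

```python
from itertools import permutations

def brute_force(costs):
    # Enumerate every visiting order; an order is valid if, when we
    # arrive at p, the locations below p are all visited or all unvisited.
    n = len(costs)
    best = None
    for perm in permutations(range(n)):
        visited = set()
        valid = True
        for p in perm:
            lower = set(range(p))
            if visited & lower and not lower <= visited:
                valid = False
                break
            visited.add(p)
        if valid:
            total = sum(costs[a][b] for a, b in zip(perm, perm[1:]))
            best = total if best is None else max(best, total)
    return best
```

On the example above, brute_force([[0, 9, 1], [5, 0, 2], [4, 6, 0]]) returns 13, matching the DP.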


generate random directed fully-accessible adjacent probability matrix

given V nodes and E connections as parameters, how do I generate a random directed fully-!connected! adjacency probability matrix, where all the connection weights fanning out of a node sum to 1?
The idea is, after I pick a random starting node, to do a random walk according to the probabilities, thus generating similar-random-structured sequences.
Although I prefer an adj-matrix, a graph is OK too.
Of course the fan-out connections can be one or many.
Cycles are OK, just not self-loops.
I can do the walk using np.random.choice(nodes, prob)
Now that Jerome mentions it, it seems I was mistaken .. I don't want fully-connected BUT a closed loop where there are no islands of sub-graphs, i.e. all nodes are accessible via others.
Sorry, I don't know what this type of graph is called?
here is my complex solution ;(
def gen_adjmx(self):
    # assumes: from numpy.random import randint
    passx = 1
    c = 0  # connections so far
    # until enough conns are generated
    while c < self.nconns:
        # loop the rows
        for sym in range(self.nsyms):
            if c >= self.nconns:
                break
            if passx == 1:  # guarantees at least one connection per row
                self.adj[sym, randint(self.nsyms)] = randint(100)
                c += 1
            elif randint(2) == 1:  # maybe a conn?
                col = randint(self.nsyms)
                if self.adj[sym, col] > 0:  # already exists
                    continue
                self.adj[sym, col] = randint(100)
                c += 1
        passx += 1
    # normalize each row so the fan-out weights sum to 1 (axis=1, not axis=0)
    self.adj = self.adj / self.adj.sum(axis=1, keepdims=True)
You can simply create a random matrix and normalize the rows so that the sum is 1:
v = np.random.rand(n, n)
v /= v.sum(axis=1, keepdims=True)  # keepdims so each row is divided by its own sum
You mentioned that you want a graph which doesn't have any islands. I guess what you mean is that the adjacency matrix should be irreducible, i.e. the associated graph doesn't have any disconnected components.
One way to generate a random graph with the required property is to generate a random graph and then see if it has the property; throw it out and try again if it doesn't, otherwise keep it.
Here's a sketch of a solution with that in mind.
(1) generate a matrix n_vertices by n_vertices, which contains n_edges elements which are 1, and the rest are 0. This is a random adjacency matrix.
(2) test the adjacency matrix to see if it's irreducible. If so, keep it, otherwise go back to step 1.
I'm sure you can implement that in Python. I tried a proof of concept in Maxima (https://maxima.sourceforge.io), since it's convenient in some ways. There are probably ways to go about it which directly construct an irreducible matrix.
I implemented the irreducibility test for a matrix A as whether sum(A^^k, k, 0, n) has any 0 elements, according to: https://math.stackexchange.com/a/1703650 That test becomes more and more expensive as the number of vertices grows; and as the ratio of edges to vertices decreases, it increases the probability that you'll have to repeat steps 1 and 2. Whether that's tolerable for you depends on the typical number of vertices and edges you're working with.
random_irreducible (n_vertices, n_edges) :=
block ([A, n: 1],
while not irreducible (A: random_adjacency (n_vertices, n_edges))
do n: n + 1,
[A, n]);
random_adjacency (n_vertices, n_edges) :=
block([list_01, list_01_permuted, get_element],
list_01: append (makelist (1, n_edges), makelist (0, n_vertices^2 - n_edges)),
list_01_permuted: random_permutation (list_01),
get_element: lambda ([i, j], list_01_permuted[1 + (i - 1) + (j - 1)*n_vertices]),
genmatrix (get_element, n_vertices, n_vertices));
irreducible (A) :=
is (member (0, flatten (args (sum (A^^k, k, 0, length(A))))) = false);
A couple of things: one is that I left out the part about normalizing the edge weights so they sum to 1. You'll have to add that part to get a transition matrix and not just an adjacency matrix. The other is that I didn't prevent elements on the diagonal, i.e., you can stay on a vertex instead of always going to another one. If that's important, you'll have to deal with that too.
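For reference, the same generate-and-test idea can be sketched in Python with numpy. The function names mirror the Maxima ones but are my own, and this is a sketch rather than a drop-in solution:

```python
import numpy as np

def random_adjacency(n_vertices, n_edges, rng):
    # Scatter n_edges ones into an n x n zero matrix.
    flat = np.zeros(n_vertices * n_vertices, dtype=int)
    flat[rng.choice(flat.size, size=n_edges, replace=False)] = 1
    return flat.reshape(n_vertices, n_vertices)

def irreducible(A):
    # A is irreducible iff sum(A^k, k=0..n) has no zero entries.
    # Clip each power to 0/1 so the entries stay small.
    n = len(A)
    acc = np.eye(n, dtype=int)
    power = np.eye(n, dtype=int)
    for _ in range(n):
        power = np.minimum(power @ A, 1)
        acc += power
    return bool(np.all(acc > 0))

def random_irreducible(n_vertices, n_edges, seed=None):
    # Steps 1 and 2 above: retry until the random matrix is irreducible.
    rng = np.random.default_rng(seed)
    while True:
        A = random_adjacency(n_vertices, n_edges, rng)
        if irreducible(A):
            return A
```

You'd still normalize each row afterwards to turn the adjacency matrix into a transition matrix, and zero the diagonal first if self-loops aren't wanted.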

Finding minimum number of jumps increasing the value of the element

Optimizing a leetcode-style question - DP/DFS
The task is the following:
Given N heights, find the minimum number of suboptimal jumps required to go from start to end. [1-D Array]
A jump is suboptimal, if the height of the starting point i is less or equal to the height of the target point j.
A jump is possible, if j-i >= k, where k is the maximal jump distance.
For the first subtask, there is only one k value.
For the second subtask, there are two k values; output the amount of suboptimal jumps for each k value.
For the third subtask, there are 100 k values; output the amount of suboptimal jumps for each k value.
My Attempt
The following snippet is my shot at solving the problem, it gives the correct solution.
This was optimized to handle multiple k values without having to do a lot of unnecessary work.
The problem is that even the solution for a single k value is O(N^2) in the worst case (as k <= N).
A solution would be to eliminate the nested for loop; that's what I'm uncertain how to approach.
def solve(testcase):
    N, Q = 10, 1
    h = [1, 2, 4, 2, 8, 1, 2, 4, 8, 16]  # output 3
    #    ^---- + ---^ 0 ^--- + --^ + ^
    k = [3]
    l_k = max(k)
    distances = [99999999999] * N
    distances[N-1] = 0
    db = [[0] * N for i in range(N)]
    for i in range(N-2, -1, -1):
        minLocalDistance = 99999999999
        for j in range(min(i + l_k, N - 1), i, -1):
            minLocalDistance = min(minLocalDistance, distances[j] + (h[i] <= h[j]))
            db[i][j] = distances[j] + (h[i] <= h[j])
        distances[i] = minLocalDistance
    print(f"Case #{testcase}: {distances[0]}")
NOTE: This is different from the classic min. jumps problem
Consider the best cost to get to a position i. It is the smaller of:
The minimum cost to get to any of the preceding k positions, plus one (a suboptimal jump); or
The minimum cost to get to any of the lower-height positions in the same window (an optimal jump).
Case (1) can be handled with the sliding-window-minimum algorithm that you can find described, for example, here: Sliding window maximum in O(n) time. This takes amortized constant time per position, or O(N) all together.
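For reference, the building block for case (1) looks something like this: a standard monotonic-deque sliding-window minimum, not code from the original post:

```python
from collections import deque

def sliding_window_min(values, k):
    # mins[i] = min(values[max(0, i - k + 1) : i + 1])
    dq = deque()  # indices whose values are increasing
    mins = []
    for i, v in enumerate(values):
        while dq and values[dq[-1]] >= v:
            dq.pop()  # drop elements that can never be the minimum again
        dq.append(i)
        if dq[0] <= i - k:
            dq.popleft()  # the front has slid out of the window
        mins.append(values[dq[0]])
    return mins
```

For example, sliding_window_min([4, 2, 5, 1, 3], 2) gives [4, 2, 2, 1, 1], the minimum over each length-2 window ending at each index.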
Case (2) has a somewhat obvious solution with a BST: As the window moves, insert each new position into a BST sorted by height. Remove positions that are no longer in the window. Additionally, in each node, store the minimum cost within its subtree. With this structure, you can find the minimum cost for any height bound in O(log k) time.
The expense in case 2 leads to a total complexity of O(N log k) for a single k-value. That's not too bad for complexity, but such BSTs are somewhat complicated and aren't usually provided in standard libraries.
You can make this simpler and faster by recognizing that if the minimum cost in the window is C, then optimal jumps are only beneficial if they come from predecessors of cost C, because cost C+1 is attainable with a sub-optimal jump.
For each cost, then, you can use that same sliding-window-minimum algorithm to keep track of the minimum height in the window for nodes with that cost. Then for case (2), you just need to check to see if that minimum height for the minimum cost is lower than the height you want to jump to.
Maintaining these sliding windows again takes amortized constant time per operation, leading to O(N) time for the whole single-k-value algorithm.
I doubt that there would be any benefit in trying to manage multiple k-values at once.

Fast way to check consecutive subsequences for total

I have a list (up to 10,000 long) of numbers 0, 1, or 2.
I need to see how many consecutive subsequences have a total which is NOT 1. My current method is, for each list, to do:
cons = 0
seqlen = len(twos)  # twos is the list of 0/1/2 values
for i in range(seqlen + 1):
    for j in range(i + 1, seqlen + 1):
        if sum(twos[i:j]) != 1:
            cons += 1
So an example input would be:
[0, 1, 2, 0]
and the output would be
cons = 8
as the 8 working subsequences are:
[0] [2] [0] [1,2] [2, 0] [0, 1, 2] [1, 2, 0] [0, 1, 2, 0]
The issue is that simply going through all these subsequences (the i in range, j in range) takes almost more time than is allowed, and when the if statement is added, the code takes far too long to run on the server. (To be clear, this is only a small part of a larger problem, I'm not just asking for the solution to an entire problem). Anyway, is there any other way to check faster? I can't think of anything that wouldn't result in more operations needing to happen every time.
I think I see the problem: your terminology is redundant. A contiguous run of elements is usually called a substring or subarray; strictly speaking a subsequence need not be consecutive, but a "consecutive subsequence" is exactly such a run.
Do not sum every candidate. Instead, identify every candidate whose sum is 1, and then subtract that total from the computed quantity of all sub-sequences (simple algebra).
All of the 1-sum candidates are of the regular expression form 0*10*: a 1 surrounded by any quantity of 0s on either or both sides.
Identify all such maximal-length strings. For instance, in
210002020001002011
you will pick out 1000, 000100, 01, and 1. For each string compute the quantity of substrings that contain the 1 (a simple equation on the lengths of the 0s on each side). Add up those quantities. Subtract from the total for the entire input. There's your answer.
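That recipe translates to a short linear-time pass. A sketch (the helper name is mine), where a 1 with a zeros to its left and b zeros to its right accounts for (a+1)*(b+1) sum-1 runs:

```python
def count_not_one(seq):
    n = len(seq)
    total = n * (n + 1) // 2  # all nonempty contiguous runs
    sum_one = 0
    for i, v in enumerate(seq):
        if v != 1:
            continue
        left = 0
        j = i - 1
        while j >= 0 and seq[j] == 0:  # zeros immediately left of the 1
            left += 1
            j -= 1
        right = 0
        j = i + 1
        while j < n and seq[j] == 0:  # zeros immediately right of the 1
            right += 1
            j += 1
        sum_one += (left + 1) * (right + 1)
    return total - sum_one
```

On the question's example, count_not_one([0, 1, 2, 0]) returns 8.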
Use the sliding-window technique to solve this type of problem. Keep two variables, first and last, to track the ends of the window. Start with the sum equal to the first element. If the sum is larger than the required value, subtract the element at first from the sum and increment first by 1. If the sum is smaller than required, add the element at last to the sum and increment last by 1. Every time the sum equals the required value, increment a counter.
As for the NOT, count the number of sub-sequences having sum 1 and then subtract from the total number of sub-sequences possible, i.e. n * (n + 1) / 2.
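Since all elements are non-negative, the window idea above is easiest to get right with the standard "at most S" trick: count windows with sum <= 1, subtract windows with sum <= 0, and the difference is the number of windows with sum exactly 1. A sketch, with my own function names:

```python
def windows_at_most(seq, s):
    # Number of contiguous runs with sum <= s (elements non-negative).
    count = first = total = 0
    for last, v in enumerate(seq):
        total += v
        while total > s:
            total -= seq[first]  # shrink the window from the left
            first += 1
        count += last - first + 1  # runs ending at `last`
    return count

def count_not_one_sliding(seq):
    n = len(seq)
    sum_one = windows_at_most(seq, 1) - windows_at_most(seq, 0)
    return n * (n + 1) // 2 - sum_one
```

On the question's example, count_not_one_sliding([0, 1, 2, 0]) returns 8.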

Random contiguous slice of list in Python based on a single random integer

Using a single random number and a list, how would you return a random slice of that list?
For example, given the list [0,1,2] there are seven possibilities of random contiguous slices:
[ ]
[ 0 ]
[ 0, 1 ]
[ 0, 1, 2 ]
[ 1 ]
[ 1, 2]
[ 2 ]
Rather than getting a random starting index and a random end index, there must be a way to generate a single random number and use that one value to figure out both starting index and end/length.
I need it that way, to ensure these 7 possibilities have equal probability.
Simply fix one order in which you would sort all possible slices, then work out a way to turn an index in that list of all slices back into the slice endpoints. For example, the order you used could be described by
The empty slice is before all other slices
Non-empty slices are ordered by their starting point
Slices with the same starting point are ordered by their endpoint
So the index 0 should return the empty list. Indices 1 through n should return [0:1] through [0:n]. Indices n+1 through n+(n-1)=2n-1 would be [1:2] through [1:n]; 2n through n+(n-1)+(n-2)=3n-3 would be [2:3] through [2:n] and so on. You see a pattern here: the last index for a given starting point is of the form n+(n-1)+(n-2)+(n-3)+…+(n-k), where k is the starting index of the sequence. That's an arithmetic series, so that sum is (k+1)(2n-k)/2=(2n+(2n-1)k-k²)/2. If you set that term equal to a given index, and solve that for k, you get some formula involving square roots. You could then use the ceiling function to turn that into an integral value for k corresponding to the last index for that starting point. And once you know k, computing the end point is rather easy.
But the quadratic equation in the solution above makes things really ugly. So you might be better off using some other order. Right now I can't think of a way which would avoid such a quadratic term. The order Douglas used in his answer doesn't avoid square roots, but at least his square root is a bit simpler due to the fact that he sorts by end point first. The order in your question and my answer is called lexicographical order, his would be called reverse lexicographical and is often easier to handle since it doesn't depend on n. But since most people think about normal (forward) lexicographical order first, this answer might be more intuitive to many and might even be the required way for some applications.
Here is a bit of Python code which lists all sequence elements in order, and does the conversion from index i to endpoints [k:m] the way I described above:
from math import ceil, sqrt

n = 3
print("{:3} []".format(0))
for i in range(1, n*(n+1)//2 + 1):
    b = 1 - 2*n
    c = 2*(i - n) - 1
    # solve k^2 + b*k + c = 0
    k = int(ceil((-b - sqrt(b*b - 4*c))/2.))
    m = k + i - k*(2*n - k + 1)//2
    print("{:3} [{}:{}]".format(i, k, m))
The - 1 term in c doesn't come from the mathematical formula I presented above. It's more like subtracting 0.5 from each value of i. This ensures that even if the result of sqrt is slightly too large, you won't end up with a k which is too large. So that term accounts for numeric imprecision and should make the whole thing pretty robust.
The term k*(2*n-k+1)//2 is the last index belonging to starting point k-1, so i minus that term is the length of the subsequence under consideration.
You can simplify things further. You can perform some computation outside the loop, which might be important if you have to choose random sequences repeatedly. You can divide b by a factor of 2 and then get rid of that factor in a number of other places. The result could look like this:
from math import ceil, sqrt

n = 3
b = n - 0.5
bbc = b*b + 2*n + 1
print("{:3} []".format(0))
for i in range(1, n*(n+1)//2 + 1):
    k = int(ceil(b - sqrt(bbc - 2*i)))
    m = k + i - k*(2*n - k + 1)//2
    print("{:3} [{}:{}]".format(i, k, m))
It is a little strange to give the empty list equal weight with the others. It is more natural for the empty list to be given weight 0 or n+1 times the others, if there are n elements on the list. But if you want it to have equal weight, you can do that.
There are n*(n+1)/2 nonempty contiguous sublists. You can specify these by the end point, from 0 to n-1, and the starting point, from 0 to the endpoint.
Generate a random integer x from 0 to n*(n+1)/2.
If x=0, return the empty list. Otherwise, x is uniformly distributed from 1 through n(n+1)/2.
Compute e = floor(sqrt(2*x)-1/2). This takes the values 0, 1, 1, 2, 2, 2, 3, 3, 3, 3, etc.
Compute s = (x-1) - e*(e+1)/2. This takes the values 0, 0, 1, 0, 1, 2, 0, 1, 2, 3, ...
Return the interval starting at index s and ending at index e.
(s,e) takes the values (0,0),(0,1),(1,1),(0,2),(1,2),(2,2),...
import random
import math

n = 10
x = random.randint(0, n*(n+1)//2)
if x == 0:
    print(list(range(n))[0:0])  # empty list
    exit()
e = int(math.floor(math.sqrt(2*x) - 0.5))
s = int(x - 1 - (e*(e+1)//2))
print(list(range(n))[s:e+1])  # starting at s, ending at e, inclusive
First create all possible slice indexes.
[0:0], [1:1], etc are equivalent, so we include only one of those.
Finally you pick a random index couple, and apply it.
import random

l = [0, 1, 2]
combination_couples = [(0, 0)]
length = len(l)
# Creates all index couples.
for j in range(1, length + 1):
    for i in range(j):
        combination_couples.append((i, j))
print(combination_couples)

rand_tuple = random.sample(combination_couples, 1)[0]
final_slice = l[rand_tuple[0]:rand_tuple[1]]
print(final_slice)
To ensure we got them all:
for i in combination_couples:
    print(l[i[0]:i[1]])
Alternatively, with some math...
For a length-3 list the possible slice index values are 0 through 3, that is n=4. A slice uses 2 of them, that is k=2. The first index has to be smaller than the second, therefore we need to calculate the combinations as described here.
from math import factorial as f

def total_combinations(n, k=2):
    result = 1
    for i in range(1, k + 1):
        result *= n - k + i
    result //= f(k)  # integer division keeps the count an int
    # We add plus 1 since we included [0:0] as well.
    return result + 1

print(total_combinations(n=4))  # Prints 7 as expected.
there must be a way to generate a single random number and use that one value to figure out both starting index and end/length.
It is difficult to say which method is best, but if you're only interested in binding a single random number to your contiguous slice, you can use modulo.
Given a list l and a single random number r you can get your contiguous slice like this:
l[r % len(l) : some_sparkling_transformation(r) % len(l)]
where some_sparkling_transformation(r) is essential. It depends on your needs, but since I don't see any special requirements in your question it could be, for example:
l[r % len(l) : (2 * r) % len(l)]
The most important thing here is that both the left and right edges of the slice are correlated to r. This makes it hard to define contiguous slices that won't follow any observable pattern; the example above (with 2 * r) produces slices that are always empty lists or follow a pattern of [a : 2 * a].
Let's use some intuition. We know that we want to find a good random representation of the number r in the form of a contiguous slice. It turns out that we need to find two numbers, a and b, which are respectively the left and right edges of the slice. Assuming that r is a good random number (we like it in some way), we can say that a = r % len(l) is a good approach.
Let's now try to find b. The best way to generate another nice random number is to use a random number generator (random or numpy), both of which support seeding. An example with the random module:
import random

def contiguous_slice(l, r):
    random.seed(r)
    a = int(random.uniform(0, len(l) + 1))
    b = int(random.uniform(0, len(l) + 1))
    a, b = sorted([a, b])
    return l[a:b]
Good luck and have fun!

Sorting Technique Python

I'm trying to create a sorting technique that sorts a list of numbers. What it does is compare two numbers: the first being the first number in the list, and the other being the number at a distance of 2^k - 1 from it.
2^k - 1 = [1, 3, 7, 15, 31, 63...]
For example, if I had a list [1, 4, 3, 6, 2, 10, 8, 19]
The length of this list is 8. So the program should find a number in the 2^k - 1 list that is less than 8; in this case it will be 7.
So now it will compare the first number in the random list (1) with the 7th number after it (19). If it is greater than the second number, they swap positions.
After this step, it will continue on to 4 and the 7th number after that, but that doesn't exist, so now it should compare with the 3rd number after 4, because 3 is the next number in the 2^k - 1 list.
So it should compare 4 with 2 and swap if they are not in the right place. This goes on until we reach 1 in the 2^k - 1 list, at which point the list will finally be sorted.
I need help getting started on this code.
So far, I've written a small code that makes the 2^k - 1 list, but that's as far as I've gotten.
a = []
for i in range(10):
    a.append(2**(i+1) - 1)
print(a)
EXAMPLE:
Consider sorting the sequence V = 17,4,8,2,11,5,14,9,18,12,7,1. The skipping sequence 1, 3, 7, 15, ... yields r=7 as the biggest value which fits, so looking at V, the first sparse subsequence = 17,9. As we pass along V we produce 9,4,8,2,11,5,14,17,18,12,7,1 after the first swap, and 9,4,8,2,1,5,14,17,18,12,7,11 after using r=7 completely. Using a=3 (the next smaller term in the skipping sequence), the first sparse subsequence = 9,2,14,12, which when applied to V gives 2,4,8,9,1,5,12,17,18,14,7,11, and the remaining a=3 sorts give 2,1,8,9,4,5,12,7,18,14,17,11, and then 2,1,5,9,4,8,12,7,11,14,17,18. Finally, with a=1, we get 1,2,4,5,7,8,9,11,12,14,17,18.
You might wonder, given that at the end we do a sort with no skips, why this might be any faster than simply doing that final step as the only step at the beginning. Think of it as a comb going through the sequence: notice that in the earlier steps we're using coarse combs to get distant things in the right order, using progressively finer combs until at the end our fine-tuning is dealing with a nearly-sorted sequence needing little adjustment.
p = 0
x = len(V)  # the length of V, used to pick the gap from a
for j in a:  # for every element in a (1, 3, 7, ...)
    if x >= j:  # if the length is greater than or equal to the current value
        p = j  # p ends up as the largest gap that fits
So that finds the distance at which to compare the first number in the list, but now I need to write something that keeps doing that until the distance is out of range, so that it switches from 3 down to 1 and just checks the smaller distances until the list is sorted.
The sorting algorithm you're describing actually is called Combsort. In fact, the simpler bubblesort is a special case of combsort where the gap is always 1 and doesn't change.
Since you're stuck on how to start this, here's what I recommend:
Implement the bubblesort algorithm first. The logic is simpler and makes it much easier to reason about as you write it.
Once you've done that you have the important algorithmic structure in place, and from there it's just a matter of adding gap-length calculation into the mix. This means computing the gap length with your particular formula, then modifying the loop control index and the inner comparison index to use the calculated gap length.
After each iteration of the loop you decrease the gap length (in effect making the comb finer) by some scaling amount.
The last step would be to experiment with different gap lengths and formulas to see how it affects algorithm efficiency.
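Putting those steps together, a sketch of the gap-based sort with the 2^k - 1 gaps from the question might look like this (one reasonable reading of the algorithm, not the only one):

```python
def comb_sort(v):
    v = list(v)
    n = len(v)
    # Build the 2**k - 1 gaps that fit the list, in increasing order.
    gaps = []
    k = 1
    while 2**k - 1 < n:
        gaps.append(2**k - 1)
        k += 1
    for gap in reversed(gaps):  # largest gap first, down to 1
        # Bubble-style passes at this gap until no swaps occur;
        # the final gap is 1, which finishes the sort.
        swapped = True
        while swapped:
            swapped = False
            for i in range(n - gap):
                if v[i] > v[i + gap]:
                    v[i], v[i + gap] = v[i + gap], v[i]
                    swapped = True
    return v
```

On the example sequence from the answer above, comb_sort([17, 4, 8, 2, 11, 5, 14, 9, 18, 12, 7, 1]) returns the fully sorted list.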
