Optimizing a leetcode-style question - DP/DFS
The task is the following:
Given N heights, find the minimum number of suboptimal jumps required to go from start to end. [1-D Array]
A jump is suboptimal if the height of the starting point i is less than or equal to the height of the target point j.
A jump from i to j is possible if 0 < j-i <= k, where k is the maximal jump distance.
For the first subtask, there is only one k value.
For the second subtask, there are two k values; output the number of suboptimal jumps for each k value.
For the third subtask, there are 100 k values; output the number of suboptimal jumps for each k value.
My Attempt
The following snippet is my shot at solving the problem; it gives the correct answer.
It was written to handle multiple k values without having to do a lot of unnecessary work.
The problem is that even a solution with a single k value is O(N^2) in the worst case (as k <= N).
A solution would be to eliminate the nested for loop, but I'm uncertain how to approach that.
def solve(testcase):
    N, Q = 10, 1
    h = [1, 2, 4, 2, 8, 1, 2, 4, 8, 16]  # output 3
    # ^---- + ---^ 0 ^--- + --^ + ^
    k = [3]
    l_k = max(k)
    distances = [99999999999] * N
    distances[N-1] = 0
    db = [[0]*N for i in range(N)]
    for i in range(N-2, -1, -1):
        minLocalDistance = 99999999999
        for j in range(min(i+l_k, N-1), i, -1):
            minLocalDistance = min(minLocalDistance, distances[j] + (h[i] <= h[j]))
            db[i][j] = distances[j] + (h[i] <= h[j])
        distances[i] = minLocalDistance
    print(f"Case #{testcase}: {distances[0]}")
NOTE: This is different from the classic min. jumps problem
Consider the best cost to get to a position i. It is the smaller of:
1. The minimum cost to get to any of the preceding k positions, plus one (a suboptimal jump); or
2. The minimum cost to get to any strictly higher position in the same window (an optimal jump, since a jump is only suboptimal when the starting height is less than or equal to the target height).
Case (1) can be handled with the sliding-window-minimum algorithm that you can find described, for example, here: Sliding window maximum in O(n) time. This takes amortized constant time per position, or O(N) all together.
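For reference, here is a minimal sketch of the sliding-window-minimum idea using a deque of indices. The function name `sliding_window_min` is made up for illustration, and the window is assumed to cover the last k values seen.

```python
from collections import deque


def sliding_window_min(values, k):
    """Minimum of every length-k window of `values`, in O(len(values)) total."""
    window = deque()  # indices whose values increase from front to back
    result = []
    for i, v in enumerate(values):
        # Anything at the back that is >= v can never be a minimum again.
        while window and values[window[-1]] >= v:
            window.pop()
        window.append(i)
        # Drop the front index once it has slid out of the window.
        if window[0] <= i - k:
            window.popleft()
        if i >= k - 1:
            result.append(values[window[0]])
    return result


print(sliding_window_min([4, 2, 12, 3, 1, 5], 3))  # [2, 2, 1, 1]
```

Each index is appended once and popped at most once, which is where the amortized constant time per position comes from.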
Case (2) has a somewhat obvious solution with a BST: As the window moves, insert each new position into a BST sorted by height. Remove positions that are no longer in the window. Additionally, in each node, store the minimum cost within its subtree. With this structure, you can find the minimum cost for any height bound in O(log k) time.
The expense in case 2 leads to a total complexity of O(N log k) for a single k-value. That's not too bad for complexity, but such BSTs are somewhat complicated and aren't usually provided in standard libraries.
You can make this simpler and faster by recognizing that if the minimum cost in the window is C, then optimal jumps are only beneficial if they come from predecessors of cost C, because cost C+1 is attainable with a sub-optimal jump.
For each cost, then, you can use that same sliding-window algorithm to keep track of the maximum height in the window for nodes with that cost. Then for case (2), you just need to check whether that maximum height for the minimum cost is greater than the height you want to jump to.
Maintaining these sliding windows again takes amortized constant time per operation, leading to O(N) time for the whole single-k-value algorithm.
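A rough sketch of how this could look in Python. It follows the question's definition (a jump from i to j is suboptimal when h[i] <= h[j]), so an optimal jump has to come from a strictly higher predecessor. As a simplification that is my own assumption rather than part of the description above, the per-cost windows are folded into one deque ordered by the key (cost, -height): its front is then the cheapest predecessor in the window, and among equally cheap ones the highest. The name `min_suboptimal_jumps` is hypothetical.

```python
from collections import deque


def min_suboptimal_jumps(h, k):
    """Fewest suboptimal jumps from index 0 to the last index with jumps of 1..k."""
    n = len(h)
    cost = [0] * n

    def key(j):
        # Lower cost first; among equal costs, the higher position first.
        return (cost[j], -h[j])

    window = deque([0])  # indices with increasing key from front to back
    for i in range(1, n):
        # Expire predecessors that are more than k positions behind i.
        while window[0] < i - k:
            window.popleft()
        best = window[0]
        # The jump best -> i costs one extra only if it is suboptimal.
        cost[i] = cost[best] + (h[best] <= h[i])
        while window and key(window[-1]) >= key(i):
            window.pop()
        window.append(i)
    return cost[-1]


print(min_suboptimal_jumps([1, 2, 4, 2, 8, 1, 2, 4, 8, 16], 3))  # 3
```

Running it once per k value gives O(N) per query, which matches the remark that handling several k values together is unlikely to help.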
I doubt that there would be any benefit in trying to manage multiple k-values at once.
Related
I don't know how to approach this question.
We're given an N*N grid listing the costs to get from location a to location b.
Each row in the grid tells us the cost of getting from location to location (each location corresponds to a row in the costs array). (We say that location a is bigger than location b if row a appears after row b in the costs array. The index of every row is a location). We may choose to start from any given location, and visit every location exactly once. At every location p that we visit, we must have already visited all locations less than p, or no locations less than p.
costs[a][b] gives us the cost to move from location a to location b.
costs[a][b] is not necessarily the same as costs[b][a].
costs[a][a] = 0 for every index a (diagonals in the costs array are always 0).
Our task is to find the maximum-sum cost of a valid path.
If the costs array is:
[[0, 9, 1],
[5, 0, 2],
[4, 6, 0]]
The max cost consequently will be 13 as the most expensive valid path is starting at location 2 -> location 0 -> location 1.
The first row tells us how much it will cost to get from location 0 to location 0 (remain in the same location, costs us 0), 0 to location 1 (costs us 9) and 0 to location 2 (costs us 1). The second and third rows follow the same pattern.
The requirements on which locations you can visit mean that after you start at some location i, you're forced to move to a lower location repeatedly until you're at location 0. At that point, you have to ascend consecutively through all the locations that are unvisited. The dynamic programming solution is not obvious, but with a fairly complex implementation you can get an O(n^3) DP algorithm with standard techniques.
It turns out there's an O(n^2) solution as well, which is optimal. It also uses O(n) extra space, which is maybe also optimal. The solution comes from thinking about the structure of our visits: there's a downward sequence of indices (possibly with gaps) ending at 0, and then an upward sequence starting at 0 that contains all other indices. There are 2^n possible subsequences though, so we'll have to think more to speed this up.
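For small inputs, that structure can be enumerated directly, which is handy as a cross-check for any faster solution. A minimal sketch, assuming a path is fully determined by which non-zero locations are visited on the way down; `brute_force_max_cost` is a name made up here.

```python
from itertools import combinations


def brute_force_max_cost(costs):
    """Try every downward subsequence; exponential, for verification only."""
    n = len(costs)
    others = range(1, n)
    best = None
    for size in range(n):
        for down in combinations(others, size):
            down_set = set(down)
            # Descend through `down`, reach 0, then ascend through the rest.
            path = sorted(down, reverse=True) + [0]
            path += [v for v in others if v not in down_set]
            total = sum(costs[a][b] for a, b in zip(path, path[1:]))
            if best is None or total > best:
                best = total
    return best


example = [[0, 9, 1],
           [5, 0, 2],
           [4, 6, 0]]
print(brute_force_max_cost(example))  # 13
```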
Two Sequences
Suppose we have i locations, 0, 1, ... i-1, and we've partitioned these into two ordered subsequences (except 0, which is at the start of both). We'll call these two sequences U and D, for up and down. Exactly one of them has to end on i-1. Without loss of generality, assume U ends with i-1 and D ends with j >= 0.
What happens when we add a location i? We either add it to the end of U so our sequences end on i and j, or we add it to the end of D so our sequences end on i-1 and i. If we add it to U, the path-sum of U (which we define as the sum of cost[u][v] for all adjacent indices u,v in U) increases by cost[i-1][i]. If we add the location to the end of D, the path-sum of D increases by cost[i][j] (since it's a downward sequence, we've flipped the indices relative to U).
It turns out that we only need to track the endpoints of our subsequences as we grow them, as well as the maximum combined path-sum for any pair of subsequences with those endpoints. If we let (i, j) denote the state where U ends with i and D ends with j, we can think about how we could have arrived here.
For example, at (8,5), our previous state must have had a subsequence containing 7, so our previous state must have been (7,5). Therefore max-value(8,5) = max-value(7,5) + cost[7][8]. We always have exactly one predecessor state when the two endpoints differ by more than one.
Now consider the state (8,7). We can't have come from (7,7), since the only number allowed to be in both sequences is 0. So we could have come from any of (0,7), (1,7), ... (6,7): we can choose whichever will maximize our path sum.
from typing import List


def solve(costs: List[List[int]]) -> int:
    n = len(costs)
    # Deal with edge cases
    if n == 1:
        return 0
    if n == 2:
        return max(costs[0][1], costs[1][0])
    ups = [costs[0][1]]
    downs = [costs[1][0]]
    # After iteration i, ups[j] denotes the max-value of state (i, j)
    # and downs[j] denotes the max-value of state (j, i)
    for i in range(2, n):
        # New states (i, i-1) and (i-1, i): choose the best predecessor
        ups.append(max(downs[j] + costs[j][i] for j in range(i - 1)))
        downs.append(max(ups[j] + costs[i][j] for j in range(i - 1)))
        # All older states have exactly one predecessor: extend them by i
        up_gain = costs[i-1][i]
        down_gain = costs[i][i-1]
        for j in range(i - 1):
            ups[j] += up_gain
            downs[j] += down_gain
    return max(max(ups), max(downs))
Suppose there are n sets of real numbers: S[1], S[2], ..., S[n]. We know two things about these sets:
Each set S[i] has exactly 3 elements.
All elements in each of the sets S[i] are real numbers in the [0, 1] range. (I don't know if this detail can be helpful for the solution, though).
Let's consider a set T of all numbers that can be represented as p[1] * p[2] * p[3] * ... * p[n] where p[i] is an element of S[i]. This set T, obviously, has 3^n elements.
My question is, given the sets S[1], S[2], ..., S[n] (1 <= n <= 30) and some 1 <= k <= 10 as input, can we find the k-th largest number in T faster than in O(3^n) time? It's important that I need not only the k-th largest number, but also the corresponding numbers (p[1], p[2], p[3], ... , p[n]) that produce it.
Even if the answer is no, I would appreciate any hints on how you would solve this problem approximately, maybe, by using some heuristics? I know about beam search, but maybe you could suggest something else? And even for beam search, it is not really clear how to implement it here the best way.
If the exact answer can be obtained algorithmically in less than O(3^n) time, I would greatly appreciate it if you could point out the solution.
Well, you know that the largest product is the one that uses the largest factor from each set.
Furthermore, every other product can be formed by starting with a larger one, and then decreasing the factor chosen in exactly one set.
That leads to a simple search:
1. Put the largest product in a max-first priority queue.
2. Repeat k times:
   a. Remove the largest product p from the priority queue.
   b. For each set that has a smaller number than the one selected in p, generate the product formed by decreasing that number to the next lower one in that set. If this selection of factors hasn't been seen before, then add it to the priority queue.
Products will be removed from the queue in decreasing order, so the kth one you take out is the kth largest.
Complexity is about N*(k log kN), depending on how you implement things.
Note that there may be multiple ways to select the factors that produce the same product. This solution considers those ways to be distinct products, i.e., each way is counted when finding the kth largest. That may or may not be what you want.
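The search above could be sketched with `heapq` roughly as follows. This is only an illustration: `kth_largest_product` is a made-up name, it assumes non-negative factors (as in the [0, 1] range from the question) and k no larger than the number of selections, and it uses `math.prod`, which needs Python 3.8 or later.

```python
import heapq
from math import prod


def kth_largest_product(sets, k):
    """Return (product, chosen factors) for the k-th largest selection."""
    cols = [sorted(s, reverse=True) for s in sets]  # largest factor first

    def value(idx):
        return prod(col[i] for col, i in zip(cols, idx))

    start = (0,) * len(cols)               # largest factor from every set
    heap = [(-value(start), start)]        # negate: heapq is a min-heap
    seen = {start}
    for _ in range(k):
        neg_value, idx = heapq.heappop(heap)
        # Neighbours: step one set down to its next smaller factor.
        for pos in range(len(cols)):
            if idx[pos] + 1 < len(cols[pos]):
                nxt = idx[:pos] + (idx[pos] + 1,) + idx[pos + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-value(nxt), nxt))
    return -neg_value, [col[i] for col, i in zip(cols, idx)]


sets = [[1, 2, 3], [4, 2, 1], [8, 1, 3], [1, 1, 2]]
print(kth_largest_product(sets, 2))  # (128, [2, 4, 8, 2])
```

The returned factor list answers the second part of the question, since the index tuple stored with each heap entry records which element was picked from each set.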
To put the previous discussion into code we can do the following:
import operator
from functools import partial, reduce
import heapq


def prod_by_data(tup, data):
    # Product of the factors selected by the index tuple `tup`
    return reduce(operator.mul, (datum[t] for t, datum in zip(tup, data)), 1)


def downset(tup):
    # All index tuples obtained by lowering exactly one index by one
    return [
        tuple(t - (1 if j == i else 0) for j, t in enumerate(tup))
        for i in range(len(tup))
        if tup[i] > 0
    ]


data = [
    [1, 2, 3],
    [4, 2, 1],
    [8, 1, 3],
    [1, 1, 2],
]
data = [sorted(d) for d in data]
prod = partial(prod_by_data, data=data)

# Despite the names, these hold the largest products, in decreasing order
k_smallest = [tuple(len(dat) - 1 for dat in data)]
possible_k_smallest = []
seen = set(k_smallest)  # the same tuple is reachable from several parents
while len(k_smallest) < 10:
    new_possible = sorted(
        (t for t in downset(k_smallest[-1]) if t not in seen), key=prod, reverse=True
    )
    seen.update(new_possible)
    possible_k_smallest = heapq.merge(possible_k_smallest, new_possible, key=prod, reverse=True)
    k_smallest.append(next(possible_k_smallest))
print(k_smallest)
print([prod(tup) for tup in k_smallest])
We maintain a heap of candidates for the next largest element. After we pop off the largest, we need to check all of its downset (tuples that differ in exactly one position, with that index lowered by one), because those tuples might be the next largest element.
We see that we look through k - 1 times sorting O(n) elements each time with a key that itself is O(n). Because of the key this should make the sort take O(n^2) instead of O(n log n). The heapq is lazy and so popping from it is actually O(k). The initial sorting and preparation should be O(n) as well. Overall I think this makes everything O(k n^2).
I'm trying to figure out the solution to this problem, which is quite similar to the knapsack problem, but I'm not sure what states I should have or how to memorize them.
You have an electric car which weighs W units and you want to make it go for as long as possible. To do this you must pick from N batteries, each of which has an energy e, a weight w and a cost c.
The amount of time your car can go for is t = Etotal / Wtotal (the sum of energies of batteries you chose divided by the sum of the weight of the batteries you chose + the weight of the car itself)
Given that you have a budget B, what is the maximum time your car can go for?
Example:
INPUT:
N = 10 /number of batteries to choose
B = 1000 /budget
W = 20 /weight of car
#N batteries with numbers e (energy), w (weight), c (cost)
40 40 40
1 1 1
70 30 60
100 20 700
80 50 200
30 1 200
100 100 1
20 1 500
30 20 100
70 50 100
OUTPUT:
3.17073170731707
Straightforward DP algorithm
We can compute the minimum cost f(i, j, k) of a solution that achieves exactly total energy j and total weight k by choosing some subset of the first i batteries. This is given by:
f(0, 0, W) = 0
f(0, j!=0, W) = INF
f(0, j, k!=W) = INF
f(i>0, j, k<W) = INF
f(i>0, j, k>=W) = min(f(i-1, j, k), f(i-1, j-E[i], k-W[i]) + C[i])
where E[i], W[i] and C[i] are the energy, weight and cost of battery i, respectively. After computing values of this function for all 0 <= i <= N, 0 <= j <= Sum(E[]) and 0 <= k <= W+Sum(W[]), find the maximum of j/k over all 0 <= j <= Sum(E[]) and 0 <= k <= W+Sum(W[]) such that f(N, j, k) <= B.
A direct implementation using a 3D DP table will take O(N*Sum(E[])*(W+Sum(W[]))) time and space. But since the recursion never needs to reach back further than 1 step in the first parameter i, we can make the outermost loop increase i and drop the first dimension from the DP table, overwriting its entries as we go, to drop the space complexity by a factor of N.
The above DP computes minimum costs, but it could be "rotated" to optimise for any of the three variables (minimum cost for given energy and weight, maximum energy for given cost and weight, or minimum weight for given energy and cost). The most efficient approach is to optimise for the variable with the largest range, since the time and space complexity involve the product of the ranges of the remaining two variables.
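A compact sketch of the minimum-cost variant with the first dimension dropped. Instead of a dense table it keeps a dict of only the reachable (energy, weight) states, and it discards states that already exceed the budget; both choices are assumptions made for brevity, not part of the recurrence above. `max_travel_time` is a hypothetical name, and each battery is assumed to be an (energy, weight, cost) tuple.

```python
def max_travel_time(car_weight, budget, batteries):
    """Best (total energy) / (total weight) achievable within the budget."""
    # best_cost[(energy, weight)] = cheapest way to reach exactly that state
    best_cost = {(0, car_weight): 0}
    for e, w, c in batteries:
        # Iterate over a snapshot so each battery is used at most once.
        for (energy, weight), cost in list(best_cost.items()):
            new_cost = cost + c
            if new_cost > budget:
                continue
            state = (energy + e, weight + w)
            if new_cost < best_cost.get(state, float("inf")):
                best_cost[state] = new_cost
    return max(energy / weight for energy, weight in best_cost)


batteries = [(40, 40, 40), (1, 1, 1), (70, 30, 60), (100, 20, 700), (80, 50, 200),
             (30, 1, 200), (100, 100, 1), (20, 1, 500), (30, 20, 100), (70, 50, 100)]
print(max_travel_time(20, 1000, batteries))
```

On the example input this picks the batteries (100, 20, 700) and (30, 1, 200), giving 130/41, which is the expected output quoted in the question.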
Greedy algorithm for unconstrained costs
The following simple O(N*log N)-time, O(N)-space algorithm maximises the distance travelled if there are no cost constraints. I think it's interesting because of the proof of correctness.
1. Sort batteries in decreasing order by energy divided by weight (you could think of this as "energy density").
2. Keep adding batteries from this list until the next battery has energy/weight less than the (total energy)/(total weight) of the batteries (and car) chosen so far.
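The two steps translate almost directly into code. A minimal sketch, assuming each battery is an (energy, weight) pair with positive weight and that costs are ignored; `greedy_max_time` is a made-up name.

```python
def greedy_max_time(car_weight, batteries):
    """Maximum (total energy) / (total weight) when cost is not a constraint."""
    total_e, total_w = 0, car_weight
    by_density = sorted(batteries, key=lambda b: b[0] / b[1], reverse=True)
    for e, w in by_density:
        # Stop once the next battery's density e/w is no better than the
        # current ratio total_e/total_w (compared without division).
        if e * total_w <= total_e * w:
            break
        total_e += e
        total_w += w
    return total_e / total_w


batteries = [(40, 40), (1, 1), (70, 30), (100, 20), (80, 50),
             (30, 1), (100, 100), (20, 1), (30, 20), (70, 50)]
print(greedy_max_time(20, batteries))
```

With the batteries from the question (costs dropped) and a car weight of 20, the loop takes the three densest batteries and stops at 150/42.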
A key element in proving this correct is the observation that, whenever we combine two multisets of batteries (we can consider the car to be an always-chosen battery with energy level 0), the mean of the resulting multiset is strictly in between the original two means. I'll call this the "mean-betweenness" lemma; see Lemma 1 here for a proof. Intuitively this means (hehe) that whenever we can add a battery with higher energy density than the multiset of batteries chosen so far, we should -- since the result of combining these two multisets (the new battery is a multiset of size 1) will be strictly in between them, and thus strictly higher than the multiset of batteries chosen so far.
Running the algorithm above will choose a multiset of batteries in which some initial number s of batteries in the sorted list will be chosen, and no other batteries will be chosen. By the mean-betweenness lemma, the algorithm clearly chooses an optimal multiset of batteries among all solutions having this form (that is, among solutions that choose only some initial number of batteries in the list). To establish that it chooses an optimal solution overall, we need to show that no solution that "skips over" one or more batteries in this list and then chooses one or more batteries further down can be better.
Suppose to the contrary that there exists an optimal solution X that skips a battery, and that this solution is strictly better than the solution Y produced by the greedy algorithm. Let i be the first battery that X skips. Thus X and Y share the first i-1 batteries. There are 2 cases:
1. E[i]/W[i] is strictly greater than the energy/weight of X. In this case, by the mean-betweenness lemma, we can add battery i to X to produce a solution that is strictly better than X, contradicting the optimality of X.
2. E[i]/W[i] is less than or equal to the energy/weight of X.
Continuing with case 2, consider the submultiset X' of batteries chosen further down the list by X (by assumption this must contain at least one battery). Because the list is ordered by decreasing energy/weight, these batteries each have energy/weight at most equal to that of battery i (namely, E[i]/W[i]), so by the mean-betweenness lemma their mean energy/weight is also at most equal to E[i]/W[i]. X = (X-X') ∪ X', so by the mean-betweenness lemma, the mean energy/weight of X is strictly between (X-X') and X'. Since the mean energy/weight of X' is less than or equal to the mean energy/weight of X overall, removing the batteries in X' from X to leave (X-X') will in the best case (when the means of X and X' are equal) leave the mean unchanged, and otherwise increase it. Either way, we have constructed a new solution (X-X') with mean energy/weight at least as high as X and which consists of the first i-1 batteries in the list -- that is, a solution of the form that the greedy algorithm is known to maximise over.
I have done this example in O(n^2). Given an array, I do the following:
keys.sort()  # the early break below relies on ascending order
max_key = 0
for key in set(keys):
    count = 0
    for divisor in keys:
        if key < divisor: break
        if key % divisor == 0: count += 1
    if count > max_key: max_key = count
print(max_key)
An example of this would be:
keys = [2,4,8,2]
Then the element most divisible by all elements in the keys is 8 because there are 4 elements (2,2,4,8) that can divide 8.
Can anyone suggest an approach better than O(n^2) ?
keys = [2,4,5,8,2]
We can try something like memoization (from dynamic programming) to speed up while not doing any repeated calculations.
First, let's keep a hashmap, which stores all the divisors for a number in that array.
num_divisor = {} # hashmap
We keep another hashmap to store if a value is present in the array or not (count of that number).
cnt_num = {2: 2, 4: 1, 5: 1, 8: 1}
Now, we run prime sieve up to max(keys) to find the smallest prime factor for each number up to max(keys).
Now, we traverse the array while factoring out each number (factoring is only O(logn) given we know the smallest prime factor of each number now),
pseudo-code
for a in keys:
    temp_a = a  # copy
    while temp_a != 1:
        prime_factor = smallest_prime_factor[temp_a]
        temp_a = temp_a // prime_factor
        if solution for new temp_a is already known in num_divisor: just update it from there (no recalculation)
        else:
            if new temp_a is in keys, we increment the solution in num_divisor by 1 and continue
overall complexity: max(keys) * log(max(keys)) [for sieve] + n * log(max(keys))
This should work well if the keys are uniformly distributed. For cases like,
keys = [2, 4, 1001210], it will do lots of unnecessary computation, so in those cases, it is better to avoid the sieve and instead compute the prime factors directly or in extreme cases, the pairwise divisor calculation should outperform.
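Here is one way the sieve idea could be made concrete. The sieve and the factoring loop follow the description above; the counting step, which enumerates every divisor from the prime factorization and looks it up in the count map, is my own assumption about how to finish it and is not spelled out in the pseudo-code. All function names are made up, and the keys are assumed to be positive integers.

```python
from collections import Counter
from itertools import product
from math import isqrt


def smallest_prime_factors(limit):
    """spf[m] is the smallest prime factor of m, for 2 <= m <= limit."""
    spf = list(range(limit + 1))
    for p in range(2, isqrt(limit) + 1):
        if spf[p] == p:  # p is prime
            for m in range(p * p, limit + 1, p):
                if spf[m] == m:
                    spf[m] = p
    return spf


def factorize(n, spf):
    """Prime factorization of n as {prime: exponent}, in O(log n) steps."""
    factors = Counter()
    while n != 1:
        factors[spf[n]] += 1
        n //= spf[n]
    return factors


def max_divisor_count(keys):
    spf = smallest_prime_factors(max(keys))
    cnt_num = Counter(keys)
    best = 0
    for a in cnt_num:
        factors = factorize(a, spf)
        primes = list(factors)
        total = 0
        # Every divisor of a is a product of prime powers p**e with 0 <= e <= exponent.
        for exps in product(*(range(factors[p] + 1) for p in primes)):
            divisor = 1
            for p, e in zip(primes, exps):
                divisor *= p ** e
            total += cnt_num.get(divisor, 0)
        best = max(best, total)
    return best


print(max_divisor_count([2, 4, 5, 8, 2]))  # 4
```

The sieve still costs time and memory proportional to max(keys), so the caveat about inputs like [2, 4, 1001210] applies to this sketch as well.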
I think you could change one of the n into a factor that is pseudo-polynomial by inserting the numbers into a dict (amortized, expected).
keys = [2,4,8,2]
# https://stackoverflow.com/a/280156/2472827
# max key, O(keys)
max_key = max(keys)
# https://stackoverflow.com/a/6582852/2472827
# occurrence counter, O(keys)
count = dict()
for k in keys:
    count[k] = count.get(k, 0) + 1
# https://stackoverflow.com/a/18634675/2472827
# transform into an answer counter, O(keys)
answer = dict.fromkeys(keys, 0)
# https://stackoverflow.com/a/1602964/2472827
# fill the answer, O(keys * max_key/min_key)
for a in answer:
    max_factor = max_key // a
    # every multiple of a that is present in the list gains count[a] divisors
    for factor in range(1, max_factor + 1):
        number = a * factor
        if number in answer:
            answer[number] += count[a]
# O(keys)
a = max(answer, key=answer.get)
print(answer[a], "/", len(keys), "list items dividing", a)
I think it works in O(n * max_n/min_n) (expected.) In this case, it is pretty good, but if you have a high dynamic range of values, it's easy to make it go slow.
You could potentially improve your code by:
1. Accounting for duplicates by putting keys in a counting map first (e.g. so you don't have to parse the '2' twice in your example). This helps if there's a lot of repetition.
2. Checking only up to the square root of the value being checked (together with the value being checked divided by its divisors) whenever that square root is smaller than the number of keys. This helps if there are lots of numbers having square roots that are smaller than the total number of elements.
E.g. If we're checking 30 and the list is big, we only need to check: 1 up to 5 to see if they divide 30 and their counts, as well as 30 divided by any of its divisors in this range (30/1=30, 30/2=15, 30/3=10, 30/5=6) and their counts.
E.g. if we're checking 10^100+17, and there are 10 items total, just check each of them in turn.
Neither of these affects worst-case analysis, since an adversary could choose inputs where they're useless. They may help in the problems you need to solve depending on your inputs, and may help more broadly if you have some guarantees on the inputs.
Let's think about this a different way: instead of an array of numbers, consider a directed acyclic graph where each number becomes a vertex in the graph, and there is an edge from u → v if and only if u divides v. The problem is now to find a vertex with the greatest in-degree.
Note that we can't actually count the in-degree of a vertex directly in less than Θ(n) time, so the naive solution is to count the in-degree of every vertex in Θ(n^2) time. To do better than Θ(n^2), we need to take advantage of the fact that if there are edges u → v → w then there is also an edge u → w; we get knowledge of three edges for the price of two divisibility tests. If there are edges u → v → w → x then three divisibility tests can buy us knowledge of six edges in the graph, and so on.
However, crucially, we only get "free" information about edges if the numbers we test are divisible by each other. If we do a divisibility test and the result is negative, we get no "free" information about other possible edges. So in the worst case, there is only one edge in the graph (i.e. all the numbers in the array are not multiples of each other, except for one pair). For an algorithm to output the correct result, it must find the single number in the array which has a divisor in the array, but each divisibility test which doesn't find this pair gives no "free" information. So in the worst case, we would indeed have to test every pair to find the correct answer. This proves that a worst-case time complexity of Θ(n^2) is the best you can do in the general* case.
*In case the array elements are bounded, then a pseudopolynomial algorithm could plausibly do better.
I'm solving a programming question and stuck on the last piece of the puzzle.
This is the question: https://leetcode.com/problems/daily-temperatures/
I have a sorted (for values) dictionary and now I want to do a log(n) complexity search on the dictionary. Here's the code I have written so far.
def dailyTemperatures(self, T):
    if len(T) == 0:
        return []
    if len(T) == 1:
        return [0]
    R = [None] * len(T)
    # create map, populate map
    M = {}
    for i in range(0, len(T)):
        M[i] = T[i]
    # sort map by value (temps)
    MS = sorted(M.items(), key=lambda x: x[1])
    for i in MS:
        print(i[0], i[1])
    for i in range(0, len(T)):
        t = T[i]  # base value for comparison
        R[i] = 0
        x = 0
        # find smallest x for which temp T[x] > T[i]
        # Dictionary is sorted for Temps
        R[i] = x - i
    return R
The commented part in the loop is where I have trouble. I could not find an answer anywhere which would search a sorted dictionary and then filter by key.
Any tips or new suggestions to tackle this are also appreciated.
Your code could possibly be made to work, but this algorithm really just adds more layers of complexity on top of the naive brute-force, bubble-sort-like algorithm, because it needs to backtrack for indexes.
The simplest modification is to search for the minimum index greater than the current index. Store the position in the dict's .items() as part of the value so you can retrieve it. But you can't binary search on index, because the list is sorted by value and the indexes are not in order. This should give you an acceptable O(N) lookup.
You still have to search by index in the end (which has priority over temperature). Even with binary search, your attempted algorithm, ignoring the N log N complexity of pre-sorting, would at best still require O(N * log N * log N) for searching. Your current attempt would actually be O(N^2 log N), but with a third cached index table, nearest index lookup could be turned into log N.
It will be a very convoluted and inefficient algorithm, due to basically having to backtrack your search order. And it will have no advantage over a naive brute force (it's objectively worse).
Note: key point is that you need the nearest index, which is not in sorted order if you sort by value
If you still want to do it that way (I guess as a code golf challenge), you will want to add its position index in .items() of the dict to your dictionary, so when you look up your key in dict, you can find which starting position to start your search through the temperature sorted list. To get the log N, you will need to store each range of temperatures and their range of indexes. This part will probably be particularly complicated to implement. And of course you'll need to implement a binary search algorithm.
Stack algorithm:
The basic idea of the algorithm below is that any lower temperatures that follow no longer matter.
e.g.: [...] 10 >20< 9 6 7 21. After 20, the values 9 6 7 (or anything <= 20) do not matter. After 9, 6 and 7 don't matter, etc.
So iterate from the end, adding numbers to the stack and popping off the stack any numbers less than or equal to the current number.
Note that because the number of temperatures is bounded to 70 values, and numbers less than the current temperature are pruned off the stack at each iteration, both the cost of searching for the next temperature and the size of the stack are bounded by 70. In other words, constant.
So for each item in T, you will search a maximum of 70 values in the worst case, i.e. len(T) * 70.
Thus the complexity of the algorithm is O(N), where N is the number of items in T.
def dailyTemperatures(T):
    res = [0]*len(T)
    stack = []
    for i, x in reversed([*enumerate(T)]):
        if len(stack) < 1:
            stack.append((i,x))
        else:
            while(len(stack)>0 and stack[-1][1]<=x):
                stack.pop()
            if len(stack)>0 and stack[-1][1]>x:
                res[i] = stack[-1][0] - i
            print(x, stack)
            stack.append((i,x))
    return res
print(dailyTemperatures([73, 74, 75, 71, 69, 72, 76, 73]))