Two weeks ago I posted THIS question here about dynamic programming. User Andrea Corbellini answered precisely what I wanted, but I wanted to take the problem one more step further.
This is my function
def Opt(n):
if len(n) == 1:
return 0
else:
return sum(n) + min(Opt(n[:i]) + Opt(n[i:])
for i in range(1, len(n)))
Let's say you would call
Opt( [ 1,2,3,4,5 ] )
The previous question solved the problem of computing the optimal value. Now,
instead of the computing the optimum value 33 for the above example, I want to print the way we got to the most optimal solution (path to the optimal solution). So, I want to print the indices where the list got cut/divided to get to the optimal solution in the form of a list. So, the answer to the above example would be :
[ 3,2,1,4 ] ( Cut the pole/list at third marker/index, then after second index, then after first index and lastly at fourth index).
That is the answer should be in the form of a list. The first element of the list will be the index where the first cut/division of the list should happen in the optimal path. The second element will be the second cut/division of the list and so on.
There can also be a different solution:
[ 3,4,2,1 ]
They both would still lead you to the correct output. So, it doesn't matter which one you printed. But, I have no idea how to trace and print the optimal path taken by the Dynamic Programming solution.
By the way, I figured out a non-recursive solution to that problem that was solved in my previous question. But, I still can't figure out to print the path for the optimal solution. Here is the non-recursive code for the previous question, it might be helpful to solve the current problem.
def Opt(numbers):
prefix = [0]
for i in range(1,len(numbers)+1):
prefix.append(prefix[i-1]+numbers[i-1])
results = [[]]
for i in range(0,len(numbers)):
results[0].append(0)
for i in range(1,len(numbers)):
results.append([])
for j in range(0,len(numbers)):
results[i].append([])
for i in range(2,len(numbers)+1): # for all lenghts (of by 1)
for j in range(0,len(numbers)-i+1): # for all beginning
results[i-1][j] = results[0][j]+results[i-2][j+1]+prefix[j+i]-prefix[j]
for k in range(1,i-1): # for all splits
if results[k][j]+results[i-2-k][j+k+1]+prefix[j+i]-prefix[j] < results[i-1][j]:
results[i-1][j] = results[k][j]+results[i-2-k][j+k+1]+prefix[j+i]-prefix[j]
return results[len(numbers)-1][0]
Here is one way of printing the selected :
I used the recursive solution using memoization provided by #Andrea Corbellini in your previous question. This is shown below:
cache = {}
def Opt(n):
# tuple objects are hashable and can be put in the cache.
n = tuple(n)
if n in cache:
return cache[n]
if len(n) == 1:
result = 0
else:
result = sum(n) + min(Opt(n[:i]) + Opt(n[i:])
for i in range(1, len(n)))
cache[n] = result
return result
Now, we have the cache values for all the tuples including the selected ones.
Using this, we can print the selected tuples as shown below:
selectedList = []
def printSelected (n, low):
if len(n) == 1:
# No need to print because it's
# already printed at previous recursion level.
return
minVal = math.Inf
minTupleLeft = ()
minTupleRight = ()
splitI = 0
for i in range(1, len(n)):
tuple1ToI = tuple (n[:i])
tupleiToN = tuple (n[i:])
if (cache[tuple1ToI] + cache[tupleiToN]) < minVal:
minVal = cache[tuple1ToI] + cache[tupleiToN]
minTupleLeft = tuple1ToI
minTupleRight = tupleiToN
splitI = low + i
print minTupleLeft, minTupleRight, minVal
print splitI # OP just wants the split index 'i'.
selectedList.append(splitI) # or add to the list as requested by OP
printSelected (list(minTupleLeft), low)
printSelected (list(minTupleRight), splitI)
You call the above method like shown below:
printSelected (n, 0)
Related
I'm familiar with the naive recursive solution to the knapsack problem. However, this solution simply spits out the max value that can be stored in the knapsack given its weight constraints. What I'd like to do is add some form of metadata cache (namely which items have/not been selected, using a "one-hot" array [0,1,1]).
Here's my attempt:
class Solution:
def __init__(self):
self.array = []
def knapSack(self,W, wt, val, n):
index = n-1
if n == 0 or W == 0 :
return 0
if (wt[index] > W):
self.array.append(0)
choice = self.knapSack(W, wt, val, index)
else:
option_A = val[index] + self.knapSack( W-wt[index], wt, val, index)
option_B = self.knapSack(W, wt, val, index)
if option_A > option_B:
self.array.append(1)
choice = option_A
else:
self.array.append(0)
choice = option_B
print(int(option_A > option_B)) #tells you which path was traveled
return choice
# To test above function
val = [60, 100, 120]
wt = [10, 20, 30]
W = 50
n = len(val)
# print(knapSack(W, wt, val, n))
s = Solution()
s.knapSack(W, wt, val, n)
>>>
1
1
1
1
1
1
220
s.array
>>>
[1, 1, 1, 1, 1, 1]
As you can see, s.array returns [1,1,1,1,1,1] and this tells me a few things. (1), even though there are only three items in the problem set, the knapSack method has been called twice for each item and (2) this is because every item flows through the else statement in the method, so option_A and option_B are each computed for each item (explaining why the array length is 6 not 3.)
I'm confused as to why 1 has been appended in every recursive loop. The item at index 0 would is not selected in the optimal solution. To answer this question, please provide:
(A) Why the current solution is behaving this way
(B) How the code can be restructured such that a one-hot "take or don't take" vector can be captured, representing whether a given item goes in the knapsack or not.
Thank you!
(A) Why the current solution is behaving this way
self.array is an instance attribute that is shared by all recursion paths. On one path or another each item is taken and so a one is appended to the list.
option_A = val[index]... takes an item but doesn't append a one to the list.
option_B = self..... skips an item but doesn't append a zero to the list.
if option_A > option_B: When you make this comparison you have lost the information that made it - the items that were taken/discarded in the branch;
in the suites you just append a one or a zero regardless of how many items made those values.
The ones and zeroes then represent whether branch A (1) or branch B (0) was successful in the current instance of the function.
(B) How the code can be restructured such that a one-hot "take or don't take" vector can be captured, representing whether a given item goes in the knapsack or not.
It would be nice to know what you have taken after running through the analysis, I suspect that is what you are trying to do with self.array. You expressed an interest in OOP: instead of keeping track with lists of numbers using indices to select numbers from the lists, make objects to represent the items work with those. Keep the objects in containers and use the functionality of the container to add or remove items/objects from it. Consider how you are going to use a container before choosing one.
Don't put the function in a class.
Change the function's signature to accept
available weight,
a container of items to be considered,
a container holding the items currently in the sack (the current sack).
Use a collections.namedtuple or a class for the items having value and weight attributes.
Item = collections.namedtuple('Item',['wt','val'])
When an item is taken add it to the current sack.
When recursing
if going down the take path add the return value from the call to the current sack
remove the item that was just considered from the list of items to be considered argument.
if taken subtract the item's weight from the available weight argument
When comparing two branches you will need to add up the values of each item the current sack.
return the sack with the highest value
carefully consider the base case
Make the items to be considered like this.
import collections
Item = collections.namedtuple('Item',['wt','val'])
items = [Item(wght,value) for wght,value in zip(wt,val)]
Add up values like this.
value = sum(item.val for item in current_sack)
# or
import operator
val = operator.itemgetter('val')
wt = operator.itemgetter('wt')
value = sum(map(val,current_sack)
Your solution enhanced with debugging prints for the curious.
class Solution:
def __init__(self):
self.array = []
self.other_array = [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]
def knapSack(self,W, wt, val, n,j=0):
index = n-1
deep = f'''{' '*j*3}'''
print(f'{deep}level {j}')
print(f'{deep}{W} available: considering {wt[index]},{val[index]}, {n})')
# minor change here but has no affect on the outcome0
#if n == 0 or W == 0 :
if n == 0:
print(f'{deep}Base case found')
return 0
print(f'''{deep}{wt[index]} > {W} --> {wt[index] > W}''')
if (wt[index] > W):
print(f'{deep}too heavy')
self.array.append(0)
self.other_array[index] = 0
choice = self.knapSack(W, wt, val, index,j+1)
else:
print(f'{deep}Going down the option A hole')
option_A = val[index] + self.knapSack( W-wt[index], wt, val, index,j+1)
print(f'{deep}Going down the option B hole')
option_B = self.knapSack(W, wt, val, index,j+1)
print(f'{deep}option A:{option_A} option B:{option_B}')
if option_A > option_B:
print(f'{deep}option A wins')
self.array.append(1)
self.other_array[index] = 1
choice = option_A
else:
print(f'{deep}option B wins')
self.array.append(0)
self.other_array[index] = 0
choice = option_B
print(f'{deep}level {j} Returning value={choice}')
print(f'{deep}---------------------------------------------')
return choice
Hi I am trying to solve a problem where I have to return the indices in a sublist of the same person. When i say same person , I mean if they have the same username,phone or email(any one of them).
I understand that these identites are mostly unique but for the sake of questions lets assume.
eg.
data = [("username1","phone_number1", "email1"),
("usernameX","phone_number1", "emailX"),
("usernameZ","phone_numberZ", "email1Z"),
("usernameY","phone_numberY", "emailX"),
("username2","phone_number2", "emailX")]
Expected output :
[[0,1,3,4][2]]
Explaination: As 0,1 have the same phone and 3 and 4 have the same email so They all fall under one category. and 2 index falls in the other catoegry.
My approach until now is :
data = [("username1","phone_number1", "email1"),
("usernameX","phone_number1", "emailX"),
("usernameZ","phone_numberZ", "email1Z"),
("usernameY","phone_numberY", "emailX"),
]
def match(t1,t2):
if(t1[0] == t2[0] or t1[1] == t2[1] or t1[2] == t2[2]):
return True
else:
return False
# print(match(data[1],data[3]))
together = []
for i in range(len(data)):
temp = {i}
for j in range(len(data)):
if(match(data[i],data[j])):
temp.add(j)
together.append(temp)
for i in range(len(data)):
ans = together[i]
for j in range(i+1,len(data)):
if(bool(ans.intersection(together[j]))):
ans = ans.union(together[j])
print(ans)
I am not able to reach desired result.
Any help is appreciated. Thank you.
A first solution is similar to yours with some enhancements:
Leveraging any for the match, such that it doesn't require to know the number of items inside the tuples.
Checking if a user is already identified as part of "together" to skip useless comparison
Here it is:
together = set()
for user_idx, user in enumerate(data):
if user_idx in together:
continue # That user is already matched
# No need to check with previous users
for other_idx, other in enumerate(data[user_idx + 1 :], user_idx + 1):
# Match
if any(val_ref == val_other for val_ref, val_other in zip(user, other)):
together.update((user_idx, other_idx))
isolated = set(range(len(data))) ^ together
Another solution use tricks by going through a numpy array to identify isolated users. With numpy it is easy to compare a user to every other user (aka the original array). An isolated user will only match one time to itself on each of its fields, hence summing the boolean values along fields will return, for an isolated user, the length of the tuple of fields.
data = np.array(data)
# For each user, match it with the whole matrice
matches = sum(user == data for user in data)
# Isolated users only match with themselves, hence only have 1 on their line
isolated = set(np.where(np.sum(matches, axis=1) == data.shape[1])[0])
# Together are other users
together = set(range(len(data))) ^ set(isolated)
see the matches array for better understanding:
[[1 2 1]
[1 2 3]
[1 1 1]
[1 1 3]
[1 1 3]]
However, it is not leveraging any of the optimisation mentioned before.
Still, numpy is fast so it should be ok.
so I've got a list of questions as a dictionary, e.g
{"Question1": 3, "Question2": 5 ... }
That means the "Question1" has 3 points, the second one has 5, etc.
I'm trying to create all subset of question that have between a certain number of questions and points.
I've tried something like
questions = {"Q1":1, "Q2":2, "Q3": 1, "Q4" : 3, "Q5" : 1, "Q6" : 2}
u = 3 #
v = 5 # between u and v questions
x = 5 #
y = 10 #between x and y points
solution = []
n = 0
def main(n_):
global n
n = n_
global solution
solution = []
finalSolution = []
for x in questions.keys():
solution.append("_")
finalSolution.extend(Backtracking(0))
return finalSolution
def Backtracking(k):
finalSolution = []
for c in questions.keys():
solution[k] = c
print ("candidate: ", solution)
if not reject(k):
print ("not rejected: ", solution)
if accept(k):
finalSolution.append(list(solution))
else:
finalSolution.extend(Backtracking(k+1))
return finalSolution
def reject(k):
if solution[k] in solution: #if the question already exists
return True
if k > v: #too many questions
return True
points = 0
for x in solution:
if x in questions.keys():
points = points + questions[x]
if points > y: #too many points
return True
return False
def accept(k):
points = 0
for x in solution:
if x in questions.keys():
points = points + questions[x]
if points in range (x, y+1) and k in range (u, v+1):
return True
return False
print(main(len(questions.keys())))
but it's not trying all possibilities, only putting all the questions on the first index..
I have no idea what I'm doing wrong.
There are three problems with your code.
The first issue is that the first check in your reject function is always True. You can fix that in a variety of ways (you commented that you're now using solution.count(solution[k]) != 1).
The second issue is that your accept function uses the variable name x for what it intends to be two different things (a question from solution in the for loop and the global x that is the minimum number of points). That doesn't work, and you'll get a TypeError when trying to pass it to range. A simple fix is to rename the loop variable (I suggest q since it's a key into questions). Checking if a value is in a range is also a bit awkward. It's usually much nicer to use chained comparisons: if x <= points <= y and u <= k <= v
The third issue is that you're not backtracking at all. The backtracking step needs to reset the global solution list to the same state it had before Backtracking was called. You can do this at the end of the function, just before you return, using solution[k] = "_" (you commented that you've added this line, but I think you put it in the wrong place).
Anyway, here's a fixed version of your functions:
def Backtracking(k):
finalSolution = []
for c in questions.keys():
solution[k] = c
print ("candidate: ", solution)
if not reject(k):
print ("not rejected: ", solution)
if accept(k):
finalSolution.append(list(solution))
else:
finalSolution.extend(Backtracking(k+1))
solution[k] = "_" # backtracking step here!
return finalSolution
def reject(k):
if solution.count(solution[k]) != 1: # fix this condition
return True
if k > v:
return True
points = 0
for q in solution:
if q in questions:
points = points + questions[q]
if points > y: #too many points
return True
return False
def accept(k):
points = 0
for q in solution: # change this loop variable (also done above, for symmetry)
if q in questions:
points = points + questions[q]
if x <= points <= y and u <= k <= v: # chained comparisons are much nicer than range
return True
return False
There are still things that could probably be improved in there. I think having solution be a fixed-size global list with dummy values is especially unpythonic (a dynamically growing list that you pass as an argument would be much more natural). I'd also suggest using sum to add up the points rather than using an explicit loop of your own.
Working on below problem,
Problem,
Given a m * n grids, and one is allowed to move up or right, find the different paths between two grid points.
I write a recursive version and a dynamic programming version, but they return different results, and any thoughts what is wrong?
Source code,
from collections import defaultdict
def move_up_right(remaining_right, remaining_up, prefix, result):
if remaining_up == 0 and remaining_right == 0:
result.append(''.join(prefix[:]))
return
if remaining_right > 0:
prefix.append('r')
move_up_right(remaining_right-1, remaining_up, prefix, result)
prefix.pop(-1)
if remaining_up > 0:
prefix.append('u')
move_up_right(remaining_right, remaining_up-1, prefix, result)
prefix.pop(-1)
def move_up_right_v2(remaining_right, remaining_up):
# key is a tuple (given remaining_right, given remaining_up),
# value is solutions in terms of list
dp = defaultdict(list)
dp[(0,1)].append('u')
dp[(1,0)].append('r')
for right in range(1, remaining_right+1):
for up in range(1, remaining_up+1):
for s in dp[(right-1,up)]:
dp[(right,up)].append(s+'r')
for s in dp[(right,up-1)]:
dp[(right,up)].append(s+'u')
return dp[(right, up)]
if __name__ == "__main__":
result = []
move_up_right(2,3,[],result)
print result
print '============'
print move_up_right_v2(2,3)
In version 2 you should be starting your for loops at 0 not at 1. By starting at 1 you are missing possible permutations where you traverse the bottom row or leftmost column first.
Change version 2 to:
def move_up_right_v2(remaining_right, remaining_up):
# key is a tuple (given remaining_right, given remaining_up),
# value is solutions in terms of list
dp = defaultdict(list)
dp[(0,1)].append('u')
dp[(1,0)].append('r')
for right in range(0, remaining_right+1):
for up in range(0, remaining_up+1):
for s in dp[(right-1,up)]:
dp[(right,up)].append(s+'r')
for s in dp[(right,up-1)]:
dp[(right,up)].append(s+'u')
return dp[(right, up)]
And then:
result = []
move_up_right(2,3,[],result)
set(move_up_right_v2(2,3)) == set(result)
True
And just for fun... another way to do it:
from itertools import permutations
list(map(''.join, set(permutations('r'*2+'u'*3, 5))))
The problem with the dynamic programming version is that it doesn't take into account the paths that start from more than one move up ('uu...') or more than one move right ('rr...').
Before executing the main loop you need to fill dp[(x,0)] for every x from 1 to remaining_right+1 and dp[(0,y)] for every y from 1 to remaining_up+1.
In other words, replace this:
dp[(0,1)].append('u')
dp[(1,0)].append('r')
with this:
for right in range(1, remaining_right+1):
dp[(right,0)].append('r'*right)
for up in range(1, remaining_up+1):
dp[(0,up)].append('u'*up)
I need to turn following equation into a python function.
k = people[i]
i = people[j]
costs[i][j]
costs[j][k]
change = -costs[i][k] - costs[j][l] + costs[i][l] + cost[j][k]
I think what you're looking for is:
def change(i,j,costs,people):
k = people[i]
l = people[j] # was originally i = people[j] but unsure where l comes from otherwise
result = -costs[i][k] - costs[j][l] + costs[i][l] + cost[j][k]
return result
(and then call with:
mychange = change(i,j,costs,people)
)
Note that if my assumption on where l comes from is wrong, change that 3rd line back to i = people[j] and pass in l as well.
Also note that using l as a variable is a bad idea as it's rather hard to tell apart from 1 (and also, using i,j,k,l is bad when you could use a more descriptive value to make it clearer what this means)