I am working on a DP solution for a knapsack problem. Given a list of items with their weights and values, I need to find the set of items with the maximum total value whose total weight does not exceed some predefined limit. Nothing special, just 0-1 knapsack.
I use DP to generate a matrix:
def getKnapsackTable(items, limit):
    matrix = [[0 for w in xrange(limit + 1)] for j in xrange(len(items) + 1)]
    for j in xrange(1, len(items) + 1):
        item, wt, val = items[j-1]
        for w in xrange(1, limit + 1):
            if wt > w:
                matrix[j][w] = matrix[j-1][w]
            else:
                matrix[j][w] = max(matrix[j-1][w], matrix[j-1][w-wt] + val)
    return matrix
where items is a list of tuples (name, weight, value). Now, having the DP matrix, the maximum possible value is the number in the bottom-right position. I can also backtrack through the matrix to find the list of items that gives the best solution.
def getItems(matrix, items):
    result = []
    I, j = len(matrix) - 1, len(matrix[0]) - 1
    for i in range(I, 0, -1):
        if matrix[i][j] != matrix[i-1][j]:
            item, weight, value = items[i - 1]
            result.append(items[i - 1])
            j -= weight
    return result
Great, now I can get the results:
items = [('first', 1, 1), ('second', 3, 8), ('third', 2, 5), ('forth', 1, 1), ('fifth', 1, 2), ('sixth', 5, 9)]
matrix = getKnapsackTable(items, 7)
print getItems(matrix, items)
and will see: [('fifth', 1, 2), ('third', 2, 5), ('second', 3, 8), ('first', 1, 1)].
The problem is that this is not the unique solution. Instead of the 'first' element, I could take the 'forth' element (which happens to be identical here, but in general the alternative solutions can differ). I am trying to figure out how to get all the optimal solutions instead of just one. I realize that it will take more time, but I am OK with that.
You can compute the original DP matrix as usual (i.e., using DP), but to find all optimal solutions you need to recurse as you travel back through the matrix from the final state. That's because any given state (i, j) in your matrix has at least one optimal predecessor state, but it might have two: it might be that the maximum value for state (i, j) can be achieved either by choosing to add item i to the optimal solution for state (i-1, j-w(i)), or by leaving item i out and just keeping the optimal solution for (i-1, j). This occurs exactly when these two choices yield equal total values, i.e., when
matrix[i-1][j] == matrix[i-1][j-w(i)]+v(i),
where w(i) and v(i) are the weight and value of object i, respectively. Whenever you detect such a branching, you need to follow each branch.
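For illustration, here is a minimal sketch of that recursive backtracking (getAllItems is a hypothetical name, not part of the question's code); it takes the same matrix and items as getItems:

def getAllItems(matrix, items):
    # Follow every optimal predecessor of state (i, j), collecting item lists.
    def backtrack(i, j):
        if i == 0:
            return [[]]
        name, wt, val = items[i - 1]
        solutions = []
        # Branch 1: item i is left out of the optimal solution.
        if matrix[i][j] == matrix[i - 1][j]:
            solutions.extend(backtrack(i - 1, j))
        # Branch 2: item i is part of the optimal solution.
        if wt <= j and matrix[i][j] == matrix[i - 1][j - wt] + val:
            for s in backtrack(i - 1, j - wt):
                solutions.append(s + [items[i - 1]])
        return solutions
    return backtrack(len(matrix) - 1, len(matrix[0]) - 1)

On the example data from the question this yields two optimal item sets of total value 16: one containing 'first' and one containing 'forth'.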
Note that there could be an extremely large number of optimal solutions: e.g., consider the case when all items have weight 1. In this case, all (n choose w) solutions are optimal.
Suppose we have data a₁, ..., aₙ, where n is an even integer and each aᵢ ∈ ℝ. Define the distance between two elements as dis(aᵢ, aⱼ) = |aᵢ − aⱼ|. The program should output a list of pairs of elements sorted by distance in ascending order, and it should pack the input data into pairs, so each element aᵢ appears exactly once in the output.
For example, given the input [1, 0.4, 3, 1.1] the output should be [(1, 1.1), (0.4, 3)].
A naive brute-force method is to compute all C(n,2) pairs and sort them by distance.
def not_in_list_of_pair(i, ls):
    return i not in [p[0] for p in ls] + [p[1] for p in ls]

def calc(ls):
    ls = sorted(ls)
    d = {}
    for idx1, i in enumerate(ls[:-1]):
        for idx2, j in enumerate(ls[idx1+1:], idx1 + 1):
            d[(i, j)] = j - i
    # 2nd part
    res = []
    for pair in sorted(d, key=lambda k: d[k]):
        i, j = pair
        if not_in_list_of_pair(i, res) and not_in_list_of_pair(j, res):
            res.append(pair)
    return res
# another example
ls = [1, 0.1, 2, 2.4, 3, 4, 1.5]
assert calc(ls) == [(2, 2.4), (1, 1.5), (3, 4)]
But this naive method costs at least O(n²), and the 2nd part (extracting the minimum-distance pairs) is also slow. Therefore I am looking for a more efficient method to solve this problem. Thanks!
I have to say that your description of the problem is not clear, and the complexity in the description is not correct: you have to calculate the distance of all pairs of integers (which is O(n²)), and after that you sort all the distances (which is O(n² log n²)).
For this problem, you are basically finding the two integers with the smallest distance, picking these two integers out, and repeating the same process on the remaining integers.
One naive solution: suppose the integers are sorted, and we only want to find one pair of integers with the smallest distance; then we just need to calculate the distance of each two adjacent integers (e.g., between ls[0] and ls[1], between ls[1] and ls[2], ..., between ls[n - 2] and ls[n - 1]) and find out which pair is the smallest. After we find one, we remove the two selected integers, and the remaining integers are still sorted. If we want to find the next pair of integers with the smallest distance, the problem remains the same.
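As a hypothetical sketch of this naive baseline (calc_naive is my name, not from the original answer): in a sorted list the smallest distance overall is always between two adjacent elements, and removing an adjacent pair can only widen the gap it leaves behind, so the pairs come out in ascending order of distance:

def calc_naive(ls):
    ls = sorted(ls)
    result = []
    while len(ls) > 1:
        # index of the adjacent pair with the smallest distance
        k = min(range(len(ls) - 1), key=lambda i: ls[i + 1] - ls[i])
        result.append((ls[k], ls[k + 1]))
        del ls[k:k + 2]  # the remaining list stays sorted
    return result

Each round does an O(n) scan and an O(n) deletion, so the whole thing is O(n²).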
The naive solution is still expensive in two aspects: (1) we need to recalculate the distances of adjacent integers each time; (2) we need to remove two integers from a sorted array and keep the array sorted.
To solve (1), we don't in fact have to recalculate all the distances each time. E.g., suppose we have 6 integers indexed 0 to 5, and we have calculated dist(0, 1), dist(1, 2), dist(2, 3), dist(3, 4), dist(4, 5). If the integers at indices 2 and 3 are the closest pair, we output and remove them. For the next round we need dist(0, 1), dist(1, 4), dist(4, 5): we only need to drop dist(1, 2) and dist(3, 4) as they are now useless, and add one new distance, dist(1, 4), while dist(0, 1) and dist(4, 5) are unchanged. We can maintain a btree to achieve this.
To solve (2), the best data structure for removing items from the middle is a doubly linked list, with O(1) removal. But we are using an array now and may not want to change the array to a linked list. One way is to use index arrays to mimic a doubly linked list.
Here is an example.
Update 1: I found that OrderedDict does not pop the minimal item each time, and I couldn't find any data structure in Python that works as a btree. I have to use a heap, where I cannot delete the useless distances, but I can identify and ignore them. Sorry for the mistake.
Update 2: Added an else branch in the while loop, i.e., we should not change the doubly linked list when we see a useless item.
Update 3: Just realized that the heap will have no more than n items in each iteration of the while loop, so the complexity is roughly O(n log n), with n being the number of integers.
from heapq import *

def calc(ls):
    ls = sorted(ls)  # O(n log n)
    n = len(ls)
    # mimic a double linked list
    left = [i - 1 for i in range(n)]
    right = [i + 1 for i in range(n)]
    appeared = [False for i in range(n)]
    btree = []
    for i in range(0, n - 1):
        # distance of adjacent integers, and their indices
        heappush(btree, (ls[i + 1] - ls[i], i, i + 1))
    # roughly O(n log n), because the heap will have at most `n` items in each iteration
    result = []
    while len(btree) != 0:
        minimal = heappop(btree)
        a, b = minimal[1:3]
        # skip if either a or b appeared
        if not appeared[a] and not appeared[b]:
            result.append((ls[a], ls[b]))
            appeared[a] = True
            appeared[b] = True
        else:
            continue  # this is important
        #print result
        if left[a] != -1:
            right[left[a]] = right[b]
        if right[b] != n:
            left[right[b]] = left[a]
        if left[a] != -1 and right[b] != n:
            heappush(btree, (ls[right[b]] - ls[left[a]], left[a], right[b]))
    return result
ls = [1, 0.1, 2, 2.4, 3, 4, 1.5]
print calc(ls)
With the following output:
[(2, 2.4), (1, 1.5), (3, 4)]
Note: The number of input integers is 7, which is NOT even.
I am not very familiar with Python, so I may not be using the best data structure in the above code snippet.
I would like to write a function my_func(n, l) that, for some positive integer n, efficiently enumerates the ordered non-negative integer compositions* of n of length l (where l is greater than n). For example, I want my_func(2,3) to return [[0,0,2],[0,2,0],[2,0,0],[1,1,0],[1,0,1],[0,1,1]].
My initial idea was to use existing code for positive integer partitions (e.g. accel_asc() from this post), extend the positive integer partitions by a couple of zeros and return all permutations.
import itertools
import numpy

def my_func(n, l):
    for ip in accel_asc(n):  # accel_asc() from the post linked above
        nic = numpy.zeros(l, dtype=int)
        nic[:len(ip)] = ip
        for p in itertools.permutations(nic):
            yield p
The output of this function is wrong, because every non-negative integer composition in which a number appears twice (or multiple times) appears several times in the output of my_func. For example, list(my_func(2,3)) returns [(1, 1, 0), (1, 0, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1), (0, 1, 1), (2, 0, 0), (2, 0, 0), (0, 2, 0), (0, 0, 2), (0, 2, 0), (0, 0, 2)].
I could correct this by generating a list of all non-negative integer compositions, removing repeated entries, and then returning the remaining list (instead of a generator). But this seems incredibly inefficient and will likely run into memory issues. What is a better way to fix this?
EDIT
I did a quick comparison of the solutions offered in answers to this post and to another post that cglacet has pointed out in the comments.
On the left we have l = 2n, and on the right we have l = n + 1. In these two cases, user2357112's second solution is faster than the others when n <= 5. For n > 5, the solutions proposed by user2357112, Nathan Verzemnieks, and AndyP are more or less tied. But the conclusions could be different when considering other relationships between l and n.
*I originally asked for non-negative integer partitions. Joseph Wood correctly pointed out that I am in fact looking for integer compositions, because the order of numbers in a sequence matters to me.
Use the stars and bars concept: pick positions to place l-1 bars between n stars, and count how many stars end up in each section:
import itertools

def diff(seq):
    return [seq[i+1] - seq[i] for i in range(len(seq)-1)]

def generator(n, l):
    for combination in itertools.combinations_with_replacement(range(n+1), l-1):
        yield [combination[0]] + diff(combination) + [n-combination[-1]]
I've used combinations_with_replacement instead of combinations here, so the index handling is a bit different from what you'd need with combinations. The code with combinations would more closely match a standard treatment of stars and bars.
Alternatively, a different way to use combinations_with_replacement: start with a list of l zeros, pick n positions with replacement from l possible positions, and add 1 to each of the chosen positions to produce an output:
def generator2(n, l):
    for combination in itertools.combinations_with_replacement(range(l), n):
        output = [0]*l
        for i in combination:
            output[i] += 1
        yield output
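As a quick check (my example; both generators produce the six compositions of 2 into 3 parts, just in a different order than the listing in the question):

print(list(generator2(2, 3)))
# [[2, 0, 0], [1, 1, 0], [1, 0, 1], [0, 2, 0], [0, 1, 1], [0, 0, 2]]

generator(2, 3) yields the same six compositions, each exactly once, with no duplicates to filter out.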
Starting from a simple recursive solution, which has the same problem as yours:
def nn_partitions(n, l):
    if n == 0:
        yield [0] * l
    else:
        for part in nn_partitions(n - 1, l):
            for i in range(l):
                new = list(part)
                new[i] += 1
                yield new
That is, for each partition for the next lower number, for each place in that partition, add 1 to the element in that place. It yields the same duplicates yours does. I remembered a trick for a similar problem, though: when you alter a partition p for n into one for n+1, fix all the elements of p to the left of the element you increase. That is, keep track of where p was modified, and never modify any of p's "descendants" to the left of that. Here's the code for that:
def _nn_partitions(n, l):
    if n == 0:
        yield [0] * l, 0
    else:
        for part, start in _nn_partitions(n - 1, l):
            for i in range(start, l):
                new = list(part)
                new[i] += 1
                yield new, i

def nn_partitions(n, l):
    for part, _ in _nn_partitions(n, l):
        yield part
It's very similar - there's just the extra parameter passed along at each step, so I added a wrapper to remove that for the caller.
I haven't tested it extensively, but this appears to be reasonably fast - about 35 microseconds for nn_partitions(3, 5) and about 18s for nn_partitions(10, 20) (which yields just over 20 million partitions). (The very elegant solution from user2357112 takes about twice as long for the smaller case and about four times as long for the larger one. Edit: this refers to the first solution from that answer; the second one is faster than mine under some circumstances and slower under others.)
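As a quick sanity check (my addition, not from the original answer) that the duplicates from the question are gone:

print(list(nn_partitions(2, 3)))
# [[2, 0, 0], [1, 1, 0], [1, 0, 1], [0, 2, 0], [0, 1, 1], [0, 0, 2]]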
Given a list of lists of tuples, I would like to find the subset of lists that maximizes the number of distinct integer values without any integer being repeated.
The list looks something like this:
x = [
    [(1,2,3), (8,9,10), (15,16)],
    [(2,3), (10,11)],
    [(9,10,11), (17,18,19), (20,21,22)],
    [(4,5), (11,12,13), (18,19,20)]
]
The internal tuples always contain consecutive integers --> (1,2,3) or (15,16), but they may be of any length.
In this case, the expected return would be:
maximized_list = [
    [(1, 2, 3), (8, 9, 10), (15, 16)],
    [(4, 5), (11, 12, 13), (18, 19, 20)]
]
This is valid because in each case:
Each internal list of x remains intact
The number of distinct integers is maximized (16 in this case)
No integer is repeated.
If there are multiple valid solutions, all should be returned in a list.
I have a naive implementation of this, heavily based on a previous stackoverflow question I had asked, which was not as well formed as it could have been (Python: Find tuples with greatest total distinct values):
import itertools

def maximize(x):
    max_ = 0
    possible_patterns = []
    for i in range(1, len(x)+1):
        b = itertools.combinations(x, i)
        for combo in b:
            all_ints = tuple(itertools.chain(*itertools.chain(*combo)))
            distinct_ints = tuple(set(all_ints))
            if sorted(all_ints) != sorted(distinct_ints):
                continue
            else:
                if len(all_ints) >= max_:
                    if len(all_ints) == max_:
                        possible_patterns.append(combo)
                        new_max = len(all_ints)
                    elif len(all_ints) > max_:
                        possible_patterns = [combo]
                        new_max = len(all_ints)
                    max_ = new_max
    return possible_patterns
The above-mentioned function appears to give me the correct result, but does not scale. I will need to accept x values with a few thousand lists (possibly as many as tens of thousands), so an optimized algorithm is required.
The following solves for the maximal subset of sublists, with respect to cardinality. It works by flattening each sublist, constructing a list of the sets of intersections between the sublists, and then doing a depth-first search of the solution space for the solution with the most elements (i.e. the largest "weight").
def maximize_distinct(sublists):
    subsets = [{x for tup in sublist for x in tup} for sublist in sublists]
    def intersect(subset):
        return {i for i, sset in enumerate(subsets) if subset & sset}
    intersections = [intersect(subset) for subset in subsets]
    weights = [len(subset) for subset in subsets]
    pool = set(range(len(subsets)))
    max_set, _ = search_max(pool, intersections, weights)
    return [sublists[i] for i in max_set]

def search_max(pool, intersections, weights):
    if not pool:
        return [], 0
    max_set = max_weight = None
    for num in pool:
        next_pool = {x for x in pool - intersections[num] if x > num}
        set_ids, weight = search_max(next_pool, intersections, weights)
        if not max_set or max_weight < weight + weights[num]:
            max_set, max_weight = [num] + set_ids, weight + weights[num]
    return max_set, max_weight
This code can be optimized further by keeping a running total of the "weights" (sum of cardinalities of sublists) discarded, and pruning a branch of the search space when that total exceeds the discard of the maximal solution so far (which will be the minimal discard weight). Unless you run into performance problems, however, this will likely be more work than it's worth, and for a small list of lists the overhead of the computation will exceed the speedup from pruning.
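If you do want to experiment with pruning, here is a rough sketch of the idea (search_max_pruned is a hypothetical variant, not part of the code above). Rather than tracking discarded weight directly, it uses the equivalent formulation of an optimistic upper bound per branch, and for simplicity it prunes within each recursion level rather than against a shared global best:

def search_max_pruned(pool, intersections, weights):
    if not pool:
        return [], 0
    max_set, max_weight = [], 0
    for num in sorted(pool):
        next_pool = {x for x in pool - intersections[num] if x > num}
        # optimistic bound: take num plus every still-compatible sublist
        bound = weights[num] + sum(weights[x] for x in next_pool)
        if bound <= max_weight:
            continue  # this branch cannot beat the best solution found so far
        set_ids, weight = search_max_pruned(next_pool, intersections, weights)
        if weight + weights[num] > max_weight:
            max_set, max_weight = [num] + set_ids, weight + weights[num]
    return max_set, max_weight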
I am trying to solve a problem where:
Given an array of n integers nums and a target, find the number of
index triplets i, j, k with 0 <= i < j < k < n that satisfy the
condition nums[i] + nums[j] + nums[k] < target.
For example, given nums = [-2, 0, 1, 3], and target = 2.
Return 2. Because there are two triplets which sums are less than 2:
[-2, 0, 1] [-2, 0, 3]
My algorithm: remove a single element number_1 from the list, set t = target - number_1, and search the rest of the list for doublets with number_2 + number_3 < t. Problem solved.
The problem link is https://leetcode.com/problems/3sum-smaller/description/ .
My solution is:
def threeSumSmaller(nums, target):
    """
    :type nums: List[int]
    :type target: int
    :rtype: int
    """
    nums = sorted(nums)
    smaller = 0
    for i in range(len(nums)):
        # Create temp array excluding a number
        if i != len(nums) - 1:
            temp = nums[:i] + nums[i+1:]
        else:
            temp = nums[:len(nums)-1]
        # Sort the temp array and set new target to target - the excluded number
        l, r = 0, len(temp) - 1
        t = target - nums[i]
        while l < r:
            if temp[l] + temp[r] >= t:
                r = r - 1
            else:
                smaller += 1
                l = l + 1
    return smaller
My solution fails:
Input:
[1,1,-2]
1
Output:
3
Expected:
1
I am not getting why the error is there, as my solution passes more than 30 test cases.
Thanks for your help.
One main point is that when you sort the elements in the first line, you also lose the indexes. This means that, despite having found a triplet, you'll never be sure whether your (i, j, k) will satisfy condition 1, because those (i, j, k) do not come from the original list, but from the new one.
Additionally: every time you pluck an element from the middle of the array, the remaining part of the array is iterated again (although in an irregular way, it still starts from the first of the remaining elements in temp). This should not be the case! Let me expand on the details:
The example iterates 3 times over the sorted list [-2, 1, 1] (which, again, loses the true i, j, and k indexes):
First iteration (i = 0, temp = [1, 1], t = 3).
When you sum temp[l] + temp[r] (l, r are 0, 1) it will be 2.
It satisfies being lower than t, so smaller will increase.
Second iteration (i = 1, temp = [-2, 1], t = 0): the sum is -1, again lower than t, so smaller increases again.
The third iteration (i = 2) is identical to the second, so smaller increases a third time.
So you count the value three times (even though only one triplet can be formed in index order), because you are effectively iterating over permutations of the indexes instead of combinations of them. So these are the two things you did not take care of:
Preserving indexes while sorting.
Ensuring you iterate the indexes in a forward-fashion only.
Try something like this instead:
def find(elements, upper_bound):
    result = 0
    for i in range(0, len(elements) - 2):
        upper_bound2 = upper_bound - elements[i]
        for j in range(i+1, len(elements) - 1):
            upper_bound3 = upper_bound2 - elements[j]
            for k in range(j+1, len(elements)):
                upper_bound4 = upper_bound3 - elements[k]
                if upper_bound4 > 0:
                    result += 1
    return result
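For comparison, here is a sketch of the standard O(n²) two-pointer count (my addition, not part of the answer above). Counting unordered index triples does not depend on the order of the array, so sorting once up front is safe when you only need the count:

def three_sum_smaller(nums, target):
    nums = sorted(nums)
    count = 0
    for i in range(len(nums) - 2):
        l, r = i + 1, len(nums) - 1
        while l < r:
            if nums[i] + nums[l] + nums[r] < target:
                # nums is sorted, so every k with l < k <= r also works
                count += r - l
                l += 1
            else:
                r -= 1
    return count

print(three_sum_smaller([1, 1, -2], 1))     # 1
print(three_sum_smaller([-2, 0, 1, 3], 2))  # 2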
Seems like you're counting the same triplet more than once...
In the first iteration of the loop, you omit the first 1 in the list, and then increase smaller by 1. Then you omit the second 1 in the list and increase smaller again by 1. And finally you omit the third element in the list, -2, and of course increase smaller by 1, because -- well -- in all these three cases you were in fact considering the same triplet {1,1,-2}.
p.s. It seems like you care more about correctness than performance. In that case, consider maintaining a set of the solution triplets, to ensure you're not counting the same triplet twice.
There are already good answers. Apart from that, if you want to check your algorithm's result, you can take the help of this built-in function:
import itertools

def find_(vector_, target):
    result = []
    for i in itertools.combinations(vector_, r=3):
        if sum(i) < target:
            result.append(i)
    return result
print(find_([-2, 0, 1, 3], 2))
output:
[(-2, 0, 1), (-2, 0, 3)]
If you want only the count, then:
print(len(find_([-2, 0, 1, 3],2)))
output:
2
I am trying to write code in Python for the following: there are n stairs. I want to display the different ways (not just the total number of ways) of reaching stair n from stair 1. The catch here is that I can skip not more than m stairs at a time. Please help. Note: m and n will be input by the user.
The following code displays the total number of ways, but not what all the different ways are:
# A program to count the number of ways to reach the n'th stair

# Recursive function used by countWays
def countWaysUtil(n, m):
    if n <= 1:
        return n
    res = 0
    i = 1
    while i <= m and i <= n:
        res = res + countWaysUtil(n-i, m)
        i = i + 1
    return res

# Returns number of ways to reach the s'th stair
def countWays(s, m):
    return countWaysUtil(s+1, m)

# Driver program
s, m = 4, 2
print "Number of ways =", countWays(s, m)
This seems to be a sort of generalized Fibonacci sequence, except that instead of f(n) = f(n-1) + f(n-2) you have f(n) = f(n-1) + ... + f(n-m). Your code should work, but the massive recursion will give you very high complexity for larger values of n (on the order of O(m^n) if I'm not mistaken). The key to solving this sort of problem is to memoize the results for lower input values in a list:
def ways_up_stairs(n, m):
    ways = [1] + [None] * n
    for i in range(1, n+1):
        ways[i] = sum(ways[max(0, i-m):i])
    return ways[n]
Some example input and output:
print(ways_up_stairs(4,2)) # 5
print(ways_up_stairs(4,3)) # 7
print(ways_up_stairs(4,4)) # 8
If you want the actual step-sequences, not the sums, you can easily adapt the code accordingly, using nested list comprehensions:
def ways_up_stairs(n, m):
    ways = [[(0,)]] + [None] * n
    for i in range(1, n+1):
        ways[i] = [w + (i,) for ws in ways[max(0, i-m):i] for w in ws]
    return ways[n]
print(ways_up_stairs(4,2))
# [(0, 2, 4), (0, 1, 2, 4), (0, 1, 3, 4), (0, 2, 3, 4), (0, 1, 2, 3, 4)]
Note that you might have to adapt the code a bit, as it is e.g. not really clear whether "skip up to m steps" means that you can take 1..m or 1..m+1 steps, but if you have the expected result for some input, making those "one-off" adaptations should be easy.