Finding shortest combinations in array/sequence that equals sum - python

I'm totally stuck and have no idea how to go about solving this. Let's say I've an array
arr = [1, 4, 5, 10]
and a number
n = 8
I need the shortest sequence of elements from arr (repeats allowed) that sums to n. For example, the following sequences from arr all sum to n:
c1 = 5,1,1,1
c2 = 4,4
c3= 1,1,1,1,1,1,1,1
So in the above case the answer is c2, because it is the shortest sequence from arr that sums to n.
I'm not sure what's the simplest way of finding a solution to above? Any ideas, or help will be really appreciated.
Thanks!
Edited:
Fixed the array
The array will contain positive values only.
I'm not sure how the subset-sum problem solves this, probably due to my own ignorance. Does a subset-sum algorithm always give the shortest sequence that equals the sum? For example, will it identify c2 as the answer in the above scenario?

As has been pointed out before, this is the minimum coin change problem, typically solved with dynamic programming. Here's a Python implementation with time complexity O(nC) and space complexity O(C), where n is the number of coins and C the required amount of money:
def min_change(V, C):
    table, solution = min_change_table(V, C)
    num_coins, coins = table[-1], []
    if num_coins == float('inf'):
        return []
    while C > 0:
        coins.append(V[solution[C]])
        C -= V[solution[C]]
    return coins
def min_change_table(V, C):
    m, n = C+1, len(V)
    table, solution = [0] * m, [0] * m
    for i in xrange(1, m):
        minNum, minIdx = float('inf'), -1
        for j in xrange(n):
            if V[j] <= i and 1 + table[i - V[j]] < minNum:
                minNum = 1 + table[i - V[j]]
                minIdx = j
        table[i] = minNum
        solution[i] = minIdx
    return (table, solution)
In the above functions V is the list of possible coins and C the required amount of money. Now when you call the min_change function the output is as expected:
min_change([1,4,5,10], 8)
> [4, 4]

For the benefit of people who find this question in future -
As Oscar Lopez and Priyank Bhatnagar have pointed out, this is the coin change (change-giving, change-making) problem.
In general, the dynamic programming solution they have proposed is the optimal solution - both in terms of (provably!) always producing the required sum using the fewest items, and in terms of execution speed. If your basis numbers are arbitrary, then use the dynamic programming solution.
If your basis numbers are "nice", however, a simpler greedy algorithm will do.
For example, the Australian currency system uses denominations of $100, $50, $20, $10, $5, $2, $1, $0.50, $0.20, $0.10, $0.05. Optimal change can be given for any amount by repeatedly giving the largest unit of change possible until the remaining amount is zero (or less than five cents.)
Here's an instructive implementation of the greedy algorithm, illustrating the concept.
def greedy_give_change(denominations, amount):
    # Sort from largest to smallest
    denominations = sorted(denominations, reverse=True)
    # Notes/coins handed out so far
    change_given = list()
    for d in denominations:
        # >= so that a remainder exactly equal to d is still paid out
        while amount >= d:
            change_given.append(d)
            amount -= d
    return change_given
australian_coins = [100, 50, 20, 10, 5, 2, 1, 0.50, 0.20, 0.10, 0.05]
change = greedy_give_change(australian_coins, 313.37)
print (change) # [100, 100, 100, 10, 2, 1, 0.2, 0.1, 0.05]
print (sum(change)) # 313.35
For the specific example in the original post (denominations = [1, 4, 5, 10] and amount = 8) the greedy solution is not optimal - it will give [5, 1, 1, 1]. But the greedy solution is much faster and simpler than the dynamic programming solution, so if you can use it, you should!
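Whether a given set of denominations is "nice" enough for the greedy approach can be checked empirically. The following is a minimal sketch, not part of the original answer: the helper names (greedy_count, dp_counts, first_greedy_failure) are hypothetical, and amounts are assumed to be non-negative integers (for example cents) so that float rounding does not interfere.

```python
def greedy_count(denominations, amount):
    """Number of coins the greedy strategy uses, or None if it leaves a remainder."""
    count = 0
    for d in sorted(denominations, reverse=True):
        used, amount = divmod(amount, d)
        count += used
    return count if amount == 0 else None


def dp_counts(denominations, up_to):
    """best[i] = fewest coins summing to i (float('inf') if i is unreachable)."""
    INF = float("inf")
    best = [0] + [INF] * up_to
    for i in range(1, up_to + 1):
        for d in denominations:
            if d <= i and best[i - d] + 1 < best[i]:
                best[i] = best[i - d] + 1
    return best


def first_greedy_failure(denominations, up_to):
    """Smallest amount <= up_to where greedy is not optimal, or None if there is none."""
    best = dp_counts(denominations, up_to)
    for amount in range(1, up_to + 1):
        if best[amount] == float("inf"):
            continue  # no exact change exists at all, so there is nothing to compare
        if greedy_count(denominations, amount) != best[amount]:
            return amount
    return None


print(first_greedy_failure([1, 4, 5, 10], 50))  # 8
# Australian coins in integer cents; prints None if greedy never loses in this range
print(first_greedy_failure([5, 10, 20, 50, 100, 200], 1000))
```

A check like this only covers the range it was run on, so it is evidence rather than a proof that greedy is safe for a given currency.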

This problem is known as the minimum coin change problem.
You can solve it by using dynamic programming.
Here is the pseudocode:
Set MinCoin[i] equal to Infinity for all of i
MinCoin[0] = 0
For i = 1 to N // The number N
    For j = 0 to M - 1 // M denominations given
        // Number i is broken into i-Value[j] for which we already know the answer
        // And we update if it gives us lesser value than previous known.
        If (Value[j] <= i and MinCoin[i-Value[j]]+1 < MinCoin[i])
            MinCoin[i] = MinCoin[i-Value[j]]+1
Output MinCoin[N]
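The pseudocode translates almost line for line into Python. This is a sketch under a couple of assumptions: the function name min_coins is made up here, and None is returned for an unreachable amount, which the pseudocode itself does not specify.

```python
def min_coins(values, n):
    """Fewest items from `values` (repeats allowed) that sum to n, or None."""
    INF = float("inf")
    table = [INF] * (n + 1)   # table[i] plays the role of MinCoin[i]
    table[0] = 0
    for i in range(1, n + 1):
        for v in values:
            # i is split into v plus the already-solved subproblem i - v
            if v <= i and table[i - v] + 1 < table[i]:
                table[i] = table[i - v] + 1
    return None if table[n] == INF else table[n]


print(min_coins([1, 4, 5, 10], 8))  # 2
print(min_coins([5, 10], 9))        # None
```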

This is a variant of the subset-sum problem. In your problem, you can pick an item several times. You can still use a similar idea to solve this problem with dynamic programming. The basic idea is to design a function F(k, j), such that F(k, j) = 1 means that there is a sequence from arr whose sum is j and length is k.
Formally, the base case is that F(1, j) = 1 if there exists an i such that arr[i] = j. For the inductive case, F(k, j) = 1 if there exists an i such that arr[i] = m and F(k-1, j-m) = 1.
The smallest k with F(k, n) = 1 is the length of the shortest sequence you want.
By using the dynamic programming technique, you can compute function F without using recursion.
By tracking additional information for every F(k, j), you also can reconstruct the shortest sequence.
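One way to sketch this idea is to build the sums with F(k, j) = 1 layer by layer for k = 0, 1, 2, ... and stop at the first layer that contains n. The function name shortest_sequence is hypothetical, and the sketch assumes every value in arr is a positive integer. As a shortcut that is not part of the formal definition above, a sum already reached with a smaller k is not revisited, since only the smallest k matters.

```python
def shortest_sequence(arr, n):
    """Shortest list of items from arr (repeats allowed) summing to n, or None."""
    # layer maps each newly reachable sum j to one length-k sequence producing it
    layer = {0: []}
    seen = {0}
    while layer:
        if n in layer:
            return layer[n]
        next_layer = {}
        for j, seq in layer.items():
            for m in arr:
                s = j + m
                # extend a length-k sequence by one item to get length k + 1
                if s <= n and s not in seen:
                    seen.add(s)
                    next_layer[s] = seq + [m]
        layer = next_layer
    return None  # no layer ever reached n


print(shortest_sequence([1, 4, 5, 10], 8))  # [4, 4]
print(shortest_sequence([5, 10], 9))        # None
```

Storing one sequence per sum is the "additional information" used to reconstruct the answer; storing only the last item added would use less memory.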

What you're trying to solve is a variant of the coin change problem. Here you're looking for the smallest amount of change, that is, the minimum number of coins that sum up to a given amount.
Consider a simple case where your array is
c = [1, 2, 3]
You want to write 5 as a combination of elements from c and to know which such combination is the shortest. Here c is the set of coin values and 5 is the amount for which you want to get change.
Let's write down all possible combinations:
1 + 1 + 1 + 1 + 1
1 + 1 + 1 + 2
1 + 2 + 2
1 + 1 + 3
2 + 3
Note that two combinations are the same up to re-ordering, so for instance 2 + 3 = 3 + 2.
Here there is an awesome result that's not obvious at first sight but it's very easy to prove. If you have any sequence of coins/values that is a sequence of minimum length that sums up to a given amount, no matter how you split this sequence the two parts will also be sequences of minimum length for the respective amounts.
For instance if c[3] + c[1] + c[2] + c[7] + c[2] + c[3] add up to S and we know that 6 is the minimal length of any sequence of elements from c that add up to S then if you split
                              |
S = c[3] + c[1] + c[2] + c[7] | + c[2] + c[3]
                              |
you have that 4 is the minimal length for sequences that add up to c[3] + c[1] + c[2] + c[7] and 2 the minimal length for sequences that add up to c[2] + c[3].
                              |
S = c[3] + c[1] + c[2] + c[7] | + c[2] + c[3]
                              |
  = S_left + S_right
How to prove this? By contradiction, assume that the length of S_left is not optimal, that is there's a shorter sequence that adds up to S_left. But then we could write S as a sum of this shorter sequence and S_right, thus contradicting the fact that the length of S is minimal. □
Since this is true no matter how you split the sequence, you can use this result to build a recursive algorithm that follows the principles of dynamic programming paradigm (solving smaller problems while possibly skipping computations that won't be used, memoization or keeping track of computed values, and finally combining the results).
Because of this property of maintaining optimality for subproblems, the coins problem is also said to "exhibit optimal substructure".
OK, so in the small example above this is how we would go about solving the problem with a dynamic programming approach: assume we want to find the shortest sequence of elements from c = [1, 2, 3] for writing the sum 5. We solve the subproblems obtained by subtracting one coin: 5 - 1, 5 - 2, and 5 - 3, we take the smallest solution of these subproblems and add 1 (the missing coin).
So we can write something like
shortest_seq_length([1, 2, 3], 5) =
    min( shortest_seq_length([1, 2, 3], 5-1),
         shortest_seq_length([1, 2, 3], 5-2),
         shortest_seq_length([1, 2, 3], 5-3)
       ) + 1
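Read literally, this recurrence is a recursive function. A minimal top-down sketch with memoization could look like the following; the name shortest_seq_length_topdown and the choice to return None for unreachable sums are assumptions made for illustration, and deep recursion limits how large S can get.

```python
from functools import lru_cache


def shortest_seq_length_topdown(c, S):
    """Length of the shortest sequence from c summing to S, or None if impossible."""
    coins = tuple(c)

    @lru_cache(maxsize=None)
    def solve(s):
        if s == 0:
            return 0
        # solve every subproblem obtained by removing one coin
        candidates = [solve(s - x) for x in coins if x <= s]
        candidates = [v for v in candidates if v is not None]
        # the best subproblem plus the one coin that was removed
        return min(candidates) + 1 if candidates else None

    return solve(S)


print(shortest_seq_length_topdown([1, 2, 3], 5))  # 2
```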
It is convenient to write the algorithm bottom-up, starting from smaller values of the sums that can be saved and used to form bigger sums. We just solve the problem for all possible values starting from 1 and going up to the desired sum.
Here's the code in Python:
def shortest_seq_length(c, S):
    res = {0: 0} # res contains computed results res[i] = shortest_seq_length(c, i)
    for i in range(1, S+1):
        res[i] = min([res[i-x] for x in c if x<=i]) + 1
    return res[S]
Now this works except for the cases when we cannot fill the memoization structure for all values of i. This is the case when we don't have the value 1 in c, so for instance we cannot form the sum 1 if c = [2, 3], and with the above function we get
shortest_seq_length([2, 3], 5)
# ValueError: min() arg is an empty sequence
So to take care of this issue one could for instance use try/except:
def shortest_seq_length(c, S):
    res = {0: 0} # res contains results for each sum res[i] = shortest_seq_length(c, i)
    for i in range(1, S+1):
        try:
            res[i] = min([res[i-x] for x in c if x<=i and res[i-x] is not None]) +1
        except ValueError:
            res[i] = None # min() raised because no smaller sum could be extended to i
    return res[S]
Or without try/except:
def shortest_seq_length(c, S):
    res = {0: 0} # res[i] = shortest_seq_length(c, i)
    for i in range(1, S+1):
        prev = [res[i-x] for x in c if x<=i and res[i-x] is not None]
        if len(prev)>0:
            res[i] = min(prev) +1
        else:
            res[i] = None # no smaller sum could be extended to i
    return res[S]
Try it out:
print(shortest_seq_length([2, 3], 5))
# 2
print(shortest_seq_length([1, 5, 10, 25], 37))
# 4
print(shortest_seq_length([1, 5, 10], 30))
# 3
print(shortest_seq_length([1, 5, 10], 25))
# 3
print(shortest_seq_length([1, 5, 10], 29))
# 7
print(shortest_seq_length([5, 10], 9))
# None
To show not only the length but also the combinations of coins of minimal length:
from collections import defaultdict
def shortest_seq_length(coins, total):
    combos = defaultdict(list)
    combos[0] = [[]]
    for i in range(1, total+1):
        for x in coins:
            if x<=i and combos[i-x] is not None:
                for p in combos[i-x]:
                    comb = sorted(p + [x])
                    if comb not in combos[i]:
                        combos[i].append(comb)
        if len(combos[i])>0:
            m = min(map(len, combos[i]))
            # keep only the combinations of minimal length
            combos[i] = [combo for combo in combos[i] if len(combo) == m]
        else:
            combos[i] = None
    return combos[total]
total = 9
coin_sizes = [10, 8, 5, 4, 1]
shortest_seq_length(coin_sizes, total)
# [[1, 8], [4, 5]]
To show all sequences, remove the minimum computation:
from collections import defaultdict
def all_seq_length(coins, total):
    combos = defaultdict(list)
    combos[0] = [[]]
    for i in range(1, total+1):
        for x in coins:
            if x<=i and combos[i-x] is not None:
                for p in combos[i-x]:
                    comb = sorted(p + [x])
                    if comb not in combos[i]:
                        combos[i].append(comb)
        if len(combos[i])==0:
            combos[i] = None
    return combos[total]
total = 9
coin_sizes = [10, 5, 4, 8, 1]
all_seq_length(coin_sizes, total)
# [[4, 5],
# [1, 1, 1, 1, 5],
# [1, 4, 4],
# [1, 1, 1, 1, 1, 4],
# [1, 8],
# [1, 1, 1, 1, 1, 1, 1, 1, 1]]
One small improvement to the algorithm is to skip the step of computing the minimum when the sum is equal to one of the values/coins, but this can be done better if we write a loop to compute the minimum. This however doesn't improve the overall complexity that's O(mS) where m = len(c).
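A sketch of that improvement, with the minimum computed in an explicit loop and skipped entirely when the sum is itself a coin. The function name shortest_seq_length_loop is hypothetical and positive integer coins are assumed; the complexity is still O(mS).

```python
def shortest_seq_length_loop(c, S):
    """Bottom-up variant: explicit minimum loop, shortcut when i is itself a coin."""
    coins = set(c)
    res = [0] + [None] * S   # res[i] = shortest length for sum i, None if unreachable
    for i in range(1, S + 1):
        if i in coins:
            res[i] = 1       # a single coin cannot be beaten, so skip the minimum
            continue
        best = None
        for x in coins:
            if x < i and res[i - x] is not None:
                if best is None or res[i - x] + 1 < best:
                    best = res[i - x] + 1
        res[i] = best
    return res[S]


print(shortest_seq_length_loop([1, 5, 10, 25], 37))  # 4
print(shortest_seq_length_loop([5, 10], 9))          # None
```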

Related

Could you help me with this Dynamic Programming Problem?

I tried to solve the problem below using dynamic programming, but there is something wrong with my code and I could not figure it out. Could you help me with it? Thank you!
Problem:
Given two arrays of length m and n with digits 0-9 representing two numbers. Create the maximum number of length k <= m + n from digits of the two. The relative order of the digits from the same array must be preserved. Return an array of the k digits.
Note: You should try to optimize your time and space complexity.
Example 1:
Input:
nums1 = [3, 4, 6, 5]
nums2 = [9, 1, 2, 5, 8, 3]
k = 5
Output:
[9, 8, 6, 5, 3]
Example 2:
Input:
nums1 = [6, 7]
nums2 = [6, 0, 4]
k = 5
Output:
[6, 7, 6, 0, 4]
Example 3:
Input:
nums1 = [3, 9]
nums2 = [8, 9]
k = 3
Output:
[9, 8, 9]
My idea is as following:
dp[i][j][t] is the maximum number of length i which is picked out of the first j digits of array 1 and the first t digits of array 2, where i goes from 0 to k, j goes from 0 to len(nums1), and t goes from 0 to len(nums2). The state transition equation goes like this:
when nums1[j-1] > nums2[t-1]:
if we already have k-2 digits, then we have to take both nums1[j-1] and nums2[t-1], and we must take nums1[j-1] first in order to maximize the result
if we already have k-1 digits, then we only have to take one more digit, and it must be nums1[j-1], because it is bigger
if we already have k digits, then we do not need to take more digits, so we keep the last result dp[i][j-1][t-1]
Given that we are looking for maximum, our current result should be the biggest among these 3 situations, so we have:
dp[i][j][t] = max(
    (dp[i-2][j-1][t-1]*10+nums1[j-1])*10+nums2[t-1],
    dp[i-1][j-1][t-1]*10+nums1[j-1],
    dp[i][j-1][t-1]
)
when nums1[j-1] < nums2[t-1]:
if we already have k-2 digits, then we have to take both nums1[j-1] and nums2[t-1], and this time we must take nums2[t-1] first because it is bigger
if we already have k-1 digits, then we only have to take one more digit, and it must be nums2[t-1], because it is bigger
if we already have k digits, then we do not need to take more digits, so we keep the last result dp[i][j-1][t-1]
Likewise, we take the biggest result from these possible ones:
dp[i][j][t] = max(
    (dp[i-2][j-1][t-1]*10+nums2[t-1])*10+nums1[j-1],
    dp[i-1][j-1][t-1]*10+nums2[t-1],
    dp[i][j-1][t-1]
)
Here is my code:
import numpy as np
def maxNumber(nums1, nums2, k):
    m = len(nums1)
    n = len(nums2)
    dp = [[[0 for _ in range(n + 1)] for _ in range(m + 1)] for _ in range(k + 1)]
    for i in range(2, k + 1):
        for j in range(i + 1):
            if j > m or (i - j) > n:
                continue
            tmp = 0
            tmp_nums1 = nums1[:j]
            tmp_nums2 = nums2[:(i-j)]
            while tmp_nums1 or tmp_nums2:
                if tmp_nums1 > tmp_nums2:
                    tmp = tmp * 10 + tmp_nums1.pop(0)
                else:
                    tmp = tmp * 10 + tmp_nums2.pop(0)
            dp[i][j][i - j] = tmp
    for i in range(m + 1):
        for j in range(n + 1):
            if not i and not j:
                continue
            dp[1][i][j] = max(nums1[:i] + nums2[:j])
    for i in range(2, k+1):
        for j in range(m+1):
            for t in range(i+1-j, n + 1):
                if nums1[j - 1] > nums2[t - 1]:
                    dp[i][j][t] = max((dp[i-2][j-1][t-1]*10+nums1[j-1])*10+nums2[t-1], dp[i][j-1][t-1], dp[i-1][j-1][t-1]*10+nums1[j-1])
                else:
                    dp[i][j][t] = max((dp[i-2][j-1][t-1]*10+nums2[t-1])*10+nums1[j-1], dp[i][j-1][t-1], dp[i-1][j-1][t-1]*10+nums2[t-1])
    # print(np.array(dp))
    res = []
    tmp_res = dp[-1][-1][-1]
    while tmp_res:
        res.append(tmp_res % 10)
        tmp_res //= 10
    return res[::-1]
But it outputs [8, 9, 9] on Example 3, and I cannot figure out the reason. Could you help me with it?
Thank you in advance!
Dynamic programming usually implies short-circuiting some of the computation based on results from computations made to date. Often this takes the form of a recursive function. You seem to be taking more of a brute force approach (which usually corresponds to the worst-case scenario for DP).
Here is an example of a recursive approach that will lend itself better to optimization:
def largestFrom(M,N,K):
    if K == 1: return [max(M+N)] # simple case
    if not M and len(N)==K : return N # simple case
    if not N and len(M)==K : return M # simple case
    result = []
    for A,B in [(N,M),(M,N)]:
        for i,a in enumerate(A): # trial on numbers from A
            if len(A)-i+len(B)<K: break # can't take more from A
            if result and a < result[0]: continue # short-circuit
            R = [a] + largestFrom(A[i+1:],B,K-1) # recurse with remaining numbers
            if R > result: result = R # track best so far
    return result
After eliminating the obvious solutions that require no special processing, it goes into a recursive trial/error process that short-circuits the traversal for candidate numbers that won't improve the best result found so far.
The traversal goes through the two lists and attempts to use the number at each position as the first one in the result. It then recurses with the remaining numbers and a size of K-1. So, upon returning from the recursion, a list R is formed of the selected number followed by the largest K-1 sized suffix that can be made with the remaining numbers.
One part of the short circuiting is stopping the loop when the index of the selected number would not leave enough remaining numbers to reach a size of K-1 (i.e. combining the remainder of the current list plus all numbers of the other one).
Another part of short circuiting is comparing the number we are about to try with the first one in the best result. If the candidate number is smaller than the first one in the result, then it would be pointless to go deeper as there is no possibility to form an R list greater than the result we already have.
For example:
combining [3,9] [8,9] with K=3
result starts empty
Going through first list [3,9]
    select 3 at position 0
        recurse with M=[9] N=[8,9] K=2
        will produce R = [3] + [9,8]
        R > result, result is now [3,9,8]
    select 9 at position 1
        recurse with M=[] N=[8,9] K=2
        will produce R = [9] + [8,9]
        R > result, result is now [9,8,9]
Going through second list [8,9]
    select 8 at position 0
        8 is smaller than R[0] (9)
        short-circuit
    select 9 at position 1
        recurse with M=[3,9] N=[] K=2
        will produce R = [9] + [3,9]
        result unchanged (R is < result)
return result [9,8,9]
The for A,B in [(N,M),(M,N)]: loop is merely a shorthand way to avoid duplicating the code for the trial loops on numbers in M and numbers N.
testSet = [ ([3,4,6,5],[9,1,2,5,8,3],5),
            ([6, 7], [6, 0, 4],5),
            ([3, 9], [8, 9],3)
          ]
for M,N,K in testSet:
    print(M,N,K,":",largestFrom(M,N,K))
Output:
[3, 4, 6, 5] [9, 1, 2, 5, 8, 3] 5 : [9, 8, 6, 5, 3]
[6, 7] [6, 0, 4] 5 : [6, 7, 6, 0, 4]
[3, 9] [8, 9] 3 : [9, 8, 9]
There is an alternative to DP for solving this. Here is another solution I've just put together:
def maxNumber(nums1, nums2, k):
    def pick(nums, k):
        stack = []
        drop = len(nums) - k
        for num in nums:
            while drop and stack and stack[-1] < num:
                stack.pop()
                drop -= 1
            stack.append(num)
        return stack[:k]
    def merge(A, B):
        ans = []
        while A or B:
            bigger = A if A > B else B
            ans.append(bigger.pop(0))
        return ans
    return max(merge(pick(nums1, i), pick(nums2, k-i))
               for i in range(k+1) if i <= len(nums1) and k-i <= len(nums2))
if __name__ == '__main__':
    nums1 = [3, 4, 6, 5]
    nums2 = [9, 1, 2, 5, 8, 3]
    print(maxNumber(nums1, nums2, 5))
    print(maxNumber([3,9],[8,9], 3))
Are those answers to your examples provided by the professor? Because they don't make sense to me. Surely the largest number is one that uses all of the digits available? i.e. the largest value will always mean k=m+n. You can't possibly have a larger answer with k=m+(n-1) for instance. What am I missing?
Example 3:
Input: nums1 = [3, 9]
nums2 = [8, 9]
k = 3
Output: [9, 8, 9]
or - in my world k = 4 / Output: [8, 9, 3, 9]
(Hmm... I guess they were provided. Seems a weird question to me. Sorry - I'm unable to help, but I'll post this anyway in case someone else wonders the same thing I did. To me the hard part would be to actually work out what the largest number would be, using all digits. But even then that's not that hard: Compare positions 1 - use the value from the larger array. Compare position 1 of the non-chosen array with position 2... and so on.)

Efficient reverse-factorization of a number given list of divisors

Given a number n and a list of divisors A, how can I efficiently find all the combinations of divisors that, when multiplied, yield the number?
e.g.
n = 12
A = [2, 3, 4]
Output:
[[3, 2, 2],
[2, 3, 2],
[2, 2, 3],
[4, 3],
[3, 4]]
This is what I managed to do so far (code that I re-adapted from one of the many find-prime-factorization questions on stackoverflow):
def products(n, A):
    if n == 1:
        yield []
    for each_divisor in A:
        if n % each_divisor == 0:
            for new_product in products(n // each_divisor, A):
                yield new_product + [each_divisor]
This code seems to work properly but it's very slow, and if I try to use memoization (passing A as a tuple to the function to avoid unhashable type error) the code doesn't provide the correct result.
Any suggestions on how to improve the efficiency of this code?
The memoized code I tried is the following:
class Memoize:
    def __init__(self, fun):
        self.fun = fun
        self.memo = {}
    def __call__(self, *args):
        if args not in self.memo:
            self.memo[args] = self.fun(*args)
        return self.memo[args]
@Memoize
def products(n, A): [as above]
When calling the function with the above defined parameters n, A:
>>> list(products(12, (2, 3, 4)))
[[3, 2, 2]]
Without memoization, the output of the same code is:
[[3, 2, 2], [2, 3, 2], [2, 2, 3], [4, 3], [3, 4]]
Note that other memoization functions (e.g. @functools.lru_cache(maxsize=128) from the functools package) lead to the same problem.
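A likely explanation for the wrong result, which the thread itself does not spell out: products is a generator function, so the cache stores generator objects, and a generator yields nothing once it has been consumed. The second lookup of products(1, A) therefore returns an already exhausted generator. A minimal sketch of a cache-friendly variant returns tuples instead; the name products_cached is hypothetical, and the d > 1 guard is an added assumption to avoid infinite recursion if 1 appears in A.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def products_cached(n, A):
    """All ordered factorizations of n over divisors A, as a tuple of tuples."""
    if n == 1:
        return ((),)
    found = []
    for d in A:
        if d > 1 and n % d == 0:
            # the cached value is a tuple, so it can be iterated any number of times
            for rest in products_cached(n // d, A):
                found.append(rest + (d,))
    return tuple(found)


print(products_cached(12, (2, 3, 4)))
# ((3, 2, 2), (2, 3, 2), (2, 2, 3), (4, 3), (3, 4))
```

A has to be passed as a tuple so that the arguments are hashable, just as in the question.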
Rather than using memoization, you can split the problem into a recursive portion to find all the unique combinations, and a portion to find the combinations of each arrangement. That should cut down your search space considerably and only permute the options that will actually work.
To accomplish this, A should be sorted.
Part 1:
Do a DFS on the graph of possible factorizations that are available. Truncate the search down redundant branches by only selecting orderings in which each factor is greater than or equal to its predecessor. For example:
               12
            /  |  \
           /   |   \
          /    |    \
     2(x6)   3(x4)   4(x3)
      /  |     |  \
 2(x3) 3(x2)   3   4(x1)
  /  |
 2   3(x1)
Bold nodes are the paths that lead to a successful factorization. Struck nodes are ones that lead to a redundant branch because the remaining n after dividing by the factor is less than the factor. Nodes that don't show a remaining value in parentheses do not lead to a factorization at all. No branch is attempted for the factors lower than the current one: when we try 3, 2 is never revisited, only 3 and 4, etc.
In code:
A.sort()
def products(n, A):
    def inner(n, A, L):
        for i in range(len(A)):
            factor = A[i]
            if n % factor: continue
            k = n // factor
            if k < factor:
                if k == 1:
                    yield L + [factor]
                elif n in A:
                    yield L + [n]
                break # Following k guaranteed to be even smaller
                      # until k == 1, which elif shortcuts
            yield from inner(k, A[i:], L + [factor])
    yield from inner(n, A, [])
This is pretty fast. In your particular case, it only inspects 4 nodes instead of ~30. In fact, you can prove that it inspects the absolute minimum number of nodes possible. The only improvement you might get is by using iteration instead of recursion, and I doubt that will help much.
Part 2:
Now, you just generate a permutation of each element of the result. Python provides the tools to do this directly in the standard library:
from itertools import chain, permutations
chain.from_iterable(map(permutations, products(n, A)))
You can put this into the last line of products as
yield from chain.from_iterable(map(permutations, inner(n, A, [])))
Running list(products(12, A)) shows a 20-30% improvement on my machine this way (5.2µs vs 4.0µs). Running with a more complicated example like list(products(2 * 3 * 4 * 5 * 5 * 7 * 11, [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 14, 22])) shows an even more dramatic improvement: 7ms vs 42ms.
Part 2b:
You can filter out duplicate permutations that occur because of duplicate factors using an approach similar to the one shown here (shameless plug). Adapting for the fact that we always deal with an initial list of sorted integers, it can be written something like this:
def perm_dedup(tup):
    maximum = (-1,) * len(tup)
    for perm in permutations(tup):
        if perm <= maximum: continue
        maximum = perm
        yield perm
Now you can use the following in the last line:
yield from chain.from_iterable(map(perm_dedup, inner(n, A, [])))
The timings still favor this complete approach very much: 5.2µs vs 4.9µs for the question and 6.5ms vs 42ms for the long example. In fact, if anything, avoiding duplicate permutations seems to reduce the timing even more.
TL;DR
A much more efficient implementation that only uses standard libraries and searches only for unique permutations of unique factorizations:
from itertools import chain, permutations
def perm_dedup(tup):
    maximum = (-1,) * len(tup)
    for perm in permutations(tup):
        if perm <= maximum: continue
        maximum = perm
        yield perm
def products(n, A):
    A = sorted(set(A))
    def inner(n, A, L):
        for i in range(len(A)):
            factor = A[i]
            if n % factor: continue
            k = n // factor
            if k < factor:
                if k == 1:
                    yield L + [factor]
                elif n in A:
                    yield L + [n]
                break # Following k guaranteed to be even smaller
                      # until k == 1, which elif shortcuts
            yield from inner(k, A[i:], L + [factor])
    yield from chain.from_iterable(map(perm_dedup, inner(n, A, [])))

Dynamic Programming - Primitive Calculator

I am trying to solve the following problem using dynamic programming.
You are given a primitive calculator that can perform the following three operations with the current number x: multiply x by 2, multiply x by 3, or add 1 to x. Your goal is given a positive integer n, find the minimum number of operations needed to obtain the number n starting from the number 1.
The output should contain two parts - the number of minimum operations, and the sequence to get to n from 1.
I found the following solution from this post: Dynamic Programming - Primitive Calculator Python.
I am having problem understanding the back tracing part, starting from
"numbers = [ ]
k = n"
Could anyone explain the logic behind it? It works like magic...
The code is as follows:
def dp_min_ops(n):
    all_parents = [None] * (n + 1)
    all_min_ops = [0] + [None] * n
    for k in range(1, n + 1):
        curr_parent = k - 1
        curr_min_ops = all_min_ops[curr_parent] + 1
        if k % 3 == 0:
            parent = k // 3
            num_ops = all_min_ops[parent] + 1
            if num_ops < curr_min_ops:
                curr_parent, curr_min_ops = parent, num_ops
        if k % 2 == 0:
            parent = k // 2
            num_ops = all_min_ops[parent] + 1
            if num_ops < curr_min_ops:
                curr_parent, curr_min_ops = parent, num_ops
        all_parents[k], all_min_ops[k] = curr_parent, curr_min_ops
    numbers = []
    k = n
    while k > 0:
        numbers.append(k)
        k = all_parents[k]
    numbers.reverse()
    return all_min_ops, numbers
print(dp_min_ops(5)) # ([0, 1, 2, 2, 3, 4], [1, 3, 4, 5])
print(dp_min_ops(10)) # ([0, 1, 2, 2, 3, 4, 3, 4, 4, 3, 4], [1, 3, 9, 10])
Hint: to find the minimum number of operations to reach a number n, you will need the following answers:
1) min_operations[n-1]
2) if n is divisible by 2: min_operations[n/2]
3) if n is divisible by 3: min_operations[n/3]
Take the minimum of those of the three values that are valid and add one: that is the minimum number of operations to reach n.
You know that the minimum number of operations to reach 1 is zero, so start calculating the minimum number of operations from 1 up to n. Whenever you calculate the answer for some number k, you already have the answers for all numbers less than k, i.e. k-1, k/2 (if divisible) and k/3 (if divisible). Hence you can calculate the answer for n by traversing from 1 to n and finding the answers for all numbers in between.
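The back-tracing part the question asks about can be sketched on top of this forward pass: while filling the table, also remember for each k which smaller number gave the best result (its "parent"), then walk from n back through the parents until reaching 1. The names min_ops_with_path, ops and parent are made up for this sketch. Unlike the quoted code, it sets ops[1] = 0, so the returned count is the number of operations rather than the number of values in the sequence.

```python
def min_ops_with_path(n):
    """Return (minimum number of operations, sequence of values from 1 to n)."""
    ops = [0] * (n + 1)      # ops[k] = fewest operations to reach k from 1
    parent = [0] * (n + 1)   # parent[k] = value k was reached from; 0 means "none"
    for k in range(2, n + 1):
        best = k - 1                              # undo "add 1"
        if k % 2 == 0 and ops[k // 2] < ops[best]:
            best = k // 2                         # undo "multiply by 2"
        if k % 3 == 0 and ops[k // 3] < ops[best]:
            best = k // 3                         # undo "multiply by 3"
        ops[k] = ops[best] + 1
        parent[k] = best
    # Back-trace: follow the parent links from n down to 1 (parent[1] == 0 stops the loop)
    path = []
    k = n
    while k > 0:
        path.append(k)
        k = parent[k]
    path.reverse()
    return ops[n], path


print(min_ops_with_path(10))  # (3, [1, 3, 9, 10])
```

For n = 10 the parent links are 10 -> 9 -> 3 -> 1, which reversed gives the sequence 1, 3, 9, 10.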

How to search for combination of digits in number (optimizing for speed)?

I'm trying to count the 7-digit numbers (or more; I actually need it to work for 10, but it's faster to test with 7) that contain all of the digits 1, 3, 5, 7. I tried a few different methods, like using
combinations = 0
for combination in itertools.product(xrange(10), repeat=7):
    if all(x in combination for x in (1,3,5,7)):
        combinations += 1
However, this next method worked out to be about 4 times faster, as it doesn't look for 3, 5, 7 if 1 is not in the list.
combinations = 0
for combination in itertools.product(xrange(10), repeat=7):
    if 1 in combination:
        if 3 in combination:
            if 5 in combination:
                if 7 in combination:
                    combinations += 1
I'm sure there is a more clever way to achieve this result with numpy or something like that, but I can't figure it out.
Thanks for feedback
The problem is to find k-digit numbers that contain all the digits 1, 3, 5, 7.
This answer contains a number of solutions, increasing in sophistication and algorithmic efficiency. By the end, we'll be able to, in a fraction of a second, count solutions for huge k, for example 10^12, modulo a large prime.
The section at the end includes tests that provide good evidence that all the implementations are correct.
Brute force: O(k10^k) time, O(k) space
We'll use this slow approach to test the more optimized versions of the code:
def contains_1357(i):
    i = str(i)
    return all(x in i for x in '1357')
def combos_slow(k):
    return sum(contains_1357(i) for i in xrange(10 ** k))
Counting: O(k^4) time, O(k) space
The simplest moderately efficient method is to count. One way to do this is to count all k-digit numbers where the first occurrences of the four special digits appear at digits a, b, c, d.
Given such an a, b, c, d, the digits up to a must be 0,2,4,6,8,9, the digit a must be one of [1, 3, 5, 7], the digits between a and b must be either the same as the digit a or any of the safe digits, the digit b must be one of [1, 3, 5, 7] that's different from the digit at a, and so on.
Summing over all possible a, b, c, d gives the result. Like this:
import itertools
def combos0(k):
    S = 0
    for a, b, c, d in itertools.combinations(range(k), 4):
        S += 6 ** a * 4 * 7**(b-a-1) * 3 * 8**(c-b-1) * 2 * 9**(d-c-1) * 10**(k-d-1)
    return S
Dynamic programming: O(k) time, O(k) and then O(1) space
You can solve this more efficiently with dynamic programming: let c[j][i] be the number of i-digit numbers which contain exactly j different digits from (1, 3, 5, 7).
Then c satisfies these recurrence relations:
c[0][0] = 1
c[j][0] = 0 for j > 0
c[0][i] = 6 * c[0][i-1] for i > 0
c[j][i] = (6+j)c[j][i-1] + (5-j)c[j-1][i-1] for i, j > 0
The final line of the recurrence relations is the hardest one to understand. The first part (6+j)c[j][i-1] says that you can make an i digit number containing j of the digits 1, 3, 5, 7 from a i-1 digit number containing j of the digits 1, 3, 5, 7, and add an extra digit that's either 0, 2, 4, 6, 8, 9 or any of the digits you've already got. Similarly, the second part (5-j)c[j-1][i-1] says that you can take an i-1 digit number containing j-1 of the digits 1, 3, 5, 7 and make it an i-digit number containing j of the special digits by adding one of the digits you haven't already used. There's 5-j of these.
That leads to this O(k) solution using dynamic programming:
def combos(k):
    c = [[0] * (k + 1) for _ in xrange(5)]
    c[0][0] = 1
    for i in xrange(1, k+1):
        c[0][i] = 6 * c[0][i-1]
        for j in xrange(1, 5):
            c[j][i] = (6 + j) * c[j][i-1] + (5-j) * c[j-1][i-1]
    return c[4][k]
We can print combos(10):
print 'combos(10) =', combos(10)
This gives this output:
combos(10) = 1425878520
The solution above is already fast enough to compute combos(10000) in a fraction of a second. But it's possible to optimize the DP solution a little to use O(1) rather than O(k) space by observing that values of c depend only on the previous column in the table. With a bit of care (to make sure that we're not overwriting values before they're used), we can write the code like this:
def combos2(k):
    c = [1, 0, 0, 0, 0]
    for _ in xrange(k):
        for j in xrange(4, 0, -1):
            c[j] = (6+j)*c[j] + (5-j)*c[j-1]
        c[0] *= 6
    return c[4]
Matrix power: O(log k) time, O(1) space.
Ultimately, it's possible to get the result in O(log k) time and O(1) space, by expressing the recurrence relation as a matrix-by-vector multiply, and using exponentiation by squaring. That makes it possible to compute combos(k) modulo X even for massive k (here combos(10^12) modulo 2^31 - 1). That looks like this:
def mat_vec(M, v, X):
    # reduce the whole row sum modulo X, not just each term
    return [sum(M[i][j] * v[j] for j in xrange(5)) % X for i in xrange(5)]
def mat_mul(M, N, X):
    return [[sum(M[i][j] * N[j][k] for j in xrange(5)) % X for k in xrange(5)] for i in xrange(5)]
def mat_pow(M, k, X):
    r = [[i==j for i in xrange(5)] for j in xrange(5)]
    while k:
        if k % 2:
            r = mat_mul(r, M, X)
        M = mat_mul(M, M, X)
        k //= 2
    return r
def combos3(k, X):
    M = [[6, 0, 0, 0, 0], [4, 7, 0, 0, 0], [0, 3, 8, 0, 0], [0, 0, 2, 9, 0], [0, 0, 0, 1, 10]]
    return mat_vec(mat_pow(M, k, X), [1, 0, 0, 0, 0], X)[4]
print combos3(10**12, (2**31) - 1)
Given that your original code struggled for k=10, this is quite an improvement!
Testing
We can test each of the functions against each other (and combos_slow for small values). Since combos3 has an extra arg, we wrap it in a function that passes a modulo that's guaranteed to be larger than the result.
def combos3p(k):
    return combos3(k, 10**k)
for c in [combos0, combos, combos2, combos3p]:
    for i in xrange(40 if c == combos0 else 100):
        assert c(i) == (combos_slow if i < 7 else combos)(i)
This tests all the implementations against combos_slow for i<7, and against each other for 7 <= i < 100 (except for the less efficient combos0 which stops at 40).
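As one more independent cross-check, which is not part of the original answer, the same count can be written in closed form with inclusion-exclusion over which of the four special digits are missing: the sum over j of (-1)^j * C(4, j) * (10 - j)^k. The sketch below is Python 3; combos_inclusion_exclusion is a name introduced here, and combos_dp is a straight Python 3 port of combos2 used only for the comparison.

```python
def combos_inclusion_exclusion(k):
    """Count length-k digit strings containing all of 1, 3, 5, 7."""
    binomials = [1, 4, 6, 4, 1]  # C(4, j): ways to choose which j special digits are absent
    return sum((-1) ** j * binomials[j] * (10 - j) ** k for j in range(5))


def combos_dp(k):
    """Python 3 port of the O(1)-space DP, for comparison."""
    c = [1, 0, 0, 0, 0]
    for _ in range(k):
        for j in range(4, 0, -1):
            c[j] = (6 + j) * c[j] + (5 - j) * c[j - 1]
        c[0] *= 6
    return c[4]


print(combos_inclusion_exclusion(10))  # 1425878520
assert all(combos_inclusion_exclusion(k) == combos_dp(k) for k in range(100))
```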

Why Won't Python Do a Lot of Recursion?

I'm doing the Project Euler problems, and I'm on number two. The question is:
Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be:
1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...
By considering the terms in the Fibonacci sequence whose values do not exceed four million, find the sum of the even-valued terms.
I'm trying to solve this in Python. I think I have the correct code, but for some reason, when I run it with n greater than or equal to 27, it waits for about a minute and just returns 0. However, for anything 26 or lower, it runs fine. Here's my code:
def fib_seq(n):
if n == 0:
return n
elif n == 1:
return n
else:
return fib_seq(n-1) + fib_seq(n-2)
def get_fib_sum(n):
x = n
sum = 0
for i in range(n):
if fib_seq(x) > 4000000:
pass
elif fib_seq(x) % 2 == 0:
pass
else:
sum += fib_seq(x)
x = i
return sum
print get_fib_sum(27)
Is there any way to fix this or at least get it to work? If it makes a difference, I'm using Wing IDE 101 Student Edition.
In your loop, you are using fib_seq(x) and it should be fib_seq(i)
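Also, if you want to reduce the time further, you can use memoization. The cache has to outlive a single call, and the recursive calls have to go through it, for example by wrapping the function in a decorator:
def memoize(fn):
    memo = {} # shared by every call to the wrapped function
    def wrapper(arg):
        if arg not in memo:
            memo[arg] = fn(arg)
        return memo[arg]
    return wrapper
@memoize
def fib_seq(n):
    if n == 0:
        return n
    elif n == 1:
        return n
    else:
        return fib_seq(n-1) + fib_seq(n-2)
print(fib_seq(27))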
Why are you using recursion? Your code is recalculating the ENTIRE Fibonacci sequence over and over and over and over and over... The code just wants the sum of the even terms. There is NO need for recursion. In pseudocode:
t1 = 1;
t2 = 2;
sum = 2;
do {
    t3 = t1 + t2;
    if (t3 is even) {
        sum += t3;
    }
    t1 = t2;
    t2 = t3;
} while (t2 <= 4000000)
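A Python rendering of the same idea, as a sketch: the function name even_fib_sum is made up here, and the loop tests each term against the limit before adding it, which differs slightly from the do/while above but gives the same result for this limit.

```python
def even_fib_sum(limit):
    """Sum of the even Fibonacci terms (sequence starting 1, 2) not exceeding limit."""
    total = 0
    a, b = 1, 2
    while a <= limit:
        if a % 2 == 0:
            total += a
        a, b = b, a + b   # advance one term; no recursion needed
    return total


print(even_fib_sum(4000000))  # 4613732
```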
Fibonacci sequence is often used as an example of how to write recursive code, which is ridiculous because it has a very straight-forward iterative solution:
def fib(n):
    if n < 2:
        return n
    else:
        a, b = 1, 1
        for _ in range(2, n): # O(n)
            a, b = b, a+b
        return b
What is less obvious is that it also has a matrix representation,
F = [[0, 1]] # initial state
T = [[0, 1], # transition matrix
[1, 1]]
fib(n) = (F * T**n)[0][0]
which is extremely useful because T**n can be computed in O(log(n)) steps.
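The O(log(n)) claim rests on exponentiation by squaring. A minimal sketch with plain lists follows; the helper names mat_mul_2x2 and fib_matrix are assumptions for illustration, not something the answer defines.

```python
def mat_mul_2x2(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [
        [X[0][0] * Y[0][0] + X[0][1] * Y[1][0], X[0][0] * Y[0][1] + X[0][1] * Y[1][1]],
        [X[1][0] * Y[0][0] + X[1][1] * Y[1][0], X[1][0] * Y[0][1] + X[1][1] * Y[1][1]],
    ]


def fib_matrix(n):
    """fib(n) via T**n, using O(log n) matrix multiplications."""
    result = [[1, 0], [0, 1]]   # identity matrix
    base = [[0, 1], [1, 1]]     # transition matrix T
    while n:
        if n & 1:
            result = mat_mul_2x2(result, base)
        base = mat_mul_2x2(base, base)
        n >>= 1
    # With F = [[0, 1]], (F * T**n)[0][0] is the bottom-left entry of T**n
    return result[1][0]


print(fib_matrix(10))  # 55
print(fib_matrix(27))  # 196418
```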
(As an aside, diagonalizing the transition matrix, whose eigenvalues are phi and 1 - phi, leads to the analytic solution,
phi = (1 + 5**0.5) / 2 # golden ratio
fib(n) = round(phi**n / 5**0.5, 0)
but that's not where I'm going with this.)
Looking at the terms produced in terms of odd-or-even, you see
n:    0,    1,   2,   3,    4,   5,   6,    7,   8,   ...
f(n): 0,    1,   1,   2,    3,   5,   8,    13,  21,  ...
e/o:  even, odd, odd, even, odd, odd, even, odd, odd, ...
so what you need is fib(0) + fib(3) + fib(6) + ... and computing T**3 gives you the coefficients needed to step directly from term to term.
The rest is left as an exercise for the reader ;-)
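For readers who would rather check their answer, here is one possible sketch of that exercise. With T as above, T**3 works out to [[1, 2], [2, 3]], so the row (fib(n), fib(n+1)) maps to (fib(n) + 2*fib(n+1), 2*fib(n) + 3*fib(n+1)) = (fib(n+3), fib(n+4)). The function name even_fib_sum_by_threes is hypothetical.

```python
def even_fib_sum_by_threes(limit):
    """Sum fib(0) + fib(3) + fib(6) + ... for terms not exceeding limit."""
    total = 0
    a, b = 0, 1               # (fib(0), fib(1))
    while a <= limit:
        total += a            # every third term is even, so no parity test is needed
        # one multiplication by T**3 = [[1, 2], [2, 3]] jumps three terms ahead
        a, b = a + 2 * b, 2 * a + 3 * b
    return total


print(even_fib_sum_by_threes(4000000))  # 4613732
```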
It does a lot of recursion; that's why it's taking so long.
get_fib_sum() evaluates fib_seq(27) over and over in a loop, which does a lot of recursion and takes a while. Since fib_seq(27) is 196418, which is even, the elif branch just hits pass, so nothing is ever added to sum and it returns 0 in the end.
