I wrote this code, which finds two integers in a given list (in this case [2,4,5,1,6,40,-1]) that multiply to twenty. I got a little stuck at the beginning, but adding a function solved my problems. I showed the code to a friend of mine who's a programmer, and he said I could make it more "pythonic", but I have no clue how.
Here's the code:
num_list = [2,4,5,1,6,40,-1]
def get_mult_num(given_list):
    for i in given_list:
        for j in range(i+1, len(given_list)): #for j not to be == i and to be in the list
            mult_two_numbers = i * j
            if mult_two_numbers == 20:
                return i,j
print(get_mult_num(num_list))
I don't necessarily think it is 'unpythonic'; you are using standard Python idioms to loop over your data and produce a single result or None. The term Pythonic is nebulous, a subject mired in "I know it when I see it" parameters.
Not that you produced a correct implementation. While i loops over given_list, j loops over the integers from i + 1 through to len(given_list), mixing values from given_list with indices. For your sample input, you are taking j from the half-open ranges [3, 7), [5, 7), [6, 7), [2, 7), [7, 7) (empty), [41, 7) (empty) and [0, 7), respectively. That it produces the correct answer at all is luck, not due to correctness; if you give your function the list [2, 10], it'll not find a solution! You want to loop over given_list again, limited with slicing, or generate indices starting at the current index of i, but then your outer loop needs to add an enumerate() call too:
for ii, i in enumerate(given_list):
    for j in given_list[ii + 1:]:
        # ...
or
for ii, i in enumerate(given_list):
    for jj in range(ii + 1, len(given_list)):
        j = given_list[jj]
        # ...
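To make the failure concrete, here is a minimal sketch that puts the question's loop next to the slicing variant. The names get_mult_num_buggy and get_mult_num_fixed are made up for this illustration and do not appear in the question.

```python
def get_mult_num_buggy(given_list):
    # Reproduces the question's loop: j comes from range(), not from the list.
    for i in given_list:
        for j in range(i + 1, len(given_list)):
            if i * j == 20:
                return i, j
    return None


def get_mult_num_fixed(given_list):
    # Pair each value with the values that come after it in the list.
    for index, i in enumerate(given_list):
        for j in given_list[index + 1:]:
            if i * j == 20:
                return i, j
    return None


print(get_mult_num_buggy([2, 10]))  # None: both range() calls are empty
print(get_mult_num_fixed([2, 10]))  # (2, 10)
```

Both versions return (4, 5) for the sample list, which is why the bug is easy to miss.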
All this is not nearly as efficient as it can be; the Python standard library offers you the tools to generate your i, j pairs without a nested for loop or slicing or other forms of filtering.
Your double loop should generate combinations of the integer inputs, so use the itertools.combinations() object to generate unique i, j pairs:
from itertools import combinations
def get_mult_num(given_list):
    return [(i, j) for i, j in combinations(given_list, 2) if i * j == 20]
This assumes there can be zero or more such solutions, not just a single solution.
If you only ever need the first result or None, you can use the next() function:
def get_mult_num(given_list):
    multiplies_to_20 = (
        (i, j) for i, j in combinations(given_list, 2)
        if i * j == 20)
    return next(multiplies_to_20, None)
Next, rather than produce all possible combinations, you may want to invert the problem. If you turn given_list into a set, you can trivially check if the target number 20 can be divided cleanly without remainder by any of your given numbers and where the result of the division is larger and is also an integer in the set of numbers. That gives you an answer in linear time.
You can further limit the search by dividing only by numbers smaller than the square root of the target value, because you won't find a larger value to match in your input numbers (given a number n and its square root s, by definition s * (s + 1) is going to be larger than n).
If we add an argument for the target number to the function and make it a generator function, then you get:
def gen_factors_for(target, numbers):
    possible_j = set(numbers)
    limit = abs(target) ** 0.5
    for i in numbers:
        if abs(i) < limit and target % i == 0:
            j = target // i
            if j in possible_j and abs(j) > abs(i):
                yield i, j
This approach is a lot faster than testing all combinations, especially if you need to find all possible factors. Note that I made both functions generators here to even out the comparisons:
>>> import random, operator
>>> from timeit import Timer
>>> def gen_factors_for_division(target, numbers):
...     possible_j = set(numbers)
...     limit = abs(target) ** 0.5
...     for i in numbers:
...         if abs(i) < limit and target % i == 0:
...             j = target // i
...             if j in possible_j and abs(j) > abs(i):
...                 yield i, j
...
>>> def gen_factors_for_combinations(target, given_list):
...     return ((i, j) for i, j in combinations(given_list, 2) if i * j == target)
...
>>> numbers = [random.randint(-10000, 10000) for _ in range(100)]
>>> targets = [operator.mul(*random.sample(set(numbers), 2)) for _ in range(5)]
>>> targets += [t + random.randint(1, 100) for t in targets]  # add likely-to-be-unsolvable numbers
>>> for (label, t) in (('first match:', 'next({}, None)'), ('all matches:', 'list({})')):
...     print(label)
...     for f in (gen_factors_for_division, gen_factors_for_combinations):
...         test = t.format('f(t, n)')
...         timer = Timer(
...             f"[{test} for t in ts]",
...             'from __main__ import targets as ts, numbers as n, f')
...         count, total = timer.autorange()
...         print(f"{f.__name__:>30}: {total / count * 1000:8.3f}ms")
...
first match:
      gen_factors_for_division:    0.219ms
  gen_factors_for_combinations:    4.664ms
all matches:
      gen_factors_for_division:    0.259ms
  gen_factors_for_combinations:    3.326ms
Note that I generate 10 different random targets, to try to avoid a lucky best-case-scenario hit for either approach.
[(i,j) for i in num_list for j in num_list if i<j and i*j==20]
This is my take on it, which uses enumerate:
def get_mult_num(given_list):
    return [
        (item1, item2)
        for i, item1 in enumerate(given_list)
        for item2 in given_list[:i]
        if item1*item2 == 20
    ]
I think your friend may be hinting towards using comprehensions when it makes the code cleaner (sometimes it doesn't).
I can think of using list-comprehension. This also helps to find multiple such-pairs if they exist in the given list.
num_list = [2,4,5,1,6,40,-1]
mult_num = [(num_list[i],num_list[j]) for i in range(len(num_list)) for j in range(i+1, len(num_list)) if num_list[i]*num_list[j] == 20]
print(mult_num)
Output:
[(4, 5)]
I came up with this. It reverses the approach a little bit, in that it searches in num_list for the required pair partner that the iteration value val would multiply to 20 with. This makes the code easier and needs no imports, even if it's not the most efficient way.
for val in num_list:
    if 20 / val in num_list:
        print(val, int(20/val))
You could make it more pythonic by using itertools.combinations, instead of nested loops, to find all pairs of numbers. Not always, but often iterating over indices as in for i in range(len(L)): is less pythonic than directly iterating over values as in for v in L:.
Python also allows you to make your function into a generator via the yield keyword so that instead of just returning the first pair that multiplies to 20, you get every pair that does by iterating over the function call.
import itertools
def factors(x, numbers):
    """ Generate all pairs in list of numbers that multiply to x.
    """
    for a, b in itertools.combinations(numbers, 2):
        if a * b == x:
            yield (a, b)
numbers = [2, 4, 5, 1, 6, 40, -1]
for pair in factors(20, numbers):
    print(pair)
I'm stumped on how to speed up my algorithm, which sums multiples in a given range. This is for a problem on codewars.com; here is a link to the problem:
codewars link
Here's the code; I'll explain what's going on at the bottom.
import itertools
def solution(number):
    return multiples(3, number) + multiples(5, number) - multiples(15, number)
def multiples(m, count):
    l = 0
    for i in itertools.count(m, m):
        if i < count:
            l += i
        else:
            break
    return l
print solution(50000000) #takes 41.8 seconds
#one of the testers takes 50000000000000000000000000000000000000000 as input
# def multiples(m, count):
#     l = 0
#     for i in xrange(m,count ,m):
#         l += i
#     return l
So basically the problem asks you to return the sum of all the multiples of 3 or 5 below a given number. Here are the tests:
test.assert_equals(solution(10), 23)
test.assert_equals(solution(20), 78)
test.assert_equals(solution(100), 2318)
test.assert_equals(solution(200), 9168)
test.assert_equals(solution(1000), 233168)
test.assert_equals(solution(10000), 23331668)
My program has no problem getting the right answer. The problem arises when the input is large. When I pass in a number like 50000000, it takes over 40 seconds to return the answer. One of the inputs I'm asked to take is 50000000000000000000000000000000000000000, which is a huge number. That's also the reason why I'm using itertools.count(): I tried using xrange in my first attempt, but xrange can't handle numbers larger than a C long. I know the slowest part of the program is the multiples function, yet it is still faster than my first attempt, which used a list comprehension and checked whether i % 3 == 0 or i % 5 == 0. Any ideas, guys?
This solution should be faster for large numbers.
def solution(number):
    number -= 1
    a, b, c = number // 3, number // 5, number // 15
    asum, bsum, csum = a*(a+1) // 2, b*(b+1) // 2, c*(c+1) // 2
    return 3*asum + 5*bsum - 15*csum
Explanation:
Take any sequence from 1 to n:
1, 2, 3, 4, ..., n
Its sum will always be given by the formula n(n+1)/2. This can be proven easily if you consider that the expression (1 + n) / 2 is just a shortcut for computing the average, or arithmetic mean, of this particular sequence of numbers. Because average(S) = sum(S) / length(S), if you take the average of any sequence of numbers and multiply it by the length of the sequence, you get the sum of the sequence.
If we're given a number n, and we want the sum of the multiples of some given k up to n, including n, we want to find the summation:
k + 2k + 3k + 4k + ... + xk
where xk is the highest multiple of k that is less than or equal to n. Now notice that this summation can be factored into:
k(1 + 2 + 3 + 4 + ... + x)
We are given k already, so now all we need to find is x. If x is defined to be the highest number you can multiply k by to get a natural number less than or equal to n, then we can get the number x by using Python's integer division:
n // k == x
Once we find x, we can find the sum of the multiples of any given k up to a given n using previous formulas:
k(x(x+1)/2)
Our three given k's are 3, 5, and 15.
We find our x's in this line:
a, b, c = number // 3, number // 5, number // 15
Compute the sums 1 + 2 + ... + x for each of those x's in this line:
asum, bsum, csum = a*(a+1) // 2, b*(b+1) // 2, c*(c+1) // 2
And finally, multiply their summations by k in this line:
return 3*asum + 5*bsum - 15*csum
And we have our answer!
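As a sanity check on the derivation, the closed form can be compared against a straightforward loop for small inputs. This is only a sketch; brute_force is a helper name made up here, not part of the question or the kata.

```python
def solution(number):
    # Closed form from the answer above: multiples of 3 or 5 strictly below number.
    number -= 1
    a, b, c = number // 3, number // 5, number // 15
    asum, bsum, csum = a * (a + 1) // 2, b * (b + 1) // 2, c * (c + 1) // 2
    return 3 * asum + 5 * bsum - 15 * csum


def brute_force(number):
    # Direct definition, only usable for small inputs.
    return sum(i for i in range(number) if i % 3 == 0 or i % 5 == 0)


# The two agree on every small input, including the kata's test values.
assert all(solution(n) == brute_force(n) for n in range(1, 2000))

# The closed form needs only a handful of big-integer operations,
# so the 41-digit input from the question returns immediately.
print(solution(50000000000000000000000000000000000000000))
```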
I am doing problem five in Project Euler: "2520 is the smallest number that can be divided by each of the numbers from 1 to 10 without any remainder.
What is the smallest positive number that is evenly divisible by all of the numbers from 1 to 20?"
I have constructed the following code which finds the correct value 2520 when using 1 - 10 as divisors but code seems to be going on forever when using 1 - 20.
Again I don't want the code just a pointer or two on where I am going wrong.
Thanks
def smallestDiv(n):
    end=False
    while end == False:
        divisors = [x for x in range(1,21)] # get divisors
        allDivisions = zip(n % i for i in divisors) # get values for n % all integers in divisors
        check = all(item[0] == 0 for item in allDivisions ) # check if all values of n % i are equal to zero
        if check: # if all values are equal to zero return n
            end = True
            return n
        else: # else increase n by 1
            n +=1
EDIT:
I used some code I found relating to LCM and used reduce to solve the problem:
def lcm(*values):
    values = [value for value in values]
    if values:
        n = max(values)
        m = n
        values.remove(n)
        while any( n % value for value in values ):
            n +=m
        return n
    return 0
print reduce(lcm, range(1,21))
If a problem is hard, try solving a simpler version. Here, that means working out how to calculate the lowest common multiple of two numbers. If you've read any number theory book (or think about prime factors), you can do that using the greatest common divisor function (as implemented by the Euclidean algorithm).
from math import gcd  # fractions.gcd was removed in Python 3.9
def lcm(a,b):
    "Calculate the lowest common multiple of two integers a and b"
    return a*b//gcd(a,b)
Observing lcm(a,b,c) ≡ lcm(lcm(a,b),c) it's simple to solve your problem with Python's reduce function
>>> from functools import reduce
>>> reduce(lcm, range(1,10+1))
2520
>>> reduce(lcm, range(1,20+1))
232792560
You are doing a brute-force search, so it can take arbitrarily long. You should read about the LCM (least common multiple) in order to code an efficient solution (the answer, I believe, is 232792560).
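#include <stdio.h>
/* Greatest common divisor via the Euclidean algorithm. */
long long gcd(long long m, long long n)
{
    long long t;
    while (n != 0)
    {
        t = n;
        n = m % n;
        m = t;
    }
    return m;
}
int main(void)
{
    int i, n;
    long long lcm = 1;
    printf("Enter the range:");
    scanf("%d", &n);
    for (i = 1; i <= n; i++)
    {
        lcm = (i * lcm) / gcd(i, lcm);
    }
    /* %lld is the matching format for long long */
    printf("smallest multiple : %lld\n", lcm);
    return 0;
}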
This will give you the prime factors, each with the highest power in which it occurs, of the numbers from 1 to 20:
from collections import Counter
def prime_factors(x):
    def factor_this(x, factor):
        factors = []
        while x % factor == 0:
            x //= factor  # integer division keeps x an int
            factors.append(factor)
        return x, factors
    x, factors = factor_this(x, 2)
    x, f = factor_this(x, 3)
    factors += f
    i = 5
    while i * i <= x:
        for j in (2, 4):
            x, f = factor_this(x, i)
            factors += f
            i += j
    if x > 1:
        factors.append(x)
    return factors
def factors_in_range(x):
    result = {}
    for i in range(2, x + 1):
        p = prime_factors(i)
        c = Counter(p)
        for k, v in c.items():
            n = result.get(k)
            if n is None or n < v:
                result[k] = v
    return result
print(factors_in_range(20))
If you multiply these numbers together, as many times as they occur in the result, you get the smallest number that divides all the numbers from 1 to 20.
import operator
from functools import reduce  # a builtin on Python 2, an import on Python 3
def product(c):
    return reduce(operator.__mul__, [k ** v for k, v in c.items()], 1)
c = factors_in_range(20)
print(product(c))
I think the answer by Colonel Panic is brilliant but I just wanted to expand on it a little bit without editing the concise answer.
The original solution is:
from math import gcd  # fractions.gcd was removed in Python 3.9
def lcm(a,b):
    "Calculate the lowest common multiple of two integers a and b"
    return a*b//gcd(a,b)
>>> from functools import reduce
>>> reduce(lcm, range(1,10+1))
2520
>>> reduce(lcm, range(1,20+1))
232792560
I find it helpful to visualize what the reduce is doing for N = 10:
res = lcm(lcm(lcm(lcm(lcm(lcm(lcm(lcm(lcm(1, 2), 3), 4), 5), 6), 7), 8), 9), 10)
Which evaluates to:
# Evaluates lcm(1, 2)
res = lcm(lcm(lcm(lcm(lcm(lcm(lcm(lcm(lcm(1, 2), 3), 4), 5), 6), 7), 8), 9), 10)
# Evaluates lcm(2, 3)
res = lcm(lcm(lcm(lcm(lcm(lcm(lcm(lcm(2, 3), 4), 5), 6), 7), 8), 9), 10)
# Evaluates lcm(6, 4)
res = lcm(lcm(lcm(lcm(lcm(lcm(lcm(6, 4), 5), 6), 7), 8), 9), 10)
# Evaluates lcm(12, 5)
res = lcm(lcm(lcm(lcm(lcm(lcm(12, 5), 6), 7), 8), 9), 10)
# Evaluates lcm(60, 6)
res = lcm(lcm(lcm(lcm(lcm(60, 6), 7), 8), 9), 10)
# Evaluates lcm(60, 7)
res = lcm(lcm(lcm(lcm(60, 7), 8), 9), 10)
# Evaluates lcm(420, 8)
res = lcm(lcm(lcm(420, 8), 9), 10)
# Evaluates lcm(840, 9)
res = lcm(lcm(840, 9), 10)
# Evaluates lcm(2520, 10)
res = lcm(2520, 10)
print(res)
>>> 2520
The above gets across the intuition of what is happening. When we use reduce we "apply a rolling computation to sequential pairs of values in a list." It does this from the "inside-out" or from the left to the right in range(1, 20+1).
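The intermediate values in the trace above can also be printed directly. A small sketch, assuming Python 3.5+ so that math.gcd is available: itertools.accumulate applies the same rolling computation as reduce but keeps every partial result.

```python
from itertools import accumulate
from math import gcd


def lcm(a, b):
    "Lowest common multiple of two positive integers."
    return a * b // gcd(a, b)


# accumulate yields each partial result that reduce would compute and discard.
running = list(accumulate(range(1, 10 + 1), lcm))
print(running)
# [1, 2, 6, 12, 60, 60, 420, 840, 2520, 2520]
```

The last element is what reduce(lcm, range(1, 10+1)) returns.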
I think it is really important here to point out that you, as a programmer, are NOT expected to intuit this answer as being obvious or readily apparent. It has taken a lot of smart people a long time to learn a great deal about prime numbers, greatest common factors, and least common multiples, etc. However, as a software engineer you ARE expected to know the basics about number theory, gcd, lcm, prime numbers, and how to solve problems with these in your toolkit. Again, you are not expected to re-invent the wheel or re-discover things from number theory each time you solve a problem, but as you go about your business you should be adding tools to your problem solving toolkit.
import sys
def smallestDiv(n):
    divisors = [x for x in range(1,(n+1))] # get divisors
    for i in xrange(2520,sys.maxint,n):
        if(all(i%x == 0 for x in divisors)):
            return i
print (smallestDiv(20))
Takes approximately 5 seconds on my 1.7 GHZ i7
I based it on the C# code here:
http://www.mathblog.dk/project-euler-problem-5/
facList=[2]
prod=1
for i in range(3,1000):
    n=i
    for j in facList:
        if n % j == 0:
            n//=j
    facList.append(n)
for k in facList:
    prod*=k
print(prod)
I tried this method and compared my time to Colonel Panic's answer and mine started significantly beating his at about n=200 instead of n=20. His is much more elegant in my opinion, but for some reason mine is faster. Maybe someone with better understanding of algorithm runtime can explain why.
The last function finds the smallest number divisible by every integer from 1 to n. Since that number is a multiple of n and can be no larger than factorial(n), you need a function that calculates the factorial (math.factorial also does this).
def factoral(n):
    if n > 1:
        return n * factoral(n - 1)
    elif n >= 0:
        return 1
    else:
        return -1
def isMultiple(a, b):
    for i in range(1, b):
        if a % i != 0:
            return False
    return True
def EnkucukBul(n):
    for i in range(n, factoral(n) + 1, n):
        if isMultiple(i, n):
            return i
    return -1
If you can use the math module (Python 3.9+), you can use math.lcm:
import math
def smallestMul():
    return math.lcm(*range(1, 21))
I'm totally stuck and have no idea how to go about solving this. Let's say I've an array
arr = [1, 4, 5, 10]
and a number
n = 8
I need the shortest sequence of elements from arr (repeats allowed) that sums to n. For example, the following sequences from arr sum to n:
c1 = 5,1,1,1
c2 = 4,4
c3= 1,1,1,1,1,1,1,1
So in the above case, our answer is c2 because it's the shortest sequence from arr that adds up to n.
I'm not sure what's the simplest way of finding a solution to the above. Any ideas or help would be really appreciated.
Thanks!
Edited:
Fixed the array
The array will only have positive values.
I'm not sure how the subset-sum problem solves this, probably due to my own ignorance. Does the subset-sum algorithm always give the shortest sequence that equals the sum? For example, will it identify c2 as the answer in the above scenario?
As has been pointed out before, this is the minimum coin change problem, typically solved with dynamic programming. Here's a Python implementation with time complexity O(nC) and space complexity O(C), where n is the number of coins and C the required amount of money:
def min_change(V, C):
    table, solution = min_change_table(V, C)
    num_coins, coins = table[-1], []
    if num_coins == float('inf'):
        return []
    while C > 0:
        coins.append(V[solution[C]])
        C -= V[solution[C]]
    return coins
def min_change_table(V, C):
    m, n = C+1, len(V)
    table, solution = [0] * m, [0] * m
    for i in range(1, m):
        minNum, minIdx = float('inf'), -1
        for j in range(n):
            if V[j] <= i and 1 + table[i - V[j]] < minNum:
                minNum = 1 + table[i - V[j]]
                minIdx = j
        table[i] = minNum
        solution[i] = minIdx
    return (table, solution)
In the above functions V is the list of possible coins and C the required amount of money. Now when you call the min_change function the output is as expected:
min_change([1,4,5,10], 8)
> [4, 4]
For the benefit of people who find this question in future -
As Oscar Lopez and Priyank Bhatnagar, have pointed out, this is the coin change (change-giving, change-making) problem.
In general, the dynamic programming solution they have proposed is the optimal solution - both in terms of (provably!) always producing the required sum using the fewest items, and in terms of execution speed. If your basis numbers are arbitrary, then use the dynamic programming solution.
If your basis numbers are "nice", however, a simpler greedy algorithm will do.
For example, the Australian currency system uses denominations of $100, $50, $20, $10, $5, $2, $1, $0.50, $0.20, $0.10, $0.05. Optimal change can be given for any amount by repeatedly giving the largest unit of change possible until the remaining amount is zero (or less than five cents.)
Here's an instructive implementation of the greedy algorithm, illustrating the concept.
def greedy_give_change (denominations, amount):
    # Sort from largest to smallest
    denominations = sorted(denominations, reverse=True)
    # number of each note/coin given
    change_given = list()
    for d in denominations:
        # >= so that an amount exactly equal to a denomination is still paid out
        while amount >= d:
            change_given.append(d)
            amount -= d
    return change_given
australian_coins = [100, 50, 20, 10, 5, 2, 1, 0.50, 0.20, 0.10, 0.05]
change = greedy_give_change(australian_coins, 313.37)
print (change) # [100, 100, 100, 10, 2, 1, 0.2, 0.1, 0.05]
print (sum(change)) # 313.35
For the specific example in the original post (denominations = [1, 4, 5, 10] and amount = 8) the greedy solution is not optimal - it will give [5, 1, 1, 1]. But the greedy solution is much faster and simpler than the dynamic programming solution, so if you can use it, you should!
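A small sketch of that comparison on the original post's numbers. It assumes positive integer denominations, and the exhaustive baseline shortest_by_search is a helper made up for this illustration (it is not the dynamic programming solution and only suits tiny inputs).

```python
from itertools import combinations_with_replacement


def greedy_change(denominations, amount):
    # Always take the largest denomination that still fits.
    coins = []
    for d in sorted(denominations, reverse=True):
        while amount >= d:
            coins.append(d)
            amount -= d
    return coins if amount == 0 else None


def shortest_by_search(denominations, amount):
    # Exhaustive baseline: try every multiset of 1 coin, then 2 coins, and so on.
    for k in range(1, amount // min(denominations) + 1):
        for combo in combinations_with_replacement(denominations, k):
            if sum(combo) == amount:
                return list(combo)
    return None


print(greedy_change([1, 4, 5, 10], 8))       # [5, 1, 1, 1]
print(shortest_by_search([1, 4, 5, 10], 8))  # [4, 4]
```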
This problem is known as the minimum coin change problem.
You can solve it by using dynamic programming.
Here is the pseudo code :
Set MinCoin[i] equal to Infinity for all of i
MinCoin[0] = 0
For i = 1 to N // The number N
    For j = 0 to M - 1 // M denominations given
        // Number i is broken into i-Value[j] for which we already know the answer
        // And we update if it gives us lesser value than previous known.
        If (Value[j] <= i and MinCoin[i-Value[j]]+1 < MinCoin[i])
            MinCoin[i] = MinCoin[i-Value[j]]+1
Output MinCoin[N]
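The pseudocode translates almost line for line into Python. A minimal sketch, where min_coin is a name chosen here and math.inf stands in for Infinity:

```python
import math


def min_coin(values, n):
    # min_coins[i] = fewest coins summing to i, or math.inf if i cannot be formed.
    min_coins = [0] + [math.inf] * n
    for i in range(1, n + 1):
        for value in values:
            # i is broken into i - value, for which the answer is already known.
            if value <= i and min_coins[i - value] + 1 < min_coins[i]:
                min_coins[i] = min_coins[i - value] + 1
    return min_coins[n]


print(min_coin([1, 4, 5, 10], 8))  # 2
```

An unreachable amount comes back as math.inf, since inf + 1 is never smaller than inf.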
This is a variant of the subset-sum problem. In your problem, you can pick an item several times. You can still use a similar idea to solve this problem with dynamic programming. The basic idea is to design a function F(k, j), such that F(k, j) = 1 means that there is a sequence from arr whose sum is j and length is k.
Formally, the base case is that F(1, j) = 1 if there exists an i such that arr[i] = j. For the inductive case, F(k, j) = 1 if there exists an i such that arr[i] = m and F(k-1, j-m) = 1.
The smallest k with F(k, n) = 1 is the length of the shortest sequence you want.
By using the dynamic programming technique, you can compute function F without using recursion.
By tracking additional information for every F(k, j), you also can reconstruct the shortest sequence.
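One way to read F(k, j) in code is to build the reachable sums one length at a time. This is a sketch under the question's assumption that all values are positive; shortest_sequence is a hypothetical name, and sums already reached with a smaller k are skipped, since only the smallest k matters.

```python
def shortest_sequence(arr, n):
    # layer maps each sum j with F(k, j) = 1 to one length-k sequence reaching it.
    layer = {v: [v] for v in arr if v <= n}  # base case, k = 1
    seen = set(layer)
    while layer:
        if n in layer:
            return layer[n]
        next_layer = {}
        for j, seq in layer.items():
            for m in arr:
                total = j + m
                # Keep only sums not already reachable with fewer items.
                if total <= n and total not in seen and total not in next_layer:
                    next_layer[total] = seq + [m]
        seen.update(next_layer)
        layer = next_layer  # inductive step: k becomes k + 1
    return None  # n cannot be formed at all


print(shortest_sequence([1, 4, 5, 10], 8))  # [4, 4]
```

Storing one sequence per sum is the "additional information" that lets the answer be reconstructed.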
What you're trying to solve is a variant of the coin change problem. Here you're looking for smallest amount of change, or the minimum amount of coins that sum up to a given amount.
Consider a simple case where your array is
c = [1, 2, 3]
Suppose you write 5 as a combination of elements from c and want to know the shortest such combination. Here c is the set of coin values and 5 is the amount for which you want to get change.
Let's write down all possible combinations:
1 + 1 + 1 + 1 + 1
1 + 1 + 1 + 2
1 + 2 + 2
1 + 1 + 3
2 + 3
Note that two combinations are the same up to re-ordering, so for instance 2 + 3 = 3 + 2.
Here there is an awesome result that's not obvious at first sight but it's very easy to prove. If you have any sequence of coins/values that is a sequence of minimum length that sums up to a given amount, no matter how you split this sequence the two parts will also be sequences of minimum length for the respective amounts.
For instance if c[3] + c[1] + c[2] + c[7] + c[2] + c[3] add up to S and we know that 6 is the minimal length of any sequence of elements from c that add up to S then if you split
|
S = c[3] + c[1] + c[2] + c[7] | + c[2] + c[3]
|
you have that 4 is the minimal length for sequences that add up to c[3] + c[1] + c[2] + c[7] and 2 the minimal length for sequences that add up to c[2] + c[3].
|
S = c[3] + c[1] + c[2] + c[7] | + c[2] + c[3]
|
= S_left + S_right
How to prove this? By contradiction, assume that the length of S_left is not optimal, that is there's a shorter sequence that adds up to S_left. But then we could write S as a sum of this shorter sequence and S_right, thus contradicting the fact that the length of S is minimal. □
Since this is true no matter how you split the sequence, you can use this result to build a recursive algorithm that follows the principles of dynamic programming paradigm (solving smaller problems while possibly skipping computations that won't be used, memoization or keeping track of computed values, and finally combining the results).
Because of this property of maintaining optimality for subproblems, the coins problem is also said to "exhibit optimal substructure".
OK, so in the small example above this is how we would go about solving the problem with a dynamic programming approach: assume we want to find the shortest sequence of elements from c = [1, 2, 3] for writing the sum 5. We solve the subproblems obtained by subtracting one coin: 5 - 1, 5 - 2, and 5 - 3, we take the smallest solution of these subproblems and add 1 (the missing coin).
So we can write something like
shortest_seq_length([1, 2, 3], 5) =
    min( shortest_seq_length([1, 2, 3], 5-1),
         shortest_seq_length([1, 2, 3], 5-2),
         shortest_seq_length([1, 2, 3], 5-3)
       ) + 1
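Read literally, that recurrence is a top-down recursion. A minimal memoized sketch follows; the function name and the use of functools.lru_cache are choices made for this illustration, and the recursion depth grows with the amount, so it only suits modest sums.

```python
from functools import lru_cache


def shortest_seq_length_recursive(c, S):
    coins = tuple(c)

    @lru_cache(maxsize=None)
    def solve(amount):
        # Fewest coins summing to amount, or None if amount cannot be formed.
        if amount == 0:
            return 0
        lengths = [solve(amount - x) for x in coins if x <= amount]
        lengths = [length for length in lengths if length is not None]
        # Smallest solution among the subproblems, plus the coin that was removed.
        return min(lengths) + 1 if lengths else None

    return solve(S)


print(shortest_seq_length_recursive([1, 2, 3], 5))  # 2
```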
It is convenient to write the algorithm bottom-up, starting from smaller values of the sums that can be saved and used to form bigger sums. We just solve the problem for all possible values starting from 1 and going up to the desired sum.
Here's the code in Python:
def shortest_seq_length(c, S):
    res = {0: 0} # res contains computed results res[i] = shortest_seq_length(c, i)
    for i in range(1, S+1):
        res[i] = min([res[i-x] for x in c if x<=i]) + 1
    return res[S]
Now this works except for the cases when we cannot fill the memoization structure for all values of i. This is the case when we don't have the value 1 in c, so for instance we cannot form the sum 1 if c = [2, 3], and with the above function we get
shortest_seq_length([2, 3], 5)
# ValueError: min() arg is an empty sequence
So to take care of this issue one could for instance use a try/except:
def shortest_seq_length(c, S):
    res = {0: 0} # res contains results for each sum res[i] = shortest_seq_length(c, i)
    for i in range(1, S+1):
        try:
            res[i] = min([res[i-x] for x in c if x<=i and res[i-x] is not None]) +1
        except ValueError:
            res[i] = None # takes care of error when [res[i-x] for x in c if x<=i] is empty
    return res[S]
Or without try/except:
def shortest_seq_length(c, S):
    res = {0: 0} # res[i] = shortest_seq_length(c, i)
    for i in range(1, S+1):
        prev = [res[i-x] for x in c if x<=i and res[i-x] is not None]
        if len(prev)>0:
            res[i] = min(prev) +1
        else:
            res[i] = None # takes care of error when [res[i-x] for x in c if x<=i] is empty
    return res[S]
Try it out:
print(shortest_seq_length([2, 3], 5))
# 2
print(shortest_seq_length([1, 5, 10, 25], 37))
# 4
print(shortest_seq_length([1, 5, 10], 30))
# 3
print(shortest_seq_length([1, 5, 10], 25))
# 3
print(shortest_seq_length([1, 5, 10], 29))
# 7
print(shortest_seq_length([5, 10], 9))
# None
To show not only the length but also the combinations of coins of minimal length:
from collections import defaultdict
def shortest_seq_length(coins, sum):
    combos = defaultdict(list)
    combos[0] = [[]]
    for i in range(1, sum+1):
        for x in coins:
            if x<=i and combos[i-x] is not None:
                for p in combos[i-x]:
                    comb = sorted(p + [x])
                    if comb not in combos[i]:
                        combos[i].append(comb)
        if len(combos[i])>0:
            m = (min(map(len,combos[i])))
            # keep only the combinations of minimal length
            combos[i] = [combo for combo in combos[i] if len(combo) == m]
        else:
            combos[i] = None
    return combos[sum]
total = 9
coin_sizes = [10, 8, 5, 4, 1]
shortest_seq_length(coin_sizes, total)
# [[1, 8], [4, 5]]
To show all sequences, remove the minimum computation:
from collections import defaultdict
def all_seq_length(coins, sum):
    combos = defaultdict(list)
    combos[0] = [[]]
    for i in range(1, sum+1):
        for x in coins:
            if x<=i and combos[i-x] is not None:
                for p in combos[i-x]:
                    comb = sorted(p + [x])
                    if comb not in combos[i]:
                        combos[i].append(comb)
        if len(combos[i])==0:
            combos[i] = None
    return combos[sum]
total = 9
coin_sizes = [10, 5, 4, 8, 1]
all_seq_length(coin_sizes, total)
# [[4, 5],
# [1, 1, 1, 1, 5],
# [1, 4, 4],
# [1, 1, 1, 1, 1, 4],
# [1, 8],
# [1, 1, 1, 1, 1, 1, 1, 1, 1]]
One small improvement to the algorithm is to skip the step of computing the minimum when the sum is equal to one of the values/coins, but this can be done better if we write a loop to compute the minimum. This however doesn't improve the overall complexity that's O(mS) where m = len(c).
Recently I became interested in the subset-sum problem, which is finding a zero-sum subset in a superset. I found some solutions on SO; in addition, I came across a particular solution which uses the dynamic programming approach. I translated that solution into Python based on its qualitative descriptions. I'm trying to optimize this for larger lists, which eat up a lot of my memory. Can someone recommend optimizations or other techniques to solve this particular problem? Here's my attempt in Python:
import random
from time import time
from itertools import product
time0 = time()
# create a zero matrix of size a (row), b(col)
def create_zero_matrix(a,b):
    return [[0]*b for x in xrange(a)]
# generate a list of size num with random integers with an upper and lower bound
def random_ints(num, lower=-1000, upper=1000):
    return [random.randrange(lower,upper+1) for i in range(num)]
# split a list up into N and P where N be the sum of the negative values and P the sum of the positive values.
# 0 does not count because of additive identity
def split_sum(A):
    N_list = []
    P_list = []
    for x in A:
        if x < 0:
            N_list.append(x)
        elif x > 0:
            P_list.append(x)
    return [sum(N_list), sum(P_list)]
# since the column indexes are in the range from 0 to P - N
# we would like to retrieve them based on the index in the range N to P
# n := row, m := col
def get_element(table, n, m, N):
    if n < 0:
        return 0
    try:
        return table[n][m - N]
    except:
        return 0
# same definition as above
def set_element(table, n, m, N, value):
    table[n][m - N] = value
# input array
#A = [1, -3, 2, 4]
A = random_ints(200)
[N, P] = split_sum(A)
# create a zero matrix of size m (row) by n (col)
#
# m := the number of elements in A
# n := P - N + 1 (by definition N <= s <= P)
#
# each element in the matrix will be a value of either 0 (false) or 1 (true)
m = len(A)
n = P - N + 1;
table = create_zero_matrix(m, n)
# set first element in index (0, A[0]) to be true
# Definition: Q(1,s) := (x1 == s). Note that index starts at 0 instead of 1.
set_element(table, 0, A[0], N, 1)
# iterate through each table element
#for i in xrange(1, m): #row
#    for s in xrange(N, P + 1): #col
for i, s in product(xrange(1, m), xrange(N, P + 1)):
    if get_element(table, i - 1, s, N) or A[i] == s or get_element(table, i - 1, s - A[i], N):
        #set_element(table, i, s, N, 1)
        table[i][s - N] = 1
# find zero-sum subset solution
s = 0
solution = []
for i in reversed(xrange(0, m)):
    if get_element(table, i - 1, s, N) == 0 and get_element(table, i, s, N) == 1:
        s = s - A[i]
        solution.append(A[i])
print "Solution: ",solution
time1 = time()
print "Time execution: ", time1 - time0
I'm not quite sure if your solution is exact or a PTA (poly-time approximation).
But, as someone pointed out, this problem is indeed NP-Complete.
Meaning, every known (exact) algorithm has an exponential time behavior on the size of the input.
Meaning, if you can process one operation in 0.1 nanoseconds (10,000,000,000 operations per second), then for a list of 59 elements it'll take:
2^59 ops --> 2^59 / 10,000,000,000 seconds (about 2^26 seconds) --> 2^26 / (3600 x 24 x 365) years --> about 2 years
You can find heuristics, which give you just a CHANCE of finding an exact solution in polynomial time.
On the other side, if you restrict the problem (to another) using bounds for the values of the numbers in the set, then the problem complexity reduces to polynomial time. But even then the memory space consumed will be a polynomial of VERY High Order.
The memory consumed will be much larger than the few gigabytes you have in memory.
And even much larger than the few tera-bytes on your hard drive.
( That's for small values of the bound for the value of the elements in the set )
Maybe this is the case with your dynamic programming algorithm.
It seemed to me that you were using a bound of 1000 when building your initialization matrix.
You can try a smaller bound. That is, if your input consistently consists of small values.
Good Luck!
Someone on Hacker News came up with the following solution to the problem, which I quite liked. It just happens to be in python :):
def subset_summing_to_zero (activities):
    subsets = {0: []}
    for (activity, cost) in activities.items():  # was iteritems() on Python 2
        old_subsets = subsets
        subsets = {}
        for (prev_sum, subset) in old_subsets.items():
            subsets[prev_sum] = subset
            new_sum = prev_sum + cost
            new_subset = subset + [activity]
            if 0 == new_sum:
                new_subset.sort()
                return new_subset
            else:
                subsets[new_sum] = new_subset
    return []
I spent a few minutes with it and it worked very well.
An interesting article on optimizing python code is available here. Basically the main result is that you should inline your frequent loops, so in your case this would mean instead of calling get_element twice per loop, put the actual code of that function inside the loop in order to avoid the function call overhead.
Hope that helps! Cheers
The first thing that caught my eye:
def split_sum(A):
    N_list = 0
    P_list = 0
    for x in A:
        if x < 0:
            N_list+=x
        elif x > 0:
            P_list+=x
    return [N_list, P_list]
Some advice:
Try to use a 1D list, and use bitarray (http://pypi.python.org/pypi/bitarray) to keep the memory footprint to a minimum, so you will just change the get / set functions. This should reduce your memory footprint by a factor of at least 64 (an integer in a list is a pointer to an integer object with a type, so the factor can be as much as 3*32).
Avoid using try/except; figure out the proper ranges at the beginning instead. You might find that you gain a huge speedup.
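To illustrate the one-bit-per-sum idea without a third-party package, here is a sketch that uses a plain Python int as the bit set instead of bitarray (that substitution is this example's, not the answer's). has_zero_sum_subset is a made-up name, and it only reports whether a non-empty zero-sum subset exists; it does not reconstruct it.

```python
def has_zero_sum_subset(values):
    # Bit (offset + s) of `sums` is set when some non-empty subset adds up to s.
    # offset is minus the sum of the negative values, so every index is >= 0.
    offset = -sum(v for v in values if v < 0)
    sums = 0
    for v in values:
        # Adding v to every known subset moves each bit by v positions.
        shifted = sums << v if v >= 0 else sums >> -v
        # Keep the old subsets, the extended ones, and the subset {v} itself.
        sums |= shifted | (1 << (offset + v))
    return bool((sums >> offset) & 1)


print(has_zero_sum_subset([1, -3, 2, 4]))  # True, because 1 + 2 - 3 == 0
print(has_zero_sum_subset([1, 2, 3]))      # False
```

A single integer of P - N + 1 bits replaces the whole m-by-n table, because each row only depends on the previous one.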
The following code works for Python 3.3+. I have used the itertools module, which has some great methods to use.
from itertools import chain, combinations
def powerset(iterable):
    s = list(iterable)
    return chain.from_iterable(combinations(s, r) for r in range(len(s)+1))
nums = input("Enter the Elements").strip().split()
inputSum = int(input("Enter the Sum You want"))
for i, combo in enumerate(powerset(nums), 1):
    sum = 0
    for num in combo:
        sum += int(num)
    if sum == inputSum:
        print(combo)
The Input Output is as Follows:
Enter the Elements 1 2 3 4
Enter the Sum You want 5
('1', '4')
('2', '3')
Just change the values in your set w, and correspondingly make an array x as big as the length of w. Then pass the sum you want subsets for as the last argument of the subsetsum function and you will be done (if you want to check by giving your own values).
def subsetsum(cs,k,r,x,w,d):
    x[k]=1
    if(cs+w[k]==d):
        for i in range(0,k+1):
            if x[i]==1:
                print (w[i],end=" ")
        print()
    elif cs+w[k]+w[k+1]<=d :
        subsetsum(cs+w[k],k+1,r-w[k],x,w,d)
    if((cs +r-w[k]>=d) and (cs+w[k]<=d)) :
        x[k]=0
        subsetsum(cs,k+1,r-w[k],x,w,d)
#driver for the above code
w=[2,3,4,5,0]
x=[0,0,0,0,0]
subsetsum(0,0,sum(w),x,w,7)