Recently I read a problem to practice DP. I wasn't able to come up with a DP solution, so I tried a recursive one, which I later modified to use memoization. The problem statement is as follows:
Making Change. You are given n types of coin denominations of values
v(1) < v(2) < ... < v(n) (all integers). Assume v(1) = 1, so you can
always make change for any amount of money C. Give an algorithm which
makes change for an amount of money C with as few coins as possible.
[on problem set 4]
I got the question from here
My solution was as follows :-
def memoized_make_change(L, index, cost, d):
    # base case: only the 1-valued coin is left, so we need `cost` coins
    if index == 0:
        return cost
    if (index, cost) in d:
        return d[(index, cost)]
    count = cost // L[index]  # integer division: how many of this coin fit
    val1 = memoized_make_change(L, index-1, cost%L[index], d) + count
    val2 = memoized_make_change(L, index-1, cost, d)
    x = min(val1, val2)
    d[(index, cost)] = x
    return x
This is how I've understood my solution to the problem. Assume that the denominations are stored in L in ascending order. As I iterate from the end to the beginning, I have a choice to either choose a denomination or not choose it. If I choose it, I then recurse to satisfy the remaining amount with lower denominations. If I do not choose it, I recurse to satisfy the current amount with lower denominations.
Either way, at a given function call, I find the best (lowest) coin count to satisfy a given amount.
Could I have some help in bridging the thought process from here onward to reach a DP solution? I'm not doing this as any HW, this is just for fun and practice. I don't really need any code either, just some help in explaining the thought process would be perfect.
[EDIT]
I recall reading that function calls are expensive, which is why a bottom-up (iteration-based) approach might be preferred. Is that possible for this problem?
Here is a general approach for converting memoized recursive solutions to "traditional" bottom-up DP ones, in cases where this is possible.
First, let's express our general "memoized recursive solution". Here, x represents all the parameters that change on each recursive call. We want this to be a tuple of positive integers - in your case, (index, cost). I omit anything that's constant across the recursion (in your case, L), and I suppose that I have a global cache. (But FWIW, in Python you should just use the lru_cache decorator from the standard library functools module rather than managing the cache yourself.)
To solve for(x):
    If x in cache: return cache[x]
    Handle base cases, i.e. where one or more components of x is zero
    Otherwise:
        Make one or more recursive calls
        Combine those results into `result`
        cache[x] = result
        return result
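As a concrete instance of the pattern above, here is a minimal sketch that applies lru_cache to the recurrence from the question. The names make_change and solve are made up for illustration, and the coin list is assumed to be sorted ascending with a first value of 1. The constant list is kept outside the cached function so that only (index, remaining) forms the cache key.

```python
from functools import lru_cache


def make_change(L, cost):
    """Memoized form of the question's recurrence (assumes L[0] == 1)."""
    coins = tuple(L)  # constant across the recursion, so not part of the key

    @lru_cache(maxsize=None)
    def solve(index, remaining):
        if index == 0:
            return remaining  # base case: pay the rest with 1-valued coins
        count = remaining // coins[index]
        take = solve(index - 1, remaining % coins[index]) + count
        skip = solve(index - 1, remaining)
        return min(take, skip)

    return solve(len(coins) - 1, cost)


print(make_change([1, 5, 10, 25], 63))  # 6
```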
The basic idea in dynamic programming is simply to evaluate the base cases first and work upward:
To solve for(x):
    For y starting at (0, 0, ...) and increasing towards x:
        Do all the stuff from above
However, two neat things happen when we arrange the code this way:
As long as the order of y values is chosen properly (this is trivial when there's only one vector component, of course), we can arrange that the results for the recursive call are always in cache (i.e. we already calculated them earlier, because y had that value on a previous iteration of the loop). So instead of actually making the recursive call, we replace it directly with a cache lookup.
Since every component of y will use consecutively increasing values, and will be placed in the cache in order, we can use a multidimensional array (nested lists, or else a Numpy array) to store the values instead of a dictionary.
So we get something like:
To solve for(x):
    cache = multidimensional array sized according to x
    for i in range(first component of x):
        for j in ...:
            (as many loops as needed; better yet use `itertools.product`)
            If this is a base case, write the appropriate value to cache
            Otherwise, compute "recursive" index values to use, look up
            the values, perform the computation and store the result
    return the appropriate ("last") value from cache
I suggest considering the relationship between the value you are constructing and the values you need for it.
In this case you are constructing a value for index, cost based on:
index-1 and cost
index-1 and cost%L[index]
What you are searching for is a way of iterating over the choices such that you will always have precalculated everything you need.
In this case you can simply change the code to the iterative approach:
for each choice of index 0 upwards:
    for each choice of cost:
        compute value corresponding to index,cost
In practice, I find that the iterative approach can be significantly faster (perhaps 4x) for simple problems, as it avoids the overhead of function calls and of checking the cache for preexisting values.
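To make the loop structure concrete, here is a rough sketch of how it could look for the recurrence in the question. The function name make_change_table is hypothetical, and L is assumed to be sorted ascending with L[0] == 1. It reproduces the question's recurrence as written (each denomination is either used as many times as fits, or not at all), so it returns whatever the memoized version returns; whether that recurrence is optimal for every coin set is a separate matter.

```python
def make_change_table(L, cost):
    """Bottom-up form of the question's recurrence (assumes L[0] == 1).

    table[index][c] holds the value the memoized function would
    return for the arguments (index, c).
    """
    n = len(L)
    table = [[0] * (cost + 1) for _ in range(n)]
    for c in range(cost + 1):
        table[0][c] = c  # base case: only 1-valued coins are available
    for index in range(1, n):
        for c in range(cost + 1):
            count = c // L[index]
            # both lookups hit row index-1, which is already filled in
            take = table[index - 1][c % L[index]] + count
            skip = table[index - 1][c]
            table[index][c] = min(take, skip)
    return table[n - 1][cost]


print(make_change_table([1, 5, 10, 25], 63))  # 6
```

Since each row only reads from the row before it, two rows of length cost + 1 would be enough if memory matters.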
If I use the code
from collections import deque
q = deque(maxlen=2)
while step <= step_max:
    calculate(item)
    q.append(item)
    another_calculation(q)
how does it compare in efficiency and readability to
q = []
while step <= step_max:
    calculate(item)
    q.append(item)
    q = q[-2:]
    another_calculation(q)
calculate() and another_calculation() are not real in this case but in my actual program are simply two calculations. I'm doing these calculations every step for millions of steps (I'm simulating an ion in 2-d space). Because there are so many steps, q gets very long and uses a lot of memory, while another_calculation() only uses the last two values of q. I had been using the latter method, then heard deque mentioned and thought it might be more efficient; thus the question.
I.e., how do deques in python compare to just normal list slicing?
q = q[-2:]
This is a costly operation because it creates a new list every time (and copies the references). (A nasty side effect is that it rebinds q to a new list object, although you can use q[:] = q[-2:] to avoid that.)
The deque object just changes the start of the list pointer and "forgets" the oldest item. So it's faster and it's one of the usages it's been designed for.
Of course, for 2 values, there isn't much difference, but for a bigger number there is.
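As a small illustration of that behaviour, here is a sketch of a bounded deque; the integers simply stand in for the simulation's items.

```python
from collections import deque

q = deque(maxlen=2)
for item in range(5):
    q.append(item)  # once full, appending drops the oldest item automatically

print(list(q))  # [3, 4]
```

The deque object stays the same throughout, so any other reference to q keeps seeing the latest two items.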
If I interpret your question correctly, you have a function, that calculates a value, and you want to do another calculation with this and the previous value. The best way is to use two variables:
previous_item = calculate()  # seed with the first value before the loop
while step <= step_max:
    item = calculate()
    another_calculation(previous_item, item)
    previous_item = item
If the calculations are some form of vector math, you should consider using numpy.
I have a complex algorithm to build in order to select the best combination of elements in my list.
I have a list of 20 elements. I generate all the combinations of this list using the algorithm below; the result is a list of 2^20-1 elements (without duplicates).
from itertools import combinations
def get_all_combinations(input_list):
    for i in range(len(input_list)):
        for item in combinations(input_list, r = i + 1):
            yield list(item)
input_list = [1,4,6,8,11,13,5,98,45,10,21,34,46,85,311,133,35,938,345,310]
# a generator has no len(), so count the items as they are produced
print(sum(1 for _ in get_all_combinations(input_list))) # 1048575
I have another algorithm that is applied to every combination, and then I calculate the max.
# this is just an example
from math import sqrt
def calcul_factor(item):
    return max(item) * min(item) / sqrt(min(item))
I tried to do it this way, but it takes a long time:
factorsList = []
l = []
columnsList = get_all_combinations(input_list)
for x in columnsList:
    i = calcul_factor(x)
    factorsList.append(i)
    l.append(x)
print("max", max(factorsList))
print("Best combinations:", l[factorsList.index(max(factorsList))])
Would using map/lambda expressions help to parallelise the calculation of the maximum?
Any hints on how to do that?
In case you can't find a better algorithm (which might be needed here) you can avoid creating those big lists by using generators.
With the help of itertools.chain you can combine the itertools.combinations-generators. Furthermore the max-function can take a function as a key.
Your code can be reduced to:
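from itertools import chain, combinations
# len(input_list) + 1 so that the full-length combination is included
all_combinations = chain.from_iterable(combinations(input_list, i) for i in range(1, len(input_list) + 1))
max(all_combinations, key=calcul_factor)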
Since this code relies solely on generators it might be faster (doesn't mean fast enough).
Edit: I generally agree with Hugh Bothwell, that you should be trying to find a better algorithm before going with an implementation like this. Especially if your lists are going to contain more than 20 elements.
If you can easily calculate calcul_factor(item + [k]) given calcul_factor(item), you might greatly benefit from a dynamic-programming approach.
If you can eliminate some bad solutions early, it will also greatly reduce the total number of combinations to consider (branch-and-bound).
If the calculation is reasonably well-behaved, you might even be able to use e.g. the simplex method or a linear solver and walk directly to a solution (something like O(n**2 log n) runtime instead of O(2**n)).
Could you show us the actual calcul_factor code and an actual input_list?
I am using Python to solve Project Euler problems. Many require caching the results of past calculations to improve performance, leading to code like this:
pastResults = [None] * 1000000
def someCalculation(integerArgument):
    # return result of a calculation performed on integerArgument
    # for example, summing the factorial or square of its digits
    pass
for eachNumber in range(1, 1000001):
    if pastResults[eachNumber - 1] is None:
        pastResults[eachNumber - 1] = someCalculation(eachNumber)
    # perform additional actions with pastResults[eachNumber - 1]
Would the repeated decrementing have an adverse impact on program performance? Would having an empty or dummy zeroth element (so the zero-based array emulates a one-based array) improve performance by eliminating the repeated decrementing?
pastResults = [None] * 1000001
def someCalculation(integerArgument):
    # return result of a calculation performed on integerArgument
    # for example, summing the factorial or square of its digits
    pass
for eachNumber in range(1, 1000001):
    if pastResults[eachNumber] is None:
        pastResults[eachNumber] = someCalculation(eachNumber)
    # perform additional actions with pastResults[eachNumber]
I also feel that emulating a one-based array would make the code easier to follow. That is why I do not make the range zero-based with for eachNumber in range(1000000) as someCalculation(eachNumber + 1) would not be logical.
How significant is the additional memory from the empty zeroth element? What other factors should I consider? I would prefer answers that are not confined to Python and Project Euler.
EDIT: Should be is None instead of is not None.
Not really an answer to the question regarding performance, but rather a general tip about caching previously calculated values. The usual way to do this is to use a map (a Python dict), as this allows you to use more complex keys than just integers, such as floating point numbers, strings, or even tuples. Also, you won't run into problems if your keys are rather sparse.
pastResults = {}
def someCalculation(integerArgument):
    if integerArgument not in pastResults:
        pastResults[integerArgument] = ...  # calculation performed on integerArgument
    return pastResults[integerArgument]
Also, there is no need to perform the calculations "in order" using a loop. Just call the function for the value you are interested in, and the if statement will take care that, when invoked recursively, the function is called only once for each argument.
Ultimately, if you are using this a lot (as clearly the case for Project Euler) you can define yourself a function decorator, like this one:
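def memo(f):
    f.cache = {}
    def _f(*args, **kwargs):
        # include keyword arguments in the key so that calls differing
        # only in their kwargs do not share a cache entry
        key = (args, tuple(sorted(kwargs.items())))
        if key not in f.cache:
            f.cache[key] = f(*args, **kwargs)
        return f.cache[key]
    return _f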
What this does: it takes a function and defines another function that first checks whether the given parameters can be found in the cache; if not, it calculates the result of the original function and puts it into the cache. Just add the @memo decorator to your function definitions and this will take care of caching for you.
@memo
def someCalculation(integerArgument):
    # function body
This is syntactic sugar for someCalculation = memo(someCalculation). Note, however, that this will not always work out well. First, the parameters have to be hashable (no lists or other mutable types); second, if you pass parameters that are not relevant to the result (e.g., debugging flags), your cache can grow unnecessarily large, as all the parameters are used as the key.
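For comparison, the standard library's functools.lru_cache provides the same kind of caching without a hand-written decorator. The sketch below uses a made-up function, sum_of_squared_digits, in the spirit of the digit-based examples mentioned in the question.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # maxsize=None means the cache never evicts entries
def sum_of_squared_digits(n):
    return sum(int(d) ** 2 for d in str(n))


print(sum_of_squared_digits(145))  # 1 + 16 + 25 = 42
print(sum_of_squared_digits(145))  # same value, served from the cache
print(sum_of_squared_digits.cache_info().hits)  # 1
```

The same hashability caveat applies: every argument becomes part of the cache key.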
Write the function sinusoid(a, w, n) that will return a list of ordered pairs representing n cycles of a sinusoid with amplitude a and frequency w. Each cycle should contain 180 ordered pairs.
So far I have:
from math import sin
def sinusoid(a,w,n):
    return [a*sin(x) for x in range(180)]
Please consider the actual functional form of a sinusoidal wave and how the frequency comes into the equation. (Hint: http://en.wikipedia.org/wiki/Sine_wave).
Not sure what is meant exactly by 'ordered pairs', but I would assume it means the x,y pairs. Currently you're only returning a list of single values. Also you might want to take a look at the documentation for Python's sin function.
Okay, we know this is a homework assignment and we're not going to do it for you. However, I'll give you a couple hints.
The instructions:
Write the function sinusoid(a, w, n) that will return a list of ordered pairs representing n cycles of a sinusoid with amplitude a and frequency w. Each cycle should contain 180 ordered pairs.
... translated into a bullet list of requirements:
Write a function
... named sinusoid()
... taking three arguments: a, w, and n
returning a list
... of n cycles(?)
... (each consisting of?) 180 "ordered pairs"
The example you've given does define a function, by the correct name, and taking the correct number of arguments. That's a start (not much of one, frankly, but it's something).
The obvious failings are that it doesn't use two of the arguments that are required and it doesn't return pairs of anything. It seems that it would return 180 numbers which are based on the argument supplied to its first parameter.
Surely you can do a bit better than that.
Let's start with a stub:
def sinusoid(a, w, n):
    '''Return n cycles of the sinusoid for a given amplitude and frequency
    where each cycle consists of 180 ordered pairs
    '''
    results = list()
    # do stuff here
    return results
That's a function, takes three arguments and returns a list. Now for that list to contain anything before we return it we'll have to append some things to it ... and the instructions tell us how many things it should return (n times 180) and what sorts of things they should be (ordered pairs).
That sounds quite a bit like we'll need a loop (for n) and another (for 180). Hmmm ...
That might look like:
for each_cycle in range(n):
    for each_pair in range(180):
        # do something here
        results.append(something) # where something is a tuple ... an "ordered pair"
... or it might look like:
for each_cycle in range(n):
    this_cycle = list()
    for each_pair in range(180):
        this_cycle.append(something)
    results.extend(this_cycle)
... or it might even look like:
for each_pair in range(n*180):
    results.append(something)
... though, frankly, that seems unlikely. (If you try flattening the inner loop to the outer loop in this way you might find that you're having to use modulo arithmetic to get n back out for some other intermediate computational purposes).
I have no idea what the instructor is actually asking for. It seems likely that the math.sin() function will be involved and I guess "ordered pairs" might be co-ordinates mapped to some sort of graphics subsystem and suitable for plotting a graph. I guess 180 of these to show the sinusoid wave through a full range of its values. Maybe you're supposed to multiply something by the amplitude and/or divide something else by the frequency and maybe you're supposed to even add something for each cycle ... some sort of offset to keep the plot moving towards the right or something.
But it seems like you might start with that stub of a function definition and try pasting in one or another of these loop bodies and then figuring out how to actually return meaningful values in the parts where I've used "something" as a placeholder.
Going with the assumption that these "ordered pairs" are co-ordinates, for plotting, then it seems likely that each of the things you append to your results should be of the form (x,y) where x is monotonically increasing (fancy way of saying it keeps going up, never goes down) and might even always be the range(0,n*180) and y is probably math.sin() of something involving a and w ... but that's just speculation on my part.
I have the following need (in python):
generate all possible tuples of length 12 (could be more) containing either 0, 1 or 2 (basically, a ternary number with 12 digits)
filter these tuples according to specific criteria, culling those not good, and keeping the ones I need.
As I had to deal with small lengths until now, the functional approach was neat and simple: a recursive function generates all possible tuples, then I cull them with a filter function. Now that I have a larger set, the generation step is taking too much time, much longer than needed as most of the paths in the solution tree will be culled later on, so I could skip their creation.
I have two solutions to solve this:
1. derecurse the generation into a loop, and apply the filter criteria on each new 12-digit entity
2. integrate the filtering into the recursive algorithm, so as to prevent it from stepping into paths that are already doomed.
My preference goes to 1 (seems easier) but I would like to hear your opinion, in particular with an eye towards how a functional programming style deals with such cases.
How about
import itertools
results = []
for x in itertools.product(range(3), repeat=12):
    if myfilter(x):
        results.append(x)
where myfilter does the selection. Here, for example, only allowing results with 10 or more 1's,
def myfilter(x): # example filter, only take tuples with 10 or more 1s
    return x.count(1)>=10
That is, my suggestion is your option 1. For some cases it may be slower because (depending on your criteria) you may generate many tuples that you don't need, but it's much more general and very easy to code.
Edit: This approach also has a one-liner form, as suggested in the comments by hughdbrown:
results = [x for x in itertools.product(range(3), repeat=12) if myfilter(x)]
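For comparison, option 2 from the question (filtering inside the recursion) might look roughly like the sketch below. The names ternary_tuples, can_extend and at_most_two_non_ones are made up for illustration, and the approach assumes the criterion can be rephrased as a test on partial tuples, which will not be true of every filter.

```python
def ternary_tuples(length, can_extend, prefix=()):
    """Yield tuples over {0, 1, 2} of the given length, abandoning any
    prefix that can_extend rejects (which prunes its whole subtree)."""
    if len(prefix) == length:
        yield prefix
        return
    for digit in range(3):
        candidate = prefix + (digit,)
        if can_extend(candidate):
            yield from ternary_tuples(length, can_extend, candidate)


# Hypothetical prefix test for the "10 or more 1s" example: with 12 digits,
# a prefix is doomed once it contains more than two digits that are not 1.
def at_most_two_non_ones(prefix):
    return len(prefix) - prefix.count(1) <= 2


results = list(ternary_tuples(12, at_most_two_non_ones))
print(len(results))  # 289
```

This only visits prefixes that can still lead to a valid tuple, instead of all 3**12 candidates, at the cost of being tied to one particular criterion.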
itertools has functionality for dealing with this. However, here is a (hardcoded) way of handling with a generator:
T = (0,1,2)
GEN = ((a,b,c,d,e,f,g,h,i,j,k,l) for a in T for b in T for c in T for d in T for e in T for f in T for g in T for h in T for i in T for j in T for k in T for l in T)
for VAL in GEN:
    # Filter VAL
    print(VAL)
I'd implement an iterative binary adder or hamming code and run that way.