So I'm new to Python and one concept that's taken some getting used to is the list comprehension. I've read that they can improve computation speed when used correctly and that they're something to know well if you're to learn Python the right way.
I'm writing a program that implements a particle aggregation algorithm that involves the accumulation of up to 10^6 particles to a growing cluster. I'm going back through my code to optimize performance anywhere possible, and I have the following function:
def step_all_walkers(walkers):
updated_positions = []
for walker in walkers:
decision = random.randint(1,4)
if decision == 1:
updated_walker = (min(walker[0]+1, N-1), walker[1])
elif decision == 2:
updated_walker = (max(walker[0]-1, 1), walker[1])
elif decision == 3:
updated_walker = (walker[0], min(walker[1] + 1, N-1))
elif decision == 4:
updated_walker = (walker[0], max(walker[1]-1, 1))
updated_positions.append(updated_walker)
return updated_positions
This function makes every particle (or walker, as I call them in the code) take a step of unit length in a random direction, and prevents the particles from walking off an N x N grid. I notice that I am creating and returning a new list updated_positions, and since this list and the input walker list are potentially very large, what I sort of know about list comprehensions tells me this might be a good time to use one. However, in some other posts on this question where there is only one if/else to be evaluated, people responded by saying just use a good ole fashion for loop.
I have a few questions then:
1) Can multiple if/elif statements be done in a list comprehension?
2) Does it make sense to write this for loop as a list comprehension? Are there any advantages to doing so?
My main purpose for asking this question is to build up more intuition for when a list comprehension is appropriate, and also to see if this function can be made more efficient with one.
I would turn that into a lookup dictionary to start with and then you might be able to consider a list comprehension
decisions = {1: lambda walker: (min(walker[0]+1, N-1), walker[1])}
return [decisions[random.randint(1,4)](walker) for walker in walkers]
A case where list comprehensions should not be used is when the logic in too complex.
For loops are simpler then since they allow:
Including comments to explain the code.
Use of control flow keywords like continue.
Debug of sections of the more easily using logging statements or asserts.
Allow others to easily see the complexity of the logic by scanning down the lines
However, as Sayse suggests, we can often simplify the logic with additional data structures and helper functions to allow list comprehension.
List comprehension with no if/else statements (need for logic bypassed)
def step_all_walkers(walkers, N):
def decision(walker):
" Helper function for making a decisions "
# Place decision choices into a data structure
options = [
lambda walker: (min(walker[0]+1, N-1), walker[1]),
lambda walker: (max(walker[0]-1, 1), walker[1]),
lambda walker: (walker[0], min(walker[1] + 1, N-1)),
lambda walker: (walker[0], max(walker[1]-1, 1))]
while True:
n = random.randint(0, 3) # use rather than (1, 4) to
# provide proper index into options
yield options[n](walker)
# Now, list comprehension is straight forward to follow
return [decision(walker) for walker in walkers]
Related
Ok, I'm in the process of learning Python, and had a quick question about for loops. I was wondering if you could use math operators in them, like JavaScript. For example, could I do:
for i = 0, i < 5, i++:
#code here
Now, I'm quite aware that Python doesn't support i++, and I think it doesn't support the commas either. So if I can do it that way, could you provide a sample.
Thanks
You would use a range loop:
for i in range(5):
#code here
If you want to increment in a loop you would use a while loop:
i = 0
while i < 5:
i += 1
To decrement you would use i -= 1.
Just as a loop is introduced by for, does not imply the same behaviour for different languages.
Python's for loop iterates over objects. Something like the C-for loop does not exist.
The C for loop (for ( <init> ; <cond> ; <update> ) <statement>, however, is actually identical to the C code:
<init>;
while ( <cond> ) {
<statement>
<update>
}
So, with the additional information that Python does have a while loop which behaves like the C-while loop, you should now be able to implement something like the C for loop in Python. I'll leave that as an exercise:-)
Note: as generating an evenly spaced sequence of integer values is a common case, Python provides the range() (Python 3) or xrange() (Python 2) function. This does create a RangeObject which (basically) yields the next value for a sequence given by start, stop and step arguments.
Quick answer
You may use:
for i in range(5):
# code here
or
i = 0
while i < 5:
i = i + 1 # or i += 1
Boring/pedantic answer
When I was learning Python I disliked the syntax; why should a simple for loop require a second keyword, range? The answer, I believe, is due to the fundamental role of the list in Python's prescriptive syntax. Repeated annoyances by range made me think about how the data were described (or not) before the loop, which in turn led me to think more Pythonically about the design of the data.
Let's say you want to populate a list with the first five perfect squares. You could:
squares = []
for i in range(5):
squares.append(i**2)
Alternatively, you could use comprehension:
initial_values = range(5) # we've declared the initial values
squares = [i**2 for i in initial_values]
Or more compactly:
squares = [i**2 for i in range(5)]
I routinely encounter problems where there's no Pythonic way to write the code, and I end up writing C-like Python (as in the Quick answer above). But just as often I find there's a more elegant and readable way to do things, and usually this indicates some imperfections in the antecedent data design.
Recently I read a problem to practice DP. I wasn't able to come up with one, so I tried a recursive solution which I later modified to use memoization. The problem statement is as follows :-
Making Change. You are given n types of coin denominations of values
v(1) < v(2) < ... < v(n) (all integers). Assume v(1) = 1, so you can
always make change for any amount of money C. Give an algorithm which
makes change for an amount of money C with as few coins as possible.
[on problem set 4]
I got the question from here
My solution was as follows :-
def memoized_make_change(L, index, cost, d):
if index == 0:
return cost
if (index, cost) in d:
return d[(index, cost)]
count = cost / L[index]
val1 = memoized_make_change(L, index-1, cost%L[index], d) + count
val2 = memoized_make_change(L, index-1, cost, d)
x = min(val1, val2)
d[(index, cost)] = x
return x
This is how I've understood my solution to the problem. Assume that the denominations are stored in L in ascending order. As I iterate from the end to the beginning, I have a choice to either choose a denomination or not choose it. If I choose it, I then recurse to satisfy the remaining amount with lower denominations. If I do not choose it, I recurse to satisfy the current amount with lower denominations.
Either way, at a given function call, I find the best(lowest count) to satisfy a given amount.
Could I have some help in bridging the thought process from here onward to reach a DP solution? I'm not doing this as any HW, this is just for fun and practice. I don't really need any code either, just some help in explaining the thought process would be perfect.
[EDIT]
I recall reading that function calls are expensive and is the reason why bottom up(based on iteration) might be preferred. Is that possible for this problem?
Here is a general approach for converting memoized recursive solutions to "traditional" bottom-up DP ones, in cases where this is possible.
First, let's express our general "memoized recursive solution". Here, x represents all the parameters that change on each recursive call. We want this to be a tuple of positive integers - in your case, (index, cost). I omit anything that's constant across the recursion (in your case, L), and I suppose that I have a global cache. (But FWIW, in Python you should just use the lru_cache decorator from the standard library functools module rather than managing the cache yourself.)
To solve for(x):
If x in cache: return cache[x]
Handle base cases, i.e. where one or more components of x is zero
Otherwise:
Make one or more recursive calls
Combine those results into `result`
cache[x] = result
return result
The basic idea in dynamic programming is simply to evaluate the base cases first and work upward:
To solve for(x):
For y starting at (0, 0, ...) and increasing towards x:
Do all the stuff from above
However, two neat things happen when we arrange the code this way:
As long as the order of y values is chosen properly (this is trivial when there's only one vector component, of course), we can arrange that the results for the recursive call are always in cache (i.e. we already calculated them earlier, because y had that value on a previous iteration of the loop). So instead of actually making the recursive call, we replace it directly with a cache lookup.
Since every component of y will use consecutively increasing values, and will be placed in the cache in order, we can use a multidimensional array (nested lists, or else a Numpy array) to store the values instead of a dictionary.
So we get something like:
To solve for(x):
cache = multidimensional array sized according to x
for i in range(first component of x):
for j in ...:
(as many loops as needed; better yet use `itertools.product`)
If this is a base case, write the appropriate value to cache
Otherwise, compute "recursive" index values to use, look up
the values, perform the computation and store the result
return the appropriate ("last") value from cache
I suggest considering the relationship between the value you are constructing and the values you need for it.
In this case you are constructing a value for index, cost based on:
index-1 and cost
index-1 and cost%L[index]
What you are searching for is a way of iterating over the choices such that you will always have precalculated everything you need.
In this case you can simply change the code to the iterative approach:
for each choice of index 0 upwards:
for each choice of cost:
compute value corresponding to index,cost
In practice, I find that the iterative approach can be significantly faster (e.g. *4 perhaps) for simple problems as it avoids the overhead of function calls and checking the cache for preexisting values.
I have complex algorithm to build in order to select the best combination of elements in my list.
I have a list of 20 elements. I make all the combinations this list using this algorithms, the resutlt would be a list of element with size: 2^20-1 (without duplications)
from itertools import combinations
def get_all_combinations(input_list):
for i in xrange(len(input_list)):
for item in combinations(input_list, r = i + 1):
yield list(item)
input_list = [1,4,6,8,11,13,5,98,45,10,21,34,46,85,311,133,35,938,345,310]
print len(get_all_combinations(input_list)) # 1048575
I have another algorithm that is applied on every list, then calculate the max.
// this is just an example
def calcul_factor(item):
return max(item) * min(item) / sqrt(min(item))
I tried to do it like this way: but it's taking a long time.
columnsList= get_all_combinations(input_list)
for x in columnsList:
i= calcul_factor(x)
factorsList.append(i)
l.append(x)
print "max", max(factorsList)
print "Best combinations:", l[factorsList.index( max(factorsList))]
Does using Maps/Lamda expressions solve issues to make "parallelisme" to calculate the maximum ?
ANy hints to do that ?
In case you can't find a better algorithm (which might be needed here) you can avoid creating those big lists by using generators.
With the help of itertools.chain you can combine the itertools.combinations-generators. Furthermore the max-function can take a function as a key.
Your code can be reduced to:
all_combinations = chain(*[combinations(input_list, i) for i in range(1, len(input_list))])
max(all_combinations, key=algorithm)
Since this code relies solely on generators it might be faster (doesn't mean fast enough).
Edit: I generally agree with Hugh Bothwell, that you should be trying to find a better algorithm before going with an implementation like this. Especially if your lists are going to contain more than 20 elements.
If you can easily calculate calcul_factor(item + [k]) given calcul_factor(item), you might greatly benefit from a dynamic-programming approach.
If you can eliminate some bad solutions early, it will also greatly reduce the total number of combinations to consider (branch-and-bound).
If the calculation is reasonably well-behaved, you might even be able to use ie simplex method or a linear solver and walk directly to a solution (something like O(n**2 log n) runtime instead of O(2**n))
Could you show us the actual calcul_factor code and an actual input_list?
I was looking at Wikipedia's pseudo-code (and other webpages like sortvis.org and sorting-algorithm.com) on merge sort and saw the preparation of a merge uses recursion.
I was curious to see if there is a non-recursive way to do it.
Perhaps something like a for each i element in list, i=[i-th element].
I am under the impression that recursion is keep-it-to-a-minimum-because-it's-undesirable, and so therefore I thought of this question.
The following is a pseudo-code sample of the recursive part of the merge-sort from Wikipedia:
function merge_sort(list m)
// if list size is 1, consider it sorted and return it
if length(m) <= 1
return m
// else list size is > 1, so split the list into two sublists
var list left, right
var integer middle = length(m) / 2
for each x in m up to middle
add x to left
for each x in m after or equal middle
add x to right
// recursively call merge_sort() to further split each sublist
// until sublist size is 1
left = merge_sort(left)
right = merge_sort(right)
Bottom-up merge sort is a non-recursive variant of merge sort.
See also this wikipedia page for a more detailed pseudocode implementation.
middle = len(lst) / 2
left = lst[:middle]
right = lst[middle:]
List slicing works fine.
As an aside - recursion is not undesirable per se.
Recursion is undesirable if you have limited stack space (are you afraid of stackoverflow? ;-) ), or in some cases where the time overhead of function calls is of great concern.
For much of the time these conditions do not hold; readability and maintainability of your code will be more relevant. Algorithms like merge sort make more sense when expressed recursively in my opinion.
I have the following need (in python):
generate all possible tuples of length 12 (could be more) containing either 0, 1 or 2 (basically, a ternary number with 12 digits)
filter these tuples according to specific criteria, culling those not good, and keeping the ones I need.
As I had to deal with small lengths until now, the functional approach was neat and simple: a recursive function generates all possible tuples, then I cull them with a filter function. Now that I have a larger set, the generation step is taking too much time, much longer than needed as most of the paths in the solution tree will be culled later on, so I could skip their creation.
I have two solutions to solve this:
derecurse the generation into a loop, and apply the filter criteria on each new 12-digits entity
integrate the filtering in the recursive algorithm, so to prevent it stepping into paths that are already doomed.
My preference goes to 1 (seems easier) but I would like to hear your opinion, in particular with an eye towards how a functional programming style deals with such cases.
How about
import itertools
results = []
for x in itertools.product(range(3), repeat=12):
if myfilter(x):
results.append(x)
where myfilter does the selection. Here, for example, only allowing result with 10 or more 1's,
def myfilter(x): # example filter, only take lists with 10 or more 1s
return x.count(1)>=10
That is, my suggestion is your option 1. For some cases it may be slower because (depending on your criteria) you many generate many lists that you don't need, but it's much more general and very easy to code.
Edit: This approach also has a one-liner form, as suggested in the comments by hughdbrown:
results = [x for x in itertools.product(range(3), repeat=12) if myfilter(x)]
itertools has functionality for dealing with this. However, here is a (hardcoded) way of handling with a generator:
T = (0,1,2)
GEN = ((a,b,c,d,e,f,g,h,i,j,k,l) for a in T for b in T for c in T for d in T for e in T for f in T for g in T for h in T for i in T for j in T for k in T for l in T)
for VAL in GEN:
# Filter VAL
print VAL
I'd implement an iterative binary adder or hamming code and run that way.