Can someone please explain the algorithm behind the itertools.permutations routine in the Python 2.6 standard library? I don't understand why it works.
The code is:
def permutations(iterable, r=None):
    # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    # permutations(range(3)) --> 012 021 102 120 201 210
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return
    indices = range(n)
    cycles = range(n, n-r, -1)
    yield tuple(pool[i] for i in indices[:r])
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
            else:
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                yield tuple(pool[i] for i in indices[:r])
                break
        else:
            return
You need to understand the mathematical theory of permutation cycles, also known as "orbits" (it's important to know both "terms of art" since the mathematical subject, the heart of combinatorics, is quite advanced, and you may need to look up research papers which could use either or both terms).
For a simpler introduction to the theory of permutations, wikipedia can help. Each of the URLs I mentioned offers reasonable bibliography if you get fascinated enough by combinatorics to want to explore it further and gain real understanding (I did, personally -- it's become somewhat of a hobby for me;-).
Once you understand the mathematical theory, the code is still subtle and interesting to "reverse engineer". Clearly, indices is just the current permutation in terms of indices into the pool, given that the items yielded are always given by
yield tuple(pool[i] for i in indices[:r])
So the heart of this fascinating machinery is cycles, which represents the permutation's orbits and causes indices to be updated, mostly by the statements
j = cycles[i]
indices[i], indices[-j] = indices[-j], indices[i]
I.e., if cycles[i] is j, this means that the next update to the indices is to swap the i-th one (from the left) with the j-th one from the right (e.g., if j is 1, then the last element of indices is being swapped -- indices[-1]). And then there's the less frequent "bulk update" when an item of cycles reached 0 during its decrements:
indices[i:] = indices[i+1:] + indices[i:i+1]
cycles[i] = n - i
this puts the ith item of indices at the very end, shifting all following items of indices one to the left, and indicates that the next time we come to this item of cycles we'll be swapping the new ith item of indices (from the left) with the n - ith one (from the right) -- that would be the ith one again, except of course for the fact that there will be a
cycles[i] -= 1
before we next examine it;-).
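To make the two kinds of update concrete, here is a minimal sketch that applies each of them once to a small list. The helper names swap_step and rotate_step are made up for this illustration and do not appear in the itertools source; they only wrap the two statements quoted above.

```python
def swap_step(indices, i, j):
    """Small update: swap position i (from the left) with position j from the right."""
    indices = list(indices)  # work on a copy so the caller's list is untouched
    indices[i], indices[-j] = indices[-j], indices[i]
    return indices


def rotate_step(indices, i):
    """Bulk update: move the item at position i to the very end."""
    indices = list(indices)
    indices[i:] = indices[i + 1:] + indices[i:i + 1]
    return indices


print(swap_step([0, 1, 2, 3, 4], 1, 2))   # [0, 3, 2, 1, 4]
print(rotate_step([0, 3, 2, 1, 4], 1))    # [0, 2, 1, 4, 3]
```

With j = 2 the partner of position 1 is indices[-2], i.e. the second item from the right; the rotation then shifts everything after position 1 one place to the left.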
The hard part would of course be proving that this works -- i.e., that all permutations are exhaustively generated, with no overlap and a correctly "timed" exit. I think that, instead of a proof, it may be easier to look at how the machinery works when fully exposed in simple cases -- commenting out the yield statements and adding print ones (Python 2.*), we have
def permutations(iterable, r=None):
    # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    # permutations(range(3)) --> 012 021 102 120 201 210
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return
    indices = range(n)
    cycles = range(n, n-r, -1)
    print 'I', 0, cycles, indices
    # yield tuple(pool[i] for i in indices[:r])
    print indices[:r]
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                print 'B', i, cycles, indices
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
                print 'A', i, cycles, indices
            else:
                print 'b', i, cycles, indices
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                print 'a', i, cycles, indices
                # yield tuple(pool[i] for i in indices[:r])
                print indices[:r]
                break
        else:
            return

permutations('ABC', 2)
Running this shows:
I 0 [3, 2] [0, 1, 2]
[0, 1]
b 1 [3, 1] [0, 1, 2]
a 1 [3, 1] [0, 2, 1]
[0, 2]
B 1 [3, 0] [0, 2, 1]
A 1 [3, 2] [0, 1, 2]
b 0 [2, 2] [0, 1, 2]
a 0 [2, 2] [1, 0, 2]
[1, 0]
b 1 [2, 1] [1, 0, 2]
a 1 [2, 1] [1, 2, 0]
[1, 2]
B 1 [2, 0] [1, 2, 0]
A 1 [2, 2] [1, 0, 2]
b 0 [1, 2] [1, 0, 2]
a 0 [1, 2] [2, 0, 1]
[2, 0]
b 1 [1, 1] [2, 0, 1]
a 1 [1, 1] [2, 1, 0]
[2, 1]
B 1 [1, 0] [2, 1, 0]
A 1 [1, 2] [2, 0, 1]
B 0 [0, 2] [2, 0, 1]
A 0 [3, 2] [0, 1, 2]
Focus on the cycles: they start as 3, 2 -- then the last one is decremented, so 3, 1 -- the last isn't zero yet so we have a "small" event (one swap in the indices) and break the inner loop. Then we enter it again, this time the decrement of the last gives 3, 0 -- the last is now zero so it's a "big" event -- "mass swap" in the indices (well there's not much of a mass here, but, there might be;-) and the cycles are back to 3, 2. But now we haven't broken off the for loop, so we continue by decrementing the next-to-last (in this case, the first) -- which gives a minor event, one swap in the indices, and we break the inner loop again. Back to the loop, yet again the last one is decremented, this time giving 2, 1 -- minor event, etc. Eventually a whole for loop occurs with only major events, no minor ones -- that's when the cycles start as all ones, so the decrement takes each to zero (major event), no yield occurs on that last cycle.
Since no break ever executed in that cycle, we take the else branch of the for, which returns. Note that the while n may be a bit misleading: it actually acts as a while True -- n never changes, the while loop only exits from that return statement; it could equally well be expressed as if not n: return followed by while True:, because of course when n is 0 (empty "pool") there's nothing more to yield after the first, trivial empty yield. The author just decided to save a couple of lines by collapsing the if not n: check with the while;-).
I suggest you continue by examining a few more concrete cases -- eventually you should perceive the "clockwork" operating. Focus on just cycles at first (maybe edit the print statements accordingly, removing indices from them), since their clockwork-like progress through their orbit is the key to this subtle and deep algorithm; once you grok that, the way indices get properly updated in response to the sequencing of cycles is almost an anticlimax!-)
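If you are on Python 3, the 2.6 code above will not run unchanged, because range() no longer returns a list that can be modified in place. The following is a sketch of a port, with the `while n` written out as the explicit `if not n: return` plus `while True:` described above. The name permutations_py3 is just a label for this example; the loop body is otherwise the same as in the question.

```python
import itertools


def permutations_py3(iterable, r=None):
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return
    indices = list(range(n))            # must be real lists in Python 3
    cycles = list(range(n, n - r, -1))
    yield tuple(pool[i] for i in indices[:r])
    if not n:                           # empty pool: nothing after the trivial yield
        return
    while True:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                # "big" event: rotate and restart this position's countdown
                indices[i:] = indices[i + 1:] + indices[i:i + 1]
                cycles[i] = n - i
            else:
                # "small" event: one swap, one yield
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                yield tuple(pool[i] for i in indices[:r])
                break
        else:
            return                      # only big events happened: we are done


# Compare against the C implementation for a few inputs, including edge cases.
for args in [("ABCD", 2), ("ABC", None), ("", None), ("AB", 0)]:
    assert list(permutations_py3(*args)) == list(itertools.permutations(*args))
```

The comparison at the bottom is a cheap way to convince yourself that rewriting the loop header did not change the behaviour.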
It is easier to answer with the pattern in the results than with words (unless you want the mathematical theory behind it),
so printouts are the best way to explain.
The most subtle thing is that,
after looping to the end, the algorithm resets itself to the first turn of the last round and starts the next countdown, or keeps resetting through the first turn of ever bigger rounds, like a clock.
The part of the code doing the reset:
if cycles[i] == 0:
    indices[i:] = indices[i+1:] + indices[i:i+1]
    cycles[i] = n - i
The whole function:
def permutations(iterable, r=None):
    # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    # permutations(range(3)) --> 012 021 102 120 201 210
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return
    indices = range(n)
    cycles = range(n, n-r, -1)
    yield tuple(pool[i] for i in indices[:r])
    print(indices, cycles)
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
                print("reset------------------")
                print(indices, cycles)
                print("------------------")
            else:
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                print(indices, cycles, i, n-j)
                yield tuple(pool[i] for i in indices[:r])
                break
        else:
            return
part of the result:
In [54]: list(','.join(i) for i in permutations('ABCDE', 3))
([0, 1, 2, 3, 4], [5, 4, 3])
([0, 1, 3, 2, 4], [5, 4, 2], 2, 3)
([0, 1, 4, 2, 3], [5, 4, 1], 2, 4)
reset------------------
([0, 1, 2, 3, 4], [5, 4, 3])
------------------
([0, 2, 1, 3, 4], [5, 3, 3], 1, 2)
([0, 2, 3, 1, 4], [5, 3, 2], 2, 3)
([0, 2, 4, 1, 3], [5, 3, 1], 2, 4)
reset------------------
([0, 2, 1, 3, 4], [5, 3, 3])
------------------
([0, 3, 1, 2, 4], [5, 2, 3], 1, 3)
([0, 3, 2, 1, 4], [5, 2, 2], 2, 3)
([0, 3, 4, 1, 2], [5, 2, 1], 2, 4)
reset------------------
([0, 3, 1, 2, 4], [5, 2, 3])
------------------
([0, 4, 1, 2, 3], [5, 1, 3], 1, 4)
([0, 4, 2, 1, 3], [5, 1, 2], 2, 3)
([0, 4, 3, 1, 2], [5, 1, 1], 2, 4)
reset------------------
([0, 4, 1, 2, 3], [5, 1, 3])
------------------
reset------------------(bigger reset)
([0, 1, 2, 3, 4], [5, 4, 3])
------------------
([1, 0, 2, 3, 4], [4, 4, 3], 0, 1)
([1, 0, 3, 2, 4], [4, 4, 2], 2, 3)
([1, 0, 4, 2, 3], [4, 4, 1], 2, 4)
reset------------------
([1, 0, 2, 3, 4], [4, 4, 3])
------------------
([1, 2, 0, 3, 4], [4, 3, 3], 1, 2)
([1, 2, 3, 0, 4], [4, 3, 2], 2, 3)
([1, 2, 4, 0, 3], [4, 3, 1], 2, 4)
I recently stumbled upon the very same question during my journey of reimplementing permutation algorithms, and would like to share my understanding of this interesting algorithm.
TL;DR: This algorithm is based on a recursive permutation generation algorithm (backtracking based and utilizes swapping elements), and is transformed (or optimized) into an iteration form. (possibly to improve efficiency and prevent stack overflow)
Basics
Before we start, I have to make sure we use the same notation as the original algorithm.
n refers to the length of iterable
r refers to the length of one output permutation tuple
And let me share a simple observation (as discussed by Alex):
Whenever the algorithm yields an output, it just takes the first r elements of the indices list.
cycles
First, let’s discuss the variable cycles and build some intuition. With some debugging prints, we can see that cycles act like a countdown (of time or clock, something like 01:00:00 -> 00:59:59 -> 00:59:58):
cycles is initialized to range(n, n-r, -1), so cycles[0]=n, cycles[1]=n-1, ..., cycles[i]=n-i.
Usually, only the last element is decreased, and each decrement (provided cycles[r-1] != 0 after the decrement) yields an output (a permutation tuple). We can intuitively name this case tick.
Whenever an element (say cycles[i]) decreases to 0, it triggers a decrease of the element before it (cycles[i-1]). Then the triggering element (cycles[i]) is restored to its initial value (n-i). This behavior is similar to borrowing in subtraction, or to the seconds wrapping around and the minutes dropping by one in a clock countdown. We can intuitively name this branch reset.
To further confirm our intuition, add some print statements to the algorithm and run it with the parameters iterable="ABCD", r=2. We can see the following changes of the cycles variable. Note that square brackets indicate a “tick”, which yields an output, and curly braces indicate a “reset”, which does not yield output.
[4,3] -> [4,2] -> [4,1] -> {4,0} -> {4,3} ->
[3,3] -> [3,2] -> [3,1] -> {3,0} -> {3,3} ->
[2,3] -> [2,2] -> [2,1] -> {2,0} -> {2,3} ->
[1,3] -> [1,2] -> [1,1] -> {1,0} -> {1,3} -> {0,3} -> {4,3}
Using the initial values and the change pattern of cycles, we can arrive at a possible interpretation of cycles: the number of remaining permutations (outputs) at each index. When initialized, cycles[0]=n represents that there are initially n possible choices at index 0, and cycles[1]=n-1 represents that there are initially n-1 possible choices at index 1, all the way down to cycles[r-1]=n-r+1. This interpretation matches the math, as a simple combinatorial calculation confirms. Further supporting evidence is that when the algorithm ends, it has performed P(n,r) ticks, where P(n,r)=n*(n-1)*...*(n-r+1) (counting the initial yield before entering while as a tick).
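The countdown can be checked in isolation, because the evolution of cycles never depends on indices. The sketch below keeps only the cycles bookkeeping from the original loop and records the state at every tick; tick_states is a hypothetical helper name, and it assumes r <= n and Python 3.8+ (for math.perm).

```python
from math import perm  # P(n, r), available since Python 3.8


def tick_states(n, r):
    """Return the value of `cycles` at every yield of the itertools recipe."""
    cycles = list(range(n, n - r, -1))
    states = [tuple(cycles)]            # the initial yield counts as a tick
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                cycles[i] = n - i       # reset: restore the initial value, keep going left
            else:
                states.append(tuple(cycles))  # tick
                break
        else:
            break                       # every position reset: countdown finished
    return states


states = tick_states(4, 2)
print(states[:4])                       # [(4, 3), (4, 2), (4, 1), (3, 3)]
print(len(states) == perm(4, 2))        # True
```

The recorded states are exactly the square-bracketed entries in the trace above, and their number equals P(n,r).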
indices
Now we come to the more complex part, the indices list. As this is essentially a recursive algorithm (more precisely backtracking), I would like to start from a sub-problem (i=r-1): When the value from index 0 to index r-2 (inclusive) in indices is fixed, and only the value at index r-1 (in other words, the last element in indices) is changing. Also, I will introduce a concrete example (iterable="ABCDE", r=3), and we will be focusing on how it generates the first 3 outputs: ABC, ABD, ABE.
Following the sub-problem, we split the list of indices into 3 parts, and give them names,
fixed : indices[0:r-2] (inclusive)
changing: indices[r-1] (only one value)
backlog: indices[r:n-1] (the remaining parts beside the first two)
As this is a backtracking algorithm, we need to keep an invariant unmodified before and after the execution. The invariant is
The sublist containing changing and backlog (indices[r-1:n-1], inclusive) is modified during the execution but restored when it ends.
Now we can turn to the interaction between cycles and indices during the mysterious while loop. Some of the operations have been outlined by Alex, and I further elaborate.
In each tick, the element in the changing part is swapped with some element in the backlog part, and the relative order of the backlog part is maintained.
Using the characters to visualize the indices, with curly braces highlighting the backlog part:
ABC{DE} -> ABD{CE} -> ABE{CD}
When a reset happens, the element in the changing part is moved to the back of the backlog, thus restoring the initial layout of the sublist (containing the changing part and the backlog part).
Using the characters to visualize the indices, with curly braces highlighting the changing part:
AB{E}CD -> ABCD{E}
During this execution (of i=r-1), only the tick phase can yield outputs, and it yields n-r+1 outputs in total, matching the initial value of cycles[i]. This also follows from the math: there are only n-r+1 choices for the last position once the fixed part is fixed.
After cycles[i] is decreased to 0, the reset phase kicks in, resetting cycles[i] to n-r+1 and restoring the invariant sublist. This phase marks the end of this execution and indicates that all possible permutation choices given the fixed prefix have been output.
Therefore, we have shown that, in this sub-problem (i=r-1), this algorithm is indeed a valid backtracking algorithm, as it
Outputs all possible values given the precondition (fixed prefix part)
Keeps the invariant unmodified (restored in reset phase)
This proof(?) can also be generalized to other values of i, thus proving(?) the correctness of this permutation generation algorithm.
Reimplementation
Phew! That’s a long read, and you may want to have some more tinkering (more print) with the algorithm to be fully convinced. In essence, we can simplify the underlying principle of the algorithm as the following pseudo-code:
// precondition: the fixed part (or prefix) is fixed
OUTPUT initial_permutation // also invokes the next level
WHILE remaining_permutation_count > 0
    // tick
    swap the changing element with an element in backlog
    OUTPUT current_permutation // also invokes the next level
// reset
move the changing element behind the backlog
And here is a Python implementation using simple backtracking:
# helpers
def swap(list, i, j):
    list[i], list[j] = list[j], list[i]

def move_to_last(list, i):
    list[i:] = list[i+1:] + [list[i]]

def print_first_n_element(list, n):
    print("".join(list[:n]))

# backtracking dfs
def permutations(list, r, changing_index):
    if changing_index == r:
        # we've reached the deepest level
        print_first_n_element(list, r)
        return
    # a pseudo `tick`
    # process initial permutation
    # which is just doing nothing (using the initial value)
    permutations(list, r, changing_index + 1)
    # note: initial permutation has been output, thus the minus 1
    remaining_choices = len(list) - 1 - changing_index
    # for (i=1;i<=remaining_choices;i++)
    for i in range(1, remaining_choices+1):
        # `tick` phases
        # make one swap
        swap_idx = changing_index + i
        swap(list, changing_index, swap_idx)
        # finished one move at current level, now go deeper
        permutations(list, r, changing_index + 1)
    # `reset` phase
    move_to_last(list, changing_index)

# wrapper
def permutations_wrapper(list, r):
    permutations(list, r, 0)

# main
if __name__ == "__main__":
    my_list = ["A", "B", "C", "D"]
    permutations_wrapper(my_list, 2)
All that remains is to show that the backtracking version is equivalent to the iterative version in the itertools source code. It should be pretty easy once you grasp why this algorithm works. Following the great tradition of various CS textbooks, this is left as an exercise to the reader.
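Short of a proof, the equivalence can at least be checked empirically. The sketch below is a generator variant of the recursive idea (tick = swap with the next backlog element, reset = move the changing element to the end), compared against itertools.permutations. The name backtrack_permutations is made up for this example, and a passing comparison on a few inputs is evidence, not a proof.

```python
from itertools import permutations as std_permutations


def backtrack_permutations(pool, r, level=0):
    """Yield r-length permutations of the list `pool` via swap/reset backtracking."""
    if level == r:
        yield tuple(pool[:r])           # deepest level: emit the current prefix
        return
    # pseudo tick: the initial arrangement at this level
    yield from backtrack_permutations(pool, r, level + 1)
    for step in range(1, len(pool) - level):
        # tick: swap the changing element with the next backlog element
        pool[level], pool[level + step] = pool[level + step], pool[level]
        yield from backtrack_permutations(pool, r, level + 1)
    # reset: move the changing element behind the backlog
    pool[level:] = pool[level + 1:] + [pool[level]]


for text, r in [("ABCD", 2), ("ABCDE", 3), ("ABCD", 4)]:
    assert list(backtrack_permutations(list(text), r)) == list(std_permutations(text, r))
```

Note that the comparison checks the order of the outputs as well, not just the set of permutations.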
Related
I have 2 seemingly identical solutions to the n-queens problem. Both produce exactly the same results (I found both online), but the second one takes more than double the time the first one does. Could you please help me and explain where the difference is?
from itertools import permutations
import time

punkt1 = time.time()
N=8
sol=0
cols = range(N)
for combo in permutations(cols):
    if N==len(set(combo[i]+i for i in cols))==len(set(combo[i]-i for i in cols)):
        sol += 1
        print('Solution '+str(sol)+' : '+str(combo)+'\n')
        #print("\n".join(' o ' * i + ' X ' + ' o ' * (N-i-1) for i in combo) + "\n\n\n\n")
punkt2 = time.time()
czas = punkt2 - punkt1
###################################
def queensproblem(rows, columns):
    solutions = [[]]
    for row in range(rows):
        solutions = add_one_queen(row, columns, solutions)
    return solutions

def add_one_queen(new_row, columns, prev_solutions):
    return [solution + [new_column]
            for solution in prev_solutions
            for new_column in range(columns)
            if no_conflict(new_row, new_column, solution)]

def no_conflict(new_row, new_column, solution):
    return all(solution[row] != new_column and
               solution[row] + row != new_column + new_row and
               solution[row] - row != new_column - new_row
               for row in range(new_row))

punkt3 = time.time()
i = 1
for solution in queensproblem(8, 8):
    print('Solution', i,':', solution, '\n')
    i = i + 1
punkt4 = time.time()
czas2 = punkt4 - punkt3
print ("Execution time of the first method:")
print (czas,'\n')
print ("Execution time of the second method:")
print (czas2)
At first glance, you seemed to be saying these algorithms produce the same results and differ in time by a constant factor, which is irrelevant when talking about algorithms.
However, if you make N a function parameter and check the timing for N=9 or N=10, you will see them diverge significantly. At N=11 the itertools.permutations version took 12 minutes, vs the other's 28 seconds. It becomes an algorithm problem if they grow at different rates, which they do.
The function which calls "for combo in permutations" is literally looking at every possible board, so you could line up three queens in a row, and it still thinks "I gotta keep adding queens and see if it works out". (That's every possible board representable by the notation. The notation itself eliminates a lot, but not enough.)
The other function is able to stop checking bad combinations and thus eliminate many bad candidates at once. Look at this printout of the decision tree for N=4, generated by adding print (row, solutions) in the queensproblem for loop:
0 [[0], [1], [2], [3]]
1 [[0, 2], [0, 3], [1, 3], [2, 0], [3, 0], [3, 1]]
2 [[0, 3, 1], [1, 3, 0], [2, 0, 3], [3, 0, 2]]
3 [[1, 3, 0, 2], [2, 0, 3, 1]]
Early in the logic, it looked at [0, 0] and [0, 1] and simply eliminated them. Therefore it never looked at [0, 0, 0] or many others. It continued to add new queens only to the partial solutions which passed the earlier checks. It also saves a lot of time inside no_conflict by not evaluating the remaining checks once one fails, because of the short-circuit boolean logic of "all" and "and".
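To see the difference in growth without waiting minutes for a timing run, one can count how many candidates each approach examines instead of measuring wall-clock time. This is a rough sketch: count_brute_force and count_pruned are names invented here, "examined" means complete boards for the first approach and (partial solution, new column) pairs for the second, and the two units are not equally expensive, so treat the numbers as an indication only.

```python
from itertools import permutations


def count_brute_force(n):
    """Count the full boards checked by the permutations approach."""
    cols = range(n)
    examined = solutions = 0
    for combo in permutations(cols):
        examined += 1
        if n == len({combo[i] + i for i in cols}) == len({combo[i] - i for i in cols}):
            solutions += 1
    return examined, solutions


def count_pruned(n):
    """Count the (partial solution, new column) pairs checked row by row."""
    partial = [[]]
    examined = 0
    for row in range(n):
        extended = []
        for sol in partial:
            for col in range(n):
                examined += 1
                if all(sol[r] != col and
                       sol[r] + r != col + row and
                       sol[r] - r != col - row
                       for r in range(row)):
                    extended.append(sol + [col])
        partial = extended              # dead ends are dropped here
    return examined, len(partial)


for n in range(4, 9):
    print(n, count_brute_force(n), count_pruned(n))
```

The first count is always n!, while the second depends on how many partial solutions survive each row, which is what the decision-tree printout above shows for N=4.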
I've got the following "bars and stars" algorithm, implemented in Python, which prints out all decompositions of a sum into 3 bins, for sums going from 0 to 5.
I'd like to generalise my code so it works with N bins (where N is less than the max sum, i.e. 5 here).
The pattern is: if you have 3 bins you need 2 nested loops; if you have N bins you need N-1 nested loops.
Can someone think of a generic way of writing this, possibly not using loops?
# bars and stars algorithm
N=5
for n in range(0,N):
    x=[1]*n
    for i in range(0,(len(x)+1)):
        for j in range(i,(len(x)+1)):
            print sum(x[0:i]), sum(x[i:j]), sum(x[j:len(x)])
If this isn't simply a learning exercise, then it's not necessary for you to roll your own algorithm to generate the partitions: Python's standard library already has most of what you need, in the form of the itertools.combinations function.
From Theorem 2 on the Wikipedia page you linked to, there are n+k-1 choose k-1 ways of partitioning n items into k bins, and the proof of that theorem gives an explicit correspondence between the combinations and the partitions. So all we need is (1) a way to generate those combinations, and (2) code to translate each combination to the corresponding partition. The itertools.combinations function already provides the first ingredient. For the second, each combination gives the positions of the dividers; the differences between successive divider positions (minus one) give the partition sizes. Here's the code:
import itertools

def partitions(n, k):
    for c in itertools.combinations(range(n+k-1), k-1):
        yield [b-a-1 for a, b in zip((-1,)+c, c+(n+k-1,))]

# Example usage
for p in partitions(5, 3):
    print(p)
And here's the output from running the above code.
[0, 0, 5]
[0, 1, 4]
[0, 2, 3]
[0, 3, 2]
[0, 4, 1]
[0, 5, 0]
[1, 0, 4]
[1, 1, 3]
[1, 2, 2]
[1, 3, 1]
[1, 4, 0]
[2, 0, 3]
[2, 1, 2]
[2, 2, 1]
[2, 3, 0]
[3, 0, 2]
[3, 1, 1]
[3, 2, 0]
[4, 0, 1]
[4, 1, 0]
[5, 0, 0]
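As a sanity check on the correspondence, the number of generated partitions can be compared with the binomial coefficient from the theorem. This sketch repeats the partitions function so that it runs on its own, and assumes Python 3.8+ for math.comb.

```python
import itertools
from math import comb


def partitions(n, k):
    # each combination gives the positions of the k-1 dividers among n+k-1 slots
    for c in itertools.combinations(range(n + k - 1), k - 1):
        yield [b - a - 1 for a, b in zip((-1,) + c, c + (n + k - 1,))]


parts = list(partitions(5, 3))
assert len(parts) == comb(5 + 3 - 1, 3 - 1)   # 21 partitions, as listed above
assert all(sum(p) == 5 for p in parts)        # every partition uses all 5 items
assert all(len(p) == 3 for p in parts)        # and has exactly 3 bins
```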
Another recursive variant, using a generator function, i.e. instead of right away printing the results, it yields them one after another, to be printed by the caller.
The way to convert your loops into a recursive algorithm is as follows:
identify the "base case": when there are no more bars, just print the stars
for any number of stars in the first segment, recursively determine the possible partitions of the rest, and combine them
You can also turn this into an algorithm to partition arbitrary sequences into chunks:
def partition(seq, n, min_size=0):
    if n == 0:
        yield [seq]
    else:
        for i in range(min_size, len(seq) - min_size * n + 1):
            for res in partition(seq[i:], n-1, min_size):
                yield [seq[:i]] + res
Example usage:
for res in partition("*****", 2):
    print "|".join(res)
Take it one step at a time.
First, remove the sum() calls. We don't need them:
N=5
for n in range(0,N):
    x=[1]*n
    for i in range(0,(n+1)): # len(x) == n
        for j in range(i,(n+1)):
            print i, j - i, n - j
Notice that x is an unused variable:
N=5
for n in range(0,N):
    for i in range(0,(n+1)):
        for j in range(i,(n+1)):
            print i, j - i, n - j
Time to generalize. The above algorithm is correct for n stars and three bins (that is, two bars), so we just need to generalize the number of bars.
Do this recursively. For the base case, we have either zero bars or zero stars, which are both trivial. For the recursive case, run through all the possible positions of the leftmost bar and recurse in each case:
from __future__ import print_function

def bars_and_stars(bars=3, stars=5, _prefix=''):
    if stars == 0:
        print(_prefix + ', '.join('0'*(bars+1)))
        return
    if bars == 0:
        print(_prefix + str(stars))
        return
    for i in range(stars+1):
        bars_and_stars(bars-1, stars-i, '{}{}, '.format(_prefix, i))
For bonus points, we could change range() to xrange(), but that will just give you trouble when you port to Python 3.
This can be solved recursively in the following approach:
#n bins, k stars,
def F(n,k):
    #n bins, k stars, list holds how many elements in current assignment
    def aux(n,k,list):
        if n == 0: #stop clause
            print list
        elif n==1: #making sure all stars are distributed
            list[0] = k
            aux(0,0,list)
        else: #"regular" recursion:
            for i in range(k+1):
                #the last bin has i stars, set them and recurse
                list[n-1] = i
                aux(n-1,k-i,list)
    aux(n,k,[0]*n)
The idea is to "guess" how many stars are in the last bin, assign them, and recurse to a smaller problem with fewer stars (minus the ones just assigned) and one fewer bin.
Note: It is easy to replace the line
print list
with any output format you desire when the number of stars in each bin is set.
Here is a nonrecursive algorithm that replicates the "bars and stars" nested loop approach. This assumes the bars all start on the right, and finish on the left (bins going from [x,0,0,...] to [0,0,..,x]). There will always be a zero in the first bin when a loop finishes, so you can follow the logic and match it to "bars and stars."
def combos(nbins, qty):
    bins = [0]*nbins
    bins[0] = qty #starting bin quantities
    while True:
        yield bins
        if bins[-1] == qty:
            return #last combo, we're done!
        #leftmost bar movement (inner loop)
        if bins[0] > 0:
            bins[0] -= 1
            bins[1] += 1
        else:
            #bump next bar in nested loops
            #i.e., find first nonzero entry, and split it
            nz = 1
            while bins[nz] == 0:
                nz +=1
            bins[0]=bins[nz]-1
            bins[nz+1] += 1
            bins[nz] = 0
Here is the result of 4 bins, quantity 3:
for m in combos(4, 3):
print(m)
[3, 0, 0, 0]
[2, 1, 0, 0]
[1, 2, 0, 0]
[0, 3, 0, 0]
[2, 0, 1, 0]
[1, 1, 1, 0]
[0, 2, 1, 0]
[1, 0, 2, 0]
[0, 1, 2, 0]
[0, 0, 3, 0]
[2, 0, 0, 1]
[1, 1, 0, 1]
[0, 2, 0, 1]
[1, 0, 1, 1]
[0, 1, 1, 1]
[0, 0, 2, 1]
[1, 0, 0, 2]
[0, 1, 0, 2]
[0, 0, 1, 2]
[0, 0, 0, 3]
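One caveat if you want to collect the results rather than print them one at a time: combos yields the same list object on every iteration and keeps modifying it, so list(combos(4, 3)) would contain twenty references to the final state. A minimal variation, sketched here under the hypothetical name combos_snapshots, yields an immutable copy instead; the stepping logic is unchanged.

```python
from math import comb


def combos_snapshots(nbins, qty):
    bins = [0] * nbins
    bins[0] = qty                       # starting bin quantities
    while True:
        yield tuple(bins)               # snapshot, safe to store
        if bins[-1] == qty:
            return                      # last combo, we're done
        if bins[0] > 0:
            # leftmost bar movement (inner loop)
            bins[0] -= 1
            bins[1] += 1
        else:
            # find the first nonzero entry and split it
            nz = 1
            while bins[nz] == 0:
                nz += 1
            bins[0] = bins[nz] - 1
            bins[nz + 1] += 1
            bins[nz] = 0


results = list(combos_snapshots(4, 3))
print(len(results), len(set(results)))  # 20 20
print(comb(4 + 3 - 1, 4 - 1))           # 20
```

The count matches the stars-and-bars formula for 3 items in 4 bins, and all stored results are distinct.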
I needed to solve the same problem and found this post, but I really wanted a non-recursive, general-purpose algorithm that didn't rely on itertools. I couldn't find one, so I came up with this.
By default, the generator produces the sequence in lexical order (as in the earlier recursive example), but it can also produce the reverse-order sequence when the "reversed" flag is set.
def StarsAndBars(bins, stars, reversed=False):
    if bins < 1 or stars < 1:
        raise ValueError("Number of bins and objects must both be greater than or equal to 1.")
    if bins == 1:
        yield stars,
        return
    bars = [ ([0] * bins + [ stars ], 1) ]
    if reversed:
        while len(bars)>0:
            b = bars.pop()
            if b[1] == bins:
                yield tuple(b[0][y] - b[0][y-1] for y in range(1, bins+1))
            else:
                bar = b[0][:b[1]]
                for x in range(b[0][b[1]], stars+1):
                    newBar = bar + [ x ] * (bins - b[1]) + [ stars ]
                    bars.append( (newBar, b[1]+1) )
        bars = [ ([0] * bins + [ stars ], 1) ]
    else:
        while len(bars)>0:
            newBars = []
            for b in bars:
                for x in range(b[0][-2], stars+1):
                    newBar = b[0][1:bins] + [ x, stars ]
                    if b[1] < bins-1 and x > 0:
                        newBars.append( (newBar, b[1]+1) )
                    yield tuple(newBar[y] - newBar[y-1] for y in range(1, bins+1))
            bars = newBars
This problem can also be solved somewhat less verbosely than the previous answers with a list comprehension:
from numpy import array as ar
from itertools import product
M = 3  # number of stars
N = 3  # number of bins
decompositions = ar([ar(i) for i in product(range(M+1), repeat=N) if sum(i)==M])
Here itertools.product() produces the Cartesian product of range(M+1) with itself, taken N times (repeat=N). The if clause removes the combinations whose numbers don't add up to the number of stars; for example, one of the rejected combinations is [0,0,0].
If we're happy with a list of lists then we can simply remove the array() calls (imported as ar for brevity in the example). Here's an example output for 3 stars in 3 bins:
array([[0, 0, 3],
[0, 1, 2],
[0, 2, 1],
[0, 3, 0],
[1, 0, 2],
[1, 1, 1],
[1, 2, 0],
[2, 0, 1],
[2, 1, 0],
[3, 0, 0]])
I hope this answer helps!
Since I found the code in most answers quite hard to follow (I kept asking myself how the shown algorithms relate to the actual stars and bars problem), let's do this step by step:
First we define a function to insert a bar | into a string stars at a given position p:
def insert_bar(stars, p):
    head, tail = stars[:p], stars[p:]
    return head + '|' + tail
Usage:
insert_bar('***', 1) # returns '*|**'
To insert multiple bars at different positions e.g. (1,3) a simple way is to use reduce (from functools)
reduce(insert_bar, (1,3), '***') # returns '*|*|*'
If we branch the definition of insert_bar to handle both cases we get a nice and reusable function to insert any number of bars into a string of stars
from functools import reduce

def insert_bars(stars, p):
    if type(p) is int:
        head, tail = stars[:p], stars[p:]
        return head + '|' + tail
    else:
        return reduce(insert_bar, p, stars)
As @Mark Dickinson explained in his answer, itertools.combinations lets us produce the n+k-1 choose k-1 combinations of bar positions.
What is now left to do is to create a string of '*' of length n, insert the bars at the given positions, split the string at the bars and calculate the length of each resulting bin. The implementation below is thus literally a verbatim translation of the problem statement into code
import itertools

def partitions(n, k):
    for positions in itertools.combinations(range(n+k-1), k-1):
        yield [len(bin) for bin in insert_bars(n*"*", positions).split('|')]
Anyone looking for the specific case of k=2 can save a lot of time by simply creating a range and stacking it with its reverse. Comparing against the accepted answer:
n = 500000
%timeit np.array([[i,j] for i,j in partitions(n,2)])
>>> 396 ms ± 13.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%%timeit
rng = np.arange(n+1)
np.vstack([rng, rng[::-1]]).T
>>> 2.91 ms ± 190 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
And they are indeed equivalent.
it2k = np.array([[i,j] for i,j in partitions(n,2)])
rng = np.arange(n+1)
np2k = np.vstack([rng, rng[::-1]]).T
(np2k == it2k).all()
>>> True
This program needs to find all permutations of a list by swapping elements using this rule: swap the last element until it becomes first (for example 1, 2, 3, 4 becomes 1, 2, 4, 3 and so on until 4, 1, 2, 3); when it becomes the first element, switch the last 2 elements and do the same thing in the opposite direction (swap the first element until it becomes last, then swap the first 2 elements and repeat). This is also known as the Steinhaus - Johnson - Trotter algorithm. For some reason my implementation isn't working in Python, and I'd like to know why and what I need to do to make it work.
EDIT: By "not working" I mean that the program only prints list1 and does nothing else. The program can only be closed by "killing" it, which means that it is stuck in an infinite loop (this can be proven by printing all_permutations after appending list1 to all_permutations).
list1 = [0, 1, 2, 3] #list that will be swapped
x = 3 #this is used for swapping
all_permutations = [] #list where permutations will be added
print(list1) #print list1 because it is the first permutation
while len(all_permutations) != 23: #loop until all permutations are found (4! = 24 but since list1 is already 1 permutation we only need 23)
    x -= 1
    list1[x], list1[x+1] = list1[x+1], list1[x]
    all_permutations.append(list1)
    #code above swaps the last element until it becomes 1st in the list
    if x == 0: #if last element becomes 1st
        list1[2], list1[3] = list1[3], list1[2] #swap last 2 elements
        while x != 3: #loop which swaps 1st element until it becomes the last element
            if len(all_permutations) == 23:
                break
            else:
                continue
            x += 1
            list1[x-1], list1[x] = list1[x], list1[x-1]
            all_permutations.append(list1)
        list1[0], list1[1] = list1[1], list1[0] #when loop is over (when 1st element becomes last) switch first 2 elements
        all_permutations.append(list1)
    else:
        continue
print(all_permutations) #print all permutations
while x != 3:
    if len(all_permutations) == 23:
        break
    else:
        continue
This piece of code right here will result in an infinite loop. If the length of all_permutations is not 23, it will hit the continue statement, which sends the program back to the beginning of the loop without modifying x or all_permutations.
I believe what you are looking for here is pass, which does nothing, whereas continue jumps back to the beginning of the loop. To fix this part of your program you can simply get rid of the else altogether: the break exits the loop anyway, so the else is not needed.
while x != 3:
    if len(all_permutations) == 23:
        break
    x += 1
    list1[x-1], list1[x] = list1[x], list1[x-1]
    all_permutations.append(list1)
Or you could eliminate the if altogether (note the and: the loop must stop as soon as either condition fails):
while x != 3 and len(all_permutations) != 23:
    x += 1
    list1[x-1], list1[x] = list1[x], list1[x-1]
    all_permutations.append(list1)
You are adding multiple references to the same list object to all_permutations, and that list object is modified each time through the loop. Instead, add a copy of the list so that you have a collection of distinct permutations.
all_permutations.append(list1[:])
This is one error; the infinite loop is due to the problem pointed out by IanAuld.
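The aliasing problem is easy to reproduce in isolation. The following is a small sketch with made-up variable names (aliased, copied), not the asker's program: it appends the same list twice, once by reference and once as a copy, and compares the results.

```python
list1 = [1, 2, 3]
aliased = []   # receives references to the one list object
copied = []    # receives an independent snapshot each time

for _ in range(2):
    list1[0], list1[1] = list1[1], list1[0]   # mutate list1 in place
    aliased.append(list1)                     # same object every time
    copied.append(list1[:])                   # slice copy freezes current state

print(aliased)   # [[1, 2, 3], [1, 2, 3]]  both entries show the final state
print(copied)    # [[2, 1, 3], [1, 2, 3]]  two distinct snapshots
```

Because every entry of aliased is the same object, any later swap changes all of them at once, which is why the printed result looked like one permutation repeated.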
The reason that your new code at http://pastebin.com/bY7ZznR1 gets stuck in an infinite loop is that when len(all_permutations) == 23 becomes True in the inner while loop, you then append another list on line 30. So when control gets back to the top of the outer loop, len(all_permutations) == 24, the test len(all_permutations) != 23 is still true, and the loop continues to execute.
That's easy enough to fix, however your algorithm isn't quite correct. I've modified your code to generate permutations of lists of arbitrary size, and noticed that it gives the right results for lists of length 3 or 4, but not for lists of length 2 or 5; I didn't bother testing other sizes.
FWIW, here's a program that implements the recursive version of the Steinhaus - Johnson - Trotter algorithm. You may find it useful if you want to improve your iterative algorithm.
#!/usr/bin/env python
''' Generate permutations using the Steinhaus - Johnson - Trotter algorithm

    This generates permutations in the order known to bell ringers as
    "plain changes".

    See https://en.wikipedia.org/wiki/Steinhaus%E2%80%93Johnson%E2%80%93Trotter_algorithm
    From http://stackoverflow.com/q/31209826/4014959

    Written by PM 2Ring 2015.07.03
'''
import sys

def sjt_permute(items):
    num = len(items)
    if num == 1:
        yield items[:1]
        return
    last = items[-1:]
    uprange = range(num)
    dnrange = uprange[::-1]
    descend = True
    for perm in sjt_permute(items[:-1]):
        #insert the last item at every position, alternating direction
        rng = dnrange if descend else uprange
        for i in rng:
            yield perm[:i] + last + perm[i:]
        descend = not descend

def main():
    num = int(sys.argv[1]) if len(sys.argv) > 1 else 4
    items = list(range(num)) #list() so that slicing and + also work on Python 3
    for p in sjt_permute(items):
        print(p)

if __name__ == '__main__':
    main()
output
[0, 1, 2, 3]
[0, 1, 3, 2]
[0, 3, 1, 2]
[3, 0, 1, 2]
[3, 0, 2, 1]
[0, 3, 2, 1]
[0, 2, 3, 1]
[0, 2, 1, 3]
[2, 0, 1, 3]
[2, 0, 3, 1]
[2, 3, 0, 1]
[3, 2, 0, 1]
[3, 2, 1, 0]
[2, 3, 1, 0]
[2, 1, 3, 0]
[2, 1, 0, 3]
[1, 2, 0, 3]
[1, 2, 3, 0]
[1, 3, 2, 0]
[3, 1, 2, 0]
[3, 1, 0, 2]
[1, 3, 0, 2]
[1, 0, 3, 2]
[1, 0, 2, 3]
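The defining property of "plain changes" is that each permutation differs from the previous one by a single swap of two adjacent elements. Here is a sketch that checks this on the output above. It repeats sjt_permute so that it runs on its own; is_adjacent_swap is a helper name I made up for the check.

```python
def sjt_permute(items):
    """Recursive Steinhaus-Johnson-Trotter; items must be a non-empty list."""
    num = len(items)
    if num == 1:
        yield items[:1]
        return
    last = items[-1:]
    uprange = range(num)
    dnrange = uprange[::-1]
    descend = True
    for perm in sjt_permute(items[:-1]):
        rng = dnrange if descend else uprange
        for i in rng:
            yield perm[:i] + last + perm[i:]
        descend = not descend


def is_adjacent_swap(p, q):
    """True if q is p with exactly two neighbouring elements exchanged."""
    diff = [i for i in range(len(p)) if p[i] != q[i]]
    return (len(diff) == 2
            and diff[1] == diff[0] + 1
            and p[diff[0]] == q[diff[1]]
            and p[diff[1]] == q[diff[0]])


perms = list(sjt_permute([0, 1, 2, 3]))
print(len(perms))                                   # 24
print(all(is_adjacent_swap(p, q)
          for p, q in zip(perms, perms[1:])))       # True
```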
What I'm looking to do is find a way that I can have my code return all the combinations of values from a list that add to a variable, returning each answer as a list. For instance,
target_number = 8
usingnumbers = [1, 2, 4, 8]
returns:
[8]
[4, 4]
[4, 2, 2]
[4, 2, 1, 1]
[4, 1, 1, 1, 1]
[2, 2, 1, 1, 1, 1]
[2, 1, 1, 1, 1, 1, 1]
[1, 1, 1, 1, 1, 1, 1, 1]
And so on. I'd like repeated values to be discarded, for instance [4, 2, 2], [2, 4, 2], [2, 2, 4] are all technically valid, but I'd like just one of these to be shown. Ideally, I'd want the code to also return the number of times each number appears in each list, but I'm sure I can do that for myself.
In pseudocode:
subtract the largest number in your list from your target number, keeping track of the number you started with
loop until you can't subtract any more
move on to the second largest number, and so on
start this cycle again, but begin with the next number smaller than the one the last cycle started with.
Not that difficult.
I'm not going to write the code for you, but here is the main idea:
Function F(n, (k1, k2, ..., km)) returns the set of lists of numbers taken from (k1, ..., km) that sum to n:
{(a11, ..., a1i), (a21, ..., a2i), ..., (aj1, ..., aji)}
It satisfies the recurrence relation:
F(n, (k1, k2, ..., km)) = union(
    (k1) (+) F(n - k1, (k1, k2, ..., km)),
    (k2) (+) F(n - k2, (k2, k3, ..., km)),
    ...
    (km) (+) F(n - km, (km))
)
The operation a (+) b means 'prepend a to each item of b'.
There are a few corner cases to handle (for example n = 0 and n < 0), but that's up to you.
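For anyone who wants to see the recurrence above in action, here is one possible sketch. The name sums_to is mine, and I am assuming all the numbers are positive integers (a zero in the list would make the recursion loop forever). Sorting largest-first is only there to reproduce the order shown in the question.

```python
def sums_to(target, numbers):
    """Yield every multiset of `numbers` that sums to `target`, as a list.

    Assumes the numbers are positive integers. Each list is non-increasing,
    so reorderings such as [2, 4, 2] are never produced.
    """
    pool = sorted(set(numbers), reverse=True)

    def rec(remaining, start):
        if remaining == 0:          # corner case: exact hit, one empty tail
            yield []
            return
        for idx in range(start, len(pool)):
            k = pool[idx]
            if k <= remaining:      # corner case: skip values that overshoot
                # (k) (+) F(remaining - k, pool[idx:])
                for rest in rec(remaining - k, idx):
                    yield [k] + rest

    return rec(target, 0)


for combo in sums_to(8, [1, 2, 4, 8]):
    print(combo)
```

Passing idx (rather than 0) into the recursive call is what restricts later picks to numbers no larger than the current one, which is how the duplicates are avoided.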
This is a complete solution to the problem. The whole function is a generator. The first for loop keeps using the first coin in coins_available (the smallest, if the list is sorted in ascending order); the second discards that coin, so the next larger one becomes the basis of the recursive call. If the sum of the current coins equals the given number, the list of coins is yielded; if the sum is larger than the number, that list is discarded.
def changes(number, coins_available, coins_current):
    if sum(coins_current) == number:
        yield coins_current
    elif sum(coins_current) > number:
        pass
    elif coins_available == []:
        pass
    else:
        for c in changes(number, coins_available[:], coins_current + [coins_available[0]]):
            yield c
        for c in changes(number, coins_available[1:], coins_current):
            yield c

n = 40
coins = [1,2,5,10,20,50,100]
solutions = [sol for sol in changes(n, coins, [])]
for sol in solutions:
    print sol
print 'least coins used solution:', min(solutions, key=len)
print 'number of solutions', len(solutions)
I have the following "bars and stars" algorithm, implemented in Python, which prints out all decompositions of a sum into 3 bins, for sums going from 0 to 5.
I'd like to generalise my code so that it works with N bins (where N is less than the max sum, i.e. 5 here).
The pattern is: if you have 3 bins you need 2 nested loops; if you have N bins you need N-1 nested loops.
Can someone think of a generic way of writing this, possibly not using loops?
# bars and stars algorithm
N=5
for n in range(0,N):
    x=[1]*n
    for i in range(0,(len(x)+1)):
        for j in range(i,(len(x)+1)):
            print sum(x[0:i]), sum(x[i:j]), sum(x[j:len(x)])
If this isn't simply a learning exercise, then it's not necessary for you to roll your own algorithm to generate the partitions: Python's standard library already has most of what you need, in the form of the itertools.combinations function.
From Theorem 2 on the Wikipedia page you linked to, there are n+k-1 choose k-1 ways of partitioning n items into k bins, and the proof of that theorem gives an explicit correspondence between the combinations and the partitions. So all we need is (1) a way to generate those combinations, and (2) code to translate each combination to the corresponding partition. The itertools.combinations function already provides the first ingredient. For the second, each combination gives the positions of the dividers; the differences between successive divider positions (minus one) give the partition sizes. Here's the code:
import itertools

def partitions(n, k):
    for c in itertools.combinations(range(n+k-1), k-1):
        yield [b-a-1 for a, b in zip((-1,)+c, c+(n+k-1,))]

# Example usage
for p in partitions(5, 3):
    print(p)
And here's the output from running the above code.
[0, 0, 5]
[0, 1, 4]
[0, 2, 3]
[0, 3, 2]
[0, 4, 1]
[0, 5, 0]
[1, 0, 4]
[1, 1, 3]
[1, 2, 2]
[1, 3, 1]
[1, 4, 0]
[2, 0, 3]
[2, 1, 2]
[2, 2, 1]
[2, 3, 0]
[3, 0, 2]
[3, 1, 1]
[3, 2, 0]
[4, 0, 1]
[4, 1, 0]
[5, 0, 0]
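To see the divider-to-partition translation on a single combination, and to sanity-check the count against the n+k-1 choose k-1 formula, here is a short sketch. The helper name dividers_to_sizes is made up for illustration, and math.comb needs Python 3.8 or later.

```python
import itertools
from math import comb   # Python 3.8+


def dividers_to_sizes(c, n, k):
    """Turn divider positions c (a tuple) into the k bin sizes."""
    # Sentinel dividers at -1 and n+k-1 bracket the first and last bins;
    # the gap between neighbouring dividers, minus one, is the bin size.
    return [b - a - 1 for a, b in zip((-1,) + c, c + (n + k - 1,))]


def partitions(n, k):
    for c in itertools.combinations(range(n + k - 1), k - 1):
        yield dividers_to_sizes(c, n, k)


# Dividers at positions 1 and 4 of the 7 slots: * | * * | * *
print(dividers_to_sizes((1, 4), 5, 3))                   # [1, 2, 2]

result = list(partitions(5, 3))
print(len(result), comb(5 + 3 - 1, 3 - 1))               # 21 21
```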
Another recursive variant, using a generator function, i.e. instead of right away printing the results, it yields them one after another, to be printed by the caller.
The way to convert your loops into a recursive algorithm is as follows:
identify the "base case": when there are no more bars, just print the stars
for any number of stars in the first segment, recursively determine the possible partitions of the rest, and combine them
You can also turn this into an algorithm to partition arbitrary sequences into chunks:
def partition(seq, n, min_size=0):
    if n == 0:
        yield [seq]
    else:
        for i in range(min_size, len(seq) - min_size * n + 1):
            for res in partition(seq[i:], n-1, min_size):
                yield [seq[:i]] + res
Example usage:
for res in partition("*****", 2):
    print "|".join(res)
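The two steps described above (base case, then "pick the first segment and recurse on the rest") can also be sketched directly in terms of star counts rather than sequences. star_counts is a hypothetical name, and here the second argument is the number of bins, not the number of bars.

```python
def star_counts(stars, bins):
    """Yield every way to put `stars` identical stars into `bins` bins."""
    if bins == 1:
        # Base case: no bars left, the single bin takes all remaining stars.
        yield (stars,)
        return
    for first in range(stars + 1):
        # Fix the size of the first bin, then partition what is left.
        for rest in star_counts(stars - first, bins - 1):
            yield (first,) + rest


for counts in star_counts(2, 2):
    print(counts)
# (0, 2)
# (1, 1)
# (2, 0)
```

Since it is a generator, the caller decides what to do with each result (print it, collect it, count it), which is the main advantage over printing inside the recursion.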
Take it one step at a time.
First, remove the sum() calls. We don't need them:
N=5
for n in range(0,N):
    x=[1]*n
    for i in range(0,(n+1)): # len(x) == n
        for j in range(i,(n+1)):
            print i, j - i, n - j
Notice that x is an unused variable:
N=5
for n in range(0,N):
    for i in range(0,(n+1)):
        for j in range(i,(n+1)):
            print i, j - i, n - j
Time to generalize. The above algorithm is correct for any number of stars and two bars (three bins), so we just need to generalize the number of bars.
Do this recursively. For the base case, we have either zero bars or zero stars, which are both trivial. For the recursive case, run through all the possible positions of the leftmost bar and recurse in each case:
from __future__ import print_function

def bars_and_stars(bars=3, stars=5, _prefix=''):
    if stars == 0:
        print(_prefix + ', '.join('0'*(bars+1)))
        return
    if bars == 0:
        print(_prefix + str(stars))
        return
    for i in range(stars+1):
        bars_and_stars(bars-1, stars-i, '{}{}, '.format(_prefix, i))
For bonus points, we could change range() to xrange(), but that will just give you trouble when you port to Python 3.
This can be solved recursively in the following approach:
#n bins, k stars,
def F(n,k):
    #n bins, k stars, list holds how many elements in current assignment
    def aux(n,k,list):
        if n == 0: #stop clause
            print list
        elif n==1: #making sure all stars are distributed
            list[0] = k
            aux(0,0,list)
        else: #"regular" recursion:
            for i in range(k+1):
                #the last bin has i stars, set them and recurse
                list[n-1] = i
                aux(n-1,k-i,list)
    aux(n,k,[0]*n)
The idea is to "guess" how many stars are in the last bin, assign them, and recurse on a smaller problem with fewer stars (minus those just assigned) and one fewer bin.
Note: It is easy to replace the line
print list
with any output format you desire when the number of stars in each bin is set.
Here is a nonrecursive algorithm that replicates the "bars and stars" nested loop approach. This assumes the bars all start on the right, and finish on the left (bins going from [x,0,0,...] to [0,0,..,x]). There will always be a zero in the first bin when a loop finishes, so you can follow the logic and match it to "bars and stars."
def combos(nbins, qty):
    bins = [0]*nbins
    bins[0] = qty #starting bin quantities
    while True:
        yield bins[:] #yield a copy so that stored results are not overwritten later
        if bins[-1] == qty:
            return #last combo, we're done!
        #leftmost bar movement (inner loop)
        if bins[0] > 0:
            bins[0] -= 1
            bins[1] += 1
        else:
            #bump next bar in nested loops
            #i.e., find first nonzero entry, and split it
            nz = 1
            while bins[nz] == 0:
                nz +=1
            bins[0]=bins[nz]-1
            bins[nz+1] += 1
            bins[nz] = 0
Here is the result of 4 bins, quantity 3:
for m in combos(4, 3):
    print(m)
[3, 0, 0, 0]
[2, 1, 0, 0]
[1, 2, 0, 0]
[0, 3, 0, 0]
[2, 0, 1, 0]
[1, 1, 1, 0]
[0, 2, 1, 0]
[1, 0, 2, 0]
[0, 1, 2, 0]
[0, 0, 3, 0]
[2, 0, 0, 1]
[1, 1, 0, 1]
[0, 2, 0, 1]
[1, 0, 1, 1]
[0, 1, 1, 1]
[0, 0, 2, 1]
[1, 0, 0, 2]
[0, 1, 0, 2]
[0, 0, 1, 2]
[0, 0, 0, 3]
I needed to solve the same problem and found this post, but I really wanted a non-recursive general-purpose algorithm that didn't rely on itertools and couldn't find one, so came up with this.
By default, the generator produces the sequence in lexical order (as in the earlier recursive example), but it can also produce the reverse-order sequence by setting the "reversed" flag.
def StarsAndBars(bins, stars, reversed=False):
    if bins < 1 or stars < 1:
        raise ValueError("Number of bins and objects must both be greater than or equal to 1.")
    if bins == 1:
        yield stars,
        return
    bars = [ ([0] * bins + [ stars ], 1) ]
    if reversed:
        while len(bars)>0:
            b = bars.pop()
            if b[1] == bins:
                yield tuple(b[0][y] - b[0][y-1] for y in range(1, bins+1))
            else:
                bar = b[0][:b[1]]
                for x in range(b[0][b[1]], stars+1):
                    newBar = bar + [ x ] * (bins - b[1]) + [ stars ]
                    bars.append( (newBar, b[1]+1) )
        bars = [ ([0] * bins + [ stars ], 1) ]
    else:
        while len(bars)>0:
            newBars = []
            for b in bars:
                for x in range(b[0][-2], stars+1):
                    newBar = b[0][1:bins] + [ x, stars ]
                    if b[1] < bins-1 and x > 0:
                        newBars.append( (newBar, b[1]+1) )
                    yield tuple(newBar[y] - newBar[y-1] for y in range(1, bins+1))
            bars = newBars
This problem can also be solved somewhat less verbosely than the previous answers with a list comprehension:
from numpy import array as ar
from itertools import product
M = 3 #number of stars
N = 3 #number of bins
decompositions = ar([ar(i) for i in product(range(M+1), repeat=N) if sum(i)==M])
Here itertools.product() produces an iterator over the Cartesian product of range(M+1) with itself, repeated N times (repeat=N). The if clause removes the combinations whose numbers don't add up to the number of stars; for example, one of the discarded combinations is 0 with 0 with 0, or [0,0,0].
If we're happy with a list of lists then we can simply remove the array() calls (imported as ar for brevity in the example). Here's an example output for 3 stars in 3 bins:
array([[0, 0, 3],
[0, 1, 2],
[0, 2, 1],
[0, 3, 0],
[1, 0, 2],
[1, 1, 1],
[1, 2, 0],
[2, 0, 1],
[2, 1, 0],
[3, 0, 0]])
I hope this answer helps!
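One thing to keep in mind with the filter-the-product approach is how many candidates get generated and thrown away. The sketch below (plain lists, no numpy, variable names my own) counts both for the 3 stars / 3 bins example; math.comb needs Python 3.8 or later.

```python
from itertools import product
from math import comb   # Python 3.8+

M, N = 3, 3   # stars, bins (same values as the example output)

# Every N-tuple with entries in 0..M: (M+1)**N candidates in total.
candidates = list(product(range(M + 1), repeat=N))

# Only those that sum to M are valid decompositions.
kept = [c for c in candidates if sum(c) == M]

print(len(candidates))                       # 64
print(len(kept), comb(M + N - 1, N - 1))     # 10 10
```

For small M and N this hardly matters, but the number of candidates grows as (M+1)**N while the number of valid decompositions is only M+N-1 choose N-1, so the approaches that generate the decompositions directly should scale better.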
Since I found the code in most answers quite hard to follow (I kept asking myself how the shown algorithms relate to the actual problem of stars and bars), let's do this step by step:
First we define a function to insert a bar | into a string stars at a given position p:
def insert_bar(stars, p):
    head, tail = stars[:p], stars[p:]
    return head + '|' + tail
Usage:
insert_bar('***', 1) # returns '*|**'
To insert multiple bars at different positions e.g. (1,3) a simple way is to use reduce (from functools)
reduce(insert_bar, (1,3), '***') # returns '*|*|*'
If we branch the definition of insert_bar to handle both cases we get a nice and reusable function to insert any number of bars into a string of stars
def insert_bars(stars, p):
    if type(p) is int:
        head, tail = stars[:p], stars[p:]
        return head + '|' + tail
    else:
        return reduce(insert_bar, p, stars)
As @Mark Dickinson explained in his answer, itertools.combinations lets us produce the n+k-1 choose k-1 combinations of bar positions.
What is now left to do is to create a string of '*' of length n, insert the bars at the given positions, split the string at the bars and calculate the length of each resulting bin. The implementation below is thus an almost literal translation of the problem statement into code:
def partitions(n, k):
    for positions in itertools.combinations(range(n+k-1), k-1):
        yield [len(bin) for bin in insert_bars(n*"*", positions).split('|')]
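For convenience, here are the pieces above assembled into one runnable sketch for Python 3, where reduce has to be imported from functools. I dropped the int branch of insert_bars and renamed the loop variable from bin to segment (to avoid shadowing the builtin); otherwise it follows the steps as described.

```python
import itertools
from functools import reduce


def insert_bar(stars, p):
    """Insert a single bar into the string `stars` at position p."""
    head, tail = stars[:p], stars[p:]
    return head + '|' + tail


def insert_bars(stars, positions):
    """Insert one bar per position, working left to right."""
    return reduce(insert_bar, positions, stars)


def partitions(n, k):
    # Each combination gives the k-1 bar positions in a string of n+k-1 slots.
    for positions in itertools.combinations(range(n + k - 1), k - 1):
        barred = insert_bars(n * '*', positions)
        yield [len(segment) for segment in barred.split('|')]


print(insert_bars('***', (1, 3)))     # *|*|*
print(list(partitions(2, 2)))         # [[0, 2], [1, 1], [2, 0]]
```

The positions come out of combinations in ascending order, which is what makes the left-to-right insertion land each bar at its final index.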
Anyone looking for the specific case of k=2 can save a lot of time by simply creating a range and stacking it with its reverse. Here is a timing comparison against the accepted answer:
n = 500000
%timeit np.array([[i,j] for i,j in partitions(n,2)])
>>> 396 ms ± 13.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%%timeit
rng = np.arange(n+1)
np.vstack([rng, rng[::-1]]).T
>>> 2.91 ms ± 190 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
And they are indeed equivalent.
it2k = np.array([[i,j] for i,j in partitions(n,2)])
rng = np.arange(n+1)
np2k = np.vstack([rng, rng[::-1]]).T
(np2k == it2k).all()
>>> True