I've got the following "bars and stars" algorithm, implemented in Python, which prints out all decompositions of a sum into 3 bins, for sums going from 0 to 5.
I'd like to generalise my code so it works with N bins (where N is less than the max sum, i.e. 5 here).
The pattern is if you have 3 bins you need 2 nested loops, if you have N bins you need N-1 nested loops.
Can someone think of a generic way of writing this, possibly not using loops?
# bars and stars algorithm
N=5
for n in range(0,N):
    x=[1]*n
    for i in range(0,(len(x)+1)):
        for j in range(i,(len(x)+1)):
            print sum(x[0:i]), sum(x[i:j]), sum(x[j:len(x)])
If this isn't simply a learning exercise, then it's not necessary for you to roll your own algorithm to generate the partitions: Python's standard library already has most of what you need, in the form of the itertools.combinations function.
From Theorem 2 on the Wikipedia page you linked to, there are n+k-1 choose k-1 ways of partitioning n items into k bins, and the proof of that theorem gives an explicit correspondence between the combinations and the partitions. So all we need is (1) a way to generate those combinations, and (2) code to translate each combination to the corresponding partition. The itertools.combinations function already provides the first ingredient. For the second, each combination gives the positions of the dividers; the differences between successive divider positions (minus one) give the partition sizes. Here's the code:
import itertools
def partitions(n, k):
    for c in itertools.combinations(range(n+k-1), k-1):
        yield [b-a-1 for a, b in zip((-1,)+c, c+(n+k-1,))]
# Example usage
for p in partitions(5, 3):
    print(p)
And here's the output from running the above code.
[0, 0, 5]
[0, 1, 4]
[0, 2, 3]
[0, 3, 2]
[0, 4, 1]
[0, 5, 0]
[1, 0, 4]
[1, 1, 3]
[1, 2, 2]
[1, 3, 1]
[1, 4, 0]
[2, 0, 3]
[2, 1, 2]
[2, 2, 1]
[2, 3, 0]
[3, 0, 2]
[3, 1, 1]
[3, 2, 0]
[4, 0, 1]
[4, 1, 0]
[5, 0, 0]
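As a quick sanity check on the count given by the theorem, the sketch below compares the number of generated partitions with the binomial coefficient. It restates partitions so that it runs on its own, and it assumes Python 3.8 or later for math.comb.

```python
import itertools
import math

def partitions(n, k):
    """Yield every way to split n items into k ordered bins (bins may be empty)."""
    for c in itertools.combinations(range(n + k - 1), k - 1):
        # c holds the divider positions; the gaps between dividers are the bin sizes
        yield [b - a - 1 for a, b in zip((-1,) + c, c + (n + k - 1,))]

results = list(partitions(5, 3))
# the theorem predicts (n+k-1 choose k-1) partitions
assert len(results) == math.comb(5 + 3 - 1, 3 - 1)
# every partition uses all n items
assert all(sum(p) == 5 for p in results)
print(len(results))
```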
Another recursive variant, using a generator function, i.e. instead of right away printing the results, it yields them one after another, to be printed by the caller.
The way to convert your loops into a recursive algorithm is as follows:
identify the "base case": when there are no more bars, just print the stars
for any number of stars in the first segment, recursively determine the possible partitions of the rest, and combine them
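The two steps above can be sketched directly for bin counts. This is only an illustration: the name compositions is made up here, and the function assumes at least one bin.

```python
def compositions(stars, bins):
    """Yield tuples of `bins` non-negative integers that sum to `stars`."""
    if bins == 1:
        # base case: no bars left, so the last bin takes every remaining star
        yield (stars,)
        return
    for first in range(stars + 1):
        # fix the size of the first bin, then split the rest recursively
        for rest in compositions(stars - first, bins - 1):
            yield (first,) + rest

for c in compositions(2, 3):
    print(c)
```

Because it is a generator, the caller decides whether to print the results, collect them in a list, or stop early.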
You can also turn this into an algorithm to partition arbitrary sequences into chunks:
def partition(seq, n, min_size=0):
    if n == 0:
        yield [seq]
    else:
        for i in range(min_size, len(seq) - min_size * n + 1):
            for res in partition(seq[i:], n-1, min_size):
                yield [seq[:i]] + res
Example usage:
for res in partition("*****", 2):
    print "|".join(res)
Take it one step at a time.
First, remove the sum() calls. We don't need them:
N=5
for n in range(0,N):
    x=[1]*n
    for i in range(0,(n+1)): # len(x) == n
        for j in range(i,(n+1)):
            print i, j - i, n - j
Notice that x is an unused variable:
N=5
for n in range(0,N):
    for i in range(0,(n+1)):
        for j in range(i,(n+1)):
            print i, j - i, n - j
Time to generalize. The above algorithm is correct for two bars (that is, three bins), so we just need to generalize the number of bars.
Do this recursively. For the base case, we have either zero bars or zero stars, which are both trivial. For the recursive case, run through all the possible positions of the leftmost bar and recurse in each case:
from __future__ import print_function
def bars_and_stars(bars=3, stars=5, _prefix=''):
    if stars == 0:
        print(_prefix + ', '.join('0'*(bars+1)))
        return
    if bars == 0:
        print(_prefix + str(stars))
        return
    for i in range(stars+1):
        bars_and_stars(bars-1, stars-i, '{}{}, '.format(_prefix, i))
For bonus points, we could change range() to xrange(), but that will just give you trouble when you port to Python 3.
This can be solved recursively in the following approach:
#n bins, k stars,
def F(n,k):
    #n bins, k stars, list holds how many elements in current assignment
    def aux(n,k,list):
        if n == 0: #stop clause
            print list
        elif n==1: #making sure all stars are distributed
            list[0] = k
            aux(0,0,list)
        else: #"regular" recursion:
            for i in range(k+1):
                #the last bin has i stars, set them and recurse
                list[n-1] = i
                aux(n-1,k-i,list)
    aux(n,k,[0]*n)
The idea is to "guess" how many stars are in the last bin, assign them, and recurse to a smaller problem with fewer stars (as many fewer as were assigned) and one fewer bin.
Note: It is easy to replace the line
print list
with any output format you desire when the number of stars in each bin is set.
Here is a nonrecursive algorithm that replicates the "bars and stars" nested loop approach. This assumes the bars all start on the right, and finish on the left (bins going from [x,0,0,...] to [0,0,..,x]). There will always be a zero in the first bin when a loop finishes, so you can follow the logic and match it to "bars and stars."
def combos(nbins, qty):
    bins = [0]*nbins
    bins[0] = qty #starting bin quantities
    while True:
        yield list(bins) #yield a copy so collected results are not overwritten later
        if bins[-1] == qty:
            return #last combo, we're done!
        #leftmost bar movement (inner loop)
        if bins[0] > 0:
            bins[0] -= 1
            bins[1] += 1
        else:
            #bump next bar in nested loops
            #i.e., find first nonzero entry, and split it
            nz = 1
            while bins[nz] == 0:
                nz +=1
            bins[0]=bins[nz]-1
            bins[nz+1] += 1
            bins[nz] = 0
Here is the result of 4 bins, quantity 3:
for m in combos(4, 3):
    print(m)
[3, 0, 0, 0]
[2, 1, 0, 0]
[1, 2, 0, 0]
[0, 3, 0, 0]
[2, 0, 1, 0]
[1, 1, 1, 0]
[0, 2, 1, 0]
[1, 0, 2, 0]
[0, 1, 2, 0]
[0, 0, 3, 0]
[2, 0, 0, 1]
[1, 1, 0, 1]
[0, 2, 0, 1]
[1, 0, 1, 1]
[0, 1, 1, 1]
[0, 0, 2, 1]
[1, 0, 0, 2]
[0, 1, 0, 2]
[0, 0, 1, 2]
[0, 0, 0, 3]
I needed to solve the same problem and found this post, but I really wanted a non-recursive general-purpose algorithm that didn't rely on itertools and couldn't find one, so came up with this.
By default, the generator produces the sequence in lexical order (as in the earlier recursive example), but it can also produce the reverse-order sequence by setting the "reversed" flag.
def StarsAndBars(bins, stars, reversed=False):
    if bins < 1 or stars < 1:
        raise ValueError("Number of bins and objects must both be greater than or equal to 1.")
    if bins == 1:
        yield stars,
        return
    bars = [ ([0] * bins + [ stars ], 1) ]
    if reversed:
        while len(bars)>0:
            b = bars.pop()
            if b[1] == bins:
                yield tuple(b[0][y] - b[0][y-1] for y in range(1, bins+1))
            else:
                bar = b[0][:b[1]]
                for x in range(b[0][b[1]], stars+1):
                    newBar = bar + [ x ] * (bins - b[1]) + [ stars ]
                    bars.append( (newBar, b[1]+1) )
        bars = [ ([0] * bins + [ stars ], 1) ]
    else:
        while len(bars)>0:
            newBars = []
            for b in bars:
                for x in range(b[0][-2], stars+1):
                    newBar = b[0][1:bins] + [ x, stars ]
                    if b[1] < bins-1 and x > 0:
                        newBars.append( (newBar, b[1]+1) )
                    yield tuple(newBar[y] - newBar[y-1] for y in range(1, bins+1))
            bars = newBars
This problem can also be solved somewhat less verbosely than the previous answers with a list comprehension:
from numpy import array as ar
from itertools import product
M = 3  # number of stars
N = 3  # number of bins
decompositions = ar([ar(i) for i in product(range(M+1), repeat=N) if sum(i)==M])
Here itertools.product() produces the Cartesian product of range(M+1) with itself, applied N times (repeat=N). The if clause removes the combinations whose numbers don't add up to the number of stars; for example, one of the combinations is 0 with 0 with 0, or [0,0,0].
If we're happy with a list of lists then we can simply remove the np.array() calls (written as ar for brevity in the example). Here's an example output for 3 stars in 3 bins:
array([[0, 0, 3],
[0, 1, 2],
[0, 2, 1],
[0, 3, 0],
[1, 0, 2],
[1, 1, 1],
[1, 2, 0],
[2, 0, 1],
[2, 1, 0],
[3, 0, 0]])
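One thing worth keeping in mind with this approach is that product() generates every tuple before the filter runs, so the number of candidates grows as (M+1)^N while only a small fraction survives. A rough sketch of that gap, using plain tuples instead of NumPy and assuming Python 3.8 or later for math.comb:

```python
from itertools import product
from math import comb

M, N = 3, 3  # stars and bins; example values chosen for illustration

candidates = (M + 1) ** N  # number of tuples that product() generates
kept = [t for t in product(range(M + 1), repeat=N) if sum(t) == M]

print(candidates, len(kept))
# the survivors match the stars-and-bars count
assert len(kept) == comb(M + N - 1, N - 1)
```

For small M and N the difference does not matter, but for larger inputs an approach based on itertools.combinations avoids generating the rejected tuples at all.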
I hope this answer helps!
Since I found the code in most answers quite hard to follow, i.e. I kept asking myself how the shown algorithms relate to the actual problem of stars and bars, let's do this step by step:
First we define a function to insert a bar | into a string stars at a given position p:
def insert_bar(stars, p):
    head, tail = stars[:p], stars[p:]
    return head + '|' + tail
Usage:
insert_bar('***', 1) # returns '*|**'
To insert multiple bars at different positions e.g. (1,3) a simple way is to use reduce (from functools)
reduce(insert_bar, (1,3), '***') # returns '*|*|*'
If we branch the definition of insert_bar to handle both cases we get a nice and reusable function to insert any number of bars into a string of stars
def insert_bars(stars, p):
    if type(p) is int:
        head, tail = stars[:p], stars[p:]
        return head + '|' + tail
    else:
        return reduce(insert_bar, p, stars)
As Mark Dickinson explained in his answer, itertools.combinations lets us produce the n+k-1 choose k-1 combinations of bar positions.
What is now left to do is to create a string of '*' of length n, insert the bars at the given positions, split the string at the bars and calculate the length of each resulting bin. The implementation below is thus literally a verbatim translation of the problem statement into code
def partitions(n, k):
    for positions in itertools.combinations(range(n+k-1), k-1):
        yield [len(bin) for bin in insert_bars(n*"*", positions).split('|')]
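To run the pieces above in Python 3, two imports are needed, since reduce lives in functools there. The following is a condensed, self-contained sketch of the same idea; the variable names barred and chunk are chosen here for illustration.

```python
import itertools
from functools import reduce

def insert_bar(stars, p):
    """Insert a bar into the string at index p."""
    return stars[:p] + '|' + stars[p:]

def partitions(n, k):
    for positions in itertools.combinations(range(n + k - 1), k - 1):
        # positions are increasing, so each bar lands at its final index
        barred = reduce(insert_bar, positions, '*' * n)
        yield [len(chunk) for chunk in barred.split('|')]

print(list(partitions(2, 2)))  # [[0, 2], [1, 1], [2, 0]]
```

Inserting the bars in increasing order matters: each position refers to an index in the finished string of n stars and k-1 bars.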
Anyone looking for the specific case of k=2 can save a lot of time by simply creating a range and stacking it with its reverse. Here is a comparison against the accepted answer.
n = 500000
%timeit np.array([[i,j] for i,j in partitions(n,2)])
>>> 396 ms ± 13.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%%timeit
rng = np.arange(n+1)
np.vstack([rng, rng[::-1]]).T
>>> 2.91 ms ± 190 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
And they are indeed equivalent.
it2k = np.array([[i,j] for i,j in partitions(n,2)])
rng = np.arange(n+1)
np2k = np.vstack([rng, rng[::-1]]).T
(np2k == it2k).all()
>>> True
I am working on moving all zeroes to the end of a list. Is this approach bad and computationally expensive?
a = [1, 2, 0, 0, 0, 3, 6]
temp = []
zeros = []
for i in range(len(a)):
    if a[i] !=0:
        temp.append(a[i])
    else:
        zeros.append(a[i])
print(temp+zeros)
My Program works but not sure if this is a good approach?
A sorted solution that avoids changing the order of the other elements is:
from operator import not_
sorted(a, key=not_)
or without an import:
sorted(a, key=lambda x: not x) # Or x == 0 for specific numeric test
By making the key a simple boolean, sorted splits it into things that are truthy followed by things that are falsy, and since it's a stable sort, the order of things within each category is the same as the original input.
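A small runnable illustration of that behaviour, using the explicit x == 0 test as the key:

```python
a = [1, 2, 0, 0, 0, 3, 6]

# The key is False (sorts first) for non-zero items and True for zeros;
# the stable sort keeps the original order inside each group.
result = sorted(a, key=lambda x: x == 0)
print(result)  # [1, 2, 3, 6, 0, 0, 0]
```

Note that sorted returns a new list and leaves a unchanged.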
This looks like a list. Could you just use sort?
a = [1, 2, 0, 0, 0, 3, 6]
a.sort(reverse=True)
a
[6, 3, 2, 1, 0, 0, 0]
To move all the zeroes to the end of the list in one traversal while preserving the order of the other elements, we can keep a count of the non-zero elements seen so far and, whenever a non-zero element is encountered, swap it into the position given by that count.
This can be written as:
arr = [18, 0, 4, 0, 0, 6]
count = 0
for i in range(len(arr)):
    if arr[i] != 0:
        arr[i], arr[count] = arr[count], arr[i]
        count += 1
How the loop works:
When i = 0, arr[i] is 18, so it is swapped with itself, which makes no difference, and count is incremented to 1. When i = 1, arr[i] is 0, so nothing happens; the part of the list traversed so far is already what we want (zero at the end). When i = 2, arr[i] = 4 and arr[count] = arr[1] = 0, so we swap them, leaving the list as [18, 4, 0, 0, 0, 6], and count becomes 2, signifying two non-zero elements at the beginning. And then the loop continues.
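To watch the whole loop rather than only the first few steps, a sketch like the following records the list after every iteration. The function name move_zeros_trace is made up for this illustration.

```python
def move_zeros_trace(arr):
    """Move zeros to the end in place and return the list state after each step."""
    states = []
    count = 0  # number of non-zero elements placed so far
    for i in range(len(arr)):
        if arr[i] != 0:
            arr[i], arr[count] = arr[count], arr[i]
            count += 1
        states.append(list(arr))  # snapshot, so later swaps do not alter it
    return states

for step, state in enumerate(move_zeros_trace([18, 0, 4, 0, 0, 6])):
    print(step, state)
```

The last snapshot is [18, 4, 6, 0, 0, 0]: the final swap at i = 5 moves the 6 into position 2.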
You can try my solution if you like
from typing import List
class Solution:
    def moveZeroes(self, nums: List[int]) -> None:
        for num in nums:
            if num == 0:
                nums.remove(num)
                nums.append(num)
I have tried this code in leetcode & my submission got accepted using above code.
Nothing wrong with your approach, really depends on how you want to store the resulting values. Here is a way to do it using list.extend() and list.count() that preserves order of the non-zero elements and results in a single list.
a = [1, 2, 0, 0, 0, 3, 6]
result = [n for n in a if n != 0]
result.extend([0] * a.count(0))
print(result)
# [1, 2, 3, 6, 0, 0, 0]
You can try this
a = [1, 2, 0, 0, 0, 3, 6]
x=[i for i in a if i!=0]
y=[i for i in a if i==0]
x.extend(y)
print(x)
There's nothing wrong with your solution, and you should always pick a solution you understand over a 'clever' one you don't if you have to look after it.
Here's an alternative which never makes a new list and only passes through the list once. It will also preserve the order of the items. If that's not necessary the reverse sort solution is miles better.
def zeros_to_the_back(values):
    zeros = 0
    for value in values:
        if value == 0:
            zeros += 1
        else:
            yield value
    yield from (0 for _ in range(zeros))
print(list(
    zeros_to_the_back([1, 2, 0, 0, 0, 3, 6])
))
# [1, 2, 3, 6, 0, 0, 0]
This works using a generator which spits out answers one at a time. If we spot a good value we return it immediately, otherwise we just count the zeros and then return a bunch of them at the end.
yield from is Python 3 specific, so if you are using Python 2, you can just replace it with a loop that yields zero over and over.
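A sketch of that variant, with the yield from line swapped for a plain loop; it behaves the same way and also runs on Python 3:

```python
def zeros_to_the_back(values):
    zeros = 0
    for value in values:
        if value == 0:
            zeros += 1
        else:
            yield value
    # plain loop in place of `yield from`
    for _ in range(zeros):
        yield 0

print(list(zeros_to_the_back([1, 2, 0, 0, 0, 3, 6])))
```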
Numpy solution that preserves the order
import numpy as np
a = np.asarray([1, 2, 0, 0, 0, 3, 6])
# mask is a boolean array that is True where a is equal to 0
mask = (a == 0)
# Take the subset of the array that are zeros
zeros = a[mask]
# Take the subset of the array that are NOT zeros
temp = a[~mask]
# Join the arrays
joint_array = np.concatenate([temp, zeros])
I tried using sorted, which is similar to sort().
a = [1, 2, 0, 0, 0, 3, 6]
sorted(a,reverse=True)
ans:
[6, 3, 2, 1, 0, 0, 0]
from typing import List
def move(A: List[int]) -> None:
    j = 0  # index where the next nonzero element goes
    size = len(A)
    for i in range(size):
        if A[i] != 0:
            A[j] = A[i]
            j += 1
    for k in range(j, size):  # the remaining positions hold the zeroes
        A[k] = 0
Since we have to keep the relative order, when you see a nonzero element, place it at index j.
first_nonzero=A[0] # j=0
second_nonzero=A[1] # j=1
third_nonzero=A[2] # j=2
The zeroes are written only after the pass is finished. Writing them to A[-1], A[-2], ... during the pass would overwrite nonzero elements near the end of the list that have not been read yet (for example, [0, 1] would become [0, 0]). Once all nonzero elements have been copied forward, every index from j onward is set to 0.
a = [4,6,0,6,0,7,0]
a = list(filter(lambda x: x != 0, a)) + [0] * a.count(0)  # list() is needed in Python 3, where filter returns an iterator
[4, 6, 6, 7, 0, 0, 0]
I have a list/array of integers; call a subarray a peak if it goes up and then goes down. For example:
[5,5,4,5,4]
contains
[4,5,4]
which is a peak.
Also consider
[6,5,4,4,4,4,4,5,6,7,7,7,7,7,6]
which contains
[6,7,7,7,7,7,6]
which is a peak.
The problem
Given an input list, I would like to find all the peaks of minimal length contained in it and report them. In the example above, [5,6,7,7,7,7,7,6] is also a peak, but if we remove the first element it remains a peak, so we don't report it.
So for input list:
L = [5,5,5,5,4,5,4,5,6,7,8,8,8,8,8,9,9,8]
we would return
[4,5,4] and [8,9,9,8] only.
I am having problems devising a nice algorithm for this. Any help would be hugely appreciated.
Using itertools
Here is a short solution using itertools.groupby to detect peaks. The groups identifying peaks are then unpacked to yield the actual sequence.
from itertools import groupby, islice
l = [1, 2, 1, 2, 2, 0, 0]
fst, mid, nxt = groupby(l), islice(groupby(l), 1, None), islice(groupby(l), 2, None)
peaks = [[f[0], *m[1], n[0]] for f, m, n in zip(fst, mid, nxt) if f[0] < m[0] > n[0]]
print(peaks)
Output
[[1, 2, 1], [1, 2, 2, 0]]
Using a loop (faster)
The above solution is elegant but since three instances of groupby are created, the list is traversed three times.
Here is a solution using a single traversal.
def peaks(lst):
    first = 0
    last = 1
    while last < len(lst) - 1:
        if lst[first] < lst[last] == lst[last+1]:
            last += 1
        elif lst[first] < lst[last] > lst[last+1]:
            yield lst[first:last+2]
            first = last + 1
            last += 2
        else:
            first = last
            last += 1
l = [1, 2, 1, 2, 2, 0, 0]
print(list(peaks(l)))
Output
[[1, 2, 1], [1, 2, 2, 0]]
Notes on benchmark
Upon benchmarking with timeit, I noticed an increase in performance of about 20% for the solution using a loop. For short lists the overhead of groupby could bring that number up to 40%. The benchmark was done on Python 3.6.
Given a number of items (n), what is the most efficient way to generate all possible lists [a1, a2, ..., an] of non-negative integers under the condition that:
1*a1 + 2*a2 + 3*a3 + ... + n*an = n
using Python?
So for example, given an n of 5, the valid combinations are:
[0,0,0,0,1]
[1,0,0,1,0]
[0,1,1,0,0]
[2,0,1,0,0]
[1,2,0,0,0]
[3,1,0,0,0]
[5,0,0,0,0]
I've implemented a brute-force method that generates all permutations and then checks if the list meets the above requirement, but is there a more efficient way to do this?
A "greedy" algorithm works well for this. I'm using Python 3 here:
def pick(total):
    def inner(highest, total):
        if total == 0:
            yield result
            return
        if highest == 1:
            result[0] = total
            yield result
            result[0] = 0
            return
        for i in reversed(range(total // highest + 1)):
            result[highest - 1] = i
            newtotal = total - i * highest
            yield from inner(min(highest - 1, newtotal),
                             newtotal)
    result = [0] * total
    yield from inner(total, total)
Then, e.g.,
for x in pick(5):
    print(x)
displays:
[0, 0, 0, 0, 1]
[1, 0, 0, 1, 0]
[0, 1, 1, 0, 0]
[2, 0, 1, 0, 0]
[1, 2, 0, 0, 0]
[3, 1, 0, 0, 0]
[5, 0, 0, 0, 0]
Like most recursive algorithms, it does a more-or-less obvious thing, then recurses to solve the (sub)problem that remains.
Here inner(highest, total) means to find all the decompositions of total using integers no larger than highest. How many copies of highest can we use? The more-than-less obvious answer is that we can use 0, 1, 2, ..., up to (and including) total // highest copies, but no more than that. Unless highest is 1 - then we have to use exactly total copies of 1.
However many copies we use of highest, the subproblem remaining is to decompose whatever remains of the total using integers no larger than highest - 1. Passing min(highest - 1, newtotal) instead of highest - 1 is an optimization, since it's pointless trying any integer larger than the new total.
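The same recursion can be used to count the decompositions without building them, which gives a cheap cross-check on the generator's output. This is a sketch only: the function name count and the use of lru_cache for memoization are choices made here, not part of the answer above.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def count(highest, total):
    """Number of ways to write total as a sum of integers no larger than highest."""
    if total == 0:
        return 1  # nothing left to decompose
    if highest == 1:
        return 1  # only one way: total copies of 1
    return sum(
        # use i copies of highest, then decompose the remainder with smaller integers
        count(min(highest - 1, total - i * highest), total - i * highest)
        for i in range(total // highest + 1)
    )

print(count(5, 5))  # 7, matching the seven lists shown for n = 5
```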
Can someone please explain algorithm for itertools.permutations routine in Python standard lib 2.6? I don't understand why it works.
Code is:
def permutations(iterable, r=None):
    # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    # permutations(range(3)) --> 012 021 102 120 201 210
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return
    indices = range(n)
    cycles = range(n, n-r, -1)
    yield tuple(pool[i] for i in indices[:r])
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
            else:
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                yield tuple(pool[i] for i in indices[:r])
                break
        else:
            return
You need to understand the mathematical theory of permutation cycles, also known as "orbits" (it's important to know both "terms of art" since the mathematical subject, the heart of combinatorics, is quite advanced, and you may need to look up research papers which could use either or both terms).
For a simpler introduction to the theory of permutations, wikipedia can help. Each of the URLs I mentioned offers reasonable bibliography if you get fascinated enough by combinatorics to want to explore it further and gain real understanding (I did, personally -- it's become somewhat of a hobby for me;-).
Once you understand the mathematical theory, the code is still subtle and interesting to "reverse engineer". Clearly, indices is just the current permutation in terms of indices into the pool, given that the items yielded are always given by
yield tuple(pool[i] for i in indices[:r])
So the heart of this fascinating machinery is cycles, which represents the permutation's orbits and causes indices to be updated, mostly by the statements
j = cycles[i]
indices[i], indices[-j] = indices[-j], indices[i]
I.e., if cycles[i] is j, this means that the next update to the indices is to swap the i-th one (from the left) with the j-th one from the right (e.g., if j is 1, then the last element of indices is being swapped -- indices[-1]). And then there's the less frequent "bulk update" when an item of cycles reached 0 during its decrements:
indices[i:] = indices[i+1:] + indices[i:i+1]
cycles[i] = n - i
this puts the ith item of indices at the very end, shifting all following items of indices one to the left, and indicates that the next time we come to this item of cycles we'll be swapping the new ith item of indices (from the left) with the n - ith one (from the right) -- that would be the ith one again, except of course for the fact that there will be a
cycles[i] -= 1
before we next examine it;-).
The hard part would of course be proving that this works -- i.e., that all permutations are exhaustively generated, with no overlap and a correctly "timed" exit. I think that, instead of a proof, it may be easier to look at how the machinery works when fully exposed in simple cases -- commenting out the yield statements and adding print ones (Python 2.*), we have
def permutations(iterable, r=None):
    # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    # permutations(range(3)) --> 012 021 102 120 201 210
    pool = tuple(iterable)
    n = len(pool)
    r = n if r is None else r
    if r > n:
        return
    indices = range(n)
    cycles = range(n, n-r, -1)
    print 'I', 0, cycles, indices
    # yield tuple(pool[i] for i in indices[:r])
    print indices[:r]
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                print 'B', i, cycles, indices
                indices[i:] = indices[i+1:] + indices[i:i+1]
                cycles[i] = n - i
                print 'A', i, cycles, indices
            else:
                print 'b', i, cycles, indices
                j = cycles[i]
                indices[i], indices[-j] = indices[-j], indices[i]
                print 'a', i, cycles, indices
                # yield tuple(pool[i] for i in indices[:r])
                print indices[:r]
                break
        else:
            return
permutations('ABC', 2)
Running this shows:
I 0 [3, 2] [0, 1, 2]
[0, 1]
b 1 [3, 1] [0, 1, 2]
a 1 [3, 1] [0, 2, 1]
[0, 2]
B 1 [3, 0] [0, 2, 1]
A 1 [3, 2] [0, 1, 2]
b 0 [2, 2] [0, 1, 2]
a 0 [2, 2] [1, 0, 2]
[1, 0]
b 1 [2, 1] [1, 0, 2]
a 1 [2, 1] [1, 2, 0]
[1, 2]
B 1 [2, 0] [1, 2, 0]
A 1 [2, 2] [1, 0, 2]
b 0 [1, 2] [1, 0, 2]
a 0 [1, 2] [2, 0, 1]
[2, 0]
b 1 [1, 1] [2, 0, 1]
a 1 [1, 1] [2, 1, 0]
[2, 1]
B 1 [1, 0] [2, 1, 0]
A 1 [1, 2] [2, 0, 1]
B 0 [0, 2] [2, 0, 1]
A 0 [3, 2] [0, 1, 2]
Focus on the cycles: they start as 3, 2 -- then the last one is decremented, so 3, 1 -- the last isn't zero yet so we have a "small" event (one swap in the indices) and break the inner loop. Then we enter it again, this time the decrement of the last gives 3, 0 -- the last is now zero so it's a "big" event -- "mass swap" in the indices (well there's not much of a mass here, but, there might be;-) and the cycles are back to 3, 2. But now we haven't broken off the for loop, so we continue by decrementing the next-to-last (in this case, the first) -- which gives a minor event, one swap in the indices, and we break the inner loop again. Back to the loop, yet again the last one is decremented, this time giving 2, 1 -- minor event, etc. Eventually a whole for loop occurs with only major events, no minor ones -- that's when the cycles start as all ones, so the decrement takes each to zero (major event), no yield occurs on that last cycle.
Since no break ever executed in that cycle, we take the else branch of the for, which returns. Note that the while n may be a bit misleading: it actually acts as a while True -- n never changes, the while loop only exits from that return statement; it could equally well be expressed as if not n: return followed by while True:, because of course when n is 0 (empty "pool") there's nothing more to yield after the first, trivial empty yield. The author just decided to save a couple of lines by collapsing the if not n: check with the while;-).
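The degenerate cases mentioned here can be checked directly against itertools.permutations. This is only a minimal sketch, assuming Python 3; the helper name edge_cases is made up for illustration, and the comments describe the pure-Python equivalent discussed above rather than the C implementation.

```python
import itertools

def edge_cases():
    """Collect what itertools.permutations yields in the degenerate cases."""
    return {
        # Empty pool: n == 0, so only the initial empty tuple is yielded
        # and the body of `while n` never runs.
        "empty_pool": list(itertools.permutations("")),
        # r > n: the function returns before yielding anything.
        "r_too_big": list(itertools.permutations("AB", 3)),
        # r == 0 with a non-empty pool: one empty tuple, then the for loop
        # has nothing to iterate over, so its else branch returns.
        "r_zero": list(itertools.permutations("AB", 0)),
    }

if __name__ == "__main__":
    for name, result in edge_cases().items():
        print(name, result)
```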
I suggest you continue by examining a few more concrete cases -- eventually you should perceive the "clockwork" operating. Focus on just cycles at first (maybe edit the print statements accordingly, removing indices from them), since their clockwork-like progress through their orbit is the key to this subtle and deep algorithm; once you grok that, the way indices get properly updated in response to the sequencing of cycles is almost an anticlimax!-)
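One way to follow that suggestion is to strip indices out completely and watch only cycles. The generator below is a sketch of that idea, assuming Python 3; trace_cycles and the "tick"/"reset" labels are names invented here, not part of itertools.

```python
def trace_cycles(n, r):
    """Yield (event, cycles_snapshot) pairs, ignoring indices entirely.

    event is "tick" where a permutation would be yielded, and "reset"
    where a counter hit zero and was restored to n - i.
    """
    cycles = list(range(n, n - r, -1))
    yield "tick", tuple(cycles)              # the initial yield
    while n:
        for i in reversed(range(r)):
            cycles[i] -= 1
            if cycles[i] == 0:
                cycles[i] = n - i            # big event: restore this counter
                yield "reset", tuple(cycles)
            else:
                yield "tick", tuple(cycles)  # small event: one output
                break
        else:
            return                           # only resets happened: done

if __name__ == "__main__":
    for event, cycles in trace_cycles(3, 2):
        print(event, cycles)
```

Running it with other values of n and r (say 4 and 3) should make the clockwork easier to see than the full trace with indices.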
It is easier to answer with a pattern in the results than with words (unless you want the mathematical theory behind it), so printouts are the best way to explain.
The most subtle point is that, after counting down to the end, the last counter resets itself to the first turn of its round and the next countdown starts. When an outer counter also runs out, the reset cascades to that bigger round as well, like a clock.
The part of the code doing the reset job:
if cycles[i] == 0:
    indices[i:] = indices[i+1:] + indices[i:i+1]
    cycles[i] = n - i
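The slice assignment in that reset is just a left rotation of the tail of indices. Here is a small sketch of it in isolation, assuming Python 3; reset_at is a hypothetical helper name, and the two starting states are taken from the ABCDE, r=3 trace in this answer.

```python
def reset_at(indices, i):
    """Move indices[i] to the end, shifting everything after it one step left."""
    indices[i:] = indices[i + 1:] + indices[i:i + 1]
    return indices

if __name__ == "__main__":
    # small reset at the last output slot (i = 2)
    print(reset_at([0, 1, 4, 2, 3], 2))
    # bigger reset one level up (i = 1)
    print(reset_at([0, 4, 1, 2, 3], 1))
```

Both calls restore the tail to sorted order, which is what lets the next round start from a clean state.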
The whole function, with debug prints added (run under Python 2, where range returns a list):
In [54]: def permutations(iterable, r=None):
    ...:     # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    ...:     # permutations(range(3)) --> 012 021 102 120 201 210
    ...:     pool = tuple(iterable)
    ...:     n = len(pool)
    ...:     r = n if r is None else r
    ...:     if r > n:
    ...:         return
    ...:     indices = range(n)
    ...:     cycles = range(n, n-r, -1)
    ...:     yield tuple(pool[i] for i in indices[:r])
    ...:     print(indices, cycles)
    ...:     while n:
    ...:         for i in reversed(range(r)):
    ...:             cycles[i] -= 1
    ...:             if cycles[i] == 0:
    ...:                 indices[i:] = indices[i+1:] + indices[i:i+1]
    ...:                 cycles[i] = n - i
    ...:                 print("reset------------------")
    ...:                 print(indices, cycles)
    ...:                 print("------------------")
    ...:             else:
    ...:                 j = cycles[i]
    ...:                 indices[i], indices[-j] = indices[-j], indices[i]
    ...:                 print(indices, cycles, i, n-j)
    ...:                 yield tuple(pool[i] for i in indices[:r])
    ...:                 break
    ...:         else:
    ...:             return
Part of the result:
In [54]: list(','.join(i) for i in permutations('ABCDE', 3))
([0, 1, 2, 3, 4], [5, 4, 3])
([0, 1, 3, 2, 4], [5, 4, 2], 2, 3)
([0, 1, 4, 2, 3], [5, 4, 1], 2, 4)
reset------------------
([0, 1, 2, 3, 4], [5, 4, 3])
------------------
([0, 2, 1, 3, 4], [5, 3, 3], 1, 2)
([0, 2, 3, 1, 4], [5, 3, 2], 2, 3)
([0, 2, 4, 1, 3], [5, 3, 1], 2, 4)
reset------------------
([0, 2, 1, 3, 4], [5, 3, 3])
------------------
([0, 3, 1, 2, 4], [5, 2, 3], 1, 3)
([0, 3, 2, 1, 4], [5, 2, 2], 2, 3)
([0, 3, 4, 1, 2], [5, 2, 1], 2, 4)
reset------------------
([0, 3, 1, 2, 4], [5, 2, 3])
------------------
([0, 4, 1, 2, 3], [5, 1, 3], 1, 4)
([0, 4, 2, 1, 3], [5, 1, 2], 2, 3)
([0, 4, 3, 1, 2], [5, 1, 1], 2, 4)
reset------------------
([0, 4, 1, 2, 3], [5, 1, 3])
------------------
reset------------------(bigger reset)
([0, 1, 2, 3, 4], [5, 4, 3])
------------------
([1, 0, 2, 3, 4], [4, 4, 3], 0, 1)
([1, 0, 3, 2, 4], [4, 4, 2], 2, 3)
([1, 0, 4, 2, 3], [4, 4, 1], 2, 4)
reset------------------
([1, 0, 2, 3, 4], [4, 4, 3])
------------------
([1, 2, 0, 3, 4], [4, 3, 3], 1, 2)
([1, 2, 3, 0, 4], [4, 3, 2], 2, 3)
([1, 2, 4, 0, 3], [4, 3, 1], 2, 4)
I recently stumbled upon the very same question while reimplementing permutation algorithms, and would like to share my understanding of this interesting algorithm.
TL;DR: This algorithm is based on a recursive, backtracking permutation generator that works by swapping elements, transformed (or optimized) into an iterative form, possibly to improve efficiency and avoid stack overflow.
Basics
Before we start, let's make sure we use the same notation as the original algorithm.
n refers to the length of iterable
r refers to the length of one output permutation tuple
And a simple observation (as discussed by Alex):
Whenever the algorithm yields an output, it just takes the first r elements of the indices list.
cycles
First, let’s discuss the variable cycles and build some intuition. With some debugging prints, we can see that cycles acts like a countdown (think of a clock counting down: 01:00:00 -> 00:59:59 -> 00:59:58):
It is initialized to range(n, n-r, -1), so cycles[0]=n, cycles[1]=n-1, ..., cycles[i]=n-i.
Usually, only the last element is decreased, and each decrement that leaves cycles[r-1] != 0 yields an output (a permutation tuple). We can intuitively name this case tick.
Whenever an element (say cycles[i]) decreases to 0, it triggers a decrement of the element before it (cycles[i-1]), and the triggering element (cycles[i]) is restored to its initial value (n-i). This behaves like borrowing in subtraction, or like the seconds wrapping around while the minutes drop by one in a clock countdown. We can intuitively name this branch reset.
To confirm this intuition, add some print statements to the algorithm and run it with iterable="ABCD", r=2. The cycles variable then changes as follows. Square brackets indicate a “tick”, which yields an output; curly braces indicate a “reset”, which doesn’t.
[4,3] -> [4,2] -> [4,1] -> {4,0} -> {4,3} ->
[3,3] -> [3,2] -> [3,1] -> {3,0} -> {3,3} ->
[2,3] -> [2,2] -> [2,1] -> {2,0} -> {2,3} ->
[1,3] -> [1,2] -> [1,1] -> {1,0} -> {1,3} -> {0,3} -> {4,3}
Using the initial values and the change pattern of cycles, we can come to a possible interpretation of its meaning: the number of remaining choices (outputs) at each index. When initialized, cycles[0]=n represents the n possible choices at index 0, cycles[1]=n-1 represents the n-1 possible choices at index 1, and so on down to cycles[r-1]=n-r+1. This interpretation matches the combinatorics: when choosing r items in order from n, position i has n-i candidates left. Further supporting evidence is that when the algorithm ends, there have been exactly P(n,r) = n*(n-1)*...*(n-r+1) ticks (counting the initial yield before entering while as a tick).
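The tick count can be sanity-checked numerically. The sketch below assumes Python 3.8+ (for math.perm); falling_factorial is a helper name introduced here only to spell out the product of the initial cycles values.

```python
import itertools
import math

def falling_factorial(n, r):
    """P(n, r) = n * (n-1) * ... * (n-r+1), the product of the initial cycles."""
    product = 1
    for c in range(n, n - r, -1):   # the same values cycles starts with
        product *= c
    return product

if __name__ == "__main__":
    n, r = 4, 2
    # number of tuples actually produced by itertools.permutations
    count = sum(1 for _ in itertools.permutations(range(n), r))
    print(count, falling_factorial(n, r), math.perm(n, r))
```

For n=4, r=2 all three numbers agree with the 12 bracketed states in the diagram above.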
indices
Now we come to the more complex part, the indices list. As this is essentially a recursive algorithm (more precisely, backtracking), I would like to start from a sub-problem (i=r-1): the values from index 0 to index r-2 (inclusive) in indices are fixed, and only the value at index r-1 (in other words, the last element of each output) is changing. Also, I will introduce a concrete example (iterable="ABCDE", r=3), and we will focus on how it generates the first 3 outputs: ABC, ABD, ABE.
Following the sub-problem, we split the indices list into 3 parts and give them names (the ranges below include both endpoints):
fixed: indices[0:r-2] (inclusive; indices[:r-1] as a Python slice)
changing: indices[r-1] (only one value)
backlog: indices[r:n-1] (inclusive; indices[r:] as a Python slice, the remaining part after the first two)
As this is a backtracking algorithm, we need an invariant that holds both before and after the execution. The invariant is:
The sublist containing changing and backlog (indices[r-1:n-1], inclusive) may be modified during the execution, but is restored when it ends.
Now we can turn to the interaction between cycles and indices during the mysterious while loop. Some of the operations have been outlined by Alex, and I will elaborate further.
In each tick, the element in the changing part is swapped with some element in the backlog part, and the relative order in the backlog part is maintained.
Using characters to visualize the indices, with curly braces highlighting the backlog part:
ABC{DE} -> ABD{CE} -> ABE{CD}
When a reset happens, the element in the changing part is moved to the back of the backlog, thus restoring the initial layout of the sublist (containing the changing part and the backlog part).
Using characters to visualize the indices, with curly braces highlighting the changing part:
AB{E}CD -> ABCD{E}
During this execution (of i=r-1), only the tick phase can yield outputs, and it yields n-r+1 outputs in total, matching the initial value of cycles[i]. This also agrees with the math: once the fixed part is fixed, only n-r+1 choices remain for the last position.
After cycles[i] is decreased to 0, the reset phase kicks in, resetting cycles[i] to n-r+1 and restoring the invariant sublist. This phase marks the end of this execution, and indicates that all possible permutation choices given the fixed prefix part have been output.
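To make the sub-problem concrete, here is a sketch that runs only the i=r-1 level on the ABCDE, r=3 example: the ticks for the last slot, followed by one reset. It assumes Python 3, and last_slot_walk is a made-up name; the two index operations are the ones from the itertools recipe.

```python
def last_slot_walk(pool, r):
    """Run only the i = r-1 level: ticks for the last slot, then one reset."""
    n = len(pool)
    indices = list(range(n))
    i = r - 1
    remaining = n - i                       # initial value of cycles[i]
    outputs = ["".join(pool[k] for k in indices[:r])]   # the initial output
    layouts = ["".join(pool[k] for k in indices)]       # full indices layout
    while True:
        remaining -= 1
        if remaining == 0:
            # reset: move the changing element behind the backlog
            indices[i:] = indices[i + 1:] + indices[i:i + 1]
            break
        # tick: swap the changing element with one from the backlog
        indices[i], indices[-remaining] = indices[-remaining], indices[i]
        outputs.append("".join(pool[k] for k in indices[:r]))
        layouts.append("".join(pool[k] for k in indices))
    return outputs, layouts, indices

if __name__ == "__main__":
    outputs, layouts, final_indices = last_slot_walk("ABCDE", 3)
    print(outputs)        # the permutations produced at this level
    print(layouts)        # how the whole indices list looked at each tick
    print(final_indices)  # layout after the reset
```

The layouts list shows the backlog staying in sorted order after every tick, and the final indices show the invariant sublist restored.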
Therefore, we have shown that, in this sub-problem (i=r-1), this algorithm is indeed a valid backtracking algorithm, as it
Outputs all possible values given the precondition (fixed prefix part)
Keeps the invariant unmodified (restored in reset phase)
This proof(?) can also be generalized to other values of i, thus proving(?) the correctness of this permutation generation algorithm.
Reimplementation
Phew! That’s a long read, and you may want to tinker some more with the algorithm (more prints) to be fully convinced. In essence, we can simplify the underlying principle of the algorithm as the following pseudo-code:
// precondition: the fixed part (or prefix) is fixed
OUTPUT initial_permutation // also invokes the next level
WHILE remaining_permutation_count > 0
    // tick
    swap the changing element with an element in backlog
    OUTPUT current_permutation // also invokes the next level
// reset
move the changing element behind the backlog
And here is a Python implementation using simple backtracking:
# helpers
def swap(items, i, j):
    items[i], items[j] = items[j], items[i]

def move_to_last(items, i):
    items[i:] = items[i+1:] + [items[i]]

def print_first_n_element(items, n):
    print("".join(items[:n]))

# backtracking dfs
def permutations(items, r, changing_index):
    if changing_index == r:
        # we've reached the deepest level
        print_first_n_element(items, r)
        return
    # a pseudo `tick`
    # process initial permutation
    # which is just doing nothing (using the initial value)
    permutations(items, r, changing_index + 1)
    # note: the initial permutation has already been output, thus the minus 1
    remaining_choices = len(items) - 1 - changing_index
    # for (i=1;i<=remaining_choices;i++)
    for i in range(1, remaining_choices+1):
        # `tick` phases
        # make one swap
        swap_idx = changing_index + i
        swap(items, changing_index, swap_idx)
        # finished one move at current level, now go deeper
        permutations(items, r, changing_index + 1)
    # `reset` phase
    move_to_last(items, changing_index)

# wrapper
def permutations_wrapper(items, r):
    permutations(items, r, 0)

# main
if __name__ == "__main__":
    my_list = ["A", "B", "C", "D"]
    permutations_wrapper(my_list, 2)
All that remains is to show that the backtracking version is equivalent to the iterative version in the itertools source code. It should be pretty easy once you grasp why this algorithm works. Following the great tradition of various CS textbooks, this is left as an exercise to the reader.
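Short of a proof, the equivalence can at least be checked empirically. The sketch below is a generator variant of the backtracking idea (it yields tuples instead of printing), compared against itertools.permutations on small inputs. It assumes Python 3; backtrack_permutations and agrees_with_itertools are names invented for this check, and passing the check is evidence, not a proof.

```python
import itertools

def backtrack_permutations(iterable, r):
    """Recursive tick/reset scheme as a generator; works on a private copy."""
    items = list(iterable)
    if r > len(items):
        return                      # mirrors the r > n guard of the iterative version

    def walk(level):
        if level == r:
            yield tuple(items[:r])  # deepest level: emit the current prefix
            return
        yield from walk(level + 1)  # pseudo tick: keep the initial value
        for step in range(1, len(items) - level):
            # tick: swap the changing element with one from the backlog
            items[level], items[level + step] = items[level + step], items[level]
            yield from walk(level + 1)
        # reset: move the changing element behind the backlog
        items[level:] = items[level + 1:] + [items[level]]

    yield from walk(0)

def agrees_with_itertools(max_n=5):
    """Compare output and order with itertools.permutations for small n and r."""
    for n in range(max_n + 1):
        pool = "ABCDEFGH"[:n]
        for r in range(n + 2):      # includes one r > n case per n
            if list(backtrack_permutations(pool, r)) != list(itertools.permutations(pool, r)):
                return False
    return True

if __name__ == "__main__":
    print(agrees_with_itertools())
```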