I would like to write a function my_func(n,l) that, for some positive integer n, efficiently enumerates the ordered non-negative integer compositions* of n of length l (where l is greater than n). For example, I want my_func(2,3) to return [[0,0,2],[0,2,0],[2,0,0],[1,1,0],[1,0,1],[0,1,1]].
My initial idea was to use existing code for positive integer partitions (e.g. accel_asc() from this post), extend the positive integer partitions by a couple zeros and return all permutations.
import itertools
import numpy
def my_func(n, l):
    for ip in accel_asc(n):
        nic = numpy.zeros(l, dtype=int)
        nic[:len(ip)] = ip
        for p in itertools.permutations(nic):
            yield p
The output of this function is wrong, because every non-negative integer composition in which a number appears twice (or multiple times) appears several times in the output of my_func. For example, list(my_func(2,3)) returns [(1, 1, 0), (1, 0, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1), (0, 1, 1), (2, 0, 0), (2, 0, 0), (0, 2, 0), (0, 0, 2), (0, 2, 0), (0, 0, 2)].
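The duplicates come from itertools.permutations treating elements as distinct by position rather than by value, so swapping two equal entries produces an ordering that looks identical. A small check of that behaviour (the variable names are only illustrative):

```python
from itertools import permutations

# The partition [1, 1] of 2, padded with one zero to length 3.
padded = (1, 1, 0)

all_orderings = list(permutations(padded))
distinct_orderings = set(all_orderings)

print(len(all_orderings))       # one entry per arrangement of the 3 positions
print(len(distinct_orderings))  # fewer, because the two 1s are interchangeable
```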
I could correct this by generating a list of all non-negative integer compositions, removing repeated entries, and then returning the remaining list (instead of a generator). But this seems incredibly inefficient and will likely run into memory issues. What is a better way to fix this?
EDIT
I did a quick comparison of the solutions offered in answers to this post and to another post that cglacet has pointed out in the comments.
On the left, we have l=2*n, and on the right, l=n+1. In these two cases, user2357112's second solution is faster than the others when n<=5. For n>5, the solutions proposed by user2357112, Nathan Verzemnieks, and AndyP are more or less tied. But the conclusions could be different for other relationships between l and n.
..........
*I originally asked for non-negative integer partitions. Joseph Wood correctly pointed out that I am in fact looking for integer compositions, because the order of numbers in a sequence matters to me.
Use the stars and bars concept: pick positions to place l-1 bars between n stars, and count how many stars end up in each section:
import itertools
def diff(seq):
    return [seq[i+1] - seq[i] for i in range(len(seq)-1)]
def generator(n, l):
    for combination in itertools.combinations_with_replacement(range(n+1), l-1):
        yield [combination[0]] + diff(combination) + [n-combination[-1]]
I've used combinations_with_replacement instead of combinations here, so the index handling is a bit different from what you'd need with combinations. The code with combinations would more closely match a standard treatment of stars and bars.
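For comparison, here is a rough sketch of what the plain combinations variant could look like. The function name and the sentinel handling are assumptions of this sketch, not part of the code above: the n stars and l-1 bars share n+l-1 slots, the bar slots are chosen with combinations, and each part is the number of stars between two consecutive bars.

```python
from itertools import combinations

def compositions_via_bars(n, l):
    """Yield every composition of n into l non-negative parts."""
    slots = n + l - 1  # n stars and l-1 bars share these positions
    for bars in combinations(range(slots), l - 1):
        # Sentinel "bars" just outside both ends turn every part into a gap.
        edges = (-1,) + bars + (slots,)
        yield [edges[i + 1] - edges[i] - 1 for i in range(l)]

print(list(compositions_via_bars(2, 3)))
```

Every choice of bar slots gives a different composition, so no deduplication is needed.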
Alternatively, a different way to use combinations_with_replacement: start with a list of l zeros, pick n positions with replacement from l possible positions, and add 1 to each of the chosen positions to produce an output:
def generator2(n, l):
    for combination in itertools.combinations_with_replacement(range(l), n):
        output = [0]*l
        for i in combination:
            output[i] += 1
        yield output
Starting from a simple recursive solution, which has the same problem as yours:
def nn_partitions(n, l):
    if n == 0:
        yield [0] * l
    else:
        for part in nn_partitions(n - 1, l):
            for i in range(l):
                new = list(part)
                new[i] += 1
                yield new
That is, for each partition for the next lower number, for each place in that partition, add 1 to the element in that place. It yields the same duplicates yours does. I remembered a trick for a similar problem, though: when you alter a partition p for n into one for n+1, fix all the elements of p to the left of the element you increase. That is, keep track of where p was modified, and never modify any of p's "descendants" to the left of that. Here's the code for that:
def _nn_partitions(n, l):
    if n == 0:
        yield [0] * l, 0
    else:
        for part, start in _nn_partitions(n - 1, l):
            for i in range(start, l):
                new = list(part)
                new[i] += 1
                yield new, i
def nn_partitions(n, l):
    for part, _ in _nn_partitions(n, l):
        yield part
It's very similar - there's just the extra parameter passed along at each step, so I added a wrapper to remove that for the caller.
I haven't tested it extensively, but this appears to be reasonably fast - about 35 microseconds for nn_partitions(3, 5) and about 18s for nn_partitions(10, 20) (which yields just over 20 million partitions). (The very elegant solution from user2357112 takes about twice as long for the smaller case and about four times as long for the larger one. Edit: this refers to the first solution from that answer; the second one is faster than mine under some circumstances and slower under others.)
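As a sanity check on the sizes quoted above, the number of compositions of n into l non-negative parts is the stars-and-bars count C(n+l-1, l-1). A minimal sketch, assuming Python 3.8+ for math.comb (count_compositions is a hypothetical helper name):

```python
from math import comb

def count_compositions(n, l):
    """Number of ways to write n as an ordered sum of l non-negative integers."""
    return comb(n + l - 1, l - 1)

print(count_compositions(3, 5))    # 35
print(count_compositions(10, 20))  # 20030010, i.e. just over 20 million
```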
Suppose we have data a₁, ..., aₙ, where n is an even integer and each aᵢ ∈ ℝ. Define the distance between two elements as dis(aᵢ, aⱼ) = | aᵢ − aⱼ |. The program should pack the input data into pairs, so that each element aᵢ appears only once in the output, and output the list of pairs sorted by distance in ascending order.
For example, given the input [1, 0.4, 3, 1.1] the output should be [(1, 1.1), (0.4, 3)].
A naive brute-force method is to calculate the distance of all C(n,2) pairs and sort the pairs by distance.
def not_in_list_of_pair(i, ls):
    return i not in [p[0] for p in ls] + [p[1] for p in ls]
def calc(ls):
    ls = sorted(ls)
    d = {}
    for idx1, i in enumerate(ls[:-1]):
        for idx2, j in enumerate(ls[idx1+1:], idx1 + 1):
            d[(i,j)] = j - i
    # 2nd part
    res = []
    for pair in sorted(d, key = lambda k: d[k]):
        i, j = pair
        if not_in_list_of_pair(i, res) and not_in_list_of_pair(j, res):
            res.append(pair)
    return res
# another example
ls = [1, 0.1, 2, 2.4, 3, 4, 1.5]
assert calc(ls) == [(2, 2.4), (1, 1.5), (3, 4)]
But this naive method takes at least O(n²) time, and the 2nd part (extracting the minimum-distance pairs) is also slow. Therefore I am looking for a more efficient method to solve this problem. Thanks!
I have to say that your description of the problem is not clear, and the complexity stated in it is not correct: you have to calculate the distances of all pairs of integers (which is O(n^2)) and after that sort all the distances (which is O(n^2 * log(n^2))).
For this problem, you are basically finding the two integers with the smallest distance, picking these two integers out, and repeating the same process on the remaining integers.
One naive solution: suppose the integers are sorted and we only need to find one pair of integers with the smallest distance. Then we just need to calculate the distance of each two adjacent integers (e.g., dist between ls[0] and ls[1], between ls[1] and ls[2], ..., between ls[n - 2] and ls[n - 1]) and find out which pair is the smallest. After we find one, we remove the two selected integers, and the remaining integers are still sorted. If we want to find the next pair of integers with the smallest distance, the problem remains the same.
The naive solution is still expensive in two aspects: (1) we need to calculate the distance of each two adjacent integers each time; (2) we need to remove two integers from a sorted array and keep the array sorted.
To solve (1), in fact, we don't have to calculate all the distances each time. E.g., suppose we have 6 integers for which we calculated dist(0, 1), dist(1, 2), dist(2, 3), dist(3, 4), dist(4, 5). We find that the integers at indices 2 and 3 are the closest ones, so we output and remove them. For the next round, we need dist(0, 1), dist(1, 4), dist(4, 5). We can see that we only need to remove dist(1, 2) and dist(3, 4) as they're useless, and add a new distance dist(1, 4), while dist(0, 1) and dist(4, 5) are unchanged. We can maintain a btree for this purpose.
To solve (2), the best data structure for removing items from the middle is a doubly linked list, where removal is O(1). But we are using an array now and we may not want to change the array to a linked list. One way is to use index arrays to mimic a doubly linked list.
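The index-array idea can be sketched on its own as follows. The helper name remove_pair is made up for illustration; left[i] and right[i] hold the indices of the current neighbours of element i, with -1 and n acting as the out-of-range sentinels.

```python
def remove_pair(left, right, a, b):
    """Unlink the adjacent elements a and b (b == right[a]) in O(1)."""
    n = len(left)
    if left[a] != -1:
        right[left[a]] = right[b]  # predecessor of a now points past b
    if right[b] != n:
        left[right[b]] = left[a]   # successor of b now points back before a

n = 6
left = [i - 1 for i in range(n)]
right = [i + 1 for i in range(n)]

# Remove the elements at indices 2 and 3, as in the 6-integer example.
remove_pair(left, right, 2, 3)
print(right[1], left[4])  # indices 1 and 4 are now neighbours
```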
Here is an example.
Update 1: I found that OrderedDict does not pop the minimal item each time. I couldn't find any data structure in Python that works as a btree. I have to use a heap, where I cannot delete the useless distances but I can identify and ignore them. Sorry for the mistake.
Update 2: Added an else branch in the while loop, i.e., we should not change the doubly linked list when we see a useless item.
Update 3: Just realized that the heap will have no more than n items in each iteration of the while loop. So the complexity is roughly O(n log n), with n being the number of integers.
from heapq import heappush, heappop
def calc(ls):
    ls = sorted(ls) # O(nlogn)
    n = len(ls)
    # mimic a doubly linked list
    left = [i - 1 for i in range(n)]
    right = [i + 1 for i in range(n)]
    appeared = [False for i in range(n)]
    btree = []
    for i in range(0, n - 1):
        # distance of adjacent integers, and their indices
        heappush(btree, (ls[i + 1] - ls[i], i, i + 1))
    # roughly O(n log n), because the heap will have at most `n` items in each iteration
    result = []
    while len(btree) != 0:
        minimal = heappop(btree)
        a, b = minimal[1:3]
        # skip if either a or b appeared
        if not appeared[a] and not appeared[b]:
            result.append((ls[a], ls[b]))
            appeared[a] = True
            appeared[b] = True
        else:
            continue # this is important
        #print(result)
        if left[a] != -1:
            right[left[a]] = right[b]
        if right[b] != n:
            left[right[b]] = left[a]
        if left[a] != -1 and right[b] != n:
            heappush(btree, (ls[right[b]] - ls[left[a]], left[a], right[b]))
    return result
ls = [1, 0.1, 2, 2.4, 3, 4, 1.5]
print(calc(ls))
With the following output:
[(2, 2.4), (1, 1.5), (3, 4)]
Note: The number of input integers is 7, which is NOT even.
Show one more image to present what is going on:
I am not very familiar with Python, so I may not be using the best data structure in the above code snippet.
I got this problem on CoderByte. The requirement was to find the number of ways. I found solutions for that on StackOverflow and other sites. But moving ahead, I also need all the possible ways to reach the Nth step.
Problem description: There is a staircase of N steps and you can climb either 1 or 2 steps at a time. You need to count and return the total number of unique ways to climb the staircase. The order of steps taken matters.
For Example,
Input: N = 3
Output: 3
Explanation: There are 3 unique ways of climbing a staircase of 3 steps :{1,1,1}, {2,1} and {1,2}
Note: There might be another case that a person can take 2 or 3 or 4 steps at a time (I know that's realistically not possible but trying to add scalability to the input steps in the code)
I'm unable to find the right logic to get all the ways possible. It's useful if I get the solution in Python, but it's not a strict requirement though.
Here's a minimal solution using the itertools library:
from itertools import permutations, chain
solve = lambda n: [(1,)*n] + list(set(chain(*[permutations((2,)*i + (1,)*(n-2*i)) for i in range(1, n//2+1)])))
For your example input:
> solve(3)
[(1, 1, 1), (1, 2), (2, 1)]
How does it work?
It's easier to see what's happening if we take a step backwards:
def solve(n):
    combinations = [(1,)*n]
    for i in range(1, n//2+1):
        combinations.extend(permutations((2,)*i + (1,)*(n-2*i)))
    return list(set(combinations))
The most trivial case is the one where you take one step at a time, so n steps: (1,)*n. Then we can work out how many double steps we could take at most, which is the floor of n divided by 2: n//2. Then we iterate over the possible numbers of double steps: each iteration uses i double steps, (2,)*i, and fills the remaining space with single steps, (1,)*(n-2*i).
The function permutations from itertools will generate all the possible permutations of single and double steps for that iteration. With an input of (1,1,2), it will generate (1,1,2), (1,2,1) and (2,1,1). At the end we use the trick of converting the result to a set in order to remove duplicates, then converting it back into a list.
Generalization for any amount and length of steps (not optimal!)
One liner:
from itertools import permutations, chain, combinations_with_replacement
solve = lambda n, steps: list(set(chain(*[permutations(sequence) for sequence in chain(*[combinations_with_replacement(steps, r) for r in range(n//min(steps)+1)]) if sum(sequence) == n])))
Example output:
> solve(8, [2,3])
[(3, 2, 3), (2, 3, 3), (2, 2, 2, 2), (3, 3, 2)]
Easier to read version:
def solve(n, steps):
    result = []
    for sequence_length in range(n//min(steps)+1):
        sequences = combinations_with_replacement(steps, sequence_length)
        for sequence in sequences:
            if sum(sequence) == n:
                result.extend(permutations(sequence))
    return list(set(result))
def solve(n):
    if (n == 0):
        return [[]]
    else:
        left_results = []
        right_results = []
        if (n > 0):
            left_results = solve(n - 1)
            for res in left_results: # Add the current step to every result
                res.append(1)
        if (n > 1):
            right_results = solve(n - 2)
            for res in right_results: # Same above
                res.append(2)
        return left_results + right_results
I think there is a better way to do this using dynamic programming but I don't know how to do that. Hope it helps anyway.
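One possible dynamic-programming formulation, as a sketch only: build the list of step sequences for every total from 0 up to n, extending the sequences of smaller totals by one final step. The name climb_dp and the steps parameter are assumptions of this sketch, and step sizes are assumed to be positive integers.

```python
def climb_dp(n, steps=(1, 2)):
    """Return every ordered sequence of allowed steps that sums to n."""
    # ways[t] holds all sequences that reach step t; the empty sequence reaches 0.
    ways = [[] for _ in range(n + 1)]
    ways[0] = [()]
    for total in range(1, n + 1):
        for s in steps:
            if s <= total:
                # Any way of reaching total - s, followed by a final step of size s.
                ways[total].extend(prefix + (s,) for prefix in ways[total - s])
    return ways[n]

print(climb_dp(3))
print(climb_dp(8, (2, 3)))
```

Each sequence is produced exactly once, so no set-based deduplication is needed, at the price of keeping the sequences for all smaller totals in memory.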
Say I have a range(1, n + 1). I want to get m unique pairs.
What I found is, if the number of pairs is close to n(n-1)/2 (the maximum number of pairs), one can't simply generate random pairs every time because they will start overriding each other. I'm looking for a somewhat lazy solution that will be very efficient (in Python's world).
My attempt so far:
import random
def get_input(n, m):
    res = str(n) + "\n" + str(m) + "\n"
    buffet = range(1, n + 1)
    points = set()
    while len(points) < m:
        x, y = random.sample(buffet, 2)
        points.add((x, y)) if x > y else points.add((y, x)) # meeh
    for (x, y) in points:
        res += "%d %d\n" % (x, y)
    return res
You can use combinations to generate all pairs and use sample to choose randomly. Admittedly only lazy in the "not much to type" sense, and not in the "use a generator, not a list" sense :-)
from itertools import combinations
from random import sample
n = 100
sample(list(combinations(range(1, n + 1), 2)), 5)
If you want to improve performance, you can make it lazy by studying this:
Python random sample with a generator / iterable / iterator
The generator you want to sample from is this: combinations(range(1, n + 1), 2)
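For illustration, one well-known way to sample from a generator without building the full list is reservoir sampling (Algorithm R). This is a sketch of that general technique, not necessarily what the linked question recommends, and reservoir_sample is a hypothetical helper name:

```python
import random
from itertools import combinations

def reservoir_sample(iterable, k):
    """Pick k items uniformly at random from an iterable of unknown length."""
    reservoir = []
    for idx, item in enumerate(iterable):
        if idx < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Replace a random slot with probability k / (idx + 1).
            j = random.randrange(idx + 1)
            if j < k:
                reservoir[j] = item
    return reservoir

n = 10
pairs = reservoir_sample(combinations(range(1, n + 1), 2), 5)
print(pairs)
```

This keeps only k pairs in memory, but it still has to iterate over all n(n-1)/2 pairs once.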
Here is an approach which works by taking a number in the range 0 to n*(n-1)/2 - 1 and decoding it to a unique pair of items in the range 0 to n-1. I used 0-based math for convenience, but you could of course add 1 to all of the returned pairs if you want:
import math
import random
def decode(i):
    k = math.floor((1+math.sqrt(1+8*i))/2)
    return k,i-k*(k-1)//2
def rand_pair(n):
    return decode(random.randrange(n*(n-1)//2))
def rand_pairs(n,m):
    return [decode(i) for i in random.sample(range(n*(n-1)//2),m)]
For example:
>>> rand_pairs(5,8)
[(2, 1), (3, 1), (4, 2), (2, 0), (3, 2), (4, 1), (1, 0), (4, 0)]
The math is hard to easily explain, but the k in the definition of decode is obtained by solving a quadratic equation which gives the number of triangular numbers which are <= i, and where i falls in the sequence of triangular numbers tells you how to decode a unique pair from it. The interesting thing about this decode is that it doesn't use n at all but implements a one-to-one correspondence from the set of natural numbers (starting at 0) to the set of all pairs of natural numbers.
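One way to convince yourself of the correspondence is to decode the first n(n-1)/2 integers and check that every pair (x, y) with 0 <= y < x < n shows up exactly once. The sketch below swaps in math.isqrt (Python 3.8+) for the floating-point square root; that substitution is my assumption, made to keep the arithmetic exact, and it gives the same k as the floor expression.

```python
import math

def decode(i):
    # k is the index of the "row" of the triangle that i falls into.
    k = (1 + math.isqrt(1 + 8 * i)) // 2
    return k, i - k * (k - 1) // 2

n = 6
pairs = [decode(i) for i in range(n * (n - 1) // 2)]
print(pairs[:4])  # [(1, 0), (2, 0), (2, 1), (3, 0)]
```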
I don't think anything in your approach can be improved. After all, as your m gets closer and closer to the limit n(n-1)/2, you have a thinner and thinner chance of finding an unseen pair.
I would suggest splitting into two cases: if m is small, use your random approach. But if m is large enough, try
pairs = list(itertools.combinations(buffet, 2))
points = random.sample(pairs, m)
Now you have to determine the threshold of m that decides which code path to take. You need some math here to find the right trade-off.
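A sketch of how the two cases might be combined. The function name unique_pairs and the threshold of one half are placeholders I picked for illustration, not a tuned value; the right cut-off still needs the math mentioned above.

```python
import random
from itertools import combinations

def unique_pairs(n, m, threshold=0.5):
    """Return m distinct pairs (x, y) with 1 <= x < y <= n."""
    total = n * (n - 1) // 2
    if m > total:
        raise ValueError("only %d distinct pairs exist" % total)
    if m < threshold * total:
        # Sparse case: rejection sampling rarely hits a pair it has seen.
        points = set()
        while len(points) < m:
            x, y = random.sample(range(1, n + 1), 2)
            points.add((min(x, y), max(x, y)))
        return list(points)
    # Dense case: enumerate every pair once and sample without replacement.
    return random.sample(list(combinations(range(1, n + 1), 2)), m)

print(unique_pairs(10, 3))
```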
I would like to find a way to return the set of all vectors [x_1,...,x_n] subject to the constraint x_1+...+x_n=constant, each x_i is a nonnegative integer, and the order doesn't matter. (so [1,1,1,2]=[2,1,1,1]). I have very little experience with programming but I've been working with Python (sage) for the past month or so.
In particular, I'm trying to find the minimum value of a 15-variable (symmetric) function over nonnegative integers (subject to a constraint), but I'd like to write a program to do it because I can use it for similar projects as well.
I have been trying to write a program for 4 days now, and I'm suddenly coming to the realization that I have to somehow recursively define my function...and I have no idea what to do. I have some code which does something similar to what I want (but it's nowhere near done). I'll post it even though I'm sure it's the least efficient way to do what I'm trying to do:
def each_comb_first_step(vec):
    row_num=floor(math.fabs((vec[0,vec.ncols()-1]-vec[0,vec.ncols()-2]))/2)+1
    mat=matrix(ZZ, row_num, vec.ncols(), 0)
    for j in range(row_num):
        mat[j]=vec
        vec[0,vec.ncols()-2]=vec[0,vec.ncols()-2]+1
        vec[0,vec.ncols()-1]=vec[0,vec.ncols()-1]-1
    return mat
def each_comb(num,const):
    vec1=matrix(ZZ,1,num,0)
    vec1[0,num-1]=const
    time=0
    steps=0
    subtot=0
    for i in (2,..,num-1):
        steps=floor(const/(i+1))
        for j in (1,..,steps):
            time=j
            for k in (num-i-1,..,num-2):
                vec1[0,k]=time
                time=time+1
            subtot=0
            for l in range(num-1):
                subtot=subtot+vec1[0,l]
            vec1[0,num-1]=const-subtot
            mat1=each_comb_first_step(vec1)
    return mat1
Is there by any chance a function which already does this, or something similar? Any help or suggestions would be greatly appreciated.
A brute force solution is as follows:
import itertools as it
# Constraint function returns true if inputs meet constraint requirement
def constraint(x1, x2, x3, x4):
    return x1 + x2 + x3 + x4 == 10
numbers = range(1,10) #candidate values 1..9 (note that 0 is excluded here)
num_variables = 4 #size of number tuple to create
#vectors contains all tuples of 4 numbers that meet constraint
vectors = [t for t in it.combinations_with_replacement(numbers, num_variables)
           if constraint(*t)]
print(vectors)
outputs
[(1, 1, 1, 7), (1, 1, 2, 6), (1, 1, 3, 5), (1, 1, 4, 4), (1, 2, 2, 5), (1, 2, 3, 4), (1, 3, 3, 3), (2, 2, 2, 4), (2, 2, 3, 3)]
The running time grows roughly like O(len(numbers)**num_variables), so it will probably be prohibitively slow for your 15-variable problem. You might want to look into linear programming techniques. There's a free course on Linear Optimization on the Coursera website covering techniques that can solve these sorts of problems much quicker.
Check out this Stack Overflow question for a link to a python module that is an integer constraint solver.
You want to find all fixed-length partitions of a given integer. This can be done iteratively or recursively. The idea behind the recursive algorithm is to add a helper parameter representing a lower bound on the values to allow in the partitions. Then, for every possible least value in the partition, make a recursive call to figure out the ways to construct the rest of the partition.
def fixed_length_partitions(n, k, min_value=0):
    """Yields all partitions of the integer n into k integers."""
    if k == 0:
        if n == 0:
            yield []
    else:
        for last_num in range(min_value, 1 + n//k):
            for nums in _flps(n-last_num, k-1, min_value=last_num):
                # Warning: mutative
                nums += [last_num]
                yield nums
_flps = fixed_length_partitions
An iterative algorithm would be much faster (avoiding a lot of Python function call overhead), but also less readable, since it'd essentially replace the Python call stack with an explicit list and end up making the control flow a lot more confusing.
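To give an idea of what that looks like, here is one possible iterative sketch that keeps pending work on an explicit list. The function name and the tuple layout on the stack are my own choices, I have not benchmarked it against the recursive version, and it yields each partition in non-decreasing order, which differs from the ordering the recursive function produces.

```python
def fixed_length_partitions_iter(n, k):
    """Yield all partitions of n into k non-negative integers."""
    if k == 0:
        if n == 0:
            yield []
        return
    # Each stack entry: (parts chosen so far, amount left, smallest allowed value).
    stack = [([], n, 0)]
    while stack:
        prefix, remaining, min_value = stack.pop()
        parts_left = k - len(prefix)
        if parts_left == 1:
            # The last part takes whatever is left; the bound used below
            # guarantees that remaining >= min_value at this point.
            yield prefix + [remaining]
            continue
        # The next part can be at most remaining // parts_left, otherwise the
        # later (larger or equal) parts could not fit. Push in descending
        # order so that the smallest value is processed first.
        for value in range(remaining // parts_left, min_value - 1, -1):
            stack.append((prefix + [value], remaining - value, value))

print(list(fixed_length_partitions_iter(4, 2)))
```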
This stack overflow thread claims that every recursive function can be written as a loop.
Which recursive functions cannot be rewritten using loops?
It makes complete sense. But I'm not sure how to express the following recursive function as a loop because it has a pre recursive piece of logic and a post recursive piece of logic.
Obviously the solution cannot use the goto statement. The code is here:
def gen_perms(lst, k, m):
    if k == m:
        all_perms.append(list(lst))
    else:
        for i in range(k, m+1):
            #swap char
            tmp = lst[k]
            lst[k] = lst[i]
            lst[i] = tmp
            gen_perms(lst, k+1, m)
            #swap char
            tmp = lst[k]
            lst[k] = lst[i]
            lst[i] = tmp
Invoking it would be like this:
all_perms = []
gen_perms([1, 2, 3], 0, 2)
and it generates every permutation of the list 1,2,3.
The most pythonic way of doing permutations is to use:
>>> from itertools import permutations
>>> permutations([1,2,3])
>>> list(permutations([1,2,3]))
[(1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), (3, 2, 1)]
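If the goal is specifically to mirror the recursive gen_perms with a loop, one option is to make the call stack explicit and tag each entry with the phase it is in. The sketch below is my own translation under that assumption (the name gen_perms_iterative and the "pre"/"post" tags are made up); the "pre" branch runs the logic before the recursive call and the "post" branch runs the logic after it.

```python
def gen_perms_iterative(lst):
    """Collect all permutations of lst with an explicit stack instead of recursion."""
    lst = list(lst)
    m = len(lst) - 1
    if m < 0:
        return [[]]
    all_perms = []
    # Each entry is (phase, k, i): depth k, loop index i.
    stack = [("pre", 0, 0)]
    while stack:
        phase, k, i = stack.pop()
        if phase == "pre":
            if k == m:
                # Base case of the recursion: record the current arrangement.
                all_perms.append(list(lst))
                continue
            lst[k], lst[i] = lst[i], lst[k]      # pre-recursion swap
            stack.append(("post", k, i))         # where to resume afterwards
            stack.append(("pre", k + 1, k + 1))  # stands in for the recursive call
        else:
            lst[k], lst[i] = lst[i], lst[k]      # post-recursion swap back
            if i + 1 <= m:
                stack.append(("pre", k, i + 1))  # next iteration of the for loop
    return all_perms

print(gen_perms_iterative([1, 2, 3]))
```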
Let's say you want to find all permutations of [1, 2, 3, 4]. There are 24 (=4!) of these, so number them 0-23. What we want is a non-recursive way to find the Nth permutation.
Let's say we sort the permutations in increasing numerical order. Then:
Permutations 0-5 start with 1
Permutations 6-11 start with 2
Permutations 12-17 start with 3
Permutations 18-23 start with 4
So we can get the first number of permutation N by dividing N by 6 (=3!) and rounding down: the quotient (0, 1, 2 or 3) is the 0-based index of the first number in the sorted list.
How do we get the next number? Look at the second numbers in permutations 0-5:
Permutations 0-1 have second number 2.
Permutations 2-3 have second number 3.
Permutations 4-5 have second number 4.
We see a similar thing with permutations 6-11:
Permutations 6-7 have second number 1.
Permutations 8-9 have second number 3.
Permutations 10-11 have second number 4.
In general, take the remainder after dividing by 6 earlier, divide that by 2 (=2!), and round down. That gives you 0, 1, or 2, which is the 0-based index of the second item among the items left in the list (after you've taken out the first item).
You can keep going in this way. Here's some code that does this:
from math import factorial
def gen_perms(lst):
    all_perms = []
    # Find the number of permutations.
    num_perms = factorial(len(lst))
    for i in range(num_perms):
        # Generate the ith permutation.
        perm = []
        remainder = i
        # Clone the list so we can remove items from it as we
        # add them to our permutation.
        items = lst[:]
        # Pick out each item in turn.
        for j in range(len(lst) - 1):
            # Divide the remainder at the previous step by the
            # next factorial down, to get the item number.
            divisor = factorial(len(lst) - j - 1)
            item_num = remainder // divisor
            # Add the item to the permutation, and remove it
            # from the list of available items.
            perm.append(items[item_num])
            items.remove(items[item_num])
            # Take the remainder for the next step.
            remainder = remainder % divisor
        # Only one item left - add it to the permutation.
        perm.append(items[0])
        # Add the permutation to the list.
        all_perms.append(perm)
    return all_perms
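The same idea can also be packaged as a function that returns only the Nth permutation. This is a compact sketch using divmod; nth_permutation is a hypothetical name, and the index is 0-based as in the numbering above.

```python
from math import factorial

def nth_permutation(items, index):
    """Return permutation number `index` (0-based) of items, in sorted-input order."""
    items = list(items)
    perm = []
    for remaining in range(len(items), 0, -1):
        # Permutations sharing the same next item come in blocks of this size.
        block = factorial(remaining - 1)
        position, index = divmod(index, block)
        perm.append(items.pop(position))
    return perm

print(nth_permutation([1, 2, 3, 4], 6))   # [2, 1, 3, 4]
print(nth_permutation([1, 2, 3, 4], 23))  # [4, 3, 2, 1]
```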
I am not too familiar with the python syntax, but the following code (in 'c') shouldn't be too hard to translate assuming python can do nested for statements.
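#include <stdio.h>
#define SIZE 3
int main(void)
{
    int list[SIZE] = {1, 2, 3};
    int i, j, k;
    /* Try every index triple and keep those with three distinct indices. */
    for (i = 0; i < SIZE; i++)
        for (j = 0; j < SIZE; j++)
            for (k = 0; k < SIZE; k++)
                if (i != j && j != k && i != k)
                    printf("%d%d%d\n", list[i], list[j], list[k]);
    return 0;
}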