Partition N items into K bins in Python lazily - python

Give an algorithm (or straight Python code) that yields all partitions of a collection of N items into K bins such that each bin has at least one item. I need this in both the case where order matters and where order does not matter.
Example where order matters
>>> list(partition_n_in_k_bins_ordered((1,2,3,4), 2))
[([1], [2,3,4]), ([1,2], [3,4]), ([1,2,3], [4])]
>>> list(partition_n_in_k_bins_ordered((1,2,3,4), 3))
[([1], [2], [3,4]), ([1], [2,3], [4]), ([1,2], [3], [4])]
>>> list(partition_n_in_k_bins_ordered((1,2,3,4), 4))
[([1], [2], [3], [4])]
Example where order does not matter
>>> list(partition_n_in_k_bins_unordered({1,2,3,4}, 2))
[{{1}, {2,3,4}}, {{2}, {1,3,4}}, {{3}, {1,2,4}}, {{4}, {1,2,3}},
{{1,2}, {3,4}}, {{1,3}, {2,4}}, {{1,4}, {2,3}}]
These functions should produce lazy iterators/generators, not lists. Ideally they would use primitives found in itertools. I suspect that there is a clever solution that is eluding me.
While I've asked for this in Python I'm also willing to translate a clear algorithm.

you need a recursive function to solve this kind of problem: you take the list, take a subportion of it of increasing length and apply the same procedure to the remaining tail of the list in n-1 pieces.
here is my take to the ordered combination
def partition(lista,bins):
if len(lista)==1 or bins==1:
yield [lista]
elif len(lista)>1 and bins>1:
for i in range(1,len(lista)):
for part in partition(lista[i:],bins-1):
if len([lista[:i]]+part)==bins:
yield [lista[:i]]+part
for i in partition(range(1,5),1):
print i
#[[1, 2, 3, 4]]
for i in partition(range(1,5),2):
print i
#[[1], [2, 3, 4]]
#[[1, 2], [3, 4]]
#[[1, 2, 3], [4]]
for i in partition(range(1,5),3):
print i
#[[1], [2], [3, 4]]
#[[1], [2, 3], [4]]
#[[1, 2], [3], [4]]
for i in partition(range(1,5),4):
print i
#[[1], [2], [3], [4]]

Enrico's algorithm, Knuth's, and only my glue are needed to paste together something that returns the list of lists or set of sets (returned as lists of lists in case elements are not hashable).
def kbin(l, k, ordered=True):
"""
Return sequence ``l`` partitioned into ``k`` bins.
Examples
========
The default is to give the items in the same order, but grouped
into k partitions:
>>> for p in kbin(range(5), 2):
... print p
...
[[0], [1, 2, 3, 4]]
[[0, 1], [2, 3, 4]]
[[0, 1, 2], [3, 4]]
[[0, 1, 2, 3], [4]]
Setting ``ordered`` to None means that the order of the elements in
the bins is irrelevant and the order of the bins is irrelevant. Though
they are returned in a canonical order as lists of lists, all lists
can be thought of as sets.
>>> for p in kbin(range(3), 2, ordered=None):
... print p
...
[[0, 1], [2]]
[[0], [1, 2]]
[[0, 2], [1]]
"""
from sympy.utilities.iterables import (
permutations, multiset_partitions, partitions)
def partition(lista, bins):
# EnricoGiampieri's partition generator from
# http://stackoverflow.com/questions/13131491/
# partition-n-items-into-k-bins-in-python-lazily
if len(lista) == 1 or bins == 1:
yield [lista]
elif len(lista) > 1 and bins > 1:
for i in range(1, len(lista)):
for part in partition(lista[i:], bins - 1):
if len([lista[:i]] + part) == bins:
yield [lista[:i]] + part
if ordered:
for p in partition(l, k):
yield p
else:
for p in multiset_partitions(l, k):
yield p

Related

Create unique combinations regardless of subset size

I am doing a project that requires getting unique combinations in Python regardless of the subset size.
Lets say I have a list of sizes [1,2,2,3,4,5] and a size bound of 8. I want combinations that have all the elements and no repeat such that the sum of each combination should be less than or equal to 8. Another restriction is that the subtraction of the sum and the bound should be minimum.
For example in this case the answer should be [5,3] [4,2,2] [3,1] this way the total waste out of 8 will be 4 which is (3+1)-8=4.
You could use a recursive function to "brute force" the packing combinations and get the best fit out of those:
def pack(sizes,bound,subset=[]):
if not sizes: # all sizes used
yield [subset] # return current subset
return
if sizes and not subset: # start new subset
i,m = max(enumerate(sizes),key=lambda s:s[1])
subset = [m] # using largest size
sizes = sizes[:i]+sizes[i+1:] # (to avoid repeats)
used = sum(subset)
for i,size in enumerate(sizes): # add to current subset
if subset and size>subset[-1]: # non-increasing order
continue # (to avoid repeats)
if used + size <= bound:
yield from pack(sizes[:i]+sizes[i+1:],bound,subset+[size])
if sizes:
for p in pack(sizes,bound): # add more subsets
yield [subset,*p]
def bestFit(sizes,bound):
packs = pack(sizes,bound)
return min(packs,key = lambda p : bound*len(p)-sum(sizes))
output:
for p in pack([1,2,3,4,5],8):
print(p,8*len(p)-sum(map(sum,p)))
[[5, 1], [4], [3, 2]] 9
[[5, 2, 1], [4, 3]] 1
[[5, 2], [4, 3, 1]] 1
[[5, 2], [4], [3, 1]] 9
[[5, 3], [4, 2, 1]] 1
[[5, 3], [4], [2, 1]] 9
[[5], [4, 1], [3, 2]] 9
[[5], [4, 2], [3, 1]] 9
[[5], [4, 3], [2, 1]] 9
[[5], [4], [3, 2, 1]] 9
[[5], [4], [3], [2, 1]] 17
print(*bestFit([1,2,3,4,5],8))
# [5, 2, 1] [4, 3]
print(*bestFit([1,2,3,4,5,6,7,8,9],18))
# [9, 1] [8, 4, 3, 2] [7, 6, 5]
This will take exponentially longer as your list of sizes gets larger but it may be enough if you only have very small inputs
You probably need something like itertools.combinations, that will give you all the possible combinations of elements in sublists of given lenght without duplicate elements.
If you want to know more about function combinations, i would suggest to read also this.
Something like this should work:
for i in range(8//min(myList)):
for j in itertools.permutations(myList, i):
if sum(j) == 8:
print(j)
This way you are getting all the combinations of myList, and printing those ones of which element's sum is 8.
A function like this may be useful:
def permutationsWithSum(myList: list[int], n: int):
for i in range(n//min(myList)):
for j in itertools.permutations(myList, i):
if sum(j) == n:
yield j

How to merge smaller sub-elements into larger parent-elements in a list?

I have a list of lists, but some lists are "sublists" of other lists. What I want to do is remove the sublists from the larger list so that we only have the largest unique sublists.
For example:
>>> some_list = [[1], [1, 2], [1, 2, 3], [1, 4]]
>>> ideal_list = [[1, 2, 3], [1, 4]]
The code that I've written right now is:
new_list = []
for i in range(some_list)):
for j in range(i + 1, len(some_list)):
count = 0
for k in some_list[i]:
if k in some_list[j]:
count += 1
if count == len(some_list[i]):
new_list.append(some_list[j])
The basic algorithm that I had in mind is that we'd check if a list's elements were in the following sublists, and if so then we use the other larger sublist. It doesn't give the desired output (it actually gives [[1, 2], [1, 2, 3], [1, 4], [1, 2, 3]]) and I'm wondering what I could do to achieve what I want.
I don't want to use sets because duplicate elements matter.
Same idea as set, but using Counter instead. It should be a lot more efficient in sublist check part than brute force
from collections import Counter
new_list = []
counters = []
for arr in sorted(some_list, key=len, reverse=True):
arr_counter = Counter(arr)
if any((c & arr_counter) == arr_counter for c in counters):
continue # it is a sublist of something else
new_list.append(arr)
counters.append(arr_counter)
With some inspiration from #mkrieger1's comment, one possible solution would be:
def merge_sublists(some_list):
new_list = []
for i in range(len(some_list)):
true_or_false = []
for j in range(len(some_list)):
if some_list[j] == some_list[i]:
continue
true_or_false.append(all([x in some_list[j] for x in some_list[i]]))
if not any(true_or_false):
new_list.append(some_list[i])
return new_list
As is stated in the comment, a brute-force solution would be to loop through each element and check if it's a sublist of any other sublist. If it's not, then append it to the new list.
Test cases:
>>> merge_sublists([[1], [1, 2], [1, 2, 3], [1, 4]])
[[1, 2, 3], [1, 4]]
>>> merge_sublists([[1, 2, 3], [4, 5], [3, 4]])
[[1, 2, 3], [4, 5], [3, 4]]
Input:
l = [[1], [1, 2], [1, 2, 3], [1, 4]]
One way here:
l1 = l.copy()
for i in l:
for j in l:
if set(i).issubset(set(j)) and i!=j:
l1.remove(i)
break
This prints:
print(l1)
[[1, 2, 3], [1, 4]]
EDIT: (Taking care of duplicates as well)
l1 = [list(tupl) for tupl in {tuple(item) for item in l }]
l2 = l1.copy()
for i in l1:
for j in l1:
if set(i).issubset(set(j)) and i!=j:
l2.remove(i)
break

Merge python lists of different lengths

I am attempting to merge two python lists, where their values at a given index will form a list (element) in a new list. For example:
merge_lists([1,2,3,4], [1,5]) = [[1,1], [2,5], [3], [4]]
I could iterate on this function to combine ever more lists. What is the most efficient way to accomplish this?
Edit (part 2)
Upon testing the answer I had previously selected, I realized I had additional criteria and a more general problem. I would also like to combine lists containing lists or values. For example:
merge_lists([[1,2],[1]] , [3,4]) = [[1,2,3], [1,4]]
The answers currently provided generate lists of higher dimensions in cases like this.
One option is to use itertools.zip_longest (in python 3):
from itertools import zip_longest
[[x for x in t if x is not None] for t in zip_longest([1,2,3,4], [1,5])]
# [[1, 1], [2, 5], [3], [4]]
If you prefer sets:
[{x for x in t if x is not None} for t in zip_longest([1,2,3,4], [1,5])]
# [{1}, {2, 5}, {3}, {4}]
In python 2, use itertools.izip_longest:
from itertools import izip_longest
[[x for x in t if x is not None] for t in izip_longest([1,2,3,4], [1,5])]
#[[1, 1], [2, 5], [3], [4]]
Update to handle the slightly more complicated case:
def flatten(lst):
result = []
for s in lst:
if isinstance(s, list):
result.extend(s)
else:
result.append(s)
return result
This handles the above two cases pretty well:
[flatten(x for x in t if x is not None) for t in izip_longest([1,2,3,4], [1,5])]
# [[1, 1], [2, 5], [3], [4]]
[flatten(x for x in t if x is not None) for t in izip_longest([[1,2],[1]] , [3,4])]
# [[1, 2, 3], [1, 4]]
Note even though this works for the above two cases, but it can still break under deeper nested structure, since the case can get complicated very quickly. For a more general solution, you can see here.
Another way to have your desired output using zip():
def merge(a, b):
m = min(len(a), len(b))
sub = []
for k,v in zip(a,b):
sub.append([k, v])
return sub + list([k] for k in a[m:]) if len(a) > len(b) else sub + list([k] for k in b[m:])
a = [1, 2, 3, 4]
b = [1, 5]
print(merge(a, b))
>>> [[1, 1], [2, 5], [3], [4]]
You could use itertools.izip_longest and filter():
>>> lst1, lst2 = [1, 2, 3, 4], [1, 5]
>>> from itertools import izip_longest
>>> [list(filter(None, x)) for x in izip_longest(lst1, lst2)]
[[1, 1], [2, 5], [3], [4]]
How it works: izip_longest() aggregates the elements from two lists, filling missing values with Nones, which you then filter out with filter().
Another way using zip_longest and chain from itertools:
import itertools
[i for i in list(itertools.chain(*itertools.zip_longest(list1, list2, list3))) if i is not None]
or in 2 lines (more readable):
merged_list = list(itertools.chain(*itertools.zip_longest(a, b, c)))
merged_list = [i for i in merged_list if i is not None]

How to obtain sliced sublists of a list dynamically in python?

Let's say I have a list:
l = [0,1,2,3,4]
And I want to obtain a sequence of lists in this logic:
[[1,2,3,4],[0,1,2,3],[2,3,4],[1,2,3],[0,1,2],[3,4],[2,3],[1,2],[0,1],[0],[1],[2],[3],[4]]
That's it, sublists made of l[1:] and l[:-1]
I started by this recursive function:
l = [0,1,2,3,4]
def sublist(l):
if len(l) == 1:
return l
else:
return [sublist(l[1:]),sublist(l[:-1])]
a = [sublist(l)]
print a
But it's not really what I what as it outputs:
[[[[[[4], [3]], [[3], [2]]], [[[3], [2]], [[2], [1]]]], [[[[3], [2]], [[2], [1]]], [[[2], [1]], [[1], [0]]]]]]
import itertools
[list(itertools.combinations(l, x)) for x in range(1, len(l))]
Here's a very straightforward implementation:
def sublists_n(l, n):
subs = []
for i in range(len(l)-n+1):
subs.extend([l[i:i+n]])
return subs
def sublists(l):
subs = []
for i in range(len(l)-1,0,-1):
subs.extend(sublists_n(l,i))
return subs
>>> l = [0,1,2,3,4]
>>> sublists(l)
[[0, 1, 2, 3], [1, 2, 3, 4], [0, 1, 2], [1, 2, 3], [2, 3, 4], [0, 1], [1, 2], [2, 3], [3, 4], [0], [1], [2], [3], [4]]
[l[x:] for x in range(len(l))] + [l[:x+1] for x in range(len(l))]
Loops through l twice, but you sort of have to no matter what I think (could use zip but same thing).
A simple recursion, doesn't quite order things correctly but its simple.
def sublists(l):
right = l[1:]
left = l[:-1]
result = [right, left]
if len(l) > 2:
result.extend(sublists(right))
result.extend(sublists(left))
return result
print sublists([0,1,2,3,4])

Algorithm to generate (not quite) spanning set in Python

This follows on from this question:
Algorithm to generate spanning set
Given this input: [1,2,3,4]
I'd like to generate this set of sets in python:
[1] [2] [3] [4]
[1] [2] [3,4]
[1] [2, 3, 4]
[1] [2,3] [4]
[1,2] [3] [4]
[1,2] [3,4]
[1,2,3] [4]
[1,2,3,4]
So unlike the previous question, the order of the list is retained.
Ideally the code would work for n items in the list
Thanks very much
EDIT 2: Could anyone advise me on how to do this if the original input is a string rather than a list (where each word in the string becomes an item in a list). Thanks!
EDIT: added [1] [2, 3, 4] Sorry for the mistake
You might also enjoy a recursive solution:
def span(lst):
yield [lst]
for i in range(1, len(lst)):
for x in span(lst[i:]):
yield [lst[:i]] + x
Explanation
We exploit recursion here to break the problem down. The approach is the following:
For every list, the whole list is a valid spanning: [1,2,3,4] => [[1,2,3,4]].
For every list that is longer than size 1, we can use the first item as a group and then apply the same algorithm on the remaining list to get all the combined results:
[1,2,3] =>
[[1]] + [[2], [3]] # => [[1], [2], [3]]
[[1]] + [[2,3]] # => [[1], [2,3]]
For every list that is longer than size 2, we can just as well use the first two items as a group and then apply the same algorithm on the remaining list and combine the results:
[1,2,3,4,5] =>
[[1,2]] + [[3], [4], [5]] # => [[1,2], [3], [4], [5]]
[[1,2]] + [[3,4], [5]] # => [[1,2], [3,4], [5]]
[[1,2]] + [[3], [4,5]] # => [[1,2], [3], [4,5]]
[[1,2]] + [[3,4,5]] # => [[1,2], [3,4,5]]
We can see that the possible combinations on the right side are indeed all possible groupings of the remainder of the list, [3,4,5].
For every list that is longer than ... etc. Thus, the final algorithm is the following:
yield the whole list (it is always a valid spanning, see above)
For every possible splitting of the list, yield the left-hand part of the list combined with all possible spannings of the right-hand part of the list.
yield is a special keyword in Python that make the function a generator, which means that it returns a iterable object that can be used to enumerate all results found. You can transform the result into a list using the list constructor function: list(span([1,2,3,4])).
Adjusting one of the solution from Python: show all possible groupings of a list:
from itertools import combinations
def cut(lst, indexes):
last = 0
for i in indexes:
yield lst[last:i]
last = i
yield lst[last:]
def generate(lst, n):
for indexes in combinations(list(range(1,len(lst))), n - 1):
yield list(cut(lst, indexes))
data = [1,2,3,4]
for i in range(1, len(data)+1): # the only difference is here
for g in generate(data, i):
print(g)
"""
[[1, 2, 3, 4]]
[[1], [2, 3, 4]]
[[1, 2], [3, 4]]
[[1, 2, 3], [4]]
[[1], [2], [3, 4]]
[[1], [2, 3], [4]]
[[1, 2], [3], [4]]
[[1], [2], [3], [4]]
"""
import itertools
a = [1, 2, 3, 4]
n = len(a)
for num_splits in range(n):
for splits in itertools.combinations(range(1, n), num_splits):
splices = zip([0] + list(splits), list(splits) + [n])
print([a[i:j] for i, j in splices])
prints
[[1, 2, 3, 4]]
[[1], [2, 3, 4]]
[[1, 2], [3, 4]]
[[1, 2, 3], [4]]
[[1], [2], [3, 4]]
[[1], [2, 3], [4]]
[[1, 2], [3], [4]]
[[1], [2], [3], [4]]

Categories

Resources