How to take M things N at a time - python

I have a list of 46 items. Each has a number associated with it. I want to pair the items up into sets of 23 pairs and evaluate a function over each such set. How do I generate all such sets?
I can use the combinations function from itertools to produce all the 2-ples but I don't see how to generate all the sets of 23 pairs.
How do I do this or is there sample code I can reference?

>>> L=range(46)
>>> def f(x, y):  # for example
...     return x * y
...
>>> [f(x, y) for x, y in zip(*[iter(L)] * 2)]
[0, 6, 20, 42, 72, 110, 156, 210, 272, 342, 420, 506, 600, 702, 812, 930, 1056, 1190, 1332, 1482, 1640, 1806, 1980]
Edit:
For the powerset of the pairs, we start by creating the pairs the same way. For Python 3, use range in place of xrange, and wrap the zip() call in list(), since zip returns a lazy iterator there.
S = zip(*[iter(L)] * 2) # set of 23 pairs
[{j for i, j in enumerate(S) if (1<<i)&k} for k in xrange(1<<len(S))]
This will be quite a big list; you may want to use a generator expression instead:
for item in ({j for i, j in enumerate(S) if (1<<i)&k} for k in xrange(1<<len(S))):
    func(item)
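A Python 3 sketch of the same idea on a smaller list (the helper name powerset_of_pairs is mine; note the list() around zip, which is lazy in Python 3):

```python
L = range(6)                           # small stand-in; the question's list has 46 items
S = list(zip(*[iter(L)] * 2))          # consecutive pairs: [(0, 1), (2, 3), (4, 5)]

def powerset_of_pairs(pairs):
    # Bitmask k selects which of the len(pairs) pairs belong to each subset.
    for k in range(1 << len(pairs)):
        yield {p for i, p in enumerate(pairs) if (1 << i) & k}

subsets = list(powerset_of_pairs(S))   # 2 ** len(S) subsets in total
```

With 23 pairs this yields 2^23 ≈ 8.4 million subsets, which is why the generator form matters.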

First, the natural way to get all the pairs from a list is:
>>> N = 10
>>> input_list = range(N)
>>> [(a,b) for a, b in zip(input_list[::2], input_list[1::2])]
[(0, 1), (2, 3), (4, 5), (6, 7), (8, 9)]
If you want to generate all such pairs, I'd do something like (this is what I call Case 1 below):
>>> set_of_all_pairs = set()
>>> input_list = range(N)
>>> import itertools
>>> for perm in itertools.permutations(input_list):
...     pairs = tuple([(a,b) for a, b in zip(perm[::2], perm[1::2])])
...     set_of_all_pairs.add(pairs)
Granted, this as-is treats the order within a pair as significant (e.g., (1,4) is different from (4,1)) and also considers the order of the pairs meaningful. So instead, sort each pair and the list of pairs before adding them to the set:
>>> set_of_all_pairs = set()
>>> input_list = range(N)
>>> import itertools
>>> for perm in itertools.permutations(input_list):
...     pairs = sorted([tuple(sorted((a,b))) for a, b in zip(perm[::2], perm[1::2])])
...     set_of_all_pairs.add(tuple(pairs))
This is not an efficient algorithm (what I call Case 3 below), but for small values of N it will work.
For N=6, using the sorted method:
set([((0, 4), (1, 3), (2, 5)),
((0, 4), (1, 5), (2, 3)),
((0, 1), (2, 3), (4, 5)),
((0, 3), (1, 5), (2, 4)),
((0, 2), (1, 5), (3, 4)),
((0, 4), (1, 2), (3, 5)),
((0, 3), (1, 4), (2, 5)),
((0, 1), (2, 4), (3, 5)),
((0, 5), (1, 4), (2, 3)),
((0, 5), (1, 2), (3, 4)),
((0, 2), (1, 3), (4, 5)),
((0, 3), (1, 2), (4, 5)),
((0, 2), (1, 4), (3, 5)),
((0, 1), (2, 5), (3, 4)),
((0, 5), (1, 3), (2, 4))])
Note the solution space grows factorially: for N=6 it's 15; for N=8, 105; for N=10, 945; and for N=46 it will be 25373791335626257947657609375 ≈ 2.5 x 10^28.
EDIT: People criticized the O(N!) running time, but the desired output itself grows as O(N!).
The question asks to break a list of N elements (assuming the most general case of all elements being distinct) into a set of N/2 pairs, and not only to do this once, but to generate all sets of these pairings. This answer is the only one that does so. Yes, it's exponentially slow, and completely infeasible for N=46. That's why I used N=10.
There are three reasonable interpretations of the problem:
Case 1: Ordering matters both inside a pair (e.g., the function's arguments are not symmetric) and among the pairs in a set of pairs, so we will have N! ways of pairing up the numbers. Meaning in this case both (0,1) and (1,0) are considered distinct pairs, and for the N=4 case the pairing ((0,1), (2,3)) is distinct from ((2,3), (0,1)).
Case 2: Ordering matters in a pair, but is irrelevant in a set of pairings. This means we consider (0,1) and (1,0) as distinct pairs, but (for the N=4 case) the set {(0,1), (2,3)} is identical to the set {(2,3), (0,1)} and we do not need to consider both. In this case we will have N!/(N/2)! pairings, as any given set has (N/2)! different orderings. (I didn't give this explicitly above; just stop sorting the list of pairs.)
Case 3: Ordering is irrelevant both within a pair and within a set of pairings. This means we consider (0,1) and (1,0) as the same pair (the function's arguments are symmetric), so we will have N!/((N/2)! * 2^(N/2)) sets of pairs (factorial(N)/(factorial(N/2)*2**(N/2))), since each of the N/2 pairs in each combination contributes two internal orderings.
So depending on how the problem is phrased we should have:
 N | Case 1: N!  | Case 2: N!/(N/2)! | Case 3: N!/((N/2)! 2^(N/2))
---+-------------+-------------------+----------------------------
 6 | 720         | 120               | 15
 8 | 40320       | 1680              | 105
10 | 3628800     | 30240             | 945
46 | 5.5x10^57   | 2.1x10^35         | 2.5x10^28
Note, my algorithm goes through all permutations, and hence will actually run slower for Case 3 (because of the sorting) than for Case 1, even though a dedicated algorithm for Case 3 could be much faster. Still, this answer is optimal in asymptotic notation, as even Case 3 is factorial in its output size and completely infeasible to solve for N~46. Granted, if you had a problem size at the limit of feasibility (N~16) for Case 3 (e.g., needing to generate 518918400 pairings), this approach of iterating through all N! permutations, sorting, and throwing out duplicates is sub-optimal.
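The case counts in the table can be checked directly; this is a small sketch (the helper name pairing_counts is mine, not from the answer):

```python
from math import factorial

def pairing_counts(n):
    """Counts for the three cases above; n must be even."""
    case1 = factorial(n)                                         # ordered pairs, ordered set
    case2 = factorial(n) // factorial(n // 2)                    # unordered set of ordered pairs
    case3 = factorial(n) // (factorial(n // 2) * 2 ** (n // 2))  # both unordered
    return case1, case2, case3

for n in (6, 8, 10, 46):
    print(n, pairing_counts(n))
```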


Generate itertools.product in different order

I have some sorted/scored lists of parameters. I'd like to generate possible combinations of parameters (cartesian product). However, if the number of parameters is large, this quickly (very quickly!!) becomes a very large number. Basically, I'd like to do a cartesian product, but stop early.
import itertools
parameter_options = ['1234',
                     '123',
                     '1234']
for parameter_set in itertools.product(*parameter_options):
    print ''.join(parameter_set)
generates:
111
112
113
114
121
122
123
124
131
132
133
134
...
I'd like to generate (or something similar):
111
112
121
211
122
212
221
222
...
So that if I stop early, I'd at least get a couple of "good" sets of parameters, where a good set of parameters comes mostly early from the lists. This particular order would be fine, but I am interested in any technique that changes the "next permutation" choice order. I'd like the early results generated to have most items from the front of the list, but don't really care whether a solution generates 113 or 122 first, or whether 211 or 112 comes first.
My plan is to stop after some number of permutations are generated (maybe 10K or so? Depends on results). So if there are fewer than the cutoff, all should be generated, ultimately. And preferably each generated only once.
I think you can get your results in the order you want if you think of the output in terms of a graph traversal of the output space. You want a nearest-first traversal, while the itertools.product function is a depth-first traversal.
Try something like this:
import heapq

def nearest_first_product(*sequences):
    start = (0,) * len(sequences)
    queue = [(0, start)]
    seen = set([start])
    while queue:
        priority, indexes = heapq.heappop(queue)
        yield tuple(seq[index] for seq, index in zip(sequences, indexes))
        for i in range(len(sequences)):
            if indexes[i] < len(sequences[i]) - 1:
                lst = list(indexes)
                lst[i] += 1
                new_indexes = tuple(lst)
                if new_indexes not in seen:
                    new_priority = sum(index * index for index in new_indexes)
                    heapq.heappush(queue, (new_priority, new_indexes))
                    seen.add(new_indexes)
Example output:
for tup in nearest_first_product(range(1, 5), range(1, 4), range(1, 5)):
    print(tup)
(1, 1, 1)
(1, 1, 2)
(1, 2, 1)
(2, 1, 1)
(1, 2, 2)
(2, 1, 2)
(2, 2, 1)
(2, 2, 2)
(1, 1, 3)
(1, 3, 1)
(3, 1, 1)
(1, 2, 3)
(1, 3, 2)
(2, 1, 3)
(2, 3, 1)
(3, 1, 2)
(3, 2, 1)
(2, 2, 3)
(2, 3, 2)
(3, 2, 2)
(1, 3, 3)
(3, 1, 3)
(3, 3, 1)
(1, 1, 4)
(2, 3, 3)
(3, 2, 3)
(3, 3, 2)
(4, 1, 1)
(1, 2, 4)
(2, 1, 4)
(4, 1, 2)
(4, 2, 1)
(2, 2, 4)
(4, 2, 2)
(3, 3, 3)
(1, 3, 4)
(3, 1, 4)
(4, 1, 3)
(4, 3, 1)
(2, 3, 4)
(3, 2, 4)
(4, 2, 3)
(4, 3, 2)
(3, 3, 4)
(4, 3, 3)
(4, 1, 4)
(4, 2, 4)
(4, 3, 4)
You can get a bunch of slightly different orders by changing up the calculation of new_priority in the code. The current version uses squared Cartesian distance as the priorities, but you could use some other value if you wanted to (for instance, one that incorporates the values from the sequences, not only the indexes).
If you don't care too much about whether (1, 1, 3) comes before (1, 2, 2) (so long as they both come after (1, 1, 2), (1, 2, 1) and (2, 1, 1)), you could probably do a breadth-first traversal instead of nearest-first. This would be a bit simpler, as you could use a regular queue (like a collections.deque) rather than a priority queue.
The queues used by this sort of graph traversal mean that this code uses some amount of memory. However, the amount of memory is a lot less than if you had to produce the results all up front before putting them in order. The maximum memory used is proportional to the surface area of the result space, rather than its volume.
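The breadth-first variant mentioned above can be sketched with a plain deque; the function name breadth_first_product is mine, and the traversal logic mirrors the priority-queue version:

```python
from collections import deque

def breadth_first_product(*sequences):
    """Yield tuples of the Cartesian product in breadth-first order:
    all index-sum-k tuples come before any index-sum-(k+1) tuple."""
    start = (0,) * len(sequences)
    queue = deque([start])
    seen = {start}
    while queue:
        indexes = queue.popleft()
        yield tuple(seq[i] for seq, i in zip(sequences, indexes))
        # Enqueue each neighbor reached by bumping one index.
        for pos in range(len(sequences)):
            if indexes[pos] < len(sequences[pos]) - 1:
                nxt = indexes[:pos] + (indexes[pos] + 1,) + indexes[pos + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
```

Within one index-sum level the order is arbitrary, which is exactly the trade-off described above.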
Your question is a bit ambiguous, but reading your comments and the other answers, it seems you want a Cartesian product implementation that does a breadth-first search instead of a depth-first search.
Recently I had the same need, but also with the requirement that it not store intermediate results in memory. This is very important to me because I am working with a large number of parameters (thus an extremely big Cartesian product), and any implementation that stores values or makes recursive calls is non-viable. As you state in your question, this seems to be your case as well.
As I didn't find an answer that fulfils this requirement, I came up with this solution:
from itertools import combinations

def product(*sequences):
    '''Breadth First Search Cartesian Product'''
    # sequences = tuple(tuple(seq) for seq in sequences)

    def partitions(n, k):
        for c in combinations(range(n + k - 1), k - 1):
            yield (b - a - 1 for a, b in zip((-1,) + c, c + (n + k - 1,)))

    max_position = [len(i) - 1 for i in sequences]
    for i in range(sum(max_position)):
        for positions in partitions(i, len(sequences)):
            try:
                yield tuple(map(lambda seq, pos: seq[pos], sequences, positions))
            except IndexError:
                continue
    yield tuple(map(lambda seq, pos: seq[pos], sequences, max_position))
In terms of speed, this generator works fine at the beginning but gets slower toward the later results. So although this implementation is a bit slower, it works as a generator that uses no extra memory and yields no repeated values.
As I mentioned in @Blckknght's answer, the parameters here must also be sequences (subscriptable, length-defined iterables). You can bypass this limitation (sacrificing a bit of memory) by uncommenting the first line, which may be useful if you are working with generators/iterators as parameters.
I hope this helps; let me know if it solves your problem.
This solution possibly isn't the best as it forces every combination into memory briefly, but it does work. It just might take a little while for large data sets.
import itertools
import random

count = 100  # the (maximum) amount of results

results = random.sample(list(itertools.product(*parameter_options)), count)
for parameter_set in results:
    print "".join(parameter_set)
This will give you a list of products in a random order.

Algorithm to generate subsets satisfying binary relation

I am looking for a reasonable algorithm in python (well, because I have rather complicated mathematical objects implemented in python, so I cannot change the language) to achieve the following:
I am given a reflexive, symmetric binary relation bin_rel on a set X. The requested function maximal_compatible_subsets(X, bin_rel) should return all containmentwise maximal subsets of X such that the binary relation holds for all pairs a, b of elements in the subset.
In some more detail: Suppose I am given a binary relation on a set of objects, say
def bin_rel(elt1, elt2):
    # return True if elt1 and elt2 satisfy the relation and False otherwise
    # Example: set intersection; here elt1 and elt2 are sets and the
    # relation picks those pairs that have a nonempty intersection
    return elt1.intersection(elt2)
I can also assume that the relation bin_rel is reflexive (that is, bin_rel(a,a) is True) and symmetric (that is, bin_rel(a,b) is bin_rel(b,a)).
I am now given a set X and a function bin_rel as above and seek an efficient algorithm to obtain the desired subsets of X.
For example, in the case of the set intersection above (with sets replaced by lists for easier reading):
> X = [ [1,2,3], [1,3], [1,6], [3,4], [3,5], [4,5] ]
> maximal_compatible_subsets(X,bin_rel)
[[[1,2,3],[1,3],[1,6]], [[1,2,3],[1,3],[3,4],[3,5]], [[3,4],[3,5],[4,5]]]
This problem doesn't seem to be very exotic, so most welcome would be a pointer to an efficient existing snippet of code.
As Matt Timmermans noted, this is the problem of finding maximal cliques, which can be solved by the Bron–Kerbosch algorithm. NetworkX has an implementation that can be used from Python.
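If pulling in NetworkX is not an option, a minimal pure-Python sketch of Bron–Kerbosch (the basic form, without pivoting) might look like this; the helper names are mine, not from any library:

```python
def bron_kerbosch(R, P, X, neighbors):
    """Yield every maximal clique.
    R: clique under construction, P: candidate vertices, X: excluded vertices."""
    if not P and not X:
        yield set(R)
    while P:
        v = P.pop()
        yield from bron_kerbosch(R | {v}, P & neighbors[v], X & neighbors[v], neighbors)
        X = X | {v}

def maximal_compatible_subsets(items, bin_rel):
    # Build the compatibility graph on indices, then list its maximal cliques.
    n = len(items)
    neighbors = {i: {j for j in range(n) if j != i and bin_rel(items[i], items[j])}
                 for i in range(n)}
    return [[items[i] for i in sorted(c)]
            for c in bron_kerbosch(set(), set(range(n)), set(), neighbors)]
```

On the question's example this reproduces the three maximal subsets listed above; for serious use, the pivoting variant (which NetworkX implements) prunes much better.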
If you want to use python straight out of the box, you could use the following as a starting point:
from itertools import combinations
def maximal_compatible_subsets(X, bin_rel):
    retval = []
    for i in range(len(X), 1, -1):
        for j in combinations(X, i):
            if all(bin_rel(a, b) for a, b in combinations(j, 2)) and not any(set(j).issubset(a) for a in retval):
                retval.append(tuple(j))
    return tuple(retval)
if __name__ == '__main__':
    x = ((1,2,3), (1,3), (1,6), (3,4), (3,5), (4,5))

    def nonempty_intersection(a, b):
        return set(a).intersection(b)

    print x
    print maximal_compatible_subsets(x, nonempty_intersection)
Outputs:
((1, 2, 3), (1, 3), (1, 6), (3, 4), (3, 5), (4, 5))
(((1, 2, 3), (1, 3), (3, 4), (3, 5)), ((1, 2, 3), (1, 3), (1, 6)), ((3, 4), (3, 5), (4, 5)))

Generating all possible combinations of a list, "itertools.combinations" misses some results

Given a list of items in Python, how can I get all the possible combinations of the items?
There are several similar questions on this site, that suggest using itertools.combinations, but that returns only a subset of what I need:
import itertools

stuff = [1, 2, 3]
for L in range(0, len(stuff) + 1):
    for subset in itertools.combinations(stuff, L):
        print(subset)
()
(1,)
(2,)
(3,)
(1, 2)
(1, 3)
(2, 3)
(1, 2, 3)
As you see, it returns only items in a strict order, not returning (2, 1), (3, 2), (3, 1), (2, 1, 3), (3, 1, 2), (2, 3, 1), and (3, 2, 1). Is there some workaround for that? I can't seem to come up with anything.
Use itertools.permutations:
>>> import itertools
>>> stuff = [1, 2, 3]
>>> for L in range(0, len(stuff)+1):
...     for subset in itertools.permutations(stuff, L):
...         print(subset)
...
()
(1,)
(2,)
(3,)
(1, 2)
(1, 3)
(2, 1)
(2, 3)
(3, 1)
....
Help on itertools.permutations:
permutations(iterable[, r]) --> permutations object
Return successive r-length permutations of elements in the iterable.
permutations(range(3), 2) --> (0,1), (0,2), (1,0), (1,2), (2,0), (2,1)
You can generate all the combinations of a list in python using this simple code
import itertools
a = [1,2,3,4]
for i in xrange(1, len(a) + 1):
    print list(itertools.combinations(a, i))
Result:
[(1,), (2,), (3,), (4,)]
[(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
[(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]
[(1, 2, 3, 4)]
Are you looking for itertools.permutations instead?
From help(itertools.permutations),
Help on class permutations in module itertools:
class permutations(__builtin__.object)
| permutations(iterable[, r]) --> permutations object
|
| Return successive r-length permutations of elements in the iterable.
|
| permutations(range(3), 2) --> (0,1), (0,2), (1,0), (1,2), (2,0), (2,1)
Sample Code :
>>> from itertools import permutations
>>> stuff = [1, 2, 3]
>>> for i in range(0, len(stuff)+1):
...     for subset in permutations(stuff, i):
...         print(subset)
()
(1,)
(2,)
(3,)
(1, 2)
(1, 3)
(2, 1)
(2, 3)
(3, 1)
(3, 2)
(1, 2, 3)
(1, 3, 2)
(2, 1, 3)
(2, 3, 1)
(3, 1, 2)
(3, 2, 1)
From Wikipedia, the difference between permutations and combinations :
Permutation :
Informally, a permutation of a set of objects is an arrangement of those objects into a particular order. For example, there are six permutations of the set {1,2,3}, namely (1,2,3), (1,3,2), (2,1,3), (2,3,1), (3,1,2), and (3,2,1).
Combination :
In mathematics a combination is a way of selecting several things out of a larger group, where (unlike permutations) order does not matter.
itertools.permutations is going to be what you want. By mathematical definition, order does not matter for combinations, meaning (1,2) is considered identical to (2,1). Whereas with permutations, each distinct ordering counts as a unique permutation, so (1,2) and (2,1) are completely different.
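The difference in the two counts is easy to see on the question's own list:

```python
from itertools import combinations, permutations

stuff = [1, 2, 3]
combs = list(combinations(stuff, 2))   # order ignored: C(3, 2) = 3 results
perms = list(permutations(stuff, 2))   # every ordering distinct: P(3, 2) = 6 results
print(combs)   # [(1, 2), (1, 3), (2, 3)]
print(perms)   # [(1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 2)]
```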
Here is a solution without itertools
First, let's define a translation between an indicator vector of 0s and 1s and a sub-list (1 if the item is in the sublist):
def indicators2sublist(indicators, arr):
    return [item for item, indicator in zip(arr, indicators) if int(indicator) == 1]
Next, we'll define a mapping from a number between 0 and 2^n - 1 to its binary-vector representation (using the string format function):
def bin(n, sz):
    return ('{d:0' + str(sz) + 'b}').format(d=n)
All we have left to do is to iterate over all the possible numbers and call indicators2sublist:
def all_sublists(arr):
    sz = len(arr)
    for n in xrange(0, 2 ** sz):
        b = bin(n, sz)
        yield indicators2sublist(b, arr)
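For what it's worth, the same bitmask idea can skip the string round-trip entirely by testing bits of n directly (a Python 3 sketch; all_sublists_bits is my variant, not the answer's):

```python
def all_sublists_bits(arr):
    # Bit i of n set means arr[i] is in the sublist; n runs over all 2**len(arr) masks.
    for n in range(2 ** len(arr)):
        yield [item for i, item in enumerate(arr) if n >> i & 1]
```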
I assume you want all possible combinations as 'sets' of values. Here is a piece of code that I wrote that might help give you an idea:
def getAllCombinations(object_list):
    uniq_objs = set(object_list)
    combinations = []
    for obj in uniq_objs:
        for i in range(0, len(combinations)):
            combinations.append(combinations[i].union([obj]))
        combinations.append(set([obj]))
    return combinations
Here is a sample:
combinations = getAllCombinations([20,10,30])
combinations.sort(key = lambda s: len(s))
print combinations
... [set([10]), set([20]), set([30]), set([10, 20]), set([10, 30]), set([20, 30]), set([10, 20, 30])]
Note this produces all 2^n - 1 nonempty subsets, so both the output size and the running time are exponential in n; be careful. This works but may not be the most efficient approach.
Just thought I'd put this out there since I couldn't find EVERY possible outcome elsewhere. Keep in mind I have only the rawest, most basic knowledge of Python, and there's probably a much more elegant solution... (also, excuse the poor variable names)
testing = [1, 2, 3]
testing2 = [0]
n = -1

def testingSomethingElse(number):
    try:
        testing2[0:len(testing2)] == testing[0]
        n = -1
        testing2[number] += 1
    except IndexError:
        testing2.append(testing[0])

while True:
    n += 1
    testing2[0] = testing[n]
    print(testing2)
    if testing2[0] == testing[-1]:
        try:
            n = -1
            testing2[1] += 1
        except IndexError:
            testing2.append(testing[0])
        for i in range(len(testing2)):
            if testing2[i] == 4:
                testingSomethingElse(i + 1)
                testing2[i] = testing[0]
I got away with == 4 because I'm working with integers, but you may have to modify that accordingly...

generating all permutations of 2 ones and 3 zeroes with itertools

Probably basic, but I couldn't find it in any other question.
I tried:
print ["".join(seq) for seq in itertools.permutations("00011")]
but got lots of duplicates; it seems itertools doesn't understand that all the zeroes and all the ones are interchangeable...
what am I missing?
EDIT:
oops. Thanks to Gareth I've found out this question is a dup of: permutations with unique values.
Not closing it as I think my phrasing of the question is clearer.
list(itertools.combinations(range(5), 2))
returns a list of the 10 positions where the two ones can be placed within the five digits (the others are zeros):
[(0, 1),
(0, 2),
(0, 3),
(0, 4),
(1, 2),
(1, 3),
(1, 4),
(2, 3),
(2, 4),
(3, 4)]
For your case with 2 ones and 13 zeros, use this:
list(itertools.combinations(range(15), 2))
which returns a list of 105 positions. And it is much faster than your original solution.
Now the function:
def combiner(zeros=3, ones=2):
    for indices in itertools.combinations(range(zeros + ones), ones):
        item = ['0'] * (zeros + ones)
        for index in indices:
            item[index] = '1'
        yield ''.join(item)

print list(combiner(3, 2))
['11000',
 '10100',
 '10010',
 '10001',
 '01100',
 '01010',
 '01001',
 '00110',
 '00101',
 '00011']
and this needs 14.4µs.
list(combiner(13, 2))
returning 105 elements needs 134µs.
set("".join(seq) for seq in itertools.permutations("00011"))
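The two approaches can be checked against each other: deduplicating the 5! = 120 permutations with a set leaves exactly the C(5, 2) = 10 distinct strings that placing the ones by index produces directly (a Python 3 sketch):

```python
from itertools import combinations, permutations

# Brute force: generate all 120 orderings, let the set collapse duplicates.
via_set = set("".join(seq) for seq in permutations("00011"))
# Direct: choose the two positions for the ones; each string is built exactly once.
via_comb = {"".join("1" if i in ones else "0" for i in range(5))
            for ones in combinations(range(5), 2)}
print(sorted(via_set))
```

The set-based one-liner is fine for 5 characters, but the combinations approach is the one that scales (105 strings for 2 ones and 13 zeros, instead of 15! permutations).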

Iteration over n values

I'd like to make an iteration to calculate all the possibilities of a given formula. I need to write a nested iteration but couldn't get it right. I am not good at algorithms :(
For calculating all the possibilities (0%-100%) of 3 constants {z1, z2, z3}, I prepared:
a = frange(0, 1.0, 0.01)
for z1 in a:
    for z2 in a:
        for z3 in a:
            calculate(z1, z2, z3)
and works properly as I expected.
If z is a list of n values (n can be 2-30 in my case), which algorithm would you suggest to handle this? How can I create the nested iteration?
The easiest way is to use itertools.product():
a = frange(0, 1.0, 0.01)
for z in itertools.product(a, repeat=n):
    calculate(*z)
If n really would be 30, this would iterate over 100**30 = 10**60 values. Be prepared to wait.
itertools.product will do what you want (and more). Unfortunately it wants the lists whose products it computes in separate arguments, like this:
>>> list(itertools.product([1,2,3],[1,2,3]))
[(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 1), (3, 2), (3, 3)]
so on the face of it you need to do something like this:
a=frange(0,1.0,0.01)
for (z1,z2,z3) in itertools.product(a,a,a): calculate(z1,z2,z3)
but if you want to use the exact code for different numbers of products you can say
a=frange(0,1.0,0.01)
for (z1,z2,z3) in itertools.product(*(3*[a])): calculate(z1,z2,z3)
or
a=frange(0,1.0,0.01)
for (z1,z2,z3) in apply(itertools.product, 3*[a]): calculate(z1,z2,z3)
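Note that apply() was removed in Python 3; the repeat keyword covers the same case without building the 3*[a] list at all (a sketch, with a small stand-in list since frange is not defined here):

```python
import itertools

a = [0.0, 0.5, 1.0]   # stand-in for frange(0, 1.0, 0.01)
for z1, z2, z3 in itertools.product(a, repeat=3):
    pass              # calculate(z1, z2, z3) would go here
```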
