I'm trying to find a performant solution in Python that works like so:
>>> func([1,2,3], [1,2])
[(1,1), (1,2), (1,3), (2,2), (2,3)]
This is similar to itertools.combinations_with_replacement, except that it can take multiple iterables. It's also similar to itertools.product, except that it omits order-independent duplicate results.
All of the inputs will be prefixes of the same series (i.e. they all start with the same element and follow the same pattern, but might have different lengths).
The function must be able to take any number of iterables as input.
Given a set of lists A, B, C, ..., here is a sketch of an algorithm that generates those results.
assert len(A) <= len(B) <= len(C) <= ...
for i in 0..len(A)
    for j in i..len(B)
        for k in j..len(C)
            .
            .
            .
                yield A[i], B[j], C[k], ...
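For concreteness, here is what that sketch looks like when specialised to exactly three lists. This is only an illustration of the loop structure; the function name three_list_combinations and the sample series are made up for the example, and the real problem needs an arbitrary number of lists.

```python
def three_list_combinations(A, B, C):
    # Hypothetical helper: the sketch above, fixed to exactly three lists.
    assert len(A) <= len(B) <= len(C)
    for i in range(len(A)):
        for j in range(i, len(B)):      # j never drops below i
            for k in range(j, len(C)):  # k never drops below j
                yield A[i], B[j], C[k]

series = [1, 2, 3, 4, 5]
result = list(three_list_combinations(series[:2], series[:3], series[:3]))
print(result)
```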
Things I can't do
Use itertools.product and filter the results. This has to be performant.
Use recursion. The function overhead would make it slower than using itertools.product and filtering for a reasonable number of iterables.
I suspect there's a way to do this with itertools, but I have no idea what it is.
EDIT: I'm looking for the solution that takes the least time.
EDIT 2: There seems to be some confusion about what I'm trying to optimize. I'll illustrate with an example.
>>> len(list(itertools.product( *[range(8)] * 5 )))
32768
>>> len(list(itertools.combinations_with_replacement(range(8), 5)))
792
The first line gives the number of order-dependent possibilities for rolling 5 8-sided dice. The second gives the number of order-independent possibilities. Regardless of how performant itertools.product is, it will take far more iterations to get a result than itertools.combinations_with_replacement (32768 vs. 792 here). I'm trying to find a way to do something similar to itertools.combinations_with_replacement, but with multiple iterables, that minimizes the number of iterations and therefore the running time. (product runs in O(M^N) whereas combinations_with_replacement runs in O((M+N-1)! / (N! (M-1)!)), where M is the number of sides on the die and N is the number of dice.)
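The two counts can be checked against their closed forms. This is a small sketch, assuming Python 3.8 or later for math.comb; the variable names are only for illustration.

```python
import itertools
import math

M, N = 8, 5  # sides per die, number of dice (values from the example above)

# Order-dependent outcomes: M independent choices for each of the N dice.
ordered = M ** N

# Order-independent outcomes: multisets of size N drawn from M values.
unordered = math.comb(M + N - 1, N)

# Cross-check the closed forms against what itertools actually produces.
assert ordered == sum(1 for _ in itertools.product(range(M), repeat=N))
assert unordered == sum(1 for _ in itertools.combinations_with_replacement(range(M), N))

print(ordered, unordered)
```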
This solution uses neither recursion nor filtering. It produces only ascending sequences of indices, so it is usable only for prefixes of the same collection. It also identifies elements purely by index, so it does not require the elements of the series to be comparable or even hashable.
def prefixCombinations(coll, prefixes):
    "produces combinations of elements of the same collection prefixes"
    prefixes = sorted(prefixes)  # sorting does not change the result, since the combinations are unordered
    n = len(prefixes)
    indices = [0]*n
    while True:
        yield tuple(coll[indices[i]] for i in range(n))
        # search backwards for an index that has not reached its maximum
        for i in range(n-1, -1, -1):
            if indices[i] < prefixes[i] - 1:
                break
        else:
            # all indices have hit their maximum: leave
            break
        # bump that index and reset everything to its right to the same level
        level = indices[i] + 1
        for i in range(i, n):
            indices[i] = level
Examples:
>>> list(prefixCombinations([1,2,3,4,5], (3,2)))
[(1, 1), (1, 2), (1, 3), (2, 2), (2, 3)]
>>> list(prefixCombinations([1,2,3,4,5], (3,2,5)))
[(1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 1, 4), (1, 1, 5), (1, 2, 2), (1, 2, 3), (1, 2, 4), (1, 2, 5), (1, 3, 3), (1, 3, 4), (1, 3, 5), (2, 2, 2), (2, 2, 3), (2, 2, 4), (2, 2, 5), (2, 3, 3), (2, 3, 4), (2, 3, 5)]
>>> from itertools import combinations_with_replacement
>>> tuple(prefixCombinations(range(10),[10]*4)) == tuple(combinations_with_replacement(range(10),4))
True
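One way to gain confidence in the index-walking approach is to compare it with a brute-force reference on a few prefix sets. The sketch below repeats the algorithm under the name prefix_combinations so that it runs on its own; brute_force is a hypothetical helper written only for this check, and it assumes every prefix length is at least 1.

```python
from itertools import product

def prefix_combinations(coll, prefixes):
    # Same algorithm as prefixCombinations above, repeated so this block is self-contained.
    prefixes = sorted(prefixes)
    n = len(prefixes)
    indices = [0] * n
    while True:
        yield tuple(coll[i] for i in indices)
        for i in range(n - 1, -1, -1):
            if indices[i] < prefixes[i] - 1:
                break
        else:
            return  # every index is at its maximum
        level = indices[i] + 1
        for j in range(i, n):
            indices[j] = level

def brute_force(prefixes):
    # Hypothetical reference: full product of index ranges, each tuple sorted
    # into canonical form, duplicates dropped, result sorted lexicographically.
    ranges = [range(p) for p in prefixes]
    return sorted({tuple(sorted(t)) for t in product(*ranges)})

for prefixes in [(3, 2), (3, 2, 5), (4, 4, 4), (1, 6, 2, 3)]:
    coll = list(range(max(prefixes)))  # use the indices themselves as elements
    assert list(prefix_combinations(coll, prefixes)) == brute_force(prefixes)
```

The brute-force version visits every tuple of the full product, so it is only suitable as a test oracle, not as a replacement.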
Since this is a generator, it doesn't meaningfully change the performance (it just wraps an O(n) filter around itertools.product):
import itertools

def product(*args):
    # note: the unpacking below assumes exactly two iterables
    for a, b in itertools.product(*args):
        if a >= b:
            yield b, a

print(list(product([1,2,3], [1,2])))
Output:
[(1, 1), (1, 2), (2, 2), (1, 3), (2, 3)]
Or even:
product = lambda a, b: ((y, x) for x in a for y in b if x >= y)
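The two versions above only handle a pair of iterables. A possible generalisation to any number of inputs is sketched below; filtered_product is a made-up name, and the sketch assumes the elements are both sortable and hashable. It still walks the entire itertools.product, so it does not reduce the number of iterations.

```python
import itertools

def filtered_product(*iterables):
    # Hypothetical generalisation: canonicalise each tuple by sorting it,
    # and yield each canonical form only the first time it is seen.
    seen = set()
    for combo in itertools.product(*iterables):
        key = tuple(sorted(combo))
        if key not in seen:
            seen.add(key)
            yield key

pairs = list(filtered_product([1, 2, 3], [1, 2]))
print(pairs)
```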
Here is an implementation.
The idea is to use sorted containers to impose a canonical order and avoid duplicates that way. Duplicates are discarded at each step as they appear, so there is no need to filter the final result afterwards.
It relies on the "sortedcontainers" library, which provides fast sorted containers (as fast as a C implementation). [I'm not affiliated with this library in any manner.]
from sortedcontainers import SortedList as SList
# see http://www.grantjenks.com/docs/sortedcontainers/

def order_independant_combination(*args):
    filtered = 0
    previous = set()
    current = set()
    for iterable in args:
        if not previous:
            # first iterable: start from one-element tuples
            for elem in iterable:
                current.add(tuple([elem]))
        else:
            for elem in iterable:
                for combination in previous:
                    # insert elem at its sorted position to get a canonical tuple
                    newCombination = SList(combination)
                    newCombination.add(elem)
                    newCombination = tuple(newCombination)
                    if newCombination not in current:
                        current.add(newCombination)
                    else:
                        filtered += 1
        previous = current
        current = set()
    if filtered != 0:
        print("{0} duplicates have been filtered during geneeration process".format(filtered))
    return list(SList(previous))

if __name__ == "__main__":
    result = order_independant_combination(*[range(8)] * 5)
    print("Generated a result of length {0} that is {1}".format(len(result), result))
Called with the inputs from the question, order_independant_combination([1,2,3], [1,2]), it returns:
[(1, 1), (1, 2), (1, 3), (2, 2), (2, 3)]
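If installing sortedcontainers is not an option, the same step-by-step pruning can be sketched with built-in types only. This is a variant for illustration, not the implementation above: the name order_independent_combinations is made up, the duplicate counter is omitted, and elements are assumed to be sortable and hashable.

```python
def order_independent_combinations(*iterables):
    # Start from the single empty combination and extend it one iterable at a time.
    previous = {()}
    for iterable in iterables:
        elems = list(iterable)
        # Sorting each extended tuple gives a canonical form; the set drops duplicates
        # immediately, so they are never extended in later steps.
        previous = {tuple(sorted(combo + (elem,)))
                    for combo in previous
                    for elem in elems}
    return sorted(previous)

result = order_independent_combinations([1, 2, 3], [1, 2])
print(result)
```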
You can test it by adding more iterables as parameters; it works.
I hope it at least helps, even if it does not solve your problem.
Vaisse Arthur.
EDIT: to answer the comment: that is not a good analysis. Filtering duplicates during generation is far more effective than using itertools.product and then filtering duplicates out of the result. Eliminating a duplicate at one step avoids generating every duplicate that would be derived from it in the following steps.
Executing this:
if __name__ == "__main__":
    result = order_independant_combination([1,2,3],[1,2],[1,2],[1,2])
    print("Generated a result of length {0} that is {1}".format(len(result), result))
I got the following result :
9 duplicates have been filtered during geneeration process
Generated a result of length 9 that is [(1, 1, 1, 1), (1, 1, 1, 2), (1, 1, 1, 3), (1, 1, 2, 2), (1, 1, 2, 3), (1, 2, 2, 2), (1, 2, 2, 3), (2, 2, 2, 2), (2, 2, 2, 3)]
While using itertools I got this :
>>> import itertools
>>> c = list(itertools.product([1,2,3],[1,2],[1,2],[1,2]))
>>> c
[(1, 1, 1, 1), (1, 1, 1, 2), (1, 1, 2, 1), (1, 1, 2, 2), (1, 2, 1, 1), (1, 2, 1, 2), (1, 2, 2, 1), (1, 2, 2, 2), (2, 1, 1, 1), (2, 1, 1, 2), (2, 1, 2, 1), (2, 1, 2, 2), (2, 2, 1, 1), (2, 2, 1, 2), (2, 2, 2, 1), (2, 2, 2, 2), (3, 1, 1, 1), (3, 1, 1, 2), (3, 1, 2, 1), (3, 1, 2, 2), (3, 2, 1, 1), (3, 2, 1, 2), (3, 2, 2, 1), (3, 2, 2, 2)]
>>> len(c)
24
A simple calculation gives:
pruned generation: 9 results + 9 elements filtered -> 18 elements generated.
itertools: 24 elements generated.
The more iterables you give it, and the longer they are, the larger the difference becomes.
Example:
result = order_independant_combination([1,2,3,4,5],[1,2,3,4,5],[1,2,3,4,5],[1,2,3,4,5])
print("Generated a result of length {0} that is {1}".format(len(result), result))
Result :
155 duplicates have been filtered during geneeration process
Generated a result of length 70 ...
Itertools :
>>> len(list(itertools.product([1,2,3,4,5],[1,2,3,4,5],[1,2,3,4,5],[1,2,3,4,5])))
625
Difference of 400 elements.
EDIT 2: with *[range(8)] * 5 it gives: 2674 duplicates have been filtered during geneeration process. Generated a result of length 792...
I have a very specific problem where I need to know how to swap elements in a list or tuple.
I have one list called board_state, and I know the positions of the elements that need to be swapped. How do I swap them? In Java, with two-dimensional arrays, I could easily use the standard swap technique, but here it says tuple assignment is not possible.
Here is my code:
board_state = [(0, 1, 2), (3, 4, 5), (6, 7, 8)]
new = [1, 1] # [row, column] The '4' element here needs to be swapped with original
original = [2, 1] # [row, column] The '7' element here needs to be swapped with new
Result should be:
board_state = [(0, 1, 2), (3, 7, 5), (6, 4, 8)]
How do I swap?
Tuples, like strings, are immutable: it is not possible to assign to the individual items of a tuple.
Lists are mutable, so convert your board_state to a list of lists:
>>> board_state = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
And then use the standard Python idiom for swapping two elements in a list:
>>> board_state[1][1], board_state[2][1] = board_state[2][1], board_state[1][1]
>>> board_state
[[0, 1, 2], [3, 7, 5], [6, 4, 8]]
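If the board has to stay a list of tuples, one option is to build new rows rather than mutate the old ones. A minimal sketch, using the new and original coordinates from the question; swap_cells is a hypothetical helper name.

```python
def swap_cells(board, first, second):
    """Hypothetical helper: return a new list of tuples with two cells swapped."""
    rows = [list(row) for row in board]   # mutable copy of every row
    (r1, c1), (r2, c2) = first, second    # unpack [row, column] pairs
    rows[r1][c1], rows[r2][c2] = rows[r2][c2], rows[r1][c1]
    return [tuple(row) for row in rows]   # convert the rows back to tuples

board_state = [(0, 1, 2), (3, 4, 5), (6, 7, 8)]
new = [1, 1]       # [row, column] of the 4
original = [2, 1]  # [row, column] of the 7

board_state = swap_cells(board_state, new, original)
print(board_state)  # [(0, 1, 2), (3, 7, 5), (6, 4, 8)]
```

The original tuples are left untouched, which matters if other code still holds a reference to the old board.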
I just extracted some data from a list using python but think it's overcomplicated and unpythonic and there's probably a much better way to do this. I'm actually pretty sure I saw this somewhere in the standard library docs but my brain refuses to tell me where.
So here it goes:
Input:
x = range(8) # any even sequence
Output:
[[0, 1], [2, 3], [4, 5], [6, 7]]
My take:
[ [x[i], x[i+1]] for i in range(len(x))[::2] ]
Tuples?
In Python 2.n
>>> zip(*2*[iter(x)])
[(0, 1), (2, 3), (4, 5), (6, 7)]
In Python 3.n
zip() behaves slightly differently...
>>> zip(*2*[iter(x)])
<zip object at 0x285c582c>
>>> list(zip(*2*[iter(x)]))
[(0, 1), (2, 3), (4, 5), (6, 7)]
Lists?
The implementation is the same in Python 2 and 3...
>>> [[i,j] for i,j in zip(*2*[iter(x)])]
[[0, 1], [2, 3], [4, 5], [6, 7]]
Or, alternatively:
>>> [list(t) for t in zip(*2*[iter(x)])]
[[0, 1], [2, 3], [4, 5], [6, 7]]
The latter is more useful if you want to split into lists of 3 or more elements, without spelling it out, such as:
>>> [list(t) for t in zip(*4*[iter(x)])]
[[0, 1, 2, 3], [4, 5, 6, 7]]
If zip(*2*[iter(x)]) looks a little odd to you (and it did to me the first time I saw it!), take a look at How does zip(*[iter(s)]*n) work in Python?.
See also this pairwise implementation, which I think is pretty neat.
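The zip(*2*[iter(x)]) idiom relies on the list holding two references to one and the same iterator. A short sketch of what happens, written for Python 3 and assuming an even-length input as in the question; the variable names are only for illustration.

```python
x = range(8)

it = iter(x)
args = [it] * 2            # two references to the SAME iterator object
assert args[0] is args[1]

# zip(*args) asks each argument in turn for its next item. Both arguments are
# the same iterator, so every tuple consumes two consecutive items.
pairs = list(zip(*args))

# The equivalent hand-written loop (would fail on odd-length input, whereas
# zip silently drops the trailing item):
it = iter(x)
manual = []
for first in it:           # the loop takes one item ...
    second = next(it)      # ... and the body takes the one after it
    manual.append((first, second))

assert pairs == manual
print(pairs)
```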
If you want tuples instead of lists you can try:
>>> zip(range(0, 8, 2), range(1, 8, 2))
[(0, 1), (2, 3), (4, 5), (6, 7)]
Input:
x = range(8) # any even sequence
Solution:
output = []
for i, j in zip(*[iter(x)]*2):
    output.append( [i, j] )
Output:
print output
[[0, 1], [2, 3], [4, 5], [6, 7]]
You can rewrite it a bit:
>>> l = range(8)
>>> [[l[i], l[i+1]] for i in xrange(0, len(l), 2)]
[[0, 1], [2, 3], [4, 5], [6, 7]]
For some list tasks you can use itertools, but I'm pretty sure there's no helper function for this one.
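That was the case when this was written. Python 3.12 and later do ship itertools.batched, which yields tuples of a given size. The sketch below uses it when available; the fallback function is an illustrative stand-in for older versions, not the library's own implementation.

```python
from itertools import islice

try:
    from itertools import batched  # available from Python 3.12
except ImportError:
    def batched(iterable, n):
        # Fallback sketch: repeatedly slice n items off a single shared iterator.
        it = iter(iterable)
        while True:
            chunk = tuple(islice(it, n))
            if not chunk:
                return
            yield chunk

l = range(8)
pairs = [list(pair) for pair in batched(l, 2)]
print(pairs)  # [[0, 1], [2, 3], [4, 5], [6, 7]]
```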
I have a list of lists containing tuples:
[[(1L,)], [(2L,)], [(3L,)], [(4L,)], [(5L,)]]
How do I edit the list so that it looks like:
l = [[1], [2], [3], [4], [5]]
>>> a
[[(1L,)], [(2L,)], [(3L,)], [(4L,)], [(5L,)]]
>>> a = [[x[0][0]] for x in a]
>>> a
[[1L], [2L], [3L], [4L], [5L]]
If you have, for instance, two items in each tuple, you would need something like this:
example_list = [[(1, 0)], [(1, 1)], [(1, 3)], [(1, 4)], [(1, 5)]]
example_list = [[x[0][0], x[0][1]] for x in example_list]
print(example_list)
Output: [[1, 0], [1, 1], [1, 3], [1, 4], [1, 5]]
(note: this is using Python 3.8)
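To avoid spelling out each index, the inner tuple can be converted with list(), which works for any tuple length. A small sketch for Python 3; the variable names are only for illustration.

```python
example_list = [[(1, 0)], [(1, 1)], [(1, 3)], [(1, 4)], [(1, 5)]]

# sub[0] is the single tuple inside each sub-list; list() unpacks all of its items.
flattened = [list(sub[0]) for sub in example_list]
print(flattened)

# The same expression handles one-element tuples as well.
singles = [list(sub[0]) for sub in [[(1,)], [(2,)], [(3,)]]]
print(singles)
```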