Unique combinations from list with "distance limit"

Given the list a = ['a', 'b', 'c', 'd', 'e'], I would use itertools.combinations to get all unique combinations, such as ['ab', 'ac', ...], as per the classic SO answer.
How can I limit the unique combinations to items that are no more than n positions apart?
Example
If I want list items no more than n=2 positions apart, I would accept 'ab' and 'ac' as combinations but not 'ae', because the distance between 'a' and 'e' is greater than n=2.
Edit - code
Below is a plain Python solution. I would prefer to avoid it because of the nested for loops, which are not ideal for large lists.
a = ['a', 'b', 'c', 'd', 'e']
n_items = len(a)
n_max_look_forward = 2
unique_combos = []
for i, item in enumerate(a):
    for j in range(i+1, min(i+n_max_look_forward+1, n_items)):
        unique_combos.append( item+a[j] )
print(unique_combos)

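Complexity-wise, your solution is close to the best possible: the inner loop runs at most n_max_look_forward times per item, which matches the number of combinations that have to be produced.
You could refactor it into a generator that produces the values only when you need them, so that you don't have to hold all of them in memory at the same time: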
def combis(source, max_distance=2):
    for i, item in enumerate(source):
        for j in range(i+1, min(i+max_distance+1, len(source))):
            yield item+source[j]
You can then iterate over the generator:
>>> for combi in combis(['a', 'b', 'c', 'd', 'e']):
...     print(combi)
...
ab
ac
bc
bd
cd
ce
de
If you need all of them in memory as a list, you can still use the generator to initialise it:
>>> list(combis(['a', 'b', 'c', 'd', 'e']))
['ab', 'ac', 'bc', 'bd', 'cd', 'ce', 'de']

Related

python string operation. sticked words separation

I have a fairly challenging problem and need your help.
The problem is this:
I have a string, for example "abcde".
Now I want to split this string into every possible ordered combination of substrings, as a list of lists of strings.
For example,
my_function('abcde')
output =
[
['a', 'b', 'c', 'd', 'e'],
['a', 'b', 'c', 'de'],
['a', 'b', 'cde'],
['a', 'bced'],
['a', 'b', 'cd', 'e'],
['a', 'bc', 'd', 'e'],
['a', 'bc', 'de'],
['a', 'bcd', 'e'],
['a', 'bcde'],
['ab', 'c', 'd', 'e'],
['ab', 'c', 'de'],
['ab', 'cd', 'e'],
['ab', 'cde'],
['abc','d','e'],
['abc', 'de'],
['abcd', 'e'],
['abcde']
]
It is not quite a permutation, since the order matters.

Same result without itertools:
s = 'python'
splits = len(s) - 1
output = []
for i in range(2 ** splits):
    combination = []
    word = ''
    for position in range(splits + 1):
        word += s[position]
        # A 0 bit at this position means "cut after this character".
        if not (i & (1 << position)):
            combination.append(word)
            word = ''
    output.append(combination)
output.sort()
for combination in output:
    print(combination)
Just for beginners.

You could do this:
import itertools
def get_slices(values):
    slices_len = len(values) - 1
    for is_slice in itertools.product([True, False], repeat=slices_len):
        start_index = 0
        slices = []
        for slice_index, is_index_slice in enumerate(is_slice, 1):
            if is_index_slice:
                index_slice = values[start_index:slice_index]
                start_index = slice_index
                slices.append(index_slice)
        slices.append(values[start_index:])
        yield slices
The most important part of this code is the itertools.product call at the beginning, which generates every possible slicing pattern. Each pattern is a tuple of bools, one for each of the slices_len gaps between adjacent elements of values, saying whether to cut at that gap (True) or keep the two neighbours joined (False).
list(get_slices("abcde")) will return the list you requested. If you don't need all results immediately and instead want to iterate through them, you don't need the surrounding list call.
If you want the reverse order, you can replace [True, False] with [False, True].
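To make the link between one tuple of bools and one split more concrete, here is a small sketch for the three-character string "abc", which has two gaps. The helper name split_by_flags is hypothetical and only isolates the inner loop of the answer's approach.

```python
from itertools import product

def split_by_flags(values, flags):
    """Cut `values` at gap k (between items k-1 and k) when flags[k-1] is True."""
    pieces, start = [], 0
    for index, cut_here in enumerate(flags, 1):
        if cut_here:
            pieces.append(values[start:index])
            start = index
    pieces.append(values[start:])  # whatever remains after the last cut
    return pieces

for flags in product([True, False], repeat=2):
    print(flags, split_by_flags("abc", flags))
# (True, True) ['a', 'b', 'c']
# (True, False) ['a', 'bc']
# (False, True) ['ab', 'c']
# (False, False) ['abc']
```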

I got 16 items and you have 17 :-)
def fn(base_str):
    result = [[base_str]]
    for i in range(1, len(base_str)):
        child = fn(base_str[i:])
        for x in child:
            x.insert(0, base_str[0:i])
        result = child + result
    return result
print(fn("abcde"))
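The count of 16 is what one would expect: "abcde" has four gaps, each either cut or not, giving 2 ** 4 splits. A quick check along these lines (a sketch; ordered_splits is an illustrative name, not from the answers) also shows that the entry ['a', 'bced'] in the question's expected output does not join back to "abcde", which would account for the extra item.

```python
def ordered_splits(s):
    """Yield every way to cut s into contiguous, non-empty pieces."""
    yield [s]
    for i in range(1, len(s)):
        for rest in ordered_splits(s[i:]):
            yield [s[:i]] + rest

all_splits = list(ordered_splits("abcde"))
print(len(all_splits))                      # 16
print(len(all_splits) == 2 ** (len("abcde") - 1))  # True

suspect = ['a', 'bced']
print("".join(suspect) == "abcde")          # False
```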

list splicing variable assignments automation

Imagine that you have a list of strings.
lst = ['a','b17','c','dz','e','ff','e3','e66']
You want to separate those strings into individual variables:
a = lst[:7]
b = lst[7:14]
c = lst[14:21]
I'm wondering if there is a Pythonic way of handling this instead of spending time typing out every single list slice.

You can use a generator expression to produce the slices and unpack them to your desired variables:
a, b, c = (lst[i:i+7] for i in range(0, 21, 7))
But that hard-codes three slices and ignores anything past index 21, and switching the range to len(lst) would raise a "too many values to unpack" error for longer lists, so it's better to use a list comprehension and keep a list of slices instead of individual variables:
[lst[i:i+7] for i in range(0, len(lst), 7)]

Try this method:
def f(lst,n):
    l=[]
    range_=list(range(0,len(lst),n))
    for x,y in zip(range_,range_[1:]):
        l.append(lst[x:y])
    return l
print(f(lst,7))
Output with lst as:
lst = ['a','b17','c','dz','e','ff','e3','e66']*5
Is:
[['a', 'b17', 'c', 'dz', 'e', 'ff', 'e3'], ['e66', 'a', 'b17', 'c', 'dz', 'e', 'ff'], ['e3', 'e66', 'a', 'b17', 'c', 'dz', 'e'], ['ff', 'e3', 'e66', 'a', 'b17', 'c', 'dz'], ['e', 'ff', 'e3', 'e66', 'a', 'b17', 'c']]
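Note that pairing consecutive start offsets with zip stops at the last offset, so whatever follows it is not emitted; in the 40-item example above, the final five items do not appear in the output. If the remainder should be kept, a sketch like the following may help (chunks is an illustrative name):

```python
def chunks(items, size):
    """Split items into consecutive slices of `size`; the last may be shorter."""
    return [items[i:i + size] for i in range(0, len(items), size)]

lst = ['a', 'b17', 'c', 'dz', 'e', 'ff', 'e3', 'e66'] * 5  # 40 items
parts = chunks(lst, 7)
print(len(parts))   # 6
print(parts[-1])    # ['dz', 'e', 'ff', 'e3', 'e66']
```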

My Python module returns wrong list

I wrote the following Python function, which should return a list of sublists.
def checklisting(inputlist, repts):
    result = []
    temprs = []
    ic = 1;
    for x in inputlist:
        temprs.append(x)
        ic += 1
        if ic == repts:
            ic = 1
            result.append(temprs)
    return result
Example: If I called the function with the following arguments:
checklisting(['a', 'b', 'c', 'd'], 2)
it would return
[['a', 'b'], ['c', 'd']]
or if I called it like:
checklisting(['a', 'b', 'c', 'd'], 4)
it would return
[['a', 'b', 'c', 'd']]
However what it returns is a weird huge list:
>>> l.checklisting(['a','b','c','d'], 2)
[['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd'], ['a', 'b', 'c', 'd']]
Someone please help! I need that script to compile a list with the data:
['water tax', 20, 'per month', 'electric tax', 1, 'per day']
The idea is to separate the list into sublists of size repts so that the data is better organized and easier to work with. I don't want arbitrary chunks, because the answers in the other question don't specify the size of the sequence correctly.

Your logic is flawed.
Here are the bugs: you keep appending to the same temprs list, so every entry in result refers to that one growing list. Once repts items have been collected, you need to start a new, empty temprs. Also, ic counts the items collected so far, so it should start at 0 instead of 1.
Replace your def with:
def checklisting(inputlist, repts):
    result = []
    temprs = []
    ic = 0
    for x in inputlist:
        temprs.append(x)
        ic += 1
        if ic == repts:
            ic = 0
            result.append(temprs)
            temprs = []  # start a fresh chunk
    return result
Here is a link to a working demo of the code above.
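To see why the original function returned several copies of the full list, it may help to look at the aliasing in isolation. The sketch below uses illustrative names (rows, current) and shows that appending the same list object twice gives two references to one list, not two snapshots.

```python
rows = []
current = []
for x in ['a', 'b']:
    current.append(x)
    rows.append(current)   # appends a reference to the same list each time

print(rows)                # [['a', 'b'], ['a', 'b']]
print(rows[0] is rows[1])  # True: both entries are the same object
```

Rebinding the name to a new empty list after each append, as the corrected function does, is what breaks this sharing.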

def split_into_sublists(list_, size):
    return list(map(list, zip(*[iter(list_)] * size)))
# [iter(list_)] * size builds a list holding `size` references to the
# same iterator (not `size` separate lists).
# zip pulls one item per reference, so each tuple takes `size` consecutive items.
# map converts the tuples to lists.
# list converts the map object to a list.
print(split_into_sublists(['a', 'b', 'c', 'd'], 2))
[['a', 'b'], ['c', 'd']]
print(split_into_sublists(['a', 'b', 'c', 'd'], 4))
[['a', 'b', 'c', 'd']]
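Two properties of this idiom are easy to miss, so here is a short sketch of both: the repeated list entries are one shared iterator, and because zip stops at the first exhausted argument, items that do not fill a whole group are dropped.

```python
def split_into_sublists(list_, size):
    return list(map(list, zip(*[iter(list_)] * size)))

shared = [iter("abcd")] * 2
print(shared[0] is shared[1])                      # True

# 'c' does not fill a complete pair, so it is left out.
print(split_into_sublists(['a', 'b', 'c'], 2))     # [['a', 'b']]
```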

I got lost in your code. I think the more Pythonic approach is to slice the list. And I can never resist list comprehensions.
def checklisting(inputlist, repts):
    # Slice out each full chunk of repts items; a trailing partial chunk is dropped.
    return [inputlist[i * repts:(i + 1) * repts] for i in range(len(inputlist) // repts)]

Python permutations of heterogenous list elements

This is the sequence:
l = [['A', 'G'], 'A', ['A', 'C']]
I need the three element sequence back for each permutation
all = ['AAA','GAA','AAC','GAC']
I can't figure this one out! I'm having trouble retaining the permutation order!

You want the product:
from itertools import product
l = [['A', 'G'], 'A', ['A', 'C']]
print(["".join(p) for p in product(*l)])
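This works for the example because product iterates over the bare string 'A' as a one-character sequence. If a fixed element could be longer than one character, product would split it into letters, so one option is to wrap bare strings first. The following is a sketch under that assumption; expand is a made-up name.

```python
from itertools import product

def expand(seq):
    """Treat a bare string as one fixed choice and a list as alternatives."""
    choices = [[item] if isinstance(item, str) else item for item in seq]
    return ["".join(p) for p in product(*choices)]

print(expand([['A', 'G'], 'A', ['A', 'C']]))
# ['AAA', 'AAC', 'GAA', 'GAC']
print(expand([['A', 'G'], 'AT']))
# ['AAT', 'GAT']
```

The results come out with the last position varying fastest, so the order differs from the list in the question even though the members are the same.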

Python: find all possible word combinations with a sequence of characters (word segmentation)

I'm doing some word segmentation experiments like the following.
lst is a sequence of characters, and output is all the possible segmentations into words.
lst = ['a', 'b', 'c', 'd']
def foo(lst):
    ...
    return output
output = [['a', 'b', 'c', 'd'],
['ab', 'c', 'd'],
['a', 'bc', 'd'],
['a', 'b', 'cd'],
['ab', 'cd'],
['abc', 'd'],
['a', 'bcd'],
['abcd']]
I've checked combinations and permutations in the itertools library,
and also tried combinatorics.
However, it seems that I'm looking in the wrong place, because this is not a pure permutation or combination problem...
It seems that I can achieve this by using lots of loops, but the efficiency might be low.
EDIT
The word order is important, so combinations like ['ba', 'dc'] or ['cd', 'ab'] are not valid.
The order should always be from left to right.
EDIT
@Stuart's solution doesn't work in Python 2.7.6
EDIT
@Stuart's solution does work in Python 2.7.6, see the comments below.

itertools.product should indeed be able to help you.
The idea is this:
Consider A1, A2, ..., AN separated by slabs. There will be N-1 slabs.
If there is a slab, there is a segmentation. If there is no slab, there is a join.
Thus, for a given sequence of length N, you should have 2^(N-1) such combinations.
For example:
import itertools
lst = ['a', 'b', 'c', 'd']
combinatorics = itertools.product([True, False], repeat=len(lst) - 1)
solution = []
for combination in combinatorics:
    i = 0
    one_such_combination = [lst[i]]
    for slab in combination:
        i += 1
        if not slab: # there is a join
            one_such_combination[-1] += lst[i]
        else:
            one_such_combination += [lst[i]]
    solution.append(one_such_combination)
print(solution)

#!/usr/bin/env python
from itertools import combinations
a = ['a', 'b', 'c', 'd']
a = "".join(a)
cuts = []
for i in range(0,len(a)):
    cuts.extend(combinations(range(1,len(a)),i))
for i in cuts:
    last = 0
    output = []
    for j in i:
        output.append(a[last:j])
        last = j
    output.append(a[last:])
    print(output)
output:
zsh 2419 % ./words.py
['abcd']
['a', 'bcd']
['ab', 'cd']
['abc', 'd']
['a', 'b', 'cd']
['a', 'bc', 'd']
['ab', 'c', 'd']
['a', 'b', 'c', 'd']

There are 8 options, each mirroring the binary numbers 0 through 7:
000
001
010
011
100
101
110
111
Each digit says whether the two adjacent letters at that position are "glued" together: 0 for no, 1 for yes.
>>> lst = ['a', 'b', 'c', 'd']
... output = []
... formatstr = "{{:0{}.0f}}".format(len(lst)-1)
... for i in range(2**(len(lst)-1)):
...     output.append([])
...     s = "{:b}".format(i)
...     s = str(formatstr.format(float(s)))
...     lstcopy = lst[:]
...     for j, c in enumerate(s):
...         if c == "1":
...             lstcopy[j+1] = lstcopy[j] + lstcopy[j+1]
...         else:
...             output[-1].append(lstcopy[j])
...     output[-1].append(lstcopy[-1])
... output
[['a', 'b', 'c', 'd'],
['a', 'b', 'cd'],
['a', 'bc', 'd'],
['a', 'bcd'],
['ab', 'c', 'd'],
['ab', 'cd'],
['abc', 'd'],
['abcd']]
>>>

You can use a recursive generator:
def split_combinations(L):
    for split in range(1, len(L)):
        for combination in split_combinations(L[split:]):
            yield [L[:split]] + combination
    yield [L]
print (list(split_combinations('abcd')))
Edit. I'm not sure how well this would scale up for long strings and at what point it hits Python's recursion limit. Similarly to some of the other answers, you could also use combinations from itertools to work through every possible combination of split points.
from itertools import combinations
def split_string(s, t):
    return [s[start:finish] for start, finish in zip((None, ) + t, t + (None, ))]
def split_combinations(s):
    for i in range(len(s)):
        for split_points in combinations(range(1, len(s)), i):
            yield split_string(s, split_points)
These both seem to work as intended in Python 2.7 (see here) and Python 3.2 (here). As @twasbrillig says, make sure you indent it as shown.
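On the scaling question, a rough estimate may help. A sequence of length n has n - 1 gaps that can each be cut or not, giving 2 ** (n - 1) results, while the recursive generator nests at most n levels deep because each call works on a strictly shorter suffix. The sketch below (count_splits is a made-up name) suggests that the size of the output becomes impractical long before the limit reported by sys.getrecursionlimit() is reached.

```python
import sys

def count_splits(length):
    """Number of ordered splits of a non-empty sequence of this length."""
    return 2 ** (length - 1)

# Output size grows exponentially with the input length.
for length in (4, 20, 40):
    print(length, count_splits(length))

# Maximum nesting depth the interpreter currently allows.
print(sys.getrecursionlimit())
```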
