Combinations and Permutations of characters - python

I am trying to come up with elegant code that creates combinations/permutations of characters from a single character:
E.g. from a single character I'd like code to create these permutations (order of the result is not important):
'a' ----> ['a', 'aa', 'A', 'AA', 'aA', 'Aa']
The not so elegant solutions I have thus far:
# this does it...
from itertools import permutations
char = 'a'
p = [char, char*2, char.upper(), char.upper()*2]
pp = [] # stores the final list of permutations
for j in range(1,3):
    for i in permutations(p,j):
        p2 = ''.join(i)
        if len(p2) < 3:
            pp.append(p2)
print(pp)
['a', 'aa', 'A', 'AA', 'aA', 'Aa']
#this also works...
char = 'a'
p = ['', char, char*2, char.upper(), char.upper()*2]
pp = [] # stores the final list of permutations
for i in permutations(p,2):
    j = ''.join(i)
    if len(j) < 3:
        pp.append(j)
print(list(set(pp)))
['a', 'aa', 'aA', 'AA', 'Aa', 'A']
# and finally... so does this:
char = 'a'
p = ['', char, char.upper()]
pp = [] # stores the final list of permutations
for i in permutations(p,2):
    pp.append(''.join(i))
print(list(set(pp)) + [char*2, char.upper()*2])
['a', 'A', 'aA', 'Aa', 'aa', 'AA']
I'm not great with lambdas, and I suspect that may be where a better solution lies.
So, could you help me find the most elegant/pythonic way to the desired result?

You can simply use itertools.product with different repeat values to get the expected result:
>>> pop = ['a', 'A']
>>> from itertools import product
>>> [''.join(item) for i in range(len(pop)) for item in product(pop, repeat=i + 1)]
['a', 'A', 'aa', 'aA', 'Aa', 'AA']
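The same idea can be wrapped in a small helper that starts from the single character in the question. This is only a sketch: the name case_variants and the max_len parameter are made up for illustration and are not part of itertools.

```python
from itertools import product

def case_variants(char, max_len=2):
    """Return every string of length 1..max_len built from char's lower and upper case."""
    pool = [char.lower(), char.upper()]
    return [''.join(item)
            for length in range(1, max_len + 1)
            for item in product(pool, repeat=length)]

print(case_variants('a'))  # ['a', 'A', 'aa', 'aA', 'Aa', 'AA']
```

Raising max_len adds the longer strings as well, e.g. case_variants('a', 3) has 2 + 4 + 8 entries.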

Related

python from ['a','b','c','d'] to ['a', 'ab', 'abc', 'abcd']

I have a list ['a','b','c','d'] and want to make another list like this: ['a', 'ab', 'abc', 'abcd'].
Thanks
Tried:
list1=['a','b','c', 'd']
for i in range(1, (len(list1)+1)):
    for j in range(1, 1+i):
        print(*[list1[j-1]], end = "")
    print()
returns:
a
ab
abc
abcd
It does print what I want, but I'm not sure how to add it to a list so that it looks like ['a', 'ab', 'abc', 'abcd'].
Use itertools.accumulate, which by default sums up the elements, like a cumulative sum. Since addition (__add__) is defined for str and results in the concatenation of the strings
assert "a" + "b" == "ab"
we can use accumulate as is:
import itertools
list1 = ["a", "b", "c", "d"]
list2 = list(itertools.accumulate(list1)) # list() because accumulate returns an iterator
print(list2) # ['a', 'ab', 'abc', 'abcd']
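As an aside, accumulate also takes an optional second argument: the two-argument function used to combine the running result with the next element. Passing operator.add spells out the default. The separator variant below is only an illustration and not part of the original answer.

```python
import itertools
import operator

list1 = ["a", "b", "c", "d"]

# operator.add is what accumulate uses when no function is given
prefixes = list(itertools.accumulate(list1, operator.add))
print(prefixes)  # ['a', 'ab', 'abc', 'abcd']

# any two-argument function works, e.g. joining with a separator
dashed = list(itertools.accumulate(list1, lambda acc, item: acc + "-" + item))
print(dashed)    # ['a', 'a-b', 'a-b-c', 'a-b-c-d']
```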
Append to a second list in a loop:
list1=['a','b','c', 'd']
list2 = []
s = ''
for c in list1:
    s += c
    list2.append(s)
print(list2)
Output:
['a', 'ab', 'abc', 'abcd']
list1=['a','b','c', 'd']
l = []
for i in range(len(list1)):
    l.append("".join(list1[:i+1]))
print(l)
Printing stuff is useless if you want to do ANYTHING else with the data you are printing. Only use it when you actually want to display something to console.
You could form a string and slice it in a list comprehension:
s = ''.join(['a', 'b', 'c', 'd'])
out = [s[:i+1] for i, _ in enumerate(s)]
print(out)
Output:
['a', 'ab', 'abc', 'abcd']
You can do this in a list comprehension:
vals = ['a', 'b', 'c', 'd']
res = [''.join(vals[:i+1]) for i, _ in enumerate(vals)]
Code:
[''.join(list1[:i+1]) for i,l in enumerate(list1)]
Output:
['a', 'ab', 'abc', 'abcd']

python string operation. sticked words separation

I have a pretty challenging problem here and need your help.
The problem is this:
I have a string, for example "abcde".
Now I want to separate this string into every possible ordered combination, as a list of lists of strings.
for example,
my_function('abcde')
output =
[
['a', 'b', 'c', 'd', 'e'],
['a', 'b', 'c', 'de'],
['a', 'b', 'cde'],
['a', 'bced'],
['a', 'b', 'cd', 'e'],
['a', 'bc', 'd', 'e'],
['a', 'bc', 'de'],
['a', 'bcd', 'e'],
['a', 'bcde'],
['ab', 'c', 'd', 'e'],
['ab', 'c', 'de'],
['ab', 'cd', 'e'],
['ab', 'cde'],
['abc','d','e'],
['abc', 'de'],
['abcd', 'e'],
['abcde']
]
It is not quite the permutation since the order matters.
Same result without itertools:
s = 'python'
splits = len(s) - 1
output = []
for i in range(2 ** splits):
    combination = []
    word = ''
    for position in range(splits + 1):
        word += s[position]
        if not (i & (1 << position)):
            combination.append(word)
            word = ''
    output.append(combination)
output.sort()
for combination in output:
    print(combination)
Just for beginners.
You could do this:
import itertools
def get_slices(values):
    slices_len = len(values) - 1
    for is_slice in itertools.product([True, False], repeat=slices_len):
        start_index = 0
        slices = []
        for slice_index, is_index_slice in enumerate(is_slice, 1):
            if is_index_slice:
                index_slice = values[start_index:slice_index]
                start_index = slice_index
                slices.append(index_slice)
        slices.append(values[start_index:])
        yield slices
The most important part of this code is the itertools.product call at the beginning, which generates every possible cut pattern. Each pattern is a tuple of bools, one per gap between adjacent elements of values (there are slices_len gaps), saying whether the sequence is cut at that gap (True) or the two neighbours stay joined (False).
list(get_slices("abcde")) will return the list you requested. If you don't need all results immediately, and instead want to iterate through them, you don't need the surrounding list call.
If you want the reverse order, you can switch the [True, False] with [False, True].
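To see the effect of the [True, False] order, here is a compact, self-contained variant of the same idea. The function name split_all and the cut_first flag are invented for this sketch, and it assumes a non-empty input.

```python
import itertools

def split_all(values, cut_first=True):
    """Yield every ordered split of values; one boolean per gap says whether to cut there."""
    choices = [True, False] if cut_first else [False, True]
    for cuts in itertools.product(choices, repeat=len(values) - 1):
        pieces, start = [], 0
        for gap, cut in enumerate(cuts, 1):
            if cut:
                pieces.append(values[start:gap])
                start = gap
        pieces.append(values[start:])  # whatever is left after the last cut
        yield pieces

results = list(split_all("abcde"))
print(len(results))   # 16, i.e. 2 ** (5 - 1)
print(results[0])     # ['a', 'b', 'c', 'd', 'e'] (every gap cut)
print(results[-1])    # ['abcde'] (no gap cut)
```

With cut_first=False the same splits come out in the opposite order, starting with ['abcde'].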
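I get 16 items and you have 17 :-) (['a', 'bced'] in your list is not a split of 'abcde'; it looks like a mistyped duplicate of ['a', 'bcde'].)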
def fn(base_str):
    result = [[base_str]]
    for i in range(1, len(base_str)):
        child = fn(base_str[i:])
        for x in child:
            x.insert(0, base_str[0:i])
        result = child + result
    return result
print(fn("abcde"))

Python 3 sort list -> all entries starting with lower case first

l1 = ['B','c','aA','b','Aa','C','A','a']
the result should be
['a','aA','b','c','A','Aa','B','C']
so same as l1.sort() but beginning with all words that start with lower case.
Try this:
>>> l = ['B', 'b','a','A', 'aA', 'Aa','C', 'c']
>>> sorted(l, key=str.swapcase)
['a', 'aA', 'b', 'c', 'A', 'Aa', 'B', 'C']
EDIT:
A one-liner using the list.sort method for those who prefer the imperative approach:
>>> l.sort(key=str.swapcase)
>>> print(l)
['a', 'aA', 'b', 'c', 'A', 'Aa', 'B', 'C']
Note:
The first approach leaves the state of l unchanged while the second one does change it.
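Why this works: str.swapcase turns each lowercase letter into its uppercase counterpart and vice versa, and uppercase ASCII letters compare lower than lowercase ones, so the strings that originally start with a lowercase letter get the smaller keys. A small check, using the list from the question:

```python
l1 = ['B', 'c', 'aA', 'b', 'Aa', 'C', 'A', 'a']

# the keys that sorted() actually compares
print([s.swapcase() for s in l1])  # ['b', 'C', 'Aa', 'B', 'aA', 'c', 'a', 'A']

result = sorted(l1, key=str.swapcase)  # returns a new list, l1 is untouched
print(result)                          # ['a', 'aA', 'b', 'c', 'A', 'Aa', 'B', 'C']

l1.sort(key=str.swapcase)              # sorts l1 in place and returns None
print(l1 == result)                    # True
```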
Here is what you might be looking for:
li = ['a', 'A', 'b', 'B']
def sort_low_case_first(li):
    li.sort() # sorts the list in place, uppercase first
    index = 0 # where the list needs to be cut
    for i, x in enumerate(li): # iterate over the list
        if x[0].islower(): # if we encounter a string starting with a lowercase letter
            index = i # remember where
            break # stop searching
    return li[index:]+li[:index] # the sorted strings starting with lowercase, then the sorted strings starting with uppercase
sorted_li = sort_low_case_first(li) # run the function
print(sorted_li) # check the result
['a', 'b', 'A', 'B']

Python: find all possible word combinations with a sequence of characters (word segmentation)

I'm doing some word segmentation experiments like the following.
lst is a sequence of characters, and output is all the possible words.
lst = ['a', 'b', 'c', 'd']
def foo(lst):
    ...
    return output
output = [['a', 'b', 'c', 'd'],
['ab', 'c', 'd'],
['a', 'bc', 'd'],
['a', 'b', 'cd'],
['ab', 'cd'],
['abc', 'd'],
['a', 'bcd'],
['abcd']]
I've checked combinations and permutations in itertools library,
and also tried combinatorics.
However, it seems that I'm looking at the wrong side because this is not pure permutation and combinations...
It seems that I can achieve this by using lots of loops, but the efficiency might be low.
EDIT
The word order is important so combinations like ['ba', 'dc'] or ['cd', 'ab'] are not valid.
The order should always be from left to right.
EDIT
#Stuart's solution doesn't work in Python 2.7.6
EDIT
#Stuart's solution does work in Python 2.7.6, see the comments below.
itertools.product should indeed be able to help you.
The idea is this:
Consider A1, A2, ..., AN separated by slabs. There will be N-1 slabs.
If there is a slab, there is a segmentation. If there is no slab, there is a join.
Thus, for a given sequence of length N, you should have 2^(N-1) such combinations.
For example:
import itertools
lst = ['a', 'b', 'c', 'd']
combinatorics = itertools.product([True, False], repeat=len(lst) - 1)
solution = []
for combination in combinatorics:
    i = 0
    one_such_combination = [lst[i]]
    for slab in combination:
        i += 1
        if not slab: # there is a join
            one_such_combination[-1] += lst[i]
        else:
            one_such_combination += [lst[i]]
    solution.append(one_such_combination)
print(solution)
#!/usr/bin/env python
from itertools import combinations
a = ['a', 'b', 'c', 'd']
a = "".join(a)
cuts = []
for i in range(0,len(a)):
    cuts.extend(combinations(range(1,len(a)),i))
for i in cuts:
    last = 0
    output = []
    for j in i:
        output.append(a[last:j])
        last = j
    output.append(a[last:])
    print(output)
output:
zsh 2419 % ./words.py
['abcd']
['a', 'bcd']
['ab', 'cd']
['abc', 'd']
['a', 'b', 'cd']
['a', 'bc', 'd']
['ab', 'c', 'd']
['a', 'b', 'c', 'd']
There are 8 options, each mirroring the binary numbers 0 through 7:
000
001
010
011
100
101
110
111
Each 0 and 1 represents whether or not the 2 letters at that index are "glued" together. 0 for no, 1 for yes.
>>> lst = ['a', 'b', 'c', 'd']
... output = []
... formatstr = "{{:0{}.0f}}".format(len(lst)-1)
... for i in range(2**(len(lst)-1)):
...     output.append([])
...     s = "{:b}".format(i)
...     s = str(formatstr.format(float(s)))
...     lstcopy = lst[:]
...     for j, c in enumerate(s):
...         if c == "1":
...             lstcopy[j+1] = lstcopy[j] + lstcopy[j+1]
...         else:
...             output[-1].append(lstcopy[j])
...     output[-1].append(lstcopy[-1])
... output
[['a', 'b', 'c', 'd'],
['a', 'b', 'cd'],
['a', 'bc', 'd'],
['a', 'bcd'],
['ab', 'c', 'd'],
['ab', 'cd'],
['abc', 'd'],
['abcd']]
>>>
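The zero-padded bit string can also be produced directly with the b format code and a width, which avoids the detour through float. Below is a sketch of the same gluing idea; the function name glue_by_bits is hypothetical, and it assumes lst has at least one element.

```python
def glue_by_bits(lst):
    """Return all segmentations; bit j == '1' glues lst[j] to lst[j+1]."""
    gaps = len(lst) - 1
    output = []
    for i in range(2 ** gaps):
        # e.g. i=1, gaps=3 -> "001"; no bits at all for a single element
        bits = format(i, "0{}b".format(gaps)) if gaps else ""
        words = [lst[0]]
        for j, bit in enumerate(bits):
            if bit == "1":
                words[-1] += lst[j + 1]   # glued: extend the current word
            else:
                words.append(lst[j + 1])  # not glued: start a new word
        output.append(words)
    return output

print(glue_by_bits(['a', 'b', 'c', 'd']))
```

This produces the segmentations in the same order as the output shown above, from ['a', 'b', 'c', 'd'] for 000 to ['abcd'] for 111.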
You can use a recursive generator:
def split_combinations(L):
    for split in range(1, len(L)):
        for combination in split_combinations(L[split:]):
            yield [L[:split]] + combination
    yield [L]
print (list(split_combinations('abcd')))
Edit. I'm not sure how well this would scale up for long strings and at what point it hits Python's recursion limit. Similarly to some of the other answers, you could also use combinations from itertools to work through every possible combination of split-points.
from itertools import combinations
def split_string(s, t):
    return [s[start:finish] for start, finish in zip((None, ) + t, t + (None, ))]
def split_combinations(s):
    for i in range(len(s)):
        for split_points in combinations(range(1, len(s)), i):
            yield split_string(s, split_points)
These both seem to work as intended in Python 2.7 (see here) and Python 3.2 (here). As #twasbrillig says, make sure you indent it as shown.
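A quick way to check that the two versions agree is to compare their results as sets. The names recursive_splits and combination_splits below are renamed copies of the two functions above, so that both can live in one script; the combinations version has the split_string helper inlined.

```python
from itertools import combinations

def recursive_splits(L):
    # recursive version: choose the first piece, then split the rest
    for split in range(1, len(L)):
        for combination in recursive_splits(L[split:]):
            yield [L[:split]] + combination
    yield [L]

def combination_splits(s):
    # iterative version: choose i split points out of the len(s) - 1 gaps
    for i in range(len(s)):
        for points in combinations(range(1, len(s)), i):
            yield [s[a:b] for a, b in zip((None,) + points, points + (None,))]

a = {tuple(x) for x in recursive_splits('abcd')}
b = {tuple(x) for x in combination_splits('abcd')}
print(a == b, len(a))  # True 8
```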

Split string into strings of repeating elements

I want to split a string like:
'aaabbccccabbb'
into
['aaa', 'bb', 'cccc', 'a', 'bbb']
What's an elegant way to do this in Python? If it makes it easier, it can be assumed that the string will only contain a's, b's and c's.
That is the use case for itertools.groupby :)
>>> from itertools import groupby
>>> s = 'aaabbccccabbb'
>>> [''.join(y) for _,y in groupby(s)]
['aaa', 'bb', 'cccc', 'a', 'bbb']
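groupby yields (key, group) pairs, where each group is an iterator over one run of consecutive equal characters. If the run lengths are also of interest, the same call can be sketched like this (the variable name runs is arbitrary):

```python
from itertools import groupby

s = 'aaabbccccabbb'

# each group must be consumed (here via list) before moving to the next one
runs = [(char, len(list(group))) for char, group in groupby(s)]
print(runs)  # [('a', 3), ('b', 2), ('c', 4), ('a', 1), ('b', 3)]
```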
You can create an iterator - without trying to be smart just to keep it short and unreadable:
def yield_same(string):
    it_str = iter(string)
    result = next(it_str)
    for next_chr in it_str:
        if next_chr != result[0]:
            yield result
            result = ""
        result += next_chr
    yield result
..
>>> list(yield_same("aaaaaabcbcdcdccccccdddddd"))
['aaaaaa', 'b', 'c', 'b', 'c', 'd', 'c', 'd', 'cccccc', 'dddddd']
>>>
edit
ok, so there is itertools.groupby, which probably does something like this.
Here's the best way I could find using regex:
import re
print([a for a, b in re.findall(r"((\w)\2*)", s)])
>>> import re
>>> s = 'aaabbccccabbb'
>>> [m.group() for m in re.finditer(r'(\w)(\1*)',s)]
['aaa', 'bb', 'cccc', 'a', 'bbb']
