How to get all possible combinations of characters in a String - python

So I have a string like this:
"abc"
I would need:
"abc"
"acb"
"bca"
"bac"
"cab"
"cba"
I tried:
string = "abc"
combinations = []
for i in range(len(string)):
    acc = string[i]
    for ii in range(i+1, i+len(string)):
        acc += string[ii % len(string)]
    combinations.append(acc)
    combinations.append(acc[::-1])
print(combinations)
It works for a string of length 3, but I believe it is very inefficient, and it doesn't work for "abcd". Is there a better approach?
Update: I would like to solve this by writing the algorithm myself. I am currently working on a recursive solution, and I would prefer an answer that does not rely on a built-in Python function to solve the problem for me.

For permutations:
Using Itertools Library:
from itertools import permutations
ini_str = "abc"
print("Initial string", ini_str)
permutation = [''.join(p) for p in permutations(ini_str)]
print("Resultant List", str(permutation))
Initial string abc
Resultant List ['abc', 'acb', 'bac', 'bca', 'cab', 'cba']
Recursive Method:
def permutations(remaining, candidate=""):
    # Base case: nothing left to place, so the candidate is complete
    if len(remaining) == 0:
        print(candidate)
    for i in range(len(remaining)):
        newCandidate = candidate + remaining[i]
        newRemaining = remaining[0:i] + remaining[i+1:]
        permutations(newRemaining, newCandidate)

if __name__ == '__main__':
    s = "ABC"
    permutations(s)
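The recursive method above prints each permutation as it is found. If the results are needed as values, the same idea can be sketched as a generator. The name `iter_permutations` is made up for this illustration and is not from the answer.

```python
def iter_permutations(remaining, candidate=""):
    """Yield every ordering of the characters in `remaining` (hypothetical helper)."""
    if not remaining:
        # Nothing left to place: the candidate is a complete permutation.
        yield candidate
        return
    for i, ch in enumerate(remaining):
        # Fix ch as the next character, then permute whatever is left.
        rest = remaining[:i] + remaining[i + 1:]
        yield from iter_permutations(rest, candidate + ch)


result = list(iter_permutations("abc"))
print(result)  # ['abc', 'acb', 'bac', 'bca', 'cab', 'cba']
```

Because it yields lazily, the caller can stop early instead of building all n! strings. Repeated characters in the input produce repeated results.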
Iterative Method:
def permutations(s):
    partial = []
    partial.append(s[0])
    for i in range(1, len(s)):
        for j in reversed(range(len(partial))):
            # remove current partial permutation from the list
            curr = partial.pop(j)
            # insert the next character at every possible position
            for k in range(len(curr) + 1):
                partial.append(curr[:k] + s[i] + curr[k:])
    print(partial, end='')

if __name__ == '__main__':
    s = "ABC"
    permutations(s)

Try itertools.permutations:
from itertools import permutations
print('\n'.join(map(''.join, list(permutations("abc", 3)))))
Output:
abc
acb
bac
bca
cab
cba
edit:
For an algorithm:
string = "abc"
def combinations(head, tail=''):
    if len(head) == 0:
        print(tail)
    else:
        for i in range(len(head)):
            combinations(head[:i] + head[i+1:], tail + head[i])

combinations(string)
Output:
abc
acb
bac
bca
cab
cba

Your logic is correct, but you want permutation, not combination. Try:
from itertools import permutations
perms = permutations("abc", r=3)
# perms is a lazy iterator, not a list.
You can easily create a list from that iterator if you want.
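As a small sketch of that last step: the object returned by `permutations` can be consumed only once, so it is usually materialised right away. The variable name `perm_list` is just for illustration.

```python
from itertools import permutations

perms = permutations("abc", r=3)

# Each item is a tuple of characters, so join it back into a string.
perm_list = ["".join(p) for p in perms]
print(perm_list)  # ['abc', 'acb', 'bac', 'bca', 'cab', 'cba']

# perms is now exhausted; iterating it again yields nothing.
leftover = list(perms)
print(leftover)  # []
```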

1. Use permutations from itertools
from itertools import permutations
s = 'abc'
permutation = [''.join(p) for p in permutations(s)]
print(permutation)
# ['abc', 'acb', 'bac', 'bca', 'cab', 'cba']
2. Implement the algorithm
s = 'abc'
result = []
def permutations(string, step=0):
    # every position has been fixed: record the permutation
    if step == len(string):
        result.append("".join(string))
    for i in range(step, len(string)):
        # work on a copy so the caller's ordering is left untouched
        string_copy = [character for character in string]
        string_copy[step], string_copy[i] = string_copy[i], string_copy[step]
        permutations(string_copy, step + 1)

permutations(s)
print(result)
# ['abc', 'acb', 'bac', 'bca', 'cba', 'cab']
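The swap-based version above copies the character list on every call. One possible variation, sketched here under the assumption that the caller passes a mutable list, swaps in place and undoes the swap afterwards (backtracking). The name `permute_in_place` and the `out` parameter are hypothetical.

```python
def permute_in_place(chars, step=0, out=None):
    """Collect all permutations of the list `chars` using in-place swaps."""
    if out is None:
        out = []
    if step == len(chars):
        out.append("".join(chars))
        return out
    for i in range(step, len(chars)):
        # Put chars[i] at position `step`, recurse, then restore the order.
        chars[step], chars[i] = chars[i], chars[step]
        permute_in_place(chars, step + 1, out)
        chars[step], chars[i] = chars[i], chars[step]
    return out


result = permute_in_place(list("abc"))
print(result)  # ['abc', 'acb', 'bac', 'bca', 'cba', 'cab']
```

This avoids the module-level `result` list and the per-call copy, at the cost of mutating its argument while it runs.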

Related

Python multiple substring index in string

Given the following list of sub-strings:
sub = ['ABC', 'VC', 'KI']
is there a way to get the index of these sub-string in the following string if they exist?
s = 'ABDDDABCTYYYYVCIIII'
so far I have tried:
for i in re.finditer('VC', s):
    print(i.start(), i.end())
However, re.finditer does not take multiple arguments.
thanks
You can join those patterns together using |:
import re
sub = ['ABC', 'VC', 'KI']
s = 'ABDDDABCTYYYYVCIIII'
r = '|'.join(re.escape(p) for p in sub)
for i in re.finditer(r, s):
    print(i.start(), i.end())
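The joined pattern reports positions but not which substring was found. A minimal sketch that also records the matched text via `group(0)` (the names `pattern` and `hits` are illustrative):

```python
import re

sub = ['ABC', 'VC', 'KI']
s = 'ABDDDABCTYYYYVCIIII'

# Escape each substring so regex metacharacters are treated literally.
pattern = re.compile('|'.join(re.escape(p) for p in sub))

# One (matched_text, start, end) tuple per non-overlapping match.
hits = [(m.group(0), m.start(), m.end()) for m in pattern.finditer(s)]
print(hits)  # [('ABC', 5, 8), ('VC', 13, 15)]
```

Note that `finditer` returns non-overlapping matches, so substrings that overlap each other in the text would not all be reported by this approach.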
You could map over the find string method.
s = 'ABDDDABCTYYYYVCIIII'
sub = ['ABC', 'VC', 'KI']
print(*map(s.find, sub))
# Output 5 13 -1
How about using list comprehension with str.find?
s = 'ABDDDABCTYYYYVCIIII'
sub = ['ABC', 'VC', 'KI']
results = [s.find(pattern) for pattern in sub]
print(*results) # 5 13 -1
Another approach with re: if a substring can occur at multiple indices, this may be better, because the list of indices is saved for each key. When a substring is not found, it won't be in the dict.
import re
s = 'ABDDDABCTYYYYVCIIII'
sub = ['ABC', 'VC', 'KI']
# precompile regex pattern
subpat = '|'.join(sub)
pat = re.compile(rf'({subpat})')
matches = dict()
for m in pat.finditer(s):
    # append starting index of found substring to value of matched substring
    matches.setdefault(m.group(0), []).append(m.start())
print(f"{matches=}")
print(f"{'KI' in matches=}")
print(f"{matches['ABC']=}")
Outputs:
matches={'ABC': [5], 'VC': [13]}
'KI' in matches=False
matches['ABC']=[5]
A substring may occur more than once in the main string (although it doesn't in the sample data). One could use a generator based around a string's built-in find() function like this:
Note that the source string has been modified to demonstrate repetition:
sub = ['ABC', 'VC', 'KI']
s = 'ABCDDABCTYYYYVCIIII'
def find(s, sub):
    for _sub in sub:
        offset = 0
        # search the remainder of the string after the previous hit
        while (idx := s[offset:].find(_sub)) >= 0:
            yield _sub, idx + offset
            offset += idx + 1

for ss, start in find(s, sub):
    print(ss, start)
Output:
ABC 0
ABC 5
VC 13
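A variation on the generator above, sketched with the optional `start` argument of `str.find` so that no slices of the source string are created. The function name `find_all` is made up for this example.

```python
def find_all(s, subs):
    """Yield (substring, index) for every occurrence, including overlapping ones."""
    for sub in subs:
        start = s.find(sub)
        while start != -1:
            yield sub, start
            # Resume the search one character after the previous hit.
            start = s.find(sub, start + 1)


s = 'ABCDDABCTYYYYVCIIII'
hits = list(find_all(s, ['ABC', 'VC', 'KI']))
print(hits)  # [('ABC', 0), ('ABC', 5), ('VC', 13)]
```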
Just use the string index method:
list_ = ['ABC', 'VC', 'KI']
s = 'ABDDDABCTYYYYVCIIII'
for i in list_:
    # index() raises ValueError for a missing substring, so check first
    if i in s:
        print(s.index(i))

get occurrence of all substring of matching characters from string

e.g. find the substrings containing 'a', 'b', 'c' in the string 'abca'; the answer should be 'abc', 'abca', 'bca'.
Below is the code I wrote, but is there a better, more Pythonic way than using two for loops?
Another example: for 'abcabc' the count should be 10.
def test(x):
    counter = 0
    for i in range(0, len(x)):
        for j in range(i, len(x)+1):
            if len(x[i:j]) > 2:
                print(x[i:j])
                counter += 1
    print(counter)

test('abca')
You can condense it down with list comprehension:
s = 'abcabc'
substrings = [s[b:e] for b in range(len(s)-2) for e in range(b+3, len(s)+1)]
substrings, len(substrings)
# (['abc', 'abca', 'abcab', 'abcabc', 'bca', 'bcab', 'bcabc', 'cab', 'cabc', 'abc'], 10)
You can use combinations from itertools:
from itertools import combinations
string = "abca"
result = [string[x:y] for x, y in combinations(range(len(string) + 1), r = 2)]
result = [item for item in result if 'a' in item and 'b' in item and 'c' in item]
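The two answers above filter differently: the first keeps every substring of length 3 or more, the second keeps substrings that contain all three letters. A sketch of the second criterion using a set comparison, with `substrings_with_all` as a hypothetical helper name:

```python
def substrings_with_all(s, required):
    """Return every substring of s that contains all characters in `required`."""
    required = set(required)
    return [
        s[i:j]
        for i in range(len(s))
        for j in range(i + 1, len(s) + 1)
        if required <= set(s[i:j])  # subset test: all required chars present
    ]


found = substrings_with_all('abca', 'abc')
print(found)  # ['abc', 'abca', 'bca']
print(len(substrings_with_all('abcabc', 'abc')))  # 10
```

For these inputs both criteria agree, because every substring of length 3 or more happens to contain all three letters. For a string such as 'aabc' they would differ.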

Getting the nth char of each string in a list of strings

Let's say I have the list ['abc', 'def', 'gh']. I need to get a string made of the first char of the first string, the first char of the second string, and so on.
So the result would look like this: "adgbehcf". The problem is that the last string in the list could have only one or two chars.
I already tried a nested for loop, but that didn't work.
Code:
n = 3 # The encryption number
for i in range(n):
    x = [s[i] for s in partiallyEncrypted]
    fullyEncrypted.append(x)
A version using itertools.zip_longest:
from itertools import zip_longest
lst = ['abc', 'def', 'gh']
strg = ''.join(''.join(item) for item in zip_longest(*lst, fillvalue=''))
print(strg)
To get an idea of why this works, it may help to look at:
for tpl in zip_longest(*lst, fillvalue=''):
    print(tpl)
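For reference, a short sketch of what those tuples look like for the sample list. The variable names `columns` and `joined` are illustrative.

```python
from itertools import zip_longest

lst = ['abc', 'def', 'gh']

# zip_longest pads the shorter strings, here with '' so the padding
# disappears when the pieces are joined.
columns = list(zip_longest(*lst, fillvalue=''))
print(columns)  # [('a', 'd', 'g'), ('b', 'e', 'h'), ('c', 'f', '')]

joined = ''.join(''.join(col) for col in columns)
print(joined)  # adgbehcf
```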
I guess you can use:
from itertools import zip_longest
l = ['abc', 'def', 'gh']
print("".join(filter(None, [i for sub in zip_longest(*l) for i in sub])))
# adgbehcf
Having:
l = ['abc', 'def', 'gh']
This would work:
s = ''
In [18]: for j in range(0, len(max(l, key=len))):
    ...:     for elem in l:
    ...:         if len(elem) > j:
    ...:             s += elem[j]
In [28]: s
Out[28]: 'adgbehcf'
Please don't use this:
''.join(''.join(y) for y in zip(*x)) +
''.join(y[-1] for y in x if len(y) == max(len(j) for j in x))

Convert a letter in string to different letters with multiple output

So I have a DNA sequence
DNA = "TANNNT"
where N = ["A", "G", "C", "T"]
I want to have all possible outputs: TAAAAT, TAAAGT, TAAACT, TAAATT, and so on.
Online I found a solution using permutations, where I can do
perms = [''.join(p) for p in permutations(N, 3)]
then just build my DNA sequence as
TA + perms + T
but I wonder if there is an easier way to do this, because I have a lot more DNA sequences and it may take a lot more time to hard-code them.
Edit:
The hard-coded part is that I would have to state
N1 = [''.join(p) for p in permutations(N, 1)]
N2 = [''.join(p) for p in permutations(N, 2)]
N3 = [''.join(p) for p in permutations(N, 3)]
then do
for i in N3:
    key = "TA" + i + "T"
Since my sequence is quite long, I don't want to count how many consecutive N's I have in the sequence, and I want to see if there is a better way to do this.
You can use your permutation results to format a string like:
Code:
import itertools as it
import re
def convert_sequence(base_string, target_letter, perms):
    # match one run of the target letter, e.g. 'NNN'
    REGEX = re.compile('(%s+)' % target_letter)
    match = REGEX.search(base_string).group(0)
    # replace the run with a %s placeholder to build a format string
    pattern = REGEX.sub('%s', base_string)
    return [pattern % ''.join(p) for p in it.permutations(perms, len(match))]
Test Code:
print(convert_sequence('TANNNT', 'N', ['A', 'G', 'C', 'T']))
Results:
['TAAGCT', 'TAAGTT', 'TAACGT', 'TAACTT', 'TAATGT',
'TAATCT', 'TAGACT', 'TAGATT', 'TAGCAT', 'TAGCTT',
'TAGTAT', 'TAGTCT', 'TACAGT', 'TACATT', 'TACGAT',
'TACGTT', 'TACTAT', 'TACTGT', 'TATAGT', 'TATACT',
'TATGAT', 'TATGCT', 'TATCAT', 'TATCGT']
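Note that the outputs listed in the question (TAAAAT, TAAAGT, ...) repeat letters, which `permutations` never does; that is why the result above has 24 entries rather than 64. A possible alternative, sketched here with `itertools.product`, treats every N as an independent choice, so it also handles several separate runs of N without counting them. The function name `expand_sequence` and its default arguments are assumptions made for this example.

```python
from itertools import product


def expand_sequence(seq, wildcard="N", alphabet="AGCT"):
    """Return every sequence obtained by replacing each wildcard with a letter."""
    # One pool of choices per position: the whole alphabet for a wildcard,
    # otherwise just the fixed character itself.
    pools = [alphabet if ch == wildcard else ch for ch in seq]
    return ["".join(chars) for chars in product(*pools)]


expanded = expand_sequence("TANNNT")
print(len(expanded))  # 64
print(expanded[:4])   # ['TAAAAT', 'TAAAGT', 'TAAACT', 'TAAATT']
```

The result grows as 4 to the power of the number of N's, so for long runs it may be better to iterate over `product(*pools)` directly instead of building the list.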

Python - Split sorted string of characters when character is different than previous character

I have a string of characters which I know to be sorted. Example:
myString = "aaaabbbbbbcccddddd"
I want to split this item into a list at the point when the character I am on is different than its preceding character, as shown below:
splitList = ["aaaa","bbbbbb","ccc","ddddd"]
I am working in Python 3.4.
Thanks!
In [294]: myString = "aaaabbbbbbcccddddd"
In [295]: [''.join(list(g)) for i,g in itertools.groupby(myString)]
Out[295]: ['aaaa', 'bbbbbb', 'ccc', 'ddddd']
myString = "aaaabbbbbbcccddddd"
result = []
for i, s in enumerate(myString):
    l = len(result)
    # start a new group at the first character or whenever the character changes
    if l == 0 or s != myString[i-1]:
        result.append(s)
    else:
        result[l-1] = result[l-1] + s
print(result)
Output:
['aaaa', 'bbbbbb', 'ccc', 'ddddd']
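Another option, sketched here as an assumption rather than taken from the answers, is a regular expression with a backreference: `(.)\1*` matches one character followed by any number of repeats of that same character.

```python
import re

myString = "aaaabbbbbbcccddddd"

# (.) captures one character; \1* extends the match over its repeats.
splitList = [m.group(0) for m in re.finditer(r'(.)\1*', myString)]
print(splitList)  # ['aaaa', 'bbbbbb', 'ccc', 'ddddd']
```

By default `.` does not match a newline, so strings containing newlines would need the `re.DOTALL` flag.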
