I have the following string and I split it:
>>> st = '%2g%k%3p'
>>> l = filter(None, st.split('%'))
>>> print l
['2g', 'k', '3p']
Now I want to print the g letter two times, the k letter one time and the p letter three times:
ggkppp
How is it possible?
You could use a generator expression with isdigit() to check whether the first character is a digit and, if so, repeat the rest of the string that many times. Then use join to build the output:
''.join(i[1:]*int(i[0]) if i[0].isdigit() else i for i in l)
Demonstration:
In [70]: [i[1:]*int(i[0]) if i[0].isdigit() else i for i in l ]
Out[70]: ['gg', 'k', 'ppp']
In [71]: ''.join(i[1:]*int(i[0]) if i[0].isdigit() else i for i in l)
Out[71]: 'ggkppp'
EDIT
Using the re module (after import re) when the leading number has several digits:
''.join(re.search('(\d+)(\w+)', i).group(2)*int(re.search('(\d+)(\w+)', i).group(1)) if re.search('(\d+)(\w+)', i) else i for i in l)
Example:
In [144]: l = ['12g', '2kd', 'h', '3p']
In [145]: ''.join(re.search('(\d+)(\w+)', i).group(2)*int(re.search('(\d+)(\w+)', i).group(1)) if re.search('(\d+)(\w+)', i) else i for i in l)
Out[145]: 'ggggggggggggkdkdhppp'
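The expression above calls re.search three times per item. A minimal sketch of the same idea that matches once per item follows. The helper name expand is hypothetical, and it assumes the count, when present, sits at the very start of the item, so it uses match rather than search:

```python
import re

# Leading digits (the count), then the text to repeat.
_ITEM = re.compile(r'(\d+)(\w+)')

def expand(item):
    # Hypothetical helper: repeat the text after a leading count.
    m = _ITEM.match(item)
    if m is None:
        # No leading count: keep the item unchanged.
        return item
    return m.group(2) * int(m.group(1))

l = ['12g', '2kd', 'h', '3p']
result = ''.join(expand(i) for i in l)
print(result)
```

Compiling the pattern once and binding the match to a name keeps the comprehension short and avoids repeating the regex.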
EDIT2
For your input like:
st = '%2g_%3k%3p'
You could remove the _ from the repeated text and then append a _ at the end if the word from the list ends with the _ symbol:
st = '%2g_%3k%3p'
l = list(filter(None, st.split('%')))
''.join((re.search('(\d+)(\w+)', i).group(2)*int(re.search('(\d+)(\w+)', i).group(1))).replace("_", "") + '_' * i.endswith('_') if re.search('(\d+)(\w+)', i) else i for i in l)
Output:
'gg_kkkppp'
EDIT3
A solution without the re module, using ordinary loops, that works for counts of up to two digits. You could define these functions:
def add_str(ind, st):
    if not st.endswith('_'):
        return st[ind:] * int(st[:ind])
    else:
        return st[ind:-1] * int(st[:ind]) + '_'
def collect(l):
    final_str = ''
    for i in l:
        if i[0].isdigit():
            if i[1].isdigit():
                final_str += add_str(2, i)
            else:
                final_str += add_str(1, i)
        else:
            final_str += i
    return final_str
And then use them as:
l = ['12g_', '3k', '3p']
print(collect(l))
gggggggggggg_kkkppp
One-liner Regex way:
>>> import re
>>> st = '%2g%k%3p'
>>> re.sub(r'%|(\d*)(\w+)', lambda m: int(m.group(1))*m.group(2) if m.group(1) else m.group(2), st)
'ggkppp'
The regex %|(\d*)(\w+) matches every %, and otherwise captures zero or more digits that precede a word character into one group and the following word characters into another group. On replacement, every matched span is replaced with the value given by the replacement function, so the % characters are lost.
or
>>> re.sub(r'%(\d*)(\w+)', lambda m: int(m.group(1))*m.group(2) if m.group(1) else m.group(2), st)
'ggkppp'
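To see what the two groups capture, here is a small sketch using the second pattern. The names pattern and repl are only illustrative:

```python
import re

st = '%2g%k%3p'
pattern = re.compile(r'%(\d*)(\w+)')

# findall returns one (count, text) tuple per match;
# count is the empty string when no digits follow the %.
pairs = pattern.findall(st)
print(pairs)

def repl(m):
    # Treat a missing count as 1.
    count = int(m.group(1)) if m.group(1) else 1
    return count * m.group(2)

expanded = pattern.sub(repl, st)
print(expanded)
```

Writing the replacement as a named function instead of a lambda makes the "missing count means one copy" rule explicit.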
This assumes you are always printing a single letter, but the preceding number may be longer than a single digit in base 10.
seq = ['2g', 'k', '3p']
result = ''.join(int(s[:-1] or 1) * s[-1] for s in seq)
assert result == "ggkppp"
LATE FOR THE SHOW BUT READY TO GO
Another way is to define a function that converts nC into CCCC...C (n times), pass it to map to apply it to every element of the list l that comes from splitting on %, and finally join them all, as follows:
>>> def f(s):
        x = 0
        if s:
            if len(s) == 1:
                out = s
            else:
                for i in s:
                    if i.isdigit():
                        x = x*10 + int(i)
                out = x*s[-1]
        else:
            out = ''
        return out
>>> st
'%4g%10k%p'
>>> ''.join(map(f, st.split('%')))
'ggggkkkkkkkkkkp'
>>> st = '%2g%k%3p'
>>> ''.join(map(f, st.split('%')))
'ggkppp'
Or if you want to put all of these into one single function definition:
>>> def f(s):
        out = ''
        if s:
            l = filter(None, s.split('%'))
            for item in l:
                x = 0
                if len(item) == 1:
                    repl = item
                else:
                    for c in item:
                        if c.isdigit():
                            x = x*10 + int(c)
                    repl = x*item[-1]
                out += repl
        return out
>>> st
'%2g%k%3p'
>>> f(st)
'ggkppp'
>>>
>>> st = '%4g%10k%p'
>>>
>>> f(st)
'ggggkkkkkkkkkkp'
>>> st = '%4g%101k%2p'
>>> f(st)
'ggggkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkpp'
>>> len(f(st))
107
EDIT :
If the string contains _ and the OP does not want that character to be repeated, then the best way in my opinion is to go with re.sub, which makes things easier:
>>> import re
>>> def f(s):
        pat = re.compile(r'%(\d*)([a-zA-Z]+)')
        out = pat.sub(lambda m:int(m.group(1))*m.group(2) if m.group(1) else m.group(2), s)
        return out
>>> st = '%4g_%12k%p__%m'
>>> f(st)
'gggg_kkkkkkkkkkkkp__m'
Loop over the list, read the first character of each entry as the count, and then append the rest of the entry that many times:
string=''
l = ['2g', 'k', '3p']
for entry in l:
    if len(entry) ==1:
        string += (entry)
    else:
        number = int(entry[0])
        for i in range(number):
            string += (entry[1:])
Related
def r(s):
    str = []
    for i in len(s):
        if (s[i]=='_'):
            str = s[i] + str
            continue
        str = s[i] + str
    return str
I tried using the above code to convert the following string
Input: ab_cde
Expected Output: ed_cba
s = 'ab_cde'
out = ''
for a, b in zip(s, s[::-1]):
    if b != '_' and a != '_':
        out += b
    else:
        out += a
print(out)
Prints:
ed_cba
EDIT: For more fixed points:
s = 'ab_cde_f_ghijk_l'
i, out = iter(ch for ch in s[::-1] if ch != '_'), ''
out = ''.join(ch if ch == '_' else next(i) for ch in s)
print(out)
Prints:
lk_jih_g_fedcb_a
The main idea is to check all the positions of the underscore _, save them and reverse the string without them, to insert them again after reversing.
import re
def r(s):
    # check where all the underscores are
    underscore_positions = [m.start() for m in re.finditer('_', s)]
    # get list of reversed chars without underscores
    reversed_chars = [c for c in reversed(s) if c != '_']
    # put the underscores back where they were
    for p in underscore_positions:
        reversed_chars.insert(p, '_')
    # profit
    return "".join(reversed_chars)
The function can be modified to have a different fixed character.
It also uses the re package to locate the _ characters; if you prefer, a simple comprehension does the same: underscore_positions = [i for i, c in enumerate(s) if c == '_'].
def fixed_reverse(s, ch):
    idxs = [-1] + [i for i, x in enumerate(s) if x == ch] + [len(s)]
    idxs = [x - i + 1 for i, x in enumerate(idxs)]
    chars = "".join(x for x in s if x != ch)[::-1]
    return ch.join(chars[a:b] for a, b in zip(idxs[:-1], idxs[1:]))
>>> fixed_reverse("ab_cde_f_ghijk_l", "_")
'lk_jih_g_fedcb_a'
This works by:
Storing the locations of the fixed-point character "_".
Reversing the string with the "_" characters removed.
Inserting the "_" back into the correct locations.
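The three steps can also be spelled out one per line. This is only a sketch of the same idea, not the function above; the name reverse_keep_fixed is made up for illustration:

```python
def reverse_keep_fixed(s, ch='_'):
    # Step 1: store the locations of the fixed-point character.
    fixed = {i for i, c in enumerate(s) if c == ch}
    # Step 2: reverse the string with the fixed character removed.
    rest = iter([c for c in reversed(s) if c != ch])
    # Step 3: rebuild the string, putting the fixed character back
    # at its stored locations and filling the gaps from the reversed text.
    return ''.join(ch if i in fixed else next(rest) for i in range(len(s)))

print(reverse_keep_fixed('ab_cde'))
print(reverse_keep_fixed('ab_cde_f_ghijk_l'))
```

Using a set for the stored positions keeps the membership test cheap for long strings.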
I want to create a new string from a given string with alternate uppercase and lowercase.
I have tried iterating over the string and changing first to uppercase into a new string and then to lower case into another new string again.
def myfunc(x):
    even = x.upper()
    lst = list(even)
    for itemno in lst:
        if (itemno % 2) !=0:
            even1=lst[1::2].lowercase()
            itemno=itemno+1
            even2=str(even1)
            print(even2)
Since I can't change the given string, I need a good way of creating a new string with alternate caps.
Here's a one-liner:
"".join([x.upper() if i%2 else x.lower() for i,x in enumerate(mystring)])
You can simply randomly choose for each letter in the old string if you should lowercase or uppercase it, like this:
import random
def myfunc2(old):
    new = ''
    for c in old:
        lower = random.randint(0, 1)
        if lower:
            new += c.lower()
        else:
            new += c.upper()
    return new
Here's one that returns a new string with alternate caps:
def myfunc(x):
    seq = []
    for i, v in enumerate(x):
        seq.append(v.upper() if i % 2 == 0 else v.lower())
    return ''.join(seq)
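The enumerate-based answers above differ only in which index parity gets uppercased. A small sketch that makes that choice a parameter follows; the function name alternate_caps and the upper_first flag are hypothetical:

```python
def alternate_caps(text, upper_first=False):
    # upper_first=True uppercases indices 0, 2, 4, ...
    # upper_first=False uppercases indices 1, 3, 5, ...
    return ''.join(
        ch.upper() if (i % 2 == 0) == upper_first else ch.lower()
        for i, ch in enumerate(text)
    )

print(alternate_caps('inputstring'))
print(alternate_caps('example', upper_first=True))
```

Because strings are immutable, the function always builds and returns a new string and leaves its argument untouched.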
This does the job also
def foo(input_message):
    c = 0
    output_message = ""
    for m in input_message:
        if (c%2==0):
            output_message = output_message + m.lower()
        else:
            output_message = output_message + m.upper()
        c = c + 1
    return output_message
Here's a solution using itertools which utilizes string slicing:
from itertools import chain, zip_longest
x = 'inputstring'
zipper = zip_longest(x[::2].lower(), x[1::2].upper(), fillvalue='')
res = ''.join(chain.from_iterable(zipper))
# 'iNpUtStRiNg'
Using a string slicing:
from itertools import zip_longest
s = 'example'
new_s = ''.join(x.upper() + y.lower()
for x, y in zip_longest(s[::2], s[1::2], fillvalue=''))
# ExAmPlE
Using an iterator:
s_iter = iter(s)
new_s = ''.join(x.upper() + y.lower()
for x, y in zip_longest(s_iter, s_iter, fillvalue=''))
# ExAmPlE
Using the function reduce():
from functools import reduce  # reduce is not a builtin in Python 3
def func(x, y):
    if x[-1].islower():
        return x + y.upper()
    else:
        return x + y.lower()
new_s = reduce(func, s) # eXaMpLe
This code also produces an alternate-caps string (it prints the characters one by one and returns an empty string):
def alternative_strings(strings):
    for i,x in enumerate(strings):
        if i % 2 == 0:
            print(x.upper(), end="")
        else:
            print(x.lower(), end= "")
    return ''
print(alternative_strings("Testing String"))
def myfunc(string):
    # Un-hash print statements to watch python build out the string.
    # Script is an elementary example of using an enumerate function.
    # An enumerate function tracks an index integer and its associated value as it moves along the string.
    # In this example we use arithmetic to determine odd and even index counts, then modify the associated variable.
    # After modifying the upper/lower case of the character, it starts adding the string back together.
    # The end of the function then returns back with the new modified string.
    #print(string)
    retval = ''
    for space, letter in enumerate(string):
        if space %2==0:
            retval = retval + letter.upper()
            #print(retval)
        else:
            retval = retval + letter.lower()
            #print(retval)
    print(retval)
    return retval
myfunc('Thisisanamazingscript')
String = n76a+q80a+l83a+i153a+l203f+r207a+s211a+s215w+f216a+e283l
I want the script to look at a pair at a time meaning:
evaluate n76a+q80a. if abs(76-80) < 10, then replace '+' with a '_':
else don't change anything.
Then evaluate q80a+l83a next and do the same thing.
The desired output should be:
n76a_q80a_l83a+i153a+l203f_r207a_s211a_s215w_f216a+e283l
What I tried is:
def aa_dist(x):
    if abs(int(x[1:3]) - int(x[6:8])) < 10:
        print re.sub(r'\+', '_', x)
with open(input_file, 'r') as alex:
    oligos_list = alex.read()
    aa_dist(oligos_list)
This is what I have up to this point. I know that my code will just replace every '+' with '_', because it only evaluates the first pair and then replaces all of them. How should I do this?
import itertools,re
my_string = "n76a+q80a+l83a+i153a+l203f+r207a+s211a+s215w+f216a+e283l"
#first extract the numbers
my_numbers = map(int,re.findall("[0-9]+",my_string))
#split the string on + (useless comment)
parts = my_string.split("+")
def get_filler((a,b)):
    '''this method decides on the joiner'''
    return "_" if abs(a-b) < 10 else '+'
fillers = map(get_filler,zip(my_numbers,my_numbers[1:])) #figure out what fillers we need
print "".join(itertools.chain.from_iterable(zip(parts,fillers)))+parts[-1] #it will always skip the last part so gotta add it
is one way you might accomplish this... and is also an example of worthless comments
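The code above relies on Python 2 features (tuple parameters in def, map returning a list, the print statement). A rough Python 3 sketch of the same pairwise idea follows; it assumes each part contains exactly one run of digits, and the variable names are only illustrative:

```python
import re

my_string = "n76a+q80a+l83a+i153a+l203f+r207a+s211a+s215w+f216a+e283l"

parts = my_string.split("+")
# One number per part (assumes every part has exactly one run of digits).
numbers = [int(re.search(r"\d+", p).group()) for p in parts]

# Pick the joiner for each pair of neighbouring parts.
fillers = ["_" if abs(a - b) < 10 else "+"
           for a, b in zip(numbers, numbers[1:])]

# Interleave parts and fillers; the last part has no filler after it.
result = "".join(p + f for p, f in zip(parts, fillers)) + parts[-1]
print(result)
```

Each joiner is decided from its own pair of neighbours, which is what the original attempt was missing.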
Through re module only.
>>> s = 'n76a+q80a+l83a+i153a+l203f+r207a+s211a+s215w+f216a+e283l'
>>> m = re.findall(r'(?=\b([^+]+\+[^+]+))', s) # This regex performs an overlapping match. See the demo (https://regex101.com/r/jO6zT2/13)
>>> m
['n76a+q80a', 'q80a+l83a', 'l83a+i153a', 'i153a+l203f', 'l203f+r207a', 'r207a+s211a', 's211a+s215w', 's215w+f216a', 'f216a+e283l']
>>> l = []
>>> for i in m:
        if abs(int(re.search(r'^\D*(\d+)', i).group(1)) - int(re.search(r'^\D*\d+\D*(\d+)', i).group(1))) < 10:
            l.append(i.replace('+', '_'))
        else:
            l.append(i)
>>> re.sub(r'([a-z0-9]+)\1', r'\1',''.join(l))
'n76a_q80a_l83a+i153a+l203f_r207a_s211a_s215w_f216a+e283l'
By defining a separate function.
import re
def aa_dist(x):
    l = []
    m = re.findall(r'(?=\b([^+]+\+[^+]+))', x)
    for i in m:
        if abs(int(re.search(r'^\D*(\d+)', i).group(1)) - int(re.search(r'^\D*\d+\D*(\d+)', i).group(1))) < 10:
            l.append(i.replace('+', '_'))
        else:
            l.append(i)
    return re.sub(r'([a-z0-9]+)\1', r'\1',''.join(l))
string = 'n76a+q80a+l83a+i153a+l203f+r207a+s211a+s215w+f216a+e283l'
print aa_dist(string)
Output:
n76a_q80a_l83a+i153a+l203f_r207a_s211a_s215w_f216a+e283l
Let's say I have a string that looks like "1000101"
I want to iterate over all possible ways to insert "!" where "1" is:
1000101
100010!
1000!01
1000!0!
!000101
!00010!
!000!01
!000!0!
scalable to any string and any number of "1"s
As (almost) always, itertools.product to the rescue:
>>> from itertools import product
>>> s = "10000101"
>>> all_poss = product(*(['1', '!'] if c == '1' else [c] for c in s))
>>> for x in all_poss:
... print(''.join(x))
...
10000101
1000010!
10000!01
10000!0!
!0000101
!000010!
!0000!01
!0000!0!
(Since we're working with one-character strings here we could even get away with
product(*('1!' if c == '1' else c for c in s))
if we wanted.)
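Wrapped into a function, the same product call might look like the sketch below. The name bang_variants and its target and mark parameters are hypothetical. Each target character contributes two choices, so a string with n of them yields 2**n results:

```python
from itertools import product

def bang_variants(s, target='1', mark='!'):
    # Every target character may stay as it is or become the mark;
    # every other character has exactly one choice.
    choices = [(c, mark) if c == target else (c,) for c in s]
    return [''.join(combo) for combo in product(*choices)]

variants = bang_variants('1000101')
for v in variants:
    print(v)
```

For long inputs, iterating over product directly instead of building a list avoids holding all 2**n strings in memory at once.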
Here you go. The recursive structure is this: generate all the sub-combinations of s[1:], and for each of them prepend s[0]; if s[0] is 1, also prepend ! as a second variant.
def subcombs(s):
    if not s:
        return ['']
    char = s[0]
    res = []
    for combo in subcombs(s[1:]):
        if char == '1':
            res.append('!' + combo)
        res.append(char + combo)
    return res
print(subcombs('1000101'))
['!000!0!', '1000!0!', '!00010!', '100010!', '!000!01', '1000!01', '!000101', '1000101']
An approach with generator:
def possibilities(s):
    if not s:
        yield ""
    else:
        for s_next in possibilities(s[1:]):
            yield "".join([s[0], s_next])
            if s[0] == '1':
                yield "".join(['!', s_next])
print list(possibilities("1000101"))
Output:
['1000101', '!000101', '1000!01', '!000!01', '100010!', '!00010!', '1000!0!', '!000!0!']
I am working on a piece of code that requires to find certain characters in a word and then replace those characters in a generated string. The code works fine when the word has only one of each character; however, when I have two or more characters of the same kind, the code only identifies the first one and ignores the following ones. Do you have any suggestions on how to solve this issue?
def write_words (word, al):
    newal = (list(al))
    n = len(word)
    i = 0
    x = 0
    a = []
    b = ["_"]
    for i in range(0, n):
        a = a + b
    while (x <(len(newal))):
        z = newal[x]
        y = word.find(z)
        x = x + 1
        print (y)
        if y >= 0:
            a[y] = z
    return(a)
(The Python version I'm working with is 3.2.1)
The problem here is that find() returns the index of the first occurrence of the element.
You can just use the following code to replace the occurrences instead.
>>> word = 'abcdabcd'
>>> ignore = 'ab'
>>> "".join([elem if elem not in ignore else '_' for elem in word])
'__cd__cd'
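If you would rather keep using find(), note that it accepts an optional start index, so one way to collect every occurrence is to call it repeatedly, resuming just past the previous hit. A sketch with hypothetical helper names find_all and reveal, filling in the matched letters the way the question's code intended:

```python
def find_all(word, ch):
    """Return every index at which ch occurs in word."""
    indices = []
    pos = word.find(ch)
    while pos != -1:
        indices.append(pos)
        # Resume the search just past the previous hit.
        pos = word.find(ch, pos + 1)
    return indices

def reveal(word, letters):
    # Start fully masked, then uncover every occurrence of each letter.
    a = ['_'] * len(word)
    for ch in letters:
        for pos in find_all(word, ch):
            a[pos] = ch
    return a

print(find_all('abcdabcd', 'a'))
print(reveal('abcdabcd', 'ab'))
```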
P.S - Some pointers on your current code.
def write_words (word, al):
    newal = (list(al))
    n = len(word)
    i = 0
    x = 0
    a = []
    b = ["_"]
    for i in range(0, n):
        a = a + b
    while (x <(len(newal))):
        z = newal[x]
        y = word.find(z)
        x = x + 1
        print (y)
        if y >= 0:
            a[y] = z
    return(a)
Instead of doing a for loop and appending _ at every element in a, you could have just done a = ['_']*len(word).
You don't need a while loop here, or to convert al to a list. Strings are iterable, so you could just do for elem in al. That way you don't have to keep a separate x variable to iterate over the string.
So, now your code gets reduced to
>>> def write_words_two(word, al):
        a = ['_']*len(word)
        for elem in al:
            y = word.find(elem)
            print(y)
            if y >= 0:
                a[y] = elem
        return a
But it still has the same problem as before: word.find(elem) only returns the index of the first occurrence of the character, not the indices of all its occurrences. So, instead of building up a list first and then replacing the characters, we should build up the list as we go along, testing every character of the word against our ignored characters; if the character needs to be ignored, we put its replacement in the list instead. Then we come up with the following code:
>>> def write_words_three(word, al, ignore):
        a = []
        for elem in word:
            if elem in al:
                a.append(ignore)
            else:
                a.append(elem)
        return a
>>> write_words_three('abcdabcd', 'ab', '_')
['_', '_', 'c', 'd', '_', '_', 'c', 'd']
But it still returns a list rather than the string we want, and it is a little long too. So why not shorten it with a list comprehension?
>>> def write_words_four(word, al, ignore):
        return [elem if elem not in al else ignore for elem in word]
>>> write_words_four('abcdabcd', 'ab', '_')
['_', '_', 'c', 'd', '_', '_', 'c', 'd']
We still need a string out of this, though, and our code just returns a list. We can use the join(...) method for that and join the elements of the list.
>>> def write_words_five(word, al, ignore):
        return "".join([elem if elem not in al else ignore for elem in word])
>>> write_words_five('abcdabcd', 'ab', '_')
'__cd__cd'
which gives us what we want.
Replace your find function with this one, which returns every index at which x occurs in main:
def myfind(main, x):
    return [i for i, j in enumerate(main) if j == x]
so that in your code:
ys = myfind( word, z )
for y in ys:
    a[y] = z
This should do what the OP asked, with minimal change to the original code. Does not work if "_" is an allowed character in al.
def write_words (word, al):
    newal = (list(al))
    n = len(word)
    i = 0
    x = 0
    a = []
    b = ["_"]
    for i in range(0, n):
        a = a + b
    while (x <(len(newal))):
        z = newal[x]
        y = word.find(z)
        while (y >= 0):
            print (y)
            a[y] = z
            # strings are immutable, so rebuild word with "_" at position y
            word = word[:y] + "_" + word[y+1:]
            y = word.find(z)
        x = x + 1
    return a