For every string, I need to insert a # after every 6 characters.
For example:
example_string = "this is an example string. ok ????"
myfunction(example_string)
"this i#s an e#xample# strin#g. ok #????"
What is the most efficient way to do that?
How about this?
'#'.join( [example_string[a:a+6] for a in range(0,len(example_string),6)])
It runs pretty quickly, too. On my machine, five microseconds per 100-character string:
>>> import timeit
>>> timeit.Timer( "'#'.join([s[a:a+6] for a in range(0,len(s),6)])", "s='x'*100").timeit()
4.9556539058685303
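The slicing approach can be wrapped in a reusable function. This is a minimal Python 3 sketch; the name insert_every and its parameters n and sep are made up for illustration and are not from the original post:

```python
def insert_every(s, n=6, sep="#"):
    """Return s with sep inserted between consecutive chunks of n characters."""
    if n <= 0:
        raise ValueError("n must be positive")
    # Slice the string into chunks of n characters, then join them with sep.
    return sep.join(s[i:i + n] for i in range(0, len(s), n))


example_string = "this is an example string. ok ????"
print(insert_every(example_string))  # this i#s an e#xample# strin#g. ok #????
```

Because the separator only goes between chunks, a string whose length is an exact multiple of n does not get a trailing separator.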
>>> str = "this is an example string. ok ????"
>>> import re
>>> re.sub("(.{6})", r"\1#", str)
'this i#s an e#xample# strin#g. ok #????'
Update:
Normally dot matches all characters except new-lines. Use re.S to make dot match all characters including new-line chars.
>>> pattern = re.compile("(.{6})", re.S)
>>> str = "this is an example string with\nmore than one line\nin it. It has three lines"
>>> print pattern.sub(r"\1#", str)
this i#s an e#xample# strin#g with#
more #than o#ne lin#e
in i#t. It #has th#ree li#nes
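One difference between the join and re.sub approaches shows up when the length is an exact multiple of 6: the regex version appends a trailing #, while join does not. A small Python 3 sketch of this (the function names are illustrative, and the lookahead variant is my own assumption about how one might avoid the trailing #):

```python
import re

def with_join(s, n=6):
    # The separator goes only between chunks.
    return "#".join(s[i:i + n] for i in range(0, len(s), n))

def with_regex(s, n=6):
    # re.S lets the dot match newlines as well.
    return re.sub("(.{%d})" % n, r"\1#", s, flags=re.S)

def with_regex_no_trailing(s, n=6):
    # The lookahead requires at least one more character after the chunk.
    return re.sub("(.{%d})(?=.)" % n, r"\1#", s, flags=re.S)

print(with_join("abcdefghijkl"))               # abcdef#ghijkl
print(with_regex("abcdefghijkl"))              # abcdef#ghijkl#
print(with_regex_no_trailing("abcdefghijkl"))  # abcdef#ghijkl
```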
import itertools
def every6(sin, c='#'):
    r = itertools.izip_longest(*([iter(sin)] * 6 + [c * (len(sin) // 6)]))
    return ''.join(''.join(y for y in x if y is not None) for x in r)
print every6(example_string)
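itertools.izip_longest exists only in Python 2; in Python 3 the function is called zip_longest. A sketch of the same idea for Python 3, with comments on what each step does:

```python
from itertools import zip_longest

def every6(sin, c="#"):
    # Six references to one iterator yield consecutive groups of six characters;
    # the extra string supplies one separator per complete group.
    groups = zip_longest(*([iter(sin)] * 6 + [c * (len(sin) // 6)]))
    # Incomplete groups are padded with None, which is skipped here.
    return "".join("".join(y for y in x if y is not None) for x in groups)

print(every6("this is an example string. ok ????"))  # this i#s an e#xample# strin#g. ok #????
```

Like the regex version, this adds a separator after every complete group, so a 6-character input comes back with a trailing #.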
My code so far is:
def ChangeString():
    print (userString.replace(
userString =str(input("Please enter a string "))
ChangeString()
In a string, I need to replace all instances of the first character with a *, without actually replacing the first character itself. An example is, let's say I have "Bobble"; the function would return something like, "Bo**le".
userString[0] + userString[1:].replace(userString[0], "*")
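Wrapped in a function, the slicing idea looks roughly like the sketch below (Python 3; the name mask_repeats is made up for illustration). Note that str.replace is case-sensitive, so "Bobble" comes back unchanged unless the comparison ignores case:

```python
def mask_repeats(text, mask="*"):
    """Replace later occurrences of the first character, matching case exactly."""
    if not text:
        return text
    first = text[0]
    # Only text[1:] is searched, so the first character itself is never replaced.
    return first + text[1:].replace(first, mask)

print(mask_repeats("bobble"))  # bo**le
print(mask_repeats("Bobble"))  # Bobble ('B' and 'b' are different characters)
```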
>>> test = 'Bobble'
>>> test = test[0] +''.join(l if l.lower() != test[0].lower() else '*' for l in test[1:])
>>> print test
Bo**le
You could also use a regex:
import re
def ign_first(s, repl):
    # Escape the first character in case it is a regex metacharacter.
    return re.sub(r"(?<!^)" + re.escape(s[0]), repl, s, flags=re.I)
Demo:
In [5]: s = "Bobble"
In [6]: ign_first(s, "*")
Out[6]: 'Bo**le'
Or use str.join with a set:
def ign_first(s, repl):
    first_ch = s[0]
    # Both cases of the first character, so the match is case-insensitive.
    st = {first_ch.lower(), first_ch.upper()}
    return first_ch + "".join([repl if ch in st else ch for ch in s[1:]])
Demo:
In [10]: ign_first(s, "*")
Out[10]: 'Bo**le'
I would use slices and lower(). Note that this also lowercases the rest of the string.
>>> test = 'Bobble'
>>> test[0] + test[1:].lower().replace(test[0].lower(), '*')
'Bo**le'
There's no need to use additional variables
>>> st='Bobble'
>>> st=st[0]+st[1:].lower().replace(st[0].lower(),'*')
>>> st
'Bo**le'
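The lower()-based versions change the case of everything after the first character, which only shows up once the input has other uppercase letters. A Python 3 sketch comparing that with a case-preserving alternative (both function names are illustrative, and the re.IGNORECASE variant is an assumed alternative, not part of the answers above):

```python
import re

def mask_lowercased(text):
    # The approach above: lowercase the tail, then replace.
    return text[0] + text[1:].lower().replace(text[0].lower(), "*")

def mask_keep_case(text):
    # Assumed alternative: match either case but leave other characters untouched.
    return text[0] + re.sub(re.escape(text[0]), "*", text[1:], flags=re.IGNORECASE)

print(mask_lowercased("Bobble HUB"))  # Bo**le hu*
print(mask_keep_case("Bobble HUB"))   # Bo**le HU*
```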
A case-insensitive solution with regular expressions:
import re
string = "Bobble"
outcome = string[0] + re.sub(string[0].lower() + '|' + string[0].upper(),
                             "*", string[1:])
print(outcome)
# Bo**le
The string might be:
JAIDK392**8'^+%&7JDJ0204İŞÇéS29487
I would like to remove everything except the digits.
A simple way to do this is with the regular expression library re:
>>> import re
>>> yourString = "JAIDK392**8'^+%&7JDJ0204İŞÇéS29487"
>>> numberOnlyString = re.sub('[^0-9]', '', yourString)
>>> print numberOnlyString
39287020429487
There's a way to do it without importing any library. You can use the built-in function ord to get the code point of a character, then iterate over the string and keep each character that is a digit (its code point is strictly between 47 and 58, i.e. 48 to 57).
text = "JAIDK392**8'^+%&7JDJ0204İŞÇéS29487"
output = []
for char in text:
    if 47 < ord(char) < 58:
        output.append(char)
result=''.join(output)
print result
Regular expressions are great, but another way to do it would be:
>>> import string
>>> digits_only = "".join(_ for _ in your_string if _ in string.digits)
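Here is the string.digits idea as a complete Python 3 sketch (digits_only as a function name is my own). It also shows why I would not swap in str.isdigit() blindly: that method accepts digits from other scripts, not just 0-9.

```python
import string

def digits_only(text):
    """Keep only the ASCII digits 0-9 from text."""
    return "".join(ch for ch in text if ch in string.digits)

sample = "JAIDK392**8'^+%&7JDJ0204İŞÇéS29487"
print(digits_only(sample))    # 39287020429487

# str.isdigit() is broader than string.digits:
print("٣".isdigit())          # True  (Arabic-Indic digit three)
print("٣" in string.digits)   # False
```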
I am trying to use Python's re.sub() to match a string containing an e character and wrap the digits that follow it in curly braces: an opening brace immediately after the e and a closing brace after the last digit. For example:
12.34e56 to 12.34e{56}
1e10 to 1e{10}
I can't seem to find the correct regex to insert the desired curly braces. For example, I can properly insert the left brace like this:
>>> import re
>>> x = '12.34e10'
>>> pattern = re.compile(r'(e)')
>>> sub = z = re.sub(pattern, "\1e{", x)
>>> print(sub)
12.34e{10 # this is the correct placement for the left brace
My problem arises when using two back references.
>>> import re
>>> x = '12.34e10'
>>> pattern = re.compile(r'(e).+($)')
>>> sub = z = re.sub(pattern, "\1e{\2}", x)
>>> print(sub)
12.34e{} # this is not what I want, digits 10 have been removed
Can anyone point out my problem? Thanks for the help.
re.sub(r'e(\d+)', r'e{\1}', '12.34e56')
returns '12.34e{56}'
or, the same result but different logic (don't replace e with e):
re.sub(r'(?<=e)(\d+)', r'{\1}', '12.34e56')
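To see why the attempt in the question lost the digits, it helps to look at what each group actually captured. A small diagnostic sketch (Python 3):

```python
import re

x = "12.34e10"
m = re.search(r"(e).+($)", x)
print(repr(m.group(0)))  # 'e10'  the whole match, which re.sub replaces
print(repr(m.group(1)))  # 'e'
print(repr(m.group(2)))  # ''     ($) matches the empty string at the end

# Without the r prefix, "\1" is the control character \x01, not a backreference.
print(repr("\1"))        # '\x01'

fixed = re.sub(r"e(\d+)", r"e{\1}", x)
print(fixed)             # 12.34e{10}
```

The digits were consumed by .+, which sits outside any group, so nothing in the replacement could put them back.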
Your brace placement is incorrect.
Here's a solution that ensures there's a number with an optional decimal part before the e:
import re
samples = ['12.34e56','1e10']
for s in samples:
    print re.sub(r'(\d+(?:\.\d+)?)e([0-9]+)', r"\g<1>e{\g<2>}", s)
Yields:
12.34e{56}
1e{10}
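The \g<1> form is equivalent to \1 here; it only becomes necessary when a literal digit follows the reference. The sketch below (Python 3) precompiles the pattern and also accepts a signed exponent, which is an assumption beyond the samples in the question; brace_exponent is a hypothetical name.

```python
import re

# Number with optional decimal part, then e, then an optionally signed exponent.
_EXP = re.compile(r"(\d+(?:\.\d+)?)e([+-]?\d+)")

def brace_exponent(text):
    """Wrap the exponent of numbers like 12.34e56 in curly braces."""
    return _EXP.sub(r"\g<1>e{\g<2>}", text)

for sample in ["12.34e56", "1e10", "1.5e-3", "hello"]:
    print(brace_exponent(sample))
# 12.34e{56}
# 1e{10}
# 1.5e{-3}
# hello
```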
I have a string s with nested brackets: s = "AX(p>q)&E((-p)Ur)"
I want to remove all characters between all pairs of brackets (and the brackets themselves) and store the result in a new string like this: new_string = "AX&E"
I tried doing this:
p = re.compile("\(.*?\)", re.DOTALL)
new_string = p.sub("", s)
It gives output: AX&EUr)
Is there any way to correct this, rather than iterating each element in the string?
Another simple option is removing the innermost parentheses at every stage, until there are no more parentheses:
p = re.compile(r"\([^()]*\)")
count = 1
while count:
    s, count = p.subn("", s)
Working example: http://ideone.com/WicDK
You can just use string manipulation, without regular expressions (this works as long as the brackets are not nested):
>>> s = "AX(p>q)&E(qUr)"
>>> [ i.split("(")[0] for i in s.split(")") ]
['AX', '&E', '']
I leave it to you to join the strings up.
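Joining the pieces could look like the sketch below (Python 3; the function name is illustrative). It also shows that the split trick breaks down on the nested string from the question:

```python
def strip_flat_parens(s):
    # Split on ")" and keep only what precedes the first "(" in each piece.
    return "".join(piece.split("(")[0] for piece in s.split(")"))

print(strip_flat_parens("AX(p>q)&E(qUr)"))     # AX&E
print(strip_flat_parens("AX(p>q)&E((-p)Ur)"))  # AX&EUr  (nested input is not handled)
```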
>>> import re
>>> s = "AX(p>q)&E(qUr)"
>>> re.compile("""\([^\)]*\)""").sub('', s)
'AX&E'
Yeah, it should be:
>>> import re
>>> s = "AX(p>q)&E(qUr)"
>>> p = re.compile("\(.*?\)", re.DOTALL)
>>> new_string = p.sub("", s)
>>> new_string
'AX&E'
Nested brackets (or tags, ...) cannot be handled in a general way with regular expressions. See http://www.amazon.de/Mastering-Regular-Expressions-Jeffrey-Friedl/dp/0596528124/ref=sr_1_1?ie=UTF8&s=gateway&qid=1304230523&sr=8-1-spell for details on why. You would need a real parser.
It's possible to construct a regex that handles two levels of nesting, but it is already ugly, and one for three levels is quite long. And you don't want to think about four levels. ;-)
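For this particular task (just dropping whatever is inside the brackets) a full parser may be more than is needed: counting the nesting depth in a single pass is enough. A minimal Python 3 sketch, assuming only round brackets matter; the function name and the handling of unmatched brackets are my own choices:

```python
def remove_parenthesized(s):
    """Drop every character inside parentheses, at any nesting depth."""
    depth = 0
    kept = []
    for ch in s:
        if ch == "(":
            depth += 1
        elif ch == ")":
            # An unmatched ")" is dropped and the depth never goes negative.
            depth = max(depth - 1, 0)
        elif depth == 0:
            kept.append(ch)
    return "".join(kept)

print(remove_parenthesized("AX(p>q)&E((-p)Ur)"))  # AX&E
```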
You can use PyParsing to parse the string:
from pyparsing import nestedExpr
import sys
s = "AX(p>q)&E((-p)Ur)"
expr = nestedExpr('(', ')')
result = expr.parseString('(' + s + ')').asList()[0]
s = ''.join(filter(lambda x: isinstance(x, str), result))
print(s)
Most code is from: How can a recursive regexp be implemented in python?
You could use re.subn():
import re
s = 'AX(p>q)&E((-p)Ur)'
while True:
    s, n = re.subn(r'\([^)(]*\)', '', s)
    if n == 0:
        break
print(s)
Output
AX&E
this is just how you do it:
# strings
# double and single quotes use in Python
"hey there! welcome to CIP"
'hey there! welcome to CIP'
"you'll understand python"
'i said, "python is awesome!"'
'i can\'t live without python'
# use of 'r' before string
print(r"\new code", "\n")
first = "code in"
last = "python"
first + last #concatenation
# slicing of strings
user = "code in python!"
print(user)
print(user[5]) # print an element
print(user[-3]) # print an element from rear end
print(user[2:6]) # slicing the string
print(user[:6])
print(user[2:])
print(len(user)) # length of the string
print(user.upper()) # convert to uppercase
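print(user.lstrip()) # remove leading whitespace
print(user.rstrip()) # remove trailing whitespace
print(max(user)) # character with the highest code point in the string
print(min(user)) # character with the lowest code point in the string
print(user.join(["1", "2", "3", "4"])) # join() needs strings; user is used as the separator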
input()
I want to remove any brackets from a string. Why doesn't this work properly?
>>> name = "Barack (of Washington)"
>>> name = name.strip("(){}<>")
>>> print name
Barack (of Washington
Because that's not what strip() does. It removes leading and trailing characters that are present in the argument, but not those characters in the middle of the string.
You could do:
name= name.replace('(', '').replace(')', '').replace ...
or:
name= ''.join(c for c in name if c not in '(){}<>')
or maybe use a regex:
import re
name= re.sub('[(){}<>]', '', name)
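A quick illustration of the difference (Python 3 sketch): strip() works inward from each end and stops at the first character that is not in the given set, so anything in the middle is left alone.

```python
name = "Barack (of Washington)"

# strip() stops at the first character, from each end, that is not in the set.
print(name.strip("(){}<>"))          # Barack (of Washington
print("<(Barack)>".strip("(){}<>"))  # Barack

# Removing the characters everywhere needs a different tool, e.g. a filter.
print("".join(c for c in name if c not in "(){}<>"))  # Barack of Washington
```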
I did a time test here, using each method 100000 times in a loop. The results surprised me. (The results still surprise me after editing them in response to valid criticism in the comments.)
Here's the script:
import timeit
bad_chars = '(){}<>'
setup = """import re
import string
s = 'Barack (of Washington)'
bad_chars = '(){}<>'
rgx = re.compile('[%s]' % bad_chars)"""
timer = timeit.Timer('o = "".join(c for c in s if c not in bad_chars)', setup=setup)
print "List comprehension: ", timer.timeit(100000)
timer = timeit.Timer("o= rgx.sub('', s)", setup=setup)
print "Regular expression: ", timer.timeit(100000)
timer = timeit.Timer('for c in bad_chars: s = s.replace(c, "")', setup=setup)
print "Replace in loop: ", timer.timeit(100000)
timer = timeit.Timer('s.translate(string.maketrans("", "", ), bad_chars)', setup=setup)
print "string.translate: ", timer.timeit(100000)
Here are the results:
List comprehension: 0.631745100021
Regular expression: 0.155561923981
Replace in loop: 0.235936164856
string.translate: 0.0965719223022
Results on other runs follow a similar pattern. If speed is not the primary concern, however, I still think string.translate is not the most readable; the other three are more obvious, though slower to varying degrees.
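The script above is Python 2. A rough Python 3 adaptation is sketched below for anyone who wants to rerun the comparison; the structure (a dict of lambdas, a replace_loop helper) is my own, and timings depend on machine and interpreter version, so none are shown.

```python
import re
import timeit

s = "Barack (of Washington)"
bad_chars = "(){}<>"
rgx = re.compile("[%s]" % re.escape(bad_chars))
table = str.maketrans("", "", bad_chars)  # maps each bad character to None

def replace_loop(text):
    # Works on a local copy, so every timed call starts from the original string.
    for c in bad_chars:
        text = text.replace(c, "")
    return text

candidates = {
    "Generator expression": lambda: "".join(c for c in s if c not in bad_chars),
    "Regular expression": lambda: rgx.sub("", s),
    "Replace in loop": lambda: replace_loop(s),
    "str.translate": lambda: s.translate(table),
}

for label, func in candidates.items():
    print("%-22s %.4f" % (label + ":", timeit.timeit(func, number=100000)))
```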
In Python 2, string.translate with table=None works fine.
>>> name = "Barack (of Washington)"
>>> name = name.translate(None, "(){}<>")
>>> print name
Barack of Washington
Because strip() only strips trailing and leading characters, based on what you provided. I suggest:
>>> import re
>>> name = "Barack (of Washington)"
>>> name = re.sub('[\(\)\{\}<>]', '', name)
>>> print(name)
Barack of Washington
strip only strips characters from the very front and back of the string.
To delete a list of characters, you could use the string's translate method:
import string
name = "Barack (of Washington)"
table = string.maketrans( '', '', )
print name.translate(table,"(){}<>")
# Barack of Washington
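In Python 3, string.maketrans is gone and str.translate takes only a table, so the deletion set moves into the table itself. A minimal sketch of the equivalent call:

```python
name = "Barack (of Washington)"
# The third argument of str.maketrans lists the characters to delete.
table = str.maketrans("", "", "(){}<>")
print(name.translate(table))  # Barack of Washington
```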
Since strip only removes characters from the start and end, one idea is to break the string into a list of words, strip the characters from each word, and then join the words again:
s = 'Barack (of Washington)'
x = [j.strip('(){}<>') for j in s.split()]
ans = ' '.join(x)
print(ans)
For example, take the string s = "(U+007c)".
To remove only the parentheses from s, try this:
import re
a=re.sub("\\(","",s)
b=re.sub("\\)","",a)
print(b)