How to "render" \b in a string in Python

I have a string containing "\b" (backspace) characters.
Is there a way to "render" the string, that is, to apply the escape sequences so that the string looks the way it does when printed with print()?
What it looks like: Test..\b\b! 12344\b5
What it should look like: Test! 12345
Do you have an idea how to solve my problem?

One way would be simply to use the replace method of the string object:
st = 'Test.\b!'
st.replace('.\b','')
# Out: 'Test!'

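I found a solution using a regex.
import re
def b(a):
    # Each pass deletes every character that is immediately followed by a backspace.
    # Stop once a pass changes nothing, so a leading '\b' cannot cause an endless loop.
    count = 1
    while count:
        a, count = re.subn('[^\b]\b', '', a)
    return a
b('Test..\b\b! 12344\b5')
# Out: 'Test! 12345'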
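Another option is to walk the string once and treat each \b as "delete the previous character". This is only a minimal sketch: the function name apply_backspaces is made up for illustration, and it assumes a backspace erases the preceding character. A real terminal only moves the cursor left, so the result can differ when nothing is printed over the skipped characters.

```python
def apply_backspaces(text):
    """Return text with each backspace removing the character before it."""
    out = []
    for ch in text:
        if ch == '\b':
            # Drop the previous character, if there is one to drop.
            if out:
                out.pop()
        else:
            out.append(ch)
    return ''.join(out)


print(apply_backspaces('Test..\b\b! 12344\b5'))  # Test! 12345
```

Unlike the loop over re.sub, this makes a single pass and simply ignores a backspace at the start of the string.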

Related

how to match a pattern and add a character to it

I have something like:
GCF_002904975:2.6672e-05):2.6672e-05.
and I would like to add the suffix '_S' right after every GCF_(any number) entry, before the next colon.
In other words, I would like my text to become:
GCF_002904975_S:2.6672e-05):2.6672e-05.
This pattern repeats throughout my text.
This can easily be done with the re.sub function. A working example would look like this:
import re
inp_string = '(((GCF_001297375:2.6671e-05,GCF_002904975:2.6672e-05)0.924:0.060046136,(GCF_000144955:0.036474926,((GCF_001681075:0.017937143,...'
if __name__ == "__main__":
    # The named group captures the digits so they can be reused in the replacement.
    outp_string = re.sub(r'GCF_(?P<gfc_number>\d+):', r'GCF_\g<gfc_number>_S:', inp_string)
    print(outp_string)
This code gives the following result, which is hopefully what you need:
(((GCF_001297375_S:2.6671e-05,GCF_002904975_S:2.6672e-05)0.924:0.060046136,(GCF_000144955_S:0.036474926,((GCF_001681075_S:0.017937143,...
For more info take a look at the docs:
https://docs.python.org/3/library/re.html
You can use regular expressions with a function substitution. The solution below depends on the numbers always being 9 digits, but could be modified to work with other cases.
import re
test_str = '(((GCF_001297375:2.6671e-05,GCF_002904975:2.6672e-05)0.924:0.060046136,GCF_000144955:0.036474926,((GCF_001681075:0.017937143,...'
new_str = re.sub(r"GCF_\d{9}", lambda x: x.group(0) + "_S", test_str)
print(new_str)
#(((GCF_001297375_S:2.6671e-05,GCF_002904975_S:2.6672e-05)0.924:0.060046136,GCF_000144955_S:0.036474926,((GCF_001681075_S:0.017937143,...
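The nine-digit assumption can be relaxed. Here is a minimal sketch, assuming every entry has the form GCF_ followed by digits and then a colon; the helper name add_suffix is made up for illustration.

```python
import re


def add_suffix(text, suffix="_S"):
    # Match GCF_ plus any run of digits, but only when a colon follows.
    # The lookahead (?=:) checks for the colon without consuming it.
    return re.sub(r"GCF_\d+(?=:)", lambda m: m.group(0) + suffix, text)


sample = "((GCF_001297375:2.6671e-05,GCF_002904975:2.6672e-05)0.924:0.060046136"
print(add_suffix(sample))
```

Because the colon is not part of the match, colons that do not follow a GCF entry (such as the one after 0.924) are left alone.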
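Why not just do a replace? Note that this only works if every colon in the text follows a GCF entry; in the full string above, the colon in )0.924:0.060046136 would also get the suffix. Shortening your example string to make it easier to read: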
"(((GCF_001297375:2.6671e-05,GCF_002904975:2.6672e-05)...".replace(":","_S:")

how to remove comma at the end from the below string in python code

input string
str = "(\"Cardinal\", \"Tom B. Erichsen\", \"Skagen 21\",)"
output string should look like:
("Cardinal", "Tom B. Erichsen", "Skagen 21")
The comma at the end should be removed. How can I do this in Python?
I tried str.rstrip(","), but it didn't work.
You can use a regex. For example, replace (.*),([^,]+)$ with \1\2:
import re
result = re.sub(r"(.*),([^,]+)$", r"\1\2", yourstring)
Try this code:
str = str.replace('",)', '")')
You can also chain several str.replace() calls:
str.replace(", )",")").replace(",)",")")
That will work for your string.
You can do this in the following way, which drops the second-to-last character:
str = "(\"Cardinal\", \"Tom B. Erichsen\", \"Skagen 21\",)"
str = str[:-2] + str[-1]
You could use the re module:
import re
s = "INSERT INTO Customers (CustomerName, ContactName, Address, ) VALUES (\"Cardinal\", \"Tom B. Erichsen\", \"Skagen 21\",)"
print(re.sub(r',\s*\)', ')', s))
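As for why rstrip(",") had no effect: it only strips characters from the very end of the string, and this string ends with ")", not ",". Below is a minimal sketch of working around that without a regex. The helper name drop_trailing_comma is made up for illustration, and it assumes the text ends with a closing parenthesis.

```python
s = '("Cardinal", "Tom B. Erichsen", "Skagen 21",)'

# rstrip(',') changes nothing here because the last character is ')'.
unchanged = s.rstrip(',')


def drop_trailing_comma(text):
    # Set the closing parenthesis aside, strip trailing commas and spaces,
    # then put the parenthesis back.
    if text.endswith(')'):
        return text[:-1].rstrip(', ') + ')'
    return text


print(drop_trailing_comma(s))  # ("Cardinal", "Tom B. Erichsen", "Skagen 21")
```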

How to use regex to find the middle of a string

I'm trying to get certain results out of the response from Blogger. I wanna get my blog names. How would I go about something like that with Regex? I've tried Googling my issue but none of the answers helped me in my case unfortunately.
So my response looks something like this:
\\x22http://emyblog.blogspot.com/
So it's always starting with the \\x22http:// and ending with .blogspot.com/
I've tried the following re:
regEx = re.findall(b"""\x22http://(.*)\.blogspot\.com""", r)
But unfortunately it returned an empty list. Any ideas on how to solve this problem?
Thanks,
Use a raw string; otherwise \x22 is interpreted as the single character " rather than as the literal characters \, x, 2, 2. Also, re.findall is not really needed here; re.search should suffice.
Assuming your byte-string is:
>>> r = rb'\\x22http://emyblog.blogspot.com/'
With byte-strings:
>>> res = re.search(rb'\\x22http://(.*)\.blogspot\.com/', r)
>>> res.group(1)
b'emyblog'
With normal strings:
>>> res = re.search(r'\\\\x22http://(.*)\.blogspot\.com/', r.decode('utf-8'))
>>> res.group(1)
'emyblog'
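The difference the raw prefix makes can be checked directly. The sketch below also shows re.findall collecting several names at once; the response value and the second blog name are made up for illustration, and the pattern assumes blog names contain no dots or slashes.

```python
import re

# Without the r prefix, \x22 is an escape sequence for one byte: the double quote.
assert b'\x22' == b'"'
# With the r prefix, the same text is four separate bytes: \ x 2 2.
assert len(rb'\x22') == 4

# Hypothetical response containing two blog URLs in the escaped form.
response = rb'\\x22http://emyblog.blogspot.com/ \\x22http://otherblog.blogspot.com/'

# [^./]+ stops at the first dot or slash, so each match is a single blog name.
names = re.findall(rb'\\x22http://([^./]+)\.blogspot\.com/', response)
print(names)  # [b'emyblog', b'otherblog']
```

A greedy (.*) would run from the first URL to the last .blogspot.com in the response, which is why the sketch uses a narrower character class.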
Use r'' (the string is taken as a raw string literal) instead of b'':
import re
pattern = re.compile(r'\x22http://(.*)\.blogspot\.com')
match = pattern.match('\x22http://emyblog.blogspot.com/')
match.group(1)
# 'emyblog'
This seems to be working!
import re
text = "\x22http://emyblog.blogspot.com/"
regex = re.compile(r'\x22http://(.*)\.blogspot\.com')
print(regex.findall(text))

python: how to remove '$'?

All I want to do is remove the dollar sign '$'. This seems simple, but I really don't know why my code isn't working.
import re
input = '$5'
if '$' in input:
    input = re.sub(re.compile('$'), '', input)
print(input)
Input still is '$5' instead of just '5'! Can anyone help?
Try using replace instead:
input = input.replace('$', '')
As Madbreaks has stated, $ means match the end of the line in a regular expression.
Here is a handy link to regular expressions: http://docs.python.org/2/library/re.html
In this case, I'd use str.translate (shown here with the Python 2 signature):
>>> '$$foo$$'.translate(None,'$')
'foo'
And for benchmarking purposes:
>>> def repl(s):
...     return s.replace('$','')
...
>>> def trans(s):
...     return s.translate(None,'$')
...
>>> import timeit
>>> s = '$$foo bar baz $ qux'
>>> print timeit.timeit('repl(s)','from __main__ import repl,s')
0.969965934753
>>> print timeit.timeit('trans(s)','from __main__ import trans,s')
0.796354055405
There are a number of differences between str.replace and str.translate. The most notable is that str.translate is useful for switching 1 character with another whereas str.replace replaces 1 substring with another. So, for problems like, I want to delete all characters a,b,c, or I want to change a to d, I suggest str.translate. Conversely, problems like "I want to replace the substring abc with def" are well suited for str.replace.
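The translate(None, '$') call above is the Python 2 form. In Python 3, str.translate takes a single table argument, so the deletion table has to be built first. A minimal sketch (the helper name remove_chars is made up for illustration):

```python
def remove_chars(text, chars):
    # The third argument of str.maketrans lists characters to delete:
    # each one is mapped to None in the resulting table.
    table = str.maketrans('', '', chars)
    return text.translate(table)


print(remove_chars('$$foo$$', '$'))  # foo
```

The same table can delete several characters at once, for example remove_chars('a$b,c', '$,').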
Note that your example doesn't work because $ has a special meaning in a regex (it matches at the end of the string). To get it to work with a regex, you need to escape the $:
>>> re.sub(r'\$','',s)
'foo bar baz  qux'
works OK.
$ is a special character in regular expressions that means 'end of the string'.
You need to escape it if you want to use it literally.
Try this:
import re
input = "$5"
if "$" in input:
    input = re.sub(re.compile(r'\$'), '', input)
print(input)
You need to escape the dollar sign; otherwise Python treats it as an anchor: http://docs.python.org/2/library/re.html
import re
fred = "$hdkhsd%$"
print(re.sub(r"\$", "!", fred))
>> !hdkhsd%!
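If the text to remove comes from a variable, escaping by hand is easy to forget. A minimal sketch using re.escape, which backslash-escapes regex metacharacters for you (the helper name remove_literal is made up for illustration):

```python
import re


def remove_literal(text, literal):
    # re.escape turns '$' into '\$', so the pattern matches the
    # literal text instead of acting as an end-of-string anchor.
    return re.sub(re.escape(literal), '', text)


print(remove_literal('$5', '$'))  # 5
```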
Aside from the other answers, you can also use strip(), as long as the dollar signs occur only at the start or end of the string:
input = input.strip('$')

How to convert specific character sequences in a string to upper case using Python?

I am looking to accomplish the following and am wondering if anyone has a suggestion as to how best go about it.
I have a string, say 'this-is,-toronto.-and-this-is,-boston', and I would like to convert all occurrences of ',-[a-z]' to ',-[A-Z]'. In this case the result of the conversion would be 'this-is,-Toronto.-and-this-is,-Boston'.
I've been trying to get something working with re.sub(), but so far I haven't figured out how:
testString = 'this-is,-toronto.-and-this-is,-boston'
re.sub(r',_([a-z])', r',_??', testString)
Thanks!
re.sub can take a function which returns the replacement string:
import re
s = 'this-is,-toronto.-and-this-is,-boston'
t = re.sub(r',-[a-z]', lambda x: x.group(0).upper(), s)
print(t)
prints
this-is,-Toronto.-and-this-is,-Boston
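The version above uppercases the whole match, which is harmless here because ',-' has no case. If the prefix could contain letters, a lookbehind keeps it out of the match so that only the letter is changed. A minimal sketch (the function name capitalize_after is made up for illustration; Python's re requires the lookbehind to have a fixed width, which a literal prefix satisfies):

```python
import re


def capitalize_after(text, prefix=',-'):
    # (?<=...) requires the prefix to come right before the letter
    # without making the prefix part of the match.
    pattern = '(?<=' + re.escape(prefix) + ')[a-z]'
    return re.sub(pattern, lambda m: m.group(0).upper(), text)


print(capitalize_after('this-is,-toronto.-and-this-is,-boston'))
# this-is,-Toronto.-and-this-is,-Boston
```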
