Replace multiple characters using re.sub [duplicate] - python

s = "Bob hit a ball!, the hit BALL flew far after it was hit."
I need to remove the following characters from s:
!?',;.
How can I achieve this with re.sub?
re.sub(r"!|\?|'|,|;|."," ",s)  # doesn't work: it replaces every character with a space
Can someone tell me what's wrong with this?

The problem is that an unescaped . matches any character (except a newline, by default), not just a literal '.'. You need to escape it as well: \.
A better approach is to drop the alternation operator | and use a character class instead, where . and ? are literal and need no escaping:
re.sub(r"[!?',;.]", ' ', s)
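As a minimal sketch of the difference, the snippet below runs the broken pattern, the alternation with the dot escaped, and the character class against the string from the question. The variable names are illustrative, and the last call, which replaces with an empty string rather than a space, is an assumption about what "get rid of" means:

```python
import re

s = "Bob hit a ball!, the hit BALL flew far after it was hit."

# Unescaped dot: the last alternative matches every character.
broken = re.sub(r"!|\?|'|,|;|.", " ", s)

# Same alternation with the dot escaped.
escaped = re.sub(r"!|\?|'|,|;|\.", " ", s)

# Character class: . and ? are literal inside [...].
spaced = re.sub(r"[!?',;.]", " ", s)

# Assumption: replace with "" to delete the characters instead of leaving spaces.
removed = re.sub(r"[!?',;.]", "", s)

print(removed)
```

The escaped alternation and the character class give the same result. The class is shorter and has fewer places to forget an escape.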

Related

(Python 2.7) Regex Replace with similar character in replacement string as in pattern [duplicate]

import re
pattern = re.compile(r"/")
a = "a/b"
I tried
re.sub(pattern, '\/', a)
#(also, a.replace('/', '\/'))
#output
a\\/b
What I want is
a\/b
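a.replace('/', '\\/')
The first \ is an escape character, so you need to type it twice to get a literal \. The a\\/b you are seeing is the repr of the string as echoed by the interactive interpreter; print shows the actual contents, a\/b.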
If using a regex is not compulsory, str.replace is enough:
a = "a/b"
a = a.replace("/", "\\/")  # "\\/" is one backslash followed by a slash
print(a)  # a\/b
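A short sketch of why the doubled backslash shows up: the string holds a single backslash, and only its repr doubles it. The names via_replace and via_sub are illustrative. The re.sub call relies on the documented rule that \\ in a replacement template stands for one literal backslash.

```python
import re

a = "a/b"

# "\\/" in source code is two characters: one backslash and one slash.
via_replace = a.replace("/", "\\/")

# In a re.sub replacement template, \\ stands for one literal backslash,
# so the raw string r"\\/" also inserts backslash + slash.
via_sub = re.sub(r"/", r"\\/", a)

print(via_replace)        # a\/b    (actual contents)
print(repr(via_replace))  # 'a\\/b' (what the interactive prompt echoes)
print(len(via_replace))   # 4
```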

Python regular expression to capture last word with missing line feed [duplicate]

I need to capture words separated by tabs.
The expression (.*?)[\t|\n] works well, except for the last line, where the line feed is missing. Can anyone suggest a modification of the regular expression that also matches the last word, i.e. Cheyenne?
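Replace [\t|\n] with (\t|$). If the input spans several lines, also enable multiline mode (re.MULTILINE in Python) so that $ matches at the end of every line and not just at the end of the input.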
BTW, [\t|\n] is a character class, so the pipe | is literal here. You probably meant [\t\n].
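The sketch below uses a variant of the pattern, not the exact one from the answer: (.+?) instead of (.*?) to avoid empty matches, and an explicit \n alternative so no multiline flag is needed. Only "Cheyenne" comes from the question; the other sample words are made up for illustration.

```python
import re

# Hypothetical sample: only "Cheyenne" comes from the question.
text = "Denver\tBoise\nAustin\tCheyenne"  # no trailing line feed

# (.+?) needs at least one character, which avoids empty matches;
# (?:\t|\n|$) accepts a tab, a line feed, or the end of the string.
words = re.findall(r"(.+?)(?:\t|\n|$)", text)
print(words)  # ['Denver', 'Boise', 'Austin', 'Cheyenne']

# Inside a character class the pipe is an ordinary character.
pipes = re.findall(r"[\t|\n]", "a|b")
print(pipes)  # ['|']
```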

Regex Python - Backslash [duplicate]

I am trying to remove tags in text that are identified by a backslash. For example, for the phrase 'Hello \tag world', I'd like to return the phrase 'Hello world'. I've tried the following but it doesn't get rid of the '\tag'.
print re.sub('\\[A-Za-z]+',' ',text)
I'm sure it's something simple, but I can't seem to figure it out.
Thanks for any help you can give!
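The pattern must be:
re.sub('\\\\[A-Za-z]+',' ',text)
Otherwise the Python literal '\\' produces a single backslash, which the regex engine reads as an escape for the following [, so the pattern looks for a literal [ instead of a backslash. The raw string r'\\[A-Za-z]+' is equivalent and easier to read.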
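A minimal sketch of the fix, assuming the text really contains a backslash followed by "tag". Note that an ordinary literal 'Hello \tag world' would contain a tab instead, which is why the sample is written as a raw string. The second pattern, which also consumes trailing whitespace, is an addition for illustration and not part of the answer.

```python
import re

# Raw literal: keeps the backslash and the 't' instead of producing a tab.
text = r"Hello \tag world"

# r"\\" is the regex for one literal backslash.
replaced = re.sub(r"\\[A-Za-z]+", " ", text)

# Assumption: also swallow the whitespace after the tag so that
# no extra spaces are left behind.
cleaned = re.sub(r"\\[A-Za-z]+\s*", "", text)

print(repr(replaced))  # 'Hello   world' (three spaces)
print(cleaned)         # Hello world
```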

How to deal with strings containing character codes? [duplicate]

\201 is interpreted by Python as an octal character escape. What is the best way to prevent this in strings?
s = '\2016'
s = s.replace('\\', '/')
print s #6
If you have a string literal with a backslash in it, you can escape the backslash:
s = '\\2016'
or you can use a "raw" string:
s = r'\2016'
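To make the difference visible, this small sketch compares the three literals. The variable names are illustrative.

```python
# \201 is an octal escape: one character (code 0o201) followed by '6'.
octal = '\2016'

# Both of these hold a backslash followed by the four digits.
escaped = '\\2016'
raw = r'\2016'

print(len(octal))              # 2
print(len(escaped), len(raw))  # 5 5
print(escaped == raw)          # True
print(raw)                     # \2016
```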

Python typing "\" [duplicate]

I want to print a string that ends with the \ character, but the problem is it makes the following ") part of the string, so won't work. Is there any way to end a string with \ being considered a regular character?
print("somestuffhere\") and this is still part of that string...
Put another \ character in front of it. This escapes the escape character, like so:
print("somestuffhere\\")
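As a brief sketch, the doubled backslash in the source is stored as a single character. A raw string is not a shortcut here, because a raw literal cannot end in a single backslash. The names s and t are illustrative.

```python
s = "somestuffhere\\"
print(s)       # somestuffhere\
print(len(s))  # 14

# A raw string literal cannot end in a single backslash
# (r"somestuffhere\" is a syntax error), so one workaround
# is to append the backslash separately.
t = r"somestuffhere" + "\\"
print(s == t)  # True
```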
