Anyone know a good regex to remove extra whitespace? [duplicate] - python

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Substitute multiple whitespace with single whitespace in Python
I'm trying to figure out how to write a regex that, given the string:
"hi   this  is a    test"
turns it into
"hi this is a test"
where each run of whitespace is normalized to a single space.
Any ideas? Thanks so much.

import re
re.sub(r"\s+", " ", string)

Does it need to be a regex?
I'd just use
import re
new_string = " ".join(re.split(r'\s+', old_string.strip()))
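To compare the two approaches above, here is a minimal sketch. The function names and the sample string are made up for illustration. It assumes you want tabs and newlines collapsed too, and it shows one difference worth knowing: re.sub leaves a single leading/trailing space in place, while split/join drops it.

```python
import re


def normalize_with_regex(text):
    # Collapse every run of whitespace (spaces, tabs, newlines) into one space.
    # Leading/trailing runs become a single space rather than disappearing.
    return re.sub(r"\s+", " ", text)


def normalize_with_split(text):
    # str.split() with no arguments splits on whitespace runs and ignores
    # leading/trailing whitespace, so no regex is needed here.
    return " ".join(text.split())


sample = "  hi   this \t is  a\n test "
print(repr(normalize_with_regex(sample)))
print(repr(normalize_with_split(sample)))
```

If you want the regex version without the outer spaces, calling .strip() on its result should give the same string as the split/join version.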

sed
sed 's/[ ]\{2,\}/ /g'

Related

How can I split String by two or more new lines without using libraries? [duplicate]

This question already has answers here:
How do I split a string into a list of words?
(9 answers)
Closed 2 years ago.
I have this String:
Hello World.\n
I'm very happy today.\n\n\n\n\n
How are you?\n\n\n
Bye.
And I want to split it by two or more new lines without using any libraries.
Output:
["Hello World.\n I'm very happy today", 'How are you?', 'Bye']
Python's built-in string method split() gets most of the way there, without the need to import anything:
inp = "Hello World.\nI'm very happy today.\n\n\n\n\nHow are you?\n\n\nBye."
terms = inp.split('\n\n')
print(terms)
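This prints the following. Note that runs of more than two newlines leave an empty string and leading '\n' characters behind: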
["Hello World.\nI'm very happy today.", '', '\nHow are you?', '\nBye.']
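One way to clean up that result without importing anything is to strip the leftover newlines from each piece and drop the empty ones. This is only a sketch: split_on_blank_lines is a hypothetical helper name, and it assumes that single newlines inside a piece should be kept.

```python
def split_on_blank_lines(text):
    # Split on pairs of newlines, then tidy each piece. A run of three or
    # more newlines leaves empty strings or one stray "\n" at the start of
    # the next piece, so strip newlines and skip pieces that end up empty.
    parts = []
    for chunk in text.split("\n\n"):
        chunk = chunk.strip("\n")
        if chunk:
            parts.append(chunk)
    return parts


inp = "Hello World.\nI'm very happy today.\n\n\n\n\nHow are you?\n\n\nBye."
print(split_on_blank_lines(inp))
```

Single newlines, such as the one after "Hello World.", stay inside their piece, because only the edges of each piece are stripped.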

Replace multiple characters using re.sub [duplicate]

This question already has answers here:
What special characters must be escaped in regular expressions?
(13 answers)
Remove specific characters from a string in Python
(26 answers)
Closed 2 years ago.
s = "Bob hit a ball!, the hit BALL flew far after it was hit."
I need to get rid of the following characters from s
!?',;.
How to achieve this with re.sub?
re.sub(r"!|\?|'|,|;|."," ",s)  # doesn't work: it replaces every character with a space
Can someone tell me what's wrong with this?
The problem is that an unescaped . matches any character, not just a literal '.'. You need to escape it as well: \.
A better way, though, is to skip the OR operator | and use a character class instead, inside which . and ? need no escaping:
re.sub(r"[!?',;.]", ' ', s)
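As a quick check, the sketch below runs both the character-class version and the alternation with the dot escaped on the string from the question. The variable names are chosen here for illustration only.

```python
import re

s = "Bob hit a ball!, the hit BALL flew far after it was hit."

# Character class: '.' and '?' are literal inside [...], so no escaping needed.
with_class = re.sub(r"[!?',;.]", " ", s)

# Alternation: here '?' and '.' are metacharacters and must be escaped.
with_alternation = re.sub(r"!|\?|'|,|;|\.", " ", s)

print(repr(with_class))
print(with_class == with_alternation)
```

Each removed character becomes one space, so "ball!, the" ends up with three spaces between the words. Replace with an empty string instead if you want the characters dropped outright.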

How to find a string inside of a string space insensitive? [duplicate]

This question already has answers here:
Does Python have a string 'contains' substring method?
(10 answers)
Closed 5 years ago.
For example if a string contains:
odfsdlkfn dskfThe Moonaosjfsl dflkfn
How can I check to see if it contains "The Moon"?
What I have currently been doing (which does not work) is:
if string.find("The Moon") != -1:
    doSomething
Is there any way to do this?
Thanks!
Simple:
string = 'odfsdlkfn dskfThe Moonaosjfsl dflkfn'
if 'The Moon' in string:
    dosomething
You could also use regular expressions:
import re
text = "odfsdlkfn dskfThe Moonaosjfsl dflkfn"
if re.search("The Moon", text):
    ...
In this case, you could ignore case with re.search(pattern, text, re.IGNORECASE) if needed.
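Here is a small sketch that puts the plain substring test and the case-insensitive regex search side by side. The contains_phrase helper and the \s* pattern are assumptions added for illustration: the title asks for a "space insensitive" match, so the helper also tolerates any amount of whitespace between the two words. That may or may not be what the asker meant.

```python
import re

text = "odfsdlkfn dskfThe Moonaosjfsl dflkfn"

# Plain substring test: exact case and exact spacing.
found_exact = "The Moon" in text

# Regex search that ignores case.
found_any_case = re.search("the moon", text, re.IGNORECASE) is not None


def contains_phrase(haystack):
    # Hypothetical helper: ignores case and allows zero or more whitespace
    # characters between "the" and "moon".
    return re.search(r"the\s*moon", haystack, re.IGNORECASE) is not None


print(found_exact, found_any_case, contains_phrase("xxTHE   MOONxx"))
```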

Regex Python - Backslash [duplicate]

This question already has answers here:
What exactly is a "raw string regex" and how can you use it?
(7 answers)
Closed 6 years ago.
I am trying to remove tags in text that are identified by a backslash. For example, for the phrase 'Hello \tag world', I'd like to return the phrase 'Hello world'. I've tried the following but it doesn't get rid of the '\tag'.
print re.sub('\\[A-Za-z]+',' ',text)
I'm sure it's something simple, but I can't seem to figure it out.
Thanks for any help you can give!
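It must be:
re.sub('\\\\[A-Za-z]+',' ',text)
Otherwise, the Python string literal '\\' becomes a single backslash, which the regex engine reads as escaping the '[' that follows instead of matching a backslash. The raw string r'\\[A-Za-z]+' is equivalent and easier to read.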
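The sketch below shows the same fix with raw strings. Note the assumption about the input: written as a normal literal, 'Hello \tag world' would contain a tab, because \t is an escape sequence, so the text is written as a raw string here. The second substitution, which also swallows the whitespace after the tag, is an extra suggestion for getting exactly 'Hello world'.

```python
import re

# Raw string so that "\t" stays a backslash followed by "t", not a tab.
text = r"Hello \tag world"

# The doubled-up literal and the raw-string literal are the same pattern.
same_pattern = '\\\\[A-Za-z]+' == r'\\[A-Za-z]+'

# Replacing the tag with a space leaves three spaces between the words.
replaced = re.sub(r"\\[A-Za-z]+", " ", text)

# Removing the tag together with the whitespace after it gives single spacing.
removed = re.sub(r"\\[A-Za-z]+\s*", "", text)

print(same_pattern, repr(replaced), repr(removed))
```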

how do i check for trail whitespace on python? [duplicate]

This question already has answers here:
How to define a function that would check if a string have whitespaces after the sentence is finished?
(2 answers)
Closed 8 years ago.
I am trying to write a function that checks for trailing whitespace without removing it, but I have no idea how to do that. Can somebody show me?
Thank you.
Using str.endswith:
>>> 'with trailing space '.endswith(' ')
True
>>> 'without trailing space'.endswith(' ')
False
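endswith(' ') only detects a trailing space character. If tabs or newlines should count too, one possible sketch is to compare the string with its right-stripped copy. has_trailing_whitespace is a hypothetical name, and nothing is modified, since rstrip() returns a new string.

```python
def has_trailing_whitespace(text):
    # rstrip() returns a copy with trailing whitespace removed, so the
    # original differs from the copy exactly when it ends in whitespace.
    return text != text.rstrip()


print(has_trailing_whitespace("with trailing space "))
print(has_trailing_whitespace("with trailing tab\t"))
print(has_trailing_whitespace("without trailing space"))
```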
