This question already has answers here:
How to match a whole word with a regular expression?
(4 answers)
Closed 3 years ago.
I want to replace only a specific word in a string. However, some other words contain that word, and I don't want them to be changed.
For example, in the string z below I only want to replace x with y. How can I do that?
x = "the"
y = "a"
z = "This is the thermometer"
import re
pattern=r'\bthe\b' # \b - start and end of the word
repl='a'
string = 'This is the thermometer'
string=re.sub(pattern, repl, string)
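In your case you can use re.sub(r'\b' + re.escape(x) + r'\b', y, z). Passing x alone as the pattern would also replace the "the" inside "thermometer".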
You can read the documentation here for more information.
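To make the word-boundary approach reusable with the variables from the question, here is a minimal sketch. The helper name replace_whole_word is hypothetical, and it assumes the word to replace starts and ends with a word character (otherwise \b will not behave as a whole-word guard).

```python
import re

def replace_whole_word(text, old, new):
    """Replace `old` with `new` only where `old` stands as a whole word."""
    # re.escape keeps characters such as '.' or '*' in `old` literal.
    pattern = r'\b' + re.escape(old) + r'\b'
    # A function replacement inserts `new` verbatim (no backslash processing).
    return re.sub(pattern, lambda match: new, text)

x = "the"
y = "a"
z = "This is the thermometer"
print(replace_whole_word(z, x, y))  # This is a thermometer
```

The "the" inside "thermometer" is left alone because there is no word boundary after it.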
This question already has answers here:
How to remove all characters after a specific character in python?
(10 answers)
Closed 1 year ago.
I want to remove all the characters after a non-alphanumeric character ('_') within a string.
For example:
Petr_;Y -> Petr
ČEZ_^(České_energetické_závody) -> ČEZ
I tried:
''.join(c for c in mystring if c.isalnum())
But this way I'm only stripping the non-alphanumeric characters themselves.
Help would be appreciated.
You may want to use the .split() method on strings.
new_string = your_string.split('_',1)[0]
This way you keep only what's before the first '_'.
Searching the index of first occurrence of "_" will do:
s1 = "Petr_;Y"
s2 = "ČEZ_^(České_energetické_závody)"
s11 = s1[:s1.index("_")]
s22 = s2[:s2.index("_")]
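One caveat with the index approach: str.index raises ValueError when the string contains no "_". A minimal sketch that tolerates this case, using str.partition (the helper name before_underscore is hypothetical):

```python
def before_underscore(s):
    """Return the part of `s` before the first '_' (all of `s` if there is none)."""
    # partition always returns a 3-tuple: (head, separator, tail).
    head, _sep, _tail = s.partition("_")
    return head

print(before_underscore("Petr_;Y"))                          # Petr
print(before_underscore("ČEZ_^(České_energetické_závody)"))  # ČEZ
print(before_underscore("NoUnderscore"))                     # NoUnderscore
```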
This question already has answers here:
What do ^ and $ mean in a regular expression?
(2 answers)
Closed 2 years ago.
I've got a problem with carets and dollar signs in Python.
I want to find every word which starts with a number and ends with a letter.
Here is what I've tried already:
import re
text = "Cell: 415kkk -555- 9999ll Work: 212-555jjj -0000"
phoneNumRegex = re.compile(r'^\d+\w+$')
print(phoneNumRegex.findall(text))
Result is an empty list:
[]
The result I want:
415kkk, 9999ll, 555jjj
Where is the problem?
Problems with your regex:
^...$ means you only want matches that span the whole string. Get rid of that.
r'\w+' means "one or more word characters", that is letters and digits (both upper and lower case) plus the underscore '_'. So it would also match a digits-only token like '5555': r'\d+' takes '555' and r'\w+' takes the remaining '5', which adds it to the result.
Use this instead:
import re
text = "Cell: 415kkk -555- 9999ll Work: 212-555jjj -0000"
phoneNumRegex = re.compile(r'\b\d+[a-zA-Z]+\b')
print(phoneNumRegex.findall(text))
Output:
['415kkk', '9999ll', '555jjj']
The '\b' are word boundaries, so only whole words match: you do not match '123abc' inside 'x123abc'.
Further reading:
re-syntax
regex101.com - Regextester website that can handle python syntax
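To see the effect of each change side by side, this small sketch runs three variants of the pattern on the text from the question. The variable names anchored, loose and bounded are only for illustration.

```python
import re

text = "Cell: 415kkk -555- 9999ll Work: 212-555jjj -0000"

# Anchored to the whole string: the text starts with 'C', so nothing matches.
anchored = re.findall(r'^\d+\w+$', text)

# Anchors removed, but \w also accepts digits, so digit-only tokens slip in.
loose = re.findall(r'\d+\w+', text)

# Letters only after the digits, delimited by word boundaries.
bounded = re.findall(r'\b\d+[a-zA-Z]+\b', text)

print(anchored)  # []
print(loose)     # ['415kkk', '555', '9999ll', '212', '555jjj', '0000']
print(bounded)   # ['415kkk', '9999ll', '555jjj']
```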
This question already has answers here:
Escaping regex string
(4 answers)
Closed 3 years ago.
I am trying to stem the words in a text column of a DataFrame.
data is a DataFrame, KARMA is the text column, and zargan is a dict mapping each word to its root.
for a in range(1,100000):
    for j in data.KARMA[a].split():
        pattern = r'\b'+j+r'\b'
        data.KARMA[a] = re.sub(pattern, str(zargan.get(j,j)),data.KARMA[a])
print(data.KARMA[1])
I want to replace each word in the texts with its root.
Looks like j contains some regular expression special character like *. If you want it to be interpreted as literal text, you can say
pattern = r'\b'+re.escape(j)+r'\b'
and possibly the same for r if it should similarly be coerced into a literal string.
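As an alternative that avoids building one pattern per word, here is a minimal sketch that makes a single pass over the text and looks each word up in the dict through a replacement function. The function name stem_text and the sample dict are hypothetical, and it assumes words can be tokenized with \w+, which differs from the whitespace split in the question.

```python
import re

def stem_text(text, roots):
    """Replace every word found in `roots` with its root, in a single pass."""
    # Each \w+ run is looked up in the dict; unknown words are kept as they are.
    return re.sub(r'\w+', lambda m: str(roots.get(m.group(0), m.group(0))), text)

# Hypothetical sample standing in for the `zargan` dict of the question.
zargan = {"running": "run", "cats": "cat"}

print(stem_text("cats running c++ fast", zargan))  # cat run c++ fast
```

Because no word from the text is ever compiled into a pattern, special characters such as * or + cannot break the regular expression. On a DataFrame this could be applied per row, for example with Series.apply.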
This question already has answers here:
How can I tell if a string repeats itself in Python?
(13 answers)
Closed 3 years ago.
I need to split a string at its repeated substring.
For example:
My string is "howhowhow"
I need the output 'how,how,how'.
I can't use 'how' directly in my regular expression because my input varies. I need to check whether the string repeats a sequence of characters and, if so, split it at that sequence.
import re
string = "howhowhow"
print(','.join(re.findall(re.search(r"(.+?)\1", string).group(1), string)))
OUTPUT
howhowhow -> how,how,how
howhowhowhow -> how,how,how,how
testhowhowhow -> how,how,how # not clearly defined by OP
The pattern is non-greedy so that howhowhowhow doesn't map to howhow,howhow which is also legitimate. Remove the ? if you prefer the longest match.
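The one-liner above raises AttributeError when the string contains no repetition, because re.search returns None. A minimal sketch that guards against this and exposes the greedy/non-greedy choice (the function name split_repeats and its greedy parameter are hypothetical):

```python
import re

def split_repeats(s, greedy=False):
    """Join the repeated unit of `s` with commas; return `s` unchanged if nothing repeats."""
    unit_pattern = r"(.+)\1" if greedy else r"(.+?)\1"
    match = re.search(unit_pattern, s)
    if match is None:
        return s
    # Escape the unit so characters like '.' or '*' are matched literally.
    unit = match.group(1)
    return ",".join(re.findall(re.escape(unit), s))

print(split_repeats("howhowhowhow"))               # how,how,how,how
print(split_repeats("howhowhowhow", greedy=True))  # howhow,howhow
print(split_repeats("abc"))                        # abc
```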
lengthofRepeatedChar = 3
str1 = 'howhowhow'
HowmanyTimesRepeated = int(len(str1)/lengthofRepeatedChar)
((str1[:lengthofRepeatedChar]+',')*HowmanyTimesRepeated)[:-1]
'how,how,how'
This works when you know the length of the repeated substring.
This question already has answers here:
Replace all the occurrences of specific words
(4 answers)
Find substring in string but only if whole words?
(8 answers)
Closed 6 years ago.
I want to replace certain words in a string but keep getting the following result:
String: "This is my sentence."
User types in what they want to replace: "is"
User types what they want to replace word with: "was"
New string: "Thwas was my sentence."
How can I make sure it only replaces the word "is" instead of any string of the characters it finds?
Code function:
import string
def replace(word, new_word):
    new_file = string.replace(word, new_word[1])
    return new_file
Any help is much appreciated, thank you!
Use a regular expression word boundary:
import re
print(re.sub(r"\bis\b","was","This is my sentence"))
This is better than a mere split because it works with punctuation as well:
print(re.sub(r"\bis\b","was","This is, of course, my sentence"))
gives:
This was, of course, my sentence
Note: don't skip the r prefix, or your regex would be corrupt: \b would be interpreted as backspace.
A simple but less general solution (as noted by Jean-François Fabre) that does not use regular expressions:
' '.join(x if x != word else new_word for x in string.split())
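To illustrate where the split-based approach falls short, this sketch wraps it in a function (the name replace_by_split is hypothetical) and compares it with the word-boundary regex on a sentence with punctuation.

```python
import re

def replace_by_split(text, word, new_word):
    """Replace whitespace-delimited tokens that are exactly equal to `word`."""
    return ' '.join(new_word if token == word else token for token in text.split())

sentence = "This is, of course, my sentence"

# The token is "is," (with the comma), so it is not equal to "is".
print(replace_by_split(sentence, "is", "was"))  # This is, of course, my sentence

# The word boundary ignores the trailing comma.
print(re.sub(r"\bis\b", "was", sentence))       # This was, of course, my sentence
```

Note that split() followed by ' '.join also collapses any run of whitespace into a single space.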