python regex sub replace the whole string [duplicate] - python

This question already has answers here:
What Regex would capture everything from ' mark to the end of a line?
(7 answers)
Closed 2 years ago.
Let's say I want to replace a string that starts with abc with replacement:
import re
s = 'abcdefg'
re.sub(r'^abc', 'replacement', s)
replacementdefg
What should I do to only return replacement instead of replacementdefg?

Match the rest of the string with a .*
import re
s = 'abcdefg'
s = re.sub(r'^abc.*', 'replacement', s)
print(s)
output:
replacement
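One caveat with this approach: by default . does not match a newline, so .* stops at the first line break. Below is a minimal sketch, assuming the input may span several lines and that the prefix should be treated as literal text. The helper name replace_if_prefixed is made up for illustration.

```python
import re

def replace_if_prefixed(s, prefix="abc", replacement="replacement"):
    """Return `replacement` if `s` starts with `prefix`, else return `s` unchanged."""
    # re.escape keeps characters such as '.' or '?' in the prefix literal.
    pattern = "^" + re.escape(prefix) + ".*"
    # re.DOTALL lets '.' match newlines, so '.*' consumes the rest of the string.
    return re.sub(pattern, replacement, s, flags=re.DOTALL)

print(replace_if_prefixed("abcdefg"))   # replacement
print(replace_if_prefixed("abc\ndef"))  # replacement
print(replace_if_prefixed("xyzabc"))    # xyzabc (no match, returned unchanged)
```

Without re.DOTALL, the second call would leave the text after the newline in place.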

Related

Python and regex, keep text inside a pattern and remove the outside [duplicate]

This question already has answers here:
How to replace only part of the match with python re.sub
(6 answers)
Closed 5 months ago.
I have text that looks like the following:
my_string = "a + (foo(b)*foo(c))"
And I am trying to remove the foo(...) wrappers while keeping what is inside them. So the desired output would look like
"a + (b*c)"
I found the following, which finds all the patterns inside the string but does not remove them in place.
re.findall(r'foo\((.*?)\)', my_string)
my_string = "a + (foo(b)*foo(c))"
# \w+ to remove foo, that is followed by (
# \( , \) to remove () around 'b' and 'c'
re.sub(r'\w+\((\w)\)',r'\1', my_string)
a + (b*c)
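Note that \w inside the group matches exactly one character, so the pattern above only unwraps single-character arguments. Here is a sketch of a slightly more general variant, assuming arguments contain no nested parentheses. The helper name unwrap_calls and its name parameter are hypothetical.

```python
import re

def unwrap_calls(expr, name="foo"):
    """Replace every name(arg) in `expr` with just arg."""
    # \b stops the name from matching as the tail of a longer identifier.
    # [^()]* matches an argument of any length that has no nested parentheses.
    pattern = r"\b" + re.escape(name) + r"\(([^()]*)\)"
    return re.sub(pattern, r"\1", expr)

print(unwrap_calls("a + (foo(b)*foo(c))"))  # a + (b*c)
print(unwrap_calls("foo(bar) + baz(x)"))    # bar + baz(x)
```

Naming the function explicitly, instead of using \w+, leaves other calls such as baz(x) untouched.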

Is there any way using regular expression to capture this group. result should be 2 [duplicate]

This question already has answers here:
How to use regex to find all overlapping matches
(5 answers)
Closed 2 years ago.
import re
pattern = r'faf'
string = 'fafaf'
print(len(re.findall(pattern, string)))
This prints 1, but the required answer is 2.
You want to use a positive lookahead r"(?=(<pattern>))" to find overlapping patterns:
import re
pattern = r"(?=(faf))"
string = "fafaf"
print(len(re.findall(pattern, string)))
You can test regexes here: https://regex101.com/r/3FxCok/1
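The lookahead works because it matches an empty string at each position where the pattern could start, so nothing is consumed and the next attempt begins one character later. A small sketch that also reports where the matches start (the helper name find_overlapping is an assumption, not part of the re module):

```python
import re

def find_overlapping(pattern, text):
    """Return the start index of every match of `pattern`, overlaps included."""
    # Each lookahead match is empty, so the scan advances one character
    # at a time instead of jumping past the matched text.
    return [m.start() for m in re.finditer(f"(?={pattern})", text)]

starts = find_overlapping(r"faf", "fafaf")
print(starts, len(starts))  # [0, 2] 2
```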

Python regular expression to get words enclosed in {} [duplicate]

This question already has answers here:
Capture contents inside curly braces
(2 answers)
Closed 3 years ago.
How do I get all the words that are enclosed in {} in a string?
For example:
my_string = "select * from abc where file_id = {some_id} and ghg='0000' and number={some_num} and date={some_dt}"
output should be like:
[some_id,some_num,some_dt]
import re
my_string = "select * from abc where file_id = {some_id} and ghg='0000' and number={some_num} and date={some_dt}"
result = re.findall(r'{(.+?)}', my_string)
print(result)
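Since it's words you are after, match word characters only:
import re
ans = re.findall(r"{(\w+)}", my_string)
print(ans)
\w matches letters, digits, and the underscore, so \w+ captures one or more of them, and the surrounding () makes findall return only the text between the braces. Avoid the range [A-z] here: it spans every ASCII code from A to z, which includes [, \, ], ^, _ and ` as well as the letters.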
Output:
['some_id', 'some_num', 'some_dt']
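A quick way to check what a range like [A-z] really accepts is to test every ASCII character against it. The sketch below does that and compares it with an explicit class. The sample string with {a^b} is invented for illustration.

```python
import re

# Collect every ASCII character that [A-z] accepts but that is not a letter.
unexpected = [
    chr(code)
    for code in range(128)
    if re.fullmatch(r"[A-z]", chr(code)) and not chr(code).isalpha()
]
print(unexpected)  # ['[', '\\', ']', '^', '_', '`']

# An explicit class states the intent: letters and underscore only.
sample = "file_id = {some_id} and bad = {a^b}"
strict = re.findall(r"{([A-Za-z_]+)}", sample)
loose = re.findall(r"{([A-z]+)}", sample)
print(strict)  # ['some_id']
print(loose)   # ['some_id', 'a^b']
```

The underscore falls inside the A to z range, which is the only reason [A-z]+ matches names such as some_id.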

Regex can't escape question mark? [duplicate]

This question already has an answer here:
match trailing slash with Python regex
(1 answer)
Closed 8 years ago.
I can't match the question mark character although I escaped it.
I tried escaping with multiple backslashes and also using re.escape().
What am I missing?
Code:
import re
text = 'test?'
result = re.match(r'\?', text)
print("input: " + text)
print("found: " + str(result))
Output:
input: test?
found: None
re.match only matches a pattern at the beginning of the string; as the docs put it:
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding match object.
so, either:
>>> re.match(r'.*\?', text).group(0)
'test?'
or use re.search, which scans the whole string for the first match:
>>> re.search(r'\?', text).group(0)
'?'
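The difference can be seen side by side in a short script. This is only a sketch using the same text value as the question. The variable names are arbitrary.

```python
import re

text = "test?"

# match() anchors at position 0, where the character is "t", so it fails.
at_start = re.match(r"\?", text)
# search() scans forward and finds the question mark at index 4.
anywhere = re.search(r"\?", text)
# re.escape builds the same escaped pattern from a literal string.
escaped = re.search(re.escape("?"), text)

print(at_start)                            # None
print(anywhere.group(0), anywhere.start()) # ? 4
print(escaped.span())                      # (4, 5)
```

So the escaping in the question was already correct; only the choice of re.match was the problem.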

Split string based on regexp without consuming characters [duplicate]

This question already has answers here:
Non-consuming regular expression split in Python
(2 answers)
Closed 8 years ago.
I would like to split a string like the following
text="one,two;three.four:"
into the list
textOut=["one", ",two", ";three", ".four", ":"]
I have tried with
import re
textOut = re.split(r'(?=[.:,;])', text)
But this does not split anything.
I would use re.findall here instead of re.split:
>>> from re import findall
>>> text = "one,two;three.four:"
>>> findall(r"(?:^|\W)\w*", text)
['one', ',two', ';three', '.four', ':']
Below is a breakdown of the Regex pattern used above:
(?: # The start of a non-capturing group
^|\W # The start of the string or a non-word character (symbol)
) # The end of the non-capturing group
\w* # Zero or more word characters (characters that are not symbols)
I don't know what else can occur in your string, but will this do the trick?
>>> import re
>>> s = 'one,two;three.four:'
>>> [x for x in re.findall(r'[.,;:]?\w*', s) if x]
['one', ',two', ';three', '.four', ':']
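On Python 3.7 and later, re.split accepts patterns that can match an empty string, so the lookahead from the question should work as written on those versions. Older versions ignored empty matches, which is consistent with the behaviour described in the question. A minimal sketch, assuming Python 3.7 or newer:

```python
import re

text = "one,two;three.four:"

# The lookahead matches the empty string just before each separator,
# so the separator stays attached to the piece that follows it.
parts = re.split(r"(?=[.:,;])", text)
print(parts)  # ['one', ',two', ';three', '.four', ':']
```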
