re.sub() doesn't replace middle of string [duplicate] - python

This question already has answers here:
How to replace only the contents within brackets using regular expressions?
(2 answers)
Closed 6 years ago.
I am trying to replace the contents of brackets in a string with nothing. The code I am using right now is like this:
tstString = "OUTPUT:TRACK[:STATE]?"
modString = re.sub("[\[\]]","",tstString)
When I print the results, I get:
OUTPUT:TRACK:STATE?
But I want the result to be:
OUTPUT:TRACK?
How can I do this?

This should work. The pattern now matches the brackets together with whatever is inside them. Note the ? after the *: it makes the * non-greedy, so the match stops at the first closing bracket.
import re
tstString = "OUTPUT:TRACK[:STATE]?"
modString = re.sub(r"\[.*?\]", "", tstString)
print(modString)  # OUTPUT:TRACK?

Your regular expression "[\[\]]" says 'any of these characters: "[", "]"'.
But you want to delete what's between the square brackets too, so you should use something like r"\[:\w+\]". It says '[, then :, then one or more word characters (letters, digits, or underscores), then ]'.
And please, always use raw strings (r in front of quotes) when working with regular expressions to avoid funny things connected with Python string processing.
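A minimal sketch comparing the three patterns discussed in this thread on the string from the question. The variable names are only illustrative.

```python
import re

tst_string = "OUTPUT:TRACK[:STATE]?"

# Character class: removes only the bracket characters themselves.
only_brackets = re.sub(r"[\[\]]", "", tst_string)

# Non-greedy match: removes the brackets and everything between them.
non_greedy = re.sub(r"\[.*?\]", "", tst_string)

# Stricter variant: "[", then ":", then word characters, then "]".
strict = re.sub(r"\[:\w+\]", "", tst_string)

print(only_brackets)  # OUTPUT:TRACK:STATE?
print(non_greedy)     # OUTPUT:TRACK?
print(strict)         # OUTPUT:TRACK?
```

The stricter variant leaves bracketed text untouched if it does not look like ":WORD", which may or may not be what you want.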

Related

What difference does round brackets in regular expression make? [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 4 years ago.
I am currently going through pythonchallenge.com, and am trying to write code that searches for a lowercase letter with exactly three uppercase letters on each side of it. I got stuck on writing a regular expression for it. This is what I have tried:
import re
#text is in https://pastebin.com/pAFrenWN since it is too long
p = re.compile("[^A-Z]+[A-Z]{3}[a-z][A-Z]{3}[^A-Z]+")
print("".join(p.findall(text)))
This is what I got with it:
dqIQNlQSLidbzeOEKiVEYjxwaZADnMCZqewaebZUTkLYNgouCNDeHSBjgsgnkOIXdKBFhdXJVlGZVme
gZAGiLQZxjvCJAsACFlgfe
qKWGtIDCjn
I later searched for the solution, which had this regular expression:
p = re.compile("[^A-Z]+[A-Z]{3}([a-z])[A-Z]{3}[^A-Z]+")
So there are round brackets around [a-z], and I couldn't figure out what difference they make. I would like some explanation of this.
Use Parentheses for Grouping and Capturing
By placing part of a regular expression inside round brackets or
parentheses, you can group that part of the regular expression
together. This allows you to apply a quantifier to the entire group or
to restrict alternation to part of the regex.
https://www.regular-expressions.info/brackets.html
Basically, the regex engine finds every substring that matches the whole search pattern, but when the pattern contains a group in (), findall returns only the part captured by that group instead of the whole match.
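A small sketch of the difference, using a short made-up sample string in place of the long puzzle text:

```python
import re

# Made-up sample; the real puzzle text is much longer.
text = "abXYZqRSTcd EFG efGHIjKLMno"

# No group: findall returns each full match.
without_group = re.findall(r"[^A-Z]+[A-Z]{3}[a-z][A-Z]{3}[^A-Z]+", text)

# One capture group: findall returns only what the group captured.
with_group = re.findall(r"[^A-Z]+[A-Z]{3}([a-z])[A-Z]{3}[^A-Z]+", text)

print(without_group)        # ['abXYZqRSTcd ', ' efGHIjKLMno']
print("".join(with_group))  # qj
```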

Python regex matching on strings I don't want [duplicate]

This question already has answers here:
Python- how do I use re to match a whole string [duplicate]
(4 answers)
Closed 5 years ago.
This is my first attempt at using regex, with Python or at all, and it is not working as expected. I want a regex to match any alphabetic character or underscore as the first character, then any number of alphanumeric characters or underscores after. The regex I am using is '^[a-z_A-Z][a-z_A-Z0-9]*', which seems to produce what I want at pythex.org, but in my code it is matching strings that I do not want.
My code is as follows:
isMatch = re.match('^[a-z_A-Z][a-z_A-Z0-9]*', someString)
return True if isMatch else False
Two examples of strings that are matching that I don't want are: "qq-q" and "va[r". What am I doing wrong?
I think that you just forgot the $ at the end of your regex to specify the end of the string.
isMatch = re.match('^[a-z_A-Z][a-z_A-Z0-9]*$', someString)
Without that, the pattern only has to match at the beginning of the string, not the entire string, which explains why it matched "qq-q" ("qq" is a match) and "va[r" ("va" is a match).
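As an alternative sketch, assuming Python 3.4 or later, re.fullmatch (or the fullmatch method of a compiled pattern) requires the pattern to consume the whole string, so no anchors are needed. The helper name is_identifier is made up for this example.

```python
import re

# Hypothetical helper; the pattern is the one from the question, without anchors.
IDENTIFIER = re.compile(r"[a-z_A-Z][a-z_A-Z0-9]*")

def is_identifier(some_string):
    # fullmatch succeeds only if the pattern consumes the entire string.
    return IDENTIFIER.fullmatch(some_string) is not None

for candidate in ["qq_q", "_tmp1", "qq-q", "va[r", "1abc"]:
    print(candidate, is_identifier(candidate))
```

One difference worth knowing: $ also matches just before a trailing newline, so the anchored pattern accepts "abc\n" while fullmatch rejects it.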

understanding this python regular expression re.compile(r'[ :]') [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 8 years ago.
Hi, I am trying to understand Python code that contains the regular expression re.compile(r'[ :]'). I tried quite a few strings and couldn't find one that matches. Can someone please give an example of text that matches this pattern?
The expression simply matches a single space or a single : (or rather, a string containing either). That’s it. […] is a character class.
The [] matches any of the characters in the brackets. So [ :] will match one character that is either a space or a colon.
So these strings would have a match:
"Hello World"
"Field 1:"
etc...
These would not:
"This_string_has_no_spaces_or_colons"
"100100101"
Edit:
For more info on regular expressions: https://docs.python.org/2/library/re.html
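A quick sketch that checks the example strings above. It uses search, which looks for a space or colon anywhere in the string; the split call at the end is only an illustration of how such a pattern might be used.

```python
import re

pattern = re.compile(r"[ :]")

samples = [
    "Hello World",
    "Field 1:",
    "This_string_has_no_spaces_or_colons",
    "100100101",
]

# search() scans the whole string for a single space or colon.
results = {sample: bool(pattern.search(sample)) for sample in samples}
for sample, found in results.items():
    print(repr(sample), found)

# Illustrative use: split on either delimiter.
parts = pattern.split("12:30 PM")
print(parts)  # ['12', '30', 'PM']
```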

Python Regex stop at '|' character [duplicate]

This question already has answers here:
Python regular expression again - match URL
(7 answers)
Closed 8 years ago.
I am trying to find a URL in a Dokuwiki using python regex. Dokuwikis format URLs like this:
[['insert URL'|Name of External Link]]
I need to design a python regex that captures the URL but stops at '|'
I could try and type out every non-alphanumeric character besides '|'
(something like this: (https?://[\w|\.|\-|\?|\/|\=|\+|\!|\#|\#|\$|\%|^|&]*) )
However that sounds really tedious and I might miss one.
Thoughts?
You can use negative character sets, or [^things to not match].
In this case, you want to not match |, so you would have [^|].
import re
bool(re.match("[^|]", "a"))
#>>> True
bool(re.match("[^|]", "|"))
#>>> False
You expect any character that's not | followed by a | and some other characters that are not ], everything enclosed within double square brackets. This translates to:
pattern = re.compile(r'\[\[([^|]+)\|([^\]]+)\]\]')
print(pattern.match("[[http://bla.org/path/to/page|Name of External Link]]").groups())
This would print:
('http://bla.org/path/to/page', 'Name of External Link')
If you don't need the name of the link, you can just remove the parentheses around the second group.
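A sketch of the same negated-character-class idea applied to a longer piece of text with findall. The page text and the second URL are made up for illustration; the link syntax follows the question's description.

```python
import re

# Hypothetical page text with two links in the [[URL|Name]] form.
page = (
    "Intro [[http://bla.org/path/to/page|Name of External Link]] and "
    "[[https://example.com/a?b=c&d=e|Another Link]] end."
)

# [^|\]]+ : one or more characters that are neither "|" nor "]"
# [^\]]+  : one or more characters that are not "]"
link = re.compile(r"\[\[([^|\]]+)\|([^\]]+)\]\]")

# With two groups, findall returns a list of (url, name) tuples.
urls = [url for url, _name in link.findall(page)]
print(urls)
```

Because the URL group excludes only | and ], characters such as ?, = and & are matched without having to list them.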

matching parentheses in python regular expression [duplicate]

This question already has answers here:
What is the difference between re.search and re.match?
(9 answers)
Closed 1 year ago.
I have something like
store(s)
ending line like "1 store(s)".
I want to match it using Python regular expression.
I tried something like re.match('store\(s\)$', text)
but it's not working.
This is the code I tried:
import re
s = '1 store(s)'
if re.match('store\(s\)$', s):
    print('match')
In more or less direct reply to your comment
Try this
import re
s = '1 store(s)'
if re.search(r'store\(s\)$', s):
    print('match')
The solution is to use re.search instead of re.match: the latter only tries to match at the beginning of the string, while the former looks for a match anywhere in the string.
Python offers two different primitive
operations based on regular
expressions: match checks for a match
only at the beginning of the string,
while search checks for a match
anywhere in the string (this is what
Perl does by default)
Straight from the docs, but it does come up a lot.
Have you considered re.match('(.*)store\(s\)$', text)?
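A short sketch putting the three approaches from this thread side by side on the string from the question:

```python
import re

s = "1 store(s)"
pattern = r"store\(s\)$"

# match() only tries at the start of the string, which begins with "1 ".
at_start = re.match(pattern, s)

# search() looks for the pattern anywhere in the string.
anywhere = re.search(pattern, s)

# match() also works if the pattern itself consumes the leading text.
prefixed = re.match(r".*" + pattern, s)

print(at_start)          # None
print(anywhere.group())  # store(s)
print(prefixed.group())  # 1 store(s)
```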
