Negating ")" inside a regex search pattern [duplicate] - python

This question already has answers here:
Escaping regex string
(4 answers)
Closed 5 years ago.
I'm trying to use regex in python to replace strings. I'd like to replace "SERVERS)" with "SERV" in a string.
example_String = "This is a great SERVERS)"
re.sub("SERVERS)","SERV", example_String)
I expected it to be a straightforward swap, but as I read more into the error, it looks like I need to make the regex pattern treat the ")" as a regular character and not a special regex character.
I'm not very familiar with regex, and would appreciate the help!
Edit: I'm importing data from a database (which is user input), and there are quite a few similar cases to the one in the question.
re.escape() fits the bill perfectly, thanks!

You could go for
import re
example_String = "This is a great SERVERS)"
new_string = re.sub(r"SERVERS\)","SERV", example_String)
print(new_string)
Which yields
This is a great SERV
To be honest, no regular expression is needed, really:
example_String = "This is a great SERVERS)"
new_string = example_String.replace('SERVERS)', 'SERV')
print(new_string)
In the latter, you don't even need to escape anything and it will be faster.
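For text that comes from user input, as the question's edit describes, re.escape() can build the pattern so that every special character is treated literally. A minimal sketch, assuming the text to replace is held in a variable (the name literal is only illustrative):

```python
import re

# Hypothetical user-supplied text that should be matched literally.
literal = "SERVERS)"
example_string = "This is a great SERVERS)"

# re.escape() backslash-escapes regex metacharacters such as ")".
pattern = re.escape(literal)
new_string = re.sub(pattern, "SERV", example_string)
print(new_string)
```

This keeps re.sub() usable when other parts of the pattern still need real regex syntax; for a purely literal swap, str.replace() remains the simpler choice.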

Related

Splitting quotes [duplicate]

This question already has answers here:
RegEx: Grabbing values between quotation marks
(20 answers)
Closed 6 years ago.
Does anyone have any advice for removing separators of split quotes in a piece of text? I am using Python, and am still a beginner.
For example, "Well," he said, "I suppose I could take a break." In this example, the italicized "he said," is the separator, and needs to be removed. Then, the quote needs to be seen as one string within quotations such as, "Well, I suppose I could take a break." I haven't been able to find code similar to this yet, and was hoping someone may be able to point me in the right direction.
Thanks!
To get only the content within double quotes in your given string, you may use the re library:
import re
my_string = '"Well," he said, "I suppose I could take a break."'
quoted_string = re.findall(r'\".*?\"', my_string)
# 'quoted_string' is -> ['"Well,"', '"I suppose I could take a break."']
# Join with a space so the two parts do not run together.
new_string = ' '.join(quoted_string).replace('"', '')
# 'new_string' is -> 'Well, I suppose I could take a break.'
You may write the same as a one-liner:
' '.join(re.findall(r'\".*?\"', my_string)).replace('"', '')
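A possible variation, sketched under the assumption that the quoted parts should be joined with a single space and wrapped in quotation marks again: a capturing group lets re.findall() return only the text between the quotes, so no replace() call is needed. The names parts and merged are illustrative.

```python
import re

my_string = '"Well," he said, "I suppose I could take a break."'

# With one capturing group, findall() returns just the group contents,
# i.e. the text between each pair of double quotes.
parts = re.findall(r'"(.*?)"', my_string)

# Rejoin the pieces with a space and put the quotation marks back.
merged = '"' + ' '.join(parts) + '"'
print(merged)
```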

re.sub() doesn't replace middle of string [duplicate]

This question already has answers here:
How to replace only the contents within brackets using regular expressions?
(2 answers)
Closed 6 years ago.
I am trying to replace the contents of brackets in a string with nothing. The code I am using right now is like this:
tstString = "OUTPUT:TRACK[:STATE]?"
modString = re.sub("[\[\]]","",tstString)
When I print the results, I get:
OUTPUT:TRACK:STATE?
But I want the result to be:
OUTPUT:TRACK?
How can I do this?
I guess this one will work fine. The regexp now matches the brackets together with whatever is inside them. Note the ? after the *: it makes the * non-greedy.
import re
tstString = "OUTPUT:TRACK[:STATE]?"
modString = re.sub(r"\[.*?\]", "", tstString)
print(modString)
Your regular expression "[\[\]]" says 'any of these characters: "[", "]"'.
But you want to delete what's between the square brackets too, so you should use something like r"\[:\w+\]". It says '[, then :, then one or more word characters (letters, digits, or underscore), then ]'.
And please, always use raw strings (r in front of quotes) when working with regular expressions to avoid funny things connected with Python string processing.
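To see the difference between the patterns discussed above, here is a small comparison sketch. The second input string, two_groups, is made up purely to show why the non-greedy ? matters.

```python
import re

tst_string = "OUTPUT:TRACK[:STATE]?"

# Character class: removes only the bracket characters themselves.
only_brackets = re.sub(r"[\[\]]", "", tst_string)

# Lazy match: removes the brackets and everything between them.
lazy = re.sub(r"\[.*?\]", "", tst_string)

# More specific: "[", ":", one or more word characters, "]".
specific = re.sub(r"\[:\w+\]", "", tst_string)

# Hypothetical input with two bracketed groups.
two_groups = "A[1]B[2]C"
greedy_result = re.sub(r"\[.*\]", "", two_groups)   # spans first "[" to last "]"
lazy_result = re.sub(r"\[.*?\]", "", two_groups)    # removes each group separately

print(only_brackets, lazy, specific, greedy_result, lazy_result)
```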

understanding this python regular expression re.compile(r'[ :]') [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 8 years ago.
Hi, I am trying to understand Python code that contains the regular expression re.compile(r'[ :]'). I tried quite a few strings and couldn't find one that matches. Can someone please give an example of text that matches this pattern?
The expression simply matches a single space or a single : (or rather, a string containing either). That’s it. […] is a character class.
The [] matches any of the characters in the brackets. So [ :] will match one character that is either a space or a colon.
So these strings would have a match:
"Hello World"
"Field 1:"
etc...
These would not:
"This_string_has_no_spaces_or_colons"
"100100101"
Edit:
For more info on regular expressions: https://docs.python.org/2/library/re.html
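The examples above can be checked with a short sketch. It uses search() because the pattern may match anywhere in the string; the split() call at the end is only an assumed typical use of such a pattern, and the timestamp string is invented.

```python
import re

pattern = re.compile(r'[ :]')

samples = [
    "Hello World",
    "Field 1:",
    "This_string_has_no_spaces_or_colons",
    "100100101",
]

# search() returns a match object if a space or colon occurs anywhere.
results = {text: bool(pattern.search(text)) for text in samples}
for text, matched in results.items():
    print(text, matched)

# A plausible use: splitting on either separator (illustrative input).
fields = pattern.split("12:30:45 UTC")
print(fields)
```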

matching parentheses in python regular expression [duplicate]

This question already has answers here:
What is the difference between re.search and re.match?
(9 answers)
Closed 1 year ago.
I have something like
store(s)
ending line like "1 store(s)".
I want to match it using Python regular expression.
I tried something like re.match('store\(s\)$', text)
but it's not working.
This is the code I tried:
import re
s = '1 store(s)'
if re.match('store\(s\)$', s):
    print('match')
In more or less direct reply to your comment
Try this
import re
s = '1 store(s)'
if re.search(r'store\(s\)$', s):
    print('match')
The solution is to use re.search instead of re.match: the latter only tries to match at the beginning of the string, while the former looks for a substring anywhere in the string that matches the expression.
"Python offers two different primitive operations based on regular expressions: match checks for a match only at the beginning of the string, while search checks for a match anywhere in the string (this is what Perl does by default)."
Straight from the docs, but it does come up a lot.
Have you considered re.match('(.*)store\(s\)$', text)?
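A minimal side-by-side sketch of the behaviours described in these answers, using the string from the question. re.fullmatch() is included only for contrast, since it is the function that does require the whole string to match; the \d+ prefix in that pattern is an assumption about the input format.

```python
import re

s = '1 store(s)'
pattern = r'store\(s\)$'

# match() anchors at position 0, where the string starts with "1 ".
at_start = re.match(pattern, s)

# search() scans forward and finds "store(s)" at the end.
anywhere = re.search(pattern, s)

# fullmatch() needs a pattern that covers the entire string.
whole = re.fullmatch(r'\d+ store\(s\)', s)

print(at_start, anywhere.group(), whole.group())
```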

Python regex confused by brackets ([])? [duplicate]

This question already has answers here:
What is the difference between re.search and re.match?
(9 answers)
Closed 3 years ago.
Is python confused, or is the programmer?
I've got a lot of lines of this:
some_dict[0x2a] = blah
some_dict[0xab] = blah, blah
What I'd like to do is to convert the hex codes into all uppercase to look like this:
some_dict[0x2A] = blah
some_dict[0xAB] = blah, blah
So I decided to call in the regular expressions. Normally, I'd just do this using my editor's regexps (xemacs), but the need to convert to uppercase pushes one into Lisp. ....ok... how about Python?
So I whip together a short script which doesn't work. I've condensed the code into this example, which doesn't work either. It looks to me like Python's regexps are getting confused by the brackets in the code. Is it me or Python?
import fileinput
import sys
import re
this = "0x2a"
that = "[0x2b]"
for line in [this, that]:
    found = re.match("0x([0-9,a-f]{2})", line)
    if found:
        print("Found: %s" % found.group(0))
(I'm using the () grouping constructs so I don't capitalize the 'x' in '0x'.)
This example only prints the 0x2a value, not the 0x2b. Is this correct behavior?
I can easily work around this by changing the match expression to:
found = re.match("\[0x([0-9,a-f]{2}\])", line)
but I'm just wondering if someone can give me some insight into what's going on here.
Running Python 2.6.2 on Linux.
re.match matches from the start of the string. Use re.search instead to "match the first occurrence anywhere in the string". The key bit about this in the docs is here.
I don't think you need the comma within the brackets. i.e.:
found = re.match("0x([0-9,a-f]{2})", line)
tells python to look for commas which it might be mistakenly matching. I think you want
found = re.match("0x([0-9a-f]{2})", line)
Your pattern describes only part of the input, and that part is not at the start, so you can't use re.match, which only matches at the beginning of the input string. You need to use re.search, which can find a match anywhere in the string.
>>> that = "[0x2b]"
>>> m = re.search("0x([0-9,a-f]{2})", that)
>>> m.group()
'0x2b'
You'll want to change
found = re.match("0x([0-9,a-f]{2})", line)
to
found = re.search("0x([0-9,a-f]{2})", line)
re.match will match only from the beginning of the string, which fails in the "[0x2b]" case.
re.search will match anywhere in the string, and thus ignore the leading "[" in the "[0x2b]" case.
See search() vs. match() for details.
You want to use re.search. This explains why.
If you use re.sub and pass a callable as the replacement, it will also do the uppercasing for you:
>>> that = 'some_dict[0x2a] = blah'
>>> m = re.sub("0x([0-9,a-f]{2})", lambda x: "0x"+x.group(1).upper(), that)
>>> m
'some_dict[0x2A] = blah'
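Putting the answers together, a sketch of the whole conversion might look like the following. It assumes two-digit lowercase hex literals as in the question, drops the stray comma from the character class, and uses an illustrative helper name (uppercase_hex) that is not from the original posts.

```python
import re

# Two hex digits after "0x"; the group excludes the "0x" prefix
# so that the "x" is not uppercased.
HEX_LITERAL = re.compile(r"0x([0-9a-f]{2})")

def uppercase_hex(line):
    """Return line with the digits of each 0x.. literal uppercased."""
    return HEX_LITERAL.sub(lambda m: "0x" + m.group(1).upper(), line)

lines = [
    "some_dict[0x2a] = blah",
    "some_dict[0xab] = blah, blah",
]
converted = [uppercase_hex(line) for line in lines]
for line in converted:
    print(line)
```

Because sub() searches the whole string, the leading "some_dict[" does not get in the way, unlike with re.match.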
