This question already has answers here:
What is the difference between re.search and re.match?
(9 answers)
Closed 8 years ago.
My Reg-Ex pattern is not working, why?
import re
string = "../../example/tobematched/nonimportant.html"
pattern = r"example\/([a-z]+)\/"
test = re.match(pattern, string)
# None
http://www.regexr.com/39mpu
re.match() only matches at the beginning of the string. You need re.search(), which looks for the first location where the regular expression pattern produces a match and returns a corresponding match object.
>>> import re
>>> s = "../../example/tobematched/nonimportant.html"
>>> re.search(r'example/([a-z]+)/', s).group(1)
'tobematched'
Try this.
test = re.search(pattern, string)
re.match only tries the pattern at the start of the string, so it returns None here.
Grab the result from test.group(), or the captured part from test.group(1).
To give you the answer in short:
search ⇒ finds a match anywhere in the string and returns a match object.
match ⇒ finds a match only at the beginning of the string and returns a match object.
That is the reason you have to use
foo = re.search(pattern, bar)
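A minimal sketch of the difference, using the string from the question. The variable names are illustrative, and re.fullmatch is included only for contrast; it is not part of the answers above.

```python
import re

s = "../../example/tobematched/nonimportant.html"
pattern = r"example/([a-z]+)/"

# re.match only tries the pattern at index 0, and s starts with "../".
at_start = re.match(pattern, s)

# re.search scans forward and returns the first match anywhere in s.
anywhere = re.search(pattern, s)

# re.fullmatch requires the pattern to cover the whole string.
whole = re.fullmatch(pattern, s)

print(at_start)            # None
print(anywhere.group(1))   # tobematched
print(whole)               # None
```

Since match and fullmatch can return None, check the result before calling .group() on it.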
This question already has answers here:
Why do some regex engines match .* twice in a single input string?
(1 answer)
Reference - What does this regex mean?
(1 answer)
Closed 2 years ago.
I was doing some regex which simplifies to this code:
>>> import re
>>> re.sub(r'^.*$|', "xyz", "abc")
'xyzxyz'
I was expecting it to replace abc with xyz: the RE ^.*$ matches the whole string, so the engine should just return that match and stop. So I ran the same regex with re.findall().
>>> re.findall(r'^.*$|', 'abcd')
['abcd', '']
The docs say:
A|B, where A and B can be arbitrary REs. As the target string is scanned, REs separated by '|'
are tried from left to right. When one pattern completely matches,
that branch is accepted. This means that once A matches, B will not be
tested further, even if it would produce a longer overall match.
But then why is the regex matching an empty string?
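One way to see where the second match comes from is to print the span of each match. This is a sketch assuming Python 3.7 or later, where an empty match directly after a non-empty one is also reported and substituted. Once ^.*$ has consumed the string, scanning resumes at the end position; ^ cannot match there, so the empty right-hand alternative does.

```python
import re

pattern = re.compile(r'^.*$|')

# Record (start, end) for every match the scanner reports.
spans = [m.span() for m in pattern.finditer('abcd')]
print(spans)  # [(0, 4), (4, 4)]

# Dropping the empty alternative leaves a single match,
# so the substitution happens only once.
print(re.sub(r'^.*$', 'xyz', 'abc'))  # xyz
```

The quoted rule about A|B applies to each match attempt separately. It does not stop the engine from starting a new attempt at the next position.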
This question already has answers here:
re.findall behaves weird
(3 answers)
Closed 5 years ago.
>>> reg = re.compile(r'^\d{1,3}(,\d{3})*$')
>>> str = '42'
>>> reg.search(str).group()
'42'
>>> reg.findall(str)
['']
>>>
Why does reg.findall find nothing, but reg.search works in this piece of code above?
When the regex contains capture groups (wrapped in parentheses), findall returns what the groups matched instead of the whole match. In your case the group matched nothing, so you get an empty string. Make the group non-capturing with ?: if you want the whole match returned. re.search, on the other hand, returns a match object, and its group() gives the whole match regardless of any groups. Both behaviours are described in the documentation:
re.findall:
Return all non-overlapping matches of pattern in string, as a list of
strings. The string is scanned left-to-right, and matches are returned
in the order found. If one or more groups are present in the pattern,
return a list of groups; this will be a list of tuples if the pattern
has more than one group.
re.search:
Scan through string looking for the first location where the regular
expression pattern produces a match, and return a corresponding
MatchObject instance. Return None if no position in the string matches
the pattern; note that this is different from finding a zero-length
match at some point in the string.
import re
reg = re.compile(r'^\d{1,3}(?:,\d{3})*$')
s = '42'
reg.search(s).group()
# '42'
reg.findall(s)
# ['42']
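If the capturing group has to stay, re.finditer is an alternative: it yields match objects, so group() still gives the whole match. A sketch, where the second sample input is an assumption added for illustration:

```python
import re

reg = re.compile(r'^\d{1,3}(,\d{3})*$')

for s in ('42', '1,234,567'):
    # findall returns only what the single capture group matched.
    groups = reg.findall(s)
    # finditer yields match objects; group() is the full match.
    full = [m.group() for m in reg.finditer(s)]
    print(s, groups, full)
```

For '1,234,567' the group holds only its last repetition, which is another reason findall is a poor fit for repeated groups.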
This question already has answers here:
Check if a word is in a string in Python
(14 answers)
Closed 8 years ago.
I have a pattern
pattern = "hello"
and a string
str = "good morning! hello helloworld"
I would like to search for pattern in str such that it only matches as a whole word, i.e. it should not match the substring hello in helloworld. If str does not contain hello as a word, it should return False.
I am looking for a regex pattern.
\b matches start or end of a word.
So the pattern would be pattern = re.compile(r'\bhello\b')
Assuming you are only looking for one match, re.search() returns None or a match object (calling .group() on it returns the exact string matched).
For multiple matches you need re.findall(), which returns a list of matches (an empty list if there are none).
Full code:
import re
str1 = "good morning! hello helloworld"
str2 = ".hello"
pattern = re.compile(r'\bhello\b')
try:
    match = re.search(pattern, str1).group()
    print(match)
except AttributeError:
    print('No match')
You can use word boundaries around the pattern you are searching for if you are looking to use a regular expression for this task.
>>> import re
>>> pattern = re.compile(r'\bhello\b', re.I)
>>> mystring = 'good morning! hello helloworld'
>>> bool(pattern.search(mystring))
True
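The word-boundary check can be wrapped in a small helper that returns True or False, as the question asks. This is only a sketch: contains_word is a hypothetical name, and re.escape is added on the assumption that the word might contain regex metacharacters. Note that \b behaves as expected only when the word starts and ends with a word character.

```python
import re

def contains_word(word, text):
    """Return True if `word` occurs in `text` as a whole word."""
    # re.escape keeps characters such as "." or "?" literal.
    pattern = r'\b' + re.escape(word) + r'\b'
    return re.search(pattern, text) is not None

print(contains_word('hello', 'good morning! hello helloworld'))  # True
print(contains_word('hello', 'good morning! helloworld'))        # False
```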
This question already has an answer here:
match trailing slash with Python regex
(1 answer)
Closed 8 years ago.
I can't match the question mark character although I escaped it.
I tried escaping with multiple backslashes and also using re.escape().
What am I missing?
Code:
import re
text = 'test?'
result = ''
result = re.match(r'\?',text)
print ("input: "+text)
print ("found: "+str(result))
Output:
input: test?
found: None
re.match only matches a pattern at the beginning of the string; as in the docs:
If zero or more characters at the beginning of string match the regular expression pattern, return a corresponding match object.
so, either:
>>> re.match(r'.*\?', text).group(0)
'test?'
or use re.search:
>>> re.search(r'\?', text).group(0)
'?'
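For a literal character there are also options that avoid hand-written escaping. A sketch using the text from the question: re.escape builds the escaped pattern, and the in operator needs no regex at all.

```python
import re

text = 'test?'

# re.escape returns the pattern with special characters escaped.
m = re.search(re.escape('?'), text)
print(m.group(), m.start())  # ? 4

# For a plain substring test, no regex is needed.
print('?' in text)  # True
```

The escaping in the question was already correct. Only the choice of re.match over re.search caused the None.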
This question already has answers here:
Extract string with Python re.match
(5 answers)
Closed 8 years ago.
Here's the code:
import re
pattern = re.compile(r'ea')
match = pattern.match('sea ea')
if match:
    print(match.group())
Nothing is printed. But when I change the code to pattern = re.compile(r'sea'), the output is "sea".
Could anyone give me an explanation?
p.s.
Btw, what I want is to retrieve the "#{year}" from the string "select * from records where year = #{year}". Please give me a usable regex. Thanks in advance!
Summary:
Thanks to all of you, I found it in the Python documentation with your pointers. Since I can select only one answer as the most appropriate, I gave it to the one who answered most quickly. Thanks again.
pattern.match is anchored at the beginning of the string.
You need pattern.search.
From the documentation:
Python offers two different primitive operations based on regular
expressions: re.match() checks for a match only at the beginning of
the string, while re.search() checks for a match anywhere in the
string (this is what Perl does by default).
You mean to use search, not match. match will match the regular expression only if it is at the start of the string.
pattern = re.compile(r'ea')
match = pattern.search('sea ea')
if match:
    print(match.group())
match only tests the start of the string; it doesn't search for things. Splitting on a capturing group scans the whole string and keeps the placeholder:
>>> pattern = re.compile(r'(#{\w+})')
>>> pattern.split('select * from records where year = #{year}')
['select * from records where year = ', '#{year}', '']
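To retrieve the placeholder itself rather than split around it, search works directly. A sketch using the query from the question; the braces are escaped here for readability, and the variable names and the second sample string are illustrative.

```python
import re

query = 'select * from records where year = #{year}'

# "#" followed by a word inside literal braces.
placeholder = re.compile(r'#\{\w+\}')

m = placeholder.search(query)
if m:
    print(m.group())  # #{year}

# findall returns every placeholder when there are several.
print(placeholder.findall('a = #{a} and b = #{b}'))  # ['#{a}', '#{b}']
```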