\b boundary regex not working as expected [duplicate] - python

This question already has answers here:
Python regular expression not matching
(3 answers)
Closed 1 year ago.
I was trying to use the \b regex to match whole words, but I couldn't get it to work.
match = re.match(r'\bcat\b', 'the cat is sleeping')
print(match) # prints None
With this piece of code, I was expecting to get a match on cat, but it returns None. I tried running the code on my local machine and in an online Python shell.

re.match only attempts a match at the beginning of the string. Since cat is not at the start of your string, it does not match.
You need to use re.search in this case.
re.search(r'\bcat\b', 'the cat is sleeping')
<_sre.SRE_Match object; span=(4, 7), match='cat'>
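To make the difference concrete, here is a small sketch that runs the same pattern through both functions. The 'concatenate' input is my own addition (not from the question), included only to show that \b still enforces whole-word matching once re.search is used.

```python
import re

text = 'the cat is sleeping'

# re.match only tries the pattern at position 0, so 'cat' at index 4 is missed.
at_start = re.match(r'\bcat\b', text)

# re.search scans forward and returns the first position where the pattern fits.
anywhere = re.search(r'\bcat\b', text)

# Hypothetical extra input: 'cat' inside a longer word has no word boundary
# on either side, so \b rejects it.
inside_word = re.search(r'\bcat\b', 'concatenate')

print(at_start)         # None
print(anywhere.span())  # (4, 7)
print(inside_word)      # None
```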

Related

How to match regex to line ending in python [duplicate]

This question already has an answer here:
Regular expression works on regex101.com, but not on prod
(1 answer)
Closed 2 years ago.
I'm trying to get a Python regex to match the end of a string (primarily because I want to remove a common section from the end of the string). I have the following code, which I think follows what the docs describe, but it's not performing as I expect:
input_value = "Non-numeric qty or weight, from 00|XFX|201912192009"
pattern = ", from .*$"
match = re.match(pattern, input_value)
print(match)
The result is None, however I'm expecting to have matched something. I've also tested these values with an online regex tool: https://regex101.com/ using the python flavour, and it works as expected.
What am I doing wrong?
match = re.match(".*, from.*$", input_value)
You should put .* in front; otherwise re.match tries to match the pattern at the very start of the string and fails.
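Since the stated goal was to remove the common section from the end, a possible alternative (my suggestion, not part of the answer above) is to keep the original pattern and switch to re.search or re.sub, both of which scan the whole string. A minimal sketch using the question's own input:

```python
import re

input_value = "Non-numeric qty or weight, from 00|XFX|201912192009"
pattern = r", from .*$"

# re.search finds the suffix wherever it starts, so no leading .* is needed.
found = re.search(pattern, input_value)
print(found.group())  # , from 00|XFX|201912192009

# re.sub replaces the matched suffix with an empty string.
trimmed = re.sub(pattern, "", input_value)
print(trimmed)  # Non-numeric qty or weight
```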

Why the positive lookahead not working with /$ in the end [duplicate]

This question already has answers here:
Why the positive lookahead not working for this regex [closed]
(3 answers)
Closed 3 years ago.
I want to match issues/ or settings/general/, but in the second case /general should not be included in the match. I tried using a positive lookahead for the second case, but it does not seem to work. This is what I came up with:
^(issues|settings(?=/general))/$
It's because the /general part of the string is not consumed.
After having checked that settings is correctly followed by /general, the cursor is still at the end of settings, so the matching will continue from this point on.
So the slash is correctly matched, but not the end of line.
As suggested by Wiktor, you'd be better off using groups if you want to extract a part of the string.
Here's a proposition:
^(issues|settings)/general/$
Trying it out:
>>> result = re.match("^(issues|settings)/general/$", "issues/general/")
>>> result
<re.Match object; span=(0, 15), match='issues/general/'>
>>> result.group(1)
'issues'
If you really want to avoid groups though, you can also include /$ inside the lookahead, and so the regex becomes ^(issues|settings(?=/general/$)):
>>> re.match("^(issues|settings(?=/general/$))", "issues/general/")
<re.Match object; span=(0, 6), match='issues'>
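Here is a short sketch of why the original pattern fails and how moving /$ into the lookahead changes things. The names original, fixed and strict are mine, and the strict variant is an assumption about what the asker might ultimately want (it also checks what follows issues); it is not taken from the answer.

```python
import re

# The lookahead verifies '/general' but consumes nothing, so the '/' after it
# matches the slash right after 'settings', and then '$' fails because
# 'general/' is still unread.
original = re.compile(r'^(issues|settings(?=/general))/$')
print(original.match('settings/general/'))  # None
print(original.match('issues/').group(1))   # issues

# With '/$' inside the lookahead, the rest of the string is checked
# without becoming part of the match.
fixed = re.compile(r'^(issues|settings(?=/general/$))')
print(fixed.match('settings/general/').group(1))  # settings

# Hypothetical stricter variant: 'issues' must also be followed by '/' and the end.
strict = re.compile(r'^(issues(?=/$)|settings(?=/general/$))')
print(strict.match('issues/').group(1))  # issues
print(strict.match('issues/extra'))      # None
```

Note that in the fixed pattern the issues alternative has no end anchor, so it also matches the start of longer strings such as 'issues/extra'; the strict variant is one way to close that gap.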

i am trying to extract date from text using regular expression [duplicate]

This question already has answers here:
How do I parse an ISO 8601-formatted date?
(29 answers)
Closed 4 years ago.
I am trying to extract the date from this '2025-03-21T12:54:41Z' text using python regular expression.
date=re.match('(\d{4})[/.-](\d{2})[/.-](\d{2})$', date[0])
print(date)
This gives None as output.
I also tried this code:
date_reg_exp = re.compile('\d{4}(?P<sep>[-/])\d{2}(?P=sep)\d{2}')
matches_list=date_reg_exp.findall(date[0])
for match in matches_list:
    print(match)
This outputs only - (the separator).
Please help
Your regular expression is wrong because it has a $ at the end. $ asserts that this is the end of the string.
The regex engine matches your string with the regex and after matching the last two digits, expects a $ - end of the string. However, your string still has T12:54:41Z before the end, so the regex does not match.
To fix this, remove $ (and use a raw string so that \d is not treated as a string escape):
>>> re.match(r'(\d{4})[/.-](\d{2})[/.-](\d{2})', '2025-03-21T12:54:41Z')
<_sre.SRE_Match object; span=(0, 10), match='2025-03-21'>
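Instead of ending your regexp with $, which matches only at the end of the string (or just before a trailing newline), anchor it at the beginning with ^. Since re.match already starts at the beginning of the string, the ^ is redundant here but harmless:
import re
date = '2025-03-21T12:54:41Z'
date = re.match(r'^(\d{4})[/.-](\d{2})[/.-](\d{2})', date)
print(date)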
Output in python3:
<_sre.SRE_Match object; span=(0, 10), match='2025-03-21'>
Python2:
<_sre.SRE_Match object at 0x7fd191ac1ae0>
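Neither answer addresses why the second attempt printed only -. A sketch of what I believe is happening: when a pattern contains a capturing group, findall returns the group's text rather than the whole match. The strptime call at the end is an alternative I am adding, assuming the input always has exactly this timestamp layout.

```python
import re
from datetime import datetime

stamp = '2025-03-21T12:54:41Z'

# The pattern has one capturing group (sep), so findall returns only
# what that group captured: the separator.
date_re = re.compile(r'\d{4}(?P<sep>[-/])\d{2}(?P=sep)\d{2}')
print(date_re.findall(stamp))          # ['-']

# group(0) on a match object is the whole matched text.
print(date_re.search(stamp).group(0))  # 2025-03-21

# Alternative without regex, assuming this exact layout with a literal 'Z'.
parsed = datetime.strptime(stamp, '%Y-%m-%dT%H:%M:%SZ')
print(parsed.date())                   # 2025-03-21
```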

regular expression match using python for string with multiple spaces and special character [duplicate]

This question already has answers here:
What is the difference between re.search and re.match?
(9 answers)
Closed 7 years ago.
re.match() returns None for the following string:
string = "(branch=MAIN). See the error log at /home/aswamy/run/test_upgrade/2.0-285979.customer_deployment.22499/test.log.2\n"
m = re.match("See the error",string)
print m  # m is always None here
But if I use the same string without the leading (branch=MAIN). part, the match succeeds, as below:
string = "See the error log at /home/kperiyaswamy/runmass/mass_test_upgrade/7.2.0-285979.customer_deployment.22499/infoblox.log.2\n"
m = re.match("See the error",string)
print m  # works properly: <_sre.SRE_Match object at 0x7fe813825030>
So it seems that the pattern match doesn't work when the string contains multiple whitespaces. Please let me know how to solve this issue.
It's not about the whitespace. match always starts matching from the beginning of the string. In the second case the text is at the start, so you get a match. In the first case it isn't, so you don't. Use re.search or findall if you want to get a match anywhere in the string.
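A quick sketch comparing the three calls on the first string from the question (written for Python 3, so print is a function here, unlike in the question's Python 2 code):

```python
import re

string = "(branch=MAIN). See the error log at /home/aswamy/run/test_upgrade/2.0-285979.customer_deployment.22499/test.log.2\n"

# match is anchored at index 0, where the string starts with '(branch=MAIN).'
print(re.match("See the error", string))           # None

# search scans forward and reports where the text was found.
print(re.search("See the error", string).start())  # 15

# findall returns every non-overlapping occurrence as a list of strings.
print(re.findall("See the error", string))         # ['See the error']
```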

understanding this python regular expression re.compile(r'[ :]') [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 8 years ago.
Hi, I am trying to understand Python code that contains the regular expression re.compile(r'[ :]'). I tried quite a few strings and couldn't find one that matches. Can someone please give an example of text that matches this pattern?
The expression simply matches a single space or a single : (or rather, a string containing either). That’s it. […] is a character class.
The [] matches any of the characters in the brackets. So [ :] will match one character that is either a space or a colon.
So these strings would have a match:
"Hello World"
"Field 1:"
etc...
These would not:
"This_string_has_no_spaces_or_colons"
"100100101"
Edit:
For more info on regular expressions: https://docs.python.org/2/library/re.html
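The examples above can be checked with a short sketch. The split call and the "12:30 PM" input are my own illustration of a typical use for such a pattern; the original code's actual purpose is not shown in the question.

```python
import re

sep = re.compile(r'[ :]')

# search succeeds if the string contains a space or a colon anywhere.
print(bool(sep.search("Hello World")))  # True
print(bool(sep.search("Field 1:")))     # True
print(bool(sep.search("This_string_has_no_spaces_or_colons")))  # False
print(bool(sep.search("100100101")))    # False

# Hypothetical use: split a string on either separator.
print(sep.split("12:30 PM"))  # ['12', '30', 'PM']
```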
