One character in multiple groups [duplicate] - python

This question already has answers here:
How to use regex to find all overlapping matches
(5 answers)
Closed 3 years ago.
I have a string like:
s = 'ababbabbba'
I'm trying to match all patterns matching any number of b's between a's. This is what I expect the patterns to be for s above:
['aba', 'abba', 'abbba']
This is what I've tried:
import re
re.findall('ab+a', s)
Which gives:
['aba', 'abbba']
I think that happens because any single a can only be part of a single group. Whereas my requirement would make the middle a's be part of two groups. Reading through the re documentation, I can't find any way to do this.

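The solution is to wrap the pattern in a lookahead that contains a capturing group. The lookahead consumes no characters, so the next match can start inside the previous one: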
re.findall('(?=(ab+a))', s)
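As a quick check of the behaviour described above, here is a minimal sketch that runs both patterns on the string from the question. The variable names `plain` and `overlapping` are only illustrative.

```python
import re

s = 'ababbabbba'

# A plain pattern consumes each 'a', so two adjacent matches cannot share it.
plain = re.findall(r'ab+a', s)

# A lookahead is zero-width: the engine tries every position in turn, and
# the capturing group reports what the lookahead saw at that position.
overlapping = re.findall(r'(?=(ab+a))', s)

print(plain)        # ['aba', 'abbba']
print(overlapping)  # ['aba', 'abba', 'abbba']
```

Because the overall match is empty, the text is only available through the capturing group, which is why the inner parentheses are needed.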

python regular expression doesn't match all letters after "or" group [duplicate]

This question already has answers here:
re.findall behaves weird
(3 answers)
Closed 1 year ago.
I'm trying to match FD or MD in a string by doing:
matches = re.findall(r"(F|M)D",myString)
Suppose myString = 'MD'. Then, matches becomes
matches = ['M']
Why does it ignore D?
That's because (F|M) is a group, and D is not a part of this group.
Use this instead:
matches = re.findall(r"((?:F|M)D)",myString)
For a visual representation of the differences between these two patterns, I really like to use Regexper.com:
(F|M)D
((?:F|M)D)
The Python documentation for the re module has a lot more information on regular expressions.
Note that ?: indicates that F|M is a "non-capturing" group. If the pattern were ((F|M)D) instead, then matches would be [('MD', 'M')] (which doesn't sound like what you want).
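The three variants discussed above can be compared side by side. This is a small sketch using the `'MD'` example from the question; the variable names are illustrative only.

```python
import re

my_string = 'MD'

# One capturing group: findall returns only what the group captured.
one_group = re.findall(r'(F|M)D', my_string)

# Outer capturing group around a non-capturing alternation: the whole token.
wrapped = re.findall(r'((?:F|M)D)', my_string)

# Two capturing groups: findall returns a tuple per match.
two_groups = re.findall(r'((F|M)D)', my_string)

# No capturing groups at all: findall returns the entire match.
no_groups = re.findall(r'(?:F|M)D', my_string)

print(one_group)   # ['M']
print(wrapped)     # ['MD']
print(two_groups)  # [('MD', 'M')]
print(no_groups)   # ['MD']
```

The last variant shows that the outer parentheses are optional here: once no capturing group remains, `re.findall` falls back to returning whole matches.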

Regular expression to find and gather only the last element of an array [duplicate]

This question already has answers here:
Write a Regex to extract number before '/'
(5 answers)
Closed 2 years ago.
I need to extract the last element from a string that contains an array-like structure.
A simple example should be helpful:
Base string: [1, 2, 3, 10000]
If I use the following regex:
, (.+?)\]
The captured group is 2, 3, 10000. I need to get only the 10000.
Is it possible to do it?
Thanks in advance!
.+? will match commas. Making it non-greedy doesn't force it to start matching later.
Use [^,]+ to not match a comma there.
([^,\s]+)\]
will match what you want in the capture group.
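A short sketch of the difference, using the base string from the question. It assumes `re.search` is an acceptable way to run the pattern, since the question does not say which function is used.

```python
import re

base = '[1, 2, 3, 10000]'

# The lazy pattern still starts at the first ", " it finds, so the group
# runs across the later commas up to the closing bracket.
lazy = re.search(r', (.+?)\]', base).group(1)

# Excluding commas and whitespace means the group can only be the final
# element, the one directly in front of the closing bracket.
last = re.search(r'([^,\s]+)\]', base).group(1)

print(last)  # 10000
```

Here `lazy` captures more than the last element, while `last` holds only `10000`.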

Numeric pattern search in regular expression using Python [duplicate]

This question already has answers here:
How to use regex to find all overlapping matches
(5 answers)
Closed 2 years ago.
I have the text below:
my_text = "My telephone number is 408-555-1234"
on which I am searching for the pattern
re.findall(r'\d{3}-\d{1,}',my_text)
My intention was to search for a three-digit number followed by - and then one or more digits. Hence I was expecting the result to be ['408-555', '555-1234'].
However, the result I am getting is only ['408-555'].
Could anyone explain what is wrong in my understanding here, and suggest a pattern that would serve my purpose?
You can use a lookahead with a capturing group, so that matches may overlap:
re.findall(r'(?=(\d{3}-\d+))', my_text)
output:
['408-555', '555-1234']
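If the position of each overlapping match is also needed, `re.finditer` can be used with the same lookahead pattern. This is only a sketch; the `found` list of `(start, text)` pairs is an illustrative structure, not something from the answer.

```python
import re

my_text = "My telephone number is 408-555-1234"

# Each match itself is empty (the lookahead is zero-width), so the text
# and its offsets are read from capturing group 1.
found = [(m.start(1), m.group(1))
         for m in re.finditer(r'(?=(\d{3}-\d+))', my_text)]

for start, text in found:
    # Slicing the original string at the reported offset gives the same text.
    print(start, text, my_text[start:start + len(text)] == text)
```

The second match starts inside the first one, which is exactly what a plain `\d{3}-\d+` cannot report because it has already consumed `555`.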

Python re.findall only returns part of pattern [duplicate]

This question already has answers here:
re.findall behaves weird
(3 answers)
Closed 4 years ago.
I am writing a Python 3 program that keeps track of hours spent with a client. One way to log hours is to use a string like Client 9:35am 1:35pm where the first time is the beginning and the second is the end.
To extract the times from the string, I used regex101.com to construct the following pattern:
r"[01]?[0-9]:[0-5][0-9]\s*([Aa][Mm]?|[Pp][Mm]?)"
When testing it on the above example with regex101, it correctly identifies the two times as two separate matches. However, when trying to use the pattern with Python, the list re.findall returns only contains AM or PM:
re.findall(r"[01]?[0-9]:[0-5][0-9]\s*([Aa][Mm]?|[Pp][Mm]?)", "Client 9:35am 1:35pm")
['am', 'pm']
How can I change this so that matches contain the whole time?
Use a non-capturing group:
r"[01]?[0-9]:[0-5][0-9]\s*(?:[Aa][Mm]?|[Pp][Mm]?)" # note the "?:"
re.findall returns a list of the groups instead of the entire matches if the pattern contains capturing groups.
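An alternative that leaves the original pattern untouched is to iterate with `re.finditer` and read `group(0)`, which is always the entire match regardless of capturing groups. A minimal sketch, with illustrative variable names:

```python
import re

log_line = "Client 9:35am 1:35pm"
pattern = r"[01]?[0-9]:[0-5][0-9]\s*([Aa][Mm]?|[Pp][Mm]?)"

# findall returns only the capturing group when the pattern has one.
groups_only = re.findall(pattern, log_line)

# Match objects always expose the full match as group(0).
whole = [m.group(0) for m in re.finditer(pattern, log_line)]

print(groups_only)  # ['am', 'pm']
print(whole)        # ['9:35am', '1:35pm']
```

This can be handy when the group is still wanted for other purposes, for example to read the AM/PM part separately via `m.group(1)`.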

Excluding words using regex without excluding its variants [duplicate]

This question already has answers here:
Find substring in string but only if whole words?
(8 answers)
Closed 4 years ago.
I am trying to exclude the word ‘define’ without excluding other forms of the word like ‘defined’ or ‘defining’ but the below mentioned regex doesn’t work. Help.
Regex :
^((?!define).)*$
Use word boundaries around the word define:
^((?!\bdefine\b).)*$
You could also write this pattern as:
^(?!.*\bdefine\b).*$
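The effect of the word boundaries can be checked on a few sample lines. The sample strings below are made up for illustration; only the two patterns come from the question and the answer.

```python
import re

# Pattern from the question: rejects any line containing the letters "define".
naive = re.compile(r'^((?!define).)*$')

# Pattern from the answer: rejects only lines with "define" as a whole word.
bounded = re.compile(r'^(?!.*\bdefine\b).*$')

lines = ['we define x here', 'x is defined here', 'defining terms']

kept_naive = [ln for ln in lines if naive.search(ln)]
kept_bounded = [ln for ln in lines if bounded.search(ln)]

print(kept_naive)    # ['defining terms']
print(kept_bounded)  # ['x is defined here', 'defining terms']
```

Without `\b`, the line containing "defined" is rejected because it contains "define" as a substring. With `\b`, the lookahead only fires when "define" is not followed by another word character.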