Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 6 years ago.
I am trying to write a regular expression in Python that matches a 4-character string whose first character is a digit and whose other 3 characters are each either a digit or a capital letter.
Here are examples of strings that should match: 1CTT, 2IR8, 35TR, 4T1R
I tried many ways; here's the last code I tried:
exp=re.compile("[0-9]{1}([A-Z0-9]{3})")
Thank you for your help !
The expression you tried last looks correct and should match the provided test strings. However, you don't need to specify {1}, and there is no need for a capturing group (the parentheses):
>>> import re
>>> text = "text, 1CTT, 2IR8, 35TR, 4T1R, smth else"
>>> pattern = re.compile(r"[0-9][A-Z0-9]{3}")
>>> pattern.findall(text)
['1CTT', '2IR8', '35TR', '4T1R']
You may also need to add word boundary constraints (\b) so that the pattern does not match inside a longer token such as 35TT35XYZ (thanks to #Jon Clements):
>>> text = "text, 1CTT, 2IR8, 35TR, 4T1R, smth else, 35TT35XYZ"
>>> pattern = re.compile(r"\b[0-9][A-Z0-9]{3}\b")
>>> pattern.findall(text)
['1CTT', '2IR8', '35TR', '4T1R']
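If the goal is to validate one candidate string rather than to find codes inside longer text, re.fullmatch is one option. The following is a minimal sketch; the helper name is_valid_code and the extra test strings are assumptions made for illustration, while the pattern is the one from the answer above.

```python
import re

# Pattern from the answer above: one digit, then three digits or capital letters.
CODE_PATTERN = re.compile(r"[0-9][A-Z0-9]{3}")

def is_valid_code(candidate):
    """Return True only if the entire string fits the pattern (hypothetical helper)."""
    # fullmatch anchors the pattern at both ends, so no \b or ^...$ is needed.
    return CODE_PATTERN.fullmatch(candidate) is not None

for candidate in ["1CTT", "2IR8", "A123", "1ctt", "1CTT5"]:
    print(candidate, is_valid_code(candidate))
```

Only the first two candidates are accepted here: the others start with a letter, use lowercase letters, or are too long.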
Closed 7 months ago.
What is wrong with the below used regular expression? Why does it not match the password?
import re
pattern = re.compile ('^\w##$%{8,}')
password = '12345abcd##$%'
x = pattern.search(password)
print (x)
print (len(password))
You didn't escape the $, which has a special meaning in a regular expression, and you didn't put the allowed characters in square brackets so that any one of them can match.
This: ^[\w##\$%]{8,} is the modified version of the regex which matches the password.
Escaping the $ character isn't really necessary within square brackets so ^[\w##$%]{8,} will work as well.
I suggest you check your regular expressions here: https://regex101.com/r/ldvJLf/1. This site explains in detail the meaning of every element of the regular expression, so you can see directly what is wrong when things don't work as you expected.
Tip:
check your regexes online https://regexr.com/
I think you want:
pattern = re.compile(r'^[\w##$%]{8,}')
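To make the difference concrete, here is a small sketch that runs both patterns against the password from the question. The variable names broken and fixed and the extra input 'abcdefgh!!' are assumptions made for illustration.

```python
import re

password = '12345abcd##$%'

# Original pattern: \w, then two literal #, then $ (end of string), then 8+ '%'.
# Nothing can follow the end of the string, so this can never match.
broken = re.compile(r'^\w##$%{8,}')

# Corrected pattern: 8 or more characters, each drawn from the bracketed set.
fixed = re.compile(r'^[\w##$%]{8,}')

print(broken.search(password))         # None
print(fixed.search(password).group())  # 12345abcd##$%

# Without an end anchor, search() accepts a valid prefix of a longer string.
# fullmatch() is one way to require that every character is allowed.
print(fixed.search('abcdefgh!!') is not None)     # True
print(fixed.fullmatch('abcdefgh!!') is not None)  # False
```

Whether the whole password or only its start should be checked depends on the requirement, which the question does not state.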
Closed 1 year ago.
I know the question title is similar to many other questions, and I have read those answers, but they didn't work for my case. I have some strings like the ones below:
s = '(ANTENOR)'
s = '(ねぼけ)'
The strings are sometimes in English and sometimes in Japanese. I tried different solutions given on Stack Overflow, but none of them work in my case. For example, I tried the following:
s = re.sub(r'[()]', '', s)
But it does not work; it returns the original string unchanged.
My output should look like this:
ANTENOR
ねぼけ
Only the text, no brackets, and no parentheses. Any help?
That isn't a classic parenthesis; it is a FULLWIDTH LEFT PARENTHESIS.
You can see this using ord. There isn't a separate space before it, either: it is a single character whose glyph includes some leading space.
# yours (FULLWIDTH LEFT PARENTHESIS, U+FF08)
print(ord('('))  # 65288
# classic parenthesis (U+0028)
print(ord('('))  # 40
To remove them, copy and paste the fullwidth characters into the regex:
import re
s = '(ANTENOR)'
s = re.sub(r'[)(]', '', s)
print(f">{s}<")  # >ANTENOR<
s = '(ねぼけ)'
s = re.sub(r'[)(]', '', s)
print(f">{s}<")  # >ねぼけ<
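Because fullwidth and ASCII parentheses are easy to confuse once pasted, one alternative is to spell the fullwidth characters as \uFF08 and \uFF09 escapes and to strip both kinds in one pass. This is a sketch, not the answerer's code; the helper name strip_parens is an assumption.

```python
import re

# \uFF08 and \uFF09 are FULLWIDTH LEFT and RIGHT PARENTHESIS;
# the plain ( and ) in the class cover the ASCII variants.
PARENS = re.compile(r'[()\uFF08\uFF09]')

def strip_parens(text):
    """Remove ASCII and fullwidth parentheses from text (hypothetical helper)."""
    return PARENS.sub('', text)

print(strip_parens('\uFF08ANTENOR\uFF09'))  # ANTENOR
print(strip_parens('\uFF08ねぼけ\uFF09'))    # ねぼけ
print(strip_parens('(ANTENOR)'))            # ANTENOR
```

Writing the code points as escapes keeps the intent visible even in editors or fonts that render both kinds of parenthesis alike.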
Closed 4 years ago.
I have multiple strings like 90'4. I want to extract the numbers from each string and sum them to get 94.
I tried compiling the pattern.
pattern="\d'\d"
re.compile(pattern)
I tried the methods findall and match, but did not get what I wanted.
I need to use regex; I cannot use .split().
Use \d+ with findall to extract numbers and then find their sum:
import re
s = "this is 90'4"
numbers = re.findall(r'\d+', s)
print(sum(map(int, numbers)))
# 94
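Since the question mentions multiple strings, the same idea can be wrapped in a function and applied to each one. This is a sketch; the name sum_numbers and every sample other than 90'4 are assumptions made for illustration.

```python
import re

def sum_numbers(text):
    """Sum every run of digits found in text (hypothetical helper)."""
    # findall returns an empty list when there are no digits, so the sum is 0.
    return sum(int(number) for number in re.findall(r'\d+', text))

samples = ["90'4", "45'2", "no digits here"]
for sample in samples:
    print(sample, '->', sum_numbers(sample))
```

For the three samples this prints the sums 94, 47 and 0.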
Closed 5 years ago.
Objective
I'm looking to use the regular expression \d+ to extract just the digits from the string, answer_40194.
Problem
I'm targeting a form element with Selenium and I'm printing the formID to the Terminal, but after the line re.findall('\d+', formID) I expect formID to be just the numbers 40194, but instead I'm getting the entire string answer_40194.
script.py
import selenium
import re
form = browser.find_element_by_tag_name('form')
formID = form.get_attribute('id')
re.findall('\d+', formID)
print formIDNumber
re.findall does not modify formID; it returns a new value, so you need to assign the result to a variable, e.g.
var1 = re.findall(r'\d+', formID)
print(var1)
This produces a list. If you only want one result, use
var1 = re.search(r'\d+', formID)
print(var1.group(0))
The latter returns a match object (or None if nothing matches), hence the .group(0); see the re module documentation on python.org for more information.
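Because re.search returns None when nothing matches, calling .group(0) directly can raise AttributeError. The following is a minimal sketch that guards against that case; the helper name extract_number is an assumption, and the Selenium lookup is replaced by a plain string so the example runs on its own.

```python
import re

def extract_number(element_id):
    """Return the first run of digits in element_id, or None (hypothetical helper)."""
    match = re.search(r'\d+', element_id)
    # match is None when the string contains no digits at all.
    return match.group(0) if match else None

form_id = 'answer_40194'  # stands in for form.get_attribute('id')
form_id_number = extract_number(form_id)
print(form_id_number)            # 40194
print(extract_number('answer'))  # None
```

The result is a string; wrap it in int() if a number is needed.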
Closed 8 years ago.
I have these strings:
Phone: 3396222
Phone: +33333388
I want to extract the numbers.
I tried this regular expression:
Phone:\s*(\d+\.\d+)
But I got an empty result.
I am using Scrapy, so my code looks like this: sel.xpath(..).re(..)
Please don't suggest using Python features other than regular expressions.
Your regular expression requires a literal dot (.) in the text, but your sample input has none.
Demo:
>>> import re
>>> re.search(r'Phone:\s*(\d+\.\d+)', 'Phone: 3396222') is None
True
>>> re.search(r'Phone:\s*(\d+\.\d+)', 'Phone: 339.6222').group(1)
'339.6222'
If you want both of your sample phone numbers to match, remove the \. (adding the dot to a character set instead) and add an optional + to the expression:
r'Phone:\s*(\+?[\d.]+)'
Demo:
>>> re.search(r'Phone:\s*(\+?[\d.]+)', 'Phone: 3396222').group(1)
'3396222'
>>> re.search(r'Phone:\s*(\+?[\d.]+)', 'Phone: +33333388').group(1)
'+33333388'
This pattern also allows for any number of dots in the number:
>>> re.search(r'Phone:\s*(\+?[\d.]+)', 'Phone: +333.333.88').group(1)
'+333.333.88'
Your regex requires a mandatory dot (.). Make it optional:
Phone:\s*\+?(\d+\.?\d+)
         ^^^       ^
I also added an optional \+ because your input now includes a +.
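The two suggested patterns behave slightly differently, which the following sketch illustrates. The names CHARSET_PATTERN, OPTIONAL_DOT_PATTERN and extract, and the single-digit input 'Phone: 5', are assumptions made for illustration.

```python
import re

# Pattern from the first answer: the + is inside the capturing group.
CHARSET_PATTERN = r'Phone:\s*(\+?[\d.]+)'
# Pattern from the second answer: the + is matched outside the group.
OPTIONAL_DOT_PATTERN = r'Phone:\s*\+?(\d+\.?\d+)'

def extract(pattern, text):
    """Return group 1 of the first match, or None (hypothetical helper)."""
    match = re.search(pattern, text)
    return match.group(1) if match else None

print(extract(CHARSET_PATTERN, 'Phone: +33333388'))       # +33333388
print(extract(OPTIONAL_DOT_PATTERN, 'Phone: +33333388'))  # 33333388

# \d+\.?\d+ needs at least two digits, so a single digit does not match.
print(extract(CHARSET_PATTERN, 'Phone: 5'))       # 5
print(extract(OPTIONAL_DOT_PATTERN, 'Phone: 5'))  # None
```

Which one fits depends on whether the leading + should be part of the extracted value.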