I need to replace ravi.jhon#piramal.com| or sam.jennifer#piramal.com| to
''(empty strings).I have written following regex but its unable to deal
with . - emptyspace in the strings.
my regex is \w+#ongoose.com["|"]
now question is how to include ., empty space,- along with alpha numeric characters
my final output should be : ravi.jhon#piramal.com| to ``
Add the character you want to match in a character class [\w.-].
In you example you want to match piramal and in your regex you want to match ongoose. To match both of them you might use an alternation (?:ongoose|piramal) or match any non whitespace character using \S+ and replace with an empty string.
To match a dot you have to escape it \.
[\w.-]+#\S+\.com\|
Related
I am quite new to regex and I right now Have a problem formulating a regex to match a string where the first and last letter are different. I looked up on the internet and found a regex that just does it's opposite. i.e. matches words that have same starting and ending letter. Can anyone please help me to understand if I can negeate this regex in some way or can create a new regex to match my requirements. The regex that needs to be modiifed or changed is:
^\s|^[a-z]$|^([a-z]).*\1$
This matches these Strings :
aba,
a,
b,
c,
d,
" ",
cccbbbbbbac,
aaaaba
But I want it to match strings like:
aaabbcz,
zba,
ccb,
cbbbba
Can anyone please help me in this regard? Thank you.
Note: I will be using this with Python Regex, so the regex should be compataible to be used with Python.
You don't need a regex for this, just use
s[0] != s[-1]
where s is your string. If you must use a regex, you can use this:
^(.).*(?!\1).$
This looks for
^ : beginning of string
(.) : a character (captured in group 1)
.* : some number of characters
(?!\1). : a character which is not the character captured in group 1
$ : end of string
Regex demo on regex101
This part of your pattern ^([a-z]).*\1$ only accounts for chars a-z, but you also want to exclude " "
You can rewrite that pattern by putting the part after the capture group inside a negative lookahead.
^(.)(?!.*\1$).+
^ Start of string
(.) Capture a single char (including spaces) in group 1
(?!.*\1$) Negative lookahead, assert that the string does not end with the same character
.+ Match 1+ characters so that the string has a minimum of 2 characters
See a regex demo.
If the string should start and end with a non whitespace character to prevent / trailing trailing spaces, you can start the match with a non whitespace character \S and also end the match with a non whitespace character.
^(\S)(?!.*\1$).*\S$
See another regex demo.
I am writing a python regex that matches only string that consists of letters, digits and one or more question marks.
For example, regex1: ^[A-Za-z0-9?]+$ returns strings with or without ?
I want a regex2 that matches expressions such as ABC123?A, 1AB?CA?, ?2ABCD, ???, 123? but not ABC123, ABC.?1D1, ABC(a)?1d
on mysql, I did that and it works:
select *
from (
select * from norm_prod.skill_patterns
where pattern REGEXP '^[A-Za-z0-9?]+$') AS XXX
where XXX.pattern not REGEXP '^[A-Za-z0-9]+$'
How about something like this:
^(?=.*\?)[a-zA-Z0-9\?]+$
As you can see here at regex101.com
Explanation
The (?=.*\?) is a positive lookahead that tells the regex that the start of the match should be followed by 0 or more characters and then a ? - i.e., there should be a ? somewhere in the match.
The [a-zA-Z0-9\?]+ matches one-or-more occurrences of the characters given in the character class i.e. a-z, A-Z and digits from 0-9, and the question mark ?.
Altogether, the regex first checks if there is a question mark somewhere in the string to be matched. If yes, then it matches the characters mentioned above. If either the ? is not present, or there is some foreign character, then the string is not matched.
You can validate an alphanumeric string with one or more question marks using
where pattern REGEXP '^[A-Za-z0-9]*([?][A-Za-z0-9]*)+$'
In Python:
re.search(r'^[A-Za-z0-9]*(?:\?[A-Za-z0-9]*)+$', text)
See the regex demo.
Details:
^ - start of string
[A-Za-z0-9]* - zero or more letters or digits
([?][A-Za-z0-9]*)+ - one or more repetitions of a ? char and then zero or more letters or digits
$ - end of string.
If you plan to apply this to any Unicode string, consider using POSIX character classes:
where pattern REGEXP '^[[:alnum:]]*([?][[:alnum:]]*)+$'
where [[:alnum:]] matches any letters and digits. In Python:
re.search(r'^[^\W_]*(?:\?[^\W_]*)+$', text)
In Python, all shorthand character classes are Unicode aware by default, and the [^\W_] pattern is a \w (that matches letters, digits, connector punctuation) with _ subtracted from it.
If there should be at least a single question mark present using MySQL or Python:
^[A-Za-z0-9]*\?[A-Za-z0-9?]*$
Explanation
^ Start of string
[A-Za-z0-9]* Match optional chars A-Z a-z 0-9
\? Match a question mark
[A-Za-z0-9]* Match optional chars A-Z a-z 0-9 or ?
$ End of string
See a regex demo.
In MySQL double escape the backslash like:
REGEXP '^[A-Za-z0-9]*\\?[A-Za-z0-9?]*$'
I'm having trouble making a Regex to find a string matching the format of:
'One or more numeric digits///Any combination of alphanumerics and non-alphanumerics///Any combination of alphanumerics and non-alphanumerics///Any combination of alphanumerics and non-alphanumerics'
A more specific example would be:
'Number of transactions///Total Revenue///Product name///Cost of Supplies'
Which would look something like:
'1002///1502.34///Coca-Cola-12.Oz///902.23'
There are no whitespaces in the strings.
I've tried the Regex: r'\d+///\d+.\d+///\w+///\d+.\d+'
The problem is that for the \w+ section because it can sometimes contain non-alphanumeric characters.
You can use a character class in between the words to allow what characters should be matched. In this case you could add a . and a -
Note to escape the dot to match it literally.
\d+///\d+\.\d+///\w+(?:[.-]\w+)*///\d+\.\d+
Regex demo
Other options could be using only the character class or if you don't want to allow / in between, a negated character class:
\d+///\d+\.\d+///[\w.-]+///\d+\.\d+
\d+///\d+\.\d+///[^/\n]///\d+\.\d+
I have a large string and one of the lines is in the form of
Description: something here....
I want to get everything in the: something here... without any trailing or leading space on it. Currently I'm doing it with a mix of regex and a strip(). How could this be done entirely in regex? Currently:
re.search('Description:\s+(.+)', body).group(1).strip()
Other thoughts:
re.search('Description:\s+\w(.+)\w', body).group(1) # works
Also, why doesn't putting an anchor work in the above context?
re.search('Description:\s+\w(.+)$', body).group(1) # fails
You can use either of
Description:\s+(.*\S)
See the regex demo.
The point is that you need to match up to the last non-whitespace character. .* matches any zero or more characters other than line break chars, as many as possible, so the \S matches the last non-whitespace character in the string.
If you have a multiline string and you need to get to the last non-whitespace character, you may add re.S / re.DOTALL option when passing the pattern above to a regex method, or re-write it as
Description:\s+(\S+(?:\s+\S+)*)
where \S+ matches one or more non-whitespace chars and (?:\s+\S+)* matches zero or more occurrences of one or more whitespaces followed with one or more non-whitespace chars.
See this regex demo.
I have been trying to create a regex that will match on a string of strings with the following format: "static.string static.mod static.bin". I basically want to enforce the string.string format. My current implementation only gets the first string static.string. this is my RE ^(\s*)([A-Za-z]+)(\.+)([A-Za-z]+). This only matches the first string, so how do I make it iterate and match any string that fits that format in a string of strings?
You may use
re.findall(r'(?<!\S)[A-Za-z]+\.[A-Za-z]+(?!\S)', text)
See the regex demo.
The regex matches:
(?<!\S) - a location immediately preceded with a whitespace or start of string
[A-Za-z]+ - 1+ ASCII letters
\. - a dot
[A-Za-z]+ - 1+ ASCII letters
(?!\S) - a location immediately followed with a whitespace or end of string.