Replacing string if it contains specified pattern - python

I need to replace ravi.jhon#piramal.com| or sam.jennifer#piramal.com| with
'' (an empty string). I have written the following regex, but it cannot deal
with ., - or spaces in the strings.
My regex is \w+#ongoose.com["|"]
My question is how to include ., space and - along with alphanumeric characters.
My final output should be: ravi.jhon#piramal.com| replaced with ``

Add the characters you want to match to a character class: [\w.-].
In your example you want to match piramal, but your regex matches ongoose. To match both of them you could use an alternation (?:ongoose|piramal), or match any run of non-whitespace characters using \S+, and replace the match with an empty string.
To match a literal dot you have to escape it: \.
[\w.-]+#\S+\.com\|
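A minimal sketch of applying the pattern above with re.sub in Python. The helper name remove_addresses and the sample inputs are made up for illustration; only the pattern comes from the answer.

```python
import re

# Pattern from the answer: a local part made of word characters, dots or
# hyphens, then '#', a run of non-whitespace ending in '.com', and a literal '|'.
ADDRESS = re.compile(r"[\w.-]+#\S+\.com\|")

def remove_addresses(text):
    """Replace every address-like token (hypothetical helper) with ''."""
    return ADDRESS.sub("", text)

print(repr(remove_addresses("ravi.jhon#piramal.com|")))
print(repr(remove_addresses("a-b.c#ongoose.com|rest")))
```

Note that \S+ is greedy, so several addresses written back to back without whitespace are consumed as a single match. The result after substitution is the same, but the match boundaries differ.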

Related

Getting a correct regex for word starting and ending with different letters

I am quite new to regex and am having trouble formulating a regex that matches a string whose first and last letters are different. I searched the internet and found a regex that does the opposite, i.e. it matches words that have the same starting and ending letter. Can anyone please help me understand whether I can negate this regex in some way, or create a new regex that meets my requirements? The regex that needs to be modified or replaced is:
^\s|^[a-z]$|^([a-z]).*\1$
This matches these Strings :
aba,
a,
b,
c,
d,
" ",
cccbbbbbbac,
aaaaba
But I want it to match strings like:
aaabbcz,
zba,
ccb,
cbbbba
Can anyone please help me in this regard? Thank you.
Note: I will be using this with Python regex, so the regex should be compatible with Python.

You don't need a regex for this, just use
s[0] != s[-1]
where s is your string. If you must use a regex, you can use this:
^(.).*(?!\1).$
This looks for
^ : beginning of string
(.) : a character (captured in group 1)
.* : some number of characters
(?!\1). : a character which is not the character captured in group 1
$ : end of string
Regex demo on regex101
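A quick sketch comparing the two approaches on the strings from the question. The function names are made up for illustration. One difference worth knowing: s[0] != s[-1] raises IndexError on an empty string, so the index-based helper here guards on length (it also treats one-character strings as non-matching, which is what the regex does).

```python
import re

DIFFERENT_ENDS = re.compile(r"^(.).*(?!\1).$")

def ends_differ_regex(s):
    # True when the regex from the answer finds a match.
    return DIFFERENT_ENDS.search(s) is not None

def ends_differ_index(s):
    # Plain comparison of first and last character, guarded against
    # strings that are too short to have two distinct ends.
    return len(s) >= 2 and s[0] != s[-1]

wanted = ["aaabbcz", "zba", "ccb", "cbbbba"]
unwanted = ["aba", "a", "cccbbbbbbac", "aaaaba"]

for s in wanted + unwanted:
    print(s, ends_differ_regex(s), ends_differ_index(s))
```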

This part of your pattern ^([a-z]).*\1$ only accounts for chars a-z, but you also want to exclude " "
You can rewrite that pattern by putting the part after the capture group inside a negative lookahead.
^(.)(?!.*\1$).+
^ Start of string
(.) Capture a single char (including spaces) in group 1
(?!.*\1$) Negative lookahead, assert that the string does not end with the same character
.+ Match 1+ characters so that the string has a minimum of 2 characters
See a regex demo.
If the string should start and end with a non-whitespace character to prevent leading / trailing spaces, you can start the match with a non-whitespace character \S and also end the match with a non-whitespace character.
^(\S)(?!.*\1$).*\S$
See another regex demo.
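A small sketch that exercises both lookahead patterns from this answer in Python. The names LOOSE and STRICT and the sample strings are only for illustration.

```python
import re

# First pattern: any first character, at least two characters in total.
LOOSE = re.compile(r"^(.)(?!.*\1$).+")
# Second pattern: first and last characters must also be non-whitespace.
STRICT = re.compile(r"^(\S)(?!.*\1$).*\S$")

def matches(pattern, s):
    return pattern.search(s) is not None

for s in ["zba", "aba", " ", "a", "ab ", "ab"]:
    print(repr(s), matches(LOOSE, s), matches(STRICT, s))
```

The string "ab " shows the difference: the first pattern accepts it, while the second rejects it because of the trailing space.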

Detect strings containing only digits, letters and one or more question marks

I am writing a Python regex that matches only strings consisting of letters, digits and one or more question marks.
For example, regex1: ^[A-Za-z0-9?]+$ matches strings with or without ?
I want a regex2 that matches expressions such as ABC123?A, 1AB?CA?, ?2ABCD, ???, 123? but not ABC123, ABC.?1D1, ABC(a)?1d
In MySQL, I did the following and it works:
select *
from (
select * from norm_prod.skill_patterns
where pattern REGEXP '^[A-Za-z0-9?]+$') AS XXX
where XXX.pattern not REGEXP '^[A-Za-z0-9]+$'

How about something like this:
^(?=.*\?)[a-zA-Z0-9\?]+$
As you can see here at regex101.com
Explanation
The (?=.*\?) is a positive lookahead that tells the regex that the start of the match should be followed by 0 or more characters and then a ? - i.e., there should be a ? somewhere in the match.
The [a-zA-Z0-9\?]+ matches one-or-more occurrences of the characters given in the character class i.e. a-z, A-Z and digits from 0-9, and the question mark ?.
Altogether, the regex first checks if there is a question mark somewhere in the string to be matched. If yes, then it matches the characters mentioned above. If either the ? is not present, or there is some foreign character, then the string is not matched.
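To see the lookahead at work, here is a short sketch that runs the pattern against the examples from the question. The function name has_question_mark_only is made up for illustration.

```python
import re

# Lookahead requires a '?' somewhere; the class restricts the allowed characters.
PATTERN = re.compile(r"^(?=.*\?)[a-zA-Z0-9?]+$")

def has_question_mark_only(s):
    return PATTERN.search(s) is not None

should_match = ["ABC123?A", "1AB?CA?", "?2ABCD", "???", "123?"]
should_not_match = ["ABC123", "ABC.?1D1", "ABC(a)?1d"]

for s in should_match + should_not_match:
    print(s, has_question_mark_only(s))
```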

You can validate an alphanumeric string with one or more question marks using
where pattern REGEXP '^[A-Za-z0-9]*([?][A-Za-z0-9]*)+$'
In Python:
re.search(r'^[A-Za-z0-9]*(?:\?[A-Za-z0-9]*)+$', text)
See the regex demo.
Details:
^ - start of string
[A-Za-z0-9]* - zero or more letters or digits
([?][A-Za-z0-9]*)+ - one or more repetitions of a ? char and then zero or more letters or digits
$ - end of string.
If you plan to apply this to any Unicode string, consider using POSIX character classes:
where pattern REGEXP '^[[:alnum:]]*([?][[:alnum:]]*)+$'
where [[:alnum:]] matches any letters and digits. In Python:
re.search(r'^[^\W_]*(?:\?[^\W_]*)+$', text)
In Python, all shorthand character classes are Unicode aware by default, and the [^\W_] pattern is a \w (that matches letters, digits, connector punctuation) with _ subtracted from it.
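A brief sketch contrasting the ASCII-only pattern with the Unicode-aware one in Python. The variable names and the sample strings (including the accented one) are assumptions chosen for illustration.

```python
import re

ASCII_ONLY = re.compile(r"^[A-Za-z0-9]*(?:\?[A-Za-z0-9]*)+$")
# [^\W_] is "any word character except underscore", Unicode-aware for str patterns.
UNICODE_AWARE = re.compile(r"^[^\W_]*(?:\?[^\W_]*)+$")

samples = ["ABC123?A", "h\u00e9llo?", "a_b?", "ABC123"]
for s in samples:
    print(s, bool(ASCII_ONLY.search(s)), bool(UNICODE_AWARE.search(s)))
```

The accented string is accepted only by the Unicode-aware pattern, and the string containing an underscore is rejected by both.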

If there should be at least a single question mark present using MySQL or Python:
^[A-Za-z0-9]*\?[A-Za-z0-9?]*$
Explanation
^ Start of string
[A-Za-z0-9]* Match optional chars A-Z a-z 0-9
\? Match a question mark
[A-Za-z0-9?]* Match optional chars A-Z a-z 0-9 or ?
$ End of string
See a regex demo.
In MySQL double escape the backslash like:
REGEXP '^[A-Za-z0-9]*\\?[A-Za-z0-9?]*$'
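On the Python side the same escaping question comes up with string literals. A minimal sketch, assuming the pattern is used through the re module: a raw string keeps the single backslash, while a regular string literal needs it doubled, and both spell the same pattern. The variable names are only for illustration.

```python
import re

raw_form = r"^[A-Za-z0-9]*\?[A-Za-z0-9?]*$"
escaped_form = "^[A-Za-z0-9]*\\?[A-Za-z0-9?]*$"

# Both literals produce the identical pattern text.
same_pattern = raw_form == escaped_form

PATTERN = re.compile(raw_form)
for s in ["ABC123?A", "???", "ABC123", "ABC.?1D1"]:
    print(s, PATTERN.search(s) is not None)
```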

Python regular expression for string with the following format

I'm having trouble making a Regex to find a string matching the format of:
'One or more numeric digits///Any combination of alphanumerics and non-alphanumerics///Any combination of alphanumerics and non-alphanumerics///Any combination of alphanumerics and non-alphanumerics'
A more specific example would be:
'Number of transactions///Total Revenue///Product name///Cost of Supplies'
Which would look something like:
'1002///1502.34///Coca-Cola-12.Oz///902.23'
There are no whitespaces in the strings.
I've tried the Regex: r'\d+///\d+.\d+///\w+///\d+.\d+'
The problem is with the \w+ section, because the product name can sometimes contain non-alphanumeric characters.

You can use a character class between the word parts to specify which extra characters are allowed. In this case you could add a . and a -
Note that the dot must be escaped to match it literally.
\d+///\d+\.\d+///\w+(?:[.-]\w+)*///\d+\.\d+
Regex demo
Other options could be using only the character class or if you don't want to allow / in between, a negated character class:
\d+///\d+\.\d+///[\w.-]+///\d+\.\d+
\d+///\d+\.\d+///[^/\n]+///\d+\.\d+
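A short sketch that tries the three variants on the sample record from the question. The capture groups in the last pattern and all variable names are additions for illustration, and the negated class is written with a + quantifier so it can match a product name longer than one character.

```python
import re

record = "1002///1502.34///Coca-Cola-12.Oz///902.23"

variants = [
    r"\d+///\d+\.\d+///\w+(?:[.-]\w+)*///\d+\.\d+",
    r"\d+///\d+\.\d+///[\w.-]+///\d+\.\d+",
    r"\d+///\d+\.\d+///[^/\n]+///\d+\.\d+",
]
results = [re.fullmatch(p, record) is not None for p in variants]

# Same shape as the third variant, with groups added to pull out the fields.
FIELDS = re.compile(r"(\d+)///(\d+\.\d+)///([^/\n]+)///(\d+\.\d+)")
fields = FIELDS.fullmatch(record).groups()
print(results, fields)
```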

Convert a TRIM on regex

I have a large string and one of the lines is in the form of
Description: something here....
I want to get everything in the: something here... without any trailing or leading space on it. Currently I'm doing it with a mix of regex and a strip(). How could this be done entirely in regex? Currently:
re.search(r'Description:\s+(.+)', body).group(1).strip()
Other thoughts:
re.search(r'Description:\s+\w(.+)\w', body).group(1) # works
Also, why doesn't putting an anchor work in the above context?
re.search(r'Description:\s+\w(.+)$', body).group(1) # fails

You can use
Description:\s+(.*\S)
See the regex demo.
The point is that you need to match up to the last non-whitespace character. .* matches any zero or more characters other than line break chars, as many as possible, so the \S matches the last non-whitespace character in the string.
If you have a multiline string and you need to get to the last non-whitespace character, you may add re.S / re.DOTALL option when passing the pattern above to a regex method, or re-write it as
Description:\s+(\S+(?:\s+\S+)*)
where \S+ matches one or more non-whitespace chars and (?:\s+\S+)* matches zero or more occurrences of one or more whitespaces followed with one or more non-whitespace chars.
See this regex demo.
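A minimal sketch of both patterns in Python. The sample strings body and tail are invented for illustration; only the two patterns come from the answer.

```python
import re

SINGLE_LINE = re.compile(r"Description:\s+(.*\S)")
MULTI_LINE = re.compile(r"Description:\s+(\S+(?:\s+\S+)*)")

# The description sits on one line, with padding on both sides.
body = "Title: demo\nDescription:   something here....   \nAuthor: nobody"
trimmed = SINGLE_LINE.search(body).group(1)

# The description runs over two lines up to the end of the text.
tail = "Description:  first line\n  second line  \n"
spanning = MULTI_LINE.search(tail).group(1)

print(repr(trimmed))
print(repr(spanning))
```

The second pattern keeps going across line breaks until the last non-whitespace character it can reach, so it is only suitable when the description is the final part of the text.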

Regex for input validation enforcement

I have been trying to create a regex that will match a string of strings with the following format: "static.string static.mod static.bin". I basically want to enforce the string.string format. My current regex is ^(\s*)([A-Za-z]+)(\.+)([A-Za-z]+). It only matches the first string, static.string, so how do I make it iterate and match every string that fits that format in a string of strings?

You may use
re.findall(r'(?<!\S)[A-Za-z]+\.[A-Za-z]+(?!\S)', text)
See the regex demo.
The regex matches:
(?<!\S) - a location immediately preceded with a whitespace or start of string
[A-Za-z]+ - 1+ ASCII letters
\. - a dot
[A-Za-z]+ - 1+ ASCII letters
(?!\S) - a location immediately followed with a whitespace or end of string.
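A small sketch of how the findall call could be used for enforcement. The helpers valid_tokens and all_tokens_valid are hypothetical, and treating the input as valid only when every whitespace-separated token matches is an assumption about what "enforce" should mean here.

```python
import re

TOKEN = re.compile(r"(?<!\S)[A-Za-z]+\.[A-Za-z]+(?!\S)")

def valid_tokens(text):
    """Return every whitespace-delimited token of the form letters.letters."""
    return TOKEN.findall(text)

def all_tokens_valid(text):
    """Assumed rule: non-empty input where every token has the required form."""
    tokens = text.split()
    return bool(tokens) and len(valid_tokens(text)) == len(tokens)

print(valid_tokens("static.string static.mod static.bin"))
print(valid_tokens("static.string bad..token x.y.z ok.fine"))
print(all_tokens_valid("static.string bad..token"))
```

Because each match is bounded by whitespace or the string edges on both sides, a match always covers exactly one whole token, which is why comparing the two counts works.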
