negative lookbehind not working as expected [closed] - python

Closed 2 years ago.
I have strings of this form:
FPLBX(2x3)ZE(53x13)(4x7)ZGQO
I want to find the blocks in parentheses, but only when they are not immediately preceded by another such group.
The other way around (not followed by another group) works perfectly fine, but I can't make it work for the preceding case.
Current regex:
(\(\d*x\d*\))(?<!\))

You simply need to put the negative lookbehind assertion, i.e. the (?<!\)) part, in front of the group you are searching for rather than after it:
>>> import re
>>> txt = "FPLBX(2x3)ZE(53x13)(4x7)ZGQO"
>>> re.findall(r"(?<!\))(\(\d*x\d*\))", txt)
['(2x3)', '(53x13)']
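To see why the position matters, here is a small side-by-side sketch. A lookbehind is checked at the point where it appears in the pattern, so when it comes after the group, the character just before it is always the group's own closing parenthesis. The variable names after and before are only illustrative.

```python
import re

txt = "FPLBX(2x3)ZE(53x13)(4x7)ZGQO"

# Lookbehind placed after the group: it is evaluated right after the
# closing ")", so the preceding character is always ")" and the
# negative assertion can never succeed.
after = re.findall(r"(\(\d*x\d*\))(?<!\))", txt)

# Lookbehind placed before the group: it is evaluated at the opening "(",
# so it inspects the character that precedes the whole group.
before = re.findall(r"(?<!\))(\(\d*x\d*\))", txt)

print(after)   # []
print(before)  # ['(2x3)', '(53x13)']
```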

Related

Check if string follow a strict format via Regex Python [closed]

Closed 7 months ago.
I have a string that might have any of the following formats (examples):
1111__1111
1111__1111_11
111_11A_11
I have added the following check:
import re
print(bool(re.match(r"\d__\d", "1111_1111")))
print(bool(re.match(r"\d__\d_\d", "1111_1111_11")))
print(bool(re.match(r"\d_\d[A-Za-z]_\d", "111_11A_11")))
I don't think the regex is correct because, for example, when I introduce a character in the first regex it always returns True.
Can you please point me to a solution?
Thank you
re.match only anchors the pattern at the start of the string; it does not require the pattern to consume the whole string, so any characters after a matching prefix are ignored. In addition, \d matches exactly one digit, so the number of digits has to be spelled out with a quantifier such as \d{4}.
The following regular expressions find exact matches for the three scenarios:
import re
print(bool(re.match(r"^\d{4}__\d{4}$", "1111__1111")))
print(bool(re.match(r"^\d{4}_\d{4}_\d{2}$", "1111_1111_11")))
print(bool(re.match(r"^\d{3}_\d{2}[A-Z]_\d{2}$", "111_11A_11")))
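As an alternative to writing ^ and $ by hand, re.fullmatch requires the whole string to match. The sketch below assumes the three formats from the answer above; the names PATTERNS and has_valid_format are made up for illustration and are not from the original post.

```python
import re

# One pattern per accepted format (assumed from the three examples above).
PATTERNS = [
    r"\d{4}__\d{4}",
    r"\d{4}_\d{4}_\d{2}",
    r"\d{3}_\d{2}[A-Z]_\d{2}",
]

def has_valid_format(value):
    """Return True if the whole string matches one of the formats."""
    # re.fullmatch succeeds only if the pattern consumes the entire
    # string, so no explicit ^ or $ anchors are needed.
    return any(re.fullmatch(p, value) for p in PATTERNS)

print(has_valid_format("1111__1111"))   # True
print(has_valid_format("1111__1111x"))  # False
```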

Python: how to extract the variables between 2 constant substring [closed]

Closed 1 year ago.
I am trying to extract the variable parts between two constant substrings in a string. For example, from the string in the code below,
I wish to extract Apple, Orange, Watermelon, Kiwi, ..., 13cups, 14cups, ..., 19cups. I am using a regular expression as a first step to take the text between the $ signs, but I do not get any results.
Can anyone advise on the correct expression, or whether there is a better way to extract it?
Thanks.
import re
file = '$n$n$n$xa0$n$nSHOWALL$nSHOWALL%GROWTH$n$n$xa0$n$xa0$n$n$n$nApple$na$nOrange$n$nWatermelon$nKiwi$n$nBanana$nJackfruit$n$nGuava$na$nGrape$n$nPlum$na$nOrange$n$nCoconut$nWatermelon$n$n12cups$n13cups$n$n14cups$na$n15cups$n$n16cups$na$n17cups$n$n18cups$n19cups$n'
found = re.findall(r'(?=$(.*?)$)',file)
print(found)
Given that the rules for identifying the required character sequences are ambiguous, I contend that a regular expression is impractical here. No doubt it could be done, but here's a quick-and-dirty approach to the problem:
data = '$n$n$n$xa0$n$nSHOWALL$nSHOWALL%GROWTH$n$n$xa0$n$xa0$n$n$n$nApple$na$nOrange$n$nWatermelon$nKiwi$n$nBanana$nJackfruit$n$nGuava$na$nGrape$n$nPlum$na$nOrange$n$nCoconut$nWatermelon$n$n12cups$n13cups$n$n14cups$na$n15cups$n$n16cups$na$n17cups$n$n18cups$n19cups$n'
for token in data.split('$n'):
    if token not in ('SHOWALL%GROWTH', 'SHOWALL', '$xa0', 'a', ''):
        print(token)
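For comparison, here is one possible regex variant. Note that in the question's pattern the unescaped $ is an end-of-string anchor, not a literal dollar sign, so it has to be written as \$. This is only a sketch under the assumption that every wanted token directly follows a literal $n and contains no $ itself; the shortened sample string and the SKIP set are illustrative, not the asker's full data.

```python
import re

# Shortened sample in the same shape as the original data (illustrative).
data = ('$n$n$xa0$n$nSHOWALL$nSHOWALL%GROWTH$n$nApple$na$nOrange'
        '$n$n12cups$n13cups$n')

# Tokens that should be ignored (assumed from the answer above).
SKIP = {'SHOWALL', 'SHOWALL%GROWTH', 'a'}

# \$n matches the literal separator; [^$]+ captures the run of
# non-dollar characters that follows it.
tokens = [t for t in re.findall(r'\$n([^$]+)', data) if t not in SKIP]

print(tokens)  # ['Apple', 'Orange', '12cups', '13cups']
```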

unable to match regex pattern in python [closed]

Closed 3 years ago.
I am trying to match a regex from an email. If the e-mail says "need update on SRT1000" the regex needs to match. I have my code as below, but it is not working. Can someone look at this and let me know what is wrong here?
def status_update_regex(email):
    email = email.lower()
    need_sr_update_regex = re.compile('(looking|want|need|seek|seeking|request|requesting)([^/./!/?/,/"]{0,10})(status|update)(^.{0,6})(^srt[0-9]{4})')
    if need_sr_update_regex.search(email) is not None:
        return 1
    else:
        return 0
You didn't allow for whitespace (\s) between the words, and the pattern has no part that matches the word "on":
(looking|want|need|seek|seeking|request|requesting)([\s^.!?,"]{0,10})(status|update)([\s^.]{0,6})(on)([\s^.]{0,6})(srt[0-9]{4})
The best tip I can give to anyone attempting regex matching is to test their solution using https://rubular.com/
Don't put the ^ at the start of the groups; there it tries to match the beginning of the string. The extra / characters are also unnecessary.
'(looking|want|need|seek|seeking|request|requesting)([^.!?,"]{0,10})(status|update)(.{0,6})(srt[0-9]{4})'
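Putting the advice together, a runnable version of the function might look like the sketch below. It keeps the asker's function name and return values; the constant name NEED_SR_UPDATE and the choice to compile the pattern once at module level are my own assumptions.

```python
import re

# Compiled once at import time; adjacent raw strings are concatenated.
NEED_SR_UPDATE = re.compile(
    r'(looking|want|need|seek|seeking|request|requesting)'
    r'([^.!?,"]{0,10})'   # up to 10 characters that do not end the phrase
    r'(status|update)'
    r'(.{0,6})'           # short filler such as " on "
    r'(srt[0-9]{4})'
)

def status_update_regex(email):
    """Return 1 if the e-mail asks for an update on an SRT number, else 0."""
    # Lower-case first so that "SRT1000" matches the lower-case pattern.
    return 1 if NEED_SR_UPDATE.search(email.lower()) else 0

print(status_update_regex("need update on SRT1000"))  # 1
print(status_update_regex("hello there"))             # 0
```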

Replace every caret with a superscript in a python string [closed]

Closed 4 years ago.
I want to replace every caret character with a unicode superscript, for nicer printing of equations in python. My problem is, every caret may be followed by a different exponent value, so in the unicode string u'\u00b*', the * wildcard needs to be the exponent I want to print in the string. I figured some regex would work for this, but my experience with that is very little.
For example, suppose I have the string
"x^3-x^2"
I would then want this to be converted to the unicode string
u"x\u00b3-x\u00b2"
You can use re.sub and str.translate to catch exponents and change them to unicode superscripts.
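import re

def to_superscript(num):
    # Map each ASCII digit to its Unicode superscript counterpart.
    transl = str.maketrans(dict(zip('1234567890', '¹²³⁴⁵⁶⁷⁸⁹⁰')))
    return num.translate(transl)

s = 'x^3-x^2'
# Replace a caret, optional whitespace, and the digits after it with superscript digits.
out = re.sub(r'\^\s*(\d+)', lambda m: to_superscript(m[1]), s)
print(out)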
Output
x³-x²
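If signed exponents can occur, the same idea extends naturally. The following is only a sketch: the sign handling, the module-level table SUPERSCRIPTS and the function name caret_to_superscript are my assumptions and go beyond what the question asked for.

```python
import re

# Translation table built once: digits plus "+" and "-" to superscripts.
SUPERSCRIPTS = str.maketrans('0123456789+-', '⁰¹²³⁴⁵⁶⁷⁸⁹⁺⁻')

def caret_to_superscript(text):
    """Replace ^<exponent> with Unicode superscript characters."""
    # The caret and any whitespace after it are dropped; the optional
    # sign and the digits are translated character by character.
    return re.sub(r'\^\s*([+-]?\d+)',
                  lambda m: m.group(1).translate(SUPERSCRIPTS),
                  text)

print(caret_to_superscript('x^3-x^2'))      # x³-x²
print(caret_to_superscript('x^-1 + y^10'))  # x⁻¹ + y¹⁰
```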

How to extract groups contains desired string from between quotes using regex? [closed]

Closed 4 years ago.
I would like to extract some strings from between quotes using regular expression. The text is shown below:
CCKeyUpDomReady('test.asmx/asdasd', 'QMlPJZTOH09XOPCcbB2jcg==', '0OO6h+G2Tzhr5XWj1Upg0A==', '0OO6h+G2Tzhr5XWj1Upg0A==', '/qqwweq2.asmx/qqq')
Expected result must be:
test.asmx/asdasd
/qqwweq2.asmx/qqq
How can I do it? Here is the platform for testing:
https://regexr.com/3n142
The criterion: the string between the quotes must contain the word "asmx". The real text is much longer than what is shown above. You can think of it as searching for asmx URLs in a website's source code.
The following regex does this:
'((?:[^'\\]|\\.)*asmx(?:[^'\\]|\\.)*)'
'  : match this literally
((?:[^'\\]|\\.)*asmx(?:[^'\\]|\\.)*)  : capture the following into capture group 1
    (?:[^'\\]|\\.)*  : a beautiful trick gathered from PhiLho's answer to "Regex for quoted string with escaping quotes". It matches any escaped character (such as \') or any character other than ' and \.
    asmx  : the OP's search string/criterion
    (?:[^'\\]|\\.)*  : the same again
'  : match this literally
The result is in capture group 1:
test.asmx/asdasd
/qqwweq2.asmx/qqq
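The question does not name a language, so as an illustration only, here is how the pattern could be applied in Python with re.findall, which returns the contents of capture group 1 for every match. The variable names are arbitrary.

```python
import re

text = ("CCKeyUpDomReady('test.asmx/asdasd', 'QMlPJZTOH09XOPCcbB2jcg==', "
        "'0OO6h+G2Tzhr5XWj1Upg0A==', '0OO6h+G2Tzhr5XWj1Upg0A==', "
        "'/qqwweq2.asmx/qqq')")

# Same regex as above, written as a raw string so the backslashes
# reach the regex engine unchanged.
pattern = r"'((?:[^'\\]|\\.)*asmx(?:[^'\\]|\\.)*)'"

# With exactly one capture group, findall returns a list of group-1 strings.
urls = re.findall(pattern, text)

print(urls)  # ['test.asmx/asdasd', '/qqwweq2.asmx/qqq']
```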
