Python: how to extract the variables between 2 constant substring [closed] - python

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 1 year ago.
I am trying to extract the variable parts between two constant substrings in a string. For example, from the string below
I wish to extract Apple, Orange, Watermelon, Kiwi, ... 13cups, 14cups, ... 19cups. As a first step I am using a regular expression to take the text between the $ signs, but I do not get any results.
Can anyone advise on the correct expression, or whether there is a better way to extract it?
Thanks.
import re
file = '$n$n$n$xa0$n$nSHOWALL$nSHOWALL%GROWTH$n$n$xa0$n$xa0$n$n$n$nApple$na$nOrange$n$nWatermelon$nKiwi$n$nBanana$nJackfruit$n$nGuava$na$nGrape$n$nPlum$na$nOrange$n$nCoconut$nWatermelon$n$n12cups$n13cups$n$n14cups$na$n15cups$n$n16cups$na$n17cups$n$n18cups$n19cups$n'
found = re.findall(r'(?=$(.*?)$)',file)
print(found)

Given that the rules for identifying the required character sequences are ambiguous, I contend that a regular expression is impractical here. No doubt it could be done, but here's a quick-and-dirty approach to the problem:
data = '$n$n$n$xa0$n$nSHOWALL$nSHOWALL%GROWTH$n$n$xa0$n$xa0$n$n$n$nApple$na$nOrange$n$nWatermelon$nKiwi$n$nBanana$nJackfruit$n$nGuava$na$nGrape$n$nPlum$na$nOrange$n$nCoconut$nWatermelon$n$n12cups$n13cups$n$n14cups$na$n15cups$n$n16cups$na$n17cups$n$n18cups$n19cups$n'
for token in data.split('$n'):
    if token not in ('SHOWALL%GROWTH', 'SHOWALL', '$xa0', 'a', ''):
        print(token)
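If a regular expression is still preferred, note that an unescaped $ in a pattern is an end-of-string anchor rather than a literal dollar sign, which is likely why the original findall call returns nothing useful. A minimal sketch, assuming the wanted values are the runs of non-$ characters that follow a literal $n; the shortened sample string and the UNWANTED set are illustrative assumptions, not part of the question:

```python
import re

# Shortened sample in the same shape as the question's string (illustrative only).
data = '$n$xa0$nSHOWALL$nApple$na$nOrange$n$n12cups$n13cups$n'

# Tokens to discard; this set is an assumption based on the answer above.
UNWANTED = {'SHOWALL', 'SHOWALL%GROWTH', 'a'}

# \$ matches a literal dollar sign; [^$]+ captures everything up to the next one.
tokens = [t for t in re.findall(r'\$n([^$]+)', data) if t not in UNWANTED]
print(tokens)  # ['Apple', 'Orange', '12cups', '13cups']
```

Because the pattern requires at least one non-$ character after $n, empty fields and the $xa0 entries are skipped without extra filtering.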

Related

negative lookbehind not working as expected [closed]

Closed 2 years ago.
I have strings of this form:
FPLBX(2x3)ZE(53x13)(4x7)ZGQO
I want to find the blocks in parentheses, but only when they are not preceded by another such block.
The other way around (not followed by another block) works perfectly fine, but I can't make it work for the preceding case.
Current regex:
(\(\d*x\d*\))(?<!\))
You simply need to put the negative lookbehind assertion, i.e. the (?<!\)) part, in front of the rest of your pattern, because a lookbehind checks the characters just before the position where it appears:
>>> import re
>>> txt = "FPLBX(2x3)ZE(53x13)(4x7)ZGQO"
>>> re.findall(r"(?<!\))(\(\d*x\d*\))", txt)
['(2x3)', '(53x13)']
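To see why the placement matters, here is a small sketch comparing the three variants on the same input. The variable names are illustrative. With the lookbehind placed after the group, the character just before the current position is always the group's own closing parenthesis, so the assertion can never succeed:

```python
import re

txt = "FPLBX(2x3)ZE(53x13)(4x7)ZGQO"
group = r"\(\d*x\d*\)"  # one parenthesised block such as (2x3)

# Lookbehind after the block: it inspects the block's own ")", so it always fails.
lookbehind_after = re.findall(group + r"(?<!\))", txt)

# Lookbehind before the block: it inspects the character preceding "(".
lookbehind_before = re.findall(r"(?<!\))" + group, txt)

# The "other way around": a negative lookahead rejects blocks followed by "(".
not_followed = re.findall(group + r"(?!\()", txt)

print(lookbehind_after)   # []
print(lookbehind_before)  # ['(2x3)', '(53x13)']
print(not_followed)       # ['(2x3)', '(4x7)']
```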

unable to match regex pattern in python [closed]

Closed 3 years ago.
I am trying to match a regex from an email. If the e-mail says "need update on SRT1000" the regex needs to match. I have my code as below, but it is not working. Can someone look at this and let me know what is wrong here?
import re
def status_update_regex(email):
    email = email.lower()
    need_sr_update_regex = re.compile('(looking|want|need|seek|seeking|request|requesting)([^/./!/?/,/"]{0,10})(status|update)(^.{0,6})(^srt[0-9]{4})')
    if need_sr_update_regex.search(email) != None:
        return 1
    else:
        return 0
You didn't allow for whitespace (\s) between the words, and your pattern has nothing to match the word "on":
(looking|want|need|seek|seeking|request|requesting)([\s^.!?,"]{0,10})(status|update)([\s^.]{0,6})(on)([\s^.]{0,6})(srt[0-9]{4})
The best tip I can give to anyone attempting regex matching is to test their solution using https://rubular.com/
Don't put the ^ inside the groups; outside a character class it tries to match the beginning of the string. The extra / characters are also unnecessary.
'(looking|want|need|seek|seeking|request|requesting)([^.!?,"]{0,10})(status|update)(.{0,6})(srt[0-9]{4})'
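Putting these suggestions together, the function could be sketched as follows. This is one possible reading of the intent, not the asker's final code: it assumes up to 10 non-punctuation characters may separate the request word from "status"/"update", and up to 6 arbitrary characters (enough for " on ") may precede the ticket number. The constant name NEED_SR_UPDATE is made up for this example.

```python
import re

# Compiled once at import time; the pieces are split across lines for readability.
NEED_SR_UPDATE = re.compile(
    r'(looking|want|need|seek|seeking|request|requesting)'
    r'[^.!?,"]{0,10}'   # a short gap that must not cross sentence punctuation
    r'(status|update)'
    r'.{0,6}'           # room for a filler such as " on "
    r'srt[0-9]{4}'
)

def status_update_regex(email):
    """Return 1 if the email appears to ask for an SRT status update, else 0."""
    return 1 if NEED_SR_UPDATE.search(email.lower()) else 0

print(status_update_regex("need update on SRT1000"))   # 1
print(status_update_regex("please send the invoice"))  # 0
```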

Search and Replace a word within a word in Python. Replace() method not working [closed]

Closed 4 years ago.
How do I search and replace using built-in Python methods?
For instance, with a string of appleorangegrapes (yes all of them joined),
Replace "apple" with "mango".
The .replace method only works if the words are evenly spaced out but not if they are combined as one. Is there a way around this?
I searched the web but again the .replace method only gives me an example if they are spaced out.
Thank you for looking at the problem!
This works exactly as expected and advertised. Have a look:
s = 'appleorangegrapes'
print(s) # -> appleorangegrapes
s = s.replace('apple', 'mango')
print(s) # -> mangoorangegrapes
The only thing that you have to be careful of is that replace is not an in-place operator and as such it does not update s automatically; it only creates a new string that you have to assign to something.
s = 'appleorangegrapes'
s.replace('apple', 'mango') # a new string is returned, but it is discarded here
print(s) # -> appleorangegrapes
replace works on any string, whether or not the words are separated. Why do you think it doesn't? Here is a test:
>>> s='appleorangegrapes'
>>> s.replace('apple','mango')
'mangoorangegrapes'
>>>
Don't you see that you received your expected result?

How to extract groups contains desired string from between quotes using regex? [closed]

Closed 4 years ago.
I would like to extract some strings from between quotes using regular expression. The text is shown below:
CCKeyUpDomReady('test.asmx/asdasd', 'QMlPJZTOH09XOPCcbB2jcg==', '0OO6h+G2Tzhr5XWj1Upg0A==', '0OO6h+G2Tzhr5XWj1Upg0A==', '/qqwweq2.asmx/qqq')
Expected result must be:
test.asmx/asdasd
/qqwweq2.asmx/qqq
How can I do it? Here is the platform for testing:
https://regexr.com/3n142
The criterion: the string between the quotes must contain the word "asmx". The real text is much longer than what is shown above. Think of it as searching for asmx URLs in a website's source code.
See regex in use here
'((?:[^'\\]|\\.)*asmx(?:[^'\\]|\\.)*)'
Breakdown:
'  Match this literally
((?:[^'\\]|\\.)*asmx(?:[^'\\]|\\.)*)  Capture the following into capture group 1
    (?:[^'\\]|\\.)*  This is a beautiful trick gathered from PhiLho's answer to "Regex for quoted string with escaping quotes". It matches any number of characters, each being either a backslash-escaped character (such as \') or any character other than ' and \
    asmx  The OP's search string/criterion
    (?:[^'\\]|\\.)*  The same again
'  Match this literally
The results are in capture group 1:
test.asmx/asdasd
/qqwweq2.asmx/qqq
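The same pattern can be tried in Python. A minimal sketch using re.findall, which returns the contents of capture group 1 when the pattern has exactly one group; the variable names are illustrative:

```python
import re

text = ("CCKeyUpDomReady('test.asmx/asdasd', 'QMlPJZTOH09XOPCcbB2jcg==', "
        "'0OO6h+G2Tzhr5XWj1Upg0A==', '0OO6h+G2Tzhr5XWj1Upg0A==', '/qqwweq2.asmx/qqq')")

# A raw string keeps the backslashes exactly as the regex engine expects them.
pattern = r"'((?:[^'\\]|\\.)*asmx(?:[^'\\]|\\.)*)'"

urls = re.findall(pattern, text)
print(urls)  # ['test.asmx/asdasd', '/qqwweq2.asmx/qqq']
```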

Extract a part of a string in python [closed]

Closed 6 years ago.
I have a .txt file that contains a really long mRNA sequence. I don't know the exact length of the sequence.
What I need to do is extract the part of the sequence that is valid, meaning it starts with "AUG" and ends with "UAA", "UAG" or "UGA". Since the sequence is so long, I don't know the index of any of the letters or where the valid sequence is.
I need to save the new sequence in another variable.
Essentially, what you need to do, without coding the whole thing for you, is:
Example string:
rnaSequence = 'ACGUAFBHUAUAUAGAAAAUGGAGAGAGAAAAUUUGGGGGGGAAAAAAUAAAAAGGGUAUAUAGAUGAGAGAGA'
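You will want to find the index of 'AUG' and the index of the first 'UAA', 'UAG', or 'UGA' that follows it. Something like this:
rnaStart = rnaSequence.index('AUG')
rnaEnd = rnaSequence.index('UAA', rnaStart + 3)  # do the same for 'UAG' and 'UGA' and keep the smallest index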
Then you'll need to set the slice of the string to a new variable
rnaSubstring = rnaSequence[rnaStart:rnaEnd+3]
Which in my string above, returns:
AUGGAGAGAGAAAAUUUGGGGGGGAAAAAAUAA
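The steps above can be put together as a small function. This is only a sketch under stated assumptions: it takes the first 'AUG' and the nearest stop codon after it, and it does not check that the stop codon lies in the same reading frame as the start codon. The function name and the STOP_CODONS constant are made up for this example.

```python
STOP_CODONS = ('UAA', 'UAG', 'UGA')

def extract_coding_region(seq):
    """Return the text from the first 'AUG' through the nearest following stop codon, or None."""
    start = seq.find('AUG')
    if start == -1:
        return None
    # Look for each stop codon only after the start codon; find() returns -1 when absent.
    ends = [i for i in (seq.find(codon, start + 3) for codon in STOP_CODONS) if i != -1]
    if not ends:
        return None
    return seq[start:min(ends) + 3]

rna_sequence = 'ACGUAFBHUAUAUAGAAAAUGGAGAGAGAAAAUUUGGGGGGGAAAAAAUAAAAAGGGUAUAUAGAUGAGAGAGA'
rna_substring = extract_coding_region(rna_sequence)
print(rna_substring)  # AUGGAGAGAGAAAAUUUGGGGGGGAAAAAAUAA
```

Using find() instead of index() avoids a ValueError when a codon is missing, and searching from start + 3 ignores stop codons that occur before the start codon, such as the 'UAG' near the beginning of the example string.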
