This question already has an answer here:
Find all strings that are in between two sub strings
(1 answer)
Closed last month.
How can I use Python to extract all substrings that start with 'a[' and end with ']'?
For example:
str= "a[0]*x**3+a[13]"
Expected result: a[0], a[13]
thanks
We can use re.findall with a lazy quantifier, so that each match stops at the first closing bracket:
import re
inp = "a[0]*x**3+a[13]"
matches = re.findall(r'\ba\[.*?\]', inp)
print(matches) # ['a[0]', 'a[13]']
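If the bracket contents are always a non-negative integer index, a stricter pattern may be safer than `.*?`. The following is only a sketch under that assumption; the helper name `extract_indexed` and its `name` parameter are made up for illustration and do not appear in the answer above.

```python
import re

def extract_indexed(expr, name="a"):
    """Return every `name[<digits>]` token in expr (assumes integer indices)."""
    # re.escape guards against names that contain regex metacharacters.
    pattern = r'\b' + re.escape(name) + r'\[\d+\]'
    return re.findall(pattern, expr)

print(extract_indexed("a[0]*x**3+a[13]"))  # ['a[0]', 'a[13]']
```

The leading `\b` keeps a name such as `ba[1]` from being reported as `a[1]`.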
This question already has answers here:
How do I coalesce a sequence of identical characters into just one?
(10 answers)
Closed 2 years ago.
I have a string like the following (I don't know in advance how many identical characters occur in a row):
s = '&&&&&word&&&word2&&&'
and would like to obtain as a result this string:
'&word&word2&'
My workaround is something like this (probably not efficient for large texts):
while True:
    if '&&' not in s:
        break
    s = s.replace('&&', '&')
You can use a regex to replace every run of one or more '&' (&+) with a single '&':
import re
s = '&&&&&word&&&word2&&&'
res = re.sub(r'&+', '&', s)
print(res)
# &word&word2&
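When the repeated character is not known in advance, a backreference can stand in for it. This is a hedged sketch rather than part of the answer above; the function name `squeeze` is hypothetical. Note that it collapses runs of any character, so doubled letters inside words would be shortened too.

```python
import re

def squeeze(text):
    """Collapse each run of a repeated character into a single occurrence."""
    # (.) captures one character; \1+ matches one or more further copies of it.
    return re.sub(r'(.)\1+', r'\1', text)

print(squeeze('&&&&&word&&&word2&&&'))  # &word&word2&
```

To limit the effect to particular characters, the `.` could be swapped for a character class such as `[&]`.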
This question already has answers here:
String formatting in Python [duplicate]
(14 answers)
Closed 2 years ago.
I'm trying to understand the following code, which builds a complex regex.
I do not understand how the full_regex line works. What is the purpose of the '%s' placeholders, and of the other % before (regex1, regex2, ...)?
Can someone please help with this?
regex1 = '(\d{1,2}[/-]\d{1,2}[/-]\d{2,4})'
regex2 = '((?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[\S]*[+\s]\d{1,2}[,]{0,1}[+\s]\d{4})'
regex3 = '(\d{1,2}[+\s](?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[\S]*[+\s]\d{4})'
regex4 = '((?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)[\S]*[+\s]\d{4})'
regex5 = '(\d{1,2}[/-][1|2]\d{3})'
regex6 = '([1|2]\d{3})'
full_regex = '(%s|%s|%s|%s|%s|%s)' %(regex1, regex2, regex3, regex4, regex5, regex6)
The expression
full_regex = '(%s|%s|%s|%s|%s|%s)' % (regex1, regex2, regex3, regex4, regex5, regex6)
just merges the six patterns into one large pattern that alternates between them. The % is not regex syntax; it is Python's string-formatting operator, which replaces each %s, in order, with the corresponding item of the tuple on its right.
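A smaller case may make the substitution easier to see. The two patterns below are short stand-ins chosen for illustration, not the ones from the question, and the `str.join` variant is an alternative offered as a suggestion.

```python
import re

# Two short stand-in patterns (not the ones from the question).
year = r'(\d{4})'
month = r'(Jan|Feb|Mar)'

# Each %s is replaced, in order, by the matching item of the tuple.
combined = '(%s|%s)' % (year, month)
print(combined)  # ((\d{4})|(Jan|Feb|Mar))

# str.join builds the same alternation without counting placeholders.
joined = '(' + '|'.join([year, month]) + ')'
assert combined == joined

print(re.search(combined, 'due Mar 2021').group(0))  # Mar
```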
This question already has answers here:
re.findall behaves weird
(3 answers)
Closed 4 years ago.
Given a string, I want to find all the substrings consisting of two or three consecutive '4,'.
For example, given '1,4,3,2,1,1,4,4,3,2,1,4,4,3,2,1,4,4,4,3,2,'
I want to get ['4,4,', '4,4,', '4,4,4,'].
str_ = '1,4,4,3,2,1,1,4,4,3,2,1,4,4,3,2,1,4,4,3,2,'
m = re.findall(r"(4,){2,3}", str_)
What I get is:
['4,', '4,', '4,', '4,']
What's wrong?
It seems to me that the parentheses around '4,' are interpreted as a capturing group, rather than just telling Python that '4' and ',' should occur together. However, I don't know how to fix this.
Use a non-capturing group, (?:...). When the pattern contains a capturing group, re.findall returns the group's text instead of the whole match:
import re
s = '1,4,3,2,1,1,4,4,3,2,1,4,4,3,2,1,4,4,4,3,2,'
print(re.findall(r'(?:4,?){2,3}', s))
Prints:
['4,4,', '4,4,', '4,4,4,']
EDIT:
Edited regex to capture 2 or 3 elements "4,"
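To see the difference side by side, the sketch below runs both forms on the same input. It uses a mandatory comma, `(?:4,){2,3}`, which is a slight variation on the pattern in the answer and is assumed here to match the question's wording more closely.

```python
import re

s = '1,4,3,2,1,1,4,4,3,2,1,4,4,3,2,1,4,4,4,3,2,'

# Capturing group: findall returns the group's text (its last repetition).
captured = re.findall(r'(4,){2,3}', s)
print(captured)  # ['4,', '4,', '4,']

# Non-capturing group: findall returns the whole match.
whole = re.findall(r'(?:4,){2,3}', s)
print(whole)  # ['4,4,', '4,4,', '4,4,4,']
```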
This question already has answers here:
Remove characters from beginning and end or only end of line
(5 answers)
Closed 4 years ago.
I have the string "........my.python.string" and want to remove all the leading "." characters, up to the first alphanumeric character. Is there a way to do this other than converting the string to a list and working from there?
You can use re.sub:
import re
s = "........my.python.string"
new_s = re.sub(r'^\.+', '', s)
print(new_s)
Output:
my.python.string
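For this particular case a regex may not be needed at all: the built-in `str.lstrip` strips leading characters directly. A minimal sketch, assuming only dots need to be removed from the start:

```python
s = "........my.python.string"

# lstrip removes every leading character that appears in its argument.
print(s.lstrip('.'))  # my.python.string
```

Dots inside the string and at its end are left untouched.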