re.findall only finding half the patterns [duplicate] - python

This question already has answers here:
Why doesn't [01-12] range work as expected?
(7 answers)
Closed 4 years ago.
I'm using re.findall to parse the year and month from a string, but it only returns matches from the first half of the string. Why is this?
date_string = '2011-1-1_2012-1-3,2015-3-1_2015-3-3'
find_year_and_month = re.findall('[1-2][0-9][0-9][0-9]-[1-12]', date_string)
print(find_year_and_month)
and my output is this:
['2011-1', '2012-1']
This is correct for the first two dates, but why am I only getting matches for half the string?

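[1-12] doesn't do what you think it does. Inside square brackets each item is a single character or a range of characters, so this is the range 1 to 1 plus a literal 2: it matches exactly one character, either 1 or 2. That is why 2015-3 is never matched.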
See this question for some replacement regex options, like ([1-9]|1[0-2]): How to represent regex number ranges (e.g. 1 to 12)?
If you want an interactive tool for experimenting with regexes, I personally recommend Regexr.
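To see the character class behaviour for yourself, here is a minimal sketch that runs the original pattern next to one using [12]. The variable names original and equivalent are just for illustration:

```python
import re

date_string = '2011-1-1_2012-1-3,2015-3-1_2015-3-3'

# [1-12] is a character class: the range 1-1 plus a literal 2, i.e. the same as [12].
original = re.findall(r'[1-2][0-9][0-9][0-9]-[1-12]', date_string)
equivalent = re.findall(r'[1-2][0-9][0-9][0-9]-[12]', date_string)

print(original)    # ['2011-1', '2012-1']
print(equivalent)  # ['2011-1', '2012-1']
```

Both patterns skip 2015-3 because 3 is not in the class.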

Adjust your regex pattern as shown below:
import re
date_string = '2011-1-1_2012-1-3,2015-3-1_2015-3-3'
find_year_and_month = re.findall('([1-2][0-9]{3}-(?:1[0-2]|[1-9]))', date_string)
print(find_year_and_month)
The output:
['2011-1', '2012-1', '2015-3', '2015-3']
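One detail worth checking if your data can contain months 10 to 12: the order of the alternatives matters, because the regex engine takes the first alternative that succeeds. A small sketch, assuming a made-up sample string with two-digit months:

```python
import re

# Hypothetical input with two-digit months, not from the question.
sample = '2011-12-25_2012-10-3'

# Two-digit alternative first: 10, 11 and 12 are matched in full.
two_digit_first = re.findall(r'[12][0-9]{3}-(?:1[0-2]|[1-9])', sample)

# One-digit alternative first: the engine stops after the leading 1.
one_digit_first = re.findall(r'[12][0-9]{3}-(?:[1-9]|1[0-2])', sample)

print(two_digit_first)  # ['2011-12', '2012-10']
print(one_digit_first)  # ['2011-1', '2012-1']
```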

Related

python find all strings starting and ending with certain characters [duplicate]

This question already has an answer here:
Find all strings that are in between two sub strings
(1 answer)
Closed last month.
How can I extract with python all strings starting with 'a[' and ending with ']'?
for example
str= "a[0]*x**3+a[13]"
result : a[0], a[13]
thanks
We can use re.findall here:
import re
inp = "a[0]*x**3+a[13]"
matches = re.findall(r'\ba\[.*?\]', inp)
print(matches) # ['a[0]', 'a[13]']
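The ? after .* is what keeps each match from running on to the last ] in the string. A quick comparison, as a sketch; the negated character class is an alternative I'm adding for illustration, not part of the answer above:

```python
import re

inp = "a[0]*x**3+a[13]"

# Lazy quantifier: stop at the first closing bracket.
lazy = re.findall(r'\ba\[.*?\]', inp)

# Greedy quantifier: run to the last closing bracket in the string.
greedy = re.findall(r'\ba\[.*\]', inp)

# Negated class: anything except ], so the match cannot cross a closing bracket.
negated = re.findall(r'\ba\[[^\]]*\]', inp)

print(lazy)     # ['a[0]', 'a[13]']
print(greedy)   # ['a[0]*x**3+a[13]']
print(negated)  # ['a[0]', 'a[13]']
```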

How to extract the first float from a string in python [duplicate]

This question already has answers here:
Extract float/double value
(5 answers)
Closed 11 months ago.
I have a string containing a number and some words, and I want to extract the first float from the string.
myString = "12.5% per month"
myFloat= [float(s) for s in re.findall(r'\b\d+\b', myString )][0]
I want to have 12.5 as myFloat.
Thank you
To avoid changing your code completely:
import re
myString = "12.5% 35.6 per month"
myFloat= [float(s) for s in re.findall(r'[0-9]+\.[0-9]+', myString )][0]
All I've changed is the regex, to r'[0-9]+\.[0-9]+'.
But, as Oliver pointed out in his comment, you don't need re.findall to get the first occurrence.
You can simply use re.search: myFloat = float(re.search(r'[0-9]+\.[0-9]+', myString).group(0))
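Note that re.search returns None when nothing matches, so calling .group(0) directly raises AttributeError on such input. A minimal sketch of a guarded version; first_float is a hypothetical helper name, and it keeps the same regex, so it only finds numbers that contain a decimal point:

```python
import re

def first_float(text):
    """Return the first decimal number in text as a float, or None if there is none."""
    match = re.search(r'[0-9]+\.[0-9]+', text)
    return float(match.group(0)) if match else None

print(first_float("12.5% 35.6 per month"))  # 12.5
print(first_float("no numbers here"))       # None
```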

Regex to extract the date based on particular string [duplicate]

This question already has answers here:
Python/Regex - How to extract date from filename using regular expression?
(5 answers)
Closed 2 years ago.
I am trying to extract the date if the string matches a particular regex.
Ex :
string1 = '10/22/2019 from'
string2 = '12/22/2020 33455SE'
string3 = '7/20/2020 S0023'
Am trying to extract the string 2
Regex used :
r'(\d+[/]\d+[/]\d+[-\s\.]\d+)'
The regex above works if the string looks like "10/22/2019 33455", but if letters follow the digits, as in "33455SE", my code fails.
Any help ?
Tried codes :
r'(\d+[/]\d+[/]\d+[-\s\.]^\d+)' - Tried to use starts with.
Expected output : only string 2 and string 3
12/22/2020
7/20/2020
This works
import re
a = "3443E hello 10/22/2019 33455SE"
number = re.findall(r"[0-9]{2}[/][0-9]{2}[/][0-9]{4}",a)
print(number[0])
Output :
10/22/2019
This should work:
r'(\d+[/]\d+[/]\d+[-\s\.]\d+[A-Z]*)'
\d{1,2}/\d{2}/\d{4}(?=\s\w*\d+)
https://regex101.com/r/gCXHQ6/3
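To check the lookahead pattern against the three strings from the question, here is a short sketch. The names pattern, strings and results are mine:

```python
import re

# Date followed by whitespace and a token that contains at least one digit.
# The lookahead checks the token without including it in the match.
pattern = re.compile(r'\d{1,2}/\d{2}/\d{4}(?=\s\w*\d+)')

strings = ['10/22/2019 from', '12/22/2020 33455SE', '7/20/2020 S0023']

results = []
for s in strings:
    m = pattern.search(s)
    results.append(m.group(0) if m else None)

print(results)  # [None, '12/22/2020', '7/20/2020']
```

String 1 is rejected because "from" contains no digit, which matches the expected output in the question.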

What is wrong with this Python regular expression? [duplicate]

This question already has answers here:
re.findall behaves weird
(3 answers)
Closed 4 years ago.
Given a string, I want to find all the substrings consisting of two or three '4,'.
For example, given '1,4,3,2,1,1,4,4,3,2,1,4,4,3,2,1,4,4,4,3,2,'
I want to get ['4,4,', '4,4,', '4,4,4'].
str_ = '1,4,4,3,2,1,1,4,4,3,2,1,4,4,3,2,1,4,4,3,2,'
m = re.findall(r"(4,){2,3}", str_)
what I get is :
['4,', '4,', '4,', '4,']
what's wrong?
It seems to me that the parentheses around '4,' are interpreted as a capturing group, rather than just telling Python that '4' and ',' should occur together. However, I don't know how to fix this.
Just use a non-capturing group:
import re
s = '1,4,3,2,1,1,4,4,3,2,1,4,4,3,2,1,4,4,4,3,2,'
print(re.findall(r'(?:4,?){2,3}', s))
Prints:
['4,4,', '4,4,', '4,4,4,']
EDIT:
Edited regex to capture 2 or 3 elements "4,"
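The reason the original attempt returned only '4,' is that re.findall returns the contents of the group when the pattern has exactly one capturing group, and a repeated group keeps only its last repetition. A small sketch on a shortened, made-up sample string:

```python
import re

# Hypothetical short sample, not the string from the question.
s = '1,4,4,3,2,4,4,4,'

# One capturing group: findall returns the group's last repetition per match.
capturing = re.findall(r'(4,){2,3}', s)

# Non-capturing group: findall returns the whole match.
non_capturing = re.findall(r'(?:4,){2,3}', s)

# Alternative: capture the whole repetition in an outer group.
outer_group = re.findall(r'((?:4,){2,3})', s)

print(capturing)      # ['4,', '4,']
print(non_capturing)  # ['4,4,', '4,4,4,']
print(outer_group)    # ['4,4,', '4,4,4,']
```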

Capture repeated characters and split using Python [duplicate]

This question already has answers here:
How can I tell if a string repeats itself in Python?
(13 answers)
Closed 3 years ago.
I need to split a string by using repeated characters.
For example:
My string is "howhowhow"
I need output as 'how,how,how'.
I can't use 'how' directly in my regex because my input varies. I need to check whether the string consists of a repeating substring and, if so, split it on that substring.
import re
string = "howhowhow"
print(','.join(re.findall(re.search(r"(.+?)\1", string).group(1), string)))
OUTPUT
howhowhow -> how,how,how
howhowhowhow -> how,how,how,how
testhowhowhow -> how,how,how # not clearly defined by OP
The pattern is non-greedy so that howhowhowhow doesn't map to howhow,howhow which is also legitimate. Remove the ? if you prefer the longest match.
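The one-liner passes the repeated substring back to re.findall as a pattern and assumes a repeat always exists. A slightly more defensive sketch, assuming you want the input returned unchanged when nothing repeats; split_repeats is a hypothetical name:

```python
import re

def split_repeats(text):
    """Join occurrences of the shortest immediately repeated substring with commas."""
    m = re.search(r'(.+?)\1', text)
    if m is None:
        # No repeated substring found: return the input unchanged (an assumption).
        return text
    unit = m.group(1)
    # re.escape stops regex metacharacters in the unit from being interpreted.
    return ','.join(re.findall(re.escape(unit), text))

print(split_repeats('howhowhow'))  # how,how,how
print(split_repeats('a+a+'))       # a+,a+
print(split_repeats('abc'))        # abc
```

Without re.escape, the 'a+a+' case would use a+ as a pattern and return a,a instead.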
lengthofRepeatedChar = 3
str1 = 'howhowhow'
HowmanyTimesRepeated = len(str1) // lengthofRepeatedChar
((str1[:lengthofRepeatedChar]+',')*HowmanyTimesRepeated)[:-1]
'how,how,how'
This works when you know the length of the repeated substring.
