I have a string that looks like a path, and I am trying to extract 022014-101 from it with a regular expression I got from here.
str1 = "Test 123 <C:\User\Test\xyz\022014-101\more\stuff\022014\1> Text"
Actually, I am retrieving the string from a text file, so I don't have to escape it, but for testing purposes I used this string instead:
str1 = "<C:\\User\\Test\\xyz\\022014-101\\more\\stuff\\022014\\1>"
Here is the code I tried to match the first occurrence of 022014-101:
import re
p = re.compile('(?<=\\)[\d]{6}[^\\]*')
m = p.match(str1)
print m.group(0) #Line 6
It gave me this error:
Traceback (most recent call last):
File "test12.py", line 6, in <module>
print m.group(0)
AttributeError: 'NoneType' object has no attribute 'group'
How can I get the desired output 022014-101?
EDIT:
That did it:
import re
m = re.search(r'(?<=\\)[\d]{6}[^\\]*', str1)
print m.group(0)
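For anyone wondering why this works: re.match only tries the pattern at the very start of the string, whereas re.search scans forward for the first position where it matches. A minimal Python 3 sketch of the difference, using the sample path from the question (the names at_start and anywhere are just for illustration):

```python
import re

# Sample string from the question; a raw string keeps the backslashes literal.
str1 = r"Test 123 <C:\User\Test\xyz\022014-101\more\stuff\022014\1> Text"

# A backslash lookbehind, six digits, then everything up to the next backslash.
pattern = re.compile(r"(?<=\\)\d{6}[^\\]*")

# match() only tries position 0, where the text is "Test", so it returns None.
at_start = pattern.match(str1)

# search() scans the whole string and stops at the first hit.
anywhere = pattern.search(str1)

print(at_start)            # None
print(anywhere.group(0))   # 022014-101
```

Writing the pattern as a raw string (r'...') also matters here, because otherwise Python consumes one level of backslashes before the regex engine sees them.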
I am trying to parse the output from an SSH session using the Paramiko module. Paramiko's channel.recv() returns the output as bytes, which I then converted to a UTF-8 string using bytes.decode("utf-8"). No matter what encoding I use, the regex call always raises a TypeError: expected string or bytes-like object exception.
import re
bytes = b"optical temp=10950"
bytes = bytes.decode("utf-8")
pattern = re.compile("(?<=temp=).*")
temp = re.search(bytes, pattern)
Traceback:
Traceback (most recent call last):
File "main.py", line 7, in <module>
temp = re.search(bytes, pattern)
File "/usr/lib/python3.8/re.py", line 201, in search
return _compile(pattern, flags).search(string)
TypeError: expected string or bytes-like object
Your code is almost functional.
re.search takes the pattern first, then the string to be searched:
import re
bytes = b"optical temp=10950"
bytes = bytes.decode("utf-8")
pattern = re.compile("(?<=temp=).*")
temp = re.search(pattern, bytes)
#OR
temp = pattern.search(bytes)
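Once the arguments are in the right order, the lookbehind returns everything after temp=. A small Python 3 sketch, assuming the reading is always an integer (the names raw and text and the int conversion are illustrative, not part of the original code). It also avoids calling a variable bytes, which shadows the built-in type:

```python
import re

raw = b"optical temp=10950"      # stands in for what channel.recv() returned
text = raw.decode("utf-8")

# Assumption: the value after "temp=" is a run of digits.
pattern = re.compile(r"(?<=temp=)\d+")

# On a compiled pattern, the string to search is the only required argument.
match = pattern.search(text)
temp = int(match.group()) if match else None

print(temp)  # 10950
```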
from mrjob.job import MRJob
import re
Creation_date=re.compile('CreationDate=\"[0-9]*\"[:17]')
class Part2(MRJob):
    def mapper(self, _, line):
        DateOnly=Creation_date.group(0).split("=")
        if(DateOnly > 2013):
            yield None, 1
    def reducer(self, key, values):
        yield key, sum(values)
if __name__ == '__main__':
    Part1.run()
I have written Python code for a MapReduce job where the input contains CreationDate="2010-07-28T19:04:21.300". I have to find all the dates where the creation date is at or after 2014-01-01, but I have encountered an error.
Creation_date is just a regex.
You need to match your input string before you can call group(0).
A regular expression object (the result of re.compile) does not have a group method:
>>> pattern = re.compile('CreationDate="([0-9]+)"')
>>> pattern.group
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: '_sre.SRE_Pattern' object has no attribute 'group'
To get a match object (which has a group method), you need to match the pattern against the string (line) using the regex.search method (or the regex.match method, depending on your needs):
>>> pattern.search('CreationDate="2013"')
<_sre.SRE_Match object at 0x7fac5c64e8a0>
>>> pattern.search('CreationDate="2013"').group(1) # returns a string
'2013'
Creation_date = re.compile('CreationDate="([0-9]+)"')
def mapper(self, _, line):
    # search() returns a match object; group(1) is the captured digits
    date_only = Creation_date.search(line).group(1)
    if int(date_only) > 2013:
        yield None, 1
NOTE: I modified the regular expression to capture the numeric part as a group, and converted the matched string to int (comparing a string with the number 2013 is meaningless in Python 2 and raises a TypeError in Python 3).
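Note that the sample value in the question is a full timestamp (CreationDate="2010-07-28T19:04:21.300"), so a pattern that expects only digits between the quotes would not match it. A minimal sketch, assuming every timestamp starts with a four-digit year, captures just the year. The helper creation_year and the sample lines are hypothetical, written in Python 3 and kept outside MRJob so they can be run on their own:

```python
import re

# Capture only the four-digit year at the start of the timestamp.
CREATION_YEAR = re.compile(r'CreationDate="(\d{4})-')


def creation_year(line):
    """Return the year as an int, or None if the line has no CreationDate."""
    match = CREATION_YEAR.search(line)
    return int(match.group(1)) if match else None


# Made-up sample input; the real file format is not shown in the question.
lines = [
    '<row Id="1" CreationDate="2010-07-28T19:04:21.300" />',
    '<row Id="2" CreationDate="2014-01-01T00:00:00.000" />',
    '<row Id="3" />',
]

# "At or after 2014-01-01" is the same as "year >= 2014".
recent = [line for line in lines if (creation_year(line) or 0) >= 2014]
print(len(recent))  # 1
```

Inside the mapper, the same check would replace the comparison against 2013, and lines without a CreationDate would simply be skipped instead of raising AttributeError.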
I want to extract (abc)(def) using a regex,
but I ended up with the error below:
import re
def main():
str = "-->(abc)(def)<--"
match = re.search("\-->(.*?)\<--" , str).group(1)
print match
The error is:
Traceback (most recent call last):
File "test.py", line 7, in <module>
match = re.search("\-->(.*?)\<--" , str).group()
File "/usr/lib/python2.7/re.py", line 146, in search
return _compile(pattern, flags).search(string)
TypeError: expected string or buffer
Corrected:
import re
def main():
    my_string = "-->(abc)(def)<--"
    match = re.search("\-->(.*?)\<--", my_string).group(1)
    print match
    # (abc)(def)
main()
Note that I renamed str to my_string (do not use built-in names as your own variable names!). You may still be able to optimize your regex with lookarounds; the lazy star (.*?) can be very inefficient at times.
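As a sketch of the lookaround idea (Python 3 syntax; whether it is actually faster depends on the input and is not measured here), the delimiters can be moved into a lookbehind and a lookahead, so that the whole match is already the text between them:

```python
import re

my_string = "-->(abc)(def)<--"

# (?<=-->) requires "-->" just before the match; (?=<--) requires "<--" just after.
between = re.compile(r"(?<=-->).*?(?=<--)")

match = between.search(my_string)
result = match.group() if match else None
print(result)  # (abc)(def)
```

With this form there is no capture group to index, and the None check guards against input that lacks the delimiters.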
I am working through some example code which I've found on What's the most efficient way to find one of several substrings in Python?. I've changed the code to:
import re
to_find = re.compile("hello|there")
search_str = "blah fish cat dog haha"
match_obj = to_find.search(search_str)
#the_index = match_obj.start()
which_word_matched = ""
which_word_matched = match_obj.group()
Since there is now no match, I get:
Traceback (most recent call last):
File "<console>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
What is the standard way in Python to handle the no-match scenario, so as to avoid the error?
match_obj = to_find.search(search_str)
if match_obj:
    # do things with match_obj, e.g.:
    which_word_matched = match_obj.group()
Other handling will go in an else block if you need to do something even when there's no match.
Your match_obj is None because the regular expression did not match. Test for it explicitly:
which_word_matched = match_obj.group() if match_obj else ''
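If the same check is needed in several places, it can be wrapped in a small helper. This is only a sketch in Python 3; first_match and its default parameter are hypothetical names, not part of the re module:

```python
import re


def first_match(pattern, text, default=""):
    """Return the first matched substring, or `default` when nothing matches."""
    match_obj = pattern.search(text)
    return match_obj.group() if match_obj is not None else default


to_find = re.compile("hello|there")

print(repr(first_match(to_find, "blah fish cat dog haha")))   # ''
print(first_match(to_find, "well hello there"))               # hello
```

Testing with "is not None" is equivalent to the plain truth test here, since a match object is always truthy, but it makes the intent explicit.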
I have data split into fileids. I am trying to go through the data per fileid and search for the emoticons :( and :) as defined by the regex. If an emoticon is found, I need to record (a) that the emoticon was found and (b) in which fileid. When I run this piece of script and print emoticon, I get 0 as the value. How is this possible? I am a beginner.
emoticon = 0
for fileid in corpus.fileids():
    m = re.search('^(:\(|:\))+$', fileid)
    if m is not None:
        emoticon +=1
It looks to me like your regex is working, and that m should indeed not be None.
>>> re.search('^(:\(|:\))+$', ':)').group()
':)'
>>> re.search('^(:\(|:\))+$', ':):(').group()
':):('
>>> re.search('^(:\(|:\))+$', ':)?:(').group()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'
However, a few things look questionable to me:
This will only match strings that consist entirely of emoticons.
Is fileid really what you want to search? It is the file's identifier, not its text.
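To illustrate both points, here is a minimal Python 3 sketch that drops the ^...$ anchors and searches the text of each file instead of its id. The texts dictionary is made-up sample data standing in for whatever the corpus reader returns for a fileid; that is an assumption, since the question does not show how the text is loaded:

```python
import re

# Without ^ and $, the pattern finds emoticons anywhere in the text.
EMOTICON = re.compile(r":\(|:\)")

# Hypothetical sample data: fileid -> text of that file.
texts = {
    "a.txt": "great day :) really :)",
    "b.txt": "nothing to see here",
    "c.txt": "oh no :(",
}

# Record, per fileid, which emoticons were found.
found = {}
for fileid, text in texts.items():
    hits = EMOTICON.findall(text)
    if hits:
        found[fileid] = hits

print(found)  # {'a.txt': [':)', ':)'], 'c.txt': [':(']}
```

Keeping a dictionary keyed by fileid preserves both pieces of information the question asks for: that an emoticon was found, and in which file.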