I have a for loop over an Avro data reader object:
    for i in reader:
        print i
Then I got a Unicode decode error in the for statement, so I wanted to ignore that particular record. I tried this:
    try:
        for i in reader:
            print i
    except:
        pass
But the loop does not continue past the bad record. How can I overcome this problem?
Edit: Error trace added
Traceback (most recent call last):
File "modify.py", line 22, in <module>
for record in reader:
File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/datafile.py", line 362, in next
datum = self.datum_reader.read(self.datum_decoder)
File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 445, in read
return self.read_data(self.writers_schema, self.readers_schema, decoder)
File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 490, in read_data
return self.read_record(writers_schema, readers_schema, decoder)
File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 690, in read_record
field_val = self.read_data(field.type, readers_field.type, decoder)
File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 468, in read_data
return decoder.read_utf8()
File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 233, in read_utf8
return unicode(self.read_bytes(), "utf-8")
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb4 in position 14: invalid start byte
Could it be because the file is corrupted?
Edit 2:
Following the suggestion in the answers to iterate with iterobject, I modified the code and got this error:
Traceback (most recent call last):
File "modify.py", line 28, in <module>
print next(iterobject)["filepath"]
File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/datafile.py", line 362, in next
datum = self.datum_reader.read(self.datum_decoder)
File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 445, in read
return self.read_data(self.writers_schema, self.readers_schema, decoder)
File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 490, in read_data
return self.read_record(writers_schema, readers_schema, decoder)
File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 690, in read_record
field_val = self.read_data(field.type, readers_field.type, decoder)
File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 468, in read_data
return decoder.read_utf8()
File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 233, in read_utf8
return unicode(self.read_bytes(), "utf-8")
File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 226, in read_bytes
return self.read(self.read_long())
File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 184, in read_long
b = ord(self.read(1))
TypeError: ord() expected a character, but string of length 0 found
If the error is raised by the for i in reader statement itself, try this. It skips an element of the iterator when a UnicodeDecodeError occurs.
    iterobject = iter(reader)
    while True:
        try:
            print(next(iterobject))
        except StopIteration:
            break
        except UnicodeDecodeError:
            pass
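A minimal sketch of the same idea wrapped in a reusable generator, written for Python 3. FlakyReader is a hypothetical stand-in for the Avro reader, and skip_bad_records is an illustrative name, not part of any library. The sketch assumes the reader can keep going after a failed next() call; the second traceback in the question suggests the Avro reader may be left in the middle of a record, in which case skipping alone will not recover.

```python
class FlakyReader:
    """Hypothetical stand-in for a reader whose next() may raise on a bad record."""

    def __init__(self, raw_records):
        self._records = iter(raw_records)

    def __iter__(self):
        return self

    def __next__(self):
        raw = next(self._records)   # StopIteration propagates when exhausted
        return raw.decode("utf-8")  # raises UnicodeDecodeError on invalid bytes


def skip_bad_records(reader):
    """Yield records, skipping any whose retrieval raises UnicodeDecodeError."""
    iterator = iter(reader)
    while True:
        try:
            record = next(iterator)
        except StopIteration:
            break
        except UnicodeDecodeError:
            continue  # ignore this record and ask for the next one
        yield record


good = list(skip_bad_records(FlakyReader([b"first", b"\xb4bad", b"third"])))
print(good)
```

Note that the stand-in is a class-based iterator on purpose: a generator function that raises is finished and cannot be resumed, so this pattern only helps with iterators that survive an exception.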
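You need the try/except inside the loop:
    for i in reader:
        try:
            print i
        except UnicodeDecodeError:
            pass
By the way, it's good practice to name the specific type of error you're trying to catch (as I did with except UnicodeDecodeError:), since otherwise you risk making your code very hard to debug! Note that this only catches errors raised inside the loop body; an error raised while the reader fetches the next record, as in the question's traceback, occurs in the for statement itself and is not caught here.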
You can catch the specific error so that unknown errors do not pass unnoticed.
Python 3.x:
    try:
        for i in reader:
            print(i)
    except UnicodeDecodeError as ue:
        print(str(ue))
Python 2.x:
    try:
        for i in reader:
            print i
    except UnicodeDecodeError, ue:
        print(str(ue))
Printing the error tells you what happened. A bare except catches everything (including an obscure RuntimeError), and you will never know what went wrong. It can occasionally be useful, but it is dangerous and generally bad practice.
Related
This is more of a bug report with a possible fix. I'm using openpyxl version 3.0.9.
One of the files I need to handle has a problem with one of its images. When I open it with LibreOffice, I see a placeholder instead of the image. But when I open it with load_workbook(), an exception occurs:
Traceback (most recent call last):
File "/home/pooh/work/isaac_choi/./1.py", line 5, in <module>
wb=load_workbook('pritelli/FW21 WOMAN 27.09.21.xlsx')
File "/home/pooh/venv39/lib/python3.9/site-packages/openpyxl/reader/excel.py", line 317, in load_workbook
reader.read()
File "/home/pooh/venv39/lib/python3.9/site-packages/openpyxl/reader/excel.py", line 282, in read
self.read_worksheets()
File "/home/pooh/venv39/lib/python3.9/site-packages/openpyxl/reader/excel.py", line 257, in read_worksheets
charts, images = find_images(self.archive, rel.target)
File "/home/pooh/venv39/lib/python3.9/site-packages/openpyxl/reader/drawings.py", line 52, in find_images
image = Image(BytesIO(archive.read(dep.target)))
File "/usr/lib/python3.9/zipfile.py", line 1463, in read
with self.open(name, "r", pwd) as fp:
File "/usr/lib/python3.9/zipfile.py", line 1502, in open
zinfo = self.getinfo(name)
File "/usr/lib/python3.9/zipfile.py", line 1429, in getinfo
raise KeyError(
KeyError: "There is no item named 'xl/drawings/NULL' in the archive"
I think KeyError could be handled right after OSError (line 53), simply continuing the iteration in that case:
    except KeyError:
        warn('Missing image')
        continue
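To illustrate the proposal outside the library, here is a small self-contained sketch using only the standard zipfile module. It is not openpyxl's code: read_members and the archive contents are made up for the example. It shows that ZipFile.read raises KeyError for a name that is not in the archive, and that catching it lets the loop warn and move on.

```python
import io
import warnings
import zipfile


def read_members(archive, targets):
    """Return {target: bytes} for members that exist, warning about the rest."""
    found = {}
    for target in targets:
        try:
            data = archive.read(target)
        except KeyError:
            # ZipFile.read raises KeyError when the name is not in the archive.
            warnings.warn("Missing image: %s" % target)
            continue
        found[target] = data
    return found


# Build a small in-memory archive with one real member.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("xl/media/image1.png", b"fake image bytes")

with zipfile.ZipFile(buffer) as zf, warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    images = read_members(zf, ["xl/media/image1.png", "xl/drawings/NULL"])

print(sorted(images), len(caught))
```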
I have a series of .json files. Each file contains tweets based on a different keyword. Each line in every file is a json object. I read the files using the following code:
    # Get tweets out of JSON file
    tweetsFromJSON = []
    with open(json_file) as f:
        for line in f:
            json_object = json.loads(line)
            tweet_text = json_object["text"]
            tweetsFromJSON.append(tweet_text)
This works flawlessly for every other JSON file I have, but this particular file gives me the following error:
Traceback (most recent call last):
File "C:/Users/alexandros/Dropbox/Development/Sentiment Analysis/lda_analysis.py", line 119, in <module>
lda_analysis('precision_medicine.json', 'precision medicine')
File "C:/Users/alexandros/Dropbox/Development/Sentiment Analysis/lda_analysis.py", line 46, in lda_analysis
json_object = json.loads(line)
File "C:\Users\alexandros\AppData\Local\Programs\Python\Python35-32\lib\json\__init__.py", line 319, in loads
return _default_decoder.decode(s)
File "C:\Users\alexandros\AppData\Local\Programs\Python\Python35-32\lib\json\decoder.py", line 342, in decode
raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 1 column 5287 (char 5286)
So I tried removing the first line to see what would happen. The error persists, and again at the exact same position (line 1 column 5287 (char 5286)). I removed another line and got the same result. I'm racking my brain trying to figure out what's wrong. What am I missing?
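"Extra data" means json.loads parsed one complete JSON value and then found more text after it. One way that can happen is two objects sharing a line with no newline between them; whether that is what this file contains is an assumption. Under that assumption, a sketch using json.JSONDecoder.raw_decode can read every object on a line (iter_json_objects and the sample line are illustrative only):

```python
import json


def iter_json_objects(line):
    """Yield every JSON value found back to back in a single string."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(line)
    while pos < end:
        # Skip whitespace between values; raw_decode expects a value at pos.
        while pos < end and line[pos].isspace():
            pos += 1
        if pos >= end:
            break
        obj, pos = decoder.raw_decode(line, pos)
        yield obj


# Two objects on one physical line, which json.loads rejects with "Extra data".
line = '{"text": "first tweet"}{"text": "second tweet"}\n'
texts = [obj["text"] for obj in iter_json_objects(line)]
print(texts)
```

Printing repr(line[5270:5300]) for the failing line is a quick way to see what actually sits at the reported position.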
I'm writing code to generate an XML file with strings in different languages. I initially got a Unicode error, so I added a setdefault command at the beginning; now I'm getting "AttributeError: 'str' object has no attribute 'iter'". I tried searching, but the answers didn't help much.
Here is the traceback:
Traceback (most recent call last):
File "oldgood_XliffGenerator.py", line 118, in <module>
convertToXliff(filename)
File "oldgood_XliffGenerator.py", line 47, in convertToXliff
tree.write(destifilename, xml_declaration=True, encoding='utf-8')
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 817, in write
self._root, encoding, default_namespace
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 877, in _namespaces
for elem in iterate():
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 477, in iter
for e in e.iter(tag):
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 477, in iter
for e in e.iter(tag):
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 477, in iter
for e in e.iter(tag):
File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 477, in iter
for e in e.iter(tag):
AttributeError: 'str' object has no attribute 'iter'
Code Snippet:
    def convertToXliff(filename):
        if filename:
            if os.path.isfile(filename):
                valid=True
            else:
                print "Could not open "+filename
        else:
            print "no input"
        global fileLength
        root = ET.Element("file")
        global file
        file = ET.SubElement(root, "file")
        file.set("id", generatingLang)
        file.set("native", nativeLang)
        file.set("useAsLocale", setLocale)
        print "Reached stage1"
        datainp = fileRead(filename)
        RecurseObjects(datainp)
        destifilename = "testconvfile.xml"
        #Indent(root)
        tree = ET.ElementTree(root)
        tree.write(destifilename, xml_declaration=True, encoding='utf-8')
Please check and let me know what I'm missing.
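The traceback shows ElementTree walking the tree and calling iter() on a child that turns out to be a plain str. One way that can happen, assuming RecurseObjects (not shown here) appends strings directly as children, is that text was added with append() instead of being assigned to an element's text attribute. A minimal Python 3 sketch of the text-attribute approach, where the element name "unit" is hypothetical:

```python
import io
import xml.etree.ElementTree as ET

root = ET.Element("file")
unit = ET.SubElement(root, "unit")  # hypothetical element name
# Text content goes in .text; only Element objects should be added as children.
unit.text = u"caf\u00e9"

buffer = io.BytesIO()
ET.ElementTree(root).write(buffer, xml_declaration=True, encoding="utf-8")
output = buffer.getvalue().decode("utf-8")
print(output)
```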
I'm trying to read a file line by line. If a key from the dictionary is found in a line, I want to replace it with its value, then write the contents to a new file. Here is the logic:
    fout = open(output_file,"w+")
    with open(input_file, 'r') as fin:
        for line in fin:
            for key in sorted(Db):
                if re.match(key,line):
                    line = re.sub(key,Db[key],line) ## line 246
                    fout.write(line)
                    break
            else:
                fout.write(line)
Whenever I try to run this script, I get the following traceback:
Traceback (most recent call last):
File "final.py", line 246, in <module>
if re.match(key,line):
File "c:\Python33\lib\re.py", line 156, in match
return _compile(pattern, flags).match(string)
File "c:\Python33\lib\functools.py", line 258, in wrapper
result = user_function(*args, **kwds)
File "c:\Python33\lib\re.py", line 274, in _compile
return sre_compile.compile(pattern, flags)
File "c:\Python33\lib\sre_compile.py", line 493, in compile
p = sre_parse.parse(p, flags)
File "c:\Python33\lib\sre_parse.py", line 724, in parse
p = _parse_sub(source, pattern, 0)
File "c:\Python33\lib\sre_parse.py", line 347, in _parse_sub
itemsappend(_parse(source, state))
File "c:\Python33\lib\sre_parse.py", line 552, in _parse
raise error("nothing to repeat")
sre_constants.error: nothing to repeat
Kindly let me know if I'm missing something. Thanks in advance.
Thanks,
Anand
I think you should try to debug this problem yourself. Here is what I would do.
Add print statements to your script before line 246:
    print key,
    print Db[key]
    print line
Depending on the output, take action.
To test what would work, you can use the Python interpreter.
Assuming the print statements above output:
    key
    foo
    key 123
you can test it:
    >>> line = 'key 123'
    >>> re.sub('key', 'foo', line)
    'foo 123'
In this case it works. I'm sure you'll soon find out what the problem is. Good luck!
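For reference, "nothing to repeat" is what the re module reports when a quantifier such as * or + has nothing in front of it, for example a pattern that starts with "*". If the dictionary keys are meant as literal text rather than regular expressions (an assumption about the asker's data), re.escape avoids the problem. The contents of Db below are hypothetical, and the sketch is written for Python 3:

```python
import re

# Hypothetical mapping; "*NOTE" starts with a regex metacharacter.
Db = {"*NOTE": "REMARK", "key": "foo"}


def replace_first_match(line, mapping):
    """Replace the first key (in sorted order) that matches at the line start."""
    for key in sorted(mapping):
        pattern = re.escape(key)  # treat the key as literal text
        if re.match(pattern, line):
            return re.sub(pattern, mapping[key], line)
    return line


result1 = replace_first_match("*NOTE check this\n", Db)
result2 = replace_first_match("key 123\n", Db)
print(repr(result1), repr(result2))
```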
How would I handle this error in Python 2.6?
Traceback (most recent call last):
File "./fetch_xml_collect.py", line 32, in <module>
tree=ET.parse(response)
File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/xml/etree/ElementTree.py", line 862, in parse
tree.parse(source, parser)
File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/xml/etree/ElementTree.py", line 587, in parse
self._root = parser.close()
File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/xml/etree/ElementTree.py", line 1254, in close
self._parser.Parse("", 1) # end of data
xml.parsers.expat.ExpatError: unclosed token: line 56, column 1
Current code being implemented:
    while urlget==1:
        try:
            response = urllib.urlopen(rep)
        except IOError:
            print("reason")
        else:
            try:
                tree=ET.parse(response)
            except IOError:
                print("XML Parse Error\n")
            else:
                root=tree.getroot()
                print root[0].text
                powerlist=tree.findall('meter/power')
                print powerlist[0].tag,powerlist[0].text
The question is: How would I handle the above error in the given code?
    try:
        #Some code
        ...
    except xml.parsers.expat.ExpatError, ex:
        print ex
        continue
Something like the above should work. Just continue if you get that error: execution moves on to the next iteration of the loop or, if it was the last iteration, leaves the loop.
The XML is malformed and cannot be processed. Just skip it and go on to the next one.
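For readers on newer interpreters, here is a sketch of the same skip-and-continue pattern in Python 3, where xml.etree.ElementTree reports malformed input as ET.ParseError (on Python 2.6, as in the question, the exception is xml.parsers.expat.ExpatError). The sample documents are invented for the example and stand in for the fetched responses:

```python
import xml.etree.ElementTree as ET

# Invented payloads; the second one is cut off mid-tag.
documents = [
    "<meters><meter><power>42</power></meter></meters>",
    "<meters><meter><power>17</power></meter",
    "<meters><meter><power>7</power></meter></meters>",
]

readings = []
for text in documents:
    try:
        root = ET.fromstring(text)
    except ET.ParseError as ex:
        # Malformed XML: report it and move on to the next document.
        print("XML parse error, skipping:", ex)
        continue
    readings.append(root.find("meter/power").text)

print(readings)
```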