lxml: Obtain file current line number when calling etree.iterparse(f) - python

Since no one has answered or commented on this post, I have rewritten it.
Consider the following Python code using lxml:
treeIter = etree.iterparse(fObj)
for event, ele in treeIter:
    if ele.tag == 'logRoot':
        try:
            somefunction(ele)
        except InternalException as e:
            e.handle(*args)
        ele.clear()
InternalException is user-defined and wraps all exceptions raised by somefunction() except lxml.etree.XMLSyntaxError. InternalException has a well-defined handler method, .handle().
fObj has "trueRoot" as its top-level tag and many "logRoot" elements as second-level leaves.
My question is: is there a way to record the current line number when handling the exception e? *args can be replaced by any arguments available.
Any suggestion is much appreciated.

import lxml.etree as ET
import io
def div(x):
    return 1/x
# io.BytesIO needs bytes on Python 3, so the document is a bytes literal
content = b'''\
<trueRoot>
<logRoot a1="x1"> 2 </logRoot>
<logRoot a1="x1"> 1 </logRoot>
<logRoot a1="x1"> 0 </logRoot>
</trueRoot>
'''
for event, elem in ET.iterparse(io.BytesIO(content), events=('end', ), tag='logRoot'):
    num = int(elem.text)
    print('Calling div({})'.format(num))
    try:
        div(num)
    except ZeroDivisionError as e:
        # elem.sourceline is the line of the source where the element starts
        print('Ack! ZeroDivisionError on line {}'.format(elem.sourceline))
prints
Calling div(2)
Calling div(1)
Calling div(0)
Ack! ZeroDivisionError on line 4

Related

how to continue processing after exception in python without aborting

I have a block of code where an exception is raised for some records while executing an imported method (please see below). My goal is NOT to stop execution, but to print some values when an error occurs (to understand what's wrong with the data) and continue. I tried various ways of using "try...except", but no luck! Can someone please take a look and suggest a fix? Many thanks in advance!
code below
if student_name not in original_df_names_flat:
    for org in orgs:
        data = calculate(org, data)
        result = imported_module.execute(data) # here's the line where exception happens
    return result
else:
    return data
if student_name not in original_df_names_flat:
    for org in orgs:
        data = calculate(org, data)
        result = None # May fail to get assigned if error in imported_module
        try:
            result = imported_module.execute(data) # here's the line where exception happens
        except Exception as e:
            print(f'{e}') # probably better to log the exception
    return result
else:
    return data
I guess an exception can occur in two places, so try this. It covers the line you flagged and the other possibility, so your program should keep running.
I have also adjusted the indentation of the second try clause so that the loop processes all the data in orgs. To return all results, I have added a list, 'results'.
if student_name not in original_df_names_flat:
    results = []
    for org in orgs:
        try:
            data = calculate(org, data)
        except Exception as e:
            print(str(e))
        try:
            result = imported_module.execute(data) # here's the line where exception happens
            results.append(result)
        except Exception as e:
            print(str(e))
    return results
else:
    return data
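A minimal sketch of the same idea with the failures recorded rather than only printed. Here `execute` is a hypothetical stand-in for `imported_module.execute`, and the sample records are made up for illustration; `logging.exception` is the standard-library call that logs a message together with the current traceback.

```python
import logging

logging.basicConfig(level=logging.ERROR)
log = logging.getLogger(__name__)


def execute(record):
    """Hypothetical stand-in for imported_module.execute; fails on bad input."""
    return 10 / record


def process_all(records):
    """Run execute() on every record, collecting successes and failures."""
    results, failures = [], []
    for record in records:
        try:
            results.append(execute(record))
        except Exception as exc:
            # Log with traceback, remember the offending record, keep going.
            log.exception("execute failed for record %r", record)
            failures.append((record, exc))
    return results, failures


results, failures = process_all([1, 0, 5, "x"])
print(results)                       # successful results only
print([rec for rec, _ in failures])  # records that raised
```

Keeping the failed records next to the exception objects makes it possible to inspect the bad data after the loop has finished, which is what the question asks for.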
The solution below took care of the problem:
if student_name not in original_df_names_flat:
    for org in orgs:
        data = calculate(org, data)
        try:
            result = imported_module.execute(data) # here's the line where exception happens
        except Exception as e:
            print("exception happened", e)
            pass
            return data
    return result
else:
    return data

Handling propagating errors

In my code I have a main function, which calls a function that reads some data from a file and returns it; the data is then used in different ways. Obviously there is a risk that the user enters a filename that cannot be found, resulting in an error. I want to catch this error and output an error message of my own, without the traceback etc. I tried a standard try-except statement, which works almost as intended, except that now the data is not read, so new errors occur when I try to calculate with empty variables. Using sys.exit or raise SystemExit in the except block results in errors being written to the console with tracebacks, and the whole point of catching the first error feels lost. I could wrap the whole program in a try statement, but I have never seen that done and it feels wrong. How can I either terminate the program cleanly or hide all the subsequent errors?
def getData(fileName):
    try:
        file = open(fileName,"r")
        data = file.readlines()
        file.close()
        x = []
        y = []
        for i in data:
            noNewline = i.rstrip('\n')
            x.append(float(noNewline.split("\t")[0]))
            y.append(float(noNewline.split("\t")[1]))
        return x,y
    except FileNotFoundError:
        print("Some error messages")
def main(fileName):
    x,y = getData(fileName)
    # different calculations with x and y
Because main is a function, you could return on an error:
def main(filename):
    try:
        x, y = getData(filename)
    except FileNotFoundError:
        print("file not found")
        return
    # calculations here
Solution
sys.exit and SystemExit take an optional argument: 0 is considered a successful termination, and any non-zero value signals an error.
Example
sys.exit(0)
raise SystemExit(0)
References
Python sys.exit: https://docs.python.org/3/library/sys.html#sys.exit
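As a rough illustration of what ends up in the exit status: `sys.exit` raises `SystemExit`, and the value passed to it is stored in the exception's `code` attribute. The sketch below catches the exception only so that it can be inspected; `load` and the file name are hypothetical.

```python
import sys


def load(file_name):
    """Hypothetical loader: exit with a message if the file is missing."""
    try:
        with open(file_name) as handle:
            return handle.readlines()
    except FileNotFoundError:
        # A string argument is printed to stderr and the exit status becomes 1.
        sys.exit(f"file not found: {file_name}")


try:
    load("no_such_file_for_this_example.txt")
except SystemExit as exc:  # caught here only to show what was raised
    exit_arg = exc.code

print(exit_arg)
```

In a normal script the `SystemExit` would not be caught, and the interpreter would print the message and stop without a traceback.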
An example is below:
def getData(fileName):
    file = open(fileName,"r")
    data = file.readlines()
    file.close()
    x = []
    y = []
    for i in data:
        noNewline = i.rstrip('\n')
        x.append(float(noNewline.split("\t")[0]))
        y.append(float(noNewline.split("\t")[1]))
    return x,y
def main(fileName):
    # if you only want to handle exception coming from 'getData'
    try:
        x,y = getData(fileName)
    except Exception as e:
        # use the parameter name fileName here; 'filename' would be a NameError
        print(f'could not get data using file {fileName}. Reason: {str(e)}')
        return
    # do something with x,y
if __name__ == "__main__":
    main('get_the_file_name_from_somewhere.txt')

I am getting unbound local error while pickle loading a file

I am pickle-loading two files one after the other, and I get an UnboundLocalError while closing them.
I used exception handling when opening the files, and the UnboundLocalError is raised in the except block when the files are closed.
I catch FileNotFoundError in the except clause because it is a necessary exception to handle. There are no indentation errors; I just cannot work out how to handle this error:
"Traceback (most recent call last):
File "d:\Python\t.py", line 648, in dispdeisel
fdl=open("D:/Python/deisel/"+str(z1)+".txt","rb+")
FileNotFoundError: [Errno 2] No such file or directory: 'D:/Python/deisel/Wed Apr 29 2020.txt'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "d:\Python\t.py", line 820, in <module>
b.dispdeisel()
File "d:\Python\t.py", line 664, in dispdeisel
fdl.close()
UnboundLocalError: local variable 'fdl' referenced before assignment"
k1=[]
try:
    tdc=open("D:/Python/deisel/collection.txt","rb+")
    fdl=open("D:/Python/deisel/"+str(z1)+".txt","rb+")
    while True:
        self.f1=pickle.load(tdc)
        self.fd=pickle.load(fdl)
        k1.append(self.f1)
        kd.append(self.fd)
except EOFError and FileNotFoundError:
    qa=0
    for i in kd:
        if "L"in i:
            qa1=i[:-1]
            qa=qa+int(qa)
        else:
            qa=qa+int(i[0])
    print (" Total Collection for Deisel on date ",z1,"is",qa)
    tdc.close()
    fdl.close()
In your example code, the traceback shows that this line raises FileNotFoundError:
fdl=open("D:/Python/deisel/"+str(z1)+".txt","rb+")
Because open() fails, the assignment never completes and fdl never gets a value.
After the error, execution continues in the except EOFError and FileNotFoundError: block, where this line is reached:
fdl.close()
Since fdl was never defined, it has no value, and that is the cause of your error.
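Separately from the unbound variable, the clause `except EOFError and FileNotFoundError:` does not catch both exception types. `and` between two classes evaluates to the second one, so only `FileNotFoundError` is caught; a tuple is needed to catch either. A small sketch (the `classify` helper is made up for the demonstration):

```python
# `A and B` returns B when A is truthy, and classes are always truthy.
caught_type = EOFError and FileNotFoundError
print(caught_type is FileNotFoundError)


def classify(exc):
    """Raise exc and report which handler catches it."""
    try:
        raise exc
    except (EOFError, FileNotFoundError):  # tuple: either type matches
        return "handled"
    except Exception:
        return "other"


print(classify(EOFError()), classify(FileNotFoundError()), classify(ValueError()))
```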
One way of fixing it would be to deal with the exceptions more cleanly:
import pickle
class SomeClass:
    def some_method(self, z1):
        # initialisation
        qa = 0
        kd = []
        k1 = []
        try:
            tdc = open("D:/Python/deisel/collection.txt","rb+")
            try:
                fdl = open("D:/Python/deisel/"+str(z1)+".txt","rb+")
                try:
                    try:
                        while True:
                            self.f1 = pickle.load(tdc)
                            self.fd = pickle.load(fdl)
                            k1.append(self.f1)
                            kd.append(self.fd)
                    except EOFError:
                        pass # no message needed, but not the nicest way to use the exception
                    for i in kd:
                        if "L" in i:
                            # this bit makes no sense to me, but it's not relevant
                            qa1 = i[:-1]
                            qa = qa + int(qa)
                        else:
                            qa = qa + int(i[0])
                    print(" Total Collection for Deisel on date ", z1, "is", qa)
                finally:
                    tdc.close()
                    fdl.close()
            except FileNotFoundError:
                tdc.close() # this is open, closing it
                pass # some error message perhaps?
        except FileNotFoundError:
            pass # some error message perhaps?
This is better, but not very Pythonic and only illustrates how you could solve your problem - it's not at all how I recommend you write this.
Closer to what you probably need:
import pickle
class SomeClass:
    def some_method(self, z1):
        # initialisation
        qa = 0
        kd = []
        k1 = []
        try:
            with open("D:/Python/deisel/collection.txt","rb+") as tdc:
                try:
                    with open("D:/Python/deisel/"+str(z1)+".txt","rb+") as fdl:
                        try:
                            while True:
                                self.f1 = pickle.load(tdc)
                                self.fd = pickle.load(fdl)
                                k1.append(self.f1)
                                kd.append(self.fd)
                        except EOFError:
                            pass # no message needed, but not the nicest way to use the exception
                        for i in kd:
                            if "L" in i:
                                # this bit makes no sense to me, but it's not relevant
                                qa1 = i[:-1]
                                qa = qa + int(qa)
                            else:
                                qa = qa + int(i[0])
                        print(" Total Collection for Deisel on date ", z1, "is", qa)
                except FileNotFoundError:
                    pass # some error message perhaps?
        except FileNotFoundError:
            pass # some error message perhaps?
with does exactly what you're trying to do and cleans up the file handle, guaranteed.
And this still has the issue of getting multiple pickles from files, with no guarantee the number of pickles will be the same for both - if you write these pickles yourself, why not pickle a list of objects and avoid that mess?
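A sketch of that suggestion, assuming you control how the file is written: dump one list, load one list, and no `EOFError` loop is needed. The file name and the sample records are illustrative only.

```python
import os
import pickle
import tempfile

records = [["12", "x"], ["7", "y"], ["30", "z"]]  # made-up sample data

path = os.path.join(tempfile.mkdtemp(), "collection.pkl")

# Write the whole list as a single pickle.
with open(path, "wb") as handle:
    pickle.dump(records, handle)

# Read it back with a single load; no loop and no EOFError handling.
with open(path, "rb") as handle:
    loaded = pickle.load(handle)

total = sum(int(item[0]) for item in loaded)
print(loaded == records, total)
```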
In general, don't use exceptions if you expect them to occur, instead code directly what you expect - it's easier to read and maintain and it generally performs better:
import pickle
from pathlib import Path
class SomeClass:
    def some_method(self, z1):
        # initialisation
        qa = 0
        kd = []
        k1 = []
        fn1 = "D:/Python/deisel/collection.txt"
        fn2 = "D:/Python/deisel/"+str(z1)+".txt"
        if not Path(fn1).is_file() or not Path(fn2).is_file():
            return # some error message perhaps?
        with open(fn1, "rb+") as tdc:
            with open(fn2, "rb+") as fdl:
                try:
                    while True:
                        # not using self.f1 and .fd, since it seems they are just local
                        # also: why are you loading k1 anyway, you're not using it?
                        k1.append(pickle.load(tdc))
                        kd.append(pickle.load(fdl))
                except EOFError:
                    pass # no message needed, but not the nicest way to use the exception
                for i in kd:
                    if "L" in i:
                        qa1 = i[:-1]
                        qa = qa + int(qa)
                    else:
                        qa = qa + int(i[0])
                print(" Total Collection for Deisel on date ", z1, "is", qa)
I don't know about the rest of your code, but you could probably get rid of the EOFError handling as well if you pickle in a predictable way. It also seems that loading f1 into k1 only serves as a possible limit on how many elements are loaded into kd, which is wasteful.
Note how, with every example, the code gets more readable and shorter. Shorter by itself is not a good thing, but if your code becomes easier to read and understand, performs better and is shorter on top, you know you're on the right track.

Pythonic way of not printing a error message up

I am trying to suppress an error/warning in my log while calling a library. Assume I have this code:
try:
    kazoo_client.start()
except:
    pass
This calls a ZooKeeper client that throws an exception which bubbles up. I don't want the warning/error in my logs when I call kazoo_client.start(). Is there a way to suppress it when calling the client?
Assume Python 2.7.17.
Try this approach:
import sys, StringIO
def funky() :
    "1" + 1 # This should raise an error
sys.stderr = StringIO.StringIO()
funky() # this should call the funky function
And your code should look something like this:
import sys, StringIO
# import kazoo somewhere around here
sys.stderr = StringIO.StringIO()
kazoo_client.start()
And lastly the Python 3 example:
import sys
from io import StringIO
# import kazoo somewhere around here
sys.stderr = StringIO()
kazoo_client.start()
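Replacing `sys.stderr` silences everything written to it for the rest of the program. A narrower sketch, assuming the library reports through the standard `logging` module (not verified here for kazoo): raise the level of the library's logger so its warnings are dropped at the source. The logger name `noisy_lib` and the function `noisy_start` are hypothetical stand-ins, not part of kazoo.

```python
import io
import logging

# Hypothetical logger name; a real library's logger name has to be looked up.
lib_log = logging.getLogger("noisy_lib")
lib_log.setLevel(logging.WARNING)

# Attach a handler writing to a buffer so the effect can be observed.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
lib_log.addHandler(handler)


def noisy_start():
    """Hypothetical stand-in for a client call that logs warnings and errors."""
    lib_log.warning("connection dropped")
    lib_log.error("giving up")


noisy_start()
before = stream.getvalue()

# Only CRITICAL records from this logger get through from now on.
lib_log.setLevel(logging.CRITICAL)
noisy_start()
after = stream.getvalue()

lib_log.removeHandler(handler)
print(repr(before))
print(after == before)  # nothing new was logged after raising the level
```

Unlike swapping `sys.stderr`, this leaves tracebacks and other libraries' output untouched.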
If you know the exception, try contextlib.suppress (available from Python 3.4):
>>> from contextlib import suppress
>>> x = (i for i in range(10))
>>> with suppress(StopIteration):
...     for i in range(11):
...         print(next(x))
...
0
1
2
3
4
5
6
7
8
9
Without suppress, it raises StopIteration on the last iteration.
>>> x = (i for i in range(10))
>>> for i in range(11):
...     print(next(x))
...
0
1
2
3
4
5
6
7
8
9
Traceback (most recent call last):
File "<ipython-input-10-562798e05ad5>", line 2, in <module>
print(next(x))
StopIteration
Suppress is Pythonic, safe and explicit.
So in your case:
with suppress(SomeError):
    kazoo_client.start()
EDIT:
To suppress all exceptions:
with suppress(Exception):
    kazoo_client.start()
I would like to suggest a more generic approach that can be reused elsewhere.
Here is an example of how to create a decorator that ignores errors.
import functools
# Use the Decorator Design Pattern
def ignore_error_decorator(function_reference):
    @functools.wraps(function_reference) # the wrapper keeps the original function's name and docstring
    def wrapper(*args):
        try:
            result = function_reference(*args) # execute the function
            return result # If the function executed correctly, return
        except Exception as e:
            pass # Default ignore; You can also log the error or do whatever you want
    return wrapper # Return the wrapper reference
if __name__ == '__main__':
    # First alternative to use. Compose the decorator with another function
    def my_first_function(a, b):
        return a + b
    rez_valid = ignore_error_decorator(my_first_function)(1, 3)
    rez_invalid = ignore_error_decorator(my_first_function)(1, 'a')
    print("Alternative_1 valid: {result}".format(result=rez_valid))
    print("Alternative_1 invalid: {result}".format(result=rez_invalid)) # None is returned by the except block
    # Second alternative. Decorating a function
    @ignore_error_decorator
    def my_second_function(a, b):
        return a + b
    rez_valid = my_second_function(1, 5)
    rez_invalid = my_second_function(1, 'a')
    print("Alternative_2 valid: {result}".format(result=rez_valid))
    print("Alternative_2 invalid: {result}".format(result=rez_invalid)) # None is returned by the except block
Getting back to your problem, with this alternative you would run
ignore_error_decorator(kazoo_client.start)()

How do I know which contract failed with Python's contract.py?

I'm playing with contract.py, Terrence Way's reference implementation of design-by-contract for Python. The implementation throws an exception when a contract (precondition/postcondition/invariant) is violated, but it doesn't provide you a quick way of identifying which specific contract has failed if there are multiple ones associated with a method.
For example, if I take the circbuf.py example, and violate the precondition by passing in a negative argument, like so:
circbuf(-5)
Then I get a traceback that looks like this:
Traceback (most recent call last):
File "circbuf.py", line 115, in <module>
circbuf(-5)
File "<string>", line 3, in __assert_circbuf___init___chk
File "build/bdist.macosx-10.5-i386/egg/contract.py", line 1204, in call_constructor_all
File "build/bdist.macosx-10.5-i386/egg/contract.py", line 1293, in _method_call_all
File "build/bdist.macosx-10.5-i386/egg/contract.py", line 1332, in _call_all
File "build/bdist.macosx-10.5-i386/egg/contract.py", line 1371, in _check_preconditions
contract.PreconditionViolationError: ('__main__.circbuf.__init__', 4)
My hunch is that the second argument in the PreconditionViolationError (4) refers to the line number in the circbuf.__init__ docstring that contains the assertion:
def __init__(self, leng):
    """Construct an empty circular buffer.
    pre::
        leng > 0
    post[self]::
        self.is_empty() and len(self.buf) == leng
    """
However, it's a pain to have to open the file and count the docstring line numbers. Does anybody have a quicker solution for identifying which contract has failed?
(Note that in this example, there's a single precondition, so it's obvious, but multiple preconditions are possible).
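One low-effort way to avoid counting by hand is to let Python print the docstring line. This is a sketch under the asker's unconfirmed hunch that the number is a 1-based line index into the docstring; `contract_line` and the `example` function are hypothetical helpers, not part of contract.py.

```python
def contract_line(func, line_number):
    """Return the given docstring line, assuming 1-based numbering."""
    lines = (func.__doc__ or "").splitlines()
    if 1 <= line_number <= len(lines):
        return lines[line_number - 1].strip()
    return None


def example(leng):
    """Construct an empty circular buffer.

    pre::
        leng > 0
    """


# Line 4 of this made-up docstring is the precondition.
print(contract_line(example, 4))
```

If the numbering turns out to be 0-based or relative to something else, only the index arithmetic in `contract_line` needs to change.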
This is an old question but I may as well answer it. I added some output, you'll see it at the comment # jlr001. Add the line below to your contract.py and when it raises an exception it will show the doc line number and the statement that triggered it. Nothing more than that, but it will at least stop you from needing to guess which condition triggered it.
def _define_checker(name, args, contract, path):
    """Define a function that does contract assertion checking.
    args is a string argument declaration (ex: 'a, b, c = 1, *va, **ka')
    contract is an element of the contracts list returned by parse_docstring
    module is the containing module (not parent class)
    Returns the newly-defined function.
    pre::
        isstring(name)
        isstring(args)
        contract[0] in _CONTRACTS
        len(contract[2]) > 0
    post::
        isinstance(__return__, FunctionType)
        __return__.__name__ == name
    """
    output = StringIO()
    output.write('def %s(%s):\n' % (name, args))
    # ttw001... raise new exception classes
    ex = _EXCEPTIONS.get(contract[0], 'ContractViolationError')
    output.write('\tfrom %s import forall, exists, implies, %s\n' % \
                 (MODULE, ex))
    loc = '.'.join([x.__name__ for x in path])
    for c in contract[2]:
        output.write('\tif not (')
        output.write(c[0])
        # jlr001: adding condition statement to output message, easier debugging
        output.write('): raise %s("%s", %u, "%s")\n' % (ex, loc, c[1], c[0]))
    # ...ttw001
    # ttw016: return True for superclasses to use in preconditions
    output.write('\treturn True')
    # ...ttw016
    return _define(name, output.getvalue(), path[0])
Without modifying his code, I don't think you can, but since this is Python...
If you look for where he raises the exception to the user, I think it is possible to push the info you're looking for into it. I wouldn't expect you to be able to make the traceback any better, though, because the contract code is actually contained in a comment block (the docstring) and then processed.
The code is pretty complicated, but this might be a block to look at. Maybe if you dump out some of the args you can figure out what's going on:
def _check_preconditions(a, func, va, ka):
    # ttw006: correctly weaken pre-conditions...
    # ab002: Avoid generating AttributeError exceptions...
    if hasattr(func, '__assert_pre'):
        try:
            func.__assert_pre(*va, **ka)
        except PreconditionViolationError, args:
            # if the pre-conditions fail, *all* super-preconditions
            # must fail too, otherwise
            for f in a:
                if f is not func and hasattr(f, '__assert_pre'):
                    f.__assert_pre(*va, **ka)
                    raise InvalidPreconditionError(args)
            # rr001: raise original PreconditionViolationError, not
            # inner AttributeError...
            # raise
            raise args
            # ...rr001
    # ...ab002
    # ...ttw006
