How to handle exceptions correctly? - python

I don't quite understand yet how to use exceptions correctly in Python. I want to process data I can't completely trust (it is prone to change, and if it changes, the script may break). Let's say I process a webpage using BeautifulSoup. If the author of the website makes some changes to it, some statements may raise an exception. Let's look at this code example:
data = urllib2.urlopen('http://example.com/somedocument.php').read()
soup = BeautifulSoup(data, convertEntities="html")
name = soup.find('td', text=re.compile(r'^Name$')).parent.nextSibling.string
print name
Now, if soup.find() fails because the owner of that website changes its content and renames the cell Name to Names, an exception AttributeError: 'NoneType' object has no attribute 'parent' will be raised. But I don't mind! I expect that some data won't be available. I just want to proceed and use whatever variables I have available (of course there will be some data I NEED, and if it is unavailable I will simply exit).
The only solution I came up with is:
try: name = soup.find('td', text=re.compile(r'^Name$')).parent.nextSibling.string
except AttributeError: name = False
try: email = soup.find('td', text=re.compile(r'^Email$')).parent.nextSibling.string
except AttributeError: email = False
try: phone = soup.find('td', text=re.compile(r'^Phone$')).parent.nextSibling.string
except AttributeError: phone = False
if name: print name
if email: print email
if phone: print phone
Is there any better way, or should I just continue making try-except for every similar statement? It doesn't look very nice at all.
EDIT: The best solution for me would look like this:
try:
    print 'do some stuff here that may throw an exception'
    print non_existant_variable_that_throws_an_exception_here
    print 'and few more things to complete'
except:
    pass
This would be great, but pass skips the rest of the try block, so 'and few more things to complete' will never be printed. If there were something like pass that just ignored the error and continued executing, it would be great.

Firstly, if you don't mind the exception you can just let it pass:
try:
    something()
except AttributeError:
    pass
but never do this, as it will let all errors pass:
try:
    something()
except Exception:
    pass
As for your code sample, perhaps it could be tidied up with something like this:
myDict = {}
for item in ["Name", "Email", "Phone"]:
    try:
        myDict[item] = soup.find('td', text=re.compile(r'^%s$' % item)).parent.nextSibling.string
    except AttributeError:
        myDict[item] = "Not found"
for item in ["Name", "Email", "Phone"]:
    print "%s: %s" % (item, myDict[item])
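The same idea can be pulled into a small helper so that each field needs only one line. This is a minimal sketch in Python 3 syntax (the code in this thread is Python 2). The names `attempt`, `Cell`, `cells` and `find` are hypothetical: `find` is a stand-in for `soup.find()` that returns `None` on a failed search, so no BeautifulSoup install is needed to try it.

```python
def attempt(func, default=None, exceptions=(AttributeError,)):
    """Call func(); return its result, or `default` if it raises one of `exceptions`."""
    try:
        return func()
    except exceptions:
        return default


class Cell:
    """Hypothetical stand-in for a parsed table cell."""
    def __init__(self, text):
        self.string = text


# Stand-in data: there is no "Phone" cell, as if the page had changed.
cells = {"Name": Cell("Alice"), "Email": Cell("alice@example.com")}


def find(label):
    """Stand-in for soup.find(): returns a Cell, or None when nothing matches."""
    return cells.get(label)


fields = {}
for label in ["Name", "Email", "Phone"]:
    # find("Phone") is None, so .string raises AttributeError inside attempt().
    fields[label] = attempt(lambda: find(label).string, default="Not found")

for label, value in fields.items():
    print("%s: %s" % (label, value))
```

Only the exception types listed in `exceptions` are swallowed, so an unrelated bug still surfaces instead of being silently turned into the default.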

Have you tried using a try/finally statement instead?
http://docs.python.org/tutorial/errors.html#defining-clean-up-actions
Example from the docs:
>>> def divide(x, y):
...     try:
...         result = x / y
...     except ZeroDivisionError:
...         print "division by zero!"
...     else:
...         print "result is", result
...     finally:
...         print "executing finally clause"
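To make the clause order concrete, here is a small sketch in Python 3 syntax (the examples in this thread are Python 2). It is not taken from the docs: the `log` parameter is a hypothetical addition that records which clauses actually ran.

```python
def divide(x, y, log):
    """Divide x by y, appending the name of each clause that runs to `log`."""
    try:
        result = x / y
    except ZeroDivisionError:
        log.append("except")
        result = None
    else:
        # Runs only when the try body raised nothing.
        log.append("else")
    finally:
        # Runs in both cases.
        log.append("finally")
    return result


ok_log, bad_log = [], []
divide(4, 2, ok_log)
divide(4, 0, bad_log)
print(ok_log)   # ['else', 'finally']
print(bad_log)  # ['except', 'finally']
```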
So, to use your example:
try:
    do_some_stuff_here_that_may_throw_an_exception()
except someError:
    print "That didn't work!"
else:
    print variable_that_we_know_didnt_throw_an_exception_here
finally:
    print "finishing up stuff"
"finally" will always execute, so that's where you can put your "finishing up" stuff.

Related

django - use of 'else:' - interesting case

django's auth middleware has this code:
def get_user(request):
    """
    Returns the user model instance associated with the given request session.
    If no user is retrieved an instance of `AnonymousUser` is returned.
    """
    from .models import AnonymousUser
    user = None
    try:
        user_id = request.session[SESSION_KEY]
        backend_path = request.session[BACKEND_SESSION_KEY]
    except KeyError:
        pass
    else: # <------ this does not have an if part, so how is it meant to work?
        if backend_path in settings.AUTHENTICATION_BACKENDS:
            # more code...
The else part is interesting: it does not have an if part. What is this? It looks like a really cool thing that I don't know yet.
The else here belongs to the try/except statement:
try:
    user_id = request.session[SESSION_KEY]
    backend_path = request.session[BACKEND_SESSION_KEY]
except KeyError:
    pass
else:
    if backend_path in settings.AUTHENTICATION_BACKENDS:
        # more code...
The else clause, when present, must follow all except clauses. It is useful for code that must be executed if the try clause does not raise an exception.
That is, here it is executed only if the try block raises no exception (in this case, no KeyError).
Example
Consider a dictionary:
>>> a_dict={"key" : "value"}
We can use try/except to handle a KeyError:
>>> try:
...     print a_dict["unknown"]
... except KeyError:
...     print "key error"
...
key error
Now we add an else clause, which is executed only when no KeyError occurs:
>>> try:
...     print a_dict["key"]
... except KeyError:
...     print "key error"
... else:
...     print "no errors"
...
value
no errors
Whereas if one of the except clauses is triggered, the else clause is not executed:
>>> try:
...     print a_dict["unknown"]
... except KeyError:
...     print "key error"
... else:
...     print "no errors"
...
key error
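One reason to put code in else rather than inside the try body is that the except clause then guards only the lines you meant it to guard. The sketch below (Python 3 syntax) illustrates this with two hypothetical functions, `lookup_in_try` and `lookup_with_else`; the `session` and `backends` dicts are made-up stand-ins, not Django objects.

```python
def lookup_in_try(session, backends):
    try:
        path = session["backend"]
        return backends[path]      # a KeyError raised here is swallowed too
    except KeyError:
        return None


def lookup_with_else(session, backends):
    try:
        path = session["backend"]
    except KeyError:
        return None                # only a missing session key lands here
    else:
        return backends[path]      # a KeyError raised here reaches the caller


session = {"backend": "ldap"}
backends = {"model": "ModelBackend"}   # "ldap" is not configured

# The misconfiguration is hidden: it looks like "no session data".
print(lookup_in_try(session, backends))

try:
    lookup_with_else(session, backends)
except KeyError as exc:
    print("unexpected KeyError:", exc)
```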

Identifying the data that throws an exception in Python: How to shrink this code?

I have a script that reads records from a file and checks them for incorrect data. Several fields might each throw the same exception, and they are all accessed on the same line. Is there a way to identify which field threw the exception without having to break it up into multiple lines?
Toy example here:
a = [1]
b = [2]
c = [] # Oh no, imagine something happened, like some data entry error
i = 0
try:
    z = a[i] + b[i] + c[i]
except IndexError, e:
    print "Data is missing! %s" % (str(e))
The problem is that if there's an exception, the user doesn't know if it's a, b, or c that has data missing.
I suppose I could write it as:
def check_data(data, index, message):
    try:
        return data[index]
    except IndexError, e:
        print "%s is missing." % (message)
        raise  # re-raise the same IndexError
a = [1]
b = [2]
c = []
i = 0
try:
    z = check_data(a, i, "a") + check_data(b, i, "b") + check_data(c, i, "c")
except IndexError, e:  # check_data re-raises IndexError, not TypeError
    print "Error! We're done."
But that can be pretty tedious.
What other ways can I use to handle this situation to validate each field in exception blocks, if any exist?
Example adapted from reality below:
class Fork:
    def __init__(self, index, fork_name, fork_goal, fork_success):
        # In reality, we would do stuff here.
        pass
forks = []
# In reality, we'd be reading these in and not all of the entries might exist.
fork_names = ["MatrixSpoon", "Spoon", "Spork"]
fork_goals = ["Bend", "Drink soup", "Drink soup but also spear food"]
fork_success = ["Yes!", "Yes!"]
try:
    for i in range(0, len(fork_names)):
        forks.append(Fork(i + 1, fork_names[i], fork_goals[i], fork_success[i]))
except IndexError, e:
    print "There was a problem reading the forks! %s" % (e)
    print "The field that is missing is: %s" % ("?")
You could move your error checking into the Fork class and use itertools.izip_longest to make sure /something/ (really None) is passed in if one data stream runs short:
from itertools import izip_longest
class Fork:
    def __init__(self, index, fork_name, fork_goal, fork_success):
        # first, check parameters
        for name, value in (
                ('fork_name', fork_name),
                ('fork_goal', fork_goal),
                ('fork_success', fork_success)
                ):
            if value is None:
                raise ValueError('%s not specified' % name)
        # rest of code
forks = []
# In reality, we'd be reading these in and not all of the entries might exist.
fork_names = ["MatrixSpoon", "Spoon", "Spork"]
fork_goals = ["Bend", "Drink soup", "Drink soup but also spear food"]
fork_success = ["Yes!", "Yes!"]
and then change your loop like so:
for i, (name, goal, success) in enumerate(izip_longest(fork_names, fork_goals, fork_success)):
    forks.append(Fork(i + 1, name, goal, success))
Now, you'll get an error clearly detailing which data element was missing. If your missing element looks more like '' than nothing, you could change the test in __init__ from if value is None to if not value.
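In Python 3 the same function is called `itertools.zip_longest`. The sketch below uses it to report every missing slot without raising at all. It is an illustration rather than the answer's code: `FIELDS` and `missing_fields` are hypothetical names, and it assumes `None` is never a legitimate field value.

```python
from itertools import zip_longest

FIELDS = ("fork_name", "fork_goal", "fork_success")


def missing_fields(*columns):
    """Yield (row_index, field_name) for every slot that zip_longest padded with None."""
    for row, values in enumerate(zip_longest(*columns)):
        for field, value in zip(FIELDS, values):
            if value is None:
                yield row, field


fork_names = ["MatrixSpoon", "Spoon", "Spork"]
fork_goals = ["Bend", "Drink soup", "Drink soup but also spear food"]
fork_success = ["Yes!", "Yes!"]

problems = list(missing_fields(fork_names, fork_goals, fork_success))
print(problems)  # [(2, 'fork_success')]
```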
When you catch the exception you still have the information on what caused it, for example:
c_1 = None
try:
    c_1 = c[i]
except IndexError, e:
    print "c is missing."
    raise e # here you still have e and i
So you could do something like this:
try:
    a = a_1[i]
except IndexError, e:
    raise Exception(e.message + ' the violation is because of ' + str(i))
A more complete solution:
If you are interested in knowing what caused the violation, e.g. which list is too short, you can simply hard code the variables:
try:
    for i in range(0, len(fork_names)):
        forks.append(Fork(i + 1, fork_names[i], fork_goals[i], fork_success[i]))
except IndexError, e:
    print "There was a problem reading the forks! %s" % (e)
    print "There are fork_names with size %s " % len(fork_names)
    print "There are fork_goals with size %s " % len(fork_goals)
    print "There are fork_success with size %s " % len(fork_success)
    print "You tried accessing index %d" % (i+1)
OK, I admit it seems like a lot of work! But it's worth it, because you have to think about your input and expected output (TDD if you want...).
But this is still quite lame. What if you don't know how a method is called? Sometimes you will see this:
def some_function(arg1, arg2, *args, **kwrds):
    pass
So, you can't really hard code the stuff in the exceptions. For this case you can print the stack information, with sys.exc_info:
import sys
try:
    for i in range(0, len(fork_names)):
        forks.append(Fork(i + 1, fork_names[i], fork_goals[i], fork_success[i]))
except IndexError, e:
    type, value, traceback = sys.exc_info()
    for k, v in traceback.tb_frame.f_locals.items():
        # note: k is the variable name (a string), so this branch never runs;
        # test v instead if you want the lengths rather than the full values
        if isinstance(k, (list,tuple)):
            print k, " length ", len(k)
        else:
            print k, v
The above will output
Fork __main__.Fork
traceback <traceback object at 0x7fe51c7ea998>
e list index out of range
__builtins__ <module '__builtin__' (built-in)>
__file__ teststo.py
fork_names ['MatrixSpoon', 'Spoon', 'Spork']
value list index out of range
__package__ None
sys <module 'sys' (built-in)>
i 2
fork_success ['Yes!', 'Yes!']
__name__ __main__
forks [<__main__.Fork instance at 0x7fe51c7ea908>, <__main__.Fork instance at 0x7fe51c7ea950>]
fork_goals ['Bend', 'Drink soup', 'Drink soup but also spear food']
type <type 'exceptions.IndexError'>
__doc__ None
So, after examining the above trace you can figure out which list is too short. I admit, this is like firing up a debugger. So if you want, here is how to fire up a debugger:
try:
    some_thing_that_fails()
except Exception:
    import pdb; pdb.set_trace()
    # if you want a better debugger that supports autocomplete and tab pressing
    # to explore objects you should use ipdb
    # import ipdb; ipdb.set_trace()
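Returning to the sys.exc_info approach: in Python 3 the traceback is also available as `exc.__traceback__`, so the frame inspection can live in a small helper. The sketch below is an assumption-laden illustration, not the answer's code. `sequence_lengths` and `build_pairs` are hypothetical names, and the helper reports only the locals of the frame that caught the exception.

```python
def sequence_lengths(exc):
    """Map each list/tuple local in the frame that caught `exc` to its length."""
    frame = exc.__traceback__.tb_frame
    return {
        name: len(value)
        for name, value in frame.f_locals.items()
        if isinstance(value, (list, tuple))
    }


def build_pairs(names, goals):
    pairs = []
    try:
        for i in range(len(names)):
            pairs.append((names[i], goals[i]))
    except IndexError as exc:
        # Return the sizes instead of re-raising, purely for illustration.
        return sequence_lengths(exc)
    return pairs


report = build_pairs(["MatrixSpoon", "Spoon", "Spork"], ["Bend", "Drink soup"])
print(report)
```

Comparing the reported lengths shows which input ran short, without hard coding the variable names in the except block.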
Final remark:
for i in range(0, len(fork_names))
This is not really Pythonic. Instead you could use:
for idx, item in enumerate(fork_names):
    forks.append(Fork(idx + 1, fork_names[idx], fork_goals[idx], fork_success[idx]))
And as was said in the comments, izip and izip_longest are worth looking into.
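As a rough sketch of where that leads (Python 3 syntax, where `izip` is simply the built-in `zip`), the index arithmetic can go away entirely. The `rows` list is a hypothetical stand-in for constructing `Fork` objects.

```python
fork_names = ["MatrixSpoon", "Spoon", "Spork"]
fork_goals = ["Bend", "Drink soup", "Drink soup but also spear food"]
fork_success = ["Yes!", "Yes!"]

rows = []
# enumerate(..., start=1) supplies the 1-based index; zip walks the lists together.
for number, (name, goal, success) in enumerate(
        zip(fork_names, fork_goals, fork_success), start=1):
    rows.append((number, name, goal, success))

print(len(rows))  # 2
```

Note that `zip` stops at the shortest input without complaint, so the third fork is dropped here. That is why `izip_longest` (or an explicit length check) is the safer choice when missing data must be reported.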

Getting first row from sqlalchemy

I have the following query:
profiles = session.query(profile.name).filter(and_(profile.email == email, profile.password == password_hash))
How do I check if there is a row and how do I just return the first (should only be one if there is a match)?
Use query.one() to get one, and exactly one result. In all other cases it will raise an exception you can handle:
from sqlalchemy.orm.exc import NoResultFound
from sqlalchemy.orm.exc import MultipleResultsFound
try:
    user = session.query(User).one()
except MultipleResultsFound, e:
    print e
    # Deal with it
except NoResultFound, e:
    print e
    # Deal with that as well
There's also query.first(), which will give you just the first result of possibly many, without raising those exceptions. But since you want to deal with the case of there being no result or more than you thought, query.one() is exactly what you should use.
You can use the first() function on the Query object. This will return the first result, or None if there are no results.
result = session.query(profile.name).filter(...).first()
if not result:
    print 'No result found'
Alternatively you can use one(), which will give you the only item, but raise exceptions for a query with zero or multiple results.
from sqlalchemy.orm.exc import NoResultFound, MultipleResultsFound
try:
    result = session.query(profile.name).filter(...).one()
    print result
except NoResultFound:
    print 'No result was found'
except MultipleResultsFound:
    print 'Multiple results were found'
Assuming you have a model User, you can get the first result with:
User.query.first()
If the table is empty, it will return None.
Use one_or_none(). It returns at most one result: None if the query selects no rows, the row if there is exactly one, and it raises MultipleResultsFound if the query selects more than one row.
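The difference between the three methods is easiest to see side by side. The sketch below is plain Python 3 over lists and does not use SQLAlchemy at all: the functions and the two exception classes are local stand-ins that mimic the behavior described in the answers above, so it runs without a database.

```python
class NoResultFound(Exception):
    """Local stand-in; not imported from SQLAlchemy."""


class MultipleResultsFound(Exception):
    """Local stand-in; not imported from SQLAlchemy."""


def first(rows):
    """First row, or None when there are no rows. Never raises."""
    return rows[0] if rows else None


def one(rows):
    """Exactly one row, otherwise raise."""
    if not rows:
        raise NoResultFound("no rows")
    if len(rows) > 1:
        raise MultipleResultsFound("more than one row")
    return rows[0]


def one_or_none(rows):
    """None for no rows, the row for exactly one, raise for several."""
    return one(rows) if rows else None


print(first([]), first(["a", "b"]))        # None a
print(one_or_none([]), one_or_none(["a"]))  # None a
```

For a login check like the one in the question, where a duplicate match would indicate corrupt data, the one_or_none() behavior is arguably the closest fit: a missing user is an ordinary None, while duplicates still fail loudly.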

How to take arbitrary variables loaded by exec, and use them in a function

I'm looking to write up a basic Bioinformatics course on Codecademy. They have a nice interface for writing up a course, but it's a bit slow for testing, as one must save, then preview, then run.
So I'm looking to write up a little testing environment that mimics their one. How it appears to work is that the user-input code is read in to a function as a string, all str instances in the code are converted to unicode (I've just used regex for this) and then the code is executed with exec.
The tricky part seems to be when I want to incorporate the Submission Test.
Submission Tests need to return True, False, or a str, and are written as the body of a function.
A simplified version of what I'm looking to do:
import sys
import StringIO
# The submission test must be a function.
def test_code(code, CC, error):
    # Use information from errors in student code
    if error:
        return "Yada yada %s" % error
    # Use information in the raw student code
    if len(code.split("\n")) != 2:
        return "This should be accomplished in 2 lines"
    # Have direct access to variables from the student code
    # I'd like to avoid params['y'] if possible.
    try:
        y
    except NameError:
        return "Please use the variable y"
    if y != 8:
        return "Wrong! Check stuff"
    # Use information from print output
    if str(y) not in CC:
        return "Remember to print your variable!"
    return True
# Read in student code
student_code = """y = 8
print y
potato"""
# Catch print output
CC = StringIO.StringIO()
sys.stdout = CC
# Execute student code and catch errors
error = None
try:
    exec student_code
except Exception as e:
    error = e
# Start outputting to the terminal again
sys.stdout = sys.__stdout__
# Run the submission test
submission_test = test_code(student_code, CC.getvalue().split("\n"), error)
# Output the result of the submission test
if submission_test is True:
    print("Well done!")
elif submission_test is False:
    print("Oops! You failed... Try again!")
else:
    print(submission_test)
However, I can't seem to get the variables from exec code to pass through to the submission test function (test_code in this case).
I could just execute the code in the Submission Test, but I'd like to avoid that if possible, otherwise it will have to be added to each test, which just seems unpythonic!
Any help would be greatly appreciated :)
If you exec mystr in somedict, then somedict has a reference to every variable assigned during the execution of mystr as Python code. In addition, you can pass variables in this way, too.
>>> d = {'y': 3}
>>> exec "x = y" in d
>>> d['x']
3
You need to pass the dictionary you got from running the user code in, so that the submission check code can verify the values in it are appropriate.
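In Python 3, exec is a function and the dictionary is passed as its second argument. A minimal sketch, with `student_code`, `namespace` and `student_vars` as made-up names:

```python
student_code = "y = 8\nz = y * 2"

# Anything placed in the dict beforehand is visible to the executed code.
namespace = {"limit": 10}
exec(student_code, namespace)

print(namespace["y"], namespace["z"])  # 8 16

# exec also inserts a "__builtins__" entry; filter it out when inspecting.
student_vars = {k: v for k, v in namespace.items() if k != "__builtins__"}
print(student_vars)
```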
It sounds like you want the code and the test to run in the same environment. Both are being submitted as strings, so perhaps the easiest way is to concatenate both and run exec on the combined string:
from __future__ import unicode_literals
def wrap_body(body):
    indent = ' '*4
    return 'def test():\n' + indent + body.replace('\n','\n'+indent)
code = '''
bool_1 = True
bool_2 = False
str_1 = 'Hello there!'
str_2 = "I hope you've noticed the apostrophe ;)"
'''
code_test = '''
try:
    bool_1, bool_2, str_1, str_2
except NameError:
    return "Please do not alter the variable names!"
if (bool_1 == bool_2 or str_1 == str_2):
    return "Please ensure that all of your variables are different " \
           "from one another"
if type(bool_1) != bool: return "bool_1 is incorrect"
if type(bool_2) != bool: return "bool_2 is incorrect"
if type(str_1) != unicode: return "str_1 is incorrect"
if type(str_2) != unicode: return "str_2 is incorrect"
return True
'''
code_test = wrap_body(code_test)
template = code + code_test
namespace = {}
try:
    exec template in namespace
    print(namespace['test']())
except Exception as err:
    print(err)
Okay, my colleague figured this one out.
It uses an element of Devin Jeanpierre's answer.
We use the "exec code in dictionary" approach, then pass the dictionary into the checking function. Within the checking function we unpack the dictionary into globals().
import sys
import StringIO
# The submission test must be a function.
def test_code(code, CC, error, code_vars):
    # unpack the student code namespace into the globals()
    globs = globals()
    for var, val in code_vars.items():
        globs[var] = val
    # Use information from errors in student code
    if error:
        return "Yada yada %s" % error
    # Use information in the raw student code
    if len(code.split("\n")) != 2:
        return "This should be accomplished in 2 lines"
    # Have direct access to variables from the student code
    # I'd like to avoid params['y'] if possible.
    try:
        y
    except NameError:
        return "Please use the variable y"
    if y != 8:
        return "Wrong! Check stuff"
    # Use information from print output
    if str(y) not in CC:
        return "Remember to print your variable!"
    return True
# Read in student code
student_code = """y = 8
print y
potato"""
# Catch print output
CC = StringIO.StringIO()
sys.stdout = CC
# create the namespace for the student code
code_vars = {}
# Execute student code and catch errors
error = None
try:
    # execute the student code in the created namespace
    exec student_code in code_vars
except Exception as e:
    error = e
# Start outputting to the terminal again
sys.stdout = sys.__stdout__
# Run the submission test
submission_test = test_code(student_code, CC.getvalue().split("\n"), error, code_vars)
# Output the result of the submission test
if submission_test is True:
    print("Well done!")
elif submission_test is False:
    print("Oops! You failed... Try again!")
else:
    print(submission_test)
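For comparison, here is a rough Python 3 sketch of the same pipeline. It deliberately differs from the approach above: instead of unpacking into globals(), the check reads values from the namespace dict, which avoids leaking student variables into the checker's own module. `run_student_code` and `check` are hypothetical names, and the messages are placeholders.

```python
import contextlib
import io


def run_student_code(source):
    """Execute `source`; return (namespace, printed_output, error_or_None)."""
    namespace = {}
    buffer = io.StringIO()
    error = None
    # redirect_stdout restores sys.stdout even if the student code raises.
    with contextlib.redirect_stdout(buffer):
        try:
            exec(source, namespace)
        except Exception as exc:
            error = exc
    return namespace, buffer.getvalue(), error


def check(namespace, output, error):
    """Return True on success, or a message describing the first problem found."""
    if error is not None:
        return "Your code raised %s: %s" % (type(error).__name__, error)
    if "y" not in namespace:
        return "Please use the variable y"
    if namespace["y"] != 8:
        return "Wrong! Check stuff"
    if str(namespace["y"]) not in output.split("\n"):
        return "Remember to print your variable!"
    return True


good = check(*run_student_code("y = 8\nprint(y)"))
bad = check(*run_student_code("y = 8\nprint(y)\npotato"))
print(good)  # True
print(bad)
```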

Python - Continuing after exception at the point of exception

I'm trying to extract data from an xml file. A sample of my code is as follows:
from xml.dom import minidom
dom = minidom.parse("algorithms.xml")
...
parameter = dom.getElementsByTagName("Parameters")[0]
# loop over parameters
try:
    while True:
        parameter_id = parameter.getElementsByTagName("Parameter")[m].getAttribute("Id")
        parameter_name = parameter.getElementsByTagName("Name")[m].lastChild.data
        ...
        parameter_default = parameter.getElementsByTagName("Default")[m].lastChild.data
        print parameter_id
        print parameter_default
        m = m+1
except IndexError:
    #reached end of available parameters
    pass
#except AttributeError:
    #parameter doesn't exist
    #?
If all elements for each parameter exist, the code runs correctly. Unfortunately the data I am supplied often has missing entries in it, raising an AttributeError exception. If I simply pass on that error, then any elements that do exist but are retrieved later in the loop than when the exception occurred are skipped, which I don't want. I need some way to continue where the code left off and skip to the next line of code if this specific exception is raised.
The only way to work around this that I can think of would be to override the minidom's class methods and catch the exception there, but that seems far too messy and too much work to handle what should be a very simple and common problem. Is there some easier way to handle this that I am missing?
Instead of "an individual try-except block for every statement", why not abstract out that part?
def getParam(p, tagName, index, post=None):
    # the parentheses are required: "post or lambda i: i" is a syntax error
    post = post or (lambda i: i)
    try:
        return post(p.getElementsByTagName(tagName)[index])
    except AttributeError:
        print "informative message"
        return None # will happen anyway, but why not be explicit?
then in the loop you could have things like:
parameter_id = getParam(parameter, "Parameter", m, lambda x: x.getAttribute("Id"))
parameter_name = getParam(parameter, "Name", m, lambda x: x.lastChild.data)
...
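A self-contained version of this helper in Python 3 syntax might look like the sketch below. The XML document, the `get_param` name and its `default` parameter are assumptions made for illustration; the second parameter has an empty Default element, so its lastChild is None and `.data` raises AttributeError.

```python
from xml.dom import minidom

# Made-up sample data: the second <Default/> is empty.
XML = """<Parameters>
  <Parameter Id="p1"><Name>alpha</Name><Default>1</Default></Parameter>
  <Parameter Id="p2"><Name>beta</Name><Default/></Parameter>
</Parameters>"""


def get_param(node, tag_name, index, post=None, default=None):
    """Fetch the index-th element named tag_name and post-process it.

    Returns `default` when post-processing hits a missing child (AttributeError).
    IndexError is deliberately not caught, so callers can still detect the end.
    """
    post = post or (lambda element: element)
    try:
        return post(node.getElementsByTagName(tag_name)[index])
    except AttributeError:
        return default


root = minidom.parseString(XML).documentElement
results = []
for m in range(len(root.getElementsByTagName("Parameter"))):
    results.append((
        get_param(root, "Parameter", m, lambda e: e.getAttribute("Id")),
        get_param(root, "Name", m, lambda e: e.lastChild.data),
        get_param(root, "Default", m, lambda e: e.lastChild.data),
    ))

print(results)  # [('p1', 'alpha', '1'), ('p2', 'beta', None)]
```

Like the original loop, this matches tags purely by position, so it assumes every Parameter contains each tag even when the tag is empty.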
I think there are two parts to your question. First, you want the loop to continue after the first AttributeError. This you do by moving the try and except into the loop.
Something like this:
try:
    while True:
        try:
            parameter_id = parameter.getElementsByTagName("Parameter")[m].getAttribute("Id")
            parameter_name = parameter.getElementsByTagName("Name")[m].lastChild.data
            ...
            parameter_default = parameter.getElementsByTagName("Default")[m].lastChild.data
            print parameter_id
            print parameter_default
        except AttributeError:
            print "parameter doesn't exist"
            #?
        # advance even when a field was missing, otherwise the loop retries the same index forever
        m = m+1
except IndexError:
    #reached end of available parameters
    pass
The second part is more tricky. But it is nicely solved by the other answer.
