My code is working fine, but PyCharm warns about the except clause in the following being too broad. And honestly, it also smells to me as the wrong implementation.
When scraping HTML<tr>s in bulk, which I also have to add to the database or update, I don't know in advance if a game is finished or not. Also, a game that hasn't been finished has different HTML tags in some <td>s, that require different handling.
Basically, if we look at match_score_string and match_relevant_bit we are telling BeautifulSoup to find a <td> with a certain class. If the game is finished already, this <td> will have a score-time sc class, if not, it will have a score-time st class. If it has one, it won't have the other.
It's been a while since I wrote the try-except clauses, but if I recall correctly, the reason (I felt) I had to use them is because BS would throw an error and halt all operations when row.find fails to locate the HTML object.
Is PyCharm complaining justifiably?
# Get Match Score
try:
match_score_string = row.find("td", class_="score-time sc").get_text()
match_string_split = match_score_string.split(" - ")
team_a_score = int(match_string_split[0])
team_b_score = int(match_string_split[1])
print(team_a_score)
print(team_b_score)
except:
team_a_score = None
team_b_score = None
# Get Match URL
try:
match_relevant_bit = row.find("td", class_="score-time sc")
match_url = match_relevant_bit.find("a").get("href")
match_url_done = match_url.rsplit('?JKLMN')
match_url = match_url_done[0]
match_finished = True
except:
match_relevant_bit = row.find("td", class_="score-time st")
match_url = match_relevant_bit.find("a").get("href")
match_url_done = match_url.rsplit('?JKLMN')
match_url = match_url_done[0]
match_finished = False
Your except is certainly too broad. I won't dispute whether or not you need exception handling in these cases -- Exceptions are quite pythonic so use them wherever you feel they are appropriate. However, you haven't argued that you need your code to handle every exception imaginable. A bare except can mask all sorts of bugs. e.g. If you misspell a variable name at a later revision:
try:
match_score_string = row.find("td", class_="score-time sc").get_text()
match_string_split = match_score_string.split(" - ")
team_a_score = int(match_string_spilt[0]) # oops
team_b_score = int(match_string_split[1])
print(team_a_score)
print(team_b_score)
except:
team_a_score = None
team_b_score = None
Bare excepts also prevent a user from sending KeyboardInterrupt to terminate the program ...
These are just 2 examples why you don't want a bare except -- Since the number of programming errors that could get masked by this are near infinite, I don't have enough time to demonstrate all of them ;-).
Instead, you should specify exactly what exceptions you expect to encounter:
try:
match_score_string = row.find("td", class_="score-time sc").get_text()
match_string_split = match_score_string.split(" - ")
team_a_score = int(match_string_spilt[0]) # oops
team_b_score = int(match_string_split[1])
print(team_a_score)
print(team_b_score)
except AttributeError: # If no row is found, accessing get_text fails.
team_a_score = None
team_b_score = None
This way, the code executes as you expect and you don't mask (too many) programming bugs. The next thing you want to do to prevent hiding bugs is to limit amount of code in your exception handler as much as possible.
Try/except clauses are great. It sounds like you don't use them too often.
try/except blocks can help you avoid complicated if statements, or code that checks for a condition before proceeding which can sometimes make it unclear what you are trying to do (see what I did there :P).
There are a few exceptions that you could possibly drag out of your code, and some of them might be more likely than the others. See below:
try:
# This first line could have an AttributeError in two places
# Maybe the row didn't parse quite correctly? find() won't work then.
# Maybe you didn't find any elements? get_text() won't work then.
match_score_string = row.find("td", class_="score-time sc").get_text()
^ ^
# Same as here, you could get an AttributeError if match_score_string isn't a string.
# I would say that isn't very likely in this case,
# Especially if you handle the above line safely.
match_string_split = match_score_string.split(" - ")
# If either of these split strings contains no matches, you'll get an IndexError
# If either of the strings at the index contain letters, you may get a ValueError
team_a_score = int(match_string_split[0])
team_b_score = int(match_string_split[1])
# Theoretically, if you caught the exception from the above two lines
# you might get a NameError here, if the variables above never ended up being defined.
print(team_a_score)
print(team_b_score)
except:
# Can't really imagine much going wrong here though.
team_a_score = None
team_b_score = None
As you might imagine, it may be more useful to catch several types of exceptions. It depends on the complexity of your inputs.
Multiple try/except blocks would also be of benefit here, specifically to handle the case where you catch an error and need to continue executing the rest of the code inside that block.
Related
I'm in trouble about how to end a 'try' loop, which is occurred since I have the 'try', here is the code:
import time
class exe_loc:
mem = ''
lib = ''
main = ''
def wizard():
while True:
try:
temp_me = input('Please specify the full directory of the memory, usually it will be a folder called "mem"> ' )
if temp_me is True:
exe_loc.mem = temp_me
time.sleep(1)
else:
print('Error value! Please run this configurator again!')
sys.exit()
temp_lib = input('Please specify the full directory of the library, usually it will be a folder called "lib"> ')
if temp_lib is True:
exe_loc.lib = temp_lib
time.sleep(1)
else:
print('Invalid value! Please run this configurator again!')
sys.exit()
temp_main = input('Please specify the full main executable directory, usually it will be app main directory> ')
if temp_main is True:
exe_loc.main = temp_main
time.sleep(1)
I tried end it by using break, pass, and I even leaves it empty what I get is Unexpected EOF while parsing, I searched online and they said it is caused when the code blocks were not completed. Please show me if any of my code is wrong, thanks.
Btw, I'm using python 3 and I don't know how to be more specific for this question, kindly ask me if you did not understand. Sorry for my poor english.
EDIT: Solved by removing the try because I'm not using it, but I still wanna know how to end a try loop properly, thanks.
Your problem isn't the break, it's the overall, high-level shape of your try clause.
A try requires either an except or a finally block. You have neither, which means your try clause is never actually complete. So python keeps looking for the next bit until it reaches EOF (End Of File), at which point it complains.
The python docs explain in more detail, but basically you need either:
try:
do_stuff_here()
finally:
do_cleanup_here() # always runs, even if the above raises an exception
or
try:
do_stuff_here()
except SomeException:
handle_exception_here() # if do_stuff_here raised a SomeException
(You can also have both the except and finally.) If you don't need either the cleanup or the exception handling, that's even easier: just get rid of the try altogether, and have the block go directly under that while True.
Finally, as a terminology thing: try is not a loop. A loop is a bit of code that gets executed multiple times -- it loops. The try gets executed once. It's a "clause," not a "loop."
You have to also 'catch' the exception with the except statement, otherwise the try has no use.
So if you do something like:
try:
# some code here
except Exception:
# What to do if specified error is encountered
This way if anywhere in your try block an exception is raised it will not break your code, but it will be catched by your except.
I am learning Python and have stumbled upon a concept I can't readily digest: the optional else block within the try construct.
According to the documentation:
The try ... except statement has an optional else clause, which, when
present, must follow all except clauses. It is useful for code that
must be executed if the try clause does not raise an exception.
What I am confused about is why have the code that must be executed if the try clause does not raise an exception within the try construct -- why not simply have it follow the try/except at the same indentation level? I think it would simplify the options for exception handling. Or another way to ask would be what the code that is in the else block would do that would not be done if it were simply following the try statement, independent of it. Maybe I am missing something, do enlighten me.
This question is somewhat similar to this one but I could not find there what I am looking for.
The else block is only executed if the code in the try doesn't raise an exception; if you put the code outside of the else block, it'd happen regardless of exceptions. Also, it happens before the finally, which is generally important.
This is generally useful when you have a brief setup or verification section that may error, followed by a block where you use the resources you set up in which you don't want to hide errors. You can't put the code in the try because errors may go to except clauses when you want them to propagate. You can't put it outside of the construct, because the resources definitely aren't available there, either because setup failed or because the finally tore everything down. Thus, you have an else block.
One use case can be to prevent users from defining a flag variable to check whether any exception was raised or not(as we do in for-else loop).
A simple example:
lis = range(100)
ind = 50
try:
lis[ind]
except:
pass
else:
#Run this statement only if the exception was not raised
print "The index was okay:",ind
ind = 101
try:
lis[ind]
except:
pass
print "The index was okay:",ind # this gets executes regardless of the exception
# This one is similar to the first example, but a `flag` variable
# is required to check whether the exception was raised or not.
ind = 10
try:
print lis[ind]
flag = True
except:
pass
if flag:
print "The index was okay:",ind
Output:
The index was okay: 50
The index was okay: 101
The index was okay: 10
I've read three beginner-level Python books, however, I still don't understand exceptions.
Could someone give me a high level explanation?
I guess I understand that exceptions are errors in code or process that cause the code to stop working.
In the old days, when people wrote in assembly language or C, every time you called a function that might fail, you had to check whether it succeeded. So you'd have code like this:
def countlines(path):
f = open(path, 'r')
if not f:
print("Couldn't open", path)
return None
total = 0
for line in f:
value, success = int(line)
if not success:
print(line, "is not an integer")
f.close()
return None
total += value
f.close()
return total
The idea behind exceptions is that you don't worry about those exceptional cases, you just write this:
def countlines(path):
total = 0
with open(path, 'r') as f:
for line in f:
total += int(line)
return total
If Python can't open the file, or turn the line into an integer, it will raise an exception, which will automatically close the file, exit your function, and quit your whole program, printing out useful debugging information.
In some cases, you want to handle an exception instead of letting it quit your program. For example, maybe you want to print the error message and then ask the user for a different filename:
while True:
path = input("Give me a path")
try:
print(countlines(path))
break
except Exception as e:
print("That one didn't work:", e)
Once you know the basic idea that exceptions are trying to accomplish, the tutorial has a lot of useful information.
If you want more background, Wikipedia can help (although the article isn't very useful until you understand the basic idea).
If you still don't understand, ask a more specific question.
The best place to start with that is Python's list of built-in exceptions, since most you'll see derive from that.
Keep in mind that anybody can throw any error they want over anything, and then catch it and dismiss it as well. Here's one quick snippet that uses exceptions for handling instead of if/else where __get_site_file() throws an exception if the file isn't found in any of a list of paths. Despite that particular exception, the code will still work. However, the code would throw an uncaught error that stops execution if the file exists but the permissions don't allow reading.
def __setup_site_conf(self):
# Look for a site.conf in the site folder
try:
path = self.__get_site_file('site.conf')
self.__site_conf = open(path).read()
except EnvironmentError:
self.__site_conf = self.__get_site_conf_from_template()
Python's documentation: http://docs.python.org/2/tutorial/errors.html
For a high-level explanation, say we want to divide varA / varB. We know that varB can't equal 0, but we might not want to perform the check every time we do the division:
if varB != 0:
varA / varB
We can use exceptions to try the block without performing the conditional first, and then handle the behavior of the program based on whether or not something went wrong in the try block. In the following code, if varB == 0, then 'oops' is printed to the console:
try:
varA / varB
except ZeroDivisionError:
print 'oops'
Here is a list of exceptions that can be used: http://docs.python.org/2/library/exceptions.html#exceptions.BaseException
However, if you know how it may fail, you can just open a python console and see what exception is raised:
>>> 1 / 0
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ZeroDivisionError: integer division or modulo by zero
Exceptions are unexpected events that occur during the execution of a program. An exception might result from a logical error or an unanticipated situation.
In Python, exceptions (also known as errors) are objects that are raised (or thrown) by code that encounters an unexpected circumstance.
The Python interpreter can also raise an exception should it encounter an unexpected condition, like running out of memory. A raised error may be caught by a surrounding context that “handles” the exception in an appropriate fashion.
If uncaught, an exception causes the interpreter to stop executing the program and to report an appropriate message to the console.
def sqrt(x):
if not isinstance(x, (int, float)):
raise TypeError( x must be numeric )
elif x < 0:
raise ValueError( x cannot be negative )
Exceptions are not necessarily errors. They are things that get raised when the code encounters something it doesn't (immediately) know how to deal with. This may be entirely acceptable, depending on how you make your code. For instance, let's say you ask a user to put in a number. You then try to take that text (string) and convert it to a number (int). If the user put in, let's say, "cat", however, this will raise an exception. You could have your code handle that exception, however, and rather than break, just give the user a small message asking him to try again, and please use a number. Look at this link to see what I'm talking about: http://www.tutorialspoint.com/python/python_exceptions.htm
Also, you usually handle exceptions with a try, except (or catch) block. Example:
try:
integer = int(raw_input("Please enter an integer: "))
except Exception, exc:
print "An error has occured."
print str(exc)
Hope it helps!
I'm currently doing some Python automation of Excel with com. It's fully functional, and does what I want, but I've discovered something surprising. Sometimes, some of the Excel commands I use will fail with an exception for no apparent reason. Other times, they will work.
In the VB equivalent code for what I'm doing, this problem is apparently considered normal, and is plastered over with a On Error Resume Next statement. Python does not have said statement, of course.
I can't wrap up the whole set in a try except loop, because it could "fail" halfway through and not complete properly. So, what would be a pythonic way to wrap several independent statements into a try except block? Specifically, something cleaner than:
try:
statement
except:
pass
try:
statement
except:
pass
The relevant code is the excel.Selection.Borders bit.
def addGridlines(self, infile, outfile):
"""convert csv to excel, and add gridlines"""
# set constants for excel
xlDiagonalDown = 5
xlDiagonalUp = 6
xlNone = -4142
xlContinuous = 1
xlThin = 2
xlAutomatic = -4105
xlEdgeLeft = 7
xlEdgeTop = 8
xlEdgeBottom = 9
xlEdgeRight = 10
xlInsideVertical = 11
xlInsideHorizontal = 12
# open file
excel = win32com.client.Dispatch('Excel.Application')
workbook = excel.Workbooks.Open(infile)
worksheet = workbook.Worksheets(1)
# select all cells
worksheet.Range("A1").CurrentRegion.Select()
# add gridlines, sometimes some of these fail, so we have to wrap each in a try catch block
excel.Selection.Borders(xlDiagonalDown).LineStyle = xlNone
excel.Selection.Borders(xlDiagonalUp).LineStyle = xlNone
excel.Selection.Borders(xlDiagonalUp).LineStyle = xlNone
excel.Selection.Borders(xlEdgeLeft).LineStyle = xlContinuous
excel.Selection.Borders(xlEdgeLeft).Weight = xlThin
excel.Selection.Borders(xlEdgeLeft).ColorIndex = xlAutomatic
excel.Selection.Borders(xlEdgeTop).LineStyle = xlContinuous
excel.Selection.Borders(xlEdgeTop).Weight = xlThin
excel.Selection.Borders(xlEdgeTop).ColorIndex = xlAutomatic
excel.Selection.Borders(xlEdgeBottom).LineStyle = xlContinuous
excel.Selection.Borders(xlEdgeBottom).Weight = xlThin
excel.Selection.Borders(xlEdgeBottom).ColorIndex = xlAutomatic
excel.Selection.Borders(xlEdgeRight).LineStyle = xlContinuous
excel.Selection.Borders(xlEdgeRight).Weight = xlThin
excel.Selection.Borders(xlEdgeRight).ColorIndex = xlAutomatic
excel.Selection.Borders(xlInsideVertical).LineStyle = xlContinuous
excel.Selection.Borders(xlInsideVertical).Weight = xlThin
excel.Selection.Borders(xlInsideVertical).ColorIndex = xlAutomatic
excel.Selection.Borders(xlInsideHorizontal).LineStyle = xlContinuous
excel.Selection.Borders(xlInsideHorizontal).Weight = xlThin
excel.Selection.Borders(xlInsideHorizontal).ColorIndex = xlAutomatic
# refit data into columns
excel.Cells.Select()
excel.Cells.EntireColumn.AutoFit()
# save new file in excel format
workbook.SaveAs(outfile, FileFormat=1)
workbook.Close(False)
excel.Quit()
del excel
Update:
Perhaps a bit of explanation on the error bit is required. Two identical runs on my test machine, with identical code, on the same file, produce the same result. One run throws exceptions for every xlInsideVertical line. The other throws exceptions for every xlInsideHorizontal. Finally, a third run completes with no exceptions at all.
As far as I can tell Excel considers this normal behavior, because I'm cloning the VB code built by excel's macro generator, not VB code produced by a person. This might be an erroneous assumption, of course.
It will function with each line wrapped in a try except block I just wanted something shorter and more obvious, because 20 lines wrapped in their own try catch loops is just asking for trouble later.
Update2:
This is a scrubbed CSV file for testing: gist file
Conclusion:
The answer provided by Vsekhar is perfect. It abstracts away the exception suppression, so that later, if and when I have time, I can actually deal with the exceptions as they occur. It also allows for logging the exceptions so they don't disappear, not stopping other exceptions, and is small enough to be easily manageable six months from now.
Consider abstracting away the suppression. And to Aaron's point, do not swallow exceptions generally.
class Suppressor:
def __init__(self, exception_type):
self._exception_type = exception_type
def __call__(self, expression):
try:
exec expression
except self._exception_type as e:
print 'Suppressor: suppressed exception %s with content \'%s\'' % (type(self._exception_type), e)
# or log.msg('...')
Then, note in the traceback of your current code exactly what exception is raised, and create a Suppressor for just that exception:
s = Suppressor(excel.WhateverError) # TODO: put your exception type here
s('excel.Selection.Borders(xlDiagonalDown).LineStyle = xlNone')
This way you get line-by-line execution (so your tracebacks will still be helpful), and you are suppressing only the exceptions you explicitly intended. Other exceptions propagate as usual.
Exceptions never happen "for no apparent reason". There is always a reason and that reason needs to be fixed. Otherwise, your program will start to produce "random" data where "random" is at the mercy of the bug that you're hiding.
But of course, you need a solution for your problem. Here is my suggestion:
Create a wrapper class that implements all the methods that you need and delegates them to the real Excel instance.
Add a decorator before each method which wraps the method in a try except block and log the exception. Never swallow exceptions
Now the code works for your customer which buys you some time to find out the cause of the problem. My guess is that a) Excel doesn't produce a useful error message or b) the wrapper code swallows the real exception leaving you in the dark or c) the Excel method returns an error code (like "false" for "failed") and you need to call another Excel method to determine what the cause of the problem is.
[EDIT] Based on the comment below which boil down to "My boss doesn't care and there is nothing I can do": You're missing a crucial point: It's your bosses duty to make the decision but it your duty to give her a list of options along with pros/cons so that she can make a sound decision. Just sitting there saying "I can't do anything" will get you into the trouble that you're trying to avoid.
Example:
Solution 1: Ignore the errors
Pro: Least amount of work
Con: There is a chance that the resulting data is wrong or random. If important business decisions are based on it, there is a high risk that those decisions will be wrong.
Solution 2: Log the errors
Pro: Little amount of work, users can start to use the results quickly, buys time to figure out the source of the problem
Con: "If you can't fix it today, what makes you think you will have time to fix it tomorrow?" Also, it might take you a long time to find the source of the problem because you're no expert
Solution 3: Ask an expert
Find an expert in the field and help him/her have a look/improve the solution.
Pro: Will get a solution much more quickly than learning the ins and outs of COM yourself
Con: Expensive but high chance of success. Will also find problems that we don't even know about.
...
I think you see the pattern. Bosses make wrong decisions because we (willingly) let them. Any boss in the world is happy for hard facts and input when they have to make a decision (well, those who don't shouldn't be bosses, so this is a surefire way to know when to start looking for a new job).
If you select solution #2, go for the wrapper approach. See the docs how to write a decorator (example from IBM). It's just a few minutes of work to wrap all the methods and it will give you something to work with.
The next step is to create a smaller example which sometimes fails and then post specific questions about Python, Excel and the COM wrapper here to figure out the reason for the problems.
[EDIT2] Here is some code that wraps the "dangerous" parts in a helper class and makes updating the styles more simple:
class BorderHelper(object):
def __init__(self, excel):
self.excel = excel
def set( type, LineStyle = None, Weight = None, Color = None ):
border = self.excel.Selection.Borders( type )
try:
if LineStyle is not None:
border.LineStyle = LineStyle
except:
pass # Ignore if a style can't be set
try:
if Weight is not None:
border.Weight = Weight
except:
pass # Ignore if a style can't be set
try:
if Color is not None:
border.Color = Color
except:
pass # Ignore if a style can't be set
Usage:
borders = BorderHelper( excel )
borders.set( xlDiagonalDown, LineStyle = xlNone )
borders.set( xlDiagonalUp, LineStyle = xlNone )
borders.set( xlEdgeLeft, LineStyle = xlContinuous, Weight = xlThin, Color = xlAutomatic )
...
This just wraps functions calls, but you can extend it to handle attribute access as well, and to proxy the results of nested attribute accesses, finally just wrapping the __setattr__ in your try:except block.
It might be sensible to swallow only some specific exception types in your case (as #vsekhar says).
def onErrorResumeNext(wrapped):
class Proxy(object):
def __init__(self, fn):
self.__fn = fn
def __call__(self, *args, **kwargs):
try:
return self.__fn(*args, **kwargs)
except:
print "swallowed exception"
class VBWrapper(object):
def __init__(self, wrapped):
self.wrapped = wrapped
def __getattr__(self, name):
return Proxy(eval('self.wrapped.'+name))
return VBWrapper(wrapped)
Example:
exceptionProofBorders = onErrorResumeNext(excel.Selection.Borders)
exceptionProofBorders(xlDiagonalDown).LineStyle = xlNone
exceptionProofBorders(xlDiagonalup).LineStyle = xlNone
You can zip arguments from three list, and do the following:
for border, attr, value in myArgs:
while True:
i = 0
try:
setattr(excel.Selection.Borders(border), attr, value)
except:
if i>100:
break
else:
break
If your exceptions are trully random, this will try until success (with a limit of 100 tries). I don't recommend this.
I know that is a weird question, and probably there is not an answer.
I'm trying to execute the rest of the try block after an exception was caught and the except block was executed.
Example:
[...]
try:
do.this()
do.that()
[...]
except:
foo.bar()
[...]
do.this() raise an exception managed by foo.bar(), then I would like to execute the code from do.that(). I know that there is not a GOTO statement, but maybe some kind of hack or workaround!
Thanks!
A try... except... block catches one exception. That's what it's for. It executes the code inside the try, and if an exception is raised, handles it in the except. You can't raise multiple exceptions inside the try.
This is deliberate: the point of the construction is that you need explicitly to handle the exceptions that occur. Returning to the end of the try violates this, because then the except statement handles more than one thing.
You should do:
try:
do.this()
except FailError:
clean.up()
try:
do.that()
except FailError:
clean.up()
so that any exception you raise is handled explicitly.
Use a finally block? Am I missing something?
[...]
try:
do.this()
except:
foo.bar()
[...]
finally:
do.that()
[...]
If you always need to execute foo.bar() why not just move it after the try/except block? Or maybe even to a finally: block.
One possibility is to write a code in such a way that you can re-execute it all when the error condition has been solved, e.g.:
while 1:
try:
complex_operation()
except X:
solve_problem()
continue
break
fcts = [do.this, do.that]
for fct in fcts:
try:
fct()
except:
foo.bar()
You need two try blocks, one for each statement in your current try block.
This doesn't scale up well, but for smaller blocks of code you could use a classic finite-state-machine:
states = [do.this, do.that]
state = 0
while state < len(states):
try:
states[state]()
except:
foo.bar()
state += 1
Here's another alternative. Handle the error condition with a callback, so that after fixing the problem you can continue. The callback would basically contain exactly the same code you would put in the except block.
As a silly example, let's say that the exception you want to handle is a missing file, and that you have a way to deal with that problem (a default file or whatever). fileRetriever is the callback that knows how to deal with the problem. Then you would write:
def myOp(fileRetriever):
f = acquireFile()
if not f:
f = fileRetriever()
# continue with your stuff...
f2 = acquireAnotherFile()
if not f2:
f2 = fileRetriever()
# more stuff...
myOp(magicalCallback)
Note: I've never seen this design used in practice, but in specific situations I guess it might be usable.