Cleaning up when using exceptions and files in Python

I've been learning Python for a couple of days now and am struggling with its 'spirit'.
I'm coming from the C/C++/Java/Perl school, and I understand that Python is not C (at all); that's why I'm trying to understand the spirit to get the most out of it (and so far it's hard)...
My question is especially focused on exception handling and cleanup:
The code at the end of this post is meant to simulate a fairly common case of file opening/parsing where you need to close the file in case of an error...
Most samples I have seen use the 'else' clause of a try statement to close the file... which made sense to me until I realized that the error might be due to
- the opening itself (in which case there is no need to close the file that was never opened)
- the parsing (in which case the file needs to be closed)
The trap here is that if you use the 'else' clause of a try block, then the file never gets closed if the error happens during parsing!
On the other hand, using the 'finally' clause results in an extra, necessary check, because the file_desc variable may not exist if the error happened during the opening (see the comments in the code below)...
This extra check is inefficient and ugly, because any reasonable program may contain hundreds of symbols, and parsing the results of dir() is a pain... not to mention the lack of readability of such a statement...
Most other languages allow for variable declarations, which could save the day here... but in Python, everything seems to be implicit...
Normally, one would just declare a file_desc variable, then use separate try/catch blocks for every task... one for opening, one for parsing and the last one for closing... no need to nest them... here I don't know a way to declare the variable... so I'm stuck right at the beginning of the problem!
So what is the spirit of Python here?
- split the opening/parsing into two different methods? How?
- use some kind of nested try/except clauses? How?
- maybe there is a way to declare the file_desc variable, and then there would be no need for the extra check... is it at all possible? Desirable?
- what about the close() statement? What if it raises an error?
Thanks for your hints... here is the sample code:
class FormatError(Exception):
    def __init__(self, message):
        self.strerror = message
    def __str__(self):
        return repr(self.strerror)

file_name = raw_input("Input a filename please: ")
try:
    file_desc = open(file_name, 'r')
    # read the file...
    while True:
        current_line = file_desc.readline()
        if not current_line: break
        print current_line.rstrip("\n")
    # lets simulate some parsing error...
    raise FormatError("oops... the file format is wrong...")
except FormatError as format_error:
    print "The file {0} is invalid: {1}".format(file_name, format_error.strerror)
except IOError as io_error:
    print "The file {0} could not be read: {1}".format(file_name, io_error.strerror)
else:
    file_desc.close()
# finally:
#     if 'file_desc' in dir() and not file_desc.closed:
#         file_desc.close()
if 'file_desc' in dir():
    print "The file exists and closed={0}".format(file_desc.closed)
else:
    print "The file has never been defined..."

The easiest way to deal with this is to use the fact that file objects in Python 2.5+ are context managers. You can use the with statement to enter a context; the context manager's __exit__ method is automatically called when exiting this with scope. The file object's context management automatically closes the file then.
try:
    with open("hello.txt") as input_file:
        for line in input_file:
            if "hello" not in line:
                raise ValueError("Every line must contain 'hello'!")
except IOError:
    print "Damnit, couldn't open the file."
except:
    raise
else:
    print "Everything went fine!"
The open hello.txt handle will automatically be closed, and exceptions from within the with scope are propagated outside.
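To make the mechanism concrete, here is a minimal sketch in Python 3 syntax (the code above is Python 2). The Resource class and its closed attribute are hypothetical stand-ins for a file object, not part of any library; the only point is that __exit__ runs whether or not the body of the with block raises.

```python
class Resource:
    """Hypothetical stand-in for a file object; not part of any library."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        # The value returned here is bound to the name after `as`.
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs on normal exit and also when the body raises.
        self.closed = True
        # Returning False lets any exception propagate to the caller.
        return False


def parse_with_error(resource):
    """Simulate a parse failure inside the with block."""
    with resource:
        raise ValueError("simulated parsing error")


res = Resource()
caught = None
try:
    parse_with_error(res)
except ValueError as err:
    caught = str(err)

print(res.closed, caught)
```

The exception still reaches the outer try/except, but by then the cleanup has already happened, which is why no else or finally clause is needed for closing.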

Just a note: you can always declare a variable, and then it would become something like this:
file_desc = None
try:
    file_desc = open(file_name, 'r')
except IOError as err:
    pass
finally:
    if file_desc:
        file_desc.close()
Of course, if you are using a newer version of Python, the construct using context manager is way better; however, I wanted to point out how you can generically deal with exceptions and variable scope in Python.

As of Python 2.5, there's a with statement that simplifies some of what you're fighting with. Here's a transformed version of your code:
class FormatError(Exception):
    def __init__(self, message):
        self.strerror = message
    def __str__(self):
        return repr(self.strerror)

file_name = raw_input("Input a filename please: ")
with open(file_name, 'r') as file_desc:
    try:
        # read the file...
        while True:
            current_line = file_desc.readline()
            if not current_line: break
            print current_line.rstrip("\n")
        # lets simulate some parsing error...
        raise FormatError("oops... the file format is wrong...")
    except FormatError as format_error:
        print "The file {0} is invalid: {1}".format(file_name, format_error.strerror)
    except IOError as io_error:
        print "The file {0} could not be read: {1}".format(file_name, io_error.strerror)
if 'file_desc' in dir():
    print "The file exists and closed={0}".format(file_desc.closed)
else:
    print "The file has never been defined..."

OK, I'm an ass.
Edit: and BTW, many thanks to those who already answered while I was posting this.
The code below does the trick.
You must create a nested block with the 'with ... as' statement to make sure the file is cleaned up:
class FormatError(Exception):
    def __init__(self, message):
        self.strerror = message
    def __str__(self):
        return repr(self.strerror)

file_name = raw_input("Input a filename please: ")
try:
    #
    # THIS IS PYTHON'S SPIRIT... no else/finally
    #
    with open(file_name, 'r') as file_desc:
        # read the file...
        while True:
            current_line = file_desc.readline()
            if not current_line: break
            print current_line.rstrip("\n")
        raise FormatError("oops... the file format is wrong...")
        print "will never get here"
except FormatError as format_error:
    print "The file {0} is invalid: {1}".format(file_name, format_error.strerror)
except IOError as io_error:
    print "The file {0} could not be read: {1}".format(file_name, io_error.strerror)
if 'file_desc' in dir():
    print "The file exists and closed={0}".format(file_desc.closed)
else:
    print "The file has never been defined..."

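To my knowledge, close() rarely fails in practice, but it can raise an error, for example when flushing buffered writes fails; a file that was only read from is unlikely to hit this.
In fact, CPython closes the file handle when the file object is garbage collected, so you don't strictly have to do it explicitly. It's still good programming practice to do so, obviously.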

Related

Suggestions to improve my Python function to parse WordPress config.php

I am writing a Python script to back up WordPress. As part of the script, I wrote a function to fetch the database details from the config.php file.
How my function works
The function takes the WordPress installation location as an argument and uses regexes to match db_name, db_user, db_password and db_host in that file. The function exits if it cannot find "config.php". I am using sys.exit(1) to exit from the function; is that the proper way to exit from a function? Here is my function:
def parsing_db_info(location):
    config_path = os.path.normpath(location+'/config.php')
    if os.path.exists(config_path):
        try:
            regex_db = r'define\(\s*?\'DB_NAME\'\s*?,\s*?\'(.*?)\'\s*?'.group(1)
            regex_user = r'define\(\s*?\'DB_USER\'\s*?,\s*?\'(.*?)\'\s*?'.group(1)
            regex_pass = r'define\(\s*?\'DB_PASSWORD\'\s*?,\s*?\'(.*?)\'\s*?'.group(1)
            regex_host = r'define\(\s*?\'DB_HOST\'\s*?,\s*?\'(.*?)\'\s*?'.group(1)
            db_name = re.match(regex_db,config_path).group(1)
            db_user = re.match(regex_user,config_path).group(1)
            db_pass = re.match(regex_pass,config_path).group(1)
            db_host = re.match(regex_host,config_path).group(1)
            return {'dbname':db_name , 'dbuser':db_user , 'dbpass':db_pass , 'dbhost':db_host}
        except exception as ERROR:
            print(ERROR)
            sys.exit(1)
    else:
        print('Not Found:',config_path)
        sys.exit(1)
AFTER EDITING
def parsing_db_info(location):
    config_path = os.path.normpath(location+'/wp-config.php')
    try:
        with open(config_path) as fh:
            content = fh.read()
        regex_db = r'define\(\s*?\'DB_NAME\'\s*?,\s*?\'(.*?)\'\s*?'
        regex_user = r'define\(\s*?\'DB_USER\'\s*?,\s*?\'(.*?)\'\s*?'
        regex_pass = r'define\(\s*?\'DB_PASSWORD\'\s*?,\s*?\'(.*?)\'\s*?'
        regex_host = r'define\(\s*?\'DB_HOST\'\s*?,\s*?\'(.*?)\'\s*?'
        db_name = re.search(regex_db,content).group(1)
        db_user = re.search(regex_user,content).group(1)
        db_pass = re.search(regex_pass,content).group(1)
        db_host = re.search(regex_host,content).group(1)
        return {'dbname':db_name , 'dbuser':db_user , 'dbpass':db_pass , 'dbhost':db_host}
    except FileNotFoundError:
        print('File Not Found,',config_path)
        sys.exit(1)
    except PermissionError:
        print('Unable To read Permission Denied,',config_path)
        sys.exit(1)
    except AttributeError:
        print('Parsing Error wp-config seems to be corrupt,')
        sys.exit(1)
To answer your question, you shouldn't normally use sys.exit inside a function like that. Rather, get it to raise an exception in the case where it fails. Preferably, it should be an exception detailing what went wrong, or you could just let the existing exceptions propagate.
The normal rule in Python is this: deal with exceptions at the place you know how to deal with them.
In your code, you catch an exception, and then don't know what to do, so you call sys.exit. Instead of this, you should either:
- let the exception propagate up to a top-level function which can catch it, and then call sys.exit if appropriate; or
- wrap the exception in something more specific and re-raise, so that a higher-level function will have a specific exception to catch. For example, your function might raise a custom ConfigFileNotFound exception or ConfigFileUnparseable exception.
Also, you have written except exception; you probably mean except Exception. However, this is extremely broad, and will mask other programming errors. Instead, catch the specific exception class you expect.
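A minimal sketch of that advice, under some assumptions: the exception names come from the suggestion above, while parse_db_name, main and the simplified DB_NAME regex are hypothetical names chosen for illustration, not the asker's actual code. The parsing function only raises; the top-level function is the one place that turns a failure into an exit status.

```python
import re


class ConfigFileNotFound(Exception):
    """Raised when the config file does not exist."""


class ConfigFileUnparseable(Exception):
    """Raised when an expected define() line is missing."""


def parse_db_name(config_path):
    """Return the DB_NAME value from a wp-config style file."""
    try:
        with open(config_path) as fh:
            content = fh.read()
    except FileNotFoundError as err:
        # Wrap the low-level error in something the caller can name.
        raise ConfigFileNotFound(config_path) from err

    match = re.search(r"define\(\s*'DB_NAME'\s*,\s*'(.*?)'\s*\)", content)
    if match is None:
        raise ConfigFileUnparseable(config_path)
    return match.group(1)


def main(config_path):
    """Top-level function: the only place that decides the exit status."""
    try:
        print(parse_db_name(config_path))
    except (ConfigFileNotFound, ConfigFileUnparseable) as err:
        print("error: {0}: {1}".format(type(err).__name__, err))
        return 1
    return 0
```

A script would then end with something like sys.exit(main(path)), so the function itself stays reusable and testable.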

intercept exception with raise

I have a function which looks for a special element in project files:
def csproj_tag_finder(mod_proj_file):
    """Looking for 'BuildType' element in each module's csproj file passed in `mod_proj_file`
    and return its value (CLOUD, MAIN, NGUI, NONE)"""
    try:
        tree = ET.ElementTree(file=mod_proj_file)
        root = tree.getroot()
        for element in root.iterfind('.//'):
            if ('BuildType') in element.tag:
                return element.text
    except IOError as e:
        print 'WARNING: cant find file: %s' % e
If no file is found, it prints 'WARNING: cant find file: %s' % e.
This function is called from another one:
def parser(modename, mod_proj_file):
    ...
    # module's tag's from project file in <BuildType> elements, looks like CLOUD
    mod_tag_from_csproj = csproj_tag_finder(mod_proj_file)
    if not mod_tag_from_csproj:
        print('WARNING: module %s have not <BuildType> elements in file %s!' % (modename, mod_proj_file))
    ...
So when the file isn't found, csproj_tag_finder() returns None and prints a WARNING. The second function, parser(), then finds an empty mod_tag_from_csproj variable and also prints a WARNING. This is harmless, but I want to make csproj_tag_finder() raise a special exception, so that parser() can catch it and do an == check instead of printing text.
I tried adding something like:
...
except IOError as e:
    # print 'WARNING: cant find file: %s' % e
    raise Exception('NoFile')
to csproj_tag_finder(), to catch it later in parser(), but it interrupts the next steps immediately.
P.S. Later, if not mod_tag_from_csproj: will call another function to add a new element. This task could be solved by just returning 'NoFile' and then checking it with if/else, but it seems to me that raise would be the more correct way here. Or not?
raise interrupting the next steps immediately is exactly what it's supposed to do. In fact, that's the whole point of exceptions.
But then return also interrupts the next steps immediately, because returning early is also the whole point of return.
If you want to save an error until later, continue doing some other work, and then raise it at the end, you have to do that explicitly. For example:
def spam():
    error = None
    try:
        do_some_stuff()
    except IOError as e:
        print 'WARNING: cant find file %s' % e
        error = Exception('NoFile')
    try:
        do_some_more_stuff()
    except OtherError as e:
        print 'WARNING: cant frob the glotz %s' % e
        error = Exception('NoGlotz')
    # etc.
    if error:
        raise error
Now, as long as there's no unexpected exception that you forgot to handle, whatever failed last will be in error, and it'll be raised at the end.
As a side note, instead of raising Exception('NoFile'), then using == to test the exception string later, you probably want to create a NoFileException subclass; then you don't need to test it, you can just handle it with except NoFileException:. And that means you can carry some other useful information (the actual exception, the filename, etc.) in your exception without it getting in the way, too. If this sounds scary to implement, it's not. It's literally a one-liner:
class NoFileException(Exception): pass
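As a rough illustration of carrying extra information, here is a sketch in Python 3 syntax. NoFileException is the name suggested above; find_tag is a hypothetical stand-in for csproj_tag_finder() that just reads the first line of a file, and this parser() is a simplified version that returns a message instead of printing.

```python
class NoFileException(Exception):
    """Raised when a project file is missing; keeps details for the caller."""

    def __init__(self, filename, original):
        super().__init__("no such project file: {0}".format(filename))
        self.filename = filename
        self.original = original  # the underlying OSError


def find_tag(path):
    """Hypothetical stand-in for csproj_tag_finder(): read the first line."""
    try:
        with open(path) as handle:
            return handle.readline().strip() or None
    except OSError as err:
        raise NoFileException(path, err) from err


def parser(path):
    """Tell 'file is missing' apart from 'file has no tag'."""
    try:
        tag = find_tag(path)
    except NoFileException as err:
        return "missing file: {0}".format(err.filename)
    if not tag:
        return "no tag in {0}".format(path)
    return tag
```

The caller no longer compares strings: the except clause selects the case, and the filename travels with the exception.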

Avoiding try-except nesting

Given a file of unknown file type, I'd like to open that file with one of a number of handlers. Each of the handlers raises an exception if it cannot open the file.
I would like to try them all and if none succeeds, raise an exception.
The design I came up with is
filename = 'something.something'
try:
    content = open_data_file(filename)
    handle_data_content(content)
except IOError:
    try:
        content = open_sound_file(filename)
        handle_sound_content(content)
    except IOError:
        try:
            content = open_image_file(filename)
            handle_image_content(content)
        except IOError:
            ...
This cascade doesn't seem to be the right way to do it.
Any suggestions?
Maybe you can group all the handlers and evaluate them in a for loop raising an exception at the end if none succeeded. You can also hang on to the raised exception to get some of the information back out of it as shown here:
filename = 'something.something'
handlers = [(open_data_file, handle_data_content),
            (open_sound_file, handle_sound_content),
            (open_image_file, handle_image_content)
            ]
last_error = None
for o, h in handlers:
    try:
        content = o(filename)
        h(content)
        break
    except IOError as e:
        # Remember the failure; in Python 3 the name e is unbound once the except block ends.
        last_error = e
else:
    # Raise the exception we got within. Also saves sub-class information.
    raise last_error
Is checking entirely out of the question?
>>> import urllib
>>> from mimetypes import MimeTypes
>>> guess = MimeTypes()
>>> path = urllib.pathname2url(target_file)
>>> opener = guess.guess_type(path)
>>> opener
('audio/ogg', None)
I know try/except and eafp is really popular in Python, but there are times when a foolish consistency will only interfere with the task at hand.
Additionally, IMO a try/except loop may not necessarily break for the reasons you expect, and as others have pointed out you're going to need to report the errors in a meaningful way if you want to see what's really happening as you try to iterate over file openers until you either succeed or fail. Either way, there's introspective code being written: to dive into the try/excepts and get a meaningful one, or reading the file path and using a type checker, or even just splitting the file name to get the extension... in the face of ambiguity, refuse the temptation to guess.
Like the others, I also recommend using a loop, but with tighter try/except scopes.
Plus, it's always better to re-raise the original exception, in order to preserve extra info about the failure, including traceback.
openers_handlers = [ (open_data_file, handle_data_content) ]

def open_and_handle(filename):
    for i, (opener, handler) in enumerate(openers_handlers):
        try:
            f = opener(filename)
        except IOError:
            if i >= len(openers_handlers) - 1:
                # all failed. re-raise the original exception
                raise
            else:
                # try next
                continue
        else:
            # successfully opened. handle:
            return handler(f)
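For anyone who wants to run the loop idea, here is a self-contained sketch in Python 3 syntax. The two openers and the shared handler are hypothetical toys that decide by file extension; real openers would inspect the file contents. It keeps the last exception in a separate variable and calls the handler outside the try, so errors raised by the handler are not mistaken for open failures.

```python
def open_data_file(name):
    """Hypothetical opener: accepts only .dat names."""
    if not name.endswith(".dat"):
        raise IOError("not a data file: " + name)
    return "data:" + name


def open_sound_file(name):
    """Hypothetical opener: accepts only .ogg names."""
    if not name.endswith(".ogg"):
        raise IOError("not a sound file: " + name)
    return "sound:" + name


def handle_content(content):
    """Hypothetical handler shared by both openers."""
    return content.upper()


OPENERS_HANDLERS = [
    (open_data_file, handle_content),
    (open_sound_file, handle_content),
]


def open_and_handle(filename):
    """Try each opener in turn; re-raise the last failure if none works."""
    last_error = None
    for opener, handler in OPENERS_HANDLERS:
        try:
            content = opener(filename)
        except IOError as err:
            # Keep a reference: `err` is unbound after this block in Python 3.
            last_error = err
            continue
        # The handler runs outside the try, so its own errors are not masked.
        return handler(content)
    # Assumes the list is non-empty; with no openers last_error would be None.
    raise last_error
```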
You can use Context Managers:
class ContextManager(object):
    def __init__(self, x, failure_handling):
        self.x = x
        self.failure_handling = failure_handling
    def __enter__(self):
        return self.x
    def __exit__(self, exctype, excinst, exctb):
        if exctype == IOError:
            if self.failure_handling:
                fn = self.failure_handling.pop(0)
                with ContextManager(fn(filename), self.failure_handling) as context:
                    handle_data_content(context)
            return True

filename = 'something.something'
with ContextManager(open_data_file(filename), [open_sound_file, open_image_file]) as content:
    handle_data_content(content)

keeping the continued code that will NOT fail still in try...except?

Dive into Python -
This is a small snippet from fileinfo.py used in the book. This is opening an MP3 file and reading the last 128 bytes to fetch and later parse the metadata.
try:
    fsock = open(filename, "rb", 0)
    try:
        fsock.seek(-128, 2)
        tagdata = fsock.read(128)
    finally:
        fsock.close()
    .
    . # process tagdata: will NEVER raise IOError though
    .
except IOError:
    pass
This can be refactored as:
try:
    fsock = open(filename, "rb", 0)
    try:
        fsock.seek(-128, 2)
        tagdata = fsock.read(128)
    except IOError:
        pass
    finally:
        fsock.close()
.
. # process tagdata
.
I even used to have this question when I was learning Java. Should we just keep the logic that can actually raise an exception inside the try..except block or for the sake of keeping a code that does one particular job in ONE place; keep the other code that will NEVER raise an exception also within a try...except?
The try/finally clause's primary purpose is to close the file regardless of what happens. It doesn't make sense to move it to the outer try/except, as I assume you are trying to do:
try:
    fsock = open(filename, "rb", 0)
    try:
        fsock.seek(-128, 2)
        tagdata = fsock.read(128)
    except:
        pass
except IOError:
    pass
finally:
    fsock.close()
The reason being, if IOError is actually raised, calling fsock.close() would raise another exception, since fsock would not have been assigned. Instead of either, it'd be preferable to use the with statement which will automatically close the file for you:
try:
    with open(filename, 'rb') as fsock:
        fsock.seek(-128, 2)
        tagdata = fsock.read(128)
except IOError:
    pass
The most accepted standard is to put as little code in the try..except as possible. Reasoning is that you don't know what the other code will raise if there's a ton of code in a try.. then it becomes really messy.
You can see lots of good styling information in PEP 8, amongst which is:
- Additionally, for all try/except clauses, limit the 'try' clause
  to the absolute minimum amount of code necessary. Again, this
  avoids masking bugs.
  Yes:
      try:
          value = collection[key]
      except KeyError:
          return key_not_found(key)
      else:
          return handle_value(value)
  No:
      try:
          # Too broad!
          return handle_value(collection[key])
      except KeyError:
          # Will also catch KeyError raised by handle_value()
          return key_not_found(key)
The second piece of code is syntactically invalid, so you should prefer the first form.
If you'd make it syntactically valid by adding an except or finally clause, it would be semantically invalid: if the open fails, you'd still be trying to close fsock, which would not be assigned.
If the open fails then you can't assign to tagdata, so you should not allow the code to reach the point where you process tagdata. Often the best way to handle this is to process the IOError at a higher level (i.e. wrap this in a function and handle it in the calling context).
BTW, in modern Python we don't need to use finally for this sort of thing - we have a more powerful idiom. We also have an else clause that can be attached to try/except blocks that is executed only if the exception handlers are not invoked.
So we get something like:
def get_data():
    with open(filename, "rb", 0) as fsock:
        fsock.seek(-128, 2)
        return fsock.read(128)

def do_processing():
    try: tagdata = get_data()
    except IOError: handle_error()
    else: process(tagdata)
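A runnable variant of the same shape, in Python 3 syntax, in case it helps to experiment. The names read_tail and describe_tail are hypothetical, and returning the tag length stands in for the real processing step. Note that seeking 128 bytes back from the end of a shorter file also raises OSError, so that case lands in the same handler as a missing file.

```python
import os


def read_tail(filename, size=128):
    """Return the last `size` bytes of a file; OSError propagates to the caller."""
    with open(filename, "rb") as fsock:
        fsock.seek(-size, os.SEEK_END)
        return fsock.read(size)


def describe_tail(filename):
    """Only the call that can fail sits in the try; processing goes in else."""
    try:
        tagdata = read_tail(filename)
    except OSError:
        # Missing file, unreadable file, or file shorter than `size` bytes.
        return None
    else:
        return len(tagdata)
```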

a question about this python script!

if __name__=="__main__":
    fname= raw_input("Please enter your file:")
    mTrue=1
    Salaries=''
    Salarieslist={}
    Employeesdept=''
    Employeesdeptlist={}
    try:
        f1=open(fname)
    except:
        mTrue=0
        print 'The %s does not exist!'%fname
    if mTrue==1:
        ss=[]
        for x in f1.readlines():
            if 'Salaries' in x:
                Salaries=x.strip()
            elif 'Employees' in x:
                Employeesdept=x.strip()
        f1.close()
        if Salaries and Employeesdept:
            Salaries=Salaries.split('-')[1].strip().split(' ')
            for d in Salaries:
                s=d.strip().split(':')
                Salarieslist[s[0]]=s[1]
            Employeesdept=Employeesdept.split('-')[1].strip().split(' ')
            for d in Employeesdept:
                s=d.strip().split(':')
                Employeesdeptlist[s[0]]=s[1]
            print "1) what is the average salary in the company: %s "%Salarieslist['Avg']
            print "2) what are the maximum and minimum salaries in the company: maximum:%s,minimum:%s "%(Salarieslist['Max'],Salarieslist['Min'])
            print "3) How many employees are there in each department :IT:%s, Development:%s, Administration:%s"%(
                Employeesdeptlist['IT'],Employeesdeptlist['Development'],Employeesdeptlist['Administration'])
        else:
            print 'The %s data is err!'%fname
When I enter a filename, the script doesn't continue. Why? If I enter a file named company.txt, it always shows that the file does not exist. Why?
I can give you some hints which can help you resolve the problem:
- Create a function and call it in main, e.g.
if __name__=="__main__":
    main()
- Don't put the whole block under if mTrue==1:; instead, just return from the function on error, e.g.
def main():
    fname= raw_input("Please enter your file:")
    try:
        f1=open(fname)
    except:
        print 'The %s does not exist!'%fname
        return
    ... # main code here
- Never catch all exceptions; instead catch a specific exception, e.g. IOError
try:
    f1 = open(fname)
except IOError,e:
    print 'The %s does not exist!'%fname
otherwise catching all exceptions may also hide errors such as misspelled names.
- Print the exception you are getting. It may not always be "file not found"; maybe you don't have read permission or something like that.
- And finally, your problem could be just that: the file may not exist. Try entering the full path.
Your current working directory does not contain company.txt.
Either set your current working directory or use an absolute path.
You can change the working directory like so:
import os
os.chdir(new_path)
In addition to being more specific about which exceptions you want to catch, you should consider capturing the exception object itself so you can print a string representation of it as part of your error message:
import sys

try:
    f1 = open(fname, 'r')
except IOError, e:
    print >> sys.stderr, "Some error occurred while trying to open %s" % fname
    print >> sys.stderr, e
You can also learn more about specific types of exception objects and perhaps handle some sorts of exceptions in your code. You can even capture exceptions for your own inspection from within the interpreter, so you can run dir() on them, and type() on each of the interesting attributes you find, and so on.
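The answers above use Python 2 syntax. As a sketch of the same advice in Python 3, where the built-in exception hierarchy separates a missing file from a permission problem, something like the following could work; open_or_report is a hypothetical helper name.

```python
import sys


def open_or_report(fname):
    """Return an open file object, or None after printing why it failed."""
    try:
        return open(fname, "r")
    except FileNotFoundError as err:
        print("%s does not exist: %s" % (fname, err), file=sys.stderr)
    except PermissionError as err:
        print("No permission to read %s: %s" % (fname, err), file=sys.stderr)
    except OSError as err:
        # Anything else the OS reported, e.g. the path is a directory.
        print("Could not open %s: %s" % (fname, err), file=sys.stderr)
    return None
```

Printing the exception itself shows the path Python actually tried, which makes a wrong working directory easy to spot.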
