Check that python function is defined for a particular input? - python

I have tried the following which doesn't work:
try:
f = h5py.File(filepath)
except NameError:
f = scipy.io.loadmat(filepath)
Basically, a user will pass a particular input(a filepath) to a function which is
supposed to load the data in that file. But, I don't expect the user to know whether the function is defined for that input.
I'm getting the following error:
OSError: Unable to create file (Unable to open file: name = '/users/cyrilrocke/documents/c_elegans/data/chemotaxis/n2laura_benzaldehyde_l_2013_03_17__15_39_19___2____features.mat', errno = 17, error message = 'file exists', flags = 15, o_flags = a02)
Note: basically, I want to be able to switch between the function h5py.File() and scipy.io.loadmat() depending on which one doesn't work. Given the inputs, one of these functions must work.

I guess you want something like that:
flag = True;
try:
f = h5py.File(filepath)
except:
flag = False;
try:
if not flag:
f = scipy.io.loadmat(filepath)
except:
print('Error')
The problems in your code are:
you are catching for the wrong error. Without specifying the error type you can handle all kinds of error in a single way (you can also add multiple except clauses)
you are executing a potentially failing function inside the except clause. In this way you won't be able to handle new errors. I then splitted them in two separate try/except blocks.

First: your code should know whether a function is defined! That's not something external, it is something that is entirely in your power.
If an import might fail, maybe do something like
try:
import h5py
HAVE_H5PY = True
except ImportError:
HAVE_H5PY = False
and then use that to check.
Second, you could instead do something like
if 'h5py' in globals():
but that is ugly and unnecessary (you have to catch the ImportError anyway, if that is the problem you're trying to solve).
Third, your error message has nothing to do with any of this, it is returned by one of those two commands. It is trying to create a file that already exists.

error message = 'file exists'
Your program is getting EEXIST and if you see a difference between trials with h5py and scipy.io, it's because they have different flags that they're passing to open() and perhaps differing states if you cleanup the files created between trials.
From the h5py.File docs:
Valid modes are:
...
r Readonly, file must exist
w Create file, truncate if exists
w- or x Create file, fail if exists
a Read/write if exists, create otherwise (default)
Since you're looking to read the file (only?) you should specify 'r' as the mode.
If you're experiencing this problem only rarely, then perhaps it's a race generated by a TOCTOU-style error in h5py.
Indeed, such a TOCTOU design error exists in h5py.
https://github.com/h5py/h5py/blob/c699e741de64fda8ac49081e9ee1be43f8eae822/h5py/_hl/files.py#L99
# Open in append mode (read/write).
# If that fails, create a new file only if it won't clobber an
# existing one (ACC_EXCL)
try:
fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)
except IOError:
fid = h5f.create(name, h5f.ACC_EXCL, fapl=fapl, fcpl=fcpl)
This is the reason for the O_EXCL flag used by posix-style OSs. h5f.ACC_EXCL likely does the same thing, but if it's only used upon failure of h5f.ACC_RDWR then its atomicity is diminished.

Related

guaranteed unit test fixture for non-existent file

Imagine one writes a unit test for handle for a case when path does not exist:
def handle(path):
try:
with open(path) as f:
pass
except FileNotFoundError:
raise FileNotFoundError(path)
I would write something like below for such test:
import pytest
def test_handle_on_non_existent_path():
x = "abc" # some unbelievable string
with pytest.raises(FileNotFoundError):
handle(x)
My question is what is a better way to generate non-existent path for a unit-test.
My ideas are:
force delete a temporary file
genereate a random string like uuid?
"abc" is fairly concise, but in principle does guarantee path does not exist.
Update: in this question x is "no_exist.txt"
With respect to unit-testing, it seems your intent is to test the behaviour of your code for the case that open(path) will throw a FileNotFoundError. Your approach is to have the code actually perform the open call, but with a non-existent path name. This has some disadvantages: As you already noticed, the dependency on the real file system brings the question how to create a value for path that reliably does not exist as a file on the file system. But, there is another point, namely that you are not even sure that there could not exist another problem with the file system, for example some permission related problem, which could cause some other exception to be raised (OSError).
Put together, performing the call to open itself means you are not in full control of what happens. Therefore, a better approach can be, for this unit test case, to mock open and make your mock raise the FileNotFoundError.

Check the permissions of a file in python

I'm trying to check the readability of a file given the specified path. Here's what I have:
def read_permissions(filepath):
'''Checks the read permissions of the specified file'''
try:
os.access(filepath, os.R_OK) # Find the permissions using os.access
except IOError:
return False
return True
This works and returns True or False as the output when run. However, I want the error messages from errno to accompany it. This is what I think I would have to do (But I know that there is something wrong):
def read_permissions(filepath):
'''Checks the read permissions of the specified file'''
try:
os.access(filepath, os.R_OK) # Find the permissions using os.access
except IOError as e:
print(os.strerror(e)) # Print the error message from errno as a string
print("File exists.")
However, if I were to type in a file that does not exist, it tells me that the file exists. Can someone help me as to what I have done wrong (and what I can do to stay away from this issue in the future)? I haven't seen anyone try this using os.access. I'm also open to other options to test the permissions of a file. Can someone help me in how to raise the appropriate error message when something goes wrong?
Also, this would likely apply to my other functions (They still use os.access when checking other things, such as the existence of a file using os.F_OK and the write permissions of a file using os.W_OK). Here is an example of the kind of thing that I am trying to simulate:
>>> read_permissions("located-in-restricted-directory.txt") # Because of a permission error (perhaps due to the directory)
[errno 13] Permission Denied
>>> read_permissions("does-not-exist.txt") # File does not exist
[errno 2] No such file or directory
This is the kind of thing that I am trying to simulate, by returning the appropriate error message to the issue. I hope that this will help avoid any confusion about my question.
I should probably point out that while I have read the os.access documentation, I am not trying to open the file later. I am simply trying to create a module in which some of the components are to check the permissions of a particular file. I have a baseline (The first piece of code that I had mentioned) which serves as a decision maker for the rest of my code. Here, I am simply trying to write it again, but in a user-friendly way (not just True or just False, but rather with complete messages). Since the IOError can be brought up a couple different ways (such as permission denied, or non-existent directory), I am trying to get my module to identify and publish the issue. I hope that this helps you to help me determine any possible solutions.
os.access returns False when the file does not exist, regardless of the mode parameter passed.
This isn't stated explicitly in the documentation for os.access but it's not terribly shocking behavior; after all, if a file doesn't exist, you can't possibly access it. Checking the access(2) man page as suggested by the docs gives another clue, in that access returns -1 in a wide variety of conditions. In any case, you can simply do as I did and check the return value in IDLE:
>>> import os
>>> os.access('does_not_exist.txt', os.R_OK)
False
In Python it's generally discouraged to go around checking types and such before trying to actually do useful things. This philosophy is often expressed with the initialism EAFP, which stands for Easier to Ask Forgiveness than Permission. If you refer to the docs again, you'll see this is particularly relevant in the present case:
Note: Using access() to check if a user is authorized to e.g. open a file before actually doing so using open() creates a security
hole, because the user might exploit the short time interval between
checking and opening the file to manipulate it. It’s preferable to use
EAFP techniques. For example:
if os.access("myfile", os.R_OK):
with open("myfile") as fp:
return fp.read()
return "some default data"
is better written as:
try:
fp = open("myfile")
except IOError as e:
if e.errno == errno.EACCES:
return "some default data"
# Not a permission error.
raise
else:
with fp:
return fp.read()
If you have other reasons for checking permissions than second-guessing the user before calling open(), you could look to How do I check whether a file exists using Python? for some suggestions. Remember that if you really need an exception to be raised, you can always raise it yourself; no need to go hunting for one in the wild.
Since the IOError can be brought up a couple different ways (such as
permission denied, or non-existent directory), I am trying to get my
module to identify and publish the issue.
That's what the second approach above does. See:
>>> try:
... open('file_no_existy.gif')
... except IOError as e:
... pass
...
>>> e.args
(2, 'No such file or directory')
>>> try:
... open('unreadable.txt')
... except IOError as e:
... pass
...
>>> e.args
(13, 'Permission denied')
>>> e.args == (e.errno, e.strerror)
True
But you need to pick one approach or the other. If you're asking forgiveness, do the thing (open the file) in a try-except block and deal with the consequences appropriately. If you succeed, then you know you succeeded because there's no exception.
On the other hand, if you ask permission (aka LBYL, Look Before You Leap) in this that and the other way, you still don't know if you can successfully open the file until you actually do it. What if the file gets moved after you check its permissions? What if there's a problem you didn't think to check for?
If you still want to ask permission, don't use try-except; you're not doing the thing so you're not going to throw errors. Instead, use conditional statements with calls to os.access as the condition.

Python: If file not found try looking for different filename

Problem:
I have code that looks for a file and open it. By default it looks for file that starts with ####### (each # being a number).
Problem is sometimes the file name is ##-##### and other times #####.
I would like a way if the file cannot be found try looking for the other two ways the file could be written.
An IOError exception happens when the file is not found. What I was thinking was to have an except statement that says:
except File2:
Look for ##### in myfindFileFunction()
if file is still not found run except File3
except File3:
Look for ##-#### in myfindFileFuction()
except:
print "File not found"
What I am not sure of is how to set up custom exception to work this way, and/or if there is a more pythonic way to do this altogether...
Would setting up a pattern or the three possible file names and iterate thought each until the file is found work better?
Using try/except is indeed a very pythonic (and fast) way of doing things.
You have to weigh not only if it's pythonic, but what impact does that approach has in terms of readability. Will you still understand the code quickly when you look at it again in 6 months? Will somebody else?
I usually make sure that slightly complex try/except clauses to handle this kind of things are well commented. Asides from that... it's a perfectly reasonable way of doing it.
Also, to put your mind at ease regarding performance, a common concern when one is deciding between two approaches, take a look here: Python if vs try-except and you'll see that try/except constructs are fast in Python... really fast.
no custom exception needed
import errno
try:
open('somefile')
except IOError as e:
if e.errno == errno.ENOENT:
open('someotherfilename')
else:
raise e
(this is on *nix- im not sure if you're using windows)
It's easy enough to define your own exceptions -- just create a class derived from Exception. The doco is clear.
However creating separate exceptions per file type, or any exception at all, doesn't seem necessary. You could do something like:
files = ('#######', "##-#####', '#####')
fh = None
for f in files:
try:
fh = open(f)
break
except IOError as e:
if e.errno in (errno.ENOENT,):
pass
else:
raise
if not fh:
## all three tries failed
The use of if around e.errno lets you decide which IO errors mean go on to the next file and which are errors you want to know about . File does not exists (errno.ENOENT) means try next file. But others like 'Too many files open' (errno.ENFILE) probably need a different response.

Safely create a file if and only if it does not exist with Python

I wish to write to a file based on whether that file already exists or not, only writing if it doesn't already exist (in practice, I wish to keep trying files until I find one that doesn't exist).
The following code shows a way in which a potentially attacker could insert a symlink, as suggested in this post in between a test for the file and the file being written. If the code is run with high enough permissions, this could overwrite an arbitrary file.
Is there a way to solve this problem?
import os
import errno
file_to_be_attacked = 'important_file'
with open(file_to_be_attacked, 'w') as f:
f.write('Some important content!\n')
test_file = 'testfile'
try:
with open(test_file) as f: pass
except IOError, e:
# Symlink created here
os.symlink(file_to_be_attacked, test_file)
if e.errno != errno.ENOENT:
raise
else:
with open(test_file, 'w') as f:
f.write('Hello, kthxbye!\n')
Edit: See also Dave Jones' answer: from Python 3.3, you can use the x flag to open() to provide this function.
Original answer below
Yes, but not using Python's standard open() call. You'll need to use os.open() instead, which allows you to specify flags to the underlying C code.
In particular, you want to use O_CREAT | O_EXCL. From the man page for open(2) under O_EXCL on my Unix system:
Ensure that this call creates the file: if this flag is specified in conjunction with O_CREAT, and pathname already exists, then open() will fail. The behavior of O_EXCL is undefined if O_CREAT is not specified.
When these two flags are specified, symbolic links are not followed: if pathname is a symbolic link, then open() fails regardless of where the symbolic link points to.
O_EXCL is only supported on NFS when using NFSv3 or later on kernel 2.6 or later. In environments where NFS O_EXCL support is not provided, programs that rely on it for performing locking tasks will contain a race condition.
So it's not perfect, but AFAIK it's the closest you can get to avoiding this race condition.
Edit: the other rules of using os.open() instead of open() still apply. In particular, if you want use the returned file descriptor for reading or writing, you'll need one of the O_RDONLY, O_WRONLY or O_RDWR flags as well.
All the O_* flags are in Python's os module, so you'll need to import os and use os.O_CREAT etc.
Example:
import os
import errno
flags = os.O_CREAT | os.O_EXCL | os.O_WRONLY
try:
file_handle = os.open('filename', flags)
except OSError as e:
if e.errno == errno.EEXIST: # Failed as the file already exists.
pass
else: # Something unexpected went wrong so reraise the exception.
raise
else: # No exception, so the file must have been created successfully.
with os.fdopen(file_handle, 'w') as file_obj:
# Using `os.fdopen` converts the handle to an object that acts like a
# regular Python file object, and the `with` context manager means the
# file will be automatically closed when we're done with it.
file_obj.write("Look, ma, I'm writing to a new file!")
For reference, Python 3.3 implements a new 'x' mode in the open() function to cover this use-case (create only, fail if file exists). Note that the 'x' mode is specified on its own. Using 'wx' results in a ValueError as the 'w' is redundant (the only thing you can do if the call succeeds is write to the file anyway; it can't have existed if the call succeeds):
>>> f1 = open('new_binary_file', 'xb')
>>> f2 = open('new_text_file', 'x')
For Python 3.2 and below (including Python 2.x) please refer to the accepted answer.
This code will easily create a file if one does not exists.
import os
if not os.path.exists('file'):
open('file', 'w').close()

Python Error-Checking Standard Practice

I have a question regarding error checking in Python. Let's say I have a function that takes a file path as an input:
def myFunction(filepath):
infile = open(filepath)
#etc etc...
One possible precondition would be that the file should exist.
There are a few possible ways to check for this precondition, and I'm just wondering what's the best way to do it.
i) Check with an if-statement:
if not os.path.exists(filepath):
raise IOException('File does not exist: %s' % filepath)
This is the way that I would usually do it, though the same IOException would be raised by Python if the file does not exist, even if I don't raise it.
ii) Use assert to check for the precondition:
assert os.path.exists(filepath), 'File does not exist: %s' % filepath
Using asserts seems to be the "standard" way of checking for pre/postconditions, so I am tempted to use these. However, it is possible that these asserts are turned off when the -o flag is used during execution, which means that this check might potentially be turned off and that seems risky.
iii) Don't handle the precondition at all
This is because if filepath does not exist, there will be an exception generated anyway and the exception message is detailed enough for user to know that the file does not exist
I'm just wondering which of the above is the standard practice that I should use for my codes.
If all you want to do is raise an exception, use option iii:
def myFunction(filepath):
with open(filepath) as infile:
pass
To handle exceptions in a special way, use a try...except block:
def myFunction(filepath):
try:
with open(filepath) as infile:
pass
except IOError:
# special handling code here
Under no circumstance is it preferable to check the existence of the file first (option i or ii) because in the time between when the check or assertion occurs and when python tries to open the file, it is possible that the file could be deleted, or altered (such as with a symlink), which can lead to bugs or a security hole.
Also, as of Python 2.6, the best practice when opening files is to use the with open(...) syntax. This guarantees that the file will be closed, even if an exception occurs inside the with-block.
In Python 2.5 you can use the with syntax if you preface your script with
from __future__ import with_statement
Definitely don't use an assert. Asserts should only fail if the code is wrong. External conditions (such as missing files) shouldn't be checked with asserts.
As others have pointed out, asserts can be turned off.
The formal semantics of assert are:
The condition may or may not be evaluated (so don't rely on side effects of the expression).
If the condition is true, execution continues.
It is undefined what happens if the condition is false.
More on this idea.
The following extends from ~unutbu's example. If the file doesn't exist, or on any other type of IO error, the filename is also passed along in the error message:
path = 'blam'
try:
with open(path) as f:
print f.read()
except IOError as exc:
raise IOError("%s: %s" % (path, exc.strerror))
=> IOError: blam: No such file or directory
I think you should go with a mix of iii) and i). If you know for a fact, that python will throw the exception (i.e. case iii), then let python do it. If there are some other preconditions (e.g. demanded by your business logic) you should throw own exceptions, maybe even derive from Exception.
Using asserts is too fragile imho, because they might be turned off.

Categories

Resources