Sorting disk I/O errors in Python

How do I sort out (distinguish) an error derived from a "disk full condition" from "trying to write to a read-only file system"?
I don't want to fill my HD to find out :)
What I want is to know how to catch each exception, so my code can show one message when the user is trying to write to a read-only FS and another message when the user is trying to write a file to a disk that is full.

Once you catch IOError, e.g. with an except IOError, e: clause in Python 2.*, you can examine e.errno to find out exactly what kind of I/O error it was (unfortunately in a way that's not necessarily fully portable among different operating systems).
See the errno module in the Python standard library; opening a file for writing on a R/O filesystem (on a sensible OS) should produce errno.EPERM, errno.EACCES or, better yet, errno.EROFS ("read-only filesystem"); if the filesystem is R/W but there's no space left you should get errno.ENOSPC ("no space left on device"). But you will need to experiment on the OSes you care about (with a small USB key, filling it up should be easy ;-).
There's no way to use different except clauses depending on errno -- such clauses must be distinguished by the class of exceptions they catch, not by attributes of the exception instance -- so you'll need an if/else or other kind of dispatching within a single except IOError, e: clause.
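The dispatch described above might look like the following in Python 3, where IOError is an alias of OSError. This is only a minimal sketch: the helper names describe_write_error and try_write and the message wording are assumptions for illustration, and the errno values actually raised should still be confirmed on the operating systems you care about.

```python
import errno


def describe_write_error(exc):
    """Map the errno of an OSError to a user-facing message (hypothetical wording)."""
    if exc.errno == errno.EROFS:
        return "The file system is read-only."
    if exc.errno in (errno.EACCES, errno.EPERM):
        return "Permission denied; the target may be read-only."
    if exc.errno == errno.ENOSPC:
        return "The disk is full."
    return "Unexpected I/O error: {}".format(exc)


def try_write(path, data):
    """Write data to path; return None on success or a message on failure."""
    try:
        with open(path, "w") as f:
            f.write(data)
    except OSError as e:  # IOError is an alias of OSError in Python 3
        return describe_write_error(e)
    return None
```

Keeping the errno mapping in its own function lets it be exercised with hand-built exceptions, such as OSError(errno.ENOSPC, "..."), without actually filling a disk.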

On a read-only filesystem, the files themselves will be marked as read-only. Any attempt to open a read-only file for writing (O_WRONLY or O_RDWR) will fail. On UNIX-like systems, the errno EACCES will be set.
>>> file('/etc/resolv.conf', 'a')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IOError: [Errno 13] Permission denied: '/etc/resolv.conf'
In contrast, attempts to write to a full filesystem may result in ENOSPC. The word "may" is critical; the error may be delayed until fsync or close.
>>> file('/dev/full', 'a').write('\n')
close failed in file object destructor:
IOError: [Errno 28] No space left on device
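Because the error can be delayed, one option is to keep the flush, fsync and close inside the same try block as the write. The following Python 3 sketch does that; write_checked is a hypothetical helper name, and returning the errno (rather than re-raising) is just one possible design.

```python
import os


def write_checked(path, data):
    """Write bytes to path and return the errno of any failure, else None."""
    try:
        with open(path, "wb") as f:
            f.write(data)
            f.flush()             # push Python's buffer to the OS
            os.fsync(f.fileno())  # ask the OS to commit the data to the device
    except OSError as e:
        # Also covers errors raised when the with block closes the file.
        return e.errno
    return None
```

On Linux, calling write_checked('/dev/full', b'\n') should report errno.ENOSPC, but that is worth confirming on the target system.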

Related

Python on Windows "Handle Invalid" when redirecting stdout writing to file

A script I am trying to fix uses the following paradigm for redirecting stdout to a file.
import os
stdio_file = 'temp.out'
flag = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
stdio_fp = os.open(stdio_file, flag)
os.dup2(stdio_fp, 1)
print("hello")
On Python 2, this works. On Python 3, you get an OSError:
Traceback (most recent call last):
  File "test.py", line 6, in <module>
    print("hello")
OSError: [WinError 6] The handle is invalid
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
OSError: [WinError 6] The handle is invalid
I assume there are preferable ways to route stdout to a file, but I am wondering why this method stopped working in Python 3 and whether there is an easy way to fix it.

Code such as os.dup2(stdio_fp, 1) will work in Python 3.5 and earlier, or in 3.6+ with the environment variable PYTHONLEGACYWINDOWSSTDIO defined.
The issue is that print writes to a sys.stdout object that's only meant for console I/O. Specifically, in 3.6+ the raw layer of Python 3's standard output file (i.e. sys.stdout.buffer.raw) is an io._WindowsConsoleIO instance when stdout is initially a console file [1]. This object caches the initial handle value of the stdout file descriptor [2]. Subsequently, dup2 closes this handle while re-associating the file descriptor with a duplicate handle for "temp.out". At this point the cached handle is no longer valid. (Really, it shouldn't be caching the handle, since calling _get_osfhandle is relatively cheap compared to the cost of console I/O.) However, even if it had a valid handle for "temp.out", sys.stdout.write would fail anyway since _WindowsConsoleIO uses the console-only function WriteConsoleW instead of generic WriteFile.
You need to reassign sys.stdout instead of bypassing Python's I/O stack with low-level operations such as dup2. I know it's not ideal from the Unix developer's point of view. I wish we could re-implement the way Unicode is supported for the Windows console without introducing this console-only _WindowsConsoleIO class, which disrupts low-level patterns that people have relied on for decades.
1. _WindowsConsoleIO was added to support the full range of Unicode in the Windows console (at least as well as the console can support it). For this it uses the console's UTF-16 wide-character API (e.g. ReadConsoleW and WriteConsoleW). Previously CPython's console support was limited to text that was encoded with Windows codepages, using generic byte-based I/O (e.g. ReadFile and WriteFile).
2. Windows uses handles to reference kernel objects such as File objects. This system isn't compatible in behavior with POSIX file descriptors (FDs). The C runtime (CRT) thus has a "low I/O" compatibility layer that associates POSIX-style FDs with Windows file handles, and it also implements POSIX I/O functions such as open and write. The CRT's _open_osfhandle function associates a native file handle with an FD, and _get_osfhandle returns the handle associated with an FD. Sometimes CPython uses the CRT low I/O layer, and sometimes it uses the Windows API directly. It's really kind of a mess, if you ask me.
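Reassigning sys.stdout, as recommended above, could be sketched as follows using contextlib.redirect_stdout from the standard library, which rebinds sys.stdout temporarily. The helper name print_to_file is an assumption for illustration. Note that this only affects Python-level writes to sys.stdout; output written directly to file descriptor 1 by child processes or C code is not redirected.

```python
import contextlib


def print_to_file(path, *lines):
    """Print each line to the file at path instead of the console."""
    with open(path, "w", encoding="utf-8") as fp:
        with contextlib.redirect_stdout(fp):  # temporarily rebinds sys.stdout
            for line in lines:
                print(line)
```

For example, print_to_file('temp.out', 'hello') writes "hello" to temp.out and restores the original sys.stdout afterwards.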

Check the permissions of a file in python

I'm trying to check the readability of a file given the specified path. Here's what I have:
def read_permissions(filepath):
    '''Checks the read permissions of the specified file'''
    try:
        os.access(filepath, os.R_OK) # Find the permissions using os.access
    except IOError:
        return False
    return True
This works and returns True or False as the output when run. However, I want the error messages from errno to accompany it. This is what I think I would have to do (But I know that there is something wrong):
def read_permissions(filepath):
    '''Checks the read permissions of the specified file'''
    try:
        os.access(filepath, os.R_OK) # Find the permissions using os.access
    except IOError as e:
        print(os.strerror(e)) # Print the error message from errno as a string
    print("File exists.")
However, if I were to type in a file that does not exist, it tells me that the file exists. Can someone help me as to what I have done wrong (and what I can do to stay away from this issue in the future)? I haven't seen anyone try this using os.access. I'm also open to other options to test the permissions of a file. Can someone help me in how to raise the appropriate error message when something goes wrong?
Also, this would likely apply to my other functions (They still use os.access when checking other things, such as the existence of a file using os.F_OK and the write permissions of a file using os.W_OK). Here is an example of the kind of thing that I am trying to simulate:
>>> read_permissions("located-in-restricted-directory.txt") # Because of a permission error (perhaps due to the directory)
[errno 13] Permission Denied
>>> read_permissions("does-not-exist.txt") # File does not exist
[errno 2] No such file or directory
This is the kind of thing that I am trying to simulate, by returning the appropriate error message to the issue. I hope that this will help avoid any confusion about my question.
I should probably point out that while I have read the os.access documentation, I am not trying to open the file later. I am simply trying to create a module in which some of the components are to check the permissions of a particular file. I have a baseline (The first piece of code that I had mentioned) which serves as a decision maker for the rest of my code. Here, I am simply trying to write it again, but in a user-friendly way (not just True or just False, but rather with complete messages). Since the IOError can be brought up a couple different ways (such as permission denied, or non-existent directory), I am trying to get my module to identify and publish the issue. I hope that this helps you to help me determine any possible solutions.

os.access returns False when the file does not exist, regardless of the mode parameter passed.
This isn't stated explicitly in the documentation for os.access but it's not terribly shocking behavior; after all, if a file doesn't exist, you can't possibly access it. Checking the access(2) man page as suggested by the docs gives another clue, in that access returns -1 in a wide variety of conditions. In any case, you can simply do as I did and check the return value in IDLE:
>>> import os
>>> os.access('does_not_exist.txt', os.R_OK)
False
In Python it's generally discouraged to go around checking types and such before trying to actually do useful things. This philosophy is often expressed with the initialism EAFP, which stands for Easier to Ask Forgiveness than Permission. If you refer to the docs again, you'll see this is particularly relevant in the present case:
Note: Using access() to check if a user is authorized to e.g. open a file before actually doing so using open() creates a security hole, because the user might exploit the short time interval between checking and opening the file to manipulate it. It’s preferable to use EAFP techniques. For example:
if os.access("myfile", os.R_OK):
    with open("myfile") as fp:
        return fp.read()
return "some default data"
is better written as:
try:
    fp = open("myfile")
except IOError as e:
    if e.errno == errno.EACCES:
        return "some default data"
    # Not a permission error.
    raise
else:
    with fp:
        return fp.read()
If you have other reasons for checking permissions than second-guessing the user before calling open(), you could look to How do I check whether a file exists using Python? for some suggestions. Remember that if you really need an exception to be raised, you can always raise it yourself; no need to go hunting for one in the wild.
"Since the IOError can be brought up a couple different ways (such as permission denied, or non-existent directory), I am trying to get my module to identify and publish the issue."
That's what the second approach above does. See:
>>> try:
...     open('file_no_existy.gif')
... except IOError as e:
...     pass
...
>>> e.args
(2, 'No such file or directory')
>>> try:
...     open('unreadable.txt')
... except IOError as e:
...     pass
...
>>> e.args
(13, 'Permission denied')
>>> e.args == (e.errno, e.strerror)
True
But you need to pick one approach or the other. If you're asking forgiveness, do the thing (open the file) in a try-except block and deal with the consequences appropriately. If you succeed, then you know you succeeded because there's no exception.
On the other hand, if you ask permission (aka LBYL, Look Before You Leap) in this, that, and the other way, you still don't know if you can successfully open the file until you actually do it. What if the file gets moved after you check its permissions? What if there's a problem you didn't think to check for?
If you still want to ask permission, don't use try-except; you're not doing the thing so you're not going to throw errors. Instead, use conditional statements with calls to os.access as the condition.
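As a sketch of the "ask forgiveness" approach applied to the original function, the following Python 3 helper tries to open the file and reports the errno message on failure. The name check_readable and the message format are assumptions for illustration; the function opens the file only to close it again immediately.

```python
import os


def check_readable(filepath):
    """Return (True, message) if filepath can be opened for reading, else (False, message)."""
    try:
        with open(filepath, "rb"):
            pass  # opening succeeded; nothing needs to be read
    except OSError as e:  # IOError is an alias of OSError in Python 3
        return False, "[Errno {}] {}".format(e.errno, os.strerror(e.errno))
    return True, "File is readable."
```

Returning the message instead of printing it leaves the decision about how to display it to the caller.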

Python error exception

I have a script which creates a temporary text file and deletes it after the user closes the window.
The problem is that the temporary text file may or may not be created, depending on what the user does. Sometimes the temporary text file may also be deleted before the user exits. There are three possible scenarios:
1. The temporary text file is created with the name of 'tempfilename'.
2. The temporary text file is created with the name of 'tempfilename' but deleted before the user exits, so trying to remove the file raises OSError.
3. The temporary text file is not created and no variable called 'tempfilename' is created, so a NameError is raised.
I have tried using this code:
try:
    os.remove(str(tempfilename))
except OSError or NameError:
    pass
But it seems that it only catches the OSError. Did I do something wrong?

try:
    os.remove(str(tempfilename))
except (OSError, NameError):
    pass
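The reason the original clause misses NameError is that OSError or NameError is an ordinary boolean expression: or returns its first truthy operand, and classes are truthy, so the clause is equivalent to except OSError. The short sketch below illustrates the difference; is_caught is a hypothetical helper written only for this demonstration.

```python
single = OSError or NameError   # evaluates to OSError alone
both = (OSError, NameError)     # a tuple matches either class


def is_caught(exc, handler):
    """Return True if an `except handler:` clause would catch exc."""
    try:
        raise exc
    except handler:
        return True
    except Exception:
        return False
```

With these definitions, is_caught(NameError("x"), single) is False while is_caught(NameError("x"), both) is True.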

tempfilename = None
# ...
if tempfilename is not None and os.path.exists(tempfilename):
    os.remove(tempfilename)
It's not good to catch NameError since it will hide other typos in your code (e.g., os.remov(…)).
Also, OSError does not always mean that the file did not exist. On Windows, if the file was in use, an exception would be raised (http://docs.python.org/2/library/os.html#os.remove). In that case, you would want to see the exception so you could be aware of the issue and/or handle it another way.
Exception handlers should be kept as narrow as possible to avoid hiding unrelated errors or bugs.
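One way to keep the handler narrow is to swallow only the "file does not exist" case and let every other OSError propagate. This is a minimal sketch; remove_if_exists is a hypothetical helper name, and the errno check is used so the same code behaves the same way whether or not the interpreter provides FileNotFoundError.

```python
import errno
import os


def remove_if_exists(path):
    """Remove path; return True if removed, False if it was already gone."""
    try:
        os.remove(path)
    except OSError as e:
        if e.errno != errno.ENOENT:
            raise  # e.g. permission denied or file in use: let the caller see it
        return False
    return True
```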

disallow access to filesystem inside exec and eval in Python

I want to disallow access to the file system from client code, so I thought I could override the open function:
env = {
    'open': lambda *a: StringIO("you can't use open")
}
exec(open('user_code.py'), env)
but I got this:
unqualified exec is not allowed in function 'my function' it contains a nested function with free variables
I also tried
def open_exception(*a):
    raise Exception("you can't use open")
env = {
    'open': open_exception
}
but got the same exception (not "you can't use open").
I want to prevent executing this:
"""def foo():
    return open('some_file').read()
print foo()"""
and evaluating this:
"open('some_file').write('some text')"
I also use a session to store code that was evaluated previously, so I need to prevent executing this:
"""def foo(s):
    return open(s)"""
and then evaluating this:
"foo('some').write('some text')"
I can't use a regex because someone could use eval inside a string:
"eval(\"opxx('some file').write('some text')\".replace('xx', 'en'))"
Is there any way to prevent access to the file system inside exec/eval? (I need both.)

There's no way to prevent access to the file system inside exec/eval. Here's example code that demonstrates a way for user code to reach otherwise restricted classes, and it always works:
import subprocess
code = """[x for x in ().__class__.__bases__[0].__subclasses__()
if x.__name__ == 'Popen'][0](['ls', '-la']).wait()"""
# Executing the `code` will always run `ls`...
exec code in dict(__builtins__=None)
And don't think about filtering the input, especially with regex.
You might consider a few alternatives:
ast.literal_eval if you could limit yourself only to simple expressions
Using another language for user code. You might look at Lua or JavaScript - both are sometimes used to run unsafe code inside sandboxes.
There's the pysandbox project, though I can't guarantee you that the sandboxed code is really safe. Python wasn't designed to be sandboxed, and in particular the CPython implementation wasn't written with sandboxing in mind. Even the author seems to doubt that such a sandbox can be implemented safely.
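For the first alternative, here is a small sketch of how ast.literal_eval behaves: it accepts Python literals (numbers, strings, tuples, lists, dicts, sets, booleans, None) and rejects anything else, including function calls. The wrapper name parse_literal is an assumption for illustration.

```python
import ast


def parse_literal(text):
    """Return (True, value) if text is a Python literal, else (False, None)."""
    try:
        return True, ast.literal_eval(text)
    except (ValueError, TypeError, SyntaxError, MemoryError, RecursionError):
        return False, None
```

For example, parse_literal("{'a': [1, 2]}") yields the dictionary, while parse_literal("open('some_file')") is rejected without the call ever being made.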

You can't turn exec() and eval() into a safe sandbox. You can always get access to the builtin module, as long as the sys module is available:
sys.modules[().__class__.__bases__[0].__module__].open
And even if sys is unavailable, you can still get access to any new-style class defined in any imported module by basically the same way. This includes all the IO classes in io.
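The same point can be illustrated in Python 3 with a harmless sketch: even when the evaluated expression is given an empty set of builtins, it can still walk from a literal to object and from there to every loaded class. The function name reachable_classes is an assumption for illustration, and the code only lists classes; it does not call any of them.

```python
def reachable_classes():
    """Evaluate an expression with no builtins and return the classes it can still reach."""
    expr = "().__class__.__bases__[0].__subclasses__()"
    return eval(expr, {"__builtins__": {}})
```

The returned list contains ordinary types such as int alongside whatever I/O and process classes happen to be imported, which is why removing names from the namespace does not restrict what the code can do.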

This actually can be done.
Contrary to other answers here, practically what you describe can be accomplished on Linux. You can achieve a setup with an exec-like call which runs untrusted code under security that is reasonably difficult to penetrate, and which allows output of the result. Untrusted code is not allowed to access the filesystem at all except for reading specifically allowed parts of the Python VM and standard library.
If that's close enough to what you wanted, read on.
I'm envisioning a system where your exec-like function spawns a subprocess under a very strict AppArmor profile, such as the one used by Straitjacket. This will limit all filesystem access at the kernel level, other than files specifically allowed to be read. This will also limit the process's stack size, max data segment size, max resident set size, CPU time, the number of signals that can be queued, and the address space size. The process will have locked memory, cores, flock/fcntl locks, POSIX message queues, etc., wholly disallowed. If you want to allow using size-limited temporary files in a scratch area, you can mkstemp it and make it available to the subprocess, and allow writes there under certain conditions (make sure that hard links are absolutely disallowed). You'd want to make sure to clear out anything interesting from the subprocess environment, put it in a new session and process group, and close all FDs in the subprocess except for stdin/stdout/stderr, if you want to allow communication with those.
If you want to be able to get a Python object back out from the untrusted code, you could wrap it in something which prints the result's repr to stdout, and after you check its size, you evaluate it with ast.literal_eval(). That pretty severely limits the possible types of object that can be returned, but really, anything more complicated than those basic types probably carries the possibility of sekrit maliciousness intended to be triggered within your process. Under no circumstances should you use pickle for the communication protocol between the processes.
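The process-handling part of that design (a child interpreter, a trimmed environment, a new session, and a size-checked result parsed with ast.literal_eval) might be sketched as below. This is only the plumbing and is an assumption-laden illustration: it provides no confinement by itself, the AppArmor profile and resource limits described above are not shown, and the names run_isolated, KEEP_ENV and MAX_OUTPUT_BYTES are hypothetical.

```python
import ast
import os
import subprocess
import sys

MAX_OUTPUT_BYTES = 10000                             # hypothetical cap on the result size
KEEP_ENV = ("PATH", "SYSTEMROOT", "LD_LIBRARY_PATH")  # hypothetical allow-list


def run_isolated(source, timeout=5):
    """Run source in a child interpreter and parse what it prints as a literal.

    NOTE: this alone is NOT a sandbox; kernel-level confinement must be added.
    """
    env = {k: v for k, v in os.environ.items() if k in KEEP_ENV}
    proc = subprocess.run(
        [sys.executable, "-I", "-c", source],  # -I: isolated mode
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        env=env,                  # drop everything not on the allow-list
        start_new_session=True,   # new session and process group (POSIX)
        timeout=timeout,
    )
    if proc.returncode != 0:
        raise RuntimeError("child exited with status {}".format(proc.returncode))
    if len(proc.stdout) > MAX_OUTPUT_BYTES:
        raise ValueError("result too large")
    # literal_eval only accepts basic literals, never arbitrary objects.
    return ast.literal_eval(proc.stdout.decode("utf-8").strip())
```

In this sketch the size check happens only after the output has been collected in memory, so a real implementation would also want to bound how much it reads from the pipe.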

Overriding open, as @Brian suggests, doesn't work:
def raise_exception(*a):
    raise Exception("you can't use open")
open = raise_exception
print eval("open('test.py').read()", {})
This displays the content of the file, but this (merging @Brian's and @lunaryorn's answers)
import sys
def raise_exception(*a):
    raise Exception("you can't use open")
__open = sys.modules['__builtin__'].open
sys.modules['__builtin__'].open = raise_exception
print eval("open('test.py').read()", {})
will throw this:
Traceback (most recent call last):
  File "./test.py", line 11, in <module>
    print eval("open('test.py').read()", {})
  File "<string>", line 1, in <module>
  File "./test.py", line 5, in raise_exception
    raise Exception("you can't use open")
Exception: you can't use open
Error in sys.excepthook:
Traceback (most recent call last):
  File "/usr/lib/python2.6/dist-packages/apport_python_hook.py", line 48, in apport_excepthook
    if not enabled():
  File "/usr/lib/python2.6/dist-packages/apport_python_hook.py", line 23, in enabled
    conf = open(CONFIG).read()
  File "./test.py", line 5, in raise_exception
    raise Exception("you can't use open")
Exception: you can't use open
Original exception was:
Traceback (most recent call last):
  File "./test.py", line 11, in <module>
    print eval("open('test.py').read()", {})
  File "<string>", line 1, in <module>
  File "./test.py", line 5, in raise_exception
    raise Exception("you can't use open")
Exception: you can't use open
and you can still access open outside user code via __open.

"Nested function" refers to the fact that it's declared inside another function, not that it's a lambda. Declare your open override at the top level of your module and it should work the way you want.
Also, I don't think this is totally safe. Preventing open is just one of the things you need to worry about if you want to sandbox Python.

effective use of python shutil copy2

If we take a look at a file copy function, we can see there are several exceptions to handle. A good example is here: http://msdn.microsoft.com/en-us/library/9706cfs5.aspx
My question is: if I use Python's shutil.copy2, what should I pay attention to in order to cope with the various exceptions (source file not found, access not authorized, etc.)?
e.g.
def copy_file (self):
    if not os.path.isdir(dest_path):
        os.makedirs(dest_path)
    shutil.copy2(src_path, dest_path)
What should I do to the above function?

You may just need to handle the IOError exception, which can be caused by permission problems or an invalid destination name.
try:
    shutil.copy(src,dst)
except IOError as e:
    print(e)
The other exceptions mentioned in the MSDN article seem to fall under the same IOError in Python. FileNotFound and DirectoryNotFound are not really applicable, as shutil.copy will create the destination if it does not already exist. Also, I find that an OSError is unlikely in this case.
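In Python 3, where IOError is an alias of OSError, the cases from the question can be told apart by the OSError subclasses; a missing source file, for instance, surfaces as FileNotFoundError. The sketch below is one possible shape for the function in the question, not a definitive implementation: the (destination, message) return convention and the message wording are assumptions for illustration.

```python
import os
import shutil


def copy_file(src_path, dest_dir):
    """Copy src_path into dest_dir; return (destination, None) or (None, message)."""
    try:
        os.makedirs(dest_dir, exist_ok=True)
        return shutil.copy2(src_path, dest_dir), None
    except FileNotFoundError as e:
        return None, "Not found: {}".format(e.filename)
    except PermissionError as e:
        return None, "Permission denied: {}".format(e.filename)
    except shutil.SameFileError:
        return None, "Source and destination are the same file."
    except OSError as e:
        # Anything else, e.g. a full disk (errno.ENOSPC).
        return None, "Copy failed: {}".format(e)
```

The more specific handlers come first, so the final except OSError clause only sees errors that none of the others matched.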
