What are os.fdopen() semantics?

I used to think that os.fdopen() either eats the file descriptor and returns a file I/O object, or raises an exception.
For example:
fd = os.open("/etc/passwd", os.O_RDONLY)
try: os.fdopen(fd, "w")
except: os.close(fd) # prevent fd leak
However these semantics don't seem to always hold.
Here's an example on OSX:
In [1]: import os
In [2]: os.open("/", os.O_RDONLY, 0660)
Out[2]: 5
In [3]: os.fdopen(5, "rb")
---------------------------------------------------------------------------
IOError Traceback (most recent call last)
<ipython-input-3-3ca4d619250e> in <module>()
----> 1 os.fdopen(5, "rb")
IOError: [Errno 21] Is a directory: '<fdopen>'
In [4]: os.close(5)
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-4-76713e571514> in <module>()
----> 1 os.close(5)
OSError: [Errno 9] Bad file descriptor
It seems that os.fdopen() both ate my file descriptor 5 and raised an exception...
Is there a safe way to use os.fdopen()?
Did I miss something?
Did I find a bug?
P.S. Python version string Python 2.7.6 (v2.7.6:3a1db0d2747e, Nov 10 2013, 00:42:54) in case someone can't reproduce with theirs.
P.P.S. The same problem is present on Py2.7 on Linux too.
Py3.3, however, doesn't exhibit said problem.

Python checks that the resulting FILE* does not refer to a directory, but it does so after creating the Python file object and storing the FILE* in it. Because the directory check fails, the file object is DECREF'd (since it won't be returned), which causes the destructor to be called, which closes the file.
I agree that it'd be nice if the docs spelled out what effect a failure can have on the file descriptor passed in. I'm not sure what you want as a 'safe' way to use fdopen. If you're going to close the file descriptor on failure anyway, what does it matter that it was already closed by Python? Just use
try: os.close(fd)
except: pass
to squelch the secondary exception.
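If you want one idiom that behaves the same on both versions, a defensive wrapper works; this is a minimal sketch (safe_fdopen is an illustrative name, not a standard API):
import os

def safe_fdopen(fd, mode="r"):
    # Illustrative helper: return a file object for fd, guaranteeing the
    # descriptor ends up closed exactly once if fdopen() fails.
    try:
        return os.fdopen(fd, mode)
    except Exception:
        try:
            os.close(fd)  # may raise EBADF if fdopen() already closed it
        except OSError:
            pass
        raise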
fill_file_fields is called by PyFile_FromFile to fill in the members of the file object, and it calls the dircheck function after the fields have been populated. When the descriptor refers to a directory, dircheck fails and fill_file_fields returns NULL, so PyFile_FromFile does Py_DECREF(f); where f is the file object. Since this is the last reference, the deallocator file_dealloc is called, which invokes close_the_file, which (surprise, surprise) closes the file.
In the 3.4 branch, the dircheck is done from fileio_init which uses the flag variable fd_is_own to determine whether the file should be closed on an error condition.
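The Python 3 behavior is easy to verify: the failed fdopen() leaves the descriptor open, so closing it afterwards succeeds:
# Python 3: the dircheck failure raises, but fd_is_own is false for a
# descriptor we passed in, so the fd is left open for us to close.
import os

fd = os.open("/", os.O_RDONLY)
try:
    os.fdopen(fd, "rb")   # raises IsADirectoryError
except OSError:
    os.close(fd)          # succeeds: no EBADF on Python 3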

Related

Python print() throws an invalid handle after call into dll

I'm using ctypes to load a DLL. Occasionally, after a function from the DLL is called and I then call print() in Python, I get an OSError: invalid handle.
The calls to the DLL are successful and 90% of the time the application works without a hitch. Every ten runs or so this exception is thrown, and I can't even catch it properly since I don't have a way to restore the handle.
I think the DLL is somehow messing with the stdout handle that print() uses. There are some functions within the DLL that still print to stdout. Is there any way to reacquire a valid handle?
Traceback (most recent call last):
File "{PATH}/demo.py", line 13, in <module>
print(" ")
OSError: [WinError 6] The handle is invalid
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
OSError: [WinError 6] The handle is invalid
The issue was fixed by duplicating the stdout file descriptor with os.dup() before calling into the DLL and rebuilding sys.stdout from the duplicate afterwards:
import os, sys
# Duplicate stdout (os.dup returns a fresh descriptor for the same stream)
stdout_copy = os.dup(sys.stdout.fileno())
# ... call into the DLL here ...
# Restore stdout
sys.stdout = os.fdopen(stdout_copy, "w")
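For context, a fuller sketch of the same pattern with try/finally; the DLL name and its export below are placeholders, not from the original question:
import ctypes
import os
import sys

saved_fd = os.dup(sys.stdout.fileno())     # duplicate before the DLL can invalidate it
try:
    dll = ctypes.CDLL("example.dll")       # hypothetical DLL
    dll.do_work()                          # hypothetical export
finally:
    sys.stdout = os.fdopen(saved_fd, "w")  # rebuild stdout from the duplicate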

RuntimeError: lost sys.stdout

I was trying to debug an issue with abc.ABCMeta - in particular a subclass check that didn't work as expected - and I wanted to start by simply adding a print to the __subclasscheck__ method (I know there are better ways to debug code, but pretend for the sake of this question that there's no alternative). However, when starting Python afterwards, Python crashes (like a segmentation fault) and I get this exception:
Fatal Python error: Py_Initialize: can't initialize sys standard streams
Traceback (most recent call last):
File "C:\...\lib\io.py", line 84, in <module>
File "C:\...\lib\abc.py", line 158, in register
File "C:\...\lib\abc.py", line 196, in __subclasscheck__
RuntimeError: lost sys.stdout
So it probably wasn't a good idea to put the print in there. But where exactly does the exception come from? I only changed Python code; that shouldn't crash, right?
Does someone know where this exception is coming from and if/how I can avoid it but still put a print in the abc.ABCMeta.__subclasscheck__ method?
I'm using Windows 10, Python-3.5 (just in case it might be important).
This exception stems from the fact that CPython imports io, and, indirectly, abc.py during the initialization of the standard streams:
if (!(iomod = PyImport_ImportModule("io"))) {
    goto error;
}
io imports the abc module and registers FileIO as a virtual subclass of RawIOBase, a couple of other classes for BufferedIOBase and others for TextIOBase. ABCMeta.register invokes __subclasscheck__ in the process.
As you understand, using print in __subclasscheck__ when sys.stdout isn't set up is a big no-no; the initialization fails and you get back your error:
if (initstdio() < 0)
    Py_FatalError(
        "Py_NewInterpreter: can't initialize sys standard streams");
You can get around it by guarding the print with hasattr(sys, 'stdout'); sys has been initialized by this point while stdout hasn't (and, as such, won't exist in sys during the early initialization phase):
if hasattr(sys, 'stdout'):
    print("Debug")
You should get a good amount of output when firing up Python now.
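For experimenting outside a patched abc.py, the same guard can be exercised in a standalone metaclass. A minimal sketch (class names are illustrative); it also shows how register() triggers __subclasscheck__:
import abc
import sys

class LoudABCMeta(abc.ABCMeta):
    def __subclasscheck__(cls, subclass):
        # Guard: sys.stdout doesn't exist during early interpreter startup.
        if hasattr(sys, 'stdout'):
            print('subclass check:', cls.__name__, '<-', subclass.__name__)
        return super().__subclasscheck__(subclass)

class Base(metaclass=LoudABCMeta):
    pass

class Impl:
    pass

Base.register(Impl)   # register() calls issubclass(), hitting __subclasscheck__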

Closing files in CPython

If, in CPython, an object's space is reclaimed immediately when the reference counter drops to zero, how important is closing a file? I mean, if f is a file object and I do f = 10, the file object's space will automatically be reclaimed and close will be called.
Although CPython reclaims object memory based on the reference count, it is good practice to free external resources like files as soon as they are no longer needed.
Extracted from https://docs.python.org/2/tutorial/inputoutput.html
"""
When you’re done with a file, call f.close() to close it and free up any system resources taken up by the open file. After calling f.close(), attempts to use the file object will automatically fail.
>>> f.close()
>>> f.read()
Traceback (most recent call last):
File "<stdin>", line 1, in ?
ValueError: I/O operation on closed file
It is good practice to use the with keyword when dealing with file objects. This has the advantage that the file is properly closed after its suite finishes, even if an exception is raised on the way. It is also much shorter than writing equivalent try-finally blocks:
>>> with open('workfile', 'r') as f:
...     read_data = f.read()
>>> f.closed
True
"""

How can I modify a Python traceback object when raising an exception?

I'm working on a Python library used by third-party developers to write extensions for our core application.
I'd like to know if it's possible to modify the traceback when raising exceptions, so the last stack frame is the call to the library function in the developer's code, rather than the line in the library that raised the exception. There are also a few frames at the bottom of the stack containing references to functions used when first loading the code that I'd ideally like to remove too.
Thanks in advance for any advice!
You can remove the top of the traceback easily by re-raising with the tb_next element of the traceback:
except:
    ei = sys.exc_info()
    raise ei[0], ei[1], ei[2].tb_next
tb_next is a read-only attribute, so I don't know of a way to remove stuff from the bottom. You might be able to screw with the properties mechanism to allow access to the attribute, but I don't know how to do that.
Take a look at what jinja2 does here:
https://github.com/mitsuhiko/jinja2/blob/5b498453b5898257b2287f14ef6c363799f1405a/jinja2/debug.py
It's ugly, but it seems to do what you need done. I won't copy-paste the example here because it's long.
Starting with Python 3.7, you can instantiate a new traceback object and use the .with_traceback() method when throwing. Here's some demo code, using either sys._getframe(1) or a more robust alternative, that raises an AssertionError while making your debugger believe the error occurred in myassert(False): sys._getframe(1) omits the top stack frame.
What I should add is that while this looks fine in the debugger, the console behavior unveils what this is really doing:
Traceback (most recent call last):
File ".\test.py", line 35, in <module>
myassert_false()
File ".\test.py", line 31, in myassert_false
myassert(False)
File ".\test.py", line 26, in myassert
raise AssertionError().with_traceback(back_tb)
File ".\test.py", line 31, in myassert_false
myassert(False)
AssertionError
Rather than removing the top of the stack, I have added a duplicate of the second-to-last frame.
Anyway, I focus on how the debugger behaves, and it seems this one works correctly:
"""Modify traceback on exception.
See also https://github.com/python/cpython/commit/e46a8a
"""
import sys
import types
def myassert(condition):
"""Throw AssertionError with modified traceback if condition is False."""
if condition:
return
# This function ... is not guaranteed to exist in all implementations of Python.
# https://docs.python.org/3/library/sys.html#sys._getframe
# back_frame = sys._getframe(1)
try:
raise AssertionError
except AssertionError:
traceback = sys.exc_info()[2]
back_frame = traceback.tb_frame.f_back
back_tb = types.TracebackType(tb_next=None,
tb_frame=back_frame,
tb_lasti=back_frame.f_lasti,
tb_lineno=back_frame.f_lineno)
raise AssertionError().with_traceback(back_tb)
def myassert_false():
"""Test myassert(). Debugger should point at the next line."""
myassert(False)
if __name__ == "__main__":
myassert_false()
You might also be interested in PEP 3134, which is implemented in Python 3 and allows you to tack one exception/traceback onto an upstream exception.
This isn't quite the same thing as modifying the traceback, but it would probably be the ideal way to convey the "short version" to library users while still having the "long version" available.
What about not changing the traceback? The two things you request can both be done more easily in a different way.
If the exception from the library is caught in the developer's code and a new exception is raised instead, the original traceback will of course be tossed. This is how exceptions are generally handled. If you just allow the original exception to propagate but munge it to remove all the "upper" frames, the actual exception won't make sense, since the last line in the traceback would not itself be capable of raising the exception.
To strip out the last few frames, you can request that your tracebacks be shortened: functions like traceback.print_exception() take a "limit" parameter which you could use to skip the last few entries.
That said, it should be quite possible to munge the tracebacks if you really need to... but where would you do it? If in some wrapper code at the very top level, then you could simply grab the traceback, take a slice to remove the parts you don't want, and then use functions in the "traceback" module to format/print as desired.
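For example, a minimal sketch (deeply_nested_call is a placeholder for whatever fails several frames down):
import sys
import traceback

try:
    deeply_nested_call()   # hypothetical failing function
except Exception:
    etype, value, tb = sys.exc_info()
    # Print at most two frames of the traceback instead of the full stack.
    traceback.print_exception(etype, value, tb, limit=2)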
For Python 3, here's my answer; please read the comments for an explanation:
def pop_exception_traceback(exception, n=1):
    # Takes an exception, mutates it, then returns it.
    # Often when writing my repl, tracebacks will contain an annoying number of
    # function calls (including the 'exec' that ran the code).
    # This function pops 'n' levels off of the stack trace generated by exception.
    # For example, if print_stack_trace(exception) originally printed:
    #     Traceback (most recent call last):
    #       File "<string>", line 2, in <module>
    #       File "<string>", line 2, in f
    #       File "<string>", line 2, in g
    #       File "<string>", line 2, in h
    #       File "<string>", line 2, in j
    #       File "<string>", line 2, in k
    # then print_stack_trace(pop_exception_traceback(exception, 3)) would print:
    #     Traceback (most recent call last):
    #       File "<string>", line 2, in h
    #       File "<string>", line 2, in j
    #       File "<string>", line 2, in k
    # (It popped the first 3 levels, aka <module>, f and g, off the traceback.)
    for _ in range(n):
        exception.__traceback__ = exception.__traceback__.tb_next
    return exception
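A hypothetical usage, in the REPL spirit the comments describe (user_code is an assumed string of source code):
import traceback

try:
    exec(user_code)                       # assumed: a string of user code
except Exception as e:
    e = pop_exception_traceback(e, n=1)   # drop the exec() frame
    traceback.print_exception(type(e), e, e.__traceback__)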
This code might be of interest to you.
It takes a traceback and removes the first file, which should not be shown. Then it simulates Python's behavior:
Traceback (most recent call last):
will only be shown if the traceback contains more than one file.
This looks exactly as if my extra frame was not there.
Here is my code, assuming there is a string text:
import sys
import traceback

try:
    exec(text)
except:
    # We want to format the exception as if no frame was on top.
    exp, val, tb = sys.exc_info()
    listing = traceback.format_exception(exp, val, tb)
    # Remove the entry for the first frame.
    del listing[1]
    files = [line for line in listing if line.startswith("  File")]
    if len(files) == 1:
        # Only one file, so remove the "Traceback" header.
        del listing[0]
    print("".join(listing), file=sys.stderr)
    sys.exit(1)

Sorting disk I/O errors in Python

How do I sort out (distinguish) an error derived from a "disk full condition" from "trying to write to a read-only file system"?
I don't want to fill my HD to find out :)
What I want is to know how to catch each exception, so my code can say one thing to the user when he is trying to write to a read-only FS and another message when he is trying to write a file to a disk that is full.
Once you catch IOError, e.g. with an except IOError, e: clause in Python 2.*, you can examine e.errno to find out exactly what kind of I/O error it was (unfortunately in a way that's not necessarily fully portable among different operating systems).
See the errno module in Python's standard library; opening a file for writing on a R/O filesystem (on a sensible OS) should produce errno.EPERM, errno.EACCES or, better yet, errno.EROFS ("read-only filesystem"); if the filesystem is R/W but there's no space left, you should get errno.ENOSPC ("no space left on device"). But you will need to experiment on the OSes you care about (with a small USB key, filling it up should be easy ;-)).
There's no way to use different except clauses depending on errno -- such clauses must be distinguished by the class of exceptions they catch, not by attributes of the exception instance -- so you'll need an if/else or other kind of dispatching within a single except IOError, e: clause.
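A minimal sketch of that single-clause dispatch, in the Python 2 syntax used above (path and data are assumed variables):
import errno

try:
    f = open(path, 'w')      # path is an assumed variable
    f.write(data)            # data is an assumed variable
    f.close()
except IOError, e:
    if e.errno == errno.EROFS:
        print "read-only filesystem"
    elif e.errno in (errno.EACCES, errno.EPERM):
        print "permission denied"
    elif e.errno == errno.ENOSPC:
        print "no space left on device"
    else:
        raise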
On a read-only filesystem, the files themselves will be marked as read-only. Any attempt to open a read-only file for writing (O_WRONLY or O_RDWR) will fail. On UNIX-like systems, the errno EACCES will be set.
>>> file('/etc/resolv.conf', 'a')
Traceback (most recent call last):
File "", line 1, in
IOError: [Errno 13] Permission denied: '/etc/resolv.conf'
In contrast, attempts to write to a full filesystem may result in ENOSPC. "May" is critical; the error may be delayed until fsync or close:
>>> file('/dev/full', 'a').write('\n')
close failed in file object destructor:
IOError: [Errno 28] No space left on device
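Because the error can be delayed, it's worth catching it at the flush/close step too; a minimal sketch in the same Python 2 style:
import errno

f = open('/dev/full', 'w')   # Linux's always-full device
try:
    f.write('\n')            # buffered; may not hit the device yet
    f.close()                # the flush here is where ENOSPC can surface
except IOError, e:
    if e.errno == errno.ENOSPC:
        print "no space left on device"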
