Closing files in CPython - python

If, in CPython, an object's memory is reclaimed as soon as its reference count drops to zero, how important is it to close a file explicitly? I mean, if f is a file object and I do f = 10, then the file object's memory will be reclaimed automatically and close will be called.

Although CPython can handle object memory reclaiming based on the reference count, it is a good practice to free external resources like files as soon as they are not needed anymore.
Extracted from https://docs.python.org/2/tutorial/inputoutput.html
"""
When you’re done with a file, call f.close() to close it and free up any system resources taken up by the open file. After calling f.close(), attempts to use the file object will automatically fail.
>>> f.close()
>>> f.read()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: I/O operation on closed file
It is good practice to use the with keyword when dealing with file objects. This has the advantage that the file is properly closed after its suite finishes, even if an exception is raised on the way. It is also much shorter than writing equivalent try-finally blocks:
>>> with open('workfile', 'r') as f:
...     read_data = f.read()
>>> f.closed
True
"""

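As a rough illustration of the tutorial's point, the sketch below performs the same read twice: once with an explicit try-finally and once with with. The file name workfile.txt and the temporary directory are assumptions made only so the example is self-contained.

```python
import os
import tempfile

# Hypothetical scratch file so the example can run anywhere.
path = os.path.join(tempfile.mkdtemp(), "workfile.txt")
with open(path, "w") as f:
    f.write("hello\n")

# Explicit version: close() runs even if read() raises.
f = open(path, "r")
try:
    data_explicit = f.read()
finally:
    f.close()
closed_after_finally = f.closed

# Equivalent, shorter version using with.
with open(path, "r") as g:
    data_with = g.read()
closed_after_with = g.closed
```

In both versions the file is closed at a known point, without depending on when the interpreter reclaims the object.
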
Related

Any reason for this difference between python's readable() and writable() functions?

When opening some file in Python 3.8.11, you get this unsurprising, boring behavior:
>>> f = open("notes.txt")
>>> f.writable()
False
>>> f.readable()
True
But when you close that file, the two functions behave differently:
>>> f.close()
>>> f.writable()
False
>>> f.readable()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: I/O operation on closed file
I suppose there are really 2 things I don't understand. First, why throw an exception when trying to query about the state of a stream? Sure, it makes sense to throw an exception when trying to read() from a closed stream, but an exception while querying whether a stream is readable seems extreme. Second, if you take the attitude that exceptions are the proper way to handle a closed stream, why not be consistent? Shouldn't writable() throw an exception too?
Of course, the second part of my question may not have an answer beyond "They just wrote it that way", but I'm curious to see if there's some concrete reasoning there.
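Whatever the rationale for the asymmetry, one way to sidestep it is to check the closed attribute first, since reading that attribute does not raise. This is only a sketch; the helper name can_read and the scratch file are made up for illustration.

```python
import os
import tempfile

def can_read(stream):
    """Return True only for an open, readable stream; never raises on a closed one."""
    # The closed check short-circuits, so readable() is only called on open streams.
    return (not stream.closed) and stream.readable()

path = os.path.join(tempfile.mkdtemp(), "notes.txt")
with open(path, "w") as out:
    out.write("some notes\n")

f = open(path)
before_close = can_read(f)
f.close()
after_close = can_read(f)
```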

Is it safe to open a file, write to it and NOT close it?

I have quite a large Python(3) script that I'm trying to optimize.
To my understanding when you use with open(..., ...) as x you DON'T need to use .close() at the end of the "with" block (it is automatically closed).
I also know that you SHOULD add .close() (if you are not using with) after you are done manipulating the file, like this:
f = open(..., ...)
f.write()
f.close()
In an attempt to shove 3 lines (above) into 1 line I tried to change this:
with open(location, mode) as packageFile:
    packageFile.write()
Into this:
open(location, mode).write(content).close()
Unfortunately this did not work and I got this error:
Traceback (most recent call last):
  File "test.py", line 20, in <module>
    one.save("one.txt", sample)
  File "/home/nwp/Desktop/Python/test/src/one.py", line 303, in save
    open(location, mode).write(content).close()
AttributeError: 'int' object has no attribute 'close'
The same line works fine when I remove the .close() like this:
open(location, mode).write(content)
Why didn't open(location, mode).write(content).close() work and is it safe to omit the .close() function?
I also know that you SHOULD add .close() after you are done manipulating the file
Not if you're using with.
Why didn't open(location, mode).write(content).close() work
Because instead of trying to close the file, you're trying to close the return value of the write method call, which is an integer. You can't close an integer.
is it safe to omit the .close() function?
No. If you ever run this code on a different Python implementation, or if CPython ever abandons reference counting, your write could be arbitrarily delayed or even lost completely.
If you really want to shove this into a single line, you can write a single-line with statement:
with open(location, mode) as packageFile: packageFile.write(whatever)
This is discouraged, but much less discouraged than relying on the file object's finalizer to close the file for you.
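To make the point about the return value concrete: in Python 3, write returns the number of characters written, so there is no file left to call close on. A minimal sketch of a save helper follows; the name save, its signature and the sample path are assumptions, not the asker's actual code.

```python
import os
import tempfile

def save(location, content, mode="w"):
    """Write content to location; return write()'s result and the closed flag."""
    with open(location, mode) as package_file:
        written = package_file.write(content)
    # The with block has already closed the file at this point.
    return written, package_file.closed

path = os.path.join(tempfile.mkdtemp(), "one.txt")
written, closed = save(path, "sample")
```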
That's because the . (dot) operator works from left to right. In a.b, b is a message (method call) passed to a. Say
a.b.c...
reads
do b in a and return r1, then
do c in r1 and return r2, then
...
Now open returns an object that responds to the write message, but write returns an int, which has no close method.
By the way, always use a context manager, i.e. a with statement, in such a context. And I wonder what this has to do with optimization, since you said you are doing this while optimizing a script.
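The left-to-right reading can be spelled out one step at a time. In this sketch r1 and r2 mirror the names used above, and the file path is a throwaway assumption.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "chain.txt")

r1 = open(path, "w")                 # open(...) returns a file object
r2 = r1.write("content")             # write is looked up on r1 and returns an int
r2_has_close = hasattr(r2, "close")  # ints have no close method
r1.close()                           # close has to be called on the file object itself
```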
It looks to me like the goal is for one function to occur right after another, however that isn't what is happening here.
open(location, mode) returns a file:
https://docs.python.org/2/library/functions.html#open
In the syntax of
open(location, mode).write(content)
this is actually equivalent to
file = open(location, mode)
file.write(content)
open returns an object of type file, which has a method called write(). The reason .close() doesn't work is that the value returned by file.write() doesn't have a method called close(). What the one-liner is trying to do is file.function().function().
In any case trying to optimize your code by reducing lines will not speed up performance noticeably, the normal with statement should be just as fast. Unless you are trying to go code golfing which is a whole other subject.
Have a good day!
is it safe to omit the .close() function?
In current versions of CPython the file will be closed as soon as the statement finishes, because the file object's reference count drops to zero and CPython uses reference counting as its primary garbage collection mechanism. But that's an implementation detail, not a feature of the language.
Also directly quoting user2357112: if you ever run this code on a different Python implementation, or if CPython ever abandons reference counting, your write could be arbitrarily delayed or even lost completely.
Why didn't open(location, mode).write(content).close() work?
This is because I was trying to call the .close() method on the return value of open(location, mode).write(content), not on the file itself.

How do I easily and consistently cause an IOError of some sort when reading from or writing to a text file?

I'm teaching an intro programming class in Python, and we're talking about exceptions and file I/O. I'm looking for a way to quickly and simply test their error handling. Now it's easy to cause an IOError when opening a file (just make sure that there is no file with the given filename), but I'd like to be able to test whether or not they're able to handle IOErrors that crop up while reading from or writing to a file, and I can't think of a simple, quick way to ensure that an IOError will occur at that time. I tried using a binary file, which opens fine but causes trouble if you try to read from it, but that causes a UnicodeDecodeError, which is not a type of IOError in Python.
Any thoughts?
Insert some code that rebinds open() (and possibly file) at the beginning of the code to test:
>>> import os
>>> def open(path, *args, **kw):
...     path = os.path.join("/some/path/that/dont/exist", path)
...     return __builtins__.open(path, *args, **kw)
...
>>> open("foo.bar")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in open
IOError: [Errno 2] No such file or directory: '/some/path/that/dont/exist/foo.bar'
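The rebinding above fails at open time. To exercise handlers around read or write instead, one option is to hand the code under test a file-like object that opens fine but raises later. The class below is a hypothetical sketch, not a standard library facility, and read_all stands in for student code. Note that in Python 3, IOError is an alias of OSError.

```python
import io

class FailingFile(io.StringIO):
    """Hypothetical in-memory text file whose read() and write() always fail."""

    def read(self, *args):
        raise IOError("simulated read failure")

    def write(self, text):
        raise IOError("simulated write failure")

def read_all(stream):
    """Stand-in for student code: report failure instead of crashing."""
    try:
        return stream.read()
    except IOError:
        return None

result = read_all(FailingFile("ignored contents"))
```

Because the object is passed in rather than opened by name, this approach works best when the code being tested accepts a stream argument.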

What are os.fdopen() semantics?

I used to think that os.fdopen() either eats file descriptor and returns a file io object, or raises an exception.
For example:
fd = os.open("/etc/passwd", os.O_RDONLY)
try: os.fdopen(fd, "w")
except: os.close(fd) # prevent memory leak
However these semantics don't seem to always hold.
Here's an example on OSX:
In [1]: import os
In [2]: os.open("/", os.O_RDONLY, 0660)
Out[2]: 5
In [3]: os.fdopen(5, "rb")
---------------------------------------------------------------------------
IOError Traceback (most recent call last)
<ipython-input-3-3ca4d619250e> in <module>()
----> 1 os.fdopen(5, "rb")
IOError: [Errno 21] Is a directory: '<fdopen>'
In [4]: os.close(5)
---------------------------------------------------------------------------
OSError Traceback (most recent call last)
<ipython-input-4-76713e571514> in <module>()
----> 1 os.close(5)
OSError: [Errno 9] Bad file descriptor
It seems that os.fdopen() both ate my file descriptor 5 and raised an exception...
Is there a safe way to use os.fdopen()?
Did I miss something?
Did I find a bug?
P.S. Python version string Python 2.7.6 (v2.7.6:3a1db0d2747e, Nov 10 2013, 00:42:54) in case someone can't reproduce with theirs.
P.P.S. same problem is present on Py2.7 Linux too.
Py3.3 however doesn't exhibit said problem.
Python checks that the resulting FILE* does not refer to a directory after creating the python file object and storing it in the python object. Because of the error in the directory check, the file object is deref'd (since it won't be returned) which causes the destructor to be called which closes the file.
I agree that it'd be nice if the docs showed what effect it can have on the file descriptor passed in. I'm not sure what you want as a 'safe' way to use fdopen. If you're going to close the file descriptor on failure, what does it matter that it was already closed by Python? Just use
try: os.close(fd)
except: pass
to suppress the secondary exception.
fill_file_fields is called by PyFile_FromFile to fill in the members of the file object and it calls the dircheck function after the fields have been populated. This causes fill_file_fields to return NULL so PyFile_FromFile does Py_DECREF(f); where f is the file object. Since this is the last reference, the deallocator file_dealloc is called which invokes close_the_file which (surprise, surprise) closes the file.
In the 3.4 branch, the dircheck is done from fileio_init which uses the flag variable fd_is_own to determine whether the file should be closed on an error condition.
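Putting that advice together, a wrapper can close the descriptor on failure and ignore the case where it is already closed. This is a sketch under assumptions: the name fdopen_or_close is invented here, and it suppresses only OSError from the secondary close rather than using a bare except.

```python
import os
import tempfile

def fdopen_or_close(fd, mode="r"):
    """Wrap fd in a file object; on failure, make sure fd does not leak."""
    try:
        return os.fdopen(fd, mode)
    except Exception:
        try:
            os.close(fd)
        except OSError:
            # fd was already closed (as seen on Python 2.7) or was never valid.
            pass
        raise

# Usage with an ordinary file created just for the example.
path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(path, "w") as out:
    out.write("data")

fd = os.open(path, os.O_RDONLY)
with fdopen_or_close(fd, "r") as f:
    contents = f.read()
```

Either way the caller sees the original exception, and the descriptor ends up closed no matter which Python version did the closing.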

In python c = pickle.load(open(fileName, 'r')) does this close the file?

I tried to Google but cannot find an answer.
If I just do
c = pickle.load(open(fileName, 'r'))
Will the file be automatically closed after this operation?
No, but you can simply adapt it to close the file:
# file not yet opened
with open(fileName, 'r') as f:
    # file opened
    c = pickle.load(f)
    # file opened
# file closed
What the with statement does is (among other things) call the __exit__() method of the object listed in the with statement (here, the opened file), which in this case closes the file.
Regarding opened file's __exit__() method:
>>> f = open('deleteme.txt', 'w')
>>> help(f.__exit__)
Help on built-in function __exit__:

__exit__(...)
    __exit__(*excinfo) -> None. Closes the file.
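To see the mechanism without a real file, here is a small hypothetical class that records when __enter__ and __exit__ run. It only illustrates the protocol that file objects also follow; the class name Tracked is made up.

```python
class Tracked:
    """Hypothetical resource that logs the context manager protocol."""

    def __init__(self):
        self.events = []
        self.closed = False

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Called when the with block ends, whether normally or via an exception.
        self.events.append("exit")
        self.closed = True
        return False  # do not swallow the exception

resource = Tracked()
try:
    with resource as r:
        r.events.append("body")
        raise ValueError("boom")
except ValueError:
    pass
```

Even though the body raises, __exit__ still runs, which is why a file opened in a with block gets closed on errors too.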
I hate to split hairs, but the answer is either yes or no -- depending on exactly what you are asking.
Will the line, as written, close the file descriptor? Yes.
Will it happen automatically after this operation? Yes…
but probably not immediately.
Will the file be closed immediately after this line? No, not likely (see reply above).
CPython will close the file when the reference count for the file object reaches zero. If you are opening the file in a local scope, such as inside a function (either as you have done, or even if you have assigned it to a local variable inside a def), then when that local scope is cleaned up, and if there are no remaining references to the file object, the file will be closed.
It is, however, a much better choice to be explicit, and open the file inside a with block -- which will close the file for you instead of waiting for the garbage collector to kick in.
You also have to be careful in the case where you unintentionally assign the file descriptor to a local variable… (see: opened file descriptors in python) but you aren't doing that. Also check the answers here (see: check what files are open in Python) for some ways that you can check what file descriptors are open, with different variants for whatever OS you are on.
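The difference between relying on cleanup and being explicit shows up as soon as a reference is kept. The sketch below assumes Python 3, where pickle needs binary mode ('wb' and 'rb'); the file name and the pickled value are made up.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.pkl")
with open(path, "wb") as out:
    pickle.dump({"answer": 42}, out)

# Keeping a reference: nothing closes the file until we do it ourselves.
handle = open(path, "rb")
c1 = pickle.load(handle)
still_open = not handle.closed
handle.close()

# With a with block the file is closed when the block ends.
with open(path, "rb") as f:
    c2 = pickle.load(f)
closed_after_with = f.closed
```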
Yes, the file object returned by open lives only as long as the call to pickle.load. It closes the underlying file descriptor when it is destroyed.
If you're extremely paranoid, consider:
with open(fileName,'r') as fin:
    c = pickle.load(fin)
The with keyword establishes the lifetime of fin
No, it does not close the file. This is one place where you could use the with statement:
with open('path/to/file') as infile:
    c = pickle.load(infile)
