Python File Close on Program Exit - python

If I wanted to close a file in a program with an infinite loop, I could do the following:
file = open('abc', 'w')
while True:
    try:
        file.write('a\n')
    except KeyboardInterrupt:
        break
file.close()
But if I just left it like this:
file = open('abc', 'w')
while True:
    file.write('a\n')
Would the file not get closed on exit?

Python has automatic garbage collection: objects that are no longer referenced, including file objects, are eventually reclaimed, and when the interpreter halts cleanly it releases the memory and file handles it was using on the way out. It's prudent to be familiar with how garbage collection works when writing programs that run for extended periods of time, or that open a lot of files or handle a lot of data.

As mentioned above by Mitch Jackson, if a Python program halts cleanly, it will free up its resources. In your case, though, you are just writing to a file in a bare while loop; the program can't guarantee that the already-opened file is flushed and closed when it is interrupted unless you handle that explicitly, for example with a try/except block or a with statement that wraps the file.
Here is the documentation on the with statement: docs
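For illustration, a minimal sketch of the first snippet rewritten around a with block (same behaviour, but the close happens automatically):
with open('abc', 'w') as f:
    try:
        while True:
            f.write('a\n')
    except KeyboardInterrupt:
        pass  # leaving the with block closes the file, even on Ctrl+C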

Related

File closes before async call finishes, causing IO error

I wrote a package that includes a function to upload something asynchronously. The intent is that the user can use my package, open a file, and upload it async. The problem is, depending on how the user writes their code, I get an IO error.
# EXAMPLE 1
with open("my_file", "rb") as my_file:
    package.upload(my_file)
# I/O operation on closed file error

# EXAMPLE 2
my_file = open("my_file", "rb")
package.upload(my_file)
# everything works
I understand that in the first example the file is closing immediately because the call is async. I don't know how to fix this though. I can't tell the user they can't open files in the style of example 1. Can I do something in my package.upload() implementation to prevent the file from closing?
You can use os.dup to duplicate the file descriptor and shield the async process from a close in the caller. The duplicated handle shares other characteristics of the original such as the current file position, so you are not completely shielded from bad things the caller can do.
This also limits your process to things that have file descriptors. If you stick to using the standard file calls, then a user can hand in any file-like object instead of just a file on disk.
import os

def upload(my_file):
    # duplicate the descriptor so a close() in the caller doesn't
    # invalidate the handle used by the async worker
    my_file = os.fdopen(os.dup(my_file.fileno()))
    # ...queue for async
If you are using with to open files, the file is closed as soon as execution leaves the with block. In your case, just pass the filename and open the file inside the asynchronous function.
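As a rough sketch of that suggestion; queue_for_async and do_upload are placeholder names, not part of the package described above:
def upload(filename):
    def task():
        # the file is opened (and closed) entirely inside the async task,
        # so the caller's with-block can't close it out from under us
        with open(filename, "rb") as fh:
            do_upload(fh)          # placeholder for the real upload logic
    queue_for_async(task)          # placeholder for the real scheduler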

Does finally ensure some code gets run atomically, no matter what?

Assume I'm going to write a Python script that catches the KeyboardInterrupt exception so the user can terminate it safely with Ctrl+C.
However, I can't put all critical actions (like file writes) into the except block, because they rely on local variables, and I want to make sure a subsequent Ctrl+C does not break them anyway.
Would it work, and would it be good practice, to use a try/finally block with an empty (pass) try part and all the code inside the finally part, to mark this snippet as "atomic, interrupt-safe code" that may not get interrupted mid-way?
Example:
try:
    with open("file.txt", "w") as f:
        for i in range(1000000):
            # imagine something useful that takes very long instead
            data = str(i ** (i ** i))
            try:
                pass
            finally:
                # ensure that this code is not interrupted to prevent file corruption:
                f.write(data)
except KeyboardInterrupt:
    print("User aborted, data created so far saved in file.txt")
    exit(0)
In this example I don't care about the currently produced data string, i.e. its creation could be interrupted and no write would be triggered. But once the write has started, it must be completed; that's all I want to ensure. Also, what would happen if an exception (or KeyboardInterrupt) occurred while performing the write inside the finally clause?
Code in finally can still be interrupted too. Python makes no guarantees about this; all it guarantees is that execution will switch to the finally suite after the try suite completed or if an exception in the try suite was raised. A try can only handle exceptions raised within its scope, not outside of it, and finally is outside of that scope.
As such there is no point in using try on a pass statement. The pass is a no-op, it won't ever be interrupted, but the finally suite can easily be interrupted still.
You'll need to pick a different technique. You could write to a separate file and move that into place on successful completion; the OS guarantees that a file move is atomic, for example. Or record your last successful write position, and truncate the file to that point if a next write is interrupted. Or write markers in your file that signal a successful record, so that reads know what to ignore.
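As a rough sketch of the write-then-move idea (atomic_write is a made-up helper here; os.replace is the Python 3 spelling of an atomic rename on the same filesystem):
import os
import tempfile

def atomic_write(path, data):
    # write to a temporary file in the same directory, then atomically
    # move it over the target, so readers never see a half-written file
    dir_name = os.path.dirname(path) or '.'
    fd, tmp_path = tempfile.mkstemp(dir=dir_name)
    try:
        with os.fdopen(fd, 'w') as tmp:
            tmp.write(data)
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)   # discard the partial temp file
        raise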
In your case there is no problem, because file writes are atomic; but if you have some more complex file object implementation, your try/except is in the wrong place. You have to place the exception handling around the write:
try:
    f.write(data)
except:
    # do some action to restore file integrity
    raise
For example, if you write binary data, you could do the following:
filepos = f.tell()
try:
    f.write(data)
except:
    # remove the already written data
    f.seek(filepos)
    f.truncate()
    raise

Is it reasonable to wrap an entire main loop in a try..finally block?

I've made a map editor in Python 2.7.9 for a small project, and I'm looking for ways to preserve the data I edit in the event of some unhandled exception. My editor already has a method for saving out data, and my current solution is to have the main loop wrapped in a try..finally block, similar to this example:
import os, datetime #..and others.

if __name__ == '__main__':
    DataMgr = DataManager() # initializes the editor.
    save_note = None
    try:
        MainLoop() # unsurprisingly, this calls the main loop.
    except Exception as e: # I am of the impression this will catch every type of exception.
        save_note = "Exception dump: %s : %s." % (type(e).__name__, e) # A memo appended to the comments in the save file.
    finally:
        exception_fp = DataMgr.cwd + "dump_%s.kmap" % str(datetime.datetime.now())
        DataMgr.saveFile(exception_fp, memo = save_note) # saves out to a dump file using a familiar method with a note outlining what happened.
This seems like the best way to make sure that, no matter what happens, an attempt is made to preserve the editor's current state (to the extent that saveFile() is equipped to do so) in the event that it should crash. But I wonder if encapsulating my entire main loop in a try block is actually safe and efficient and good form. Is it? Are there risks or problems? Is there a better or more conventional way?
Wrapping the main loop in a try...finally block is the accepted pattern when you need something to happen no matter what. In some cases it's logging and continuing, in others it's saving everything possible and quitting.
So your code is fine.
If your file isn't that big, I would suggest reading the entire input file into memory, closing the file, and then doing your data processing on the in-memory copy. This avoids corrupting your data, at the cost of potentially slowing down your runtime.
Alternatively, take a look at the atexit Python module. It allows you to register one or more functions to be called back automatically when the program exits.
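A small sketch of the atexit route, reusing the DataMgr and MainLoop names from your code (the handler runs on normal interpreter exit and after unhandled exceptions, but not if the process is killed outright):
import atexit
import datetime

def save_on_exit():
    # same dump logic as your finally block, run automatically at exit
    exception_fp = DataMgr.cwd + "dump_%s.kmap" % str(datetime.datetime.now())
    DataMgr.saveFile(exception_fp, memo="saved by atexit handler")

atexit.register(save_on_exit)
MainLoop()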
That being said, what you have should work reasonably well.

File opened in a function doesn't need to be closed manually?

If I open a file in a function:
In [108]: def foo(fname):
     ...:     f = open(fname)
     ...:     print f
     ...:
In [109]: foo('t.py')
<open file 't.py', mode 'r' at 0x05DA1B78>
Is it better to close f manually or not? Why?
It is better to close the file when you are done with it because it is a good habit, but it isn't entirely necessary because the garbage collector will close the file for you. The reason you'd close it manually is to have more control. You don't know when the garbage collector will run.
But even better is to use the with statement, introduced in Python 2.5.
with open(f_name) as f:
    # do stuff with f
    # possibly throw an exception
This will close the file no matter what happens while in the scope of the with statement.
$ cat /proc/sys/fs/file-max
390957
this may break my system (forgive me for not trying :) ):
fs = []
for i in range(390957+1):
    fs.append(open(str(i), 'w'))
for f in fs:
    f.close()
this (hopefully) won't:
for i in range(390957+1):
    with open(str(i), 'w') as f:
        pass  # do stuff
Yes, it is better to close the file manually, or better yet to use the with statement when dealing with files (it will automatically close the file for you even if an exception occurs). In CPython, an unreferenced file object will only be closed when the garbage collector actually destroys it; until then, any unflushed data and resources may still hang around in memory.
From docs:
It is good practice to use the with keyword when dealing with file objects. This has the advantage that the file is properly closed after its suite finishes, even if an exception is raised on the way.
Related:
File buffer flushing and closing in Python with variable re-assign
How does python close files that have been gc'ed?

Make Python wait until a file exists before continuing

In my code, I write a file to my hard disk. After that, I need to import the generated file and then continue processing it.
for i in xrange(10):
    filename = generateFile()
    # takes some time, I wish to freeze the program here
    # and continue once the file is ready in the system
    file = importFile(filename)
    processFile(file)
If I run the code snippet in one go, most likely file=importFile(filename) will complain that that file does not exist, since the generation takes some time.
I used to manually run filename=generateFile() and wait before running file=importFile(filename).
Now that I'm using a for loop, I'm searching for an automatic way.
You could use time.sleep, and I would expect that if you are loading a module this way you would need to reload rather than import after the first import.
However, unless the file is very large, why not just generate the string and then eval or exec it?
Note that since your file generation function is not being invoked in a thread, it should block and only return when it thinks it has finished writing. You can possibly improve things by ensuring that the file writer ends with outfile.flush() and then outfile.close(), but on some OSs there may still be a window during which the file is not actually available.
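For instance, if you control generateFile, a minimal sketch of how the writer might end (os.fsync asks the OS to push its buffers to disk; whether you need it depends on your filesystem, and write_output is only an illustrative name):
import os

def write_output(path, data):
    outfile = open(path, 'w')
    try:
        outfile.write(data)
        outfile.flush()               # push Python's buffer to the OS
        os.fsync(outfile.fileno())    # ask the OS to write it to disk
    finally:
        outfile.close()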
I think you should use a flag to test whether the file has been generated, along these lines:
for i in xrange(10):
    (filename, is_finished) = generateFile()
    if is_finished:
        file = importFile(filename)
        processFile(file)
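Alternatively, a rough sketch that simply polls the filesystem until the file shows up (wait_for_file is a made-up helper, the timeout is arbitrary, and note that the file existing does not guarantee the writer has finished with it):
import os
import time

def wait_for_file(path, timeout=60, poll_interval=0.5):
    # block until the file exists, or give up after `timeout` seconds
    deadline = time.time() + timeout
    while not os.path.exists(path):
        if time.time() > deadline:
            raise RuntimeError("timed out waiting for %s" % path)
        time.sleep(poll_interval)

for i in xrange(10):
    filename = generateFile()
    wait_for_file(filename)
    file = importFile(filename)
    processFile(file)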
