Assume I'm writing a Python script that catches KeyboardInterrupt so the user can terminate it safely with Ctrl+C.
However, I can't put all critical actions (like file writes) into the except block, both because they rely on local variables and because a subsequent Ctrl+C must not break them anyway.
Would it work, and would it be good practice, to use a try/finally block with an empty (pass) try part and all the code inside the finally part, to mark this snippet as "atomic, interrupt-safe code" which may not get interrupted mid-way?
Example:
try:
    with open("file.txt", "w") as f:
        for i in range(1000000):
            # imagine something useful that takes very long instead
            data = str(i ** (i ** i))
            try:
                pass
            finally:
                # ensure that this code is not interrupted to prevent file corruption:
                f.write(data)
except KeyboardInterrupt:
    print("User aborted, data created so far saved in file.txt")
    exit(0)
In this example I don't care about the data string currently being produced, i.e. its creation could be interrupted and no write would be triggered. But once the write has started, it must be completed, that's all I want to ensure. Also, what would happen if an exception (or KeyboardInterrupt) occurred while performing the write inside the finally clause?
Code in finally can still be interrupted too. Python makes no guarantees about this; all it guarantees is that execution will switch to the finally suite after the try suite has completed or if an exception was raised in the try suite. A try can only handle exceptions raised within its scope, not outside of it, and finally is outside of that scope.
As such there is no point in using try on a pass statement. The pass is a no-op, it won't ever be interrupted, but the finally suite can easily be interrupted still.
You'll need to pick a different technique. You could write to a separate file and move that into place on successful completion; the OS guarantees that a file move is atomic, for example. Or record your last successful write position, and truncate the file to that point if a next write is interrupted. Or write markers in your file that signal a successful record, so that reads know what to ignore.
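A minimal sketch of the write-then-rename technique (my illustration, not code from this answer; os.replace is atomic on POSIX when the temporary file and the target live on the same filesystem):

import os
import tempfile

def atomic_write(path, data):
    # Write to a temporary file in the same directory, then atomically
    # move it into place: readers see either the old file or the new one,
    # never a half-written file.
    dirname = os.path.dirname(path) or '.'
    fd, tmp_path = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, 'w') as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)  # the atomic step
    except BaseException:
        os.unlink(tmp_path)
        raise

atomic_write("file.txt", "hello\n")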
In your case there is no problem, because file writes are atomic, but if you have some file object implementation that is more complex, your try/except is in the wrong place. You have to place the exception handling around the write:
try:
    f.write(data)
except:
    # do some action to restore file integrity
    raise
For example, if you write binary data, you could do the following:
filepos = f.tell()
try:
    f.write(data)
except:
    # remove the already written data
    f.seek(filepos)
    f.truncate()
    raise
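If that rollback pattern is needed in several places, it could be wrapped in a small context manager. This is my sketch, not part of the answer, and the helper name is made up:

import contextlib

@contextlib.contextmanager
def rollback_on_error(f):
    # Remember where the write started; on any failure, truncate the
    # file back to that point so no partial record remains.
    filepos = f.tell()
    try:
        yield
    except BaseException:
        f.seek(filepos)
        f.truncate()
        raise

# usage:
# with rollback_on_error(f):
#     f.write(data)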
Related
For any possible try-finally block in Python, is it guaranteed that the finally block will always be executed?
For example, let’s say I return while in an except block:
def f():
    try:
        1/0
    except ZeroDivisionError:
        return
    finally:
        print("Does this code run?")
Or maybe I re-raise an Exception:
try:
    1/0
except ZeroDivisionError:
    raise
finally:
    print("What about this code?")
Testing shows that finally does get executed for the above examples, but I imagine there are other scenarios I haven't thought of.
Are there any scenarios in which a finally block can fail to execute in Python?
"Guaranteed" is a much stronger word than any implementation of finally deserves. What is guaranteed is that if execution flows out of the whole try-finally construct, it will pass through the finally to do so. What is not guaranteed is that execution will flow out of the try-finally.
A finally in a generator or async coroutine might never run, if the object never executes to conclusion. There are a lot of ways that could happen; here's one:
def gen(text):
    try:
        for line in text:
            try:
                yield int(line)
            except:
                # Ignore blank lines - but catch too much!
                pass
    finally:
        print('Doing important cleanup')

text = ['1', '', '2', '', '3']
if any(n > 1 for n in gen(text)):
    print('Found a number')
print('Oops, no cleanup.')
Note that this example is a bit tricky: when the generator is garbage collected, Python attempts to run the finally block by throwing in a GeneratorExit exception, but here we catch that exception and then yield again, at which point Python prints a warning ("generator ignored GeneratorExit") and gives up. See PEP 342 (Coroutines via Enhanced Generators) for details.
Other ways a generator or coroutine might not execute to conclusion include if the object is just never GC'ed (yes, that's possible, even in CPython), or if an async with awaits in __aexit__, or if the object awaits or yields in a finally block. This list is not intended to be exhaustive.
A finally in a daemon thread might never execute if all non-daemon threads exit first.
os._exit will halt the process immediately without executing finally blocks.
os.fork may cause finally blocks to execute twice. As well as just the normal problems you'd expect from things happening twice, this could cause concurrent access conflicts (crashes, stalls, ...) if access to shared resources is not correctly synchronized.
Since multiprocessing uses fork-without-exec to create worker processes when using the fork start method (the default on Unix), and then calls os._exit in the worker once the worker's job is done, finally and multiprocessing interaction can be problematic.
A C-level segmentation fault will prevent finally blocks from running.
kill -SIGKILL will prevent finally blocks from running. SIGTERM and SIGHUP will also prevent finally blocks from running unless you install a handler to control the shutdown yourself (see the sketch after this list); by default, Python does not handle SIGTERM or SIGHUP.
An exception in finally can prevent cleanup from completing. One particularly noteworthy case is if the user hits control-C just as we're starting to execute the finally block. Python will raise a KeyboardInterrupt and skip every line of the finally block's contents. (KeyboardInterrupt-safe code is very hard to write).
If the computer loses power, or if it hibernates and doesn't wake up, finally blocks won't run.
The finally block is not a transaction system; it doesn't provide atomicity guarantees or anything of the sort. Some of these examples might seem obvious, but it's easy to forget such things can happen and rely on finally for too much.
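To illustrate the SIGTERM point above, here is a hedged sketch of mine (not code from this answer; do_work is a hypothetical stand-in): installing a handler that raises SystemExit turns the signal into an ordinary exception, so finally blocks get a chance to run.

import signal
import sys

def on_sigterm(signum, frame):
    # Convert SIGTERM into SystemExit so finally blocks and context
    # managers run during shutdown (Unix; SIGKILL cannot be caught).
    sys.exit(1)

signal.signal(signal.SIGTERM, on_sigterm)

try:
    do_work()  # hypothetical long-running function
finally:
    print('cleanup now runs on SIGTERM, but still not on SIGKILL')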
Yes. Finally always wins.
The only way to defeat it is to halt execution before finally: gets a chance to execute (e.g. crash the interpreter, turn off your computer, suspend a generator forever).
I imagine there are other scenarios I haven't thought of.
Here are a couple more you may not have thought about:
def foo():
    # finally always wins
    try:
        return 1
    finally:
        return 2

def bar():
    # even if it has to eat an unhandled exception, finally wins
    try:
        raise Exception('boom')
    finally:
        return 'no boom'
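Calling these confirms the behavior (my check, not part of the original answer):

print(foo())  # prints 2: the return in finally overrides the one in try
print(bar())  # prints 'no boom': the return in finally swallows the exception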
Depending on how you quit the interpreter, sometimes you can "cancel" finally, but not like this:
>>> import sys
>>> try:
...     sys.exit()
... finally:
...     print('finally wins!')
...
finally wins!
$
Using the precarious os._exit (this falls under "crash the interpreter" in my opinion):
>>> import os
>>> try:
...     os._exit(1)
... finally:
...     print('finally!')
...
$
I'm currently running this code, to test if finally will still execute after the heat death of the universe:
from time import sleep

try:
    while True:
        sleep(1)
finally:
    print('done')
However, I'm still waiting on the result, so check back here later.
According to the Python documentation:
No matter what happened previously, the final-block is executed once the code block is complete and any raised exceptions handled. Even if there's an error in an exception handler or the else-block and a new exception is raised, the code in the final-block is still run.
It should also be noted that if there are multiple return statements, including one in the finally block, then the finally block return is the only one that will execute.
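A quick illustration of that note (mine, not part of the quoted documentation):

def which_return():
    try:
        return 'from try'
    finally:
        return 'from finally'

print(which_return())  # prints 'from finally'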
Well, yes and no.
What is guaranteed is that Python will always try to execute the finally block. In the case where you return from the block or raise an uncaught exception, the finally block is executed just before actually returning or raising the exception.
(which you could have verified yourself simply by running the code in your question)
The only case I can imagine where the finally block will not be executed is when the Python interpreter itself crashes, for example inside C code, or because of a power outage.
I found this one without using a generator function:
import multiprocessing
import time

def fun(arg):
    try:
        print("tried " + str(arg))
        time.sleep(arg)
    finally:
        print("finally cleaned up " + str(arg))
    return foo  # deliberately undefined: raises NameError after the try block

if __name__ == '__main__':
    args = [1, 2, 3]
    multiprocessing.Pool().map(fun, args)
The sleep can be any code that might run for inconsistent amounts of time.
What appears to be happening here is that the first parallel process to finish leaves the try block successfully, but then attempts to return from the function a value (foo) that hasn't been defined anywhere, which causes an exception. That exception kills the map without allowing the other processes to reach their finally blocks.
Also, if you add the line bar = bazz just after the sleep() call in the try block, then the first process to reach that line throws an exception (because bazz isn't defined), which causes its own finally block to be run, but then kills the map, causing the other try blocks to disappear without reaching their finally blocks, and the first process not to reach its return statement, either.
What this means for Python multiprocessing is that you can't trust the exception-handling mechanism to clean up resources in all processes if even one of the processes can have an exception. Additional signal handling or managing the resources outside the multiprocessing map call would be necessary.
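One hedged way to arrange that (my sketch, not from the original answer) is to keep the cleanup in the parent process, which you control, instead of relying on finally blocks inside the workers:

import multiprocessing
import time

def fun(arg):
    print("tried " + str(arg))
    time.sleep(arg)
    return arg

if __name__ == '__main__':
    pool = multiprocessing.Pool()
    try:
        results = pool.map(fun, [1, 2, 3])
    finally:
        # Runs in the parent even if a worker raised, so shared
        # resources can be released reliably in one place.
        pool.terminate()
        pool.join()
        print("cleaned up in the parent")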
You can combine finally with an if statement; the example below checks for a network connection, and the finally block then acts on the result:
try:
    reader1, writer1 = loop.run_until_complete(self.init_socket(loop))
    x = 'connected'
except:
    print("cant connect server transfer")  # open popup
    x = 'failed'
finally:
    if x == 'connected':
        with open('text_file1.txt', "r") as f:
            file_lines = eval(str(f.read()))
    else:
        print("not connected")
If I wanted to close a file in a program with an infinite loop, I could do the following:
file = open('abc', 'w')
while True:
    try:
        file.write('a\n')
    except KeyboardInterrupt:
        break
file.close()
But if I just left it like this:
file = open('abc', 'w')
while True:
    file.write('a\n')
Would the file not get closed on exit?
Python has automatic garbage collection. Garbage collection is the automatic clearing away of unused file handles and data in memory. If a Python program halts cleanly, it will free up the memory and file handles it has used as it closes. It's prudent to have a familiarity with how garbage collection works when writing programs that run for extended periods of time, or that access a lot of information or files.
As mentioned above by @Mitch Jackson, if a Python program halts cleanly, it will free up the memory. In your case, since you are just opening a file in a while loop, the program won't be able to close the already opened file when it halts unless you explicitly include exception handling, such as a try/except block or a with statement, which wraps it up.
See the Python documentation on the with statement.
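A hedged sketch of the with-based version of the loop above (my illustration, not from the answers):

with open('abc', 'w') as file:
    try:
        while True:
            file.write('a\n')
    except KeyboardInterrupt:
        pass
# the with statement closes the file here, interrupt or not
print('file closed cleanly')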
All the docs tell us is,
Raised when the user hits the interrupt key (normally Control-C or Delete). During execution, a check for interrupts is made regularly.
But from the point of the code, when can I see this exception? Does it occur during statement execution? Only between statements? Can it happen in the middle of an expression?
For example:
file_ = open('foo')
# <-- can a KeyboardInterrupt be raised here, after the successful
#     completion of open but prior to the try? -->
try:
    pass  # try some things with file_
finally:
    pass  # cleanup
Will this code leak during a well-timed KeyboardInterrupt? Or is it raised during the execution of some statements or expressions?
According to a note in the unrelated PEP 343:
Even if you write bug-free code, a KeyboardInterrupt exception can still cause it to exit between any two virtual machine opcodes.
So it can occur essentially anywhere. It can indeed occur during evaluation of a single expression. (This shouldn't be surprising, since an expression can include function calls, and pretty much anything can happen inside a function call.)
Yes, a KeyboardInterrupt can occur in the place you marked.
To deal with this, you should use a with block:
with open('foo') as file_:
    # do some things
    raise KeyboardInterrupt
# file resource is closed no matter what, even if a KeyboardInterrupt is raised
However, the exception could occur even between the open() call and the assignment to file_. It's probably not worth worrying about this, because usually a ctrl-c will mean your program is about to end, so the "leaked" file handle will be cleaned up by the OS. But if you know that it is important, you can use a signal handler to catch the signal that raises KeyboardInterrupt (SIGINT).
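A hedged sketch of that signal-handler approach (mine, not from this answer): temporarily ignore SIGINT around the acquisition so the interrupt cannot fire between open() and the assignment. Note that a Ctrl-C pressed during that window is dropped rather than deferred.

import signal

# Close the KeyboardInterrupt window around resource acquisition.
old_handler = signal.signal(signal.SIGINT, signal.SIG_IGN)
try:
    file_ = open('foo')
finally:
    signal.signal(signal.SIGINT, old_handler)  # restore Ctrl-C handling

try:
    pass  # work with file_; Ctrl-C behaves normally again here
finally:
    file_.close()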
I've made a map editor in Python 2.7.9 for a small project, and I'm looking for ways to preserve the data I edit in the event of some unhandled exception. My editor already has a method for saving out data, and my current solution is to have the main loop wrapped in a try..finally block, similar to this example:
import os, datetime  # ..and others.

if __name__ == '__main__':
    DataMgr = DataManager()  # initializes the editor.
    save_note = None
    try:
        MainLoop()  # unsurprisingly, this calls the main loop.
    except Exception as e:  # I am of the impression this will catch every type of exception.
        save_note = "Exception dump: %s : %s." % (type(e).__name__, e)  # A memo appended to the comments in the save file.
    finally:
        exception_fp = DataMgr.cwd + "dump_%s.kmap" % str(datetime.datetime.now())
        DataMgr.saveFile(exception_fp, memo=save_note)  # saves out to a dump file using a familiar method with a note outlining what happened.
This seems like the best way to make sure that, no matter what happens, an attempt is made to preserve the editor's current state (to the extent that saveFile() is equipped to do so) in the event that it should crash. But I wonder if encapsulating my entire main loop in a try block is actually safe and efficient and good form. Is it? Are there risks or problems? Is there a better or more conventional way?
Wrapping the main loop in a try...finally block is the accepted pattern when you need something to happen no matter what. In some cases it's logging and continuing, in others it's saving everything possible and quitting.
So your code is fine.
If your file isn't that big, I would suggest reading the entire input file into memory, closing the file, then doing your data processing on the copy in memory. This solves the problem of corrupting your data, at the cost of potentially slowing down your runtime.
Alternatively, take a look at the atexit Python module. This allows you to register one or more functions that are automatically called back when the program exits.
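A minimal sketch of the atexit approach (mine; emergency_save is a hypothetical stand-in for something like DataMgr.saveFile):

import atexit

def emergency_save():
    # hypothetical stand-in for DataMgr.saveFile(...)
    print("saving editor state before exit...")

atexit.register(emergency_save)
# emergency_save now runs on normal interpreter exit, including sys.exit()
# and unhandled exceptions - but not on os._exit() or SIGKILL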
That being said, what you have should work reasonably well.
Folks,
I've resigned myself to working around this problem, but I wanted to check if Python is really acting as expected.
In the example, "sample.txt" is any multi-line text file that is read and parsed.
try:
    file = open('sample.txt', 'r')
    for line in file:
        (some action here)
except:
    print "Couldn't open file"
file.close()
The actions I observe are that "sample.txt" is opened and the first line handled, the logic then falls through to the "except" clause.
Is this working as designed (WAD), or is it a bug?
If the code in the except block runs, it is because an exception was raised. You swallow the exception, which makes it hard to know what's going wrong.
Your error message suggests that you are attempting to trap errors raised in the file open. But since your try block surrounds the entire processing of the file, exceptions raised in the processing, as opposed to the file open, will be mis-reported as "Couldn't open file". If you really must handle the exception then you need to move the for loop to be after the except block.
Personally I'd be inclined to simply ignore the exception and let the default handling of the exception halt execution:
with open('sample.txt', 'r') as file:
    for line in file:
        (some action here)
If you must handle the exception then be discerning about the class exception that you handle. For example, handle just IOError since that is what open raises in case of failure.
try:
    with open('sample.txt', 'r') as file:
        for line in file:
            (some action here)
except IOError as e:
    print "I/O error({0}): {1}".format(e.errno, e.strerror)
It's not failing on the open line, then. What's the exception?
try:
    file = open('sample.txt', 'r')
    for line in file:
        (some action here)
except:
    print "Exception:"
    import traceback; traceback.print_exc()
file.close()
That bare except catches all exceptions, including ones in the (some action here) part. Restructure that as:
try:
    inputfile = open('sample.txt', 'r')
except:
    print "Couldn't open file"
else:
    for line in inputfile: pass
    inputfile.close()
Or even better:
with open('sample.txt', 'r') as inputfile:
    for line in inputfile: pass
In general, only wrap the bare minimum amount of code possible inside a try block so that you're not accidentally handling exceptions you're not really prepared to deal with.
Your code risks raising another exception when trying to close the file if for some reason the open() call fails. This is because the file variable won't be set if open() raises an exception, so the file.close() call further down will reference a variable that doesn't exist.
If possible, try using the with statement:
with open('sample.txt', 'r') as file:
    try:
        for line in file:
            (some action)
    except:
        print "Exception"
        import traceback; traceback.print_exc()
This will make sure that file is closed afterwards regardless of what happens inside the with statement.
Try using readlines to generate a list of all lines in the file. Also, you shouldn't be catching general errors without at least printing the error.
try:
    file = open('sample.txt', 'r')
    for line in file.readlines():
        (some action here)
except Exception as e:
    print str(e)
file.close()
The actions I observe are that "sample.txt" is opened and the first line handled, the logic then falls through to the "except" clause.
Yes, this is the case. But this only occurs as long as the file exists.
As for the error falling through to the except clause (assuming the file exists), that implies that there is an issue with the parsing logic you've implemented. We can't be sure what it is since the except catches everything, and unless it's re-raised (may as well not catch it then...), or you print the stack trace from the exception object, you can't tell what is going wrong or where. In general, this is why catching everything is frowned upon; it makes debugging unnecessarily difficult.
I also noticed that you are closing the file at the very end. This is probably another source of errors, since the file variable is only bound if the open() inside the try succeeds; if it fails, the file.close() at the end will itself raise a NameError. You have two options:
Include file.close() inside of your try block so the file is properly closed after you're done, or
Use the with statement as a context manager to automatically close the file when you're done. @David Heffernan's example is one that's similar to one I would write.