Catching a python app before it exits


I have a python app which is supposed to be very long-lived, but sometimes the process just disappears and I don't know why. Nothing gets logged when this happens, so I'm at a bit of a loss.
Is there some way in code I can hook in to an exit event, or some other way to get some of my code to run just before the process quits? I'd like to log the state of memory structures to better understand what's going on.

atexit is pronounced "at exit". The first few times I read that function name, I read it as "a texit", which doesn't make nearly as much sense.
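For the question above, here is a minimal sketch of what an atexit hook could look like. The names dump_state and install_exit_logger, and the idea of keeping the interesting structures in one state dictionary, are hypothetical and not from the original post. Handlers registered with atexit run only on normal interpreter shutdown; they are skipped when the process is killed by an unhandled signal, when a fatal interpreter error occurs, or when os._exit() is called, so this will not explain every disappearance.

```python
import atexit
import json
import time


def dump_state(state, path):
    """Append a timestamped JSON snapshot of `state` to the file at `path`."""
    record = {"time": time.time(), "state": state}
    with open(path, "a") as log:
        # default=repr keeps the dump from failing on non-JSON objects
        log.write(json.dumps(record, default=repr) + "\n")


def install_exit_logger(state, path):
    """Register dump_state to run at normal interpreter shutdown."""
    atexit.register(dump_state, state, path)
```

A long-lived app would call install_exit_logger(app_state, "exit_state.log") once at startup, where app_state is whatever dictionary it keeps updated while running.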

You might try running your application directly from a console (cmd on windows, sh/bash/etc on unix), so you can see any stack trace, etc printed to the console when the process dies.

I'm not sure if you are able to modify the source code, but if so you might want to try this:
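import sys
def debugexcept(type, value, tb):
    # Use the default hook when interactive, when not attached to a terminal, or on syntax errors
    if hasattr(sys, 'ps1') or not (sys.stderr.isatty() and sys.stdin.isatty()) or type == SyntaxError:
        sys.__excepthook__(type, value, tb)
    else:
        import traceback, pdb
        traceback.print_exception(type, value, tb)
        print('')
        pdb.pm()  # start the post-mortem debugger on the last traceback
sys.excepthook = debugexcept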
If you launch your python program from a command line you should be dumped into the python debugger when it dies, assuming something 'bad' has happened to cause an exception. I'm guessing maybe stderr/stdout have been captured and you're not seeing some exception?
i.e., search your code for something like:
sys.stdout = open('stdout.log', 'w')
sys.stderr = open('stderr.log', 'w')
If the process is dying without an exception at all then that might be harder to find. One (very hard) way on Windows would be to use something like windbg to attach to the process and set a breakpoint in the CRT at some relevant spot.
Good luck!
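For the case where the process dies without any Python exception (for example a segfault inside a C extension), the standard library's faulthandler module (Python 3.3 and later) can write the Python traceback of every thread to a file when a fatal signal arrives. This is a sketch only; the function name enable_crash_dumps and the log path are hypothetical. It cannot help with SIGKILL, which is what the Linux out-of-memory killer sends.

```python
import faulthandler


def enable_crash_dumps(path):
    """Dump Python tracebacks to `path` on SIGSEGV, SIGFPE, SIGABRT, SIGBUS or SIGILL."""
    crash_log = open(path, "a")
    faulthandler.enable(file=crash_log, all_threads=True)
    # faulthandler keeps using this file descriptor, so the caller
    # must keep the returned file object open for the process lifetime.
    return crash_log
```

Calling enable_crash_dumps("crash.log") once at startup is enough; the file stays empty unless one of those signals is received.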

Related

Python subprocess — how to ignore exit code warnings?

I am trying to display the final results.txt file via default program. I've tried with bare Popen() without run() and got the same effect. The target file is opening (for me it's the see mode) but after exiting it I receive:
Warning: program returned non-zero exit code #256
Is there any way to ignore it and prevent my program from displaying such warning? I don't care about it because it's the last thing the program does, so I don't want people to waste their time clicking Enter each time...
Code's below:
from subprocess import run, Popen
if filepath[len(filepath)-1] != '/':
    try:
        results = run(Popen(['start', 'results.txt'], shell=True), stdout=None, shell=True, check=False)
    except TypeError:
        pass
else:
    try:
        results = run(Popen(['open', 'results.txt']), stdout=None, check=False)
    except TypeError:
        pass
    except FileNotFoundError:
        try:
            results = run(Popen(['see', 'results.txt']), stdout=None, check=False)
        except TypeError:
            pass
        except FileNotFoundError:
            pass

Your immediate error is that you are mixing subprocess.run with subprocess.Popen. The correct syntax is
y = subprocess.Popen(['command', 'argument'])
or
x = subprocess.run(['command', 'argument'])
but you are incorrectly combining them into, effectively
x = subprocess.run(subprocess.Popen(['command', 'argument']), shell=True)
where the shell=True is a separate bug in its own right (though it will weirdly work on Windows).
What happens is that Popen runs successfully, but then you try to run run on the result, which of course is not a valid command at all.
You want to prefer subprocess.run() over subprocess.Popen in this scenario; the latter is for hand-wiring your own low-level functionality in scenarios where run doesn't offer enough flexibility (such as when you require the subprocess to run in parallel with your Python program as an independent process).
Your approach seems vaguely flawed for Unix-like systems; you probably want to run xdg-open if it's available, otherwise the value of os.environ["PAGER"] if it's defined, else fall back to less, else try more. Some ancient systems also have a default pager called pg.
You will definitely want to add check=True to actually make sure your command fails properly if the command cannot be found, which is the diametrical opposite of what you appear to be asking. With this keyword parameter, Python checks whether the command worked, and will raise an exception if not. (In its absence, failures will be silently ignored, in general.) You should never catch every possible exception; instead, trap just the one you really know how to handle.
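The fallback order suggested above could be sketched roughly as follows. The function names viewer_command and show are hypothetical, and the sketch assumes that PAGER, if set, names a single executable without arguments. The platform, env and which parameters exist only so the selection logic can be exercised without launching anything.

```python
import os
import shutil
import subprocess
import sys


def viewer_command(path, platform=sys.platform, env=os.environ, which=shutil.which):
    """Return an argument list that should display `path`, or None if nothing is found."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is the window title argument
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    if which("xdg-open"):
        return ["xdg-open", path]
    # Fall back to a pager: $PAGER first, then less, then more
    for candidate in (env.get("PAGER"), "less", "more"):
        if candidate and which(candidate):
            return [candidate, path]
    return None


def show(path):
    """Display `path`, raising if no viewer exists or the viewer fails."""
    command = viewer_command(path)
    if command is None:
        raise FileNotFoundError("no viewer found for %s" % path)
    # check=True raises CalledProcessError on a non-zero exit status
    subprocess.run(command, check=True)
```

A caller that really does not care about the viewer's exit status can wrap show("results.txt") in a try block that catches only subprocess.CalledProcessError, rather than silencing every exception.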

Okay, I've achieved my goal with a different approach. I didn't need to handle such exception, I did it without the subprocess module.
Question closed, here's the final code (it looks even better):
from os import system
from platform import system as sysname
if sysname() == 'Windows':
    system('start results.txt')
elif sysname() == 'Linux':
    system('see results.txt')
elif sysname() == 'Darwin':
    system('open results.txt')
else:
    pass

How to guarantee file removing after script stopped working?

I have a script that is run by crontab every hour and interacts with an API (database sync). Usually it takes an hour or so, and on the next run I check whether the previous process is still in memory or not:
#!/usr/bin/env python
import os
import sys
pid = str(os.getpid())
pidfile = "/tmp/mydaemon.pid"
if os.path.isfile(pidfile):
    print "%s already exists, exiting" % pidfile
    sys.exit()
file(pidfile, 'w').write(pid)
try:
    pass  # Do some actual work here
finally:
    os.unlink(pidfile)
BUT after some time the script stopped working. When I look at the output of "ps aux | grep python", I don't see this script as a process, but the pid file is still in place.
And when I run the script manually, I see information printed iteratively on the screen, but after some time I see the word "Terminated"; the script has exited and the file is still in place.
How can I guarantee 100% that the file is removed after the script stops working?
Thanks!

It looks like your script is terminated unexpectedly, most probably due to excessive memory usage. It's not guaranteed that finally will be executed on unexpected program termination. So, first of all, I suggest you find the cause of the unexpected termination and fix it.
Actually there is no 100% way to guarantee that the file will be removed. However, there are a few workarounds for handling dangling pid files.
Place your pid files on the /var/run volume, so they will be removed on unexpected system restart.
Check whether the process with that pid is still running on each script execution:
import os
import sys
def is_alive(pid):
    try:
        os.kill(pid, 0)  # signal 0 sends nothing, but raises OSError if the process does not exist
        return True
    except OSError:
        return False
# and add this to your code:
if os.path.isfile(pidfile):
    with open(pidfile) as f:
        if is_alive(int(f.read())):  # os.kill needs an integer pid
            sys.exit()
Again, the provided code is not 100% safe because of possible pid collisions. You can make the verification of the running process more sophisticated by parsing the output of the ps command. Try to find a line with the desired pid value and check whether it looks similar to your crontab entry.
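Since the question mentions seeing "Terminated", which is what the shell prints after SIGTERM, one partial mitigation is to turn SIGTERM into a normal Python exception so that the finally block still runs. This is a sketch under that assumption; run_with_pidfile and its work argument are hypothetical names. It does nothing for SIGKILL (for example from the out-of-memory killer), which cannot be caught, so the stale-pid check above is still needed.

```python
import os
import signal


def _raise_system_exit(signum, frame):
    # Raising here unwinds the stack, so pending `finally` blocks execute
    raise SystemExit(128 + signum)


def run_with_pidfile(pidfile, work):
    """Run work() with a pid file that is removed even if SIGTERM arrives."""
    # Must be called from the main thread, as signal.signal requires
    signal.signal(signal.SIGTERM, _raise_system_exit)
    with open(pidfile, "w") as f:
        f.write(str(os.getpid()))
    try:
        work()
    finally:
        os.unlink(pidfile)
```

The cron script would then call run_with_pidfile("/tmp/mydaemon.pid", do_sync), where do_sync stands for the actual synchronisation function.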

Normally you can use the atexit module's functionality, but in your case (unexpected termination) it may not work either.
Maybe using the tempfile module within a with statement would work (NamedTemporaryFile rather than mkstemp, since mkstemp leaves cleanup to the caller), specifying the required prefix/suffix: it creates a unique pidfile in /tmp and removes it when the with block completes or raises.

How to silence "sys.excepthook is missing" error?

NB: I have not attempted to reproduce the problem described below under Windows, or with versions of Python other than 2.7.3.
The most reliable way to elicit the problem in question is to pipe the output of the following test script through : (under bash):
try:
    for n in range(20):
        print n
except:
    pass
I.e.:
% python testscript.py | :
close failed in file object destructor:
sys.excepthook is missing
lost sys.stderr
My question is:
How can I modify the test script above to avoid the error message when the script is run as shown (under Unix/bash)?
(As the test script shows, the error cannot be trapped with a try-except.)
The example above is, admittedly, highly artificial, but I'm running into the same problem sometimes when the output of a script of mine is piped through some 3rd party software.
The error message is certainly harmless, but it is disconcerting to end-users, so I would like to silence it.
EDIT: The following script, which differs from the original one above only in that it redefines sys.excepthook, behaves exactly like the one given above.
import sys
STDERR = sys.stderr
def excepthook(*args):
    print >> STDERR, 'caught'
    print >> STDERR, args
sys.excepthook = excepthook
try:
    for n in range(20):
        print n
except:
    pass

How can I modify the test script above to avoid the error message when the script is run as shown (under Unix/bash)?
You will need to prevent the script from writing anything to standard output. That means removing any print statements and any use of sys.stdout.write, as well as any code that calls those.
The reason this is happening is that you're piping a nonzero amount of output from your Python script to something which never reads from standard input. This is not unique to the : command; you can get the same result by piping to any command which doesn't read standard input, such as
python testscript.py | cd .
Or for a simpler example, consider a script printer.py containing nothing more than
print 'abcde'
Then
python printer.py | python printer.py
will produce the same error.
When you pipe the output of one program into another, the output produced by the writing program gets backed up in a buffer, and waits for the reading program to request that data from the buffer. As long as the buffer is nonempty, any attempt to close the writing file object is supposed to fail with an error. This is the root cause of the messages you're seeing.
The specific code that triggers the error is in the C language implementation of Python, which explains why you can't catch it with a try/except block: it runs after the contents of your script have finished processing. Basically, while Python is shutting itself down, it attempts to close stdout, but that fails because there is still buffered output waiting to be read. So Python tries to report this error as it would normally, but sys.excepthook has already been removed as part of the finalization procedure, so that fails. Python then tries to print a message to sys.stderr, but that has already been deallocated so again, it fails. The reason you see the messages on the screen is that the Python code does contain a contingency fprintf to write out some output to the file pointer directly, even if Python's output object doesn't exist.
Technical details
For those interested in the details of this procedure, let's take a look at the Python interpreter's shutdown sequence, which is implemented in the Py_Finalize function of pythonrun.c.
After invoking exit hooks and shutting down threads, the finalization code calls PyImport_Cleanup to finalize and deallocate all imported modules. The next-to-last task performed by this function is removing the sys module, which mainly consists of calling _PyModule_Clear to clear all the entries in the module's dictionary - including, in particular, the standard stream objects (the Python objects) such as stdout and stderr.
When a value is removed from a dictionary or replaced by a new value, its reference count is decremented using the Py_DECREF macro. Objects whose reference count reaches zero become eligible for deallocation. Since the sys module holds the last remaining references to the standard stream objects, when those references are unset by _PyModule_Clear, they are then ready to be deallocated.[1]
Deallocation of a Python file object is accomplished by the file_dealloc function in fileobject.c. This first invokes the Python file object's close method using the aptly-named close_the_file function:
ret = close_the_file(f);
For a standard file object, close_the_file(f) delegates to the C fclose function, which sets an error condition if there is still data to be written to the file pointer. file_dealloc then checks for that error condition and prints the first message you see:
if (!ret) {
    PySys_WriteStderr("close failed in file object destructor:\n");
    PyErr_Print();
}
else {
    Py_DECREF(ret);
}
After printing that message, Python then attempts to display the exception using PyErr_Print. That delegates to PyErr_PrintEx, and as part of its functionality, PyErr_PrintEx attempts to access the Python exception printer from sys.excepthook.
hook = PySys_GetObject("excepthook");
This would be fine if done in the normal course of a Python program, but in this situation, sys.excepthook has already been cleared.[2] Python checks for this error condition and prints the second message as a notification.
if (hook && hook != Py_None) {
    ...
} else {
    PySys_WriteStderr("sys.excepthook is missing\n");
    PyErr_Display(exception, v, tb);
}
After notifying us about the missing excepthook, Python then falls back to printing the exception info using PyErr_Display, which is the default method for displaying a stack trace. The very first thing this function does is try to access sys.stderr.
PyObject *f = PySys_GetObject("stderr");
In this case, that doesn't work because sys.stderr has already been cleared and is inaccessible.[3] So the code invokes fprintf directly to send the third message to the C standard error stream.
if (f == NULL || f == Py_None)
    fprintf(stderr, "lost sys.stderr\n");
Interestingly, the behavior is a little different in Python 3.4+ because the finalization procedure now explicitly flushes the standard output and error streams before builtin modules are cleared. This way, if you have data waiting to be written, you get an error that explicitly signals that condition, rather than an "accidental" failure in the normal finalization procedure. Also, if you run
python printer.py | python printer.py
using Python 3.4 (after putting parentheses on the print statement of course), you don't get any error at all. I suppose the second invocation of Python may be consuming standard input for some reason, but that's a whole separate issue.
[1] Actually, that's a lie. Python's import mechanism caches a copy of each imported module's dictionary, which is not released until _PyImport_Fini runs, later in the implementation of Py_Finalize, and that's when the last references to the standard stream objects disappear. Once the reference count reaches zero, Py_DECREF deallocates the objects immediately. But all that matters for the main answer is that the references are removed from the sys module's dictionary and then deallocated sometime later.
[2] Again, this is because the sys module's dictionary is cleared completely before anything is really deallocated, thanks to the attribute caching mechanism. You can run Python with the -vv option to see all the module's attributes being unset before you get the error message about closing the file pointer.
[3] This particular piece of behavior is the only part that doesn't make sense unless you know about the attribute caching mechanism mentioned in previous footnotes.

I ran into this sort of issue myself today and went looking for an answer. I think a simple workaround here is to ensure you flush stdio first, so python blocks instead of failing during script shutdown. For example:
--- a/testscript.py
+++ b/testscript.py
@@ -9,5 +9,6 @@ sys.excepthook = excepthook
 try:
     for n in range(20):
         print n
+    sys.stdout.flush()
 except:
     pass
Then with this script nothing happens, as the exception (IOError: [Errno 32] Broken pipe) is suppressed by the try...except.
$ python testscript.py | :
$
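On Python 3 the same idea (flush explicitly while still inside the try block) can be sketched as below; it follows the pattern described in the signal module documentation's note on SIGPIPE. The function name emit is hypothetical. On Python 2.7, as used in the question, the exception would be IOError with errno 32 rather than BrokenPipeError.

```python
import os
import sys


def emit(lines, out=None):
    """Write lines to `out`; return False if the reader has gone away."""
    out = sys.stdout if out is None else out
    try:
        for line in lines:
            out.write(line + "\n")
        # Flush here so a broken pipe surfaces inside this try block
        # instead of during interpreter shutdown.
        out.flush()
        return True
    except BrokenPipeError:
        # Point the descriptor at devnull so the flush that happens
        # at shutdown cannot fail a second time.
        devnull = os.open(os.devnull, os.O_WRONLY)
        os.dup2(devnull, out.fileno())
        os.close(devnull)
        return False
```

A script would typically call emit(str(n) for n in range(20)) and exit with a non-zero status when it returns False.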

Your program throws an exception that cannot be caught using a try/except block. To catch it, override the sys.excepthook function:
import sys
sys.excepthook = lambda *args: None
From documentation:
sys.excepthook(type, value, traceback)
When an exception is raised and uncaught, the interpreter calls
sys.excepthook with three arguments, the exception class, exception
instance, and a traceback object. In an interactive session this
happens just before control is returned to the prompt; in a Python
program this happens just before the program exits. The handling of
such top-level exceptions can be customized by assigning another
three-argument function to sys.excepthook.
Illustrative example:
import sys
import logging
import traceback  # needed for format_tb below
def log_uncaught_exceptions(exception_type, exception, tb):
    logging.critical(''.join(traceback.format_tb(tb)))
    logging.critical('{0}: {1}'.format(exception_type, exception))
sys.excepthook = log_uncaught_exceptions

I realize that this is an old question, but I found it in a Google search for the error. In my case it was a coding error. One of my last statements was:
print "Good Bye"
The solution was simply fixing the syntax to:
print ("Good Bye")
[Raspberry Pi Zero, Python 2.7.9]

IOError Input/Output Error When Printing

I have inherited some code which is periodically (randomly) failing due to an Input/Output error being raised during a call to print. I am trying to determine the cause of the exception being raised (or at least, better understand it) and how to handle it correctly.
When executing the following line of Python (in a 2.6.6 interpreter, running on CentOS 5.5):
print >> sys.stderr, 'Unable to do something: %s' % command
The exception is raised (traceback omitted):
IOError: [Errno 5] Input/output error
For context, this is generally what the larger function is trying to do at the time:
from subprocess import Popen, PIPE
import sys
def run_commands(commands):
    for command in commands:
        try:
            out, err = Popen(command, shell=True, stdout=PIPE, stderr=PIPE).communicate()
            print >> sys.stdout, out
            if err:
                raise Exception('ERROR -- an error occurred when executing this command: %s --- err: %s' % (command, err))
        except:
            print >> sys.stderr, 'Unable to do something: %s' % command
run_commands(["ls", "echo foo"])
The >> syntax is not particularly familiar to me, it's not something I use often, and I understand that it is perhaps the least preferred way of writing to stderr. However I don't believe the alternatives would fix the underlying problem.
From the documentation I have read, IOError 5 is often misused, and somewhat loosely defined, with different operating systems using it to cover different problems. The best I can see in my case is that the python process is no longer attached to the terminal/pty.
As best I can tell nothing is disconnecting the process from the stdout/stderr streams - the terminal is still open for example, and everything 'appears' to be fine. Could it be caused by the child process terminating in an unclean fashion? What else might be a cause of this problem - or what other steps could I introduce to debug it further?
In terms of handling the exception, I can obviously catch it, but I'm assuming this means I won't be able to print to stdout/stderr for the remainder of execution? Can I reattach to these streams somehow, perhaps by resetting sys.stdout to sys.__stdout__ etc? In this case not being able to write to stdout/stderr is not considered fatal, but if it is an indication of something starting to go wrong I'd rather bail early.
I guess ultimately I'm at a bit of a loss as to where to start debugging this one...

I think it has to do with the terminal the process is attached to. I got this error when I ran a python process in the background and closed the terminal in which I had started it:
$ myprogram.py
Ctrl-Z
$ bg
$ exit
The problem was that I started a not daemonized process in a remote server and logged out (closing the terminal session). A solution was to start a screen/tmux session on the remote server and start the process within this session. Then detaching the session+log out keeps the terminal associated with the process. This works at least in the *nix world.

I had a very similar problem. I had a program that was launching several other programs using the subprocess module. Those subprocesses would then print output to the terminal. What I found was that when I closed the main program, it did not terminate the subprocesses automatically (as I had assumed), rather they kept running. So if I terminated both the main program and then the terminal it had been launched from*, the subprocesses no longer had a terminal attached to their stdout, and would throw an IOError. Hope this helps you.
*NB: it must be done in this order. If you just kill the terminal, (for some reason) that would kill both the main program and the subprocesses.

I just got this error because the directory I was writing files to ran out of space (the disk was full). Not sure if this is at all applicable to your situation.

I'm new here, so please forgive if I slip up a bit when it comes to the code detail.
Recently I was able to figure out what cause the I/O error of the print statement when the terminal associated with the run of the python script is closed.
It is because the string to be printed to stdout/stderr is too long. In this case, the "out" string is the culprit.
To fix this problem (without having to keep the terminal open while running the python script), simply read the "out" string line by line, and print line by line, until we reach the end of the "out" string. Something like:
for ln in out.splitlines():  # "out" is a string, so split it rather than calling readline()
    print ln  # splitlines() has already dropped the trailing newline
The same problem occurs if you print the entire list of strings out to the screen. Simply print the list one item by one item.
Hope that helps!

The problem is that the stdout pipe which python is attempting to write to when print() is called has been closed.
This can be caused by running a script in the background using & and then closing the terminal session (i.e. closing stdout):
$ python myscript.py &
$ exit
One solution is to set stdout to a file when running in the background
Example
$ python myscript.py > /var/log/myscript.log 2>&1 &
$ exit
No errors on print()
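If redirecting at launch time is not possible, the script itself can guard its diagnostic output. The following Python 3 sketch is an assumption-laden illustration, not something from the answers above: safe_print and fallback_path are hypothetical names, and only errno 5 (EIO) is treated as "the terminal is gone".

```python
import errno
import sys


def safe_print(message, stream=None, fallback_path=None):
    """Write message to stream; on EIO append it to fallback_path instead.

    Returns True if the stream accepted the message, False otherwise.
    """
    stream = sys.stderr if stream is None else stream
    try:
        stream.write(message + "\n")
        stream.flush()
        return True
    except OSError as exc:
        if exc.errno != errno.EIO:
            raise  # some other I/O problem; do not hide it
        if fallback_path is not None:
            with open(fallback_path, "a") as log:
                log.write(message + "\n")
        return False
```

Because the function reports whether the stream is still usable, a caller that would rather bail early can exit as soon as it returns False.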

It could happen when your shell crashes while the print was trying to write the data into it.

In my case, I just restarted the service and the issue disappeared. I don't know why.
My issue was the same OSError Input/Output error, for Odoo.

Python error logging

I'd like to find a way to log every error that forces the python interpreter to quit to be saved to a file as well as being printed to the screen. The reason I would like to do this is that I want to keep stats on the types of errors I make while writing code, with an eye towards finding ways to avoid mistakes I make commonly in the future.
I've been attempting to do this by writing a wrapper for the python interpreter using the subprocess module. Basically, it runs the python interpreter, captures any output, parses and saves it to a file, prints the output, and uses matplotlib to make some summary figures. However, I'm having a problem getting output from my wrapper script in real time. For example, if the script I'm running is:
import os
import time
for x in range(10):
    print "testing"
    time.sleep(10)
and I'm using subprocess.Popen() with p.communicate(), the wrapper will wait 100 seconds, and then print all of the output. I'd like the wrapper to be as invisible as possible - ideally in this case it would print "testing" once every ten seconds.
If someone could point me towards a good way of doing this, I'd greatly appreciate it.
Thanks!

I believe you can simply replace sys.excepthook with your own function. You can read about it in the Python documentation.
Basically, it allows you to customize what happens when an exception percolates up to the point of forcing the Python interpreter to quit. You use it like this:
import sys
def my_excepthook(type, value, tb):
    # you can log the exception to a file here
    print 'In My Exception Handler'
    # the following line does the default (prints it to err)
    sys.__excepthook__(type, value, tb)
sys.excepthook = my_excepthook
You'll probably also want to look at the traceback module, for formatting the traceback you get.
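If the wrapper approach from the question is still wanted, output can be relayed as it arrives by reading the pipe line by line instead of calling communicate(), which waits for the child to finish. A minimal Python 3 sketch, with run_and_tee and log_path as hypothetical names; it assumes the child does not buffer its own output, which for a Python child means starting it with the -u option.

```python
import subprocess
import sys


def run_and_tee(args, log_path):
    """Run args, echo each output line as it arrives, append it to log_path.

    Returns the child's exit status.
    """
    with open(log_path, "a") as log:
        proc = subprocess.Popen(
            args,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so tracebacks are captured too
            universal_newlines=True,   # text mode
            bufsize=1,                 # line-buffered on the wrapper's side
        )
        for line in proc.stdout:
            sys.stdout.write(line)
            sys.stdout.flush()
            log.write(line)
        proc.stdout.close()
        return proc.wait()
```

For example, run_and_tee([sys.executable, "-u", "script.py"], "errors.log") would echo each line of script.py's output as soon as the child writes it, while keeping a copy for later analysis.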
