Python error logging

I'd like to find a way to log every error that forces the Python interpreter to quit, saving it to a file as well as printing it to the screen. The reason I would like to do this is that I want to keep stats on the types of errors I make while writing code, with an eye towards finding ways to avoid the mistakes I commonly make in the future.
I've been attempting to do this by writing a wrapper for the Python interpreter using the subprocess module. Basically, it runs the Python interpreter, captures any output, parses and saves it to a file, prints the output, and uses matplotlib to make some summary figures. However, I'm having a problem getting output from my wrapper script in real time. For example, if the script I'm running is:
import os
import time

for x in range(10):
    print "testing"
    time.sleep(10)
and I'm using subprocess.Popen() with p.communicate(), the wrapper will wait 100 seconds, and then print all of the output. I'd like the wrapper to be as invisible as possible - ideally in this case it would print "testing" once every ten seconds.
If someone could point me towards a good way of doing this, I'd greatly appreciate it.
Thanks!
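On the real-time output problem: reading the pipe line by line instead of calling communicate() lets the wrapper echo each line as soon as it arrives. A minimal sketch (script.py stands in for whatever you're wrapping; -u keeps the child's stdout from being block-buffered behind the pipe):

import subprocess
import sys

# Read the child's stdout incrementally instead of waiting for
# communicate() to return everything at process exit.
p = subprocess.Popen([sys.executable, '-u', 'script.py'],
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)
for line in iter(p.stdout.readline, ''):
    print line,  # trailing comma: the line already ends with '\n'
    # ...parse and log the line here...
p.wait()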

I believe you can simply replace sys.excepthook with your own function. You can read about it in the Python documentation.
Basically, it allows you to customize what happens when an exception percolates up to the point of forcing the Python interpreter to quit. You use it like this:
import sys

def my_excepthook(type, value, tb):
    # you can log the exception to a file here
    print 'In My Exception Handler'
    # the following line does the default (prints the traceback to stderr)
    sys.__excepthook__(type, value, tb)

sys.excepthook = my_excepthook
You'll probably also want to look at the traceback module, for formatting the traceback you get.
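For instance, a minimal sketch of a hook that appends the formatted traceback to a file (errors.log is just an example path) before falling back to the default behaviour:

import sys
import traceback

def logging_excepthook(exc_type, exc_value, tb):
    # render the traceback exactly as the interpreter would print it
    text = ''.join(traceback.format_exception(exc_type, exc_value, tb))
    with open('errors.log', 'a') as f:
        f.write(text)
    # keep the default behaviour: print the traceback to stderr
    sys.__excepthook__(exc_type, exc_value, tb)

sys.excepthook = logging_excepthook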

Related

Reconstructing or catching python errors from subprocess

I'm currently trying to get the attributes of a Python stack trace and error, either purely from the stderr or by intercepting a script and catching its Exception. Either way is fine. This may be an XY problem, so alternatives are welcome.
I am currently writing a program that, more or less, redirects its arguments to call another Python script:
import sys, subprocess

pyargs = sys.argv[1:]  # Suppose pyargs is something like ["a.py", "123"]
result = subprocess.run(["python"] + pyargs, stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
error = result.stderr  # do something with this
Approach 1: Parsing dump
I'm not sure whether there is any existing method that supports either of the two plans. I know that the python error dump is displayed in a very formulaic way:
import sys, traceback

exc_type, exc_value, exc_traceback = sys.exc_info()  # the components of the error
traceback.print_exc()  # prints to stderr exactly what we wanted to capture
And so I thought perhaps we could parse the stderr and see what happens.
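For example, a rough sketch of that idea (parse_last_error is a hypothetical helper; the traceback layout is conventional rather than guaranteed, so treat this as best-effort parsing):

import re

def parse_last_error(stderr_text):
    # The last line of a CPython traceback usually looks like
    # "ValueError: bad input" or "pkg.module.CustomError: message".
    lines = stderr_text.strip().splitlines()
    if not lines:
        return None
    match = re.match(r'^([A-Za-z_][\w.]*):\s*(.*)$', lines[-1])
    if match:
        return match.group(1), match.group(2)  # (exception name, message)
    return None

Applied to the snippet above, that would be parse_last_error(result.stderr.decode()).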
Approach 2: Catching exception
I know that we can run the subprocess with subprocess.run(..., check=True) to get a subprocess.SubprocessError, but since the subprocess can be anything (not just a Python process), we can't actually intercept the actual exception that occurred.
One issue with loading the Python file as a module into this file is that we cannot supply the necessary arguments, unless we can somehow spoof the sys.argv or argparse values that the underlying script reads. Scripts guarded by if __name__ == "__main__": will not execute either, unless we spoof that too.
Any ideas?
I prefer approach #2. Whenever possible, I avoid using subprocess or system to call Python from Python.
Changing the value of sys.argv is easy - you just overwrite it using regular assignment.
Tricking the __name__ == "__main__" conditional is harder. Unfortunately, it probably means that you can't simply import the module in the usual way. According to How to import a module as __main__?, you can use the runpy module to execute your module and supply whatever value of __name__ you like.
import runpy
import sys

sys.argv = ["C:/", "foo", "bar"]

try:
    runpy.run_module("yourmodulename", {}, "__main__")
except Exception as e:
    print("caught exception:", e)
Now you can catch any uncaught exceptions that your module throws, and handle them however you like within the except.

Preventing write interrupts in python script

I'm writing a parser in Python that outputs a bunch of database rows to standard out. In order for the DB to process them properly, each row needs to be fully printed to the console. I'm trying to prevent interrupts from making the print command stop halfway through printing a line.
I tried the solution that recommended using a signal handler override, but this still doesn't prevent the row from being partially printed when the program is interrupted. (I think the WRITE system call is cancelled to handle the interrupt).
I thought that the problem was solved by issue 10956 but I upgraded to Python 2.7.5 and the problem still happens.
You can see for yourself by running this example:
# Writer
import signal

interrupted = False

def signal_handler(signal, frame):
    global interrupted
    interrupted = True

signal.signal(signal.SIGINT, signal_handler)

while True:
    if interrupted:
        break
    print '0123456789'
In a terminal:
$ mkfifo --mode=0666 pipe
$ python writer.py > pipe
In another terminal:
$ cat pipe
Then Ctrl+C the first terminal. Some of the time the second terminal will end with an incomplete sequence of characters.
Is there any way of ensuring that full lines are written?
This seems less like an interrupt problem per se than a buffering issue. If I make a small change to your code, I don't get the partial lines.
# Writer
import sys

while True:
    print '0123456789'
    sys.stdout.flush()
It sounds like you don't really want to catch the signal but rather block it temporarily. This is supported by some *nix flavours. However, Python explicitly does not support this.
You can write a C wrapper for sigmasks or look for a library. However, if you are looking for a portable solution...
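That said, a pure-Python approximation on *nix is to install a temporary handler that merely records SIGINT while a line is being written, and to re-deliver the signal once the write has completed. A minimal sketch (assuming SIGINT is the only signal you care about):

import os
import signal
import sys

def write_line_atomically(line):
    pending = []
    def defer(signum, frame):
        pending.append(signum)  # remember the interrupt for later
    old_handler = signal.signal(signal.SIGINT, defer)
    try:
        sys.stdout.write(line + '\n')
        sys.stdout.flush()
    finally:
        signal.signal(signal.SIGINT, old_handler)
        for signum in pending:
            os.kill(os.getpid(), signum)  # re-raise now that the line is out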

How to silence "sys.excepthook is missing" error?

NB: I have not attempted to reproduce the problem described below under Windows, or with versions of Python other than 2.7.3.
The most reliable way to elicit the problem in question is to pipe the output of the following test script through : (under bash):
try:
    for n in range(20):
        print n
except:
    pass
I.e.:
% python testscript.py | :
close failed in file object destructor:
sys.excepthook is missing
lost sys.stderr
My question is:
How can I modify the test script above to avoid the error message when the script is run as shown (under Unix/bash)?
(As the test script shows, the error cannot be trapped with a try-except.)
The example above is, admittedly, highly artificial, but I'm running into the same problem sometimes when the output of a script of mine is piped through some 3rd party software.
The error message is certainly harmless, but it is disconcerting to end-users, so I would like to silence it.
EDIT: The following script, which differs from the original one above only in that it redefines sys.excepthook, behaves exactly like the one given above.
import sys

STDERR = sys.stderr

def excepthook(*args):
    print >> STDERR, 'caught'
    print >> STDERR, args

sys.excepthook = excepthook

try:
    for n in range(20):
        print n
except:
    pass
How can I modify the test script above to avoid the error message when the script is run as shown (under Unix/bash)?
You will need to prevent the script from writing anything to standard output. That means removing any print statements and any use of sys.stdout.write, as well as any code that calls those.
The reason this is happening is that you're piping a nonzero amount of output from your Python script to something which never reads from standard input. This is not unique to the : command; you can get the same result by piping to any command which doesn't read standard input, such as
python testscript.py | cd .
Or for a simpler example, consider a script printer.py containing nothing more than
print 'abcde'
Then
python printer.py | python printer.py
will produce the same error.
When you pipe the output of one program into another, the output produced by the writing program gets backed up in a buffer, and waits for the reading program to request that data from the buffer. As long as the buffer is nonempty, any attempt to close the writing file object is supposed to fail with an error. This is the root cause of the messages you're seeing.
The specific code that triggers the error is in the C language implementation of Python, which explains why you can't catch it with a try/except block: it runs after your script has finished executing. Basically, while Python is shutting itself down, it attempts to close stdout, but that fails because there is still buffered output waiting to be read. So Python tries to report this error as it would normally, but sys.excepthook has already been removed as part of the finalization procedure, so that fails. Python then tries to print a message to sys.stderr, but that has already been deallocated, so again it fails. The reason you see the messages on the screen is that the Python code contains a contingency fprintf to write some output to the file pointer directly, even if Python's output object doesn't exist.
Technical details
For those interested in the details of this procedure, let's take a look at the Python interpreter's shutdown sequence, which is implemented in the Py_Finalize function of pythonrun.c.
After invoking exit hooks and shutting down threads, the finalization code calls PyImport_Cleanup to finalize and deallocate all imported modules. The next-to-last task performed by this function is removing the sys module, which mainly consists of calling _PyModule_Clear to clear all the entries in the module's dictionary - including, in particular, the standard stream objects (the Python objects) such as stdout and stderr.
When a value is removed from a dictionary or replaced by a new value, its reference count is decremented using the Py_DECREF macro. Objects whose reference count reaches zero become eligible for deallocation. Since the sys module holds the last remaining references to the standard stream objects, when those references are unset by _PyModule_Clear, they are then ready to be deallocated.[1]
Deallocation of a Python file object is accomplished by the file_dealloc function in fileobject.c. This first invokes the Python file object's close method using the aptly-named close_the_file function:
ret = close_the_file(f);
For a standard file object, close_the_file(f) delegates to the C fclose function, which sets an error condition if there is still data to be written to the file pointer. file_dealloc then checks for that error condition and prints the first message you see:
if (!ret) {
    PySys_WriteStderr("close failed in file object destructor:\n");
    PyErr_Print();
}
else {
    Py_DECREF(ret);
}
After printing that message, Python then attempts to display the exception using PyErr_Print. That delegates to PyErr_PrintEx, and as part of its functionality, PyErr_PrintEx attempts to access the Python exception printer from sys.excepthook.
hook = PySys_GetObject("excepthook");
This would be fine if done in the normal course of a Python program, but in this situation, sys.excepthook has already been cleared.[2] Python checks for this error condition and prints the second message as a notification.
if (hook && hook != Py_None) {
    ...
} else {
    PySys_WriteStderr("sys.excepthook is missing\n");
    PyErr_Display(exception, v, tb);
}
After notifying us about the missing excepthook, Python then falls back to printing the exception info using PyErr_Display, which is the default method for displaying a stack trace. The very first thing this function does is try to access sys.stderr.
PyObject *f = PySys_GetObject("stderr");
In this case, that doesn't work because sys.stderr has already been cleared and is inaccessible.[3] So the code invokes fprintf directly to send the third message to the C standard error stream.
if (f == NULL || f == Py_None)
    fprintf(stderr, "lost sys.stderr\n");
Interestingly, the behavior is a little different in Python 3.4+ because the finalization procedure now explicitly flushes the standard output and error streams before builtin modules are cleared. This way, if you have data waiting to be written, you get an error that explicitly signals that condition, rather than an "accidental" failure in the normal finalization procedure. Also, if you run
python printer.py | python printer.py
using Python 3.4 (after putting parentheses on the print statement of course), you don't get any error at all. I suppose the second invocation of Python may be consuming standard input for some reason, but that's a whole separate issue.
[1] Actually, that's a lie. Python's import mechanism caches a copy of each imported module's dictionary, which is not released until _PyImport_Fini runs, later in the implementation of Py_Finalize, and that's when the last references to the standard stream objects disappear. Once the reference count reaches zero, Py_DECREF deallocates the objects immediately. But all that matters for the main answer is that the references are removed from the sys module's dictionary and then deallocated sometime later.
[2] Again, this is because the sys module's dictionary is cleared completely before anything is really deallocated, thanks to the attribute caching mechanism. You can run Python with the -vv option to see all the module's attributes being unset before you get the error message about closing the file pointer.
[3] This particular piece of behavior is the only part that doesn't make sense unless you know about the attribute caching mechanism mentioned in the previous footnotes.
I ran into this sort of issue myself today and went looking for an answer. I think a simple workaround here is to ensure you flush stdio first, so python blocks instead of failing during script shutdown. For example:
--- a/testscript.py
+++ b/testscript.py
@@ -9,5 +9,6 @@ sys.excepthook = excepthook
 try:
     for n in range(20):
         print n
+        sys.stdout.flush()
 except:
     pass
Then with this script nothing happens, as the exception (IOError: [Errno 32] Broken pipe) is suppressed by the try...except.
$ python testscript.py | :
$
Your program throws an exception that cannot be caught using a try/except block. To catch it, override sys.excepthook:
import sys
sys.excepthook = lambda *args: None
From documentation:
sys.excepthook(type, value, traceback)
When an exception is raised and uncaught, the interpreter calls
sys.excepthook with three arguments, the exception class, exception
instance, and a traceback object. In an interactive session this
happens just before control is returned to the prompt; in a Python
program this happens just before the program exits. The handling of
such top-level exceptions can be customized by assigning another
three-argument function to sys.excepthook.
Illustrative example:
import sys
import traceback
import logging

def log_uncaught_exceptions(exception_type, exception, tb):
    logging.critical(''.join(traceback.format_tb(tb)))
    logging.critical('{0}: {1}'.format(exception_type, exception))

sys.excepthook = log_uncaught_exceptions
I realize that this is an old question, but I found it in a Google search for the error. In my case it was a coding error. One of my last statements was:
print "Good Bye"
The solution was simply fixing the syntax to:
print ("Good Bye")
[Raspberry Pi Zero, Python 2.7.9]

Catching a python app before it exits

I have a python app which is supposed to be very long-lived, but sometimes the process just disappears and I don't know why. Nothing gets logged when this happens, so I'm at a bit of a loss.
Is there some way in code I can hook in to an exit event, or some other way to get some of my code to run just before the process quits? I'd like to log the state of memory structures to better understand what's going on.
atexit is pronounced "at exit". The first time I read that function name, I read it as "a texit", which doesn't make nearly as much sense.
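For the actual question: a handler registered with atexit runs when the interpreter exits normally or after an unhandled exception, though not if the process is killed outright (e.g. SIGKILL) or the interpreter crashes. A minimal sketch, where dump_state is a hypothetical stand-in for whatever logging you need:

import atexit

def dump_state():
    # log your memory structures here before the interpreter goes away
    print 'process exiting, dumping state'

atexit.register(dump_state)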
You might try running your application directly from a console (cmd on windows, sh/bash/etc on unix), so you can see any stack trace, etc printed to the console when the process dies.
I'm not sure if you are able to modify the source code, but if so you might want to try this:
import sys

def debugexcept(type, value, tb):
    if hasattr(sys, 'ps1') or not (sys.stderr.isatty() and sys.stdin.isatty()) or type == SyntaxError:
        sys.__excepthook__(type, value, tb)
    else:
        import traceback, pdb
        traceback.print_exception(type, value, tb)
        print
        pdb.pm()

sys.excepthook = debugexcept
If you launch your Python program from a command line, you should be dumped into the Python debugger when it dies, assuming something 'bad' has happened to cause an exception. I'm guessing maybe stderr/stdout have been captured and you're not seeing the exception?
I.e. search for something like:
sys.stdout = open('stdout.log', 'w')
sys.stderr = open('stderr.log', 'w')
If the process is dying without an exception at all, then that might be harder to find. One (very hard) way on Windows would be to use something like WinDbg to attach to the process and set a breakpoint in the CRT at some relevant spot.
Good luck!

Python's subprocess.Popen returns the same stdout even though it shouldn't

I'm having a very strange issue with Python's subprocess.Popen. I'm using it to call an external exe several times and keep the output in a list.
Every time you call this external exe, it will return a different string. However, if I call it several times using Popen, it will always return the SAME string. =:-O
It looks like Popen is always returning the same value from stdout, without re-running the exe. Maybe it is doing some sort of caching without actually calling the exe again.
This is my code:
def get_key():
    from subprocess import Popen, PIPE
    args = [C_KEY_MAKER, '/26', USER_NAME, ENCRYPTION_TEMPLATE, '0', ]
    process = Popen(args, stdout=PIPE)
    output = process.communicate()[0].strip()
    return output

if __name__ == '__main__':
    print get_key()  # Returns a certain string
    print get_key()  # Should return another string, but returns the same!
What on Earth am I doing wrong?!
It is possible (if C_KEY_MAKER's random behaviour is based on the current time in seconds, or similar) that when you run it twice on the command line, the time has changed in between runs and so you get a different output, but when python runs it, it runs it twice in such quick succession that the time hasn't changed and so it returns the same value twice in a row.
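If that is the cause, inserting a short delay between the calls should make the outputs differ again; a quick check (the two-second sleep is arbitrary):

import time

print get_key()   # first call
time.sleep(2)     # give a time-based seed a chance to change
print get_key()   # should now differ if timing was the culprit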
Nothing. That works fine in my own tests (aside from the indentation error at the bottom). The problem is either in your exe or elsewhere.
To clarify, I created a Python program tfile.py:
cat > tfile.py
#!/usr/bin/env python
import random
print random.random()
I then altered the program to get rid of the indentation problem at the bottom, and to call tfile.py. It did give two different results.
I don't know what is going wrong with your example and I cannot replicate this behaviour; however, try a more by-the-book approach:
def get_key():
    from subprocess import Popen, PIPE
    args = [C_KEY_MAKER, '/26', USER_NAME, ENCRYPTION_TEMPLATE, '0', ]
    output = Popen(args, stdout=PIPE).stdout
    data = output.read().strip()
    output.close()
    return data
Your code is not executable as is so it's hard to help you out much. Consider fixing indentation and syntax and making it self-contained, so that we can give it a try.
On Linux, it seems to work fine according to Devin Jeanpierre.
