Reconstructing or catching python errors from subprocess

I'm currently trying to get attributes of a Python stack trace and error, either purely from stderr or by intercepting a script and catching its exception. Either way is fine. This may be an XY problem, so alternatives are welcome.
I am currently writing a program that redirects its arguments to call another Python script:
import sys, subprocess
pyargs = sys.argv[1:] # Suppose pyargs is something like ["a.py", "123"]
result = subprocess.run(["python"] + pyargs, stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)
error = result.stderr # do something with this
Approach 1: Parsing dump
I'm not sure whether there is any existing method that supports either of the two plans. I know that the python error dump is displayed in a very formulaic way:
import sys, traceback
exc_type, exc_value, exc_traceback = sys.exc_info() # shows formatting
traceback.print_exc() # prints the stderr of what we wanted.
And so I thought perhaps we could parse the stderr and see what happens.
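A minimal sketch of Approach 1, assuming the child is a Python process whose traceback ends with a single "Type: message" line. The helper name last_exception_line and the one-line child program are made up for illustration; chained exceptions, multi-line messages, and non-Python children would need more care.

```python
import subprocess
import sys


def last_exception_line(stderr_text):
    """Return (type_name, message) parsed from the final line of a traceback dump."""
    lines = [ln for ln in stderr_text.splitlines() if ln.strip()]
    if not lines:
        return None
    # The last line of a standard traceback looks like "ValueError: bad input".
    type_name, _, message = lines[-1].partition(": ")
    return type_name, message


# Hypothetical child that fails with a known error.
child = [sys.executable, "-c", "raise ValueError('bad input')"]
result = subprocess.run(child, stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE, text=True)
parsed = last_exception_line(result.stderr)
print(parsed)
```

This only recovers text, not the exception object, which is the main limitation of parsing the dump.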
Approach 2: Catching exception
I know that we can run subprocess with subprocess.run(..., check=True) to get a subprocess.CalledProcessError, but since the subprocess can be anything (not just a Python process), we can't intercept the actual exception that occurred.
One issue with loading in the python file as a module to this file, is that we cannot supply the necessary arguments, unless we can somehow spoof the sys.argv[] or argparse that's running from the underlying script. Scripts that may have if __name__ == "__main__": will not execute either unless we spoof that too.
Any ideas?

I prefer approach #2. Whenever possible, I avoid using subprocess or system to call Python from Python.
Changing the value of sys.argv is easy - you just overwrite it using regular assignment.
Tricking the __name__ == "__main__" conditional is harder. Unfortunately, it probably means that you can't simply import the module in the usual way. According to How to import a module as __main__?, you can use the runpy module to execute your module and supply whatever value of __name__ you like.
import runpy
import sys
sys.argv = ["C:/", "foo", "bar"]
try:
    runpy.run_module("yourmodulename", {}, "__main__")
except Exception as e:
    print("caught exception:", e)
Now you can catch any uncaught exceptions that your module throws, and handle them however you like within the except.
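As a rough sketch of how the caught exception could then be inspected, the following uses runpy.run_path (the file-path counterpart of run_module) and the traceback module. The run_script helper and the temporary a.py script are hypothetical, written only to keep the example self-contained. Note that `except Exception` does not catch SystemExit, so a script that calls sys.exit() would need separate handling.

```python
import os
import runpy
import sys
import tempfile
import traceback


def run_script(path, args):
    """Run `path` as __main__ with the given argv; return the exception or None."""
    saved_argv = sys.argv
    sys.argv = [path] + list(args)
    try:
        runpy.run_path(path, run_name="__main__")
    except Exception as exc:  # SystemExit and KeyboardInterrupt pass through
        return exc
    finally:
        sys.argv = saved_argv  # always restore the caller's argv
    return None


# Hypothetical target script, written to a temp file for the demonstration.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "a.py")
    with open(script, "w") as fh:
        fh.write("import sys\nif __name__ == '__main__':\n    int(sys.argv[1])\n")
    caught = run_script(script, ["abc"])
    frames = traceback.extract_tb(caught.__traceback__)

# The last frame is the line in the script that raised.
print(type(caught).__name__, caught, frames[-1].lineno)
```

Unlike parsing stderr, this gives access to the real exception object, its type, and every frame of the traceback.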

Related

Python subprocess — how to ignore exit code warnings?

I am trying to display the final results.txt file via the default program. I've tried with bare Popen() without run() and got the same effect. The target file opens (in my case through the see command), but after exiting it I receive:
Warning: program returned non-zero exit code #256
Is there any way to ignore it and prevent my program from displaying such warning? I don't care about it because it's the last thing the program does, so I don't want people to waste their time clicking Enter each time...
Code's below:
from subprocess import run, Popen
if filepath[len(filepath)-1] != '/':
    try:
        results = run(Popen(['start', 'results.txt'], shell=True), stdout=None, shell=True, check=False)
    except TypeError:
        pass
else:
    try:
        results = run(Popen(['open', 'results.txt']), stdout=None, check=False)
    except TypeError:
        pass
    except FileNotFoundError:
        try:
            results = run(Popen(['see', 'results.txt']), stdout=None, check=False)
        except TypeError:
            pass
        except FileNotFoundError:
            pass

Your immediate error is that you are mixing subprocess.run with subprocess.Popen. The correct syntax is
y = subprocess.Popen(['command', 'argument'])
or
x = subprocess.run(['command', 'argument'])
but you are incorrectly combining them into, effectively
x = subprocess.run(subprocess.Popen(['command', 'argument']), shell=True)
where the shell=True is a separate bug in its own right (though it will weirdly work on Windows).
What happens is that Popen runs successfully, but then you try to run run on the result, which of course is not a valid command at all.
You want to prefer subprocess.run() over subprocess.Popen in this scenario; the latter is for hand-wiring your own low-level functionality in scenarios where run doesn't offer enough flexibility (such as when you require the subprocess to run in parallel with your Python program as an independent process).
Your approach seems vaguely flawed for Unix-like systems; you probably want to run xdg-open if it's available, otherwise the value of os.environ["PAGER"] if it's defined, else fall back to less, else try more. Some ancient systems also have a default pager called pg.
You will definitely want to add check=True to actually make sure your command fails properly if the command cannot be found, which is the diametrical opposite of what you appear to be asking. With this keyword parameter, Python checks whether the command worked, and will raise an exception if not. (In its absence, failures will be silently ignored, in general.) You should never catch every possible exception; instead, trap just the one you really know how to handle.
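The fallback order described above might be sketched like this. The viewer_command helper is a made-up name, and the Windows branch (going through cmd's start builtin) is an assumption on my part rather than something tested here; a PAGER value that contains extra arguments is simply skipped in this version.

```python
import os
import shutil
import sys


def viewer_command(path):
    """Return an argv list for showing `path`, or None if nothing suitable is found."""
    if sys.platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is the window title.
        return ["cmd", "/c", "start", "", path]
    if sys.platform == "darwin":
        return ["open", path]
    # Unix-like: prefer xdg-open, then $PAGER, then common pagers.
    candidates = ["xdg-open", os.environ.get("PAGER"), "less", "more"]
    for name in candidates:
        if name and shutil.which(name):
            return [name, path]
    return None


cmd = viewer_command("results.txt")
print(cmd)
```

The resulting list can be handed to subprocess.run(cmd, check=True), catching only subprocess.CalledProcessError if a failing viewer should be tolerated.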

Okay, I've achieved my goal with a different approach. I didn't need to handle the exception at all; I did it without the subprocess module.
Question closed, here's the final code (it looks even better):
from os import system
from platform import system as sysname
if sysname() == 'Windows':
    system('start results.txt')
elif sysname() == 'Linux':
    system('see results.txt')
elif sysname() == 'Darwin':
    system('open results.txt')
else:
    pass

Retrieving data from original python file to go to imported python file [duplicate]

This question already has answers here:
What is the best way to call a script from another script? [closed]
(16 answers)
Closed 7 years ago.
I want to run a Python script from another Python script. I want to pass variables like I would using the command line.
For example, I would run my first script that would iterate through a list of values (0,1,2,3) and pass those to the 2nd script script2.py 0 then script2.py 1, etc.
I found Stack Overflow question 1186789 which is a similar question, but ars's answer calls a function, where as I want to run the whole script, not just a function, and balpha's answer calls the script but with no arguments. I changed this to something like the below as a test:
execfile("script2.py 1")
But it is not accepting variables properly. When I print out sys.argv in script2.py, it shows the original command call to the first script: ['C:\script1.py'].
I don't really want to change the original script (i.e. script2.py in my example) since I don't own it.
I figure there must be a way to do this; I am just confused how you do it.

Try using os.system:
os.system("script2.py 1")
execfile is different because it is designed to run a sequence of Python statements in the current execution context. That's why sys.argv didn't change for you.

This is inherently the wrong thing to do. If you are running a Python script from another Python script, you should communicate through Python instead of through the OS:
import script1
In an ideal world, you will be able to call a function inside script1 directly:
for i in range(whatever):
script1.some_function(i)
If necessary, you can hack sys.argv. There's a neat way of doing this using a context manager to ensure that you don't make any permanent changes.
import contextlib
import sys

@contextlib.contextmanager
def redirect_argv(num):
    sys._argv = sys.argv[:]  # keep a copy of the real arguments
    sys.argv = [str(num)]
    try:
        yield
    finally:
        sys.argv = sys._argv  # restore even if the body raises

with redirect_argv(1):
    print(sys.argv)
I think this is preferable to passing all your data to the OS and back; that's just silly.

Ideally, the Python script you want to run will be set up with code like this near the end:
def main(arg1, arg2, etc):
    # do whatever the script does
if __name__ == "__main__":
    main(sys.argv[1], sys.argv[2], sys.argv[3])
In other words, if the module is called from the command line, it parses the command line options and then calls another function, main(), to do the actual work. (The actual arguments will vary, and the parsing may be more involved.)
If you want to call such a script from another Python script, however, you can simply import it and call modulename.main() directly, rather than going through the operating system.
os.system will work, but it is the roundabout (read "slow") way to do it, as you are starting a whole new Python interpreter process each time for no reason.

I think good practice may be something like this:
import subprocess
cmd = 'python script.py'
# universal_newlines=True makes communicate() return str instead of bytes
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, shell=True, universal_newlines=True)
out, err = p.communicate()  # err is None because stderr is not piped
result = out.split('\n')
for lin in result:
    if not lin.startswith('#'):
        print(lin)
According to the documentation:
The subprocess module allows you to spawn new processes, connect to their input/output/error pipes, and obtain their return codes. This module intends to replace several older modules and functions:
os.system
os.spawn*
os.popen*
popen2.*
commands.*
Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.

SubProcess module:
http://docs.python.org/dev/library/subprocess.html#using-the-subprocess-module
import subprocess
subprocess.Popen("script2.py 1", shell=True)
With this, you can also redirect stdin, stdout, and stderr.
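For the original use case (running the second script once per value and collecting what it prints), a sketch along these lines may help. The inline child_code is a stand-in for script2.py, which is not available here; with a real file you would pass its path in place of "-c", child_code. Using sys.executable and a list of arguments avoids the shell entirely.

```python
import subprocess
import sys

# Stand-in for script2.py: a one-liner that echoes its first argument.
child_code = "import sys; print('got', sys.argv[1])"

outputs = []
for value in range(4):
    result = subprocess.run(
        [sys.executable, "-c", child_code, str(value)],
        stdout=subprocess.PIPE,   # capture the child's stdout
        stderr=subprocess.PIPE,   # capture the child's stderr
        text=True,                # decode bytes to str
        check=True,               # raise CalledProcessError on a non-zero exit
    )
    outputs.append(result.stdout.strip())

print(outputs)
```

Each run sees its own sys.argv, so the target script does not need to be modified.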

import subprocess
subprocess.call("python script2.py 1", shell=True)

How to silence "sys.excepthook is missing" error?

NB: I have not attempted to reproduce the problem described below under Windows, or with versions of Python other than 2.7.3.
The most reliable way to elicit the problem in question is to pipe the output of the following test script through : (under bash):
try:
    for n in range(20):
        print n
except:
    pass
I.e.:
% python testscript.py | :
close failed in file object destructor:
sys.excepthook is missing
lost sys.stderr
My question is:
How can I modify the test script above to avoid the error message when the script is run as shown (under Unix/bash)?
(As the test script shows, the error cannot be trapped with a try-except.)
The example above is, admittedly, highly artificial, but I'm running into the same problem sometimes when the output of a script of mine is piped through some 3rd party software.
The error message is certainly harmless, but it is disconcerting to end-users, so I would like to silence it.
EDIT: The following script, which differs from the original one above only in that it redefines sys.excepthook, behaves exactly like the one given above.
import sys
STDERR = sys.stderr
def excepthook(*args):
    print >> STDERR, 'caught'
    print >> STDERR, args
sys.excepthook = excepthook
try:
    for n in range(20):
        print n
except:
    pass

How can I modify the test script above to avoid the error message when the script is run as shown (under Unix/bash)?
You will need to prevent the script from writing anything to standard output. That means removing any print statements and any use of sys.stdout.write, as well as any code that calls those.
The reason this is happening is that you're piping a nonzero amount of output from your Python script to something which never reads from standard input. This is not unique to the : command; you can get the same result by piping to any command which doesn't read standard input, such as
python testscript.py | cd .
Or for a simpler example, consider a script printer.py containing nothing more than
print 'abcde'
Then
python printer.py | python printer.py
will produce the same error.
When you pipe the output of one program into another, the output produced by the writing program gets backed up in a buffer, and waits for the reading program to request that data from the buffer. If the reading program exits without ever reading (as : does), the pipe is broken, and flushing the remaining buffered output when the writing file object is closed fails with an error ("Broken pipe"). This is the root cause of the messages you're seeing.
The specific code that triggers the error is in the C language implementation of Python, which explains why you can't catch it with a try/except block: it runs after the contents of your script has finished processing. Basically, while Python is shutting itself down, it attempts to close stdout, but that fails because there is still buffered output waiting to be read. So Python tries to report this error as it would normally, but sys.excepthook has already been removed as part of the finalization procedure, so that fails. Python then tries to print a message to sys.stderr, but that has already been deallocated so again, it fails. The reason you see the messages on the screen is that the Python code does contain a contingency fprintf to write out some output to the file pointer directly, even if Python's output object doesn't exist.
Technical details
For those interested in the details of this procedure, let's take a look at the Python interpreter's shutdown sequence, which is implemented in the Py_Finalize function of pythonrun.c.
After invoking exit hooks and shutting down threads, the finalization code calls PyImport_Cleanup to finalize and deallocate all imported modules. The next-to-last task performed by this function is removing the sys module, which mainly consists of calling _PyModule_Clear to clear all the entries in the module's dictionary - including, in particular, the standard stream objects (the Python objects) such as stdout and stderr.
When a value is removed from a dictionary or replaced by a new value, its reference count is decremented using the Py_DECREF macro. Objects whose reference count reaches zero become eligible for deallocation. Since the sys module holds the last remaining references to the standard stream objects, when those references are unset by _PyModule_Clear, they are then ready to be deallocated.[1]
Deallocation of a Python file object is accomplished by the file_dealloc function in fileobject.c. This first invokes the Python file object's close method using the aptly-named close_the_file function:
ret = close_the_file(f);
For a standard file object, close_the_file(f) delegates to the C fclose function, which sets an error condition if there is still data to be written to the file pointer. file_dealloc then checks for that error condition and prints the first message you see:
if (!ret) {
    PySys_WriteStderr("close failed in file object destructor:\n");
    PyErr_Print();
}
else {
    Py_DECREF(ret);
}
After printing that message, Python then attempts to display the exception using PyErr_Print. That delegates to PyErr_PrintEx, and as part of its functionality, PyErr_PrintEx attempts to access the Python exception printer from sys.excepthook.
hook = PySys_GetObject("excepthook");
This would be fine if done in the normal course of a Python program, but in this situation, sys.excepthook has already been cleared.[2] Python checks for this error condition and prints the second message as a notification.
if (hook && hook != Py_None) {
    ...
} else {
    PySys_WriteStderr("sys.excepthook is missing\n");
    PyErr_Display(exception, v, tb);
}
After notifying us about the missing excepthook, Python then falls back to printing the exception info using PyErr_Display, which is the default method for displaying a stack trace. The very first thing this function does is try to access sys.stderr.
PyObject *f = PySys_GetObject("stderr");
In this case, that doesn't work because sys.stderr has already been cleared and is inaccessible.[3] So the code invokes fprintf directly to send the third message to the C standard error stream.
if (f == NULL || f == Py_None)
    fprintf(stderr, "lost sys.stderr\n");
Interestingly, the behavior is a little different in Python 3.4+ because the finalization procedure now explicitly flushes the standard output and error streams before builtin modules are cleared. This way, if you have data waiting to be written, you get an error that explicitly signals that condition, rather than an "accidental" failure in the normal finalization procedure. Also, if you run
python printer.py | python printer.py
using Python 3.4 (after putting parentheses on the print statement of course), you don't get any error at all. I suppose the second invocation of Python may be consuming standard input for some reason, but that's a whole separate issue.
[1] Actually, that's a lie. Python's import mechanism caches a copy of each imported module's dictionary, which is not released until _PyImport_Fini runs, later in the implementation of Py_Finalize, and that's when the last references to the standard stream objects disappear. Once the reference count reaches zero, Py_DECREF deallocates the objects immediately. But all that matters for the main answer is that the references are removed from the sys module's dictionary and then deallocated sometime later.
[2] Again, this is because the sys module's dictionary is cleared completely before anything is really deallocated, thanks to the attribute caching mechanism. You can run Python with the -vv option to see all the module's attributes being unset before you get the error message about closing the file pointer.
[3] This particular piece of behavior is the only part that doesn't make sense unless you know about the attribute caching mechanism mentioned in previous footnotes.

I ran into this sort of issue myself today and went looking for an answer. I think a simple workaround here is to ensure you flush stdio first, so python blocks instead of failing during script shutdown. For example:
--- a/testscript.py
+++ b/testscript.py
@@ -9,5 +9,6 @@ sys.excepthook = excepthook
 try:
     for n in range(20):
         print n
+    sys.stdout.flush()
 except:
     pass
Then with this script nothing happens, as the exception (IOError: [Errno 32] Broken pipe) is suppressed by the try...except.
$ python testscript.py | :
$
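The thread targets Python 2.7; as a sketch of the same idea under Python 3 (an assumption on my part, not something the posters tested), the explicit flush raises BrokenPipeError, which can be caught. Redirecting stdout to os.devnull afterwards follows the pattern suggested in the signal module documentation, so the interpreter's own flush at exit has nothing left to fail on. The print_numbers helper is a made-up name.

```python
import os
import sys


def print_numbers(count, stream=None):
    """Write 0..count-1 to stream; return False if the reader went away."""
    stream = sys.stdout if stream is None else stream
    try:
        for n in range(count):
            print(n, file=stream)
        stream.flush()  # force the write now, while we can still catch the error
    except BrokenPipeError:
        return False
    return True


if __name__ == "__main__":
    if not print_numbers(20):
        # Point stdout at devnull so the final flush at shutdown cannot fail again.
        devnull = os.open(os.devnull, os.O_WRONLY)
        os.dup2(devnull, sys.stdout.fileno())
        sys.exit(1)
```

Run as `python3 testscript.py | :`, this should exit quietly with status 1 instead of printing a shutdown error.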

Your program throws an exception that cannot be caught with a try/except block. To catch it, override the sys.excepthook function:
import sys
sys.excepthook = lambda *args: None
From documentation:
sys.excepthook(type, value, traceback)
When an exception is raised and uncaught, the interpreter calls
sys.excepthook with three arguments, the exception class, exception
instance, and a traceback object. In an interactive session this
happens just before control is returned to the prompt; in a Python
program this happens just before the program exits. The handling of
such top-level exceptions can be customized by assigning another
three-argument function to sys.excepthook.
Illustrative example:
import sys
import logging
import traceback  # needed for format_tb below
def log_uncaught_exceptions(exception_type, exception, tb):
    logging.critical(''.join(traceback.format_tb(tb)))
    logging.critical('{0}: {1}'.format(exception_type, exception))
sys.excepthook = log_uncaught_exceptions

I realize that this is an old question, but I found it in a Google search for the error. In my case it was a coding error. One of my last statements was:
print "Good Bye"
The solution was simply fixing the syntax to:
print ("Good Bye")
[Raspberry Pi Zero, Python 2.7.9]

Python error logging

I'd like to find a way to log every error that forces the python interpreter to quit to be saved to a file as well as being printed to the screen. The reason I would like to do this is that I want to keep stats on the types of errors I make while writing code, with an eye towards finding ways to avoid mistakes I make commonly in the future.
I've been attempting to do this by writing a wrapper for the Python interpreter using the subprocess module. Basically, it runs the Python interpreter, captures any output, parses and saves it to a file, prints the output, and uses matplotlib to make some summary figures. However, I'm having a problem getting output from my wrapper script in real time. For example, if the script I'm running is:
import os
import time
for x in range(10):
    print "testing"
    time.sleep(10)
and I'm using subprocess.Popen() with p.communicate(), the wrapper will wait 100 seconds, and then print all of the output. I'd like the wrapper to be as invisible as possible - ideally in this case it would print "testing" once every ten seconds.
If someone could point me towards a good way of doing this, I'd greatly appreciate it.
Thanks!

I believe you can simply replace sys.excepthook with your own function. You can read about it in the Python documentation.
Basically, it allows you to customize what happens when an exception percolates up to the point of forcing the Python interpreter to quit. You use it like this:
import sys
def my_excepthook(type, value, tb):
    # you can log the exception to a file here
    print 'In My Exception Handler'
    # the following line does the default (prints it to err)
    sys.__excepthook__(type, value, tb)
sys.excepthook = my_excepthook
You'll probably also want to look at the traceback module, for formatting the traceback you get.
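To make the "log it to a file" step concrete, here is a small Python 3 sketch, assuming a tab-separated log is good enough for later statistics. The LOG_PATH location and the logging_excepthook name are hypothetical choices for illustration.

```python
import os
import sys
import tempfile
import traceback

# Hypothetical log location; any writable path would do.
LOG_PATH = os.path.join(tempfile.gettempdir(), "uncaught_errors.log")


def logging_excepthook(exc_type, exc_value, tb):
    """Append the error type and location to LOG_PATH, then print as usual."""
    frames = traceback.extract_tb(tb)
    where = "%s:%d" % (frames[-1].filename, frames[-1].lineno) if frames else "?"
    with open(LOG_PATH, "a") as fh:
        fh.write("%s\t%s\t%s\n" % (exc_type.__name__, where, exc_value))
    # Fall back to the default behaviour so the traceback still reaches stderr.
    sys.__excepthook__(exc_type, exc_value, tb)


sys.excepthook = logging_excepthook
```

Because the hook runs inside the failing interpreter, no wrapper process is needed and the script's normal output is not delayed.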
