How to prevent failure of a subprocess from stopping the main process in Python

I wrote a Python script to run a command called "gtdownload" on a bunch of files with multiprocessing. The function "download" is where I am having trouble.
#!/usr/bin/env python
import os, sys, subprocess
from multiprocessing import Pool

def cghub_dnld_file(file1, file2, file3, np):
    <open files>
    <read in lines>
    p = Pool(int(np))
    map_args = [(line.rstrip(), name_lines[i].rstrip(), bar_lines[i].rstrip()) for i, line in enumerate(id_lines)]
    p.map(download_wrapper, map_args)

def download(id, name, bar):
    <check if file has been downloaded, if not download>
    <.....>
    link = "https://cghub.ucsc.edu/cghub/data/analysis/download/" + id
    dnld_cmd = "gtdownload -c ~/.cghub.key --max-children 4 -vv -d " + link + " > gt.out 2>gt.err"
    subprocess.call(dnld_cmd, shell=True)

def download_wrapper(args):
    return download(*args)

def main():
    <read in arguments>
    <...>
    cghub_dnld_file(file1, file2, file3, threads)

if __name__ == "__main__":
    main()
If a file does not exist in the database, gtdownload quits, which also kills my Python job with the following error:
Traceback (most recent call last):
  File "/rsrch1/rists/djiao/bin/cghub_dnld.py", line 102, in <module>
    main()
  File "/rsrch1/rists/djiao/bin/cghub_dnld.py", line 98, in main
    cghub_dnld_file(file1,file2,file3,threads)
  File "/rsrch1/rists/djiao/bin/cghub_dnld.py", line 22, in cghub_dnld_file
    p.map(download_wrapper,map_args)
  File "/rsrch1/rists/apps/x86_64-rhel6/anaconda/lib/python2.7/multiprocessing/pool.py", line 250, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/rsrch1/rists/apps/x86_64-rhel6/anaconda/lib/python2.7/multiprocessing/pool.py", line 554, in get
    raise self._value
OSError: [Errno 2] No such file or directory
The actual error message from gtdownload:
Welcome to gtdownload-3.8.5a.
Ready to download
Communicating with GT Executive ...
Headers received from the client: 'HTTP/1.1 100 Continue
HTTP/1.1 404 Not Found
Date: Tue, 29 Jul 2014 18:49:57 GMT
Server: Apache/2.2.15 (Red Hat and CGHub)
Strict-Transport-Security: max-age=31536000
X-Powered-By: PHP/5.3.3
Content-Length: 669
Connection: close
Content-Type: text/xml
'
Error: You have requested to download a uuid which either does not exist within the system, or has not yet reached the 'live' state. The requested action will not be performed. Please double check the supplied uuid or contact the helpdesk for further assistance.
I would like the script to skip the file that does not exist and start gtdownload on the next one. I tried piping the stderr of subprocess.call and checking for the "error" keyword, but the script stops at the subprocess.call line itself. The same thing happens with os.system.
I made an MCVE (minimal, complete, verifiable example) without multiprocessing, and subprocess did not kill the main process at all. It looks like multiprocessing messes things up, although I ran it with one thread just for testing.
#!/usr/bin/env python
import subprocess
# This is the id that gtdownload had a problem with
id = "df1e073f-4485-4d5f-8659-cd8eaac17329"
link = "https://cghub.ucsc.edu/cghub/data/analysis/download/" + id
dlnd_cmd = "gtdownload -c ~/.cghub.key --max-children 4 -vv -d " + link + " > gt.out 2>gt.err"
print dlnd_cmd
subprocess.call(dlnd_cmd,shell=True)
print "done"
Clearly multiprocessing conflicts with subprocess.call, but it is not clear to me why.

What is the best way to avoid the failure of subprocess killing the main process?
Handle the exception in some appropriate way and move on, of course.
try:
    subprocess.call(dlnd_cmd)
except OSError as e:
    print 'failed to download: {!r}'.format(e)
However, this may not be appropriate here. The kinds of exceptions that subprocess.call raises are usually not transient things that you can just log and work around; if it's not working now, it will continue to not work forever until you fix the underlying problem (a bug in your script, or gtdownload not being installed right, or whatever).
For example, if the code you showed us is your actual code:
dlnd_cmd = "gtdownload -c ~/.cghub.key --max-children 4 -vv -d " + link + " > gt.out 2>gt.err"
subprocess.call(dlnd_cmd)
… then this is guaranteed to raise an OSError for the reason explained in dano's answer: call (without shell=True) will try to take that entire string—spaces, shell-redirection, etc.—as the name of an executable program to find on your $PATH. And there is no such program. So it will raise an OSError(errno.ENOENT). (Which is exactly what you're seeing.) Just logging that doesn't do you any good; it's a good thing that your entire process is exiting, so you can debug that problem.
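That said, once the command itself launches correctly, you can still skip over per-file failures inside the worker so the pool keeps running. Here is a minimal sketch, assuming gtdownload exits with a non-zero status when a uuid does not exist (check gt.err to confirm that on your system):

import subprocess

def download(id, name, bar):
    link = "https://cghub.ucsc.edu/cghub/data/analysis/download/" + id
    dnld_cmd = ("gtdownload -c ~/.cghub.key --max-children 4 -vv -d "
                + link + " > gt.out 2> gt.err")
    try:
        # shell=True so the string, ~ and the redirections are parsed by the shell
        ret = subprocess.call(dnld_cmd, shell=True)
    except OSError as e:
        # gtdownload itself could not be launched; log and move on
        print('could not launch gtdownload: {!r}'.format(e))
        return None
    if ret != 0:
        # assumption: a non-zero exit means "skip this uuid"
        print('skipping {0}: gtdownload exited with {1}'.format(id, ret))
        return None
    return id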

subprocess.call should not kill the main process. Something else must be wrong with your script, or your conclusions about the script's behaviour are wrong. Did you try printing some trace output after the subprocess call?

You have to use shell=True to use subprocess.call with a string for an argument (and with shell redirection):
subprocess.call(dlnd_cmd, shell=True)
Without shell=True, subprocess tries to treat your entire command string like a single executable name, which of course doesn't exist, and leads to the No such file or directory exception.
See this answer for more info on when to use a string vs. when to use a sequence with subprocess.
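To make the difference concrete, here is a small sketch of both forms; the LINK value stands in for the real download URL:

import os
import subprocess

# String + shell=True: the shell parses ~, spaces and the redirections.
subprocess.call("gtdownload -c ~/.cghub.key -vv -d LINK > gt.out 2> gt.err",
                shell=True)

# Sequence without shell=True: every element is a single argv entry.
# There is no shell, so ~ and redirection must be handled in Python.
with open('gt.out', 'w') as out, open('gt.err', 'w') as err:
    subprocess.call(['gtdownload', '-c', os.path.expanduser('~/.cghub.key'),
                     '-vv', '-d', 'LINK'], stdout=out, stderr=err)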

Related

How to run a bash script from within python and get all the output?

This is a direct clarification question to the answer here, which I thought worked, but it does not!
I have the following test bash script (testbash.sh), which just creates some output and a lot of errors for testing purposes (running on Red Hat Enterprise Linux Server release 7.6 (Maipo) and also Ubuntu 16.04.6 LTS):
export MAX_SEED=2
echo "Start test"
pids=""
for seed in `seq 1 ${MAX_SEED}`
do
    python -c "raise ValueError('test')" &
    pids="${pids} $!"
done
echo "pids: ${pids}"
wait $pids
echo "End test"
If I run this script I get the following output:
Start test
pids: 68322 68323
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ValueError: test
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ValueError: test
[1]- Exit 1 python -c "raise ValueError('test')"
[2]+ Exit 1 python -c "raise ValueError('test')"
End test
That is the expected outcome. That is fine. I want to get errors!
Now here is the python code that is supposed to catch all the output:
from __future__ import print_function
import sys
import time
from subprocess import PIPE, Popen, STDOUT
from threading import Thread
try:
    from queue import Queue, Empty
except ImportError:
    from Queue import Queue, Empty  # python 2.x

ON_POSIX = 'posix' in sys.builtin_module_names

def enqueue_output(out, queue):
    for line in iter(out.readline, b''):
        queue.put(line.decode('ascii'))
    out.close()

p = Popen(['. testbash.sh'], stdout=PIPE, stderr=STDOUT, bufsize=1, close_fds=ON_POSIX, shell=True)
q = Queue()
t = Thread(target=enqueue_output, args=(p.stdout, q))
t.daemon = True  # thread dies with the program
t.start()

# read line without blocking
while t.is_alive():
    #time.sleep(1)
    try:
        line = q.get(timeout=.1)
    except Empty:
        print(line)
        pass
    else:
        # got line
        print(line, end='')

p.wait()
print('returncode = {}'.format(p.returncode))
But when I run this code I only get the following output:
Start test
pids: 70191 70192
Traceback (most recent call last):
returncode = 0
or this output (without the line End test):
Start test
pids: 10180 10181
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ValueError: test
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ValueError: test
returncode = 0
Most of the above output is missing! How can I fix this? Also, I need some way to check whether any command in the bash script did not succeed. In the example this is the case, but the error code printed out is still 0. I expect an error code != 0.
It is not important to immediately get the output. A delay of some seconds is fine. Also if the output order is a bit mixed up this is of no concern. The important thing is to get all the output (stdout and stderr).
Maybe there is a simpler way to just get the output of a bash script which is started from python?
To be run with Python 3:
from __future__ import print_function
import os
import stat
import sys
import time
from subprocess import PIPE, Popen, STDOUT
from threading import Thread
try:
    from queue import Queue, Empty
except ImportError:
    from Queue import Queue, Empty  # python 2.x

ON_POSIX = 'posix' in sys.builtin_module_names
TESTBASH = '/tmp/testbash.sh'

def create_bashtest():
    with open(TESTBASH, 'wt') as file_desc:
        file_desc.write("""#!/usr/bin/env bash
export MAX_SEED=2
echo "Start test"
pids=""
for seed in `seq 1 ${MAX_SEED}`
do
    python -c "raise ValueError('test')" &
    pids="${pids} $!"
    sleep .1 # Wait so that error messages don't get out of order.
done
wait $pids; return_code=$?
sleep 0.2 # Wait for background messages to be processed.
echo "pids: ${pids}"
echo "End test"
sleep 1 # Wait for main process to handle all the output
exit $return_code
""")
    os.chmod(TESTBASH, stat.S_IEXEC|stat.S_IRUSR|stat.S_IWUSR)

def enqueue_output(queue):
    pipe = Popen([TESTBASH], stdout=PIPE, stderr=STDOUT,
                 bufsize=1, close_fds=ON_POSIX, shell=True)
    out = pipe.stdout
    while pipe.poll() is None:
        line = out.readline()
        if line:
            queue.put(line.decode('ascii'))
        time.sleep(.1)
    print('returncode = {}'.format(pipe.returncode))

create_bashtest()
C_CHANNEL = Queue()
THREAD = Thread(target=enqueue_output, args=(C_CHANNEL,))
THREAD.daemon = True
THREAD.start()

while THREAD.is_alive():
    time.sleep(0.1)
    try:
        line = C_CHANNEL.get_nowait()
    except Empty:
        pass  # print("no output")
    else:
        print(line, end='')
Hope this helps.
First, it looks like buffers are not being flushed. Redirecting (and, to be safe, appending) stdout/stderr to a file rather than to the terminal may help. You can always use tee (or tee -a) if you really want both. Using context managers might help.
As far as the zero return code goes, $! may be the culprit:
https://unix.stackexchange.com/questions/386196/doesnt-work-on-command-line
! may be invoking history expansion, with $! thereby resulting in an empty value.
If you somehow end up with just a bare wait, the return code will be zero. Regardless, return codes can be tricky, and you might be picking up a successful return code from elsewhere.
Take a look at the stdbuf command to change the buffer sizes for stdout and stderr:
Is there a way to flush stdout of a running process
That may also help with getting the rest of your expected output.
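For example, a sketch of wrapping the script with stdbuf (GNU coreutils) so the child's output is line-buffered:

from __future__ import print_function
from subprocess import PIPE, Popen, STDOUT

# stdbuf -oL/-eL forces line buffering on the child's stdout/stderr,
# so lines should arrive as they are produced rather than in big blocks.
p = Popen(['stdbuf', '-oL', '-eL', './testbash.sh'],
          stdout=PIPE, stderr=STDOUT)
for line in iter(p.stdout.readline, b''):
    print(line.decode('ascii'), end='')
p.wait()
print('returncode = {}'.format(p.returncode))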
Rewrite the while block this way:
# read line without blocking
while t.is_alive():
    try:
        line = q.get(block=False)
    except Empty:
        # print(line)
        pass
    else:
        # got line
        print(line, end='')
You don't want to block on getting a line from the Queue when there's none, and you don't need a timeout in this case, as it's only used when blocking the thread is required. Consequently, if the Queue.get() throws Empty, there's no line to print, and we just pass.
===
Also, let's clarify the script execution logic.
Since you're using Bash expressions, and the default shell used by Popen is /bin/sh, you'd probably want to rewrite the invocation line this way:
p = Popen(['/usr/bin/bash','-c', './testbash.sh'], stdout=PIPE, stderr=STDOUT, bufsize=1, close_fds=ON_POSIX)
It won't hurt to add a shebang to your shell script, too:
#!/usr/bin/env bash
<... rest of the script ...>
If you're looking for these lines:
[1]- Exit 1 python -c "raise ValueError('test')"
[2]+ Exit 1 python -c "raise ValueError('test')"
This is a function of the bash shell that's typically only available in interactive mode, i.e. when you're typing commands into a terminal. If you check the bash source code, you can see that it explicitly checks the mode before printing to stdout/stderr.
In the more recent versions of bash, you can't set this inside a script: see https://unix.stackexchange.com/a/364618 . However, you can set this yourself when starting the script:
p = Popen(['/bin/bash -i ./testbash.sh'], stdout=PIPE, stderr=STDOUT, bufsize=1, close_fds=ON_POSIX, shell=True)
I will note that this is only working for me on Python 3; Python 2 is only getting part of the output. It isn't clear which version of Python you're using, but considering Python 2 is end of life now, we should probably all be trying to switch to Python 3.
As for the bash script, even with interactive mode set it seems you have to change how you wait to get that output:
#!/bin/bash
export MAX_SEED=2
echo "Start test"
pids=""
for seed in `seq 1 ${MAX_SEED}`
do
    python -c "raise ValueError('test')" &
    pids="${pids} $!"
done
echo "pids: ${pids}"
wait -n $pids
wait -n $pids
ret=$?
echo "End test"
exit $ret
A normal wait wasn't working for me (Ubuntu 18.04), but wait -n seemed to work. However, as it only waits for the next job to complete, I had inconsistent output just calling it once; calling wait -n for each job launched seems to do the trick, but the program flow should probably be refactored to loop over the wait the same number of times you spin up jobs.
Also note that to change the return code of the script, Philippe's answer has the right approach: the $? variable holds the return code of the most recently completed command, which you can then pass to exit. (Yet another difference in Python versions: Python 2 is returning 127 while Python 3 returns 1 for me.) If you need the return values for each job, one way might be to parse out the values in the interactive job exit lines.
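If per-job return codes are what you actually need, an alternative sketch (not what the question does) is to launch the jobs from Python instead of bash, so each return code is available directly:

import subprocess

# Start both jobs, then collect output and exit status per process.
procs = [subprocess.Popen(['python', '-c', "raise ValueError('test')"],
                          stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
         for _ in range(2)]
for p in procs:
    out, _ = p.communicate()
    print('job exited with {0}'.format(p.returncode))
    print(out.decode('ascii'))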
Just guessing: could it be that a line starting with a space is not recognized as a line by your logic? Maybe this indentation is the issue. Another option is that there is a tab or something similar in the output, and the ASCII decode might fail.
This is how I usually use subprocess:
import subprocess
with subprocess.Popen(["./test.sh"], shell=True, stdout=subprocess.PIPE,
                      stdin=subprocess.PIPE, stderr=subprocess.PIPE) as p:
    error = p.stderr.read().decode()
    std_out = p.stdout.read().decode()
    if std_out:
        print(std_out)
    if error:
        print("Error message: {}".format(error))
Here you decode and read both stdout and stderr. You get everything, but not in the same order; I don't know if that's an issue.

How to exit function with signal on Windows?

I have the following code written in Python 2.7 on Windows. I want to check for updates for the current Python script and, if there is an update, fetch the new version from an FTP server (preserving the filename), then execute the new script after terminating the current one via os.kill with SIGTERM.
I went with the exit-function approach, but I read that on Windows this only works with the atexit library and default Python exit methods. So I used a combination of atexit.register() and the signal handler.
***necessary libraries***

filematch = 'test.py'
version = '0.0'
checkdir = os.path.abspath(".")
dircontent = os.listdir(checkdir)
r = StringIO()

def exithandler():
    try:
        try:
            if filematch in dircontent:
                os.remove(checkdir + '\\' + filematch)
        except Exception as e:
            print e
        ftp = FTP(ip address)
        ftp.login(username, password)
        ftp.cwd('/Test')
        for filename in ftp.nlst(filematch):
            fhandle = open(filename, 'wb')
            ftp.retrbinary('RETR ' + filename, fhandle.write)
            fhandle.close()
        subprocess.Popen([sys.executable, "test.py"])
        print 'Test file successfully updated.'
    except Exception as e:
        print e

ftp = FTP(ip address)
ftp.login(username, password)
ftp.cwd('/Test')
ftp.retrbinary('RETR version.txt', r.write)

if(r.getvalue() != version):
    atexit.register(exithandler)
    somepid = os.getpid()
    signal.signal(SIGTERM, lambda signum, stack_frame: exit(1))
    os.kill(somepid, signal.SIGTERM)

print 'Successfully replaced and started the file'
Using the:
signal.signal(SIGTERM, lambda signum, stack_frame: exit(1))
I get:
Traceback (most recent call last):
  File "C:\Users\STiX\Desktop\Python Keylogger\test.py", line 50, in <module>
    signal.signal(SIGTERM, lambda signum, stack_frame: exit(1))
NameError: name 'SIGTERM' is not defined
But I get the job done without a problem, except when I use the current code in a more complex script, where it gives me the same error but terminates right away for some reason.
On the other hand, if I use it the correct way, signal.SIGTERM, the process goes straight to termination and the exit function is never executed. Why is that?
How can I make this work on Windows and get the outcome that I described above successfully?
What you are trying to do seems a bit complicated (and dangerous from an infosec perspective ;-). I would suggest handling the reload-file-when-updated part of the functionality by adding a controller class that imports the Python script you have now as a module, starts it, and then reloads it when it is updated (based on a function return or another technique); look this way for inspiration: https://stackoverflow.com/a/1517072/1010991
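A minimal sketch of that controller idea; myscript, its run() entry point, and update_available() are all hypothetical names:

import importlib
import time

import myscript  # hypothetical: the current script refactored into a module

def update_available():
    # Hypothetical check, e.g. compare a version file on the FTP server
    # against a myscript.VERSION constant.
    return False

while True:
    myscript.run()                  # assumes the module exposes run()
    if update_available():
        importlib.reload(myscript)  # plain reload(myscript) on Python 2
    time.sleep(60)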
Edit - what about exe?
Another hacky technique for manipulating the file of the currently running program would be the shell ping trick. It can be used from all programming languages. The trick is to send a shell command that is not executed until after the calling process has terminated. Use ping to cause the delay and chain the other commands with &. For your use case it could be something like this:
import subprocess
subprocess.Popen("ping -n 2 -w 2000 1.1.1.1 > Nul & del hack.py & rename hack_temp.py hack.py & hack.py ", shell=True)
Edit 2 - Alternative solution to original question
Since Python does not block write access to the currently running script, an alternative concept to solve the original question would be:
import subprocess
print "hello"
a = open(__file__,"r")
running_script_as_string = a.read()
b = open(__file__,"w")
b.write(running_script_as_string)
b.write("\nprint 'updated version of hack'")
b.close()
subprocess.Popen("python hack.py")

Trouble with subprocess.check_output()

I'm having some strange issues using subprocess.check_output(). At first I was just using subprocess.call() and everything was working fine. However when I simply switch out call() for check_output(), I receive a strange error.
Before code (works fine):
def execute(hosts):
    ''' Using psexec, execute the batch script on the list of hosts '''
    successes = []
    wd = r'c:\\'
    file = r'c:\\script.exe'
    for host in hosts:
        res = subprocess.call(shlex.split(r'psexec \\\\%s -e -s -d -w %s %s' % (host, wd, file)))
        if res.... # Want to check the output here
            successes.append(host)
    return successes
After code (doesn't work):
def execute(hosts):
    ''' Using psexec, execute the batch script on the list of hosts '''
    successes = []
    wd = r'c:\\'
    file = r'c:\\script.exe'
    for host in hosts:
        res = subprocess.check_output(shlex.split(r'psexec \\\\%s -e -s -d -w %s %s' % (host, wd, file)))
        if res.... # Want to check the output here
            successes.append(host)
    return successes
This gives an error, which I couldn't capture here because the program hangs at that point and I can't Ctrl-C out. Any ideas why this is happening? What's the difference between subprocess.call() and check_output() that could be causing this?
Here is the additional code including the multiprocessing portion:
PROCESSES = 2
host_sublists_execute = [.... list of hosts ... ]
poolE = multiprocessing.Pool(processes=PROCESSES)
success_executions = poolE.map(execute,host_sublists_execute)
success_executions = [entry for sub in success_executions for entry in sub]
poolE.close()
poolE.join()
Thanks!
You are encountering Python Issue 9400.
There is a key distinction you have to understand about subprocess.call() vs subprocess.check_output(). subprocess.call() will execute the command you give it, then provide you with the return code. On the other hand, subprocess.check_output() returns the program's output to you in a string, but it tries to do you a favor and check the program's return code and will raise an exception (subprocess.CalledProcessError) if the program did not execute successfully (returned a non-zero return code).
When you call pool.map() with a multiprocessing pool, it will try to propagate exceptions in the subprocesses back to main and raise an exception there. Apparently there is an issue with how the subprocess.CalledProcessError exception class is defined, so the pickling fails when the multiprocessing library tries to propagate the exception back to main.
The program you're calling is returning a non-zero return code, which makes subprocess.check_output() raise an exception, and pool.map() can't handle it properly, so you get the TypeError that results from the failed attempt to retrieve the exception.
As a side note, the definition of subprocess.CalledProcessError must be really screwed up, because if I open my 2.7.6 terminal, import subprocess, and manually raise the error, I still get the TypeError, so I don't think it's merely a pickling problem.
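One way around it is to catch the exception inside the worker so that only plain, picklable values cross the process boundary back to the pool. A sketch, with the psexec command abbreviated to a placeholder:

import subprocess

def execute_one(host):
    # Placeholder command; the real one is built with shlex.split above.
    cmd = ['psexec', r'\\%s' % host, 'c:\\script.exe']
    try:
        out = subprocess.check_output(cmd)
    except subprocess.CalledProcessError as e:
        # Return plain data instead of letting the exception propagate;
        # tuples of strings and ints pickle cleanly across processes.
        return (host, e.returncode, e.output)
    return (host, 0, out)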

How to check results of wget/urllib2 in Python?

I am brand new to Python and am trying to write a monitor to determine whether a Java web app (WAR file) running on localhost (hosted by Apache Tomcat) is running or not. I had earlier devised a script that ran:
ps -aef | grep myWebApp
And inspected the results of the grep to see if any process IDs came back in those results.
But it turns out that the host OS only sees the Tomcat process, not any web apps Tomcat is hosting. I next tried to see if Tomcat came with any kind of CLI that I could hit from the terminal, and it looks like the answer is no.
Now, I'm thinking of using wget or maybe even urllib2 to determine if my web app is running by having them hit http://localhost:8080/myWebApp and checking the results. Here is my best attempt with wget:
wgetCmd = "wget http://localhost:8080/myWebApp"
wgetResults = subprocess.check_output([wgetCmd], shell=True, stderr=subprocess.STDOUT)
for line in wgetResults.strip().split('\n'):
    if 'failed' in line:
        print "\nError: myWebApp is not running."
        sys.exit()
My thinking here is that, if the web app isn't running, wget's output should always contain the word "failed" inside of it (at least, from my experience). Unfortunately, when I run this, I get the following error:
Traceback (most recent call last):
  File "/home/myUser/mywebapp-mon.py", line 52, in <module>
    main()
  File "/home/myUser/mywebapp-mon.py", line 21, in main
    wgetResults = subprocess.check_output([wgetCmd], shell=True, stderr=subprocess.STDOUT)
  File "/usr/lib/python2.7/subprocess.py", line 544, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['wget http://localhost:8080/myWebApp']' returned non-zero exit status 4
Any thoughts as to what's going on here (what the error is)? Also, and more importantly, am I going about this the wrong way? Thanks in advance!
I suggest you try the Requests module. It is much more user-friendly than wget or urllib. Try something like this:
>>> import requests
>>> r = requests.get('http://localhost:8080/myWebApp')
>>> r.status_code
200
>>> r.text
'Some text of your webapp'
EDIT: installation instructions: http://docs.python-requests.org/en/latest/user/install/
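Building on that, a sketch of a check function for the monitor (the URL is the question's; the timeout value is an assumption):

import requests

def webapp_is_running(url='http://localhost:8080/myWebApp'):
    # Treat connection failures and non-2xx statuses as "not running".
    try:
        return requests.get(url, timeout=5).ok
    except requests.exceptions.RequestException:
        return False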
To check url:
import urllib2

def check(url):
    try:
        urllib2.urlopen(url).read()
    except EnvironmentError:
        return False
    else:
        return True
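The monitor from the question could then use it like this:

import sys

if not check('http://localhost:8080/myWebApp'):
    print "\nError: myWebApp is not running."
    sys.exit(1)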
To investigate what kind of an error occurred you could look at the exception instance.
You can also build on urllib2/requests and interact with Tomcat's manager service, if it's installed. By using the list method you can receive the following information:
OK - Listed applications for virtual host localhost
/webdav:running:0
/examples:running:0
/manager:running:0
/:running:0
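A sketch of calling that list method over HTTP; the manager path varies by Tomcat version (/manager/list on Tomcat 6, /manager/text/list on 7+), and the credentials here are placeholders for a user with the manager role:

import base64
import urllib2

req = urllib2.Request('http://localhost:8080/manager/text/list')
req.add_header('Authorization',
               'Basic ' + base64.b64encode('admin:password'))
# Each line of the response looks like "/myWebApp:running:0"
print urllib2.urlopen(req).read()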

Exit code when python script has unhandled exception

I need a method to run a Python script file, and if the script fails with an unhandled exception, Python should exit with a non-zero exit code. My first try was something like this:
import sys

if __name__ == '__main__':
    try:
        import <unknown script>
    except:
        sys.exit(-1)
But it breaks a lot of scripts, due to the __main__ guard that is often used. Any suggestions for how to do this properly?
Python already does what you're asking:
$ python -c "raise RuntimeError()"
Traceback (most recent call last):
File "<string>", line 1, in <module>
RuntimeError
$ echo $?
1
After some edits from the OP, perhaps you want:
import subprocess

proc = subprocess.Popen(['/usr/bin/python', 'script-name'])
proc.communicate()
if proc.returncode != 0:
    pass  # Run failure code
else:
    pass  # Run happy code.
Correct me if I am confused here.
If you want to run a script within a script, then import isn't the way; you could use exec if you only care about catching exceptions:
namespace = {}
f = open("script.py", "r")
code = f.read()
try:
    exec code in namespace
except Exception:
    print "bad code"
You can also compile the code first with
compile(code, '<string>', 'exec')
if you are planning to execute the script more than once, and exec the result in the namespace,
or use subprocess as described above, if you need to grab the output generated by your script.
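A sketch of that compile-once, execute-many idea, assuming a script.py exists next to this code:

# Compile once, then execute the same code object several times,
# each run in a fresh namespace.
with open("script.py", "r") as f:
    code_obj = compile(f.read(), "script.py", "exec")

for attempt in range(3):
    namespace = {}
    try:
        exec(code_obj, namespace)  # this form works on Python 2 and 3
    except Exception as e:
        print("bad code: {!r}".format(e))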
