I have this code. Basically, I am using subprocess to execute a program several times in a while loop. It works fine, but after several iterations (5, to be precise) my Python script just terminates, even though it still has a long way to go before finishing.
import shlex
import subprocess

while x < 50:
    # ///////////I am doing things here/////////////////////
    cmdline = 'gmx mdrun -ntomp 1 -v -deffnm my_sim'
    args = shlex.split(cmdline)
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output = proc.communicate()[0].decode()
    # ///////////I am doing things here/////////////////////
    x += 1
Each time I call the program, it takes approximately one hour to finish. In the meantime, subprocess should wait, because depending on the output I must execute parts of my code (that is why I am using .communicate()).
Why is this happening?
Thanks for the help in advance!
A subprocess runs asynchronously in the background (since it's a different process), and you need to call wait() on the Popen object to wait for it to finish. Since you have multiple subprocesses, you'll likely want to wait on all of them, like this:
exit_codes = [p.wait() for p in (p1, p2)]
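A fuller sketch of that pattern (the sleep commands are placeholders, not from the question):

# Sketch: launch several children concurrently, then wait for all of them.
import subprocess

p1 = subprocess.Popen(['sleep', '2'])
p2 = subprocess.Popen(['sleep', '3'])
exit_codes = [p.wait() for p in (p1, p2)]  # blocks until both have exited
print(exit_codes)  # e.g. [0, 0]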
To solve this problem I suggest doing the following:
while x < 50:
    # ///////////I am doing things here/////////////////////
    cmdline = 'gmx mdrun -ntomp 1 -v -deffnm my_sim 2>&1 | tee output.txt'
    proc = subprocess.check_output(cmdline, shell=True)
    with open('output.txt', 'r') as fin:
        out_file = fin.read()
    # ///////////Do whatever you need with out_file/////////////
    # ///////////I am doing things here/////////////////////
    x += 1
I know it is not recommended to use shell=True, so if you do not want to use it, just pass the command as a list of separate arguments instead. Be aware that you might get an error when passing it that way; I do not want to go into details here, but in that case you can fall back to shell=True and your problem will be gone.
Using the piece of code I just provided, your Python script will not terminate abruptly when using subprocess many times with programs that produce a lot of stdout and stderr output.
It took some time to discover this, and I hope I can help someone out there.
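If you would rather avoid shell=True entirely, a minimal sketch of the same idea is to hand the child a real file object for stdout, so no OS pipe buffer can ever fill up (the file name output.txt is carried over from the code above):

# Sketch: write the child's output straight to a file instead of a pipe,
# then read the file back; no pipe buffer means no deadlock.
import shlex
import subprocess

cmdline = 'gmx mdrun -ntomp 1 -v -deffnm my_sim'
with open('output.txt', 'w') as fout:
    subprocess.check_call(shlex.split(cmdline),
                          stdout=fout, stderr=subprocess.STDOUT)
with open('output.txt') as fin:
    out_file = fin.read()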
Related
I'm new to Python and would like to open a Windows cmd prompt, start a process, leave the process running, and then issue commands to that same running process.
The commands will change, so I can't just include them in the cmdline variable below. Also, the process takes 10-15 seconds to start, so I don't want to waste time waiting for the process to start and run commands each time. I just want to start the process once and run quick commands as needed in the same process.
I was hoping to use subprocess.Popen to make this work, though I am open to better methods. Note that my process to run is not cmd; I'm just using it as an example.
import subprocess
cmdline = ['cmd', '/k']
cmd = subprocess.Popen(cmdline, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
cmd.stdin.write("echo hi") #would like this to be written to the cmd prompt
print cmd.stdout.readline() #would like to see 'hi' readback
cmd.stdin.write("echo hi again") #would like this to be written to the cmd prompt
print cmd.stdout.readline() #would like to see 'hi again' readback
The results aren't what I expect. It seems the stdin.write calls aren't actually getting through, and readline freezes up with nothing to read.
I have tried popen.communicate() instead of write/readline, but it kills the process. I have also tried setting bufsize in the Popen call, but that didn't make much difference.
Your comments suggest that you are confusing command-line arguments with input via stdin. Namely, the fact that the system-console.exe program accepts a script=filename parameter does not imply that you can send it the same string as a command via stdin; e.g., the python executable accepts -c "print(1)" as command-line arguments, but it is a SyntaxError if you type the same thing as a command into the Python shell.
Therefore, the first step is to use the correct syntax. Suppose the system-console.exe accepts a filename by itself:
#!/usr/bin/env python3
import time
from subprocess import Popen, PIPE

with Popen(r'C:\full\path\to\system-console.exe -cli -',
           stdin=PIPE, bufsize=1, universal_newlines=True) as shell:
    for _ in range(10):
        print('capture.tcl', file=shell.stdin, flush=True)
        time.sleep(5)
Note: if you've redirected more than one stream, e.g., both stdin and stdout, then you should read/write both streams concurrently (e.g., using multiple threads); otherwise it is very easy to deadlock your program.
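For illustration, a minimal sketch of the threaded approach (the spawned program some-console-app and its commands are placeholders, not from the question):

# Sketch: drain stdout on a background thread while writing to stdin,
# so neither pipe buffer can fill up and deadlock the child.
import threading
from subprocess import Popen, PIPE

def drain(pipe, sink):
    for line in iter(pipe.readline, ''):
        sink.append(line)
    pipe.close()

captured = []
with Popen(['some-console-app'], stdin=PIPE, stdout=PIPE,
           universal_newlines=True, bufsize=1) as proc:
    reader = threading.Thread(target=drain, args=(proc.stdout, captured))
    reader.start()
    proc.stdin.write('first command\n')
    proc.stdin.flush()
    proc.stdin.close()  # signal EOF so the child can exit
    reader.join()
print(''.join(captured))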
Related:
Q: Why not just use a pipe (popen())? -- mandatory reading for the Unix environment, but it might also be applicable to some programs on Windows
subprocess readline hangs waiting for EOF -- a code example of how to pass multiple inputs and read multiple outputs using the subprocess and pexpect modules.
The second and following steps might have to deal with buffering issues on the side of the child process (out of your hands on Windows), with whether system-console allows its stdin/stdout to be redirected or works with a console directly, and with character encoding issues (how the various commands in the pipeline encode text).
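As a hedged illustration of the encoding point (cp437 is only an example OEM code page; the real console may use another):

# Sketch: decode child output with an explicit encoding instead of
# relying on the locale default; 'cp437' is just an example code page.
import subprocess

out = subprocess.check_output(['cmd', '/c', 'dir'])
text = out.decode('cp437', errors='replace')
print(text)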
Here is some code that I tested and that works on Windows 10, Quartus Prime 15.1, and Python 3.5:
import subprocess

class altera_system_console:
    def __init__(self):
        sc_path = r'C:\altera_lite\15.1\quartus\sopc_builder\bin\system-console.exe --cli --disable_readline'
        self.console = subprocess.Popen(sc_path, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    def read_output(self):
        # Read one byte at a time until the Tcl prompt '% ' appears.
        rtn = ""
        loop = True
        i = 0
        match = '% '
        while loop:
            out = self.console.stdout.read1(1)
            if bytes(match[i], 'utf-8') == out:
                i = i + 1
                if i == len(match):
                    loop = False
            else:
                rtn = rtn + out.decode('utf-8')
        return rtn

    def cmd(self, cmd_string):
        # Send one command plus newline, and flush so it reaches the child.
        self.console.stdin.write(bytes(cmd_string + '\n', 'utf-8'))
        self.console.stdin.flush()
c = altera_system_console()
print(c.read_output())
c.cmd('set jtag_master [lindex [get_service_paths master] 0]')
print(c.read_output())
c.cmd('open_service master $jtag_master')
print(c.read_output())
c.cmd('master_write_8 $jtag_master 0x00 0xFF')
print(c.read_output())
You need to use iter if you want to see the output in real time:
import subprocess

cmdline = ['cmd', '/k']
cmd = subprocess.Popen(cmdline, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
cmd.stdin.write("echo hi\n")  # would like this to be written to the cmd prompt
cmd.stdin.flush()  # make sure the line actually reaches the child
for line in iter(cmd.stdout.readline, ""):
    print line
cmd.stdin.write("echo hi again\n")  # would like this to be written to the cmd prompt
cmd.stdin.flush()
I'm not sure exactly what you are trying to do, but if you want to input certain data when you get certain output, then I would recommend using pexpect.
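For illustration, a minimal pexpect sketch (the spawned program some-repl and its '> ' prompt are assumptions, not from the question; note that pexpect.spawn is Unix-only, so on Windows you would need a variant such as pexpect.popen_spawn):

# Sketch: wait for a prompt, send a command, read what came back.
# 'some-repl' and the '> ' prompt are placeholders.
import pexpect

child = pexpect.spawn('some-repl')
child.expect('> ')  # block until the prompt appears
child.sendline('echo hi')
child.expect('> ')  # wait for the next prompt
print(child.before.decode())  # everything printed before the prompt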
As part of an evaluation, I want to measure and compare the user+system runtime of different diff tools.
As a first approach, I thought about calling the particular tools with time -f (GNU time). Since the rest of the evaluation is done by a bunch of Python scripts, I want to realise it in Python as well.
The time output is formatted as follows:
<some error message>
user 0.4
sys 0.2
The output of the diff tool is redirected to sed to get rid of unneeded output, and the output of sed is then processed further. (The use of sed is deprecated for my example; see Edit 2.)
A call from within a shell would look like this (it removes lines starting with "Binary"):
$ time -f "user %U\nsys %S\n" diff -r -u0 dirA dirB | sed -e '/^Binary.*/d'
Here is my approach so far:
import subprocess

diffcommand = ["time", "-f", "user %U\nsys %S\n", "diff", "-r", "-u0", "testrepo_1/A/rev", "testrepo_1/B/rev"]
sedcommand = ["sed", "-e", "/^Binary.*/d"]

# Execute command as subprocess
diff = subprocess.Popen(diffcommand, stderr=subprocess.PIPE, stdout=subprocess.PIPE)

# Calculate runtime
runtime = 0.0
for line in diff.stderr.readlines():
    current = line.split()
    if current:
        if current[0] == "user" or current[0] == "sys":
            runtime = runtime + float(current[1])
print "Runtime: " + str(runtime)
# Pipe to "sed"
sedresult = subprocess.check_output(sedcommand, stdin=diff.stdout)
# Wait for the subprocesses to terminate
diff.wait()
However, it feels like this is not clean (especially from an OS point of view). It also leads to the script getting stuck in the readlines part under certain circumstances that I haven't been able to pin down yet.
Is there a cleaner (or better) way to achieve what I want?
Edit 1
Changed head line and gave a more detailed explanation
Edit 2
Thanks to J.F. Sebastian, I had a look at os.wait4(...) (information taken from his answer). But since I am interested in the output, I had to implement it a bit differently.
My code now looks like this:
diffprocess = subprocess.Popen(diffcommand,stdout=subprocess.PIPE)
runtimes = os.wait4(diffprocess.pid,0)[2]
runtime = runtimes.ru_utime + runtimes.ru_stime
diffresult = diffprocess.communicate()[0]
Note that I do not pipe the result to sed any more (I decided to do the trimming within Python).
The runtime measurement works fine for some test cases, but the execution gets stuck sometimes. Removing the runtime measurement helps the program terminate, as does sending stdout to DEVNULL (as suggested here). Could I have a deadlock? (valgrind --tool=helgrind did not find anything.) Is there something fundamentally wrong in my approach?
but the execution gets stuck sometimes.
If you use stdout=PIPE, then something should read the output while the process is still running; otherwise the child process will hang as soon as its stdout OS pipe buffer fills up (~65K on my machine).
import os
from subprocess import Popen, PIPE

p = Popen(diffcommand, stdout=PIPE, bufsize=-1)
with p.stdout:
    output = p.stdout.read()  # drain stdout first, so the pipe can never fill up
ru = os.wait4(p.pid, 0)[2]  # then reap the child and collect its resource usage
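If the output itself isn't needed, the DEVNULL variant mentioned in the question avoids the pipe entirely (a sketch, assuming only the timings matter; DEVNULL requires Python 3.3+):

# Sketch: discard the child's output so no pipe can fill up,
# then collect resource usage with os.wait4.
import os
from subprocess import Popen, DEVNULL

p = Popen(diffcommand, stdout=DEVNULL)
ru = os.wait4(p.pid, 0)[2]
runtime = ru.ru_utime + ru.ru_stime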
Is there a way that I can execute a shell program from Python, which prints its output to the screen, and read its output to a variable without displaying anything on the screen?
This sounds a little bit confusing, so maybe I can explain it better with an example.
Let's say I have a program that prints something to the screen when executed
bash> ./my_prog
"Hello World"
When I want to read the output into a variable in Python, I read that a good approach is to use the subprocess module like so:
my_var = subprocess.check_output("./my_prog", shell=True)
With this construct, I can get the program's output into my_var (here, "Hello World"); however, it is also printed to the screen when I run the Python script. Is there any way to suppress this? I couldn't find anything in the subprocess documentation, so maybe there is another module I could use for this purpose?
EDIT:
I just found out that commands.getoutput() lets me do this. But is there also a way to achieve similar effects in subprocess? Because I was planning to make a Python3 version at some point.
EDIT2: Particular Example
Excerpt from the python script:
oechem_utils_path = "/soft/linux64/openeye/examples/oechem-utilities/"\
    "openeye/toolkits/1.7.2.4/redhat-RHEL5-g++4.3-x64/examples/"\
    "oechem-utilities/"
rmsd_path = oechem_utils_path + "rmsd"
for file in lMol2:
    sReturn = subprocess.check_output("{rmsd_exe} {rmsd_pars}"\
        " -in {sIn} -ref {sRef}".format(rmsd_exe=sRmsdExe,\
        rmsd_pars=sRmsdPars, sIn=file, sRef=sReference), shell=True)
    dRmsds[file] = sReturn
Screen output (note that not "everything" is printed to the screen, only part of the output; if I use commands.getoutput, everything works just fine):
/soft/linux64/openeye/examples/oechem-utilities/openeye/toolkits/1.7.2.4/redhat-RHEL5-g++4.3-x64/examples/oechem-utilities/rmsd: mols in: 1 out: 0
/soft/linux64/openeye/examples/oechem-utilities/openeye/toolkits/1.7.2.4/redhat-RHEL5-g++4.3-x64/examples/oechem-utilities/rmsd: confs in: 1 out: 0
/soft/linux64/openeye/examples/oechem-utilities/openeye/toolkits/1.7.2.4/redhat-RHEL5-g++4.3-x64/examples/oechem-utilities/rmsd - RMSD utility [OEChem 1.7.2]
/soft/linux64/openeye/examples/oechem-utilities/openeye/toolkits/1.7.2.4/redhat-RHEL5-g++4.3-x64/examples/oechem-utilities/rmsd: mols in: 1 out: 0
/soft/linux64/openeye/examples/oechem-utilities/openeye/toolkits/1.7.2.4/redhat-RHEL5-g++4.3-x64/examples/oechem-utilities/rmsd: confs in: 1 out: 0
To add to Ryan Haining's answer, you can also handle stderr to make sure nothing is printed to the screen:
p = subprocess.Popen(command, shell=True, stdin=subprocess.PIPE, stderr=subprocess.STDOUT, stdout=subprocess.PIPE, close_fds=True)
out,err = p.communicate()
If subprocess.check_output is not working for you, use a Popen object and a PIPE to capture the program's output in Python.
prog = subprocess.Popen('./myprog', shell=True, stdout=subprocess.PIPE)
output = prog.communicate()[0]
The .communicate() method will wait for the program to finish execution and then return a tuple of (stdout, stderr), which is why you'll want to take element [0] of it.
If you also want to capture stderr then add stderr=subprocess.PIPE to the creation of the Popen object.
If you wish to capture the output of prog while it is running, instead of waiting for it to finish, you can call line = prog.stdout.readline() to read one line at a time. Note that this call blocks until a line is available.
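A small sketch of that line-by-line pattern (./myprog is the placeholder program from above; text mode via universal_newlines is an assumption):

# Sketch: stream a child's stdout line by line as it runs.
import subprocess

prog = subprocess.Popen('./myprog', shell=True,
                        stdout=subprocess.PIPE, universal_newlines=True)
for line in prog.stdout:
    print(line, end='')  # process each line as it arrives
prog.wait()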
I have always used subprocess.Popen, which normally prints no output by itself.
I'm running multiple commands which may take some time, in parallel, on a Linux machine running Python 2.6.
So I used the subprocess.Popen class and the process.communicate() method to parallelize execution of multiple command groups and capture the output at once after execution.
def run_commands(commands, print_lock):
    # this part runs in parallel.
    outputs = []
    for command in commands:
        proc = subprocess.Popen(shlex.split(command), stdout=subprocess.PIPE, stderr=subprocess.STDOUT, close_fds=True)
        output, unused_err = proc.communicate()  # buffers the output
        retcode = proc.poll()  # ensures subprocess termination
        outputs.append(output)
    with print_lock:  # print them at once (synchronized)
        for output in outputs:
            for line in output.splitlines():
                print(line)
Elsewhere it is called like this:
processes = []
print_lock = Lock()
for ...:
    commands = ...  # a group of commands is generated, which takes some time.
    processes.append(Thread(target=run_commands, args=(commands, print_lock)))
    processes[-1].start()
for p in processes:
    p.join()
print('done.')
The expected result is that each output of a group of commands is displayed at once while execution of them is done in parallel.
But starting from the second output group (of course, which thread comes second varies due to scheduling indeterminism), it begins to print without newlines, adding as many spaces as the number of characters printed in each previous line, and input echo is turned off -- the terminal state is "garbled" or "crashed". (If I issue the reset shell command, it restores normal behaviour.)
At first, I tried to find the reason in the handling of '\r', but that was not it. As you can see in my code, I handle it properly using splitlines(), and I confirmed that with the repr() function applied to the output.
I think the reason is the concurrent use of pipes in Popen and communicate() for stdout/stderr. I tried the check_output shortcut method from Python 2.7, but with no success. Of course, the problem described above does not occur if I serialize all command executions and prints.
Is there any better way to handle Popen and communicate() in parallel?
A final result inspired by the comment from J.F.Sebastian.
http://bitbucket.org/daybreaker/kaist-cs443/src/247f9ecf3cee/tools/manage.py
It seems to be a Python bug.
I am not sure it is clear what run_commands actually needs to be doing, but it seems to simply be doing a poll on a subprocess, ignoring the return code and continuing in the loop. When you get to the part where you are printing output, how could you know the subprocesses have completed?
In your example code I noticed your use of:
for line in output.splitlines():
to partially address the issue of '\r'; using
for line in output.splitlines(True):
would have been helpful (passing True keeps the line endings).
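A quick illustration of the difference (example strings chosen for the demonstration):

# splitlines() drops the line endings; splitlines(True) keeps them.
>>> "a\r\nb\r\n".splitlines()
['a', 'b']
>>> "a\r\nb\r\n".splitlines(True)
['a\r\n', 'b\r\n']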
Prior to this, I ran two commands in a for loop, like:
for x in $set:
    command
In order to save time, I want to run these two commands at the same time, like the parallel method in a makefile.
Thanks
Lyn
The threading module won't give you much performance-wise because of the Global Interpreter Lock.
I think the best way to do this is to use the subprocess module and open each command with its own stdout:
import select
import subprocess

processes = {}
for cmd in ['cmd1', 'cmd2', 'cmd3']:
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    processes[p.stdout] = p

while len(processes):
    rfds, _, _ = select.select(processes.keys(), [], [])
    for fd in rfds:
        process = processes[fd]
        print fd.read()
        if process.poll() is not None:
            print "Process {0} returned with code {1}".format(process.pid, process.returncode)
            del processes[fd]
You basically have to use select to see which file descriptors are ready, and you have to check each process's return code to see whether a read caused it to exit. Processes basically go into a wait state until their stdout is closed. If you would like to do some things while you're waiting, you can put a timeout on select.select() so you'll stop waiting after so long. You can test the length of rfds, and if it is 0 then you know that the timeout happened.
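A hedged sketch of that timeout variant (the 0.5-second value and the echo commands are placeholders):

# Sketch: the same select loop, but with a timeout so the main loop can
# do other work while the children run.
import select
import subprocess

procs = {}
for c in ('echo one', 'echo two'):
    p = subprocess.Popen(c, shell=True, stdout=subprocess.PIPE)
    procs[p.stdout] = p
while procs:
    rfds, _, _ = select.select(list(procs), [], [], 0.5)
    if not rfds:
        continue  # timed out: nothing ready yet, do other work here
    for fd in rfds:
        output = fd.read()  # these commands are short-lived, so read to EOF
        print(output.decode(), end='')
        procs.pop(fd).wait()  # reap the finished child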
The twisted or select module is probably what you're after.
If all you want to do is run a bunch of batch commands, a shell script, i.e.
#!/bin/sh
for i in command1 command2 command3; do
    $i &
done
wait
might work better. Alternatively, a Makefile like you said.
Look at the threading module.
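For completeness, a minimal threading sketch of the same idea (command1 and command2 are placeholders):

# Sketch: run two commands in parallel, one thread per command; each
# thread simply blocks in subprocess.call until its command finishes.
import subprocess
import threading

def run(cmd):
    subprocess.call(cmd, shell=True)

threads = [threading.Thread(target=run, args=(c,)) for c in ('command1', 'command2')]
for t in threads:
    t.start()
for t in threads:
    t.join()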