Python: subprocess.Popen()

I have a question regarding subprocess.Popen():
If the command being executed runs in an infinite loop, is there any way for subprocess.Popen() to detect this and exit after printing the first output?
Or is there some other way to do this and still print the result?
As you can see, the following program, executed on a Linux machine, just keeps on running:
>>> import os
>>> import subprocess as sp
>>> p = sp.Popen("yes", stdout=sp.PIPE)
>>> result = p.communicate()[0]

The communicate method is only useful if the program being called is expected to terminate soon with relatively little output. In your example, the yes program never terminates, so communicate never completes.

To deal with subprocesses which execute indefinitely and may produce a lot of output, you will need a loop which repeatedly calls p.poll() until it returns a value other than None, which indicates that the process has terminated. While in this loop you should read from p.stdout and p.stderr to consume any output from the program. If you don't consume the output, the pipe buffers may fill up and cause the subprocess to block waiting to be able to write more output.
import subprocess
import time

p = subprocess.Popen("yes", stdout=subprocess.PIPE)
result = ""
start_time = time.time()
while p.poll() is None:  # None means the process is still running
    result += p.stdout.read(8192)  # consume output so the pipe buffer doesn't fill up
    time.sleep(1)
    if (time.time() - start_time) > 5:
        print "Timeout"
        break
print result
Note that without the time check, the above example would run indefinitely until you killed the yes subprocess it is reading output from. The timeout check in the loop detects that the process isn't terminating and jumps out once enough time has passed.
If you are certain that your subprocess will terminate of its own accord, you can simply call communicate() and get back the output, but this does not work in your example for the reasons explained above.
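On Python 3.3 and later, communicate() also accepts a timeout argument, which replaces the manual polling loop above. A minimal sketch (note that yes produces output very quickly, so the captured result can be large):

import subprocess

p = subprocess.Popen("yes", stdout=subprocess.PIPE)
try:
    # Wait up to 5 seconds for the process to finish on its own.
    result = p.communicate(timeout=5)[0]
except subprocess.TimeoutExpired:
    # Still running: kill it, then call communicate() again to
    # reap the process and collect any remaining output.
    p.kill()
    result = p.communicate()[0]
print(len(result), "bytes captured")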

Related

Python subprocess call with output and timeout

Summary: I want to start an external process from Python (version 3.6), poll the result without blocking, and kill it after a timeout.
Details: there is an external process with 2 "bad habits":
It prints out the relevant result after an undefined time.
It does not stop after it prints out the result.
Example: the following simple application (mytest.py; the actual program's source code is not available) probably resembles the actual program to be called:
import random
import time
print('begin')
time.sleep(10*random.random())
print('result=5')
while True: pass
This is how I am trying to call it:
import subprocess, time

myprocess = subprocess.Popen(['python', 'mytest.py'], stdout=subprocess.PIPE)
for i in range(15):
    time.sleep(1)
    # check if something has been printed, but do not wait for anything to be printed
    # check if the result is there
    # if the result is there, then break
myprocess.kill()
I want to implement the logic in comment.
Analysis
The following are not appropriate:
Use myprocess.communicate(), as it waits for termination, and the subprocess does not terminate.
Kill the process and then call myprocess.communicate(), because we don't know when exactly the result is printed out.
Use myprocess.stdout.readline(), because that is a blocking call, so it waits until something is printed; but here, at the end, nothing is printed at all.
The type of myprocess.stdout is io.BufferedReader. So the question practically is: is there a way to check if something has been printed to the io.BufferedReader, and if so, read it, without waiting otherwise?
I think I got the exact package you need.
Meet command_runner, which is a subprocess wrapper and allows:
Live stdout / stderr output
Timeouts regardless of execution state
Killing the process tree, including child processes, on timeout
Redirecting stdout / stderr to queues, files or callback functions
Install with pip install command_runner
Usage:
from command_runner import command_runner

def callback(stdout_output):
    # Do whatever you want here with the output
    print(stdout_output)

exit_code, output = command_runner("python mytest.py", timeout=300, stdout=callback, method='poller')
if exit_code == -254:
    print("Oh no, we got a timeout")
print(output)
# Check for good exit_code and full stdout output here
If timeout is reached, you'll get exit_code -254 but still get to have output filled with whatever your subprocess wrote to stdout/stderr.
Disclaimer: I'm the author of command_runner
Additional non blocking examples using queues can be seen on the github page.
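For reference, here is a minimal sketch of the thread-plus-queue pattern those examples are built on (a generic illustration, not command_runner's internals): a background thread blocks on readline() and pushes lines into a queue.Queue, which the main thread polls without blocking.

import queue
import subprocess
import threading
import time

def enqueue_output(pipe, q):
    # This blocks, but only in the background thread.
    for line in iter(pipe.readline, b''):
        q.put(line)
    pipe.close()

proc = subprocess.Popen(['python', 'mytest.py'], stdout=subprocess.PIPE)
q = queue.Queue()
threading.Thread(target=enqueue_output, args=(proc.stdout, q), daemon=True).start()

deadline = time.time() + 15
while time.time() < deadline:
    try:
        line = q.get(timeout=1)  # waits at most 1 second per iteration
    except queue.Empty:
        continue  # nothing printed yet
    if line.startswith(b'result='):
        print('got the result:', line)
        break
proc.kill()  # the subprocess never exits on its own
proc.wait()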

python: find in subprocess's output, leave it running and continue

I have to call a shell command:
subprocess.Popen(command, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)
I did that, and:
that command prints lots of things (as if verbose were on), and when it has done its job it prints (writes) blah blah : Ready
I have to call this command, wait for the 'Ready' text, leave the command running in the background, and then let the rest of the code run.
I tried this and things like this, and it didn't work:
...
done = False
with subprocess.Popen(command, stderr=subprocess.STDOUT, stdout=subprocess.PIPE) as proc:
    while not done:
        x = proc.stdout.read()
        x = x.find('Ready')
        if x > -1:
            done = True
            print("YaaaY, this subprocess has done its job and is now running in the background")
# rest of the code
I ran similar (edited) code in the Python terminal, and I think I can't even access (read) the subprocess's output, because I was expecting it to show every line that the subprocess prints, but it just waits.
Your problem is proc.stdout.read(). This reads the entire output of your subprocess, which is not known until it has terminated (usually). Try something like:
output = b''
while not done:
    output += proc.stdout.read(1)
    x = output.find(b'Ready')
    if x > -1:
        done = True
This reads the process's stdout one character at a time, so it doesn't have to wait for it to finish.
Note that using Popen in a context manager (with block) will cause your program to wait for the child process to terminate before it exits the with block, so it will not leave it running past that point. It's unclear if that is desired behaviour or not.
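If the output is line-oriented, as it is here, the byte-at-a-time loop can also be replaced by line iteration. A sketch (my illustration, with command standing for the same command as in the question), dropping the with block so the child keeps running afterwards:

import subprocess

command = [...]  # the same command as in the question
proc = subprocess.Popen(command, stderr=subprocess.STDOUT, stdout=subprocess.PIPE)
for line in proc.stdout:  # blocks only until the next newline arrives
    if b'Ready' in line:
        print("subprocess is ready; leaving it running in the background")
        break
# rest of the code runs here while proc continues in the background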

Python subprocess polling not giving return code when used with Java process

I'm having a problem with subprocess poll not returning the return code when the process has finished.
I found out how to set a timeout on subprocess.Popen and used that as the basis for my code. However, I have a call that uses Java that doesn't correctly report the return code so each call "times out" even though it is actually finished. I know the process has finished because when removing the poll timeout check, the call runs without issue returning a good exit code and within the time limit.
Here is the code I am testing with.
import subprocess
import time
def execute(command):
    print('start command: {}'.format(command))
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    print('wait')
    wait = 10
    while process.poll() is None and wait > 0:
        time.sleep(1)
        wait -= 1
    print('done')
    if wait == 0:
        print('terminate')
        process.terminate()
    print('communicate')
    stdout, stderr = process.communicate()
    print('rc')
    exit_code = process.returncode
    if exit_code != 0:
        print('got bad rc')

if __name__ == '__main__':
    execute(['ping','-n','15','127.0.0.1'])  # correctly times out
    execute(['ping','-n','5','127.0.0.1'])   # correctly runs within the time limit
    # incorrectly times out
    execute(['C:\\dev\\jdk8\\bin\\java.exe', '-jar', 'JMXQuery-0.1.8.jar', '-url', 'service:jmx:rmi:///jndi/rmi://localhost:18080/jmxrmi', '-json', '-q', 'java.lang:type=Runtime;java.lang:type=OperatingSystem'])
You can see that the first example is designed to time out and the second is not, and both behave correctly. However, the final one (using jmxquery to get Tomcat metrics) doesn't return the exit code and therefore "times out" and has to be terminated, which then causes it to return an error code of 1.
Is there something I am missing in the way subprocess poll is interacting with this Java process that is causing it to not return an exit code? Is there a way to get a timeout option to work with this?
This has the same cause as a number of existing questions, but the desire to impose a timeout requires a different answer.
The OS deliberately gives only a small amount of buffer space to each pipe. When a process writes to one that is full (because the reader has not yet consumed the previous output), it blocks. (The reason is that a producer that is faster than its consumer would otherwise be able to quickly use a great deal of memory for no gain.) Therefore, if you want to do more than one of the following with a subprocess, you have to interleave them rather than doing each in turn:
Read from standard output
Read from standard error (unless it’s merged via subprocess.STDOUT)
Wait for the process to exit, or for a timeout to elapse
Of course, the subprocess might close its streams before it exits, write useful output after you notice the timeout and before you kill it, and/or start additional processes that keep the pipe open indefinitely, so you might want to have multiple timeouts.

Probably what’s most informative is the EOF on the pipe, so repeatedly use something like select to wait for (however much is left of) the timeout, issue single reads on the streams that are ready, and wait (with another timeout if you’re concerned about hangs after an early stream closure) on EOF. If the timeout occurs instead, (try to) kill the subprocess, and consider issuing non-blocking reads (or another timeout loop) to get any last available output before closing the pipes.
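A minimal sketch of that interleaved loop using the selectors module (POSIX only, since select does not work on pipes on Windows; command stands for whatever is being run):

import selectors
import subprocess
import time

command = [...]  # the subprocess to run
proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
sel = selectors.DefaultSelector()
sel.register(proc.stdout, selectors.EVENT_READ)
sel.register(proc.stderr, selectors.EVENT_READ)

deadline = time.time() + 10
chunks = {proc.stdout: b'', proc.stderr: b''}
while sel.get_map():  # loop until EOF on both streams
    remaining = deadline - time.time()
    if remaining <= 0:
        proc.kill()  # timeout: give up and kill the subprocess
        break
    for key, _ in sel.select(remaining):
        data = key.fileobj.read1(4096)  # a single read on a ready stream; won't block
        if data:
            chunks[key.fileobj] += data
        else:
            sel.unregister(key.fileobj)  # this stream hit EOF
proc.wait()  # beware: this can still hang if the process closed its streams early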
Using the other answer by @DavisHerring as the basis for more research, I came across a concept that worked for my original case. Here is the code that came out of that.
import subprocess
import threading
import time
def execute(command):
    print('start command: {}'.format(command))
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    timer = threading.Timer(10, terminate_process, [process])
    timer.start()
    print('communicate')
    stdout, stderr = process.communicate()
    print('rc')
    exit_code = process.returncode
    timer.cancel()
    if exit_code != 0:
        print('got bad rc')

def terminate_process(p):
    try:
        p.terminate()
    except OSError:
        pass  # ignore error
It uses a threading.Timer to make sure that the process doesn't go over the time limit, and terminates the process if it does. Otherwise, it waits for communicate() to return and cancels the timer once the process finishes.

Infinite while loop not working with os.execvp

I am working on a project in Python that involves implementing a shell on Linux. I am trying to run standard Unix commands by using os.execvp(). I need to keep asking the user for commands, so I have used an infinite while loop. However, the infinite while loop doesn't work. I have tried searching online, but there isn't much available for Python. Any help would be appreciated. Thanks
This is the code I have written so far:
import os
import shlex
def word_list(line):
    """Break the line into shell words."""
    lexer = shlex.shlex(line, posix=True)
    lexer.whitespace_split = False
    lexer.wordchars += '#$+-,./?#^='
    args = list(lexer)
    return args

def main():
    while True:
        line = input('psh>')
        split_line = word_list(line)
        if len(split_line) == 1:
            print(os.execvp(split_line[0], [" "]))
        else:
            print(os.execvp(split_line[0], split_line))

if __name__ == "__main__":
    main()
So when I run this and enter the input "ls", I get the output "HelloWorld.py" (which is correct) and "Process finished with exit code 0". However, I don't get the "psh>" prompt waiting for the next command. No exceptions are thrown when I run this code.
Your code does not work because it uses os.execvp. os.execvp replaces the current process image completely with the executing program; your running process becomes the ls.
To execute a subprocess use the aptly named subprocess module.
If this is an (ill-advised) programming exercise, then you need to fork first:
# warning, never do this at home!
pid = os.fork()
if not pid:
    os.execvp(args[0], args)  # in child: replace this process with the command
else:
    os.waitpid(pid, 0)  # in parent: wait for the child to finish
os.fork returns twice: it gives the pid of the child in the parent process, and zero in the child process.
If you want it to run like a shell, you are looking for os.fork(). Call it before you call os.execvp(), and it will create a child process. os.fork() returns the process id: if it is 0, you are in the child process and can call os.execvp(); otherwise, continue with the code. This keeps the while loop running. You can have the original process either wait for the child to complete with os.waitpid(), or continue without waiting back to the start of the while loop. The pseudocode on page 2 of this link should help: https://www.cs.auckland.ac.nz/courses/compsci340s2c/assignments/A1/A1.pdf
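Putting that together with the question's code, a minimal sketch of a fixed main loop (my illustration; it uses shlex.split for brevity in place of the original word_list):

import os
import shlex

def main():
    while True:
        line = input('psh> ')
        args = shlex.split(line)
        if not args:
            continue
        pid = os.fork()
        if pid == 0:  # child: replace this process with the command
            try:
                os.execvp(args[0], args)
            except FileNotFoundError:
                print('psh: command not found:', args[0])
                os._exit(1)
        else:  # parent: wait for the child, then prompt again
            os.waitpid(pid, 0)

if __name__ == "__main__":
    main()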

Better multithreaded use of Python subprocess.Popen & communicate()?

I'm running multiple commands which may take some time, in parallel, on a Linux machine running Python 2.6.
So I used the subprocess.Popen class and the process.communicate() method to parallelize execution of multiple command groups and capture each group's output at once after execution.
def run_commands(commands, print_lock):
    # this part runs in parallel.
    outputs = []
    for command in commands:
        proc = subprocess.Popen(shlex.split(command), stdout=subprocess.PIPE, stderr=subprocess.STDOUT, close_fds=True)
        output, unused_err = proc.communicate()  # buffers the output
        retcode = proc.poll()  # ensures subprocess termination
        outputs.append(output)
    with print_lock:  # print them at once (synchronized)
        for output in outputs:
            for line in output.splitlines():
                print(line)
Elsewhere it's called like this:
processes = []
print_lock = Lock()
for ...:
    commands = ...  # a group of commands is generated, which takes some time.
    processes.append(Thread(target=run_commands, args=(commands, print_lock)))
    processes[-1].start()
for p in processes:
    p.join()
print('done.')
print('done.')
The expected result is that each output of a group of commands is displayed at once while execution of them is done in parallel.
But from the second output group onward (of course, which thread comes second varies due to scheduling indeterminism), the output begins to print without newlines, with as many spaces added as the number of characters printed in each previous line, and input echo is turned off -- the terminal state is "garbled" or "crashed". (If I issue the reset shell command, it returns to normal.)
At first I suspected the handling of '\r', but that was not the reason. As you can see in my code, I handle it properly using splitlines(), and I confirmed that by applying repr() to the output.
I think the reason is the concurrent use of pipes in Popen and communicate() for stdout/stderr. I tried the check_output shortcut method from Python 2.7, but without success. Of course, the problem described above does not occur if I serialize all command executions and prints.
Is there any better way to handle Popen and communicate() in parallel?
A final result inspired by the comment from J.F.Sebastian.
http://bitbucket.org/daybreaker/kaist-cs443/src/247f9ecf3cee/tools/manage.py
It seems to be a Python bug.
I am not sure it is clear what run_commands actually needs to do, but it seems to simply poll a subprocess, ignore the return code, and continue in the loop. When you get to the part where you print the output, how do you know the subprocesses have completed?
In your example code I noticed your use of:
for line in output.splitlines():
to partially address the issue of '\r'; using
for line in output.splitlines(True):
would have been more helpful.
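For readers on modern Python, a sketch of the same pattern using concurrent.futures (my illustration, not from the original answers; the question itself targets Python 2.6). Each worker collects its whole group's output and only the main thread prints, so no lock is needed and terminal writes cannot interleave:

import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_commands(commands):
    # Run one group of commands sequentially; return their outputs.
    outputs = []
    for command in commands:
        proc = subprocess.Popen(shlex.split(command), stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        output, _ = proc.communicate()
        outputs.append(output)
    return outputs

groups = [['echo one', 'echo two'], ['echo three']]
with ThreadPoolExecutor() as pool:
    for outputs in pool.map(run_commands, groups):  # results come back in submission order
        for output in outputs:
            for line in output.splitlines():
                print(line.decode())
print('done.')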
