Python subprocess gets stuck at communicate() call

Context:
I am using python 2.7.5.
I need to run a subprocess from a python script, wait for its termination and get the output.
The subprocess is run around 1000 times.
In order to run my subprocess, I have defined a function:
import subprocess

def run(cmd):
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    (stdout, stderr) = p.communicate()
    return (p.returncode, stdout, stderr)
The subprocess to be executed is a bash script and is passed as the cmd parameter of the run() function.
The command and its arguments are given through a list (as expected by Popen()).
Issue:
In the past, it has always worked without any error.
But recently, the Python script always gets stuck on a subprocess call after having successfully executed a lot of calls. The subprocess in question is not executed at all (the bash script is not even started) and the Python script blocks.
After stopping the execution with Ctrl+C, the traceback shows where it was stuck:
[...]
  File "import_debug.py", line 20, in run
    (stdout, stderr) = p.communicate()
  File "/usr/lib64/python2.7/subprocess.py", line 800, in communicate
    return self._communicate(input)
  File "/usr/lib64/python2.7/subprocess.py", line 1401, in _communicate
    stdout, stderr = self._communicate_with_poll(input)
  File "/usr/lib64/python2.7/subprocess.py", line 1455, in _communicate_with_poll
    ready = poller.poll()
KeyboardInterrupt
I don't understand why I have this issue nor how to solve it.
I have found this SO thread that seems to tackle the same issue, or something equivalent (the output after the keyboard interrupt is the same), but there is no answer.
Question: What is happening here? What am I missing? How can I solve this issue?
EDIT:
The call takes the form:
(code, out, err) = run(["/path/to/bash_script.sh", "arg1", "arg2", "arg3"])
print out
if code:
    print "Failed: " + str(err)
The bash script does some basic processing with the data (it unzips archives and does something with the extracted data).
When the error occurs, none of the bash script instructions are executed.
I cannot provide the exact command, arguments and contents for company privacy concerns.

The author of the original thread you're referring to says: "If I set stderr=None instead of stderr=subprocess.PIPE I never see this issue." -- I'd recommend doing exactly that to get your script working.
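A minimal sketch of the run() function with that change; stderr is left at its default of None, so it flows to the parent's stderr instead of a pipe:
import subprocess

def run(cmd):
    # stderr=None (the default) leaves stderr attached to the parent's stderr;
    # the linked thread reports that this avoids the hang in communicate()
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=None)
    (stdout, stderr) = p.communicate()  # stderr comes back as None here
    return (p.returncode, stdout, stderr)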
Added after reading the comment section: there are a few unzip options you may or may not want to use (they control what happens when extracted files already exist):
-f freshen existing files, create none
-n never overwrite existing files
-o overwrite files WITHOUT prompting

Related

Python subprocess package returns broken pipe

I am trying to do a very simple example of using the subprocess package. The Python script should open a new process and run the read command, which should receive input from stdin via a PIPE. Every time I try to use write() and flush(), it says:
Traceback (most recent call last):
  File "recorder.py", line 68, in <module>
    p.stdin.flush()
BrokenPipeError: [Errno 32] Broken pipe
My python code looks like:
import subprocess
import time

p = subprocess.Popen(
    ["read"],
    stdout=subprocess.PIPE,
    stdin=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    shell=True,
    bufsize=1,
)
for character in "This is the message!\n":
    p.stdin.write(character.encode("utf-8"))
    time.sleep(0.25)
    p.stdin.flush()
assert p.returncode == 0
Note: it's very important to send character after character (with sleeping timeout).
I actually could not replicate your result*; in my case your loop runs through, and it'd fail on the assert since p has not finished yet and has no returncode (or rather, its value is still None at that time). Inserting p.wait() after the loop and before the assert ensures we only check for the result after p has terminated.
Now, for the exception you're seeing: it most likely indicates that the pipe you're trying to flush() is closed, most likely because the process has already terminated. Perhaps in your case it already has a (non-zero) returncode at that point too, which could further help understand the problem?**
* On my system, the /bin/sh used by subprocess.Popen() with shell=True is actually bash. Running ["/bin/dash", "-c", "read"], which is presumably the shell invoked for /bin/sh on your system, I got a broken pipe as well.
** Running dash like this seems to fail with:
/bin/dash: 1: read: arg count
and returns 2.
Which sort of makes it more of a dash question: why does calling /bin/dash -c "read" (from Python) fail? It appears that dash's read (unlike its bash counterpart) always expects at least one variable name to read into as an argument (replace read with read foo).
I guess this python question just became a lesson about assumptions and shell scripts portability. :)
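Putting the two fixes together, a sketch that should get past both failures when /bin/sh is dash (read gets a variable name, and wait() runs before the assert):
import subprocess
import time

# "read foo" gives dash's read builtin the variable name it requires
p = subprocess.Popen(
    "read foo",
    stdout=subprocess.PIPE,
    stdin=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    shell=True,
)
for character in "This is the message!\n":
    p.stdin.write(character.encode("utf-8"))
    p.stdin.flush()
    time.sleep(0.25)
p.wait()  # block until the shell exits so returncode is set
assert p.returncode == 0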

When running the batch file that starts the jar file via Python subprocess, I do not see any output

I have completed the logic to run the batch file in a subprocess and it works.
query = 'C:/val/start.bat'
process = subprocess.Popen(query, shell=False, stdout=subprocess.PIPE)
The cmd window appears and the batch file runs fine, but I do not see any of the logs that should be printed.
When I run the batch file directly from Windows, the log is normally generated.
The batch file calls and executes the jar file.
@echo off
"%JAVA_HOME%\bin\java" -Dfile.encoding=utf-8 -Djava.file.encoding=UTF-8 -jar -Xms1024m -Xmx1024m C:\val\val.jar
pause>nul
Could you tell me what the problem is and how to solve it?
You redirected stdout to a pipe (stdout=subprocess.PIPE), so the batch file's logs go to your Python process instead of the console window; you need to read the output yourself:
import subprocess

process = subprocess.Popen('command', stdout=subprocess.PIPE)
process.wait()
result = process.stdout.read()
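Note that wait() before read() can deadlock if the child writes enough output to fill the OS pipe buffer. A sketch of the safer pattern with communicate(), using the start.bat path from the question:
import subprocess

process = subprocess.Popen('C:/val/start.bat', stdout=subprocess.PIPE)
# communicate() reads stdout to EOF while waiting for the process to exit,
# so the child can never block on a full pipe buffer
stdout, _ = process.communicate()
print(stdout.decode('utf-8', errors='replace'))
Also note the pause>nul at the end of the batch file keeps it waiting for a keypress, so you may want to remove it when running non-interactively.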

Trouble printing text live with Python subprocess.call

Off the bat, here is what I am importing:
import os, shutil
from subprocess import call, PIPE, STDOUT
I have a line of code that calls bjam to compile a library:
call(['./bjam',
      '-j8',
      '--prefix="' + tools_dir + '"'],
     stdout=PIPE)
I want it to print out text as the compilation occurs. Instead, it prints everything out at the end.
It does not print anything when I run it like this. I tried running the command outside of Python and determined that all of the output goes to stdout (when I did ./bjam -j8 > /dev/null I got no output, and when I ran ./bjam -j8 2> /dev/null I did get output).
What am I doing wrong here? I want to print the output from call live.
As a sidenote, I also noticed something when I was outputting the results of a git clone operation:
call(['git',
      'clone', 'https://github.com/moses-smt/mosesdecoder.git'],
     stdout=PIPE)
prints the stdout text live as the call process is run.
call(['git',
      'clone', 'https://github.com/moses-smt/mosesdecoder.git'],
     stdout=PIPE, stderr=STDOUT)
does not print out any text. What is going on here?
stdout=PIPE redirects the subprocess's stdout to a pipe. Don't do it unless you want to read from the subprocess's stdout in your code, using the proc.communicate() method or the proc.stdout attribute directly.
If you remove it then subprocess should print to stdout like it does in the shell:
from subprocess import check_call
check_call(['./bjam', '-j8', '--prefix', tools_dir])
I've used check_call() to raise an exception if the child process fails.
See Python: read streaming input from subprocess.communicate() if you want to read the subprocess's output line by line (making each line available as a variable in Python) as soon as it is available.
Try:
import subprocess

def run(command):
    proc = subprocess.Popen(command, stdout=subprocess.PIPE)
    for lineno, line in enumerate(proc.stdout):
        try:
            print(line.decode('utf-8').replace('\n', ''))
        except UnicodeDecodeError:
            print('error(%d): cannot decode %s' % (lineno, line))
The try...except logic is for Python 3 (maybe 3.2/3.3, I'm not sure), as there line is a bytes object, not a string. For earlier versions of Python, you should be able to do:
def run(command):
    proc = subprocess.Popen(command, stdout=subprocess.PIPE)
    for line in proc.stdout:
        print(line.replace('\n', ''))
Now, you can do:
run(['./bjam', '-j8', '--prefix="' + tools_dir + '"'])
call will not print anything it captures. As the documentation says: "Do not use stdout=PIPE or stderr=PIPE with this function. As the pipes are not being read in the current process, the child process may block if it generates enough output to a pipe to fill up the OS pipe buffer."
Consider using check_output and print its return value.
In the first case with the git call, you are not capturing stderr, and it therefore flows onto your terminal as usual (git clone writes its progress messages to stderr, not stdout, which is why that call appears to print live).
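For example, a minimal check_output() version of the git call; with stderr=STDOUT it also captures the progress messages, and everything is printed once the command finishes (so still not live):
from subprocess import check_output, STDOUT

# raises CalledProcessError if git exits non-zero; stderr is merged into
# the captured output
out = check_output(['git', 'clone',
                    'https://github.com/moses-smt/mosesdecoder.git'],
                   stderr=STDOUT)
print(out)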

Python too many open files (subprocesses)

I seem to be having an issue with Python when I run a script that creates a large number of subprocesses. The subprocess creation code looks similar to:
Code:
import atexit
import subprocess

def execute(cmd, stdout=None, stderr=subprocess.STDOUT, cwd=None):
    proc = subprocess.Popen(cmd, shell=True, stdout=stdout, stderr=stderr, cwd=cwd)
    atexit.register(lambda: __kill_proc(proc))
    return proc
The error message I am receiving is:
OSError: [Errno 24] Too many open files
Once this error occurs, I am unable to create any further subprocesses until I kill the script and start it again. I am wondering if the following line could be responsible:
atexit.register(lambda: __kill_proc(proc))
Could it be that this line keeps a reference to the subprocess, so that a "file" remains open until the script exits?
So it seems that the line:
atexit.register(lambda: __kill_proc(proc))
was indeed the culprit. This is probably because the Popen reference was being kept around, so the process resources were never freed. When I removed that line the error went away. I have now changed the code as @Bakuriu suggested and am using the process's pid value rather than the Popen instance.
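A sketch of that pid-based variant; the __kill_proc() helper here is hypothetical, modeled on the name used in the question:
import atexit
import os
import signal
import subprocess

def __kill_proc(pid):
    try:
        os.kill(pid, signal.SIGTERM)
    except OSError:
        pass  # the process has already exited

def execute(cmd, stdout=None, stderr=subprocess.STDOUT, cwd=None):
    proc = subprocess.Popen(cmd, shell=True, stdout=stdout, stderr=stderr, cwd=cwd)
    # register only the pid: no Popen reference is kept alive in the closure,
    # so the instance and its file descriptors can be released normally
    atexit.register(__kill_proc, proc.pid)
    return proc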
Firstly, run ulimit -a to find out the maximum number of open files allowed on your Linux system.
Then edit the system configuration file /etc/security/limits.conf and add this line at the bottom:
* - nofile 204800
Then you can open more subprocesses if you want.
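If you would rather check the limit from inside Python than with ulimit, the resource module exposes it:
import resource

# the soft limit is what currently applies; the hard limit is the ceiling
# it can be raised to without root privileges
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print("open file limit: soft=%d, hard=%d" % (soft, hard))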

Executing a command and storing its output in a variable

I'm currently trying to write a python script that, among many things, calls an executable and stores what that executable sends to stdout in a variable. Here is what I have:
#!/usr/bin/python
import subprocess

subprocess.call("./pmm", shell=True)
How would I get the output of pmm to be stored in a variable?
In Python 2.7 (and 3.1 or above), you can use subprocess.check_output(). Example from the documentation:
>>> subprocess.check_output(["ls", "-l", "/dev/null"])
'crw-rw-rw- 1 root root 1, 3 Oct 18 2007 /dev/null\n'
p = subprocess.Popen(["./pmm"], shell=False, stdout=subprocess.PIPE)
output = p.stdout.read()
I wrote a post about this some time ago:
http://trifoliummedium.blogspot.com/2010/12/running-command-line-with-python-and.html
Use p.communicate() to get both stdout and stderr
First you have to save a reference to the subprocess (bind it to a name, which in other languages, and more informally, is referred to as "assigning it to a variable"). So you should use something like proc = subprocess.Popen(...).
From there I recommend that you call proc.poll() to test whether the program has completed, and either sleep (using the time.sleep() function, for example) or perform other work (using select.select(), for example) and then check again later. Or you can call proc.wait() so that you're sure this ./pmm command has completed its work before your program continues. The poll() method on a subprocess instance will return None if the subprocess is still running; otherwise it'll return the exit value of the command that was running in that subprocess. The wait() method for a subprocess will cause your program to block until the subprocess terminates and then return the exit value.
After that you can call (output, errormsgs) = proc.communicate() to capture any output or error messages from your subprocess. If the output is too large it could cause problems; using the process instance's .stdout (PIPE file descriptor) directly is tricky, and if you were going to attempt this you should use features in the fcntl (file descriptor control) module to switch it into non-blocking mode and be prepared to handle the exceptions raised when read() calls find the buffer empty.
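Putting that together, a minimal sketch: communicate() both waits for termination and collects the output, so no separate wait() or poll() loop is needed, and it keeps draining the pipes, which avoids the large-output pitfall mentioned above:
import subprocess

# save a reference to the subprocess
proc = subprocess.Popen(["./pmm"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# blocks until ./pmm exits, reading both streams to EOF along the way
(output, errormsgs) = proc.communicate()
print("exit code: %d" % proc.returncode)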
