It's not the first time I've had this problem, and it's really bugging me.
Whenever I open a pipe using the Python subprocess module, I can only communicate() with it once, as the documentation specifies: "Read data from stdout and stderr, until end-of-file is reached".
proc = sub.Popen("psql -h darwin -d main_db".split(),stdin=sub.PIPE,stdout=sub.PIPE)
print proc.communicate("select a,b,result from experiment_1412;\n")[0]
print proc.communicate("select theta,zeta,result from experiment_2099\n")[0]
The problem here is that the second time around, Python isn't happy. Indeed, it decided to close the file after the first communicate():
Traceback (most recent call last):
File "a.py", line 30, in <module>
print proc.communicate("select theta,zeta,result from experiment_2099\n")[0]
File "/usr/lib64/python2.5/subprocess.py", line 667, in communicate
return self._communicate(input)
File "/usr/lib64/python2.5/subprocess.py", line 1124, in _communicate
self.stdin.flush()
ValueError: I/O operation on closed file
Are multiple communications allowed?
I think you misunderstand communicate...
http://docs.python.org/library/subprocess.html#subprocess.Popen.communicate
communicate sends a string to the other process and then waits for it to finish (like you said: it waits for EOF while listening to stdout and stderr).
What you should do instead is:
proc.stdin.write('message')
# ...figure out how long or why you need to wait...
proc.stdin.write('message2')
(and if you need to get the stdout or stderr you'd use proc.stdout or proc.stderr)
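For instance, here is a minimal sketch of that write/read pattern against the psql example. The -A -t flags and the \echo sentinel are assumptions to make the output easy to delimit, and this assumes psql flushes each result promptly when writing to a pipe:

import subprocess

# Start psql in unaligned, tuples-only mode so each row is one plain line.
proc = subprocess.Popen(
    "psql -h darwin -d main_db -A -t".split(),
    stdin=subprocess.PIPE, stdout=subprocess.PIPE,
    universal_newlines=True)

def query(sql):
    proc.stdin.write(sql + "\n")
    proc.stdin.write("\\echo END_OF_RESULT\n")   # hypothetical sentinel line
    proc.stdin.flush()
    rows = []
    for line in iter(proc.stdout.readline, ""):
        if line.strip() == "END_OF_RESULT":      # result is complete
            break
        rows.append(line.rstrip("\n"))
    return rows

print(query("select a,b,result from experiment_1412;"))
print(query("select theta,zeta,result from experiment_2099;"))

proc.stdin.close()
proc.wait()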
I've had this problem before, and as far as I could ever figure, you couldn't do this with subprocess (which, I agree, is very counterintuitive if true). I ended up using pexpect (obtainable from PyPI).
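A rough pexpect equivalent for the psql session might look like this (the prompt regex is an assumption about how your server's prompt is configured):

import pexpect

# Spawn psql under a pseudo-terminal so it behaves interactively.
child = pexpect.spawn('psql -h darwin -d main_db')
prompt = r'=[#>] '                  # assumed pattern for the psql prompt
child.expect(prompt)                # wait for the first prompt
child.sendline('select a,b,result from experiment_1412;')
child.expect(prompt)                # wait until the result is printed
print(child.before)                 # everything emitted before the prompt
child.sendline('\\q')               # quit psql
child.close()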
You can use:
proc.stdin.write('input')
if proc.stdout.closed:
    print(proc.stdout)
You can do this simply with a single call of communicate():
query1 = 'select a,b,result from experiment_1412;'
query2 = 'select theta,zeta,result from experiment_2099;'
concat_query = "{}\n{}".format(query1, query2)
print(proc.communicate(input=concat_query.encode('utf-8'))[0])
The key point here is that you only write to stdin once, and \n serves as the EOL.
Your psql subprocess reads from stdin up to the first \n; after it finishes the first query, it goes back to stdin, by which time only the second query string is left in the buffer.
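Put together, a complete version of that approach might look like this (host and database names are taken from the question):

import subprocess

proc = subprocess.Popen(
    "psql -h darwin -d main_db".split(),
    stdin=subprocess.PIPE, stdout=subprocess.PIPE)

query1 = 'select a,b,result from experiment_1412;'
query2 = 'select theta,zeta,result from experiment_2099;'
concat_query = "{}\n{}".format(query1, query2)

# communicate() writes everything, closes stdin (so psql sees EOF and
# exits), and returns the combined output of both queries at once.
out, _ = proc.communicate(input=concat_query.encode('utf-8'))
print(out.decode('utf-8'))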
Related
Context:
I am using Python 2.7.5.
I need to run a subprocess from a python script, wait for its termination and get the output.
The subprocess is run around 1000 times.
In order to run my subprocess, I have defined a function:
import subprocess

def run(cmd):
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    (stdout, stderr) = p.communicate()
    return (p.returncode, stdout, stderr)
The subprocess to be executed is a bash script and is passed as the cmd parameter of the run() function.
The command and its arguments are given through a list (as expected by Popen()).
Issue:
In the past, it has always worked without any error.
But recently, the python script always gets stuck on a subprocess call after having successfully executed a lot of calls. The subprocess in question is not executed at all (the bash script is not even started) and the python script blocks.
After stopping the execution with Ctrl+C, I get the point where it was stuck:
[...]
File "import_debug.py", line 20, in run
(stdout, stderr) = p.communicate()
File "/usr/lib64/python2.7/subprocess.py", line 800, in communicate
return self._communicate(input)
File "/usr/lib64/python2.7/subprocess.py", line 1401, in _communicate
stdout, stderr = self._communicate_with_poll(input)
File "/usr/lib64/python2.7/subprocess.py", line 1455, in _communicate_with_poll
ready = poller.poll()
KeyboardInterrupt
I don't understand why I have this issue nor how to solve it.
I have found this SO thread that seems to tackle the same issue or something equivalent (since the output after the keyboard interruption is the same), but there is no answer.
Question: What is happening here? What am I missing? How can I solve this issue?
EDIT:
The call is under the form:
(code, out, err) = run(["/path/to/bash_script.sh", "arg1", "arg2", "arg3"])
print out
if code:
    print "Failed: " + str(err)
The bash script is doing some basic processing with the data (unzip archives and do something with the extracted data).
When the error occurs, none of the bash script instructions are executed.
I cannot provide the exact command, arguments and contents for company privacy concerns.
The author of the original thread you're referring to says: "If I set stderr=None instead of stderr=subprocess.PIPE I never see this issue." -- I'd recommend doing exactly that to get your script working.
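If you still want the error output, a common variation (a sketch, not the author's exact suggestion) is to merge stderr into stdout so communicate() only has to drain a single pipe:

import subprocess

def run(cmd):
    # Merge stderr into stdout; you lose the separation of the two
    # streams, but avoid keeping a second pipe open.
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                         stderr=subprocess.STDOUT)
    stdout, _ = p.communicate()
    return (p.returncode, stdout)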
Added after reading the comment section:
There are a few useful unzip options that you may or may not want to use; the sketch after this list shows one of them in action:
-f freshen existing files, create none
-n never overwrite existing files
-o overwrite files WITHOUT prompting
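If the hang turns out to be unzip waiting for an overwrite confirmation inside the bash script, forcing -o makes it non-interactive. A hypothetical call through the run() helper above (paths are placeholders):

(code, out, err) = run(["unzip", "-o", "/path/to/archive.zip",
                        "-d", "/path/to/extract_dir"])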
I have a process with which I can communicate on the command line like this:
% process -
input
^D^D
output
So: I start the process, type some input and after hitting Ctrl-D twice, I get the output.
I want to make a Python wrapper around this process. I created this:
from subprocess import Popen, PIPE
p = Popen('process -', stdin=PIPE, stdout=PIPE, stderr=PIPE, shell=True)
while True:
    input = raw_input('Enter input: ')
    p.stdin.write(input)
    p.stdin.close()
    p.wait()
    output = p.stdout.read()
    print output
This works the first time, but after that I get:
Traceback (most recent call last):
File "test.py", line 7, in <module>
p.stdin.write(input)
ValueError: I/O operation on closed file
Is there another way to interact with this process without closing the file?
p.wait() will wait until the subprocess has exited prior to returning, so on the second iteration in your script, p has exited already (and has therefore closed p.stdin).
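Since 'process -' only produces its output once stdin is closed (the double Ctrl-D), one workaround is to start a fresh process for every input; a sketch in the question's Python 2 style:

from subprocess import Popen, PIPE

while True:
    data = raw_input('Enter input: ')
    # A new process per interaction: communicate() writes the input,
    # closes stdin (the EOF the process is waiting for), and waits.
    p = Popen('process -', stdin=PIPE, stdout=PIPE, stderr=PIPE, shell=True)
    output, errors = p.communicate(data)
    print output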
If the process you're wrapping exits after the first output, communication will fail on the second attempt, because all the pipes (stdin and stdout) will have been closed. Hence the error:
ValueError: I/O operation on closed file.
Each time you try to send input to the wrapped process, it must be expecting that input and the pipes must be open.
On the other hand, as Thomas said in his answer, p.wait() is not the way to go for a repeated input/output strategy.
You can't use subprocess.Popen.communicate() either, because it calls subprocess.Popen.wait() internally.
You can try p.stdin.write and p.stdout.read; here is a good article about the subject: Writing to a python subprocess pipe
To emulate the shell session:
$ process -
input
^D^D
output
In Python, using check_output():
#!/usr/bin/env python3
from subprocess import check_output
out = check_output(['process', '-'], input='input\n', universal_newlines=True)
print(out, end='')
Ctrl+D is recognized as EOF (terminate the input) by a Unix terminal (run stty -a and look for eof = ^D and icanon in the output). If you need to type Ctrl+D twice (at the beginning of a line), it might indicate a bug in the process program, such as the "for line in sys.stdin: doesn't notice EOF the first time" Python bug.
I have a terminal command that is to be run once. Basically, when you execute the command, the program runs continuously until told otherwise. While the program is running, it will keep outputting strings of text into the terminal. I would like to somehow store these strings as a variable so I can check to see if it contains specific keywords.
I tried using os.system, but alas, that does not store the response from the program.
Thanks in advance, and I'm sorry for my naivety
To spawn a subprocess running a command, use subprocess.Popen:
proc = subprocess.Popen(['cat', 'myfile'], stdout = subprocess.PIPE)
By setting stdout to PIPE, you'll make cat output text through a channel that you control.
When Popen returns, proc.stdout will be a file-like object that you can read(). It also exposes readline(), for convenience.
readline() returns the next line of the stream, \n at the end included. When it returns '', there's nothing left to read.
line = proc.stdout.readline()
while line != '':
    print line
    line = proc.stdout.readline()
You can also create a line iterator to walk the output line by line. You can give this iterator to other functions, that can then consume the output of your subprocess without interacting with the process object directly:
iterator = iter(proc.stdout.readline, '')
# Somewhere else in your program...
for line in iterator:
    print line
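For the keyword check from the question, a small consumer of such an iterator might look like this (the function name and keyword list are hypothetical):

# Hypothetical consumer: scan the subprocess output for keywords
# without touching the process object itself.
def watch(lines, keywords=('ERROR', 'WARNING')):
    for line in lines:
        if any(k in line for k in keywords):
            print 'matched:', line.rstrip()

watch(iterator)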
I have a simple python script which should read from stdin, so I redirect the stdout of another program to the stdin of my python script.
But the output logged by that program only reaches my python script when the program gets killed.
I actually want to handle each line as soon as it is available, not when the program, which should run 24/7, eventually quits.
So how can I make this happen? How can I make stdin not wait for Ctrl+D or EOF before handing over the data?
Example
# accept_stdin.py
import sys
import datetime
for line in sys.stdin:
    print datetime.datetime.now().second, line
# print_data.py
import time
print "1 foo"
time.sleep(3)
print "2 bar"
# bash
python print_data.py | python accept_stdin.py
Like all file objects, the sys.stdin iterator reads input in chunks; even if a line of input is ready, the iterator will try to read up to the chunk size or EOF before outputting anything. You can work around this by using the readline method, which doesn't have this behavior:
while True:
    line = sys.stdin.readline()
    if not line:
        # End of input
        break
    do_whatever_with(line)
You can combine this with the 2-argument form of iter to use a for loop:
for line in iter(sys.stdin.readline, ''):
    do_whatever_with(line)
I recommend leaving a comment in your code explaining why you're not using the regular iterator.
It is also an issue with your producer program, i.e. the one whose stdout you pipe into your python script.
Because this program only prints and never flushes, the data it prints is kept in its internal stdout buffer and never handed to the system.
Add a sys.stdout.flush() call right after each print statement in print_data.py.
You only see the data when you quit the program because it automatically flushes on exit.
See this question for an explanation.
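With that change, print_data.py becomes:

# print_data.py, flushing after each print so every line reaches the
# consumer immediately instead of sitting in the stdout buffer.
import sys
import time

print "1 foo"
sys.stdout.flush()
time.sleep(3)
print "2 bar"
sys.stdout.flush()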
As @user2357112 said, you need to use:
for line in iter(sys.stdin.readline, ''):
After that you need to start python with the -u flag, which makes stdout and stderr unbuffered so output is flushed immediately:
python -u print_data.py | python -u accept_stdin.py
You can also specify the flag in the shebang.
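For example, for the producer (the interpreter path is an assumption, and note that /usr/bin/env usually cannot pass the extra -u argument in a shebang):

#!/usr/bin/python -u
# print_data.py with unbuffered output requested via the shebang
import time

print "1 foo"
time.sleep(3)
print "2 bar"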