I'm spawning a process from a script using subprocess. My subprocess takes a JSON input, performs some operations, and should return some real-time data to the main process. How can I do this from the subprocess?
I'm trying something like this, but it is throwing an error.
Following is my main process, "main.py":
p = subprocess.Popen(['python', 'handler.py'],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE)
p.communicate(JSONEncoder().encode(data))

while True:
    out = p.stdout.read(1)
    if out == '' and p.poll() != None:
        break
    if out != '':
        sys.stdout.write(out)
        sys.stdout.flush()
Below is my subprocess, "handler.py":
if __name__ == '__main__':
    command = json.load(sys.stdin)
    os.environ["PYTHONPATH"] = "../../"
    if command["cmd"] == "archive":
        print "command received:", command["cmd"]
        file_ids, count = archive(command["files"])
        sys.stdout.write(JSONEncoder().encode(file_ids))
But it throws an error.
Traceback (most recent call last):
File "./core/main.py", line 46, in <module>
out = p.stdout.read(1)
ValueError: I/O operation on closed file
Am I doing something wrong here?
Popen.communicate() does not return until the process is dead, and it returns all of the output at once. You can't read the subprocess' stdout after it. Look at the top of the .communicate() docs:
Interact with process: Send data to stdin. Read data from stdout and
stderr, until end-of-file is reached. *Wait for process to terminate.*
(The emphasis is mine.)
If you want to send data and then read the output line by line as text while the child process is still running:
#!/usr/bin/env python3
import json
from subprocess import Popen, PIPE

with Popen(command, stdin=PIPE, stdout=PIPE, universal_newlines=True) as process:
    with process.stdin as pipe:
        pipe.write(json.dumps(data))
    for line in process.stdout:
        print(line, end='')
        # handle each line here as soon as it arrives
If you need code for older python versions or you have buffering issues, see Python: read streaming input from subprocess.communicate().
If all you want is to pass data to the child process and to print the output to terminal:
#!/usr/bin/env python3.5
import json
import subprocess
subprocess.run(command, input=json.dumps(data).encode())
If your actual child process is a Python script then consider importing it as a module and running the corresponding functions instead, see Call python script with input with in a python script using subprocess.
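For instance, a minimal sketch of that approach, assuming archive() can be imported from handler.py (as the question's handler code suggests) and data is the same dict main.py was sending:
# main.py: call the handler in the same process instead of spawning it
# (assumes archive() is importable from handler.py)
from handler import archive

file_ids, count = archive(data["files"])
print(file_ids)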
communicate reads all the output from a subprocess and closes it. If you want to be able to read from the process after writing, you have to use something other than communicate, such as p.stdin.write. Alternatively, just use the output of communicate; it should have what you want: https://docs.python.org/3/library/subprocess.html#popen-objects
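For example, a minimal sketch of the second option, using the question's names and reading everything communicate() hands back instead of touching p.stdout afterwards:
import subprocess
from json import JSONEncoder

# universal_newlines=True makes the pipes text mode, so we can pass the
# JSON string directly and get a string back
p = subprocess.Popen(['python', 'handler.py'],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                     universal_newlines=True)
out, _ = p.communicate(JSONEncoder().encode(data))
print(out)  # everything handler.py wrote to stdout, all at once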
Related
Lately I was trying to write a simple Python script which was supposed to communicate with another process using stdin. Here's what I've tried so far:
File start.py:
import sys
from subprocess import PIPE, Popen

proc = Popen(["python3", "receive.py"], stdout=PIPE, stdin=PIPE, stderr=PIPE)
proc.stdin.write(b"foo\n")
proc.stdin.flush()
print(proc.stdout.readline())
File receive.py:
import sys

while True:
    receive = sys.stdin.readline().decode("utf-8")
    if receive == "END":
        break
    else:
        if receive != "":
            sys.stdout.write(receive + "-" + receive)
            sys.stdout.flush()
Unfortunately, when I run python3 start.py, all I get as a result is b''. How should I respond to the prompt of the other process?
The sub-process ends early. You can check it by printing stderr of the sub-process.
# after proc.stdin.flush()
print(proc.stderr.read())
Error message:
Traceback (most recent call last):
File "receive.py", line 4, in <module>
receive = sys.stdin.readline().decode()
AttributeError: 'str' object has no attribute 'decode'
The reason why the sub-process ends early
sys.stdin.readline() returns a string (not a byte string); trying to call decode on a string causes an AttributeError in Python 3.x.
To fix the issue, remove decode(..) call in receive.py:
receive = sys.stdin.readline() # without decode.
And, to make start.py complete, send END and close the sub-process's stdin so that the sub-process can finish gracefully.
proc.stdin.write(b"foo\n")
proc.stdin.flush()
print(proc.stdout.readline())
proc.stdin.write(b'END') # <---
proc.stdin.close() # <---
# proc.wait()
I have a script in Python 3, and if I use subprocess.Popen.wait() I have a problem: my script runs some Linux command many times in a loop, and it looks like my app is not responding. When I use subprocess.Popen.communicate() my application correctly completes its work in a second.
What is the right way to solve this problem using Linux?
I think the solution must involve manipulating some buffer variable, but I have searched the entire Internet and could not find anything suitable. Maybe I don't know enough about the structure and operation of Linux as a whole.
My question can be reformulated as follows: what exactly happens when I use the .wait() method, and what leads to its failure? What causes such a long wait? When I abort the running task I see the following log:
Traceback (most recent call last):
File "./test.py", line 6, in <module>
proc.wait()
File "/usr/lib/python3.5/subprocess.py", line 1658, in wait
(pid, sts) = self._try_wait(0)
File "/usr/lib/python3.5/subprocess.py", line 1608, in _try_wait
(pid, sts) = os.waitpid(self.pid, wait_flags)
KeyboardInterrupt
My files look approximately like this:
script.py:
#!/usr/bin/python3
# -*-coding: utf-8 -*-
import subprocess
proc = subprocess.Popen(['./1.py', '1000000'], stdin=subprocess.PIPE, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
proc.wait()
out = proc.stdout.read()
# out = proc.communicate()[0]
print(len(out))
1.py:
#!/usr/bin/python3
# -*-coding: utf-8 -*-
import sys
x = sys.argv[-1]
# print(x, type(x))
for i in range(int(x)):
    print(i)
UPD: As we now understand, the problem is that the pipe buffer fills up. So the final version of the question is: how do I use Linux facilities to enlarge the buffer, or redirect the buffered output to a file, before running the script?
UPD2: I also tried running the script as $ python3 -u ./script.py, but unfortunately unbuffering doesn't work as I would like and the script still hangs.
Your script is sending output to its stdout or stderr pipes. The operating system will buffer some data, then block the process forever when the pipe fills. Suppose I have a long-winded command like
longwinded.py:
for i in range(100000):
    print('a' * 1000)
The following hangs because the stdout pipe fills
import sys
import subprocess as subp
p = subp.Popen([sys.executable, 'longwinded.py'], stdout=subp.PIPE,
stderr=subp.PIPE)
p.wait()
The next one doesn't hang because communicate reads the stdout and stderr pipes into memory
p = subp.Popen([sys.executable, 'longwinded.py'], stdout=subp.PIPE,
stderr=subp.PIPE)
p.communicate()
If you don't care what stdout and err are, you can redirect them to the null device
import os

p = subp.Popen([sys.executable, 'longwinded.py'],
               stdout=open(os.devnull, 'w'),
               stderr=open(os.devnull, 'w'))
p.wait()
or save them to a file
p = subp.Popen([sys.executable, 'longwinded.py'],
stdout=open('mystdout', 'w'),
stderr=open('mystderr', 'w'))
p.wait()
I'm trying to catch the output of airodump-ng, which produces continuous output, and process every line searching for a string. But that doesn't work, so I tried the same thing with the "htop" command, which has the same kind of output, and it still doesn't work.
I'm trying this with Python 3.4 and Python 2.7, both on Arch Linux and OS X Mavericks. Here's the code (not every import is necessary, but never mind):
import subprocess
import sys
import os
import time
command = ["htop"]
proc = subprocess.Popen(command, stdout = subprocess.PIPE)
outs, errs = proc.communicate(timeout=3)
proc.kill()
and it gives me:
Traceback (most recent call last):
File "/Users/andrei/Dropbox/python/file_prova.py", line 8, in <module>
outs, errs = proc.communicate(timeout=3)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/subprocess.py", line 960, in communicate
stdout, stderr = self._communicate(input, endtime, timeout)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/subprocess.py", line 1618, in _communicate
self._check_timeout(endtime, orig_timeout)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/subprocess.py", line 986, in _check_timeout
raise TimeoutExpired(self.args, orig_timeout)
subprocess.TimeoutExpired: Command '['htop']' timed out after 3 seconds
It seems like it crashes at proc.communicate() and doesn't execute the lines below it. I also tried to handle the exception, but there was no way to make it work...
[EDIT]
OK, so it's 4 am; I learned try/except exception handling, and after a long time I managed to make it work with htop, following tips that took a long time to find (the 2nd solution doesn't seem to work).
This is how it looks:
from subprocess import Popen, PIPE
from time import sleep
from fcntl import fcntl, F_GETFL, F_SETFL
from os import O_NONBLOCK, read
# run the shell as a subprocess:
p = Popen(['htop'], stdout = PIPE)
# set the O_NONBLOCK flag of p.stdout file descriptor:
flags = fcntl(p.stdout, F_GETFL) # get current p.stdout flags
fcntl(p.stdout, F_SETFL, flags | O_NONBLOCK)
# let the shell output the result:
# get the output
while True:
    sleep(1)
    try:
        print (read(p.stdout.fileno(), 1024).decode("utf-8")),
    except OSError:
        # the os throws an exception if there is no data
        print ('[No more data]')
        continue
It works flawlessly with htop, but not with airodump-ng: airodump-ng prints its output to the terminal, and every second (the sleep() in the while loop) the script prints [No more data], as if the stream were going elsewhere...
EDIT 2:
Solved! The thing was just that airodump-ng dumps its data to stderr, not stdout. Pretty straightforward, in hindsight :D
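In other words, attach the pipe to stderr and read that instead; a rough Python 3 sketch (the interface name is just a placeholder):
from os import read
from subprocess import Popen, PIPE

# airodump-ng writes its live table to stderr, so capture that stream
p = Popen(['airodump-ng', 'wlan0mon'], stderr=PIPE)
while True:
    chunk = read(p.stderr.fileno(), 1024)
    if not chunk:  # EOF: the child exited or closed its stderr
        break
    print(chunk.decode('utf-8', errors='replace'), end='')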
From the documentation:
The timeout argument is passed to Popen.wait(). If the timeout
expires, the child process will be killed and then waited for again.
The TimeoutExpired exception will be re-raised after the child process
has terminated.
That seems to describe exactly the behavior you are seeing. You will need to learn about exception handling using try/except.
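In other words, wrap the call in a try/except and decide what to do when the timeout fires; a minimal sketch of that pattern:
import subprocess

proc = subprocess.Popen(['htop'], stdout=subprocess.PIPE)
try:
    outs, errs = proc.communicate(timeout=3)
except subprocess.TimeoutExpired:
    # the command is still running after 3 seconds: kill it and
    # collect whatever it managed to write before dying
    proc.kill()
    outs, errs = proc.communicate()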
In shell script, we have the following command:
/script1.pl < input_file | /script2.pl > output_file
I would like to replicate the above pipeline in Python using the subprocess module. input_file is a large file and I can't read the whole file at once, so I would like to pass each line (an input_string) into the pipe stream and get back a string variable output_string, until the whole file has been streamed through.
The following is a first attempt:
process = subprocess.Popen(["/script1.pl | /script2.pl"], stdin = subprocess.PIPE, stdout = subprocess.PIPE, shell = True)
process.stdin.write(input_string)
output_string = process.communicate()[0]
However, process.communicate()[0] closes the stream, and I would like to keep the stream open for future input. I have tried using process.stdout.readline() instead, but the program hangs.
To emulate /script1.pl < input_file | /script2.pl > output_file shell command using subprocess module in Python:
#!/usr/bin/env python
from subprocess import check_call
with open('input_file', 'rb') as input_file, \
     open('output_file', 'wb') as output_file:
    check_call("/script1.pl | /script2.pl", shell=True,
               stdin=input_file, stdout=output_file)
You could write it without shell=True (though I don't see a reason here) based on 17.1.4.2. Replacing shell pipeline example from the docs:
#!/usr/bin/env python
from subprocess import Popen, PIPE
with open('input_file', 'rb') as input_file:
    script1 = Popen("/script1.pl", stdin=input_file, stdout=PIPE)
with open("output_file", "wb") as output_file:
    script2 = Popen("/script2.pl", stdin=script1.stdout, stdout=output_file)
script1.stdout.close()  # allow script1 to receive SIGPIPE if script2 exits
script2.wait()
script1.wait()
You could also use plumbum module to get shell-like syntax in Python:
#!/usr/bin/env python
from plumbum import local
script1, script2 = local["/script1.pl"], local["/script2.pl"]
(script1 < "input_file" | script2 > "output_file")()
See also How do I use subprocess.Popen to connect multiple processes by pipes?
If you want to read/write line by line then the answer depends on the concrete scripts that you want to run. In general it is easy to deadlock sending/receiving input/output if you are not careful e.g., due to buffering issues.
If input doesn't depend on output in your case then a reliable cross-platform approach is to use a separate thread for each stream:
#!/usr/bin/env python
from subprocess import Popen, PIPE
from threading import Thread
def pump_input(pipe):
    try:
        for i in xrange(1000000000):  # generate large input
            print >>pipe, i
    finally:
        pipe.close()

p = Popen("/script1.pl | /script2.pl", shell=True, stdin=PIPE, stdout=PIPE,
          bufsize=1)
Thread(target=pump_input, args=[p.stdin]).start()
try:  # read output line by line as soon as the child flushes its stdout buffer
    for line in iter(p.stdout.readline, b''):
        print line.strip()[::-1]  # print reversed lines
finally:
    p.stdout.close()
    p.wait()
I'd like to use the subprocess module in the following way:
create a new process that potentially takes a long time to execute.
capture stdout (or stderr, or potentially both, either together or separately)
Process data from the subprocess as it comes in, perhaps firing events on every line received (in wxPython say) or simply printing them out for now.
I've created processes with Popen, but if I use communicate() the data comes at me all at once, once the process has terminated.
If I create a separate thread that does a blocking readline() of myprocess.stdout (using stdout = subprocess.PIPE) I don't get any lines with this method either, until the process terminates. (no matter what I set as bufsize)
Is there a way to deal with this that isn't horrendous, and works well on multiple platforms?
Update with code that appears not to work (on Windows, anyway):
import threading
import wx  # the worker thread is used from a wxPython app

class ThreadWorker(threading.Thread):
    def __init__(self, callable, *args, **kwargs):
        super(ThreadWorker, self).__init__()
        self.callable = callable
        self.args = args
        self.kwargs = kwargs
        self.setDaemon(True)

    def run(self):
        try:
            self.callable(*self.args, **self.kwargs)
        except wx.PyDeadObjectError:
            pass
        except Exception, e:
            print e

if __name__ == "__main__":
    import os
    from subprocess import Popen, PIPE

    def worker(pipe):
        while True:
            line = pipe.readline()
            if line == '': break
            else: print line

    proc = Popen("python subprocess_test.py", shell=True, stdin=PIPE, stdout=PIPE, stderr=PIPE)

    stdout_worker = ThreadWorker(worker, proc.stdout)
    stderr_worker = ThreadWorker(worker, proc.stderr)
    stdout_worker.start()
    stderr_worker.start()
    while True: pass
stdout will be buffered - so you won't get anything till that buffer is filled, or the subprocess exits.
You can try flushing stdout from the sub-process, or using stderr, or changing stdout on non-buffered mode.
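For instance, if the sub-process is itself a Python script, a small sketch of the flushing approach (unbuffered mode can also be forced with python -u or PYTHONUNBUFFERED=1); the child script here is hypothetical:
# hypothetical child script: flush after each line so the parent's
# readline() sees output as it is produced, not when the buffer fills
import sys
import time

for i in range(10):
    sys.stdout.write("line %d\n" % i)
    sys.stdout.flush()
    time.sleep(1)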
It sounds like the issue might be the use of buffered output by the subprocess - if a relatively small amount of output is created, it could be buffered until the subprocess exits. Some background can be found here:
Here's what worked for me:
cmd = ["./tester_script.bash"]
p = subprocess.Popen( cmd, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE )
while p.poll() is None:
out = p.stdout.readline()
do_something_with( out, err )
In your case you could try to pass a reference to the sub-process to your Worker Thread, and do the polling inside the thread. I don't know how it will behave when two threads poll (and interact with) the same subprocess, but it may work.
Also note that the while p.poll() is None: is intended as is. Do not replace it with while not p.poll(), since in Python 0 (the return code for successful termination) is also considered False.
I've been running into this problem too. It occurs because you are trying to read stderr as well. If there are no errors, then trying to read from stderr will block.
On Windows, there is no easy way to poll() file descriptors (only Winsock sockets).
So a solution is not to try and read from stderr.
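If you still want the error output, one way to follow that advice while still capturing errors is to merge stderr into stdout so there is only one pipe to read; a sketch along those lines, running the question's subprocess_test.py:
from subprocess import Popen, PIPE, STDOUT

# merge stderr into stdout: a single pipe means a single blocking read
proc = Popen("python subprocess_test.py", shell=True,
             stdin=PIPE, stdout=PIPE, stderr=STDOUT)
for line in iter(proc.stdout.readline, b''):
    print(line)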
Using pexpect [http://www.noah.org/wiki/Pexpect] with non-blocking readlines will resolve this problem. It stems from the fact that pipes are buffered, so your app's output is getting buffered by the pipe, and therefore you can't get to that output until the buffer fills or the process dies.
This seems to be a well-known Python limitation, see
PEP 3145 and maybe others.
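A minimal sketch of the pexpect approach (again running the question's subprocess_test.py as an example); pexpect runs the child under a pseudo-terminal, so the child's stdio is line-buffered rather than block-buffered:
import pexpect

# spawn the child under a pty; readline() returns an empty string at EOF
child = pexpect.spawn('python subprocess_test.py')
while True:
    line = child.readline()
    if not line:
        break
    print(line)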
Read one character at a time: http://blog.thelinuxkid.com/2013/06/get-python-subprocess-output-without.html
import contextlib
import subprocess

# Unix, Windows and old Macintosh end-of-line
newlines = ['\n', '\r\n', '\r']

def unbuffered(proc, stream='stdout'):
    stream = getattr(proc, stream)
    with contextlib.closing(stream):
        while True:
            out = []
            last = stream.read(1)
            # Don't loop forever
            if last == '' and proc.poll() is not None:
                break
            while last not in newlines:
                # Don't loop forever
                if last == '' and proc.poll() is not None:
                    break
                out.append(last)
                last = stream.read(1)
            out = ''.join(out)
            yield out

def example():
    cmd = ['ls', '-l', '/']
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        # Make all end-of-lines '\n'
        universal_newlines=True,
    )
    for line in unbuffered(proc):
        print line

example()
Using subprocess.Popen, I can run the .exe of one of my C# projects and redirect the output to my Python file. I am now able to print() all the information being output to the C# console (using Console.WriteLine()) in the Python console.
Python code:
from subprocess import Popen, PIPE, STDOUT

p = Popen('ConsoleDataImporter.exe', stdout=PIPE, stderr=STDOUT, shell=True)
while True:
    line = p.stdout.readline()
    print(line)
    if not line:
        break
This gets the console output of my .NET project line by line as it is created, and breaks out of the enclosing while loop upon the project's termination. I'd imagine this would work for two Python files as well.
I've used the pexpect module for this, it seems to work ok. http://sourceforge.net/projects/pexpect/