As in the question title, I would like process A (the code below, which reads the contents of the outfile.txt file) to pass its output to process B, which will "download" the contents from process A and create a new file with them.
How can I do this, or can someone give me an idea?
import subprocess

proc = subprocess.Popen(['python', 'outfile.txt'], stdout=subprocess.PIPE)

while True:
    line = proc.stdout.readline()
    if not line:
        break
    print("test:", line.rstrip())
I'm not sure I understand what you want to do.
For me, process A and process B mean two Popen calls in one script.
If you want to send data directly from one process to another, then you can use
process_B = subprocess.Popen(..., stdin=process_A.stdout)
Minimal working code; it runs ls | sort -r on Linux.
import subprocess
#import sys

process_A = subprocess.Popen(['ls'], stdout=subprocess.PIPE)
process_B = subprocess.Popen(['sort', '-r'], stdout=subprocess.PIPE, stdin=process_A.stdout)

# - show result -

for line in process_B.stdout:
    #sys.stdout.write(line.decode())
    print(line.decode().rstrip())
or show the result with communicate():
import subprocess
process_A = subprocess.Popen(['ls'], stdout=subprocess.PIPE)
process_B = subprocess.Popen(['sort', '-r'], stdout=subprocess.PIPE, stdin=process_A.stdout)
# - show result -
stdout, stderr = process_B.communicate()
print(stdout.decode())
EDIT:
If you want to modify the data between processes:
import subprocess
process_A = subprocess.Popen(['ls', '/'], stdout=subprocess.PIPE)
process_B = subprocess.Popen(['sort', '-r'], stdout=subprocess.PIPE, stdin=subprocess.PIPE)
# - get all from one process -
stdout_A, stderr_A = process_A.communicate()
# - modify all -
stdout_A = stdout_A.upper()
# - send all to other process and get result -
stdout_B, stderr_B = process_B.communicate(stdout_A)
print(stdout_B.decode())
or
import subprocess

process_A = subprocess.Popen(['ls', '/'], stdout=subprocess.PIPE)
process_B = subprocess.Popen(['sort', '-r'], stdout=subprocess.PIPE, stdin=subprocess.PIPE)

# - send from one process to another line by line -
for line in process_A.stdout:
    line = line.upper()
    process_B.stdin.write(line)

# - get result -
stdout_B, stderr_B = process_B.communicate()
print(stdout_B.decode())
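Applied to the question's original file scenario, one way this could look is sketched below. This is only a sketch: the inline -c scripts are stand-ins for whatever actually produces and consumes the data (here process A simply prints outfile.txt, and process B writes whatever it receives on stdin to newfile.txt).

import subprocess
import sys

# process A: any command that writes the file's contents to stdout would do;
# here a tiny inline Python script (a stand-in) prints outfile.txt
process_A = subprocess.Popen(
    [sys.executable, '-c', "print(open('outfile.txt').read(), end='')"],
    stdout=subprocess.PIPE)

# process B: "downloads" A's output from its stdin and writes it to newfile.txt
process_B = subprocess.Popen(
    [sys.executable, '-c', "import sys; open('newfile.txt', 'w').write(sys.stdin.read())"],
    stdin=process_A.stdout)

process_A.stdout.close()   # so A gets SIGPIPE if B exits early
process_B.communicate()    # wait for B to finish writing newfile.txt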
Consider the following snippet that runs three different subprocesses one after the other with subprocess.run (and, notably, with all kwargs left at their defaults):
import subprocess

p1 = subprocess.run(args1)
if p1.returncode != 0:
    error()

p2 = subprocess.run(args2)
if p2.returncode != 0:
    error()

p3 = subprocess.run(args3)
if p3.returncode != 0:
    error()
How can we rewrite this so that the subprocesses run in parallel with each other?
With Popen, right? What exactly does that look like?
For reference, the implementation of subprocess.run is essentially:
with Popen(*popenargs, **kwargs) as process:
    try:
        stdout, stderr = process.communicate(input, timeout=timeout)
    except TimeoutExpired as exc:
        process.kill()
        if _mswindows:
            exc.stdout, exc.stderr = process.communicate()
        else:
            process.wait()
        raise
    except:
        process.kill()
        raise
    retcode = process.poll()
return CompletedProcess(process.args, retcode, stdout, stderr)
So something like...
with Popen(args1) as p1:
    with Popen(args2) as p2:
        with Popen(args3) as p3:
            try:
                p1.communicate(None, timeout=None)
                p2.communicate(None, timeout=None)
                p3.communicate(None, timeout=None)
            except:
                p1.kill()
                p2.kill()
                p3.kill()
                raise

if p1.poll() != 0 or p2.poll() != 0 or p3.poll() != 0:
    error()
Is that along the right lines?
I would just use multiprocessing to accomplish your mission, but make sure that your invocation of subprocess.run uses capture_output=True so that the output from the 3 commands running in parallel is not interleaved:
import multiprocessing
import subprocess

def runner(args):
    p = subprocess.run(args, capture_output=True, text=True)
    if p.returncode != 0:
        raise Exception(f'Return code was {p.returncode}.')
    return p.stdout, p.stderr

def main():
    args1 = ['git', 'status']
    args2 = ['git', 'log', '-3']
    args3 = ['git', 'branch']
    args = [args1, args2, args3]
    with multiprocessing.Pool(3) as pool:
        results = [pool.apply_async(runner, args=(arg,)) for arg in args]
        for result in results:
            try:
                out, err = result.get()
                print(out, end='')
            except Exception as e:  # runner completed with an Exception
                print(e)

if __name__ == '__main__':  # required for Windows
    main()
Update
With just subprocess we have something like:
import subprocess
args1 = ['git', 'status']
args2 = ['git', 'log', '-3']
args3 = ['git', 'branch']
p1 = subprocess.Popen(args1)
p2 = subprocess.Popen(args2)
p3 = subprocess.Popen(args3)
p1.communicate()
rc1 = p1.returncode
p2.communicate()
rc2 = p2.returncode
p3.communicate()
rc3 = p3.returncode
But, for whatever reason, on my Windows platform I never saw the output from the third subprocess command ('git branch'), so there must be some limitation there. Also, if a command required input from stdin before proceeding, that input would have to be passed to the communicate method; but communicate does not return until the subprocess has completed, so the three calls would run one after another and you would get no parallelism. As a general solution this is therefore not very good. In the multiprocessing code, there is no problem with passing stdin input to communicate.
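To illustrate that last point, here is a minimal sketch of my own (not part of the code above) that keeps the parallelism even when the commands need stdin, by giving each communicate call its own thread rather than a separate process; the Unix commands and inputs are only placeholders:

import subprocess
from concurrent.futures import ThreadPoolExecutor

def run_with_input(args, data):
    # each worker blocks in its own communicate(), so the three child
    # processes still run in parallel even though they need stdin
    p = subprocess.Popen(args, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, text=True)
    out, _ = p.communicate(data)
    return p.returncode, out

# hypothetical commands that read from stdin (Unix tools used as examples)
jobs = [(['sort'], 'b\na\nc\n'),
        (['sort', '-r'], 'b\na\nc\n'),
        (['cat'], 'hello\n')]

with ThreadPoolExecutor(max_workers=3) as executor:
    futures = [executor.submit(run_with_input, args, data) for args, data in jobs]
    for future in futures:
        returncode, out = future.result()
        print(returncode, out, end='')

Threads are enough here because each worker just waits on its own child process; the actual work happens in the subprocesses.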
Update 2
When I recode it as follows, I now get all the expected output. I am not sure why it makes a difference, however. According to the documentation, Popen.communicate:
Interact with process: Send data to stdin. Read data from stdout and stderr, until end-of-file is reached. Wait for process to terminate and set the returncode attribute. The optional input argument should be data to be sent to the child process, or None, if no data should be sent to the child. If streams were opened in text mode, input must be a string. Otherwise, it must be bytes.
So the call should wait for the process to terminate. Nevertheless, my preceding comment still applies: a command that requires stdin input (via a pipe) would not run in parallel this way without using multiprocessing.
import subprocess
args1 = ['git', 'status']
args2 = ['git', 'log', '-3']
args3 = ['git', 'branch']
with subprocess.Popen(args1) as p1:
    with subprocess.Popen(args2) as p2:
        with subprocess.Popen(args3) as p3:
            p1.communicate()
            rc1 = p1.returncode
            p2.communicate()
            rc2 = p2.returncode
            p3.communicate()
            rc3 = p3.returncode
I have a python subprocess that I'm trying to read output and error streams from. Currently I have it working, but I'm only able to read from stderr after I've finished reading from stdout. Here's what it looks like:
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout_iterator = iter(process.stdout.readline, b"")
stderr_iterator = iter(process.stderr.readline, b"")

for line in stdout_iterator:
    # Do stuff with line
    print line

for line in stderr_iterator:
    # Do stuff with line
    print line
As you can see, the stderr for loop can't start until the stdout loop completes. How can I modify this so that I can read from both, in the order the lines come in?
To clarify: I still need to be able to tell whether a line came from stdout or stderr because they will be treated differently in my code.
The code in your question may deadlock if the child process produces enough output on stderr (~100KB on my Linux machine).
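To see why, here is a small sketch of mine (the inline child script is hypothetical) that reproduces the hang: the child fills the stderr pipe while the parent is still blocked reading stdout, so neither side can make progress. Expect it to hang until you interrupt it.

import subprocess
import sys

# hypothetical child that writes more to stderr than a pipe buffer can hold
child_code = (
    "import sys\n"
    "sys.stderr.write('x' * 200000)\n"   # the child blocks once the stderr pipe is full
    "sys.stderr.flush()\n"
    "print('done')\n"
)

p = subprocess.Popen([sys.executable, "-c", child_code],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# like the code in the question: try to read stdout to the end first ...
for line in iter(p.stdout.readline, b""):   # hangs: the child is stuck writing stderr,
    print(line)                             # so 'done' (and EOF) never arrive on stdout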
There is a communicate() method that lets you read from both stdout and stderr separately:
from subprocess import Popen, PIPE
process = Popen(command, stdout=PIPE, stderr=PIPE)
output, err = process.communicate()
If you need to read the streams while the child process is still running then the portable solution is to use threads (not tested):
from subprocess import Popen, PIPE
from threading import Thread
from Queue import Queue  # Python 2

def reader(pipe, queue):
    try:
        with pipe:
            for line in iter(pipe.readline, b''):
                queue.put((pipe, line))
    finally:
        queue.put(None)

process = Popen(command, stdout=PIPE, stderr=PIPE, bufsize=1)
q = Queue()
Thread(target=reader, args=[process.stdout, q]).start()
Thread(target=reader, args=[process.stderr, q]).start()

for _ in range(2):
    for source, line in iter(q.get, None):
        print "%s: %s" % (source, line),
See:
Python: read streaming input from subprocess.communicate()
Non-blocking read on a subprocess.PIPE in python
Python subprocess get children's output to file and terminal?
Here's a solution based on selectors, but one that preserves order, and streams variable-length characters (even single chars).
The trick is to use read1(), instead of read().
import selectors
import subprocess
import sys
p = subprocess.Popen(
    ["python", "random_out.py"], stdout=subprocess.PIPE, stderr=subprocess.PIPE
)

sel = selectors.DefaultSelector()
sel.register(p.stdout, selectors.EVENT_READ)
sel.register(p.stderr, selectors.EVENT_READ)

while True:
    for key, _ in sel.select():
        data = key.fileobj.read1().decode()
        if not data:
            exit()
        if key.fileobj is p.stdout:
            print(data, end="")
        else:
            print(data, end="", file=sys.stderr)
If you want a test program, use this (save it as random_out.py next to the script above).
import sys
from time import sleep
for i in range(10):
    print(f" x{i} ", file=sys.stderr, end="")
    sleep(0.1)
    print(f" y{i} ", end="")
    sleep(0.1)
The order in which a process writes data to different pipes is lost after write.
There is no way you can tell if stdout has been written before stderr.
You can try to read data simultaneously from multiple file descriptors in a non-blocking way
as soon as data is available, but this would only minimize the probability that the order is incorrect.
This program should demonstrate this:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import select
import subprocess

testapps = {
    'slow': '''
import os
import time
os.write(1, 'aaa')
time.sleep(0.01)
os.write(2, 'bbb')
time.sleep(0.01)
os.write(1, 'ccc')
''',
    'fast': '''
import os
os.write(1, 'aaa')
os.write(2, 'bbb')
os.write(1, 'ccc')
''',
    'fast2': '''
import os
os.write(1, 'aaa')
os.write(2, 'bbbbbbbbbbbbbbb')
os.write(1, 'ccc')
'''
}

def readfds(fds, maxread):
    while True:
        fdsin, _, _ = select.select(fds, [], [])
        for fd in fdsin:
            s = os.read(fd, maxread)
            if len(s) == 0:
                fds.remove(fd)
                continue
            yield fd, s
        if fds == []:
            break

def readfromapp(app, rounds=10, maxread=1024):
    f = open('testapp.py', 'w')
    f.write(testapps[app])
    f.close()

    results = {}
    for i in range(0, rounds):
        p = subprocess.Popen(['python', 'testapp.py'],
                             stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        data = ''
        for (fd, s) in readfds([p.stdout.fileno(), p.stderr.fileno()], maxread):
            data = data + s
        results[data] = results[data] + 1 if data in results else 1

    print 'running %i rounds %s with maxread=%i' % (rounds, app, maxread)
    results = sorted(results.items(), key=lambda (k, v): k, reverse=False)
    for data, count in results:
        print '%03i x %s' % (count, data)

print
print "=> if output is produced slowly this should work as wished"
print "   and should return: aaabbbccc"
readfromapp('slow', rounds=100, maxread=1024)

print
print "=> now mostly aaacccbbb is returned, not as it should be"
readfromapp('fast', rounds=100, maxread=1024)

print
print "=> you could try to read data one by one, and return"
print "   e.g. a whole line only when LF is read"
print "   (b's should be finished before c's)"
readfromapp('fast', rounds=100, maxread=1)

print
print "=> but even this won't work ..."
readfromapp('fast2', rounds=100, maxread=1)
and outputs something like this:
=> if output is produced slowly this should work as wished
   and should return: aaabbbccc
running 100 rounds slow with maxread=1024
100 x aaabbbccc

=> now mostly aaacccbbb is returned, not as it should be
running 100 rounds fast with maxread=1024
006 x aaabbbccc
094 x aaacccbbb

=> you could try to read data one by one, and return
   e.g. a whole line only when LF is read
   (b's should be finished before c's)
running 100 rounds fast with maxread=1
003 x aaabbbccc
003 x aababcbcc
094 x abababccc

=> but even this won't work ...
running 100 rounds fast2 with maxread=1
003 x aaabbbbbbbbbbbbbbbccc
001 x aaacbcbcbbbbbbbbbbbbb
008 x aababcbcbcbbbbbbbbbbb
088 x abababcbcbcbbbbbbbbbb
This works for Python 3 (3.6+):
import selectors
import subprocess
import sys

# 'cmd' is your command, as in the question's 'command'
p = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE, universal_newlines=True)

# Read both stdout and stderr simultaneously
sel = selectors.DefaultSelector()
sel.register(p.stdout, selectors.EVENT_READ)
sel.register(p.stderr, selectors.EVENT_READ)
ok = True
while ok:
    for key, val1 in sel.select():
        line = key.fileobj.readline()
        if not line:
            ok = False
            break
        if key.fileobj is p.stdout:
            print(f"STDOUT: {line}", end="")
        else:
            print(f"STDERR: {line}", end="", file=sys.stderr)
from https://docs.python.org/3/library/subprocess.html#using-the-subprocess-module
If you wish to capture and combine both streams into one, use
stdout=PIPE and stderr=STDOUT instead of capture_output.
So, if you do not actually need to tell the two streams apart, the easiest solution would be:
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
stdout_iterator = iter(process.stdout.readline, b"")

for line in stdout_iterator:
    # Do stuff with line
    print line
I know this question is very old, but this answer may help others who stumble upon this page while researching a solution to a similar situation, so I'm posting it anyway.
I've built a simple Python snippet that will merge any number of pipes into a single one. Of course, as stated above, the order cannot be guaranteed, but this is as close as I think you can get in Python.
It spawns a thread for each of the pipes, reads them line by line and puts them into a Queue (which is FIFO). The main thread loops through the queue, yielding each line.
import threading, queue

def merge_pipes(**named_pipes):
    r'''
    Merges multiple pipes from subprocess.Popen (maybe other sources as well).
    The keyword argument keys will be used in the output to identify the source
    of the line.

    Example:
    p = subprocess.Popen(['some', 'call'],
                         stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    outputs = {'out': log.info, 'err': log.warn}
    for name, line in merge_pipes(out=p.stdout, err=p.stderr):
        outputs[name](line)

    This will output stdout to the info logger, and stderr to the warning logger
    '''

    # Constants. Could also be placed outside of the method. I just put them here
    # so the method is fully self-contained
    PIPE_OPENED = 1
    PIPE_OUTPUT = 2
    PIPE_CLOSED = 3

    # Create a queue where the pipes will be read into
    output = queue.Queue()

    # This method is the run body for the threads that are instantiated below
    # This could be easily rewritten to be outside of the merge_pipes method,
    # but to make it fully self-contained I put it here
    def pipe_reader(name, pipe):
        r"""
        reads a single pipe into the queue
        """
        output.put((PIPE_OPENED, name,))
        try:
            for line in iter(pipe.readline, ''):
                output.put((PIPE_OUTPUT, name, line.rstrip(),))
        finally:
            output.put((PIPE_CLOSED, name,))

    # Start a reader for each pipe
    for name, pipe in named_pipes.items():
        t = threading.Thread(target=pipe_reader, args=(name, pipe,))
        t.daemon = True
        t.start()

    # Use a counter to determine how many pipes are left open.
    # If all are closed, we can return
    pipe_count = 0

    # Read the queue in order, blocking if there's no data
    for data in iter(output.get, ''):
        code = data[0]
        if code == PIPE_OPENED:
            pipe_count += 1
        elif code == PIPE_CLOSED:
            pipe_count -= 1
        elif code == PIPE_OUTPUT:
            yield data[1:]
        if pipe_count == 0:
            return
This works for me (on Windows):
https://github.com/waszil/subpiper
from subpiper import subpiper

def my_stdout_callback(line: str):
    print(f'STDOUT: {line}')

def my_stderr_callback(line: str):
    print(f'STDERR: {line}')

my_additional_path_list = [r'c:\important_location']

retcode = subpiper(cmd='echo magic',
                   stdout_callback=my_stdout_callback,
                   stderr_callback=my_stderr_callback,
                   add_path_list=my_additional_path_list)
I got a function that invokes a process using subprocess.Popen in the following way:
def func():
    ...
    process = subprocess.Popen(substr, shell=True, stdout=subprocess.PIPE)
    timeout = {"value": False}
    timer = Timer(timeout_sec, kill_proc, [process, timeout])
    timer.start()

    for line in process.stdout:
        lines.append(line)

    timer.cancel()
    if timeout["value"] == True:
        return 0
    ...
I call this function from another function in a loop (e.g. over range(1, 100)). How can I make multiple calls to the function with multiprocessing, so that several processes run in parallel each time?
The processes don't depend on each other; the only constraint is that each process works on a single index (e.g. no two processes will work on index 1).
Thanks for your help.
Just add the index to your Popen call and create a worker pool with as many workers as you have CPU cores available.
import multiprocessing

def func(index):
    ...
    process = subprocess.Popen(substr + " --index {}".format(index), shell=True, stdout=subprocess.PIPE)
    ...

if __name__ == '__main__':
    p = multiprocessing.Pool(multiprocessing.cpu_count())
    p.map(func, range(1, 100))
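If you also need the return codes or output back in the parent process, a small variation of the same idea might look like the sketch below; the echo command is only a stand-in for your real substr:

import multiprocessing
import subprocess

def func(index):
    # hypothetical command; in your code this would be built from substr
    process = subprocess.Popen("echo working on index {}".format(index),
                               shell=True, stdout=subprocess.PIPE)
    out, _ = process.communicate()
    return index, process.returncode, out

if __name__ == '__main__':
    with multiprocessing.Pool(multiprocessing.cpu_count()) as pool:
        for index, returncode, out in pool.map(func, range(1, 100)):
            if returncode != 0:
                print("index {} failed with return code {}".format(index, returncode))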
I am starting a subprocess via Python and displaying its stdout (the progress) in a progress bar:
def rv(args):
    p = subprocess.Popen(["linkto.exe"] + [x for x in args], stdout=subprocess.PIPE)
    while True:
        line = p.stdout.readline()
        if line != "":
            progressStr = re.search(r"([0-9]+.[0-9]+%)", line.rstrip())
            if progressStr == None:
                print line.rstrip()
            else:
                progressInt = int(float(re.sub("[^0123456789\.]", "", progressStr.group())))
                print progressInt
        else:
            break
As you can see, progressInt is my cleaned-up version of the stdout with integer values for the progress %; it works fine so far. However, depending on my input, the stdout may vary because the subprocess may spawn another process after the primary one.
How could I drop all lines of my stdout after progressInt hits 100 for the first time?
I managed to find a solution via re.search. There was a small difference in the stdout of process1 (writes "Info:") and process2 (writes "Info [32]:").
def rv(args):
    p = subprocess.Popen(["C:/Program Files/Tweak/RV-4.2.3-64/bin/rvio_hw.exe"] + [x for x in args], stdout=subprocess.PIPE)
    for line in iter(p.stdout.readline, ""):
        noFFMpeg = re.search(r"INFO: (.*)", line.rstrip())
        if noFFMpeg is not None:
            progressStr = re.search(r"([0-9]+.[0-9]+%)", noFFMpeg.group())
            if progressStr is not None:
                progressInt = int(float(re.sub("[^0123456789\.]", "", progressStr.group())))
                self.prog_QProgressBar.setValue(progressInt)
                QtGui.QApplication.processEvents()
                print progressStr.group()
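Coming back to the original question of dropping everything after the progress first hits 100, a small self-contained sketch of one possible approach is shown below. parse_progress and progress_values are hypothetical helpers of my own; in real use you would feed iter(p.stdout.readline, "") into progress_values instead of the canned lines.

import re

def parse_progress(line):
    """Return the progress as an int from a line containing e.g. '42.0%', or None."""
    match = re.search(r"([0-9]+\.[0-9]+%)", line)
    if match is None:
        return None
    return int(float(match.group().rstrip('%')))

def progress_values(lines):
    """Yield progress integers and stop after the first value >= 100."""
    for line in lines:
        value = parse_progress(line)
        if value is None:
            continue
        yield value
        if value >= 100:
            return   # drop everything after progress first hits 100

# tiny usage example with canned lines instead of a real subprocess
sample = ["INFO: rendering 10.0%", "INFO: rendering 100.0%", "INFO [32]: encoding 50.0%"]
print(list(progress_values(sample)))   # prints [10, 100]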