NB. I have seen Log output of multiprocessing.Process - unfortunately, it doesn't answer this question.
I am creating a child process (on Windows) via multiprocessing. I want all of the child process's stdout and stderr output to be redirected to a log file, rather than appearing at the console. The only suggestion I have seen is for the child process to set sys.stdout to a file. However, this does not effectively redirect all stdout output, because of how stdout redirection behaves on Windows.
To illustrate the problem, build a Windows DLL with the following code
#include <iostream>
extern "C"
{
    __declspec(dllexport) void writeToStdOut()
    {
        std::cout << "Writing to STDOUT from test DLL" << std::endl;
    }
}
Then create and run a python script like the following, which imports this DLL and calls the function:
from ctypes import *
import sys
print
print "Writing to STDOUT from python, before redirect"
print
sys.stdout = open("stdout_redirect_log.txt", "w")
print "Writing to STDOUT from python, after redirect"
testdll = CDLL("Release/stdout_test.dll")
testdll.writeToStdOut()
In order to see the same behaviour as me, the DLL probably needs to be built against a different C runtime than the one Python uses. In my case, Python is built with Visual Studio 2010, but my DLL is built with VS 2005.
The behaviour I see is that the console shows:
> stdout_test.py
Writing to STDOUT from python, before redirect
Writing to STDOUT from test DLL
While the file stdout_redirect_log.txt ends up containing:
Writing to STDOUT from python, after redirect
In other words, setting sys.stdout failed to redirect the stdout output generated by the DLL. This is unsurprising given the nature of the underlying APIs for stdout redirection in Windows. I have encountered this problem at the native/C++ level before and never found a way to reliably redirect stdout from within a process. It has to be done externally.
This is actually the very reason I am launching a child process - it's so that I can connect externally to its pipes and thus guarantee that I am intercepting all of its output. I can definitely do this by launching the process manually with pywin32, but I would very much like to be able to use the facilities of multiprocessing, in particular the ability to communicate with the child process via a multiprocessing Pipe object, in order to get progress updates. The question is whether there is any way to both use multiprocessing for its IPC facilities and to reliably redirect all of the child's stdout and stderr output to a file.
UPDATE: Looking at the source code for multiprocessing.Process, it has a static member, _Popen, which looks like it can be used to override the class used to create the process. If it's set to None (the default), it uses multiprocessing.forking._Popen, but it looks like by saying
multiprocessing.Process._Popen = MyPopenClass
I could override the process creation. However, although I could derive this from multiprocessing.forking._Popen, it looks like I would have to copy a bunch of internal stuff into my implementation, which sounds flaky and not very future-proof. If that's the only choice I think I'd probably plump for doing the whole thing manually with pywin32 instead.
The solution you suggest is a good one: create your processes manually such that you have explicit access to their stdout/stderr file handles. You can then create a socket to communicate with the sub-process and use multiprocessing.connection over that socket (multiprocessing.Pipe creates the same type of connection object, so this should give you all the same IPC functionality).
Here's a two-file example.
master.py:
import errno
import multiprocessing.connection
import subprocess
import socket
import sys, os

## Listen for connection from remote process (and find free port number)
port = 10000
while True:
    try:
        l = multiprocessing.connection.Listener(('localhost', int(port)), authkey="secret")
        break
    except socket.error as ex:
        ## EADDRINUSE means the port is not available
        ## (typically 98 on Linux, 10048 on Windows); anything else is a real error.
        if ex.errno != errno.EADDRINUSE:
            raise
        port += 1

proc = subprocess.Popen((sys.executable, "subproc.py", str(port)), stdout=subprocess.PIPE, stderr=subprocess.PIPE)

## open connection for remote process
conn = l.accept()
conn.send([1, "asd", None])
print(proc.stdout.readline())
subproc.py:
import multiprocessing.connection
import subprocess
import sys, os, time

port = int(sys.argv[1])
conn = multiprocessing.connection.Client(('localhost', port), authkey="secret")

while True:
    try:
        obj = conn.recv()
        print("received: %s\n" % str(obj))
        sys.stdout.flush()
    except EOFError:  ## connection closed
        break
You may also want to see the first answer to this question to get non-blocking reads from the subprocess.
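One common way to get non-blocking reads is sketched below, assuming Python 3 and a throwaway child command. The `drain` helper and the inline child script are illustrative names, not taken from the linked answer. A background thread does the blocking `readline()` calls and hands each line to a queue that the main thread polls with a timeout.

```python
import queue
import subprocess
import sys
import threading


def drain(stream, q):
    """Read lines from a blocking pipe and hand them to a queue."""
    for line in iter(stream.readline, b""):
        q.put(line)
    stream.close()


# Hypothetical child: prints two lines and exits.
child_code = "print('line 1'); print('line 2')"
proc = subprocess.Popen([sys.executable, "-c", child_code],
                        stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

lines = queue.Queue()
reader = threading.Thread(target=drain, args=(proc.stdout, lines), daemon=True)
reader.start()

collected = []
while reader.is_alive() or not lines.empty():
    try:
        collected.append(lines.get(timeout=0.1))
    except queue.Empty:
        pass  # nothing yet; the main thread is free to do other work here

proc.wait()
```

The main loop never blocks for longer than the timeout, so it can also service the multiprocessing connection between polls.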
I don't think you have a better option than redirecting a subprocess to a file as you mentioned in your comment.
On Windows, each process gets its standard handles (stdin, stdout, stderr) when it is created. You can change them with SetStdHandle. Modifying Python's sys.stdout only changes where Python prints, not where other DLLs print. Part of the CRT in your DLL uses GetStdHandle to find out where to write. If you want, you can do whatever piping you need with the Windows API, either in your DLL or in your Python script with pywin32, though I think it will be simpler with subprocess.
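A minimal sketch of the subprocess route, assuming Python 3. The log location and the inline child script are made up for illustration, and `os.write(1, ...)` merely stands in for native code that bypasses `sys.stdout`. Because the file is handed over as the child's standard output at creation time, writes that never touch `sys.stdout` should still land in the log.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical log location for the demo.
log_path = os.path.join(tempfile.mkdtemp(), "child_output.log")

# Hypothetical child: writes through sys.stdout and sys.stderr, and also
# directly to file descriptor 1, bypassing sys.stdout entirely.
child_code = (
    "import os, sys\n"
    "sys.stdout.write('via sys.stdout\\n'); sys.stdout.flush()\n"
    "os.write(1, b'via fd 1\\n')\n"
    "sys.stderr.write('via sys.stderr\\n')\n"
)

with open(log_path, "wb") as log:
    # The child's standard handles are fixed when it is created, so
    # everything written to them ends up in the log file.
    returncode = subprocess.call([sys.executable, "-c", child_code],
                                 stdout=log, stderr=subprocess.STDOUT)

with open(log_path, "rb") as log:
    logged = log.read()
```

Whether a DLL linked against a different CRT behaves the same way has not been verified here; it depends on that CRT picking up the inherited handle.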
Alternatively (this may be slightly off-topic, but it solved the same problem in my case), on Linux this can be handled with screen:
screen -L -Logfile './logfile_%Y-%m-%d.log' python my_multiproc_script.py
This way there is no need to implement the master-child communication at all.
I assume I'm off base and missing something, but for what it's worth here is what came to mind when I read your question.
If you can intercept all of the stdout and stderr (I got that impression from your question), then why not add or wrap that capture functionality around each of your processes? Then send what is captured through a queue to a consumer that can do whatever you want with all of the outputs?
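As a rough sketch of that idea, assuming Python 3: `QueueWriter` and `worker` are hypothetical names, and `queue.Queue` is used only to keep the example in a single process. A `multiprocessing.Queue` could be passed to the child instead. Note that this only captures writes that go through Python's `sys.stdout`, so it shares the limitation described in the question for output produced by native code.

```python
import contextlib
import queue


class QueueWriter:
    """File-like object that forwards every write to a queue."""

    def __init__(self, q, tag):
        self.q = q
        self.tag = tag

    def write(self, text):
        if text:
            self.q.put((self.tag, text))
        return len(text)

    def flush(self):
        pass  # nothing is buffered locally


def worker():
    print("hello from the worker")


# queue.Queue keeps the sketch single-process; a multiprocessing.Queue
# would be handed to the child process instead.
outputs = queue.Queue()
with contextlib.redirect_stdout(QueueWriter(outputs, "stdout")):
    worker()

# The consumer drains the queue and can do whatever it wants with the text.
captured = []
while not outputs.empty():
    captured.append(outputs.get())
```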
In my situation I changed sys.stdout.write to write to a PySide QTextEdit. I couldn't read from sys.stdout and I didn't know how to change sys.stdout to be readable. I created two Pipes. One for stdout and the other for stderr. In the separate process I redirect sys.stdout and sys.stderr to the child connection of the multiprocessing pipe. On the main process I created two threads to read the stdout and stderr parent pipe and redirect the pipe data to sys.stdout and sys.stderr.
import sys
import contextlib
import threading
import multiprocessing as mp
import multiprocessing.queues
from queue import Empty
import time
class PipeProcess(mp.Process):
    """Process to pipe the output of the sub process and redirect it to this sys.stdout and sys.stderr.

    Note:
        The use_queue = True argument will pass data between processes using Queues instead of Pipes. Queues will
        give you the full output and read all of the data from the Queue. A pipe is more efficient, but may not
        redirect all of the output back to the main process.
    """
    def __init__(self, group=None, target=None, name=None, args=tuple(), kwargs={}, *_, daemon=None,
                 use_pipe=None, use_queue=None):
        self.read_out_th = None
        self.read_err_th = None
        self.pipe_target = target
        self.pipe_alive = mp.Event()
        if use_pipe or (use_pipe is None and not use_queue):  # Default
            self.parent_stdout, self.child_stdout = mp.Pipe(False)
            self.parent_stderr, self.child_stderr = mp.Pipe(False)
        else:
            self.parent_stdout = self.child_stdout = mp.Queue()
            self.parent_stderr = self.child_stderr = mp.Queue()
        args = (self.child_stdout, self.child_stderr, target) + tuple(args)
        target = self.run_pipe_out_target
        super(PipeProcess, self).__init__(group=group, target=target, name=name, args=args, kwargs=kwargs,
                                          daemon=daemon)

    def start(self):
        """Start the multiprocess and reading thread."""
        self.pipe_alive.set()
        super(PipeProcess, self).start()
        self.read_out_th = threading.Thread(target=self.read_pipe_out,
                                            args=(self.pipe_alive, self.parent_stdout, sys.stdout))
        self.read_err_th = threading.Thread(target=self.read_pipe_out,
                                            args=(self.pipe_alive, self.parent_stderr, sys.stderr))
        self.read_out_th.daemon = True
        self.read_err_th.daemon = True
        self.read_out_th.start()
        self.read_err_th.start()

    @classmethod
    def run_pipe_out_target(cls, pipe_stdout, pipe_stderr, pipe_target, *args, **kwargs):
        """The real multiprocessing target to redirect stdout and stderr to a pipe or queue."""
        sys.stdout.write = cls.redirect_write(pipe_stdout)  # , sys.__stdout__)  # Is redirected in main process
        sys.stderr.write = cls.redirect_write(pipe_stderr)  # , sys.__stderr__)  # Is redirected in main process
        pipe_target(*args, **kwargs)

    @staticmethod
    def redirect_write(child, out=None):
        """Create a function to write out a pipe and write out an additional out."""
        if isinstance(child, mp.queues.Queue):
            send = child.put
        else:
            send = child.send_bytes  # No need to pickle with child_conn.send(data)

        def write(data, *args):
            try:
                if isinstance(data, str):
                    data = data.encode('utf-8')
                send(data)
                if out is not None:
                    out.write(data)
            except:
                pass
        return write

    @classmethod
    def read_pipe_out(cls, pipe_alive, pipe_out, out):
        if isinstance(pipe_out, mp.queues.Queue):
            # Queue has better functionality to get all of the data
            def recv():
                return pipe_out.get(timeout=0.5)

            def is_alive():
                return pipe_alive.is_set() or pipe_out.qsize() > 0
        else:
            # Pipe is more efficient
            recv = pipe_out.recv_bytes  # No need to unpickle with data = pipe_out.recv()
            is_alive = pipe_alive.is_set

        # Loop through reading and redirecting data
        while is_alive():
            try:
                data = recv()
                if isinstance(data, bytes):
                    data = data.decode('utf-8')
                out.write(data)
            except EOFError:
                break
            except Empty:
                pass
            except:
                pass

    def join(self, *args):
        # Wait for process to finish (unless a timeout was given)
        super(PipeProcess, self).join(*args)
        # Trigger to stop the threads
        self.pipe_alive.clear()
        # Pipe must close to prevent blocking and waiting on recv forever
        # (suppress() with no arguments suppresses nothing, so name Exception explicitly)
        if not isinstance(self.parent_stdout, mp.queues.Queue):
            with contextlib.suppress(Exception):
                self.parent_stdout.close()
            with contextlib.suppress(Exception):
                self.parent_stderr.close()
        # Close the pipes and threads
        with contextlib.suppress(Exception):
            self.read_out_th.join()
        with contextlib.suppress(Exception):
            self.read_err_th.join()
def run_long_print():
    for i in range(1000):
        print(i)
        print(i, file=sys.stderr)
    print('finished')


if __name__ == '__main__':
    # Example test write (My case was a QTextEdit)
    out = open('stdout.log', 'w')
    err = open('stderr.log', 'w')
    # Overwrite the write function and not the actual stdout object to prove this works
    sys.stdout.write = out.write
    sys.stderr.write = err.write
    # Create a process that uses pipes to read multiprocess output back into sys.stdout.write
    proc = PipeProcess(target=run_long_print, use_queue=True)  # If use_pipe=True Pipe may not write out all values
    # proc.daemon = True  # If daemon and use_queue Not all output may be redirected to stdout
    proc.start()
    # time.sleep(5)  # Not needed unless use_pipe or daemon and all of stdout/stderr is desired
    # Close the process
    proc.join()  # For some odd reason this blocks forever when use_queue=False
    # Close the output files for this test
    out.close()
    err.close()
Here is the simple and straightforward way for capturing stdout for multiprocessing.Process:
import app
import io
import sys
from multiprocessing import Process
def run_app(some_param):
    sys.stdout = io.TextIOWrapper(open(sys.stdout.fileno(), 'wb', 0), write_through=True)
    app.run()
app_process = Process(target=run_app, args=('some_param',))
app_process.start()
# Use app_process.terminate() for Python < 3.7 (kill() was added in 3.7).
app_process.kill()
I am trying to use Python to run another Python file. This file is going to start up a socket and create threads for listening for additional connections, and threads for sending/receiving data. The main thread will not return.
However, if the setup of the sockets fails, I want to return an error code to the other Python script that executed the subprocess.
main.py
py3output = subprocess.check_output(['python3', 'py3.py'])
print('py3 said:' + str(py3output))
py3.py
def returnme():
    return 10

returnme()
When I run this, it prints:
py3 said:b''
I am just trying to figure out how to get the return value back to the main calling program.
To return an exit code n to the OS, you need sys.exit(n). But it seems you want to check the stdout output rather than the exit code, so your program may need to be rewritten as:
def returnme():
    return 10

print(returnme())
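For the exit-code route, here is a small sketch assuming Python 3, with a hypothetical inline child standing in for py3.py. It shows both reading the status directly and what `check_output` does when the status is non-zero.

```python
import subprocess
import sys

# Hypothetical child: reports failure through its exit status.
child_code = "import sys; sys.exit(10)"

# Popen.wait() returns the child's exit status.
proc = subprocess.Popen([sys.executable, "-c", child_code])
exit_code = proc.wait()
print("child exited with", exit_code)

# check_output() raises CalledProcessError for a non-zero status;
# the status is available on the exception.
try:
    subprocess.check_output([sys.executable, "-c", child_code])
    code_from_exception = 0
except subprocess.CalledProcessError as err:
    code_from_exception = err.returncode
```

An exit status suits a small success/failure code; anything richer is usually easier to pass through stdout.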
To pass a value back through standard output, write it as a string, as in the following code:
sample.py
import sys
def returnme():
    sys.stdout.write(str(10))
    sys.stdout.flush()

returnme()
main.py
from subprocess import check_output
output = check_output(['python','sample.py'])
print('Sample.py says :' + output.decode())  # check_output returns bytes on Python 3
Let us consider the following Python code, to be executed by cpython on a Linux system (warning: it will try to create or overwrite files in /tmp/first, /tmp/second and /tmp/third).
#!/usr/bin/env python2
# -*- coding: utf-8 -*-

import subprocess
import os
import sys
import threading


class ThreadizedPopen(threading.Thread):

    def __init__(self, command, stdin_name, stdout_name):
        super(ThreadizedPopen, self).__init__()
        self.command = command
        self.stdin_name = stdin_name
        self.stdout_name = stdout_name
        self.returncode = None

    def run(self):
        with open(self.stdin_name, 'rb') as fin:
            with open(self.stdout_name, 'wb') as fout:
                popen = subprocess.Popen(self.command, stdin=fin, stdout=fout, stderr=None)
                popen.communicate()
                self.returncode = popen.returncode


def main():
    os.system('mkfifo /tmp/first')
    os.system('mkfifo /tmp/second')
    os.system('mkfifo /tmp/third')
    popen1 = ThreadizedPopen(['cat'], '/tmp/first', '/tmp/second')
    popen2 = ThreadizedPopen(['cat'], '/tmp/second', '/tmp/third')
    popen1.start()
    popen2.start()
    with open('/tmp/third') as fin:
        print fin.read()
    popen1.join()
    popen2.join()


if __name__ == '__main__':
    main()
I execute it and then, in another shell, I write something to /tmp/first (say with echo test > /tmp/first). I would expect the Python program to quickly print the same thing I fed to the first FIFO and exit.
In theory, the string I wrote to /tmp/first should be copied by the two cat processes spawned by my program through the other two FIFOs and then picked up by the main Python program to be written to its stdout. As soon as each cat process finishes, it should close its end of the FIFO it writes to, making the corresponding reading end return EOF and triggering the termination of the following cat process. Looking at the program with strace reveals that the test string is copied correctly through all three FIFOs and is read by the main Python program. The first FIFO is also correctly closed (and the first cat process exits, together with its managing Python thread). However, the second cat process is stuck in a read() call, expecting data from the FIFO it reads from.
I do not understand why this happens. From the pipe(7) man page (which, as I understand it, also covers this kind of FIFO) it seems that a read on a FIFO returns EOF as soon as the writing end (and all its duplicates) are closed. According to strace this appears to be the case (in particular, the cat process is dead, thus all its file descriptors are closed; its managing thread has closed its descriptors as well, as I can see in the strace output).
Can you suggest why this happens? I can post the strace output if that would be useful.
I found this question and simply added close_fds=True to your subprocess call. Your code now reads:
#!/usr/bin/env python2
# -*- coding: utf-8 -*-

import subprocess
import os
import sys
import threading


class ThreadizedPopen(threading.Thread):

    def __init__(self, command, stdin_name, stdout_name):
        super(ThreadizedPopen, self).__init__()
        self.command = command
        self.stdin_name = stdin_name
        self.stdout_name = stdout_name
        self.returncode = None

    def run(self):
        with open(self.stdin_name, 'rb') as fin:
            with open(self.stdout_name, 'wb') as fout:
                popen = subprocess.Popen(self.command, stdin=fin, stdout=fout, stderr=None, close_fds=True)
                popen.communicate()
                self.returncode = popen.returncode


def main():
    os.system('mkfifo /tmp/first')
    os.system('mkfifo /tmp/second')
    os.system('mkfifo /tmp/third')
    popen1 = ThreadizedPopen(['cat'], '/tmp/first', '/tmp/second')
    popen2 = ThreadizedPopen(['cat'], '/tmp/second', '/tmp/third')
    popen1.start()
    popen2.start()
    with open('/tmp/third') as fin:
        print fin.read()
    popen1.join()
    popen2.join()


if __name__ == '__main__':
    main()
I placed your code in a script called fifo_issue.py and ran it in a terminal. The script was idling as you'd expect (ignore mkfifo: cannot create fifo):
$ python fifo_issue.py
mkfifo: cannot create fifo ‘/tmp/first’: File exists
mkfifo: cannot create fifo ‘/tmp/second’: File exists
mkfifo: cannot create fifo ‘/tmp/third’: File exists
Then, in a second terminal, I typed:
$ echo "I was echoed to /tmp/first!" > /tmp/first
Back to the first terminal that's still running your idling threads:
$ python fifo_issue.py
mkfifo: cannot create fifo ‘/tmp/first’: File exists
mkfifo: cannot create fifo ‘/tmp/second’: File exists
mkfifo: cannot create fifo ‘/tmp/third’: File exists
I was echoed to /tmp/first!
After that, Python exited correctly.
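A likely explanation for why close_fds=True helps: in Python 2 the default is close_fds=False, so a cat started from one thread can inherit a descriptor for a FIFO's write end that the other thread has opened, and a reader only sees EOF once every copy of the write end is closed. The sketch below illustrates that rule in a single process, assuming Python 3.5+ on a POSIX system. `os.dup` stands in for the inherited descriptor, and the read end is made non-blocking so the demo cannot hang.

```python
import os

read_fd, write_fd = os.pipe()
leaked_fd = os.dup(write_fd)     # stands in for a write end inherited by another process

os.write(write_fd, b"test\n")
os.close(write_fd)               # the "real" writer is done

os.set_blocking(read_fd, False)  # so the demo cannot hang
first = os.read(read_fd, 100)    # the data that was written

try:
    os.read(read_fd, 100)
    saw_eof_early = True
except BlockingIOError:
    saw_eof_early = False        # no EOF yet: a copy of the write end is still open

os.close(leaked_fd)
at_eof = os.read(read_fd, 100)   # an empty read now signals EOF
os.close(read_fd)
```

With a blocking descriptor, the second read is exactly where the stuck cat process would be waiting.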
I have been using the following snippet to silence (redirect output from) C code called in my Python script:
from ctypes import CDLL, c_void_p
import os
import sys


# Code
class silence(object):
    def __init__(self, stdout=os.devnull):
        self.outfile = stdout

    def __enter__(self):
        # Flush
        sys.__stdout__.flush()
        # Save
        self.saved_stream = sys.stdout
        self.fd = sys.stdout.fileno()
        self.saved_fd = os.dup(self.fd)
        # Open the redirect
        self.new_stream = open(self.outfile, 'wb', 0)
        self.new_fd = self.new_stream.fileno()
        # Replace
        os.dup2(self.new_fd, self.fd)

    def __exit__(self, *args):
        # Flush
        self.saved_stream.flush()
        # Restore
        os.dup2(self.saved_fd, self.fd)
        sys.stdout = self.saved_stream
        # Clean up
        self.new_stream.close()
        os.close(self.saved_fd)


# Test case
libc = CDLL('libc.so.6')

# Silence!
with silence():
    libc.printf(b'Hello from C in silence\n')
The idea is to redirect the fd associated with stdout and replace it with one associated with an open null device. Unfortunately, it does not work as expected under Python 3:
$ python2.7 test.py
$ python3.3 -u test.py
$ python3.3 test.py
Hello from C in silence
Under Python 2.7, and under 3.3 with unbuffered output, it does work. I am unsure what the underlying cause is, however. Even if stdout is buffered, the call to self.saved_stream.flush() should end up calling fflush(stdout) at the C level (flushing the output to the null device).
What part of the Python 3 I/O model am I misunderstanding?
I'm not 100% sure I understand the Py3 I/O model either, but adding
sys.stdout = os.fdopen(self.fd, 'wb', 0)
right after your assignment to self.fd fixes it for me in Python 3.4 (I was able to reproduce the problem in 3.4 before I added this statement).
I'm not entirely sure what's going on either, but on my system there are two ways to fix this:
Replace the call to self.saved_stream.flush() in __exit__ with libc.fflush(None).
Call libc.printf with any string before calling silence(), for example:
libc = CDLL('/bin/cygwin1.dll')
libc.printf(b'')
Also, only with the second approach does the output of Python's print and libc.printf remain synchronized after the with silence(): block.
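The first fix can be folded into the context manager itself. This is a sketch under stated assumptions, not a drop-in replacement: it assumes Python 3 on a POSIX system where `ctypes.CDLL(None)` exposes the C library, `redirect_c_stdout` is a hypothetical name, and it targets file descriptor 1 directly. The C stdio buffers are flushed with `fflush(None)` both before the descriptor is swapped and before it is restored, since Python 3's `sys.stdout.flush()` does not touch them.

```python
import ctypes
import os
import sys
import tempfile
from contextlib import contextmanager

libc = ctypes.CDLL(None)  # assumption: POSIX; exposes the process's C library


@contextmanager
def redirect_c_stdout(path=os.devnull):
    """Point file descriptor 1 at `path` for the duration of the block."""
    sys.stdout.flush()   # Python-level buffer
    libc.fflush(None)    # every C stdio buffer
    saved_fd = os.dup(1)
    target_fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.dup2(target_fd, 1)
        yield
    finally:
        sys.stdout.flush()
        libc.fflush(None)  # flush C output while fd 1 still points at `path`
        os.dup2(saved_fd, 1)
        os.close(saved_fd)
        os.close(target_fd)


# Usage: capture printf output in a temporary file instead of discarding it.
capture_path = os.path.join(tempfile.mkdtemp(), "c_stdout.txt")
with redirect_c_stdout(capture_path):
    libc.printf(b"Hello from C in silence\n")

with open(capture_path, "rb") as f:
    captured = f.read()
```

Calling it with no argument sends the output to the null device, which matches the original silence() use case.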
I have a question about pexpect in Python.
What I want to do is start my script at a given time and then stop it at a later time.
Pexpect doesn't work the way I expect. I don't know what I'm doing wrong, so can you give me some advice on my code below?
#!/usr/bin/python
# -*- coding: utf-8 -*-
date = '2014-09-06'
start = '15:32'
stop = '16:30'
import pexpect, sys
string = 'at '+start+' '+date
child = pexpect.spawn(string)
child.expect('warning: commands will be executed using /bin/sh')
child.expect('at> ')
child.sendline('./run_script.py\n')
child.expect('at> ')
child.sendline('\^D\n')
print child.before
The problem is that after all the commands have been sent, pexpect doesn't create a job.
Any advice would be great.
The way Ctrl+D is sent here is not valid. Also, after sending Ctrl+D, the script has to wait a couple of seconds for the at command to register the new job before closing the pexpect spawn object.
import pexpect
import sys
import time
prompt = "at>"
try:
    conn = pexpect.spawn("at 14:30 2019-06-14")
    conn.logfile = sys.stdout
    conn.expect(prompt)
    conn.sendline("touch /tmp/test.txt")
    conn.expect(prompt)
    conn.sendcontrol("d")
    time.sleep(3)
    conn.close()
except Exception as e:
    print(e)
After executing the above code snippet, run the atq command in a Linux terminal to verify that the job has been queued.
# atq
52 Fri Jun 14 14:30:00 2019 a root
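Since at reads the commands to run from standard input, another option is to skip the interactive prompt and feed the command through a pipe. This is a sketch assuming Python 3. `schedule_with_at` and its `at_argv` parameter are hypothetical names introduced here so the program can be swapped out. Closing stdin after the command plays the role of Ctrl+D.

```python
import subprocess


def schedule_with_at(command, time_spec, at_argv=("at",)):
    """Queue `command` by feeding it to at(1) on standard input.

    `at_argv` is a hypothetical hook for substituting a different program.
    Returns the exit status and whatever the program printed.
    """
    proc = subprocess.Popen(list(at_argv) + list(time_spec),
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    # communicate() sends the command and then closes stdin (end of input).
    output, _ = proc.communicate((command + "\n").encode())
    return proc.returncode, output


# Example call (left commented out, since it would queue a real job):
# schedule_with_at("touch /tmp/test.txt", ["14:30", "2019-06-14"])
```

As before, atq can be used afterwards to check that the job was queued.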