Send input from one threaded subprocess to another - Python

The main script starts the second script in a new subprocess, plus a thread that continuously checks the subprocess's stdout for data. The second script asks for input. I would like to have the first script ask for user input and then pass it to the second script. I'm developing on Windows and couldn't get pexpect to work.
test.py - main script
import threading
import subprocess

def read_output(process):
    print("starting to read")
    for line in process.stdout:
        print(line.rstrip())

def write_output(process, s):
    process.stdin.write(s.encode('utf-8'))
    process.stdin.flush()

process = subprocess.Popen('python test2.py', shell=False,
                           stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                           stderr=None)

# Create new threads
thread1 = threading.Thread(read_output(process))

# Start new threads
thread1.daemon = True
thread1.start()

s = input("test input:")
print("yep:" + s)
thread1.process.stdin.write(s.encode('utf-8'))
thread1.process.stdin.flush()
test2.py - second script
print("Enter an input A,B,C:")
s=input("")
print("you selected:"+s)

First mistake: wrong arguments when creating the thread. You're passing the result of the function, which is called immediately in the main process: the thread isn't started yet, so you read the output in the main thread, not in the new thread.
Fix it like this:
thread1 = threading.Thread(target=read_output, args=(process,))
Second mistake (or maybe just what keeps the program from continuing): you must close the process's stdin after writing the string to it:
process.stdin.close()
Fixed test.py file:
import threading
import subprocess

def read_output(process):
    print("starting to read")
    for line in process.stdout:
        print(line.rstrip())

process = subprocess.Popen('python test2.py', shell=False,
                           stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                           stderr=None)

# Create new thread: pass target and argument
thread1 = threading.Thread(target=read_output, args=(process,))

# Start new thread
thread1.daemon = True
thread1.start()

s = input("test input:")
print("yep:" + s)
process.stdin.write(s.encode('utf-8'))
process.stdin.write("\r\n".encode('utf-8'))  # emulate "ENTER" in the child
process.stdin.close()  # close standard input or the thread doesn't terminate
thread1.join()  # wait for the thread to finish
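One caveat worth noting (an addition of mine, not part of the original answer): when the child's stdout is a pipe, Python block-buffers its prints, so the prompt from test2.py may not show up until the child exits. Assuming the child is a Python script, launching it unbuffered is a common workaround:

# Hypothetical variant: -u forces the child's stdout to be unbuffered,
# so its prompt arrives through the pipe immediately.
process = subprocess.Popen(['python', '-u', 'test2.py'], shell=False,
                           stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                           stderr=None)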

Related

Running a small python program in background inside Jupyter notebook without blocking the main process

Let's suppose I have this simple function:
from time import sleep

def fun():
    for i in range(5):
        print(i)
        sleep(2)
and I want to run it in the background without interrupting the main code flow. Is this achievable?
I tried saving the code in test.py and did:
from IPython.lib.backgroundjobs import BackgroundJobFunc

with open('test.py') as code:
    job = BackgroundJobFunc(exec, code.read())
    result = job.run()
It printed 0 and exited.
I also tried:
from subprocess import Popen, PIPE

process = Popen(['python', 'test.py'], stdout=PIPE, stderr=PIPE)
stdout, stderr = process.communicate()
print(stdout)
and
from threading import Thread

thread = Thread(target=fun)
thread.start()
thread.join()
print("thread finished...exiting")
Both blocked the main process; I could not do anything before it finished its execution.
Is there a different way?
Creating a daemon thread solved the problem (with one caveat: it prints the values into whichever cell you're currently working in).
t1 = threading.Thread(target=fun)
t1.daemon = True
t1.start()
Any corrections / suggestions to this?
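As another option (a sketch of mine, not from the original post, assuming fun is already defined in the notebook), IPython's BackgroundJobManager runs a function in a background thread and returns control to the cell immediately:

# Sketch: the job manager runs fun() in the background; output still
# lands in whichever cell is currently active.
from IPython.lib.backgroundjobs import BackgroundJobManager

jobs = BackgroundJobManager()
jobs.new(fun)    # returns a job handle right away
jobs.status()    # inspect running and completed jobs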

How do I use multiprocessing.Queue from a process with a pre-existing Pipe?

I am trying to use multiprocessing from inside another process that was spawned with Popen. I want to be able to communicate between this process and a new child process, but this "middle" process has a polling read on the pipe with its parent, which seems to block execution of its child process.
Here is my file structure:
entry.py
import subprocess, threading, time, sys

def start():
    # Create process 2
    worker = subprocess.Popen([sys.executable, "-u", "mproc.py"],
                              # When creating the subprocess with an open pipe to stdin and
                              # subsequently polling that pipe, it blocks further communication
                              # between subprocesses
                              stdin=subprocess.PIPE,
                              close_fds=False)
    t = threading.Thread(args=(worker))
    t.start()
    time.sleep(4)

if __name__ == '__main__':
    start()
mproc.py
import multiprocessing as mp
import time, sys, threading

def exit_on_stdin_close():
    try:
        while sys.stdin.read():
            pass
    except:
        pass

def put_hello(q):
    # We never reach this line if exit_poll.start() is uncommented
    q.put("hello")
    time.sleep(2.4)

def start():
    exit_poll = threading.Thread(target=exit_on_stdin_close, name="exit-poll")
    exit_poll.daemon = True
    # This daemon thread polling stdin blocks execution of subprocesses,
    # but ONLY if running in another process with stdin connected
    # to its parent by PIPE
    exit_poll.start()

    ctx = mp.get_context('spawn')
    q = ctx.Queue()
    p = ctx.Process(target=put_hello, args=(q,))
    # Create process 3
    p.start()
    p.join()
    print(f"result: {q.get()}")

if __name__ == '__main__':
    start()
My desired behavior is that when running entry.py, mproc.py should run in a subprocess and be able to communicate with its own child process to get the Queue output, and this does happen if I don't start the exit-poll daemon thread:
$ python -u entry.py
result: hello
but if exit-poll is running, then process 3 blocks as soon as it's started. The put_hello method isn't even entered until the exit-poll thread ends.
Is there a way to create a process 3 from process 2 and communicate between the two, even while the pipe between processes 1 and 2 is being used?
Edit: I can only consistently reproduce this problem on Windows. On Linux (Ubuntu 20.04 WSL) the Queues are able to communicate even with exit-poll running, but only if I'm using the spawn multiprocessing context. If I change it to fork then I get the same behavior that I see on Windows.
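A possible workaround sketch (an assumption of mine, POSIX-only; it sidesteps rather than explains the Windows behavior described above): replace the blocking sys.stdin.read() with a timed poll via selectors, so the exit-poll thread never sits inside a blocking read while process 3 starts:

# Sketch: timed poll on stdin (works for POSIX pipes); the thread wakes
# every second instead of blocking in read(), and exits on EOF.
import selectors
import sys

def exit_on_stdin_close():
    sel = selectors.DefaultSelector()
    sel.register(sys.stdin, selectors.EVENT_READ)
    while True:
        if sel.select(timeout=1.0):
            if not sys.stdin.readline():  # empty string means EOF
                return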

Python Subprocess readline() hangs; can't use normal options

To start, I'm aware this looks like a duplicate. I've been reading:
Python subprocess readlines() hangs
Python Subprocess readline hangs() after reading all input
subprocess readline hangs waiting for EOF
But these options either straight don't work or I can't use them.
The Problem
# Obviously, swap HOSTNAME1 and HOSTNAME2 with something real
cmd = "ssh -N -f -L 1111:<HOSTNAME1>:80 <HOSTNAME2>"
p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT, env=os.environ)
while True:
    out = p.stdout.readline()
    # Hangs here ^^^^^^^ forever
    out = out.decode('utf-8')
    if out:
        print(out)
    if p.poll() is not None:
        break
My dilemma is that the function calling the subprocess.Popen() is a library function for running bash commands, so it needs to be very generic and has the following restrictions:
Must display output as it comes in; not block and then spam the screen all at once
Can't use multiprocessing in case the parent caller is multiprocessing the library function (Python doesn't allow daemonic child processes to have their own children)
Can't use signal.SIGALRM for the same reason as multiprocessing; the parent caller may be trying to set their own timeout
Can't use third party non-built-in modules
Threading straight up doesn't work. When the readline() call is in a thread, thread.join(timeout=1) lets the program continue, but Ctrl+C doesn't work on it at all, and calling sys.exit() doesn't exit the program, since the thread is still open. And as you know, you can't kill a thread in Python by design.
No manner of bufsize or other subprocess args seems to make a difference; neither does putting readline() in an iterator.
I would have a workable solution if I could kill a thread, but that's super taboo, even though this is definitely a legitimate use case.
I'm open to any ideas.
One option is to use a thread to publish to a queue. Then you can block on the queue with a timeout. You can make the reader thread a daemon so it won't prevent system exit. Here's a sketch:
import subprocess
from threading import Thread
from queue import Queue

def reader(stream, queue):
    while True:
        line = stream.readline()
        queue.put(line)
        if not line:
            break

p = subprocess.Popen(cmd, stdout=subprocess.PIPE, ...)
queue = Queue()
thread = Thread(target=reader, args=(p.stdout, queue))
thread.daemon = True
thread.start()

while True:
    out = queue.get(timeout=1)  # timeout is optional
    if not out:  # Reached end of stream
        break
    ...  # Do whatever with output

# Output stream was closed but process may still be running
p.wait()
Note that you should adapt this answer to your particular use case. For example, you may want to add a way to signal to the reader thread to stop running before reaching the end of stream.
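For example, a stop flag checked between reads (a sketch of mine layered on the reader above; note readline() itself can still block until the next line or EOF arrives):

# Sketch: cooperative stop flag for the reader thread.
from threading import Event

stop_requested = Event()

def reader(stream, queue):
    while not stop_requested.is_set():
        line = stream.readline()
        queue.put(line)
        if not line:
            break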
Another option would be to poll the input stream, like in this question: timeout on subprocess readline in python
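A minimal sketch of that polling approach, assuming a POSIX platform where select() works on pipe objects:

# POSIX-only sketch: wait up to 1 second for data before each read, so
# readline() can never hang indefinitely.
import select
import subprocess

p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)
while True:
    ready, _, _ = select.select([p.stdout], [], [], 1.0)
    if ready:
        line = p.stdout.readline()
        if not line:  # EOF: stream closed
            break
        print(line.decode('utf-8'), end='')
    elif p.poll() is not None:
        break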
I finally got a working solution; the key piece of information I was missing was thread.daemon = True, which @augurar pointed out in their answer.
Setting thread.daemon = True allows the thread to be terminated when the main process terminates; therefore unblocking my use of a sub-thread to monitor readline().
Here is a sample implementation of my solution; I used a Queue() object to pass strings to the main process, and I implemented a 3-second timer for cases like the original problem I was trying to solve, where the subprocess has finished and terminated but the readline() is hung for some reason.
This also helps avoid a race condition between which thing finishes first.
This works for both Python 2 and 3.
import sys
import threading
import subprocess
from datetime import datetime
try:
    import queue
except ImportError:
    import Queue as queue  # Python 2 compatibility

def _monitor_readline(process, q):
    while True:
        bail = True
        if process.poll() is None:
            bail = False
        out = ""
        if sys.version_info[0] >= 3:
            out = process.stdout.readline().decode('utf-8')
        else:
            out = process.stdout.readline()
        q.put(out)
        if q.empty() and bail:
            break

def bash(cmd):
    # Kick off the command
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=True)
    # Create the queue instance
    q = queue.Queue()
    # Kick off the monitoring thread
    thread = threading.Thread(target=_monitor_readline, args=(process, q))
    thread.daemon = True
    thread.start()
    start = datetime.now()
    while True:
        bail = True
        if process.poll() is None:
            bail = False
            # Re-set the thread timer
            start = datetime.now()
        out = ""
        while not q.empty():
            out += q.get()
        if out:
            print(out)
        # In the case where the thread is still alive and reading, and
        # the process has exited and finished, give it up to 3 seconds
        # to finish reading
        if bail and thread.is_alive() and (datetime.now() - start).total_seconds() < 3:
            bail = False
        if bail:
            break

# To demonstrate output in realtime, sleep is called in between these echos
bash("echo lol;sleep 2;echo bbq")

How to collect output from a Python subprocess

I am trying to make a Python process that reads some input, processes it, and prints out the result. The processing is done by a subprocess (Stanford's NER); for illustration I will use 'cat'. I don't know exactly how much output NER will give, so I run a separate thread to collect it all and print it out. The following example illustrates.
import sys
import threading
import subprocess

# start my subprocess
cat = subprocess.Popen(
    ['cat'],
    shell=False, stdout=subprocess.PIPE, stdin=subprocess.PIPE,
    stderr=None)

def subproc_cat():
    """ Reads the subprocess output and prints it out """
    while True:
        line = cat.stdout.readline()
        if not line:
            break
        print("CAT PROC: %s" % line.decode('UTF-8'))

# a daemon thread that runs the above function
th = threading.Thread(target=subproc_cat)
th.setDaemon(True)
th.start()

# the main thread reads from stdin and feeds the subprocess
while True:
    line = sys.stdin.readline()
    print("MAIN PROC: %s" % line)
    if not line:
        break
    cat.stdin.write(bytes(line.strip() + "\n", 'UTF-8'))
    cat.stdin.flush()
This seems to work well when I enter text with the keyboard. However, if I try to pipe input into my script (cat file.txt | python3 my_script.py), a race condition seems to occur. Sometimes I get proper output, sometimes not, and sometimes it locks up. Any help would be appreciated!
I am running Ubuntu 14.04, Python 3.4.0. The solution should be platform-independent.
Add th.join() at the end; otherwise you may kill the thread prematurely, before it has processed all the output, when the main thread exits: daemon threads do not survive the main thread (or remove th.setDaemon(True) instead of adding th.join()).
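A minimal sketch of that fix, appended to the end of the script above (closing cat.stdin is an addition of mine beyond the answer's join() suggestion, so the child sees EOF and exits):

# Close the child's stdin so cat exits and the reader thread sees EOF,
# then wait for the thread to drain the remaining output.
cat.stdin.close()
th.join()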

showing progress while spawning and running subprocess

I need to show a progress bar or something while spawning and running a subprocess. How can I do that with Python?
import subprocess

cmd = ['python', 'wait.py']
p = subprocess.Popen(cmd, bufsize=1024, stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
p.stdin.close()
outputmessage = p.stdout.read()  # read the standard output from the spawned process
message = p.stderr.read()
I can spawn the subprocess with this code, but I need to print something out as each second passes.
Since the subprocess call is blocking, one way to print something out while waiting would be to use multithreading. Here's an example using threading._Timer:
import threading
import subprocess

class RepeatingTimer(threading._Timer):
    def run(self):
        while True:
            self.finished.wait(self.interval)
            if self.finished.is_set():
                return
            else:
                self.function(*self.args, **self.kwargs)

def status():
    print "I'm alive"

timer = RepeatingTimer(1.0, status)
timer.daemon = True  # Allows program to exit if only the thread is alive
timer.start()

proc = subprocess.Popen(['/bin/sleep', "5"])
proc.wait()

timer.cancel()
On an unrelated note, calling stdout.read() while using multiple pipes can lead to deadlock. The subprocess.communicate() function should be used instead.
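For instance, against the question's p above (a sketch):

# communicate() reads stdout and stderr concurrently and waits for the
# process to exit, avoiding the sequential-read deadlock.
outputmessage, message = p.communicate()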
As far as I can see, all you need to do is put those reads in a loop with a delay and a print, as sketched below. Does it have to be precisely a second, or just around a second?
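A rough sketch of that idea, reusing the wait.py child from the question:

# Poll the process once per second and print a heartbeat while it runs,
# then collect its output at the end.
import subprocess
import time

p = subprocess.Popen(['python', 'wait.py'], stdout=subprocess.PIPE)
while p.poll() is None:
    print("still running...")
    time.sleep(1.0)
outputmessage = p.stdout.read()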
