How to capture interlaced stdin and stdout from a Python subprocess

I am trying to write a Python script that automatically grades a Python script submitted by a student, where the student's script uses the input() function to get some information from the user.
Suppose the student's script is something simple like this:
name = input('Enter your name: ')
print(f'Hello {name}!')
The portion of the test script that runs the student script is something like this:
import subprocess
run_cmd = 'python student_script.py'
test_input = 'Bob'
p = subprocess.run(run_cmd.split(), input=test_input, capture_output=True, text=True)
After running that portion of the test script, output from the student's script is captured and can be accessed via p.stdout which is a string having this value:
'Enter your name: Hello Bob!\n'
No surprise there, since this is everything output by the student script, but notice that the 'Bob' test input is not included.
In the test report, I want to show the script input and output in the same way that it would appear if the script had been run from a command line, which would look like this:
Enter your name: Bob
Hello Bob!
Given that the scripts are written by students, the prompt message output by the student script could be anything (e.g., What is your name?, Who are you?, Type in name:, etc.) and the student script might also print something other than 'Hello Bob!', so I don't think there is any way to reliably figure out where to correctly insert the 'Bob' test input (and a trailing new line) into p.stdout.
Is there a way to get subprocess.run() to capture interlaced stdin and stdout?
Or is there another way to run a Python script from a Python script that captures interlaced stdin and stdout?
Ideally, for this example, I would be able to get a string having this value:
'Enter your name: Bob\nHello Bob!\n'
I've searched SO and read through the subprocess documentation, but thus far I've come up short on finding a solution.

Here's the solution I came up with. I expect there is a more elegant way to do it, but it works on the Ubuntu Linux computer that the automated test scripts run on. I have not tried it on Windows, but I believe it will not work since os.set_blocking() is only supported on Unix per the os module documentation.
import subprocess
import os
import time
run_cmd = 'python student_script.py'
test_input = 'Bob'
# Start the student script running
p = subprocess.Popen(run_cmd.split(), stdin=subprocess.PIPE, stdout=subprocess.PIPE, text = True)
# Give the script some time to run
time.sleep(2)
# String to hold interleaved stdin and stdout text
stdio_text = ''
# Capture everything from stdout
os.set_blocking(p.stdout.fileno(), False) # Prevents readline() blocking
stdout_text = p.stdout.readline()
while stdout_text != '':
    stdio_text += stdout_text
    stdout_text = p.stdout.readline()
# Append test input to interleaved stdin and stdout text
stdio_text += (test_input + '\n')
try:
    # Send test input to stdin and wait for student script to terminate
    stdio_text += p.communicate(input=test_input, timeout=5)[0]
except subprocess.TimeoutExpired:
    # Something is wrong with student script
    pass
p.terminate()
The key to this solution working is os.set_blocking(), which I found out about here. Without it readline() blocks indefinitely.
I don't love the time.sleep(2) since it assumes it will take 2 seconds or less for the student script to reach the point where it calls input(), but there does not seem to be any way to determine when a process is looking for input from stdin. The sleep time could be increased for longer scripts.
If you've got any ideas for improvements, please share.
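One possible way to avoid the fixed time.sleep(2) is to wait until the child's stdout has gone quiet before sending each input. The sketch below is not from the original answer: the function name run_with_transcript and the parameters quiet_s and timeout_s are made up for illustration, it relies on select.select() working on pipes (so Unix only, like the solution above), and it assumes the script prints a prompt before every input() call.

```python
import os
import select
import subprocess
import sys


def run_with_transcript(cmd, inputs, quiet_s=0.3, timeout_s=5.0):
    """Run cmd and return its stdout with each input line spliced in where it was sent.

    Hypothetical helper: quiet_s and timeout_s are illustrative tuning knobs.
    """
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    fd = p.stdout.fileno()
    chunks = []

    def read_until_quiet():
        # Wait up to timeout_s for the first output, then keep reading until
        # quiet_s passes with nothing new. Returns False once stdout hits EOF.
        wait = timeout_s
        while select.select([fd], [], [], wait)[0]:
            data = os.read(fd, 4096)
            if not data:
                return False
            chunks.append(data)
            wait = quiet_s
        return True

    for line in inputs:
        if not read_until_quiet():
            break  # child closed stdout before asking for this input
        data = (line + '\n').encode()
        chunks.append(data)  # echo the input the way a terminal would
        p.stdin.write(data)
        p.stdin.flush()
    p.stdin.close()

    # Collect whatever the child prints after the last input, until EOF
    # or until timeout_s passes with no output.
    while select.select([fd], [], [], timeout_s)[0]:
        data = os.read(fd, 4096)
        if not data:
            break
        chunks.append(data)

    try:
        p.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        p.kill()  # something is wrong with the student script
        p.wait()
    p.stdout.close()
    return b''.join(chunks).decode()


if __name__ == '__main__':
    # Stand-in for student_script.py so the example runs on its own.
    student = "name = input('Enter your name: ')\nprint(f'Hello {name}!')"
    print(run_with_transcript([sys.executable, '-c', student], ['Bob']), end='')
```

This is still a heuristic: a script that computes silently for longer than quiet_s between printing part of its output and calling input() would get its input spliced in too early, so the quiet period may need tuning.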

Related

How can I take piped input from another application AND take user input after?

import sys
stdin_input = sys.stdin.read()
print(f"Info loaded from stdin: {stdin_input}")
user_input = input("User input goes here: ")
Error received:
C:\>echo "hello" | python winput.py
Info loaded from stdin: "hello"
User input goes here: Traceback (most recent call last):
  File "C:\winput.py", line 6, in <module>
    user_input = input("User input goes here: ")
EOFError: EOF when reading a line
I've recently learned this is because sys.stdin is being used for FIFO, which leaves it closed after reading.
I can make it work on CentOS by adding sys.stdin = open("/dev/tty") after stdin_input = sys.stdin.read() based on this question, but this doesn't work for Windows.
Rather than identifying the OS and assigning a new value to sys.stdin accordingly, I'd prefer to approach it dynamically. Is there a way to identify the equivalent of /dev/tty in every case, without having to know what /dev/tty or its equivalent is for the specific OS?
Edit:
The reason for the sys.stdin.read() is to take in JSON input piped from another application. I also have an option to read the JSON data from a file, but being able to use the piped data is very convenient. Once the data is received, I'd like to get user input separately.
I'm currently working around my problem with the following:
if os.name == "posix":
    sys.stdin = open("/dev/tty")
elif os.name == "nt":
    sys.stdin = open("con")
else:
    raise RuntimeError(
        f"Error trying to assign to sys.stdin due to unknown os {os.name}"
    )
This may very well work in all cases, but it would still be preferable to determine dynamically what /dev/tty, con, or the equivalent is for the OS. If it's not possible and my workaround is the best solution, I'm okay with that.

Since you're using Bash, you can avoid this problem by using process substitution, which is like a pipe, but delivered via a temporary filename argument instead of via stdin.
That would look like:
winput.py <(another-application)
Then in your Python script, receive the argument and handle it accordingly:
import json
import sys
with open(sys.argv[1]) as f:
    d = json.load(f)
print(d)
user_input = input("User input goes here: ")
print('User input:', user_input)
(sys.argv is just used for demo. In a real script I'd use argparse.)
Example run:
$ tmp.py <(echo '{"someKey": "someValue"}')
{'someKey': 'someValue'}
User input goes here: 6
User input: 6
The other massive advantage of this is that it works seamlessly with actual filenames, for example:
$ cat test.json
{"foo": "bar"}
$ tmp.py test.json
{'foo': 'bar'}
User input goes here: x
User input: x

So your real issue is that sys.stdin can be only one of two things:
Connected to the typed input from the terminal
Connected to some file-like object that is not the terminal (actual file, pipe, whatever)
It doesn't matter that you consumed all of sys.stdin by doing sys.stdin.read(); once sys.stdin was redirected to some file-system object, you lost the ability to read from the terminal via sys.stdin.
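To see which of the two cases a given run is in, one option is to ask the stream whether it is attached to a terminal. This is only a small sketch; the helper name stdin_kind is made up for illustration and is not part of the original answer.

```python
import io
import sys


def stdin_kind(stream=None):
    """Return 'terminal' if the stream is attached to a tty, else 'redirected'."""
    stream = sys.stdin if stream is None else stream
    return 'terminal' if stream.isatty() else 'redirected'


if __name__ == '__main__':
    # Reports on the real sys.stdin, then on an in-memory stand-in for piped data.
    print(stdin_kind())
    print(stdin_kind(io.StringIO('{"someKey": "someValue"}')))
```

Run directly from a shell, the first call should report a terminal; run behind a pipe such as echo "hello" | python script.py, it should report a redirected stream.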
In practice, I'd strongly suggest not trying to do this. Use argparse, accept on the command line whatever you were considering accepting via input(), and avoid the whole problem. (I basically never see real production code that's not a REPL of some sort dynamically interacting with the user via stdin/stdout; for non-REPL cases, sys.stdin is basically always either unused or piped from a file/program, because writing clean user-interaction code like this is a pain, and it's a pain for the user to have to type their responses without making mistakes.) The input that might come from a file or stdin can be handled by passing type=argparse.FileType() to the add_argument call in question, and the user can then opt to pass either a file name or - (where - means "Read from stdin"), leaving your code looking like:
import argparse
parser = argparse.ArgumentParser(description='Program description here')
parser.add_argument('inputfile', type=argparse.FileType(), help='Description here; pass "-" to read from stdin')
# default=[] keeps the loop below safe when no -c option is given
parser.add_argument('-c', '--cmd', action='append', default=[], help='User commands to execute after processing input file')
args = parser.parse_args()
with args.inputfile as f:
    data = f.read()
for cmd in args.cmd:
    pass  # Do stuff based on cmd
The user can then do:
otherprogram_that_generates_data | myprogram.py - -c 'command 1' -c 'command 2'
or:
myprogram.py file_containing_data -c 'command 1' -c 'command 2'
or (on shells with process substitution, like bash, as an alternative to the first use case):
myprogram.py <(otherprogram_that_generates_data) -c 'command 1' -c 'command 2'
and it works either way.
If you must do this, your existing solution is really the only reasonable solution, but you can make it a little cleaner by factoring it out and making only the path dynamic, not the whole code path:
import contextlib
import os
import sys
TTYNAMES = {"posix": "/dev/tty", "nt": "con"}
@contextlib.contextmanager
def stdin_from_terminal():
    try:
        ttyname = TTYNAMES[os.name]
    except KeyError:
        raise OSError(f"{os.name} does not support manually reading from the terminal")
    with open(ttyname) as tty:
        sys.stdin, oldstdin = tty, sys.stdin
        try:
            yield
        finally:
            sys.stdin = oldstdin
This will probably die with an OSError subclass on the open call if run without a connected terminal, e.g. when launched with pythonw on Windows (another reason not to use this design), or launched in non-terminal ways on UNIX-likes, but that's better than silently misbehaving.
You'd use it with just:
with stdin_from_terminal():
    user_input = input("User input goes here: ")
and it would restore the original sys.stdin automatically when the with block is exited.

Bypass a command subprocess that ask for user input ? (password) [duplicate]

I'm working with a piece of scientific software called Chimera. For some of the code downstream of this question, it requires that I use Python 2.7.
I want to call a process, give that process some input, read its output, give it more input based on that, etc.
I've used Popen to open the process and process.stdin.write to pass standard input, but then I've gotten stuck trying to get output while the process is still running: process.communicate() stops the process, and process.stdout.readline() seems to keep me in an infinite loop.
Here's a simplified example of what I'd like to do:
Let's say I have a bash script called exampleInput.sh.
#!/bin/bash
# exampleInput.sh
# Read a number from the input
read -p 'Enter a number: ' num
# Multiply the number by 5
ans1=$( expr $num \* 5 )
# Give the user the multiplied number
echo $ans1
# Ask the user whether they want to keep going
read -p 'Based on the previous output, would you like to continue? ' doContinue
if [ $doContinue == "yes" ]
then
echo "Okay, moving on..."
# [...] more code here [...]
else
exit 0
fi
Interacting with this through the command line, I'd run the script, type in "5" and then, if it returned "25", I'd type "yes" and, if not, I would type "no".
I want to run a python script where I pass exampleInput.sh "5" and, if it gives me "25" back, then I pass "yes"
So far, this is as close as I can get:
#!/home/user/miniconda3/bin/python2
# talk_with_example_input.py
import subprocess
process = subprocess.Popen(["./exampleInput.sh"],
stdin = subprocess.PIPE,
stdout = subprocess.PIPE)
process.stdin.write("5")
answer = process.communicate()[0]
if answer == "25":
process.stdin.write("yes")
## I'd like to print the STDOUT here, but the process is already terminated
But that fails of course, because after process.communicate(), my process isn't running anymore.
(Just in case/FYI): Actual problem
Chimera is usually a gui-based application to examine protein structure. If you run chimera --nogui, it'll open up a prompt and take input.
I often need to know what chimera outputs before I run my next command. For example, I will often try to generate a protein surface and, if Chimera can't generate a surface, it doesn't break--it just says so through STDOUT. So, in my python script, while I'm looping through many proteins to analyze, I need to check STDOUT to know whether to continue analysis on that protein.
In other use cases, I'll run lots of commands through Chimera to clean up a protein first, and then I'll want to run lots of separate commands to get different pieces of data, and use that data to decide whether to run other commands. I could get the data, close the subprocess, and then run another process, but that would require re-running all of those cleaning up commands each time.
Anyways, those are some of the real-world reasons why I want to be able to push STDIN to a subprocess, read the STDOUT, and still be able to push more STDIN.
Thanks for your time!

You don't need to use process.communicate in your example.
Simply read and write using process.stdin.write and process.stdout.read. Also make sure to send a newline, otherwise read won't return. And when you read from stdout, you also have to handle the newlines coming from echo.
Note: process.stdout.read will block until EOF.
# talk_with_example_input.py
import subprocess
process = subprocess.Popen(["./exampleInput.sh"],
stdin = subprocess.PIPE,
stdout = subprocess.PIPE)
process.stdin.write("5\n")
stdout = process.stdout.readline()
print(stdout)
if stdout == "25\n":
    process.stdin.write("yes\n")
    print(process.stdout.readline())
$ python2 test.py
25
Okay, moving on...
Update
When communicating with a program in that way, you have to pay special attention to what the application is actually writing. It is best to analyze the output in a hex editor:
$ chimera --nogui 2>&1 | hexdump -C
Please note that readline [1] only reads to the next newline (\n). In your case you have to call readline at least four times to get that first block of output.
If you just want to read everything up until the subprocess stops printing, you have to read byte by byte and implement a timeout. Sadly, neither read nor readline provides such a timeout mechanism. This is probably because the underlying read syscall [2] (Linux) does not provide one either.
On Linux we can write a single-threaded read_with_timeout() using poll / select. For an example see [3].
from select import epoll, EPOLLIN
def read_with_timeout(fd, timeout__s):
    """Reads from fd until there is no new data for at least timeout__s seconds.
    This only works on Linux >= 2.5.44.
    """
    buf = []
    e = epoll()
    e.register(fd, EPOLLIN)
    while True:
        ret = e.poll(timeout__s)
        # Stop on timeout, or when the event carries no readable data (e.g. hang-up only)
        if not ret or not (ret[0][1] & EPOLLIN):
            break
        buf.append(
            fd.read(1)
        )
    return ''.join(buf)
If you need a reliable way to do non-blocking reads on both Windows and Linux, this answer might be helpful.
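For reference, one commonly used portable pattern is to let a background thread do the blocking reads and hand the bytes over through a queue, so that the main thread can wait with a timeout. The sketch below is written for Python 3 (unlike the Python 2 code above), and the names start_reader and read_available are made up for illustration; it is not necessarily what the linked answer does.

```python
import queue
import subprocess
import sys
import threading


def start_reader(stream):
    """Pump bytes from stream into a queue on a daemon thread; None marks EOF."""
    q = queue.Queue()

    def pump():
        # read(1) blocks only this thread, and also picks up prompts
        # that do not end with a newline.
        for byte in iter(lambda: stream.read(1), b''):
            q.put(byte)
        q.put(None)

    threading.Thread(target=pump, daemon=True).start()
    return q


def read_available(q, timeout_s):
    """Collect bytes until none arrives for timeout_s seconds or EOF is seen."""
    buf = []
    while True:
        try:
            item = q.get(timeout=timeout_s)
        except queue.Empty:
            break  # the subprocess has been quiet for timeout_s seconds
        if item is None:
            break  # the subprocess closed its stdout
        buf.append(item)
    return b''.join(buf)


if __name__ == '__main__':
    # Stand-in child process so the example runs on its own.
    child = subprocess.Popen(
        [sys.executable, '-c', "print('first'); print('second')"],
        stdout=subprocess.PIPE,
    )
    reader = start_reader(child.stdout)
    print(read_available(reader, 1.5))
    child.wait()
```

Reading one byte at a time is slow but mirrors the epoll version above; the thread avoids the need for select or epoll, which do not work on pipes under Windows.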
[1] from the python 2 docs:
readline(limit=-1)
Read and return one line from the stream. If limit is specified, at most limit bytes will be read.
The line terminator is always b'\n' for binary files; for text files, the newline argument to open() can be used to select the line terminator(s) recognized.
[2] from man 2 read:
#include <unistd.h>
ssize_t read(int fd, void *buf, size_t count);
[3] example
$ tree
.
├── prog.py
└── prog.sh
prog.sh
#!/usr/bin/env bash
for i in $(seq 3); do
echo "${RANDOM}"
sleep 1
done
sleep 3
echo "${RANDOM}"
prog.py
# talk_with_example_input.py
import subprocess
from select import epoll, EPOLLIN
def read_with_timeout(fd, timeout__s):
    """Reads from fd until there is no new data for at least timeout__s seconds.
    This only works on Linux >= 2.5.44.
    """
    buf = []
    e = epoll()
    e.register(fd, EPOLLIN)
    while True:
        ret = e.poll(timeout__s)
        # Stop on timeout, or when the event carries no readable data (e.g. hang-up only)
        if not ret or not (ret[0][1] & EPOLLIN):
            break
        buf.append(
            fd.read(1)
        )
    return ''.join(buf)
process = subprocess.Popen(
["./prog.sh"],
stdin = subprocess.PIPE,
stdout = subprocess.PIPE
)
print(read_with_timeout(process.stdout, 1.5))
print('-----')
print(read_with_timeout(process.stdout, 3))
$ python2 prog.py
6194
14508
11293
-----
10506

Interact with python subprocess once waits for user input

I'm working on a script to automate tests of a certain software, and as part of it I need to check if it runs commands correctly.
I'm currently launching an executable using subprocess and passing the initial parameters.
My code is: subprocess.run("program.exe get -n WiiVNC", shell=True, check=True)
As far as I understand, this runs the executable and is supposed to raise an exception if the exit code is non-zero.
Now, the program launches, but at some point it waits for user input.
My question is: how do I submit the user input "y" using subprocess, either once the text "Continue with download of "WiiVNC"? (y/n) >" shows up or once the program waits for user input?

You should use the pexpect module for all complicated subprocessing. In particular, the module is designed to handle the complicated case of either passing through the input to the current process for the user to answer and/or allowing your script to answer the input for the user and continue the subprocess.
Added some code for an example:
### File Temp ###
# #!/bin/env python
# x = input('Type something:')
# print(x)
import pexpect
x = pexpect.spawn('python temp') #Start subprocess.
x.interact() #Embed subprocess in current process.
# or
x = pexpect.spawn('python temp') #Start subprocess.
find_this_output = x.expect(['Type something:'])
if find_this_output == 0:
    # sendline appends the newline that input() in the subprocess is waiting for
    x.sendline('I type this in for subprocess because I found the 0th string.')

Try this:
import subprocess
process = subprocess.Popen("program.exe get -n WiiVNC", stdin=subprocess.PIPE, shell=True)
process.stdin.write(b"y\n")
process.stdin.flush()
stdout, stderr = process.communicate()
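If the answer is known in advance and nothing needs to be inspected before replying, the same effect can probably be had with subprocess.run() and its input parameter, which also captures the output. This is only a sketch: the helper name answer_prompt is made up, and the small Python child below is a stand-in for program.exe, whose real behaviour I can't test.

```python
import subprocess
import sys


def answer_prompt(cmd, answer):
    """Run cmd, feed it one line on stdin, and return everything it wrote to stdout."""
    result = subprocess.run(
        cmd,
        input=answer + "\n",   # the newline plays the role of pressing Enter
        capture_output=True,
        text=True,
        check=True,            # raises CalledProcessError on a non-zero exit code
    )
    return result.stdout


if __name__ == "__main__":
    # Hypothetical stand-in for "program.exe get -n WiiVNC".
    child = [
        sys.executable, "-c",
        "a = input('Continue? (y/n) > '); print('downloading' if a == 'y' else 'aborted')",
    ]
    print(answer_prompt(child, "y"))
```

Note that the input is written up front rather than when the prompt appears, so this only fits programs that read their answers from stdin in a fixed order.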

Python subprocess output blocking on application prompt

I am running jirashell in a python script using the subprocess library. I am currently having issues getting the output to print in real time. When I run jirashell, it outputs information and then prompts the user (y/n). The subprocess won't print out information prior to the prompt until I enter 'y' or 'n'.
The code I am using is
_consumer_key = "justin-git"
_cmd = "jirashell -s {0} -od -k {1} -ck {2} -pt".format(JIRA_SERVER,
_rsa_private_key_path, _consumer_key)
p = subprocess.Popen(_cmd.split(" "), stdout=subprocess.PIPE,
stderr=subprocess.PIPE, bufsize=0)
out, err = p.communicate() # Blocks here
print out
print err
The output is like so:
n # I enter a "n" before program will print output.
Output:
Request tokens received.
Request token: asqvavefafegagadggsgewgqqegqgqge
Request token secret: asdbresbdfbrebsaerbbsbdabweabfbb
Please visit this URL to authorize the OAuth request:
http://localhost:8000/plugins/servlet/oauth/authorize?oauth_token=zzzzzzzzzzzzzzzzzzzzzzzzzzzzz
Have you authorized this program to connect on your behalf to http://localhost:8000? (y/n)
Error:
Abandoning OAuth dance. Your partner faceplants. The audience boos. You feel shame.
Does anyone know how I can have it print the output prior to the prompt and then wait for an input of y/n? Note I also need to be able to store the output produced by the command, so "os.system()" won't work...
EDIT:
It looks like inside jirashell there is a part of the code that is waiting for an input and this is causing the block. Until something is passed into this input nothing is outputted... Still looking into how I can get around this. I'm in the process of trying to move the portion of code I need into my application. This solution doesn't seem elegant but I can't see any other way right now.
approved = input(
'Have you authorized this program to connect on your behalf to {}? (y/n)'.format(server))

Method which prints and caches the standard output:
You can use a thread which reads the standard output of your subprocess, while the main thread is blocked until the subprocess is done. The following example will run the program other.py, which looks like
#!/usr/bin/env python3
print("Hello")
x = input("Type 'yes': ")
Example:
import threading
import subprocess
import sys
class LivePrinter(threading.Thread):
    """
    Thread which reads byte-by-byte from the input stream and writes it to the
    standard out.
    """
    def __init__(self, stream):
        self.stream = stream
        self.log = bytearray()
        super().__init__()
    def run(self):
        while True:
            # read one byte from the stream
            buf = self.stream.read(1)
            # break if end of file reached
            if len(buf) == 0:
                break
            # save output to internal log
            self.log.extend(buf)
            # write and flush to main standard output
            sys.stdout.buffer.write(buf)
            sys.stdout.flush()
# create subprocess
p = subprocess.Popen('./other.py', stdout=subprocess.PIPE)
# Create reader and start the thread
r = LivePrinter(p.stdout)
r.start()
# Wait until subprocess is done and the reader thread has drained its output
p.wait()
r.join()
# Print summary
print(" -- The process is done now -- ")
print("The standard output was:")
print(r.log.decode("utf-8"))
The class LivePrinter reads every byte from the subprocess and writes it to the standard output. (I have to admit, this is not the most efficient approach, but a larger buffer size blocks the LivePrinter until the buffer is full, even though the subprocess is awaiting the answer to a prompt.) Since the bytes are written to sys.stdout.buffer, there shouldn't be a problem with multi-byte utf-8 characters.
The LivePrinter class also stores the complete output of the subprocess in the variable log for later use.
As this answer summarizes, it is safe to start a thread after forking with subprocess.
Original answer which has problems, when the prompt line doesn't end a line:
The output is delayed because communicate() blocks the execution of your script until the sub-process is done (https://docs.python.org/3/library/subprocess.html#subprocess.Popen.communicate).
You can read and print the standard output of the subprocess, while it is executed using stdout.readline. There are some issues about buffering, which require this rather complicated iter(process.stdout.readline, b'') construct. The following example uses gpg2 --gen-key because this command starts an interactive tool.
import subprocess
process = subprocess.Popen(["gpg2", "--gen-key"], stdout=subprocess.PIPE)
for stdout_line in iter(process.stdout.readline, b''):
    print(stdout_line.rstrip())
Alternative answer which uses shell and does not cache the output:
As Sam pointed out, there is a problem with the above solution when the prompt line does not end with a newline (which prompts usually don't). An alternative solution is to use the shell argument to interact with the sub-process.
import subprocess
subprocess.call("gpg2 --gen-key", shell=True)

Python - Run a simple command-line program with prompted I/O and "proxy" it, on Windows

I have a simple command-line binary program hello which outputs to STDOUT:
What is your name?
and waits for the user to input it. After receiving their input it outputs:
Hello, [name]!
and terminates.
I want to use Python to run computations on the final output of this program ("Hello, [name]!"), however before the final output I want the Python script to essentially "be" the binary program. In other words I'd like Python to forward all of the prompts to STDOUT and then accept the user's input and give it to the program. However I want to hide the final output so I can process it and show my own results to the user. I do not want to replicate hello's behavior in the script, as this simple program is a stand-in for a more complex program that I am actually working with.
I was hoping there would be some sort of mechanism in subprocess where I would be able to do something akin to:
while process.is_running():
    next_char = process.stdout.read(1)
    if next_char == input_prompt_thing: # somehow check if the program is waiting for input
        user_input = raw_input(buffer)
        process.stdin.write(user_input)
    else:
        buffer += next_char
I have been playing with subprocess and essentially got as far as realizing I could use process.stdout.read(1) to read from the program before it began blocking, but I can't figure out how to break this loop before the process blocks my Python script. I am not too familiar with console I/O and it is not an area of much expertise for me, so I am starting to feel pretty lost. I appreciate any help!

You could try winpexpect (not tested):
import re
from winpexpect import winspawn
p = winspawn('hello')
prompt = "What is your name?"
p.expect(re.escape(prompt))
name = raw_input(prompt)
p.sendline(name)
p.expect("Hello, .*!")
output = p.after
# ...
