Reading from flushed vs unflushed buffers - python

I've got a script parent.py trying to read stdout from a subprocess sub.py in Python.
The parent parent.py:
#!/usr/bin/python
import subprocess
p = subprocess.Popen("sub.py", stdout=subprocess.PIPE)
print p.stdout.read(1)
And the subprocess, sub.py:
#!/usr/bin/python
print raw_input( "hello world!" )
I would expect running parent.py to print the 'h' from "hello world!". Actually, it hangs. I can only get my expected behaviour by adding -u to sub.py's shebang line.
This confuses me because the -u switch makes no difference when sub.py is run directly from a shell; the shell is somehow privy to the unflushed output stream, unlike parent.py.
My goal is to run a C program as the subprocess, so I won't be able to control whether or not it flushes stdout. How is it that a shell has better access to a process's stdout than Python running the same thing from subprocess.Popen? Am I going to be able to read such a stdout stream from a C program that doesn't flush its buffers?
EDIT:
Here is an updated example based on korylprince's comment...
## capitalize.sh ##
#!/bin/sh
while true; do
    read s
    echo "$s" | tr '[:lower:]' '[:upper:]'
done
########################################
## parent.py ##
#!/usr/bin/python
from subprocess import Popen, PIPE
# cmd = [ 'capitalize.sh' ] # This would work
cmd = [ 'script', '-q', '-f', '-c', 'capitalize.sh', '/dev/null']
p = Popen(cmd, stdin=PIPE)
p.stdin.write("some string\n")
p.wait()
When running through script, I get steady printing of newlines (and if this were a Python subprocess, it'd raise an EOFError).

An alternative is
p = subprocess.Popen(["python", "-u", "sub.py"], stdout=subprocess.PIPE)
or the suggestions here.
My experience is that yes, you will be able to read from most C programs without any extra effort.
The Python interpreter takes extra steps to buffer its output which is why it needs the -u switch to disable output buffering. Your typical C program won't do this.
Other than the Python interpreter, I haven't run into any program (C or otherwise) that I expected to work in a subshell and didn't.

The reason the shell can read output immediately, regardless of -u, is that the program you're launching from the shell has its output connected to a TTY. When stdout is connected to a TTY it is line-buffered rather than block-buffered, so output appears as each line is completed. When you launch the python subprocess from within python, you're connecting stdout to a pipe, which means you're at the mercy of the subprocess to flush its output when it feels like it.
If you're looking to do complicated interactions with a subprocess, look into this tutorial.
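If you can't pass -u (say, an arbitrary C program), you can hand the child a pseudo-terminal yourself so its stdio library buffers as if it were on a console. A minimal sketch, assuming Linux and that sub.py is executable in the current directory:
import os
import pty
import subprocess

# The slave end becomes the child's stdout, so its C library sees a TTY
# and line-buffers instead of block-buffering.
master, slave = pty.openpty()
p = subprocess.Popen(["./sub.py"], stdout=slave)
os.close(slave)                     # keep only the master end in the parent
print(os.read(master, 1))           # the 'h' arrives without any flush in the child
p.terminate()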

Related

How to execute a command and read/write to its STDIN/TTY (together)?

I've seen examples and questions about how to do these things individually. But in this question I'm trying to do them all jointly.
Basically my case is that I have a command that needs me to write to its STDIN, read from its STDOUT, and to answer its TTY prompts. All done with a single execution of the command. Not that it matters, but if you're curious, the command is scrypt enc - out.enc.
Restrictions: must be pure Python.
Question: how to do it?
I tried these:
import pty
import os
import subprocess
master, slave = pty.openpty()
p = subprocess.Popen(['sudo', 'ls', '-lh'], stdin=slave, stdout=master)
x = os.read(master, 1024)  # os.read() requires a byte count
print(x)
stdout, stderr = p.communicate(b'lol\r\n')
import pty
import os
import sys
import subprocess
def read(fd):
    data = os.read(fd, 1024)
    data_str = data.decode()
    if data_str.find('[sudo] password for') == 0:
        data_str = 'password plz: '
    sys.stdout.write(data_str)
    sys.stdout.flush()

def write(fd):
    x = 'lol\r\n'
    for b in x.encode():
        os.write(fd, b)

pty.spawn(['sudo', 'ls', '-lh'], read, write)
The goal is to fully wrap the TTY prompts so that they are not visible to the user, and at the same time to feed a password to the process's TTY input to make sudo happy.
Based on that goal, none of these attempts work for various reasons.
But it is even worse: suppose they worked, how could I feed the process something on its STDIN and on its TTY input separately? What confuses me is that the Popen example literally maps stdin to the TTY (pty), so how can the process tell which is which? How will it know that some input is meant for STDIN and not for the TTY?
Disclaimer:
Discussing this topic in detail would require a lot of text so I will try to simplify things to keep it short. I will try to include as many "for further reading" links as possible.
To make it short: a process has only one input stream, and that is STDIN. In a normal terminal, STDIN is connected to a TTY. So what you "type on the TTY" will be read by the shell. The shell then decides what to do with it. If there is a program running, it will send the input to that program's STDIN.
If you run something with Popen in python, that will not have a tty. You can check that easily by doing this:
from subprocess import Popen, PIPE
p = Popen("tty", stdin=PIPE, stdout=PIPE, stderr=PIPE)
o, e = p.communicate()
print(o)
It will produce this output: b'not a tty\n'
But how does scrypt manage to use a TTY, then? Because that is what it does.
You have to look at the manpage and the code to find the answer.
If -P is not given, scrypt reads passphrases from its controlling terminal, or failing that, from stdin.
What it actually does is open /dev/tty (look at the code). That device refers to the controlling terminal and exists even when the process's standard streams are pipes, so scrypt can open it and try to read the password from it.
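You can see this for yourself. The following check (nothing scrypt-specific here; run it from a terminal, since it fails for a process with no controlling terminal) shows a child reaching the terminal even though its stdout is captured by a pipe:
from subprocess import Popen, PIPE

# Even with stdout piped, the child reaches the terminal through /dev/tty,
# just like scrypt reaches it for the passphrase prompt.
child_code = 'open("/dev/tty", "w").write("hello via /dev/tty\\n")'
p = Popen(["python3", "-c", child_code], stdout=PIPE)
p.wait()   # the message appears on your terminal, not in the pipe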
How can you solve your problem now?
Well, that is easy in this case. Check the manpage for the -P parameter.
Here is a working example:
from subprocess import Popen, PIPE
p = Popen("scrypt enc -P - out.enc", stdin=PIPE, stdout=PIPE, stderr=PIPE, universal_newlines=True, shell=True)
p.communicate("pwd\nteststring")
This will encrypt the string "teststring" with the password "pwd".
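If you want to convince yourself it worked, a round trip with scrypt dec should print the plaintext again. A sketch under the same assumptions (scrypt on PATH, the out.enc written above, Python 3.5+ for run()):
from subprocess import run, PIPE

# -P again reads the passphrase from the first line of stdin
result = run("scrypt dec -P out.enc", input="pwd\n", stdout=PIPE,
             universal_newlines=True, shell=True)
print(result.stdout)   # expected: teststring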
There are a lot of "hacks" around ttys etc. but you should avoid those as they can have unexpected results. For example, start a shell and run tty then run a second shell and run cat with the output of the tty command (e.g. cat /dev/pts/7). Then type something in the first shell and watch what happens.
If you don't want to try it out, some characters will end up in the first shell, some in the second.
Check this post and this article about what a TTY is and where it comes from.

Pass variable to bash command with Python

I have the following code:
from subprocess import Popen, PIPE
p = Popen("C:/cygwin64/bin/bash.exe", stdin=PIPE, stdout=PIPE)
path = "C:/Users/Link/Desktop/folder/"
p.stdin.write(b"cd " + str.encode(path)))
p.stdin.close()
out = p.stdout.read()
print(out)
The output is b''
Is there any way to pass a variable to the bash command, as in p.stdin.write(b"cd " + path)?
I ask because written as above it doesn't work. The output is empty, as if Cygwin started and nothing else happened.
EDIT
Since the question doesn't seem to be clear enough, I'll add this scenario:
I am on Windows and I am using Python 3.6.
I have a bash command that requires Cygwin to be executed. This command may have a variable in its string, which changes on every execution. Imagine a for loop that executes a command.
For example (an ImageMagick command):
convert image.jpg -resize 1024x768 output_file.jpg
How can I execute this command from Python with output_file.jpg as a variable?
Bash doesn't run in interactive mode by default unless it detects that its standard streams are connected to a terminal. You piped them in, so they're definitely not connected to a terminal.
Bash does not display any prompts in non-interactive mode, hence you see nothing. You can force it to be interactive with the -i switch.
However, even then, it writes its prompts not to stdout but to stderr; you can pipe stderr into stdout
from subprocess import Popen, PIPE, STDOUT
p = Popen(["C:/cygwin64/bin/bash.exe", "-i"], stdin=PIPE, stdout=PIPE, stderr=STDOUT)
and you will capture the prompts and such.
Or use your original approach with a command that does produce output - here pwd that prints the current working directory:
p.stdin.write(b"cd " + path.encode() + b"\n")
p.stdin.write(b"pwd")
It is tricky to talk to an interactive process like this, though: read too little and you deadlock; write too much and you deadlock. This is why Popen has the .communicate method for providing all of the input at once and getting stdout and stderr back afterwards.
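A sketch of that communicate() approach for this case (same Cygwin bash path as above; the ImageMagick output file name is an ordinary Python variable interpolated into the command string):
from subprocess import Popen, PIPE

output_file = "output_file.jpg"   # the variable part of the command
cmd = "convert image.jpg -resize 1024x768 " + output_file + "\n"
p = Popen(["C:/cygwin64/bin/bash.exe"], stdin=PIPE, stdout=PIPE)
out, _ = p.communicate(cmd.encode())   # send everything, then wait for bash to exit
print(out)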
As it seems you are using the Cygwin Python, you should use proper POSIX paths and not Windows-style ones.
Instead of
p = Popen("C:/cygwin64/bin/bash.exe", stdin=PIPE, stdout=PIPE)
use
p = Popen("/bin/bash.exe", stdin=PIPE, stdout=PIPE)

Need help to read out the output of a subprocess

My python script (python 3.4.3) calls a bash script via subprocess.
OutPST = subprocess.check_output(cmd,shell=True)
It works, but the problem is that I only get half of the data. The script I call spawns another subprocess, and my guess is that when that "sub-subprocess" sends EOF, my program thinks that's it and ends the check_output.
Does anyone have an idea how to get all the data?
You should use subprocess.run() (available since Python 3.5) unless you really need fine-grained control over talking to the process via its stdin (or need to do something else while the process runs instead of blocking until it finishes). It makes capturing output super easy:
from subprocess import run, PIPE
result = run(cmd, stdout=PIPE, stderr=PIPE)  # add shell=True if cmd is a single string, as in the question
print(result.stdout)
print(result.stderr)
If you want to merge stdout and stderr (like how you'd see it in your terminal if you didn't do any redirection), you can use the special destination STDOUT for stderr:
from subprocess import STDOUT
result = run(cmd, stdout=PIPE, stderr=STDOUT)
print(result.stdout)

Check on the stdout of a running subprocess in python

I need to periodically check the stdout of a running process. For example, the process is tail -f /tmp/file, which is spawned in the python script. Then every x seconds, the stdout of that subprocess is written to a string and further processed. The subprocess is eventually stopped by the script.
To parse the stdout of a subprocess, I have used check_output until now, which doesn't seem to work, as the process is still running and doesn't produce definite output.
>>> from subprocess import check_output
>>> out = check_output(["tail", "-f", "/tmp/file"])
#(waiting for tail to finish)
It should be possible to use threads for the subprocesses, so that the output of multiple subprocesses may be processed (e.g. tail -f /tmp/file1, tail -f /tmp/file2).
How can I start a subprocess, periodically check and process its stdout and eventually stop the subprocess in a multithreading friendly way? The python script runs on a Linux system.
The goal is not to continuously read a file, the tail command is an example, as it behaves exactly like the actual command used.
edit: I didn't think this through, the file did not exist. check_output now simply waits for the process to finish.
edit2: An alternative method, with Popen and PIPE appears to result in the same issue. It waits for tail to finish.
>>> from subprocess import Popen, PIPE, STDOUT
>>> cmd = 'tail -f /tmp/file'
>>> p = Popen(cmd, shell=True, stdin=PIPE, stdout=PIPE, stderr=STDOUT, close_fds=True)
>>> output = p.stdout.read()
#(waiting for tail to finish)
Your second attempt is 90% correct. The only issue is that you are attempting to read all of tail's stdout at once, after it has finished. However, tail is intended to run (indefinitely?) in the background, so you really want to read its stdout line by line:
from subprocess import Popen, PIPE, STDOUT
p = Popen(["tail", "-f", "/tmp/file"], stdin=PIPE, stdout=PIPE, stderr=STDOUT)
for line in p.stdout:
    print(line)
I have removed the shell=True and close_fds=True arguments. The first is unnecessary and potentially dangerous, while the second is just the default.
Remember that file objects are iterable over their lines in Python. The for loop will run until tail dies, but it will process each line as it appears, as opposed to read, which will block until tail dies.
If I create an empty file in /tmp/file, start this program and begin echoing lines into the file using another shell, the program will echo those lines. You should probably replace print with something a bit more useful.
Here is an example of commands I typed after starting the code above:
Command line
$ echo a > /tmp/file
$ echo b > /tmp/file
$ echo c >> /tmp/file
Program Output (From Python in a different shell)
b'a\n'
b'tail: /tmp/file: file truncated\n'
b'b\n'
b'c\n'
In the case that you want your main program to be responsive while you respond to the output of tail, start the loop in a separate thread. You should make this thread a daemon so that it does not prevent your program from exiting even if tail is not finished. You can have the thread open the subprocess, or you can just pass the standard output in to it. I prefer the latter approach, since it gives you more control in the main thread:
from subprocess import Popen, PIPE, STDOUT
from threading import Thread

def deal_with_stdout():
    for line in p.stdout:
        print(line)

p = Popen(["tail", "-f", "/tmp/file"], stdin=PIPE, stdout=PIPE, stderr=STDOUT)
t = Thread(target=deal_with_stdout, daemon=True)
t.start()
t.join()
The code here is nearly identical, with the addition of a new thread. I added a join() at the end so the program would behave well as an example (join waits for the thread to die before returning). You probably want to replace that with whatever processing code you would normally be running.
If your thread is complex enough, you may also want to inherit from Thread and override the run method instead of passing in a simple target.
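For completeness, here is a minimal sketch of that subclassing approach, combined with a queue so the main thread can drain the collected lines every x seconds as the question asks (tail -f as before; the interval and the three polling rounds are arbitrary):
import time
from queue import Queue
from subprocess import Popen, PIPE, STDOUT
from threading import Thread

class TailReader(Thread):
    def __init__(self, path):
        super().__init__(daemon=True)        # don't block program exit
        self.q = Queue()
        self.p = Popen(["tail", "-f", path], stdin=PIPE, stdout=PIPE, stderr=STDOUT)

    def run(self):
        for line in self.p.stdout:           # blocks only this thread
            self.q.put(line)

t = TailReader("/tmp/file")
t.start()
for _ in range(3):
    time.sleep(5)                            # "every x seconds"
    while not t.q.empty():
        print(t.q.get())                     # replace with real processing
t.p.terminate()                              # eventually stop the subprocess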

Python - subprocesses and the python shell

I am trying to shell out to a non-Python subprocess and allow it to inherit the stdin and stdout from Python. I am using subprocess.Popen.
This would probably work if I were calling from a console, but it definitely doesn't work when I am using the python shell
(I am using IDLE, by the way).
Is there any way to convince Python to allow a non-Python subprocess to print its stdout to the Python shell?
This works both from a script and from the interactive interpreter, but not from IDLE:
subprocess.Popen(whatever, stdin=sys.stdin, stdout=sys.stdout)
You can't use the objects which IDLE assigns to sys.stdin and sys.stdout as arguments to subprocess.Popen. These objects (the interfaces to the IDLE shell window) are file-like, but they aren't backed by real file descriptors with a working fileno(), and Unix-like operating systems require a file descriptor to be specified as the stdin or stdout of a subprocess. I cannot speak for Windows, but I imagine it has similar requirements.
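A quick way to see the difference is to ask the stream for its descriptor; this is what I'd expect under CPython, with real console streams returning a small integer and IDLE's pseudo-files raising instead:
import sys

try:
    print(sys.stdout.fileno())       # 1 in a real console
except Exception as exc:             # io.UnsupportedOperation under IDLE
    print("no real file descriptor:", exc)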
Taymon's answer addresses your question directly in that IDLE's stdin/stdout are actually file-like objects and not the standard file streams associated with a console/terminal. Moreover, on Windows IDLE runs under pythonw.exe, which doesn't even have an attached win32 console.
That said, if you just need the output from a program to be printed to the user in real time, then in many cases (but not all) you can read the output line by line and echo it accordingly. The following works for me in Windows IDLE. It demonstrates reading from a piped stdout line by line. It also shows what happens if the process buffers the pipe, in which case readline will block until either the buffer is full or the pipe closes. This buffering can be manually disabled with some programs (such as the Python interpreter's -u option), and there are workarounds for Unix such as stdbuf.
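On Unix, the stdbuf workaround mentioned above would look something like the following; ./some_c_program is a hypothetical fully-buffering child, and stdbuf -oL (GNU coreutils) asks its libc to line-buffer stdout even on a pipe:
from subprocess import Popen, PIPE

p = Popen(["stdbuf", "-oL", "./some_c_program"], stdout=PIPE)
for line in p.stdout:     # lines now arrive as they are produced
    print(line)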
test1.py:
import sys
import subprocess

def test(cmd):
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stdin=subprocess.PIPE,
                         stderr=subprocess.PIPE)
    it = iter(p.stdout.readline, b'')
    for line in it:
        print(line.rstrip().decode('ascii'))

print('Testing buffered subprocess...')
test([sys.executable, 'test2.py'])

print('\nTesting unbuffered subprocess...')
# -u: unbuffered binary stdout and stderr
test([sys.executable, '-u', 'test2.py'])
test2.py:
import time

for i in range(5):
    print(i)
    time.sleep(1)
The output in IDLE should be the following, with the first set of digits printed all at once after a delay and the second set printed line by line.
Testing buffered subprocess...
0
1
2
3
4
Testing unbuffered subprocess...
0
1
2
3
4
