Python 3 STDOUT Processing - python

I am new to Python and have searched high and low for an example of how to process the STDOUT of a subprocess, but with no luck.
I have a test command line executable that produces output continually and am trying to read the STDOUT in real-time and print this to the screen. Eventually it will process the data locally and then send a summary onto another machine, but I have simplified this down to ask this question.
I have tried to use Python to do this as shown in the code below:
import subprocess, time, os, sys
cmd = [sys.executable, 'TestStdOut.exe']
p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)
for line in iter(p.stdout.readline, b''):
    print(">>> " + line)
The test executable I am using just outputs dateTime for the moment, and the stdout looks like this:
....
16/11/2015 10:52:44
16/11/2015 10:52:44
16/11/2015 10:52:44
16/11/2015 10:52:44
16/11/2015 10:52:44
16/11/2015 10:52:45
....
But I am getting this error and I cannot work out why:
Traceback (most recent call last):
File "test_stdout.py", line 9, in <module>
print(">>> " + line)
TypeError: Can't convert 'bytes' object to str implicitly
I was expecting the Python to simply display the same lines as if I ran the executable from the command line. I have googled the error, but I cannot seem to find an answer.
What am I missing?
Many thanks
Pete
UPDATE: Thanks to 2RING, I went back and reviewed my notes, and this code now works:
import subprocess, time, os, sys
cmd = ['TestStdOut.exe']
p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)
while True:
    line = p.stdout.readline()
    # readline() returns bytes; an empty bytes object signals end of output
    if line != b'':
        print(line.rstrip().decode('utf-8'))
    else:
        break
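The TypeError above comes from concatenating a str with the bytes that a pipe returns in Python 3. Here is a minimal sketch of decoding each line before printing. The child is a stand-in Python one-liner (an assumption for illustration, not the TestStdOut.exe from the question), and the helper name stream_lines is hypothetical:

```python
import subprocess
import sys

# Stand-in child process (assumption): prints three lines and exits.
CHILD_CODE = "for i in range(3): print('line', i, flush=True)"


def stream_lines(cmd, encoding="utf-8"):
    """Yield each line of the child's combined stdout/stderr as str."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT) as proc:
        # readline() returns bytes; b'' means the pipe reached end-of-file.
        for raw in iter(proc.stdout.readline, b""):
            yield raw.decode(encoding).rstrip()


lines = list(stream_lines([sys.executable, "-c", CHILD_CODE]))
for text in lines:
    print(">>> " + text)
```

Decoding once at the boundary keeps the rest of the program working with str only.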

Related

How to run a bash script from within python and get all the output?

This is a direct follow-up question to the answer here, which I thought worked, but it does not!
I have the following test bash script (testbash.sh) which just creates some output and a lot of errors for testing purposes (running on Red Hat Enterprise Linux Server release 7.6 (Maipo) and also Ubuntu 16.04.6 LTS):
export MAX_SEED=2
echo "Start test"
pids=""
for seed in `seq 1 ${MAX_SEED}`
do
python -c "raise ValueError('test')" &
pids="${pids} $!"
done
echo "pids: ${pids}"
wait $pids
echo "End test"
If I run this script I get the following output:
Start test
pids: 68322 68323
Traceback (most recent call last):
File "<string>", line 1, in <module>
ValueError: test
Traceback (most recent call last):
File "<string>", line 1, in <module>
ValueError: test
[1]- Exit 1 python -c "raise ValueError('test')"
[2]+ Exit 1 python -c "raise ValueError('test')"
End test
That is the expected outcome. That is fine. I want to get errors!
Now here is the python code that is supposed to catch all the output:
from __future__ import print_function
import sys
import time
from subprocess import PIPE, Popen, STDOUT
from threading import Thread
try:
    from queue import Queue, Empty
except ImportError:
    from Queue import Queue, Empty # python 2.x
ON_POSIX = 'posix' in sys.builtin_module_names
def enqueue_output(out, queue):
    for line in iter(out.readline, b''):
        queue.put(line.decode('ascii'))
    out.close()
p = Popen(['. testbash.sh'], stdout=PIPE, stderr=STDOUT, bufsize=1, close_fds=ON_POSIX, shell=True)
q = Queue()
t = Thread(target=enqueue_output, args=(p.stdout, q))
t.daemon = True # thread dies with the program
t.start()
# read line without blocking
while t.is_alive():
    #time.sleep(1)
    try:
        line = q.get(timeout=.1)
    except Empty:
        print(line)
        pass
    else:
        # got line
        print(line, end='')
p.wait()
print('returncode = {}'.format(p.returncode))
But when I run this code I only get the following output:
Start test
pids: 70191 70192
Traceback (most recent call last):
returncode = 0
or this output (without the line End test):
Start test
pids: 10180 10181
Traceback (most recent call last):
File "<string>", line 1, in <module>
ValueError: test
Traceback (most recent call last):
File "<string>", line 1, in <module>
ValueError: test
returncode = 0
Most of the above output is missing! How can I fix this? Also, I need some way to check if any command in the bash script did not succeed. In the example this is the case, but the errorcode printed out is still 0. I expect an errorcode != 0.
It is not important to immediately get the output. A delay of some seconds is fine. Also if the output order is a bit mixed up this is of no concern. The important thing is to get all the output (stdout and stderr).
Maybe there is a simpler way to just get the output of a bash script which is started from python?
To be run with python3
from __future__ import print_function
import os
import stat
import sys
import time
from subprocess import PIPE, Popen, STDOUT
from threading import Thread
try:
    from queue import Queue, Empty
except ImportError:
    from Queue import Queue, Empty # python 2.x
ON_POSIX = 'posix' in sys.builtin_module_names
TESTBASH = '/tmp/testbash.sh'
def create_bashtest():
    with open(TESTBASH, 'wt') as file_desc:
        file_desc.write("""#!/usr/bin/env bash
export MAX_SEED=2
echo "Start test"
pids=""
for seed in `seq 1 ${MAX_SEED}`
do
    python -c "raise ValueError('test')" &
    pids="${pids} $!"
    sleep .1 # Wait so that error messages don't get out of order.
done
wait $pids; return_code=$?
sleep 0.2 # Wait for background messages to be processed.
echo "pids: ${pids}"
echo "End test"
sleep 1 # Wait for main process to handle all the output
exit $return_code
""")
    os.chmod(TESTBASH, stat.S_IEXEC|stat.S_IRUSR|stat.S_IWUSR)
def enqueue_output(queue):
    pipe = Popen([TESTBASH], stdout=PIPE, stderr=STDOUT,
                 bufsize=1, close_fds=ON_POSIX, shell=True)
    out = pipe.stdout
    while pipe.poll() is None:
        line = out.readline()
        if line:
            queue.put(line.decode('ascii'))
        time.sleep(.1)
    print('returncode = {}'.format(pipe.returncode))
create_bashtest()
C_CHANNEL = Queue()
THREAD = Thread(target=enqueue_output, args=(C_CHANNEL,))
THREAD.daemon = True
THREAD.start()
while THREAD.is_alive():
    time.sleep(0.1)
    try:
        line = C_CHANNEL.get_nowait()
    except Empty:
        pass # print("no output")
    else:
        print(line, end='')
Hope this helps :
First, it looks like buffers are not being flushed. Redirecting (and, to be safe, appending) stdout/stderr to a file (or files) rather than to the terminal may help. You can always use tee (or tee -a) if you really want both. Using context managers might help.
As for the zero return code, look at $!:
https://unix.stackexchange.com/questions/386196/doesnt-work-on-command-line
! may be invoking history expansion, so that $! results in an empty value.
If you somehow end up with just a bare wait, the return code will be zero. Regardless, return codes can be tricky, and you might be picking up a successful return code from elsewhere.
Take a look at stdbuf command to change the buffer sizes for stdout and stderr:
Is there a way to flush stdout of a running process
That may also help with getting the rest of your expected output.
Rewrite the while block this way:
# read line without blocking
while t.is_alive():
    try:
        line = q.get(block=False)
    except Empty:
        # print(line)
        pass
    else:
        # got line
        print(line, end='')
You don't want to block on getting a line from the Queue when there's none, and you don't need a timeout in this case, as it's only used when blocking the thread is required. Consequently, if the Queue.get() throws Empty, there's no line to print, and we just pass.
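One possible reason lines still go missing with either version of the loop: it stops as soon as the reader thread dies, even if lines are still sitting in the queue. Here is a sketch that keeps reading until the thread has finished and the queue is empty. The child command is a stand-in Python script (an assumption replacing testbash.sh), and run_and_collect is a hypothetical helper name:

```python
import subprocess
import sys
from queue import Queue, Empty
from threading import Thread

# Stand-in child (assumption): writes to stdout and stderr, exits with 1.
CHILD_CODE = (
    "import sys\n"
    "print('Start test', flush=True)\n"
    "print('an error', file=sys.stderr, flush=True)\n"
    "print('End test', flush=True)\n"
    "sys.exit(1)\n"
)


def enqueue_output(out, queue):
    """Reader thread: push every line from the pipe onto the queue."""
    for raw in iter(out.readline, b""):
        queue.put(raw.decode("utf-8", errors="replace"))
    out.close()


def run_and_collect(cmd):
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    q = Queue()
    t = Thread(target=enqueue_output, args=(proc.stdout, q), daemon=True)
    t.start()
    collected = []
    # Stop only when the reader is done AND nothing is left in the queue.
    while t.is_alive() or not q.empty():
        try:
            line = q.get(timeout=0.1)
        except Empty:
            continue
        collected.append(line)
    return collected, proc.wait()


output, returncode = run_and_collect([sys.executable, "-c", CHILD_CODE])
print("".join(output), end="")
print("returncode = {}".format(returncode))
```

Checking the queue as well as the thread means the last lines are not dropped when the child exits quickly.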
===
Also, let's clarify the script execution logic.
Since you're using Bash expressions, and the default shell used by Popen is /bin/sh, you'd probably want to rewrite the invocation line this way:
p = Popen(['/usr/bin/bash','-c', './testbash.sh'], stdout=PIPE, stderr=STDOUT, bufsize=1, close_fds=ON_POSIX)
It won't hurt to add a shebang to your shell script, too:
#!/usr/bin/env bash
<... rest of the script ...>
If you're looking for these lines:
[1]- Exit 1 python -c "raise ValueError('test')"
[2]+ Exit 1 python -c "raise ValueError('test')"
This is a function of the bash shell that's typically only available in interactive mode, i.e. when you're typing commands into a terminal. If you check the bash source code, you can see that it explicitly checks the mode before printing to stdout/stderr.
In the more recent versions of bash, you can't set this inside a script: see https://unix.stackexchange.com/a/364618 . However, you can set this yourself when starting the script:
p = Popen(['/bin/bash -i ./testbash.sh'], stdout=PIPE, stderr=STDOUT, bufsize=1, close_fds=ON_POSIX, shell=True)
I will note that this is only working for me on Python3; Python2 is only getting part of the output. It isn't clear which version of Python you're using, but considering Python2 is end of life now, we should probably all be switching to Python3.
As for the bash script, even with interactive mode set it seems you have to change how you wait to get that output:
#!/bin/bash
export MAX_SEED=2
echo "Start test"
pids=""
for seed in `seq 1 ${MAX_SEED}`
do
python -c "raise ValueError('test')" &
pids="${pids} $!"
done
echo "pids: ${pids}"
wait -n $pids
wait -n $pids
ret=$?
echo "End test"
exit $ret
Normal wait wasn't working for me (Ubuntu 18.04), but wait -n seemed to work - but as it only waits for the next job to complete, I had inconsistent output just calling it once. Calling wait -n for each job launched seems to do the trick, but the program flow should probably be refactored to loop over the wait the same number of times you spin up the job.
Also note that to change the return code of the script, Philippe's answer has the right approach: the $? variable holds the exit status of the most recently executed command (here, the last wait), which you can then pass to exit. (Yet another difference in Python versions: Python2 is returning 127 while Python3 returns 1 for me.) If you need the return values for each job, one way might be to parse out the values in the interactive job exit lines.
Just guessing - could it be that a line that starts with an empty character / space is not recognized as a line by your logic.
Maybe this indent is the issue. Another option is, that there is a tab or something like that and the ascii decode might fail.
This is how I usually use subprocess:
import subprocess
with subprocess.Popen(["./test.sh"], shell=True, stdout=subprocess.PIPE, stdin=subprocess.PIPE, stderr=subprocess.PIPE) as p:
    error = p.stderr.read().decode()
    std_out = p.stdout.read().decode()
if std_out:
    print(std_out)
if error:
    print("Error message: {}".format(error))
Here you read and decode both the stdout and the stderr. You get everything, but not in the original order; I don't know if that's an issue.
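If real-time output is not required, as the question allows, a simpler route may be subprocess.run (Python 3.5+), which waits for the child, collects all of its output, and exposes the exit status. Here is a sketch with a stand-in child command. For the original script the command would presumably be something like ['bash', './testbash.sh'], which is an assumption about how it is invoked:

```python
import subprocess
import sys

# Stand-in child (assumption): mixes stdout and stderr, then fails.
CHILD_CODE = (
    "import sys\n"
    "print('Start test', flush=True)\n"
    "print('ValueError: test', file=sys.stderr, flush=True)\n"
    "print('End test', flush=True)\n"
    "sys.exit(1)\n"
)

result = subprocess.run(
    [sys.executable, "-c", CHILD_CODE],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,   # merge stderr into the same stream
    universal_newlines=True,    # decode the output to str
)
print(result.stdout, end="")
if result.returncode != 0:
    print("child failed with exit status {}".format(result.returncode))
```

Merging stderr into stdout keeps the two streams in the order the child wrote them, at the cost of no longer being able to tell them apart.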

How to use standard Linux tools to fix a deadlocked script?

I have a script in Python3, and if I use subprocess.Popen.wait() I have a problem: my script runs some Linux command many times and it looks to me like my app is not responding. When I use subprocess.Popen.communicate(), my application correctly completes its work in a second.
What is the right way to solve this problem using Linux?
I think the solution must be somewhere in manipulating the buffer size, but I searched all over the Internet and could not find anything suitable. Maybe I don't know enough about the structure and operation of Linux as a whole.
My question can be reformulated as follows: What exactly happens when I use the .wait() method? What leads to its failure? What is the cause of such a long wait? When I abort the running task I see the following log:
Traceback (most recent call last):
File "./test.py", line 6, in <module>
proc.wait()
File "/usr/lib/python3.5/subprocess.py", line 1658, in wait
(pid, sts) = self._try_wait(0)
File "/usr/lib/python3.5/subprocess.py", line 1608, in _try_wait
(pid, sts) = os.waitpid(self.pid, wait_flags)
KeyboardInterrupt
My files look approximately like this:
script.py:
#!/usr/bin/python3
# -*-coding: utf-8 -*-
import subprocess
proc = subprocess.Popen(['./1.py', '1000000'], stdin=subprocess.PIPE, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
proc.wait()
out = proc.stdout.read()
# out = proc.communicate()[0]
print(len(out))
1.py:
#!/usr/bin/python3
# -*-coding: utf-8 -*-
import sys
x = sys.argv[-1]
# print(x, type(x))
for i in range(int(x)):
    print(i)
UPD: As we now understand, the problem is a full pipe buffer. So the latest version of the question is: how can I use Linux facilities to enlarge the buffer or redirect it to a file before running the script?
UPD2: I also tried running the script as: $ python3 -u ./script.py, but, unfortunately, unbuffering doesn't work as I would like and the script hangs.
The child script is sending output to its stdout or stderr pipes. The operating system buffers some data, then blocks the child when the pipe fills; since the parent is stuck in wait() and never reads the pipe, both block forever. Suppose I have a long-winded command like
longwinded.py:
for i in range(100000):
    print('a'*1000)
The following hangs because the stdout pipe fills
import sys
import subprocess as subp
p = subp.Popen([sys.executable, 'longwinded.py'], stdout=subp.PIPE,
stderr=subp.PIPE)
p.wait()
The next one doesn't hang because communicate reads the stdout and stderr pipes into memory
p = subp.Popen([sys.executable, 'longwinded.py'], stdout=subp.PIPE,
stderr=subp.PIPE)
p.communicate()
If you don't care what stdout and err are, you can redirect them to the null device
import os
p = subp.Popen([sys.executable, 'longwinded.py'],
    stdout=open(os.devnull, 'w'),
    stderr=open(os.devnull, 'w'))
p.wait()
or save them to a file
p = subp.Popen([sys.executable, 'longwinded.py'],
stdout=open('mystdout', 'w'),
stderr=open('mystderr', 'w'))
p.wait()
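The difference between wait() and communicate() can be reproduced with a child that writes more than a pipe can hold. Here is a sketch with a stand-in child that prints about 1 MB (pipe capacity is OS-dependent, commonly 64 KiB on Linux). subprocess.DEVNULL (Python 3.3+) is shown as an alternative to opening os.devnull by hand:

```python
import subprocess
import sys

# Stand-in child (assumption): 1000 lines of 1000 characters each.
CHILD_CODE = "for i in range(1000): print('a' * 1000)"
cmd = [sys.executable, "-c", CHILD_CODE]

# communicate() keeps draining the pipe while the child runs,
# so the child is never left blocked on a full pipe.
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
out, _ = proc.communicate()
print(len(out.splitlines()), proc.returncode)

# If the output is not needed, discard it; then a plain wait() is safe
# because nothing accumulates in a pipe.
proc2 = subprocess.Popen(cmd, stdout=subprocess.DEVNULL,
                         stderr=subprocess.DEVNULL)
status = proc2.wait()
print(status)
```

Note that communicate() holds the whole output in memory, so redirecting to a file is the better fit for very large outputs.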

Receive return data from subprocess in python

I'm spawning a process from a script using subprocess. My subprocess takes a JSON input and performs some operations and should return some real time data to the main process. How can I do this from subprocess?
I'm trying something like this. But it is throwing an error.
Following is my main process "main.py"
p = subprocess.Popen(['python','handler.py'],
                     stdin=subprocess.PIPE,stdout=subprocess.PIPE)
p.communicate(JSONEncoder().encode(data))
while True:
    out = process.stdout.read(1)
    if out == '' and process.poll() != None:
        break
    if out != '':
        sys.stdout.write(out)
        sys.stdout.flush()
Below is my subprocess "handler.py"
if __name__ == '__main__' :
    command = json.load(sys.stdin)
    os.environ["PYTHONPATH"] = "../../"
    if command["cmd"] == "archive" :
        print "command recieved:",command["cmd"]
        file_ids, count = archive(command["files"])
        sys.stdout.write(JSONEncoder().encode(file_ids))
But it throws an error.
Traceback (most recent call last):
File "./core/main.py", line 46, in <module>
out = p.stdout.read(1)
ValueError: I/O operation on closed file
Am I doing something wrong here??
Popen.communicate() does not return until the process is dead and it returns all the output. You can't read subprocess' stdout after it. Look at the top of the .communicate() docs:
Interact with process: Send data to stdin. Read data from stdout and
stderr, until end-of-file is reached. Wait for process to terminate.
(emphasis is mine)
If you want to send data and then read the output line by line as text while the child process is still running:
#!/usr/bin/env python3
import json
from subprocess import Popen, PIPE
with Popen(command, stdin=PIPE, stdout=PIPE, universal_newlines=True) as process:
    with process.stdin as pipe:
        pipe.write(json.dumps(data))
    for line in process.stdout:
        print(line, end='')
        handle_line(line)  # placeholder for your own per-line processing
If you need code for older python versions or you have buffering issues, see Python: read streaming input from subprocess.communicate().
If all you want is to pass data to the child process and to print the output to terminal:
#!/usr/bin/env python3.5
import json
import subprocess
subprocess.run(command, input=json.dumps(data).encode())
If your actual child process is a Python script then consider importing it as a module and running the corresponding functions instead, see Call python script with input with in a python script using subprocess.
communicate reads all the output from a subprocess and closes it. If you want to be able to read from the process after writing, you have to use something other than communicate, such as p.stdin.write. Alternatively, just use the output of communicate; it should have what you want https://docs.python.org/3/library/subprocess.html#popen-objects.
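To illustrate the last point, the return value of communicate() already contains everything the child wrote, so no further reads on p.stdout are needed. Here is a sketch in which a small inline script stands in for handler.py (an assumption so the example is self-contained); the reply format is made up for illustration:

```python
import json
import subprocess
import sys

# Stand-in for handler.py (assumption): reads one JSON document from stdin
# and writes a JSON reply to stdout.
HANDLER_CODE = (
    "import json, sys\n"
    "command = json.load(sys.stdin)\n"
    "json.dump({'received': command['cmd']}, sys.stdout)\n"
)

data = {"cmd": "archive", "files": ["a.txt", "b.txt"]}

proc = subprocess.Popen(
    [sys.executable, "-c", HANDLER_CODE],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    universal_newlines=True,   # exchange str instead of bytes
)
# communicate() sends the input, closes stdin, reads stdout to end-of-file
# and waits for the child; the pipes are closed once it returns.
out, _ = proc.communicate(json.dumps(data))
reply = json.loads(out)
print(reply)
```

This only fits the case where the reply is needed after the child finishes; for output consumed while the child is still running, iterate over the pipe instead.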

Catch the continuous output from a subprocess

i'm trying to catch the output of airodump-ng, that has a continuous output, and process every line searching for a string. but that doesn't work. so i try the same thing with "htop" command that has the same kind of output, and it still doesn't work.
i'm trying this with python 3.4 and python 2.7, both on arch linux and osx mavericks. here's the code (not every import is necessary but nevermind):
import subprocess
import sys
import os
import time
command = ["htop"]
proc = subprocess.Popen(command, stdout = subprocess.PIPE)
outs, errs = proc.communicate(timeout=3)
proc.kill()
and it gives me:
Traceback (most recent call last):
File "/Users/andrei/Dropbox/python/file_prova.py", line 8, in <module>
outs, errs = proc.communicate(timeout=3)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/subprocess.py", line 960, in communicate
stdout, stderr = self._communicate(input, endtime, timeout)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/subprocess.py", line 1618, in _communicate
self._check_timeout(endtime, orig_timeout)
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/subprocess.py", line 986, in _check_timeout
raise TimeoutExpired(self.args, orig_timeout)
subprocess.TimeoutExpired: Command '['htop']' timed out after 3 seconds
seems like it crashes at proc.communicate() and doesn't execute the lines under that. i also tried to handle the exception but no way to make it work...
[EDIT]
ok so it's 4 am, i learned try/except handling, and after a looong time i managed to make it work with htop, following the tips hardly found here (the 2nd solution doesn't seem to work):
this is how it looks
from subprocess import Popen, PIPE
from time import sleep
from fcntl import fcntl, F_GETFL, F_SETFL
from os import O_NONBLOCK, read
# run the shell as a subprocess:
p = Popen(['htop'], stdout = PIPE)
# set the O_NONBLOCK flag of p.stdout file descriptor:
flags = fcntl(p.stdout, F_GETFL) # get current p.stdout flags
fcntl(p.stdout, F_SETFL, flags | O_NONBLOCK)
# let the shell output the result:
# get the output
while True:
    sleep(1)
    try:
        print (read(p.stdout.fileno(), 1024).decode("utf-8")),
    except OSError:
        # the os throws an exception if there is no data
        print ('[No more data]')
        continue
it works flawlessly. with htop.
but not with airodump-ng. it prints on the terminal its output and every 1 second (the sleep() in the while loop) prints [No more data], like the stream is going elsewhere...
EDIT 2:
solved! the thing was just that airodump-ng dumps data to stderr, not stdout. pretty straight forward try ahah :D
From the documentation:
The timeout argument is passed to Popen.wait(). If the timeout
expires, the child process will be killed and then waited for again.
The TimeoutExpired exception will be re-raised after the child process
has terminated.
That seems to describe exactly the behavior you are seeing. You will need to learn about exception handling using try/except.
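For Popen.communicate(timeout=...) specifically, the subprocess documentation notes that the child is not killed when the timeout expires, so the usual pattern is to catch TimeoutExpired, kill the child, and call communicate() again to collect what it produced. Here is a sketch using a stand-in child that prints one line and then sleeps (an assumption standing in for htop or airodump-ng). stderr is captured too, since some tools write there:

```python
import subprocess
import sys

# Stand-in child (assumption): emits one line, then keeps running.
CHILD_CODE = "import time; print('tick', flush=True); time.sleep(30)"

proc = subprocess.Popen([sys.executable, "-c", CHILD_CODE],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
try:
    outs, errs = proc.communicate(timeout=2)
    timed_out = False
except subprocess.TimeoutExpired:
    timed_out = True
    proc.kill()
    # The second call returns whatever was written before the kill.
    outs, errs = proc.communicate()

print(outs.decode("utf-8"), end="")
```

This gives a one-off snapshot after a fixed interval; continuous processing still needs a loop that reads from the pipe.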

subprocess can't get the stdin input from other process

I use subprocess to exchange data between two processes.
I created a repeat.py file with the following content
(this file is an example from http://www.doughellmann.com/PyMOTW/subprocess/):
import sys
sys.stderr.write('repeater.py: starting\n')
sys.stderr.flush()
while True:
    next_line = sys.stdin.readline()
    if not next_line:
        break
    sys.stdout.write(next_line)
    sys.stdout.flush()
sys.stderr.write('repeater.py: exiting\n')
sys.stderr.flush()
and run this file in ipython
In [1]: import subprocess
In [2]: f=subprocess.Popen(['python','~/repeat.py'],shell=True,stdin=subprocess.PIPE,stdout=subprocess.PIPE)
In [3]: f.stdin.write('teststs\n')
In [4]: f.communicate()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'teststs' is not defined
Out[4]: ('', None)
why teststs is not defined?
You seem to be starting an interactive Python session instead of running repeat.py. Try removing shell=True, it doesn't make sense together with a list of parameters. (Using shell=True is almost always a bad idea, by the way.)
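On POSIX, when shell=True is combined with a list, only the first item is treated as the command string and the remaining items become arguments to the shell itself, so the original call effectively ran a bare python that read its program from stdin. Here is a sketch of the call without shell=True. The repeater is inlined as a stand-in for repeat.py (an assumption so the example is self-contained):

```python
import subprocess
import sys

# Stand-in for repeat.py (assumption): echo every stdin line to stdout.
REPEATER_CODE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line)\n"
    "    sys.stdout.flush()\n"
)

proc = subprocess.Popen(
    [sys.executable, "-c", REPEATER_CODE],   # list of args, no shell=True
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    universal_newlines=True,                 # str in, str out
)
# communicate() writes the input, closes stdin and reads the echoed output.
out, err = proc.communicate("teststs\n")
print(repr(out))
```

Without a shell, ~ is no longer expanded, so a path such as ~/repeat.py would need os.path.expanduser before being passed in the list.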
This works, with some strange behavior during the first 5 key presses; I don't know why. After that it works fine, and we have access to ls -l, cd, and previous commands when pressing UP; the command line seems to have full functionality.
#!/bin/python3
import subprocess
import sys
proc = subprocess.Popen(['bash'])
while True:
    buff = sys.stdin.readline()
    stdoutdata, stderrdata = proc.communicate(buff)
    if( stdoutdata ):
        print( stdoutdata )
    else:
        print('n')
        break
Here is my similar question.
