How to use standard Linux tools to fix a deadlocked script? - python

I have a Python 3 script that runs some Linux command many times in a loop. If I use subprocess.Popen.wait(), my app looks like it is not responding; when I use subprocess.Popen.communicate() instead, it correctly completes its work in a second.
What is the right way to solve this problem using Linux?
I think the solution must lie somewhere in manipulating the buffers, but I have searched the entire Internet and could not find anything suitable. Maybe I don't understand the structure and operation of Linux well enough.
My question can be reformulated as follows: what exactly happens when I use the .wait() method, and what leads to its failure? What is the cause of such a long wait? When I abort the running task I see the following traceback:
Traceback (most recent call last):
  File "./test.py", line 6, in <module>
    proc.wait()
  File "/usr/lib/python3.5/subprocess.py", line 1658, in wait
    (pid, sts) = self._try_wait(0)
  File "/usr/lib/python3.5/subprocess.py", line 1608, in _try_wait
    (pid, sts) = os.waitpid(self.pid, wait_flags)
KeyboardInterrupt
My files look approximately like this:
script.py:
#!/usr/bin/python3
# -*-coding: utf-8 -*-
import subprocess
proc = subprocess.Popen(['./1.py', '1000000'], stdin=subprocess.PIPE, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
proc.wait()
out = proc.stdout.read()
# out = proc.communicate()[0]
print(len(out))
1.py:
#!/usr/bin/python3
# -*-coding: utf-8 -*-
import sys
x = sys.argv[-1]
# print(x, type(x))
for i in range(int(x)):
    print(i)
UPD: As we now understand, the problem is that the pipe buffer fills up. So the final version of the question is: how can I use Linux facilities to enlarge the buffer, or redirect it to a file, before running the script?
UPD2: I also tried running the script as $ python3 -u ./script.py, but unfortunately unbuffered mode doesn't work the way I hoped and the script still hangs.

Your script is sending output to its stdout or stderr pipes. The operating system buffers some data, then blocks the process forever when the pipe fills. Suppose I have a long-winded command like
longwinded.py:
for i in range(100000):
    print('a'*1000)
The following hangs because the stdout pipe fills
import sys
import subprocess as subp
p = subp.Popen([sys.executable, 'longwinded.py'], stdout=subp.PIPE,
               stderr=subp.PIPE)
p.wait()
The next one doesn't hang because communicate reads the stdout and stderr pipes into memory
p = subp.Popen([sys.executable, 'longwinded.py'], stdout=subp.PIPE,
               stderr=subp.PIPE)
p.communicate()
If you don't care what stdout and stderr are, you can redirect them to the null device:
import os

p = subp.Popen([sys.executable, 'longwinded.py'],
               stdout=open(os.devnull, 'w'),
               stderr=open(os.devnull, 'w'))
p.wait()
or save them to a file
p = subp.Popen([sys.executable, 'longwinded.py'],
               stdout=open('mystdout', 'w'),
               stderr=open('mystderr', 'w'))
p.wait()
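On Python 3.3 and newer, subprocess.DEVNULL gives the same effect without opening os.devnull by hand; a minimal sketch, reusing the longwinded.py example from above:
import subprocess as subp
import sys

# discard both streams; the OS-level null device is wired up by subprocess itself
p = subp.Popen([sys.executable, 'longwinded.py'],
               stdout=subp.DEVNULL,
               stderr=subp.DEVNULL)
p.wait()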

Related

Reading stdout from a subprocess in real time

Given this code snippet:
from subprocess import Popen, PIPE, CalledProcessError

def execute(cmd):
    with Popen(cmd, shell=True, stdout=PIPE, bufsize=1, universal_newlines=True) as p:
        for line in p.stdout:
            print(line, end='')
    if p.returncode != 0:
        raise CalledProcessError(p.returncode, p.args)

base_cmd = [
    "cmd", "/c", "d:\\virtual_envs\\py362_32\\Scripts\\activate",
    "&&"
]
cmd1 = " ".join(base_cmd + ['python -c "import sys; print(sys.version)"'])
cmd2 = " ".join(base_cmd + ["python -m http.server"])
If I run execute(cmd1), the output is printed without any problems.
However, if I run execute(cmd2) instead, nothing is printed. Why is that, and how can I fix it so I can see http.server's output in real time?
Also, how is for line in p.stdout evaluated internally? Is it some sort of endless loop until it reaches stdout EOF, or something similar?
This topic has already been addressed a few times here on SO, but I haven't found a Windows solution. The above snippet is code from this answer, and I'm running http.server from a virtualenv (Python 3.6.2 32-bit on Windows 7).
If you want to read continuously from a running subprocess, you have to make that process' output unbuffered. Your subprocess being a Python program, this can be done by passing -u to the interpreter:
python -u -m http.server
This is how it looks on a Windows box.
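As a minimal sketch of the same idea driven from Python (the merge of stderr into stdout is my own addition, so that a single pipe carries everything):
import sys
from subprocess import Popen, PIPE, STDOUT

# launch the server unbuffered (-u) under the current interpreter
p = Popen([sys.executable, '-u', '-m', 'http.server'],
          stdout=PIPE, stderr=STDOUT, universal_newlines=True)
for line in p.stdout:
    print(line, end='')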
With this code, you can't see the real-time output because of buffering:
for line in p.stdout:
    print(line, end='')
But if you use p.stdout.readline() it should work:
while True:
    line = p.stdout.readline()
    if not line:
        break
    print(line, end='')
See the corresponding Python bug discussion for details.
UPD: here you can find almost the same problem, with various solutions, on Stack Overflow.
I think the main problem is that http.server is somehow logging its output to stderr; here I have an example with asyncio, reading the data from either stdout or stderr.
My first attempt was to use asyncio, a nice API that has existed since Python 3.4. Later I found a simpler solution, so you can choose; both of them should work.
asyncio as a solution
Under the hood, asyncio uses IOCP, a Windows API for asynchronous I/O.
# inspired by https://pymotw.com/3/asyncio/subprocesses.html
import asyncio
import sys

if sys.platform == 'win32':
    loop = asyncio.ProactorEventLoop()
    asyncio.set_event_loop(loop)

async def run_webserver():
    # start the webserver without buffering (-u), capturing stdout and stderr
    print('launching process')
    proc = await asyncio.create_subprocess_exec(
        sys.executable, '-u', '-mhttp.server',
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE
    )
    print('process started {}'.format(proc.pid))
    while 1:
        # wait for either stderr or stdout and loop over the results
        for line in asyncio.as_completed([proc.stderr.readline(), proc.stdout.readline()]):
            print('read {!r}'.format(await line))

event_loop = asyncio.get_event_loop()
try:
    event_loop.run_until_complete(run_webserver())
finally:
    event_loop.close()
Redirecting stderr to stdout
Based on your example, this is a really simple solution: it just redirects stderr to stdout, and only stdout is read.
from subprocess import Popen, PIPE, CalledProcessError, run, STDOUT
import os

def execute(cmd):
    with Popen(cmd, stdout=PIPE, stderr=STDOUT, bufsize=1) as p:
        while 1:
            print('waiting for a line')
            print(p.stdout.readline())

cmd2 = ["python", "-u", "-m", "http.server"]
execute(cmd2)
How is for line in p.stdout evaluated internally? Is it some sort of endless loop until it reaches stdout EOF, or something similar?
p.stdout is a (blocking) buffer. When you read from an empty buffer, you are blocked until something is written to it. Once something is in it, you get the data and execute the inner part.
Think of how tail -f works on Linux: it waits until something is written to the file, and when that happens it echoes the new data to the screen. What happens when there is no data? It waits. So when your program gets to this line, it waits for data and then processes it.
Since your code works, but not when run as a module, it has to be related to this somehow. The http.server module probably buffers its output. Try adding the -u parameter to Python to run the process unbuffered:
-u : unbuffered binary stdout and stderr; also PYTHONUNBUFFERED=x
see man page for details on internal buffering relating to '-u'
Also, you might want to try changing your loop to for line in iter(lambda: p.stdout.read(1), ''):, as this reads 1 byte at a time before processing.
Update: The full loop code is
for line in iter(lambda: p.stdout.read(1), ''):
    sys.stdout.write(line)
    sys.stdout.flush()
Also, you pass your command as a string. Try passing it as a list, with each element in its own slot:
cmd = ['python', '-m', 'http.server', ..]
You could implement the no-buffer behavior at the OS level.
In Linux, you could wrap your existing command line with stdbuf:
stdbuf -i0 -o0 -e0 YOURCOMMAND
Or in Windows, you could wrap your existing command line with winpty:
winpty.exe -Xallow-non-tty -Xplain YOURCOMMAND
I'm not aware of OS-neutral tools for this.
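For instance, a hedged sketch of driving the stdbuf wrapper from Python (the ping command is just a stand-in for YOURCOMMAND):
import subprocess

cmd = ['ping', '-c', '5', 'localhost']  # placeholder for YOURCOMMAND
# stdbuf -i0 -o0 -e0 disables stdio buffering in the child
p = subprocess.Popen(['stdbuf', '-i0', '-o0', '-e0'] + cmd,
                     stdout=subprocess.PIPE, universal_newlines=True)
for line in p.stdout:
    print(line, end='')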

Receive return data from subprocess in python

I'm spawning a process from a script using subprocess. My subprocess takes JSON input, performs some operations, and should return some real-time data to the main process. How can I do this from the subprocess?
I'm trying something like this, but it is throwing an error.
Following is my main process, "main.py":
p = subprocess.Popen(['python', 'handler.py'],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE)
p.communicate(JSONEncoder().encode(data))
while True:
    out = p.stdout.read(1)
    if out == '' and p.poll() != None:
        break
    if out != '':
        sys.stdout.write(out)
        sys.stdout.flush()
Below is my subprocess, "handler.py":
if __name__ == '__main__':
    command = json.load(sys.stdin)
    os.environ["PYTHONPATH"] = "../../"
    if command["cmd"] == "archive":
        print "command received:", command["cmd"]
        file_ids, count = archive(command["files"])
        sys.stdout.write(JSONEncoder().encode(file_ids))
But it throws an error.
Traceback (most recent call last):
  File "./core/main.py", line 46, in <module>
    out = p.stdout.read(1)
ValueError: I/O operation on closed file
Am I doing something wrong here??
Popen.communicate() does not return until the process is dead, and it returns all the output. You can't read the subprocess' stdout after it. Look at the top of the .communicate() docs:
Interact with process: Send data to stdin. Read data from stdout and stderr, until end-of-file is reached. Wait for process to terminate. (Emphasis is mine.)
If you want to send data and then read the output line by line as text while the child process is still running:
#!/usr/bin/env python3
import json
from subprocess import Popen, PIPE
with Popen(command, stdin=PIPE, stdout=PIPE, universal_newlines=True) as process:
    with process.stdin as pipe:
        pipe.write(json.dumps(data))
    for line in process.stdout:
        print(line, end='')
        # ... handle each line here as it arrives
If you need code for older python versions or you have buffering issues, see Python: read streaming input from subprocess.communicate().
If all you want is to pass data to the child process and to print the output to terminal:
#!/usr/bin/env python3.5
import json
import subprocess
subprocess.run(command, input=json.dumps(data).encode())
If your actual child process is a Python script then consider importing it as a module and running the corresponding functions instead, see Call python script with input with in a python script using subprocess.
communicate reads all the output from a subprocess and then closes it. If you want to be able to read from the process after writing, you have to use something other than communicate, such as p.stdin.write. Alternatively, just use the output of communicate; it should have what you want (see https://docs.python.org/3/library/subprocess.html#popen-objects).
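A minimal sketch of that alternative (Python 2, matching the question; the data dict is a placeholder shaped like the question's payload):
import subprocess
from json import JSONEncoder

data = {"cmd": "archive", "files": []}  # placeholder payload

p = subprocess.Popen(['python', 'handler.py'],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE)
# communicate sends the input, waits for exit, and returns everything at once
out, _ = p.communicate(JSONEncoder().encode(data))
print(out)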

Catch the continuous output from a subprocess

I'm trying to catch the output of airodump-ng, which produces continuous output, and process every line, searching for a string. But that doesn't work, so I tried the same thing with the "htop" command, which has the same kind of output, and it still doesn't work.
I'm trying this with Python 3.4 and Python 2.7, both on Arch Linux and OS X Mavericks. Here's the code (not every import is necessary, but never mind):
import subprocess
import sys
import os
import time
command = ["htop"]
proc = subprocess.Popen(command, stdout = subprocess.PIPE)
outs, errs = proc.communicate(timeout=3)
proc.kill()
and it gives me:
Traceback (most recent call last):
  File "/Users/andrei/Dropbox/python/file_prova.py", line 8, in <module>
    outs, errs = proc.communicate(timeout=3)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/subprocess.py", line 960, in communicate
    stdout, stderr = self._communicate(input, endtime, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/subprocess.py", line 1618, in _communicate
    self._check_timeout(endtime, orig_timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/subprocess.py", line 986, in _check_timeout
    raise TimeoutExpired(self.args, orig_timeout)
subprocess.TimeoutExpired: Command '['htop']' timed out after 3 seconds
It seems to crash at proc.communicate() and never executes the lines below it. I also tried to handle the exception, but there was no way to make it work...
[EDIT]
OK, so it's 4 AM; I learned try/except exception handling, and after a looong time I managed to make it work with htop, following tips that were hard to find here (the 2nd solution doesn't seem to work):
This is how it looks:
from subprocess import Popen, PIPE
from time import sleep
from fcntl import fcntl, F_GETFL, F_SETFL
from os import O_NONBLOCK, read

# run the shell as a subprocess:
p = Popen(['htop'], stdout=PIPE)
# set the O_NONBLOCK flag of p.stdout file descriptor:
flags = fcntl(p.stdout, F_GETFL)  # get current p.stdout flags
fcntl(p.stdout, F_SETFL, flags | O_NONBLOCK)
# let the shell output the result:
# get the output
while True:
    sleep(1)
    try:
        print(read(p.stdout.fileno(), 1024).decode("utf-8"))
    except OSError:
        # the os throws an exception if there is no data
        print('[No more data]')
        continue
It works flawlessly with htop. But not with airodump-ng: that one prints its output to the terminal, and every second (the sleep() in the while loop) my script prints [No more data], as if the stream is going elsewhere...
EDIT 2:
Solved! The thing was just that airodump-ng dumps its data to stderr, not stdout. Pretty straightforward in hindsight ahah :D
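For the record, a sketch of that fix: merge stderr into stdout before reading (the wlan0mon interface name is a placeholder):
from subprocess import Popen, PIPE, STDOUT

# airodump-ng writes its dump to stderr, so redirect it into stdout
proc = Popen(['airodump-ng', 'wlan0mon'], stdout=PIPE, stderr=STDOUT)
for line in proc.stdout:
    print(line)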
From the documentation:
The timeout argument is passed to Popen.wait(). If the timeout
expires, the child process will be killed and then waited for again.
The TimeoutExpired exception will be re-raised after the child process
has terminated.
That seems to describe exactly the behavior you are seeing. You will need to learn about exception handling using try/except.
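A minimal sketch of that pattern, following the subprocess docs:
import subprocess

proc = subprocess.Popen(['htop'], stdout=subprocess.PIPE)
try:
    outs, errs = proc.communicate(timeout=3)
except subprocess.TimeoutExpired:
    proc.kill()
    outs, errs = proc.communicate()  # collect whatever output was produced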

stdin for Interactive command from python

I'm trying to integrate an interactive Lua shell into my Python GUI with an approach similar to the one described here: Running an interactive command from within python. The target platform for now is Windows. I want to be able to feed the Lua interpreter line by line.
import subprocess
import os
from queue import Queue
from queue import Empty
from threading import Thread
import time
def enqueue_output(out, queue):
    for line in iter(out.readline, b''):
        queue.put(line)
    out.close()
lua = '''\
-- comment
print("A")
test = 0
test2 = 1
os.exit()'''
command = os.path.join('lua', 'bin', 'lua.exe')
process = (subprocess.Popen(command + ' -i', shell=True,
                            stdin=subprocess.PIPE, stderr=subprocess.PIPE,
                            stdout=subprocess.PIPE, cwd=os.getcwd(), bufsize=1,
                            universal_newlines=True))
outQueue = Queue()
errQueue = Queue()
outThread = Thread(target=enqueue_output, args=(process.stdout, outQueue))
errThread = Thread(target=enqueue_output, args=(process.stderr, errQueue))
outThread.daemon = True
errThread.daemon = True
outThread.start()
errThread.start()
script = lua.split('\n')
time.sleep(.2)
for line in script:
    while True:
        try:
            rep = outQueue.get(timeout=.2)
        except Empty:
            break
        else:  # got line
            print(rep)
    process.stdin.write(line)
The only output I receive is the very first line of the lua.exe shell. It seems that the writes to stdin don't actually take place. Is there anything I'm missing?
Running an external Lua file with the -i switch actually works and yields the expected output, which makes me think the issue is connected to stdin.
I experimented a bit in Python interactive mode, using the Python shell and trying something similar to the solution featuring a file for stdout here: Interactive input/output using python. However, the output was only written to the file once I stopped the Python shell, which also suggests the stdin gets stalled somewhere and is only actually transmitted once I quit the shell. Any ideas what goes wrong here?

How do I get 'real-time' information back from a subprocess.Popen in python (2.5)

I'd like to use the subprocess module in the following way:
create a new process that potentially takes a long time to execute,
capture stdout (or stderr, or potentially both, either together or separately),
process data from the subprocess as it comes in, perhaps firing events on every line received (in wxPython, say) or simply printing them out for now.
I've created processes with Popen, but if I use communicate() the data comes at me all at once, once the process has terminated.
If I create a separate thread that does a blocking readline() of myprocess.stdout (using stdout = subprocess.PIPE) I don't get any lines with this method either, until the process terminates. (no matter what I set as bufsize)
Is there a way to deal with this that isn't horrendous, and works well on multiple platforms?
Update with code that appears not to work (on Windows, anyway):
class ThreadWorker(threading.Thread):
    def __init__(self, callable, *args, **kwargs):
        super(ThreadWorker, self).__init__()
        self.callable = callable
        self.args = args
        self.kwargs = kwargs
        self.setDaemon(True)

    def run(self):
        try:
            self.callable(*self.args, **self.kwargs)
        except wx.PyDeadObjectError:
            pass
        except Exception, e:
            print e

if __name__ == "__main__":
    import os
    from subprocess import Popen, PIPE

    def worker(pipe):
        while True:
            line = pipe.readline()
            if line == '':
                break
            else:
                print line

    proc = Popen("python subprocess_test.py", shell=True, stdin=PIPE, stdout=PIPE, stderr=PIPE)
    stdout_worker = ThreadWorker(worker, proc.stdout)
    stderr_worker = ThreadWorker(worker, proc.stderr)
    stdout_worker.start()
    stderr_worker.start()
    while True: pass
stdout will be buffered - so you won't get anything till that buffer is filled, or the subprocess exits.
You can try flushing stdout from the sub-process, or using stderr, or changing stdout to non-buffered mode.
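For example, a sketch of the child-side fix (Python 2, to match the question): flush stdout after each write so the parent sees lines promptly:
import sys
import time

for i in range(10):
    print 'tick %d' % i
    sys.stdout.flush()  # push the line through the pipe immediately
    time.sleep(1)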
It sounds like the issue might be the use of buffered output by the subprocess: if a relatively small amount of output is created, it could be buffered until the subprocess exits.
Here's what worked for me:
import subprocess

cmd = ["./tester_script.bash"]
p = subprocess.Popen(cmd, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
while p.poll() is None:
    out = p.stdout.readline()
    do_something_with(out)
In your case you could try to pass a reference to the sub-process to your Worker Thread, and do the polling inside the thread. I don't know how it will behave when two threads poll (and interact with) the same subprocess, but it may work.
Also note that the while p.poll() is None: is intended as is. Do not replace it with while not p.poll(), as in Python 0 (the returncode for successful termination) is also considered False.
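A hedged sketch of that suggestion (Python 2; the class and names are mine): the worker thread polls the process it reads from, stopping once it exits and the pipe is drained:
import threading

class PollingWorker(threading.Thread):
    def __init__(self, proc, pipe):
        threading.Thread.__init__(self)
        self.proc = proc
        self.pipe = pipe
        self.setDaemon(True)

    def run(self):
        while True:
            line = self.pipe.readline()
            if line:
                print line,
            elif self.proc.poll() is not None:
                break  # process has exited and the pipe is empty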
I've been running into this problem as well. The problem occurs because you are trying to read stderr as well. If there are no errors, then trying to read from stderr would block.
On Windows, there is no easy way to poll() file descriptors (only Winsock sockets).
So a solution is not to try and read from stderr.
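One way to follow that advice is to merge stderr into stdout when spawning, so only one pipe ever needs reading (a sketch, adapting the question's Popen call):
from subprocess import Popen, PIPE, STDOUT

# stderr=STDOUT folds error output into the stdout pipe
proc = Popen("python subprocess_test.py", shell=True,
             stdin=PIPE, stdout=PIPE, stderr=STDOUT)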
Using pexpect [http://www.noah.org/wiki/Pexpect] with non-blocking readlines will resolve this problem. It stems from the fact that pipes are buffered, so your app's output is getting buffered by the pipe; therefore you can't get to that output until the buffer fills or the process dies.
This seems to be a well-known Python limitation, see
PEP 3145 and maybe others.
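A hedged pexpect sketch (Python 2): spawn allocates a pseudo-terminal, so the child believes it is talking to a terminal and line-buffers its output:
import pexpect

child = pexpect.spawn('python subprocess_test.py')
while True:
    line = child.readline()
    if not line:  # EOF
        break
    print line,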
Read one character at a time: http://blog.thelinuxkid.com/2013/06/get-python-subprocess-output-without.html
import contextlib
import subprocess

# Unix, Windows and old Macintosh end-of-line
newlines = ['\n', '\r\n', '\r']

def unbuffered(proc, stream='stdout'):
    stream = getattr(proc, stream)
    with contextlib.closing(stream):
        while True:
            out = []
            last = stream.read(1)
            # Don't loop forever
            if last == '' and proc.poll() is not None:
                break
            while last not in newlines:
                # Don't loop forever
                if last == '' and proc.poll() is not None:
                    break
                out.append(last)
                last = stream.read(1)
            out = ''.join(out)
            yield out

def example():
    cmd = ['ls', '-l', '/']
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        # Make all end-of-lines '\n'
        universal_newlines=True,
    )
    for line in unbuffered(proc):
        print line

example()
Using subprocess.Popen, I can run the .exe of one of my C# projects and redirect the output to my Python file. I am now able to print() all the information being output to the C# console (using Console.WriteLine()) in the Python console.
Python code:
from subprocess import Popen, PIPE, STDOUT
p = Popen('ConsoleDataImporter.exe', stdout = PIPE, stderr = STDOUT, shell = True)
while True:
    line = p.stdout.readline()
    print(line)
    if not line:
        break
This gets the console output of my .NET project line by line as it is created and breaks out of the enclosing while loop upon the project's termination. I'd imagine this would work for two python files as well.
I've used the pexpect module for this; it seems to work OK. http://sourceforge.net/projects/pexpect/
