python subprocess32 with timeout, OverflowError

this is my log
File "/opt/ibm/db2-governor/helpers/utils.py", line 10, in run_cmd
output = proc.communicate(timeout = timeout)[0]
File "/opt/ibm/dynamite/python/lib/python2.7/site-packages/subprocess32.py", line 927, in communicate
stdout, stderr = self._communicate(input, endtime, timeout)
File "/opt/ibm/dynamite/python/lib/python2.7/site-packages/subprocess32.py", line 1713, in _communicate
orig_timeout)
File "/opt/ibm/dynamite/python/lib/python2.7/site-packages/subprocess32.py", line 1786, in _communicate_with_poll
ready = poller.poll(self._remaining_time(endtime))
OverflowError: Python int too large to convert to C long
so the code that triggers this is
output = proc.communicate(timeout = timeout)[0]
timeout is set to 20. This happens intermittently (almost never, but it does happen). I'm using Python 2.7.11 with the subprocess32 library. Is this a Python bug?
OK, I checked subprocess32.py; the lines in question go like this:
endtime = time.time() + timeout
ready = poller.poll(self._remaining_time(endtime))
So basically the timestamp is too large to convert into a C long. Is there anything I can do to resolve this?

Sounds like a bug all right.
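The root cause is easy to demonstrate in isolation: poll()'s millisecond timeout is handed down to the C level, and a value too large for the native integer type raises exactly this error. A minimal sketch (select.poll is Unix-only, and the exact overflow threshold depends on platform and Python version):

```python
import select

poller = select.poll()  # no file descriptors registered
try:
    # timeout is in milliseconds; an absurdly large value cannot be
    # represented as a native C integer, just like in the traceback above
    poller.poll(10 ** 20)
except OverflowError as e:
    print("OverflowError:", e)
```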
If you're interested, here's a workaround proposal: instead of communicate, read from the process's stdout in a thread, and consider the process finished when either there is nothing more to read or poll() yields a return code.
Since you control the loop, you can sleep 1 second per iteration in the main thread and count the timeout down (not perfectly accurate, since sleep can drift, but good enough and simple). Kill the process when the countdown reaches 0.
import threading
import time

output = ""

def subp(p):
    global output
    while True:
        # read blocks, but since we're in a thread it doesn't matter;
        # read in chunks so we get data as it arrives, not only at EOF
        data = p.stdout.read(1024)
        if not data or p.poll() is not None:
            break
        output += data

# here create the process
proc = subprocess...
# create a reader thread, pass it the process handle, and start it
t = threading.Thread(target=subp, args=(proc,))
t.start()
while True:
    if proc.poll() is not None:
        # exited: OK
        break
    timeout -= 1
    if timeout < 0:
        # took too long: kill
        proc.terminate()
        break
    time.sleep(1)
t.join()
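For completeness, when the OverflowError doesn't bite, the documented timeout pattern looks like this (a sketch: subprocess32 on Python 2 backports the same TimeoutExpired API that the Python 3 stdlib import shown here provides):

```python
from subprocess import Popen, PIPE, TimeoutExpired  # subprocess32 on Python 2

p = Popen(["sleep", "10"], stdout=PIPE)
try:
    out = p.communicate(timeout=1)[0]  # raises if the child outlives the timeout
except TimeoutExpired:
    p.kill()                  # kill the child...
    out = p.communicate()[0]  # ...then call communicate() again to reap it
```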

Related

Run Python script within Python by using `subprocess.Popen` in real time

I want to run a Python script (or any executable, for that matter) from a Python script and get its output in real time. I have followed many tutorials, and my current code looks like this:
import subprocess

with open("test2", "w") as f:
    f.write("""import time
print('start')
time.sleep(5)
print('done')""")

process = subprocess.Popen(['python3', "test2"], stdout=subprocess.PIPE)
while True:
    output = process.stdout.readline()
    if output == '' and process.poll() is not None:
        break
    if output:
        print(output.strip())
rc = process.poll()
The first bit just creates the file that will be run, for clarity's sake.
I have two problems with this code:
1. It does not give the output in real time; it waits until the process has finished.
2. It does not terminate the loop once the process has finished.
Any help would be very welcome.
EDIT: Thanks to @JohnAnderson for the fix to the first problem: replacing if output == '' and process.poll() is not None: with if output == b'' and process.poll() is not None:
Last night I set out to do this using a pipe:
import os
import subprocess

with open("test2", "w") as f:
    f.write("""import time
print('start')
time.sleep(2)
print('done')""")

(readend, writeend) = os.pipe()
p = subprocess.Popen(['python3', '-u', 'test2'], stdout=writeend, bufsize=0)
still_open = True
output = ""
output_buf = os.read(readend, 1).decode()
while output_buf:
    print(output_buf, end="")
    output += output_buf
    if still_open and p.poll() is not None:
        os.close(writeend)
        still_open = False
    output_buf = os.read(readend, 1).decode()
This forces buffering out of the picture and reads one character at a time (to make sure we do not block writes from a process that has filled a buffer), closing the writing end when the process finishes so that the read catches EOF correctly. Having looked at subprocess more closely, though, that turned out to be overkill. With PIPE you get most of that for free, and I ended up with the following, which seems to work fine (call readline as many times as necessary to keep emptying the pipe). Assuming the process finishes, you do not have to worry about polling it or making sure the write end of the pipe is closed to correctly detect EOF and get out of the loop:
p = subprocess.Popen(['python3', '-u', 'test2'],
                     stdout=subprocess.PIPE, bufsize=1,
                     universal_newlines=True)
output = ""
output_buf = p.stdout.readline()
while output_buf:
    print(output_buf, end="")
    output += output_buf
    output_buf = p.stdout.readline()
This is a bit less "real-time" as it is basically line buffered.
Note: I've added -u to your Python call, as you also need to make sure the called process's buffering does not get in the way.
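The same loop can be written more compactly by iterating over the pipe directly, since in text mode a pipe yields lines as they arrive and stops at EOF. A sketch, using an inline child instead of the test2 file:

```python
import subprocess
import sys

# inline stand-in for the test2 script; -u keeps the child unbuffered
p = subprocess.Popen([sys.executable, "-u", "-c",
                      "print('start'); print('done')"],
                     stdout=subprocess.PIPE, bufsize=1,
                     universal_newlines=True)
output = ""
for line in p.stdout:   # blocks per line, not until process exit
    print(line, end="")
    output += line
p.wait()                # reap the child once EOF ends the loop
```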

A Python Wrapper or Handler for A Minecraft Server

I am using Windows and am looking for a handler or wrapper, using Python, for a Minecraft server, so that I can automatically enter commands without user input. I have searched through many questions on the site and only found half-answers (for my case, at least). I believe I will need the subprocess module, but cannot decide which approach to use; at the moment I am experimenting with the Popen functions. I found an answer which I modified for my case:
server = Popen("java -jar minecraft_server.jar nogui", stdin=PIPE, stdout=PIPE, stderr=STDOUT)
while True:
    print(server.stdout.readline())
    server.stdout.flush()
    command = input("> ")
    if command:
        server.stdin.write(bytes(command + "\r\n", "ascii"))
        server.stdin.flush()
This does work in a way, but it only prints one line each time you enter a command, which cannot work, and all my efforts to change that leave the program unable to execute anything else, stuck reading instead. This is not a duplicate question, because none of the answers to similar questions helped in my case.
As you already know, your server.stdout.readline() and input("> ") are blocking your code execution.
You need to make your code non-blocking: instead of waiting for the call to actually return what you want, check whether there is anything to read, skip it if there isn't, and continue doing other things.
On Linux systems you might be able to use the select module, but on Windows it works only on sockets.
I was able to make it work on Windows by using threads and queues. (note: it's Python 2 code)
import subprocess, sys
from Queue import Queue, Empty
from threading import Thread

def process_line(line):
    if line == "stop\n":  # lines have trailing newline characters
        print "SERVER SHUTDOWN PREVENTED"
        return None
    elif line == "quit\n":
        return "stop\n"
    elif line == "l\n":
        return "list\n"
    return line

s = subprocess.Popen("java -jar minecraft_server.jar nogui", stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

def read_lines(stream, queue):
    while True:
        queue.put(stream.readline())

# terminal reading thread
q = Queue()
t = Thread(target=read_lines, args=(sys.stdin, q))
t.daemon = True
t.start()

# server reading thread
qs = Queue()
ts = Thread(target=read_lines, args=(s.stdout, qs))
ts.daemon = True
ts.start()

while s.poll() is None:  # loop while the server process is running
    # get a user-entered line and send it to the server
    try:
        line = q.get_nowait()
    except Empty:
        pass
    else:
        line = process_line(line)  # do something with the user-entered line
        if line is not None:
            s.stdin.write(line)
            s.stdin.flush()
    # just pass data through from the server to the terminal output
    try:
        line = qs.get_nowait()
    except Empty:
        pass
    else:
        sys.stdout.write(line)
        sys.stdout.flush()
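The same queue-based pattern, reduced to a minimal self-contained sketch (Python 3 names here, and the one-liner child is a hypothetical stand-in for the server process):

```python
import subprocess
import sys
from queue import Queue, Empty
from threading import Thread

# stand-in child that prints two lines and exits
p = subprocess.Popen([sys.executable, "-u", "-c",
                      "print('hello'); print('world')"],
                     stdout=subprocess.PIPE, universal_newlines=True)

def read_lines(stream, queue):
    for line in iter(stream.readline, ''):  # '' means EOF
        queue.put(line)

q = Queue()
t = Thread(target=read_lines, args=(p.stdout, q))
t.daemon = True
t.start()

lines = []
while True:
    try:
        lines.append(q.get(timeout=0.1))   # poll the queue, never the pipe
    except Empty:
        if p.poll() is not None and not t.is_alive():
            break                          # child gone and reader finished
while not q.empty():                       # drain anything enqueued late
    lines.append(q.get_nowait())
# lines now holds the child's output in order
```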

writing large amount of data to stdin

I am writing a large amount of data to stdin.
How do i ensure that it is not blocking?
p = subprocess.Popen([path], stdout=subprocess.PIPE, stdin=subprocess.PIPE)
p.stdin.write('A very very very large amount of data')
p.stdin.flush()
output = p.stdout.readline()
It seems to hang at p.stdin.write() after I read a large string and write it.
I have a large corpus of files which will be written to stdin sequentially (>1k files).
So what happens is that I am running a loop:
# this loop is repeated for all the files
for stri in lines:
    p = subprocess.Popen([path], stdout=subprocess.PIPE, stdin=subprocess.PIPE)
    p.stdin.write(stri)
    output = p.stdout.readline()
    # do some processing
It somehow hangs at file no. 400. The file is a large one with long strings.
I do suspect it's a blocking issue.
This only happens if I iterate from 0 to 1000. However, if I start from file 400, the error does not happen.
To avoid the deadlock in a portable way, write to the child in a separate thread:
#!/usr/bin/env python
from subprocess import Popen, PIPE
from threading import Thread

def pump_input(pipe, lines):
    with pipe:
        for line in lines:
            pipe.write(line)

p = Popen(path, stdin=PIPE, stdout=PIPE, bufsize=1)
Thread(target=pump_input, args=[p.stdin, lines]).start()
with p.stdout:
    for line in iter(p.stdout.readline, b''):  # read output
        print line,
p.wait()
See Python: read streaming input from subprocess.communicate()
You may have to use Popen.communicate().
If you write a large amount of data to stdin and the child process meanwhile generates output on stdout, the child's stdout buffer may fill up before all of your stdin data has been processed. The child then blocks on a write to stdout (because you are not reading it), while you are blocked writing to stdin.
Popen.communicate() can be used to write stdin and read stdout/stderr at the same time to avoid the previous problem.
Note: Popen.communicate() is suitable only when the input and output data can fit to your memory (they are not too large).
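A minimal sketch of the communicate() approach; the child here is a hypothetical cat-like stand-in that echoes stdin back:

```python
import subprocess
import sys

# child that copies stdin to stdout, like the filters discussed above
p = subprocess.Popen([sys.executable, "-c",
                      "import sys; sys.stdout.write(sys.stdin.read())"],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE)
data = b"A" * (1 << 20)        # 1 MiB: far beyond a pipe buffer, fits in RAM
out, _ = p.communicate(data)   # writes stdin and drains stdout concurrently
assert out == data             # no deadlock, unlike paired write()/readline()
```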
Update:
If you decide to hack around with threads here is an example parent and child process implementation that you can tailor to suit your needs:
parent.py:
#!/usr/bin/env python2
import os
import sys
import subprocess
import threading
import Queue

class MyStreamingSubprocess(object):
    def __init__(self, *argv):
        self.process = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
        self.stdin_queue = Queue.Queue()
        self.stdout_queue = Queue.Queue()
        self.stdin_thread = threading.Thread(target=self._stdin_writer_thread)
        self.stdout_thread = threading.Thread(target=self._stdout_reader_thread)
        self.stdin_thread.start()
        self.stdout_thread.start()

    def process_item(self, item):
        self.stdin_queue.put(item)
        return self.stdout_queue.get()

    def terminate(self):
        self.stdin_queue.put(None)
        self.process.terminate()
        self.stdin_thread.join()
        self.stdout_thread.join()
        return self.process.wait()

    def _stdin_writer_thread(self):
        while 1:
            item = self.stdin_queue.get()
            if item is None:
                # signal the child process that the end of the input has
                # been reached: some console programs handle the case where
                # reading from stdin returns an empty string
                self.process.stdin.close()
                break
            try:
                self.process.stdin.write(item)
            except IOError:
                # make sure the current self.process_item() call
                # doesn't deadlock
                self.stdout_queue.put(None)
                break

    def _stdout_reader_thread(self):
        while 1:
            try:
                output = self.process.stdout.readline()
            except IOError:
                output = None
            self.stdout_queue.put(output)
            # output is an empty string if the process has finished,
            # or None if an IOError occurred
            if not output:
                break

if __name__ == '__main__':
    child_script_path = os.path.join(os.path.dirname(__file__), 'child.py')
    process = MyStreamingSubprocess(sys.executable, '-u', child_script_path)
    try:
        while 1:
            item = raw_input('Enter an item to process (leave empty and press ENTER to exit): ')
            if not item:
                break
            result = process.process_item(item + '\n')
            if result:
                print('Result: ' + result)
            else:
                print('Error processing item! Exiting.')
                break
    finally:
        print('Terminating child process...')
        process.terminate()
        print('Finished.')
child.py:
#!/usr/bin/env python2
import sys

while 1:
    item = sys.stdin.readline()
    sys.stdout.write('Processed: ' + item)
Note: IOError is handled on the reader/writer threads to cover the cases where the child process exits, crashes, or is killed.

Python: asynchronously print stdout from multiple subprocesses

I'm testing out a way to print out stdout from several subprocesses in Python 2.7. What I have setup is a main process that spawns, at the moment, three subprocesses and spits out their output. Each subprocess is a for-loop that goes to sleep for some random amount of time, and when it wakes up, says "Slept for X seconds".
The problem I'm seeing is that the printing out seems synchronous. Say subprocess A sleeps for 1 second, subprocess B sleeps for 3 seconds, and subprocess C sleeps for 10 seconds. The main process stops for the full 10 seconds when it's trying to see if subprocess C has something, even though the other two have probably slept and printed something out. This is to simulate if a subprocess truly has nothing to output for a longer period of time than the other two.
I need a solution which works on Windows.
My code is as follows:
main_process.py
import sys
import subprocess

logfile = open('logfile.txt', 'w')
processes = [
    subprocess.Popen('python subproc_1.py', stdout=subprocess.PIPE, bufsize=1),
    subprocess.Popen('python subproc_2.py', stdout=subprocess.PIPE, bufsize=1),
    subprocess.Popen('python subproc_3.py', stdout=subprocess.PIPE, bufsize=1),
]

while True:
    line = processes[0].stdout.readline()
    if line != '':
        sys.stdout.write(line)
        logfile.write(line)

    line = processes[1].stdout.readline()
    if line != '':
        sys.stdout.write(line)
        logfile.write(line)

    line = processes[2].stdout.readline()
    if line != '':
        sys.stdout.write(line)
        logfile.write(line)

    # If everyone is dead, break
    if processes[0].poll() is not None and \
       processes[1].poll() is not None and \
       processes[2].poll() is not None:
        break

processes[0].wait()
processes[1].wait()
print 'Done'
subproc_1.py/subproc_2.py/subproc_3.py
import time, sys, random

sleep_time = random.random() * 3
for x in range(0, 20):
    print "[PROC1] Slept for {0} seconds".format(sleep_time)
    sys.stdout.flush()
    time.sleep(sleep_time)
    sleep_time = random.random() * 3  # this is different for each subprocess
Update: Solution
Taking the answer below along with this question, this should work.
import sys
import subprocess
from threading import Thread
try:
    from Queue import Queue, Empty
except ImportError:
    from queue import Queue, Empty  # for Python 3.x

ON_POSIX = 'posix' in sys.builtin_module_names

def enqueue_output(out, queue):
    for line in iter(out.readline, b''):
        queue.put(line)
    out.close()

if __name__ == '__main__':
    logfile = open('logfile.txt', 'w')
    processes = [
        subprocess.Popen('python subproc_1.py', stdout=subprocess.PIPE, bufsize=1),
        subprocess.Popen('python subproc_2.py', stdout=subprocess.PIPE, bufsize=1),
        subprocess.Popen('python subproc_3.py', stdout=subprocess.PIPE, bufsize=1),
    ]
    q = Queue()
    threads = []
    for p in processes:
        threads.append(Thread(target=enqueue_output, args=(p.stdout, q)))
    for t in threads:
        t.daemon = True
        t.start()
    while True:
        try:
            line = q.get_nowait()
        except Empty:
            pass
        else:
            sys.stdout.write(line)
            logfile.write(line)
            logfile.flush()
        # break when all processes are done
        if all(p.poll() is not None for p in processes):
            break
    print 'All processes done'
I'm not sure if I need any cleanup code at the end of the while loop. If anyone has comments about it, please add them.
And each subproc script looks similar to this (edited for the sake of a better example):
import datetime, time, sys, random

for x in range(0, 20):
    sleep_time = random.random() * 3
    time.sleep(sleep_time)
    timestamp = datetime.datetime.fromtimestamp(time.time()).strftime('%H%M%S.%f')
    print "[{0}][PROC1] Slept for {1} seconds".format(timestamp, sleep_time)
    sys.stdout.flush()
print "[{0}][PROC1] Done".format(timestamp)
sys.stdout.flush()
Your problem comes from the fact that readline() is a blocking function; if you call it on a file object and there isn't a line waiting to be read, the call won't return until there is a line of output. So what you have now will read repeatedly from subprocesses 1, 2, and 3 in that order, pausing at each until output is ready.
(Edit: The OP clarified that they're on Windows, which makes the below inapplicable.)
If you want to read from whichever output stream is ready, you need to check on the status of the streams in non-blocking fashion, using the select module, and then attempt reads only on those that are ready. select provides various ways of doing this, but for the sake of example we'll use select.select(). After starting your subprocesses, you'll have something like:
streams = [p.stdout for p in processes]

def output(s):
    for f in [sys.stdout, logfile]:
        f.write(s)
        f.flush()

while True:
    rstreams, _, _ = select.select(streams, [], [])
    for stream in rstreams:
        line = stream.readline()
        output(line)
    if all(p.poll() is not None for p in processes):
        break

for stream in streams:
    output(stream.read())
What select() does, when called with three lists of file objects (or file descriptors), is return three subsets of its arguments, which are the streams that are ready for reading, are ready for writing, or have an error condition. Thus on each iteration of the loop we check to see which output streams are ready to read, and iterate over just those. Then we repeat. (Note that it's important here that you're line-buffering the output; the above code assumes that if a stream is ready for reading there's at least one full line ready to be read. If you specify different buffering the above can block.)
A further problem with your original code: When you exit the loop after poll() reports all subprocesses to have exited, you might not have read all their output. So you need to do a last sweep over the streams to read any remaining output.
Note: The example code I gave doesn't try all that hard to capture the subprocesses' output in exactly the order in which it becomes available (which is impossible to do perfectly, but can be approximated more closely than the above manages to do). It also lacks other refinements (for example, in the main loop it'll continue to select on the stdout of every subprocess, even after some have already terminated, which is harmless, but inefficient). It's just meant to illustrate a basic technique of non-blocking IO.
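select() itself is easy to exercise in isolation. A sketch using a socket pair, which also matches the Windows caveat above, since select on Windows accepts only sockets:

```python
import select
import socket

a, b = socket.socketpair()       # two connected endpoints
rlist, _, _ = select.select([a], [], [], 0.5)
print(rlist)                     # [] : nothing written yet, so a is not ready

b.sendall(b"ping")               # now there is data waiting on a
rlist, _, _ = select.select([a], [], [], 0.5)
print(rlist == [a])              # True: a is ready for a non-blocking read
a.close(); b.close()
```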

How can I use Python to pipe stdin/stdout to Perl script

This Python code pipes data through a Perl script fine.
import subprocess

kw = {}
kw['executable'] = None
kw['shell'] = True
kw['stdin'] = None
kw['stdout'] = subprocess.PIPE
kw['stderr'] = subprocess.PIPE
args = ' '.join(['/usr/bin/perl', '-w', '/path/script.perl', '<', '/path/mydata'])
subproc = subprocess.Popen(args, **kw)
for line in iter(subproc.stdout.readline, ''):
    print line.rstrip().decode('UTF-8')
However, it requires that I first save my buffers to a disk file (/path/mydata). It's cleaner to loop through the data in Python and pass it line by line to the subprocess, like this:
import codecs
import subprocess

kw = {}
kw['executable'] = '/usr/bin/perl'
kw['shell'] = False
kw['stderr'] = subprocess.PIPE
kw['stdin'] = subprocess.PIPE
kw['stdout'] = subprocess.PIPE
args = ['-w', '/path/script.perl']
subproc = subprocess.Popen(args, **kw)
f = codecs.open('/path/mydata', 'r', 'UTF-8')
for line in f:
    subproc.stdin.write('%s\n' % (line.strip().encode('UTF-8')))
    print line.strip()  ### code hangs after printing this ###
for line in iter(subproc.stdout.readline, ''):
    print line.rstrip().decode('UTF-8')
subproc.terminate()
f.close()
The code hangs on the readline after sending the first line to the subprocess. I have other executables that use this exact same code perfectly.
My data files can be quite large (1.5 GB). Is there a way to accomplish piping the data without saving it to a file? I don't want to rewrite the Perl script for compatibility with other systems.
Your code is blocking at the line:
for line in iter(subproc.stdout.readline, ''):
because the only way this iteration can terminate is when EOF (end-of-file) is reached, which happens when the subprocess terminates. You don't want to wait until the process terminates, however; you only want to wait until it has finished processing the line that was sent to it.
Furthermore, you're encountering issues with buffering, as Chris Morgan has already pointed out. Another question on Stack Overflow discusses how you can do non-blocking reads with subprocess. I've hacked up a quick and dirty adaptation of the code from that question to your problem:
import codecs
import subprocess
import threading
import Queue

def enqueue_output(out, queue):
    for line in iter(out.readline, ''):
        queue.put(line)
    out.close()

kw = {}
kw['executable'] = '/usr/bin/perl'
kw['shell'] = False
kw['stderr'] = subprocess.PIPE
kw['stdin'] = subprocess.PIPE
kw['stdout'] = subprocess.PIPE
args = ['-w', '/path/script.perl']
subproc = subprocess.Popen(args, **kw)
f = codecs.open('/path/mydata', 'r', 'UTF-8')

q = Queue.Queue()
t = threading.Thread(target=enqueue_output, args=(subproc.stdout, q))
t.daemon = True
t.start()

for line in f:
    subproc.stdin.write('%s\n' % (line.strip().encode('UTF-8')))
    print "Sent:", line.strip()
    try:
        line = q.get_nowait()
    except Queue.Empty:
        pass
    else:
        print "Received:", line.rstrip().decode('UTF-8')

subproc.terminate()
f.close()
It's quite likely that you'll need to make modifications to this code, but at least it doesn't block.
Thanks, srgerg. I had also tried the threading solution; on its own, however, it always hung. Both my previous code and srgerg's code were missing the final piece; your tip gave me one last idea.
The final solution writes enough dummy data to force the final valid lines out of the buffer. To support this, I added code that tracks how many valid lines were written to stdin. The threaded loop opens the output file, saves the data, and breaks when the lines read equal the valid input lines. This solution ensures it reads and writes line by line for any size of file.
def std_output(stdout, outfile=''):
    out = 0
    f = codecs.open(outfile, 'w', 'UTF-8')
    for line in iter(stdout.readline, ''):
        f.write('%s\n' % (line.rstrip().decode('UTF-8')))
        out += 1
        if i == out: break
    stdout.close()
    f.close()

outfile = '/path/myout'
infile = '/path/mydata'

subproc = subprocess.Popen(args, **kw)
t = threading.Thread(target=std_output, args=[subproc.stdout, outfile])
t.daemon = True
t.start()

i = 0
f = codecs.open(infile, 'r', 'UTF-8')
for line in f:
    subproc.stdin.write('%s\n' % (line.strip().encode('UTF-8')))
    i += 1
subproc.stdin.write('%s\n' % (' ' * 4096))  ### push dummy data ###
f.close()
t.join()
subproc.terminate()
See the warnings mentioned in the manual about using Popen.stdin and Popen.stdout (just above Popen.stdin):
Warning: Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.
I realise that having a gigabyte-and-a-half string in memory all at once isn't very desirable, but using communicate() is a way that will work, while as you've observed, once the OS pipe buffer fills up, the stdin.write() + stdout.read() way can become deadlocked.
Is using communicate() feasible for you?
