How to handle several requests in parallel with Twisted Klein - Python

I am creating an API to execute command line commands. The server has practically only two methods, "run" and "stop". The main function of "run" is to run a command line program on the server side and return a list with the system output. On the other hand, the function of "stop" just kills the running process. Here is the code:
import sys
import json
import subprocess

from klein import Klein


class ItemStore(object):
    app = Klein()
    current_process = None

    def __init__(self):
        self._items = {}

    def create_process(self, exe):
        """
        Run command and return the system output inside a JSON string
        """
        print("COMMAND: ", exe)
        process = subprocess.Popen(exe, shell=True, stdout=subprocess.PIPE,
                                   stderr=subprocess.STDOUT)
        self.current_process = process
        # Poll process for new output until finished
        output_lines = []
        counter = 0
        while True:
            counter = counter + 1
            nextline = process.stdout.readline()
            if process.poll() is not None:
                break
            aux = nextline.decode("utf-8")
            output_lines.append(aux)
            sys.stdout.flush()
            counter = counter + 1
        print("RETURN CODE: ", process.returncode)
        return json.dumps(output_lines)

    @app.route('/run/<command>', methods=['POST'])
    def run(self, request, command):
        """
        Execute command line process
        """
        exe = command
        print("COMMAND: ", exe)
        output_lines = self.create_process(exe)
        request.setHeader("Content-Type", "application/json")
        request.setResponseCode(200)
        return output_lines

    @app.route('/stop', methods=['POST'])
    def stop(self, request):
        """
        Kill current execution
        """
        self.current_process.kill()
        request.setResponseCode(200)
        return None


if __name__ == '__main__':
    store = ItemStore()
    store.app.run('0.0.0.0', 15508)
Well, the problem with this is that if I need to stop the current execution, the "stop" request will not be attended to until the "run" request has finished, so it makes no sense to work this way. I have been reading several pages about async/await solutions, but I cannot get it to work! I think the most prominent solution is on this webpage https://crossbario.com/blog/Going-Asynchronous-from-Flask-to-Twisted-Klein/ , however, "run" is still a synchronous process there. I just posted my main and original code in order to not confuse it with the webpage's changes.
Best regards

Everything to do with Klein in this example is already handling requests concurrently. However, your application code blocks until it has fully responded to a request.
You have to write your application code to be non-blocking instead of blocking.
Switch your code from the subprocess module to Twisted's process support.
Use Klein's feature of being able to return a Deferred instead of a result (if you want incremental results while the process is running, also look at the request interface - in particular, the write method - so you can write those results before the Deferred fires with a final result).
After Deferreds make sense to you, then you might want to think about syntactic sugar that's available in the form of async/await. Until you understand what Deferreds are doing, async/await is just going to be black magic that will only ever work by accident in your programs.
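For example, here is a minimal, untested sketch (not the poster's code; the route names and JSON output shape are copied from the question) of what those steps can look like: reactor.spawnProcess with a ProcessProtocol collects the output without blocking, the Klein route returns a Deferred, and /stop kills the child via the process transport.
import json

from klein import Klein
from twisted.internet import defer, protocol, reactor


class LineCollector(protocol.ProcessProtocol):
    """Collects the child's output and fires a Deferred when it exits."""

    def __init__(self):
        self.done = defer.Deferred()
        self.lines = []

    def outReceived(self, data):
        self.lines.extend(data.decode("utf-8").splitlines())

    errReceived = outReceived  # merge stderr into the same list

    def processEnded(self, reason):
        self.done.callback(self.lines)


class ItemStore(object):
    app = Klein()
    current_transport = None

    @app.route('/run/<command>', methods=['POST'])
    def run(self, request, command):
        request.setHeader("Content-Type", "application/json")
        proto = LineCollector()
        # spawnProcess returns immediately; the reactor keeps serving /stop.
        self.current_transport = reactor.spawnProcess(
            proto, '/bin/sh', ['/bin/sh', '-c', command])
        # Klein accepts a Deferred; the response is sent when it fires.
        proto.done.addCallback(json.dumps)
        return proto.done

    @app.route('/stop', methods=['POST'])
    def stop(self, request):
        if self.current_transport is not None:
            self.current_transport.signalProcess('KILL')
        return b''


if __name__ == '__main__':
    ItemStore().app.run('0.0.0.0', 15508)
For incremental output instead of one final JSON list, outReceived could call request.write(data) and the Deferred would only signal completion.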

Related

Real time multiprocess stdout monitoring

Right now, I'm using subprocess to run a long-running job in the background. For multiple reasons (PyInstaller + AWS CLI) I can't use subprocess anymore.
Is there an easy way to achieve the same thing as below? Running a long-running Python function in a multiprocessing pool (or something else) and doing real-time processing of stdout/stderr?
import subprocess

process = subprocess.Popen(
    ["python", "long-job.py"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    shell=True,
)
while True:
    out = process.stdout.read(2000).decode()
    if not out:
        err = process.stderr.read().decode()
    else:
        err = ""
    if (out == "" or err == "") and process.poll() is not None:
        break
    live_stdout_process(out)
Thanks
Getting this to work cross-platform is messy... first of all, Windows' implementation of non-blocking pipes is not user friendly or portable.
One option is to just have your application read its command line arguments and conditionally execute a file, and then you get to use subprocess, since you will be launching yourself with a different argument (see the sketch at the end of this answer).
But to keep it to multiprocessing:
The output must be logged to queues instead of pipes.
You need the child to execute a python file; this can be done using runpy to execute the file as __main__.
This runpy function should run under a multiprocessing child, and this child must first redirect its stdout and stderr in the initializer.
When an error happens, your main application must catch it... but if it is too busy reading the output it won't be able to wait for the error, so a child thread has to start the multiprocess and wait for the error.
The main process has to create the queues, launch the child thread, and read the output.
Putting it all together:
import multiprocessing
from multiprocessing import Queue
import sys
import concurrent.futures
import threading
import traceback
import runpy
import time


class StdoutQueueWrapper:
    def __init__(self, queue: Queue):
        self._queue = queue

    def write(self, text):
        self._queue.put(text)

    def flush(self):
        pass


def function_to_run():
    # runpy.run_path("long-job.py", run_name="__main__")  # run long-job.py
    print("hello")   # print something
    raise ValueError  # error out


def initializer(stdout_queue: Queue, stderr_queue: Queue):
    sys.stdout = StdoutQueueWrapper(stdout_queue)
    sys.stderr = StdoutQueueWrapper(stderr_queue)


def thread_function(child_stdout_queue, child_stderr_queue):
    with concurrent.futures.ProcessPoolExecutor(1, initializer=initializer,
                                                initargs=(child_stdout_queue, child_stderr_queue)) as pool:
        result = pool.submit(function_to_run)
        try:
            result.result()
        except Exception:
            child_stderr_queue.put(traceback.format_exc())


if __name__ == "__main__":
    child_stdout_queue = multiprocessing.Queue()
    child_stderr_queue = multiprocessing.Queue()

    child_thread = threading.Thread(target=thread_function,
                                    args=(child_stdout_queue, child_stderr_queue),
                                    daemon=True)
    child_thread.start()

    while True:
        while not child_stdout_queue.empty():
            var = child_stdout_queue.get()
            print(var, end='')
        while not child_stderr_queue.empty():
            var = child_stderr_queue.get()
            print(var, end='')
        if not child_thread.is_alive():
            break
        time.sleep(0.01)  # check output every 0.01 seconds
Note that a direct consequence of running this as a multiprocess is that if the child runs into a segmentation fault or some other unrecoverable error, the parent will also die; hence running yourself under subprocess might seem a better option if segfaults are expected.
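For completeness, here is a minimal sketch (my own; the --run-long-job flag and the step loop are made-up placeholders) of the first option mentioned above: have the application re-launch itself with a special argument, so plain subprocess can still be used to read the output live.
import subprocess
import sys

if __name__ == "__main__":
    if "--run-long-job" in sys.argv:
        # child mode: the long-running work, printing progress as it goes
        for i in range(5):
            print("step", i, flush=True)
    else:
        # parent mode: re-launch this same program in child mode.
        # (In a frozen/PyInstaller build, sys.executable is the bundled
        # binary and the __file__ argument would be dropped.)
        cmd = [sys.executable, __file__, "--run-long-job"]
        with subprocess.Popen(cmd, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT, text=True) as proc:
            for line in proc.stdout:
                print("live:", line, end="")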

ProcessPoolExecutor not limiting to set value

I have a number of computation processes that need to be run. They take anywhere from 20 minutes to 1+ days. I want the user to be able to observe what each is doing through the standard output, therefore I am executing each in its own cmd window. When I set the number of workers, it does not observe that value and keeps on spinning up more and more until I cancel the program.
def run_job(args):
    os.system("start cmd /k \"{} > \"{}\\stdout.txt\"\"".format(run_command,
                                                                outpath))

CONCURRENCY_HANDLER = concurrent.futures.ProcessPoolExecutor(max_workers=3)
jobs = []
ALL_RUNS_MATRIX = [{k1: v1 ... kn: vn},
                   ....
                   {kA1: vA1 ... kAn: vAn}
                   ]
with CONCURRENCY_HANDLER as executor:
    for idx, configuration in enumerate(ALL_RUNS_MATRIX):
        generate_run_specific_files(configuration, idx)
        args = [doesnt, matter]
        time.sleep(5)
        print("running new")
        jobs.append(executor.submit(run_job, args))
    time.sleep(10)
I originally tried using the ThreadPoolExecutor to the same effect. Why is this not actually limiting the number running concurrently, and if this won't work, what should I use instead? I need to retain this "generate -> wait -> run" path because of the nature of the program (I change a file that it reads for config, it starts, retains all necessary info in memory, then executes), so I am wary of the "workers pull their work off a queue as they come available" model.
Not quite sure what you're trying to do. Maybe give us an example with a simple task that has the same issue with processes? Are you thinking of max_workers as an upper bound to the number of processes spawned? That might not be true. I think max_workers is the number of processor cores your process pool is allowed to use. According to the docs,
If max_workers is None or not given, it will default to the number of processors on the machine. If max_workers is less than or equal to 0, then a ValueError will be raised. On Windows, max_workers must be less than or equal to 61. If it is not then ValueError will be raised. If max_workers is None, then the default chosen will be at most 61, even if more processors are available.
Here is a simple example,
from concurrent.futures import ProcessPoolExecutor
from time import sleep

futures = []


def job(i):
    print('Job started: ' + str(i))
    return i


def all_done():
    done = True
    for ft in futures:
        done = done and ft.done()
    return done


with ProcessPoolExecutor(max_workers=8) as executor:
    for i in range(3):
        futures.append(executor.submit(job, i))
    while not all_done():
        sleep(0.1)

for ft in futures:
    print('Job done: ' + str(ft.result()))
It prints,
Job started: 0
Job started: 1
Job started: 2
Job done: 0
Job done: 1
Job done: 2
Does this help?
As I mentioned in my comment, as soon as the start command is satisfied by opening up the new command window, the os.system call returns as completed, even though the run command being passed to cmd /K has only just started to run. Therefore the process in the pool is now free to run another task.
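To see that effect in isolation, here is a tiny Windows-only sketch (ping is just a stand-in for a long command): os.system returns almost instantly even though the spawned window keeps running for several seconds.
import os
import time

t0 = time.time()
# the ping runs for roughly nine seconds in its own window...
os.system('start cmd /k "ping -n 10 127.0.0.1"')
# ...but os.system comes back right away, so the pool worker looks idle
print("os.system returned after {:.2f}s".format(time.time() - t0))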
If I understand correctly your problem, you have the following goals:
Detect the true completion of your command so that you ensure that no more than 3 commands are running concurrently.
Collect the output of the command in a window that will remain open even after the command has completed. I infer this from your having used the /K switch when invoking cmd.
My solution would be to use windows created by tkinter to hold your output and to use subprocess.Popen to run your commands with argument shell=True. You can specify the additional argument stdout=PIPE to read the output from a command and funnel it to the tkinter window. How to actually do that is the challenge.
I have not done tkinter programming before and perhaps someone with more experience could find a more direct method. It seems to me that the windows need to be created and written to in the main thread. To that end, for every command that will be executed, a window (a special subclass of Tk called CmdWindow) will be created and paired with that command. The command and the output window number will be passed to a worker function run_command along with an instance of queue.Queue. run_command will then use subprocess.Popen to execute the command, and for every line of output it reads from the output pipe, it will write a tuple to the queue with the window number and the line to be written. The main thread is in a loop reading these tuples and writing the lines to the appropriate window. Because the main thread is occupied with writing command output, a special thread is used to create a thread pool, to submit all the commands that need to be run, and to await their completion. When all tasks are completed, a special "end" record is added to the queue, signifying to the main thread that it can stop reading from the queue. At that point the main thread displays a 'Pausing for termination...' message and will not terminate until the user enters a carriage return at the console.
from concurrent.futures import ThreadPoolExecutor, as_completed
from subprocess import Popen, PIPE
from tkinter import *
from tkinter.scrolledtext import ScrolledText
from queue import Queue
from threading import Thread


class CmdWindow(Tk):
    """ A console window """

    def __init__(self, cmd):
        super().__init__()
        self.title(cmd)
        self.configure(background="#BAD0EF")
        title = Entry(self, relief=FLAT, bg="#BAD0EF", bd=0)
        title.pack(side=TOP)
        textArea = ScrolledText(self, height=24, width=120, bg="#FFFFFF", font=('consolas', '14'))
        textArea.pack(expand=True, fill='both')
        textArea.bind("<Key>", lambda e: "break")  # read only
        self._textArea = textArea

    def write(self, s):
        """ write the next line of output """
        self._textArea.insert(END, s)
        self.update()


def run_command(q, cmd, win):
    """ run command cmd with output window win """
    # special "create window" command:
    q.put((win, None))  # create the window
    with Popen(cmd, stdout=PIPE, shell=True, text=True) as proc:
        for line in iter(proc.stdout.readline, ''):
            # write line command:
            q.put((win, line))


def run_tasks(q, arguments):
    # we only need a thread pool since each command will be its own process:
    with ThreadPoolExecutor(max_workers=3) as executor:
        futures = []
        for win, cmd in arguments:
            futures.append(executor.submit(run_command, q, cmd, win))
        # each task doesn't currently return anything
        results = [future.result() for future in as_completed(futures)]
    q.put(None)  # signify end


def main():
    q = Queue()
    # sample commands to execute (under Windows):
    cmds = ['dir *.py', 'dir *.html', 'dir *.txt', 'dir *.js', 'dir *.csv']
    # each command will get its own window for output:
    windows = list(cmds)
    # pair a command with a window number:
    arguments = enumerate(cmds)
    # create the thread for running the commands:
    thread = Thread(target=run_tasks, args=(q, arguments))
    # start the thread:
    thread.start()
    # wait for command output in main thread
    # output must be written from main thread
    while True:
        t = q.get()  # get next tuple or special "end" record
        if t is None:  # special end record?
            break  # yes!
        # unpack tuple:
        win, line = t
        if line is None:  # special create window command
            # use cmd as title and replace with actual window:
            windows[win] = CmdWindow(windows[win])
        else:
            windows[win].write(line)
    thread.join()  # wait for run_tasks thread to end
    input('Pausing for termination...')  # wait for user to be finished looking at windows


if __name__ == '__main__':
    main()

How to properly encapsulate an asyncio Process

I have a program executed in a subprocess. This program runs forever, reads a line from its stdin, processes it, and outputs a result on stdout. I have encapsulated it as follows:
import asyncio
from asyncio import subprocess as sp  # assumed import for the "sp" alias used below


class BrainProcess:
    def __init__(self, filepath):
        # starting the program in a subprocess
        self._process = asyncio.run(self.create_process(filepath))

        # check if the program could not be executed
        if self._process.returncode is not None:
            raise BrainException(f"Could not start process {filepath}")

    @staticmethod
    async def create_process(filepath):
        process = await sp.create_subprocess_exec(
            filepath, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE)
        return process

    # destructor function
    def __del__(self):
        self._process.kill()  # kill the program, since it never stops
        # waiting for the program to terminate
        # self._process.wait() is asynchronous so I use asyncio.run() to execute it
        asyncio.run(self._process.wait())

    async def _send(self, msg):
        b = bytes(msg + '\n', "utf-8")
        self._process.stdin.write(b)
        await self._process.stdin.drain()

    async def _readline(self):
        return await self._process.stdout.readline()

    def send_start_cmd(self, size):
        asyncio.run(self._send(f"START {size}"))
        line = asyncio.run(self._readline())
        print(line)
        return line
From my understanding asyncio.run() is used to run asynchronous code in a synchronous context. That is why I use it at the following lines:
# in __init__
self._process = asyncio.run(self.create_process(filepath))
# in send_start_cmd
asyncio.run(self._send(f"START {size}"))
# ...
line = asyncio.run(self._readline())
# in __del__
asyncio.run(self._process.wait())
The first line seems to work properly (the process is created correctly), but the others throw exceptions that look like: got Future <Future pending> attached to a different loop.
Code:
brain = BrainProcess("./test")
res = brain.send_start_cmd(20)
print(res)
So my questions are:
What do these errors mean?
How do I fix them?
Did I use asyncio.run() correctly?
Is there a better way to encapsulate the process to send and retrieve data to/from it without making my whole application use async/await?
asyncio.run is meant to be used for running a body of async code, and producing a well-defined result. The most typical example is running the whole program:
async def main():
    ...  # your application here

if __name__ == '__main__':
    asyncio.run(main())
Of course, asyncio.run is not limited to that usage; it is perfectly possible to call it multiple times - but it will create a fresh event loop each time. This means you won't be able to share async-specific objects (such as futures or objects that refer to them) between invocations - which is precisely what you tried to do. If you want to completely hide the fact that you're using async, why use asyncio.subprocess in the first place, wouldn't the regular subprocess do just as well?
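For illustration, a minimal sketch (my own, not part of the answer) of that regular-subprocess alternative, for the case where nothing else in the program is async:
import subprocess


class BrainProcess:
    def __init__(self, filepath):
        self._process = subprocess.Popen(
            [filepath], stdin=subprocess.PIPE, stdout=subprocess.PIPE,
            stderr=subprocess.PIPE)

    def send_start_cmd(self, size):
        self._process.stdin.write(f"START {size}\n".encode())
        self._process.stdin.flush()
        return self._process.stdout.readline()  # blocks until a line arrives

    def __del__(self):
        self._process.kill()
        self._process.wait()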
The simplest fix is to avoid asyncio.run and just stick to the same event loop. For example:
_loop = asyncio.get_event_loop()

class BrainProcess:
    def __init__(self, filepath):
        # starting the program in a subprocess
        self._process = _loop.run_until_complete(self.create_process(filepath))
        ...

    ...
Is there a better way to encapsulate the process to send and retrieve data to/from it without making my whole application use async / await ?
The idea is precisely for the whole application to use async/await, otherwise you won't be able to take advantage of asyncio - e.g. you won't be able to parallelize your async code.
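As a rough illustration, here is a sketch (my own; the create() classmethod and main() driver are assumptions) of that shape: BrainProcess exposes coroutines, and a single asyncio.run() at the top drives everything.
import asyncio


class BrainProcess:
    @classmethod
    async def create(cls, filepath):
        self = cls()
        self._process = await asyncio.create_subprocess_exec(
            filepath,
            stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE)
        return self

    async def send_start_cmd(self, size):
        self._process.stdin.write(f"START {size}\n".encode())
        await self._process.stdin.drain()
        return await self._process.stdout.readline()

    async def close(self):
        self._process.kill()
        await self._process.wait()


async def main():
    brain = await BrainProcess.create("./test")
    print(await brain.send_start_cmd(20))
    await brain.close()


if __name__ == '__main__':
    asyncio.run(main())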

click.testing.CliRunner and handling SIGINT/SIGTERM signals

I want to add a few tests of how my CLI app handles different signals (SIGTERM, etc.), and I am using the native testing solution click.testing.CliRunner alongside pytest.
The test looks pretty standard and simple:
def test_breaking_process(server, runner):
    address = server.router({'^/$': Page("").exists().slow()})
    runner = CliRunner(mix_stderr=True)
    args = [address, '--no-colors', '--no-progress']
    result = runner.invoke(main, args)
    assert result.exit_code == 0
And here I am stuck: how could I send SIGTERM to the process inside runner.invoke? I see no problem doing it with e2e tests (calling the executable rather than CliRunner), but I would like to try to implement this here (at least be able to send os.kill).
Is there a way to do it?
So, if you want to test how your click-powered application handles different signals, you can use the following procedure:
def test_breaking_process(server, runner):
    from multiprocessing import Queue, Process
    from threading import Timer
    from time import sleep
    from os import kill, getpid
    from signal import SIGINT

    url = server.router({'^/$': Page("").slow().exists()})
    args = [url, '--no-colors', '--no-progress']

    q = Queue()

    # Run our app in a subprocess and, after a while, send SIGINT;
    # the results are passed back via a channel/queue.
    def background():
        Timer(0.2, lambda: kill(getpid(), SIGINT)).start()
        result = runner.invoke(main, args)
        q.put(('exit_code', result.exit_code))
        q.put(('output', result.output))

    p = Process(target=background)
    p.start()

    results = {}
    while p.is_alive():
        sleep(0.1)
    else:
        while not q.empty():
            key, value = q.get()
            results[key] = value

    assert results['exit_code'] == 0
    assert "Results can be inconsistent, as execution was terminated" in results['output']

Use python's pty to create a live console

I'm trying to create an execution environment/shell that will execute remotely on a server and stream the stdout/err/in over a socket to be rendered in a browser. I have currently tried the approach of using subprocess.run with a PIPE. The problem is that I get the stdout only after the process has completed. What I want to achieve is a line-by-line, pseudo-terminal sort of implementation.
My current implementation
test.py
def greeter():
    for _ in range(10):
        print('hello world')

greeter()
and in the shell
>>> import subprocess
>>> result = subprocess.run(['python3', 'test.py'], stdout=subprocess.PIPE)
>>> print(result.stdout.decode('utf-8'))
hello world
hello world
hello world
hello world
hello world
hello world
hello world
hello world
hello world
hello world
If I were to attempt even this simple implementation with pty, how would one do it?
If your application is going to work asynchronously with multiple tasks, like reading data from stdout and then writing it to a websocket, I suggest using asyncio.
Here is an example that runs a process and redirects its output into a websocket:
import asyncio.subprocess
import sys

from aiohttp.web import (Application, Response, WebSocketResponse, WSMsgType,
                         run_app)


async def on_websocket(request):
    # Prepare aiohttp's websocket...
    resp = WebSocketResponse()
    await resp.prepare(request)

    # ... and store it in a global list so it can be closed on shutdown
    request.app['sockets'].append(resp)

    process = await asyncio.create_subprocess_exec(sys.executable,
                                                   '/tmp/test.py',
                                                   stdout=asyncio.subprocess.PIPE,
                                                   stderr=asyncio.subprocess.PIPE)

    # Schedule reading from stdout and stderr as asynchronous tasks.
    stdout_f = asyncio.ensure_future(process.stdout.readline())
    stderr_f = asyncio.ensure_future(process.stderr.readline())

    # returncode will be set upon the process's termination.
    while process.returncode is None:
        # Wait for a line in either stdout or stderr.
        await asyncio.wait((stdout_f, stderr_f), return_when=asyncio.FIRST_COMPLETED)

        # If a task is done, then a line is available.
        if stdout_f.done():
            line = stdout_f.result().decode()
            stdout_f = asyncio.ensure_future(process.stdout.readline())
            await resp.send_str(f'stdout: {line}')

        if stderr_f.done():
            line = stderr_f.result().decode()
            stderr_f = asyncio.ensure_future(process.stderr.readline())
            await resp.send_str(f'stderr: {line}')

    return resp


async def on_shutdown(app):
    for ws in app['sockets']:
        await ws.close()


async def init():
    app = Application()
    app['sockets'] = []
    app.router.add_get('/', on_websocket)
    app.on_shutdown.append(on_shutdown)
    return app


loop = asyncio.get_event_loop()
app = loop.run_until_complete(init())
run_app(app)
It uses aiohttp and is based on the web_ws and subprocess streams examples.
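If it helps, here is a small sketch of a client for that endpoint, using aiohttp's own websocket support (the localhost:8080 address assumes run_app's defaults):
import asyncio

import aiohttp


async def main():
    async with aiohttp.ClientSession() as session:
        async with session.ws_connect('http://localhost:8080/') as ws:
            async for msg in ws:
                if msg.type == aiohttp.WSMsgType.TEXT:
                    print(msg.data)   # e.g. "stdout: hello world"
                else:
                    break

asyncio.run(main())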
I'm sure there's a dupe around somewhere but I couldn't find it quickly:
process = subprocess.Popen(cmd, stderr=subprocess.PIPE, stdout=subprocess.PIPE, bufsize=0)
for out in iter(process.stdout.readline, b""):
    print(out)
If you are on Windows then you will be fighting an uphill battle for a very long time, and I am sorry for the pain you will endure (been there). If you are on Linux, however, you can use the pexpect module. Pexpect allows you to spawn a background child process which you can perform bidirectional communication with. This is useful for all types of system automation, but a very common use case is ssh.
import pexpect

child = pexpect.spawn('python3 test.py')
message = 'hello world'

while True:
    try:
        child.expect(message)
    except pexpect.exceptions.EOF:
        break
    input('child sent: "%s"\nHit enter to continue: ' %
          (message + child.before.decode()))

print('reached end of file!')
I have found it very useful to create a class to handle something complicated like an ssh connection, but if your use case is simple enough that might not be appropriate or necessary. The way child.before is of type bytes and omits the pattern you are searching for can be awkward, so it may make sense to create a function that handles this for you at the very least.
def get_output(child, message):
    return message + child.before.decode()
If you want to send messages to the child process, you can use child.sendline(line). For more details, check out the documentation I linked.
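For instance, a small sketch of that bidirectional use (my own; the interactive python3 -i child is just an assumed stand-in): sendline() writes to the child, expect() plus before reads what came back.
import pexpect

child = pexpect.spawn('python3 -i', encoding='utf-8')
child.expect('>>> ')              # wait for the interactive prompt
child.sendline('print(6 * 7)')    # send a line to the child
child.expect('>>> ')              # wait for the next prompt
print('child replied:', child.before.strip())  # echoed input plus "42"
child.sendline('exit()')
child.close()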
I hope I was able to help!
I don't know if you can render this in a browser, but you can run a program as a module so you get stdout immediately, like this:
import importlib
import importlib.util
from importlib.machinery import SourceFileLoader
import os


class Program:
    def __init__(self, path, name=''):
        self.path = path
        self.name = name
        if self.path:
            if not self.name:
                self.get_name()
            self.loader = importlib.machinery.SourceFileLoader(self.name, self.path)
            self.spec = importlib.util.spec_from_loader(self.loader.name, self.loader)
            self.mod = importlib.util.module_from_spec(self.spec)
        return

    def get_name(self):
        # derive the module name from the file name; change this if self.path
        # is not a Python program with a .py extension
        self.name = os.path.splitext(os.path.basename(self.path))[0]
        return

    def load(self):
        self.check()
        self.loader.exec_module(self.mod)
        return

    def check(self):
        if not self.path:
            raise ValueError('self.path is NOT defined.')
        return


file_path = 'C:\\Users\\RICHGang\\Documents\\projects\\stackoverflow\\ptyconsole\\test.py'
file_name = 'test'
prog = Program(file_path, file_name)
prog.load()
You can add sleep in test.py to see the difference:
from time import sleep

def greeter():
    for i in range(10):
        sleep(0.3)
        print('hello world')

greeter()
Take a look at terminado. It works on Windows and Linux.
Jupyter Lab uses it.
