How to properly encapsulate an asyncio Process - python

I have a program executed in a subprocess. This program runs forever, reads a line from its stdin, processes it, and outputs a result on stdout. I have encapsulated it as follows:
import asyncio
from asyncio import subprocess as sp  # assumed import: sp is used below as asyncio's subprocess module

class BrainProcess:
    def __init__(self, filepath):
        # starting the program in a subprocess
        self._process = asyncio.run(self.create_process(filepath))
        # check if the program could not be executed
        if self._process.returncode is not None:
            raise BrainException(f"Could not start process {filepath}")  # project-specific exception defined elsewhere

    @staticmethod
    async def create_process(filepath):
        process = await sp.create_subprocess_exec(
            filepath, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE)
        return process

    # destructor function
    def __del__(self):
        self._process.kill()  # kill the program, since it never stops
        # waiting for the program to terminate
        # self._process.wait() is asynchronous so I use asyncio.run() to execute it
        asyncio.run(self._process.wait())

    async def _send(self, msg):
        b = bytes(msg + '\n', "utf-8")
        self._process.stdin.write(b)
        await self._process.stdin.drain()

    async def _readline(self):
        return await self._process.stdout.readline()

    def send_start_cmd(self, size):
        asyncio.run(self._send(f"START {size}"))
        line = asyncio.run(self._readline())
        print(line)
        return line
From my understanding asyncio.run() is used to run asynchronous code in a synchronous context. That is why I use it at the following lines:
# in __init__
self._process = asyncio.run(self.create_process(filepath))
# in send_start_cmd
asyncio.run(self._send(f"START {size}"))
# ...
line = asyncio.run(self._readline())
# in __del__
asyncio.run(self._process.wait())
The first line seems to work properly (the process is created correctly), but the others throw exceptions that look like: got Future <Future pending> attached to a different loop.
Code:
brain = BrainProcess("./test")
res = brain.send_start_cmd(20)
print(res)
So my questions are:
What do these errors mean?
How do I fix them?
Did I use asyncio.run() correctly?
Is there a better way to encapsulate the process to send and retrieve data to/from it without making my whole application use async/await?

asyncio.run is meant to be used for running a body of async code, and producing a well-defined result. The most typical example is running the whole program:
async def main():
    ...  # your application here

if __name__ == '__main__':
    asyncio.run(main())
Of course, asyncio.run is not limited to that usage; it is perfectly possible to call it multiple times - but it will create a fresh event loop each time. This means you won't be able to share async-specific objects (such as futures or objects that refer to them) between invocations - which is precisely what you tried to do. If you want to completely hide the fact that you're using async, why use asyncio.subprocess in the first place - wouldn't the regular subprocess do just as well?
The simplest fix is to avoid asyncio.run and just stick to the same event loop. For example:
_loop = asyncio.get_event_loop()

class BrainProcess:
    def __init__(self, filepath):
        # starting the program in a subprocess
        self._process = _loop.run_until_complete(self.create_process(filepath))
        ...
    ...
Is there a better way to encapsulate the process to send and retrieve data to/from it without making my whole application use async / await ?
The idea is precisely for the whole application to use async/await; otherwise you won't be able to take advantage of asyncio - e.g., you won't be able to run your async code concurrently.
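For illustration, here is a minimal sketch (my own, not the poster's code) of what an async-native wrapper could look like, assuming the same ./test program; everything is created and awaited inside one running event loop:

import asyncio
from asyncio import subprocess as sp

class AsyncBrainProcess:
    # hypothetical async-native variant of the poster's class
    @classmethod
    async def create(cls, filepath):
        self = cls()
        self._process = await sp.create_subprocess_exec(
            filepath, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE)
        return self

    async def send_start_cmd(self, size):
        self._process.stdin.write(f"START {size}\n".encode())
        await self._process.stdin.drain()
        return await self._process.stdout.readline()

async def main():
    brain = await AsyncBrainProcess.create("./test")
    print(await brain.send_start_cmd(20))

asyncio.run(main())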

Related

Python: Threads are not running in parallel

I'm trying to create a networking project using UDP connections. The server that I'm creating has to multithread in order to be able to receive multiple commands from multiple clients. However when trying to multithread the server, only one thread is running. Here is the code:
def action_assigner():
    print('Hello Assign')
    while True:
        if work_queue.qsize() != 0:
            data, client_address, request_number = work_queue.get()
            do_actions(data, client_address, request_number)

def task_putter():
    request_number = 0
    print('Hello Task')
    while True:
        data_received = server_socket.recvfrom(1024)
        request_number += 1
        taskRunner(data_received, request_number)

try:
    thread_task = threading.Thread(target=task_putter())
    action_thread = threading.Thread(target=action_assigner())
    action_thread.start()
    thread_task.start()
    action_thread.join()
    thread_task.join()
except Exception as e:
    server_socket.close()
When running the code, I only get Hello Task as the result meaning that the action_thread never started. Can someone explain how to fix this?
The problem here is that you are calling the functions that should be the "body" of each thread when creating the Threads themselves.
Upon executing the line thread_task = threading.Thread(target=task_putter()), Python first evaluates the expression inside the parentheses - it calls the function task_putter, which never returns. None of the subsequent lines in your program ever runs.
When creating threads - or making any other call that takes a callable as an argument - we pass the function object itself rather than calling it (calling it runs the function and evaluates to its return value).
Just remove the calling parentheses from the target= argument on both thread-creating lines and you will get past this point:
...
try:
    thread_task = threading.Thread(target=task_putter)
    action_thread = threading.Thread(target=action_assigner)
    ...
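To illustrate the difference, here is a tiny standalone example (not part of the original post):

import threading

def worker():
    print("running in a thread")

t = threading.Thread(target=worker)    # pass the function object itself
# threading.Thread(target=worker())    # would call worker() here, in the main thread
t.start()
t.join()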

Python Multiprocessing JoinableQueue: clear queue and discard all unfinished tasks

I have two processes, and in order to do some cleanup in case of fatal errors (instead of the processes continuing to run), I want to remove all remaining tasks and empty the queue (in order to let join() proceed). How can I achieve that? Preferably the code should be applicable in both processes, but my setup allows the child process to signal the main process of its failure state and instruct main to do the cleanup as well.
I was trying to understand it by inspecting the source at:
https://github.com/python/cpython/blob/main/Lib/multiprocessing/queues.py
But I got a little bit lost with code like:
...
self._unfinished_tasks._semlock._is_zero():
...
def __init__(self, maxsize=0, *, ctx):
    Queue.__init__(self, maxsize, ctx=ctx)
    self._unfinished_tasks = ctx.Semaphore(0)
...
(also, where does the _semlock property come from?)
For example, what is ctx? It appears not to be required, as I did not use it when creating my object. Digging further, it may have something to do with (a little too mysterious for me):
mp.get_context('spawn')
or
@asynccontextmanager
async def ctx():
    yield
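For reference, this is my rough understanding of where ctx normally comes from (a small sketch I put together, not taken from the multiprocessing docs):

import multiprocessing as mp

# get_context() returns a context object tied to one start method; queues
# created through it pass that context (ctx=ctx) to their internal
# semaphores and locks
ctx = mp.get_context('spawn')
q = ctx.JoinableQueue()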
I need something like the approach mentioned here by V.E.O (which is quite understandable, but as far as I understand it only covers a single process):
Clear all items from the queue
I came up with the following code (to be tested):
def clearAndDiscardQueue(self):
    try:  # cleanup, preferably in the process that is adding to the queue
        while True:
            self.task_queue.get_nowait()
    except Empty:
        pass
    except ValueError:  # in case of closed
        pass
    self.task_queue.close()
    # theoretically a new item could be placed by the
    # other process by the time the interpreter is on this line,
    # therefore the part above should be run in the process that
    # fills (put) the queue when it is in its failure state
    # (when the main process fails it should communicate to
    # raise an exception in the child process to run the cleanup
    # so main process' join will work)
    try:  # could be one of the processes
        while True:
            self.task_queue.task_done()
    except ValueError:  # too many times called, do not care
        # since all remaining will not be processed due to failure state
        pass
Otherwise I would need to try to understand code like the following. I think messing with this code - the analogue of calling queue.clear() on a single-process queue - would have serious consequences in terms of race conditions if I cleared the buffer/pipe myself somehow.
class Queue(object):
    def __init__(self, maxsize=0, *, ctx):
        …
        self._reader, self._writer = connection.Pipe(duplex=False)
        …

    def put(self, obj, block=True, timeout=None):
        …
        self._buffer.append(obj)  # in case of close() the background thread
        # will quit once it has flushed all buffered data to the pipe.
        …

    def get(self, block=True, timeout=None):
        …
        res = self._recv_bytes()
        …
        return _ForkingPickler.loads(res)

…

class JoinableQueue(Queue):
    def __init__(self, maxsize=0, *, ctx):
        …
        self._unfinished_tasks = ctx.Semaphore(0)
        …

    def task_done(self):
        …
        if not self._unfinished_tasks._semlock._is_zero():
            …
in which _is_zero() is somehow externally defined (see synchronize.py), like mentioned here:
Why doesn't Python's _multiprocessing.SemLock have 'name'?

await asyncio.sleep(1) not working in python

My code execution does not reach the print statement: print("I want to display after MyClass has started")
Why is this? I thought the purpose of await asyncio.sleep() is to unblock execution of code so that subsequent lines of code can run. Is that not the case?
import asyncio

class MyClass:
    def __init__(self):
        self.input = False
        asyncio.run(self.start())
        print("I want to display after MyClass has started")  # This line is never reached.

    async def start(self):
        while True:
            print("Changing state...")
            if self.input:
                print("I am on.")
                break
            await asyncio.sleep(1)

m = MyClass()
m.input = True  # This line is never reached! Why?
print("I want to display after MyClass is started")
When I execute it, it keeps printing "Changing state...". Even when I press Ctrl+C to quit, the execution continues. How can I properly terminate the execution? Sorry, I am new to Python.
EDIT:
I appreciate that the common use of asyncio is for running two or more separate functions asynchronously. However, my class is one which will be responding to changes in its state. For example, I intend to write code in the setters to do stuff when the class object's attributes change, while still having a while True event loop running in the background. Is there not any way to permit this? I have tried running the event loop in its own thread. However, that thread then dominates and the class object's response times run into several seconds. This may be due to the GIL (Global Interpreter Lock), which we can do nothing about. I have also tried using multiprocessing, but then I lose access to the properties and methods of the object, as parallel processes run in their own memory spaces.
In the __init__ method of MyClass you invoke asyncio.run() - this call runs the given coroutine until it terminates. In your case, since start() contains a while True loop (and self.input can never become True while that loop is running), it never terminates.
Here is a slight modification of your code that perhaps shows the concurrency effect you're after:
import asyncio

class MyClass:
    def __init__(self):
        self.input = False
        asyncio.run(self.main())
        print("I want to display after MyClass has been initialized.")  # Reached once both workers have finished.

    async def main(self):
        work1 = self.work1()
        work2 = self.work2()
        await asyncio.gather(work1, work2)

    async def work1(self):
        for idx in range(5):
            print('doing some work 1...')
            await asyncio.sleep(1)

    async def work2(self):
        for idx in range(5):
            print('doing some work 2...')
            await asyncio.sleep(1)

m = MyClass()
print("I want to display after MyClass is terminated")

How to use asyncio with threading in Python

I am trying to use asyncio together with threading for a Discord Bot. I've found this script which I changed to my needs:
import time
import threading as th
import asyncio
import discord

class discordvars(object):
    client = discord.Client()
    TOKEN = ('---')
    running_discordthread = False
    discordloop = asyncio.get_event_loop()
    discordloop.create_task(client.start(TOKEN))
    discordthread = th.Thread(target=discordloop.run_forever)

def start():
    if discordvars.running_discordthread == False:
        discordvars.discordthread.start()
        print("Discord-Client started...")
        discordvars.running_discordthread = True
    else:
        print("Discord-CLient allready running...")
        time.sleep(2)

def stop():
    if discordvars.running_discordthread == True:
        discordvars.discordloop.call_soon_threadsafe(discordvars.discordloop.stop())
        print("Requestet Discord-Client stop!")
        discordvars.discordthread.join()
        print(discordvars.discordthread.isAlive())
        time.sleep(1)
        print("Discord-Client stopped...")
        discordvars.running_discordthread = False
    else:
        print("Discord-Client not running...")
        time.sleep(2)

@discordvars.client.event
async def on_message(message):
    if message.content.startswith('!test'):
        embed = discord.Embed(title="test", color=0x0071ce, description="test")
        await message.channel.send(embed=embed)
Starting the script with the start() function works great. Stopping with the stop() function also works, more or less: if I call stop() it prints "False", so I assume the thread was stopped. But if I then call the start() function again, I get an error:
RuntimeError: threads can only be started once
This script is part of a big project so I am calling the functions from another script. But I think that shouldn't be the problem.
What is the problem? Thanks in advance.
You cannot re-start the existing thread, but you can start a new thread that runs the event loop. You can achieve that by moving the assignment to discordthread to the start function.
And your call to call_soon_threadsafe is wrong. You need to pass discordloop.stop to it, without parentheses. That refers to the actual function without calling it right away, and allows the loop thread to call it, which was intended:
discordloop.call_soon_threadsafe(discordloop.stop)
Finally, your init function is missing a global declaration for the variables you assign that are intended as globals.
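Putting the first two suggestions together, a sketch of the adjusted functions might look like this (illustrative only, keeping the poster's names):

def start():
    if not discordvars.running_discordthread:
        # create a fresh Thread each time; a Thread object can only be started once
        discordvars.discordthread = th.Thread(target=discordvars.discordloop.run_forever)
        discordvars.discordthread.start()
        print("Discord-Client started...")
        discordvars.running_discordthread = True

def stop():
    if discordvars.running_discordthread:
        # pass the function itself, without calling it
        discordvars.discordloop.call_soon_threadsafe(discordvars.discordloop.stop)
        discordvars.discordthread.join()
        print("Discord-Client stopped...")
        discordvars.running_discordthread = False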

How to attend several request in parallel with twisted Klein

I am creating an API to execute command-line commands. The server practically has only two methods, "run" and "stop". The main function of "run" is to run a command-line program on the server side and return a list with the system output. On the other hand, the function of "stop" is simply to kill the running process. Here is the code:
import sys
import json
import subprocess
from klein import Klein

class ItemStore(object):
    app = Klein()
    current_process = None

    def __init__(self):
        self._items = {}

    def create_process(self, exe):
        """
        Run command and return the system output inside a JSON string
        """
        print("COMMAND: ", exe)
        process = subprocess.Popen(exe, shell=True, stdout=subprocess.PIPE,
                                   stderr=subprocess.STDOUT)
        self.current_process = process
        # Poll process for new output until finished
        output_lines = []
        counter = 0
        while True:
            counter = counter + 1
            nextline = process.stdout.readline()
            if process.poll() is not None:
                break
            aux = nextline.decode("utf-8")
            output_lines.append(aux)
            sys.stdout.flush()
        counter = counter + 1
        print("RETURN CODE: ", process.returncode)
        return json.dumps(output_lines)

    @app.route('/run/<command>', methods=['POST'])
    def run(self, request, command):
        """
        Execute command line process
        """
        exe = command
        print("COMMAND: ", exe)
        output_lines = self.create_process(exe)
        request.setHeader("Content-Type", "application/json")
        request.setResponseCode(200)
        return output_lines

    @app.route('/stop', methods=['POST'])
    def stop(self, request):
        """
        Kill current execution
        """
        self.current_process.kill()
        request.setResponseCode(200)
        return None

if __name__ == '__main__':
    store = ItemStore()
    store.app.run('0.0.0.0', 15508)
Well, the problem with this is that if I need to stop the current execution, the "stop" request will not be attended to until the "run" request has finished, so it makes no sense to work this way. I have been reading several pages about async/await solutions, but I cannot get it to work! I think the most prominent solution is on this webpage: https://crossbario.com/blog/Going-Asynchronous-from-Flask-to-Twisted-Klein/ ; however, "run" is still a synchronous process. I just posted my main and original code in order not to confuse things with the webpage's changes.
Best regards
Everything to do with Klein in this example is already handling requests concurrently. However, your application code blocks until it has fully responded to a request.
You have to write your application code to be non-blocking instead of blocking.
Switch your code from the subprocess module to Twisted's process support.
Use Klein's feature of being able to return a Deferred instead of a result (if you want incremental results while the process is running, also look at the request interface - in particular, the write method - so you can write those results before the Deferred fires with a final result).
After Deferreds make sense to you, then you might want to think about syntactic sugar that's available in the form of async/await. Until you understand what Deferreds are doing, async/await is just going to be black magic that will only ever work by accident in your programs.
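As a rough sketch of the first two points (my illustration, not the original answer's code): twisted.internet.utils.getProcessOutput returns a Deferred that a Klein route can return directly, which keeps the reactor free to serve /stop while a command runs.

import json
from klein import Klein
from twisted.internet.utils import getProcessOutput

app = Klein()

@app.route('/run/<command>', methods=['POST'])
def run(request, command):
    request.setHeader("Content-Type", "application/json")
    # getProcessOutput returns a Deferred; Klein writes the response when it
    # fires, so other requests are handled in the meantime
    d = getProcessOutput(b'/bin/sh', (b'-c', command.encode()), errortoo=True)
    d.addCallback(lambda out: json.dumps(out.decode("utf-8").splitlines()))
    return d

if __name__ == '__main__':
    app.run('0.0.0.0', 15508)

Keeping a handle on the running process so /stop can kill it is where Twisted's fuller process support comes in: reactor.spawnProcess with a ProcessProtocol, rather than the one-shot getProcessOutput helper.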
