How to schedule and cancel tasks with asyncio - python

I am writing a client-server application. While connected, the client sends the server a "heartbeat" signal, for example, every second.
On the server side I need a mechanism to add tasks (or coroutines, or something else) to be executed asynchronously. Moreover, I want to cancel a client's tasks when it stops sending that "heartbeat" signal.
In other words, when the server starts a task it has a kind of timeout or TTL, for example 3 seconds. When the server receives the "heartbeat" signal it resets the timer for another 3 seconds, until the task is done or the client disconnects (stops sending the signal).
Here is an example of cancelling a task from the asyncio tutorial on pymotw.com. But here the task is cancelled before the event loop is started, which is not suitable for me.
import asyncio

async def task_func():
    print('in task_func')
    return 'the result'

event_loop = asyncio.get_event_loop()
try:
    print('creating task')
    task = event_loop.create_task(task_func())

    print('canceling task')
    task.cancel()

    print('entering event loop')
    event_loop.run_until_complete(task)
    print('task: {!r}'.format(task))
except asyncio.CancelledError:
    print('caught error from cancelled task')
else:
    print('task result: {!r}'.format(task.result()))
finally:
    event_loop.close()

You can use asyncio Task wrappers to execute a task via the ensure_future() function.
ensure_future will automatically wrap your coroutine in a Task wrapper and attach it to your event loop. The Task wrapper will then also ensure that the coroutine 'cranks over' from await statement to await statement (or until the coroutine finishes).
In other words, just pass a regular coroutine to ensure_future and assign the resulting Task object to a variable. You can then call Task.cancel() when you need to stop it.
import asyncio

async def task_func():
    print('in task_func')
    # if the task needs to run for a while you'll need an await statement
    # to provide a pause point so that other coroutines can run in the meantime
    await some_db_or_long_running_background_coroutine()
    # or if this is a once-off thing, then return the result,
    # but then you don't really need a Task wrapper...
    # return 'the result'

async def my_app():
    my_task = None
    while True:
        await asyncio.sleep(0)

        # listen for trigger / heartbeat
        if heartbeat and my_task is None:
            my_task = asyncio.ensure_future(task_func())

        # also listen for termination of heartbeat / connection
        elif not heartbeat and my_task:
            if not my_task.cancelled():
                my_task.cancel()
            else:
                my_task = None

run_app = asyncio.ensure_future(my_app())
event_loop = asyncio.get_event_loop()
event_loop.run_forever()
Note that Tasks are meant for long-running jobs that need to keep working in the background without interrupting the main flow. If all you need is a quick once-off call, just call the function directly instead.
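If the goal is specifically the heartbeat/TTL behaviour from the question, one way to express it is a per-client watchdog built on asyncio.wait_for. Here is a minimal sketch; heartbeat_event (an asyncio.Event that the connection handler sets whenever a heartbeat arrives) and do_work are hypothetical placeholders:
import asyncio

async def do_work():
    # hypothetical long-running task for one client
    while True:
        await asyncio.sleep(1)

async def watchdog(heartbeat_event, ttl=3.0):
    work = asyncio.ensure_future(do_work())
    try:
        while not work.done():
            try:
                # wait up to `ttl` seconds for the next heartbeat
                await asyncio.wait_for(heartbeat_event.wait(), timeout=ttl)
                heartbeat_event.clear()  # heartbeat arrived: the timer restarts
            except asyncio.TimeoutError:
                work.cancel()            # no heartbeat in time: cancel the task
                break
    finally:
        if not work.done():
            work.cancel()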

Related

How to process a WebSocket inside a thread created in the FastAPI endpoint? [duplicate]

I'm using FastAPI with WebSockets to "push" SVGs to the client. The problem is: if iterations run continuously, they block the async event loop and the socket therefore can't listen for other messages.
Running the loop as a background task is not suitable because each iteration is CPU heavy and the data must be returned to the client.
Is there a different approach, or will I need to trigger each step from the client? I thought multiprocessing could work, but I'm not sure how this would work with asynchronous code like await websocket.send_text().
#app.websocket("/ws")
async def read_websocket(websocket: WebSocket) -> None:
await websocket.accept()
while True:
data = await websocket.receive_text()
async def run_continuous_iterations():
#needed to run the steps until the user sends "stop"
while True:
svg_string = get_step_data()
await websocket.send_text(svg_string)
if data == "status":
await run_continuous_iterations()
#this code can't run if the event loop is blocked by run_continuous_iterations
if data == "stop":
is_running = False
print("Stopping process")
"...each iteration is CPU heavy and the data must be returned to the
client".
As described in this answer, a "coroutine suspends its execution only when it explicitly requests to be suspended", for example, if there is an await call to an asynchronous operation/function; normally, to I/O-bound tasks such as the ones described here (Note: FastAPI/Starlette runs I/O-bound methods, such as reading File contents, in a threadpool, using the async run_in_threadpool() function) and awaits for them; hence, calling such File operations from your async def endpoint, e.g., await file.read() won't block the event loop—see the linked answer above for more details). This, however, does not apply to blocking I/O-bound or CPU-bound operations, such as the ones mentioned here. Running such operations inside an async def endpoint will block the event loop; and hence, any further requests will get blocked until the blocking operation is completed.
Additionally, from the code snippet your provided, it seems that you would like to be sending data back to the client, while at the same time listening for new messages (in order to check if the client sent a "stop" msg, in order to stop the process). Thus, awaiting for an operation to be completed is not the way to go, but rather starting a separate thread or process to execute that task should be a more suitable way. Solutions are given below.
Using asyncio's run_in_executor:
#app.websocket("/ws")
async def websocket_endpoint(websocket: WebSocket):
is_running = True
await websocket.accept()
try:
while True:
data = await websocket.receive_text()
async def run_continuous_iterations():
while is_running:
svg_string = get_step_data()
await websocket.send_text(svg_string)
if data == "status":
is_running = True
loop = asyncio.get_running_loop()
loop.run_in_executor(None, lambda: asyncio.run(run_continuous_iterations()))
if data == "stop":
is_running = False
print("Stopping process")
except WebSocketDisconnect:
is_running = False
print("Client disconnected")
Using threading's Thread:
# ... rest of the code is the same as above
if data == "status":
    is_running = True
    thread = threading.Thread(target=lambda: asyncio.run(run_continuous_iterations()))
    thread.start()
# ... rest of the code is the same as above

Cancel run_in_executor coroutine from the main thread not working

In my use case I ping a service while a sync operation is occurring. Once the sync operation is finished I need to stop the ping operation, but it seems the run_in_executor call cannot be cancelled:
import asyncio
import threading

async def running_bg(loop, event):
    await loop.run_in_executor(None, running, loop, event)

def running(loop, event):
    while True:
        if event.is_set():
            print("cancelling")
            break
        print("We are in running")
        future = asyncio.run_coroutine_threadsafe(asyncio.sleep(5), loop)
        future.result()
    return

async def run_them(steps, loop):
    step = steps
    event = threading.Event()
    task = loop.create_task(running_bg(loop, event))
    while steps:
        await asyncio.sleep(2)
        steps -= 1
    event.set()
    task.cancel()  # if I comment this, it works well, but if I don't, then it hangs
    try:
        await task
    except asyncio.CancelledError:
        print("task cancelled")
    return

if __name__ == "__main__":
    loop = asyncio.get_event_loop()
    loop.run_until_complete(run_them(3, loop))
When I call cancel() it hangs in the terminal with:
We are in running
We are in running
task cancelled
^CError in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib/python3.8/concurrent/futures/thread.py", line 40, in _python_exit
t.join()
File "/usr/lib/python3.8/threading.py", line 1011, in join
self._wait_for_tstate_lock()
File "/usr/lib/python3.8/threading.py", line 1027, in _wait_for_tstate_lock
elif lock.acquire(block, timeout):
KeyboardInterrupt
But when I don't call cancel(), it works fine with the threading.Event flag.
We are in running
We are in running
cancelling
I know we don't need to cancel the task if we have an event flag, but I saw this example in a different answer.
So why does the program hang? Any possible reason?
The problem is related to the fact that your running method, which executes in the asyncio default thread pool, is calling back into the asyncio event loop thread and waiting for a result. In your example code, you're letting the run_them function exit immediately after you cancel the Task, which immediately shuts down your event loop. When the event loop shuts down, it means any outstanding coroutines do not complete.
This means your event loop shuts down before your running method receives the result from its future.result() call, which is waiting on an asyncio.sleep(5) call that will never complete. That means future.result() never returns, which leaves the running method hanging, which means the ThreadPoolExecutor it is running in can't shut down. This is what prevents your application from exiting. Note how the stack trace you get when you Ctrl+C starts in the concurrent.futures library - that's where it waits for the ThreadPoolExecutor to shut down.
If you're using Python 3.9+, you should be able to fix this by adding a call to await loop.shutdown_default_executor() at the end of your run_them method. If you're using an earlier version, you have to basically implement that method yourself:
def shutdown(fut, loop):
    try:
        loop._default_executor.shutdown(wait=True)
    finally:
        loop.call_soon_threadsafe(fut.set_result, None)

async def run_them(steps, loop):
    step = steps
    event = threading.Event()
    task = loop.create_task(running_bg(loop, event))
    while steps:
        await asyncio.sleep(2)
        steps -= 1
    event.set()
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        print("task cancelled")
    # Wait for the default thread pool to shut down before exiting
    fut = loop.create_future()
    t = threading.Thread(target=shutdown, args=(fut, loop))
    t.start()
    await fut
    t.join()
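For reference, on Python 3.9+ the same fix reduces to awaiting the built-in shutdown coroutine at the end of run_them; a rough sketch:
async def run_them(steps, loop):
    event = threading.Event()
    task = loop.create_task(running_bg(loop, event))
    while steps:
        await asyncio.sleep(2)
        steps -= 1
    event.set()
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        print("task cancelled")
    # Python 3.9+: wait for the default executor's threads to finish
    await loop.shutdown_default_executor()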
Alternatively, you could just not call task.cancel() and rely on the Event() to break out of the running method.

How to wait for all tasks to finish before terminating the event loop?

What's the standard way in Python to ensure that all concurrent tasks are completed before the event loop ends? Here's a simplified example:
import asyncio

async def foo(delay):
    print("Start foo.")  # Eg: Send message
    asyncio.create_task(bar(delay))
    print("End foo.")

async def bar(delay):
    print("Start bar.")
    await asyncio.sleep(delay)
    print("End bar.")  # Eg: Delete message after delay

def main():
    asyncio.run(foo(2))

if __name__ == "__main__":
    main()
Current output:
Start foo. # Eg: Send message
End foo.
Start bar.
Desired output:
Start foo. # Eg: Send message
End foo.
Start bar.
End bar. # Eg: Delete message after delay
I've tried to run all outstanding tasks after loop.run_until_complete(), but that doesn't work since the loop will have been terminated by then. I've also tried modifying the main function to the following:
async def main():
    await foo(2)
    tasks = asyncio.all_tasks()
    if len(tasks) > 0:
        await asyncio.wait(tasks)

if __name__ == "__main__":
    asyncio.run(main())
The output is correct, but it never terminates since the coroutine main() is one of the tasks. The setup above is also how discord.py sends a message and deletes it after a period of time, except that it uses loop.run_forever() instead, so it does not encounter the problem.
There is no standard way to wait for all tasks in asyncio (and similar frameworks), and in fact one should not try to. Speaking in terms of threads, a Task expresses both regular and daemon activities. Waiting for all tasks indiscriminately may cause an application to stall indefinitely.
A task that is created but never awaited is de facto a background/daemon task. In contrast, if a task should not be treated as background/daemon then it is the caller's responsibility to ensure it is awaited.
The simplest solution is for every coroutine to await and/or cancel all tasks it spawns.
async def foo(delay):
    print("Start foo.")
    task = asyncio.create_task(bar(delay))
    print("End foo.")
    await task  # foo is done here; it ensures the other task finishes as well
Since the entire point of async/tasks is to have cheap task switching, this is a cheap operation. It should also not affect any well-designed applications:
If the purpose of a function is to produce a value, any child tasks should be part of producing that value.
If the purpose of a function is some side-effect, any child tasks should be parts of that side-effect.
For more complex situations, it can be worthwhile to return any outstanding tasks.
async def foo(delay):
    print("Start foo.")
    task = asyncio.create_task(bar(delay))
    print("End foo.")
    return task  # allow the caller to wait for our child tasks
This requires the caller to explicitly handle outstanding tasks, but gives prompt replies and the most control. The top-level task is then responsible for handling any orphan tasks.
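A sketch of what that looks like from the caller's side (the names follow the snippet above):
async def main():
    outstanding = await foo(2)   # foo returns its child task
    # ... do any other work here ...
    await outstanding            # the top-level task decides when to wait (or cancel)

asyncio.run(main())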
For async programming in general, the structured concurrency paradigm encodes the idea of "handling outstanding tasks" in a managing object. In Python, this pattern is provided by the trio library as so-called nursery objects.
import trio

async def foo(delay, nursery):
    print("Start foo.")
    # spawning a task via a nursery means *someone* awaits it
    nursery.start_soon(bar, delay)
    print("End foo.")

async def bar(delay):
    print("Start bar.")
    await trio.sleep(delay)
    print("End bar.")

async def main():
    # a task may spawn a nursery and pass it to child tasks
    async with trio.open_nursery() as nursery:
        await foo(2, nursery)

if __name__ == "__main__":
    trio.run(main)
While this pattern has been suggested for asyncio as TaskGroups, so far it has been deferred.
Various ports of the pattern for asyncio are available via third-party libraries, however.
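For readers on newer Python versions: asyncio eventually gained an equivalent, asyncio.TaskGroup (added in Python 3.11). A minimal sketch, assuming Python 3.11+:
import asyncio

async def bar(delay):
    print("Start bar.")
    await asyncio.sleep(delay)
    print("End bar.")

async def main():
    # the group awaits every task spawned inside it before the block exits
    async with asyncio.TaskGroup() as tg:
        print("Start foo.")
        tg.create_task(bar(2))
        print("End foo.")

asyncio.run(main())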

Asyncio: Start a non-blocking listening server

This is the basic TCP server from the asyncio tutorial:
import asyncio

class EchoServerClientProtocol(asyncio.Protocol):
    def connection_made(self, transport):
        peername = transport.get_extra_info('peername')
        print('Connection from {}'.format(peername))
        self.transport = transport

    def data_received(self, data):
        message = data.decode()
        print('Data received: {!r}'.format(message))

        print('Send: {!r}'.format(message))
        self.transport.write(data)

        print('Close the client socket')
        self.transport.close()

loop = asyncio.get_event_loop()
# Each client connection will create a new protocol instance
coro = loop.create_server(EchoServerClientProtocol, '127.0.0.1', 8888)
server = loop.run_until_complete(coro)

# Serve requests until CTRL+C is pressed
print('Serving on {}'.format(server.sockets[0].getsockname()))
try:
    loop.run_forever()
except KeyboardInterrupt:
    pass

# Close the server
server.close()
loop.run_until_complete(server.wait_closed())
loop.close()
Like all other examples I found, it uses the blocking loop.run_forever().
How do I start the listening server and do something else in the meantime?
I have tried moving the server startup into a function and starting that function with asyncio.async(), but with no success.
What am I missing here?
You can schedule several concurrent asyncio tasks before calling loop.run_forever().
@asyncio.coroutine
def other_task_coroutine():
    pass  # do something

start_tcp_server_task = loop.create_task(loop.create_server(
    EchoServerClientProtocol, '127.0.0.1', 8888))
other_task = loop.create_task(other_task_coroutine())

loop.run_forever()
When you call loop.create_task(loop.create_server()) or loop.create_task(other_task_coroutine()), nothing is actually executed: a coroutine object is created and wrapped in a task (consider a task to be a shell and the coroutine an instance of the code that will be executed in the task). The tasks are scheduled on the loop when created.
The loop will execute start_tcp_server_task first (as it's scheduled first) until a blocking IO event is pending or the passive socket is ready to listen for incoming connections.
You can see asyncio as a non-preemptible scheduler running on one CPU: once the first task interrupts itself or is done, the second task will be executed. Hence, when one task is executing, the other one has to wait until the running task finishes or yields (or "awaits" with Python 3.5). "Yielding" (yield from client.read()) or "awaiting" (await client.read()) means that the task hands control back to the loop's scheduler until client.read() can proceed (data is available on the socket).
Once the task has given control back to the loop, the loop can schedule the other pending tasks, process incoming events and schedule the tasks which were waiting for those events. Once there is nothing left to do, the loop performs the only blocking call of the process: it sleeps until the kernel notifies it that events are ready to be processed.
In this context, you must understand that when using asyncio, everything running in the process must run asynchronously so the loop can do its work. You cannot use multiprocessing objects in the loop.
Note that asyncio.async(coroutine(), loop=loop) is equivalent to loop.create_task(coroutine()).
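To make the idea concrete, here is a sketch of what other_task_coroutine could look like so that it cooperates with the server (same pre-3.5 generator syntax as the answer; the body is a hypothetical placeholder):
@asyncio.coroutine
def other_task_coroutine():
    while True:
        # do a small chunk of work...
        print('doing something else while the server keeps listening')
        # ...then hand control back to the loop so it can accept connections
        yield from asyncio.sleep(1)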
Additionally, you can consider running what you want in an executor.
For example:
coro = loop.create_server(EchoServerClientProtocol, '127.0.0.1', 8888)
server = loop.run_until_complete(coro)

async def execute(loop):
    # your_func_here and args are placeholders for your blocking function and its arguments
    await loop.run_in_executor(None, your_func_here, *args)

asyncio.async(execute(loop))
loop.run_forever()
The executor will run whatever function you want in a separate thread, which won't block your server.

Python asyncio force timeout

Using asyncio a coroutine can be executed with a timeout so it gets cancelled after the timeout:
@asyncio.coroutine
def coro():
    yield from asyncio.sleep(10)

loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.wait_for(coro(), 5))
The above example works as expected (it times out after 5 seconds).
However, when the coroutine doesn't use asyncio.sleep() (or other asyncio coroutines) it doesn't seem to time out. Example:
@asyncio.coroutine
def coro():
    import time
    time.sleep(10)

loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.wait_for(coro(), 1))
This takes more than 10 seconds to run because the time.sleep(10) isn't cancelled. Is it possible to enforce the cancellation of the coroutine in such a case?
If asyncio should be used to solve this, how could I do that?
No, you can't interrupt a coroutine unless it yields control back to the event loop, which means it needs to be inside a yield from call. asyncio is single-threaded, so when you're blocking on the time.sleep(10) call in your second example, there's no way for the event loop to run. That means when the timeout you set using wait_for expires, the event loop won't be able to take action on it. The event loop doesn't get an opportunity to run again until coro exits, at which point it's too late.
This is why in general, you should always avoid any blocking calls that aren't asynchronous; any time a call blocks without yielding to the event loop, nothing else in your program can execute, which is probably not what you want. If you really need to do a long, blocking operation, you should try to use BaseEventLoop.run_in_executor to run it in a thread or process pool, which will avoid blocking the event loop:
import asyncio
import time
from concurrent.futures import ProcessPoolExecutor

@asyncio.coroutine
def coro(loop):
    ex = ProcessPoolExecutor(2)
    yield from loop.run_in_executor(ex, time.sleep, 10)  # This can be interrupted.

loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.wait_for(coro(loop), 1))
Thanks @dano for your answer. If running a coroutine is not a hard requirement, here is a reworked, more compact version:
import asyncio, time

timeout = 0.5
loop = asyncio.get_event_loop()
future = asyncio.wait_for(loop.run_in_executor(None, time.sleep, 2), timeout)
try:
    loop.run_until_complete(future)
    print('Thx for letting me sleep')
except asyncio.exceptions.TimeoutError:
    print('I need more sleep !')
For the curious, a little debugging in my Python 3.8.2 showed that passing None as an executor results in the creation of a _default_executor, as follows:
self._default_executor = concurrent.futures.ThreadPoolExecutor()
The examples I've seen for timeout handling are very trivial. In reality, my app is a bit more complex. The sequence is:
When a client connects to the server, have the server create another connection to an internal server.
When the internal server connection is OK, wait for the client to send data. Based on this data we may make a query to the internal server.
When there is data to send to the internal server, send it. Since the internal server sometimes doesn't respond fast enough, wrap this request in a timeout.
If the operation times out, collapse all connections to signal the client about the error.
To achieve all of the above, while keeping the event loop running, the resulting code contains the following:
def connection_made(self, transport):
    self.client_lock_coro = self.client_lock.acquire()
    asyncio.ensure_future(self.client_lock_coro).add_done_callback(self._got_client_lock)

def _got_client_lock(self, task):
    task.result()  # True at this point, but the call will re-raise any exceptions
    coro = self.loop.create_connection(lambda: ClientProtocol(self),
                                       self.connect_info[0], self.connect_info[1])
    asyncio.ensure_future(asyncio.wait_for(coro,
                                           self.client_connect_timeout
                                           )).add_done_callback(self.connected_server)

def connected_server(self, task):
    transport, client_object = task.result()
    self.client_transport = transport
    self.client_lock.release()

def data_received(self, data_in):
    asyncio.ensure_future(self.send_to_real_server(message, self.client_send_timeout))

@asyncio.coroutine
def send_to_real_server(self, message, timeout=5.0):
    yield from self.client_lock.acquire()
    asyncio.ensure_future(asyncio.wait_for(self._send_to_real_server(message),
                                           timeout, loop=self.loop)
                          ).add_done_callback(self.sent_to_real_server)

@asyncio.coroutine
def _send_to_real_server(self, message):
    self.client_transport.write(message)

def sent_to_real_server(self, task):
    task.result()
    self.client_lock.release()
