Using asyncio a coroutine can be executed with a timeout so it gets cancelled after the timeout:
#asyncio.coroutine
def coro():
yield from asyncio.sleep(10)
loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.wait_for(coro(), 5))
The above example works as expected (it times out after 5 seconds).
However, when the coroutine doesn't use asyncio.sleep() (or other asyncio coroutines) it doesn't seem to time out. Example:
#asyncio.coroutine
def coro():
import time
time.sleep(10)
loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.wait_for(coro(), 1))
This takes more than 10 seconds to run because the time.sleep(10) isn't cancelled. Is it possible to enforce the cancellation of the coroutine in such a case?
If asyncio should be used to solve this, how could I do that?
No, you can't interrupt a coroutine unless it yields control back to the event loop, which means it needs to be inside a yield from call. asyncio is single-threaded, so when you're blocking on the time.sleep(10) call in your second example, there's no way for the event loop to run. That means when the timeout you set using wait_for expires, the event loop won't be able to take action on it. The event loop doesn't get an opportunity to run again until coro exits, at which point its too late.
This is why in general, you should always avoid any blocking calls that aren't asynchronous; any time a call blocks without yielding to the event loop, nothing else in your program can execute, which is probably not what you want. If you really need to do a long, blocking operation, you should try to use BaseEventLoop.run_in_executor to run it in a thread or process pool, which will avoid blocking the event loop:
import asyncio
import time
from concurrent.futures import ProcessPoolExecutor
#asyncio.coroutine
def coro(loop):
ex = ProcessPoolExecutor(2)
yield from loop.run_in_executor(ex, time.sleep, 10) # This can be interrupted.
loop = asyncio.get_event_loop()
loop.run_until_complete(asyncio.wait_for(coro(loop), 1))
Thx #dano for your answer. If running a coroutine is not a hard requirement, here is a reworked, more compact version
import asyncio, time
timeout = 0.5
loop = asyncio.get_event_loop()
future = asyncio.wait_for(loop.run_in_executor(None, time.sleep, 2), timeout)
try:
loop.run_until_complete(future)
print('Thx for letting me sleep')
except asyncio.exceptions.TimeoutError:
print('I need more sleep !')
For the curious, a little debugging in my Python 3.8.2 showed that passing None as an executor results in the creation of a _default_executor, as follows:
self._default_executor = concurrent.futures.ThreadPoolExecutor()
The examples I've seen for timeout handling are very trivial. Given reality, my app is bit more complex. The sequence is:
When a client connects to server, have the server create another connection to internal server
When the internal server connection is ok, wait for the client to send data. Based on this data we may make a query to internal server.
When there is data to send to internal server, send it. Since internal server sometimes doesn't respond fast enough, wrap this request into a timeout.
If the operation times out, collapse all connections to signal the client about error
To achieve all of the above, while keeping the event loop running, the resulting code contains following code:
def connection_made(self, transport):
self.client_lock_coro = self.client_lock.acquire()
asyncio.ensure_future(self.client_lock_coro).add_done_callback(self._got_client_lock)
def _got_client_lock(self, task):
task.result() # True at this point, but call there will trigger any exceptions
coro = self.loop.create_connection(lambda: ClientProtocol(self),
self.connect_info[0], self.connect_info[1])
asyncio.ensure_future(asyncio.wait_for(coro,
self.client_connect_timeout
)).add_done_callback(self.connected_server)
def connected_server(self, task):
transport, client_object = task.result()
self.client_transport = transport
self.client_lock.release()
def data_received(self, data_in):
asyncio.ensure_future(self.send_to_real_server(message, self.client_send_timeout))
def send_to_real_server(self, message, timeout=5.0):
yield from self.client_lock.acquire()
asyncio.ensure_future(asyncio.wait_for(self._send_to_real_server(message),
timeout, loop=self.loop)
).add_done_callback(self.sent_to_real_server)
#asyncio.coroutine
def _send_to_real_server(self, message):
self.client_transport.write(message)
def sent_to_real_server(self, task):
task.result()
self.client_lock.release()
Related
Similar Question (but answer does not work for me): How to cancel long-running subprocesses running using concurrent.futures.ProcessPoolExecutor?
Unlike the question linked above and the solution provided, in my case the computation itself is rather long (CPU bound) and cannot be run in a loop to check if some event has happened.
Reduced version of the code below:
import asyncio
import concurrent.futures as futures
import time
class Simulator:
def __init__(self):
self._loop = None
self._lmz_executor = None
self._tasks = []
self._max_execution_time = time.monotonic() + 60
self._long_running_tasks = []
def initialise(self):
# Initialise the main asyncio loop
self._loop = asyncio.get_event_loop()
self._loop.set_default_executor(
futures.ThreadPoolExecutor(max_workers=3))
# Run separate processes of long computation task
self._lmz_executor = futures.ProcessPoolExecutor(max_workers=3)
def run(self):
self._tasks.extend(
[self.bot_reasoning_loop(bot_id) for bot_id in [1, 2, 3]]
)
try:
# Gather bot reasoner tasks
_reasoner_tasks = asyncio.gather(*self._tasks)
# Send the reasoner tasks to main monitor task
asyncio.gather(self.sample_main_loop(_reasoner_tasks))
self._loop.run_forever()
except KeyboardInterrupt:
pass
finally:
self._loop.close()
async def sample_main_loop(self, reasoner_tasks):
"""This is the main monitor task"""
await asyncio.wait_for(reasoner_tasks, None)
for task in self._long_running_tasks:
try:
await asyncio.wait_for(task, 10)
except asyncio.TimeoutError:
print("Oops. Some long operation timed out.")
task.cancel() # Doesn't cancel and has no effect
task.set_result(None) # Doesn't seem to have an effect
self._lmz_executor.shutdown()
self._loop.stop()
print('And now I am done. Yay!')
async def bot_reasoning_loop(self, bot):
import math
_exec_count = 0
_sleepy_time = 15
_max_runs = math.floor(self._max_execution_time / _sleepy_time)
self._long_running_tasks.append(
self._loop.run_in_executor(
self._lmz_executor, really_long_process, _sleepy_time))
while time.monotonic() < self._max_execution_time:
print("Bot#{}: thinking for {}s. Run {}/{}".format(
bot, _sleepy_time, _exec_count, _max_runs))
await asyncio.sleep(_sleepy_time)
_exec_count += 1
print("Bot#{} Finished Thinking".format(bot))
def really_long_process(sleepy_time):
print("I am a really long computation.....")
_large_val = 9729379273492397293479237492734 ** 344323
print("I finally computed this large value: {}".format(_large_val))
if __name__ == "__main__":
sim = Simulator()
sim.initialise()
sim.run()
The idea is that there is a main simulation loop that runs and monitors three bot threads. Each of these bot threads then perform some reasoning but also start a really long background process using ProcessPoolExecutor, which may end up running longer their own threshold/max execution time for reasoning on things.
As you can see in the code above, I attempted to .cancel() these tasks when a timeout occurs. Though this is not really cancelling the actual computation, which keeps happening in the background and the asyncio loop doesn't terminate until after all the long running computation have finished.
How do I terminate such long running CPU-bound computations within a method?
Other similar SO questions, but not necessarily related or helpful:
asyncio: Is it possible to cancel a future been run by an Executor?
How to terminate a single async task in multiprocessing if that single async task exceeds a threshold time in Python
Asynchronous multiprocessing with a worker pool in Python: how to keep going after timeout?
How do I terminate such long running CPU-bound computations within a method?
The approach you tried doesn't work because the futures returned by ProcessPoolExecutor are not cancellable. Although asyncio's run_in_executor tries to propagate the cancellation, it is simply ignored by Future.cancel once the task starts executing.
There is no fundamental reason for that. Unlike threads, processes can be safely terminated, so it would be perfectly possible for ProcessPoolExecutor.submit to return a future whose cancel terminated the corresponding process. Asyncio coroutines have well-defined cancellation semantics and could automatically make use of it. Unfortunately, ProcessPoolExecutor.submit returns a regular concurrent.futures.Future, which assumes the lowest common denominator of the underlying executors, and treats a running future as untouchable.
As a result, to cancel tasks executed in subprocesses, one must circumvent the ProcessPoolExecutor altogether and manage one's own processes. The challenge is how to do this without reimplementing half of multiprocessing. One option offered by the standard library is to (ab)use multiprocessing.Pool for this purpose, because it supports reliable shutdown of worker processes. A CancellablePool could work as follows:
Instead of spawning a fixed number of processes, spawn a fixed number of 1-worker pools.
Assign tasks to pools from an asyncio coroutine. If the coroutine is canceled while waiting for the task to finish in the other process, terminate the single-process pool and create a new one.
Since everything is coordinated from the single asyncio thread, don't worry about race conditions such as accidentally killing a process which has already started executing another task. (This would need to be prevented if one were to support cancellation in ProcessPoolExecutor.)
Here is a sample implementation of that idea:
import asyncio
import multiprocessing
class CancellablePool:
def __init__(self, max_workers=3):
self._free = {self._new_pool() for _ in range(max_workers)}
self._working = set()
self._change = asyncio.Event()
def _new_pool(self):
return multiprocessing.Pool(1)
async def apply(self, fn, *args):
"""
Like multiprocessing.Pool.apply_async, but:
* is an asyncio coroutine
* terminates the process if cancelled
"""
while not self._free:
await self._change.wait()
self._change.clear()
pool = usable_pool = self._free.pop()
self._working.add(pool)
loop = asyncio.get_event_loop()
fut = loop.create_future()
def _on_done(obj):
loop.call_soon_threadsafe(fut.set_result, obj)
def _on_err(err):
loop.call_soon_threadsafe(fut.set_exception, err)
pool.apply_async(fn, args, callback=_on_done, error_callback=_on_err)
try:
return await fut
except asyncio.CancelledError:
pool.terminate()
usable_pool = self._new_pool()
finally:
self._working.remove(pool)
self._free.add(usable_pool)
self._change.set()
def shutdown(self):
for p in self._working | self._free:
p.terminate()
self._free.clear()
A minimalistic test case showing cancellation:
def really_long_process():
print("I am a really long computation.....")
large_val = 9729379273492397293479237492734 ** 344323
print("I finally computed this large value: {}".format(large_val))
async def main():
loop = asyncio.get_event_loop()
pool = CancellablePool()
tasks = [loop.create_task(pool.apply(really_long_process))
for _ in range(5)]
for t in tasks:
try:
await asyncio.wait_for(t, 1)
except asyncio.TimeoutError:
print('task timed out and cancelled')
pool.shutdown()
asyncio.get_event_loop().run_until_complete(main())
Note how the CPU usage never exceeds 3 cores, and how it starts dropping near the end of the test, indicating that the processes are being terminated as expected.
To apply it to the code from the question, make self._lmz_executor an instance of CancellablePool and change self._loop.run_in_executor(...) to self._loop.create_task(self._lmz_executor.apply(...)).
I am writing a client-server application. While connected, client sends to the server a "heartbeat" signal, for example, every second.
On the server-side I need a mechanism where I can add tasks (or coroutines or something else) to be executed asynchronously. Moreover, I want to cancel tasks from a client, when it stops sending that "heartbeat" signal.
In other words, when the server starts a task it has kind of timeout or ttl, in example 3 seconds. When the server receives the "heartbeat" signal it resets timer for another 3 seconds until task is done or client disconnected (stops send the signal).
Here is an example of canceling a task from asyncio tutorial on pymotw.com. But here the task is canceled before the event_loop started, which is not suitable for me.
import asyncio
async def task_func():
print('in task_func')
return 'the result'
event_loop = asyncio.get_event_loop()
try:
print('creating task')
task = event_loop.create_task(task_func())
print('canceling task')
task.cancel()
print('entering event loop')
event_loop.run_until_complete(task)
print('task: {!r}'.format(task))
except asyncio.CancelledError:
print('caught error from cancelled task')
else:
print('task result: {!r}'.format(task.result()))
finally:
event_loop.close()
You can use asyncio Task wrappers to execute a task via the ensure_future() method.
ensure_future will automatically wrap your coroutine in a Task wrapper and attach it to your event loop. The Task wrapper will then also ensure that the coroutine 'cranks-over' from await to await statement (or until the coroutine finishes).
In other words, just pass a regular coroutine to ensure_future and assign the resultant Task object to a variable. You can then call Task.cancel() when you need to stop it.
import asyncio
async def task_func():
print('in task_func')
# if the task needs to run for a while you'll need an await statement
# to provide a pause point so that other coroutines can run in the mean time
await some_db_or_long_running_background_coroutine()
# or if this is a once-off thing, then return the result,
# but then you don't really need a Task wrapper...
# return 'the result'
async def my_app():
my_task = None
while True:
await asyncio.sleep(0)
# listen for trigger / heartbeat
if heartbeat and my_task is None:
my_task = asyncio.ensure_future(task_func())
# also listen for termination of hearbeat / connection
elif not heartbeat and my_task:
if not my_task.cancelled():
my_task.cancel()
else:
my_task = None
run_app = asyncio.ensure_future(my_app())
event_loop = asyncio.get_event_loop()
event_loop.run_forever()
Note that tasks are meant for long-running tasks that need to keep working in the background without interrupting the main flow. If all you need is a quick once-off method, then just call the function directly instead.
#asyncio.coroutine
def listener():
while True:
message = yield from websocket.recieve_message()
if message:
yield from handle(message)
loop = asyncio.get_event_loop()
loop.run_until_complete(listener())
Let's say i'm using websockets with asyncio. That means I recieve messages from websockets. And when I recieve a message, I want to handle the message but I'm loosing all the async thing with my code. Because the yield from handle(message) is definetly blocking... How could I find a way to make it non-blocking ? Like, handle multiple messages in the same time. Not having to wait the message to be handled before I can handle another message.
Thanks.
If you don't care about the return value from handle message, you can simply create a new Task for it, which will run in the event loop alongside your websocket reader. Here is a simple example:
#asyncio.coroutine
def listener():
while True:
message = yield from websocket.recieve_message()
if message:
asyncio.ensure_future(handle(message))
ensure_future will create a task and attach it to the default event loop. Since the loop is already running, it will get processed alongside your websocket reader in parallel. In fact, if it is a slow-running I/O blocked task (like sending an email), you could easily have a few dozen handle(message) tasks running at once. They are created dynamically when needed, and destroyed when finished (with much lower overhead than spawning threads).
If you want a bit more control, you could simply write to an asyncio.Queue in the reader and have a task pool of a fixed size that can consume the queue, a typical pattern in multi-threaded or multi-process programming.
#asyncio.coroutine
def consumer(queue):
while True:
message = yield from queue.get()
yield from handle(message)
#asyncio.coroutine
def listener(queue):
for i in range(5):
asyncio.ensure_future(consumer(queue))
while True:
message = yield from websocket.recieve_message()
if message:
yield from q.put(message)
q = asyncio.Queue()
loop = asyncio.get_event_loop()
loop.run_until_complete(listener(q))
I have a TCP server implemented in Python using asyncio's create_server.
I call the coroutine start_server with a connection_handler_cb.
Now my question is this: let's say my connection_handler_cb looks something
like this:
def connection_handler_cb(reader, writer):
while True:
yield from reader.read()
--do some computation--
I know that only the yield from coroutines are being run "concurrently" (I know it's not really concurrent), all the "--do some computation--" part is being called sequentially and is preventing everything else from running in the loop.
Let's say we are talking about a TCP server with multiple clients trying to send. Can this situation cause send timeout from the other side - the client side?
If your clients are waiting for a response from the server, and that response isn't sent until the computation is done, then it's possible the clients could eventually timeout, if the computations took long enough. More likely, though, is that the clients will just hang until the computations are done and the event loop gets unblocked.
In any case, if you're worried about timeouts or hangs, use loop.run_in_executor to run your computations in a background process (this is preferable), or thread (probably not a good choice since you're doing CPU-bound computations) without blocking the event loop:
import asyncio
import multiprocessing
from concurrent.futures import ProcessPoolExecutor
def comp_func(arg1, arg2):
# Do computation here
return output
def connection_handler_cb(reader, writer):
while True:
yield from reader.read()
# Do computation in a background process
# This won't block the event loop.
output = yield from loop.run_in_executor(None, comp_func, arg1, arg2) #
if __name__ == "__main__":
executor =
loop = asyncio.get_event_loop()
loop.set_default_executor(
ProcessPoolExecutor(multiprocessing.cpu_count()))
asyncio.async(asyncio.start_server(connect_handler_cb, ...))
loop.run_forever()
I am trying to understand the asyncio library, specifically with using sockets. I have written some code in an attempt to gain understanding,
I wanted to run a sender and a receiver sockets asynchrounously. I got to the point where I get all data sent up till the last one, but then I have to run one more loop. Looking at how to do this, I found this link from stackoverflow, which I implemented below -- but what is going on here? Is there a better/more sane way to do this than to call stop followed by run_forever?
The documentation for stop() in the event loop is:
Stop running the event loop.
Every callback scheduled before stop() is called will run. Callbacks scheduled after stop() is called will not run. However, those callbacks will run if run_forever() is called again later.
And run_forever()'s documentation is:
Run until stop() is called.
Questions:
why in the world is run_forever the only way to run_once? This doesn't even make sense
Is there a better way to do this?
Does my code look like a reasonable way to program with the asyncio library?
Is there a better way to add tasks to the event loop besides asyncio.async()? loop.create_task gives an error on my Linux system.
https://gist.github.com/cloudformdesign/b30e0860497f19bd6596
The stop(); run_forever() trick works because of how stop is implemented:
def stop(self):
"""Stop running the event loop.
Every callback scheduled before stop() is called will run.
Callback scheduled after stop() is called won't. However,
those callbacks will run if run() is called again later.
"""
self.call_soon(_raise_stop_error)
def _raise_stop_error(*args):
raise _StopError
So, next time the event loop runs and executes pending callbacks, it's going to call _raise_stop_error, which raises _StopError. The run_forever loop will break only on that specific exception:
def run_forever(self):
"""Run until stop() is called."""
if self._running:
raise RuntimeError('Event loop is running.')
self._running = True
try:
while True:
try:
self._run_once()
except _StopError:
break
finally:
self._running = False
So, by scheduling a stop() and then calling run_forever, you end up running one iteration of the event loop, then stopping once it hits the _raise_stop_error callback. You may have also noticed that _run_once is defined and called by run_forever. You could call that directly, but that can sometimes block if there aren't any callbacks ready to run, which may not be desirable. I don't think there's a cleaner way to do this currently - That answer was provided by Andrew Svetlov, who is an asyncio contributor; he would probably know if there's a better option. :)
In general, your code looks reasonable, though I think that you shouldn't be using this run_once approach to begin with. It's not deterministic; if you had a longer list or a slower system, it might require more than two extra iterations to print everything. Instead, you should just send a sentinel that tells the receiver to shut down, and then wait on both the send and receive coroutines to finish:
import sys
import time
import socket
import asyncio
addr = ('127.0.0.1', 1064)
SENTINEL = b"_DONE_"
# ... (This stuff is the same)
#asyncio.coroutine
def sending(addr, dataiter):
loop = asyncio.get_event_loop()
for d in dataiter:
print("Sending:", d)
sock = socket.socket()
yield from send_close(loop, sock, addr, str(d).encode())
# Send a sentinel
sock = socket.socket()
yield from send_close(loop, sock, addr, SENTINEL)
#asyncio.coroutine
def receiving(addr):
loop = asyncio.get_event_loop()
sock = socket.socket()
try:
sock.setblocking(False)
sock.bind(addr)
sock.listen(5)
while True:
data = yield from accept_recv(loop, sock)
if data == SENTINEL: # Got a sentinel
return
print("Recevied:", data)
finally: sock.close()
def main():
loop = asyncio.get_event_loop()
# add these items to the event loop
recv = asyncio.async(receiving(addr), loop=loop)
send = asyncio.async(sending(addr, range(10)), loop=loop)
loop.run_until_complete(asyncio.wait([recv, send]))
main()
Finally, asyncio.async is the right way to add tasks to the event loop. create_task was added in Python 3.4.2, so if you have an earlier version it won't exist.