I am new to Python and struggling to understand why my coroutine is not working.
In the current code, only one job runs and the other always stays idle. Why?
import asyncio

class Worker:
    def job1_sync(self):
        count = 0
        while True:
            print('JOB A:', count)
            count = count + 1

    def job2_sync(self):
        count = 0
        while True:
            print('JOB B:', count)
            count = count + 1

    async def job1(self):
        await self.job1_sync()

    async def job2(self):
        await self.job2_sync()

worker = Worker()
loop = asyncio.new_event_loop()
asyncio.set_event_loop(loop)
loop.run_until_complete(asyncio.gather(worker.job1(), worker.job2()))
asyncio does not do preemptive multitasking or multithreading. It schedules tasks within a single thread, using a cooperative model.
That is, the event loop gets control back only when the current task awaits something that would block, and only then can it schedule another task. Under the hood, async functions are coroutines, and await makes the coroutine yield to the event loop, which resumes it at a later point, when the awaited condition is met.
Here job1_sync never awaits anything (it is a plain function stuck in an infinite loop), so job1 never relinquishes control and the event loop never gets a chance to run the other task.
Now if your job was to actually relinquish control, say by triggering a delay, then your code would work:
async def job1_sync(self):  # note the async: only async functions can await
    count = 0
    while True:
        print('JOB A:', count)
        count = count + 1
        await asyncio.sleep(1)  # the event loop gets control back here
TLDR: asyncio is useful for what it says: doing stuff asynchronously, allowing other tasks to make progress while current task waits for something. Nothing runs in parallel.
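To make the cooperative hand-off visible, here is a minimal sketch (not the asker's code): the job coroutine, its steps parameter and the shared log list are names made up for illustration, and the loops are bounded so the program terminates. Each job records a step and then awaits asyncio.sleep(0), which just yields to the event loop.

```python
import asyncio

async def job(name, log, steps=3):
    # Hypothetical bounded version of JOB A / JOB B.
    for count in range(steps):
        log.append(f"{name}{count}")
        # Yield to the event loop so the other job can take a turn.
        await asyncio.sleep(0)

async def main():
    log = []
    # gather() wraps both coroutines in tasks and runs them on the same loop.
    await asyncio.gather(job("A", log), job("B", log))
    return log

if __name__ == "__main__":
    print(asyncio.run(main()))
```

The two jobs alternate because each one gives up control after every step. If the await is removed, the first job runs all of its steps before the second one starts.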
Related
Is there a way to call an async function from a sync one without waiting for it to complete?
My current tests:
Issue: Waits for test_timer_function to complete
async def test_timer_function():
    await asyncio.sleep(10)
    return

def main():
    print("Starting timer at {}".format(datetime.now()))
    asyncio.run(test_timer_function())
    print("Ending timer at {}".format(datetime.now()))
Issue: Does not call test_timer_function
async def test_timer_function():
    await asyncio.sleep(10)
    return

def main():
    print("Starting timer at {}".format(datetime.now()))
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    asyncio.ensure_future(test_timer_function())
    print("Ending timer at {}".format(datetime.now()))
Any suggestions?
Async functions do not run in the background: they always run in a single thread.
That means that when there are concurrent tasks in async code (normal async code), they are executed only when you give the asyncio loop a chance to run. This happens when your code uses await, async for or async with, or returns from a coroutine function that is running as a task.
In non-async code, you have to enter the loop and pass control to it in order for the async code to run. That is what asyncio.run does, and what asyncio.ensure_future does not: that call just registers a task to be executed whenever the asyncio loop has time for it. But you return from the function without ever passing control to the loop, so your program just finishes.
One thing that can be done is to start a secondary thread where the asyncio code will run: this thread runs its own asyncio loop, and you can communicate with the tasks in it by using global variables and normal thread data structures like queues.
The minimal changes for your code are:
import asyncio
import threading
from datetime import datetime

now = datetime.now

async def test_timer_function():
    await asyncio.sleep(2)
    print(f"ending async task at {now()}")
    return

def run_async_loop_in_thread():
    asyncio.run(test_timer_function())

def main():
    print(f"Starting timer at {now()}")
    t = threading.Thread(target=run_async_loop_in_thread)
    t.start()
    print(f"Ending timer at {now()}")
    return t

if __name__ == "__main__":
    t = main()
    t.join()
    print(f"asyncio thread exited normally at {now()}")
(Please, when posting Python code, include the import lines and the lines that call your functions so that the code actually runs: it is not a lot of boilerplate, unlike in some other languages, and it turns your snippets into complete, ready-to-run examples.)
Printout when running this snippet at the console:
Starting timer at 2022-10-20 16:47:45.211654
Ending timer at 2022-10-20 16:47:45.212630
ending async task at 2022-10-20 16:47:47.213464
asyncio thread exited normally at 2022-10-20 16:47:47.215417
The answer is simply no. It's not gonna happen in a single thread.
First issue:
In your first issue, main() is a sync function. It stops at the line asyncio.run(test_timer_function()) until the event loop finishes its work.
What is its only task? test_timer_function! This task does give control back to the event loop, but not to the caller, main. If the event loop had other tasks, they would cooperate with each other, but that cooperation happens among the tasks of the event loop, not between the event loop and its caller.
So it will wait 10 seconds. There is nothing else here that could use those 10 seconds to do its work.
Second issue:
You didn't even run the event loop. Check documentation for ensure_future.
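A minimal sketch of the second issue, under the assumption that a flag list is enough to observe whether the coroutine ran (the names mark, flags and demo are made up for illustration). Creating a task on a loop only registers it; nothing executes until the loop is actually run.

```python
import asyncio

async def mark(flags):
    # Hypothetical coroutine: records that it actually executed.
    flags.append("ran")

def demo():
    flags = []
    loop = asyncio.new_event_loop()
    try:
        # Registers the task on the loop; the coroutine body has not started yet.
        task = loop.create_task(mark(flags))
        before = list(flags)
        # Only now does the loop get control and execute the task.
        loop.run_until_complete(task)
        after = list(flags)
    finally:
        loop.close()
    return before, after

if __name__ == "__main__":
    print(demo())
```

The first element of the returned pair is empty and the second contains the flag, which mirrors why the version with ensure_future alone never called test_timer_function.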
I read the following code
async def f():
    sc_client = session.client("ec2")
    for id in ids:
        await IOLoop.current().run_in_executor(None, lambda: client.terminate(id))
How does it compare to the following code? Will client.terminate be run in parallel? But each execution is awaited?
for id in ids:
    client.terminate(id)
Will client.terminate be run in parallel?
No, it still runs sequentially.
IOLoop.current().run_in_executor runs the blocking function in a separate thread and returns an asyncio.Future; await then waits until the Future wrapping client.terminate finishes, and only then does the loop continue.
The difference between the two options you gave is:
If the program has other coroutines to run, then with the 1st option they are not blocked, while with the 2nd option they have to wait for your for loop to finish.
Here is an example to illustrate (it uses loop.run_in_executor to stand in for IOLoop.current().run_in_executor, for simplicity):
test.py:
import asyncio
import concurrent.futures
import time

def client_terminate(id):
    print(f"start terminate {id}")
    time.sleep(5)
    print(f"end terminate {id}")

async def f():
    loop = asyncio.get_running_loop()
    for id in range(2):
        with concurrent.futures.ThreadPoolExecutor() as pool:
            await loop.run_in_executor(pool, client_terminate, id)
            # client_terminate(id)

async def f2():
    await asyncio.sleep(1)
    print("other task")

async def main():
    await asyncio.gather(*[f(), f2()])

asyncio.run(main())
The run output is:
$ python3 test.py
start terminate 0
other task
end terminate 0
start terminate 1
end terminate 1
You can see that the two client_terminate calls in the for loop still run in sequence, BUT the function f2, which prints other task, runs between them: the for loop does not block the asyncio scheduler from scheduling f2.
Additional:
If you comment out the 2 lines related to await loop.run_in_executor and the thread pool, and call client_terminate(id) directly, the output will be:
$ python3 test.py
start terminate 0
end terminate 0
start terminate 1
end terminate 1
other task
This means that if you don't wrap the blocking function in a Future, the other task has to wait for your for loop to finish, which wastes time the event loop could have spent on it.
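If the goal were to have the terminate calls themselves overlap, one option (not part of the answer above, so treat it as a sketch) is to submit every call to the executor first and only then await them together with asyncio.gather. Here client_terminate is a hypothetical stand-in that sleeps for 0.3 seconds, and timed_run is a helper invented for the demonstration.

```python
import asyncio
import time

def client_terminate(id):
    # Hypothetical blocking call standing in for client.terminate(id).
    time.sleep(0.3)
    return id

async def terminate_all(ids):
    loop = asyncio.get_running_loop()
    # Submit every call to the default thread pool before awaiting any of them.
    futures = [loop.run_in_executor(None, client_terminate, i) for i in ids]
    # gather() waits for all futures and returns results in submission order.
    return await asyncio.gather(*futures)

def timed_run(ids):
    start = time.perf_counter()
    results = asyncio.run(terminate_all(ids))
    return results, time.perf_counter() - start

if __name__ == "__main__":
    print(timed_run([0, 1, 2]))
```

With three calls the elapsed time should be close to one sleep rather than three, because the blocking calls run in separate worker threads while the event loop stays free.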
I have 3 tasks:
def task_a():
    while True:
        file.write()
        asyncio.sleep(10)

def task_b():
    while True:
        file.write()
        asyncio.sleep(10)

def task_c():
    # do something

main.py:
try:
    loop = asyncio.get_event_loop()
    A = loop.create_task(task_a)
    B = loop.create_task(task_b)
    C = loop.create_task(task_c)
    awaitable_pending_tasks = asyncio.all_tasks()
    execution_group = asyncio.gather(*awaitable_pending_tasks, return_exceptions=True)
    fi_execution = loop.run_until_complete(execution_group)
finally:
    loop.run_forever()
I want to make sure that the loop is exited when the task_c is completed.
I tried loop.close() in finally, but since the code is async, the loop gets closed while the tasks are still running.
task_a and task_b write to a file, and another running process checks when the file was last modified. If that was more than a minute ago, it raises an error (which I don't want), so I put the write in a while loop and added a sleep() after it.
Once task_c is complete, I need the loop to stop.
Other answers on StackOverflow looked complicated to understand.
Any way we can do this?
You could call loop.run_until_complete or asyncio.run (but not run_forever) to run a function that prepares the tasks you need and then awaits only the one whose completion should terminate the loop (untested):
async def main():
    # task_a, task_b and task_c must be coroutine functions (async def),
    # and they must be called: create_task() expects a coroutine object
    asyncio.create_task(task_a())
    asyncio.create_task(task_b())
    await task_c()
    # task_c is done: cancel everything else that is still running
    tasks = set(asyncio.all_tasks()) - {asyncio.current_task()}
    for t in tasks:
        t.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)

asyncio.run(main())
# or asyncio.get_event_loop().run_until_complete(main())
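A self-contained sketch of the same idea, with made-up stand-ins for the three tasks: heartbeat plays the role of task_a and task_b (an endless loop that awaits asyncio.sleep), and finite_job plays the role of task_c. The short sleep intervals and the log list are assumptions chosen only to keep the demonstration quick.

```python
import asyncio

async def heartbeat(name, log):
    # Hypothetical stand-in for task_a / task_b: loops until cancelled.
    while True:
        log.append(name)
        await asyncio.sleep(0.01)

async def finite_job(log):
    # Hypothetical stand-in for task_c: finishes on its own.
    await asyncio.sleep(0.05)
    log.append("c done")

async def main():
    log = []
    # Keep references to the background tasks so they can be cancelled later.
    background = [
        asyncio.create_task(heartbeat("a", log)),
        asyncio.create_task(heartbeat("b", log)),
    ]
    await finite_job(log)
    for task in background:
        task.cancel()
    # return_exceptions=True collects each CancelledError instead of raising it.
    results = await asyncio.gather(*background, return_exceptions=True)
    return log, results

if __name__ == "__main__":
    log, results = asyncio.run(main())
    print(len(log), results)
```

Once main returns, asyncio.run closes the loop, so there is no need for run_forever or an explicit loop.close().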
I am writing a Python program that runs tasks taken from a queue concurrently, to learn asyncio.
Items will be put onto a queue by interacting with a main thread (within REPL).
Whenever a task is put onto the queue, it should be consumed and executed immediately.
My approach is to kick off a separate thread and pass a queue to the event loop within that thread.
The tasks are running but only sequentially and I am not clear on how to run the tasks concurrently. My attempt is as follows:
import asyncio
import time
import queue
import threading

def do_it(task_queue):
    '''Process tasks in the queue until the sentinel value is received'''
    _sentinel = 'STOP'

    def clock():
        return time.strftime("%X")

    async def process(name, total_time):
        status = f'{clock()} {name}_{total_time}:'
        print(status, 'START')
        current_time = time.time()
        end_time = current_time + total_time
        while current_time < end_time:
            print(status, 'processing...')
            await asyncio.sleep(1)
            current_time = time.time()
        print(status, 'DONE.')

    async def main():
        while True:
            item = task_queue.get()
            if item == _sentinel:
                break
            await asyncio.create_task(process(*item))

    print('event loop start')
    asyncio.run(main())
    print('event loop end')

if __name__ == '__main__':
    tasks = queue.Queue()
    th = threading.Thread(target=do_it, args=(tasks,))
    th.start()
    tasks.put(('abc', 5))
    tasks.put(('def', 3))
Any advice pointing me in the direction of running these tasks concurrently would be greatly appreciated!
Thanks
UPDATE
Thank you Frank Yellin and cynthi8! I have reworked main() according to your advice:
- removed await before asyncio.create_task, which fixed concurrency
- added a waiting while loop so that main would not return prematurely
- used the non-blocking mode of Queue.get()
The program now works as expected 👍
UPDATE 2
user4815162342 has offered further improvements, I have annotated his suggestions below.
'''
Starts an auxiliary thread which establishes a queue and consumes tasks from
that queue.
Allows enqueueing of tasks from within __main__ and termination of the aux thread
'''
import asyncio
import time
import threading

def do_it(started):
    '''Process tasks in the queue until the sentinel value is received'''
    _sentinel = 'STOP'

    def clock():
        return time.strftime("%X")

    async def process(name, total_time):
        print(f'{clock()} {name}_{total_time}:', 'Started.')
        current_time = time.time()
        end_time = current_time + total_time
        while current_time < end_time:
            print(f'{clock()} {name}_{total_time}:', 'Processing...')
            await asyncio.sleep(1)
            current_time = time.time()
        print(f'{clock()} {name}_{total_time}:', 'Done.')

    async def main():
        # get_running_loop() returns the event loop running in the current OS
        # thread; expose it and the queue to the __main__ thread
        started.loop = asyncio.get_running_loop()
        started.queue = task_queue = asyncio.Queue()
        started.set()
        while True:
            item = await task_queue.get()
            if item == _sentinel:
                # task_done is used to tell join when the work in the queue is
                # actually finished. A queue length of zero does not mean work
                # is complete.
                task_queue.task_done()
                break
            task = asyncio.create_task(process(*item))
            # Add a callback to be run when the Task is done.
            # Indicate that a formerly enqueued task is complete. Used by queue
            # consumers. For each get() used to fetch a task, a subsequent
            # call to task_done() tells the queue that the processing on the
            # task is complete.
            task.add_done_callback(lambda _: task_queue.task_done())
        # keep loop going until all the work has completed
        # When the count of unfinished tasks drops to zero, join() unblocks.
        await task_queue.join()

    print('event loop start')
    asyncio.run(main())
    print('event loop end')

if __name__ == '__main__':
    # started Event is used for communication with thread th
    started = threading.Event()
    th = threading.Thread(target=do_it, args=(started,))
    th.start()
    # started.wait() blocks until started.set(), ensuring that the tasks and
    # loop variables are available from the event loop thread
    started.wait()
    tasks, loop = started.queue, started.loop
    # call_soon schedules a callback to be called with the given arguments
    # at the next iteration of the event loop.
    # call_soon_threadsafe is required to schedule callbacks from another thread
    # put_nowait enqueues items without blocking
    loop.call_soon_threadsafe(tasks.put_nowait, ('abc', 5))
    loop.call_soon_threadsafe(tasks.put_nowait, ('def', 3))
    loop.call_soon_threadsafe(tasks.put_nowait, 'STOP')
As others pointed out, the problem with your code is that it uses a blocking queue which halts the event loop while waiting for the next item. The problem with the proposed solution, however, is that it introduces latency because it must occasionally sleep to allow other tasks to run. In addition to introducing latency, it prevents the program from ever going to sleep, even when there are no items in the queue.
An alternative is to switch to asyncio queue which is designed for use with asyncio. This queue must be created inside the running loop, so you can't pass it to do_it, you must retrieve it. Also, since it's an asyncio primitive, its put method must be invoked through call_soon_threadsafe to ensure that the event loop notices it.
One final issue is that your main() function uses another busy loop to wait for all the tasks to complete. This can be avoided by using Queue.join, which is explicitly designed for this use case.
Here is your code adapted to incorporate all of the above suggestions, with the process function remaining unchanged from your original:
import asyncio
import time
import threading

def do_it(started):
    '''Process tasks in the queue until the sentinel value is received'''
    _sentinel = 'STOP'

    def clock():
        return time.strftime("%X")

    async def process(name, total_time):
        status = f'{clock()} {name}_{total_time}:'
        print(status, 'START')
        current_time = time.time()
        end_time = current_time + total_time
        while current_time < end_time:
            print(status, 'processing...')
            await asyncio.sleep(1)
            current_time = time.time()
        print(status, 'DONE.')

    async def main():
        started.loop = asyncio.get_running_loop()
        started.queue = task_queue = asyncio.Queue()
        started.set()
        while True:
            item = await task_queue.get()
            if item == _sentinel:
                task_queue.task_done()
                break
            task = asyncio.create_task(process(*item))
            task.add_done_callback(lambda _: task_queue.task_done())
        await task_queue.join()

    print('event loop start')
    asyncio.run(main())
    print('event loop end')

if __name__ == '__main__':
    started = threading.Event()
    th = threading.Thread(target=do_it, args=(started,))
    th.start()
    started.wait()
    tasks, loop = started.queue, started.loop
    loop.call_soon_threadsafe(tasks.put_nowait, ('abc', 5))
    loop.call_soon_threadsafe(tasks.put_nowait, ('def', 3))
    loop.call_soon_threadsafe(tasks.put_nowait, 'STOP')
Note: an unrelated issue with your code was that it awaited the result of create_task(), which nullified the usefulness of create_task() because it wasn't allowed to run in the background. (It would be equivalent to immediately joining a thread you've just started - you can do it, but it doesn't make much sense.) This issue is fixed both in the above code and in your edit to the question.
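The difference between awaiting each create_task() immediately and letting the tasks run in the background can be seen in a small sketch. The process coroutine here is a simplified, hypothetical version that only records its start and end in a shared log, and sequential and concurrent are names invented for the comparison.

```python
import asyncio

async def process(name, log):
    # Hypothetical simplified job: record start, yield once, record completion.
    log.append(f"{name} start")
    await asyncio.sleep(0)
    log.append(f"{name} done")

async def sequential(names):
    log = []
    for name in names:
        # Awaiting right away means the next task is not even created
        # until this one has finished.
        await asyncio.create_task(process(name, log))
    return log

async def concurrent(names):
    log = []
    # Create every task first so they can all make progress, then wait once.
    tasks = [asyncio.create_task(process(name, log)) for name in names]
    await asyncio.gather(*tasks)
    return log

if __name__ == "__main__":
    print(asyncio.run(sequential(["abc", "def"])))
    print(asyncio.run(concurrent(["abc", "def"])))
```

In the sequential variant each job starts and finishes before the next begins; in the concurrent variant both jobs start before either one is done.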
There are two problems with your code.
First, you should not have the await before the asyncio.create_task. This is possibly what is causing your code to run synchronously.
Then, once you've made your code run asynchronously, you need something after the while loop in main so that the code doesn't return immediately, but instead waits for all the jobs to finish. Another Stack Overflow answer recommends the following (written here with asyncio.all_tasks(), since asyncio.Task.all_tasks() was removed in Python 3.9):
while len(asyncio.all_tasks()) > 1:  # Any task besides main() itself?
    await asyncio.sleep(0.2)
Alternatively there are versions of Queue that can keep track of running tasks.
As an additional problem:
If a queue.Queue is empty, get() blocks by default and does not return a sentinel string. https://docs.python.org/3/library/queue.html
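A minimal sketch of the non-blocking side of queue.Queue: get_nowait() (equivalent to get(block=False)) raises queue.Empty instead of waiting when nothing is available. The drain helper is hypothetical and only illustrates the exception handling.

```python
import queue

def drain(q):
    # Hypothetical helper: collect everything currently in the queue
    # without ever blocking the calling thread.
    items = []
    while True:
        try:
            items.append(q.get_nowait())
        except queue.Empty:
            # Nothing left right now; a plain get() would have blocked here.
            break
    return items

if __name__ == "__main__":
    q = queue.Queue()
    q.put(("abc", 5))
    q.put(("def", 3))
    print(drain(q))
    print(drain(q))
```

Inside a coroutine this kind of polling still needs an await between attempts (for example asyncio.sleep), otherwise the event loop never gets a turn.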
I'm trying to build a polling mechanism for a long-running task in Python. To do this, I'm using a Future and polling it with .done(). The task consists of many iterations that are themselves blocking, which I wrapped in an async function. I don't have access to the code of the blocking functions, as I'm calling third-party software. This is a minimal example of my current approach:
import asyncio
import time

async def blocking_iteration():
    time.sleep(1)

async def long_running():
    for i in range(5):
        print(f"sleeping {i}")
        await blocking_iteration()

async def poll_run():
    future = asyncio.ensure_future(long_running())
    while not future.done():
        print("before polling")
        await asyncio.sleep(0.05)
        print("polling")
    future.result()

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    loop.run_until_complete(poll_run())
    loop.close()
The result of this is:
before polling
sleeping 0
sleeping 1
sleeping 2
sleeping 3
sleeping 4
polling
From my current understanding of the asyncio mechanism in Python, I had expected the loop to unblock after the first sleep, return control to the loop that would go back to the poll_run await statement and would only run the second iteration of the long_running function after the subsequent poll.
So desired output is something like this:
before polling
sleeping 0
polling
before polling
sleeping 1
polling
before polling
sleeping 2
polling
before polling
sleeping 3
polling
before polling
sleeping 4
polling
Can this be achieved with the current approach somehow, or is it possible in a different way?
EDIT
Thanks to @drjackild I was able to solve it by changing
async def blocking_iteration():
    time.sleep(1)
into
def blocking():
    time.sleep(1)

async def blocking_iteration():
    loop = asyncio.get_event_loop()
    await loop.run_in_executor(None, blocking)
time.sleep is synchronous and blocks the whole main thread while it executes. If you have such blocking calls in your program, you can avoid blocking the event loop with thread or process pool executors (you can read about it here). Alternatively, change your blocking_iteration to use asyncio.sleep instead of time.sleep.
UPD. Just to make it clear, here is a non-blocking version, which uses loop.run_in_executor with the default executor. Please note that blocking_iteration is now a plain function, without async.
import asyncio
import time

def blocking_iteration():
    time.sleep(1)

async def long_running():
    loop = asyncio.get_running_loop()
    for i in range(5):
        print(f"sleeping {i}")
        # run the blocking call in the default thread pool so the loop stays free
        await loop.run_in_executor(None, blocking_iteration)

async def poll_run():
    task = asyncio.create_task(long_running())
    while not task.done():
        print("before polling")
        await asyncio.sleep(0.05)
        print("polling")
    print(task.result())

if __name__ == '__main__':
    asyncio.run(poll_run())
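For the other option mentioned above, replacing time.sleep with asyncio.sleep, here is a small sketch. It only applies when the blocking call can actually be swapped for an awaitable one, which is not the case for the third-party code in the question; the log list, the steps parameter and the shortened intervals are assumptions made to keep the example quick.

```python
import asyncio

async def long_running(log, steps=3):
    for i in range(steps):
        log.append(f"sleeping {i}")
        # Awaitable sleep: the loop is free to run the poller meanwhile.
        await asyncio.sleep(0.05)

async def poll_run():
    log = []
    task = asyncio.create_task(long_running(log))
    while not task.done():
        await asyncio.sleep(0.01)
        log.append("polling")
    # Re-raises any exception from long_running(); returns None otherwise.
    task.result()
    return log

if __name__ == "__main__":
    print(asyncio.run(poll_run()))
```

Polling entries now appear between the individual iterations, which is the interleaving the question was after.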