python asyncio add tasks dynamically

I'm learning asyncio in Python and want to try something with it. My goal is to continuously read input from the user and use that input to create a job/task that runs asynchronously.
import asyncio
import os

loop = asyncio.get_event_loop()

async def action():
    inp = int(input('enter: '))
    await asyncio.sleep(inp)
    os.system(f"say '{inp} seconds waited'")

async def main():
    while True:
        await asyncio.ensure_future(action())

try:
    asyncio.run(main())
except Exception as e:
    print(str(e))
finally:
    loop.close()
I'm messing something up and want to know how to achieve this. Every time the user enters a number, the script needs to sleep for the given time, then speak out that it has waited. This entire thing needs to be concurrent: if the user enters 100, the script should start a task that sleeps for 100 seconds, but on the user's side it should prompt for the next input immediately.

The main problem with your code is that you call input() directly in your async function. input is a blocking function and does not return until a newline or end-of-file is read. This is a problem because Python asynchronous code is still single-threaded: while a blocking function runs, nothing else can execute. You need to use run_in_executor in this case.
Another problem with your code, although not directly relevant to your question, is that you mixed the pre-Python 3.7 way of invoking an event loop with the Python 3.7+ way. Per the documentation, asyncio.run is used on its own. If you want to use the pre-3.7 way of invoking a loop, the correct way is
loop = asyncio.get_event_loop()
loop.run_until_complete(main())
or
loop = asyncio.get_event_loop()
asyncio.ensure_future(main())
loop.run_forever()
Since you have a while True in your main(), there's no difference between run_until_complete and run_forever.
Lastly, there is no point in using ensure_future() in your main(). The point of ensure_future is to give a "normal" (i.e. non-async) function a way to schedule things onto the event loop, since it can't use the await keyword. Another reason to use ensure_future is to schedule many IO-bound tasks (e.g. network requests) without waiting for their results. Since you are awaiting the function call, there is naturally no point in using ensure_future here.
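For example, here is a minimal sketch of that second use, scheduling many tasks up front and only collecting the results later (fetch is a stand-in coroutine simulated with sleep, not part of the question's code):

import asyncio

async def fetch(n):
    # Stand-in for an IO-bound operation such as a network request.
    await asyncio.sleep(1)
    return n

async def main():
    # Schedule many tasks without awaiting each one individually...
    tasks = [asyncio.ensure_future(fetch(n)) for n in range(10)]
    # ...then wait for all of them at once.
    print(await asyncio.gather(*tasks))

asyncio.run(main())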
Here's the modified version:
import asyncio
import os

async def action():
    loop = asyncio.get_running_loop()
    inp = await loop.run_in_executor(None, input, 'Enter a number: ')
    await asyncio.sleep(int(inp))
    os.system(f"say '{inp} seconds waited'")

async def main():
    while True:
        await action()

asyncio.run(main())
In this version, while waiting for user input, execution alternates between await action() and await loop.run_in_executor(). When no other tasks are scheduled, the event loop is mostly idle. However, when other things are scheduled (simulated here with await sleep()), control is naturally transferred to those long-running scheduled tasks.
One key to Python async programming is that you must ensure control is transferred back to the event loop once in a while so other scheduled work can run. This happens whenever an await is encountered. In your original code, the interpreter gets stuck at input() and never gets a chance to go back to the event loop, which is why no other scheduled tasks ever execute until a user input is provided.
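A small self-contained sketch of that difference, with time.sleep standing in for a blocking call like input():

import asyncio
import time

async def background():
    for i in range(3):
        print(f"background tick {i}")
        await asyncio.sleep(1)  # await hands control back to the event loop

async def blocker():
    time.sleep(3)  # blocking: the event loop is frozen for 3 seconds
    print("blocker done")

async def main():
    task = asyncio.create_task(background())
    await blocker()  # no ticks appear while this blocks
    await task       # the ticks only run once control returns to the loop

asyncio.run(main())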

You can try something like this:
import asyncio

WORKERS = 10

async def worker(q):
    while True:
        t = await q.get()
        await asyncio.sleep(t)
        q.task_done()
        print(f"say '{t} seconds waited'")

async def main():
    q = asyncio.Queue()
    tasks = []
    for _ in range(WORKERS):
        tasks.append(asyncio.create_task(worker(q)))
    print('Keep inserting numbers, "q" to quit...')
    while (number := await asyncio.to_thread(input)) != "q":
        q.put_nowait(int(number))
    await q.join()
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)

if __name__ == "__main__":
    asyncio.run(main())
Test:
$ python test.py
Keep inserting numbers, "q" to quit...
1
say '1 seconds waited'
3
2
1
say '1 seconds waited'
say '2 seconds waited'
say '3 seconds waited'
q
Note: Python 3.9+ is required: the walrus operator (:=) needs 3.8+, and asyncio.to_thread was added in 3.9.
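If you are on an older interpreter, asyncio.to_thread can be replaced with loop.run_in_executor (as in the first answer). A sketch of such a drop-in helper (read_input is a hypothetical name, not part of the answer above):

import asyncio

async def read_input(prompt=""):
    # Rough equivalent of asyncio.to_thread(input, prompt) for Python < 3.9:
    # run the blocking input() in the default thread-pool executor.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, input, prompt)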

import asyncio

async def aworker(q):
    '''Worker that takes numbers from the queue and prints them'''
    while True:
        t = await q.get()  # Wait for a number to be put in the queue
        print(f"{t} received {asyncio.current_task().get_coro().__name__}:{asyncio.current_task().get_name()}")
        await asyncio.sleep(t)
        q.task_done()
        print(f"waited for {t} seconds in {asyncio.current_task().get_coro().__name__}:{asyncio.current_task().get_name()}")

async def looper():
    '''Infinite loop that prints the current task name'''
    i = 0
    while True:
        i += 1
        await asyncio.sleep(1)
        print(f"{i} {asyncio.current_task().get_name()}")
        names = []
        for task in asyncio.all_tasks():
            names.append(task.get_name())
        print(names)

async def main():
    q = asyncio.Queue()
    tasks = []
    # create two worker tasks and one infinitely looping task
    tasks.append(asyncio.create_task(aworker(q), name="aworker 1"))  # a worker which handles input from the queue
    tasks.append(asyncio.create_task(aworker(q), name="aworker 2"))  # another worker which handles input from the queue
    tasks.append(asyncio.create_task(looper(), name="looper"))  # a looper task which prints the current task name and the other running tasks
    for task in tasks:
        # print the task names thus far
        print(task.get_name())
    print('Keep inserting numbers, "q" to quit...')
    # the task running main() names itself Task-1
    while (number := await asyncio.to_thread(input)) != "q":
        try:
            q.put_nowait(int(number))
        except ValueError:
            print("Invalid number")
    await q.join()
    for task in tasks:
        task.cancel()
    await asyncio.gather(*tasks, return_exceptions=True)

if __name__ == "__main__":
    asyncio.run(main())

Related

Running a blocking function (e.g. requests) concurrently but asynchronously with Python

There is a function that blocks the event loop (e.g. it makes an API request). I need to make a continuous stream of requests that run in parallel but not synchronously, so each next request starts before the previous one has finished.
So I found this solved question with the loop.run_in_executor() solution and used it at first:
import asyncio
import requests

# blocking_request_func() defined somewhere

async def main():
    loop = asyncio.get_event_loop()
    future1 = loop.run_in_executor(None, blocking_request_func, 'param')
    future2 = loop.run_in_executor(None, blocking_request_func, 'param')
    response1 = await future1
    response2 = await future2
    print(response1)
    print(response2)

loop = asyncio.get_event_loop()
loop.run_until_complete(main())
This works well; the requests run in parallel. But there is a problem for my task: in this example we create the whole group of tasks/futures up front and only then run the group. I need something like this:
1. Send request_1 without waiting for it to finish.
(AFTER step 1 has started, but NOT at the same moment step 1 starts:)
2. Send request_2 without waiting for it to finish.
(AFTER step 2 has started, but NOT at the same moment step 2 starts:)
3. Send request_3 without waiting for it to finish.
(Request 1 (or any other) returns its response.)
(AFTER step 3 has started, but NOT at the same moment step 3 starts:)
4. Send request_4 without waiting for it to finish.
(Request 2 (or any other) returns its response.)
and so on...
I tried using asyncio.TaskGroup():
async def request_func():
    global result  # the list of request results, defined somewhere in the global area
    loop = asyncio.get_event_loop()
    result.append(await loop.run_in_executor(None, blocking_request_func, 'param'))
    await asyncio.sleep(0)  # adding or removing this line gives the same result

async def main():
    async with asyncio.TaskGroup() as tg:
        for i in range(0, 10):
            tg.create_task(request_func())
All these attempts gave the same result: the group of tasks/futures is defined first, and only then is the whole group run concurrently. But is there a way to run all these requests concurrently but "in the stream"?
I tried to make a visualization in case my explanation is not clear enough.
[diagram: What I have for now]
[diagram: What I need]
================ Update with the answer ===================
The closest answer, though with some limitations:
import asyncio
import concurrent.futures
import random
import time

def blockme(n):
    x = random.random() * 2.0
    time.sleep(x)
    return n, x

def cb(fut):
    print("Result", fut.result())

async def main():
    # You need to control the number of threads
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=4)
    loop = asyncio.get_event_loop()
    futs = []
    n = 0
    # You need to control requests per second
    delay = 0.5
    while await asyncio.sleep(delay, result=True):
        n += 1
        fut = loop.run_in_executor(pool, blockme, n)
        fut.add_done_callback(cb)
        futs.append(fut)
        # You need to control the number of pending futures, e.g. like this:
        if len(futs) > 40:
            completed, pending = await asyncio.wait(
                futs, timeout=5, return_when=asyncio.FIRST_COMPLETED)
            futs = list(pending)

asyncio.run(main())
I think this might be what you want. You don't have to await each request - the run_in_executor function returns a Future. Instead of awaiting that, you can attach a callback function:
import asyncio
import random
import time

def blockme(n):
    x = random.random() * 2.0
    time.sleep(x)
    return n, x

def cb(fut):
    print("Result", fut.result())

async def main():
    loop = asyncio.get_event_loop()
    futs = []
    for n in range(20):
        fut = loop.run_in_executor(None, blockme, n)
        fut.add_done_callback(cb)
        futs.append(fut)
    await asyncio.gather(*futs)
    # await asyncio.sleep(10)

asyncio.run(main())
All the requests are started at the beginning, but they don't all execute in parallel because the number of threads is limited by the ThreadPool. You can adjust the number of threads if you want.
Here I simulated a blocking call with time.sleep. I needed a way to prevent main() from ending before all the callbacks occurred, so I used gather for that purpose. You can also wait for some length of time, but gather is cleaner.
Apologies if I don't understand what you want. But I think you want to avoid using await for each call, and I tried to show one way you can do that.
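To adjust the thread count mentioned above, you can pass your own executor instead of None. A minimal sketch, reusing blockme and cb from the example (the max_workers value is arbitrary):

import asyncio
import concurrent.futures

async def main():
    loop = asyncio.get_running_loop()
    # A dedicated pool caps how many blocking calls run at once.
    with concurrent.futures.ThreadPoolExecutor(max_workers=4) as pool:
        futs = [loop.run_in_executor(pool, blockme, n) for n in range(20)]
        for fut in futs:
            fut.add_done_callback(cb)
        await asyncio.gather(*futs)

asyncio.run(main())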
This is taken directly from the Python documentation. The snippet from the asyncio library documentation shows how to run blocking code concurrently using asyncio; it uses asyncio.to_thread to run the blocking function in a separate thread.
you can find more here - https://docs.python.org/3/library/asyncio-task.html#running-in-threads
import asyncio
import time

def blocking_io():
    print(f"start blocking_io at {time.strftime('%X')}")
    # Note that time.sleep() can be replaced with any blocking
    # IO-bound operation, such as file operations.
    time.sleep(1)
    print(f"blocking_io complete at {time.strftime('%X')}")

async def main():
    print(f"started main at {time.strftime('%X')}")
    await asyncio.gather(
        asyncio.to_thread(blocking_io),
        asyncio.sleep(1))
    print(f"finished main at {time.strftime('%X')}")

asyncio.run(main())

Why do my async functions get stuck in an infinite loop?

I'm trying to set up some functions to continuously fetch and send data back and forth. However, after sending, there needs to be a brief rest period (which is why I have asyncio.sleep(10)), and I want to keep fetching data during that waiting time. My problem is that once task #1 starts sleeping and task #2 begins executing, control never reverts to task #1 when it wakes up; it gets stuck in the fetching loop endlessly.
I tried fixing this problem with a global boolean variable to indicate when the sender was on cooldown but that felt like a cheap solution. I wanted to find out if there was a way to achieve my goals using asyncio built-in functions.
Trying to repeat this process: fetch some data continuously -> send some data -> go on cooldown and continue fetching data during this period
import asyncio

data = []

async def fetcher():
    while True:
        # Some code continuously fetching data
        print("STUCK IN FETCHER")

async def sender():
    # Some code which sends data
    await asyncio.sleep(10)

async def main():
    while True:
        t1 = asyncio.create_task(sender())
        t2 = asyncio.create_task(fetcher())
        await t1

asyncio.run(main())
You are not awaiting anything in your fetcher, hence it essentially blocks and doesn't give other coroutines a chance to do some work. You can add await asyncio.sleep(0), which should help. Apart from this, make sure the sleep in sender() is awaited, as otherwise it will not actually sleep for 10 seconds but just create a coroutine that is never executed.
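Applied to the question's code, those two suggestions look roughly like this (keeping the placeholder comments):

import asyncio

async def fetcher():
    while True:
        # Some code continuously fetching data
        await asyncio.sleep(0)  # yield control back to the event loop

async def sender():
    # Some code which sends data
    await asyncio.sleep(10)  # awaited, so it really pauses for 10 seconds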
fetcher is a blocking/sync function. If you want to run it concurrently with other work, you need to put fetcher into an executor.
import asyncio

data = []

def fetcher():
    """Blocking fetcher."""
    while True:
        # Some code continuously fetching data
        print("STUCK IN FETCHER")

async def sender():
    """Async sender."""
    while True:
        # Some code which sends data
        await asyncio.sleep(10)

async def main():
    loop = asyncio.get_running_loop()
    await asyncio.gather(
        # https://docs.python.org/3/library/asyncio-eventloop.html#asyncio.loop.run_in_executor
        loop.run_in_executor(None, fetcher),
        sender(),
    )

asyncio.run(main())
You still need to synchronize fetcher and sender when they access data.
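One possible sketch of that synchronization, using the thread-safe queue.Queue instead of a bare list (the names here are illustrative, not from the answer):

import asyncio
import queue

data_q = queue.Queue()  # thread-safe, unlike a plain list

def fetcher():
    # Blocking producer running in the executor thread.
    while True:
        data_q.put("fetched item")  # safe to call from the worker thread

async def sender():
    # Async consumer on the event-loop side.
    while True:
        while not data_q.empty():
            item = data_q.get_nowait()
            # ... send item ...
        await asyncio.sleep(10)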
Building on what Simon Hawe said...
import asyncio

data = []

async def fetcher(event):
    print("fetch = start")
    while not event.is_set():
        # Some code continuously fetching data
        await asyncio.sleep(0)  # This hands control to the other coroutine
    print("fetch = end")

async def sender(event):
    print("send = start")
    # Some code which sends data
    await asyncio.sleep(2)
    print("send = done")
    event.set()

async def main():
    event = asyncio.Event()
    for _ in range(2):
        event.clear()
        t1 = asyncio.create_task(sender(event))
        t2 = asyncio.create_task(fetcher(event))
        await asyncio.gather(t1, t2)
        print("-")

asyncio.run(main())
Console result:
send = start
fetch = start
send = done
fetch = end
-
send = start
fetch = start
send = done
fetch = end
-
asyncio.Event() is documented here. The use of an event comes with a caution: it is not thread-safe. The set state of the event is visible to all tasks that consume it. As stated:
Set the event. All tasks waiting for event to be set will be immediately awakened.

Why does 'await' break from the local function when called from main()?

I am new to asynchronous programming, and while I understand most concepts, there is one relating to the inner runnings of 'await' that I don't quite understand.
Consider the following:
import asyncio

async def foo():
    print('start fetching')
    await asyncio.sleep(2)
    print('done fetching')

async def main():
    task1 = asyncio.create_task(foo())

asyncio.run(main())
Output: start fetching
vs.
async def foo():
    print('start fetching')
    print('done fetching')

async def main():
    task1 = asyncio.create_task(foo())

asyncio.run(main())
Output: start fetching followed by done fetching
Perhaps it is my understanding of await, which I understand insofar as we can use it to pause (2 seconds in the case above), or to wait for functions to fully finish running before any further code runs.
But in the first example above, why does await cause 'done fetching' not to run?
asyncio.create_task schedules an awaitable on the event loop and returns immediately, so you are actually exiting the main function (and closing the event loop) before the task is able to finish.
You need to change main to either
async def main():
    task1 = asyncio.create_task(foo())
    await task1
or
async def main():
    await foo()
Creating a task first (the former) is useful in many cases, but they all involve situations where the event loop will outlast the task, e.g. a long-running server; otherwise you should just await the coroutine directly, as in the latter.
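A small sketch of that "loop outlasts the task" situation, with a heartbeat standing in for real background work and a sleep standing in for a long-running server:

import asyncio

async def heartbeat():
    # Background task that keeps ticking while the "server" runs.
    while True:
        print("tick")
        await asyncio.sleep(1)

async def main():
    task = asyncio.create_task(heartbeat())  # fire and forget
    await asyncio.sleep(3.5)  # stand-in for e.g. a server's serve_forever()
    task.cancel()             # tidy shutdown when the app ends

asyncio.run(main())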

How to handle tasks with varying completion times

I'm watching a video(a) on YouTube about asyncio and, at one point, code like the following is presented for efficiently handling multiple HTTP requests:
# Need an event loop for doing this.
loop = asyncio.get_event_loop()

# Task creation section.
tasks = []
for n in range(1, 50):
    tasks.append(loop.create_task(get_html(f"https://example.com/things?id={n}")))

# Task processing section.
for task in tasks:
    html = await task
    thing = get_thing_from_html(html)
    print(f"Thing found: {thing}", flush=True)
I realise that this is efficient in the sense that everything runs concurrently but what concerns me is a case like:
the first task taking a full minute; but
all the others finishing in under three seconds.
Because the task processing section awaits completion of the tasks in the order in which they entered the list, it appears to me that none will be reported as complete until the first one completes.
At that point, the others that finished long ago will also be reported. Is my understanding correct?
If so, what is the normal way to handle that scenario, so that you're getting completion notification for each task the instant that task finishes?
(a) From Michael Kennedy of "Talk Python To Me" podcast fame. The video is Demystifying Python's Async and Await Keywords if you're interested. I have no affiliation with the site other than enjoying the podcast, so heartily recommend it.
If you just need to do something after each task, you can create another async function that does it, and run those in parallel:
async def wrapped_get_html(url):
    html = await get_html(url)
    thing = get_thing_from_html(html)
    print(f"Thing found: {thing}")

async def main():
    # shorthand for creating tasks and awaiting them all
    await asyncio.gather(
        *[wrapped_get_html(f"https://example.com/things?id={n}")
          for n in range(50)])

asyncio.run(main())
If for some reason you need your main loop to be notified, you can do that with as_completed:
async def main():
    for next_done in asyncio.as_completed([
            get_html(f"https://example.com/things?id={n}")
            for n in range(50)]):
        html = await next_done
        thing = get_thing_from_html(html)
        print(f"Thing found: {thing}")

asyncio.run(main())
You can make the tasks run in parallel with the code example below. I introduced asyncio.gather to make the tasks run concurrently. I also demonstrate the poison-pill technique and the daemon-task technique.
Please follow comments in code and feel free to ask questions if you have any.
import asyncio
from random import randint

WORKERS_NUMBER = 5
URL_NUM = 20

async def producer(task_q: asyncio.Queue) -> None:
    """Produce tasks and send them to workers"""
    print("Producer-Task Started")
    # imagine that it is a list of urls
    for i in range(URL_NUM):
        await task_q.put(i)
    # send poison pill to workers
    for i in range(WORKERS_NUMBER):
        await task_q.put(None)
    print("Producer-Task Finished")

async def results_shower(result_q: asyncio.Queue) -> None:
    """Receives results from worker tasks and shows the result"""
    while True:
        res = await result_q.get()
        print(res)
        result_q.task_done()  # confirm that task is done

async def worker(
    name: str,
    task_q: asyncio.Queue,
    result_q: asyncio.Queue,
) -> None:
    """Gets tasks from task_q, does some job and sends results to result_q"""
    print(f"Worker {name} Started")
    while True:
        task = await task_q.get()
        # if worker received poison pill - break
        if task is None:
            break
        await asyncio.sleep(randint(1, 10))
        result = task ** 2
        await result_q.put(result)
    print(f"Worker {name} Finished")

async def amain():
    """Wrapper around all async ops in the app"""
    _task_q = asyncio.Queue(maxsize=5)  # just some random maxsize
    _results_q = asyncio.Queue(maxsize=5)  # just some random maxsize
    # we run results_shower as a "daemon task", so we never await it;
    # if the asyncio loop has nothing else to do, it stops
    # without waiting for the "daemon task"
    asyncio.create_task(results_shower(_results_q))
    # gather means we run the tasks in parallel and wait till all of them are finished
    await asyncio.gather(
        producer(_task_q),
        *[worker(f"W-{i}", _task_q, _results_q) for i in range(WORKERS_NUMBER)]
    )
    # q.join() prevents the loop from stopping until results_shower prints all results.
    # It has an internal counter which is decreased by task_done() and increased by
    # q.put(); when the counter is 0, the q can join.
    await _results_q.join()
    print("All work is finished!")

if __name__ == '__main__':
    asyncio.run(amain())

How to make request without blocking (using asyncio)?

I would like to achieve the following using asyncio:
# Each iteration of this loop MUST last only 1 second
while True:
    # Make an async request
    sleep(1)
However, the only examples I've seen use some variation of
async def my_func():
    loop = asyncio.get_event_loop()
    await loop.run_in_executor(None, requests.get, 'http://www.google.com')

loop = asyncio.get_event_loop()
loop.run_until_complete(my_func())
But run_until_complete is blocking! Using run_until_complete in each iteration of my while loop would cause the loop to block.
I've spent the last couple of hours trying to figure out how to correctly run a non-blocking task (defined with async def) without success. I must be missing something obvious, because something as simple as this should surely be simple. How can I achieve what I have described?
run_until_complete runs the main event loop. It's not "blocking" so to speak; it just runs the event loop until the coroutine you passed as a parameter returns. It has to hang because otherwise the program would either stop or be blocked by the next instructions.
It's pretty hard to tell what you are trying to achieve, but this piece of code actually does something:
import asyncio
import requests

async def my_func():
    loop = asyncio.get_event_loop()
    while True:
        res = await loop.run_in_executor(None, requests.get, 'http://www.google.com')
        print(res)
        await asyncio.sleep(1)

loop = asyncio.get_event_loop()
loop.run_until_complete(my_func())
It will perform a GET request on the Google homepage every second, spawning a new thread to perform each request. You can convince yourself that it's actually non-blocking by running multiple requests virtually in parallel:
async def entrypoint():
    await asyncio.wait([
        get('https://www.google.com'),
        get('https://www.stackoverflow.com'),
    ])

async def get(url):
    loop = asyncio.get_event_loop()
    while True:
        res = await loop.run_in_executor(None, requests.get, url)
        print(url, res)
        await asyncio.sleep(1)

loop = asyncio.get_event_loop()
loop.run_until_complete(entrypoint())
Another thing to notice is that you're running each request in a separate thread. It works, but it's sort of a hack. You should rather use a real asynchronous HTTP client such as aiohttp.
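For completeness, a minimal aiohttp version of the same loop (a sketch; aiohttp is a third-party package that must be installed separately):

import asyncio
import aiohttp

async def get(session, url):
    while True:
        async with session.get(url) as resp:
            print(url, resp.status)
        await asyncio.sleep(1)

async def main():
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(
            get(session, 'https://www.google.com'),
            get(session, 'https://www.stackoverflow.com'),
        )

asyncio.run(main())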
This is Python 3.10.
asyncio is single-threaded execution, using await to yield the CPU to another function until whatever is awaited is done.
import asyncio

async def my_func(t):
    print("Start my_func")
    await asyncio.sleep(t)  # The await yields the cpu while we wait
    print("Exit my_func")

async def main():
    # Schedules on the event loop; we might want to save the returned
    # future to later check for completion.
    asyncio.ensure_future(my_func(10))
    print("Start main")
    await asyncio.sleep(1)  # The await yields the cpu, giving my_func a chance to start.
    print("running other stuff")
    await asyncio.sleep(15)
    print("Exit main")

if __name__ == "__main__":
    asyncio.run(main())  # Starts event loop
