I'm trying understand if asyncio is a necessary part of the Python definition of coroutines or simply a convenience package.
Can I run this program without asyncio?
import time
async def clk():
time.sleep(0.1)
async def process():
for _ in range(2):
await clk();
time.sleep(0.2)
print("I am DONE waiting!")
def run():
await process()
if __name__ == "__main__":
run()
I get the error that run() is not defined with async, which I get, but there seems to be an infinite regress to the top. Interestingly, this code runs (without the run() function) in Jupyter notebook. I just type await process.
To run async functions, you need to provide an event loop. One of the main functionalities of asyncio is to provide such a loop: when you execute asyncio.run(process) it provides a loop internally.
The reason why this code works in a notebook is that Jupyter (as well as the ipython REPL) provides a loop under the hood; there are other third-party libraries that provide a loop, such as trio and curio.
That being said, you can freely provide your own loop instead of using a library, as demonstrated in this answer. But in practice there is no point in doing that as asyncio is part of the Python standard library.
Related
QL;DR
In Python, is it a good practice to use asyncio.run inside a non-main function?
Description of the problem
I have a python code that runs multiple commands using subprocesses.
At the moment I run those subprocesses sequentially and wait until each one of the is finished until I run the next one. I want to start avoiding this using the async keyword and the asyncio library in general.
Now, in python you cannot use the await keyword unless you're in an async function. This forces you to propagate the async prefix to every function that calls on an async function, up until the main function at the top layer. the only way to avoid it that I know is to use asyncio.run. However, in all of the tutorials I saw the only place this function is used is when calling the main function, which doesn't help me avoid this propagation.
I was wondering if there is a real reason not to use asyncio.run in non-main function and avoid making all of my functions async for no reason.
Would love to know the answer!
Here's what I want to do.
I have multiple asynchronous functions and a separate asynchronous function let's say main. I want to call this main function with every other function I call.
I'm using this structure in a telegram bot, and functions are called upon a certain command. But I want to run main on any incoming message including the messages with commands as mentioned above where another function is also called. So in that case, I wanna run both (first command specific function then main function)
I believe this can be done using threading.RLock() as suggested by someone, but I can't figure out how.
What's the best approach for this?
You could use aiotelegram in combination with asyncio's create_task().
While Threads can also do the job, they don't seem to be as good as asynchronous execution.
You can choose any telegram framework that provides an async context like Bot.run() does in aiotelegram, or you can even implement your own API client, just make sure you run on an asynchronous (ASGI) context.
The main idea then is to call asyncio.create_task() to fire up the main() function in parallel with the rest of the function that runs the Telegram Bot command.
Here's an example (note I've use my_main() instead main()):
import asyncio
from aiotg import Bot, Chat
bot = Bot(api_token="...")
async def other_command():
#Replace this with the actual logic
await asyncio.sleeep(2)
async def my_main():
# Replace this with the actual parallel task
await asyncio.sleep(5)
#bot.command(r"/echo_command (.+)")
async def echo(chat: Chat, match):
task = asyncio.create_task(my_main())
return chat.reply(match.group(1))
#bot.command(r"/other_command (.+)")
async def other(chat: Chat, match):
task = asyncio.create_task(my_main())
result = other_command()
return chat.reply(result)
bot.run()
It is important to know that with this approach, the tasks are never awaited nor checked for their completion, so Exceptions or failed executions can be difficult to track, as well as any result from main() that needs to be kept.
A simple solution for this is to declare a global dict() where you store the Task(s), so you can get back to them later on (i.e. with a specific Telegram Command, or running always within certain existing Telegram Commands).
Whatever logic you decide to keep track of the tasks, you can check if they're completed, and their results, if any, with Task.done() and Task.result(). See their official doc for further details about how to manage Tasks.
I want to write a library that mixes synchronous and asynchronous work, like:
def do_things():
# 1) do sync things
# 2) launch a bunch of slow async tasks and block until they are all complete or an exception is thrown
# 3) more sync work
# ...
I started implementing this using asyncio as an excuse to learn the learn the library, but as I learn more it seems like this may be the wrong approach. My problem is that there doesn't seem to be a clean way to do 2, because it depends on the context of the caller. For example:
I can't use asyncio.run(), because the caller could already have a running event loop and you can only have one loop per thread.
Marking do_things as async is too heavy because it shouldn't require the caller to be async. Plus, if do_things was async, calling synchronous code (1 & 3) from an async function seems to be bad practice.
asyncio.get_event_loop() also seems wrong, because it may create a new loop, which if left running would prevent the caller from creating their own loop after calling do_things (though arguably they shouldn't do that). And based on the documentation of loop.close, it looks like starting/stopping multiple loops in a single thread won't work.
Basically it seems like if I want to use asyncio at all, I am forced to use it for the entire lifetime of the program, and therefore all libraries like this one have to be written as either 100% synchronous or 100% asynchronous. But the behavior I want is: Use the current event loop if one is running, otherwise create a temporary one just for the scope of 2, and don't break client code in doing so. Does something like this exist, or is asyncio the wrong choice?
I can't use asyncio.run(), because the caller could already have a running event loop and you can only have one loop per thread.
If the caller has a running event loop, you shouldn't run blocking code in the first place because it will block the caller's loop!
With that in mind, your best option is to indeed make do_things async and call sync code using run_in_executor which is designed precisely for that use case:
async def do_things():
loop = asyncio.get_event_loop()
await loop.run_in_executor(None, sync_stuff)
await async_func()
await loop.run_in_executor(None, more_sync_stuff)
This version of do_things is usable from async code as await do_things() and from sync code as asyncio.run(do_things()).
Having said that... if you know that the sync code will run very briefly, or you are for some reason willing to block the caller's event loop, you can work around the limitation by starting an event loop in a separate thread:
def run_async(aw):
result = None
async def run_and_store_result():
nonlocal result
result = await aw
t = threading.Thread(target=asyncio.run, args=(run_and_store_result(),))
t.start()
t.join()
return result
do_things can then look like this:
async def do_things():
sync_stuff()
run_async(async_func())
more_sync_stuff()
It will be callable from both sync and async code, but the cost will be that:
it will create a brand new event loop each and every time. (Though you can cache the event loop and never exit it.)
when called from async code, it will block the caller's event loop, thus effectively breaking its asyncio usage, even if most time is actually spent inside its own async code.
I am trying to understand asyncio module and spend about one hour with run_coroutine_threadsafe function, I even came to the working example, it works as expected, but works with several limitations.
First of all I do not understand how should I properly call asyncio loop in main (any other) thread, in the example I call it with run_until_complete and give it a coroutine to make it busy with something until another thread will not give it a coroutine. What are other options I have?
What are situations when I have to mix asyncio and threading (in Python) in real life? Since as far as I understand asyncio is supposed to take place of threading in Python (due to GIL for not IO ops), if I am wrong, do not be angry and share your suggestions.
Python version is 3.7/3.8
import asyncio
import threading
import time
async def coro_func():
return await asyncio.sleep(3, 42)
def another_thread(_loop):
coro = coro_func() # is local thread coroutine which we would like to run in another thread
# _loop is a loop which was created in another thread
future = asyncio.run_coroutine_threadsafe(coro, _loop)
print(f"{threading.current_thread().name}: {future.result()}")
time.sleep(15)
print(f"{threading.current_thread().name} is Finished")
if __name__ == '__main__':
loop = asyncio.get_event_loop()
main_th_cor = asyncio.sleep(10)
# main_th_cor is used to make loop busy with something until another_thread will not send coroutine to it
print("START MAIN")
x = threading.Thread(target=another_thread, args=(loop, ), name="Some_Thread")
x.start()
time.sleep(1)
loop.run_until_complete(main_th_cor)
print("FINISH MAIN")
First of all I do not understand how should I properly call asyncio loop in main (any other) thread, in the example I call it with run_until_complete and give it a coroutine to make it busy with something until another thread will not give it a coroutine. What are other options I have?
This is a good use case for loop.run_forever(). The loop will run and serve the coroutines you submit using run_coroutine_threadsafe. (You can even submit such coroutines from multiple threads in parallel; you never need to instantiate more than one event loop.)
You can stop the loop from a different thread by calling loop.call_soon_threadsafe(loop.stop).
What are situations when I have to mix asyncio and threading (in Python) in real life?
Ideally there should be none. But in the real world, they do crop up; for example:
When you are introducing asyncio into an existing large program that uses threads and blocking calls and cannot be converted to asyncio all at once. run_coroutine_threadsafe allows regular blocking code to make use of asyncio.
When you are dealing with older "async" APIs which use threads under the hood and call the user-supplied APIs from other threads. There are many examples, such as Python's own multiprocessing.
When you need to call blocking functions that have no async equivalent from asyncio - e.g. CPU-bound functions, legacy database drivers, things like that. This is not a use case for run_coroutine_threadsafe, here you'd use run_in_executor, but it is another example of mixing threads and asyncio.
In the Python docs, it states:
Application developers should typically use the high-level asyncio
functions, such as asyncio.run(), and should rarely need to reference
the loop object or call its methods.
This section is intended mostly for authors of lower-level code, libraries, and frameworks, who need finer control over the event loop behavior.
When using both async and a threadpoolexecutor, as show in the example code (from the docs):
import asyncio
import concurrent.futures
def blocking_io():
# File operations (such as logging) can block the
# event loop: run them in a thread pool.
with open('/dev/urandom', 'rb') as f:
return f.read(100)
async def main():
loop = asyncio.get_running_loop()
# 2. Run in a custom thread pool:
with concurrent.futures.ThreadPoolExecutor() as pool:
result = await loop.run_in_executor(
pool, blocking_io)
print('custom thread pool', result)
asyncio.run(main())
Do I need to call loop.close(), or would the asyncio.run() close the loop for me?
Is using both asyncio and threadpoolexecutor together, one of those situations where you need finer control over the event loop? Can using both, asyncio and threadpoolexecutor together be done without referencing the loop?
For question #1, The coroutines and tasks documentation linked in the Event Loop documentation you reference indicates that asyncio.run closes the loop:
asyncio.run(coro, *, debug=False)
Execute the coroutine coro and return the result.
This function runs the passed coroutine, taking care of managing the
asyncio event loop and finalizing asynchronous generators.
This function cannot be called when another asyncio event loop is
running in the same thread.
If debug is True, the event loop will be run in debug mode.
This function always creates a new event loop and closes it at the
end. It should be used as a main entry point for asyncio programs, and
should ideally only be called once.
For #2, that use of get_running_loop with ThreadExecutor is a way to run blocking code without blocking the OS thread. In Developing with asyncio, they indicate:
The loop.run_in_executor() method can be used with a concurrent.futures.ThreadPoolExecutor to execute blocking code in a different OS thread without blocking the OS thread that the event loop runs in.
It is an exception to the general warning they give in the Event documentation around calling the lower-level methods. So the answer to question #2, the answer is yes, it is one of those situations where you need a small amount of finer control in order to execute this recipe that handles blocking code in async scenarios.