QL;DR
In Python, is it a good practice to use asyncio.run inside a non-main function?
Description of the problem
I have a python code that runs multiple commands using subprocesses.
At the moment I run those subprocesses sequentially and wait until each one of the is finished until I run the next one. I want to start avoiding this using the async keyword and the asyncio library in general.
Now, in python you cannot use the await keyword unless you're in an async function. This forces you to propagate the async prefix to every function that calls on an async function, up until the main function at the top layer. the only way to avoid it that I know is to use asyncio.run. However, in all of the tutorials I saw the only place this function is used is when calling the main function, which doesn't help me avoid this propagation.
I was wondering if there is a real reason not to use asyncio.run in non-main function and avoid making all of my functions async for no reason.
Would love to know the answer!
Related
Here's what I want to do.
I have multiple asynchronous functions and a separate asynchronous function let's say main. I want to call this main function with every other function I call.
I'm using this structure in a telegram bot, and functions are called upon a certain command. But I want to run main on any incoming message including the messages with commands as mentioned above where another function is also called. So in that case, I wanna run both (first command specific function then main function)
I believe this can be done using threading.RLock() as suggested by someone, but I can't figure out how.
What's the best approach for this?
You could use aiotelegram in combination with asyncio's create_task().
While Threads can also do the job, they don't seem to be as good as asynchronous execution.
You can choose any telegram framework that provides an async context like Bot.run() does in aiotelegram, or you can even implement your own API client, just make sure you run on an asynchronous (ASGI) context.
The main idea then is to call asyncio.create_task() to fire up the main() function in parallel with the rest of the function that runs the Telegram Bot command.
Here's an example (note I've use my_main() instead main()):
import asyncio
from aiotg import Bot, Chat
bot = Bot(api_token="...")
async def other_command():
#Replace this with the actual logic
await asyncio.sleeep(2)
async def my_main():
# Replace this with the actual parallel task
await asyncio.sleep(5)
#bot.command(r"/echo_command (.+)")
async def echo(chat: Chat, match):
task = asyncio.create_task(my_main())
return chat.reply(match.group(1))
#bot.command(r"/other_command (.+)")
async def other(chat: Chat, match):
task = asyncio.create_task(my_main())
result = other_command()
return chat.reply(result)
bot.run()
It is important to know that with this approach, the tasks are never awaited nor checked for their completion, so Exceptions or failed executions can be difficult to track, as well as any result from main() that needs to be kept.
A simple solution for this is to declare a global dict() where you store the Task(s), so you can get back to them later on (i.e. with a specific Telegram Command, or running always within certain existing Telegram Commands).
Whatever logic you decide to keep track of the tasks, you can check if they're completed, and their results, if any, with Task.done() and Task.result(). See their official doc for further details about how to manage Tasks.
In JavaScript, to make a delayed call to a function there's a very handy utility built in to the language called setTimeout() [yes, the naming is horrible, but hey, this is an old javascript legacy and no one has high expectations].
setTimeout() takes a function name or a lambda function and a number of milliseconds as parameters, and invokes the function in the same thread once a the event loop and the timeout reach. If timeout is zero, it will run it in the next event loop cycle.
Is there any native python object that can achieve this in a one/two liner?
Python being the single thread application running without an event loop it seems not possible but
But in Python 3.7+ you can something similar to this with asyncio package
try using this with event loop and async method
await asyncio.sleep(5)
which is non-blocking but not a one-line solution
asyncio
I'm not very experienced in Python asyncio, although synchronous Python is going well.
I have a function, which creates a task list, and another function which is to be called with tasks in this list:
import asyncio
async def createTasks(some_dict):
coroutines = []
# some_dict can have an arbitrary number of items
for item in some_dict:
coroutines.append(executeOneTask(item))
tasks = await asyncio.gather(*coroutines, return_exceptions=True)
return tasks
async def executeOneTask(item):
# execute a task asynchronously
return
Here's the part where you are free to correct me if I'm wrong.
Now, my understanding of asyncio is that I need an event loop to execute an asynchronous function, which means that to asyncio.gather I need to await it that means this needs to happen inside an async function. OK, so I need an event loop to create my list of asynchronous tasks that I actually want to execute asynchronously.
If my understanding of event loops is correct, I cannot easily add tasks inside an event loop to that same event loop. Let's assume that I have an asynchronous main() function which is supposed to first retrieve a list of asynchronous tasks using createTasks() and then create an amount (equal to list length) of asynchronous tasks to be run by utilizing executeOneTask().
How should I approach the construction of such a main() function? Do I need multiple event loops? Can I create a task list some other way, which enables easier code?
Side note: The approach I have set up here might be a very difficult or upside-down way to solve the problem. What I aim to do is to create a list of asynchronous tasks and then run those tasks asynchronously. Feel free to not follow the code structure above if a smart solution requires that.
Thanks!
You should only use one event loop in the entire application. Start the main function by asyncio.run(main()) and asyncio creates a loop for you. With Python 3.8 you rarely need to access or use the loop directly but with older versions you may obtain it by asyncio.get_event_loop() if using loop methods or some functions that require loop argument.
Do note that IPython (used in Spyder and Jupyter) also runs its own loop, so in those you can directly call and await without calling asyncio.run.
If you only wish to do async programming but don't specifically need to work with asyncio, I would recommend checking out https://trio.readthedocs.io/ which basically does the same things but is much, much easier to use (correctly).
In the article "I'm not feeling the async pressure" Armin Ronacher makes the following observation:
In threaded code any function can yield. In async code only async functions can. This means for instance that the writer.write method cannot block.
This observation is made with reference to the following code sample:
from asyncio import start_server, run
async def on_client_connected(reader, writer):
while True:
data = await reader.readline()
if not data:
break
writer.write(data)
async def server():
srv = await start_server(on_client_connected, '127.0.0.1', 8888)
async with srv:
await srv.serve_forever()
run(server())
I do not understand this comment. Specifically:
How come synchronous functions cannot yield when inside of asynchronous functions?
What does yield have to do with blocking execution? Why is it that a function that cannot yield, cannot block?
Going line-by-line:
In threaded code any function can yield.
Programs running on a machine are organized in terms of processes. Each process may have one or more threads. Threads, like processes, are scheduled by (and interruptible by) the operating system. The word "yield" in this context means "letting other code run". When work is split between multiple threads, functions "yield" easily: the operating system suspends the code running in one thread, runs some code in a different thread, suspends that, comes back, and works some more on the first thread, and so on. By switching between threads in this way, concurrency is achieved.
In this execution model, whether the code being suspended is synchronous or asynchronous does not matter. The code within the thread is being run line-by-line, so the fundamental assumption of a synchronous function---that no changes occurred in between running one line of code and the next---is not violated.
In async code only async functions can.
"Async code" in this context means a single-threaded application that does the same work as the multi-threaded application, except that it achieves concurrency by using asynchronous functions within a thread, instead of splitting the work between different threads. In this execution model, your interpreter, not the operating system, is responsible for switching between functions as needed to achieve concurrency.
In this execution model, it is unsafe for work to be suspended in the middle of a synchronous function that's located inside of an asynchronous function. Doing so would mean running some other code in the middle of running your synchronous function, breaking the "line-by-line" assumption made by the synchronous function.
As a result, the interpreter will wait only suspend the execution of an asynchronous function in between synchronous sub-functions, never within one. This is what is meant by the statement that synchronous functions in async code cannot yield: once a synchronous function starts running, it must complete.
This means for instance that the writer.write method cannot block.
The writer.write method is synchronous, and hence, when run in an async program, uninterruptible. If this method were to block, it would block not just the asynchronous function it is running inside of, but the entire program. That would be bad. writer.write avoids blocking the program by writing to a write buffer instead and returning immediately.
Strictly speaking, writer.write can block, it's just inadvisable to do so.
If you need to block inside of an async function, the proper way to do so is to await another async function. This is what e.g. await writer.drain() does. This will block asynchronously: while this specific function remains blocked, it will correctly yield to other functions that can run.
“Yield” here refers to cooperative multitasking (albeit within a process rather than among them). In the context of the async/await style of Python programming, asynchronous functions are defined in terms of Python’s pre-existing generator support: if a function blocks (typically for I/O), all its callers that are performing awaits suspend (with an invisible yield/yield from that is indeed of the generator variety). The actual call for any generator is to its next method; that function actually returns.
Every caller, up to some sort of driver that most programmers never write, must participate for this approach to work: any function that did not suspend would suddenly have the responsibility of the driver of deciding what to do next while waiting on the function it called to complete. This “infectious” aspect of asynchronicity has been called a “color”; it can be problematic, as for example when people forget to await a coroutine call that looks correct because it looks like any other call. (The async/await syntax exists to minimize the disruption of the program’s structure from the concurrency by implicitly converting functions into state machines, but this ambiguity remains.) It can also be a good thing: an asynchronous function can be interrupted exactly when it awaits, so it’s straightforward to reason about the consistency of data structures.
A synchronous function therefore cannot yield simply as a matter of definition. The import of the restriction is rather that a function called with a normal (synchronous) call cannot yield: its caller is not prepared to handle such an interaction. (What will happen if it does anyway is of course the same “forgotten await”.) This also affects refactoring: a function cannot be changed to be asynchronous without changing all its clients (and making them asynchronous as well if they are not already). (This is similar to how all I/O works in Haskell, since it affects the type of any function that performs any.)
Note that yield is allowed in its role as a normal generator used with an ordinary for even in an asynchronous function, but that’s just the general fact that the caller must expect the same protocol as the callee: if an enhanced generator (an “old-style” coroutine) is used with for, it just gets None from every (yield), and if an async function is used with for, it produces awaitables that probably break when they are sent None.
The distinction with threading, or with so-called stackful coroutines or fibers, is that no special resumption support is needed from the caller because the actual function call simply doesn’t return until the thread/fiber is resumed. (In the thread case, the kernel also chooses when to resume it.) In that sense, these approaches are easier to use, but with fibers the ability to “sneak” a pause into any function is partially compromised by the need to specify arguments to that function to tell it about the userspace scheduler with which to register itself (unless you’re willing to use global variables for that…). Threads, on the other hand, have even higher overhead than fibers, which matters when great numbers of them are running.
I'm writing a little app in python, consuming some http services, but i really don't understand the difference between using an async function or an Thread for consuming that services.
Anyone can help me to understand?
I have been reading up on the threaded model of programming versus the asynchronous model from this really good article. http://krondo.com/blog/?p=1209
However, the article mentions the following points.
An async program will simply outperform a sync program by switching between tasks whenever there is a I/O.
Threads are managed by the operating system.
I remember reading that threads are managed by the operating system by moving around TCBs between the Ready-Queue and the Waiting-Queue(amongst other queues). In this case, threads don't waste time on waiting either do they?
In light of the above mentioned, what are the advantages of async programs over threaded programs?
In a function there is an entry point and there is an exit point (which is usually a return statement or last statement of function).
Thread: executes all the possible statements from entry point to exit point.
async function :
Functions defined with async def syntax are always coroutine functions
This is from python reference documentation. And coroutines can be entered,exited or resumed from different points anywhere between entry and exit point of the function.
Now, based on your requirement, you can choose which one to use.