Granted, given the GIL, classic asyncio design should focus on "single main thread with single event loop" in it. Nonetheless, are there legitimate "multiple threads with multiple event loops" use cases, which bring some architectural or performance advantages over the singular case? Please share.
I have never compared the performance of running multiple event loops in multiple threads.
As far as I know, asyncio is an event-driven architecture in which a single event loop runs on a single thread, and each function scheduled on that loop waits for its trigger before it gets time to run. In theory this can be faster than threading, because the program no longer pays the memory and CPU cost of managing extra threads.
With threading, the operating system switches between threads to make them appear to run at the same time.
Both approaches aim to run a program concurrently, even if nothing actually runs in parallel at any given moment. As for safety, asynchronous code is less prone to data races, since everything runs in a single thread and control changes hands only at await points.
Found: one such use case is the QEventLoop <-> asyncio.EventLoop "mutual stop-and-go hand-over" pattern in PySide6: the Qt event loop stops itself, lets the asyncio event loop run for a while, and then takes back control, all while sharing the same thread.
Related
I've read quite a few articles on threading and asyncio modules in python and the major difference I can seem to draw (correct me if I'm wrong) is that in,
threading: multiple threads can be used to execute the Python program, and these threads are scheduled by the OS itself. Because of the GIL, only one thread executes Python bytecode at a time; the GIL is released while a thread waits on blocking I/O (and periodically otherwise), which allows another thread to run. This is also more resource intensive than asyncio, since every thread needs its own stack and OS bookkeeping.
asyncio: one single thread can have multiple tasks/coroutines that multitask cooperatively to achieve concurrency. Here, the issue of GIL doesn't arise since it is on a single thread anyway and whenever one non blocking I/O bound task is happening, python interpreter can be used by another coroutine - and all of this is managed by asyncio's event loop.
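To make the two descriptions above concrete, here is a minimal sketch that runs the same simulated I/O once with a thread pool and once with asyncio. The function names are made up for illustration, and `time.sleep` / `asyncio.sleep` merely stand in for real blocking and non-blocking I/O.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_io(n):
    # Stand-in for a blocking call; CPython releases the GIL while it waits.
    time.sleep(0.1)
    return n * 2


async def async_io(n):
    # Stand-in for non-blocking I/O; await hands control back to the event loop.
    await asyncio.sleep(0.1)
    return n * 2


def run_with_threads(items):
    # One OS thread per item; the OS decides when each thread runs.
    with ThreadPoolExecutor(max_workers=len(items)) as pool:
        return list(pool.map(blocking_io, items))


async def run_with_asyncio(items):
    # One thread, several coroutines; the event loop switches at each await.
    return await asyncio.gather(*(async_io(n) for n in items))


thread_results = run_with_threads([1, 2, 3])
async_results = asyncio.run(run_with_asyncio([1, 2, 3]))
print(thread_results, async_results)
```

Both variants finish in roughly the time of one wait rather than three. The practical difference is that the threaded version works with any blocking library, while the asyncio version needs libraries that expose awaitable calls.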
Also, one article: http://masnun.rocks/2016/10/06/async-python-the-different-forms-of-concurrency/
says,
if io_bound:
    if io_very_slow:
        print("Use Asyncio")
    else:
        print("Use Threads")
else:
    print("Multi Processing")
I'd like to understand, just for better clarity, why exactly we can't use asyncio and threading as substitutes for each other, given we have sufficient resources available. Use cases of when to use what would help understand better. Further, since this topic is very new for me, there might be gaps in my understanding, so any kind of resources, explanations and corrections would be really appreciated.
I am really struggling to understand the interaction between asyncio event loop and multiple workers/threads/processes.
I am using dash: which uses flask internally and gunicorn.
Say I have two functions
import asyncio

async def async_download_multiple_files(files):
    # This function uses async just so that it can concurrently send
    # multiple requests to different web servers and return the data.
    ...

def sync_callback_dash(files):
    # This is a sync function that is called from a dash callback to get data.
    return asyncio.run(async_download_multiple_files(files))
As I understand, asyncio.run runs the async function in an event loop but blocks it:
From Python Docs
While a Task is running in the event loop, no other Tasks can run in the same thread.
But what happens when I run a WSGI server like Gunicorn with multiple workers?
Say there are 2 requests coming in simultaneously, presumably there will be multiple calls to sync_callback_dash which will happen in parallel because of multiple Gunicorn workers.
Can both request 1 and request 2 try to execute the asyncio.run in parallel in different threads\processes ? Will one block the other ?
If they can run in parallel, what is the use of having asyncio workers that Gunicorn offers?
I answered this question with the assumption that there is some lack of knowledge on some of the fundamental understandings of threads/processes/async loop. If there was not, forgive me for the amount of detail.
First thing to note is that processes and threads are two separate concepts. This answer might give you some context. To expand:
Processes and their threads are scheduled by the operating system, and if the CPU has multiple cores, processes can run in parallel. Threads run inside processes. There is always at least 1 thread per process, but there can be more. If there are more, execution switches between them at short intervals (dictated by things out of the scope of this question), and in CPython specifically the threads of one process do not run Python code in true parallel, because something called the GIL lets only one of them execute Python bytecode at a time. The async loop runs inside a thread and switches between tasks at the points where they wait on I/O (more on this below).
Regarding this question, it's worth noting that Gunicorn workers are processes, and not threads (though you can increase the number of threads per worker).
The intention of asynchronous code (with the use of async def, await, and asyncio) is to speed up I/O-bound tasks: getting a file from disk, sending or receiving a network request, or anything else that requires a piece of hardware other than the CPU (an SSD, the network card) to do some work. It is not suited to large CPU-bound work, which blocks the loop while it runs; that is usually where a thread or process executor comes in. Note that I/O-bound operations are much slower than CPU-bound instructions, because the data has to travel further and pass through extra steps at the hardware level (to keep things simple).
These tasks waste CPU time (or, more specifically, the current thread's time) on simply waiting for a reply. Asynchronous code is run with the help of a loop that manages the switching between I/O-bound operations and normal CPU-bound instructions (depending on where the await keywords are) by leveraging the idea that a function can "yield" control back to the loop and allow the loop to continue processing other pieces of code while it waits. When async code starts an I/O-bound operation (e.g. grab the latest packet from the network card), instead of sitting still and waiting for a reply, the loop switches to the next task in its list to speed up general execution time (keeping the pending I/O call on that list to check back on later). There is more to this, but this is the general gist as it relates to your question.
This is what it means when the docs say:
While a Task is running in the event loop, no other Tasks can run in the same thread.
The async loop is not running things in parallel, but rather constantly switching context between different instructions for a more optimized CPU + I/O relationship/execution.
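A small sketch of that switching, with made-up task names: each worker records a step and then awaits, which is the only point where it gives control back to the loop.

```python
import asyncio

order = []


async def worker(name, steps):
    for i in range(steps):
        order.append(f"{name}{i}")
        # Nothing else can run in this thread until this task reaches an await.
        await asyncio.sleep(0)


async def main():
    # Both workers share one thread and one event loop.
    await asyncio.gather(worker("a", 2), worker("b", 2))


asyncio.run(main())
print(order)
```

The recorded order interleaves the two workers (a0, b0, a1, b1): the tasks take turns at each await rather than running at the same time.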
Processes, on the other hand, run in parallel on your CPU, assuming you have multiple cores. Gunicorn workers, as mentioned earlier, are processes. When you run multiple async workers with Gunicorn, you are effectively running multiple asyncio event loops in multiple (independent, and parallel-running) processes. This should answer your question on:
Can both request 1 and request 2 try to execute the asyncio.run in parallel in different threads\processes ? Will one block the other ?
If there is ever the case that one worker gets stuck on some extremely long I/O bound (or even non-async computation) instruction(s), other workers are there to take care of the next request(s).
With asyncio it is possible to run a separate event loop in each thread. The loops run in parallel (to the extent the Python interpreter is capable). There are some restrictions. Communication between those loops must use thread-safe methods such as loop.call_soon_threadsafe. Signal handlers can be registered in the main thread only, and on older Python versions the same applied to subprocesses.
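A minimal sketch of one event loop per thread, assuming two simultaneous requests that are each handled in their own thread. `fetch_all` and `sync_callback` are hypothetical names, and `asyncio.sleep` stands in for the real downloads.

```python
import asyncio
import threading

results = {}


async def fetch_all(names):
    # Stand-in for real network requests.
    await asyncio.sleep(0.05)
    return threading.current_thread().name, [n.upper() for n in names]


def sync_callback(key, names):
    # asyncio.run creates a fresh event loop in the calling thread
    # and blocks only that thread until the coroutine finishes.
    results[key] = asyncio.run(fetch_all(names))


threads = [
    threading.Thread(target=sync_callback, args=("req1", ["a", "b"])),
    threading.Thread(target=sync_callback, args=("req2", ["c"])),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)
```

Each call gets its own loop, so neither request waits on the other's loop. They still share the GIL, which matters only for the CPU-bound parts.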
Calling asyncio.run in a callback will block until the asyncio part completely finishes. It is not clear from your question if this is what you want.
Alternatively, you could start a long running event loop in one thread and use asyncio.run_coroutine_threadsafe from other threads. Read the docs with an example here.
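A sketch of that alternative, under the assumption that one background thread owns the loop for the lifetime of the application. The `download` coroutine and the file name are placeholders.

```python
import asyncio
import threading

loop = asyncio.new_event_loop()


def run_loop():
    # The background thread owns this loop and runs it until stop() is called.
    asyncio.set_event_loop(loop)
    loop.run_forever()


loop_thread = threading.Thread(target=run_loop, daemon=True)
loop_thread.start()


async def download(name):
    await asyncio.sleep(0.05)  # stand-in for a real request
    return f"data for {name}"


# From any other thread: submit the coroutine and get a concurrent.futures.Future.
future = asyncio.run_coroutine_threadsafe(download("report.csv"), loop)
result = future.result(timeout=5)  # blocks only the calling thread
print(result)

# Shut down cleanly: stop() must be scheduled from inside the loop's thread.
loop.call_soon_threadsafe(loop.stop)
loop_thread.join()
loop.close()
```

Compared with calling asyncio.run per request, this keeps a single loop alive, so coroutines submitted from different threads can share connections and run concurrently on it.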
The event loop is meant to be thread-specific, since asyncio is about cooperative multitasking using a single thread. So I don't understand how loop.run_in_executor works together with ThreadPoolExecutor.
I would like to know the purpose of the function
The loop.run_in_executor awaitable has two main use cases:
Perform an I/O operation that cannot be managed through the file descriptor interface of the selector loop (i.e. using the loop.add_reader/remove_reader methods). This happens occasionally; see how the code for loop.getaddrinfo uses loop.run_in_executor under the hood, for instance.
Perform a heavy CPU operation that would block the event loop context switching mechanism for too long. There are plenty of legitimate use cases for that; imagine running some data processing task in the context of an asyncio application, for instance.
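A minimal sketch of the first use case, where `blocking_lookup` is a hypothetical function that blocks its thread. A heartbeat coroutine keeps running on the loop while the blocking call waits in the pool.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_lookup(x):
    # Stand-in for a call that blocks the thread it runs in.
    time.sleep(0.1)
    return x * x


async def heartbeat(ticks):
    # Keeps ticking on the loop while the blocking call runs in the pool.
    for _ in range(3):
        ticks.append(time.monotonic())
        await asyncio.sleep(0.02)


async def main():
    loop = asyncio.get_running_loop()
    ticks = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        # run_in_executor returns an awaitable, so it can be combined with coroutines.
        squared, _ = await asyncio.gather(
            loop.run_in_executor(pool, blocking_lookup, 7),
            heartbeat(ticks),
        )
    return squared, len(ticks)


squared, tick_count = asyncio.run(main())
print(squared, tick_count)
```

For the second use case, heavy pure-Python computation, a thread pool still contends for the GIL; passing a ProcessPoolExecutor to the same call is the usual way around that.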
I am using Python with the Raspbian OS (based on Linux) on the Raspberry Pi board. My Python script uses GPIOs (hardware inputs). I have noticed that when a GPIO activates, its callback interrupts the current thread.
This has forced me to use locks to prevent issues when the threads access common resources. However it is getting a bit complicated. It struck me that if the GPIO was 'queued up' until the main thread went to sleep (e.g. hits a time.sleep) it would simplify things considerably (i.e. like the way that javascript deals with things).
Is there a way to implement this in Python?
Are you using RPi.GPIO library? Or you call your Python code from C when a callback fires?
In case of RPi.GPIO, it runs a valid Python thread, and you do not need extra synchronization if you organize the threads interaction properly.
The most common pattern is to put your event in a queue (the queue module in Python 3, or the Queue module in Python 2). Then, when your main thread is ready, it processes all the events in the queue. The only problem is finding a moment to process them. The simplest solution is to implement a function that does that and call it from time to time. If you use a long sleep call, you may have to split it into many smaller sleeps to make sure the external events are processed often enough. You may even implement your own wrapper for sleep that splits one large delay into several smaller ones and processes the queue between them. The other solution is to use Queue.get with the timeout parameter instead of sleep (it returns immediately after an event arrives in the queue). However, if you need to sleep for exactly the period you specified, you have to measure the time yourself and call get again if there is still time left to wait after processing the events.
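A sketch of the last variant, using only the standard library. `gpio_callback` and `sleep_and_process` are hypothetical names, and a `threading.Timer` simulates the GPIO library firing its callback from another thread.

```python
import queue
import threading
import time

events = queue.Queue()
handled = []


def gpio_callback(pin):
    # Runs in the GPIO library's thread: only enqueue, never touch shared state.
    events.put(pin)


def sleep_and_process(duration):
    # Sleeps for `duration` seconds but wakes early to handle queued events.
    deadline = time.monotonic() + duration
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return
        try:
            pin = events.get(timeout=remaining)
        except queue.Empty:
            return  # the full delay has elapsed
        handled.append(pin)  # handled here, in the main thread


# Simulate a GPIO edge arriving 50 ms into a 200 ms sleep.
threading.Timer(0.05, gpio_callback, args=(17,)).start()
start = time.monotonic()
sleep_and_process(0.2)
elapsed = time.monotonic() - start
print(handled, round(elapsed, 2))
```

Because the event is handled in the main thread, the handler needs no locks around data that only the main thread touches.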
Use a Queue from the queue module to store the tasks you want to execute. The main loop periodically checks for entries in the queue and executes them one by one when it finds something.
Your GPIO monitoring threads put their tasks into the queue (a single queue is enough to collect from many threads).
You can model your tasks as callable objects or function objects.
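A sketch of that approach with tasks modeled as callables. The pin numbers and function names are made up, and plain threads stand in for the GPIO monitoring threads.

```python
import queue
import threading
from functools import partial

tasks = queue.Queue()
log = []
main_thread = threading.current_thread()


def on_button(pin):
    # Records which pin fired and whether the handler ran in the draining thread.
    log.append((pin, threading.current_thread() is main_thread))


def monitor(pin):
    # Stand-in for a GPIO monitoring thread: it enqueues a callable, nothing more.
    tasks.put(partial(on_button, pin))


monitors = [threading.Thread(target=monitor, args=(pin,)) for pin in (17, 27)]
for t in monitors:
    t.start()
for t in monitors:
    t.join()


def drain(q):
    # Called periodically from the main loop; runs every queued task in order.
    while True:
        try:
            task = q.get_nowait()
        except queue.Empty:
            break
        task()


drain(tasks)
print(sorted(log))
```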
I have a Qt application written in PySide (Qt Python binding). This application has a GUI thread and many different QThreads that are in charge of performing some heavy lifting, i.e. some rather long tasks. Because such a long task sometimes gets stuck (usually because it is waiting for a server response), the application sometimes freezes.
I was therefore wondering if it is safe to call QCoreApplication.processEvents() "manually" every second or so, so that the GUI event queue is cleared (processed)? Is that a good idea at all?
It's safe to call QCoreApplication.processEvents() whenever you like. The docs explicitly state your use case:
You can call this function occasionally when your program is busy
performing a long operation (e.g. copying a file).
There is no good reason, though, why threads would block the event loop in the main thread (unless your system really can't keep up). So that's worth looking into anyway.
A couple of hints people might find useful:
A. You need to beware of the following:
Every so often the threads want to send stuff back to the main thread, so they post an event and call processEvents.
If the code run from that event also calls processEvents, then instead of returning to the next statement, Python can dispatch a worker thread again, and that can then repeat this process.
The net result can be hundreds or thousands of nested processEvents calls, which can then result in a "recursion level exceeded" error message.
Moral - if you are running a multi-threaded application do NOT call processEvents in any code initiated by a thread which runs in the main thread.
B. You need to be aware that CPython has a Global Interpreter Lock (GIL) that limits threads so that only one can run Python code at any one time, and the way that Python decides which thread to run is counter-intuitive. Calling processEvents from a worker thread does not seem to do what it says on the tin, and CPU time is not allocated to the main thread or to Python internal threads. I am still experimenting, but it seems that putting worker threads to sleep for a few milliseconds allows other threads to get a look in.