From Python in a Nutshell
In what is sometimes known as a multiplexed async
architecture, your code keeps track of the I/O channels on which
operations may be pending; when you can do no more until one or more
of the pending I/O operations completes, the thread running your
code goes into a blocking wait (this situation is usually referred to
as “your code blocks”), specifically waiting for any completion on
the relevant set of channels. When a completion wakes up the blocking
wait, your code deals with the specifics of that completion (such
“dealing with” may include initiating more I/O operations),
then, usually, goes back to the blocking wait. Python offers
several low-level modules supporting multiplexed async architectures,
but the best one to use is the higher-level selectors module
The meaning of "multiplexed" in "multiplexed async" can't be
explained by its definition from
http://searchnetworking.techtarget.com/definition/multiplexing
Multiplexing (or muxing) is a way of sending multiple signals or streams of information over a communications link at the same time
in the form of a single, complex signal; the receiver recovers the
separate signals, a process called demultiplexing (or demuxing).
What does "multiplexed" mean in "multiplexed async architecture"?
Given that "when you can do no more until one or more of the pending I/O operations completes, the thread running your code goes into a blocking wait", how is "multiplexed async architecture" asynchronous?
The meaning of "multiplexed" in "multiplexed async" can't be explained by its definition
Yes, but it's pretty close. The meaning is:
Multiplexing is a way of performing multiple operations using a single thread at the same time.
What this architecture does is similar to time-division multiplexing.
Given that "when you can do no more until one or more of the pending I/O operations completes, the thread running your code goes into a blocking wait", how is "multiplexed async architecture" asynchronous?
Wikipedia defines "asynchrony" as:
Asynchrony […] refers to the occurrence of […] actions instigated by a program that take place concurrently with program execution, without the program blocking to wait for results.
That is what's happening in this case. That blocking is used as an implementation detail doesn't matter. What matters is that while one operation is waiting, others can continue.
Related
I am really struggling to understand the interaction between asyncio event loop and multiple workers/threads/processes.
I am using dash: which uses flask internally and gunicorn.
Say I have two functions
def async_download_multiple_files(files):
# This function uses async just so that it can concurrently send
# Multiple requests to different webservers and returns data.
def sync_callback_dash(files):
# This is a sync function that is called from a dash callback to get data
asyncio.run(async_download_multiple_files(files))
As I understand, asyncio.run runs the async function in an event loop but blocks it:
From Python Docs
While a Task is running in the event loop, no other Tasks can run in the same thread.
But what happens when I run a WSGI server like Gunicorn with multiple workers.
Say there are 2 requests coming in simultaneously, presumably there will be multiple calls to sync_callback_dash which will happen in parallel because of multiple Gunicorn workers.
Can both request 1 and request 2 try to execute the asyncio.run in parallel in different threads\processes ? Will one block the other ?
If they can run in parallel, what is the use of having asyncio workers that Gunicorn offers?
I answered this question with the assumption that there is some lack of knowledge on some of the fundamental understandings of threads/processes/async loop. If there was not, forgive me for the amount of detail.
First thing to note is that processes and threads are two separate concepts. This answer might give you some context. To expand:
Processes are run directly by the CPU, and if the CPU has multiple cores, processes can be run in parallel. Inside processes is where threads are run. There is always at least 1 thread per process, but there can be more. If there are more, the process switches between which thread it is executing after every (specific) millisecond (dictated by things out of the scope of this question)- and therefore threads are not run in absolute parallel, but rather constantly switched in and out of the CPU (at least as it pertains to Python, specifically, due to something called the GIL). The async loop runs inside a thread, and switches context relating specifically to I/O-bound instructions (more of this below).
Regarding this question, it's worth noting that Gunicorn workers are processes, and not threads (though you can increase the amount of threads per worker).
The intention of asynchronous code (with the use of async def, await, and asyncio) is to speed-up performance as it specifically relates to I/O bound tasks. Stuff like getting a file from disk, sending/receiving a network request, or anything that requires a physical piece of your computer - whether it is SSD, or the network card - other than the CPU to do some work. It can also be used for large CPU-bound instructions, but this is usually where threads come in. Note that I/O bound instructions are much slower than CPU bound instructions as the electricity inside your computer literally has to travel further distances, as well as perform extra steps in the hardware level (to keep things simple).
These tasks waste the CPU time (or, more specifically, the current process's time) on simply waiting for a reply. Asynchronous code is run with the help of a loop that auto-manages the context switching of I/O bound instructions and normal CPU bound instructions (dependent on the use of await keywords) by leveraging the idea that a function can "yield" control back to the loop, and allow the loop to continue processing other pieces of code while it waits. When async code sends an I/O bound instruction (e.g. grab the latest packet from the network card), instead of sitting still and waiting for a reply it will switch the current process' context to the next task in its list to speed up general execution time (adding that previous I/O bound call to this list to check back in later). There is more to this, but this is the general gist as it relates to your question.
This is what it means when the docs says:
While a Task is running in the event loop, no other Tasks can run in the same thread.
The async loop is not running things in parallel, but rather constantly switching context between different instructions for a more optimized CPU + I/O relationship/execution.
Processes, in the other hand, run in parallel in your CPU assuming you have multiple cores. Gunicorn workers - as mentioned earlier - are processes. When you run multiple async workers with Gunicorn you are effectively running multiple asyncio.loop in multiple (independent, and parallel-running) processes. This should answer your question on:
Can both request 1 and request 2 try to execute the asyncio.run in parallel in different threads\processes ? Will one block the other ?
If there is ever the case that one worker gets stuck on some extremely long I/O bound (or even non-async computation) instruction(s), other workers are there to take care of the next request(s).
With asyncio it is possible to run a separate event loop in each thread. Both will run in parallel (to the extent the Python Interpreter is capable). There are some restrictions. Communication between those loops must use threadsafe methods. Signals and subprocesses can be handled in the main thread only.
Calling asyncio.run in a callback will block until the asyncio part completely finishes. It is not clear from your question if this is what you want.
Alternatively, you could start a long running event loop in one thread and use asyncio.run_coroutine_threadsafe from other threads. Read the docs with an example here.
In the article "I'm not feeling the async pressure" Armin Ronacher makes the following observation:
In threaded code any function can yield. In async code only async functions can. This means for instance that the writer.write method cannot block.
This observation is made with reference to the following code sample:
from asyncio import start_server, run
async def on_client_connected(reader, writer):
while True:
data = await reader.readline()
if not data:
break
writer.write(data)
async def server():
srv = await start_server(on_client_connected, '127.0.0.1', 8888)
async with srv:
await srv.serve_forever()
run(server())
I do not understand this comment. Specifically:
How come synchronous functions cannot yield when inside of asynchronous functions?
What does yield have to do with blocking execution? Why is it that a function that cannot yield, cannot block?
Going line-by-line:
In threaded code any function can yield.
Programs running on a machine are organized in terms of processes. Each process may have one or more threads. Threads, like processes, are scheduled by (and interruptible by) the operating system. The word "yield" in this context means "letting other code run". When work is split between multiple threads, functions "yield" easily: the operating system suspends the code running in one thread, runs some code in a different thread, suspends that, comes back, and works some more on the first thread, and so on. By switching between threads in this way, concurrency is achieved.
In this execution model, whether the code being suspended is synchronous or asynchronous does not matter. The code within the thread is being run line-by-line, so the fundamental assumption of a synchronous function---that no changes occurred in between running one line of code and the next---is not violated.
In async code only async functions can.
"Async code" in this context means a single-threaded application that does the same work as the multi-threaded application, except that it achieves concurrency by using asynchronous functions within a thread, instead of splitting the work between different threads. In this execution model, your interpreter, not the operating system, is responsible for switching between functions as needed to achieve concurrency.
In this execution model, it is unsafe for work to be suspended in the middle of a synchronous function that's located inside of an asynchronous function. Doing so would mean running some other code in the middle of running your synchronous function, breaking the "line-by-line" assumption made by the synchronous function.
As a result, the interpreter will wait only suspend the execution of an asynchronous function in between synchronous sub-functions, never within one. This is what is meant by the statement that synchronous functions in async code cannot yield: once a synchronous function starts running, it must complete.
This means for instance that the writer.write method cannot block.
The writer.write method is synchronous, and hence, when run in an async program, uninterruptible. If this method were to block, it would block not just the asynchronous function it is running inside of, but the entire program. That would be bad. writer.write avoids blocking the program by writing to a write buffer instead and returning immediately.
Strictly speaking, writer.write can block, it's just inadvisable to do so.
If you need to block inside of an async function, the proper way to do so is to await another async function. This is what e.g. await writer.drain() does. This will block asynchronously: while this specific function remains blocked, it will correctly yield to other functions that can run.
“Yield” here refers to cooperative multitasking (albeit within a process rather than among them). In the context of the async/await style of Python programming, asynchronous functions are defined in terms of Python’s pre-existing generator support: if a function blocks (typically for I/O), all its callers that are performing awaits suspend (with an invisible yield/yield from that is indeed of the generator variety). The actual call for any generator is to its next method; that function actually returns.
Every caller, up to some sort of driver that most programmers never write, must participate for this approach to work: any function that did not suspend would suddenly have the responsibility of the driver of deciding what to do next while waiting on the function it called to complete. This “infectious” aspect of asynchronicity has been called a “color”; it can be problematic, as for example when people forget to await a coroutine call that looks correct because it looks like any other call. (The async/await syntax exists to minimize the disruption of the program’s structure from the concurrency by implicitly converting functions into state machines, but this ambiguity remains.) It can also be a good thing: an asynchronous function can be interrupted exactly when it awaits, so it’s straightforward to reason about the consistency of data structures.
A synchronous function therefore cannot yield simply as a matter of definition. The import of the restriction is rather that a function called with a normal (synchronous) call cannot yield: its caller is not prepared to handle such an interaction. (What will happen if it does anyway is of course the same “forgotten await”.) This also affects refactoring: a function cannot be changed to be asynchronous without changing all its clients (and making them asynchronous as well if they are not already). (This is similar to how all I/O works in Haskell, since it affects the type of any function that performs any.)
Note that yield is allowed in its role as a normal generator used with an ordinary for even in an asynchronous function, but that’s just the general fact that the caller must expect the same protocol as the callee: if an enhanced generator (an “old-style” coroutine) is used with for, it just gets None from every (yield), and if an async function is used with for, it produces awaitables that probably break when they are sent None.
The distinction with threading, or with so-called stackful coroutines or fibers, is that no special resumption support is needed from the caller because the actual function call simply doesn’t return until the thread/fiber is resumed. (In the thread case, the kernel also chooses when to resume it.) In that sense, these approaches are easier to use, but with fibers the ability to “sneak” a pause into any function is partially compromised by the need to specify arguments to that function to tell it about the userspace scheduler with which to register itself (unless you’re willing to use global variables for that…). Threads, on the other hand, have even higher overhead than fibers, which matters when great numbers of them are running.
I'm writing a little app in python, consuming some http services, but i really don't understand the difference between using an async function or an Thread for consuming that services.
Anyone can help me to understand?
I have been reading up on the threaded model of programming versus the asynchronous model from this really good article. http://krondo.com/blog/?p=1209
However, the article mentions the following points.
An async program will simply outperform a sync program by switching between tasks whenever there is a I/O.
Threads are managed by the operating system.
I remember reading that threads are managed by the operating system by moving around TCBs between the Ready-Queue and the Waiting-Queue(amongst other queues). In this case, threads don't waste time on waiting either do they?
In light of the above mentioned, what are the advantages of async programs over threaded programs?
In a function there is an entry point and there is an exit point (which is usually a return statement or last statement of function).
Thread: executes all the possible statements from entry point to exit point.
async function :
Functions defined with async def syntax are always coroutine functions
This is from python reference documentation. And coroutines can be entered,exited or resumed from different points anywhere between entry and exit point of the function.
Now, based on your requirement, you can choose which one to use.
In the Python documentation it describes how to start and use coroutines.
This section describes how to use a Task.
In the Task section, it states:
Tasks are used to schedule coroutines concurrently
I'm failing to understand, what is happening when I start a coroutines without using Task? Is the code running asynchronously but not concurrently? Does it mean when the code sees an await it goes and does something else?
When I use a Task is it like start two threads and calling join()? I start two or more tasks and wait for the result, correct?
For simple cases, creating Tasks manually is somewhat similar to threads – you can create them, event loop will eventually run them, and you should eventually get result/exception.
But in most cases, your code is built around await coro() – nothing low-level. This means that your code may do some I/O operation inside coro, so process is free to put your implicitly created task into queue, and resume execution later.
I'm a bit puzzled about how to write asynchronous code in python/twisted. Suppose (for arguments sake) I am exposing a function to the world that will take a number and return True/False if it is prime/non-prime, so it looks vaguely like this:
def IsPrime(numberin):
for n in range(2,numberin):
if numberin % n == 0: return(False)
return(True)
(just to illustrate).
Now lets say there is a webserver which needs to call IsPrime based on a submitted value. This will take a long time for large numberin.
If in the meantime another user asks for the primality of a small number, is there a way to run the two function calls asynchronously using the reactor/deferreds architecture so that the result of the short calc gets returned before the result of the long calc?
I understand how to do this if the IsPrime functionality came from some other webserver to which my webserver would do a deferred getPage, but what if it's just a local function?
i.e., can Twisted somehow time-share between the two calls to IsPrime, or would that require an explicit invocation of a new thread?
Or, would the IsPrime loop need to be chunked into a series of smaller loops so that control can be passed back to the reactor rapidly?
Or something else?
I think your current understanding is basically correct. Twisted is just a Python library and the Python code you write to use it executes normally as you would expect Python code to: if you have only a single thread (and a single process), then only one thing happens at a time. Almost no APIs provided by Twisted create new threads or processes, so in the normal course of things your code runs sequentially; isPrime cannot execute a second time until after it has finished executing the first time.
Still considering just a single thread (and a single process), all of the "concurrency" or "parallelism" of Twisted comes from the fact that instead of doing blocking network I/O (and certain other blocking operations), Twisted provides tools for performing the operation in a non-blocking way. This lets your program continue on to perform other work when it might otherwise have been stuck doing nothing waiting for a blocking I/O operation (such as reading from or writing to a socket) to complete.
It is possible to make things "asynchronous" by splitting them into small chunks and letting event handlers run in between these chunks. This is sometimes a useful approach, if the transformation doesn't make the code too much more difficult to understand and maintain. Twisted provides a helper for scheduling these chunks of work, cooperate. It is beneficial to use this helper since it can make scheduling decisions based on all of the different sources of work and ensure that there is time left over to service event sources without significant additional latency (in other words, the more jobs you add to it, the less time each job will get, so that the reactor can keep doing its job).
Twisted does also provide several APIs for dealing with threads and processes. These can be useful if it is not obvious how to break a job into chunks. You can use deferToThread to run a (thread-safe!) function in a thread pool. Conveniently, this API returns a Deferred which will eventually fire with the return value of the function (or with a Failure if the function raises an exception). These Deferreds look like any other, and as far as the code using them is concerned, it could just as well come back from a call like getPage - a function that uses no extra threads, just non-blocking I/O and event handlers.
Since Python isn't ideally suited for running multiple CPU-bound threads in a single process, Twisted also provides a non-blocking API for launching and communicating with child processes. You can offload calculations to such processes to take advantage of additional CPUs or cores without worrying about the GIL slowing you down, something that neither the chunking strategy nor the threading approach offers. The lowest level API for dealing with such processes is reactor.spawnProcess. There is also Ampoule, a package which will manage a process pool for you and provides an analog to deferToThread for processes, deferToAMPProcess.