I'm trying to understand concurrency in Python and am confused about how threads are scheduled and how tasks (in asyncio library) are scheduled to run/wait.
Suppose a thread tries to acquire a Lock and is blocked. Does the Python interpreter immediately put that thread into the 'blocked' queue? How is this blocked thread put back into the running state? Is there busy waiting involved?
How is this different when a task (the equivalent of a thread) in the asyncio library is blocked on an async mutex?
What is the advantage of asyncio, if there is no busy waiting involved in either of the above two cases?
Suppose a thread tries to acquire a Lock and is blocked. Does the Python interpreter immediately put that thread into the 'blocked' queue?
Python creates real operating system threads, so no queuing or scheduling needs to be done by the interpreter.
The one possible exception is the global lock used by the interpreter to serialize execution of Python code and access to Python objects. This lock is released not only before acquiring a threading lock, but also before any (potentially) blocking operation, such as reading from an IO handle or sleeping.
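A minimal sketch of the behaviour described above, assuming CPython and the standard threading module. The names demo, worker and about_to_wait are made up for illustration. The worker thread is parked on the lock until the main thread releases it, and no Python code runs a polling loop in the meantime.

```python
import threading


def demo():
    lock = threading.Lock()
    about_to_wait = threading.Event()  # hypothetical helper, only used for ordering
    events = []

    def worker():
        events.append("worker: about to wait")
        about_to_wait.set()
        with lock:  # blocks here until the main thread releases the lock
            events.append("worker: got the lock")

    lock.acquire()
    t = threading.Thread(target=worker)
    t.start()
    about_to_wait.wait()
    t.join(timeout=0.2)  # the worker cannot finish while we hold the lock
    still_blocked = t.is_alive()
    events.append("main: releasing")
    lock.release()
    t.join()
    return events, still_blocked


if __name__ == "__main__":
    print(demo())
```

While the worker waits, the interpreter itself has nothing to schedule: the thread is simply asleep in the OS until the lock becomes available.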
What is the advantage of asyncio, if there is no busy waiting involved in either of the above two cases?
The advantage is that asyncio doesn't require a new OS thread for each coroutine it runs concurrently. OS threads are comparatively expensive, while asyncio tasks are quite lightweight. Also, asyncio makes the potential switch points visible (the await keyword), so there's less potential for race conditions.
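For comparison with the threading case, here is a rough sketch of two tasks contending for an asyncio.Lock. The worker and main names and the log list are illustrative only. A task that cannot get the lock is suspended at the await point, and the single event loop thread moves on to other tasks.

```python
import asyncio


async def worker(name, lock, log):
    log.append(f"{name}: waiting")
    async with lock:  # suspends only this task if the lock is held
        log.append(f"{name}: acquired")
        await asyncio.sleep(0.01)  # an explicit, visible switch point
    log.append(f"{name}: released")


async def main():
    lock = asyncio.Lock()
    log = []
    await asyncio.gather(worker("a", lock, log), worker("b", lock, log))
    return log


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Task "b" only acquires the lock after task "a" has released it, yet everything runs on one thread.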
You can think of asyncio as a successor to Twisted, but with a modern API and using suspendable coroutines instead of explicit callback chaining.
I've read quite a few articles on threading and asyncio modules in python and the major difference I can seem to draw (correct me if I'm wrong) is that in,
threading: multiple threads can be used to execute the Python program, and these threads are juggled by the OS itself. Further, only when non-blocking I/O is happening on a thread can the GIL be released to allow another thread to use it (since the GIL makes the Python interpreter single-threaded). This is also more resource intensive than asyncio, since multiple threads will be utilising multiple resources.
asyncio: one single thread can have multiple tasks/coroutines that multitask cooperatively to achieve concurrency. Here, the issue of GIL doesn't arise since it is on a single thread anyway and whenever one non blocking I/O bound task is happening, python interpreter can be used by another coroutine - and all of this is managed by asyncio's event loop.
Also, one article: http://masnun.rocks/2016/10/06/async-python-the-different-forms-of-concurrency/
says,
    if io_bound:
        if io_very_slow:
            print("Use Asyncio")
        else:
            print("Use Threads")
    else:
        print("Multi Processing")
I'd like to understand, just for better clarity, why exactly we can't use asyncio and threading as substitutes for each other, given we have sufficient resources available. Use cases of when to use what would help understand better. Further, since this topic is very new for me, there might be gaps in my understanding, so any kind of resources, explanations and corrections would be really appreciated.
Being new to using concurrency, I am confused about when to use the different python concurrency libraries. To my understanding, multiprocessing, multithreading and asynchronous programming are part of concurrency, while multiprocessing is part of a subset of concurrency called parallelism.
I searched around on the web about different ways to approach concurrency in Python, and I came across the multiprocessing library, concurrent.futures' ProcessPoolExecutor() and ThreadPoolExecutor(), and asyncio. What confuses me is the difference between these libraries, especially what the multiprocessing library does: since it has methods like pool.apply_async, does it also do the job of asyncio? If so, why is it called multiprocessing when it is a different method to achieve concurrency from asyncio (multiple processes vs cooperative multitasking)?
There are several different libraries at play:
threading: interface to OS-level threads. Note that CPU-bound work is mostly serialized by the GIL, so don't expect threading to speed up calculations. Use it when you need to invoke blocking APIs in parallel, and when you require precise control over thread creation. Avoid creating too many threads (e.g. thousands), as they are not free. If possible, don't create threads yourself, use concurrent.futures instead.
multiprocessing: interface to spawning multiple Python processes with an API intentionally similar to threading. Multiple processes work in parallel, so you can actually speed up calculations using this method. The disadvantage is that you can't share in-memory data structures without using multiprocessing-specific tools.
concurrent.futures: A modern interface to threading and multiprocessing, which provides convenient thread/process pools it calls executors. The pool's main entry point is the submit method which returns a handle that you can test for completion or wait for its result. Getting the result gives you the return value of the submitted function and correctly propagates raised exceptions (if any), which would be tedious to do with threading. concurrent.futures should be the tool of choice when considering thread or process based parallelism.
asyncio: While the previous options are "async" in the sense that they provide non-blocking APIs (this is what methods like apply_async refer to), they are still relying on thread/process pools to do their magic, and cannot really do more things in parallel than they have workers in the pool. Asyncio is different: it uses a single thread of execution and async system calls across the board. It has no blocking calls at all, the only blocking part being the asyncio.run() entry point. Asyncio code is typically written using coroutines, which use await to suspend until something interesting happens. (Suspending is different than blocking in that it allows the event loop thread to continue to other things while you're waiting.) It has many advantages compared to thread-based solutions, such as being able to spawn thousands of cheap "tasks" without bogging down the system, and being able to cancel tasks or easily wait for multiple things at once. Asyncio should be the tool of choice for servers and for clients connecting to multiple servers.
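To illustrate the concurrent.futures item in the list above, here is a small sketch of submit returning a handle whose result() gives back the return value or re-raises the exception. The divide and run functions are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def divide(a, b):
    return a / b


def run():
    with ThreadPoolExecutor(max_workers=2) as pool:
        ok = pool.submit(divide, 6, 3)   # returns a Future immediately
        bad = pool.submit(divide, 1, 0)  # will fail inside the worker thread
        result = ok.result()             # waits and returns the value
        try:
            bad.result()                 # re-raises the worker's exception here
        except ZeroDivisionError as exc:
            error = exc
    return result, error


if __name__ == "__main__":
    print(run())
```

Swapping ThreadPoolExecutor for ProcessPoolExecutor should keep the same shape, under the usual constraint that submitted functions and arguments must be picklable.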
When choosing between asyncio and multithreading/multiprocessing, consider the adage that "threading is for working in parallel, and async is for waiting in parallel".
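A hedged sketch of "waiting in parallel": a thousand tasks each wait 0.1 seconds, with asyncio.sleep standing in for a network wait. The fake_request name and the numbers are assumptions for illustration, not measurements from the original answer.

```python
import asyncio
import time


async def fake_request(i):
    await asyncio.sleep(0.1)  # placeholder for waiting on the network
    return i


async def main(n=1000):
    start = time.perf_counter()
    results = await asyncio.gather(*(fake_request(i) for i in range(n)))
    elapsed = time.perf_counter() - start
    return results, elapsed


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(len(results), "tasks finished in", round(elapsed, 2), "seconds")
```

Because the waits overlap, the total time should be close to a single wait rather than the sum of all of them, and no extra OS threads are created.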
Also note that asyncio can await functions executed in thread or process pools provided by concurrent.futures, so it can serve as glue between all those different models. This is part of the reason why asyncio is often used to build new library infrastructure.
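As a sketch of that glue role, the snippet below awaits blocking functions running in a ThreadPoolExecutor via loop.run_in_executor. The blocking_call function is a stand-in for any blocking API and is not from the original answer.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_call(x):
    time.sleep(0.05)  # stands in for a blocking library call
    return x * 2


async def main():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=4) as pool:
        # Each call runs in a pool thread; the event loop stays free meanwhile.
        futures = [loop.run_in_executor(pool, blocking_call, i) for i in range(4)]
        return await asyncio.gather(*futures)


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Passing a ProcessPoolExecutor instead is the usual way to offload CPU-bound work from a coroutine.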
I'm looking for a conceptual answer on this question.
I'm wondering whether using ThreadPool in python to perform concurrent tasks, guarantees that data is not corrupted; I mean multiple threads don't access the critical data at the same time.
If so, how does ThreadPoolExecutor work internally to ensure that critical data is accessed by only one thread at a time?
Thread pools do not guarantee that shared data is not corrupted. Execution can switch between threads at any bytecode boundary, so corruption is always a risk. Shared data should be protected by synchronization primitives such as locks, condition variables and events. See the threading module docs.
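A minimal sketch of protecting shared data yourself when using a thread pool. The Counter class and run function are hypothetical. The pool only runs the functions; the lock is what keeps the read-modify-write on value from interleaving.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:  # only one thread at a time may update value
            self.value += 1


def run(n_tasks=8, per_task=10_000):
    counter = Counter()

    def work():
        for _ in range(per_task):
            counter.increment()

    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(work) for _ in range(n_tasks)]
        for f in futures:
            f.result()  # wait and surface any exception
    return counter.value


if __name__ == "__main__":
    print(run())
```

Without the lock, `self.value += 1` is not guaranteed to be atomic, so updates could be lost.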
concurrent.futures.ThreadPoolExecutor is a thread pool specialized to the concurrent.futures async task model. But all of the risks of traditional threading are still there.
If you are using the python async model, things that fiddle with shared data should be dispatched on the main thread. The thread pool should be used for autonomous events, especially those that wait on blocking I/O.
If so, how does ThreadPoolExecutor work internally to ensure that critical data is accessed by only one thread at a time?
It doesn't, that's your job.
The high-level methods like map will use a safe work queue and not share work items between threads, but if you've got other resources which can be shared then the pool does not know or care, it's your problem as the developer.
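For the map case mentioned above, a short sketch where each work item is independent, so no extra locking is needed. The square function is just an example.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    # Touches only its own argument, so there is nothing shared to protect.
    return x * x


def run():
    with ThreadPoolExecutor(max_workers=4) as pool:
        # map hands each item to exactly one worker and yields results in input order.
        return list(pool.map(square, range(5)))


if __name__ == "__main__":
    print(run())
```

As soon as square wrote to a shared list, dict or file, protecting that resource would again be the developer's responsibility.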
The crawler I am working on makes requests using pycurl multi.
What kind of efficiency improvement can I expect if I switch to aiohttp?
Skepticism has me doubting the potential improvement since Python has the GIL. Most of the time is spent waiting for the requests (network IO), so if I could do them in a truly parallel way and then process them as they come in, I could get a good speedup.
Has anyone been through this and can offer some insights?
Thanks
The global interpreter lock is a mutex that protects access to Python objects, preventing multiple threads from executing Python bytecodes at once.
This means that it affects the performance of your multithreaded code. AsyncIO is more about handling requests concurrently than in parallel. With AsyncIO your code will be able to handle more requests even with a single-threaded loop, because the network IO is going to be async. This means that while a coroutine fetches a network resource it will "pause" without locking the thread it's running on, and allow other coroutines to execute. The main idea with asyncIO is that even with a single thread you can keep your CPU busy with calculations instead of waiting for network IO.
If you want to understand more about asyncIO, you need to understand the difference between concurrency and parallelism. This is an excellent Go talk about this subject, but the principles are the same.
So even though Python has the GIL, for workloads dominated by network IO asyncIO can perform considerably better than traditional threads. Here are some benchmarks:
From the gevent docs:
The greenlets all run in the same OS thread and are scheduled cooperatively.
From asyncio docs:
This module provides infrastructure for writing single-threaded concurrent code using coroutines. asyncio does provide
Try as I might, I haven't come across any major Python libraries that implement multi-threaded or multi-process coroutines i.e. spreading coroutines across multiple threads so as to increase the number of I/O connections that can be made.
I understand coroutines essentially allow the main thread to pause executing one I/O bound task and move on to the next I/O bound task, forcing an interrupt only when one of these I/O operations finishes and requires handling. If that is the case, then distributing I/O tasks across several threads, each of which could be operating on different cores, should obviously increase the number of requests you could make.
Maybe I'm misunderstanding how coroutines work or are meant to work, so my question is in two parts:
Is it possible to even have a coroutine library that operates over multiple threads (possibly on different cores) or multiple processes?
If so, is there such a library?