Is Python's asyncio `loop.create_task(...)` thread-safe?

I have matplotlib running on the main thread with a live plot of data coming in from an external source. To handle the incoming data I have a simple UDP listener listening for packets using asyncio, with the event loop running on a separate thread.
I now want to add more sources and I'd like to run their listeners on the same loop/thread as the first one. To do this I'm just passing the loop object to the classes implementing the listeners and their constructor adds a task to the loop that will initialize and run the listener.
However, since these classes are initialized in the main thread, I'm calling loop.create_task(...) from there instead of from the loop's thread. Will this cause any issues?

The answer is no: loop.create_task(...) is not thread-safe and may only be called from the loop's own thread. To schedule a coroutine from a different thread, use asyncio.run_coroutine_threadsafe(...) instead.
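For illustration, a minimal sketch of the thread-safe pattern (the loop setup and the listener coroutine are illustrative, not from the question):

import asyncio
import threading

loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

async def listen(port):
    print(f"listening on {port}")  # stand-in for a real UDP listener

# From the main thread: schedules the coroutine on the loop's thread
# and returns a concurrent.futures.Future.
future = asyncio.run_coroutine_threadsafe(listen(9000), loop)
future.result()  # optionally block until the coroutine has run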

Related

Integrate embedded python asyncio into boost::asio event loop

I have a C++ binary with an embedded python interpreter, done via pybind11::scoped_interpreter.
It also has a number of tcp connections using boost::asio which consume a proprietary messaging protocol and update some state based on the message contents.
On startup we import a python module, instantiate a specific class therein and obtain pybind11::object handles to various callback methods within the class.
namespace py = pybind11;

class Handler
{
public:
    Handler(const cfg::Config& cfg)
        : py_interpreter_{std::make_unique<py::scoped_interpreter>()}
    {
        auto module = py::module_::import(cfg.module_name);
        auto Class = module.attr(cfg.class_name);
        auto obj = Class(this);
        py_on_foo_ = obj.attr("on_foo");
        py_on_bar_ = obj.attr("on_bar");
    }

    std::unique_ptr<py::scoped_interpreter> py_interpreter_;
    py::object py_on_foo_;
    py::object py_on_bar_;
};
For each specific message which comes in, we call the associated callback method in the python code.
void Handler::onFoo(const msg::Foo& foo)
{
    py_on_foo_(foo); // calls the Python method
}
All of this works fine... however, it means there is no "main thread" in the python code - instead, all python code execution is driven by events originating in the C++ code, from the boost::asio::io_context which is running on the C++ application's main thread.
What I'm now tasked with is a way to get this C++-driven code to play nicely with some 3rd-party asyncio python libraries.
What I have managed to do is to create a new python threading.Thread, and from there add some data to a thread-safe queue and make a call to boost::asio::post (exposed via pybind11) to execute a callback in the C++ thread context, from which I can drain the queue.
This is working as I expected, but I'm new to asyncio, and am lost as to how to create a new asyncio.event_loop on the new thread I've created, and post the async results to my thread-safe queue / C++ boost::asio::post bridge to the C++ thread context.
I'm not sure if this is even a recommended approach... or if there is some asyncio magic I should be using to wake up my boost::asio::io_context and have the events delivered in that context?
Questions:
How can I integrate an asyncio.event_loop into my new thread and have the results posted to my thread-safe event-queue?
Is it possible to create a decorator or some such similar functionality which will "decorate" an async function so that the results are posted to my thread-safe queue?
Is this approach recommended, or is there another asyncio / "coroutiney" way of doing things I should be looking at?
There are three possibilities to integrate the asio and asyncio event loops:
1. Run both event loops in the same thread, alternating between them
2. Run one event loop in the main thread and the other in a worker thread
3. Merge the two event loops together
The first option is straightforward, but has the downside that you will be running that thread hot since it never gets the chance to sleep (classically, in a select), which is inconsiderate and can disguise performance issues (since the thread always uses all available CPU). Here option 1a would be to run the asio event loop as a guest in asyncio:
async def runAsio(asio: boost.asio.IoContext):
    while await asyncio.sleep(0, True):
        asio.poll()
And option 1b would be to run the asyncio event loop as a guest in asio:
boost::asio::awaitable<void> runAsyncio(py::object asyncio) {
    for (;; co_await boost::asio::defer(boost::asio::use_awaitable)) {
        asyncio.attr("stop")();
        asyncio.attr("run_forever")();
    }
}
The second option is more efficient, but has the downside that completions will be invoked on either thread depending on which event loop they're triggered by. This is the approach taken by the asynchronizer library; it spawns a std::thread to run the asio event loop on the side (option 2a), but you could equally take your approach (option 2b) of spawning a threading.Thread and running the asyncio event loop on the side. If you're doing this you should create a new event loop in the worker thread and run it using run_forever. To post callbacks to this event loop from the main thread use call_soon_threadsafe.
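A hedged sketch of option 2b; the fetch coroutine, the results queue, and the way the C++ side would drain it via the boost::asio::post bridge are assumptions, not the asynchronizer API:

import asyncio
import queue
import threading

results = queue.Queue()  # thread-safe; drained from the C++ side
loop = asyncio.new_event_loop()

def loop_main():
    asyncio.set_event_loop(loop)
    loop.run_forever()

threading.Thread(target=loop_main, daemon=True).start()

async def fetch():  # stands in for a 3rd-party asyncio call
    await asyncio.sleep(0.1)
    return "payload"

def start_fetch():
    # runs on the loop thread, so create_task is safe here
    task = loop.create_task(fetch())
    task.add_done_callback(lambda t: results.put(t.result()))

# From the main (C++-driven) thread:
loop.call_soon_threadsafe(start_fetch)
print(results.get())  # here the C++ side would drain via its asio::post bridge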
Note that a downside of approach 2b would be that Python code invoked in the main thread won't be able to access the asyncio event loop using get_running_loop and, worse, any code using the deprecated get_event_loop in the main thread will hang. If instead you use option 2a and run the C++ event loop in the worker thread, you can ensure that any Python callbacks that might want access to the asyncio event loop are running in the main thread.
Finally, the third option is to replace one event loop with the other (or even possibly both with a third, e.g. libuv). Replacing the asio scheduler/reactor/proactor is pretty involved and fairly pointless (since it would mean adding overhead to C++ code that should be fast), but replacing the asyncio loop is far more straightforward and is very much a supported use case; see Event Loop Implementations and Policies and maybe take a look at uvloop which replaces the asyncio event loop with libuv. On the downside, I'm not aware of a fully supported asio implementation of the asyncio event loop, but there is a GSoC project that looks pretty complete, although it's (unsurprisingly) written using Boost.Python so might need a little work to integrate with your pybind11 codebase.
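For reference, swapping in uvloop is a two-liner (assuming uvloop is installed):

import asyncio
import uvloop

asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())
# any subsequent asyncio.run(...) now drives a libuv-backed event loop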

No other code runs after I combined discord.py with my main script

I made a script that does web scraping and API requests, and I wanted to add discord.py to send the results to my Discord server, but the script stops after this:
client.run('token')
Is there any way to fix this?
You need to use threads.
Python threading allows you to have different parts of your program run concurrently and can simplify your design.
What Is a Thread?
A thread is a separate flow of execution. This means that your program will have two things happening at once.
Getting multiple tasks running simultaneously requires a non-standard implementation of Python, writing some of your code in a different language, or using multiprocessing which comes with some extra overhead.
Starting a thread
The Python standard library provides the threading module:

import threading

def thread_function(index):
    print(f"thread {index} running")  # placeholder target (illustrative)

x = threading.Thread(target=thread_function, args=(1,))
x.start()
Wrapping up
You need one thread for each loop: create and run the Discord client in one thread, and use another thread for the web scraping and API requests, as sketched below.
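A minimal sketch of that layout, assuming client is your discord.Client and scrape is a stand-in for your own code:

import threading

def scrape():
    ...  # your web scraping and API requests

threading.Thread(target=scrape, daemon=True).start()
client.run('token')  # blocks the main thread from here on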
The run method is completely blocking, so there are two ways I can see to solve this issue:
create and run the client in a separate thread, and use a queue of some sort to communicate between the client and the rest
use the start method, which returns a coroutine that you can wrap into a task and multiplex with your scraping and API-requesting code, assuming that code also uses async coroutines (see the sketch below)
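A sketch of the second option, assuming discord.py 2.x (where a Client needs intents) and an async scrape() coroutine of your own:

import asyncio
import discord

client = discord.Client(intents=discord.Intents.default())

async def scrape():
    while True:
        ...  # web scraping / API requests
        await asyncio.sleep(60)

async def main():
    # client.start() is the awaitable counterpart of client.run()
    await asyncio.gather(client.start('token'), scrape())

asyncio.run(main())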
client.run seems to be a blocking operation, i.e. your code after client.run is never reached.
You can try using loop.create_task() as described here, to create another coroutine that runs in the background and feeds messages into your client.
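For instance, a hedged sketch using discord.py 2.x's setup_hook (scrape is illustrative):

import discord

class MyClient(discord.Client):
    async def setup_hook(self):
        # discord.py calls this before connecting; it's a good place to
        # start background coroutines on the client's own event loop
        self.bg_task = self.loop.create_task(scrape())

MyClient(intents=discord.Intents.default()).run('token')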

Why GUI should use call_soon_threadsafe() when talking to asyncio loop

In a couple of answers (see here and here) dealing with GUI + asyncio in separate threads, the suggestion is to use a queue when the asyncio loop needs to communicate with the GUI. However, when the GUI wants to communicate with the asyncio event loop, it should use call_soon_threadsafe().
For example, one answer states:
When the event loop needs to notify the GUI to refresh something, it can use a queue as shown here. On the other hand, if the GUI needs to tell the event loop to do something, it can call call_soon_threadsafe or run_coroutine_threadsafe.
What I don't understand is why the GUI can't also use another queue rather than call_soon_threadsafe(). I.e., can the GUI not put data on a queue for the asyncio loop to get and process? Is it just a design decision, or is there some technical reason not to use a queue from the GUI to the asyncio loop?
There's no appropriate queue class to use. asyncio.Queue isn't safe to interact with from outside the event loop, and queue.Queue would block the event loop.
If you want to use a queue anyway, you could use asyncio.run_coroutine_threadsafe to call an asyncio.Queue's put method.
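A sketch of that workaround (the loop reference and the item are illustrative):

import asyncio

aio_queue = asyncio.Queue()  # owned by the event loop's thread

def send_from_gui(loop, item):
    # safe to call from the GUI thread; put() executes inside the loop
    asyncio.run_coroutine_threadsafe(aio_queue.put(item), loop)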

paho-mqtt : callback thread

I am implementing a MQTT worker in python with paho-mqtt.
Are the on_message() callbacks run on different threads, so that if one of the tasks is time-consuming, other messages can still be processed?
If not, how to achieve this behaviour?
The Python client doesn't actually start any threads; that's why you have to call the loop functions yourself to handle network events.
In Java you would use the onMessage callback to put the incoming message onto a local queue that a separate pool of threads then handles.
Python threads can't run CPU-bound work in parallel because of the GIL, but the multiprocessing module lets you spawn processes that can be used like threads. Details of multiprocessing can be found here:
https://docs.python.org/2.7/library/multiprocessing.html
EDIT:
On looking a little closer at the paho Python code, it appears it can actually start a new thread (using the loop_start() function) to handle the network side of things that previously required calling the loop functions. This does not change the fact that all calls to the on_message callback happen on that single network thread. If you need to do large amounts of work in this callback, you should definitely look at spinning up a pool of threads to do it (see the sketch after the link below).
http://www.tutorialspoint.com/python/python_multithreading.htm
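For example, a minimal hand-off sketch, assuming paho-mqtt 1.x and an illustrative broker address and topic:

from concurrent.futures import ThreadPoolExecutor
import paho.mqtt.client as mqtt

pool = ThreadPoolExecutor(max_workers=4)

def slow_work(msg):
    ...  # time-consuming processing

def on_connect(client, userdata, flags, rc):
    client.subscribe("sensors/#")  # topic is illustrative

def on_message(client, userdata, msg):
    # hand off immediately so the network thread is free for the next message
    pool.submit(slow_work, msg)

client = mqtt.Client()  # paho-mqtt 1.x constructor
client.on_connect = on_connect
client.on_message = on_message
client.connect("localhost")  # broker address is illustrative
client.loop_forever()  # network loop; on_message fires on this thread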

Listening for events on a network and handling callbacks robustly

I am developing a small Python program for the Raspberry Pi that listens for some events on a Zigbee network.
The way I've written this is rather simplistic. I have a while(True): loop checking for a unique ID (UID) from the Zigbee. If a UID is received, it's looked up in a dictionary containing some callback methods. So, for instance, in the dictionary the key 101 is tied to a method called PrintHello().
So if that key/UID is received method PrintHello will be executed - pretty simple, like so:
if UID in self.expectedCallBacks:
    self.expectedCallBacks[UID]()
I know this approach is probably too simplistic. My main concern is, what if the system is busy handling a method and the system receives another message?
On an embedded MCU I could handle this easily with a circular buffer + interrupts, but I'm a bit lost when it comes to doing this on an RPi. Do I need to implement a new thread for the Zigbee module that basically fills a buffer that the callback handler can then retrieve/read from?
I would appreciate any suggestions on how to implement this more robustly.
Threads can definitely help to some degree here. Here's a simple example using a ThreadPool:
from multiprocessing.pool import ThreadPool

pool = ThreadPool(2)  # create a 2-thread pool

while True:
    uid = zigbee.get_uid()
    if uid in self.expectedCallbacks:
        pool.apply_async(self.expectedCallbacks[uid])
That will kick off the callback in a thread in the thread pool, and should help prevent events from getting backed up before you can send them to a callback handler. The ThreadPool will internally handle queuing up any tasks that can't be run when all the threads in the pool are already doing work.
However, remember that the original Raspberry Pi has only one CPU core, so you can't execute more than one CPU-bound operation concurrently (and that's even ignoring the limitations of threading in Python caused by the GIL, which is normally worked around by using multiple processes instead of threads). That means no matter how many threads/processes you have, only one can get access to the CPU at a time. For that reason, you probably don't want more than one thread actually running the callbacks: as you add more, you're just going to slow things down due to the OS needing to constantly switch between them.
