I am embedding Python in a multi-threaded C++ application. Is it safe to call
Py_Initialize() from multiple threads, or should I call it only in the main thread?
The Py_Initialize() code contains:
    if (initialized)
        return;
    initialized = 1;
The documentation for the function also says:
https://docs.python.org/2/c-api/init.html#c.Py_Initialize
This is a no-op when called for a second time (without calling Py_Finalize() first).
My recommendation though is you only do it from the main thread, although depending on what you are doing, it can get complicated.
The problem is that signal handlers are only triggered in the context of the main Python thread, that is, whichever thread called Py_Initialize(). So if that is a transient thread that is used once and then discarded, there is no chance of signal handlers ever being called. You therefore have to give some thought to how you handle signals.
Also be careful about using lots of transient threads that are created in C code with the native thread API and then call into the Python interpreter, as each one will create data in the Python interpreter. That data will accumulate if you keep creating and discarding these external threads. If you are calling in from external threads, you should endeavour to use a thread pool instead and keep reusing existing threads.
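The advice about reusing threads concerns native threads created in C, but the same idea can be shown at the Python level as an analogy. The sketch below is only an illustration, and run_tasks_on_pool is a hypothetical helper name: it runs many short tasks on a small pool and reports how many distinct threads actually executed them.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_tasks_on_pool(task_count=20, max_workers=2):
    """Run many short tasks and return the set of thread ids that ran them."""

    def task(_):
        # Each task just reports which thread executed it.
        return threading.get_ident()

    # The pool creates at most max_workers threads and reuses them,
    # instead of creating one new thread per task.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return set(pool.map(task, range(task_count)))


thread_ids = run_tasks_on_pool()
print("tasks: 20, distinct threads used:", len(thread_ids))
```

The number of distinct threads is bounded by max_workers no matter how many tasks are submitted, which is the property that keeps per-thread interpreter data from piling up.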
The question
Is pybind11 somehow magically doing the work of PyGILState_Ensure() and PyGILState_Release()? And if not, how should I do it?
More details
There are many questions regarding passing a python function to C++ as a callback using pybind11, but I haven't found one that explains the use of the GIL with pybind11.
The documentation is pretty clear about the GIL:
[...] However, when threads are created from C (for example by a third-party library with its own thread management), they don’t hold the GIL, nor is there a thread state structure for them.
If you need to call Python code from these threads (often this will be part of a callback API provided by the aforementioned third-party library), you must first register these threads with the interpreter by creating a thread state data structure, then acquiring the GIL, and finally storing their thread state pointer, before you can start using the Python/C API.
I can easily bind a C++ function that takes a callback:
    py::class_<SomeApi> some_api(m, "SomeApi");
    some_api
        .def(py::init<>())
        .def("mode", &SomeApi::subscribe_mode, "Subscribe to 'mode' updates.");
With the corresponding C++ function being something like:
void subscribe_mode(const std::function<void(Mode mode)>& mode_callback);
But because pybind11 cannot know about the threading happening in my C++ implementation, I suppose it cannot handle the GIL for me. Therefore, if mode_callback is called by a thread created from C++, does that mean that I should write a wrapper to SomeApi::subscribe_mode that uses PyGILState_Ensure() and PyGILState_Release() for each call?
This answer seems to be doing something similar, but still slightly different: instead of "taking the GIL" when calling the callback, it seems like it "releases the GIL" when starting/stopping the thread. Still I'm wondering if there exists something like py::call_guard<py::gil_scoped_acquire>() that would do exactly what I (believe I) need, i.e. wrapping my callback with PyGILState_Ensure() and PyGILState_Release().
In general
pybind11 tries to do the Right Thing and the GIL will be held when pybind11 knows that it is calling a python function, or in C++ code that is called from python via pybind11. The only time that you need to explicitly acquire the GIL when using pybind11 is when you are writing C++ code that accesses python and will be called from other C++ code, or if you have explicitly dropped the GIL.
std::function wrapper
The wrapper for std::function always acquires the GIL via gil_scoped_acquire when the function is called, so your python callback will always be called with the GIL held, regardless of which thread it is called from.
If gil_scoped_acquire is called from a thread that does not currently have a GIL thread state associated with it, then it will create a new thread state. As a side effect, if nothing else in the thread acquires the thread state and increments the reference count, then once your function exits the GIL will be released by the destructor of gil_scoped_acquire and then it will delete the thread state associated with that thread.
If you're only calling the function once from another thread, this isn't a problem. If you're calling the callback often, it will create/delete the thread state a lot, which probably isn't great for performance. It would be better to cause the thread state to be created when your thread starts (or even easier, start the thread from Python and call your C++ code from python).
I'm adding python scripting support to an application.
This application has an API which is not thread safe, and I cannot change this aspect.
One requirement I have is being able to run multiple independent scripts, thus I have to run sub-interpreters in separate threads.
Although, due to the GIL in CPython, no more than one thread runs concurrently, whatever thread holds the GIL will still run concurrently with the main thread, and this will cause problems due to the thread-unsafe API of the application.
To summarize: I'm looking for a way to run all python code (__main__, threads, every sub-interpreter) in the main thread.
How can this be solved?
Should the main thread always hold the GIL, and have a function that, in a cooperative-multitasking fashion, would release it and reacquire it x milliseconds later, thus allowing the interpreter to do some work? This doesn't look right: such a function would consume x milliseconds even when python has no work to do.
I have a Qt application written in PySide (Qt Python binding). This application has a GUI thread and many different QThreads that are in charge of performing some heavy lifting, that is, some rather long tasks. Since such a long task sometimes gets stuck (usually because it is waiting for a server response), the application sometimes freezes.
I was therefore wondering if it is safe to call QCoreApplication.processEvents() "manually" every second or so, so that the GUI event queue is cleared (processed)? Is that a good idea at all?
It's safe to call QCoreApplication.processEvents() whenever you like. The docs explicitly state your use case:
You can call this function occasionally when your program is busy
performing a long operation (e.g. copying a file).
There is no good reason, though, why threads would block the event loop in the main thread (unless your system really can't keep up). So that's worth looking into anyway.
A couple of hints people might find useful:
A. You need to beware of the following:
Every so often the threads want to send stuff back to the main thread. So they post an event and call processEvents.
If the code run from the event also calls processEvents, then instead of returning to the next statement, python can dispatch a worker thread again, and that can then repeat this process.
The net result of this can be hundreds or thousands of nested processEvents calls, which can then result in a recursion level exceeded error message.
Moral - if you are running a multi-threaded application do NOT call processEvents in any code initiated by a thread which runs in the main thread.
B. You need to be aware that CPython has a Global Interpreter Lock (GIL) that limits threads so that only one can run at any one time, and the way that Python decides which threads to run is counter-intuitive. Running processEvents from a worker thread does not seem to do what it says on the can, and CPU time is not allocated to the main thread or to Python internal threads. I am still experimenting, but it seems that putting worker threads to sleep for a few milliseconds allows other threads to get a look in.
I'm writing a multi-threaded application that utilizes QThreads. I know that, in order to start a thread, I need to override the run() method and call that method using the thread.start() somewhere (in my case in my GUI thread).
I was wondering, however, whether I am required to call the .wait() method anywhere, and whether I am supposed to call .quit() once the thread finishes, or whether this is done automatically.
I am using PySide.
Thanks
Both answers depend on what your code is doing and what you expect from the thread.
If the logic that uses the thread needs to wait synchronously for the moment the QThread finishes, then yes, you need to call wait(). However, such a requirement is a sign of a sloppy threading model, except in very specific situations like application startup and shutdown. Usage of QThread::wait() suggests creeping sequential operation, which means that you are effectively not using threads concurrently.
quit() exits the QThread-internal event loop, which is not mandatory to use. A long-running thread (as opposed to a one-task worker) must have an event loop of some sort. This is a generic statement, not specific to QThread. You either do it yourself (in the form of some while(keepRunning) { } cycle) or use the Qt-provided event loop, which you fire off by calling exec() in your run() method. The former implementation can be finished by you, because you provided the keepRunning condition. The Qt-provided implementation is hidden from you, and this is where the quit() call comes in: internally it does nothing more than set a similar flag inside Qt.
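Since the point about needing some kind of loop is not specific to QThread, here is a minimal sketch of the do-it-yourself variant using only Python's standard threading module rather than Qt. The Worker class, its stop() method and the iterations counter are hypothetical names chosen for illustration; stop() plays the role that quit() plays for the Qt-provided loop.

```python
import threading
import time


class Worker(threading.Thread):
    """Long-running worker with its own keep-running flag (not a QThread)."""

    def __init__(self):
        super().__init__()
        self._keep_running = threading.Event()
        self._keep_running.set()
        self.iterations = 0

    def run(self):
        # Hand-written loop: it ends only when the flag is cleared.
        while self._keep_running.is_set():
            self.iterations += 1  # stand-in for one unit of real work
            time.sleep(0.01)

    def stop(self):
        # Analogous to quit(): just flips the flag the loop checks.
        self._keep_running.clear()


worker = Worker()
worker.start()
time.sleep(0.05)  # let the worker run for a moment
worker.stop()
worker.join()     # waiting here is acceptable because this is shutdown
print("worker finished after", worker.iterations, "iterations")
```

The join() at the end corresponds to wait(): it is used here only at shutdown, which is one of the situations where waiting synchronously is reasonable.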
I've heard there are problems when calling os.waitpid from within a thread. I have not experienced such problems yet (especially using os.WNOHANG option). However I have not paid much attention to the performance implications of such use.
Are there any performance penalties or any other issues one should be aware of?
Does this have to do with os.waitpid (potentially) using signals?
I don't see how signals could be related, though, since otherwise (I suppose) I wouldn't be able to get os.waitpid to return when calling it from a non-main thread.
By default, when a child process dies, the parent is sent a SIGCHLD signal. Concern about calling os.waitpid() probably comes from this.
If you look in the Python "signal" module documentation the warning is pretty clear:
Some care must be taken if both signals and threads are used in the same program. The fundamental thing to remember in using signals and threads simultaneously is: always perform signal() operations in the main thread of execution. Any thread can perform an alarm(), getsignal(), pause(), setitimer() or getitimer(); only the main thread can set a new signal handler, and the main thread will be the only one to receive signals (this is enforced by the Python signal module, even if the underlying thread implementation supports sending signals to individual threads). This means that signals can’t be used as a means of inter-thread communication. Use locks instead.
http://docs.python.org/library/signal.html
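The restriction quoted above can be observed directly. The following is a small sketch for CPython, where try_install_handler_in_thread is a hypothetical helper name: it attempts to install a SIGINT handler from a worker thread and returns the exception that signal.signal() raises there.

```python
import signal
import threading


def try_install_handler_in_thread():
    """Call signal.signal() from a worker thread; return the error, if any."""
    outcome = {}

    def worker():
        try:
            signal.signal(signal.SIGINT, signal.default_int_handler)
            outcome["error"] = None
        except ValueError as exc:
            # CPython refuses to set handlers outside the main thread.
            outcome["error"] = exc

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return outcome["error"]


error = try_install_handler_in_thread()
print(type(error).__name__, "-", error)
```

Nothing here touches SIGCHLD, so the example does not interfere with waiting on child processes.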
BUT... if you leave the SIGCHLD signal alone, then you should be happily able to call os.waitpid() (or any other os.wait() variant) from a thread.
The main drawback then is that you'll need to use os.waitpid() with WNOHANG and poll periodically, if you want any way to cancel the operation. If you don't ever need to cancel the os.waitpid(), then you can just invoke it in blocking mode.
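A cancellable wait of this kind might look like the sketch below. It assumes a POSIX system and uses hypothetical names (wait_for_child, run_demo, the poll_interval parameter); the child is a short Python process started with os.spawnv purely so that there is something to wait for.

```python
import os
import sys
import threading


def wait_for_child(pid, cancel, poll_interval=0.05):
    """Poll a child with WNOHANG until it exits or `cancel` is set.

    Returns the exit code (negative signal number if killed by a signal),
    or None if the wait was cancelled first.
    """
    while not cancel.is_set():
        reaped_pid, status = os.waitpid(pid, os.WNOHANG)
        if reaped_pid == pid:
            if os.WIFEXITED(status):
                return os.WEXITSTATUS(status)
            return -os.WTERMSIG(status)
        # Child still running: sleep, but wake up early if cancelled.
        cancel.wait(poll_interval)
    return None


def run_demo(exit_code=3):
    """Spawn a child that exits with `exit_code` and reap it from a thread."""
    child_cmd = [sys.executable, "-c", "import sys; sys.exit(%d)" % exit_code]
    pid = os.spawnv(os.P_NOWAIT, sys.executable, child_cmd)
    cancel = threading.Event()
    result = {}
    waiter = threading.Thread(
        target=lambda: result.update(code=wait_for_child(pid, cancel))
    )
    waiter.start()
    waiter.join()
    return result["code"]


print("child exit code:", run_demo())
```

Calling cancel.set() from any other thread makes wait_for_child return None within about one poll interval; the child then still has to be reaped later by someone, otherwise it remains a zombie.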
My guess: people are just referring to calling waitpid() without WNOHANG, which of course obviates the reason you use multiple threads in the first place. (That is, of course, unless you are just using it to reap the zombies).