C++ Python Not Always Executing Python Script

I currently have a piece of hardware connected to C++ code using the MFC (Windows programming) framework. The hardware passes image frames to my C++ code, which then calls a Python script through the Python/C API (CPython embedding) to run a model on each image. I've been noticing some odd behavior with the images, though.
My C++ code executes my Python script perfectly until some frame in the range of 80-90. After that point, the C++ code simply stops executing the Python script, even though the C++ code itself keeps running normally.
Something to note: my Python script takes 5 seconds to execute the FIRST time, but only about 0.02 seconds for each frame after that (I think because the model gets set up on the first call).
At first I thought it was a speed problem, so I replaced all my Python code with a single time.sleep() call of varying duration. Even when I sleep 5 seconds, every C++ call into Python still executes, so I don't think it's a matter of total time. For instance, time.sleep(1), which sleeps longer than my script's per-frame execution time after the first frame, still always executes.
Does anyone have any idea why this might be happening? Could it be the uneven running times, since the first frame takes 5 seconds and every frame after that is significantly faster? Could Python somehow be unable to catch up after that initial delay?
This is my first time running C++/Python on hardware, so I'm also new to this. Any help would be greatly appreciated!
To give some idea of my code, here is a snippet:
if (pFuncFrame && PyCallable_Check(pFuncFrame)) {
    PyObject* pArgs = PyTuple_New(1);
    PyTuple_SetItem(pArgs, 0, PyUnicode_FromString("img.bmp"));
    PyObject_CallObject(pFuncFrame, pArgs);
    std::cout << "Called the frame function";
}
else {
    std::cout << "Did not get the frame function";
}

I'm willing to bet that the first execution ends in a Python exception, which isn't cleared until you execute a new Python statement on the next iteration, which therefore fails immediately. I recommend fixing the memory leaks and adding some error-handling code to get diagnostics (which will be useful either way). For example (untested, since you didn't provide a compilable example, but the following shouldn't be too far off):
if (pFuncFrame && PyCallable_Check(pFuncFrame)) {
    PyObject* pArgs = PyTuple_New(1);
    PyTuple_SetItem(pArgs, 0, PyUnicode_FromString("img.bmp"));
    PyObject* res = PyObject_CallObject(pFuncFrame, pArgs);
    if (!res) {
        if (PyErr_Occurred()) PyErr_Print();
        else std::cerr << "Python exception without error set\n";
    } else {
        Py_DECREF(res);
        std::cout << "Called the frame function";
    }
    Py_DECREF(pArgs);
}
else {
    std::cout << "Did not get the frame function";
}

Related

Creating a basic PyTupleObject using Python's C API

I'm having difficulty creating a PyTupleObject using the Python C API.
#include "Python.h"
#include <iostream>

int main() {
    int err;
    Py_ssize_t size = 2;
    PyObject *the_tuple = PyTuple_New(size); // this line crashes the program
    if (!the_tuple)
        std::cerr << "the tuple is null" << std::endl;
    err = PyTuple_SetItem(the_tuple, (Py_ssize_t) 0, PyLong_FromLong((long) 5.7));
    if (err < 0) {
        std::cerr << "first set item failed" << std::endl;
    }
    err = PyTuple_SetItem(the_tuple, (Py_ssize_t) 1, PyLong_FromLong((long) 5.7));
    if (err < 0) {
        std::cerr << "second set item failed" << std::endl;
    }
    return 0;
}
crashes with
Process finished with exit code -1073741819 (0xC0000005)
But so does everything else I've tried so far. Any ideas what I'm doing wrong? Note that I'm just trying to run this as a C++ program, since I want to test the code before adding a SWIG typemap.
The commenter @asynts is correct: you need to initialize the interpreter via Py_Initialize if you want to interact with Python objects (you are, in fact, embedding Python). There is a subset of API functions that can safely be called without initializing the interpreter, but creating Python objects does not fall within that subset.
Py_BuildValue may "work" (as in, not segfault with those specific arguments), but it will cause issues elsewhere in the code if you try to do anything with the result without having initialized the interpreter.
It seems that you're trying to extend Python rather than embed it, but you're embedding it to test the extension code. You may want to refer to the official documentation for extending Python with C/C++ to guide you through this process.

Embed Python/C API in a multi-threading C++ program

I am trying to embed Python in a C++ multi-threading program using the Python/C API (version 3.7.3) on a quad-core ARM 64 bit architecture. A dedicated thread-safe class "PyHandler" takes care of all the Python API calls:
class PyHandler
{
public:
    PyHandler();
    ~PyHandler();
    bool run_fun();
    // ...
private:
    PyGILState_STATE _gstate;
    std::mutex _mutex;
};
In the constructor I initialize the Python interpreter:
PyHandler::PyHandler()
{
    Py_Initialize();
    //PyEval_SaveThread(); // UNCOMMENT TO MAKE EVERYTHING WORK !
}
And in the destructor I undo all initializations:
PyHandler::~PyHandler()
{
    _gstate = PyGILState_Ensure();
    if (Py_IsInitialized()) // finalize python interpreter
        Py_Finalize();
}
Now, in order to make run_fun() callable by only one thread at a time, I use the mutex variable _mutex (see below). On top of this, I call PyGILState_Ensure() to make sure the current thread holds the Python GIL, and PyGILState_Release() at the end to release it. All the remaining Python calls happen between these two calls:
bool PyHandler::run_fun()
{
    std::lock_guard<std::mutex> lockGuard(_mutex);
    _gstate = PyGILState_Ensure(); // give the current thread the Python GIL
    // Python calls...
    PyGILState_Release(_gstate); // release the Python GIL assigned to the current thread
    return true;
}
Here is what main() looks like:
int main()
{
    PyHandler py; // constructor is called !
    int n_threads = 10;
    std::vector<std::thread> threads;
    for (int i = 0; i < n_threads; i++)
        threads.push_back(std::thread([&py]() { py.run_fun(); }));
    for (int i = 0; i < n_threads; i++)
        if (threads[i].joinable())
            threads[i].join();
}
Despite all these precautions, the program always deadlocks at the PyGILState_Ensure() line in run_fun() on the very first attempt. BUT when I uncomment the PyEval_SaveThread() line in the constructor, everything magically works. Why is that?
Notice that I am not calling PyEval_RestoreThread() anywhere. Am I supposed to use the macros Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS instead? I thought these macros and PyEval_SaveThread() were only for dealing with Python threads, NOT with non-Python threads, as in my case! Am I missing something?
The documentation for my case only mentions the use of PyGILState_Ensure() and PyGILState_Release(). Any help is highly appreciated.

Calling PyObject_Call from thread causes stack overflow

I am trying to use Jedi (https://github.com/davidhalter/jedi) to create a custom Python editor in C++. It works perfectly, but it is a bit slow and stalls for a short while, so I am calling those functions from inside a C++ thread. Doing so, I sometimes get a stack overflow error.
Here is my code:
//~ Create Script Instance class
PyObject* pScript = PyObject_Call(
    PyObject_GetAttrString(GetJediModule(), "Script"),
    PyTuple_Pack(1, PyString_FromString(TCHAR_TO_UTF8(*Source))),
    NULL);
if (pScript == NULL)
{
    UE_LOG(LogTemp, Verbose, TEXT("unable to get Script Class from Jedi Module"));
    Py_DECREF(pScript);
    ClearPython();
    return;
}
//~ Call complete method from Script Class
PyObject* Result = PyObject_Call(
    PyObject_GetAttrString(pScript, "complete"),
    PyTuple_Pack(2, Py_BuildValue("i", Line), Py_BuildValue("i", Offset)),
    NULL);
if (Result == NULL)
{
    UE_LOG(LogTemp, Verbose, TEXT("unable to call complete method from Script class"));
    Py_DECREF(Result);
    ClearPython();
    return;
}
The error happens when I call PyObject_Call. I assume it's because of the thread, since it works perfectly when I call the function from the main thread, but the stack trace isn't telling me anything useful, just an error inside python.dll.
Well, I found the answer just by luck: it's possible to choose the stack size when launching a thread in UE, and I was using a super tiny value of 1024. After a small modification I have been testing for 3 hours without any crashes, so I guess it's safe to assume it's working now.
Here is how I set up the stack size; the third argument is the stack size:
Thread = FRunnableThread::Create(this, TEXT("FAutoCompleteWorker"), 8 * 8 * 4096, TPri_Normal);

How to execute a python console using Qt on win10?

I'm trying to execute a python console using QProcess and show the content of the console in a QTextEdit.
Below are the relevant parts of my code:
connect(&process, SIGNAL(readyReadStandardOutput()), this, SLOT(PrintStandardOutput()));
connect(&process, SIGNAL(started()), this, SLOT(Started()));
...
void Widget::PrintStandardOutput()
{
    QString stdOutput = QString::fromLocal8Bit(process.readAllStandardOutput());
    ui->textEdit_Output->append(stdOutput);
}
void Widget::Started()
{
    ui->stateLabel->setText("Process started...");
}
I used QProcess::start("python") to start a new process and QProcess::write() to write my Python code, but nothing shows up in my QTextEdit. The Python process is indeed started, judging by the stateLabel showing "Process started...". How can I get the Python output shown in the QTextEdit? Thanks.
Try:
connect(&process, SIGNAL(readyReadStandardOutput()), this, SLOT(PrintStandardOutput()));
connect(&process, SIGNAL(started()), this, SLOT(Started()));
process.setProcessChannelMode(QProcess::MergedChannels);
process.start(processToStart, arguments);

void Widget::PrintStandardOutput()
{
    // Get the output
    if (process.waitForStarted(-1)) {
        while (process.waitForReadyRead(-1)) {
            QString chunk = process.readAll();
            ui->textEdit_Output->append(chunk); // append only the newly read data
        }
    }
    process.waitForFinished();
}
void Widget::Started()
{
    ui->stateLabel->setText("Process started...");
}
setProcessChannelMode(QProcess::MergedChannels) merges the output channels. Different programs write to different outputs: some use the error output for their normal logging, some use standard output, some both. It's better to merge them.
readAll() reads everything available so far.
It is put in a loop with waitForReadyRead(-1) (-1 means no timeout), which blocks until something is available to read. This makes sure that everything is actually read.
Simply calling readAll() once after the process has finished has turned out to be highly unreliable (the buffer might already be empty).

Stop embedded python

I'm embedding Python in a C++ plug-in. The plug-in calls a Python algorithm dozens of times during each session, each time sending it different data. So far so good.
But now I have a problem:
The algorithm sometimes takes minutes to return a solution, and during that time the conditions often change, making that solution irrelevant. So what I want is to be able to stop the running algorithm at any moment, and immediately re-run it with another set of data.
Here's the C++ code for embedding python that I have so far:
void py_embed(void* data) {
    counter_thread = false;
    PyObject *pName, *pModule, *pDict, *pFunc;

    // Inform the interpreter about paths to Python run-time libraries
    Py_SetProgramName(arg->argv[0]);

    if (!gil_init) {
        gil_init = 1;
        PyEval_InitThreads();
        PyEval_SaveThread();
    }
    PyGILState_STATE gstate = PyGILState_Ensure();

    // Build the name object
    pName = PyString_FromString(arg->argv[1]);
    if (!pName) {
        textfile3 << "Can't build the object" << endl;
    }
    // Load the module object
    pModule = PyImport_Import(pName);
    if (!pModule) {
        textfile3 << "Can't import the module" << endl;
    }
    // pDict is a borrowed reference
    pDict = PyModule_GetDict(pModule);
    if (!pDict) {
        textfile3 << "Can't get the dict" << endl;
    }
    // pFunc is also a borrowed reference
    pFunc = PyDict_GetItemString(pDict, arg->argv[2]);
    if (!pFunc || !PyCallable_Check(pFunc)) {
        textfile3 << "Can't get the function" << endl;
    }

    /* Call the algorithm and treat the data that is returned from it
       ...
       ...
    */

    // Clean up
    Py_XDECREF(pArgs2);
    Py_XDECREF(pValue2);
    Py_DECREF(pModule);
    Py_DECREF(pName);
    PyGILState_Release(gstate);
    counter_thread = true;
    _endthread();
};
Edit: the Python algorithm is not my work and I shouldn't change it.
This is based on a cursory knowledge of Python and a quick read of the Python docs.
PyThreadState_SetAsyncExc lets you inject an exception into a running Python thread.
Run your Python interpreter in some thread. From another thread, call PyGILState_Ensure and then PyThreadState_SetAsyncExc targeting the interpreter's thread. (This may require some precursor work to teach the Python interpreter about the second thread.)
Unless the Python code you are running is full of catch-alls, this should cause it to terminate execution.
You can also look into the code for creating Python sub-interpreters, which would let you start up a new script while the old one shuts down.
Py_AddPendingCall is also tempting to use, but there are enough warnings around it that maybe you shouldn't.
Sorry, but your choices are limited. You can either change the Python code (not an option, since it's a plug-in) or run it in another PROCESS (with some nice IPC in between). Then you can use the system API to wipe it out.
So, I finally thought of a solution (more of a workaround, really).
Instead of terminating the thread that is running the algorithm (call it T1), I create another one (T2) with the set of data that is relevant at that time.
In every thread I do this:
thread_counter += 1; // global variable
int thisthread = thread_counter;
and after Python returns a solution I just check which one is the most recent, T1's or T2's:
if (thisthread == thread_counter) {
    /* save the solution and treat it */
}
In terms of computational effort this is obviously not the best solution, but it serves my purposes.
Thank you for the help, guys
I've been thinking about this problem, and I agree that sub-interpreters may provide one possible solution (https://docs.python.org/2/c-api/init.html#sub-interpreter-support). The API supports calls for creating new interpreters and ending existing ones. The Bugs and Caveats section describes some issues that, depending on your architecture, may or may not present a problem.
Another possible solution is to use the Python multiprocessing module, and within your worker test a global variable (something like time_to_die). Then from the parent you grab the GIL, set the variable, release the GIL, and wait for the child to finish.
But then another idea occurred to me: why not just use fork(), initialize your Python interpreter in the child, and when the parent decides it's time for the Python work to end, just kill it? Something like this:
void process() {
    int pid = fork();
    if (pid) {
        // in parent
        sleep(60);
        kill(pid, 9);
    }
    else {
        // in child
        Py_Initialize();
        PyRun_SimpleString("# insert long running python calculation");
    }
}
(This example assumes *nix; if you're on Windows, substitute CreateProcess()/TerminateProcess().)
